Imputation-Based Population Genetics Analysis of Malaria Parasites

Characterizing genetic diversity and function in Plasmodium falciparum, including identifying determinants of emerging drug resistance, is crucial to informing public health strategies to contain and eliminate this malaria parasite. The lack of a robust framework for handling the missing P. falciparum genotypes that arise from next-generation sequencing impedes genome-wide methods that depend on complete genotype information, and often leads to analyses that discard entire regions of the genome. This study is the first to evaluate the performance of missing-data imputation, or “filling in”, in the P. falciparum genome, where the correlation between genetic markers is generally lower than in the human genome. We considered 86k markers in 459 clinical isolates from 4 malaria-endemic populations of Africa and Southeast Asia. Although genotype missingness per SNP is low (<10%), only 25% of SNPs have complete data across all isolates; imputation of the missing genotypes is nonetheless accurate. This finding is corroborated by the ability of imputed haplotype analysis to recover several well-established vaccine candidates and drug resistance loci, including kelch13, a recently validated gene involved in artemisinin resistance. Our work demonstrates that imputation can assist the application of genome-wide methods to identify the determinants of P. falciparum diversity, including those involved in drug resistance, immune evasion, and host virulence.

