ABSTRACT
Traditionally, patient travel history has been used to distinguish imported from autochthonous malaria cases, but the dormant liver stages of Plasmodium vivax confound this approach. Molecular tools offer an alternative method to identify and map imported cases. Using machine learning approaches incorporating hierarchical fixation index and decision tree analyses applied to 799 P. vivax genomes from 21 countries, we identified 33-SNP, 50-SNP and 55-SNP barcodes (GEO33, GEO50 and GEO55) with high capacity to predict an infection's country of origin. The Matthews correlation coefficient (MCC) for an existing, commonly applied 38-SNP barcode (BR38) exceeded 0.80 in 62% of countries. The GEO panels outperformed BR38, with median MCCs > 0.80 in 90% of countries for GEO33, and in 95% for GEO50 and GEO55. An online, open-access, likelihood-based classifier framework (vivaxGEN-geo) was established to support data analysis. The SNP selection and classifier methods can be readily adapted to other use cases to support malaria control programs.
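For readers unfamiliar with the metric, the per-country MCC can be illustrated with a minimal sketch. It assumes a one-vs-rest setup in which each country is scored in turn as the positive class; the abstract does not specify the evaluation procedure, and the function name and country codes below are hypothetical.

```python
import math

def mcc_one_vs_rest(true_labels, predicted_labels, country):
    """Matthews correlation coefficient for one country, treating that
    country as the positive class and all others as negative."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == country and truth == country:
            tp += 1
        elif pred == country:
            fp += 1
        elif truth == country:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the MCC is undefined (zero denominator).
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical labels for six samples (illustrative only, not study data).
true_origin = ["PE", "PE", "BR", "BR", "ID", "ID"]
predicted = ["PE", "BR", "BR", "BR", "ID", "ID"]
for code in ("PE", "BR", "ID"):
    print(code, round(mcc_one_vs_rest(true_origin, predicted, code), 3))
```

An MCC of 1 indicates perfect prediction for that country, 0 is no better than chance, and the 0.80 threshold in the abstract marks countries that a barcode predicts with high reliability.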
Subjects
Vivax Malaria , Malaria , Humans , Vivax Malaria/diagnosis , Vivax Malaria/genetics , Likelihood Functions , Plasmodium vivax/genetics , Internet
ABSTRACT
BACKGROUND: The control and elimination of Plasmodium vivax will require a better understanding of its transmission dynamics, through the application of genotyping and population genetics analyses. This paper describes VivaxGEN (http://vivaxgen.menzies.edu.au), a web-based platform that has been developed to support P. vivax short tandem repeat data sharing and comparative analyses. RESULTS: The VivaxGEN platform provides a repository for raw data generated by capillary electrophoresis (FSA files), with fragment analysis and standardized allele calling tools. The query system of the platform enables users to filter, select and differentiate samples and alleles based on their specified criteria. Key population genetic analyses are supported, including measures of population differentiation (FST), expected heterozygosity (HE), linkage disequilibrium (IAS), neighbor-joining analysis and Principal Coordinate Analysis. Datasets can also be formatted and exported for use in commonly used population genetic software, including GENEPOP, Arlequin and STRUCTURE. To date, data from 10 countries, including 5 publicly available data sets, have been shared with VivaxGEN. CONCLUSIONS: VivaxGEN is well placed to facilitate regional overviews of P. vivax transmission dynamics in different endemic settings and can be adapted for similar genetic studies of P. falciparum and other organisms.
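As an illustration of one of the listed measures, expected heterozygosity at a single short tandem repeat locus can be sketched as follows. This assumes the common sample-size-corrected estimator HE = [n/(n-1)] (1 - Σ p_i²), where p_i is the frequency of allele i among n samples; the abstract does not state which estimator VivaxGEN implements, and the function name and allele sizes below are hypothetical.

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """Sample-size-corrected expected heterozygosity at one locus.

    `alleles` holds one allele call per sample (for example, the fragment
    size in base pairs of the predominant allele).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(alleles)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - sum_sq_freq)

# Hypothetical allele calls for four samples (illustrative only).
calls = [120, 120, 124, 128]
print(round(expected_heterozygosity(calls), 4))
```

Values near 0 indicate that almost all samples share one allele, while values near 1 indicate high diversity at the locus.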