An accurate method for identifying recent recombinants from unaligned sequences.

Feng, Qian; Tiedje, Kathryn E; Ruybal-Pesántez, Shazia; Tonkin-Hill, Gerry; Duffy, Michael F; Day, Karen P; Shim, Heejung; Chan, Yao-Ban

Affiliations

Feng Q; Melbourne Integrative Genomics/School of Mathematics and Statistics, The University of Melbourne, Melbourne, VIC 3010, Australia.
Tiedje KE; School of BioSciences, The University of Melbourne, Bio21 Molecular Science and Biotechnology Institute, Melbourne, VIC 3010, Australia.
Ruybal-Pesántez S; Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity and Bio21 Molecular Science and Biotechnology Institute, Melbourne, VIC 3000, Australia.
Tonkin-Hill G; School of BioSciences, The University of Melbourne, Bio21 Molecular Science and Biotechnology Institute, Melbourne, VIC 3010, Australia.
Duffy MF; Population Health and Immunity Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC 3052, Australia.
Day KP; Department of Medical Biology, The University of Melbourne, Melbourne, VIC 3010, Australia.
Shim H; Burnet Institute, Melbourne, VIC 3004, Australia.
Chan YB; School of BioSciences, The University of Melbourne, Bio21 Molecular Science and Biotechnology Institute, Melbourne, VIC 3010, Australia.

Bioinformatics; 38(7): 1823-1829, 2022-03-28.

Article in English | MEDLINE | ID: mdl-35025988

ABSTRACT

MOTIVATION: Recombination is a fundamental process in molecular evolution, and the identification of recombinant sequences is thus of major interest. However, current methods for detecting recombinants are primarily designed for aligned sequences. Thus, they struggle with analyses of highly diverse genes, such as the var genes of the malaria parasite Plasmodium falciparum, which are known to diversify primarily through recombination.

RESULTS: We introduce an algorithm to detect recent recombinant sequences from a dataset without a full multiple alignment. Our algorithm can handle thousands of gene-length sequences without the need for a reference panel. We demonstrate the accuracy of our algorithm through extensive numerical simulations; in particular, it maintains its effectiveness in the presence of insertions and deletions. We apply our algorithm to a dataset of 17 335 DBLα types in var genes from Ghana, observing that sequences belonging to the same ups group or domain subclass recombine amongst themselves more frequently, and that non-recombinant DBLα types are more conserved than recombinant ones.

AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/qianfeng2/detREC_program.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genetic Variation; Protozoan Proteins; Protozoan Proteins/genetics; Plasmodium falciparum/genetics; Software; Evolution, Molecular

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Health context: 3_ND | Health problem: 3_malaria | Main subject: Genetic Variation / Protozoan Proteins | Language: En | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2022 | Document type: Article | Country of affiliation: Australia
