Accurate Detection of Convergent Mutations in Large Protein Alignments With ConDor.
Morel, Marie; Zhukova, Anna; Lemoine, Frédéric; Gascuel, Olivier.
  • Morel M; Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France.
  • Zhukova A; Université Claude Bernard Lyon 1, LBBE, UMR 5558, CNRS, VAS, Villeurbanne, 69100, France.
  • Lemoine F; Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France.
  • Gascuel O; Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France.
Genome Biol Evol; 16(4), 2024 04 02.
Article in English | MEDLINE | ID: mdl-38451738
ABSTRACT
Evolutionary convergences are observed at all levels, from phenotype to DNA and protein sequences, and changes at these different levels tend to be correlated. Notably, convergent mutations can lead to convergent changes in phenotype, such as changes in metabolism, drug resistance, and other adaptations to changing environments. We propose a two-component approach to detect mutations subject to convergent evolution in protein alignments. The "Emergence" component selects mutations that emerge more often than expected, while the "Correlation" component selects mutations that correlate with the convergent phenotype under study. With regard to Emergence, a phylogeny deduced from the alignment is provided by the user and is used to simulate the evolution of each alignment position. These simulations allow us to estimate the expected number of mutations in a neutral model, which is compared to the observed number of mutations in the data studied. In Correlation, a comparative phylogenetic approach is used to measure whether the presence of each of the observed mutations is correlated with the convergent phenotype. Each component can be used on its own, for example Emergence when no phenotype is available. Our method is implemented in a standalone workflow and a webserver, called ConDor. We evaluate the properties of ConDor using simulated data, and we apply it to three real datasets: sedge PEPC proteins, HIV reverse transcriptase, and fish rhodopsin. The results show that the two components of ConDor complement each other, with an overall accuracy that compares favorably to other available tools, especially on large datasets.
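The comparison of observed and simulated mutation counts described for the Emergence component can be illustrated with a generic Monte Carlo test. The sketch below is not ConDor's implementation: the function names, the add-one correction, and the toy neutral model (each branch independently produces the mutation with a fixed probability, standing in for simulation along the user-provided phylogeny) are all assumptions made for illustration.

```python
import random


def simulate_neutral_counts(n_branches, per_branch_prob, n_sims, seed=0):
    """Toy stand-in for neutral simulation along a phylogeny (assumption).

    Each of `n_branches` branches independently gives rise to the mutation
    with probability `per_branch_prob`. Returns one emergence count per
    simulation.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < per_branch_prob for _ in range(n_branches))
        for _ in range(n_sims)
    ]


def empirical_p_value(observed, simulated_counts):
    """Share of simulations with at least `observed` emergences.

    The add-one correction keeps the estimate strictly positive; it is a
    common convention, assumed here rather than taken from the abstract.
    """
    at_least = sum(1 for count in simulated_counts if count >= observed)
    return (at_least + 1) / (len(simulated_counts) + 1)


if __name__ == "__main__":
    # Hypothetical numbers: 200 branches, 1% chance per branch, 1000 simulations.
    counts = simulate_neutral_counts(n_branches=200, per_branch_prob=0.01, n_sims=1000)
    print(empirical_p_value(observed=8, simulated_counts=counts))
```

A small p-value here would mean the mutation emerged more often than the toy neutral model predicts; in practice the null distribution would come from simulations on the actual tree under a fitted substitution model.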

Full text: 1 Database: MEDLINE Main subject: Evolution, Molecular / Fishes Limits: Animals Language: En Year: 2024 Document type: Article