DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.
Kapli, Paschalia; Kotari, Ioanna; Telford, Maximilian J; Goldman, Nick; Yang, Ziheng.
Affiliation
  • Kapli P; Department of Genetics, University College London, Gower Street, London WC1E 6BT, UK.
  • Kotari I; Department of Genetics, University College London, Gower Street, London WC1E 6BT, UK.
  • Telford MJ; Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, 1210, Austria.
  • Goldman N; Department of Genetics, University College London, Gower Street, London WC1E 6BT, UK.
  • Yang Z; European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Syst Biol; 72(5): 1119-1135, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37366056
ABSTRACT
Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here, we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-coding data generated under models of heterogeneous substitution processes across sites in the sequence and among lineages on the tree, and then analyzed using nucleotide, amino acid, and codon models. Analysis of DNA sequences under nucleotide-substitution models (possibly with the third codon positions excluded) recovered the correct tree at least as often as analysis of the corresponding protein sequences under modern amino acid models. We also applied the different data-analysis strategies to an empirical dataset to infer the metazoan phylogeny. Our results from both simulated and real data suggest that DNA sequences may be as useful as proteins for inferring deep phylogenies and should not be excluded from such analyses. Analysis of DNA data under nucleotide models has a major computational advantage over protein-data analysis, potentially making it feasible to use advanced models that account for among-site and among-lineage heterogeneity in the nucleotide-substitution process in inference of deep phylogenies.
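The data-analysis strategies compared in the abstract all start from the same protein-coding alignment: the full nucleotide sequences, the nucleotide sequences with third codon positions excluded, and the translated protein sequences. The sketch below illustrates how those three datasets could be derived from one in-frame coding sequence. It is a minimal illustration and not the authors' pipeline: the function names and the toy sequences are hypothetical, and it assumes the standard genetic code (the paper's theoretical analysis uses an idealized code instead).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the first codon position varying slowest.
# Assumption: the standard code applies to the sequences being processed.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def usable_length(cds: str) -> int:
    """Length of the sequence truncated to a whole number of codons."""
    return len(cds) - len(cds) % 3


def drop_third_positions(cds: str) -> str:
    """Keep codon positions 1 and 2, discarding every third nucleotide."""
    return "".join(cds[i:i + 2] for i in range(0, usable_length(cds), 3))


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; unknown or gapped codons become 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X")
        for i in range(0, usable_length(cds), 3)
    )


# Hypothetical two-taxon toy alignment (in-frame, uppercase DNA).
alignment = {
    "taxon_a": "ATGGCCAAATTT",
    "taxon_b": "ATGGCTAAGTTC",
}

for name, cds in alignment.items():
    print(name, cds, drop_third_positions(cds), translate(cds))
```

In this toy example the two sequences differ only at third codon positions, so they become identical once those positions are dropped or the sequences are translated. Real sequences also carry nonsynonymous differences at first and second positions, which is the signal retained by the "third positions excluded" strategy.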

Full text: 1 | Database: MEDLINE | Main subject: Models, Genetic / Nucleotides | Limits: Animals | Language: English | Journal: Syst Biol | Journal subject: Biology | Year of publication: 2023 | Document type: Article | Country of affiliation: United Kingdom
