The effects of sequence length, tree topology, and number of taxa on the performance of phylogenetic methods.
J Comput Biol; 1(2): 133-51, 1994.
Article in English | MEDLINE | ID: mdl-8790460
ABSTRACT
Simulations were used to study the performance of several character-based and distance-based phylogenetic methods in obtaining the correct tree from pseudo-randomly generated input data. The study included all the topologies of unrooted binary trees with 4 to 10 pendant vertices (taxa) inclusive. The length of the character sequences used ranged from 10 to 10^5 characters, increasing exponentially. The methods studied include Closest Tree, Compatibility, Li's method, Maximum Parsimony, Neighbor-joining, Neighborliness, and UPGMA. We also provide a modification to Li's method (SimpLi) which is consistent with additive data. We give estimates of the sequence lengths required for a given confidence in the output of these methods under the assumptions of molecular evolution used in this study. A notation for characterizing all tree topologies is described. We show that when the number of taxa, the maximum path length, and the minimum edge length are held constant, there is a small but significant dependence of the performance of the methods on the tree topology. We show that those methods that are consistent with the model used perform similarly, whereas the inconsistent methods, UPGMA and Li's method, perform very poorly.
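The contrast between consistent and inconsistent methods can be illustrated with a minimal sketch. It is not the simulation used in the study. It assumes exact additive distances on a hypothetical 4-taxon tree ((A,B),(C,D)) with illustrative edge lengths, and compares the first pair joined by UPGMA (smallest distance) with the first pair joined by Neighbor-joining (smallest value of the standard Q criterion). For four taxa, the first pair joined determines the unrooted topology. All names and numbers below are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical true tree ((A,B),(C,D)); edge lengths are illustrative only.
# A and C sit on long pendant edges, B and D on short ones.
PENDANT = {"A": 5.0, "B": 1.0, "C": 5.0, "D": 1.0}
INTERNAL = 1.0
SIDE = {"A": 0, "B": 0, "C": 1, "D": 1}  # which side of the internal edge


def tree_distance(x, y):
    """Path length between two taxa on the hypothetical tree (additive)."""
    d = PENDANT[x] + PENDANT[y]
    if SIDE[x] != SIDE[y]:
        d += INTERNAL
    return d


TAXA = sorted(PENDANT)
DIST = {frozenset(p): tree_distance(*p) for p in combinations(TAXA, 2)}


def upgma_first_pair(taxa, dist):
    """UPGMA joins the pair with the smallest distance."""
    return min(combinations(taxa, 2), key=lambda p: dist[frozenset(p)])


def nj_first_pair(taxa, dist):
    """Neighbor-joining joins the pair minimizing
    Q(i, j) = (n - 2) * d(i, j) - r_i - r_j, where r_i is the row sum."""
    n = len(taxa)
    r = {t: sum(dist[frozenset((t, u))] for u in taxa if u != t) for t in taxa}

    def q(p):
        return (n - 2) * dist[frozenset(p)] - r[p[0]] - r[p[1]]

    return min(combinations(taxa, 2), key=q)


print("UPGMA joins first:", upgma_first_pair(TAXA, DIST))
print("NJ joins first:   ", nj_first_pair(TAXA, DIST))
```

Under these assumed edge lengths, UPGMA joins B and D, which are not neighbors on the true tree, while Neighbor-joining joins a true neighbor pair (A,B and C,D tie under the Q criterion). With finite sequences the distances would be noisy estimates rather than exact values, which is the situation the study addresses.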
Collections: 01-internacional
Database: MEDLINE
Main subject: Phylogeny / Computer Simulation / Base Sequence / Sequence Alignment / Models, Biological
Type of study: Incidence_studies / Prognostic_studies
Language: En
Journal: J Comput Biol
Journal subject: MOLECULAR BIOLOGY / MEDICAL INFORMATICS
Year of publication: 1994
Document type: Article
Country of affiliation: New Zealand