Results 1 - 6 of 6
1.
Mol Biol Evol ; 35(1): 159-179, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29087487

ABSTRACT

The phylogenetic relationships among extant gibbon species remain unresolved despite numerous efforts using morphological, behavioral, and genetic data and the sequencing of whole genomes. A major challenge in reconstructing the gibbon phylogeny is the radiative speciation process, which resulted in extremely short internal branches in the species phylogeny and extensive incomplete lineage sorting with extensive gene-tree heterogeneity across the genome. Here, we analyze two genomic-scale data sets, with ∼10,000 putative noncoding and exonic loci, respectively, to estimate the species tree for the major groups of gibbons. We used the Bayesian full-likelihood method bpp under the multispecies coalescent model, which naturally accommodates incomplete lineage sorting and uncertainties in the gene trees. For comparison, we included three heuristic coalescent-based methods (mp-est, SVDQuartets, and astral) as well as concatenation. From both data sets, we infer the phylogeny for the four extant gibbon genera to be (Hylobates, (Nomascus, (Hoolock, Symphalangus))). We used simulation guided by the real data to evaluate the accuracy of the methods used. Astral, while not as efficient as bpp, performed well in estimation of the species tree even in the presence of excessive incomplete lineage sorting. Concatenation, mp-est and SVDQuartets were unreliable when the species tree contains very short internal branches. A likelihood ratio test of gene flow suggests a small amount of migration from Hylobates moloch to H. pileatus, while cross-genera migration is absent or rare. Our results highlight the utility of coalescent-based methods in addressing challenging species tree problems characterized by short internal branches and rampant gene tree-species tree discordance.


Subjects
Hylobates/classification , Hylobates/genetics , DNA Sequence Analysis/methods , Algorithms , Animals , Bayes Theorem , Computer Simulation , Molecular Evolution , Genetic Speciation , Population Genetics/methods , Genomics/methods , Genetic Models , Phylogeny
2.
Bull Math Biol ; 81(2): 408-430, 2019 02.
Article in English | MEDLINE | ID: mdl-29926380

ABSTRACT

Coalescent models of evolution account for incomplete lineage sorting by specifying a species tree parameter which determines a distribution on gene trees, and consequently, a site pattern probability distribution. It has been shown that the unrooted topology of the species tree parameter of the multispecies coalescent is generically identifiable, and a reconstruction method called SVDQuartets has been developed to infer this topology. In this paper, we describe a modified multispecies coalescent model that allows for varying effective population size and violations of the molecular clock. We show that the unrooted topology of the species tree parameter for these models is generically identifiable and that SVDQuartets can still be used to infer this topology.


Subjects
Genetic Models , Phylogeny , Computational Biology , Computer Simulation , Molecular Evolution , Genetic Speciation , Mathematical Concepts , Statistical Models , Probability
3.
BMC Genomics ; 19(Suppl 5): 286, 2018 May 08.
Article in English | MEDLINE | ID: mdl-29745854

ABSTRACT

BACKGROUND: Estimation of species trees from multiple genes is complicated by processes such as incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer, that result in gene trees that differ from each other and from the species phylogeny. Methods to estimate species trees in the presence of gene tree discord due to incomplete lineage sorting have been developed and proved to be statistically consistent when gene tree discord is due only to incomplete lineage sorting and every gene tree includes the full set of species. RESULTS: We establish statistical consistency of certain coalescent-based species tree estimation methods under some models of taxon deletion from genes. We also evaluate the impact of missing data on four species tree estimation methods (ASTRAL-II, ASTRID, MP-EST, and SVDquartets) using simulated datasets with varying levels of incomplete lineage sorting, gene tree estimation error, and degrees/patterns of missing data. CONCLUSIONS: All the species tree estimation methods improved in accuracy as the number of genes increased and often produced highly accurate species trees even when the amount of missing data was large. These results together indicate that accurate species tree estimation is possible under a variety of conditions, even when there are substantial amounts of missing data.


Subjects
Classification/methods , Genetic Speciation , Genetic Models , Phylogeny , Algorithms , Computer Simulation , Genes , Genomics , Species Specificity
4.
Syst Biol ; 64(6): 1032-47, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26227865

ABSTRACT

Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practices for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the presence of missing data. Phylogenetic analysis of RAD loci requires careful attention to model assumptions, especially if downstream analyses depend on branch lengths.
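
To illustrate why the number of invariant sites matters for branch lengths, the following is a minimal two-taxon sketch, not the method described in the abstract. It assumes the Jukes-Cantor (JC69) distance as a stand-in for a branch length, and the function name `jc69_distance` and all site counts are hypothetical values chosen only for illustration.

```python
import math


def jc69_distance(n_diff, n_sites):
    """Jukes-Cantor distance from n_diff differing sites out of n_sites."""
    p = n_diff / n_sites
    if p >= 0.75:
        # Saturation: the JC69 correction is undefined at or above 0.75.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical counts for one pair of taxa (assumptions, not data from the study).
differing_sites = 30     # SNP columns at which the two taxa differ
snp_columns = 100        # all SNP columns in the alignment
invariant_sites = 9900   # invariant columns that a SNP-only alignment omits

# Distance computed from the SNP-only alignment.
d_snps_only = jc69_distance(differing_sites, snp_columns)

# Distance after adding the invariant columns back to the denominator.
d_full = jc69_distance(differing_sites, snp_columns + invariant_sites)

print(f"SNPs only: {d_snps_only:.4f}  full sequence: {d_full:.6f}")
```

In this toy setting the SNP-only distance is more than a hundred times larger than the full-sequence distance, which is the kind of inflation an acquisition bias correction is meant to counter.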


Subjects
Classification/methods , Computer Simulation , Phylogeny , Single Nucleotide Polymorphism/genetics , Animals , Lizards/classification , Lizards/genetics
5.
Am J Bot ; 103(2): 337-47, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26851268

ABSTRACT

PREMISE OF THE STUDY: Relationships among closely related and recently diverged taxa can be especially difficult to resolve. Here we use both Sanger sequencing and next-generation RADseq data sets to estimate phylogenetic relationships among species of Carex section Racemosae (Cyperaceae), a clade largely restricted to high latitudes and elevations. Interest in relationships among these taxa derives from questions about the species' biogeographic histories and possible links between diversification and Pleistocene glaciations. METHODS: A combination of approaches and molecular markers was used to estimate relationships among Carex species within sect. Racemosae and taxa from closely related sections. Nuclear and chloroplast loci generated by Sanger sequencing were analyzed with *BEAST, and SNP data from RADseq loci were analyzed as a concatenated data set using maximum likelihood and as independent loci using SVDquartets. KEY RESULTS: Sanger sequencing data sets resolved relationships among taxa at intermediate phylogenetic depths (albeit with low levels of support). Only the RADseq data resolved relationships with strong support at all phylogenetic depths. Moreover, different methods and data partitions of the RADseq data resulted in nearly identical topologies. Carex sect. Racemosae is a strongly supported clade, although a handful of species were found to group with closely related sections. Herbarium specimens up to 35 yr old successfully produced informative RADseq data. CONCLUSIONS: Despite the short read lengths of RADseq data, they nevertheless resolved relationships that Sanger sequencing data did not. Resolution of the phylogenetic relationships among recently and rapidly diversifying taxa within sect. Racemosae clades suggests a role for the Pleistocene glaciations in clade diversification.


Subjects
Carex Plant/genetics , Molecular Evolution , Phylogeny , Cell Nucleus/genetics , Chloroplasts/genetics , Intergenic DNA/genetics , Plant DNA/genetics , Molecular Sequence Data , DNA Sequence Analysis
6.
Front Genet ; 12: 664357, 2021.
Article in English | MEDLINE | ID: mdl-34276772

ABSTRACT

A phylogenetic model of sequence evolution for a set of n taxa is a collection of probability distributions on the 4^n possible site patterns that may be observed in their aligned DNA sequences. For a four-taxon model, one can arrange the entries of these probability distributions into three flattening matrices that correspond to the three different unrooted leaf-labeled four-leaf trees, or quartet trees. The flattening matrix corresponding to the tree parameter of the model is known to satisfy certain rank conditions. Methods such as ErikSVD and SVDQuartets take advantage of this observation by applying singular value decomposition to flattening matrices consisting of empirical data. Each possible quartet is assigned an "SVD score" based on how close the flattening is to the set of matrices of the predicted rank. When choosing among possible quartets, the one with the lowest score is inferred to be the phylogeny of the four taxa under consideration. Since an n-leaf phylogenetic tree is determined by its quartets, this approach can be generalized to infer larger phylogenies. In this article, we explore using the SVD score as a test statistic to test whether phylogenetic data were generated by a particular quartet tree. To do so, we use several results to approximate the distribution of the SVD score and to give upper bounds on the p-value of the associated hypothesis tests. We also apply these hypothesis tests to simulated phylogenetic data and discuss the implications for interpreting SVD scores in rank-based inference methods.
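
The flattening and scoring steps described above can be sketched roughly as follows. This is a minimal illustration using NumPy, not the implementation from the article or from any SVDQuartets software. It assumes a predicted rank of 10 (the value commonly cited for SVDQuartets under the multispecies coalescent; the abstract itself only says "the predicted rank"), takes the score to be the Frobenius distance to the nearest matrix of that rank, and ignores gaps and ambiguity codes. The names `flattening`, `svd_score`, and `best_quartet`, and the toy alignment, are hypothetical.

```python
from itertools import product

import numpy as np

BASES = "ACGT"
INDEX = {base: i for i, base in enumerate(BASES)}

# The three unrooted quartet trees on taxa 1..4, as pairs of taxon indices.
SPLITS = {
    "12|34": ((0, 1), (2, 3)),
    "13|24": ((0, 2), (1, 3)),
    "14|23": ((0, 3), (1, 2)),
}


def flattening(seqs, split):
    """Return the 16x16 matrix of site-pattern frequencies for one split."""
    (a, b), (c, d) = split
    counts = np.zeros((16, 16))
    for site in zip(*seqs):
        row = 4 * INDEX[site[a]] + INDEX[site[b]]
        col = 4 * INDEX[site[c]] + INDEX[site[d]]
        counts[row, col] += 1
    return counts / counts.sum()


def svd_score(matrix, rank=10):
    """Frobenius distance from matrix to the nearest matrix of the given rank."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sqrt(np.sum(singular_values[rank:] ** 2)))


def best_quartet(seqs, rank=10):
    """Score all three splits and return the lowest-scoring one with all scores."""
    scores = {
        name: svd_score(flattening(seqs, split), rank)
        for name, split in SPLITS.items()
    }
    return min(scores, key=scores.get), scores


# Toy alignment: taxa 1 and 2 are identical, taxa 3 and 4 are identical,
# and every combination of the two shared bases appears once.
pairs = list(product(BASES, repeat=2))
left = "".join(x for x, _ in pairs)
right = "".join(y for _, y in pairs)
seqs = [left, left, right, right]

best, scores = best_quartet(seqs)
print(best, scores)
```

On this toy alignment the flattening for 12|34 has only four nonzero rows, so its score is numerically zero, while the other two flattenings are full-rank diagonal matrices with strictly positive scores.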
