Results 1 - 11 of 11
1.
Bioinformatics ; 37(Suppl_1): i102-i110, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252953

ABSTRACT

MOTIVATION: Precise time calibrations needed to estimate ages of species divergence are not always available due to fossil records' incompleteness. Consequently, clock calibrations available for Bayesian dating analyses can be few and diffused, i.e. phylogenies are calibration-poor, impeding reliable inference of the timetree of life. We examined the role of speciation birth-death (BD) tree prior on Bayesian node age estimates in calibration-poor phylogenies and tested the usefulness of an informative, data-driven tree prior for enhancing the accuracy and precision of estimated times. RESULTS: We present a simple method to estimate parameters of the BD tree prior from the molecular phylogeny for use in Bayesian dating analyses. The use of a data-driven birth-death (ddBD) tree prior leads to improvement in Bayesian node age estimates for calibration-poor phylogenies. We show that the ddBD tree prior, along with only a few well-constrained calibrations, can produce excellent node ages and credibility intervals, whereas the use of an uninformative, uniform (flat) tree prior may require more calibrations. Relaxed clock dating with ddBD tree prior also produced better results than a flat tree prior when using diffused node calibrations. We also suggest using ddBD tree priors to improve the detection of outliers and influential calibrations in cross-validation analyses. These results have practical applications because the ddBD tree prior reduces the number of well-constrained calibrations necessary to obtain reliable node age estimates. This would help address key impediments in building the grand timetree of life, revealing the process of speciation and elucidating the dynamics of biological diversification. AVAILABILITY AND IMPLEMENTATION: An R module for computing the ddBD tree prior, simulated datasets and empirical datasets are available at https://github.com/cathyqqtao/ddBD-tree-prior.


Subjects
Molecular Evolution, Fossils, Bayes Theorem, Calibration, Genetic Speciation, Phylogeny
2.
Mol Biol Evol ; 37(6): 1819-1831, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32119075

ABSTRACT

The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.


Subjects
Molecular Evolution, Genomics/methods, Genetic Models, Phylogeny, Plants/genetics
3.
Bioinformatics ; 36(Suppl_2): i884-i894, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381826

ABSTRACT

MOTIVATION: As the number and diversity of species and genes grow in contemporary datasets, two common assumptions made in all molecular dating methods, namely the time-reversibility and stationarity of the substitution process, become untenable. No software tools for molecular dating allow researchers to relax these two assumptions in their data analyses. Frequently the same General Time Reversible (GTR) model across lineages along with a gamma (+Γ) distributed rates across sites is used in relaxed clock analyses, which assumes time-reversibility and stationarity of the substitution process. Many reports have quantified the impact of violations of these underlying assumptions on molecular phylogeny, but none have systematically analyzed their impact on divergence time estimates. RESULTS: We quantified the bias on time estimates that resulted from using the GTR + Γ model for the analysis of computer-simulated nucleotide sequence alignments that were evolved with non-stationary (NS) and non-reversible (NR) substitution models. We tested Bayesian and RelTime approaches that do not require a molecular clock for estimating divergence times. Divergence times obtained using a GTR + Γ model differed only slightly (∼3% on average) from the expected times for NR datasets, but the difference was larger for NS datasets (∼10% on average). The use of only a few calibrations reduced these biases considerably (∼5%). Confidence and credibility intervals from GTR + Γ analysis usually contained correct times. Therefore, the bias introduced by the use of the GTR + Γ model to analyze datasets, in which the time-reversibility and stationarity assumptions are violated, is likely not large and can be reduced by applying multiple calibrations. AVAILABILITY AND IMPLEMENTATION: All datasets are deposited in Figshare: https://doi.org/10.6084/m9.figshare.12594638.


Subjects
Molecular Evolution, Genetic Models, Bayes Theorem, Fossils, Phylogeny, Sequence Alignment
4.
Syst Biol ; 67(4): 594-615, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29342307

ABSTRACT

Primates have long been a test case for the development of phylogenetic methods for divergence time estimation. Despite a large number of studies, however, the timing of origination of crown Primates relative to the Cretaceous-Paleogene (K-Pg) boundary and the timing of diversification of the main crown groups remain controversial. Here, we analysed a data set of 372 taxa (367 Primates and 5 outgroups, 3.4 million aligned base pairs) that includes nine primate genomes. We systematically explore the effect of different interpretations of fossil calibrations and molecular clock models on primate divergence time estimates. We find that even small differences in the construction of fossil calibrations can have a noticeable impact on estimated divergence times, especially for the oldest nodes in the tree. Notably, choice of molecular rate model (autocorrelated or independently distributed rates) has an especially strong effect on estimated times, with the independent rates model producing considerably more ancient age estimates for the deeper nodes in the phylogeny. We implement thermodynamic integration, combined with Gaussian quadrature, in the program MCMCTree, and use it to calculate Bayes factors for clock models. Bayesian model selection indicates that the autocorrelated rates model fits the primate data substantially better, and we conclude that time estimates under this model should be preferred. We show that for eight core nodes in the phylogeny, uncertainty in time estimates is close to the theoretical limit imposed by fossil uncertainties. Thus, these estimates are unlikely to be improved by collecting additional molecular sequence data. All analyses place the origin of Primates close to the K-Pg boundary, either in the Cretaceous or straddling the boundary into the Palaeogene.


Subjects
Molecular Evolution, Genome, Phylogeny, Primates/classification, Animals, Bayes Theorem, Calibration, Fossils/anatomy & histology, Genetic Models, Primates/anatomy & histology, Primates/genetics
5.
New Phytol ; 218(2): 819-834, 2018 04.
Article in English | MEDLINE | ID: mdl-29399804

ABSTRACT

Through the lens of the fossil record, angiosperm diversification precipitated a Cretaceous Terrestrial Revolution (KTR) in which pollinators, herbivores and predators underwent explosive co-diversification. Molecular dating studies imply that early angiosperm evolution is not documented in the fossil record. This mismatch remains controversial. We used a Bayesian molecular dating method to analyse a dataset of 83 genes from 644 taxa and 52 fossil calibrations to explore the effect of different interpretations of the fossil record, molecular clock models, data partitioning, among other factors, on angiosperm divergence time estimation. Controlling for different sources of uncertainty indicates that the timescale of angiosperm diversification is much less certain than previous molecular dating studies have suggested. Discord between molecular clock and purely fossil-based interpretations of angiosperm diversification may be a consequence of false precision on both sides. We reject a post-Jurassic origin of angiosperms, supporting the notion of a cryptic early history of angiosperms, but this history may be as much as 121 Myr, or as little as 23 Myr. These conclusions remain compatible with palaeobotanical evidence and a more general KTR in which major groups of angiosperms diverged later within the Cretaceous, alongside the diversification of pollinators, herbivores and their predators.


Subjects
Biological Evolution, Magnoliopsida/physiology, Uncertainty, Bayes Theorem, Calibration, Fossils, Genetic Variation, Magnoliopsida/genetics, Time Factors
6.
Mol Phylogenet Evol ; 114: 386-400, 2017 09.
Article in English | MEDLINE | ID: mdl-28709986

ABSTRACT

Fossil calibrations are the utmost source of information for resolving the distances between molecular sequences into estimates of absolute times and absolute rates in molecular clock dating analysis. The quality of calibrations is thus expected to have a major impact on divergence time estimates even if a huge amount of molecular data is available. In Bayesian molecular clock dating, fossil calibration information is incorporated in the analysis through the prior on divergence times (the time prior). Here, we evaluate three strategies for converting fossil calibrations (in the form of minimum- and maximum-age bounds) into the prior on times, which differ according to whether they borrow information from the maximum age of ancestral nodes and minimum age of descendent nodes to form constraints for any given node on the phylogeny. We study a simple example that is analytically tractable, and analyze two real datasets (one of 10 primate species and another of 48 seed plant species) using three Bayesian dating programs: MCMCTree, MrBayes and BEAST2. We examine how different calibration strategies, the birth-death process, and automatic truncation (to enforce the constraint that ancestral nodes are older than descendent nodes) interact to determine the time prior. In general, truncation has a great impact on calibrations so that the effective priors on the calibration node ages after the truncation can be very different from the user-specified calibration densities. The different strategies for generating the effective prior also had considerable impact, leading to very different marginal effective priors. Arbitrary parameters used to implement minimum-bound calibrations were found to have a strong impact upon the prior and posterior of the divergence times. Our results highlight the importance of inspecting the joint time prior used by the dating program before any Bayesian dating analysis.


Subjects
Fossils, Animals, Bayes Theorem, Biological Evolution, Calibration, Cytochromes b/classification, Cytochromes b/genetics, Electron Transport Complex IV/classification, Electron Transport Complex IV/genetics, Fossils/history, Ancient History, Mitochondria/genetics, NADH Dehydrogenase/classification, NADH Dehydrogenase/genetics, Phylogeny, Plants/classification, Plants/genetics, Primates/classification, Primates/genetics
7.
Front Bioinform ; 3: 1225807, 2023.
Article in English | MEDLINE | ID: mdl-37600967

ABSTRACT

A common practice in molecular systematics is to infer phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis practice ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. The joint inference performed better when the phylogeny was not well resolved, situations in which the joint inference should be preferred. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.

8.
Front Bioinform ; 3: 1284744, 2023.
Article in English | MEDLINE | ID: mdl-38162123

ABSTRACT

The primate infraorder Simiiformes, comprising Old and New World monkeys and apes, includes the most well-studied species on earth. Their most comprehensive molecular timetree, assembled from thousands of published studies, is found in the TimeTree database and contains 268 simiiform species. It is, however, missing 38 out of 306 named species in the NCBI taxonomy for which at least one molecular sequence exists in the NCBI GenBank. We developed a three-pronged approach to expanding the timetree of Simiiformes to contain 306 species. First, molecular divergence times were searched and found for 21 missing species in timetrees published across 15 studies. Second, untimed molecular phylogenies were searched and scaled to time using relaxed clocks to add four more species. Third, we reconstructed ten new timetrees from genetic data in GenBank, allowing us to incorporate 13 more species. Finally, we assembled the most comprehensive molecular timetree of Simiiformes containing all 306 species for which any molecular data exists. We compared the species divergence times with those previously imputed using statistical approaches in the absence of molecular data. The latter data-less imputed times were not significantly correlated with those derived from the molecular data. Also, using phylogenies containing imputed times produced different trends of evolutionary distinctiveness and speciation rates over time than those produced using the molecular timetree. These results demonstrate that more complete clade-specific timetrees can be produced by analyzing existing information, which we hope will encourage future efforts to fill in the missing taxa in the global timetree of life.

9.
Genome Biol Evol ; 13(11)2021 11 05.
Article in English | MEDLINE | ID: mdl-34751377

ABSTRACT

Rapid relaxed-clock dating methods are frequently applied to analyze phylogenomic data sets containing hundreds to thousands of sequences because of their accuracy and computational efficiency. However, the relative performance of different rapid dating methods is yet to be compared on the same data sets, and, thus, the power and pitfalls of selecting among these approaches remain unclear. We compared the accuracy, bias, and coverage probabilities of RelTime, treePL, and least-squares dating time estimates by applying them to analyze computer-simulated data sets in which evolutionary rates varied extensively among branches in the phylogeny. RelTime estimates were consistently more accurate than the other two, particularly when evolutionary rates were autocorrelated or shifted convergently among lineages. The 95% confidence intervals (CIs) around RelTime dates showed appropriate coverage probabilities (95% on average), but other methods produced rather low coverage probabilities because of overly narrow CIs of time estimates. Overall, RelTime appears to be a more efficient method for estimating divergence times for large phylogenies.


Subjects
Molecular Evolution, Genetic Models, Bayes Theorem, Biological Evolution, Computer Simulation, Phylogeny
10.
BMC Ecol Evol ; 21(1): 83, 2021 05 12.
Article in English | MEDLINE | ID: mdl-33980146

ABSTRACT

BACKGROUND: Matrices of morphological characters are frequently used for dating species divergence times in systematics. In some studies, morphological and molecular character data from living taxa are combined, whereas others use morphological characters from extinct taxa as well. We investigated whether morphological data produce time estimates that are concordant with molecular data. If true, it will justify the use of morphological characters alongside molecular data in divergence time inference. RESULTS: We systematically analyzed three empirical datasets from different species groups to test the concordance of species divergence dates inferred using molecular and discrete morphological data from extant taxa as test cases. We found a high correlation between their divergence time estimates, despite a poor linear relationship between branch lengths for morphological and molecular data mapped onto the same phylogeny. This was because node-to-tip distances showed a much higher correlation than branch lengths due to an averaging effect over multiple branches. We found that nodes with a large number of taxa often benefit from such averaging. However, considerable discordance between time estimates from molecules and morphology may still occur as some intermediate nodes may show large time differences between these two types of data. CONCLUSIONS: Our findings suggest that node- and tip-calibration approaches may be better suited for nodes with many taxa. Nevertheless, we highlight the importance of evaluating the concordance of intrinsic time structure in morphological and molecular data before any dating analysis using combined datasets.


Subjects
Biological Evolution, Fossils, Bayes Theorem, Phylogeny, Time
11.
Mol Ecol Resour ; 21(1): 122-136, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32881388

ABSTRACT

Simultaneous molecular dating of population and species divergences is essential in many biological investigations, including phylogeography, phylodynamics and species delimitation studies. In these investigations, multiple sequence alignments consist of both intra- and interspecies samples (mixed samples). As a result, the phylogenetic trees contain interspecies, interpopulation and within-population divergences. Bayesian relaxed clock methods are often employed in these analyses, but they assume the same tree prior for both inter- and intraspecies branching processes and require specification of a clock model for branch rates (independent vs. autocorrelated rates models). We evaluated the impact of a single tree prior on Bayesian divergence time estimates by analysing computer-simulated data sets. We also examined the effect of the assumption of independence of evolutionary rate variation among branches when the branch rates are autocorrelated. Bayesian approach with coalescent tree priors generally produced excellent molecular dates and highest posterior densities with high coverage probabilities. We also evaluated the performance of a non-Bayesian method, RelTime, which does not require the specification of a tree prior or a clock model. RelTime's performance was similar to that of the Bayesian approach, suggesting that it is also suitable to analyse data sets containing both populations and species variation when its computational efficiency is needed.


Subjects
Molecular Evolution, Mammals, Genetic Models, Phylogeny, Animals, Bayes Theorem, Computer Simulation, Reproducibility of Results