Results 1 - 20 of 36
Proc Natl Acad Sci U S A ; 120(7): e2208851120, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36757894


The birth-death model is commonly used to infer speciation and extinction rates by fitting the model to phylogenetic trees with exclusively extant taxa. Recently, it was demonstrated that speciation and extinction rates are not identifiable if the rates are allowed to vary freely over time. The group of birth-death models that have the same likelihood is called a congruence class, and there is no statistical evidence to favor one model over the other. This issue has led researchers to question if and what patterns can reliably be inferred from phylogenies of only extant taxa and whether time-variable birth-death models should be fitted at all. We explore the congruence class in the context of several empirical phylogenies as well as hypothetical scenarios. For these empirical phylogenies, we assume that we inferred the true congruence class. Thus, our conclusions apply to any empirical phylogeny for which we robustly inferred the true congruence class. When we summarize shared patterns in the congruence class, we show that strong directional trends in speciation and extinction rates are shared among most models. Therefore, we conclude that the inference of strong directional trends is robust. Conversely, estimates of constant rates or gentle slopes are not robust and must be treated with caution. Interestingly, the space of valid speciation rates is narrower and more limited in contrast to extinction rates, which are less constrained. These results provide further evidence and insights that speciation rates can be estimated more reliably than extinction rates.

Biological Extinction , Parturition , Female , Pregnancy , Humans , Phylogeny , Probability , Genetic Speciation
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38630635


Bayesian coalescent skyline plot models are widely used to infer demographic histories. The first (non-Bayesian) coalescent skyline plot model assumed a known genealogy as data, while subsequent models and implementations jointly inferred the genealogy and demographic history from sequence data, including heterochronous samples. Overall, there exist multiple different Bayesian coalescent skyline plot models which mainly differ in two key aspects: (i) how changes in population size are modeled through independent or autocorrelated prior distributions, and (ii) how many change-points in the demographic history are used, where they occur and if the number is pre-specified or inferred. The specific impact of each of these choices on the inferred demographic history is not known because of two reasons: first, not all models are implemented in the same software, and second, each model implementation makes specific choices that the biologist cannot influence. To facilitate a detailed evaluation of Bayesian coalescent skyline plot models, we implemented all currently described models in a flexible design into the software RevBayes. Furthermore, we evaluated models and choices on an empirical dataset of horses supplemented by a small simulation study. We find that estimated demographic histories can be grouped broadly into two groups depending on how change-points in the demographic history are specified (either independent of or at coalescent events). Our simulations suggest that models using change-points at coalescent events produce spurious variation near the present, while most models using independent change-points tend to over-smooth the inferred demographic history.
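As an illustration of the idea behind the first (non-Bayesian) skyline plot mentioned above, the following is a minimal sketch, not the RevBayes implementation. It assumes a known genealogy with isochronous samples, supplied as a list of coalescent ages, and time measured in coalescent units so that the waiting time while k lineages exist has expectation N / C(k, 2). The function name `classic_skyline` is hypothetical.

```python
def classic_skyline(coalescent_times):
    """Moment estimates of population size, one per coalescent interval.

    coalescent_times: ages of the n - 1 coalescent events (present = 0).
    Returns a list of (interval_start, interval_end, size_estimate).
    """
    n_lineages = len(coalescent_times) + 1  # n samples give n - 1 events
    estimates = []
    previous = 0.0
    for age in sorted(coalescent_times):
        waiting = age - previous
        # E[waiting] = N / C(k, 2), so N is estimated by waiting * k(k-1)/2.
        pairs = n_lineages * (n_lineages - 1) / 2.0
        estimates.append((previous, age, waiting * pairs))
        previous = age
        n_lineages -= 1  # each coalescent event merges two lineages
    return estimates


if __name__ == "__main__":
    for start, end, size in classic_skyline([0.1, 0.3, 1.0]):
        print(f"{start:.2f}-{end:.2f}: {size:.3f}")
```

Each interval here ends at a coalescent event, which corresponds to the "change-points at coalescent events" family of models; models with independent change-points would instead pool several waiting times per interval.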

Bayes Theorem , Population Genetics , Genetic Models , Animals , Population Genetics/methods , Horses , Population Density , Computer Simulation , Software , Demography
Mol Biol Evol ; 41(3), 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38437512


Poor fit between models of sequence or trait evolution and empirical data is known to cause biases and lead to spurious conclusions about evolutionary patterns and processes. Bayesian posterior prediction is a flexible and intuitive approach for detecting such cases of poor fit. However, the expected behavior of posterior predictive tests has never been characterized for evolutionary models, which is critical for their proper interpretation. Here, we show that the expected distribution of posterior predictive P-values is generally not uniform, in contrast to frequentist P-values used for hypothesis testing, and extreme posterior predictive P-values often provide more evidence of poor fit than typically appreciated. Posterior prediction assesses model adequacy under highly favorable circumstances, because the model is fitted to the data, which leads to expected distributions that are often concentrated around intermediate values. Nonuniform expected distributions of P-values do not pose a problem for the application of these tests, however, and posterior predictive P-values can be interpreted as the posterior probability that the fitted model would predict a dataset with a test statistic value as extreme as the value calculated from the observed data.
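The definition of a posterior predictive P-value given above can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the authors' phylogenetic tests: the data are modeled as normal with known unit variance, the mean has a conjugate normal prior, and the test statistic is the sample variance. All names and the prior scale are hypothetical.

```python
import random
import statistics


def posterior_predictive_pvalue(data, n_draws=2000, prior_sd=10.0, seed=1):
    """Fraction of posterior predictive datasets whose sample variance is
    at least as large as the observed sample variance."""
    rng = random.Random(seed)
    n = len(data)
    # Conjugate posterior for the mean: prior N(0, prior_sd^2), unit data variance.
    post_var = 1.0 / (n + 1.0 / prior_sd ** 2)
    post_mean = post_var * sum(data)
    observed = statistics.variance(data)
    hits = 0
    for _ in range(n_draws):
        mu = rng.gauss(post_mean, post_var ** 0.5)           # draw a parameter
        replicate = [rng.gauss(mu, 1.0) for _ in range(n)]   # simulate a dataset
        if statistics.variance(replicate) >= observed:
            hits += 1
    return hits / n_draws


if __name__ == "__main__":
    overdispersed = [-6.0, -3.0, 0.0, 3.0, 6.0] * 4
    print(posterior_predictive_pvalue(overdispersed))
```

Values near 0 or 1 indicate that the fitted model rarely reproduces the observed statistic; because the model is fitted to the same data, such extreme values are harder to obtain than under a uniform reference distribution.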

Statistical Models , Bayes Theorem , Probability
Syst Biol ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38771253


The ideal approach to Bayesian phylogenetic inference is to estimate all parameters of interest jointly in a single hierarchical model. However, this is often not feasible in practice due to the high computational cost. Instead, phylogenetic pipelines generally consist of sequential analyses, whereby a single point estimate from a given analysis is used as input for the next analysis (e.g., a single multiple sequence alignment is used to estimate a gene tree). In this framework, uncertainty is not propagated from step to step, which can lead to inaccurate or spuriously confident results. Here, we formally develop and test a sequential inference approach for Bayesian phylogenetic inference, which uses importance sampling to generate observations for the next step of an analysis pipeline from the posterior distribution produced in the previous step. Our sequential inference approach presented here not only accounts for uncertainty between analysis steps, but also allows for greater flexibility in software choice (and hence model availability) and can be computationally more efficient than the traditional joint inference approach when multiple models are being tested. We show that our sequential inference approach is identical in practice to the joint inference approach only if sufficient information in the data is present (a narrow posterior distribution) and/or sufficiently many importance samples are used. Conversely, we show that the common practice of using a single point estimate can be biased, e.g., a single phylogeny estimate to transform an unrooted phylogeny into a time-calibrated phylogeny. We demonstrate the theory of sequential Bayesian inference using both a toy example and an empirical case study of divergence-time estimation in insects using a relaxed clock model from transcriptome data. 
In the empirical example, we estimate three posterior distributions of branch lengths from the same data (DNA character matrix with a GTR+Γ+I substitution model, an amino acid data matrix with empirical substitution models, and an amino acid data matrix with the PhyloBayes CAT-GTR model). Finally, we apply three different node-calibration strategies and show that divergence-time estimates are affected by both the data source and underlying substitution process to estimate branch lengths as well as the node-calibration strategies. Thus, our new sequential Bayesian phylogenetic inference provides the opportunity to efficiently test different approaches for divergence time estimation, including branch-length estimation from other software.
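The general mechanism of reusing posterior samples from one step in the next can be sketched with plain importance reweighting. This is a generic toy example under stated assumptions, not the algorithm of the paper: step 1 is assumed to yield draws from N(0, 1), and step 2 is assumed to add a Gaussian likelihood for a single observation at 1.0, so the exact joint posterior is N(0.5, 0.5). The function name `importance_reweight` is hypothetical.

```python
import math
import random


def importance_reweight(samples, log_weight):
    """Normalized importance weights and effective sample size (ESS)."""
    log_w = [log_weight(x) for x in samples]
    shift = max(log_w)                       # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]
    ess = 1.0 / sum(wi * wi for wi in w)
    return w, ess


if __name__ == "__main__":
    rng = random.Random(42)
    step1_samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # Assumed step-2 log-likelihood: one observation at 1.0 with unit variance.
    weights, ess = importance_reweight(step1_samples, lambda x: -0.5 * (1.0 - x) ** 2)
    mean = sum(w * x for w, x in zip(weights, step1_samples))
    print(f"weighted mean {mean:.3f}, ESS {ess:.0f}")
```

A low ESS signals that the step-1 posterior is too diffuse relative to the step-2 information for the number of samples used, which matches the abstract's condition that enough importance samples or a narrow posterior is needed.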

Syst Biol ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38963801


Phylogenetic trees establish a historical context for the study of organismal form and function. Most phylogenetic trees are estimated using a model of evolution. For molecular data, modeling evolution is often based on biochemical observations about changes between character states. For example, there are four nucleotides, and we can make assumptions about the probability of transitions between them. By contrast, for morphological characters, we may not know a priori how many character states there are per character, as both extant sampling and the fossil record may be highly incomplete, which leads to an observer bias. For a given character, the state space may be larger than what has been observed in the sample of taxa collected by the researcher. In this case, it may not be clear how many evolutionary rates are needed to describe transitions between morphological character states, potentially leading to model misspecification. To explore the impact of this model misspecification, we simulated character data with varying numbers of character states per character. We then used the data to estimate phylogenetic trees using models of evolution with the correct number of character states and an incorrect number of character states. The results of this study indicate that this observer bias may lead to phylogenetic error, particularly in the branch lengths of trees. If the state space is wrongly assumed to be too large, then we underestimate the branch lengths, and the opposite occurs when the state space is wrongly assumed to be too small.
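To see why the assumed number of states interacts with branch length, the sketch below evaluates transition probabilities under an equal-rates k-state model (the Mk form), assuming rates are normalized to one expected change per unit branch length. This is an illustration only, not the simulation code of the study, and the function names are hypothetical.

```python
import math


def mk_transition_probability(k, t, same_state):
    """Transition probability over branch length t in an equal-rates k-state model."""
    decay = math.exp(-k * t / (k - 1))
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return (1.0 - decay) / k        # probability of ending in one specific other state


def probability_of_observed_change(k, t):
    """Probability that the end state differs from the start state."""
    return 1.0 - mk_transition_probability(k, t, same_state=True)


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, round(probability_of_observed_change(k, 0.5), 4))
```

For a fixed branch length, a larger assumed state space predicts a higher probability of an observed difference, so a model with too many states needs less branch length to explain the same data, in line with the direction of the bias reported above.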

Syst Biol ; 73(2): 455-469, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-38284268


Phylogenies are central to many research areas in biology and are commonly estimated using likelihood-based methods. Unfortunately, any likelihood-based method, including Bayesian inference, can be restrictively slow for large datasets (with many taxa and/or many sites in the sequence alignment) or complex substitution models. The primary limiting factor when using large datasets and/or complex models in probabilistic phylogenetic analyses is the likelihood calculation, which dominates the total computation time. To address this bottleneck, we incorporated the high-performance phylogenetic library BEAGLE into RevBayes, which enables multi-threading on multi-core CPUs and GPUs, as well as hardware-specific vectorized instructions for faster likelihood calculations. Our new implementation of RevBayes+BEAGLE retains the flexibility and dynamic nature that users expect from vanilla RevBayes. In addition, we implemented native parallelization within RevBayes without an external library using the message passing interface (MPI); RevBayes+MPI. We evaluated our new implementation of RevBayes+BEAGLE using multi-threading on CPUs and 2 different powerful GPUs (NVIDIA Titan V and NVIDIA A100) against our native implementation of RevBayes+MPI. We found speedups when multiple cores were used, with up to 20-fold speedup when using multiple CPU cores and over 90-fold speedup when using multiple GPU cores. The improvement depended on the data type used, DNA or amino acids, and the size of the alignment, but less on the size of the tree. We additionally investigated the cost of rescaling partial likelihoods to avoid numerical underflow and showed that unnecessarily frequent and inefficient rescaling can increase runtimes up to 4-fold. Finally, we presented and compared a new approach to store partial likelihoods on branches instead of nodes that can speed up computations up to 1.7 times but comes at twice the memory requirements.

Bayes Theorem , Phylogeny , Software , Classification/methods , Computational Biology/methods
Syst Biol ; 2024 Oct 07.
Article in English | MEDLINE | ID: mdl-39374100


Reconstructing the evolutionary history of different groups of organisms provides insight into how life originated and diversified on Earth. Phylogenetic trees are commonly used to estimate this evolutionary history. Within Bayesian phylogenetics, a major step in estimating a tree is choosing an appropriate model of character evolution. While the most common character data used is molecular sequence data, morphological data remains a vital source of information. The use of morphological characters allows for the incorporation of fossil taxa and, despite advances in molecular sequencing, continues to play a significant role in neontology. Moreover, it is the main data source that allows us to unite extinct and extant taxa directly under the same generating process. We therefore require suitable models of morphological character evolution, the most common being the Mk Lewis model. While it is frequently used in both palaeobiology and neontology, it is not known whether the simple Mk substitution model, or any extensions to it, provide a sufficiently good description of the process of morphological evolution. In this study we investigate the impact of different morphological models on empirical tetrapod data sets. Specifically, we compare unpartitioned Mk models with those where characters are partitioned by the number of observed states, both with and without allowing for rate variation across sites and accounting for ascertainment bias. We show that the choice of substitution model has an impact on both topology and branch lengths, highlighting the importance of model choice. Through simulations, we validate the use of the model adequacy approach, posterior predictive simulations, for choosing an appropriate model. Additionally, we compare the performance of model adequacy with Bayesian model selection.
We demonstrate how model selection approaches based on marginal likelihoods are not appropriate for choosing between models with partition schemes that vary in character state space (i.e., that vary in Q-matrix state size). Using posterior predictive simulations, we found that current variations of the Mk model are often performing adequately in capturing the evolutionary dynamics that generated our data. We do not find any preference for a particular model extension across multiple data sets, indicating that there is no 'one size fits all' when it comes to morphological data and that careful consideration should be given to choosing models of discrete character evolution. By using suitable models of character evolution, we can increase our confidence in our phylogenetic estimates, which should in turn allow us to gain more accurate insights into the evolutionary history of both extinct and extant taxa.

Syst Biol ; 72(6): 1418-1432, 2023 Dec 30.
Article in English | MEDLINE | ID: mdl-37455495


Model selection aims to choose the most adequate model for the statistical analysis at hand. The model must be complex enough to capture the complexity of the data but should be simple enough not to overfit. In phylogenetics, the most common model selection scenario concerns selecting an adequate substitution and partition model for sequence evolution to infer a phylogenetic tree. Previously, several studies showed that substitution model under-parameterization can bias phylogenetic studies. Here, we explored the impact of substitution model over-parameterization in a Bayesian statistical framework. We performed simulations under the simplest substitution model, the Jukes-Cantor model, and compared posterior estimates of phylogenetic tree topologies and tree length under the true model to the most complex model, the GTR+Γ+I substitution model, including over-splitting the data into additional subsets (i.e., applying partitioned models). We explored 4 choices of prior distributions: the default substitution model priors of MrBayes, BEAST2, and RevBayes and a newly devised prior choice (Tame). Our results show that Bayesian inference of phylogeny is robust to substitution model over-parameterization and over-partitioning but only under our new prior settings. All 3 current default priors introduced biases for the estimated tree length. We conclude that substitution and partition model selection are superfluous steps in Bayesian phylogenetic inference pipelines if well-behaved prior distributions are applied and more effort should focus on more complex and biologically realistic substitution models.

Genetic Models , Research Design , Phylogeny , Bayes Theorem
Syst Biol ; 71(4): 797-809, 2022 06 16.
Article in English | MEDLINE | ID: mdl-34668564


Dating the tree of life is central to understanding the evolution of life on Earth. Molecular clocks calibrated with fossils represent the state of the art for inferring the ages of major groups. Yet, other information on the timing of species diversification can be used to date the tree of life. For example, horizontal gene transfer events and ancient coevolutionary interactions such as (endo)symbioses occur between contemporaneous species and thus can imply temporal relationships between two nodes in a phylogeny. Temporal constraints from these alternative sources can be particularly helpful when the geological record is sparse, for example, for microorganisms, which represent the majority of extant and extinct biodiversity. Here, we present a new method to combine fossil calibrations and relative age constraints to estimate chronograms. We provide an implementation of relative age constraints in RevBayes that can be combined in a modular manner with the wide range of molecular dating methods available in the software. We use both realistic simulations and empirical datasets of 40 Cyanobacteria and 62 Archaea to evaluate our method. We show that the combination of relative age constraints with fossil calibrations significantly improves the estimation of node ages. [Archaea, Bayesian analysis, cyanobacteria, dating, endosymbiosis, lateral gene transfer, MCMC, molecular clock, phylogenetic dating, relaxed molecular clock, revbayes, tree of life.].

Fossils , Horizontal Gene Transfer , Bayes Theorem , Molecular Evolution , Phylogeny , Symbiosis
J Evol Biol ; 35(11): 1488-1499, 2022 11.
Article in English | MEDLINE | ID: mdl-36168726


The firefly Photinus pyralis inhabits a wide range of latitudinal and ecological niches, with populations living from temperate to tropical habitats. Despite its broad distribution, its demographic history is unknown. In this study, we modelled and inferred different demographic scenarios for North American populations of P. pyralis, which were collected from Texas to New Jersey. We used a combination of ABC techniques (for multi-population/colonization analyses) and likelihood inference (dadi, StairwayPlot2, PoMo) for single-population demographic inference, which proved useful with our RAD data. We uncovered that the most ancestral North American population lies in Texas, from which the species colonized the Central region of the US and, more recently, the North Eastern coast. Our study confidently rejects a demographic scenario where the North Eastern populations colonized more southern populations until reaching Texas. To estimate the age of divergence of P. pyralis, which provides deeper insights into the history of the entire species, we assembled a multi-locus phylogenetic dataset covering the genus Photinus. We uncovered that the phylogenetic node leading to P. pyralis lies at the end of the Miocene. Importantly, modelling the demographic history of North American P. pyralis serves as a null model of nucleotide diversity patterns in a widespread native insect species, which will serve in future studies for the detection of adaptation events in this firefly species, as well as a comparison for future studies of other North American insect taxa.

Acclimatization , Fireflies , Animals , Phylogeny , North America , Demography
PLoS Comput Biol ; 16(10): e1007999, 2020 10.
Article in English | MEDLINE | ID: mdl-33112848


Birth-death processes have given biologists a model-based framework to answer questions about changes in the birth and death rates of lineages in a phylogenetic tree. Therefore birth-death models are central to macroevolutionary as well as phylodynamic analyses. Early approaches to studying temporal variation in birth and death rates using birth-death models faced difficulties due to the restrictive choices of birth and death rate curves through time. Sufficiently flexible time-varying birth-death models are still lacking. We use a piecewise-constant birth-death model, combined with both Gaussian Markov random field (GMRF) and horseshoe Markov random field (HSMRF) prior distributions, to approximate arbitrary changes in birth rate through time. We implement these models in the widely used statistical phylogenetic software platform RevBayes, allowing us to jointly estimate birth-death process parameters, phylogeny, and nuisance parameters in a Bayesian framework. We test both GMRF-based and HSMRF-based models on a variety of simulated diversification scenarios, and then apply them to both a macroevolutionary and an epidemiological dataset. We find that both models are capable of inferring variable birth rates and correctly rejecting variable models in favor of effectively constant models. In general the HSMRF-based model has higher precision than its GMRF counterpart, with little to no loss of accuracy. Applied to a macroevolutionary dataset of the Australian gecko family Pygopodidae (where birth rates are interpretable as speciation rates), the GMRF-based model detects a slow decrease whereas the HSMRF-based model detects a rapid speciation-rate decrease in the last 12 million years. Applied to an infectious disease phylodynamic dataset of sequences from HIV subtype A in Russia and Ukraine (where birth rates are interpretable as the rate of accumulation of new infections), our models detect a strongly elevated rate of infection in the 1990s.
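The difference between the two smoothing priors can be illustrated by simulating from them. The sketch below is a generic illustration, not the RevBayes model specification: it assumes a first-order random walk on the log birth rate across intervals, with normal increments for the GMRF case and, for the horseshoe case, increments whose standard deviation is multiplied by a half-Cauchy local scale. The function name, starting rate, and scale values are all hypothetical.

```python
import math
import random


def sample_log_rates(n_intervals, global_scale, horseshoe=False, seed=0,
                     first_log_rate=math.log(0.1)):
    """Draw piecewise-constant log rates from a first-order Markov random field prior."""
    rng = random.Random(seed)
    log_rates = [first_log_rate]
    for _ in range(n_intervals - 1):
        local_scale = 1.0
        if horseshoe:
            # Half-Cauchy(0, 1) draw via the inverse CDF of a standard Cauchy.
            local_scale = abs(math.tan(math.pi * (rng.random() - 0.5)))
        increment = rng.gauss(0.0, global_scale * local_scale)
        log_rates.append(log_rates[-1] + increment)
    return log_rates


if __name__ == "__main__":
    print(sample_log_rates(10, 0.05))                  # GMRF-like: gentle drift
    print(sample_log_rates(10, 0.05, horseshoe=True))  # mostly flat, occasional jumps
```

The heavy tail of the local scales lets the horseshoe version stay nearly constant over most intervals while still permitting abrupt shifts, which is one way to read the higher precision reported for the HSMRF-based model.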

Birth Rate , Biological Models , Statistical Models , Mortality , Algorithms , Animals , Bayes Theorem , Biological Evolution , Computational Biology , Computer Simulation , Lizards/physiology
Syst Biol ; 68(3): 505-519, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30476308


A major goal of evolutionary biology is to identify key evolutionary transitions that correspond with shifts in speciation and extinction rates. Stochastic character mapping has become the primary method used to infer the timing, nature, and number of character state transitions along the branches of a phylogeny. The method is widely employed for standard substitution models of character evolution. However, current approaches cannot be used for models that specifically test the association of character state transitions with shifts in diversification rates such as state-dependent speciation and extinction (SSE) models. Herein, we introduce a new stochastic character mapping algorithm that overcomes these limitations, and apply it to study mating system evolution over a time-calibrated phylogeny of the plant family Onagraceae. Utilizing a hidden state SSE model we tested the association of the loss of self-incompatibility (SI) with shifts in diversification rates. We found that self-compatible lineages have higher extinction rates and lower net-diversification rates compared with self-incompatible lineages. Furthermore, these results provide empirical evidence for the "senescing" diversification rates predicted in highly selfing lineages: our mapped character histories show that the loss of SI is followed by a short-term spike in speciation rates, which declines after a time lag of several million years resulting in negative net-diversification. Lineages that have long been self-compatible, such as Fuchsia and Clarkia, are in a previously unrecognized and ongoing evolutionary decline. Our results demonstrate that stochastic character mapping of SSE models is a powerful tool for examining the timing and nature of both character state transitions and shifts in diversification rates over the phylogeny.

Classification/methods , Biological Models , Onagraceae/classification , Phylogeny , Algorithms , Biological Extinction , Genetic Speciation
Syst Biol ; 68(1): 78-92, 2019 01 01.
Article in English | MEDLINE | ID: mdl-29931325


New World Monkeys (NWM) (platyrrhines) are one of the most diverse groups of primates, occupying today a wide range of ecosystems in the American tropics and exhibiting large variations in ecology, morphology, and behavior. Although the relationships among the almost 200 living species are relatively well understood, we lack robust estimates of the timing of origin, ancestral morphology, and geographic range evolution of the clade. Herein, we integrate paleontological and molecular evidence to assess the evolutionary dynamics of extinct and extant platyrrhines. We develop novel analytical frameworks to infer the evolution of body mass, changes in latitudinal ranges through time, and species diversification rates using a phylogenetic tree of living and fossil taxa. Our results show that platyrrhines originated 5-10 million years earlier than previously assumed, dating back to the Middle Eocene. The estimated ancestral platyrrhine was small (weighing 0.4 kg) and matched the size of its presumed African ancestors. As the three platyrrhine families diverged, we recover a rapid change in body mass range. During the Miocene Climatic Optimum, fossil diversity peaked and platyrrhines reached their widest latitudinal range, expanding as far south as Patagonia, favored by a warm and humid climate and the lower elevation of the Andes. Finally, global cooling and aridification after the middle Miocene triggered a geographic contraction of NWM and increased their extinction rates. These results unveil the full evolutionary trajectory of an iconic and ecologically important radiation of monkeys and showcase the necessity of integrating fossil and molecular data for reliably estimating evolutionary rates and trends.

Climate , Fossils , Phylogeny , Platyrrhini/classification , Africa , Animals , Platyrrhini/anatomy & histology
Mol Biol Evol ; 35(4): 1028-1034, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29136211


Tests of absolute model fit are crucial in model-based inference because poorly structured models can lead to biased parameter estimates. In Bayesian inference, posterior predictive simulations can be used to test absolute model fit. However, such tests have not been commonly practiced in phylogenetic inference due to a lack of convenient and flexible software. Here, we describe our newly implemented tests of model fit using posterior predictive testing, based on both data- and inference-based test statistics, in the phylogenetics software RevBayes. This new implementation makes a large spectrum of models available for use through a user-friendly and flexible interface.

Statistical Models , Phylogeny , Software , Animals , Bayes Theorem , Cytochromes b/genetics , Primates/genetics
Syst Biol ; 67(2): 195-215, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-28945917


Chromosome number is a key feature of the higher-order organization of the genome, and changes in chromosome number play a fundamental role in evolution. Dysploid gains and losses in chromosome number, as well as polyploidization events, may drive reproductive isolation and lineage diversification. The recent development of probabilistic models of chromosome number evolution in the groundbreaking work by Mayrose et al. (2010, ChromEvol) has enabled the inference of ancestral chromosome numbers over molecular phylogenies and generated new interest in studying the role of chromosome changes in evolution. However, the ChromEvol approach assumes all changes occur anagenetically (along branches), and does not model events that are specifically cladogenetic. Cladogenetic changes may be expected if chromosome changes result in reproductive isolation. Here we present a new class of models of chromosome number evolution (called ChromoSSE) that incorporate both anagenetic and cladogenetic change. The ChromoSSE models allow us to determine the mode of chromosome number evolution: is chromosome evolution occurring primarily within lineages, primarily at lineage splitting, or in clade-specific combinations of both? Furthermore, we can estimate the location and timing of possible chromosome speciation events over the phylogeny. We implemented ChromoSSE in a Bayesian statistical framework, specifically in the software RevBayes, to accommodate uncertainty in parameter estimates while leveraging the full power of likelihood-based methods. We tested ChromoSSE's accuracy with simulations and re-examined chromosomal evolution in Aristolochia, Carex section Spirostachyae, Helianthus, Mimulus sensu lato (s.l.), and Primula section Aleuritia, finding evidence for clade-specific combinations of anagenetic and cladogenetic dysploid and polyploid modes of chromosome evolution.
[Anagenetic; Bayes factors; chromosome evolution; chromosome speciation; chromoSSE; cladogenetic; dysploidy; phylogenetic models; polyploidy; reversible-jump Markov chain Monte Carlo; whole genome duplication.].

Classification , Molecular Evolution , Genetic Models , Bayes Theorem , Phylogeny , Plants/classification , Plants/genetics
Syst Biol ; 67(6): 940-964, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29438538


In macroevolution, the Red Queen (RQ) model posits that biodiversity dynamics depend mainly on species-intrinsic biotic factors such as interactions among species or life-history traits, while the Court Jester (CJ) model states that extrinsic environmental abiotic factors have a stronger role. Until recently, a lack of relevant methodological approaches has prevented the unraveling of contributions from these 2 types of factors to the evolutionary history of a lineage. Herein, we take advantage of the rapid development of new macroevolution models that tie diversification rates to changes in paleoenvironmental (extrinsic) and/or biotic (intrinsic) factors. We inferred a robust and fully-sampled species-level phylogeny, as well as divergence times and ancestral geographic ranges, and related these to the radiation of Apollo butterflies (Parnassiinae) using both extant (molecular) and extinct (fossil/morphological) evidence. We tested whether their diversification dynamics are better explained by an RQ or CJ hypothesis, by assessing whether speciation and extinction were mediated by diversity-dependence (niche filling) and clade-dependent host-plant association (RQ) or by large-scale continuous changes in extrinsic factors such as climate or geology (CJ). For the RQ hypothesis, we found significant differences in speciation rates associated with different host-plants but detected no sign of diversity-dependence. For CJ, the role of Himalayan-Tibetan building was substantial for biogeography but not a driver of high speciation, while positive dependence between warm climate and speciation/extinction was supported by continuously varying maximum-likelihood models. We find that rather than a single factor, the joint effect of multiple factors (biogeography, species traits, environmental drivers, and mass extinction) is responsible for current diversity patterns and that the same factor might act differently across clades, emphasizing the notion of opportunity. 
This study confirms the importance of the confluence of several factors rather than single explanations in modeling diversification within lineages.

Biological Evolution , Butterflies/classification , Biological Models , Animals , Biodiversity , Butterflies/genetics , Genetic Speciation , Phylogeny
Proc Natl Acad Sci U S A ; 113(34): 9569-74, 2016 08 23.
Article in English | MEDLINE | ID: mdl-27512038


Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM.

Biological Coevolution, Biological Extinction, Genetic Speciation, Phylogeny, Whales/classification, Animals, Bayes Theorem, Biodiversity, Likelihood Functions, Poisson Distribution, Whales/genetics
Mol Ecol ; 27(4): 831-838, 2018 02.
Article in English | MEDLINE | ID: mdl-29148600


Saglam et al. recently argued that the Devil's Hole pupfish (Cyprinodon diabolis), a conservation icon with the smallest known species range, was isolated 60 kya based on a new genomic data set. If true, this would be a radically long timescale for any species to persist at population sizes <500 individuals, in contrast to conservation genetics theory. However, here we argue that their analyses and interpretation are inappropriate. They placed highly restrictive prior distributions on divergence times, which do not appropriately model the large uncertainty and result in removing nearly all uncertainty from their analyses, and chose among models by assuming that pupfishes exhibit human mutation rates. We reanalysed their data with their same methods, only using an informative prior for the plausible range of mutation rates observed across vertebrates, including an estimate of the genomewide mutation rate from a pedigree analysis of cichlid fishes. In fact, Saglam et al.'s phylogenetic data support much younger median divergence times for C. diabolis, ranging from 6.2 to 19.9 kya, overlapping with our previous phylogenetic divergence time estimates of 2.5-6.5 kya. There are many reasons to suspect an even younger age and higher mutation rate in C. diabolis, as we previously estimated, due to their high metabolism, small adult size, small population size and severe environmental stressors. In conclusion, our results highlight the need for measuring mutation rate in this fascinating species and suggest that the ages of endangered taxa present in small, isolated populations may frequently be overestimated.

Mutation Rate, Phylogeny, Animals, Genomics, Humans, Killifishes
Bioinformatics ; 32(5): 789-91, 2016 03 01.
Article in English | MEDLINE | ID: mdl-26543171


Many fundamental questions in evolutionary biology entail estimating rates of lineage diversification (speciation-extinction) that are modeled using birth-death branching processes. We leverage recent advances in branching-process theory to develop a flexible Bayesian framework for specifying diversification models (where rates are constant, vary continuously, or change episodically through time) and implement numerical methods to estimate parameters of these models from molecular phylogenies, even when species sampling is incomplete. We enable both statistical inference and efficient simulation under these models. We also provide robust methods for comparing the relative and absolute fit of competing branching-process models to a given tree, thereby providing rigorous tests of biological hypotheses regarding patterns and processes of lineage diversification. AVAILABILITY AND IMPLEMENTATION: The source code for TESS is freely available.
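
The simplest case mentioned in the abstract above, a birth-death process with constant rates, can be illustrated with a short simulation. The following is a minimal Python sketch and not the TESS implementation: the function names, the rate values, and the starting count of two lineages are assumptions chosen for illustration. It tracks only the number of lineages through time (no tree is built) and compares the simulated mean with the standard expectation n0 * exp((birth - death) * t).

```python
import math
import random


def expected_lineages(n0, birth, death, t):
    """Expected lineage count at time t under constant birth and death rates."""
    return n0 * math.exp((birth - death) * t)


def simulate_lineages(n0, birth, death, t_max, rng):
    """Simulate the lineage count at t_max with the Gillespie algorithm."""
    n, t = n0, 0.0
    while n > 0:
        # The waiting time to the next event is exponential with rate n * (birth + death).
        t += rng.expovariate(n * (birth + death))
        if t > t_max:
            break
        # The event is a speciation with probability birth / (birth + death),
        # otherwise it is an extinction.
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
    return n


# Hypothetical illustration values: 2 starting lineages, birth 1.0, death 0.5, 2 time units.
rng = random.Random(1)
counts = [simulate_lineages(2, 1.0, 0.5, 2.0, rng) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
expected = expected_lineages(2, 1.0, 0.5, 2.0)
```

With enough replicates, `mean_count` should approach `expected`. Rates that vary continuously or episodically through time would require replacing the constant `birth` and `death` arguments with functions of time, which this sketch does not attempt.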

Phylogeny, Bayes Theorem, Biological Evolution, Computer Simulation, Programming Languages
Syst Biol ; 65(4): 726-36, 2016 07.
Article in English | MEDLINE | ID: mdl-27235697


Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]

Classification/methods, Biological Models, Phylogeny, Software, Bayes Theorem