Results 1 - 20 of 94
1.
Syst Biol ; 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38085256

ABSTRACT

Time-scaled phylogenetic trees are an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. The accumulation of genomic data has resolved the tree of life to a great extent, yet timing evolutionary events remains challenging if not impossible without external information such as fossil ages and morphological characters. Methods for incorporating morphology in tree estimation have lagged behind their molecular counterparts, especially in the case of continuous characters. Despite recent advances, such tools are still direly needed as we approach the limits of what molecules can teach us. Here, we implement a suite of state-of-the-art methods for leveraging continuous morphology in phylogenetics, and by conducting extensive simulation studies we thoroughly validate and explore our methods' properties. While retaining model generality and scalability, we make it possible to estimate absolute and relative divergence times from multiple continuous characters while accounting for uncertainty. We compile and analyze one of the most data-type diverse data sets to date, comprising contemporaneous and ancient molecular sequences, and discrete and continuous characters from living and extinct Carnivora taxa. We conclude by synthesizing lessons about our method's behavior, and suggest future research avenues.

2.
PLoS Comput Biol ; 19(7): e1011226, 2023 07.
Article in English | MEDLINE | ID: mdl-37463154

ABSTRACT

Phylogenetic models have become increasingly complex, and phylogenetic data sets have expanded in both size and richness. However, current inference tools lack a model specification language that can concisely describe a complete phylogenetic analysis while remaining independent of implementation details. We introduce a new lightweight and concise model specification language, 'LPhy', which is designed to be both human and machine-readable. A graphical user interface accompanies 'LPhy', allowing users to build models, simulate data, and create natural language narratives describing the models. These narratives can serve as the foundation for manuscript method sections. Additionally, we present a command-line interface for converting LPhy-specified models into analysis specification files (in XML format) compatible with the BEAST2 software platform. Collectively, these tools aim to enhance the clarity of descriptions and reporting of probabilistic models in phylogenetic studies, ultimately promoting reproducibility of results.


Subjects
Language , Software , Humans , Phylogeny , Reproducibility of Results , Statistical Models , User-Computer Interface
3.
Mol Biol Evol ; 39(8)2022 08 03.
Article in English | MEDLINE | ID: mdl-35733333

ABSTRACT

Single-cell sequencing provides a new way to explore the evolutionary history of cells. Compared to traditional bulk sequencing, where a population of heterogeneous cells is pooled to form a single observation, single-cell sequencing isolates and amplifies genetic material from individual cells, thereby preserving the information about the origin of the sequences. However, single-cell data are more error-prone than bulk sequencing data due to the limited genomic material available per cell. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. Our simulations show that modeling errors increase the accuracy of relative divergence times and substitution parameters. We reconstruct the phylogenetic history of a colorectal cancer patient and a healthy patient from single-cell DNA sequencing data. We find that the estimated times of terminal splitting events are shifted forward in time compared to models which ignore errors. We observed that not accounting for errors can overestimate the phylogenetic diversity in single-cell DNA sequencing data. We estimate that 30-50% of the apparent diversity can be attributed to error. Our work enables a full Bayesian approach capable of accounting for errors in the data within the integrative Bayesian software framework BEAST2.


Subjects
Neoplasms , Software , Bayes Theorem , Molecular Evolution , Genomics , Humans , Genetic Models , Phylogeny
4.
Syst Biol ; 71(1): 208-220, 2021 12 16.
Article in English | MEDLINE | ID: mdl-34228807

ABSTRACT

Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth-death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth-death; molecular clock; multispecies coalescent; phylogenetics; species trees.]


Subjects
Genetic Speciation , Biological Models , Biological Evolution , Fossils , Phylogeny
5.
Emerg Infect Dis ; 27(5): 1317-1322, 2021 05.
Article in English | MEDLINE | ID: mdl-33900175

ABSTRACT

Real-time genomic sequencing has played a major role in tracking the global spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), contributing greatly to disease mitigation strategies. In August 2020, after having eliminated the virus, New Zealand experienced a second outbreak. During that outbreak, New Zealand used genomic sequencing in a primary role, leading to a second elimination of the virus. We generated genomes from 78% of the laboratory-confirmed samples of SARS-CoV-2 from the second outbreak and compared them with the available global genomic data. Genomic sequencing rapidly identified that the virus causing the second outbreak in New Zealand belonged to a single cluster, thus resulting from a single introduction. However, successful identification of the origin of this outbreak was impeded by substantial biases and gaps in global sequencing data. Access to a broader and more heterogeneous sample of global genomic data would strengthen efforts to locate the source of any new outbreaks.


Subjects
COVID-19 , SARS-CoV-2 , Disease Outbreaks , Genomics , Humans , New Zealand/epidemiology
6.
Emerg Infect Dis ; 27(3): 687-693, 2021 03.
Article in English | MEDLINE | ID: mdl-33400642

ABSTRACT

Since the first wave of coronavirus disease in March 2020, citizens and permanent residents returning to New Zealand have been required to undergo managed isolation and quarantine (MIQ) for 14 days and mandatory testing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of October 20, 2020, of 62,698 arrivals, testing of persons in MIQ had identified 215 cases of SARS-CoV-2 infection. Among 86 passengers on a flight from Dubai, United Arab Emirates, that arrived in New Zealand on September 29, test results were positive for 7 persons in MIQ. These passengers originated from 5 different countries before a layover in Dubai; 5 had negative predeparture SARS-CoV-2 test results. To assess possible points of infection, we analyzed information about their journeys, disease progression, and virus genomic data. All 7 SARS-CoV-2 genomes were genetically identical, except for a single mutation in 1 sample. Despite predeparture testing, multiple instances of in-flight SARS-CoV-2 transmission are likely.


Subjects
Aircraft , COVID-19 , Quarantine , SARS-CoV-2/isolation & purification , COVID-19/diagnosis , COVID-19/transmission , Humans , Masks , New Zealand , Physical Distancing , SARS-CoV-2/classification , United Arab Emirates
7.
PLoS Comput Biol ; 16(2): e1006717, 2020 02.
Article in English | MEDLINE | ID: mdl-32059006

ABSTRACT

Transcription elongation can be modelled as a three-step process, involving polymerase translocation, NTP binding, and nucleotide incorporation into the nascent mRNA. This cycle of events can be simulated at the single-molecule level as a continuous-time Markov process using parameters derived from single-molecule experiments. Previously developed models differ in the way they are parameterised, and in their incorporation of partial equilibrium approximations. We have formulated a hierarchical network comprised of 12 sequence-dependent transcription elongation models. The simplest model has two parameters and assumes that both translocation and NTP binding can be modelled as equilibrium processes. The most complex model has six parameters and makes no partial equilibrium assumptions. We systematically compared the ability of these models to explain published force-velocity data, using approximate Bayesian computation. This analysis was performed using data for the RNA polymerase complexes of E. coli, S. cerevisiae and Bacteriophage T7. Our analysis indicates that the polymerases differ significantly in their translocation rates, with the rates in T7 pol being fast compared to E. coli RNAP and S. cerevisiae pol II. Different models are applicable in different cases. We also show that all three RNA polymerases have an energetic preference for the posttranslocated state over the pretranslocated state. A Bayesian inference and model selection framework, like the one presented in this publication, should be routinely applicable to the interrogation of single-molecule datasets.


Subjects
Bayes Theorem , Genetic Models , Stochastic Processes , Genetic Transcription , Bacteriophage T7/enzymology , DNA-Directed RNA Polymerases/metabolism , Escherichia coli/enzymology , Kinetics , Markov Chains , Saccharomyces cerevisiae
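
The three-step cycle described in the abstract above (translocation, NTP binding, nucleotide incorporation) can be simulated as a continuous-time Markov process with the Gillespie algorithm. The following is only a minimal sketch, not the authors' implementation: the state names, the function `simulate_elongation`, and all rate constants are illustrative assumptions, and the rates are held constant rather than sequence-dependent as in the published models.

```python
import random

def simulate_elongation(n_steps, k_fwd, k_bck, k_bind, k_release, k_cat, seed=0):
    """Gillespie simulation of one polymerase (hypothetical, constant rates).

    Returns the total time needed to incorporate n_steps nucleotides.
    States: "pre" (pretranslocated), "post" (posttranslocated),
    "bound" (posttranslocated with NTP bound).
    """
    rng = random.Random(seed)
    state = "pre"
    t = 0.0
    added = 0
    while added < n_steps:
        # Reactions available from the current state, with their rates.
        if state == "pre":
            moves = [("post", k_fwd)]
        elif state == "post":
            moves = [("pre", k_bck), ("bound", k_bind)]
        else:  # "bound"
            moves = [("post", k_release), ("incorporate", k_cat)]
        total = sum(rate for _, rate in moves)
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        # Choose which reaction fires, with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for target, rate in moves:
            acc += rate
            if u < acc:
                break
        if target == "incorporate":
            added += 1
            state = "pre"  # the cycle restarts at the next template position
        else:
            state = target
    return t
```

Dividing `n_steps` by the returned time gives a mean elongation velocity for one simulated trajectory; averaging over many seeds would be needed before comparing against any experimental force-velocity data.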
8.
BMC Evol Biol ; 20(1): 54, 2020 05 14.
Article in English | MEDLINE | ID: mdl-32410614

ABSTRACT

BACKGROUND: Bayesian MCMC has become a common approach for phylogenetic inference. But the growing size of molecular sequence data sets has created a pressing need to improve the computational efficiency of Bayesian phylogenetic inference algorithms. RESULTS: This paper develops a new algorithm to improve the efficiency of Bayesian phylogenetic inference for models that include a per-branch rate parameter. In a Markov chain Monte Carlo algorithm, the presented proposal kernel changes evolutionary rates and divergence times at the same time, under the constraint that the implied genetic distances remain constant. Specifically, the proposal operates on the divergence time of an internal node and the three adjacent branch rates. For the root of a phylogenetic tree, there are three strategies discussed, named Simple Distance, Small Pulley and Big Pulley. Note that Big Pulley is able to change the tree topology, which enables the operator to sample all the possible rooted trees consistent with the implied unrooted tree. To validate its effectiveness, a series of experiments have been performed by implementing the proposed operator in the BEAST2 software. CONCLUSIONS: The results demonstrate that the proposed operator is able to improve the performance by giving better estimates for a given chain length and by using less running time for a given level of accuracy. Measured by effective samples per hour, use of the proposed operator results in overall mixing more efficient than the current operators in BEAST2. Especially for large data sets, the improvement is up to half an order of magnitude.


Subjects
Genetic Models , Phylogeny , Algorithms , Bayes Theorem , Calibration , Computer Simulation , Markov Chains , Monte Carlo Method , Time Factors
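
The core idea in the abstract above, changing an internal node's divergence time together with the three adjacent branch rates so that the implied genetic distances stay constant, can be illustrated for a non-root node. This is a rough sketch under stated assumptions, not the BEAST2 operator: the function name `constant_distance_move`, the uniform random-walk window, and the use of node ages (time before present) are all illustrative choices, and the Metropolis-Hastings acceptance step is omitted.

```python
import random

def constant_distance_move(t_parent, t_node, t_child1, t_child2,
                           r_node, r_child1, r_child2, window, rng):
    """Propose a new age for an internal node and rescale three branch rates.

    Ages are times before present, so t_parent > t_node > both child ages.
    r_node is the rate on the branch above the node; r_child1 and r_child2
    are the rates on the branches above its two children.
    Returns (t_new, r_node_new, r_child1_new, r_child2_new), or None when
    the proposed age falls outside the valid interval.
    """
    # Genetic distance on each branch = rate * branch duration.
    d_node = r_node * (t_parent - t_node)
    d_child1 = r_child1 * (t_node - t_child1)
    d_child2 = r_child2 * (t_node - t_child2)

    # Symmetric random-walk proposal on the node age (assumed here).
    t_new = t_node + rng.uniform(-window, window)
    if not (max(t_child1, t_child2) < t_new < t_parent):
        return None

    # Rescale each rate so that every distance is unchanged.
    return (t_new,
            d_node / (t_parent - t_new),
            d_child1 / (t_new - t_child1),
            d_child2 / (t_new - t_child2))
```

Because the distances are preserved, the sequence likelihood contribution of these three branches does not change under the move, which is what lets rates and times be updated jointly.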
9.
Mol Biol Evol ; 36(8): 1804-1816, 2019 08 01.
Article in English | MEDLINE | ID: mdl-31058982

ABSTRACT

Modern phylodynamic methods interpret an inferred phylogenetic tree as a partial transmission chain providing information about the dynamic process of transmission and removal (where removal may be due to recovery, death, or behavior change). Birth-death and coalescent processes have been introduced to model the stochastic dynamics of epidemic spread under common epidemiological models such as the SIS and SIR models and are successfully used to infer phylogenetic trees together with transmission (birth) and removal (death) rates. These methods either integrate analytically over past incidence and prevalence to infer rate parameters, and thus cannot explicitly infer past incidence or prevalence, or allow such inference only in the coalescent limit of large population size. Here, we introduce a particle filtering framework to explicitly infer prevalence and incidence trajectories along with phylogenies and epidemiological model parameters from genomic sequences and case count data in a manner consistent with the underlying birth-death model. After demonstrating the accuracy of this method on simulated data, we use it to assess the prevalence through time of the early 2014 Ebola outbreak in Sierra Leone.


Subjects
Genomics/methods , Incidence , Molecular Epidemiology/methods , Prevalence , Bayes Theorem , Ebola Virus Disease/epidemiology , Humans , Sierra Leone/epidemiology
10.
Syst Biol ; 68(2): 358-364, 2019 03 01.
Article in English | MEDLINE | ID: mdl-29945220

ABSTRACT

Rapidly evolving pathogens, such as viruses and bacteria, accumulate genetic change at a similar timescale over which their epidemiological processes occur, such that it is possible to make inferences about their infectious spread using phylogenetic time-trees. For this purpose it is necessary to choose a phylodynamic model. However, the resulting inferences are contingent on whether the model adequately describes key features of the data. Model adequacy methods allow formal rejection of a model if it cannot generate the main features of the data. We present TreeModelAdequacy, a package for the popular BEAST2 software that allows assessing the adequacy of phylodynamic models. We illustrate its utility by analyzing phylogenetic trees from two viral outbreaks of Ebola and H1N1 influenza. The main features of the Ebola data were adequately described by the coalescent exponential-growth model, whereas the H1N1 influenza data were best described by the birth-death susceptible-infected-recovered model.


Subjects
Computer Simulation , Ebolavirus/classification , Ebolavirus/genetics , Viral Genome/genetics , Influenza A Virus H1N1 Subtype/classification , Influenza A Virus H1N1 Subtype/genetics , Phylogeny , Ebola Virus Disease/epidemiology , Ebola Virus Disease/virology , Humans , Human Influenza/epidemiology , Human Influenza/virology , Software
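
The model adequacy idea in the abstract above (reject a model if it cannot generate the main features of the data) is often expressed as a predictive p-value: a summary statistic of the empirical tree is compared with the same statistic computed on trees simulated under the fitted model. The sketch below shows only that comparison step. The function name `predictive_p_value` and the two-tailed definition are assumptions for illustration and are not taken from the TreeModelAdequacy package.

```python
def predictive_p_value(empirical, simulated):
    """Two-tailed predictive p-value for one summary statistic.

    empirical: statistic computed on the empirical tree.
    simulated: the same statistic computed on trees simulated under the model.
    A small value suggests the model rarely produces data like the observed.
    """
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    n = len(simulated)
    lower = sum(1 for s in simulated if s <= empirical) / n
    upper = sum(1 for s in simulated if s >= empirical) / n
    return min(1.0, 2.0 * min(lower, upper))
```

In practice one such value would be computed per summary statistic (for example tree height or an imbalance measure), and the choice of statistics determines which features of the data the adequacy test is sensitive to.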
11.
PLoS Comput Biol ; 15(8): e1007189, 2019 08.
Article in English | MEDLINE | ID: mdl-31386651

ABSTRACT

Model-based phylodynamic approaches recently employed generalized linear models (GLMs) to uncover potential predictors of viral spread. Very recently some of these models have allowed both the predictors and their coefficients to be time-dependent. However, these studies mainly focused on predictors that are assumed to be constant through time. Here we inferred the phylodynamics of avian influenza A virus H9N2 isolated in 12 Asian countries and regions under both discrete trait analysis (DTA) and structured coalescent (MASCOT) approaches. Using MASCOT we applied a new time-dependent GLM to uncover the underlying factors behind H9N2 spread. We curated a rich set of time-series predictors including annual international live poultry trade and national poultry production figures. This time-dependent phylodynamic prediction model was compared to commonly employed time-independent alternatives. Additionally the time-dependent MASCOT model allowed for the estimation of viral effective sub-population sizes and their changes through time, and these effective population dynamics within each country were predicted by a GLM. International annual poultry trade is a strongly supported predictor of virus migration rates. There was also strong support for geographic proximity as a predictor of migration rate in all GLMs investigated. In time-dependent MASCOT models, national poultry production was also identified as a predictor of virus genetic diversity through time and this signal was obvious in mainland China. Our application of recently introduced time-dependent GLM predictors integrated rich time-series data into Bayesian phylodynamic prediction. We demonstrated the contribution of poultry trade and geographic proximity (potentially unheralded wild bird movements) to avian influenza spread in Asia. To gain a better understanding of the drivers of H9N2 spread, we suggest increased surveillance of the H9N2 virus in countries that are currently under-sampled as well as in wild bird populations in the most affected countries.


Subjects
Influenza A Virus H9N2 Subtype , Avian Influenza/transmission , Biological Models , Animal Migration , Animals , Wild Animals/virology , Asia/epidemiology , Bayes Theorem , Birds/virology , Commerce , Computational Biology , Environmental Monitoring , Influenza A Virus H9N2 Subtype/classification , Influenza A Virus H9N2 Subtype/genetics , Avian Influenza/epidemiology , Avian Influenza/virology , Linear Models , Phylogeography/statistics & numerical data , Population Dynamics , Poultry/virology , Spatio-Temporal Analysis
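
The abstract above does not state the functional form of its GLM. A common choice in phylogeographic GLMs is a log-linear model in which each predictor has a coefficient and a 0/1 inclusion indicator, and a time-dependent version simply evaluates it once per time interval. The sketch below assumes that form; the function names and the example predictors are hypothetical.

```python
import math

def glm_migration_rate(predictors, coefficients, indicators):
    """Log-linear GLM: rate = exp(sum_k indicator_k * coefficient_k * x_k).

    predictors: predictor values x_k for one pair of locations
        (for example log trade volume and negative log distance).
    coefficients: effect sizes, one per predictor.
    indicators: 0/1 flags saying whether each predictor is included.
    """
    if not (len(predictors) == len(coefficients) == len(indicators)):
        raise ValueError("predictors, coefficients, indicators must align")
    log_rate = sum(d * b * x
                   for x, b, d in zip(predictors, coefficients, indicators))
    return math.exp(log_rate)

def rates_through_time(predictors_by_epoch, coefficients, indicators):
    """Time-dependent variant: one rate per epoch from epoch-specific predictors."""
    return [glm_migration_rate(x, coefficients, indicators)
            for x in predictors_by_epoch]
```

Under this assumed form, a predictor whose indicator is 0 has no effect on the rate, which is how support for a predictor can be read off as the posterior frequency of its indicator being 1.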
12.
PLoS Comput Biol ; 15(4): e1006650, 2019 04.
Article in English | MEDLINE | ID: mdl-30958812

ABSTRACT

Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.


Subjects
Bayes Theorem , Biological Evolution , Phylogeny , Software , Animals , Computational Biology , Computer Simulation , Molecular Evolution , Humans , Markov Chains , Genetic Models , Monte Carlo Method
13.
Mol Biol Evol ; 35(2): 504-517, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29220490

ABSTRACT

Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data. A novel birth-hybridization process is used as the prior for the species network, and we assume a multispecies network coalescent prior for the embedded gene trees. We verify the ability of our method to correctly sample from the posterior distribution, and thus to infer a species network, through simulations. To quantify the power of our method, we reanalyze two large data sets of genes from spruces and yeasts. For the three closely related spruces, we verify the previously suggested homoploid hybridization event in this clade; for the yeast data, we find extensive hybridization events. Our method is available within the BEAST 2 add-on SpeciesNetwork, and thus provides an extensible framework for Bayesian inference of reticulate evolution.


Subjects
Genetic Techniques , Genetic Hybridization , Genetic Models , Phylogeny , Bayes Theorem , Computer Simulation , Molecular Sequence Data , Picea/genetics , Saccharomyces/genetics
14.
Syst Biol ; 67(5): 901-904, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29718447

ABSTRACT

Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.


Subjects
Bayes Theorem , Phylogeny , Software , Molecular Evolution , Markov Chains , Genetic Models , Monte Carlo Method
15.
Syst Biol ; 67(1): 170-174, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28673048

ABSTRACT

Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is not sufficient anymore. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the "Taming the Beast" (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2.


Subjects
Computational Biology/education , Computational Biology/methods , Phylogeny , Software , Teaching Materials , Algorithms
16.
Ecol Appl ; 29(4): e01877, 2019 06.
Article in English | MEDLINE | ID: mdl-30811075

ABSTRACT

Invertebrates are a major component of terrestrial ecosystems; however, estimating their biodiversity is challenging. We compiled an inventory of invertebrate biodiversity along an elevation gradient on the temperate forested island of Hauturu, New Zealand, by DNA barcoding of specimens obtained from leaf litter samples and pitfall traps. We compared the barcodes and biodiversity estimates from this data set with those from a parallel DNA metabarcoding analysis of soil from the same locations, and with pre-existing sequences in reference databases, before exploring the use of combined data sets as a basis for estimating total invertebrate biodiversity. We obtained 1,282 28S and 1,610 COI barcodes from a total of 1,947 invertebrate specimens, which were clustered into 247 (28S) and 366 (COI) OTUs, of which ≤ 10% were represented in GenBank. Coleoptera were most abundant (730 sequenced specimens), followed by Hymenoptera, Diptera, Lepidoptera, and Amphipoda. The most abundant OTU from both the 28S (153 sequences) and COI (140 sequences) data sets was an undescribed beetle from the family Salpingidae. Based on the occurrences of COI OTUs along the elevation gradient, we estimated there are ~1,000 arthropod species (excluding mites) on Hauturu, including 770 insects, of which 344 are beetles. A DNA metabarcoding analysis of soil DNA from the same sites resulted in the identification of similar numbers of OTUs in most invertebrate groups compared with the DNA barcoding, but less than 10% of the DNA barcoding COI OTUs were also detected by the metabarcoding analysis of soil DNA. A mark-recapture analysis based on the overlap between these data sets estimated the presence of approximately 6,800 arthropod species (excluding mites) on the island, including ~3,900 insects. Estimates of New Zealand-wide biodiversity for selected arthropod groups based on matching of the COI DNA barcodes with pre-existing reference sequences suggested over 13,200 insect species are present, including 4,000 Coleoptera, 2,200 Diptera, and 2,700 Hymenoptera species, and 1,000 arachnid species (excluding mites). These results confirm that metabarcoding analyses of soil DNA tend to recover different components of terrestrial invertebrate biodiversity compared to traditional invertebrate sampling, but the combined methods provide a novel basis for estimating invertebrate biodiversity.


Subjects
Taxonomic DNA Barcoding , Ecosystem , Animals , Biodiversity , DNA , Invertebrates , Islands , New Zealand
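
The abstract above reports a mark-recapture estimate based on the overlap between two data sets but does not name the estimator. The simplest estimator of that kind is Lincoln-Petersen, with Chapman's correction for small overlaps; the sketch below assumes those two and uses made-up counts, so it does not reproduce the figures reported in the abstract.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Lincoln-Petersen estimate of total richness from two surveys.

    n_first: OTUs detected by the first method (e.g. specimen barcoding).
    n_second: OTUs detected by the second method (e.g. soil metabarcoding).
    n_both: OTUs detected by both methods (the "recaptures").
    """
    if n_both <= 0:
        raise ValueError("the two surveys must share at least one OTU")
    return n_first * n_second / n_both

def chapman(n_first, n_second, n_both):
    """Chapman's bias-corrected variant; defined even when n_both is 0."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1

# Hypothetical counts: 100 and 80 OTUs, of which 20 are shared.
estimate = lincoln_petersen(100, 80, 20)
```

The smaller the shared fraction, the larger the estimated total, which is why a low overlap between specimen barcoding and soil metabarcoding implies many species detected by neither method.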
17.
Mol Biol Evol ; 34(8): 2101-2114, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28431121

ABSTRACT

Fully Bayesian multispecies coalescent (MSC) methods like *BEAST estimate species trees from multiple sequence alignments. Today thousands of genes can be sequenced for a given study, but using that many genes with *BEAST is intractably slow. An alternative is to use heuristic methods which compromise accuracy or completeness in return for speed. A common heuristic is concatenation, which assumes that the evolutionary history of each gene tree is identical to the species tree. This is an inconsistent estimator of species tree topology, a worse estimator of divergence times, and induces spurious substitution rate variation when incomplete lineage sorting is present. Another class of heuristics directly motivated by the MSC avoids many of the pitfalls of concatenation but cannot be used to estimate divergence times. To enable fuller use of available data and more accurate inference of species tree topologies, divergence times, and substitution rates, we have developed a new version of *BEAST called StarBEAST2. To improve convergence rates we add analytical integration of population sizes, novel MCMC operators and other optimizations. Computational performance improved by 13.5× and 13.8× respectively when analyzing two empirical data sets, and an average of 33.1× across 30 simulated data sets. To enable accurate estimates of per-species substitution rates, we introduce species tree relaxed clocks, and show that StarBEAST2 is a more powerful and robust estimator of rate variation than concatenation. StarBEAST2 is available through the BEAUTi package manager in BEAST 2.4 and above.


Subjects
Sequence Alignment/methods , Base Sequence , Bayes Theorem , Biological Evolution , Computer Simulation , Genetic Speciation , Genetic Models , Mutation Rate , Phylogeny , Software
18.
Syst Biol ; 66(1): 57-73, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-28173531

ABSTRACT

The total-evidence approach to divergence time dating uses molecular and morphological data from extant and fossil species to infer phylogenetic relationships, species divergence times, and macroevolutionary parameters in a single coherent framework. Current model-based implementations of this approach lack an appropriate model for the tree describing the diversification and fossilization process and can produce estimates that lead to erroneous conclusions. We address this shortcoming by providing a total-evidence method implemented in a Bayesian framework. This approach uses a mechanistic tree prior to describe the underlying diversification process that generated the tree of extant and fossil taxa. Previous attempts to apply the total-evidence approach have used tree priors that do not account for the possibility that fossil samples may be direct ancestors of other samples, that is, ancestors of fossil or extant species or of clades. The fossilized birth-death (FBD) process explicitly models the diversification, fossilization, and sampling processes and naturally allows for sampled ancestors. This model was recently applied to estimate divergence times based on molecular data and fossil occurrence dates. We incorporate the FBD model and a model of morphological trait evolution into a Bayesian total-evidence approach to dating species phylogenies. We apply this method to extant and fossil penguins and show that the modern penguins radiated much more recently than has been previously estimated, with the basal divergence in the crown clade occurring at ∼12.7 Ma and most splits leading to extant species occurring in the last 2 myr. Our results demonstrate that including stem-fossil diversity can greatly improve the estimates of the divergence times of crown taxa. The method is available in BEAST2 (version 2.4) software www.beast2.org with packages SA (version at least 1.1.4) and morph-models (version at least 1.0.4) installed.


Subjects
Biological Models , Phylogeny , Spheniscidae/classification , Animals , Bayes Theorem , Fossils , Genetic Speciation , Spheniscidae/anatomy & histology , Spheniscidae/genetics , Time Factors
19.
J Theor Biol ; 447: 41-55, 2018 06 14.
Article in English | MEDLINE | ID: mdl-29550451

ABSTRACT

A birth-death-sampling model gives rise to phylogenetic trees with samples from the past and the present. Interpreting "birth" as branching speciation, "death" as extinction, and "sampling" as fossil preservation and recovery, this model - also referred to as the fossilized birth-death (FBD) model - gives rise to phylogenetic trees on extant and fossil samples. The model has been mathematically analyzed and successfully applied to a range of datasets on different taxonomic levels, such as penguins, plants, and insects. However, the current mathematical treatment of this model does not allow for a group of temporally distinct fossil specimens to be assigned to the same species. In this paper, we provide a general mathematical FBD modeling framework that explicitly takes "stratigraphic ranges" into account, with a stratigraphic range being defined as the lineage interval associated with a single species, ranging through time from the first to the last fossil appearance of the species. To assign a sequence of fossil samples in the phylogenetic tree to the same species, i.e., to specify a stratigraphic range, we need to define the mode of speciation. We provide expressions to account for three common speciation modes: budding (or asymmetric) speciation, bifurcating (or symmetric) speciation, and anagenetic speciation. Our equations allow for flexible joint Bayesian analysis of paleontological and neontological data. Furthermore, our framework is directly applicable to epidemiology, where a stratigraphic range is the observed duration of infection of a single patient, "birth" via budding is transmission, "death" is recovery, and "sampling" is sequencing the pathogen of a patient. Thus, we present a model that allows for incorporation of multiple observations through time from a single patient.


Subjects
Genetic Speciation , Biological Models , Theoretical Models , Animals , Bayes Theorem , Epidemiology , Biological Extinction , Fossils , Humans , Phylogeny
20.
BMC Evol Biol ; 17(1): 42, 2017 02 06.
Article in English | MEDLINE | ID: mdl-28166715

ABSTRACT

BACKGROUND: Reconstructing phylogenies through Bayesian methods has many benefits, which include providing a mathematically sound framework, providing realistic estimates of uncertainty and being able to incorporate different sources of information based on formal principles. Bayesian phylogenetic analyses are popular for interpreting nucleotide sequence data; however, for such studies one needs to specify a site model and associated substitution model. Often, the parameters of the site model are of no interest and an ad-hoc or additional likelihood-based analysis is used to select a single site model. RESULTS: bModelTest allows for a Bayesian approach to inferring and marginalizing site models in a phylogenetic analysis. It is based on trans-dimensional Markov chain Monte Carlo (MCMC) proposals that allow switching between substitution models as well as estimating the posterior probability for gamma-distributed rate heterogeneity, a proportion of invariable sites and unequal base frequencies. The model can be used with the full set of time-reversible models on nucleotides, but we also introduce and demonstrate the use of two subsets of time-reversible substitution models. CONCLUSION: With the new method the site model can be inferred (and marginalized) during the MCMC analysis and does not need to be pre-determined, as is now often the case in practice, by likelihood-based methods. The method is implemented in the bModelTest package of the popular BEAST 2 software, which is open source, licensed under the GNU Lesser General Public License and allows joint site model and tree inference under a wide range of models.


Subjects
Genetic Models , Phylogeny , Software , Algorithms , Base Sequence , Bayes Theorem , Likelihood Functions , Markov Chains , Monte Carlo Method , Uncertainty