Results 1 - 8 of 8
1.
PLoS Comput Biol ; 15(4): e1006650, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30958812

ABSTRACT

Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.


Subjects
Bayes Theorem, Biological Evolution, Phylogeny, Software, Animals, Computational Biology, Computer Simulation, Molecular Evolution, Humans, Markov Chains, Genetic Models, Monte Carlo Method
2.
Syst Biol ; 66(1): 57-73, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-28173531

ABSTRACT

The total-evidence approach to divergence time dating uses molecular and morphological data from extant and fossil species to infer phylogenetic relationships, species divergence times, and macroevolutionary parameters in a single coherent framework. Current model-based implementations of this approach lack an appropriate model for the tree describing the diversification and fossilization process and can produce estimates that lead to erroneous conclusions. We address this shortcoming by providing a total-evidence method implemented in a Bayesian framework. This approach uses a mechanistic tree prior to describe the underlying diversification process that generated the tree of extant and fossil taxa. Previous attempts to apply the total-evidence approach have used tree priors that do not account for the possibility that fossil samples may be direct ancestors of other samples, that is, ancestors of fossil or extant species or of clades. The fossilized birth-death (FBD) process explicitly models the diversification, fossilization, and sampling processes and naturally allows for sampled ancestors. This model was recently applied to estimate divergence times based on molecular data and fossil occurrence dates. We incorporate the FBD model and a model of morphological trait evolution into a Bayesian total-evidence approach to dating species phylogenies. We apply this method to extant and fossil penguins and show that the modern penguins radiated much more recently than has been previously estimated, with the basal divergence in the crown clade occurring at ∼12.7 Ma and most splits leading to extant species occurring in the last 2 myr. Our results demonstrate that including stem-fossil diversity can greatly improve the estimates of the divergence times of crown taxa. The method is available in BEAST2 (version 2.4) software (www.beast2.org) with packages SA (version at least 1.1.4) and morph-models (version at least 1.0.4) installed.


Subjects
Biological Models, Phylogeny, Spheniscidae/classification, Animals, Bayes Theorem, Fossils, Genetic Speciation, Spheniscidae/anatomy & histology, Spheniscidae/genetics, Time Factors
3.
J Theor Biol ; 447: 41-55, 2018 Jun 14.
Article in English | MEDLINE | ID: mdl-29550451

ABSTRACT

A birth-death-sampling model gives rise to phylogenetic trees with samples from the past and the present. Interpreting "birth" as branching speciation, "death" as extinction, and "sampling" as fossil preservation and recovery, this model - also referred to as the fossilized birth-death (FBD) model - gives rise to phylogenetic trees on extant and fossil samples. The model has been mathematically analyzed and successfully applied to a range of datasets on different taxonomic levels, such as penguins, plants, and insects. However, the current mathematical treatment of this model does not allow for a group of temporally distinct fossil specimens to be assigned to the same species. In this paper, we provide a general mathematical FBD modeling framework that explicitly takes "stratigraphic ranges" into account, with a stratigraphic range being defined as the lineage interval associated with a single species, ranging through time from the first to the last fossil appearance of the species. To assign a sequence of fossil samples in the phylogenetic tree to the same species, i.e., to specify a stratigraphic range, we need to define the mode of speciation. We provide expressions to account for three common speciation modes: budding (or asymmetric) speciation, bifurcating (or symmetric) speciation, and anagenetic speciation. Our equations allow for flexible joint Bayesian analysis of paleontological and neontological data. Furthermore, our framework is directly applicable to epidemiology, where a stratigraphic range is the observed duration of infection of a single patient, "birth" via budding is transmission, "death" is recovery, and "sampling" is sequencing the pathogen of a patient. Thus, we present a model that allows for incorporation of multiple observations through time from a single patient.
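
The definition of a stratigraphic range given in this abstract (the interval from the first to the last fossil appearance of a species) can be illustrated with a minimal sketch. It is not taken from the paper: the function name, the input layout of `(species, age)` pairs, and the convention that ages are measured backwards from the present are all assumptions made for illustration.

```python
from collections import defaultdict


def stratigraphic_ranges(occurrences):
    """Collapse fossil occurrences into one range per species.

    `occurrences` is an iterable of (species, age) pairs, where age is
    assumed to be time before the present (larger means older).
    Returns {species: (first_appearance_age, last_appearance_age)}.
    """
    ages_by_species = defaultdict(list)
    for species, age in occurrences:
        ages_by_species[species].append(age)
    # The first appearance is the oldest occurrence, the last appearance the youngest.
    return {
        species: (max(ages), min(ages))
        for species, ages in ages_by_species.items()
    }


if __name__ == "__main__":
    # Made-up occurrences, purely illustrative (not data from the paper).
    example = [("sp_A", 12.0), ("sp_A", 5.5), ("sp_A", 9.1), ("sp_B", 3.0)]
    print(stratigraphic_ranges(example))
```

A species known from a single occurrence gets a range whose two endpoints coincide. Assigning ranges to lineages in a tree additionally requires choosing a speciation mode, as the abstract explains, and that step is not covered by this sketch.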


Subjects
Genetic Speciation, Biological Models, Theoretical Models, Animals, Bayes Theorem, Epidemiology, Biological Extinction, Fossils, Humans, Phylogeny
4.
PLoS Comput Biol ; 10(12): e1003919, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25474353

ABSTRACT

Phylogenetic analyses which include fossils or molecular sequences that are sampled through time require models that allow one sample to be a direct ancestor of another sample. As previously available phylogenetic inference tools assume that all samples are tips, they do not allow for this possibility. We have developed and implemented a Bayesian Markov Chain Monte Carlo (MCMC) algorithm to infer what we call sampled ancestor trees, that is, trees in which sampled individuals can be direct ancestors of other sampled individuals. We use a family of birth-death models where individuals may remain in the tree process after sampling; in particular, we extend the birth-death skyline model [Stadler et al., 2013] to sampled ancestor trees. This method allows the detection of sampled ancestors as well as estimation of the probability that an individual will be removed from the process when it is sampled. We show that even if sampled ancestors are not of specific interest in an analysis, failing to account for them leads to significant bias in parameter estimates. We also show that sampled ancestor birth-death models where every sample comes from a different time point are non-identifiable and thus require one parameter to be known in order to infer other parameters. We apply our phylogenetic inference accounting for sampled ancestors to epidemiological data, where the possibility of sampled ancestors enables us to identify individuals that infected other individuals after being sampled and to infer fundamental epidemiological parameters. We also apply the method to infer divergence times and diversification rates when fossils are included along with extant species samples, so that fossilisation events are modelled as a part of the tree branching process. Such modelling has many advantages as argued in the literature. The sampler is available as an open-source BEAST2 package (https://github.com/CompEvol/sampled-ancestors).
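
To make the process described in this abstract more concrete, here is a rough forward simulation of a birth-death process with sampling, in which a sampled individual is removed only with some probability. It is a minimal sketch under constant rates, not the authors' MCMC sampler and not the BEAST2 package; the function name, the parameter names (`birth`, `death`, `sampling`, `removal_prob`) and the `max_lineages` safety cap are assumptions introduced here.

```python
import random


def simulate_sampling_process(birth, death, sampling, removal_prob,
                              t_max, seed=None, max_lineages=10_000):
    """Gillespie-style simulation of lineage counts, started from one lineage.

    Each lineage gives birth at rate `birth`, dies at rate `death`, and is
    sampled at rate `sampling`. A sampled lineage leaves the process with
    probability `removal_prob`; otherwise it stays and can still give rise
    to later samples (a candidate sampled ancestor).
    """
    per_lineage_rate = birth + death + sampling
    if per_lineage_rate <= 0:
        raise ValueError("at least one rate must be positive")
    rng = random.Random(seed)
    t = 0.0
    lineages = 1
    counts = {"births": 0, "deaths": 0,
              "samples_removed": 0, "samples_retained": 0}
    while 0 < lineages < max_lineages:
        # Waiting time to the next event across all current lineages.
        t += rng.expovariate(per_lineage_rate * lineages)
        if t >= t_max:
            break
        u = rng.random() * per_lineage_rate
        if u < birth:
            lineages += 1
            counts["births"] += 1
        elif u < birth + death:
            lineages -= 1
            counts["deaths"] += 1
        elif rng.random() < removal_prob:
            lineages -= 1
            counts["samples_removed"] += 1
        else:
            counts["samples_retained"] += 1
    counts["lineages_at_end"] = lineages
    return counts


if __name__ == "__main__":
    print(simulate_sampling_process(1.0, 0.5, 0.3, 0.2, t_max=5.0, seed=1))
```

A retained sample only becomes a sampled ancestor if one of its descendants is sampled later, so `samples_retained` is an upper bound on the number of sampled ancestors in a run. Tracking the genealogy itself would be needed to count them exactly.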


Subjects
Computational Biology/methods, Fossils, Genetic Models, Phylogeny, Algorithms, Bayes Theorem, Molecular Evolution, HIV Infections/virology, HIV-1/classification, HIV-1/genetics, Humans, Software
5.
Database (Oxford) ; 2020, 2020 Dec 01.
Article in English | MEDLINE | ID: mdl-33258967

ABSTRACT

Variants within the non-coding genome are frequently associated with phenotypes in genome-wide association studies. These non-coding regions may be involved in the regulation of gene expression, encode functional non-coding RNAs, or influence splicing and other cellular functions. We have curated a list of characterized non-coding human genome variants based on the published evidence that indicates phenotypic consequences of the variation. In order to minimize annotation errors, two curators have independently verified the supporting evidence for pathogenicity of each non-coding variant in the published literature. The database consists of 721 non-coding variants linked to the published literature describing the evidence of functional consequences. We have also sampled 7228 covariate-matched benign controls that have a population frequency of over 5% from the single nucleotide polymorphism database (dbSNP151). These were sampled controlling for potential confounding factors such as linkage with pathogenic variants, annotation type (untranslated region, intron, intergenic, etc.) and variant type (substitution or indel). The dataset presented here represents a curated repository, with a potential use for the training or evaluation of algorithms used in the prediction of non-coding variant functionality. Database URL: https://github.com/Gardner-BinfLab/ncVarDB.


Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Genetic Linkage, Genetic Variation, Human Genome, Humans, INDEL Mutation, Single Nucleotide Polymorphism/genetics
6.
Nat Commun ; 9(1): 5237, 2018 Dec 07.
Article in English | MEDLINE | ID: mdl-30532040

ABSTRACT

Measuring the pace at which speciation and extinction occur is fundamental to understanding the origin and evolution of biodiversity. Both the fossil record and molecular phylogenies of living species can provide independent estimates of speciation and extinction rates, but often produce strikingly divergent results. Despite its implications, the theoretical reasons for this discrepancy remain unknown. Here, we reveal a conceptual and methodological basis able to reconcile palaeontological and molecular evidence: discrepancies are driven by different implicit assumptions about the processes of speciation and species evolution in palaeontological and neontological analyses. We present the "birth-death chronospecies" model that clarifies the definition of speciation and extinction processes allowing for a coherent joint analysis of fossil and phylogenetic data. Using simulations and empirical analyses we demonstrate not only that this model explains much of the apparent incongruence between fossils and phylogenies, but that differences in rate estimates are actually informative about the prevalence of different speciation modes.


Subjects
Biological Extinction, Fossils, Genetic Speciation, Paleontology/methods, Algorithms, Animals, Cetacea/classification, Cetacea/genetics, Molecular Evolution, Genetic Models, Phylogeny
7.
Sci Transl Med ; 8(320): 320ra2, 2016 Jan 06.
Article in English | MEDLINE | ID: mdl-26738795

ABSTRACT

New HIV diagnoses among men having sex with men (MSM) have not decreased appreciably in most countries, even though care and prevention services have been scaled up substantially in the past 20 years. To maximize the impact of prevention strategies, it is crucial to quantify the sources of transmission at the population level. We used viral sequence and clinical patient data from one of Europe's nationwide cohort studies to estimate probable sources of transmission for 617 recently infected MSM. Seventy-one percent of transmissions were from undiagnosed men, 6% from men who had initiated antiretroviral therapy (ART), 1% from men with no contact to care for at least 18 months, and 43% from those in their first year of infection. The lack of substantial reductions in incidence among Dutch MSM is not a result of ineffective ART provision or inadequate retention in care. In counterfactual modeling scenarios, 19% of these past cases could have been averted with current annual testing coverage and immediate ART to those testing positive. Sixty-six percent of these cases could have been averted with available antiretrovirals (immediate ART provided to all MSM testing positive, and preexposure antiretroviral prophylaxis taken by half of all who test negative for HIV), but only if half of all men at risk of transmission had tested annually. With increasing sequence coverage, molecular epidemiological analyses can be a key tool to direct HIV prevention strategies to the predominant sources of infection, and help send HIV epidemics among MSM into a decisive decline.


Subjects
HIV Infections/epidemiology, HIV Infections/prevention & control, Male Homosexuality/statistics & numerical data, Adult, HIV Infections/diagnosis, HIV Infections/transmission, Humans, Incidence, Male, Netherlands/epidemiology, Phylogeny
8.
Algorithms Mol Biol ; 8(1): 26, 2013 Oct 28.
Article in English | MEDLINE | ID: mdl-24164709

ABSTRACT

BACKGROUND: In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information is available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. RESULTS: We describe an algorithm that is polynomial in the number of sampled individuals for counting resolutions of a constraint tree, assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. CONCLUSIONS: These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
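
The "easy" case mentioned in the background, where all samples come from the same time point, has well-known closed forms, and a short sketch may help fix ideas. The code below uses those classical counts only. It does not implement the paper's algorithms for constrained, serially sampled, or sampled-ancestor trees, and the function names are chosen here for illustration.

```python
from math import comb


def count_ranked_trees(n: int) -> int:
    """Ranked, rooted, binary trees on n labelled tips sampled at one time.

    Going backwards in time, each coalescence joins one of C(k, 2) pairs
    among the k remaining lineages, giving n! (n-1)! / 2^(n-1) in total.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = 1
    for k in range(2, n + 1):
        total *= comb(k, 2)
    return total


def count_unranked_trees(n: int) -> int:
    """Rooted, binary trees on n labelled tips, ignoring node order: (2n-3)!!."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = 1
    for odd in range(3, 2 * n - 2, 2):
        total *= odd
    return total


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_unranked_trees(n), count_ranked_trees(n))
```

Both counts grow super-exponentially in n, which is one reason exact counts matter when a prior is specified as uniform over a tree space.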
