1.
Bioinformatics ; 38(23): 5182-5190, 2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36227122

ABSTRACT

MOTIVATION: The multispecies coalescent model is now widely accepted as an effective model for incorporating variation in the evolutionary histories of individual genes into methods for phylogenetic inference from genome-scale data. However, because model-based analysis under the coalescent can be computationally expensive for large datasets, a variety of inferential frameworks and corresponding algorithms have been proposed for estimation of species-level phylogenies and associated parameters, including speciation times and effective population sizes. RESULTS: We consider the problem of estimating the timing of speciation events along a phylogeny in a coalescent framework. We propose a maximum a posteriori estimator based on composite likelihood (MAPCL) for inferring these speciation times under a model of DNA sequence evolution for which exact site-pattern probabilities can be computed under the assumption of a constant θ throughout the species tree. We demonstrate that the MAPCL estimates are statistically consistent and asymptotically normally distributed, and we show how this result can be used to estimate their asymptotic variance. We also provide a more computationally efficient estimator of the asymptotic variance based on the non-parametric bootstrap. We evaluate the performance of our method using simulation and by application to an empirical dataset for gibbons. AVAILABILITY AND IMPLEMENTATION: The method has been implemented in the PAUP* program, freely available at https://paup.phylosolutions.com for Macintosh, Windows and Linux operating systems. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Software; Phylogeny; Computer Simulation; Probability; Models, Genetic; Genetic Speciation
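
The non-parametric bootstrap mentioned in the abstract above can be illustrated generically: resample alignment sites with replacement, recompute the estimator on each replicate, and take the sample variance of the replicate estimates. The following is a minimal sketch only, not the PAUP* implementation. The function names, the toy site patterns, and the stand-in estimator `prop_variable` (used here in place of a speciation-time estimator) are all assumptions made for illustration.

```python
import random
from statistics import variance


def bootstrap_variance(sites, estimator, n_reps=200, seed=0):
    """Estimate the variance of `estimator` by resampling sites with replacement."""
    rng = random.Random(seed)
    n = len(sites)
    estimates = []
    for _ in range(n_reps):
        # One bootstrap replicate: n sites drawn with replacement.
        resample = [sites[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return variance(estimates)


def prop_variable(sites):
    """Hypothetical stand-in estimator: fraction of site patterns that are not constant."""
    return sum(1 for s in sites if len(set(s)) > 1) / len(sites)


# Toy site patterns for four taxa (purely illustrative data).
toy_sites = ["AAAA", "AAAC", "ACGT", "GGGG", "TTTA", "CCCC", "AAGG", "TTTT"] * 25
var_hat = bootstrap_variance(toy_sites, prop_variable)
```

Any estimator that maps a list of site patterns to a number could be passed in place of `prop_variable`; the seed is fixed only so that the sketch is reproducible.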
2.
Syst Biol ; 68(6): 1052-1061, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31034053

ABSTRACT

BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io.


Subjects
Classification/methods; Software/standards; Data Interpretation, Statistical; Phylogeny
3.
Proc Natl Acad Sci U S A ; 113(29): 8049-56, 2016 Jul 19.
Article in English | MEDLINE | ID: mdl-27432945

ABSTRACT

Phylogeographic analysis can be described as the study of the geological and climatological processes that have produced contemporary geographic distributions of populations and species. Here, we attempt to understand how the dynamic process of landscape change on Madagascar has shaped the distribution of a targeted clade of mouse lemurs (genus Microcebus) and, conversely, how phylogenetic and population genetic patterns in these small primates can reciprocally advance our understanding of Madagascar's prehuman environment. The degree to which human activity has impacted the natural plant communities of Madagascar is of critical and enduring interest. Today, the eastern rainforests are separated from the dry deciduous forests of the west by a large expanse of presumed anthropogenic grassland savanna, dominated by the Family Poaceae, that blankets most of the Central Highlands. Although there is firm consensus that anthropogenic activities have transformed the original vegetation through agricultural and pastoral practices, the degree to which closed-canopy forest extended from the east to the west remains debated. Phylogenetic and population genetic patterns in a five-species clade of mouse lemurs suggest that longitudinal dispersal across the island was readily achieved throughout the Pleistocene, apparently ending at ∼55 ka. By examining patterns of both inter- and intraspecific genetic diversity in mouse lemur species found in the eastern, western, and Central Highland zones, we conclude that the natural environment of the Central Highlands would have been mosaic, consisting of a matrix of wooded savanna that formed a transitional zone between the extremes of humid eastern and dry western forest types.


Subjects
Cheirogaleidae/genetics; Animals; DNA, Mitochondrial/genetics; Forests; Madagascar; Phylogeny; Phylogeography
4.
Syst Biol ; 64(3): 525-31, 2015 May.
Article in English | MEDLINE | ID: mdl-25577605

ABSTRACT

Phycas is open source, freely available Bayesian phylogenetics software written primarily in C++ but with a Python interface. Phycas specializes in Bayesian model selection for nucleotide sequence data, particularly the estimation of marginal likelihoods, central to computing Bayes Factors. Marginal likelihoods can be estimated using newer methods (Thermodynamic Integration and Generalized Steppingstone) that are more accurate than the widely used Harmonic Mean estimator. In addition, Phycas supports two posterior predictive approaches to model selection: Gelfand-Ghosh and Conditional Predictive Ordinates. The General Time Reversible family of substitution models, as well as a codon model, are available, and data can be partitioned with all parameters unlinked except tree topology and edge lengths. Phycas provides for analyses in which the prior on tree topologies allows polytomous trees as well as fully resolved trees, and provides for several choices for edge length priors, including a hierarchical model as well as the recently described compound Dirichlet prior, which helps avoid overly informative induced priors on tree length.


Subjects
Classification/methods; Phylogeny; Software; Algorithms; Bayes Theorem; Chlorophyta/classification; Chlorophyta/genetics
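
To make the contrast between marginal-likelihood estimators in the abstract above concrete, the sketch below shows the Harmonic Mean estimator alongside a simple trapezoid-rule version of Thermodynamic Integration. It is an illustration under stated assumptions, not Phycas code: the function names are hypothetical, the inputs are assumed to be log-likelihoods already sampled by MCMC (from the posterior for the Harmonic Mean, and averaged per power-posterior exponent `beta` for Thermodynamic Integration), and the example numbers are made up.

```python
import math


def harmonic_mean_log_ml(log_liks):
    """Harmonic Mean estimate of the log marginal likelihood.

    log_liks: log-likelihoods of samples drawn from the posterior.
    """
    neg = [-x for x in log_liks]
    m = max(neg)
    # log-sum-exp of (-log L_i), computed stably
    log_sum = m + math.log(sum(math.exp(v - m) for v in neg))
    return math.log(len(log_liks)) - log_sum


def thermodynamic_integration_log_ml(betas, mean_log_liks):
    """Trapezoid-rule integral of E_beta[log L] over beta from 0 to 1.

    betas: power-posterior exponents, expected to span 0 to 1.
    mean_log_liks: mean sampled log-likelihood at each exponent.
    """
    pairs = sorted(zip(betas, mean_log_liks))
    total = 0.0
    for (b0, m0), (b1, m1) in zip(pairs, pairs[1:]):
        total += 0.5 * (b1 - b0) * (m0 + m1)
    return total


# Made-up inputs, for illustration only.
hm = harmonic_mean_log_ml([-12.0, -12.0, -12.0])
ti = thermodynamic_integration_log_ml([0.0, 0.5, 1.0], [-10.0, -8.0, -6.0])
```

The Harmonic Mean needs only one posterior sample, which is part of its appeal, whereas Thermodynamic Integration requires separate MCMC runs at several values of `beta`.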
5.
Am Nat ; 185(3): 433-42, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25674696

ABSTRACT

A fern from the French Pyrenees, ×Cystocarpium roskamianum, is a recently formed intergeneric hybrid between parental lineages that diverged from each other approximately 60 million years ago (mya; 95% highest posterior density: 40.2-76.2 mya). This is an extraordinarily deep hybridization event, roughly akin to an elephant hybridizing with a manatee or a human with a lemur. In the context of other reported deep hybrids, this finding suggests that populations of ferns, and other plants with abiotically mediated fertilization, may evolve reproductive incompatibilities more slowly, perhaps because they lack many of the premating isolation mechanisms that characterize most other groups of organisms. This conclusion implies that major features of Earth's biodiversity, such as the relatively small number of species of ferns compared to those of angiosperms, may be, in part, an indirect by-product of this slower "speciation clock" rather than a direct consequence of adaptive innovations by the more diverse lineages.


Subjects
Ferns/genetics; Genetic Speciation; Hybridization, Genetic; Biological Evolution; France; Molecular Sequence Data; Phylogeny; Reproduction; Sequence Analysis, Protein
6.
Genome Res ; 21(6): 850-62, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21518738

ABSTRACT

Here we provide a detailed comparative analysis across the candidate X-Inactivation Center (XIC) region and the XIST locus in the genomes of six primates and three mammalian outgroup species. Since lemurs and other strepsirrhine primates represent the sister lineage to all other primates, this analysis focuses on lemurs to reconstruct the ancestral primate sequences and to gain insight into the evolution of this region and the genes within it. This comparative evolutionary genomics approach reveals significant expansion in genomic size across the XIC region in higher primates, with minimal size alterations across the XIST locus itself. Reconstructed primate ancestral XIC sequences show that the most dramatic changes during the past 80 million years occurred between the ancestral primate and the lineage leading to Old World monkeys. In contrast, the XIST locus compared between human and the primate ancestor does not indicate any dramatic changes to exons or XIST-specific repeats; rather, evolution of this locus reflects small incremental changes in overall sequence identity and short repeat insertions. While this comparative analysis reinforces that the region around XIST has been subject to significant genomic change, even among primates, our data suggest that evolution of the XIST sequences themselves represents only small lineage-specific changes across the past 80 million years.


Subjects
Evolution, Molecular; Genes, X-Linked/genetics; Lemur/genetics; Phylogeny; RNA, Untranslated/genetics; Animals; Base Sequence; Chromosomes, Artificial, Bacterial; Computational Biology; DNA, Complementary/genetics; Humans; In Situ Hybridization, Fluorescence; Likelihood Functions; Models, Genetic; Molecular Sequence Data; Polymerase Chain Reaction; RNA, Long Noncoding; Sequence Analysis, DNA; Species Specificity
7.
Syst Biol ; 61(1): 170-3, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21963610

ABSTRACT

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.


Subjects
Computational Biology/methods; Phylogeny; Software; Algorithms; Computing Methodologies; Evolution, Molecular; Genome
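
The "phylogenetic likelihood calculations" that the abstract above refers to are conventionally built from per-node partial likelihood arrays combined from the tips toward the root (Felsenstein's pruning algorithm). The sketch below shows that calculation for a single site under the Jukes-Cantor model in plain Python. It does not use or imitate the BEAGLE API; every function name, the three-taxon tree, and the branch lengths are assumptions chosen for illustration.

```python
import math

STATES = "ACGT"


def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for branch length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def tip_partial(base):
    """Partial likelihoods at a tip: 1 for the observed base, 0 otherwise."""
    return [1.0 if s == base else 0.0 for s in STATES]


def combine(left, t_left, right, t_right):
    """Pruning step: partial likelihoods at a parent from its two children."""
    pl, pr = jc69_matrix(t_left), jc69_matrix(t_right)
    return [
        sum(pl[i][j] * left[j] for j in range(4))
        * sum(pr[i][j] * right[j] for j in range(4))
        for i in range(4)
    ]


def site_likelihood(root_partial):
    """Average the root partials over the uniform JC69 base frequencies."""
    return sum(0.25 * x for x in root_partial)


# Hypothetical tree ((A:0.1,B:0.1):0.1,C:0.2) with bases A, A, C at one site.
ab = combine(tip_partial("A"), 0.1, tip_partial("A"), 0.1)
root = combine(ab, 0.1, tip_partial("C"), 0.2)
lik = site_likelihood(root)
```

Because each site and each independent subtree can be processed separately, this inner loop is the part that parallel hardware such as GPUs can accelerate.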
8.
Bioinformatics ; 24(4): 581-3, 2008 Feb 15.
Article in English | MEDLINE | ID: mdl-17766271

ABSTRACT

A key element to a successful Markov chain Monte Carlo (MCMC) inference is the programming and run performance of the Markov chain. However, the explicit use of quality assessments of the MCMC simulations (convergence diagnostics) in phylogenetics is still uncommon. Here, we present a simple tool that uses the output from MCMC simulations and visualizes a number of properties of primary interest in a Bayesian phylogenetic analysis, such as convergence rates of posterior split probabilities and branch lengths. Graphical exploration of the output from phylogenetic MCMC simulations gives intuitive and often crucial information on the success and reliability of the analysis. The tool presented here complements convergence diagnostics already available in other software packages primarily designed for other applications of MCMC. Importantly, the common practice of using trace-plots of a single parameter or summary statistic, such as the likelihood score of sampled trees, can be misleading for assessing the success of a phylogenetic MCMC simulation. AVAILABILITY: The program is available as source under the GNU General Public License and as a web application at http://ceb.scs.fsu.edu/awty.


Subjects
Computational Biology/methods; Computer Graphics; Markov Chains; Monte Carlo Method; Phylogeny; Software; Bayes Theorem
9.
Nat Commun ; 9(1): 5451, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30575731

ABSTRACT

Interactions between fungi and plants, including parasitism, mutualism, and saprotrophy, have been invoked as key to their respective macroevolutionary success. Here we evaluate the origins of plant-fungal symbioses and saprotrophy using a time-calibrated phylogenetic framework that reveals linked and drastic shifts in diversification rates of each kingdom. Fungal colonization of land was associated with at least two origins of terrestrial green algae and preceded embryophytes (as evidenced by losses of fungal flagellum, ca. 720 Ma), likely facilitating terrestriality through endomycorrhizal and possibly endophytic symbioses. The largest radiation of fungi (Leotiomyceta), the origin of arbuscular mycorrhizae, and the diversification of extant embryophytes occurred ca. 480 Ma. This was followed by the origin of extant lichens. Saprotrophic mushrooms diversified in the Late Paleozoic as forests of seed plants started to dominate the landscape. The subsequent diversification and explosive radiation of Agaricomycetes, and eventually of ectomycorrhizal mushrooms, were associated with the evolution of Pinaceae in the Mesozoic, and establishment of angiosperm-dominated biomes in the Cretaceous.


Subjects
Biological Evolution; Embryophyta; Fungi; Symbiosis
11.
Cladistics ; 13(1-2): 153-159, 1997 Mar.
Article in English | MEDLINE | ID: mdl-34920631

ABSTRACT

We provide three simple examples demonstrating that Wheeler and Nixon's method of recoding "stepmatrix" characters can fail to yield most parsimonious reconstructions of character evolution under specified cost (transformation-weight) schemes. These examples variously indicate undercounting or overcounting of tree lengths due to an inappropriate assumption of independence among the recoded characters. Their method is therefore not equivalent to Sankoff's dynamic programming algorithm, contrary to their claim.
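
For readers unfamiliar with the reference method named in the abstract above, Sankoff's dynamic programming algorithm computes, for every node and every state, the minimum cost of the subtree below that node given a cost (step) matrix, and takes the minimum over root states as the tree length. The sketch below is a minimal version of that standard algorithm. The nested-tuple tree encoding, the function name, and the two example cost matrices are assumptions for illustration and are not taken from the paper.

```python
INF = float("inf")


def sankoff(tree, cost, tip_states, n_states):
    """Minimum total cost of one character on a rooted tree under a step matrix.

    tree: nested tuples whose leaves are tip names (strings).
    cost[s][t]: cost of changing from state s (parent) to state t (child).
    tip_states: mapping from tip name to observed state index.
    """
    def costs(node):
        if isinstance(node, str):
            # A tip is free in its observed state and impossible in any other.
            return [0 if s == tip_states[node] else INF for s in range(n_states)]
        child_costs = [costs(child) for child in node]
        return [
            sum(min(cost[s][t] + cc[t] for t in range(n_states)) for cc in child_costs)
            for s in range(n_states)
        ]

    return min(costs(tree))


# Hypothetical four-taxon example with three states.
tree = (("a", "b"), ("c", "d"))
tips = {"a": 0, "b": 2, "c": 0, "d": 2}
ordered = [[abs(i - j) for j in range(3)] for i in range(3)]        # linear ordering
unordered = [[0 if i == j else 1 for j in range(3)] for i in range(3)]
length_ordered = sankoff(tree, ordered, tips, 3)
length_unordered = sankoff(tree, unordered, tips, 3)
```

Because the step matrix is applied to the character as a whole, no independence among recoded binary characters is assumed, which is the property at issue in the abstract.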

12.
Mol Biol Evol ; 22(6): 1386-92, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15758203

ABSTRACT

Almost all studies that estimate phylogenies from DNA sequence data under the maximum-likelihood (ML) criterion employ an approximate approach. Most commonly, model parameters are estimated on some initial phylogenetic estimate derived using a rapid method (neighbor-joining or parsimony). Parameters are then held constant during a tree search, and ideally, the procedure is repeated until convergence is achieved. However, the effectiveness of this approximation has not been formally assessed, in part because doing so requires computationally intensive, full-optimization analyses. Here, we report both indirect and direct evaluations of the effectiveness of successive approximations. We obtained an indirect evaluation by comparing the results of replicate runs on real data that use random trees to provide initial parameter estimates. For six real data sets taken from the literature, all replicate iterative searches converged to the same joint estimates of topology and model parameters, suggesting that the approximation is not starting-point dependent, as long as the heuristic searches of tree space are rigorous. We conducted a more direct assessment using simulations in which we compared the accuracy of phylogenies estimated using full optimization of all model parameters on each tree evaluated to the accuracy of trees estimated via successive approximations. There is no significant difference between the accuracy of the approximation searches relative to full-optimization searches. Our results demonstrate that successive approximation is reliable and provide reassurance that this much faster approach is safe to use for ML estimation of topology.


Subjects
Computational Biology/methods; Models, Genetic; Phylogeny; Algorithms; Databases, Genetic; Evolution, Molecular; Likelihood Functions; Models, Theoretical; Software; Time Factors
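
The successive-approximation procedure described in the abstract above alternates two steps: estimate model parameters on the current tree, then search for the best tree with those parameters held constant, repeating until neither changes. The sketch below captures only that control flow. The toy score surface, the tree labels `T1` to `T3`, and the helpers `toy_fit` and `toy_search` are hypothetical stand-ins for real parameter optimization and heuristic tree search.

```python
def successive_approximation(start_tree, fit_params, search_tree, max_iter=20):
    """Alternate parameter fitting and tree search until neither changes."""
    tree = start_tree
    params = fit_params(tree)            # estimate parameters on the initial tree
    for _ in range(max_iter):
        new_tree = search_tree(params)   # tree search with parameters held constant
        new_params = fit_params(new_tree)
        if new_tree == tree and new_params == params:
            break                        # converged: same tree, same parameters
        tree, params = new_tree, new_params
    return tree, params


# Toy score surface: each "tree" has a best parameter value and a peak score.
OPT = {"T1": 0.2, "T2": 0.5, "T3": 0.9}
BASE = {"T1": -5.0, "T2": -3.0, "T3": -4.0}


def toy_score(tree, p):
    return BASE[tree] - (p - OPT[tree]) ** 2


def toy_fit(tree):
    """Stand-in for optimizing model parameters on a fixed tree."""
    return OPT[tree]


def toy_search(p):
    """Stand-in for a tree search with parameters fixed at p."""
    return max(OPT, key=lambda t: toy_score(t, p))


# Run from every possible starting tree.
results = {start: successive_approximation(start, toy_fit, toy_search) for start in OPT}
```

On this toy surface every starting tree leads to the same joint estimate, which mirrors the kind of starting-point independence the abstract reports for its six real data sets; a rougher surface would not necessarily behave this way.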
13.
Mol Phylogenet Evol ; 33(2): 440-51, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15336677

ABSTRACT

Although long-branch attraction (LBA) is frequently cited as the cause of anomalous phylogenetic groupings, few examples of LBA involving real sequence data are known. We have found several cases of probable LBA by analyzing subsamples from an alignment of 18S rDNA sequences for 133 metazoans. In one example, maximum parsimony analysis of sequences from two rotifers, a ctenophore, and a polychaete annelid resulted in strong support for a tree grouping two "long-branch taxa" (a rotifer and the ctenophore). Maximum-likelihood analysis of the same sequences yielded strong support for a more biologically reasonable "rotifer monophyly" tree. Attempts to break up long branches for problematic subsamples through increased taxon sampling reduced, but did not eliminate, LBA problems. Exhaustive analyses of all quartets for a subset of 50 sequences were performed in order to compare the performance of maximum likelihood, equal-weights parsimony, and two additional variants of parsimony; these methods do differ substantially in their rates of failure to recover trees consistent with well established, but highly unresolved phylogenies. Power analyses using simulations suggest that some incorrect inferences by maximum parsimony are due to statistical inconsistency and that when estimates of central branch lengths for certain quartets are very low, maximum-likelihood analyses have difficulty recovering accepted phylogenies even with large amounts of data. These examples demonstrate that LBA problems can occur in real data sets, and they provide an opportunity to investigate causes of incorrect inferences.


Subjects
Bias; Invertebrates/classification; Phylogeny; RNA, Ribosomal, 18S/classification; Animals; DNA, Ribosomal/classification; Invertebrates/genetics; Likelihood Functions; RNA, Ribosomal, 18S/genetics