Results 1 - 20 of 49
1.
Mol Biol Evol ; 38(4): 1537-1543, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33295605

ABSTRACT

The rooting of the SARS-CoV-2 phylogeny is important for understanding the origin and early spread of the virus. Previously published phylogenies have used different rootings that do not always provide consistent results. We investigate several different strategies for rooting the SARS-CoV-2 tree and provide measures of statistical uncertainty for all methods. We show that methods based on the molecular clock tend to place the root in the B clade, whereas methods based on outgroup rooting tend to place the root in the A clade. The results from the two approaches are statistically incompatible, possibly as a consequence of deviations from a molecular clock or excess back-mutations. We also show that none of the methods provide strong statistical support for the placement of the root in any particular edge of the tree. These results suggest that phylogenetic evidence alone is unlikely to identify the origin of the SARS-CoV-2 virus and we caution against strong inferences regarding the early spread of the virus based solely on such evidence.


Subject(s)
COVID-19/virology , Genome, Viral , Mutation , Phylogeny , SARS-CoV-2/genetics , Algorithms , Animals , Bayes Theorem , Evolution, Molecular , Humans , Likelihood Functions , Markov Chains , Models, Genetic , Models, Statistical , Monte Carlo Method , Mutation, Missense , RNA, Viral/genetics , Uncertainty
2.
Syst Biol ; 69(5): 1016-1032, 2020 09 01.
Article in English | MEDLINE | ID: mdl-31985810

ABSTRACT

Sampling across tree space is one of the major challenges in Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) algorithms. Standard MCMC tree moves consider small random perturbations of the topology, and select from candidate trees at random or based on the distance between the old and new topologies. MCMC algorithms using such moves tend to get trapped in tree space, making them slow in finding the globally most probable trees (known as "convergence") and in estimating the correct proportions of the different types of them (known as "mixing"). Here, we introduce a new class of moves, which propose trees based on their parsimony scores. The proposal distribution derived from the parsimony scores is a quickly computable albeit rough approximation of the conditional posterior distribution over candidate trees. We demonstrate with simulations that parsimony-guided moves correctly sample the uniform distribution of topologies from the prior. We then evaluate their performance against standard moves using six challenging empirical data sets, for which we were able to obtain accurate reference estimates of the posterior using long MCMC runs, a mix of topology proposals, and Metropolis coupling. On these data sets, ranging in size from 357 to 934 taxa and from 1740 to 5681 sites, we find that single chains using parsimony-guided moves usually converge an order of magnitude faster than chains using standard moves. They also exhibit better mixing, that is, they cover the most probable trees more quickly. Our results show that tree moves based on quick and dirty estimates of the posterior probability can significantly outperform standard moves. Future research will have to show to what extent the performance of such moves can be improved further by finding better ways of approximating the posterior probability, taking the trade-off between accuracy and speed into account. [Bayesian phylogenetic inference; MCMC; parsimony; tree proposal.].


Subject(s)
Classification/methods , Phylogeny , Algorithms , Bayes Theorem , Models, Biological
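
The idea of proposing trees according to their parsimony scores can be illustrated with a minimal sketch. The abstract does not state the proposal formula, so the exponential weighting, the `beta` parameter, the nested-tuple tree encoding, and all function names below are assumptions made for illustration, not the authors' implementation. Parsimony scores are computed site by site with the Fitch algorithm.

```python
import math

def fitch_score(tree, states):
    """Return (state_set, changes) for a rooted binary tree at one site.

    `tree` is a leaf name (str) or a pair (left, right);
    `states` maps each leaf name to its observed character.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment):
    """Sum Fitch scores over all sites; `alignment` maps leaf -> sequence."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {leaf: seq[i] for leaf, seq in alignment.items()}
        total += fitch_score(tree, site)[1]
    return total

def proposal_probabilities(candidates, alignment, beta=1.0):
    """Weight each candidate by exp(-beta * score) (assumed form), normalized."""
    scores = [parsimony_length(t, alignment) for t in candidates]
    best = min(scores)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (s - best)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical four-taxon alignment and the three rooted candidate trees.
alignment = {"A": "AAGT", "B": "AAGT", "C": "CCGA", "D": "CCGA"}
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
probs = proposal_probabilities(candidates, alignment)
```

Because such a proposal is not symmetric, a Metropolis-Hastings sampler using it would need to include the corresponding proposal ratio in the acceptance probability.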
3.
Arch Microbiol ; 203(3): 1211-1219, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33231748

ABSTRACT

This study aimed to compare the fungal rhizosphere communities of Rhazya stricta, Enneapogon desvauxii, Citrullus colocynthis, Senna italica, and Zygophyllum simplex, and the gut mycobiota of Poekilocerus bufonius (Orthoptera, Pyrgomorphidae, "Usherhopper"). A total of 164,485 fungal reads were observed from the five plant rhizospheres and the Usherhopper gut. The highest read count was in the S. italica rhizosphere (29,883 reads). Species richness in the P. bufonius gut was the highest among the six samples. Ascomycota was dominant in all samples, with the highest read count in the E. desvauxii rhizosphere (26,734 reads). Sordariomycetes and Dothideomycetes were the dominant classes detected, with the highest abundance in the C. colocynthis and E. desvauxii rhizospheres. Aspergillus and Ceratobasidium were the most abundant genera in the R. stricta rhizosphere, Fusarium and Penicillium in the E. desvauxii rhizosphere and P. bufonius gut, Ceratobasidium and Myrothecium in the C. colocynthis rhizosphere, Aspergillus and Fusarium in the S. italica rhizosphere, and Cochliobolus in the Z. simplex rhizosphere. Aspergillus terreus was the most abundant species in the R. stricta and S. italica rhizospheres, Fusarium sp. in the E. desvauxii rhizosphere, Ceratobasidium sp. in the C. colocynthis rhizosphere, Cochliobolus sp. in the Z. simplex rhizosphere, and Penicillium sp. in the P. bufonius gut. The phylogenetic results revealed that the unclassified species were closely related to Ascomycota and that the species in the E. desvauxii, S. italica, and Z. simplex rhizospheres were closely related, whereas the species in the P. bufonius gut were closely related to the species in the R. stricta and C. colocynthis rhizospheres.


Subject(s)
Biodiversity , Fungi/genetics , Metagenomics , Mycobiome/genetics , Plants/microbiology , Rhizosphere , Soil Microbiology , Desert Climate , Fungi/classification , Phylogeny , Plant Roots/microbiology
4.
Syst Biol ; 68(6): 1052-1061, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31034053

ABSTRACT

BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io.


Subject(s)
Classification/methods , Software/standards , Data Interpretation, Statistical , Phylogeny
5.
Proc Natl Acad Sci U S A ; 113(34): 9569-74, 2016 08 23.
Article in English | MEDLINE | ID: mdl-27512038

ABSTRACT

Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM.


Subject(s)
Biological Coevolution , Extinction, Biological , Genetic Speciation , Phylogeny , Whales/classification , Animals , Bayes Theorem , Biodiversity , Likelihood Functions , Poisson Distribution , Whales/genetics
6.
Syst Biol ; 65(4): 726-36, 2016 07.
Article in English | MEDLINE | ID: mdl-27235697

ABSTRACT

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.].


Subject(s)
Classification/methods , Models, Biological , Phylogeny , Software , Bayes Theorem
7.
Proc Natl Acad Sci U S A ; 111(29): E2957-66, 2014 Jul 22.
Article in English | MEDLINE | ID: mdl-25009181

ABSTRACT

Time-calibrated species phylogenies are critical for addressing a wide range of questions in evolutionary biology, such as those that elucidate historical biogeography or uncover patterns of coevolution and diversification. Because molecular sequence data are not informative on absolute time, external data--most commonly, fossil age estimates--are required to calibrate estimates of species divergence dates. For Bayesian divergence time methods, the common practice for calibration using fossil information involves placing arbitrarily chosen parametric distributions on internal nodes, often disregarding most of the information in the fossil record. We introduce the "fossilized birth-death" (FBD) process--a model for calibrating divergence time estimates in a Bayesian framework, explicitly acknowledging that extant species and fossils are part of the same macroevolutionary process. Under this model, absolute node age estimates are calibrated by a single diversification model and arbitrary calibration densities are not necessary. Moreover, the FBD model allows for inclusion of all available fossils. We performed analyses of simulated data and show that node age estimation under the FBD model results in robust and accurate estimates of species divergence times with realistic measures of statistical uncertainty, overcoming major limitations of standard divergence time estimation methods. We used this model to estimate the speciation times for a dataset composed of all living bears, indicating that the genus Ursus diversified in the Late Miocene to Middle Pliocene.


Subject(s)
Biological Evolution , Fossils , Models, Biological , Animals , Calibration , Computer Simulation , Extinction, Biological , Time Factors , Ursidae/physiology
8.
Syst Biol ; 63(5): 753-71, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24951559

ABSTRACT

Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.


Subject(s)
Classification/methods , Models, Statistical , Phylogeny , Algorithms , Computer Simulation
10.
Mol Biol Evol ; 30(9): 2197-208, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23748181

ABSTRACT

Paired epistatic interactions, such as those in the stem regions of RNA, play an important role in many biological processes. However, unlike protein-coding regions, paired epistatic interactions have lacked the appropriate statistical tools for the detection of departures from selective neutrality. Here, a model is presented for the analysis of paired epistatic regions that draws upon the population genetics of the compensatory substitution process to detect the relative strength of natural selection acting against deleterious combinations of alleles. The method is based upon the relative rates of double and single substitution, and can differentiate between nonindependent interactions and negatively epistatic ones. The model is implemented in a fully Bayesian framework for parameter estimation and is demonstrated using a 5S rRNA data set. In addition to the detection of selection, modeling the double and single substitution processes in this manner inherently accounts for a substantial proportion of rate variation among stem positions.


Subject(s)
Epistasis, Genetic , Models, Genetic , Mutation , Phylogeny , RNA, Ribosomal, 5S/genetics , Alleles , Animals , Bayes Theorem , Humans , Nucleic Acid Conformation , RNA, Ribosomal, 5S/classification , Selection, Genetic
11.
Syst Biol ; 62(6): 789-804, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23736102

ABSTRACT

Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a "data-augmentation" approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program, BayArea.


Subject(s)
Algorithms , Phylogeography/methods , Bayes Theorem , Computer Simulation , Phylogeny , Rhododendron/classification
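
The likelihood of one fully specified history under a continuous-time Markov process, built from exponential waiting times and the relative probabilities of each change, can be sketched as follows. This is the generic path likelihood for a rate matrix on a single branch, not the BayArea implementation; the two-state matrix, the event encoding, and the function name are illustrative assumptions.

```python
import math

def history_log_likelihood(Q, start_state, events, total_time):
    """Log-likelihood of a sampled history under a continuous-time Markov chain.

    Q: square rate matrix (nested lists); Q[i][j] is the rate of change i -> j
       and Q[i][i] is minus the total rate of leaving state i.
    start_state: state at time 0.
    events: list of (event_time, new_state), sorted by increasing time.
    total_time: length of the branch; no event occurs after the last one.
    """
    log_lik = 0.0
    state, now = start_state, 0.0
    for event_time, new_state in events:
        exit_rate = -Q[state][state]
        # Waiting-time density exit_rate * exp(-exit_rate * dt) multiplied by
        # the probability of this particular change, Q[i][j] / exit_rate.
        log_lik += -exit_rate * (event_time - now) + math.log(Q[state][new_state])
        state, now = new_state, event_time
    # Probability that nothing further happens before the end of the branch.
    log_lik += Q[state][state] * (total_time - now)
    return log_lik

# Hypothetical rates: state 0 = absent from an area, state 1 = present.
Q = [[-0.5, 0.5],
     [0.2, -0.2]]
events = [(1.0, 1), (3.0, 0)]  # gain at t = 1.0, loss at t = 3.0
log_lik = history_log_likelihood(Q, 0, events, 4.0)
```

Evaluating a single history this way avoids computing matrix exponentials over the full state space, which is what becomes infeasible as the number of areas grows.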
12.
Mol Biol Evol ; 29(3): 939-55, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22049064

ABSTRACT

We introduce a new model for relaxing the assumption of a strict molecular clock for use as a prior in Bayesian methods for divergence time estimation. Lineage-specific rates of substitution are modeled using a Dirichlet process prior (DPP), a type of stochastic process that assumes lineages of a phylogenetic tree are distributed into distinct rate classes. Under the Dirichlet process, the number of rate classes, assignment of branches to rate classes, and the rate value associated with each class are treated as random variables. The performance of this model was evaluated by conducting analyses on data sets simulated under a range of different models. We compared the Dirichlet process model with two alternative models for rate variation: the strict molecular clock and the independent rates model. Our results show that divergence time estimation under the DPP provides robust estimates of node ages and branch rates without significantly reducing power. Further analyses were conducted on a biological data set, and we provide examples of ways to summarize Markov chain Monte Carlo samples under this model.


Subject(s)
Evolution, Molecular , Models, Genetic , Mutation Rate , Phylogeny , Bayes Theorem , Computer Simulation , Markov Chains , Monte Carlo Method , Stochastic Processes
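
One common way to simulate a partition from a Dirichlet process prior is the Chinese restaurant process. The sketch below assigns branches to rate classes in this way and draws one rate per class. The concentration parameter `alpha`, the gamma base distribution and its parameters, and the function name are assumptions made for illustration and are not taken from the paper.

```python
import random

def sample_dpp_branch_rates(n_branches, alpha, rng):
    """Draw rate-class assignments and class rates from a Dirichlet process prior.

    Returns (assignments, class_rates): assignments[i] is the class index of
    branch i, and class_rates[k] is the substitution rate of class k.
    """
    assignments = []
    class_sizes = []
    for _ in range(n_branches):
        # An existing class k is chosen with weight class_sizes[k];
        # a new class is opened with weight alpha.
        weights = class_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(class_sizes):
            class_sizes.append(1)
        else:
            class_sizes[k] += 1
        assignments.append(k)
    # Assumed base distribution: gamma with shape 2.0 and scale 0.5.
    class_rates = [rng.gammavariate(2.0, 0.5) for _ in class_sizes]
    return assignments, class_rates

rng = random.Random(1)
assignments, class_rates = sample_dpp_branch_rates(10, alpha=1.0, rng=rng)
branch_rates = [class_rates[k] for k in assignments]
```

The number of classes, the assignment of branches to classes, and the rate of each class are all random here, matching the three quantities the abstract describes as random variables. Small `alpha` values favor few classes (closer to a strict clock), while large values favor many.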
13.
Syst Biol ; 61(1): 170-3, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21963610

ABSTRACT

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.


Subject(s)
Computational Biology/methods , Phylogeny , Software , Algorithms , Computing Methodologies , Evolution, Molecular , Genome
14.
Syst Biol ; 61(3): 539-42, 2012 May.
Article in English | MEDLINE | ID: mdl-22357727

ABSTRACT

Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site d(N)/d(S) ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.


Subject(s)
Classification/methods , Software , Algorithms , Markov Chains , Models, Biological , Monte Carlo Method , Phylogeny
15.
Syst Biol ; 60(2): 225-32, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21252385

ABSTRACT

But Tuffley and Steel (1997) introduced a model called No Common Mechanism (NCM), in which characters may--but are not required to--vary their relative rates independently, both within and between branches. Because the independent variation is taken only as a possibility, not as a requirement, NCM would apply to almost any situation, and so may be accepted as realistic. This is useful because Tuffley and Steel also showed that maximum likelihood under NCM selects the same trees as does parsimony. With the realistic NCM in the background, then, most parsimonious trees have greatest power to explain available observations. --Farris (2008).


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Markov Chains , Monte Carlo Method
16.
Syst Biol ; 60(1): 60-73, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21081481

ABSTRACT

Nearly all commonly used methods of phylogenetic inference assume that characters in an alignment evolve independently of one another. This assumption is attractive for simplicity and computational tractability but is not biologically reasonable for RNAs and proteins that have secondary and tertiary structures. Here, we simulate RNA and protein-coding DNA sequence data under a general model of dependence in order to assess the robustness of traditional methods of phylogenetic inference to violation of the assumption of independence among sites. We find that the accuracy of independence-assuming methods is reduced by the dependence among sites; for proteins this reduction is relatively mild, but for RNA this reduction may be substantial. We introduce the concept of effective sequence length and its utility for considering information content in phylogenetics.


Subject(s)
Evolution, Molecular , Models, Genetic , Phylogeny , Proteins/chemistry , Proteins/genetics , RNA/chemistry , RNA/genetics , Animals , Base Sequence , Bombyx/genetics , Computer Simulation , DNA/chemistry , DNA/genetics , Escherichia coli/genetics , Myoglobin/genetics , Nucleic Acid Conformation , Protein Conformation , Sequence Alignment/methods , Sperm Whale/genetics
17.
PeerJ ; 9: e12438, 2021.
Article in English | MEDLINE | ID: mdl-34760401

ABSTRACT

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
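
As a rough illustration of why the power posterior simulations can be distributed, the stepping-stone estimate can be written as a sum of terms that each depend on samples from only one power posterior. The sketch below computes the estimate from already-collected log-likelihood samples; it is not the RevBayes implementation, and the function names, the `betas` schedule, and the sample layout are assumptions.

```python
import math

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def stepping_stone_log_marginal(betas, log_lik_samples):
    """Stepping-stone estimate of the log marginal likelihood.

    betas: increasing powers from 0.0 (prior) to 1.0 (posterior), length K + 1.
    log_lik_samples[k]: log-likelihoods of MCMC samples drawn from the power
        posterior with power betas[k], for k = 0 .. K - 1.
    """
    total = 0.0
    for k in range(len(betas) - 1):
        step = betas[k + 1] - betas[k]
        # Each stone uses samples from a single power posterior only, so the
        # K simulations are independent and could run on separate CPUs.
        total += log_mean_exp([step * ll for ll in log_lik_samples[k]])
    return total

# Toy check with made-up values: a constant log-likelihood of -10 must give -10.
betas = [0.0, 0.25, 0.5, 1.0]
samples = [[-10.0, -10.0, -10.0] for _ in range(len(betas) - 1)]
estimate = stepping_stone_log_marginal(betas, samples)
```

In practice the per-stone sample lists would come from separate MCMC runs, and only the final summation requires gathering their results.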

18.
Genetics ; 181(1): 225-34, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19001294

ABSTRACT

Parallel evolution is the acquisition of identical adaptive traits in independently evolving populations. Understanding whether the genetic changes underlying adaptation to a common selective environment are parallel within and between species is interesting because it sheds light on the degree of evolutionary constraints. If parallel evolution is perfect, then the implication is that forces such as functional constraints, epistasis, and pleiotropy play an important role in shaping the outcomes of adaptive evolution. In addition, population genetic theory predicts that the probability of parallel evolution will decline with an increase in the number of adaptive solutions--if a single adaptive solution exists, then parallel evolution will be observed among highly divergent species. For this reason, it is predicted that close relatives--which likely overlap more in the details of their adaptive solutions--will show more parallel evolution. By adapting three related bacteriophage species to a novel environment we find (1) a high rate of parallel genetic evolution at orthologous nucleotide and amino acid residues within species, (2) parallel beneficial mutations do not occur in a common order in which they fix or appear in an evolving population, (3) low rates of parallel evolution and convergent evolution between species, and (4) the probability of parallel and convergent evolution between species is strongly affected by divergence.


Subject(s)
Bacteriophages/genetics , Biological Evolution , Genetic Variation , Adaptation, Physiological/genetics , Amino Acid Substitution/genetics , Base Sequence , Genome, Viral/genetics , Mutation/genetics , Polymorphism, Genetic , Species Specificity , Temperature
19.
Syst Biol ; 57(5): 750-7, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18853361

ABSTRACT

We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignments based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are composed of a combination of both Neanderthal and modern human DNA.


Subject(s)
DNA/genetics , Models, Genetic , Models, Statistical , Phylogeny , Animals , Base Sequence , Bayes Theorem , Computer Simulation , Insects/genetics , Plants/genetics
20.
Evolution ; 62(8): 2042-64, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18507743

ABSTRACT

An important challenge in evolutionary biology is to understand how major changes in body form arise. The dramatic transition from a lizard-like to snake-like body form in squamate reptiles offers an exciting system for such research because this change is replicated dozens of times. Here, we use morphometric data for 258 species and a time-calibrated phylogeny to explore rates and patterns of body-form evolution across squamates. We also demonstrate how time-calibrated phylogenies may be used to make inferences about the time frame over which major morphological transitions occur. Using the morphometric data, we find that the transition from lizard-like to snake-like body form involves concerted evolution of limb reduction, digit loss, and body elongation. These correlations are similar across squamate clades, despite very different ecologies and >180 million years (My) of divergence. Using the time-calibrated phylogeny and ancestral reconstructions, we find that the dramatic transition between these body forms can occur in 20 My or less, but that seemingly intermediate morphologies can also persist for tens of millions of years. Finally, although loss of digits is common, we find statistically significant support for at least six examples of the re-evolution of lost digits in the forelimb and hind limb.


Subject(s)
Biological Evolution , Reptiles/genetics , Reptiles/physiology , Algorithms , Animals , Body Patterning , Calibration , Ecology , Evolution, Molecular , Extremities , Genetic Speciation , Models, Genetic , Models, Statistical , Phylogeny , Regression Analysis , Time Factors