1.
Proc Biol Sci ; 291(2025): 20240090, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38889793

ABSTRACT

The more insects there are, the more food there is for insectivores and the higher the likelihood for insect-associated ecosystem services. Yet, we lack insights into the drivers of insect biomass over space and seasons, for both tropical and temperate zones. We used 245 Malaise traps, managed by 191 volunteers and park guards, to characterize year-round flying insect biomass in a temperate (Sweden) and a tropical (Madagascar) country. Surprisingly, we found that local insect biomass was similar across zones. In Sweden, local insect biomass increased with accumulated heat and varied across habitats, while biomass in Madagascar was unrelated to the environmental predictors measured. Drivers behind seasonality partly converged: in both countries, the seasonality of insect biomass differed between warmer and colder sites, and wetter and drier sites. In Sweden, short-term deviations from expected season-specific biomass were explained by week-to-week fluctuations in accumulated heat, rainfall and soil moisture, whereas in Madagascar, weeks with higher soil moisture had higher insect biomass. Overall, our study identifies key drivers of the seasonal distribution of flying insect biomass in a temperate and a tropical climate. This knowledge is key to understanding the spatial and seasonal availability of insects, as well as predicting future scenarios of insect biomass change.


Subjects
Biomass, Seasons, Temperature, Tropical Climate, Animals, Sweden, Madagascar, Insects/physiology, Water, Ecosystem
2.
Syst Biol ; 72(6): 1316-1336, 2023 Dec 30.
Article in English | MEDLINE | ID: mdl-37605524

ABSTRACT

Several total-evidence dating studies under the fossilized birth-death (FBD) model have produced very old age estimates, which are not supported by the fossil record. This phenomenon has been termed "deep root attraction (DRA)." For two specific data sets, involving divergence time estimation for the early radiations of ants, bees, and wasps (Hymenoptera) and of placental mammals (Eutheria), it has been shown that the DRA effect can be greatly reduced by accommodating the fact that extant species in these trees have been sampled to maximize diversity, so-called diversified sampling. Unfortunately, current methods to accommodate diversified sampling only consider the extreme case where it is possible to identify a cut-off time such that all splits occurring before this time are represented in the sampled tree but none of the younger splits. In reality, the sampling bias is rarely this extreme and may be difficult to model properly. Similar modeling challenges apply to the sampling of the fossil record. This raises the question of whether it is possible to find dating methods that are more robust to sampling biases. Here, we show that the skyline FBD (SFBD) process, where the diversification and fossil-sampling rates can vary over time in a piecewise fashion, provides age estimates that are more robust to inadequacies in the modeling of the sampling process and less sensitive to DRA effects. In the SFBD model we consider, rates in different time intervals are either considered to be independent and identically distributed or assumed to be autocorrelated following an Ornstein-Uhlenbeck (OU) process. Through simulations and reanalyses of Hymenoptera and Eutheria data, we show that both variants of the SFBD model unify age estimates under random and diversified sampling assumptions. The SFBD model can resolve DRA by absorbing the deviations from the sampling assumptions into the inferred dynamics of the diversification process over time. 
Although this means that the inferred diversification dynamics must be interpreted with caution, taking sampling biases into account, we conclude that the SFBD model represents the most robust approach currently available for addressing DRA in total-evidence dating.
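The autocorrelated variant mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the log of a rate follows an Ornstein-Uhlenbeck process sampled once per time interval; the function name, parameter names, and numeric values are hypothetical and are not taken from the paper.

```python
import math
import random


def ou_log_rates(n_intervals, x0, mean, theta, sigma, dt, rng):
    """Simulate log-rates for consecutive time intervals under an OU process.

    Assumed parameterisation: dX = theta * (mean - X) dt + sigma dW.
    """
    decay = math.exp(-theta * dt)
    # Standard deviation of the exact OU transition over a step of length dt.
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    values = [x0]
    for _ in range(n_intervals - 1):
        prev = values[-1]
        values.append(mean + (prev - mean) * decay + rng.gauss(0.0, step_sd))
    return values


# Hypothetical example: five intervals, rates pulled toward 0.1 per unit time.
log_rates = ou_log_rates(5, x0=math.log(0.5), mean=math.log(0.1),
                         theta=1.0, sigma=0.3, dt=1.0, rng=random.Random(1))
rates = [math.exp(v) for v in log_rates]
```

Neighbouring intervals share information through the decay term, whereas the independent and identically distributed variant would draw each interval's rate separately.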


Subjects
Ants, Placenta, Female, Pregnancy, Animals, Phylogeny, Time, Eutheria, Fossils
3.
Syst Biol ; 72(5): 1199-1206, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37498209

ABSTRACT

Bayesian phylogenetics is now facing a critical point. Over the last 20 years, Bayesian methods have reshaped phylogenetic inference and gained widespread popularity due to their high accuracy, the ability to quantify the uncertainty of inferences and the possibility of accommodating multiple aspects of evolutionary processes in the models that are used. Unfortunately, Bayesian methods are computationally expensive, and typical applications involve at most a few hundred sequences. This is problematic in the age of rapidly expanding genomic data and increasing scope of evolutionary analyses, forcing researchers to resort to less accurate but faster methods, such as maximum parsimony and maximum likelihood. Does this spell doom for Bayesian methods? Not necessarily. Here, we discuss some recently proposed approaches that could help scale up Bayesian analyses of evolutionary problems considerably. We focus on two particular aspects: online phylogenetics, where new data sequences are added to existing analyses, and alternatives to Markov chain Monte Carlo (MCMC) for scalable Bayesian inference. We identify 5 specific challenges and discuss how they might be overcome. We believe that online phylogenetic approaches and Sequential Monte Carlo hold great promise and could potentially speed up tree inference by orders of magnitude. We call for collaborative efforts to speed up the development of methods for real-time tree expansion through online phylogenetics.


Subjects
Biological Evolution, Genetic Models, Phylogeny, Bayes Theorem, Monte Carlo Method, Markov Chains
4.
BMC Bioinformatics ; 24(1): 6, 2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36604610

ABSTRACT

BACKGROUND: The Living Atlas is an open source platform used to collect, visualise and analyse biodiversity data from multiple sources, and serves as the national biodiversity data hub in many countries. Although powerful, the Living Atlas has had limited functionality for species occurrence data derived from DNA sequences. As a step toward integrating this fast-growing data source into the platform, we developed the Amplicon Sequence Variant (ASV) portal: a web interface to sequence-based biodiversity observations in the Living Atlas. RESULTS: The ASV portal allows data providers to submit denoised metabarcoding output to the Living Atlas platform via an intermediary ASV database. It also enables users to search for existing ASVs and associated Living Atlas records using the Basic Local Alignment Search Tool, or via filters on taxonomy and sequencing details. The ASV portal is a Python-Flask/jQuery web interface, implemented as a multi-container docker service, and is an integral part of the Swedish Biodiversity Data Infrastructure. CONCLUSION: The ASV portal is a web interface that effectively integrates biodiversity data derived from DNA sequences into the Living Atlas platform.


Subjects
Biodiversity, DNA, DNA/genetics, Software, Taxonomic DNA Barcoding
5.
Syst Biol ; 71(6): 1404-1422, 2022 Oct 12.
Article in English | MEDLINE | ID: mdl-35556139

ABSTRACT

New, rapid, accurate, scalable, and cost-effective species discovery and delimitation methods are needed for tackling "dark taxa," here defined as groups for which <10% of all species are described and the estimated diversity exceeds 1,000 species. Species delimitation for these taxa should be based on multiple data sources ("integrative taxonomy") but collecting multiple types of data risks impeding a discovery process that is already too slow. We here develop large-scale integrative taxonomy (LIT), an explicit method where preliminary species hypotheses are generated based on inexpensive data that can be obtained quickly and cost-effectively. These hypotheses are then evaluated based on a more expensive type of "validation data" that is only obtained for specimens selected based on objective criteria applied to the preliminary species hypotheses. We here use this approach to sort 18,000 scuttle flies (Diptera: Phoridae) into 315 preliminary species hypotheses based on next-generation sequencing barcode (313 bp) clusters (using objective clustering [OC] with a 3% threshold). These clusters are then evaluated with morphology as the validation data. We develop quantitative indicators for predicting which barcode clusters are likely to be incongruent with morphospecies by randomly selecting 100 clusters for in-depth validation with morphology. A linear model demonstrates that the best predictors for incongruence between barcode clusters and morphology are maximum p-distance within the cluster and a newly proposed index that measures cluster stability across different clustering thresholds. A test of these indicators using the 215 remaining clusters reveals that these predictors correctly identify all clusters that are incongruent with morphology. In our study, all morphospecies are true or disjoint subsets of the initial barcode clusters so that all incongruence can be eliminated by varying clustering thresholds. 
This leads to a discussion of when a third data source is needed to resolve incongruent grouping statements. The morphological validation step in our study involved 1,039 specimens (5.8% of the total). The formal LIT protocol we propose would only have required the study of 915 (5.1%: 2.5 specimens per species), as we show that clusters without signatures of incongruence can be validated by only studying two specimens representing the most divergent haplotypes. To test the generality of our results across different barcode clustering techniques, we establish that the levels of incongruence are similar across OC, Automatic Barcode Gap Discovery (ABGD), Poisson Tree Processes (PTP), and Refined Single Linkage (RESL) (used by Barcode of Life Data System to assign Barcode Index Numbers [BINs]). OC and ABGD achieved a maximum congruence score with the morphology of 89% while PTP was slightly less effective (84%). RESL could only be tested for a subset of the specimens because the algorithm is not public. BINs based on 277 of the original 1,714 haplotypes were 86% congruent with morphology while the values were 89% for OC, 74% for PTP, and 72% for ABGD. [Biodiversity discovery; dark taxa; DNA barcodes; integrative taxonomy.].
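One of the two predictors named in the abstract, the maximum p-distance within a cluster, can be sketched as follows. The sketch assumes pre-aligned barcodes of equal length and skips any position where either sequence has a gap or ambiguity code; the function names and toy sequences are hypothetical, and the abstract's stability index is not reproduced because it is not defined there.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among positions where both sequences have A, C, G or T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def max_p_distance(cluster):
    """Largest pairwise p-distance in a cluster; 0.0 for a single sequence."""
    return max((p_distance(a, b) for a, b in combinations(cluster, 2)), default=0.0)


# Toy aligned barcodes (real barcodes in the study are 313 bp long).
cluster = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAA"]
widest = max_p_distance(cluster)
```

A cluster whose widest pair approaches the clustering threshold would, under this reading, be a candidate for closer morphological checking.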


Subjects
Biodiversity, Taxonomic DNA Barcoding, Cluster Analysis, Taxonomic DNA Barcoding/methods, High-Throughput Nucleotide Sequencing, Phylogeny
6.
Nature ; 530(7588): 89-93, 2016 Feb 04.
Article in English | MEDLINE | ID: mdl-26842059

ABSTRACT

The position of Xenacoelomorpha in the tree of life remains a major unresolved question in the study of deep animal relationships. Xenacoelomorpha, comprising Acoela, Nemertodermatida, and Xenoturbella, are bilaterally symmetrical marine worms that lack several features common to most other bilaterians, for example an anus, nephridia, and a circulatory system. Two conflicting hypotheses are under debate: Xenacoelomorpha is the sister group to all remaining Bilateria (= Nephrozoa, namely protostomes and deuterostomes) or is a clade inside Deuterostomia. Thus, determining the phylogenetic position of this clade is pivotal for understanding the early evolution of bilaterian features, or as a case of drastic secondary loss of complexity. Here we show robust phylogenomic support for Xenacoelomorpha as the sister taxon of Nephrozoa. Our phylogenetic analyses, based on 11 novel xenacoelomorph transcriptomes and using different models of evolution under maximum likelihood and Bayesian inference analyses, strongly corroborate this result. Rigorous testing of 25 experimental data sets designed to exclude data partitions and taxa potentially prone to reconstruction biases indicates that long-branch attraction, saturation, and missing data do not influence these results. The sister group relationship between Nephrozoa and Xenacoelomorpha supported by our phylogenomic analyses implies that the last common ancestor of bilaterians was probably a benthic, ciliated acoelomate worm with a single opening into an epithelial gut, and that excretory organs, coelomic cavities, and nerve cords evolved after xenacoelomorphs separated from the stem lineage of Nephrozoa.


Subjects
Aquatic Organisms/classification, Phylogeny, Animal Structures/anatomy & histology, Animals, Aquatic Organisms/genetics, Bayes Theorem, Genes, Likelihood Functions, Male, Biological Models, Transcriptome
7.
Ecol Lett ; 24(10): 2134-2145, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34297474

ABSTRACT

The study of herbivorous insects underpins much of the theory that concerns the evolution of species interactions. In particular, Pieridae butterflies and their host plants have served as a model system for studying evolutionary arms races. To learn more about the coevolution of these two clades, we reconstructed ancestral ecological networks using stochastic mappings that were generated by a phylogenetic model of host-repertoire evolution. We then measured if, when, and how two ecologically important structural features of the ancestral networks (modularity and nestedness) evolved over time. Our study shows that as pierids gained new hosts and formed new modules, a subset of them retained or recolonised the ancestral host(s), preserving connectivity to the original modules. Together, host-range expansions and recolonisations promoted a phase transition in network structure. Our results demonstrate the power of combining network analysis with Bayesian inference of host-repertoire evolution to understand changes in complex species interactions over time.


Subjects
Butterflies, Animals, Bayes Theorem, Butterflies/genetics, Herbivory, Phylogeny, Plants
8.
Mol Ecol ; 30(5): 1120-1135, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33432777

ABSTRACT

High-throughput sequencing (HTS) is increasingly being used for the characterization and monitoring of biodiversity. If applied in a structured way, across broad geographical scales, it offers the potential for a much deeper understanding of global biodiversity through the integration of massive quantities of molecular inventory data generated independently at local, regional and global scales. The universality, reliability and efficiency of HTS data can potentially facilitate the seamless linking of data among species assemblages from different sites, at different hierarchical levels of diversity, for any taxonomic group and regardless of prior taxonomic knowledge. However, collective international efforts are required to optimally exploit the potential of site-based HTS data for global integration and synthesis, efforts that at present are limited to the microbial domain. To contribute to the development of an analogous strategy for the nonmicrobial terrestrial domain, an international symposium entitled "Next Generation Biodiversity Monitoring" was held in November 2019 in Nicosia (Cyprus). The symposium brought together evolutionary geneticists, ecologists and biodiversity scientists involved in diverse regional and global initiatives using HTS as a core tool for biodiversity assessment. In this review, we summarize the consensus that emerged from the 3-day symposium. We converged on the opinion that an effective terrestrial Genomic Observatories network for global biodiversity integration and synthesis should be spatially led and strategically united under the umbrella of the metabarcoding approach. Subsequently, we outline an HTS-based strategy to collectively build an integrative framework for site-based biodiversity data generation.


Subjects
Biodiversity, Taxonomic DNA Barcoding, Cyprus, Genomics, Reproducibility of Results
9.
Syst Biol ; 69(5): 1016-1032, 2020 Sep 01.
Article in English | MEDLINE | ID: mdl-31985810

ABSTRACT

Sampling across tree space is one of the major challenges in Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) algorithms. Standard MCMC tree moves consider small random perturbations of the topology, and select from candidate trees at random or based on the distance between the old and new topologies. MCMC algorithms using such moves tend to get trapped in tree space, making them slow in finding the globally most probable trees (known as "convergence") and in estimating the correct proportions of the different types of them (known as "mixing"). Here, we introduce a new class of moves, which propose trees based on their parsimony scores. The proposal distribution derived from the parsimony scores is a quickly computable albeit rough approximation of the conditional posterior distribution over candidate trees. We demonstrate with simulations that parsimony-guided moves correctly sample the uniform distribution of topologies from the prior. We then evaluate their performance against standard moves using six challenging empirical data sets, for which we were able to obtain accurate reference estimates of the posterior using long MCMC runs, a mix of topology proposals, and Metropolis coupling. On these data sets, ranging in size from 357 to 934 taxa and from 1740 to 5681 sites, we find that single chains using parsimony-guided moves usually converge an order of magnitude faster than chains using standard moves. They also exhibit better mixing, that is, they cover the most probable trees more quickly. Our results show that tree moves based on quick and dirty estimates of the posterior probability can significantly outperform standard moves. Future research will have to show to what extent the performance of such moves can be improved further by finding better ways of approximating the posterior probability, taking the trade-off between accuracy and speed into account. [Bayesian phylogenetic inference; MCMC; parsimony; tree proposal.].
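The idea of proposing trees according to their parsimony scores can be sketched as below. The exponential weighting and the `weight` parameter are assumptions made for illustration; the abstract does not state how scores are converted into proposal probabilities, and the scores shown are invented.

```python
import math
import random


def proposal_probabilities(parsimony_scores, weight):
    """Turn parsimony scores (lower is better) into proposal probabilities."""
    best = min(parsimony_scores)
    # Subtracting the best score keeps the exponentials numerically stable.
    unnormalised = [math.exp(-weight * (s - best)) for s in parsimony_scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def propose(parsimony_scores, weight, rng):
    """Draw the index of a candidate tree and return it with its proposal probability."""
    probs = proposal_probabilities(parsimony_scores, weight)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs[index]


# Hypothetical parsimony scores of three candidate trees around the current tree.
scores = [120, 118, 125]
index, q_forward = propose(scores, weight=0.5, rng=random.Random(42))
```

In a Metropolis-Hastings sampler, `q_forward` and the corresponding probability of the reverse move would both enter the acceptance ratio, so that the biased proposal still leaves the target distribution unchanged.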


Subjects
Classification/methods, Phylogeny, Algorithms, Bayes Theorem, Biological Models
10.
Syst Biol ; 69(6): 1149-1162, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-32191324

ABSTRACT

Intimate ecological interactions, such as those between parasites and their hosts, may persist over long time spans, coupling the evolutionary histories of the lineages involved. Most methods that reconstruct the coevolutionary history of such interactions make the simplifying assumption that parasites have a single host. Many methods also focus on congruence between host and parasite phylogenies, using cospeciation as the null model. However, there is an increasing body of evidence suggesting that the host ranges of parasites are more complex: that host ranges often include more than one host and evolve via gains and losses of hosts rather than through cospeciation alone. Here, we develop a Bayesian approach for inferring coevolutionary history based on a model accommodating these complexities. Specifically, a parasite is assumed to have a host repertoire, which includes both potential hosts and one or more actual hosts. Over time, potential hosts can be added or lost, and potential hosts can develop into actual hosts or vice versa. Thus, host colonization is modeled as a two-step process that may potentially be influenced by host relatedness. We first explore the statistical behavior of our model by simulating evolution of host-parasite interactions under a range of parameter values. We then use our approach, implemented in the program RevBayes, to infer the coevolutionary history between 34 Nymphalini butterfly species and 25 angiosperm families. Our analysis suggests that host relatedness among angiosperm families influences how easily Nymphalini lineages gain new hosts. [Ancestral hosts; coevolution; herbivorous insects; probabilistic modeling.].
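The two-step colonization process described in the abstract can be pictured, for a single host, as a three-state continuous-time Markov chain in which a non-host must first become a potential host before it can become an actual host. The sketch below builds such a rate matrix; the state coding, function name, and rate values are hypothetical, and it ignores both the requirement of at least one actual host and the effect of host relatedness.

```python
def host_state_rate_matrix(gain_potential, lose_potential, become_actual, lose_actual):
    """Rate matrix over states 0 = non-host, 1 = potential host, 2 = actual host."""
    off_diagonal = [
        [0.0, gain_potential, 0.0],            # 0 -> 1 only; no direct 0 -> 2 jump
        [lose_potential, 0.0, become_actual],  # 1 -> 0 or 1 -> 2
        [0.0, lose_actual, 0.0],               # 2 -> 1 only; no direct 2 -> 0 jump
    ]
    # Each diagonal entry is minus the sum of the other entries in its row.
    return [
        [-sum(row) if i == j else row[j] for j in range(3)]
        for i, row in enumerate(off_diagonal)
    ]


# Hypothetical rates per unit time.
Q = host_state_rate_matrix(gain_potential=0.2, lose_potential=0.5,
                           become_actual=0.3, lose_actual=0.1)
```

The zero entries in the corners are what make colonization a two-step process in this toy version.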


Subjects
Host-Parasite Interactions/physiology, Biological Models, Phylogeny, Animals, Bayes Theorem, Biological Coevolution, Butterflies/physiology, Host Specificity/physiology, Magnoliopsida/parasitology
11.
Syst Biol ; 68(6): 876-895, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-30825372

ABSTRACT

Rapid and reliable identification of insects is important in many contexts, from the detection of disease vectors and invasive species to the sorting of material from biodiversity inventories. Because of the shortage of adequate expertise, there has long been an interest in developing automated systems for this task. Previous attempts have been based on laborious and complex handcrafted extraction of image features, but in recent years it has been shown that sophisticated convolutional neural networks (CNNs) can learn to extract relevant features automatically, without human intervention. Unfortunately, reaching expert-level accuracy in CNN identifications requires substantial computational power and huge training data sets, which are often not available for taxonomic tasks. This can be addressed using feature transfer: a CNN that has been pretrained on a generic image classification task is exposed to the taxonomic images of interest, and information about its perception of those images is used in training a simpler, dedicated identification system. Here, we develop an effective method of CNN feature transfer, which achieves expert-level accuracy in taxonomic identification of insects with training sets of 100 images or less per category, depending on the nature of data set. Specifically, we extract rich representations of intermediate to high-level image features from the CNN architecture VGG16 pretrained on the ImageNet data set. This information is submitted to a linear support vector machine classifier, which is trained on the target problem. We tested the performance of our approach on two types of challenging taxonomic tasks: 1) identifying insects to higher groups when they are likely to belong to subgroups that have not been seen previously and 2) identifying visually similar species that are difficult to separate even for experts. 
For the first task, our approach reached >92% accuracy on one data set (884 face images of 11 families of Diptera, all specimens representing unique species), and >96% accuracy on another (2936 dorsal habitus images of 14 families of Coleoptera, over 90% of specimens belonging to unique species). For the second task, our approach outperformed a leading taxonomic expert on one data set (339 images of three species of the Coleoptera genus Oxythyrea; 97% accuracy), and both humans and traditional automated identification systems on another data set (3845 images of nine species of Plecoptera larvae; 98.6% accuracy). Reanalyzing several biological image identification tasks studied in the recent literature, we show that our approach is broadly applicable and provides significant improvements over previous methods, whether based on dedicated CNNs, CNN feature transfer, or more traditional techniques. Thus, our method, which is easy to apply, can be highly successful in developing automated taxonomic identification systems even when training data sets are small and computational budgets limited. We conclude by briefly discussing some promising CNN-based research directions in morphological systematics opened up by the success of these techniques in providing accurate diagnostic tools.


Subjects
Classification/methods, Insects/classification, Computer Neural Networks, Animals, Phylogeny, Reproducibility of Results
12.
Syst Biol ; 65(1): 161-76, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26231183

ABSTRACT

Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes.
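The independence sampler mentioned in the abstract draws a new branch length without reference to the old one and corrects for this in the acceptance ratio. The following is a minimal sketch of that general mechanism. The exponential densities are toy stand-ins chosen for illustration, not the distribution elided in the abstract, and all names and values are hypothetical.

```python
import math
import random


def log_accept_ratio(current, proposed, log_target, log_proposal):
    """Log Metropolis-Hastings ratio for an independence proposal."""
    return (log_target(proposed) - log_target(current)
            + log_proposal(current) - log_proposal(proposed))


def independence_step(current, log_target, log_proposal, draw, rng):
    """One MCMC step: draw a value independently of `current`, then accept or reject."""
    proposed = draw(rng)
    log_ratio = log_accept_ratio(current, proposed, log_target, log_proposal)
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return proposed, True
    return current, False


def log_target(v):
    """Toy unnormalised log density standing in for the conditional posterior."""
    return -12.0 * v


def log_proposal(v):
    """Toy unnormalised log density of the approximation used for proposals."""
    return -10.0 * v


def draw(rng):
    """Sample from the toy proposal (exponential with rate 10)."""
    return rng.expovariate(10.0)


rng = random.Random(7)
branch, accepted = 0.1, 0
for _ in range(1000):
    branch, was_accepted = independence_step(branch, log_target, log_proposal, draw, rng)
    accepted += was_accepted
```

The closer the proposal density is to the target, the closer the ratio stays to one, which is why a good approximation of the conditional posterior translates into a high acceptance rate.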


Subjects
Classification/methods, Theoretical Models, Phylogeny, Algorithms, Bayes Theorem, Computer Simulation, Markov Chains, Monte Carlo Method
13.
Syst Biol ; 65(2): 228-49, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26493827

ABSTRACT

Bayesian total-evidence dating involves the simultaneous analysis of morphological data from the fossil record and morphological and sequence data from recent organisms, and it accommodates the uncertainty in the placement of fossils while dating the phylogenetic tree. Due to the flexibility of the Bayesian approach, total-evidence dating can also incorporate additional sources of information. Here, we take advantage of this and expand the analysis to include information about fossilization and sampling processes. Our work is based on the recently described fossilized birth-death (FBD) process, which has been used to model speciation, extinction, and fossilization rates that can vary over time in a piecewise manner. So far, sampling of extant and fossil taxa has been assumed to be either complete or uniformly at random, an assumption which is only valid for a minority of data sets. We therefore extend the FBD process to accommodate diversified sampling of extant taxa, which is standard practice in studies of higher-level taxa. We verify the implementation using simulations and apply it to the early radiation of Hymenoptera (wasps, ants, and bees). Previous total-evidence dating analyses of this data set were based on a simple uniform tree prior and dated the initial radiation of extant Hymenoptera to the late Carboniferous (309 Ma). The analyses using the FBD prior under diversified sampling, however, date the radiation to the Triassic and Permian (252 Ma), slightly older than the age of the oldest hymenopteran fossils. By exploring a variety of FBD model assumptions, we show that it is mainly the accommodation of diversified sampling that causes the push toward more recent divergence times. Accounting for diversified sampling thus has the potential to close the long-discussed gap between rocks and clocks. 
We conclude that the explicit modeling of fossilization and sampling processes can improve divergence time estimates, but only if all important model aspects, including sampling biases, are adequately addressed.


Subjects
Classification/methods, Fossils, Hymenoptera/classification, Biological Models, Animals, Biodiversity, Genetic Speciation, Phylogeny, Time
14.
Syst Biol ; 65(4): 726-36, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27235697

ABSTRACT

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.].


Subjects
Classification/methods, Biological Models, Phylogeny, Software, Bayes Theorem
15.
Syst Biol ; 64(6): 1089-103, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26272507

ABSTRACT

Directional evolution has played an important role in shaping the morphological, ecological, and molecular diversity of life. However, standard substitution models assume stationarity of the evolutionary process over the time scale examined, thus impeding the study of directionality. Here we explore a simple, nonstationary model of evolution for discrete data, which assumes that the state frequencies at the root differ from the equilibrium frequencies of the homogeneous evolutionary process along the rest of the tree (i.e., the process is nonstationary, nonreversible, but homogeneous). Within this framework, we develop a Bayesian approach for testing directional versus stationary evolution using a reversible-jump algorithm. Simulations show that when only data from extant taxa are available, the success in inferring directionality is strongly dependent on the evolutionary rate, the shape of the tree, the relative branch lengths, and the number of taxa. Given suitable evolutionary rates (0.1-0.5 expected substitutions between root and tips), accounting for directionality improves tree inference and often allows correct rooting of the tree without the use of an outgroup. As an empirical test, we apply our method to study directional evolution in hymenopteran morphology. We focus on three character systems: wing veins, muscles, and sclerites. We find strong support for a trend toward loss of wing veins and muscles, while stationarity cannot be ruled out for sclerites. Adding fossil and time information in a total-evidence dating approach, we show that accounting for directionality results in more precise estimates not only of the ancestral state at the root of the tree, but also of the divergence times. Our model relaxes the assumption of stationarity and reversibility by adding a minimum of additional parameters, and is thus well suited to studying the nature of the evolutionary process in data sets of limited size, such as morphology and ecology.
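The nonstationary setup in the abstract, where root frequencies differ from the equilibrium frequencies of an otherwise homogeneous process, can be illustrated with the simplest possible case: a binary character with two rates. The sketch below uses the standard closed-form solution for a two-state continuous-time Markov chain; the function name and the numeric values are hypothetical.

```python
import math


def prob_state1(t, root_freq1, rate_01, rate_10):
    """Probability of state 1 after time t in a two-state Markov model.

    root_freq1 is the frequency of state 1 at the root; rate_01 and rate_10
    are the 0 -> 1 and 1 -> 0 rates of the homogeneous process.
    """
    total = rate_01 + rate_10
    equilibrium = rate_01 / total
    # The deviation from equilibrium decays exponentially with time.
    return equilibrium + (root_freq1 - equilibrium) * math.exp(-total * t)


# Hypothetical trait that is common at the root (0.9) but rare at equilibrium (0.2).
trajectory = [prob_state1(t, root_freq1=0.9, rate_01=0.1, rate_10=0.4)
              for t in (0.0, 1.0, 5.0, 50.0)]
```

When the root frequency equals the equilibrium frequency the trajectory is flat, which corresponds to the stationary model; any other root frequency produces a directional trend toward equilibrium.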


Subjects
Biological Evolution , Hymenoptera/anatomy & histology , Hymenoptera/cytology , Models, Biological , Animals , Computer Simulation , Markov Chains
16.
Syst Biol ; 63(5): 753-71, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24951559

ABSTRACT

Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.


Subjects
Classification/methods , Models, Statistical , Phylogeny , Algorithms , Computer Simulation
17.
Syst Biol ; 62(5): 660-73, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23628960

ABSTRACT

We review Bayesian approaches to model testing in general and to the assessment of topological hypotheses in particular. We show that the standard way of setting up Bayes factor tests of the monophyly of a group, or the placement of a sample sequence in a known reference tree, can be misleading. The reason for this is related to the well-known dependency of Bayes factors on model-specific priors. Specifically, when testing tree hypotheses it is important that each hypothesis is associated with an appropriate tree space in the prior. This can be achieved by using appropriately constrained searches or by filtering trees in the posterior sample, but in a more elaborate way than typically implemented. If it is difficult to find the appropriate tree sets to be contrasted, then the posterior model odds may be more informative than the Bayes factor. We illustrate the recommended techniques using an empirical test case addressing the issue of whether two genera of diving beetles (Coleoptera: Dytiscidae), Suphrodytes and Hydroporus, should be synonymized. Our refined Bayes factor tests, in contrast to standard analyses, show that there is strong support for Suphrodytes nesting inside Hydroporus, and the genera are therefore synonymized.
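
The relationship between the Bayes factor and the posterior model odds mentioned above follows from the standard identity below. The notation is assumed for illustration and is not taken from the article: H_1 and H_2 are the two competing tree hypotheses, D is the data, \tau is a tree, \theta stands for the remaining model parameters, and \mathcal{T}_i is the tree space associated with hypothesis H_i in the prior.

```latex
\underbrace{\frac{P(H_1 \mid D)}{P(H_2 \mid D)}}_{\text{posterior odds}}
  = \underbrace{\frac{P(D \mid H_1)}{P(D \mid H_2)}}_{\text{Bayes factor } B_{12}}
    \times
    \underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}},
\qquad
P(D \mid H_i) = \sum_{\tau \in \mathcal{T}_i} P(\tau \mid H_i)
  \int P(D \mid \tau, \theta)\, p(\theta \mid \tau, H_i)\, d\theta .
```

Because each marginal likelihood averages over the trees in \mathcal{T}_i weighted by the prior, the Bayes factor depends on which tree space is assigned to each hypothesis.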


Subjects
Classification/methods , Coleoptera/classification , Phylogeny , Animals , Bayes Theorem
18.
Mol Phylogenet Evol ; 67(1): 266-76, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23396205

ABSTRACT

The eukaryotic translation elongation factor-1α gene (eEF1A) has been used extensively in higher level phylogenetics of insects and other groups, despite being present in two or more copies in several taxa. Orthology assessment has relied heavily on the position of introns, but the basic assumption of low rates of intron loss and absence of convergent intron gains has not been tested thoroughly. Here, we study the evolution of eEF1A based on a broad sample of taxa in the insect order Hymenoptera. The gene is universally present in two copies - F1 and F2 - both of which apparently originated before the emergence of the order. An elevated ratio of non-synonymous versus synonymous substitutions and differences in rates of amino acid replacements between the copies suggest that they evolve independently, and phylogenetic methods clearly cluster the copies separately. The F2 copy appears to be ancient; it is orthologous with the copy known as F1 in Diptera, and is likely present in most insect orders. The hymenopteran F1 copy, which may or may not be unique to this order, apparently originated through retroposition and was originally intron free. During the evolution of the Hymenoptera, it has successively accumulated introns, at least three of which have appeared at the same position as introns in the F2 copy or in eEF1A copies in other insects. The sites of convergent intron gain are characterized by highly conserved nucleotides that strongly resemble specific intron-associated sequence motifs, so-called proto-splice sites. The significant rate of convergent intron gain renders intron-exon structure unreliable as an indicator of orthology in eEF1A, and probably also in other protein-coding genes.


Subjects
Evolution, Molecular , Hymenoptera/genetics , Introns , Peptide Elongation Factor 1/genetics , Phylogeny , Animals , Exons , Genes, Insect , Sequence Analysis, DNA
19.
Syst Biol ; 61(6): 973-99, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-22723471

ABSTRACT

Phylogenies are usually dated by calibrating interior nodes against the fossil record. This relies on indirect methods that, in the worst case, misrepresent the fossil information. Here, we contrast such node dating with an approach that includes fossils along with the extant taxa in a Bayesian total-evidence analysis. As a test case, we focus on the early radiation of the Hymenoptera, mostly documented by poorly preserved impression fossils that are difficult to place phylogenetically. Specifically, we compare node dating using nine calibration points derived from the fossil record with total-evidence dating based on 343 morphological characters scored for 45 fossil (4–20% complete) and 68 extant taxa. In both cases we use molecular data from seven markers (∼5 kb) for the extant taxa. Because it is difficult to model speciation, extinction, sampling, and fossil preservation realistically, we develop a simple uniform prior for clock trees with fossils, and we use relaxed clock models to accommodate rate variation across the tree. Despite considerable uncertainty in the placement of most fossils, we find that they contribute significantly to the estimation of divergence times in the total-evidence analysis. In particular, the posterior distributions on divergence times are less sensitive to prior assumptions and tend to be more precise than in node dating. The total-evidence analysis also shows that four of the seven Hymenoptera calibration points used in node dating are likely to be based on erroneous or doubtful assumptions about the fossil placement. With respect to the early radiation of Hymenoptera, our results suggest that the crown group dates back to the Carboniferous, ∼309 Ma (95% interval: 291–347 Ma), and diversified into major extant lineages much earlier than previously thought, well before the Triassic. [Bayesian inference; fossil dating; morphological evolution; relaxed clock; statistical phylogenetics.]


Subjects
Fossils , Hymenoptera/classification , Phylogeny , Animals , Genetic Speciation , Hymenoptera/anatomy & histology , Hymenoptera/genetics , Models, Genetic , Models, Statistical , Statistics as Topic , Time Factors
20.
Syst Biol ; 61(1): 170-3, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21963610

ABSTRACT

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.


Subjects
Computational Biology/methods , Phylogeny , Software , Algorithms , Computing Methodologies , Evolution, Molecular , Genome