ABSTRACT
Chronograms (phylogenies with branch lengths proportional to time) represent key data on the timing of evolutionary events, allowing us to study natural processes in many areas of biological research. Chronograms also provide valuable information that can be used for education, science communication, and conservation policy decisions. Yet, achieving a high-quality reconstruction of a chronogram is a difficult and resource-consuming task. Here we present DateLife, phylogenetic software implemented as an R package and an R Shiny web application available at www.datelife.org, that provides services for efficient and easy discovery, summary, reuse, and reanalysis of node age data mined from a curated database of expert, peer-reviewed, and openly available chronograms. The main DateLife workflow starts with one or more scientific taxon names provided by a user. Names are processed and standardized to a unified taxonomy, allowing DateLife to run a name match across its local chronogram database, which is curated from Open Tree of Life's phylogenetic repository, and to extract all chronograms that contain at least two queried taxon names, along with their metadata. Finally, node ages from matching chronograms are mapped using the congruification algorithm to corresponding nodes on a tree topology, either extracted from Open Tree of Life's synthetic phylogeny or provided by the user. Congruified node ages are used as secondary calibrations to date the chosen topology, with or without initial branch lengths, using different phylogenetic dating methods such as BLADJ, treePL, PATHd8, and MrBayes.
We performed a cross-validation test to compare node ages resulting from a DateLife analysis (i.e., phylogenetic dating using secondary calibrations) to those from the original chronograms (i.e., obtained with primary calibrations), and found that DateLife's node age estimates are consistent with the age estimates from the original chronograms, with the largest variation in ages occurring around topologically deeper nodes. Because the results from any software for scientific analysis can only be as good as the data used as input, we highlight the importance of considering the results of a DateLife analysis in the context of the input chronograms. DateLife can help to increase awareness of the existing disparities among alternative hypotheses of dates for the same diversification events, and to support exploration of the effect of alternative chronogram hypotheses on downstream analyses, providing a framework for a more informed interpretation of evolutionary results.
Subjects
Classification; Phylogeny; Software; Classification/methods; Databases, Factual
ABSTRACT
The biological sciences community is increasingly recognizing the value of open, reproducible and transparent research practices for science and society at large. Despite this recognition, many researchers fail to share their data and code publicly. This pattern may arise from knowledge barriers about how to archive data and code, concerns about its reuse, and misaligned career incentives. Here, we define, categorize and discuss barriers to data and code sharing that are relevant to many research fields. We explore how real and perceived barriers might be overcome or reframed in the light of the benefits relative to costs. By elucidating these barriers and the contexts in which they arise, we can take steps to mitigate them and align our actions with the goals of open science, both as individual scientists and as a scientific community.
Subjects
Biological Science Disciplines; Motivation; Information Dissemination
ABSTRACT
BACKGROUND: Phylogenies are a key part of research in many areas of biology. Tools that automate some parts of the process of phylogenetic reconstruction, mainly molecular character matrix assembly, have been developed for the benefit of both specialists in the field of phylogenetics and non-specialists. However, interpreting results, comparing them with previously available phylogenetic hypotheses, and selecting one phylogeny for downstream analyses and discussion remain difficult for anyone who is not a specialist in either phylogenetic methods or a particular study group. RESULTS: Physcraper is a command-line Python program that automates the update of published phylogenies by adding public DNA sequences to the underlying alignments of previously published phylogenies. It also provides a framework for straightforward comparison of published phylogenies with their updated versions, by leveraging tools from the Open Tree of Life project to link taxonomic information across databases. The program can be used by non-specialists as a tool to generate phylogenetic hypotheses based on publicly available expert phylogenetic knowledge. Phylogeneticists and taxonomic group specialists will find it useful as a tool to facilitate molecular dataset gathering and comparison of alternative phylogenetic hypotheses (topologies). CONCLUSION: The Physcraper workflow showcases the benefits of doing open science for phylogenetics, encouraging researchers to strive for better scientific sharing practices. Physcraper can be used with any OS and is released under an open-source license. Detailed instructions for installation and usage are available at https://physcraper.readthedocs.io.
Subjects
Phylogeny
ABSTRACT
BACKGROUND AND AIMS: As angiosperms became one of the megadiverse groups of macroscopic eukaryotes, they forged modern ecosystems and promoted the evolution of extant terrestrial biota. Unequal distribution of species among lineages suggests that diversification, the process that ultimately determines species richness, acted differentially through angiosperm evolution. METHODS: We investigate how angiosperms became megadiverse by identifying the phylogenetic and temporal placement of exceptional radiations, by combining the most densely fossil-calibrated molecular clock phylogeny with a Bayesian model that identifies diversification shifts among evolutionary lineages and through time. We evaluate the effect of the prior number of expected shifts in the phylogenetic tree. KEY RESULTS: Major diversification increases took place over 100 Ma, from the Early Cretaceous to the end of the Paleogene, and are distributed across the angiosperm phylogeny. The long-term diversification trajectory of angiosperms shows moderate rate variation, but is underlain by increasing speciation and extinction, and results from temporally overlapping, independent radiations and depletions in component lineages. CONCLUSIONS: The identified deep time diversification shifts are clues to the identification of ultimate drivers of angiosperm megadiversity, which probably involve multivariate interactions among intrinsic traits and extrinsic forces. An enhanced understanding of angiosperm diversification will involve a more precise phylogenetic location of diversification shifts, and integration of fossil information.
Subjects
Biological Evolution; Magnoliopsida; Phylogeny; Adaptation, Biological; Bayes Theorem; Evolution, Molecular; Fossils/anatomy & histology
ABSTRACT
Arid biomes are particularly prominent in the Neotropics, providing some of its most emblematic landscapes and a substantial part of its species diversity. To understand some of the evolutionary processes underlying the speciation of lineages in the Mexican Deserts, we investigate the diversification of Fouquieria, which includes eleven species, all endemic to the warm deserts and dry subtropical regions of North America. Using a phylogeny from plastid DNA sequences with samples of individuals from populations of all the species recognized in Fouquieria, we estimate divergence times, test for temporal diversification heterogeneity, test for geographical structure, and conduct ancestral area reconstruction. Fouquieria is an ancient lineage that diverged from Polemoniaceae ca. 75.54 Ma. A Mio-Pliocene diversification of Fouquieria with vicariance, associated with the Neogene orogenesis underlying the early development of regional deserts, is strongly supported. The test for temporal diversification heterogeneity indicates that during its evolutionary history, Fouquieria had a drastic diversification rate shift at ca. 12.72 Ma, agreeing with hypotheses that some of the lineages in North American deserts diversified as early as the late Miocene to Pliocene, and not during the Pleistocene. Long-term diversification dynamics analyses suggest that extinction also played a significant role in Fouquieria's evolution, with a very high rate at the onset of the process. From the late Miocene onwards, Fouquieria underwent substantial diversification change, involving high speciation decreasing to the present and negligible extinction, which is congruent with its scant fossil record during this period. Geographic phylogenetic structure and the pattern of most sister species inhabiting different desert nuclei support the idea that isolation by distance could be the main driver of speciation.
Subjects
Desert Climate; Ericales/classification; Phylogeny; Biodiversity; Fossils; Genetic Speciation; Geography; Likelihood Functions; North America; Software; Time Factors; United States
ABSTRACT
The relationship between clade age and species richness has been increasingly used in macroevolutionary studies as evidence for ecologically versus time-dependent diversification processes. However, theory suggests that phylogenetic structure, age type (crown or stem age), and taxonomic delimitation can affect estimates of the age-richness correlation (ARC) considerably. We currently lack an integrative understanding of how these different factors affect ARCs, which in turn, obscures further interpretations. To assess its informative breadth, we characterize ARC behavior with simulated and empirical phylogenies, considering phylogenetic structure and both crown and stem ages. First, we develop a two-state birth-death model to simulate phylogenies including the origin of higher taxa and a hierarchical taxonomy to determine ARC expectations under ecologically and time-dependent diversification processes. Then, we estimate ARCs across various taxonomic ranks of extant amphibians, squamate reptiles, mammals, birds, and flowering plants. We find that our model reproduces the general ARC trends of a wide range of biological systems despite the particularities of taxonomic practice within each, suggesting that the model is adequate to establish a framework of ARC null expectations for different diversification processes when taxa are defined with a hierarchical taxonomy. ARCs estimated with crown ages were positive in all the scenarios we studied, including ecologically dependent processes. Negative ARCs were only found at less inclusive taxonomic ranks, when considering stem age, and when rates varied among clades. This was the case both in ecologically and time-dependent processes. Together, our results warn against direct interpretations of single ARC estimates and advocate for a more integrative use of ARCs across age types and taxonomic ranks in diversification studies. 
[Birth-Death models; crown age; diversity dependence; extinction; phylogenetic structure; speciation; stem age; taxonomy; time dependence; tree simulations.].
Subjects
Biodiversity; Classification/methods; Models, Biological; Phylogeny; Animals; Genetic Speciation; Magnoliopsida
ABSTRACT
The establishment of modern terrestrial life is indissociable from angiosperm evolution. While available molecular clock estimates of angiosperm age range from the Paleozoic to the Late Cretaceous, the fossil record is consistent with angiosperm diversification in the Early Cretaceous. The time-frame of angiosperm evolution is here estimated using a sample representing 87% of families and sequences of five plastid and nuclear markers, implementing penalized likelihood and Bayesian relaxed clocks. A literature-based review of the palaeontological record yielded calibrations for 137 phylogenetic nodes. The angiosperm crown age was bound within a confidence interval calculated with a method that considers the fossil record of the group. An Early Cretaceous crown angiosperm age was estimated with high confidence. Magnoliidae, Monocotyledoneae and Eudicotyledoneae diversified synchronously 135-130 million yr ago (Ma); Pentapetalae is 126-121 Ma; and Rosidae (123-115 Ma) preceded Asteridae (119-110 Ma). Family stem ages are continuously distributed between c. 140 and 20 Ma. This time-frame documents an early phylogenetic proliferation that led to the establishment of major angiosperm lineages, and the origin of over half of extant families, in the Cretaceous. While substantial amounts of angiosperm morphological and functional diversity have deep evolutionary roots, extant species richness was probably acquired later.
Subjects
Base Sequence; Biodiversity; Biological Evolution; Fossils; Magnoliopsida/genetics; Phylogeny; Bayes Theorem; Cell Nucleus; DNA, Plant/analysis; Evolution, Molecular; Plastids; Sequence Analysis, DNA
ABSTRACT
A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts, including research, education, and public policy. Yet, accessing the tree of life typically requires special knowledge, complex software, or long periods of training. The Phylotastic project aims to make it as easy to get a phylogeny of species as it is to get driving directions from mapping software. In prior work, we presented a design for an open system to validate and manage taxon names, find phylogeny resources, extract subtrees matching a user's taxon list, scale trees to time, and integrate related resources such as species images. Here, we report the implementation of a set of tools that together represent a robust, accessible system for on-the-fly delivery of phylogenetic knowledge. This set of tools includes a web portal to execute several customizable workflows to obtain species phylogenies (scaled by geologic time and decorated with thumbnail images); more than 30 underlying web services (accessible via a common registry); and code toolkits in R and Python (allowing others to develop custom applications using Phylotastic services). The Phylotastic system, accessible via http://www.phylotastic.org, provides a unique resource to access the current state of phylogenetic knowledge, useful for a variety of cases in which a tree extracted quickly from online resources (as distinct from a tree custom-made from character data) is sufficient, as it is for many casual uses of trees identified here.
ABSTRACT
Escherichia coli occur as either free-living microorganisms, or within the colons of mammals and birds as pathogenic or commensal bacteria. Although the Mexican population of intestinal E. coli maintains high levels of genetic diversity, the exact mechanisms by which this occurs remain unknown. We therefore investigated the role of homologous recombination and point mutation in the genetic diversification and population structure of Mexican strains of E. coli. This was explored using a multilocus sequence typing (MLST) approach in a non-outbreak related, host-wide sample of 128 isolates. Overall, genetic diversification in this sample appears to be driven primarily by homologous recombination, and to a lesser extent, by point mutation. Since genetic diversity is hierarchically organized according to the MLST genealogy, we observed that there is not a homogeneous recombination rate, but that different rates emerge at different clustering levels such as phylogenetic group, lineage and clonal complex (CC). Moreover, we detected a clear signature of substructure within the A+B1 phylogenetic group, where the majority of isolates were differentiated into four discrete lineages. The substructure pattern is revealed by the presence of several CCs associated with a particular lifestyle and host as well as with different genetic diversification mechanisms. We propose these findings as an alternative explanation for the maintenance of the clear phylogenetic signal of this species despite the prevalence of homologous recombination. Finally, we corroborate that using both phylogenetic and population genetic approaches is an effective means to establish epidemiological surveillance tailored to the ecological specificities of each geographic region.