1.
Bioessays ; 39(11), 2017 Nov.
Article in English | MEDLINE | ID: mdl-28980328

ABSTRACT

Phylogenetic trees are a crucial backbone for a wide breadth of biological research spanning systematics, organismal biology, ecology, and medicine. In 2015, the Open Tree of Life project published a first draft of a comprehensive tree of life, summarizing digitally available taxonomic and phylogenetic knowledge. This paper reviews, investigates, and addresses the following questions as a follow-up to that paper, from the perspective of researchers involved in building this summary of the tree of life: Is there a tree of life and should we reconstruct it? Is available data sufficient to reconstruct the tree of life? Do we have access to phylogenetic inferences in usable form? Can we combine different phylogenetic estimates across the tree of life? And finally, what is the future of understanding the tree of life?


Subject(s)
Biological Evolution , Genomics/methods , Phylogeny , Archaea/genetics , Bacteria/genetics , Eukaryota/genetics , Horizontal Gene Transfer
2.
PLoS Comput Biol ; 13(6): e1005510, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28640806

ABSTRACT

Computers are now essential in all branches of science, but most researchers are never taught the equivalent of basic lab skills for research computing. As a result, data can get lost, analyses can take much longer than necessary, and researchers are limited in how effectively they can work with software and data. Computing workflows need to follow the same practices as lab projects and notebooks, with organized data, documented steps, and the project structured for reproducibility, but researchers new to computing often don't know where to start. This paper presents a set of good computing practices that every researcher can adopt, regardless of their current level of computational skill. These practices, which encompass data management, programming, collaborating with colleagues, organizing projects, tracking work, and writing manuscripts, are drawn from a wide variety of published sources, from our daily lives, and from our work with volunteer organizations that have delivered workshops to over 11,000 people since 2010.


Subject(s)
Computer Security/standards , Computing Methodologies , Data Accuracy , Research/standards , Science/standards , Software/standards , Documentation/standards , Guidelines as Topic
3.
Proc Natl Acad Sci U S A ; 112(41): 12764-9, 2015 Oct 13.
Article in English | MEDLINE | ID: mdl-26385966

ABSTRACT

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips: the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.


Subject(s)
Classification/methods , Phylogeny , Animals , Humans
4.
Bioinformatics ; 31(17): 2794-800, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25940563

ABSTRACT

MOTIVATION: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. RESULTS: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git's version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the 'phylesystem-api', which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. AVAILABILITY AND IMPLEMENTATION: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. 
A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator. Code for that tool is available from https://github.com/OpenTreeOfLife/opentree. CONTACT: mtholder@gmail.com.


Subject(s)
Computational Biology/methods , Factual Databases , Information Storage and Retrieval , Phylogeny , Software , Humans , Internet , Programming Languages , Reproducibility of Results , User-Computer Interface
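
The abstract above describes create, read, update, and delete operations carried out as commits, so that the edit history of each record is preserved. The idea can be illustrated with a minimal Python sketch. This is an in-memory stand-in for a version-controlled store, not git and not the phylesystem-api; the class names, method names, record identifier, and curator names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    record_id: str
    content: Optional[str]  # None marks a deletion
    author: str
    message: str


@dataclass
class VersionedStore:
    """Hypothetical append-only store: every write is recorded as a commit."""
    log: List[Commit] = field(default_factory=list)

    def _commit(self, record_id, content, author, message):
        self.log.append(Commit(record_id, content, author, message))

    def read(self, record_id):
        # The current state is the content of the most recent commit.
        for commit in reversed(self.log):
            if commit.record_id == record_id:
                return commit.content
        return None

    def create(self, record_id, content, author):
        if self.read(record_id) is not None:
            raise KeyError(f"{record_id} already exists")
        self._commit(record_id, content, author, f"create {record_id}")

    def update(self, record_id, content, author, message):
        if self.read(record_id) is None:
            raise KeyError(f"{record_id} does not exist")
        self._commit(record_id, content, author, message)

    def delete(self, record_id, author):
        if self.read(record_id) is None:
            raise KeyError(f"{record_id} does not exist")
        self._commit(record_id, None, author, f"delete {record_id}")

    def history(self, record_id):
        # Full provenance: who changed the record, and why, in order.
        return [c for c in self.log if c.record_id == record_id]


store = VersionedStore()
store.create("study_1", "((A,B),C);", "curator_a")
store.update("study_1", "(A,(B,C));", "curator_b", "reroot tree")
authors = [c.author for c in store.history("study_1")]
```

Because nothing is overwritten, a curation error can be traced to a specific commit and reverted, and each editor remains credited in the history.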
5.
Syst Biol ; 64(5): 853-9, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25922515

ABSTRACT

Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important, yet often least appreciated, step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for nonspecialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as a key pipeline for peer-reviewed calibrations to enter the database.


Subject(s)
Factual Databases/standards , Fossils , Phylogeny , Access to Information , Calibration , Statistical Data Interpretation , Internet , Time
6.
PLoS Biol ; 11(1): e1001468, 2013.
Article in English | MEDLINE | ID: mdl-23335860

ABSTRACT

How should funding agencies enable researchers to explore high-risk but potentially high-reward science? One model that appears to work is the NSF-funded synthesis center, an incubator for community-led, innovative science.


Subject(s)
Biomedical Research/economics , Government Financing/economics , Statistical Data Interpretation , Financial Management , Humans , Research Personnel , United States
7.
BMC Bioinformatics ; 14: 158, 2013 May 13.
Article in English | MEDLINE | ID: mdl-23668630

ABSTRACT

BACKGROUND: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. RESULTS: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. 
CONCLUSIONS: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.


Subject(s)
Phylogeny , Software , Internet
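
The pruning step mentioned in the abstract above (cutting a large expert phylogeny down to the species in a query) can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the phylotastic project: the nested-tuple tree encoding, the `prune` function, and the toy reference topology are assumptions made for the example, and the topology is not a real phylogenetic estimate.

```python
def prune(tree, keep):
    """Return the subtree induced by the tips in `keep`, or None if none remain.

    A tree is either a tip label (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (prune(child, keep) for child in tree) if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse an internal node left with a single child
    return tuple(kept)


# Toy reference tree (hypothetical topology, for illustration only).
reference = (
    (("Oryza sativa", "Oryza rufipogon"), "Zea mays"),
    ("Arabidopsis thaliana", "Oenothera biennis"),
)
query = {"Oryza sativa", "Zea mays", "Oenothera biennis"}
custom = prune(reference, query)
```

Branch lengths, name resolution, and annotations are left out here; in the system described, those would be supplied by separate components.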
8.
Plant J ; 63(3): 430-42, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20487382

ABSTRACT

Despite knowledge that polyploidy is widespread and a major evolutionary force in flowering plant diversification, detailed comparative molecular studies on polyploidy have been confined to only a few species and families. The genus Oryza is composed of 23 species that are classified into ten distinct 'genome types' (six diploid and four polyploid), and is emerging as a powerful new model system to study polyploidy. Here we report the identification, sequence and comprehensive comparative annotation of eight homoeologous genomes from a single orthologous region (Adh1-Adh2) from four allopolyploid species representing each of the known Oryza genome types (BC, CD, HJ and KL). Detailed comparative phylogenomic analyses of these regions within and across species and ploidy levels provided several insights into the spatio-temporal dynamics of genome organization and evolution of this region in 'natural' polyploids of Oryza. The major findings of this study are that: (i) homoeologous genomic regions within the same nucleus experience both independent and parallel evolution, (ii) differential lineage-specific selection pressures do not occur between polyploids and their diploid progenitors, (iii) there have been no dramatic structural changes relative to the diploid ancestors, (iv) a variation in the molecular evolutionary rate exists between the two genomes in the BC complex species even though the BC and CD polyploid species appear to have arisen <2 million years ago, and (v) there are no clear distinctions in the patterns of genome evolution in the diploid versus polyploid species.


Subject(s)
Molecular Evolution , Plant Genome , Oryza/genetics , Tetraploidy , Bacterial Artificial Chromosomes , Plant Genes , Molecular Sequence Data , Phylogeny , Retroelements
9.
New Phytol ; 191(2): 555-563, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21449951

ABSTRACT

Competing evolutionary forces shape plant breeding systems (e.g. inbreeding depression, reproductive assurance). Which of these forces prevails in a given population or species is predicted to depend upon such factors as life history, ecological conditions, and geographical context. Here, we examined two such predictions: that self-compatibility should be associated with the annual life history or extreme climatic conditions. We analyzed data from a clade of plants remarkable for variation in breeding system, life history and climatic conditions (Oenothera, sections Anogra and Kleinia, Onagraceae). We used a phylogenetic comparative approach and Bayesian or hybrid Bayesian tests to account for phylogenetic uncertainty. Geographic information system (GIS)-based climate data and ecological niche modeling allowed us to quantify climatic conditions. Breeding system and reproductive life span are not correlated in Anogra and Kleinia. Instead, self-compatibility is associated with the extremes of temperature in the coldest part of the year and precipitation in the driest part of the year. In the 60 yr since this pattern was anticipated, this is the first demonstration of a relationship between the evolution of self-compatibility and climatic extremes. We discuss possible explanations for this pattern and possible implications with respect to anthropogenic climate change.


Subject(s)
Biological Adaptation/physiology , Oenothera biennis/physiology , Biological Adaptation/genetics , Bayes Theorem , Biodiversity , Biological Evolution , Climate , Ecosystem , Geography , Inbreeding , Oenothera biennis/genetics , Phylogeny , Reproduction/genetics
10.
Syst Biol ; 58(5): 489-500, 2009 Oct.
Article in English | MEDLINE | ID: mdl-20525603

ABSTRACT

Several methods have recently been developed to infer multilocus phylogenies by incorporating information from topological incongruence of the individual genes. In this study, we investigate 2 such methods, Bayesian concordance analysis and Bayesian estimation of species trees. Our test data are a collection of genes from cultivated rice (genus Oryza) and the most closely related wild species, generated using a high-throughput sequencing protocol and bioinformatics pipeline. Trees inferred from independent genes display levels of topological incongruence that far exceed that seen in previous data sets analyzed with these species tree methods. We identify differences in phylogenetic results between inference methods that incorporate gene tree incongruence. Finally, we discuss the challenges of scaling these analyses for data sets with thousands of gene trees and extensive levels of missing data.


Subject(s)
Bayes Theorem , Classification/methods , Computational Biology/methods , Genes/genetics , Genetic Models , Oryza/genetics , Phylogeny , Base Sequence , Sequence Alignment , DNA Sequence Analysis , Software
11.
PLoS Curr ; 6, 2014 Jun 19.
Article in English | MEDLINE | ID: mdl-24987572

ABSTRACT

As phylogenetic data becomes increasingly available, along with associated data on species' genomes, traits, and geographic distributions, the need to ensure data availability and reuse becomes more and more acute. In this paper, we provide ten "simple rules" that we view as best practices for data sharing in phylogenetic research. These rules will help lead towards a future phylogenetics where data can easily be archived, shared, reused, and repurposed across a wide variety of projects.

13.
Front Plant Sci ; 2: 34, 2011.
Article in English | MEDLINE | ID: mdl-22645531

ABSTRACT

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services.

15.
Syst Biol ; 57(3): 335-46, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18570030

ABSTRACT

As an archive of sequence data for over 165,000 species, GenBank is an indispensable resource for phylogenetic inference. Here we describe an informatics processing pipeline and online database, the PhyLoTA Browser (http://loco.biosci.arizona.edu/pb), which offers a view of GenBank tailored for molecular phylogenetics. The first release of the Browser is computed from 2.6 million sequences representing the taxonomically enriched subset of GenBank sequences for eukaryotes (excluding most genome survey sequences, ESTs, and other high-throughput data). In addition to summarizing sequence diversity and species diversity across nodes in the NCBI taxonomy, it reports 87,000 potentially phylogenetically informative clusters of homologous sequences, which can be viewed or downloaded, along with provisional alignments and coarse phylogenetic trees. At each node in the NCBI hierarchy, the user can display a "data availability matrix" of all available sequences for entries in a subtaxa-by-clusters matrix. This matrix provides a guidepost for subsequent assembly of multigene data sets or supertrees. The database allows for comparison of results from previous GenBank releases, highlighting recent additions of either sequences or taxa to GenBank and letting investigators track progress on data availability worldwide. Although the reported alignments and trees are extremely approximate, the database reports several statistics correlated with alignment quality to help users choose from alternative data sources.


Subject(s)
Nucleic Acid Databases , Phylogeny , Software , Cluster Analysis , Computational Biology/methods , Internet
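
The "data availability matrix" described in the abstract above is a subtaxa-by-clusters table of available sequences. A rough Python sketch of how such a matrix could be assembled from per-sequence records follows. The function name, the record layout, and the taxon and cluster labels are hypothetical and are not taken from the PhyLoTA Browser itself.

```python
def availability_matrix(records):
    """Count sequences per (taxon, cluster) pair.

    `records` is an iterable of (taxon, cluster_id) pairs, one per sequence.
    Returns the sorted row labels, sorted column labels, and the count matrix.
    """
    records = list(records)
    taxa = sorted({taxon for taxon, _ in records})
    clusters = sorted({cluster for _, cluster in records})
    counts = {(t, c): 0 for t in taxa for c in clusters}
    for taxon, cluster in records:
        counts[(taxon, cluster)] += 1
    matrix = [[counts[(t, c)] for c in clusters] for t in taxa]
    return taxa, clusters, matrix


# Hypothetical sequence records: (subtaxon, homologous cluster).
records = [
    ("Oryza", "cluster1"),
    ("Oryza", "cluster2"),
    ("Zea", "cluster1"),
    ("Oryza", "cluster1"),
    ("Sorghum", "cluster3"),
]
taxa, clusters, matrix = availability_matrix(records)
```

Rows with many nonzero cells point to taxa that can be included in a multigene data set, while sparse columns flag clusters with limited taxon coverage.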
16.
Syst Biol ; 56(4): 578-90, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17654363

ABSTRACT

Bayesian inference of phylogeny is unique among phylogenetic reconstruction methods in that it produces a posterior distribution of trees rather than a point estimate of the best tree. The most common way to summarize this distribution is to report the majority-rule consensus tree annotated with the marginal posterior probabilities of each partition. Reporting a single tree discards information contained in the full underlying distribution and reduces the Bayesian analysis to simply another method for finding a point estimate of the tree. Even when a point estimate of the phylogeny is desired, the majority-rule consensus tree is only one possible method, and there may be others that are more appropriate for the given data set and application. We present a method for summarizing the distribution of trees that is based on identifying agreement subtrees that are frequently present in the posterior distribution. This method provides fully resolved binary trees for subsets of taxa with high marginal posterior probability on the entire tree and includes additional information about the spread of the distribution.


Subject(s)
Phylogeny , Algorithms , Bayes Theorem , Computer Simulation , Biological Models
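
The conventional summary that the abstract above contrasts with its own method, a majority-rule consensus annotated with marginal posterior probabilities, can be sketched in Python as follows. This is a simplified illustration, not the agreement-subtree method the paper presents: it assumes rooted trees encoded as nested tuples and counts clades rather than unrooted partitions, and the four-tree sample is invented for the example.

```python
from collections import Counter


def clades(tree):
    """Return the non-trivial clades of a rooted nested-tuple tree as frozensets of tips."""
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    all_tips = tips(tree)
    found.discard(all_tips)  # the root clade appears in every tree
    return found


def clade_support(sample):
    """Estimate marginal posterior probabilities as the fraction of sampled trees containing each clade."""
    counts = Counter()
    for tree in sample:
        counts.update(clades(tree))
    return {clade: count / len(sample) for clade, count in counts.items()}


def majority_rule_clades(sample, threshold=0.5):
    """Keep only clades found in more than `threshold` of the sampled trees."""
    return {c: p for c, p in clade_support(sample).items() if p > threshold}


# Invented posterior sample of four rooted trees on tips A-D.
posterior_sample = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
support = clade_support(posterior_sample)
majority = majority_rule_clades(posterior_sample)
```

In this toy sample only the clade {A, B} passes the majority threshold, so the consensus leaves the rest of the tree unresolved; this is the loss of information about the underlying distribution that the abstract points to.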