Search | BVS Integralidade em Saúde

How and Why to Build a Unified Tree of Life.

McTavish, Emily Jane; Drew, Bryan T; Redelings, Ben; Cranston, Karen A.

Bioessays ; 39(11), 2017 Nov.

Article in English | MEDLINE | ID: mdl-28980328

ABSTRACT

Phylogenetic trees are a crucial backbone for a wide breadth of biological research spanning systematics, organismal biology, ecology, and medicine. In 2015, the Open Tree of Life project published a first draft of a comprehensive tree of life, summarizing digitally available taxonomic and phylogenetic knowledge. This paper reviews, investigates, and addresses the following questions as a follow-up to that paper, from the perspective of researchers involved in building this summary of the tree of life: Is there a tree of life and should we reconstruct it? Is available data sufficient to reconstruct the tree of life? Do we have access to phylogenetic inferences in usable form? Can we combine different phylogenetic estimates across the tree of life? And finally, what is the future of understanding the tree of life?

Subjects

Biological Evolution, Genomics/methods, Phylogeny, Archaea/genetics, Bacteria/genetics, Eukaryota/genetics, Horizontal Gene Transfer

Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

Hinchliff, Cody E; Smith, Stephen A; Allman, James F; Burleigh, J Gordon; Chaudhary, Ruchi; Coghill, Lyndon M; Crandall, Keith A; Deng, Jiabin; Drew, Bryan T; Gazis, Romina; Gude, Karl; Hibbett, David S; Katz, Laura A; Laughinghouse, H Dail; McTavish, Emily Jane; Midford, Peter E; Owen, Christopher L; Ree, Richard H; Rees, Jonathan A; Soltis, Douglas E; Williams, Tiffani; Cranston, Karen A.

Proc Natl Acad Sci U S A ; 112(41): 12764-9, 2015 Oct 13.

Article in English | MEDLINE | ID: mdl-26385966

ABSTRACT

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips: the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.

Subjects

Classification/methods, Phylogeny, Animals, Humans

Phylesystem: a git-based data store for community-curated phylogenetic estimates.

McTavish, Emily Jane; Hinchliff, Cody E; Allman, James F; Brown, Joseph W; Cranston, Karen A; Holder, Mark T; Rees, Jonathan A; Smith, Stephen A.

Bioinformatics ; 31(17): 2794-800, 2015 Sep 01.

Article in English | MEDLINE | ID: mdl-25940563

ABSTRACT

MOTIVATION: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. RESULTS: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git's version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the 'phylesystem-api', which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. AVAILABILITY AND IMPLEMENTATION: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator. Code for that tool is available from https://github.com/OpenTreeOfLife/opentree. CONTACT: mtholder@gmail.com.

Subjects

Computational Biology/methods, Factual Databases, Information Storage and Retrieval, Phylogeny, Software, Humans, Internet, Programming Languages, Reproducibility of Results, User-Computer Interface

The Fossil Calibration Database: A New Resource for Divergence Dating.

Ksepka, Daniel T; Parham, James F; Allman, James F; Benton, Michael J; Carrano, Matthew T; Cranston, Karen A; Donoghue, Philip C J; Head, Jason J; Hermsen, Elizabeth J; Irmis, Randall B; Joyce, Walter G; Kohli, Manpreet; Lamm, Kristin D; Leehr, Dan; Patané, José S L; Polly, P David; Phillips, Matthew J; Smith, N Adam; Smith, Nathan D; Van Tuinen, Marcel; Ware, Jessica L; Warnock, Rachel C M.

Syst Biol ; 64(5): 853-9, 2015 Sep.

Article in English | MEDLINE | ID: mdl-25922515

ABSTRACT

Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important (and often least appreciated) step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for nonspecialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as a key pipeline for peer-reviewed calibrations to enter the database.
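As an illustration of what "minimum and maximum bounds" mean in practice, the following is a minimal sketch of a calibration record with hard bounds. The class name, field names, and the example ages are hypothetical and are not taken from the Fossil Calibration Database; real dating software typically uses probability densities rather than a simple interval check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    """Hypothetical calibration record: hard age bounds for one tree node."""
    node: str
    min_age_ma: float  # minimum bound in millions of years (e.g. age of oldest fossil)
    max_age_ma: float  # maximum bound in millions of years

    def __post_init__(self):
        # A calibration is only meaningful if 0 <= minimum <= maximum.
        if not 0 <= self.min_age_ma <= self.max_age_ma:
            raise ValueError("bounds must satisfy 0 <= min_age_ma <= max_age_ma")

    def contains(self, age_ma: float) -> bool:
        """Return True if a proposed node age lies within the bounds."""
        return self.min_age_ma <= age_ma <= self.max_age_ma


# Made-up example values, for illustration only.
example = Calibration(node="crown clade X", min_age_ma=50.0, max_age_ma=70.0)
```

A proposed node age can then be screened with `example.contains(60.0)`; a record whose minimum exceeds its maximum is rejected at construction time.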

Subjects

Factual Databases/standards, Fossils, Phylogeny, Access to Information, Calibration, Statistical Data Interpretation, Internet, Time

Spatio-temporal patterns of genome evolution in allotetraploid species of the genus Oryza.

Ammiraju, Jetty S S; Fan, Chuanzhu; Yu, Yeisoo; Song, Xiang; Cranston, Karen A; Pontaroli, Ana Clara; Lu, Fei; Sanyal, Abhijit; Jiang, Ning; Rambo, Teri; Currie, Jennifer; Collura, Kristi; Talag, Jayson; Bennetzen, Jeffrey L; Chen, Mingsheng; Jackson, Scott; Wing, Rod A.

Plant J ; 63(3): 430-42, 2010 Aug.

Article in English | MEDLINE | ID: mdl-20487382

ABSTRACT

Despite knowledge that polyploidy is widespread and a major evolutionary force in flowering plant diversification, detailed comparative molecular studies on polyploidy have been confined to only a few species and families. The genus Oryza is composed of 23 species that are classified into ten distinct 'genome types' (six diploid and four polyploid), and is emerging as a powerful new model system to study polyploidy. Here we report the identification, sequence and comprehensive comparative annotation of eight homoeologous genomes from a single orthologous region (Adh1-Adh2) from four allopolyploid species representing each of the known Oryza genome types (BC, CD, HJ and KL). Detailed comparative phylogenomic analyses of these regions within and across species and ploidy levels provided several insights into the spatio-temporal dynamics of genome organization and evolution of this region in 'natural' polyploids of Oryza. The major findings of this study are that: (i) homoeologous genomic regions within the same nucleus experience both independent and parallel evolution, (ii) differential lineage-specific selection pressures do not occur between polyploids and their diploid progenitors, (iii) there have been no dramatic structural changes relative to the diploid ancestors, (iv) a variation in the molecular evolutionary rate exists between the two genomes in the BC complex species even though the BC and CD polyploid species appear to have arisen <2 million years ago, and (v) there are no clear distinctions in the patterns of genome evolution in the diploid versus polyploid species.

Subjects

Molecular Evolution, Plant Genome, Oryza/genetics, Tetraploidy, Bacterial Artificial Chromosomes, Plant Genes, Molecular Sequence Data, Phylogeny, Retroelements

Species trees from highly incongruent gene trees in rice.

Cranston, Karen A; Hurwitz, Bonnie; Ware, Doreen; Stein, Lincoln; Wing, Rod A.

Syst Biol ; 58(5): 489-500, 2009 Oct.

Article in English | MEDLINE | ID: mdl-20525603

ABSTRACT

Several methods have recently been developed to infer multilocus phylogenies by incorporating information from topological incongruence of the individual genes. In this study, we investigate 2 such methods, Bayesian concordance analysis and Bayesian estimation of species trees. Our test data are a collection of genes from cultivated rice (genus Oryza) and the most closely related wild species, generated using a high-throughput sequencing protocol and bioinformatics pipeline. Trees inferred from independent genes display levels of topological incongruence that far exceed that seen in previous data sets analyzed with these species tree methods. We identify differences in phylogenetic results between inference methods that incorporate gene tree incongruence. Finally, we discuss the challenges of scaling these analyses for data sets with thousands of gene trees and extensive levels of missing data.

Subjects

Bayes Theorem, Classification/methods, Computational Biology/methods, Genes/genetics, Genetic Models, Oryza/genetics, Phylogeny, Base Sequence, Sequence Alignment, DNA Sequence Analysis, Software

The iPlant Collaborative: Cyberinfrastructure for Plant Biology.

Goff, Stephen A; Vaughn, Matthew; McKay, Sheldon; Lyons, Eric; Stapleton, Ann E; Gessler, Damian; Matasci, Naim; Wang, Liya; Hanlon, Matthew; Lenards, Andrew; Muir, Andy; Merchant, Nirav; Lowry, Sonya; Mock, Stephen; Helmke, Matthew; Kubach, Adam; Narro, Martha; Hopkins, Nicole; Micklos, David; Hilgert, Uwe; Gonzales, Michael; Jordan, Chris; Skidmore, Edwin; Dooley, Rion; Cazes, John; McLay, Robert; Lu, Zhenyuan; Pasternak, Shiran; Koesterke, Lars; Piel, William H; Grene, Ruth; Noutsos, Christos; Gendler, Karla; Feng, Xin; Tang, Chunlao; Lent, Monica; Kim, Seung-Jin; Kvilekval, Kristian; Manjunath, B S; Tannen, Val; Stamatakis, Alexandros; Sanderson, Michael; Welch, Stephen M; Cranston, Karen A; Soltis, Pamela; Soltis, Doug; O'Meara, Brian; Ane, Cecile; Brutnell, Tom; Kleibenstein, Daniel J.

Front Plant Sci ; 2: 34, 2011.

Article in English | MEDLINE | ID: mdl-22645531

ABSTRACT

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services.

The PhyLoTA Browser: processing GenBank for molecular phylogenetics research.

Sanderson, Michael J; Boss, Darren; Chen, Duhong; Cranston, Karen A; Wehe, Andre.

Syst Biol ; 57(3): 335-46, 2008 Jun.

Article in English | MEDLINE | ID: mdl-18570030

ABSTRACT

As an archive of sequence data for over 165,000 species, GenBank is an indispensable resource for phylogenetic inference. Here we describe an informatics processing pipeline and online database, the PhyLoTA Browser (http://loco.biosci.arizona.edu/pb), which offers a view of GenBank tailored for molecular phylogenetics. The first release of the Browser is computed from 2.6 million sequences representing the taxonomically enriched subset of GenBank sequences for eukaryotes (excluding most genome survey sequences, ESTs, and other high-throughput data). In addition to summarizing sequence diversity and species diversity across nodes in the NCBI taxonomy, it reports 87,000 potentially phylogenetically informative clusters of homologous sequences, which can be viewed or downloaded, along with provisional alignments and coarse phylogenetic trees. At each node in the NCBI hierarchy, the user can display a "data availability matrix" of all available sequences for entries in a subtaxa-by-clusters matrix. This matrix provides a guidepost for subsequent assembly of multigene data sets or supertrees. The database allows for comparison of results from previous GenBank releases, highlighting recent additions of either sequences or taxa to GenBank and letting investigators track progress on data availability worldwide. Although the reported alignments and trees are extremely approximate, the database reports several statistics correlated with alignment quality to help users choose from alternative data sources.
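The "data availability matrix" described above can be pictured with a small sketch. It assumes each sequence is reduced to a (subtaxon, cluster) pair; the function name, the input shape, and the example records are hypothetical and do not reflect the PhyLoTA Browser's actual schema or code.

```python
from collections import defaultdict


def availability_matrix(records):
    """Build a subtaxa-by-clusters matrix of sequence counts.

    records: iterable of (subtaxon, cluster_id) pairs, one pair per sequence.
    Returns (taxa, clusters, matrix), where matrix[i][j] is the number of
    sequences for taxa[i] in clusters[j].
    """
    counts = defaultdict(int)
    for taxon, cluster in records:
        counts[(taxon, cluster)] += 1
    taxa = sorted({taxon for taxon, _ in counts})
    clusters = sorted({cluster for _, cluster in counts})
    # .get avoids inserting empty cells into the defaultdict.
    matrix = [[counts.get((t, c), 0) for c in clusters] for t in taxa]
    return taxa, clusters, matrix


# Made-up records, for illustration only.
records = [
    ("taxon_a", "cluster_1"),
    ("taxon_a", "cluster_1"),
    ("taxon_a", "cluster_2"),
    ("taxon_b", "cluster_2"),
]
taxa, clusters, matrix = availability_matrix(records)
```

Rows with many nonzero cells point to subtaxa that are well sampled across clusters, which is the kind of guidance the abstract describes for assembling multigene data sets or supertrees.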

Subjects

Nucleic Acid Databases, Phylogeny, Software, Cluster Analysis, Computational Biology/methods, Internet

Summarizing a posterior distribution of trees using agreement subtrees.

Cranston, Karen A; Rannala, Bruce.

Syst Biol ; 56(4): 578-90, 2007 Aug.

Article in English | MEDLINE | ID: mdl-17654363

ABSTRACT

Bayesian inference of phylogeny is unique among phylogenetic reconstruction methods in that it produces a posterior distribution of trees rather than a point estimate of the best tree. The most common way to summarize this distribution is to report the majority-rule consensus tree annotated with the marginal posterior probabilities of each partition. Reporting a single tree discards information contained in the full underlying distribution and reduces the Bayesian analysis to simply another method for finding a point estimate of the tree. Even when a point estimate of the phylogeny is desired, the majority-rule consensus tree is only one possible method, and there may be others that are more appropriate for the given data set and application. We present a method for summarizing the distribution of trees that is based on identifying agreement subtrees that are frequently present in the posterior distribution. This method provides fully resolved binary trees for subsets of taxa with high marginal posterior probability on the entire tree and includes additional information about the spread of the distribution.
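The conventional summary that the abstract contrasts with, a majority-rule consensus annotated with marginal posterior probabilities, can be sketched as follows. This is not the agreement-subtree method of the paper; it only illustrates how clade frequencies are tallied over a sample of rooted trees. The nested-tuple tree encoding, the function names, and the four-tree sample are assumptions made for this example.

```python
from collections import Counter


def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of tip names.

    A tree is either a tip name (str) or a tuple of subtrees.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    all_tips = visit(tree)
    found.discard(all_tips)  # the clade containing every tip carries no information
    return found


def clade_support(trees):
    """Fraction of sampled trees containing each clade (marginal posterior estimate)."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    return {clade: k / len(trees) for clade, k in counts.items()}


def majority_rule(trees, threshold=0.5):
    """Clades whose support strictly exceeds the threshold."""
    return {c: p for c, p in clade_support(trees).items() if p > threshold}


# Made-up posterior sample of four rooted trees on tips A-D.
sample = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
support = clade_support(sample)
consensus = majority_rule(sample)
```

In this sample only the clade {A, B} passes the majority threshold, so everything else in the sampled distribution is dropped from the consensus, which is the loss of information the abstract points out.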

Subjects

Phylogeny, Algorithms, Bayes Theorem, Computer Simulation, Biological Models
