ABSTRACT
Microbial genome annotation is the process of identifying structural and functional elements in DNA sequences and subsequently attaching biological information to those elements. DRAM is a tool developed to annotate bacterial, archaeal, and viral genomes derived from pure cultures or metagenomes. DRAM goes beyond traditional annotation tools by distilling multiple gene annotations to genome-level summaries of functional potential. Despite these benefits, a downside of DRAM is its requirement for large computational resources, which limits its accessibility. Further, it did not integrate with downstream metabolic modeling tools that require genome annotation. To alleviate these constraints, DRAM and its viral counterpart, DRAM-v, are now available and integrated with the freely accessible KBase cyberinfrastructure. With kb_DRAM, users can generate DRAM annotations and functional summaries from microbial or viral genomes in a point-and-click interface, as well as generate genome-scale metabolic models from DRAM annotations. AVAILABILITY AND IMPLEMENTATION: For kb_DRAM users, the kb_DRAM apps on KBase can be found in the catalog at https://narrative.kbase.us/#catalog/modules/kb_DRAM, and a tutorial workflow with all documentation is available at https://narrative.kbase.us/narrative/129480. For kb_DRAM developers, software is available at https://github.com/shafferm/kb_DRAM.
Subjects
Bacteria , Software , Molecular Sequence Annotation , Bacteria/genetics , Archaea/genetics , Metabolomics
ABSTRACT
In recent years, large-scale oceanic sequencing efforts have provided a deeper understanding of marine microbial communities and their dynamics. These research endeavors require the acquisition of complex and varied datasets through large, interdisciplinary and collaborative efforts. However, no unifying framework currently exists for the marine science community to integrate sequencing data with physical, geological, and geochemical datasets. Planet Microbe is a web-based platform that enables data discovery from curated historical and ongoing oceanographic sequencing efforts. In Planet Microbe, each 'omics sample is linked with other biological and physicochemical measurements collected for the same water samples or during the same sample collection event, to provide a broader environmental context. This work highlights the need for curated aggregation efforts that can enable new insights into high-quality metagenomic datasets. Planet Microbe is freely accessible from https://www.planetmicrobe.org/.
Subjects
Aquatic Organisms/microbiology , Data Analysis , Environment , Metagenomics , Planets , Genetic Databases , Reference Standards , User-Computer Interface
ABSTRACT
For over 10 years, ModelSEED has been a primary resource for the construction of draft genome-scale metabolic models based on annotated microbial or plant genomes. Now being released, the biochemistry database serves as the foundation of biochemical data underlying ModelSEED and KBase. The biochemistry database embodies several properties that, taken together, distinguish it from other published biochemistry resources: (i) it includes compartmentalization, transport reactions, charged molecules and proton balancing on reactions; (ii) it is extensible by the user community, with all data stored in GitHub; and (iii) it is designed as a biochemical 'Rosetta Stone' to facilitate comparison and integration of annotations from many different tools and databases. The database was constructed by combining chemical data from many resources, applying standard transformations, identifying redundancies and computing thermodynamic properties. The ModelSEED biochemistry is continually tested using flux balance analysis to ensure the biochemical network is modeling-ready and capable of simulating diverse phenotypes. Ontologies can be designed to aid in comparing and reconciling metabolic reconstructions that differ in how they represent various metabolic pathways. ModelSEED now includes 33,978 compounds and 36,645 reactions, available as a set of extensible files on GitHub, and available to search at https://modelseed.org/biochem and KBase.
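To illustrate what "charged molecules and proton balancing on reactions" means in practice, the following is a minimal sketch of an element and charge balance check. It is not the ModelSEED implementation: the compound identifiers, the `COMPOUNDS` table layout and the `imbalance` function are hypothetical names introduced only for this example, and the reaction used (hydration of CO2 to bicarbonate) is a textbook case chosen for simplicity.

```python
from collections import Counter

# Hypothetical compound table (not ModelSEED identifiers):
# elemental formula and net charge for each compound.
COMPOUNDS = {
    "co2":  {"formula": {"C": 1, "O": 2}, "charge": 0},
    "h2o":  {"formula": {"H": 2, "O": 1}, "charge": 0},
    "hco3": {"formula": {"H": 1, "C": 1, "O": 3}, "charge": -1},
    "h":    {"formula": {"H": 1}, "charge": 1},
}


def imbalance(stoichiometry):
    """Return (element imbalance, charge imbalance) for one reaction.

    `stoichiometry` maps a compound id to its coefficient, negative for
    consumed compounds and positive for produced ones. A balanced reaction
    yields an empty element dict and a charge of 0.
    """
    elements = Counter()
    charge = 0
    for compound_id, coefficient in stoichiometry.items():
        compound = COMPOUNDS[compound_id]
        for element, count in compound["formula"].items():
            elements[element] += coefficient * count
        charge += coefficient * compound["charge"]
    # Report only the elements that do not cancel out.
    return {e: n for e, n in elements.items() if n != 0}, charge


# CO2 + H2O -> HCO3-   (a proton is missing on the product side)
without_proton = {"co2": -1, "h2o": -1, "hco3": 1}
# CO2 + H2O -> HCO3- + H+   (proton balanced)
with_proton = {"co2": -1, "h2o": -1, "hco3": 1, "h": 1}

print(imbalance(without_proton))
print(imbalance(with_proton))
```

In this sketch the first reaction is reported as short of one hydrogen and one positive charge, and adding the proton resolves both at once, which is the kind of correction a proton-balancing step applies.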
Subjects
Bacteria/metabolism , Factual Databases , Fungi/metabolism , Metabolic Networks and Pathways , Molecular Sequence Annotation , Plants/metabolism , Bacteria/genetics , Bacterial Genome , Thermodynamics
ABSTRACT
Viruses are fundamental components of marine microbial communities that significantly influence oceanic productivity, biogeochemistry, and ecosystem processes. Despite their importance, the temporal activities and dynamics of viral assemblages in natural settings remain largely unexplored. Here we report the transcriptional activities and variability of dominant dsDNA viruses in the open ocean's euphotic zone over daily and seasonal timescales. While dsDNA viruses exhibited some fluctuation in abundance in both cellular and viral size fractions, the viral assemblage was remarkably stable, with the most abundant viral types persisting over many days. More extended time series indicated that long-term persistence (>1 y) was the rule for most dsDNA viruses observed, suggesting that both core viral genomes as well as viral community structure were conserved over interannual periods. Viral gene transcription in host cell assemblages revealed diel cycling among many different viral types. Most notably, an afternoon peak in cyanophage transcriptional activity coincided with a peak in Prochlorococcus DNA replication, indicating coordinated diurnal coupling of virus and host reproduction. In aggregate, our analyses suggested a tightly synchronized diel coupling of viral and cellular replication cycles in both photoautotrophic and heterotrophic bacterial hosts. A surprising consequence of these findings is that diel cycles in the ocean's photic zone appear to be universal organizing principles that shape ecosystem dynamics, ecological interactions, and biogeochemical cycling of both cellular and acellular community components.
Subjects
Bacteriophages/genetics , Bacteriophages/physiology , Prochlorococcus/physiology , Prochlorococcus/virology , Circadian Rhythm , Bacterial DNA/genetics , Viral Gene Expression Regulation , Oceans and Seas , Bacterial RNA/genetics , Virus Replication , Water Microbiology
ABSTRACT
Recent metagenomic analyses have revealed a high diversity of viruses in the pelagic ocean and uncovered clear habitat-specific viral distribution patterns. Conversely, similar insights into the composition, host specificity and function of viruses associated with marine organisms have been limited by challenges associated with sampling and computational analysis. Here, we performed targeted viromic analysis of six coral reef invertebrate species and their surrounding seawater to deliver taxonomic and functional profiles of viruses associated with reef organisms. Sponges and corals host species-specific viral assemblages with low sequence identity to known viral genomes. While core viral genes involved in capsid formation, tail structure and infection mechanisms were observed across all reef samples, auxiliary genes including those involved in herbicide resistance and viral pathogenesis pathways such as host immune suppression were differentially enriched in reef hosts. Utilising a novel OTU-based assessment, we also show a prevalence of dsDNA viruses belonging to the Mimiviridae, Caudovirales and Phycodnaviridae in reef environments and further highlight the abundance of ssDNA viruses belonging to the Circoviridae, Parvoviridae, Bidnaviridae and Microviridae in reef invertebrates. These insights into coral reef viruses provide an important framework for future research into how viruses contribute to the health and evolution of reef organisms.
Subjects
Anthozoa/virology , Coral Reefs , Viruses/classification , Viruses/genetics , Animals , Viral DNA/genetics , Ecosystem , Viral Genome , Host Specificity , Metagenomics , Phylogeny , Seawater/virology , Viruses/isolation & purification
ABSTRACT
Reef-building corals form close associations with organisms from all three domains of life and therefore have many potential viral hosts. Yet the viral communities associated with corals remain barely explored. This complexity presents a number of challenges for the metagenomic assessment of coral viral communities and requires specialized methods for purification and amplification of viral nucleic acids, as well as virome annotation. In this minireview, we conduct a meta-analysis of the limited number of existing coral virome studies, as well as available coral transcriptome and metagenome data, to identify trends and potential complications inherent in different methods. The analysis shows that the method used for viral nucleic acid isolation drastically affects the observed viral assemblage and interpretation of the results. Further, the small number of viral reference genomes available, coupled with short sequence read lengths, might cause errors in virus identification. Despite these limitations and potential biases, the data show that viral communities associated with corals are diverse, with double- and single-stranded DNA and RNA viruses. The identified viruses are dominated by double-stranded DNA-tailed bacteriophages, but there are also viruses that infect eukaryote hosts, likely the endosymbiotic dinoflagellates, Symbiodinium spp., host coral and other eukaryotes in close association.
Subjects
Anthozoa/virology , Coral Reefs , DNA Viruses/genetics , Viral Genome/genetics , Microbial Consortia/genetics , RNA Viruses/genetics , Animals , DNA/genetics , DNA Viruses/isolation & purification , Single-Stranded DNA/genetics , Dinoflagellida/virology , Eukaryotic Cells/virology , Metagenomics , RNA Viruses/isolation & purification , Symbiosis/genetics , Transcriptome
ABSTRACT
DNA/RNA-stable isotope probing (SIP) is a powerful tool to link in situ microbial activity to sequencing data. Every SIP dataset captures distinct information about microbial community metabolism, process rates, and population dynamics, offering valuable insights for a wide range of research questions. Data reuse maximizes the information derived from the labor- and resource-intensive SIP approaches. Yet, a review of publicly available SIP sequencing metadata showed that critical information necessary for reproducibility and reuse was often missing. Here, we outline the Minimum Information for any Stable Isotope Probing Sequence (MISIP) according to the Minimum Information for any (x) Sequence (MIxS) framework and include examples of MISIP reporting for common SIP experiments. Our objectives are to expand the capacity of MIxS to accommodate SIP-specific metadata and guide SIP users in metadata collection when planning and reporting an experiment. The MISIP standard requires 5 metadata fields (isotope, isotopolog, isotopolog label, labeling approach, and gradient position) and recommends several fields that represent best practices in acquiring and reporting SIP sequencing data (e.g., gradient density and nucleic acid amount). The standard is intended to be used in concert with other MIxS checklists to comprehensively describe the origin of sequence data, such as for marker genes (MISIP-MIMARKS) or metagenomes (MISIP-MIMS), in combination with metadata required by an environmental extension (e.g., soil). The adoption of the proposed data standard will improve the reuse of any sequence derived from a SIP experiment and, by extension, deepen understanding of in situ biogeochemical processes and microbial ecology.
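A check for the five required MISIP fields could be sketched as follows. This is only an illustration under stated assumptions: the snake_case keys are hypothetical renderings of the field names listed in the abstract, not the official MIxS slot names, and the example values are invented placeholders rather than controlled-vocabulary terms.

```python
# Hypothetical keys derived from the five required fields named in the
# abstract; the official MIxS/MISIP slot names may differ.
REQUIRED_FIELDS = (
    "isotope",
    "isotopolog",
    "isotopolog_label",
    "labeling_approach",
    "gradient_position",
)


def missing_misip_fields(record):
    """Return the required fields that are absent or empty in `record`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Illustrative sample records (values are placeholders, not real data).
complete = {
    "isotope": "13C",
    "isotopolog": "glucose",
    "isotopolog_label": "labeled",
    "labeling_approach": "in situ",
    "gradient_position": 3,
}
incomplete = {
    "isotope": "13C",
    "isotopolog": "glucose",
    "isotopolog_label": "",
    "labeling_approach": "in situ",
}

print(missing_misip_fields(complete))
print(missing_misip_fields(incomplete))
```

Running such a check at submission time, rather than at reuse time, is one way the reporting gap described above could be caught early.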
Subjects
Isotope Labeling , Isotope Labeling/methods , Reproducibility of Results , Microbiota/genetics , Metadata , Metagenomics/methods , DNA Sequence Analysis/methods , Metagenome
ABSTRACT
Systems biology research spans a range of biological scales and science domains, and often requires a collaborative effort to collect and share data so that integration is possible. However, sharing data effectively is a challenging task that requires effort and alignment between collaborative partners, as well as coordination between organizations, repositories, and journals. As a community of systems biology researchers, we must get better at efficiently sharing data, and ensuring that shared data comes with the recognition and citations it deserves.
ABSTRACT
Uncultivated Bacteria and Archaea account for the vast majority of species on Earth, but obtaining their genomes directly from the environment, using shotgun sequencing, has only become possible recently. To realize the hope of capturing Earth's microbial genetic complement and to facilitate the investigation of the functional roles of specific lineages in a given ecosystem, technologies that accelerate the recovery of high-quality genomes are necessary. We present a series of analysis steps and data products for the extraction of high-quality metagenome-assembled genomes (MAGs) from microbiomes using the U.S. Department of Energy Systems Biology Knowledgebase (KBase) platform (http://www.kbase.us/). Overall, these steps take about a day to obtain extracted genomes when starting from smaller environmental shotgun read libraries, or up to about a week from larger libraries. In KBase, the process is end-to-end, allowing a user to go from the initial sequencing reads all the way through to MAGs, which can then be analyzed with other KBase capabilities such as phylogenetic placement, functional assignment, metabolic modeling, pangenome functional profiling, RNA-Seq and others. While portions of such capabilities are available individually from other resources, the combination of the intuitive usability, data interoperability and integration of tools in a freely available computational resource makes KBase a powerful platform for obtaining MAGs from microbiomes. While this workflow offers tools for each of the key steps in the genome extraction process, it also provides a scaffold that can be easily extended with additional MAG recovery and analysis tools, via the KBase software development kit (SDK).
Subjects
Metagenome , Microbiota , Phylogeny , Bacterial Genome , Microbiota/genetics , Bacteria/genetics , Metagenomics
ABSTRACT
Predicting elemental cycles and maintaining water quality under increasing anthropogenic influence requires understanding the spatial drivers of river microbiomes. However, understanding of the unifying microbial processes governing river biogeochemistry is hindered by a lack of genome-resolved functional insights and sampling across multiple rivers. Here we employed a community science effort to accelerate the sampling, sequencing, and genome-resolved analyses of river microbiomes to create the Genome Resolved Open Watersheds database (GROWdb). This resource profiled the identity, distribution, function, and expression of thousands of microbial genomes across rivers covering 90% of United States watersheds. Specifically, GROWdb encompasses 1,469 microbial species from 27 phyla, including novel lineages from 10 families and 128 genera, and defines the core river microbiome for the first time at genome level. GROWdb analyses coupled to extensive geospatial information revealed local and regional drivers of microbial community structuring, while also presenting a myriad of foundational hypotheses about ecosystem function. Building upon the previously conceived River Continuum Concept [1], we layer on microbial functional trait expression, which suggests the structure and function of river microbiomes are predictable. We make GROWdb available through various collaborative cyberinfrastructures [2, 3] so that it can be widely accessed across disciplines for watershed predictive modeling and microbiome-based management practices.
ABSTRACT
Marine microbial ecology requires the systematic comparison of biogeochemical and sequence data to analyze environmental influences on the distribution and variability of microbial communities. With ever-increasing quantities of metagenomic data, there is a growing need to make datasets Findable, Accessible, Interoperable, and Reusable (FAIR) across diverse ecosystems. FAIR data is essential to developing analytical frameworks that integrate microbiological, genomic, ecological, oceanographic, and computational methods. Although community standards defining the minimal metadata required to accompany sequence data exist, they have not been consistently used across projects, precluding interoperability. Moreover, these data are not machine-actionable or discoverable by cyberinfrastructure systems. By making 'omic and physicochemical datasets FAIR to machine systems, we can enable sequence data discovery and reuse based on machine-readable descriptions of environments or physicochemical gradients. In this work, we developed a novel technical specification for dataset encapsulation for the FAIR reuse of marine metagenomic and physicochemical datasets within cyberinfrastructure systems. This includes using Frictionless Data Packages enriched with terminology from environmental and life-science ontologies to annotate measured variables, their units, and the measurement devices used. This approach was implemented in Planet Microbe, a cyberinfrastructure platform and marine metagenomic web-portal. Here, we discuss the data properties built into the specification to make global ocean datasets FAIR within the Planet Microbe portal. We additionally discuss the selection of, and contributions to, marine-science ontologies used within the specification. Finally, we use the system to discover data with which to answer various biological questions about environments, physicochemical gradients, and microbial communities in meta-analyses.
This work represents a future direction in marine metagenomic research by proposing a specification for FAIR dataset encapsulation that, if adopted within cyberinfrastructure systems, would automate the discovery, exchange, and re-use of data needed to answer broader reaching questions than originally intended.
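The idea of a Frictionless Data Package whose fields carry ontology annotations can be sketched as below. This is a hedged illustration, not the Planet Microbe specification: `rdfType` is a field property defined by the Frictionless Table Schema, whereas `unitRdfType`, the file name, the field names and every `example.org` URI are hypothetical placeholders introduced for this example.

```python
import json

# Minimal data package descriptor. The ontology URIs are placeholders and
# do not point at real ontology terms; "unitRdfType" is a hypothetical
# custom property used here to carry the unit annotation.
descriptor = {
    "name": "example-water-samples",
    "resources": [
        {
            "name": "samples",
            "path": "samples.csv",
            "schema": {
                "fields": [
                    {
                        "name": "depth",
                        "type": "number",
                        "rdfType": "http://example.org/ontology/DEPTH",
                        "unitRdfType": "http://example.org/ontology/METER",
                    },
                    {
                        "name": "temperature",
                        "type": "number",
                        "rdfType": "http://example.org/ontology/TEMPERATURE",
                        "unitRdfType": "http://example.org/ontology/DEGREE_CELSIUS",
                    },
                    # A field with no ontology annotation.
                    {"name": "cast_id", "type": "string"},
                ]
            },
        }
    ],
}


def annotated_fields(package):
    """Map each annotated field name to its ontology term URI."""
    terms = {}
    for resource in package.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            if "rdfType" in field:
                terms[field["name"]] = field["rdfType"]
    return terms


# The descriptor is plain JSON, so it can be written out as datapackage.json.
print(json.dumps(annotated_fields(descriptor), indent=2))
```

Because the annotations are term identifiers rather than free-text column names, a portal can search for every dataset that measured a given variable regardless of how each project labeled its columns.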
ABSTRACT
Microbes drive myriad ecosystem processes, but under strong influence from viruses. Because studying viruses in complex systems requires different tools than those for microbes, they remain underexplored. To combat this, we previously aggregated double-stranded DNA (dsDNA) virus analysis capabilities and resources into 'iVirus' on the CyVerse collaborative cyberinfrastructure. Here we substantially expand iVirus's functionality and accessibility, to iVirus 2.0, as follows. First, core iVirus apps were integrated into the Department of Energy's Systems Biology KnowledgeBase (KBase) to provide an additional analytical platform. Second, at CyVerse, 20 software tools (apps) were upgraded or added as new tools and capabilities. Third, nearly 20-fold more sequence reads were aggregated to capture new data and environments. Finally, documentation, as "live" protocols, was updated to maximize user interaction with and contribution to infrastructure development. Together, iVirus 2.0 serves as a uniquely central and accessible analytical platform for studying how viruses, particularly dsDNA viruses, impact diverse microbial ecosystems.
ABSTRACT
The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.
Subjects
Archaea/genetics , Bacteria/genetics , Metabolomics/methods , Metagenome , Metagenomics/methods , Viruses/genetics , Air Microbiology , Animals , Archaea/classification , Archaea/isolation & purification , Bacteria/classification , Bacteria/isolation & purification , Catalogs as Topic , Ecosystem , Humans , Phylogeny , Soil Microbiology , Viruses/isolation & purification , Water Microbiology
ABSTRACT
Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.
ABSTRACT
Marine sponges can form stable partnerships with a wide diversity of microbes and viruses, and this high intraspecies symbiont specificity makes them ideal models for exploring how host-associated viromes respond to changing environmental conditions. Here we exposed the abundant Great Barrier Reef sponge Rhopaloeides odorabile to elevated seawater temperature for 48 h and utilised a metaviromic approach to assess the response of the associated viral community. An increase in endogenous retro-transcribing viruses within the Caulimoviridae and Retroviridae families was detected within the first 12 h of exposure to 32 °C, and a 30-fold increase in retro-transcribing viruses was evident after 48 h at 32 °C. Thermally stressed sponges also exhibited a complete loss of ssDNA viruses, which were prevalent in field samples and sponges from the control temperature treatment. Despite these viromic changes, functional analysis failed to detect any loss or gain of auxiliary metabolic genes, indicating that viral communities are not providing a direct competitive advantage to their host under thermal stress. In contrast, endogenous sponge retro-transcribing viruses appear to be replicating under thermal stress and, consistent with retroviral infections in other organisms, may be contributing to the previously described rapid decline in host health evident at elevated temperature.
Subjects
Heat-Shock Response , Porifera/virology , Symbiosis , Viruses/classification , Animals , Gene Expression , Phylogeny , Seawater/virology
ABSTRACT
BACKGROUND: Scientists have amassed a wealth of microbiome datasets, making it possible to study microbes in biotic and abiotic systems on a population or planetary scale; however, this potential has not been fully realized given that the tools, datasets, and computation are available in diverse repositories and locations. To address this challenge, we developed iMicrobe.us, a community-driven microbiome data marketplace and tool exchange for users to integrate their own data and tools with those from the broader community. FINDINGS: The iMicrobe platform brings together analysis tools and microbiome datasets by leveraging National Science Foundation-supported cyberinfrastructure and computing resources from CyVerse, Agave, and XSEDE. The primary purpose of iMicrobe is to provide users with a freely available, web-based platform to (1) maintain and share project data, metadata, and analysis products, (2) search for related public datasets, and (3) use and publish bioinformatics tools that run on highly scalable computing resources. Analysis tools are implemented in containers that encapsulate complex software dependencies and run on freely available XSEDE resources via the Agave API, which can retrieve datasets from the CyVerse Data Store or any web-accessible location (e.g., FTP, HTTP). CONCLUSIONS: iMicrobe promotes data integration, sharing, and community-driven tool development by making open source data and tools accessible to the research community in a web-based platform.