1.
Microbiol Resour Announc ; 13(3): e0098023, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38329355

ABSTRACT

We present six whole community shotgun metagenomic sequencing data sets of two types of biological soil crusts sampled at the ecotone of the Mojave Desert and Colorado Desert in California. These data will help us understand the diversity and function of biocrust microbial communities, which are essential for desert ecosystems.

2.
Microbiol Resour Announc ; 13(2): e0108023, 2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38189307

ABSTRACT

We present eight metatranscriptomic datasets of light algal and cyanolichen biological soil crusts from the Mojave Desert in response to wetting. These data will help us understand gene expression patterns in desert biocrust microbial communities after they have been reactivated by the addition of water.

3.
Nucleic Acids Res ; 52(D1): D164-D173, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37930866

ABSTRACT

Plasmids are mobile genetic elements found in many clades of Archaea and Bacteria. They drive horizontal gene transfer, impacting ecological and evolutionary processes within microbial communities, and hold substantial importance in human health and biotechnology. To support plasmid research and provide scientists with data of an unprecedented diversity of plasmid sequences, we introduce the IMG/PR database, a new resource encompassing 699 973 plasmid sequences derived from genomes, metagenomes and metatranscriptomes. IMG/PR is the first database to provide data on plasmids that were systematically identified from diverse microbiome samples. IMG/PR plasmids are associated with rich metadata that includes geographical and ecosystem information, host taxonomy, similarity to other plasmids, functional annotation, and presence of genes involved in conjugation and antibiotic resistance. The database offers diverse methods for exploring its extensive plasmid collection, enabling users to navigate plasmids through metadata-centric queries, plasmid comparisons and BLAST searches. The web interface for IMG/PR is accessible at https://img.jgi.doe.gov/pr. Plasmid metadata and sequences can be downloaded from https://genome.jgi.doe.gov/portal/IMG_PR.


Subject(s)
Metagenome , Microbiota , Humans , Metadata , Software , Databases, Genetic , Plasmids/genetics
4.
Nature ; 622(7983): 594-602, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821698

ABSTRACT

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
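
The abstract above mentions graph-based clustering followed by a cluster-size cut-off. As a rough illustration only (the abstract does not describe the authors' actual algorithm), the sketch below treats proteins as nodes, treats pairwise similarity hits as edges, groups nodes into connected components with a union-find structure, and keeps components above a size threshold. The function name `cluster_proteins`, the edge-list input and the toy threshold are all assumptions made for this example.

```python
from collections import defaultdict


def cluster_proteins(edges, min_members):
    """Group protein IDs into connected components of a similarity graph.

    edges: iterable of (protein_a, protein_b) pairs judged similar
           (hypothetical input; how similarity is decided is out of scope).
    min_members: keep only clusters with MORE than this many members.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root.
        parent.setdefault(x, x)
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of every pair of similar proteins.
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect members under their final root.
    clusters = defaultdict(set)
    for protein in list(parent):
        clusters[find(protein)].add(protein)

    return [sorted(members) for members in clusters.values()
            if len(members) > min_members]


# Toy data: one 3-member component, one 2-member component, one singleton.
toy_edges = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p6", "p6")]
families = cluster_proteins(toy_edges, min_members=2)
```

With the toy threshold of 2, only the three-member component survives. The study's cut-off of more than 100 members plays the same role at a vastly larger scale.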


Subject(s)
Metagenome , Metagenomics , Microbiology , Proteins , Cluster Analysis , Metagenome/genetics , Metagenomics/methods , Proteins/chemistry , Proteins/classification , Proteins/genetics , Databases, Protein , Protein Conformation
5.
Microbiol Resour Announc ; 12(10): e0054823, 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37712678

ABSTRACT

Xenorhabdus species are bacterial symbionts of entomopathogenic Steinernema nematodes, in which they produce diverse secondary metabolites implicated in pathogenesis. To expand resources for natural product prospecting and exploration of host-symbiont-pathogen relationships, the genomes of Xenorhabdus cabanillasi, Xenorhabdus ehlersii, Xenorhabdus japonica, Xenorhabdus koppenhoeferii, and Xenorhabdus mauleonii were analyzed.

6.
Nucleic Acids Res ; 51(D1): D723-D732, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382399

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI Genbank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. In the past couple of years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).


Subject(s)
Data Management , Genomics , Genome, Bacterial , Software , Genome, Archaeal , Databases, Genetic , Metagenome
7.
Nucleic Acids Res ; 51(D1): D733-D743, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399502

ABSTRACT

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad of ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subject(s)
Databases, Genetic , Genome, Viral , Metadata , Metagenomics , Software
8.
Nucleic Acids Res ; 50(3): e17, 2022 02 22.
Article in English | MEDLINE | ID: mdl-34871418

ABSTRACT

Plasmids are mobile genetic elements that play a key role in microbial ecology and evolution by mediating horizontal transfer of important genes, such as antimicrobial resistance genes. Many microbial genomes have been sequenced by short read sequencers and have resulted in a mix of contigs that derive from plasmids or chromosomes. New tools that accurately identify plasmids are needed to elucidate new plasmid-borne genes of high biological importance. We have developed Deeplasmid, a deep learning tool for distinguishing plasmids from bacterial chromosomes based on the DNA sequence and its encoded biological data. It requires as input only assembled sequences generated by any sequencing platform and assembly algorithm and its runtime scales linearly with the number of assembled sequences. Deeplasmid achieves an AUC-ROC of over 89%, and it was more accurate than five other plasmid classification methods. Finally, as a proof of concept, we used Deeplasmid to predict new plasmids in the fish pathogen Yersinia ruckeri ATCC 29473 that has no annotated plasmids. Deeplasmid predicted with high reliability that a long assembled contig is part of a plasmid. Using long read sequencing we indeed validated the existence of a 102 kb long plasmid, demonstrating Deeplasmid's ability to detect novel plasmids.
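
The AUC-ROC figure quoted above can be read as the probability that a randomly chosen plasmid contig receives a higher score than a randomly chosen chromosomal contig. The sketch below computes that quantity directly from pairwise comparisons. It is a generic illustration, not Deeplasmid's evaluation code, and the labels, scores and the function name `auc_roc` are invented for this example.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    Tied pairs count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = plasmid contig, 0 = chromosomal contig.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
toy_auc = auc_roc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a demonstration. For large evaluation sets a rank-based formulation or a library routine would be the usual choice.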


Subject(s)
Deep Learning , Genome, Bacterial , Plasmids , Animals , Chromosomes, Bacterial/genetics , Plasmids/genetics , Reproducibility of Results , Sequence Analysis, DNA
9.
Cell Genom ; 2(12): 100213, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36778052

ABSTRACT

The phylum Actinobacteria includes important human pathogens like Mycobacterium tuberculosis and Corynebacterium diphtheriae and renowned producers of secondary metabolites of commercial interest, yet only a small part of its diversity is represented by sequenced genomes. Here, we present 824 actinobacterial isolate genomes in the context of a phylum-wide analysis of 6,700 genomes including public isolates and metagenome-assembled genomes (MAGs). We estimate that only 30%-50% of projected actinobacterial phylogenetic diversity possesses genomic representation via isolates and MAGs. A comparison of gene functions reveals novel determinants of host-microbe interaction as well as environment-specific adaptations such as potential antimicrobial peptides. We identify plasmids and prophages across isolates and uncover extensive prophage diversity structured mainly by host taxonomy. Analysis of >80,000 biosynthetic gene clusters reveals that horizontal gene transfer and gene loss shape secondary metabolite repertoire across taxa. Our observations illustrate the essential role of and need for high-quality isolate genome sequences.

10.
Nat Commun ; 12(1): 5483, 2021 09 16.
Article in English | MEDLINE | ID: mdl-34531387

ABSTRACT

Eukaryotic phytoplankton are responsible for at least 20% of annual global carbon fixation. Their diversity and activity are shaped by interactions with prokaryotes as part of complex microbiomes. Although differences in their local species diversity have been estimated, we still have a limited understanding of environmental conditions responsible for compositional differences between local species communities on a large scale from pole to pole. Here, we show, based on pole-to-pole phytoplankton metatranscriptomes and microbial rDNA sequencing, that environmental differences between polar and non-polar upper oceans most strongly impact the large-scale spatial pattern of biodiversity and gene activity in algal microbiomes. The geographic differentiation of co-occurring microbes in algal microbiomes can be well explained by the latitudinal temperature gradient and associated break points in their beta diversity, with an average breakpoint at 14 °C ± 4.3, separating cold and warm upper oceans. As global warming impacts upper ocean temperatures, we project that break points of beta diversity move markedly pole-wards. Hence, abrupt regime shifts in algal microbiomes could be caused by anthropogenic climate change.


Subject(s)
Genetic Variation , Microalgae/genetics , Microbiota/genetics , Phytoplankton/genetics , Transcriptome/genetics , Antarctic Regions , Arctic Regions , Biodiversity , Carbon Cycle , Climate Change , Gene Ontology , Geography , Global Warming , Microalgae/classification , Microalgae/growth & development , Oceans and Seas , Phytoplankton/classification , Phytoplankton/growth & development , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 18S/genetics , Sequence Analysis, DNA/methods , Species Specificity , Temperature
11.
Nat Microbiol ; 6(7): 960-970, 2021 07.
Article in English | MEDLINE | ID: mdl-34168315

ABSTRACT

Bacteriophages have important roles in the ecology of the human gut microbiome but are under-represented in reference databases. To address this problem, we assembled the Metagenomic Gut Virus catalogue that comprises 189,680 viral genomes from 11,810 publicly available human stool metagenomes. Over 75% of genomes represent double-stranded DNA phages that infect members of the Bacteroidia and Clostridia classes. Based on sequence clustering we identified 54,118 candidate viral species, 92% of which were not found in existing databases. The Metagenomic Gut Virus catalogue improves detection of viruses in stool metagenomes and accounts for nearly 40% of CRISPR spacers found in human gut Bacteria and Archaea. We also produced a catalogue of 459,375 viral protein clusters to explore the functional potential of the gut virome. This revealed tens of thousands of diversity-generating retroelements, which use error-prone reverse transcription to mutate target genes and may be involved in the molecular arms race between phages and their bacterial hosts.


Subject(s)
DNA Viruses/genetics , Gastrointestinal Microbiome/genetics , Genome, Viral/genetics , Archaea/virology , Bacteria/virology , Bacteriophages/genetics , Catalogs as Topic , DNA Viruses/classification , DNA, Viral/genetics , Feces/microbiology , Genetic Variation , Humans , Metagenomics , Phylogeny , Viral Proteins/genetics
12.
Microbiol Resour Announc ; 10(22): e0025821, 2021 Jun 03.
Article in English | MEDLINE | ID: mdl-34080906

ABSTRACT

Cyanobacteria are ubiquitous microorganisms with crucial ecosystem functions, yet most knowledge of their biology relates to aquatic taxa. We have constructed metagenomes for 50 taxonomically well-characterized terrestrial cyanobacterial cultures. These data will support phylogenomic studies of evolutionary relationships and gene content among these unique algae and their aquatic relatives.

13.
mSystems ; 6(3), 2021 May 18.
Article in English | MEDLINE | ID: mdl-34006627

ABSTRACT

The DOE Joint Genome Institute (JGI) Metagenome Workflow performs metagenome data processing, including assembly; structural, functional, and taxonomic annotation; and binning of metagenomic data sets that are subsequently included into the Integrated Microbial Genomes and Microbiomes (IMG/M) (I.-M. A. Chen, K. Chu, K. Palaniappan, A. Ratner, et al., Nucleic Acids Res, 49:D751-D763, 2021, https://doi.org/10.1093/nar/gkaa939) comparative analysis system and provided for download via the JGI data portal (https://genome.jgi.doe.gov/portal/). This workflow scales to run on thousands of metagenome samples per year, which can vary by the complexity of microbial communities and sequencing depth. Here, we describe the different tools, databases, and parameters used at different steps of the workflow to help with the interpretation of metagenome data available in IMG and to enable researchers to apply this workflow to their own data. We use 20 publicly available sediment metagenomes to illustrate the computing requirements for the different steps and highlight the typical results of data processing. The workflow modules for read filtering and metagenome assembly are available as a workflow description language (WDL) file (https://code.jgi.doe.gov/BFoster/jgi_meta_wdl). The workflow modules for annotation and binning are provided as a service to the user community at https://img.jgi.doe.gov/submit and require filling out the project and associated metadata descriptions in the Genomes OnLine Database (GOLD) (S. Mukherjee, D. Stamatis, J. Bertsch, G. Ovchinnikova, et al., Nucleic Acids Res, 49:D723-D733, 2021, https://doi.org/10.1093/nar/gkaa983).

IMPORTANCE

The DOE JGI Metagenome Workflow is designed for processing metagenomic data sets starting from Illumina fastq files. It performs data preprocessing, error correction, assembly, structural and functional annotation, and binning. The results of processing are provided in several standard formats, such as fasta and gff, and can be used for subsequent integration into the Integrated Microbial Genomes and Microbiomes (IMG/M) system where they can be compared to a comprehensive set of publicly available metagenomes. As of 30 July 2020, 7,155 JGI metagenomes have been processed by the DOE JGI Metagenome Workflow. Here, we present a metagenome workflow developed at the JGI that generates rich data in standard formats and has been optimized for downstream analyses ranging from assessment of the functional and taxonomic composition of microbial communities to genome-resolved metagenomics and the identification and characterization of novel taxa. This workflow is currently being used to analyze thousands of metagenomic data sets in a consistent and standardized manner.

16.
Nat Biotechnol ; 39(4): 499-509, 2021 04.
Article in English | MEDLINE | ID: mdl-33169036

ABSTRACT

The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.


Subject(s)
Archaea/genetics , Bacteria/genetics , Metabolomics/methods , Metagenome , Metagenomics/methods , Viruses/genetics , Air Microbiology , Animals , Archaea/classification , Archaea/isolation & purification , Bacteria/classification , Bacteria/isolation & purification , Catalogs as Topic , Ecosystem , Humans , Phylogeny , Soil Microbiology , Viruses/isolation & purification , Water Microbiology
17.
Nucleic Acids Res ; 49(D1): D764-D775, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137183

ABSTRACT

Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity. IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subject(s)
Databases, Genetic , Ecosystem , Evolution, Molecular , Genome, Viral , Viruses/genetics , Base Sequence , Cluster Analysis , Geography , Molecular Sequence Annotation , Sequence Homology, Nucleic Acid , User-Computer Interface
18.
Nucleic Acids Res ; 49(D1): D751-D763, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33119741

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) contains annotated isolate genome and metagenome datasets sequenced at the DOE's Joint Genome Institute (JGI), submitted by external users, or imported from public sources such as NCBI. IMG v 6.0 includes advanced search functions and a new tool for statistical analysis of mixed sets of genomes and metagenome bins. The new IMG web user interface also has a new Help page with additional documentation and webinar tutorials to help users better understand how to use various IMG functions and tools for their research. New datasets have been processed with the prokaryotic annotation pipeline v.5, which includes extended protein family assignments.


Subject(s)
Data Analysis , Data Management , Databases, Genetic , Genome, Archaeal , Genome, Microbial , Metagenome , RNA, Ribosomal, 16S/genetics , Search Engine
19.
Microbiol Resour Announc ; 9(41), 2020 Oct 08.
Article in English | MEDLINE | ID: mdl-33033130

ABSTRACT

Hydrologic changes modify microbial community structure and ecosystem functions, especially in wetland systems. Here, we present 24 metagenomes from a coastal freshwater wetland experiment in which we manipulated hydrologic conditions and plant presence. These wetland soil metagenomes will deepen our understanding of how hydrology and vegetation influence microbial functional diversity.

20.
Microbiol Resour Announc ; 9(44), 2020 Oct 29.
Article in English | MEDLINE | ID: mdl-33122409

ABSTRACT

The addition of glucose to soil has long been used to study the metabolic activity of microbes in soil; however, the response of the microbial ecophysiology remains poorly characterized. To address this, we sequenced the metagenomes and metatranscriptomes of glucose-amended soil microbial communities in a laboratory incubation.
