Results 1 - 20 of 114
1.
Nucleic Acids Res ; 52(D1): D164-D173, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37930866

ABSTRACT

Plasmids are mobile genetic elements found in many clades of Archaea and Bacteria. They drive horizontal gene transfer, impacting ecological and evolutionary processes within microbial communities, and hold substantial importance in human health and biotechnology. To support plasmid research and provide scientists with data on an unprecedented diversity of plasmid sequences, we introduce the IMG/PR database, a new resource encompassing 699 973 plasmid sequences derived from genomes, metagenomes and metatranscriptomes. IMG/PR is the first database to provide data on plasmids that were systematically identified from diverse microbiome samples. IMG/PR plasmids are associated with rich metadata that includes geographical and ecosystem information, host taxonomy, similarity to other plasmids, functional annotation, and the presence of genes involved in conjugation and antibiotic resistance. The database offers diverse methods for exploring its extensive plasmid collection, enabling users to navigate plasmids through metadata-centric queries, plasmid comparisons and BLAST searches. The web interface for IMG/PR is accessible at https://img.jgi.doe.gov/pr. Plasmid metadata and sequences can be downloaded from https://genome.jgi.doe.gov/portal/IMG_PR.


Subjects
Metagenome , Microbiota , Humans , Metadata , Software , Genetic Databases , Plasmids/genetics
3.
Nucleic Acids Res ; 51(D1): D957-D963, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36318257

ABSTRACT

The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute (DOE-JGI) continues to maintain its role as one of the flagship genomic metadata repositories of the world. The ever-increasing number of projects and metadata are freely available to the user community world-wide. GOLD's metadata is consumed by scientists and remains an important source for large-scale comparative genomics analysis initiatives. Encouraged by this active user engagement and growth, GOLD has continued to add new components and capabilities. The new features such as a public Application Programming Interface (API) and Ecosystem landing page as well as the growth of different entities in this current GOLD v.9 edition are described in detail in this manuscript.


Subjects
Genetic Databases , Genomics , Genome , Software
4.
Nucleic Acids Res ; 51(D1): D723-D732, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382399

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI GenBank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. Over the past couple of years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).


Subjects
Data Management , Genomics , Bacterial Genome , Software , Archaeal Genome , Genetic Databases , Metagenome
5.
Nucleic Acids Res ; 51(D1): D733-D743, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399502

ABSTRACT

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways in which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subjects
Genetic Databases , Viral Genome , Metadata , Metagenomics , Software
6.
Nucleic Acids Res ; 49(D1): D723-D733, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33152092

ABSTRACT

The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, including 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, is described in this manuscript.


Subjects
Genetic Databases , Genome , Ecosystem , Gene Ontology , Search Engine , DNA Sequence Analysis
7.
Nucleic Acids Res ; 49(D1): D764-D775, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137183

ABSTRACT

Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity. IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subjects
Genetic Databases , Ecosystem , Molecular Evolution , Viral Genome , Viruses/genetics , Base Sequence , Cluster Analysis , Geography , Molecular Sequence Annotation , Nucleic Acid Sequence Homology , User-Computer Interface
8.
Nucleic Acids Res ; 47(D1): D678-D686, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407573

ABSTRACT

The Integrated Microbial Genome/Virus (IMG/VR) system v.2.0 (https://img.jgi.doe.gov/vr/) is the largest publicly available data management and analysis platform dedicated to viral genomics. Since the last report, published in the 2016 NAR Database Issue, the data has tripled in size and currently contains genomes of 8389 cultivated reference viruses, 12 498 previously published curated prophages derived from cultivated microbial isolates, and 735 112 viral genomic fragments computationally predicted from assembled shotgun metagenomes. Nearly 60% of the viral genomes and genome fragments are clustered into 110 384 viral Operational Taxonomic Units (vOTUs) with two or more members. To improve data quality and predictions of host specificity, IMG/VR v.2.0 now separates prokaryotic and eukaryotic viruses, utilizes known prophage sequences to improve taxonomic assignments, and provides viral genome quality scores based on the estimated genome completeness. New features also include enhanced BLAST search capabilities for external queries. Finally, geographic map visualization to locate user-selected viral genomes or genome fragments has been implemented and download options have been extended. All of these features make IMG/VR v.2.0 a key resource for the study of viruses.


Subjects
Data Management/methods , Viral Genome , Genomics/methods , Software
9.
BMC Genomics ; 21(1): 214, 2020 Mar 06.
Article in English | MEDLINE | ID: mdl-32143559

ABSTRACT

BACKGROUND: Cupriavidus strain STM 6070 was isolated from nickel-rich soil collected near Koniambo massif, New Caledonia, using the invasive legume trap host Mimosa pudica. STM 6070 is a heavy metal-tolerant strain that is highly effective at fixing nitrogen with M. pudica. Here we have provided an updated taxonomy for STM 6070 and described salient features of the annotated genome, focusing on heavy metal resistance (HMR) loci and heavy metal efflux (HME) systems. RESULTS: The 6,771,773 bp high-quality-draft genome consists of 107 scaffolds containing 6118 protein-coding genes. ANI values show that STM 6070 is a new species of Cupriavidus. The STM 6070 symbiotic region was syntenic with that of the M. pudica-nodulating Cupriavidus taiwanensis LMG 19424T. In contrast to the nickel and zinc sensitivity of C. taiwanensis strains, STM 6070 grew at high Ni2+ and Zn2+ concentrations. The STM 6070 genome contains 55 genes, located in 12 clusters, that encode HMR structural proteins belonging to the RND, MFS, CHR, ARC3, CDF and P-ATPase protein superfamilies. These HMR molecular determinants are putatively involved in arsenic (ars), chromium (chr), cobalt-zinc-cadmium (czc), copper (cop, cup), nickel (nie and nre), and silver and/or copper (sil) resistance. Seven of these HMR clusters were common to symbiotic and non-symbiotic Cupriavidus species, while four clusters were specific to STM 6070, with three of these being associated with insertion sequences. Within the specific STM 6070 HMR clusters, three novel HME-RND systems (nieIC cep nieBA, czcC2B2A2, and hmxB zneAC zneR hmxS) were identified, which constitute new candidate genes for nickel and zinc resistance. CONCLUSIONS: STM 6070 belongs to a new Cupriavidus species, for which we have proposed the name Cupriavidus neocaledonicus sp. nov. STM 6070 harbours a pSym with a high degree of gene conservation to the pSyms of M. pudica-nodulating C. taiwanensis strains, probably as a result of recent horizontal transfer. The presence of specific HMR clusters, associated with transposase genes, suggests that the selection pressure of the New Caledonian ultramafic soils has driven the specific adaptation of STM 6070 to heavy-metal-rich soils via horizontal gene transfer.


Subjects
Cupriavidus/drug effects , Cupriavidus/genetics , Heavy Metals/toxicity , Mimosa/microbiology , Cadmium/metabolism , Multigene Family , Nickel/toxicity , Phylogeny , 16S Ribosomal RNA/genetics , Rhizobium/drug effects , Rhizobium/genetics , Soil , Soil Microbiology , Symbiosis , Synteny/genetics , Zinc/toxicity
10.
Curr Microbiol ; 76(5): 566-574, 2019 May.
Article in English | MEDLINE | ID: mdl-30820638

ABSTRACT

Burkholderia cenocepacia TAtl-371 was isolated from the rhizosphere of a tomato plant growing in Atlatlahucan, Morelos, Mexico. This strain exhibited a broad antimicrobial spectrum against bacteria, yeast, and fungi. Here, we report and describe the improved, high-quality permanent draft genome of B. cenocepacia TAtl-371, which was sequenced using a combination of PacBio RS and PacBio RS II sequencing methods. The 7,496,106 bp genome of the TAtl-371 strain is arranged in three scaffolds, contains 6722 protein-coding genes, and 99 RNA only-encoding genes. Genome analysis revealed genes related to biosynthesis of antimicrobials such as non-ribosomal peptides, siderophores, chitinases, and bacteriocins. Moreover, analysis of bacterial growth on different carbon and nitrogen sources shows that the strain retains its antimicrobial ability.


Subjects
Antibiosis , Burkholderia cenocepacia/genetics , Burkholderia cepacia complex , Carbon/metabolism , Bacterial Genome , Nitrogen/metabolism , Bacteriocins/genetics , Burkholderia cenocepacia/isolation & purification , Chitinases/genetics , Solanum lycopersicum/microbiology , Mexico , Rhizosphere , DNA Sequence Analysis , Siderophores/genetics , Soil Microbiology
11.
Nucleic Acids Res ; 45(D1): D446-D456, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794040

ABSTRACT

The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is a manually curated data management system that catalogs sequencing projects with associated metadata from around the world. In the current version of GOLD (v.6), all projects are organized based on a four-level classification system in the form of a Study, Organism (for isolates) or Biosample (for environmental samples), Sequencing Project and Analysis Project. Currently, GOLD provides information for 26 117 Studies, 239 100 Organisms, 15 887 Biosamples, 97 212 Sequencing Projects and 78 579 Analysis Projects. These are integrated with over 312 metadata fields, of which 58 are controlled vocabularies with 2067 terms. The web interface facilitates submission of a diverse range of Sequencing Projects (such as isolate genome, single-cell genome, metagenome, metatranscriptome) and complex Analysis Projects (such as genome from metagenome, or combined assembly from multiple Sequencing Projects). GOLD provides a seamless interface with the Integrated Microbial Genomes (IMG) system and supports and promotes the Genomic Standards Consortium (GSC) Minimum Information standards. This paper describes the data updates and additional features added during the last two years.
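
The four-level organization described in this abstract (Study, Organism or Biosample, Sequencing Project, Analysis Project) can be pictured as a small data model. The sketch below is illustrative only: every class name, field name and example value is an assumption made for this example, and none of it reflects GOLD's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Organism:
    """Level 2 for isolates (hypothetical fields)."""
    name: str


@dataclass
class Biosample:
    """Level 2 for environmental samples (hypothetical fields)."""
    name: str


@dataclass
class SequencingProject:
    """Level 3: one sequencing effort on an Organism or a Biosample."""
    name: str
    kind: str  # e.g. "isolate genome", "metagenome", "metatranscriptome"
    source: Union[Organism, Biosample]


@dataclass
class AnalysisProject:
    """Level 4: an analysis built on one or more Sequencing Projects."""
    name: str
    inputs: List[SequencingProject]

    def is_combined(self) -> bool:
        # A combined assembly draws on several Sequencing Projects.
        return len(self.inputs) > 1


@dataclass
class Study:
    """Level 1: the umbrella under which the other levels are registered."""
    name: str
    sequencing_projects: List[SequencingProject] = field(default_factory=list)
    analysis_projects: List[AnalysisProject] = field(default_factory=list)


# Usage with made-up names
soil = Biosample("grassland soil, site A")
sp1 = SequencingProject("soil metagenome, day 1", "metagenome", soil)
sp2 = SequencingProject("soil metagenome, day 7", "metagenome", soil)
combined = AnalysisProject("combined assembly", [sp1, sp2])
study = Study("Soil rewetting study", [sp1, sp2], [combined])
print(combined.is_combined())
```

Keeping the Organism/Biosample split as two separate types mirrors the abstract's distinction between isolates and environmental samples; a single type with a flag would work equally well for a sketch.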


Subjects
Computational Biology/methods , Nucleic Acid Databases , Genome , Genomics/methods , Data Mining , Metagenome , Metagenomics/methods , Software , User-Computer Interface
12.
Nucleic Acids Res ; 43(Database issue): D1099-106, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348402

ABSTRACT

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features, including the implementation of a four-level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19,200 studies, 56,000 Biosamples, 56,000 sequencing projects and 39,400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.


Subjects
Nucleic Acid Databases , Genomics , Metagenomics , Internet
13.
Mamm Genome ; 26(7-8): 295-304, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26084703

ABSTRACT

We report here a semi-automated process by which mouse genome feature predictions and curated annotations (i.e., genes, pseudogenes, functional RNAs, etc.) from Ensembl, NCBI and Vertebrate Genome Annotation database (Vega) are reconciled with the genome features in the Mouse Genome Informatics (MGI) database (http://www.informatics.jax.org) into a comprehensive and non-redundant catalog. Our gene unification method employs an algorithm (fjoin--feature join) for efficient detection of genome coordinate overlaps among features represented in two annotation data sets. Following the analysis with fjoin, genome features are binned into six possible categories (1:1, 1:0, 0:1, 1:n, n:1, n:m) based on coordinate overlaps. These categories are subsequently prioritized for assessment of annotation equivalencies and differences. The version of the unified catalog reported here contains more than 59,000 entries, including 22,599 protein-coding genes, 12,455 pseudogenes, and 24,007 other feature types (e.g., microRNAs, lincRNAs, etc.). More than 23,000 of the entries in the MGI gene catalog have equivalent gene models in the annotation files obtained from NCBI, Vega, and Ensembl. 12,719 of the features are unique to NCBI relative to Ensembl/Vega; 11,957 are unique to Ensembl/Vega relative to NCBI, and 3095 are unique to MGI. More than 4000 genome features fall into categories that require manual inspection to resolve structural differences in the gene models from different annotation sources. Using the MGI unified gene catalog, researchers can easily generate a comprehensive report of mouse genome features from a single source and compare the details of gene and transcript structure using MGI's mouse genome browser.
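
The binning step described in this abstract can be illustrated with a short sketch. It is not the fjoin algorithm itself: it uses a naive all-pairs overlap check where fjoin is described as an efficient method, it ignores strand, and the `Feature` type, the function names and the rule of grouping features into connected overlap components are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Feature(NamedTuple):
    fid: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def overlaps(a, b):
    """True when two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def category(n_a, n_b):
    """Name the bin for a group with n_a features from set A and n_b from set B."""
    if n_b == 0:
        return "1:0"
    if n_a == 0:
        return "0:1"
    if n_a == 1 and n_b == 1:
        return "1:1"
    if n_a == 1:
        return "1:n"
    if n_b == 1:
        return "n:1"
    return "n:m"


def bin_features(set_a, set_b):
    """Group features into connected overlap components and bin each group."""
    adj = defaultdict(set)
    for a in set_a:
        for b in set_b:  # naive O(|A| * |B|) comparison, for clarity only
            if overlaps(a, b):
                adj[("A", a.fid)].add(("B", b.fid))
                adj[("B", b.fid)].add(("A", a.fid))

    nodes = [("A", f.fid) for f in set_a] + [("B", f.fid) for f in set_b]
    seen = set()
    bins = defaultdict(list)
    for node in nodes:
        if node in seen:
            continue
        # Depth-first walk collects every feature reachable through overlaps.
        seen.add(node)
        stack, group = [node], []
        while stack:
            cur = stack.pop()
            group.append(cur)
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        n_a = sum(1 for side, _ in group if side == "A")
        bins[category(n_a, len(group) - n_a)].append(sorted(group))
    return dict(bins)
```

Calling `bin_features(set_a, set_b)` returns a dictionary keyed by category name, so the groups that need manual inspection (for instance `"n:m"`) can be pulled out directly.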


Subjects
Genetic Databases , Genome , Genomics/methods , Software , Algorithms , Animals , Genomics/statistics & numerical data , Internet , Mice , Genetic Models , Molecular Sequence Annotation , Open Reading Frames , Pseudogenes , RNA/genetics , Terminology as Topic
14.
Microbiol Resour Announc ; 13(6): e0032224, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38771040

ABSTRACT

When very dry soil is rewet, rapid stimulation of microbial activity has important implications for ecosystem biogeochemistry, yet associated changes in microbial transcription are poorly known. Here, we present metatranscriptomes of California annual grassland soil microbial communities, collected over 1 week from soils rewet after a summer drought, providing a time series of short-term transcriptional response during rewetting.

15.
Microbiol Resour Announc ; 13(2): e0108023, 2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38189307

ABSTRACT

We present eight metatranscriptomic datasets of light algal and cyanolichen biological soil crusts from the Mojave Desert in response to wetting. These data will help us understand gene expression patterns in desert biocrust microbial communities after they have been reactivated by the addition of water.

16.
Methods Mol Biol ; 2802: 587-609, 2024.
Article in English | MEDLINE | ID: mdl-38819573

ABSTRACT

Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.


Subjects
Genomics , Metagenome , Metagenomics , Metagenomics/methods , Metagenomics/standards , Genomics/methods , Genomics/standards , Metagenome/genetics , Genetic Databases , Soil Microbiology
17.
Sci Data ; 11(1): 432, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693191

ABSTRACT

The genus Clostridium is a large and diverse group within the Bacillota (formerly Firmicutes), whose members can encode useful complex traits such as solvent production, gas-fermentation, and lignocellulose breakdown. We describe 270 genome sequences of solventogenic clostridia from a comprehensive industrial strain collection assembled by Professor David Jones that includes 194 C. beijerinckii, 57 C. saccharobutylicum, 4 C. saccharoperbutylacetonicum, 5 C. butyricum, 7 C. acetobutylicum, and 3 C. tetanomorphum genomes. We report methods, analyses and characterization for phylogeny, key attributes, core biosynthetic genes, secondary metabolites, plasmids, prophage/CRISPR diversity, cellulosomes and quorum sensing for the 6 species. The expanded genomic data described here will facilitate engineering of solvent-producing clostridia as well as non-model microorganisms with innately desirable traits. Sequences could be applied in conventional platform biocatalysts such as yeast or Escherichia coli for enhanced chemical production. Recently, gene sequences from this collection were used to engineer Clostridium autoethanogenum, a gas-fermenting autotrophic acetogen, for continuous acetone or isopropanol production, as well as butanol, butanoic acid, hexanol and hexanoic acid production.


Subjects
Clostridium , Bacterial Genome , Phylogeny , Clostridium/genetics , Solvents , Fermentation
18.
Microbiol Resour Announc ; 13(3): e0098023, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38329355

ABSTRACT

We present six whole community shotgun metagenomic sequencing data sets of two types of biological soil crusts sampled at the ecotone of the Mojave Desert and Colorado Desert in California. These data will help us understand the diversity and function of biocrust microbial communities, which are essential for desert ecosystems.

19.
Database (Oxford) ; 2023, 2023 02 16.
Article in English | MEDLINE | ID: mdl-36794865

ABSTRACT

The power of next-generation sequencing has resulted in an explosive growth in the number of projects aiming to understand the metagenomic diversity of complex microbial environments. The interdisciplinary nature of this microbiome research community, along with the absence of reporting standards for microbiome data and samples, poses a significant challenge for follow-up studies. Commonly used names of metagenomes and metatranscriptomes in public databases currently lack the essential information necessary to accurately describe and classify the underlying samples, which makes a comparative analysis difficult to conduct and often results in misclassified sequences in data repositories. The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute has been at the forefront of addressing this challenge by developing a standardized nomenclature system for naming microbiome samples. GOLD, now in its twenty-fifth year, continues to enrich the research community with hundreds of thousands of metagenomes and metatranscriptomes with well-curated and easy-to-understand names. Through this manuscript, we describe the overall naming process that can be easily adopted by researchers worldwide. Additionally, we propose the use of this naming system as a best practice for the scientific community to facilitate better interoperability and reusability of microbiome data.


Subjects
Microbiota , Software , Microbiota/genetics , Metagenome/genetics , Metagenomics/methods , Data Management
20.
Front Microbiol ; 14: 1082107, 2023.
Article in English | MEDLINE | ID: mdl-36925474

ABSTRACT

Integrated virus genomes (prophages) are commonly found in sequenced bacterial genomes but have rarely been described in detail for rhizobial genomes. Cupriavidus taiwanensis STM 6018 is a rhizobial Betaproteobacteria strain that was isolated in 2006 from a root nodule of a Mimosa pudica host in French Guiana, South America. Here we describe features of the genome of STM 6018, focusing on the characterization of two different types of prophages that have been identified in its genome. The draft genome of STM 6018 is 6,553,639 bp and consists of 80 scaffolds containing 5,864 protein-coding genes and 61 RNA genes. STM 6018 contains all the nodulation and nitrogen fixation gene clusters common to symbiotic Cupriavidus species, sharing >99.97% bp identity with the nod/nif/noeM gene clusters from C. taiwanensis LMG19424T and "Cupriavidus neocaledonicus" STM 6070. The STM 6018 genome contains the genomes of two prophages: one complete Mu-like capsular phage and one filamentous phage, which integrates into a putative dif site. This is the first characterization of a filamentous phage found within the genome of a rhizobial strain. Further examination of sequenced rhizobial genomes identified filamentous prophage sequences in several Beta-rhizobial strains but not in any Alphaproteobacterial rhizobia.
