Results 1 - 20 of 109
1.
Nature ; 622(7983): 594-602, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821698

ABSTRACT

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.


Subject(s)
Metagenome , Metagenomics , Microbiology , Proteins , Cluster Analysis , Metagenome/genetics , Metagenomics/methods , Proteins/chemistry , Proteins/classification , Proteins/genetics , Databases, Protein , Protein Conformation
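
The abstract of record 1 states two numeric cut-offs: sequences longer than 35 amino acids, and clusters with more than 100 members. The sketch below is only an illustration of applying those two thresholds to an already computed clustering; it is not the authors' pipeline. The function name and the two input dictionaries are hypothetical, and the similarity filtering and graph-based clustering steps are assumed to have been done elsewhere.

```python
from collections import Counter

MIN_LENGTH = 35          # abstract: sequences "longer than 35 amino acids"
MIN_CLUSTER_SIZE = 100   # abstract: clusters "with more than 100 members"


def large_clusters(proteins, assignments):
    """Return {cluster_id: member_count} for clusters passing both thresholds.

    proteins:    dict of protein id -> amino-acid sequence (hypothetical input)
    assignments: dict of protein id -> cluster id (hypothetical input)
    """
    # Keep only proteins strictly longer than the length cut-off.
    kept = {pid for pid, seq in proteins.items() if len(seq) > MIN_LENGTH}
    # Count the surviving members of each cluster.
    sizes = Counter(cid for pid, cid in assignments.items() if pid in kept)
    # Keep only clusters strictly larger than the size cut-off.
    return {cid: n for cid, n in sizes.items() if n > MIN_CLUSTER_SIZE}
```

Both comparisons are strict, matching the wording "longer than" and "more than" in the abstract.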
2.
Nucleic Acids Res ; 52(D1): D164-D173, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37930866

ABSTRACT

Plasmids are mobile genetic elements found in many clades of Archaea and Bacteria. They drive horizontal gene transfer, impacting ecological and evolutionary processes within microbial communities, and hold substantial importance in human health and biotechnology. To support plasmid research and provide scientists with data on an unprecedented diversity of plasmid sequences, we introduce the IMG/PR database, a new resource encompassing 699 973 plasmid sequences derived from genomes, metagenomes and metatranscriptomes. IMG/PR is the first database to provide data on plasmids that were systematically identified from diverse microbiome samples. IMG/PR plasmids are associated with rich metadata that includes geographical and ecosystem information, host taxonomy, similarity to other plasmids, functional annotation, and presence of genes involved in conjugation and antibiotic resistance. The database offers diverse methods for exploring its extensive plasmid collection, enabling users to navigate plasmids through metadata-centric queries, plasmid comparisons and BLAST searches. The web interface for IMG/PR is accessible at https://img.jgi.doe.gov/pr. Plasmid metadata and sequences can be downloaded from https://genome.jgi.doe.gov/portal/IMG_PR.


Subject(s)
Metagenome , Microbiota , Humans , Metadata , Software , Databases, Genetic , Plasmids/genetics
3.
Nucleic Acids Res ; 51(D1): D723-D732, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382399

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI Genbank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. In the past couple of years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).


Subject(s)
Data Management , Genomics , Genome, Bacterial , Software , Genome, Archaeal , Databases, Genetic , Metagenome
4.
Nucleic Acids Res ; 51(D1): D733-D743, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399502

ABSTRACT

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways in which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subject(s)
Databases, Genetic , Genome, Viral , Metadata , Metagenomics , Software
5.
Nucleic Acids Res ; 50(3): e17, 2022 02 22.
Article in English | MEDLINE | ID: mdl-34871418

ABSTRACT

Plasmids are mobile genetic elements that play a key role in microbial ecology and evolution by mediating horizontal transfer of important genes, such as antimicrobial resistance genes. Many microbial genomes have been sequenced by short read sequencers and have resulted in a mix of contigs that derive from plasmids or chromosomes. New tools that accurately identify plasmids are needed to elucidate new plasmid-borne genes of high biological importance. We have developed Deeplasmid, a deep learning tool for distinguishing plasmids from bacterial chromosomes based on the DNA sequence and its encoded biological data. It requires as input only assembled sequences generated by any sequencing platform and assembly algorithm and its runtime scales linearly with the number of assembled sequences. Deeplasmid achieves an AUC-ROC of over 89%, and it was more accurate than five other plasmid classification methods. Finally, as a proof of concept, we used Deeplasmid to predict new plasmids in the fish pathogen Yersinia ruckeri ATCC 29473 that has no annotated plasmids. Deeplasmid predicted with high reliability that a long assembled contig is part of a plasmid. Using long read sequencing we indeed validated the existence of a 102 kb long plasmid, demonstrating Deeplasmid's ability to detect novel plasmids.


Subject(s)
Deep Learning , Genome, Bacterial , Plasmids , Animals , Chromosomes, Bacterial/genetics , Plasmids/genetics , Reproducibility of Results , Sequence Analysis, DNA
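
Record 5 reports classifier accuracy as an AUC-ROC value. For readers unfamiliar with the metric, the sketch below shows one standard way to compute it: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. This is a generic illustration, not code from Deeplasmid; the function name and the toy labels and scores are assumptions made for the example.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a random negative.

    labels: iterable of 1 (e.g. plasmid) or 0 (e.g. chromosome)
    scores: iterable of classifier scores, higher meaning "more likely positive"
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; large evaluations would normally use a rank-based implementation from a statistics library.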
6.
Nucleic Acids Res ; 49(D1): D764-D775, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137183

ABSTRACT

Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity. IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.


Subject(s)
Databases, Genetic , Ecosystem , Evolution, Molecular , Genome, Viral , Viruses/genetics , Base Sequence , Cluster Analysis , Geography , Molecular Sequence Annotation , Sequence Homology, Nucleic Acid , User-Computer Interface
7.
Nucleic Acids Res ; 49(D1): D751-D763, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33119741

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) contains annotated isolate genome and metagenome datasets sequenced at the DOE's Joint Genome Institute (JGI), submitted by external users, or imported from public sources such as NCBI. IMG v 6.0 includes advanced search functions and a new tool for statistical analysis of mixed sets of genomes and metagenome bins. The new IMG web user interface also has a new Help page with additional documentation and webinar tutorials to help users better understand how to use various IMG functions and tools for their research. New datasets have been processed with the prokaryotic annotation pipeline v.5, which includes extended protein family assignments.


Subject(s)
Data Analysis , Data Management , Databases, Genetic , Genome, Archaeal , Genome, Microbial , Metagenome , RNA, Ribosomal, 16S/genetics , Search Engine
8.
Nature ; 536(7617): 425-30, 2016 08 25.
Article in English | MEDLINE | ID: mdl-27533034

ABSTRACT

Viruses are the most abundant biological entities on Earth, but challenges in detecting, isolating, and classifying unknown viruses have prevented exhaustive surveys of the global virome. Here we analysed over 5 Tb of metagenomic sequence data from 3,042 geographically diverse samples to assess the global distribution, phylogenetic diversity, and host specificity of viruses. We discovered over 125,000 partial DNA viral genomes, including the largest phage yet identified, and increased the number of known viral genes by 16-fold. Half of the predicted partial viral genomes were clustered into genetically distinct groups, most of which included genes unrelated to those in known viruses. Using CRISPR spacers and transfer RNA matches to link viral groups to microbial host(s), we doubled the number of microbial phyla known to be infected by viruses, and identified viruses that can infect organisms from different phyla. Analysis of viral distribution across diverse ecosystems revealed strong habitat-type specificity for the vast majority of viruses, but also identified some cosmopolitan groups. Our results highlight an extensive global viral diversity and provide detailed insight into viral habitat distribution and host-virus interactions.


Subject(s)
Earth, Planet , Ecosystem , Genome, Viral/genetics , Metagenomics , Viruses/genetics , Animals , Aquatic Organisms/virology , Bacteriophages/genetics , Biodiversity , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , DNA, Viral/analysis , DNA, Viral/genetics , Datasets as Topic , Genes, Viral , Host Specificity , Host-Pathogen Interactions , Humans , Metagenome/genetics , Phylogeny , Phylogeography , RNA, Transfer/genetics , Sequence Analysis , Viruses/classification , Viruses/isolation & purification
9.
Nucleic Acids Res ; 48(D1): D422-D430, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665416

ABSTRACT

Microbial secondary metabolism is a reservoir of bioactive compounds of immense biotechnological and biomedical potential. The biosynthetic machinery responsible for the production of these secondary metabolites (SMs) (also called natural products) is often encoded by collocated groups of genes called biosynthetic gene clusters (BGCs). High-throughput genome sequencing of both isolates and metagenomic samples combined with the development of specialized computational workflows is enabling systematic identification of BGCs and the discovery of novel SMs. In order to advance exploration of microbial secondary metabolism and its diversity, we developed the largest publicly available database of predicted BGCs combined with experimentally verified BGCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc-public). Here we describe the first major content update of the IMG-ABC knowledgebase, since its initial release in 2015, refreshing the BGC prediction pipeline with the latest version of antiSMASH (v5) as well as presenting the data in the context of underlying environmental metadata sourced from GOLD (https://gold.jgi.doe.gov/). This update has greatly improved the quality and expanded the types of predicted BGCs compared to the previous version.


Subject(s)
Biosynthetic Pathways/genetics , Databases, Genetic , Genome, Microbial , Multigene Family , Secondary Metabolism/genetics , Bacteriocins/biosynthesis , Bacteriocins/genetics , Knowledge Bases , Metadata , Metagenome , User-Computer Interface
10.
Nucleic Acids Res ; 47(D1): D666-D677, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30289528

ABSTRACT

The Integrated Microbial Genomes & Microbiomes system v.5.0 (IMG/M: https://img.jgi.doe.gov/m/) contains annotated datasets categorized into: archaea, bacteria, eukarya, plasmids, viruses, genome fragments, metagenomes, cell enrichments, single particle sorts, and metatranscriptomes. Source datasets include those generated by the DOE's Joint Genome Institute (JGI), submitted by external scientists, or collected from public sequence data archives such as NCBI. All submissions are typically processed through the IMG annotation pipeline and then loaded into the IMG data warehouse. IMG's web user interface provides a variety of analytical and visualization tools for comparative analysis of isolate genomes and metagenomes in IMG. IMG/M allows open access to all public genomes in the IMG data warehouse, while its expert review (ER) system (IMG/MER: https://img.jgi.doe.gov/mer/) allows registered users to access their private genomes and to store their private datasets in workspace for sharing and for further analysis. IMG/M data content has grown by 60% since the last report published in the 2017 NAR Database Issue. IMG/M v.5.0 has a new and more powerful genome search feature, new statistical tools, and supports metagenome binning.


Subject(s)
Data Management/methods , Databases, Genetic , Genomics/methods , Metagenome , Microbiota , Software , Molecular Sequence Annotation/methods , Sequence Alignment/methods
11.
Nucleic Acids Res ; 47(D1): D678-D686, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407573

ABSTRACT

The Integrated Microbial Genome/Virus (IMG/VR) system v.2.0 (https://img.jgi.doe.gov/vr/) is the largest publicly available data management and analysis platform dedicated to viral genomics. Since the last report published in the 2016 NAR Database Issue, the data has tripled in size and currently contains genomes of 8389 cultivated reference viruses, 12 498 previously published curated prophages derived from cultivated microbial isolates, and 735 112 viral genomic fragments computationally predicted from assembled shotgun metagenomes. Nearly 60% of the viral genomes and genome fragments are clustered into 110 384 viral Operational Taxonomic Units (vOTUs) with two or more members. To improve data quality and predictions of host specificity, IMG/VR v.2.0 now separates prokaryotic and eukaryotic viruses, utilizes known prophage sequences to improve taxonomic assignments, and provides viral genome quality scores based on the estimated genome completeness. New features also include enhanced BLAST search capabilities for external queries. Finally, geographic map visualization to locate user-selected viral genomes or genome fragments has been implemented and download options have been extended. All of these features make IMG/VR v.2.0 a key resource for the study of viruses.


Subject(s)
Data Management/methods , Genome, Viral , Genomics/methods , Software
12.
Nature ; 499(7459): 431-7, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23851394

ABSTRACT

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called 'microbial dark matter'. With this additional genomic information, we are able to resolve many intra- and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20% of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.


Subject(s)
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Metagenomics , Phylogeny , Archaea/isolation & purification , Archaea/metabolism , Bacteria/isolation & purification , Bacteria/metabolism , Ecosystem , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Metagenome/genetics , Molecular Sequence Data , Sequence Analysis, DNA , Single-Cell Analysis
13.
Nucleic Acids Res ; 45(5): 2776-2785, 2017 03 17.
Article in English | MEDLINE | ID: mdl-28076288

ABSTRACT

We report the identification of novel tRNA species with 12-base pair amino-acid acceptor branches composed of longer acceptor stem and shorter T-stem. While canonical tRNAs have a 7/5 configuration of the branch, the novel tRNAs have either 8/4 or 9/3 structure. They were found during the search for selenocysteine tRNAs in terabytes of genome, metagenome and metatranscriptome sequences. Certain bacteria and their phages employ the 8/4 structure for serine and histidine tRNAs, while minor cysteine and selenocysteine tRNA species may have a modified 8/4 structure with one bulge nucleotide. In Acidobacteria, tRNAs with 8/4 and 9/3 structures may function as missense and nonsense suppressor tRNAs and/or regulatory noncoding RNAs. In δ-proteobacteria, an additional cysteine tRNA with an 8/4 structure mimics selenocysteine tRNA and may function as opal suppressor. We examined the potential translation function of suppressor tRNA species in Escherichia coli; tRNAs with 8/4 or 9/3 structures efficiently inserted serine, alanine and cysteine in response to stop and sense codons, depending on the identity element and anticodon sequence of the tRNA. These findings expand our view of how tRNA, and possibly the genetic code, is diversified in nature.


Subject(s)
RNA, Bacterial/chemistry , RNA, Transfer/chemistry , Anticodon , Bacteria/genetics , Bacterial Toxins/genetics , Nucleic Acid Conformation , Protein Biosynthesis , RNA, Transfer, Amino Acid-Specific/chemistry , RNA, Transfer, Cys/chemistry , RNA, Transfer, Cys/metabolism
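
The "7/5", "8/4" and "9/3" notation in the abstract of record 13 gives the base-pair lengths of the acceptor stem and the T-stem, which together form the 12-base-pair acceptor branch. The small sketch below only makes that bookkeeping explicit; the function name and return labels are hypothetical, and it classifies stem lengths that are already known rather than predicting tRNA structure from sequence.

```python
ACCEPTOR_BRANCH_BP = 12  # abstract: 12-base-pair amino-acid acceptor branch


def classify_acceptor_branch(acceptor_stem_bp, t_stem_bp):
    """Label an acceptor-stem/T-stem configuration using the abstract's terms."""
    if acceptor_stem_bp + t_stem_bp != ACCEPTOR_BRANCH_BP:
        return "other"                      # not a 12-bp acceptor branch
    if (acceptor_stem_bp, t_stem_bp) == (7, 5):
        return "canonical"                  # the usual 7/5 configuration
    if (acceptor_stem_bp, t_stem_bp) in {(8, 4), (9, 3)}:
        return "novel"                      # longer acceptor stem, shorter T-stem
    return "other"                          # sums to 12 but not described in the abstract
```

For example, `classify_acceptor_branch(8, 4)` returns `"novel"`, while `classify_acceptor_branch(7, 5)` returns `"canonical"`.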
14.
Nucleic Acids Res ; 45(D1): D560-D565, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27903896

ABSTRACT

Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery.


Subject(s)
Bacteria/genetics , Bacteria/metabolism , Genome, Bacterial , Genomics/methods , Metabolomics/methods , Computational Biology/methods , Software , Web Browser
15.
Nucleic Acids Res ; 45(D1): D507-D516, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27738135

ABSTRACT

The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.


Subject(s)
Computational Biology/methods , Metagenome , Metagenomics/methods , Microbiota/genetics , Software , Web Browser
16.
Nucleic Acids Res ; 45(D1): D457-D465, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27799466

ABSTRACT

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.


Subject(s)
DNA Viruses/genetics , Databases, Genetic , Genome, Viral , Genomics/methods , Metagenomics/methods , Retroviridae/genetics , Software , Environmental Microbiology , Host-Pathogen Interactions , Metagenome , Sequence Analysis, DNA
17.
BMC Genomics ; 17: 307, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27118214

ABSTRACT

BACKGROUND: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. RESULTS: Here, we present a different approach, relying on a tightly integrated method rather than a "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use the existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems, thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. CONCLUSION: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.


Subject(s)
Computational Biology/methods , Genome, Microbial , Genomics/methods , Molecular Sequence Annotation/methods , Software , Cooperative Behavior , Data Accuracy , Information Dissemination , Internet , User-Computer Interface
18.
Environ Microbiol ; 18(4): 1122-36, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26487573

ABSTRACT

Hydrothermal vents represent a deep, hot, aphotic biosphere where chemosynthetic primary producers, fuelled by chemicals from Earth's subsurface, form the basis of life. In this study, we examined microbial mats from two distinct volcanic sites within the Hellenic Volcanic Arc (HVA). The HVA is geologically and ecologically unique, with reported emissions of CO2-saturated fluids at temperatures up to 220°C and a notable absence of macrofauna. Metagenomic data reveal highly complex prokaryotic communities composed of chemolithoautotrophs, some methanotrophs, and to our surprise, heterotrophs capable of anaerobic degradation of aromatic hydrocarbons. Our data suggest that aromatic hydrocarbons may indeed be a significant source of carbon in these sites, and instigate additional research into the nature and origin of these compounds in the HVA. Novel physiology was assigned to several uncultured prokaryotic lineages; most notably, a SAR406 representative is attributed with a role in anaerobic hydrocarbon degradation. This dataset, the largest to date from submarine volcanic ecosystems, constitutes a significant resource of novel genes and pathways with potential biotechnological applications.


Subject(s)
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Ecosystem , Hydrothermal Vents/microbiology , Archaea/isolation & purification , Bacteria/isolation & purification , Base Sequence , Geology , Metagenomics , RNA, Ribosomal, 16S/genetics , Temperature
19.
Nucleic Acids Res ; 42(Database issue): D560-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24165883

ABSTRACT

The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).


Subject(s)
Databases, Genetic , Genome, Microbial , Biosynthetic Pathways/genetics , Gene Expression Profiling , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Genomics , Internet , Molecular Sequence Annotation , Plasmids/genetics , Proteomics , Software , Systems Integration
20.
Nucleic Acids Res ; 42(Database issue): D568-73, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24136997

ABSTRACT

IMG/M (http://img.jgi.doe.gov/m) provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in the context of a comprehensive set of reference genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG/M's data content and analytical tools have expanded continuously since its first version was released in 2007. Since the last report published in the 2012 NAR Database Issue, IMG/M's database architecture, annotation and data integration pipelines and analysis tools have been extended to cope with the rapid growth in the number and size of metagenome data sets handled by the system. IMG/M data marts provide support for the analysis of publicly available genomes, expert review of metagenome annotations (IMG/M ER: http://img.jgi.doe.gov/mer) and Human Microbiome Project (HMP)-specific metagenome samples (IMG/M HMP: http://img.jgi.doe.gov/imgm_hmp).


Subject(s)
Databases, Genetic , Metagenome , Gene Expression Profiling , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Internet , Metagenomics/standards , Plasmids/genetics , Reference Standards , Sequence Analysis, Protein , Software , Systems Integration