1.
Nucleic Acids Res ; 45(D1): D507-D516, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27738135

ABSTRACT

The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.


Subject(s)
Computational Biology/methods , Metagenome , Metagenomics/methods , Microbiota/genetics , Software , Web Browser
2.
Nucleic Acids Res ; 45(D1): D457-D465, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27799466

ABSTRACT

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.


Subject(s)
DNA Viruses/genetics , Databases, Genetic , Genome, Viral , Genomics/methods , Metagenomics/methods , Retroviridae/genetics , Software , Environmental Microbiology , Host-Pathogen Interactions , Metagenome , Sequence Analysis, DNA
3.
BMC Genomics ; 17: 307, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27118214

ABSTRACT

BACKGROUND: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. RESULTS: Here, we present a different approach, relying on a tightly integrated method rather than a "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use the existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems, thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. CONCLUSION: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.


Subject(s)
Computational Biology/methods , Genome, Microbial , Genomics/methods , Molecular Sequence Annotation/methods , Software , Cooperative Behavior , Data Accuracy , Information Dissemination , Internet , User-Computer Interface
5.
Stand Genomic Sci ; 11: 17, 2016.
Article in English | MEDLINE | ID: mdl-26918089

ABSTRACT

The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.

6.
Stand Genomic Sci ; 11: 3, 2016.
Article in English | MEDLINE | ID: mdl-26767090

ABSTRACT

This report presents the permanent draft genome sequence of Desulfurococcus mobilis type strain DSM 2161, an obligate anaerobic hyperthermophilic crenarchaeon that was isolated from acidic hot springs in Hveravellir, Iceland. D. mobilis utilizes peptides as carbon and energy sources and reduces elemental sulfur to H2S. A metabolic construction derived from the draft genome identified putative pathways for peptide degradation and sulfur respiration in this archaeon. Existence of several hydrogenase genes in the genome supported previous findings that H2 is produced during the growth of D. mobilis in the absence of sulfur. Interestingly, genes encoding glucose transport and utilization systems also exist in the D. mobilis genome though this archaeon does not utilize carbohydrates for growth. The draft genome of D. mobilis provides an additional means for comparative genomic analysis of desulfurococci. In addition, our analysis of the Average Nucleotide Identity between D. mobilis and Desulfurococcus mucosus suggested that these two desulfurococci are two different strains of the same species.

7.
Stand Genomic Sci ; 10: 86, 2015.
Article in English | MEDLINE | ID: mdl-26512311

ABSTRACT

The DOE-JGI Microbial Genome Annotation Pipeline (MGAP) performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genomes comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.

8.
Trends Microbiol ; 23(11): 730-741, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26439299

ABSTRACT

Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information Genbank sequence data archive. Genomes and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure.


Subject(s)
Databases, Genetic , Genome, Microbial/genetics , Metagenome/genetics , Computational Biology , Database Management Systems , Genomics/methods , Models, Molecular , Software , Statistics as Topic
9.
Stand Genomic Sci ; 10: 48, 2015.
Article in English | MEDLINE | ID: mdl-26380636

ABSTRACT

Bacteroides barnesiae Lan et al. 2006 is a species of the genus Bacteroides, which belongs to the family Bacteroidaceae. Strain BL2(T) is of interest because it was isolated from the gut of a chicken and because of the growing awareness that the anaerobic microbiota of the caecum is of benefit for the host and may impact poultry farming. The 3,621,509 bp long genome with its 3,059 protein-coding and 97 RNA genes is a part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project.

10.
Stand Genomic Sci ; 10: 52, 2015.
Article in English | MEDLINE | ID: mdl-26380640

ABSTRACT

Members of the genus Halotalea (family Halomonadaceae) are of high significance since they can tolerate the greatest glucose and maltose concentrations ever reported for known bacteria and are involved in the degradation of industrial effluents. Here, the characteristics and the permanent-draft genome sequence and annotation of Halotalea alkalilenta AW-7(T) are described. The microorganism was sequenced as a part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project at the DOE Joint Genome Institute, and it is the only strain within the genus Halotalea having its genome sequenced. The genome is 4,467,826 bp long and consists of 40 scaffolds with 64.62% average GC content. A total of 4,104 genes were predicted, comprising 4,028 protein-coding and 76 RNA genes. Most protein-coding genes (87.79%) were assigned to a putative function. Halotalea alkalilenta AW-7(T) encodes the catechol and protocatechuate degradation to β-ketoadipate via the β-ketoadipate and protocatechuate ortho-cleavage degradation pathway, and it possesses the genetic ability to detoxify fluoroacetate, cyanate and acrylonitrile. An emended description of the genus Halotalea Ntougias et al. 2007 is also provided in order to describe the delayed fermentation ability of the type strain.

11.
Genome Announc ; 3(4)2015 Jul 23.
Article in English | MEDLINE | ID: mdl-26205857

ABSTRACT

Clostridium clariflavum strain 4-2a, a novel strain isolated from a thermophilic biocompost pile, has demonstrated an extensive capability to utilize both cellulose and hemicellulose under thermophilic anaerobic conditions. Here, we report the draft genome of this strain.

12.
Stand Genomic Sci ; 10: 21, 2015.
Article in English | MEDLINE | ID: mdl-26203333

ABSTRACT

Leucobacter chironomi strain MM2LB(T) (Halpern et al., Int J Syst Evol Microbiol 59:665-70 2009) is a Gram-positive, rod-shaped, non-motile, aerobic, chemoorganotrophic bacterium. L. chironomi belongs to the family Microbacteriaceae, a family within the class Actinobacteria. Strain MM2LB(T) was isolated from a chironomid (Diptera; Chironomidae) egg mass that was sampled from a waste stabilization pond in northern Israel. In a phylogenetic tree based on 16S rRNA gene sequences, strain MM2LB(T) formed a distinct branch within the radiation encompassing the genus Leucobacter. Here we describe the features of this organism, together with the complete genome sequence and annotation. The DNA GC content is 69.90%. The chromosome length is 2,964,712 bp. It encodes 2,690 proteins and 61 RNA genes. The L. chironomi genome is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project.

13.
mBio ; 6(4): e00932, 2015 Jul 14.
Article in English | MEDLINE | ID: mdl-26173699

ABSTRACT

UNLABELLED: In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMPORTANCE: IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.


Subject(s)
Biosynthetic Pathways/genetics , Computational Biology/methods , Knowledge Bases , Multigene Family , Secondary Metabolism/genetics
14.
Genome Announc ; 2(5)2014 Oct 16.
Article in English | MEDLINE | ID: mdl-25323723

ABSTRACT

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes.

15.
Nucleic Acids Res ; 42(Database issue): D560-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24165883

ABSTRACT

The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).


Subject(s)
Databases, Genetic , Genome, Microbial , Biosynthetic Pathways/genetics , Gene Expression Profiling , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Genomics , Internet , Molecular Sequence Annotation , Plasmids/genetics , Proteomics , Software , Systems Integration
16.
Nucleic Acids Res ; 42(Database issue): D568-73, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24136997

ABSTRACT

IMG/M (http://img.jgi.doe.gov/m) provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in the context of a comprehensive set of reference genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG/M's data content and analytical tools have expanded continuously since its first version was released in 2007. Since the last report published in the 2012 NAR Database Issue, IMG/M's database architecture, annotation and data integration pipelines and analysis tools have been extended to cope with the rapid growth in the number and size of metagenome data sets handled by the system. IMG/M data marts provide support for the analysis of publicly available genomes, expert review of metagenome annotations (IMG/M ER: http://img.jgi.doe.gov/mer) and Human Microbiome Project (HMP)-specific metagenome samples (IMG/M HMP: http://img.jgi.doe.gov/imgm_hmp).


Subject(s)
Databases, Genetic , Metagenome , Gene Expression Profiling , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Internet , Metagenomics/standards , Plasmids/genetics , Reference Standards , Sequence Analysis, Protein , Software , Systems Integration
17.
PLoS One ; 8(2): e54859, 2013.
Article in English | MEDLINE | ID: mdl-23424620

ABSTRACT

Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.


Subject(s)
Databases, Genetic , Microbiology , Molecular Sequence Annotation/methods , Phenotype
18.
PLoS One ; 7(7): e40151, 2012.
Article in English | MEDLINE | ID: mdl-22792232

ABSTRACT

The Integrated Microbial Genomes and Metagenomes (IMG/M) resource is a data management system that supports the analysis of sequence data from microbial communities in the integrated context of all publicly available draft and complete genomes from the three domains of life as well as a large number of plasmids and viruses. IMG/M currently contains thousands of genomes and metagenome samples with billions of genes. IMG/M-HMP is an IMG/M data mart serving the US National Institutes of Health (NIH) Human Microbiome Project (HMP), focussed on HMP generated metagenome datasets, and is one of the central resources provided from the HMP Data Analysis and Coordination Center (DACC). IMG/M-HMP is available at http://www.hmpdacc-resources.org/imgm_hmp/.


Subject(s)
Database Management Systems , Databases, Genetic , Internet , Metagenome/genetics , Archaea/genetics , Bacteria/genetics , Eukaryota/genetics , Humans , User-Computer Interface
19.
Nucleic Acids Res ; 40(Database issue): D123-9, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22086953

ABSTRACT

The integrated microbial genomes and metagenomes (IMG/M) system provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in a comprehensive integrated context. IMG/M integrates metagenome data sets with isolate microbial genomes from the IMG system. IMG/M's data content and analytical capabilities have been extended through regular updates since its first release in 2007. IMG/M is available at http://img.jgi.doe.gov/m. A companion IMG/M system provides support for annotation and expert review of unpublished metagenomic data sets (IMG/M ER: http://img.jgi.doe.gov/mer).


Subject(s)
Databases, Genetic , Metagenome , Metagenomics , Database Management Systems , Eukaryota/genetics , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Plasmids/genetics , Systems Integration
20.
Nucleic Acids Res ; 40(Database issue): D115-22, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22194640

ABSTRACT

The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG integrates publicly available draft and complete genomes from all three domains of life with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. IMG's data content and analytical capabilities have been continuously extended through regular updates since its first release in March 2005. IMG is available at http://img.jgi.doe.gov. Companion IMG systems provide support for expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er), teaching courses and training in microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu) and analysis of genomes related to the Human Microbiome Project (IMG/HMP: http://www.hmpdacc-resources.org/img_hmp).


Subject(s)
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Genome, Viral , Genomics , Eukaryota/genetics , Phenotype , Plasmids/genetics , Proteomics , Software , Systems Integration