1.
Nucleic Acids Res ; 52(D1): D134-D137, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37889039

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 25 trillion base pairs from over 3.7 billion nucleotide sequences for 557 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include policies for including spatio-temporal metadata, clarified documentation for GenBank data processing, enhanced foreign contamination screening tools, new processes in the Submission Portal, migration of Entrez Genome and Assembly displays into NCBI Datasets, and the impending retirement of tbl2asn, replaced by table2asn.


Subject(s)
Databases, Nucleic Acid , Genomics , Base Sequence , Internet , Humans
2.
Nucleic Acids Res ; 51(D1): D141-D144, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350640

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 19.6 trillion base pairs from over 2.9 billion nucleotide sequences for 504 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include resources for data from the SARS-CoV-2 virus, NCBI Datasets, BLAST ClusteredNR, the Submission Portal, table2asn, a Foreign Contamination Screening tool and BioSample.


Subject(s)
Databases, Nucleic Acid , Humans , COVID-19/genetics , Genomics , SARS-CoV-2/genetics
3.
Nucleic Acids Res ; 50(D1): D161-D164, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850943

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 15.3 trillion base pairs from over 2.5 billion nucleotide sequences for 504 000 formally described species. Recent updates include resources for data from the SARS-CoV-2 virus, including a SARS-CoV-2 landing page, NCBI Datasets, NCBI Virus and the Submission Portal. We also discuss upcoming changes to GI identifiers, a new data management interface for BioProject, and advice for providing contextual metadata in submissions.


Subject(s)
Databases, Nucleic Acid , Viruses/genetics , Genome, Viral , National Library of Medicine (U.S.) , SARS-CoV-2/genetics , United States , User-Computer Interface
4.
Nucleic Acids Res ; 49(D1): D121-D124, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33166387

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been the core infrastructure for collecting and providing nucleotide sequence data and metadata for >30 years. Three partner organizations, the DNA Data Bank of Japan (DDBJ) at the National Institute of Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have been collaboratively maintaining the INSDC for the benefit of not only science but all types of community worldwide.


Subject(s)
Databases, Nucleic Acid , Metadata/statistics & numerical data , Nucleotides/genetics , Sequence Analysis, DNA/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data , Academies and Institutes , Base Sequence , Europe , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , International Cooperation , Japan , Nucleotides/metabolism , United States
5.
Nucleic Acids Res ; 49(D1): D92-D96, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196830

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 9.9 trillion base pairs from over 2.1 billion nucleotide sequences for 478 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. Recent updates include new resources for data from the SARS-CoV-2 virus, updates to the NCBI Submission Portal and associated submission wizards for dengue and SARS-CoV-2 viruses, new taxonomy queries for viruses and prokaryotes, and simplified submission processes for EST and GSS sequences.


Subject(s)
Computational Biology/statistics & numerical data , Databases, Nucleic Acid , Genomics/methods , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , Animals , COVID-19/epidemiology , COVID-19/virology , Computational Biology/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Annotation/methods , Pandemics
6.
Nucleic Acids Res ; 48(D1): D84-D86, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665464

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains over 6.25 trillion base pairs from over 1.6 billion nucleotide sequences for 450 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include a new version of Genome Workbench that supports GenBank submissions, new submission wizards for viral genomes, enhancements to BankIt and improved handling of taxonomy for sequences from pathogens.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Genomics/methods , Software , Molecular Sequence Annotation , National Institutes of Health (U.S.) , United States , Web Browser
7.
BMC Bioinformatics ; 22(1): 400, 2021 Aug 12.
Article in English | MEDLINE | ID: mdl-34384346

ABSTRACT

BACKGROUND: The DNA sequences encoding ribosomal RNA genes (rRNAs) are commonly used as markers to identify species, including in metagenomics samples that may combine many organismal communities. The 16S small subunit ribosomal RNA (SSU rRNA) gene is typically used to identify bacterial and archaeal species. The nuclear 18S SSU rRNA gene, and 28S large subunit (LSU) rRNA gene have been used as DNA barcodes and for phylogenetic studies in different eukaryote taxonomic groups. Because of their popularity, the National Center for Biotechnology Information (NCBI) receives a disproportionate number of rRNA sequence submissions and BLAST queries. These sequences vary in quality, length, origin (nuclear, mitochondria, plastid), and organism source and can represent any region of the ribosomal cistron.
RESULTS: To improve the timely verification of quality, origin and loci boundaries, we developed Ribovore, a software package for sequence analysis of rRNA sequences. The ribotyper and ribosensor programs are used to validate incoming sequences of bacterial and archaeal SSU rRNA. The ribodbmaker program is used to create high-quality datasets of rRNAs from different taxonomic groups. Key algorithmic steps include comparing candidate sequences against rRNA sequence profile hidden Markov models (HMMs) and covariance models of rRNA sequence and secondary-structure conservation, as well as other tests. Nine freely available blastn rRNA databases created and maintained with Ribovore are used for checking incoming GenBank submissions and used by the blastn browser interface at NCBI. Since 2018, Ribovore has been used to analyze more than 50 million prokaryotic SSU rRNA sequences submitted to GenBank, and to select at least 10,435 fungal rRNA RefSeq records from type material of 8350 taxa.
CONCLUSION: Ribovore combines single-sequence and profile-based methods to improve GenBank processing and analysis of rRNA sequences. It is a standalone, portable, and extensible software package for the alignment, classification and validation of rRNA sequences. Researchers planning on submitting SSU rRNA sequences to GenBank are encouraged to download and use Ribovore to analyze their sequences prior to submission to determine which sequences are likely to be automatically accepted into GenBank.


Subject(s)
Databases, Nucleic Acid , RNA, Ribosomal , DNA, Ribosomal , Phylogeny , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 18S/genetics , Sequence Analysis, RNA
8.
Nucleic Acids Res ; 47(D1): D94-D99, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30365038

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 420 000 formally described species. Most GenBank submissions are made using BankIt, the NCBI Submission Portal, or the tool tbl2asn, and are obtained from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include an expansion of sequence identifier formats to accommodate expected database growth, submission wizards for ribosomal RNA, and the transfer of Expressed Sequence Tag (EST) and Genome Survey Sequence (GSS) data into the Nucleotide database.


Subject(s)
Databases, Nucleic Acid , Web Browser , Computational Biology/methods , Databases, Nucleic Acid/trends , Genomics/methods , Humans , Information Storage and Retrieval , Software Design
9.
BMC Bioinformatics ; 21(1): 211, 2020 May 24.
Article in English | MEDLINE | ID: mdl-32448124

ABSTRACT

BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions.
RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally.
CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.


Subject(s)
Betacoronavirus , Coronavirus Infections , Databases, Nucleic Acid , Molecular Sequence Annotation , Pandemics , Pneumonia, Viral , Software , Betacoronavirus/genetics , COVID-19 , Coronavirus Infections/genetics , DNA Viruses , Genomics , Humans , Molecular Sequence Annotation/standards , Pneumonia, Viral/genetics , SARS-CoV-2 , Viruses
10.
Nucleic Acids Res ; 46(D1): D48-D51, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29190397

ABSTRACT

For more than 30 years, the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been committed to capturing, preserving and providing access to comprehensive public domain nucleotide sequence and associated metadata which enables discovery in biomedicine, biodiversity and biological sciences. Since 1987, the DNA Data Bank of Japan (DDBJ) at the National Institute for Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have worked collaboratively to enable access to nucleotide sequence data in standardized formats for the worldwide scientific community. In this article, we reiterate the principles of the INSDC collaboration and briefly summarize the trends of the archival content.


Subject(s)
Databases, Nucleic Acid , Animals , Classification , Computational Biology , Databases, Factual , Databases, Nucleic Acid/trends , Europe , High-Throughput Nucleotide Sequencing , Humans , International Cooperation , Japan , National Library of Medicine (U.S.) , United States
11.
Nucleic Acids Res ; 46(D1): D41-D47, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140468

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 400 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun and environmental sampling projects. Most submissions are made using BankIt, the National Center for Biotechnology Information (NCBI) Submission Portal, or the tool tbl2asn. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to sequence identifiers, submission wizards for 16S and Influenza sequences, and an Identical Protein Groups resource.


Subject(s)
Databases, Nucleic Acid , Animals , Computational Biology , Databases, Nucleic Acid/statistics & numerical data , Databases, Nucleic Acid/trends , Europe , Genomics , Humans , Information Dissemination , Information Storage and Retrieval , Internet , Japan , National Library of Medicine (U.S.) , Orthomyxoviridae/genetics , Proteomics , RNA, Ribosomal/genetics , Sequence Alignment , United States
12.
Bioinformatics ; 34(5): 755-759, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29069347

ABSTRACT

Motivation: Nucleic acid sequences in public databases should not contain vector contamination, but many sequences in GenBank do (or did) contain vectors. The National Center for Biotechnology Information uses the program VecScreen to screen submitted sequences for contamination. Additional tools are needed to distinguish true-positive (contamination) from false-positive (not contamination) VecScreen matches. Results: A principal reason for false-positive VecScreen matches is that the sequence and the matching vector subsequence originate from closely related or identical organisms (for example, both originate in Escherichia coli). We collected information on the taxonomy of sources of vector segments in the UniVec database used by VecScreen. We used that information in two overlapping software pipelines for retrospective analysis of contamination in GenBank and for prospective analysis of contamination in new sequence submissions. Using the retrospective pipeline, we identified and corrected over 8000 contaminated sequences in the nonredundant nucleotide database. The prospective analysis pipeline has been in production use since April 2017 to evaluate some new GenBank submissions. Availability and implementation: Data on the sources of UniVec entries were included in release 10.0 (ftp://ftp.ncbi.nih.gov/pub/UniVec/). The main software is freely available at https://github.com/aaschaffer/vecscreen_plus_taxonomy. Contact: aschaffe@helix.nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Nucleic Acid/standards , Sequence Analysis, DNA/methods , Software , Bacteria , Eukaryota
13.
Nucleic Acids Res ; 45(D1): D37-D42, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899564

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Animals , DNA Methylation , Genome, Bacterial , Genomics , Humans , RNA, Ribosomal, 16S/genetics , beta-Lactamases/genetics
14.
Nucleic Acids Res ; 44(D1): D48-50, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26657633

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences.


Subject(s)
Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Cooperative Behavior , Databases, Nucleic Acid/standards , Policy
15.
Nucleic Acids Res ; 44(D1): D67-72, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26590407

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 340 000 formally described species. Recent developments include a new starting page for submitters, a shift toward using accession.version identifiers rather than GI numbers, a wizard for submitting 16S rRNA sequences, and an Identical Protein Report to address growing issues of data redundancy. GenBank organizes the sequence data received from individual laboratories and large-scale sequencing projects into 18 divisions, and GenBank staff assign unique accession.version identifiers upon data receipt. Most submitters use the web-based BankIt or standalone Sequin programs. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the nuccore, nucest, and nucgss databases of the Entrez retrieval system, which integrates these records with a variety of other data including taxonomy nodes, genomes, protein structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Proteins/genetics , RNA, Ribosomal, 16S/genetics
16.
Nucleic Acids Res ; 43(Database issue): D30-5, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414350

ABSTRACT

GenBank® (http://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 300 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subject(s)
Databases, Nucleic Acid , Bacteria/classification , Genomics , Internet , Sequence Analysis, DNA , Sequence Analysis, Protein
17.
Nucleic Acids Res ; 42(Database issue): D32-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217914

ABSTRACT

GenBank is a comprehensive database that contains publicly available nucleotide sequences for over 280,000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Bacteria/classification , Bacteria/genetics , High-Throughput Nucleotide Sequencing , Internet , Molecular Sequence Annotation
18.
Nucleic Acids Res ; 41(Database issue): D21-4, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180798

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), one of the longest-standing global alliances of biological data archives, captures, preserves and provides comprehensive public domain nucleotide sequence information. Three partners of the INSDC work in cooperation to establish formats for data and metadata and protocols that facilitate reliable data submission to their databases and support continual data exchange around the world. In this article, we present the current status of the INSDC and its updates for 2012. Among the items discussed at the 2012 international collaboration meeting, we describe the BioSample database and changes in submission.


Subject(s)
Base Sequence , Databases, Nucleic Acid , Genomics , Internet , Molecular Sequence Annotation
19.
Nucleic Acids Res ; 41(Database issue): D36-42, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193287

ABSTRACT

GenBank® (http://www.ncbi.nlm.nih.gov) is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subject(s)
Base Sequence , Databases, Nucleic Acid , Genomics , High-Throughput Nucleotide Sequencing , Internet , Molecular Sequence Annotation , Sequence Analysis, DNA
20.
PLoS Biol ; 9(6): e1001088, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21713030

ABSTRACT

A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.


Subject(s)
Databases, Genetic , Genomics/standards , International Cooperation , Metagenome