Results 1 - 20 of 58
1.
Nucleic Acids Res ; 52(D1): D134-D137, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37889039

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 25 trillion base pairs from over 3.7 billion nucleotide sequences for 557 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include policies for including spatio-temporal metadata, clarified documentation for GenBank data processing, enhanced foreign contamination screening tools, new processes in the Submission Portal, migration of Entrez Genome and Assembly displays into NCBI Datasets, and the impending retirement of tbl2asn, replaced by table2asn.


Subject(s)
Databases, Nucleic Acid; Genomics; Base Sequence; Internet; Humans
2.
Nucleic Acids Res ; 51(D1): D141-D144, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36350640

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 19.6 trillion base pairs from over 2.9 billion nucleotide sequences for 504 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include resources for data from the SARS-CoV-2 virus, NCBI Datasets, BLAST ClusteredNR, the Submission Portal, table2asn, a Foreign Contamination Screening tool and BioSample.


Subject(s)
Databases, Nucleic Acid; Humans; COVID-19/genetics; Genomics; SARS-CoV-2/genetics
3.
Nucleic Acids Res ; 50(D1): D161-D164, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34850943

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 15.3 trillion base pairs from over 2.5 billion nucleotide sequences for 504 000 formally described species. Recent updates include resources for data from the SARS-CoV-2 virus, including a SARS-CoV-2 landing page, NCBI Datasets, NCBI Virus and the Submission Portal. We also discuss upcoming changes to GI identifiers, a new data management interface for BioProject, and advice for providing contextual metadata in submissions.


Subject(s)
Databases, Nucleic Acid; Viruses/genetics; Genome, Viral; National Library of Medicine (U.S.); SARS-CoV-2/genetics; United States; User-Computer Interface
4.
Nucleic Acids Res ; 49(D1): D121-D124, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33166387

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been the core infrastructure for collecting and providing nucleotide sequence data and metadata for >30 years. Three partner organizations, the DNA Data Bank of Japan (DDBJ) at the National Institute of Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have been collaboratively maintaining the INSDC for the benefit of not only science but all types of community worldwide.


Subject(s)
Databases, Nucleic Acid; Metadata/statistics & numerical data; Nucleotides/genetics; Sequence Analysis, DNA/statistics & numerical data; Sequence Analysis, RNA/statistics & numerical data; Academies and Institutes; Base Sequence; Europe; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; International Cooperation; Japan; Nucleotides/metabolism; United States
5.
Nucleic Acids Res ; 49(D1): D92-D96, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33196830

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 9.9 trillion base pairs from over 2.1 billion nucleotide sequences for 478 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. Recent updates include new resources for data from the SARS-CoV-2 virus, updates to the NCBI Submission Portal and associated submission wizards for dengue and SARS-CoV-2 viruses, new taxonomy queries for viruses and prokaryotes, and simplified submission processes for EST and GSS sequences.


Subject(s)
Computational Biology/statistics & numerical data; Databases, Nucleic Acid; Genomics/methods; SARS-CoV-2/genetics; Sequence Analysis, DNA/methods; Animals; COVID-19/epidemiology; COVID-19/virology; Computational Biology/methods; Humans; Information Storage and Retrieval/methods; Internet; Molecular Sequence Annotation/methods; Pandemics
6.
Nucleic Acids Res ; 48(D1): D84-D86, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31665464

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains over 6.25 trillion base pairs from over 1.6 billion nucleotide sequences for 450 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include a new version of Genome Workbench that supports GenBank submissions, new submission wizards for viral genomes, enhancements to BankIt and improved handling of taxonomy for sequences from pathogens.


Subject(s)
Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; Software; Molecular Sequence Annotation; National Institutes of Health (U.S.); United States; Web Browser
7.
BMC Bioinformatics ; 22(1): 400, 2021 Aug 12.
Article in English | MEDLINE | ID: mdl-34384346

ABSTRACT

BACKGROUND: The DNA sequences encoding ribosomal RNA genes (rRNAs) are commonly used as markers to identify species, including in metagenomics samples that may combine many organismal communities. The 16S small subunit ribosomal RNA (SSU rRNA) gene is typically used to identify bacterial and archaeal species. The nuclear 18S SSU rRNA gene, and 28S large subunit (LSU) rRNA gene have been used as DNA barcodes and for phylogenetic studies in different eukaryote taxonomic groups. Because of their popularity, the National Center for Biotechnology Information (NCBI) receives a disproportionate number of rRNA sequence submissions and BLAST queries. These sequences vary in quality, length, origin (nuclear, mitochondria, plastid), and organism source and can represent any region of the ribosomal cistron. RESULTS: To improve the timely verification of quality, origin and loci boundaries, we developed Ribovore, a software package for sequence analysis of rRNA sequences. The ribotyper and ribosensor programs are used to validate incoming sequences of bacterial and archaeal SSU rRNA. The ribodbmaker program is used to create high-quality datasets of rRNAs from different taxonomic groups. Key algorithmic steps include comparing candidate sequences against rRNA sequence profile hidden Markov models (HMMs) and covariance models of rRNA sequence and secondary-structure conservation, as well as other tests. Nine freely available blastn rRNA databases created and maintained with Ribovore are used for checking incoming GenBank submissions and used by the blastn browser interface at NCBI. Since 2018, Ribovore has been used to analyze more than 50 million prokaryotic SSU rRNA sequences submitted to GenBank, and to select at least 10,435 fungal rRNA RefSeq records from type material of 8350 taxa. CONCLUSION: Ribovore combines single-sequence and profile-based methods to improve GenBank processing and analysis of rRNA sequences. It is a standalone, portable, and extensible software package for the alignment, classification and validation of rRNA sequences. Researchers planning on submitting SSU rRNA sequences to GenBank are encouraged to download and use Ribovore to analyze their sequences prior to submission to determine which sequences are likely to be automatically accepted into GenBank.


Subject(s)
Databases, Nucleic Acid; RNA, Ribosomal; DNA, Ribosomal; Phylogeny; RNA, Ribosomal, 16S/genetics; RNA, Ribosomal, 18S/genetics; Sequence Analysis, RNA
8.
Nucleic Acids Res ; 47(D1): D94-D99, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30365038

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 420 000 formally described species. Most GenBank submissions are made using BankIt, the NCBI Submission Portal, or the tool tbl2asn, and are obtained from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include an expansion of sequence identifier formats to accommodate expected database growth, submission wizards for ribosomal RNA, and the transfer of Expressed Sequence Tag (EST) and Genome Survey Sequence (GSS) data into the Nucleotide database.


Subject(s)
Databases, Nucleic Acid; Web Browser; Computational Biology/methods; Databases, Nucleic Acid/trends; Genomics/methods; Humans; Information Storage and Retrieval; Software Design
9.
BMC Bioinformatics ; 21(1): 211, 2020 May 24.
Article in English | MEDLINE | ID: mdl-32448124

ABSTRACT

BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions. RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally. CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.


Subject(s)
Betacoronavirus; Coronavirus Infections; Databases, Nucleic Acid; Molecular Sequence Annotation; Pandemics; Pneumonia, Viral; Software; Betacoronavirus/genetics; COVID-19; Coronavirus Infections/genetics; DNA Viruses; Genomics; Humans; Molecular Sequence Annotation/standards; Pneumonia, Viral/genetics; SARS-CoV-2; Viruses
10.
Nucleic Acids Res ; 46(D1): D48-D51, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29190397

ABSTRACT

For more than 30 years, the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been committed to capturing, preserving and providing access to comprehensive public domain nucleotide sequence and associated metadata which enables discovery in biomedicine, biodiversity and biological sciences. Since 1987, the DNA Data Bank of Japan (DDBJ) at the National Institute for Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have worked collaboratively to enable access to nucleotide sequence data in standardized formats for the worldwide scientific community. In this article, we reiterate the principles of the INSDC collaboration and briefly summarize the trends of the archival content.


Subject(s)
Databases, Nucleic Acid; Animals; Classification; Computational Biology; Databases, Factual; Databases, Nucleic Acid/trends; Europe; High-Throughput Nucleotide Sequencing; Humans; International Cooperation; Japan; National Library of Medicine (U.S.); United States
11.
Nucleic Acids Res ; 46(D1): D41-D47, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29140468

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 400 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun and environmental sampling projects. Most submissions are made using BankIt, the National Center for Biotechnology Information (NCBI) Submission Portal, or the tool tbl2asn. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to sequence identifiers, submission wizards for 16S and Influenza sequences, and an Identical Protein Groups resource.


Subject(s)
Databases, Nucleic Acid; Animals; Computational Biology; Databases, Nucleic Acid/statistics & numerical data; Databases, Nucleic Acid/trends; Europe; Genomics; Humans; Information Dissemination; Information Storage and Retrieval; Internet; Japan; National Library of Medicine (U.S.); Orthomyxoviridae/genetics; Proteomics; RNA, Ribosomal/genetics; Sequence Alignment; United States
12.
Bioinformatics ; 34(5): 755-759, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29069347

ABSTRACT

Motivation: Nucleic acid sequences in public databases should not contain vector contamination, but many sequences in GenBank do (or did) contain vectors. The National Center for Biotechnology Information uses the program VecScreen to screen submitted sequences for contamination. Additional tools are needed to distinguish true-positive (contamination) from false-positive (not contamination) VecScreen matches. Results: A principal reason for false-positive VecScreen matches is that the sequence and the matching vector subsequence originate from closely related or identical organisms (for example, both originate in Escherichia coli). We collected information on the taxonomy of sources of vector segments in the UniVec database used by VecScreen. We used that information in two overlapping software pipelines for retrospective analysis of contamination in GenBank and for prospective analysis of contamination in new sequence submissions. Using the retrospective pipeline, we identified and corrected over 8000 contaminated sequences in the nonredundant nucleotide database. The prospective analysis pipeline has been in production use since April 2017 to evaluate some new GenBank submissions. Availability and implementation: Data on the sources of UniVec entries were included in release 10.0 (ftp://ftp.ncbi.nih.gov/pub/UniVec/). The main software is freely available at https://github.com/aaschaffer/vecscreen_plus_taxonomy. Contact: aschaffe@helix.nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Nucleic Acid/standards; Sequence Analysis, DNA/methods; Software; Bacteria; Eukaryota
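
The false-positive reasoning described in the abstract above can be sketched as a toy filter: a vector match is treated as a probable false positive when the submitted sequence and the matched vector segment come from the same or a closely related organism. Everything in this sketch is hypothetical (the `VectorMatch` record, the helper names, and the genus-level comparison used as a stand-in for "closely related"); it is not the logic of the published pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorMatch:
    """Hypothetical record for one vector-screen hit."""
    sequence_organism: str        # organism of the submitted sequence
    vector_segment_organism: str  # organism the matched vector segment derives from


def genus(organism: str) -> str:
    """Return the first word of an organism name, lowercased (a crude genus proxy)."""
    parts = organism.split()
    return parts[0].lower() if parts else ""


def is_probable_false_positive(match: VectorMatch) -> bool:
    """Assumption: sharing a genus counts as 'closely related or identical'."""
    g_seq = genus(match.sequence_organism)
    g_vec = genus(match.vector_segment_organism)
    return bool(g_seq) and g_seq == g_vec
```

A real pipeline would need a proper taxonomy lookup rather than string comparison; the genus shortcut only keeps the example self-contained.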
13.
Nucleic Acids Res ; 45(D1): D37-D42, 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-27899564

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.


Subject(s)
Databases, Nucleic Acid; Sequence Analysis, DNA; Animals; DNA Methylation; Genome, Bacterial; Genomics; Humans; RNA, Ribosomal, 16S/genetics; beta-Lactamases/genetics
14.
Nucleic Acids Res ; 44(D1): D48-50, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26657633

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences.


Subject(s)
Databases, Nucleic Acid; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Cooperative Behavior; Databases, Nucleic Acid/standards; Policy
15.
Nucleic Acids Res ; 44(D1): D67-72, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26590407

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 340 000 formally described species. Recent developments include a new starting page for submitters, a shift toward using accession.version identifiers rather than GI numbers, a wizard for submitting 16S rRNA sequences, and an Identical Protein Report to address growing issues of data redundancy. GenBank organizes the sequence data received from individual laboratories and large-scale sequencing projects into 18 divisions, and GenBank staff assign unique accession.version identifiers upon data receipt. Most submitters use the web-based BankIt or standalone Sequin programs. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the nuccore, nucest, and nucgss databases of the Entrez retrieval system, which integrates these records with a variety of other data including taxonomy nodes, genomes, protein structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subject(s)
Databases, Nucleic Acid; Sequence Analysis, DNA; Proteins/genetics; RNA, Ribosomal, 16S/genetics
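
The accession.version identifiers mentioned in the abstract above pair an accession with a version number separated by a dot. A minimal parsing sketch follows; the function name `split_accession_version` and the regular expression are assumptions made for illustration, not NCBI code, and the pattern is deliberately loose rather than an official grammar.

```python
import re

# Loose, illustrative pattern: letters, an optional underscore, optional further
# letters, then digits, followed by "." and an integer version.
_ACC_VER = re.compile(r"^([A-Z]+_?[A-Z]*\d+)\.(\d+)$")


def split_accession_version(identifier: str) -> tuple[str, int]:
    """Split an identifier such as 'U49845.1' into (accession, version).

    Raises ValueError when the string does not look like accession.version,
    for example a bare numeric GI-style identifier.
    """
    match = _ACC_VER.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an accession.version identifier: {identifier!r}")
    return match.group(1), int(match.group(2))
```

For example, `split_accession_version("NC_000913.3")` yields the accession and the integer version separately, which makes it straightforward to compare versions of the same record.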
16.
Nucleic Acids Res ; 43(Database issue): D30-5, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414350

ABSTRACT

GenBank® (http://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 300 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subject(s)
Databases, Nucleic Acid; Bacteria/classification; Genomics; Internet; Sequence Analysis, DNA; Sequence Analysis, Protein
17.
Nucleic Acids Res ; 42(Database issue): D32-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217914

ABSTRACT

GenBank is a comprehensive database that contains publicly available nucleotide sequences for over 280,000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subject(s)
Databases, Nucleic Acid; Sequence Analysis, DNA; Bacteria/classification; Bacteria/genetics; High-Throughput Nucleotide Sequencing; Internet; Molecular Sequence Annotation
18.
Nucleic Acids Res ; 41(Database issue): D21-4, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180798

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), one of the longest-standing global alliances of biological data archives, captures, preserves and provides comprehensive public domain nucleotide sequence information. Three partners of the INSDC work in cooperation to establish formats for data and metadata and protocols that facilitate reliable data submission to their databases and support continual data exchange around the world. In this article, the current status of the INSDC and its updates for 2012 are presented. Among the items discussed at the international collaboration meeting in 2012, the BioSample database and changes in submission are described.


Subject(s)
Base Sequence; Databases, Nucleic Acid; Genomics; Internet; Molecular Sequence Annotation
19.
Nucleic Acids Res ; 41(Database issue): D36-42, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193287

ABSTRACT

GenBank® (http://www.ncbi.nlm.nih.gov) is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subject(s)
Base Sequence; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing; Internet; Molecular Sequence Annotation; Sequence Analysis, DNA
20.
PLoS Biol ; 9(6): e1001088, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21713030

ABSTRACT

A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.


Subject(s)
Databases, Genetic; Genomics/standards; International Cooperation; Metagenome