1.
Nucleic Acids Res ; 52(D1): D1-D9, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38035367

ABSTRACT

The 2024 Nucleic Acids Research database issue contains 180 papers from across biology and neighbouring disciplines. There are 90 papers reporting on new databases and 83 updates from resources previously published in the Issue. Updates from databases most recently published elsewhere account for a further seven. Nucleic acid databases include the new NAKB for structural information and updates from Genbank, ENA, GEO, Tarbase and JASPAR. The Issue's Breakthrough Article concerns NMPFamsDB for novel prokaryotic protein families and the AlphaFold Protein Structure Database has an important update. Metabolism is covered by updates from Reactome, Wikipathways and Metabolights. Microbes are covered by RefSeq, UNITE, SPIRE and P10K; viruses by ViralZone and PhageScope. Medically-oriented databases include the familiar COSMIC, Drugbank and TTD. Genomics-related resources include Ensembl, UCSC Genome Browser and Monarch. New arrivals cover plant imaging (OPIA and PlantPAD) and crop plants (SoyMD, TCOD and CropGS-Hub). The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Over the last year the NAR online Molecular Biology Database Collection has been updated, reviewing 1060 entries, adding 97 new resources and eliminating 388 discontinued URLs bringing the current total to 1959 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.


Subject(s)
Computational Biology , Databases, Nucleic Acid , Databases, Genetic , Databases, Nucleic Acid/trends , Genomics , Internet , Molecular Biology/trends
2.
Nucleic Acids Res ; 52(D1): D762-D769, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962425

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains over 315 000 bacterial and archaeal genomes and 236 million proteins with up-to-date and consistent annotation. In the past 3 years, we have expanded the diversity of the RefSeq collection by including the best quality metagenome-assembled genomes (MAGs) submitted to INSDC (DDBJ, ENA and GenBank), while maintaining its quality by adding validation checks. Assemblies are now more stringently evaluated for contamination and for completeness of annotation prior to acceptance into RefSeq. MAGs now account for over 17 000 assemblies in RefSeq, split over 165 orders and 362 families. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP), which is used to annotate nearly all RefSeq assemblies, include better detection of protein-coding genes. Nearly 83% of RefSeq proteins are now named by a curated Protein Family Model, a 4.7% increase over the past three years. In addition to literature citations, Enzyme Commission numbers, and gene symbols, Gene Ontology terms are now assigned to 48% of RefSeq proteins, allowing for easier multi-genome comparison. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/. PGAP is available as a stand-alone tool able to produce GenBank-ready files at https://github.com/ncbi/pgap.


Subject(s)
Archaea , Bacteria , Databases, Nucleic Acid , Metagenome , Archaea/genetics , Bacteria/genetics , Databases, Nucleic Acid/standards , Databases, Nucleic Acid/trends , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Internet , Molecular Sequence Annotation , Proteins/genetics
3.
Nucleic Acids Res ; 49(D1): D82-D85, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33175160

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid/trends , Nucleic Acids/genetics , Nucleotides/genetics , Databases, Nucleic Acid/statistics & numerical data , Europe , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Nucleic Acids/chemistry , Nucleotides/chemistry , Sequence Analysis, DNA , Sequence Analysis, RNA
4.
Methods Mol Biol ; 1912: 251-285, 2019.
Article in English | MEDLINE | ID: mdl-30635897

ABSTRACT

One of the most important resources for researchers of noncoding RNAs is the information available in public databases spread over the internet. However, the effective exploration of this data can represent a daunting task, given the large number of databases available and the variety of stored data. This chapter describes a classification of databases based on information source, type of RNA, source organisms, data formats, and the mechanisms for information retrieval, detailing the relevance of each of these classifications and its usability by researchers. This classification is used to update a 2012 review, indexing now more than 229 public databases. This review will include an assessment of the new trends for ncRNA research based on the information that is being offered by the databases. Additionally, we will expand the previous analysis focusing on the usability and application of these databases in pathogen and disease research. Finally, this chapter will analyze how currently available database schemas can help the development of new and improved web resources.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid/trends , Information Storage and Retrieval/trends , RNA, Untranslated/genetics , Computational Biology/trends , Databases, Nucleic Acid/statistics & numerical data , Datasets as Topic , Humans , Information Storage and Retrieval/statistics & numerical data
5.
Nucleic Acids Res ; 47(D1): D94-D99, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30365038

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 420 000 formally described species. Most GenBank submissions are made using BankIt, the NCBI Submission Portal, or the tool tbl2asn, and are obtained from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include an expansion of sequence identifier formats to accommodate expected database growth, submission wizards for ribosomal RNA, and the transfer of Expressed Sequence Tag (EST) and Genome Survey Sequence (GSS) data into the Nucleotide database.


Subject(s)
Databases, Nucleic Acid , Web Browser , Computational Biology/methods , Databases, Nucleic Acid/trends , Genomics/methods , Humans , Information Storage and Retrieval , Software Design
6.
Biosystems ; 167: 47-61, 2018 May.
Article in English | MEDLINE | ID: mdl-29608931

ABSTRACT

This paper proposes a secure, high-capacity, property-preserving algorithm that integrates cryptography and steganography with concepts from molecular biology. First, the confidential data are encrypted using the DNA Playfair cipher, which avoids sending extra information to the receiver and consequently acts as a trap for an attacker. Second, a randomized steganography process is achieved by exploiting DNA conservative mutations, which allow one DNA base to be substituted by another so that it carries two bits. Consequently, high capacity is obtained with no payload for the sequence used. The work makes three main contributions. First, it hides a large amount of data within DNA by exploiting each codon to hide two bits while preserving the protein properties of the sequence after the steganography process, which is a trade-off in the field. Second, using the conservative mutation with all its valid biological permutations leads to the lowest cracking probability published to date, as proven in the security analysis section. Finally, the proposed algorithm is compared with five recent substitution-based algorithms, using data of up to three megabytes, to demonstrate its scalability.
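
The idea of carrying two bits per codon without altering the encoded protein can be illustrated with a small sketch. This is not the algorithm from the paper: it omits the DNA Playfair encryption and the randomization step, and it makes the simplifying assumption that only codons from the eight four-fold degenerate families of the standard genetic code (GCN, GGN, CCN, ACN, GTN, CTN, TCN, CGN) are used, since any third base in those codons yields the same amino acid. The function names `embed` and `extract` and the bit-to-base mapping are hypothetical choices made for illustration.

```python
# Illustrative sketch only; not the published algorithm.
# Assumption: hide 2 bits in the third base of four-fold degenerate codons,
# where every choice of third base encodes the same amino acid.
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CT", "TC", "CG"}

# Hypothetical 2-bit mapping; any fixed bijection shared by both parties works.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def embed(cover: str, bits: str) -> str:
    """Hide a bit string in a coding sequence via synonymous third-base changes."""
    if len(cover) % 3 != 0:
        raise ValueError("cover length must be a multiple of 3")
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    chunks = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    out = []
    used = 0
    for i in range(0, len(cover), 3):
        codon = cover[i:i + 3]
        # Only four-fold degenerate codons can carry a payload safely.
        if codon[:2] in FOURFOLD_PREFIXES and used < len(chunks):
            codon = codon[:2] + BITS_TO_BASE[chunks[used]]
            used += 1
        out.append(codon)
    if used < len(chunks):
        raise ValueError("cover sequence has too few usable codons")
    return "".join(out)


def extract(stego: str, n_bits: int) -> str:
    """Recover the first n_bits hidden by embed()."""
    bits = []
    for i in range(0, len(stego), 3):
        codon = stego[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and 2 * len(bits) < n_bits:
            bits.append(BASE_TO_BITS[codon[2]])
    return "".join(bits)


# Cover codons: ATG GCT GGA AAA CCG (Met Ala Gly Lys Pro); three are usable.
stego = embed("ATGGCTGGAAAACCG", "011011")
print(stego)              # ATGGCCGGGAAACCT (still Met Ala Gly Lys Pro)
print(extract(stego, 6))  # 011011
```

In this simplified scheme the receiver needs only the payload length and the shared mapping. Capacity depends on how many four-fold degenerate codons the cover sequence contains, which is lower than the two bits per codon reported in the abstract.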


Subject(s)
Conserved Sequence/genetics , DNA/genetics , Databases, Nucleic Acid , Mutation/genetics , Sequence Analysis, DNA/methods , Animals , Base Sequence , Databases, Genetic/trends , Databases, Nucleic Acid/trends , Humans , Random Allocation , Sequence Analysis, DNA/trends
7.
Nucleic Acids Res ; 46(D1): D1-D7, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29316735

ABSTRACT

The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.


Subject(s)
Databases, Nucleic Acid , Animals , Computational Biology , Databases, Nucleic Acid/trends , Databases, Protein , Genomics , Humans , Internet , Molecular Biology , Proteomics
8.
Nucleic Acids Res ; 46(D1): D48-D51, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29190397

ABSTRACT

For more than 30 years, the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been committed to capturing, preserving and providing access to comprehensive public domain nucleotide sequence and associated metadata which enables discovery in biomedicine, biodiversity and biological sciences. Since 1987, the DNA Data Bank of Japan (DDBJ) at the National Institute for Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have worked collaboratively to enable access to nucleotide sequence data in standardized formats for the worldwide scientific community. In this article, we reiterate the principles of the INSDC collaboration and briefly summarize the trends of the archival content.


Subject(s)
Databases, Nucleic Acid , Animals , Classification , Computational Biology , Databases, Factual , Databases, Nucleic Acid/trends , Europe , High-Throughput Nucleotide Sequencing , Humans , International Cooperation , Japan , National Library of Medicine (U.S.) , United States
9.
Nucleic Acids Res ; 46(D1): D30-D35, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29040613

ABSTRACT

The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.


Subject(s)
Databases, Nucleic Acid , Academies and Institutes , Cloud Computing , Computational Biology , Confidentiality/legislation & jurisprudence , Databases, Nucleic Acid/history , Databases, Nucleic Acid/trends , Europe , Genetic Association Studies , History, 20th Century , History, 21st Century , Humans , Information Storage and Retrieval , International Cooperation , Japan , National Library of Medicine (U.S.) , United States
10.
Nucleic Acids Res ; 46(D1): D41-D47, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140468

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 400 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun and environmental sampling projects. Most submissions are made using BankIt, the National Center for Biotechnology Information (NCBI) Submission Portal, or the tool tbl2asn. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to sequence identifiers, submission wizards for 16S and Influenza sequences, and an Identical Protein Groups resource.


Subject(s)
Databases, Nucleic Acid , Animals , Computational Biology , Databases, Nucleic Acid/statistics & numerical data , Databases, Nucleic Acid/trends , Europe , Genomics , Humans , Information Dissemination , Information Storage and Retrieval , Internet , Japan , National Library of Medicine (U.S.) , Orthomyxoviridae/genetics , Proteomics , RNA, Ribosomal/genetics , Sequence Alignment , United States
11.
Nucleic Acids Res ; 46(D1): D36-D40, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140475

ABSTRACT

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.


Subject(s)
Databases, Nucleic Acid , Computational Biology , Databases, Nucleic Acid/trends , Europe , High-Throughput Nucleotide Sequencing , Humans , Information Storage and Retrieval , Internet , Molecular Sequence Annotation
13.
Nucleic Acids Res ; 45(D1): D1-D11, 2017 01 04.
Article in English | MEDLINE | ID: mdl-28053160

ABSTRACT

This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as 'breakthrough' contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the 'golden set' of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/.


Subject(s)
Databases, Nucleic Acid/trends , Databases, Protein/trends , Databases, Chemical/trends , Genomics , Humans
14.
Genet Med ; 19(7): 838-841, 2017 07.
Article in English | MEDLINE | ID: mdl-27977006

ABSTRACT

Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016.


Subject(s)
Databases, Genetic/ethics , Databases, Genetic/legislation & jurisprudence , Databases, Nucleic Acid/ethics , Data Curation/standards , Databases, Genetic/statistics & numerical data , Databases, Nucleic Acid/legislation & jurisprudence , Databases, Nucleic Acid/trends , Genetic Variation , Genomics/ethics , Genomics/legislation & jurisprudence , Humans , Information Dissemination/ethics , Information Dissemination/legislation & jurisprudence
15.
Rev. derecho genoma hum ; (40): 51-73, ene.-jun. 2014.
Article in Spanish | IBECS | ID: ibc-133429

ABSTRACT

La ley 12.654, de 28 de mayo de 2012, implementó en Brasil la base de datos de los perfiles genéticos con finalidades criminales, con el objetivo de permitir la identificación genética de las personas investigadas y condenadas por «crímenes hediondos» o delitos dolosos practicados con violencia de naturaleza grave contra las personas. El presente trabajo tiene por objeto realizar un análisis crítico de las implicaciones procesales penales del funcionamiento de las bases de datos genéticos en el ordenamiento jurídico brasileño, así como las principales dificultades encontradas para la implementación de este mecanismo técnico-investigativo en nuestro país. Tales reflexiones son hechas a partir de la concepción de un modelo constitucional de proceso que tiene por finalidad la protección de derechos fundamentales. De esta forma, se analiza primeramente la noción de identidad en el Estado Democrático de Derecho, trazándose un panorama sobre la identificación criminal, como medida de intervención corporal, y como la jurisprudencia brasileña viene tratando las injerencias en el cuerpo humano. Finalmente, se analiza el régimen jurídico de los datos genéticos y el empleo de los bancos de datos de ADN (AU)


The act 12.654 of May 28, 2012 implemented in Brazil the database of genetic profiles for criminal purposes, in order to allow for the identification of individuals investigated and convicted of heinous crimes or intentional violent offenses of a serious nature against other persons. The present work seeks to make a critical analysis of the criminal procedural implications of genetic databases in the Brazilian legal system and the main difficulties found in the implementation of such a technical investigative mechanism in our country. These reflections are made from the idea of a constitutional process model that is intended to protect fundamental rights. Thus, it first analyzes the notion of identity in the democratic state of law, mapping out a perspective of criminal identification as a measure of body intervention as well as how Brazilian jurisprudence has been treating various interferences in the human body. Finally, it discusses the legal regime of genetic data and the use of DNA databases (AU)


Subject(s)
Humans , Male , Female , DNA Fingerprinting/legislation & jurisprudence , DNA Fingerprinting/trends , Databases, Nucleic Acid/legislation & jurisprudence , Databases, Nucleic Acid/trends , Databases, Nucleic Acid , Epidemiological Monitoring , Criminals/legislation & jurisprudence , Brazil/epidemiology
18.
Forensic Sci Int Genet ; 5(1): 16-20, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20739248

ABSTRACT

DNA evidence is widely recognized as an invaluable tool in the process of investigation and identification, as well as one of the most sought after types of evidence for presentation to a jury. In the United States, the development of state and federal DNA databases has greatly impacted the forensic community by creating an efficient, searchable system that can be used to eliminate or include suspects in an investigation based on matching DNA profiles - the profile already in the database to the profile of the unknown sample in evidence. Recent changes in legislation have begun to allow for the possibility to expand the parameters of DNA database searches, taking into account the possibility of familial searches. This article discusses prospective positive outcomes of utilizing familial DNA searches and acknowledges potential negative outcomes, thereby presenting both sides of this very complicated, rapidly evolving situation.


Subject(s)
DNA/genetics , Databases, Nucleic Acid , Forensic Genetics/methods , Law Enforcement/methods , DNA/analysis , DNA Fingerprinting/methods , Databases, Factual , Databases, Nucleic Acid/legislation & jurisprudence , Databases, Nucleic Acid/trends , Family , Forecasting , Humans , United States
19.
Adv Exp Med Biol ; 680: 125-35, 2010.
Article in English | MEDLINE | ID: mdl-20865494

ABSTRACT

The Center for Information Biology and DNA Data Bank of Japan (CIB-DDBJ) has operated biological databases since 1987 in collaboration with NCBI and EBI. As one of the three major public databases, CIB-DDBJ has run four primary databases DDBJ, CIBEX, DDBJ Trace Archive (DTA), and DDBJ Read Archive (DRA) to collect, archive, and provide various kinds of biological data. As the massively parallel new sequencing platforms are increasingly in use, huge amounts of the raw data have been produced. To archive these raw data, we at CIB-DDBJ began operating a new repository, the DDBJ Read Archive (DRA). To accommodate efficiently the processed data as well, we have developed a new pipeline, the DDBJ Read Annotation Pipeline that deals with both data submission and analysis. For data produced by the next generation platforms, the three archives DRA, DDBJ, and CIBEX, which are interconnected by the pipeline, collect the raw, processed sequence, and quantitative data, respectively. The public biological databases at CIB-DDBJ, EBI, and NCBI will together construct world-wide archives for biological data by data sharing to accelerate research in life sciences in the era of next generation sequencing technologies.


Subject(s)
Databases, Nucleic Acid/statistics & numerical data , Sequence Analysis, DNA/statistics & numerical data , Computational Biology , Databases, Nucleic Acid/trends , Japan , Models, Statistical , Sequence Analysis, DNA/trends
20.
Genome Biol ; 11(7): 402, 2010.
Article in English | MEDLINE | ID: mdl-20670392

ABSTRACT

Maintaining up-to-date annotation on reference genomes is becoming more important, not less, as the ability to rapidly and cheaply resequence genomes expands.


Subject(s)
Databases, Nucleic Acid , Genomics/methods , Animals , Arabidopsis/genetics , Communication , Databases, Nucleic Acid/economics , Databases, Nucleic Acid/trends , Genome/genetics , Genomics/economics , Genomics/trends , Research Support as Topic/economics , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA/economics