ABSTRACT
This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as 'breakthrough' contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the 'golden set' of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the NAR website (https://academic.oup.com/nar). An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/.
Subjects
Databases, Nucleic Acid/trends; Databases, Protein/trends; Databases, Chemical/trends; Genomics; Humans
ABSTRACT
The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers: the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and the Swiss Institute of Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases.
Subjects
Computational Biology; Databases, Genetic; Databases, Factual; Databases, Protein; Genomics; Humans
ABSTRACT
The 2015 Nucleic Acids Research Database Issue contains 172 papers that include descriptions of 56 new molecular biology databases, and updates on 115 databases whose descriptions have been previously published in NAR or other journals. Following the classification that was introduced last year to simplify navigation of the entire issue, these articles are divided into eight subject categories. This year's highlights include RNAcentral, an international community portal to various databases on noncoding RNA; ValidatorDB, a validation database for protein structures and their ligands; SASBDB, a primary repository for small-angle scattering data of various macromolecular complexes; MoonProt, a database of 'moonlighting' proteins; and two new databases of protein-protein and other macromolecular complexes, ComPPI and the Complex Portal. This issue also includes an unusually high number of cancer-related databases and other databases dedicated to genomic basics of disease and potential drugs and drug targets. The size of the NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/a/, remained approximately the same, following the addition of 74 new resources and removal of 77 obsolete web sites. The entire Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Subjects
Databases, Genetic; Computational Biology; Databases, Nucleic Acid; Databases, Protein
ABSTRACT
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Subjects
Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Internet; Molecular Biology
ABSTRACT
The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases, while the other half provide updates on databases previously featured in NAR and other journals. This year's highlights include two databases of DNA repeat elements; several databases of transcriptional factors and transcriptional factor-binding sites; databases on various aspects of protein structure and protein-protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using the genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI's dbVar and EBI's DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein-ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing a wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and currently lists 1512 online databases. The full content of the Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Subjects
Databases, Genetic; Disease/genetics; Genomics; Humans; Internet
ABSTRACT
The 19th annual Database Issue of Nucleic Acids Research features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to the databases previously described in NAR and other journals. The highlights of this issue include, among others, a description of neXtProt, a knowledgebase on human proteins; a detailed explanation of the principles behind the NCBI Taxonomy Database; NCBI and EBI papers on the recently launched BioSample databases that store sample information for a variety of database resources; descriptions of the recent developments in the Gene Ontology and UniProt Gene Ontology Annotation projects; updates on Pfam, SMART and InterPro domain databases; update papers on KEGG and TAIR, two universally acclaimed databases that face an uncertain future; and a separate section with 10 wiki-based databases, introduced in an accompanying editorial. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and now lists 1380 databases. Brief machine-readable descriptions of the databases featured in this issue, according to the BioDBcore standards, will be provided at the http://biosharing.org/biodbcore web site. The full content of the Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Subjects
Databases, Factual; Databases, Genetic; Computational Biology; Humans; Internet; Molecular Biology; Online Systems
ABSTRACT
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes, with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year, including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii), bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.
Subjects
Databases, Genetic; Genomics; Animals; Gene Expression Regulation; Genetic Variation; Humans; Mice; Molecular Sequence Annotation; Rats
ABSTRACT
The spontaneously hypertensive rat (SHR) is the most widely studied animal model of hypertension. Scores of SHR quantitative trait loci (QTLs) have been mapped for hypertension and other phenotypes. We have sequenced the SHR/OlaIpcv genome at 10.7-fold coverage by paired-end sequencing on the Illumina platform. We identified 3.6 million high-quality single nucleotide polymorphisms (SNPs) between the SHR/OlaIpcv and Brown Norway (BN) reference genomes, with a high rate of validation (sensitivity 96.3%-98.0% and specificity 99%-100%). We also identified 343,243 short indels between the SHR/OlaIpcv and reference genomes. These SNPs and indels resulted in 161 gains or losses of stop codons and 629 frameshifts compared with the BN reference sequence. We also identified 13,438 larger deletions that result in complete or partial absence of 107 genes in the SHR/OlaIpcv genome compared with the BN reference and 588 copy number variants (CNVs) that overlap with the gene regions of 688 genes. Genomic regions containing genes whose expression had been previously mapped as cis-regulated expression quantitative trait loci (eQTLs) were significantly enriched with SNPs, short indels, and larger deletions, suggesting that some of these variants have functional effects on gene expression. Genes that were affected by major alterations in their coding sequence were highly enriched for genes related to ion transport, transport, and plasma membrane localization, providing insights into the likely molecular and cellular basis of hypertension and other phenotypes specific to the SHR strain. This near complete catalog of genomic differences between two extensively studied rat strains provides the starting point for complete elucidation, at the molecular level, of the physiological and pathophysiological phenotypic differences between individuals from these strains.
Subjects
Hypertension/genetics; Animals; Codon, Terminator; Gene Dosage; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Rats; Rats, Inbred SHR; Transcription, Genetic
ABSTRACT
The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.
Subjects
Databases, Genetic; Genomics; Animals; Genetic Variation; Humans; Mice; Molecular Sequence Annotation; Rats; Regulatory Sequences, Nucleic Acid; Software; Zebrafish/genetics
ABSTRACT
The number of databases in molecular biological fields has rapidly increased to provide a large-scale resource. Though valuable information is available, data can be difficult to access, compare and integrate due to different formats and presentations of web interfaces. This paper offers a practical guide to the integration of gene, comparative genomic, and functional genomics data using the Ensembl website at http://www.ensembl.org. The Ensembl genome browser and underlying databases focus on chordate organisms. More species such as plants and microorganisms can be investigated using our sister browser at http://www.ensemblgenomes.org. In this study, four examples are used that sample many pages and features of the Ensembl browser. We focus on comparative studies across over 50 mostly chordate organisms, variations linked to disease, functional genomics, and access to external information housed in databases outside the Ensembl project. Researchers will learn how to go beyond simply exporting one gene sequence, and explore how a genome browser can integrate data from various sources and databases to build a full and comprehensive biological picture.
Subjects
Databases, Genetic; Genomics/methods; Internet; Animals; Base Sequence; Conserved Sequence; Humans; Interleukin-2/genetics; Mice; Molecular Sequence Data; Polymorphism, Single Nucleotide; Rats; Regulatory Sequences, Nucleic Acid/genetics; User-Computer Interface
ABSTRACT
BACKGROUND: Endothelial cells form the interface between the porcine graft and the recipient and frequently become activated after xenotransplantation. To evaluate the safety of xenotransplantation further, we assessed the effect of cellular activation on the expression and release of porcine endogenous retroviruses from primary endothelial cells isolated from transgenic and nontransgenic pigs. METHODS: Primary porcine endothelial cells, cultured from pigs transgenic for human decay accelerating factor, were treated with human tumor necrosis factor-alpha, porcine interferon-gamma, or lipopolysaccharide. The release of porcine endogenous retroviruses into the supernatant was monitored at 24-hr intervals (up to 72 hr) by polymerase chain reaction-based reverse transcriptase (PBRT) assay. Activated and unactivated endothelial cells were co-cultured with human cells to investigate the capacity of any virus released from the porcine cells to infect human cells. RESULTS: Virus was not detected in supernatants from quiescent cells by PBRT analysis. The number of viral particles released from endothelial cells was 10 to 5 x 10 viral particles/mL after cellular activation with tumor necrosis factor-alpha, interferon-gamma, or lipopolysaccharide, as shown by PBRT analysis. In contrast, in vitro infection of human cells was observed with unactivated endothelial cells only and was not observed in co-cultures with the activated porcine cells. CONCLUSIONS: Cytokine treatment of primary porcine endothelial cells results in an increase in the release of virus into the supernatant, but the observed increase in viral titer was not mirrored by an increase in infectivity toward human cells.
Subjects
CD55 Antigens/physiology; Endogenous Retroviruses/isolation & purification; Endothelial Cells/virology; Swine/virology; Animals; Cells, Cultured; Humans; Interferon-gamma/pharmacology; Lipopolysaccharides/pharmacology; Polymerase Chain Reaction; Tumor Necrosis Factor-alpha/pharmacology
ABSTRACT
Biological databases are an important resource for the life sciences community. Accessing the hundreds of databases supporting molecular biology and related fields is a daunting and time-consuming task. Integrating this information into one access point is a necessity for the life sciences community, which includes researchers focusing on human disease. Here we discuss the Ensembl genome browser, which acts as a single entry point with Graphical User Interface to data from multiple projects, including OMIM, dbSNP, and the NHGRI GWAS catalog. Ensembl provides a comprehensive source of annotation for the human genome, along with other species of biomedical interest. In this unit, we explore how to use the Ensembl genome browser in example queries related to human genetic diseases. Support protocols demonstrate quick sequence export using the BioMart tool.
Subjects
Computational Biology/methods; Databases, Genetic; Disease/genetics; Computational Biology/instrumentation; Genome, Human; Humans; Information Storage and Retrieval
ABSTRACT
The Ensembl project provides a comprehensive source of automatic annotation of the human genome sequence, as well as other species of biomedical interest, with confirmed gene predictions that have been integrated with external data sources. This unit describes how to use the Ensembl genome browser (http://www.ensembl.org/), the public interface of the project. It describes how to find a gene or protein of interest, how to get additional information and external links, and how to use the comparative genomic data. Curr. Protoc. Bioinform. 30:1.15.1-1.15.48. (c) 2010 by John Wiley & Sons, Inc.
Subjects
Databases, Genetic; Animals; Computational Biology; Database Management Systems; Genome; Genome, Human; Humans
ABSTRACT
It has been four years since the original publication of the draft sequence of the rat genome. Five groups are now working together to assemble, annotate and release an updated version of the rat genome. As the prevailing model for physiology, complex disease and pharmacological studies, there is an acute need for the rat's genomic resources to keep pace with the rat's prominence in the laboratory. In this commentary, we describe the current status of the rat genome sequence and the plans for its impending 'upgrade'. We then cover the key online resources providing access to the rat genome, including the new SNP views at Ensembl, the RefSeq and Genes databases at the US National Center for Biotechnology Information, Genome Browser at the University of California Santa Cruz and the disease portals for cardiovascular disease and obesity at the Rat Genome Database.
Subjects
Databases, Genetic; Genome; Rats/genetics; Animals; Computational Biology; Disease Models, Animal; Genetic Diseases, Inborn/genetics; Genetic Variation; Genomics; Haplotypes; Humans; Internet; Polymorphism, Single Nucleotide; Rats, Mutant Strains; Sequence Analysis, DNA
ABSTRACT
The rat is an important system for modeling human disease. Four years ago, the rich 150-year history of rat research was transformed by the sequencing of the rat genome, ushering in an era of exceptional opportunity for identifying genes and pathways underlying disease phenotypes. Genome-wide association studies in human populations have recently provided a direct approach for finding robust genetic associations in common diseases, but identifying the precise genes and their mechanisms of action remains problematic. In the context of significant progress in rat genomic resources over the past decade, we outline achievements in rat gene discovery to date, show how these findings have been translated to human disease, and document an increasing pace of discovery of new disease genes, pathways and mechanisms. Finally, we present a set of principles that justify continuing and strengthening genetic studies in the rat model, and further development of genomic infrastructure for rat research.
Subjects
Disease Models, Animal; Genetic Diseases, Inborn/genetics; Genome; Genomics/trends; Rats/genetics; Animals; Animals, Genetically Modified; Chromosome Mapping; Gene Targeting; Humans
ABSTRACT
The Ensembl genome Web browser (http://www.ensembl.org) provides a comprehensive source of automatic annotation of the human genome sequence (as well as other species of biomedical interest), with confirmed gene predictions that have been integrated with external data sources. This unit describes how to use the Ensembl browser, how to find a gene or protein of interest and get information and external links about it, and how to use the comparative genomic data.
Subjects
Chromosome Mapping/methods; Databases, Genetic; Genome, Human/genetics; Information Storage and Retrieval/methods; Sequence Analysis, DNA/methods; Software; User-Computer Interface; Algorithms; Base Sequence; Computer Graphics; Database Management Systems; Humans; Internet; Molecular Sequence Data
ABSTRACT
A wealth of gene information is accruing in public databases. Genome browsers such as Ensembl are needed to organize and depict this information in the context of the genome. Ensembl provides an open source gene set based on experimental evidence for over 30 species, the majority of which are vertebrates. Genes and annotation are accessible through the Ensembl browser (http://www.ensembl.org), and through direct queries of its databases using the Perl API (Application Programme Interface), MySQL or BioMart.
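As a rough illustration of the BioMart access route mentioned above, the following Python sketch builds a BioMart-style XML query and the URL that would carry it, without sending anything over the network. The endpoint path, the dataset name `hsapiens_gene_ensembl`, and the attribute and filter names are assumptions chosen for illustration and are not taken from the abstract; they should be checked against the current BioMart documentation before use. The helper names `build_biomart_query` and `build_request_url` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint for the Ensembl BioMart web service (not stated in the abstract).
MARTSERVICE_URL = "http://www.ensembl.org/biomart/martservice"


def build_biomart_query(dataset, attributes, filters=None):
    """Return a BioMart-style XML query string (hypothetical helper).

    dataset    -- dataset name, e.g. "hsapiens_gene_ensembl" (assumed)
    attributes -- list of column names to retrieve
    filters    -- optional dict mapping filter name to value
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",       # tab-separated output
        "header": "0",            # no header row
        "uniqueRows": "1",        # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are returned, in order.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def build_request_url(xml_query, base_url=MARTSERVICE_URL):
    """URL-encode the XML query into a GET request URL (nothing is sent)."""
    return base_url + "?" + urlencode({"query": xml_query})


if __name__ == "__main__":
    xml_query = build_biomart_query(
        "hsapiens_gene_ensembl",                      # assumed dataset name
        ["ensembl_gene_id", "external_gene_name"],    # assumed attribute names
        {"chromosome_name": "21"},                    # assumed filter name
    )
    print(xml_query)
    print(build_request_url(xml_query))
```

The printed URL could then be fetched with any HTTP client; keeping query construction separate from the network call makes the query easy to inspect before it is run.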
Subjects
Genome, Human; Genome; Animals; Biotechnology/methods; Chromosome Mapping; Chromosomes/genetics; Chromosomes, Human/genetics; Contig Mapping; Databases, Nucleic Acid; Exons; Humans; RNA, Messenger/genetics
ABSTRACT
Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to "widen" this biological integration to include other model organisms relevant to understanding human biology as they become available; to "deepen" this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive.