Results 1 - 20 of 61
1.
Hum Genome Var ; 9(1): 44, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36509753

ABSTRACT

TogoVar ( https://togovar.org ) is a database that integrates allele frequencies derived from Japanese populations and provides annotations for variant interpretation. First, a scheme to reanalyze individual-level genome sequence data deposited in the Japanese Genotype-phenotype Archive (JGA), a controlled-access database, was established to make allele frequencies publicly available. As more Japanese individual-level genome sequence data are deposited in JGA, the sample size employed in TogoVar is expected to increase, contributing to genetic study as reference data for Japanese populations. Second, public datasets of Japanese and non-Japanese populations were integrated into TogoVar to easily compare allele frequencies in Japanese and other populations. Each variant detected in Japanese populations was assigned a TogoVar ID as a permanent identifier. Third, these variants were annotated with molecular consequence, pathogenicity, and literature information for interpreting and prioritizing variants. Here, we introduce the newly developed TogoVar database that compares allele frequencies among Japanese and non-Japanese populations and describes the integrated annotations.
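
The cross-population comparison described in this abstract comes down to comparing allele frequencies, where a frequency is the allele count (AC) divided by the number of alleles observed (AN). The following is a minimal sketch of that arithmetic only. The dataset labels and counts are invented for illustration, and the class and function names are hypothetical; nothing here reflects TogoVar's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class AlleleCount:
    """Allele count (ac) and total allele number (an) for one dataset."""
    dataset: str  # hypothetical dataset label
    ac: int       # number of observed alternate alleles
    an: int       # number of alleles genotyped at this site

    @property
    def frequency(self) -> float:
        # Guard against an empty dataset to avoid division by zero.
        return self.ac / self.an if self.an else 0.0


def frequency_table(counts):
    """Map each dataset label to its allele frequency."""
    return {c.dataset: c.frequency for c in counts}


def max_frequency_gap(counts):
    """Largest difference in allele frequency between any two datasets."""
    freqs = [c.frequency for c in counts]
    return max(freqs) - min(freqs)


# Invented numbers, for illustration only.
counts = [
    AlleleCount("japanese_example", ac=30, an=2000),
    AlleleCount("non_japanese_example", ac=5, an=10000),
]
table = frequency_table(counts)
gap = max_frequency_gap(counts)
```

A variant whose frequency differs strongly between datasets, as in this made-up case, is the kind of site such a side-by-side comparison is meant to surface.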

2.
Hum Genome Var ; 9(1): 48, 2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36539398

ABSTRACT

Accurate genotype imputation requires large-scale reference panel datasets. When conducting genotype imputation on the Japanese population, researchers can use such datasets under collaborative studies or controlled access conditions in public databases. We developed the NBDC-DDBJ imputation server, which securely provides users with a web user interface to execute genotype imputation on the server. Our benchmarking analysis showed that the accuracy of genotype imputation was improved by leveraging controlled access datasets to increase the number of haplotypes available for analysis compared to using publicly available reference panels such as the 1000 Genomes Project. The NBDC-DDBJ imputation server facilitates the use of controlled access datasets for accurate genotype imputation.

3.
Genes Genet Syst ; 95(1): 43-50, 2020 Apr 22.
Article in English | MEDLINE | ID: mdl-32213716

ABSTRACT

Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants predicted chromatin feature annotations from DNA sequences with competing models. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The performance of the top model resulted in an area under the curve (AUC) score of 0.95. Over the course of the competition, the overall performance of the submitted models improved by an AUC score of 0.30 from the first submitted model. Furthermore, the 1st- and 2nd-ranking models utilised external data such as genomic location and gene annotation information with specific domain knowledge. The effect of incorporating this domain knowledge led to improvements of approximately 5%-9%, as measured by the AUC scores. This report suggests that machine learning competitions will lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.
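
The area under the ROC curve (AUC) used to rank submissions can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. It is an illustration under stated assumptions: binary 0/1 labels, ties counted as half, and a hypothetical function name. It is not the evaluation code used in the challenge.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values
    scores: iterable of model scores, higher means "more likely positive"
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic and only suitable for small examples; a rank-based formulation gives the same value more efficiently.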


Subjects
Arabidopsis/genetics; Chromatin/genetics; Databases, Nucleic Acid; Genome, Plant/genetics; Machine Learning; Computational Biology; Crowdsourcing; Data Analysis; High-Throughput Nucleotide Sequencing; Japan; Molecular Sequence Annotation
4.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30624651

ABSTRACT

TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as TogoStanza, which is a generic framework for rendering an information block as IFRAME/Web Components, which can, unlike several other monolithic databases, also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source codes on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.


Subjects
Databases, Genetic; Genomics/methods; Semantic Web; Software; Humans
5.
Bioinformation ; 15(12): 883-886, 2019.
Article in English | MEDLINE | ID: mdl-32256008

ABSTRACT

A comprehensive search system for bioscience databases is under development. We constructed a search service, the Life science database cross search system (https://biosciencedbc.jp/dbsearch/index.php?lang=en), by integrating numerous biomedical databases using database crawling algorithms. The system integrates 600 databases containing over 90 million entries indexed for biomedical research and development.

6.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576482

ABSTRACT

In the life sciences, researchers increasingly want to access multiple databases in an integrated way. However, different databases currently use different formats and vocabularies, hindering the proper integration of heterogeneous life science data. Adopting the Resource Description Framework (RDF) has the potential to address such issues by improving database interoperability, leading to advances in automatic data processing. Based on this idea, we have advised many Japanese database development groups to expose their databases in RDF. To further promote such activities, we have developed an RDF-based life science dataset repository called the National Bioscience Database Center (NBDC) RDF portal. All the datasets in this repository have been reviewed by the NBDC to ensure interoperability and queryability. As of July 2018, the service includes 21 RDF datasets, comprising over 45.5 billion triples. It provides SPARQL endpoints for all datasets, useful metadata and the ability to download RDF files. The NBDC RDF portal can be accessed at https://integbio.jp/rdf/.
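
Querying a SPARQL endpoint such as those mentioned in this abstract usually means sending the query text as the `query` parameter of an HTTP request, as defined by the SPARQL 1.1 Protocol. The sketch below only builds such a GET URL and sends nothing over the network. The endpoint address is a placeholder, since the abstract gives the portal URL but not the endpoint paths, and the helper name is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint: an assumption for illustration, not a real NBDC address.
ENDPOINT = "https://example.org/sparql"

# A generic query that lists a few triples from whatever dataset is served.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_get_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

When actually sending the request, the result format is normally negotiated with an `Accept` header, for example `application/sparql-results+json`.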


Subjects
Biological Science Disciplines; Database Management Systems; Databases, Factual; Semantics; Internet; User-Computer Interface
7.
Am J Hum Genet ; 103(3): 389-399, 2018 09 06.
Article in English | MEDLINE | ID: mdl-30173820

ABSTRACT

Recently, to speed up the differential-diagnosis process based on symptoms and signs observed from an affected individual in the diagnosis of rare diseases, researchers have developed and implemented phenotype-driven differential-diagnosis systems. The performance of those systems relies on the quantity and quality of underlying databases of disease-phenotype associations (DPAs). Although such databases are often developed by manual curation, they inherently suffer from limited coverage. To address this problem, we propose a text-mining approach to increase the coverage of DPA databases and consequently improve the performance of differential-diagnosis systems. Our analysis showed that a text-mining approach using one million case reports obtained from PubMed could increase the coverage of manually curated DPAs in Orphanet by 125.6%. We also present PubCaseFinder (see Web Resources), a new phenotype-driven differential-diagnosis system in a freely available web application. By utilizing automatically extracted DPAs from case reports in addition to manually curated DPAs, PubCaseFinder improves the performance of automated differential diagnosis. Moreover, PubCaseFinder helps clinicians search for relevant case reports by using phenotype-based comparisons and confirm the results with detailed contextual information.


Subjects
Rare Diseases/diagnosis; Rare Diseases/genetics; Data Mining/methods; Databases, Genetic; Diagnosis, Differential; Humans; Phenotype
8.
Nucleic Acids Res ; 46(D1): D30-D35, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29040613

ABSTRACT

The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.


Subjects
Databases, Nucleic Acid; Academies and Institutes; Cloud Computing; Computational Biology; Confidentiality/legislation & jurisprudence; Databases, Nucleic Acid/history; Databases, Nucleic Acid/trends; Europe; Genetic Association Studies; History, 20th Century; History, 21st Century; Humans; Information Storage and Retrieval; International Cooperation; Japan; National Library of Medicine (U.S.); United States
9.
Nucleic Acids Res ; 46(D1): D48-D51, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29190397

ABSTRACT

For more than 30 years, the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) has been committed to capturing, preserving and providing access to comprehensive public domain nucleotide sequence and associated metadata which enables discovery in biomedicine, biodiversity and biological sciences. Since 1987, the DNA Data Bank of Japan (DDBJ) at the National Institute for Genetics in Mishima, Japan; the European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK; and GenBank at National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health in Bethesda, Maryland, USA have worked collaboratively to enable access to nucleotide sequence data in standardized formats for the worldwide scientific community. In this article, we reiterate the principles of the INSDC collaboration and briefly summarize the trends of the archival content.


Subjects
Databases, Nucleic Acid; Animals; Classification; Computational Biology; Databases, Factual; Databases, Nucleic Acid/trends; Europe; High-Throughput Nucleotide Sequencing; Humans; International Cooperation; Japan; National Library of Medicine (U.S.); United States
10.
Nucleic Acids Res ; 45(D1): D25-D31, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924010

ABSTRACT

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We collect nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also operates the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect human-subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, together with the Database Center for Life Science, the DDBJ Center is improving semantic web technologies for integrating and sharing biological data and for providing an RDF version of the sequence data.


Subjects
Databases, Nucleic Acid; Sequence Analysis, DNA; Animals; Genotype; Humans; Internet; Japan; Molecular Sequence Annotation; Phenotype; Software
11.
Nucleic Acids Res ; 44(D1): D51-7, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578571

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.


Subjects
Databases, Nucleic Acid; Sequence Analysis, DNA; Biological Ontologies; Computers; Genotype; Phenotype
12.
Nucleic Acids Res ; 44(D1): D48-50, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26657633

ABSTRACT

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences.


Subjects
Databases, Nucleic Acid; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Cooperative Behavior; Databases, Nucleic Acid/standards; Policy
13.
Nucleic Acids Res ; 43(Database issue): D18-22, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25477381

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. DDBJ Center provides the JGA database system which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper describes the overview of the JGA project, updates to the DDBJ databases, and services for data retrieval, analysis and integration.


Subjects
Databases, Nucleic Acid; Genotype; Phenotype; Genetic Association Studies; Humans; Internet; Sequence Analysis, DNA
14.
Mol Cell Biol ; 34(10): 1776-87, 2014 May.
Article in English | MEDLINE | ID: mdl-24591654

ABSTRACT

In mammalian circadian clockwork, the CLOCK-BMAL1 complex binds to DNA enhancers of target genes and drives circadian oscillation of transcription. Here we identified 7,978 CLOCK-binding sites in mouse liver by chromatin immunoprecipitation-sequencing (ChIP-Seq), and a newly developed bioinformatics method, motif centrality analysis of ChIP-Seq (MOCCS), revealed a genome-wide distribution of previously unappreciated noncanonical E-boxes targeted by CLOCK. In vitro promoter assays showed that CACGNG, CACGTT, and CATG(T/C)G are functional CLOCK-binding motifs. Furthermore, we extensively revealed rhythmically expressed genes by poly(A)-tailed RNA-Seq and identified 1,629 CLOCK target genes within 11,926 genes expressed in the liver. Our analysis also revealed rhythmically expressed genes that have no apparent CLOCK-binding site, indicating the importance of indirect transcriptional and posttranscriptional regulations. Indirect transcriptional regulation is represented by rhythmic expression of CLOCK-regulated transcription factors, such as Krüppel-like factors (KLFs). Indirect posttranscriptional regulation involves rhythmic microRNAs that were identified by small-RNA-Seq. Collectively, CLOCK-dependent direct transactivation through multiple E-boxes and indirect regulations polyphonically orchestrate dynamic circadian outputs.
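
The three functional motifs named in this abstract (CACGNG, CACGTT, and CATG(T/C)G) can be written as regular expressions, with N expanded to any base. The sketch below scans the forward strand of a sequence for them. It is a plain pattern search written for illustration; it is not the MOCCS method, and the function name and the test sequence are invented.

```python
import re

# Motif label -> regular expression; N means any of A, C, G, T.
MOTIFS = {
    "CACGNG": "CACG[ACGT]G",
    "CACGTT": "CACGTT",
    "CATG(T/C)G": "CATG[TC]G",
}


def find_motifs(sequence: str):
    """Return sorted (position, motif label, matched text) tuples.

    Positions are 0-based. A lookahead is used so that overlapping
    occurrences are all reported. Only the forward strand is scanned.
    """
    sequence = sequence.upper()
    hits = []
    for label, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start(), label, match.group(1)))
    return sorted(hits)


# Invented sequence containing CACGTG at offset 2 and CATGTG at offset 10.
hits = find_motifs("TTCACGTGAACATGTGCC")
```

A fuller scan would also search the reverse complement, since several of these motifs are not palindromic.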


Subjects
CLOCK Proteins/physiology; Circadian Rhythm; E-Box Elements; RNA Interference; Animals; Base Sequence; Binding Sites; Consensus Sequence; HEK293 Cells; Humans; Kruppel-Like Transcription Factors/metabolism; Liver; Male; Mice; Mice, Inbred C57BL; Mice, Knockout; MicroRNAs/genetics; MicroRNAs/metabolism; Protein Binding; Transcriptome
15.
Nucleic Acids Res ; 42(Database issue): D44-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24194602

ABSTRACT

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. This database content is shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). DDBJ launched a new nucleotide sequence submission system for receiving traditional nucleotide sequence. We expect that the new submission system will be useful for many submitters to input accurate annotation and reduce the time needed for data input. In addition, DDBJ has started a new service, the Japanese Genotype-phenotype Archive (JGA), with our partner institute, the National Bioscience Database Center (NBDC). JGA permanently archives and shares all types of individual human genetic and phenotypic data. We also introduce improvements in the DDBJ services and databases made during the past year.


Subjects
Base Sequence; Databases, Nucleic Acid; Molecular Sequence Annotation; Genomics; Genotype; High-Throughput Nucleotide Sequencing; Humans; Internet; Phenotype
16.
DNA Res ; 20(4): 383-90, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23657089

ABSTRACT

High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.


Subjects
Genomics; Molecular Sequence Annotation/methods; Sequence Analysis, DNA/methods; Software; High-Throughput Nucleotide Sequencing; Internet
17.
Nucleic Acids Res ; 41(Database issue): D25-9, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180790

ABSTRACT

The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) maintains a primary nucleotide sequence database and provides analytical resources for biological information to researchers. This database content is exchanged with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Resources provided by the DDBJ include traditional nucleotide sequence data released in the form of 27 316 452 entries or 16 876 791 557 base pairs (as of June 2012), and raw reads of new generation sequencers in the sequence read archive (SRA). A Japanese researcher published his own genome sequence via DDBJ-SRA on 31 July 2012. To cope with the ongoing genomic data deluge, in March 2012 our previous computer system was completely replaced by a commodity cluster-based system that boasts 122.5 TFlops of CPU capacity and 5 PB of storage space. During this upgrade, it was considered crucial to replace and refactor substantial portions of the DDBJ software systems as well. As a result of the replacement process, which took more than 2 years to perform, we have achieved significant improvements in system performance.


Subjects
Base Sequence; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing; Internet; Sequence Analysis, DNA; Software
18.
BMC Bioinformatics ; 13 Suppl 11: S1, 2012 Jun 26.
Article in English | MEDLINE | ID: mdl-22759455

ABSTRACT

BACKGROUND: The Genia task, when it was introduced in 2009, was the first community-wide effort to address fine-grained, structural information extraction from biomedical literature. Arranged for the second time as one of the main tasks of BioNLP Shared Task 2011, it aimed to measure the progress of the community since 2009 and to evaluate generalization of the technology to full-text papers. The Protein Coreference task was arranged as one of the supporting tasks, motivated by one of the lessons of the 2009 task: that the abundance of coreference structures in natural language text hinders further improvement on the Genia task. RESULTS: The Genia task received final submissions from 15 teams. The results show that the community has made significant progress, with a best F-score of 74% in extracting bio-molecular events of simple structure, e.g., gene expressions, and 45%-48% in extracting those of complex structure, e.g., regulations. The Protein Coreference task received 6 final submissions. The results show that coreference resolution performance in the biomedical domain lags behind that in the newswire domain, cf. 50% vs. 66% in MUC score. In particular, for protein coreference resolution the best system achieved an F-score of 34%. CONCLUSIONS: Detailed analysis performed on the results improves our insight into the problem and suggests directions for further improvements.
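
The F-scores reported in this abstract combine precision and recall into one number. The sketch below shows the usual balanced form (F1, the harmonic mean of the two), computed from counts of true positives, false positives and false negatives. That the shared task used exactly this balanced form is an assumption, and the counts are invented.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) from raw counts.

    tp: predicted items that are correct
    fp: predicted items that are wrong
    fn: correct items that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented counts: 8 correct predictions, 2 spurious, 8 missed.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

Because the harmonic mean is pulled toward the smaller value, a system with high precision but low recall, as in this toy case, still gets a modest F1.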


Subjects
Information Systems; Natural Language Processing; Proteins/chemistry; Congresses as Topic; Gene Expression; Proteins/genetics; Proteins/metabolism
19.
BMC Genomics ; 13 Suppl 3: S8, 2012 Jun 11.
Article in English | MEDLINE | ID: mdl-22759617

ABSTRACT

BACKGROUND: Term clustering, by measuring the string similarities between terms, is known within the natural language processing community to be an effective method for improving the quality of texts and dictionaries. However, we have observed that chemical names are difficult to cluster using string similarity measures. In order to clearly demonstrate this difficulty, we compared the string similarities determined using the edit distance, the Monge-Elkan score, SoftTFIDF, and the bigram Dice coefficient for chemical names with those for non-chemical names. RESULTS: Our experimental results revealed the following: (1) The edit distance had the best performance in the matching of full forms, whereas Cohen et al. reported that SoftTFIDF with the Jaro-Winkler distance would yield the best measure for matching pairs of terms for their experiments. (2) For each of the string similarity measures above, the best threshold for term matching differs for chemical names and for non-chemical names; the difference is especially large for the edit distance. (3) Although the matching results obtained for chemical names using the edit distance, Monge-Elkan scores, or the bigram Dice coefficients are better than the result obtained for non-chemical names, the results were contrary when using SoftTFIDF. (4) A suitable weight for chemical names varies substantially from one for non-chemical names. In particular, a weight vector that has been optimized for non-chemical names is not suitable for chemical names. (5) The matching results using the edit distances improve further by dividing a set of full forms into two subsets, according to whether a full form is a chemical name or not. These results show that our hypothesis is acceptable, and that we can significantly improve the performance of abbreviation-full form clustering by computing chemical names and non-chemical names separately. 
CONCLUSIONS: In conclusion, the discriminative application of string similarity methods to chemical and non-chemical names may be a simple yet effective way to improve the performance of term clustering.
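
Two of the string similarity measures compared in this abstract, the edit distance and the bigram Dice coefficient, are short enough to sketch. The code below is an illustration under stated assumptions: unit costs for insertions, deletions and substitutions, and character bigrams treated as sets. The abstract does not specify the authors' exact variants or thresholds, and the example strings are not taken from the study.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost per insert, delete or substitute."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def bigrams(text: str) -> set:
    """Set of adjacent character pairs in the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice_coefficient(a: str, b: str) -> float:
    """Bigram Dice coefficient: 2 * |X & Y| / (|X| + |Y|)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Strings too short to have bigrams: fall back to exact equality.
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


distance = edit_distance("kitten", "sitting")
dice = dice_coefficient("night", "nacht")
```

Systematic chemical names often differ by a single locant or prefix, so two distinct compounds can score as near-identical under either measure, which is consistent with the abstract's finding that chemical names need their own thresholds.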


Subjects
Algorithms; Cluster Analysis; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Chemistry; Computational Biology/methods; Reproducibility of Results; Terminology as Topic
20.
Nucleic Acids Res ; 40(Database issue): D38-42, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110025

ABSTRACT

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.


Subjects
Databases, Nucleic Acid; Genomics; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Internet; Oligonucleotide Array Sequence Analysis; Sequence Analysis, DNA; Sequence Analysis, RNA