Results 1 - 20 of 39
1.
Nucleic Acids Res ; 46(D1): D30-D35, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29040613

ABSTRACT

The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.


Subjects
Databases, Nucleic Acid; Academies and Institutes; Cloud Computing; Computational Biology; Confidentiality/legislation & jurisprudence; Databases, Nucleic Acid/history; Databases, Nucleic Acid/trends; Europe; Genetic Association Studies; History, 20th Century; History, 21st Century; Humans; Information Storage and Retrieval; International Cooperation; Japan; National Library of Medicine (U.S.); United States
2.
Nucleic Acids Res ; 45(D1): D25-D31, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924010

ABSTRACT

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect human subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center improves semantic web technologies to integrate and share biological data and to provide an RDF version of the sequence data.


Subjects
Databases, Nucleic Acid; Sequence Analysis, DNA; Animals; Genotype; Humans; Internet; Japan; Molecular Sequence Annotation; Phenotype; Software
3.
Nucleic Acids Res ; 44(D1): D51-7, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578571

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.


Subjects
Databases, Nucleic Acid; Sequence Analysis, DNA; Biological Ontologies; Computers; Genotype; Phenotype
4.
Nucleic Acids Res ; 43(Database issue): D18-22, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25477381

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. DDBJ Center provides the JGA database system which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper describes the overview of the JGA project, updates to the DDBJ databases, and services for data retrieval, analysis and integration.


Subjects
Databases, Nucleic Acid; Genotype; Phenotype; Genetic Association Studies; Humans; Internet; Sequence Analysis, DNA
5.
Plant Cell Physiol ; 57(1): e1, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26578696

ABSTRACT

The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.
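The VCF download mentioned in the abstract is a tab-delimited text format whose first eight columns are fixed (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The following is a minimal sketch of how such a file might be filtered for SNPs in plain Python. The function names and the example records are hypothetical and are not taken from OryzaGenome; real files should normally be read with a dedicated VCF library.

```python
def parse_vcf_line(line):
    """Parse the fixed columns of one VCF data line into a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, var_id, ref, alt = fields[:5]
    alts = alt.split(",")  # multiple alternate alleles are comma-separated
    bases = {"A", "C", "G", "T"}
    return {
        "chrom": chrom,
        "pos": int(pos),  # 1-based position on the reference
        "id": var_id,
        "ref": ref,
        "alt": alts,
        # A SNP here means: single reference base and single-base alternates only.
        "is_snp": ref in bases and all(a in bases for a in alts),
    }


def read_snps(lines):
    """Yield SNP records, skipping header lines (starting with '#') and blanks."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        record = parse_vcf_line(line)
        if record["is_snp"]:
            yield record


# Hypothetical example records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr01\t1203\t.\tA\tG\t50\tPASS\t.",
    "chr01\t4410\t.\tAT\tA\t48\tPASS\t.",   # deletion, not a SNP
    "chr02\t977\t.\tC\tT,G\t60\tPASS\t.",   # SNP with two alternate alleles
]
snps = list(read_snps(example))
```

The sketch ignores genotype columns and the INFO field, which a per-accession analysis would need.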


Subjects
Databases, Genetic; Genetic Variation; Genome, Plant/genetics; Genomics; Oryza/genetics; Genotype; Phenotype; Polymorphism, Single Nucleotide
6.
Nucleic Acids Res ; 42(Database issue): D44-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24194602

ABSTRACT

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. This database content is shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). DDBJ launched a new nucleotide sequence submission system for receiving traditional nucleotide sequences. We expect that the new submission system will help many submitters to input accurate annotation and will reduce the time needed for data input. In addition, DDBJ has started a new service, the Japanese Genotype-phenotype Archive (JGA), with our partner institute, the National Bioscience Database Center (NBDC). JGA permanently archives and shares all types of individual human genetic and phenotypic data. We also introduce improvements in the DDBJ services and databases made during the past year.


Subjects
Base Sequence; Databases, Nucleic Acid; Molecular Sequence Annotation; Genomics; Genotype; High-Throughput Nucleotide Sequencing; Humans; Internet; Phenotype
7.
BMC Genomics ; 16: 240, 2015 Mar 25.
Article in English | MEDLINE | ID: mdl-25879859

ABSTRACT

BACKGROUND: Lactobacillus hokkaidonensis is an obligate heterofermentative lactic acid bacterium, which was isolated from Timothy grass silage in Hokkaido, a subarctic region of Japan. This bacterium is expected to be useful as a silage starter culture in cold regions because of its remarkable psychrotolerance; it can grow at temperatures as low as 4°C. To elucidate its genetic background, particularly in relation to the source of psychrotolerance, we constructed the complete genome sequence of L. hokkaidonensis LOOC260(T) using PacBio single-molecule real-time sequencing technology. RESULTS: The genome of LOOC260(T) comprises one circular chromosome (2.28 Mbp) and two circular plasmids: pLOOC260-1 (81.6 kbp) and pLOOC260-2 (41.0 kbp). We identified diverse mobile genetic elements, such as prophages, integrated and conjugative elements, and conjugative plasmids, which may reflect adaptation to plant-associated niches. Comparative genome analysis also detected unique genomic features, such as genes involved in pentose assimilation and NADPH generation. CONCLUSIONS: This is the first complete genome in the L. vaccinostercus group, which is poorly characterized, so the genomic information obtained in this study provides insight into the genetics and evolution of this group. We also found several factors that may contribute to the ability of L. hokkaidonensis to grow at cold temperatures. The results of this study will facilitate further investigation of the cold-tolerance mechanism of L. hokkaidonensis.


Subjects
Genome, Bacterial; Lactobacillus/genetics; Silage/microbiology; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Chromosome Mapping; Chromosomes, Bacterial/chemistry; Chromosomes, Bacterial/metabolism; Comparative Genomic Hybridization; Lactobacillus/classification; Lactobacillus/isolation & purification; NADP/metabolism; NADP Transhydrogenases/genetics; NADP Transhydrogenases/metabolism; Phylogeny; Plasmids/genetics; Plasmids/metabolism; Sequence Analysis, DNA
8.
Nucleic Acids Res ; 41(Database issue): D25-9, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180790

ABSTRACT

The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) maintains a primary nucleotide sequence database and provides analytical resources for biological information to researchers. This database content is exchanged with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Resources provided by the DDBJ include traditional nucleotide sequence data released in the form of 27 316 452 entries or 16 876 791 557 base pairs (as of June 2012), and raw reads of new generation sequencers in the sequence read archive (SRA). A Japanese researcher published his own genome sequence via DDBJ-SRA on 31 July 2012. To cope with the ongoing genomic data deluge, in March 2012, our previous computer system was totally replaced by a commodity cluster-based system that boasts 122.5 TFlops of CPU capacity and 5 PB of storage space. During this upgrade, it was considered crucial to replace and refactor substantial portions of the DDBJ software systems as well. As a result of the replacement process, which took more than 2 years to perform, we have achieved significant improvements in system performance.


Subjects
Base Sequence; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing; Internet; Sequence Analysis, DNA; Software
9.
Nucleic Acids Res ; 41(Database issue): D880-4, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193255

ABSTRACT

H2DB (http://tga.nig.ac.jp/h2db/), an annotation database of genetic heritability estimates for humans and other species, has been developed as a knowledge database to connect trait-associated genomic loci. Heritability estimates have been investigated for individual species, particularly in human twin studies and plant/animal breeding studies. However, there appears to be no comprehensive heritability database for both humans and other species. Here, we introduce an annotation database for genetic heritabilities of various species that was annotated by manually curating online public resources in PUBMED abstracts and journal contents. The proposed heritability database contains attribute information for trait descriptions, experimental conditions, trait-associated genomic loci and broad- and narrow-sense heritability specifications. Annotated trait-associated genomic loci, for which most are single-nucleotide polymorphisms derived from genome-wide association studies, may be valuable resources for experimental scientists. In addition, we assigned phenotype ontologies to the annotated traits for the purposes of discussing heritability distributions based on phenotypic classifications.
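For readers unfamiliar with the broad- and narrow-sense distinction mentioned in the abstract, the standard textbook definitions can be written as follows. This is a general sketch and not taken from H2DB itself; it assumes the simplest variance decomposition, with no genotype-environment covariance or interaction, where $V_P$ is phenotypic variance, $V_G$ genetic variance, $V_E$ environmental variance, and $V_A$, $V_D$, $V_I$ the additive, dominance and epistatic components.

```latex
\begin{align}
V_P &= V_G + V_E, & V_G &= V_A + V_D + V_I \\
H^2 &= \frac{V_G}{V_P} \quad \text{(broad sense)}, &
h^2 &= \frac{V_A}{V_P} \quad \text{(narrow sense)}
\end{align}
```

Because $V_A \le V_G$ under this decomposition, $h^2 \le H^2$ for the same trait and population.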


Subjects
Databases, Nucleic Acid; Genetic Loci; Quantitative Trait, Heritable; Animals; Genome; Humans; Internet; Molecular Sequence Annotation; Phenotype
10.
Plant Cell Physiol ; 55(2): 445-54, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24319074

ABSTRACT

Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.


Subjects
Genome, Plant/genetics; Polymorphism, Single Nucleotide; Solanum lycopersicum/genetics; Breeding; Chromosome Mapping; DNA, Intergenic; DNA, Plant/chemistry; DNA, Plant/genetics; Gene Library; Genomics; INDEL Mutation; Molecular Sequence Annotation; Mutation; Phenotype; Sequence Analysis, DNA; Species Specificity
11.
Nucleic Acids Res ; 40(Database issue): D38-42, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110025

ABSTRACT

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.


Subjects
Databases, Nucleic Acid; Genomics; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Internet; Oligonucleotide Array Sequence Analysis; Sequence Analysis, DNA; Sequence Analysis, RNA
12.
Proc Natl Acad Sci U S A ; 108(24): 10004-9, 2011 Jun 14.
Article in English | MEDLINE | ID: mdl-21613568

ABSTRACT

Genome integrity is continuously threatened by external stresses and endogenous hazards such as DNA replication errors and reactive oxygen species. The DNA damage checkpoint in metazoans ensures genome integrity by delaying cell-cycle progression to repair damaged DNA or by inducing apoptosis. ATM and ATR (ataxia-telangiectasia-mutated and -Rad3-related) are sensor kinases that relay the damage signal to transducer kinases Chk1 and Chk2 and to downstream cell-cycle regulators. Plants also possess ATM and ATR orthologs but lack obvious counterparts of downstream regulators. Instead, the plant-specific transcription factor SOG1 (suppressor of gamma response 1) plays a central role in the transmission of signals from both ATM and ATR kinases. Here we show that in Arabidopsis, endoreduplication is induced by DNA double-strand breaks (DSBs), but not directly by DNA replication stress. When root or sepal cells, or undifferentiated suspension cells, were treated with DSB inducers, they displayed increased cell size and DNA ploidy. We found that the ATM-SOG1 and ATR-SOG1 pathways both transmit DSB-derived signals and that either one suffices for endocycle induction. These signaling pathways govern the expression of distinct sets of cell-cycle regulators, such as cyclin-dependent kinases and their suppressors. Our results demonstrate that Arabidopsis undergoes a programmed endoreduplicative response to DSBs, suggesting that plants have evolved a distinct strategy to sustain growth under genotoxic stress.


Subjects
Arabidopsis/genetics; DNA Breaks, Double-Stranded/drug effects; DNA Damage; DNA Replication/drug effects; DNA, Plant/genetics; Arabidopsis/cytology; Arabidopsis/growth & development; Arabidopsis Proteins/genetics; Ataxia Telangiectasia Mutated Proteins; Bleomycin/toxicity; Cell Cycle Proteins/genetics; Cells, Cultured; Cisplatin/toxicity; DNA Breaks, Double-Stranded/radiation effects; DNA Replication/radiation effects; Gamma Rays; Gene Expression Profiling; Gene Expression Regulation, Developmental/drug effects; Gene Expression Regulation, Developmental/radiation effects; Gene Expression Regulation, Plant/drug effects; Gene Expression Regulation, Plant/radiation effects; Methyl Methanesulfonate/toxicity; Mutagens/toxicity; Mutation; Plant Roots/genetics; Plant Roots/growth & development; Ploidies; Protein Serine-Threonine Kinases/genetics; Signal Transduction/genetics; Transcription Factors/genetics; Ultraviolet Rays
13.
Nucleic Acids Res ; 39(Database issue): D22-7, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21062814

ABSTRACT

The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) provides a nucleotide sequence archive database and accompanying database tools for sequence submission, entry retrieval and annotation analysis. The DDBJ collected and released 3,637,446 entries/2,272,231,889 bases between July 2009 and June 2010. A highlight of the released data was archive datasets from next-generation sequencing reads of Japanese rice cultivar, Koshihikari submitted by the National Institute of Agrobiological Sciences. In this period, we started a new archive for quantitative genomics data, the DDBJ Omics aRchive (DOR). The DOR stores quantitative data both from the microarray and high-throughput new sequencing platforms. Moreover, we improved the content of the DDBJ patent sequence, released a new submission tool of the DDBJ Sequence Read Archive (DRA) which archives massive raw sequencing reads, and enhanced a cloud computing-based analytical system from sequencing reads, the DDBJ Read Annotation Pipeline. In this article, we describe these new functions of the DDBJ databases and support tools.


Subjects
Databases, Nucleic Acid; Amino Acid Sequence; Databases, Protein; Genomics; Molecular Sequence Annotation; Patents as Topic; Software
14.
Breed Sci ; 63(1): 14-20, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23641177

ABSTRACT

Completion of the tomato genome sequencing project has broad impacts on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv 'Heinz 1706' serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding.

15.
Nucleic Acids Res ; 38(Database issue): D33-8, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19850725

ABSTRACT

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has collected and released 1,701,110 entries/1,116,138,614 bases between July 2008 and June 2009. A few highlighted data releases from DDBJ were the complete genome sequence of an endosymbiont within protist cells in the termite gut and Cap Analysis Gene Expression tags for human and mouse deposited from the Functional Annotation of the Mammalian cDNA consortium. In this period, we started a novel user announcement service using Really Simple Syndication (RSS) to deliver a list of data released from DDBJ on a daily basis. Comprehensive visualization of DDBJ release data was attempted by using a word cloud program. Moreover, a new archive for sequencing data from next-generation sequencers, the 'DDBJ Read Archive' (DRA), was launched. Concurrently, for read data registered in DRA, a semi-automatic annotation tool called the 'DDBJ Read Annotation Pipeline' was released as a preliminary step. The pipeline consists of two parts: basic analysis for reference genome mapping and de novo assembly, and high-level analysis of structural and functional annotations. These new services will aid users' research and provide easier access to DDBJ databases.


Subjects
Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Algorithms; Animals; Computational Biology/trends; Databases, Protein; Genome, Bacterial; Humans; Information Storage and Retrieval/methods; Internet; Japan; Software
16.
Proc Natl Acad Sci U S A ; 106(7): 2453-8, 2009 Feb 17.
Article in English | MEDLINE | ID: mdl-19181858

ABSTRACT

The nonsense-mediated mRNA decay (NMD) pathway is a well-known eukaryotic surveillance mechanism that eliminates aberrant mRNAs that contain a premature termination codon (PTC). The UP-Frameshift (UPF) proteins, UPF1, UPF2, and UPF3, are essential for normal NMD function. Several NMD substrates have been identified, but detailed information on NMD substrates is lacking. Here, we noticed that, in Arabidopsis, most of the mRNA-like nonprotein-coding RNAs (ncRNAs) have the features of an NMD substrate. We examined the expression profiles of 2 Arabidopsis mutants, upf1-1 and upf3-1, using a whole-genome tiling array. The results showed that expression of not only protein-coding transcripts but also many mRNA-like ncRNAs (mlncRNAs), including natural antisense transcript RNAs (nat-RNAs) transcribed from the opposite strands of the coding strands, were up-regulated in both mutants. The percentage of the up-regulated mlncRNAs to all expressed mlncRNAs was much higher than that of the up-regulated protein-coding transcripts to all expressed protein-coding transcripts. This finding demonstrates that one of the most important roles of NMD is the genome-wide suppression of the aberrant mlncRNAs including nat-RNAs.


Subjects
Arabidopsis/genetics; Genome, Plant; RNA, Untranslated/genetics; Arabidopsis Proteins/metabolism; Cycloheximide/pharmacology; Exons; Gene Expression Regulation, Plant; Genes, Plant; Models, Biological; Models, Genetic; Mutation; Protein Synthesis Inhibitors/pharmacology; RNA/metabolism; RNA, Messenger/metabolism; Reverse Transcriptase Polymerase Chain Reaction
17.
Adv Exp Med Biol ; 680: 125-35, 2010.
Article in English | MEDLINE | ID: mdl-20865494

ABSTRACT

The Center for Information Biology and DNA Data Bank of Japan (CIB-DDBJ) has operated biological databases since 1987 in collaboration with NCBI and EBI. As one of the three major public databases, CIB-DDBJ has run four primary databases DDBJ, CIBEX, DDBJ Trace Archive (DTA), and DDBJ Read Archive (DRA) to collect, archive, and provide various kinds of biological data. As the massively parallel new sequencing platforms are increasingly in use, huge amounts of the raw data have been produced. To archive these raw data, we at CIB-DDBJ began operating a new repository, the DDBJ Read Archive (DRA). To accommodate efficiently the processed data as well, we have developed a new pipeline, the DDBJ Read Annotation Pipeline that deals with both data submission and analysis. For data produced by the next generation platforms, the three archives DRA, DDBJ, and CIBEX, which are interconnected by the pipeline, collect the raw, processed sequence, and quantitative data, respectively. The public biological databases at CIB-DDBJ, EBI, and NCBI will together construct world-wide archives for biological data by data sharing to accelerate research in life sciences in the era of next generation sequencing technologies.


Subjects
Databases, Nucleic Acid/statistics & numerical data; Sequence Analysis, DNA/statistics & numerical data; Computational Biology; Databases, Nucleic Acid/trends; Japan; Models, Statistical; Sequence Analysis, DNA/trends
18.
Genes Genet Syst ; 95(1): 43-50, 2020 Apr 22.
Article in English | MEDLINE | ID: mdl-32213716

ABSTRACT

Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants predicted chromatin feature annotations from DNA sequences with competing models. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The performance of the top model resulted in an area under the curve (AUC) score of 0.95. Over the course of the competition, the overall performance of the submitted models improved by an AUC score of 0.30 from the first submitted model. Furthermore, the 1st- and 2nd-ranking models utilised external data such as genomic location and gene annotation information with specific domain knowledge. The effect of incorporating this domain knowledge led to improvements of approximately 5%-9%, as measured by the AUC scores. This report suggests that machine learning competitions will lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.
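The AUC scores quoted in the abstract refer to the area under the ROC curve, which equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition in plain Python. It is not the challenge's scoring code, and the function name and example values are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    example has the higher score; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = feature present) and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
auc = roc_auc(labels, scores)  # 7 of 9 pairs are ranked correctly
```

The quadratic pairwise loop is chosen for clarity; for large data sets a rank-based computation or a library routine would be preferable.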


Subjects
Arabidopsis/genetics; Chromatin/genetics; Databases, Nucleic Acid; Genome, Plant/genetics; Machine Learning; Computational Biology; Crowdsourcing; Data Analysis; High-Throughput Nucleotide Sequencing; Japan; Molecular Sequence Annotation
19.
Plant J ; 56(3): 470-82, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18643999

ABSTRACT

Quantitative morphological traits may be defined based on the 3D anatomy reconstructed from micro X-ray computed tomography (microCT) images. In this study, the heterogeneous spatial distribution of trichomes (hairs) on the adaxial leaf blade surface in Arabidopsis was evaluated in terms of 3D quantitative traits, including trichome number, average nearest-neighbour distance between trichomes, and proportion of large trichomes. The data reflect spatial heterogeneity in the radial direction, in that a greater number of trichomes were observed on the leaf blade margins relative to the non-margins, a distribution effect caused by the CAPRICE (CPC) and GLABRA3 (GL3) genes, which have previously been shown to affect trichome density. We further determined that the proportion of large trichomes on the blade mid-rib increases from the proximal end to the distal leaf tip in both wild-type plants and GL3 mutants. Our results indicate that the CPC [corrected] gene affects trichome distribution, rather than trichome growth, causing trichome initiation at the proximal base rather than the distal tip. On the other hand, CPC does affect trichome growth and developmental progression. Hence, quantitative phenotyping based on microCT enables precise phenotypic description for elucidation of gene control in morphological mutants.


Subjects
Arabidopsis/cytology; Plant Epidermis/cytology; Plant Leaves/cytology; Tomography, X-Ray Computed/methods; Analysis of Variance; Arabidopsis/genetics; Arabidopsis Proteins/genetics; Basic Helix-Loop-Helix Transcription Factors/genetics; Genes, Plant; Imaging, Three-Dimensional; Models, Statistical; Phenotype; Plant Epidermis/genetics; Plant Leaves/genetics; Plants, Genetically Modified/cytology; Plants, Genetically Modified/genetics; Proto-Oncogene Proteins c-myb/genetics; Quantitative Trait, Heritable
20.
Plant Cell Physiol ; 50(9): 1715-20, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19633021

ABSTRACT

MicroRNAs (miRNAs) are 20-24 nucleotide endogenous regulatory molecules conserved in higher eukaryotes. In Arabidopsis, miRNAs are produced through step-wise cleavages of primary miRNA precursors (pri-miRNAs) by DICER-LIKE1 (DCL1). This cleavage step is also supported by a double-stranded RNA-binding protein, HYPONASTIC LEAVES1 (HYL1). In many cases, mature miRNA is predominantly incorporated into an endonuclease, ARGONAUTE1 (AGO1), which degrades miRNA-targeted mRNAs. Here, we examined and revealed whole genome transcriptomes in ago1-25 and hyl1-2 mutants using tiling arrays. The data in this paper are valuable for understanding the relationship between the miRNA pathway and its effect on transcriptomes.


Subjects
Arabidopsis Proteins/metabolism; Arabidopsis/genetics; Gene Expression Profiling; MicroRNAs/metabolism; RNA-Binding Proteins/metabolism; Arabidopsis/metabolism; Arabidopsis Proteins/genetics; Argonaute Proteins; Gene Expression Regulation, Plant; Genome, Plant; Mutation; Oligonucleotide Array Sequence Analysis; RNA, Plant/metabolism; RNA-Binding Proteins/genetics