DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu


Affiliations

Mochizuki T; Genome Informatics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.
Tanizawa Y; Genome Informatics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.
Fujisawa T; Genome Informatics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.
Ohta T; Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Mishima, Shizuoka, Japan.
Nikoh N; Department of Liberal Arts, The Open University of Japan, Chiba, Chiba, Japan.
Shimizu T; Division of Citrus Research, Institute of Fruit Tree and Tea Science, NARO, Shimizu, Shizuoka, Japan.
Toyoda A; Comparative Genomics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.
Fujiyama A; Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan.
Kurata N; Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan.
Nagasaki H; Plant Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.
Kaminuma E; Genome Informatics Group, Department of Technology Development, Kazusa DNA Research Institute, Kisarazu, Chiba, Japan.
Nakamura Y; Genome Informatics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.

PLoS One; 12(2): e0172269, 2017.

Article in English | MEDLINE | ID: mdl-28234924

ABSTRACT

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions differ among assays. Furthermore, such datasets have frequently been distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database that applies uniform quality control and is distributed from a single platform. The DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, including uniformity in the quality of the raw data, the reference genome version, and the evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets together with known-gene annotations for each DNA polymorphism. The database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and as a virtual machine image. Furthermore, DNApod provides tables linking identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic, widely used functions for multiple alignment and phylogenetic tree construction using orthologous gene information.

Subjects

DNA/genetics; Databases, Nucleic Acid; High-Throughput Nucleotide Sequencing/methods; Polymorphism, Genetic; Crops, Agricultural/genetics; DNA, Plant; Genes, Plant; Homozygote; Molecular Sequence Annotation; Oryza/genetics; Phenotype; Phylogeny; Reference Values; Reproducibility of Results; Software; Sorghum/genetics; Zea mays/genetics

Full text: 1 | Database: MEDLINE | Main subject: Polymorphism, Genetic / DNA / Databases, Nucleic Acid / High-Throughput Nucleotide Sequencing | Language: English | Publication year: 2017 | Document type: Article
