Results 1 - 20 of 39
1.
Genes Genet Syst ; 95(1): 43-50, 2020 Apr 22.
Article in English | MEDLINE | ID: mdl-32213716

ABSTRACT

Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants predicted chromatin feature annotations from DNA sequences with competing models. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The performance of the top model resulted in an area under the curve (AUC) score of 0.95. Over the course of the competition, the overall performance of the submitted models improved by an AUC score of 0.30 from the first submitted model. Furthermore, the 1st- and 2nd-ranking models utilised external data such as genomic location and gene annotation information with specific domain knowledge. The effect of incorporating this domain knowledge led to improvements of approximately 5%-9%, as measured by the AUC scores. This report suggests that machine learning competitions will lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.


Subject(s)
Arabidopsis/genetics , Chromatin/genetics , Databases, Nucleic Acid , Genome, Plant/genetics , Machine Learning , Computational Biology , Crowdsourcing , Data Analysis , High-Throughput Nucleotide Sequencing , Japan , Molecular Sequence Annotation
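
The abstract above ranks models by area under the curve (AUC). As a rough illustration of what that score measures, here is a minimal sketch that computes ROC AUC from the pairwise-ranking (Mann-Whitney) identity. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the challenge or its data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A pair counts 1 if the positive scores higher, 0.5 if the scores tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = feature present, 0 = feature absent.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(roc_auc(labels, scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the reported 0.95 should be read. The quadratic pair loop is fine for a small example; large evaluations would normally use a sort-based implementation.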
2.
Nucleic Acids Res ; 46(D1): D30-D35, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29040613

ABSTRACT

The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.


Subject(s)
Databases, Nucleic Acid , Academies and Institutes , Cloud Computing , Computational Biology , Confidentiality/legislation & jurisprudence , Databases, Nucleic Acid/history , Databases, Nucleic Acid/trends , Europe , Genetic Association Studies , History, 20th Century , History, 21st Century , Humans , Information Storage and Retrieval , International Cooperation , Japan , National Library of Medicine (U.S.) , United States
3.
Front Genet ; 8: 180, 2017.
Article in English | MEDLINE | ID: mdl-29259619

ABSTRACT

Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma ("Miyagawa Wase") was conducted by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and a long-read sequence of PacBio by the PLATANUS assembler. The assembled sequence, with a total size of 359.7 Mb at the N50 length of 386,404 bp, consisted of 20,876 scaffolds. Pseudomolecules of Satsuma constructed by aligning the scaffolds to three genetic maps showed genome-wide synteny to the genomes of Clementine, pummelo, and sweet orange. Gene prediction by modeling with MAKER-P proposed 29,024 genes and 37,970 mRNA; additionally, gene prediction analysis found candidates for novel genes in several biosynthesis pathways for gibberellin and violaxanthin catabolism. BUSCO scores for the assembled scaffold and predicted transcripts, and another analysis by BAC end sequence mapping indicated the assembled genome consistency was close to those of the haploid Clementine, pummelo, and sweet orange genomes. The number of repeat elements and long terminal repeat retrotransposons were comparable to those of the seven citrus genomes; this suggested no significant failure in the assembly at the repeat region. A resequencing application using the assembled sequence confirmed that both kunenbo-A and Satsuma are offspring of Kishu, and Satsuma is a back-crossed offspring of Kishu. These results illustrated the performance of the hybrid assembly approach and its ability to construct an accurate heterozygous diploid genome.
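
The abstract above reports an N50 length for the assembly. For readers unfamiliar with the statistic, the following is a minimal sketch of the usual definition: the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the scaffold lengths are hypothetical and are not taken from the Satsuma assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths in bp.
scaffolds = [100, 80, 60, 40, 20]
print(n50(scaffolds))
```

In this toy case the total is 300 bp, and the two longest scaffolds together (180 bp) already cover half of it, so the N50 is the length of the second scaffold.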

4.
Biosci Microbiota Food Health ; 36(3): 129-134, 2017.
Article in English | MEDLINE | ID: mdl-28748134

ABSTRACT

Whole-genome sequencing was performed for Lactobacillus parakefiri JCM 8573T to confirm its hitherto controversial taxonomic position. Here, we report its first reliable reference genome. Genome-wide metrics, such as average nucleotide identity and digital DNA-DNA hybridization, and phylogenomic analysis based on multiple genes supported its taxonomic status as a distinct species in the genus Lactobacillus. The availability of a reliable genome sequence will aid future investigations on the industrial applications of L. parakefiri in functional foods such as kefir grains.

5.
Sci Rep ; 7(1): 4721, 2017 07 05.
Article in English | MEDLINE | ID: mdl-28680114

ABSTRACT

Novel genomics-based approaches such as genome-wide association studies (GWAS) and genomic selection (GS) are expected to be useful in fruit tree breeding, which requires much time from the cross to the release of a cultivar because of the long generation time. In this study, a citrus parental population (111 varieties) and a breeding population (676 individuals from 35 full-sib families) were genotyped for 1,841 single nucleotide polymorphisms (SNPs) and phenotyped for 17 fruit quality traits. GWAS power and prediction accuracy were increased by combining the parental and breeding populations. A multi-kernel model considering both additive and dominance effects improved prediction accuracy for acidity and juiciness, implying that the effects of both types are important for these traits. Genomic best linear unbiased prediction (GBLUP) with linear ridge kernel regression (RR) was more robust and accurate than GBLUP with non-linear Gaussian kernel regression (GAUSS) in the tails of the phenotypic distribution. The results of this study suggest that both GWAS and GS are effective for genetic improvement of citrus fruit traits. Furthermore, the data collected from breeding populations are beneficial for increasing the detection power of GWAS and the prediction accuracy of GS.


Subject(s)
Citrus/genetics , Genome-Wide Association Study/methods , Genomics/methods , Quantitative Trait Loci , Genome, Plant , Models, Genetic , Phenotype , Plant Breeding , Polymorphism, Single Nucleotide , Selection, Genetic , Sequence Analysis, DNA
6.
PLoS One ; 12(2): e0172269, 2017.
Article in English | MEDLINE | ID: mdl-28234924

ABSTRACT

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.


Subject(s)
DNA/genetics , Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Genetic , Crops, Agricultural/genetics , DNA, Plant , Genes, Plant , Homozygote , Molecular Sequence Annotation , Oryza/genetics , Phenotype , Phylogeny , Reference Values , Reproducibility of Results , Software , Sorghum/genetics , Zea mays/genetics
7.
Nucleic Acids Res ; 45(D1): D25-D31, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924010

ABSTRACT

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA), with the National Bioscience Database Center, to collect human subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center improves semantic web technologies to integrate and share biological data, providing an RDF version of the sequence data.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Animals , Genotype , Humans , Internet , Japan , Molecular Sequence Annotation , Phenotype , Software
8.
PLoS One ; 11(11): e0166969, 2016.
Article in English | MEDLINE | ID: mdl-27902727

ABSTRACT

Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy-Weinberg equilibrium deduces three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies.


Subject(s)
Cell Nucleus/genetics , Chloroplasts/genetics , Citrus/genetics , DNA, Plant/genetics , Genetic Variation , Genome, Mitochondrial/genetics , Genomics , Citrus/classification , Genetic Markers/genetics , Genome, Plant/genetics , Genotyping Techniques , Phylogeny
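
The abstract above restricts its structure analysis to markers that are in Hardy-Weinberg equilibrium. It does not say how equilibrium was tested, so the following is only a sketch of one common check, a chi-square goodness-of-fit test, simplified to a biallelic marker (the study's simple sequence repeat markers can carry more than two alleles). The function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (a monomorphic marker).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts (AA, AB, BB) for two markers.
print(hwe_chi_square(25, 50, 25))  # matches the expected proportions exactly
print(hwe_chi_square(50, 0, 50))   # no heterozygotes at all
```

With one degree of freedom, a statistic above roughly 3.84 would reject equilibrium at the 5% level, so under this simplified test the second marker would be excluded and the first retained.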
9.
Biosci Microbiota Food Health ; 35(4): 173-184, 2016.
Article in English | MEDLINE | ID: mdl-27867804

ABSTRACT

Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
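
The abstract above uses average nucleotide identity (ANI) with a 95% threshold to decide whether two genomes belong to the same species. As a hedged illustration of how such a threshold can be applied to a set of genomes, the sketch below groups genomes whose pairwise ANI meets the threshold, using single-linkage grouping with a small union-find. This is not DFAST's implementation; the function name `cluster_by_ani`, the genome labels, and the ANI values are all hypothetical, and computing ANI itself is out of scope here.

```python
def cluster_by_ani(genomes, pairwise_ani, threshold=95.0):
    """Group genomes linked by any pairwise ANI at or above the threshold."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), ani in pairwise_ani.items():
        if ani >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical genomes and pairwise ANI values (percent).
genomes = ["g1", "g2", "g3", "g4"]
pairwise_ani = {
    ("g1", "g2"): 98.7,
    ("g1", "g3"): 93.1,
    ("g2", "g3"): 92.8,
    ("g3", "g4"): 96.2,
}
print(cluster_by_ani(genomes, pairwise_ani))
```

Single linkage is a deliberate simplification: a chain of pairs each above the threshold ends up in one group even if its endpoints fall below 95% with each other, which is the kind of intraspecific structure the abstract reports for some species.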

10.
Nucleic Acids Res ; 44(D1): D51-7, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578571

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Biological Ontologies , Computers , Genotype , Phenotype
11.
Plant Cell Physiol ; 57(1): e1, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26578696

ABSTRACT

The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.


Subject(s)
Databases, Genetic , Genetic Variation , Genome, Plant/genetics , Genomics , Oryza/genetics , Genotype , Phenotype , Polymorphism, Single Nucleotide
12.
BMC Genomics ; 16: 240, 2015 Mar 25.
Article in English | MEDLINE | ID: mdl-25879859

ABSTRACT

BACKGROUND: Lactobacillus hokkaidonensis is an obligate heterofermentative lactic acid bacterium, which was isolated from Timothy grass silage in Hokkaido, a subarctic region of Japan. This bacterium is expected to be useful as a silage starter culture in cold regions because of its remarkable psychrotolerance; it can grow at temperatures as low as 4°C. To elucidate its genetic background, particularly in relation to the source of psychrotolerance, we constructed the complete genome sequence of L. hokkaidonensis LOOC260(T) using PacBio single-molecule real-time sequencing technology. RESULTS: The genome of LOOC260(T) comprises one circular chromosome (2.28 Mbp) and two circular plasmids: pLOOC260-1 (81.6 kbp) and pLOOC260-2 (41.0 kbp). We identified diverse mobile genetic elements, such as prophages, integrated and conjugative elements, and conjugative plasmids, which may reflect adaptation to plant-associated niches. Comparative genome analysis also detected unique genomic features, such as genes involved in pentose assimilation and NADPH generation. CONCLUSIONS: This is the first complete genome in the L. vaccinostercus group, which is poorly characterized, so the genomic information obtained in this study provides insight into the genetics and evolution of this group. We also found several factors that may contribute to the ability of L. hokkaidonensis to grow at cold temperatures. The results of this study will facilitate further investigation of the cold-tolerance mechanism of L. hokkaidonensis.


Subject(s)
Genome, Bacterial , Lactobacillus/genetics , Silage/microbiology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Chromosome Mapping , Chromosomes, Bacterial/chemistry , Chromosomes, Bacterial/metabolism , Comparative Genomic Hybridization , Lactobacillus/classification , Lactobacillus/isolation & purification , NADP/metabolism , NADP Transhydrogenases/genetics , NADP Transhydrogenases/metabolism , Phylogeny , Plasmids/genetics , Plasmids/metabolism , Sequence Analysis, DNA
13.
Nucleic Acids Res ; 43(Database issue): D18-22, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25477381

ABSTRACT

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. DDBJ Center provides the JGA database system which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper describes the overview of the JGA project, updates to the DDBJ databases, and services for data retrieval, analysis and integration.


Subject(s)
Databases, Nucleic Acid , Genotype , Phenotype , Genetic Association Studies , Humans , Internet , Sequence Analysis, DNA
14.
Biomed Res Int ; 2014: 303451, 2014.
Article in English | MEDLINE | ID: mdl-25243128

ABSTRACT

In plants, miRNAs and siRNAs, such as transacting siRNAs (ta-siRNAs), affect their targets through distinct regulatory mechanisms. In this study, the expression profiles of small RNAs (smRNAs) in Arabidopsis plants subjected to drought, cold, and high-salinity stress were analyzed using 454 DNA sequencing technology. Expression of three groups of ta-siRNAs (TAS1, TAS2, and TAS3) and their precursors was downregulated in Arabidopsis plants subjected to drought and high-salinity stress. Analysis of ta-siRNA synthesis mutants and mutated ARF3-overexpressing plants that escape the tasiRNA-ARF target indicated that self-pollination was hampered by short stamens in plants under drought and high-salinity stress. Microarray analysis of flower buds of rdr6 and wild-type plants under drought stress and nonstressed conditions revealed that expression of floral development- and auxin response-related genes was affected by drought stress and by the RDR6 mutation. The overall results of the present study indicated that tasiRNA-ARF is involved in maintaining the normal morphogenesis of flowers in plants under stress conditions through fine-tuning expression changes of floral development-related and auxin response-related genes.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/anatomy & histology , Arabidopsis/physiology , DNA-Binding Proteins/metabolism , Droughts , Flowers/anatomy & histology , Nuclear Proteins/metabolism , RNA, Small Interfering/metabolism , Stress, Physiological , Arabidopsis/genetics , Down-Regulation/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant , Genes, Plant , Indoleacetic Acids/metabolism , Models, Biological , Molecular Sequence Data , Mutation/genetics , Oligonucleotide Array Sequence Analysis , Pollination/genetics , RNA, Plant/genetics , RNA, Plant/metabolism , RNA-Dependent RNA Polymerase/metabolism , Self-Fertilization/genetics , Sequence Analysis, RNA , Signal Transduction , Stress, Physiological/genetics
15.
Genome Announc ; 2(4), 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25169865

ABSTRACT

We report the 1.86-Mb draft genome and annotation of Lactobacillus oryzae SG293(T) isolated from fermented rice grains. This genome information may provide further insights into the mechanisms underlying the fermentation of rice grains.

16.
Genome Announc ; 2(4), 2014 Jul 10.
Article in English | MEDLINE | ID: mdl-25013139

ABSTRACT

Weissella oryzae was originally isolated from fermented rice grains. Here we report the draft genome sequence of the type strain of W. oryzae. This first report on the genomic sequence of this species may help identify the mechanisms underlying bacterial adaptation to the ecological niche of fermented rice grains.

17.
Nucleic Acids Res ; 42(Database issue): D44-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24194602

ABSTRACT

The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. This database content is shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). DDBJ launched a new nucleotide sequence submission system for receiving traditional nucleotide sequences. We expect that the new submission system will be useful for many submitters to input accurate annotation and reduce the time needed for data input. In addition, DDBJ has started a new service, the Japanese Genotype-phenotype Archive (JGA), with our partner institute, the National Bioscience Database Center (NBDC). JGA permanently archives and shares all types of individual human genetic and phenotypic data. We also introduce improvements in the DDBJ services and databases made during the past year.


Subject(s)
Base Sequence , Databases, Nucleic Acid , Molecular Sequence Annotation , Genomics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Internet , Phenotype
18.
Plant Cell Physiol ; 55(2): 445-54, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24319074

ABSTRACT

Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.


Subject(s)
Genome, Plant/genetics , Polymorphism, Single Nucleotide , Solanum lycopersicum/genetics , Breeding , Chromosome Mapping , DNA, Intergenic , DNA, Plant/chemistry , DNA, Plant/genetics , Gene Library , Genomics , INDEL Mutation , Molecular Sequence Annotation , Mutation , Phenotype , Sequence Analysis, DNA , Species Specificity
19.
DNA Res ; 20(4): 383-90, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23657089

ABSTRACT

High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.


Subject(s)
Genomics , Molecular Sequence Annotation/methods , Sequence Analysis, DNA/methods , Software , High-Throughput Nucleotide Sequencing , Internet
20.
Breed Sci ; 63(1): 14-20, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23641177

ABSTRACT

Completion of the tomato genome sequencing project has broad impacts on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv 'Heinz 1706' serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding.
