ABSTRACT
Tomato (Solanum lycopersicum L., Solanaceae) is an excellent model plant for genomic research on solanaceous plants, as well as for studying the development, ripening, and metabolism of fruit. In 2003, the International Solanaceae Project (SOL, www.sgn.cornell.edu) was initiated by members from more than 30 countries, and the tomato genome-sequencing project is currently underway. The tomato genome sequence obtained by this project will provide a firm foundation for forthcoming genomic studies such as the comparative analysis of genes conserved among the Solanaceae species and the elucidation of the functions of unknown tomato genes. To exploit the wealth of genome sequence information, there is an urgent need for novel resources and analytical tools for tomato functional genomics. Here, we present an overview of the development of genetic and genomic resources of tomato in the last decade, with a special focus on the activities of Japan SOL and the National Bio-Resource Project in the development of functional genomic resources of a model cultivar, Micro-Tom.
ABSTRACT
For comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 22,983 5'-end expressed sequence tags (ESTs) were accumulated from normalized and size-selected cDNA libraries constructed from young (2 weeks old) plants. The EST sequences were clustered into 7137 non-redundant groups. A similarity search against the public non-redundant protein database indicated that 3302 groups showed similarity to genes of known function, 1143 groups to hypothetical genes, and 2692 were novel sequences. Homologues of 5 nodule-specific genes which have been reported in other legume species were found among the collected ESTs, suggesting that the EST source generated in this study will become a useful tool for identification of genes related to legume-specific biological processes. The sequence data of individual ESTs are available at the web site: http://www.kazusa.or.jp/en/plant/lotus/EST/.
Subjects
Expressed Sequence Tags , Fabaceae/genetics , Genome, Plant , Plants, Medicinal , Codon , Gene Library , Molecular Sequence Data , RNA, Plant/analysis , Sequence Alignment
ABSTRACT
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Subjects
Arabidopsis/genetics , DNA, Complementary/metabolism , Expressed Sequence Tags , Gene Library , Molecular Sequence Data , Poly A/metabolism , Tissue Distribution
ABSTRACT
The complete nucleotide sequence of the chloroplast genome of Arabidopsis thaliana has been determined. The genome is a circular DNA of 154,478 bp containing a pair of inverted repeats of 26,264 bp, which are separated by small and large single copy regions of 17,780 bp and 84,170 bp, respectively. A total of 87 potential protein-coding genes including 8 genes duplicated in the inverted repeat regions, 4 ribosomal RNA genes and 37 tRNA genes (30 gene species) representing 20 amino acid species were assigned to the genome on the basis of similarity to the chloroplast genes previously reported for other species. The translated amino acid sequences from the respective potential protein-coding genes showed 63.9% to 100% sequence similarity to those of the corresponding genes in the chloroplast genome of Nicotiana tabacum, indicating the occurrence of significant diversity in the chloroplast genes between the two dicot plants. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
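The region lengths reported in this abstract can be checked against the stated genome size. The following is a minimal sketch, assuming the usual quadripartite layout (one large single copy region, one small single copy region, and two inverted repeat copies); the function and constant names are illustrative and do not come from the paper.

```python
# Region lengths in bp, as reported in the abstract.
INVERTED_REPEAT = 26_264
SMALL_SINGLE_COPY = 17_780
LARGE_SINGLE_COPY = 84_170


def circular_genome_length(ir: int, ssc: int, lsc: int) -> int:
    """Length of a quadripartite plastid genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = circular_genome_length(INVERTED_REPEAT, SMALL_SINGLE_COPY, LARGE_SINGLE_COPY)
print(total)  # 154478
```

The sum matches the 154,478 bp reported for the complete genome.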
Subjects
Arabidopsis/genetics , Chloroplasts/genetics , Genome, Plant , Anticodon/genetics , Chloroplasts/metabolism , Codon/genetics , Cyanobacteria/genetics , Molecular Sequence Data , Multigene Family , Plant Proteins/genetics , RNA, Plant/genetics , RNA, Ribosomal/genetics , RNA, Transfer/genetics , Sequence Analysis, DNA , Software
ABSTRACT
To understand genetic information carried in a unicellular green alga, Chlamydomonas reinhardtii, normalized and size-selected cDNA libraries were constructed from cells at photoautotrophic growth, and a total of 11,571 5'-end sequence tags were established. These sequences were grouped into 3433 independent EST species. Similarity search against the public non-redundant protein database indicated that 817 groups showed significant similarity to registered sequences, of which 140 were of previously identified C. reinhardtii genes, but the remaining 2616 species were novel sequences. The coverage of full-length protein coding regions was estimated to be over 60%. These cDNA clones and EST sequence information will provide a powerful source for the future genome-wide functional analysis of uncharacterized genes.
Subjects
Chlamydomonas reinhardtii/genetics , Expressed Sequence Tags , Gene Library , Animals , Base Sequence , DNA, Complementary , Databases, Factual , Molecular Sequence Data
ABSTRACT
In our ongoing project to deduce the nucleotide sequence of Arabidopsis thaliana chromosome 5, non-redundant P1 and TAC clones have been sequenced on the basis of the fine physical map, and as of January 2000, the sequences of 16.6 Mb representing approximately 60% of chromosome 5 have been accumulated and released at our web site. Along with the sequence determination, structural features of the sequenced regions have been analyzed by applying a variety of computer programs, and we have already predicted a total of 2697 potential protein-coding genes in the 11,166,130 bp regions, which are covered by 159 P1 and TAC clones. In this paper, we describe the structural features of the 3,076,755 bp regions covered by 60 newly analyzed P1 and TAC clones. A total of 715 potential protein-coding genes were identified, giving an average density of 1 gene per 4001 bp. Introns were observed in 80% of the genes, and the average number per gene and the average length of the introns were 4.5 and 147 bp, respectively. These sequence features are nearly identical to those in our latest report, in which the data were compiled based on a new standard of gene assignment including the computer-predicted hypothetical genes. The regions also contained 12 tRNA genes when searched by similarity to reported tRNA genes and with the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.
Subjects
Arabidopsis/genetics , Chromosomes/genetics , Genome, Plant , Databases, Factual , Expressed Sequence Tags , Internet , Physical Chromosome Mapping , Sequence Analysis, DNA
ABSTRACT
A total of 56 TAC clones with an average insert size of 100 kb were isolated from a TAC library of the Lotus japonicus genome based on expressed sequence tag (EST), cDNA and gene information, and their nucleotide sequences were determined according to the shotgun-based strategy. The total length of the sequenced regions is 5,473,195 bp. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 605 potential protein-encoding genes with known or predicted functions, 69 gene segments, and 172 pseudogenes were identified. The average density of the genes assigned so far is 1 gene/8120 bp. Introns were identified in approximately 78% of the potential genes. There was an average of 3.8 introns per gene and the average length of the introns was 375 bp. DNA markers were generated based on the nucleotide sequences obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of L. japonicus Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.
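The gene density quoted in this abstract can be reproduced with simple arithmetic. The sketch below assumes that the denominator counts the 605 potential genes plus the 69 gene segments and excludes the 172 pseudogenes; that choice is an inference from the numbers, not something the abstract states, and the function name is illustrative.

```python
def bp_per_gene(total_bp: int, n_features: int) -> float:
    """Average number of base pairs per annotated feature."""
    if n_features <= 0:
        raise ValueError("n_features must be positive")
    return total_bp / n_features


sequenced_bp = 5_473_195        # total sequenced length from the abstract
genes, gene_segments = 605, 69  # assumption: pseudogenes are not counted

density = bp_per_gene(sequenced_bp, genes + gene_segments)
print(round(density))  # 8120
```

Under that assumption the result rounds to the reported 1 gene per 8120 bp.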
Subjects
Chromosome Mapping , Genes, Plant/genetics , Genome, Plant , Lotus/genetics , Base Sequence , Biomarkers/analysis , Chromosomes , DNA, Plant , Expressed Sequence Tags , Gene Expression , Gene Library , Lotus/growth & development , Molecular Sequence Data , Sequence Analysis
ABSTRACT
A total of 13 P1 clones, each containing a marker(s) specifically mapped on chromosome 5, were isolated from a P1 library of the Arabidopsis thaliana Columbia genome, and their nucleotide sequences were determined according to the shotgun-based strategy and precisely located on the physical map of chromosome 5. The total length of the sequenced regions was 1,044,062 bp. Since we have previously reported the sequence of 1,621,245 bp by analysis of 20 non-redundant P1 clones, the total length of the sequences of chromosome 5 determined so far has reached 2,665,307 bp. The regions sequenced in this study were analysed by comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling; a total of 225 potential protein-coding genes and/or gene segments with known or predicted functions were identified. The positions of exons which do not exhibit similarity to known genes were also predicted by computer-aided analysis. The average density of the genes and/or gene segments was 1 gene/4,640 bp. Introns were identified in approximately 84% of the potential genes, and the average number and length of the introns per gene were 5.3 and 184 bp, respectively. These sequence features are essentially identical to those for the previously sequenced regions. The transcription level of the predicted genes has been roughly monitored by counting the numbers of matched Arabidopsis ESTs. The sequence data and gene information are available through the World Wide Web at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Chromosome Mapping , Cloning, Molecular
ABSTRACT
A total of 17 P1 and TAC clones, each containing a marker(s) specifically mapped on chromosome 5, were isolated from P1 and TAC libraries of the Arabidopsis thaliana Columbia genome, and their nucleotide sequences were determined according to the shotgun-based strategy and precisely located on the physical map of chromosome 5. The total length of the clones sequenced in this study was 1,191,918 bp. As we have previously reported the sequence of 2,662,078 bp by analysis of 33 P1 clones, the total length of the sequences of chromosome 5 determined so far is now 3,853,996 bp. The sequences determined in this study were subjected to similarity search against protein and EST databases and analysis with computer programs for gene modeling, and a total of 310 potential protein-coding genes and/or gene segments with known or predicted functions were identified. The positions of exons which do not show apparent similarity to known genes were also predicted by computer-aided analysis. The average density of the assigned genes and/or gene segments was 1 gene/3,845 bp. Introns were identified in 78% of the potential protein genes, and the average number per gene and the average length of the introns were 3.7 and 185 bp, respectively. The numbers of the Arabidopsis ESTs matched to each of the predicted genes have been counted to monitor the transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Chromosome Mapping , Base Sequence , Chromosomes , Cloning, Molecular , DNA, Plant , Gene Expression , Genes, Plant , Molecular Sequence Data
ABSTRACT
A total of 20 P1 clones with an average insert size of 80 kb, each containing a marker(s) specifically mapped on chromosome 5, were isolated from a P1 library of the Arabidopsis thaliana genome, and their nucleotide sequences were determined according to a shotgun-based strategy and precisely located on a separately constructed physical map of chromosome 5. The total length of the sequenced regions was 1,621,245 bp. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 347 potential protein-coding genes and/or gene segments with known or predicted functions were identified. The positions of exons which do not exhibit any similarity to known genes were also predicted. The average density of the genes and/or gene segments assigned so far was 1 gene/4,672 bp. Introns were identified in approximately 78% of the potential genes, and the average number and length of the introns per gene were 3.7 and 161 bp, respectively. The transcription level of the predicted genes was roughly monitored by counting the numbers of identified Arabidopsis ESTs. The sequence data and gene information are available through the World Wide Web at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Chromosome Mapping , DNA, Plant/analysis , Genomic Library , Open Reading Frames , Genetic Markers/genetics , Sequence Analysis, DNA
ABSTRACT
Nineteen P1 and TAC clones, which have been mapped on the fine physical map of the Arabidopsis thaliana chromosome 5, were sequenced according to the shotgun-based strategy, and their structural features were analysed. The total length of the regions sequenced in this study was 1,367,185 bp. Combining this with the regions covered by 90 P1 and TAC clones previously reported, the total length of chromosome 5 sequenced to date becomes 8,058,855 bp. On the basis of similarity search against protein and EST databases and gene modeling with computer programs, a total of 330 potential protein-coding regions were identified, bringing an average density of the genes to approximately one gene per 4.1 kb. Introns were identified in 81.0% of the potential protein genes for which the entire gene structure was predicted, with an average number per gene of 4.2 and an average length of the introns of 180 bp. The RNA-coding genes identified were 9 tRNA genes corresponding to 8 amino acid species and 2 genes for U2 nuclear RNA. These sequence features are essentially identical to those in the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Genes, Plant , Cloning, Molecular , DNA, Plant/genetics , Exons , Genetic Markers , Genomic Library , Introns , Molecular Sequence Data , Physical Chromosome Mapping , Plant Proteins/genetics , Plant Proteins/metabolism , Sequence Analysis, DNA , Software
ABSTRACT
A total of 17 P1 and TAC clones, each representing an assigned region of chromosome 5, were isolated from P1 and TAC genomic libraries of Arabidopsis thaliana Columbia, and their nucleotide sequences were determined. The length of the clones sequenced in this study totaled 1,081,958 bp. As we have previously reported the sequence of 9,072,622 bp by analysis of 125 P1 and TAC clones, the total length of the sequences of chromosome 5 determined so far is now 10,154,580 bp. The sequences were subjected to similarity search against protein and EST databases and analysis with computer programs for gene modeling. As a consequence, a total of 253 potential protein-coding genes with known or predicted functions were identified. The positions of exons which do not show apparent similarity to known genes were also assigned using computer programs for exon prediction. The average density of the genes identified in this study was 1 gene per 4277 bp. Introns were observed in 74% of the potential protein genes, and the average number per gene and the average length of the introns were 4.3 and 168 bp, respectively. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Bacteriophage P1/genetics , Databases, Factual , Gene Expression , Genetic Markers , Genomic Library , Models, Biological , Physical Chromosome Mapping , Sequence Analysis, DNA
ABSTRACT
Based on the physical map of Arabidopsis thaliana chromosome 3 previously constructed with CIC YAC, TAC, P1 and BAC clones (Sato, S. et al., DNA Res., 5, 163-168, 1998), a total of 60 P1 and TAC clones were sequenced, and the sequence features of the resulting 4,504,864 bp regions were analyzed by applying various computer programs for similarity search and gene modeling. As a result, a total of 1054 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4066 bp. Introns were observed in 77% of the genes, and the average number per gene and the average length of the introns were 3.9 and 156 bp, respectively. These sequence features are essentially identical to those of chromosome 5 in our previous reports, but the gene density was slightly higher than that observed for chromosomes 2 and 4. The regions also contained 10 tRNA genes when searched by similarity to reported tRNA genes and with the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.
Subjects
Arabidopsis/genetics , Chromosomes/genetics , Genes, Plant , Arabidopsis/metabolism , Sequence Analysis, DNA
ABSTRACT
To characterize genes whose expression is induced in carbon-stress conditions, 12,969 and 13,450 5'-end expressed sequence tags (ESTs) were generated from cells grown in low-CO2 and high-CO2 conditions of the unicellular green alga, Chlamydomonas reinhardtii. These ESTs were clustered into 4436 and 3566 non-redundant EST groups, respectively. Comparison of their sequences with those of 3433 non-redundant ESTs previously generated from the cells under the standard growth condition indicated that 2665 and 1879 EST groups occurred only in the low-CO2 and high-CO2 populations, respectively. It was also noted that 96.2% and 96.0% of the cDNA species respectively obtained from the low-CO2 and high-CO2 conditions had no similar EST sequence deposited in the public databases. The EST species identified only in the low-CO2 treated cells included genes previously reported to be expressed specifically in low-CO2 acclimatized cells, suggesting that the ESTs generated in this study will be a useful source for analysis of genes related to carbon-stress acclimatization. The sequence information and search results of each clone will appear at the web site: http://www.kazusa.or.jp/en/plant/chlamy/EST/.
Subjects
Chlamydomonas reinhardtii/genetics , Expressed Sequence Tags , Animals , Carbon Dioxide/pharmacology , Chlamydomonas reinhardtii/drug effects , Chlamydomonas reinhardtii/physiology , DNA, Complementary/genetics , Gene Library
ABSTRACT
The nucleotide sequences of 21 P1 and TAC clones, which have been precisely localized to the fine physical map of Arabidopsis thaliana chromosome 5, were determined, and their sequence features were analyzed. The total length of the regions sequenced in this study was 1,381,565 bp, bringing the total length of the chromosome 5 sequences determined so far to 6,691,670 bp together with the regions of the 69 clones previously reported. By computer-aided analyses including similarity search against protein and EST databases and gene modeling with computer programs, a total of 337 potential protein-coding genes and/or gene segments were identified on the basis of similarity to the reported gene sequences. The average density of the genes and/or gene segments thus assigned was 1 gene/4,100 bp. Introns were identified in 76.7% of the potential protein genes for which the entire gene structure was predicted, and the average number per gene and the average length of the introns were 3.9 and 176 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The numbers of the Arabidopsis ESTs matched to each of the predicted genes have been counted to monitor the transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Chromosome Mapping , DNA, Plant/genetics , Genes, Plant/genetics , Cloning, Molecular , Exons , Genetic Markers , Genomic Library , Introns , Sequence Analysis, DNA
ABSTRACT
Nineteen P1 and TAC clones, which have been precisely localized to the fine physical map of Arabidopsis thaliana chromosome 5, were newly sequenced, and their sequence features were analysed. The total length of the clones sequenced was 1,456,315 bp. Together with the previously reported sequences, the regions of chromosome 5 that have been sequenced to date now total 5,310,105 bp. When the sequences determined in this study were subjected to similarity search against protein and expressed sequence tag (EST) databases and analysis with computer programs for gene modeling, a total of 354 potential protein-coding genes and/or gene segments were identified. The average density of the assigned genes and/or gene segments was one gene per 4,114 bp. Introns were identified in 75% of the potential protein genes, and the average number per gene and the average length of the introns were 3.7 and 194 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The numbers of the Arabidopsis ESTs matched to each of the predicted genes have been counted to monitor the transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (the Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Genes, Plant , Chromosome Mapping , DNA, Plant/genetics , Sequence Analysis, DNA
ABSTRACT
To deduce the entire sequence of the top arm of Arabidopsis thaliana chromosome 3, sequence determination was performed on a total of 90 P1, TAC and BAC clones chosen according to our sequencing strategy. Sequence features of the resulting 4,251,695 bp regions were analyzed with various computer programs for similarity search and gene modeling. As a result, a total of 941 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4210 bp. Introns were observed in 73% of the genes, and the average number per gene and the average length of the introns were 3.6 and 159 bp, respectively. These sequence features are essentially identical to those of chromosomes 3 and 5 in our previous reports. The regions also contained 14 tRNA genes when searched by similarity to reported tRNA genes and with the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.
Subjects
Arabidopsis/genetics , Chromosomes , Genes, Plant , Chromosome Mapping , Computer Simulation , Databases, Factual , Expressed Sequence Tags , Genetic Markers , Introns , Models, Genetic , Physical Chromosome Mapping , RNA, Transfer/genetics
ABSTRACT
A total of 10,154 5'-end expressed sequence tags (ESTs) were established from the normalized and size-selected cDNA libraries of a marine red alga, Porphyra yezoensis. Among the ESTs, 2140 were unique species, and the remaining 8014 were grouped into 1127 species. A database search of the 3267 non-redundant ESTs with the BLAST algorithm showed that the sequences of 1080 species (33.1%) have similarity to those of registered genes from various organisms including higher plants, mammals, yeasts, and cyanobacteria, while 2187 (66.9%) are novel. Codon usage analysis in the coding regions of 101 non-redundant EST groups showing significant similarity to known genes indicated a higher GC content at the third position of codons (79.4%) than at the first (62.2%) and the second position (45.0%), suggesting that the genome has been exposed to high GC pressure during evolution. The sequence data of individual ESTs are available at the web site http://www.kazusa.or.jp/en/plant/porphyra/EST/.
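The codon-position GC comparison in this abstract rests on a simple count. The following is a minimal sketch of how GC content at each codon position might be computed for one in-frame coding sequence; the function name and the toy sequence are hypothetical, and the abstract does not describe the authors' actual procedure or software.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """GC fraction at codon positions 1, 2 and 3 of an in-frame coding sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence must contain at least one complete codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            counts[i % 3] += 1
    return (counts[0] / n_codons, counts[1] / n_codons, counts[2] / n_codons)


# Toy sequence (hypothetical, not from the paper): ATG GCC AAG CTG
print(gc_by_codon_position("ATGGCCAAGCTG"))  # (0.5, 0.25, 1.0)
```

A third-position value well above the first- and second-position values, as in the reported 79.4% versus 62.2% and 45.0%, is the pattern the abstract interprets as GC pressure.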
Subjects
Expressed Sequence Tags , Rhodophyta/genetics , Algorithms , Codon , DNA, Complementary/metabolism , Databases, Factual , Gene Library , Molecular Sequence Data , Software
ABSTRACT
In this series of projects to sequence Arabidopsis thaliana chromosome 5 in its entirety, non-redundant P1 and TAC clones have been sequenced according to the fine physical map, and as of May 7, 1999, the sequences of 16.2 Mb representing approximately 60% of chromosome 5 have been accumulated and released at our web site. In parallel, structural features of the sequenced regions have been analyzed by applying a variety of computer programs, and to date we have predicted a total of 2380 potential protein-coding genes in the 10,154,580 bp regions, which are covered by 142 P1 and TAC clones. In this paper, we newly analyzed the structural features of the 1,011,550 bp regions covered by an additional 17 P1 and TAC clones, and predicted 298 protein-coding genes. The average density of the genes identified was 1 gene per 3394 bp. Introns were observed in 67% of the genes, and the average number per gene and the average length of the introns were 3.2 and 159 bp, respectively. The gene density was higher than the value estimated for the previously analyzed regions (1 gene per 4,267 bp) because the data in this paper were compiled based on a new standard of gene assignment including the computer-predicted hypothetical genes. The regions also contained 8 tRNA genes when searched by similarity to reported tRNA genes and with the tRNAscan-SE program. The sequence data and information on the potential genes are available on the database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
Subjects
Arabidopsis/genetics , Chromosome Mapping , Gene Expression , Sequence Alignment , Base Sequence , Chromosomes, Artificial, Yeast , Cloning, Molecular , Computational Biology , DNA, Complementary , Genomic Library , Humans , Molecular Sequence Data
ABSTRACT
Sixteen P1 and TAC clones assigned to Arabidopsis thaliana chromosome 5 were sequenced, and their sequence features were analyzed using various computer programs. The total length of the sequences determined was 1,013,767 bp. Together with the nucleotide sequences of 109 clones previously reported, the regions of chromosome 5 sequenced so far now total 9,072,622 bp, which presumably covers approximately one-third of the chromosome. A similarity search against the reported gene sequences predicted the presence of a total of 225 protein-coding genes and/or gene segments in the newly sequenced regions, indicating an average gene density of one gene per 4.5 kb. Introns were identified in 72.4% of the potential protein genes for which the entire gene structure was predicted, and the average number per gene and the average length of the introns were 3.3 and 163 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.