Results 1 - 20 of 24
1.
Bioinformatics ; 38(17): 4194-4199, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35801937

ABSTRACT

MOTIVATION: Understanding life cannot be accomplished without making full use of biological data, which are scattered across databases of diverse categories in life sciences. To connect such data seamlessly, identifier (ID) conversion plays a key role. However, existing ID conversion services have disadvantages: they cover only a limited range of biological categories of databases, do not keep up with updates to the original databases, and produce outputs that are hard to interpret in the context of biological relations, especially when converting IDs in multiple steps. RESULTS: TogoID is an ID conversion service implementing unique features with an intuitive web interface and an application programming interface (API) for programmatic access. TogoID currently supports 65 datasets covering various biological categories. TogoID users can perform exploratory multistep conversions to find a path among IDs. To guide the interpretation of biological meanings in the conversions, we crafted an ontology that defines the semantics of the dataset relations. AVAILABILITY AND IMPLEMENTATION: The TogoID service is freely available on the TogoID website (https://togoid.dbcls.jp/) and the API is also provided to allow programmatic access. To encourage developers to add new dataset pairs, the system stores the configurations of pairs at the GitHub repository (https://github.com/togoid/togoid-config) and accepts requests for additional pairs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Data Management, Software, Factual Databases
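
The multistep conversion described in the abstract above can be pictured as a path search over a graph of dataset pairs, followed by one table lookup per hop. The following is a minimal sketch of that idea only: the dataset names, the `PAIRS` and `MAPPINGS` tables, and the functions `find_route` and `convert` are all hypothetical and do not reflect the TogoID API or its configuration files.

```python
from collections import deque

# Hypothetical dataset pairs: each key converts directly to the listed targets.
# These names are illustrative only, not TogoID's actual configuration.
PAIRS = {
    "gene": ["transcript", "protein"],
    "transcript": ["protein"],
    "protein": ["structure", "pathway"],
    "structure": [],
    "pathway": [],
}


def find_route(pairs, source, target):
    """Return the shortest chain of datasets from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        route = queue.popleft()
        if route[-1] == target:
            return route
        for nxt in pairs.get(route[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None


def convert(ids, route, mappings):
    """Apply one ID mapping table per hop along the route."""
    current = set(ids)
    for src, dst in zip(route, route[1:]):
        table = mappings[(src, dst)]
        current = {out for i in current for out in table.get(i, [])}
    return sorted(current)


# Toy mapping tables with made-up identifiers.
MAPPINGS = {
    ("gene", "protein"): {"G1": ["P1", "P2"], "G2": ["P3"]},
    ("protein", "pathway"): {"P1": ["W1"], "P2": ["W1", "W2"], "P3": []},
}

route = find_route(PAIRS, "gene", "pathway")
print(route)
print(convert(["G1"], route, MAPPINGS))
```

Breadth-first search is used here because it returns a route with the fewest hops; a real service might instead rank routes by the semantics of each relation.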
2.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30624651

ABSTRACT

TogoGenome is a genome database based purely on Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database has a modular structure: each module in the report pages is served separately as a TogoStanza, a generic framework for rendering an information block as an IFRAME or Web Component. Unlike in several other monolithic databases, these modules can also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source code on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.


Subjects
Genetic Databases, Genomics/methods, Semantic Web, Software, Humans
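
A faceted search of the kind described in the abstract above amounts to adding one triple pattern per selected facet to a SPARQL query. The sketch below illustrates this under stated assumptions: the `ex:` prefix, the predicate names, the example.org URIs, and the endpoint address are placeholders invented for illustration and are not taken from the TogoGenome schema. The request is encoded as a GET with a `query` parameter, as the SPARQL Protocol allows.

```python
from urllib.parse import urlencode

# Hypothetical prefix and predicates, chosen for illustration only;
# they are not taken from the TogoGenome schema.
PREFIX = "PREFIX ex: <http://example.org/vocab#>"


def faceted_query(facets, limit=10):
    """Build a SPARQL SELECT that keeps genes matching every facet."""
    patterns = [
        f"  ?gene ex:{pred} <{value}> ."
        for pred, value in sorted(facets.items())
    ]
    body = "\n".join(patterns)
    return f"{PREFIX}\nSELECT DISTINCT ?gene WHERE {{\n{body}\n}} LIMIT {limit}"


def request_url(endpoint, query):
    """Encode the query as a GET request with a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = faceted_query({
    "taxon": "http://example.org/taxon/1148",
    "function": "http://example.org/function/photosynthesis",
})
print(query)
print(request_url("http://example.org/sparql", query))
```

The facet values are interpolated without escaping, which is acceptable for a fixed illustration but would need validation before accepting user input.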
3.
J Biomed Semantics ; 6: 3, 2015.
Article in English | MEDLINE | ID: mdl-25973165

ABSTRACT

BACKGROUND: Linked Data has gained some attention recently in the life sciences as an effective way to provide and share data. As a part of the Semantic Web, data are linked so that a person or machine can explore the web of data. Resource Description Framework (RDF) is the standard means of implementing Linked Data. In the process of generating RDF data, data are not simply linked to one another; the links themselves are also characterized by ontologies, thereby allowing the types of links to be distinguished. Although defining an ontology carries a high labor cost for data providers, the merit lies in the higher level of interoperability with data analysis and visualization software. This increase in interoperability facilitates the multi-faceted retrieval of data, and the appropriate data can be quickly extracted and visualized. Such retrieval is usually performed using the SPARQL (SPARQL Protocol and RDF Query Language) query language, which is used to query RDF data stores. For the database provider, such interoperability will surely lead to an increase in the number of users. RESULTS: This manuscript describes the experiences and discussions shared among participants of the week-long BioHackathon 2011 who went through the development of RDF representations of their own data and developed specific RDF and SPARQL use cases. Advice on what to consider when developing RDF representations of data is provided for bioinformaticians considering making data available and interoperable. CONCLUSIONS: Participants of the BioHackathon 2011 were able to produce RDF representations of their data and gain a better understanding of the requirements for producing such data in a period of just five days. We summarize the work accomplished with the hope that it will be useful for researchers involved in developing laboratory databases or data analysis, and those who are considering such technologies as RDF and Linked Data.
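
The point made in the abstract above, that links typed by an ontology can be told apart, can be illustrated with a toy triple store in plain Python. This is only a sketch: the `ex:` terms and the `match` helper are made up for illustration, and a real application would use an RDF library and SPARQL rather than list filtering.

```python
# A toy in-memory triple store. The "ex:" terms are made-up placeholders,
# not drawn from any real ontology.
TRIPLES = [
    ("ex:geneA", "ex:encodes", "ex:proteinA"),
    ("ex:geneA", "ex:locatedOn", "ex:chromosome1"),
    ("ex:proteinA", "ex:interactsWith", "ex:proteinB"),
    ("ex:geneB", "ex:encodes", "ex:proteinB"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Untyped view: everything linked from geneA, regardless of link type.
print(match(TRIPLES, s="ex:geneA"))

# Typed view: only the "encodes" links, which untyped links could not isolate.
print(match(TRIPLES, p="ex:encodes"))
```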

4.
Plant Cell Physiol ; 56(2): 334-45, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25416288

ABSTRACT

Although cyanobacteria are photoautotrophs, they have the capability for heterotrophic metabolism that enables them to survive in their natural habitat. However, cyanobacterial species that grow heterotrophically in the dark are rare. It remains largely unknown how cyanobacteria regulate heterotrophic activity. The cyanobacterium Leptolyngbya boryana grows heterotrophically with glucose in the dark. A dark-adapted variant dg5 isolated from the wild type (WT) exhibits enhanced heterotrophic growth in the dark. We sequenced the genomes of dg5 and the WT to identify the mutation(s) of dg5. The WT genome consists of a circular chromosome (6,176,364 bp), a circular plasmid pLBA (77,793 bp) and two linear plasmids pLBX (504,942 bp) and pLBY (44,369 bp). Genome comparison revealed three mutation sites. Phenotype analysis of mutants isolated from the WT by introducing these mutations individually revealed that the relevant mutation is a single adenine insertion causing a frameshift of cytM encoding Cyt cM. The respiratory oxygen consumption of the cytM-lacking mutant grown in the dark was significantly higher than that of the WT. We isolated a cytM-lacking mutant, ΔcytM, from another cyanobacterium Synechocystis sp. PCC 6803, and ΔcytM grew in the dark with a doubling time of 33 h in contrast to no growth of the WT. The respiratory oxygen consumption of ΔcytM grown in the dark was about 2-fold higher than that of the WT. These results suggest a suppressive role(s) for Cyt cM in regulation of heterotrophic activity.


Subjects
Cyanobacteria/growth & development, Cyanobacteria/genetics, Cytochromes c/genetics, Darkness, Heterotrophic Processes/genetics, Mutation/genetics, Base Sequence, Gene Rearrangement, Bacterial Genome, Phenotype, Phylogeny, Synechocystis/genetics, Synechocystis/growth & development, Synechocystis/metabolism, Genetic Transformation
5.
J Bioinform Comput Biol ; 12(6): 1442001, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25385078

ABSTRACT

Genomics is faced with the issue of many partially annotated putative enzyme-encoding genes for which activities have not yet been verified, while metabolomics is faced with the issue of many putative enzyme reactions for which full equations have not been verified. Knowledge of enzymes has been collected by IUBMB, and has been made public as the Enzyme List. To date, however, the terminology of the Enzyme List has not been assessed comprehensively by bioinformatics studies. Instead, most bioinformatics studies simply use the identifiers of the enzymes, i.e. the Enzyme Commission (EC) numbers. We investigated the actual usage of terminology throughout the Enzyme List, and demonstrated that the partial characteristics of reactions cannot be retrieved by simply using EC numbers. Thus, we developed a novel ontology, named PIERO, for annotating biochemical transformations as follows. First, the terminology describing enzymatic reactions was retrieved from the Enzyme List, and was grouped into terms related to overall reactions and terms related to biochemical transformations. Subsequently, these terms were mapped onto the actual transformations taken from enzymatic reaction equations. This ontology was linked to Gene Ontology (GO) and EC numbers, allowing the extraction of common partial reaction characteristics from given sets of orthologous genes and the elucidation of possible enzymes from the given transformations. Further development of the PIERO ontology should enhance the Enzyme List to promote the integration of genomics and metabolomics.


Subjects
Biological Ontologies, Protein Databases, Enzymes/chemistry, Enzymes/classification, Information Storage and Retrieval/methods, Terminology as Topic, Enzymes/genetics, Natural Language Processing
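
For context on the abstract above, an EC number carries only a four-level classification (class, subclass, sub-subclass, serial number), so the most an identifier alone can offer is prefix-based grouping. The sketch below shows that limit and does not implement PIERO; the helper names `parse_ec` and `shared_depth` are hypothetical, and the parser assumes purely numeric fields with `-` for unassigned levels.

```python
# The seven top-level EC classes.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec):
    """Split 'EC 1.1.1.1' or '1.1.1.1' into four fields.

    A '-' field (an unassigned level) is returned as None.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields, got {ec!r}")
    return tuple(None if f == "-" else int(f) for f in fields)


def shared_depth(a, b):
    """Count how many leading levels two EC numbers have in common."""
    depth = 0
    for x, y in zip(parse_ec(a), parse_ec(b)):
        if x is None or x != y:
            break
        depth += 1
    return depth


print(EC_CLASSES[parse_ec("EC 1.1.1.1")[0]])
print(shared_depth("1.1.1.1", "1.1.1.2"))
print(shared_depth("1.1.1.1", "2.7.1.1"))
```

Two enzymes that share no EC prefix may still share a partial transformation, which is the kind of characteristic the abstract says EC numbers alone cannot retrieve.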
6.
J Biomed Semantics ; 5(1): 5, 2014 Feb 05.
Article in English | MEDLINE | ID: mdl-24495517

ABSTRACT

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

7.
Nucleic Acids Res ; 42(Database issue): D666-70, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24275496

ABSTRACT

To understand newly sequenced genomes of closely related species, comprehensively curated reference genome databases are becoming increasingly important. We have extended CyanoBase (http://genome.microbedb.jp/cyanobase), a genome database for cyanobacteria, and newly developed RhizoBase (http://genome.microbedb.jp/rhizobase), a genome database for rhizobia, nitrogen-fixing bacteria associated with leguminous plants. Both databases focus on the representation and reusability of reference genome annotations, which are continuously updated by manual curation. Domain experts have extracted names, products and functions of each gene reported in the literature. To ensure effectiveness of this procedure, we developed the TogoAnnotation system offering a web-based user interface and a uniform storage of annotations for the curators of the CyanoBase and RhizoBase databases. The number of references investigated for CyanoBase increased from 2260 in our previous report to 5285, and for RhizoBase, we perused 1216 references. The results of these intensive annotations are displayed on the GeneView pages of each database. Advanced users can also retrieve this information through the representational state transfer-based web application programming interface in an automated manner.


Subjects
Alphaproteobacteria/genetics, Cyanobacteria/genetics, Genetic Databases, Bacterial Genome, Bradyrhizobium/genetics, Bacterial Genes, Internet, Mesorhizobium/genetics, Molecular Sequence Annotation, Rhizobium/genetics, Sinorhizobium/genetics
8.
J Biomed Semantics ; 4(1): 6, 2013 Feb 11.
Article in English | MEDLINE | ID: mdl-23398680

ABSTRACT

BACKGROUND: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.

9.
Microbes Environ ; 27(3): 306-15, 2012.
Article in English | MEDLINE | ID: mdl-22452844

ABSTRACT

Bradyrhizobium sp. S23321 is an oligotrophic bacterium isolated from paddy field soil. Although S23321 is phylogenetically close to Bradyrhizobium japonicum USDA110, a legume symbiont, it is unable to induce root nodules in siratro, a legume often used for testing Nod factor-dependent nodulation. The genome of S23321 is a single circular chromosome, 7,231,841 bp in length, with an average GC content of 64.3%. The genome contains 6,898 potential protein-encoding genes, one set of rRNA genes, and 45 tRNA genes. Comparison of the genome structure between S23321 and USDA110 showed strong colinearity; however, the symbiosis islands present in USDA110 were absent in S23321, whose genome lacked a chaperonin gene cluster (groELS3) for symbiosis regulation found in USDA110. A comparison of sequences around the tRNA-Val gene strongly suggested that S23321 contains an ancestral-type genome that precedes the acquisition of a symbiosis island by horizontal gene transfer. Although S23321 contains a nif (nitrogen fixation) gene cluster, the organization, homology, and phylogeny of the genes in this cluster were more similar to those of photosynthetic bradyrhizobia ORS278 and BTAi1 than to those on the symbiosis island of USDA110. In addition, we found genes encoding a complete photosynthetic system, many ABC transporters for amino acids and oligopeptides, two types (polar and lateral) of flagella, multiple respiratory chains, and a system for lignin monomer catabolism in the S23321 genome. These features suggest that S23321 is able to adapt to a wide range of environments, probably including low-nutrient conditions, with multiple survival strategies in soil and rhizosphere.


Subjects
Bradyrhizobium/genetics, Bacterial DNA/chemistry, Bacterial DNA/genetics, Bacterial Genome, DNA Sequence Analysis, Bacterial Proteins/genetics, Base Composition, Bradyrhizobium/isolation & purification, Bradyrhizobium/physiology, Metabolic Networks and Pathways/genetics, Molecular Sequence Data, Open Reading Frames, Untranslated RNA/genetics, Soil Microbiology, Symbiosis, Synteny
10.
J Biomed Semantics ; 2: 4, 2011 Aug 02.
Article in English | MEDLINE | ID: mdl-21806842

ABSTRACT

BACKGROUND: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. RESULTS: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. CONCLUSIONS: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
