Results 1 - 20 of 42
1.
Bioinformatics ; 38(17): 4194-4199, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35801937

ABSTRACT

MOTIVATION: Understanding life cannot be accomplished without making full use of biological data, which are scattered across databases of diverse categories in life sciences. To connect such data seamlessly, identifier (ID) conversion plays a key role. However, existing ID conversion services have disadvantages, such as covering only a limited range of biological categories of databases, not keeping up with the updates of the original databases and producing outputs that are hard to interpret in the context of biological relations, especially when converting IDs in multiple steps. RESULTS: TogoID is an ID conversion service implementing unique features with an intuitive web interface and an application programming interface (API) for programmatic access. TogoID currently supports 65 datasets covering various biological categories. TogoID users can perform exploratory multistep conversions to find a path among IDs. To guide the interpretation of biological meanings in the conversions, we crafted an ontology that defines the semantics of the dataset relations. AVAILABILITY AND IMPLEMENTATION: The TogoID service is freely available on the TogoID website (https://togoid.dbcls.jp/) and the API is also provided to allow programmatic access. To encourage developers to add new dataset pairs, the system stores the configurations of pairs at the GitHub repository (https://github.com/togoid/togoid-config) and accepts requests for additional pairs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Management , Software , Databases, Factual
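
The exploratory multistep conversion described in the TogoID abstract can be pictured as path finding over a graph whose nodes are datasets and whose edges are the available dataset pairs. The following is a minimal sketch under that assumption; the `PAIRS` table, the dataset names in it and the function `find_route` are hypothetical illustrations, not TogoID's actual configuration or API.

```python
from collections import deque

# Hypothetical dataset pairs for illustration only; not TogoID's real configuration.
PAIRS = {
    "ncbigene": ["ensembl_gene", "uniprot"],
    "ensembl_gene": ["ensembl_transcript"],
    "uniprot": ["pdb", "reactome_pathway"],
    "ensembl_transcript": [],
    "pdb": [],
    "reactome_pathway": [],
}


def find_route(source, target, pairs=PAIRS):
    """Breadth-first search for the shortest chain of dataset pairs."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        route = queue.popleft()
        if route[-1] == target:
            return route
        for nxt in pairs.get(route[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None  # no chain of pairs connects the two datasets


print(find_route("ncbigene", "pdb"))
```

Breadth-first search returns a route with the fewest conversion steps, which seems a reasonable default because every extra step can widen or blur the biological relation being followed.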
2.
Int J Mol Sci ; 21(20), 2020 Oct 12.
Article in English | MEDLINE | ID: mdl-33053895

ABSTRACT

Efforts to determine the mosquito genes that affect dengue virus replication have identified a number of candidates that positively or negatively modify amplification in the invertebrate host. We used deep sequencing to compare the differential transcript abundances in Aedes aegypti 14 days post dengue infection to those of uninfected A. aegypti. The gene lethal(2)-essential-for-life [l(2)efl], which encodes a member of the heat shock 20 protein (HSP20) family, was upregulated following dengue virus type 2 (DENV-2) infection in vivo. The transcripts of this gene did not exhibit differential accumulation in mosquitoes exposed to insecticides or pollutants. The induction and overexpression of l(2)efl gene products using poly(I:C) resulted in decreased DENV-2 replication in the cell line. In contrast, the RNAi-mediated suppression of l(2)efl gene products resulted in enhanced DENV-2 replication, but this enhancement occurred only if multiple l(2)efl genes were suppressed. l(2)efl homologs induce the phosphorylation of eukaryotic initiation factor 2α (eIF2α) in the fruit fly Drosophila melanogaster, and we confirmed this finding in the cell line. However, the mechanism by which l(2)efl phosphorylates eIF2α remains unclear. We conclude that l(2)efl encodes a potential anti-dengue protein in the vector mosquito.


Subject(s)
Aedes/genetics , Aedes/virology , Dengue Virus/physiology , Dengue/virology , HSP20 Heat-Shock Proteins/genetics , Insect Proteins/genetics , Mosquito Vectors/genetics , Mosquito Vectors/virology , Animals , Computational Biology/methods , Gene Expression Profiling , Host-Pathogen Interactions , Transcriptome , Virus Replication
3.
J Plant Res ; 131(4): 709-717, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29460198

ABSTRACT

Recent studies have shown that environmental DNA is found almost everywhere. Flower petal surfaces are an attractive tissue to use for investigation of the dispersal of environmental DNA in nature as they are isolated from the external environment until the bud opens and only then can the petal surface accumulate environmental DNA. Here, we performed a crowdsourced experiment, the "Ohanami Project", to obtain environmental DNA samples from petal surfaces of Cerasus × yedoensis 'Somei-yoshino' across the Japanese archipelago during spring 2015. C. × yedoensis is the most popular garden cherry species in Japan and clones of this cultivar bloom simultaneously every spring. Data collection spanned almost every prefecture and totaled 577 DNA samples from 149 collaborators. Preliminary amplicon-sequencing analysis showed the rapid attachment of environmental DNA onto the petal surfaces. Notably, we found DNA of other common plant species in samples obtained from a wide distribution; this DNA likely originated from the pollen of the Japanese cedar. Our analysis supports our belief that petal surfaces after blossoming are a promising target to reveal the dynamics of environmental DNA in nature. The success of our experiment also shows that crowdsourced environmental DNA analyses have considerable value in ecological studies.


Subject(s)
DNA, Plant/genetics , DNA/genetics , Environment , Flowers/genetics , Prunus/genetics , Chloroplasts/genetics , Cyanobacteria/genetics , Flowers/microbiology , Japan , Proteobacteria/genetics , Prunus/microbiology , Sequence Alignment , Sequence Analysis, DNA
4.
Genome Res ; 24(9): 1433-44, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25091627

ABSTRACT

To understand the molecular mechanisms of parasitism in vivo, it is essential to elucidate how the transcriptomes of the human hosts and the infecting parasites affect one another. Here we report the RNA-seq analysis of 116 Indonesian patients infected with the malaria parasite Plasmodium falciparum (Pf). We extracted RNAs from their peripheral blood as a mixture of host and parasite transcripts and mapped the RNA-seq tags to the human and Pf reference genomes to separate the respective tags. We were thus able to simultaneously analyze expression patterns in both humans and parasites. We identified human and parasite genes and pathways that correlated with various clinical data, which may serve as primary targets for drug developments. Of particular importance, we revealed characteristic expression changes in the human innate immune response pathway genes including TLR2 and TICAM2 that correlated with the severity of the malaria infection. We also found a group of transcription regulatory factors, JUND, for example, and signaling molecules, TNFAIP3, for example, that were strongly correlated in the expression patterns of humans and parasites. We also identified several genetic variations in important anti-malaria drug resistance-related genes. Furthermore, we identified the genetic variations which are potentially associated with severe malaria symptoms both in humans and parasites. The newly generated data should collectively lay a unique foundation for understanding variable behaviors of the field malaria parasites, which are far more complex than those observed under laboratory conditions.


Subject(s)
Genome, Human , Genome, Protozoan , Malaria/genetics , Plasmodium falciparum/genetics , Transcriptome , Adaptor Proteins, Signal Transducing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Adolescent , Adult , Antimalarials/therapeutic use , Case-Control Studies , Child , Child, Preschool , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Drug Resistance/genetics , Expressed Sequence Tags , Female , Host-Parasite Interactions/genetics , Humans , Immunity, Innate/genetics , Infant , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Malaria/diagnosis , Malaria/drug therapy , Male , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Plasmodium falciparum/pathogenicity , Polymorphism, Single Nucleotide , Proto-Oncogene Proteins c-jun/genetics , Proto-Oncogene Proteins c-jun/metabolism , Toll-Like Receptor 2/genetics , Toll-Like Receptor 2/metabolism , Tumor Necrosis Factor alpha-Induced Protein 3 , Virulence/genetics
6.
Nucleic Acids Res ; 41(Database issue): D353-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193276

ABSTRACT

The identification of orthologous genes in an increasing number of fully sequenced genomes is a challenging issue in recent genome science. Here we present KEGG OC (http://www.genome.jp/tools/oc/), a novel database of ortholog clusters (OCs). The current version of KEGG OC contains 1 176 030 OCs, obtained by clustering 8 357 175 genes in 2112 complete genomes (153 eukaryotes, 1830 bacteria and 129 archaea). The OCs were constructed by applying the quasi-clique-based clustering method to all possible protein coding genes in all complete genomes, based on their amino acid sequence similarities. Calculating OCs is computationally efficient, which enables regular updates of the contents. KEGG OC has the following two features: (i) it consists of all complete genomes of a wide variety of organisms from three domains of life, and the number of organisms is the largest among the existing databases; and (ii) it is compatible with the KEGG database by sharing the same sets of genes and identifiers, which leads to seamless integration of OCs with useful components in KEGG such as biological pathways, pathway modules, functional hierarchy, diseases and drugs. The KEGG OC resources are accessible via OC Viewer, which provides an interactive visualization of OCs at different taxonomic levels.


Subject(s)
Databases, Genetic , Genes, Archaeal , Genes, Bacterial , Genes , Algorithms , Classification/methods , Cluster Analysis , Eukaryota/genetics , Genome, Archaeal , Genome, Bacterial , Genomics/methods , Internet , Sequence Homology, Amino Acid
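
The quasi-clique idea mentioned in the KEGG OC abstract can be illustrated with a simple density test: a gene set is treated as a quasi-clique when at least a fraction `gamma` of all possible gene pairs are linked by sequence similarity. This is only a sketch assuming an edge-density definition; the abstract does not give the exact criterion or threshold used, and the gene names, links and `gamma` values below are made up.

```python
from itertools import combinations


def is_quasi_clique(genes, similar_pairs, gamma=0.8):
    """Return True if at least `gamma` of all gene pairs are linked."""
    genes = list(genes)
    if len(genes) < 2:
        return True  # a single gene is trivially a cluster
    possible = len(genes) * (len(genes) - 1) // 2
    linked = sum(
        1 for a, b in combinations(genes, 2)
        if frozenset((a, b)) in similar_pairs
    )
    return linked / possible >= gamma


# Hypothetical similarity links between genes from different genomes.
links = {
    frozenset(pair)
    for pair in [("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
                 ("g1", "g4"), ("g2", "g4")]
}

# Five of the six possible pairs are linked (density about 0.83).
print(is_quasi_clique(["g1", "g2", "g3", "g4"], links))
print(is_quasi_clique(["g1", "g2", "g3", "g4"], links, gamma=0.9))
```

Relaxing the requirement from a full clique to a dense subgraph lets a cluster tolerate a few missing similarity links, for example between distantly related genomes.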
7.
Hum Genome Var ; 9(1): 44, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36509753

ABSTRACT

TogoVar ( https://togovar.org ) is a database that integrates allele frequencies derived from Japanese populations and provides annotations for variant interpretation. First, a scheme to reanalyze individual-level genome sequence data deposited in the Japanese Genotype-phenotype Archive (JGA), a controlled-access database, was established to make allele frequencies publicly available. As more Japanese individual-level genome sequence data are deposited in JGA, the sample size employed in TogoVar is expected to increase, contributing to genetic study as reference data for Japanese populations. Second, public datasets of Japanese and non-Japanese populations were integrated into TogoVar to easily compare allele frequencies in Japanese and other populations. Each variant detected in Japanese populations was assigned a TogoVar ID as a permanent identifier. Third, these variants were annotated with molecular consequence, pathogenicity, and literature information for interpreting and prioritizing variants. Here, we introduce the newly developed TogoVar database that compares allele frequencies among Japanese and non-Japanese populations and describes the integrated annotations.

8.
Nucleic Acids Res ; 37(Database issue): D520-5, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18987005

ABSTRACT

Full-Malaria/Parasites is a database for transcriptome studies of apicomplexa and other parasites, which is based on our original full-length cDNA sequences and physical cDNA clone resources. In this update, the database has been expanded to contain the shotgun sequencing for the entire sequences of 14,818 non-redundant full-length cDNA clones from six apicomplexa parasites and 6.8 million transcription start sites (TSS), both of which had been produced by novel protocols using the oligo-capping method and the Illumina GA sequencer. The former should be the ultimate data for exact annotation of the expressed genes, while the latter should be useful for ultra-deep expression analysis. Furthermore, we have launched Full-Arthropods, a full-length cDNA database for arthropods of medical importance. Full-Arthropods contains 50 343 one-pass sequences, 10 399 shotgun complete sequences and 22.4 million TSS tags in anopheles mosquitoes that transmit malaria, tsetse flies that transmit trypanosomiasis and dust mites that cause allergic dermatitis and bronchial asthma. By providing the largest integrated full-length cDNA data resources in the apicomplexa parasites as well as their vectors, Full-Malaria/Parasites and Full-Arthropods should help combat parasitic diseases. Full-Malaria/Parasites and Full-Arthropods are accessible from http://fullmal.hgc.jp/.


Subject(s)
Apicomplexa/genetics , Arthropod Vectors/genetics , Arthropods/genetics , DNA, Complementary/chemistry , Databases, Nucleic Acid , Parasites/genetics , Animals , Anopheles/genetics , Plasmodium/genetics , Sequence Analysis, DNA , Toxoplasma/genetics , Transcription Initiation Site , Tsetse Flies/genetics
9.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-34820659

ABSTRACT

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery and access of relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.

10.
Nucleic Acids Res ; 36(Database issue): D202-5, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17998252

ABSTRACT

AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. We have added a collection of protein contact potentials to the AAindex as a new section. Accordingly AAindex consists of three sections now: AAindex1 for the amino acid index of 20 numerical values, AAindex2 for the amino acid substitution matrix and AAindex3 for the statistical protein contact potentials. All data are derived from published literature. The database can be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.jp/dbget-bin/www_bfind?aaindex) or downloaded by anonymous FTP (ftp://ftp.genome.jp/pub/db/community/aaindex/).


Subject(s)
Amino Acids/chemistry , Databases, Protein , Proteins/chemistry , Internet
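
As the AAindex abstract notes, an AAindex1 entry is a set of 20 numerical values, one per amino acid, so applying an index to a protein reduces to a table lookup. A minimal sketch, using the widely known Kyte-Doolittle hydropathy scale as the example index; the function name `mean_index` and the example peptide are illustrative assumptions, and the sketch does not parse the AAindex file format.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982), one per amino acid.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_index(sequence, index=HYDROPATHY):
    """Average an amino acid index over a non-empty protein sequence."""
    values = [index[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


# A short, strongly hydrophobic example peptide.
print(round(mean_index("LIV"), 2))
```

Any other 20-value index could be passed through the `index` argument in the same way; non-standard residue letters would raise a `KeyError` in this sketch.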
11.
Nucleic Acids Res ; 36(Database issue): D480-4, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18077471

ABSTRACT

KEGG (http://www.genome.jp/kegg/) is a database of biological systems that integrates genomic, chemical and systemic functional information. KEGG provides a reference knowledge base for linking genomes to life through the process of PATHWAY mapping, which is to map, for example, a genomic or transcriptomic content of genes to KEGG reference pathways to infer systemic behaviors of the cell or the organism. In addition, KEGG provides a reference knowledge base for linking genomes to the environment, such as for the analysis of drug-target relationships, through the process of BRITE mapping. KEGG BRITE is an ontology database representing functional hierarchies of various biological objects, including molecules, cells, organisms, diseases and drugs, as well as relationships among them. KEGG PATHWAY is now supplemented with a new global map of metabolic pathways, which is essentially a combined map of about 120 existing pathway maps. In addition, smaller pathway modules are defined and stored in KEGG MODULE that also contains other functional units and complexes. The KEGG resource is being expanded to suit the needs for practical applications. KEGG DRUG contains all approved drugs in the US and Japan, and KEGG DISEASE is a new database linking disease genes, pathways, drugs and diagnostic markers.


Subject(s)
Databases, Factual , Genomics , Systems Biology , Disease , Humans , Internet , Metabolic Networks and Pathways , Molecular Structure , Pharmaceutical Preparations/chemistry , Systems Integration , User-Computer Interface
12.
F1000Res ; 9: 136, 2020.
Article in English | MEDLINE | ID: mdl-32308977

ABSTRACT

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.


Subject(s)
Biological Science Disciplines , Computational Biology , Semantic Web , Data Mining , Metadata , Reproducibility of Results
13.
Nucleic Acids Res ; 35(Web Server issue): W625-32, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17586824

ABSTRACT

With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.


Subject(s)
Computational Biology/methods , Computer Graphics/trends , Software , Animals , Caenorhabditis elegans/genetics , Database Management Systems/statistics & numerical data , Databases, Genetic/statistics & numerical data , Drosophila/genetics , Gene Expression Profiling/statistics & numerical data , Humans , Mice , Protein Interaction Mapping/statistics & numerical data , Saccharomyces cerevisiae/genetics , Transcription Factors/genetics , Transcription, Genetic/physiology
14.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30624651

ABSTRACT

TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as TogoStanza, which is a generic framework for rendering an information block as IFRAME/Web Components, which can, unlike several other monolithic databases, also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source codes on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.


Subject(s)
Databases, Genetic , Genomics/methods , Semantic Web , Software , Humans
15.
Nucleic Acids Res ; 34(Database issue): D358-62, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381886

ABSTRACT

Operon structures play an important role in co-regulation in prokaryotes. Although over 200 complete genome sequences are now available, databases providing genome-wide operon information have been limited to certain specific genomes. Thus, we have developed ODB (Operon DataBase), which provides a data retrieval system of known operons among the many complete genomes. Additionally, putative operons that are conserved in terms of known operons are also provided. The current version of our database contains information on about 2000 known operons in more than 50 genomes and about 13 000 putative operons in more than 200 genomes. This system integrates four types of associations: genome context, gene co-expression obtained from microarray data, functional links in biological pathways and the conservation of gene order across the genomes. These associations are indicators of the genes that organize an operon, and the combination of these indicators allows us to predict more reliable operons. Furthermore, our system validates these predictions using known operon information obtained from the literature. This database integrates known literature-based information and genomic data. In addition, it provides an operon prediction tool, which makes the system useful for both bioinformatics researchers and experimental biologists. Our database is accessible at http://odb.kuicr.kyoto-u.ac.jp/.


Subject(s)
Databases, Nucleic Acid , Genomics , Operon , Transcription, Genetic , Gene Expression Profiling , Gene Order , Genome, Bacterial , Internet , User-Computer Interface
16.
Nucleic Acids Res ; 34(Web Server issue): W459-62, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845049

ABSTRACT

Expressed sequence tag (EST) sequencing has proven to be an economically feasible alternative for gene discovery in species lacking a draft genome sequence. Ongoing large-scale EST sequencing projects need bioinformatics tools to facilitate uniform EST handling. This brings about a renewed importance for a universal tool for processing and functional annotation of large sets of ESTs. EGassembler (http://egassembler.hgc.jp/) is a web server, which provides an automated as well as a user-customized analysis tool for cleaning, repeat masking, vector trimming, organelle masking, clustering and assembling of ESTs and genomic fragments. The web server is publicly available and provides the community with a unique all-in-one online application web service for large-scale ESTs and genomic DNA clustering and assembling. Running on a Sun Fire 15K supercomputer, a significantly large volume of data can be processed in a short period of time. The results can be used to functionally annotate genes, to facilitate splice alignment analysis, to link the transcripts to genetic and physical maps, to design microarray chips, to perform transcriptome analysis and to map to KEGG metabolic pathways. The service provides an excellent bioinformatics tool to research groups in wet-lab as well as an all-in-one-tool for sequence handling to bioinformatics researchers.


Subject(s)
Computational Biology/methods , Expressed Sequence Tags , Genomics/methods , Software , Internet , Sequence Analysis, DNA , User-Computer Interface
17.
Nucleic Acids Res ; 34(Database issue): D354-7, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381885

ABSTRACT

The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource (http://www.genome.jp/kegg/) provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps.


Subject(s)
Biotransformation , Chemistry , Databases, Factual , Databases, Genetic , Genomics , Chemical Phenomena , Environment , Enzymes/chemistry , Enzymes/genetics , Humans , Internet , Ligands , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/classification , Signal Transduction , Systems Integration , User-Computer Interface
18.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576482

ABSTRACT

In the life sciences, researchers increasingly want to access multiple databases in an integrated way. However, different databases currently use different formats and vocabularies, hindering the proper integration of heterogeneous life science data. Adopting the Resource Description Framework (RDF) has the potential to address such issues by improving database interoperability, leading to advances in automatic data processing. Based on this idea, we have advised many Japanese database development groups to expose their databases in RDF. To further promote such activities, we have developed an RDF-based life science dataset repository called the National Bioscience Database Center (NBDC) RDF portal. All the datasets in this repository have been reviewed by the NBDC to ensure interoperability and queryability. As of July 2018, the service includes 21 RDF datasets, comprising over 45.5 billion triples. It provides SPARQL endpoints for all datasets, useful metadata and the ability to download RDF files. The NBDC RDF portal can be accessed at https://integbio.jp/rdf/.


Subject(s)
Biological Science Disciplines , Database Management Systems , Databases, Factual , Semantics , Internet , User-Computer Interface
19.
BMC Genomics ; 8: 48, 2007 Feb 13.
Article in English | MEDLINE | ID: mdl-17298663

ABSTRACT

BACKGROUND: Operon structures play an important role in transcriptional regulation in prokaryotes. However, there have been fewer studies on complicated operon structures in which the transcriptional units vary with changing environmental conditions. Information about such complicated operons is helpful for predicting and analyzing operon structures, as well as understanding gene functions and transcriptional regulation. RESULTS: We systematically analyzed the experimentally verified transcriptional units (TUs) in Bacillus subtilis and Escherichia coli obtained from ODB and RegulonDB. To understand the relationships between TUs and operons, we defined a new classification system for adjacent gene pairs, divided into three groups according to the level of gene co-regulation: operon pairs (OP) that belong to the same TU, sub-operon pairs (SOP) that are at the transcriptional boundaries within an operon, and non-operon pairs (NOP) that belong to different operons. Consequently, we found that the levels of gene co-regulation were correlated with intergenic distances and gene expression levels. Additional analysis revealed that they were also correlated with the levels of conservation across about 200 prokaryotic genomes. Most interestingly, we found that functional associations in SOPs were more frequently observed in the environmental and genetic information processes. CONCLUSION: Complicated operon structures were correlated with genome organization and gene expression profiles. Such intricately regulated operons allow functional differences depending on environmental conditions. These regulatory mechanisms are helpful in accommodating the variety of changes that happen around the cell. In addition, such differences may play an important role in the evolution of gene order across genomes.


Subject(s)
Bacillus subtilis/genetics , Escherichia coli/genetics , Operon , Regulon , Conserved Sequence , Gene Expression Profiling , Gene Expression Regulation, Bacterial , Models, Biological , Signal Transduction , Transcription, Genetic
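
The three-way classification of adjacent gene pairs in the abstract above (OP, SOP, NOP) can be sketched as a small decision rule. This is one possible reading of the definitions, not the authors' implementation: a pair is NOP when the genes share no operon, SOP when some transcriptional unit contains only one of the two genes (a boundary falls between them), and OP otherwise. The gene names, operons and TUs are hypothetical.

```python
def classify_pair(a, b, operons, tus):
    """Classify adjacent genes a and b as 'OP', 'SOP' or 'NOP'."""
    same_operon = any(a in operon and b in operon for operon in operons)
    if not same_operon:
        return "NOP"  # the genes belong to different operons
    # A TU that contains only one of the two genes places a
    # transcriptional boundary between them.
    split = any((a in tu) != (b in tu) for tu in tus)
    return "SOP" if split else "OP"


# Hypothetical gene order, operons and transcriptional units.
gene_order = ["geneA", "geneB", "geneC", "geneD"]
operons = [{"geneA", "geneB", "geneC"}, {"geneD"}]
tus = [{"geneA", "geneB", "geneC"}, {"geneB", "geneC"}]

for a, b in zip(gene_order, gene_order[1:]):
    print(a, b, classify_pair(a, b, operons, tus))
```

In this toy layout the second TU starts at geneB, so geneA and geneB form a sub-operon pair even though the longer TU transcribes them together.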
20.
Genome Inform ; 18: 152-61, 2007.
Article in English | MEDLINE | ID: mdl-18546483

ABSTRACT

Amino acid indices are useful tools in bioinformatics. With the appearance of novel theory and technology, and the rapid increase of experimental data, building new indices to cope with new or unsolved old problems is still necessary. In this study, residue networks are constructed from the PDB structures of 640 representative proteins based on the distance between Cα atoms with an 8 Å cutoff. All these networks show typical small world features. New amino acid indices, termed relative connectivity, clustering coefficient, closeness and betweenness, are derived from the corresponding topological parameters of amino acids in the residue networks. The 4 new network based indices are closely clustered together and related to hydrophobicity and beta propensity. When compared with related amino acid indices, the new indices show better or comparable performance in protein surface residue prediction. Relative connectivity is the best index and can reach a useful performance with an area under the curve of about 0.75. It indicates that the network property based amino acid indices can be useful complements to the existing physicochemical property based amino acid indices.


Subject(s)
Amino Acids/chemistry , Cluster Analysis , Databases, Protein , Models, Molecular
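
The residue network construction described in the abstract above can be sketched as follows: residues are nodes, and two residues are linked when their Cα atoms lie within 8 Å. The coordinates are made up, the inclusive comparison (`<=`) is an assumption, and the step of averaging node properties per amino acid type to obtain the final indices is not shown.

```python
import math
from itertools import combinations


def residue_network(ca_coords, cutoff=8.0):
    """Link residues whose C-alpha atoms lie within `cutoff` angstroms."""
    n = len(ca_coords)
    neighbors = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def clustering_coefficient(neighbors, i):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = neighbors[i]
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
    return linked / (len(nbrs) * (len(nbrs) - 1) / 2)


# Made-up C-alpha coordinates (in angstroms) for four residues.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
net = residue_network(coords)
print({i: len(nbrs) for i, nbrs in net.items()})  # degree of each residue
print(clustering_coefficient(net, 1))
```

The all-pairs distance loop is quadratic in the number of residues, which is adequate for single proteins; a spatial grid or k-d tree would be a natural substitute for larger structures.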