Results 1 - 12 of 12
1.
Nucleic Acids Res ; 36(Database issue): D572-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17942425

ABSTRACT

The pathogen-host interaction database (PHI-base) is a web-accessible database that catalogues experimentally verified pathogenicity, virulence and effector genes from bacterial, fungal and Oomycete pathogens, which infect human, animal, plant, insect, fish and fungal hosts. Plant endophytes are also included. PHI-base is therefore an invaluable resource for the discovery of genes in medically and agronomically important pathogens, which may be potential targets for chemical intervention. The database is freely accessible to both academic and non-academic users. This publication describes recent additions to the database and both current and future applications. The number of fields that characterize PHI-base entries has almost doubled. Important additional fields deal with new experimental methods, strain information, pathogenicity islands and external references that link the database to external resources, for example, gene ontology terms and Locus IDs. Another important addition is the inclusion of anti-infectives and their target genes, which makes it possible to predict the compounds that may interact with newly identified virulence factors. In parallel, the curation process has been improved and now involves several external experts. On the technical side, several new search tools have been provided and the database is also now distributed in XML format. PHI-base is available at: http://www.phi-base.org/.


Subject(s)
Bacteria/pathogenicity , Databases, Genetic , Fungi/pathogenicity , Host-Pathogen Interactions/genetics , Oomycetes/pathogenicity , Virulence Factors/genetics , Anti-Infective Agents/pharmacology , Bacteria/genetics , Fungi/genetics , Genes, Bacterial , Genes, Fungal , Internet , Oomycetes/genetics , User-Computer Interface , Virulence Factors/antagonists & inhibitors
2.
Nucleic Acids Res ; 34(Database issue): D459-64, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381911

ABSTRACT

To utilize effectively the growing number of verified genes that mediate an organism's ability to cause disease and/or to trigger host responses, we have developed PHI-base. This is a web-accessible database that currently catalogs 405 experimentally verified pathogenicity, virulence and effector genes from 54 fungal and Oomycete pathogens, of which 176 are from animal pathogens, 227 from plant pathogens and 3 from pathogens with a fungal host. PHI-base is the first on-line resource devoted to the identification and presentation of information on fungal and Oomycete pathogenicity genes and their host interactions. As such, PHI-base is a valuable resource for the discovery of candidate targets in medically and agronomically important fungal and Oomycete pathogens for intervention with synthetic chemistries and natural products. Each entry in PHI-base is curated by domain experts and supported by strong experimental evidence (gene/transcript disruption experiments) as well as literature references in which the experiments are described. Each gene in PHI-base is presented with its nucleotide and deduced amino acid sequence as well as a detailed description of the predicted protein's function during the host infection process. To facilitate data interoperability, we have annotated genes using controlled vocabularies (Gene Ontology terms, Enzyme Commission Numbers and so on), and provide links to other external data sources (e.g. NCBI taxonomy and EMBL). We welcome new data for inclusion in PHI-base, which is freely accessible at www4.rothamsted.bbsrc.ac.uk/phibase/.


Subject(s)
Algal Proteins/genetics , Databases, Genetic , Fungi/pathogenicity , Genes, Fungal , Oomycetes/pathogenicity , Virulence Factors/genetics , Fungal Proteins/genetics , Fungi/genetics , Internet , Oomycetes/genetics , Software , User-Computer Interface
3.
BMC Bioinformatics ; 7: 212, 2006 Apr 19.
Article in English | MEDLINE | ID: mdl-16623942

ABSTRACT

BACKGROUND: Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way. RESULTS: We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO. CONCLUSION: Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to assist ontology curators in drawing their attention to those terms that are ill-defined. We have further shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.
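
As a rough illustration of what "defined in a circular way" can mean in practice, the following minimal sketch flags a definition that reuses every content word of the term it defines. This is not the authors' method, which the abstract does not detail; the stop-word list, the function names and the two toy entries are all assumptions made for this example.

```python
import re

# Hypothetical stop-word list; a real analysis would need a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "and", "or", "by", "that", "which", "is"}


def content_words(text):
    """Lower-case the text and return its words minus stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def is_circular(term, definition):
    """Flag a definition that repeats every content word of the term it defines."""
    term_words = content_words(term)
    return bool(term_words) and term_words <= content_words(definition)


# Toy entries invented for illustration; they are not quoted from GO.
entries = {
    "membrane transport": "The transport of substances across a membrane.",
    "cell growth": "The process by which a cell irreversibly increases in size.",
}

flagged = [term for term, definition in entries.items() if is_circular(term, definition)]
print(flagged)
```

A check this simple would over-flag legitimate genus-differentia definitions, so it is best read as a first-pass filter that hands candidates to a human curator.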


Subject(s)
Databases, Genetic , Documentation/methods , Information Storage and Retrieval/methods , Natural Language Processing , Proteins/classification , Proteins/genetics , Terminology as Topic , Artificial Intelligence , Classification/methods , Documentation/standards , Information Storage and Retrieval/standards , Phylogeny , Proteins/metabolism , Quality Control , Vocabulary, Controlled
4.
IEEE Trans Inf Technol Biomed ; 10(4): 714-21, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17044405

ABSTRACT

In the light of the increasing number of biological databases, their integration is a fundamental prerequisite for answering complex biological questions. Database integration, therefore, is an important area of research in bioinformatics. Since most of the publicly available life science databases are still exclusively exchanged by means of proprietary flat files, database integration requires parsers for very different flat file formats. Unfortunately, the development and maintenance of database specific flat file parsers is a nontrivial and time-consuming task, which takes considerable effort in large-scale integration scenarios. This paper introduces heuristically based concepts for automatic structure extraction from life science database flat files. On the basis of these concepts the FlatEx prototype is developed for the automatic conversion of flat files into XML representations.
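
To make the idea of converting a flat file into an XML representation concrete, here is a minimal sketch. It is not FlatEx and does not reproduce its heuristics, which the abstract does not describe. It assumes a hypothetical layout in which each line starts with a short line code followed by whitespace and a value, and each record ends with a `//` line; the function name and the sample data are also invented for illustration.

```python
import xml.etree.ElementTree as ET


def flat_to_xml(text):
    """Convert 'CODE   value' lines into <record><field code="..."> elements."""
    root = ET.Element("records")
    record = None
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip blank lines
        if line.strip() == "//":          # assumed record terminator
            record = None
            continue
        code, _, value = line.partition(" ")
        if record is None:                # first field of a new record
            record = ET.SubElement(root, "record")
        field = ET.SubElement(record, "field", code=code)
        field.text = value.strip()
    return root


# Invented sample data in the assumed layout.
sample = """ID   P12345
DE   Example protein
//
ID   Q67890
DE   Another protein
//
"""

root = flat_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

The sketch hard-codes the line structure; the point of automatic structure extraction is precisely to infer such rules from the file instead of writing one parser per format.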


Subject(s)
Algorithms , Biological Science Disciplines/methods , Database Management Systems , Databases, Factual , Electronic Data Processing , Hypermedia , Information Storage and Retrieval/methods , Artificial Intelligence , Software
5.
IEEE Trans Inf Technol Biomed ; 8(2): 154-60, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15217260

ABSTRACT

Several hundred life science databases, with constantly growing contents and varying areas of specialization, are publicly available via the internet. Database integration, consequently, is a fundamental prerequisite for answering complex biological questions. Due to the presence of syntactic, schematic, and semantic heterogeneities, large-scale database integration at present takes considerable effort. As there is growing recognition of Extensible Markup Language (XML) as a means for data exchange in the life sciences, this article focuses on the impact of XML technology on database integration in this area. In detail, a general architecture for ontology-driven data integration based on XML technology is introduced, which overcomes some of the traditional problems in this area. As a proof of concept, a prototypical implementation of this architecture, based on a native XML database and an expert system shell, is described for the realization of a real-world integration scenario.


Subject(s)
Artificial Intelligence , Biological Science Disciplines/methods , Database Management Systems , Databases, Factual , Hypermedia , Information Storage and Retrieval/methods , Natural Language Processing , Systems Integration , Algorithms , Internet , Software , Software Design
6.
J Integr Bioinform ; 5(2), 2008 Aug 25.
Article in English | MEDLINE | ID: mdl-20134069

ABSTRACT

The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and to predict the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara-Cyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system, which is freely available from http://ondex.sf.net/.


Subject(s)
Computer Graphics , Database Management Systems , Genomics/methods , Algorithms , User-Computer Interface
7.
Nat Rev Genet ; 7(6): 482-8, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16682980

ABSTRACT

A prerequisite to systems biology is the integration of heterogeneous experimental data, which are stored in numerous life-science databases. However, a wide range of obstacles that relate to access, handling and integration impede the efficient use of the contents of these databases. Addressing these issues will not only be essential for progress in systems biology, it will also be crucial for sustaining the more traditional uses of life-science databases.


Subject(s)
Databases, Factual , Systems Biology , Animals , Biological Science Disciplines , Computer Simulation , Database Management Systems , Humans
8.
Bioinformatics ; 22(11): 1383-90, 2006 Jun 01.
Article in English | MEDLINE | ID: mdl-16533819

ABSTRACT

MOTIVATION: Assembling the relevant information needed to interpret the output from high-throughput, genome-scale experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant changes in expression levels, but more information is needed to determine their biological relevance. The challenge is to bring these genes together with biological information distributed across hundreds of databases or buried in the scientific literature (millions of articles). Software tools are needed to automate this task, which at present is labor-intensive and requires considerable informatics and biological expertise. RESULTS: This article describes ONDEX and how it can be applied to the task of interpreting gene expression results. ONDEX is a database system that combines the features of semantic database integration and text mining with methods for graph-based analysis. An overview of the ONDEX system is presented, concentrating on recently developed features for graph-based analysis and visualization. A case study is used to show how ONDEX can help to identify causal relationships between stress response genes and metabolic pathways from gene expression data. ONDEX also discovered functional annotations for most of the genes that emerged as significant in the microarray experiment but were previously of unknown function.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Algorithms , Arabidopsis/genetics , Automation , Computer Graphics , Data Interpretation, Statistical , Databases, Genetic , Gene Expression Regulation , Natural Language Processing , Oligonucleotide Array Sequence Analysis , Software
9.
Brief Bioinform ; 6(3): 263-76, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16212774

ABSTRACT

Biology can be regarded as a science of networks: interactions between various biological entities (e.g. genes, proteins, metabolites) on different levels (e.g. gene regulation, cell signalling) can be represented as graphs and, thus, analysis of such networks might shed new light on the function of biological systems. Such biological networks can be obtained from different sources. The extraction of networks from text is an important technique that requires the integration of several different computational disciplines. This paper summarises the most important steps in network extraction and reviews common approaches and solutions for the extraction of biological networks from scientific literature.
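
One of the simplest ways to obtain such a graph from text is sentence-level co-occurrence: two entities are linked whenever they are mentioned in the same sentence. The sketch below illustrates only that baseline and is not taken from the review; the entity dictionary, the function name and the example sentences are assumptions invented for this illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical entity dictionary; real systems use curated name lists or taggers.
ENTITIES = {"geneA", "proteinB", "metaboliteC"}


def cooccurrence_edges(text):
    """Count, per entity pair, the sentences in which both entities appear."""
    edges = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = set(re.findall(r"\w+", sentence))
        found = sorted(ENTITIES & tokens)     # sorted so each pair has one canonical order
        for pair in combinations(found, 2):
            edges[pair] += 1
    return edges


# Invented example text.
text = (
    "geneA regulates proteinB. "
    "proteinB converts metaboliteC. "
    "geneA and proteinB interact."
)

edges = cooccurrence_edges(text)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

Co-occurrence says nothing about the type or direction of an interaction, which is one reason the extraction task draws on several computational disciplines rather than on string matching alone.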


Subject(s)
Databases, Bibliographic , Documentation/methods , Gene Expression Regulation/physiology , Information Storage and Retrieval/methods , Models, Biological , Periodicals as Topic , Signal Transduction/physiology , Abstracting and Indexing/methods , Animals , Artificial Intelligence , Biology/methods , Cell Physiological Phenomena , Database Management Systems , Humans , Natural Language Processing , Pattern Recognition, Automated/methods , Science/methods , Terminology as Topic
10.
Genome Biol ; 6(5): R46, 2005.
Article in English | MEDLINE | ID: mdl-15892874

ABSTRACT

To enhance the treatment of relations in biomedical ontologies we advance a methodology for providing consistent and unambiguous formal definitions of the relational expressions used in such ontologies in a way designed to assist developers and users in avoiding errors in coding and annotation. The resulting Relation Ontology can promote interoperability of ontologies and support new types of automated reasoning about the spatial and temporal dimensions of biological and medical phenomena.


Subject(s)
Computational Biology/methods , Terminology as Topic , Vocabulary, Controlled , Biomedical Research
11.
In Silico Biol ; 2(3): 219-31, 2002.
Article in English | MEDLINE | ID: mdl-12542408

ABSTRACT

A system for "intelligent" semantic integration and querying of federated databases is being implemented by using three main components: a component that enables SQL access to integrated databases by database federation (MARGBench), an ontology-based semantic metadatabase (SEMEDA) and an ontology-based query interface (SEMEDA-query). In this publication we explain and demonstrate the principles, architecture and the use of SEMEDA. Since SEMEDA is implemented as a three-tiered web application, database providers can enter all relevant semantic and technical information about their databases by themselves via a web browser. SEMEDA's collaborative ontology editing feature is not restricted to database integration, and might also be useful for ongoing ontology developments, such as the "Gene Ontology" [2]. SEMEDA can be found at http://www-bm.cs.uni-magdeburg.de/semeda/. We explain how this ontologically structured information can be used for semantic database integration. In addition, requirements for ontologies for molecular biological database integration are discussed and relevant existing ontologies are evaluated. We further discuss how ontologies and structured knowledge sources can be used in SEMEDA and whether they can be merged, supplemented or updated to meet the requirements for semantic database integration.


Subject(s)
Databases, Genetic , Molecular Biology , Systems Integration
12.
Bioinformatics ; 19(18): 2420-7, 2003 Dec 12.
Article in English | MEDLINE | ID: mdl-14668226

ABSTRACT

MOTIVATION: Many molecular biological databases are implemented on relational Database Management Systems, which provide standard interfaces like JDBC and ODBC for data and metadata exchange. By using these interfaces, many technical problems of database integration vanish, but issues related to semantics remain, e.g. the use of different terms for the same things, different names for equivalent database attributes and missing links between relevant entries in different databases. RESULTS: In this publication, principles and methods that were used to implement SEMEDA (Semantic Meta Database) are described. Database owners can use SEMEDA to provide semantically integrated access to their databases as well as to collaboratively edit and maintain ontologies and controlled vocabularies. Biologists can use SEMEDA to query the integrated databases in real time without having to know the structure or any technical details of the underlying databases. AVAILABILITY: SEMEDA is available at http://www-bm.ipk-gatersleben.de/semeda/. Database providers who intend to grant access to their databases via SEMEDA are encouraged to contact the authors.
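
The problem of "different names for equivalent database attributes" can be pictured as a lookup from (database, attribute) pairs to shared concepts. The sketch below shows only that idea under stated assumptions and does not describe how SEMEDA is implemented; the database names, attribute names, concept labels and the helper function are all hypothetical.

```python
# Hypothetical mapping from (database, attribute) to a shared ontology concept.
ATTRIBUTE_TO_CONCEPT = {
    ("db1", "organism"): "Species",
    ("db2", "species_name"): "Species",
    ("db1", "ec_no"): "EC number",
    ("db2", "enzyme_class"): "EC number",
}


def attributes_for(concept):
    """Return every (database, attribute) pair annotated with the given concept."""
    return sorted(key for key, value in ATTRIBUTE_TO_CONCEPT.items() if value == concept)


# A query phrased against the concept can be fanned out to each database's own column.
species_columns = attributes_for("Species")
print(species_columns)
```

With such a mapping in place, a user can ask for "Species" without knowing that one source calls the column `organism` and another calls it `species_name`.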


Subject(s)
Database Management Systems , Databases, Factual , Information Storage and Retrieval/methods , Natural Language Processing , Terminology as Topic , Vocabulary, Controlled , Computational Biology/methods , Semantics , Systems Integration , User-Computer Interface