2.
IEEE Trans Inf Technol Biomed ; 10(4): 714-21, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17044405

ABSTRACT

In light of the increasing number of biological databases, their integration is a fundamental prerequisite for answering complex biological questions. Database integration, therefore, is an important area of research in bioinformatics. Since most of the publicly available life science databases are still exclusively exchanged by means of proprietary flat files, database integration requires parsers for very different flat-file formats. Unfortunately, the development and maintenance of database-specific flat-file parsers is a nontrivial and time-consuming task, which takes considerable effort in large-scale integration scenarios. This paper introduces heuristically based concepts for automatic structure extraction from life science database flat files. On the basis of these concepts, the FlatEx prototype was developed for the automatic conversion of flat files into XML representations.


Subjects
Algorithms; Biological Science Disciplines/methods; Database Management Systems; Databases, Factual; Electronic Data Processing; Hypermedia; Information Storage and Retrieval/methods; Artificial Intelligence; Software
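
The flat-file-to-XML conversion described in record 2 can be illustrated with a minimal sketch. It is not the FlatEx algorithm, which infers structure heuristically; here the structure is simply assumed in advance: each line starts with a short line code followed by a value, and a line containing only `//` ends a record. The function name `flat_to_xml`, the element names, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET


def flat_to_xml(text):
    """Convert 'CODE   value' lines into XML; a '//' line ends a record.

    Assumed layout only: the line-code and '//' conventions are
    illustrative and are not taken from the paper.
    """
    root = ET.Element("entries")
    entry = None
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        if line == "//":
            entry = None  # record terminator: next line starts a new entry
            continue
        code, _, value = line.partition(" ")
        if entry is None:
            entry = ET.SubElement(root, "entry")
        field = ET.SubElement(entry, "field", code=code)
        field.text = value.strip()
    return root


# Hypothetical sample input, made up for this sketch.
SAMPLE = """ID   P12345
DE   Hypothetical protein
//
ID   Q67890
DE   Another hypothetical protein
//
"""

root = flat_to_xml(SAMPLE)
xml_string = ET.tostring(root, encoding="unicode")
```

A hand-written parser like this has to be repeated for every flat-file format, which is the maintenance burden the abstract says automatic structure extraction is meant to reduce.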
3.
Nat Rev Genet ; 7(6): 482-8, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16682980

ABSTRACT

A prerequisite to systems biology is the integration of heterogeneous experimental data, which are stored in numerous life-science databases. However, a wide range of obstacles that relate to access, handling and integration impede the efficient use of the contents of these databases. Addressing these issues will not only be essential for progress in systems biology, it will also be crucial for sustaining the more traditional uses of life-science databases.


Subjects
Databases, Factual; Systems Biology; Animals; Biological Science Disciplines; Computer Simulation; Database Management Systems; Humans
4.
Bioinformatics ; 22(11): 1383-90, 2006 Jun 01.
Article in English | MEDLINE | ID: mdl-16533819

ABSTRACT

MOTIVATION: Assembling the relevant information needed to interpret the output from high-throughput, genome-scale experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant changes in expression levels, but more information is needed to determine their biological relevance. The challenge is to bring these genes together with biological information distributed across hundreds of databases or buried in the scientific literature (millions of articles). Software tools are needed to automate this task, which at present is labor-intensive and requires considerable informatics and biological expertise. RESULTS: This article describes ONDEX and how it can be applied to the task of interpreting gene expression results. ONDEX is a database system that combines the features of semantic database integration and text mining with methods for graph-based analysis. An overview of the ONDEX system is presented, concentrating on recently developed features for graph-based analysis and visualization. A case study is used to show how ONDEX can help to identify causal relationships between stress response genes and metabolic pathways from gene expression data. ONDEX also discovered functional annotations for most of the genes that emerged as significant in the microarray experiment but were previously of unknown function.


Subjects
Computational Biology/methods; Gene Expression Profiling/methods; Algorithms; Arabidopsis/genetics; Automation; Computer Graphics; Data Interpretation, Statistical; Databases, Genetic; Gene Expression Regulation; Natural Language Processing; Oligonucleotide Array Sequence Analysis; Software
5.
In Silico Biol ; 5(1): 33-44, 2005.
Article in English | MEDLINE | ID: mdl-15972003

ABSTRACT

The structure of a closely integrated data warehouse is described that is designed to link different types and varying numbers of biological networks, sequence analysis methods and experimental results such as those coming from microarrays. The data schema is inspired by a combination of graph-based methods and generalised data structures and makes use of ontologies and meta-data. The core idea is to consider and store biological networks as graphs, and to use generalised data structures (GDS) for the storage of further relevant information. This is possible because many biological networks can be stored as graphs: protein interactions, signal transduction networks, metabolic pathways, gene regulatory networks etc. Nodes in biological graphs represent entities such as promoters, proteins, genes and transcripts, whereas the edges of such graphs specify how the nodes are related. The semantics of the nodes and edges are defined using ontologies of node and relation types. Besides generic attributes that most biological entities possess (name, attribute description), further information is stored using generalised data structures. By directly linking to underlying sequences (exons, introns, promoters, amino acid sequences) in a systematic way, close interoperability with sequence analysis methods can be achieved. This approach allows us to store, query and update a wide variety of biological information in a way that is semantically compact without requiring changes at the database schema level when new kinds of biological information are added. We describe how this data warehouse is being implemented by extending the text-mining framework ONDEX to link, support and complement different bioinformatics applications and research activities such as microarray analysis, sequence analysis and modelling/simulation of biological systems. The system is developed under the GPL license and can be downloaded from http://sourceforge.net/projects/ondex/


Subjects
Computational Biology/methods; Algorithms; Animals; Biology/methods; Database Management Systems; Databases, Factual; Databases, Genetic; Databases, Protein; Gene Expression Profiling; Genome; Humans; Information Storage and Retrieval; Macromolecular Substances; Microarray Analysis; Proteins; RNA, Messenger/metabolism; Software; Systems Biology
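
The data model described in record 5 (typed nodes and edges plus generalised data structures for everything else) can be sketched roughly as follows. This is an illustrative sketch only, not the ONDEX schema or API: the class names `Node`, `Edge` and `BioGraph`, the type labels, and the sample entities are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    node_type: str  # term from a node-type ontology, e.g. "Gene" (assumed label)
    name: str
    # Stand-in for a generalised data structure: free-form attributes,
    # so new kinds of information need no schema change.
    gds: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    relation_type: str  # term from a relation-type ontology (assumed label)


class BioGraph:
    """Hypothetical in-memory graph of biological entities."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, source, target, relation_type):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, relation_type))

    def neighbours(self, node_id, relation_type=None):
        """Return target nodes of outgoing edges, optionally filtered by type."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id
            and (relation_type is None or e.relation_type == relation_type)
        ]


graph = BioGraph()
graph.add_node(Node("g1", "Gene", "geneA", gds={"sequence": "ATGGCC"}))
graph.add_node(Node("p1", "Protein", "proteinA"))
graph.add_edge("g1", "p1", "encodes")

# A new kind of information is attached without touching the class definitions.
graph.nodes["p1"].gds["microarray_fold_change"] = 2.5
```

Keeping the fixed part of the model small (identifier, type, name) and pushing everything else into the attribute dictionary mirrors the abstract's point that new information should not force changes at the schema level.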
6.
IEEE Trans Inf Technol Biomed ; 8(2): 154-60, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15217260

ABSTRACT

Several hundred life science databases with constantly growing contents and varying areas of specialization are publicly available via the internet. Database integration, consequently, is a fundamental prerequisite for answering complex biological questions. Due to the presence of syntactic, schematic, and semantic heterogeneities, large-scale database integration at present takes considerable effort. As there is a growing acceptance of extensible markup language (XML) as a means for data exchange in the life sciences, this article focuses on the impact of XML technology on database integration in this area. In detail, a general architecture for ontology-driven data integration based on XML technology is introduced, which overcomes some of the traditional problems in this area. As a proof of concept, a prototypical implementation of this architecture based on a native XML database and an expert system shell is described for the realization of a real-world integration scenario.


Subjects
Artificial Intelligence; Biological Science Disciplines/methods; Database Management Systems; Databases, Factual; Hypermedia; Information Storage and Retrieval/methods; Natural Language Processing; Systems Integration; Algorithms; Internet; Software; Software Design
7.
Bioinformatics ; 20(1): 51-7, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14693808

ABSTRACT

MOTIVATION: Due to the increasing number of molecular biological databases and the exponential growth of their contents, database integration is an important topic of research in bioinformatics. Existing approaches in this area have in common that considerable effort is needed to provide integrated access to heterogeneous data sources. RESULTS: This article describes the LIMBO architecture as a light-weight approach to molecular biological database integration. By building systems upon this architecture, the effort needed for database integration can be significantly reduced. AVAILABILITY: As an illustration of the principal usefulness of the underlying ideas, a prototypical implementation based upon the LIMBO architecture is described. This implementation is exclusively based on freely available open source components like the PostgreSQL database management system and the BioRuby project. Additional files and modified components are available upon request from the author.


Subjects
Computational Biology/methods; Database Management Systems/standards; Databases, Factual/standards; Information Storage and Retrieval/methods; Information Storage and Retrieval/standards; User-Computer Interface; Computational Biology/standards; Hypermedia; Pilot Projects; Software; Systems Integration
8.
Bioinformatics ; 19(18): 2420-7, 2003 Dec 12.
Article in English | MEDLINE | ID: mdl-14668226

ABSTRACT

MOTIVATION: Many molecular biological databases are implemented on relational Database Management Systems, which provide standard interfaces like JDBC and ODBC for data and metadata exchange. By using these interfaces, many technical problems of database integration vanish, while issues related to semantics remain, e.g. the use of different terms for the same things, different names for equivalent database attributes and missing links between relevant entries in different databases. RESULTS: In this publication, principles and methods that were used to implement SEMEDA (Semantic Meta Database) are described. Database owners can use SEMEDA to provide semantically integrated access to their databases as well as to collaboratively edit and maintain ontologies and controlled vocabularies. Biologists can use SEMEDA to query the integrated databases in real time without having to know the structure or any technical details of the underlying databases. AVAILABILITY: SEMEDA is available at http://www-bm.ipk-gatersleben.de/semeda/. Database providers who intend to grant access to their databases via SEMEDA are encouraged to contact the authors.


Subjects
Database Management Systems; Databases, Factual; Information Storage and Retrieval/methods; Natural Language Processing; Terminology as Topic; Vocabulary, Controlled; Computational Biology/methods; Semantics; Systems Integration; User-Computer Interface
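
The semantic problem named in record 8, different names for equivalent database attributes, can be illustrated with a small sketch. It is not SEMEDA's implementation: it uses Python's built-in `sqlite3` in place of JDBC/ODBC connections, and the table names, column names, concept labels and the `query_by_concept` helper are all hypothetical.

```python
import sqlite3

# Two made-up source tables that name the same attributes differently.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_a_enzymes (ec_no TEXT, enzyme_name TEXT);
CREATE TABLE db_b_proteins (ec_number TEXT, title TEXT);
INSERT INTO db_a_enzymes VALUES ('1.1.1.1', 'alcohol dehydrogenase');
INSERT INTO db_b_proteins VALUES ('2.7.1.1', 'hexokinase');
""")

# Assumed mapping from a shared concept to the (table, column) pairs
# that carry it in each source.
CONCEPT_MAP = {
    "EC number": [("db_a_enzymes", "ec_no"), ("db_b_proteins", "ec_number")],
    "Enzyme name": [("db_a_enzymes", "enzyme_name"), ("db_b_proteins", "title")],
}


def query_by_concept(conn, concept):
    """Collect all values for a concept without the caller knowing any schema."""
    results = []
    for table, column in CONCEPT_MAP[concept]:
        # Identifiers come only from the trusted mapping above, never from user input.
        rows = conn.execute(f"SELECT {column} FROM {table}")
        results.extend((table, row[0]) for row in rows)
    return results


ec_numbers = query_by_concept(conn, "EC number")
```

The caller asks for a concept rather than a column, which is the kind of schema-independent access the abstract describes for biologists querying integrated databases.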