Search | Virtual Health Library (Biblioteca Virtual em Saúde)

Flexible network reconstruction from relational databases with Cytoscape and CytoSQL.

Laukens, Kris; Hollunder, Jens; Dang, Thanh Hai; De Jaeger, Geert; Kuiper, Martin; Witters, Erwin; Verschoren, Alain; Van Leemput, Koenraad.

BMC Bioinformatics ; 11: 360, 2010 Jul 01.

Article in English | MEDLINE | ID: mdl-20594316

ABSTRACT

BACKGROUND: Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported into Cytoscape in various file formats, or directly from external databases through specialized third-party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. RESULTS: A new Cytoscape plugin, 'CytoSQL', was developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies, we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. CONCLUSIONS: CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL.

Subjects

Databases, Genetic; Software; Animals; Cell Physiological Phenomena; Genomics; Humans; Proteins/metabolism
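
The abstract above describes injecting node features of an existing network into an SQL query and turning the returned rows into network components. The following is a minimal sketch of that idea using Python's standard sqlite3 module, not the CytoSQL plugin itself; the `interactions` table, its columns, and the `expand_network` helper are hypothetical names chosen only for illustration.

```python
import sqlite3


def expand_network(conn, nodes):
    """Return (source, target, score) edges whose source is one of `nodes`.

    The node names are passed as bound SQL parameters, which mirrors the
    idea of injecting node attributes as query arguments.
    """
    nodes = list(nodes)
    if not nodes:
        return []
    placeholders = ",".join("?" for _ in nodes)
    query = (
        "SELECT source, target, score FROM interactions "
        f"WHERE source IN ({placeholders}) ORDER BY source, target"
    )
    return [tuple(row) for row in conn.execute(query, nodes)]


# Hypothetical in-memory database standing in for a real molecular data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interactions (source TEXT, target TEXT, score REAL)")
conn.executemany(
    "INSERT INTO interactions VALUES (?, ?, ?)",
    [("A", "B", 0.9), ("B", "C", 0.4), ("C", "D", 0.7)],
)

# Nodes already present in the network are used as query arguments;
# each returned row becomes one edge with a score attribute.
edges = expand_network(conn, ["A", "B"])
for source, target, score in edges:
    print(f"{source} -> {target} (score={score})")
```

Binding the node names as parameters rather than formatting them into the query string keeps the sketch safe against malformed identifiers.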

Prediction of kinase-specific phosphorylation sites using conditional random fields.

Dang, Thanh Hai; Van Leemput, Koenraad; Verschoren, Alain; Laukens, Kris.

Bioinformatics ; 24(24): 2857-64, 2008 Dec 15.

Article in English | MEDLINE | ID: mdl-18940828

ABSTRACT

MOTIVATION: Phosphorylation is a crucial post-translational protein modification mechanism with important regulatory functions in biological systems. It is catalyzed by a group of enzymes called kinases, each of which recognizes certain target sites in its substrate proteins. Several authors have built computational models trained from sets of experimentally validated phosphorylation sites to predict these target sites for each given kinase. All of these models suffer from certain limitations, such as the fact that they do not take into account the dependencies between amino acid motifs within protein sequences in a global fashion. RESULTS: We propose a novel approach to predict phosphorylation sites from the protein sequence. The method uses a positive dataset to train a conditional random field (CRF) model. The negative training dataset is used to specify the decision threshold corresponding to a desired false positive rate. Application of the method to experimentally verified benchmark phosphorylation data (Phospho.ELM) shows that it performs well compared to existing methods for most kinases. This is, to our knowledge, the first report of the use of CRFs to predict post-translational modification sites in protein sequences. AVAILABILITY: The source code of the implementation, called CRPhos, is available from http://www.ptools.ua.ac.be/CRPhos/

Subjects

Algorithms; Protein Kinases/metabolism; Computational Biology/methods; Databases, Protein; Phosphorylation; Protein Kinases/chemistry; Sequence Analysis, Protein
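
The abstract above states that the negative training set is used to fix a decision threshold for a desired false positive rate. The abstract does not give the procedure, so the sketch below shows only one generic way to do this: rank the scores a model assigns to known negatives and take the cutoff that lets at most the target fraction through. The scores and the `threshold_for_fpr` helper are hypothetical and do not come from CRPhos.

```python
def threshold_for_fpr(negative_scores, target_fpr):
    """Pick a cutoff so that at most `target_fpr` of negatives score above it.

    A site is then predicted positive when its score is strictly greater
    than the returned threshold.
    """
    if not negative_scores:
        raise ValueError("need at least one negative score")
    if not 0.0 <= target_fpr <= 1.0:
        raise ValueError("target_fpr must lie in [0, 1]")
    ranked = sorted(negative_scores, reverse=True)
    # Number of negatives that may exceed the threshold.
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


# Hypothetical model scores for ten known non-phosphorylated sites.
negative_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
threshold = threshold_for_fpr(negative_scores, target_fpr=0.2)

false_positives = sum(score > threshold for score in negative_scores)
observed_fpr = false_positives / len(negative_scores)
print(threshold, observed_fpr)
```

Using a strict comparison means tied scores at the cutoff are rejected, so the observed rate on the negatives never exceeds the target.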

Exploring the operational characteristics of inference algorithms for transcriptional networks by means of synthetic data.

Van Leemput, Koenraad; Van den Bulcke, Tim; Dhollander, Thomas; De Moor, Bart; Marchal, Kathleen; van Remortel, Piet.

Artif Life ; 14(1): 49-63, 2008.

Article in English | MEDLINE | ID: mdl-18171130

ABSTRACT

The development of structure-learning algorithms for gene regulatory networks depends heavily on the availability of synthetic data sets that contain both the original network and associated expression data. This article reports the application of SynTReN, an existing network generator that samples topologies from existing biological networks and uses Michaelis-Menten and Hill enzyme kinetics to simulate gene interactions. We illustrate the effects of different aspects of the expression data on the quality of the inferred network. The tested expression data parameters are network size, network topology, type and degree of noise, quantity of expression data, and interaction types between genes. This is done by applying three well-known inference algorithms to SynTReN data sets. The results show the power of synthetic data in revealing operational characteristics of inference algorithms that are unlikely to be discovered by means of biological microarray data only.

Subjects

Algorithms; Computer Simulation; Gene Regulatory Networks; Models, Biological; Transcription, Genetic; Artificial Intelligence; Databases, Genetic; Software
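
Because synthetic data come with the original network, the quality of an inferred network can be scored edge by edge. The abstract above does not say which quality measure was used; the sketch below assumes precision and recall over directed edges, a common choice, and all gene names and edge lists are made up for illustration.

```python
def score_inferred_network(true_edges, inferred_edges):
    """Return (precision, recall) of inferred directed edges against the truth."""
    true_set = set(true_edges)
    inferred_set = set(inferred_edges)
    true_positives = len(true_set & inferred_set)
    precision = true_positives / len(inferred_set) if inferred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    return precision, recall


# Hypothetical ground-truth network, as a generator would provide it.
true_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g4")]
# Hypothetical output of an inference algorithm run on the simulated data.
inferred_edges = [("g1", "g2"), ("g2", "g3"), ("g4", "g2")]

precision, recall = score_inferred_network(true_edges, inferred_edges)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Repeating this scoring while varying one data parameter at a time (for example noise level or sample count) is one way to chart how an algorithm responds to each parameter.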

Validating module network learning algorithms using simulated data.

Michoel, Tom; Maere, Steven; Bonnet, Eric; Joshi, Anagha; Saeys, Yvan; Van den Bulcke, Tim; Van Leemput, Koenraad; van Remortel, Piet; Kuiper, Martin; Marchal, Kathleen; Van de Peer, Yves.

BMC Bioinformatics ; 8 Suppl 2: S5, 2007 May 03.

Article in English | MEDLINE | ID: mdl-17493254

ABSTRACT

BACKGROUND: In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. RESULTS: Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. CONCLUSION: We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.

Subjects

Algorithms; Artificial Intelligence; Models, Biological; Proteome/metabolism; Signal Transduction/physiology; Software Validation; Software; Computer Simulation; Gene Expression Regulation/physiology; Pattern Recognition, Automated/methods; Systems Biology/methods
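
The abstract above mentions a conditional entropy measure for assigning regulators to regulation program nodes, without defining it. As a rough illustration only, the sketch below uses the textbook discrete conditional entropy H(Y | X) and picks the candidate regulator whose discretized expression leaves the least uncertainty about how a node splits the conditions. This is an assumption about the general idea, not LeMoNe's actual scoring; the regulator names and labels are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(y, x):
    """H(Y | X) in bits for two equally long sequences of discrete labels."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    joint = Counter(zip(x, y))
    marginal_x = Counter(x)
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2(p(x, y) / p(x))
    return -sum(
        (count / n) * log2(count / marginal_x[x_value])
        for (x_value, _), count in joint.items()
    )


# Hypothetical partition of four conditions at one regulation program node.
split = [0, 0, 1, 1]
# Hypothetical discretized expression (low=0, high=1) of two candidate regulators.
candidates = {"R1": [0, 0, 1, 1], "R2": [0, 1, 0, 1]}

scores = {name: conditional_entropy(split, levels) for name, levels in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

A score of zero bits means the regulator's state fully determines the split, while one bit here means it carries no information about it.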

SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms.

Van den Bulcke, Tim; Van Leemput, Koenraad; Naudts, Bart; van Remortel, Piet; Ma, Hongwu; Verschoren, Alain; De Moor, Bart; Marchal, Kathleen.

BMC Bioinformatics ; 7: 43, 2006 Jan 26.

Article in English | MEDLINE | ID: mdl-16438721

ABSTRACT

BACKGROUND: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner. RESULTS: In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximates experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms. CONCLUSION: This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.

Subjects

Algorithms; Gene Expression Regulation/physiology; Models, Biological; Signal Transduction/physiology; Software Validation; Software; Transcription Factors/metabolism; Artificial Intelligence; Benchmarking/methods; Computer Simulation; Databases, Factual
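
The abstract above says that interaction kinetics are modeled with equations based on Michaelis-Menten and Hill kinetics. The exact equations are not given there, so the sketch below shows only the textbook Hill response functions, which reduce to the Michaelis-Menten form when the Hill coefficient is 1. The parameter names `k` and `n` and the sample values are assumptions for illustration.

```python
def hill_activation(x, k, n=1.0):
    """Fraction of maximal transcription driven by an activator at level x.

    k is the activator level giving a half-maximal response and n is the
    Hill coefficient; n = 1 gives the Michaelis-Menten form x / (k + x).
    """
    if x < 0 or k <= 0 or n <= 0:
        raise ValueError("require x >= 0, k > 0 and n > 0")
    return x**n / (k**n + x**n)


def hill_repression(x, k, n=1.0):
    """Fraction of maximal transcription remaining under a repressor at level x."""
    return 1.0 - hill_activation(x, k, n)


# Response of a target gene to increasing regulator levels (arbitrary units).
for level in (0.0, 0.5, 1.0, 2.0):
    activated = hill_activation(level, k=1.0, n=2.0)
    repressed = hill_repression(level, k=1.0, n=2.0)
    print(f"x={level:.1f} activation={activated:.2f} repression={repressed:.2f}")
```

Larger Hill coefficients make the response more switch-like around `k`, which is one way a simulator can vary the difficulty of the resulting data set.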
