Search | BVS Regional Portal

An integrated workflow for robust alignment and simplified quantitative analysis of NMR spectrometry data.

Vu, Trung N; Valkenborg, Dirk; Smets, Koen; Verwaest, Kim A; Dommisse, Roger; Lemière, Filip; Verschoren, Alain; Goethals, Bart; Laukens, Kris.

BMC Bioinformatics ; 12: 405, 2011 Oct 20.

Article in English | MEDLINE | ID: mdl-22014236

ABSTRACT

BACKGROUND: Nuclear magnetic resonance (NMR) spectroscopy is a powerful technique to reveal and compare quantitative metabolic profiles of biological tissues. However, chemical and physical sample variations make the analysis of the data challenging and typically require a number of preprocessing steps prior to data interpretation. For example, noise reduction, normalization, baseline correction, peak picking, spectrum alignment and statistical analysis are indispensable components in any NMR analysis pipeline. RESULTS: We introduce a novel suite of informatics tools for the quantitative analysis of NMR metabolomic profile data. The core of the processing cascade is a novel peak alignment algorithm, called hierarchical Cluster-based Peak Alignment (CluPA). The algorithm aligns a target spectrum to the reference spectrum in a top-down fashion by building a hierarchical cluster tree from peak lists of reference and target spectra and then dividing the spectra into smaller segments based on the most distant clusters of the tree. To reduce the computational time needed to estimate the spectral misalignment, the method makes use of fast Fourier transform (FFT) cross-correlation. Since the method returns a high-quality alignment, we can propose a simple methodology to study the variability of the NMR spectra. For each aligned NMR data point, the ratio of the between-group and within-group sums of squares (BW-ratio) is calculated to quantify the difference in variability between and within predefined groups of NMR spectra. This differential analysis is related to the calculation of the F-statistic or a one-way ANOVA, but without distributional assumptions. Statistical inference based on the BW-ratio is achieved by bootstrapping the null distribution from the experimental data. CONCLUSIONS: The workflow performance was evaluated using a previously published dataset. Correlation maps, spectral and grey scale plots show clear improvements in comparison to other methods, and the down-to-earth quantitative analysis works well for the CluPA-aligned spectra. The whole workflow is embedded into a modular and statistically sound framework that is implemented as an R package called "speaq" ("spectrum alignment and quantitation"), which is freely available from http://code.google.com/p/speaq/.
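The BW-ratio described in this abstract can be illustrated with a minimal Python sketch. It is not the speaq implementation (speaq is an R package). The function names `bw_ratio` and `resampled_p_value` are hypothetical, and the abstract does not specify the exact resampling scheme, so group-label shuffling is used here as an assumed stand-in for building the null distribution.

```python
import random
from collections import defaultdict


def bw_ratio(values, groups):
    """Between-group over within-group sum of squares for one data point.

    values: intensities of one aligned data point, one per spectrum.
    groups: group label of each spectrum, in the same order.
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    grand_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        between += len(members) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in members)
    # Raises ZeroDivisionError when every group has zero internal variance.
    return between / within


def resampled_p_value(values, groups, n_resamples=1000, seed=0):
    """Estimate how often a relabelled data set reaches the observed BW-ratio.

    Assumption: labels are shuffled to approximate the null distribution;
    the published method may resample differently.
    """
    rng = random.Random(seed)
    observed = bw_ratio(values, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(labels)
        if bw_ratio(values, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)
```

In a full analysis this calculation would be repeated for every aligned data point, so that points with a large BW-ratio mark spectral regions that differ between the predefined groups.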

Subject(s)

Algorithms; Cluster Analysis; Magnetic Resonance Spectroscopy/methods; Analysis of Variance; Magnetic Resonance Imaging; Metabolomics; Software; Workflow

Flexible network reconstruction from relational databases with Cytoscape and CytoSQL.

Laukens, Kris; Hollunder, Jens; Dang, Thanh Hai; De Jaeger, Geert; Kuiper, Martin; Witters, Erwin; Verschoren, Alain; Van Leemput, Koenraad.

BMC Bioinformatics ; 11: 360, 2010 Jul 01.

Article in English | MEDLINE | ID: mdl-20594316

ABSTRACT

BACKGROUND: Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported into Cytoscape in various file formats, or directly from external databases through specialized third-party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. RESULTS: A new Cytoscape plugin, 'CytoSQL', was developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies, we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. CONCLUSIONS: CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL.
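The idea of injecting node features as SQL arguments and turning the result rows into network components can be sketched in Python with the standard `sqlite3` module. This is not the CytoSQL plugin or its API. The `interactions` table, its columns, and the functions `demo_connection` and `neighbors_as_edges` are hypothetical names chosen for illustration.

```python
import sqlite3


def demo_connection():
    """Build a small in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions (source TEXT, target TEXT, kind TEXT)"
    )
    conn.executemany(
        "INSERT INTO interactions VALUES (?, ?, ?)",
        [
            ("A", "B", "binds"),
            ("A", "C", "phosphorylates"),
            ("B", "D", "binds"),
        ],
    )
    return conn


def neighbors_as_edges(conn, node_ids):
    """Query interactions for the given nodes and return (nodes, edges).

    Node identifiers are bound as SQL parameters rather than pasted into
    the query string, so they cannot alter the statement itself.
    """
    node_ids = list(node_ids)
    if not node_ids:
        return [], []
    placeholders = ",".join("?" for _ in node_ids)
    query = (
        "SELECT source, target, kind FROM interactions "
        f"WHERE source IN ({placeholders}) ORDER BY source, target"
    )
    rows = conn.execute(query, node_ids).fetchall()

    nodes = set(node_ids)
    edges = []
    for source, target, kind in rows:
        nodes.update((source, target))
        edges.append(
            {"source": source, "target": target, "interaction": kind}
        )
    return sorted(nodes), edges
```

Calling `neighbors_as_edges(demo_connection(), ["A"])` expands the single selected node into its direct interaction partners, which mirrors the enrich-an-existing-network use case the abstract describes.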

Subject(s)

Databases, Genetic; Software; Animals; Cell Physiological Phenomena; Genomics; Humans; Proteins/metabolism

Prediction of kinase-specific phosphorylation sites using conditional random fields.

Dang, Thanh Hai; Van Leemput, Koenraad; Verschoren, Alain; Laukens, Kris.

Bioinformatics ; 24(24): 2857-64, 2008 Dec 15.

Article in English | MEDLINE | ID: mdl-18940828

ABSTRACT

MOTIVATION: Phosphorylation is a crucial post-translational protein modification mechanism with important regulatory functions in biological systems. It is catalyzed by a group of enzymes called kinases, each of which recognizes certain target sites in its substrate proteins. Several authors have built computational models, trained on sets of experimentally validated phosphorylation sites, to predict these target sites for each given kinase. All of these models suffer from certain limitations, such as the fact that they do not take into account the dependencies between amino acid motifs within protein sequences in a global fashion. RESULTS: We propose a novel approach to predict phosphorylation sites from the protein sequence. The method uses a positive dataset to train a conditional random field (CRF) model. The negative training dataset is used to specify the decision threshold corresponding to a desired false positive rate. Application of the method to experimentally verified benchmark phosphorylation data (Phospho.ELM) shows that it performs well compared to existing methods for most kinases. This is, to our knowledge, the first report of the use of CRFs to predict post-translational modification sites in protein sequences. AVAILABILITY: The source code of the implementation, called CRPhos, is available from http://www.ptools.ua.ac.be/CRPhos/
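The step in which negative examples fix the decision threshold for a desired false positive rate can be sketched in Python as follows. This is not the CRPhos code and it does not include the CRF itself. It assumes that some trained model has already assigned a numeric score to each negative site, and the function names `threshold_for_fpr` and `empirical_fpr` are hypothetical.

```python
def threshold_for_fpr(negative_scores, target_fpr):
    """Pick a threshold so that at most target_fpr of negatives exceed it.

    A site is predicted positive when its score is strictly greater than
    the returned threshold.
    """
    if not negative_scores:
        raise ValueError("at least one negative score is required")
    if not 0.0 <= target_fpr <= 1.0:
        raise ValueError("target_fpr must lie in [0, 1]")

    ranked = sorted(negative_scores, reverse=True)
    # Number of negatives that may score above the threshold.
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every negative may be called positive
    return ranked[allowed]


def empirical_fpr(negative_scores, threshold):
    """Fraction of negative scores strictly above the threshold."""
    above = sum(1 for score in negative_scores if score > threshold)
    return above / len(negative_scores)
```

Because ties at the threshold are counted as negative predictions, the empirical false positive rate on the negative set never exceeds the requested rate, although it can fall below it.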

Subject(s)

Algorithms; Protein Kinases/metabolism; Computational Biology/methods; Databases, Protein; Phosphorylation; Protein Kinases/chemistry; Sequence Analysis, Protein

SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms.

Van den Bulcke, Tim; Van Leemput, Koenraad; Naudts, Bart; van Remortel, Piet; Ma, Hongwu; Verschoren, Alain; De Moor, Bart; Marchal, Kathleen.

BMC Bioinformatics ; 7: 43, 2006 Jan 26.

Article in English | MEDLINE | ID: mdl-16438721

ABSTRACT

BACKGROUND: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner. RESULTS: In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximate experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms. CONCLUSION: This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks resemble those of real transcriptional networks more closely than the characteristics of random graph models do. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.
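For readers unfamiliar with the kinetics named in this abstract, the sketch below gives the generic textbook forms of the Hill and Michaelis-Menten response functions in Python. These are not SynTReN's actual equations, which the abstract does not state. The function and parameter names are hypothetical.

```python
def hill_activation(regulator, k, n):
    """Fraction of maximal transcription driven by an activator.

    regulator: activator concentration (non-negative).
    k: concentration giving a half-maximal response.
    n: Hill coefficient controlling the steepness of the response.
    """
    if regulator < 0 or k <= 0 or n <= 0:
        raise ValueError("regulator must be >= 0; k and n must be > 0")
    return regulator ** n / (k ** n + regulator ** n)


def hill_repression(regulator, k, n):
    """Fraction of maximal transcription remaining under a repressor."""
    return 1.0 - hill_activation(regulator, k, n)


def michaelis_menten(regulator, k):
    """Michaelis-Menten response, the Hill form with coefficient 1."""
    return hill_activation(regulator, k, 1)
```

A simulated expression level could then be obtained by scaling such a response by a maximal transcription rate and adding noise, with steeper Hill coefficients producing more switch-like regulation.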

Subject(s)

Algorithms; Gene Expression Regulation/physiology; Models, Biological; Signal Transduction/physiology; Software Validation; Software; Transcription Factors/metabolism; Artificial Intelligence; Benchmarking/methods; Computer Simulation; Databases, Factual
