1.
J Comput Biol ; 31(6): 513-523, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38814745

ABSTRACT

Single-cell transcriptomic studies of differentiating systems offer meaningful insight, especially into human embryonic development and cell fate determination. We present an innovative method aimed at modeling these intricate processes by leveraging scRNAseq data from various human developmental stages. Our implemented method identifies pseudo-perturbations, since actual perturbations are unavailable due to ethical and technical constraints. By integrating these pseudo-perturbations with prior knowledge of gene interactions, our framework generates stage-specific Boolean networks (BNs). We apply our method to medium and late trophectoderm developmental stages and identify 20 pseudo-perturbations required to infer BNs. The resulting BN families delineate distinct regulatory mechanisms, enabling the differentiation between these developmental stages. We show that our program outperforms an existing pseudo-perturbation identification tool. Our framework contributes to comprehending human developmental processes and holds potential applicability to diverse developmental stages and other research scenarios.


Subjects
Embryonic Development , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Humans , Embryonic Development/genetics , Single-Cell Analysis/methods , Transcriptome , Blastocyst/metabolism , Cell Differentiation/genetics , Computational Biology/methods
2.
BMC Bioinformatics ; 24(Suppl 1): 321, 2023 Aug 25.
Article in English | MEDLINE | ID: mdl-37626282

ABSTRACT

BACKGROUND: The impact of a perturbation, over-expression, or repression of a key node on an organism can be modelled based on a regulatory and/or metabolic network. Integration of these two networks could improve our global understanding of biological mechanisms triggered by a perturbation. This study focuses on improving the modelling of the regulatory network to facilitate a possible integration with the metabolic network. Previously proposed methods that study this problem fail to handle a real-size regulatory network, to compute predictions sensitive to perturbation, and to quantify the predicted species behaviour finely. RESULTS: To address these limitations, we develop a new method based on Answer Set Programming, MajS. It takes a regulatory network and a discrete partial set of observations as input. MajS tests the consistency between the input data, proposes minimal repairs on the network to establish consistency, and finally computes weighted and signed predictions over the network species. We tested MajS by comparing the HIF-1 signalling pathway with two gene-expression datasets. Our results show that MajS can predict 100% of unobserved species. When comparing MajS with two similar (discrete and quantitative) tools, we observed that, compared with the discrete tool, MajS offers better coverage of the unobserved species, is more sensitive to system perturbations, and proposes predictions closer to real data. Compared to the quantitative tool, MajS provides more refined discrete predictions that agree with the dynamics proposed by the quantitative tool. CONCLUSIONS: MajS is a new method to test the consistency between a regulatory network and a dataset that provides computational predictions on unobserved network species. It provides fine-grained discrete predictions by outputting the weight of the predicted sign as a piece of additional information. MajS' output, thanks to its weight, could easily be integrated with metabolic network modelling.


Subjects
Signal Transduction , Gene Expression
3.
BMC Bioinformatics ; 21(1): 18, 2020 Jan 14.
Article in English | MEDLINE | ID: mdl-31937236

ABSTRACT

BACKGROUND: Integrating genome-wide gene expression patient profiles with regulatory knowledge is a challenging task because of the inherent heterogeneity, noise and incompleteness of biological data. From the computational side, several solvers for logic programs are able to perform extremely well in decision problems for combinatorial search domains. The challenge then is how to process the biological knowledge in order to feed these solvers to gain insights in a biological study. It requires formalizing the biological knowledge to give a precise interpretation of this information; currently, very few pathway databases offer this possibility. RESULTS: The presented work proposes an automatic pipeline to extract regulatory knowledge from pathway databases and generate novel computational predictions related to the state of expression or activity of biological molecules. We applied it in the context of hepatocellular carcinoma (HCC) progression, and evaluated the precision and the stability of these computational predictions. Our working base is a graph of 3383 nodes and 13,771 edges extracted from the KEGG database, in which we integrate 209 differentially expressed genes between low and high aggressive HCC across 294 patients. Our computational model predicts the shifts of expression of 146 initially non-observed biological components. Our predictions were validated at 88% using a larger experimental dataset and cross-validation techniques. In particular, we focus on the protein complexes predictions and show for the first time that NFKB1/BCL-3 complexes are activated in aggressive HCC. In spite of the large dimension of the reconstructed models, our analyses over the computational predictions discover a well constrained region where KEGG regulatory knowledge constrains gene expression of several biomolecules. Such regions can offer interesting windows to perturb experimentally such complex systems.
CONCLUSION: This new pipeline allows biologists to develop their own predictive models based on a list of genes. It facilitates the identification of new regulatory biomolecules using knowledge graphs and predictive computational methods. Our workflow is implemented in an automatic Python pipeline which is publicly available at https://github.com/LokmaneChebouba/key-pipe and contains, as testing data, all the data used in this paper.


Subjects
Carcinoma, Hepatocellular/genetics , Liver Neoplasms/genetics , Computational Biology/methods , Databases, Genetic , Disease Progression , Gene Regulatory Networks , Humans , Transcriptome , Workflow
4.
PLoS Comput Biol ; 14(10): e1006538, 2018 10.
Article in English | MEDLINE | ID: mdl-30372442

ABSTRACT

Protein signaling networks are static views of dynamic processes where proteins go through many biochemical modifications such as ubiquitination and phosphorylation to propagate signals that regulate cells and can act as feedback systems. Understanding the precise mechanisms underlying protein interactions can elucidate how signaling and cell cycle progression occur within cells in different diseases such as cancer. Large-scale protein signaling networks contain a large number of experimentally verified protein relations but lack the capability to predict the outcomes of the system, and therefore to be trained with respect to experimental measurements. Boolean Networks (BNs) are a simple yet powerful framework to study and model the dynamics of the protein signaling networks. While many BN approaches exist to model biological systems, they focus mainly on system properties, and few exist to integrate experimental data in them. In this work, we show an application of a method conceived to integrate time series phosphoproteomic data into protein signaling networks. We use a large-scale real case study from the HPN-DREAM Breast Cancer challenge. Our efficient and parameter-free method combines logic programming and model-checking to infer a family of BNs from multiple perturbation time series data of four breast cancer cell lines given a prior protein signaling network. Because each predicted BN family is cell line specific, our method highlights commonalities and discrepancies between the four cell lines. Our models have a Root Mean Square Error (RMSE) of 0.31 with respect to the testing data, while the best-performing method of this HPN-DREAM challenge had an RMSE of 0.47. To further validate our results, BNs are compared with the canonical mTOR pathway showing a comparable AUROC score (0.77) to the top performing HPN-DREAM teams. In addition, our approach can also be used as a complementary method to identify erroneous experiments.
These results show that our methodology is an efficient dynamic model discovery method for multiple-perturbation time-course experimental data of large-scale signaling networks. The software and data are publicly available at https://github.com/misbahch6/caspo-ts.


Subjects
Models, Biological , Neoplasms/genetics , Protein Interaction Maps/genetics , Proteomics/methods , Signal Transduction/genetics , Algorithms , Cell Line, Tumor , Humans , Neoplasms/metabolism , Phosphoproteins/genetics , Phosphoproteins/metabolism
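
The abstract of this record scores models by Root Mean Square Error (RMSE) against test data. As a minimal sketch of that metric only, assuming Boolean predictions are compared with measurements normalised to [0, 1] (the function name and the sample values are hypothetical and are not taken from the paper or from caspo-ts):

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length numeric sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical example: Boolean predictions vs. normalised measurements.
predicted = [1, 0, 1, 1]
observed = [0.8, 0.1, 0.6, 1.0]
print(round(rmse(predicted, observed), 3))
```

A lower value means the predicted trajectories sit closer to the measurements, which is the sense in which 0.31 is reported as better than 0.47.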
5.
J Med Syst ; 42(7): 129, 2018 Jun 04.
Article in English | MEDLINE | ID: mdl-29869179

ABSTRACT

The use of data from high-throughput technologies in drug target problems has become widespread in recent decades. This study proposes a meta-heuristic framework using stochastic local search (SLS) combined with random forest (RF), where the aim is to specify the most important genes and proteins leading to the best classification of Acute Myeloid Leukemia (AML) patients. First, we use a stochastic local search meta-heuristic as a feature selection technique to select the most significant proteins to be used in the classification step. Then we apply RF to classify new patients into their corresponding classes. The evaluation technique is to run the RF classifier on the training data to get a model. Then, we apply this model to the test data to find the appropriate class. We use as metrics the balanced accuracy (BAC) and the area under the receiver operating characteristic curve (AUROC) to measure the performance of our model. The proposed method is evaluated on the dataset from the DREAM 9 challenge. The comparison is done with a pure random forest (without feature selection), and with the two best ranked results of the DREAM 9 challenge. We used three types of data: only clinical data, only proteomics data, and finally clinical and proteomics data combined. The numerical results show that the highest scores are obtained when using clinical data alone, and the lowest are obtained when using proteomics data alone. Further, our method achieves promising results compared to the methods presented in the DREAM challenge.


Subjects
Leukemia, Myeloid, Acute/diagnosis , Proteomics , Algorithms , Area Under Curve , Humans , ROC Curve
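
The two metrics named in this abstract, balanced accuracy (BAC) and AUROC, can be sketched in plain Python. This is an illustrative sketch for binary labels only; the function names are hypothetical and nothing here reproduces the DREAM 9 scoring code:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half, which matches the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

BAC is useful when the response classes are imbalanced, since a classifier that always predicts the majority class scores only 0.5.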
6.
BMC Bioinformatics ; 19(Suppl 2): 59, 2018 03 08.
Article in English | MEDLINE | ID: mdl-29536824

ABSTRACT

BACKGROUND: In recent years, several approaches have been applied to biomedical data to detect disease-specific proteins and genes in order to better target drugs. It was shown that statistical and machine learning based methods mainly use clinical data and later improve their results by adding omics data. This work proposes a new method to discriminate the response of Acute Myeloid Leukemia (AML) patients to treatment. The proposed approach uses proteomics data and prior regulatory knowledge in the form of networks to predict cancer treatment outcomes by finding the different Boolean networks specific to each type of response to drugs. To show its effectiveness we evaluate our method on a dataset from the DREAM 9 challenge. RESULTS: The results are encouraging and demonstrate the benefit of our approach to distinguish patient groups with different responses to treatment. In particular, each treatment response group is characterized by a predictive model in the form of a signaling Boolean network. This model describes regulatory mechanisms which are specific to each response group. The proteins in this model were selected from the complete dataset by imposing optimization constraints that maximize the difference in the logical response of the Boolean network associated with each group of patients given the omic dataset. This mechanistic and predictive model also allows us to classify new patient data into the two different patient response groups. CONCLUSIONS: We propose a new method to detect the most relevant proteins for understanding different patient responses to treatment in order to better target drugs using a Prior Knowledge Network and proteomics data. The results are interesting and show the effectiveness of our method.


Subjects
Algorithms , Leukemia, Myeloid, Acute/metabolism , Leukemia, Myeloid, Acute/therapy , Proteomics , Databases, Protein , Humans , Logic , Protein Interaction Maps , Reproducibility of Results
7.
R Soc Open Sci ; 5(2): 171852, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29515890

ABSTRACT

In a previous article, an algorithm for identifying therapeutic targets in Boolean networks modelling pathological mechanisms was introduced. In the present article, the improvements made to this algorithm, named kali, are described. These improvements are (i) the possibility to work on asynchronous Boolean networks, (ii) a finer assessment of therapeutic targets and (iii) the possibility to use multivalued logic. kali assumes that the attractors of a dynamical system, such as a Boolean network, are associated with the phenotypes of the modelled biological system. Given a logic-based model of pathological mechanisms, kali searches for therapeutic targets able to reduce the reachability of the attractors associated with pathological phenotypes, thus reducing their likelihood. kali is illustrated on an example network and used on a biological case study. The case study is a published logic-based model of bladder tumorigenesis from which kali returns consistent results. However, like any computational tool, kali can predict but cannot replace human expertise: it is a supporting tool for coping with the complexity of biological systems in the field of drug discovery.
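
The idea that attractors stand for phenotypes, and that clamping a node can remove an unwanted attractor, can be illustrated with a toy example. The sketch below is not kali: it assumes a synchronous update scheme (kali also handles asynchronous and multivalued networks), enumerates all states exhaustively, and uses a hypothetical two-node network and hypothetical function names.

```python
from itertools import product


def attractors(update, n):
    """Attractors of a synchronous Boolean network with n nodes.

    `update` maps a state (tuple of 0/1) to its successor state.
    Each attractor is returned as a frozenset of states.
    """
    found = set()
    for state in product((0, 1), repeat=n):
        seen = []
        while state not in seen:
            seen.append(state)
            state = update(state)
        # `state` is the first repeated state, so the cycle starts there.
        found.add(frozenset(seen[seen.index(state):]))
    return found


def force(update, index, value):
    """Wrap `update` so that node `index` is clamped to `value` at each step."""
    def clamped(state):
        nxt = list(update(state))
        nxt[index] = value
        return tuple(nxt)
    return clamped


# Hypothetical network with mutual inhibition: x' = not y, y' = not x.
def toggle(state):
    x, y = state
    return (1 - y, 1 - x)


print(attractors(toggle, 2))               # unperturbed network
print(attractors(force(toggle, 0, 0), 2))  # node x clamped to 0
```

In this toy network, clamping x to 0 leaves a single fixed point, (0, 1). If (1, 0) were the pathological phenotype, x would be a candidate target in the sense described in the abstract.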

8.
BMC Syst Biol ; 12(Suppl 3): 32, 2018 03 21.
Article in English | MEDLINE | ID: mdl-29589566

ABSTRACT

BACKGROUND: The integration of gene expression profiles (GEPs) and large-scale biological networks derived from pathway databases is a subject which is being widely explored. Existing methods are based on network distance measures among significantly measured species. Only a small number of them include the directionality and underlying logic existing in biological networks. In this study we approach the GEP-networks integration problem by considering the network logic; however, our approach does not require a prior species selection according to their gene expression level. RESULTS: We start by modeling the biological network representing its underlying logic using Logic Programming. This model points to reachable network discrete states that maximize a notion of harmony between the possible active or inactive states of the molecular species and the directionality of the pathway reactions according to their activator or inhibitor control role. Only then do we confront these network states with the GEP. From this confrontation independent graph components are derived, each of them related to a fixed and optimal assignment of active or inactive states. These components allow us to decompose a large-scale network into subgraphs, and their molecular species state assignments have different degrees of similarity when compared to the same GEP. We apply our method to study the set of possible states derived from a subgraph from the NCI-PID Pathway Interaction Database. This graph links Multiple Myeloma (MM) genes to known receptors for this blood cancer. CONCLUSION: We discover that the NCI-PID MM graph had 15 independent components, and when confronted with 611 MM GEPs, we find 1 component as being more specific to represent the difference between cancer and healthy profiles.


Subjects
Computational Biology , Computer Graphics , Gene Regulatory Networks , Logic , Multiple Myeloma/genetics , Multiple Myeloma/pathology , Signal Transduction , Models, Biological
9.
Sci Rep ; 7(1): 9257, 2017 08 23.
Article in English | MEDLINE | ID: mdl-28835615

ABSTRACT

Innovative approaches combining regulatory networks (RN) and genomic data are needed to extract biological information for a better understanding of diseases, such as cancer, by improving the identification of entities and thereby leading to potential new therapeutic avenues. In this study, we confronted an automatically generated RN with gene expression profiles (GEP) from a cohort of multiple myeloma (MM) patients and normal individuals using global reasoning on the RN causality to identify key-nodes. We modeled each patient by his or her GEP, the RN and the possible automatically detected repairs needed to establish a coherent flow of the information that explains the logic of the GEP. These repairs could represent cancer mutations leading to GEP variability. With this reasoning, unmeasured protein states can be inferred, and we can simulate the impact of a protein perturbation on the RN behavior to identify therapeutic targets. We showed that JUN/FOS and FOXM1 activities are altered in almost all MM patients and identified two survival markers for MM patients. Our results suggest that JUN/FOS-activation has a strong impact on the RN in view of the whole GEP, whereas FOXM1-activation could be an interesting way to perturb an MM subgroup identified by our method.


Subjects
Cellular Reprogramming/genetics , Multiple Myeloma/genetics , Multiple Myeloma/metabolism , Transcription Factors/metabolism , Algorithms , Computational Biology/methods , Forkhead Box Protein M1/metabolism , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , JNK Mitogen-Activated Protein Kinases/metabolism , Models, Biological , Multiple Myeloma/mortality , Multiple Myeloma/pathology , Oncogene Proteins v-fos/metabolism , Reproducibility of Results , Software , Transcriptome
10.
Algorithms Mol Biol ; 12: 19, 2017.
Article in English | MEDLINE | ID: mdl-28736575

ABSTRACT

BACKGROUND: Numerous cellular differentiation processes can be captured using discrete qualitative models of biological regulatory networks. These models describe the temporal evolution of the state of the network subject to different competing transitions, potentially leading the system to different attractors. This paper focusses on the formal identification of states and transitions that are crucial for preserving or pre-empting the reachability of a given behaviour. METHODS: In the context of non-deterministic automata networks, we propose a static identification of so-called bifurcations, i.e., transitions after which a given goal is no longer reachable. Such transitions are naturally good candidates for controlling the occurrence of the goal, notably by modulating their propensity. Our method combines Answer-Set Programming with static analysis of reachability properties to provide an under-approximation of all the existing bifurcations. RESULTS: We illustrate our discrete bifurcation analysis on several models of biological systems, for which we identify transitions which impact the reachability of a given long-term behaviour. In particular, we apply our implementation to a regulatory network among hundreds of biological species, supporting the scalability of our approach. CONCLUSIONS: Our method allows a formal and scalable identification of transitions which are responsible for the loss of the capability to reach a given state. It can be applied to any asynchronous automata network, a class that encompasses Boolean and multi-valued models. An implementation is provided as part of the Pint software, available at http://loicpauleve.name/pint.
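
The definition of a bifurcation given in this abstract (a transition after which the goal is no longer reachable) can be made concrete on a small explicit state graph. The sketch below is an assumption-laden illustration, not the method of the paper: it explores the state graph exhaustively instead of using Answer-Set Programming with static analysis, so it does not scale the way Pint does. The graph and function names are hypothetical.

```python
def can_reach(graph, start, goal):
    """Depth-first search: is a state satisfying `goal` reachable from `start`?"""
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        if goal(state):
            return True
        for nxt in graph.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def bifurcations(graph, initial, goal):
    """Transitions (s, t) reachable from `initial` such that the goal is
    reachable from s but no longer reachable from t."""
    result = []
    stack, seen = [initial], {initial}
    while stack:
        s = stack.pop()
        for t in graph.get(s, ()):
            if can_reach(graph, s, goal) and not can_reach(graph, t, goal):
                result.append((s, t))
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return sorted(result)


# Hypothetical state graph: from "a" the system either heads to the goal
# via "b" or gets trapped in the cycle c <-> d.
graph = {"a": ["b", "c"], "b": ["goal"], "c": ["d"], "d": ["c"], "goal": []}
print(bifurcations(graph, "a", lambda s: s == "goal"))
```

Here the only bifurcation is the transition from "a" to "c": before it the goal is reachable, after it the system can only cycle between "c" and "d".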

11.
Bioinformatics ; 33(6): 947-950, 2017 03 15.
Article in English | MEDLINE | ID: mdl-28065903

ABSTRACT

Summary: We introduce the caspo toolbox, a Python package implementing a workflow for reasoning on families of logical networks. Our software allows researchers to (i) learn a family of logical networks derived from a given topology and explaining the experimental response to various perturbations; (ii) classify all logical networks in a given family by their input-output behaviors; (iii) predict the response of the system to every possible perturbation based on the ensemble of predictions; (iv) design new experimental perturbations to discriminate among a family of logical networks; and (v) control a family of logical networks by finding all intervention strategies forcing a set of targets into a desired steady state. Availability and Implementation: caspo is open-source software distributed under the GPLv3 license. Source code is publicly hosted at http://github.com/bioasp/caspo . Contact: anne.siegel@irisa.fr.


Subjects
Signal Transduction , Software , Systems Biology/methods , Hepatocytes/metabolism , Humans , Models, Biological , Phosphoproteins , Workflow
12.
BMC Bioinformatics ; 17: 35, 2016 Jan 16.
Article in English | MEDLINE | ID: mdl-26772805

ABSTRACT

BACKGROUND: Gene co-expression evidenced as a response to environmental changes has shown that transcriptional activity is coordinated, which pinpoints the role of transcriptional regulatory networks (TRNs). Nevertheless, the prediction of TRNs based on the affinity of transcription factors (TFs) with binding sites (BSs) generally produces an over-estimation of the observable TF/BS relations within the network and therefore many of the predicted relations are spurious. RESULTS: We present LOMBARDE, a bioinformatics method that extracts from a TRN determined from a set of predicted TF/BS affinities a subnetwork explaining a given set of observed co-expressions by choosing the TFs and BSs most likely to be involved in the co-regulation. LOMBARDE solves an optimization problem which selects confident paths within a given TRN that join a putative common regulator with two co-expressed genes via regulatory cascades. To evaluate the method, we used public data of Escherichia coli to produce a regulatory network that explained almost all observed co-expressions while using only 19 % of the input TF/BS affinities but including about 66 % of the independent experimentally validated regulations in the input data. When all known validated TF/BS affinities were integrated into the input data the precision of LOMBARDE increased significantly. The topological characteristics of the subnetwork that was obtained were similar to the characteristics described for known validated TRNs. CONCLUSIONS: LOMBARDE provides a useful modeling scheme for deciphering the regulatory mechanisms that underlie the phenotypic responses of an organism to environmental challenges. The method can become a reliable tool for further research on genome-scale transcriptional regulation studies.


Subjects
Computational Biology/methods , Environment , Gene Expression Regulation , Gene Regulatory Networks , Transcription, Genetic , Escherichia coli/genetics , Transcription Factors
13.
FEBS J ; 283(2): 350-60, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26518250

ABSTRACT

An effective means to analyze mRNA expression data is to take advantage of established knowledge from pathway databases, using methods such as pathway-enrichment analyses. However, pathway databases are not case-specific and expression data could be used to infer gene-regulation patterns in the context of specific pathways. In addition, canonical pathways may not always describe the signaling mechanisms properly, because interactions can frequently occur between genes in different pathways. Relatively few methods have been proposed to date for generating and analyzing such networks, preserving the causality between gene interactions and reasoning over the qualitative logic of regulatory effects. We present an algorithm (MCWalk) integrated with a logic programming approach, to discover subgraphs in large-scale signaling networks by random walks in a fully automated pipeline. As an exemplary application, we uncover the signal transduction mechanisms in a gene interaction network describing hepatocyte growth factor-stimulated cell migration and proliferation from gene-expression measured with microarray and RT-qPCR using in-house perturbation experiments in a keratinocyte-fibroblast co-culture. The resulting subgraphs illustrate possible associations of hepatocyte growth factor receptor c-Met nodes, differentially expressed genes and cellular states. Using perturbation experiments and Answer Set programming, we are able to select those which are more consistent with the experimental data. We discover key regulator nodes by measuring the frequency with which they are traversed when connecting signaling between receptors and significantly regulated genes and predict their expression-shift consistently with the measured data. The Java implementation of MCWalk is publicly available under the MIT license at: https://bitbucket.org/akittas/biosubg.


Subjects
Algorithms , Gene Regulatory Networks , Hepatocyte Growth Factor/genetics , Databases, Factual , Gene Expression Regulation , Hepatocyte Growth Factor/metabolism , Humans , Keratinocytes/metabolism , Monte Carlo Method , Oligonucleotide Array Sequence Analysis , Random Allocation , Signal Transduction
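
The abstract of this record ranks key regulator nodes by how often random walks traverse them when connecting receptors to regulated genes. The following is a loose sketch of that counting idea only. It is not the MCWalk algorithm (whose Java implementation is at the URL given in the abstract); the graph, the node names and the parameters `n_walks`, `max_steps` and `seed` are all hypothetical.

```python
import random
from collections import Counter


def traversal_counts(graph, source, targets, n_walks=200, max_steps=20, seed=0):
    """Count how often each intermediate node lies on a random walk that
    starts at `source` and reaches one of `targets` within `max_steps`."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    counts = Counter()
    for _ in range(n_walks):
        node, path = source, [source]
        for _ in range(max_steps):
            successors = graph.get(node, [])
            if not successors:
                break  # dead end: this walk is discarded
            node = rng.choice(successors)
            path.append(node)
            if node in targets:
                # Count each intermediate node once per successful walk.
                counts.update(set(path[1:-1]))
                break
    return counts


# Hypothetical network: receptor R signals through A to target gene T;
# B is a dead end and is never on a successful walk.
graph = {"R": ["A", "B"], "A": ["T"], "B": []}
print(traversal_counts(graph, "R", {"T"}))
```

Nodes with high counts sit on many receptor-to-target routes, which is the intuition behind using traversal frequency as a marker of regulatory importance.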
14.
BMC Bioinformatics ; 16: 345, 2015 Oct 28.
Article in English | MEDLINE | ID: mdl-26510976

ABSTRACT

BACKGROUND: A rapidly growing amount of knowledge about signaling and gene regulatory networks is available in databases such as KEGG, Reactome, or RegulonDB. There is an increasing need to relate this knowledge to high-throughput data in order to (in)validate network topologies or to decide which interactions are present or inactive in a given cell type under a particular environmental condition. Interaction graphs provide a suitable representation of cellular networks with information flows, and methods based on sign consistency approaches have been shown to be valuable tools to (i) predict qualitative responses, (ii) test the consistency of network topologies and experimental data, and (iii) apply repair operations to the network model suggesting missing or wrong interactions. RESULTS: We present a framework to unify different notions of sign consistency and propose a refined method for data discretization that considers uncertainties in experimental profiles. We furthermore introduce a new constraint to filter undesired model behaviors induced by positive feedback loops. Finally, we generalize the way predictions can be made by the sign consistency approach. In particular, we distinguish strong predictions (e.g., an increase of a node level) and weak predictions (e.g., a node level increases or remains unchanged), enlarging the overall predictive power of the approach. We then demonstrate the applicability of our framework by confronting a large-scale gene regulatory network model of Escherichia coli with high-throughput transcriptomic measurements. CONCLUSION: Overall, our work enhances the flexibility and power of the sign consistency approach for the prediction of the behavior of signaling and gene regulatory networks and, more generally, for the validation and inference of these networks.


Subjects
Escherichia coli/metabolism , Gene Regulatory Networks , Signal Transduction , Algorithms , Escherichia coli/genetics
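
A basic form of the sign consistency rule discussed in this abstract can be sketched as follows: each node with regulators must have its observed change explained by at least one incoming influence, and a node is predicted when it takes the same sign in every consistent completion of the data. This is a simplified, brute-force sketch under stated assumptions: only the signs +1 and -1 are used (no "unchanged" state, so no weak predictions), no repairs are attempted, and the published framework relies on a solver rather than enumeration. Function names and the example graph are hypothetical.

```python
from itertools import product


def consistent(edges, labelling):
    """True if every node with regulators has a sign explained by at least
    one incoming influence (regulator sign times edge sign)."""
    for node, sign in labelling.items():
        influences = {labelling[src] * s for src, tgt, s in edges if tgt == node}
        if influences and sign not in influences:
            return False
    return True


def predictions(edges, observed):
    """Signs shared by all consistent completions of `observed`.

    Returns None when the data are inconsistent with the network."""
    nodes = sorted({n for src, tgt, _ in edges for n in (src, tgt)})
    free = [n for n in nodes if n not in observed]
    solutions = []
    for values in product((+1, -1), repeat=len(free)):
        labelling = {**observed, **dict(zip(free, values))}
        if consistent(edges, labelling):
            solutions.append(labelling)
    if not solutions:
        return None
    first = solutions[0]
    return {n: first[n] for n in free
            if all(sol[n] == first[n] for sol in solutions)}


# Hypothetical graph: A activates B, B inhibits C.
edges = [("A", "B", +1), ("B", "C", -1)]
print(predictions(edges, {"A": +1}))
```

With A observed as increased, B is predicted to increase and C to decrease. Observing A increased and B decreased would return None, signalling an inconsistency that a repair step would need to address.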
15.
Article in English | MEDLINE | ID: mdl-26389116

ABSTRACT

Logic models of signaling pathways are a promising way of building effective in silico functional models of a cell, in particular of signaling pathways. The automated learning of Boolean logic models describing signaling pathways can be achieved by training to phosphoproteomics data, which is particularly useful if it is measured upon different combinations of perturbations in a high-throughput fashion. However, in practice, the number and type of allowed perturbations are not exhaustive. Moreover, experimental data are unavoidably subjected to noise. As a result, the learning process results in a family of feasible logical networks rather than in a single model. This family is composed of logic models implementing different internal wirings for the system and therefore the predictions of experiments from this family may present a significant level of variability, and hence uncertainty. In this paper, we introduce a method based on Answer Set Programming to propose an optimal experimental design that aims to narrow down the variability (in terms of input-output behaviors) within families of logical models learned from experimental data. We study how the fitness with respect to the data can be improved after an optimal selection of signaling perturbations and how we learn optimal logic models with minimal number of experiments. The methods are applied on signaling pathways in human liver cells and phosphoproteomics experimental data. Using 25% of the experiments, we obtained logical models with fitness scores (mean square error) 15% close to the ones obtained using all experiments, illustrating the impact that our approach can have on the design of experiments for efficient model calibration.

16.
Bioinformatics ; 29(18): 2320-6, 2013 Sep 15.
Article in English | MEDLINE | ID: mdl-23853063

ABSTRACT

MOTIVATION: Logic modeling is a useful tool to study signal transduction across multiple pathways. Logic models can be generated by training a network containing the prior knowledge to phospho-proteomics data. The training can be performed using stochastic optimization procedures, but these are unable to guarantee a global optimum or to report the complete family of feasible models. This, however, is essential to provide precise insight into the mechanisms underlying signal transduction and generate reliable predictions. RESULTS: We propose the use of Answer Set Programming to explore exhaustively the space of feasible logic models. Toward this end, we have developed caspo, an open-source Python package that provides a powerful platform to learn and characterize logic models by leveraging the rich modeling language and solving technologies of Answer Set Programming. We illustrate the usefulness of caspo by revisiting a model of pro-growth and inflammatory pathways in liver cells. We show that, if experimental error is taken into account, there are thousands (11 700) of models compatible with the data. Despite the large number, we can extract structural features from the models, such as links that are always (or never) present or modules that appear in a mutually exclusive fashion. To further characterize this family of models, we investigate the input-output behavior of the models. We find 91 behaviors across the 11 700 models and we suggest new experiments to discriminate among them. Our results underscore the importance of characterizing in a global and exhaustive manner the family of feasible models, with important implications for experimental design. AVAILABILITY: caspo is freely available for download (license GPLv3) and as a web service at http://caspo.genouest.org/. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online. CONTACT: santiago.videla@irisa.fr.


Subjects
Signal Transduction , Software , Cell Line, Tumor , Humans , Logic , Proteomics
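
The abstract of this record reports that 11 700 feasible models collapse into 91 input-output behaviors. The grouping step can be sketched as below. This is not caspo's implementation or API: models are represented here as plain Python functions from an input tuple to an output tuple, behaviors are compared by full truth table, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def io_behaviors(models, n_inputs):
    """Group models by input-output behavior.

    `models` maps a model name to a function from an input tuple of 0/1
    values to an output tuple. Two models share a behavior when their
    truth tables over all 2**n_inputs inputs are identical.
    """
    groups = defaultdict(list)
    for name, model in models.items():
        table = tuple(model(inputs)
                      for inputs in product((0, 1), repeat=n_inputs))
        groups[table].append(name)
    return list(groups.values())


# Three hypothetical models over two inputs and one output.
models = {
    "m1": lambda x: (x[0] and x[1],),
    "m2": lambda x: (min(x),),        # different formula, same behavior as m1
    "m3": lambda x: (x[0] or x[1],),
}
print(io_behaviors(models, 2))
```

Models m1 and m2 are wired differently yet indistinguishable from input-output data alone, which is why an experiment can only discriminate between behaviors, not between every individual model.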
17.
FEBS J ; 279(18): 3462-74, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22540519

ABSTRACT

Despite the increasing number of growth factor-related signalling networks, their lack of logical and causal connection to factual changes in cell states frequently impairs the functional interpretation of microarray data. We present a novel method enabling the automatic inference of causal multi-layer networks from such data, allowing the functional interpretation of growth factor stimulation experiments using pathway databases. Our environment of evaluation was hepatocyte growth factor-stimulated cell migration and proliferation in a keratinocyte-fibroblast co-culture. The network for this system was obtained by applying the following steps: (a) automatic integration of the comprehensive set of all known cellular networks from the Pathway Interaction Database into a master structure; (b) retrieval of an active-network from the master structure, where the network edges that connect nodes with an absent mRNA level were excluded; and (c) reduction of the active-network complexity to a causal subnetwork from a set of seed nodes specific for the microarray experiment. The seed nodes comprised the receptors stimulated in the experiment, the consequently differentially expressed genes, and the expected cell states. The resulting network shows how well-known players, in the context of hepatocyte growth factor stimulation, are mechanistically linked in a pathway triggering functional cell state changes. Using BIOQUALI, we checked and validated the consistency of the network with respect to microarray data by computational simulation. The network has properties that can be classified into different functional layers because it not only shows signal processing down to the transcriptional level, but also the modulation of the network structure by the preceding stimulation. The software for generating computable objects from the Pathway Interaction Database, as well as the generated networks, are freely available at: http://www.tiga.uni-hd.de/supplements/inferringFromPID.html.


Subjects
Cell Movement/drug effects , Hepatocyte Growth Factor/physiology , Signal Transduction/physiology , Coculture Techniques , Computer Simulation , Databases, Factual , Fibroblasts/metabolism , Gene Expression Profiling/methods , Keratinocytes/metabolism , Protein Interaction Mapping , RNA, Messenger/metabolism , Software
18.
Article in English | MEDLINE | ID: mdl-20733239

ABSTRACT

We discuss the propagation of constraints in eukaryotic interaction networks in relation to model prediction and the identification of critical pathways. In order to cope with posttranslational interactions, we consider two types of nodes in the network, corresponding to proteins and to RNA. Microarray data provide very sparse information for such networks because protein nodes, although needed in the model, are not observed. Propagating observations in such networks leads to poor and nonsignificant model predictions, mainly because the rules used to propagate information (usually disjunctive constraints) are weak. Here, we propose a new, stronger type of logical constraint that strengthens the analysis of the relation between microarray and interaction data. We use these rules to identify the nodes that are responsible for a phenotype, in particular for cell cycle progression. As a benchmark, we use an interaction network describing the major pathways implicated in Ewing's tumor development. The Python library used to obtain our results is publicly available on our supplementary web page.


Subjects
Gene Regulatory Networks , Models, Biological , Protein Interaction Mapping/methods , Sarcoma, Ewing/genetics , Sarcoma, Ewing/metabolism , Systems Biology/methods , Algorithms , Cell Cycle/physiology , Cell Line, Tumor , Computer Simulation , Gene Expression Profiling/methods , Humans , Linear Models , Oligonucleotide Array Sequence Analysis , Phenotype , Signal Transduction
19.
BMC Genomics ; 10: 244, 2009 May 26.
Article in English | MEDLINE | ID: mdl-19470162

ABSTRACT

BACKGROUND: The method most commonly used to analyse regulatory networks is the in silico simulation of fluctuations in network components when a network is perturbed. Nevertheless, confronting experimental data with a regulatory network entails many difficulties, such as the incomplete state of regulatory knowledge, the large scale of regulatory models, heterogeneity in the available data and the sometimes violated assumption that mRNA expression is correlated with protein activity. RESULTS: We have developed a plugin for the Cytoscape environment, designed to facilitate automatic reasoning on regulatory networks. The BioQuali plugin supports user-friendly conversion of regulatory networks (including reference databases) into signed directed graphs. BioQuali performs automatic global reasoning in order to decide which products in the network need to be up- or down-regulated (active or inactive) to globally explain experimental data. It highlights incomplete regions in the network, that is, regions where gene expression levels do not globally agree with existing knowledge on regulation carried by the topology of the network. CONCLUSION: The BioQuali plugin facilitates in silico exploration of large-scale regulatory networks by combining the user-friendly tools of the Cytoscape environment with high-performance automatic reasoning algorithms. As a main feature, the plugin guides further investigation of a system by highlighting regions in the network that are not accurately described and merit specific study.
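
The kind of global reasoning described above can be illustrated with a brute-force Python sketch on a tiny signed directed graph. It is not the BioQuali implementation and does not use its API. It assumes a commonly used sign-consistency rule that the abstract does not spell out: the variation of every node that has regulators must match at least one incoming influence (regulator sign multiplied by edge sign). The function names, the graph encoding and the toy network are hypothetical.

```python
from itertools import product


def explains(graph, state):
    """Assumed rule: each regulated node matches at least one incoming influence.

    graph maps a target node to a list of (source, edge_sign) pairs;
    state maps every node to +1 (up) or -1 (down).
    """
    for node, incoming in graph.items():
        influences = {state[src] * sign for src, sign in incoming}
        if incoming and state[node] not in influences:
            return False
    return True


def predict(graph, observed):
    """Enumerate all completions of the observations and keep the hidden nodes
    whose sign is the same in every consistent completion.
    Returns None when no completion is consistent with the network."""
    nodes = sorted(set(graph) | {src for inc in graph.values() for src, _ in inc})
    hidden = [n for n in nodes if n not in observed]
    solutions = []
    for values in product((+1, -1), repeat=len(hidden)):
        state = {**observed, **dict(zip(hidden, values))}
        if explains(graph, state):
            solutions.append(state)
    if not solutions:
        return None
    return {n: solutions[0][n] for n in hidden
            if len({s[n] for s in solutions}) == 1}


# Toy network: A activates B, B represses C (hypothetical).
toy = {"B": [("A", +1)], "C": [("B", -1)]}
print(predict(toy, {"A": +1, "C": -1}))  # B must be up to explain the data
print(predict(toy, {"A": +1, "C": +1}))  # no consistent explanation exists
```

Exhaustive enumeration grows exponentially with the number of unobserved nodes, so it only serves to make the rule concrete on small examples.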


Subjects
Algorithms , Computational Biology , Gene Regulatory Networks , Software , Escherichia coli/genetics , Fatty Acids/genetics , Fatty Acids/metabolism , Models, Biological , Transcription, Genetic
20.
BMC Bioinformatics ; 9: 228, 2008 May 06.
Article in English | MEDLINE | ID: mdl-18460200

ABSTRACT

BACKGROUND: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well-studied, simple organisms up to higher eukaryotes. A key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. RESULTS: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. CONCLUSION: Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data are noisy.
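
The idea of inferring edge signs from signs of variation can be illustrated with a brute-force Python sketch on a toy network. It is not the authors' algorithm. It assumes a consistency rule that the abstract does not state explicitly: in each profile, the variation of a regulated gene must match the influence (regulator sign multiplied by edge sign) of at least one of its regulators. An edge is reported only when every consistent sign assignment agrees on it. All names are hypothetical.

```python
from itertools import product


def is_consistent(edge_signs, profile):
    """Check one profile against one assignment of edge signs.

    edge_signs maps (regulator, target) to +1 (inducer) or -1 (repressor);
    profile maps observed genes to +1 (increase) or -1 (decrease).
    """
    regulators = {}
    for (src, dst), sign in edge_signs.items():
        regulators.setdefault(dst, []).append((src, sign))
    for gene, incoming in regulators.items():
        if gene not in profile:
            continue
        if any(src not in profile for src, _ in incoming):
            continue  # an unobserved regulator could explain anything
        if profile[gene] not in {profile[src] * sign for src, sign in incoming}:
            return False
    return True


def infer_edge_signs(edges, profiles):
    """Return the edges whose sign is identical in every sign assignment
    that is consistent with all profiles (empty dict if none is consistent)."""
    consistent = []
    for signs in product((+1, -1), repeat=len(edges)):
        assignment = dict(zip(edges, signs))
        if all(is_consistent(assignment, p) for p in profiles):
            consistent.append(assignment)
    inferred = {}
    for edge in edges:
        values = {assignment[edge] for assignment in consistent}
        if len(values) == 1:
            inferred[edge] = values.pop()
    return inferred


# Toy network: genes a and b both regulate c (hypothetical).
edges = [("a", "c"), ("b", "c")]
profiles = [
    {"a": +1, "b": +1, "c": -1},
    {"a": +1, "b": -1, "c": +1},
]
print(infer_edge_signs(edges, profiles))
```

With these two profiles only the edge from `b` to `c` is determined (as a repressor), while the sign of the edge from `a` to `c` stays open. This mirrors the point made in the abstract that only part of a network can be annotated from a given set of profiles.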


Subjects
Algorithms , Gene Expression Profiling/methods , Gene Expression Regulation/physiology , Models, Biological , Signal Transduction/physiology , Transcription Factors/metabolism , Transcriptional Activation/physiology , Computer Simulation