Results 1 - 20 of 27
1.
PLoS One ; 19(5): e0303231, 2024.
Article in English | MEDLINE | ID: mdl-38771886

ABSTRACT

Extracting biological interactions from published literature helps us understand complex biological systems, accelerate research, and support decision-making in drug or treatment development. Despite efforts to automate the extraction of biological relations using text mining tools and machine learning pipelines, manual curation continues to serve as the gold standard. However, the rapidly increasing volume of literature pertaining to biological relations poses challenges in its manual curation and refinement. These challenges are further compounded because only a small fraction of the published literature is relevant to biological relation extraction, and the embedded sentences of relevant sections have complex structures, which can lead to incorrect inference of relationships. To overcome these challenges, we propose GIX, an automated and robust Gene Interaction Extraction framework, based on pre-trained Large Language models fine-tuned through extensive evaluations on various gene/protein interaction corpora including LLL and RegulonDB. GIX identifies relevant publications with minimal keywords, optimises sentence selection to reduce computational overhead, simplifies sentence structure while preserving meaning, and provides a confidence factor indicating the reliability of extracted relations. GIX's Stage-2 relation extraction method performed well on benchmark protein/gene interaction datasets, assessed using 10-fold cross-validation, surpassing state-of-the-art approaches. We demonstrated that the proposed method, although fully automated, performs as well as manual relation extraction, with enhanced robustness. We also observed GIX's capability to augment existing datasets with new sentences, incorporating newly discovered biological terms and processes. Further, we demonstrated GIX's real-world applicability in inferring E. coli gene circuits.


Subjects
Data Mining; Data Mining/methods; Natural Language Processing; Machine Learning; Computational Biology/methods; Humans; Algorithms
2.
PLoS One ; 18(7): e0288174, 2023.
Article in English | MEDLINE | ID: mdl-37418430

ABSTRACT

In systems biology, the accurate reconstruction of Gene Regulatory Networks (GRNs) is crucial since these networks can facilitate the solving of complex biological problems. Amongst the plethora of methods available for GRN reconstruction, information theory and fuzzy concepts-based methods have abiding popularity. However, most of these methods are not only complex, incurring a high computational burden, but they may also produce a high number of false positives, leading to inaccurate inferred networks. In this paper, we propose a novel hybrid fuzzy GRN inference model called MICFuzzy which involves the aggregation of the effects of Maximal Information Coefficient (MIC). This model has an information theory-based pre-processing stage, the output of which is applied as an input to the novel fuzzy model. In this preprocessing stage, the MIC component filters relevant genes for each target gene to significantly reduce the computational burden of the fuzzy model when selecting the regulatory genes from these filtered gene lists. The novel fuzzy model uses the regulatory effect of the identified activator-repressor gene pairs to determine target gene expression levels. This approach facilitates accurate network inference by generating a high number of true regulatory interactions while significantly reducing false regulatory predictions. The performance of MICFuzzy was evaluated using DREAM3 and DREAM4 challenge data, and the SOS real gene expression dataset. MICFuzzy outperformed the other state-of-the-art methods in terms of F-score, Matthews Correlation Coefficient, Structural Accuracy, and SS_mean, and outperformed most of them in terms of efficiency. MICFuzzy also had improved efficiency compared with the classical fuzzy model since the design of MICFuzzy leads to a reduction in combinatorial computation.
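
The F-score and Matthews Correlation Coefficient named above are both derived from counts of correctly and incorrectly predicted edges. The following is a minimal sketch of that computation, assuming each network is given as a set of directed (regulator, target) pairs without self-loops. The function name `edge_scores` and the toy gene names are hypothetical and are not taken from the paper.

```python
import math

def edge_scores(predicted, reference, genes):
    """Return (precision, recall, f_score, mcc) for directed edges without self-loops."""
    # All ordered gene pairs are treated as possible edges (an assumption of this sketch).
    possible = {(a, b) for a in genes for b in genes if a != b}
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    tn = len(possible) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, f_score, mcc

# Toy example with made-up edges.
genes = ["g1", "g2", "g3", "g4"]
reference = {("g1", "g2"), ("g2", "g3"), ("g3", "g4")}
predicted = {("g1", "g2"), ("g2", "g3"), ("g4", "g1")}
precision, recall, f_score, mcc = edge_scores(predicted, reference, genes)
```

In this toy case two of the three predicted edges are correct, so precision and recall are both 2/3. The MCC also takes the eight correctly rejected edges into account.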


Subjects
Algorithms; Computational Biology; Computational Biology/methods; Gene Regulatory Networks; Systems Biology; Genes, Regulator; Models, Genetic
4.
Biosystems ; 221: 104757, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36007675

ABSTRACT

The reconstruction of Gene Regulatory Networks (GRNs) from time series gene expression data is highly relevant for the discovery of complex biological interactions and dynamics. Various computational strategies have been developed for this task, but most approaches have low computational efficiency and are not able to cope with high-dimensional, low sample-number, gene expression data. In this paper, we introduce a novel combined filter feature selection approach for efficient and accurate inference of GRNs. A Boolean framework for network modelling is used to demonstrate the efficacy of the proposed approach. Using discretized microarray expression data, the genes most relevant to each target gene are first filtered using ReliefF, an instance-based feature ranking method that is here applied for the first time to GRN inference. Then, further gene selection from the filtered-gene list is done using a mutual information-based min-redundancy max-relevance criterion by eliminating irrelevant genes. This combined method is executed on resampled datasets to finalize the optimal set of regulatory genes. Building upon our previous research, a Pearson correlation coefficient-based Boolean modelling approach is utilized for the efficient identification of the optimal regulatory rules associated with selected regulatory genes. The proposed approach was evaluated using gene expression datasets from small-scale and medium-scale real gene networks, and was observed to be more effective than Linear Discriminant Analysis, performed better than the individual feature selection methods, and obtained improved Structural Accuracy with a higher number of true positives than other state-of-the-art methods, while outperforming these methods with respect to Dynamic Accuracy and efficiency.
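
The min-redundancy max-relevance step can be illustrated with a generic greedy selection over discretized expression vectors. This is a sketch of the textbook criterion (relevance minus mean redundancy), not the paper's full pipeline, which also involves ReliefF filtering and resampling. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def mrmr_select(candidates, target, k):
    """Greedily pick k candidate genes by relevance minus mean redundancy."""
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(remaining[name], candidates[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected

# Toy discretized profiles: geneB duplicates geneA, geneC is independent of geneA.
candidates = {
    "geneA": [0, 0, 1, 1, 0, 0, 1, 1],
    "geneB": [0, 0, 1, 1, 0, 0, 1, 1],
    "geneC": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 0, 1, 1, 0, 1, 1, 1]
chosen = mrmr_select(candidates, target, 2)
```

Here `geneB` is as relevant as `geneA` but fully redundant with it, so the second pick is the weaker but non-redundant `geneC`.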


Subjects
Algorithms; Gene Regulatory Networks; Computational Biology/methods; Gene Regulatory Networks/genetics; Time Factors
5.
Biosystems ; 220: 104736, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35863700

ABSTRACT

S-System models, non-linear differential equation models, are widely used for reconstructing gene regulatory networks from temporal gene expression data. An S-System model involves two states, generation and degeneration, and uses the kinetic parameters gij and hij to represent the direction, nature, and intensity of the genetic interactions. The need for learning a large number of model parameters results in increased computational expense. Previously, we improved the performance of the algorithm using dynamic allocation of the maximum in-degree for each gene. While the method was effective for smaller networks, a large amount of computation was still needed for larger networks. This problem arose mainly due to the increased occurrence of invalid networks during optimization, primarily because the two kinetic parameters (gij and hij) of the S-System model converge independently during optimization. Being independent, these two parameters can converge to values that can indicate contradictory gene interactions, specifically inhibition or activation. In this study, to address this major challenge in S-System modelling, we developed a novel method that includes two features: a penalty term that penalizes those networks with invalid kinetic orders, and a parameter, wij, derived by combining the kinetic parameters gij and hij. The novel penalty term was used for candidate selection during the process of optimizing the DRNI (Dynamically Regulated Network Initialization) algorithm. Rather than remaining constant, it is dynamic, with its magnitude dependent on the number of invalid interactions in the given network. This approach encourages the generation of valid candidate solutions, and eliminates invalid networks in a systematic manner. The previous DRNI method, a two-stage approach which uses dynamic allocation of the maximum in-degree for each gene, was further improved by adding a third stage which applies the proposed wij to handle the invalid regulations that may still exist in the candidate solutions. The method was tested on different gene expression datasets, and was able to reduce the number of iterations and produce improved network accuracies. For a 20-gene network, the number of generations required for convergence was reduced by 300, and the F-score improved by 0.05 compared to our previously reported DRNI approach. For the well-known 10-gene networks of the DREAM4 challenge, our method improved the average area under the ROC curve.
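
To make the roles of gij and hij concrete, the sketch below evaluates the standard S-System right-hand side and counts contradictory kinetic orders. Two points are assumptions of this sketch, not statements from the abstract: a pair is treated as invalid when gij and hij are both nonzero with the same sign, and the penalty is taken to grow linearly with the number of invalid pairs. The names `count_invalid` and `penalized_error` and the `weight` parameter are hypothetical, and the construction of wij is not shown because the abstract does not define it.

```python
import math

def s_system_rhs(x, alpha, beta, g, h):
    """dX_i/dt = alpha_i * prod_j X_j**g_ij - beta_i * prod_j X_j**h_ij."""
    n = len(x)
    rates = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        rates.append(production - degradation)
    return rates

def count_invalid(g, h):
    """Count (i, j) pairs where g_ij and h_ij are both nonzero with the same sign."""
    return sum(
        1
        for g_row, h_row in zip(g, h)
        for g_ij, h_ij in zip(g_row, h_row)
        if g_ij * h_ij > 0
    )

def penalized_error(error, g, h, weight=1.0):
    """Add a penalty that scales with the number of invalid pairs (assumed linear)."""
    return error + weight * count_invalid(g, h)

# Toy two-gene system with made-up parameters.
x = [1.0, 2.0]
alpha, beta = [2.0, 1.0], [1.0, 1.0]
g_valid = [[0.0, -1.0], [0.5, 0.0]]
g_bad = [[0.5, -1.0], [0.5, 0.0]]   # g_00 > 0 clashes with h_00 > 0
h = [[1.0, 0.0], [0.0, 1.0]]
rates = s_system_rhs(x, alpha, beta, g_valid, h)
```

With these values the first gene is at steady state and the second is decreasing. `count_invalid(g_bad, h)` flags one contradictory pair, which `penalized_error` then adds to the fitting error.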


Subjects
Computational Biology; Gene Regulatory Networks; Algorithms; Computational Biology/methods; Gene Regulatory Networks/genetics; Kinetics
6.
Biosystems ; 182: 30-41, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31185246

ABSTRACT

A gene regulatory network (GRN) represents a set of genes along with their regulatory interactions. Cellular behavior is driven by genetic-level interactions. The dynamics of such systems show nonlinear saturation kinetics, which are best modeled by the Michaelis-Menten (MM) and Hill equations. Although the MM equation is widely used for modeling biochemical processes, it has rarely been applied to reverse engineering GRNs. In this paper, we develop a complete framework for a novel model for GRN inference using MM kinetics. A set of coupled equations is first proposed for modeling GRNs. In the coupled model, the Michaelis-Menten constant associated with regulation by a gene is made invariant irrespective of the gene being regulated. The parameter estimation of the proposed model is carried out using an evolutionary optimization method, namely, trigonometric differential evolution (TDE). Subsequently, the model is further improved and the regulations of different genes by a given gene are made distinct by allowing varying values of Michaelis-Menten constants for each regulation. Apart from making the model more biologically relevant, the improvement results in a decoupled GRN model with fast estimation of model parameters. Further, to enhance exploitation of the search, we propose a local search algorithm based on hill climbing heuristics. A novel mutation operation is also proposed to avoid population stagnation and premature convergence. Real-life benchmark data sets generated in vivo are used for validating the proposed model. Further, we also analyze realistic in silico datasets generated using GeneNetWeaver. A comparison of the performance of the proposed model with other existing methods shows the potential of the proposed model.
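
The saturation behaviour mentioned above can be seen in the textbook rate laws themselves. The sketch below shows the general Hill form, which reduces to the Michaelis-Menten equation when the Hill coefficient is 1. It illustrates the rate law only and is not the paper's coupled or decoupled GRN model. The function name `hill_rate` and the parameter values are hypothetical.

```python
def hill_rate(s, vmax, km, n=1.0):
    """Hill rate law: vmax * s**n / (km**n + s**n); n = 1 gives Michaelis-Menten."""
    return vmax * s ** n / (km ** n + s ** n)

# At s == km the rate is half of vmax, whatever the Hill coefficient.
half_max = hill_rate(2.0, vmax=10.0, km=2.0)
# A larger Hill coefficient gives a steeper, more switch-like response.
steep = hill_rate(4.0, vmax=10.0, km=2.0, n=2.0)
# Far above km the rate saturates close to vmax.
saturated = hill_rate(2000.0, vmax=10.0, km=2.0)
```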


Subjects
Algorithms; Computational Biology/methods; Gene Regulatory Networks; Models, Genetic; Animals; Bayes Theorem; Computer Simulation; Escherichia coli/genetics; Gene Expression Regulation; Humans; Kinetics; Saccharomyces cerevisiae/genetics
7.
Cogn Neurodyn ; 12(4): 417-429, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30137878

ABSTRACT

A gene regulatory network (GRN) represents a set of genes and their regulatory interactions. The inference of the regulatory interactions between genes is usually carried out using an appropriate mathematical model and the available gene expression profile. Among the various models proposed for GRN inference, our recently proposed Michaelis-Menten-based ODE model provides a good trade-off between computational complexity and biological relevance. This model, like other known GRN models, also uses an evolutionary algorithm for parameter estimation. Considering the various issues associated with such population-based stochastic optimization approaches (e.g. diversity, premature convergence due to local optima, accuracy, etc.), it becomes important to seed the initial population with good individuals which are closer to the optimal solution. In this paper, we exploit the inherent strength of principal component analysis (PCA) in a novel manner to initialize the population for GRN optimization. The benefit of the proposed method is validated by reconstructing in silico and in vivo networks of various sizes. For the same level of accuracy, the approach with PCA-based initialization shows improved convergence speed.

8.
Cogn Neurodyn ; 9(5): 535-47, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26379803

ABSTRACT

Microarray gene expression data can provide insights into biological processes at a system-wide level and are commonly used for reverse engineering gene regulatory networks (GRNs). Due to the amalgamation of noise from different sources, microarray expression profiles become inherently noisy, which has a significant impact on the GRN reconstruction process. Microarray replicates (both biological and technical), generated to increase the reliability of data obtained under noisy conditions, have limited influence in enhancing the accuracy of reconstruction. Therefore, instead of the conventional GRN modeling approaches, which are deterministic, stochastic techniques are becoming increasingly necessary for inferring GRNs from noisy microarray data. In this paper, we propose a new stochastic GRN model by investigating the incorporation of various standard noise measurements in the deterministic S-system model. Experimental evaluations performed for varying sizes of synthetic network, representing different stochastic processes, demonstrate the effect of noise on the accuracy of genetic network modeling and the significance of stochastic modeling for GRN reconstruction. The proposed stochastic model is subsequently applied to infer the regulations among genes in two real-life networks: (1) the well-studied IRMA network, a real-life in vivo synthetic network constructed within the yeast Saccharomyces cerevisiae, and (2) the SOS DNA repair network in Escherichia coli.

9.
Mol Biosyst ; 11(9): 2449-63, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26126758

ABSTRACT

Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
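
The effect of a maximum-number-of-parents restriction on the search space can be shown with a simple count. The sketch below computes how many candidate parent sets a single gene has when at most maxP parents are allowed. It assumes a gene cannot be its own parent, and the function name `parent_set_count` is hypothetical. It illustrates only the size of the search space, not the algorithms proposed in the paper.

```python
from math import comb

def parent_set_count(n_genes, max_parents):
    """Number of candidate parent sets for one gene with at most max_parents parents."""
    candidates = n_genes - 1  # assumption: a gene is not its own parent
    limit = min(max_parents, candidates)
    return sum(comb(candidates, k) for k in range(limit + 1))

restricted = parent_set_count(10, 3)     # 1 + 9 + 36 + 84
unrestricted = parent_set_count(10, 9)   # every subset of the other 9 genes
```

For a 10-gene network, limiting each gene to three parents leaves 130 candidate parent sets per gene instead of 512. The saving grows quickly with network size, but any true parent set larger than maxP can never be recovered.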


Subjects
Computational Biology/methods; Gene Regulatory Networks; Software; Algorithms
10.
PLoS One ; 10(5): e0125148, 2015.
Article in English | MEDLINE | ID: mdl-25973856

ABSTRACT

Unicellular diazotrophic cyanobacteria such as Cyanothece sp. ATCC 51142 (henceforth Cyanothece) temporally separate oxygen-sensitive nitrogen fixation from oxygen-evolving photosynthesis not only under diurnal cycles (LD) but also in continuous light (LL). However, recent reports demonstrate that the oscillations in LL occur with a shorter cycle time of ~11 h. We find that, indeed, the majority of the genes oscillate in LL with this cycle time. Genes that are upregulated at a particular time of day under the diurnal cycle are also upregulated at an equivalent metabolic phase under LL, suggesting tight coupling of various cellular events with each other and with the cell's metabolic status. A number of metabolic processes are upregulated in a coordinated fashion during the respiratory phase under LL, including glycogen degradation, glycolysis, the oxidative pentose phosphate pathway, and the tricarboxylic acid cycle. These precede nitrogen fixation, apparently to ensure the sufficient energy and anoxic environment needed for the nitrogenase enzyme. The photosynthetic phase sees upregulation of photosystem II, carbonate transport, the carbon concentrating mechanism, RuBisCO, glycogen synthesis and light-harvesting antenna pigment biosynthesis. In Synechococcus elongatus PCC 7942, a non-nitrogen-fixing cyanobacterium, expression of a relatively smaller fraction of genes oscillates under LL, with the major periodicity being 24 h. In contrast, the entire cellular machinery of Cyanothece orchestrates coordinated oscillation in anticipation of the ensuing metabolic phase in both LD and LL. These results may have important implications for understanding the timing of various cellular events and for engineering cyanobacteria for biofuel production.


Subjects
Bacterial Proteins/genetics; Biological Clocks/radiation effects; Cyanothece/radiation effects; Gene Expression Regulation, Bacterial; Nitrogen Fixation/radiation effects; Photosynthesis/radiation effects; Bacterial Proteins/metabolism; Biological Clocks/genetics; Carbon/metabolism; Circadian Rhythm/genetics; Citric Acid Cycle/genetics; Citric Acid Cycle/radiation effects; Cyanothece/genetics; Cyanothece/metabolism; Glycogen/biosynthesis; Glycolysis/genetics; Glycolysis/radiation effects; Light; Light-Harvesting Protein Complexes/genetics; Light-Harvesting Protein Complexes/metabolism; Molecular Sequence Annotation; Nitrogen/metabolism; Nitrogen Fixation/genetics; Nitrogenase/genetics; Nitrogenase/metabolism; Oxygen/metabolism; Pentose Phosphate Pathway/genetics; Pentose Phosphate Pathway/radiation effects; Photosynthesis/genetics; Photosystem II Protein Complex/genetics; Photosystem II Protein Complex/metabolism
11.
Bioresour Technol ; 188: 145-52, 2015.
Article in English | MEDLINE | ID: mdl-25736893

ABSTRACT

This study investigates the influence of mixotrophy on physiology and metabolism by analysis of global gene expression in the unicellular diazotrophic cyanobacterium Cyanothece sp. ATCC 51142 (henceforth Cyanothece 51142). It was found that Cyanothece 51142 continues to oscillate between photosynthesis and respiration in continuous light under mixotrophy, with a cycle time of ~13 h. Mixotrophy is marked by an extended respiratory phase compared with photoautotrophy. It can be argued that glycerol provides supplementary energy for nitrogen fixation, which is derived primarily from the glycogen reserves during photoautotrophy. The genes of the NDH complex, cytochrome c oxidase and ATP synthase are significantly overexpressed in mixotrophy during the day compared to autotrophy, with synchronous expression of the bidirectional hydrogenase genes, possibly to maintain redox balance. However, the nitrogenase complex remains exclusive to nighttime metabolism, concomitantly with uptake hydrogenase. This study sheds light on the interrelations between metabolic pathways, with implications for the design of hydrogen-producing strains.


Subjects
Cyanobacteria/metabolism; Cyanothece/metabolism; Metabolic Networks and Pathways; Autotrophic Processes; Biotechnology/methods; Carbon Dioxide/chemistry; Cluster Analysis; Culture Media; Electron Transport; Gene Expression Profiling; Glycerol/chemistry; Glycogen/chemistry; Hydrogen/chemistry; Hydrogen-Ion Concentration; Nitrogen/chemistry; Nitrogen Fixation; Nitrogenase/chemistry; Oligonucleotide Array Sequence Analysis; Oscillometry; Photobioreactors; Photochemical Processes; Photosynthesis; Respiratory Burst; Transcriptome
12.
Cogn Neurodyn ; 8(3): 251-9, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24808933

ABSTRACT

A gene regulatory network (GRN) consists of interactions between transcription factors (TFs) and target genes (TGs). Recently, it has been observed that microRNAs (miRNAs) play a significant part in genetic interactions. However, current microarray technologies do not capture miRNA expression levels. To overcome this, we propose a new technique to reverse engineer GRNs from the available partial microarray data, which contain expression levels of TFs and TGs only. Using the S-System model, the approach is adapted to cope with the unavailability of information about the expression levels of miRNAs. The versatile Differential Evolution algorithm is used for optimization and parameter estimation. Experimental studies on four in silico networks, and on a real network of Saccharomyces cerevisiae called the IRMA network, show significant improvement compared to the traditional S-System approach.
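
For readers unfamiliar with the optimizer named above, the following is a minimal sketch of the classic DE/rand/1/bin variant of differential evolution. The abstract does not say which variant or settings were used, so the control parameters (`f`, `cr`, `pop_size`), the bound clipping, and the toy `sphere` objective are all assumptions for illustration. In a GRN setting the objective would instead be the error between simulated and measured expression profiles.

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize fitness over box bounds with DE/rand/1/bin (pop_size must be >= 4)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the current one.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = fitness(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def sphere(v):
    """Toy objective with its minimum at the origin."""
    return sum(x * x for x in v)

best, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Because selection is greedy, the best score in the population can never get worse from one generation to the next.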

13.
Front Microbiol ; 4: 374, 2013.
Article in English | MEDLINE | ID: mdl-24367360

ABSTRACT

Cyanobacteria, a group of photosynthetic prokaryotes, oscillate between day and night time metabolisms with concomitant oscillations in gene expression in response to light/dark cycles (LD). The oscillations in gene expression have been shown to be sustained in constant light (LL) with a free-running period of 24 h in the model cyanobacterium Synechococcus elongatus PCC 7942. However, equivalent oscillations in metabolism are not reported under LL in this non-nitrogen-fixing cyanobacterium. Here we focus on Cyanothece sp. ATCC 51142, a unicellular, nitrogen-fixing cyanobacterium known to temporally separate the processes of oxygenic photosynthesis and oxygen-sensitive nitrogen fixation. In a recent report, the metabolism of Cyanothece 51142 has been shown to oscillate between photosynthetic and respiratory phases under LL with free-running periods that are temperature dependent but significantly shorter than the circadian period. Further, the oscillations shift to a circadian pattern at moderate cell densities that are concomitant with slower growth rates. Here we take this understanding forward and demonstrate that the ultradian rhythm under LL is sustained at much higher cell densities when cells are grown under turbulent regimes that simulate a flashing-light effect. Our results suggest that the ultradian rhythm in metabolism may be needed to support the higher carbon and nitrogen requirements of rapidly growing cells under LL. With a comprehensive real-time PCR-based gene expression analysis, we account for key regulatory interactions and demonstrate the interplay between clock genes and the genes of key metabolic pathways. Further, we observe that several genes that peak at dusk in Synechococcus peak at dawn in Cyanothece and vice versa. The circadian rhythm of this organism appears to be more robust, with peaking of genes in anticipation of the ensuing photosynthetic and respiratory metabolic phases.

14.
Photosynth Res ; 118(1-2): 51-7, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23881383

ABSTRACT

Mixotrophic cultivation of cyanobacteria in wastewaters with flue gas sparging has the potential to simultaneously sequester carbon from gaseous and aqueous streams and convert it to biomass and biofuels. Therefore, it was of interest to study the effect of mixotrophy and elevated CO2 on metabolism, morphology and rhythm of gene expression under diurnal cycles. We chose the diazotrophic unicellular cyanobacterium Cyanothece sp. ATCC 51142 as a model, as it is a known hydrogen producer with a robust circadian rhythm. Cyanothece 51142 grows faster with nitrate and/or an additional carbon source in the growth medium and at 3% CO2. Intracellular glycogen contents undergo diurnal oscillations, with greater accumulation under mixotrophy. While glycogen is exhausted by midnight under autotrophic conditions, significant amounts remain unutilized under mixotrophy, accompanied by a prolonged upregulation of the nifH gene. This possibly supports nitrogen fixation for longer periods, thereby leading to better growth. To gain insights into the influence of mixotrophy and elevated CO2 on circadian rhythm, we analyzed the transcription of the core clock genes kaiA, kaiB1 and kaiC1, the input pathway gene cikA, the output pathway gene rpaA, and representatives of key metabolic pathways. Clock gene transcripts were lower under mixotrophy, suggesting a dampening effect exerted by an external carbon source such as glycerol. Nevertheless, the genes of the clock and important metabolic pathways show diurnal oscillations in expression under mixotrophic and autotrophic growth at ambient and elevated CO2, respectively. Taken together, the results indicate segregation of light- and dark-associated reactions even under mixotrophy and provide important insights for further applications.


Subjects
Carbon Dioxide/physiology; Circadian Rhythm; Cyanothece/physiology; Cell Size; Culture Techniques; Cyanothece/cytology; Nitrogen Fixation
15.
BMC Bioinformatics ; 14: 196, 2013 Jun 18.
Article in English | MEDLINE | ID: mdl-23777625

ABSTRACT

BACKGROUND: In any gene regulatory network (GRN), the complex interactions occurring amongst transcription factors and target genes can be either instantaneous or time-delayed. However, many existing modeling approaches currently applied for inferring GRNs are unable to represent both these interactions simultaneously. As a result, each of these approaches misses important interactions of the type it does not model. The S-System model, a differential equation-based approach which has been increasingly applied for modeling GRNs, also suffers from this limitation. In fact, all existing S-System-based modeling approaches have been designed to capture only instantaneous interactions, and are unable to infer time-delayed interactions. RESULTS: In this paper, we propose a novel Time-Delayed S-System (TDSS) model which uses a set of delay differential equations to represent the system dynamics. The ability to incorporate time-delay parameters in the proposed S-System model enables simultaneous modeling of both instantaneous and time-delayed interactions. Furthermore, the delay parameters are not limited to just positive integer values (corresponding to time stamps in the data), but can also take fractional values. Moreover, we also propose a new criterion for model evaluation exploiting the sparse and scale-free nature of GRNs to effectively narrow down the search space, which not only reduces the computation time significantly but also improves model accuracy. The evaluation criterion systematically adapts the max-min in-degrees and also systematically balances the effect of network accuracy and complexity during optimization. CONCLUSION: The four well-known performance measures applied to the experimental studies on synthetic networks with various time-delayed regulations clearly demonstrate that the proposed method can capture both instantaneous and delayed interactions correctly with high precision. The experiments carried out on two well-known real-life networks, namely IRMA and the SOS DNA repair network in Escherichia coli, show a significant improvement compared with other state-of-the-art approaches for GRN modeling.
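
For orientation, the first equation below is the standard S-System form. The second is one plausible way for delays to enter it, with a delay parameter attached to each regulator-target pair. The abstract does not give the TDSS equations, so the delayed form and the symbol $\tau_{ij}$ are assumptions made here for illustration only.

```latex
% Standard S-System for N genes (instantaneous interactions only)
\frac{dX_i(t)}{dt} = \alpha_i \prod_{j=1}^{N} X_j(t)^{g_{ij}}
                   - \beta_i \prod_{j=1}^{N} X_j(t)^{h_{ij}},
\qquad i = 1, \dots, N

% Assumed delayed variant: gene j acts on gene i after a delay \tau_{ij} \ge 0
\frac{dX_i(t)}{dt} = \alpha_i \prod_{j=1}^{N} X_j(t - \tau_{ij})^{g_{ij}}
                   - \beta_i \prod_{j=1}^{N} X_j(t - \tau_{ij})^{h_{ij}}
```

In this sketch, setting every $\tau_{ij} = 0$ recovers the instantaneous model. Fractional delays would need interpolation between sampled time points.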


Subjects
Genetic Engineering; Models, Biological; Models, Genetic; SOS Response, Genetics/genetics; Algorithms; Escherichia coli/genetics; Gene Regulatory Networks; Time Factors; Transcription Factors/genetics
16.
BMC Bioinformatics ; 14 Suppl 2: S14, 2013.
Article in English | MEDLINE | ID: mdl-23368635

ABSTRACT

BACKGROUND: The overconsumption of fossil fuels has led to growing concerns over climate change and global warming. Increasing research effort has been directed towards viable alternative biofuel sources. Among the different biofuel platforms, cyanobacteria possess great potential because of their ability to accumulate biomass tens of times faster than traditional oilseed crops. The cyanobacterium Cyanothece sp. ATCC 51142 has recently attracted considerable research interest as a model organism for such research. Cyanothece can efficiently perform both photosynthesis and nitrogen fixation within the same cell, and has recently been shown to produce biohydrogen, a byproduct of nitrogen fixation, at rates several-fold higher than those of previously described hydrogen-producing photosynthetic microbes. Since the key enzyme for nitrogen fixation is very sensitive to the oxygen produced by photosynthesis, Cyanothece employs a sophisticated temporal separation scheme, where nitrogen fixation occurs at night and photosynthesis during the day. At the core of this temporal separation scheme is a robust clocking mechanism, which so far has not been thoroughly studied. Understanding how this circadian clock interacts with and harmonizes global transcription of key cellular processes is one of the keys to realizing the inherent potential of this organism. RESULTS: In this paper, we employ several state-of-the-art bioinformatics techniques for studying the core circadian clock in Cyanothece sp. ATCC 51142, and its interactions with other key cellular processes. We employ comparative genomics techniques to map the circadian clock genes and genetic interactions from another cyanobacterial species, namely Synechococcus elongatus PCC 7942, whose circadian clock has been much more thoroughly investigated. Using time series gene expression data for Cyanothece, we employ gene regulatory network reconstruction techniques to learn this network de novo, and compare the reconstructed network against the interactions currently reported in the literature. Next, we build a computational model of the interactions between the core clock and other cellular processes, and show how this model can predict the behaviour of the system under changing environmental conditions. The constructed models significantly advance our understanding of the functional mechanisms of the Cyanothece circadian clock.


Subjects
Circadian Clocks; Computational Biology/methods; Cyanothece/genetics; Gene Regulatory Networks; Models, Biological; Biofuels; Biomass; Chromosome Mapping; Cyanothece/metabolism; Nitrogen Fixation/genetics; Oligonucleotide Array Sequence Analysis; Photosynthesis/genetics; Synechococcus/genetics; Synechococcus/metabolism
17.
Biochim Biophys Acta ; 1824(12): 1434-41, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22683439

ABSTRACT

Genetic network reverse engineering has been an area of intensive research within the systems biology community during the last decade. With many techniques currently available, the task of validating them and choosing the best one for a certain problem is a complex issue. Current practice has been to validate an approach on in silico synthetic data sets and, wherever possible, on real data sets with known ground truth. In this study, we highlight a major issue: validating reverse engineering algorithms on small benchmark networks very often yields networks that are not statistically better than a randomly picked network. Another important issue highlighted is that, with short time series, a small variation in the pre-processing procedure might yield large differences in the inferred networks. To demonstrate these issues, we have selected as our case study the IRMA in vivo synthetic yeast network recently published in Cell. Using Fisher's exact test, we show that many results reported in the literature on reverse engineering this network are not significantly better than random. The discussion is further extended to some other networks commonly used for validation purposes in the literature. The results presented in this study emphasize that studies carried out using small genetic networks are likely to be trivial, making it imperative that larger real networks be used for validating and benchmarking purposes. If smaller networks are considered, then the results should be interpreted carefully to avoid overconfidence. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
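
The significance argument can be illustrated with a one-sided Fisher's exact test on edge counts, which is a hypergeometric tail probability. The sketch below uses only the standard library. The function name `fisher_exact_greater` and the edge counts in the example are hypothetical and are not the figures analysed in the paper.

```python
from math import comb

def fisher_exact_greater(true_positives, predicted_edges, true_edges, possible_edges):
    """One-sided p-value: chance of at least this many correct edges under random picking."""
    denom = comb(possible_edges, predicted_edges)
    upper = min(predicted_edges, true_edges)
    return sum(
        comb(true_edges, k) * comb(possible_edges - true_edges, predicted_edges - k)
        for k in range(true_positives, upper + 1)
    ) / denom

# Hypothetical small network: 20 possible directed edges, 6 of them real,
# and an inference method that predicts 6 edges.
p_three_correct = fisher_exact_greater(3, 6, 6, 20)
p_four_correct = fisher_exact_greater(4, 6, 6, 20)
```

In this made-up setting, getting three of six predicted edges right gives a p-value of about 0.23, which is indistinguishable from random picking. Four correct edges gives about 0.037. On such a small network a single edge separates an apparently good result from chance.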


Subjects
Algorithms , Gene Regulatory Networks , Bayes Theorem , Computational Biology
18.
BMC Syst Biol ; 6: 62, 2012 Jun 12.
Article in English | MEDLINE | ID: mdl-22691450

ABSTRACT

BACKGROUND: Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulation using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. In reality, however, instantaneous and time-delayed regulations occur simultaneously. A framework that can detect and model both types of interaction at once would represent gene regulatory networks more accurately. RESULTS: In this paper, we introduce a framework based on the BN formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. We also propose a novel scoring metric with firm mathematical underpinnings that, unlike other recent methods, can score both kinds of interaction concurrently and takes into account the fact that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method that employs an evolutionary search and makes use of the framework and the scoring metric is presented. CONCLUSION: By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real-life networks from Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach.


Subjects
Gene Expression Regulation , Gene Regulatory Networks , Systems Biology/methods , Bayes Theorem , Cyanobacteria/genetics , DNA Repair/genetics , Escherichia coli/genetics , Glucose/metabolism , Homeostasis/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Son of Sevenless Protein, Drosophila/metabolism , Time Factors
19.
BMC Bioinformatics ; 13: 131, 2012 Jun 13.
Article in English | MEDLINE | ID: mdl-22694481

ABSTRACT

BACKGROUND: The dynamic Bayesian network (DBN) is among the mainstream approaches for modeling various biological networks, including the gene regulatory network (GRN). Most current methods for learning DBNs employ either local search, such as hill-climbing, or a meta stochastic global optimization framework, such as a genetic algorithm or simulated annealing, which are only able to locate sub-optimal solutions. Further, current DBN applications have essentially been limited to small-sized networks. RESULTS: To overcome the above difficulties, we introduce here a deterministic global optimization based DBN approach for reverse engineering genetic networks from time-course gene expression data. For such DBN models that consist only of inter-time-slice arcs, we show that there exists a polynomial-time algorithm for learning the globally optimal network structure. The proposed approach, named GlobalMIT+, employs the recently proposed information-theoretic scoring metric named mutual information test (MIT). GlobalMIT+ is able to learn high-order time-delayed genetic interactions, which are common to most biological systems. Evaluation of the approach using both synthetic and real data sets, including a 733 cyanobacterial gene expression data set, shows significantly improved performance over other techniques. CONCLUSIONS: Our studies demonstrate that deterministic global optimization approaches can infer large-scale genetic networks.
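
One way to read the role of inter-time-slice arcs is that a network whose arcs all point from time t to time t+1 cannot contain a cycle, so the parent set of each gene can be searched independently of the others. The sketch below illustrates only that decomposition. It is not GlobalMIT+ and does not implement the MIT score: it assumes discretised expression levels, a first-order lag, and a flat per-parent penalty as a stand-in complexity term. All function names, parameters and data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items()
    )


def best_parents(data, target, max_parents=2, penalty=0.1):
    """Exhaustively pick the lag-1 parent set of `target` with the best score.

    `data` maps gene name -> list of discretised levels over time.
    The score is MI(parents at t; target at t+1) minus an assumed flat
    penalty per parent; the empty parent set scores 0.
    """
    child = data[target][1:]                      # target values at t+1
    best_set, best_score = (), 0.0
    for size in range(1, max_parents + 1):
        for parents in combinations(sorted(data), size):
            # Joint state of the candidate parents at time t.
            lagged = list(zip(*(data[p][:-1] for p in parents)))
            score = mutual_information(lagged, child) - penalty * size
            if score > best_score:
                best_set, best_score = parents, score
    return best_set


# Hypothetical toy series: B copies A with a delay of one time step.
data = {
    "A": [0, 1, 1, 0, 1, 0, 0, 1, 1, 0],
    "B": [0, 0, 1, 1, 0, 1, 0, 0, 1, 1],
    "C": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
}
print(best_parents(data, "B"))                    # ('A',)
```

Because each gene is handled on its own, the whole network is obtained by calling `best_parents` once per gene. With a bounded parent-set size, the number of candidate sets per gene grows polynomially in the number of genes.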


Subjects
Algorithms , Bayes Theorem , Gene Regulatory Networks , Computational Biology/methods , Cyanobacteria/genetics , Gene Expression Profiling/methods
20.
Bioinformatics ; 27(19): 2765-6, 2011 Oct 01.
Article in English | MEDLINE | ID: mdl-21813478

ABSTRACT

MOTIVATION: Dynamic Bayesian networks (DBN) are widely applied in modeling various biological networks, including the gene regulatory network (GRN). Due to the NP-hard nature of learning static Bayesian network structure, most methods for learning DBN also employ either local search, such as hill climbing, or a meta stochastic global optimization framework, such as a genetic algorithm or simulated annealing. RESULTS: This article presents GlobalMIT, a toolbox for learning the globally optimal DBN structure from gene expression data. We propose using a recently introduced information-theoretic scoring metric named mutual information test (MIT). With MIT, the task of learning the globally optimal DBN is efficiently achieved in polynomial time. AVAILABILITY: The toolbox, implemented in Matlab and C++, is available at http://code.google.com/p/globalmit. CONTACT: vinh.nguyen@monash.edu; madhu.chetty@monash.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Gene Expression , Gene Regulatory Networks , Metabolic Networks and Pathways , Bayes Theorem , Gene Expression Regulation , Information Storage and Retrieval , Models, Biological , Oligonucleotide Array Sequence Analysis/methods , Software