Results 1 - 7 of 7
1.
Bioinformatics ; 38(24): 5383-5389, 2022 12 13.
Article in English | MEDLINE | ID: mdl-36321881

ABSTRACT

MOTIVATION: The cellular system of a living organism is composed of interacting bio-molecules that control cellular processes at multiple levels. Their correspondences are represented by tightly regulated molecular networks. The increase of omics technologies has favored the generation of large-scale disparate data and the consequent demand for simultaneously using molecular and functional interaction networks: gene co-expression, protein-protein interaction (PPI), genetic interaction and metabolic networks. They are rich sources of information at different molecular levels, and their effective integration is essential to understand cell functioning and their building blocks (proteins). Therefore, it is necessary to obtain informative representations of proteins and their proximity, that are not fully captured by features extracted directly from a single informational level. We propose BraneMF, a novel random walk-based matrix factorization method for learning node representation in a multilayer network, with application to omics data integration. RESULTS: We test BraneMF with PPI networks of Saccharomyces cerevisiae, a well-studied yeast model organism. We demonstrate the applicability of the learned features for essential multi-omics inference tasks: clustering, function and PPI prediction. We compare it to the state-of-the-art integration methods for multilayer networks. BraneMF outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks. The robustness of results is assessed by an extensive parameter sensitivity analysis. AVAILABILITY AND IMPLEMENTATION: BraneMF's code is freely available at: https://github.com/Surabhivj/BraneMF, along with datasets, embeddings and result files. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
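As a rough illustration of what "random walk-based matrix factorization" on a multilayer network can look like, the sketch below builds a truncated-log proximity matrix from short random walks on each layer (in the style of NetMF), averages the layers, and factorizes the result with an SVD. This is an assumption-laden generic sketch, not BraneMF's actual algorithm: the function names, the averaging across layers, the window size, and the toy adjacency matrices are all hypothetical.

```python
import numpy as np

def random_walk_pmi(adj, window=3):
    """PMI-style proximity matrix from random walks of length 1..window on one layer."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0                    # guard against isolated nodes
    trans = adj / deg[:, None]             # row-stochastic transition matrix D^-1 A
    power = np.eye(len(adj))
    walk_sum = np.zeros_like(adj)
    for _ in range(window):
        power = power @ trans              # r-step transition probabilities
        walk_sum += power
    vol = adj.sum()                        # graph volume (sum of degrees)
    m = (vol / window) * walk_sum / deg[None, :]   # right-multiply by D^-1
    return np.log(np.maximum(m, 1.0))      # truncated log keeps entries >= 0

def multilayer_embedding(layers, dim=2, window=3):
    """Average the per-layer proximity matrices, then factorize with an SVD."""
    combined = sum(random_walk_pmi(a, window) for a in layers) / len(layers)
    u, s, _ = np.linalg.svd(combined)
    return u[:, :dim] * np.sqrt(s[:dim])   # one dim-dimensional vector per node

# Hypothetical toy data: two layers over the same four nodes.
layers = [
    np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]),
    np.array([[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]),
]
embedding = multilayer_embedding(layers, dim=2)
```

Each row of `embedding` is a feature vector for one node and could feed a downstream task such as clustering or link prediction.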


Subject(s)
Proteins , Saccharomyces cerevisiae , Proteins/metabolism , Cluster Analysis , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
2.
BMC Bioinformatics ; 23(1): 429, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36245002

ABSTRACT

BACKGROUND: Gene expression is regulated at different molecular levels, including chromatin accessibility, transcription, RNA maturation, and transport. These regulatory mechanisms have strong connections with cellular metabolism. In order to study the cellular system and its functioning, omics data at each molecular level can be generated and efficiently integrated. Here, we propose BRANENET, a novel multi-omics integration framework for multilayer heterogeneous networks. BRANENET is an expressive, scalable, and versatile method to learn node embeddings, leveraging random walk information within a matrix factorization framework. Our goal is to efficiently integrate multi-omics data to study different regulatory aspects of multilayered processes that occur in organisms. We evaluate our framework using multi-omics data of Saccharomyces cerevisiae, a well-studied yeast model organism. RESULTS: We test BRANENET on transcriptomics (RNA-seq) and targeted metabolomics (NMR) data for wild-type yeast strain during a heat-shock time course of 0, 20, and 120 min. Our framework learns features for differentially expressed bio-molecules showing heat stress response. We demonstrate the applicability of the learned features for targeted omics inference tasks: transcription factor (TF)-target prediction, integrated omics network (ION) inference, and module identification. The performance of BRANENET is compared to existing network integration methods. Our model outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks.


Subject(s)
RNA , Saccharomyces cerevisiae , Chromatin , RNA-Seq , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
3.
Microbiol Spectr ; 10(2): e0228821, 2022 04 27.
Article in English | MEDLINE | ID: mdl-35412381

ABSTRACT

Transcription initiation is a tightly regulated process that is crucial for many aspects of prokaryotic physiology. High-throughput transcription start site (TSS) mapping can shed light on global and local regulation of transcription initiation, which in turn may help us understand and predict microbial behavior. In this study, we used Capp-Switch sequencing to determine the TSS positions in the genomes of three model solventogenic clostridia: Clostridium acetobutylicum ATCC 824, C. beijerinckii DSM 6423, and C. beijerinckii NCIMB 8052. We first refined the approach by implementing a normalization pipeline accounting for gene expression, yielding a total of 12,114 mapped TSSs across the species. We further compared the distributions of these sites in the three strains. Results indicated similar distribution patterns at the genome scale, but also some sharp differences, such as for the butyryl-CoA synthesis operon, particularly when comparing C. acetobutylicum to the C. beijerinckii strains. Lastly, we found that promoter structure is generally poorly conserved between C. acetobutylicum and C. beijerinckii. A few conserved promoters across species are discussed, showing interesting examples of how TSS determination and comparison can improve our understanding of gene expression regulation at the transcript level. IMPORTANCE Solventogenic clostridia have been employed in industry for more than a century, initially being used in the acetone-butanol-ethanol (ABE) fermentation process for acetone and butanol production. Interest in these bacteria has recently increased in the context of green chemistry and sustainable development. However, our current understanding of their genomes and physiology limits their optimal use as industrial solvent production platforms. The gene regulatory mechanisms of solventogenesis are still only partly understood, impeding efforts to increase rates and yields. 
Genome-wide mapping of transcription start sites (TSSs) for three model solventogenic Clostridium strains is an important step toward understanding mechanisms of gene regulation in these industrially important bacteria.


Subject(s)
Acetone , Clostridium acetobutylicum , Acetone/metabolism , Anaerobic Bacteria , Butanols/metabolism , Clostridium/genetics , Clostridium/metabolism , Clostridium acetobutylicum/genetics , Clostridium acetobutylicum/metabolism , Fermentation
4.
BMC Genomics ; 21(1): 885, 2020 Dec 10.
Article in English | MEDLINE | ID: mdl-33302864

ABSTRACT

BACKGROUND: The degradation of cellulose and hemicellulose molecules into simpler sugars such as glucose is part of the second-generation biofuel production process. Hydrolysis of lignocellulosic substrates is usually performed by enzymes produced and secreted by the fungus Trichoderma reesei. Studies identifying transcription factors involved in the regulation of cellulase production have been conducted, but no overview of the whole regulation network is available. A transcriptomic approach with mixtures of glucose and lactose, used as a substrate for cellulase induction, was used to help us decipher missing parts in the network of T. reesei Rut-C30. RESULTS: Experimental results on the Rut-C30 hyperproducing strain confirmed the impact of sugar mixtures on the enzymatic cocktail composition. The transcriptomic study shows a temporal regulation of the main transcription factors and an impact of lactose concentration on the transcriptional profile. A gene regulatory network built using BRANE Cut software reveals three sub-networks related to i) a positive correlation between lactose concentration and cellulase production, ii) a particular dependence of β-glucosidase regulation on lactose and iii) a negative regulation of the development process and growth. CONCLUSIONS: This work is the first transcriptomic study of the effects of pure and mixed carbon sources in fed-batch mode. Our study exposes a co-orchestration of xyr1, clr2 and ace3 for cellulase and hemicellulase induction and production, a fine regulation of β-glucosidase and a decrease of growth in favor of cellulase production. These conclusions provide us with potential targets for further genetic engineering leading to better cellulase-producing strains in industry-like conditions.


Subject(s)
Cellulase , Trichoderma , Cellulase/genetics , Gene Regulatory Networks , Glucose , Hypocreales , Lactose , Trichoderma/genetics
5.
Article in English | MEDLINE | ID: mdl-28368827

ABSTRACT

Discovering meaningful gene interactions is crucial for the identification of novel regulatory processes in cells. Accurately building the related graphs remains challenging due to the large number of possible solutions from available data. Nonetheless, enforcing a priori knowledge on the graph structure, such as modularity, may reduce network indeterminacy issues. BRANE Clust (Biologically-Related A priori Network Enhancement with Clustering) refines gene regulatory network (GRN) inference using cluster information. It works as a post-processing tool for inference methods (e.g., CLR, GENIE3). In BRANE Clust, the clustering is based on the inversion of a system of linear equations involving a graph-Laplacian matrix promoting a modular structure. Our approach is validated on DREAM4 and DREAM5 datasets with objective measures, showing significant comparative improvements. We provide additional insights on the discovery of novel regulatory or co-expressed links in the inferred Escherichia coli network evaluated using the STRING database. The comparative pertinence of clustering is discussed computationally (SIMoNe, WGCNA, X-means) and biologically (RegulonDB). BRANE Clust software is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-clust.html.
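To make the idea of "clustering by solving a linear system involving a graph Laplacian" concrete, here is a minimal seeded-labeling sketch: each seed node anchors one cluster, a Laplacian system is solved per seed, and every node joins the seed with the highest score. It is a generic illustration under stated assumptions, not the BRANE Clust formulation; the function name, the `strength` parameter, the choice of seeds, and the toy weights are hypothetical.

```python
import numpy as np

def laplacian_clustering(weights, seeds, strength=10.0):
    """Assign each node to the seed whose Laplacian-smoothed score is highest."""
    w = np.asarray(weights, dtype=float)
    lap = np.diag(w.sum(axis=1)) - w           # graph Laplacian L = D - W
    n = len(w)
    anchor = np.zeros(n)
    anchor[list(seeds)] = strength             # pull seed nodes toward their label
    system = lap + np.diag(anchor)             # invertible if every component has a seed
    scores = np.zeros((n, len(seeds)))
    for k, seed in enumerate(seeds):
        rhs = np.zeros(n)
        rhs[seed] = strength                   # indicator of seed k, scaled
        scores[:, k] = np.linalg.solve(system, rhs)
    return scores.argmax(axis=1)

# Hypothetical toy graph: nodes 0-2 form a tight triangle, nodes 3-4 a tight pair,
# and a weak edge (weight 0.1) links node 2 to node 3.
weights = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
])
labels = laplacian_clustering(weights, seeds=[0, 4])
```

In this toy case the weak edge separates the two groups, so nodes 0 to 2 follow seed 0 and nodes 3 and 4 follow seed 4.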


Subject(s)
Cluster Analysis , Computational Biology/methods , Gene Regulatory Networks/genetics , Algorithms , Genetic Databases , Escherichia coli/genetics , Gene Expression Profiling , Software
6.
BMC Bioinformatics ; 16: 368, 2015 Nov 04.
Article in English | MEDLINE | ID: mdl-26537179

ABSTRACT

BACKGROUND: Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions. METHODS: Weighting all possible pairwise gene relationships by a probability of edge presence, we formulate the regulatory network inference as a discrete variational problem on graphs. We enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional coding for a priori structures. The optimization is carried out with Graph cuts, an approach popular in image processing and computer vision. We compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge. RESULTS: Our BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6% to 11%). On a real Escherichia coli compendium, an improvement of 11.8% compared to CLR and 3% compared to GENIE3 is obtained in terms of Area Under Precision-Recall curve. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster. The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html. CONCLUSIONS: BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. 
It is applicable as a generic network inference post-processing, due to its computational efficiency.
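The evaluation described above scores thresholded edge lists against a set of verified interactions. A minimal sketch of that kind of measurement is given below: precision and recall at a single weight threshold, and a step-wise estimate of the area under the precision-recall curve (average precision). The function names, the edge weights, and the "verified" edges are hypothetical examples and do not come from the paper; the paper's exact evaluation protocol may differ.

```python
def precision_recall(scored_edges, true_edges, threshold):
    """Precision and recall of the edges whose weight is at least `threshold`."""
    kept = {edge for edge, weight in scored_edges.items() if weight >= threshold}
    hits = len(kept & true_edges)
    precision = hits / len(kept) if kept else 1.0
    recall = hits / len(true_edges) if true_edges else 0.0
    return precision, recall

def area_under_pr(scored_edges, true_edges):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(scored_edges, key=scored_edges.get, reverse=True)
    hits, area = 0, 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            area += hits / rank            # precision at each newly recovered true edge
    return area / len(true_edges) if true_edges else 0.0

# Hypothetical edge weights from some inference method, and hypothetical verified edges.
scored_edges = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.3, ("c", "d"): 0.2}
true_edges = {("a", "b"), ("b", "c")}

precision, recall = precision_recall(scored_edges, true_edges, threshold=0.5)
aupr = area_under_pr(scored_edges, true_edges)
```

Comparing `aupr` before and after a post-processing step is one simple way to express an improvement as a percentage.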


Subject(s)
Algorithms , Escherichia coli/genetics , Gene Regulatory Networks , Area Under Curve , Computer Simulation , Genetic Databases , Reproducibility of Results
7.
Biotechnol Biofuels ; 7(1): 173, 2014.
Article in English | MEDLINE | ID: mdl-25550711

ABSTRACT

BACKGROUND: The filamentous fungus Trichoderma reesei is the main industrial cellulolytic enzyme producer. Several strains have been developed in the past using random mutagenesis, and despite impressive performance enhancements, the pressure for low-cost cellulases has stimulated continuous research in the field. In this context, comparative study of the lower and higher producer strains obtained through random mutagenesis using systems biology tools (genome and transcriptome sequencing) can shed light on the mechanisms of cellulase production and help identify genes linked to performance. Previously, our group published comparative genome sequencing of the lower and higher producer strains NG 14 and RUT C30. In this follow-up work, we examine how these mutations affect phenotype as regards the transcriptome and cultivation behaviour. RESULTS: We performed kinetic transcriptome analysis of the NG 14 and RUT C30 strains of early enzyme production induced by lactose using bioreactor cultivations close to an industrial cultivation regime. RUT C30 exhibited both earlier onset of protein production (3 h) and higher steady-state productivity. A rather small number of genes compared to previous studies were regulated (568), most of them being specific to the NG 14 strain (319). Clustering analysis highlighted similar behaviour for some functional categories and allowed us to distinguish between induction-related genes and productivity-related genes. Cross-comparison of our transcriptome data with previously identified mutations revealed that most genes from our dataset have not been mutated. Interestingly, the few mutated genes belong to the same clusters, suggesting that these clusters contain genes playing a role in strain performance. CONCLUSIONS: This is the first kinetic analysis of a transcriptomic study carried out under conditions approaching industrial ones with two related strains of T. reesei showing distinctive cultivation behaviour. 
Our study sheds some light on some of the events occurring in these strains following induction by lactose. The fact that few regulated genes have been affected by mutagenesis suggests that the induction mechanism is essentially intact compared to that for the wild-type isolate QM6a and might be engineered for further improvement of T. reesei. Genes from two specific clusters might be potential targets for such genetic engineering.
