Abstract
Cancer metabolism is a complex topic, in part because its pathways are reprogrammed to sustain the malignant phenotype, to the detriment of healthy cells. Understanding these adjustments can lead to novel targeted therapies that disrupt and impair the proliferation of cancerous cells. For this purpose, genome-scale metabolic models (GEMs) have been developed, with Human1 being the most recent reconstruction of human metabolism. Based on GEMs, we introduced the genetic Minimal Cut Set (gMCS) approach, an uncontextualized methodology that exploits the concept of synthetic lethality to predict metabolic vulnerabilities in cancer. A gMCS defines a set of genes whose knockout would render the cell unviable by disrupting an essential metabolic task in GEMs, thus making cellular proliferation impossible. Here, we summarize the gMCS approach and review the current state of the methodology by performing a systematic meta-analysis based on two datasets of gene essentiality in cancer. First, we assess several thresholds and distinct methodologies for discerning highly and lowly expressed genes. Then, we address the premise that gMCSs of distinct length should have the same predictive power. Finally, we question the importance of a gene partaking in multiple gMCSs and analyze the importance of all the essential metabolic tasks defined in Human1. Our meta-analysis yielded parameter settings that increase the predictive power of the gMCS approach, as well as a significant reduction in computation time by selecting only the crucial gMCS lengths, pointing to the parameters best suited to gMCS processing.
Subjects
Neoplasms, Humans, Neoplasms/genetics, Cell Proliferation, Gene Expression, Health Status, Phenotype
Abstract
MOTIVATION: The identification of minimal genetic interventions that modulate metabolic processes constitutes one of the most relevant applications of genome-scale metabolic models (GEMs). The concept of Minimal Cut Sets (MCSs) and its extension at the gene level, genetic Minimal Cut Sets (gMCSs), have attracted increasing interest in the field of Systems Biology to address this task. Different computational tools have been developed to calculate MCSs and gMCSs using both commercial and open-source software. RESULTS: Here, we present gMCSpy, an efficient Python package to calculate gMCSs in GEMs using both commercial and non-commercial optimization solvers. We show that gMCSpy substantially outperforms our previous computational tool GMCS, which relied exclusively on commercial software. Moreover, we compared gMCSpy with recently published competing algorithms, finding significant improvements in both accuracy and computation time. All these advances make gMCSpy an attractive tool for researchers in the field of Systems Biology for different applications in health and biotechnology. AVAILABILITY AND IMPLEMENTATION: The Python package gMCSpy and the data underlying this manuscript can be accessed at: https://github.com/PlanesLab/gMCSpy.
Subjects
Algorithms, Software, Systems Biology, Systems Biology/methods, Genome, Computational Biology/methods
Abstract
MOTIVATION: 16S rRNA gene sequencing is the most frequent approach for the characterization of the human gut microbiota. Despite different efforts in the literature, the inference of functional and metabolic interpretations from 16S rRNA gene sequencing data is still a challenging task. High-quality metabolic reconstructions of the human gut microbiota, such as AGORA and AGREDA, constitute a curated resource to improve functional inference from 16S rRNA data, but they are not typically integrated into standard bioinformatics tools. RESULTS: Here, we present q2-metnet, a QIIME2 plugin that enables the contextualization of 16S rRNA gene sequencing data into AGORA and AGREDA. In particular, based on relative abundances of taxa, q2-metnet determines normalized activity scores for the reactions and subsystems involved in the selected metabolic reconstruction. Using these scores, q2-metnet allows the user to conduct differential activity analysis for reactions and subsystems, as well as exploratory analysis using PCA and hierarchical clustering. We apply q2-metnet to a dataset from our group that involves 16S rRNA data from stool samples from lean children, children allergic to cow's milk, obese children and celiac children, and to the Belgian Flemish Gut Flora Project cohort, which includes faecal 16S rRNA data from obese and normal-weight adult individuals. In the first case, q2-metnet outperforms existing algorithms in separating different clinical conditions based on predicted pathway abundances and subsystem scores. In the second case, q2-metnet complements competing approaches in predicting functional alterations in the gut microbiota of obese individuals. Overall, q2-metnet constitutes a powerful bioinformatics tool to provide metabolic context to 16S rRNA data from the human gut microbiota. AVAILABILITY: Python code of q2-metnet is available at https://github.com/PlanesLab/q2-metnet and https://figshare.com/articles/dataset/q2-metnet_package/26180446.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Abstract
MOTIVATION: Simulating gut microbial dynamics is extremely challenging. Several computational tools, notably the widely used BacArena, enable modeling of dynamic changes in the microbial environment. These methods, however, do not comprehensively account for microbe-microbe stimulatory or inhibitory effects or for nutrient-microbe inhibitory effects, typically observed for different compounds present in the daily diet. RESULTS: Here, we present BN-BacArena, an extension of BacArena that incorporates into the native computational framework a Bayesian network model accounting for microbe-microbe and nutrient-microbe interactions. Using in vitro experiments, 16S rRNA gene sequencing data and the nutritional composition of 55 foods, the output Bayesian network showed 23 significant nutrient-bacteria interactions, suggesting the importance of compounds such as polyols, ascorbic acid, polyphenols and other phytochemicals, and 40 significant bacteria-bacteria relationships. With test data, BN-BacArena demonstrates a statistically significant improvement over BacArena in predicting the time-dependent relative abundance of bacterial species involved in the gut microbiota upon different nutritional interventions. As a result, BN-BacArena opens new avenues for the dynamic modeling and simulation of human gut microbiota metabolism. AVAILABILITY AND IMPLEMENTATION: MATLAB and R code are available at https://github.com/PlanesLab/BN-BacArena.
Subjects
Bacteria, Bayes Theorem, Gastrointestinal Microbiome, 16S Ribosomal RNA, Humans, 16S Ribosomal RNA/genetics, Bacteria/metabolism, Bacteria/classification, Computer Simulation, Computational Biology/methods, Software, Microbiota
Abstract
Synthetic Lethality (SL) is currently defined as a type of genetic interaction in which the loss of function of either of two genes individually has limited effect on cell viability but inactivation of both genes simultaneously leads to cell death. Given the profound genomic aberrations acquired by tumor cells, which can be systematically identified with -omics data, SL is a promising concept in cancer research. In particular, SL has received much attention in the area of cancer metabolism, because relevant functional alterations concentrate on key metabolic pathways that promote cellular proliferation. With the extensive prior knowledge about human metabolic networks, a number of computational methods have been developed to predict SL in cancer metabolism, including the genetic Minimal Cut Sets (gMCSs) approach. A major challenge in the application of SL approaches to cancer metabolism is to systematically integrate the tumor microenvironment, given that genetic interactions and nutritional availability are interconnected to support proliferation. Here, we propose a more general definition of SL for cancer metabolism that combines genetic and environmental interactions, namely loss of gene functions and absence of nutrients in the environment. We extend our gMCSs approach to determine this new family of metabolic synthetic lethal interactions. A computational and experimental proof-of-concept is presented for predicting the lethality of dihydrofolate reductase (DHFR) inhibition in different environments. Finally, our approach is applied to identify extracellular nutrient dependences of tumor cells, elucidating cholesterol and myo-inositol depletion as potential vulnerabilities in different malignancies.
Subjects
Neoplasms, Synthetic Lethal Mutations, Tumor Cell Line, Genomics, Humans, Metabolic Networks and Pathways/genetics, Neoplasms/genetics, Neoplasms/metabolism, Nutrients, Synthetic Lethal Mutations/genetics, Tumor Microenvironment
Abstract
With the rapid growth of high-dimensional datasets in different biomedical domains, there is an urgent need to develop predictive methods able to deal with this complexity. Feature selection is a relevant strategy in machine learning to address this challenge. We introduce a novel feature selection algorithm for linear regression called BOSO (Bilevel Optimization Selector Operator). We benchmarked BOSO against key algorithms in the literature, finding superior accuracy for feature selection in high-dimensional datasets. A proof-of-concept of BOSO for predicting drug sensitivity in cancer is presented. A detailed analysis is carried out for methotrexate, a well-studied drug targeting cancer metabolism.
Subjects
Algorithms, Neoplasms, Humans, Linear Models, Machine Learning, Neoplasms/drug therapy, Neoplasms/metabolism
Abstract
Patients with major forms of acute hepatic porphyria present acute neurological attacks with overproduction of porphobilinogen (PBG) and δ-aminolevulinic acid (ALA). Even if ALA is considered the most likely agent inducing the acute symptoms, the mechanism of its accumulation has not been experimentally demonstrated. In the most frequent form, acute intermittent porphyria (AIP), inherited gene mutations induce a deficiency in PBG deaminase; thus, accumulation of the substrate PBG is biochemically obligatory, but that of ALA is not. A similar scenario is observed in other forms of acute hepatic porphyria (e.g., variegate porphyria, VP) in which PBG deaminase is inhibited by metabolic intermediates. Here, we have investigated the molecular basis of δ-aminolevulinate accumulation using in vitro fluxomics monitored by NMR spectroscopy and other biophysical techniques. Our results show that porphobilinogen, the natural product of δ-aminolevulinate dehydratase, effectively inhibits its anabolic enzyme at abnormally low concentrations. Structurally, this high affinity can be explained by the interactions that porphobilinogen forms with the active site, most of them shared with the substrate. Enzymatically, our flux analysis of an altered heme pathway demonstrates that a minimal accumulation of porphobilinogen will immediately trigger the accumulation of δ-aminolevulinate, a long-standing observation in patients suffering from acute porphyrias.
Subjects
Acute Intermittent Porphyria, Hepatic Porphyrias, Humans, Acute Intermittent Porphyria/genetics, Acute Intermittent Porphyria/metabolism, Porphobilinogen, Hydroxymethylbilane Synthase/genetics, Hydroxymethylbilane Synthase/metabolism, Hepatic Porphyrias/genetics
Abstract
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Proteins, Software, Genes, Molecular Models
Abstract
Motivation: The identification of minimal gene knockout strategies to engineer metabolic systems constitutes one of the most relevant applications of the COnstraint-Based Reconstruction and Analysis (COBRA) framework. In recent years, the minimal cut sets (MCSs) approach has emerged as a promising tool to carry out this task. However, MCSs define reaction knockout strategies, which do not necessarily translate into feasible strategies at the gene level. Results: We present a more general, easy-to-use and efficient computational implementation of a previously published algorithm to calculate MCSs at the gene level (gMCSs). Our tool was compared with existing methods for calculating essential genes and synthetic lethals in metabolic networks of different complexity, showing a significant reduction in model size and computation time. Availability and implementation: gMCS is publicly and freely available under GNU license in the COBRA toolbox (https://github.com/opencobra/cobratoolbox/tree/master/src/analysis/gMCS). Supplementary information: Supplementary data are available at Bioinformatics online.
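To illustrate why a reaction knockout does not map one-to-one onto gene knockouts, the following is a minimal Python sketch, not the published gMCS algorithm. It assumes hypothetical gene-protein-reaction (GPR) rules for two invented reactions (`R1`, `R2`) and genes (`g1` to `g4`), and finds the minimal gene deletions that disable each reaction by brute force.

```python
from itertools import combinations

# Hypothetical toy GPR rules (assumption, not taken from any real model).
# Each reaction maps to a list of alternative enzyme complexes;
# each complex is the set of genes it requires.
GPR = {
    "R1": [{"g1"}, {"g2"}],   # isoenzymes: g1 OR g2
    "R2": [{"g3", "g4"}],     # enzyme complex: g3 AND g4
}


def reaction_active(reaction, knocked_out):
    """A reaction stays active if at least one complex has all its genes intact."""
    return any(not (complex_ & knocked_out) for complex_ in GPR[reaction])


def minimal_gene_knockouts(reaction):
    """Enumerate, by increasing size, the minimal gene sets that disable a reaction."""
    genes = sorted(set().union(*GPR[reaction]))
    found = []
    for size in range(1, len(genes) + 1):
        for combo in combinations(genes, size):
            knocked_out = set(combo)
            # Skip supersets of an already found solution: they are not minimal.
            if any(prev <= knocked_out for prev in found):
                continue
            if not reaction_active(reaction, knocked_out):
                found.append(knocked_out)
    return found


if __name__ == "__main__":
    for rxn in GPR:
        print(rxn, minimal_gene_knockouts(rxn))
```

In this toy case, disabling `R1` requires deleting both isoenzyme genes, whereas deleting either subunit gene disables `R2`. Brute-force enumeration is only practical for very small rule sets; it is used here purely to make the reaction-to-gene mapping concrete.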
Subjects
Algorithms, Computational Biology, Metabolic Networks and Pathways, Essential Genes, Synthetic Lethal Mutations
Abstract
MOTIVATION: The development of computational tools exploiting -omics data and high-quality genome-scale metabolic networks for the identification of novel drug targets is a relevant topic in Systems Medicine. The Metabolic Transformation Algorithm (MTA) is one of these tools; it aims to identify targets that transform a disease metabolic state back into a healthy state, with potential application in any disease where a clear metabolic alteration is observed. RESULTS: Here, we present a robust extension to MTA (rMTA), which additionally incorporates a worst-case scenario analysis and minimization of metabolic adjustment to evaluate the beneficial effect of gene knockouts. We show that rMTA complements MTA in the different datasets analyzed (gene knockout perturbations in different organisms, Alzheimer's disease and prostate cancer), providing a more accurate tool for predicting therapeutic targets. AVAILABILITY AND IMPLEMENTATION: rMTA is freely available in the COBRA Toolbox: https://opencobra.github.io/cobratoolbox/latest/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Metabolic Networks and Pathways, Software, Algorithms, Genome, Systems Analysis
Abstract
With the emergence of metabolic networks, novel mathematical pathway concepts were introduced in the past decade, aiming to go beyond canonical maps. However, the use of network-based pathways to interpret 'omics' data has been limited owing to the fact that their computation has, until very recently, been infeasible in large (genome-scale) metabolic networks. In this review article, we describe the progress made in the past few years in the field of network-based metabolic pathway analysis. In particular, we review in detail novel optimization techniques to compute elementary flux modes, an important pathway concept in this field. In addition, we summarize approaches for the integration of metabolic pathways with gene expression data, discussing recent advances using network-based pathway concepts.
Subjects
Gene Expression, Metabolic Networks and Pathways, Algorithms, Computational Biology, Escherichia coli/genetics, Escherichia coli/metabolism, Gene Expression Profiling/statistics & numerical data, Biological Models, Software
Abstract
MOTIVATION: The concept of Minimal Cut Sets (MCSs) is used in metabolic network modeling to describe minimal groups of reactions or genes whose simultaneous deletion eliminates the capability of the network to perform a specific task. Previous work showed that MCSs were closely related to Elementary Flux Modes (EFMs) in a particular dual problem, opening up the possibility of using the tools developed for computing EFMs to compute MCSs. Until recently, however, there existed no method to compute an EFM with some specific characteristic, meaning that, in the case of MCSs, the only strategy to obtain them was to enumerate them using, for example, the standard K-shortest EFMs algorithm. RESULTS: In this work, we adapt the recently developed theory for computing EFMs satisfying several constraints to the calculation of MCSs involving a specific reaction knock-out. Importantly, we emphasize that not all the EFMs in the dual problem correspond to real MCSs, and propose a new formulation capable of correctly identifying the desired MCS. Furthermore, this formulation brings interesting insights into the relationship between the primal and the dual problem of the MCS computation. AVAILABILITY AND IMPLEMENTATION: A Matlab-Cplex implementation of the proposed algorithm is available as supplementary material. CONTACT: fplanes@ceit.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
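The MCS definition above can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions, not the dual EFM formulation described in the abstract: the five-reaction network, the seed metabolite `A` and the target `E` are hypothetical, and the "task" is approximated by simple reachability (a metabolite is producible once all substrates of some active reaction are available) rather than by steady-state flux feasibility.

```python
from itertools import combinations

# Hypothetical toy network (assumption): reaction -> (substrates, products).
REACTIONS = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"A"}, {"C"}),
    "r3": ({"B"}, {"D"}),
    "r4": ({"C"}, {"D"}),
    "r5": ({"D"}, {"E"}),
}
SEEDS = {"A"}   # metabolites assumed available from the medium
TARGET = "E"    # metabolite whose production is the task to block


def producible(active):
    """Reachability check: can TARGET be made from SEEDS using only active reactions?"""
    available = set(SEEDS)
    changed = True
    while changed:
        changed = False
        for name in active:
            substrates, products = REACTIONS[name]
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return TARGET in available


def minimal_cut_sets(max_size):
    """Brute-force enumeration of minimal reaction sets whose removal blocks TARGET."""
    names = sorted(REACTIONS)
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(names, size):
            cut = set(combo)
            # A superset of a known cut set is a cut set, but not a minimal one.
            if any(prev <= cut for prev in found):
                continue
            if not producible(set(names) - cut):
                found.append(cut)
    return found


if __name__ == "__main__":
    for cut in minimal_cut_sets(2):
        print(sorted(cut))
```

Enumerating by increasing size and discarding supersets guarantees minimality, but the number of candidate sets grows combinatorially with network size, which is why optimization-based formulations are needed for genome-scale networks.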
Subjects
Computational Biology/methods, Metabolic Flux Analysis/methods, Metabolic Networks and Pathways, Algorithms, Escherichia coli/metabolism, Humans
Subjects
Neoplastic Gene Expression Regulation, Multiple Myeloma, Long Noncoding RNA, Neoplasm RNA, Female, Humans, Male, Multiple Myeloma/genetics, Multiple Myeloma/metabolism, Long Noncoding RNA/biosynthesis, Long Noncoding RNA/genetics, Neoplasm RNA/biosynthesis, Neoplasm RNA/genetics
Abstract
MOTIVATION: Elementary flux mode (EFM) analysis constitutes a fundamental tool in systems biology. However, the efficient calculation of EFMs in genome-scale metabolic networks (GSMNs) is still a challenge. We present a novel algorithm that uses a linear programming-based tree search and efficiently enumerates a subset of EFMs in GSMNs. RESULTS: Our approach is compared with the EFMEvolver approach, demonstrating a significant improvement in computation time. We also validate the usefulness of our new approach by studying acetate overflow metabolism in the bacterium Escherichia coli. To do so, we computed 1 million EFMs for each energetic amino acid and then analysed the relevance of each energetic amino acid based on gene/protein expression data and the obtained EFMs. We found good agreement between previous experiments and the conclusions reached using EFMs. Finally, we also analysed the performance of our approach when applied to large GSMNs. AVAILABILITY AND IMPLEMENTATION: The stand-alone software TreeEFM is implemented in C++ and interacts with the open-source linear solver COIN-OR Linear program Solver (CLP).
Subjects
Acetates/metabolism, Algorithms, Escherichia coli/metabolism, Bacterial Genome, Metabolic Flux Analysis/methods, Metabolic Networks and Pathways, Software, Amino Acids/metabolism, Gene Expression Profiling, Linear Programming
Abstract
MOTIVATION: With the advent of meta-'omics' data, the use of metabolic networks for the functional analysis of microbial communities became possible. However, while network-based methods are widely developed for single organisms, their application to bacterial communities is currently limited. RESULTS: Herein, we provide a novel, context-specific reconstruction procedure based on metaproteomic and taxonomic data. Without prior knowledge of high-quality, genome-scale metabolic networks for each member of a bacterial community, we propose a meta-network approach, where the expression levels and taxonomic assignments of proteins are used as the most relevant clues for inferring an active set of reactions. Our approach was applied to draft the context-specific metabolic networks of two different naphthalene-enriched communities derived from an anthropogenically influenced, polyaromatic hydrocarbon-contaminated soil, with (CN2) or without (CN1) bio-stimulation. We were able to capture the overall functional differences between the two conditions at the metabolic level and predict an important activity for the fluorobenzoate degradation pathway in CN1 and for geraniol metabolism in CN2. Experimental validation was conducted, and good agreement with our computational predictions was observed. We also hypothesize different pathway organizations at the organismal level, which is relevant to disentangling the role of each member in the communities. The approach presented here can be easily transferred to the analysis of genomic, transcriptomic and metabolomic data.
Subjects
Bacteria/metabolism, Naphthalenes/metabolism, Soil Pollutants/metabolism, Bacteria/classification, Bacteria/genetics, Metabolic Networks and Pathways, Proteomics
Abstract
miRNAs are small RNA molecules (~22 nt) that interact with their target mRNAs, inhibiting translation and/or cleaving the target mRNA. This interaction is guided by sequence complementarity and results in the reduction of mRNA and/or protein levels. miRNAs are involved in key biological processes and different diseases. Therefore, deciphering miRNA targets is crucial for diagnostics and therapeutics. However, miRNA regulatory mechanisms are complex and there is still no high-throughput and low-cost miRNA target screening technique. In recent years, several computational methods based on sequence complementarity of the miRNA and the mRNAs have been developed. However, the interactions predicted by these computational methods are inconsistent and the expected false positive rates are still large. Recently, it has been proposed to use the expression values of miRNAs and mRNAs (and/or proteins) to refine the results of sequence-based putative targets for a particular experiment. These methods have been shown to be effective at identifying the most prominent interactions from the databases of putative targets. Here, we review these methods that combine both expression and sequence-based putative targets to predict miRNA targets.
Subjects
Gene Expression Regulation, MicroRNAs/genetics, Messenger RNA/genetics, Bayes Theorem, Genetic Databases, Least-Squares Analysis, Linear Models, Theoretical Models
Abstract
MOTIVATION: The concept of Elementary Flux Mode (EFM) has been widely used for the past 20 years. However, its application to genome-scale metabolic networks (GSMNs) is still under development because of methodological limitations. Therefore, novel approaches are needed to extend the application of EFMs. A novel family of optimization-based methods is emerging that provides a subset of EFMs. Because the calculation of the whole set of EFMs is beyond current capacity, performing a selective search is an appropriate strategy. RESULTS: Here, we present a novel mathematical approach for calculating EFMs that fulfil additional linear constraints. We validated our approach on two metabolic networks in which all the EFMs can be obtained. Finally, we analyzed the performance of our methodology in the GSMN of the yeast Saccharomyces cerevisiae by calculating EFMs producing ethanol with a given minimum carbon yield. Overall, this new approach opens new avenues for the calculation of EFMs in GSMNs. AVAILABILITY AND IMPLEMENTATION: Matlab code is provided in the supplementary online materials. CONTACT: fplanes@ceit.es. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genomics/methods, Metabolic Flux Analysis/methods, Metabolic Networks and Pathways, Algorithms, Fungal Genome/genetics, Glucose/metabolism, Reproducibility of Results, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism
Abstract
MOTIVATION: Pathway analysis tools are a powerful strategy to analyze 'omics' data in the field of systems biology. From a metabolic perspective, several pathway definitions can be found in the literature, each one appropriate for a particular study. Recently, a novel pathway concept termed carbon flux paths (CFPs) was introduced and benchmarked against existing approaches, showing a clear advantage for finding linear pathways from a given source to a target metabolite. CFPs are simple paths in a metabolite-metabolite graph that satisfy typical constraints in stoichiometric models: mass balancing and thermodynamics (irreversibility). In addition, CFPs guarantee carbon exchange in each of their intermediate steps, but not between the source and the target metabolites, and consequently false positive solutions may arise. These pathways often lack biological interest, particularly when studying biosynthetic or degradation routes of a metabolite. To overcome this issue, we amend the CFP formulation to account for atomic fate information. This approach is termed atomic CFP (aCFP). RESULTS: By means of a side-by-side comparison in a medium-scale metabolic network of Escherichia coli, we show that aCFP provides more biologically relevant pathways than CFP, because canonical pathways are more easily recovered, which reflects the benefits of removing false positives. In addition, we demonstrate that aCFP can be successfully applied to genome-scale metabolic networks. As the quality of genome-scale atomic reconstruction is improved, methods such as the one presented here will undoubtedly be of value to interpret 'omics' data.
Subjects
Carbon Cycle, Carbon/analysis, Escherichia coli/chemistry, Escherichia coli/genetics, Escherichia coli/metabolism, Bacterial Genome, Metabolic Networks and Pathways/genetics, Pyruvate Kinase/metabolism
Abstract
MOTIVATION: The analysis of high-throughput molecular data in the context of metabolic pathways is essential to uncover their underlying functional structure. Among different metabolic pathway concepts in systems biology, elementary flux modes (EFMs) hold a predominant place, as they naturally capture the complexity and plasticity of cellular metabolism and go beyond predefined metabolic maps. However, their use to interpret high-throughput data has been limited so far, mainly because their computation in genome-scale metabolic networks has been infeasible. To address this issue, different optimization-based techniques have been recently introduced and their application to human metabolism is promising. RESULTS: In this article, we exploit and generalize the K-shortest EFM algorithm to determine a subset of EFMs in a human genome-scale metabolic network. This subset of EFMs involves a large number of reported human metabolic pathways, as well as potential novel routes, and constitutes a valuable database where high-throughput data can be mapped and contextualized from a metabolic perspective. To illustrate this, we took expression data of 10 healthy human tissues from a previous study and predicted their characteristic EFMs based on enrichment analysis. We used a multivariate hypergeometric test and showed that it leads to more biologically meaningful results than the standard hypergeometric test. Finally, a biological discussion on the characteristic EFMs obtained in liver is conducted, finding a high level of agreement when compared with the literature.
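For readers unfamiliar with the baseline mentioned above, the following Python sketch shows the standard (univariate) hypergeometric enrichment test only; it does not implement the multivariate variant used in the article. The function name, its arguments and the numbers in the usage example are illustrative assumptions.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """Upper-tail p-value P(X >= observed) for X ~ Hypergeometric.

    population: total number of genes considered
    successes:  genes in the population flagged as highly expressed
    draws:      genes belonging to the pathway (e.g. one EFM) under test
    observed:   highly expressed genes actually found in that pathway
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only: 10 genes, 5 highly expressed,
    # a pathway of 5 genes, all 5 of which are highly expressed.
    print(hypergeom_pvalue(10, 5, 5, 5))
```

A small p-value indicates that the pathway contains more highly expressed genes than expected by chance under sampling without replacement.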