ABSTRACT
Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. These methodological frameworks have been developed and applied largely independently of each other, and the potential of their integration for biological, biomedical, and biotechnological research is less well known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically agnostic learning process.
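For readers new to constraint-based modeling, its most common formulation, flux balance analysis, can be written as a linear program. This is a standard textbook form and not a formulation quoted from the abstract. The symbols are assumptions introduced here: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective weights (for example, selecting a biomass reaction), and $v^{\mathrm{lb}}, v^{\mathrm{ub}}$ the flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}.
\end{aligned}
```

In a combined workflow, omic measurements would typically enter through the bounds $v^{\mathrm{lb}}, v^{\mathrm{ub}}$, and the resulting flux vector $v$ could serve as an additional feature set for a machine learning model.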
Subjects
Computational Biology/methods , Deep Learning , Genome , Machine Learning , Metabolic Networks and Pathways , Genotype , Phenotype
ABSTRACT
BACKGROUND: Ageing can be classified in two ways: chronological ageing and biological ageing. While chronological age is a measure of the time that has passed since birth, biological (also known as transcriptomic) ageing is defined by how time and the environment affect an individual in comparison to other individuals of the same chronological age. Recent studies have shown that transcriptomic age is associated with certain genes, and that each of those genes has an effect size. Using these effect sizes, we can calculate the transcriptomic age of an individual from their age-associated gene expression levels. The limitation of this approach is that it does not consider how these changes in gene expression affect the metabolism of individuals and hence their observable cellular phenotype. RESULTS: We propose a method based on poly-omic constraint-based models and machine learning in order to further the understanding of transcriptomic ageing. We use normalised CD4 T-cell gene expression data from peripheral blood mononuclear cells in 499 healthy individuals to create individual metabolic models. These models are then combined with a transcriptomic age predictor and chronological age to provide new insights into the differences between transcriptomic and chronological ageing. As a result, we propose a novel metabolic age predictor. CONCLUSIONS: We show that our poly-omic predictors provide a more detailed analysis of transcriptomic ageing compared to gene-based approaches, and represent a basis for furthering our knowledge of the ageing mechanisms in human cells.
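The effect-size calculation mentioned in the BACKGROUND can be illustrated with a minimal sketch. It assumes the predictor is a weighted sum of expression levels, rescaled so that the cohort's scores have the same mean and spread as its chronological ages. The abstract does not state the exact formula, so both the rescaling step and all names below (`raw_score`, `transcriptomic_ages`, the gene labels) are hypothetical.

```python
from statistics import mean, pstdev


def raw_score(expression, effect_sizes):
    """Weighted sum of expression levels over genes that have an effect size."""
    return sum(w * expression[g] for g, w in effect_sizes.items() if g in expression)


def transcriptomic_ages(cohort_expression, effect_sizes, chronological_ages):
    """Rescale raw scores to the mean and spread of chronological age (assumed step)."""
    scores = [raw_score(e, effect_sizes) for e in cohort_expression]
    mu_s, sd_s = mean(scores), pstdev(scores)
    mu_a, sd_a = mean(chronological_ages), pstdev(chronological_ages)
    if sd_s == 0:
        raise ValueError("all raw scores are identical; cannot rescale")
    return [mu_a + (s - mu_s) * sd_a / sd_s for s in scores]


# Hypothetical toy cohort: three individuals, two age-associated genes.
effects = {"GENE_A": 0.5, "GENE_B": -0.2}
cohort = [
    {"GENE_A": 1.0, "GENE_B": 0.0},
    {"GENE_A": 2.0, "GENE_B": 0.0},
    {"GENE_A": 3.0, "GENE_B": 0.0},
]
ages = transcriptomic_ages(cohort, effects, [30, 40, 50])
```

The difference between an individual's predicted and chronological age is then the quantity of interest; the abstract's point is that this gene-level score alone says nothing about metabolism.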
Subjects
Aging/genetics , Aging/metabolism , Genomics , Models, Biological , Adult , Cluster Analysis , Female , Humans , Male , Middle Aged , Principal Component Analysis , Regression Analysis , T-Lymphocytes/metabolism , Young Adult
ABSTRACT
In this proof-of-concept work, we evaluate the performance of multiple machine-learning methods as surrogate models for use in the analysis of agent-based models (ABMs). Analysing agent-based modelling outputs can be challenging, as the relationships between input parameters can be non-linear or chaotic even in relatively simple models, and each model run can require significant CPU time. Surrogate modelling, in which a statistical model of the ABM is constructed to facilitate detailed model analyses, has been proposed as an alternative to computationally costly Monte Carlo methods. Here we compare multiple machine-learning methods for ABM surrogate modelling in order to determine the approaches best suited as a surrogate for modelling the complex behaviour of ABMs. Our results suggest that, in most scenarios, artificial neural networks (ANNs) and gradient-boosted trees outperform Gaussian process surrogates, currently the most commonly used method for the surrogate modelling of complex computational models. ANNs produced the most accurate model replications in scenarios with high numbers of model runs, although training times were longer than for the other methods. We propose that agent-based modelling would benefit from using machine-learning methods for surrogate modelling, as this can facilitate more robust sensitivity analyses for the models while also reducing CPU time consumption when calibrating and analysing the simulation.
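The surrogate-modelling workflow described above can be sketched in a few lines: run the costly model at a handful of parameter values, fit a cheap statistical model to those runs, then query the cheap model instead. The sketch below is an illustration only. The toy contagion model `toy_abm` is a hypothetical stand-in for a real ABM, and a k-nearest-neighbour average is swapped in for the ANN, gradient-boosted tree, and Gaussian process surrogates compared in the study, purely to keep the example dependency-free.

```python
import random


def toy_abm(contact_prob, n_agents=100, steps=15, seed=0):
    """Hypothetical stand-in for a costly ABM: simple contagion among agents.

    Returns the final fraction of infected agents.
    """
    rng = random.Random(seed)
    infected = [False] * n_agents
    infected[0] = True  # a single initially infected agent
    for _ in range(steps):
        for i in range(n_agents):
            if not infected[i]:
                # each susceptible agent meets one random agent per step
                partner = rng.randrange(n_agents)
                if infected[partner] and rng.random() < contact_prob:
                    infected[i] = True
    return sum(infected) / n_agents


def fit_knn_surrogate(xs, ys, k=3):
    """Return a predictor that averages the outputs of the k nearest training runs."""
    data = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


# Design of experiments: 21 ABM runs over the parameter range [0, 1].
train_x = [i / 20 for i in range(21)]
train_y = [toy_abm(x, seed=i) for i, x in enumerate(train_x)]

# The surrogate can now be queried without re-running the ABM.
surrogate = fit_knn_surrogate(train_x, train_y, k=3)
estimate = surrogate(0.33)
```

Once fitted, the surrogate can be evaluated many thousands of times for sensitivity analysis or calibration at negligible cost, which is the CPU-time saving the abstract refers to.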
Subjects
Neural Networks, Computer
ABSTRACT
Cancer is considered a high-risk condition for severe illness resulting from COVID-19. The interaction between severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) and human metabolism is key to elucidating the risk posed by COVID-19 for cancer patients and identifying effective treatments, yet it is largely uncharacterised on a mechanistic level. We present a genome-scale map of short-term metabolic alterations triggered by SARS-CoV-2 infection of cancer cells. Through transcriptomic- and proteomic-informed genome-scale metabolic modelling, we characterise the role of RNA and fatty acid biosynthesis in conjunction with a rewiring in energy production pathways and enhanced cytokine secretion. These findings link together complementary aspects of viral invasion of cancer cells, while providing mechanistic insights that can inform the development of treatment strategies.