Results 1 - 20 of 82
1.
BMC Med Inform Decis Mak ; 24(1): 167, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38877563

ABSTRACT

BACKGROUND: Consider a setting where multiple parties holding sensitive data aim to collaboratively learn population-level statistics, but pooling the sensitive data sets is not possible due to privacy concerns and the parties are unable to engage in centrally coordinated joint computation. We study the feasibility of combining privacy-preserving synthetic data sets in place of the original data for collaborative learning on real-world health data from the UK Biobank. METHODS: We perform an empirical evaluation based on an existing prospective cohort study from the literature. Multiple parties were simulated by splitting the UK Biobank cohort along assessment centers, for which we generate synthetic data using differentially private generative modelling techniques. We then apply the original study's Poisson regression analysis to the combined synthetic data sets and evaluate the effects of (1) the size of the local data sets, (2) the number of participating parties, and (3) local shifts in distributions on the obtained likelihood scores. RESULTS: We discover that parties engaging in the collaborative learning via shared synthetic data obtain more accurate estimates of the regression parameters compared to using only their local data. This finding extends to the difficult case of small heterogeneous data sets. Furthermore, the more parties participate, the larger and more consistent the improvements become, up to a certain limit. Finally, we find that data sharing can especially help parties whose data contain underrepresented groups to perform better-adjusted analysis for those groups. CONCLUSIONS: Based on our results, we conclude that sharing of synthetic data is a viable method for enabling learning from sensitive data without violating privacy constraints, even if individual data sets are small or do not represent the overall population well. Lack of access to distributed sensitive data is often a bottleneck in biomedical research, which our study shows can be alleviated with privacy-preserving collaborative learning methods.


Subject(s)
Information Dissemination, Humans, United Kingdom, Cooperative Behavior, Confidentiality/standards, Privacy, Biological Specimen Banks, Prospective Studies
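The pooled analysis described in this record can be illustrated with a small, hedged sketch: each simulated party contributes a data set (here random draws standing in for the differentially private synthetic data), and the same Poisson regression is fitted locally and on the pooled data. The data generation, covariates and party sizes below are illustrative assumptions, not the study's UK Biobank pipeline or its DP generative models.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
true_beta = np.array([0.2, 0.5, -0.3])   # intercept + 2 covariates (illustrative)

def simulate_party(n):
    """Stand-in for one party's (synthetic) data set; not the UK Biobank data."""
    X = sm.add_constant(rng.normal(size=(n, 2)))
    y = rng.poisson(np.exp(X @ true_beta))
    return X, y

parties = [simulate_party(n) for n in (300, 250, 400)]

# Local analysis: each party fits the Poisson regression on its own data only.
for i, (X, y) in enumerate(parties):
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    print(f"party {i}: beta_hat = {fit.params.round(2)}")

# Collaborative analysis: pool the (synthetic) data sets and refit.
X_all = np.vstack([X for X, _ in parties])
y_all = np.concatenate([y for _, y in parties])
pooled = sm.GLM(y_all, X_all, family=sm.families.Poisson()).fit()
print("pooled:  beta_hat =", pooled.params.round(2), " loglik =", round(pooled.llf, 1))
```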
2.
Eur J Neurosci ; 59(9): 2320-2335, 2024 May.
Article in English | MEDLINE | ID: mdl-38483260

ABSTRACT

Recent magnetoencephalography (MEG) studies have reported that functional connectivity (FC) and power spectra can be used as neural fingerprints to differentiate individuals. Such studies have mainly used correlations between measurement sessions to distinguish individuals from each other. However, it has remained unclear whether such correlations might reflect a more generalizable principle of individually distinctive brain patterns. Here, we evaluated a machine-learning approach, latent-noise Bayesian reduced-rank regression (BRRR), as a means of modelling individual differences in the resting-state MEG data of the Human Connectome Project (HCP), using FC and power spectra as neural features. First, we verified that BRRR could model and reproduce the differences between metrics that correlation-based fingerprinting yields. We trained BRRR models to distinguish individuals based on data from one measurement session and used the models to identify subsequent measurement sessions of those same individuals. The best-performing BRRR models, using only 20 spatiospectral components, identified subjects across measurement sessions with over 90% accuracy, approaching the highest correlation-based accuracies. Using cross-validation, we then determined whether the BRRR model could generalize to unseen subjects, successfully classifying the measurement sessions of novel individuals with over 80% accuracy. The results demonstrate that individual neurofunctional differences can be reliably extracted from MEG data with a low-dimensional predictive model and that the model is able to classify novel subjects.


Subject(s)
Bayes Theorem, Brain, Connectome, Magnetoencephalography, Humans, Magnetoencephalography/methods, Connectome/methods, Brain/physiology, Machine Learning, Male, Female, Adult, Neurological Models
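The correlation-based fingerprinting that the BRRR models are benchmarked against (not BRRR itself) can be sketched as follows: each subject's session-2 feature vector (e.g. vectorized FC or spectral power) is matched to the session-1 vector with the highest Pearson correlation. The feature dimensions and the simulated data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_features = 40, 500           # e.g. vectorized FC / spectral features

# Illustrative data: a stable individual component plus session-specific noise.
individual = rng.normal(size=(n_subjects, n_features))
session1 = individual + 0.5 * rng.normal(size=(n_subjects, n_features))
session2 = individual + 0.5 * rng.normal(size=(n_subjects, n_features))

def zscore(a):
    return (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)

# Correlation-based fingerprinting: match each session-2 recording to the
# session-1 recording with the highest Pearson correlation.
corr = zscore(session2) @ zscore(session1).T / n_features   # subjects x subjects
predicted = corr.argmax(axis=1)
accuracy = (predicted == np.arange(n_subjects)).mean()
print(f"identification accuracy: {accuracy:.2f}")
```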
3.
Chem Res Toxicol ; 36(8): 1238-1247, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37556769

ABSTRACT

Drug-induced liver injury (DILI) is an important safety concern and a major reason for withdrawing a drug from the market. Recent advances in machine learning have led to a wide range of in silico models for DILI prediction based on molecular chemical structures (fingerprints). Existing publicly available DILI data sets used for model building are based on the interpretation of drug labels or patient case reports, resulting in a typically binary clinical DILI annotation. We developed a novel phenotype-based annotation to process hepatotoxicity information extracted from repeated-dose in vivo preclinical toxicology studies using INHAND annotation, providing a more informative and reliable data set for machine learning algorithms. This work resulted in a data set of 430 unique compounds covering diverse liver pathology findings, which we used to develop multiple DILI prediction models trained on the publicly available TG-GATEs data using compound fingerprints. We demonstrate that the DILI labels of TG-GATEs compounds can be predicted well, and show how differences between TG-GATEs and the external test compounds (Johnson & Johnson) affect the models' generalization performance.


Subject(s)
Chemical and Drug Induced Liver Injury, Drug-Related Side Effects and Adverse Reactions, Humans, Algorithms, Machine Learning, Computer Simulation
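A hedged sketch of a generic fingerprint-based DILI classifier of the kind this record describes: Morgan fingerprints computed with RDKit feed a random-forest model. The SMILES strings and labels below are placeholders, and this is not the authors' TG-GATEs/INHAND pipeline.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder compounds and binary hepatotoxicity labels; the real study uses
# TG-GATEs compounds with phenotype-based (INHAND-derived) annotations.
smiles = ["CCO", "CC(=O)Nc1ccc(O)cc1", "c1ccccc1", "CC(C)Cc1ccc(cc1)C(C)C(=O)O"]
labels = [0, 1, 0, 1]

def morgan_fp(smi, n_bits=2048, radius=2):
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

X = np.array([morgan_fp(s) for s in smiles])
y = np.array(labels)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=2)   # tiny toy split; real work needs proper CV
print("toy CV accuracy:", scores.mean())
```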
4.
Bioinformatics ; 39(9)2023 09 02.
Article in English | MEDLINE | ID: mdl-37647640

ABSTRACT

MOTIVATION: Existing methods for simulating synthetic genotype and phenotype datasets have limited scalability, constraining their usability for large-scale analyses. Moreover, a systematic approach for evaluating synthetic data quality and a benchmark synthetic dataset for developing and evaluating methods for polygenic risk scores are lacking. RESULTS: We present HAPNEST, a novel approach for efficiently generating diverse individual-level genotypic and phenotypic data. In comparison to alternative methods, HAPNEST shows faster computational speed and a lower degree of relatedness with reference panels, while generating datasets that preserve key statistical properties of real data. These desirable synthetic data properties enabled us to generate 6.8 million common variants and nine phenotypes with varying degrees of heritability and polygenicity across 1 million individuals. We demonstrate how HAPNEST can facilitate biobank-scale analyses through a comparison of seven polygenic risk scoring methods across multiple ancestry groups and different genetic architectures. AVAILABILITY AND IMPLEMENTATION: A synthetic dataset of 1 008 000 individuals and nine traits for 6.8 million common variants is available at https://www.ebi.ac.uk/biostudies/studies/S-BSST936. The HAPNEST software for generating synthetic datasets is available as Docker/Singularity containers and open-source Julia and C code at https://github.com/intervene-EU-H2020/synthetic_data.


Subject(s)
Benchmarking, Data Accuracy, Humans, Genotype, Phenotype, Multifactorial Inheritance
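As a hedged illustration of the downstream analyses such a synthetic dataset enables (not the HAPNEST generator or any of the seven benchmarked methods), the sketch below computes a basic polygenic risk score as a weighted sum of allele dosages over simulated genotypes and effect sizes.

```python
import numpy as np

rng = np.random.default_rng(2)
n_individuals, n_variants = 1_000, 5_000    # illustrative; HAPNEST scales far larger

# Stand-in synthetic genotypes (0/1/2 allele dosages) and GWAS effect sizes.
genotypes = rng.integers(0, 3, size=(n_individuals, n_variants)).astype(float)
effect_sizes = rng.normal(0, 0.01, size=n_variants)

# A basic polygenic risk score: weighted sum of dosages, then standardized.
prs = genotypes @ effect_sizes
prs = (prs - prs.mean()) / prs.std()
print("top-decile cut-off:", np.quantile(prs, 0.9).round(3))
```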
5.
Front Artif Intell ; 6: 1097891, 2023.
Article in English | MEDLINE | ID: mdl-37091302

ABSTRACT

Modeling has actively tried to take the human out of the loop, originally for objectivity and recently also for automation. We argue that an unnecessary side effect has been that modeling workflows and machine learning pipelines have become restricted to only well-specified problems. Putting the humans back into the models would enable modeling a broader set of problems, through iterative modeling processes in which AI can offer collaborative assistance. However, this requires advances in how we scope our modeling problems, and in the user models. In this perspective article, we characterize the required user models and the challenges ahead for realizing this vision, which would enable new interactive modeling workflows, and human-centric or human-compatible machine learning pipelines.

6.
Front Neurorobot ; 17: 1289406, 2023.
Article in English | MEDLINE | ID: mdl-38250599

ABSTRACT

More than 10 million Europeans show signs of mild cognitive impairment (MCI), a transitional stage between normal brain aging and dementia-stage memory disorder. The course of MCI can be divergent: while some individuals remain stable or even revert to normal cognition, alarmingly, up to half of cases progress to dementia within 5 years. Current diagnostic practice lacks the screening tools needed to identify those at risk of progression. The European patient experience often involves a long journey from the initial signs of MCI to the eventual diagnosis of dementia, and the trajectory is far from ideal. Here, we introduce the AI-Mind project, a pioneering initiative with an innovative approach to early risk assessment through the application of advanced artificial intelligence (AI) to multimodal data. The cutting-edge AI-based tools developed in the project aim not only to accelerate the diagnostic process but also to deliver highly accurate predictions of an individual's risk of developing dementia at a stage when prevention and intervention may still be possible. AI-Mind is a European Research and Innovation Action (RIA H2020-SC1-BHC-06-2020, No. 964220) financed between 2021 and 2026. First, the AI-Mind Connector identifies dysfunctional brain networks based on high-density magneto- and electroencephalography (M/EEG) recordings. Second, the AI-Mind Predictor predicts dementia risk using data from the Connector, enriched with computerized cognitive tests, genetic and protein biomarkers, as well as sociodemographic and clinical variables. AI-Mind is integrated within a network of major European initiatives, including The Virtual Brain, The Virtual Epileptic Patient, and the EBRAINS AISBL service for sensitive data, HealthDataCloud, where big patient data are generated to advance digital and virtual twin technology development. AI-Mind's innovation lies not only in its early prediction of dementia risk but also in enabling a virtual laboratory for hypothesis-driven personalized intervention research. This article introduces the background of the AI-Mind project and its clinical study protocol, setting the stage for future scientific contributions.

7.
BMC Bioinformatics ; 23(1): 522, 2022 Dec 06.
Article in English | MEDLINE | ID: mdl-36474143

ABSTRACT

BACKGROUND: A deep understanding of carcinogenesis at the DNA level underpins many advances in cancer prevention and treatment. Mutational signatures provide a breakthrough conceptualisation, as well as an analysis framework, that can be used to build such understanding. They capture somatic mutation patterns and, at best, identify their causes. Most studies in this context have focused on an inherently additive analysis, e.g. by non-negative matrix factorization, where the mutations within a cancer sample are explained by a linear combination of independent mutational signatures. However, other recent studies show that mutational signatures exhibit non-additive interactions. RESULTS: We carefully analysed such additive model fits from the PCAWG study, which catalogues mutational signatures and their activities across thousands of cancers. Our analysis identified systematic, non-random structure in the residuals that is left unexplained by the additive model. We used hierarchical clustering to identify cancer subsets with similar residual profiles and show that both systematic overestimation and underestimation of mutation counts take place. We propose an extension to the additive mutational signature model, multiplicatively acting modulatory processes, and develop a maximum-likelihood framework to identify such modulatory mutational signatures. The augmented model is expressive enough to almost fully remove the observed systematic residual patterns. CONCLUSION: We suggest that the modulatory processes biologically relate to sample-specific DNA repair propensities with cancer- or tissue-type-specific profiles. Overall, our results identify an interesting direction in which to expand signature analysis.


Subject(s)
Neoplasms, Humans, Mutation, Neoplasms/genetics
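A hedged sketch of the additive baseline and the residual inspection that motivates the proposed extension: a simulated mutation-count matrix is factorized with NMF, and the residuals left by the additive reconstruction are examined. The simulation parameters are illustrative, and the multiplicative modulatory model itself is not implemented here.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(3)
n_samples, n_channels, n_signatures = 200, 96, 5   # 96 trinucleotide mutation channels

# Simulated additive data: exposures x signatures, Poisson-distributed counts.
signatures = rng.dirichlet(np.ones(n_channels), size=n_signatures)      # k x 96
exposures = rng.gamma(2.0, 50.0, size=(n_samples, n_signatures))        # n x k
counts = rng.poisson(exposures @ signatures)

# Additive signature fit (as in standard mutational-signature analysis).
model = NMF(n_components=n_signatures, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(counts)          # fitted exposures
H = model.components_                    # fitted signatures
reconstruction = W @ H

# Residuals: systematic structure here is what motivates a multiplicative
# modulatory extension of the additive model.
residuals = counts - reconstruction
print("mean |residual| per channel:", np.abs(residuals).mean().round(2))
```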
8.
J Cheminform ; 14(1): 86, 2022 Dec 28.
Article in English | MEDLINE | ID: mdl-36578043

ABSTRACT

A de novo molecular design workflow can be used together with technologies such as reinforcement learning to navigate the chemical space. A remaining bottleneck in the workflow is how to integrate human feedback into the exploration of the chemical space to optimize molecules. A human drug designer still needs to define the goal, expressed as a scoring function for the molecules that captures the designer's implicit knowledge about the optimization task. Little support for this task exists and, consequently, a chemist usually resorts to iteratively building the objective function of multi-parameter optimization (MPO) in de novo design. We propose a principled approach that uses human-in-the-loop machine learning to help the chemist adapt the MPO scoring function to better match their goal. An advantage is that the method can learn the scoring function directly from the user's feedback while they browse the output of the molecule generator, instead of the current manual tuning of the scoring function by trial and error. The proposed method uses a probabilistic model that captures the user's idea of, and uncertainty about, the scoring function, and it uses active learning to interact with the user. We present two case studies: in the first, the parameters of an MPO are learned, and in the second, a non-parametric component of the scoring function is developed to capture human domain knowledge. The results show the effectiveness of the methods in two simulated example cases with an oracle, achieving significant improvement in fewer than 200 feedback queries, for the goals of a high QED score and identifying potent molecules for the DRD2 receptor, respectively. We further demonstrate the performance gains with a medicinal chemist interacting with the system.
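A hedged toy version of the human-in-the-loop idea: a simulated "chemist" oracle gives binary feedback on molecules, a logistic model over MPO component scores is refit after each answer, and the next query is chosen by uncertainty sampling. The oracle, component scores and query rule are illustrative assumptions, not the paper's probabilistic user model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n_molecules, n_properties = 500, 4          # e.g. QED, logP, SA score, docking score

# Component scores produced by the generator's scoring functions (random stand-ins).
component_scores = rng.uniform(size=(n_molecules, n_properties))

# Simulated chemist: an implicit weighting of the components, unknown to the model.
true_weights = np.array([2.0, 0.5, 1.5, 3.0])
utilities = component_scores @ true_weights

def oracle_feedback(i):
    """Binary 'I like this molecule' feedback from the simulated chemist."""
    return int(utilities[i] > np.median(utilities))

# Seed with two extreme molecules so both feedback classes are present.
labelled = [int(np.argmax(utilities)), int(np.argmin(utilities))]
labels = [oracle_feedback(labelled[0]), oracle_feedback(labelled[1])]

for _ in range(30):
    clf = LogisticRegression(max_iter=1000).fit(component_scores[labelled], labels)
    proba = clf.predict_proba(component_scores)[:, 1]
    # Uncertainty sampling: query the molecule the current model is least sure about.
    candidates = np.setdiff1d(np.arange(n_molecules), labelled)
    query = int(candidates[np.argmin(np.abs(proba[candidates] - 0.5))])
    labelled.append(query)
    labels.append(oracle_feedback(query))

print("learned weight direction:", np.round(clf.coef_[0] / np.linalg.norm(clf.coef_), 2))
print("true weight direction:   ", np.round(true_weights / np.linalg.norm(true_weights), 2))
```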

9.
Commun Biol ; 5(1): 1238, 2022 11 12.
Article in English | MEDLINE | ID: mdl-36371468

ABSTRACT

The extent to which genetic interactions affect observed phenotypes is generally unknown, because current interaction detection approaches only consider simple interactions between the top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored to neural networks to assess the significance of interactions. The approach outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions that replicated in an independent FINRISK dataset.


Subject(s)
Deep Learning, Genetic Epistasis, Neural Networks (Computer), Single Nucleotide Polymorphism, Phenotype
10.
Front Neurosci ; 16: 1019572, 2022.
Article in English | MEDLINE | ID: mdl-36408411

ABSTRACT

Different neuroimaging methods can yield different views of task-dependent neural engagement. Studies examining the relationship between electromagnetic and hemodynamic measures have revealed correlated patterns across brain regions, but the role of the applied stimulation or experimental tasks in these correlation patterns is still poorly understood. Here, we evaluated the across-task variability of the MEG-fMRI relationship using data recorded during three distinct naming tasks (naming objects and actions from action images, and objects from object images) from the same set of participants. Our results demonstrate that the MEG-fMRI correlation pattern varies according to the performed task, and that this variability shows distinct spectral profiles across brain regions. Notably, analysis of the MEG data alone did not reveal modulations across the examined tasks in the time-frequency windows emerging from the MEG-fMRI correlation analysis. Our results suggest that the electromagnetic-hemodynamic correlation could serve as a more sensitive proxy for task-dependent neural engagement in cognitive tasks than isolated within-modality measures.

11.
IEEE/ACM Trans Comput Biol Bioinform ; 19(4): 2197-2207, 2022.
Article in English | MEDLINE | ID: mdl-33705322

ABSTRACT

Detecting predictive biomarkers from multi-omics data is important for precision medicine, both to improve the diagnostics of complex diseases and to enable better treatments. This requires substantial experimental effort, made difficult by the heterogeneity of cell lines and high costs. An effective solution is to build a computational model over the diverse omics data, including genomic, molecular, and environmental information. However, choosing informative and reliable data sources from among the different types of data is a challenging problem. We propose DIVERSE, a framework of Bayesian importance-weighted tri- and bi-matrix factorization (DIVERSE3 or DIVERSE2), to predict drug responses from data on cell lines, drugs, and gene interactions. DIVERSE integrates the data sources systematically, in a step-wise manner, examining the importance of each added data set in turn. More specifically, we sequentially integrate five different data sets, which have not all been combined in earlier bioinformatic methods for predicting drug responses. Empirical experiments show that DIVERSE clearly outperformed five other methods, including three state-of-the-art approaches, under cross-validation, particularly in out-of-matrix prediction, which is closer to the setting of real use cases and more challenging than the simpler in-matrix prediction. Additionally, case studies for discovering new drugs further confirmed the performance advantage of DIVERSE.


Subject(s)
Computational Biology, Precision Medicine, Bayes Theorem, Computational Biology/methods, Precision Medicine/methods
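A hedged sketch of plain bi-matrix factorization for a partially observed drug-response matrix, fitted by gradient descent on observed entries and evaluated on the held-out entries. DIVERSE's Bayesian importance weighting and tri-matrix integration of additional data sources are not implemented here; all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(5)
n_cell_lines, n_drugs, rank = 60, 40, 5

# Simulated low-rank drug-response matrix with ~30% of entries observed.
truth = rng.normal(size=(n_cell_lines, rank)) @ rng.normal(size=(rank, n_drugs))
mask = rng.uniform(size=truth.shape) < 0.3
observed = np.where(mask, truth + 0.1 * rng.normal(size=truth.shape), 0.0)

# Bi-matrix factorization fitted by gradient descent on the observed entries only.
U = 0.1 * rng.normal(size=(n_cell_lines, rank))
V = 0.1 * rng.normal(size=(n_drugs, rank))
lr, lam = 0.01, 0.1
for step in range(2000):
    err = mask * (observed - U @ V.T)     # residuals on observed entries
    U += lr * (err @ V - lam * U)
    V += lr * (err.T @ U - lam * V)

rmse_unobserved = np.sqrt((((U @ V.T) - truth)[~mask] ** 2).mean())
print("RMSE on held-out (unobserved) entries:", round(rmse_unobserved, 3))
```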
12.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34368832

ABSTRACT

Drug combination therapy is a promising strategy to treat complex diseases such as cancer and infectious diseases. However, current knowledge of drug combination therapies, especially in cancer patients, is limited because of adverse drug effects, toxicity and cell line heterogeneity. Screening new drug combinations requires substantial effort, since considering all possible combinations between drugs is infeasible and expensive. Therefore, building computational approaches, particularly machine learning methods, could provide an effective strategy to overcome drug resistance and improve therapeutic efficacy. In this review, we group the state-of-the-art machine learning approaches for analyzing personalized drug combination therapies into three categories and discuss each method in each category. We also present a short description of relevant databases used as benchmarks in drug combination therapies and provide a list of well-known, publicly available interactive data analysis portals. We highlight the importance of data integration for the identification of drug combinations. Finally, we address the advantages of combining multiple data sources for drug combination analysis by showing an experimental comparison.


Subject(s)
Machine Learning, Antineoplastic Combined Chemotherapy Protocols/administration & dosage, Computational Biology/methods, Humans, Neoplasms/drug therapy, Precision Medicine
13.
Patterns (N Y) ; 2(7): 100271, 2021 Jul 09.
Article in English | MEDLINE | ID: mdl-34286296

ABSTRACT

Differential privacy allows quantifying the privacy loss resulting from accessing sensitive personal data. Repeated accesses to the underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this limitation, but leaves open the problem of designing what kind of synthetic data. We propose formulating the problem of private data release through probabilistic modeling. This approach transforms the problem of designing the synthetic data into choosing a model for the data, also allowing the inclusion of prior knowledge, which improves the quality of the synthetic data. We demonstrate empirically, in an epidemiological study, that statistical discoveries can be reliably reproduced from the synthetic data. We expect the method to have broad use in creating high-quality anonymized data twins of key datasets for research.
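A minimal, hedged example of the "choose a model, then sample synthetic data" idea for a toy discrete data set: Laplace noise is added to contingency-table counts (a basic epsilon-differentially private mechanism), the noisy counts define a probabilistic model, and synthetic records are sampled from it. This stands in for, and does not reproduce, the paper's modeling machinery.

```python
import numpy as np

rng = np.random.default_rng(6)
epsilon = 1.0

# Toy sensitive data set: two binary attributes per individual.
data = rng.integers(0, 2, size=(1_000, 2))

# Cross-tabulate into a 2x2 contingency table (the model's sufficient statistics).
counts = np.zeros((2, 2))
for a, b in data:
    counts[a, b] += 1

# epsilon-DP release of the counts: adding/removing one individual changes one
# cell by 1, so the L1 sensitivity is 1 and Laplace(1/epsilon) noise suffices.
noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
probs = np.clip(noisy, 0, None)
probs = probs / probs.sum()

# Sample a synthetic data set from the (noisy) probabilistic model.
flat_idx = rng.choice(4, size=len(data), p=probs.ravel())
synthetic = np.column_stack(np.unravel_index(flat_idx, (2, 2)))
print("original cell frequencies: ", (counts / counts.sum()).ravel().round(3))
print("synthetic cell frequencies:", np.bincount(flat_idx, minlength=4) / len(data))
```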

14.
Brief Bioinform ; 22(1): 346-359, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31838491

ABSTRACT

Predicting the response of cancer cell lines to specific drugs is one of the central problems in personalized medicine, where the cell lines show diverse characteristics. Researchers have developed a variety of computational methods to discover associations between drugs and cell lines, and have improved drug sensitivity analyses by integrating heterogeneous biological data. However, choosing informative data sources and methods that can incorporate multiple sources efficiently is the challenging part of successful analysis in personalized medicine, because identifying the decisive factors of cancer and developing methods that can overcome the problems of integrating data, such as differences in data structures and data complexity, are difficult. In this review, we summarize recent advances in data-integration-based machine learning for drug response prediction, categorizing methods as matrix-factorization-based, kernel-based and network-based. We also present a short description of relevant databases used as benchmarks in drug response prediction analyses, followed by a brief discussion of the challenges faced in integrating and interpreting data from multiple sources. Finally, we address the advantages of combining multiple heterogeneous data sources for drug sensitivity analysis by showing an experimental comparison. Contact: betul.guvenc@aalto.fi.


Subject(s)
Neoplasm Drug Resistance, Genomics/methods, Precision Medicine/methods, Humans, Machine Learning, Pharmacogenomic Variants
15.
Appl Microbiol Biotechnol ; 104(24): 10515-10529, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33147349

ABSTRACT

In this work, deoxyribose-5-phosphate aldolase (Ec DERA, EC 4.1.2.4) from Escherichia coli was chosen as the protein engineering target for improving the substrate preference towards smaller, non-phosphorylated aldehyde donor substrates, in particular acetaldehyde. The initial broad set of mutations was directed to 24 amino acid positions in the active site or its close vicinity, based on the 3D complex structure of the E. coli DERA wild-type aldolase. The specific activity of the DERA variants containing one to three amino acid mutations was characterised using three different substrates. A novel machine learning (ML) model utilising Gaussian processes and feature learning was applied in the third mutagenesis round to predict new beneficial mutant combinations. This led to the most clear-cut (two- to threefold) improvement in acetaldehyde (C2) addition capability, with the concomitant abolishment of activity towards the natural donor molecule glyceraldehyde-3-phosphate (C3P) as well as its non-phosphorylated equivalent (C3). The Ec DERA variants were also tested in an aldol reaction utilising formaldehyde (C1) as the donor. Ec DERA wild-type was shown to be able to carry out this reaction, and furthermore, some of the variants improved for the acetaldehyde addition reaction also showed improved activity on formaldehyde. KEY POINTS: • DERA aldolases are promiscuous enzymes. • The synthetic utility of DERA aldolase was improved by protein engineering approaches. • Machine learning methods aid the protein engineering of DERA.


Subject(s)
Escherichia coli, Fructose-Bisphosphate Aldolase, Aldehyde-Lyases/genetics, Aldehyde-Lyases/metabolism, Escherichia coli/genetics, Escherichia coli/metabolism, Fructose-Bisphosphate Aldolase/genetics, Machine Learning, Protein Engineering, Substrate Specificity
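A hedged sketch of Gaussian-process-guided variant ranking: mutations at a few positions are binary-encoded, a GP regressor is fitted on measured activities, and untested combinations are ranked by an upper-confidence-bound score. The positions, activity values and acquisition rule are illustrative assumptions, not the feature-learning GP model used in the study.

```python
import itertools
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

positions = ["S238", "C47", "T170"]          # hypothetical mutated positions
choices = [0, 1]                             # 0 = wild-type residue, 1 = mutated

# All 8 possible combinations of the three point mutations, binary coded.
variants = np.array(list(itertools.product(choices, repeat=len(positions))), float)

# Pretend we have measured specific activities for 5 of the 8 variants.
measured_idx = [0, 1, 2, 4, 5]
activities = np.array([1.0, 1.8, 0.9, 1.6, 2.4])   # illustrative activity values

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(0.05),
                              normalize_y=True)
gp.fit(variants[measured_idx], activities)

# Rank the untested combinations by an upper-confidence-bound acquisition score.
untested = [i for i in range(len(variants)) if i not in measured_idx]
mean, std = gp.predict(variants[untested], return_std=True)
ranking = np.argsort(-(mean + std))
for r in ranking:
    combo = variants[untested[r]].astype(int)
    name = "+".join(p for p, c in zip(positions, combo) if c) or "wild-type"
    print("suggest:", name, "| predicted activity ~", round(mean[r], 2))
```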
16.
J Assoc Inf Sci Technol ; 70(9): 917-930, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31763361

ABSTRACT

The use of implicit relevance feedback from neurophysiology could deliver effortless information retrieval. However, both computing neurophysiologic responses and retrieving documents are characterized by uncertainty because of noisy signals and incomplete or inconsistent representations of the data. We present a first-of-its-kind, fully integrated information retrieval system that makes use of online implicit relevance feedback generated from brain activity, as measured through electroencephalography (EEG), and eye movements. The findings of the evaluation experiment (N = 16) show that we are able to compute online neurophysiology-based relevance feedback with performance significantly better than chance in complex data domains and realistic search tasks. We contribute by demonstrating how to integrate this inherently noisy implicit relevance feedback, combined with scarce explicit feedback, into interactive intent modeling. Although experimental measures of task performance did not allow us to demonstrate how the classification outcomes translated into search task performance, the experiment proved that our approach is able to generate relevance feedback from brain signals and eye movements in a realistic scenario, thus providing promising implications for future work in neuroadaptive information retrieval (IR).

17.
Bioinformatics ; 35(14): i218-i224, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510659

ABSTRACT

MOTIVATION: Human genomic datasets often contain sensitive information that limits use and sharing of the data. In particular, simple anonymization strategies fail to provide a sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine learning can help by guaranteeing that published results do not leak too much information about any individual data point. Recent research has reached promising results on differentially private drug sensitivity prediction using gene expression data. Differentially private learning with genomic data is challenging because it is more difficult to guarantee privacy in high dimensions. Dimensionality reduction can help, but if the dimension reduction mapping is learned from the data, then it needs to be differentially private too, which can carry a significant privacy cost. Furthermore, the selection of any hyperparameters (such as the target dimensionality) also needs to avoid leaking private information. RESULTS: We study an approach that uses a large public dataset of a similar type to learn a compact representation for differentially private learning. We compare three representation learning methods: variational autoencoders, principal component analysis and random projection. We solve two machine learning tasks on gene expression of cancer cell lines: cancer type classification and drug sensitivity prediction. The experiments demonstrate significant benefit from all representation learning methods, with variational autoencoders providing the most accurate predictions most often. Our results significantly improve over the previous state of the art in the accuracy of differentially private drug sensitivity prediction. AVAILABILITY AND IMPLEMENTATION: Code used in the experiments is available at https://github.com/DPBayes/dp-representation-transfer.


Subject(s)
Machine Learning, Humans, Neoplasms
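A hedged sketch of the representation-transfer setup: PCA is learned on a (simulated) public expression matrix, the private data are projected into that space, and a downstream classifier is trained on the low-dimensional representation. In the paper the downstream learner is differentially private; here a plain logistic regression stands in, with a comment marking where the DP learner would go. All data are simulated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)
n_genes, dim = 2_000, 10

# Simulated stand-ins: a large public expression matrix and a small private one.
public_X = rng.normal(size=(5_000, n_genes))
private_X = rng.normal(size=(300, n_genes))
private_y = rng.integers(0, 2, size=300)          # e.g. cancer type labels

# 1) The representation is learned from PUBLIC data only, so it costs no privacy.
pca = PCA(n_components=dim).fit(public_X)

# 2) Private data are mapped into the low-dimensional space before learning.
Z = pca.transform(private_X)
Z_train, Z_test, y_train, y_test = train_test_split(Z, private_y, random_state=0)

# 3) A differentially private learner (e.g. DP-SGD or objective perturbation)
#    would be trained here; plain logistic regression keeps the sketch runnable.
clf = LogisticRegression(max_iter=1000).fit(Z_train, y_train)
print("held-out accuracy (no real signal in this toy):", clf.score(Z_test, y_test))
```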
18.
Bioinformatics ; 35(14): i427-i435, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510671

ABSTRACT

MOTIVATION: Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including, among others, lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not suffer from these drawbacks. RESULTS: We compare block HSIC Lasso to other state-of-the-art feature selection techniques on both synthetic and real data, including experiments over three common types of genomic data: gene-expression microarrays, single-cell RNA sequencing and genome-wide association studies. In all cases, we observe that the features selected by block HSIC Lasso retain more information about the underlying biology than those selected by other techniques. As a proof of concept, we applied block HSIC Lasso to a single-cell RNA sequencing experiment on the mouse hippocampus and discovered that many genes previously linked to brain development and function are involved in the biological differences between the types of neurons. AVAILABILITY AND IMPLEMENTATION: Block HSIC Lasso is implemented in the Python 2/3 package pyHSICLasso, available on PyPI. Source code is available on GitHub (https://github.com/riken-aip/pyHSICLasso). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Biomarkers, Genome-Wide Association Study, Software, Animals, Genome, Genomics, Mice
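Since pyHSICLasso is on PyPI, a minimal usage sketch on simulated data is shown below, assuming the input/classification/get_index calls roughly as documented in the package README (check the GitHub repository above for the exact signatures). Only the first five features carry signal in this toy.

```python
import numpy as np
from pyHSICLasso import HSICLasso   # pip install pyHSICLasso

rng = np.random.default_rng(9)
n_samples, n_features = 200, 1000

# Simulated gene-expression-like data; only features 0-4 carry (non-linear) signal.
X = rng.normal(size=(n_samples, n_features))
y = (np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + X[:, 3] ** 2 - X[:, 4] > 0).astype(int)

hsic_lasso = HSICLasso()
hsic_lasso.input(X, y)                 # numpy input; a CSV path also works per the README
hsic_lasso.classification(5, B=20)     # select 5 features; B is the block size
print("selected feature indices:", hsic_lasso.get_index())
```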
19.
Bioinformatics ; 35(14): i548-i557, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510676

ABSTRACT

MOTIVATION: Metabolic flux balance analysis (FBA) is a standard tool for analyzing metabolic reaction rates compatible with measurements, steady state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place model assumptions on fluxes for the convenience of formulating the problem as a linear programming model, while many methods do not consider the inherent uncertainty in flux estimates. RESULTS: We introduce a novel paradigm of Bayesian metabolic flux analysis that models the reactions of the whole genome-scale cellular system in probabilistic terms, and can infer the full flux vector distribution of genome-scale metabolic systems based on exchange and intracellular (e.g. 13C) flux measurements, steady-state assumptions, and objective function assumptions. The Bayesian model couples all fluxes jointly in a simple truncated multivariate posterior distribution, which reveals informative flux couplings. Our model is a plug-in replacement for conventional metabolic balance methods, such as FBA. Our experiments indicate that we can characterize genome-scale flux covariances, reveal flux couplings, and determine more unobserved intracellular fluxes in Clostridium acetobutylicum from 13C data than flux variability analysis. AVAILABILITY AND IMPLEMENTATION: The COBRA-compatible software is available at github.com/markusheinonen/bamfa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Clostridium acetobutylicum, Metabolic Flux Analysis, Bayes Theorem, Metabolic Networks and Pathways, Biological Models
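For contrast with the Bayesian treatment, a hedged sketch of the conventional FBA it plugs into: a toy stoichiometric matrix, the steady-state constraint S v = 0, flux bounds, and a linear program maximizing an objective flux. The network is illustrative, not a genome-scale model, and the Bayesian sampler itself lives in the bamfa repository above.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network:  R1: -> A    R2: A -> B    R3: B -> (biomass export)
# Rows = metabolites (A, B), columns = reactions (R1, R2, R3).
S = np.array([[ 1, -1,  0],
              [ 0,  1, -1]], dtype=float)

bounds = [(0, 10), (0, None), (0, None)]   # uptake R1 capped at 10 flux units
c = np.array([0, 0, -1.0])                 # maximize v3 => minimize -v3

res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
print("optimal flux vector v =", res.x)     # expected: [10, 10, 10]
print("maximal biomass flux  =", -res.fun)
```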
20.
Cogn Sci ; 43(6): e12738, 2019 06.
Article in English | MEDLINE | ID: mdl-31204797

ABSTRACT

This paper addresses a common challenge with computational cognitive models: identifying parameter values that are both theoretically plausible and generate predictions that match well with empirical data. While computational models can offer deep explanations of cognition, they are computationally complex and often out of reach of traditional parameter-fitting methods. Weak methodology may lead to premature rejection of valid models or to acceptance of models that might otherwise be falsified. Mathematically robust fitting methods are, therefore, essential to the progress of computational modeling in cognitive science. In this article, we investigate the capability and role of modern fitting methods, including Bayesian optimization and approximate Bayesian computation, and contrast them with some more commonly used methods: grid search and Nelder-Mead optimization. Our investigation consists of a reanalysis of the fitting of two previous computational models: an Adaptive Control of Thought-Rational model of skill acquisition and a computational rationality model of visual search. The results contrast the efficiency and informativeness of the methods. A key advantage of the Bayesian methods is the ability to estimate the uncertainty of fitted parameter values. We conclude that approximate Bayesian computation is (a) efficient, (b) informative, and (c) offers a path to reproducible results.


Subject(s)
Cognition, Computer Simulation, Psychological Models, Bayes Theorem, Humans
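A minimal, hedged rejection-ABC example with a deliberately simple Gaussian simulator standing in for a cognitive model (e.g. ACT-R): parameters are drawn from the prior, data are simulated, and draws whose summary statistics land close to the observed ones are kept. The simulator, prior, summaries and tolerance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)

def simulate(theta, n=200):
    """Stand-in simulator; a real application would run the cognitive model here."""
    return rng.normal(loc=theta, scale=1.0, size=n)

def summary(x):
    return np.array([x.mean(), x.std()])

observed = simulate(theta=1.5)              # pretend these are empirical data
s_obs = summary(observed)

# Rejection ABC: sample from the prior, keep draws whose simulated summaries
# fall within a tolerance of the observed summaries.
n_draws, tolerance = 20_000, 0.15
prior_draws = rng.uniform(-5, 5, size=n_draws)
accepted = [th for th in prior_draws
            if np.linalg.norm(summary(simulate(th)) - s_obs) < tolerance]

posterior = np.array(accepted)
print(f"accepted {len(posterior)} draws; posterior mean ~ {posterior.mean():.2f} "
      f"(+/- {posterior.std():.2f})")
```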