1.
Comput Struct Biotechnol J ; 23: 2442-2452, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38867723

ABSTRACT

Bioactive peptides are short amino acid chains possessing biological activity and exerting physiological effects relevant to human health. Despite their therapeutic value, their identification remains a major problem, as it mainly relies on time-consuming in vitro tests. While bioinformatic tools for the identification of bioactive peptides are available, they are focused on specific functional classes and have not been systematically tested on realistic settings. To tackle this problem, bioactive peptide sequences and functions were here gathered from a variety of databases to generate a unified collection of bioactive peptides from microbial fermentation. This collection was organized into nine functional classes including some previously studied and some unexplored such as immunomodulatory, opioid and cardiovascular peptides. Upon assessing their sequence properties, four alternative encoding methods were tested in combination with a multitude of machine learning algorithms, from basic classifiers like logistic regression to advanced algorithms like BERT. Tests on a total of 171 models showed that, while some functions are intrinsically easier to detect, no single combination of classifiers and encoders worked universally well for all classes. For this reason, we unified all the best individual models for each class and generated CICERON (Classification of bIoaCtive pEptides fRom micrObial fermeNtation), a classification tool for the functional classification of peptides. State-of-the-art classifiers were found to underperform on our realistic benchmark dataset compared to the models included in CICERON. Altogether, our work provides a tool for real-world peptide classification and can serve as a benchmark for future model development.
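
As an illustration of what a sequence-encoding step can look like before a classifier is trained, the sketch below computes the amino acid composition of a peptide, one common fixed-length encoding. The abstract does not name the four encodings that were tested, so this choice, the function name `aa_composition` and the example input are assumptions for illustration and are not taken from CICERON.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(peptide):
    """Return the fraction of each standard amino acid in the peptide."""
    seq = peptide.upper()
    if not seq:
        raise ValueError("peptide must not be empty")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    counts = Counter(seq)
    # One feature per amino acid, so every peptide maps to 20 numbers.
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


# Hypothetical tripeptide used only as an example input.
features = aa_composition("VPP")
```

A vector like `features` has the same length for every peptide, so it could in principle be passed to any standard classifier, from logistic regression upwards.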

2.
Trends Endocrinol Metab ; 35(6): 533-548, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38575441

ABSTRACT

Genome-scale metabolic models (GEMs) are consolidating as platforms for studying mixed microbial populations, by combining biological data and knowledge with mathematical rigor. However, deploying these models to answer research questions can be challenging due to the increasing number of available computational tools, the lack of universal standards, and their inherent limitations. Here, we present a comprehensive overview of foundational concepts for building and evaluating genome-scale models of microbial communities. We then compare tools in terms of requirements, capabilities, and applications. Next, we highlight the current pitfalls and open challenges to consider when adopting existing tools and developing new ones. Our compendium can be relevant for the expanding community of modelers, both at the entry and experienced levels.


Subject(s)
Models, Biological , Microbiota/physiology , Humans
3.
Environ Microbiome ; 19(1): 1, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38167520

ABSTRACT

BACKGROUND: The anaerobic digestion process degrades organic matter into simpler compounds and occurs in strictly anaerobic and microaerophilic environments. The process is carried out by a diverse community of microorganisms in which each species has a unique role, and it has relevant biotechnological applications because it is used for biogas production. Some aspects of the microbiome, including its interaction with phages, remain unclear: a better comprehension of the community composition and of the role of each species is crucial for a thorough understanding of the carbon cycle in anaerobic systems and for improving biogas production. RESULTS: The primary objective of this study was to expand our understanding of the anaerobic digestion microbiome by jointly analyzing its prokaryotic and viral components. By integrating 192 additional datasets into a previous metagenomic database, the binning process generated 11,831 metagenome-assembled genomes from 314 metagenome samples published between 2014 and 2022, belonging to 4,568 non-redundant species based on ANI calculation and quality verification. CRISPR analysis on these genomes identified 76 archaeal genomes with active phage interactions. Moreover, single-nucleotide variants further pointed to archaea as the most critical members of the community. Among the MAGs, two methanogenic archaea, Methanothrix sp. 43zhSC_152 and Methanoculleus sp. 52maCN_3230, had the highest number of SNVs, with the latter having almost double the density of most other MAGs. CONCLUSIONS: This study offers a more comprehensive understanding of microbial community structures that thrive at different temperatures. The findings revealed that the fraction of archaeal species characterized at the genome level and reported in public databases is higher than that of bacteria, although still quite limited. The identification of shared spacers between phages and microbes implies a history of phage-bacterial interactions, and specifically lysogenic infections. A significant number of SNVs were identified, primarily comprising synonymous and nonsynonymous variants. Together, the findings indicate that methanogenic archaea are subject to intense selective pressure and suggest that genomic variants play a critical role in the anaerobic digestion process. Overall, this study provides a more balanced and diverse representation of the anaerobic digestion microbiota in terms of geographic location, temperature range and feedstock utilization.

4.
Environ Sci Technol ; 58(1): 580-590, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38114447

ABSTRACT

Ammonia release from proteinaceous feedstocks represents the main inhibitor of the anaerobic digestion (AD) process, which can result in a decreased biomethane yield or even complete failure of the process. The present study focused on the adaptation of mesophilic AD communities to a stepwise increase in the concentration of ammonium chloride in synthetic medium with casein used as the carbon source. An adaptation process occurring over more than 20 months allowed batch reactors to reach up to 20 g of NH4+ N/L without collapsing in acidification nor ceasing methane production. To decipher the microbial dynamics occurring during the adaptation and determine the genes mostly exposed to selective pressure, a combination of biochemical and metagenomics analyses was performed, reconstructing the strains of key species and tracking them over time. Subsequently, the adaptive metabolic mechanisms were delineated by following the single nucleotide variants (SNVs) characterizing the strains and prioritizing the associated genes according to their function. An in-depth exploration of the archaeon Methanoculleus bourgensis vb3066 and the putative syntrophic acetate-oxidizing bacteria Acetomicrobium sp. ma133 identified positively selected SNVs on genes involved in stress adaptation. The intraspecies diversity with multiple coexisting strains in a temporal succession pattern allows us to detect the presence of an additional level of diversity within the microbial community beyond the species level.


Subject(s)
Ammonium Compounds , Microbiota , Anaerobiosis , Bioreactors/microbiology , Bacteria/genetics , Bacteria/metabolism , Metagenomics , Ammonia/metabolism , Ammonium Compounds/metabolism , Methane
5.
Biotechnol Adv ; 69: 108264, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37775073

ABSTRACT

Cupriavidus necator is a bacterium with a high phenotypic diversity and versatile metabolic capabilities. It has been extensively studied as a model hydrogen oxidizer, as well as a producer of polyhydroxyalkanoates (PHA), plastic-like biopolymers with a high potential to substitute petroleum-based materials. Thanks to its adaptability to diverse metabolic lifestyles and to the ability to accumulate large amounts of PHA, C. necator is employed in many biotechnological processes, with particular focus on PHA production from waste carbon sources. The large availability of genomic information has enabled a characterization of C. necator's metabolism, leading to the establishment of metabolic models which are used to devise and optimize culture conditions and genetic engineering approaches. In this work, the characteristics of available C. necator strains and genomes are reviewed, underlining how a thorough comprehension of the genetic variability of C. necator is lacking and it could be instrumental for wider application of this microorganism. The metabolic paradigms of C. necator and how they are connected to PHA production and accumulation are described, also recapitulating the variety of carbon substrates used for PHA accumulation, highlighting the most promising strategies to increase the yield. Finally, the review describes and critically analyzes currently available genome-scale metabolic models and reduced metabolic network applications commonly employed in the optimization of PHA production. Overall, it appears that the capacity of C. necator of performing CO2 bioconversion to PHA is still underexplored, both in biotechnological applications and in metabolic modeling. However, the accurate characterization of this organism and the efforts in using it for gas fermentation can help tackle this challenging perspective in the future.


Subject(s)
Cupriavidus necator , Polyhydroxyalkanoates , Polyhydroxyalkanoates/genetics , Polyhydroxyalkanoates/metabolism , Cupriavidus necator/genetics , Cupriavidus necator/metabolism , Fermentation , Biotechnology , Carbon/metabolism
6.
Curr Opin Microbiol ; 75: 102363, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37542746

ABSTRACT

Anaerobic and microaerophilic environments are pervasive in nature, providing essential contributions to the maintenance of human health, biogeochemical cycles and the Earth's climate. These ecological niches are characterised by low free oxygen and oxidants, or lack thereof. Under these conditions, interactions between species are essential for supporting the growth of syntrophic species and maintaining thermodynamic feasibility of anaerobic fermentation. Kinetic models provide a simplified view of complex metabolic networks, while genome-scale metabolic models and flux-balance analysis (FBA) aim to unravel these systems as a whole. The target of this review is to outline the main similarities, differences and challenges associated with kinetic and metabolic modelling, and describe state-of-the-art modelling practices for studying syntrophies in the anaerobic digestion (AD) case study.
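
For reference, flux-balance analysis is commonly written as a linear program. The formulation below is the standard textbook form and is not specific to this review. The symbols are introduced here for illustration: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective weights (for example, selecting a biomass reaction), and $v^{\min}$, $v^{\max}$ the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

The constraint $S v = 0$ encodes the steady-state assumption that internal metabolites do not accumulate. No rate laws or kinetic parameters appear in it, which is the main contrast with kinetic models.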


Subject(s)
Metabolic Networks and Pathways , Microbial Interactions , Humans , Anaerobiosis , Fermentation
7.
Cell Rep Methods ; 3(1): 100383, 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36814842

ABSTRACT

Multi-omics data integration via mechanistic models of metabolism is a scalable and flexible framework for exploring biological hypotheses in microbial systems. However, although most microorganisms are unculturable, such multi-omics modeling is limited to isolate microbes or simple synthetic communities. Here, we developed an approach for modeling microbial activity and interactions that leverages the reconstruction of metagenome-assembled genomes and associated genome-centric metatranscriptomes. At its core, we designed a method for condition-specific metabolic modeling of microbial communities through the integration of metatranscriptomic data. Using this approach, we explored the behavior of anaerobic digestion consortia driven by hydrogen availability and human gut microbiota dysbiosis associated with Crohn's disease, identifying condition-dependent amino acid requirements in archaeal species and a reduced short-chain fatty acid exchange network associated with disease, respectively. Our approach can be applied to complex microbial communities, allowing a mechanistic contextualization of multi-omics data on a metagenome scale.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Microbiota/genetics , Metagenome/genetics , Archaea/genetics , Gastrointestinal Microbiome/genetics
8.
Metab Eng ; 76: 120-132, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36720400

ABSTRACT

Multi-strain probiotics are widely regarded as effective products for improving gut microbiota stability and host health, providing advantages over single-strain probiotics. However, in general, it is unclear to what extent different strains would cooperate or compete for resources, and how the establishment of a common biofilm microenvironment could influence their interactions. In this work, we develop an integrative experimental and computational approach to comprehensively assess the metabolic functionality and interactions of probiotics across growth conditions. Our approach combines co-culture assays with genome-scale modelling of metabolism and multivariate data analysis, thus exploiting complementary data- and knowledge-driven systems biology techniques. To show the advantages of the proposed approach, we apply it to the study of the interactions between two widely used probiotic strains of Lactobacillus reuteri and Saccharomyces boulardii, characterising their production potential for compounds that can be beneficial to human health. Our results show that these strains can establish a mixed cooperative-antagonistic interaction best explained by competition for shared resources, with an increased individual exchange but an often decreased net production of amino acids and short-chain fatty acids. Overall, our work provides a strategy that can be used to explore microbial metabolic fingerprints of biotechnological interest, capable of capturing multifaceted equilibria even in simple microbial consortia.


Subject(s)
Gastrointestinal Microbiome , Probiotics , Humans , Probiotics/metabolism , Saccharomyces cerevisiae/metabolism , Biofilms , Coculture Techniques
9.
Comput Biol Med ; 151(Pt A): 106244, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36343407

ABSTRACT

BACKGROUND: Recently, multi-omic machine learning architectures have been proposed for the early detection of cancer. However, for rare cancers and their associated small datasets, it is still unclear how to use the available multi-omics data to achieve a mechanistic prediction of cancer onset and progression. Hepatoblastoma is the most frequent liver cancer in infancy and childhood, and its incidence has lately been increasing in several developed countries. Even though some studies have been conducted to understand the causes of its onset and discover potential biomarkers, the role of metabolic rewiring has not been investigated in depth so far. METHODS: Here, we propose and implement an interpretable multi-omics pipeline that combines mechanistic knowledge from genome-scale metabolic models with machine learning algorithms, and we use it to characterise the underlying mechanisms controlling hepatoblastoma. RESULTS AND CONCLUSIONS: While the obtained machine learning models generally present a high diagnostic classification accuracy, our results show that the type of omics combinations used as input to the machine learning models strongly affects the detection of important genes, reactions and metabolic pathways linked to hepatoblastoma. Our method also suggests that, in the context of computer-aided diagnosis of cancer, optimal diagnostic accuracy can be achieved by adopting a combination of omics that depends on the patient's clinical characteristics.


Subject(s)
Hepatoblastoma , Liver Neoplasms , Humans , Child , Machine Learning , Algorithms , Metabolic Networks and Pathways/genetics , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics
10.
Microbiome ; 10(1): 117, 2022 Aug 03.
Article in English | MEDLINE | ID: mdl-35918706

ABSTRACT

BACKGROUND: Carbon fixation through biological methanation has emerged as a promising technology to produce renewable energy in the context of the circular economy. The anaerobic digestion microbiome is the fundamental biological system operating biogas upgrading and is paramount in power-to-gas conversion. Carbon dioxide (CO2) methanation is frequently performed by microbiota attached to solid supports generating biofilms. Despite the apparent simplicity of the microbial community involved in biogas upgrading, the dynamics behind most of the interspecies interactions remain obscure. To understand the role of the microbial species in CO2 fixation, the biofilm generated during the biogas upgrading process has been selected as a case study. The present work investigates, via genome-centric metagenomics based on a hybrid Nanopore-Illumina approach, the biofilm developed on the diffusion devices of four ex situ biogas upgrading reactors. Moreover, genome-guided metabolic reconstruction and flux balance analysis were used to propose a biological role for the dominant microbes. RESULTS: The combined microbiome was composed of 59 species, with five being dominant (> 70% of total abundance); the metagenome-assembled genomes representing these species were refined to reach a high level of completeness. Genome-guided metabolic analysis pointed to Firmicutes sp. GSMM966 as the main species responsible for biofilm formation. Additionally, species interactions were investigated considering their co-occurrence in 134 samples, and in terms of metabolic exchanges through flux balance simulation in a simplified medium. Some of the most abundant species (e.g., Limnochordia sp. GSMM975) were widespread (~ 67% of tested experiments), while others (e.g., Methanothermobacter wolfeii GSMM957) had a scattered distribution. Genome-scale metabolic models of the microbial community were built with boundary conditions taken from the biochemical data and showed the presence of a flexible interaction network mainly based on hydrogen and carbon dioxide uptake and formate exchange. CONCLUSIONS: Our work investigated the interplay between five dominant species within the biofilm and showed their importance in a large spectrum of anaerobic biogas reactor samples. Flux balance analysis provided a deeper insight into the potential syntrophic interaction between species, especially Limnochordia sp. GSMM975 and Methanothermobacter wolfeii GSMM957. Finally, it suggested species interactions to be based on formate and amino acid exchanges.


Subject(s)
Biofuels , Metagenome , Anaerobiosis , Bioreactors , Carbon Dioxide/analysis , Firmicutes/metabolism , Formates , Methane/metabolism , Methanobacteriaceae/genetics , Methanobacteriaceae/metabolism
11.
Front Aging Neurosci ; 14: 894994, 2022.
Article in English | MEDLINE | ID: mdl-35860672

ABSTRACT

The degu (Octodon degus) is a diurnal long-lived rodent that can spontaneously develop molecular and behavioral changes that mirror those seen in human aging. With age some degu, but not all individuals, develop cognitive decline and brain pathology like that observed in Alzheimer's disease including neuroinflammation, hyperphosphorylated tau and amyloid plaques, together with other co-morbidities associated with aging such as macular degeneration, cataracts, alterations in circadian rhythm, diabetes and atherosclerosis. Here we report the whole-genome sequencing and analysis of the degu genome, which revealed unique features and molecular adaptations consistent with aging and Alzheimer's disease. We identified single nucleotide polymorphisms in genes associated with Alzheimer's disease including a novel apolipoprotein E (Apoe) gene variant that correlated with an increase in amyloid plaques in brain and modified the in silico predicted degu APOE protein structure and functionality. The reported genome of an unconventional long-lived animal model of aging and Alzheimer's disease offers the opportunity for understanding molecular pathways involved in aging and should help advance biomedical research into treatments for Alzheimer's disease.

12.
Comput Struct Biotechnol J ; 20: 1481-1486, 2022.
Article in English | MEDLINE | ID: mdl-35422973

ABSTRACT

Background: The rapid accumulation of sequencing data from metagenomic studies is enabling the generation of huge collections of microbial genomes, with new challenges for mapping their functional potential. In particular, metagenome-assembled genomes are typically incomplete and harbor partial gene sequences that can limit their annotation from traditional tools. New scalable solutions are thus needed to facilitate the evaluation of functional potential in microbial genomes. Methods: To resolve annotation gaps in microbial genomes, we developed KEMET, an open-source Python library devised for the analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional units. KEMET focuses on the in-depth analysis of metabolic reaction networks to identify missing orthologs through hidden Markov model profiles. Results: We evaluate the potential of KEMET for expanding functional annotations by simulating the effect of assembly issues on real gene sequences and showing that our approach can identify missing KEGG orthologs. Additionally, we show that recovered gene annotations can sensibly increase the quality of draft genome-scale metabolic models obtained from metagenome-assembled genomes, in some cases reaching the accuracy of models generated from complete genomes. Conclusions: KEMET therefore allows expanding genome annotations by targeted searches for orthologous sequences, enabling a better qualitative and quantitative assessment of metabolic capabilities in novel microbial organisms.

13.
Bioinformatics ; 38(2): 487-493, 2022 Jan 03.
Article in English | MEDLINE | ID: mdl-34499112

ABSTRACT

MOTIVATION: Gene regulation is responsible for controlling numerous physiological functions and dynamically responding to environmental fluctuations. Reconstructing the human network of gene regulatory interactions is thus paramount to understanding the cell functional organization across cell types, as well as to elucidating pathogenic processes and identifying molecular drug targets. Although significant effort has been devoted towards this direction, existing computational methods mainly rely on gene expression levels, possibly ignoring the information conveyed by mechanistic biochemical knowledge. Moreover, except for a few recent attempts, most of the existing approaches only consider the information of the organism under analysis, without exploiting the information of related model organisms. RESULTS: We propose a novel method for the reconstruction of the human gene regulatory network, based on a transfer learning strategy that synergically exploits information from human and mouse, conveyed by gene-related metabolic features generated in silico from gene expression data. Specifically, we learn a predictive model from metabolic activity inferred via tissue-specific metabolic modelling of artificial gene knockouts. Our experiments show that the combination of our transfer learning approach with the constructed metabolic features provides a significant advantage in terms of reconstruction accuracy, as well as additional clues on the contribution of each constructed metabolic feature. AVAILABILITY AND IMPLEMENTATION: The method, the datasets and all the results obtained in this study are available at: https://doi.org/10.6084/m9.figshare.c.5237687. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Gene Regulatory Networks , Humans , Animals , Mice , Computational Biology/methods , Gene Expression Regulation , Genome , Machine Learning
14.
FEBS Lett ; 595(18): 2350-2365, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34409594

ABSTRACT

Cancer is considered a high-risk condition for severe illness resulting from COVID-19. The interaction between severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) and human metabolism is key to elucidating the risk posed by COVID-19 for cancer patients and identifying effective treatments, yet it is largely uncharacterised on a mechanistic level. We present a genome-scale map of short-term metabolic alterations triggered by SARS-CoV-2 infection of cancer cells. Through transcriptomic- and proteomic-informed genome-scale metabolic modelling, we characterise the role of RNA and fatty acid biosynthesis in conjunction with a rewiring in energy production pathways and enhanced cytokine secretion. These findings link together complementary aspects of viral invasion of cancer cells, while providing mechanistic insights that can inform the development of treatment strategies.


Subject(s)
COVID-19/metabolism , Glycolysis , Models, Biological , Neoplasms/metabolism , SARS-CoV-2/metabolism , COVID-19/complications , Cell Line, Tumor , Genome, Human , Humans , Neoplasms/complications , Proteomics , SARS-CoV-2/isolation & purification
15.
Bioinformatics ; 37(20): 3546-3552, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33974036

ABSTRACT

MOTIVATION: High-throughput biological data, thanks to technological advances, have become cheaper to collect, leading to the availability of vast amounts of omic data of different types. In parallel, the in silico reconstruction and modeling of metabolic systems is now acknowledged as a key tool to complement experimental data on a large scale. The integration of these model- and data-driven information is therefore emerging as a new challenge in systems biology, with no clear guidance on how to better take advantage of the inherent multisource and multiomic nature of these data types while preserving mechanistic interpretation. RESULTS: Here, we investigate different regularization techniques for high-dimensional data derived from the integration of gene expression profiles with metabolic flux data, extracted from strain-specific metabolic models, to improve cellular growth rate predictions. To this end, we propose ad-hoc extensions of previous regularization frameworks including group, view-specific and principal component regularization and experimentally compare them using data from 1143 Saccharomyces cerevisiae strains. We observe a divergence between methods in terms of regression accuracy and integration effectiveness based on the type of regularization employed. In multiomic regression tasks, when learning from experimental and model-generated omic data, our results demonstrate the competitiveness and ease of interpretation of multimodal regularized linear models compared to data-hungry methods based on neural networks. AVAILABILITY AND IMPLEMENTATION: All data, models and code produced in this work are available on GitHub at https://github.com/Angione-Lab/HybridGroupIPFLasso_pc2Lasso. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
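
As a pointer to what regularization means in this setting, the generic lasso and group lasso objectives for linear regression are given below. These are the standard textbook forms. The abstract does not state the exact objectives of the proposed extensions, so nothing here should be read as the authors' formulation. The symbols are assumptions introduced for illustration: $y \in \mathbb{R}^{n}$ is the response (for example, growth rate), $X \in \mathbb{R}^{n \times p}$ the feature matrix, $\beta$ the coefficient vector, $\lambda \ge 0$ the penalty strength, and $\beta_{g}$ the coefficients of feature group $g$, which has size $p_{g}$.

```latex
\begin{aligned}
\text{lasso:} \quad & \min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\text{group lasso:} \quad & \min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \sum_{g} \sqrt{p_g}\, \lVert \beta_g \rVert_2
\end{aligned}
```

In a multiomic setting the groups could, for instance, correspond to data views such as gene expression features and flux features, so that the penalty can shrink a whole view at once.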

16.
Metab Eng ; 62: 138-149, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32905861

ABSTRACT

Anaerobic digestion is a key biological process for renewable energy, yet the mechanistic knowledge on its hidden microbial dynamics is still limited. The present work charted the interaction network in the anaerobic digestion microbiome via the full characterization of pairwise interactions and the associated metabolite exchanges. To this goal, a novel collection of 836 genome-scale metabolic models was built to represent the functional capabilities of bacteria and archaea species derived from genome-centric metagenomics. Dominant microbes were shown to prefer mutualistic, parasitic and commensalistic interactions over neutralism, amensalism and competition, and are more likely to behave as metabolite importers and profiteers of the coexistence. Additionally, external hydrogen injection positively influences microbiome dynamics by promoting commensalism over amensalism. Finally, exchanges of glucogenic amino acids were shown to overcome auxotrophies caused by an incomplete tricarboxylic acid cycle. Our novel strategy predicted the most favourable growth conditions for the microbes, overall suggesting strategies to increasing the biogas production efficiency. In principle, this approach could also be applied to microbial populations of biomedical importance, such as the gut microbiome, to allow a broad inspection of the microbial interplays.
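
The six interaction types named in this abstract follow the usual ecological convention of classifying a pair by the sign of the effect each partner experiences. The sketch below illustrates that convention only. The function name `classify_interaction`, the use of relative growth change in co-culture versus monoculture as the effect measure, and the tolerance `tol` are assumptions, not details taken from the paper.

```python
def classify_interaction(effect_a, effect_b, tol=0.1):
    """Classify a pairwise interaction from the effect on each partner.

    Each effect is the (assumed) relative change in growth of a species
    when paired with the other, compared with growing alone. Changes
    within +/- tol are treated as no effect.
    """
    def sign(x):
        if abs(x) <= tol:
            return 0
        return 1 if x > 0 else -1

    # Order the two signs so that the lookup does not depend on which
    # species is called A and which is called B.
    signs = tuple(sorted((sign(effect_a), sign(effect_b)), reverse=True))
    return {
        (1, 1): "mutualism",
        (1, 0): "commensalism",
        (1, -1): "parasitism",
        (0, 0): "neutralism",
        (0, -1): "amensalism",
        (-1, -1): "competition",
    }[signs]


# Hypothetical effect sizes used only as example inputs.
example = classify_interaction(0.5, 0.0)
```

Sorting the two signs keeps the lookup table at six entries, one per interaction type, at the cost of not reporting which partner benefits.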


Subject(s)
Bioreactors , Microbiota , Anaerobiosis , Archaea , Metagenomics
17.
Proc Natl Acad Sci U S A ; 117(31): 18869-18879, 2020 Aug 04.
Article in English | MEDLINE | ID: mdl-32675233

ABSTRACT

Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype-phenotype-environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning-based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.


Subject(s)
Machine Learning , Metabolic Flux Analysis/methods , Saccharomyces cerevisiae , Systems Biology/methods , Models, Biological , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Transcriptome
18.
J Mol Diagn ; 22(4): 488-502, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32036093

ABSTRACT

Lysosomal storage disorders (LSDs) are monogenic diseases caused by the accumulation of specific undegraded substrates in lysosomes. LSD diagnosis can take several years because of both poor knowledge of these diseases and shared clinical features. The diagnostic approach includes clinical evaluations, biochemical tests, and genetic analysis of the suspected gene. In this study, we evaluated an LSD targeted sequencing panel as a tool potentially capable of reversing this classic diagnostic route. The panel includes 50 LSD genes and 230 intronic sequences conserved among 33 placental mammals. For the validation phase, 56 positive controls, 13 biochemically diagnosed patients, and nine undiagnosed patients were analyzed. Disease-causing variants were identified in 66% of the positive control alleles and in 62% of the biochemically diagnosed patients. Three undiagnosed patients were diagnosed. Eight patients undiagnosed by the panel were analyzed by whole exome sequencing: for two of them, the disease-causing variants were identified. Five patients, undiagnosed by both panel and exome analyses, were investigated through array comparative genomic hybridization: one of them was diagnosed. Conserved intronic fragment analysis, performed in cases unresolved by the first-level analysis, revealed no candidate intronic variants. Targeted sequencing has low sequencing costs and short sequencing time. However, a coverage >60× to 80× must be ensured and/or Sanger validation should be performed. Moreover, it must be supported by thorough clinical phenotyping.


Subject(s)
Genetic Predisposition to Disease; Genetic Testing; High-Throughput Nucleotide Sequencing; Lysosomal Storage Diseases/diagnosis; Lysosomal Storage Diseases/genetics; Alleles; Biomarkers; Case-Control Studies; Comparative Genomic Hybridization; Female; Genetic Association Studies; Genetic Variation; Genomics/methods; Humans; Male; Mutation; Phenotype; Sequence Analysis, DNA; Exome Sequencing
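
The abstract of record 18 states that coverage above 60× to 80× must be ensured and/or Sanger validation performed. A minimal sketch of how such a rule might be applied to variant calls is given here; the function name, the variant identifiers, and the depth values are all hypothetical, and the default threshold simply takes the lower bound of the quoted range. It is an illustration, not the pipeline used in the study.

```python
from typing import Dict, List

# Assumed threshold: the lower bound of the ">60x to 80x" range quoted in the abstract.
MIN_COVERAGE = 60


def needs_sanger_validation(
    variant_depths: Dict[str, int], min_coverage: int = MIN_COVERAGE
) -> List[str]:
    """Return the IDs of variants whose read depth does not exceed min_coverage.

    The abstract asks for coverage strictly greater than the threshold, so a
    depth equal to the threshold is also flagged. IDs are returned sorted.
    """
    return sorted(
        variant_id
        for variant_id, depth in variant_depths.items()
        if depth <= min_coverage
    )


# Hypothetical variant identifiers and read depths, for illustration only.
depths = {"var_A": 112, "var_B": 45, "var_C": 60, "var_D": 83}

flagged = needs_sanger_validation(depths)
print(flagged)  # variants to confirm by Sanger sequencing
```

Passing `min_coverage=80` applies the stricter end of the range instead.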
19.
PLoS Comput Biol ; 15(7): e1007084, 2019 07.
Article in English | MEDLINE | ID: mdl-31295267

ABSTRACT

Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. The development and application of these methodological frameworks have occurred independently for the most part, whereas the potential of their integration for biological, biomedical, and biotechnological research is less well known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically agnostic learning process.


Subject(s)
Computational Biology/methods; Deep Learning; Genome; Machine Learning; Metabolic Networks and Pathways; Genotype; Phenotype
20.
BMC Bioinformatics ; 19(1): 23, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29370760

ABSTRACT

BACKGROUND: The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. RESULTS: We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to deal efficiently with both a large number of candidate genes and an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new, efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. CONCLUSIONS: Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba.


Subject(s)
User-Computer Interface; Algorithms; Databases, Factual; Genome-Wide Association Study; Humans; Internet
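
The abstract of record 20 describes integrating several data sources through kernels and ranking candidate genes against a few known disease genes. The general idea of kernel combination can be sketched as follows. This is not the Scuba algorithm: Scuba learns the kernel weights by optimizing the margin distribution, whereas the weights here are fixed by hand. The function names, the two toy kernels, and the gene indices are hypothetical.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Return the weighted sum of equally sized kernel (similarity) matrices."""
    if not kernels or len(kernels) != len(weights):
        raise ValueError("need one weight per kernel and at least one kernel")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


def rank_candidates(
    kernel: Matrix, known: Sequence[int], candidates: Sequence[int]
) -> List[int]:
    """Rank candidates by mean similarity to the known disease genes (highest first)."""
    scores: Dict[int, float] = {
        c: sum(kernel[c][k] for k in known) / len(known) for c in candidates
    }
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


# Two toy similarity matrices over four genes, one per hypothetical data source.
k_expression = [
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.2, 0.4],
    [0.1, 0.2, 1.0, 0.5],
    [0.3, 0.4, 0.5, 1.0],
]
k_interaction = [
    [1.0, 0.6, 0.0, 0.7],
    [0.6, 1.0, 0.1, 0.2],
    [0.0, 0.1, 1.0, 0.3],
    [0.7, 0.2, 0.3, 1.0],
]

# Hand-picked equal weights (an assumption; a learning method would estimate these).
combined = combine_kernels([k_expression, k_interaction], [0.5, 0.5])

# Genes 0 and 1 play the role of known disease genes; 2 and 3 are candidates.
ranking = rank_candidates(combined, known=[0, 1], candidates=[2, 3])
print(ranking)
```

With these toy values, gene 3 ranks above gene 2 because it is more similar, on average, to the two known genes in both data sources.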