Abstract
MOTIVATION: Random forests (RFs) can deal with a large number of variables, achieve reasonable prediction scores, and yield highly interpretable feature importance values. As such, RFs are appropriate models for feature selection and further dimension reduction. However, RFs are often not appropriate for correlated datasets due to their mode of selecting individual features for splitting. Addressing correlation relationships in high-dimensional datasets is imperative for reducing the number of variables that are assigned high importance, hence making the dimension reduction most efficient. Here, we propose the LAtent VAriable Stochastic Ensemble of Trees (LAVASET) method that derives latent variables based on the distance characteristics of each feature and aims to incorporate the correlation factor in the splitting step. RESULTS: Without compromising on performance in the majority of examples, LAVASET outperforms RF by accurately determining feature importance across all correlated variables and ensuring proper distribution of importance values. LAVASET yields mostly non-inferior prediction accuracies to traditional RFs when tested in simulated and real 1D datasets, as well as more complex and high-dimensional 3D datatypes. Unlike traditional RFs, LAVASET is unaffected by single 'important' noisy features (false positives), as it considers the local neighbourhood. LAVASET, therefore, highlights neighbourhoods of features, reflecting real signals that collectively impact the model's predictive ability. AVAILABILITY AND IMPLEMENTATION: LAVASET is freely available as a standalone package from https://github.com/melkasapi/LAVASET.
Abstract
Enzymes are indispensable in many biological processes, and with biomedical literature growing exponentially, effective literature review becomes increasingly challenging. Natural language processing methods offer solutions to streamline this process. This study aims to develop an annotated enzyme corpus for training and evaluating enzyme named entity recognition (NER) models. A novel pipeline, combining dictionary matching and rule-based keyword searching, automatically annotated enzyme entities in >4800 full-text publications. Four deep learning NER models were created with different vocabularies (BioBERT/SciBERT) and architectures (BiLSTM/transformer) and evaluated on 526 manually annotated full-text publications. The annotation pipeline achieved an F1-score of 0.86 (precision = 1.00, recall = 0.76), surpassed by fine-tuned transformers for F1-score (BioBERT: 0.89, SciBERT: 0.88) and recall (0.86) with BiLSTM models having higher precision (0.94) than transformers (0.92). The annotation pipeline runs in seconds on standard laptops with almost perfect precision, but was outperformed by fine-tuned transformers in terms of F1-score and recall, demonstrating generalizability beyond the training data. In comparison, SciBERT-based models exhibited higher precision, and BioBERT-based models exhibited higher recall, highlighting the importance of vocabulary and architecture. These models, representing the first enzyme NER algorithms, enable more effective enzyme text mining and information extraction. Codes for automated annotation and model generation are available from https://github.com/omicsNLP/enzymeNER and https://zenodo.org/doi/10.5281/zenodo.10581586.
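The F1-scores reported above follow from the stated precision and recall values, since F1 is their harmonic mean. The short check below is an illustrative sketch rather than code from the enzymeNER repository; the function name `f1_score` is an assumption introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Annotation pipeline figures quoted in the abstract.
pipeline_f1 = f1_score(1.00, 0.76)
# Fine-tuned transformer figures quoted in the abstract.
transformer_f1 = f1_score(0.92, 0.86)

print(round(pipeline_f1, 2))     # 0.86
print(round(transformer_f1, 2))  # 0.89
```

Both values agree with the abstract: 0.86 for the pipeline and 0.89 for the best transformer.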
Subjects
Algorithms, Deep Learning, Enzymes, Natural Language Processing, Molecular Sequence Annotation/methods, Humans, Data Mining/methods
Abstract
MOTIVATION: Data processing is a key bottleneck for 1H NMR-based metabolic profiling of complex biological mixtures, such as biofluids. These spectra typically contain several thousand signals, corresponding to possibly a few hundred metabolites. A number of binning-based methods have been proposed to reduce the dimensionality of 1D 1H NMR datasets, including statistical recoupling of variables (SRV). Here, we introduce a new binning method, named JBA ("pJRES Binning Algorithm"), which aims to extend the applicability of SRV to pJRES spectra. RESULTS: The performance of JBA is comprehensively evaluated using 617 plasma 1H NMR spectra from the FGENTCARD cohort. The results presented here show that JBA exhibits higher sensitivity than SRV to detect peaks from low-abundance metabolites. In addition, JBA allows a more efficient removal of spectral variables corresponding to pure electronic noise, and this has a positive impact on multivariate model building. AVAILABILITY AND IMPLEMENTATION: The algorithm is implemented using the MWASTools R/Bioconductor package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Algorithms, Metabolomics, Proton Magnetic Resonance Spectroscopy
Abstract
Summary: MWASTools is an R package designed to provide an integrated pipeline to analyse metabonomic data in large-scale epidemiological studies. Key functionalities of our package include: quality control analysis; metabolome-wide association analysis using various models (partial correlations, generalized linear models); visualization of statistical outcomes; metabolite assignment using statistical total correlation spectroscopy (STOCSY); and biological interpretation of metabolome-wide association studies results. Availability and implementation: The MWASTools R package is implemented in R (version >= 3.4) and is available from Bioconductor: https://bioconductor.org/packages/MWASTools/. Contact: m.dumas@imperial.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Metabolome, Metabolomics/methods, Software, Genetic Association Studies, Humans, Metabolomics/standards, Biological Models, Quality Control
Abstract
BACKGROUND: Measurement of multiple food intake exposure biomarkers in urine may offer an objective method for monitoring diet. The potential of spot and cumulative urine samples that have reduced burden on participants as replacements for 24-h urine collections has not been evaluated. OBJECTIVE: The aim of this study was to determine the utility of spot and cumulative urine samples for classifying the metabolic profiles of people according to dietary intake when compared with 24-h urine collections in a controlled dietary intervention study. METHODS: Nineteen healthy individuals (10 male, 9 female, aged 21-65 y, BMI 20-35 kg/m²) each consumed 4 distinctly different diets, each for 1 wk. Spot urine samples were collected ~2 h post meals on 3 intervention days/wk. Cumulative urine samples were collected daily over 3 separate temporal periods. A 24-h urine collection was created by combining the 3 cumulative urine samples. Urine samples were analyzed with metabolite fingerprinting by both high-resolution flow infusion electrospray mass spectrometry (FIE-HRMS) and proton nuclear magnetic resonance spectroscopy (1H-NMR). Concentrations of dietary intake biomarkers were measured with liquid chromatography triple quadrupole mass spectrometry and by integration of 1H-NMR data. RESULTS: Cross-validation modeling with 1H-NMR and FIE-HRMS data demonstrated the power of spot and cumulative urine samples in predicting dietary patterns in 24-h urine collections. Particularly, there was no significant loss of information when post-dinner (PD) spot or overnight cumulative samples were substituted for 24-h urine collections (classification accuracies of 0.891 and 0.938, respectively). Quantitative analysis of urine samples also demonstrated the relation between PD spot samples and 24-h urines for dietary exposure biomarkers. CONCLUSIONS: We conclude that PD spot urine samples are suitable replacements for 24-h urine collections. Alternatively, cumulative samples collected overnight predict similarly to 24-h urine samples and have a lower collection burden for participants.
Subjects
Dietary Exposure, Urine Specimen Collection/methods, Adult, Aged, Biomarkers/urine, Diet, Female, Humans, Male, Metabolome, Middle Aged, Reproducibility of Results, Young Adult
Abstract
Metabolism is altered by genetics, diet, disease status, environment, and many other factors. Modeling any one of these is often done without considering the effects of the other covariates. Attributing differences in metabolic profile to one of these factors needs to be done while controlling for the metabolic influence of the rest. We describe here a data analysis framework and novel confounder-adjustment algorithm for multivariate analysis of metabolic profiling data. Using simulated data, we show that similar numbers of true associations and significantly fewer false positives are found compared to other commonly used methods. Covariate-adjusted projections to latent structures (CA-PLS) are exemplified here using a large-scale metabolic phenotyping study of two Chinese populations at different risks for cardiovascular disease. Using CA-PLS, we find that some previously reported differences are actually associated with external factors and discover a number of previously unreported biomarkers linked to different metabolic pathways. CA-PLS can be applied to any multivariate data where confounding may be an issue and the confounder-adjustment procedure is translatable to other multivariate regression techniques.
Subjects
Biomarkers, Epidemiologic Confounding Factors, Metabolome, Statistical Models, Phenotype, Algorithms, Asian People, Cardiovascular Diseases, Computer Simulation, Humans, Multivariate Analysis, Risk, Spectrum Analysis
Abstract
Summary: MetaboSignal is an R package that allows merging metabolic and signaling pathways reported in the Kyoto Encyclopaedia of Genes and Genomes (KEGG). It is a network-based approach designed to navigate through topological relationships between genes (signaling or metabolic genes) and metabolites, representing a powerful tool to investigate the genetic landscape of metabolic phenotypes. Availability and Implementation: MetaboSignal is available from Bioconductor: https://bioconductor.org/packages/MetaboSignal/. Contact: m.dumas@imperial.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Computational Biology/methods, Metabolic Networks and Pathways, Signal Transduction, Software, Adipose Tissue/metabolism, Animals, Quantitative Trait Loci, Rats
Abstract
A major purpose of exploratory metabolic profiling is for the identification of molecular species that are statistically associated with specific biological or medical outcomes; unfortunately, the structure elucidation process of unknowns is often a major bottleneck in this process. We present here new holistic strategies that combine different statistical spectroscopic and analytical techniques to improve and simplify the process of metabolite identification. We exemplify these strategies using study data collected as part of a dietary intervention to improve health and which elicits a relatively subtle suite of changes from complex molecular profiles. We identify three new dietary biomarkers related to the consumption of peas (N-methyl nicotinic acid), apples (rhamnitol), and onions (N-acetyl-S-(1Z)-propenyl-cysteine-sulfoxide) that can be used to enhance dietary assessment and assess adherence to diet. As part of the strategy, we introduce a new probabilistic statistical spectroscopy tool, RED-STORM (Resolution EnhanceD SubseT Optimization by Reference Matching), that uses 2D J-resolved 1H NMR spectra for enhanced information recovery using the Bayesian paradigm to extract a subset of spectra with similar spectral signatures to a reference. RED-STORM provided new information for subsequent experiments (e.g., 2D-NMR spectroscopy, solid-phase extraction, liquid chromatography prefaced mass spectrometry) used to ultimately identify an unknown compound. In summary, we illustrate the benefit of acquiring J-resolved experiments alongside conventional 1D 1H NMR as part of routine metabolic profiling in large data sets and show that application of complementary statistical and analytical techniques for the identification of unknown metabolites can be used to save valuable time and resources.
Subjects
Malus/metabolism, Nicotinic Acids/analysis, Onions/metabolism, Pisum sativum/metabolism, Rhamnose/analysis, Biomarkers/analysis, Biomarkers/metabolism, Magnetic Resonance Spectroscopy, Malus/chemistry, Molecular Structure, Nicotinic Acids/metabolism, Onions/chemistry, Pisum sativum/chemistry, Rhamnose/analogs & derivatives, Rhamnose/metabolism
Abstract
1H nuclear magnetic resonance (NMR) spectroscopy-based metabolic phenotyping is now widely used for large-scale epidemiological applications. To minimize signal overlap present in 1D 1H NMR spectra, we have investigated the use of 2D J-resolved (JRES) 1H NMR spectroscopy for large-scale phenotyping studies. In particular, we have evaluated the use of the 1D projections of the 2D JRES spectra (pJRES), which provide single peaks for each of the J-coupled multiplets, using 705 human plasma samples from the FGENTCARD cohort. On the basis of the assessment of several objective analytical criteria (spectral dispersion, attenuation of macromolecular signals, cross-spectral correlation with GC-MS metabolites, analytical reproducibility and biomarker discovery potential), we concluded that the pJRES approach exhibits suitable properties for implementation in large-scale molecular epidemiology workflows.
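The idea of collapsing a 2D JRES spectrum into a 1D projection can be illustrated with a toy matrix. The sketch below assumes a skyline (column-wise maximum) projection of an already tilted spectrum. The abstract does not state which projection type or preprocessing the authors used, so the function name `skyline_projection`, the toy intensities, and the J values in the comments are all hypothetical.

```python
def skyline_projection(jres):
    """Project a 2D J-resolved matrix onto the chemical-shift axis.

    `jres` is a list of rows, one per J-coupling increment; each row holds
    intensities along the chemical-shift axis. The skyline projection keeps
    the maximum intensity found in each chemical-shift column.
    """
    return [max(column) for column in zip(*jres)]


# Toy, already-tilted spectrum: the two lines of a doublet sit in different
# J rows but share one chemical-shift column (index 2).
toy_jres = [
    [0.0, 0.1, 4.0, 0.1, 0.0],  # J = +3.5 Hz (hypothetical)
    [0.0, 0.0, 0.2, 0.0, 0.0],  # J = 0 Hz
    [0.0, 0.1, 4.1, 0.1, 0.0],  # J = -3.5 Hz (hypothetical)
]

projection = skyline_projection(toy_jres)
print(projection)  # [0.0, 0.1, 4.1, 0.1, 0.0]
```

In this toy case the doublet contributes a single peak to the projection, which is the property the abstract attributes to pJRES.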
Subjects
Metabolomics/methods, Phenotype, Plasma/metabolism, Proton Magnetic Resonance Spectroscopy, Female, Humans, Male, Workflow
Abstract
Parasitic infections such as leishmaniasis induce a cascade of host physiological responses, including metabolic and immunological changes. Infection with Leishmania major parasites causes cutaneous leishmaniasis in humans, a neglected tropical disease that is difficult to manage. To understand the determinants of pathology, we studied L. major infection in two mouse models: the self-healing C57BL/6 strain and the nonhealing BALB/c strain. Metabolic profiling of urine, plasma, and feces via proton NMR spectroscopy was performed to discover parasite-specific imprints on global host metabolism. Plasma cytokine status and fecal microbiome were also characterized as additional metrics of the host response to infection. Results demonstrated differences in glucose and lipid metabolism, distinctive immunological phenotypes, and shifts in microbial composition between the two models. We present a novel approach to integrate such metrics using correlation network analyses, whereby self-healing mice demonstrated an orchestrated interaction between the biological measures shortly after infection. In contrast, the response observed in nonhealing mice was delayed and fragmented. Our study suggests that trans-system communication across host metabolism, the innate immune system, and gut microbiome is key for a successful host response to L. major and provides a new concept, potentially translatable to other diseases.
Subjects
Biomarkers/metabolism, Gastrointestinal Microbiome/immunology, Leishmania major/immunology, Cutaneous Leishmaniasis/immunology, Cutaneous Leishmaniasis/physiopathology, Biological Models, Animals, Biomarkers/blood, Biomarkers/urine, Host-Pathogen Interactions, Cutaneous Leishmaniasis/metabolism, Magnetic Resonance Spectroscopy, Mice, Inbred BALB C Mice, Inbred C57BL Mice, Species Specificity
Abstract
SUMMARY: MetaboNetworks is a tool to create custom sub-networks in Matlab using main reaction pairs as defined by the Kyoto Encyclopaedia of Genes and Genomes and can be used to explore transgenomic interactions, for example mammalian and bacterial associations. It calculates the shortest path between a set of metabolites (e.g. biomarkers from a metabonomic study) and plots the connectivity between metabolites as links in a network graph. The resulting graph can be edited and explored interactively. Furthermore, nodes and edges in the graph are linked to the Kyoto Encyclopaedia of Genes and Genomes compound and reaction pair web pages. AVAILABILITY AND IMPLEMENTATION: MetaboNetworks is available from http://www.mathworks.com/matlabcentral/fileexchange/42684. CONTACT: jmp111@ic.ac.uk or j.nicholson@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
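The shortest-path step described above can be sketched with a breadth-first search over an unweighted graph. This is not the MetaboNetworks Matlab code. The node labels `M1` to `M5` and the edge list are placeholders rather than KEGG reaction pairs, and treating reaction pairs as undirected edges is an assumption made for simplicity.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Return one shortest path between two nodes, or None if unreachable."""
    # Build an undirected adjacency map from the list of node pairs.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Placeholder network: two routes from M1 to M4, one shorter than the other.
reaction_pairs = [
    ("M1", "M2"), ("M2", "M3"), ("M3", "M4"),
    ("M1", "M5"), ("M5", "M4"),
]

print(shortest_path(reaction_pairs, "M1", "M4"))  # ['M1', 'M5', 'M4']
```

Repeating the search for every pair in a set of metabolites of interest and collecting the nodes and edges visited would give a sub-network of the kind the abstract describes.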
Subjects
Metabolomics/methods, Genome, Internet, Software
Abstract
Breast milk (BM) is a biofluid that has a fundamental role in early life nutrition and has direct impact on growth, neurodevelopment, and health. Global metabolic profiling is increasingly being utilized to characterize complex metabolic changes in biological samples. However, in order to achieve broad metabolite coverage, it is necessary to employ more than one analytical platform, typically requiring multiple sample preparation protocols. In an effort to improve analytical efficiency and retain comprehensive coverage of the metabolome, a new extraction methodology was developed that successfully retains metabolites from BM in a single-phase using an optimized methyl-tert-butyl ether solvent system. We conducted this single-phase extraction procedure on a representative pool of BM, and characterized the metabolic composition using LC-QTOF-MS and GC-Q-MS for polar and lipidic metabolites. To ensure that the extraction method was reproducible and fit-for-purpose, the analytical procedure was evaluated on both platforms using 18 metabolites selected to cover a range of chromatographic retention times and biochemical classes. Having validated the method, the metabolic signature of BM composition was mapped as a metabolic reaction network highlighting interconnected biological pathways and showing that the LC-MS and GC-MS platforms targeted largely different domains of the network. Subsequently, the same protocol was applied to ascertain compositional differences between BM at week 1 (n = 10) and 4 weeks (n = 9) post-partum. This single-phase approach is more efficient in terms of time, simplicity, cost, and sample volume than the existing two-phase methods and will be suited to high-throughput metabolic profiling studies of BM.
Subjects
Chemical Fractionation/methods, Metabolome, Metabolomics/methods, Human Milk/chemistry, Liquid Chromatography/methods, Female, Gas Chromatography-Mass Spectrometry/methods, Humans, Mass Spectrometry/methods, Methyl Ethers/chemistry, Human Milk/metabolism, Solvents/chemistry
Abstract
The rich chemical information from tissue metabolomics provides a powerful means to elaborate tissue physiology or tumor characteristics at cellular and tumor microenvironment levels. However, the process of obtaining such information requires invasive biopsies, is costly, and can delay clinical patient management. Conversely, computed tomography (CT) is a clinical standard of care but does not intuitively harbor histological or prognostic information. Furthermore, the ability to embed metabolome information into CT to subsequently use the learned representation for classification or prognosis has yet to be described. This study develops a deep learning-based framework, tissue-metabolomic-radiomic-CT (TMR-CT), by combining 48 paired CT images and tumor/normal tissue metabolite intensities to generate ten image embeddings that infer metabolite-derived representations from CT alone. In clinical NSCLC settings, we ascertain whether TMR-CT results in an enhanced feature generation model for solving histology classification and prognosis tasks in an unseen international CT dataset of 742 patients. TMR-CT non-invasively determines histological classes (adenocarcinoma/squamous cell carcinoma) with an F1-score of 0.78 and further assesses patients' prognosis with a c-index of 0.72, surpassing the performance of radiomics models and deep learning on single-modality CT feature extraction. Additionally, our work shows the potential to generate informative biology-inspired CT-led features to explore connections between hard-to-obtain tissue metabolic profiles and routine lesion-derived image data.
Abstract
CONTEXT: The role of glucagon-like peptide-1 (GLP-1) in type 2 diabetes (T2D) and obesity is not fully understood. OBJECTIVE: We investigate the association of cardiometabolic, diet, and lifestyle parameters with fasting and postprandial GLP-1 in people at risk of, or living with, T2D. METHODS: We analyzed cross-sectional data from the two Innovative Medicines Initiative (IMI) Diabetes Research on Patient Stratification (DIRECT) cohorts: cohort 1 (n = 2127), individuals at risk of diabetes, and cohort 2 (n = 789), individuals with new-onset T2D. RESULTS: Our multiple regression analysis reveals that fasting total GLP-1 is associated with an insulin-resistant phenotype, and we observe a strong independent relationship with male sex, increased adiposity, and liver fat, particularly in the prediabetes population. In contrast, we showed that incremental GLP-1 decreases with worsening glycemia, higher adiposity, liver fat, male sex, and reduced insulin sensitivity in the prediabetes cohort. Higher fasting total GLP-1 was associated with a low intake of wholegrain, fruit, and vegetables in people with prediabetes, and with a high intake of red meat and alcohol in people with diabetes. CONCLUSION: These studies provide novel insights into the association between fasting and incremental GLP-1, metabolic traits of diabetes and obesity, and dietary intake, and raise intriguing questions regarding the relevance of fasting GLP-1 in the pathophysiology of T2D.
Subjects
Type 2 Diabetes Mellitus, Diet, Glucagon-Like Peptide 1, Life Style, Prediabetic State, Humans, Male, Female, Type 2 Diabetes Mellitus/blood, Type 2 Diabetes Mellitus/metabolism, Glucagon-Like Peptide 1/blood, Glucagon-Like Peptide 1/metabolism, Cross-Sectional Studies, Middle Aged, Prediabetic State/blood, Prediabetic State/metabolism, Aged, Adult, Insulin Resistance, Fasting/blood, Obesity/blood, Obesity/metabolism, Cohort Studies, Blood Glucose/metabolism, Blood Glucose/analysis, Adiposity/physiology
Abstract
Most chronic diseases have been demonstrated to have a link to nutrition. Within food and nutritional research there is a major driver to understand the relationship between diet and disease in order to improve the health of individuals. However, the lack of accurate dietary intake assessment in free-living populations makes accurate estimation of how diet is associated with disease risk difficult. Thus, there is a pressing need to find solutions to the inaccuracy of dietary reporting. Metabolic profiling of urine or plasma can provide an unbiased approach to characterizing dietary intake, and various high-throughput analytical platforms have been used to implement targeted and nontargeted assays in nutritional clinical trials and nutritional epidemiology studies. This review describes first the challenges presented in interpreting the relationship between diet and health within individual and epidemiological frameworks. Second, we aim to explore how metabonomics can benefit different types of nutritional studies and discuss the critical importance of selecting appropriate analytical techniques in these studies. Third, we propose a strategy capable of providing accurate assessment of food intake within an epidemiological framework in order to establish accurate associations between diet and health.
Subjects
Metabolome, Metabolomics/methods, Animals, Diet, Eating, Epidemiologic Studies, Health, Humans, Nutrition Assessment
Abstract
BACKGROUND: The capacity of an individual to respond to changes in food intake so that postprandial metabolic perturbations are resolved, and metabolism returns to its pre-prandial state, is called phenotypic flexibility. This ability may be a more important indicator of current health status than metabolic markers in a fasting state. AIM: In this parallel randomized controlled trial, an energy-restricted healthy diet and 2 dietary challenges were used to assess the effect of weight loss on phenotypic flexibility. METHODS: Seventy-two volunteers with overweight and obesity underwent a 12-wk dietary intervention. The participants were randomized to a weight loss group (WLG) with 20% less energy intake or a weight-maintenance group (WMG). At weeks 1 and 12, participants were assessed for body composition by MRI. Concurrently, markers of metabolism and insulin sensitivity were obtained from the analysis of plasma metabolome during 2 different dietary challenges: an oral glucose tolerance test (OGTT) and a mixed-meal tolerance test. RESULTS: Intended weight loss was achieved in the WLG (-5.6 kg, P < 0.0001) and induced a significant reduction in total and regional adipose tissue as well as ectopic fat in the liver. Amino acid-based markers of insulin action and resistance such as leucine and glutamate were reduced in the postprandial phase of the OGTT in the WLG by 11.5% and 28%, respectively, after body weight reduction. Weight loss correlated with the magnitude of changes in metabolic responses to dietary challenges. Large interindividual variation in metabolic responses to weight loss was observed. CONCLUSION: Application of dietary challenges increased sensitivity to detect metabolic response to weight loss intervention. Large interindividual variation was observed across a wide range of measurements, allowing the identification of distinct responses to the weight loss intervention and mechanistic insight into the metabolic response to weight loss.
Subjects
Diet, Overweight, Weight Loss, Overweight/diet therapy, Overweight/metabolism, Humans, Male, Female, Adult, Body Composition, Adipose Tissue, Insulin/metabolism, Biomarkers
Abstract
BACKGROUND AND AIMS: The gut microbiota is implicated in the pathogenesis of colorectal cancer (CRC). We aimed to map the CRC mucosal microbiota and metabolome and define the influence of the tumoral microbiota on oncological outcomes. METHODS: A multicentre, prospective observational study was conducted of CRC patients undergoing primary surgical resection in the UK (n = 74) and Czech Republic (n = 61). Analysis was performed using metataxonomics, ultra-performance liquid chromatography-mass spectrometry (UPLC-MS), targeted bacterial qPCR and tumour exome sequencing. Hierarchical clustering accounting for clinical and oncological covariates was performed to identify clusters of bacteria and metabolites linked to CRC. Cox proportional hazards regression was used to ascertain clusters associated with disease-free survival over median follow-up of 50 months. RESULTS: Thirteen mucosal microbiota clusters were identified, of which five were significantly different between tumour and paired normal mucosa. Cluster 7, containing the pathobionts Fusobacterium nucleatum and Granulicatella adiacens, was strongly associated with CRC (P_FDR = 0.0002). Additionally, tumoral dominance of cluster 7 independently predicted favourable disease-free survival (adjusted p = 0.031). Cluster 1, containing Faecalibacterium prausnitzii and Ruminococcus gnavus, was negatively associated with cancer (P_FDR = 0.0009), and abundance was independently predictive of worse disease-free survival (adjusted p = 0.0009). UPLC-MS analysis revealed two major metabolic (Met) clusters. Met 1, composed of medium chain (MCFA), long-chain (LCFA) and very long-chain (VLCFA) fatty acid species, ceramides and lysophospholipids, was negatively associated with CRC (P_FDR = 2.61 × 10^-11); Met 2, composed of phosphatidylcholine species, nucleosides and amino acids, was strongly associated with CRC (P_FDR = 1.30 × 10^-12), but metabolite clusters were not associated with disease-free survival (p = 0.358). An association was identified between Met 1 and DNA mismatch-repair deficiency (p = 0.005). FBXW7 mutations were only found in cancers predominant in microbiota cluster 7. CONCLUSIONS: Networks of pathobionts in the tumour mucosal niche are associated with tumour mutation and metabolic subtypes and predict favourable outcome following CRC resection.
Subjects
Colorectal Neoplasms, Gastrointestinal Microbiome, Microbiota, Humans, Liquid Chromatography, Tandem Mass Spectrometry, Microbiota/genetics, Gastrointestinal Microbiome/genetics, Colorectal Neoplasms/surgery
Abstract
The prevalence of renal stone disease is increasing, although it remains higher in men than in women when matched for age. While still somewhat controversial, several studies have reported an association between renal stone disease and hypertension, but this may be confounded by a shared link with obesity. However, independent of obesity, hyperoxaluria has been shown to be associated with hypertension in stone-formers, and the most common type of renal stone is composed of calcium oxalate. The chloride-oxalate exchanger slc26a6 (also known as CFEX or PAT-1), located in the renal proximal tubule, was originally thought to have an important role in sodium homeostasis and thereby blood pressure control, but it has recently been shown to have a key function in oxalate balance by mediating oxalate secretion in the gut. We have applied two orthogonal analytical platforms (NMR spectroscopy and capillary electrophoresis with UV detection) in parallel to characterize the urinary metabolic signatures related to the loss of the renal chloride-oxalate exchanger in slc26a6 null mice. Clear metabolic differentiation between the urinary profiles of the slc26a6 null and the wild type mice was observed using both methods, with the combination of NMR and CE-UV providing extensive coverage of the urinary metabolome. Key discriminating metabolites included oxalate, m-hydroxyphenylpropionylsulfate (m-HPPS), trimethylamine-N-oxide, glycolate and scyllo-inositol (higher in slc26a6 null mice) and hippurate, taurine, trimethylamine, and citrate (lower in slc26a6 null mice). In addition to the reduced efficiency of anion transport, several of these metabolites (hippurate, m-HPPS, methylamines) reflect alteration in gut microbial cometabolic activities. Gender-related metabotypes were also observed in both wild type and slc26a6 null groups. Urinary metabolites that showed a sex-specific pattern included trimethylamine, trimethylamine-N-oxide, citrate, spermidine, guanidinoacetate, and 2-oxoisocaproate. The gender-dependent metabolic expression of the consequences of slc26a6 deletion might have relevance to the difference in prevalence of renal stone formation in men and women. The different composition of microbial metabolites in the slc26a6 null mice is consistent with the fact that the slc26a6 transporter is found in a range of tissues, including the kidney and intestine, and provides further evidence for the "long reach" of the microbiota in physiological and pathological processes.
Subjects
Antiporters/deficiency, Metabolome/physiology, Metabolomics/methods, Organic Chemicals/urine, Animals, Antiporters/genetics, Antiporters/urine, Capillary Electrophoresis, Female, Male, Metabolome/genetics, Mice, Knockout Mice, Biomolecular Nuclear Magnetic Resonance, Organic Chemicals/chemistry, Oxalates/chemistry, Oxalates/metabolism, Phenotype, Principal Component Analysis, Sulfate Transporters
Abstract
We describe a new multivariate statistical approach to recover metabolite structure information from multiple 1H NMR spectra in population sample sets. Subset optimization by reference matching (STORM) was developed to select subsets of 1H NMR spectra that contain specific spectroscopic signatures of biomarkers differentiating between different human populations. STORM aims to improve the visualization of structural correlations in spectroscopic data by using these reduced spectral subsets containing smaller numbers of samples than the number of variables (n ≪ p). We have used statistical shrinkage to limit the number of false positive associations and to simplify the overall interpretation of the autocorrelation matrix. The STORM approach has been applied to findings from an ongoing human metabolome-wide association study on body mass index to identify a biomarker metabolite present in a subset of the population. Moreover, we have shown how STORM improves the visualization of more abundant NMR peaks compared to a previously published method (statistical total correlation spectroscopy, STOCSY). STORM is a useful new tool for biomarker discovery in the "omic" sciences that has widespread applicability. It can be applied to any type of data, provided that there is interpretable correlation among variables, and can also be applied to data with more than one dimension (e.g., 2D NMR spectra).
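The structural-correlation idea that STORM builds on can be illustrated with a minimal STOCSY-style calculation: correlate the intensity of one "driver" variable with every other spectral variable across samples. The sketch below covers only that correlation step. It does not implement STORM's subset selection or statistical shrinkage, and the function names and toy intensities are assumptions introduced for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable has no defined correlation
    return sxy / sqrt(sxx * syy)


def stocsy(spectra, driver):
    """Correlate the driver variable with every variable across samples.

    `spectra` is a list of samples; each sample is a list of intensities.
    """
    columns = list(zip(*spectra))
    return [pearson(columns[driver], column) for column in columns]


# Toy data: 4 samples x 4 variables. Variable 1 rises with variable 0,
# variable 2 falls as variable 0 rises, and variable 3 is noise.
spectra = [
    [1.0, 2.0, 5.0, 0.3],
    [2.0, 4.1, 4.0, 0.1],
    [3.0, 6.0, 3.1, 0.4],
    [4.0, 7.9, 2.0, 0.2],
]

correlations = stocsy(spectra, driver=0)
print([round(r, 2) for r in correlations])
```

Variables that correlate strongly and positively with the driver are candidates for belonging to the same molecule. With few samples relative to variables, such correlations are noisy, which is the situation the shrinkage step in the abstract is meant to address.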
Subjects
Body Fluids/metabolism, Biomolecular Nuclear Magnetic Resonance/methods, Adult, Biomarkers/urine, Female, Humans, Magnetic Resonance Spectroscopy/methods, Magnetic Resonance Spectroscopy/standards, Male, Middle Aged, Protons
Abstract
Because cerebrospinal fluid (CSF) is the biofluid which interacts most closely with the central nervous system, it holds promise as a reporter of neurological disease, for example multiple sclerosis (MScl). To characterize the metabolomics profile of neuroinflammatory aspects of this disease we studied an animal model of MScl, experimental autoimmune/allergic encephalomyelitis (EAE). Because CSF also exchanges metabolites with blood via the blood-brain barrier, malfunctions occurring in the CNS may be reflected in the biochemical composition of blood plasma. The combination of blood plasma and CSF provides more complete information about the disease. Both biofluids can be studied by use of NMR spectroscopy. It is then necessary to perform combined analysis of the two different datasets. Mid-level data fusion was therefore applied to blood plasma and CSF datasets. First, relevant information was extracted from each biofluid dataset by use of linear support vector machine recursive feature elimination. The selected variables from each dataset were concatenated for joint analysis by partial least squares discriminant analysis (PLS-DA). The combined metabolomics information from plasma and CSF enables more efficient and reliable discrimination of the onset of EAE. Second, we introduced hierarchical models fusion, in which previously developed PLS-DA models are hierarchically combined. We show that this approach enables neuroinflamed rats (even on the day of onset) to be distinguished from either healthy or peripherally inflamed rats. Moreover, progression of EAE can be investigated because the model separates the onset and peak of the disease.
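The mid-level fusion workflow described above (select variables per biofluid, then concatenate them for a joint model) can be sketched as follows. To stay dependency-free, this sketch swaps in a simple ranking by absolute difference of class means in place of the linear SVM recursive feature elimination named in the abstract, and it stops before the PLS-DA step. All function names and the toy plasma and CSF values are hypothetical.

```python
def top_k_variables(X, y, k):
    """Indices of the k variables with the largest |mean(class 1) - mean(class 0)|.

    Stand-in for SVM-RFE; X is a list of samples, y a list of 0/1 labels.
    """
    scores = []
    for j in range(len(X[0])):
        group0 = [row[j] for row, label in zip(X, y) if label == 0]
        group1 = [row[j] for row, label in zip(X, y) if label == 1]
        diff = abs(sum(group1) / len(group1) - sum(group0) / len(group0))
        scores.append((diff, j))
    best = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in best)


def mid_level_fusion(blocks, y, k):
    """Select k variables per block, then concatenate them sample by sample."""
    selected = [top_k_variables(X, y, k) for X in blocks]
    fused = [
        [X[i][j] for X, indices in zip(blocks, selected) for j in indices]
        for i in range(len(y))
    ]
    return fused, selected


# Toy data: 4 animals, labels 0 = control, 1 = disease onset (hypothetical).
labels = [0, 0, 1, 1]
plasma = [[1.0, 5.0, 0.2], [1.1, 5.2, 0.1], [1.0, 9.0, 0.2], [0.9, 9.1, 0.3]]
csf = [[3.0, 0.5], [3.1, 0.4], [1.0, 0.5], [1.2, 0.6]]

fused, selected = mid_level_fusion([plasma, csf], labels, k=1)
print(selected)  # [[1], [0]]
print(fused)     # [[5.0, 3.0], [5.2, 3.1], [9.0, 1.0], [9.1, 1.2]]
```

The fused matrix would then be passed to a single classifier such as PLS-DA. In practice the selection step should sit inside cross-validation so that it does not leak label information.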