ABSTRACT
In this study, a comprehensive methodology combining machine learning and statistical analysis was employed to investigate alterations in the metabolite profiles, including lipids, of breast cancer tissues and their subtypes. By integrating biological and machine learning feature selection techniques, along with univariate and multivariate analyses, a notable lipid signature was identified in breast cancer tissues. The results revealed elevated levels of saturated and monounsaturated phospholipids in breast cancer tissues, consistent with external validation findings. Additionally, lipidomics analysis in both the original and validation datasets indicated lower levels of most triacylglycerols compared to non-cancerous tissues, suggesting potential alterations in lipid storage and metabolism within cancer cells. Analysis of cancer subtypes revealed that levels of PC 30:0 were relatively reduced in HER2(-) samples that were ER(+) and PR(+) compared to those that were ER(-) and PR(-). Conversely, HER2(+) tumors, which were ER(-) and PR(-), exhibited increased concentrations of PC 30:0. This increase could potentially be linked to the role of Stearoyl-CoA-Desaturase 1 in breast cancer. Comprehensive metabolomic analyses of breast cancer can offer crucial insights into cancer development, aiding in early detection and treatment evaluation of this devastating disease.
Subjects
Breast Neoplasms , Lipidomics , Humans , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Female , Lipidomics/methods , Lipid Metabolism , Machine Learning , Lipids/analysis , Receptor, ErbB-2/metabolism , Stearoyl-CoA Desaturase/metabolism
ABSTRACT
Metabolites produced by the gut microbiota play an important role in the cross-talk with the human host. Many microbial metabolites are biologically active and can pass the gut barrier and enter the systemic circulation, where they form the gut microbial exposome, i.e. the totality of gut microbial metabolites in body fluids or tissues of the host. A major difficulty faced when studying the microbial exposome and its role in health and diseases is to differentiate metabolites solely or partially derived from microbial metabolism from those produced by the host or coming from the diet. Our objective was to collect data from the scientific literature and build a database on gut microbial metabolites and on evidence of their microbial origin. Three types of evidence on the microbial origin of the gut microbial exposome were defined: (1) metabolites are produced in vitro by human faecal bacteria; (2) metabolites show reduced concentrations in humans or experimental animals upon treatment with antibiotics; (3) metabolites show reduced concentrations in germ-free animals when compared with conventional animals. Data were manually collected from peer-reviewed publications and inserted into the Exposome-Explorer database. Furthermore, to explore the chemical space of the microbial exposome and predict metabolites uniquely formed by the microbiota, genome-scale metabolic models (GSMMs) of gut bacterial strains and humans were compared. A total of 1848 records on one or more types of evidence on the gut microbial origin of 457 metabolites were collected in Exposome-Explorer. Data on their known precursors and concentrations in human blood, urine and faeces were also collected. About 66% of the predicted gut microbial metabolites (n = 1543) were found to be unique microbial metabolites, found neither in the human GSMM nor in the list of 457 metabolites curated in Exposome-Explorer, and can be targets for new experimental studies.
These new data on the gut microbial exposome, freely available in Exposome-Explorer (http://exposome-explorer.iarc.fr/), will help researchers to identify poorly studied microbial metabolites to be considered in future studies on the gut microbiota, and to study their functionalities and role in health and diseases.
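The comparison described above, between metabolites predicted from bacterial GSMMs, the human GSMM, and the curated list, amounts to a set difference. The sketch below illustrates that idea only; the function names and the metabolite identifiers are hypothetical placeholders, not identifiers from Exposome-Explorer or from any real metabolic model.

```python
def unique_to_microbiota(microbial_predicted, human_model, curated):
    """Return metabolites predicted for gut bacteria that appear neither in
    the human model nor in the manually curated list (plain set difference)."""
    return set(microbial_predicted) - set(human_model) - set(curated)


def unique_fraction(microbial_predicted, human_model, curated):
    """Fraction of predicted microbial metabolites that are unique."""
    predicted = set(microbial_predicted)
    if not predicted:
        return 0.0
    unique = unique_to_microbiota(predicted, human_model, curated)
    return len(unique) / len(predicted)


# Hypothetical identifiers, for illustration only.
microbial = {"m1", "m2", "m3", "m4"}
human = {"m1"}
curated = {"m2", "x9"}

print(sorted(unique_to_microbiota(microbial, human, curated)))  # ['m3', 'm4']
print(unique_fraction(microbial, human, curated))               # 0.5
```

In practice the difficult part is mapping identifiers consistently across models and the database, which this sketch assumes has already been done.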
Subjects
Exposome , Gastrointestinal Microbiome , Animals , Humans , Databases, Factual , Data Management , Diet , Bacteria/genetics
ABSTRACT
Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Specifically, different studies may use different sample types (e.g., serum versus plasma) collected, treated, and stored according to different protocols, and assayed in different laboratories using different instruments. To address these issues, a new pipeline was developed to normalize and pool metabolomics data through a set of sequential steps: (i) exclusion of the least informative observations and metabolites, removal of outliers, and imputation of missing data; (ii) identification of the main sources of variability through principal component partial R-square (PC-PR2) analysis; (iii) application of linear mixed models to remove unwanted variability, including samples' originating study and batch, and preserve biological variations while accounting for potential differences in the residual variances across studies. This pipeline was applied to targeted metabolomics data acquired using Biocrates AbsoluteIDQ kits in eight case-control studies nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. Comprehensive examination of metabolomics measurements indicated that the pipeline improved the comparability of data across the studies. Our pipeline can be adapted to normalize other molecular data, including biomarkers as well as proteomics data, and could be used for pooling molecular datasets, for example in international consortia, to limit biases introduced by inter-study variability. This versatility of the pipeline makes our work of potential interest to molecular epidemiologists.
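To make the idea behind step (iii) concrete, the sketch below removes a between-study shift from a single metabolite by centering each study on the grand mean. This is a deliberately simplified stand-in, not the linear mixed model described in the abstract: it ignores batch effects, biological covariates that should be preserved, and differences in residual variance across studies. The function name and the toy values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def center_by_study(values, studies):
    """Shift each study's measurements so that every study has the same mean.

    values  : measurements of one metabolite (log scale assumed)
    studies : study label for each measurement, same length as values
    """
    grand_mean = mean(values)
    grouped = defaultdict(list)
    for value, study in zip(values, studies):
        grouped[study].append(value)
    study_means = {study: mean(vals) for study, vals in grouped.items()}
    # Subtract the study-specific mean, then restore the overall level.
    return [v - study_means[s] + grand_mean for v, s in zip(values, studies)]


# Toy data: study B reads about 10 units higher than study A.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
studies = ["A", "A", "A", "B", "B", "B"]
print(center_by_study(values, studies))  # [6.0, 7.0, 8.0, 6.0, 7.0, 8.0]
```

Plain centering like this would also erase genuine biological differences between study populations, which is one reason a model that explicitly retains biological covariates is preferable.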
ABSTRACT
Objective: In 2016, the International Agency for Research on Cancer, part of the World Health Organization, released the Exposome-Explorer, the first database dedicated to biomarkers of exposure for environmental risk factors for diseases. The database contents resulted from a manual literature search that yielded over 8,500 citations, but only a small fraction of these publications were used in the final database. Manually curating a database is time-consuming and requires domain expertise to gather relevant data scattered throughout millions of articles. This work proposes a supervised machine learning pipeline to assist the manual literature retrieval process. Methods: The manually retrieved corpus of scientific publications used in the Exposome-Explorer was used as training and testing sets for the machine learning models (classifiers). Several parameters and algorithms were evaluated to predict an article's relevance based on different datasets made of titles, abstracts and metadata. Results: The top performance classifier was built with the Logistic Regression algorithm using the title and abstract set, achieving an F2-score of 70.1%. Furthermore, we extracted 1,143 entities from these articles with a classifier trained for biomarker entity recognition. Of these, we manually validated 45 new candidate entries to the database. Conclusion: Our methodology reduced the number of articles to be manually screened by the database curators by nearly 90%, while only misclassifying 22.1% of the relevant articles. We expect that this methodology can also be applied to similar biomarkers datasets or be adapted to assist the manual curation process of similar chemical or disease databases.
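The F2-score reported above is the F-beta measure with beta = 2, which weights recall more heavily than precision. That suits literature screening, where missing a relevant article costs more than reading an irrelevant one. The sketch below computes it from confusion-matrix counts; the counts in the usage example are hypothetical and are not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from true positives, false positives and false negatives.

    beta > 1 favours recall; beta = 1 gives the usual F1 score.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical screening result: 80 relevant articles found,
# 40 irrelevant articles flagged, 20 relevant articles missed.
print(round(f_beta(80, 40, 20), 4))          # 0.7692 (F2)
print(round(f_beta(80, 40, 20, beta=1), 4))  # 0.7273 (F1)
```

With the same counts, F2 exceeds F1 because recall (0.80) is higher than precision (about 0.67) and F2 gives recall the larger weight.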
ABSTRACT
Metabolomics encompasses the systematic identification and quantification of all metabolic products in the human body. This field could provide clinicians with novel sets of diagnostic biomarkers for disease states in addition to quantifying treatment response to medications at an individualized level. This literature review aims to highlight the technology underpinning metabolic profiling, identify potential applications of metabolomics in clinical practice, and discuss the translational challenges that the field faces. We searched PubMed, MEDLINE, and EMBASE for primary and secondary research articles regarding clinical applications of metabolomics. Metabolic profiling can be performed using mass spectrometry and nuclear magnetic resonance-based techniques using a variety of biological samples. This is carried out in vivo or in vitro following careful sample collection, preparation, and analysis. The potential clinical applications constitute disruptive innovations in their respective specialities, particularly oncology and metabolic medicine. Outstanding issues currently preventing widespread clinical use are scalability of data interpretation, standardization of sample handling practice, and e-infrastructure. Routine utilization of metabolomics at a patient and population level will constitute an integral part of future healthcare provision.
Subjects
Metabolomics , Precision Medicine , Stethoscopes , Humans
ABSTRACT
Exposome-Explorer (http://exposome-explorer.iarc.fr) is a database of dietary and pollutant biomarkers measured in population studies. In its first release, Exposome-Explorer contained comprehensive information on 692 biomarkers of dietary and pollution exposures extracted from the analysis of 480 peer-reviewed publications. Today, Exposome-Explorer has been further expanded and contains a total of 908 biomarkers. Two additional types of information have been collected. First, 185 candidate dietary biomarkers having 403 associations with food intake (as measured by metabolomic studies) have been identified and added. Second, 1356 associations between dietary biomarkers and cancer risk in epidemiological studies, which were collected from 313 publications, have also been added to the database. Classifications for both foods and compounds have been revised, and new classifications for biospecimens, analytical methods and cancers have been implemented. Finally, the web interface has been redesigned to significantly improve the user experience.
Subjects
Databases, Chemical , Diet , Environmental Biomarkers , Environmental Pollutants , Exposome , Neoplasms/epidemiology , Data Collection , Data Management , Humans , Risk Factors
ABSTRACT
Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on ¹H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P < 5×10⁻⁸) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10⁻⁴⁴) and lysine (rs8101881, P = 1.2×10⁻³³), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.
Subjects
Metabolome/genetics , Metabolomics , Polymorphism, Single Nucleotide/genetics , Urine , Amino Acid Transport Systems, Basic/genetics , Animals , Crohn Disease/genetics , Crohn Disease/metabolism , Fucosyltransferases/genetics , Fucosyltransferases/metabolism , Genome-Wide Association Study , Humans , Kidney Diseases/genetics , Kidney Diseases/metabolism , Magnetic Resonance Spectroscopy , Male , Mice , Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
BACKGROUND: Changes in the energy metabolism of cells are common to many kinds of tumors and are considered a hallmark of cancer. Gas chromatography followed by time-of-flight mass spectrometry (GC-TOFMS) is a well-suited technique to investigate the small molecules in the central metabolic pathways. However, the metabolic changes between invasive carcinoma and normal breast tissues have not so far been investigated in a large cohort of breast cancer samples. RESULTS: A cohort of 271 breast cancer and 98 normal tissue samples was investigated using GC-TOFMS-based metabolomics. A total of 468 metabolite peaks could be detected; of these, 368 (79%) were significantly changed between cancer and normal tissues (p<0.05 in training and validation set). Furthermore, 13 tumor and 7 normal tissue markers were identified that separated cancer from normal tissues with a sensitivity and a specificity of >80%. Two-metabolite classifiers, constructed as ratios of the tumor and normal tissue markers, separated cancer from normal tissues with high sensitivity and specificity. Specifically, the cytidine-5-monophosphate / pentadecanoic acid metabolic ratio was the most significant discriminator between cancer and normal tissues and allowed detection of cancer with a sensitivity of 94.8% and a specificity of 93.9%. CONCLUSIONS: For the first time, a comprehensive metabolic map of breast cancer was constructed by GC-TOF analysis of a large cohort of breast cancer and normal tissues. Furthermore, our results demonstrate that spectrometry-based approaches have the potential to contribute to the analysis of biopsies or clinical tissue samples complementary to histopathology.
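A two-metabolite ratio classifier of the kind described above can be evaluated with a few lines of code: divide the tumor marker by the normal-tissue marker, call a sample cancer when the ratio exceeds a cutoff, and count the outcomes. The sketch below is a minimal illustration; the intensities, the cutoff, and the function name are hypothetical and do not reproduce the study's data or its procedure for choosing thresholds.

```python
def ratio_classifier_metrics(tumor_marker, normal_marker, is_cancer, threshold):
    """Sensitivity and specificity of the rule: ratio > threshold means cancer."""
    tp = fn = tn = fp = 0
    for t, n, label in zip(tumor_marker, normal_marker, is_cancer):
        predicted_cancer = (t / n) > threshold
        if label and predicted_cancer:
            tp += 1
        elif label:
            fn += 1
        elif predicted_cancer:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of cancer samples detected
    specificity = tn / (tn + fp)  # share of normal samples correctly cleared
    return sensitivity, specificity


# Hypothetical peak intensities for six samples (three cancer, three normal).
tumor_marker = [8.0, 6.0, 9.0, 1.0, 2.0, 5.0]
normal_marker = [2.0, 3.0, 3.0, 2.0, 4.0, 2.0]
is_cancer = [True, True, True, False, False, False]

sens, spec = ratio_classifier_metrics(tumor_marker, normal_marker, is_cancer, 2.2)
print(round(sens, 3), round(spec, 3))  # 0.667 0.667
```

Lowering the cutoff raises sensitivity at the expense of specificity, so a threshold should be fixed on a training set and then checked on separate validation samples.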