1.
BMC Bioinformatics ; 25(1): 94, 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38438850

ABSTRACT

BACKGROUND: Analysis of time-resolved postprandial metabolomics data can improve the understanding of metabolic mechanisms, potentially revealing biomarkers for early diagnosis of metabolic diseases and advancing precision nutrition and medicine. Postprandial metabolomics measurements at several time points from multiple subjects can be arranged as a subjects by metabolites by time points array. Traditional analysis methods are limited in terms of revealing subject groups, related metabolites, and temporal patterns simultaneously from such three-way data. RESULTS: We introduce an unsupervised multiway analysis approach based on the CANDECOMP/PARAFAC (CP) model for improved analysis of postprandial metabolomics data guided by a simulation study. Because of the lack of ground truth in real data, we generate simulated data using a comprehensive human metabolic model. This allows us to assess the performance of CP models in terms of revealing subject groups and underlying metabolic processes. We study three analysis approaches: analysis of fasting-state data using principal component analysis, T0-corrected data (i.e., data corrected by subtracting fasting-state data) using a CP model and full-dynamic (i.e., full postprandial) data using CP. Through extensive simulations, we demonstrate that CP models capture meaningful and stable patterns from simulated meal challenge data, revealing underlying mechanisms and differences between diseased versus healthy groups. CONCLUSIONS: Our experiments show that it is crucial to analyze both fasting-state and T0-corrected data for understanding metabolic differences among subject groups. Depending on the nature of the subject group structure, the best group separation may be achieved by CP models of T0-corrected or full-dynamic data. This study introduces an improved analysis approach for postprandial metabolomics data while also shedding light on the debate about correcting baseline values in longitudinal data analysis.
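
The three data views named in this abstract (fasting-state, T0-corrected, full-dynamic) can be illustrated with a minimal NumPy sketch. It assumes the array is ordered subjects by metabolites by time points with the fasting state at time index 0; the function name `t0_correct` and the toy values are hypothetical and not taken from the paper.

```python
import numpy as np

def t0_correct(X):
    """Subtract each subject's fasting (first time point) values.

    X has shape (subjects, metabolites, time points); time index 0 is
    ASSUMED to be the fasting state. Returns the postprandial slices
    minus the fasting slice, shape (subjects, metabolites, time - 1).
    """
    fasting = X[:, :, :1]          # keep the time axis so broadcasting works
    return X[:, :, 1:] - fasting

# Toy data: 4 subjects, 3 metabolites, 5 time points (arbitrary values).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(4, 3, 5))   # "full-dynamic" three-way array
X_fasting = X_full[:, :, 0]           # subjects x metabolites matrix (input for PCA)
X_t0 = t0_correct(X_full)             # "T0-corrected" three-way array
```

Here `X_fasting` would go to a two-way method such as PCA, while `X_t0` and `X_full` keep the time mode and are therefore candidates for a CP model.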


Subject(s)
Medicine , Metabolomics , Humans , Computer Simulation , Data Analysis , Health Status
2.
Metabolomics ; 20(4): 86, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39066850

ABSTRACT

INTRODUCTION: Longitudinal metabolomics data from a meal challenge test contains both fasting and dynamic signals that may be related to metabolic health and diseases. Recent work has explored the multiway structure of time-resolved metabolomics data by arranging it as a three-way array with modes: subjects, metabolites, and time. The analysis of such dynamic data (where the fasting data is subtracted from postprandial states) reveals dynamic markers of various phenotypes, and differences between fasting and dynamic states. However, there is still limited success in terms of extracting static and dynamic biomarkers for the same subject stratifications. OBJECTIVES: Through joint analysis of fasting and dynamic metabolomics data, our goal is to capture static and dynamic biomarkers of a phenotype for the same subject stratifications, providing a complete picture that will be more effective for precision health. METHODS: We jointly analyze fasting and dynamic metabolomics data collected during a meal challenge test from the COPSAC2000 cohort using coupled matrix and tensor factorizations (CMTF), where the dynamic data (subjects by metabolites by time) is coupled with the fasting data (subjects by metabolites) in the subjects mode. RESULTS: The proposed data fusion approach extracts shared subject stratifications in terms of BMI (body mass index) from fasting and dynamic signals as well as the static and dynamic metabolic biomarker patterns corresponding to those stratifications. Specifically, we observe a subject stratification showing a positive association with all fasting VLDLs and higher BMI. For the same subject stratification, a subset of dynamic VLDLs (mainly the smaller sizes) correlates negatively with higher BMI. Higher correlations of the subject quantifications with the phenotype of interest are observed using such a data fusion approach compared to individual analyses of the fasting and postprandial state.
CONCLUSION: The CMTF-based approach provides a complete picture of static and dynamic biomarkers for the same subject stratifications when markers are present in both fasting and dynamic states.
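
The coupling described in METHODS can be written down as a sketch. The objective below is the commonly used least-squares form of CMTF, given here as an assumption: the abstract does not state the exact loss, weighting, or constraints used in the paper, and the symbols are introduced only for illustration. The dynamic array is $\mathcal{X}$ (subjects by metabolites by time), the fasting matrix is $Y$ (subjects by metabolites), and the subject-mode factor matrix $A$ is shared between them.

```latex
\min_{A,\,B,\,C,\,V}\;
  \bigl\lVert \mathcal{X} - [\![ A, B, C ]\!] \bigr\rVert_F^{2}
  \;+\;
  \bigl\lVert Y - A V^{\top} \bigr\rVert_F^{2},
\qquad
[\![ A, B, C ]\!]_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```

Because $A$ appears in both terms, each component $r$ has a single set of subject scores, paired with dynamic metabolite and time patterns ($b_r$, $c_r$) and with a fasting metabolite pattern ($v_r$).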


Subject(s)
Biomarkers , Fasting , Metabolomics , Postprandial Period , Humans , Biomarkers/blood , Biomarkers/metabolism , Metabolomics/methods , Fasting/metabolism , Male , Female , Adult , Middle Aged
3.
Metabolomics ; 20(3): 50, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38722393

ABSTRACT

INTRODUCTION: Analysis of time-resolved postprandial metabolomics data can improve our understanding of the human metabolism by revealing similarities and differences in postprandial responses of individuals. Traditional data analysis methods often rely on data summaries or univariate approaches focusing on one metabolite at a time. OBJECTIVES: Our goal is to provide a comprehensive picture in terms of the changes in the human metabolism in response to a meal challenge test, by revealing static and dynamic markers of phenotypes, i.e., subject stratifications, related clusters of metabolites, and their temporal profiles. METHODS: We analyze Nuclear Magnetic Resonance (NMR) spectroscopy measurements of plasma samples collected during a meal challenge test from 299 individuals from the COPSAC2000 cohort using a Nightingale NMR panel at the fasting and postprandial states (15, 30, 60, 90, 120, 150, 240 min). We investigate the postprandial dynamics of the metabolism as reflected in the dynamic behaviour of the measured metabolites. The data is arranged as a three-way array: subjects by metabolites by time. We analyze the fasting state data to reveal static patterns of subject group differences using principal component analysis (PCA), and fasting state-corrected postprandial data using the CANDECOMP/PARAFAC (CP) tensor factorization to reveal dynamic markers of group differences. RESULTS: Our analysis reveals dynamic markers consisting of certain metabolite groups and their temporal profiles showing differences among males according to their body mass index (BMI) in response to the meal challenge. We also show that certain lipoproteins relate to the group difference differently in the fasting vs. dynamic state. Furthermore, while similar dynamic patterns are observed in males and females, the BMI-related group difference is observed only in males in the dynamic state. 
CONCLUSION: The CP model is an effective approach to analyze time-resolved postprandial metabolomics data, and provides a compact but comprehensive summary of the postprandial data, revealing replicable and interpretable dynamic markers that are crucial to advancing our understanding of changes in the metabolism in response to a meal challenge.
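
For reference, the CP model of a three-way array with entries $x_{ijk}$ (subject $i$, metabolite $j$, time point $k$) has the standard form below; the symbols and the number of components $R$ are notation chosen here, not values reported in the abstract.

```latex
x_{ijk} \;\approx\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
\qquad i = 1,\dots,I,\quad j = 1,\dots,J,\quad k = 1,\dots,K
```

Each component $r$ pairs a vector of subject scores $a_r$ with a metabolite pattern $b_r$ and a temporal profile $c_r$, which is why a single component can be read as a subject stratification, a cluster of related metabolites, and their shared time course.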


Subject(s)
Metabolomics , Postprandial Period , Humans , Postprandial Period/physiology , Male , Female , Metabolomics/methods , Adult , Fasting/metabolism , Principal Component Analysis , Magnetic Resonance Spectroscopy/methods , Middle Aged , Data Analysis , Metabolome/physiology
4.
PLoS Comput Biol ; 19(6): e1011221, 2023 06.
Article in English | MEDLINE | ID: mdl-37352364

ABSTRACT

The intricate dependency structure of biological "omics" data, particularly those originating from longitudinal intervention studies with frequently sampled repeated measurements, renders the analysis of such data challenging. The high dimensionality, inter-relatedness of multiple outcomes, and heterogeneity in the studied systems all add to the difficulty in deriving meaningful information. In addition, the subtle differences in dynamics often deemed meaningful in nutritional intervention studies can be particularly challenging to quantify. In this work we demonstrate the use of quantitative longitudinal models within the repeated-measures ANOVA simultaneous component analysis+ (RM-ASCA+) framework to capture the dynamics in frequently sampled longitudinal data with multivariate outcomes. We illustrate the use of linear mixed models with polynomial and spline basis expansions of the time variable within RM-ASCA+ in order to quantify non-linear dynamics in a simulation study as well as in a metabolomics data set. We show that the proposed approach presents a convenient and interpretable way to systematically quantify and summarize multivariate outcomes in longitudinal studies while accounting for proper within-subject dependency structures.
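
A basis expansion of the time variable can be sketched as follows. This example uses a truncated power basis, which is only one possible spline basis and is an assumption: the abstract does not say which basis, degree, or knots the authors used. The function name, sampling times, and knot position are hypothetical.

```python
import numpy as np

def truncated_power_basis(t, knots, degree=3):
    """Design-matrix columns for a spline in time.

    Columns: t, t^2, ..., t^degree, then (t - k)_+^degree for each knot k.
    The intercept column is left out so the model can supply it.
    """
    t = np.asarray(t, dtype=float)
    poly = [t ** p for p in range(1, degree + 1)]
    hinge = [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(poly + hinge)

# Hypothetical sampling times (hours) and a single interior knot at 1 h.
times = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
B = truncated_power_basis(times, knots=[1.0], degree=3)
```

In a linear mixed model, the columns of `B` (and their interactions with group) would enter as fixed effects, with subject-level random effects accounting for the repeated measurements. With no knots the basis reduces to a plain polynomial expansion.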


Subject(s)
Algorithms , Metabolomics , Computer Simulation , Linear Models
5.
BMC Bioinformatics ; 23(1): 31, 2022 Jan 10.
Article in English | MEDLINE | ID: mdl-35012453

ABSTRACT

BACKGROUND: Analysis of dynamic metabolomics data holds the promise to improve our understanding of underlying mechanisms in metabolism. For example, it may detect changes in metabolism due to the onset of a disease. Dynamic or time-resolved metabolomics data can be arranged as a three-way array with entries organized according to a subjects mode, a metabolites mode and a time mode. While such time-evolving multiway data sets are increasingly collected, revealing the underlying mechanisms and their dynamics from such data remains challenging. For such data, one of the complexities is the presence of a superposition of several sources of variation: induced variation (due to experimental conditions or inborn errors), individual variation, and measurement error. Multiway data analysis (also known as tensor factorizations) has been successfully used in data mining to find the underlying patterns in multiway data. To explore the performance of multiway data analysis methods in terms of revealing the underlying mechanisms in dynamic metabolomics data, simulated data with known ground truth can be studied. RESULTS: We focus on simulated data arising from different dynamic models of increasing complexity, i.e., a simple linear system, a yeast glycolysis model, and a human cholesterol model. We generate data with induced variation as well as individual variation. Systematic experiments are performed to demonstrate the advantages and limitations of multiway data analysis in analyzing such dynamic metabolomics data and their capacity to disentangle the different sources of variations. We choose to use simulations since we want to understand the capability of multiway data analysis methods which is facilitated by knowing the ground truth. 
CONCLUSION: Our numerical experiments demonstrate that despite the increasing complexity of the studied dynamic metabolic models, tensor factorization methods CANDECOMP/PARAFAC (CP) and Parallel Profiles with Linear Dependences (Paralind) can disentangle the sources of variations and thereby reveal the underlying mechanisms and their dynamics.
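
To make the simulate-then-recover idea concrete, here is a minimal sketch of fitting a CP model by alternating least squares (ALS) on synthetic data with known factors. ALS is a common fitting algorithm, but the abstract does not say which algorithm or software the authors used, so this is an illustration under that assumption. All function names are hypothetical, and the random toy array stands in for the metabolic-model simulations.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def unfold(X, mode):
    """Matricize X with `mode` as rows (C-order over the remaining modes)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=500, seed=0):
    """Fit a CP model to a three-way array by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.normal(size=(n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            # Least-squares update: unfold(X, mode) ≈ factors[mode] @ kr.T
            sol = np.linalg.lstsq(kr, unfold(X, mode).T, rcond=None)[0]
            factors[mode] = sol.T
    return factors

def reconstruct(factors):
    """Rebuild the three-way array from its CP factor matrices."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Synthetic rank-2 array: 6 subjects x 5 metabolites x 4 time points.
rng = np.random.default_rng(1)
true_factors = [rng.normal(size=(n, 2)) for n in (6, 5, 4)]
X = reconstruct(true_factors)
est = cp_als(X, rank=2)
rel_err = np.linalg.norm(X - reconstruct(est)) / np.linalg.norm(X)
```

Because the generating factors are known, the fit can be judged against ground truth, which is the benefit of simulation that the abstract points to. This sketch omits normalization, convergence checks, and multiple random starts, all of which a real analysis would need.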


Subject(s)
Metabolomics , Computer Simulation , Humans
6.
Anal Chem ; 94(2): 628-636, 2022 01 18.
Article in English | MEDLINE | ID: mdl-34936323

ABSTRACT

Lipoprotein subfractions are biomarkers for the early diagnosis of cardiovascular diseases. The reference method for measuring lipoproteins, ultracentrifugation, is time-consuming, and there is a need to develop a rapid method for cohort screenings. This study presents partial least-squares regression models developed using ¹H nuclear magnetic resonance (NMR) spectra and concentrations of lipoproteins as measured by ultracentrifugation on 316 healthy Danes. This study explores, for the first time, different regions of the ¹H NMR spectrum representing signals of molecules in lipoprotein particles and different lipid species to develop parsimonious, reliable, and optimal prediction models. A total of 65 lipoprotein main and subfractions were predictable with high accuracy (Q² > 0.6) using an optimal spectral region (1.4-0.6 ppm) containing methylene and methyl signals from lipids. The models were subsequently tested on an independent cohort of 290 healthy Swedes, with predicted and reference values matching by up to 85-95%. In addition, an open software tool was developed to predict lipoprotein concentrations in human blood from standardized ¹H NMR spectral recordings.
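
The Q² threshold quoted above is a cross-validated analogue of R². A minimal sketch of one common definition follows, as an assumption: the abstract does not define its Q² variant (some definitions compute the total sum of squares around training-fold means instead). The function name and toy numbers are hypothetical.

```python
import numpy as np

def q2(y_true, y_cv_pred):
    """Q2 = 1 - PRESS / TSS, where y_cv_pred holds cross-validated predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_cv_pred = np.asarray(y_cv_pred, dtype=float)
    press = np.sum((y_true - y_cv_pred) ** 2)       # prediction error sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return 1.0 - press / tss

# Toy reference values and two hypothetical sets of held-out predictions.
y = [1.0, 2.0, 3.0, 4.0]
q2_perfect = q2(y, [1.0, 2.0, 3.0, 4.0])   # predictions equal the reference values
q2_mean = q2(y, [2.5, 2.5, 2.5, 2.5])      # predicting the mean for every sample
```

Under this definition, perfect held-out predictions give a Q² of 1 and always predicting the mean gives 0, so the reported Q² > 0.6 means the models explain more than 60% of the held-out variance.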


Subject(s)
Lipoproteins, LDL , Lipoproteins , Humans , Magnetic Resonance Spectroscopy/methods , Proton Magnetic Resonance Spectroscopy , Sweden
7.
PLoS Comput Biol ; 17(11): e1009585, 2021 11.
Article in English | MEDLINE | ID: mdl-34752455

ABSTRACT

Longitudinal intervention studies with repeated measurements over time are an important type of experimental design in biomedical research. Due to the advent of "omics"-sciences (genomics, transcriptomics, proteomics, metabolomics), longitudinal studies generate increasingly multivariate outcome data. Analysis of such data must take both the longitudinal intervention structure and multivariate nature of the data into account. The ASCA+-framework combines general linear models with principal component analysis and can be used to separate and visualize the multivariate effect of different experimental factors. However, this methodology has not yet been developed for the more complex designs often found in longitudinal intervention studies, which may be unbalanced, involve randomized interventions, and have substantial missing data. Here we describe a new methodology, repeated measures ASCA+ (RM-ASCA+), and show how it can be used to model metabolic changes over time, and compare metabolic changes between groups, in both randomized and non-randomized intervention studies. Tools for both visualization and model validation are discussed. This approach can facilitate easier interpretation of data from longitudinal clinical trials with multivariate outcomes.
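
The combination of general linear models with principal component analysis in ASCA+ can be summarized in a short sketch. The notation is chosen here and is an assumption rather than the paper's own: $X$ is the samples-by-variables data matrix, $D$ a coded design matrix with column blocks $D_f$ for each experimental factor $f$, and $\hat{B}_f$ the matching rows of the estimated coefficients.

```latex
X = D B + E, \qquad
\hat{B} = (D^{\top} D)^{-1} D^{\top} X, \qquad
\hat{X}_f = D_f \hat{B}_f, \qquad
\hat{X}_f = T_f P_f^{\top} \quad \text{(PCA of each effect matrix)}
```

The scores $T_f$ and loadings $P_f$ then give a low-dimensional view of the multivariate effect of factor $f$, for example time or the time-by-group interaction. For repeated measures, the least-squares estimate would be replaced by estimates from a model that accounts for within-subject correlation.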


Subject(s)
Breast Neoplasms/drug therapy , Antineoplastic Agents, Immunological/therapeutic use , Bariatric Surgery , Bevacizumab/therapeutic use , Data Interpretation, Statistical , Female , Genomics , Humans , Longitudinal Studies , Metabolomics , Proteomics , Reproducibility of Results
8.
Brief Bioinform ; 20(1): 317-329, 2019 01 18.
Article in English | MEDLINE | ID: mdl-30657888

ABSTRACT

Motivation: Genome-wide measurements of genetic and epigenetic alterations are generating more and more high-dimensional binary data. The special mathematical characteristics of binary data make the direct use of the classical principal component analysis (PCA) model to explore low-dimensional structures less obvious. Although there are several PCA alternatives for binary data in the psychometric, data analysis and machine learning literature, they are not well known to the bioinformatics community. Results: In this article, we introduce the motivation and rationale of some parametric and nonparametric versions of PCA specifically geared for binary data. Using both realistic simulations of binary data as well as mutation, CNA and methylation data of the Genomic Determinants of Sensitivity in Cancer 1000 (GDSC1000), the methods were explored for their performance with respect to finding the correct number of components, overfitting, recovering the correct low-dimensional structure, variable importance, etc. The results show that if a low-dimensional structure exists in the data, most of the methods can find it. When a probabilistic generating process is assumed to underlie the data, we recommend using the parametric logistic PCA model; when such an assumption is not valid and the data are considered as given, the nonparametric Gifi model is recommended. Availability: The code to reproduce the results in this article is available at the homepage of the Biosystems Data Analysis group (www.bdagroup.nl).
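
The "probabilistic generating process" behind a parametric logistic PCA can be sketched as below. This is the usual textbook formulation, given as an assumption because the abstract does not spell out the exact parameterization or estimation method compared in the article. The binary entry $x_{ij}$ (sample $i$, variable $j$) is modeled through a low-rank matrix of log-odds.

```latex
x_{ij} \sim \mathrm{Bernoulli}(\pi_{ij}), \qquad
\log\frac{\pi_{ij}}{1 - \pi_{ij}} = \theta_{ij} = \mu_j + a_i^{\top} b_j, \qquad
a_i,\, b_j \in \mathbb{R}^{R}
```

The scores $a_i$ and loadings $b_j$ play the role of PCA scores and loadings, but they are fitted by maximizing the Bernoulli likelihood rather than by minimizing squared error on the 0/1 values.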


Subject(s)
Genomics/statistics & numerical data , Principal Component Analysis , Algorithms , Computational Biology/methods , Computational Biology/statistics & numerical data , Computer Simulation , DNA Copy Number Variations , DNA Methylation , Databases, Genetic/statistics & numerical data , Humans , Logistic Models , Machine Learning , Neoplasms/genetics , Nonlinear Dynamics , Software , Statistics, Nonparametric
9.
Metabolomics ; 17(9): 77, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34435244

ABSTRACT

INTRODUCTION: The relationship between the chemical composition of food products and their sensory profile is complex and poses many challenges. However, new untargeted methodologies are helping to correlate metabolites with sensory characteristics in a simpler manner. Nevertheless, in the pilot phase of a project, where only a small set of products is used to explore the relationships, choices have to be made about the most appropriate untargeted metabolomics methodology. OBJECTIVE: To provide a framework for selecting a metabolite-sensory methodology based on the quality of measurements and on the relevance of the detected metabolites, both in terms of distinguishing between products and in terms of whether they can be related to the sensory attributes of the products. METHODS: In this paper we introduce a systematic approach to explore all these aspects driving the choice of the most appropriate metabolomics method. RESULTS: As an example, we used a tomato soup project in which a choice between two sampling methods (SPME and SBSE) had to be made. The results do not always point consistently to the same method as the best: SPME detected metabolites with better precision, whereas SBSE seemed to provide a better distinction between the soups. CONCLUSION: The three levels of comparison provide information on how the methods could perform in a follow-up study and will help researchers make a final selection of the most appropriate method based on the methods' strengths and weaknesses.


Subject(s)
Metabolomics , Follow-Up Studies
10.
PLoS Comput Biol ; 16(9): e1008295, 2020 09.
Article in English | MEDLINE | ID: mdl-32997685

ABSTRACT

The field of transcriptomics uses and measures mRNA as a proxy of gene expression. There are currently two major platforms in use for quantifying mRNA, microarray and RNA-Seq. Many comparative studies have shown that their results are not always consistent. In this study we aim to find a robust method to increase the comparability of the two platforms, enabling analysis of merged data from both. We transformed high-dimensional transcriptomics data from two different platforms into a lower-dimensional, biologically relevant dataset by calculating enrichment scores based on gene set collections for all samples. We compared the similarity between data from both platforms based on the raw data and on the enrichment scores. We show that the transformation represents the data in a biologically relevant way and filters out noise, which leads to increased platform concordance. We validate the procedure using predictive models built with microarray-based enrichment scores to predict subtypes of breast cancer using enrichment scores based on sequenced data. Although microarray and RNA-Seq expression levels might appear different, transforming them into biologically relevant gene set enrichment scores significantly increases their correlation, which is a step forward in data integration of the two platforms. The gene set collections were shown to contain biologically relevant gene sets. More in-depth investigation on the effect of the composition, size, and number of gene sets that are used for the transformation is suggested for future research.
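
The idea of replacing gene-level values with per-sample gene set scores can be illustrated with a simplified stand-in. The abstract does not name the enrichment method used, so the score below (mean within-platform z-score over the genes in a set) is an assumption made for illustration only, not the authors' procedure. The gene sets, scale factors, and noise levels are made up.

```python
import numpy as np

def gene_set_scores(expr, gene_sets):
    """Mean per-gene z-score within each gene set.

    expr: samples x genes matrix from ONE platform.
    gene_sets: dict mapping set name -> list of gene column indices.
    Returns a samples x sets matrix (columns follow the dict order).
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)   # standardize each gene
    return np.column_stack([z[:, idx].mean(axis=1) for idx in gene_sets.values()])

# Synthetic data: shared biology plus platform-specific scale, offset, and noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=(6, 8))                                # 6 samples x 8 genes
microarray = 2.0 * signal + 7.0 + 0.3 * rng.normal(size=(6, 8))
rnaseq = 0.5 * signal + 1.0 + 0.3 * rng.normal(size=(6, 8))
sets = {"set_a": [0, 1, 2, 3], "set_b": [4, 5, 6, 7]}           # hypothetical gene sets

scores_ma = gene_set_scores(microarray, sets)   # 6 samples x 2 sets
scores_rs = gene_set_scores(rnaseq, sets)       # same layout, other platform
```

Standardizing within each platform removes scale and offset differences, and averaging over a gene set tends to damp gene-level noise. The two score matrices share one layout, so a model trained on `scores_ma` could in principle be applied to `scores_rs`.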


Subject(s)
Databases, Genetic , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis , RNA-Seq , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Humans , Reproducibility of Results , Transcriptome/genetics