Búsqueda | Portal de Búsqueda de la BVS Ecuador

1.

Analyzing postprandial metabolomics data using multiway models: a simulation study.

Li, Lu; Yan, Shi; Bakker, Barbara M; Hoefsloot, Huub; Chawes, Bo; Horner, David; Rasmussen, Morten A; Smilde, Age K; Acar, Evrim.

BMC Bioinformatics ; 25(1): 94, 2024 Mar 04.

Artículo en Inglés | MEDLINE | ID: mdl-38438850

RESUMEN

BACKGROUND: Analysis of time-resolved postprandial metabolomics data can improve the understanding of metabolic mechanisms, potentially revealing biomarkers for early diagnosis of metabolic diseases and advancing precision nutrition and medicine. Postprandial metabolomics measurements at several time points from multiple subjects can be arranged as a subjects by metabolites by time points array. Traditional analysis methods are limited in terms of revealing subject groups, related metabolites, and temporal patterns simultaneously from such three-way data. RESULTS: We introduce an unsupervised multiway analysis approach based on the CANDECOMP/PARAFAC (CP) model for improved analysis of postprandial metabolomics data guided by a simulation study. Because of the lack of ground truth in real data, we generate simulated data using a comprehensive human metabolic model. This allows us to assess the performance of CP models in terms of revealing subject groups and underlying metabolic processes. We study three analysis approaches: analysis of fasting-state data using principal component analysis, T0-corrected data (i.e., data corrected by subtracting fasting-state data) using a CP model and full-dynamic (i.e., full postprandial) data using CP. Through extensive simulations, we demonstrate that CP models capture meaningful and stable patterns from simulated meal challenge data, revealing underlying mechanisms and differences between diseased versus healthy groups. CONCLUSIONS: Our experiments show that it is crucial to analyze both fasting-state and T0-corrected data for understanding metabolic differences among subject groups. Depending on the nature of the subject group structure, the best group separation may be achieved by CP models of T0-corrected or full-dynamic data. This study introduces an improved analysis approach for postprandial metabolomics data while also shedding light on the debate about correcting baseline values in longitudinal data analysis.

Asunto(s)

Medicina , Metabolómica , Humanos , Simulación por Computador , Análisis de Datos , Estado de Salud

2.

Characterizing human postprandial metabolic response using multiway data analysis.

Yan, Shi; Li, Lu; Horner, David; Ebrahimi, Parvaneh; Chawes, Bo; Dragsted, Lars O; Rasmussen, Morten A; Smilde, Age K; Acar, Evrim.

Metabolomics ; 20(3): 50, 2024 May 09.

Artículo en Inglés | MEDLINE | ID: mdl-38722393

RESUMEN

INTRODUCTION: Analysis of time-resolved postprandial metabolomics data can improve our understanding of the human metabolism by revealing similarities and differences in postprandial responses of individuals. Traditional data analysis methods often rely on data summaries or univariate approaches focusing on one metabolite at a time. OBJECTIVES: Our goal is to provide a comprehensive picture in terms of the changes in the human metabolism in response to a meal challenge test, by revealing static and dynamic markers of phenotypes, i.e., subject stratifications, related clusters of metabolites, and their temporal profiles. METHODS: We analyze Nuclear Magnetic Resonance (NMR) spectroscopy measurements of plasma samples collected during a meal challenge test from 299 individuals from the COPSAC2000 cohort using a Nightingale NMR panel at the fasting and postprandial states (15, 30, 60, 90, 120, 150, 240 min). We investigate the postprandial dynamics of the metabolism as reflected in the dynamic behaviour of the measured metabolites. The data is arranged as a three-way array: subjects by metabolites by time. We analyze the fasting state data to reveal static patterns of subject group differences using principal component analysis (PCA), and fasting state-corrected postprandial data using the CANDECOMP/PARAFAC (CP) tensor factorization to reveal dynamic markers of group differences. RESULTS: Our analysis reveals dynamic markers consisting of certain metabolite groups and their temporal profiles showing differences among males according to their body mass index (BMI) in response to the meal challenge. We also show that certain lipoproteins relate to the group difference differently in the fasting vs. dynamic state. Furthermore, while similar dynamic patterns are observed in males and females, the BMI-related group difference is observed only in males in the dynamic state. CONCLUSION: The CP model is an effective approach to analyze time-resolved postprandial metabolomics data, and provides a compact but a comprehensive summary of the postprandial data revealing replicable and interpretable dynamic markers crucial to advance our understanding of changes in the metabolism in response to a meal challenge.

Asunto(s)

Metabolómica , Periodo Posprandial , Humanos , Periodo Posprandial/fisiología , Masculino , Femenino , Metabolómica/métodos , Adulto , Ayuno/metabolismo , Análisis de Componente Principal , Espectroscopía de Resonancia Magnética/métodos , Persona de Mediana Edad , Análisis de Datos , Metaboloma/fisiología

3.

Analysis of high-dimensional metabolomics data with complex temporal dynamics using RM-ASCA.

Erdos, Balázs; Westerhuis, Johan A; Adriaens, Michiel E; O'Donovan, Shauna D; Xie, Ren; Singh-Povel, Cécile M; Smilde, Age K; Arts, Ilja C W.

PLoS Comput Biol ; 19(6): e1011221, 2023 06.

Artículo en Inglés | MEDLINE | ID: mdl-37352364

RESUMEN

The intricate dependency structure of biological "omics" data, particularly those originating from longitudinal intervention studies with frequently sampled repeated measurements renders the analysis of such data challenging. The high-dimensionality, inter-relatedness of multiple outcomes, and heterogeneity in the studied systems all add to the difficulty in deriving meaningful information. In addition, the subtle differences in dynamics often deemed meaningful in nutritional intervention studies can be particularly challenging to quantify. In this work we demonstrate the use of quantitative longitudinal models within the repeated-measures ANOVA simultaneous component analysis+ (RM-ASCA+) framework to capture the dynamics in frequently sampled longitudinal data with multivariate outcomes. We illustrate the use of linear mixed models with polynomial and spline basis expansion of the time variable within RM-ASCA+ in order to quantify non-linear dynamics in a simulation study as well as in a metabolomics data set. We show that the proposed approach presents a convenient and interpretable way to systematically quantify and summarize multivariate outcomes in longitudinal studies while accounting for proper within subject dependency structures.

Asunto(s)

Algoritmos , Metabolómica , Simulación por Computador , Modelos Lineales

4.

Exploring dynamic metabolomics data with multiway data analysis: a simulation study.

Li, Lu; Hoefsloot, Huub; de Graaf, Albert A; Acar, Evrim; Smilde, Age K.

BMC Bioinformatics ; 23(1): 31, 2022 Jan 10.

Artículo en Inglés | MEDLINE | ID: mdl-35012453

RESUMEN

BACKGROUND: Analysis of dynamic metabolomics data holds the promise to improve our understanding of underlying mechanisms in metabolism. For example, it may detect changes in metabolism due to the onset of a disease. Dynamic or time-resolved metabolomics data can be arranged as a three-way array with entries organized according to a subjects mode, a metabolites mode and a time mode. While such time-evolving multiway data sets are increasingly collected, revealing the underlying mechanisms and their dynamics from such data remains challenging. For such data, one of the complexities is the presence of a superposition of several sources of variation: induced variation (due to experimental conditions or inborn errors), individual variation, and measurement error. Multiway data analysis (also known as tensor factorizations) has been successfully used in data mining to find the underlying patterns in multiway data. To explore the performance of multiway data analysis methods in terms of revealing the underlying mechanisms in dynamic metabolomics data, simulated data with known ground truth can be studied. RESULTS: We focus on simulated data arising from different dynamic models of increasing complexity, i.e., a simple linear system, a yeast glycolysis model, and a human cholesterol model. We generate data with induced variation as well as individual variation. Systematic experiments are performed to demonstrate the advantages and limitations of multiway data analysis in analyzing such dynamic metabolomics data and their capacity to disentangle the different sources of variations. We choose to use simulations since we want to understand the capability of multiway data analysis methods which is facilitated by knowing the ground truth. CONCLUSION: Our numerical experiments demonstrate that despite the increasing complexity of the studied dynamic metabolic models, tensor factorization methods CANDECOMP/PARAFAC(CP) and Parallel Profiles with Linear Dependences (Paralind) can disentangle the sources of variations and thereby reveal the underlying mechanisms and their dynamics.

Asunto(s)

Metabolómica , Simulación por Computador , Humanos

5.

Human Blood Lipoprotein Predictions from ¹H NMR Spectra: Protocol, Model Performances, and Cage of Covariance.

Khakimov, Bekzod; Hoefsloot, Huub C J; Mobaraki, Nabiollah; Aru, Violetta; Kristensen, Mette; Lind, Mads V; Holm, Lars; Castro-Mejía, Josué L; Nielsen, Dennis S; Jacobs, Doris M; Smilde, Age K; Engelsen, Søren Balling.

Anal Chem ; 94(2): 628-636, 2022 01 18.

Artículo en Inglés | MEDLINE | ID: mdl-34936323

RESUMEN

Lipoprotein subfractions are biomarkers for the early diagnosis of cardiovascular diseases. The reference method, ultracentrifugation, for measuring lipoproteins is time-consuming, and there is a need to develop a rapid method for cohort screenings. This study presents partial least-squares regression models developed using 1H nuclear magnetic resonance (NMR) spectra and concentrations of lipoproteins as measured by ultracentrifugation on 316 healthy Danes. This study explores, for the first time, different regions of the 1H NMR spectrum representing signals of molecules in lipoprotein particles and different lipid species to develop parsimonious, reliable, and optimal prediction models. A total of 65 lipoprotein main and subfractions were predictable with high accuracy, Q2 of >0.6, using an optimal spectral region (1.4-0.6 ppm) containing methylene and methyl signals from lipids. The models were subsequently tested on an independent cohort of 290 healthy Swedes with predicted and reference values matching by up to 85-95%. In addition, an open software tool was developed to predict lipoproteins concentrations in human blood from standardized 1H NMR spectral recordings.

Asunto(s)

Lipoproteínas LDL , Lipoproteínas , Humanos , Espectroscopía de Resonancia Magnética/métodos , Espectroscopía de Protones por Resonancia Magnética , Suecia

6.

Repeated measures ASCA+ for analysis of longitudinal intervention studies with multivariate outcome data.

Madssen, Torfinn S; Giskeødegård, Guro F; Smilde, Age K; Westerhuis, Johan A.

PLoS Comput Biol ; 17(11): e1009585, 2021 11.

Artículo en Inglés | MEDLINE | ID: mdl-34752455

RESUMEN

Longitudinal intervention studies with repeated measurements over time are an important type of experimental design in biomedical research. Due to the advent of "omics"-sciences (genomics, transcriptomics, proteomics, metabolomics), longitudinal studies generate increasingly multivariate outcome data. Analysis of such data must take both the longitudinal intervention structure and multivariate nature of the data into account. The ASCA+-framework combines general linear models with principal component analysis and can be used to separate and visualize the multivariate effect of different experimental factors. However, this methodology has not yet been developed for the more complex designs often found in longitudinal intervention studies, which may be unbalanced, involve randomized interventions, and have substantial missing data. Here we describe a new methodology, repeated measures ASCA+ (RM-ASCA+), and show how it can be used to model metabolic changes over time, and compare metabolic changes between groups, in both randomized and non-randomized intervention studies. Tools for both visualization and model validation are discussed. This approach can facilitate easier interpretation of data from longitudinal clinical trials with multivariate outcomes.

Asunto(s)

Neoplasias de la Mama/tratamiento farmacológico , Antineoplásicos Inmunológicos/uso terapéutico , Cirugía Bariátrica , Bevacizumab/uso terapéutico , Interpretación Estadística de Datos , Femenino , Genómica , Humanos , Estudios Longitudinales , Metabolómica , Proteómica , Reproducibilidad de los Resultados

7.

Principal component analysis of binary genomics data.

Song, Yipeng; Westerhuis, Johan A; Aben, Nanne; Michaut, Magali; Wessels, Lodewyk F A; Smilde, Age K.

Brief Bioinform ; 20(1): 317-329, 2019 01 18.

Artículo en Inglés | MEDLINE | ID: mdl-30657888

RESUMEN

Motivation: Genome-wide measurements of genetic and epigenetic alterations are generating more and more high-dimensional binary data. The special mathematical characteristics of binary data make the direct use of the classical principal component analysis (PCA) model to explore low-dimensional structures less obvious. Although there are several PCA alternatives for binary data in the psychometric, data analysis and machine learning literature, they are not well known to the bioinformatics community. Results: In this article, we introduce the motivation and rationale of some parametric and nonparametric versions of PCA specifically geared for binary data. Using both realistic simulations of binary data as well as mutation, CNA and methylation data of the Genomic Determinants of Sensitivity in Cancer 1000 (GDSC1000), the methods were explored for their performance with respect to finding the correct number of components, overfit, finding back the correct low-dimensional structure, variable importance, etc. The results show that if a low-dimensional structure exists in the data, that most of the methods can find it. When assuming a probabilistic generating process is underlying the data, we recommend to use the parametric logistic PCA model, while when such an assumption is not valid and the data are considered as given, the nonparametric Gifi model is recommended. Availability: The codes to reproduce the results in this article are available at the homepage of the Biosystems Data Analysis group (www.bdagroup.nl).

Asunto(s)

Genómica/estadística & datos numéricos , Análisis de Componente Principal , Algoritmos , Biología Computacional/métodos , Biología Computacional/estadística & datos numéricos , Simulación por Computador , Variaciones en el Número de Copia de ADN , Metilación de ADN , Bases de Datos Genéticas/estadística & datos numéricos , Humanos , Modelos Logísticos , Aprendizaje Automático , Neoplasias/genética , Dinámicas no Lineales , Programas Informáticos , Estadísticas no Paramétricas

8.

Systematic selection of competing metabolomics methods in a metabolite-sensory relationship study.

Davarzani, Naser; Diez-Simon, Carmen; Großmann, Justus L; Jacobs, Doris M; van Doorn, Rudi; van den Berg, Marco A; Smilde, Age K; Mumm, Roland; Hall, Robert D; Westerhuis, Johan A.

Metabolomics ; 17(9): 77, 2021 08 25.

Artículo en Inglés | MEDLINE | ID: mdl-34435244

RESUMEN

INTRODUCTION: The relationship between the chemical composition of food products and their sensory profile is a complex association confronting many challenges. However, new untargeted methodologies are helping correlate metabolites with sensory characteristics in a simpler manner. Nevertheless, in the pilot phase of a project, where only a small set of products are used to explore the relationships, choices have to be made about the most appropriate untargeted metabolomics methodology. OBJECTIVE: To provide a framework for selecting a metabolite-sensory methodology based on: the quality of measurements, the relevance of the detected metabolites in terms of distinguishing between products or in terms of whether they can be related to the sensory attributes of the products. METHODS: In this paper we introduce a systematic approach to explore all these different aspects driving the choice for the most appropriate metabolomics method. RESULTS: As an example we have used a tomato soup project where the choice between two sampling methods (SPME and SBSE) had to be made. The results are not always consistently pointing to the same method as being the best. SPME was able to detect metabolites with a better precision, SBSE seemed to be able to provide a better distinction between the soups. CONCLUSION: The three levels of comparison provide information on how the methods could perform in a follow up study and will help the researcher to make a final selection for the most appropriate method based on their strengths and weaknesses.

Asunto(s)

Metabolómica , Estudios de Seguimiento

9.

Increased comparability between RNA-Seq and microarray data by utilization of gene sets.

van der Kloet, Frans M; Buurmans, Jeroen; Jonker, Martijs J; Smilde, Age K; Westerhuis, Johan A.

PLoS Comput Biol ; 16(9): e1008295, 2020 09.

Artículo en Inglés | MEDLINE | ID: mdl-32997685

RESUMEN

The field of transcriptomics uses and measures mRNA as a proxy of gene expression. There are currently two major platforms in use for quantifying mRNA, microarray and RNA-Seq. Many comparative studies have shown that their results are not always consistent. In this study we aim to find a robust method to increase comparability of both platforms enabling data analysis of merged data from both platforms. We transformed high dimensional transcriptomics data from two different platforms into a lower dimensional, and biologically relevant dataset by calculating enrichment scores based on gene set collections for all samples. We compared the similarity between data from both platforms based on the raw data and on the enrichment scores. We show that the performed data transforms the data in a biologically relevant way and filters out noise which leads to increased platform concordance. We validate the procedure using predictive models built with microarray based enrichment scores to predict subtypes of breast cancer using enrichment scores based on sequenced data. Although microarray and RNA-Seq expression levels might appear different, transforming them into biologically relevant gene set enrichment scores significantly increases their correlation, which is a step forward in data integration of the two platforms. The gene set collections were shown to contain biologically relevant gene sets. More in-depth investigation on the effect of the composition, size, and number of gene sets that are used for the transformation is suggested for future research.

Asunto(s)

Bases de Datos Genéticas , Perfilación de la Expresión Génica/métodos , Análisis de Secuencia por Matrices de Oligonucleótidos , RNA-Seq , Neoplasias de la Mama/genética , Neoplasias de la Mama/metabolismo , Femenino , Humanos , Reproducibilidad de los Resultados , Transcriptoma/genética

10.

Unraveling VEALYL Amyloid Formation Using Advanced Vibrational Spectroscopy and Microscopy.

Roeters, Steven J; Sawall, Mathias; Eskildsen, Carl E; Panman, Matthijs R; Tordai, Gergely; Koeman, Mike; Neymeyr, Klaus; Jansen, Jeroen; Smilde, Age K; Woutersen, Sander.

Biophys J ; 119(1): 87-98, 2020 07 07.

Artículo en Inglés | MEDLINE | ID: mdl-32562617

RESUMEN

Intermediate species are hypothesized to play an important role in the toxicity of amyloid formation, a process associated with many diseases. This process can be monitored with conventional and two-dimensional infrared spectroscopy, vibrational circular dichroism, and optical and electron microscopy. Here, we present how combining these techniques provides insight into the aggregation of the hexapeptide VEALYL (Val-Glu-Ala-Leu-Tyr-Leu), the B-chain residue 12-17 segment of insulin that forms amyloid fibrils (intermolecularly hydrogen-bonded ß-sheets) when the pH is lowered below 4. Under such circumstances, the aggregation commences after approximately an hour and continues to develop over a period of weeks. Singular value decompositions of one-dimensional and two-dimensional infrared spectroscopy spectra indicate that intermediate species are formed during the aggregation process. Multivariate curve resolution analyses of the one and two-dimensional infrared spectroscopy data show that the intermediates are more fibrillar and deprotonated than the monomers, whereas they are less ordered than the final fibrillar structure that is slowly formed from the intermediates. A comparison between the vibrational circular dichroism spectra and the scanning transmission electron microscopy and optical microscope images shows that the formation of mature fibrils of VEALYL correlates with the appearance of spherulites that are on the order of several micrometers, which give rise to a "giant" vibrational circular dichroism effect.

Asunto(s)

Amiloide , Microscopía , Dicroismo Circular , Conformación Proteica en Lámina beta , Espectroscopía Infrarroja por Transformada de Fourier , Vibración

11.

Numerical Representations of Metabolic Systems.

Smilde, Age K; Hankemeier, Thomas.

Anal Chem ; 92(20): 13614-13621, 2020 10 20.

Artículo en Inglés | MEDLINE | ID: mdl-32991165

RESUMEN

Metabolomics is becoming a mature part of analytical chemistry as evidenced by the growing number of publications and attendees of international conferences dedicated to this topic. Yet, a systematic treatment of the fundamental structure and properties of metabolomics data is lagging behind. We want to fill this gap by introducing two fundamental theories concerning metabolomics data: data theory and measurement theory. Our approach is to ask simple questions, the answers of which require applying these theories to metabolomics. We show that we can distinguish at least four different levels of metabolomics data with different properties and warn against confusing data with numbers. This treatment provides a theoretical underpinning for preprocessing and postprocessing methods in metabolomics and also argues for a proper match between type of metabolomics data and the biological question to be answered. The approach can be extended to other omics measurements such as proteomics and is thus of relevance for a large analytical chemistry community.

Asunto(s)

Metabolómica/métodos , Modelos Teóricos , Cromatografía de Gases , Cromatografía Liquida , Análisis Discriminante , Análisis de los Mínimos Cuadrados , Espectroscopía de Resonancia Magnética , Espectrometría de Masas , Análisis de Componente Principal

12.

Modeling adaptive response profiles in a vaccine clinical trial.

Hasdemir, Dicle; van den Berg, Robert A; van Kampen, Antoine; Smilde, Age K.

BMC Med Res Methodol ; 20(1): 191, 2020 07 16.

Artículo en Inglés | MEDLINE | ID: mdl-32677968

RESUMEN

BACKGROUND: Vaccine clinical studies typically provide time-resolved data on adaptive response read-outs in response to the administration of that particular vaccine to a cohort of individuals. However, modeling such data is challenged by the properties of these time-resolved profiles such as non-linearity, scarcity of measurement points, scheduling of the vaccine at multiple time points. Linear Mixed Models (LMM) are often used for the analysis of longitudinal data but their use in these time-resolved immunological data is not common yet. Apart from the modeling challenges mentioned earlier, selection of the optimal model by using information-criterion-based measures is far from being straight-forward. The aim of this study is to provide guidelines for the application and selection of LMMs that deal with the challenging characteristics of the typical data sets in the field of vaccine clinical studies. METHODS: We used antibody measurements in response to Hepatitis-B vaccine with five different adjuvant formulations for demonstration purposes. We built piecewise-linear, piecewise-quadratic and cubic models with transformations of the axes with pre-selected or optimized knot locations where time is a numerical variable. We also investigated models where time is categorical and random effects are shared intercepts between different measurement points. We compared all models by using Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Deviance Information Criterion (DIC), variations of conditional AIC and by visual inspection of the model fit in the light of prior biological information. RESULTS: There are various ways of dealing with the challenges of the data which have their own advantages and disadvantages. We explain these in detail here. Traditional information-criteria-based measures work well for the coarse selection of the model structure and complexity, however are not efficient at fine tuning of the complexity level of the random effects. CONCLUSIONS: We show that common statistical measures for optimal model complexity are not sufficient. Rather, explicitly accounting for model purpose and biological interpretation is needed to arrive at relevant models. TRIAL REGISTRATION: Clinical trial registration number for this study: NCT00805389, date of registration: December 9, 2008 (pro-active registration).

Asunto(s)

Teorema de Bayes , Humanos

13.

iTOP: inferring the topology of omics data.

Aben, Nanne; Westerhuis, Johan A; Song, Yipeng; Kiers, Henk A L; Michaut, Magali; Smilde, Age K; Wessels, Lodewyk F A.

Bioinformatics ; 34(17): i988-i996, 2018 09 01.

Artículo en Inglés | MEDLINE | ID: mdl-30423084

RESUMEN

Motivation: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn, influence drug response. Such relationships can strongly affect analyses performed on the data, as we have previously shown for the identification of biomarkers of drug response. Therefore, it is important to be able to chart the relationships between datasets. Results: We present iTOP, a methodology to infer a topology of relationships between datasets. We base this methodology on the RV coefficient, a measure of matrix correlation, which can be used to determine how much information is shared between two datasets. We extended the RV coefficient for partial matrix correlations, which allows the use of graph reconstruction algorithms, such as the PC algorithm, to infer the topologies. In addition, since multi-omics data often contain binary data (e.g. mutations), we also extended the RV coefficient for binary data. Applying iTOP to pharmacogenomics data, we found that gene expression acts as a mediator between most other datasets and drug response: only proteomics clearly shares information with drug response that is not present in gene expression. Based on this result, we used TANDEM, a method for drug response prediction, to identify which variables predictive of drug response were distinct to either gene expression or proteomics. Availability and implementation: An implementation of our methodology is available in the R package iTOP on CRAN. Additionally, an R Markdown document with code to reproduce all figures is provided as Supplementary Material. Supplementary information: Supplementary data are available at Bioinformatics online.

Asunto(s)

Proteómica , Algoritmos , Humanos , Neoplasias/genética

14.

Common and distinct variation in data fusion of designed experimental data.

Alinaghi, Masoumeh; Bertram, Hanne Christine; Brunse, Anders; Smilde, Age K; Westerhuis, Johan A.

Metabolomics ; 16(1): 2, 2019 12 03.

Artículo en Inglés | MEDLINE | ID: mdl-31797165

RESUMEN

INTRODUCTION: Integrative analysis of multiple data sets can provide complementary information about the studied biological system. However, data fusion of multiple biological data sets can be complicated as data sets might contain different sources of variation due to underlying experimental factors. Therefore, taking the experimental design of data sets into account could be of importance in data fusion concept. OBJECTIVES: In the present work, we aim to incorporate the experimental design information in the integrative analysis of multiple designed data sets. METHODS: Here we describe penalized exponential ANOVA simultaneous component analysis (PE-ASCA), a new method for integrative analysis of data sets from multiple compartments or analytical platforms with the same underlying experimental design. RESULTS: Using two simulated cases, the result of simultaneous component analysis (SCA), penalized exponential simultaneous component analysis (P-ESCA) and ANOVA-simultaneous component analysis (ASCA) are compared with the proposed method. Furthermore, real metabolomics data obtained from NMR analysis of two different brains tissues (hypothalamus and midbrain) from the same piglets with an underlying experimental design is investigated by PE-ASCA. CONCLUSIONS: This method provides an improved understanding of the common and distinct variation in response to different experimental factors.

Asunto(s)

Metabolómica , Proyectos de Investigación , Algoritmos , Animales , Hipotálamo/metabolismo , Mesencéfalo/metabolismo , Resonancia Magnética Nuclear Biomolecular , Análisis de Componente Principal , Porcinos

15.

Repeatability and reproducibility of lipoprotein particle profile measurements in plasma samples by ultracentrifugation.

Monsonis-Centelles, Sandra; Hoefsloot, Huub C J; Engelsen, Søren B; Smilde, Age K; Lind, Mads V.

Clin Chem Lab Med ; 58(1): 103-115, 2019 Dec 18.

Artículo en Inglés | MEDLINE | ID: mdl-31553695

RESUMEN

Background Characterization of lipoprotein particle profiles (LPPs) (including main classes and subclasses) by means of ultracentrifugation (UC) is highly requested given its clinical potential. However, rapid methods are required to replace the very labor-intensive UC method and one solution is to calibrate rapid nuclear magnetic resonance (NMR)-based prediction models, but the reliability of the UC-response method required for the NMR calibration has been largely overlooked. Methods This study provides a comprehensive repeatability and reproducibility study of various UC-based lipid measurements (cholesterol, triglycerides [TGs], free cholesterol, phospholipids, apolipoprotein [apo]A1 and apoB) in different main classes and subclasses of 25 duplicated fresh plasma samples and of 42 quality control (QC) frozen pooled plasma samples of healthy individuals. Results Cholesterol, apoA1 and apoB measurements were very repeatable in all classes (intraclass correlation coefficient [ICC]: 92.93%-99.54%). Free cholesterol and phospholipid concentrations in main classes and subclasses and TG concentrations in high-density lipoproteins (HDL), HDL subclasses and low-density lipoproteins (LDL) subclasses, showed worse repeatability (ICC: 19.21%-99.08%) attributable to low concentrations, variability introduced during UC and assay limitations. On frozen QC samples, the reproducibility of cholesterol, apoA1 and apoB concentrations was found to be better than for the free cholesterol, phospholipids and TGs concentrations. Conclusions This study shows that for LPPs measurements near or below the limit of detection (LOD) in some of the subclasses, as well as the use of frozen samples, results in worsened repeatability and reproducibility. Furthermore, we show that the analytical assay coupled to UC for free cholesterol and phospholipids have different repeatability and reproducibility. All of this needs to be taken into account when calibrating future NMR-based models.

Asunto(s)

Análisis Químico de la Sangre/métodos , Lipoproteínas/sangre , Lipoproteínas/aislamiento & purificación , Ultracentrifugación/métodos , Colorimetría , Femenino , Congelación , Humanos , Lipoproteínas/química , Masculino , Reproducibilidad de los Resultados , Adulto Joven

16.

Toward Reliable Lipoprotein Particle Predictions from NMR Spectra of Human Blood: An Interlaboratory Ring Test.

Monsonis Centelles, Sandra; Hoefsloot, Huub C J; Khakimov, Bekzod; Ebrahimi, Parvaneh; Lind, Mads V; Kristensen, Mette; de Roo, Niels; Jacobs, Doris M; van Duynhoven, John; Cannet, Claire; Fang, Fang; Humpfer, Eberhard; Schäfer, Hartmut; Spraul, Manfred; Engelsen, Søren B; Smilde, Age K.

Anal Chem ; 89(15): 8004-8012, 2017 08 01.

Artículo en Inglés | MEDLINE | ID: mdl-28692288

RESUMEN

Lipoprotein profiling of human blood by 1H nuclear magnetic resonance (NMR) spectroscopy is a rapid and promising approach to monitor health and disease states in medicine and nutrition. However, lack of standardization of measurement protocols has prevented the use of NMR-based lipoprotein profiling in metastudies. In this study, a standardized NMR measurement protocol was applied in a ring test performed across three different laboratories in Europe on plasma and serum samples from 28 individuals. Data was evaluated in terms of (i) spectral differences, (ii) differences in LPD predictions obtained using an existing prediction model, and (iii) agreement of predictions with cholesterol concentrations in high- and low-density lipoproteins (HDL and LDL) particles measured by standardized clinical assays. ANOVA-simultaneous component analysis (ASCA) of the ring test spectral ensemble that contains methylene and methyl peaks (1.4-0.6 ppm) showed that 97.99% of the variance in the data is related to subject, 1.62% to sample type (serum or plasma), and 0.39% to laboratory. This interlaboratory variation is in fact smaller than the maximum acceptable intralaboratory variation on quality control samples. It is also shown that the reproducibility between laboratories is good enough for the LPD predictions to be exchangeable when the standardized NMR measurement protocol is followed. With the successful implementation of this protocol, which results in reproducible prediction of lipoprotein distributions across laboratories, a step is taken toward bringing NMR more into scope of prognostic and diagnostic biomarkers, reducing the need for less efficient methods such as ultracentrifugation or high-performance liquid chromatography (HPLC).

Asunto(s)

Lipoproteínas HDL/sangre , Lipoproteínas LDL/sangre , Espectroscopía de Protones por Resonancia Magnética , Adulto , Femenino , Humanos , Laboratorios/normas , Análisis de los Mínimos Cuadrados , Lipoproteínas VLDL/sangre , Embarazo , Análisis de Componente Principal , Espectroscopía de Protones por Resonancia Magnética/normas , Adulto Joven

17.

Separating common from distinctive variation.

van der Kloet, Frans M; Sebastián-León, Patricia; Conesa, Ana; Smilde, Age K; Westerhuis, Johan A.

BMC Bioinformatics ; 17 Suppl 5: 195, 2016 Jun 06.

Artículo en Inglés | MEDLINE | ID: mdl-27294690

RESUMEN

BACKGROUND: Joint and individual variation explained (JIVE), distinct and common simultaneous component analysis (DISCO) and O2-PLS, a two-block (X-Y) latent variable regression method with an integral OSC filter can all be used for the integrated analysis of multiple data sets and decompose them in three terms: a low(er)-rank approximation capturing common variation across data sets, low(er)-rank approximations for structured variation distinctive for each data set, and residual noise. In this paper these three methods are compared with respect to their mathematical properties and their respective ways of defining common and distinctive variation. RESULTS: The methods are all applied on simulated data and mRNA and miRNA data-sets from GlioBlastoma Multiform (GBM) brain tumors to examine their overlap and differences. When the common variation is abundant, all methods are able to find the correct solution. With real data however, complexities in the data are treated differently by the three methods. CONCLUSIONS: All three methods have their own approach to estimate common and distinctive variation with their specific strength and weaknesses. Due to their orthogonality properties and their used algorithms their view on the data is slightly different. By assuming orthogonality between common and distinctive, true natural or biological phenomena that may not be orthogonal at all might be misinterpreted.

Asunto(s)

Algoritmos , Neoplasias Encefálicas/genética , Neoplasias Encefálicas/metabolismo , Neoplasias Encefálicas/patología , Glioblastoma/genética , Glioblastoma/metabolismo , Glioblastoma/patología , Humanos , MicroARNs/metabolismo , Análisis de Componente Principal , ARN Mensajero/metabolismo

18.

The Muscle Metabolome Differs between Healthy and Frail Older Adults.

Fazelzadeh, Parastoo; Hangelbroek, Roland W J; Tieland, Michael; de Groot, Lisette C P G M; Verdijk, Lex B; van Loon, Luc J C; Smilde, Age K; Alves, Rodrigo D A M; Vervoort, Jacques; Müller, Michael; van Duynhoven, John P M; Boekschoten, Mark V.

J Proteome Res ; 15(2): 499-509, 2016 Feb 05.

Artículo en Inglés | MEDLINE | ID: mdl-26732810

RESUMEN

Populations around the world are aging rapidly. Age-related loss of physiological functions negatively affects quality of life. A major contributor to the frailty syndrome of aging is loss of skeletal muscle. In this study we assessed the skeletal muscle biopsy metabolome of healthy young, healthy older and frail older subjects to determine the effect of age and frailty on the metabolic signature of skeletal muscle tissue. In addition, the effects of prolonged whole-body resistance-type exercise training on the muscle metabolome of older subjects were examined. The baseline metabolome was measured in muscle biopsies collected from 30 young, 66 healthy older subjects, and 43 frail older subjects. Follow-up samples from frail older (24 samples) and healthy older subjects (38 samples) were collected after 6 months of prolonged resistance-type exercise training. Young subjects were included as a reference group. Primary differences in skeletal muscle metabolite levels between young and healthy older subjects were related to mitochondrial function, muscle fiber type, and tissue turnover. Similar differences were observed when comparing frail older subjects with healthy older subjects at baseline. Prolonged resistance-type exercise training resulted in an adaptive response of amino acid metabolism, especially reflected in branched chain amino acids and genes related to tissue remodeling. The effect of exercise training on branched-chain amino acid-derived acylcarnitines in older subjects points to a downward shift in branched-chain amino acid catabolism upon training. We observed only modest correlations between muscle and plasma metabolite levels, which pleads against the use of plasma metabolites as a direct read-out of muscle metabolism and stresses the need for direct assessment of metabolites in muscle tissue biopsies.

Asunto(s)

Anciano Frágil , Metaboloma , Metabolómica/métodos , Músculo Esquelético/metabolismo , Anciano , Anciano de 80 o más Años , Aminoácidos/metabolismo , Análisis de Varianza , Ácidos Carboxílicos/metabolismo , Ejercicio Físico , Femenino , Cromatografía de Gases y Espectrometría de Masas , Humanos , Masculino , Análisis de Componente Principal , Adulto Joven

19.

A critical assessment of feature selection methods for biomarker discovery in clinical proteomics.

Christin, Christin; Hoefsloot, Huub C J; Smilde, Age K; Hoekman, B; Suits, Frank; Bischoff, Rainer; Horvatovich, Peter.

Mol Cell Proteomics ; 12(1): 263-76, 2013 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-23115301

RESUMEN

In this paper, we compare the performance of six different feature selection methods for LC-MS-based proteomics and metabolomics biomarker discovery-t test, the Mann-Whitney-Wilcoxon test (mww test), nearest shrunken centroid (NSC), linear support vector machine-recursive features elimination (SVM-RFE), principal component discriminant analysis (PCDA), and partial least squares discriminant analysis (PLSDA)-using human urine and porcine cerebrospinal fluid samples that were spiked with a range of peptides at different concentration levels. The ideal feature selection method should select the complete list of discriminating features that are related to the spiked peptides without selecting unrelated features. Whereas many studies have to rely on classification error to judge the reliability of the selected biomarker candidates, we assessed the accuracy of selection directly from the list of spiked peptides. The feature selection methods were applied to data sets with different sample sizes and extents of sample class separation determined by the concentration level of spiked compounds. For each feature selection method and data set, the performance for selecting a set of features related to spiked compounds was assessed using the harmonic mean of the recall and the precision (f-score) and the geometric mean of the recall and the true negative rate (g-score). We conclude that the univariate t test and the mww test with multiple testing corrections are not applicable to data sets with small sample sizes (n = 6), but their performance improves markedly with increasing sample size up to a point (n > 12) at which they outperform the other methods. PCDA and PLSDA select small feature sets with high precision but miss many true positive features related to the spiked peptides. NSC strikes a reasonable compromise between recall and precision for all data sets independent of spiking level and number of samples. Linear SVM-RFE performs poorly for selecting features related to the spiked compounds, even though the classification error is relatively low.

Asunto(s)

Metabolómica/métodos , Péptidos/líquido cefalorraquídeo , Péptidos/orina , Proteómica/métodos , Animales , Biomarcadores/análisis , Cromatografía Liquida , Biología Computacional , Interpretación Estadística de Datos , Perfilación de la Expresión Génica , Humanos , Espectrometría de Masas , Análisis de Secuencia por Matrices de Oligonucleótidos , Reconocimiento de Normas Patrones Automatizadas , Análisis de Componente Principal , Porcinos

20.

Identifying inhibitory compounds in lignocellulosic biomass hydrolysates using an exometabolomics approach.

Zha, Ying; Westerhuis, Johan A; Muilwijk, Bas; Overkamp, Karin M; Nijmeijer, Bernadien M; Coulier, Leon; Smilde, Age K; Punt, Peter J.

BMC Biotechnol ; 14: 22, 2014 Mar 21.

Artículo en Inglés | MEDLINE | ID: mdl-24655423

RESUMEN

BACKGROUND: Inhibitors are formed that reduce the fermentation performance of fermenting yeast during the pretreatment process of lignocellulosic biomass. An exometabolomics approach was applied to systematically identify inhibitors in lignocellulosic biomass hydrolysates. RESULTS: We studied the composition and fermentability of 24 different biomass hydrolysates. To create diversity, the 24 hydrolysates were prepared from six different biomass types, namely sugar cane bagasse, corn stover, wheat straw, barley straw, willow wood chips and oak sawdust, and with four different pretreatment methods, i.e. dilute acid, mild alkaline, alkaline/peracetic acid and concentrated acid. Their composition and that of fermentation samples generated with these hydrolysates were analyzed with two GC-MS methods. Either ethyl acetate extraction or ethyl chloroformate derivatization was used before conducting GC-MS to prevent sugars are overloaded in the chromatograms, which obscure the detection of less abundant compounds. Using multivariate PLS-2CV and nPLS-2CV data analysis models, potential inhibitors were identified through establishing relationship between fermentability and composition of the hydrolysates. These identified compounds were tested for their effects on the growth of the model yeast, Saccharomyces. cerevisiae CEN.PK 113-7D, confirming that the majority of the identified compounds were indeed inhibitors. CONCLUSION: Inhibitory compounds in lignocellulosic biomass hydrolysates were successfully identified using a non-targeted systematic approach: metabolomics. The identified inhibitors include both known ones, such as furfural, HMF and vanillin, and novel inhibitors, namely sorbic acid and phenylacetaldehyde.

Asunto(s)

Biomasa , Fermentación , Lignina/química , Saccharomyces cerevisiae/crecimiento & desarrollo , Celulosa/química , Flavonas/química , Furaldehído/química , Hordeum/química , Metabolómica , Modelos Estadísticos , Tallos de la Planta/química , Salix/química , Triticum/química , Madera/química , Zea mays/química

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA