Search | BVS (Virtual Health Library) Search Portal

1.

Chemometrics for ion mobility spectrometry data: recent advances and future prospects.

Szymanska, Ewa; Davies, Antony N; Buydens, Lutgarde M C.

Analyst ; 141(20): 5689-5708, 2016 Oct 21.

Article in English | MEDLINE | ID: mdl-27549384

ABSTRACT

Historically, advances in the field of ion mobility spectrometry have been hindered by the variation in measured signals between instruments developed by different research laboratories or manufacturers. This has triggered the development and application of chemometric techniques able to reveal and analyze the precious information content of ion mobility spectra. Recent advances in multidimensional coupling of ion mobility spectrometry to chromatography and mass spectrometry have created new, unique challenges for data processing, yielding high-dimensional, megavariate datasets. In this paper, a complete overview of available chemometric techniques used in the analysis of ion mobility spectrometry data is given. We describe the current state of the art in ion mobility spectrometry data analysis, comprising datasets with different complexities and two different scopes of data analysis, i.e., targeted and non-targeted analyte analyses. Two main steps of data analysis are considered: data preprocessing and pattern recognition. A detailed description of recent advances in chemometric techniques is provided for these steps, together with a list of interesting applications. We demonstrate that chemometric techniques have contributed significantly to the recent and great expansion of ion mobility spectrometry technology into different application fields. We conclude that well-thought-out, comprehensive data analysis strategies are currently emerging, including several chemometric techniques and addressing different data challenges. In our opinion, this trend will continue in the near future, stimulating developments in ion mobility spectrometry instrumentation even further.

2.

Data size reduction strategy for the classification of breath and air samples using multicapillary column-ion mobility spectrometry.

Szymanska, Ewa; Brodrick, Emma; Williams, Mark; Davies, Antony N; van Manen, Henk-Jan; Buydens, Lutgarde M C.

Anal Chem ; 87(2): 869-75, 2015 Jan 20.

Article in English | MEDLINE | ID: mdl-25519893

ABSTRACT

Ion mobility spectrometry combined with multicapillary column separation (MCC-IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in gaseous samples. Because of the large data size, processing of MCC-IMS spectra is still the main bottleneck of data analysis, and there is an increasing need for data analysis strategies in which the size of MCC-IMS data is reduced to enable further analysis. In our study, the first untargeted chemometric strategy is developed and employed in the analysis of MCC-IMS spectra from 264 breath and ambient air samples. This strategy does not comprise identification of compounds as a primary step but includes several preprocessing steps and a discriminant analysis. Data size is significantly reduced in three steps. Wavelet transform, mask construction, and sparse-partial least squares-discriminant analysis (s-PLS-DA) reduce the data to as few as 50 variables relevant to the goal of analysis. The influence and compatibility of the data reduction tools are studied by applying different settings of the developed strategy. Loss of information after preprocessing is evaluated, e.g., by comparing the performance of classification models for different classes of samples. Finally, the interpretability of the classification models is evaluated, and regions of spectra that are related to the identification of potential analytical biomarkers are successfully determined. This work will greatly enable the standardization of analytical procedures across different instrumentation types, promoting the adoption of MCC-IMS technology in a wide range of diverse application fields.
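
The first two reduction steps named in the abstract above (wavelet transform, then mask construction) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which wavelet or which mask rule was used, so the Haar wavelet, the threshold rule, the function names, and the toy numbers are all hypothetical. The s-PLS-DA step is not shown.

```python
import math

def haar_approximation(signal, levels=1):
    """Return Haar approximation coefficients (assumed wavelet, not the paper's)."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:               # pad odd lengths by repeating the last point
            approx.append(approx[-1])
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx

def build_mask(spectra, threshold):
    """Hypothetical mask rule: keep a variable if any spectrum exceeds the threshold."""
    return [any(column_value > threshold for column_value in column)
            for column in zip(*spectra)]

def apply_mask(spectrum, mask):
    """Drop the variables that the mask marks as uninformative."""
    return [value for value, keep in zip(spectrum, mask) if keep]

# Toy spectra with made-up intensities: two samples, eight points each.
raw = [
    [0.0, 0.1, 5.0, 5.2, 0.1, 0.0, 0.2, 0.1],
    [0.1, 0.0, 0.2, 0.1, 3.9, 4.1, 0.0, 0.1],
]
compressed = [haar_approximation(s, levels=1) for s in raw]   # 8 points -> 4
mask = build_mask(compressed, threshold=1.0)
reduced = [apply_mask(s, mask) for s in compressed]           # 4 points -> 2
```

Each Haar level halves the number of variables, and the mask then discards positions where no sample shows signal, so the two steps compound.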

3.

Comprehensive Data Scientific Procedure for Enhanced Analysis and Interpretation of Real-Time Breath Measurements in In Vivo Aroma-Release Studies.

Szymanska, Ewa; Brown, Phil A; Ziere, Aldo; Martins, Sara; Batenburg, Max; Harren, Frans J M; Buydens, Lutgarde M C.

Anal Chem ; 87(20): 10338-45, 2015 Oct 20.

Article in English | MEDLINE | ID: mdl-26398529

ABSTRACT

Real-time measurements of many low-abundance volatile organic compounds (VOCs) in breath and air samples are already feasible due to progress in analytical technologies, such as proton transfer reaction mass spectrometry (PTR-MS). Nevertheless, the information content of real-time measurements is not fully exploited, due to the lack of suitable data handling methods. This study develops a data scientific procedure to enhance data analysis and interpretation of longitudinal, multivariate data sets from real-time, in vivo, aroma-release studies. The developed procedure includes an automated data preprocessing and a multivariate assessment of the test panel performance. A large multifactorial PTR-MS data set is investigated that includes four experimental protocols, two tested food products, four aroma compounds, and eight panelists. Real-time measurements are converted into standardized breath profiles by preprocessing, and 10 kinetic parameters are derived. Next to this, panel performance is evaluated per experimental protocol and food product. Comprehensive information about panel performance, individual panelists, studied products, aroma compounds, and kinetic parameters is extracted, demonstrating the great value of the developed approach.

4.

Simple and Effective Way for Data Preprocessing Selection Based on Design of Experiments.

Gerretzen, Jan; Szymanska, Ewa; Jansen, Jeroen J; Bart, Jacob; van Manen, Henk-Jan; van den Heuvel, Edwin R; Buydens, Lutgarde M C.

Anal Chem ; 87(24): 12096-103, 2015 Dec 15.

Article in English | MEDLINE | ID: mdl-26632985

ABSTRACT

The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50% compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective.
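
The design-of-experiments idea in the abstract above can be sketched as a two-level full factorial over preprocessing steps, followed by a main-effects calculation. This is a minimal sketch, not the authors' procedure: the three factor names come from the examples in the abstract, but `toy_error` is a made-up stand-in for a real cross-validated model error, and interactions are not computed here.

```python
from itertools import product
from statistics import mean

FACTORS = ["baseline", "smoothing", "alignment"]

def full_factorial(factors):
    """All on/off combinations; each dict is one preprocessing strategy."""
    return [dict(zip(factors, levels))
            for levels in product([False, True], repeat=len(factors))]

def main_effects(design, responses):
    """Mean response with a step switched on minus the mean with it off."""
    effects = {}
    for factor in design[0]:
        on = [r for run, r in zip(design, responses) if run[factor]]
        off = [r for run, r in zip(design, responses) if not run[factor]]
        effects[factor] = mean(on) - mean(off)
    return effects

def toy_error(run):
    """Hypothetical model error for a strategy (illustrative numbers only)."""
    error = 10.0
    if run["baseline"]:
        error -= 3.0
    if run["smoothing"]:
        error -= 1.0
    if run["alignment"]:
        error += 0.5
    return error

design = full_factorial(FACTORS)
responses = [toy_error(run) for run in design]
effects = main_effects(design, responses)
best_run = min(design, key=toy_error)   # strategy with the lowest toy error
```

A negative main effect means that switching the step on lowers the error on average, which is the kind of reading the abstract refers to when it mentions interpreting main effects.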

5.

Safety assessment of plant varieties using transcriptomics profiling and a one-class classifier.

van Dijk, Jeroen P; de Mello, Carla Souza; Voorhuijzen, Marleen M; Hutten, Ronald C B; Arisi, Ana Carolina Maisonnave; Jansen, Jeroen J; Buydens, Lutgarde M C; van der Voet, Hilko; Kok, Esther J.

Regul Toxicol Pharmacol ; 70(1): 297-303, 2014 Oct.

Article in English | MEDLINE | ID: mdl-25046166

ABSTRACT

An important part of the current hazard identification of novel plant varieties is comparative targeted analysis of the novel and reference varieties. Comparative analysis will become much more informative with unbiased analytical approaches, e.g., omics profiling. Data analysis estimating the similarity of new varieties to a reference baseline class of known safe varieties would subsequently greatly facilitate hazard identification. Further biological and eventually toxicological analysis would then only be necessary for varieties that fall outside this reference class. For this purpose, a one-class classifier tool was explored to assess and classify transcriptome profiles of potato (Solanum tuberosum) varieties in a model study. Profiles of six different varieties, two locations of growth, and two years of harvest, including biological and technical replication, were used to build the model. Two scenarios were applied, representing evaluation of a 'different' variety and a 'similar' variety. Within the model, higher class distances resulted for the 'different' test set compared with the 'similar' test set. The present study may contribute to a more global hazard identification of novel plant varieties.

Subjects

Gene Expression Profiling; Models, Theoretical; Plants, Genetically Modified/toxicity; Solanum tuberosum/genetics; Transcriptome

6.

Linear Mixed-Effects Models in chemistry: A tutorial.

Carnoli, Andrea Junior; Lohuis, Petra Oude; Buydens, Lutgarde M C; Tinnevelt, Gerjen H; Jansen, Jeroen J.

Anal Chim Acta ; 1304: 342444, 2024 May 22.

Article in English | MEDLINE | ID: mdl-38637030

ABSTRACT

A common goal in chemistry is to study the relationship between a measured signal and the variability of certain factors. To this end, researchers often use Design of Experiments to decide which experiments to conduct, and (Multiple) Linear Regression and/or Analysis of Variance to analyze the collected data. A foundational assumption of this strategy is that all the experiments are independent, conditional on the settings of the factors. Unfortunately, due to the presence of uncontrollable factors, real-life experiments often deviate from this assumption, making the data analysis results unreliable. In these cases, Mixed-Effects modeling, despite not being widely used in chemometrics, represents a solid data analysis framework to obtain reliable results. Here we provide a tutorial for Linear Mixed-Effects models. We gently introduce the reader to these models by showing some motivating examples. Then, we discuss the theory behind Linear Mixed-Effects models, and we show how to fit these models by making use of real-life data obtained from an exposome study. Throughout the paper we provide R code so that each researcher is able to implement these useful models themselves.
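
For orientation, the linear mixed-effects model discussed in the abstract above is usually written in the following general textbook form. The notation is the common convention and is an assumption here; it is not taken from the paper itself.

```latex
% y: n x 1 vector of measured responses
% X: design matrix of the fixed effects, with coefficient vector \beta
% Z: design matrix of the random effects, with random coefficient vector b
\begin{aligned}
  y &= X\beta + Zb + \varepsilon, \\
  b &\sim \mathcal{N}(0,\, G), \qquad
  \varepsilon \sim \mathcal{N}(0,\, \sigma^{2} I), \qquad
  b \text{ independent of } \varepsilon, \\
  y &\sim \mathcal{N}\!\left(X\beta,\; Z G Z^{\top} + \sigma^{2} I\right).
\end{aligned}
```

The last line shows why the independence assumption of ordinary regression fails: observations that share a random effect (for example, the same batch or day) are correlated through the term $Z G Z^{\top}$.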

7.

Predictive-property-ranked variable reduction with final complexity adapted models in partial least squares modeling for multiple responses.

Andries, Jan P M; Heyden, Yvan Vander; Buydens, Lutgarde M C.

Anal Chem ; 85(11): 5444-53, 2013 Jun 04.

Article in English | MEDLINE | ID: mdl-23679857

ABSTRACT

For partial least-squares regression with one response (PLS1), many variable-reduction methods have been developed. However, only a few address the case of multiple-response partial-least-squares (PLS2) modeling. The calibration performance of PLS1 can be improved by elimination of uninformative variables. Many variable-reduction methods are based on various PLS-model-related parameters, called predictor-variable properties. Recently, an important adaptation, in which the model complexity is optimized, was introduced in these methods. This method was called Predictive-Property-Ranked Variable Reduction with Final Complexity Adapted Models, denoted as PPRVR-FCAM or simply FCAM. In this study, variable reduction for PLS2 models, using an adapted FCAM method, FCAM-PLS2, is investigated. The utility and effectiveness of four new predictor-variable properties, derived from the multiple response PLS2 regression coefficients, are studied for six data sets consisting of ultraviolet-visible (UV-vis) spectra, near-infrared (NIR) spectra, NMR spectra, and two simulated sets, one with correlated and one with uncorrelated responses. The four properties include the mean of the absolute values as well as the norm of the PLS2 regression coefficients and their significances. The four properties were found to be applicable by the FCAM-PLS2 method for variable reduction. The predictive abilities of models resulting from the four properties are similar. The norm of the PLS2 regression coefficients has the best selective abilities: low numbers of variables with an informative meaning to the responses are retained. The significance of the mean of the PLS2 regression coefficients is found to be the least-selective property.

8.

A comprehensive full factorial LC-MS/MS proteomics benchmark data set.

Wessels, Hans J C T; Bloemberg, Tom G; van Dael, Maurice; Wehrens, Ron; Buydens, Lutgarde M C; van den Heuvel, Lambert P; Gloerich, Jolein.

Proteomics ; 12(14): 2276-81, 2012 Aug.

Article in English | MEDLINE | ID: mdl-22887946

ABSTRACT

An important prerequisite for the development and benchmarking of novel analysis methods is a well-designed comprehensive LC-MS/MS data set. Here, we present our data set consisting of 59 LC-MS/MS analyses of 50 protein samples extracted individually from Escherichia coli K12 and spiked with different concentrations of bovine carbonic anhydrase II and/or chicken ovalbumin, according to a 2 × 3 full factorial design. Using the well-annotated and commonly used E. coli proteome as the sample background ensures that the complexity of the data is on a par with most current proteomic analyses. Data were acquired over a 2-month period using multiple reversed-phase columns and instrument calibrations to include real-life challenges faced when analyzing large proteomics data sets. Moreover, so-called "ground truth" data, comprising LC-MS/MS measurements of the pure spikes, are included in the data set. The current manuscript describes this comprehensive benchmark data set for future development and evaluation of analysis methods and software.

Subjects

Chromatography, Liquid/methods; Databases, Protein; Proteome/chemistry; Proteomics/methods; Tandem Mass Spectrometry/methods; Animals; Carbonic Anhydrase II/chemistry; Cattle; Chickens; Escherichia coli Proteins/chemistry; Ovalbumin/chemistry; Peptide Fragments/chemistry

9.

A phase and frequency alignment protocol for 1H MRSI data of the prostate.

Wright, Alan J; Buydens, Lutgarde M C; Heerschap, Arend.

NMR Biomed ; 25(5): 755-65, 2012 May.

Article in English | MEDLINE | ID: mdl-21953616

ABSTRACT

1H MRSI of the prostate reveals relative metabolite levels that vary according to the presence or absence of tumour, providing a sensitive method for the identification of patients with cancer. Current interpretations of prostate data rely on quantification algorithms that fit model metabolite resonances to individual voxel spectra and calculate relative levels of metabolites, such as choline, creatine, citrate and polyamines. Statistical pattern recognition techniques can potentially improve the detection of prostate cancer, but these analyses are hampered by artefacts and sources of noise in the data, such as variations in phase and frequency of resonances. Phase and frequency variations may arise as a result of spatial field gradients or local physiological conditions affecting the frequency of resonances, in particular those of citrate. Thus, there are unique challenges in developing a peak alignment algorithm for these data. We have developed a frequency and phase correction algorithm for automatic alignment of the resonances in prostate MRSI spectra. We demonstrate, with a simulated dataset, that alignment can be achieved to a phase standard deviation of 0.095 rad and a frequency standard deviation of 0.68 Hz for the citrate resonances. Three parameters were used to assess the improvement in peak alignment in the MRSI data of five patients: the percentage of variance in all MRSI spectra explained by their first principal component; the signal-to-noise ratio of a spectrum formed by taking the median value of the entire set at each spectral point; and the mean cross-correlation between all pairs of spectra. These parameters showed a greater similarity between spectra in all five datasets and the simulated data, demonstrating improved alignment for phase and frequency in these spectra. This peak alignment program is expected to improve pattern recognition significantly, enabling accurate detection and localisation of prostate cancer with MRSI.

Subjects

Algorithms; Magnetic Resonance Imaging/methods; Magnetic Resonance Spectroscopy/methods; Prostatic Neoplasms/chemistry; Choline/analysis; Citric Acid/analysis; Computer Simulation; Creatine/analysis; Databases, Factual; Humans; Male; Models, Biological; Pattern Recognition, Automated/methods; Polyamines/analysis; Principal Component Analysis; Prostatic Neoplasms/pathology; Signal Processing, Computer-Assisted; Signal-To-Noise Ratio
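
The alignment-quality parameters listed in the abstract of this entry can be illustrated with a small sketch. It is written under assumptions: Pearson correlation at zero lag stands in for "cross-correlation", the first-component variance is found by power iteration on the Gram matrix of mean-centered spectra, and the toy spectra are made up. The signal-to-noise ratio of the median spectrum is not computed, because the abstract does not define the noise region; only the median spectrum itself is formed.

```python
import math
from itertools import combinations
from statistics import mean, median

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant spectra."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def mean_pairwise_correlation(spectra):
    """Average correlation over all pairs of spectra."""
    return mean(pearson(a, b) for a, b in combinations(spectra, 2))

def median_spectrum(spectra):
    """Median intensity of the whole set at each spectral point."""
    return [median(column) for column in zip(*spectra)]

def pc1_variance_fraction(spectra, iterations=200):
    """Fraction of total variance carried by the first principal component."""
    col_means = [mean(column) for column in zip(*spectra)]
    centered = [[x - m for x, m in zip(row, col_means)] for row in spectra]
    gram = [[sum(p * q for p, q in zip(r1, r2)) for r2 in centered] for r1 in centered]
    total = sum(gram[i][i] for i in range(len(gram)))
    if total == 0:
        return 0.0
    v = [float(i + 1) for i in range(len(gram))]   # arbitrary start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return 0.0
        v = [x / norm for x in w]
        eigenvalue = norm                          # converges to the largest eigenvalue
    return eigenvalue / total

# Toy data (made up): one peak shape, scaled (aligned) or shifted (misaligned).
aligned = [[0, 1, 4, 1, 0], [0, 2, 8, 2, 0], [0, 3, 12, 3, 0]]
misaligned = [[0, 1, 4, 1, 0], [1, 4, 1, 0, 0], [0, 0, 1, 4, 1]]
```

On the toy data, the aligned set scores higher on both similarity measures than the misaligned set, which is the direction of change the abstract reports after alignment.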

10.

Lactate and glycine: potential MR biomarkers of prognosis in estrogen receptor-positive breast cancers.

Giskeødegård, Guro F; Lundgren, Steinar; Sitter, Beathe; Fjøsne, Hans E; Postma, Geert; Buydens, Lutgarde M C; Gribbestad, Ingrid S; Bathen, Tone F.

NMR Biomed ; 25(11): 1271-9, 2012 Nov.

Article in English | MEDLINE | ID: mdl-22407957

ABSTRACT

Breast cancer is a heterogeneous disease with a variable prognosis. Clinical factors provide some information about the prognosis of patients with breast cancer; however, there is a need for additional information to stratify patients for improved and more individualized treatment. The aim of this study was to examine the relationship between the metabolite profiles of breast cancer tissue and 5-year survival. Biopsies from breast cancer patients (n=98) were excised during surgery and analyzed by high-resolution magic angle spinning MRS. The data were analyzed by multivariate principal component analysis and partial least-squares discriminant analysis, and the findings of important metabolites were confirmed by spectral integration of the metabolite peaks. Predictions of 5-year survival using metabolite profiles were compared with predictions using clinical parameters. Based on the metabolite profiles, patients with estrogen receptor (ER)-positive breast cancer (n=71) were separated into two groups with significantly different survival rates (p=0.024). Higher levels of glycine and lactate were found to be associated with lower survival rates by both multivariate analyses and spectral integration, and are suggested as biomarkers for breast cancer prognosis. Similar metabolic differences were not observed for ER-negative patients, where survivors could not be separated from nonsurvivors. Predictions of 5-year survival of ER-positive patients using metabolite profiles gave better and more robust results than those using traditional clinical parameters. The results imply that the metabolic state of a tumor may provide additional information concerning breast cancer prognosis. Further studies should be conducted in order to evaluate the role of MR metabolomics as an additional clinical tool for determining the prognosis of patients with breast cancer.

Subjects

Biomarkers, Tumor/metabolism; Breast Neoplasms/diagnosis; Breast Neoplasms/metabolism; Glycine/metabolism; Lactic Acid/metabolism; Magnetic Resonance Spectroscopy; Receptors, Estrogen/metabolism; Adult; Aged; Aged, 80 and over; Breast Neoplasms/pathology; Cohort Studies; Discriminant Analysis; Female; Humans; Kaplan-Meier Estimate; Least-Squares Analysis; Middle Aged; Principal Component Analysis; Prognosis; ROC Curve

11.

Simultaneous analysis of plasma and CSF by NMR and hierarchical models fusion.

Smolinska, Agnieszka; Posma, Joram M; Blanchet, Lionel; Ampt, Kirsten A M; Attali, Amos; Tuinstra, Tinka; Luider, Theo; Doskocz, Marek; Michiels, Paul J; Girard, Frederic C; Buydens, Lutgarde M C; Wijmenga, Sybren S.

Anal Bioanal Chem ; 403(4): 947-59, 2012 May.

Article in English | MEDLINE | ID: mdl-22395451

ABSTRACT

Because cerebrospinal fluid (CSF) is the biofluid which interacts most closely with the central nervous system, it holds promise as a reporter of neurological disease, for example multiple sclerosis (MScl). To characterize the metabolomics profile of neuroinflammatory aspects of this disease, we studied an animal model of MScl: experimental autoimmune/allergic encephalomyelitis (EAE). Because CSF also exchanges metabolites with blood via the blood-brain barrier, malfunctions occurring in the CNS may be reflected in the biochemical composition of blood plasma. The combination of blood plasma and CSF provides more complete information about the disease. Both biofluids can be studied by use of NMR spectroscopy. It is then necessary to perform combined analysis of the two different datasets. Mid-level data fusion was therefore applied to blood plasma and CSF datasets. First, relevant information was extracted from each biofluid dataset by use of linear support vector machine recursive feature elimination. The selected variables from each dataset were concatenated for joint analysis by partial least squares discriminant analysis (PLS-DA). The combined metabolomics information from plasma and CSF enables more efficient and reliable discrimination of the onset of EAE. Second, we introduced hierarchical models fusion, in which previously developed PLS-DA models are hierarchically combined. We show that this approach enables neuroinflamed rats (even on the day of onset) to be distinguished from either healthy or peripherally inflamed rats. Moreover, progression of EAE can be investigated because the model separates the onset and peak of the disease.

Subjects

Magnetic Resonance Spectroscopy/methods; Multiple Sclerosis/blood; Multiple Sclerosis/cerebrospinal fluid; Animals; Encephalomyelitis, Autoimmune, Experimental/blood; Encephalomyelitis, Autoimmune, Experimental/cerebrospinal fluid; Humans; Male; Metabolomics; Models, Biological; Multiple Sclerosis/diagnosis; Rats; Rats, Inbred Lew

12.

Influence of measurement procedure on the use of a handheld NIR spectrophotometer.

Bertinetto, Carlo G; Schoot, Mark; Dingemans, Martijn; Meeuwsen, Wouter; Buydens, Lutgarde M C; Jansen, Jeroen J.

Food Res Int ; 161: 111836, 2022 Nov.

Article in English | MEDLINE | ID: mdl-36192968

ABSTRACT

The development of portable NIR instruments facilitates widespread use among non-specialists. However, untrained operators may follow non-optimal measurement procedures. This work investigates how different factors in the measurement procedure influence the spectra of pig feed samples produced by SCiO, a handheld NIR spectrometer. Measurement conditions were studied by means of Design of Experiments and evaluated with analysis of variance - simultaneous component analysis (ANOVA-SCA or ASCA). We quantified and visualized how measurement distance, angle, background lighting, the use of plastic lids and different devices interactively affect the resulting spectra. The samples could be distinguished with 100% accuracy with Partial Least Squares-Discriminant Analysis (PLS-DA) at a scanning distance of 0.5 cm. Replication of the experiment with special attention to reproducing the conditions still led to some differences, which highlights both the challenges in controlling conditions and the importance of considering them. Based on the results, generalizable guidelines for acceptance of spectra were proposed for this case study. Most important is performing measurements at a distance of 0.5 cm, or at least in an environment without background lighting. Overall, the provided guidelines for measurement conditions and a methodology to investigate this for other devices are a key enabler for spreading handheld spectrometry to a non-expert audience.

Subjects

Plastics; Spectroscopy, Near-Infrared; Animals; Discriminant Analysis; Least-Squares Analysis; Spectrophotometry; Spectroscopy, Near-Infrared/methods; Swine

13.

Evaluation and comparison of unsupervised methods for the extraction of spatial patterns from mass spectrometry imaging data (MSI).

Prasad, Mridula; Postma, Geert; Franceschi, Pietro; Buydens, Lutgarde M C; Jansen, Jeroen J.

Sci Rep ; 12(1): 15687, 2022 Sep 20.

Article in English | MEDLINE | ID: mdl-36127378

ABSTRACT

For the extraction of spatially important regions from mass spectrometry imaging (MSI) data, different clustering methods have been proposed. These clustering methods are based on certain assumptions and use different criteria to assign pixels into different classes. For high-dimensional MSI data, the curse of dimensionality also limits the performance of clustering methods, a limitation that is usually overcome by pre-processing the data using dimension reduction techniques. In summary, the extraction of spatial patterns from MSI data can be done using different unsupervised methods, but a robust evaluation of clustering results is still missing. In this study, we have performed multiple simulations on synthetic and real MSI data to validate the performance of unsupervised methods. The synthetic data were simulated mimicking important spatial and statistical properties of real MSI data. Our simulation results confirmed that K-means clustering with correlation distance and Gaussian Mixture Modeling clustering methods give optimal performance in most of the scenarios. The clustering methods give efficient results together with dimension reduction techniques. Of all the dimension reduction techniques considered here, the best results were obtained with the minimum noise fraction (MNF) transform. The results were confirmed on both synthetic and real MSI data. However, for successful implementation of the MNF transform, the MSI data need to be of limited dimensions.

Subjects

Diagnostic Imaging; Cluster Analysis; Mass Spectrometry/methods; Normal Distribution

14.

Water quality monitoring based on chemometric analysis of high-resolution phytoplankton data measured with flow cytometry.

Tinnevelt, Gerjen H; Lushchikova, Olga; Augustijn, Dillen; Lochs, Mathijs; Geertsma, Rinze W; Rijkeboer, Machteld; Kools, Harrie; Dubelaar, George; Veen, Arnold; Buydens, Lutgarde M C; Jansen, Jeroen J.

Environ Int ; 170: 107587, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36274492

ABSTRACT

River water is an important source of Dutch drinking water. For this reason, continuous monitoring of river water quality is needed. However, comprehensive chemical analyses with high-resolution gas chromatography [GC]-mass spectrometry [MS]/liquid chromatography [LC]-MS are quite tedious and time consuming; this makes them poorly suited to routine water quality monitoring and, therefore, many pollution events are missed. Phytoplankton are highly sensitive and responsive to toxicity, which makes them highly usable for effect-based water quality monitoring. Flow cytometry can measure the optical properties of phytoplankton every hour, generating a large amount of information-rich data in one year. However, this requires chemometrics, as the resulting fingerprints need to be processed into information about abnormal phytoplankton behaviour. We developed Discriminant Analysis of Multi-Aspect CYtometry (DAMACY) to model the "normal condition" of the phytoplankton community imposed by diurnal, meteorological, and other exogenous influences. DAMACY first describes the cellular variability and distribution of phytoplankton in each measurement using principal component analysis, and then aims to find subtle differences in these phytoplankton distributions that predict normal environmental conditions. Deviations from these normal environmental conditions indicated abnormal phytoplankton behaviour that happened alongside pollution events measured with the GC/MS and LC/MS systems. Thus, our results demonstrate that flow cytometry in combination with chemometrics may be used for an automated hourly assessment of river water quality and as a near real-time early warning for detecting harmful known or unknown contaminants. Finally, both the flow cytometer and the DAMACY algorithm run completely autonomously and only require maintenance once or twice per year. The warning system results may be uploaded automatically, so that drinking water companies may temporarily stop pumping water whenever abnormal phytoplankton behaviour is detected. In the case of prolonged abnormal phytoplankton behaviour, comprehensive analysis may still be used to identify the chemical compound, its origin, and toxicity.

Subjects

Drinking Water; Phytoplankton; Water Quality; Flow Cytometry; Chemometrics

15.

Convolutional neural networks to predict brain tumor grades and Alzheimer's disease with MR spectroscopic imaging data.

Acquarelli, Jacopo; van Laarhoven, Twan; Postma, Geert J; Jansen, Jeroen J; Rijpma, Anne; van Asten, Sjaak; Heerschap, Arend; Buydens, Lutgarde M C; Marchiori, Elena.

PLoS One ; 17(8): e0268881, 2022.

Article in English | MEDLINE | ID: mdl-36001537

ABSTRACT

PURPOSE: To evaluate the value of a convolutional neural network (CNN) in the diagnosis of human brain tumor or Alzheimer's disease by MR spectroscopic imaging (MRSI) and to compare its Matthews correlation coefficient (MCC) score against that of other machine learning methods and previous evaluation of the same data. We address two challenges: 1) the limited number of cases in MRSI datasets and 2) interpretability of results in the form of relevant spectral regions. METHODS: A shallow CNN with only one hidden layer and an ad hoc loss function was constructed, involving two branches for processing spectral and image features of a brain voxel, respectively. Each branch consists of a single convolutional hidden layer. The output of the two convolutional layers is merged and fed to a classification layer that outputs class predictions for the given brain voxel. RESULTS: Our CNN method separated glioma grades 3 and 4 and identified Alzheimer's disease patients using MRSI and complementary MRI data with a high MCC score (areas under the curve were 0.87 and 0.91, respectively). The results demonstrated superior effectiveness over other popular methods such as Partial Least Squares or Support Vector Machines. Also, our method automatically identified the spectral regions most important in the diagnosis process, and we show that these are in good agreement with existing biomarkers from the literature. CONCLUSION: Shallow CNN models integrating image and spectral features improved quantitative exploration and diagnosis of brain diseases for research and clinical purposes. Software is available at https://bitbucket.org/TeslaH2O/cnn_mrsi.

Subjects

Alzheimer Disease; Brain Neoplasms; Alzheimer Disease/diagnostic imaging; Brain Neoplasms/diagnostic imaging; Humans; Machine Learning; Magnetic Resonance Imaging/methods; Neural Networks, Computer
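
The abstract of this entry compares methods by their Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, here is a minimal sketch of the standard binary-class definition. The function name `mcc`, the 0/1 label coding, the zero-denominator convention, and the toy labels are assumptions for illustration and are not taken from the paper or its software.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when a confusion-matrix margin is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels (made up): 2 true positives, 2 true negatives, 1 of each error.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
score = mcc(y_true, y_pred)   # (2*2 - 1*1) / sqrt(3*3*3*3) = 1/3
```

The score ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect), and it uses all four confusion-matrix cells, which makes it less sensitive to class imbalance than plain accuracy.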

16.

Fusion of metabolomics and proteomics data for biomarkers discovery: case study on the experimental autoimmune encephalomyelitis.

Blanchet, Lionel; Smolinska, Agnieszka; Attali, Amos; Stoop, Marcel P; Ampt, Kirsten A M; van Aken, Hans; Suidgeest, Ernst; Tuinstra, Tinka; Wijmenga, Sybren S; Luider, Theo; Buydens, Lutgarde M C.

BMC Bioinformatics ; 12: 254, 2011 Jun 22.

Article in English | MEDLINE | ID: mdl-21696593

ABSTRACT

BACKGROUND: Analysis of Cerebrospinal Fluid (CSF) samples holds great promise to diagnose neurological pathologies and gain insight into the molecular background of these pathologies. Proteomics and metabolomics methods provide invaluable information on the biomolecular content of CSF and thereby on the possible status of the central nervous system, including neurological pathologies. The combined information provides a more complete description of CSF content. Extracting the full combined information requires a combined analysis of different datasets, i.e., fusion of the data. RESULTS: A novel fusion method is presented and applied to proteomics and metabolomics data from a pre-clinical model of multiple sclerosis: an Experimental Autoimmune Encephalomyelitis (EAE) model in rats. The method follows a mid-level fusion architecture. The relevant information is extracted per platform using extended canonical variates analysis. The results are subsequently merged in order to be analyzed jointly. We find that the combined proteome and metabolome data allow for efficient and reliable discrimination between healthy rats, peripherally inflamed rats, and rats at the onset of EAE. The predicted accuracy reaches 89% on a test set. The important variables (metabolites and proteins) in this model are known to be linked to EAE and/or multiple sclerosis. CONCLUSIONS: Fusion of proteomics and metabolomics data is possible. The main issues of high dimensionality and missing values are overcome. The outcome leads to higher accuracy in prediction and a more exhaustive description of the disease profile. The biological interpretation of the involved variables validates our fusion approach.

Subjects

Biomarkers/cerebrospinal fluid , Cerebrospinal Fluid/chemistry , Experimental Autoimmune Encephalomyelitis/diagnosis , Metabolomics/methods , Proteomics/methods , Animals , Experimental Autoimmune Encephalomyelitis/metabolism , Male , Biomolecular Nuclear Magnetic Resonance , Rats , Inbred Lew Rats
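
The mid-level fusion architecture named in the abstract above (extract the relevant information per platform, then merge and analyze jointly) can be illustrated with a minimal sketch. This is not the authors' method: the per-platform step here is a simple projection onto the difference of class means, swapped in for extended canonical variates analysis, and the joint step is a nearest-centroid classifier. The toy data, the two-class setup, and all function names are assumptions made for illustration.

```python
from statistics import mean

def class_mean_direction(X, y):
    """Unit vector from the class-0 mean to the class-1 mean of one data block."""
    mu0 = [mean(col) for col in zip(*(r for r, l in zip(X, y) if l == 0))]
    mu1 = [mean(col) for col in zip(*(r for r, l in zip(X, y) if l == 1))]
    diff = [b - a for a, b in zip(mu0, mu1)]
    norm = sum(d * d for d in diff) ** 0.5 or 1.0  # avoid division by zero
    return [d / norm for d in diff]

def fuse(blocks, directions):
    """Project each block onto its own direction, then concatenate the scores."""
    scores = [[sum(x * w for x, w in zip(row, wv)) for row in X]
              for X, wv in zip(blocks, directions)]
    return [list(sample) for sample in zip(*scores)]

def fit(blocks, y):
    """Step 1: one direction per platform. Step 2: class centroids of fused scores."""
    directions = [class_mean_direction(X, y) for X in blocks]
    fused = fuse(blocks, directions)
    centroids = {}
    for label in sorted(set(y)):
        rows = [f for f, l in zip(fused, y) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return directions, centroids

def predict(blocks, directions, centroids):
    """Assign each sample to the nearest class centroid in the fused score space."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: sq_dist(f, centroids[c]))
            for f in fuse(blocks, directions)]

# Hypothetical toy data: rows are the same four animals in both blocks.
proteomics = [[1.0, 0.0, 2.0], [1.2, 0.1, 2.1], [3.0, 0.0, 2.0], [3.1, 0.2, 1.9]]
metabolomics = [[0.5, 5.0], [0.4, 5.2], [0.5, 8.0], [0.6, 8.1]]
labels = [0, 0, 1, 1]  # 0 = control, 1 = disease onset (assumed coding)

directions, centroids = fit([proteomics, metabolomics], labels)

new_proteomics = [[1.1, 0.1, 2.0], [2.9, 0.1, 2.0]]
new_metabolomics = [[0.5, 5.1], [0.5, 7.9]]
predicted = predict([new_proteomics, new_metabolomics], directions, centroids)
```

Each platform is reduced to a single score before merging, so the joint model sees two fused variables per sample instead of the full set of proteins and metabolites. That reduction is the point of a mid-level design when the raw blocks are high-dimensional.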

17.

NMR and pattern recognition can distinguish neuroinflammation and peripheral inflammation.

Smolinska, Agnieszka; Attali, Amos; Blanchet, Lionel; Ampt, Kirsten; Tuinstra, Tinka; van Aken, Hans; Suidgeest, Ernst; van Gool, Alain J; Luider, Theo; Wijmenga, Sybren S; Buydens, Lutgarde M C.

J Proteome Res ; 10(10): 4428-38, 2011 Oct 07.

Article in English | MEDLINE | ID: mdl-21806074

ABSTRACT

Multiple Sclerosis (MScl) is a neurodegenerative disease of the CNS, associated with chronic neuroinflammation. Cerebrospinal fluid (CSF), being in closest interaction with the CNS, was used to profile neuroinflammation to discover disease-specific markers. We used the commonly accepted animal model for the neuroinflammatory aspect of MScl: the experimental autoimmune/allergic encephalomyelitis (EAE). A combination of advanced ¹H NMR spectroscopy and pattern recognition methods was used to establish the metabolic profile of CSF of EAE-affected rats (representing neuroinflammation) and of two control groups (healthy and peripherally inflamed) to detect specific markers for early neuroinflammation. We found that the CSF metabolic profile for neuroinflammation is distinct from healthy and peripheral inflammation and characterized by changes in concentrations of metabolites such as creatine, arginine, and lysine. Using these disease-specific markers, we were able to detect early-stage neuroinflammation with high accuracy in a second independent set of animals. This confirms the predictive value of these markers. These findings from the EAE model may help to develop a molecular diagnosis for early-stage MScl in humans.

Subjects

Experimental Autoimmune Encephalomyelitis/metabolism , Inflammation , Magnetic Resonance Spectroscopy/methods , Multiple Sclerosis/cerebrospinal fluid , Multiple Sclerosis/metabolism , Animals , Citrates/metabolism , Animal Disease Models , Glutamine/metabolism , Humans , Lactates/metabolism , Male , Statistical Models , Mycobacterium tuberculosis/metabolism , Automated Pattern Recognition , Pentanoic Acids/metabolism , Rats , Inbred Lew Rats , Reproducibility of Results

18.

Pinpointing biomarkers in proteomic LC/MS data by moving-window discriminant analysis.

Bloemberg, Tom G; Wessels, Hans J C T; van Dael, Maurice; Gloerich, Jolein; van den Heuvel, Lambert P; Buydens, Lutgarde M C; Wehrens, Ron.

Anal Chem ; 83(13): 5197-206, 2011 Jul 01.

Article in English | MEDLINE | ID: mdl-21557614

ABSTRACT

The identification of differential patterns in data originating from combined measurement techniques such as LC/MS is pivotal to proteomics. Although "shotgun proteomics" has been employed successfully to this end, this method also has severe drawbacks, because of its dependence on largely untargeted MS/MS sequencing and databases for statistical analyses. Alternatively, several MS-signal-based (MS/MS-independent) methods have been published that are mainly based on (univariate) Student's t-tests. Here, we present a more robust multivariate alternative employing linear discriminant analysis. Like the t-test-based methods, it is applied directly to LC/MS data, instead of using MS/MS measurements. We demonstrate the method on a number of simulated data sets, as well as on a spike-in LC/MS data set, and show its superior performance over t-tests.

Subjects

Biomarkers/metabolism , Discriminant Analysis , Proteomics , Liquid Chromatography , Humans , Mass Spectrometry
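
The idea of applying linear discriminant analysis in a moving window, as described in the abstract above, can be sketched as follows. This is a rough illustration under stated assumptions, not the published algorithm: the window slides along a single variable axis, the score is the two-class separation d' S⁻¹ d (difference of class means d, pooled within-class covariance S), and a small ridge term is added to S to keep it invertible with few samples. The function names, the ridge value, and the toy intensities are all hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def window_score(X0, X1, ridge=1e-3):
    """Multivariate class separation d' S^-1 d for one window (rows = samples)."""
    p = len(X0[0])
    mu0 = [sum(col) / len(X0) for col in zip(*X0)]
    mu1 = [sum(col) / len(X1) for col in zip(*X1)]
    d = [b - a for a, b in zip(mu0, mu1)]
    S = [[0.0] * p for _ in range(p)]
    for X, mu in ((X0, mu0), (X1, mu1)):
        for row in X:
            centered = [v - m for v, m in zip(row, mu)]
            for i in range(p):
                for j in range(p):
                    S[i][j] += centered[i] * centered[j]
    dof = len(X0) + len(X1) - 2
    for i in range(p):
        for j in range(p):
            S[i][j] /= dof
        S[i][i] += ridge  # assumed regularization, not taken from the source
    w = solve(S, d)  # discriminant direction S^-1 d
    return sum(wi * di for wi, di in zip(w, d))

def moving_window_scores(X0, X1, width):
    """Score every window of `width` adjacent variables."""
    n_vars = len(X0[0])
    return [window_score([r[s:s + width] for r in X0],
                         [r[s:s + width] for r in X1])
            for s in range(n_vars - width + 1)]

# Hypothetical intensities: 3 control and 3 case samples, 6 adjacent variables.
# Only the variable at index 3 differs between the groups.
control = [[1.0, 2.0, 1.5, 1.0, 2.0, 1.0],
           [1.1, 1.9, 1.4, 1.2, 2.1, 0.9],
           [0.9, 2.1, 1.6, 0.9, 1.9, 1.1]]
case = [[1.0, 2.1, 1.5, 3.0, 2.0, 1.0],
        [0.9, 2.0, 1.6, 3.2, 1.9, 1.1],
        [1.1, 1.9, 1.4, 2.9, 2.1, 0.9]]

scores = moving_window_scores(control, case, width=2)
best_window = max(range(len(scores)), key=scores.__getitem__)
```

Unlike a per-variable t-test, each window score uses the covariance between neighbouring variables, so correlated signals inside a window are evaluated together rather than one at a time.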

19.

The impact of delayed storage on the measured proteome and metabolome of human cerebrospinal fluid.

Rosenling, Therese; Stoop, Marcel P; Smolinska, Agnieszka; Muilwijk, Bas; Coulier, Leon; Shi, Shanna; Dane, Adrie; Christin, Christin; Suits, Frank; Horvatovich, Peter L; Wijmenga, Sybren S; Buydens, Lutgarde M C; Vreeken, Rob; Hankemeier, Thomas; van Gool, Alain J; Luider, Theo M; Bischoff, Rainer.

Clin Chem ; 57(12): 1703-11, 2011 Dec.

Article in English | MEDLINE | ID: mdl-21998343

ABSTRACT

BACKGROUND: Because cerebrospinal fluid (CSF) is in close contact with diseased areas in neurological disorders, it is an important source of material in the search for molecular biomarkers. However, sample handling for CSF collected from patients in a clinical setting might not always be adequate for use in proteomics and metabolomics studies. METHODS: We left CSF for 0, 30, and 120 min at room temperature immediately after sample collection and centrifugation/removal of cells. At 2 laboratories CSF proteomes were subjected to tryptic digestion and analyzed by use of nano-liquid chromatography (LC) Orbitrap mass spectrometry (MS) and chipLC quadrupole TOF-MS. Metabolome analysis was performed at 3 laboratories by NMR, GC-MS, and LC-MS. Targeted analyses of cystatin C and albumin were performed by LC-tandem MS in the selected reaction monitoring mode. RESULTS: We did not find significant changes in the measured proteome and metabolome of CSF stored at room temperature after centrifugation, except for 2 peptides and 1 metabolite, 2,3,4-trihydroxybutanoic (threonic) acid, out of 5780 identified peptides and 93 identified metabolites. A sensitive protein stability marker, cystatin C, was not affected. CONCLUSIONS: The measured proteome and metabolome of centrifuged human CSF are stable at room temperature for up to 2 hours. We cannot exclude, however, that changes undetectable with our current methodology, such as denaturation or proteolysis, might occur because of sample handling conditions. The stability we observed gives laboratory personnel at the collection site sufficient time to aliquot samples before freezing and storage at -80 °C.

Subjects

Metabolome , Proteome/metabolism , Specimen Handling , Cerebrospinal Fluid , Gas Chromatography , Liquid Chromatography , Humans , Magnetic Resonance Spectroscopy , Mass Spectrometry/methods , Time Factors

20.

Integrating gene expression and GO classification for PCA by preclustering.

De Haan, Jorn R; Piek, Ester; van Schaik, Rene C; de Vlieg, Jacob; Bauerschmidt, Susanne; Buydens, Lutgarde M C; Wehrens, Ron.

BMC Bioinformatics ; 11: 158, 2010 Mar 26.

Article in English | MEDLINE | ID: mdl-20346140

ABSTRACT

BACKGROUND: Gene expression data can be analyzed by summarizing groups of individual gene expression profiles based on GO annotation information. The mean expression profile per group can then be used to identify interesting GO categories in relation to the experimental settings. However, the expression profiles present in GO classes are often heterogeneous, i.e., there are several different expression profiles within one class. As a result, important experimental findings can be obscured because the summarizing profile does not seem to be of interest. We propose to tackle this problem by finding homogeneous subclasses within GO categories: preclustering. RESULTS: Two microarray datasets are analyzed. First, a selection of genes from a well-known Saccharomyces cerevisiae dataset is used. The GO class "cell wall organization and biogenesis" is shown as a specific example. After preclustering, this term can be associated with different phases in the cell cycle, where it could not be associated with a specific phase previously. Second, a dataset of differentiation of human Mesenchymal Stem Cells (MSC) into osteoblasts is used. For this dataset results are shown in which the GO term "skeletal development" is a specific example of a heterogeneous GO class for which better associations can be made after preclustering. The Intra Cluster Correlation (ICC), a measure of cluster tightness, is applied to identify relevant clusters. CONCLUSIONS: We show that this method leads to an improved interpretability of results in Principal Component Analysis.

Subjects

Gene Expression Profiling/methods , Gene Expression , Principal Component Analysis , Cell Cycle/genetics , Cell Differentiation/genetics , Cluster Analysis , Genetic Databases , Humans , Mesenchymal Stem Cells/cytology , Saccharomyces cerevisiae/genetics
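
The preclustering idea in the abstract above (split a heterogeneous GO class into homogeneous subclasses before summarizing it by a mean profile) can be sketched in a few lines. The abstract does not state which clustering algorithm or which exact definition of the Intra Cluster Correlation the authors use, so both are assumptions here: genes are grouped greedily by Pearson correlation with a cluster's first member, and tightness is taken as the mean pairwise correlation. Gene names, profiles, and the 0.8 threshold are hypothetical.

```python
from itertools import combinations

def pearson(a, b):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def precluster(profiles, threshold=0.8):
    """Greedy grouping: a gene joins the first cluster whose seed it correlates with."""
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])  # no match: this gene seeds a new cluster
    return clusters

def mean_profile(profiles, genes):
    """Summarizing profile: per-timepoint mean over the given genes."""
    return [sum(vals) / len(vals) for vals in zip(*(profiles[g] for g in genes))]

def tightness(profiles, genes):
    """Mean pairwise correlation; singletons get 1.0 by convention (assumed)."""
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 1.0
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

# Hypothetical GO class with two opposite expression patterns over six time points.
go_class = {
    "g1": [1, 3, 5, 3, 1, 0],
    "g2": [2, 4, 6, 4, 2, 1],
    "g3": [5, 3, 1, 0, 1, 3],
    "g4": [6, 4, 2, 1, 2, 4],
}

subclasses = precluster(go_class)
whole_class_tightness = tightness(go_class, list(go_class))
subclass_tightness = [tightness(go_class, genes) for genes in subclasses]
subclass_means = [mean_profile(go_class, genes) for genes in subclasses]
```

In this toy class the overall mean profile is nearly flat because the two patterns cancel out, while each subclass mean keeps its own peak. That is the situation the abstract describes, where the summarizing profile of a heterogeneous class hides the interesting behaviour.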
