Results 1 - 20 of 61
1.
Cytometry A ; 101(1): 72-85, 2022 01.
Article in English | MEDLINE | ID: mdl-34327803

ABSTRACT

The rapid evolution of the flow cytometry field, currently allowing the measurement of 30-50 parameters per cell, has led to a marked increase in deep multivariate information. Manual gating is insufficient to extract all this information. Therefore, multivariate analysis (MVA) methods have been developed to extract information and efficiently analyze the high-density multicolour flow cytometry (MFC) data. To aid interpretation, MFC data are often logarithmically transformed before MVA. We studied the consequences of different transformations of flow cytometry data in datasets containing negative intensities caused by background subtractions and spreading error, as logarithmic transformation of negative data is impossible. Transformations such as logicle or hyperbolic arcsine transformations allow linearity around zero, whereas higher (positive and negative) intensities are logarithmically transformed. To define the linear range, a parameter (or cofactor) must be chosen. We show how the chosen transformation parameter has a great impact on the MVA results. In some cases, peak splitting is observed, producing two distributions around zero in an actually homogeneous population. This may be misinterpreted as the presence of multiple cell populations. Moreover, when performing an arbitrary transformation before MVA, biologically relevant and statistically significant information might be missed. We present a new algorithm, Optimal Transformation for flow cytometry data (OTflow), which uses various statistical methods to optimally choose the parameter of the transformation and prevent artifacts such as peak splitting. Arbitrary or unconsidered transformation can lead to wrong conclusions for the MVA cluster methods, dimensionality reduction methods, and classification methods. We recommend transformation of flow cytometry data by using OTflow-defined parameters estimated per channel, in order to prevent peak splitting and other artifacts in the data.
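
The role of the cofactor in the hyperbolic arcsine transformation mentioned in this abstract can be illustrated with a minimal sketch. It is not the OTflow algorithm; the function name `asinh_transform`, the example intensities, and the two cofactor values are assumptions chosen only for illustration.

```python
import math


def asinh_transform(intensity: float, cofactor: float) -> float:
    """Hyperbolic arcsine transform: asinh(intensity / cofactor).

    Roughly linear for |intensity| much smaller than the cofactor and
    roughly logarithmic for |intensity| much larger than the cofactor.
    """
    if cofactor <= 0:
        raise ValueError("cofactor must be positive")
    return math.asinh(intensity / cofactor)


# Hypothetical intensities, including negatives from background subtraction.
intensities = [-200.0, -5.0, 0.0, 5.0, 200.0, 20000.0]

# Two arbitrary cofactors to show how the choice reshapes values near zero.
for cofactor in (5.0, 150.0):
    transformed = [round(asinh_transform(x, cofactor), 3) for x in intensities]
    print(cofactor, transformed)
```

With the smaller cofactor, values close to zero are pushed further apart, which is the regime in which an unsuitable parameter could exaggerate structure around zero.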


Subject(s)
Algorithms , Artifacts , Flow Cytometry , Multivariate Analysis
2.
BMC Genomics ; 17: 324, 2016 05 04.
Article in English | MEDLINE | ID: mdl-27142305

ABSTRACT

BACKGROUND: Genomic prediction (GP) allows breeders to select plants and animals based on their breeding potential for desirable traits, without lengthy and expensive field trials or progeny testing. We have proposed to use Dissimilarity-based Partial Least Squares (DPLS) for GP. As a case study, we use the DPLS approach to predict Bacterial wilt (BW) in tomatoes using SNPs as predictors. The DPLS approach was compared with the Genomic Best-Linear Unbiased Prediction (GBLUP) and single-SNP regression with SNP as a fixed effect to assess the performance of DPLS. RESULTS: Eight genomic distance measures were used to quantify relationships between the tomato accessions from the SNPs. Subsequently, each of these distance measures was used to predict the BW using the DPLS prediction model. The DPLS model was found to be robust to the choice of distance measures; similar prediction performances were obtained for each distance measure. DPLS greatly outperformed the single-SNP regression approach, showing that BW is a comprehensive trait dependent on several loci. Next, the performance of the DPLS model was compared to that of GBLUP. Although GBLUP and DPLS are conceptually very different, the prediction quality (PQ) of the DPLS models was similar to the prediction statistics obtained from GBLUP. A considerable advantage of DPLS is that the genotype-phenotype relationship can easily be visualized in a 2-D scatter plot. This so-called score-plot gives breeders insight for selecting candidates for their future breeding program. CONCLUSIONS: DPLS is a highly appropriate method for GP. The model prediction performance was similar to the GBLUP and far better than the single-SNP approach. The proposed method can be used in combination with a wide range of genomic dissimilarity measures and genotype representations such as allele-count, haplotypes or allele-intensity values. Additionally, the data can be insightfully visualized by the DPLS model, allowing for selection of desirable candidates from the breeding experiments. In this study, we have assessed the DPLS performance on a single trait.


Subject(s)
Genomics/methods , Plant Diseases/microbiology , Single Nucleotide Polymorphism , Solanum lycopersicum/genetics , Algorithms , Plant Genome , Genotype , Least-Squares Analysis , Solanum lycopersicum/microbiology , Phenotype , Quantitative Trait Loci
3.
Analyst ; 141(20): 5689-5708, 2016 Oct 21.
Article in English | MEDLINE | ID: mdl-27549384

ABSTRACT

Historically, advances in the field of ion mobility spectrometry have been hindered by the variation in measured signals between instruments developed by different research laboratories or manufacturers. This has triggered the development and application of chemometric techniques able to reveal and analyze precious information content of ion mobility spectra. Recent advances in multidimensional coupling of ion mobility spectrometry to chromatography and mass spectrometry have created new, unique challenges for data processing, yielding high-dimensional, megavariate datasets. In this paper, a complete overview of available chemometric techniques used in the analysis of ion mobility spectrometry data is given. We describe the current state-of-the-art of ion mobility spectrometry data analysis comprising datasets with different complexities and two different scopes of data analysis, i.e. targeted and non-targeted analyte analyses. Two main steps of data analysis are considered: data preprocessing and pattern recognition. A detailed description of recent advances in chemometric techniques is provided for these steps, together with a list of interesting applications. We demonstrate that chemometric techniques have contributed significantly to the recent and great expansion of ion mobility spectrometry technology into different application fields. We conclude that well-thought-out, comprehensive data analysis strategies are currently emerging, including several chemometric techniques and addressing different data challenges. In our opinion, this trend will continue in the near future, stimulating developments in ion mobility spectrometry instrumentation even further.

4.
Anal Chem ; 87(2): 869-75, 2015 Jan 20.
Article in English | MEDLINE | ID: mdl-25519893

ABSTRACT

Ion mobility spectrometry combined with multicapillary column separation (MCC-IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in gaseous samples. Due to their large data size, processing of MCC-IMS spectra is still the main bottleneck of data analysis, and there is an increasing need for data analysis strategies in which the size of MCC-IMS data is reduced to enable further analysis. In our study, the first untargeted chemometric strategy is developed and employed in the analysis of MCC-IMS spectra from 264 breath and ambient air samples. This strategy does not comprise identification of compounds as a primary step but includes several preprocessing steps and a discriminant analysis. Data size is significantly reduced in three steps. Wavelet transform, mask construction, and sparse-partial least squares-discriminant analysis (s-PLS-DA) allow data size reduction down to 50 variables relevant to the goal of analysis. The influence and compatibility of the data reduction tools are studied by applying different settings of the developed strategy. Loss of information after preprocessing is evaluated, e.g., by comparing the performance of classification models for different classes of samples. Finally, the interpretability of the classification models is evaluated, and regions of spectra that are related to the identification of potential analytical biomarkers are successfully determined. This work will greatly enable the standardization of analytical procedures across different instrumentation types promoting the adoption of MCC-IMS technology in a wide range of diverse application fields.

5.
Anal Chem ; 87(20): 10338-45, 2015 Oct 20.
Article in English | MEDLINE | ID: mdl-26398529

ABSTRACT

Real-time measurements of many low-abundance volatile organic compounds (VOCs) in breath and air samples are already feasible due to progress in analytical technologies, such as proton transfer reaction mass spectrometry (PTR-MS). Nevertheless, the information content of real-time measurements is not fully exploited, due to the lack of suitable data handling methods. This study develops a data scientific procedure to enhance data analysis and interpretation of longitudinal, multivariate data sets from real-time, in vivo, aroma-release studies. The developed procedure includes an automated data preprocessing and a multivariate assessment of the test panel performance. A large multifactorial PTR-MS data set is investigated that includes four experimental protocols, two tested food products, four aroma compounds, and eight panelists. Real-time measurements are converted into standardized breath profiles by preprocessing, and 10 kinetic parameters are derived. Next to this, panel performance is evaluated per experimental protocol and food product. Comprehensive information about panel performance, individual panelists, studied products, aroma compounds, and kinetic parameters is extracted, demonstrating the great value of the developed approach.

6.
Anal Chem ; 87(24): 12096-103, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26632985

ABSTRACT

The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50% compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective.
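
The idea of evaluating preprocessing strategies through an experimental design and then reading off main effects can be sketched as follows. This is only an illustration under stated assumptions: a two-level full factorial over the three steps named in the abstract, a made-up `toy_error` function standing in for real model performance, and hypothetical helper names. The design and data used in the paper itself are not reproduced here.

```python
from itertools import product

# Preprocessing steps named in the abstract, each treated as an on/off factor.
FACTORS = ("baseline_correction", "smoothing", "alignment")


def full_factorial(factors):
    """Return every on/off combination of the given preprocessing steps."""
    return [dict(zip(factors, levels))
            for levels in product((False, True), repeat=len(factors))]


def main_effect(strategies, scores, factor):
    """Mean score with the step switched on minus mean score with it off."""
    on = [s for strat, s in zip(strategies, scores) if strat[factor]]
    off = [s for strat, s in zip(strategies, scores) if not strat[factor]]
    return sum(on) / len(on) - sum(off) / len(off)


def toy_error(strategy):
    """Made-up prediction error (NOT real data); lower is better."""
    error = 10.0
    if strategy["baseline_correction"]:
        error -= 3.0
    if strategy["smoothing"]:
        error -= 1.0
    if strategy["baseline_correction"] and strategy["alignment"]:
        error -= 2.0  # an interaction between two steps
    return error


strategies = full_factorial(FACTORS)
errors = [toy_error(s) for s in strategies]

for factor in FACTORS:
    print(factor, main_effect(strategies, errors, factor))

# Index of the strategy with the lowest toy error.
best = min(zip(errors, range(len(strategies))))[1]
print("lowest toy error:", strategies[best])
```

In practice each entry of `errors` would come from fitting and validating a model on data preprocessed with that strategy, and interaction effects would be inspected alongside the main effects.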

7.
Regul Toxicol Pharmacol ; 70(1): 297-303, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25046166

ABSTRACT

An important part of the current hazard identification of novel plant varieties is comparative targeted analysis of the novel and reference varieties. Comparative analysis will become much more informative with unbiased analytical approaches, e.g. omics profiling. Data analysis estimating the similarity of new varieties to a reference baseline class of known safe varieties would subsequently greatly facilitate hazard identification. Further biological and eventually toxicological analysis would then only be necessary for varieties that fall outside this reference class. For this purpose, a one-class classifier tool was explored to assess and classify transcriptome profiles of potato (Solanum tuberosum) varieties in a model study. Profiles of six different varieties, two locations of growth, and two years of harvest, including biological and technical replication, were used to build the model. Two scenarios were applied representing evaluation of a 'different' variety and a 'similar' variety. Within the model, higher class distances resulted for the 'different' test set compared with the 'similar' test set. The present study may contribute to a more global hazard identification of novel plant varieties.


Subject(s)
Gene Expression Profiling , Theoretical Models , Genetically Modified Plants/toxicity , Solanum tuberosum/genetics , Transcriptome
8.
Anal Chim Acta ; 1304: 342444, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38637030

ABSTRACT

A common goal in chemistry is to study the relationship between a measured signal and the variability of certain factors. To this end, researchers often use Design of Experiments to decide which experiments to conduct and (Multiple) Linear Regression and/or Analysis of Variance to analyze the collected data. Among the assumptions at the very foundation of this strategy is that all the experiments are independent, conditional on the settings of the factors. Unfortunately, due to the presence of uncontrollable factors, real-life experiments often deviate from this assumption, making the data analysis results unreliable. In these cases, Mixed-Effects modeling, despite not being widely used in chemometrics, represents a solid data analysis framework to obtain reliable results. Here we provide a tutorial for Linear Mixed-Effects models. We gently introduce the reader to these models by showing some motivating examples. Then, we discuss the theory behind Linear Mixed-Effects models, and we show how to fit these models by making use of real-life data obtained from an exposome study. Throughout the paper we provide R code so that each researcher is able to implement these useful models themselves.
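
For orientation, the textbook matrix form of a linear mixed-effects model is given below. The notation is the conventional one and is an assumption here; it is not taken from the tutorial itself. The vector y holds the responses, X and Z are the design matrices for the fixed effects β and the random effects b, and b and ε are assumed independent.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b} + \boldsymbol{\varepsilon},\\
\mathbf{b} &\sim \mathcal{N}(\mathbf{0}, \mathbf{G}), \qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^{2}\mathbf{I}),\\
\operatorname{Var}(\mathbf{y}) &= \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top} + \sigma^{2}\mathbf{I}.
\end{aligned}
```

The last line shows why observations sharing a random effect (for example, the same batch or day) are correlated: the term Z G Zᵀ adds off-diagonal covariance that ordinary linear regression assumes to be zero.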

9.
Anal Chem ; 85(11): 5444-53, 2013 Jun 04.
Article in English | MEDLINE | ID: mdl-23679857

ABSTRACT

For partial least-squares regression with one response (PLS1), many variable-reduction methods have been developed. However, only a few address the case of multiple-response partial-least-squares (PLS2) modeling. The calibration performance of PLS1 can be improved by elimination of uninformative variables. Many variable-reduction methods are based on various PLS-model-related parameters, called predictor-variable properties. Recently, an important adaptation, in which the model complexity is optimized, was introduced in these methods. This method was called Predictive-Property-Ranked Variable Reduction with Final Complexity Adapted Models, denoted as PPRVR-FCAM or simply FCAM. In this study, variable reduction for PLS2 models, using an adapted FCAM method, FCAM-PLS2, is investigated. The utility and effectiveness of four new predictor-variable properties, derived from the multiple response PLS2 regression coefficients, are studied for six data sets consisting of ultraviolet-visible (UV-vis) spectra, near-infrared (NIR) spectra, NMR spectra, and two simulated sets, one with correlated and one with uncorrelated responses. The four properties include the mean of the absolute values as well as the norm of the PLS2 regression coefficients and their significances. The four properties were found to be applicable by the FCAM-PLS2 method for variable reduction. The predictive abilities of models resulting from the four properties are similar. The norm of the PLS2 regression coefficients has the best selective abilities: low numbers of variables with an informative meaning to the responses are retained. The significance of the mean of the PLS2 regression coefficients is found to be the least-selective property.
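
Two of the predictor-variable properties named in this abstract, the mean of the absolute values and the norm of the PLS2 regression coefficients, can be sketched as below. The sketch assumes a Euclidean norm and a coefficient matrix laid out as predictor variables (rows) by responses (columns); the function names and the example matrix are hypothetical, and the FCAM-PLS2 procedure itself is not implemented.

```python
import math


def coefficient_properties(coefficients):
    """For each predictor variable (row), return (mean |b|, Euclidean norm)
    across the responses (columns) of a PLS2 coefficient matrix."""
    properties = []
    for row in coefficients:
        mean_abs = sum(abs(b) for b in row) / len(row)
        norm = math.sqrt(sum(b * b for b in row))
        properties.append((mean_abs, norm))
    return properties


def rank_variables(values):
    """Indices of variables ordered from largest to smallest property value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical coefficients: 4 predictor variables x 2 responses.
B = [[0.0, 0.1],
     [3.0, -4.0],
     [-1.0, 1.0],
     [0.5, 0.0]]

props = coefficient_properties(B)
ranking_by_norm = rank_variables([norm for _, norm in props])
print(ranking_by_norm)
```

A variable-reduction scheme would then drop variables from the bottom of such a ranking and re-evaluate the model after each removal.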

10.
Proteomics ; 12(14): 2276-81, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22887946

ABSTRACT

An important prerequisite for the development and benchmarking of novel analysis methods is a well-designed comprehensive LC-MS/MS data set. Here, we present our data set consisting of 59 LC-MS/MS analyses of 50 protein samples extracted individually from Escherichia coli K12 and spiked with different concentrations of bovine carbonic anhydrase II and/or chicken ovalbumin, according to a 2 × 3 full factorial design. Using the well-annotated and commonly used E. coli proteome as the sample background ensures that the complexity of the data is on a par with most current proteomic analyses. Data were acquired over a 2-month period using multiple reversed-phase columns and instrument calibrations to include real-life challenges faced when analyzing large proteomics data sets. Moreover, so-called "ground truth" data, consisting of LC-MS/MS measurements of the pure spikes, are included in the data set. The current manuscript describes this comprehensive benchmark data set for future development and evaluation of analysis methods and software.


Subject(s)
Liquid Chromatography/methods , Protein Databases , Proteome/chemistry , Proteomics/methods , Tandem Mass Spectrometry/methods , Animals , Carbonic Anhydrase II/chemistry , Cattle , Chickens , Escherichia coli Proteins/chemistry , Ovalbumin/chemistry , Peptide Fragments/chemistry
11.
NMR Biomed ; 25(5): 755-65, 2012 May.
Article in English | MEDLINE | ID: mdl-21953616

ABSTRACT

¹H MRSI of the prostate reveals relative metabolite levels that vary according to the presence or absence of tumour, providing a sensitive method for the identification of patients with cancer. Current interpretations of prostate data rely on quantification algorithms that fit model metabolite resonances to individual voxel spectra and calculate relative levels of metabolites, such as choline, creatine, citrate and polyamines. Statistical pattern recognition techniques can potentially improve the detection of prostate cancer, but these analyses are hampered by artefacts and sources of noise in the data, such as variations in phase and frequency of resonances. Phase and frequency variations may arise as a result of spatial field gradients or local physiological conditions affecting the frequency of resonances, in particular those of citrate. Thus, there are unique challenges in developing a peak alignment algorithm for these data. We have developed a frequency and phase correction algorithm for automatic alignment of the resonances in prostate MRSI spectra. We demonstrate, with a simulated dataset, that alignment can be achieved to a phase standard deviation of 0.095 rad and a frequency standard deviation of 0.68 Hz for the citrate resonances. Three parameters were used to assess the improvement in peak alignment in the MRSI data of five patients: the percentage of variance in all MRSI spectra explained by their first principal component; the signal-to-noise ratio of a spectrum formed by taking the median value of the entire set at each spectral point; and the mean cross-correlation between all pairs of spectra. These parameters showed a greater similarity between spectra in all five datasets and the simulated data, demonstrating improved alignment for phase and frequency in these spectra. This peak alignment program is expected to improve pattern recognition significantly, enabling accurate detection and localisation of prostate cancer with MRSI.


Subject(s)
Algorithms , Magnetic Resonance Imaging/methods , Magnetic Resonance Spectroscopy/methods , Prostatic Neoplasms/chemistry , Choline/analysis , Citric Acid/analysis , Computer Simulation , Creatine/analysis , Factual Databases , Humans , Male , Biological Models , Automated Pattern Recognition/methods , Polyamines/analysis , Principal Component Analysis , Prostatic Neoplasms/pathology , Computer-Assisted Signal Processing , Signal-to-Noise Ratio
12.
NMR Biomed ; 25(11): 1271-9, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22407957

ABSTRACT

Breast cancer is a heterogeneous disease with a variable prognosis. Clinical factors provide some information about the prognosis of patients with breast cancer; however, there is a need for additional information to stratify patients for improved and more individualized treatment. The aim of this study was to examine the relationship between the metabolite profiles of breast cancer tissue and 5-year survival. Biopsies from breast cancer patients (n=98) were excised during surgery and analyzed by high-resolution magic angle spinning MRS. The data were analyzed by multivariate principal component analysis and partial least-squares discriminant analysis, and the findings of important metabolites were confirmed by spectral integration of the metabolite peaks. Predictions of 5-year survival using metabolite profiles were compared with predictions using clinical parameters. Based on the metabolite profiles, patients with estrogen receptor (ER)-positive breast cancer (n=71) were separated into two groups with significantly different survival rates (p=0.024). Higher levels of glycine and lactate were found to be associated with lower survival rates by both multivariate analyses and spectral integration, and are suggested as biomarkers for breast cancer prognosis. Similar metabolic differences were not observed for ER-negative patients, where survivors could not be separated from nonsurvivors. Predictions of 5-year survival of ER-positive patients using metabolite profiles gave better and more robust results than those using traditional clinical parameters. The results imply that the metabolic state of a tumor may provide additional information concerning breast cancer prognosis. Further studies should be conducted in order to evaluate the role of MR metabolomics as an additional clinical tool for determining the prognosis of patients with breast cancer.


Subject(s)
Tumor Biomarkers/metabolism , Breast Neoplasms/diagnosis , Breast Neoplasms/metabolism , Glycine/metabolism , Lactic Acid/metabolism , Magnetic Resonance Spectroscopy , Estrogen Receptors/metabolism , Adult , Aged , Aged 80 and over , Breast Neoplasms/pathology , Cohort Studies , Discriminant Analysis , Female , Humans , Kaplan-Meier Estimate , Least-Squares Analysis , Middle Aged , Principal Component Analysis , Prognosis , ROC Curve
13.
Anal Bioanal Chem ; 403(4): 947-59, 2012 May.
Article in English | MEDLINE | ID: mdl-22395451

ABSTRACT

Because cerebrospinal fluid (CSF) is the biofluid which interacts most closely with the central nervous system, it holds promise as a reporter of neurological disease, for example multiple sclerosis (MScl). To characterize the metabolomics profile of neuroinflammatory aspects of this disease we studied an animal model of MScl, experimental autoimmune/allergic encephalomyelitis (EAE). Because CSF also exchanges metabolites with blood via the blood-brain barrier, malfunctions occurring in the CNS may be reflected in the biochemical composition of blood plasma. The combination of blood plasma and CSF provides more complete information about the disease. Both biofluids can be studied by use of NMR spectroscopy. It is then necessary to perform combined analysis of the two different datasets. Mid-level data fusion was therefore applied to blood plasma and CSF datasets. First, relevant information was extracted from each biofluid dataset by use of linear support vector machine recursive feature elimination. The selected variables from each dataset were concatenated for joint analysis by partial least squares discriminant analysis (PLS-DA). The combined metabolomics information from plasma and CSF enables more efficient and reliable discrimination of the onset of EAE. Second, we introduced hierarchical models fusion, in which previously developed PLS-DA models are hierarchically combined. We show that this approach enables neuroinflamed rats (even on the day of onset) to be distinguished from either healthy or peripherally inflamed rats. Moreover, progression of EAE can be investigated because the model separates the onset and peak of the disease.


Subject(s)
Magnetic Resonance Spectroscopy/methods , Multiple Sclerosis/blood , Multiple Sclerosis/cerebrospinal fluid , Animals , Experimental Autoimmune Encephalomyelitis/blood , Experimental Autoimmune Encephalomyelitis/cerebrospinal fluid , Humans , Male , Metabolomics , Biological Models , Multiple Sclerosis/diagnosis , Rats , Inbred Lew Rats
14.
Mol Cell Proteomics ; 9(9): 2063-75, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20811074

ABSTRACT

The analysis of cerebrospinal fluid (CSF) is used in biomarker discovery studies for various neurodegenerative central nervous system (CNS) disorders. However, little is known about variation of CSF proteins and metabolites between patients without neurological disorders. A baseline for a large number of CSF compounds appears to be lacking. To analyze the variation in CSF protein and metabolite abundances in a number of well-defined individual samples of patients undergoing routine, non-neurological surgical procedures, we determined the variation of various proteins and metabolites by multiple analytical platforms. A total of 126 common proteins were assessed for biological variations between individuals by ESI-Orbitrap. A large spread in inter-individual variation was observed (relative standard deviations [RSDs] ranged from 18 to 148%) for proteins with both high abundance and low abundance. Technical variation was between 15 and 30% for all 126 proteins. Metabolomics analysis was performed by means of GC-MS and nuclear magnetic resonance (NMR) imaging and amino acids were specifically analyzed by LC-MS/MS, resulting in the detection of more than 100 metabolites. The variation in the metabolome appears to be much more limited compared with the proteome: the observed RSDs ranged from 12 to 70%. Technical variation was less than 20% for almost all metabolites. Consequently, an understanding of the biological variation of proteins and metabolites in CSF of neurologically normal individuals appears to be essential for reliable interpretation of biomarker discovery studies for CNS disorders because such results may be influenced by natural inter-individual variations. Therefore, proteins and metabolites with high variation between individuals ought to be assessed with caution as candidate biomarkers because at least part of the difference observed between the diseased individuals and the controls will not be caused by the disease, but rather by the natural biological variation between individuals.
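
The relative standard deviation (RSD) reported throughout this abstract is the standard deviation expressed as a percentage of the mean. A minimal sketch follows, assuming the sample (n - 1) standard deviation; the abstract does not state which estimator was used, and the abundance values are hypothetical.

```python
import statistics


def relative_standard_deviation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical abundances of one protein measured in five individuals.
abundances = [8.0, 10.0, 12.0, 9.0, 11.0]
print(round(relative_standard_deviation(abundances), 1))
```

Computing the same quantity on technical replicates of a single sample gives the technical variation against which the inter-individual RSD can be compared.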


Subject(s)
Cerebrospinal Fluid/metabolism , Metabolomics , Proteomics , Case-Control Studies , Liquid Chromatography , Humans , Magnetic Resonance Spectroscopy , Reproducibility of Results , Electrospray Ionization Mass Spectrometry , Tandem Mass Spectrometry
15.
Food Res Int ; 161: 111836, 2022 11.
Article in English | MEDLINE | ID: mdl-36192968

ABSTRACT

The development of portable NIR instruments facilitates widespread use among non-specialists. However, untrained operators may follow non-optimal measurement procedures. This work investigates how different factors in the measurement procedure influence the spectra of pig feed samples produced by SCiO, a handheld NIR. Measurement conditions were studied by means of Design of Experiments and evaluated with analysis of variance - simultaneous component analysis (ANOVA-SCA or ASCA). We quantified and visualized how measurement distance, angle, background lighting, the use of plastic lids and different devices interactively affect the resulting spectra. The samples could be distinguished with 100% accuracy with Partial Least Squares-Discriminant Analysis (PLS-DA) at a scanning distance of 0.5 cm. Replication of the experiment with special attention to reproducing the conditions still led to some differences, which highlights both the challenges in controlling conditions and the importance of considering them. Based on the results, generalizable guidelines for acceptance of spectra were proposed for this case study. Of main importance is performing measurements at distances of 0.5 cm or at least in an environment without background lighting. Overall, the provided guidelines for measurement conditions and a methodology to investigate this for other devices are a key enabler to spreading handheld spectrometry to a non-expert audience.


Subject(s)
Plastics , Near-Infrared Spectroscopy , Animals , Discriminant Analysis , Least-Squares Analysis , Spectrophotometry , Near-Infrared Spectroscopy/methods , Swine
16.
Sci Rep ; 12(1): 15687, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36127378

ABSTRACT

For the extraction of spatially important regions from mass spectrometry imaging (MSI) data, different clustering methods have been proposed. These clustering methods are based on certain assumptions and use different criteria to assign pixels into different classes. For high-dimensional MSI data, the curse of dimensionality also limits the performance of clustering methods, which is usually overcome by pre-processing the data using dimension reduction techniques. In summary, the extraction of spatial patterns from MSI data can be done using different unsupervised methods, but the robust evaluation of clustering results is what is still missing. In this study, we have performed multiple simulations on synthetic and real MSI data to validate the performance of unsupervised methods. The synthetic data were simulated mimicking important spatial and statistical properties of real MSI data. Our simulation results confirmed that K-means clustering with correlation distance and Gaussian Mixture Modeling clustering methods give optimal performance in most of the scenarios. The clustering methods give efficient results together with dimension reduction techniques. Of all the dimension reduction techniques considered here, the best results were obtained with the minimum noise fraction (MNF) transform. The results were confirmed on both synthetic and real MSI data. However, for successful implementation of the MNF transform, the MSI data need to be of limited dimensions.
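
What "K-means clustering with correlation distance" means at the level of a single assignment step can be sketched as follows. The sketch assumes the usual definition of correlation distance as one minus the Pearson correlation; it shows only the assignment of pixels to fixed centroids, not the full iterative algorithm or the study's pipeline, and all names and spectra are hypothetical.

```python
import math


def correlation_distance(a, b):
    """1 minus the Pearson correlation between two equal-length spectra."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    return 1.0 - sum(x * y for x, y in zip(da, db)) / denom


def assign_to_nearest(pixels, centroids):
    """K-means assignment step: index of the closest centroid per pixel."""
    return [min(range(len(centroids)),
                key=lambda k: correlation_distance(p, centroids[k]))
            for p in pixels]


# Hypothetical three-channel pixel spectra and two centroid spectra.
centroids = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
pixels = [[10.0, 20.0, 30.0], [5.0, 4.0, 2.0], [0.1, 0.2, 0.35]]

labels = assign_to_nearest(pixels, centroids)
print(labels)
```

Because the distance depends on spectral shape rather than absolute intensity, the first pixel is grouped with the first centroid even though it is ten times more intense.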


Subject(s)
Diagnostic Imaging , Cluster Analysis , Mass Spectrometry/methods , Normal Distribution
17.
Anal Chim Acta ; 1203: 339707, 2022 Apr 22.
Article in English | MEDLINE | ID: mdl-35361420

ABSTRACT

Many industries see a shifting focus towards performing on-site analysis using handheld spectroscopic devices. A determining factor for decision-making on the commissioning of these devices is available information on the potential performance of the device for specific applications. By now, myriad handheld solutions with very different specifications and pricing are available on the market. Although specifications are generally available for new devices, this does not directly quantify or predict how available devices will perform for targeted cases. We present a novel chemometric method to estimate the prediction performance of handheld NIR hardware and apply it to estimate the performance of two commercially available handheld NIR technologies in predicting protein content (ranging from 120 to 180 g kg⁻¹) in pig feed from existing data of a benchtop device. Adjusting benchtop data to the wavelength range and resolution of the handheld device led to over-optimistic estimates of the handheld performances. Our method additionally utilizes information on the error structure of the handheld devices for the estimation. It yielded performance estimates differing less than 1 g kg⁻¹ from the experimentally determined handheld performances and similar model parameters. Our method was effective for linear and nonlinear calibration algorithms, also when estimating performance after averaging multiple scans. Replicate spectra of twenty samples recorded using the handheld were required for replication error estimation to obtain an accurate performance estimation. The error structure could be reported by manufacturers in the future for this approach to be universally employed for predictive quantitative technology assessment. Overall, our method provides estimates of the performance of a handheld device for a specific task with minimal testing required and can thus be used as a device or application screening tool before committing to develop calibrations.


Subject(s)
Photons , Spectroscopy, Near-Infrared , Algorithms , Animals , Calibration , Spectroscopy, Near-Infrared/methods , Swine
18.
PLoS One ; 17(8): e0268881, 2022.
Article in English | MEDLINE | ID: mdl-36001537

ABSTRACT

PURPOSE: To evaluate the value of a convolutional neural network (CNN) in the diagnosis of human brain tumors or Alzheimer's disease by MR spectroscopic imaging (MRSI) and to compare its Matthews correlation coefficient (MCC) score against that of other machine learning methods and previous evaluation of the same data. We address two challenges: 1) the limited number of cases in MRSI datasets and 2) the interpretability of results in the form of relevant spectral regions. METHODS: A shallow CNN with an ad hoc loss function was constructed, involving two branches for processing the spectral and image features of a brain voxel, respectively. Each branch consists of a single convolutional hidden layer. The outputs of the two convolutional layers are merged and fed to a classification layer that outputs class predictions for the given brain voxel. RESULTS: Our CNN method separated glioma grades 3 and 4 and identified Alzheimer's disease patients using MRSI and complementary MRI data with a high MCC score (areas under the curve were 0.87 and 0.91, respectively). The results demonstrated superior effectiveness over other popular methods such as Partial Least Squares or Support Vector Machines. In addition, our method automatically identified the spectral regions most important in the diagnosis process, and we show that these are in good agreement with existing biomarkers from the literature. CONCLUSION: Shallow CNN models integrating image and spectral features improved quantitative exploration and diagnosis of brain diseases for research and clinical purposes. Software is available at https://bitbucket.org/TeslaH2O/cnn_mrsi.
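For readers unfamiliar with the MCC score used above, the following minimal sketch computes it from the four cells of a binary confusion matrix using the standard definition. The function name and the example counts are illustrative and are not taken from the paper or its software.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns a value in [-1, 1]; 0.0 is returned when the denominator is zero
    (a common convention when a row or column of the matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a voxel-level classifier.
score = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
```

Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, which makes it informative for the small and possibly imbalanced datasets the abstract mentions.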


Subject(s)
Alzheimer Disease , Brain Neoplasms , Alzheimer Disease/diagnostic imaging , Brain Neoplasms/diagnostic imaging , Humans , Machine Learning , Magnetic Resonance Imaging/methods , Neural Networks, Computer
19.
Environ Int ; 170: 107587, 2022 12.
Article in English | MEDLINE | ID: mdl-36274492

ABSTRACT

River water is an important source of Dutch drinking water. For this reason, continuous monitoring of river water quality is needed. However, comprehensive chemical analyses with high-resolution gas chromatography [GC]-mass spectrometry [MS]/liquid chromatography [LC]-MS are quite tedious and time-consuming; this makes them poorly suited to routine water quality monitoring and, therefore, many pollution events are missed. Phytoplankton are highly sensitive and responsive to toxicity, which makes them highly usable for effect-based water quality monitoring. Flow cytometry can measure the optical properties of phytoplankton every hour, generating a large amount of information-rich data in one year. However, this requires chemometrics, as the resulting fingerprints need to be processed into information about abnormal phytoplankton behaviour. We developed Discriminant Analysis of Multi-Aspect CYtometry (DAMACY) to model the "normal condition" of the phytoplankton community imposed by diurnal, meteorological, and other exogenous influences. DAMACY first describes the cellular variability and distribution of phytoplankton in each measurement using principal component analysis, and then aims to find subtle differences in these phytoplankton distributions that predict normal environmental conditions. Deviations from these normal environmental conditions indicated abnormal phytoplankton behaviour that happened alongside pollution events measured with the GC/MS and LC/MS systems. Thus, our results demonstrate that flow cytometry in combination with chemometrics may be used for an automated hourly assessment of river water quality and as a near real-time early warning for detecting harmful known or unknown contaminants. Finally, both the flow cytometer and the DAMACY algorithm run completely autonomously and only require maintenance once or twice per year. The warning system results may be uploaded automatically, so that drinking water companies may temporarily stop pumping water whenever abnormal phytoplankton behaviour is detected. In the case of prolonged abnormal phytoplankton behaviour, comprehensive analysis may still be used to identify the chemical compound, its origin, and toxicity.


Subject(s)
Drinking Water , Phytoplankton , Water Quality , Flow Cytometry , Chemometrics
20.
BMC Bioinformatics ; 12: 254, 2011 Jun 22.
Article in English | MEDLINE | ID: mdl-21696593

ABSTRACT

BACKGROUND: Analysis of cerebrospinal fluid (CSF) samples holds great promise for diagnosing neurological pathologies and gaining insight into their molecular background. Proteomics and metabolomics methods provide invaluable information on the biomolecular content of CSF and thereby on the possible status of the central nervous system, including neurological pathologies. The combined information provides a more complete description of CSF content. Extracting the full combined information requires a combined analysis of the different datasets, i.e., fusion of the data. RESULTS: A novel fusion method is presented and applied to proteomics and metabolomics data from a pre-clinical model of multiple sclerosis: an Experimental Autoimmune Encephalomyelitis (EAE) model in rats. The method follows a mid-level fusion architecture. The relevant information is extracted per platform using extended canonical variates analysis. The results are subsequently merged in order to be analyzed jointly. We find that the combined proteome and metabolome data allow for efficient and reliable discrimination between healthy rats, peripherally inflamed rats, and rats at the onset of EAE. The predicted accuracy reaches 89% on a test set. The important variables (metabolites and proteins) in this model are known to be linked to EAE and/or multiple sclerosis. CONCLUSIONS: Fusion of proteomics and metabolomics data is possible. The main issues of high dimensionality and missing values are overcome. The outcome leads to higher prediction accuracy and a more exhaustive description of the disease profile. The biological interpretation of the involved variables validates our fusion approach.
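The mid-level fusion architecture described above (extract features per platform, then merge them for joint analysis) can be sketched as follows. The paper performs the extraction step with extended canonical variates analysis; this sketch swaps in a deliberately simple placeholder (keeping the highest-variance columns) purely to show the data flow. All names and data are hypothetical.

```python
import statistics


def top_variance_columns(block, k):
    """Indices of the k columns with the largest variance.

    Placeholder for the per-platform extraction step; it is NOT extended
    canonical variates analysis.
    """
    variances = [statistics.pvariance(col) for col in zip(*block)]
    order = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return sorted(order[:k])


def mid_level_fusion(blocks, k):
    """Extract k features from each block, then concatenate them per sample.

    Each block is a list of rows (one row per sample); all blocks must list
    the same samples in the same order.
    """
    selected = [top_variance_columns(block, k) for block in blocks]
    fused = []
    for rows in zip(*blocks):  # one row per platform for the same sample
        fused.append([row[j] for row, cols in zip(rows, selected) for j in cols])
    return fused


# Hypothetical measurements for three samples on two platforms.
proteomics = [[1, 10, 5], [1, 20, 5], [1, 30, 6]]
metabolomics = [[0, 2], [4, 2], [8, 2]]
fused = mid_level_fusion([proteomics, metabolomics], k=1)
```

The fused matrix has one row per sample and can be passed to any classifier. Reducing each platform before merging keeps a high-dimensional block from dominating the joint model.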


Subject(s)
Biomarkers/cerebrospinal fluid , Cerebrospinal Fluid/chemistry , Encephalomyelitis, Autoimmune, Experimental/diagnosis , Metabolomics/methods , Proteomics/methods , Animals , Encephalomyelitis, Autoimmune, Experimental/metabolism , Male , Nuclear Magnetic Resonance, Biomolecular , Rats , Rats, Inbred Lew