ABSTRACT
Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variations in instrument performance over time, as well as artifacts introduced by the samples themselves, such as those arising from collection, storage, and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance or identify issues such as a degradation in data quality caused by the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired to dynamically flag potential issues with instrument performance or sample quality. QC-ART has accuracy similar to that of standard post-hoc analysis methods, with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality caused by both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments.
Subject(s)
Computer Systems , Proteomics/methods , Proteomics/standards , Quality Control , Tandem Mass Spectrometry/methods , Algorithms , Cohort Studies , Databases, Protein , Humans , Isotope Labeling , Oxidation-Reduction , Peptides/metabolism , ROC Curve , User-Computer Interface
ABSTRACT
Type 1 diabetes (T1D) results from autoimmune destruction of β cells. Insufficient availability of biomarkers represents a significant gap in understanding the disease cause and progression. We conduct blinded, two-phase case-control plasma proteomics on the TEDDY study to identify biomarkers predictive of T1D development. Untargeted proteomics of 2,252 samples from 184 individuals identifies 376 regulated proteins, showing alteration of complement, inflammatory signaling, and metabolic proteins even prior to autoimmunity onset. Extracellular matrix and antigen presentation proteins are differentially regulated in individuals who progress to T1D vs. those who remain in autoimmunity. Targeted proteomics measurements of 167 proteins in 6,426 samples from 990 individuals validate 83 biomarkers. A machine learning analysis predicts whether individuals would remain in autoimmunity or develop T1D 6 months before autoantibody appearance, with areas under receiver operating characteristic curves of 0.871 and 0.918, respectively. Our study identifies and validates biomarkers, highlighting pathways affected during T1D development.
Subject(s)
Diabetes Mellitus, Type 1 , Insulin-Secreting Cells , Humans , Diabetes Mellitus, Type 1/diagnosis , Autoimmunity , Autoantibodies , Biomarkers
ABSTRACT
The microbial and molecular characterization of the ectorhizosphere is an important step towards developing a more complete understanding of how the cultivation of biofuel crops can be undertaken in nutrient-poor environments. The ectorhizosphere of Setaria is of particular interest because the plant component of this plant-microbe system is an important agricultural grain crop and a model for biofuel grasses. Importantly, Setaria lends itself to high-throughput molecular studies. As such, we have identified important intra- and interspecific microbial and molecular differences in the ectorhizospheres of three geographically distant Setaria italica accessions and their wild ancestor S. viridis. All were grown in a nutrient-poor soil with and without nutrient addition. To assess the contrasting impact of nutrient deficiency observed for two S. italica accessions, we quantitatively evaluated differences in soil organic matter, microbial community, and metabolite profiles. Together, these measurements suggest that rhizosphere priming differs with Setaria accession, which arises from alterations in microbial community abundances, specifically Actinobacteria and Proteobacteria populations. When globally comparing the metabolomic response of Setaria to nutrient addition, plants produced distinctly different metabolic profiles in the leaves and roots. With nutrient addition, increases in nitrogen-containing metabolites were significantly greater in plant leaves and roots, along with significant increases in tyrosine-derived alkaloids, serotonin, and synephrine. Glycerol was also found to be significantly increased in the leaves as well as the ectorhizosphere. These differences provide insight into how C4 grasses adapt to changing nutrient availability in soils or to contrasting fertilization schemes. The knowledge gained could then be utilized in plant enhancement and bioengineering efforts to produce plants with superior traits when grown in nutrient-poor soils.
Subject(s)
Bacteria/classification , RNA, Ribosomal, 16S/genetics , Setaria Plant/classification , Setaria Plant/growth & development , Soil/chemistry , Alkaloids/metabolism , Bacteria/genetics , Bacteria/isolation & purification , DNA, Bacterial/genetics , DNA, Ribosomal/genetics , Glycerol , Metabolomics , Nitrogen/metabolism , Phylogeny , Phylogeography , Plant Leaves/classification , Plant Leaves/growth & development , Plant Leaves/metabolism , Plant Leaves/microbiology , Plant Roots/classification , Plant Roots/growth & development , Plant Roots/metabolism , Plant Roots/microbiology , Rhizosphere , Sequence Analysis, DNA , Setaria Plant/metabolism , Setaria Plant/microbiology , Soil Microbiology
ABSTRACT
BACKGROUND: The Environmental Determinants of Diabetes in the Young (TEDDY) study has prospectively followed, from birth, children at increased genetic risk of type 1 diabetes. TEDDY has collected heterogeneous data longitudinally to gain insights into the environmental and biological mechanisms driving the progression to persistent islet autoantibodies. METHODS: We developed a machine learning model to predict imminent transition to the development of persistent islet autoantibodies based on time-varying metabolomics data integrated with time-invariant risk factors (e.g., gestational age). The machine learning was initiated with 221 potential features (85 genetic, 5 environmental, 131 metabolomic), and an ensemble-based feature evaluation was used to identify a small set of predictive features that can be interrogated to better understand the pathogenesis leading up to persistent islet autoimmunity. RESULTS: The final integrative machine learning model included 42 disparate features, returning a cross-validated receiver operating characteristic area under the curve (AUC) of 0.74 and an AUC of ~0.65 on an independent validation dataset. The model identified a principal set of 20 time-invariant markers, including 18 genetic markers (16 single nucleotide polymorphisms [SNPs] and two HLA-DR genotypes) and two demographic markers (gestational age and exposure to a prebiotic formula). Integration with the metabolome identified 22 supplemental metabolites and lipids, including adipic acid and ceramide d42:0, that predicted development of islet autoantibodies. CONCLUSIONS: The majority (86%) of metabolites that predicted development of islet autoantibodies belonged to three pathways: lipid oxidation, phospholipase A2 signaling, and pentose phosphate, suggesting that these metabolic processes may play a role in triggering islet autoimmunity.
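For readers less familiar with the AUC values quoted above, the following is a minimal sketch of how a receiver operating characteristic AUC can be computed from predicted scores. It is not the authors' pipeline: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration. The sketch uses the rank interpretation of AUC, namely the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels (1 = positive case).
    scores: iterable of model scores, higher meaning "more likely positive".
    Hypothetical helper for illustration; not taken from the study.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(pos) * len(neg))


# Toy data (made up): 3 positives, 4 negatives.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is fine for illustration; a sort-based rank computation would be the usual choice for larger datasets.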