ABSTRACT
As metabolomics grows into a high-throughput, high-demand research field, current metrics for identifying small molecules in gas chromatography-mass spectrometry (GC-MS) still require manual verification. Though steps have been taken to improve scoring metrics by combining spectral similarity (SS) and retention index (RI), the problem persists. A large body of literature has analyzed and refined SS scores, but few studies have explicitly studied improvements to RI scores. Here, we examined whether uninvestigated assumptions of the RI score are valid and propose ways to improve them. Query RIs were matched to library RIs with a generous window of ±35 to avoid unintentional removal of valid compound identifications. Each match was manually verified as a true positive (TP), true negative, or unknown. Metabolites with at least 30 TP identifications were included in downstream analyses, resulting in a total of 87 metabolites from samples of varying complexity and type (e.g., amino acid mixtures, human urine, and fungal species). Our results showed that the RI score assumptions of normality, consistent variance across metabolites, and a mean error centered at 0 are often violated. We demonstrated through a cross-validation analysis that modifying these underlying assumptions according to empirical metabolite-specific distributions improved the rankings of true positives and true negatives. Further, we statistically determined the minimum number of samples required to estimate distributional parameters for scoring metrics. Overall, this work proposes a robust statistical pipeline to reduce the time bottleneck of metabolite identification by improving RI scores and thus minimizing the effort needed to complete manual verification.
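The contrast between a generic RI score and a metabolite-specific one can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the Gaussian-shaped score, the default sigma of 3 RI units, the function names, and the RI errors are all assumptions made up for the example. It adjusts only the mean and variance assumptions; it does not address departures from normality.

```python
import math
from statistics import mean, stdev


def ri_score(delta_ri, mu=0.0, sigma=3.0):
    """Gaussian-shaped RI score in (0, 1]; equals 1 when delta_ri == mu.

    mu=0 and a single shared sigma mimic the generic assumptions
    (zero-centered error, same variance for every metabolite).
    The default sigma is an assumed value, not one from the abstract.
    """
    return math.exp(-((delta_ri - mu) ** 2) / (2.0 * sigma ** 2))


def fit_metabolite_params(tp_errors):
    """Estimate a metabolite-specific mean and SD from verified TP RI errors."""
    return mean(tp_errors), stdev(tp_errors)


# Made-up RI errors (query RI minus library RI) for one hypothetical metabolite
# whose true positives are systematically shifted by about +5 RI units.
tp_errors = [4.1, 5.3, 3.8, 6.0, 4.7, 5.5, 4.9, 5.2]
mu_hat, sigma_hat = fit_metabolite_params(tp_errors)

candidate_delta = 5.0  # RI error of a new candidate match
generic = ri_score(candidate_delta)                      # zero-mean, shared sigma
specific = ri_score(candidate_delta, mu_hat, sigma_hat)  # empirical parameters

print(f"generic score:  {generic:.3f}")
print(f"specific score: {specific:.3f}")
```

In this toy case the candidate is penalized by the zero-centered score even though its error is typical for the metabolite, while the empirically parameterized score ranks it highly.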
Subjects
Metabolomics, Humans, Gas Chromatography-Mass Spectrometry/methods, Metabolomics/methods
ABSTRACT
Biological nitrogen fixation by microbial diazotrophs can contribute significantly to nitrogen availability in non-nodulating plant species. In this study of molecular mechanisms and gene expression relating to biological nitrogen fixation, the aerobic nitrogen-fixing endophyte Burkholderia vietnamiensis, strain WPB, isolated from Populus trichocarpa served as a model for endophyte-poplar interactions. Nitrogen-fixing activity was observed to be dynamic on nitrogen-free medium, with a subset of colonies growing to form robust, raised, globular-like structures. Secondary ion mass spectrometry (NanoSIMS) confirmed that N-fixation was uneven within the population. A fluorescent transcriptional reporter (GFP) revealed that the nitrogenase subunit nifH is not uniformly expressed across genetically identical colonies of WPB and that only ~11% of the population was actively expressing the nifH gene. Monitoring individual bacterial cells by single-molecule fluorescence in situ hybridization showed higher nifH gene expression in clustered cells. Through 15N2 enrichment, we identified key nitrogenous metabolites and proteins synthesized by WPB and employed targeted metabolomics in active and inactive populations. We cocultivated WPB Pnif-GFP with poplar within a RhizoChip, a synthetic soil habitat, which enabled direct imaging of microbial nifH expression within root epidermal cells. We observed that nifH expression is localized to the root elongation zone, where the strain forms a unique physical interaction with the root cells. This work employed comprehensive experimentation to identify novel mechanisms regulating both biological nitrogen fixation and beneficial plant-endophyte interactions.
Subjects
Nitrogen Fixation, Populus, Nitrogen Fixation/physiology, Populus/genetics, Populus/metabolism, Endophytes/genetics, Oxidoreductases/genetics, Fluorescence In Situ Hybridization, Nitrogenase/genetics, Nitrogenase/metabolism, Nitrogen
ABSTRACT
The ability to reliably identify small molecules (e.g., metabolites) is key to driving scientific advancement in metabolomics. Gas chromatography-mass spectrometry (GC-MS) is an analytic method that may be applied to facilitate this process. The typical GC-MS identification workflow involves quantifying the similarity of an observed sample spectrum and other features (e.g., retention index) to that of several references, noting the compound of the best-matching reference spectrum as the identified metabolite. While a deluge of similarity metrics exists, none quantify the error rate of generated identifications, thereby presenting an unknown risk of false identification or discovery. To quantify this unknown risk, we propose a model-based framework for estimating the false discovery rate (FDR) among a set of identifications. Extending a traditional mixture modeling framework, our method incorporates both similarity score and experimental information in estimating the FDR. We apply these models to identification lists derived from 548 samples of varying complexity and sample type (e.g., fungal species, standard mixtures, etc.), comparing their performance to that of the traditional Gaussian mixture model (GMM). Through simulation, we additionally assess the impact of reference library size on the accuracy of FDR estimates. In comparing the best performing model extensions to the GMM, our results indicate relative decreases in median absolute estimation error (MAE) ranging from 12% to 70%, based on comparisons of the median MAEs across all hit-lists. Results indicate that these relative performance improvements generally hold regardless of library size; however, FDR estimation error typically worsens as the set of reference compounds diminishes.
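As a rough illustration of how a two-component Gaussian mixture yields an FDR estimate, the sketch below computes, for each similarity score, the posterior probability that the identification is incorrect, then averages those probabilities over the accepted identifications. It corresponds to the baseline GMM idea only, not to the extended models described above. The mixture parameters are fixed, made-up values (in practice they would be fitted, for example by expectation-maximization), and all function names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_incorrect(score, pi0, null, alt):
    """P(incorrect | score) under a two-component Gaussian mixture.

    pi0  : assumed prior proportion of incorrect identifications
    null : (mean, sd) of scores for incorrect identifications
    alt  : (mean, sd) of scores for correct identifications
    """
    f0 = pi0 * normal_pdf(score, *null)
    f1 = (1.0 - pi0) * normal_pdf(score, *alt)
    return f0 / (f0 + f1)


def estimated_fdr(scores, threshold, pi0, null, alt):
    """Mean posterior error probability among scores at or above threshold."""
    accepted = [s for s in scores if s >= threshold]
    if not accepted:
        return 0.0
    total = sum(posterior_incorrect(s, pi0, null, alt) for s in accepted)
    return total / len(accepted)


# Illustrative, made-up mixture parameters and similarity scores.
pi0, null, alt = 0.6, (0.4, 0.10), (0.8, 0.08)
scores = [0.35, 0.42, 0.50, 0.58, 0.66, 0.74, 0.80, 0.85, 0.90]

fdr_loose = estimated_fdr(scores, 0.30, pi0, null, alt)
fdr_strict = estimated_fdr(scores, 0.70, pi0, null, alt)
print(f"estimated FDR at threshold 0.30: {fdr_loose:.3f}")
print(f"estimated FDR at threshold 0.70: {fdr_strict:.3f}")
```

Raising the score threshold keeps fewer identifications but lowers the estimated FDR, which is the trade-off such a framework makes explicit.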
Subjects
Metabolomics, Gas Chromatography-Mass Spectrometry/methods, Metabolomics/methods
ABSTRACT
Metabolomics provides a unique snapshot into the world of small molecules and the complex biological processes that govern the human, animal, plant, and environmental ecosystems encapsulated by the One Health modeling framework. However, this "molecular snapshot" is only as informative as the number of metabolites confidently identified within it. The spectral similarity (SS) score is traditionally used to identify compounds in mass spectrometry approaches to metabolomics, where spectra are matched to reference libraries of candidate spectra. Unfortunately, there is little consensus on which of the dozens of available SS metrics should be used. This lack of a standard SS score creates analytic uncertainty and potentially leads to issues in reproducibility, especially as these data are integrated across other domains. In this work, we use metabolomic spectral similarity as a case study to showcase the challenges in consistency within just one piece of the One Health framework that must be addressed to enable data science approaches for One Health problems. Here, using a large cohort of both standard and complex datasets with expert-verified truth annotations, we evaluated the effectiveness of 66 similarity metrics in distinguishing between correct matches (true positives) and incorrect matches (true negatives). We additionally characterize the families of these metrics to make informed recommendations for their use. Our results indicate that specific families of metrics (the Inner Product, Correlative, and Intersection families of scores) tend to perform better than others, with no single similarity metric performing optimally for all queried spectra. This work and its findings provide an empirically based resource for researchers to use in their selection of similarity metrics for GC-MS identification, increasing scientific reproducibility by taking steps towards standardizing identification workflows.
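For readers unfamiliar with spectral similarity scoring, the sketch below shows one widely used metric, cosine similarity, which is commonly grouped with inner-product-type measures. It is offered only as an illustration of what a single SS metric computes; it is not claimed to be one of the specific implementations evaluated above. Spectra are represented as hypothetical dictionaries mapping nominal m/z to intensity, and the peak values are made up.

```python
import math


def cosine_similarity(query, reference):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts.

    Peaks missing from one spectrum are treated as zero intensity.
    Returns 0.0 if either spectrum has no signal.
    """
    shared_axis = set(query) | set(reference)
    dot = sum(query.get(mz, 0.0) * reference.get(mz, 0.0) for mz in shared_axis)
    norm_q = math.sqrt(sum(v * v for v in query.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


# Made-up nominal-mass spectra for illustration.
query = {73: 100.0, 147: 45.0, 205: 20.0}
good_ref = {73: 95.0, 147: 50.0, 205: 18.0}
poor_ref = {57: 100.0, 71: 60.0, 147: 5.0}

print(f"query vs good reference: {cosine_similarity(query, good_ref):.3f}")
print(f"query vs poor reference: {cosine_similarity(query, poor_ref):.3f}")
```

Because the score depends only on the angle between intensity vectors, rescaling a spectrum does not change it; other metric families weigh peak overlap or intensity differences in other ways, which is one reason rankings can differ between metrics.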
ABSTRACT
Red alder (Alnus rubra Bong.) is an ecologically significant and important fast-growing commercial tree species native to western coastal and riparian regions of North America, having highly desirable wood, pigment, and medicinal properties. We have sequenced the genome of a rapidly growing clone. The assembly is nearly complete, containing the full complement of expected genes. This supports our objectives of identifying and studying genes and pathways involved in nitrogen-fixing symbiosis and those related to secondary metabolites that underlie red alder's many interesting defense, pigmentation, and wood quality traits. We established that this clone is most likely diploid and identified a set of SNPs that will have utility in future breeding and selection endeavors, as well as in ongoing population studies. We have added a well-characterized genome to others from the order Fagales. In particular, it improves significantly upon the only other published alder genome sequence, that of Alnus glutinosa. Our work initiated a detailed comparative analysis of members of the order Fagales and established some similarities with previous reports in this clade, suggesting a biased retention of certain gene functions in the vestiges of an ancient genome duplication when compared with more recent tandem duplications.
Subjects
Alnus, Alnus/metabolism, Diploidy, Plant Breeding, Symbiosis, Trees
ABSTRACT
Building mechanistic models of kinase-driven signaling pathways requires quantitative measurements of protein phosphorylation across physiologically relevant conditions, but this is rarely done because of the insensitivity of traditional technologies. By using a multiplexed deep phosphoproteome profiling workflow, we were able to generate a deep phosphoproteomics dataset of the EGFR-MAPK pathway in non-transformed MCF10A cells across physiological ligand concentrations with a time resolution of <12 min and in the presence and absence of multiple kinase inhibitors. An improved phosphosite mapping technique allowed us to reliably identify >46,000 phosphorylation sites on >6600 proteins, of which >4500 sites from 2110 proteins displayed a >2-fold increase in phosphorylation in response to EGF. These data were then placed into a cellular context by linking them to 15 previously published protein databases. We found that our results were consistent with much, but not all, of the previously reported data regarding the activation and negative feedback phosphorylation of core EGFR-ERK pathway proteins. We also found that EGFR signaling is biphasic, with substrates downstream of RAS/MAPK activation showing a maximum response at <3 ng/ml EGF while direct substrates, such as HGS and STAT5B, showed no saturation. We found that RAS activation is mediated by at least 3 parallel pathways, two of which depend on PTPN11. There appears to be an approximately 4-minute delay in pathway activation at the step between RAS and RAF, but subsequent pathway phosphorylation was extremely rapid. Approximately 80 proteins showed a >2-fold increase in phosphorylation across all experiments, and these proteins had a significantly higher median number of phosphorylation sites (~18) relative to total cellular phosphoproteins (~4).
Over 60% of EGF-stimulated phosphoproteins were downstream of MAPK and included mediators of cellular processes such as gene transcription, transport, signal transduction and cytoskeletal arrangement. Their phosphorylation was either linear with respect to MAPK activation or biphasic, corresponding to the biphasic signaling seen at the level of the EGFR. This deep, integrated phosphoproteomics data resource should be useful in building mechanistic models of EGFR and MAPK signaling and for understanding how downstream responses are regulated.
ABSTRACT
BACKGROUND: Physiological and biochemical processes across tissues of the body are regulated in response to the high demands of intense physical activity in several occupations, such as firefighting, law enforcement, military, and sports. A better understanding of such processes can ultimately help improve human performance and prevent illnesses in the work environment. METHODS: To study regulatory processes in intense physical activity simulating real-life conditions, we performed a multi-omics analysis of three biofluids (blood plasma, urine, and saliva) collected from 11 wildland firefighters before and after a 45 min, intense exercise regimen. Omics profiles post- versus pre-exercise were compared by Student's t-test followed by pathway analysis and comparison between the different omics modalities. RESULTS: Our multi-omics analysis identified and quantified 3835 proteins, 730 lipids and 182 metabolites across the 3 sample types combined. The blood plasma analysis revealed signatures of tissue damage and acute repair response accompanied by enhanced carbon metabolism to meet energy demands. The urine analysis showed a strong, concomitant regulation of 6 out of 8 identified proteins from the renin-angiotensin system, supporting increased excretion of catabolites, reabsorption of nutrients and maintenance of fluid balance. In saliva, we observed a decrease in 3 pro-inflammatory cytokines and an increase in 8 antimicrobial peptides. A systematic literature review identified 6 papers that support an altered susceptibility to respiratory infection. CONCLUSION: This study shows simultaneous regulatory signatures in biofluids indicative of homeostatic maintenance during intense physical activity with possible effects on increased infection susceptibility, suggesting that precautions against respiratory diseases could benefit workers in highly physically demanding jobs.
Subjects
Exercise, Multiomics, Humans, Exercise/physiology, Cytokines
ABSTRACT
BACKGROUND: Microbiomes contribute to multiple ecosystem services by transforming organic matter in the soil. Extreme shifts in the environment, such as drying-rewetting cycles during drought, can impact the microbial metabolism of organic matter by altering microbial physiology and function. These physiological responses are mediated in part by lipids that are responsible for regulating interactions between cells and the environment. Despite this critical role in regulating the microbial response to stress, little is known about microbial lipids and metabolites in the soil or how they influence phenotypes that are expressed under drying-rewetting cycles. To address this knowledge gap, we conducted a soil incubation experiment to simulate soil drying during a summer drought of an arid grassland, then measured the response of the soil lipidome and metabolome during the first 3 h after wet-up. RESULTS: Reduced nutrient access during soil drying incurred a replacement of membrane phospholipids, resulting in a diminished abundance of multiple phosphorus-rich membrane lipids. The hot and dry conditions increased the prevalence of sphingolipids and lipids containing long-chain polyunsaturated fatty acids, both of which are associated with heat and osmotic stress-mitigating properties in fungi. This novel finding suggests that lipids commonly present in eukaryotes such as fungi may play a significant role in supporting community resilience displayed by arid land soil microbiomes during drought. As early as 10 min after rewetting dry soil, distinct changes were observed in several lipids that had bacterial signatures including a rapid increase in the abundance of glycerophospholipids with saturated and short fatty acid chains, prototypical of bacterial membrane lipids. Polar metabolites including disaccharides, nucleic acids, organic acids, inositols, and amino acids also increased in abundance upon rewetting. 
This rapid metabolic reactivation and growth after rewetting coincided with an increase in the relative abundance of Firmicutes, suggesting that members of this phylum were positively impacted by rewetting. CONCLUSIONS: Our study revealed specific changes in lipids and metabolites that are indicative of stress adaptation, substrate use, and cellular recovery during soil drying and subsequent rewetting. The drought-induced nutrient limitation was reflected in the lipidome and polar metabolome, both of which rapidly shifted (within hours) upon rewetting. Reduced nutrient access in dry soil caused the replacement of glycerophospholipids with phosphorus-free lipids and impeded resource-expensive osmolyte accumulation. Elevated levels of ceramides and lipids with long-chain polyunsaturated fatty acids in dry soil suggest that lipids likely play an important role in the drought tolerance of microbial taxa capable of synthesizing these lipids. An increasing abundance of bacterial glycerophospholipids and triacylglycerols with fatty acids typical of bacteria, together with polar metabolites, suggests a metabolic recovery in representative bacteria once the environmental conditions are conducive for growth. These results underscore the importance of the soil lipidome as a robust indicator of microbial community responses, especially at the short time scales of cell-environment reactions.
Subjects
Ecosystem, Lipidomics, Acclimatization, Ceramides, Fatty Acids, Unsaturated Fatty Acids
ABSTRACT
Microglia, the innate immune cells of the central nervous system, have been genetically implicated in multiple neurodegenerative diseases. We previously mapped the genetic regulation of gene expression and mRNA splicing in human microglia, identifying several loci where common genetic variants in microglia-specific regulatory elements explain disease risk loci identified by GWAS. However, identifying genetic effects on splicing has been challenging due to the use of short sequencing reads to identify causal isoforms. Here we present the isoform-centric microglia genomic atlas (isoMiGA) which leverages the power of long-read RNA-seq to identify 35,879 novel microglia isoforms. We show that the novel microglia isoforms are involved in stimulation response and brain region specificity. We then quantified the expression of both known and novel isoforms in a multi-ethnic meta-analysis of 555 human microglia short-read RNA-seq samples from 391 donors, the largest to date, and found associations with genetic risk loci in Alzheimer's disease and Parkinson's disease. We nominate several loci that may act through complex changes in isoform and splice site usage.
ABSTRACT
The detailed mechanisms of COVID-19 infection pathology remain poorly understood. To improve our understanding of SARS-CoV-2 pathology, we performed a multi-omics and correlative analysis of an immunologically naïve SARS-CoV-2 clinical cohort from blood plasma of uninfected controls, mild, and severe infections. Consistent with previous observations, severe patient populations showed an elevation of pulmonary surfactant levels. Intriguingly, mild patients showed a statistically significant elevation in the carnosine dipeptidase modifying enzyme (CNDP1). Mild and severe patient populations showed a strong elevation in the metabolite L-cystine (oxidized form of the amino acid cysteine) and enzymes with roles in glutathione metabolism. Neutrophil extracellular traps (NETs) were observed in both mild and severe populations, and NET formation was higher in severe vs. mild samples. Our correlative analysis suggests a potential protective role for CNDP1 in suppressing PSPB release from the pulmonary space whereas NET formation correlates with increased PSPB levels and disease severity. In our discussion we put forward a possible model where NET formation drives pulmonary occlusions and CNDP1 promotes antioxidation, pleiotropic immune responses, and vasodilation by accelerating histamine synthesis.
ABSTRACT
Successful establishment of pregnancy requires adhesion of an embryo to the endometrium and subsequent invasion into the maternal tissue. Abnormalities in this critical process of implantation and placentation lead to many pregnancy complications. Here we present a microengineered system to model a complex sequence of orchestrated multicellular events that plays an essential role in early pregnancy. Our implantation-on-a-chip is capable of reconstructing the three-dimensional structural organization of the maternal-fetal interface to model the invasion of specialized fetal extravillous trophoblasts into the maternal uterus. Using primary human cells isolated from clinical specimens, we demonstrate in vivo-like directional migration of extravillous trophoblasts towards a microengineered maternal vessel and their interactions with the endothelium necessary for vascular remodeling. Through parametric variation of the cellular microenvironment and proteomic analysis of microengineered tissues, we show the important role of decidualized stromal cells as a regulator of extravillous trophoblast migration. Furthermore, our study reveals previously unknown effects of pre-implantation maternal immune cells on extravillous trophoblast invasion. This work represents a significant advance in our ability to model early human pregnancy and may enable the development of advanced in vitro platforms for basic and clinical research of human reproduction.
Subjects
Proteomics, Trophoblasts, Cell Movement, Embryo Implantation/physiology, Endometrium, Female, Humans, Placentation/physiology, Pregnancy, Trophoblasts/physiology
ABSTRACT
Thiol-based post-translational modifications (PTMs) play a key role in redox-dependent regulation and signaling. Functional cysteine (Cys) sites serve as redox switches, regulated through multiple types of PTMs. Herein, we aim to characterize the complexity of thiol PTMs at the proteome level through the establishment of a direct detection workflow. The LC-MS/MS based workflow allows for simultaneous quantification of protein abundances and multiple types of thiol PTMs. To demonstrate its utility, the workflow was applied to mouse pancreatic β-cells (β-TC-6) treated with thapsigargin to induce endoplasmic reticulum (ER) stress. This resulted in the quantification of >9000 proteins and multiple types of thiol PTMs, including intra-peptide disulfide (S-S), S-glutathionylation (SSG), S-sulfinylation (SO2H), S-sulfonylation (SO3H), S-persulfidation (SSH), and S-trisulfidation (SSSH). Proteins with significant changes in abundance were observed to be involved in canonical pathways such as autophagy, unfolded protein response, protein ubiquitination pathway, and EIF2 signaling. Moreover, ~500 Cys sites were observed with one or multiple types of PTMs with SSH and S-S as the predominant types of modifications. In many cases, significant changes in the levels of different PTMs were observed on various enzymes and their active sites, while their protein abundance exhibited little change. These results provide evidence of independent translational and post-translational regulation of enzyme activity. The observed complexity of thiol modifications on the same Cys residues illustrates the challenge in the characterization and interpretation of protein thiol modifications and their functional regulation.
Subjects
Insulin-Secreting Cells, Sulfhydryl Compounds, Animals, Liquid Chromatography, Endoplasmic Reticulum Stress, Insulin-Secreting Cells/metabolism, Mice, Oxidation-Reduction, Post-Translational Protein Processing, Proteome/metabolism, Tandem Mass Spectrometry
ABSTRACT
The confident identification of metabolites and xenobiotics in biological and environmental studies is an analytical challenge due to their immense dynamic range, vast chemical space and structural diversity. Ion mobility spectrometry (IMS) is widely used for small molecule analyses since it can separate isomeric species and be easily coupled with front end separations and mass spectrometry for multidimensional characterizations. However, to date IMS metabolomic and exposomic studies have been limited by an inadequate number of accurate collision cross section (CCS) values for small molecules, causing features to be detected but not confidently identified. In this work, we utilized drift tube IMS (DTIMS) to directly measure CCS values for over 500 small molecules including primary metabolites, secondary metabolites and xenobiotics. Since DTIMS measurements do not need calibrant ions or calibration like some other IMS techniques, they avoid calibration errors which can cause problems in distinguishing structurally similar molecules. All measurements were performed in triplicate in both positive and negative polarities with nitrogen gas and seven different electric fields, so that relative standard deviations (RSD) could be assessed for each molecule and structural differences studied. The primary metabolites analyzed to date have come from key metabolism pathways such as glycolysis, the pentose phosphate pathway and the tricarboxylic acid cycle, while the secondary metabolites consisted of classes such as terpenes and flavonoids, and the xenobiotics represented a range of molecules from antibiotics to polycyclic aromatic hydrocarbons. Different CCS trends were observed for several of the diverse small molecule classes and when urine features were matched to the database, the addition of the IMS dimension greatly reduced the possible number of candidate molecules. 
This CCS database and structural information are freely available for download at http://panomics.pnnl.gov/metabolites/ with new molecules being added frequently.
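The way an added IMS dimension narrows a candidate list can be sketched as a two-stage filter: first by mass tolerance, then by CCS tolerance. This is a hedged illustration only. The library entries, their CCS values, the tolerances (10 ppm and 2%), and the function name are all made up for the example and are not taken from the database described above.

```python
def filter_candidates(feature_mz, feature_ccs, library,
                      ppm_tol=10.0, ccs_tol_pct=2.0):
    """Return (mass_hits, ccs_hits) for a detected feature.

    mass_hits: library entries within ppm_tol of the feature m/z.
    ccs_hits : the subset of mass_hits within ccs_tol_pct of the feature CCS.
    Both tolerances are assumed, illustrative values.
    """
    mass_hits = [
        c for c in library
        if abs(c["mz"] - feature_mz) / c["mz"] * 1e6 <= ppm_tol
    ]
    ccs_hits = [
        c for c in mass_hits
        if abs(c["ccs"] - feature_ccs) / c["ccs"] * 100.0 <= ccs_tol_pct
    ]
    return mass_hits, ccs_hits


# Hypothetical library: three near-isobaric entries plus one unrelated mass.
library = [
    {"name": "isomer_A", "mz": 181.0707, "ccs": 135.0},
    {"name": "isomer_B", "mz": 181.0707, "ccs": 141.5},
    {"name": "isomer_C", "mz": 181.0712, "ccs": 150.2},
    {"name": "other_D", "mz": 195.0877, "ccs": 139.0},
]

mass_hits, ccs_hits = filter_candidates(181.0708, 141.0, library)
print("mass-only candidates:", [c["name"] for c in mass_hits])
print("mass + CCS candidates:", [c["name"] for c in ccs_hits])
```

In this toy case, mass alone leaves three indistinguishable candidates, while adding the CCS constraint leaves one.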