ABSTRACT
The Natural Products Magnetic Resonance Database (NP-MRD) is a comprehensive, freely available electronic resource for the deposition, distribution, searching and retrieval of nuclear magnetic resonance (NMR) data on natural products, metabolites and other biologically derived chemicals. NMR spectroscopy has long been viewed as the 'gold standard' for the structure determination of novel natural products and novel metabolites. NMR is also widely used in natural product dereplication and the characterization of biofluid mixtures (metabolomics). All of these NMR applications require large collections of high quality, well-annotated, referential NMR spectra of pure compounds. Unfortunately, referential NMR spectral collections for natural products are quite limited. It is because of the critical need for dedicated, open access natural product NMR resources that the NP-MRD was funded by the National Institutes of Health (NIH). Since its launch in 2020, the NP-MRD has grown quickly to become the world's largest repository for NMR data on natural products and other biological substances. It currently contains both structural and NMR data for nearly 41,000 natural product compounds from >7400 different living species. All structural, spectroscopic and descriptive data in the NP-MRD is interactively viewable, searchable and fully downloadable in multiple formats. Extensive hyperlinks to other databases of relevance are also provided. The NP-MRD also supports community deposition of NMR assignments and NMR spectra (1D and 2D) of natural products and related meta-data. The deposition system performs extensive data enrichment, automated data format conversion and spectral/assignment evaluation. Details of these database features, how they are implemented and plans for future upgrades are also provided. The NP-MRD is available at https://np-mrd.org.
Subject(s)
Biological Products/chemistry , Databases, Factual , Magnetic Resonance Spectroscopy , Software , Biological Products/classification , Internet
ABSTRACT
A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices and sources of error, and we provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
Subject(s)
Metabolomics , Quantum Theory
ABSTRACT
We present DEIMoS: Data Extraction for Integrated Multidimensional Spectrometry, a Python application programming interface (API) and command-line tool for high-dimensional mass spectrometry data analysis workflows that offers ease of development and access to efficient algorithmic implementations. Functionality includes feature detection, feature alignment, collision cross section (CCS) calibration, isotope detection, and MS/MS spectral deconvolution, with the output comprising detected features aligned across study samples and characterized by mass, CCS, tandem mass spectra, and isotopic signature. Notably, DEIMoS operates on N-dimensional data, largely agnostic to acquisition instrumentation; algorithm implementations simultaneously utilize all dimensions to (i) offer greater separation between features, thus improving detection sensitivity, (ii) increase alignment/feature matching confidence among data sets, and (iii) mitigate convolution artifacts in tandem mass spectra. We demonstrate DEIMoS with LC-IMS-MS/MS metabolomics data to illustrate the advantages of a multidimensional approach in each data processing step.
Subject(s)
Metabolomics , Tandem Mass Spectrometry , Algorithms , Chromatography, Liquid/methods , Metabolomics/methods , Software , Tandem Mass Spectrometry/methods
ABSTRACT
The prediction of structure-dependent molecular properties, such as collision cross sections as measured using ion mobility spectrometry, is crucially dependent on the selection of the correct population of molecular conformers. Here, we report an in-depth evaluation of multiple conformation selection techniques, including simple averaging, Boltzmann weighting, lowest energy selection, low energy threshold reductions, and similarity reduction. Generating 50,000 conformers each for 18 molecules, we used the In Silico Chemical Library Engine (ISiCLE) to calculate the collision cross sections for the entire data set. First, we employed Monte Carlo simulations to understand the variability between conformer structures as generated using simulated annealing. Then we applied Monte Carlo simulations to the aforementioned conformer selection techniques as applied to the simulated molecular property: the ion mobility collision cross section. Based on our analyses, we found Boltzmann weighting to be a good trade-off between precision and theoretical accuracy. Combining multiple techniques revealed that energy thresholds and root-mean-squared deviation-based similarity reductions can save considerable computational expense while maintaining property prediction accuracy. Molecular dynamics conformer generation tools like AMBER can continue to generate new lowest energy conformers even after tens of thousands of generations, decreasing precision between runs. This reduced precision can be ameliorated and theoretical accuracy increased by running density functional theory geometry optimization on carefully selected conformers.
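The selection schemes compared in this abstract can be illustrated with a minimal Python sketch. It assumes one relative energy (in kcal/mol) and one calculated CCS value per conformer; the function names, the 5 kcal/mol window and the example numbers are hypothetical and are not taken from ISiCLE.

```python
import math

# Gas constant in kcal/(mol*K); conformer energies are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def simple_average(ccs):
    """Unweighted mean CCS over all conformers."""
    return sum(ccs) / len(ccs)


def lowest_energy(energies, ccs):
    """CCS of the single lowest-energy conformer."""
    return ccs[energies.index(min(energies))]


def boltzmann_weighted(energies, ccs, temperature=298.15):
    """Boltzmann-weighted mean CCS; energies are shifted by the minimum for stability."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies]
    return sum(w * c for w, c in zip(weights, ccs)) / sum(weights)


def energy_threshold(energies, ccs, window=5.0):
    """Keep only conformers within `window` kcal/mol of the lowest energy."""
    e_min = min(energies)
    kept = [(e, c) for e, c in zip(energies, ccs) if e - e_min <= window]
    return [e for e, _ in kept], [c for _, c in kept]


# Illustrative (made-up) conformer energies and CCS values in square angstroms.
energies = [0.0, 0.4, 1.2, 6.0]
ccs = [150.2, 152.8, 149.5, 160.1]
print(simple_average(ccs))
print(lowest_energy(energies, ccs))
print(boltzmann_weighted(energies, ccs))
print(boltzmann_weighted(*energy_threshold(energies, ccs)))
```

Applying the threshold before weighting mirrors the idea of combining techniques: high-energy conformers carry negligible Boltzmann weight, so dropping them early changes the result very little.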
Subject(s)
Ion Mobility Spectrometry , Molecular Dynamics Simulation , Molecular Conformation
ABSTRACT
A growing number of software tools have been developed for metabolomics data processing and analysis. Many new tools are contributed by metabolomics practitioners who have limited prior experience with software development, and the tools are subsequently implemented by users with expertise that ranges from basic point-and-click data analysis to advanced coding. This Perspective is intended to introduce metabolomics software users and developers to important considerations that determine the overall impact of a publicly available tool within the scientific community. The recommendations reflect the collective experience of an NIH-sponsored Metabolomics Consortium working group that was formed with the goal of researching guidelines and best practices for metabolomics tool development. The recommendations are aimed at metabolomics researchers with little formal background in programming and are organized into three stages: (i) preparation, (ii) tool development, and (iii) distribution and maintenance.
Subject(s)
Cloud Computing , Metabolomics/methods , Software
ABSTRACT
Non-targeted analysis (NTA) encompasses a rapidly evolving set of mass spectrometry techniques aimed at characterizing the chemical composition of complex samples, identifying unknown compounds, and/or classifying samples, without prior knowledge regarding the chemical content of the samples. Recent advances in NTA are the result of improved and more accessible instrumentation for data generation and analysis tools for data evaluation and interpretation. As researchers continue to develop NTA approaches in various scientific fields, there is a growing need to identify, disseminate, and adopt community-wide method reporting guidelines. In 2018, NTA researchers formed the Benchmarking and Publications for Non-Targeted Analysis Working Group (BP4NTA) to address this need. Consisting of participants from around the world and representing fields ranging from environmental science and food chemistry to 'omics and toxicology, BP4NTA provides resources addressing a variety of challenges associated with NTA. Thus far, BP4NTA group members have aimed to establish a consensus on NTA-related terms and concepts and to create consistency in reporting practices by providing resources on a public Web site, including consensus definitions, reference content, and lists of available tools. Moving forward, BP4NTA will provide a setting for NTA researchers to continue discussing emerging challenges and contribute to additional harmonization efforts.
Subject(s)
Benchmarking , Humans
ABSTRACT
The α2a adrenoceptor is a medically relevant subtype of the G protein-coupled receptor family. Unfortunately, high-throughput techniques aimed at producing novel drug leads for this receptor have been largely unsuccessful because of the complex pharmacology of adrenergic receptors. As such, cutting-edge in silico ligand- and structure-based assessment and de novo deep learning methods are well positioned to provide new insights into protein-ligand interactions and potential active compounds. In this work, we (i) collect a dataset of α2a adrenoceptor agonists and provide it as a resource for the drug design community; (ii) use the dataset as a basis to generate candidate-active structures via deep learning; and (iii) apply computational ligand- and structure-based analysis techniques to gain new insights into α2a adrenoceptor agonists and assess the quality of the computer-generated compounds. We further describe how such assessment techniques can be applied to putative chemical probes with a case study involving proposed medetomidine-based probes.
Subject(s)
Deep Learning , Receptors, Adrenergic, alpha-2 , Ligands , Medetomidine
ABSTRACT
We describe the Mass Spectrometry Adduct Calculator (MSAC), an automated Python tool to calculate the adduct ion masses of a parent molecule. Here, adduct refers to a version of a parent molecule [M] that is charged due to addition or loss of atoms and electrons resulting in a charged ion, for example, [M + H]+. MSAC includes a database of 147 potential adducts and adduct/neutral loss combinations and their mass-to-charge ratios (m/z) as extracted from the NIST/EPA/NIH Mass Spectral Library (NIST17), Global Natural Products Social Molecular Networking Public Spectral Libraries (GNPS), and MassBank of North America (MoNA). The calculator relies on user-selected subsets of the combined database to calculate expected m/z for adducts of molecules supplied as formulas. This tool is intended to help researchers create identification libraries to collect evidence for the presence of molecules in mass spectrometry data. While the included adduct database focuses on adducts typically detected during liquid chromatography-mass spectrometry analyses, users may supply their own lists of adducts and charge states for calculating expected m/z. We also analyzed statistics on adducts from spectra contained in the three selected mass spectral libraries. MSAC is freely available at https://github.com/pnnl/MSAC.
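The adduct arithmetic described in this abstract can be sketched in a few lines of Python. This is an illustration only: the small adduct table, the element list and the function names are assumptions made for the example and are not MSAC's database or API.

```python
import re

# Monoisotopic masses (Da) for a few elements, plus the electron mass.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "Na": 22.9897692809,
}
ELECTRON = 0.00054857990946

# Hypothetical adduct table: (atoms added, atoms removed, signed charge).
ADDUCTS = {
    "[M+H]+": ({"H": 1}, {}, 1),
    "[M+Na]+": ({"Na": 1}, {}, 1),
    "[M-H]-": ({}, {"H": 1}, -1),
    "[M+2H]2+": ({"H": 2}, {}, 2),
}


def parse_formula(formula):
    """Turn a simple formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in MONOISOTOPIC:
            raise ValueError(f"no mass stored for element {element!r}")
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mass_of(counts):
    """Monoisotopic mass of an element-count dictionary."""
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items())


def adduct_mz(formula, adduct):
    """Expected m/z of `adduct` for the neutral molecule given by `formula`."""
    added, removed, charge = ADDUCTS[adduct]
    neutral = mass_of(parse_formula(formula))
    # A positive charge means electrons were lost; a negative charge means gained.
    ion_mass = neutral + mass_of(added) - mass_of(removed) - charge * ELECTRON
    return ion_mass / abs(charge)


for name in ADDUCTS:
    print(name, round(adduct_mz("C6H12O6", name), 4))
```

Keeping the adduct definitions in a plain table makes it straightforward to restrict a calculation to a user-selected subset, which is the workflow the abstract describes.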
Subject(s)
Mass Spectrometry , Chromatography, Liquid/methods
ABSTRACT
Uncompetitive antagonists of the N-methyl-D-aspartate receptor (NMDAR) have demonstrated therapeutic benefit in the treatment of neurological diseases such as Parkinson's and Alzheimer's, but some also cause dissociative effects that have led to the synthesis of illicit drugs. The ability to generate NMDAR antagonists in silico is therefore desirable for both new medication development and preempting and identifying new designer drugs. Recently, generative deep learning models have been applied to de novo drug design as a means to expand the amount of chemical space that can be explored for potential drug-like compounds. In this study, we assess the application of a generative model to the NMDAR to achieve two primary objectives: (i) the creation and release of a comprehensive library of experimentally validated NMDAR phencyclidine (PCP) site antagonists to assist the drug discovery community and (ii) an analysis of both the advantages conferred by applying such generative artificial intelligence models to drug design and the current limitations of the approach. We apply, and provide source code for, a variety of ligand- and structure-based assessment techniques used in standard drug discovery analyses to the deep learning-generated compounds. We present twelve candidate antagonists that are not available in existing chemical databases to provide an example of what this type of workflow can achieve, though synthesis and experimental validation of these compounds are still required.
Subject(s)
Deep Learning , Receptors, N-Methyl-D-Aspartate/antagonists & inhibitors , Small Molecule Libraries/chemistry , Animals , Binding Sites , Drug Design , Ligands , Mice , Molecular Structure , Receptors, N-Methyl-D-Aspartate/chemistry , Xenopus laevis
ABSTRACT
Comprehensive and unambiguous identification of small molecules in complex samples will revolutionize our understanding of the role of metabolites in biological systems. Existing and emerging technologies have enabled measurement of chemical properties of molecules in complex mixtures and, in concert, are sensitive enough to resolve even stereoisomers. Despite these experimental advances, small molecule identification is inhibited by (i) chemical reference libraries (e.g., mass spectra, collision cross section, and other measurable property libraries) representing <1% of known molecules, limiting the number of possible identifications, and (ii) the lack of a method to generate candidate matches directly from experimental features (i.e., without a library). To this end, we developed a variational autoencoder (VAE) to learn a continuous numerical, or latent, representation of molecular structure to expand reference libraries for small molecule identification. We extended the VAE to include a chemical property decoder, trained as a multitask network, in order to shape the latent representation such that it assembles according to desired chemical properties. The approach is unique in its application to metabolomics and small molecule identification, with its focus on properties that can be obtained from experimental measurements (m/z, CCS) paired with its training paradigm, which involved a cascade of transfer learning iterations. First, molecular representation is learned from a large data set of structures with m/z labels. Next, in silico property values are used to continue training, as experimental property data is limited. Finally, the network is further refined by being trained with the experimental data. This allows the network to learn as much as possible at each stage, enabling success with progressively smaller data sets without overfitting. 
Once trained, the network can be used to predict chemical properties directly from structure, as well as generate candidate structures with desired chemical properties. Our approach is orders of magnitude faster than first-principles simulation for CCS property prediction. Additionally, the ability to generate novel molecules along manifolds, defined by chemical property analogues, positions DarkChem as highly useful in a number of application areas, including metabolomics and small molecule identification, drug discovery and design, chemical forensics, and beyond.
Subject(s)
Computer Simulation , Deep Learning , Small Molecule Libraries/analysis , Metabolomics , Molecular Structure , Small Molecule Libraries/metabolism
ABSTRACT
Thousands of chemical properties can be calculated for small molecules, which can be used to place the molecules within the context of a broader "chemical space." These definitions vary based on compounds of interest and the goals for the given chemical space definition. Here, we introduce a customizable Python module, chespa, built to easily assess different chemical space definitions through clustering of compounds in these spaces and visualizing trends of these clusters. To demonstrate this, chespa currently streamlines prediction of various molecular descriptors (predicted chemical properties, molecular substructures, AI-based chemical space, and chemical class ontology) in order to test six different chemical space definitions. Furthermore, we investigated how these varying definitions trend with mass spectrometry (MS)-based observability, that is, the ability of a molecule to be observed with MS (e.g., as a function of the molecule ionizability), using an example data set from the U.S. EPA's nontargeted analysis collaborative trial, where blinded samples had been analyzed previously, providing 1398 data points. Improved understanding of observability would offer many advantages in small-molecule identification, such as (i) a priori selection of experimental conditions based on suspected sample composition, (ii) the ability to reduce the number of candidate structures during compound identification by removing those less likely to ionize, and, in turn, (iii) a reduced false discovery rate and increased confidence in identifications. Factors controlling observability are not fully understood, making prediction of this property nontrivial and a prime candidate for chemical space analysis. Chespa is available at github.com/pnnl/chespa.
Subject(s)
Mass Spectrometry
ABSTRACT
High-throughput, comprehensive, and confident identifications of metabolites and other chemicals in biological and environmental samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent technological advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics, the reliance on reference libraries constructed by analysis of authentic reference materials with limited commercial availability. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers (i.e., conformational isomers) using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs first-principles simulation, distinguished by the use of molecular dynamics, quantum chemistry, and ion mobility calculations, to generate structures and chemical property libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over 2 orders of magnitude. Calculated CCS values were validated against 1983 experimentally measured CCS values and compared to previously reported CCS calculation approaches. Average calculated CCS error for the validation set is 3.2% using standard parameters, outperforming other density functional theory (DFT)-based methods and machine learning methods (e.g., MetCCS). 
An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described, including providing evidence for the presence of an environmental degradation product, the separation of molecular isomers, and an initial characterization of complex blinded mixtures of exposure chemicals. This work represents a method to address the limitations of small molecule identification and offers an alternative to generating chemical identification libraries experimentally by analyzing authentic reference materials. All code is available at github.com/pnnl.
Subject(s)
Cheminformatics/methods , Density Functional Theory , Small Molecule Libraries/chemistry , Machine Learning , Models, Chemical , Molecular Dynamics Simulation
ABSTRACT
We report on separations of ion isotopologues and isotopomers using ultrahigh-resolution traveling wave-based Structures for Lossless Ion Manipulations with serpentine ultralong path and extended routing ion mobility spectrometry coupled to mass spectrometry (SLIM SUPER IMS-MS). Mobility separations of ions from the naturally occurring ion isotopic envelopes (e.g., [M], [M+1], [M+2], ... ions) showed the first and second isotopic peaks (i.e., [M+1] and [M+2]) for various tetraalkylammonium ions could be resolved from their respective monoisotopic ion peak ([M]) after SLIM SUPER IMS with resolving powers of ~400-600. Similar separations were obtained for other compounds (e.g., tetrapeptide ions). Greater separation was obtained using argon versus helium drift gas, as expected from the greater reduced mass contribution to ion mobility described by the Mason-Schamp relationship. To more directly explore the role of isotopic substitutions, we studied a mixture of specific isotopically substituted (15N, 13C, and 2H) protonated arginine isotopologues. While the separations in nitrogen were primarily due to their reduced mass differences, similar to the naturally occurring isotopologues, their separations in helium, where higher resolving powers could also be achieved, revealed distinct additional relative mobility shifts. These shifts appeared correlated, after correction for the reduced mass contribution, with changes in the ion center of mass due to the different locations of heavy atom substitutions. The origin of these apparent mass distribution-induced mobility shifts was then further explored using a mixture of Iodoacetyl Tandem Mass Tag (iodoTMT) isotopomers (i.e., each having the same exact mass, but with different isotopic substitution sites). Again, the observed mobility shifts appeared correlated with changes in the ion center of mass leading to multiple monoisotopic mobilities being observed for some isotopomers (up to a ~0.04% difference in mobility). 
These mobility shifts thus appear to reflect details of the ion structure, derived from the changes due to ion rotation impacting collision frequency or momentum transfer, and highlight the potential for new approaches for ion structural characterization.
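The reduced-mass argument in this abstract can be checked numerically. In the Mason-Schamp relationship the mobility K scales as the inverse square root of the reduced mass of the ion and drift gas, so the sketch below estimates the fractional mobility shift between an ion and its [M+1] isotopologue in different gases. It assumes that collision cross section, charge and temperature are identical for both ions (the abstract reports additional shifts beyond this approximation); the ion mass of 242.28 Da and the function names are illustrative choices, not values from the study.

```python
import math

# Average molecular masses of common drift gases (Da).
DRIFT_GAS_MASS = {"He": 4.002602, "N2": 28.0134, "Ar": 39.948}


def reduced_mass(ion_mass, gas_mass):
    """Reduced mass of the ion and neutral collision pair (Da)."""
    return ion_mass * gas_mass / (ion_mass + gas_mass)


def relative_mobility_shift(light_mass, heavy_mass, gas_mass):
    """Fractional decrease in mobility of the heavier ion relative to the lighter.

    Uses K proportional to reduced_mass ** -0.5 with all other
    Mason-Schamp terms held equal for the two ions.
    """
    mu_light = reduced_mass(light_mass, gas_mass)
    mu_heavy = reduced_mass(heavy_mass, gas_mass)
    return 1.0 - math.sqrt(mu_light / mu_heavy)


# Illustrative ion of mass 242.28 Da and its [M+1] isotopologue.
ion_mass = 242.28
for gas, gas_mass in DRIFT_GAS_MASS.items():
    shift = relative_mobility_shift(ion_mass, ion_mass + 1.0, gas_mass)
    print(f"{gas}: {100 * shift:.4f} % lower mobility for [M+1]")
```

Because a heavier drift gas makes the reduced mass more sensitive to the ion mass, the predicted shift grows from helium to nitrogen to argon, consistent with the trend the abstract describes.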
Subject(s)
Deuterium/chemistry , Carbon Isotopes/chemistry , Ion Mobility Spectrometry , Ions/chemistry , Ions/isolation & purification , Mass Spectrometry , Nitrogen Isotopes/chemistry
ABSTRACT
Mass spectrometry-based omics technologies - namely proteomics, metabolomics and lipidomics - have enabled the molecular level systems biology investigation of organisms in unprecedented detail. There has been increasing interest in gaining a thorough, functional understanding of the biological consequences associated with cellular heterogeneity in a wide variety of research areas such as developmental biology, precision medicine, cancer research and microbiome science. Recent advances in mass spectrometry (MS) instrumentation and sample handling strategies are quickly making comprehensive omics analyses of single cells feasible, but key breakthroughs are still required to push through remaining bottlenecks. In this review, we discuss the challenges faced by single cell MS-based omics analyses and highlight recent technological advances that collectively can contribute to comprehensive and high throughput omics analyses in single cells. We provide a vision of the potential of integrating pioneering technologies such as Structures for Lossless Ion Manipulations (SLIM) for improved sensitivity and resolution, novel peptide identification tactics and standards-free metabolomics approaches for future applications in single cell analysis.
Subject(s)
Genomics/methods , Mass Spectrometry/methods , Metabolomics/methods , Proteomics/methods , Single-Cell Analysis/methods , Humans , Precision Medicine , Systems Biology
ABSTRACT
The current gold standard for unambiguous molecular identification in metabolomics analysis is comparing two or more orthogonal properties from the analysis of authentic reference materials (standards) to experimental data acquired in the same laboratory with the same analytical methods. This represents a significant limitation for comprehensive chemical identification of small molecules in complex samples. The process is time consuming and costly, and the majority of molecules are not yet represented by standards. Thus, there is a need to assemble evidence for the presence of small molecules in complex samples through the use of libraries containing calculated chemical properties. To address this need, we developed a Multi-Attribute Matching Engine (MAME) and a library derived in part from our in silico chemical library engine (ISiCLE). Here, we describe an initial evaluation of these methods in a blinded analysis of synthetic chemical mixtures as part of the U.S. Environmental Protection Agency's (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT, Phase 1). For molecules in all mixtures, the initial blinded false negative rate (FNR), false discovery rate (FDR), and accuracy were 57%, 77%, and 91%, respectively. For high evidence scores, the FDR was 35%. After unblinding of the sample compositions, we optimized the scoring parameters to better exploit the available evidence and increased the accuracy for molecules suspected as present. The final FNR, FDR, and accuracy were 67%, 53%, and 96%, respectively. For high evidence scores, the FDR was 10%. This study demonstrates that multiattribute matching methods in conjunction with in silico libraries may one day enable reduced reliance on experimentally derived libraries for building evidence for the presence of molecules in complex samples.
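The three performance figures quoted in this abstract follow from a standard confusion matrix. The Python sketch below shows the usual definitions; the function name and the counts are hypothetical and are not the study's data.

```python
def identification_metrics(tp, fp, fn, tn):
    """Return false negative rate, false discovery rate and accuracy.

    tp: molecules present and reported    fp: molecules absent but reported
    fn: molecules present but missed      tn: molecules absent and not reported
    """
    total = tp + fp + fn + tn
    return {
        "FNR": fn / (fn + tp),       # share of truly present molecules that were missed
        "FDR": fp / (fp + tp),       # share of reported molecules that were wrong
        "accuracy": (tp + tn) / total,
    }


# Made-up counts for a library much larger than any single mixture.
metrics = identification_metrics(tp=30, fp=70, fn=40, tn=860)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The example also shows why accuracy can be high while FNR and FDR remain large: when most library entries are correctly reported as absent, true negatives dominate the accuracy term.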
Subject(s)
Computational Biology/methods , Computer Simulation , Small Molecule Libraries/chemistry , Algorithms , Small Molecule Libraries/metabolism
ABSTRACT
A series of Wrightia hanleyi extracts was screened for activity against Mycobacterium tuberculosis H37Rv. One active fraction contained a compound that initially appeared to be either the isoflavonoid wrightiadione or the alkaloid tryptanthrin, both of which have been previously reported in other Wrightia species. Characterization by NMR and MS, as well as evaluation of the literature describing these compounds, led to the conclusion that wrightiadione (1) was misidentified in the first report of its isolation from W. tomentosa in 1992 and again in 2015 when reported in W. pubescens and W. religiosa. Instead, the molecule described in these reports and in the present work is almost certainly the isobaric (same nominal mass) and isosteric (same number of atoms, valency, and shape) tryptanthrin (2), a well-known quinazolinone alkaloid found in a variety of plants including Wrightia species. Tryptanthrin (2) is also accessible synthetically via several routes and has been thoroughly characterized. Wrightiadione (1) has been synthesized and characterized and may have useful biological activity; however, this compound can no longer be said to be known to exist in Nature. To our knowledge, this misidentification of wrightiadione (1) has heretofore been unrecognized.
Subject(s)
Antitubercular Agents/isolation & purification , Apocynaceae/chemistry , Quinazolines/isolation & purification , Antitubercular Agents/chemistry , Antitubercular Agents/pharmacology , Carbon-13 Magnetic Resonance Spectroscopy , Isoflavones , Mass Spectrometry , Microbial Sensitivity Tests , Molecular Structure , Mycobacterium tuberculosis/drug effects , Proton Magnetic Resonance Spectroscopy , Quinazolines/chemistry , Quinazolines/pharmacology
ABSTRACT
BACKGROUND: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications. RESULTS: We evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions. CONCLUSION: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method's ability to link genes in the same functional pathway.
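Bootstrap aggregation of an inferred network can be sketched as follows. This minimal Python example uses thresholded Pearson correlation as a stand-in inference method and reports, for each gene pair, the fraction of bootstrap resamples in which the edge appears. The function names, the 0.8 threshold and the synthetic data are assumptions for illustration; the study evaluated commonly applied inference methods that are not reproduced here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def bagged_edge_frequency(expr, n_boot=200, threshold=0.8, seed=0):
    """Fraction of bootstrap resamples in which |r| >= threshold for each gene pair.

    expr maps gene name -> list of expression values, one per sample.
    """
    rng = random.Random(seed)
    genes = sorted(expr)
    n_samples = len(expr[genes[0]])
    counts = {(g, h): 0 for i, g in enumerate(genes) for h in genes[i + 1:]}
    for _ in range(n_boot):
        # Resample the samples (not the genes) with replacement.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        resampled = {g: [expr[g][i] for i in idx] for g in genes}
        for g, h in counts:
            if abs(pearson(resampled[g], resampled[h])) >= threshold:
                counts[(g, h)] += 1
    return {edge: c / n_boot for edge, c in counts.items()}


# Synthetic data: genes "a" and "b" are co-expressed, "c" is independent noise.
data_rng = random.Random(1)
a = [data_rng.gauss(0, 1) for _ in range(40)]
b = [v + 0.05 * data_rng.gauss(0, 1) for v in a]
c = [data_rng.gauss(0, 1) for _ in range(40)]
print(bagged_edge_frequency({"a": a, "b": b, "c": c}))
```

Edges that persist across most resamples are the "persistent associations" the abstract refers to, while edges driven by a few outlying samples appear in only a small fraction of resamples.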
Subject(s)
Gene Expression/genetics , Gene Regulatory Networks/genetics , Algorithms , Humans
ABSTRACT
Salmonella enterica elicits intestinal inflammation to gain access to nutrients. One of these nutrients is fructose-asparagine (F-Asn). The availability of F-Asn to Salmonella during infection is dependent upon Salmonella pathogenicity islands 1 and 2, which in turn are required to provoke inflammation. Here, we determined that F-Asn is present in mouse chow at approximately 400 pmol/mg (dry weight). F-Asn is also present in the intestinal tract of germfree mice at 2,700 pmol/mg (dry weight) and in the intestinal tract of conventional mice at 9 to 28 pmol/mg. These findings suggest that the mouse intestinal microbiota consumes F-Asn. We utilized heavy-labeled precursors of F-Asn to monitor its formation in the intestine, in the presence or absence of inflammation, and none was observed. Finally, we determined that some members of the class Clostridia encode F-Asn utilization pathways and that they are eliminated from highly inflamed Salmonella-infected mice. Collectively, our studies identify the source of F-Asn as the diet and that Salmonella-mediated inflammation is required to eliminate competitors and allow the pathogen nearly exclusive access to this nutrient.
Subject(s)
Asparagine/metabolism , Fructose/metabolism , Gastrointestinal Microbiome/immunology , Inflammation/metabolism , Salmonella Infections, Animal/immunology , Salmonella Infections, Animal/metabolism , Salmonella enterica/immunology , Salmonella enterica/metabolism , Animals , Inflammation/immunology , Inflammation/pathology , Salmonella Infections, Animal/pathology , Salmonella enterica/pathogenicity
ABSTRACT
MOTIVATION: Drift tube ion mobility spectrometry coupled with mass spectrometry (DTIMS-MS) is increasingly implemented in high throughput omics workflows, and new informatics approaches are necessary for processing the associated data. To automatically extract arrival times for molecules measured by DTIMS at multiple electric fields and compute their associated collisional cross sections (CCS), we created the PNNL Ion Mobility Cross Section Extractor (PIXiE). The primary application presented for this algorithm is the extraction of data that can then be used to create a reference library of experimental CCS values for use in high throughput omics analyses. RESULTS: We demonstrate the utility of this approach by automatically extracting arrival times and calculating the associated CCSs for a set of endogenous metabolites and xenobiotics. The PIXiE-generated CCS values were within error of those calculated using commercially available instrument vendor software. AVAILABILITY AND IMPLEMENTATION: PIXiE is an open-source tool, freely available on GitHub. The documentation, source code of the software, and a GUI can be found at https://github.com/PNNL-Comp-Mass-Spec/PIXiE and the source code of the backend workflow library used by PIXiE can be found at https://github.com/PNNL-Comp-Mass-Spec/IMS-Informed-Library. CONTACT: erin.baker@pnnl.gov or thomas.metz@pnnl.gov. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
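The multi-field CCS calculation can be sketched in Python using the standard stepped-field approach: arrival time is regressed against inverse drift voltage, the slope gives the mobility, and the Mason-Schamp equation converts reduced mobility to CCS. This is a minimal illustration and not PIXiE's implementation; the function names, the constant pressure across fields and the instrument parameters in the example are all assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
DALTON = 1.66053906660e-27   # atomic mass unit, kg
N0 = 2.686780111e25          # gas number density at 273.15 K and 760 Torr, m^-3


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def stepped_field_ccs(voltages, arrival_times, length, pressure_torr,
                      temperature, ion_mass, gas_mass, charge=1):
    """Return (reduced mobility in m^2/(V*s), CCS in square angstroms, t0 in s).

    Model: arrival_time = length**2 / (K * voltage) + t0, where t0 is the
    time spent outside the drift region. SI units except where noted.
    """
    slope, t0 = linear_fit([1.0 / v for v in voltages], arrival_times)
    mobility = length ** 2 / slope
    # Normalize mobility to standard pressure and temperature.
    k0 = mobility * (pressure_torr / 760.0) * (273.15 / temperature)
    mu = ion_mass * gas_mass / (ion_mass + gas_mass) * DALTON
    ccs_m2 = (3.0 * charge * E_CHARGE / (16.0 * N0)
              * math.sqrt(2.0 * math.pi / (mu * K_B * temperature)) / k0)
    return k0, ccs_m2 * 1e20, t0


# Synthetic arrival times for a hypothetical 0.9 m tube at 4 Torr and 300 K.
length, pressure, temperature = 0.9, 4.0, 300.0
true_k = 1.5e-4 * (760.0 / pressure) * (temperature / 273.15)
voltages = [1000.0, 1200.0, 1400.0, 1600.0]
times = [length ** 2 / (true_k * v) + 0.002 for v in voltages]
print(stepped_field_ccs(voltages, times, length, pressure, temperature,
                        ion_mass=200.0, gas_mass=28.0134))
```

Fitting across several fields separates the true drift time from the field-independent offset t0, which is the reason for measuring at multiple electric fields.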
Subject(s)
Computational Biology/methods , Mass Spectrometry/methods , Software , Algorithms
ABSTRACT
Biofilms alter their metabolism in response to environmental stress. This study explores the effect of a hyperosmotic agent-antibiotic treatment on the metabolism of Staphylococcus aureus biofilms through the use of nuclear magnetic resonance (NMR) techniques. To determine the metabolic activity of S. aureus, we quantified the concentrations of metabolites in spent medium using high-resolution NMR spectroscopy. Biofilm porosity, thickness, biovolume, and relative diffusion coefficient depth profiles were obtained using NMR microimaging. Dissolved oxygen concentration was measured to determine the availability of oxygen within the biofilm. Under vancomycin-only treatment, the biofilm communities switched to fermentation under anaerobic conditions, as evidenced by high concentrations of formate (7.4 ± 2.7 mM), acetate (13.1 ± 0.9 mM), and lactate (3.0 ± 0.8 mM), and there was no detectable dissolved oxygen in the biofilm. In addition, we observed the highest consumption of pyruvate (0.19 mM remaining from an initial 40 mM concentration), the sole carbon source, under the vancomycin-only treatment. On the other hand, relative effective diffusion coefficients increased from 0.73 ± 0.08 to 0.88 ± 0.08 under vancomycin-only treatment but decreased from 0.71 ± 0.04 to 0.60 ± 0.07 under maltodextrin-only and from 0.73 ± 0.06 to 0.56 ± 0.08 under combined treatments. There was an increase in biovolume, from 2.5 ± 1 mm³ to 7 ± 1 mm³, under the vancomycin-only treatment, while the maltodextrin-only and combined treatments showed no significant change in biovolume over time. This indicated that physical biofilm growth was halted during maltodextrin-only and combined treatments.