Results 1 - 20 of 70
1.
Nat Methods ; 18(12): 1524-1531, 2021 12.
Article in English | MEDLINE | ID: mdl-34857935

ABSTRACT

Compound identification in small-molecule research, such as untargeted metabolomics or exposome research, relies on matching tandem mass spectrometry (MS/MS) spectra against experimental or in silico mass spectral libraries. Most software programs use dot product similarity scores. Here we introduce the concept of MS/MS spectral entropy to improve scoring results in MS/MS similarity searches via library matching. Entropy similarity outperformed 42 alternative similarity algorithms, including dot product similarity, when searching 434,287 spectra against the high-quality NIST20 library. Entropy similarity scores proved to be highly robust even when we added different levels of noise ions. When we applied entropy levels to 37,299 experimental spectra of natural products, false discovery rates of less than 10% were observed at entropy similarity score 0.75. Experimental human gut metabolome data were used to confirm that entropy similarity largely improved the accuracy of MS-based annotations in small-molecule research to false discovery rates below 10%, annotated new compounds and provided the basis to automatically flag poor-quality, noisy spectra.


Subject(s)
Computational Biology/methods , Intestines/metabolism , Metabolomics/methods , Tandem Mass Spectrometry/methods , Algorithms , Chromatography, Liquid/methods , Computer Simulation , Entropy , False Positive Reactions , Humans , Metabolome , ROC Curve , Reproducibility of Results , Software
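
The abstract above does not spell out how entropy similarity is computed. What follows is a minimal Python sketch of one formulation, assuming spectra are represented as dictionaries that map an already-binned m/z value to an intensity, and assuming the unweighted score 1 - (2*S_AB - S_A - S_B) / ln 4, where S is the Shannon entropy of the normalized intensities and AB is the 1:1 merged spectrum. The function names, the exact-key peak matching, and the toy spectra are illustrative assumptions, not the authors' implementation. A plain cosine (dot product) score is included for comparison.

```python
import math


def normalize(spectrum):
    """Scale intensities so they sum to 1 (assumes a non-empty spectrum)."""
    total = sum(spectrum.values())
    return {mz: intensity / total for mz, intensity in spectrum.items()}


def shannon_entropy(spectrum):
    """Shannon entropy (natural log) of a normalized spectrum."""
    return -sum(p * math.log(p) for p in spectrum.values() if p > 0)


def entropy_similarity(a, b):
    """Unweighted entropy similarity: 1 for identical spectra, 0 for disjoint ones."""
    a, b = normalize(a), normalize(b)
    # Merge the two normalized spectra 1:1; peaks match only on identical keys.
    merged = {mz: 0.5 * (a.get(mz, 0.0) + b.get(mz, 0.0)) for mz in set(a) | set(b)}
    s_ab = shannon_entropy(merged)
    return 1.0 - (2 * s_ab - shannon_entropy(a) - shannon_entropy(b)) / math.log(4)


def cosine_similarity(a, b):
    """Plain dot product (cosine) score on raw intensities, for comparison."""
    dot = sum(intensity * b.get(mz, 0.0) for mz, intensity in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


# Toy spectra (hypothetical values): binned m/z -> intensity
query = {100: 50.0, 150: 30.0, 200: 20.0}
library_hit = {100: 45.0, 150: 35.0, 200: 20.0}
unrelated = {300: 60.0, 350: 40.0}

print(entropy_similarity(query, library_hit))
print(entropy_similarity(query, unrelated))
print(cosine_similarity(query, library_hit))
```

In practice, peaks would be matched within an m/z tolerance rather than by identical keys, and noise filtering or intensity weighting may be applied first; those steps are omitted here.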
2.
Chem Rev ; 121(10): 5633-5670, 2021 05 26.
Article in English | MEDLINE | ID: mdl-33979149

ABSTRACT

A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.


Subject(s)
Metabolomics , Quantum Theory
3.
Anal Chem ; 94(6): 2732-2739, 2022 02 15.
Article in English | MEDLINE | ID: mdl-35119811

ABSTRACT

Acyl-coenzyme A derivatives (acyl-CoAs) are core molecules in the fatty acid and energy metabolism across all species. However, in vivo, many other carboxylic acids can form xenobiotic acyl-CoA esters, including drugs. More than 2467 acyl-CoAs are known from the published literature. In addition, more than 300 acyl-CoAs are covered in pathway databases, but as of October 2020, only 53 experimental acyl-CoA tandem mass spectra are present in NIST20 and MoNA libraries to enable annotation of the mass spectra in untargeted metabolomics studies. The experimental spectra originated from low-resolution ion trap and triple quadrupole mass spectrometers as well as high-resolution quadrupole-time of flight and orbital ion trap instruments at various collision energies. We used MassFrontier software and the literature to annotate fragment ions to generate fragmentation rules and intensities for the different instruments and collision energies. These rules were then applied to 1562 unique species based on [M+H]+ and [M-H]- precursor ions to generate two mass spectra per instrument platform and collision energy, amassing an in silico library of 10,934 accurate mass MS/MS spectra that are freely available at github.com/urikeshet/CoA-Blast. The spectra can be imported into a commercial or freely available mass spectral search tool. We used the libraries to annotate 23 acyl-CoA esters in mouse liver, including 8 novel species.


Subject(s)
Acyl Coenzyme A , Tandem Mass Spectrometry , Acyl Coenzyme A/metabolism , Animals , Liver/metabolism , Metabolomics , Mice , Software
4.
Anal Chem ; 94(3): 1559-1566, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35006668

ABSTRACT

Chemical derivatization, especially silylation, is widely used in gas chromatography coupled to mass spectrometry (GC-MS). By introducing the trimethylsilyl (TMS) group to substitute active hydrogens in the molecule, thermostable volatile compounds are created that can be easily analyzed. While large GC-MS libraries are available, the number of spectra for TMS-derivatized compounds is comparatively small. In addition, many metabolites cannot be purchased to produce authentic library spectra. Therefore, computationally generated in silico mass spectral databases need to take TMS derivatizations into account for metabolomics. The quantum chemistry method QCEIMS is an automatic method to generate electron ionization (EI) mass spectra directly from compound structures. To evaluate the performance of the QCEIMS method for TMS-derivatized compounds, we chose 816 trimethylsilyl derivatives of organic acids, alcohols, amides, amines, and thiols to compare in silico-generated spectra against the experimental EI mass spectra from the NIST17 library. Overall, in silico spectra showed a weighted dot score similarity (1000 is maximum) of 635 compared to the NIST17 experimental spectra. Aromatic compounds yielded a better prediction accuracy with an average similarity score of 808, while oxygen-containing molecules showed lower accuracy with only an average score of 609. Such similarity scores are useful for annotation of small molecules in untargeted GC-MS-based metabolomics, suggesting that QCEIMS methods can be extended to compounds that are not present in experimental databases. Despite this overall success, 37% of all experimentally observed ions were not found in QCEIMS predictions. We investigated QCEIMS trajectories in detail and found missed fragmentations in specific rearrangement reactions. Such findings open the way forward for future improvements to the QCEIMS software.


Subject(s)
Electrons , Metabolomics , Gas Chromatography-Mass Spectrometry/methods , Mass Spectrometry , Metabolomics/methods , Software
5.
J Chem Inf Model ; 62(17): 4049-4056, 2022 09 12.
Article in English | MEDLINE | ID: mdl-36043939

ABSTRACT

Competitive Fragmentation Modeling for Metabolite Identification (CFM-ID) is a machine learning tool to predict in silico tandem mass spectra (MS/MS) for known or suspected metabolites for which chemical reference standards are not available. As a machine learning tool, it relies on both an underlying statistical model and an explicit training set that encompasses experimental mass spectra for specific compounds. Such mass spectra depend on specific parameters such as collision energies, instrument types, and adducts which are accumulated in libraries. Yet, ultimately prediction tools that are meant to cover wide expanses of entities must be validated on cases that were not included in the initial training and testing sets. Hence, we here benchmarked the performance of CFM-ID 4.0 to correctly predict MS/MS spectra for spectra that were not included in the CFM-ID training set and for different mass spectrometry conditions. We used 609,456 experimental tandem spectra from the NIST20 mass spectral library that were newly added to the previous NIST17 library version. We found that CFM-ID's highest energy prediction output would maximize the capacity for library generation. Matching the experimental collision energy with CFM-ID's prediction energy produced the best results, even for HCD-Orbitrap instruments. For benzenoids, better MS/MS predictions were achieved than for heterocyclic compounds. However, when exploring CFM-ID's performance on 8,305 compounds at 40 eV HCD-Orbitrap collision energy, >90% of the 20/80 split test compounds showed <700 MS/MS similarity score. Instead of a stand-alone tool, CFM-ID 4.0 might be useful to boost candidate structures in the greater context of identification workflows.


Subject(s)
Benchmarking , Tandem Mass Spectrometry , Gene Library , Models, Statistical , Tandem Mass Spectrometry/methods
6.
J Chem Inf Model ; 62(18): 4403-4410, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36107950

ABSTRACT

Here, we provide an algorithm that introduces excited states into the molecular dynamics prediction of the 70 eV electron ionization mass spectra. To decide the contributions of different electronic states, the ionization cross section associated with relevant molecular orbitals was calculated by the binary-encounter-Bethe (BEB) model. We used a fast orthogonalization model/single and double state configuration interaction (OM2/CISD) method to implement excited-state calculations and combined this with the GFN1-xTB semiempirical model. Demonstrated by predicting the mass spectrum of urocanic acid, we showed better accuracies to experimental spectra using excited-state molecular dynamics than calculations that only used the ground-state occupation. For several histidine pathway intermediates, we found that excited-state corrections yielded an average of 73% more true positive ions compared to the OM2 method when matching to experimental spectra and 16% more true positive ions compared to the GFN method. Importantly, the excited-state models also correctly predict several fragmentation reactions that were missing from both ground-state methods. Overall, for 48 calculated molecules, we found the best average mass spectral similarity scores for the mixed excited-state method compared to the ground-state methods using either cosine, weighted dot score, or entropy similarity calculations. Therefore, we recommend adding excited-state calculations for predicting the electron ionization mass spectra of small molecules in metabolomics.


Subject(s)
Electrons , Urocanic Acid , Histidine , Ions , Molecular Dynamics Simulation , Quantum Theory
7.
Nat Methods ; 15(1): 53-56, 2018 01.
Article in English | MEDLINE | ID: mdl-29176591

ABSTRACT

Novel metabolites distinct from canonical pathways can be identified through the integration of three cheminformatics tools: BinVestigate, which queries the BinBase gas chromatography-mass spectrometry (GC-MS) metabolome database to match unknowns with biological metadata across over 110,000 samples; MS-DIAL 2.0, a software tool for chromatographic deconvolution of high-resolution GC-MS or liquid chromatography-mass spectrometry (LC-MS); and MS-FINDER 2.0, a structure-elucidation program that uses a combination of 14 metabolome databases in addition to an enzyme promiscuity library. We showcase our workflow by annotating N-methyl-uridine monophosphate (UMP), lysomonogalactosyl-monopalmitin, N-methylalanine, and two propofol derivatives.


Subject(s)
Blood Proteins/metabolism , Computational Biology/methods , Databases, Factual , Gas Chromatography-Mass Spectrometry/methods , Metabolome , Metabolomics/methods , Software , Bacteria/metabolism , Chromatography, Liquid , Feces/chemistry , Humans
8.
Lipids Health Dis ; 20(1): 30, 2021 Apr 03.
Article in English | MEDLINE | ID: mdl-33812378

ABSTRACT

BACKGROUND: Developing an understanding of the biochemistry of aging in both sexes is critical for managing disease throughout the lifespan. Lipidomic associations with age and sex have been reported, but prior studies are limited by measurements in serum rather than plasma or by participants taking lipid-lowering medications. METHODS: Our study included lipidomic data from 980 participants aged 18-87 years old from the Genetics of Lipid-Lowering Drugs and Diet Network (GOLDN). Participants were off lipid-lowering medications for at least 4 weeks, and signal intensities of 413 known lipid species were measured in plasma. We examined linear age and sex associations with signal intensity of (a) 413 lipid species; (b) 6 lipid classes (glycerolipids, glycerophospholipids, sphingolipids, sterol lipids, fatty acids, and acylcarnitines); and (c) 15 lipid subclasses; as well as with the particle sizes of three lipoproteins. RESULTS: Significant age associations were identified in 4 classes, 11 subclasses, 147 species, and particle size of one lipoprotein while significant sex differences were identified in 5 classes, 12 subclasses, 248 species, and particle sizes of two lipoproteins. For many lipid species (n = 97), age-related associations were significantly different between males and females. Age*sex interaction effects were most prevalent among phosphatidylcholines, sphingomyelins, and triglycerides. CONCLUSION: We identified several lipid species, subclasses, and classes that differ by age and sex; these lipid phenotypes may serve as useful biomarkers for lipid changes and associated cardiovascular risk with aging in the future. Future studies of age-related changes throughout the adult lifespan of both sexes are warranted. TRIAL REGISTRATION: ClinicalTrials.gov NCT00083369; May 21, 2004.


Subject(s)
Lipidomics , Lipids/blood , Sex Characteristics , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Female , Humans , Lipids/classification , Lipoproteins/chemistry , Male , Middle Aged , Particle Size , Young Adult
9.
Anal Chem ; 92(11): 7515-7522, 2020 06 02.
Article in English | MEDLINE | ID: mdl-32390414

ABSTRACT

Unidentified peaks remain a major problem in untargeted metabolomics by LC-MS/MS. Confidence in peak annotations increases by combining MS/MS matching and retention time. We here show how retention times can be predicted from molecular structures. Two large, publicly available data sets were used for model training in machine learning: the Fiehn hydrophilic interaction liquid chromatography data set (HILIC) of 981 primary metabolites and biogenic amines, and the RIKEN plant specialized metabolome annotation (PlaSMA) database of 852 secondary metabolites that uses reversed-phase liquid chromatography (RPLC). Five different machine learning algorithms have been integrated into the Retip R package: the random forest, Bayesian-regularized neural network, XGBoost, light gradient-boosting machine (LightGBM), and Keras algorithms for building the retention time prediction models. A complete workflow for retention time prediction was developed in R. It can be freely downloaded from the GitHub repository (https://www.retip.app). Keras outperformed other machine learning algorithms in the test set with minimum overfitting, verified by small error differences between training, test, and validation sets. Keras yielded a mean absolute error of 0.78 min for HILIC and 0.57 min for RPLC. Retip is integrated into the mass spectrometry software tools MS-DIAL and MS-FINDER, allowing a complete compound annotation workflow. In a test application on mouse blood plasma samples, we found a 68% reduction in the number of candidate structures when searching all isomers in MS-FINDER compound identification software. Retention time prediction increases the identification rate in liquid chromatography and subsequently leads to an improved biological interpretation of metabolomics data.


Subject(s)
Machine Learning , Metabolomics , Organic Chemicals/blood , Chromatography, Liquid , Humans , Tandem Mass Spectrometry , Time Factors
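
To illustrate how a predicted retention time can narrow a candidate list, as described in the abstract above, here is a minimal Python sketch. It assumes each candidate structure already carries a predicted retention time in minutes; the candidate names, the field name `predicted_rt`, and the tolerance window are hypothetical and do not reflect the Retip or MS-FINDER interfaces. The mean absolute error helper shows how a figure such as the reported 0.78 min would typically be computed.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between paired retention times (minutes)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def filter_by_retention_time(candidates, observed_rt, tolerance_min):
    """Keep candidates whose predicted RT lies within the tolerance window."""
    return [
        c for c in candidates
        if abs(c["predicted_rt"] - observed_rt) <= tolerance_min
    ]


# Hypothetical isomer candidates for one LC-MS feature observed at 2.5 min
candidates = [
    {"name": "isomer_A", "predicted_rt": 2.1},
    {"name": "isomer_B", "predicted_rt": 5.0},
    {"name": "isomer_C", "predicted_rt": 2.9},
    {"name": "isomer_D", "predicted_rt": 7.4},
]

kept = filter_by_retention_time(candidates, observed_rt=2.5, tolerance_min=0.8)
print([c["name"] for c in kept])
print(mean_absolute_error([1.0, 2.0, 4.0], [1.5, 2.0, 3.0]))
```

A tolerance tied to the model's own error (for example, a small multiple of its mean absolute error) is one plausible choice; the abstract does not state which window was used.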
10.
Anal Chem ; 92(8): 5960-5968, 2020 04 21.
Article in English | MEDLINE | ID: mdl-32202765

ABSTRACT

Fatty acid esters of hydroxy fatty acids (FAHFAs) are a family of recently discovered lipids with important physiological functions in mammals and plants. However, low detection sensitivity in negative ionization mode mass spectrometry makes low-abundance FAHFA challenging to analyze. A 2-dimethylaminoethylamine (DMED) based chemical derivatization strategy was recently reported to improve the MS sensitivity of FAHFAs by labeling FAHFAs with a positively ionizable tertiary amine group. To facilitate reliable, high-throughput, and automatic annotation of these compounds, a DMED-FAHFA in silico library containing 4290 high-resolution tandem mass spectra covering 264 different FAHFA classes was developed. The construction of the library was based on the heuristic information from MS/MS fragmentation patterns of DMED-FAHFA authentic standards, and then, the patterns were applied to computer-generated DMED-FAHFAs. The developed DMED-FAHFA in silico library was demonstrated to be compatible with library search software NIST MS Search and the LC-MS/MS data processing tool MS-DIAL to guarantee high-throughput and automatic annotations. Applying the in silico library in Arabidopsis thaliana samples for profiling FAHFAs by high-resolution LC-MS/MS enabled the annotation of 19 DMED-FAHFAs from 16 families, including 3 novel compounds. Using the in silico library largely decreased the false-positive annotation rate in comparison to low-resolution LC-MS/MS. The developed library, MS/MS spectra, and development templates are freely available for commercial and noncommercial use at https://zenodo.org/record/3606905.


Subject(s)
Esters/analysis , Ethylamines/chemistry , Fatty Acids/analysis , Molecular Structure , Tandem Mass Spectrometry
11.
Lipids Health Dis ; 19(1): 153, 2020 Jun 25.
Article in English | MEDLINE | ID: mdl-32586392

ABSTRACT

BACKGROUND: The lipoprotein insulin resistance (LPIR) score was shown to predict insulin resistance (IR) and type 2 diabetes (T2D) in healthy adults. However, the molecular basis underlying the LPIR utility for classification remains unclear. OBJECTIVE: To identify small molecule lipids associated with variation in the LPIR score, a weighted index of lipoproteins measured by nuclear magnetic resonance, in the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study (n = 980). METHODS: Linear mixed effects models were used to test the association between the LPIR score and 413 lipid species and their principal component analysis-derived groups. Significant associations were tested for replication with homeostatic model assessment-IR (HOMA-IR), a phenotype correlated with the LPIR score (r = 0.48, p < 0.001), in the Heredity and Phenotype Intervention (HAPI) Heart Study (n = 590). RESULTS: In GOLDN, 319 lipids were associated with the LPIR score (false discovery rate-adjusted p-values ranging from 4.59 × 10^-161 to 49.50 × 10^-3). Factors 1 (triglycerides and diglycerides/storage lipids) and 3 (mixed lipids) were positively (β = 0.025, p = 4.52 × 10^-71 and β = 0.021, p = 5.84 × 10^-41, respectively) and factor 2 (phospholipids/non-storage lipids) was inversely (β = -0.013, p = 2.28 × 10^-18) associated with the LPIR score. These findings were replicated for HOMA-IR in the HAPI Heart Study (β = 0.10, p = 1.21 × 10^-02 for storage, β = -0.13, p = 3.14 × 10^-04 for non-storage, and β = 0.19, p = 8.40 × 10^-07 for mixed lipids). CONCLUSIONS: Non-storage lipidomics species show a significant inverse association with the LPIR metabolic dysfunction score and present a promising focus for future therapeutic and prevention studies.


Subject(s)
Insulin Resistance/physiology , Lipids/blood , Adult , Aged , Body Mass Index , Diabetes Mellitus, Type 2/blood , Female , Humans , Lipidomics , Lipoproteins/blood , Male , Middle Aged , Triglycerides/blood , Waist Circumference
12.
Anal Chem ; 91(5): 3590-3596, 2019 03 05.
Article in English | MEDLINE | ID: mdl-30758187

ABSTRACT

Large-scale untargeted lipidomics experiments involve the measurement of hundreds to thousands of samples. Such data sets are usually acquired on one instrument over days or weeks of analysis time. Such extensive data acquisition processes introduce a variety of systematic errors, including batch differences, longitudinal drifts, or even instrument-to-instrument variation. Technical data variance can obscure the true biological signal and hinder biological discoveries. To combat this issue, we present a novel normalization approach based on using quality control pool samples (QC). This method is called systematic error removal using random forest (SERRF) for eliminating the unwanted systematic variations in large sample sets. We compared SERRF with 15 other commonly used normalization methods using six lipidomics data sets from three large cohort studies (832, 1162, and 2696 samples). SERRF reduced the average technical errors for these data sets to 5% relative standard deviation. We conclude that SERRF outperforms other existing methods and can significantly reduce the unwanted systematic variation, revealing biological variance of interest.


Subject(s)
Datasets as Topic/standards , Lipidomics/standards , Quality Control , Scientific Experimental Error/statistics & numerical data
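
The abstract above reports technical error as relative standard deviation (RSD). As a small illustration of that metric only, not of the SERRF random forest procedure itself, the Python sketch below computes the RSD of one lipid's intensity across repeated quality control pool injections. The intensity values and function name are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumed convention.

```python
import statistics


def relative_standard_deviation(values):
    """RSD in percent: sample standard deviation divided by the mean, times 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100.0


# Hypothetical signal intensities of one lipid in repeated QC pool injections
qc_before = [90.0, 100.0, 110.0]   # with batch drift
qc_after = [98.0, 100.0, 102.0]    # after some normalization step

print(relative_standard_deviation(qc_before))
print(relative_standard_deviation(qc_after))
```

Comparing the RSD of QC samples before and after normalization is one common way to judge how much technical variation a method removes.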
13.
Anal Chem ; 91(3): 2155-2162, 2019 02 05.
Article in English | MEDLINE | ID: mdl-30608141

ABSTRACT

Urine metabolites are used in many clinical and biomedical studies but usually only for a few classic compounds. Metabolomics detects vastly more metabolic signals that may be used to precisely define the health status of individuals. However, many compounds remain unidentified, hampering biochemical conclusions. Here, we annotate all metabolites detected by two untargeted metabolomic assays, hydrophilic interaction chromatography (HILIC)-Q Exactive HF mass spectrometry and charged surface hybrid (CSH)-Q Exactive HF mass spectrometry. Over 9,000 unique metabolite signals were detected, of which 42% triggered MS/MS fragmentations in data-dependent mode. On the highest Metabolomics Standards Initiative (MSI) confidence level 1, we identified 175 compounds using authentic standards with precursor mass, retention time, and MS/MS matching. An additional 578 compounds were annotated by precursor accurate mass and MS/MS matching alone, MSI level 2, including a novel library specifically geared at acylcarnitines (CarniBlast). The rest of the metabolome is usually left unannotated. To fill this gap, we used the in silico fragmentation tool CSI:FingerID and the new NIST hybrid search to annotate all further compounds (MSI level 3). Testing the top-ranked metabolites in CSI:FingerID annotations yielded 40% accuracy when applied to the MSI level 1 identified compounds. We classified all MSI level 3 annotations by the NIST hybrid search using the ClassyFire ontology into 21 superclasses that were further distinguished into 184 chemical classes. ClassyFire annotations showed that the previously unannotated urine metabolome consists of 28% derivatives of organic acids, 16% heterocyclics, and 16% lipids as major classes.


Subject(s)
Carnitine/metabolism , Metabolomics , Carnitine/analogs & derivatives , Carnitine/urine , Chromatography, High Pressure Liquid , Humans , Hydrophobic and Hydrophilic Interactions , Mass Spectrometry , Phenotype
14.
Mass Spectrom Rev ; 37(4): 513-532, 2018 07.
Article in English | MEDLINE | ID: mdl-28436590

ABSTRACT

Tandem mass spectral library search (MS/MS) is the fastest way to correctly annotate MS/MS spectra from screening small molecules in fields such as environmental analysis, drug screening, lipid analysis, and metabolomics. The confidence in MS/MS-based annotation of chemical structures is impacted by instrumental settings and requirements, data acquisition modes including data-dependent and data-independent methods, library scoring algorithms, as well as post-curation steps. We critically discuss parameters that influence search results, such as mass accuracy, precursor ion isolation width, intensity thresholds, centroiding algorithms, and acquisition speed. A range of publicly and commercially available MS/MS databases such as NIST, MassBank, MoNA, LipidBlast, Wiley MSforID, and METLIN are surveyed. In addition, software tools including NIST MS Search, MS-DIAL, Mass Frontier, SmileMS, Mass++, and XCMS2 to perform fast MS/MS search are discussed. MS/MS scoring algorithms and challenges during compound annotation are reviewed. Advanced methods such as the in silico generation of tandem mass spectra using quantum chemistry and machine learning methods are covered. Community efforts for curation and sharing of tandem mass spectra that will allow for faster distribution of scientific discoveries are discussed.


Subject(s)
Machine Learning , Small Molecule Libraries/isolation & purification , Software , Tandem Mass Spectrometry/statistics & numerical data , Computer Simulation , Databases, Chemical , Humans , Information Dissemination , Models, Chemical , Quantum Theory , Tandem Mass Spectrometry/instrumentation , Tandem Mass Spectrometry/methods
15.
Anal Chem ; 90(18): 10758-10764, 2018 09 18.
Article in English | MEDLINE | ID: mdl-30096227

ABSTRACT

Unknown metabolites represent a bottleneck in untargeted metabolomics research. Ion mobility-mass spectrometry (IM-MS) facilitates lipid identification because it yields collision cross section (CCS) information that is independent from mass or lipophilicity. To date, only a few CCS values are publicly available for complex lipids such as phosphatidylcholines, sphingomyelins, or triacylglycerides. This scarcity of data limits the use of CCS values as an identification parameter that is orthogonal to mass, MS/MS, or retention time. A combination of lipid descriptors was used to train five different machine learning algorithms for automatic lipid annotations, combining accurate mass (m/z), retention time (RT), CCS values, carbon number, and unsaturation level. Using a training data set of 429 true positive lipid annotations from four lipid classes, 92.7% correct annotations overall were achieved using internal cross-validation. The trained prediction model was applied to an unknown milk lipidomics data set and allowed for class 3 level annotations of most features detected in this application set according to Metabolomics Standards Initiative (MSI) reporting guidelines.


Subject(s)
Chromatography, Liquid/methods , Ion Mobility Spectrometry/methods , Lipids/chemistry , Algorithms , Animals , Cattle , Databases, Factual , Metabolomics , Milk/chemistry , Reproducibility of Results
16.
Nat Methods ; 12(6): 523-6, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25938372

ABSTRACT

Data-independent acquisition (DIA) in liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) provides comprehensive untargeted acquisition of molecular data. We provide an open-source software pipeline, which we call MS-DIAL, for DIA-based identification and quantification of small molecules by mass spectral deconvolution. For a reversed-phase LC-MS/MS analysis of nine algal strains, MS-DIAL using an enriched LipidBlast library identified 1,023 lipid compounds, highlighting the chemotaxonomic relationships between the algal strains.


Subject(s)
Chlorophyta/metabolism , Chromatography, Liquid/methods , Metabolome , Software , Tandem Mass Spectrometry/methods , Chlorophyta/genetics , Gene Expression Regulation, Plant , Lipid Metabolism/genetics , Lipid Metabolism/physiology , Lipids/chemistry , Species Specificity
17.
Anal Chem ; 89(19): 10171-10180, 2017 10 03.
Article in English | MEDLINE | ID: mdl-28876899

ABSTRACT

Mass spectrometry-based untargeted metabolomics often detects statistically significant metabolites that cannot be readily identified. Without defined chemical structure, interpretation of the biochemical relevance is not feasible. Epimetabolites are produced from canonical metabolites by defined enzymatic reactions and may represent a large fraction of the structurally unidentified metabolome. We here present a systematic workflow for annotating unknown epimetabolites using high resolution gas chromatography-accurate mass spectrometry with multiple ionization techniques and stable isotope labeled derivatization methods. We first determine elemental formulas, which are then used to query the "metabolic in-silico expansion" database (MINE DB) to obtain possible molecular structures that are predicted by enzyme promiscuity from canonical pathways. Accurate mass fragmentation rules are combined with in silico spectra prediction programs CFM-ID and MS-FINDER to derive the best candidates. We validated the workflow by correctly identifying 10 methylated nucleosides and 6 methylated amino acids. We then employed this strategy to annotate eight unknown compounds from cancer studies and other biological systems.


Subject(s)
Databases, Factual , Gas Chromatography-Mass Spectrometry/methods , Metabolome , Gas Chromatography-Mass Spectrometry/standards , Metabolomics , Molecular Weight , Reference Standards
18.
Anal Chem ; 88(16): 7946-58, 2016 08 16.
Article in English | MEDLINE | ID: mdl-27419259

ABSTRACT

Compound identification from accurate mass MS/MS spectra is a bottleneck for untargeted metabolomics. In this study, we propose nine rules of hydrogen rearrangement (HR) during bond cleavages in low-energy collision-induced dissociation (CID). These rules are based on the classic even-electron rule and cover heteroatoms and multistage fragmentation. We evaluated our HR rules by the statistics of MassBank MS/MS spectra in addition to enthalpy calculations, yielding three levels of computational MS/MS annotation: "resolved" (regular HR behavior following HR rules), "semiresolved" (irregular HR behavior), and "formula-assigned" (lacking structure assignment). With this nomenclature, 78.4% of a total of 18506 MS/MS fragment ions in the MassBank database and 84.8% of a total of 36370 MS/MS fragment ions in the GNPS database were (semi-) resolved by predicted bond cleavages. We also introduce the MS-FINDER software for structure elucidation. Molecular formulas of precursor ions are determined from accurate mass, isotope ratio, and product ion information. All isomer structures of the predicted formula are retrieved from metabolome databases, and MS/MS fragmentations are predicted in silico. The structures are ranked by a combined weighting score considering bond dissociation energies, mass accuracies, fragment linkages, and, most importantly, nine HR rules. The program was validated by its ability to correctly calculate molecular formulas with 98.0% accuracy for 5063 MassBank MS/MS records and to yield the correct structural isomer with 82.1% accuracy within the top-3 candidates. In a test with 936 manually identified spectra from an untargeted HILIC-QTOF MS data set of human plasma, formulas were correctly predicted in 90.4% of the cases, and the correct isomer structure was retrieved at 80.4% probability within the top-3 candidates, including for compounds that were absent in mass spectral libraries. 
The MS-FINDER software is freely available at http://prime.psc.riken.jp/.


Subject(s)
Hydrogen/chemistry , Software , Cohort Studies , Glutamic Acid/analogs & derivatives , Glutamic Acid/chemistry , Glutathione/analogs & derivatives , Glutathione/chemistry , Humans , Lysine/analogs & derivatives , Lysine/chemistry , Molecular Structure , Phenylurea Compounds/chemistry , Phosphorylcholine/chemistry , Tandem Mass Spectrometry
19.
Nat Methods ; 10(8): 755-8, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23817071

ABSTRACT

Current tandem mass spectral libraries for lipid annotations in metabolomics are limited in size and diversity. We provide a freely available computer-generated tandem mass spectral library of 212,516 spectra covering 119,200 compounds from 26 lipid compound classes, including phospholipids, glycerolipids, bacterial lipoglycans and plant glycolipids. We show platform independence by using tandem mass spectra from 40 different mass spectrometer types including low-resolution and high-resolution instruments.


Subject(s)
Databases, Factual , Lipids/analysis , Tandem Mass Spectrometry/methods , Metabolomics/methods
20.
Chem Res Toxicol ; 29(11): 1818-1827, 2016 11 21.
Article in English | MEDLINE | ID: mdl-27788581

ABSTRACT

Human exposure to environmental tobacco smoke (ETS) is associated with an increased incidence of pulmonary and cardiovascular disease and possibly lung cancer. Metabolomics can reveal changes in metabolic networks in organisms under different physio-pathological conditions. Our objective was to identify spatial and temporal metabolic alterations with acute and repeated subchronic ETS exposure to understand mechanisms by which ETS exposure may cause adverse physiological and structural changes in the pulmonary and cardiovascular systems. Established and validated metabolomics assays of the lungs, hearts, and blood of young adult male rats following 1, 3, 8, and 21 days of exposure to ETS along with day-matched sham control rats (n = 8) were performed using gas chromatography time-of-flight mass spectrometry, BinBase database processing, multivariate statistical modeling, and MetaMapp biochemical mapping. A total of 489 metabolites were measured in the lung, heart, and blood, of which 142 metabolites were identified using a standardized metabolite annotation pipeline. Acute and repeated subchronic exposure to ETS was associated with significant metabolic changes in the lung related to energy metabolism, defense against reactive oxygen species, substrate uptake and transport, nucleotide metabolism, and substrates for structural components of collagen and membrane lipids. Metabolic changes were least prevalent in heart tissues but abundant in blood under repeated subchronic ETS exposure. Our analyses revealed that ETS causes alterations in metabolic networks, especially those associated with lung structure and function and found as systemic signals in the blood. The metabolic changes suggest that ETS exposure may adversely affect the mitochondrial respiratory chain, lung elasticity, membrane integrity, redox states, cell cycle, and normal metabolic and physiological functions of the lungs, even after subchronic ETS exposure.


Subject(s)
Metabolic Networks and Pathways , Tobacco Smoke Pollution/adverse effects , Animals , Cardiovascular System/metabolism , Lung/metabolism , Male , Metabolomics , Rats , Rats, Sprague-Dawley