ABSTRACT
Compound identification is at the center of metabolomics and is usually performed by comparing experimental mass spectra against library spectra. However, most compounds are not commercially available to generate library spectra. Hence, for such compounds, MS/MS spectra need to be predicted. Machine learning and heuristic models have largely failed except for lipids. Here, quantum chemistry software can be used to predict mass spectra. However, quantum chemistry predictions for collision-induced dissociation (CID) mass spectra in LC-MS/MS are rare. We present the CIDMD (Collision-Induced Dissociation via Molecular Dynamics) framework to model CID-based MS/MS spectra. It uses first-principles molecular dynamics (MD) to simulate the physical process of molecular collisions in CID tandem mass spectrometry. First, molecular ions are constructed at specific protonation sites. Using density functional theory, these protonated ions are targeted by argon collider gas atoms at user-specified velocities. Subsequent bond breakages are simulated over time for at least 1,000 fs. Each simulation is repeated multiple times from various collisional directions. Fragmentations are accumulated over those repeated collisions to generate CIDMD in silico mass spectra. Twelve small metabolites (<205 Da) were selected to test the accuracy of this framework in comparison to experimental MS/MS spectra. Testing different protomers, collider velocities, numbers of simulations, simulation times, and impact factor b cutoffs yielded 261 predicted mass spectra. Compared with their corresponding experimental spectra, these in silico spectra gave an average entropy similarity score of 624 ± 189 across all 261 spectra, which improved to 828 ± 77 when using optimal parameters for the most probable protomers of the 12 molecules. With increasing molecular mass, higher velocities achieved better results.
Similarly, different protomers showed large differences in fragmentation; hence, with increasing numbers of protomers and tautomers, the average CIDMD prediction accuracy decreased. Mechanistic details showed that specific fragment ions can be produced from different protomers via multiple fragmentation pathways. We propose that CIDMD is a suitable tool to predict mass spectra of small metabolites such as those produced by the gut microbiome.
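The entropy similarity scores quoted in this abstract (for example 624 ± 189) can be illustrated with a minimal sketch. It assumes the unweighted spectral entropy similarity, 1 - (2*S_AB - S_A - S_B)/ln 4, rescaled to a 0 to 1000 range, with peaks matched by simple rounding of m/z. The binning rule, the rescaling, and all function names are assumptions for illustration and are not taken from the CIDMD work itself.

```python
import math
from collections import defaultdict


def normalize(peaks, decimals=2):
    """Bin (m/z, intensity) peaks by rounded m/z and scale intensities to sum to 1."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        if intensity > 0:
            binned[round(mz, decimals)] += intensity
    total = sum(binned.values())
    if total == 0:
        return {}
    return {mz: value / total for mz, value in binned.items()}


def shannon_entropy(dist):
    """Shannon entropy (natural log) of a normalized intensity distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def entropy_similarity(spec_a, spec_b, decimals=2):
    """Unweighted entropy similarity on an assumed 0-1000 scale."""
    a = normalize(spec_a, decimals)
    b = normalize(spec_b, decimals)
    if not a or not b:
        return 0.0
    # Merge the two normalized spectra 1:1 into a virtual mixed spectrum.
    merged = defaultdict(float)
    for dist in (a, b):
        for mz, p in dist.items():
            merged[mz] += 0.5 * p
    divergence = 2 * shannon_entropy(merged) - shannon_entropy(a) - shannon_entropy(b)
    return 1000.0 * (1.0 - divergence / math.log(4))


# Hypothetical peak lists, not data from the study.
predicted = [(100.0, 10.0), (150.0, 5.0)]
experimental = [(100.0, 10.0), (151.0, 5.0)]
print(round(entropy_similarity(predicted, experimental), 1))
```

Identical spectra score 1000 and spectra with no shared peaks score 0 under this definition; published implementations typically add intensity weighting and tolerance-based peak matching, which this sketch omits.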
Subjects
Molecular Dynamics Simulation; Tandem Mass Spectrometry
ABSTRACT
Protonation is the most frequent adduct found in positive electrospray ionization collision-induced dissociation tandem mass spectra (CID-MS/MS). In a parallel report (Lee, J. J. Chem. Inf. Model. 2024, 10.1021/acs.jcim.4c00760), we developed a quantum chemistry framework to predict mass spectra by collision-induced dissociation molecular dynamics (CIDMD). As different protonation sites affect the fragmentation pathways of a given molecule, the accuracy of predicting tandem mass spectra by CIDMD ultimately depends on the choice of its protomers. To investigate the impact of molecular protonation sites on MS/MS spectra, we compared CIDMD-predicted spectra to all available experimental MS/MS spectra by similarity matching. We probed 10 molecules with a total of 43 protomers, the largest study to date, including organic acids (sorbic acid, citramalic acid, itaconic acid, mesaconic acid, citraconic acid, and taurine) as well as aromatic amines including uracil, aniline, bufotenine, and psilocin. We demonstrated how different protomers can converge on the same fragment ions through different fragmentation pathways but may also explain the presence of different fragment ions in experimental MS/MS spectra. For the first time, we used in silico MS/MS predictions to test the impact of solvents on proton affinities, comparing the gas phase and a mixture of acetonitrile/water (1:1). We also extended applications of in silico MS/MS predictions to investigate the impact of protonation sites on the energy barriers of isomerization between protomers via proton transfer. Contrary to our initial hypothesis that the thermodynamically most stable protomer should give the best match to the experiment, we found only weak inverse relationships between the calculated proton affinities and the corresponding entropy similarities of experimental and CIDMD-predicted MS/MS spectra. CIDMD-predicted mechanistic details of fragmentation reaction pathways revealed a clear preference for specific protomer forms for several molecules.
Overall, however, proton affinity was not a good predictor of how well the CIDMD-predicted spectra matched the experimental spectra. For example, for uracil, only one protomer predicted all experimental MS/MS fragment ions, but this protomer had neither the highest proton affinity nor the best MS/MS match score. Instead of proton affinity, the transfer of protons during the electrospray process from the initial protonation site (i.e., the mobile proton model) better explains the differences between the thermodynamic rationale and experimental data. Protomers that undergo fragmentation with lower energy barriers have greater contributions to experimental MS/MS spectra than their thermodynamic Boltzmann populations would suggest. Hence, in silico predictions still need to calculate MS/MS spectra for multiple protomers, as the extent of distributions cannot be readily predicted.
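The "thermodynamic Boltzmann populations" mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes relative protomer energies in kJ/mol and a temperature of 298.15 K; the function name and the example energies are hypothetical and are not values from the study.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def boltzmann_populations(rel_energies_kj_mol, temperature_k=298.15):
    """Return equilibrium fractions p_i = exp(-dE_i/RT) / sum_j exp(-dE_j/RT)."""
    e_min = min(rel_energies_kj_mol)  # shift so the lowest energy is zero
    rt = R_KJ_PER_MOL_K * temperature_k
    weights = [math.exp(-(e - e_min) / rt) for e in rel_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies for three protomers of one molecule (kJ/mol).
for energy, fraction in zip([0.0, 5.0, 20.0], boltzmann_populations([0.0, 5.0, 20.0])):
    print(f"{energy:5.1f} kJ/mol -> {fraction:.4f}")
```

Under this purely thermodynamic picture a protomer only a few kJ/mol above the minimum is already a minor species, which is why the abstract's observation that low-barrier protomers contribute more than these populations suggest is notable.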
Subjects
Molecular Dynamics Simulation; Protons; Quantum Theory; Tandem Mass Spectrometry; Models, Chemical
ABSTRACT
Mass spectrometry is the most commonly used method for compound annotation in metabolomics. However, most mass spectra in untargeted assays cannot be annotated with specific compound structures because reference mass spectral libraries are far smaller than the complement of known molecules. Theoretically predicted mass spectra might be used as a substitute for experimental spectra especially for compounds that are not commercially available. For example, the Quantum Chemistry Electron Ionization Mass Spectra (QCEIMS) method can predict 70 eV electron ionization mass spectra from any given input molecular structure. In this work, we investigated the accuracy of QCEIMS predictions of electron ionization (EI) mass spectra for 80 purine and pyrimidine derivatives in comparison to experimental data in the NIST 17 database. Similarity scores between every pair of predicted and experimental spectra revealed that 45% of the compounds were found as the correct top hit when QCEIMS predicted spectra were matched against the NIST17 library of >267,000 EI spectra, and 74% of the compounds were found within the top 10 hits. We then investigated the impact of matching, missing, and additional fragment ions in predicted EI mass spectra versus ion abundances in MS similarity scores. We further include detailed studies of fragmentation pathways such as retro Diels-Alder reactions to predict neutral losses of (iso)cyanic acid, hydrogen cyanide, or cyanamide in the mass spectra of purines and pyrimidines. We describe how trends in prediction accuracy correlate with the chemistry of the input compounds to better understand how mechanisms of QCEIMS predictions could be improved in future developments. We conclude that QCEIMS is useful for generating large-scale predicted mass spectral libraries for identification of compounds that are absent from experimental libraries and that are not commercially available.
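The top-hit and top-10 percentages reported in this abstract are rank statistics over library searches. A minimal sketch of how such rates could be computed is shown here, assuming each query yields a similarity score per library entry; the identifiers, scores, and function names are hypothetical and do not reproduce the study's matching procedure.

```python
def rank_of_correct(scores, correct_id):
    """1-based rank of correct_id when entries are sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(correct_id) + 1


def top_k_rate(ranks, k):
    """Fraction of queries whose correct compound ranks within the top k hits."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical library search results for three predicted spectra.
searches = [
    ({"adenine": 812.0, "guanine": 640.0, "uracil": 120.0}, "adenine"),
    ({"adenine": 300.0, "guanine": 710.0, "uracil": 705.0}, "uracil"),
    ({"adenine": 500.0, "guanine": 450.0, "uracil": 430.0}, "uracil"),
]
ranks = [rank_of_correct(scores, correct) for scores, correct in searches]
print(ranks, top_k_rate(ranks, 1), top_k_rate(ranks, 2))
```

Ties are broken by dictionary insertion order in this sketch; a real evaluation would need an explicit tie-handling rule.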
ABSTRACT
A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
Subjects
Metabolomics; Quantum Theory
ABSTRACT
The active form of vitamin B6, pyridoxal 5'-phosphate (PLP), plays an essential role in the catalytic mechanism of various proteins, including human glutamate-oxaloacetate transaminase (hGOT1), an important enzyme in amino acid metabolism. A recent molecular and genetic study showed that the E266K, R267H, and P300L substitutions in aspartate aminotransferase, the Arabidopsis analog of hGOT1, genetically suppress a developmentally arrested Arabidopsis RUS mutant. Furthermore, CD analyses suggested that the variants exist as apo proteins and implicated a possible role of PLP in the regulation of PLP homeostasis and metabolic pathways. In this work, we assessed the stability of PLP bound to hGOT1 for the three variant and wildtype (WT) proteins using a combined 6 µs of molecular dynamics (MD) simulation. For the variants and WT in the holo form, the MD simulations reproduced the "closed-open" transition needed for substrate binding. This conformational transition was associated with the rearrangement of the P15-R32 small domain loop providing substrate access to the R387/R293 binding motif. We also showed that formation of the dimer interface is essential for PLP affinity to the active site. The position of PLP in the WT binding site was stabilized by a unique hydrogen bond network of the phosphate binding cup, which placed the cofactor for formation of the covalent Schiff base linkage with K259 for catalysis. The amino acid substitutions at positions 266, 267, and 300 reduced the structural correlation between PLP and the protein active site and/or integrity of the dimer interface. Principal component analysis and energy decomposition clearly suggested dimer misalignment and dissociation for the three variants tested in our work. The low affinity of PLP in the hGOT1 variants observed in our computational work provided structural rationale for the possible role of vitamin B6 in regulating metabolic pathways.
Subjects
Aspartate Aminotransferase, Cytoplasmic/genetics; Aspartate Aminotransferase, Cytoplasmic/physiology; Pyridoxal Phosphate/metabolism; Amino Acid Substitution/genetics; Aspartate Aminotransferase, Cytoplasmic/ultrastructure; Aspartate Aminotransferases/metabolism; Binding Sites/genetics; Catalysis; Catalytic Domain; Computer Simulation; Dimerization; Glutamates/genetics; Glutamates/physiology; Humans; Models, Molecular; Molecular Dynamics Simulation; Oxaloacetates/metabolism; Principal Component Analysis; Protein Domains/genetics; Pyridoxal Phosphate/chemistry; Pyridoxal Phosphate/physiology; Vitamin B 6/metabolism
ABSTRACT
L-Rhamnose is a ubiquitous bacterial cell-wall component. The biosynthetic pathway for its precursor dTDP-L-rhamnose is not present in humans, which makes the enzymes of the pathway potential drug targets. In this study, the three-dimensional structure of the first protein of this pathway, glucose-1-phosphate thymidylyltransferase (RfbA), from Bacillus anthracis was determined. In other organisms this enzyme is referred to as RmlA. RfbA was co-crystallized with the products of the enzymatic reaction, dTDP-α-D-glucose and pyrophosphate, and its structure was determined at 2.3 Å resolution. This is the first reported thymidylyltransferase structure from a Gram-positive bacterium. RfbA shares overall structural characteristics with known RmlA homologs. However, RfbA exhibits a shorter sequence at its C-terminus, which results in the absence of three α-helices involved in allosteric site formation. Consequently, RfbA was observed to exhibit a quaternary structure that is unique among currently reported glucose-1-phosphate thymidylyltransferase bacterial homologs. These structural analyses suggest that RfbA may not be allosterically regulated in some organisms and is structurally distinct from other RmlA homologs.