HSQC Spectra Simulation and Matching for Molecular Identification.
J Chem Inf Model; 64(8): 3180-3191, 2024 Apr 22.
Article in En | MEDLINE | ID: mdl-38533705
ABSTRACT
To improve compound identification and database searching, this study explores methods for simulating and matching heteronuclear single quantum coherence (HSQC) spectra. HSQC spectra serve as unique molecular fingerprints and offer a useful balance between data collection time and information richness. We conducted a comprehensive evaluation of four HSQC simulation techniques: ACD/Labs (ACD), MestReNova (MNova), Gaussian NMR calculations (DFT), and a graph-based neural network (ML). For the latter two techniques, we developed a reconstruction logic that combines 1D proton and carbon spectra into HSQC spectra. The methodology involved implementing three peak-matching strategies (minimum-sum, Euclidean-distance, and Hungarian-distance) combined with three padding strategies (zero-padding, peak-truncated, and nearest-neighbor double assignment). We found that coupling these strategies with a robust simulation technique enables accurate identification of the correct molecule among similar analogues (regio- and stereoisomers) and allows fast and accurate searches of large databases. Furthermore, we demonstrated the efficacy of the best-performing methodology by correcting the structures of a set of previously misidentified molecules. This research indicates that effective HSQC spectral simulation and matching methodologies significantly facilitate molecular structure elucidation. We also provide a Google Colab notebook so that researchers can apply our methods to their own data (https://github.com/AstraZeneca/hsqc_structure_elucidation.git).
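The idea of matching peaks through an optimal assignment with zero-padding can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the distance definition, so the per-peak Euclidean distance, the hypothetical `c_scale` weighting of the carbon axis, the normalization by peak count, and the made-up peak lists below are all assumptions. The optimal one-to-one assignment (the quantity the Hungarian algorithm computes efficiently) is found here by brute-force enumeration, which is only practical for very short peak lists.

```python
from itertools import permutations
from math import hypot


def pad_with_zeros(peaks, n):
    """Zero-padding: extend a peak list to length n with (0.0, 0.0) peaks."""
    return list(peaks) + [(0.0, 0.0)] * (n - len(peaks))


def peak_distance(p, q, h_scale=1.0, c_scale=0.1):
    """Euclidean distance between two (1H ppm, 13C ppm) peaks.

    h_scale and c_scale are hypothetical weights (an assumption) that put
    the two chemical-shift axes on a comparable footing.
    """
    return hypot((p[0] - q[0]) * h_scale, (p[1] - q[1]) * c_scale)


def assignment_distance(spec_a, spec_b):
    """Mean per-peak cost of the optimal one-to-one peak assignment.

    The shorter spectrum is zero-padded so both have the same number of
    peaks. All assignments are enumerated (O(n!)), standing in for the
    Hungarian algorithm on small inputs.
    """
    n = max(len(spec_a), len(spec_b))
    if n == 0:
        return 0.0
    a = pad_with_zeros(spec_a, n)
    b = pad_with_zeros(spec_b, n)
    best = min(
        sum(peak_distance(p, q) for p, q in zip(a, perm))
        for perm in permutations(b)
    )
    return best / n


# Illustrative, made-up peak lists as (1H ppm, 13C ppm) pairs.
experimental = [(7.26, 128.5), (3.80, 55.2), (2.10, 21.0)]
candidates = {
    "candidate_a": [(7.30, 128.0), (3.75, 55.9), (2.15, 20.5)],
    "candidate_b": [(7.30, 128.0), (4.60, 70.1)],
}

# Rank candidates by increasing distance to the experimental spectrum.
ranking = sorted(
    candidates, key=lambda name: assignment_distance(experimental, candidates[name])
)
print(ranking)
```

With a simulated spectrum per candidate structure, sorting by this distance gives a ranked list of candidate molecules. For realistic peak counts, the enumeration would need to be replaced by a polynomial-time assignment solver.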
Full text: 1
Database: MEDLINE
Main subject: Computer Simulation
Language: En
Journal: J Chem Inf Model
Journal subject: Medical Informatics / Chemistry
Year: 2024
Document type: Article
Country of affiliation: Sweden