Highly Accurate Prediction of NMR Chemical Shifts from Low-Level Quantum Mechanics Calculations Using Machine Learning.
Li, Jie; Liang, Jiashu; Wang, Zhe; Ptaszek, Aleksandra L; Liu, Xiao; Ganoe, Brad; Head-Gordon, Martin; Head-Gordon, Teresa.
Affiliation
  • Li J; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
  • Liang J; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
  • Wang Z; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
  • Ptaszek AL; Christian Doppler Laboratory for High-Content Structural Biology and Biotechnology, Department of Structural and Computational Biology, Max Perutz Laboratories, University of Vienna, Campus Vienna Biocenter 5, Vienna 1030, Austria.
  • Liu X; Laboratory for Computer-Aided Molecular Design, Division of Medicinal Chemistry, Otto Loewi Research Center, Medical University Graz, Neue Stiftingtalstrasse 6/III, Graz 8010, Austria.
  • Ganoe B; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
  • Head-Gordon M; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
  • Head-Gordon T; Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
J Chem Theory Comput; 20(5): 2152-2166, 2024 Mar 12.
Article in En | MEDLINE | ID: mdl-38331423
ABSTRACT
Theoretical predictions of NMR chemical shifts from first principles can greatly facilitate experimental interpretation and structure identification of molecules in gas, solution, and solid-state phases. However, accurate prediction of chemical shifts using the gold-standard coupled cluster with singles, doubles, and perturbative triple excitations [CCSD(T)] method with a complete basis set (CBS) can be prohibitively expensive. By contrast, machine learning (ML) methods offer inexpensive alternatives for chemical shift prediction but are hampered by poor generalization to molecules outside the original training set. Here, we propose several new ideas for ML-based chemical shift prediction for H, C, N, and O. First, we introduce a novel feature representation, based on the atomic chemical shielding tensors within a molecular environment computed with an inexpensive quantum mechanics (QM) method, and train the model to predict NMR chemical shieldings from a high-level composite theory that approaches the accuracy of CCSD(T)/CBS. In addition, we train the ML model through a new progressive active learning workflow that reduces the total number of expensive high-level composite calculations required while allowing the model to continuously improve on unseen data. Furthermore, the algorithm provides an error estimate, signaling potential unreliability in predictions if the error is large. Finally, we introduce a novel approach to preserving the rotational invariance of the features using tensor environment vectors (TEVs), which yields an ML model with higher accuracy than a similar model using data augmentation.
We illustrate the predictive capacity of the resulting inexpensive shift machine learning (iShiftML) models across several benchmarks, including unseen molecules in the NS372 data set, gas-phase experimental chemical shifts for small organic molecules, and much larger and more complex natural products in which we can accurately differentiate between subtle diastereomers based on chemical shift assignments.
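The rotational-invariance requirement mentioned in the abstract can be illustrated with a minimal sketch. This is not the TEV construction from the paper, which the abstract does not specify; it only shows the underlying issue, assuming a hypothetical 3x3 shielding tensor with made-up values. The individual components of a shielding tensor change when the molecule is rotated, whereas quantities such as the isotropic shielding (one third of the trace) do not, so raw components are unsuitable as features unless invariance is enforced or the training data are augmented with rotated copies. The function names and the reference shielding value below are assumptions for illustration, and the shift formula is the common approximation delta = sigma_ref - sigma_iso.

```python
import math

def matmul(a, b):
    # Product of two 3x3 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rotate_tensor(sigma, rot):
    # A rank-2 tensor transforms as R * sigma * R^T under a rotation R.
    return matmul(matmul(rot, sigma), transpose(rot))

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def invariants(sigma):
    # Three scalars that are unchanged by any rotation of the frame:
    # isotropic shielding, trace of sigma squared, and the determinant.
    return (trace(sigma) / 3.0, trace(matmul(sigma, sigma)), det(sigma))

def chemical_shift(sigma_iso, sigma_ref_iso):
    # Common approximation: shift relative to a reference compound.
    return sigma_ref_iso - sigma_iso

# Hypothetical shielding tensor in ppm (illustrative values only).
sigma = [[30.1, 1.2, -0.5],
         [0.8, 28.4, 2.0],
         [-0.3, 1.7, 33.6]]

rot = matmul(rotation_z(0.7), rotation_x(1.1))
rotated = rotate_tensor(sigma, rot)

print("xx component before/after:", sigma[0][0], rotated[0][0])
print("invariants before:", invariants(sigma))
print("invariants after: ", invariants(rotated))
```

Printing the two sets of values shows the xx component changing under rotation while the three invariants agree to floating-point precision. A model trained directly on components would have to learn that equivalence from augmented data, which is the alternative the abstract compares against.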

Full text: 1 Collection: 01-internacional Database: MEDLINE Type of study: Prognostic_studies / Risk_factors_studies Language: En Journal: J Chem Theory Comput Year: 2024 Document type: Article Affiliation country: United States Country of publication: United States