Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment.
Bass, Lewis; Elder, Luke H; Folescu, Dan E; Forouzesh, Negin; Tolokh, Igor S; Karpatne, Anuj; Onufriev, Alexey V.
Affiliation
  • Bass L; Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States.
  • Elder LH; Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States.
  • Folescu DE; Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States.
  • Forouzesh N; Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States.
  • Tolokh IS; Department of Computer Science, California State University, Los Angeles, California 90032, United States.
  • Karpatne A; Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States.
  • Onufriev AV; Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States.
J Chem Theory Comput; 20(1): 396-410, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38149593
ABSTRACT
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching a lower limit of about 20% relative accuracy gain compared with the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
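A minimal sketch of the residual-learning strategy described in the abstract, assuming precomputed per-molecule descriptors. The paper trains a graph convolutional neural network on molecular graphs; here a simple scikit-learn regressor stands in for illustration, and all arrays (descriptors, physics-based HFEs, experimental HFEs) are synthetic placeholders rather than FreeSolv data.

```python
# Sketch: learn the remaining error of a physics-based HFE model and apply it
# as a postprocessing correction. All names and data below are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder inputs: molecular descriptors, physics-based HFE predictions
# (e.g., from a GB or TIP3P calculation), and experimental reference HFEs.
X = rng.normal(size=(500, 16))            # hypothetical molecular descriptors
hfe_physics = rng.normal(size=500)        # physics-based HFEs (kcal/mol)
hfe_exper = hfe_physics + 0.3 * X[:, 0] + rng.normal(scale=0.5, size=500)

# The ML target is the residual of the physics-based model, not the HFE itself.
residual = hfe_exper - hfe_physics

X_tr, X_te, r_tr, r_te, p_tr, p_te, e_tr, e_te = train_test_split(
    X, residual, hfe_physics, hfe_exper, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_tr, r_tr)

# Corrected prediction = physics-based HFE + predicted residual.
hfe_corrected = p_te + model.predict(X_te)

rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
print("RMSE, physics only :", rmse(p_te, e_te))
print("RMSE, ML-corrected :", rmse(hfe_corrected, e_te))
```

Because the model only learns a correction on top of the physics-based prediction, the physics-based trend is preserved wherever the learned residual is small, which is the advantage over training an ML model directly on reference HFEs.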

Full text: 1 Collection: 01-international Database: MEDLINE Language: English Journal: J Chem Theory Comput / J. chem. theory comput. (Online) / Journal of chemical theory and computation (Online) Year: 2024 Document type: Article Country of affiliation: United States Country of publication: United States