Leveraging DFT and Molecular Fragmentation for Chemically Accurate pKa Prediction Using Machine Learning.
J Chem Inf Model; 64(3): 712-723, 2024 Feb 12.
Article in En | MEDLINE | ID: mdl-38301279
ABSTRACT
We present a quantum mechanical/machine learning (ML) framework based on random forest to accurately predict the pKa values of complex organic molecules using inexpensive density functional theory (DFT) calculations. By including physics-based features from low-level DFT calculations and structural features from our connectivity-based hierarchy (CBH) fragmentation protocol, we can correct the systematic error associated with DFT. The generalizability and performance of our model are evaluated on two benchmark sets (SAMPL6 and Novartis). We believe the carefully curated input of physics-based features lessens the model's dependence on large data sets and the need for complex deep learning architectures, without compromising accuracy on the test sets. As a point of novelty, our work extends the applicability of CBH by employing it to generate viable molecular descriptors for ML.
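The error-correction idea in the abstract can be illustrated with a minimal sketch. It assumes a delta-learning setup in which a random forest predicts the difference between a reference pKa and a low-level DFT estimate; the abstract does not state whether the authors train on the correction or on the pKa directly. All data below are synthetic, and the feature names (`frag_counts` standing in for CBH fragment descriptors, `charge` standing in for a physics-based feature) are hypothetical placeholders, not the paper's descriptors.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins only; none of these values come from the paper.
# Hypothetical fragment-count descriptors (placeholder for CBH features).
frag_counts = rng.integers(0, 4, size=(n, 3)).astype(float)
# Hypothetical physics-based feature (placeholder for a DFT-derived quantity).
charge = rng.normal(0.0, 0.2, size=n)

# "Reference" pKa values and a DFT estimate carrying a systematic error
# that depends on structure and on the physics-based feature.
pka_ref = rng.uniform(2.0, 12.0, size=n)
systematic_error = 1.5 * frag_counts[:, 0] - 0.8 * frag_counts[:, 1] + 3.0 * charge
pka_dft = pka_ref + systematic_error + rng.normal(0.0, 0.1, size=n)

# Feature matrix: DFT estimate, physics-based feature, structural features.
X = np.column_stack([pka_dft, charge, frag_counts])
# Delta-learning target: the correction that maps the DFT value to the reference.
y = pka_ref - pka_dft

# Simple hold-out split: first 300 molecules for training, last 100 for testing.
train, test = slice(0, 300), slice(300, n)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], y[train])

# Corrected prediction = DFT estimate + learned correction.
pka_corrected = pka_dft[test] + model.predict(X[test])

mae_dft = float(np.mean(np.abs(pka_dft[test] - pka_ref[test])))
mae_corrected = float(np.mean(np.abs(pka_corrected - pka_ref[test])))
print(f"MAE, raw DFT estimate:   {mae_dft:.2f}")
print(f"MAE, after RF correction: {mae_corrected:.2f}")
```

On this toy data the corrected error is lower than the raw one because the injected error is, by construction, a function of the supplied features; it says nothing about the accuracy reported for SAMPL6 or Novartis.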
Collection: 01-internacional
Database: MEDLINE
Main subject: Quantum Theory / Models, Chemical
Type of study: Prognostic_studies / Risk_factors_studies
Language: En
Journal: J Chem Inf Model
Journal subject: MEDICAL INFORMATICS / CHEMISTRY
Year: 2024
Document type: Article
Affiliation country: United States