Cross-validation strategies in QSPR modelling of chemical reactions.
Rakhimbekova, A; Akhmetshin, T N; Minibaeva, G I; Nugmanov, R I; Gimadiev, T R; Madzhidov, T I; Baskin, I I; Varnek, A.
Affiliation
  • Rakhimbekova A; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
  • Akhmetshin TN; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
  • Minibaeva GI; Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, Strasbourg, France.
  • Nugmanov RI; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
  • Gimadiev TR; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
  • Madzhidov TI; Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo, Japan.
  • Baskin II; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
  • Varnek A; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
SAR QSAR Environ Res ; 32(3): 207-219, 2021 Mar.
Article in En | MEDLINE | ID: mdl-33601989
In this article, we consider cross-validation of quantitative structure-property relationship (QSPR) models for reactions and show that the conventional k-fold cross-validation (CV) procedure gives an 'optimistically' biased assessment of prediction performance. To address this issue, we suggest two model cross-validation strategies: 'transformation-out' CV and 'solvent-out' CV. Unlike conventional k-fold CV, which does not consider the nature of the objects, the proposed procedures provide an unbiased estimate of the predictive performance of models for novel types of structural transformations in chemical reactions and for reactions proceeding under new conditions. Both strategies have been applied to predict the rate constants of bimolecular elimination and nucleophilic substitution reactions and of Diels-Alder cycloaddition. All suggested cross-validation methodologies are implemented, together with a tutorial, in the open-source software package CIMtools (https://github.com/cimm-kzn/CIMtools).
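The idea behind the two strategies can be illustrated with a minimal sketch of a grouped split. It assumes that 'transformation-out' and 'solvent-out' CV mean holding out all reactions that share a transformation (or a solvent) together, so that the held-out fold contains only groups unseen during training; the abstract does not spell out the exact procedure. The function `group_out_folds`, the toy reaction list, and the labels `T1`..`T3` are hypothetical names chosen for illustration and are not the CIMtools API.

```python
import random
from collections import defaultdict


def group_out_folds(groups, n_folds, seed=0):
    """Return (train, test) index lists in which no group label is
    shared between the training and the test side of a split."""
    # Collect the row indices belonging to each group label.
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    if len(by_group) < n_folds:
        raise ValueError("need at least as many distinct groups as folds")

    # Shuffle the group labels (not the rows) reproducibly.
    labels = sorted(by_group)
    random.Random(seed).shuffle(labels)

    # Deal whole groups into folds round-robin.
    folds = [[] for _ in range(n_folds)]
    for position, label in enumerate(labels):
        folds[position % n_folds].extend(by_group[label])

    splits = []
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        splits.append((train, test))
    return splits


# Toy data (assumed): each reaction is (transformation label, solvent).
reactions = [
    ("T1", "water"), ("T1", "ethanol"), ("T2", "water"),
    ("T2", "acetone"), ("T3", "ethanol"), ("T3", "acetone"),
]
transformations = [t for t, _ in reactions]
solvents = [s for _, s in reactions]

# Group by transformation or by solvent, depending on the question asked.
transformation_out = group_out_folds(transformations, n_folds=3)
solvent_out = group_out_folds(solvents, n_folds=3)

for train, test in transformation_out:
    held_out = sorted({transformations[i] for i in test})
    print("held-out transformations:", held_out, "test rows:", test)
```

In conventional k-fold CV the rows themselves would be shuffled, so reactions with the same transformation or solvent could appear on both sides of a split; grouping before splitting is what removes that overlap in this sketch.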

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Quantitative Structure-Activity Relationship / Models, Chemical Type of study: Prognostic_studies Language: En Journal: SAR QSAR Environ Res Journal subject: SAUDE AMBIENTAL Year: 2021 Document type: Article Affiliation country: Russia Country of publication: United kingdom
