Dimensionally reduced machine learning model for predicting single component octanol-water partition coefficients.
Kenney, David H; Paffenroth, Randy C; Timko, Michael T; Teixeira, Andrew R.
Affiliation
  • Kenney DH; Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, 01609, USA.
  • Paffenroth RC; Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA, 01609, USA.
  • Timko MT; Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, 01609, USA.
  • Teixeira AR; Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, 01609, USA. arteixeira@wpi.edu.
J Cheminform ; 15(1): 9, 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36658606
ABSTRACT
MF-LOGP, a new method for determining single-component octanol-water partition coefficients (LogP), is presented which uses molecular formula as the only input. Octanol-water partition coefficients are useful in many applications, ranging from environmental fate to drug delivery. Currently, partition coefficients are either experimentally measured or predicted as a function of structural fragments, topological descriptors, or thermodynamic properties known or calculated from precise molecular structures. The MF-LOGP method presented here differs from classical methods in that it does not require any structural information and uses molecular formula as the sole model input. MF-LOGP is therefore useful in situations where the structure is unknown or where low-dimensional, easily automatable, and computationally inexpensive calculations are required. MF-LOGP is a random forest algorithm that is trained and tested on 15,377 data points, using 10 features derived from the molecular formula to make LogP predictions. Using an independent validation set of 2,713 data points, MF-LOGP was found to have an average RMSE = 0.77 ± 0.007, MAE = 0.52 ± 0.003, and R² = 0.83 ± 0.003. This performance fell within the range of performances reported in the published literature for conventional higher-dimensional models (RMSE = 0.42-1.54, MAE = 0.09-1.07, and R² = 0.32-0.95). Compared with existing models, MF-LOGP requires a maximum of ten features and no structural information, thereby providing a practical yet predictive tool. The development of MF-LOGP provides the groundwork for the development of more physical prediction models leveraging big data analytical methods or complex multicomponent mixtures.
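To make the idea of formula-only input concrete, here is a minimal Python sketch of how a molecular formula might be turned into a fixed-length feature vector. The abstract states only that 10 features are derived from the molecular formula; the choice of ten element counts below, the name `ELEMENTS`, and the function `formula_features` are all hypothetical illustrations, not the authors' published feature set.

```python
import re
from collections import Counter

# ASSUMPTION: ten element counts stand in for the 10 formula-derived features.
# The abstract does not say which features MF-LOGP actually uses.
ELEMENTS = ("C", "H", "N", "O", "S", "P", "F", "Cl", "Br", "I")

# One element symbol (capital letter, optional lowercase letter) followed by
# an optional integer count, e.g. "C6", "H5", "Cl".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_features(formula):
    """Convert a flat molecular formula such as 'C6H5Cl' into element counts.

    Parentheses, charges, and isotopes are not handled in this sketch.
    Elements outside ELEMENTS are parsed but left out of the vector.
    """
    counts = Counter()
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] += int(number) if number else 1
    return [counts[element] for element in ELEMENTS]


if __name__ == "__main__":
    # Ethanol, C2H6O: two carbons, six hydrogens, one oxygen.
    print(formula_features("C2H6O"))  # [2, 6, 0, 1, 0, 0, 0, 0, 0, 0]
```

Vectors of this kind, paired with measured partition coefficients, could then be passed to any random forest regressor (for example scikit-learn's `RandomForestRegressor`). Note that isomers share a formula and therefore map to the same vector, which is the trade-off that comes with using no structural information.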
Full text: 1 Collection: 01-international Database: MEDLINE Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: J Cheminform Year: 2023 Document type: Article Affiliation country: United States