Predicting aqueous solubility of environmentally relevant compounds from molecular features: a simple but highly effective four-dimensional model based on Project to Latent Structures.
Water Res; 47(14): 5362-70, 2013 Sep 15.
Article in En | MEDLINE | ID: mdl-23866150
ABSTRACT
The aqueous solubility (log S) of xenobiotic chemicals has been identified as a key characteristic in determining their bioaccessibility/bioavailability and their fate and transport in aquatic environments. Here we explore and evaluate the use of a state-of-the-art data analysis technique (Project to Latent Structures, PLS) to estimate log S of environmentally relevant chemicals. A large number (n = 624) of molecular descriptors was computed for over 1400 organic chemicals and then reduced by a feature selection technique. Candidate predictor descriptors were fitted to the data by means of PLS, which was optimized by internal leave-one-out cross-validation and validated with an external data set. The final (best) PLS model, with only four variables (AlogP, X1sol, Mv, and E), exhibited noteworthy stability and good predictive power. It explained 91% of the variance in the data (n = 1400) with an average absolute error of 0.5 log units, for solubilities spanning more than 12 orders of magnitude. The newly proposed model is transparent, easily portable from one user to another, and robust enough to accurately estimate log S of a wide range of emerging contaminants.
Database: MEDLINE
Main subject: Organic Chemicals / Water Pollutants, Chemical / Quantitative Structure-Activity Relationship / Models, Chemical
Type of study: Prognostic_studies / Risk_factors_studies
Language: En
Journal: Water Res
Year: 2013
Type: Article
Affiliation country: United States