Boosting the Accuracy and Chemical Space Coverage of the Detection of Small Colloidal Aggregating Molecules Using the BAD Molecule Filter.
Abou Hajal, Abdallah; Bryce, Richard A; Amor, Boulbaba Ben; Atatreh, Noor; Ghattas, Mohammad A.
Affiliation
  • Abou Hajal A; College of Pharmacy, Al Ain University, Abu Dhabi 112612, United Arab Emirates.
  • Bryce RA; AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi 112612, United Arab Emirates.
  • Amor BB; Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
  • Atatreh N; Core42, Inception/G42, Abu Dhabi 2282, United Arab Emirates.
  • Ghattas MA; IMT Nord Europe, Villeneuve D'Ascq 59650 France.
J Chem Inf Model; 64(13): 4991-5005, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38920403
ABSTRACT
The ability to conduct effective high-throughput screening (HTS) campaigns in drug discovery is often hampered by false positives that arise in these assays from small colloidally aggregating molecules (SCAMs). SCAMs can produce artifactual hits in HTS by nonspecific inhibition of the protein target. In this work, we present a new computational prediction tool for detecting SCAMs based on their 2D chemical structure. The tool, called the boosted aggregation detection (BAD) molecule filter, employs decision tree ensemble methods, namely, the CatBoost classifier and the light gradient-boosting machine, to significantly improve the detection of SCAMs. In developing the filter, we explore three approaches, each tailored for specific drug discovery needs: models trained on individual data sets, a consensus approach using these models, and a merged data set approach. The individual data set method emerged as the most effective, achieving 93% sensitivity and 90% specificity, outperforming existing state-of-the-art models by 20% and 5%, respectively. The consensus models offer broader chemical space coverage, exceeding 90% for all testing sets. This feature is particularly important for early-stage medicinal chemistry projects and provides information on the applicability domain. Meanwhile, the merged data set models demonstrated robust performance, with a notable sensitivity of 79% in the comprehensive 10-fold cross-validation test set. A SHAP analysis of model features indicates that hydrophobicity and molecular complexity are the primary factors influencing aggregation propensity. The BAD molecule filter is publicly accessible at https://molmodlab-aau.com/Tools.html. This filter provides a new, more robust tool for aggregate prediction in the early stages of drug discovery to optimize hit rates and reduce associated testing and validation overheads.
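For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how sensitivity and specificity are computed for a binary aggregator/non-aggregator classifier, together with one possible way to combine several models into a consensus call. The majority-vote rule, the function names, and the toy labels are all illustrative assumptions; the abstract does not state how the BAD molecule filter forms its consensus, and none of the numbers below come from the paper.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = aggregator."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # aggregators caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # aggregators missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-aggregators cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-aggregators flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def majority_vote(per_model_preds: Sequence[Sequence[int]]) -> List[int]:
    """Hypothetical consensus rule: flag a molecule when more than half of
    the models predict it to be an aggregator. This is an assumption, not
    the rule used by the BAD molecule filter."""
    n_models = len(per_model_preds)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*per_model_preds)]


# Toy labels (invented for illustration): 4 aggregators, 6 non-aggregators.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
model_b = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
model_c = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

consensus = majority_vote([model_a, model_b, model_c])
sens_a, spec_a = sensitivity_specificity(y_true, model_a)
sens_c, spec_c = sensitivity_specificity(y_true, consensus)
print(f"model_a:   sensitivity={sens_a:.2f}, specificity={spec_a:.2f}")
print(f"consensus: sensitivity={sens_c:.2f}, specificity={spec_c:.2f}")
```

In this toy case each individual model misses one aggregator, while the majority vote recovers all of them. That outcome is a property of the invented labels and should not be read as evidence about the published models.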

Full text: 1 | Database: MEDLINE | Main subject: Drug Discovery | Language: English | Publication year: 2024 | Document type: Article
