ABSTRACT
Flavonoids, the largest class of natural polyphenols, are extensively investigated for their multiple benefits to human health. Because of their physicochemical or biological properties, many flavonoids are thought to show low selectivity across protein targets or to confound high-throughput screening (HTS) outcomes. The aim of this study is to highlight reliable, bioselective compounds sharing flavonoid scaffolds in HTS experiments. A filtering scheme was applied to remove undesired flavonoids (and related compounds) from confirmatory PubChem bioassays. A total of 433 compounds addressing various protein targets form the core of the collection of bioselective flavonoids and related compounds (ColBioS-FlavRC). With an additional set of 2908 inactive related compounds, ColBioS-FlavRC offers a basis for method optimization and validation. We illustrated the use of ColBioS-FlavRC with pharmacophore modeling, which was then externally validated for virtual screening purposes. The early enrichment capabilities of the pharmacophore hypotheses were measured by means of the median exponential retriever operating curve enrichment (MeROCE), a metric suited to comparative evaluations of virtual screening methods. ColBioS-FlavRC is available in the Supporting Information and is freely accessible for further studies.
Subject(s)
Algorithms , Flavonoids/chemistry , Proteins/chemistry , Drug Design , High-Throughput Screening Assays , Humans , Proteins/agonists , Proteins/antagonists & inhibitors , Quantitative Structure-Activity Relationship , User-Computer Interface
ABSTRACT
In this study, a simple evaluation metric, denoted eROCE, was proposed to measure the early enrichment of predictive methods. We demonstrated the superior robustness of eROCE compared to other known metrics across several active-to-inactive ratios ranging from 1:10 to 1:1000. Group fusion similarity search was investigated by varying 16 similarity coefficients, five molecular representations (binary and non-binary), and two group fusion rules, using two reference structure set sizes. We used a dataset of 3478 active and 43,938 inactive molecules, and the enrichment was analyzed by means of eROCE. This retrospective study provides optimal similarity search parameters for ALDH1A1 inhibitors.
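As a rough illustration of group fusion similarity search, the sketch below scores each database molecule against every reference structure and fuses the per-reference similarities into one score. It is a minimal sketch under stated assumptions, not the study's implementation: the abstract does not name its coefficients or fusion rules, so the Tanimoto coefficient and the MAX and MEAN rules are assumed here, and the fingerprints, molecule names, and function names are hypothetical toy values.

```python
from statistics import mean

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient for binary fingerprints stored as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def group_fusion_scores(references, database, rule="max"):
    """Fuse the similarities of each database molecule to all reference structures.

    rule: "max" or "mean" (assumed fusion rules; the abstract does not name them).
    """
    if rule not in ("max", "mean"):
        raise ValueError("rule must be 'max' or 'mean'")
    fuse = max if rule == "max" else mean
    return {
        name: fuse([tanimoto(fp, ref) for ref in references])
        for name, fp in database.items()
    }

def rank(scores):
    """Return molecule names ordered from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy fingerprints (sets of on-bit positions), for illustration only.
references = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6})]
database = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({3, 4}),
    "mol_c": frozenset({7, 8}),
}

max_scores = group_fusion_scores(references, database, rule="max")
mean_scores = group_fusion_scores(references, database, rule="mean")
print(rank(max_scores))
print(rank(mean_scores))
```

The ranked list produced this way is what an early-enrichment metric such as eROCE would then evaluate; the metric itself is not sketched here because the abstract does not give its definition.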