Adding Stochastic Negative Examples into Machine Learning Improves Molecular Bioactivity Prediction.
J Chem Inf Model; 60(12): 5957-5970, 2020 Dec 28.
Article in English | MEDLINE | ID: mdl-33245237
ABSTRACT
Multitask deep neural networks learn to predict ligand-target binding by example, yet public pharmacological data sets are sparse, imbalanced, and approximate. We constructed two hold-out benchmarks to approximate temporal and drug-screening test scenarios, whose characteristics differ from those of a random split of conventional training data sets. We developed a pharmacological data set augmentation procedure, Stochastic Negative Addition (SNA), which randomly assigns untested molecule-target pairs as transient negative examples during training. Under the SNA procedure, drug-screening benchmark performance increases from R² = 0.1926 ± 0.0186 to 0.4269 ± 0.0272 (a 122% relative increase). This gain was accompanied by a modest decrease in temporal benchmark performance (13%). The SNA gains in drug-screening performance were consistent across classification and regression tasks and outperformed y-randomized controls. Our results highlight where data and feature uncertainty may be problematic and how incorporating uncertainty into training improves predictions of drug-target relationships.
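The abstract describes SNA only at a high level, so the following is a minimal sketch of the idea, not the authors' implementation. It assumes that negatives are redrawn every epoch, that they are kept out of the measured data, and that each one receives a fixed placeholder label. The function names, the `ratio` parameter, and the `negative_value` of 0.0 are all hypothetical.

```python
import random


def sample_stochastic_negatives(known_pairs, molecules, targets,
                                n_negatives, negative_value=0.0, rng=None):
    """Draw untested (molecule, target) pairs and label them as negatives.

    Assumption: any pair absent from `known_pairs` counts as untested.
    Rejection sampling is used, which is cheap when the matrix is sparse.
    """
    rng = rng or random.Random()
    known = set(known_pairs)
    mols = list(dict.fromkeys(molecules))   # de-duplicate, keep order
    tgts = list(dict.fromkeys(targets))
    mol_set, tgt_set = set(mols), set(tgts)
    tested = sum(1 for m, t in known if m in mol_set and t in tgt_set)
    n_untested = len(mols) * len(tgts) - tested
    if n_negatives > n_untested:
        raise ValueError("not enough untested pairs to sample from")

    negatives = {}
    while len(negatives) < n_negatives:
        pair = (rng.choice(mols), rng.choice(tgts))
        if pair not in known:
            negatives[pair] = negative_value  # hypothetical placeholder label
    return negatives


def epoch_training_set(measured, molecules, targets, ratio, rng=None):
    """Return measured data plus freshly drawn negatives for one epoch.

    `measured` is never modified, so the negatives stay transient.
    """
    n_negatives = int(ratio * len(measured))
    negatives = sample_stochastic_negatives(
        measured, molecules, targets, n_negatives, rng=rng)
    return {**measured, **negatives}


if __name__ == "__main__":
    # Toy data: two measured activities in a 3 x 2 molecule-target grid.
    measured = {("mol_a", "t1"): 7.2, ("mol_b", "t2"): 5.1}
    molecules = ["mol_a", "mol_b", "mol_c"]
    targets = ["t1", "t2"]
    rng = random.Random(0)
    for epoch in range(3):
        train = epoch_training_set(measured, molecules, targets, 1.0, rng)
        print(epoch, sorted(train.items()))
```

Because the negatives are sampled again at each epoch, a truly active but untested pair is mislabeled only occasionally rather than permanently. That is one plausible reading of "transient" in the abstract.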
Full text:
1
Database:
MEDLINE
Main subject:
Neural Networks, Computer
/
Machine Learning
Study type:
Clinical_trials
/
Prognostic_studies
/
Risk_factors_studies
Language:
En
Publication year:
2020
Document type:
Article