Machine Learning Models to Predict Inhibition of the Bile Salt Export Pump.

McLoughlin, Kevin S; Jeong, Claire G; Sweitzer, Thomas D; Minnich, Amanda J; Tse, Margaret J; Bennion, Brian J; Allen, Jonathan E; Calad-Thomson, Stacie; Rush, Thomas S; Brase, James M.

J Chem Inf Model; 61(2): 587-602, 2021 02 22.

Article in English | MEDLINE | ID: mdl-33502191

ABSTRACT

Cholestatic liver injury is frequently associated with drug inhibition of bile salt transporters, such as the bile salt export pump (BSEP). Reliable in silico models to predict BSEP inhibition directly from chemical structures would significantly reduce costs during drug discovery and could help avoid injury to patients. We report our development of classification and regression models for BSEP inhibition with substantially improved performance over previously published models. We assessed the performance effects of different methods of chemical featurization, data set partitioning, and class labeling and identified the methods producing models that generalized best to novel chemical entities.

Subject(s)

Chemical and Drug Induced Liver Injury, Cholestasis, ATP Binding Cassette Transporter, Subfamily B, Member 11, ATP-Binding Cassette Transporters, Humans, Machine Learning

AMPL: A Data-Driven Modeling Pipeline for Drug Discovery.

Minnich, Amanda J; McLoughlin, Kevin; Tse, Margaret; Deng, Jason; Weber, Andrew; Murad, Neha; Madej, Benjamin D; Ramsundar, Bharath; Rush, Tom; Calad-Thomson, Stacie; Brase, Jim; Allen, Jonathan E.

J Chem Inf Model; 60(4): 1955-1968, 2020 04 27.

Article in English | MEDLINE | ID: mdl-32243153

ABSTRACT

One of the key requirements for incorporating machine learning (ML) into the drug discovery process is complete traceability and reproducibility of the model building and evaluation process. With this in mind, we have developed an end-to-end modular and extensible software pipeline for building and sharing ML models that predict key pharma-relevant parameters. The ATOM Modeling PipeLine, or AMPL, extends the functionality of the open source library DeepChem and supports an array of ML and molecular featurization tools. We have benchmarked AMPL on a large collection of pharmaceutical data sets covering a wide range of parameters. Our key findings indicate that traditional molecular fingerprints underperform other feature representation methods. We also find that data set size correlates directly with prediction performance, which points to the need to expand public data sets. Uncertainty quantification can help predict model error, but correlation with error varies considerably between data sets and model types. Our findings point to the need for an extensible pipeline that can be shared to make model building more widely accessible and reproducible. This software is open source and available at: https://github.com/ATOMconsortium/AMPL.

Subject(s)

Drug Discovery, Software, Machine Learning, Reproducibility of Results
