A comprehensive benchmarking of machine learning algorithms and dimensionality reduction methods for drug sensitivity prediction.
Brief Bioinform; 25(4), 2024 May 23.
Article in En | MEDLINE | ID: mdl-38797968
ABSTRACT
A major challenge of precision oncology is the identification and prioritization of suitable treatment options based on molecular biomarkers of the considered tumor. In pursuit of this goal, large cancer cell line panels have successfully been studied to elucidate the relationship between cellular features and treatment response. Due to the high dimensionality of these datasets, machine learning (ML) is commonly used for their analysis. However, choosing a suitable algorithm and set of input features can be challenging. We performed a comprehensive benchmarking of ML methods and dimension reduction (DR) techniques for predicting drug response metrics. Using the Genomics of Drug Sensitivity in Cancer cell line panel, we trained random forests, neural networks, boosting trees and elastic nets for 179 anti-cancer compounds with feature sets derived from nine DR approaches. We compare the results regarding statistical performance, runtime and interpretability. Additionally, we provide strategies for assessing model performance compared with a simple baseline model and measuring the trade-off between models of different complexity. Lastly, we show that complex ML models benefit from using an optimized DR strategy, and that standard models, even when using considerably fewer features, can still be superior in performance.
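The idea of assessing a model against a simple baseline, mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes synthetic data, a baseline that always predicts the training mean, a one-feature least-squares line standing in for a "real" model, and a hypothetical helper `improvement_over_baseline` that reports the fraction of baseline error removed under k-fold cross-validation.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean_baseline(x_train, y_train):
    """Baseline: ignore the feature and always predict the training mean."""
    mu = mean(y_train)
    return lambda x: mu

def fit_univariate_ols(x_train, y_train):
    """Stand-in model (assumption): least-squares line on a single feature."""
    mx, my = mean(x_train), mean(y_train)
    sxx = sum((x - mx) ** 2 for x in x_train)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_train, y_train))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cv_mse(fit, xs, ys, k=5, seed=0):
    """Mean squared error pooled over all held-out folds."""
    sq_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - predict(xs[i])) ** 2 for i in held_out)
    return mean(sq_errors)

def improvement_over_baseline(model_mse, baseline_mse):
    """Fraction of baseline error removed by the model (negative if worse)."""
    return 1.0 - model_mse / baseline_mse

# Synthetic data (assumption): one informative feature plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

baseline_mse = cv_mse(fit_mean_baseline, xs, ys)
model_mse = cv_mse(fit_univariate_ols, xs, ys)
gain = improvement_over_baseline(model_mse, baseline_mse)
print(f"baseline MSE={baseline_mse:.3f}  model MSE={model_mse:.3f}  gain={gain:.2%}")
```

Both predictors are evaluated on the same folds, so the difference in error reflects the model rather than the split. A gain near zero or below would suggest the more complex model adds nothing over the trivial predictor.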
Database: MEDLINE
Main subject: Algorithms / Benchmarking / Machine Learning / Antineoplastic Agents
Limits: Humans
Language: En
Year of publication: 2024
Document type: Article