A comprehensive benchmarking of machine learning algorithms and dimensionality reduction methods for drug sensitivity prediction.

Eckhart, Lea; Lenhof, Kerstin; Rolli, Lisa-Marie; Lenhof, Hans-Peter

Affiliation

Eckhart L; Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany.
Lenhof K; Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany.
Rolli LM; Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany.
Lenhof HP; Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany.

Brief Bioinform; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38797968

ABSTRACT

A major challenge of precision oncology is the identification and prioritization of suitable treatment options based on molecular biomarkers of the considered tumor. In pursuit of this goal, large cancer cell line panels have successfully been studied to elucidate the relationship between cellular features and treatment response. Due to the high dimensionality of these datasets, machine learning (ML) is commonly used for their analysis. However, choosing a suitable algorithm and set of input features can be challenging. We performed a comprehensive benchmarking of ML methods and dimension reduction (DR) techniques for predicting drug response metrics. Using the Genomics of Drug Sensitivity in Cancer cell line panel, we trained random forests, neural networks, boosting trees and elastic nets for 179 anti-cancer compounds with feature sets derived from nine DR approaches. We compare the results regarding statistical performance, runtime and interpretability. Additionally, we provide strategies for assessing model performance compared with a simple baseline model and measuring the trade-off between models of different complexity. Lastly, we show that complex ML models benefit from using an optimized DR strategy, and that standard models, even when using considerably fewer features, can still be superior in performance.
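
The abstract mentions assessing model performance against a simple baseline model but does not say how. As a minimal sketch only, the following assumes the baseline is a constant predictor that always outputs the mean training response, and that performance is measured by mean squared error; both choices, the function names, and the toy numbers are illustrative assumptions and are not taken from the article.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def improvement_over_baseline(y_train, y_test, model_pred):
    """Relative MSE reduction of a model versus a constant baseline.

    The baseline (an assumption of this sketch) predicts the mean of the
    training responses for every test sample. The result is 1 for perfect
    predictions, 0 for a model no better than the baseline, and negative
    for a model that is worse. It is undefined if the baseline MSE is zero.
    """
    baseline_value = mean(y_train)
    baseline_mse = mse(y_test, [baseline_value] * len(y_test))
    model_mse = mse(y_test, model_pred)
    return 1.0 - model_mse / baseline_mse


# Toy, made-up response values (not data from the study).
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [2.0, 4.0]
model_pred = [2.5, 3.5]
score = improvement_over_baseline(y_train, y_test, model_pred)
```

A score near zero or below would suggest that the extra complexity of a model buys little over the trivial predictor, which is the kind of trade-off the abstract refers to.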

Subjects

Algorithms; Antineoplastic Agents; Benchmarking; Machine Learning; Humans; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Neoplasms/drug therapy; Neoplasms/genetics; Neural Networks, Computer; Cell Line, Tumor

Keywords

cancer cell lines; dimension reduction; drug sensitivity prediction; feature extraction; feature selection; machine learning

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Benchmarking / Machine Learning / Antineoplastic Agents | Limits: Humans | Language: English | Year of publication: 2024 | Document type: Article