Item response theory as a feature selection and interpretation tool in the context of machine learning.

Kline, Adrienne S; Kline, Theresa J B; Lee, Joon

Affiliation

Kline AS; Department of Biomedical Engineering, University of Calgary, Calgary, AB, Canada. askline1@gmail.com.
Kline TJB; Undergraduate Medical Education, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada. askline1@gmail.com.
Lee J; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada. askline1@gmail.com.

Med Biol Eng Comput ; 59(2): 471-482, 2021 Feb.

Article in English | MEDLINE | ID: mdl-33534111

ABSTRACT

Optimizing the number and utility of features to use in a classification analysis has been the subject of many research studies. Most current models use end-classifications as part of the feature reduction process, leading to circularity in the methodology. The approach demonstrated in the present research uses item response theory (IRT) to select features independently of the end-classification results, avoiding the biased accuracies that this circularity engenders. Dichotomous and polytomous IRT models were used to analyze 30 histological breast cancer features from 569 patients using the Wisconsin Diagnostic Breast Cancer data set. Based on their characteristics, three features were selected for use in a machine learning classifier. For comparison purposes, two machine learning-based feature selection protocols were run, recursive feature elimination (RFE) and ridge regression, and the three features selected from these analyses were also used in the subsequent learning classifier. Classification results demonstrated that all three selection processes performed comparably. The unbiased nature of the IRT protocol, together with the information it provides about why specific features are useful in classification, helps clarify which attributes of features make them suitable for use in a machine learning context.
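
As a rough illustration of how IRT parameters can describe a feature without reference to classification outcomes, the sketch below uses the two-parameter logistic (2PL) model, one common dichotomous IRT model. The abstract does not state which specific models or selection criteria were used, so the choice of 2PL, the feature names, and the parameter values are all assumptions made for illustration only.

```python
import math


def icc_2pl(theta, a, b):
    """Item characteristic curve of the 2PL model.

    theta: latent trait level, a: discrimination, b: difficulty (location).
    Returns the probability of a positive (1) response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical (discrimination, difficulty) pairs; these are NOT values
# from the paper, only placeholders to show the mechanics.
items = {
    "feature_A": (2.5, 0.0),
    "feature_B": (0.4, 0.0),
    "feature_C": (1.5, 1.0),
}


def rank_by_information(items, theta):
    """Order item names by the information they carry at a given theta."""
    return sorted(
        items,
        key=lambda name: item_information(theta, *items[name]),
        reverse=True,
    )


ranking = rank_by_information(items, theta=0.0)
print(ranking)
```

In this sketch, a feature with high discrimination located near the trait level of interest carries the most information, which is one plausible sense in which IRT characteristics could guide feature selection before any classifier is trained.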

Subjects

Machine Learning; Support Vector Machine; Humans

Keywords

Breast cancer; Feature selection; Item response theory; Machine learning

Full text: 1; Collections: 01-international; Database: MEDLINE; Main subject: Support Vector Machine / Machine Learning; Study type: Guideline / Prognostic_studies; Limit: Humans; Language: En; Publication year: 2021; Document type: Article
