A comparison of machine learning algorithms and traditional regression-based statistical modeling for predicting hypertension incidence in a Canadian population.
Chowdhury, Mohammad Ziaul Islam; Leung, Alexander A; Walker, Robin L; Sikdar, Khokan C; O'Beirne, Maeve; Quan, Hude; Turin, Tanvir C.
Affiliation
  • Chowdhury MZI; Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada. mohammad.chowdhury@ucalgary.ca.
  • Leung AA; Department of Family Medicine, University of Calgary, 3330 Hospital Drive NW, Calgary, AB, T2N 4N1, Canada. mohammad.chowdhury@ucalgary.ca.
  • Walker RL; Department of Psychiatry, University of Calgary, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada. mohammad.chowdhury@ucalgary.ca.
  • Sikdar KC; Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada.
  • O'Beirne M; Department of Medicine, University of Calgary, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada.
  • Quan H; Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada.
  • Turin TC; Primary Health Care Integration Network, Primary Health Care, Alberta Health Services, Calgary, AB, Canada.
Sci Rep; 13(1): 13, 2023 01 02.
Article in English | MEDLINE | ID: mdl-36593280
ABSTRACT
Risk prediction models are frequently used to identify individuals at risk of developing hypertension. This study evaluates different machine learning algorithms and compares their predictive performance with that of the conventional Cox proportional hazards (PH) model in predicting hypertension incidence from survival data. We analyzed 18,322 participants and 24 candidate features from the large Alberta's Tomorrow Project (ATP) to develop the prediction models. To select the top features, we applied five feature selection methods: two filter-based methods (univariate Cox p-value and C-index); two embedded methods (random survival forest and the least absolute shrinkage and selection operator, Lasso); and one constraint-based method (the statistically equivalent signature, SES). Five machine learning algorithms were developed to predict hypertension incidence: the penalized regression methods Ridge, Lasso, and Elastic Net (EN); random survival forest (RSF); and gradient boosting (GB). The conventional Cox PH model was developed alongside them. The predictive performance of the models was assessed using the C-index. The machine learning algorithms performed similarly to the conventional Cox PH model: average C-indexes were 0.78, 0.78, 0.78, 0.76, 0.76, and 0.77 for Ridge, Lasso, EN, RSF, GB, and Cox PH, respectively. Important features associated with each model are also presented. Our findings demonstrate little difference in predictive performance between machine learning algorithms and the conventional Cox PH regression model in predicting hypertension incidence. In a moderately sized dataset with a reasonable number of features, conventional regression-based models perform similarly to machine learning algorithms, with good predictive accuracy.
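The abstract compares models by their C-index. As a rough illustration of what that number measures, the sketch below computes Harrell's concordance index for right-censored data in plain Python. The function name `harrell_c_index` and the toy data are assumptions made for illustration and are not taken from the study; pairs with tied follow-up times are simply skipped, which is a simplification of how statistical packages handle ties.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times       -- follow-up time for each participant
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore tied follow-up times
        # Order the pair so that `first` has the shorter follow-up time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0  # earlier event received the higher risk score
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied risk scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four participants, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.1, 0.5, 0.7]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of participants by risk, so the reported averages of 0.76 to 0.78 indicate that each model ranked roughly three out of four comparable pairs correctly.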
Subjects

Full text: 1 Database: MEDLINE Main subject: Algorithms / Hypertension Language: En Publication year: 2023 Document type: Article