ABSTRACT
Several recent studies indicate that atypical changes in driving behavior may be early signs of mild cognitive impairment (MCI) and dementia. These studies, however, are limited by small sample sizes and short follow-up durations. This study aims to develop an interaction-based classification method, built on a statistic called the Influence Score (I-score), for predicting MCI and dementia from naturalistic driving data collected in the Longitudinal Research on Aging Drivers (LongROAD) project. Naturalistic driving trajectories were collected through in-vehicle recording devices for up to 44 months from 2977 participants who were cognitively intact at the time of enrollment. These data were further processed and aggregated to generate 31 time-series driving variables. Because the time-series features derived from these driving variables are high-dimensional, we used the I-score for variable selection. The I-score is a measure of the predictive ability of variables and has proven effective in differentiating between noisy and predictive variables in big data. It is introduced here to select influential variable modules, or groups, that account for compound interactions among explanatory variables. It is also explainable, in that it indicates the extent to which variables and their interactions contribute to the predictiveness of a classifier. In addition, the I-score boosts the performance of classifiers on imbalanced datasets because of its association with the F1 score. Interaction-based residual blocks are constructed over the top I-score modules to generate predictors, and ensemble learning aggregates these predictors to boost the prediction of the overall classifier. Experiments using naturalistic driving data show that our proposed classification method achieves the best accuracy (96%) for predicting MCI and dementia, followed by random forest (93%) and logistic regression (88%).
In terms of F1 score and AUC, our proposed classifier achieves 98% and 87%, respectively, followed by random forest (F1 score of 96%, AUC of 79%) and logistic regression (F1 score of 92%, AUC of 77%). The results indicate that incorporating the I-score into machine learning algorithms could considerably improve model performance for predicting MCI and dementia in older drivers. We also performed a feature importance analysis and found that the right-to-left turn ratio and the number of hard braking events are the most important driving variables for predicting MCI and dementia.
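
The abstract does not state the I-score formula, so the following is only a minimal sketch under an assumed form that is common in the partition-based literature: the selected variables are discretized, the samples are partitioned into cells by their joint values, and the score is I = (1 / (n s^2)) * sum_j n_j^2 (Ybar_j - Ybar)^2, where n_j and Ybar_j are the size and mean response of cell j, and Ybar and s^2 are the overall mean and variance of the response. The normalization used in this study may differ. The function name `i_score` and the toy data are hypothetical and are not drawn from LongROAD.

```python
from collections import defaultdict

def i_score(X, y, subset):
    """Assumed I-score of the variable module `subset` (tuple of column indices).

    X: list of rows of already-discretized values; y: list of 0/1 responses.
    """
    n = len(y)
    y_bar = sum(y) / n
    var_y = sum((v - y_bar) ** 2 for v in y) / n
    if var_y == 0:
        return 0.0  # a constant response carries no signal to explain
    # Partition the samples into cells by the joint values of the chosen variables.
    cells = defaultdict(list)
    for row, label in zip(X, y):
        cells[tuple(row[j] for j in subset)].append(label)
    # Sum the squared, size-weighted deviations of cell means from the overall mean.
    total = 0.0
    for labels in cells.values():
        n_j = len(labels)
        total += n_j ** 2 * (sum(labels) / n_j - y_bar) ** 2
    return total / (n * var_y)

# Hypothetical toy data: y = x0 XOR x1, while x2 is unrelated to y.
X = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
     (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
y = [0, 1, 1, 0, 0, 1, 1, 0]

print(i_score(X, y, (0,)))       # x0 alone
print(i_score(X, y, (0, 1)))     # the interacting pair
print(i_score(X, y, (0, 1, 2)))  # the pair plus the unrelated variable
```

On this toy data, neither variable scores above zero alone, the interacting pair scores highest, and adding the unrelated variable lowers the score. This is the behavior that makes such a statistic usable for selecting variable modules rather than single variables.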