A comparative study of model-centric and data-centric approaches in the development of cardiovascular disease risk prediction models in the UK Biobank.
Mamouei, Mohammad; Fisher, Thomas; Rao, Shishir; Li, Yikuan; Salimi-Khorshidi, Ghomalreza; Rahimi, Kazem.
Affiliation
  • Mamouei M; Deep Medicine, Oxford Martin School, University of Oxford, 1st Floor, Hayes House, 75 George Street, Oxford OX1 2BQ, UK.
  • Fisher T; Nuffield Department of Women's and Reproductive Health, Medical Science Division, University of Oxford, Oxford, UK.
  • Rao S; Deep Medicine, Oxford Martin School, University of Oxford, 1st Floor, Hayes House, 75 George Street, Oxford OX1 2BQ, UK.
  • Li Y; Nuffield Department of Women's and Reproductive Health, Medical Science Division, University of Oxford, Oxford, UK.
  • Salimi-Khorshidi G; Deep Medicine, Oxford Martin School, University of Oxford, 1st Floor, Hayes House, 75 George Street, Oxford OX1 2BQ, UK.
  • Rahimi K; Nuffield Department of Women's and Reproductive Health, Medical Science Division, University of Oxford, Oxford, UK.
Eur Heart J Digit Health ; 4(4): 337-346, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37538143
ABSTRACT

Aims:

A diverse set of factors influences cardiovascular diseases (CVDs), but a systematic investigation of the interplay between these determinants and the contribution of each to CVD incidence prediction is largely missing from the literature. In this study, we leverage one of the most comprehensive biobanks worldwide, the UK Biobank, to investigate the contribution of different risk factor categories to more accurate incidence predictions in the overall population and by sex, age group, and ethnicity.

Methods and results:

The investigated categories include the history of medical events, behavioural factors, socioeconomic factors, environmental factors, and measurements. We included data from a cohort of 405 257 participants aged 37-73 years and trained various machine learning and deep learning models on different subsets of risk factors to predict CVD incidence. Each model was trained on the complete set of predictors and on subsets from which one category at a time was excluded. The results were benchmarked against QRISK3. The findings highlight that (i) leveraging a more comprehensive medical history substantially improves model performance: relative to QRISK3, the best-performing models improved discrimination by 3.78% and precision by 1.80%; (ii) both model- and data-centric approaches are necessary to improve predictive performance: the benefits of using a comprehensive history of diseases were far more pronounced when a neural sequence model, BEHRT, was used, which highlights the importance of the temporality of medical events that existing clinical risk models fail to capture; and (iii) besides the history of diseases, socioeconomic factors and measurements had small but significant independent contributions to the predictive performance.
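
The exclusion design described above (one model on all predictors, plus one per excluded category) can be sketched as follows. This is a minimal illustration only, not the authors' code: the five category names come from the abstract, while the feature names and the helper `ablation_feature_sets` are hypothetical placeholders.

```python
# Hypothetical feature names; the abstract names only the five categories.
CATEGORIES = {
    "medical_history": ["prior_diagnoses"],
    "behavioural": ["smoking_status"],
    "socioeconomic": ["deprivation_index"],
    "environmental": ["air_pollution"],
    "measurements": ["systolic_bp"],
}


def ablation_feature_sets(categories):
    """Return the full predictor set plus one subset per excluded category."""
    # Full set: every feature from every category.
    full = [f for feats in categories.values() for f in feats]
    sets = {"all": full}
    # One subset per category, with that category's features left out.
    for excluded in categories:
        sets[f"without_{excluded}"] = [
            f
            for name, feats in categories.items()
            if name != excluded
            for f in feats
        ]
    return sets


feature_sets = ablation_feature_sets(CATEGORIES)
for name, feats in feature_sets.items():
    print(name, feats)
```

Each resulting subset would then be used to train and evaluate a model, so that the drop in performance relative to the full set indicates how much the excluded category contributes.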

Conclusion:

These findings emphasize the need for considering broad determinants and novel modelling approaches to enhance CVD incidence prediction.

Full text: 1 Database: MEDLINE Study type: Etiology_studies / Prognostic_studies / Risk_factors_studies Language: En Publication year: 2023 Document type: Article