Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data.
BMC Med Inform Decis Mak; 20(1): 252, 2020 10 02.
Article in English | MEDLINE | ID: mdl-33008368
BACKGROUND: With cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models to a baseline logistic regression using only 'known' risk factors in predicting incident myocardial infarction (MI) from harmonized electronic health record (EHR) data. METHODS: Large-scale case-control study with an outcome of 6-month incident MI, performed on 2.27 million patients within the UCHealth system, using the top 800 of an initial 52 k procedures, diagnoses, and medications, harmonized to the Observational Medical Outcomes Partnership common data model. We compared several over- and undersampling techniques to address the class imbalance in the dataset. We compared regularized logistic regression, random forest, boosted gradient machines, and shallow and deep neural networks. The baseline model for comparison was a logistic regression using a limited set of 'known' risk factors for MI. Hyper-parameters were identified using 10-fold cross-validation. RESULTS: A total of 20,591 patients were diagnosed with MI, compared with 2.25 million who were not. A deep neural network with random undersampling provided superior classification compared with other methods. However, the benefit of the deep neural network was only moderate, showing an F1 score of 0.092 and AUC of 0.835, compared to a logistic regression model using only 'known' risk factors. Calibration for all models was poor despite adequate discrimination, due to overfitting from the low frequency of the event of interest. CONCLUSIONS: Our study suggests that deep neural networks (DNN) may not offer substantial benefit when trained on harmonized data, compared to traditional methods using established risk factors for MI.
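The abstract reports a low F1 score alongside a reasonable AUC, which is a common pattern when the outcome is rare. The following is a minimal sketch, not the authors' code, of two ideas it mentions: random undersampling of the majority class, and how F1 depends on outcome prevalence. The function names and the sensitivity and specificity values are hypothetical and chosen only for illustration; the prevalence is derived from the counts in the abstract (20,591 of 2.27 million).

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices keeping every positive (1) and an
    equally sized random sample of negatives (0)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + kept_neg)


def f1_at_prevalence(sensitivity, specificity, prevalence):
    """F1 score of a classifier with a fixed sensitivity and specificity,
    evaluated on a population with the given outcome prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return 2 * tp / (2 * tp + fp + fn)


# Prevalence implied by the abstract: 20,591 MI cases among 2.27 million patients.
prevalence = 20_591 / 2_270_000

# Hypothetical operating point (an assumption, not a reported result).
sens, spec = 0.75, 0.75

f1_balanced = f1_at_prevalence(sens, spec, 0.5)        # undersampled, 1:1 classes
f1_real = f1_at_prevalence(sens, spec, prevalence)     # original class ratio

print(f"prevalence: {prevalence:.4f}")
print(f"F1 on balanced data: {f1_balanced:.3f}")
print(f"F1 at real prevalence: {f1_real:.3f}")
```

Under these assumptions, the same classifier scores a much lower F1 at the real prevalence than on the balanced sample, because false positives from the large negative class dominate precision. A model trained on undersampled data also sees an inflated event rate, which is one plausible contributor to miscalibrated probabilities on the original population.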
Full text:
1
Database:
MEDLINE
Main subject:
Electronic Health Records
/
Machine Learning
/
Myocardial Infarction
Type of study:
Diagnostic_studies
/
Incidence_studies
/
Observational_studies
/
Prognostic_studies
/
Risk_factors_studies
Limits:
Female
/
Humans
Language:
En
Journal:
BMC Med Inform Decis Mak
Journal subject:
MEDICAL INFORMATICS
Year of publication:
2020
Document type:
Article
Country of affiliation:
United States