Addressing label noise for electronic health records: insights from computer vision for tabular data.

Yang, Jenny; Triendl, Hagen; Soltan, Andrew A S; Prakash, Mangal; Clifton, David A.

BMC Med Inform Decis Mak ; 24(1): 183, 2024 Jun 27.

Article in English | MEDLINE | ID: mdl-38937744

ABSTRACT

The analysis of extensive electronic health records (EHR) datasets often calls for automated solutions, with machine learning (ML) techniques, including deep learning (DL), taking a lead role. One common task involves categorizing EHR data into predefined groups. However, the vulnerability of EHRs to noise and errors stemming from data collection processes, as well as potential human labeling errors, poses a significant risk. This risk is particularly prominent during the training of DL models, where the possibility of overfitting to noisy labels can have serious repercussions in healthcare. Despite the well-documented existence of label noise in EHR data, few studies have tackled this challenge within the EHR domain. Our work addresses this gap by adapting computer vision (CV) algorithms to mitigate the impact of label noise in DL models trained on EHR data. Notably, it remains uncertain whether CV methods, when applied to the EHR domain, will prove effective, given the substantial divergence between the two domains. We present empirical evidence demonstrating that these methods, whether used individually or in combination, can substantially enhance model performance when applied to EHR data, especially in the presence of noisy/incorrect labels. We validate our methods and underscore their practical utility in real-world EHR data, specifically in the context of COVID-19 diagnosis. Our study highlights the effectiveness of CV methods in the EHR domain, making a valuable contribution to the advancement of healthcare analytics and research.
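The abstract does not name the specific CV methods that were adapted. Purely as an illustration, the sketch below shows label smoothing, one widely used technique from the image-classification literature for limiting overfitting to noisy labels. The function names and the value of `epsilon` are assumptions made for this example and are not taken from the paper.

```python
import math


def smooth_labels(one_hot, epsilon=0.1):
    """Blend a one-hot target with the uniform distribution over classes.

    epsilon is a hypothetical smoothing strength chosen for illustration.
    """
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# A binary label recorded as class 1 (for example, "positive").
hard = [0.0, 1.0]
soft = smooth_labels(hard, epsilon=0.1)

# A model that confidently predicts class 0 for this record.
probs = [0.99, 0.01]

# If the recorded label is wrong, the smoothed target penalises the
# model's disagreement less severely than the hard target does.
loss_hard = cross_entropy(hard, probs)
loss_soft = cross_entropy(soft, probs)
```

With smoothed targets, a single mislabelled record contributes a smaller loss, which is one reason such methods can help when labels are noisy.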

Subject(s)

Electronic Health Records , Humans , Deep Learning , COVID-19 , Machine Learning

Mitigating machine learning bias between high income and low-middle income countries for enhanced model fairness and generalizability.

Yang, Jenny; Clifton, Lei; Dung, Nguyen Thanh; Phong, Nguyen Thanh; Yen, Lam Minh; Thy, Doan Bui Xuan; Soltan, Andrew A S; Thwaites, Louise; Clifton, David A.

Sci Rep ; 14(1): 13318, 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38858466

ABSTRACT

Collaborative efforts in artificial intelligence (AI) are increasingly common between high-income countries (HICs) and low- to middle-income countries (LMICs). Given the resource limitations often encountered by LMICs, collaboration becomes crucial for pooling resources, expertise, and knowledge. Despite the apparent advantages, ensuring the fairness and equity of these collaborative models is essential, especially considering the distinct differences between LMIC and HIC hospitals. In this study, we show that collaborative AI approaches can lead to divergent performance outcomes across HIC and LMIC settings, particularly in the presence of data imbalances. Through a real-world COVID-19 screening case study, we demonstrate that implementing algorithmic-level bias mitigation methods significantly improves outcome fairness between HIC and LMIC sites while maintaining high diagnostic sensitivity. We compare our results against previous benchmarks, utilizing datasets from four independent United Kingdom Hospitals and one Vietnamese hospital, representing HIC and LMIC settings, respectively.
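The abstract does not state which fairness measure was used. As a hedged sketch, one simple way to quantify an outcome gap between sites is to compare diagnostic sensitivity per site. The site labels, the record layout, and the function names below are hypothetical and serve only to illustrate the idea.

```python
def sensitivity(y_true, y_pred):
    """True-positive rate: TP / (TP + FN). Returns NaN if there are no positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def sensitivity_gap(records):
    """Compute per-site sensitivity and the largest difference between sites.

    records: iterable of (site, true_label, predicted_label) tuples.
    """
    by_site = {}
    for site, t, p in records:
        labels, preds = by_site.setdefault(site, ([], []))
        labels.append(t)
        preds.append(p)
    per_site = {s: sensitivity(ys, ps) for s, (ys, ps) in by_site.items()}
    values = list(per_site.values())
    return per_site, max(values) - min(values)


# Toy data with made-up site labels.
records = [
    ("HIC", 1, 1), ("HIC", 1, 1), ("HIC", 1, 0), ("HIC", 0, 0),
    ("LMIC", 1, 1), ("LMIC", 1, 0), ("LMIC", 0, 0),
]
per_site, gap = sensitivity_gap(records)
```

A smaller gap indicates more similar detection rates across settings; a bias mitigation method would aim to shrink it without lowering sensitivity overall.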

Subject(s)

COVID-19 , Developing Countries , Machine Learning , Humans , COVID-19/epidemiology , COVID-19/virology , Developed Countries , SARS-CoV-2/isolation & purification , United Kingdom , Bias , Vietnam , Income , Algorithms

Deep reinforcement learning for multi-class imbalanced training: applications in healthcare.

Yang, Jenny; El-Bouri, Rasheed; O'Donoghue, Odhran; Lachapelle, Alexander S; Soltan, Andrew A S; Eyre, David W; Lu, Lei; Clifton, David A.

Mach Learn ; 113(5): 2655-2674, 2024.

Article in English | MEDLINE | ID: mdl-38708086

ABSTRACT

With the rapid growth of memory and computing power, datasets are becoming increasingly complex and imbalanced. This is especially severe in the context of clinical data, where there may be one rare event for many cases in the majority class. We introduce an imbalanced classification framework, based on reinforcement learning, for training extremely imbalanced data sets, and extend it for use in multi-class settings. We combine dueling and double deep Q-learning architectures, and formulate a custom reward function and episode-training procedure, specifically with the capability of handling multi-class imbalanced training. Using real-world clinical case studies, we demonstrate that our proposed framework outperforms current state-of-the-art imbalanced learning methods, achieving more fair and balanced classification, while also significantly improving the prediction of minority classes. Supplementary Information: The online version contains supplementary material available at 10.1007/s10994-023-06481-z.
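The two building blocks named in the abstract, dueling and double deep Q-learning, can be sketched in their standard textbook form as follows. This is a minimal illustration with plain lists standing in for network outputs; it does not reproduce the paper's custom reward function or episode-training procedure, and all names and numbers are assumptions.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """Double DQN target.

    The online network selects the next action; the target network
    evaluates it. This decoupling reduces overestimation of Q-values.
    """
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


# In a classification setting, each action could correspond to predicting
# one class (an assumption made for this example).
q_values = dueling_q(1.0, [0.5, -0.5, 0.0])
target = double_dqn_target(
    reward=1.0,
    gamma=0.9,
    next_q_online=[0.2, 0.8, 0.1],
    next_q_target=[1.0, 0.5, 2.0],
    done=False,
)
```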

A scalable federated learning solution for secondary care using low-cost microcomputing: privacy-preserving development and evaluation of a COVID-19 screening test in UK hospitals.

Soltan, Andrew A S; Thakur, Anshul; Yang, Jenny; Chauhan, Anoop; D'Cruz, Leon G; Dickson, Phillip; Soltan, Marina A; Thickett, David R; Eyre, David W; Zhu, Tingting; Clifton, David A.

Lancet Digit Health ; 6(2): e93-e104, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38278619

ABSTRACT

BACKGROUND: Multicentre training could reduce biases in medical artificial intelligence (AI); however, ethical, legal, and technical considerations can constrain the ability of hospitals to share data. Federated learning enables institutions to participate in algorithm development while retaining custody of their data but uptake in hospitals has been limited, possibly as deployment requires specialist software and technical expertise at each site. We previously developed an artificial intelligence-driven screening test for COVID-19 in emergency departments, known as CURIAL-Lab, which uses vital signs and blood tests that are routinely available within 1 h of a patient's arrival. Here we aimed to federate our COVID-19 screening test by developing an easy-to-use embedded system (which we introduce as full-stack federated learning) to train and evaluate machine learning models across four UK hospital groups without centralising patient data.

METHODS: We supplied a Raspberry Pi 4 Model B preloaded with our federated learning software pipeline to four National Health Service (NHS) hospital groups in the UK: Oxford University Hospitals NHS Foundation Trust (OUH; through the locally linked research University, University of Oxford), University Hospitals Birmingham NHS Foundation Trust (UHB), Bedfordshire Hospitals NHS Foundation Trust (BH), and Portsmouth Hospitals University NHS Trust (PUH). OUH, PUH, and UHB participated in federated training, training a deep neural network and logistic regressor over 150 rounds to form and calibrate a global model to predict COVID-19 status, using clinical data from patients admitted before the pandemic (COVID-19-negative) and testing positive for COVID-19 during the first wave of the pandemic. We conducted a federated evaluation of the global model for admissions during the second wave of the pandemic at OUH, PUH, and externally at BH. For OUH and PUH, we additionally performed local fine-tuning of the global model using the sites' individual training data, forming a site-tuned model, and evaluated the resultant model for admissions during the second wave of the pandemic. This study included data collected between Dec 1, 2018, and March 1, 2021; the exact date ranges used varied by site. The primary outcome was overall model performance, measured as the area under the receiver operating characteristic curve (AUROC). Removable micro secure digital (microSD) storage was destroyed on study completion.

FINDINGS: Clinical data from 130 941 patients (1772 COVID-19-positive), routinely collected across three hospital groups (OUH, PUH, and UHB), were included in federated training. The evaluation step included data from 32 986 patients (3549 COVID-19-positive) attending OUH, PUH, or BH during the second wave of the pandemic. Federated training of a global deep neural network classifier improved upon performance of models trained locally in terms of AUROC by a mean of 27·6% (SD 2·2): AUROC increased from 0·574 (95% CI 0·560-0·589) at OUH and 0·622 (0·608-0·637) at PUH using the locally trained models to 0·872 (0·862-0·882) at OUH and 0·876 (0·865-0·886) at PUH using the federated global model. Performance improvement was smaller for a logistic regression model, with a mean increase in AUROC of 13·9% (0·5%). During federated external evaluation at BH, AUROC for the global deep neural network model was 0·917 (0·893-0·942), with 89·7% sensitivity (83·6-93·6) and 76·6% specificity (73·9-79·1). Site-specific tuning of the global model did not significantly improve performance (change in AUROC <0·01).

INTERPRETATION: We developed an embedded system for federated learning, using microcomputing to optimise for ease of deployment. We deployed full-stack federated learning across four UK hospital groups to develop a COVID-19 screening test without centralising patient data. Federation improved model performance, and the resultant global models were generalisable. Full-stack federated learning could enable hospitals to contribute to AI development at low cost and without specialist technical expertise at each site.

FUNDING: The Wellcome Trust, University of Oxford Medical and Life Sciences Translational Fund.
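The abstract does not specify how site updates were combined in each training round. As a sketch under that caveat, the function below shows federated averaging (FedAvg), a common aggregation rule in which a central server averages the parameters returned by each site, weighted by local sample count. The parameter vectors and sample counts are made-up values, not figures from the study.

```python
def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of per-site parameter vectors (FedAvg).

    site_weights: list of equal-length parameter lists, one per site.
    site_sizes: number of local training samples at each site.
    Only parameters leave each site; patient records stay local.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# One hypothetical round with three sites and a two-parameter model.
local_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
local_sizes = [100, 100, 200]
global_model = federated_average(local_models, local_sizes)
```

In a full training loop, the resulting global parameters would be sent back to each site as the starting point for the next round.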

Subject(s)

COVID-19 , Secondary Care , Humans , Artificial Intelligence , Privacy , State Medicine , COVID-19/diagnosis , Hospitals , United Kingdom
