Results 1 - 20 of 34
1.
JAMA Ophthalmol ; 142(1): 15-23, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38019503

ABSTRACT

Importance: Clinical trial results of topical atropine eye drops for childhood myopia control have shown inconsistent outcomes across short-term studies, with little long-term safety or other outcomes reported. Objective: To report the long-term safety and outcomes of topical atropine for childhood myopia control. Design, Setting, and Participants: This prospective, double-masked observational study of the Atropine for the Treatment of Myopia (ATOM) 1 and ATOM2 randomized clinical trials took place at 2 single centers and included adults reviewed in 2021 through 2022 from the ATOM1 study (atropine 1% vs placebo; 1999 through 2003) and the ATOM2 study (atropine 0.01% vs 0.1% vs 0.5%; 2006 through 2012). Main Outcome Measures: Change in cycloplegic spherical equivalent (SE) with axial length (AL); incidence of ocular complications. Results: Among the 400 participants in each original cohort, the study team evaluated 71 of 400 ATOM1 adult participants (17.8% of original cohort; study age, mean [SD], 30.5 [1.2] years; 40.6% female) and 158 of 400 ATOM2 adult participants (39.5% of original cohort; study age, mean [SD], 24.5 [1.5] years; 42.9% female) whose baseline characteristics (SE and AL) were representative of the original cohort. In this study, evaluating ATOM1 participants, the mean (SD) SE and AL were -5.20 (2.46) diopters (D), 25.87 (1.23) mm and -6.00 (1.63) D, 25.90 (1.21) mm in the 1% atropine-treated and placebo groups, respectively (difference of SE, 0.80 D; 95% CI, -0.25 to 1.85 D; P = .13; difference of AL, -0.03 mm; 95% CI, -0.65 to 0.58 mm; P = .92). In ATOM2 participants, the mean (SD) SE and AL were -6.40 (2.21) D, 26.25 (1.34) mm; -6.81 (1.92) D, 26.28 (0.99) mm; and -7.19 (2.87) D, 26.31 (1.31) mm in the 0.01%, 0.1%, and 0.5% atropine groups, respectively. 
There was no difference in the 20-year incidence of cataract/lens opacities, myopic macular degeneration, or parapapillary atrophy (β/γ zone) comparing the 1% atropine-treated group vs the placebo group. Conclusions and Relevance: Among approximately one-quarter of the original participants, use of short-term topical atropine eye drops ranging from 0.01% to 1.0% for a duration of 2 to 4 years during childhood was not associated with differences in final refractive errors 10 to 20 years after treatment. There was no increased incidence of treatment- or myopia-related ocular complications in the 1% atropine-treated group vs the placebo group. These findings may affect the design of future clinical trials, as further studies are required to investigate the duration and concentration of atropine for childhood myopia control.


Subject(s)
Cataract , Genetic Diseases, X-Linked , Myopia, Degenerative , Myopia , Humans , Female , Infant , Male , Atropine/administration & dosage , Prospective Studies , Ophthalmic Solutions/administration & dosage , Administration, Topical , Refraction, Ocular , Myopia, Degenerative/drug therapy
2.
Clin Exp Emerg Med ; 10(4): 354-362, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38012816

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) have potential to revolutionize emergency medical care by enhancing triage systems, improving diagnostic accuracy, refining prognostication, and optimizing various aspects of clinical care. However, as clinicians often lack AI expertise, they might perceive AI as a "black box," leading to trust issues. To address this, "explainable AI," which teaches AI functionalities to end-users, is important. This review presents the definitions, importance, and role of explainable AI, as well as potential challenges in emergency medicine. First, we introduce the terms explainability, interpretability, and transparency of AI models. These terms sound similar but have different roles in discussion of AI. Second, we indicate that explainable AI is required in clinical settings for reasons of justification, control, improvement, and discovery and provide examples. Third, we describe three major categories of explainability: pre-modeling explainability, interpretable models, and post-modeling explainability and present examples (especially for post-modeling explainability), such as visualization, simplification, text justification, and feature relevance. Last, we show the challenges of implementing AI and ML models in clinical settings and highlight the importance of collaboration between clinicians, developers, and researchers. This paper summarizes the concept of "explainable AI" for emergency medicine clinicians. This review may help clinicians understand explainable AI in emergency contexts.

3.
J Biomed Inform ; 146: 104485, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37660960

ABSTRACT

OBJECTIVE: We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. MATERIALS AND METHODS: The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess FedScore's performance, we built a hypothetical global scoring system for mortality prediction within 30 days after a visit to an emergency department using 10 simulated sites divided from a tertiary hospital in Singapore. We employed a pre-existing score generator to construct 10 local scoring systems independently at each site and we also developed a scoring system using centralized data for comparison. RESULTS: We compared the acquired FedScore model's performance with that of other scoring models using the receiver operating characteristic (ROC) analysis. The FedScore model achieved an average area under the curve (AUC) value of 0.763 across all sites, with a standard deviation (SD) of 0.020. We also calculated the average AUC values and SDs for each local model, and the FedScore model showed promising accuracy and stability with a high average AUC value which was closest to the one of the pooled model and SD which was lower than that of most local models. CONCLUSION: This study demonstrates that FedScore is a privacy-preserving scoring system generator with potentially good generalizability.
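
The summary statistic reported above (average AUC across sites with its SD) can be illustrated with a small sketch. This is not the FedScore implementation: the site names and data are synthetic, the `auc` helper is hypothetical, and the use of the sample SD is an assumption, since the abstract does not say which SD was used.

```python
from statistics import mean, stdev

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic toy data: site name -> (risk scores, observed binary outcomes).
sites = {
    "site_a": ([1, 2, 3, 4], [0, 0, 1, 1]),
    "site_b": ([1, 3, 2, 4], [0, 0, 1, 1]),
    "site_c": ([2, 2, 3, 1], [0, 1, 1, 0]),
}

# Evaluate the same scoring model at every site, then summarize.
site_aucs = {name: auc(s, y) for name, (s, y) in sites.items()}
mean_auc = mean(list(site_aucs.values()))
sd_auc = stdev(list(site_aucs.values()))  # sample SD (assumption)
```

A low SD across sites is what the abstract refers to as stability: the model discriminates about equally well wherever it is applied.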

4.
NPJ Digit Med ; 6(1): 172, 2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37709945

ABSTRACT

Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the fairness of such data-driven insights remains a concern in high-stakes fields. Despite extensive developments, issues of AI fairness in clinical contexts have not been adequately addressed. A fair model is normally expected to perform equally across subgroups defined by sensitive variables (e.g., age, gender/sex, race/ethnicity, socio-economic status, etc.). Various fairness measurements have been developed to detect differences between subgroups as evidence of bias, and bias mitigation methods are designed to reduce the differences detected. This perspective of fairness, however, is misaligned with some key considerations in clinical contexts. The set of sensitive variables used in healthcare applications must be carefully examined for relevance and justified by clear clinical motivations. In addition, clinical AI fairness should closely investigate the ethical implications of fairness measurements (e.g., potential conflicts between group- and individual-level fairness) to select suitable and objective metrics. Generally defining AI fairness as "equality" is not necessarily reasonable in clinical settings, as differences may have clinical justifications and do not indicate biases. Instead, "equity" would be an appropriate objective of clinical AI fairness. Moreover, clinical feedback is essential to developing fair and well-performing AI models, and efforts should be made to actively involve clinicians in the process. The adaptation of AI fairness towards healthcare is not self-evident due to misalignments between technical developments and clinical considerations. Multidisciplinary collaboration between AI researchers, clinicians, and ethicists is necessary to bridge the gap and translate AI fairness into real-life benefits.

5.
J Am Med Inform Assoc ; 30(12): 2041-2049, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37639629

ABSTRACT

OBJECTIVES: Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medical data, identifies contemporary limitations, and discusses potential innovations. MATERIALS AND METHODS: We searched 5 databases, SCOPUS, MEDLINE, Web of Science, Embase, and CINAHL, to identify articles that applied FL to structured medical data and reported results following the PRISMA guidelines. Each selected publication was evaluated from 3 primary perspectives, including data quality, modeling strategies, and FL frameworks. RESULTS: Out of the 1193 papers screened, 34 met the inclusion criteria, with each article consisting of one or more studies that used FL to handle structured clinical/medical data. Of these, 24 utilized data acquired from electronic health records, with clinical predictions and association studies being the most common clinical research tasks that FL was applied to. Only one article exclusively explored the vertical FL setting, while the remaining 33 explored the horizontal FL setting, with only 14 discussing comparisons between single-site (local) and FL (global) analysis. CONCLUSIONS: The existing FL applications on structured medical data lack sufficient evaluations of clinically meaningful benefits, particularly when compared to single-site analyses. Therefore, it is crucial for future FL applications to prioritize clinical motivations and develop designs and methodologies that can effectively support and aid clinical practice and research.


Subject(s)
Electronic Health Records , Learning , Data Accuracy , Databases, Factual , Motivation
6.
Artif Intell Med ; 142: 102587, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37316097

ABSTRACT

OBJECTIVE: The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. In response to the increasing diversity and complexity of data, many researchers have developed deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus on the types of data, intending to assist healthcare researchers from various disciplines in dealing with missing data. MATERIALS AND METHODS: We searched five databases (MEDLINE, Web of Science, Embase, CINAHL, and Scopus) for articles published prior to February 8, 2023 that described the use of DL-based models for imputation. We examined selected articles from four perspectives: data types, model backbones (i.e., main architectures), imputation strategies, and comparisons with non-DL-based methods. Based on data types, we created an evidence map to illustrate the adoption of DL models. RESULTS: Out of 1822 articles, a total of 111 were included, of which tabular static data (29%, 32/111) and temporal data (40%, 44/111) were the most frequently investigated. Our findings revealed a discernible pattern in the choice of model backbones and data types, for example, the dominance of autoencoder and recurrent neural networks for tabular temporal data. The discrepancy in imputation strategy usage among data types was also observed. The "integrated" imputation strategy, which solves the imputation task simultaneously with downstream tasks, was most popular for tabular temporal data (52%, 23/44) and multi-modal data (56%, 5/9). Moreover, DL-based imputation methods yielded a higher level of imputation accuracy than non-DL methods in most studies. CONCLUSION: The DL-based imputation models are a family of techniques, with diverse network structures. Their designation in healthcare is usually tailored to data types with different characteristics. 
Although DL-based imputation models may not be superior to conventional approaches across all datasets, it is highly possible for them to achieve satisfactory results for a particular data type or dataset. There are, however, still issues with regard to portability, interpretability, and fairness associated with current DL-based imputation models.


Subject(s)
Deep Learning , Databases, Factual , MEDLINE , Neural Networks, Computer
7.
STAR Protoc ; 4(2): 102302, 2023 May 12.
Article in English | MEDLINE | ID: mdl-37178115

ABSTRACT

The AutoScore framework can automatically generate data-driven clinical scores in various clinical applications. Here, we present a protocol for developing clinical scoring systems for binary, survival, and ordinal outcomes using the open-source AutoScore package. We describe steps for package installation, detailed data processing and checking, and variable ranking. We then explain how to iterate through steps for variable selection, score generation, fine-tuning, and evaluation to generate understandable and explainable scoring systems using data-driven evidence and clinical knowledge. For complete details on the use and execution of this protocol, please refer to Xie et al. (2020), Xie et al. (2022), Saffari et al. (2022), and the online tutorial https://nliulab.github.io/AutoScore/.

8.
Eur J Neurol ; 30(6): 1658-1666, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36912424

ABSTRACT

BACKGROUND AND PURPOSE: A broad list of variables associated with mild cognitive impairment (MCI) in Parkinson disease (PD) have been investigated separately. However, no study has yet included all of them to assess variable importance. Shapley variable importance cloud (ShapleyVIC) can robustly assess variable importance while accounting for correlation between variables. The objectives of this study were (i) to prioritize the important variables associated with PD-MCI and (ii) to explore new blood biomarkers related to PD-MCI. METHODS: ShapleyVIC-assisted variable selection was used to identify a subset of variables from 41 variables potentially associated with PD-MCI in a cross-sectional study. Backward selection was used to further identify the variables associated with PD-MCI. Relative risk was used to quantify the association of final associated variables and PD-MCI in the final multivariable log-binomial regression model. RESULTS: Among 41 variables analysed, 22 variables were identified as significantly important variables associated with PD-MCI and eight variables were subsequently selected in the final model, indicating fewer years of education, shorter history of hypertension, higher Movement Disorder Society-Unified Parkinson's Disease Rating Scale motor score, higher levels of triglyceride (TG) and apolipoprotein A1 (ApoA1), and SNCA rs6826785 noncarrier status were associated with increased risk of PD-MCI (p < 0.05). CONCLUSIONS: Our study highlighted the strong association between TG, ApoA1, SNCA rs6826785, and PD-MCI using a machine learning approach. Screening and management of high TG and ApoA1 levels might help prevent cognitive impairment in early PD patients. SNCA rs6826785 could be a novel therapeutic target for PD-MCI. ShapleyVIC-assisted variable selection is a novel and robust alternative to traditional approaches for future clinical studies to prioritize the variables of interest.


Subject(s)
Cognitive Dysfunction , Parkinson Disease , Humans , Parkinson Disease/psychology , Cross-Sectional Studies , Neuropsychological Tests , Cognitive Dysfunction/psychology , Mental Status and Dementia Tests
9.
BMC Med Res Methodol ; 22(1): 286, 2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36333672

ABSTRACT

BACKGROUND: Risk prediction models are useful tools in clinical decision-making which help with risk stratification and resource allocations and may lead to better health care for patients. AutoScore is a machine learning-based automatic clinical score generator for binary outcomes. This study aims to expand the AutoScore framework to provide a tool for interpretable risk prediction for ordinal outcomes. METHODS: The AutoScore-Ordinal framework is generated using the same 6 modules of the original AutoScore algorithm including variable ranking, variable transformation, score derivation (from proportional odds models), model selection, score fine-tuning, and model evaluation. To illustrate the AutoScore-Ordinal performance, the method was applied to electronic health records data from the emergency department at Singapore General Hospital from 2008 to 2017. The model was trained on 70% of the data, validated on 10% and tested on the remaining 20%. RESULTS: This study included 445,989 inpatient cases, where the distribution of the ordinal outcome was 80.7% alive without 30-day readmission, 12.5% alive with 30-day readmission, and 6.8% died inpatient or by day 30 post discharge. Two point-based risk prediction models were developed using two sets of 8 predictor variables identified by the flexible variable selection procedure. The two models indicated reasonably good performance measured by mean area under the receiver operating characteristic curve (0.758 and 0.793) and generalized c-index (0.737 and 0.760), which were comparable to alternative models. CONCLUSION: AutoScore-Ordinal provides an automated and easy-to-use framework for development and validation of risk prediction models for ordinal outcomes, which can systematically identify potential predictors from high-dimensional data.
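
The 70%/10%/20% partition described in the methods can be sketched as follows. The abstract does not state how cases were assigned to each subset, so the simple random shuffle here is an assumption, and `split_indices` is a hypothetical helper, not part of the AutoScore package.

```python
import random

def split_indices(n, train=0.7, valid=0.1, seed=0):
    """Randomly partition row indices 0..n-1 into train, validation, and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split reproducible
    n_train = round(n * train)
    n_valid = round(n * valid)
    # Whatever remains after train and validation becomes the test set.
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]

train_idx, valid_idx, test_idx = split_indices(1000)
```

The validation subset is typically used for choices such as the number of variables, while the test subset is held back for the final performance estimate.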


Subject(s)
Aftercare , Patient Discharge , Humans , Machine Learning , Patient Readmission , Electronic Health Records , Retrospective Studies
10.
EClinicalMedicine ; 48: 101422, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35706500

ABSTRACT

Background: Return of spontaneous circulation (ROSC) before arrival at the emergency department is an early indicator of successful resuscitation in out-of-hospital cardiac arrest (OHCA). Several ROSC prediction scores have been developed with European cohorts, with unclear applicability in Asian settings. We aimed to develop an interpretable prehospital ROSC (P-ROSC) score for ROSC prediction based on patients with OHCA in Asia. Methods: This retrospective study examined patients who suffered from OHCA between Jan 1, 2009 and Jun 17, 2018 using data recorded in the Pan-Asian Resuscitation Outcomes Study (PAROS) registry. AutoScore, an interpretable machine learning framework, was used to develop P-ROSC. On the same cohort, the P-ROSC was compared with two clinical scores, the RACA and the UB-ROSC. The predictive power was evaluated using the area under the curve (AUC) in the receiver operating characteristic analysis. Findings: 170,678 cases were included, of which 14,104 (8.26%) attained prehospital ROSC. The P-ROSC score identified a new variable, prehospital drug administration, which was not included in the RACA score or the UB-ROSC score. Using only five variables, the P-ROSC score achieved an AUC of 0.806 (95% confidence interval [CI] 0.799-0.814), outperforming both RACA and UB-ROSC with AUCs of 0.773 (95% CI 0.765-0.782) and 0.728 (95% CI 0.718-0.738), respectively. Interpretation: The P-ROSC score is a practical and easily interpreted tool for predicting the probability of prehospital ROSC. Funding: This research received funding from SingHealth Duke-NUS ACP Programme Funding (15/FY2020/P2/06-A79).

11.
Resuscitation ; 176: 42-50, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35533896

ABSTRACT

BACKGROUND: Survival with favorable neurological outcomes is an important indicator of successful resuscitation in out-of-hospital cardiac arrest (OHCA). We sought to validate the CaRdiac Arrest Survival Score (CRASS), derived using data from the German Resuscitation Registry, in predicting the likelihood of good neurological outcomes after OHCA in Singapore. METHODS: We conducted a retrospective population-based validation study among EMS-attended OHCA patients (≥18 years) in Singapore, using data from the prospective Pan-Asian Resuscitation Outcomes Study registry. Good neurological outcome was defined as a cerebral performance category of 1 or 2. To evaluate the CRASS score in light of the difference in patient characteristics, we used the default constant coefficient (0.8) and the adjusted coefficient (0.2) to calculate the probability of good neurological outcomes. RESULTS: Out of 11,404 analyzed patients recruited between April 2010 and December 2018, 260 had good and 11,144 had poor neurological function. The CRASS score demonstrated good discrimination, with an area under the curve of 0.963 (95% confidence interval: 0.952-0.974). Using the default constant coefficient of 0.8, the CRASS score consistently overestimated the predicted probability of a good outcome. Following adjustment of the coefficient to 0.2, the CRASS score showed improved calibration. CONCLUSION: CRASS demonstrated good discrimination and moderate calibration in predicting favorable neurological outcomes in the validation Singapore cohort. Our study established a good foundation for future large-scale, cross-country validations of the CRASS score in diverse sociocultural, geographical, and clinical settings.


Subject(s)
Cardiopulmonary Resuscitation , Emergency Medical Services , Out-of-Hospital Cardiac Arrest , Humans , Out-of-Hospital Cardiac Arrest/therapy , Prospective Studies , Registries , Retrospective Studies
12.
BMC Med Res Methodol ; 22(1): 157, 2022 May 30.
Article in English | MEDLINE | ID: mdl-35637431

ABSTRACT

BACKGROUND: Despite the ease of interpretation and communication of a risk ratio (RR), and several other advantages in specific settings, the odds ratio (OR) is more commonly reported in epidemiological and clinical research. This is due to the familiarity of the logistic regression model for estimating adjusted ORs from data gathered in a cross-sectional, cohort or case-control design. The preservation of the OR (but not RR) in case-control samples has contributed to the perception that it is the only valid measure of relative risk from case-control samples. For cohort or cross-sectional data, a method known as 'doubling-the-cases' provides valid estimates of RR and an expression for a robust standard error has been derived, but is not available in statistical software packages. METHODS: In this paper, we first describe the doubling-of-cases approach in the cohort setting and then extend its application to case-control studies by incorporating sampling weights and deriving an expression for a robust standard error. The performance of the estimator is evaluated using simulated data, and its application illustrated in a study of neonatal jaundice. We provide an R package that implements the method for any standard design. RESULTS: Our work illustrates that the doubling-of-cases approach for estimating an adjusted RR from cross-sectional or cohort data can also yield valid RR estimates from case-control data. The approach is straightforward to apply, involving simple modification of the data followed by logistic regression analysis. The method performed well for case-control data from simulated cohorts with a range of prevalence rates. In the application to neonatal jaundice, the RR estimates were similar to those from relative risk regression, whereas the OR from naive logistic regression overestimated the RR despite the low prevalence of the outcome. 
CONCLUSIONS: By providing an R package that estimates an adjusted RR from cohort, cross-sectional or case-control studies, we have enabled the method to be easily implemented with familiar software, so that investigators are not limited to reporting an OR and can examine the RR when it is of interest.
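
A minimal sketch of why the doubling-of-cases modification works for cohort data, using a made-up 2x2 table. The counts are illustrative and not taken from the study, the function names are hypothetical, and this is not the authors' R package. Each case is appended again as a non-case, so the odds in the augmented data equal the risk in the original data, and the odds ratio from the augmented data equals the risk ratio. The robust standard error and the case-control sampling weights described above are not shown.

```python
from fractions import Fraction

def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk ratio from cohort counts (cases and group totals by exposure)."""
    return Fraction(cases_exp, n_exp) / Fraction(cases_unexp, n_unexp)

def odds_ratio_after_doubling(cases_exp, n_exp, cases_unexp, n_unexp):
    """Odds ratio after appending a copy of every case relabelled as a non-case."""
    # Augmented non-cases = original non-cases + duplicated cases = group total.
    noncases_exp = (n_exp - cases_exp) + cases_exp
    noncases_unexp = (n_unexp - cases_unexp) + cases_unexp
    return Fraction(cases_exp, noncases_exp) / Fraction(cases_unexp, noncases_unexp)

# Illustrative counts only: 30 of 100 exposed and 10 of 100 unexposed are cases.
rr = risk_ratio(30, 100, 10, 100)
or_doubled = odds_ratio_after_doubling(30, 100, 10, 100)
naive_or = Fraction(30, 70) / Fraction(10, 90)  # odds ratio on the unmodified data
```

In this toy table the doubled-data odds ratio matches the risk ratio exactly, while the naive odds ratio is larger. With covariates, the same data modification would be followed by an ordinary logistic regression fit, as the abstract describes.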


Subject(s)
Jaundice, Neonatal , Cohort Studies , Cross-Sectional Studies , Humans , Infant, Newborn , Logistic Models , Odds Ratio
13.
Patterns (N Y) ; 3(4): 100452, 2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35465224

ABSTRACT

Interpretable machine learning has been focusing on explaining final models that optimize performance. The state-of-the-art Shapley additive explanations (SHAP) locally explains the variable impact on individual predictions and has recently been extended to provide global assessments across the dataset. Our work further extends "global" assessments to a set of models that are "good enough" and are practically as relevant as the final model to a prediction task. The resulting Shapley variable importance cloud consists of Shapley-based importance measures from each good model and pools information across models to provide an overall importance measure, with uncertainty explicitly quantified to support formal statistical inference. We developed visualizations to highlight the uncertainty and to illustrate its implications to practical inference. Building on a common theoretical basis, our method seamlessly complements the widely adopted SHAP assessments of a single final model to avoid biased inference, which we demonstrate in two experiments using recidivism prediction data and clinical data.

14.
J Biomed Inform ; 129: 104072, 2022 May.
Article in English | MEDLINE | ID: mdl-35421602

ABSTRACT

BACKGROUND: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events. METHODS: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches to predict inpatient mortality. RESULTS: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.801). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. 
Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models. CONCLUSIONS: We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.
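
Balanced accuracy, defined above as the mean of sensitivity and specificity, can be sketched as below. The confusion-matrix counts are invented to show why plain accuracy misleads for rare events; they are not results from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true-positive rate) and specificity (true-negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def accuracy(tp, fn, tn, fp):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

# Invented example: 10 events among 1000 patients, and a model that never predicts an event.
acc = accuracy(tp=0, fn=10, tn=990, fp=0)
bal_acc = balanced_accuracy(tp=0, fn=10, tn=990, fp=0)
```

Here plain accuracy looks high because almost all patients are non-events, whereas balanced accuracy shows the model is no better than chance at separating the two classes.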


Subject(s)
Algorithms , Machine Learning , Clinical Decision-Making , Logistic Models , ROC Curve
15.
EClinicalMedicine ; 45: 101315, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35284804

ABSTRACT

Background: Emergency readmission poses an additional burden on both patients and healthcare systems. Risk stratification is the first step of transitional care interventions targeted at reducing readmission. To accurately predict the short- and intermediate-term risks of readmission and provide information for further temporal risk stratification, we developed and validated an interpretable machine learning risk scoring system. Methods: In this retrospective study, all emergency admission episodes from January 1st 2009 to December 31st 2016 at a tertiary hospital in Singapore were assessed. The primary outcome was time to emergency readmission within 90 days post discharge. The Score for Emergency ReAdmission Prediction (SERAP) tool was derived via an interpretable machine learning-based system for time-to-event outcomes. SERAP is a six-variable survival score, and takes the number of emergency admissions last year, age, history of malignancy, history of renal diseases, serum creatinine level, and serum albumin level during index admission into consideration. Findings: A total of 293,589 ED admission episodes were finally included in the whole cohort. Among them, 203,748 episodes were included in the training cohort, 50,937 episodes in the validation cohort, and 38,904 in the testing cohort. Readmission within 90 days was documented in 80,213 (27.3%) episodes, with a median time to emergency readmission of 22 days (Interquartile range: 8-47). For different time points, the readmission rates observed in the whole cohort were 6.7% at 7 days, 10.6% at 14 days, 13.6% at 21 days, 16.4% at 30 days, and 23.0% at 60 days. In the testing cohort, the SERAP achieved an integrated area under the curve of 0.737 (95% confidence interval: 0.730-0.743). 
For a specific 30-day readmission prediction, SERAP outperformed the LACE index (Length of stay, Acuity of admission, Charlson comorbidity index, and Emergency department visits in past six months) and the HOSPITAL score (Hemoglobin at discharge, discharge from an Oncology service, Sodium level at discharge, Procedure during the index admission, Index Type of admission, number of Admissions during the last 12 months, and Length of stay). Besides 30-day readmission, SERAP can predict readmission rates at any time point during the 90-day period. Interpretation: Better performance in risk prediction was achieved by the SERAP than other existing scores, and accurate information about time to emergency readmission was generated for further temporal risk stratification and clinical decision-making. In the future, external validation studies are needed to evaluate the SERAP in different settings and assess its real-world performance. Funding: This study was supported by the Singapore National Medical Research Council under the PULSES Center Grant, and Duke-NUS Medical School.

16.
EClinicalMedicine ; 44: 101293, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35198919

ABSTRACT

BACKGROUND: Bystander cardiopulmonary resuscitation (BCPR) is a critical component of the 'chain of survival' in reducing mortality among out-of-hospital cardiac arrest (OHCA) victims. Inconsistent findings on gender disparities among adult recipients of layperson BCPR have been reported in the literature. We aimed to fill this knowledge gap by investigating the extent of gender disparities in a cross-national setting within Pan-Asian communities. METHODS: We utilised data collected from the Pan-Asian Resuscitation Outcomes Study (PAROS), an international, multicentre, prospective study conducted between 2009 and 2018. We included all OHCA cases with non-traumatic arrest aetiology transported by emergency medical services and excluded study sites that did not consistently collect information about the location of cardiac arrest. Logistic regression was used to analyse the association between gender and BCPR, stratified by location. FINDINGS: We analysed a cohort of 56,192 OHCA cases with an overall BCPR rate of 36.2% (20,329/56,192). At public locations, the BCPR rate was 31.2% (631/2022) for female and 36.4% (3235/8892) for male OHCA victims; while at home, the rate was 38.3% (6838/17,842) for females and 35.1% (9625/27,436) for males. Controlling for site differences and several factors in multivariable logistic regression, we found females less likely to receive BCPR than males in public locations (odds ratio [OR]=0.89, 95% confidence interval [CI]: 0.70-0.99), but more likely to receive BCPR at home (OR=1.16, 95% CI: 1.11-1.21). INTERPRETATION: In Pan-Asian communities, gender differences exist in adult recipients of BCPR and differ between home and public locations. Future studies should account for additional information on bystanders and societal factors to identify targets for interventions. FUNDING: The study was supported by grants from the National Medical Research Council (NMRC/CSA/0049/2013) and Laerdal Foundation (20040).
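
The BCPR counts reported in the findings allow a quick check of the stated rates and a crude (unadjusted) odds ratio for each location. This is only a sketch: the study's reported odds ratios come from multivariable logistic regression controlling for site and other factors, so the crude values computed here are expected to differ from them. The helper name is hypothetical.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from event counts and totals."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Counts taken from the findings: females versus males receiving BCPR.
public_rate_female = 631 / 2022
public_rate_male = 3235 / 8892
public_or = crude_odds_ratio(631, 2022, 3235, 8892)
home_or = crude_odds_ratio(6838, 17842, 9625, 27436)
```

The crude odds ratios point in the same directions as the adjusted estimates: below 1 in public locations and above 1 at home.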

17.
J Biomed Inform ; 126: 103980, 2022 02.
Article in English | MEDLINE | ID: mdl-34974189

ABSTRACT

OBJECTIVE: Temporal electronic health records (EHRs) contain a wealth of information for secondary uses, such as clinical events prediction and chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. METHODS: We searched five databases (PubMed, Embase, the Institute of Electrical and Electronics Engineers [IEEE] Xplore Digital Library, the Association for Computing Machinery [ACM] Digital Library, and Web of Science) complemented with hand-searching in several prestigious computer science conference proceedings. We sought articles that reported deep learning methodologies on temporal data representation in structured EHR data from January 1, 2010, to August 30, 2020. We summarized and analyzed the selected articles from three perspectives: nature of time series, methodology, and model implementation. RESULTS: We included 98 articles related to temporal data representation using deep learning. Four major challenges were identified, including data irregularity, heterogeneity, sparsity, and model opacity. We then studied how deep learning techniques were applied to address these challenges. Finally, we discuss some open challenges arising from deep learning. CONCLUSION: Temporal EHR data present several major challenges for clinical prediction modeling and data utilization. To some extent, current deep learning solutions can address these challenges. Future studies may consider designing comprehensive and integrated solutions. Moreover, researchers should incorporate clinical domain knowledge into study designs and enhance model interpretability to facilitate clinical implementation.


Subject(s)
Deep Learning , Electronic Health Records , PubMed
18.
PLOS Digit Health ; 1(6): e0000062, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36812536

ABSTRACT

Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors to create parsimonious scores, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability in variable importance across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions across models, which is easily integrated with an automated and modularized risk score generator, AutoScore, for convenient implementation. In a study of early death or unplanned readmission after hospital discharge, ShapleyVIC selected 6 variables from 41 candidates to create a well-performing risk score, which had similar performance to a 16-variable model from machine-learning-based ranking. Our work contributes to the recent emphasis on interpretability of prediction models for high-stakes decision making, providing a disciplined solution to detailed assessment of variable importance and transparent development of parsimonious clinical risk scores.

19.
J Biomed Inform ; 125: 103959, 2022 01.
Article in English | MEDLINE | ID: mdl-34826628

ABSTRACT

BACKGROUND: Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad hoc using a few manually selected variables based on clinicians' knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. METHODS: AutoScore was previously developed as an interpretable machine learning score generator, combining the strong discriminability of machine learning with the accessibility of point-based scores. We further extended it to time-to-event outcomes and developed AutoScore-Survival for generating time-to-event scores with right-censored survival data. Random survival forest provided an efficient solution for selecting variables, and Cox regression was used for score weighting. We implemented our proposed method as an R package. We illustrated our method in a study of 90-day survival prediction for patients in intensive care units and compared its performance with other survival models, the random survival forest, and two traditional clinical scores. RESULTS: The AutoScore-Survival-derived scoring system was more parsimonious than survival models built using traditional variable selection methods (e.g., penalized likelihood approach and stepwise variable selection), and its performance was comparable to survival models using the same set of variables. Although AutoScore-Survival achieved a comparable integrated area under the curve of 0.782 (95% CI: 0.767-0.794), the integer-valued time-to-event scores generated are favorable in clinical applications because they are easier to compute and interpret. CONCLUSIONS: Our proposed AutoScore-Survival provides a robust and easy-to-use machine learning-based clinical score generator for studies of time-to-event outcomes. It gives a systematic guideline to facilitate the future development of time-to-event scores for clinical applications.


Subject(s)
Machine Learning , Humans , Likelihood Functions
20.
JAMIA Open ; 4(2): ooab033, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34142017

ABSTRACT

OBJECTIVES: The objective of this study is to facilitate monitoring of the quality of inpatient glycemic control by providing an open-source tool to compute glucometrics. To allay regulatory and privacy concerns, the tool is usable locally; no data are uploaded to the internet. MATERIALS AND METHODS: We extended code, initially developed for healthcare analytics research, to serve the clinical need for quality monitoring of diabetes. We built an application, with a graphical interface, which can be run locally without any internet connection. RESULTS: We verified that our code produced results identical to prior work in glucometrics. We extended the prior work by including additional metrics and by providing user customizability. The software has been used at an academic healthcare institution. CONCLUSION: We successfully translated code used for research methods into an open-source, user-friendly tool that hospitals may use to expedite quality measure computation for the management of inpatients with diabetes.

...