ABSTRACT
BACKGROUND: Robust and accurate prediction of severity for patients with COVID-19 is crucial for patient triaging decisions. Many proposed models were prone to either high bias risk or low-to-moderate discrimination. Some also suffered from a lack of clinical interpretability and were developed from early pandemic period data. Hence, there has been a compelling need for prediction models with better clinical applicability. OBJECTIVE: The primary objective of this study was to develop and validate a machine learning-based Robust and Interpretable Early Triaging Support (RIETS) system that predicts severity progression (defined as any of the following events: intensive care unit admission, in-hospital death, need for mechanical ventilation, or need for extracorporeal membrane oxygenation) within 15 days of hospitalization, based on routinely available clinical and laboratory biomarkers. METHODS: We included data from 5945 hospitalized patients with COVID-19 from 19 hospitals in South Korea collected between January 2020 and August 2022. For model development and external validation, the whole data set was partitioned into 2 independent cohorts by stratified random cluster sampling according to hospital type (general and tertiary care) and geographical location (metropolitan and nonmetropolitan). Machine learning models were trained and internally validated through cross-validation on the development cohort and externally validated using bootstrapped sampling on the external validation cohort. The best-performing model was selected primarily based on the area under the receiver operating characteristic curve (AUROC), and its robustness was evaluated using a bias risk assessment. For model interpretability, we used Shapley and patient clustering methods.
RESULTS: Our final model, RIETS, was developed based on a deep neural network of 11 clinical and laboratory biomarkers that are readily available within the first day of hospitalization. The features predictive of severity included lactate dehydrogenase, age, absolute lymphocyte count, dyspnea, respiratory rate, diabetes mellitus, C-reactive protein, absolute neutrophil count, platelet count, white blood cell count, and saturation of peripheral oxygen. RIETS demonstrated excellent discrimination (AUROC=0.937; 95% CI 0.935-0.938) with high calibration (integrated calibration index=0.041), satisfied all the criteria of low bias risk in a risk assessment tool, and provided detailed interpretations of model parameters and patient clusters. In addition, RIETS showed potential for transportability across variant periods, with sustained prediction performance on Omicron cases (AUROC=0.903, 95% CI 0.897-0.910). CONCLUSIONS: RIETS was developed and validated to assist early triaging by promptly predicting the severity of hospitalized patients with COVID-19. Its high performance with low bias risk ensures considerably reliable prediction. The use of a nationwide multicenter cohort in model development and validation suggests generalizability. The use of routinely collected features may enable wide adaptability. Interpretations of model parameters and patients can promote clinical applicability. Together, we anticipate that RIETS will facilitate the patient triaging workflow and efficient resource allocation when incorporated into routine clinical practice.
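The bootstrapped external validation described in the METHODS can be sketched as follows. This is an illustrative outline only, with synthetic labels and scores and a hypothetical function name; it is not the authors' code.

```python
# Hypothetical sketch of bootstrapped AUROC validation: resample the external
# cohort with replacement and report a percentile confidence interval.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Return the point AUROC and a (1-alpha) percentile bootstrap CI."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:  # AUROC needs both classes
            continue
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), (lo, hi)
```

On real data, the interval width shrinks with cohort size, which is consistent with the narrow CIs reported for the large validation cohort.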
Subject(s)
Algorithms , COVID-19 , Triage , Humans , Biomarkers , COVID-19/diagnosis , Hospital Mortality , Neural Networks, Computer , Triage/methods , Republic of Korea
ABSTRACT
Importance: Meticulous postoperative flap monitoring is essential for preventing flap failure and achieving optimal results in free flap operations, for which physical examination has remained the criterion standard. Despite the high reliability of physical examination, the excessive clinician time it requires has been considered its main drawback. Objective: To develop an automated free flap monitoring system using artificial intelligence (AI) that minimizes human involvement while maintaining efficiency. Design, Setting, and Participants: In this prognostic study, the designed system involves a smartphone camera installed in a location with optimal flap visibility to capture photographs at regular intervals. The automated program identifies the flap area, checks for notable abnormalities in its appearance, and notifies medical staff if abnormalities are detected. Implementation requires 2 AI-based models: a segmentation model for automatic flap recognition in photographs and a grading model for evaluating the perfusion status of the identified flap. To develop this system, flap photographs captured for monitoring were collected from patients who underwent free flap-based reconstruction from March 1, 2020, to August 31, 2023. After the 2 models were developed, they were integrated to construct the system, which was applied in a clinical setting in November 2023. Exposure: Use of the developed automated AI-based flap monitoring system. Main Outcomes and Measures: Accuracy of the developed models and feasibility of clinical application of the system. Results: Photographs were obtained from 305 patients (median age, 62 years [range, 8-86 years]; 178 [58.4%] were male). Based on 2068 photographs, the FS-net program (a customized model) was developed for flap segmentation, demonstrating a mean (SD) Dice similarity coefficient of 0.970 (0.001) with 5-fold cross-validation.
For the flap grading system, 11,112 photographs from the 305 patients were used, encompassing 10,115 photographs with normal features and 997 with abnormal features. Tested on 5506 photographs, the DenseNet121 model demonstrated the highest performance, with an area under the receiver operating characteristic curve of 0.960 (95% CI, 0.951-0.969). The sensitivity was 97.5% for detecting venous insufficiency and 92.8% for arterial insufficiency. When applied to 10 patients, the system successfully conducted 143 automated monitoring sessions without significant issues. Conclusions and Relevance: The findings of this study suggest that this novel automated system may enable efficient flap monitoring with minimal use of clinician time. It may serve as an effective surveillance tool for postoperative free flap monitoring. Further studies are required to verify its reliability.
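The Dice similarity coefficient used to evaluate the segmentation model can be computed as below. The masks and helper name are illustrative, not taken from the study's implementation.

```python
# Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks; eps avoids division by
# zero when both masks are empty.
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Overlap score in [0, 1]; 1.0 means the masks match exactly."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)
```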
Subject(s)
Artificial Intelligence , Free Tissue Flaps , Humans , Male , Female , Middle Aged , Aged , Adult , Aged, 80 and over , Photography/methods , Monitoring, Physiologic/methods , Monitoring, Physiologic/instrumentation , Young Adult , Adolescent , Plastic Surgery Procedures/methods , Reproducibility of Results
ABSTRACT
BACKGROUND: Cancer patients who are admitted to hospitals are at high risk of short-term deterioration due to treatment-related or cancer-specific complications. A rapid response system (RRS) is initiated when patients who are deteriorating or at risk of deteriorating are identified. This study was conducted to develop a deep learning-based early warning score (EWS) for cancer patients (Can-EWS) using delta values in vital signs. METHODS: A retrospective cohort study was conducted on all oncology patients who were admitted to the general ward between 2016 and 2020. The data were divided into a training set (January 2016-December 2019) and a held-out test set (January 2020-December 2020). The primary outcome was clinical deterioration, defined as the composite of in-hospital cardiac arrest (IHCA) and unexpected intensive care unit (ICU) transfer. RESULTS: During the study period, 19,739 cancer patients were admitted to the general wards and were eligible for this study. Clinical deterioration occurred in 894 cases. The prevalence of IHCA and unexpected ICU transfer was 1.77 and 43.45 per 1000 admissions, respectively. We developed two models: Can-EWS V1, which used input vectors of the original five input variables, and Can-EWS V2, which used input vectors of 10 variables (including an additional five delta variables). The cross-validation performance of the clinical deterioration for Can-EWS V2 (AUROC, 0.946; 95% confidence interval [CI], 0.943-0.948) was higher than that for MEWS of 5 (AUROC, 0.589; 95% CI, 0.587-0.560; p < 0.001) and Can-EWS V1 (AUROC, 0.927; 95% CI, 0.924-0.931). As a virtual prognostic study, additional validation was performed on the held-out test data. The AUROC and 95% CI were 0.588 (95% CI, 0.588-0.589), 0.890 (95% CI, 0.888-0.891), and 0.898 (95% CI, 0.897-0.899) for MEWS of 5, Can-EWS V1, and the deployed model Can-EWS V2, respectively.
Can-EWS V2 outperformed the other approaches in specificity, positive predictive value, negative predictive value, and the number of false alarms per day at the same sensitivity level on the held-out test data. CONCLUSIONS: We have developed and validated a deep learning-based EWS for cancer patients using the original values and the differences between consecutive measurements of basic vital signs. The Can-EWS has acceptable discriminatory power and sensitivity, with markedly fewer false alarms than MEWS.
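The "delta" inputs that distinguish Can-EWS V2 from V1 can be sketched as follows: each vital sign is paired with its difference from the previous measurement, doubling five variables to ten. Column and function names here are hypothetical; only the idea comes from the abstract.

```python
# Minimal sketch: augment each time-ordered observation with delta features
# (current value minus previous value) for the listed vital signs.
def add_delta_features(rows, vital_keys):
    """rows: list of dicts of vital-sign values, ordered by measurement time."""
    out = []
    prev = None
    for row in rows:
        feat = dict(row)
        for k in vital_keys:
            # First observation has no predecessor, so its delta is zero.
            feat["delta_" + k] = 0.0 if prev is None else row[k] - prev[k]
        out.append(feat)
        prev = row
    return out
```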
ABSTRACT
Most recent survival prediction has been based on TNM staging, which does not provide individualized information. However, clinical factors including performance status, age, sex, and smoking might also influence survival. Therefore, we used artificial intelligence (AI) to analyze various clinical factors to precisely predict the survival of patients with larynx squamous cell carcinoma (LSCC). We included patients with LSCC (N = 1026) who received definitive treatment from 2002 to 2020. Age, sex, smoking, alcohol consumption, Eastern Cooperative Oncology Group (ECOG) performance status, location of tumor, TNM stage, and treatment methods were analyzed using a deep neural network (DNN) with multi-classification and regression, random survival forest (RSF), and Cox proportional hazards (COX-PH) models for prediction of overall survival. Each model was confirmed with five-fold cross-validation, and performance was evaluated using linear slope, y-intercept, and C-index. The DNN with multi-classification demonstrated the highest prediction power (1.000 ± 0.047, 0.126 ± 0.762, and 0.859 ± 0.018 for slope, y-intercept, and C-index, respectively), and its predicted survival curve showed the strongest agreement with the validation survival curve, followed by the DNN with regression (0.731 ± 0.048, 9.659 ± 0.964, and 0.893 ± 0.017, respectively). The DNN model built with only T/N staging showed the poorest survival prediction. When predicting the survival of LSCC patients, various clinical factors should be considered. In the present study, the DNN with multi-classification was shown to be an appropriate method for survival prediction. AI analysis may predict survival more accurately and improve oncologic outcomes.
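The C-index used to compare the models can be illustrated with a minimal pairwise implementation of Harrell's concordance index. This sketch assumes higher risk scores mean shorter expected survival and, for brevity, handles only the standard comparable-pair rule; it is not the evaluation code used in the study.

```python
# Harrell's C-index: among comparable pairs (patient i had an event and failed
# before patient j's time), count how often the model ranked i as higher risk.
def c_index(times, events, risk_scores):
    """Fraction of comparable pairs ordered correctly; ties get half credit."""
    concordant = comparable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-indexes (e.g., 0.859) should be read.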
Subject(s)
Carcinoma, Squamous Cell , Head and Neck Neoplasms , Laryngeal Neoplasms , Humans , Squamous Cell Carcinoma of Head and Neck/pathology , Laryngeal Neoplasms/pathology , Artificial Intelligence , Carcinoma, Squamous Cell/pathology , Neoplasm Staging , Head and Neck Neoplasms/pathology , Prognosis , Retrospective Studies
ABSTRACT
Pretreatment values of the neutrophil-to-lymphocyte ratio (NLR) and the platelet-to-lymphocyte ratio (PLR) are well-established prognosticators in various cancers, including head and neck cancers. However, there are no studies on whether temporal changes in NLR and PLR values after treatment are related to the development of recurrence. Therefore, in this study, we aimed to develop a deep neural network (DNN) model to discern cancer recurrence from temporal NLR and PLR values during follow-up after concurrent chemoradiotherapy (CCRT) and to evaluate the model's performance compared with conventional machine learning (ML) models. Along with conventional ML models such as logistic regression (LR), random forest (RF), and gradient boosting (GB), the DNN model to discern recurrences was trained using a dataset of 778 consecutive patients with primary head and neck cancers who received CCRT. There were 16 input features, including 12 laboratory values related to the NLR and the PLR. In addition to the original training dataset (N = 778), an augmented training dataset (N = 900) was created by splitting the data. Model performance was measured using ROC-AUC and PR-AUC values. External validation was performed using a dataset of 173 patients from an unrelated external institution. The ROC-AUC and PR-AUC values of the DNN model were 0.828 ± 0.032 and 0.663 ± 0.069, respectively, in the original training dataset, which were higher than those of the LR, RF, and GB models. With the recursive feature elimination (RFE) algorithm, five input features were selected. The ROC-AUC and PR-AUC values of the DNN-RFE model were higher than those of the original DNN model (0.883 ± 0.027 and 0.778 ± 0.042, respectively). The ROC-AUC and PR-AUC values of the DNN-RFE model trained with the split dataset were 0.889 ± 0.032 and 0.771 ± 0.044, respectively.
In the external validation, the ROC-AUC values of the DNN-RFE model trained with the original dataset and the same model trained with the split dataset were 0.710 and 0.784, respectively. The DNN model with feature selection using the RFE algorithm showed the best performance among the ML models at discerning recurrence after CCRT in patients with head and neck cancers. Data augmentation by splitting the training data improved model performance, and the performance of the DNN-RFE model was also validated with an external dataset.
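Recursive feature elimination as used here can be sketched with scikit-learn's `RFE`, which repeatedly fits an estimator and drops the weakest features until the target count remains. The synthetic data below merely stands in for the 16 laboratory-derived inputs; the study's actual base estimator is not specified in this abstract.

```python
# Illustrative RFE run: select 5 of 16 candidate features using logistic
# regression coefficients as the per-feature importance signal.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                       # 16 candidate features
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)    # outcome driven by 3 of them

selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
selector.fit(X, y)
selected = np.flatnonzero(selector.support_)          # indices of kept features
```

`selector.support_` is a boolean mask over the input columns, so downstream models can be retrained on `X[:, selector.support_]` only.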
ABSTRACT
BACKGROUND: Breast cancer is the most common cancer and the most common cause of cancer death in women. Although survival rates have improved, unmet psychosocial needs remain challenging because quality of life (QoL) and QoL-related factors change over time. In addition, traditional statistical models have limitations in identifying factors associated with QoL over time, particularly concerning the physical, psychological, economic, spiritual, and social dimensions. OBJECTIVE: This study aimed to identify patient-centered factors associated with QoL among patients with breast cancer using a machine learning (ML) algorithm to analyze data collected along different survivorship trajectories. METHODS: The study used 2 data sets. The first was the cross-sectional survey data from the Breast Cancer Information Grand Round for Survivorship (BIG-S) study, which recruited consecutive breast cancer survivors who visited the outpatient breast cancer clinic at the Samsung Medical Center in Seoul, Korea, between 2018 and 2019. The second was the longitudinal cohort data from the Beauty Education for Distressed Breast Cancer (BEST) cohort study, which was conducted at 2 university-based cancer hospitals in Seoul, Korea, between 2011 and 2016. QoL was measured using the European Organization for Research and Treatment of Cancer QoL Questionnaire Core 30 (EORTC QLQ-C30). Feature importance was interpreted using Shapley Additive Explanations (SHAP). The final model was selected based on the highest mean area under the receiver operating characteristic curve (AUC). The analyses were performed using the Python 3.7 programming environment (Python Software Foundation). RESULTS: The study included 6265 breast cancer survivors in the training data set and 432 patients in the validation set. The mean age was 50.6 (SD 8.66) years and 46.8% (n=2004) had stage 1 cancer. In the training data set, 48.3% (n=3026) of survivors had poor QoL.
The study developed ML models for QoL prediction based on 6 algorithms. Performance was good for all survival trajectories: overall (AUC 0.823), baseline (AUC 0.835), within 1 year (AUC 0.860), between 2 and 3 years (AUC 0.808), between 3 and 4 years (AUC 0.820), and between 4 and 5 years (AUC 0.826). Emotional and physical functions were the most important features before surgery and within 1 year after surgery, respectively. Fatigue was the most important feature between 1 and 4 years. Regardless of the survival period, hopefulness was the most influential feature for QoL. External validation of the models showed good performance, with AUCs between 0.770 and 0.862. CONCLUSIONS: The study identified important factors associated with QoL among breast cancer survivors across different survival trajectories. Understanding the changing trends of these factors could help to intervene more precisely and in a timely manner, and potentially prevent or alleviate QoL-related issues for patients. The good performance of our ML models in both the training and external validation sets suggests the potential of this approach for identifying patient-centered factors and improving survivorship care.
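The abstract interprets feature importance with SHAP. As a dependency-light stand-in, the same question ("which feature drives the prediction?") can be illustrated with scikit-learn's permutation importance; note this is a substitute technique, not the SHAP analysis the study performed, and the synthetic data and feature index are purely illustrative.

```python
# Permutation importance: shuffle one feature at a time and measure how much
# the model's score drops; a large drop marks an influential feature.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
# Outcome driven almost entirely by feature 0 (stand-in for a dominant factor).
y = (2 * X[:, 0] + rng.normal(scale=0.1, size=500) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
top_feature = int(np.argmax(result.importances_mean))
```

SHAP additionally gives signed, per-patient attributions, which is why the study used it for patient-level interpretation rather than a global score like this one.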
Subject(s)
Breast Neoplasms , Cancer Survivors , Humans , Female , Middle Aged , Cancer Survivors/psychology , Quality of Life/psychology , Breast Neoplasms/surgery , Survivorship , Cohort Studies , Cross-Sectional Studies , Survivors/psychology
ABSTRACT
BACKGROUND: Among movement disorders, parkinsonian diseases and cerebellar ataxia are representative conditions that present with distinct pathological gaits. We proposed a machine learning system that can differentiate Parkinson's disease (PD), cerebellar ataxia, and progressive supranuclear palsy Richardson syndrome (PSP-RS) based on postural instability and gait analysis. METHODS: We screened 1467 gait (GAITRite) and postural instability (Pedoscan) analyses performed at Samsung Medical Center from January 2019 to December 2020. PD, probable PSP-RS, and cerebellar ataxia (i.e., probable MSA-C, hereditary ataxia, and sporadic adult-onset ataxia) were included in the study. Gated recurrent units were applied to the GAITRite data and a deep neural network to the Pedoscan data. The enhanced weight voting ensemble (EWVE) method was applied to incorporate the two modalities. RESULTS: We included 551 patients with PD, 38 with PSP-RS, and 113 with cerebellar ataxia, of whom 71 had MSA-C. The Pedoscan-based and gait-based models showed high sensitivity but low specificity in differentiating atypical parkinsonism from PD. The EWVE showed significantly improved specificity and reliable performance in differentiating PD vs. ataxia (AUC 0.974 ± 0.036, sensitivity 0.829 ± 0.217, specificity 0.969 ± 0.038), PD vs. MSA-C (AUC 0.975 ± 0.020, sensitivity 0.823 ± 0.162, specificity 0.932 ± 0.030), and PD vs. PSP-RS (AUC 0.963 ± 0.028, sensitivity 0.555 ± 0.157, specificity 0.936 ± 0.031). CONCLUSION: We proposed reliable Pedoscan-based, gait-based, and EWVE models for differentiating gait disorders by integrating information from gait and postural instability. These models can provide diagnostic guidance to primary caregivers and assist neurologists in the differential diagnosis of PD from atypical parkinsonism.
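A weighted voting ensemble that fuses two modality-specific classifiers can be sketched as below. The abstract does not specify how EWVE derives its weights, so the fixed weights and function name here are illustrative only.

```python
# Combine class-probability outputs of two models (e.g., one per modality)
# with fixed weights, then predict the class with the highest combined score.
import numpy as np

def weighted_vote(prob_a, prob_b, w_a=0.6, w_b=0.4):
    """prob_a, prob_b: (n_samples, n_classes) probability arrays."""
    combined = w_a * np.asarray(prob_a) + w_b * np.asarray(prob_b)
    return combined.argmax(axis=1)  # predicted class index per sample
```

In an "enhanced" scheme the weights would typically be learned from each model's validation performance rather than fixed, which is presumably what distinguishes EWVE from plain soft voting.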
Subject(s)
Cerebellar Ataxia , Multiple System Atrophy , Parkinson Disease , Parkinsonian Disorders , Supranuclear Palsy, Progressive , Adult , Artificial Intelligence , Cerebellar Ataxia/diagnosis , Diagnosis, Differential , Gait , Humans , Multiple System Atrophy/diagnosis , Parkinson Disease/diagnosis , Parkinson Disease/pathology , Parkinsonian Disorders/diagnosis , Parkinsonian Disorders/pathology , Supranuclear Palsy, Progressive/diagnosis , Supranuclear Palsy, Progressive/pathology
ABSTRACT
OBJECTIVES: Pharyngocutaneous fistula (PCF) is one of the major complications following total laryngectomy (TL). Previous studies of PCF risk factors showed inconsistent results, and artificial intelligence (AI) has not been used. We identified the clinical risk factors for PCF using multiple AI models. MATERIALS & METHODS: Patients who received TL at the authors' institution during the last 20 years were enrolled in this study (N = 313), comprising no-PCF (n = 247) and PCF (n = 66) groups. We compared 29 clinical variables between the two groups and performed logistic regression and AI analyses, including random forest, gradient boosting, and neural network, to predict PCF after TL. RESULTS: The best prediction performance for AI was achieved when age, smoking, body mass index, hypertension, chronic kidney disease, hemoglobin level, operation time, transfusion, nodal staging, surgical margin, extent of neck dissection, type of flap reconstruction, hematoma after TL, and concurrent chemoradiation were included in the analysis. Among the logistic regression and AI models, the neural network showed the highest area under the curve (0.667 ± 0.332). CONCLUSION: Diverse clinical factors were identified as PCF risk factors using AI models, and the neural network demonstrated the highest predictive power. This first study of PCF prediction using AI could help select patients at high risk of PCF when performing TL.
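The model comparison described here (logistic regression versus a neural network, scored by area under the curve) can be sketched with scikit-learn. The synthetic data below merely stands in for the 14 clinical variables of the best-performing feature set; architecture and hyperparameters are illustrative assumptions, not the study's.

```python
# Fit a logistic regression and a small multilayer perceptron on the same
# train/test split and compare their AUCs on held-out data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 14))  # stand-in for 14 clinical variables
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
lr_auc = roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1])

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
mlp_auc = roc_auc_score(y_te, mlp.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
```

With only 313 patients and 66 events, as in the study, the wide uncertainty around the reported AUC (± 0.332) is expected; resampling-based CIs would make that explicit.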