ABSTRACT
BACKGROUND: Pulmonary arterial hypertension is a serious medical condition. However, it is often misdiagnosed, or a long delay occurs between symptom onset and diagnosis, which is associated with decreased 5-year survival. In this study, we developed and tested a deep-learning algorithm to detect pulmonary arterial hypertension using chest X-ray (CXR) images. METHODS: From the image archive of Chiba University Hospital, 259 CXR images from 145 patients with pulmonary arterial hypertension and 260 CXR images from 260 control patients were identified; of these, 418 were used for training and 101 for testing. For each image in the testing dataset, the algorithm output a numerical value from 0 to 1 (the pulmonary arterial hypertension probability score). The training process employed a binary cross-entropy loss function with stochastic gradient descent optimization (learning rate parameter, α = 0.01). In addition, using the same testing dataset, the algorithm's ability to identify pulmonary arterial hypertension was compared with that of experienced doctors. RESULTS: The area under the curve (AUC) of the receiver operating characteristic curve for the detection ability of the algorithm was 0.988. Using a score threshold of 0.69, the sensitivity and specificity of the algorithm were 0.933 and 0.982, respectively. The AUC of the algorithm's detection ability was superior to that of the doctors. CONCLUSION: The CXR image-derived deep-learning algorithm had superior pulmonary arterial hypertension detection capability compared with that of experienced doctors.
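The training objective named in the methods above (binary cross-entropy minimized by stochastic gradient descent with α = 0.01) can be illustrated with a minimal sketch. It assumes a single logistic unit on a hand-made feature vector, not the authors' network; the function names `sigmoid`, `bce_loss`, and `sgd_step` are hypothetical and only the learning rate comes from the abstract.

```python
import math


def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(y_true, p, eps=1e-12):
    """Binary cross-entropy for one example; p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def sgd_step(weights, bias, features, y_true, alpha=0.01):
    """One SGD update for a single logistic unit (illustrative stand-in model).

    Returns the updated weights, the updated bias, and the loss measured
    before the update.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    # For sigmoid + binary cross-entropy, d(loss)/d(z) simplifies to (p - y).
    error = p - y_true
    new_weights = [w - alpha * error * x for w, x in zip(weights, features)]
    new_bias = bias - alpha * error
    return new_weights, new_bias, bce_loss(y_true, p)


if __name__ == "__main__":
    w, b, loss = sgd_step([0.0, 0.0], 0.0, [1.0, 2.0], y_true=1)
    print(w, b, loss)
```

In a deep network the same rule is applied to every parameter, with the gradients obtained by backpropagation rather than the closed form used here.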
Subjects
Deep Learning , Pulmonary Arterial Hypertension , Humans , Artificial Intelligence , Pulmonary Arterial Hypertension/diagnostic imaging , X-Rays , Thorax
ABSTRACT
BACKGROUND: Antifibrotic therapies are available to treat chronic fibrosing interstitial lung diseases (CF-ILDs), including idiopathic pulmonary fibrosis. Early use of these treatments is recommended to slow deterioration of respiratory function and to prevent acute exacerbation. However, identifying patients in the early stages of CF-ILD using chest radiographs is challenging. In this study, we developed and tested a deep-learning algorithm to detect CF-ILD using chest radiograph images. METHOD: From the image archive of Sapporo Medical University Hospital, 653 chest radiographs from 263 patients with CF-ILDs and 506 from 506 patients without CF-ILD were identified; 921 were used for deep learning and 238 were used for algorithm testing. The algorithm was designed to output a numerical score ranging from 0 to 1, representing the probability of CF-ILD. Using the testing dataset, the algorithm's capability to identify CF-ILD was compared with that of doctors. A second dataset, in which CF-ILD was confirmed using computed tomography images, was used to further evaluate the algorithm's performance. RESULTS: The area under the receiver operating characteristic curve, which indicates the algorithm's detection capability, was 0.979. Using a score cut-off of 0.267, the sensitivity and specificity of detection were 0.896 and 1.000, respectively. These data showed that the algorithm's performance was noninferior to that of doctors, including pulmonologists and radiologists; performance was verified using the second dataset. CONCLUSIONS: We developed a deep-learning algorithm to detect CF-ILDs using chest radiograph images. The algorithm's detection capability was noninferior to that of doctors.
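How a score cut-off turns the algorithm's 0-to-1 output into sensitivity and specificity can be sketched as follows. This is an illustration under stated assumptions: the function name `sensitivity_specificity` is hypothetical, the scores and labels in the usage lines are made up, and treating a score equal to the cut-off as positive is an assumption the abstract does not specify.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Compute sensitivity and specificity at a given score cut-off.

    scores: model outputs in [0, 1]; labels: 1 = disease, 0 = no disease.
    A score >= cutoff is called positive (assumed convention).
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up scores for illustration only; 0.267 is the cut-off from the abstract.
    scores = [0.91, 0.40, 0.10, 0.05, 0.30, 0.20]
    labels = [1, 1, 1, 0, 0, 0]
    print(sensitivity_specificity(scores, labels, cutoff=0.267))
```

Lowering the cut-off trades specificity for sensitivity, which is why a reported pair such as 0.896 and 1.000 is tied to one specific cut-off.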
Subjects
Deep Learning , Idiopathic Pulmonary Fibrosis , Lung Diseases, Interstitial , Humans , Lung Diseases, Interstitial/diagnostic imaging , Fibrosis , Idiopathic Pulmonary Fibrosis/diagnostic imaging , Algorithms , Retrospective Studies
ABSTRACT
BACKGROUND: Less experienced clinicians sometimes misdiagnose hip fractures. We developed a computer-aided diagnosis (CAD) system for hip fractures on plain X-rays using a deep learning model trained on a large dataset. In this study, we examined whether the system could improve the accuracy of residents' hip fracture diagnoses. METHODS: A deep convolutional neural network approach was used for machine learning. PyTorch 1.3 and Fast.ai 1.0 were used as frameworks, together with an EfficientNet-B4 model pre-trained on ImageNet. We collected 5295 X-rays from patients with femoral neck fracture or femoral trochanteric fracture from 2009 to 2019. We excluded cases in which both hips were not included within the image range, as well as cases of femoral shaft fracture and periprosthetic fracture. Finally, we included 5242 AP pelvic X-rays from 4851 cases. We divided each of these 5242 images into two, yielding 5242 images that included the fracture site and 5242 images without a fracture site. Thus, a total of 10,484 images were used for machine learning. The accuracy, sensitivity, specificity, F-value, and area under the curve (AUC) were assessed. Gradient-weighted class activation mapping (Grad-CAM) was used to visualize the basis for the algorithm's fracture diagnosis. Second, we conducted a controlled experiment with clinicians. Thirty-one residents (young doctors within 2 years of graduation from medical school who rotate through various specialties) were tested using 300 hip fracture images randomly extracted from the dataset. We evaluated diagnostic accuracy with and without the CAD system for each of the 300 images. RESULTS: The accuracy, sensitivity, specificity, F-value, and AUC were 96.1%, 95.2%, 96.9%, 0.961, and 0.99, respectively, with the correct diagnostic basis generated by Grad-CAM. In the controlled experiment, the diagnostic accuracy of the residents significantly improved when they used the CAD system. CONCLUSIONS: We developed a new CAD system with a deep learning algorithm from a relatively large dataset from multiple institutions. Our system achieved high diagnostic performance and improved the diagnostic accuracy of residents for hip fractures. LEVEL OF EVIDENCE: Level III, Foundational evidence, before-after study. CLINICAL RELEVANCE: High.
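The image-splitting step in the methods above, where each pelvic radiograph yields one fracture image and one non-fracture image, might look like the sketch below. It assumes the split is made down the vertical midline of the image and that the fractured side is already known; the abstract states neither detail, and the names `split_left_right` and `label_halves` are hypothetical.

```python
def split_left_right(image):
    """Split an image (a list of pixel rows) at its vertical midline.

    Returns (image_left_half, image_right_half). Assumes a non-empty,
    rectangular image; for odd widths the extra column goes to the right half.
    """
    mid = len(image[0]) // 2
    left = [row[:mid] for row in image]
    right = [row[mid:] for row in image]
    return left, right


def label_halves(image, fracture_side):
    """Return [(half, label), ...] with label 1 for the fractured half.

    fracture_side is "left" or "right" and refers to the image halves,
    not to the patient's anatomical side (an assumed convention).
    """
    if fracture_side not in ("left", "right"):
        raise ValueError("fracture_side must be 'left' or 'right'")
    left, right = split_left_right(image)
    return [
        (left, 1 if fracture_side == "left" else 0),
        (right, 1 if fracture_side == "right" else 0),
    ]


if __name__ == "__main__":
    toy_image = [[1, 2, 3, 4], [5, 6, 7, 8]]  # tiny stand-in for a radiograph
    for half, label in label_halves(toy_image, "right"):
        print(label, half)
```

Under this scheme every source image contributes exactly one positive and one negative example, which matches the balanced total of 5242 × 2 = 10,484 images reported in the abstract.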
Subjects
Deep Learning , Hip Fractures , Algorithms , Artificial Intelligence , Hip Fractures/diagnostic imaging , Hip Fractures/epidemiology , Humans , Neural Networks, Computer
ABSTRACT
OBJECTIVES: To investigate the effectiveness of BMAX, a deep learning-based computer-aided detection system for detecting fibrosing interstitial lung disease (ILD) on chest radiographs, among non-expert and expert physicians in the real-world clinical setting. DESIGN: Retrospective, observational study. SETTING: This study used chest radiograph images consecutively taken in three medical facilities with various degrees of referral. Three expert ILD physicians interpreted each image and determined whether it was a fibrosing ILD-suspected image (fibrosing ILD positive) or not (fibrosing ILD negative). Interpreters, including non-experts and experts, classified each of 120 images extracted from the pooled data for the reading test as positive or negative for fibrosing ILD, first without and then with the assistance of BMAX. PARTICIPANTS: Chest radiographs taken during consecutive periods from patients aged 20 years or older with two or more visits were accumulated. A total of 1251 chest radiograph images were collected, from which 120 images (24 positive and 96 negative images) were randomly extracted for the reading test. The interpreters for the reading test were 20 non-expert physicians and 5 expert physicians (3 pulmonologists and 2 radiologists). PRIMARY AND SECONDARY OUTCOME MEASURES: The primary outcome was the comparison of area under the receiver-operating characteristic curve (ROC-AUC) for identifying fibrosing ILD-positive images by non-experts without versus with BMAX. The secondary outcome was the comparison of sensitivity, specificity and accuracy by non-experts and experts without versus with BMAX. RESULTS: The mean ROC-AUC of non-expert interpreters was 0.795 (95% CI; 0.765 to 0.825) without BMAX and 0.825 (95% CI; 0.799 to 0.850) with BMAX (p=0.005). After using BMAX, sensitivity improved from 0.744 (95% CI; 0.697 to 0.791) to 0.802 (95% CI; 0.754 to 0.850) among non-experts (p=0.003), but not among experts (p=0.285). Specificity and accuracy did not change after using BMAX among either non-expert or expert interpreters. CONCLUSION: BMAX was useful for detecting fibrosing ILD-suspected chest radiographs for non-expert physicians. TRIAL REGISTRATION NUMBER: jRCT1032220090.
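The ROC-AUC used as the primary outcome above has a simple probabilistic reading: it is the chance that a randomly chosen positive image receives a higher rating than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the example ratings are illustrative assumptions, and this is not the study's statistical software.

```python
def roc_auc(scores, labels):
    """ROC-AUC via pairwise comparison of positive and negative cases.

    scores: ratings or model outputs (higher = more suspicious);
    labels: 1 = positive, 0 = negative. Ties contribute 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up ratings for five images, for illustration only.
    scores = [0.9, 0.8, 0.3, 0.4, 0.2]
    labels = [1, 1, 1, 0, 0]
    print(roc_auc(scores, labels))
```

With binary positive/negative readings, as in a reading test, most pairs are ties or clear wins, and this definition reduces to the average of sensitivity and specificity.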