RESUMO
OBJECTIVES: To develop and validate an artificial intelligence (AI) system for measuring and detecting signs of carpal instability on conventional radiographs. MATERIALS AND METHODS: Two case-control datasets of hand and wrist radiographs were retrospectively acquired at three hospitals (hospitals A, B, and C). Dataset 1 (2178 radiographs from 1993 patients, hospitals A and B, 2018-2019) was used for developing an AI system for measuring scapholunate (SL) joint distances, SL and capitolunate (CL) angles, and carpal arc interruptions. Dataset 2 (481 radiographs from 217 patients, hospital C, 2017-2021) was used for testing, and with a subsample (174 radiographs from 87 patients), an observer study was conducted to compare its performance to five clinicians. Evaluation metrics included mean absolute error (MAE), sensitivity, and specificity. RESULTS: Dataset 2 included 258 SL distances, 189 SL angles, 191 CL angles, and 217 carpal arc labels obtained from 217 patients (mean age, 51 years ± 23 [standard deviation]; 133 women). The MAE in measuring SL distances, SL angles, and CL angles was respectively 0.65 mm (95%CI: 0.59, 0.72), 7.9 degrees (95%CI: 7.0, 8.9), and 5.9 degrees (95%CI: 5.2, 6.6). The sensitivity and specificity for detecting arc interruptions were 83% (95%CI: 74, 91) and 64% (95%CI: 56, 71). The measurements were largely comparable to those of the clinicians, while arc interruption detections were more accurate than those of most clinicians. CONCLUSION: This study demonstrates that a newly developed automated AI system accurately measures and detects signs of carpal instability on conventional radiographs. CLINICAL RELEVANCE STATEMENT: This system has the potential to improve detections of carpal arc interruptions and could be a promising tool for supporting clinicians in detecting carpal instability.
Assuntos
Inteligência Artificial , Instabilidade Articular , Humanos , Feminino , Pessoa de Meia-Idade , Masculino , Instabilidade Articular/diagnóstico por imagem , Estudos Retrospectivos , Sensibilidade e Especificidade , Estudos de Casos e Controles , Adulto , Ossos do Carpo/diagnóstico por imagem , Radiografia/métodos , Articulação do Punho/diagnóstico por imagem , Articulações do Carpo/diagnóstico por imagem , IdosoRESUMO
Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of methods for this task. To address this, we organized a public research challenge, NODE21, aimed at the detection and generation of lung nodules in chest X-rays. While the detection track assesses state-of-the-art nodule detection systems, the generation track determines the utility of nodule generation algorithms to augment training data and hence improve the performance of the detection systems. This paper summarizes the results of the NODE21 challenge and performs extensive additional experiments to examine the impact of the synthetically generated nodule training images on the detection algorithm performance.
Assuntos
Algoritmos , Neoplasias Pulmonares , Interpretação de Imagem Radiográfica Assistida por Computador , Radiografia Torácica , Nódulo Pulmonar Solitário , Humanos , Neoplasias Pulmonares/diagnóstico por imagem , Radiografia Torácica/métodos , Interpretação de Imagem Radiográfica Assistida por Computador/métodos , Nódulo Pulmonar Solitário/diagnóstico por imagem , Pulmão/diagnóstico por imagem , Aprendizado ProfundoRESUMO
BACKGROUND: Outside a screening program, early-stage lung cancer is generally diagnosed after the detection of incidental nodules in clinically ordered chest CT scans. Despite the advances in artificial intelligence (AI) systems for lung cancer detection, clinical validation of these systems is lacking in a non-screening setting. METHOD: We developed a deep learning-based AI system and assessed its performance for the detection of actionable benign nodules (requiring follow-up), small lung cancers, and pulmonary metastases in CT scans acquired in two Dutch hospitals (internal and external validation). A panel of five thoracic radiologists labeled all nodules, and two additional radiologists verified the nodule malignancy status and searched for any missed cancers using data from the national Netherlands Cancer Registry. The detection performance was evaluated by measuring the sensitivity at predefined false positive rates on a free receiver operating characteristic curve and was compared with the panel of radiologists. RESULTS: On the external test set (100 scans from 100 patients), the sensitivity of the AI system for detecting benign nodules, primary lung cancers, and metastases is respectively 94.3% (82/87, 95% CI: 88.1-98.8%), 96.9% (31/32, 95% CI: 91.7-100%), and 92.0% (104/113, 95% CI: 88.5-95.5%) at a clinically acceptable operating point of 1 false positive per scan (FP/s). These sensitivities are comparable to or higher than the radiologists, albeit with a slightly higher FP/s (average difference of 0.6). CONCLUSIONS: The AI system reliably detects benign and malignant pulmonary nodules in clinically indicated CT scans and can potentially assist radiologists in this setting.
Early-stage lung cancer can be diagnosed after identifying an abnormal spot on a chest CT scan ordered for other medical reasons. These spots or lung nodules can be overlooked by radiologists, as they are not necessarily the focus of an examination and can be as small as a few millimeters. Software using Artificial Intelligence (AI) technology has proven to be successful for aiding radiologists in this task, but its performance is understudied outside a lung cancer screening setting. We therefore developed and validated AI software for the detection of cancerous nodules or non-cancerous nodules that would need attention. We show that the software can reliably detect these nodules in a non-screening setting and could potentially aid radiologists in daily clinical practice.
RESUMO
OBJECTIVE: To study trends in the incidence of reported pulmonary nodules and stage I lung cancer in chest CT. METHODS: We analyzed the trends in the incidence of detected pulmonary nodules and stage I lung cancer in chest CT scans in the period between 2008 and 2019. Imaging metadata and radiology reports from all chest CT studies were collected from two large Dutch hospitals. A natural language processing algorithm was developed to identify studies with any reported pulmonary nodule. RESULTS: Between 2008 and 2019, a total of 74,803 patients underwent 166,688 chest CT examinations at both hospitals combined. During this period, the annual number of chest CT scans increased from 9955 scans in 6845 patients in 2008 to 20,476 scans in 13,286 patients in 2019. The proportion of patients in whom nodules (old or new) were reported increased from 38% (2595/6845) in 2008 to 50% (6654/13,286) in 2019. The proportion of patients in whom significant new nodules (≥ 5 mm) were reported increased from 9% (608/6954) in 2010 to 17% (1660/9883) in 2017. The number of patients with new nodules and corresponding stage I lung cancer diagnosis tripled and their proportion doubled, from 0.4% (26/6954) in 2010 to 0.8% (78/9883) in 2017. CONCLUSION: The identification of incidental pulmonary nodules in chest CT has steadily increased over the past decade and has been accompanied by more stage I lung cancer diagnoses. CLINICAL RELEVANCE STATEMENT: These findings stress the importance of identifying and efficiently managing incidental pulmonary nodules in routine clinical practice. KEY POINTS: ⢠The number of patients who underwent chest CT examinations substantially increased over the past decade, as did the number of patients in whom pulmonary nodules were identified. ⢠The increased use of chest CT and more frequently identified pulmonary nodules were associated with more stage I lung cancer diagnoses.
Assuntos
Neoplasias Pulmonares , Nódulos Pulmonares Múltiplos , Nódulo Pulmonar Solitário , Humanos , Incidência , Nódulo Pulmonar Solitário/diagnóstico por imagem , Nódulo Pulmonar Solitário/epidemiologia , Nódulos Pulmonares Múltiplos/diagnóstico por imagem , Nódulos Pulmonares Múltiplos/epidemiologia , Tomografia Computadorizada por Raios X/métodos , Neoplasias Pulmonares/diagnóstico por imagem , Neoplasias Pulmonares/epidemiologiaRESUMO
OBJECTIVES: To assess how an artificial intelligence (AI) algorithm performs against five experienced musculoskeletal radiologists in diagnosing scaphoid fractures and whether it aids their diagnosis on conventional multi-view radiographs. METHODS: Four datasets of conventional hand, wrist, and scaphoid radiographs were retrospectively acquired at two hospitals (hospitals A and B). Dataset 1 (12,990 radiographs from 3353 patients, hospital A) and dataset 2 (1117 radiographs from 394 patients, hospital B) were used for training and testing a scaphoid localization and laterality classification component. Dataset 3 (4316 radiographs from 840 patients, hospital A) and dataset 4 (688 radiographs from 209 patients, hospital B) were used for training and testing the fracture detector. The algorithm was compared with the radiologists in an observer study. Evaluation metrics included sensitivity, specificity, positive predictive value (PPV), area under the characteristic operating curve (AUC), Cohen's kappa coefficient (κ), fracture localization precision, and reading time. RESULTS: The algorithm detected scaphoid fractures with a sensitivity of 72%, specificity of 93%, PPV of 81%, and AUC of 0.88. The AUC of the algorithm did not differ from each radiologist (0.87 [radiologists' mean], p ≥ .05). AI assistance improved five out of ten pairs of inter-observer Cohen's κ agreements (p < .05) and reduced reading time in four radiologists (p < .001), but did not improve other metrics in the majority of radiologists (p ≥ .05). CONCLUSIONS: The AI algorithm detects scaphoid fractures on conventional multi-view radiographs at the level of five experienced musculoskeletal radiologists and could significantly shorten their reading time. KEY POINTS: ⢠An artificial intelligence algorithm automatically detects scaphoid fractures on conventional multi-view radiographs at the same level of five experienced musculoskeletal radiologists. ⢠There is preliminary evidence that automated scaphoid fracture detection can significantly shorten the reading time of musculoskeletal radiologists.
Assuntos
Aprendizado Profundo , Fraturas Ósseas , Osso Escafoide , Traumatismos do Punho , Humanos , Fraturas Ósseas/diagnóstico por imagem , Punho , Estudos Retrospectivos , Inteligência Artificial , Osso Escafoide/diagnóstico por imagem , RadiologistasRESUMO
PURPOSE: To compare the performance of a convolutional neural network (CNN) to that of 11 radiologists in detecting scaphoid bone fractures on conventional radiographs of the hand, wrist, and scaphoid. MATERIALS AND METHODS: At two hospitals (hospitals A and B), three datasets consisting of conventional hand, wrist, and scaphoid radiographs were retrospectively retrieved: a dataset of 1039 radiographs (775 patients [mean age, 48 years ± 23 {standard deviation}; 505 female patients], period: 2017-2019, hospitals A and B) for developing a scaphoid segmentation CNN, a dataset of 3000 radiographs (1846 patients [mean age, 42 years ± 22; 937 female patients], period: 2003-2019, hospital B) for developing a scaphoid fracture detection CNN, and a dataset of 190 radiographs (190 patients [mean age, 43 years ± 20; 77 female patients], period: 2011-2020, hospital A) for testing the complete fracture detection system. Both CNNs were applied consecutively: The segmentation CNN localized the scaphoid and then passed the relevant region to the detection CNN for fracture detection. In an observer study, the performance of the system was compared with that of 11 radiologists. Evaluation metrics included the Dice similarity coefficient (DSC), Hausdorff distance (HD), sensitivity, specificity, positive predictive value (PPV), and area under the receiver operating characteristic curve (AUC). RESULTS: The segmentation CNN achieved a DSC of 97.4% ± 1.4 with an HD of 1.31 mm ± 1.03. The detection CNN had sensitivity of 78% (95% CI: 70, 86), specificity of 84% (95% CI: 77, 92), PPV of 83% (95% CI: 77, 90), and AUC of 0.87 (95% CI: 0.81, 0.91). There was no difference between the AUC of the CNN and that of the radiologists (0.87 [95% CI: 0.81, 0.91] vs 0.83 [radiologist range: 0.79-0.85]; P = .09). CONCLUSION: The developed CNN achieved radiologist-level performance in detecting scaphoid bone fractures on conventional radiographs of the hand, wrist, and scaphoid.Keywords: Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms, Feature Detection-Vision-Application Domain, Computer-Aided DiagnosisSee also the commentary by Li and Torriani in this issue.Supplemental material is available for this article.©RSNA, 2021.
RESUMO
Background The coronavirus disease 2019 (COVID-19) pandemic has spread across the globe with alarming speed, morbidity, and mortality. Immediate triage of patients with chest infections suspected to be caused by COVID-19 using chest CT may be of assistance when results from definitive viral testing are delayed. Purpose To develop and validate an artificial intelligence (AI) system to score the likelihood and extent of pulmonary COVID-19 on chest CT scans using the COVID-19 Reporting and Data System (CO-RADS) and CT severity scoring systems. Materials and Methods The CO-RADS AI system consists of three deep-learning algorithms that automatically segment the five pulmonary lobes, assign a CO-RADS score for the suspicion of COVID-19, and assign a CT severity score for the degree of parenchymal involvement per lobe. This study retrospectively included patients who underwent a nonenhanced chest CT examination because of clinical suspicion of COVID-19 at two medical centers. The system was trained, validated, and tested with data from one of the centers. Data from the second center served as an external test set. Diagnostic performance and agreement with scores assigned by eight independent observers were measured using receiver operating characteristic analysis, linearly weighted κ values, and classification accuracy. Results A total of 105 patients (mean age, 62 years ± 16 [standard deviation]; 61 men) and 262 patients (mean age, 64 years ± 16; 154 men) were evaluated in the internal and external test sets, respectively. The system discriminated between patients with COVID-19 and those without COVID-19, with areas under the receiver operating characteristic curve of 0.95 (95% CI: 0.91, 0.98) and 0.88 (95% CI: 0.84, 0.93), for the internal and external test sets, respectively. Agreement with the eight human observers was moderate to substantial, with mean linearly weighted κ values of 0.60 ± 0.01 for CO-RADS scores and 0.54 ± 0.01 for CT severity scores. Conclusion With high diagnostic performance, the CO-RADS AI system correctly identified patients with COVID-19 using chest CT scans and assigned standardized CO-RADS and CT severity scores that demonstrated good agreement with findings from eight independent observers and generalized well to external data. © RSNA, 2020 Supplemental material is available for this article.