ABSTRACT
PURPOSE: The aim of this study was to evaluate the impact of implementing an artificial intelligence (AI) solution for emergency radiology into clinical routine on physicians' perception and knowledge. MATERIALS AND METHODS: A prospective interventional survey was performed pre-implementation and 3 months post-implementation of an AI algorithm for fracture detection on radiographs in late 2022. Radiologists and traumatologists were asked about their knowledge and perception of AI on a 7-point Likert scale (-3, "strongly disagree"; +3, "strongly agree"). Self-generated identification codes allowed the same individuals to be matched pre-intervention and post-intervention, so that the Wilcoxon signed rank test for paired data could be used. RESULTS: A total of 47/71 matched participants completed both surveys (66% follow-up rate) and were eligible for analysis (34 radiologists [72%], 13 traumatologists [28%], 15 women [32%]; mean age, 34.8 ± 7.8 years). Postintervention, agreement increased that AI "reduced missed findings" (1.28 [pre] vs 1.94 [post], P = 0.003) and made readers "safer" (1.21 vs 1.64, P = 0.048), but not "faster" (0.98 vs 1.21, P = 0.261). Disagreement that AI could "replace the radiological report" grew (-2.04 vs -2.34, P = 0.038), and self-reported knowledge about "clinical AI," its "chances," and its "risks" increased (0.40 vs 1.00, 1.21 vs 1.70, and 0.96 vs 1.34; all P ≤ 0.028). Radiologists used AI results more frequently than traumatologists (P < 0.001) and rated benefits higher (all P ≤ 0.038), whereas senior physicians were less likely to use AI or endorse its benefits (negative correlation with age, -0.35 to 0.30; all P ≤ 0.046). CONCLUSIONS: Implementing AI for emergency radiology into clinical routine has an educative aspect and underlines the concept of AI as a "second reader," to support and not replace physicians.
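The paired pre/post comparison described in the methods can be illustrated with a minimal sketch of the Wilcoxon signed rank statistic. This is not the study's analysis code: the function name `wilcoxon_signed_rank` is hypothetical, the Likert ratings are invented for illustration, and the z value uses a plain normal approximation without tie or continuity correction.

```python
import math

def wilcoxon_signed_rank(pre, post):
    """Return (W_plus, W_minus, z) for paired samples.

    Zero differences are dropped, tied absolute differences receive
    average ranks, and z is a normal approximation (no tie or
    continuity correction), which is rough for small samples.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, w_minus, z

# Invented ratings on the -3..+3 scale for six matched participants.
pre = [1, 0, 2, -1, 1, 2]
post = [2, 2, 2, 1, 0, 3]
w_plus, w_minus, z = wilcoxon_signed_rank(pre, post)
print(w_plus, w_minus, round(z, 3))
```

With samples this small, an exact reference distribution would normally be preferred over the normal approximation; the sketch only shows how the matched identification codes make a per-person difference, and hence a signed rank, possible.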
Subjects
Physicians, Radiology, Female, Humans, Adult, Artificial Intelligence, Prospective Studies, Perception
ABSTRACT
PURPOSE: To develop and validate an artificial intelligence algorithm for the positioning assessment of tracheal tubes (TTs) and central venous catheters (CVCs) in supine chest radiographs (SCXRs), using an algorithmic approach that allows for adjustable definitions of intended device positioning. MATERIALS AND METHODS: Positioning quality of CVCs and TTs is evaluated by spatially correlating the respective tip positions with anatomical structures. For CVC analysis, a configurable region of interest is defined to approximate the expected region of well-positioned CVC tips from segmentations of anatomical landmarks. The CVC/TT information is estimated by introducing a new multitask neural network architecture for jointly performing type/existence classification, course segmentation, and tip detection. Validation data consisted of 589 SCXRs that had been radiologically annotated for inserted TTs/CVCs, including an expert's categorical positioning assessment (reading 1). In-image positions of algorithm-detected TT/CVC tips could be corrected using a validation software tool (reading 2), which allowed localization accuracy to be quantified. Algorithmic detection of images with misplaced devices (reading 1 as reference standard) was quantified by receiver operating characteristics. RESULTS: Supine chest radiographs were correctly classified according to inserted TTs/CVCs in 100%/98% of the cases, and the device tips were also localized with high spatial accuracy: corrections of less than 3 mm in >86% (TTs) and 77% (CVCs) of the cases. Chest radiographs with malpositioned devices were detected with areas under the curve of >0.98 (TTs), >0.96 (CVCs with accidental vessel turnover), and >0.93 (also suboptimal CVC insertion length considered). The receiver operating characteristics limitations regarding CVC assessment were mainly caused by limitations of the applied CXR position definitions (region of interest derived from anatomical landmarks), not by algorithmic spatial detection inaccuracies. CONCLUSIONS: The TT and CVC tips were accurately localized in SCXRs by the presented algorithms, but triaging applications for CVC positioning assessment still suffer from the vague definition of optimal CXR positioning. Our algorithm, however, allows for an adjustment of these criteria, theoretically enabling them to meet user-specific or patient subgroup requirements. Besides CVC tip analysis, future work should also include specific course analysis for accidental vessel turnover detection.
Subjects
Central Venous Catheterization, Central Venous Catheters, Humans, Central Venous Catheterization/methods, Artificial Intelligence, Radiography, Thoracic Radiography/methods
ABSTRACT
(1) Background: Chest radiography (CXR) is still a key diagnostic component in the emergency department (ED). Correct interpretation is essential since some pathologies require urgent treatment. This study quantifies potential discrepancies in CXR analysis between radiologists and non-radiology physicians in training with ED experience. (2) Methods: Nine differently qualified physicians (three board-certified radiologists [BCR], three radiology residents [RR], and three non-radiology residents involved in ED [NRR]) evaluated a series of 563 posteroanterior CXR images by quantifying suspicion for four relevant pathologies: pleural effusion, pneumothorax, pneumonia, and pulmonary nodules. Reading results were noted separately for each hemithorax on a Likert scale (0-4; 0: no suspicion of pathology, 4: definite presence of pathology), adding up to a total of 40,536 reported pathology suspicions. Interrater reliability/correlation and Kruskal-Wallis tests were performed for statistical analysis. (3) Results: While interrater reliability was good among radiologists, major discrepancies between radiologists' and non-radiologists' reading results were observed for all pathologies. The highest overall interrater agreement was found for pneumothorax detection and the lowest for raising suspicion of nodules suspicious for malignancy. Pleural effusion and pneumonia were often rated with intermediate scores (1-3). For pneumothorax detection, all readers mainly chose a definite score (0 or 4). Interrater reliability was usually higher when evaluating the right hemithorax (all pathologies except pneumothorax). (4) Conclusions: Quantified CXR interrater reliability analysis displays a general uncertainty and strongly depends on medical training. NRR can benefit from radiology reporting in terms of time efficiency and diagnostic accuracy. CXR evaluation by ED specialists with long-term training has not been tested.
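The reported total of 40,536 pathology suspicions follows directly from the study design stated in the methods. A short arithmetic check, with variable names chosen here for illustration:

```python
# Study design figures taken from the abstract.
readers = 3 + 3 + 3   # BCR + RR + NRR
images = 563          # CXR images read by every reader
hemithoraces = 2      # each side rated separately
pathologies = 4       # effusion, pneumothorax, pneumonia, nodules

total_ratings = readers * images * hemithoraces * pathologies
print(total_ratings)  # 40536
```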
ABSTRACT
Importance: Most early lung cancers present as pulmonary nodules on imaging, but these can be easily missed on chest radiographs. Objective: To assess if a novel artificial intelligence (AI) algorithm can help detect pulmonary nodules on radiographs at different levels of detection difficulty. Design, Setting, and Participants: This diagnostic study included 100 posteroanterior chest radiograph images taken between 2000 and 2010 of adult patients from an ambulatory health care center in Germany and a lung image database in the US. Included images were selected to represent nodules with different levels of detection difficulty (from easy to difficult), and comprised both normal and nonnormal controls. Exposures: All images were processed with a novel AI algorithm, the AI Rad Companion Chest X-ray. Two thoracic radiologists established the ground truth and 9 test radiologists from Germany and the US independently reviewed all images in 2 sessions (unaided and AI-aided mode) with at least a 1-month washout period. Main Outcomes and Measures: Each test radiologist recorded the presence of 5 findings (pulmonary nodules, atelectasis, consolidation, pneumothorax, and pleural effusion) and their level of confidence for detecting the individual finding on a scale of 1 to 10 (1 representing lowest confidence; 10, highest confidence). The analyzed metrics for nodules included sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC). Results: Images from 100 patients were included, with a mean (SD) age of 55 (20) years and including 64 men and 36 women. Mean detection accuracy across the 9 radiologists improved by 6.4% (95% CI, 2.3% to 10.6%) with AI-aided interpretation compared with unaided interpretation. Partial AUCs within the effective interval range of 0 to 0.2 false positive rate improved by 5.6% (95% CI, -1.4% to 12.0%) with AI-aided interpretation.
Junior radiologists saw greater improvement in sensitivity for nodule detection with AI-aided interpretation as compared with their senior counterparts (12%; 95% CI, 4% to 19% vs 9%; 95% CI, 1% to 17%) while senior radiologists experienced similar improvement in specificity (4%; 95% CI, -2% to 9%) as compared with junior radiologists (4%; 95% CI, -3% to 5%). Conclusions and Relevance: In this diagnostic study, an AI algorithm was associated with improved detection of pulmonary nodules on chest radiographs compared with unaided interpretation for different levels of detection difficulty and for readers with different experience.
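The nodule metrics named in this abstract (sensitivity, specificity, accuracy) are all derived from the same four confusion-matrix counts. The sketch below shows the standard definitions; the function name `reader_metrics` and the counts are invented for illustration and are not the study's data.

```python
def reader_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: nodule-positive images read as positive/negative
    tn/fp: nodule-negative images read as negative/positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Invented counts for one reader in two sessions over 100 images.
unaided = reader_metrics(tp=35, fn=15, tn=45, fp=5)
ai_aided = reader_metrics(tp=40, fn=10, tn=45, fp=5)
print("unaided:", unaided)
print("AI-aided:", ai_aided)
```

In this toy example only the number of detected nodules changes between sessions, so sensitivity and accuracy rise while specificity stays the same, which is why the abstract reports the three metrics separately.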
Subjects
Algorithms, Lung Neoplasms/diagnostic imaging, Adult, Artificial Intelligence, Female, Germany, Humans, Male, Middle Aged, Multiple Pulmonary Nodules/diagnostic imaging, Computer-Assisted Radiographic Image Interpretation, Thoracic Radiography, Sensitivity and Specificity, Solitary Pulmonary Nodule/diagnostic imaging
ABSTRACT
OBJECTIVES: We hypothesized that published performances of algorithms for artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXRs) do not sufficiently consider the influence of PTX size and confounding effects caused by thoracic tubes (TTs). Therefore, we established a radiologically annotated benchmarking cohort (n = 6446) allowing for a detailed subgroup analysis. MATERIALS AND METHODS: We retrospectively identified 6434 supine CXRs, among them 1652 PTX-positive cases and 4782 PTX-negative cases. Supine CXRs were radiologically annotated for PTX size, PTX location, and inserted TTs. The diagnostic performances of 2 AI algorithms ("AI_CheXNet" [Rajpurkar et al], "AI_1.5" [Guendel et al]), both trained on publicly available datasets with labels obtained from automatic report interpretation, were quantified. The algorithms' discriminative power for PTX detection was quantified by the area under the receiver operating characteristic curve (AUROC), and significance analysis was based on the corresponding 95% confidence interval. A detailed subgroup analysis was performed to quantify the influence of PTX size and the confounding effects caused by inserted TTs. RESULTS: Algorithm performance was quantified as follows: overall performance with AUROCs of 0.704 (AI_1.5) / 0.765 (AI_CheXNet) for unilateral PTXs, AUROCs of 0.666 (AI_1.5) / 0.722 (AI_CheXNet) for unilateral PTXs smaller than 1 cm, and AUROCs of 0.735 (AI_1.5) / 0.818 (AI_CheXNet) for unilateral PTXs larger than 2 cm. Subgroup analysis identified TTs as strong confounders that significantly influence algorithm performance: discriminative power was completely lost when PTX-positive cases without TTs were compared with PTX-negative control cases with inserted TTs. Conversely, AUROCs increased up to 0.875 (AI_CheXNet) for large PTX-positive cases with inserted TTs compared with control cases without TTs.
CONCLUSIONS: Our detailed subgroup analysis demonstrated that the performance of established AI algorithms for PTX detection trained on public datasets strongly depends on PTX size and is significantly biased by confounding image features, such as inserted TTs. Our established, clinically relevant and radiologically annotated benchmarking cohort might be of great benefit for ongoing algorithm development.
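The kind of subgroup AUROC comparison described in this abstract can be sketched with the pairwise (Mann-Whitney) formulation of the AUROC. Everything below is an illustrative assumption: the helper names `auroc` and `subgroup_auroc` are hypothetical, and the eight scored cases are invented and deliberately constructed so that the score tracks the thoracic tube more than the pneumothorax. It is not the benchmarking cohort or either evaluated algorithm.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; tied pairs count as one half."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Invented cases: (algorithm score, has_ptx, has_tube).
cases = [
    (0.90, True, True), (0.80, True, True),
    (0.75, True, False), (0.55, True, False),
    (0.70, False, True), (0.60, False, True),
    (0.30, False, False), (0.20, False, False),
]

def subgroup_auroc(cases, pos_tube, neg_tube):
    """AUROC for PTX-positive cases with tube status `pos_tube` against
    PTX-negative controls with tube status `neg_tube`."""
    pos = [s for s, ptx, tube in cases if ptx and tube == pos_tube]
    neg = [s for s, ptx, tube in cases if not ptx and tube == neg_tube]
    return auroc(pos, neg)

overall = auroc([s for s, ptx, _ in cases if ptx],
                [s for s, ptx, _ in cases if not ptx])
no_tube_vs_tube = subgroup_auroc(cases, pos_tube=False, neg_tube=True)
tube_vs_no_tube = subgroup_auroc(cases, pos_tube=True, neg_tube=False)
print(overall, no_tube_vs_tube, tube_vs_no_tube)
```

In this toy data the pooled AUROC looks respectable, yet it falls to chance level when tube-free positives are compared with tube-bearing controls and becomes perfect in the opposite pairing, which is the pattern of confounding the subgroup analysis is designed to expose.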