Results 1 - 20 of 43
1.
Rofo ; 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38663428

ABSTRACT

The aim of this study was to explore the potential of weak supervision in a deep learning-based label prediction model. The goal was to use this model to extract labels from German free-text thoracic radiology reports on chest X-ray images and to use these labels for training chest X-ray classification models. The proposed label extraction model for German thoracic radiology reports uses a German BERT encoder as a backbone and classifies a report based on the CheXpert labels. To investigate the efficient use of manually annotated data, the model was trained using manual annotations, weak rule-based labels, and both. Rule-based labels were extracted from 66,071 retrospectively collected radiology reports from 2017-2021 (DS 0), and 1,091 reports from 2020-2021 (DS 1) were manually labeled according to the CheXpert classes. Label extraction performance was evaluated with respect to mention extraction, negation detection, and uncertainty detection by measuring F1 scores. The influence of the label extraction method on chest X-ray classification was evaluated on a pneumothorax data set (DS 2) containing 6,434 chest radiographs with associated reports and expert diagnoses of pneumothorax. For this, DenseNet-121 models trained on manual annotations, rule-based and deep learning-based label predictions, and publicly available data were compared. The proposed deep learning-based labeler (DL) performed considerably better on average than the rule-based labeler (RB) for all three tasks on DS 1, with F1 scores of 0.938 vs. 0.844 for mention extraction, 0.891 vs. 0.821 for negation detection, and 0.624 vs. 0.518 for uncertainty detection. Pre-training on DS 0 and fine-tuning on DS 1 performed better than training on either DS 0 or DS 1 alone. Chest X-ray pneumothorax classification results (DS 2) were highest when trained with DL labels, with an area under the receiver operating characteristic curve (AUC) of 0.939, compared to RB labels with an AUC of 0.858. Training with manual labels performed slightly worse than training with DL labels, with an AUC of 0.934. In contrast, training with a public data set resulted in an AUC of 0.720. Our results show that leveraging a rule-based report labeler for weak supervision leads to improved labeling performance. The pneumothorax classification results demonstrate that our proposed deep learning-based labeler can serve as a substitute for manual labeling, requiring only 1,000 manually annotated reports for training. · The proposed deep learning-based label extraction model for German thoracic radiology reports performs better than the rule-based model. · Training with limited supervision outperformed training with a small manually labeled data set. · Using predicted labels for pneumothorax classification from chest radiographs performed equally to using manual annotations. Wollek A, Haitzer P, Sedlmeyr T et al. Language model-based labeling of German thoracic radiology reports. Fortschr Röntgenstr 2024; DOI 10.1055/a-2287-5054.
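A minimal sketch of the kind of report-labeling setup described in this abstract, assuming a Hugging Face German BERT checkpoint (bert-base-german-cased) and the 14 CheXpert observation classes; the authors' actual model, preprocessing, and training procedure are not reproduced here, and the classification head below is untrained until fine-tuned on annotated reports:

```python
# Hypothetical sketch: multi-label CheXpert-style classification of German
# radiology reports with a German BERT encoder (not the authors' exact model).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHEXPERT_LABELS = [
    "No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity",
    "Lung Lesion", "Edema", "Consolidation", "Pneumonia", "Atelectasis",
    "Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture",
    "Support Devices",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-german-cased",
    num_labels=len(CHEXPERT_LABELS),
    problem_type="multi_label_classification",  # sigmoid outputs, BCE loss when fine-tuning
)

report = "Kein Nachweis eines Pneumothorax. Regelrechte Lage der Thoraxdrainage."
inputs = tokenizer(report, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze(0)  # head is still random here

for label, p in zip(CHEXPERT_LABELS, probs.tolist()):
    print(f"{label:30s} {p:.2f}")
```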

2.
Chest ; 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38295950

ABSTRACT

BACKGROUND: Chest radiographs (CXRs) remain of crucial importance in primary diagnostics, but their interpretation poses difficulties at times. RESEARCH QUESTION: Can a convolutional neural network-based artificial intelligence (AI) system that interprets CXRs add value in an emergency unit setting? STUDY DESIGN AND METHODS: A total of 563 CXRs acquired in the emergency unit of a major university hospital were retrospectively assessed twice by three board-certified radiologists, three radiology residents (RRs), and three emergency unit-experienced nonradiology residents (NRRs). They used a two-step reading process: (1) without AI support (woAI); and (2) with AI support (wAI) providing additional images with AI overlays. Suspicion of four pathologies (pleural effusion, pneumothorax, consolidations suspicious for pneumonia, and nodules) was reported on a five-point confidence scale. Confidence scores of the board-certified radiologists were converted into four binary reference standards (RFS I-IV) of different sensitivities. Performance of RRs and NRRs woAI/wAI was statistically compared by using receiver operating characteristics (ROCs), Youden statistics, and operating point metrics derived from fitted ROC curves. RESULTS: NRRs significantly improved performance, sensitivity, and accuracy wAI for all four pathologies tested. In the most sensitive RFS IV, NRR consensus improved the area under the ROC curve (mean, 95% CI) in the detection of the time-critical pathology pneumothorax from 0.846 (0.785-0.907) woAI to 0.974 (0.947-1.000) wAI (P < .001), which represented a gain of 30% in sensitivity and 2% in accuracy (while maintaining an optimized specificity). The most pronounced effect was observed in nodule detection, with NRRs wAI improving sensitivity by 53% and accuracy by 7% (area under the ROC curve woAI, 0.723 [0.661-0.785]; wAI, 0.890 [0.848-0.931]; P < .001). The RR consensus wAI showed smaller, mostly nonsignificant gains in performance, sensitivity, and accuracy. INTERPRETATION: In an emergency unit setting without 24/7 radiology coverage, the presented AI solution provides an excellent clinical support tool for nonradiologists, similar to a second reader, and allows for a more accurate primary diagnosis and thus earlier therapy initiation.
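The Youden-based operating points mentioned above can be illustrated with a short, self-contained sketch; the labels and reader scores below are synthetic stand-ins, not study data:

```python
# Sketch of a Youden-index operating point derived from an ROC curve,
# as used for the reader comparisons described above; illustrative data only.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)           # binary reference standard
y_score = y_true * 1.5 + rng.normal(size=200)   # stand-in for reader confidence scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
youden_j = tpr - fpr                            # J = sensitivity + specificity - 1
best = int(np.argmax(youden_j))

print(f"AUC                  : {roc_auc_score(y_true, y_score):.3f}")
print(f"Optimal threshold    : {thresholds[best]:.2f}")
print(f"Sensitivity at J_max : {tpr[best]:.3f}")
print(f"Specificity at J_max : {1 - fpr[best]:.3f}")
```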

3.
Int J Legal Med ; 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38286953

ABSTRACT

BACKGROUND: Radiological age assessment using reference studies is inherently limited in accuracy due to a finite number of assignable skeletal maturation stages. To overcome this limitation, we present a deep learning approach for continuous age assessment based on clavicle ossification in computed tomography (CT). METHODS: Thoracic CT scans were retrospectively collected from the picture archiving and communication system. Individuals aged 15.0 to 30.0 years examined in routine clinical practice were included. All scans were automatically cropped around the medial clavicular epiphyseal cartilages. A deep learning model was trained to predict a person's chronological age based on these scans. Performance was evaluated using the mean absolute error (MAE). Model performance was compared to an optimistic human reader performance estimate for an established reference study method. RESULTS: The deep learning model was trained on 4,400 scans of 1,935 patients (training set: mean age = 24.2 years ± 4.0, 1,132 female) and evaluated on 300 scans of 300 patients with a balanced age and sex distribution (test set: mean age = 22.5 years ± 4.4, 150 female). The model MAE was 1.65 years; the highest absolute errors, 6.40 years for females and 7.32 years for males, could be attributed to norm variants or pathologic disorders. The human reader estimate MAE was 1.84 years, with the highest absolute error being 3.40 years for females and 3.78 years for males. CONCLUSIONS: We present a deep learning approach for continuous age predictions using CT volumes highlighting the medial clavicular epiphyseal cartilage, with performance comparable to the human reader estimate.
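As a rough illustration of the continuous-age idea, the following sketch trains a regression head with an L1 objective, which directly optimizes the MAE reported above; the backbone (a 2D ResNet-18 on single-channel crops) and the random tensors are assumptions, not the authors' architecture or data:

```python
# Hypothetical sketch: continuous age regression trained with L1 loss (MAE).
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None)
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)  # 1-channel CT crop
model.fc = nn.Linear(model.fc.in_features, 1)                                   # single age output

criterion = nn.L1Loss()                        # mean absolute error in years
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One illustrative training step on random tensors standing in for cropped CT slices.
images = torch.randn(8, 1, 128, 128)
ages = torch.empty(8).uniform_(15.0, 30.0)     # chronological ages in years

pred = model(images).squeeze(1)
loss = criterion(pred, ages)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"MAE on this batch: {loss.item():.2f} years")
```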

4.
Rofo ; 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38295825

ABSTRACT

PURPOSE: The aim of this study was to develop an algorithm to automatically extract annotations from German thoracic radiology reports to train deep learning-based chest X-ray classification models. MATERIALS AND METHODS: An automatic label extraction model for German thoracic radiology reports was designed based on the CheXpert architecture. The algorithm can extract labels for twelve common chest pathologies, the presence of support devices, and "no finding". For iterative improvements and to generate a ground truth, a web-based multi-reader annotation interface was created. With the proposed annotation interface, a radiologist annotated 1086 retrospectively collected radiology reports from 2020-2021 (data set 1). The effect of automatically extracted labels on chest radiograph classification performance was evaluated on an additional, in-house pneumothorax data set (data set 2), containing 6434 chest radiographs with corresponding reports, by comparing DenseNet-121 models trained on labels extracted from the associated reports, on image-based pneumothorax labels, and on publicly available data, respectively. RESULTS: Comparing automated to manual labeling on data set 1, class-wise F1 scores ranged from 0.8 to 0.995 for "mention extraction", from 0.624 to 0.981 for "negation detection", and from 0.353 to 0.725 for "uncertainty detection". Extracted pneumothorax labels on data set 2 had a sensitivity of 0.997 [95% CI: 0.994, 0.999] and a specificity of 0.991 [95% CI: 0.988, 0.994]. The model trained on publicly available data achieved an area under the receiver operating characteristic curve (AUC) for pneumothorax classification of 0.728 [95% CI: 0.694, 0.760], while the models trained on automatically extracted labels and on manual annotations achieved values of 0.858 [95% CI: 0.832, 0.882] and 0.934 [95% CI: 0.918, 0.949], respectively. CONCLUSION: Automatic label extraction from German thoracic radiology reports is a promising substitute for manual labeling. By reducing the time required for data annotation, larger training data sets can be created, resulting in improved overall modeling performance. Our results demonstrated that a pneumothorax classifier trained on automatically extracted labels strongly outperformed the model trained on publicly available data without the need for additional annotation time, and performed competitively compared to manually labeled data. KEY POINTS: · An algorithm for automatic German thoracic radiology report annotation was developed. · Automatic label extraction is a promising substitute for manual labeling. · The classifier trained on extracted labels outperformed the model trained on publicly available data.
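The sensitivity/specificity point estimates with 95% CIs reported above can be reproduced in form (not in values) with a simple bootstrap sketch on synthetic labels:

```python
# Sketch of sensitivity/specificity with bootstrap 95% confidence intervals;
# the arrays below are synthetic and only illustrate the calculation.
import numpy as np

def sens_spec(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp)

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=6434)
y_pred = np.where(rng.random(6434) < 0.99, y_true, 1 - y_true)   # ~99% agreement

sens, spec = sens_spec(y_true, y_pred)
boot = np.array([
    sens_spec(y_true[idx], y_pred[idx])
    for idx in (rng.integers(0, len(y_true), len(y_true)) for _ in range(1000))
])
lo_s, hi_s = np.percentile(boot[:, 0], [2.5, 97.5])
lo_p, hi_p = np.percentile(boot[:, 1], [2.5, 97.5])
print(f"sensitivity {sens:.3f} [95% CI: {lo_s:.3f}, {hi_s:.3f}]")
print(f"specificity {spec:.3f} [95% CI: {lo_p:.3f}, {hi_p:.3f}]")
```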

5.
Invest Radiol ; 59(4): 306-313, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-37682731

ABSTRACT

PURPOSE: To develop and validate an artificial intelligence algorithm for the positioning assessment of tracheal tubes (TTs) and central venous catheters (CVCs) in supine chest radiographs (SCXRs) by using an algorithm approach allowing for adjustable definitions of intended device positioning. MATERIALS AND METHODS: Positioning quality of CVCs and TTs is evaluated by spatially correlating the respective tip positions with anatomical structures. For CVC analysis, a configurable region of interest is defined to approximate the expected region of well-positioned CVC tips from segmentations of anatomical landmarks. The CVC/TT information is estimated by introducing a new multitask neural network architecture for jointly performing type/existence classification, course segmentation, and tip detection. Validation data consisted of 589 SCXRs that had been radiologically annotated for inserted TTs/CVCs, including an expert's categorical positioning assessment (reading 1). In-image positions of algorithm-detected TT/CVC tips could be corrected using a validation software tool (reading 2) that finally allowed for quantification of localization accuracy. Algorithmic detection of images with misplaced devices (reading 1 as reference standard) was quantified by receiver operating characteristics. RESULTS: Supine chest radiographs were correctly classified according to inserted TTs/CVCs in 100%/98% of the cases, with high accuracy also in spatially localizing the medical device tips: corrections of less than 3 mm in >86% (TTs) and 77% (CVCs) of the cases. Chest radiographs with malpositioned devices were detected with areas under the curve of >0.98 (TTs), >0.96 (CVCs with accidental vessel turnover), and >0.93 (when suboptimal CVC insertion length was also considered). The receiver operating characteristic limitations regarding CVC assessment were mainly caused by limitations of the applied CVC position definitions (region of interest derived from anatomical landmarks), not by algorithmic spatial detection inaccuracies. CONCLUSIONS: The TT and CVC tips were accurately localized in SCXRs by the presented algorithms, but triaging applications for CVC positioning assessment still suffer from the vague definition of optimal CVC positioning. Our algorithm, however, allows for an adjustment of these criteria, theoretically enabling them to meet user-specific or patient-subgroup requirements. Besides CVC tip analysis, future work should also include specific course analysis for the detection of accidental vessel turnover.
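A hypothetical sketch of the multitask idea described above, with a shared encoder and separate heads for type/existence classification, course segmentation, and tip detection; the layer sizes and head designs are illustrative assumptions, not the study's architecture:

```python
# Hypothetical multitask network: shared encoder, three task heads.
import torch
import torch.nn as nn

class DeviceMultiTaskNet(nn.Module):
    def __init__(self, num_device_types: int = 2):   # e.g., TT and CVC
        super().__init__()
        self.encoder = nn.Sequential(                 # shared feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(              # type/existence head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_device_types)
        )
        self.segmentation = nn.Conv2d(32, num_device_types, 1)  # course masks
        self.tip_heatmap = nn.Conv2d(32, num_device_types, 1)   # tip location heatmaps

    def forward(self, x):
        feats = self.encoder(x)
        return self.classifier(feats), self.segmentation(feats), self.tip_heatmap(feats)

net = DeviceMultiTaskNet()
logits, masks, heatmaps = net(torch.randn(1, 1, 256, 256))
print(logits.shape, masks.shape, heatmaps.shape)
```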


Subject(s)
Central Venous Catheterization, Central Venous Catheters, Humans, Central Venous Catheterization/methods, Artificial Intelligence, Radiography, Thoracic Radiography/methods
6.
Invest Radiol ; 59(5): 404-412, 2024 May 01.
Article in English | MEDLINE | ID: mdl-37843828

ABSTRACT

PURPOSE: The aim of this study was to evaluate the impact of implementing an artificial intelligence (AI) solution for emergency radiology into clinical routine on physicians' perception and knowledge. MATERIALS AND METHODS: A prospective interventional survey was performed before and 3 months after implementation of an AI algorithm for fracture detection on radiographs in late 2022. Radiologists and traumatologists were asked about their knowledge and perception of AI on a 7-point Likert scale (-3, "strongly disagree"; +3, "strongly agree"). Self-generated identification codes allowed matching the same individuals pre-intervention and post-intervention and using the Wilcoxon signed rank test for paired data. RESULTS: A total of 47/71 matched participants completed both surveys (66% follow-up rate) and were eligible for analysis (34 radiologists [72%], 13 traumatologists [28%], 15 women [32%]; mean age, 34.8 ± 7.8 years). Post-intervention, there was increased agreement that AI "reduced missed findings" (1.28 [pre] vs 1.94 [post], P = 0.003) and made readers "safer" (1.21 vs 1.64, P = 0.048), but not "faster" (0.98 vs 1.21, P = 0.261). There was rising disagreement that AI could "replace the radiological report" (-2.04 vs -2.34, P = 0.038), as well as an increase in self-reported knowledge about "clinical AI," its "chances," and its "risks" (0.40 vs 1.00, 1.21 vs 1.70, and 0.96 vs 1.34; all P ≤ 0.028). Radiologists used AI results more frequently than traumatologists (P < 0.001) and rated the benefits higher (all P ≤ 0.038), whereas senior physicians were less likely to use AI or endorse its benefits (negative correlation with age, -0.35 to -0.30; all P ≤ 0.046). CONCLUSIONS: Implementing AI for emergency radiology into clinical routine has an educative aspect and underlines the concept of AI as a "second reader" that supports rather than replaces physicians.
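The paired pre/post comparison on Likert ratings can be sketched with SciPy's Wilcoxon signed-rank test; the ratings below are invented for illustration, not survey data:

```python
# Sketch of a paired pre/post comparison of 7-point Likert ratings (-3 to +3)
# from the same participants, using the Wilcoxon signed-rank test.
import numpy as np
from scipy.stats import wilcoxon

pre  = np.array([1, 2, 0, 1, 2, 1, 0, 3, 1, 2, -1, 1])   # e.g., "AI reduces missed findings", pre
post = np.array([2, 3, 1, 2, 3, 2, 1, 3, 2, 3,  0, 2])   # same participants, post

stat, p = wilcoxon(pre, post)    # zero differences are dropped by the default method
print(f"Wilcoxon W = {stat:.1f}, p = {p:.4f}")
```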


Subject(s)
Physicians, Radiology, Female, Humans, Adult, Artificial Intelligence, Prospective Studies, Perception
7.
Med Phys ; 51(4): 2721-2732, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37831587

ABSTRACT

BACKGROUND: Deep learning models are being applied to more and more use cases with astonishing success stories, but how do they perform in the real world? Models are typically tested on specific cleaned data sets, but when deployed in the real world, a model will encounter unexpected, out-of-distribution (OOD) data. PURPOSE: To investigate the impact of OOD radiographs on existing chest x-ray classification models and to increase their robustness against OOD data. METHODS: The study employed the commonly used chest x-ray classification model CheXnet, trained on the chest x-ray 14 data set, and tested its robustness against OOD data using three public radiography data sets, IRMA, Bone Age, and MURA, as well as the ImageNet data set. To detect OOD data for multi-label classification, we proposed in-distribution voting (IDV). OOD detection performance was measured across data sets using area under the receiver operating characteristic curve (AUC) analysis and compared with Mahalanobis-based OOD detection, MaxLogit, MaxEnergy, self-supervised OOD detection (SS OOD), and CutMix. RESULTS: Without additional OOD detection, the chest x-ray classifier failed to discard any OOD images, with an AUC of 0.5. The proposed IDV approach, trained on ID (chest x-ray 14) and OOD data (IRMA and ImageNet), achieved an average OOD detection AUC of 0.999 across the three data sets, surpassing all other OOD detection methods. Mahalanobis-based OOD detection achieved an average OOD detection AUC of 0.982. IDV trained solely with a few thousand ImageNet images had an AUC of 0.913, which was considerably higher than MaxLogit (0.726), MaxEnergy (0.724), SS OOD (0.476), and CutMix (0.376). CONCLUSIONS: The performance of the tested OOD detection methods did not translate well to radiography data sets, except for Mahalanobis-based OOD detection and the proposed IDV method. Consequently, training solely on ID data led to incorrect classification of OOD images as ID, resulting in increased false positive rates. IDV substantially improved the model's ID classification performance, even when trained with data that will not occur in the intended use case or test set (ImageNet), without additional inference overhead or a performance decrease in the target classification. The corresponding code is available at https://gitlab.lrz.de/IP/a-knee-cannot-have-lung-disease.
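Since the abstract does not detail in-distribution voting itself, the sketch below shows two of the baseline OOD scores it is compared against, MaxLogit and MaxEnergy, computed from per-class logits; the proposed IDV method is not reproduced here:

```python
# Sketch of two baseline OOD scores (MaxLogit and energy-based scoring)
# computed from the per-class logits of a multi-label classifier.
import torch

def max_logit_score(logits: torch.Tensor) -> torch.Tensor:
    """Higher score -> more likely in-distribution."""
    return logits.max(dim=1).values

def max_energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Negative free energy; higher -> more likely in-distribution."""
    return temperature * torch.logsumexp(logits / temperature, dim=1)

logits = torch.randn(4, 14)    # stand-in for per-class logits (e.g., 14 ChestX-ray14 classes)
print(max_logit_score(logits))
print(max_energy_score(logits))
```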


Subject(s)
Voting, X-Rays, Radiography, ROC Curve
8.
J Imaging ; 9(12)2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38132688

ABSTRACT

Public chest X-ray (CXR) data sets are commonly compressed to a lower bit depth to reduce their size, potentially hiding subtle diagnostic features. In contrast, radiologists apply a windowing operation to the uncompressed image to enhance such subtle features. While it has been shown that windowing improves classification performance on computed tomography (CT) images, the impact of such an operation on CXR classification performance remains unclear. In this study, we show that windowing strongly improves the CXR classification performance of machine learning models and propose WindowNet, a model that learns multiple optimal window settings. Our model achieved an average AUC score of 0.812 compared with the 0.759 score of a commonly used architecture without windowing capabilities on the MIMIC data set.
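The windowing operation referred to above can be written down in a few lines; the window center/width values here are arbitrary examples rather than the settings learned by WindowNet:

```python
# Sketch of intensity windowing: clip pixel values to
# [center - width/2, center + width/2] and rescale to [0, 1].
import numpy as np

def apply_window(image: np.ndarray, center: float, width: float) -> np.ndarray:
    low, high = center - width / 2.0, center + width / 2.0
    return (np.clip(image, low, high) - low) / (high - low)

raw = np.random.randint(0, 4096, size=(512, 512)).astype(np.float32)  # stand-in for a 12-bit CXR
windowed = apply_window(raw, center=2048, width=1024)
print(windowed.min(), windowed.max())   # 0.0 ... 1.0
```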

9.
Eur Radiol ; 2023 Oct 05.
Article in English | MEDLINE | ID: mdl-37794249

ABSTRACT

OBJECTIVES: To assess the quality of simplified radiology reports generated with the large language model (LLM) ChatGPT and to discuss challenges and chances of ChatGPT-like LLMs for medical text simplification. METHODS: In this exploratory case study, a radiologist created three fictitious radiology reports, which we simplified by prompting ChatGPT with "Explain this medical report to a child using simple language." In a questionnaire, we tasked 15 radiologists to rate the quality of the simplified radiology reports with respect to their factual correctness, completeness, and potential harm for patients. We used Likert scale analysis and inductive free-text categorization to assess the quality of the simplified reports. RESULTS: Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed relevant medical information, and potentially harmful passages were reported. CONCLUSION: While we see a need for further adaptation to the medical field, the initial insights of this study indicate a tremendous potential in using LLMs like ChatGPT to improve patient-centered care in radiology and other medical domains. CLINICAL RELEVANCE STATEMENT: Patients have started to use ChatGPT to simplify and explain their medical reports, which is expected to affect patient-doctor interaction. This phenomenon raises several opportunities and challenges for clinical routine. KEY POINTS: • Patients have started to use ChatGPT to simplify their medical reports, but their quality was unknown. • In a questionnaire, most participating radiologists overall attributed good quality to radiology reports simplified with ChatGPT. However, they also highlighted a notable presence of errors, potentially leading patients to draw harmful conclusions. • Large language models such as ChatGPT have vast potential to enhance patient-centered care in radiology and other medical domains. To realize this potential while minimizing harm, they need supervision by medical experts and adaptation to the medical field.
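A hypothetical sketch of the prompting setup using the openai Python client; the study prompted the ChatGPT interface directly, and the model name and report text below are placeholders, not the study's configuration:

```python
# Hypothetical sketch: simplifying a (fictitious) radiology report via the OpenAI API.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

report = "Fictitious radiology report text goes here."
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "user",
         "content": f"Explain this medical report to a child using simple language.\n\n{report}"},
    ],
)
print(response.choices[0].message.content)
```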

10.
Thromb J ; 21(1): 51, 2023 May 02.
Article in English | MEDLINE | ID: mdl-37131204

ABSTRACT

BACKGROUND: Pulmonary embolism (PE) is an important complication of Coronavirus disease 2019 (COVID-19). COVID-19 is associated with respiratory impairment and a pro-coagulative state, rendering PE more likely and more difficult to recognize. Several decision algorithms relying on clinical features and D-dimer have been established. The high prevalence of PE and elevated D-dimer in patients with COVID-19 might impair the performance of common decision algorithms. Here, we aimed to validate and compare five common decision algorithms implementing age-adjusted D-dimer, the GENEVA and Wells scores, as well as the PEGeD and YEARS algorithms in patients hospitalized with COVID-19. METHODS: In this single-center study, we included patients who were admitted to our tertiary care hospital in the COVID-19 Registry of the LMU Munich. We retrospectively selected patients who received a computed tomography pulmonary angiogram (CTPA) or pulmonary ventilation/perfusion scintigraphy (V/Q) for suspected PE. The performances of five commonly used diagnostic algorithms (age-adjusted D-dimer, GENEVA score, PEGeD algorithm, Wells score, and YEARS algorithm) were compared. RESULTS: We identified 413 patients with suspected PE who received a CTPA or V/Q, confirming 62 PEs (15%). Among them, 358 patients with 48 PEs (13%) could be evaluated for the performance of all algorithms. Patients with PE were older and their overall outcome was worse compared to patients without PE. Of the above five diagnostic algorithms, the PEGeD and YEARS algorithms performed best, reducing diagnostic imaging by 14% and 15%, respectively, with sensitivities of 95.7% and 95.6%. The GENEVA score was able to reduce CTPA or V/Q by 32.2% but suffered from a low sensitivity (78.6%). The age-adjusted D-dimer and the Wells score could not significantly reduce diagnostic imaging. CONCLUSION: The PEGeD and YEARS algorithms outperformed the other tested decision algorithms and worked well in patients admitted with COVID-19. These findings need independent validation in a prospective study.
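For context, the age-adjusted D-dimer rule used by such algorithms is commonly implemented as a cut-off of 500 µg/L (FEU) up to age 50 and age × 10 µg/L above 50, always in combination with a clinical pretest probability; a sketch under that assumption (assay units and the study's exact implementation may differ):

```python
# Sketch of the age-adjusted D-dimer cut-off (not a complete diagnostic algorithm;
# in practice it is applied only to patients with low/intermediate pretest probability).
def age_adjusted_ddimer_cutoff(age_years: int) -> float:
    """Return the D-dimer cut-off in µg/L FEU."""
    return age_years * 10.0 if age_years > 50 else 500.0

def below_cutoff(age_years: int, ddimer_ug_per_l: float) -> bool:
    """True if the measured D-dimer falls below the age-adjusted threshold."""
    return ddimer_ug_per_l < age_adjusted_ddimer_cutoff(age_years)

print(age_adjusted_ddimer_cutoff(45))   # 500.0
print(age_adjusted_ddimer_cutoff(78))   # 780.0
print(below_cutoff(78, 650.0))          # True
```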

11.
Radiol Artif Intell ; 5(2): e220187, 2023 Mar.
Article in English | MEDLINE | ID: mdl-37035429

ABSTRACT

Purpose: To investigate the chest radiograph classification performance of vision transformers (ViTs) and the interpretability of attention-based saliency maps, using the example of pneumothorax classification. Materials and Methods: In this retrospective study, ViTs were fine-tuned for lung disease classification using four public datasets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData. Saliency maps were generated using transformer multimodal explainability and gradient-weighted class activation mapping (GradCAM). Classification performance was evaluated on the Chest X-Ray 14, VinBigData, and Society for Imaging Informatics in Medicine-American College of Radiology (SIIM-ACR) Pneumothorax Segmentation datasets using area under the receiver operating characteristic curve (AUC) analysis and compared with convolutional neural networks (CNNs). The explainability methods were evaluated with positive and negative perturbation, sensitivity-n, effective heat ratio, intra-architecture repeatability, and inter-architecture reproducibility. In the user study, three radiologists classified 160 chest radiographs with and without saliency maps for pneumothorax and rated their usefulness. Results: ViTs had comparable chest radiograph classification AUCs compared with state-of-the-art CNNs: 0.95 (95% CI: 0.94, 0.95) versus 0.83 (95% CI: 0.83, 0.84) on Chest X-Ray 14, 0.84 (95% CI: 0.77, 0.91) versus 0.83 (95% CI: 0.76, 0.90) on VinBigData, and 0.85 (95% CI: 0.85, 0.86) versus 0.87 (95% CI: 0.87, 0.88) on SIIM-ACR. Both saliency map methods unveiled a strong bias toward pneumothorax tubes in the models. Radiologists found 47% of the attention-based and 39% of the GradCAM saliency maps useful. The attention-based methods outperformed GradCAM on all metrics. Conclusion: ViTs performed similarly to CNNs in chest radiograph classification, and their attention-based saliency maps were more useful to radiologists and outperformed GradCAM. Keywords: Conventional Radiography, Thorax, Diagnosis, Supervised Learning, Convolutional Neural Network (CNN). Online supplemental material is available for this article. © RSNA, 2023.
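A minimal sketch of fine-tuning a ViT for binary pneumothorax classification, assuming the timm library; the backbone choice, loss, and random tensors are illustrative, not the study's configuration:

```python
# Sketch: fine-tuning a vision transformer for binary classification.
import timm
import torch
import torch.nn as nn

model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=1)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

images = torch.randn(4, 3, 224, 224)          # stand-ins for chest radiographs
labels = torch.tensor([1.0, 0.0, 0.0, 1.0])   # pneumothorax yes/no

logits = model(images).squeeze(1)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```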

12.
Ultraschall Med ; 44(5): 537-543, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36854384

ABSTRACT

PURPOSE: The aim of the study was to evaluate whether the quantification of B-lines via lung ultrasound after lung transplantation is feasible and correlates with the diagnosis of primary graft dysfunction. METHODS: Following lung transplantation, patients underwent daily lung ultrasound on postoperative days 1-3. B-lines were quantified by an ultrasound score based on the number of single and confluent B-lines per intercostal space, using a four-region protocol. The ultrasound score was correlated with the diagnosis of primary graft dysfunction. Furthermore, correlation analyses and receiver operating characteristic analyses taking into account the ultrasound score, chest radiographs, and the PaO2/FiO2 ratio were performed. RESULTS: A total of 32 patients (91 ultrasound measurements) were included, of whom 10 were diagnosed with primary graft dysfunction. The median B-line score was 5 [IQR: 4, 8]. There was a significant correlation between the B-line score and the diagnosis of primary graft dysfunction (r = 0.59, p < 0.001). A significant correlation could also be seen between chest X-rays and primary graft dysfunction (r = 0.34, p = 0.008), but the B-line score showed superiority over chest X-rays with respect to diagnosing primary graft dysfunction in the receiver operating characteristic curves, with an area under the curve of 0.921 versus 0.708. There was a significant negative correlation between the B-line score and the PaO2/FiO2 ratio (r = -0.41, p < 0.001), but not between chest X-rays and the PaO2/FiO2 ratio (r = -0.14, p = 0.279). CONCLUSION: The appearance of B-lines correlated well with primary graft dysfunction and outperformed chest radiographs.


Subject(s)
Lung Transplantation, Primary Graft Dysfunction, Respiratory Distress Syndrome, Humans, Primary Graft Dysfunction/diagnostic imaging, Lung/diagnostic imaging, Ultrasonography, Lung Transplantation/adverse effects
13.
Int J Legal Med ; 137(3): 733-742, 2023 May.
Article in English | MEDLINE | ID: mdl-36729183

ABSTRACT

BACKGROUND: Deep learning is a promising technique to improve radiological age assessment. However, expensive manual annotation by experts poses a bottleneck for creating large datasets to appropriately train deep neural networks. We propose an object detection approach to automatically annotate the medial clavicular epiphyseal cartilages in computed tomography (CT) scans. METHODS: The sternoclavicular joints were selected as structure-of-interest (SOI) in chest CT scans and served as an easy-to-identify proxy for the actual medial clavicular epiphyseal cartilages. CT slices containing the SOI were manually annotated with bounding boxes around the SOI. All slices in the training set were used to train the object detection network RetinaNet. Afterwards, the network was applied individually to all slices of the test scans for SOI detection. Bounding box and slice position of the detection with the highest classification score were used as the location estimate for the medial clavicular epiphyseal cartilages inside the CT scan. RESULTS: From 100 CT scans of 82 patients, 29,656 slices were used for training and 30,846 slices from 110 CT scans of 110 different patients for testing the object detection network. The location estimate from the deep learning approach for the SOI was in a correct slice in 97/110 (88%), misplaced by one slice in 5/110 (5%), and missing in 8/110 (7%) test scans. No estimate was misplaced by more than one slice. CONCLUSIONS: We demonstrated a robust automated approach for annotating the medial clavicular epiphyseal cartilages. This enables training and testing of deep neural networks for age assessment.
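A sketch of the slice-wise detection-and-argmax strategy described above, using torchvision's pretrained RetinaNet as a stand-in for the study's SOI-trained network; inputs are random tensors rather than CT slices:

```python
# Sketch: run a RetinaNet detector slice by slice and keep the highest-scoring
# detection as the location estimate for the structure of interest (SOI).
import torch
from torchvision.models.detection import retinanet_resnet50_fpn

model = retinanet_resnet50_fpn(weights="DEFAULT").eval()   # COCO weights as a placeholder

slices = [torch.rand(3, 512, 512) for _ in range(4)]       # stand-ins for CT slices
with torch.no_grad():
    outputs = model(slices)                                # one dict per slice

candidates = [(i, out) for i, out in enumerate(outputs) if len(out["scores"]) > 0]
if candidates:
    best_slice, best_det = max(candidates, key=lambda item: item[1]["scores"].max().item())
    box = best_det["boxes"][best_det["scores"].argmax()]
    print(f"best slice index: {best_slice}, box: {box.tolist()}")
else:
    print("no detections above the model's score threshold")
```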


Subject(s)
Deep Learning, Growth Plate, Humans, Growth Plate/diagnostic imaging, X-Ray Computed Tomography/methods, Neural Networks (Computer), Clavicle/diagnostic imaging
14.
Diagnostics (Basel) ; 12(11)2022 Nov 18.
Article in English | MEDLINE | ID: mdl-36428913

ABSTRACT

(1) Background: CT perfusion (CTP) is a fast, robust and widely available but dose-exposing imaging technique for infarct core and penumbra detection. Carotid CT angiography (CTA) can precede CTP in the stroke protocol. Temporal information of the bolus tracking series of CTA could allow for better timing and a decreased number of scans in CTP, resulting in less radiation exposure, if the shortening of CTP does not alter the calculated infarct core and penumbra or the resulting perfusion maps, which are essential for further treatment decisions. (2) Methods: 66 consecutive patients with ischemic stroke proven by follow-up imaging or endovascular intervention were included in this retrospective study approved by the local ethics committee. In each case, six simulated, stepwise shortened CTP examinations were compared with the original data regarding the perfusion maps, infarct core, penumbra and endovascular treatment decision. (3) Results: In simulated CTPs with 26, 28 and 30 scans, the infarct core, penumbra and PRR values were equivalent, and the resulting clinical decision was identical to the original CTP. (4) Conclusions: The temporal information of the bolus tracking series of the carotid CTA can allow for better timing and a lower radiation exposure by eliminating unnecessary scans in CTP.

15.
Healthcare (Basel) ; 10(8)2022 Aug 04.
Article in English | MEDLINE | ID: mdl-36011128

ABSTRACT

MR-guided high-intensity focused ultrasound (MR-HIFU) is an effective method for treating symptomatic uterine fibroids, especially solitary lesions. The aim of our study was to compare the clinical and morphological outcomes of patients who underwent MR-HIFU due to a solitary fibroid (SF) or multiple fibroids (MFs) in a prospective clinical trial. We prospectively included 21 consecutive patients with SF (10) and MF (11) eligible for MR-guided HIFU. The morphological data were assessed using mint Lesion™ for MRI. The clinical data were determined using the Uterine Fibroid Symptom and Quality of Life (UFS-QOL) questionnaire before and 6 months after treatment. Unpaired and paired Wilcoxon tests and t-tests were applied, and Pearson's coefficient was used for correlation analysis. A p-value below 0.05 was considered statistically significant. The volume of treated fibroids decreased significantly in both the SF (mean baseline: 118.6 cm3; mean 6-month follow-up: 64.6 cm3) and MF (107.2 cm3; 55.1 cm3) groups. The UFS-QOL showed that clinical symptoms improved significantly for patients in both the SF and MF groups regarding concern, activities, energy/mood, and control. The short-term outcome of MR-guided HIFU treatment of symptomatic fibroids in a myomatous uterus is clinically similar to that of solitary fibroids.

16.
Sci Rep ; 12(1): 12764, 2022 07 27.
Article in English | MEDLINE | ID: mdl-35896763

ABSTRACT

Artificial intelligence (AI) algorithms evaluating [supine] chest radiographs ([S]CXRs) have increased remarkably in number in recent years. Since training and validation are often performed on subsets of the same overall dataset, external validation is mandatory to reproduce results and reveal potential training errors. We applied multicohort benchmarking to the publicly accessible (S)CXR-analyzing AI algorithm CheXNet, comprising three clinically relevant study cohorts which differ in patient positioning ([S]CXRs), the applied reference standards (CT-/[S]CXR-based), and the possibility to also compare algorithm classification with different medical experts' reading performance. The study cohorts include [1] a cohort of 563 CXRs acquired in the emergency unit that were evaluated by 9 readers (radiologists and non-radiologists) in terms of 4 common pathologies, [2] a collection of 6,248 SCXRs annotated by radiologists in terms of pneumothorax presence, its size, and the presence of inserted thoracic tube material, which allowed for subgroup and confounding bias analysis, and [3] a cohort consisting of 166 patients with SCXRs that were evaluated by radiologists for underlying causes of basal lung opacities, all of those cases having been correlated to a timely acquired computed tomography scan (SCXR and CT within < 90 min). CheXNet non-significantly exceeded the radiology resident (RR) consensus in the detection of suspicious lung nodules (cohort [1], AUC AI/RR: 0.851/0.839, p = 0.793) and the radiological readers in the detection of basal pneumonia (cohort [3], AUC AI/reader consensus: 0.825/0.782, p = 0.390) and basal pleural effusion (cohort [3], AUC AI/reader consensus: 0.762/0.710, p = 0.336) in SCXRs, partly with AUC values higher than originally published ("Nodule": 0.780, "Infiltration": 0.735, "Effusion": 0.864). The classifier "Infiltration" turned out to be very dependent on patient positioning (best in CXR, worst in SCXR). The pneumothorax SCXR cohort [2] revealed poor algorithm performance in CXRs without inserted thoracic material and in the detection of small pneumothoraces, which can be explained by a known systematic confounding error in the algorithm training process. The benefit of clinically relevant external validation is demonstrated by the differences in algorithm performance as compared to the original publication. Our multicohort benchmarking finally enables the consideration of confounders, different reference standards, and patient positioning, as well as the comparison of AI performance with differentially qualified medical readers.


Subject(s)
Artificial Intelligence, Pneumothorax, Algorithms, Benchmarking, Humans, Pneumothorax/etiology, Thoracic Radiography/methods, Retrospective Studies
17.
Invest Radiol ; 57(2): 90-98, 2022 02 01.
Article in English | MEDLINE | ID: mdl-34352804

ABSTRACT

OBJECTIVES: Chest radiographs (CXRs) are commonly performed in emergency units (EUs), but their interpretation requires radiology experience. We developed an artificial intelligence (AI) system (precommercial) that aims to mimic board-certified radiologists' (BCRs') performance and can therefore support non-radiology residents (NRRs) in clinical settings lacking 24/7 radiology coverage. We validated it by quantifying the clinical value of our AI system for radiology residents (RRs) and EU-experienced NRRs in a clinically representative EU setting. MATERIALS AND METHODS: A total of 563 EU CXRs were retrospectively assessed by 3 BCRs, 3 RRs, and 3 EU-experienced NRRs. Suspected pathologies (pleural effusion, pneumothorax, consolidations suspicious for pneumonia, lung lesions) were reported separately by every involved reader on a 5-step confidence scale (a total of 20,268 reported pathology suspicions [563 images × 9 readers × 4 pathologies]). Board-certified radiologists' confidence scores were converted into 4 binary reference standards (RFSs) of different sensitivities. The RRs' and NRRs' performances were statistically compared with our AI system (trained on nonpublic data from different clinical sites) based on receiver operating characteristics (ROCs) and operating point metrics approximated to the maximum sum of sensitivity and specificity (Youden statistics). RESULTS: The NRRs lose diagnostic accuracy relative to RRs with increasingly sensitive BCRs' RFSs for all considered pathologies. Based on our external validation data set, the AI system/NRRs' consensus mimicked the most sensitive BCRs' RFSs with areas under the ROC curve of 0.940/0.837 (pneumothorax), 0.953/0.823 (pleural effusion), and 0.883/0.747 (lung lesions), which was comparable to experienced RRs and significantly exceeded the EU-experienced NRRs' diagnostic performance. For consolidation detection, the AI system performed on the NRRs' consensus level (and exceeded each individual NRR) with an area under the ROC curve of 0.847 referenced to the BCRs' most sensitive RFS. CONCLUSIONS: Our AI system matched the RRs' performance and significantly outperformed the NRRs' diagnostic accuracy for most of the considered CXR pathologies (pneumothorax, pleural effusion, and lung lesions), and therefore might serve as clinical decision support for NRRs.


Subject(s)
Lung Diseases, Pleural Effusion, Pneumothorax, Radiology, Artificial Intelligence, Hospital Emergency Service, Humans, Pleural Effusion/diagnostic imaging, Pneumothorax/diagnostic imaging, Radiography, Thoracic Radiography/methods, Retrospective Studies
18.
JAMA Netw Open ; 4(12): e2141096, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34964851

ABSTRACT

Importance: Most early lung cancers present as pulmonary nodules on imaging, but these can be easily missed on chest radiographs. Objective: To assess whether a novel artificial intelligence (AI) algorithm can help detect pulmonary nodules on radiographs at different levels of detection difficulty. Design, Setting, and Participants: This diagnostic study included 100 posteroanterior chest radiograph images taken between 2000 and 2010 of adult patients from an ambulatory health care center in Germany and a lung image database in the US. Included images were selected to represent nodules with different levels of detection difficulty (from easy to difficult) and comprised both normal and abnormal controls. Exposures: All images were processed with a novel AI algorithm, the AI Rad Companion Chest X-ray. Two thoracic radiologists established the ground truth, and 9 test radiologists from Germany and the US independently reviewed all images in 2 sessions (unaided and AI-aided mode) with at least a 1-month washout period. Main Outcomes and Measures: Each test radiologist recorded the presence of 5 findings (pulmonary nodules, atelectasis, consolidation, pneumothorax, and pleural effusion) and their level of confidence for detecting the individual finding on a scale of 1 to 10 (1 representing lowest confidence; 10, highest confidence). The analyzed metrics for nodules included sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). Results: Images from 100 patients were included, with a mean (SD) age of 55 (20) years, comprising 64 men and 36 women. Mean detection accuracy across the 9 radiologists improved by 6.4% (95% CI, 2.3% to 10.6%) with AI-aided interpretation compared with unaided interpretation. Partial AUCs within the effective interval range of 0 to 0.2 false positive rate improved by 5.6% (95% CI, -1.4% to 12.0%) with AI-aided interpretation. Junior radiologists saw a greater improvement in sensitivity for nodule detection with AI-aided interpretation than their senior counterparts (12%; 95% CI, 4% to 19% vs 9%; 95% CI, 1% to 17%), while senior radiologists experienced a similar improvement in specificity (4%; 95% CI, -2% to 9%) compared with junior radiologists (4%; 95% CI, -3% to 5%). Conclusions and Relevance: In this diagnostic study, an AI algorithm was associated with improved detection of pulmonary nodules on chest radiographs compared with unaided interpretation for different levels of detection difficulty and for readers with different experience.
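The partial AUC restricted to the 0-0.2 false-positive-rate range can be sketched with scikit-learn, keeping in mind that its max_fpr option returns a McClish-standardized value that may differ from the study's exact definition; the labels and scores below are synthetic:

```python
# Sketch: full AUC vs. partial AUC restricted to the low false-positive-rate range.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=500)
y_score = y_true + rng.normal(scale=0.8, size=500)   # illustrative reader scores

print(f"full AUC        : {roc_auc_score(y_true, y_score):.3f}")
print(f"pAUC (FPR <= 0.2): {roc_auc_score(y_true, y_score, max_fpr=0.2):.3f}")
```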


Subject(s)
Algorithms, Lung Neoplasms/diagnostic imaging, Adult, Artificial Intelligence, Female, Germany, Humans, Male, Middle Aged, Multiple Pulmonary Nodules/diagnostic imaging, Computer-Assisted Radiographic Image Interpretation, Thoracic Radiography, Sensitivity and Specificity, Solitary Pulmonary Nodule/diagnostic imaging
19.
Diagnostics (Basel) ; 11(10)2021 Oct 11.
Article in English | MEDLINE | ID: mdl-34679566

ABSTRACT

(1) Background: Chest radiography (CXR) is still a key diagnostic component in the emergency department (ED). Correct interpretation is essential since some pathologies require urgent treatment. This study quantifies potential discrepancies in CXR analysis between radiologists and non-radiology physicians in training with ED experience. (2) Methods: Nine differently qualified physicians (three board-certified radiologists [BCR], three radiology residents [RR], and three non-radiology residents involved in the ED [NRR]) evaluated a series of 563 posterior-anterior CXR images by quantifying suspicion for four relevant pathologies: pleural effusion, pneumothorax, pneumonia, and pulmonary nodules. Reading results were noted separately for each hemithorax on a Likert scale (0-4; 0: no suspicion of pathology, 4: certain presence of pathology), adding up to a total of 40,536 reported pathology suspicions. Interrater reliability/correlation and Kruskal-Wallis tests were performed for statistical analysis. (3) Results: While interrater reliability was good among radiologists, major discrepancies between radiologists' and non-radiologists' reading results could be observed for all pathologies. The highest overall interrater agreement was found for pneumothorax detection and the lowest for raising suspicion of malignancy-suspicious nodules. Pleural effusion and pneumonia were often suspected with indifferent choices (1-3). In terms of pneumothorax detection, all readers mainly decided for a clear option (0 or 4). Interrater reliability was usually higher when evaluating the right hemithorax (all pathologies except pneumothorax). (4) Conclusions: Quantified CXR interrater reliability analysis displays a general uncertainty that strongly depends on medical training. NRR can benefit from radiology reporting in terms of time efficiency and diagnostic accuracy. CXR evaluation by long-time trained ED specialists has not been tested.
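A sketch of the two analysis ingredients named above (interrater agreement and a Kruskal-Wallis comparison across reader groups) on toy Likert data; Cohen's kappa is used here as a generic agreement measure and may not match the study's exact reliability statistic:

```python
# Sketch: pairwise interrater agreement and a Kruskal-Wallis test across reader groups.
import numpy as np
from scipy.stats import kruskal
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(7)
reader_a = rng.integers(0, 5, size=100)                                  # 0-4 Likert scores
reader_b = np.clip(reader_a + rng.integers(-1, 2, size=100), 0, 4)       # a second, similar reader

print(f"Cohen's kappa (A vs B): {cohen_kappa_score(reader_a, reader_b):.2f}")

bcr, rr, nrr = rng.integers(0, 5, (3, 100))                              # three reader groups
stat, p = kruskal(bcr, rr, nrr)
print(f"Kruskal-Wallis H = {stat:.2f}, p = {p:.3f}")
```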

20.
Diagnostics (Basel) ; 11(6)2021 Jun 03.
Article in English | MEDLINE | ID: mdl-34205176

ABSTRACT

(1) Background: Extracorporeal membrane oxygenation (ECMO) therapy in intensive care units (ICUs) remains the last treatment option for Coronavirus disease 2019 (COVID-19) patients with severely affected lungs but is highly resource demanding. Early risk stratification for the need for ECMO therapy upon admission to the hospital using artificial intelligence (AI)-based computed tomography (CT) assessment and clinical scores is beneficial for patient assessment and resource management; (2) Methods: Retrospective single-center study with 95 confirmed COVID-19 patients admitted to the participating ICUs. Patients requiring ECMO therapy (n = 14) during the ICU stay versus patients without ECMO treatment (n = 81) were evaluated for discriminative clinical prediction parameters and AI-based CT imaging features and their diagnostic potential to predict ECMO therapy. Reported patient data include clinical scores, AI-based CT findings, and patient outcomes; (3) Results: Patients subsequently allocated to ECMO therapy had significantly higher Sequential Organ Failure Assessment (SOFA) scores (p < 0.001) and significantly lower oxygenation indices on admission (p = 0.009) than patients with standard ICU therapy. The median time from hospital admission to ECMO placement was 1.4 days (IQR 0.2-4.0). The percentage of lung involvement on AI-based CT assessment on admission to the hospital was significantly higher in ECMO patients (p < 0.001). In binary logistic regression analyses for ECMO prediction including age, sex, body mass index (BMI), SOFA score on admission, lactate on admission, and percentage of lung involvement on admission CTs, only the SOFA score (OR 1.32, 95% CI 1.08-1.62) and lung involvement (OR 1.06, 95% CI 1.01-1.11) were significantly associated with subsequent ECMO allocation. Receiver operating characteristic (ROC) curves showed an area under the curve (AUC) of 0.83 (95% CI 0.73-0.94) for lung involvement on admission CT and 0.82 (95% CI 0.72-0.91) for the SOFA score on ICU admission. A combined parameter of SOFA on ICU admission and lung involvement on admission CT yielded an AUC of 0.91 (0.84-0.97) with a sensitivity of 0.93 and a specificity of 0.84 for ECMO prediction; (4) Conclusions: AI-based assessment of lung involvement on CT scans on admission to the hospital and SOFA scoring, especially when combined, can be used as risk stratification tools for the subsequent requirement of ECMO therapy in patients with severe COVID-19 to improve resource management in ICU settings.
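The binary logistic regression with odds ratios and 95% CIs can be sketched with statsmodels; the predictors, effect sizes, and data below are synthetic placeholders, not the study's cohort:

```python
# Sketch: binary logistic regression for ECMO allocation with odds ratios and 95% CIs.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 95
df = pd.DataFrame({
    "sofa": rng.integers(2, 16, n),              # SOFA score on admission (synthetic)
    "lung_involvement": rng.uniform(5, 80, n),   # % lung involvement on CT (synthetic)
})
logit_true = -6 + 0.3 * df["sofa"] + 0.05 * df["lung_involvement"]
df["ecmo"] = rng.random(n) < 1 / (1 + np.exp(-logit_true))   # simulated outcome

X = sm.add_constant(df[["sofa", "lung_involvement"]])
fit = sm.Logit(df["ecmo"].astype(int), X).fit(disp=0)

odds_ratios = np.exp(fit.params)
ci = np.exp(fit.conf_int())
print(pd.concat([odds_ratios.rename("OR"),
                 ci.rename(columns={0: "2.5%", 1: "97.5%"})], axis=1))
```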
