Results 1-5 of 5
1.
Med Phys; 51(4): 2721-2732, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37831587

ABSTRACT

BACKGROUND: Deep learning models are being applied to more and more use cases with astonishing success stories, but how do they perform in the real world? Models are typically tested on specific cleaned data sets, but when deployed in the real world, the model will encounter unexpected, out-of-distribution (OOD) data. PURPOSE: To investigate the impact of OOD radiographs on existing chest x-ray classification models and to increase their robustness against OOD data. METHODS: The study employed the commonly used chest x-ray classification model CheXnet, trained on the chest x-ray 14 data set, and tested its robustness against OOD data using three public radiography data sets (IRMA, Bone Age, and MURA) and the ImageNet data set. To detect OOD data for multi-label classification, we proposed in-distribution voting (IDV). OOD detection performance was measured across data sets using area under the receiver operating characteristic curve (AUC) analysis and compared with Mahalanobis-based OOD detection, MaxLogit, MaxEnergy, self-supervised OOD detection (SS OOD), and CutMix. RESULTS: Without additional OOD detection, the chest x-ray classifier failed to discard any OOD images, with an AUC of 0.5. The proposed IDV approach, trained on ID (chest x-ray 14) and OOD data (IRMA and ImageNet), achieved, on average, 0.999 OOD AUC across the three data sets, surpassing all other OOD detection methods. Mahalanobis-based OOD detection achieved an average OOD detection AUC of 0.982. IDV trained solely with a few thousand ImageNet images had an AUC of 0.913, which was considerably higher than MaxLogit (0.726), MaxEnergy (0.724), SS OOD (0.476), and CutMix (0.376). CONCLUSIONS: Except for Mahalanobis-based OOD detection and the proposed IDV method, the tested OOD detection methods did not translate well to radiography data sets. Consequently, training solely on ID data led to incorrect classification of OOD images as ID, resulting in increased false positive rates. IDV substantially improved the model's ID classification performance, even when trained with data that does not occur in the intended use case or test set (ImageNet), without additional inference overhead or a performance decrease in the target classification. The corresponding code is available at https://gitlab.lrz.de/IP/a-knee-cannot-have-lung-disease.


Subjects
Voting, X-Rays, Radiography, ROC Curve
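As a hedged illustration of two of the baseline scores compared above, the snippet below sketches MaxLogit and MaxEnergy OOD scoring in PyTorch. It is not the authors' IDV implementation (see the linked repository for that); `model` stands for any trained classifier, and the thresholding policy is left to the caller.

```python
# Minimal sketch of the MaxLogit and MaxEnergy OOD scores; higher values
# suggest in-distribution input. `model` is a placeholder for a trained net.
import torch

@torch.no_grad()
def ood_scores(model, images, temperature=1.0):
    """Return per-image ID-ness scores for a batch of images."""
    logits = model(images)                        # shape: (batch, num_classes)
    max_logit = logits.max(dim=1).values          # MaxLogit score
    # MaxEnergy: the negative energy, T * logsumexp(logits / T)
    energy = temperature * torch.logsumexp(logits / temperature, dim=1)
    return max_logit, energy
```

Inputs whose score falls below a threshold calibrated on ID validation data would then be flagged as OOD and discarded.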
2.
Rofo; 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38663428

ABSTRACT

The aim of this study was to explore the potential of weak supervision in a deep learning-based label prediction model. The goal was to use this model to extract labels from German free-text thoracic radiology reports on chest X-ray images and to use these labels for training chest X-ray classification models. The proposed label extraction model for German thoracic radiology reports uses a German BERT encoder as a backbone and classifies a report based on the CheXpert labels. To investigate the efficient use of manually annotated data, the model was trained using manual annotations, weak rule-based labels, and both. Rule-based labels were extracted from 66071 retrospectively collected radiology reports from 2017-2021 (DS 0), and 1091 reports from 2020-2021 (DS 1) were manually labeled according to the CheXpert classes. Label extraction performance was evaluated with respect to mention extraction, negation detection, and uncertainty detection by measuring F1 scores. The influence of the label extraction method on chest X-ray classification was evaluated on a pneumothorax data set (DS 2) containing 6434 chest radiographs with associated reports and expert diagnoses of pneumothorax. For this, DenseNet-121 models trained on manual annotations, on rule-based and deep learning-based label predictions, and on publicly available data were compared. The proposed deep learning-based labeler (DL) performed, on average, considerably better than the rule-based labeler (RB) on all three tasks on DS 1, with F1 scores of 0.938 vs. 0.844 for mention extraction, 0.891 vs. 0.821 for negation detection, and 0.624 vs. 0.518 for uncertainty detection. Pre-training on DS 0 and fine-tuning on DS 1 performed better than training on either DS 0 or DS 1 alone. Chest X-ray pneumothorax classification performance (DS 2) was highest when training with DL labels, with an area under the receiver operating characteristic curve (AUC) of 0.939, compared to RB labels with an AUC of 0.858. Training with manual labels performed slightly worse than training with DL labels, with an AUC of 0.934. In contrast, training with a public data set resulted in an AUC of 0.720. Our results show that leveraging a rule-based report labeler for weak supervision leads to improved labeling performance. The pneumothorax classification results demonstrate that our proposed deep learning-based labeler can serve as a substitute for manual labeling, requiring only 1000 manually annotated reports for training.
· The proposed deep learning-based label extraction model for German thoracic radiology reports performs better than the rule-based model.
· Training with limited supervision outperformed training with a small manually labeled data set.
· Using predicted labels for pneumothorax classification from chest radiographs performed equally to using manual annotations.
Wollek A, Haitzer P, Sedlmeyr T et al. Language model-based labeling of German thoracic radiology reports. Fortschr Röntgenstr 2024; DOI: 10.1055/a-2287-5054.
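For illustration, here is a minimal sketch of the kind of BERT-based multi-label report classifier described above, built with the Hugging Face transformers library. The checkpoint name, the reduction of CheXpert's per-class mention/negation/uncertainty outputs to binary labels, and the example report are assumptions made for the sketch, not the authors' exact configuration.

```python
# Hedged sketch: a German BERT encoder fine-tunable for CheXpert-style report
# labeling. The binary multi-label framing is a simplification; the paper's
# labeler also distinguishes negated and uncertain mentions per class.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 14  # 12 pathologies + support devices + "no finding"

tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-german-cased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # sigmoid outputs + BCE loss
)

report = "Kein Nachweis eines Pneumothorax. Herz normal groß."  # hypothetical report
inputs = tokenizer(report, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)  # per-class probabilities
```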

3.
Rofo; 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38295825

ABSTRACT

PURPOSE: The aim of this study was to develop an algorithm to automatically extract annotations from German thoracic radiology reports to train deep learning-based chest X-ray classification models. MATERIALS AND METHODS: An automatic label extraction model for German thoracic radiology reports was designed based on the CheXpert architecture. The algorithm can extract labels for twelve common chest pathologies, the presence of support devices, and "no finding". For iterative improvements and to generate a ground truth, a web-based multi-reader annotation interface was created. With the proposed annotation interface, a radiologist annotated 1086 retrospectively collected radiology reports from 2020-2021 (data set 1). The effect of automatically extracted labels on chest radiograph classification performance was evaluated on an additional, in-house pneumothorax data set (data set 2), containing 6434 chest radiographs with corresponding reports, by comparing DenseNet-121 models trained on labels extracted from the associated reports, on image-based pneumothorax labels, and on publicly available data. RESULTS: Comparing automated to manual labeling on data set 1, class-wise F1 scores ranged from 0.8 to 0.995 for mention extraction, from 0.624 to 0.981 for negation detection, and from 0.353 to 0.725 for uncertainty detection. Extracted pneumothorax labels on data set 2 had a sensitivity of 0.997 [95% CI: 0.994, 0.999] and a specificity of 0.991 [95% CI: 0.988, 0.994]. The model trained on publicly available data achieved an area under the receiver operating characteristic curve (AUC) for pneumothorax classification of 0.728 [95% CI: 0.694, 0.760], while the models trained on automatically extracted labels and on manual annotations achieved values of 0.858 [95% CI: 0.832, 0.882] and 0.934 [95% CI: 0.918, 0.949], respectively. CONCLUSION: Automatic label extraction from German thoracic radiology reports is a promising substitute for manual labeling. By reducing the time required for data annotation, larger training data sets can be created, resulting in improved overall modeling performance. Our results demonstrated that a pneumothorax classifier trained on automatically extracted labels strongly outperformed the model trained on publicly available data without the need for additional annotation time, and performed competitively compared to manually labeled data. KEY POINTS:
· An algorithm for automatic German thoracic radiology report annotation was developed.
· Automatic label extraction is a promising substitute for manual labeling.
· The classifier trained on extracted labels outperformed the model trained on publicly available data.
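A hedged sketch of how CheXpert-style rule-based label extraction can work for German reports: per-sentence keyword matching combined with negation and uncertainty cues. The class names and phrase lists below are invented for illustration and are far smaller than a real rule set.

```python
# Toy CheXpert-style rule-based labeler: mention matching plus simple
# negation/uncertainty cues, applied sentence by sentence. Illustrative only.
import re

MENTIONS = {"Pneumothorax": [r"pneumothorax"], "Pleuraerguss": [r"erguss"]}
NEGATION_CUES = [r"kein", r"keine", r"ohne", r"nicht"]
UNCERTAINTY_CUES = [r"fraglich", r"möglich", r"verdacht auf"]

def label_report(text):
    labels = {}
    for sentence in re.split(r"[.!?]", text.lower()):
        for finding, patterns in MENTIONS.items():
            if any(re.search(p, sentence) for p in patterns):
                if any(re.search(rf"\b{c}\b", sentence) for c in NEGATION_CUES):
                    labels[finding] = "negative"
                elif any(re.search(c, sentence) for c in UNCERTAINTY_CUES):
                    labels[finding] = "uncertain"
                else:
                    labels[finding] = "positive"
    return labels

print(label_report("Kein Pneumothorax. Verdacht auf Erguss links."))
# {'Pneumothorax': 'negative', 'Pleuraerguss': 'uncertain'}
```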

4.
J Imaging; 9(12), 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38132688

ABSTRACT

Public chest X-ray (CXR) data sets are commonly compressed to a lower bit depth to reduce their size, potentially hiding subtle diagnostic features. In contrast, radiologists apply a windowing operation to the uncompressed image to enhance such subtle features. While it has been shown that windowing improves classification performance on computed tomography (CT) images, the impact of such an operation on CXR classification performance remains unclear. In this study, we show that windowing strongly improves the CXR classification performance of machine learning models and propose WindowNet, a model that learns multiple optimal window settings. On the MIMIC data set, our model achieved an average AUC of 0.812, compared with 0.759 for a commonly used architecture without windowing capabilities.
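To make the windowing operation concrete, here is a hedged PyTorch sketch of a learnable windowing layer in the spirit of WindowNet: each output channel clips the raw high-bit-depth intensities to a learned window [center - width/2, center + width/2] and rescales them to [0, 1]. The initialization, number of windows, and layer placement are assumptions, not the published architecture.

```python
# Hedged sketch of a learnable intensity-windowing layer (WindowNet-inspired).
import torch
import torch.nn as nn

class WindowingLayer(nn.Module):
    def __init__(self, num_windows=3, max_value=4095.0):  # e.g., 12-bit input
        super().__init__()
        # Learned window centers and widths, one pair per output channel.
        self.center = nn.Parameter(torch.rand(num_windows) * max_value)
        self.width = nn.Parameter(torch.full((num_windows,), max_value / 2))

    def forward(self, x):  # x: (batch, 1, H, W), raw intensities
        c = self.center.view(1, -1, 1, 1)
        w = self.width.view(1, -1, 1, 1).clamp(min=1.0)  # keep widths positive
        lower, upper = c - w / 2, c + w / 2
        y = torch.maximum(torch.minimum(x, upper), lower)  # clip to the window
        return (y - lower) / w  # rescale; one output channel per learned window
```

The windowed channels would then feed a standard classifier backbone, letting gradients tune the window settings end to end.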

5.
Radiol Artif Intell; 5(2): e220187, 2023 Mar.
Article in English | MEDLINE | ID: mdl-37035429

ABSTRACT

Purpose: To investigate the chest radiograph classification performance of vision transformers (ViTs) and the interpretability of attention-based saliency maps, using the example of pneumothorax classification. Materials and Methods: In this retrospective study, ViTs were fine-tuned for lung disease classification using four public datasets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData. Saliency maps were generated using transformer multimodal explainability and gradient-weighted class activation mapping (GradCAM). Classification performance was evaluated on the Chest X-Ray 14, VinBigData, and Society for Imaging Informatics in Medicine-American College of Radiology (SIIM-ACR) Pneumothorax Segmentation datasets using area under the receiver operating characteristic curve (AUC) analysis and compared with convolutional neural networks (CNNs). The explainability methods were evaluated with positive and negative perturbation, sensitivity-n, effective heat ratio, intra-architecture repeatability, and interarchitecture reproducibility. In the user study, three radiologists classified 160 chest radiographs with and without saliency maps for pneumothorax and rated their usefulness. Results: ViTs achieved chest radiograph classification AUCs comparable to those of state-of-the-art CNNs: 0.95 (95% CI: 0.94, 0.95) versus 0.83 (95% CI: 0.83, 0.84) on Chest X-Ray 14, 0.84 (95% CI: 0.77, 0.91) versus 0.83 (95% CI: 0.76, 0.90) on VinBigData, and 0.85 (95% CI: 0.85, 0.86) versus 0.87 (95% CI: 0.87, 0.88) on SIIM-ACR. Both saliency map methods unveiled a strong bias toward pneumothorax tubes in the models. Radiologists found 47% of the attention-based and 39% of the GradCAM saliency maps useful. The attention-based methods outperformed GradCAM on all metrics. Conclusion: ViTs performed similarly to CNNs in chest radiograph classification, and their attention-based saliency maps were more useful to radiologists and outperformed GradCAM.
Keywords: Conventional Radiography, Thorax, Diagnosis, Supervised Learning, Convolutional Neural Network (CNN)
Online supplemental material is available for this article. © RSNA, 2023.
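As a hedged illustration of attention-based saliency for ViTs, the snippet below sketches attention rollout, a common attention-based method; the paper's transformer multimodal explainability approach is a related, gradient-aware technique, so this is an approximation rather than the authors' exact method. The per-layer attention tensors are assumed to come from a ViT run with attention outputs enabled (e.g., output_attentions=True in Hugging Face transformers).

```python
# Hedged sketch: attention rollout saliency for a ViT (Abnar & Zuidema, 2020).
import torch

def attention_rollout(attentions):
    """attentions: list of per-layer tensors, shape (batch, heads, tokens, tokens)."""
    tokens = attentions[0].shape[-1]
    rollout = torch.eye(tokens)
    for attn in attentions:
        a = attn.mean(dim=1)[0]              # average heads; first image in batch
        a = a + torch.eye(tokens)            # account for the residual connection
        a = a / a.sum(dim=-1, keepdim=True)  # re-normalize rows
        rollout = a @ rollout                # propagate attention through layers
    # CLS-token attention over image patches; reshape to the patch grid for a map.
    return rollout[0, 1:]
```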
