Pesquisa | BVS - MINISTÉRIO DA SAÚDE

1.

[Between clinical and forensic imaging : Differences, similarities and legal framework]. / Zwischen klinischer und forensischer Bildgebung : Unterschiede, Gemeinsamkeiten und rechtliche Rahmenbedingungen.

Bruch, Gina Maria; Schäffer, Benno; Sabel, Bastian O.

Radiologie (Heidelb) ; 2024 Aug 22.

Artigo em Alemão | MEDLINE | ID: mdl-39174666

RESUMO

Clinical imaging uses a variety of medical imaging techniques to diagnose and monitor diseases, injuries and other health conditions. These include Xray images, computed tomography (CT), magnetic resonance imaging (MRI) and ultrasound. These procedures are used to make accurate diagnoses and plan the best possible treatment for patients. Forensic imaging, in contrast, is used in both living and deceased persons in the context of criminal investigations. Postmortem forensic imaging techniques, such as postmortem CT (PMCT) and postmortem CT angiography (PMCTA), include some of the same procedures used in clinical imaging. An important difference between clinical and forensic imaging is the purpose and context in which the imaging studies are used. In addition, radiological procedures, such as angiography, need to be adapted and modified in the post-mortem setting. From a legal perspective clinical and forensic imaging must strictly adhere to privacy and procedural guidelines. Forensic images often need to be admissible as evidence in court, which places specific requirements on the quality, authenticity and documentation of images. In the case of living individuals, there must be a valid indication and consent from the patient. Consent must also fundamentally be obtained for post-mortem examinations, e.g. from the public prosecutor's office.

2.

A direct comparison of multi-energy x-ray and proton CT for imaging and relative stopping power estimation of plastic andex-vivophantoms.

Fogazzi, Elena; Hu, Guyue; Bruzzi, Mara; Farace, Paolo; Kröncke, Thomas; Niepel, Katharina; Ricke, Jens; Risch, Franka; Sabel, Bastian; Scaringella, Monica; Schwarz, Florian; Tommasino, Francesco; Landry, Guillaume; Civinini, Carlo; Parodi, Katia.

Phys Med Biol ; 69(17)2024 Aug 30.

Artigo em Inglês | MEDLINE | ID: mdl-39159669

RESUMO

Objective.Proton therapy administers a highly conformal dose to the tumour region, necessitating accurate prediction of the patient's 3D map of proton relative stopping power (RSP) compared to water. This remains challenging due to inaccuracies inherent in single-energy computed tomography (SECT) calibration. Recent advancements in spectral x-ray CT (xCT) and proton CT (pCT) have shown improved RSP estimation compared to traditional SECT methods. This study aims to provide the first comparison of the imaging and RSP estimation performance among dual-energy CT (DECT) and photon-counting CT (PCCT) scanners, and a pCT system prototype.Approach.Two phantoms were scanned with the three systems for their performance characterisation: a plastic phantom, filled with water and containing four plastic inserts and a wood insert, and a heterogeneous biological phantom, containing a formalin-stabilised bovine specimen. RSP maps were generated by converting CT numbers to RSP using a calibration based on low- and high-energy xCT images, while pCT utilised a distance-driven filtered back projection algorithm for RSP reconstruction. Spatial resolution, noise, and RSP accuracy were compared across the resulting images.Main results.All three systems exhibited similar spatial resolution of around 0.54 lp/mm for the plastic phantom. The PCCT images were less noisy than the DECT images at the same dose level. The lowest mean absolute percentage error (MAPE) of RSP,(0.28±0.07)%, was obtained with the pCT system, compared to MAPE values of(0.51±0.08)%and(0.80±0.08)%for the DECT- and PCCT-based methods, respectively. For the biological phantom, the xCT-based methods resulted in higher RSP values in most of the voxels compared to pCT.Significance.The pCT system yielded the most accurate estimation of RSP values for the plastic materials, and was thus used to benchmark the xCT calibration performance on the biological phantom. This study underlined the potential benefits and constraints of utilising such a novelex-vivophantom for inter-centre surveys in future.

Assuntos

Imagens de Fantasmas , Plásticos , Prótons , Tomografia Computadorizada por Raios X , Processamento de Imagem Assistida por Computador/métodos , Animais , Bovinos , Calibragem , Raios X

3.

[Automatic ICD-10 coding : Natural language processing for German MRI reports]. / Automatische ICD-10-Codierung : Natural Language Processing für deutsche MRT-Befunde.

Mittermeier, Andreas; Aßenmacher, Matthias; Schachtner, Balthasar; Grosu, Sergio; Dakovic, Vladana; Kandratovich, Viktar; Sabel, Bastian; Ingrisch, Michael.

Radiologie (Heidelb) ; 64(10): 793-800, 2024 Oct.

Artigo em Alemão | MEDLINE | ID: mdl-39120724

RESUMO

BACKGROUND: The medical coding of radiology reports is essential for a good quality of care and correct billing, but at the same time a complex and error-prone task. OBJECTIVE: To assess the performance of natural language processing (NLP) for ICD-10 coding of German radiology reports using fine tuning of suitable language models. MATERIAL AND METHODS: This retrospective study included all magnetic resonance imaging (MRI) radiology reports acquired at our institution between 2010 and 2020. The codes on discharge ICD-10 were matched to the corresponding reports to construct a dataset for multiclass classification. Fine tuning of GermanBERT and flanT5 was carried out on the total dataset (dstotal) containing 1035 different ICD-10 codes and 2 reduced subsets containing the 100 (ds100) and 50 (ds50) most frequent codes. The performance of the model was assessed using topk accuracy for kâ¯= 1, 3 and 5. In an ablation study both models were trained on the accompanying metadata and the radiology report alone. RESULTS: The total dataset consisted of 100,672 radiology reports, the reduced subsets ds100 of 68,103 and ds50 of 52,293 reports. The performance of the model increased when several of the best predictions of the model were taken into consideration, when the number of target classes was reduced and the metadata were combined with the report. The flanT5 outperformed GermanBERT across all datasets and metrics and was is suited as a medical coding assistant, achieving a top 3 accuracy of nearly 70% in the real-world dataset dstotal. CONCLUSION: Finely tuned language models can reliably predict ICD-10 codes of German magnetic resonance imaging (MRI) radiology reports across various settings. As a coding assistant flanT5 can guide medical coders to make informed decisions and potentially reduce the workload.

4.

Language model-based labeling of German thoracic radiology reports.

Wollek, Alessandro; Haitzer, Philip; Sedlmeyr, Thomas; Hyska, Sardi; Rueckel, Johannes; Sabel, Bastian O; Ingrisch, Michael; Lasser, Tobias.

Rofo ; 2024 Apr 25.

Artigo em Inglês | MEDLINE | ID: mdl-38663428

RESUMO

The aim of this study was to explore the potential of weak supervision in a deep learning-based label prediction model. The goal was to use this model to extract labels from German free-text thoracic radiology reports on chest X-ray images and for training chest X-ray classification models.The proposed label extraction model for German thoracic radiology reports uses a German BERT encoder as a backbone and classifies a report based on the CheXpert labels. For investigating the efficient use of manually annotated data, the model was trained using manual annotations, weak rule-based labels, and both. Rule-based labels were extracted from 66071 retrospectively collected radiology reports from 2017-2021 (DS 0), and 1091 reports from 2020-2021 (DS 1) were manually labeled according to the CheXpert classes. Label extraction performance was evaluated with respect to mention extraction, negation detection, and uncertainty detection by measuring F1 scores. The influence of the label extraction method on chest X-ray classification was evaluated on a pneumothorax data set (DS 2) containing 6434 chest radiographs with associated reports and expert diagnoses of pneumothorax. For this, DenseNet-121 models trained on manual annotations, rule-based and deep learning-based label predictions, and publicly available data were compared.The proposed deep learning-based labeler (DL) performed on average considerably stronger than the rule-based labeler (RB) for all three tasks on DS 1 with F1 scores of 0.938 vs. 0.844 for mention extraction, 0.891 vs. 0.821 for negation detection, and 0.624 vs. 0.518 for uncertainty detection. Pre-training on DS 0 and fine-tuning on DS 1 performed better than only training on either DS 0 or DS 1. Chest X-ray pneumothorax classification results (DS 2) were highest when trained with DL labels with an area under the receiver operating curve (AUC) of 0.939 compared to RB labels with an AUC of 0.858. Training with manual labels performed slightly worse than training with DL labels with an AUC of 0.934. In contrast, training with a public data set resulted in an AUC of 0.720.Our results show that leveraging a rule-based report labeler for weak supervision leads to improved labeling performance. The pneumothorax classification results demonstrate that our proposed deep learning-based labeler can serve as a substitute for manual labeling requiring only 1000 manually annotated reports for training. · The proposed deep learning-based label extraction model for German thoracic radiology reports performs better than the rule-based model.. · Training with limited supervision outperformed training with a small manually labeled data set.. · Using predicted labels for pneumothorax classification from chest radiographs performed equally to using manual annotations.. Wollek A, Haitzer P, Sedlmeyr T et al. Language modelbased labeling of German thoracic radiology reports. Fortschr Röntgenstr 2024; DOI 10.1055/a-2287-5054.

5.

German CheXpert Chest X-ray Radiology Report Labeler.

Wollek, Alessandro; Hyska, Sardi; Sedlmeyr, Thomas; Haitzer, Philip; Rueckel, Johannes; Sabel, Bastian O; Ingrisch, Michael; Lasser, Tobias.

Rofo ; 196(9): 956-965, 2024 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-38295825

RESUMO

PURPOSE: The aim of this study was to develop an algorithm to automatically extract annotations from German thoracic radiology reports to train deep learning-based chest X-ray classification models. MATERIALS AND METHODS: An automatic label extraction model for German thoracic radiology reports was designed based on the CheXpert architecture. The algorithm can extract labels for twelve common chest pathologies, the presence of support devices, and "no finding". For iterative improvements and to generate a ground truth, a web-based multi-reader annotation interface was created. With the proposed annotation interface, a radiologist annotated 1086 retrospectively collected radiology reports from 2020-2021 (data set 1). The effect of automatically extracted labels on chest radiograph classification performance was evaluated on an additional, in-house pneumothorax data set (data set 2), containing 6434 chest radiographs with corresponding reports, by comparing a DenseNet-121 model trained on extracted labels from the associated reports, image-based pneumothorax labels, and publicly available data, respectively. RESULTS: Comparing automated to manual labeling on data set 1: "mention extraction" class-wise F1 scores ranged from 0.8 to 0.995, the "negation detection" F1 scores from 0.624 to 0.981, and F1 scores for "uncertainty detection" from 0.353 to 0.725. Extracted pneumothorax labels on data set 2 had a sensitivity of 0.997 [95â% CI: 0.994, 0.999] and specificity of 0.991 [95â% CI: 0.988, 0.994]. The model trained on publicly available data achieved an area under the receiver operating curve (AUC) for pneumothorax classification of 0.728 [95â% CI: 0.694, 0.760], while the models trained on automatically extracted labels and on manual annotations achieved values of 0.858 [95â% CI: 0.832, 0.882] and 0.934 [95â% CI: 0.918, 0.949], respectively. CONCLUSION: Automatic label extraction from German thoracic radiology reports is a promising substitute for manual labeling. By reducing the time required for data annotation, larger training data sets can be created, resulting in improved overall modeling performance. Our results demonstrated that a pneumothorax classifier trained on automatically extracted labels strongly outperformed the model trained on publicly available data, without the need for additional annotation time and performed competitively compared to manually labeled data. KEY POINTS: · An algorithm for automatic German thoracic radiology report annotation was developed.. · Automatic label extraction is a promising substitute for manual labeling.. · The classifier trained on extracted labels outperformed the model trained on publicly available data.. ZITIERWEISE: · Wollek A, Hyska S, Sedlmeyr T etâal. German CheXpert Chest X-ray Radiology Report Labeler. Fortschr Röntgenstr 2024; 196: 956â-â965.

Assuntos

Algoritmos , Radiografia Torácica , Radiografia Torácica/métodos , Humanos , Alemanha , Estudos Retrospectivos , Pneumotórax/diagnóstico por imagem , Redes Neurais de Computação

6.

Nonradiology Health Care Professionals Significantly Benefit From AI Assistance in Emergency-Related Chest Radiography Interpretation.

Rudolph, Jan; Huemmer, Christian; Preuhs, Alexander; Buizza, Giulia; Hoppe, Boj F; Dinkel, Julien; Koliogiannis, Vanessa; Fink, Nicola; Goller, Sophia S; Schwarze, Vincent; Mansour, Nabeel; Schmidt, Vanessa F; Fischer, Maximilian; Jörgens, Maximilian; Ben Khaled, Najib; Liebig, Thomas; Ricke, Jens; Rueckel, Johannes; Sabel, Bastian O.

Chest ; 166(1): 157-170, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38295950

RESUMO

BACKGROUND: Chest radiographs (CXRs) are still of crucial importance in primary diagnostics, but their interpretation poses difficulties at times. RESEARCH QUESTION: Can a convolutional neural network-based artificial intelligence (AI) system that interprets CXRs add value in an emergency unit setting? STUDY DESIGN AND METHODS: A total of 563 CXRs acquired in the emergency unit of a major university hospital were retrospectively assessed twice by three board-certified radiologists, three radiology residents, and three emergency unit-experienced nonradiology residents (NRRs). They used a two-step reading process: (1) without AI support; and (2) with AI support providing additional images with AI overlays. Suspicion of four suspected pathologies (pleural effusion, pneumothorax, consolidations suspicious for pneumonia, and nodules) was reported on a five-point confidence scale. Confidence scores of the board-certified radiologists were converted into four binary reference standards of different sensitivities. Performance by radiology residents and NRRs without AI support/with AI support were statistically compared by using receiver-operating characteristics (ROCs), Youden statistics, and operating point metrics derived from fitted ROC curves. RESULTS: NRRs could significantly improve performance, sensitivity, and accuracy with AI support in all four pathologies tested. In the most sensitive reference standard (reference standard IV), NRR consensus improved the area under the ROC curve (mean, 95% CI) in the detection of the time-critical pathology pneumothorax from 0.846 (0.785-0.907) without AI support to 0.974 (0.947-1.000) with AI support (P < .001), which represented a gain of 30% in sensitivity and 2% in accuracy (while maintaining an optimized specificity). The most pronounced effect was observed in nodule detection, with NRR with AI support improving sensitivity by 53% and accuracy by 7% (area under the ROC curve without AI support, 0.723 [0.661-0.785]; with AI support, 0.890 [0.848-0.931]; P < .001). Radiology residents had smaller, mostly nonsignificant gains in performance, sensitivity, and accuracy with AI support. INTERPRETATION: We found that in an emergency unit setting without 24/7 radiology coverage, the presented AI solution features an excellent clinical support tool to nonradiologists, similar to a second reader, and allows for a more accurate primary diagnosis and thus earlier therapy initiation.

Assuntos

Inteligência Artificial , Serviço Hospitalar de Emergência , Radiografia Torácica , Humanos , Radiografia Torácica/métodos , Estudos Retrospectivos , Masculino , Feminino , Competência Clínica , Pessoa de Meia-Idade , Curva ROC , Adulto , Idoso

7.

Radiological age assessment based on clavicle ossification in CT: enhanced accuracy through deep learning.

Wesp, Philipp; Schachtner, Balthasar Maria; Jeblick, Katharina; Topalis, Johanna; Weber, Marvin; Fischer, Florian; Penning, Randolph; Ricke, Jens; Ingrisch, Michael; Sabel, Bastian Oliver.

Int J Legal Med ; 138(4): 1497-1507, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38286953

RESUMO

BACKGROUND: Radiological age assessment using reference studies is inherently limited in accuracy due to a finite number of assignable skeletal maturation stages. To overcome this limitation, we present a deep learning approach for continuous age assessment based on clavicle ossification in computed tomography (CT). METHODS: Thoracic CT scans were retrospectively collected from the picture archiving and communication system. Individuals aged 15.0 to 30.0 years examined in routine clinical practice were included. All scans were automatically cropped around the medial clavicular epiphyseal cartilages. A deep learning model was trained to predict a person's chronological age based on these scans. Performance was evaluated using mean absolute error (MAE). Model performance was compared to an optimistic human reader performance estimate for an established reference study method. RESULTS: The deep learning model was trained on 4,400 scans of 1,935 patients (training set: mean age = 24.2 years ± 4.0, 1132 female) and evaluated on 300 scans of 300 patients with a balanced age and sex distribution (test set: mean age = 22.5 years ± 4.4, 150 female). Model MAE was 1.65 years, and the highest absolute error was 6.40 years for females and 7.32 years for males. However, performance could be attributed to norm-variants or pathologic disorders. Human reader estimate MAE was 1.84 years and the highest absolute error was 3.40 years for females and 3.78 years for males. CONCLUSIONS: We present a deep learning approach for continuous age predictions using CT volumes highlighting the medial clavicular epiphyseal cartilage with performance comparable to the human reader estimate.

Assuntos

Determinação da Idade pelo Esqueleto , Clavícula , Aprendizado Profundo , Osteogênese , Tomografia Computadorizada por Raios X , Humanos , Clavícula/diagnóstico por imagem , Clavícula/crescimento & desenvolvimento , Determinação da Idade pelo Esqueleto/métodos , Masculino , Feminino , Adolescente , Adulto , Adulto Jovem , Estudos Retrospectivos

8.

Artificial Intelligence to Assess Tracheal Tubes and Central Venous Catheters in Chest Radiographs Using an Algorithmic Approach With Adjustable Positioning Definitions.

Rueckel, Johannes; Huemmer, Christian; Shahidi, Casra; Buizza, Giulia; Hoppe, Boj Friedrich; Liebig, Thomas; Ricke, Jens; Rudolph, Jan; Sabel, Bastian Oliver.

Invest Radiol ; 59(4): 306-313, 2024 Apr 01.

Artigo em Inglês | MEDLINE | ID: mdl-37682731

RESUMO

PURPOSE: To develop and validate an artificial intelligence algorithm for the positioning assessment of tracheal tubes (TTs) and central venous catheters (CVCs) in supine chest radiographs (SCXRs) by using an algorithm approach allowing for adjustable definitions of intended device positioning. MATERIALS AND METHODS: Positioning quality of CVCs and TTs is evaluated by spatially correlating the respective tip positions with anatomical structures. For CVC analysis, a configurable region of interest is defined to approximate the expected region of well-positioned CVC tips from segmentations of anatomical landmarks. The CVC/TT information is estimated by introducing a new multitask neural network architecture for jointly performing type/existence classification, course segmentation, and tip detection. Validation data consisted of 589 SCXRs that have been radiologically annotated for inserted TTs/CVCs, including an experts' categorical positioning assessment (reading 1). In-image positions of algorithm-detected TT/CVC tips could be corrected using a validation software tool (reading 2) that finally allowed for localization accuracy quantification. Algorithmic detection of images with misplaced devices (reading 1 as reference standard) was quantified by receiver operating characteristics. RESULTS: Supine chest radiographs were correctly classified according to inserted TTs/CVCs in 100%/98% of the cases, thereby with high accuracy in also spatially localizing the medical device tips: corrections less than 3 mm in >86% (TTs) and 77% (CVCs) of the cases. Chest radiographs with malpositioned devices were detected with area under the curves of >0.98 (TTs), >0.96 (CVCs with accidental vessel turnover), and >0.93 (also suboptimal CVC insertion length considered). The receiver operating characteristics limitations regarding CVC assessment were mainly caused by limitations of the applied CXR position definitions (region of interest derived from anatomical landmarks), not by algorithmic spatial detection inaccuracies. CONCLUSIONS: The TT and CVC tips were accurately localized in SCXRs by the presented algorithms, but triaging applications for CVC positioning assessment still suffer from the vague definition of optimal CXR positioning. Our algorithm, however, allows for an adjustment of these criteria, theoretically enabling them to meet user-specific or patient subgroups requirements. Besides CVC tip analysis, future work should also include specific course analysis for accidental vessel turnover detection.

Assuntos

Cateterismo Venoso Central , Cateteres Venosos Centrais , Humanos , Cateterismo Venoso Central/métodos , Inteligência Artificial , Radiografia , Radiografia Torácica/métodos

9.

Out-of-distribution detection with in-distribution voting using the medical example of chest x-ray classification.

Wollek, Alessandro; Willem, Theresa; Ingrisch, Michael; Sabel, Bastian; Lasser, Tobias.

Med Phys ; 51(4): 2721-2732, 2024 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-37831587

RESUMO

BACKGROUND: Deep learning models are being applied to more and more use cases with astonishing success stories, but how do they perform in the real world? Models are typically tested on specific cleaned data sets, but when deployed in the real world, the model will encounter unexpected, out-of-distribution (OOD) data. PURPOSE: To investigate the impact of OOD radiographs on existing chest x-ray classification models and to increase their robustness against OOD data. METHODS: The study employed the commonly used chest x-ray classification model, CheXnet, trained on the chest x-ray 14 data set, and tested its robustness against OOD data using three public radiography data sets: IRMA, Bone Age, and MURA, and the ImageNet data set. To detect OOD data for multi-label classification, we proposed in-distribution voting (IDV). The OOD detection performance is measured across data sets using the area under the receiver operating characteristic curve (AUC) analysis and compared with Mahalanobis-based OOD detection, MaxLogit, MaxEnergy, self-supervised OOD detection (SS OOD), and CutMix. RESULTS: Without additional OOD detection, the chest x-ray classifier failed to discard any OOD images, with an AUC of 0.5. The proposed IDV approach trained on ID (chest x-ray 14) and OOD data (IRMA and ImageNet) achieved, on average, 0.999 OOD AUC across the three data sets, surpassing all other OOD detection methods. Mahalanobis-based OOD detection achieved an average OOD detection AUC of 0.982. IDV trained solely with a few thousand ImageNet images had an AUC 0.913, which was considerably higher than MaxLogit (0.726), MaxEnergy (0.724), SS OOD (0.476), and CutMix (0.376). CONCLUSIONS: The performance of all tested OOD detection methods did not translate well to radiography data sets, except Mahalanobis-based OOD detection and the proposed IDV method. Consequently, training solely on ID data led to incorrect classification of OOD images as ID, resulting in increased false positive rates. IDV substantially improved the model's ID classification performance, even when trained with data that will not occur in the intended use case or test set (ImageNet), without additional inference overhead or performance decrease in the target classification. The corresponding code is available at https://gitlab.lrz.de/IP/a-knee-cannot-have-lung-disease.

Assuntos

Votação , Raios X , Radiografia , Curva ROC

10.

Implementing Artificial Intelligence for Emergency Radiology Impacts Physicians' Knowledge and Perception: A Prospective Pre- and Post-Analysis.

Hoppe, Boj Friedrich; Rueckel, Johannes; Dikhtyar, Yevgeniy; Heimer, Maurice; Fink, Nicola; Sabel, Bastian Oliver; Ricke, Jens; Rudolph, Jan; Cyran, Clemens C.

Invest Radiol ; 59(5): 404-412, 2024 May 01.

Artigo em Inglês | MEDLINE | ID: mdl-37843828

RESUMO

PURPOSE: The aim of this study was to evaluate the impact of implementing an artificial intelligence (AI) solution for emergency radiology into clinical routine on physicians' perception and knowledge. MATERIALS AND METHODS: A prospective interventional survey was performed pre-implementation and 3 months post-implementation of an AI algorithm for fracture detection on radiographs in late 2022. Radiologists and traumatologists were asked about their knowledge and perception of AI on a 7-point Likert scale (-3, "strongly disagree"; +3, "strongly agree"). Self-generated identification codes allowed matching the same individuals pre-intervention and post-intervention, and using Wilcoxon signed rank test for paired data. RESULTS: A total of 47/71 matched participants completed both surveys (66% follow-up rate) and were eligible for analysis (34 radiologists [72%], 13 traumatologists [28%], 15 women [32%]; mean age, 34.8 ± 7.8 years). Postintervention, there was an increase that AI "reduced missed findings" (1.28 [pre] vs 1.94 [post], P = 0.003) and made readers "safer" (1.21 vs 1.64, P = 0.048), but not "faster" (0.98 vs 1.21, P = 0.261). There was a rising disagreement that AI could "replace the radiological report" (-2.04 vs -2.34, P = 0.038), as well as an increase in self-reported knowledge about "clinical AI," its "chances," and its "risks" (0.40 vs 1.00, 1.21 vs 1.70, and 0.96 vs 1.34; all P 's ≤ 0.028). Radiologists used AI results more frequently than traumatologists ( P < 0.001) and rated benefits higher (all P 's ≤ 0.038), whereas senior physicians were less likely to use AI or endorse its benefits (negative correlation with age, -0.35 to 0.30; all P 's ≤ 0.046). CONCLUSIONS: Implementing AI for emergency radiology into clinical routine has an educative aspect and underlines the concept of AI as a "second reader," to support and not replace physicians.

Assuntos

Médicos , Radiologia , Feminino , Humanos , Adulto , Inteligência Artificial , Estudos Prospectivos , Percepção

11.

WindowNet: Learnable Windows for Chest X-ray Classification.

Wollek, Alessandro; Hyska, Sardi; Sabel, Bastian; Ingrisch, Michael; Lasser, Tobias.

J Imaging ; 9(12)2023 Dec 06.

Artigo em Inglês | MEDLINE | ID: mdl-38132688

RESUMO

Public chest X-ray (CXR) data sets are commonly compressed to a lower bit depth to reduce their size, potentially hiding subtle diagnostic features. In contrast, radiologists apply a windowing operation to the uncompressed image to enhance such subtle features. While it has been shown that windowing improves classification performance on computed tomography (CT) images, the impact of such an operation on CXR classification performance remains unclear. In this study, we show that windowing strongly improves the CXR classification performance of machine learning models and propose WindowNet, a model that learns multiple optimal window settings. Our model achieved an average AUC score of 0.812 compared with the 0.759 score of a commonly used architecture without windowing capabilities on the MIMIC data set.

12.

ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports.

Jeblick, Katharina; Schachtner, Balthasar; Dexl, Jakob; Mittermeier, Andreas; Stüber, Anna Theresa; Topalis, Johanna; Weber, Tobias; Wesp, Philipp; Sabel, Bastian Oliver; Ricke, Jens; Ingrisch, Michael.

Eur Radiol ; 2023 Oct 05.

Artigo em Inglês | MEDLINE | ID: mdl-37794249

RESUMO

OBJECTIVES: To assess the quality of simplified radiology reports generated with the large language model (LLM) ChatGPT and to discuss challenges and chances of ChatGPT-like LLMs for medical text simplification. METHODS: In this exploratory case study, a radiologist created three fictitious radiology reports which we simplified by prompting ChatGPT with "Explain this medical report to a child using simple language." In a questionnaire, we tasked 15 radiologists to rate the quality of the simplified radiology reports with respect to their factual correctness, completeness, and potential harm for patients. We used Likert scale analysis and inductive free-text categorization to assess the quality of the simplified reports. RESULTS: Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed relevant medical information, and potentially harmful passages were reported. CONCLUSION: While we see a need for further adaption to the medical field, the initial insights of this study indicate a tremendous potential in using LLMs like ChatGPT to improve patient-centered care in radiology and other medical domains. CLINICAL RELEVANCE STATEMENT: Patients have started to use ChatGPT to simplify and explain their medical reports, which is expected to affect patient-doctor interaction. This phenomenon raises several opportunities and challenges for clinical routine. KEY POINTS: â¢ Patients have started to use ChatGPT to simplify their medical reports, but their quality was unknown. â¢ In a questionnaire, most participating radiologists overall asserted good quality to radiology reports simplified with ChatGPT. However, they also highlighted a notable presence of errors, potentially leading patients to draw harmful conclusions. â¢ Large language models such as ChatGPT have vast potential to enhance patient-centered care in radiology and other medical domains. To realize this potential while minimizing harm, they need supervision by medical experts and adaption to the medical field.

13.

How to exclude pulmonary embolism in patients hospitalized with COVID-19: a comparison of predictive scores.

Vielhauer, Jakob; Benesch, Christopher; Pernpruner, Anna; Johlke, Anna-Lena; Hellmuth, Johannes Christian; Muenchhoff, Maximilian; Scherer, Clemens; Fink, Nicola; Sabel, Bastian; Schulz, Christian; Mayerle, Julia; Mahajan, Ujjwal Mukund; Stubbe, Hans Christian.

Thromb J ; 21(1): 51, 2023 May 02.

Artigo em Inglês | MEDLINE | ID: mdl-37131204

RESUMO

BACKGROUND: Pulmonary embolism (PE) is an important complication of Coronavirus disease 2019 (COVID-19). COVID-19 is associated with respiratory impairment and a pro-coagulative state, rendering PE more likely and difficult to recognize. Several decision algorithms relying on clinical features and D-dimer have been established. High prevalence of PE and elevated Ddimer in patients with COVID-19 might impair the performance of common decision algorithms. Here, we aimed to validate and compare five common decision algorithms implementing age adjusted Ddimer, the GENEVA, and Wells scores as well as the PEGeD- and YEARS-algorithms in patients hospitalized with COVID-19. METHODS: In this single center study, we included patients who were admitted to our tertiary care hospital in the COVID-19 Registry of the LMU Munich. We retrospectively selected patients who received a computed tomography pulmonary angiogram (CTPA) or pulmonary ventilation/perfusion scintigraphy (V/Q) for suspected PE. The performances of five commonly used diagnostic algorithms (age-adjusted D-dimer, GENEVA score, PEGeD-algorithm, Wells score, and YEARS-algorithm) were compared. RESULTS: We identified 413 patients with suspected PE who received a CTPA or V/Q confirming 62 PEs (15%). Among them, 358 patients with 48 PEs (13%) could be evaluated for performance of all algorithms. Patients with PE were older and their overall outcome was worse compared to patients without PE. Of the above five diagnostic algorithms, the PEGeD- and YEARS-algorithms performed best, reducing diagnostic imaging by 14% and 15% respectively with a sensitivity of 95.7% and 95.6%. The GENEVA score was able to reduce CTPA or V/Q by 32.2% but suffered from a low sensitivity (78.6%). Age-adjusted D-dimer and Wells score could not significantly reduce diagnostic imaging. CONCLUSION: The PEGeD- and YEARS-algorithms outperformed other tested decision algorithms and worked well in patients admitted with COVID-19. These findings need independent validation in a prospective study.

14.

Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification.

Wollek, Alessandro; Graf, Robert; Cecatka, Sasa; Fink, Nicola; Willem, Theresa; Sabel, Bastian O; Lasser, Tobias.

Radiol Artif Intell ; 5(2): e220187, 2023 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-37035429

RESUMO

Purpose: To investigate the chest radiograph classification performance of vision transformers (ViTs) and interpretability of attention-based saliency maps, using the example of pneumothorax classification. Materials and Methods: In this retrospective study, ViTs were fine-tuned for lung disease classification using four public datasets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData. Saliency maps were generated using transformer multimodal explainability and gradient-weighted class activation mapping (GradCAM). Classification performance was evaluated on the Chest X-Ray 14, VinBigData, and Society for Imaging Informatics in Medicine-American College of Radiology (SIIM-ACR) Pneumothorax Segmentation datasets using the area under the receiver operating characteristic curve (AUC) analysis and compared with convolutional neural networks (CNNs). The explainability methods were evaluated with positive and negative perturbation, sensitivity-n, effective heat ratio, intra-architecture repeatability, and interarchitecture reproducibility. In the user study, three radiologists classified 160 chest radiographs with and without saliency maps for pneumothorax and rated their usefulness. Results: ViTs had comparable chest radiograph classification AUCs compared with state-of-the-art CNNs: 0.95 (95% CI: 0.94, 0.95) versus 0.83 (95%, CI 0.83, 0.84) on Chest X-Ray 14, 0.84 (95% CI: 0.77, 0.91) versus 0.83 (95% CI: 0.76, 0.90) on VinBigData, and 0.85 (95% CI: 0.85, 0.86) versus 0.87 (95% CI: 0.87, 0.88) on SIIM-ACR. Both saliency map methods unveiled a strong bias toward pneumothorax tubes in the models. Radiologists found 47% of the attention-based and 39% of the GradCAM saliency maps useful. The attention-based methods outperformed GradCAM on all metrics. Conclusion: ViTs performed similarly to CNNs in chest radiograph classification, and their attention-based saliency maps were more useful to radiologists and outperformed GradCAM.Keywords: Conventional Radiography, Thorax, Diagnosis, Supervised Learning, Convolutional Neural Network (CNN) Online supplemental material is available for this article. © RSNA, 2023.

15.

Lung Ultrasound as a Promising Diagnostic Tool for Primary Graft Dysfunction after Lung Transplantation.

Schroeder, Ines; Scharf, Christina; Schneider, Julia; Weggesser, Patricia; Hübner, Lucas; Kneidinger, Nikolaus; Michel, Sebastian; Schneider, Christian; Clevert, Dirk-Andre; Sabel, Bastian; Irlbeck, Michael; Scheiermann, Patrick.

Ultraschall Med ; 44(5): 537-543, 2023 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-36854384

RESUMO

PURPOSE: The aim of the study was to evaluate whether the quantification of B-lines via lung ultrasound after lung transplantation is feasible and correlates with the diagnosis of primary graft dysfunction. METHODS: Following lung transplantation, patients underwent daily lung ultrasound on postoperative days 1-3. B-lines were quantified by an ultrasound score based on the number of single and confluent B-lines per intercostal space, using a four-region protocol. The ultrasound score was correlated with the diagnosis of primary graft dysfunction. Furthermore, correlation analyses and receiver operating characteristics analyses taking into account ultrasound score, chest radiographs, and PaO2/FiO2 ratio were performed. RESULTS: A total of 32 patients (91 ultrasound measurements) were included, whereby 10 were diagnosed with primary graft dysfunction. The median B-line score was 5 [IQR: 4, 8]. There was a significant correlation between B-line score and the diagnosis of primary graft dysfunction (r = 0.59, p < 0.001). A significant correlation could also be seen between chest X-rays and primary graft dysfunction (r = 0.34, p = 0.008), but the B-line score showed superiority over chest X-rays with respect to diagnosing primary graft dysfunction in the receiver operating characteristics curves with an area under the curve value of 0.921 versus 0.708. There was a significant negative correlation between B-line score and PaO2/FiO2 ratio (r = -0.41, p < 0.001), but not between chest X-rays and PaO2/FiO2 ratio (r = -0.14, p = 0.279). CONCLUSION: The appearance of B-lines correlated well with primary graft dysfunction and outperformed chest radiographs.

Assuntos

Transplante de Pulmão , Disfunção Primária do Enxerto , Síndrome do Desconforto Respiratório , Humanos , Disfunção Primária do Enxerto/diagnóstico por imagem , Pulmão/diagnóstico por imagem , Ultrassonografia , Transplante de Pulmão/efeitos adversos

16.

Automated localization of the medial clavicular epiphyseal cartilages using an object detection network: a step towards deep learning-based forensic age assessment.

Wesp, Philipp; Sabel, Bastian Oliver; Mittermeier, Andreas; Stüber, Anna Theresa; Jeblick, Katharina; Schinke, Patrick; Mühlmann, Marc; Fischer, Florian; Penning, Randolph; Ricke, Jens; Ingrisch, Michael; Schachtner, Balthasar Maria.

Int J Legal Med ; 137(3): 733-742, 2023 May.

Artigo em Inglês | MEDLINE | ID: mdl-36729183

RESUMO

BACKGROUND: Deep learning is a promising technique to improve radiological age assessment. However, expensive manual annotation by experts poses a bottleneck for creating large datasets to appropriately train deep neural networks. We propose an object detection approach to automatically annotate the medial clavicular epiphyseal cartilages in computed tomography (CT) scans. METHODS: The sternoclavicular joints were selected as structure-of-interest (SOI) in chest CT scans and served as an easy-to-identify proxy for the actual medial clavicular epiphyseal cartilages. CT slices containing the SOI were manually annotated with bounding boxes around the SOI. All slices in the training set were used to train the object detection network RetinaNet. Afterwards, the network was applied individually to all slices of the test scans for SOI detection. Bounding box and slice position of the detection with the highest classification score were used as the location estimate for the medial clavicular epiphyseal cartilages inside the CT scan. RESULTS: From 100 CT scans of 82 patients, 29,656 slices were used for training and 30,846 slices from 110 CT scans of 110 different patients for testing the object detection network. The location estimate from the deep learning approach for the SOI was in a correct slice in 97/110 (88%), misplaced by one slice in 5/110 (5%), and missing in 8/110 (7%) test scans. No estimate was misplaced by more than one slice. CONCLUSIONS: We demonstrated a robust automated approach for annotating the medial clavicular epiphyseal cartilages. This enables training and testing of deep neural networks for age assessment.

Assuntos

Aprendizado Profundo , Lâmina de Crescimento , Humanos , Lâmina de Crescimento/diagnóstico por imagem , Tomografia Computadorizada por Raios X/métodos , Redes Neurais de Computação , Clavícula/diagnóstico por imagem

17.

Reduction in Radiation Exposure of CT Perfusion by Optimized Imaging Timing Using Temporal Information of the Preceding CT Angiography of the Carotid Artery in the Stroke Protocol.

Deak, Zsuzsanna; Schuettoff, Lara; Lohse, Ann-Kathrin; Fabritius, Matthias; Reidler, Paul; Forbrig, Robert; Kunz, Wolfgang; Dimitriadis, Konstantin; Ricke, Jens; Sabel, Bastian.

Diagnostics (Basel) ; 12(11)2022 Nov 18.

Artigo em Inglês | MEDLINE | ID: mdl-36428913

RESUMO

(1) Background: CT perfusion (CTP) is a fast, robust and widely available but dose-exposing imaging technique for infarct core and penumbra detection. Carotid CT angiography (CTA) can precede CTP in the stroke protocol. Temporal information of the bolus tracking series of CTA could allow for better timing and a decreased number of scans in CTP, resulting in less radiation exposure, if the shortening of CTP does not alter the calculated infarct core and penumbra or the resulting perfusion maps, which are essential for further treatment decisions. (2) Methods: 66 consecutive patients with ischemic stroke proven by follow-up imaging or endovascular intervention were included in this retrospective study approved by the local ethics committee. In each case, six simulated, stepwise shortened CTP examinations were compared with the original data regarding the perfusion maps, infarct core, penumbra and endovascular treatment decision. (3) Results: In simulated CTPs with 26, 28 and 30 scans, the infarct core, penumbra and PRR values were equivalent, and the resulting clinical decision was identical to the original CTP. (4) Conclusions: The temporal information of the bolus tracking series of the carotid CTA can allow for better timing and a lower radiation exposure by eliminating unnecessary scans in CTP.

18.

Therapeutic Outcome of MR-Guided High-Intensity Focused Ultrasound (MR-HIFU) in Solitary versus Multiple Uterine Fibroids.

Erber, Bernd; Schwarze, Vincent; Strobl, Frederik; Burges, Alexander; Mahner, Sven; Goller, Sophia Samira; Rudolph, Jan; Ricke, Jens; Sabel, Bastian Oliver.

Healthcare (Basel) ; 10(8)2022 Aug 04.

Artigo em Inglês | MEDLINE | ID: mdl-36011128

RESUMO

MR-guided high-intensity focused ultrasound (MR-HIFU) is an effective method for treating symptomatic uterine fibroids, especially solitary lesions. The aim of our study was to compare the clinical and morphological outcomes of patients who underwent MR-HIFU due to solitary fibroid (SF) or multiple fibroids (MFs) in a prospective clinical trial. We prospectively included 21 consecutive patients with SF (10) and MF (11) eligible for MR-guided HIFU. The morphological data were assessed using mint Lesion™ for MRI. The clinical data were determined using the Uterine Fibroid Symptom and Quality of Life (UFS-QOL) questionnaire before and 6 months after treatment. Unpaired and paired Wilcoxon-test and t-tests were applied, and Pearson's coefficient was used for correlation analysis. A p-value of 0.05 was considered statistically significant. The volume of treated fibroids significantly decreased in both the SF (mean baseline: 118.6 cm3; mean 6-month follow-up: 64.6 cm3) and MF (107.2 cm3; 55.1 cm3) groups. The UFS-QOL showed clinical symptoms significantly improved for patients in both the SF and MF groups regarding concern, activities, energy/mood, and control. The short-term outcome for the treatment of symptomatic fibroids in myomatous uterus by MR-guided HIFU is clinically similar to that of solitary fibroids.

19.

Clinically focused multi-cohort benchmarking as a tool for external validation of artificial intelligence algorithm performance in basic chest radiography analysis.

Rudolph, Jan; Schachtner, Balthasar; Fink, Nicola; Koliogiannis, Vanessa; Schwarze, Vincent; Goller, Sophia; Trappmann, Lena; Hoppe, Boj F; Mansour, Nabeel; Fischer, Maximilian; Ben Khaled, Najib; Jörgens, Maximilian; Dinkel, Julien; Kunz, Wolfgang G; Ricke, Jens; Ingrisch, Michael; Sabel, Bastian O; Rueckel, Johannes.

Sci Rep ; 12(1): 12764, 2022 07 27.

Artigo em Inglês | MEDLINE | ID: mdl-35896763

RESUMO

Artificial intelligence (AI) algorithms evaluating [supine] chest radiographs ([S]CXRs) have remarkably increased in number recently. Since training and validation are often performed on subsets of the same overall dataset, external validation is mandatory to reproduce results and reveal potential training errors. We applied a multicohort benchmarking to the publicly accessible (S)CXR analyzing AI algorithm CheXNet, comprising three clinically relevant study cohorts which differ in patient positioning ([S]CXRs), the applied reference standards (CT-/[S]CXR-based) and the possibility to also compare algorithm classification with different medical experts' reading performance. The study cohorts include [1] a cohort, characterized by 563 CXRs acquired in the emergency unit that were evaluated by 9 readers (radiologists and non-radiologists) in terms of 4 common pathologies, [2] a collection of 6,248 SCXRs annotated by radiologists in terms of pneumothorax presence, its size and presence of inserted thoracic tube material which allowed for subgroup and confounding bias analysis and [3] a cohort consisting of 166 patients with SCXRs that were evaluated by radiologists for underlying causes of basal lung opacities, all of those cases having been correlated to a timely acquired computed tomography scan (SCXR and CT within < 90 min). CheXNet non-significantly exceeded the radiology resident (RR) consensus in the detection of suspicious lung nodules (cohort [1], AUC AI/RR: 0.851/0.839, p = 0.793) and the radiological readers in the detection of basal pneumonia (cohort [3], AUC AI/reader consensus: 0.825/0.782, p = 0.390) and basal pleural effusion (cohort [3], AUC AI/reader consensus: 0.762/0.710, p = 0.336) in SCXR, partly with AUC values higher than originally published ("Nodule": 0.780, "Infiltration": 0.735, "Effusion": 0.864). The classifier "Infiltration" turned out to be very dependent on patient positioning (best in CXR, worst in SCXR). The pneumothorax SCXR cohort [2] revealed poor algorithm performance in CXRs without inserted thoracic material and in the detection of small pneumothoraces, which can be explained by a known systematic confounding error in the algorithm training process. The benefit of clinically relevant external validation is demonstrated by the differences in algorithm performance as compared to the original publication. Our multi-cohort benchmarking finally enables the consideration of confounders, different reference standards and patient positioning as well as the AI performance comparison with differentially qualified medical readers.

Assuntos

Inteligência Artificial , Pneumotórax , Algoritmos , Benchmarking , Humanos , Pneumotórax/etiologia , Radiografia Torácica/métodos , Estudos Retrospectivos

20.

Artificial Intelligence in Chest Radiography Reporting Accuracy: Added Clinical Value in the Emergency Unit Setting Without 24/7 Radiology Coverage.

Rudolph, Jan; Huemmer, Christian; Ghesu, Florin-Cristian; Mansoor, Awais; Preuhs, Alexander; Fieselmann, Andreas; Fink, Nicola; Dinkel, Julien; Koliogiannis, Vanessa; Schwarze, Vincent; Goller, Sophia; Fischer, Maximilian; Jörgens, Maximilian; Ben Khaled, Najib; Vishwanath, Reddappagari Suryanarayana; Balachandran, Abishek; Ingrisch, Michael; Ricke, Jens; Sabel, Bastian Oliver; Rueckel, Johannes.

Invest Radiol ; 57(2): 90-98, 2022 02 01.

Artigo em Inglês | MEDLINE | ID: mdl-34352804

RESUMO

OBJECTIVES: Chest radiographs (CXRs) are commonly performed in emergency units (EUs), but the interpretation requires radiology experience. We developed an artificial intelligence (AI) system (precommercial) that aims to mimic board-certified radiologists' (BCRs') performance and can therefore support non-radiology residents (NRRs) in clinical settings lacking 24/7 radiology coverage. We validated by quantifying the clinical value of our AI system for radiology residents (RRs) and EU-experienced NRRs in a clinically representative EU setting. MATERIALS AND METHODS: A total of 563 EU CXRs were retrospectively assessed by 3 BCRs, 3 RRs, and 3 EU-experienced NRRs. Suspected pathologies (pleural effusion, pneumothorax, consolidations suspicious for pneumonia, lung lesions) were reported on a 5-step confidence scale (sum of 20,268 reported pathology suspicions [563 images × 9 readers × 4 pathologies]) separately by every involved reader. Board-certified radiologists' confidence scores were converted into 4 binary reference standards (RFSs) of different sensitivities. The RRs' and NRRs' performances were statistically compared with our AI system (trained on nonpublic data from different clinical sites) based on receiver operating characteristics (ROCs) and operating point metrics approximated to the maximum sum of sensitivity and specificity (Youden statistics). RESULTS: The NRRs lose diagnostic accuracy to RRs with increasingly sensitive BCRs' RFSs for all considered pathologies. Based on our external validation data set, the AI system/NRRs' consensus mimicked the most sensitive BCRs' RFSs with areas under ROC of 0.940/0.837 (pneumothorax), 0.953/0.823 (pleural effusion), and 0.883/0.747 (lung lesions), which were comparable to experienced RRs and significantly overcomes EU-experienced NRRs' diagnostic performance. For consolidation detection, the AI system performed on the NRRs' consensus level (and overcomes each individual NRR) with an area under ROC of 0.847 referenced to the BCRs' most sensitive RFS. CONCLUSIONS: Our AI system matched RRs' performance, meanwhile significantly outperformed NRRs' diagnostic accuracy for most of considered CXR pathologies (pneumothorax, pleural effusion, and lung lesions) and therefore might serve as clinical decision support for NRRs.

Assuntos

Pneumopatias , Derrame Pleural , Pneumotórax , Radiologia , Inteligência Artificial , Serviço Hospitalar de Emergência , Humanos , Derrame Pleural/diagnóstico por imagem , Pneumotórax/diagnóstico por imagem , Radiografia , Radiografia Torácica/métodos , Estudos Retrospectivos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA