Search | VHL Regional Portal (BVS)

1.

The virtual reference radiologist: comprehensive AI assistance for clinical image reading and interpretation.

Siepmann, Robert; Huppertz, Marc; Rastkhiz, Annika; Reen, Matthias; Corban, Eric; Schmidt, Christian; Wilke, Stephan; Schad, Philipp; Yüksel, Can; Kuhl, Christiane; Truhn, Daniel; Nebelung, Sven.

Eur Radiol ; 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38627289

ABSTRACT

OBJECTIVES: Large language models (LLMs) have shown potential in radiology, but their ability to aid radiologists in interpreting imaging studies remains unexplored. We investigated the effects of a state-of-the-art LLM (GPT-4) on the radiologists' diagnostic workflow. MATERIALS AND METHODS: In this retrospective study, six radiologists of different experience levels read 40 selected radiographic [n = 10], CT [n = 10], MRI [n = 10], and angiographic [n = 10] studies unassisted (session one) and assisted by GPT-4 (session two). Each imaging study was presented with demographic data, the chief complaint, and associated symptoms, and diagnoses were registered using an online survey tool. The impact of Artificial Intelligence (AI) on diagnostic accuracy, confidence, user experience, input prompts, and generated responses was assessed. False information was registered. Linear mixed-effect models were used to quantify the factors (fixed: experience, modality, AI assistance; random: radiologist) influencing diagnostic accuracy and confidence. RESULTS: When assessing if the correct diagnosis was among the top-3 differential diagnoses, diagnostic accuracy improved slightly from 181/240 (75.4%, unassisted) to 188/240 (78.3%, AI-assisted). Similar improvements were found when only the top differential diagnosis was considered. AI assistance was used in 77.5% of the readings. Three hundred nine prompts were generated, primarily involving differential diagnoses (59.1%) and imaging features of specific conditions (27.5%). Diagnostic confidence was significantly higher when readings were AI-assisted (p < 0.001). Twenty-three responses (7.4%) were classified as hallucinations, while two (0.6%) were misinterpretations. CONCLUSION: Integrating GPT-4 in the diagnostic process improved diagnostic accuracy slightly and diagnostic confidence significantly. Potentially harmful hallucinations and misinterpretations call for caution and highlight the need for further safeguarding measures.
CLINICAL RELEVANCE STATEMENT: Using GPT-4 as a virtual assistant when reading images made six radiologists of different experience levels feel more confident and provide more accurate diagnoses; yet, GPT-4 gave factually incorrect and potentially harmful information in 7.4% of its responses.

2.

Reduction of ADC bias in diffusion MRI with deep learning-based acceleration: A phantom validation study at 3.0 T.

Lemainque, Teresa; Yoneyama, Masami; Morsch, Chiara; Iordanishvili, Elene; Barabasch, Alexandra; Schulze-Hagen, Maximilian; Peeters, Johannes M; Kuhl, Christiane; Zhang, Shuo.

Magn Reson Imaging ; 110: 96-103, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38631532

ABSTRACT

PURPOSE: Further acceleration of DWI in diagnostic radiology is desired but challenging mainly due to low SNR in high b-value images and associated bias in quantitative ADC values. Deep learning-based reconstruction and denoising may provide a solution to address this challenge. METHODS: The effects of SNR reduction on ADC bias and variability were investigated using a commercial diffusion phantom and numerical simulations. In the phantom, performance of different reconstruction methods, including conventional parallel (SENSE) imaging, compressed sensing (C-SENSE), and compressed SENSE acceleration with an artificial intelligence deep learning-based technique (C-SENSE AI), was compared at different acceleration factors and flip angles using ROI-based analysis. ADC bias was assessed by Lin's concordance correlation coefficient (CCC) followed by bootstrapping to calculate confidence intervals (CI). ADC random measurement error (RME) was assessed by the mean coefficient of variation (mean CV) and non-parametric statistical tests. RESULTS: The simulations predicted increasingly negative bias and loss of precision towards lower SNR. These effects were confirmed in phantom measurements of increasing acceleration, for which CCC decreased from 0.947 to 0.279 and mean CV increased from 0.043 to 0.439, and of decreasing flip angle, for which CCC decreased from 0.990 to 0.063 and mean CV increased from 0.037 to 0.508. At high acceleration and low flip angle, C-SENSE AI reconstruction yielded best denoised ADC maps. For the lowest investigated flip angle, CCC = {0.630, 0.771, 0.987} and mean CV = {0.508, 0.426, 0.254} were obtained for {SENSE, C-SENSE, C-SENSE AI}, the improvement by C-SENSE AI being significant as compared to the other methods (CV: p = 0.033 for C-SENSE AI vs. C-SENSE and p < 0.001 for C-SENSE AI vs. SENSE; CCC: non-overlapping CI between reconstruction methods).
For the highest investigated acceleration factor, CCC = {0.479, 0.926, 0.960} and mean CV = {0.519, 0.119, 0.118} were found, confirming the reduction of bias and RME by C-SENSE AI as compared to C-SENSE (by trend) and to SENSE (CV: p < 0.001; CCC: non-overlapping CI). CONCLUSION: ADC bias and random measurement error in DWI at low SNR, typically associated with scan acceleration, can be effectively reduced by deep-learning based C-SENSE AI reconstruction.
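
The two agreement measures named in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming the standard textbook definitions (Lin's CCC with population moments, and CV as standard deviation over mean); the abstract does not give the authors' exact implementation or bootstrapping scheme, and the ADC values below are made up for illustration.

```python
from statistics import mean, pstdev, pvariance


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    mx, my = mean(x), mean(y)
    # Population (1/n) covariance, matching the population variances below.
    sxy = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * sxy / (pvariance(x) + pvariance(y) + (mx - my) ** 2)


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form)."""
    return pstdev(values) / mean(values)


# Hypothetical reference and measured ADC values (units of 1e-3 mm^2/s).
reference = [0.4, 0.8, 1.2, 1.6, 2.0]
biased = [0.8 * v for v in reference]  # a uniform 20% underestimation

print("CCC, perfect agreement:", lins_ccc(reference, reference))
print("CCC, negatively biased:", lins_ccc(reference, biased))
print("CV within one ROI:", coefficient_of_variation([1.18, 1.22, 1.20, 1.21]))
```

Unlike Pearson's r, the CCC drops below 1 for the uniformly scaled series, which is why it is suited to detecting systematic ADC bias and not only scatter.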

3.

Diffusion probabilistic versus generative adversarial models to reduce contrast agent dose in breast MRI.

Müller-Franzes, Gustav; Huck, Luisa; Bode, Maike; Nebelung, Sven; Kuhl, Christiane; Truhn, Daniel; Lemainque, Teresa.

Eur Radiol Exp ; 8(1): 53, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38689178

ABSTRACT

BACKGROUND: To compare denoising diffusion probabilistic models (DDPM) and generative adversarial networks (GAN) for recovering contrast-enhanced breast magnetic resonance imaging (MRI) subtraction images from virtual low-dose subtraction images. METHODS: Retrospective, ethically approved study. DDPM- and GAN-reconstructed single-slice subtraction images of 50 breasts with enhancing lesions were compared to original ones at three dose levels (25%, 10%, 5%) using quantitative measures and radiologic evaluations. Two radiologists stated their preference based on the reconstruction quality and scored the lesion conspicuity as compared to the original, blinded to the model. Fifty lesion-free maximum intensity projections were evaluated for the presence of false-positives. Results were compared between models and dose levels, using generalized linear mixed models. RESULTS: At 5% dose, both radiologists preferred the GAN-generated images, whereas at 25% dose, both radiologists preferred the DDPM-generated images. Median lesion conspicuity scores did not differ between GAN and DDPM at 25% dose (5 versus 5, p = 1.000) and 10% dose (4 versus 4, p = 1.000). At 5% dose, both readers assigned higher conspicuity to the GAN than to the DDPM (3 versus 2, p = 0.007). In the lesion-free examinations, DDPM and GAN showed no differences in the false-positive rate at 5% (15% versus 22%), 10% (10% versus 6%), and 25% (6% versus 4%) (p = 1.000). CONCLUSIONS: Both GAN and DDPM yielded promising results in low-dose image reconstruction. However, neither of them showed superior results over the other model for all dose levels and evaluation metrics. Further development is needed to counteract false-positives. RELEVANCE STATEMENT: For MRI-based breast cancer screening, reducing the contrast agent dose is desirable. Diffusion probabilistic models and generative adversarial networks were capable of retrospectively enhancing the signal of low-dose images. 
Hence, they may supplement imaging with reduced doses in the future. KEY POINTS: • Deep learning may help recover signal in low-dose contrast-enhanced breast MRI. • Two models (DDPM and GAN) were trained at different dose levels. • Radiologists preferred DDPM at 25%, and GAN images at 5% dose. • Lesion conspicuity between DDPM and GAN was similar, except at 5% dose. • GAN and DDPM yield promising results in low-dose image reconstruction.

Subjects

Breast Neoplasms , Contrast Media , Magnetic Resonance Imaging , Humans , Female , Retrospective Studies , Contrast Media/administration & dosage , Breast Neoplasms/diagnostic imaging , Magnetic Resonance Imaging/methods , Middle Aged , Statistical Models , Adult , Aged

4.

Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging.

Tayebi Arasteh, Soroosh; Ziller, Alexander; Kuhl, Christiane; Makowski, Marcus; Nebelung, Sven; Braren, Rickmer; Rueckert, Daniel; Truhn, Daniel; Kaissis, Georgios.

Commun Med (Lond) ; 4(1): 46, 2024 Mar 14.

Article in English | MEDLINE | ID: mdl-38486100

ABSTRACT

BACKGROUND: Artificial intelligence (AI) models are increasingly used in the medical domain. However, as medical data is highly sensitive, special precautions to ensure its protection are required. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. Prior work indicates that DP has negative implications on model accuracy and fairness, which are unacceptable in medicine and represent a main barrier to the widespread use of privacy-preserving techniques. In this work, we evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training. METHODS: We used two datasets: (1) A large dataset (N = 193,311) of high quality clinical chest radiographs, and (2) a dataset (N = 1625) of 3D abdominal computed tomography (CT) images, with the task of classifying the presence of pancreatic ductal adenocarcinoma (PDAC). Both were retrospectively collected and manually labeled by experienced radiologists. We then compared non-private deep convolutional neural networks (CNNs) and privacy-preserving (DP) models with respect to privacy-utility trade-offs measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson's r or Statistical Parity Difference. RESULTS: We find that, while the privacy-preserving training yields lower accuracy, it largely does not amplify discrimination against age, sex or co-morbidity. However, we find an indication that difficult diagnoses and subgroups suffer stronger performance hits in private training. CONCLUSIONS: Our study shows that - under the challenging realistic circumstances of a real-life clinical dataset - the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.

Artificial intelligence (AI), in which computers can learn to do tasks that normally require human intelligence, is particularly useful in medical imaging. However, AI should be used in a way that preserves patient privacy. We explored the balance between maintaining patient data privacy and AI performance in medical imaging. We use an approach called differential privacy to protect the privacy of patients' images. We show that, although training AI with differential privacy leads to a slight decrease in accuracy, it does not substantially increase bias against different age groups, genders, or patients with multiple health conditions. However, we notice that AI faces more challenges in accurately diagnosing complex cases and specific subgroups when trained under these privacy constraints. These findings highlight the importance of designing AI systems that are both privacy-conscious and capable of reliable diagnoses across patient groups.
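
One of the fairness measures named in the abstract above, Statistical Parity Difference, can be illustrated as follows. This is a minimal sketch assuming the common definition (difference in positive-prediction rates between two groups); the abstract does not state the exact formulation the authors used, and the function names and sample predictions are hypothetical.

```python
def positive_rate(predictions):
    """Fraction of cases in a group that the model labels positive (1)."""
    return sum(predictions) / len(predictions)


def statistical_parity_difference(preds_group_a, preds_group_b):
    """Positive rate of group A minus positive rate of group B.

    A value of 0 means both groups receive positive predictions
    at the same rate.
    """
    return positive_rate(preds_group_a) - positive_rate(preds_group_b)


# Hypothetical binary predictions for two patient subgroups.
group_a = [1, 1, 0, 0]
group_b = [1, 0, 0, 0]
print(statistical_parity_difference(group_a, group_b))
```

Comparing this value between a non-private model and its differentially private counterpart is one way to check whether private training widens the gap between subgroups.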

5.

Abbreviated Breast MRI: State of the Art.

Kuhl, Christiane K.

Radiology ; 310(3): e221822, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38530181

ABSTRACT

Abbreviated MRI is an umbrella term, defined as a focused MRI examination tailored to answer a single specific clinical question. For abbreviated breast MRI, this question is: "Is there evidence of breast cancer?" Abbreviated MRI of the breast makes maximum use of the fact that the kinetics of breast cancers and of benign tissue differ most in the very early postcontrast phase; therefore, abbreviated breast MRI focuses on this period. The different published approaches to abbreviated MRI include the following three subtypes: (a) short protocols, consisting of a precontrast and either a single postcontrast acquisition (first postcontrast subtracted [FAST]) or a time-resolved series of postcontrast acquisitions with lower spatial resolution (ultrafast [UF]), obtained during the early postcontrast phase immediately after contrast agent injection; (b) abridged protocols, consisting of FAST or UF acquisitions plus selected additional pulse sequences; and (c) noncontrast protocols, where diffusion-weighted imaging replaces the contrast information. Abbreviated MRI was proposed to increase tolerability of and access to breast MRI as a screening tool. But its widening application now includes follow-up after breast cancer and even diagnostic assessment. This review defines the three subtypes of abbreviated MRI, highlighting the differences between the protocols and their clinical implications and summarizing the respective evidence on diagnostic accuracy and clinical utility.

Subjects

Breast Neoplasms , Magnetic Resonance Imaging , Humans , Female , Diffusion Magnetic Resonance Imaging , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Kinetics

6.

Author Correction: A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports.

Truhn, Daniel; Weber, Christian D; Braun, Benedikt J; Bressem, Keno; Kather, Jakob N; Kuhl, Christiane; Nebelung, Sven.

Sci Rep ; 14(1): 5431, 2024 Mar 05.

Article in English | MEDLINE | ID: mdl-38443449

7.

Evaluation of Pulmonary Nodules by Radiologists vs. Radiomics in Stand-Alone and Complementary CT and MRI.

Tietz, Eric; Müller-Franzes, Gustav; Zimmermann, Markus; Kuhl, Christiane Katharina; Keil, Sebastian; Nebelung, Sven; Truhn, Daniel.

Diagnostics (Basel) ; 14(5), 2024 Feb 23.

Article in English | MEDLINE | ID: mdl-38472955

ABSTRACT

Increased attention has been given to MRI in radiation-free screening for malignant nodules in recent years. Our objective was to compare the performance of human readers and radiomic feature analysis based on stand-alone and complementary CT and MRI imaging in classifying pulmonary nodules. This single-center study comprises patients with CT findings of pulmonary nodules who underwent additional lung MRI and whose nodules were classified as benign/malignant by resection. For radiomic features analysis, 2D segmentation was performed for each lung nodule on axial CT, T2-weighted (T2w), and diffusion (DWI) images. The 105 extracted features were reduced by iterative backward selection. The performance of radiomics and human readers was compared by calculating accuracy with Clopper-Pearson confidence intervals. Fifty patients (mean age 63 ± 10 years) with 66 pulmonary nodules (40 malignant) were evaluated. ACC values for radiomic features analysis vs. radiologists based on CT alone (0.68; 95% CI: 0.56, 0.79 vs. 0.59; 95% CI: 0.46, 0.71), T2w alone (0.65; 95% CI: 0.52, 0.77 vs. 0.68; 95% CI: 0.54, 0.78), DWI alone (0.61; 95% CI: 0.48, 0.72 vs. 0.73; 95% CI: 0.60, 0.83), combined T2w/DWI (0.73; 95% CI: 0.60, 0.83 vs. 0.70; 95% CI: 0.57, 0.80), and combined CT/T2w/DWI (0.83; 95% CI: 0.72, 0.91 vs. 0.64; 95% CI: 0.51, 0.75) were calculated. This study is the first to show that by combining quantitative image information from CT, T2w, and DWI datasets, pulmonary nodule assessment through radiomics analysis is superior to using one modality alone, even exceeding human readers' performance.
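
The Clopper-Pearson ("exact") interval mentioned in the abstract above can be computed without external libraries by inverting the binomial distribution. The sketch below is illustrative only: the bisection approach and function names are choices made here, and the count of 55 correct out of 66 nodules is an assumption (it is the count that rounds to the reported accuracy of 0.83; the abstract does not state it).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(f, lo=0.0, hi=1.0, iterations=200):
    """Bisection for the root of a function that increases on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided exact confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Assumed count: 55 of 66 nodules classified correctly (accuracy ~0.83).
low, high = clopper_pearson(55, 66)
print(f"accuracy = {55 / 66:.2f}, 95% CI: {low:.2f}, {high:.2f}")
```

The interval is conservative by construction, which makes it a common choice for small samples such as the 66 nodules in this study.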

8.

Large language models streamline automated machine learning for clinical studies.

Tayebi Arasteh, Soroosh; Han, Tianyu; Lotfinia, Mahshad; Kuhl, Christiane; Kather, Jakob Nikolas; Truhn, Daniel; Nebelung, Sven.

Nat Commun ; 15(1): 1603, 2024 Feb 21.

Article in English | MEDLINE | ID: mdl-38383555

ABSTRACT

A knowledge gap persists between machine learning (ML) developers (e.g., data scientists) and practitioners (e.g., clinicians), hampering the full utilization of ML for clinical data analysis. We investigated the potential of the ChatGPT Advanced Data Analysis (ADA), an extension of GPT-4, to bridge this gap and perform ML analyses efficiently. Real-world clinical datasets and study details from large trials across various medical specialties were presented to ChatGPT ADA without specific guidance. ChatGPT ADA autonomously developed state-of-the-art ML models based on the original study's training data to predict clinical outcomes such as cancer development, cancer progression, disease complications, or biomarkers such as pathogenic gene sequences. Following the re-implementation and optimization of the published models, the head-to-head comparison of the ChatGPT ADA-crafted ML models and their respective manually crafted counterparts revealed no significant differences in traditional performance metrics (p ≥ 0.072). Strikingly, the ChatGPT ADA-crafted ML models often outperformed their counterparts. In conclusion, ChatGPT ADA offers a promising avenue to democratize ML in medicine by simplifying complex data analyses, yet should enhance, not replace, specialized training and resources, to promote broader applications in medical research and practice.

Subjects

Algorithms , Neoplasms , Humans , Benchmarking , Language , Machine Learning

9.

[Current MR imaging of cartilage in the context of knee osteoarthritis (part 2) : Cartilage pathologies and their assessment]. / Aktuelle MRT-Bildgebung des Knorpels im Kontext der Gonarthrose (Teil 2) : Knorpelpathologien und deren Beurteilung.

Huppertz, Marc Sebastian; Lemainque, Teresa; Yüksel, Can; Siepmann, Robert; Kuhl, Christiane; Roemer, Frank; Truhn, Daniel; Nebelung, Sven.

Radiologie (Heidelb) ; 64(4): 304-311, 2024 Apr.

Article in German | MEDLINE | ID: mdl-38170243

ABSTRACT

High-quality magnetic resonance (MR) imaging is essential for the precise assessment of the knee joint and plays a key role in diagnosis, treatment, and prognosis. Intact cartilage tissue is characterized by a smooth surface, uniform tissue thickness and an organized zonal structure, which are manifested as depth-dependent signal intensity variations. Cartilage pathologies are identifiable through alterations in signal intensity and morphology and should be communicated based on a precise terminology. Cartilage pathologies can show hyperintense and hypointense signal alterations. Cartilage defects are assessed based on their depth and should be described in terms of their location and extent. The following symptom constellations are of overarching clinical relevance in image reading and interpretation: symptom constellations associated with rapidly progressive forms of joint degeneration and unfavorable prognosis, accompanying symptom constellations mostly in connection with destabilizing meniscal lesions and subchondral insufficiency fractures (accelerated osteoarthritis) as well as symptoms beyond the "typical" degeneration, especially when a discrepancy is observed between (minor) structural changes and (major) synovitis and effusion (inflammatory arthropathy).

Subjects

Articular Cartilage , Knee Osteoarthritis , Humans , Knee Osteoarthritis/complications , Knee Osteoarthritis/pathology , Articular Cartilage/pathology , Disease Progression , Knee Joint/pathology , Magnetic Resonance Imaging/methods

10.

[Current MR imaging of cartilage in the context of knee osteoarthritis (part 1) : Principles and sequences]. / Aktuelle MRT-Bildgebung des Knorpels im Kontext der Gonarthrose (Teil 1) : Grundlagen und Sequenzen.

Lemainque, Teresa; Huppertz, Marc Sebastian; Yüksel, Can; Siepmann, Robert; Kuhl, Christiane; Roemer, Frank; Truhn, Daniel; Nebelung, Sven.

Radiologie (Heidelb) ; 64(4): 295-303, 2024 Apr.

Article in German | MEDLINE | ID: mdl-38158404

ABSTRACT

Magnetic resonance imaging (MRI) is the clinical method of choice for cartilage imaging in the context of degenerative and nondegenerative joint diseases. The MRI-based definitions of osteoarthritis rely on the detection of osteophytes, cartilage pathologies, bone marrow edema and meniscal lesions but currently a scientific consensus is lacking. In the clinical routine proton density-weighted, fat-suppressed 2D turbo spin echo sequences with echo times of 30-40 ms are predominantly used, which are sufficiently sensitive and specific for the assessment of cartilage. The additionally acquired T1-weighted sequences are primarily used for evaluating other intra-articular and periarticular structures. Diagnostically relevant artifacts include magic angle and chemical shift artifacts, which can lead to artificial signal enhancement in cartilage or incorrect representations of the subchondral lamina and its thickness. Although scientifically validated, high-resolution 3D gradient echo sequences (for cartilage segmentation) and compositional MR sequences (for quantification of physical tissue parameters) are currently reserved for scientific research questions. The future integration of artificial intelligence techniques in areas such as image reconstruction (to reduce scan times while maintaining image quality), image analysis (for automated identification of cartilage defects), and image postprocessing (for automated segmentation of cartilage in terms of volume and thickness) will significantly improve the diagnostic workflow and advance the field further.

Subjects

Cartilage Diseases , Articular Cartilage , Knee Osteoarthritis , Humans , Knee Osteoarthritis/pathology , Articular Cartilage/pathology , Artificial Intelligence , Cartilage Diseases/pathology , Magnetic Resonance Imaging/methods

11.

Encrypted federated learning for secure decentralized collaboration in cancer image analysis.

Truhn, Daniel; Tayebi Arasteh, Soroosh; Saldanha, Oliver Lester; Müller-Franzes, Gustav; Khader, Firas; Quirke, Philip; West, Nicholas P; Gray, Richard; Hutchins, Gordon G A; James, Jacqueline A; Loughrey, Maurice B; Salto-Tellez, Manuel; Brenner, Hermann; Brobeil, Alexander; Yuan, Tanwei; Chang-Claude, Jenny; Hoffmeister, Michael; Foersch, Sebastian; Han, Tianyu; Keil, Sebastian; Schulze-Hagen, Maximilian; Isfort, Peter; Bruners, Philipp; Kaissis, Georgios; Kuhl, Christiane; Nebelung, Sven; Kather, Jakob Nikolas.

Med Image Anal ; 92: 103059, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38104402

ABSTRACT

Artificial intelligence (AI) has a multitude of applications in cancer research and oncology. However, the training of AI systems is impeded by the limited availability of large datasets due to data protection requirements and other regulatory obstacles. Federated and swarm learning represent possible solutions to this problem by collaboratively training AI models while avoiding data transfer. However, in these decentralized methods, weight updates are still transferred to the aggregation server for merging the models. This leaves the possibility for a breach of data privacy, for example by model inversion or membership inference attacks by untrusted servers. Somewhat-homomorphically-encrypted federated learning (SHEFL) is a solution to this problem because only encrypted weights are transferred, and model updates are performed in the encrypted space. Here, we demonstrate the first successful implementation of SHEFL in a range of clinically relevant tasks in cancer image analysis on multicentric datasets in radiology and histopathology. We show that SHEFL enables the training of AI models which outperform locally trained models and perform on par with models which are centrally trained. In the future, SHEFL can enable multiple institutions to co-train AI models without forsaking data governance and without ever transmitting any decryptable data to untrusted servers.

Subjects

Neoplasms , Radiology , Humans , Artificial Intelligence , Learning , Neoplasms/diagnostic imaging , Computer-Assisted Image Processing

12.

Enhancing domain generalization in the AI-based analysis of chest radiographs with federated learning.

Tayebi Arasteh, Soroosh; Kuhl, Christiane; Saehn, Marwin-Jonathan; Isfort, Peter; Truhn, Daniel; Nebelung, Sven.

Sci Rep ; 13(1): 22576, 2023 Dec 19.

Article in English | MEDLINE | ID: mdl-38114729

ABSTRACT

Developing robust artificial intelligence (AI) models that generalize well to unseen datasets is challenging and usually requires large and variable datasets, preferably from multiple institutions. In federated learning (FL), a model is trained collaboratively at numerous sites that hold local datasets without exchanging them. So far, the impact of training strategy, i.e., local versus collaborative, on the diagnostic on-domain and off-domain performance of AI models interpreting chest radiographs has not been assessed. Consequently, using 610,000 chest radiographs from five institutions across the globe, we assessed diagnostic performance as a function of training strategy (i.e., local vs. collaborative), network architecture (i.e., convolutional vs. transformer-based), single versus cross-institutional performance (i.e., on-domain vs. off-domain), imaging finding (i.e., cardiomegaly, pleural effusion, pneumonia, atelectasis, consolidation, pneumothorax, and no abnormality), dataset size (i.e., from n = 18,000 to 213,921 radiographs), and dataset diversity. Large datasets not only showed minimal performance gains with FL but, in some instances, even exhibited decreases. In contrast, smaller datasets revealed marked improvements. Thus, on-domain performance was mainly driven by training data size. However, off-domain performance leaned more on training diversity. When trained collaboratively across diverse external institutions, AI models consistently surpassed models trained locally for off-domain tasks, emphasizing FL's potential in leveraging data diversity. In conclusion, FL can bolster diagnostic privacy, reproducibility, and off-domain reliability of AI models and, potentially, optimize healthcare outcomes.

Subjects

Artificial Intelligence , Learning , Reproducibility of Results , Psychological Generalization , Radiography
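
The collaborative training described in the abstract above relies on merging locally trained models without exchanging the underlying radiographs. A minimal sketch of one aggregation round is given below, assuming federated averaging (FedAvg) with weights proportional to local dataset size; the abstract does not specify the aggregation rule the authors used, and the function name, the flat parameter lists, and the toy parameters are hypothetical (the two site sizes are simply the smallest and largest dataset sizes quoted in the abstract).

```python
def federated_average(site_parameters, site_sizes):
    """Merge per-site parameter vectors into one global vector.

    site_parameters: one flat list of floats per site, all the same length.
    site_sizes: number of local training samples at each site.
    Each site contributes in proportion to its dataset size.
    """
    total = sum(site_sizes)
    n_params = len(site_parameters[0])
    return [
        sum(params[i] * size for params, size in zip(site_parameters, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical two-parameter models from a small and a large site.
small_site = [1.0, 2.0]
large_site = [3.0, 4.0]
merged = federated_average([small_site, large_site], [18_000, 213_921])
print(merged)
```

Only the parameter vectors and sample counts leave each site in this scheme; the images themselves stay local.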

13.

A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports.

Truhn, Daniel; Weber, Christian D; Braun, Benedikt J; Bressem, Keno; Kather, Jakob N; Kuhl, Christiane; Nebelung, Sven.

Sci Rep ; 13(1): 20159, 2023 Nov 17.

Article in English | MEDLINE | ID: mdl-37978240

ABSTRACT

Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment recommendations generated by GPT-4 for common knee and shoulder orthopedic conditions using anonymized clinical MRI reports. A retrospective analysis was conducted using 20 anonymized clinical MRI reports, with varying severity and complexity. Treatment recommendations were elicited from GPT-4 and evaluated by two board-certified specialty-trained senior orthopedic surgeons. Their evaluation focused on semiquantitative gradings of accuracy and clinical utility and potential limitations of the LLM-generated recommendations. GPT-4 provided treatment recommendations for 20 patients (mean age, 50 years ± 19 [standard deviation]; 12 men) with acute and chronic knee and shoulder conditions. The LLM produced largely accurate and clinically useful recommendations. However, limited awareness of a patient's overall situation, a tendency to incorrectly appreciate treatment urgency, and largely schematic and unspecific treatment recommendations were observed and may reduce its clinical usefulness. In conclusion, LLM-based treatment recommendations are largely adequate and not prone to 'hallucinations', yet inadequate in particular situations. Critical guidance by healthcare professionals is obligatory, and independent use by patients is discouraged, given the dependency on precise data input.

Subjects

Medicine , Musculoskeletal Diseases , Male , Humans , Middle Aged , Pilot Projects , Retrospective Studies , Language , Magnetic Resonance Imaging

14.

Multimodal Deep Learning for Integrating Chest Radiographs and Clinical Parameters: A Case for Transformers.

Khader, Firas; Müller-Franzes, Gustav; Wang, Tianci; Han, Tianyu; Tayebi Arasteh, Soroosh; Haarburger, Christoph; Stegmaier, Johannes; Bressem, Keno; Kuhl, Christiane; Nebelung, Sven; Kather, Jakob Nikolas; Truhn, Daniel.

Radiology ; 309(1): e230806, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37787671

ABSTRACT

Background: Clinicians consider both imaging and nonimaging data when diagnosing diseases; however, current machine learning approaches primarily consider data from a single modality. Purpose: To develop a neural network architecture capable of integrating multimodal patient data and compare its performance to models incorporating a single modality for diagnosing up to 25 pathologic conditions. Materials and Methods: In this retrospective study, imaging and nonimaging patient data were extracted from the Medical Information Mart for Intensive Care (MIMIC) database and an internal database comprised of chest radiographs and clinical parameters of inpatients in the intensive care unit (ICU) (January 2008 to December 2020). The MIMIC and internal data sets were each split into training (n = 33 893, n = 28 809), validation (n = 740, n = 7203), and test (n = 1909, n = 9004) sets. A novel transformer-based neural network architecture was trained to diagnose up to 25 conditions using nonimaging data alone, imaging data alone, or multimodal data. Diagnostic performance was assessed using area under the receiver operating characteristic curve (AUC) analysis. Results: The MIMIC and internal data sets included 36 542 patients (mean age, 63 years ± 17 [SD]; 20 567 male patients) and 45 016 patients (mean age, 66 years ± 16; 27 577 male patients), respectively. The multimodal model showed improved diagnostic performance for all pathologic conditions. For the MIMIC data set, the mean AUC was 0.77 (95% CI: 0.77, 0.78) when both chest radiographs and clinical parameters were used, compared with 0.70 (95% CI: 0.69, 0.71; P < .001) for only chest radiographs and 0.72 (95% CI: 0.72, 0.73; P < .001) for only clinical parameters. These findings were confirmed on the internal data set. Conclusion: A model trained on imaging and nonimaging data outperformed models trained on only one type of data for diagnosing multiple diseases in patients in an ICU setting.
© RSNA, 2023 Supplemental material is available for this article. See also the editorial by Kitamura and Topol in this issue.

Subjects

Deep Learning , Humans , Male , Middle Aged , Aged , Retrospective Studies , Radiography , Factual Databases , Inpatients

15.

Fibroglandular tissue segmentation in breast MRI using vision transformers: a multi-institutional evaluation.

Müller-Franzes, Gustav; Müller-Franzes, Fritz; Huck, Luisa; Raaff, Vanessa; Kemmer, Eva; Khader, Firas; Arasteh, Soroosh Tayebi; Lemainque, Teresa; Kather, Jakob Nikolas; Nebelung, Sven; Kuhl, Christiane; Truhn, Daniel.

Sci Rep ; 13(1): 14207, 2023 Aug 30.

Article in English | MEDLINE | ID: mdl-37648728

ABSTRACT

Accurate and automatic segmentation of fibroglandular tissue in breast MRI screening is essential for the quantification of breast density and background parenchymal enhancement. In this retrospective study, we developed and evaluated a transformer-based neural network for breast segmentation (TraBS) in multi-institutional MRI data, and compared its performance to the well established convolutional neural network nnUNet. TraBS and nnUNet were trained and tested on 200 internal and 40 external breast MRI examinations using manual segmentations generated by experienced human readers. Segmentation performance was assessed in terms of the Dice score and the average symmetric surface distance. The Dice score for nnUNet was lower than for TraBS on the internal testset (0.909 ± 0.069 versus 0.916 ± 0.067, P < 0.001) and on the external testset (0.824 ± 0.144 versus 0.864 ± 0.081, P = 0.004). Moreover, the average symmetric surface distance was higher (= worse) for nnUNet than for TraBS on the internal (0.657 ± 2.856 versus 0.548 ± 2.195, P = 0.001) and on the external testset (0.727 ± 0.620 versus 0.584 ± 0.413, P = 0.03). Our study demonstrates that transformer-based networks improve the quality of fibroglandular tissue segmentation in breast MRI compared to convolutional-based models like nnUNet. These findings might help to enhance the accuracy of breast density and parenchymal enhancement quantification in breast MRI screening.

Subjects

Breast Density , Magnetic Resonance Imaging , Humans , Retrospective Studies , Radiography , Electric Power Supplies

16.

Medical transformer for multimodal survival prediction in intensive care: integration of imaging and non-imaging data.

Khader, Firas; Kather, Jakob Nikolas; Müller-Franzes, Gustav; Wang, Tianci; Han, Tianyu; Tayebi Arasteh, Soroosh; Hamesch, Karim; Bressem, Keno; Haarburger, Christoph; Stegmaier, Johannes; Kuhl, Christiane; Nebelung, Sven; Truhn, Daniel.

Sci Rep ; 13(1): 10666, 2023 07 01.

Article in English | MEDLINE | ID: mdl-37393383

ABSTRACT

When clinicians assess the prognosis of patients in intensive care, they take imaging and non-imaging data into account. In contrast, many traditional machine learning models rely on only one of these modalities, limiting their potential in medical applications. This work proposes and evaluates a transformer-based neural network as a novel AI architecture that integrates multimodal patient data, i.e., imaging data (chest radiographs) and non-imaging data (clinical data). We evaluate the performance of our model in a retrospective study with 6,125 patients in intensive care. We show that the combined model (area under the receiver operating characteristic curve [AUROC] of 0.863) is superior to the radiographs-only model (AUROC = 0.811, p < 0.001) and the clinical data-only model (AUROC = 0.785, p < 0.001) when tasked with predicting in-hospital survival per patient. Furthermore, we demonstrate that our proposed model is robust in cases where not all (clinical) data points are available.

Subjects

Critical Care , Diagnostic Imaging , Humans , Retrospective Studies , Area Under Curve , Electric Power Supplies

17.

Efficacy and Safety of Half-Dose Gadopiclenol versus Full-Dose Gadobutrol for Contrast-enhanced Body MRI.

Kuhl, Christiane; Csoszi, Tibor; Piskorski, Wojciech; Miszalski, Tomasz; Lee, Jeong-Min; Otto, Pamela M.

Radiology ; 308(1): e222612, 2023 07.

Article in English | MEDLINE | ID: mdl-37462494

ABSTRACT

Background: Gadopiclenol is a macrocyclic gadolinium-based contrast agent (GBCA) with higher relaxivity compared with standard GBCAs, potentially allowing gadolinium dose reduction without decreasing efficacy. Purpose: To investigate whether gadopiclenol at 0.05 mmol/kg is noninferior to gadobutrol at 0.1 mmol/kg for lesion visualization in body MRI. Materials and Methods: A randomized, double-blind, crossover, phase 3 study was conducted between August 2019 and December 2020 at 33 centers in 11 countries. Adults with at least one suspected focal lesion in one of three different body regions (head and neck; breast, thorax, abdomen, or pelvis; or musculoskeletal system) underwent two contrast-enhanced MRI examinations, randomized to start with either gadopiclenol or gadobutrol. MRI examinations were read by three blinded expert readers for each respective body region. Readers rated border delineation, internal morphologic characteristics, and visual contrast enhancement. Three additional blinded readers assessed reader preference. For safety analysis, adverse events were recorded. The differences between gadopiclenol- and gadobutrol-enhanced MRI in terms of lesion visualization were analyzed with a generalized linear mixed model using a two-sided paired t test. Results: Among 273 participants (mean age, 57 years ± 13 [SD]; 162 women) who underwent both gadopiclenol- and gadobutrol-enhanced MRI and had at least one correlating lesion, 260 participants without major protocol deviations were analyzed for noninferiority. Gadopiclenol was noninferior to gadobutrol for all qualitative visualization parameters and for all readers (lower limit 95% CI of the difference of at least -0.10, which was above the noninferiority margin [-0.35]; P < .001). For most participants (75%-83% [206-228 of 276]), readers reported no preference between gadopiclenol- and gadobutrol-enhanced images. Adverse events did not differ in frequency, intensity, type, or association with GBCA injection (12 of 288 participants receiving gadopiclenol and 16 of 290 receiving gadobutrol). Conclusion: Gadopiclenol at 0.05 mmol/kg was comparable with gadobutrol at 0.1 mmol/kg for lesion evaluation at contrast-enhanced body MRI and had a similar safety profile. Clinical trial registration no. NCT03986138. Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Bashir and Thomas in this issue.
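The noninferiority criterion reported in this abstract amounts to comparing the lower limit of the confidence interval of the score difference against a prespecified margin. The sketch below illustrates only that decision rule, using the two figures quoted in the abstract; the function name `is_noninferior` is hypothetical, and the statistical model that produces the confidence interval is not reproduced here.

```python
def is_noninferior(ci_lower_limit, margin):
    """Return True when the lower CI limit of the difference
    (test agent minus comparator) lies above the noninferiority margin.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    return ci_lower_limit > margin


# Values quoted in the abstract: lower 95% CI limit of at least -0.10,
# noninferiority margin of -0.35.
print(is_noninferior(-0.10, -0.35))
```

Because the whole interval sits above the margin, a difference as unfavorable as the margin is considered ruled out at the chosen confidence level.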

Subjects

Brain Neoplasms , Organometallic Compounds , Adult , Humans , Female , Middle Aged , Gadolinium/adverse effects , Brain Neoplasms/pathology , Contrast Media , Magnetic Resonance Imaging/methods

18.

A multimodal comparison of latent denoising diffusion probabilistic models and generative adversarial networks for medical image synthesis.

Müller-Franzes, Gustav; Niehues, Jan Moritz; Khader, Firas; Arasteh, Soroosh Tayebi; Haarburger, Christoph; Kuhl, Christiane; Wang, Tianci; Han, Tianyu; Nolte, Teresa; Nebelung, Sven; Kather, Jakob Nikolas; Truhn, Daniel.

Sci Rep ; 13(1): 12098, 2023 07 26.

Article in English | MEDLINE | ID: mdl-37495660

ABSTRACT

Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have been recently addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 compared to 0.19 in the AIROGS dataset, 0.41 compared to 0.02 (cGAN) and 0.24 (StyleGAN-3) in the CRCDX dataset, and 0.32 compared to 0.17 (ProGAN) and 0.08 (StyleGAN-3) in the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, leading to improved diversity and fewer artifacts in the generated images.

Subjects

Artifacts , Mental Recall , Diffusion , Statistical Models , Ophthalmoscopy , Computer-Assisted Image Processing

19.

European Society of Breast Imaging (EUSOBI) guidelines on the management of axillary lymphadenopathy after COVID-19 vaccination: 2023 revision.

Schiaffino, Simone; Pinker, Katja; Cozzi, Andrea; Magni, Veronica; Athanasiou, Alexandra; Baltzer, Pascal A T; Camps Herrero, Julia; Clauser, Paola; Fallenberg, Eva M; Forrai, Gabor; Fuchsjäger, Michael H; Gilbert, Fiona J; Helbich, Thomas; Kilburn-Toppin, Fleur; Kuhl, Christiane K; Lesaru, Mihai; Mann, Ritse M; Panizza, Pietro; Pediconi, Federica; Sardanelli, Francesco; Sella, Tamar; Thomassin-Naggara, Isabelle; Zackrisson, Sophia; Pijnappel, Ruud M.

Insights Imaging ; 14(1): 126, 2023 Jul 19.

Article in English | MEDLINE | ID: mdl-37466753

ABSTRACT

Axillary lymphadenopathy is a common side effect of COVID-19 vaccination, leading to increased imaging-detected asymptomatic and symptomatic unilateral axillary lymphadenopathy. This has threatened to negatively impact the workflow of breast imaging services, leading to the release of ten recommendations by the European Society of Breast Imaging (EUSOBI) in August 2021. Considering the rapidly changing scenario and data scarcity, these initial recommendations kept a highly conservative approach. As of 2023, according to newly acquired evidence, EUSOBI proposes the following updates, in order to reduce unnecessary examinations and avoid delaying necessary examinations. First, recommendation n. 3 has been revised to state that breast examinations should not be delayed or rescheduled because of COVID-19 vaccination, as evidence from the first pandemic waves highlights how delayed or missed screening tests have a negative effect on breast cancer morbidity and mortality, and that there is a near-zero risk of subsequent malignant findings in asymptomatic patients who have unilateral lymphadenopathy and no suspicious breast findings. Second, recommendation n. 7 has been revised to simplify follow-up strategies: in patients with no breast cancer history and no imaging findings suspicious for cancer, symptomatic and asymptomatic imaging-detected unilateral lymphadenopathy on the same side as recent COVID-19 vaccination (within 12 weeks) should be classified as a benign finding (BI-RADS 2) and no further work-up should be pursued. All other recommendations issued by EUSOBI in 2021 remain valid.

20.

An MRI Deep Learning Model Predicts Outcome in Rectal Cancer.

Jiang, Xiaofeng; Zhao, Hengyu; Saldanha, Oliver Lester; Nebelung, Sven; Kuhl, Christiane; Amygdalos, Iakovos; Lang, Sven Arke; Wu, Xiaojian; Meng, Xiaochun; Truhn, Daniel; Kather, Jakob Nikolas; Ke, Jia.

Radiology ; 307(5): e222223, 2023 06.

Article in English | MEDLINE | ID: mdl-37278629

ABSTRACT

Background: Deep learning (DL) models can potentially improve prognostication of rectal cancer but have not been systematically assessed. Purpose: To develop and validate an MRI DL model for predicting survival in patients with rectal cancer based on segmented tumor volumes from pretreatment T2-weighted MRI scans. Materials and Methods: DL models were trained and validated on retrospectively collected MRI scans of patients with rectal cancer diagnosed between August 2003 and April 2021 at two centers. Patients were excluded from the study if there were concurrent malignant neoplasms, prior anticancer treatment, incomplete course of neoadjuvant therapy, or no radical surgery performed. The Harrell C-index was used to determine the best model, which was applied to internal and external test sets. Patients were stratified into high- and low-risk groups based on a fixed cutoff calculated in the training set. A multimodal model was also assessed, which used DL model-computed risk score and pretreatment carcinoembryonic antigen level as input. Results: The training set included 507 patients (median age, 56 years [IQR, 46-64 years]; 355 men). In the validation set (n = 218; median age, 55 years [IQR, 47-63 years]; 144 men), the best algorithm reached a C-index of 0.82 for overall survival. The best model reached hazard ratios of 3.0 (95% CI: 1.0, 9.0) in the high-risk group in the internal test set (n = 112; median age, 60 years [IQR, 52-70 years]; 76 men) and 2.3 (95% CI: 1.0, 5.4) in the external test set (n = 58; median age, 57 years [IQR, 50-67 years]; 38 men). The multimodal model further improved the performance, with a C-index of 0.86 and 0.67 for the validation and external test set, respectively. Conclusion: A DL model based on preoperative MRI was able to predict survival of patients with rectal cancer. The model could be used as a preoperative risk stratification tool. Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Langs in this issue.
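The Harrell C-index mentioned in this abstract measures how often a model assigns the higher risk score to the patient who experiences the event earlier. The following is a simplified sketch under stated assumptions, not the study's implementation: the function name `harrell_c_index` and the toy data are hypothetical, tied risk scores count as half-concordant, and pairs with tied survival times are simply skipped.

```python
def harrell_c_index(times, events, risks):
    """Simplified Harrell concordance index.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g., death) was observed, 0 if censored
    risks  -- model risk score per patient (higher = worse prognosis)

    Hypothetical helper for illustration; not taken from the cited study.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # assumed handling of tied scores
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for four patients (made up).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
print(harrell_c_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the values of 0.82, 0.86, and 0.67 reported above.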

Subjects

Deep Learning , Rectal Neoplasms , Male , Humans , Middle Aged , Retrospective Studies , Rectal Neoplasms/diagnostic imaging , Rectal Neoplasms/therapy , Magnetic Resonance Imaging , Risk Factors
