1.
Acta Radiol ; 65(7): 800-807, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38798137

ABSTRACT

BACKGROUND: The accurate differentiation of primary central nervous system lymphoma (PCNSL) from glioblastoma multiforme (GBM) is clinically crucial because the two entities require different treatment strategies. PURPOSE: To define magnetic resonance imaging (MRI) perfusion findings in PCNSL that allow a reliable distinction from GBM using dynamic contrast-enhanced (DCE) T1 and dynamic susceptibility contrast (DSC) T2 MRI perfusion. MATERIAL AND METHODS: This retrospective analysis included 19 patients with histopathologically diagnosed PCNSL and 21 individuals with GBM. DCE T1 vascular permeability values, including K-trans, Ve, Kep, and IAUGC, and DSC T2 perfusion values, including cerebral blood volume (CBV) and cerebral blood flow (CBF), were measured quantitatively on axial sections from the pathological lesion and the contralateral normal brain parenchyma using region-of-interest analysis. RESULTS: There was no statistically significant difference between patients with PCNSL (T/B cell) and GBM in the median values of the DCE T1 perfusion ratios (P > 0.05). The DSC T2 perfusion ratios, however, clearly separated the two groups: patients with GBM had higher median r-CBV and r-CBF (2.898 and 2.467, respectively) than patients with PCNSL (1.185 and 1.224, respectively; P < 0.01). Cutoff values of ≤1.473 for r-CBV (lesion/normal) and ≤1.6005 for r-CBF (lesion/normal) were found to predict PCNSL. CONCLUSION: DSC T2 MRI perfusion showed lower r-CBV and r-CBF values in PCNSL patients than in GBM patients. According to these findings, r-CBV and r-CBF are the most accurate MRI perfusion parameters for distinguishing PCNSL from GBM.
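
As a rough illustration of the analysis described above, the sketch below computes lesion-to-contralateral perfusion ratios and applies the reported cutoffs to flag likely PCNSL. The array layout, group sizes, and example values are placeholders invented for the example, not the study's data.

```python
# Hypothetical sketch: lesion/contralateral perfusion ratios and cutoff-based
# classification. Values below are illustrative only, not the study's data.
import numpy as np
from scipy.stats import mannwhitneyu

# Each row: (lesion CBV, contralateral CBV, lesion CBF, contralateral CBF)
pcnsl = np.array([[2.4, 2.0, 3.1, 2.5], [2.2, 1.9, 2.9, 2.4]])
gbm   = np.array([[5.8, 2.0, 6.3, 2.6], [5.1, 1.8, 5.9, 2.3]])

def ratios(group):
    """Return r-CBV and r-CBF (lesion / contralateral normal parenchyma)."""
    return group[:, 0] / group[:, 1], group[:, 2] / group[:, 3]

pcnsl_cbv, pcnsl_cbf = ratios(pcnsl)
gbm_cbv, gbm_cbf = ratios(gbm)

# Nonparametric group comparison, mirroring the kind of test reported.
print(mannwhitneyu(pcnsl_cbv, gbm_cbv))

# Reported cutoffs: r-CBV <= 1.473 and r-CBF <= 1.6005 suggest PCNSL.
def suggests_pcnsl(r_cbv, r_cbf, cbv_cut=1.473, cbf_cut=1.6005):
    return (r_cbv <= cbv_cut) & (r_cbf <= cbf_cut)

print(suggests_pcnsl(pcnsl_cbv, pcnsl_cbf))  # expected: mostly True
print(suggests_pcnsl(gbm_cbv, gbm_cbf))      # expected: mostly False
```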


Subjects
Brain Neoplasms, Contrast Media, Glioblastoma, Lymphoma, Magnetic Resonance Imaging, Humans, Male, Female, Glioblastoma/diagnostic imaging, Glioblastoma/blood supply, Middle Aged, Brain Neoplasms/diagnostic imaging, Brain Neoplasms/blood supply, Brain Neoplasms/pathology, Retrospective Studies, Lymphoma/diagnostic imaging, Adult, Aged, Magnetic Resonance Imaging/methods, Diagnosis, Differential, Brain/diagnostic imaging, Brain/blood supply
3.
Cureus ; 16(5): e60009, 2024 May.
Article in English | MEDLINE | ID: mdl-38854352

ABSTRACT

Background Recent studies have highlighted the diagnostic performance of ChatGPT 3.5 and GPT-4 in a text-based format, demonstrating their radiological knowledge across different areas. Our objective is to investigate the impact of prompt engineering on the diagnostic performance of ChatGPT 3.5 and GPT-4 in diagnosing thoracic radiology cases, highlighting how the complexity of prompts influences model performance. Methodology We conducted a retrospective cross-sectional study using 124 publicly available Case of the Month examples from the Thoracic Society of Radiology website. We initially input the cases into the ChatGPT versions without prompting. Then, we employed five different prompts, ranging from basic task-oriented to complex role-specific formulations to measure the diagnostic accuracy of ChatGPT versions. The differential diagnosis lists generated by the models were compared against the radiological diagnoses listed on the Thoracic Society of Radiology website, with a scoring system in place to comprehensively assess the accuracy. Diagnostic accuracy and differential diagnosis scores were analyzed using the McNemar, Chi-square, Kruskal-Wallis, and Mann-Whitney U tests. Results Without any prompts, ChatGPT 3.5's accuracy was 25% (31/124), which increased to 56.5% (70/124) with the most complex prompt (P < 0.001). GPT-4 showed a high baseline accuracy at 53.2% (66/124) without prompting. This accuracy increased to 59.7% (74/124) with complex prompts (P = 0.09). Notably, there was no statistical difference in peak performance between ChatGPT 3.5 (70/124) and GPT-4 (74/124) (P = 0.55). Conclusions This study emphasizes the critical influence of prompt engineering on enhancing the diagnostic performance of ChatGPT versions, especially ChatGPT 3.5.
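
The paired comparison behind the prompting result can be sketched with McNemar's test on the same 124 cases answered with and without a prompt. The marginal totals below match the reported 31/124 and 70/124 correct answers, but the split into concordant and discordant cases is an assumption made for illustration, not the study's case-level data.

```python
# Minimal sketch of a paired accuracy comparison with McNemar's exact test.
# Cell counts are assumptions chosen only to reproduce the reported marginals
# (31/124 correct unprompted, 70/124 correct with the most complex prompt).
from statsmodels.stats.contingency_tables import mcnemar

#              prompted correct | prompted wrong
table = [[30,             1],   # unprompted correct
         [40,            53]]   # unprompted wrong

result = mcnemar(table, exact=True)
print(f"McNemar p-value: {result.pvalue:.4f}")
```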

4.
J Thorac Imaging ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39269227

ABSTRACT

PURPOSE: To investigate and compare the diagnostic performance of 10 different large language models (LLMs) and 2 board-certified general radiologists in thoracic radiology cases published by The Society of Thoracic Radiology. MATERIALS AND METHODS: We collected 124 publicly available "Case of the Month" cases from the Society of Thoracic Radiology website published between March 2012 and December 2023. Medical history and imaging findings were input into the LLMs for diagnosis and differential diagnosis, while the radiologists independently provided their assessments based on visual review of the images. Cases were categorized anatomically (parenchyma, airways, mediastinum-pleura-chest wall, and vascular) and further classified as specific or nonspecific for radiologic diagnosis. Diagnostic accuracy and differential diagnosis scores (DDxScore) were analyzed using the χ2, Kruskal-Wallis, Wilcoxon, McNemar, and Mann-Whitney U tests. RESULTS: Among the 124 cases, Claude 3 Opus showed the highest diagnostic accuracy (70.29%), followed by ChatGPT 4/Google Gemini 1.5 Pro (59.75%), Meta Llama 3 70b (57.3%), and ChatGPT 3.5 (53.2%), outperforming the radiologists (52.4% and 41.1%) and the other LLMs (P<0.05). The Claude 3 Opus DDxScore was significantly better than that of the other LLMs and the radiologists, except ChatGPT 3.5 (P<0.05). All LLMs and radiologists showed greater accuracy in specific cases (P<0.05), with no DDxScore difference for Perplexity and Google Bard based on specificity (P>0.05). There were no significant differences between LLMs and radiologists in the diagnostic accuracy of anatomic subgroups (P>0.05), except for Meta Llama 3 70b in the vascular cases (P=0.040). CONCLUSIONS: Claude 3 Opus outperformed the other LLMs and the radiologists in text-based thoracic radiology cases. LLMs hold great promise for clinical decision support systems under proper medical supervision.
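
A minimal sketch of the DDxScore comparison, assuming per-case ordinal scores for each reader: Kruskal-Wallis across readers followed by pairwise Mann-Whitney U follow-up tests. The score lists are invented placeholders, not the study's data.

```python
# Hedged sketch of a DDxScore analysis; scores are placeholders (e.g., 0-3 per
# case), not the study's ratings.
from scipy.stats import kruskal, mannwhitneyu

ddx_scores = {
    "Claude 3 Opus": [3, 3, 2, 3, 1, 3, 2, 3],
    "ChatGPT 4":     [3, 2, 2, 1, 1, 3, 2, 2],
    "Radiologist 1": [2, 1, 2, 1, 0, 3, 1, 2],
}

h_stat, p_value = kruskal(*ddx_scores.values())
print(f"Kruskal-Wallis H={h_stat:.2f}, p={p_value:.3f}")

# Pairwise follow-up: top-scoring reader vs. each comparator.
top = ddx_scores["Claude 3 Opus"]
for name, scores in ddx_scores.items():
    if name == "Claude 3 Opus":
        continue
    u, p = mannwhitneyu(top, scores)
    print(f"Claude 3 Opus vs {name}: U={u:.1f}, p={p:.3f}")
```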

5.
JCO Glob Oncol ; 10: e2400200, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39208360

ABSTRACT

This study evaluates the integration of large language models (LLMs) in interpreting Lung-RADS for lung cancer screening, highlighting their innovative role in enhancing radiological practice. Our findings reveal that Claude 3 Opus and Perplexity achieved a 96% accuracy rate, outperforming the other models.
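
For illustration only, a toy scoring loop of the kind such an evaluation implies: model-assigned Lung-RADS categories checked against a reference key, with per-model accuracy reported. The category labels and answer lists are placeholders, not the study's cases.

```python
# Hypothetical sketch: per-model accuracy on Lung-RADS category assignment.
reference = ["1", "2", "3", "4A", "4B", "4X", "2", "3"]  # placeholder key

model_answers = {
    "Claude 3 Opus": ["1", "2", "3", "4A", "4B", "4X", "2", "4A"],
    "Perplexity":    ["1", "2", "3", "4A", "4B", "4X", "2", "3"],
}

for model, answers in model_answers.items():
    correct = sum(a == r for a, r in zip(answers, reference))
    print(f"{model}: {correct}/{len(reference)} = {correct / len(reference):.0%}")
```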


Subjects
Lung Neoplasms, Humans, Lung Neoplasms/diagnostic imaging, Early Detection of Cancer/methods
6.
Clin Imaging ; 114: 110271, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39236553

ABSTRACT

The advent of large language models (LLMs) marks a transformative leap in natural language processing, offering unprecedented potential in radiology, particularly in enhancing the accuracy and efficiency of coronary artery disease (CAD) diagnosis. While previous studies have explored the capabilities of specific LLMs like ChatGPT in cardiac imaging, a comprehensive evaluation comparing multiple LLMs in the context of CAD-RADS 2.0 has been lacking. This study addresses this gap by assessing the performance of various LLMs, including ChatGPT 4, ChatGPT 4o, Claude 3 Opus, Gemini 1.5 Pro, Mistral Large, Meta Llama 3 70B, and Perplexity Pro, in answering 30 multiple-choice questions derived from the CAD-RADS 2.0 guidelines. Our findings reveal that ChatGPT 4o achieved the highest accuracy at 100%, with ChatGPT 4 and Claude 3 Opus closely following at 96.6%. Other models, including Mistral Large, Perplexity Pro, Meta Llama 3 70B, and Gemini 1.5 Pro, also demonstrated commendable performance, though with slightly lower accuracy ranging from 90% to 93.3%. This study underscores the proficiency of current LLMs in understanding and applying CAD-RADS 2.0, suggesting their potential to significantly enhance radiological reporting and patient care in coronary artery disease. The variations in model performance highlight the need for further research, particularly in evaluating the visual diagnostic capabilities of LLMs, a critical component of radiology practice. This study provides a foundational comparison of LLMs in CAD-RADS 2.0 and sets the stage for future investigations into their broader applications in radiology, emphasizing the importance of integrating both text-based and visual knowledge for optimal clinical outcomes.
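
A hedged sketch of how the reported accuracies could be tabulated from 30 CAD-RADS 2.0 multiple-choice answers per model, with a Wilson confidence interval added to convey the uncertainty of such a small question set. The 30/30 and 29/30 counts follow from the abstract; the split of 27 versus 28 correct answers among the remaining models is an assumption (the abstract gives only a 90% to 93.3% range), and the confidence intervals are an addition, not part of the study.

```python
# Sketch only: tabulating MCQ accuracy per model with Wilson 95% CIs.
from statsmodels.stats.proportion import proportion_confint

n_questions = 30
correct_counts = {            # 27/28 split below is an assumption
    "ChatGPT 4o": 30, "ChatGPT 4": 29, "Claude 3 Opus": 29,
    "Mistral Large": 28, "Perplexity Pro": 28,
    "Meta Llama 3 70B": 27, "Gemini 1.5 Pro": 27,
}

for model, k in correct_counts.items():
    lo, hi = proportion_confint(k, n_questions, method="wilson")
    print(f"{model}: {k}/{n_questions} = {k/n_questions:.1%} "
          f"(95% CI {lo:.1%}-{hi:.1%})")
```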


Subjects
Computed Tomography Angiography, Coronary Angiography, Coronary Artery Disease, Natural Language Processing, Humans, Computed Tomography Angiography/methods, Coronary Artery Disease/diagnostic imaging, Coronary Angiography/methods, Reproducibility of Results
7.
Diagn Interv Radiol ; 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39248152

ABSTRACT

PURPOSE: This study aimed to evaluate the performance of large language models (LLMs) and multimodal LLMs in interpreting the Breast Imaging Reporting and Data System (BI-RADS) categories and providing clinical management recommendations for breast radiology in text-based and visual questions. METHODS: This cross-sectional observational study involved two steps. In the first step, we compared ten LLMs (namely ChatGPT 4o, ChatGPT 4, ChatGPT 3.5, Google Gemini 1.5 Pro, Google Gemini 1.0, Microsoft Copilot, Perplexity, Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Opus 200K), general radiologists, and a breast radiologist using 100 text-based multiple-choice questions (MCQs) related to the BI-RADS Atlas 5th edition. In the second step, we assessed the performance of five multimodal LLMs (ChatGPT 4o, ChatGPT 4V, Claude 3.5 Sonnet, Claude 3 Opus, and Google Gemini 1.5 Pro) in assigning BI-RADS categories and providing clinical management recommendations on 100 breast ultrasound images. Correct answers and accuracy by question type were compared using McNemar's and chi-squared tests. Management scores were analyzed using the Kruskal-Wallis and Wilcoxon tests. RESULTS: Claude 3.5 Sonnet achieved the highest accuracy in text-based MCQs (90%), followed by ChatGPT 4o (89%), outperforming all other LLMs and the general radiologists (78% and 76%) (P < 0.05), except for the Claude 3 Opus models and the breast radiologist (82%) (P > 0.05). Lower-performing LLMs included Google Gemini 1.0 (61%) and ChatGPT 3.5 (60%). Performance across different question categories showed no significant variation among LLMs or radiologists (P > 0.05). For breast ultrasound images, Claude 3.5 Sonnet achieved 59% accuracy, significantly higher than the other multimodal LLMs (P < 0.05). Management recommendations were evaluated using a 3-point Likert scale, with Claude 3.5 Sonnet scoring the highest (mean: 2.12 ± 0.97) (P < 0.05). Accuracy varied significantly across BI-RADS categories for all models except Claude 3 Opus (P < 0.05). Gemini 1.5 Pro failed to answer any BI-RADS 5 questions correctly. Similarly, ChatGPT 4V failed to answer any BI-RADS 1 questions correctly, making them the least accurate in these categories (P < 0.05). CONCLUSION: Although LLMs such as Claude 3.5 Sonnet and ChatGPT 4o show promise in text-based BI-RADS assessments, their limitations in visual diagnostics suggest they should be used cautiously and under radiologists' supervision to avoid misdiagnoses. CLINICAL SIGNIFICANCE: This study demonstrates that while LLMs exhibit strong capabilities in text-based BI-RADS assessments, their visual diagnostic abilities are currently limited, necessitating further development and cautious application in clinical practice.
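
A minimal sketch of the management-score analysis described above, assuming 3-point Likert ratings per image: a Kruskal-Wallis test across models plus a paired Wilcoxon signed-rank follow-up, since every model rates the same images. The ratings below are invented placeholders, not the study's data.

```python
# Hedged sketch: comparing per-image Likert management scores across models.
from scipy.stats import kruskal, wilcoxon

management_scores = {           # placeholder ratings on a 1-3 scale
    "Claude 3.5 Sonnet": [3, 2, 3, 1, 2, 3, 2, 3],
    "ChatGPT 4o":        [2, 2, 3, 1, 1, 2, 2, 2],
    "Gemini 1.5 Pro":    [1, 2, 2, 1, 1, 2, 1, 2],
}

# Overall comparison across the models.
print(kruskal(*management_scores.values()))

# Paired follow-up: top model vs. each other model on the same images.
top = management_scores["Claude 3.5 Sonnet"]
for name, scores in management_scores.items():
    if name == "Claude 3.5 Sonnet":
        continue
    print(name, wilcoxon(top, scores))
```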

8.
Cureus ; 15(8): e43324, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37700980

ABSTRACT

Introduction The purpose of this study was to determine the utility of current magnetic resonance imaging (MRI) in the diagnosis of bucket-handle meniscal tears. Materials and methods Patients treated for arthroscopic meniscal tears between March 2019 and March 2022 were reviewed. The study included all patients with bucket-handle tears diagnosed arthroscopically who also had MRI scans (n=51). A control group of 58 individuals with similar demographic characteristics and meniscal tears other than bucket-handle tears was also formed. The assessment of bucket-handle and non-bucket-handle tears was performed blindly by a musculoskeletal (MSK) radiologist with 20 years of experience and a trainee radiologist, who reached consensus on group allocation. The MRIs were examined for various findings, including the presence of a bucket-handle tear, tear location, presence of anterior cruciate ligament (ACL) rupture, intercondylar notch sign, double anterior horn sign, flipped meniscus sign, double posterior cruciate ligament (PCL) sign, absent bow tie sign, and disproportionate posterior horn sign. These well-known signs, detailed in the literature, were evaluated. Additionally, less studied and less commonly known signs, such as the V sign and the double anterior cruciate ligament sign, were assessed. The V sign resembles the letter V and results from the displaced bucket-handle fragment and the angle of the intact meniscus on axial images. The double anterior cruciate ligament sign is the appearance formed by compression of the displaced meniscal fragment behind the anterior cruciate ligament in bucket-handle tears. Results On retrospective evaluation of the MRI scans, 44 of the 51 tears diagnosed as bucket-handle tears by arthroscopy were correctly identified (sensitivity: 86.27%). MRI correctly excluded a bucket-handle tear in 52 of the 58 patients in whom arthroscopy did not detect one (specificity: 89.66%). The most prevalent MRI signs in patients with arthroscopically confirmed bucket-handle tears were the intercondylar notch sign (84.31%), V sign (72.55%), double PCL sign (56.86%), double anterior horn sign (49.02%), absent bow tie sign (43.14%), flipped meniscus sign (19.61%), disproportionate posterior horn sign (9.80%), and double ACL sign (5.88%). The intercondylar notch sign, V sign, and double PCL sign exhibited the highest sensitivity, while the flipped meniscus, disproportionate posterior horn, and double ACL signs demonstrated the highest specificity. Conclusion MRI demonstrates a high level of sensitivity and specificity in identifying bucket-handle meniscal tears, particularly when the eight MRI signs investigated in this study are considered.
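
The sensitivity and specificity quoted above follow directly from the counts in the abstract, with arthroscopy as the reference standard; a small sketch of that calculation:

```python
# Sensitivity and specificity of MRI against arthroscopy, using the
# confusion-matrix counts given in the abstract.
def sens_spec(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# 44 of 51 arthroscopic bucket-handle tears called positive on MRI;
# 52 of 58 non-bucket-handle controls called negative on MRI.
sens, spec = sens_spec(tp=44, fn=51 - 44, tn=52, fp=58 - 52)
print(f"Sensitivity: {sens:.2%}")   # ~86.27%
print(f"Specificity: {spec:.2%}")   # ~89.66%
```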
