ABSTRACT
Background: Differential diagnosis in radiology relies on the accurate identification of imaging patterns. The use of large language models (LLMs) in radiology holds promise, with many potential applications that may enhance the efficiency of radiologists' workflow. The study aimed to evaluate the efficacy of generative pre-trained transformer (GPT)-4, an LLM, in providing differential diagnoses in neuroradiology, comparing its performance with that of board-certified neuroradiologists. Methods: Sixty neuroradiology reports with variable diagnoses were inserted into GPT-4, which was tasked with generating a top-3 differential diagnosis for each case. The results were compared to the true diagnoses and to the differential diagnoses provided by three blinded neuroradiologists. Diagnostic accuracy and agreement between readers were assessed. Results: Of the 60 patients (mean age 47.8 years, 65% female), GPT-4 correctly included the diagnoses in its differentials in 61.7% (37/60) of cases, while the neuroradiologists' accuracy ranged from 63.3% (38/60) to 73.3% (44/60). Agreement between GPT-4 and the neuroradiologists, and among the neuroradiologists, was fair to moderate [Cohen's kappa (kw) 0.34-0.44 and kw 0.39-0.54, respectively]. Conclusions: GPT-4 shows potential as a support tool for differential diagnosis in neuroradiology, though it was outperformed by human experts. Radiologists should remain mindful of the limitations of LLMs, while harnessing their potential to enhance educational and clinical work.
ABSTRACT
OBJECTIVES: This study aims to assess the performance of a multimodal artificial intelligence (AI) model capable of analyzing both images and textual data (GPT-4V), in interpreting radiological images. It focuses on a range of modalities, anatomical regions, and pathologies to explore the potential of zero-shot generative AI in enhancing diagnostic processes in radiology. METHODS: We analyzed 230 anonymized emergency room diagnostic images, consecutively collected over 1 week, using GPT-4V. Modalities included ultrasound (US), computerized tomography (CT), and X-ray images. The interpretations provided by GPT-4V were then compared with those of senior radiologists. This comparison aimed to evaluate the accuracy of GPT-4V in recognizing the imaging modality, anatomical region, and pathology present in the images. RESULTS: GPT-4V identified the imaging modality correctly in 100% of cases (221/221), the anatomical region in 87.1% (189/217), and the pathology in 35.2% (76/216). However, the model's performance varied significantly across different modalities, with anatomical region identification accuracy ranging from 60.9% (39/64) in US images to 97% (98/101) and 100% (52/52) in CT and X-ray images (p < 0.001). Similarly, pathology identification ranged from 9.1% (6/66) in US images to 36.4% (36/99) in CT and 66.7% (34/51) in X-ray images (p < 0.001). These variations indicate inconsistencies in GPT-4V's ability to interpret radiological images accurately. CONCLUSION: While the integration of AI in radiology, exemplified by multimodal GPT-4, offers promising avenues for diagnostic enhancement, the current capabilities of GPT-4V are not yet reliable for interpreting radiological images. This study underscores the necessity for ongoing development to achieve dependable performance in radiology diagnostics. 
CLINICAL RELEVANCE STATEMENT: Although GPT-4V shows promise in radiological image interpretation, its high diagnostic hallucination rate (> 40%) indicates it cannot be trusted for clinical use as a standalone tool. Improvements are necessary to enhance its reliability and ensure patient safety. KEY POINTS: GPT-4V's capability in analyzing images offers new clinical possibilities in radiology. GPT-4V excels in identifying imaging modalities but demonstrates inconsistent anatomy and pathology detection. Ongoing AI advancements are necessary to enhance diagnostic reliability in radiological applications.
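The per-modality pathology counts reported in this abstract can be checked with a short calculation. The sketch below recomputes the percentages and runs a Pearson chi-square test of independence on the resulting 3×2 table. The abstract does not say which test produced p < 0.001, so the choice of test and the helper name `chi_square_2df` are assumptions made purely for illustration.

```python
import math

# Pathology identification counts from the abstract: modality -> (correct, total).
pathology = {"US": (6, 66), "CT": (36, 99), "X-ray": (34, 51)}


def chi_square_2df(counts):
    """Pearson chi-square for a 3x2 table (correct vs incorrect by modality).

    Assumption: the abstract does not name its test; this is one common choice.
    With 2 degrees of freedom the p-value has the closed form exp(-chi2 / 2).
    """
    if len(counts) != 3:
        raise ValueError("this helper only handles three groups (2 degrees of freedom)")
    total_correct = sum(correct for correct, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    rate = total_correct / total_n  # pooled proportion of correct answers
    chi2 = 0.0
    for correct, n in counts.values():
        expected_correct = n * rate
        expected_wrong = n * (1 - rate)
        chi2 += (correct - expected_correct) ** 2 / expected_correct
        chi2 += ((n - correct) - expected_wrong) ** 2 / expected_wrong
    return chi2, math.exp(-chi2 / 2)


for modality, (correct, n) in pathology.items():
    print(f"{modality}: {correct}/{n} = {100 * correct / n:.1f}%")

chi2, p_value = chi_square_2df(pathology)
print(f"chi2 = {chi2:.1f}, p = {p_value:.2g}")
```

Under these assumptions the p-value falls well below 0.001, which is in line with the significance level quoted in the abstract.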
ABSTRACT
BACKGROUND: Obesity is associated with metabolic syndrome and fat accumulation in various organs such as the liver and the kidneys. Our goal was to assess, using magnetic resonance imaging (MRI) Dual-Echo phase sequencing, the association between liver and kidney fat deposition and their relation to obesity. METHODS: We analyzed MRI scans of individuals who were referred to the Chaim Sheba Medical Center between December 2017 and May 2020 for an MRI study for any indication. For each individual, we retrieved from the computerized charts data on sex, age, weight, height, body mass index (BMI), systolic and diastolic blood pressure (BP), and comorbidities (diabetes mellitus, hypertension, dyslipidemia). RESULTS: We screened MRI studies of 399 subjects with a median age of 51 years, 52.4% of whom were women, and a median BMI of 24.6 kg/m². We diagnosed 18% of the participants with fatty liver and 18.6% with fat accumulation in the kidneys (fatty kidneys). Out of the 67 patients with fatty livers, 23 (34.3%) also had fatty kidneys, whereas among the 315 patients without fatty livers, only 48 patients (15.2%) had fatty kidneys (p < 0.01). In comparison to the patients who did not have a fatty liver or fatty kidneys (n = 267), those who had both (n = 23) were more obese, had higher systolic BP, and were more likely to have diabetes mellitus. In comparison to the patients without a fatty liver, those with fatty livers had an adjusted odds ratio of 2.91 (97.5% CI; 1.61-5.25) to have fatty kidneys. In total, 19.6% of the individuals were obese (BMI ≥ 30), and 26.1% were overweight (25 < BMI < 30). The obese and overweight individuals were older and more likely to have diabetes mellitus and hypertension and had higher rates of fatty livers and fatty kidneys. Fat deposition in both the liver and the kidneys was observed in 15.9% of the obese patients, in 8.3% of the overweight patients, and in none of those with normal weight. 
Obesity was the only risk factor for fatty kidneys and fatty livers, with an adjusted OR of 6.3 (97.5% CI 2.1-18.6). CONCLUSIONS: Obesity is a major risk factor for developing a fatty liver and fatty kidneys. Individuals with a fatty liver are more likely to have fatty kidneys. MRI is an accurate modality for diagnosing fatty kidneys. Reviewing MRI scans of any indication should include assessment of fat fractions in the kidneys in addition to that of the liver.
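The 2×2 counts given in this abstract (23 of 67 patients with a fatty liver and 48 of 315 without one had fatty kidneys) are enough to compute a crude, unadjusted odds ratio. The sketch below does that with a Wald (Woolf) confidence interval. It is not the authors' adjusted model, whose covariates the abstract does not describe; the function name `odds_ratio_wald` and the 95% level are assumptions for illustration.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Crude odds ratio with a Wald (Woolf) confidence interval.

    a, b: exposed group with / without the outcome.
    c, d: unexposed group with / without the outcome.
    z: normal quantile; the default gives a two-sided 95% interval (assumption).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Counts from the abstract: fatty liver 23/67, no fatty liver 48/315.
crude_or, ci_low, ci_high = odds_ratio_wald(23, 67 - 23, 48, 315 - 48)
print(f"crude OR = {crude_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With these counts the crude estimate lands close to the adjusted odds ratio reported in the abstract, which suggests the adjustment changed the estimate very little; that reading is an inference from the arithmetic, not a statement made by the authors.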
Subjects
Fatty Liver , Kidney , Magnetic Resonance Imaging , Obesity , Humans , Female , Male , Middle Aged , Obesity/complications , Kidney/diagnostic imaging , Kidney/physiopathology , Adult , Fatty Liver/diagnostic imaging , Fatty Liver/epidemiology , Body Mass Index , Liver/diagnostic imaging , Liver/pathology , Kidney Diseases/diagnostic imaging , Kidney Diseases/epidemiology , Aged , Risk Factors
ABSTRACT
Crohn's disease (CD) poses significant morbidity, underscoring the need for effective, non-invasive inflammatory assessment using magnetic resonance enterography (MRE). This literature review evaluates recent publications on the role of deep learning in improving MRE for CD assessment. We searched MEDLINE/PUBMED for studies that reported the use of deep learning algorithms for assessment of CD activity. The study was conducted according to the PRISMA guidelines. The risk of bias was evaluated using the QUADAS-2 tool. Five eligible studies, encompassing 468 subjects, were identified. Our study suggests that diverse deep learning applications, including image quality enhancement, bowel segmentation for disease burden quantification, and 3D reconstruction for surgical planning are useful and promising for CD assessment. However, most of the studies are preliminary, retrospective studies, and have a high risk of bias in at least one category. Future research is needed to assess how deep learning can impact CD patient diagnostics, particularly when considering the increasing integration of such models into hospital systems.
Subjects
Crohn Disease , Deep Learning , Magnetic Resonance Imaging , Humans , Crohn Disease/diagnostic imaging , Magnetic Resonance Imaging/methods , Image Interpretation, Computer-Assisted/methods
ABSTRACT
Thymic imaging is challenging because the imaging appearances of a variety of benign and malignant thymic conditions are similar. CT is the most commonly used modality for mediastinal imaging, while MRI and fluorine 18 fluorodeoxyglucose (FDG) PET/CT are helpful when they are tailored to the correct indication. Each of these imaging modalities has limitations and technical pitfalls that may lead to an incorrect diagnosis and mismanagement. CT may not be sufficient for the characterization of cystic thymic processes and differentiation between thymic hyperplasia and thymic tumors. MRI can be used to overcome these limitations but is subject to other potential pitfalls such as an equivocal decrease in signal intensity at chemical shift imaging, size limitations, unusual signal intensity for cysts, subtraction artifacts, pseudonodularity on T2-weighted MR images, early imaging misinterpretation, flow and spatial resolution issues hampering assessment of local invasion, and the overlap of apparent diffusion coefficients between malignant and benign thymic entities. FDG PET/CT is not routinely indicated due to some overlap in FDG uptake between thymomas and benign thymic processes. However, it is useful for staging and follow-up of aggressive tumors (eg, thymic carcinoma), particularly for detection of occult metastatic disease. Pitfalls in imaging after treatment of thymic malignancies relate to technical challenges such as postthymectomy sternotomy streak metal artifacts, differentiation of postsurgical thymic bed changes from tumor recurrence, or human error with typical "blind spots" for identification of metastatic disease. Understanding these pitfalls enables appropriate selection of imaging modalities, improves diagnostic accuracy, and guides patient treatment. ©RSNA, 2024 Test Your Knowledge questions for this article are available in the supplemental material.
Subjects
Thymoma , Thymus Neoplasms , Humans , Fluorodeoxyglucose F18 , Positron Emission Tomography Computed Tomography , Neoplasm Recurrence, Local , Thymus Neoplasms/diagnostic imaging , Thymus Neoplasms/pathology , Thymoma/diagnosis , Positron-Emission Tomography , Magnetic Resonance Imaging , Radiopharmaceuticals
ABSTRACT
BACKGROUND: Writing multiple-choice questions (MCQs) for medical exams is challenging. It requires extensive medical knowledge, time, and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. METHODS: The authors searched for studies published up to November 2023. Search terms focused on LLM-generated MCQs for medical examinations. Non-English studies, studies outside the year range, and studies not focusing on AI-generated multiple-choice questions were excluded. MEDLINE was used as the search database. Risk of bias was evaluated using a tailored QUADAS-2 tool. RESULTS: Overall, eight studies published between April 2023 and October 2023 were included. Six studies used ChatGPT-3.5, while two employed GPT-4. Five studies showed that LLMs can produce competent questions valid for medical exams. Three studies used LLMs to write medical questions but did not evaluate the validity of the questions. One study conducted a comparative analysis of different models. One other study compared LLM-generated questions with those written by humans. All studies presented faulty questions that were deemed inappropriate for medical exams. Some questions required additional modifications in order to qualify. CONCLUSIONS: LLMs can be used to write MCQs for medical examinations. However, their limitations cannot be ignored. Further study in this field is essential and more conclusive evidence is needed. Until then, LLMs may serve as a supplementary tool for writing medical examinations. Two studies were at high risk of bias. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Subjects
Educational Measurement , Humans , Educational Measurement/methods , Writing/standards , Language , Education, Medical
ABSTRACT
PURPOSE: Despite advanced technologies in breast cancer management, challenges remain in efficiently interpreting vast clinical data for patient-specific insights. We reviewed the literature on how large language models (LLMs) such as ChatGPT might offer solutions in this field. METHODS: We searched MEDLINE for relevant studies published before December 22, 2023. Keywords included: "large language models", "LLM", "GPT", "ChatGPT", "OpenAI", and "breast". The risk of bias was evaluated using the QUADAS-2 tool. RESULTS: Six studies evaluating either ChatGPT-3.5 or GPT-4 met our inclusion criteria. They explored clinical notes analysis, guideline-based question-answering, and patient management recommendations. Accuracy varied between studies, ranging from 50 to 98%. Higher accuracy was seen in structured tasks like information retrieval. Half of the studies used real patient data, adding practical clinical value. Challenges included inconsistent accuracy, dependency on the way questions are posed (prompt-dependency), and in some cases, missing critical clinical information. CONCLUSION: LLMs hold potential in breast cancer care, especially in textual information extraction and guideline-driven clinical question-answering. Yet, their inconsistent accuracy underscores the need for careful validation of these models, and the importance of ongoing supervision.
Subjects
Artificial Intelligence , Breast Neoplasms , Female , Humans , Breast Neoplasms/therapy
ABSTRACT
INTRODUCTION: Bidirectional Encoder Representations from Transformers (BERT), introduced in 2018, has revolutionized natural language processing. Its bidirectional understanding of word context has enabled innovative applications, notably in radiology. This study aimed to assess BERT's influence and applications within the radiologic domain. METHODS: Adhering to Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, we conducted a systematic review, searching PubMed for literature on BERT-based models and natural language processing in radiology from January 1, 2018, to February 12, 2023. The search encompassed keywords related to generative models, transformer architecture, and various imaging techniques. RESULTS: Of 597 results, 30 met our inclusion criteria. The remaining were unrelated to radiology or did not use BERT-based models. The included studies were retrospective, with 14 published in 2022. The primary focus was on classification and information extraction from radiology reports, with x-rays as the prevalent imaging modality. Specific investigations included automatic CT protocol assignment and deep learning applications in chest x-ray interpretation. CONCLUSION: This review underscores the primary application of BERT in radiology for report classification. It also reveals emerging BERT applications for protocol assignment and report generation. As BERT technology advances, we foresee further innovative applications. Its implementation in radiology holds potential for enhancing diagnostic precision, expediting report generation, and optimizing patient care.
Subjects
Natural Language Processing , Humans , Radiology , Radiology Information Systems
ABSTRACT
BACKGROUND: Advancements in artificial intelligence (AI) and natural language processing (NLP) have led to the development of language models such as ChatGPT. These models have the potential to transform healthcare and medical research. However, understanding their applications and limitations is essential. OBJECTIVES: To present a view of ChatGPT research and to critically assess ChatGPT's role in medical writing and clinical environments. METHODS: We performed a literature review via the PubMed search engine from 20 November 2022, to 23 April 2023. The search terms included ChatGPT, OpenAI, and large language models. We included studies that focused on ChatGPT, explored its use or implications in medicine, and were original research articles. The selected studies were analyzed considering study design, NLP tasks, main findings, and limitations. RESULTS: Our study included 27 articles that examined ChatGPT's performance in various tasks and medical fields. These studies covered knowledge assessment, writing, and analysis tasks. While ChatGPT was found to be useful in tasks such as generating research ideas, aiding clinical reasoning, and streamlining workflows, limitations were also identified. These limitations included inaccuracies, inconsistencies, fictitious information, and limited knowledge, highlighting the need for further improvements. CONCLUSIONS: The review underscores ChatGPT's potential in various medical applications. Yet, it also points to limitations that require careful human oversight and responsible use to improve patient care, education, and decision-making.
Subjects
Artificial Intelligence , Medicine , Humans , Educational Status , Language , Delivery of Health Care
ABSTRACT
PURPOSE: High volumes of chest radiographs (CXR) remain uninterpreted due to a severe shortage of radiologists. These CXRs may be informally reported by non-radiologist physicians, or not reviewed at all. Artificial intelligence (AI) software can aid lung nodule detection. Our aim was to assess evaluation and management by non-radiologists of uninterpreted CXRs with AI-detected nodules, compared to retrospective radiology reports. MATERIALS AND METHODS: AI-detected nodules on uninterpreted CXRs of adults, performed 30/6/2022-31/1/2023, were evaluated. Excluded were patients with known active malignancy and duplicate CXRs of the same patient. The electronic medical records (EMR) were reviewed, and the clinicians' notes on the CXR and AI-detected nodule were documented. Dedicated thoracic radiologists retrospectively interpreted all CXRs, and similarly to the clinicians, they had access to the AI findings, prior imaging and EMR. The radiologists' interpretation served as the ground truth, and determined if the AI-detected nodule was a true lung nodule and if further workup was required. RESULTS: A total of 683 patients met the inclusion criteria. The clinicians commented on 386 (56.5%) CXRs, identified true nodules on 113 CXRs (16.5%), incorrectly mentioned 31 (4.5%) false nodules as real nodules, and did not mention the AI-detected nodule on 242 (35%) CXRs, of which 68 (10%) patients were retrospectively referred for further workup by the radiologist. For 297 patients (43.5%) there were no comments regarding the CXR in the EMR. Of these, 77 nodules (11.3%) were retrospectively referred for further workup by the radiologist. CONCLUSION: AI software for lung nodule detection may be insufficient without a formal radiology report, and may lead to overdiagnosis or misdiagnosis of nodules.
Subjects
Artificial Intelligence , Lung Neoplasms , Adult , Humans , Retrospective Studies , Lung Neoplasms/diagnostic imaging , Radiography, Thoracic/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Radiologists , Intelligence
ABSTRACT
OBJECTIVES: Scaphoid fractures are usually diagnosed using X-rays, a low-sensitivity modality. Artificial intelligence (AI) using Convolutional Neural Networks (CNNs) has been explored for diagnosing scaphoid fractures in X-rays. The aim of this systematic review and meta-analysis is to evaluate the use of AI for detecting scaphoid fractures on X-rays and analyze its accuracy and usefulness. MATERIALS AND METHODS: This study followed the guidelines of Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) and PRISMA-Diagnostic Test Accuracy. A literature search was conducted in the PubMed database for original articles published until July 2023. The risk of bias and applicability were evaluated using the QUADAS-2 tool. A bivariate diagnostic random-effects meta-analysis was conducted, and the results were analyzed using the Summary Receiver Operating Characteristic (SROC) curve. RESULTS: Ten studies met the inclusion criteria and were all retrospective. The AI's diagnostic performance for detecting scaphoid fractures ranged from AUC 0.77 to 0.96. Seven studies were included in the meta-analysis, with a total of 3373 images. The meta-analysis pooled sensitivity and specificity were 0.80 and 0.89, respectively. The meta-analysis overall AUC was 0.88. The QUADAS-2 tool found high risk of bias and concerns about applicability in 9 out of 10 studies. CONCLUSIONS: The current results of AI's diagnostic performance for detecting scaphoid fractures in X-rays show promise. The results show high overall sensitivity and specificity and a high SROC result. Further research is needed to compare AI's diagnostic performance to human diagnostic performance in a clinical setting. CLINICAL RELEVANCE STATEMENT: Scaphoid fractures are prone to be missed secondary to assessment with a low sensitivity modality and a high occult fracture rate. AI systems can be beneficial for clinicians and radiologists to facilitate early diagnosis, and avoid missed injuries. 
KEY POINTS: • Scaphoid fractures are common and some can be easily missed in X-rays. • Artificial intelligence (AI) systems demonstrate high diagnostic performance for the diagnosis of scaphoid fractures in X-rays. • AI systems can be beneficial in diagnosing both obvious and occult scaphoid fractures.
Subjects
Artificial Intelligence , Fractures, Bone , Scaphoid Bone , Humans , Scaphoid Bone/injuries , Scaphoid Bone/diagnostic imaging , Fractures, Bone/diagnostic imaging , Sensitivity and Specificity , Radiography/methods
ABSTRACT
BACKGROUND: Computed tomography (CT) is the main diagnostic modality for detecting pancreatic adenocarcinoma. OBJECTIVES: To assess the frequency of missed pancreatic adenocarcinoma on CT scans according to different CT protocols. METHODS: The medical records of consecutive pancreatic adenocarcinoma patients were retrospectively collected (12/2011-12/2015). Patients with abdominal CT scans performed up to a year prior to cancer diagnosis were included. Two radiologists registered the presence of radiological signs of missed cancers. The frequency of missed cancers was compared between portal and pancreatic/triphasic CT protocols. RESULTS: Overall, 180 CT scans of pancreatic adenocarcinoma patients performed prior to cancer diagnosis were retrieved; 126/180 (70.0%) were conducted using pancreatic/triphasic protocols and 54/180 (30.0%) used portal protocols. The overall frequency of missed cancers was 6/180 (3.3%) in our study population. The frequency of missed cancers was higher with the portal CT protocols compared to the pancreatic/triphasic protocols: 5/54 (9.3%) vs. 1/126 (0.8%), P = 0.01. CT signs of missed cancers included small hypodense lesions, peri-pancreatic fat stranding, and dilated pancreatic duct with a cut-off sign. CONCLUSIONS: The frequency of missed pancreatic adenocarcinoma is higher on portal CT protocols. Physicians should consider the cancer miss rate on different CT protocols.
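The comparison of missed-cancer frequencies in this abstract (5/54 on portal protocols versus 1/126 on pancreatic/triphasic protocols) can be re-examined with an exact test on the 2×2 table. The abstract does not name the test behind P = 0.01, so using a two-sided Fisher exact test here is an assumption, and the function name `fisher_exact_two_sided` is illustrative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row contains x of the col1 events.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(x_min, x_max + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )


# Counts from the abstract: missed vs not missed, by CT protocol.
p_value = fisher_exact_two_sided(5, 54 - 5, 1, 126 - 1)
print(f"Fisher exact two-sided p = {p_value:.4f}")
```

Under this assumption the p-value rounds to the 0.01 quoted in the abstract, though with only six missed cancers in total the estimate rests on very few events.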
Subjects
Adenocarcinoma , Pancreatic Neoplasms , Humans , Pancreatic Neoplasms/diagnostic imaging , Adenocarcinoma/diagnostic imaging , Retrospective Studies , Tomography, X-Ray Computed/methods
ABSTRACT
PURPOSE: The growing application of deep learning in radiology has raised concerns about cybersecurity, particularly in relation to adversarial attacks. This study aims to systematically review the literature on adversarial attacks in radiology. METHODS: We searched for studies on adversarial attacks in radiology published up to April 2023, using MEDLINE and Google Scholar databases. RESULTS: A total of 22 studies published between March 2018 and April 2023 were included, primarily focused on image classification algorithms. Fourteen studies evaluated white-box attacks, three assessed black-box attacks and five investigated both. Eleven of the 22 studies targeted chest X-ray classification algorithms, while others involved chest CT (6/22), brain MRI (4/22), mammography (2/22), abdominal CT (1/22), hepatic US (1/22), and thyroid US (1/22). Some attacks proved highly effective, reducing the AUC of algorithm performance to 0 and achieving success rates up to 100%. CONCLUSIONS: Adversarial attacks are a growing concern. Although currently the threats are more theoretical than practical, they still represent a potential risk. It is important to be alert to such attacks, reinforce cybersecurity measures, and influence the formulation of ethical and legal guidelines. This will ensure the safe use of deep learning technology in medicine.
Subjects
Radiology , Humans , Radiography , Mammography , Tomography, X-Ray Computed , Algorithms
ABSTRACT
BACKGROUND: To assess the effect of a commercial artificial intelligence (AI) solution implementation in the emergency department on clinical outcomes in a single level 1 trauma center. METHODS: A retrospective cohort study for two time periods, pre-AI (1.1.2017-1.1.2018) and post-AI (1.1.2019-1.1.2020), in a level 1 trauma center was performed. The intracranial hemorrhage (ICH) algorithm was applied to 587 consecutive patients with a confirmed diagnosis of ICH on head CT upon admission to the emergency department. Study variables included demographics, patient outcomes, and imaging data. Participants admitted to the emergency department during the same time periods for other acute diagnoses (ischemic stroke (IS) and myocardial infarction (MI)) served as control groups. Primary outcomes were 30- and 120-day all-cause mortality. The secondary outcome was morbidity based on the Modified Rankin Scale for Neurologic Disability (mRS) at discharge. RESULTS: Five hundred eighty-seven participants (289 pre-AI: age 71 ± 1, 169 men; 298 post-AI: age 69 ± 1, 187 men) with ICH were eligible for the analyzed period. Demographics, comorbidities, Emergency Severity Score, type of ICH, and length of stay were not significantly different between the two time periods. The 30- and 120-day all-cause mortality were significantly reduced in the post-AI group when compared to the pre-AI group (27.7% vs 17.5%; p = 0.004 and 31.8% vs 21.7%; p = 0.017, respectively). The mRS at discharge was significantly reduced post-AI implementation (3.2 vs 2.8; p = 0.044). CONCLUSION: This study highlights the added value of introducing AI computer-aided triage and prioritization software in an emergent care setting, where it demonstrated a significant reduction in 30- and 120-day all-cause mortality and morbidity for patients diagnosed with ICH. 
Along with mortality rates, the AI software was associated with a significant reduction in the Modified Rankin Scale (mRS) score.
ABSTRACT
Rationale and objectives: Intraductal papillary mucinous neoplasm of the bile ducts (IPMN-B) is a true pre-cancerous lesion, which shares common features with pancreatic IPMN (IPMN-P). While IPMN-P is a well-described entity for which guidelines were formulated and revised, IPMN-B is a poorly described entity. We carried out a systematic review to evaluate the existing literature, emphasizing the role of MRI in IPMN-B depiction. Materials and methods: The PubMed database was used to identify original studies and case series that reported MR imaging features of IPMN-B. The search keywords were "IPMN OR intraductal papillary mucinous neoplasm OR IPNB OR intraductal papillary neoplasm of the bile duct AND Biliary OR biliary cancer OR hepatic cystic lesions". Risk of bias and applicability were evaluated using the QUADAS-2 tool. Results: A total of 884 records were identified through database searching. Twelve studies satisfied the inclusion criteria, resulting in MR features of 288 patients. All the studies were retrospective. Classic features of IPMN-B are under-described. Few studies note worrisome features that are concerning for an underlying malignancy. Half of the studies (50%) had a high risk of bias and concerns regarding applicability. Conclusions: The MRI features of IPMN-B are not well elaborated and need to be further studied. Worrisome features and guidelines regarding reporting the imaging findings should be established and published. Radiologists should be aware of IPMN-B, since malignancy diagnosis in an early stage will yield improved prognosis.
ABSTRACT
BACKGROUND: Perivascular cuffing as the sole imaging manifestation of pancreatic ductal adenocarcinoma (PDAC) is an under-recognized entity. OBJECTIVES: To present this rare finding and differentiate it from retroperitoneal fibrosis and vasculitis. METHODS: Patients with abdominal vasculature cuffing were retrospectively collected (January 2011 to September 2017). We evaluated vessels involved, wall thickness, length of involvement and extra-vascular manifestations. RESULTS: Fourteen patients with perivascular cuffing were retrieved: three with celiac and superior mesenteric artery (SMA) perivascular cuffing as the only manifestation of surgically proven PDAC, seven with abdominal vasculitis, and four with retroperitoneal fibrosis. PDAC patients exhibited perivascular cuffing of either or both celiac and SMA (3/3). Vasculitis patients showed aortitis with or without iliac or SMA cuffing (3/7) or cuffing of either or both celiac and SMA (4/7). Retroperitoneal fibrosis involved the aorta (4/4), common iliac (4/4), and renal arteries (2/4). Hydronephrosis was present in 3/4 of retroperitoneal fibrosis patients. PDAC and vasculitis demonstrated reduced wall thickness in comparison to retroperitoneal fibrosis (PDAC: 1.0 ± 0.2 cm, vasculitis: 1.2 ± 0.5 cm, retroperitoneal fibrosis: 2.4 ± 0.4 cm; P = 0.002). There was no significant difference in length of vascular involvement (PDAC: 6.3 ± 2.1 cm, vasculitis: 7.1 ± 2.6 cm, retroperitoneal fibrosis: 8.7 ± 0.5 cm). CONCLUSIONS: Celiac and SMA perivascular cuffing can be the sole finding in PDAC and may be indistinguishable from vasculitis. This entity may differ from retroperitoneal fibrosis as it spares the aorta, iliac, and renal arteries and demonstrates thinner walls and no hydronephrosis.
Subjects
Pancreatic Neoplasms , Retroperitoneal Fibrosis , Vasculitis , Humans , Retroperitoneal Fibrosis/pathology , Retrospective Studies , Aorta/pathology , Vasculitis/pathology , Pancreatic Neoplasms/diagnostic imaging
ABSTRACT
PURPOSE: The quality of radiology referrals influences patient management and imaging interpretation by radiologists. The aim of this study was to evaluate ChatGPT-4 as a decision support tool for selecting imaging examinations and generating radiology referrals in the emergency department (ED). METHODS: Five consecutive clinical notes from the ED were retrospectively extracted, for each of the following pathologies: pulmonary embolism, obstructing kidney stones, acute appendicitis, diverticulitis, small bowel obstruction, acute cholecystitis, acute hip fracture, and testicular torsion. A total of 40 cases were included. These notes were entered into ChatGPT-4, requesting recommendations on the most appropriate imaging examinations and protocols. The chatbot was also asked to generate radiology referrals. Two independent radiologists graded the referral on a scale ranging from 1 to 5 for clarity, clinical relevance, and differential diagnosis. The chatbot's imaging recommendations were compared with the ACR Appropriateness Criteria (AC) and with the examinations performed in the ED. Agreement between readers was assessed using linear weighted Cohen's κ coefficient. RESULTS: ChatGPT-4's imaging recommendations aligned with the ACR AC and ED examinations in all cases. Protocol discrepancies between ChatGPT and the ACR AC were observed in two cases (5%). ChatGPT-4-generated referrals received mean scores of 4.6 and 4.8 for clarity, 4.5 and 4.4 for clinical relevance, and 4.9 from both reviewers for differential diagnosis. Agreement between readers was moderate for clinical relevance and clarity and substantial for differential diagnosis grading. CONCLUSIONS: ChatGPT-4 has shown potential in aiding imaging study selection for select clinical cases. As a complementary tool, large language models may improve radiology referral quality. Radiologists should stay informed about this technology and be mindful of potential challenges and risks.
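This abstract reports reader agreement using a linearly weighted Cohen's κ on 1-to-5 grades. The sketch below shows how such a coefficient is computed from two readers' grades. The per-case grades are not given in the abstract, so the example ratings are hypothetical, and the function name `linear_weighted_kappa` is an illustrative choice.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Disagreement weight for categories i and j is |i - j| / (k - 1), so
    near-misses are penalised less than distant disagreements.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    if len(categories) < 2:
        raise ValueError("at least two ordered categories are required")
    position = {category: i for i, category in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))
    margin_a = Counter(rater_a)
    margin_b = Counter(rater_b)
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for a in categories:
        for b in categories:
            weight = abs(position[a] - position[b]) / (k - 1)
            observed_disagreement += weight * observed[(a, b)] / n
            expected_disagreement += weight * (margin_a[a] / n) * (margin_b[b] / n)
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical grades for eight referrals (NOT data from the study).
reader_1 = [5, 4, 5, 3, 4, 5, 4, 5]
reader_2 = [5, 5, 4, 3, 4, 5, 3, 5]
kappa = linear_weighted_kappa(reader_1, reader_2, categories=[1, 2, 3, 4, 5])
print(f"linear weighted kappa = {kappa:.2f}")
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; verbal labels such as "moderate" and "substantial", as used in the abstract, come from conventional interpretation bands rather than from the formula itself.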
Subjects
Hip Fractures , Radiology , Humans , Retrospective Studies , Radiography , Emergency Service, Hospital
ABSTRACT
Large language models such as ChatGPT have gained public and scientific attention. These models may support oncologists in their work. Oncologists should be familiar with large language models to harness their potential while being aware of potential dangers and limitations.
Subjects
Language , Oncologists , Humans , Medical Oncology
ABSTRACT
Pulmonary embolism (PE) is a common, life-threatening cardiovascular emergency. Risk stratification is one of the core principles of acute PE management and determines the choice of diagnostic and therapeutic strategies. In routine clinical practice, clinicians rely on the patient's electronic health record (EHR) to provide a context for their medical imaging interpretation. Most deep learning models for radiology applications only consider pixel-value information without the clinical context. Only a few integrate both clinical and imaging data. In this work, we develop and compare multimodal fusion models that combine volumetric pixel data and clinical patient data for automatic risk stratification of PE. Our best performing model is an intermediate fusion model that incorporates both bilinear attention and TabNet, and can be trained in an end-to-end manner. The results show that multimodality boosts performance by up to 14% with an area under the curve (AUC) of 0.96 for assessing PE severity, with a sensitivity of 90% and specificity of 94%, thus pointing to the value of using multimodal data to automatically assess PE severity.
Subjects
Pulmonary Embolism , Radiology , Humans , Pulmonary Embolism/diagnostic imaging , Area Under Curve , Dietary Supplements , Electronic Health Records
ABSTRACT
Large language models (LLMs) such as ChatGPT have gained public and scientific attention. The aim of this study is to evaluate ChatGPT as a support tool for breast tumor board decision making. We entered the clinical information of ten consecutive patients presented at a breast tumor board in our institution into ChatGPT-3.5. We asked the chatbot to recommend management. The results generated by ChatGPT were compared to the final recommendations of the tumor board. They were also graded independently by two senior radiologists. Scores ranged from 1 to 5 (1 = completely disagree, 5 = completely agree) in three different categories: summarization, recommendation, and explanation. The mean patient age was 49.4 years; 8/10 (80%) of patients had invasive ductal carcinoma, one patient (1/10, 10%) had a ductal carcinoma in situ, and one patient (1/10, 10%) had a phyllodes tumor with atypia. In seven out of ten cases (70%), ChatGPT's recommendations were similar to the tumor board's decisions. The first reviewer's mean scores for the chatbot's summarization, recommendation, and explanation were 3.7, 4.3, and 4.6, respectively. Mean values for the second reviewer were 4.3, 4.0, and 4.3, respectively. In this proof-of-concept study, we present initial results on the use of an LLM as a decision support tool in a breast tumor board. Given the significant advancements in this technology, clinicians should be familiar with its potential benefits and harms.