ABSTRACT
OBJECTIVES: This study aims to assess the performance of a multimodal artificial intelligence (AI) model capable of analyzing both images and textual data (GPT-4V) in interpreting radiological images. It focuses on a range of modalities, anatomical regions, and pathologies to explore the potential of zero-shot generative AI in enhancing diagnostic processes in radiology. METHODS: We analyzed 230 anonymized emergency room diagnostic images, consecutively collected over 1 week, using GPT-4V. Modalities included ultrasound (US), computed tomography (CT), and X-ray. The interpretations provided by GPT-4V were compared with those of senior radiologists to evaluate its accuracy in recognizing the imaging modality, anatomical region, and pathology present in each image. RESULTS: GPT-4V identified the imaging modality correctly in 100% of cases (221/221), the anatomical region in 87.1% (189/217), and the pathology in 35.2% (76/216). However, the model's performance varied significantly across modalities: anatomical region identification accuracy ranged from 60.9% (39/64) for US images to 97.0% (98/101) for CT and 100% (52/52) for X-ray images (p < 0.001). Similarly, pathology identification ranged from 9.1% (6/66) for US images to 36.4% (36/99) for CT and 66.7% (34/51) for X-ray images (p < 0.001). These variations indicate inconsistencies in GPT-4V's ability to interpret radiological images accurately. CONCLUSION: While the integration of AI in radiology, exemplified by multimodal GPT-4, offers promising avenues for diagnostic enhancement, the current capabilities of GPT-4V are not yet reliable for interpreting radiological images. This study underscores the necessity for ongoing development to achieve dependable performance in radiology diagnostics. CLINICAL RELEVANCE STATEMENT: Although GPT-4V shows promise in radiological image interpretation, its high diagnostic hallucination rate (> 40%) indicates it cannot be trusted for clinical use as a standalone tool. Improvements are necessary to enhance its reliability and ensure patient safety. KEY POINTS: GPT-4V's capability in analyzing images offers new clinical possibilities in radiology. GPT-4V excels in identifying imaging modalities but demonstrates inconsistent anatomy and pathology detection. Ongoing AI advancements are necessary to enhance diagnostic reliability in radiological applications.
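As a rough illustration of the RESULTS above, the per-modality counts reported in the abstract can be tabulated and compared directly. The minimal Python sketch below recomputes the pathology-identification accuracies and applies a chi-square test of independence; the choice of test is an assumption, since the abstract reports p-values but does not name the statistical method used.

```python
# Minimal sketch: recompute the per-modality pathology accuracies reported in the
# abstract and test whether correctness depends on modality. The chi-square test
# of independence is an assumption; the abstract does not state which test was used.
from scipy.stats import chi2_contingency

# (correct, total) pairs taken from the abstract
pathology = {"US": (6, 66), "CT": (36, 99), "X-ray": (34, 51)}

for modality, (correct, total) in pathology.items():
    print(f"{modality}: {correct}/{total} = {correct / total:.1%}")

# 2 x 3 contingency table: rows = correct / incorrect, columns = modality
table = [
    [c for c, _ in pathology.values()],
    [t - c for c, t in pathology.values()],
]
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2g}")  # p < 0.001, consistent with the abstract
```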
ABSTRACT
BACKGROUND: Natural Language Processing (NLP) and Large Language Models (LLMs) hold largely untapped potential in infectious disease management. This review explores their current use and identifies areas needing more attention. METHODS: This analysis followed systematic review procedures and was registered with the International Prospective Register of Systematic Reviews (PROSPERO). We searched major databases, including PubMed, Embase, Web of Science, and Scopus, up to December 2023, using keywords related to NLP, LLMs, and infectious diseases. We used the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool to evaluate the quality and robustness of the included studies. RESULTS: Our review identified 15 studies with diverse applications of NLP in infectious disease management. Notable examples include GPT-4's application in detecting urinary tract infections and BERTweet's use in Lyme disease surveillance through social media analysis. These models demonstrated effective disease monitoring and public health tracking capabilities, although effectiveness varied across studies. For instance, while some NLP tools showed high accuracy in pneumonia detection and high sensitivity in identifying invasive mold diseases from medical reports, others fell short in areas such as bloodstream infection management. CONCLUSIONS: This review highlights the yet-to-be-fully-realized promise of NLP and LLMs in infectious disease management. It calls for further exploration to fully harness AI's capabilities, particularly in diagnosis, surveillance, predicting disease courses, and tracking epidemiological trends.
Subjects
Communicable Diseases, Natural Language Processing, Humans, Communicable Diseases/diagnosis
ABSTRACT
OBJECTIVE: The aim of this study is to share our experience in treating patients with lymphatic malformations (LMs) over a span of 14 years, evaluating the efficacy and safety of sclerotherapy, particularly with ethanol as the sclerosant of choice. METHODS: A retrospective review of pediatric patients diagnosed with and subsequently treated for LMs between 2008 and 2022 was conducted. We collected patient demographics, LM characteristics, treatment strategies, and outcomes, including response to treatment and complications. RESULTS: The cohort included 36 patients (24 male), first presenting clinically at a median age of 5 months (range: 0-12 years). LMs were macrocystic (17), microcystic (3), and mixed (16). In most patients (22), the malformation involved the cervicofacial area. Twenty-five patients underwent 54 procedures, averaging 2 procedures per patient (range: 1-13). Sclerotherapy resulted in 90% of patients exhibiting some degree of response of the LM (P = .005). Ethanol was used in most procedures (31) and proved most efficacious, achieving partial or complete response of the malformations in all cases compared with 72% for other sclerosants (P = .06). Sclerotherapy exhibited low complication rates across all sclerosants used (7%, P = .74). CONCLUSIONS: Sclerotherapy is a safe and effective intervention for pediatric LMs. Ethanol demonstrated comparable efficacy and safety to other sclerosants, highlighting its potential as a preferred treatment option. This study supports the tailored use of sclerotherapy, guided by a thorough understanding of the risks and benefits, to provide optimized care for patients with LMs.
Subjects
Ethanol, Lymphatic Abnormalities, Sclerosing Solutions, Sclerotherapy, Humans, Sclerotherapy/adverse effects, Male, Retrospective Studies, Lymphatic Abnormalities/therapy, Lymphatic Abnormalities/diagnostic imaging, Female, Child, Child, Preschool, Infant, Sclerosing Solutions/adverse effects, Sclerosing Solutions/therapeutic use, Treatment Outcome, Ethanol/adverse effects, Ethanol/administration & dosage, Infant, Newborn, Time Factors
ABSTRACT
The United States Medical Licensing Examination (USMLE) has been the subject of performance studies of artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models' consistency. The performance of the AI models was compared with that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% of questions compared with ChatGPT's 62.5%. GPT-4 showed greater confidence, revising none of its responses, whereas ChatGPT modified its original answers 82.5% of the time. GPT-4 also outperformed past AMBOSS users. Both AI models, notably GPT-4, demonstrated a capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
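For concreteness, the percentages above imply specific answer counts out of the 80 questions used. The short sketch below derives those counts; the raw numbers are inferred from the reported percentages rather than stated directly in the abstract.

```python
# Minimal sketch: derive the answer counts implied by the percentages reported
# above (80 questions total; the abstract states percentages, not raw counts).
TOTAL_QUESTIONS = 80

reported = {
    "GPT-4 correct": 0.90,                  # -> 72 of 80
    "ChatGPT correct": 0.625,               # -> 50 of 80
    "ChatGPT revised on follow-up": 0.825,  # -> 66 of 80
}

for label, fraction in reported.items():
    count = round(fraction * TOTAL_QUESTIONS)
    print(f"{label}: {count}/{TOTAL_QUESTIONS} ({fraction:.1%})")
```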