Feasibility of Artificial Intelligence Powered Adverse Event Analysis: Using a Large Language Model to Analyze Microwave Ablation Malfunction Data.
Warren, Blair E; Alkhalifah, Fahd; Ahrari, Aida; Min, Adam; Fawzy, Aly; Annamalai, Ganesan; Jaberi, Arash; Beecroft, Robert; Kachura, John R; Mafeld, Sebastian C.
Affiliation
  • Warren BE; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
  • Alkhalifah F; Division of Vascular and Interventional Radiology, Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada.
  • Ahrari A; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
  • Min A; Division of Vascular and Interventional Radiology, Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada.
  • Fawzy A; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
  • Annamalai G; Division of Vascular and Interventional Radiology, Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada.
  • Jaberi A; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
  • Beecroft R; Division of Vascular and Interventional Radiology, Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada.
  • Kachura JR; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
  • Mafeld SC; Department of Medical Imaging, University of Toronto, Temerty Faculty of Medicine, Toronto, ON, Canada.
Can Assoc Radiol J; 8465371241269436, 2024 Aug 21.
Article in En | MEDLINE | ID: mdl-39169480
ABSTRACT

Objectives:

Determine whether a large language model (LLM, GPT-4) can label, consolidate, and analyze interventional radiology (IR) microwave ablation device safety event data into meaningful summaries similar to those produced by humans.

Methods:

Microwave ablation safety data from January 1, 2011 to October 31, 2023 were collected, and the type of failure was categorized by human readers. The data were then classified using GPT-4 with iterative prompt development. Iterative summarization of the reports was performed using GPT-4 to generate a final summary of the large text corpus.
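The abstract does not describe how the iterative summarization was implemented. One common pattern is to summarize batches of reports and then summarize the resulting summaries until a single text remains. The sketch below illustrates that pattern only; the function names, the batch size, and the `toy_summarize` stand-in (used here in place of a real LLM call) are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def iterative_summary(
    reports: List[str],
    summarize: Callable[[str], str],
    batch_size: int = 20,
) -> str:
    """Summarize batches, then summarize the summaries, until one remains.

    `summarize` is a hypothetical callable that would wrap an LLM request
    in practice; it takes one text and returns a shorter text.
    """
    if not reports:
        raise ValueError("at least one report is required")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so each pass shrinks the list")
    level = list(reports)
    while True:
        # One pass: each batch of texts is joined and condensed into one summary.
        level = [summarize("\n\n".join(batch)) for batch in chunk(level, batch_size)]
        if len(level) == 1:
            return level[0]


def toy_summarize(text: str) -> str:
    """Stand-in for an LLM call (assumption): keeps only the first 80 characters."""
    return text[:80]


if __name__ == "__main__":
    # Placeholder report texts, not real safety data.
    placeholder_reports = [f"placeholder report {i}" for i in range(45)]
    print(iterative_summary(placeholder_reports, toy_summarize))
```

Because each pass discards detail, exact quantities such as per-class event counts are better computed directly from the labelled data than read off the final summary.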

Results:

The data were split into training (n = 25), validation (n = 639), and test (n = 79) sets to reflect real-world deployment of an LLM for this task. GPT-4 demonstrated high accuracy in the multiclass classification of microwave ablation device data (accuracy [95% CI]: training 96.0% [79.7, 99.9], validation 86.4% [83.5, 89.0], test 87.3% [78.0, 93.8]). The text content was distilled through GPT-4 and iterative summarization prompts. The final summary reflected the clinically relevant insights from the microwave ablation data relative to human interpretation but contained inaccurate event class counts.
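The abstract does not say how the 95% confidence intervals were computed. An exact binomial (Clopper-Pearson) interval is one standard choice for an accuracy estimate, and the sketch below computes it with the standard library only. The correct-prediction counts used in the demo (24/25, 552/639, 69/79) are assumptions inferred from the reported percentages and sample sizes; they are not stated in the source.

```python
import math
from typing import Callable, Tuple


def _log_pmf(i: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability mass at i, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), for 0 < p < 1."""
    if k < 0:
        return 0.0
    return min(1.0, sum(math.exp(_log_pmf(i, n, p)) for i in range(k + 1)))


def _solve(f: Callable[[float], float], target: float, increasing: bool) -> float:
    """Bisection on (0, 1) for f(p) = target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        below_root = f(mid) < target if increasing else f(mid) > target
        if below_root:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha/2 (increasing in p).
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (decreasing in p).
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    # Correct counts are inferred from the reported accuracies (assumption).
    for name, k, n in [("training", 24, 25), ("validation", 552, 639), ("test", 69, 79)]:
        low, high = clopper_pearson(k, n)
        print(f"{name}: {100 * k / n:.1f}% [{100 * low:.1f}, {100 * high:.1f}]")
```

The wide training interval follows directly from the small sample (n = 25): even a single misclassification leaves considerable uncertainty about the underlying accuracy.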

Conclusion:

The LLM emulated the human analysis, suggesting that LLMs are a feasible tool for clinicians to process large volumes of IR safety data. It accurately labelled microwave ablation device event data by type of malfunction through few-shot learning. Content distillation was used to analyze a large text corpus (>650 reports) and generate an insightful summary similar to the human interpretation.
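Few-shot labelling generally means placing a handful of labelled examples in the prompt ahead of the item to be classified. The abstract does not give the prompt or the malfunction categories, so the sketch below uses placeholder category names and placeholder report texts; the prompt wording and both helper functions are hypothetical and show only the general shape of the technique.

```python
from typing import List, Optional, Tuple


def build_few_shot_prompt(
    labels: List[str],
    examples: List[Tuple[str, str]],
    report: str,
) -> str:
    """Assemble a prompt from labelled (text, label) examples plus one new report."""
    lines = [
        "Classify the device malfunction report into exactly one category.",
        "Categories: " + ", ".join(labels),
        "",
    ]
    for text, label in examples:
        if label not in labels:
            raise ValueError(f"unknown label in examples: {label}")
        lines += [f"Report: {text}", f"Category: {label}", ""]
    # The prompt ends with the unlabelled report so the model completes the label.
    lines += [f"Report: {report}", "Category:"]
    return "\n".join(lines)


def parse_label(response: str, labels: List[str]) -> Optional[str]:
    """Map a model response back to a known label, or None if it does not match."""
    cleaned = response.strip().lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return None


if __name__ == "__main__":
    # Placeholder categories and texts (assumptions, not taken from the study).
    LABELS = ["category_a", "category_b"]
    EXAMPLES = [("example report text 1", "category_a"),
                ("example report text 2", "category_b")]
    print(build_few_shot_prompt(LABELS, EXAMPLES, "new report text"))
```

Rejecting responses that do not match a known label keeps malformed model output from silently entering the accuracy calculation.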

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Can Assoc Radiol J Journal subject: Radiology Year: 2024 Document type: Article Affiliation country: Canada Country of publication: United States
