Practical Evaluation of ChatGPT Performance for Radiology Report Generation.
Soleimani, Mohsen; Seyyedi, Navisa; Ayyoubzadeh, Seyed Mohammad; Kalhori, Sharareh Rostam Niakan; Keshavarz, Hamidreza.
Affiliation
  • Soleimani M; Department of Health Information Management and Medical Informatics, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran.
  • Seyyedi N; Department of Health Information Management and Medical Informatics, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran. Electronic address: n-seyyedi@razi.tums.ac.ir.
  • Ayyoubzadeh SM; Department of Health Information Management and Medical Informatics, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Health Information Management Research Centre, Tehran University of Medical Sciences, Tehran, Iran.
  • Kalhori SRN; Department of Health Information Management and Medical Informatics, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Braunschweig, Germany.
  • Keshavarz H; Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.
Acad Radiol; 2024 Aug 13.
Article in En | MEDLINE | ID: mdl-39142976
ABSTRACT
RATIONALE AND OBJECTIVES:

The process of generating radiology reports is time-consuming and labor-intensive, and prone to incompleteness, heterogeneity, and errors. Using natural language processing (NLP)-based techniques, this study explores the potential of ChatGPT (Generative Pre-trained Transformer), a prominent large language model (LLM), to enhance the efficiency of radiology report generation.

MATERIALS AND METHODS:

Using a sample of 1000 records from the Medical Information Mart for Intensive Care (MIMIC) Chest X-ray Database, this investigation employed Claude.ai to extract initial radiological report keywords. ChatGPT then generated radiology reports using a consistent three-step prompt template. Various lexical and sentence similarity techniques were used to evaluate the correspondence between the ChatGPT-generated reports and reference reports authored by medical professionals.
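As an illustration of the lexical scoring step, the following Python sketch compares a generated report against a reference report using two of the simpler measures named in this record, the Jaccard index and a sequence-matching ratio; the two report strings are hypothetical examples, not records drawn from MIMIC-CXR.

from difflib import SequenceMatcher


def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Jaccard index over lower-cased word sets of two report texts."""
    set_a, set_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def sequence_match_ratio(text_a: str, text_b: str) -> float:
    """Character-level similarity ratio from difflib's SequenceMatcher."""
    return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


# Hypothetical example pair: a generated report and a physician-authored reference.
generated = "No acute cardiopulmonary process. Heart size is normal."
reference = "The heart size is normal. No acute cardiopulmonary abnormality."

print(f"Jaccard index:        {jaccard_similarity(generated, reference):.3f}")
print(f"Sequence-match ratio: {sequence_match_ratio(generated, reference):.3f}")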

RESULTS:

Performance varied across the NLP similarity models: BART (Bidirectional and Auto-Regressive Transformers) and XLM (Cross-lingual Language Model) showed high agreement with the physician-authored reports (mean similarity scores up to 99.3%), whereas DeBERTa (Decoding-enhanced BERT with disentangled attention) and sequence-matching models scored lower, indicating weaker alignment with medical language. In the Impression section, the word-embedding model performed best with a mean similarity of 84.4%, while measures such as the Jaccard index scored lower.
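The transformer-based scores quoted above come from the study's own pipeline. As a rough stand-in, the sketch below computes an embedding cosine similarity using the third-party sentence-transformers package with a generic model; this model choice is an assumption for illustration and not the BART/XLM/DeBERTa setup used in the paper.

# Minimal sentence-similarity sketch; model choice is an assumed stand-in.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model, not the paper's

generated = "No acute cardiopulmonary process. Heart size is normal."
reference = "The heart size is normal. No acute cardiopulmonary abnormality."

# Encode both reports and compute cosine similarity between the embeddings.
embeddings = model.encode([generated, reference], convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"Embedding cosine similarity: {score:.3f}")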

CONCLUSION:

Overall, the study highlights significant variations across NLP models in their ability to generate radiology reports consistent with medical professionals' language. Pairwise comparisons and Kruskal-Wallis tests confirmed these differences, emphasizing the need for careful selection and evaluation of NLP models in radiology report generation. This research underscores the potential of ChatGPT to streamline and improve the radiology reporting process, with implications for enhancing efficiency and accuracy in clinical practice.
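A minimal sketch of how such a Kruskal-Wallis test with pairwise follow-ups could be run with SciPy, assuming each metric yields one similarity score per report; the score arrays below are synthetic placeholders, not the study's data, and the paper's exact pairwise procedure may differ.

import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(0)
scores_by_metric = {
    "bart":    rng.uniform(0.95, 1.00, size=1000),  # placeholder values
    "xlm":     rng.uniform(0.94, 1.00, size=1000),
    "jaccard": rng.uniform(0.30, 0.60, size=1000),
}

# Kruskal-Wallis H test across all metrics.
h_stat, p_value = kruskal(*scores_by_metric.values())
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_value:.3g}")

# Pairwise follow-up comparisons (Mann-Whitney U); a multiple-comparison
# correction such as Bonferroni would be applied in practice.
names = list(scores_by_metric)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        u, p = mannwhitneyu(scores_by_metric[names[i]], scores_by_metric[names[j]])
        print(f"{names[i]} vs {names[j]}: U = {u:.0f}, p = {p:.3g}")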
Keywords

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Acad Radiol / Acad. radiol / Academic radiology Journal subject: RADIOLOGY Year of publication: 2024 Document type: Article Country of affiliation: Iran Country of publication: United States
