AI-Assisted Summarization of Radiologic Reports: Evaluating GPT3davinci, BARTcnn, LongT5booksum, LEDbooksum, LEDlegal, and LEDclinical.
Chien, Aichi; Tang, Hubert; Jagessar, Bhavita; Chang, Kai-Wei; Peng, Nanyun; Nael, Kambiz; Salamon, Noriko.
Affiliation
  • Chien A; From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California; aichi@ucla.edu.
  • Tang H; From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California.
  • Jagessar B; From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California.
  • Chang KW; Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California.
  • Peng N; Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California.
  • Nael K; From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California.
  • Salamon N; From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California.
AJNR Am J Neuroradiol; 45(2): 244-248, 2024 02 07.
Article in English | MEDLINE | ID: mdl-38238092
ABSTRACT
BACKGROUND AND PURPOSE:

The review of clinical reports is an essential part of monitoring disease progression. Synthesizing multiple imaging reports is also important for clinical decisions. It is critical to aggregate information quickly and accurately. Machine learning natural language processing (NLP) models hold promise to address an unmet need for report summarization.

MATERIALS AND METHODS:

We evaluated NLP methods to summarize longitudinal aneurysm reports. A total of 137 clinical reports and 100 PubMed case reports were used in this study. Models were 1) compared against expert-generated summaries using longitudinal imaging notes collected at our institution and 2) compared using publicly accessible PubMed case reports. Five AI models were used to summarize the clinical reports, and a sixth model, the online GPT3davinci NLP large language model (LLM), was added for the summarization of PubMed case reports. We assessed summary quality through comparison with expert summaries using quantitative metrics and quality reviews by experts.

RESULTS:

In clinical summarization, BARTcnn had the best performance (BERTscore = 0.8371), followed by LongT5booksum and LEDlegal. In the analysis using PubMed case reports, GPT3davinci demonstrated the best performance, followed by BARTcnn and then LEDbooksum (BERTscore = 0.894, 0.872, and 0.867, respectively).
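
For readers unfamiliar with the metric, the following is a minimal sketch of the greedy-matching idea behind BERTscore-style comparison of a model summary with an expert summary. It is an illustration under stated assumptions, not the study's evaluation code: the abstract does not say which variant (precision, recall, or F1) is reported or which embedding model was used. The function names and the 2-D "token embeddings" are hypothetical stand-ins for contextual embeddings, and refinements such as IDF weighting and baseline rescaling are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_score(candidate_vecs, reference_vecs):
    """Return (precision, recall, f1) from greedy token matching.

    Precision: each candidate token is matched to its most similar
    reference token, and the similarities are averaged.
    Recall: the same, from the reference side.
    """
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical toy vectors (assumption): one row per token.
expert_summary = [[1.0, 0.0], [0.0, 1.0]]
model_summary = [[1.0, 0.0], [1.0, 1.0]]

p, r, f1 = greedy_match_score(model_summary, expert_summary)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

A summary scored against itself gives 1.0 on all three values, and scores fall as the token embeddings of the two summaries diverge.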

CONCLUSIONS:

AI NLP summarization models demonstrated great potential in summarizing longitudinal aneurysm reports, though none yet reached the level of quality for clinical usage. We found the online GPT LLM outperformed the others; however, the BARTcnn model is potentially more useful because it can be implemented on-site. Future work to improve summarization, address other types of neuroimaging reports, and develop structured reports may allow NLP models to ease clinical workflow.
Subjects

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Natural Language Processing / Aneurysm | Type of study: Prognostic_studies | Limits: Humans | Language: En | Journal: AJNR Am J Neuroradiol | Publication year: 2024 | Document type: Article
