Feasibility of Using the Privacy-preserving Large Language Model Vicuna for Labeling Radiology Reports.

Mukherjee, Pritam; Hou, Benjamin; Lanfredi, Ricardo B; Summers, Ronald M

Mukherjee, Pritam; Hou, Benjamin; Lanfredi, Ricardo B; Summers, Ronald M.

Afiliación

Mukherjee P; From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182.
Hou B; From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182.
Lanfredi RB; From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182.
Summers RM; From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182.

Radiology ; 309(1): e231147, 2023 10.

Article en En | MEDLINE | ID: mdl-37815442

RESUMEN

Background Large language models (LLMs) such as ChatGPT, though proficient in many text-based tasks, are not suitable for use with radiology reports due to patient privacy constraints. Purpose To test the feasibility of using an alternative LLM (Vicuna-13B) that can be run locally for labeling radiography reports. Materials and Methods Chest radiography reports from the MIMIC-CXR and National Institutes of Health (NIH) data sets were included in this retrospective study. Reports were examined for 13 findings. Outputs reporting the presence or absence of the 13 findings were generated by Vicuna by using a single-step or multistep prompting strategy (prompts 1 and 2, respectively). Agreements between Vicuna outputs and CheXpert and CheXbert labelers were assessed using Fleiss κ. Agreement between Vicuna outputs from three runs under a hyperparameter setting that introduced some randomness (temperature, 0.7) was also assessed. The performance of Vicuna and the labelers was assessed in a subset of 100 NIH reports annotated by a radiologist with use of area under the receiver operating characteristic curve (AUC). Results A total of 3269 reports from the MIMIC-CXR data set (median patient age, 68 years [IQR, 59-79 years]; 161 male patients) and 25 596 reports from the NIH data set (median patient age, 47 years [IQR, 32-58 years]; 1557 male patients) were included. Vicuna outputs with prompt 2 showed, on average, moderate to substantial agreement with the labelers on the MIMIC-CXR (κ median, 0.57 [IQR, 0.45-0.66] with CheXpert and 0.64 [IQR, 0.45-0.68] with CheXbert) and NIH (κ median, 0.52 [IQR, 0.41-0.65] with CheXpert and 0.55 [IQR, 0.41-0.74] with CheXbert) data sets, respectively. Vicuna with prompt 2 performed at par (median AUC, 0.84 [IQR, 0.74-0.93]) with both labelers on nine of 11 findings. Conclusion In this proof-of-concept study, outputs of the LLM Vicuna reporting the presence or absence of 13 findings on chest radiography reports showed moderate to substantial agreement with existing labelers. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Cai in this issue.

Asunto(s)

Camélidos del Nuevo Mundo; Radiología; Estados Unidos; Humanos; Masculino; Animales; Anciano; Persona de Mediana Edad; Privacidad; Estudios de Factibilidad; Estudios Retrospectivos; Lenguaje

Texto completo

Imprimir

XML

PubMed Links

Buscar en Google

Texto completo: 1 Colección: 01-internacional Banco de datos: MEDLINE Asunto principal: Radiología / Camélidos del Nuevo Mundo Tipo de estudio: Observational_studies / Prognostic_studies Límite: Aged / Animals / Humans / Male / Middle aged País/Región como asunto: America do norte Idioma: En Revista: Radiology Año: 2023 Tipo del documento: Article

Texto completo

Imprimir

XML

PubMed Links

Buscar en Google