The shaky foundations of large language models and foundation models for electronic health records.

Wornow, Michael; Xu, Yizhe; Thapa, Rahul; Patel, Birju; Steinberg, Ethan; Fleming, Scott; Pfeffer, Michael A; Fries, Jason; Shah, Nigam H

Affiliation

Wornow M; Department of Computer Science, Stanford University, Stanford, CA, USA. mwornow@stanford.edu.
Xu Y; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.
Thapa R; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.
Patel B; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.
Steinberg E; Department of Computer Science, Stanford University, Stanford, CA, USA.
Fleming S; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.
Pfeffer MA; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.
Fries J; Technology and Digital Services, Stanford Health Care, Palo Alto, CA, USA.
Shah NH; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA.

NPJ Digit Med ; 6(1): 135, 2023 Jul 29.

Article in English | MEDLINE | ID: mdl-37516790

ABSTRACT

The success of foundation models such as ChatGPT and AlphaFold has spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. In this narrative review, we examine 84 foundation models trained on non-imaging EMR data (i.e., clinical text and/or structured data) and create a taxonomy delineating their architectures, training data, and potential use cases. We find that most models are trained on small, narrowly-scoped clinical datasets (e.g., MIMIC-III) or broad, public biomedical corpora (e.g., PubMed) and are evaluated on tasks that do not provide meaningful insights on their usefulness to health systems. Considering these findings, we propose an improved evaluation framework for measuring the benefits of clinical foundation models that is more closely grounded to metrics that matter in healthcare.

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Study type: Prognostic_studies | Language: English | Journal: NPJ Digit Med | Year: 2023 | Document type: Article | Country of affiliation: United States | Country of publication: United Kingdom