1.
Hepatology ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38451962

ABSTRACT

BACKGROUND AND AIMS: Large language models (LLMs) have significant capabilities in clinical information processing tasks. Commercially available LLMs, however, are not optimized for clinical uses and are prone to generating hallucinatory information. Retrieval-augmented generation (RAG) is an enterprise architecture that allows the embedding of customized data into LLMs. This approach "specializes" the LLMs and is thought to reduce hallucinations. APPROACH AND RESULTS: We developed "LiVersa," a liver disease-specific LLM, by using our institution's protected health information-compliant text embedding and LLM platform, "Versa." We conducted RAG on 30 publicly available American Association for the Study of Liver Diseases guidance documents to be incorporated into LiVersa. We evaluated LiVersa's performance in 2 rounds of testing. First, we compared LiVersa's outputs versus those of trainees from a previously published knowledge assessment; LiVersa answered all 10 questions correctly. Second, we asked 15 hepatologists to evaluate the outputs of 10 hepatology topic questions generated by LiVersa, OpenAI's ChatGPT 4, and Meta's Large Language Model Meta AI 2. LiVersa's outputs were more accurate but were rated less comprehensive and safe compared to those of ChatGPT 4. CONCLUSIONS: In this demonstration, we built disease-specific and protected health information-compliant LLMs using RAG.
While LiVersa demonstrated higher accuracy in answering questions related to hepatology, there were some deficiencies due to limitations set by the number of documents used for RAG. LiVersa will likely require further refinement before potential live deployment. The LiVersa prototype, however, is a proof of concept for utilizing RAG to customize LLMs for clinical use cases.
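The retrieve-then-prompt workflow the abstract describes can be sketched as follows. This is a minimal illustration using a toy bag-of-words embedding; the actual "Versa" platform, its embedding model, and the guidance-document corpus are not public, so every document, function name, and the example query here are assumptions.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy embedding: a lowercase bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank the guidance corpus by similarity to the query; keep top-k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the LLM by prepending the retrieved passages to the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Stand-in corpus; the real system embeds 30 AASLD guidance documents.
docs = [
    "HBV treatment is indicated when ALT is elevated and viral load is high.",
    "HCC surveillance uses ultrasound every six months in cirrhosis.",
    "Ascites management begins with sodium restriction and diuretics.",
]
prompt = build_prompt("When is HBV treatment indicated?", docs)
```

A production system would swap the bag-of-words vectors for a learned text-embedding model, but the grounding step, answering only from retrieved source passages, is what is thought to reduce hallucinations.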

4.
Chest ; 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39197512

ABSTRACT

BACKGROUND: A metaphor conceptualizes one, typically abstract, experience in terms of another, more concrete, experience with the goal of making it easier to understand. Even though combat metaphors have been well-described in some health contexts, they have not been well-characterized in the setting of critical illness. RESEARCH QUESTION: How do clinicians use combat metaphors when describing critically ill patients and families in the electronic health record? STUDY DESIGN AND METHODS: We included notes written about patients >=18 years admitted to ICUs within a large hospital system from 2012-2020. We developed a lexicon of combat words, and isolated note segments that contained any combat mentions. Combat mentions were systematically defined as a metaphor or not across two coders. Among combat metaphors, we used a grounded theory approach to construct a conceptual framework around their use. RESULTS: Across 6,404 combat-related mentions, 5,970 were defined as metaphors (Cohen's kappa 0.84). The most common metaphors were "bout" (26.2% of isolated segments), "combat" (18.5%), "confront" (17.8%) and "struggle" (17.5%). We present a conceptual framework highlighting how combat metaphors can present as identity ("mom is a fighter") and process constructs ("struggling to breathe"). Identity constructs were usually framed around: (1) hope, (2) internal strength, and/or (3) contextualization of current illness based on prior experiences. Process constructs were used to describe: (1) "fighting for" (e.g. working toward) a goal, (2) "fighting against" an unwanted force, or (3) experiencing internal turmoil. INTERPRETATION: We provide a novel conceptual framework around the use of combat metaphors in the ICU. Further studies are needed to understand intentionality behind their use and how they impact clinician behaviors and patient/caregiver emotional responses.
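The screening step, isolating note segments that contain any combat mention, can be sketched with a small stemmed lexicon. The five stems below are drawn from the frequent metaphors the abstract reports; the sentence-level windowing and the stem list itself are assumptions for illustration, and each hit would still need hand-coding as metaphor versus literal use.

```python
import re

# Stems so that "struggling", "fighter", "bouts", etc. all match.
COMBAT_LEXICON = ["bout", "combat", "confront", "struggl", "fight"]
PATTERN = re.compile(r"\b(" + "|".join(COMBAT_LEXICON) + r")\w*\b",
                     re.IGNORECASE)

def combat_segments(note: str) -> list[str]:
    """Return sentences from a clinical note that mention a combat term."""
    sentences = re.split(r"(?<=[.!?])\s+", note)
    return [s for s in sentences if PATTERN.search(s)]

note = ("Mom is a fighter. Vitals stable overnight. "
        "Patient struggling to breathe despite BiPAP.")
hits = combat_segments(note)
```

The two hits above also illustrate the paper's two construct types: "Mom is a fighter" is an identity construct, while "struggling to breathe" is a process construct.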

5.
Chest ; 165(6): 1481-1490, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38199323

ABSTRACT

BACKGROUND: Language in nonmedical data sets is known to transmit human-like biases when used in natural language processing (NLP) algorithms that can reinforce disparities. It is unclear if NLP algorithms of medical notes could lead to similar transmissions of biases. RESEARCH QUESTION: Can we identify implicit bias in clinical notes, and are biases stable across time and geography? STUDY DESIGN AND METHODS: To determine whether different racial and ethnic descriptors are similar contextually to stigmatizing language in ICU notes and whether these relationships are stable across time and geography, we identified notes on critically ill adults admitted to the University of California, San Francisco (UCSF), from 2012 through 2022 and to Beth Israel Deaconess Medical Center (BIDMC) from 2001 through 2012. Because word meaning is derived largely from context, we trained unsupervised word-embedding algorithms to quantitatively measure the similarity (cosine similarity) of the context between a racial or ethnic descriptor (eg, African-American) and a stigmatizing target word (eg, noncooperative) or group of words (violence, passivity, noncompliance, nonadherence). RESULTS: In UCSF notes, Black descriptors were less likely to be similar contextually to violent words compared with White descriptors. In contrast, in BIDMC notes, Black descriptors were more likely to be similar contextually to violent words compared with White descriptors. The UCSF data set also showed that Black descriptors were more similar contextually to passivity and noncompliance words compared with Latinx descriptors. INTERPRETATION: Implicit bias is identifiable in ICU notes. Racial and ethnic group descriptors carry different contextual relationships to stigmatizing words, depending on when and where notes were written. Because NLP models seem able to transmit implicit bias from training data, use of NLP algorithms in clinical prediction could reinforce disparities.
Active debiasing strategies may be necessary to achieve algorithmic fairness when using language models in clinical research.
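The cosine-similarity measurement described above can be sketched as follows. The three-dimensional vectors are fabricated placeholders; the study trained unsupervised word embeddings (with far higher dimensionality) on real ICU notes, and the descriptor and word-group names here are illustrative assumptions.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def mean_vector(vectors):
    """Element-wise mean, used to represent a stigmatizing word group."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Fabricated 3-dimensional embeddings, for illustration only.
embeddings = {
    "descriptor": [0.9, 0.1, 0.2],    # a racial/ethnic descriptor term
    "noncompliant": [0.8, 0.2, 0.1],  # stigmatizing word 1
    "nonadherent": [0.7, 0.3, 0.2],   # stigmatizing word 2
}
group = mean_vector([embeddings["noncompliant"], embeddings["nonadherent"]])
similarity = cosine(embeddings["descriptor"], group)
```

Comparing such similarities across descriptors, sites, and time windows is what lets the study ask whether the descriptor-stigma relationship is stable across geography and era.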


Subjects
Intensive Care Units , Natural Language Processing , Neural Networks, Computer , Humans , Algorithms , Critical Illness/psychology , Bias , Electronic Health Records , Male , Female
6.
medRxiv ; 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37986764

ABSTRACT

Background: Large language models (LLMs) have significant capabilities in clinical information processing tasks. Commercially available LLMs, however, are not optimized for clinical uses and are prone to generating incorrect or hallucinatory information. Retrieval-augmented generation (RAG) is an enterprise architecture that allows embedding of customized data into LLMs. This approach "specializes" the LLMs and is thought to reduce hallucinations. Methods: We developed "LiVersa," a liver disease-specific LLM, by using our institution's protected health information (PHI)-compliant text embedding and LLM platform, "Versa." We conducted RAG on 30 publicly available American Association for the Study of Liver Diseases (AASLD) guidelines and guidance documents to be incorporated into LiVersa. We evaluated LiVersa's performance by comparing its responses versus those of trainees from a previously published knowledge assessment study regarding hepatitis B (HBV) treatment and hepatocellular carcinoma (HCC) surveillance. Results: LiVersa answered all 10 questions correctly when forced to provide a "yes" or "no" answer. Full detailed responses with justifications and rationales, however, were not completely correct for three of the questions. Discussion: In this study, we demonstrated the ability to build disease-specific and PHI-compliant LLMs using RAG. While our LLM, LiVersa, demonstrated more specificity in answering questions related to clinical hepatology, there were some knowledge deficiencies due to limitations set by the number and types of documents used for RAG. The LiVersa prototype, however, is a proof of concept for utilizing RAG to customize LLMs for clinical uses and a potential strategy to realize personalized medicine in the future.

7.
Crit Care Explor ; 5(10): e0960, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37753238

ABSTRACT

OBJECTIVES: To develop proof-of-concept algorithms using alternative approaches to capture provider sentiment in ICU notes. DESIGN: Retrospective observational cohort study. SETTING: The Multiparameter Intelligent Monitoring of Intensive Care III (MIMIC-III) and the University of California, San Francisco (UCSF) deidentified notes databases. PATIENTS: Adult (≥18 yr old) patients admitted to the ICU. MEASUREMENTS AND MAIN RESULTS: We developed two sentiment models: 1) a keywords-based approach using a consensus-based clinical sentiment lexicon comprising 72 positive and 103 negative phrases, including negations, and 2) a keywords-independent deep learning model based on Decoding-enhanced Bidirectional Encoder Representations from Transformers with disentangled attention, version 3 (DeBERTa-v3), trained on clinical sentiment labels. We applied the models to 198,944 notes across 52,997 ICU admissions in the MIMIC-III database. Analyses were replicated on an external sample of patients admitted to a UCSF ICU from 2018 to 2019. We also labeled sentiment in 1,493 note fragments and compared the predictive accuracy of our tools to three popular sentiment classifiers. Clinical sentiment terms were found in 99% of patient visits across 88% of notes. Our two sentiment tools were substantially more predictive (Spearman correlations of 0.62-0.84, p values < 0.00001) of labeled sentiment compared with general language algorithms (0.28-0.46). CONCLUSION: Our exploratory healthcare-specific sentiment models can more accurately detect positivity and negativity in clinical notes compared with general sentiment tools not designed for clinical usage.
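The keywords-based model can be sketched as lexicon matching with simple negation handling. The three-term lexicons below are placeholders standing in for the study's consensus lexicon of 72 positive and 103 negative phrases, and the one-token-back negation rule is an assumption for illustration.

```python
# Placeholder lexicons; the study's consensus lexicon holds 72 positive
# and 103 negative phrases, including negations.
POSITIVE = {"improving", "comfortable", "tolerating"}
NEGATIVE = {"deteriorating", "agitated", "unresponsive"}
NEGATIONS = {"not", "no", "denies"}

def sentiment_score(note: str) -> int:
    """Sum +1/-1 per lexicon hit, flipping the sign after a negation token."""
    tokens = [t.strip(".,;:") for t in note.lower().split()]
    score = 0
    for i, tok in enumerate(tokens):
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity  # "not improving" counts as negative
        score += polarity
    return score
```

A transformer model such as DeBERTa-v3 replaces the fixed lexicon with learned representations, which is why the abstract calls it keywords-independent.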

8.
JCO Clin Cancer Inform ; 4: 454-463, 2020 05.
Article in English | MEDLINE | ID: mdl-32412846

ABSTRACT

PURPOSE: The Electronic Medical Record Search Engine (EMERSE) is a software tool built to aid research spanning cohort discovery, population health, and data abstraction for clinical trials. EMERSE is now live at three academic medical centers, with additional sites currently working on implementation. In this report, we describe how EMERSE has been used to support cancer research based on a variety of metrics. METHODS: We identified peer-reviewed publications that used EMERSE through online searches as well as through direct e-mails to users based on audit logs. These logs were also used to summarize use at each of the three sites. Search terms for two of the sites were characterized using the natural language processing tool MetaMap to determine to which semantic types the terms could be mapped. RESULTS: We identified a total of 326 peer-reviewed publications that used EMERSE through August 2019, although this is likely an underestimation of the true total based on the use log analysis. Oncology-related research comprised nearly one third (n = 105; 32.2%) of all research output. The use logs showed that EMERSE had been used by multiple people at each site (nearly 3,500 across all three) who had collectively logged into the system > 100,000 times. Many user-entered search queries could not be mapped to a semantic type, but the most common semantic type for terms that did match was "disease or syndrome," followed by "pharmacologic substance." CONCLUSION: EMERSE has been shown to be a valuable tool for supporting cancer research. It has been successfully deployed at other sites, despite some implementation challenges unique to each deployment environment.
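The semantic-type analysis of the search logs can be sketched as a tally over MetaMap output. The mapping dictionary below is fabricated to stand in for real MetaMap results, and the treatment of unmappable jargon as a separate bucket is an assumption; EMERSE's actual log format is not described in the abstract.

```python
from collections import Counter

# Hypothetical MetaMap results: user query term -> semantic type (or None
# when MetaMap cannot map the term, e.g. clinical shorthand).
mapped_terms = {
    "melanoma": "disease or syndrome",
    "cisplatin": "pharmacologic substance",
    "pt tol well": None,
    "sepsis": "disease or syndrome",
}

counts = Counter(stype or "unmapped" for stype in mapped_terms.values())
top_type, top_count = counts.most_common(1)[0]
```

On this toy input, "disease or syndrome" is the most common mapped type, mirroring the study's finding, with "pharmacologic substance" behind it and a residue of unmappable queries.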


Subjects
Neoplasms , Search Engine , Electronic Health Records , Humans , Information Storage and Retrieval , Natural Language Processing , Neoplasms/therapy , Software