Results 1 - 6 of 6
1.
J Med Internet Res ; 26: e53225, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38241074

ABSTRACT

This editorial explores the evolving and transformative role of large language models (LLMs) in enhancing the capabilities of virtual assistants (VAs) in the health care domain, highlighting recent research on the performance of VAs and LLMs in health care information sharing. Drawing on that research, it describes the marked improvement in the accuracy and clinical relevance of responses from LLMs such as GPT-4 compared with current VAs, especially for complex health care inquiries such as those related to postpartum depression. This improved accuracy and clinical relevance marks a paradigm shift in digital health tools and VAs. Furthermore, such LLM applications can adapt dynamically and be integrated into existing VA platforms, offering cost-effective, scalable, and inclusive solutions. These advances point to a significantly broader range of VA applications, along with greater value, risk, and impact in health care, moving toward more personalized digital health ecosystems. Alongside these advancements, however, ethical guidelines, regulatory frameworks, governance principles, and privacy and safety measures must be developed and followed. Robust interdisciplinary collaboration is needed to navigate the complexities of safely and effectively integrating LLMs into health care applications, ensuring that these emerging technologies align with the diverse needs and ethical considerations of the health care domain.


Subject(s)
Postpartum Depression, Ecosystem, Female, Humans, Digital Health, Information Dissemination, Language
2.
Appl Clin Inform ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38749477

ABSTRACT

OBJECTIVE: We present a proof-of-concept digital scribe system as an Emergency Department (ED) consultation call-based clinical conversation summarization pipeline to support clinical documentation, and report its performance. MATERIALS AND METHODS: We use four pre-trained large language models to establish the digital scribe system: T5-small, T5-base, PEGASUS-PubMed, and BART-Large-CNN, via zero-shot and fine-tuning approaches. Our dataset includes 100 referral conversations among ED clinicians and medical records. We report ROUGE-1, ROUGE-2, and ROUGE-L to compare model performance. In addition, we annotated transcriptions to assess the quality of generated summaries. RESULTS: The fine-tuned BART-Large-CNN model demonstrates the strongest summarization performance, with the highest ROUGE scores (ROUGE-1 F1=0.49, ROUGE-2 F1=0.23, ROUGE-L F1=0.35). In contrast, PEGASUS-PubMed lags notably (ROUGE-1 F1=0.28, ROUGE-2 F1=0.11, ROUGE-L F1=0.22). BART-Large-CNN's performance drops by more than 50% with the zero-shot approach. Annotations show that BART-Large-CNN achieves 71.4% recall in identifying key information and a 67.7% accuracy rate. DISCUSSION: The BART-Large-CNN model demonstrates a strong grasp of clinical dialogue structure, as indicated by its performance with and without fine-tuning. Despite some instances of high recall, the model's performance is variable, particularly in achieving consistent correctness, suggesting room for refinement. Its recall also varies across information categories. CONCLUSION: The study provides evidence for the potential of AI-assisted tools to support clinical documentation. Future work should expand the research scope with additional language models and hybrid approaches, and include comparative analyses of documentation burden and human factors.
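As a rough illustration of the zero-shot arm of such a pipeline, the sketch below runs the publicly available facebook/bart-large-cnn checkpoint from Hugging Face Transformers on an invented ED referral snippet; the transcript and length limits are assumptions for illustration, not the study's data or exact configuration.

```python
# Minimal sketch of zero-shot clinical conversation summarization with a
# pre-trained BART-Large-CNN checkpoint; the transcript below is a toy
# stand-in for one transcribed ED referral call, not study data.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

transcript = (
    "Caller: This is the ED, I have a 67-year-old with chest pain and an "
    "elevated troponin. Consultant: Any ECG changes? Caller: ST depression "
    "in the lateral leads. Consultant: Start heparin and we'll admit to cardiology."
)

# Length limits are illustrative; real pipelines would tune them to note length.
summary = summarizer(transcript, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])
```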

3.
PLOS Digit Health ; 3(5): e0000510, 2024 May.
Article in English | MEDLINE | ID: mdl-38743686

ABSTRACT

Voice assistant technologies (VAT) have become part of daily life, serving as virtual assistants that complete requested tasks. Integrating VAT into dental offices has the potential to improve productivity and hygiene practices. Before such innovations are adopted in dental settings, it is crucial to evaluate their applicability. This study aims to assess dentists' perceptions of VAT and the factors influencing their intention to use it in a clinical setting. A survey and research model were designed based on an extended Unified Theory of Acceptance and Use of Technology (UTAUT). The survey was sent to 7,544 Ohio-licensed dentists by email. The data were analyzed and reported using descriptive statistics, model reliability testing, and partial least squares regression (PLSR) to explain dentists' behavioral intention (BI) to use VAT. In total, 257 participants completed the survey. The model accounted for 74.2% of the variance in BI to use VAT. Performance expectancy and perceived enjoyment had a significant positive influence on BI to use VAT, while perceived risk had a significant negative influence. Self-efficacy significantly influenced perceived enjoyment, accounting for 35.5% of its variance. This investigation reveals that performance expectancy and enjoyment are key determinants of dentists' decision to adopt VAT, and that privacy concerns also play a crucial role in its acceptance. This study represents the first documented inquiry into dentists' reception of VAT, laying groundwork for future research and implementation strategies.
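For readers unfamiliar with the modeling step, the following is a minimal sketch of a partial least squares regression of behavioral intention on UTAUT-style constructs using scikit-learn; the construct scores are simulated, and the study's actual survey data and modeling software are not reproduced here.

```python
# Illustrative PLSR of behavioral intention (BI) on simulated UTAUT constructs.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 257  # number of completed surveys reported in the study

# Simulated per-respondent construct scores on a 1-5 Likert-style scale.
performance_expectancy = rng.uniform(1, 5, n)
perceived_enjoyment = rng.uniform(1, 5, n)
perceived_risk = rng.uniform(1, 5, n)
X = np.column_stack([performance_expectancy, perceived_enjoyment, perceived_risk])

# Simulated BI: positive weight on expectancy/enjoyment, negative on risk.
bi = (0.5 * performance_expectancy + 0.4 * perceived_enjoyment
      - 0.3 * perceived_risk + rng.normal(0, 0.5, n))

pls = PLSRegression(n_components=2)
pls.fit(X, bi)
print("R^2:", pls.score(X, bi))            # share of BI variance explained
print("Coefficients:", pls.coef_.ravel())  # signs indicate direction of influence
```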

4.
JMIR Hum Factors ; 11: e57114, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39028995

ABSTRACT

BACKGROUND: Health outcomes are significantly influenced by unmet social needs. Although screening for social needs has become common in health care settings, linkage to resources after needs are identified is often poor. The structural barriers (eg, staffing, time, and space) to addressing social needs could be overcome by a technology-based solution. OBJECTIVE: This study aims to present the design and evaluation of a chatbot, DAPHNE (Dialog-Based Assistant Platform for Healthcare and Needs Ecosystem), which screens for social needs and links patients and families to resources. METHODS: This research used a three-stage study approach: (1) an end-user survey to understand unmet needs and perceptions toward chatbots, (2) iterative design with interdisciplinary stakeholder groups, and (3) a feasibility and usability assessment. In study 1, a web-based survey was conducted with low-income US resident households (n=201). In study 2, web-based sessions were held with an interdisciplinary group of stakeholders (n=10), using thematic and content analysis to inform the chatbot's design and development. Finally, in study 3, feasibility and usability were assessed through a web-based survey and focus group interviews following scenario-based usability testing with community health workers (family advocates; n=4) and social workers (n=9). We reported descriptive statistics and chi-square test results for the household survey. Content analysis and thematic analysis were used to analyze qualitative data. The usability score was reported descriptively. RESULTS: Among the survey participants, employed and younger individuals reported a higher likelihood of using a chatbot to address social needs, in contrast to the oldest age group. Regarding the chatbot's design, stakeholders emphasized the importance of provider-technology collaboration, inclusive conversational design, and user education. Participants found that the chatbot's capabilities met expectations and that the chatbot was easy to use (System Usability Scale score=72/100). However, there were common concerns about the accuracy of suggested resources, electronic health record integration, and trust in a chatbot. CONCLUSIONS: Chatbots can provide personalized feedback for families to identify and meet social needs. Our study highlights the importance of user-centered iterative design and development of chatbots for social needs. Future research should examine the efficacy, cost-effectiveness, and scalability of chatbot interventions to address social needs.
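As context for the reported usability result, the short sketch below shows the standard scoring rule for the 10-item System Usability Scale, which yields a 0-100 score such as the 72/100 reported here; the example responses are invented, not the study's data.

```python
# Standard System Usability Scale (SUS) scoring for one respondent.
def sus_score(responses):
    """responses: list of 10 answers on a 1-5 scale (1 = strongly disagree)."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered negatively.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5  # scale the 0-40 raw sum to 0-100

example = [4, 2, 4, 2, 4, 2, 4, 3, 4, 2]  # made-up respondent
print(sus_score(example))  # 72.5 for this example
```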


Subject(s)
Vulnerable Populations, Humans, Surveys and Questionnaires, Female, Needs Assessment, Adult, Male, Focus Groups, Middle Aged
5.
medRxiv ; 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38106162

ABSTRACT

Objective: We present a proof-of-concept digital scribe system as an ED clinical conversation summarization pipeline and report its performance. Materials and Methods: We use four pre-trained large language models to establish the digital scribe system: T5-small, T5-base, PEGASUS-PubMed, and BART-Large-CNN, via zero-shot and fine-tuning approaches. Our dataset includes 100 referral conversations among ED clinicians and medical records. We report ROUGE-1, ROUGE-2, and ROUGE-L to compare model performance. In addition, we annotated transcriptions to assess the quality of generated summaries. Results: The fine-tuned BART-Large-CNN model demonstrates the strongest summarization performance, with the highest ROUGE scores (ROUGE-1 F1=0.49, ROUGE-2 F1=0.23, ROUGE-L F1=0.35). In contrast, PEGASUS-PubMed lags notably (ROUGE-1 F1=0.28, ROUGE-2 F1=0.11, ROUGE-L F1=0.22). BART-Large-CNN's performance drops by more than 50% with the zero-shot approach. Annotations show that BART-Large-CNN achieves 71.4% recall in identifying key information and a 67.7% accuracy rate. Discussion: The BART-Large-CNN model demonstrates a strong grasp of clinical dialogue structure, as indicated by its performance with and without fine-tuning. Despite some instances of high recall, the model's performance is variable, particularly in achieving consistent correctness, suggesting room for refinement. Its recall also varies across information categories. Conclusion: The study provides evidence for the potential of AI-assisted tools to reduce clinical documentation burden. Future work should expand the research scope with larger language models and include comparative analyses of documentation effort and time.
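For readers reproducing this kind of evaluation, the sketch below shows how ROUGE-1/2/L F1 scores of the kind reported above can be computed with the open-source rouge_score package; the reference and generated summaries are invented examples, not the study's transcripts.

```python
# Computing ROUGE-1/2/L F1 for a generated summary against a reference.
from rouge_score import rouge_scorer

reference = "Admit 67-year-old with NSTEMI to cardiology; start heparin."
generated = "67-year-old with chest pain and positive troponin admitted to cardiology on heparin."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, generated)

for name, score in scores.items():
    print(f"{name}: F1={score.fmeasure:.2f}")
```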
