Pesquisa | BVS - MINISTÉRIO DA SAÚDE

1.

Regulating AI in Mental Health: Ethics of Care Perspective.

Tavory, Tamar.

JMIR Ment Health ; 11: e58493, 2024 Sep 19.

Artigo em Inglês | MEDLINE | ID: mdl-39298759

RESUMO

This article contends that the responsible artificial intelligence (AI) approach-which is the dominant ethics approach ruling most regulatory and ethical guidance-falls short because it overlooks the impact of AI on human relationships. Focusing only on responsible AI principles reinforces a narrow concept of accountability and responsibility of companies developing AI. This article proposes that applying the ethics of care approach to AI regulation can offer a more comprehensive regulatory and ethical framework that addresses AI's impact on human relationships. This dual approach is essential for the effective regulation of AI in the domain of mental health care. The article delves into the emergence of the new "therapeutic" area facilitated by AI-based bots, which operate without a therapist. The article highlights the difficulties involved, mainly the absence of a defined duty of care toward users, and shows how implementing ethics of care can establish clear responsibilities for developers. It also sheds light on the potential for emotional manipulation and the risks involved. In conclusion, the article proposes a series of considerations grounded in the ethics of care for the developmental process of AI-powered therapeutic tools.

Assuntos

Inteligência Artificial , Inteligência Artificial/ética , Humanos , Serviços de Saúde Mental/ética , Serviços de Saúde Mental/legislação & jurisprudência , Saúde Mental/ética

2.

Neural Conversational Agent for Weight Loss Counseling: Protocol for an Implementation and Feasibility Study.

Kotov, Alexander; Idalski Carcone, April; Towner, Elizabeth.

JMIR Res Protoc ; 13: e60361, 2024 Sep 20.

Artigo em Inglês | MEDLINE | ID: mdl-39303273

RESUMO

BACKGROUND: Obesity is a common, serious and costly chronic disease. Current clinical practice guidelines recommend that providers augment the longitudinal care of people living with obesity with consistent support for the development of self-efficacy and motivation to modify their lifestyle behaviors. Lifestyle behavior change aligns with the goals of motivational interviewing (MI), a client-centered yet directive counseling modality. However, training health care providers to be proficient in MI is expensive and time-consuming, resulting in a lack of trained counselors and limiting the widespread adoption of MI in clinical practice. Artificial intelligence (AI) counselors accessible via the internet can help circumvent these barriers. OBJECTIVE: The primary objective is to explore the feasibility of conducting unscripted MI-consistent counseling using Neural Agent for Obesity Motivational Interviewing (NAOMI), a large language model (LLM)-based web app for weight loss counseling. The secondary objectives are to test the acceptability and usability of NAOMI's counseling and examine its ability to shift motivational precursors in a sample of patients with overweight and obesity recruited from primary care clinics. METHODS: NAOMI will be developed based on recent advances in deep learning in four stages. In stages 1 and 2, NAOMI will be implemented using an open-source foundation LLM and (1) few-shot learning based on a prompt with task-specific instructions and (2) domain adaptation strategy based on fine-tuning LLM using a large corpus of general psychotherapy and MI treatment transcripts. In stages 3 and 4, we will refine the best of these 2 approaches. Each NAOMI version will be evaluated using a mixed methods approach in which 10 adults (18-65 years) meeting the criteria for overweight or obesity (25.0≥BMI≤39.9) interact with NAOMI and provide feedback. NAOMI's fidelity to the MI framework will be assessed using the Motivational Interviewing Treatment Integrity scale. Participants' general perceptions of AI conversational agents and NAOMI specifically will be assessed via Pre- and Post-Interaction Questionnaires. Motivational precursors, such as participants' confidence, importance, and readiness for changing lifestyle behaviors (eg, diet and activity), will be measured before and after the interaction, and 1 week later. A qualitative analysis of changes in the measures of perceptions of AI agents and counselors and motivational precursors will be performed. Participants will rate NAOMI's usability and empathic skills post interaction via questionnaire-based assessments along with providing feedback about their experience with NAOMI via a qualitative interview. RESULTS: NAOMI (version 1.0) has been developed. Participant recruitment will commence in September 2024. Data collection activities are expected to conclude in May 2025. CONCLUSIONS: If proven effective, LLM-based counseling agents can become a cost-effective approach for addressing the obesity epidemic at a public health level. They can also have a broad, transformative impact on the delivery of MI and other psychotherapeutic treatment modalities extending their reach and broadening access. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/60361.

Assuntos

Aconselhamento , Estudos de Viabilidade , Entrevista Motivacional , Obesidade , Humanos , Aconselhamento/métodos , Entrevista Motivacional/métodos , Obesidade/terapia , Obesidade/psicologia , Adulto , Masculino , Feminino , Redução de Peso , Pessoa de Meia-Idade , Programas de Redução de Peso/métodos

3.

Usability, Engagement, and Report Usefulness of Chatbot-Based Family Health History Data Collection: Mixed Methods Analysis.

Nguyen, Michelle Hoang; Sedoc, João; Taylor, Casey Overby.

J Med Internet Res ; 26: e55164, 2024 Sep 30.

Artigo em Inglês | MEDLINE | ID: mdl-39348188

RESUMO

BACKGROUND: Family health history (FHx) is an important predictor of a person's genetic risk but is not collected by many adults in the United States. OBJECTIVE: This study aims to test and compare the usability, engagement, and report usefulness of 2 web-based methods to collect FHx. METHODS: This mixed methods study compared FHx data collection using a flow-based chatbot (KIT; the curious interactive test) and a form-based method. KIT's design was optimized to reduce user burden. We recruited and randomized individuals from 2 crowdsourced platforms to 1 of the 2 FHx methods. All participants were asked to complete a questionnaire to assess the method's usability, the usefulness of a report summarizing their experience, user-desired chatbot enhancements, and general user experience. Engagement was studied using log data collected by the methods. We used qualitative findings from analyzing free-text comments to supplement the primary quantitative results. RESULTS: Participants randomized to KIT reported higher usability than those randomized to the form, with a mean System Usability Scale score of 80.2 versus 61.9 (P<.001), respectively. The engagement analysis reflected design differences in the onboarding process. KIT users spent less time entering FHx information and reported more conditions than form users (mean 5.90 vs 7.97 min; P=.04; and mean 7.8 vs 10.1 conditions; P=.04). Both KIT and form users somewhat agreed that the report was useful (Likert scale ratings of 4.08 and 4.29, respectively). Among desired enhancements, personalization was the highest-rated feature (188/205, 91.7% rated medium- to high-priority). Qualitative analyses revealed positive and negative characteristics of both KIT and the form-based method. Among respondents randomized to KIT, most indicated it was easy to use and navigate and that they could respond to and understand user prompts. Negative comments addressed KIT's personality, conversational pace, and ability to manage errors. For KIT and form respondents, qualitative results revealed common themes, including a desire for more information about conditions and a mutual appreciation for the multiple-choice button response format. Respondents also said they wanted to report health information beyond KIT's prompts (eg, personal health history) and for KIT to provide more personalized responses. CONCLUSIONS: We showed that KIT provided a usable way to collect FHx. We also identified design considerations to improve chatbot-based FHx data collection: First, the final report summarizing the FHx collection experience should be enhanced to provide more value for patients. Second, the onboarding chatbot prompt may impact data quality and should be carefully considered. Finally, we highlighted several areas that could be improved by moving from a flow-based chatbot to a large language model implementation strategy.

Assuntos

Anamnese , Humanos , Feminino , Masculino , Anamnese/métodos , Anamnese/estatística & dados numéricos , Adulto , Saúde da Família , Inquéritos e Questionários , Pessoa de Meia-Idade , Coleta de Dados/métodos , Internet

4.

Health Professionals' Views on the Use of Conversational Agents for Health Care: Qualitative Descriptive Study.

MacNeill, A Luke; MacNeill, Lillian; Luke, Alison; Doucet, Shelley.

J Med Internet Res ; 26: e49387, 2024 Sep 25.

Artigo em Inglês | MEDLINE | ID: mdl-39320936

RESUMO

BACKGROUND: In recent years, there has been an increase in the use of conversational agents for health promotion and service delivery. To date, health professionals' views on the use of this technology have received limited attention in the literature. OBJECTIVE: The purpose of this study was to gain a better understanding of how health professionals view the use of conversational agents for health care. METHODS: Physicians, nurses, and regulated mental health professionals were recruited using various web-based methods. Participants were interviewed individually using the Zoom (Zoom Video Communications, Inc) videoconferencing platform. Interview questions focused on the potential benefits and risks of using conversational agents for health care, as well as the best way to integrate conversational agents into the health care system. Interviews were transcribed verbatim and uploaded to NVivo (version 12; QSR International, Inc) for thematic analysis. RESULTS: A total of 24 health professionals participated in the study (19 women, 5 men; mean age 42.75, SD 10.71 years). Participants said that the use of conversational agents for health care could have certain benefits, such as greater access to care for patients or clients and workload support for health professionals. They also discussed potential drawbacks, such as an added burden on health professionals (eg, program familiarization) and the limited capabilities of these programs. Participants said that conversational agents could be used for routine or basic tasks, such as screening and assessment, providing information and education, and supporting individuals between appointments. They also said that health professionals should have some oversight in terms of the development and implementation of these programs. CONCLUSIONS: The results of this study provide insight into health professionals' views on the use of conversational agents for health care, particularly in terms of the benefits and drawbacks of these programs and how they should be integrated into the health care system. These collective findings offer useful information and guidance to stakeholders who have an interest in the development and implementation of this technology.

Assuntos

Pessoal de Saúde , Pesquisa Qualitativa , Humanos , Feminino , Masculino , Adulto , Pessoal de Saúde/psicologia , Pessoa de Meia-Idade , Comunicação , Atitude do Pessoal de Saúde , Comunicação por Videoconferência , Atenção à Saúde

5.

A conversational agent for enhanced Self-Management after cardiothoracic surgery.

Martins, Ana; Velez Lapão, Luís; Nunes, Isabel L; Paula Giordano, Ana; Semedo, Helena; Vital, Clara; Silva, Raquel; Coelho, Pedro; Londral, Ana.

Int J Med Inform ; 192: 105640, 2024 Sep 24.

Artigo em Inglês | MEDLINE | ID: mdl-39321492

RESUMO

BACKGROUND: Enhanced self-management is crucial for long-term survival following cardiothoracic surgery. OBJECTIVES: This study aimed to develop a conversational agent to enhance patient self-management after cardiothoracic surgery. METHODOLOGY: The solution was designed and implemented following the Design Science Research Methodology. A pilot study was conducted at the hospital to assess the feasibility, usability, and perceived effectiveness of the solution. Feedback was gathered to inform further interactions. Additionally, a focus group with clinicians was conducted to evaluate the acceptability of the solution, integrating insights from the pilot study. RESULTS: The conversational agent, implemented using a rule-based model, was successfully tested with patients in the cardiothoracic surgery unit (n = 4). Patients received one month of text messages reinforcing clinical team recommendations on a healthy diet and regular physical activity. The system received a high usability score, and two patients suggested adding a feature to answer user prompts for future improvements. The focus group feedback indicated that while the solution met the initial requirements, further testing with a larger patient cohort is necessary to establish personalized profiles. Moreover, clinicians recommended that future iterations prioritize enhanced personalization and interoperability with other hospital platforms. Additionally, while the use of artificial generative intelligence was seen as relevant for content personalization, clinicians expressed concerns regarding content safety, highlighting the necessity for rigorous testing. CONCLUSIONS: This study marks a significant step towards enhancing post-cardiothoracic surgery care through conversational agents. The integration of a diversity of stakeholder knowledge enriches the solution, grants ownership and ensures its sustainability. Future research should focus on automating message generation and delivery based on patient data and environmental factors. While the integration of artificial generative intelligence holds promise for enhancing patient interaction, ensuring the safety of its content is essential.

6.

Empathic Conversational Agent Platform Designs and Their Evaluation in the Context of Mental Health: Systematic Review.

Sanjeewa, Ruvini; Iyer, Ravi; Apputhurai, Pragalathan; Wickramasinghe, Nilmini; Meyer, Denny.

JMIR Ment Health ; 11: e58974, 2024 Sep 09.

Artigo em Inglês | MEDLINE | ID: mdl-39250799

RESUMO

BACKGROUND: The demand for mental health (MH) services in the community continues to exceed supply. At the same time, technological developments make the use of artificial intelligence-empowered conversational agents (CAs) a real possibility to help fill this gap. OBJECTIVE: The objective of this review was to identify existing empathic CA design architectures within the MH care sector and to assess their technical performance in detecting and responding to user emotions in terms of classification accuracy. In addition, the approaches used to evaluate empathic CAs within the MH care sector in terms of their acceptability to users were considered. Finally, this review aimed to identify limitations and future directions for empathic CAs in MH care. METHODS: A systematic literature search was conducted across 6 academic databases to identify journal articles and conference proceedings using search terms covering 3 topics: "conversational agents," "mental health," and "empathy." Only studies discussing CA interventions for the MH care domain were eligible for this review, with both textual and vocal characteristics considered as possible data inputs. Quality was assessed using appropriate risk of bias and quality tools. RESULTS: A total of 19 articles met all inclusion criteria. Most (12/19, 63%) of these empathic CA designs in MH care were machine learning (ML) based, with 26% (5/19) hybrid engines and 11% (2/19) rule-based systems. Among the ML-based CAs, 47% (9/19) used neural networks, with transformer-based architectures being well represented (7/19, 37%). The remaining 16% (3/19) of the ML models were unspecified. Technical assessments of these CAs focused on response accuracies and their ability to recognize, predict, and classify user emotions. While single-engine CAs demonstrated good accuracy, the hybrid engines achieved higher accuracy and provided more nuanced responses. Of the 19 studies, human evaluations were conducted in 16 (84%), with only 5 (26%) focusing directly on the CA's empathic features. All these papers used self-reports for measuring empathy, including single or multiple (scale) ratings or qualitative feedback from in-depth interviews. Only 1 (5%) paper included evaluations by both CA users and experts, adding more value to the process. CONCLUSIONS: The integration of CA design and its evaluation is crucial to produce empathic CAs. Future studies should focus on using a clear definition of empathy and standardized scales for empathy measurement, ideally including expert assessment. In addition, the diversity in measures used for technical assessment and evaluation poses a challenge for comparing CA performances, which future research should also address. However, CAs with good technical and empathic performance are already available to users of MH care services, showing promise for new applications, such as helpline services.

Assuntos

Empatia , Serviços de Saúde Mental , Humanos , Inteligência Artificial

7.

Entanglements of Technologies, Agency and Selfhood: Exploring the Complexity in Attitudes Toward Mental Health Chatbots.

Meadows, Robert; Hine, Christine.

Cult Med Psychiatry ; 2024 Aug 17.

Artigo em Inglês | MEDLINE | ID: mdl-39153178

RESUMO

Whilst chatbots for mental health are becoming increasingly prevalent, research on user experiences and expectations is relatively scarce and also equivocal on their acceptability and utility. This paper asks how people formulate their understandings of what might be appropriate in this space. We draw on data from a group of non-users who have experienced a need for support, and so can imagine self as therapeutic target-enabling us to tap into their imaginative speculations of the self in relation to the chatbot other and the forms of agency they see as being at play; unconstrained by a specific actual chatbot. Analysis points towards ambiguity over some key issues: whether the apps were seen as having a role in specific episodes of mental health or in relation to an ongoing project of supporting wellbeing; whether the chatbot could be viewed as having a therapeutic agency or was a mere tool; and how far these issues related to matters of the user's personal qualities or the specific nature of the mental health condition. A range of traditions, norms and practices were used to construct diverse expectations on whether chatbots could offer a solution to cost-effective mental health support at scale.

8.

Transforming Health Care Through Chatbots for Medical History-Taking and Future Directions: Comprehensive Systematic Review.

Hindelang, Michael; Sitaru, Sebastian; Zink, Alexander.

JMIR Med Inform ; 12: e56628, 2024 Aug 29.

Artigo em Inglês | MEDLINE | ID: mdl-39207827

RESUMO

BACKGROUND: The integration of artificial intelligence and chatbot technology in health care has attracted significant attention due to its potential to improve patient care and streamline history-taking. As artificial intelligence-driven conversational agents, chatbots offer the opportunity to revolutionize history-taking, necessitating a comprehensive examination of their impact on medical practice. OBJECTIVE: This systematic review aims to assess the role, effectiveness, usability, and patient acceptance of chatbots in medical history-taking. It also examines potential challenges and future opportunities for integration into clinical practice. METHODS: A systematic search included PubMed, Embase, MEDLINE (via Ovid), CENTRAL, Scopus, and Open Science and covered studies through July 2024. The inclusion and exclusion criteria for the studies reviewed were based on the PICOS (participants, interventions, comparators, outcomes, and study design) framework. The population included individuals using health care chatbots for medical history-taking. Interventions focused on chatbots designed to facilitate medical history-taking. The outcomes of interest were the feasibility, acceptance, and usability of chatbot-based medical history-taking. Studies not reporting on these outcomes were excluded. All study designs except conference papers were eligible for inclusion. Only English-language studies were considered. There were no specific restrictions on study duration. Key search terms included "chatbot*," "conversational agent*," "virtual assistant," "artificial intelligence chatbot," "medical history," and "history-taking." The quality of observational studies was classified using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) criteria (eg, sample size, design, data collection, and follow-up). The RoB 2 (Risk of Bias) tool assessed areas and the levels of bias in randomized controlled trials (RCTs). RESULTS: The review included 15 observational studies and 3 RCTs and synthesized evidence from different medical fields and populations. Chatbots systematically collect information through targeted queries and data retrieval, improving patient engagement and satisfaction. The results show that chatbots have great potential for history-taking and that the efficiency and accessibility of the health care system can be improved by 24/7 automated data collection. Bias assessments revealed that of the 15 observational studies, 5 (33%) studies were of high quality, 5 (33%) studies were of moderate quality, and 5 (33%) studies were of low quality. Of the RCTs, 2 had a low risk of bias, while 1 had a high risk. CONCLUSIONS: This systematic review provides critical insights into the potential benefits and challenges of using chatbots for medical history-taking. The included studies showed that chatbots can increase patient engagement, streamline data collection, and improve health care decision-making. For effective integration into clinical practice, it is crucial to design user-friendly interfaces, ensure robust data security, and maintain empathetic patient-physician interactions. Future research should focus on refining chatbot algorithms, improving their emotional intelligence, and extending their application to different health care settings to realize their full potential in modern medicine. TRIAL REGISTRATION: PROSPERO CRD42023410312; www.crd.york.ac.uk/prospero.

9.

A Language Model-Powered Simulated Patient With Automated Feedback for History Taking: Prospective Study.

Holderried, Friederike; Stegemann-Philipps, Christian; Herrmann-Werner, Anne; Festl-Wietek, Teresa; Holderried, Martin; Eickhoff, Carsten; Mahling, Moritz.

JMIR Med Educ ; 10: e59213, 2024 Aug 16.

Artigo em Inglês | MEDLINE | ID: mdl-39150749

RESUMO

BACKGROUND: Although history taking is fundamental for diagnosing medical conditions, teaching and providing feedback on the skill can be challenging due to resource constraints. Virtual simulated patients and web-based chatbots have thus emerged as educational tools, with recent advancements in artificial intelligence (AI) such as large language models (LLMs) enhancing their realism and potential to provide feedback. OBJECTIVE: In our study, we aimed to evaluate the effectiveness of a Generative Pretrained Transformer (GPT) 4 model to provide structured feedback on medical students' performance in history taking with a simulated patient. METHODS: We conducted a prospective study involving medical students performing history taking with a GPT-powered chatbot. To that end, we designed a chatbot to simulate patients' responses and provide immediate feedback on the comprehensiveness of the students' history taking. Students' interactions with the chatbot were analyzed, and feedback from the chatbot was compared with feedback from a human rater. We measured interrater reliability and performed a descriptive analysis to assess the quality of feedback. RESULTS: Most of the study's participants were in their third year of medical school. A total of 1894 question-answer pairs from 106 conversations were included in our analysis. GPT-4's role-play and responses were medically plausible in more than 99% of cases. Interrater reliability between GPT-4 and the human rater showed "almost perfect" agreement (Cohen κ=0.832). Less agreement (κ<0.6) detected for 8 out of 45 feedback categories highlighted topics about which the model's assessments were overly specific or diverged from human judgement. CONCLUSIONS: The GPT model was effective in providing structured feedback on history-taking dialogs provided by medical students. Although we unraveled some limitations regarding the specificity of feedback for certain feedback categories, the overall high agreement with human raters suggests that LLMs can be a valuable tool for medical education. Our findings, thus, advocate the careful integration of AI-driven feedback mechanisms in medical training and highlight important aspects when LLMs are used in that context.

Assuntos

Anamnese , Simulação de Paciente , Estudantes de Medicina , Humanos , Estudos Prospectivos , Anamnese/métodos , Anamnese/normas , Estudantes de Medicina/psicologia , Feminino , Masculino , Competência Clínica/normas , Inteligência Artificial , Retroalimentação , Reprodutibilidade dos Testes , Educação de Graduação em Medicina/métodos

10.

Influence of Model Evolution and System Roles on ChatGPT's Performance in Chinese Medical Licensing Exams: Comparative Study.

Ming, Shuai; Guo, Qingge; Cheng, Wenjun; Lei, Bo.

JMIR Med Educ ; 10: e52784, 2024 Aug 13.

Artigo em Inglês | MEDLINE | ID: mdl-39140269

RESUMO

Background: With the increasing application of large language models like ChatGPT in various industries, its potential in the medical domain, especially in standardized examinations, has become a focal point of research. Objective: The aim of this study is to assess the clinical performance of ChatGPT, focusing on its accuracy and reliability in the Chinese National Medical Licensing Examination (CNMLE). Methods: The CNMLE 2022 question set, consisting of 500 single-answer multiple choices questions, were reclassified into 15 medical subspecialties. Each question was tested 8 to 12 times in Chinese on the OpenAI platform from April 24 to May 15, 2023. Three key factors were considered: the version of GPT-3.5 and 4.0, the prompt's designation of system roles tailored to medical subspecialties, and repetition for coherence. A passing accuracy threshold was established as 60%. The χ2 tests and κ values were employed to evaluate the model's accuracy and consistency. Results: GPT-4.0 achieved a passing accuracy of 72.7%, which was significantly higher than that of GPT-3.5 (54%; P<.001). The variability rate of repeated responses from GPT-4.0 was lower than that of GPT-3.5 (9% vs 19.5%; P<.001). However, both models showed relatively good response coherence, with κ values of 0.778 and 0.610, respectively. System roles numerically increased accuracy for both GPT-4.0 (0.3%-3.7%) and GPT-3.5 (1.3%-4.5%), and reduced variability by 1.7% and 1.8%, respectively (P>.05). In subgroup analysis, ChatGPT achieved comparable accuracy among different question types (P>.05). GPT-4.0 surpassed the accuracy threshold in 14 of 15 subspecialties, while GPT-3.5 did so in 7 of 15 on the first response. Conclusions: GPT-4.0 passed the CNMLE and outperformed GPT-3.5 in key areas such as accuracy, consistency, and medical subspecialty expertise. Adding a system role insignificantly enhanced the model's reliability and answer coherence. GPT-4.0 showed promising potential in medical education and clinical practice, meriting further study.

Assuntos

Avaliação Educacional , Licenciamento em Medicina , Humanos , China , Avaliação Educacional/métodos , Avaliação Educacional/normas , Reprodutibilidade dos Testes , Competência Clínica/normas

11.

Deep learning-based dimensional emotion recognition for conversational agent-based cognitive behavioral therapy.

Striegl, Julian; Richter, Jordan Wenzel; Grossmann, Leoni; Bråstad, Björn; Gotthardt, Marie; Rück, Christian; Wallert, John; Loitsch, Claudia.

PeerJ Comput Sci ; 10: e2104, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38983201

RESUMO

Internet-based cognitive behavioral therapy (iCBT) offers a scalable, cost-effective, accessible, and low-threshold form of psychotherapy. Recent advancements explored the use of conversational agents such as chatbots and voice assistants to enhance the delivery of iCBT. These agents can deliver iCBT-based exercises, recognize and track emotional states, assess therapy progress, convey empathy, and potentially predict long-term therapy outcome. However, existing systems predominantly utilize categorical approaches for emotional modeling, which can oversimplify the complexity of human emotional states. To address this, we developed a transformer-based model for dimensional text-based emotion recognition, fine-tuned with a novel, comprehensive dimensional emotion dataset comprising 75,503 samples. This model significantly outperforms existing state-of-the-art models in detecting the dimensions of valence, arousal, and dominance, achieving a Pearson correlation coefficient of r = 0.90, r = 0.77, and r = 0.64, respectively. Furthermore, a feasibility study involving 20 participants confirmed the model's technical effectiveness and its usability, acceptance, and empathic understanding in a conversational agent-based iCBT setting, marking a substantial improvement in personalized and effective therapy experiences.

12.

Assessing GPT-4's Performance in Delivering Medical Advice: Comparative Analysis With Human Experts.

Jo, Eunbeen; Song, Sanghoun; Kim, Jong-Ho; Lim, Subin; Kim, Ju Hyeon; Cha, Jung-Joon; Kim, Young-Min; Joo, Hyung Joon.

JMIR Med Educ ; 10: e51282, 2024 Jul 08.

Artigo em Inglês | MEDLINE | ID: mdl-38989848

RESUMO

Background: Accurate medical advice is paramount in ensuring optimal patient care, and misinformation can lead to misguided decisions with potentially detrimental health outcomes. The emergence of large language models (LLMs) such as OpenAI's GPT-4 has spurred interest in their potential health care applications, particularly in automated medical consultation. Yet, rigorous investigations comparing their performance to human experts remain sparse. Objective: This study aims to compare the medical accuracy of GPT-4 with human experts in providing medical advice using real-world user-generated queries, with a specific focus on cardiology. It also sought to analyze the performance of GPT-4 and human experts in specific question categories, including drug or medication information and preliminary diagnoses. Methods: We collected 251 pairs of cardiology-specific questions from general users and answers from human experts via an internet portal. GPT-4 was tasked with generating responses to the same questions. Three independent cardiologists (SL, JHK, and JJC) evaluated the answers provided by both human experts and GPT-4. Using a computer interface, each evaluator compared the pairs and determined which answer was superior, and they quantitatively measured the clarity and complexity of the questions as well as the accuracy and appropriateness of the responses, applying a 3-tiered grading scale (low, medium, and high). Furthermore, a linguistic analysis was conducted to compare the length and vocabulary diversity of the responses using word count and type-token ratio. Results: GPT-4 and human experts displayed comparable efficacy in medical accuracy ("GPT-4 is better" at 132/251, 52.6% vs "Human expert is better" at 119/251, 47.4%). In accuracy level categorization, humans had more high-accuracy responses than GPT-4 (50/237, 21.1% vs 30/238, 12.6%) but also a greater proportion of low-accuracy responses (11/237, 4.6% vs 1/238, 0.4%; P=.001). GPT-4 responses were generally longer and used a less diverse vocabulary than those of human experts, potentially enhancing their comprehensibility for general users (sentence count: mean 10.9, SD 4.2 vs mean 5.9, SD 3.7; P<.001; type-token ratio: mean 0.69, SD 0.07 vs mean 0.79, SD 0.09; P<.001). Nevertheless, human experts outperformed GPT-4 in specific question categories, notably those related to drug or medication information and preliminary diagnoses. These findings highlight the limitations of GPT-4 in providing advice based on clinical experience. Conclusions: GPT-4 has shown promising potential in automated medical consultation, with comparable medical accuracy to human experts. However, challenges remain particularly in the realm of nuanced clinical judgment. Future improvements in LLMs may require the integration of specific clinical reasoning pathways and regulatory oversight for safe use. Further research is needed to understand the full potential of LLMs across various medical specialties and conditions.

Assuntos

Inteligência Artificial , Cardiologia , Humanos , Cardiologia/normas

13.

Evaluating the Potential and Pitfalls of AI-Powered Conversational Agents as Humanlike Virtual Health Carers in the Remote Management of Noncommunicable Diseases: Scoping Review.

Anisha, Sadia Azmin; Sen, Arkendu; Bain, Chris.

J Med Internet Res ; 26: e56114, 2024 Jul 16.

Artigo em Inglês | MEDLINE | ID: mdl-39012688

RESUMO

BACKGROUND: The rising prevalence of noncommunicable diseases (NCDs) worldwide and the high recent mortality rates (74.4%) associated with them, especially in low- and middle-income countries, is causing a substantial global burden of disease, necessitating innovative and sustainable long-term care solutions. OBJECTIVE: This scoping review aims to investigate the impact of artificial intelligence (AI)-based conversational agents (CAs)-including chatbots, voicebots, and anthropomorphic digital avatars-as human-like health caregivers in the remote management of NCDs as well as identify critical areas for future research and provide insights into how these technologies might be used effectively in health care to personalize NCD management strategies. METHODS: A broad literature search was conducted in July 2023 in 6 electronic databases-Ovid MEDLINE, Embase, PsycINFO, PubMed, CINAHL, and Web of Science-using the search terms "conversational agents," "artificial intelligence," and "noncommunicable diseases," including their associated synonyms. We also manually searched gray literature using sources such as ProQuest Central, ResearchGate, ACM Digital Library, and Google Scholar. We included empirical studies published in English from January 2010 to July 2023 focusing solely on health care-oriented applications of CAs used for remote management of NCDs. The narrative synthesis approach was used to collate and summarize the relevant information extracted from the included studies. RESULTS: The literature search yielded a total of 43 studies that matched the inclusion criteria. Our review unveiled four significant findings: (1) higher user acceptance and compliance with anthropomorphic and avatar-based CAs for remote care; (2) an existing gap in the development of personalized, empathetic, and contextually aware CAs for effective emotional and social interaction with users, along with limited consideration of ethical concerns such as data privacy and patient safety; (3) inadequate evidence of the efficacy of CAs in NCD self-management despite a moderate to high level of optimism among health care professionals regarding CAs' potential in remote health care; and (4) CAs primarily being used for supporting nonpharmacological interventions such as behavioral or lifestyle modifications and patient education for the self-management of NCDs. CONCLUSIONS: This review makes a unique contribution to the field by not only providing a quantifiable impact analysis but also identifying the areas requiring imminent scholarly attention for the ethical, empathetic, and efficacious implementation of AI in NCD care. This serves as an academic cornerstone for future research in AI-assisted health care for NCD management. TRIAL REGISTRATION: Open Science Framework; https://doi.org/10.17605/OSF.IO/GU5PX.

Assuntos

Inteligência Artificial , Cuidadores , Doenças não Transmissíveis , Telemedicina , Humanos , Cuidadores/psicologia

14.

Can Large Language Models Replace Therapists? Evaluating Performance at Simple Cognitive Behavioral Therapy Tasks.

Hodson, Nathan; Williamson, Simon.

JMIR AI ; 3: e52500, 2024 Jul 30.

Artigo em Inglês | MEDLINE | ID: mdl-39078696

RESUMO

The advent of large language models (LLMs) such as ChatGPT has potential implications for psychological therapies such as cognitive behavioral therapy (CBT). We systematically investigated whether LLMs could recognize an unhelpful thought, examine its validity, and reframe it to a more helpful one. LLMs currently have the potential to offer reasonable suggestions for the identification and reframing of unhelpful thoughts but should not be relied on to lead CBT delivery.

15.

Conversational Chatbot for Cigarette Smoking Cessation: Results From the 11-Step User-Centered Design Development Process and Randomized Controlled Trial.

Bricker, Jonathan B; Sullivan, Brianna; Mull, Kristin; Santiago-Torres, Margarita; Lavista Ferres, Juan M.

JMIR Mhealth Uhealth ; 12: e57318, 2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-38913882

RESUMO

BACKGROUND: Conversational chatbots are an emerging digital intervention for smoking cessation. No studies have reported on the entire development process of a cessation chatbot. OBJECTIVE: We aim to report results of the user-centered design development process and randomized controlled trial for a novel and comprehensive quit smoking conversational chatbot called QuitBot. METHODS: The 4 years of formative research for developing QuitBot followed an 11-step process: (1) specifying a conceptual model; (2) conducting content analysis of existing interventions (63 hours of intervention transcripts); (3) assessing user needs; (4) developing the chat's persona ("personality"); (5) prototyping content and persona; (6) developing full functionality; (7) programming the QuitBot; (8) conducting a diary study; (9) conducting a pilot randomized controlled trial (RCT); (10) reviewing results of the RCT; and (11) adding a free-form question and answer (QnA) function, based on user feedback from pilot RCT results. The process of adding a QnA function itself involved a three-step process: (1) generating QnA pairs, (2) fine-tuning large language models (LLMs) on QnA pairs, and (3) evaluating the LLM outputs. RESULTS: We developed a quit smoking program spanning 42 days of 2- to 3-minute conversations covering topics ranging from motivations to quit, setting a quit date, choosing Food and Drug Administration-approved cessation medications, coping with triggers, and recovering from lapses and relapses. In a pilot RCT with 96% three-month outcome data retention, QuitBot demonstrated high user engagement and promising cessation rates compared to the National Cancer Institute's SmokefreeTXT text messaging program, particularly among those who viewed all 42 days of program content: 30-day, complete-case, point prevalence abstinence rates at 3-month follow-up were 63% (39/62) for QuitBot versus 38.5% (45/117) for SmokefreeTXT (odds ratio 2.58, 95% CI 1.34-4.99; P=.005). However, Facebook Messenger intermittently blocked participants' access to QuitBot, so we transitioned from Facebook Messenger to a stand-alone smartphone app as the communication channel. Participants' frustration with QuitBot's inability to answer their open-ended questions led to us develop a core conversational feature, enabling users to ask open-ended questions about quitting cigarette smoking and for the QuitBot to respond with accurate and professional answers. To support this functionality, we developed a library of 11,000 QnA pairs on topics associated with quitting cigarette smoking. Model testing results showed that Microsoft's Azure-based QnA maker effectively handled questions that matched our library of 11,000 QnA pairs. A fine-tuned, contextualized GPT-3.5 (OpenAI) responds to questions that are not within our library of QnA pairs. CONCLUSIONS: The development process yielded the first LLM-based quit smoking program delivered as a conversational chatbot. Iterative testing led to significant enhancements, including improvements to the delivery channel. A pivotal addition was the inclusion of a core LLM-supported conversational feature allowing users to ask open-ended questions. TRIAL REGISTRATION: ClinicalTrials.gov NCT03585231; https://clinicaltrials.gov/study/NCT03585231.

Assuntos

Abandono do Hábito de Fumar , Design Centrado no Usuário , Humanos , Abandono do Hábito de Fumar/métodos , Abandono do Hábito de Fumar/psicologia , Masculino , Adulto , Feminino , Pessoa de Meia-Idade

16.

A text-based conversational agent for asthma support: Mixed-methods feasibility study.

Cook, Darren; Peters, Dorian; Moradbakhti, Laura; Su, Ting; Da Re, Marco; Schuller, Bjorn W; Quint, Jennifer; Wong, Ernie; Calvo, Rafael A.

Digit Health ; 10: 20552076241258276, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38894942

RESUMO

Objective: Millions of people in the UK have asthma, yet 70% do not access basic care, leading to the largest number of asthma-related deaths in Europe. Chatbots may extend the reach of asthma support and provide a bridge to traditional healthcare. This study evaluates 'Brisa', a chatbot designed to improve asthma patients' self-assessment and self-management. Methods: We recruited 150 adults with an asthma diagnosis to test our chatbot. Participants were recruited over three waves through social media and a research recruitment platform. Eligible participants had access to 'Brisa' via a WhatsApp or website version for 28 days and completed entry and exit questionnaires to evaluate user experience and asthma control. Weekly symptom tracking, user interaction metrics, satisfaction measures, and qualitative feedback were utilised to evaluate the chatbot's usability and potential effectiveness, focusing on changes in asthma control and self-reported behavioural improvements. Results: 74% of participants engaged with 'Brisa' at least once. High task completion rates were observed: asthma attack risk assessment (86%), voice recording submission (83%) and asthma control tracking (95.5%). Post use, an 8% improvement in asthma control was reported. User satisfaction surveys indicated positive feedback on helpfulness (80%), privacy (87%), trustworthiness (80%) and functionality (84%) but highlighted a need for improved conversational depth and personalisation. Conclusions: The study indicates that chatbots are effective for asthma support, demonstrated by the high usage of features like risk assessment and control tracking, as well as a statistically significant improvement in asthma control. However, lower satisfaction in conversational flexibility highlights rising expectations for chatbot fluency, influenced by advanced models like ChatGPT. Future health-focused chatbots must balance conversational capability with accuracy and safety to maintain engagement and effectiveness.

17.

Translating motivational interviewing for the HPV vaccine into a computable ontology model for automated AI conversational interaction.

Moore, Nicole; Amith, Muhammad; Neumann, Ana C; Hamilton, Jane; Tang, Lu; Savas, Lara S; Tao, Cui.

Ext Abstr Hum Factors Computing Syst ; 20242024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38898884

RESUMO

Human papillomavirus (HPV) vaccinations are lower than expected. To protect the onset of head and neck cancers, innovative strategies to improve the rates are needed. Artificial intelligence may offer some solutions, specifically conversational agents to perform counseling methods. We present our efforts in developing a dialogue model for automating motivational interviewing (MI) to encourage HPV vaccination. We developed a formalized dialogue model for MI using an existing ontology-based framework to manifest a computable representation using OWL2. New utterance classifications were identified along with the ontology that encodes the dialogue model. Our work is available on GitHub under the GPL v.3. We discuss how an ontology-based model of MI can help standardize/formalize MI counseling for HPV vaccine uptake. Our future steps will involve assessing MI fidelity of the ontology model, operationalization, and testing the dialogue model in a simulation with live participants.

18.

Implementation of Anxiety UK's Ask Anxia Chatbot Service: Lessons Learned.

Collins, Luke; Nicholson, Niamh; Lidbetter, Nicky; Smithson, Dave; Baker, Paul.

JMIR Hum Factors ; 11: e53897, 2024 Jun 17.

Artigo em Inglês | MEDLINE | ID: mdl-38885016

RESUMO

Chatbots are increasingly being applied in the context of health care, providing access to services when there are constraints on human resources. Simple, rule-based chatbots are suited to high-volume, repetitive tasks and can therefore be used effectively in providing users with important health information. In this Viewpoint paper, we report on the implementation of a chatbot service called Ask Anxia as part of a wider provision of information and support services offered by the UK national charity, Anxiety UK. We reflect on the changes made to the chatbot over the course of approximately 18 months as the Anxiety UK team monitored its performance and responded to recurrent themes in user queries by developing further information and services. We demonstrate how corpus linguistics can contribute to the evaluation of user queries and the optimization of responses. On the basis of these observations of how Anxiety UK has developed its own chatbot service, we offer recommendations for organizations looking to add automated conversational interfaces to their services.

Assuntos

Ansiedade , Inteligência Artificial , Humanos , Ansiedade/terapia , Ansiedade/psicologia , Reino Unido

19.

On-device query intent prediction with lightweight LLMs to support ubiquitous conversations.

Dubiel, Mateusz; Barghouti, Yasmine; Kudryavtseva, Kristina; Leiva, Luis A.

Sci Rep ; 14(1): 12731, 2024 Jun 03.

Artigo em Inglês | MEDLINE | ID: mdl-38830946

RESUMO

Conversational Agents (CAs) have made their way to providing interactive assistance to users. However, the current dialogue modelling techniques for CAs are predominantly based on hard-coded rules and rigid interaction flows, which negatively affects their flexibility and scalability. Large Language Models (LLMs) can be used as an alternative, but unfortunately they do not always provide good levels of privacy protection for end-users since most of them are running on cloud services. To address these problems, we leverage the potential of transfer learning and study how to best fine-tune lightweight pre-trained LLMs to predict the intent of user queries. Importantly, our LLMs allow for on-device deployment, making them suitable for personalised, ubiquitous, and privacy-preserving scenarios. Our experiments suggest that RoBERTa and XLNet offer the best trade-off considering these constraints. We also show that, after fine-tuning, these models perform on par with ChatGPT. We also discuss the implications of this research for relevant stakeholders, including researchers and practitioners. Taken together, this paper provides insights into LLM suitability for on-device CAs and highlights the middle ground between LLM performance and memory footprint while also considering privacy implications.

20.

The Role of Chatbots in Enhancing Health Care for Older Adults: A Scoping Review.

Zhang, Qian; Wong, Arkers Kwan Ching; Bayuo, Jonathan.

J Am Med Dir Assoc ; 25(9): 105108, 2024 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-38917965

RESUMO

OBJECTIVES: This scoping review aimed to review the characteristics, applications, evaluation approaches, and challenges regarding the use of chatbots in older adults. DESIGN: The scoping review followed the methodological framework by Arksey and O'Malley, with revisions proposed by Levac et al. The findings were reported using the Preferred Reporting Items for Systematic Review and Meta-Analysis Extension for Scoping Reviews checklist. SETTING AND PARTICIPANTS: The reviewed articles primarily focused on older adults, with research conducted in both clinical and nonclinical settings. METHODS: Studies published from January 2010 to May 2023 were searched through 8 databases. A total of 29 studies were identified and evaluated in this review. RESULTS: Results showed that the chatbots were mainly delivered via mobile applications (n = 11), most of them used text as input (n = 16) and output modality (n = 13), and most of them targeted at improving the overall well-being of the older adults (n = 9); most chatbots were designed for fulfilling complex health care needs (n = 7) and health information collection (n = 6). Evaluation approaches of chatbots captured in this review were divided into technical performance, user acceptability, and effectiveness; challenges of applying chatbots to older adults lie in the design of the chatbot, user perception, and operational difficulties. CONCLUSIONS AND IMPLICATIONS: The use of chatbots in the field of older adults is still emerging, with a lack of specifically designed options for older users. Data about the health impact of chatbots as alternative interventions were still limited. More standardized evaluation criteria and robust controlled experiments are needed for further research regarding the effectiveness of chatbots in older adults.

Assuntos

Inteligência Artificial , Atenção à Saúde , Aplicativos Móveis , Idoso , Humanos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA