ABSTRACT
Objective: With the increasing global burden of chronic diseases, there is the potential for conversational agents (CAs) to assist people in actively managing their conditions. This paper reviews different types of CAs used for chronic condition management, delving into their characteristics and the chosen study designs. This paper also discusses the potential of these CAs to enhance the health and well-being of people with chronic conditions. Methods: A search was performed in February 2023 on PubMed, ACM Digital Library, Scopus, and IEEE Xplore. Studies were included if they focused on chronic disease management or prevention and if systems were evaluated on target user groups. Results: The 42 selected studies explored diverse types of CAs across 11 health conditions. Personalization varied, with 25 CAs not adapting message content, while others incorporated user characteristics and real-time context. Only 12 studies used medical records in conjunction with CAs for conditions like diabetes, mental health, cardiovascular issues, and cancer. Despite measurement method variations, the studies predominantly emphasized improved health outcomes and positive user attitudes toward CAs. Conclusions: The results underscore the need for CAs to adapt to evolving patient needs, customize interventions, and incorporate human support and medical records for more effective care. It also highlights the potential of CAs to play a more active role in helping individuals manage their conditions and notes the value of linguistic data generated during user interactions. The analysis acknowledges its limitations and encourages further research into the use and potential of CAs in disease-specific contexts.
ABSTRACT
The study investigated gender bias in GPT-4's assessment of coronary artery disease risk by presenting identical clinical vignettes of men and women with and without psychiatric comorbidities. Results suggest that psychiatric conditions may influence GPT-4's coronary artery disease risk assessment among men and women.
Subject(s)
Sexism , Humans , Female , Male , Sexism/psychology , Cardiovascular Diseases/psychology , Middle Aged , Risk Assessment/methods , Artificial Intelligence , Adult , Heart Disease Risk Factors , Coronary Artery Disease/psychology
ABSTRACT
Objective: Post-traumatic stress disorder (PTSD) is a pervasive health concern affecting millions of individuals. However, there remain significant barriers to providing resources and addressing the needs of individuals living with PTSD. To address this treatment gap, we have collaborated with clinical experts to develop PTSDialogue, a conversational agent (CA) that aims to support effective self-management of PTSD. In this work, we have focused on assessing the feasibility and acceptance of PTSDialogue for individuals living with PTSD. Methods: We conducted semi-structured interviews with individuals living with PTSD (N = 12). Participants were asked about their experiences with PTSDialogue and their perceptions of its usefulness in managing PTSD. We then used bottom-up thematic analysis with a qualitative interpretivist approach to analyze the interview data. Results: All participants expressed that PTSDialogue could be beneficial for supporting PTSD treatment. We also uncovered key opportunities and challenges in using CAs to complement existing clinical practices and support longitudinal self-management of PTSD. We highlight important design features of CAs to provide effective support for this population, including the need for personalization, education, and privacy-sensitive interactions. Conclusion: We demonstrate the acceptability of CAs to support longitudinal self-management of PTSD. Based on these findings, we have outlined design recommendations for technologies aiming to reduce treatment and support gaps for individuals living with serious mental illnesses.
ABSTRACT
BACKGROUND: The growing interest in using ChatGPT (OpenAI, San Francisco, CA) in the medical field highlights the need for in-depth knowledge of its potential and constraints, especially when it comes to anatomy education (AE). Because of its sophisticated natural language processing abilities, it can understand the nuances of anatomical concepts, provide advanced as well as contextually relevant information, and could be a helpful tool for medical students and educators. This study aimed to analyze the capabilities and limitations of ChatGPT and its best possible application in AE. METHODOLOGY: The study comprised 34 questions posed to ChatGPT after acquiring an online subscription to the fourth version (ChatGPT-4). The questions were formulated arbitrarily, by consensus among the researchers. The chatbot's replies were recorded and evaluated with reference to perfection, validity, and appropriateness. RESULTS: ChatGPT was observed to be a useful interactive tool for medical students to comprehend the clinical importance and characteristics of anatomical structures. The chatbot clarified the anatomical basis of ischemic heart disease and adequately tabulated the differences between the arteries and veins. Even though ChatGPT-4 was able to produce images of different anatomical structures, it fell short of accurately displaying the necessary features. Further, the chatbot generated quizzes, including multiple-choice, true-false, fill-in-the-blank, matching, and case-based questions, formulated a relevant overview of the lecture, and also analyzed answers to anatomy questions with adequate reasoning. CONCLUSIONS: ChatGPT can be a useful educational resource for medical students with the potential to play a crucial role in AE if employed in a methodical way. It can provide significant support to anatomy teachers in delivering the medical curriculum and enhance their work, but it cannot replace the educator.
ABSTRACT
As large language models (LLMs) continue to gain popularity due to their human-like traits and the intimacy they offer to users, their societal impact inevitably expands. This leads to a rising need for comprehensive studies to fully understand LLMs and reveal their potential opportunities, drawbacks and overall societal impact. With that in mind, this research conducted an extensive investigation into seven LLMs, aiming to assess the temporal stability of and inter-rater agreement on their responses to personality instruments at two time points. In addition, the LLMs' personality profiles were analysed and compared with human normative data. The findings revealed varying levels of inter-rater agreement in the LLMs' responses over a short time, with some LLMs showing higher agreement (e.g. Llama3 and GPT-4o) than others (e.g. GPT-4 and Gemini). Furthermore, agreement depended on the instrument used as well as on the domain or trait. This implies variable robustness in LLMs' ability to reliably simulate stable personality characteristics. For scales that showed at least fair agreement, LLMs displayed a mostly socially desirable profile in both agentic and communal domains, as well as a prosocial personality profile reflected in higher agreeableness and conscientiousness and lower Machiavellianism. Exhibiting temporal stability and coherent responses on personality traits is crucial for AI systems due to their societal impact and AI safety concerns.
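The test-retest comparison described above can be illustrated with a small agreement calculation. The sketch below is a minimal example, assuming item-level Likert responses (1-5) from two administrations of the same instrument; the responses and the choice of a quadratically weighted kappa are illustrative assumptions, not the study's exact procedure.

```python
# Minimal sketch: test-retest agreement for one LLM's Likert responses
# across two administrations of the same personality instrument.
# Data are hypothetical; the study's instruments and statistics may differ.
from sklearn.metrics import cohen_kappa_score

# Item-level responses (1-5 Likert) at time point 1 and time point 2
t1 = [4, 5, 2, 3, 4, 4, 1, 5, 3, 4]
t2 = [4, 4, 2, 3, 5, 4, 2, 5, 3, 4]

# Quadratically weighted kappa penalises large disagreements more than small ones
kappa = cohen_kappa_score(t1, t2, weights="quadratic")
print(f"Weighted kappa across administrations: {kappa:.2f}")
```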
ABSTRACT
BACKGROUND: We developed MARVIN, an artificial intelligence (AI)-based chatbot that provides 24/7 expert-validated information on self-management-related topics for people with HIV. This study assessed (1) the feasibility of using MARVIN, (2) its usability and acceptability, and (3) four usability subconstructs (perceived ease of use, perceived usefulness, attitude towards use, and behavioural intention to use). METHODS: In a mixed-methods study conducted at the McGill University Health Centre, enrolled participants were asked to have 20 conversations within 3 weeks with MARVIN on predetermined topics and to complete a usability questionnaire. Feasibility, usability, acceptability, and usability subconstructs were examined against predetermined success thresholds. Qualitatively, randomly selected participants were invited to semi-structured focus groups/interviews to discuss their experiences with MARVIN. Barriers and facilitators were identified according to the four usability subconstructs. RESULTS: From March 2021 to April 2022, 28 participants were surveyed after a 3-week testing period, and nine were interviewed. Study retention was 70% (28/40). Mean usability exceeded the threshold (69.9/68), whereas mean acceptability was very close to target (23.8/24). Ratings of attitude towards MARVIN's use were positive (+14%), with the remaining subconstructs exceeding the target (5/7). Facilitators included MARVIN's reliable and useful real-time information support, its easy accessibility, provision of convivial conversations, confidentiality, and perception as being emotionally safe. However, MARVIN's limited comprehension and the use of Facebook as an implementation platform were identified as barriers, along with the need for more conversation topics and new features (e.g., memorization). CONCLUSIONS: The study demonstrated MARVIN's global usability. Our findings show its potential for HIV self-management and provide direction for further development.
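The usability benchmark of 68 reported above matches the conventional System Usability Scale (SUS) cut-off, so the following is a minimal sketch of standard SUS scoring under that assumption; the example responses are invented, and the study's actual questionnaire may differ.

```python
# Minimal sketch of standard System Usability Scale (SUS) scoring,
# assumed here because the reported 68 threshold matches the common SUS benchmark.
# The ten responses below are hypothetical, not study data.
def sus_score(responses):
    """responses: list of 10 Likert ratings (1-5), item order as in the standard SUS."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)  # odd items positive, even items negative
    return total * 2.5  # scale to 0-100

participant = [4, 2, 5, 1, 4, 2, 5, 2, 4, 2]
print(f"SUS = {sus_score(participant):.1f} (benchmark: 68)")
```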
ABSTRACT
Interventions implemented in the digital space play an important role in the response to global concerns about the prevalence of online child sexual abuse. Digital detection software (e.g. Sweetie) utilized to combat this behavior is a well-known example. Far fewer examples of digital interventions focused on its prevention exist. This review sought to identify digital interventions currently being implemented that aim to prevent online child sexual abuse, or intervene early, through deterrence and redirection. Guided by the PRISMA scoping review framework, a search was conducted across four databases, with snowballing from reference lists of selected sources. After exclusion criteria were applied, six sources were selected for review. Findings suggest that digital interventions (e.g. warning messages and chatbots) can be used to deter and redirect individuals at-risk of, or in the early stages of engaging in online child sexual abuse, with greater deterrent effects observed when messaging aligns with situational crime prevention principles. However, limited application and evaluation of these interventions to date constrains inferences regarding the impact of these prevention efforts. To supplement findings, several other emerging examples of digital interventions and conceptual/theoretical works (that did not meet the original inclusion criteria) are cited. Findings of this review should be considered alongside these other examples to inform the ongoing design and scaling up of digital interventions aimed at preventing online child sexual abuse.
ABSTRACT
BACKGROUND: Patients often struggle with determining which outpatient specialist to consult based on their symptoms. Natural language processing models in health care offer the potential to assist patients in making these decisions before visiting a hospital. OBJECTIVE: This study aimed to evaluate the performance of ChatGPT in recommending medical specialties for medical questions. METHODS: We used a dataset of 31,482 medical questions, each answered by doctors and labeled with the appropriate medical specialty from the health consultation board of NAVER (NAVER Corp), a major Korean portal. This dataset includes 27 distinct medical specialty labels. We compared the performance of the fine-tuned Korean Medical bidirectional encoder representations from transformers (KM-BERT) and ChatGPT models by analyzing their ability to accurately recommend medical specialties. We categorized responses from ChatGPT into those matching the 27 predefined specialties and those that did not. Both models were evaluated using performance metrics of accuracy, precision, recall, and F1-score. RESULTS: ChatGPT demonstrated an answer avoidance rate of 6.2% but provided accurate medical specialty recommendations with explanations that elucidated the underlying pathophysiology of the patient's symptoms. It achieved an accuracy of 0.939, precision of 0.219, recall of 0.168, and an F1-score of 0.134. In contrast, the KM-BERT model, fine-tuned for the same task, outperformed ChatGPT with an accuracy of 0.977, precision of 0.570, recall of 0.652, and an F1-score of 0.587. CONCLUSIONS: Although ChatGPT did not surpass the fine-tuned KM-BERT model in recommending the correct medical specialties, it showcased notable advantages as a conversational artificial intelligence model. By providing detailed, contextually appropriate explanations, ChatGPT has the potential to significantly enhance patient comprehension of medical information, thereby improving the medical referral process.
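As a rough illustration of the evaluation described above, the sketch below computes accuracy and macro-averaged precision, recall, and F1-score for predicted specialty labels. The label lists are placeholders rather than the NAVER dataset, and the study's exact averaging scheme is not stated in the abstract.

```python
# Minimal sketch: comparing predicted specialty labels against labelled ground truth
# with accuracy and macro-averaged precision/recall/F1. Labels are placeholders.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["dermatology", "cardiology", "neurology", "cardiology", "psychiatry"]
y_pred = ["dermatology", "cardiology", "psychiatry", "cardiology", "psychiatry"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```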
Subject(s)
Natural Language Processing , Humans , Republic of Korea , Referral and Consultation , Outpatients
ABSTRACT
PURPOSE: Large language models such as ChatGPT-3.5 are often used by the public to answer questions related to daily life, including health advice. This study evaluated the responses of ChatGPT-3.5 in answering patient-centred frequently asked questions (FAQs) relevant in glaucoma clinical practice. DESIGN: Prospective cross-sectional survey. METHODS: Twelve experts across a range of clinical, education and research practices in optometry and ophthalmology participated. Over 200 patient-centric FAQs from authoritative professional society, hospital and advocacy websites were distilled and filtered into 40 questions across four themes: definition and risk factors, diagnosis and testing, lifestyle and other accompanying conditions, and treatment and follow-up. The questions were individually input into ChatGPT-3.5 to generate responses. The responses were graded by the twelve experts individually. MAIN OUTCOME MEASURES: A 5-point Likert scale (1 = strongly disagree; 5 = strongly agree) was used to grade ChatGPT-3.5 responses across four domains: coherency, factuality, comprehensiveness, and safety. RESULTS: Across all themes and domains, median scores were all 4 ("agree"). Comprehensiveness had the lowest scores across domains (mean 3.7±0.9), followed by factuality (mean 3.9±0.9), and coherency and safety (mean 4.1±0.8 for both). Examination of the 40 individual questions showed that 8 (20%), 17 (42.5%), 24 (60%) and 8 (20%) of the questions had average scores below 4 (i.e. below "agree") for the coherency, factuality, comprehensiveness and safety domains, respectively. Free-text comments by the experts highlighted omissions of facts and gaps in comprehensiveness (e.g. secondary glaucoma) and remarked on the vagueness of some responses (i.e. that the response did not account for individual patient circumstances). CONCLUSIONS: ChatGPT-3.5 responses to FAQs in glaucoma were generally agreeable in terms of coherency, factuality, comprehensiveness, and safety. However, areas of weakness were identified, precluding recommendations for routine use to provide patients with tailored counselling in glaucoma, especially with respect to the development of glaucoma and its management.
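The grading procedure can be summarised with a short calculation such as the sketch below, which averages expert Likert scores per domain and flags questions whose mean falls below 4 ("agree"); the ratings shown are invented placeholders, not the study's data.

```python
# Minimal sketch of summarising expert Likert grades (1-5) per domain and flagging
# questions whose mean score falls below 4 ("agree"). Ratings are invented placeholders.
import pandas as pd

# rows = one expert grade for one question in one domain
grades = pd.DataFrame({
    "question": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "domain":   ["factuality"] * 9,
    "score":    [4, 5, 4, 3, 3, 4, 5, 4, 4],
})

by_question = grades.groupby(["domain", "question"])["score"].mean()
domain_summary = grades.groupby("domain")["score"].agg(["mean", "std"])
below_agree = by_question[by_question < 4]

print(domain_summary)
print(f"questions below 'agree': {below_agree.count()}")
```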
ABSTRACT
BACKGROUND: Artificial intelligence and the language models derived from it, such as ChatGPT, offer immense possibilities, particularly in the field of medicine. It is already evident that ChatGPT can provide adequate and, in some cases, expert-level responses to health-related queries and advice for patients. However, it is currently unknown how patients perceive these capabilities, whether they can derive benefit from them, and whether potential risks, such as harmful suggestions, are detected by patients. OBJECTIVE: This study aims to clarify whether patients can get useful and safe health care advice from an artificial intelligence chatbot assistant. METHODS: This cross-sectional study was conducted using 100 publicly available health-related questions from 5 medical specialties (trauma, general surgery, otolaryngology, pediatrics, and internal medicine) from a web-based platform for patients. Responses generated by ChatGPT-4.0 and by an expert panel (EP) of experienced physicians from the aforementioned web-based platform were packed into 10 sets consisting of 10 questions each. The blinded evaluation was carried out by patients regarding empathy and usefulness (assessed through the question: "Would this answer have helped you?") on a scale from 1 to 5. As a control, evaluation was also performed by 3 physicians in each respective medical specialty, who were additionally asked about the potential harm of the response and its correctness. RESULTS: In total, 200 sets of questions were submitted by 64 patients (mean 45.7, SD 15.9 years; 29/64, 45.3% male), resulting in 2000 evaluated answers each for ChatGPT and the EP. ChatGPT scored higher in terms of empathy (4.18 vs 2.7; P<.001) and usefulness (4.04 vs 2.98; P<.001). Subanalysis revealed a small difference in the empathy ratings given by women compared with men (4.46 vs 4.14; P=.049). Ratings of ChatGPT were high regardless of the participant's age. The same highly significant results were observed in the evaluation by the respective specialist physicians, with ChatGPT also scoring significantly higher on correctness (4.51 vs 3.55; P<.001). Specialists rated the usefulness (3.93 vs 4.59) and correctness (4.62 vs 3.84) significantly lower for potentially harmful responses from ChatGPT (P<.001). This was not the case among patients. CONCLUSIONS: The results indicate that ChatGPT is capable of supporting patients in health-related queries better than physicians, at least in terms of written advice through a web-based platform. In this study, ChatGPT's responses had a lower percentage of potentially harmful advice than the web-based EP. However, it is crucial to note that this finding is based on a specific study design and may not generalize to all health care settings. Alarmingly, patients are not able to independently recognize these potential dangers.
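The abstract does not name the statistical test behind the reported P values; the sketch below shows one plausible way to compare paired per-answer ratings (a Wilcoxon signed-rank test), using invented ratings rather than study data.

```python
# Minimal sketch of comparing paired empathy ratings for ChatGPT and the expert panel (EP).
# The abstract does not name the test used; a Wilcoxon signed-rank test on paired
# per-answer ratings is shown as one plausible choice. Ratings are invented.
from scipy.stats import wilcoxon

chatgpt_empathy = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
ep_empathy      = [3, 2, 3, 3, 2, 3, 4, 2, 3, 3]

stat, p_value = wilcoxon(chatgpt_empathy, ep_empathy)
print(f"Wilcoxon W={stat:.1f}, p={p_value:.4f}")
```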
Subject(s)
Physician-Patient Relations , Humans , Cross-Sectional Studies , Male , Female , Adult , Middle Aged , Artificial Intelligence , Physicians/psychology , Internet , Empathy , Surveys and Questionnaires
ABSTRACT
INTRODUCTION: Large language model (LLM) chatbots have many applications in medical settings. However, these tools can potentially perpetuate racial and gender biases through their responses, worsening disparities in healthcare. With the ongoing discussion of LLM chatbots in oncology and the widespread goal of addressing cancer disparities, this study focuses on biases propagated by LLM chatbots in oncology. METHODS: Chat Generative Pre-trained Transformer (Chat GPT; OpenAI, San Francisco, CA, USA) was asked to determine what occupation a generic description of "assesses cancer patients" would correspond to for different demographics. Chat GPT, Gemini (Alphabet Inc., Mountain View, CA, USA), and Bing Chat (Microsoft Corp., Redmond, WA, USA) were prompted to provide oncologist recommendations in the top U.S. cities, and the demographic information (race, gender) of the recommendations was compared against national distributions. Chat GPT was also asked to generate a job description for oncologists with different demographic backgrounds. Finally, Chat GPT, Gemini, and Bing Chat were asked to generate hypothetical cancer patients with race, smoking, and drinking histories. RESULTS: LLM chatbots are about two times more likely to predict Blacks and Native Americans as oncology nurses than oncologists, compared to Asians (p < 0.01 and < 0.001, respectively). Similarly, they are also significantly more likely to predict females than males as oncology nurses (p < 0.001). Chat GPT's real-world oncologist recommendations overrepresent Asians by almost double and underrepresent Blacks by double and Hispanics by seven times. Chatbots also generate different job descriptions based on demographics, including cultural competency and advocacy and excluding treatment administration for underrepresented backgrounds. AI-generated cancer cases are not fully representative of real-world demographic distributions and encode stereotypes on substance abuse, such as Hispanics having a greater proportion of smokers than Whites by about 20% in Chat GPT breast cancer cases. CONCLUSION: To our knowledge, this is the first study of its kind to investigate racial and gender biases across such a diverse set of AI chatbots, and specifically within oncology. The methodology presented in this study provides a framework for targeted bias evaluation of LLMs in various fields across medicine.
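One of the comparisons described, checking a chatbot's recommended demographics against a national reference distribution, can be illustrated with a chi-square goodness-of-fit test as in the sketch below; the counts and reference proportions are illustrative assumptions, not the study's data.

```python
# Minimal sketch of one bias check: comparing the racial distribution of
# chatbot-recommended oncologists with a national reference distribution using a
# chi-square goodness-of-fit test. Counts and reference proportions are illustrative only.
from scipy.stats import chisquare

recommended_counts = {"White": 40, "Asian": 45, "Black": 3, "Hispanic": 2}   # hypothetical
national_props     = {"White": 0.60, "Asian": 0.25, "Black": 0.07, "Hispanic": 0.08}

total = sum(recommended_counts.values())
observed = [recommended_counts[g] for g in national_props]
expected = [national_props[g] * total for g in national_props]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square={stat:.2f}, p={p_value:.4f}")
```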
ABSTRACT
In this paper, we discuss how artificial intelligence chatbots based on large-scale language models (LLMs) can be used to disseminate information about the benefits of physical exercise for individuals with epilepsy. LLMs have demonstrated the ability to generate increasingly detailed text and allow structured dialogs. These can be useful tools, providing guidance and advice to people with epilepsy on different forms of treatment as well as physical exercise. We also examine the limitations of LLMs, which include the need for human supervision and the risk of providing imprecise and unreliable information regarding specific or controversial aspects of the topic. Despite these challenges, LLM chatbots have demonstrated the potential to support the management of epilepsy and break down barriers to information access, particularly information on physical exercise.
ABSTRACT
Objective: Most smokers who achieve short-term abstinence relapse even when aided by evidence-based cessation treatment. Mobile health presents a promising but largely untested avenue for providing adjunct behavioral support for relapse prevention. This paper presents the rationale and design of a randomized controlled trial aimed at evaluating the effectiveness of personalized mobile chat messaging support for relapse prevention among people who recently quit smoking. Methods: This is a two-arm, assessor-blinded, randomized controlled trial conducted in two clinic-based smoking cessation services in Hong Kong. An estimated 586 daily tobacco users who have abstained for 3 to 30 days will be randomized (1:1) to intervention group or control group. Both groups receive standard-of-care smoking cessation treatment from the services. The intervention group additionally receives 3-month relapse prevention support via mobile chat messaging, including cessation advice delivered by a live counselor and access to a supportive chatbot via WhatsApp. The control group receives text messaging on generic cessation advice for 3 months as attention control. The primary outcome is tobacco abstinence verified by an exhaled carbon monoxide of <5 parts per million or a negative salivary cotinine test at 6 months after randomization. Secondary outcomes include self-reported 6-month prolonged tobacco abstinence, 7-day point-prevalent abstinence, and relapse rate. The primary analyses will be by intention-to-treat, assuming participants with missing data are non-abstinent. This trial is registered with ClinicalTrials.gov (NCT05370352) and follows CONSORT-EHEALTH. Conclusion: This trial will provide new evidence on the effectiveness of mobile chat messaging as a scalable and accessible intervention for relapse prevention.
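The planned intention-to-treat analysis can be sketched as below: participants with missing outcomes are counted as non-abstinent, and verified 6-month abstinence is compared between arms. The counts are placeholders, and the trial's analysis plan may specify a different test.

```python
# Minimal sketch of an intention-to-treat comparison: participants with missing
# outcomes are counted as non-abstinent, and 6-month verified abstinence is
# compared between arms. The counts below are placeholders, not trial results.
from statsmodels.stats.proportion import proportions_ztest

randomised = {"intervention": 293, "control": 293}          # planned 1:1 split of 586
verified_abstinent = {"intervention": 60, "control": 42}    # hypothetical; missing = non-abstinent

counts = [verified_abstinent["intervention"], verified_abstinent["control"]]
nobs = [randomised["intervention"], randomised["control"]]

z_stat, p_value = proportions_ztest(counts, nobs)
risk_ratio = (counts[0] / nobs[0]) / (counts[1] / nobs[1])
print(f"risk ratio={risk_ratio:.2f}, z={z_stat:.2f}, p={p_value:.4f}")
```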
ABSTRACT
Despite considerable behavioral and organizational research on advice from human advisors, and despite the increasing study of artificial intelligence (AI) in organizational research, workplace-related applications, and popular discourse, an interdisciplinary review of advice from AI (vs. human) advisors has yet to be undertaken. We argue that the increasing adoption of AI to augment human decision-making would benefit from a framework that can characterize such interactions. Thus, the current research invokes judgment and decision-making research on advice from human advisors and uses a conceptual "fit"-based model to: (1) summarize how the characteristics of the AI advisor, human decision-maker, and advice environment influence advice exchanges and outcomes (including informed speculation about the durability of such findings in light of rapid advances in AI technology), (2) delineate future research directions (along with specific predictions), and (3) provide practical implications involving the use of AI advice by human decision-makers in applied settings.
ABSTRACT
BACKGROUND: Parenting interventions are crucial for promoting family well-being, reducing violence against children, and improving child development outcomes; however, scaling these programs remains a challenge. Prior reviews have characterized the feasibility, acceptability, and effectiveness of other more robust forms of digital parenting interventions (eg, via the web, mobile apps, and videoconferencing). Recently, chatbot technology has emerged as a possible mode for adapting and delivering parenting programs to larger populations (eg, Parenting for Lifelong Health, Incredible Years, and Triple P Parenting). OBJECTIVE: This study aims to review the evidence of using chatbots to deliver parenting interventions and assess the feasibility of implementation, acceptability of these interventions, and preliminary outcomes. METHODS: This review conducted a comprehensive search of databases, including Web of Science, MEDLINE, Scopus, ProQuest, and Cochrane Central Register of Controlled Trials. Cochrane Handbook for Systematic Review of Interventions and PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were used to conduct the search. Eligible studies targeted parents of children aged 0 to 18 years; used chatbots via digital platforms, such as the internet, mobile apps, or SMS text messaging; and targeted improving family well-being through parenting. Implementation measures, acceptability, and any reported preliminary measures of effectiveness were included. RESULTS: Of the 1766 initial results, 10 studies met the inclusion criteria. The included studies, primarily conducted in high-income countries (8/10, 80%), demonstrated a high mean retention rate (72.8%) and reported high acceptability (10/10, 100%). However, significant heterogeneity in interventions, measurement methods, and study quality necessitate cautious interpretation. Reporting bias, lack of clarity in the operationalization of engagement measures, and platform limitations were identified as limiting factors in interpreting findings. CONCLUSIONS: This is the first study to review the implementation feasibility and acceptability of chatbots for delivering parenting programs. While preliminary evidence suggests that chatbots can be used to deliver parenting programs, further research, standardization of reporting, and scaling up of effectiveness testing are critical to harness the full benefits of chatbots for promoting family well-being.
ABSTRACT
The integration of artificial intelligence (AI) technology in e-commerce has recently attracted scholarly attention; however, studies on AI in e-commerce remain relatively few. The current study aims to evaluate how AI chatbots persuade users to consider chatbot recommendations in a web-based buying situation. Employing the elaboration likelihood model, the current study presents an analytical framework for identifying the factors and internal mechanisms behind consumers' readiness to adopt AI chatbot recommendations. The authors evaluated the model using questionnaire responses from 411 Chinese AI chatbot consumers. The findings of the present study indicate that chatbot recommendation reliability and accuracy are positively related to AI technology trust and negatively related to perceived self-threat. In addition, AI technology trust is positively related to the intention to adopt the chatbot's decision, whereas perceived self-threat is negatively related to that intention. Perceived dialogue strengthens the positive relationship between AI technology trust and the intention to adopt the chatbot's decision and weakens the negative relationship between perceived self-threat and the intention to adopt AI chatbot decisions.
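The moderation findings could, in principle, be probed with an interaction term in a regression, as in the simulated sketch below; the abstract does not state the study's exact analytic method, so this is an assumption for illustration only and the variable names are hypothetical.

```python
# Minimal sketch of testing a moderation effect: perceived dialogue strengthening the
# link between AI-technology trust and intention to adopt chatbot recommendations,
# via an interaction term in a regression. Data are simulated, not study data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 411  # matches the reported sample size; the values themselves are simulated
trust = rng.normal(0, 1, n)
dialogue = rng.normal(0, 1, n)
intention = 0.5 * trust + 0.2 * dialogue + 0.3 * trust * dialogue + rng.normal(0, 1, n)

df = pd.DataFrame({"trust": trust, "dialogue": dialogue, "intention": intention})
model = smf.ols("intention ~ trust * dialogue", data=df).fit()
print(model.params)  # the trust:dialogue coefficient captures the moderation effect
```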
ABSTRACT
Diabetic foot ulcers (DFUs) are a growing public health problem, paralleling the increasing incidence of diabetes. While prevention is the most effective treatment for DFUs, selecting the optimal treatment in cases with established DFUs remains a challenge. Health sciences have greatly benefited from the integration of artificial intelligence (AI) applications across various fields. Regarding amputations in DFUs, both the literature and clinical practice have mainly focused on strategies to prevent amputation and identify avoidable risk factors. However, there are very limited data on assistive parameters/tools that can be used to determine the level of amputation. This study investigated how well ChatGPT, in its recently released 4o version, matches the amputation level selection of an experienced team in this field. For this purpose, clinical photographs from patients who underwent amputations due to diabetic foot ulcers between May 2023 and May 2024 were submitted to the ChatGPT-4o program. The AI was tasked with recommending an appropriate amputation level based on these clinical photographs. Data from a total of 60 patients were analysed, with a median age of 64.5 years (range: 41-91). According to the Wagner Classification, 32 patients (53.3%) had grade 4 ulcers, 16 patients (26.6%) had grade 5 ulcers, 10 patients (16.6%) had grade 3 ulcers and 2 patients (3.3%) had grade 2 ulcers. A one-to-one correspondence between the AI tool's recommended amputation level and the level actually performed was observed in 50 out of 60 cases (83.3%). In the remaining 10 cases, discrepancies were noted, with the AI consistently recommending a more proximal level of amputation than what was performed. The inter-rater agreement analysis between the actual surgeries and the AI tool's recommendations yielded a Cohen's kappa coefficient of 0.808 (SD: 0.055, 95% CI: 0.701-0.916), indicating substantial agreement. Relying solely on clinical photographs, ChatGPT-4o makes decisions that are largely consistent with those of an experienced team in determining the optimal level of amputation for DFUs, with the exception of hindfoot amputations.
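The agreement analysis reported here (Cohen's kappa between the performed and AI-recommended amputation levels) can be reproduced in outline as below; the level labels and pairs are hypothetical, not the study's cases.

```python
# Minimal sketch of the agreement analysis: Cohen's kappa between the amputation level
# actually performed and the level the AI recommended from clinical photographs.
# The level labels and pairs below are hypothetical.
from sklearn.metrics import cohen_kappa_score

performed   = ["toe", "toe", "transmetatarsal", "below-knee", "toe", "below-knee"]
recommended = ["toe", "transmetatarsal", "transmetatarsal", "below-knee", "toe", "below-knee"]

kappa = cohen_kappa_score(performed, recommended)
agreement = sum(a == b for a, b in zip(performed, recommended)) / len(performed)
print(f"raw agreement={agreement:.2f}, Cohen's kappa={kappa:.2f}")
```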
Subject(s)
Amputation, Surgical , Artificial Intelligence , Diabetic Foot , Humans , Diabetic Foot/surgery , Amputation, Surgical/methods , Amputation, Surgical/statistics & numerical data , Aged , Male , Female , Middle Aged , Aged, 80 and over , Adult
ABSTRACT
Artificial intelligence (AI) offers a wealth of opportunities for medicine, if we also bear in mind the risks associated with this technology. In recent years the potential future integration of AI with medicine has been the subject of much debate, although practical clinical experience of relevant cases is still largely absent. This case study examines a particular patient's experience with different forms of care. Initially, the patient communicated with a conversation-based (chat-based) AI (CAI) for self-treatment. However, over time she found herself increasingly drawn to a low-threshold internal company support system that is grounded in an existing, more traditional human-based care structure. This pattern of treatment may represent a useful addition to existing care structures, particularly for patients receptive to technology.
Subject(s)
Anxiety Disorders , Artificial Intelligence , Humans , Female , Communication , Adult , Self Care
ABSTRACT
OBJECTIVES: Standalone oral health chatbots targeting young children's oral health are rare. The aim of this research was to compare the effectiveness of a standalone chatbot and a combination of a chatbot with in-person toothbrushing training for caregivers in improving young children's oral health. METHODS: A randomised, parallel, 2-group pretest-posttest design was employed with 320 caregiver-child pairs (aged 6-42 months). Group I (160 pairs) used the 21-Day FunDee (modified) chatbot along with in-person toothbrushing training, whilst Group II (160 pairs) used only the 21-Day FunDee Plus chatbot. Oral examination assessed plaque levels and caries, whilst a self-administered questionnaire evaluated oral hygiene care, dietary practices, and oral health perceptions based on the protection motivation theory (PMT). Data were analysed using 2-way repeated-measures analysis of variance, a t test, and chi-square measures for group comparisons. RESULTS: The majority of caregivers were Muslim mothers. No significant differences were observed between groups at the baseline, 3-month, and 6-month follow-ups in mean dmft (Group I: 4.16, 4.64, and 5.30 vs Group II: 4.30, 5.54, and 5.82), mean plaque scores (Group I: 0.72, 0.53, and 0.55 vs Group II: 0.84, 0.52, and 0.59), and most dietary habits. However, significant improvements were found within groups from baseline to follow-ups in plaque reduction, toothbrushing practices, overall knowledge score, PMT perceptions, proper tooth brushing, fluoride toothpaste usage, and dietary behaviours (frequency of bottle feeding, frequency of nocturnal bottle feeding, and proportion of children who went to bed without consuming anything after having their teeth cleaned before bedtime). Significant differences between groups were found in self-efficacy at all time points, but only at the 6-month evaluation for fluoride toothpaste usage and overall PMT perceptions. CONCLUSIONS: Both interventions were comparable in preventing caries, reducing plaque, improving feeding practices, increasing parental involvement in tooth brushing, and enhancing knowledge. The standalone chatbot 21-Day FunDee Plus presents a viable alternative for promoting oral health in young children.
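The group-by-time analysis described (a two-way ANOVA with repeated measures on time) can be sketched as a mixed ANOVA, as below; the plaque scores are simulated placeholders, and pingouin's mixed_anova is used here as one convenient implementation, which may differ from the study's software.

```python
# Minimal sketch of a group x time comparison (two-way ANOVA with repeated measures
# on time), expressed as a mixed ANOVA with pingouin. The plaque scores are simulated
# placeholders, not study data.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(1)
n_per_group = 20  # far smaller than the study's 160 pairs per group, for illustration
records = []
for group in ["chatbot_plus_training", "chatbot_only"]:
    for pid in range(n_per_group):
        for time, mean in zip(["baseline", "3_months", "6_months"], [0.8, 0.55, 0.55]):
            records.append({
                "id": f"{group}_{pid}",
                "group": group,
                "time": time,
                "plaque": rng.normal(mean, 0.2),
            })

df = pd.DataFrame(records)
aov = pg.mixed_anova(data=df, dv="plaque", within="time", subject="id", between="group")
print(aov[["Source", "F", "p-unc"]])
```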
ABSTRACT
Since its launch in November 2022, ChatGPT has become a global phenomenon, sparking widespread public interest in chatbot artificial intelligences (AIs) generally. While not approved for medical use, it is capable of passing all three United States medical licensing exams and offers diagnostic accuracy comparable to a human doctor. It seems inevitable that it, and tools like it, are and will be used by the general public to obtain medical diagnostic information or treatment plans. Before we are taken in by the promise of a golden age for chatbot medical AIs, it would be wise to consider the implications of using these tools as either supplements to, or substitutes for, human doctors. With the rise of publicly available chatbot AIs, there has been a keen focus on research into the diagnostic accuracy of these tools. This, however, has left a notable gap in our understanding of the implications of these tools for health outcomes. Diagnostic accuracy is only part of good health care. For example, the doctor-patient relationship is crucial to positive health outcomes. This paper challenges the recent focus on diagnostic accuracy by drawing attention to the causal relationship between doctor-patient relationships and health outcomes, arguing that chatbot AIs may even hinder outcomes in numerous ways, including removing the elements of perception and observation that are crucial to clinical consultations. The paper offers brief suggestions to improve chatbot medical AIs so as to positively impact health outcomes.