Accuracy of Artificial Intelligence-Based Virtual Assistants in Responding to Frequently Asked Questions Related to Orthognathic Surgery.

Fatima, Kaleem; Singh, Pinky; Amipara, Hetal; Chaudhary, Ganesh

Affiliation

Fatima K; Senior Resident, Department of Orthodontics and Dentofacial Orthopedics, Maulana Azad Institute of Dental Sciences, New Delhi, India.
Singh P; Consultant Orthodontist, Department of Orthodontics and Dentofacial Orthopedics, Bharatpur Hospital, Bharatpur, Chitwan, Nepal.
Amipara H; Senior Resident, Department of Oral and Maxillofacial Surgery, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India.
Chaudhary G; Senior Consultant, Department of Oral and Maxillofacial Surgery, Bharatpur Hospital, Bharatpur, Chitwan, Nepal. Electronic address: drsantosh12532@gmail.com.

J Oral Maxillofac Surg; 82(8): 916-921, 2024 Aug.

Article in En | MEDLINE | ID: mdl-38729217

ABSTRACT

BACKGROUND:

Despite increasing interest in how conversational agents might improve health care delivery and information dissemination, there is limited research assessing the quality of health information provided by these technologies, especially in orthognathic surgery (OGS).

PURPOSE:

This study aimed to measure and compare the quality of four virtual assistants (VAs) in addressing frequently asked questions about OGS.

STUDY DESIGN, SETTING, AND SAMPLE:

This in-silico cross-sectional study assessed the responses of a sample of four VAs to a standardized set of 10 questions related to OGS.

INDEPENDENT VARIABLE:

The independent variable was the VA. The four VAs tested were VA1 Alexa (Seattle, Washington), VA2 Google Assistant (Google, Mountain View, California), VA3 Siri (Cupertino, California), and VA4 Bing (San Diego, California).

MAIN OUTCOME VARIABLE(S):

The primary outcome variable was the quality of the answers generated by the four VAs. Four investigators (two orthodontists and two oral surgeons) rated the quality of each VA's responses to the standardized set of 10 questions on a five-point modified Likert scale, with the lowest score (1) signifying the highest quality. The main outcome measure was the combined mean score of the responses from each VA; the secondary outcome was the variability in scores among the different investigators.

COVARIATES:

None.

ANALYSES:

One-way analysis of variance was used to compare the average scores per question. One-way analysis of variance followed by Tukey's post hoc analysis was used to compare the combined mean scores among the VAs. The combined mean scores of all questions were also compared across investigators to determine whether each VA's scores varied by investigator.
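As a rough illustration of the one-way analysis of variance named in the analyses, the sketch below computes the F statistic for Likert ratings grouped by VA. It is not the authors' code: the function name `one_way_anova`, the group labels, and every rating are hypothetical and are not the study's data. The P value and the Tukey post hoc step are left out here; in practice they would usually come from a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)          # number of groups (here, VAs)
    n = len(all_scores)      # total number of ratings

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings (1 = best, 5 = worst); NOT the study's data.
ratings = {
    "VA_a": [1, 2, 1, 2],
    "VA_b": [3, 4, 3, 4],
    "VA_c": [5, 4, 5, 4],
}

f_stat, df_b, df_w = one_way_anova(list(ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the stated degrees of freedom suggests that at least one group mean differs; a post hoc test such as Tukey's is then used to identify which pairs differ.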

RESULTS:

Among the four VAs, VA4 had the lowest (best) score (1.32 ± 0.57), followed by VA2 (1.55 ± 0.78), VA1 (2.67 ± 1.49), and VA3 (3.52 ± 0.50); the difference among VAs was significant (P < .001). Scores did not differ significantly across investigators for VA3 (P = .46), VA4 (P = .45), or VA2 (P = .44); VA1 was the exception (P = .003).

CONCLUSION AND RELEVANCE:

The VAs responded to the queries related to OGS, with VA4 giving the best-quality responses, followed by VA2, VA1, and VA3. Technology companies and clinical organizations should partner to develop an intelligent VA that gives evidence-based responses specifically curated to educate patients.

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Artificial Intelligence
Limits: Humans
Language: En
Journal: J Oral Maxillofac Surg
Year: 2024
Document type: Article
Affiliation country: India
Country of publication: United States