Results 1 - 20 of 20
1.
BMC Public Health ; 24(1): 2266, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39169305

ABSTRACT

BACKGROUND: Chatbots can provide immediate assistance tailored to patients' needs, making them suitable for sustained accompanying interventions. Nevertheless, there is currently no evidence regarding their acceptability among hypertensive patients or the factors influencing acceptability in real-world settings. Existing evaluation scales often focus solely on the technology itself, overlooking the patients' perspective. Utilizing mixed methods can offer a more comprehensive exploration of influencing factors, laying the groundwork for the future integration of artificial intelligence in chronic disease management practices. METHODS: A mixed-methods design will provide a holistic view of the effectiveness and acceptability of the intervention. Participants will either receive standard primary health care or a chatbot speaker. The speaker can provide timely reminders, on-demand consultations, personalized data recording, and knowledge broadcasts, as well as entertainment features such as telling jokes. The quantitative part will be conducted as a quasi-randomized controlled trial in communities in Beijing, and a convergent design will be adopted. After patients have used the speaker for 1 month, scales will be used to measure their intention to use it. At the same time, semi-structured interviews will be conducted to explore patients' feelings about the speaker and the factors influencing its use. Data on socio-demographics, physical examination, blood pressure, acceptability, and self-management behavior will be collected at baseline and at 1, 3, 6, and 12 months. Furthermore, a cloud database will continuously collect patients' interactions with the speaker. The primary outcome is the efficacy of the chatbot on blood pressure control. The secondary outcomes include the acceptability of the chatbot speaker and changes in self-management behavior. 
DISCUSSION: The artificial intelligence-based chatbot speaker not only caters to patients' self-management needs at home but also uses a knowledge graph to organize an intricate and detailed knowledge system for patients with hypertension. Patients can promptly access information that aligns with their specific requirements, promoting proactive self-management and playing a crucial role in disease management. This study will serve as a foundation for the application of artificial intelligence technology in chronic disease management, paving the way for further exploration of how to enhance the communicative impact of such technology. TRIAL REGISTRATION: Biomedical Ethics Committee of Peking University: IRB00001052-21106, 2021/10/14; Clinical Trials: ChiCTR2100050578, 2021/08/29.


Subject(s)
Artificial Intelligence , Hypertension , Humans , Hypertension/therapy , Female , China , Male , Adult , Middle Aged , Patient Acceptance of Health Care/psychology , Primary Health Care , Qualitative Research
2.
Front Bioeng Biotechnol ; 12: 1433087, 2024.
Article in English | MEDLINE | ID: mdl-39157445

ABSTRACT

Introduction: This study aimed to identify differences in voice characteristics and changes between patients with dysphagia-aspiration and healthy individuals using a deep learning model, with a focus on the under-researched area of pre- and post-swallowing voice changes in patients with dysphagia. We hypothesized that these variations may be due to weakened muscles and blocked airways in patients with dysphagia. Methods: A prospective cohort study was conducted on 198 participants aged >40 years at the Seoul National University Bundang Hospital from October 2021 to February 2023. Pre- and post-swallowing voice data of the participants were converted to a 64-kbps mp3 format, and all voice data were trimmed to a length of 2 s. The data were divided for 10-fold cross-validation and stored in HDF5 format with anonymized IDs and labels for the normal and aspiration groups. During preprocessing, the data were converted to mel spectrograms, and the EfficientAT model was modified using the final layer of MobileNetV3 to effectively detect voice changes and analyze pre- and post-swallowing voices. This enabled the model to probabilistically categorize new patient voices as normal or aspirated. Results: In evaluating the machine-learning model for aspiration detection, area under the receiver operating characteristic curve (AUC) values were analyzed across sexes under different configurations. The average AUC values for males ranged from 0.8117 to 0.8319, with the best performance achieved at a learning rate of 3.00e-5 and a batch size of 16. The average AUC values for females improved from 0.6975 to 0.7331, with the best performance observed at a learning rate of 5.00e-5 and a batch size of 32. As there were fewer female participants, a combined model was developed to maintain the sex balance. In the combined model, the average AUC values ranged from 0.7746 to 0.7997, and optimal performance was achieved at a learning rate of 3.00e-5 and a batch size of 16. 
Conclusion: This study evaluated a voice analysis-based program to detect pre- and post-swallowing changes in patients with dysphagia, potentially aiding in real-time monitoring. Such a system can provide healthcare professionals with daily insights into the conditions of patients, allowing for personalized interventions. Clinical Trial Registration: ClinicalTrials.gov, identifier NCT05149976.
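The preprocessing described above (trimming every recording to a fixed 2-second length and converting it to a spectrogram image for a convolutional model) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the sample rate and STFT parameters are assumptions, and a plain log-magnitude spectrogram from scipy stands in for the mel spectrogram the study used (a mel filterbank, e.g. via `librosa.feature.melspectrogram`, would normally be applied on top).

```python
import numpy as np
from scipy.signal import spectrogram

SAMPLE_RATE = 16000   # assumed; the abstract does not state the sampling rate
CLIP_SECONDS = 2.0    # the study trimmed all voice data to 2 s

def trim_or_pad(audio, sr=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Force every recording to exactly `seconds` of audio by
    truncating long clips and zero-padding short ones."""
    n = int(sr * seconds)
    if len(audio) >= n:
        return audio[:n]
    return np.pad(audio, (0, n - len(audio)))

def to_log_spectrogram(audio, sr=SAMPLE_RATE):
    """Log-magnitude spectrogram as a 2-D array (freq bins x time frames),
    a simplified stand-in for the mel spectrograms used in the study."""
    _, _, sxx = spectrogram(audio, fs=sr, nperseg=512, noverlap=256)
    return np.log(sxx + 1e-10)

# Synthetic 1.5 s "voice" signal, padded up to the 2 s target length
voice = np.sin(2 * np.pi * 220 * np.linspace(0, 1.5, int(SAMPLE_RATE * 1.5)))
clip = trim_or_pad(voice)
spec = to_log_spectrogram(clip)
print(clip.shape, spec.shape)
```

The fixed-length constraint matters because CNN-style models such as MobileNetV3 expect inputs of a consistent shape.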

3.
JMIR AI ; 3: e54885, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39052997

ABSTRACT

BACKGROUND: The escalating global prevalence of obesity has necessitated the exploration of novel diagnostic approaches. Recent scientific inquiries have indicated potential alterations in voice characteristics associated with obesity, suggesting the feasibility of using voice as a noninvasive biomarker for obesity detection. OBJECTIVE: This study aims to use deep neural networks to predict obesity status through the analysis of short audio recordings, investigating the relationship between vocal characteristics and obesity. METHODS: A pilot study was conducted with 696 participants, using self-reported BMI to classify individuals into obesity and nonobesity groups. Audio recordings of participants reading a short script were transformed into spectrograms and analyzed using an adapted YOLOv8 model (Ultralytics). The model performance was evaluated using accuracy, recall, precision, and F1-scores. RESULTS: The adapted YOLOv8 model demonstrated a global accuracy of 0.70 and a macro F1-score of 0.65. It was more effective in identifying nonobesity (F1-score of 0.77) than obesity (F1-score of 0.53). This moderate level of accuracy highlights the potential and challenges in using vocal biomarkers for obesity detection. CONCLUSIONS: While the study shows promise in the field of voice-based medical diagnostics for obesity, it faces limitations such as reliance on self-reported BMI data and a small, homogeneous sample size. These factors, coupled with variability in recording quality, necessitate further research with more robust methodologies and diverse samples to enhance the validity of this novel approach. The findings lay a foundational step for future investigations in using voice as a noninvasive biomarker for obesity detection.
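For readers unfamiliar with the metrics quoted above, per-class and macro F1-scores can be computed with scikit-learn as below. The labels here are hypothetical toy data, not the study's results; the point is only how a macro F1 can sit well below overall accuracy when one class (here, the minority class) is detected poorly.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy binary labels: 0 = nonobesity (majority), 1 = obesity (minority)
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

acc = accuracy_score(y_true, y_pred)
per_class = f1_score(y_true, y_pred, average=None)  # one F1 per class
macro = f1_score(y_true, y_pred, average="macro")   # unweighted mean of per-class F1

print(acc, per_class, macro)
```

Note the pattern: the class-0 F1 is substantially higher than the class-1 F1, and the macro F1 averages the two without weighting by class size, which is why it is lower than the global accuracy.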

4.
Contemp Clin Trials ; 142: 107574, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38763307

ABSTRACT

BACKGROUND: Novel and scalable psychotherapies are urgently needed to address the depression and anxiety epidemic. Leveraging artificial intelligence (AI), a voice-based virtual coach named Lumen was developed to deliver problem solving treatment (PST). The first pilot trial showed promising changes in cognitive control measured by functional neuroimaging and improvements in depression and anxiety symptoms. METHODS: To further validate Lumen in a 3-arm randomized clinical trial, 200 participants with mild-to-moderate depression and/or anxiety will be randomly assigned in a 2:1:1 ratio to receive Lumen-coached PST, human-coached PST as active treatment comparison, or a waitlist control condition where participants can receive Lumen after the trial period. Participants will be assessed at baseline and 18 weeks. The primary aim is to confirm neural target engagement by testing whether, compared with waitlist controls, Lumen participants will show significantly greater improvements from baseline to 18 weeks in the a priori neural target for cognitive control, the right dorsolateral prefrontal cortex engaged by the go/no-go task (primary superiority hypothesis). A secondary hypothesis will test whether, compared with human-coached PST participants, Lumen participants will show equivalent improvements (i.e., noninferiority) in the same neural target from baseline to 18 weeks. The second aim is to examine (1) treatment effects on depression and anxiety symptoms, psychosocial functioning, and quality of life outcomes, and (2) relationships of neural target engagement to these patient-reported outcomes. CONCLUSIONS: This study offers potential to improve the reach and impact of psychotherapy, mitigating access, cost, and stigma barriers for people with depression and/or anxiety. CLINICALTRIALS.gov #: NCT05603923.
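The 2:1:1 allocation of 200 participants across three arms described above can be illustrated with permuted-block randomization. This is a sketch of one common approach under stated assumptions, not the trial's actual procedure (which the abstract does not detail); the arm labels are shorthand.

```python
import random

def block_randomize(n_participants, ratio=(2, 1, 1),
                    arms=("Lumen PST", "Human PST", "Waitlist"), seed=0):
    """Permuted-block allocation: each block of size sum(ratio) contains
    the arms in the target ratio, shuffled, so the ratio holds exactly
    at every block boundary."""
    rng = random.Random(seed)
    block = [arm for arm, k in zip(arms, ratio) for _ in range(k)]
    assignments = []
    while len(assignments) < n_participants:
        b = block[:]          # fresh copy of the 2:1:1 block
        rng.shuffle(b)        # randomize order within the block
        assignments.extend(b)
    return assignments[:n_participants]

alloc = block_randomize(200)
counts = {arm: alloc.count(arm) for arm in set(alloc)}
print(counts)
```

Because 200 is a multiple of the block size (4), the final counts land exactly on 100:50:50.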


Subject(s)
Anxiety , Artificial Intelligence , Depression , Adult , Female , Humans , Male , Middle Aged , Anxiety/therapy , Counseling/methods , Depression/therapy , Functional Neuroimaging/methods , Prefrontal Cortex , Problem Solving , Psychological Distress , Psychotherapy/methods , Quality of Life , Voice
5.
J Neuroeng Rehabil ; 21(1): 43, 2024 03 30.
Article in English | MEDLINE | ID: mdl-38555417

ABSTRACT

BACKGROUND: Conventional diagnostic methods for dysphagia have limitations such as long wait times, radiation risks, and restricted evaluation. Therefore, voice-based diagnostic and monitoring technologies are required to overcome these limitations. Based on our hypothesis regarding the impact of weakened muscle strength and the presence of aspiration on vocal characteristics, this single-center, prospective study aimed to develop a machine-learning algorithm for predicting dysphagia status (normal or aspiration) by analyzing postprandial voice, with intake limited to 3 cc. METHODS: Conducted from September 2021 to February 2023 at Seoul National University Bundang Hospital, this single-center, prospective cohort study included 198 participants aged 40 or older, with 128 without suspected dysphagia and 70 with dysphagia-aspiration. Voice data from participants were collected and used to develop dysphagia prediction models using the Multi-Layer Perceptron (MLP) with MobileNet V3. Male-only, female-only, and combined models were constructed using 10-fold cross-validation. Through the inference process, we established a model capable of probabilistically categorizing a new patient's voice as either normal or indicating the possibility of aspiration. RESULTS: The pre-trained models (mn40_as and mn30_as) exhibited superior performance compared to the non-pre-trained models (mn4.0 and mn3.0). Overall, the best-performing model, mn30_as, which is a pre-trained model, demonstrated an average AUC across 10 folds as follows: combined model 0.8361 (95% CI 0.7667-0.9056; max 0.9541), male model 0.8010 (95% CI 0.6589-0.9432; max 1.000), and female model 0.7572 (95% CI 0.6578-0.8567; max 0.9779). However, for the female model, a slightly higher result was observed with mn4.0, which scored 0.7679 (95% CI 0.6426-0.8931; max 0.9722). 
Additionally, the other models (pre-trained: mn40_as; non-pre-trained: mn4.0 and mn3.0) also achieved performance above 0.7 in most cases, and the highest fold-level performance for most models was approximately 0.9. The 'mn' in the model names refers to MobileNet, and the following number indicates the 'width_mult' parameter. CONCLUSIONS: In this study, we used mel-spectrogram analysis and a MobileNetV3 model for predicting dysphagia aspiration. Our research highlights the potential of voice analysis in dysphagia screening, diagnosis, and monitoring, aiming for noninvasive, safer, and more effective interventions. TRIAL REGISTRATION: This study was approved by the IRB (No. B-2109-707-303) and registered on clinicaltrials.gov (ID: NCT05149976).
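The reporting pattern above, an average AUC over 10 cross-validation folds with a 95% CI and a per-fold maximum, can be reproduced on toy data. The classifier and features below are stand-ins (the study's mel-spectrogram inputs and MobileNetV3 models are not reproduced here), and the CI uses a simple normal approximation over fold-level AUCs, which is one common convention but not necessarily the authors'.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# 198 synthetic samples, matching the study's cohort size only
X, y = make_classification(n_samples=198, n_features=20, random_state=0)

aucs = []
for tr, te in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    aucs.append(roc_auc_score(y[te], model.predict_proba(X[te])[:, 1]))

aucs = np.array(aucs)
mean_auc = aucs.mean()
# Normal-approximation 95% CI across the 10 fold-level AUCs
half_width = 1.96 * aucs.std(ddof=1) / np.sqrt(len(aucs))
print(f"AUC {mean_auc:.4f} "
      f"(95% CI {mean_auc - half_width:.4f}-{mean_auc + half_width:.4f}; "
      f"max {aucs.max():.4f})")
```

Stratified folds keep the normal/aspiration ratio comparable across folds, which matters when one class (here, aspiration) is the smaller group.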


Subject(s)
Deglutition Disorders , Female , Humans , Male , Algorithms , Deglutition Disorders/diagnosis , Deglutition Disorders/etiology , Machine Learning , Prospective Studies , Respiratory Aspiration/diagnosis , Respiratory Aspiration/etiology , Adult
6.
Med Anthropol ; 43(3): 219-232, 2024 04 02.
Article in English | MEDLINE | ID: mdl-38451490

ABSTRACT

Drawing on a two-year ethnography of care practices during the COVID-19 pandemic in Germany, we discuss the affordances of voice-based technologies (smartphones, basic mobile phones, and landline telephones) in collecting ethnographic data and crafting relationships with participants. We illustrate how such technologies allowed us to move with participants, eased data collection through the social expectations around their use, and reoriented our attention to the multiple qualities of sound. Adapting research on the performativity of technology, we argue that voice-based technologies integrated us into participants' everyday lives while also maintaining physical distance in times of infectious sociality.


Subject(s)
COVID-19 , Cell Phone , Humans , Pandemics , Anthropology, Medical , Anthropology, Cultural
7.
Psychon Bull Rev ; 31(4): 1680-1689, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38238560

ABSTRACT

How do we perceive others based on their voices? This question has attracted research and media attention for decades, producing hundreds of studies showing that the voice is socially and biologically relevant, but these studies vary in methodology and ecological validity. Here we test whether vocalizers producing read versus free speech are judged similarly by listeners on ten biological and/or psychosocial traits. In perception experiments using speech from 208 men and women and ratings from 4,088 listeners, we show that listeners' assessments of vocalizer sex and age are highly accurate, regardless of speech type. Assessments of body size, femininity-masculinity and women's health also did not differ between free and read speech. In contrast, read speech elicited higher ratings of attractiveness, dominance and trustworthiness in both sexes and of health in males compared to free speech. Importantly, these differences were small, and we additionally show moderate to strong correlations between ratings of the same vocalizers based on their read and free speech for all ten traits, indicating that voice-based judgments are highly consistent within speakers, whether or not speech is spontaneous. Our results provide evidence that the human voice can communicate various biological and psychosocial traits via both read and free speech, with theoretical and practical implications.
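The within-speaker consistency reported above amounts to correlating ratings of the same vocalizers under the two speech conditions. A minimal sketch with entirely hypothetical mean ratings for eight vocalizers:

```python
from scipy.stats import pearsonr

# Hypothetical mean trait ratings (e.g., attractiveness) for 8 vocalizers,
# once based on their read speech and once on their free speech
read_ratings = [3.1, 4.2, 2.8, 3.9, 4.5, 2.5, 3.3, 4.0]
free_ratings = [2.9, 4.0, 2.6, 3.5, 4.4, 2.7, 3.1, 3.8]

r, p = pearsonr(read_ratings, free_ratings)
print(round(r, 3), round(p, 5))
```

A high positive `r` under this design would indicate that listeners rank the same speakers similarly regardless of whether the speech was read or spontaneous, which is the paper's consistency claim.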


Subject(s)
Judgment , Humans , Male , Female , Adult , Judgment/physiology , Young Adult , Social Perception , Voice/physiology , Speech/physiology , Beauty , Speech Perception/physiology , Body Height , Body Weight , Adolescent , Masculinity , Femininity , Middle Aged , Health Status
8.
Front Artif Intell ; 6: 1171652, 2023.
Article in English | MEDLINE | ID: mdl-37601036

ABSTRACT

Introduction: Biomarkers of mental effort may help to identify subtle cognitive impairments in the absence of task performance deficits. Here, we aim to detect mental effort on a verbal task, using automated voice analysis and machine learning. Methods: Audio data from the digit span backwards task were recorded and scored with automated speech recognition using the online platform NeuroVocalixTM, yielding usable data from 2,764 healthy adults (1,022 male, 1,742 female; mean age 31.4 years). Acoustic features were aggregated across each trial and normalized within each subject. Cognitive load was dichotomized for each trial by categorizing trials at >0.6 of each participant's maximum span as "high load." Data were divided into training (60%), test (20%), and validation (20%) datasets, each containing different participants. Training and test data were used in model building and hyper-parameter tuning. Five classification models (Logistic Regression, Naive Bayes, Support Vector Machine, Random Forest, and Gradient Boosting) were trained to predict cognitive load ("high" vs. "low") based on acoustic features. Analyses were limited to correct responses. The model was evaluated using the validation dataset, across all span lengths and within the subset of trials with a four-digit span. Classifier discriminant power was examined with receiver operating characteristic (ROC) curve analysis. Results: Participants reached a mean span of 6.34 out of 8 items (SD = 1.38). The Gradient Boosting classifier provided the best performing model on test data (AUC = 0.98) and showed excellent discriminant power for cognitive load on the validation dataset, across all span lengths (AUC = 0.99), and for four-digit only utterances (AUC = 0.95). Discussion: A sensitive biomarker of mental effort can be derived from vocal acoustic features in remotely administered verbal cognitive tests. The use-case of this biomarker for improving sensitivity of cognitive tests to subtle pathology now needs to be examined.
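The pipeline above (dichotomizing load at >0.6 of the maximum span, then training a Gradient Boosting classifier on acoustic features and scoring it with ROC AUC) can be sketched on synthetic data. Everything below is an assumption-laden toy: a single shared maximum span stands in for each participant's own maximum, and the "acoustic features" are random vectors weakly shifted by load.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, max_span = 500, 8
span = rng.integers(2, max_span + 1, size=n_trials)

# "High load" = trial span above 0.6 of the maximum span
# (simplification: one shared maximum, not per-participant)
high_load = (span > 0.6 * max_span).astype(int)

# Synthetic acoustic features loosely tied to load (demo assumption only)
features = rng.normal(size=(n_trials, 5)) + high_load[:, None] * 0.8

X_tr, X_te, y_tr, y_te = train_test_split(
    features, high_load, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(round(auc, 3))
```

The study's more careful split (train/test/validation with disjoint participants) guards against a subtlety this toy omits: trials from the same speaker leaking across splits.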

9.
JMIR Mhealth Uhealth ; 11: e41117, 2023 03 31.
Article in English | MEDLINE | ID: mdl-37000476

ABSTRACT

BACKGROUND: Voice-based systems such as Amazon Alexa may be useful for collecting self-reported information in real time from participants of epidemiology studies using verbal input. In epidemiological research studies, self-reported data tend to be collected using short, infrequent questionnaires in which the items require participants to select from predefined options, which may lead to errors in the information collected and lack of coverage. Voice-based systems offer the potential to collect self-reported information "continuously" over several days or weeks. At present, to the best of our knowledge, voice-based systems have not been used or evaluated for collecting epidemiological data. OBJECTIVE: We aimed to demonstrate the technical feasibility of using Alexa to collect information from participants, investigate participant acceptability, and provide an initial evaluation of the validity of the collected data. We used food and drink information as an exemplar. METHODS: We recruited 45 staff members and students at the University of Bristol (United Kingdom). Participants were asked to tell Alexa what they ate or drank for 7 days and to also submit this information using a web-based form. Questionnaires asked for basic demographic information, about their experience during the study, and about the acceptability of using Alexa. RESULTS: Of the 37 participants with valid data, most (n=30, 81%) were aged 20 to 39 years and 23 (62%) were female. Across 29 participants with Alexa and web entries corresponding to the same intake event, 60.1% (357/588) of Alexa entries contained the same food and drink information as the corresponding web entry. 
Most participants reported that Alexa interjected, and this was worse when entering the food and drink information (17/35, 49% of participants said this happened often; 1/35, 3% said this happened always) than when entering the event date and time (6/35, 17% of participants said this happened often; 1/35, 3% said this happened always). Most (28/35, 80%) said they would be happy to use a voice-controlled system for future research. CONCLUSIONS: Although there were some issues interacting with the Alexa skill, largely because of its conversational nature and because Alexa interjected if there was a pause in speech, participants were mostly willing to participate in future research studies using Alexa. More studies are needed, especially to trial less conversational interfaces.


Subject(s)
Food , Humans , Female , Male , Feasibility Studies , Surveys and Questionnaires , United Kingdom , Self Report
10.
J Cardiovasc Transl Res ; 16(3): 541-545, 2023 06.
Article in English | MEDLINE | ID: mdl-36749563

ABSTRACT

The acceptability of artificially intelligent interactive voice response (AI-IVR) systems in cardiovascular research settings is unclear. As a result, we evaluated people's attitudes regarding the Amazon Echo Show 8 device when used for electronic data capture in cardiovascular clinics. Participants were recruited following the Voice-Based Screening for SARS-CoV-2 Exposure in Cardiovascular clinics study. Overall, 215 people enrolled and underwent screening (mean age 46.1 years; 55% female) in the VOICE-COVID study, and 58 people consented to participate in a post-screening survey. Following thematic analysis, four key themes affecting AI-IVR acceptability were identified. These were difficulties with communication (44.8%), limitations with available interaction modalities (41.4%), barriers to the development of therapeutic relationships (25.9%), and concerns with universality and accessibility (8.6%). While there are potential concerns with the use of AI-IVR technologies, these systems appeared to be well accepted in cardiovascular clinics. Increased development of these technologies could significantly improve healthcare access and efficiency.


Subject(s)
COVID-19 , Female , Humans , Middle Aged , Male , SARS-CoV-2 , Attitude
11.
JMIR Res Protoc ; 12: e41209, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36719720

ABSTRACT

BACKGROUND: The COVID-19 pandemic has disrupted the health care system, limiting health care resources such as the availability of health care professionals, patient monitoring, contact tracing, and continuous surveillance. As a result of this significant burden, digital tools have become an important asset in increasing the efficiency of patient care delivery. Digital tools can help support health care institutions by tracking transmission of the virus, aiding in the screening process, and providing telemedicine support. However, digital health tools face challenges associated with barriers to accessibility, efficiency, and privacy-related ethical issues. OBJECTIVE: This paper describes the study design of an open-label, noninterventional, crossover, randomized controlled trial aimed at assessing whether interactive voice response systems can screen for SARS-CoV-2 in patients as accurately as standard screening done by people. The study aims to assess the concordance and interrater reliability of symptom screening done by Amazon Alexa compared to manual screening done by research coordinators. The perceived level of comfort of patients when interacting with voice response systems and their personal experience will also be evaluated. METHODS: A total of 52 patients visiting the heart failure clinic at the Royal Victoria Hospital of the McGill University Health Center, in Montreal, Quebec, will be recruited. Patients will be randomly assigned to first be screened for symptoms of SARS-CoV-2 either digitally, by Amazon Alexa, or manually, by the research coordinator. Participants will subsequently be crossed over and screened either digitally or manually. The clinical setup includes an Amazon Echo Show, a tablet, and an uninterrupted power supply mounted on a mobile cart. The primary end point will be the interrater reliability on the accuracy of randomized screening data performed by Amazon Alexa versus research coordinators. 
The secondary end point will be the perceived level of comfort and app engagement of patients, as assessed using 5-point Likert scales and binary responses. RESULTS: Data collection started in May 2021 and is expected to be completed in fall 2022. Data analysis is expected to be completed in early 2023. CONCLUSIONS: The use of voice-based assistants could improve the provision of health services and reduce the burden on health care personnel. Demonstrating a high interrater reliability between Amazon Alexa and health care coordinators may inform future digital tools that streamline the screening and delivery of care in the context of other conditions and clinical settings. The COVID-19 pandemic is the first to unfold in a digital era in which tools such as Amazon Alexa can be used for disease screening, and it represents an opportunity to implement such technology in health care institutions in the long term. TRIAL REGISTRATION: ClinicalTrials.gov NCT04508972; https://clinicaltrials.gov/ct2/show/NCT04508972. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/41209.
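The interrater reliability end point above is typically quantified with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch with hypothetical screening answers (not trial data):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical yes/no symptom-screening answers recorded for the same
# 12 patients by the voice assistant and by a research coordinator
alexa       = ["yes", "no", "no", "yes", "no", "no",
               "yes", "no", "no", "no", "yes", "no"]
coordinator = ["yes", "no", "no", "yes", "no", "yes",
               "yes", "no", "no", "no", "yes", "no"]

kappa = cohen_kappa_score(alexa, coordinator)
print(round(kappa, 3))
```

Here the two raters disagree on one patient out of twelve; kappa lands below the raw 11/12 agreement because part of that agreement would occur by chance. `cohen_kappa_score` also accepts a `weights` argument for the weighted variant used with ordinal ratings.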

12.
Virtual Real ; 27(1): 141-158, 2023.
Article in English | MEDLINE | ID: mdl-34054327

ABSTRACT

This paper reports the development of a specialized teleguidance-based navigation assistance system for the blind and the visually impaired. We present findings from a usability and user experience study conducted with 11 blind and visually impaired participants and a sighted caretaker. Participants sent a live video feed of their field of view to the remote caretaker's terminal from a smartphone camera attached to their chest. The caretaker used this video feed to guide them through indoor and outdoor navigation scenarios using a combination of haptic and voice-based communication. Haptic feedback was provided through vibrating actuators installed in the grip of a Smart Cane. Two haptic methods for directional guidance were tested: (1) two vibrating actuators to guide left and right movement and (2) a single vibrating actuator with differentiating vibration patterns for the same purpose. User feedback was collected using a meCUE 2.0 standardized questionnaire, interviews, and group discussions. Participants' perceptions toward the proposed navigation assistance system were positive. Blind participants preferred vibrational guidance with two actuators, while partially blind participants preferred the single actuator method. Familiarity with cane use and age were important factors in the choice of haptic methods by both blind and partially blind users. It was found that the smartphone camera provided a sufficient field of view for remote assistance; position and angle are nonetheless important considerations. Ultimately, more research is needed to confirm our preliminary findings. We also present an expanded evaluation model developed to carry out further research on assistive systems.

13.
Interact J Med Res ; 11(2): e40655, 2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36378504

ABSTRACT

The COVID-19 pandemic accelerated the use of remote patient monitoring in clinical practice or research for safety and emergency reasons, justifying the need for innovative digital health solutions to monitor key parameters or symptoms related to COVID-19 or Long COVID. The use of voice-based technologies, and in particular vocal biomarkers, is a promising approach, voice being a rich, easy-to-collect medium with numerous potential applications for health care, from diagnosis to monitoring. In this viewpoint, we provide an overview of the potential benefits and limitations of using voice to monitor COVID-19, Long COVID, and related symptoms. We then describe an optimal pipeline to bring a vocal biomarker candidate from research to clinical practice and discuss recommendations to achieve such a clinical implementation successfully.

14.
BMC Health Serv Res ; 22(1): 287, 2022 Mar 03.
Article in English | MEDLINE | ID: mdl-35236341

ABSTRACT

BACKGROUND: The smart hospital concept of using the Internet of Things (IoT) to reduce demand on human resources has become more popular in an aging society. OBJECTIVE: To implement the voice smart care (VSC) system in hospital wards and explore patient acceptance via the Technology Acceptance Model (TAM). METHODS: A structured questionnaire based on the TAM was developed and validated as a research tool. Only patients hospitalized in the VSC wards who used the system for more than two days were invited to complete the questionnaire. Statistical variables were analyzed using SPSS version 24.0. A total of 30 valid questionnaires were obtained after excluding two incomplete questionnaires. Cronbach's α values for all study constructs were above 0.84. RESULTS: We observed that the effects of perceived ease of use on perceived usefulness, of perceived usefulness on user satisfaction and attitude toward using, and of attitude toward using on behavioral intention to use were each statistically significant (p < .01). CONCLUSION: We have successfully developed the VSC system in a Taiwanese academic medical center. Our study indicated that perceived usefulness was a crucial factor, which means the system function should precisely meet the patients' demands. Additionally, a well-conceived system design is important, since perceived ease of use positively affects perceived usefulness. The insight generated from this study could be beneficial to hospitals implementing similar systems in their wards.
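The Cronbach's α values quoted above measure the internal consistency of each questionnaire construct; α compares the sum of per-item variances to the variance of the total score. A small sketch with hypothetical Likert responses (not the study's data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point responses from 6 patients to a 4-item construct
responses = [
    [4, 4, 5, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 5, 4, 4],
    [3, 4, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values above roughly 0.7-0.8 are conventionally read as acceptable-to-good internal consistency, which is why the study's constructs (all above 0.84) support the questionnaire's reliability.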


Subject(s)
Aging , Intention , Attitude , Hospitals , Humans , Pilot Projects
15.
Assist Technol ; 34(2): 129-139, 2022 03 04.
Article in English | MEDLINE | ID: mdl-31910146

ABSTRACT

There are over 466 million people in the world with disabling hearing loss. People with severe-to-profound hearing impairment need to lipread or use sign language, even with hearing aids. Assistive technologies play a vital role in helping these people interact efficiently with their environment. Deaf drivers are not currently able to take full advantage of voice-based navigation applications. In this paper, we describe research that is aimed at developing an assistive device that (1) recognizes voice-stream navigation instructions from GPS-based navigation applications, and (2) maps each voiced navigation instruction to a vibrotactile stimulus that can be perceived and understood by deaf drivers. A 13-element feature vector is extracted from each voice stream and classified into one of six categories, where each category represents a unique navigation instruction. The classification of the feature vectors is done using a K-Nearest-Neighbor classifier (with an accuracy of 99.05%), which was found to outperform five other classifiers. Each category is then mapped to a unique vibration pattern, which drives vibration motors in real time. A usability study was conducted with ten participants. Three different alternatives were tested to find the best body locations for mounting the vibration motors. The solution ultimately chosen was two sets of five vibrator motors, where each set was mounted on a bracelet. Ten drivers were asked to rate the proposed device (based on eight different factors) after they used the assistive device on 8 driving routes. The overall mean rating across all eight factors was 4.67 (out of 5). This indicates that the proposed assistive device was seen as useful and effective.
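The classification step above (13-element feature vectors mapped to one of six navigation-instruction categories with a K-Nearest-Neighbor classifier) can be sketched with scikit-learn. The features below are synthetic stand-ins; the real vectors would be extracted from the voiced navigation audio, and the neighbor count is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_per_class, n_classes, n_features = 60, 6, 13  # 6 instruction categories

# Synthetic, well-separated clusters standing in for per-instruction features
X = np.vstack([
    rng.normal(loc=c * 2.0, scale=0.5, size=(n_per_class, n_features))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
acc = knn.score(X_te, y_te)
print(round(acc, 4))
```

Each predicted category would then index into a lookup table of vibration patterns for the bracelet motors; the classifier itself stays agnostic about the haptic output.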


Subject(s)
Automobile Driving , Hearing Aids , Hearing Loss , Self-Help Devices , Humans , Sign Language
16.
Sensors (Basel) ; 21(9)2021 May 10.
Article in English | MEDLINE | ID: mdl-34068602

ABSTRACT

Maintaining a high quality of conversation between doctors and patients is essential in telehealth services, where efficient and competent communication is important to promote patient health. Assessing the quality of medical conversations is often handled through human auditory-perceptual evaluation. Typically, trained experts are needed for such tasks, as they follow systematic evaluation criteria. However, the rapid daily increase in consultations makes the evaluation process inefficient and impractical. This paper investigates the automation of the quality assessment process of patient-doctor voice-based conversations in a telehealth service using a deep-learning-based classification model. For this, the data consist of audio recordings obtained from Altibbi. Altibbi is a digital health platform that provides telemedicine and telehealth services in the Middle East and North Africa (MENA). The objective is to assist Altibbi's operations team in the evaluation of the provided consultations in an automated manner. The proposed model is developed using three sets of features: features extracted from the signal level, the transcript level, and the combined signal and transcript levels. At the signal level, various statistical and spectral measures are calculated to characterize the spectral envelope of the speech recordings. At the transcript level, a pre-trained embedding model is utilized to capture the semantic and contextual features of the textual information. Additionally, the hybrid of the signal and transcript levels is explored and analyzed. The designed classification model relies on stacked layers of deep neural networks and convolutional neural networks. Evaluation results show that the model achieved a higher level of precision when compared with the manual evaluation approach followed by Altibbi's operations team.


Subject(s)
Deep Learning , Telemedicine , Voice , Humans , Neural Networks, Computer , Referral and Consultation
17.
Eur Heart J Digit Health ; 2(3): 521-527, 2021 Sep.
Article in English | MEDLINE | ID: mdl-36713601

ABSTRACT

Aims: Artificial intelligence (A.I.)-driven voice-based assistants may facilitate data capture in clinical care and trials; however, the feasibility and accuracy of using such devices in a healthcare environment are unknown. We explored the feasibility of using the Amazon Alexa ('Alexa') A.I. voice assistant to screen for risk factors or symptoms relating to SARS-CoV-2 exposure in quaternary care cardiovascular clinics. Methods and results: We enrolled participants to be screened for signs and symptoms of SARS-CoV-2 exposure by a healthcare provider and then subsequently by the Alexa. Our primary outcome was the interrater reliability of Alexa versus healthcare provider screening, using Cohen's kappa statistic. Participants rated the Alexa in a post-study survey (scale of 1 to 5, with 5 reflecting 'strongly agree'). This study was approved by the McGill University Health Centre ethics board. We prospectively enrolled 215 participants. The mean age was 46 years [17.7 years standard deviation (SD)], 55% were female, and 31% were French speakers (the others were English speakers). In total, 645 screening questions were delivered by Alexa. The Alexa mis-identified one response. The simple and weighted Cohen's kappa statistics between Alexa and healthcare provider screening were 0.989 [95% confidence interval (CI) 0.982-0.997] and 0.992 (95% CI 0.985-0.999), respectively. The participants gave an overall mean rating of 4.4 (out of 5, 0.9 SD). Conclusion: Our study demonstrates the feasibility of an A.I.-driven multilingual voice-based assistant to collect data in the context of SARS-CoV-2 exposure screening. Future studies integrating such devices in cardiovascular healthcare delivery and clinical trials are warranted. Registration: https://clinicaltrials.gov/ct2/show/NCT04508972.
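Cohen's kappa, the agreement statistic this study uses as its primary outcome, can be computed directly from paired categorical ratings. The ratings below are invented for illustration; they are not the study's data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Simple (unweighted) Cohen's kappa for two raters' categorical labels."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no screening answers with a single disagreement.
provider = ["yes", "no", "no", "yes", "no", "no", "yes", "no", "no", "no"]
alexa = ["yes", "no", "no", "yes", "no", "no", "no", "no", "no", "no"]
kappa = cohens_kappa(provider, alexa)
```

With 9/10 observed agreement but skewed marginals, the toy kappa lands around 0.74, well below the 0.989 reported above, which reflects near-perfect agreement over 645 questions.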

18.
Sensors (Basel) ; 19(8)2019 Apr 20.
Article in English | MEDLINE | ID: mdl-31010025

ABSTRACT

Voice-based interfaces have become one of the most popular device capabilities, recently being regarded as a flagship user experience of smart consumer devices. However, the lack of common coordination mechanisms can degrade the user experience, especially when interacting with multiple voice-enabled devices located close together. For example, a hotword or wake-up utterance such as "hi Bixby" or "ok Google" frequently triggers redundant responses from several nearby smartphones. Motivated by this problem of uncoordinated reactions of voice-enabled devices, especially in a multi-device environment, in this paper we discuss the notion of an ephemeral group of consumer devices, in which the member devices and the transient lifetime are implicitly determined by an external event (e.g., hotword detection) without any provisioned group structure, and we specifically concentrate on the time-constrained leader election process in such an ephemeral group. To do so: (i) We first present a sound-based multi-device communication framework, namely tailtag, that leverages the isomorphic capability of consumer devices for the tasks of processing hotword events and transmitting data over sound, and thus renders both tasks confined to the same room area and enables a spontaneous leader election process in an unstructured group upon a hotword event. (ii) To improve the success rate of the leader election within a given time constraint, we then develop an adaptive messaging scheme tailored for sound-based data communication, which inherently has a low data rate. Our adaptive scheme utilizes an application-specific score that is individually calculated by each member device for each event detection, and employs score-based scheduling by which messages with high scores are scheduled first, so that unnecessary message transmissions can be suppressed during the election process.
(iii) Through experiments, we also demonstrate that, when a hotword is detected by multiple smartphones in a room, the framework with the adaptive messaging scheme enables them to successfully achieve a coordinated response within the given latency bound, yielding a negligible non-consensus probability of no more than 2%.
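The score-based scheduling described in (ii) and (iii) can be sketched as a small simulation. The scoring values, slot cost, and suppression rule below are invented for illustration; the actual tailtag framework transmits these messages acoustically between real devices.

```python
def elect_leader(device_scores, deadline, slot_cost=1.0):
    """Sketch of time-constrained, score-based leader election: messages
    are scheduled in descending score order on a shared low-rate channel,
    each occupying one time slot of `slot_cost` seconds."""
    schedule = sorted(device_scores.items(), key=lambda kv: -kv[1])
    elapsed = 0.0
    for device, score in schedule:
        elapsed += slot_cost
        if elapsed > deadline:
            return None  # no announcement aired before the latency bound
        # The first successful announcement carries the highest score, so
        # every lower-scored device suppresses its own transmission.
        return device
    return None

# Three nearby phones detect the same hotword with different
# (hypothetical) application-specific confidence scores.
scores = {"phone_a": 0.91, "phone_b": 0.74, "phone_c": 0.55}
winner = elect_leader(scores, deadline=2.0)
```

Scheduling the highest score first is what keeps the channel quiet: in the common case only one message is ever transmitted, which matters when the sound channel's data rate is low.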

19.
JMIR Ment Health ; 4(2): e25, 2017 Jun 28.
Article in English | MEDLINE | ID: mdl-28659259

ABSTRACT

BACKGROUND: Computer-delivered interventions have been shown to be effective in reducing alcohol consumption in heavy drinking college students. However, these computer-delivered interventions rely on mouse, keyboard, or touchscreen responses for interactions between the users and the computer-delivered intervention. The principles of motivational interviewing suggest that in-person interventions may be effective, in part, because they encourage individuals to think through and speak aloud their motivations for changing a health behavior, which current computer-delivered interventions do not allow. OBJECTIVE: The objective of this study was to take the initial steps toward development of a voice-based computer-delivered intervention that can ask open-ended questions and respond appropriately to users' verbal responses, more closely mirroring a human-delivered motivational intervention. METHODS: We developed (1) a voice-based computer-delivered intervention that was run by a human controller and that allowed participants to speak their responses to scripted prompts delivered by speech generation software and (2) a text-based computer-delivered intervention that relied on the mouse, keyboard, and computer screen for all interactions. We randomized 60 heavy drinking college students to interact with the voice-based computer-delivered intervention and 30 to interact with the text-based computer-delivered intervention and compared their ratings of the systems as well as their motivation to change drinking and their drinking behavior at 1-month follow-up. RESULTS: Participants reported that the voice-based computer-delivered intervention engaged positively with them in the session and delivered content in a manner consistent with motivational interviewing principles. 
At 1-month follow-up, participants in the voice-based computer-delivered intervention condition reported significant decreases in quantity, frequency, and problems associated with drinking, and increased perceived importance of changing drinking behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incident rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking. CONCLUSIONS: Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form.

20.
Article in English | MEDLINE | ID: mdl-27684109

ABSTRACT

Rhythm is the speech property related to the temporal organization of sounds. Considerable evidence now suggests that dementia of the Alzheimer's type is associated with impairments in speech rhythm. The aim of this study is to assess the use of an automatic computerized system for measuring speech rhythm characteristics in an oral reading task performed by 45 patients with Alzheimer's disease (AD), compared with those same characteristics among 82 healthy older adults without a diagnosis of dementia, matched by age, sex, and cultural background. A range of rhythmic-metric and clinical measurements was applied. The results show rhythmic differences between the groups, with higher variability of syllabic intervals in AD patients. Signal processing algorithms applied to the oral reading recordings prove capable of differentiating between AD patients and older adults without dementia with an accuracy of 87% (specificity 81.7%, sensitivity 82.2%), based on the standard deviation of the duration of syllabic intervals. These experimental results show that syllabic variability measurements extracted from the speech signal can be used to distinguish between older adults without a diagnosis of dementia and those with AD, and may be useful as a tool for the objective study and quantification of speech deficits in AD.
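The discriminating measure reported in this abstract, the standard deviation of syllabic interval durations, is simple to compute once syllable onsets have been detected. The onset times and the decision threshold below are illustrative assumptions, not values fitted in the study.

```python
import statistics

def syllabic_interval_std(syllable_onsets):
    """Standard deviation of the durations between consecutive syllable
    onsets (in seconds): the rhythm measure highlighted in the abstract."""
    intervals = [b - a for a, b in zip(syllable_onsets, syllable_onsets[1:])]
    return statistics.pstdev(intervals)

def flag_high_variability(syllable_onsets, threshold=0.08):
    """Illustrative rule: flag a reading as rhythmically atypical when
    syllabic-interval variability exceeds a (hypothetical) threshold."""
    return syllabic_interval_std(syllable_onsets) > threshold

# Regular rhythm (~0.20 s intervals) vs. noticeably irregular rhythm.
regular = [0.0, 0.20, 0.41, 0.60, 0.81, 1.00]
irregular = [0.0, 0.15, 0.50, 0.62, 1.05, 1.20]
```

In the study itself, onset detection and the classification threshold come from the signal processing pipeline; this sketch only shows how the variability statistic separates a steady reading from an irregular one.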


Subject(s)
Alzheimer Disease , Speech , Aged , Algorithms , Alzheimer Disease/diagnosis , Alzheimer Disease/physiopathology , Diagnosis, Computer-Assisted , Educational Status , Female , Humans , Language , Male , Neuropsychological Tests , Periodicity , Reading , Sensitivity and Specificity , Signal Processing, Computer-Assisted , Speech Disorders/diagnosis , Speech Disorders/etiology , Speech Disorders/physiopathology , Speech Production Measurement