Pesquisa | Portal de Pesquisa da BVS

1.

Neurocognitive processing efficiency for discriminating human non-alarm rather than alarm scream calls.

Frühholz, Sascha; Dietziker, Joris; Staib, Matthias; Trost, Wiebke.

PLoS Biol ; 19(4): e3000751, 2021 04.

Artigo em Inglês | MEDLINE | ID: mdl-33848299

RESUMO

Across many species, scream calls signal the affective significance of events to other agents. Scream calls were often thought to be of generic alarming and fearful nature, to signal potential threats, with instantaneous, involuntary, and accurate recognition by perceivers. However, scream calls are more diverse in their affective signaling nature than being limited to fearfully alarming a threat, and thus the broader sociobiological relevance of various scream types is unclear. Here we used 4 different psychoacoustic, perceptual decision-making, and neuroimaging experiments in humans to demonstrate the existence of at least 6 psychoacoustically distinctive types of scream calls of both alarming and non-alarming nature, rather than there being only screams caused by fear or aggression. Second, based on perceptual and processing sensitivity measures for decision-making during scream recognition, we found that alarm screams (with some exceptions) were overall discriminated the worst, were responded to the slowest, and were associated with a lower perceptual sensitivity for their recognition compared with non-alarm screams. Third, the neural processing of alarm compared with non-alarm screams during an implicit processing task elicited only minimal neural signal and connectivity in perceivers, contrary to the frequent assumption of a threat processing bias of the primate neural system. These findings show that scream calls are more diverse in their signaling and communicative nature in humans than previously assumed, and, in contrast to a commonly observed threat processing bias in perceptual discriminations and neural processes, we found that especially non-alarm screams, and positive screams in particular, seem to have higher efficiency in speeded discriminations and the implicit neural processing of various scream types in humans.

Assuntos

Percepção Auditiva/fisiologia , Discriminação Psicológica/fisiologia , Medo/psicologia , Reconhecimento de Voz/fisiologia , Adulto , Vias Auditivas/diagnóstico por imagem , Vias Auditivas/fisiologia , Encéfalo/diagnóstico por imagem , Feminino , Humanos , Imageamento por Ressonância Magnética , Masculino , Reconhecimento Fisiológico de Modelo/fisiologia , Reconhecimento Psicológico/fisiologia , Caracteres Sexuais , Adulto Jovem

2.

'Biocomputer' combines lab-grown brain tissue with electronic hardware.

Tozer, Lilly.

Nature ; 624(7992): 481, 2023 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-38082130

Assuntos

Interfaces Cérebro-Computador , Encéfalo , Computadores , Eletrônica , Organoides , Humanos , Eletrônica/instrumentação , Eletrônica/métodos , Reconhecimento de Voz

3.

Validity, feasibility, and effectiveness of a voice-recognition based digital cognitive screener for dementia and mild cognitive impairment in community-dwelling older Chinese adults: A large-scale implementation study.

Zhao, Xuhao; Wen, Haoxuan; Xu, Guohai; Pang, Ting; Zhang, Yaping; He, Xindi; Hu, Ruofei; Yan, Ming; Chen, Christopher; Wu, Xifeng; Xu, Xin.

Alzheimers Dement ; 20(4): 2384-2396, 2024 04.

Artigo em Inglês | MEDLINE | ID: mdl-38299756

RESUMO

INTRODUCTION: We investigated the validity, feasibility, and effectiveness of a voice recognition-based digital cognitive screener (DCS), for detecting dementia and mild cognitive impairment (MCI) in a large-scale community of elderly participants. METHODS: Eligible participants completed demographic, cognitive, functional assessments and the DCS. Neuropsychological tests were used to assess domain-specific and global cognition, while the diagnosis of MCI and dementia relied on the Clinical Dementia Rating Scale. RESULTS: Among the 11,186 participants, the DCS showed high completion rates (97.5%) and a short administration time (5.9 min) across gender, age, and education groups. The DCS demonstrated areas under the receiver operating characteristics curve (AUCs) of 0.95 and 0.83 for dementia and MCI detection, respectively, among 328 participants in the validation phase. Furthermore, the DCS resulted in time savings of 16.2% to 36.0% compared to the Mini-Mental State Examination (MMSE) and Montral Cognitive Assessment (MoCA). DISCUSSION: This study suggests that the DCS is an effective and efficient tool for dementia and MCI case-finding in large-scale cognitive screening. HIGHLIGHTS: To our best knowledge, this is the first cognitive screening tool based on voice recognition and utilizing conversational AI that has been assessed in a large population of Chinese community-dwelling elderly. With the upgrading of a new multimodal understanding model, the DCS can accurately assess participants' responses, including different Chinese dialects, and provide automatic scores. The DCS not only exhibited good discriminant ability in detecting dementia and MCI cases, it also demonstrated a high completion rate and efficient administration regardless of gender, age, and education differences. The DCS is economically efficient, scalable, and had a better screening efficacy compared to the MMSE or MoCA, for wider implementation.

Assuntos

Disfunção Cognitiva , Demência , Adulto , Humanos , Pessoa de Meia-Idade , Idoso , Demência/epidemiologia , Estudos de Viabilidade , Vida Independente , Reconhecimento de Voz , Disfunção Cognitiva/epidemiologia , Cognição , Testes Neuropsicológicos , Reprodutibilidade dos Testes , China/epidemiologia

4.

Establishing the consistency of a voice recognition symbol digit modalities test analogue.

Wishart, Margaret; Everest, Marina R; Morrow, Sarah A; Rose, Jonathan; Shen, Lingkai; Feinstein, Anthony.

Mult Scler ; 29(13): 1676-1679, 2023 Nov.

Artigo em Inglês | MEDLINE | ID: mdl-37842762

RESUMO

BACKGROUND: We previously demonstrated the convergent validity of a fully automated voice recognition analogue of the Symbol Digit Modalities Test (VR-SDMT) for evaluating processing speed in people with multiple sclerosis (pwMS). OBJECTIVE/METHODS: We aimed to replicate these results in 54 pwMS and 18 healthy controls (HCs), demonstrating the VR-SDMT's reliability. RESULTS: Significant correlations were found between the VR-SDMT and the traditional oral SDMT in the multiple sclerosis (MS) (r = -0.771, p < 0.001) and HC (r = -0.785, p < 0.001) groups. CONCLUSION: Taken collectively, our two studies demonstrate the reliability and validity of the VR-SDMT for assessing processing speed in pwMS.

Assuntos

Esclerose Múltipla , Reconhecimento de Voz , Humanos , Reprodutibilidade dos Testes , Testes Neuropsicológicos , Velocidade de Processamento

5.

Creaky voice identification in Mandarin: The effects of prosodic position, tone, pitch range and creak locality.

Li, Aini; Lai, Wei; Kuang, Jianjing.

J Acoust Soc Am ; 154(1): 126-140, 2023 07 01.

Artigo em Inglês | MEDLINE | ID: mdl-37432052

RESUMO

Creaky voice, a non-modal aperiodic phonation that is often associated with low pitch targets, has been found to not only correlate linguistically with prosodic boundary, tonal categories, and pitch range, but also socially with age, gender, and social status. However, it is still not clear whether co-varying factors such as prosodic boundary, pitch range, and tone could, in turn, affect listeners' identification of creak. To fill this gap, this current study examines how creaky voice is identified in Mandarin through experimental data, aiming to enhance our understanding of cross-linguistic perception of creaky voice and, more broadly, speech perception in multi-variable contexts. Our results reveal that in Mandarin, creak identification is context-dependent: factors including prosodic position, tone, pitch range, and the amount of creak all affect how Mandarin listeners identify creak. This reflects listeners' knowledge about the distribution of creak in linguistically universal (e.g., prosodic boundary) and language-specific (e.g., lexical tone) environments.

Assuntos

Percepção da Fala , Reconhecimento de Voz , Fonação , Idioma , Linguística

6.

Modulation of the Primary Auditory Thalamus When Recognizing Speech with Background Noise.

Mihai, Paul Glad; Tschentscher, Nadja; von Kriegstein, Katharina.

J Neurosci ; 41(33): 7136-7147, 2021 08 18.

Artigo em Inglês | MEDLINE | ID: mdl-34244362

RESUMO

Recognizing speech in background noise is a strenuous daily activity, yet most humans can master it. An explanation of how the human brain deals with such sensory uncertainty during speech recognition is to-date missing. Previous work has shown that recognition of speech without background noise involves modulation of the auditory thalamus (medial geniculate body; MGB): there are higher responses in left MGB for speech recognition tasks that require tracking of fast-varying stimulus properties in contrast to relatively constant stimulus properties (e.g., speaker identity tasks) despite the same stimulus input. Here, we tested the hypotheses that (1) this task-dependent modulation for speech recognition increases in parallel with the sensory uncertainty in the speech signal, i.e., the amount of background noise; and that (2) this increase is present in the ventral MGB, which corresponds to the primary sensory part of the auditory thalamus. In accordance with our hypothesis, we show, by using ultra-high-resolution functional magnetic resonance imaging (fMRI) in male and female human participants, that the task-dependent modulation of the left ventral MGB (vMGB) for speech is particularly strong when recognizing speech in noisy listening conditions in contrast to situations where the speech signal is clear. The results imply that speech in noise recognition is supported by modifications at the level of the subcortical sensory pathway providing driving input to the auditory cortex.SIGNIFICANCE STATEMENT Speech recognition in noisy environments is a challenging everyday task. One reason why humans can master this task is the recruitment of additional cognitive resources as reflected in recruitment of non-language cerebral cortex areas. Here, we show that also modulation in the primary sensory pathway is specifically involved in speech in noise recognition. We found that the left primary sensory thalamus (ventral medial geniculate body; vMGB) is more involved when recognizing speech signals as opposed to a control task (speaker identity recognition) when heard in background noise versus when the noise was absent. This finding implies that the brain optimizes sensory processing in subcortical sensory pathway structures in a task-specific manner to deal with speech recognition in noisy environments.

Assuntos

Mapeamento Encefálico , Corpos Geniculados/fisiologia , Colículos Inferiores/fisiologia , Ruído , Percepção da Fala/fisiologia , Tálamo/fisiologia , Adulto , Feminino , Humanos , Imageamento por Ressonância Magnética , Masculino , Modelos Neurológicos , Fonética , Projetos Piloto , Tempo de Reação , Razão Sinal-Ruído , Incerteza , Reconhecimento de Voz/fisiologia

7.

Neural dissociation of the acoustic and cognitive representation of voice identity.

Bestelmeyer, Patricia E G; Mühl, Constanze.

Neuroimage ; 263: 119647, 2022 11.

Artigo em Inglês | MEDLINE | ID: mdl-36162634

RESUMO

Recognising a speaker's identity by the sound of their voice is important for successful interaction. The skill depends on our ability to discriminate minute variations in the acoustics of the vocal signal. Performance on voice identity assessments varies widely across the population. The neural underpinnings of this ability and its individual differences, however, remain poorly understood. Here we provide critical tests of a theoretical framework for the neural processing stages of voice identity and address how individual differences in identity discrimination mediate activation in this neural network. We scanned 40 individuals on an fMRI adaptation task involving voices drawn from morphed continua between two personally familiar identities. Analyses dissociated neuronal effects induced by repetition of acoustically similar morphs from those induced by a switch in perceived identity. Activation in temporal voice-sensitive areas decreased with acoustic similarity between consecutive stimuli. This repetition suppression effect was mediated by the performance on an independent voice assessment and this result highlights an important functional role of adaptive coding in voice expertise. Bilateral anterior insulae and medial frontal gyri responded to a switch in perceived voice identity compared to an acoustically equidistant switch within identity. Our results support a multistep model of voice identity perception.

Assuntos

Acústica , Doenças Auditivas Centrais , Cognição , Reconhecimento de Voz , Humanos , Estimulação Acústica , Cognição/fisiologia , Imageamento por Ressonância Magnética , Córtex Pré-Frontal/fisiologia , Reconhecimento de Voz/fisiologia , Doenças Auditivas Centrais/fisiopatologia , Masculino , Feminino , Adolescente , Adulto Jovem , Adulto , Rede Nervosa/fisiologia

8.

Surface Potential Tuned Single Active Material Comprised Triboelectric Nanogenerator for a High Performance Voice Recognition Sensor.

Babu, Anand; Malik, Pinki; Das, Nityananda; Mandal, Dipankar.

Small ; 18(22): e2201331, 2022 06.

Artigo em Inglês | MEDLINE | ID: mdl-35499190

RESUMO

To fabricate a high-performance and ultrasensitive triboelectric nanogenerator (TENG), choice of a combination of different materials of triboelectric series is one of the prime challenging tasks. An effective way to fabricate a TENG with a single material (abbreviated as S-TENG) is proposed, comprising electrospun nylon nanofibers. The surface potential of the nanofibers are tuned by changing the voltage polarity in the electrospinning setup, employed between the needle and collector. The difference in surface potential leads to a different work function that is the key to design S-TENG with a single material only. Further, S-TENG is demonstrated as an ultrahigh sensitive acoustic sensor with mechanoacoustic sensitivity of ≈27 500 mV Pa-1 . Due to high sensitivity in the low-to-middle decibel (60-70 dB) sounds, S-TENG is highly capable in recognizing different voice signals depending on the condition of the vocal cord. This effective voice recognition ability indicates that it has high potential to open an alternative pathway for medical professionals to detect several diseases such as neurological voice disorder, muscle tension dysphonia, vocal cord paralysis, and speech delay/disorder related to laryngeal complications.

Assuntos

Nanofibras , Nanotecnologia , Fontes de Energia Elétrica , Nylons , Reconhecimento de Voz

9.

Voice Emotion Recognition by Mandarin-Speaking Children with Cochlear Implants.

Ren, Lei; Zhang, Yanmei; Zhang, Junbo; Qin, Yao; Zhang, Zhikai; Chen, Zhe; Wei, Chaogang; Liu, Yuhe.

Ear Hear ; 43(1): 165-180, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-34288631

RESUMO

OBJECTIVES: Emotional expressions are very important in social interactions. Children with cochlear implants can have voice emotion recognition deficits due to device limitations. Mandarin-speaking children with cochlear implants may face greater challenges than those speaking nontonal languages; the pitch information is not well preserved in cochlear implants, and such children could benefit from child-directed speech, which carries more exaggerated distinctive acoustic cues for different emotions. This study investigated voice emotion recognition, using both adult-directed and child-directed materials, in Mandarin-speaking children with cochlear implants compared with normal hearing peers. The authors hypothesized that both the children with cochlear implants and those with normal hearing would perform better with child-directed materials than with adult-directed materials. DESIGN: Thirty children (7.17-17 years of age) with cochlear implants and 27 children with normal hearing (6.92-17.08 years of age) were recruited in this study. Participants completed a nonverbal reasoning test, speech recognition tests, and a voice emotion recognition task. Children with cochlear implants over the age of 10 years also completed the Chinese version of the Nijmegen Cochlear Implant Questionnaire to evaluate the health-related quality of life. The voice emotion recognition task was a five-alternative, forced-choice paradigm, which contains sentences spoken with five emotions (happy, angry, sad, scared, and neutral) in a child-directed or adult-directed manner. RESULTS: Acoustic analyses showed substantial variations across emotions in all materials, mainly on measures of mean fundamental frequency and fundamental frequency range. Mandarin-speaking children with cochlear implants displayed a significantly poorer performance than normal hearing peers in voice emotion perception tasks, regardless of whether the performance is measured in accuracy scores, Hu value, or reaction time. Children with cochlear implants and children with normal hearing were mainly affected by the mean fundamental frequency in speech emotion recognition tasks. Chronological age had a significant effect on speech emotion recognition in children with normal hearing; however, there was no significant correlation between chronological age and accuracy scores in speech emotion recognition in children with implants. Significant effects of specific emotion and test materials (better performance with child-directed materials) in both groups of children were observed. Among the children with cochlear implants, age at implantation, percentage scores of nonverbal intelligence quotient test, and sentence recognition threshold in quiet could predict recognition performance in both accuracy scores and Hu values. Time wearing cochlear implant could predict reaction time in emotion perception tasks among children with cochlear implants. No correlation was observed between the accuracy score in voice emotion perception and the self-reported scores of health-related quality of life; however, the latter were significantly correlated with speech recognition skills among Mandarin-speaking children with cochlear implants. CONCLUSIONS: Mandarin-speaking children with cochlear implants could have significant deficits in voice emotion recognition tasks compared with their normally hearing peers and can benefit from the exaggerated prosody of child-directed speech. The effects of age at cochlear implantation, speech and language development, and cognition could play an important role in voice emotion perception by Mandarin-speaking children with cochlear implants.

Assuntos

Implante Coclear , Implantes Cocleares , Percepção da Fala , Adolescente , Adulto , Criança , Humanos , Qualidade de Vida , Reconhecimento de Voz

10.

Age-Related Changes in Voice Emotion Recognition by Postlingually Deafened Listeners With Cochlear Implants.

Cannon, Shauntelle A; Chatterjee, Monita.

Ear Hear ; 43(2): 323-334, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-34406157

RESUMO

OBJECTIVES: Identification of emotional prosody in speech declines with age in normally hearing (NH) adults. Cochlear implant (CI) users have deficits in the perception of prosody, but the effects of age on vocal emotion recognition by adult postlingually deaf CI users are not known. The objective of the present study was to examine age-related changes in CI users' and NH listeners' emotion recognition. DESIGN: Participants included 18 CI users (29.6 to 74.5 years) and 43 NH adults (25.8 to 74.8 years). Participants listened to emotion-neutral sentences spoken by a male and female talker in five emotions (happy, sad, scared, angry, neutral). NH adults heard them in four conditions: unprocessed (full spectrum) speech, 16-channel, 8-channel, and 4-channel noise-band vocoded speech. The adult CI users only listened to unprocessed (full spectrum) speech. Sensitivity (d') to emotions and Reaction Times were obtained using a single-interval, five-alternative, forced-choice paradigm. RESULTS: For NH participants, results indicated age-related declines in Accuracy and d', and age-related increases in Reaction Time in all conditions. Results indicated an overall deficit, as well as age-related declines in overall d' for CI users, but Reaction Times were elevated compared with NH listeners and did not show age-related changes. Analysis of Accuracy scores (hit rates) were generally consistent with d' data. CONCLUSIONS: Both CI users and NH listeners showed age-related deficits in emotion identification. The CI users' overall deficit in emotion perception, and their slower response times, suggest impaired social communication which may in turn impact overall well-being, particularly so for older CI users, as lower vocal emotion recognition scores have been associated with poorer subjective quality of life in CI patients.

Assuntos

Implante Coclear , Implantes Cocleares , Percepção da Fala , Adulto , Implantes Cocleares/psicologia , Feminino , Humanos , Masculino , Qualidade de Vida , Reconhecimento de Voz

11.

Neural Network-Enabled Flexible Pressure and Temperature Sensor with Honeycomb-like Architecture for Voice Recognition.

Su, Yue; Ma, Kainan; Zhang, Xu; Liu, Ming.

Sensors (Basel) ; 22(3)2022 Jan 19.

Artigo em Inglês | MEDLINE | ID: mdl-35161507

RESUMO

Flexible pressure sensors have been studied as wearable voice-recognition devices to be utilized in human-machine interaction. However, the development of highly sensitive, skin-attachable, and comfortable sensing devices to achieve clear voice detection remains a considerable challenge. Herein, we present a wearable and flexible pressure and temperature sensor with a sensitive response to vibration, which can accurately recognize the human voice by combing with the artificial neural network. The device consists of a polyethylene terephthalate (PET) printed with a silver electrode, a filament-microstructured polydimethylsiloxane (PDMS) film embedded with single-walled carbon nanotubes and a polyimide (PI) film sputtered with a patterned Ti/Pt thermistor strip. The developed pressure sensor exhibited a pressure sensitivity of 0.398 kPa-1 in the low-pressure regime, and the fabricated temperature sensor shows a desirable temperature coefficient of resistance of 0.13% ∘C in the range of 25 ∘C to 105 ∘C. Through training and testing the neural network model with the waveform data of the sensor obtained from human pronunciation, the vocal fold vibrations of different words can be successfully recognized, and the total recognition accuracy rate can reach 93.4%. Our results suggest that the fabricated sensor has substantial potential for application in the human-computer interface fields, such as voice control, vocal healthcare monitoring, and voice authentication.

Assuntos

Nanotubos de Carbono , Dispositivos Eletrônicos Vestíveis , Humanos , Redes Neurais de Computação , Temperatura , Reconhecimento de Voz

12.

How Long Does It Take for a Voice to Become Familiar? Speech Intelligibility and Voice Recognition Are Differentially Sensitive to Voice Training.

Holmes, Emma; To, Grace; Johnsrude, Ingrid S.

Psychol Sci ; 32(6): 903-915, 2021 06.

Artigo em Inglês | MEDLINE | ID: mdl-33979256

RESUMO

When people listen to speech in noisy places, they can understand more words spoken by someone familiar, such as a friend or partner, than someone unfamiliar. Yet we know little about how voice familiarity develops over time. We exposed participants (N = 50) to three voices for different lengths of time (speaking 88, 166, or 478 sentences during familiarization and training). These previously heard voices were recognizable and more intelligible when presented with a competing talker than novel voices-even the voice previously heard for the shortest duration. However, recognition and intelligibility improved at different rates with longer exposures. Whereas recognition was similar for all previously heard voices, intelligibility was best for the voice that had been heard most extensively. The speech-intelligibility benefit for the most extensively heard voice (10%-15%) is as large as that reported for voices that are naturally very familiar (friends and spouses)-demonstrating that the intelligibility of a voice can be improved substantially after only an hour of training.

Assuntos

Percepção da Fala , Voz , Humanos , Inteligibilidade da Fala , Reconhecimento de Voz , Treinamento da Voz

13.

Deep Learning Application for Vocal Fold Disease Prediction Through Voice Recognition: Preliminary Development Study.

Hu, Hao-Chun; Chang, Shyue-Yih; Wang, Chuen-Heng; Li, Kai-Jun; Cho, Hsiao-Yun; Chen, Yi-Ting; Lu, Chang-Jung; Tsai, Tzu-Pei; Lee, Oscar Kuang-Sheng.

J Med Internet Res ; 23(6): e25247, 2021 06 08.

Artigo em Inglês | MEDLINE | ID: mdl-34100770

RESUMO

BACKGROUND: Dysphonia influences the quality of life by interfering with communication. However, a laryngoscopic examination is expensive and not readily accessible in primary care units. Experienced laryngologists are required to achieve an accurate diagnosis. OBJECTIVE: This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence. METHODS: We collected 189 normal voice samples and 552 samples of individuals with voice disorders, including vocal atrophy (n=224), unilateral vocal paralysis (n=50), organic vocal fold lesions (n=248), and adductor spasmodic dysphonia (n=30). The 741 samples were divided into 2 sets: 593 samples as the training set and 148 samples as the testing set. A convolutional neural network approach was applied to train the model, and findings were compared with those of human specialists. RESULTS: The convolutional neural network model achieved a sensitivity of 0.66, a specificity of 0.91, and an overall accuracy of 66.9% for distinguishing normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. Compared with the accuracy of human specialists, the overall accuracy rates were 60.1% and 56.1% for the 2 laryngologists and 51.4% and 43.2% for the 2 general ear, nose, and throat doctors. CONCLUSIONS: Voice alone could be used for common vocal fold disease recognition through a deep learning approach after training with our Mandarin pathological voice database. This approach involving artificial intelligence could be clinically useful for screening general vocal fold disease using the voice. The approach includes a quick survey and a general health examination. It can be applied during telemedicine in areas with primary care units lacking laryngoscopic abilities. It could support physicians when prescreening cases by allowing for invasive examinations to be performed only for cases involving problems with automatic recognition or listening and for professional analyses of other clinical examination results that reveal doubts about the presence of pathologies.

Assuntos

Aprendizado Profundo , Prega Vocal , Inteligência Artificial , Humanos , Qualidade de Vida , Reconhecimento de Voz

14.

Challenges and opportunities for telehealth assessment during COVID-19: iT-RES, adapting a remote version of the test for rating emotions in speech.

Ben-David, Boaz M; Mentzel, Maya; Icht, Michal; Gilad, Maya; Dor, Yehuda I; Ben-David, Sarah; Carl, Micalle; Shakuf, Vered.

Int J Audiol ; 60(5): 319-321, 2021 05.

Artigo em Inglês | MEDLINE | ID: mdl-33063553

RESUMO

OBJECTIVE: COVID-19 social isolation restrictions have accelerated the need to adapt clinical assessment tools to telemedicine. Remote adaptations are of special importance for populations at risk, e.g. older adults and individuals with chronic medical comorbidities. In response to this urgent clinical and scientific need, we describe a remote adaptation of the T-RES (Oron et al. 2020; IJA), designed to assess the complex processing of spoken emotions, based on identification and integration of the semantics and prosody of spoken sentences. DESIGN: We present iT-RES, an online version of the speech-perception assessment tool, detailing the challenges considered and solution chosen when designing the telehealth tool. We show a preliminary validation of performance against the original lab-based T-RES. STUDY SAMPLE: A between-participants design, within two groups of 78 young adults (T-RES, n = 39; iT-RES, n = 39). RESULTS: i-TRES performance closely followed that of T-RES, with no group differences found in the main trends, identification of emotions, selective attention, and integration. CONCLUSIONS: The design of iT-RES mapped the main challenges for remote auditory assessments, and solutions taken to address them. We hope that this will encourage further efforts for telehealth adaptations of clinical services, to meet the needs of special populations and avoid halting scientific research.

Assuntos

Audiologia/métodos , Audiometria da Fala/métodos , COVID-19 , Telemedicina/métodos , Reconhecimento de Voz , Adulto , Atenção , Emoções , Feminino , Humanos , Masculino , Quarentena , SARS-CoV-2 , Semântica , Percepção da Fala , Adulto Jovem

15.

Alternative Input for Perfusion Management Devices: Voice Recognition for Data Input and the Effects on Charting and Perioperative Calculation Use.

Lung, Kara; Brummer, Brandi; Sanderson, Scott; Holt, David W.

J Extra Corpor Technol ; 53(4): 286-292, 2021 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-34992319

RESUMO

Technology in healthcare has become increasingly prevalent and user friendly. In the last decade, advances in hands-free methods of data input have become more viable in a variety of medical professions. The aim of this study was to assess the advantages or disadvantages of hands-free charting through a voice-to-text app designed for perfusionists. Twelve clinical perfusion students using two different simulated bypass cases were recorded and assessed for the number of events noticed and charted, as well as the speed at which they accomplished these steps. Paper charts were compared with a custom app with voice-to-text charting capability. Data was analyzed using linear mixed models to detect differences in length of time until a chartable event was noticed, and how long after noticing an event it took to record the event. Timeliness of recording an event was made by assessing log-transformed time data. There was significantly more information recorded when charting on paper, while charting with voice-to-text resulted in significantly faster mean time from noticing an event to the recording of it. There was no significant difference between how many events were noticed and recorded. When using paper charting, a higher percentage of events that were missed were drug administration events, while voice charting had a higher percentage of missed events that were associated with cardioplegia delivery or bypass timing. With a decreased time interval between noticing an event and charting the event, speech-to-text for perfusion could be of benefit in situations where many events occur at once, such as emergency situations or highly active portions of bypass such as initiation and termination. While efforts were made to make the app as intuitive as possible, there is room for improvement.

Assuntos

Reconhecimento de Voz , Voz , Humanos , Perfusão , Interface Usuário-Computador

16.

A Comparison of Voice Recognition Program and Traditional Keyboard Charting for Nurse Documentation.

Mayer, LeAnn; Xu, Dongjuan; Edwards, Nancy; Bokhart, Gordon.

Comput Inform Nurs ; 40(2): 90-94, 2021 Aug 04.

Artigo em Inglês | MEDLINE | ID: mdl-34347642

RESUMO

The purposes of this study are threefold: (1) compare the document times between a voice recognition system and keyboard charting, (2) compare the number of errors between the two methods, and (3) identify factors influencing documentation time. Voice recognition systems are considered a potential solution to decrease documentation time. However, little is known to what extent voice recognition systems can save nurses' documentation time. A pilot, simulation study was conducted using a voice recognition system and keyboard charting with 15 acute care nurses. A crossover method with repeated measures was utilized. Each nurse was given two simple and two complex assessment scenarios, assigned in random order, to document using both methods. Paired t-tests and multivariate linear regression models were used for data analysis. The voice recognition method saved the nurses 2.3 minutes (simple scenario) and 6.1 minutes (complex scenario) on average and was statistically significant (P < .001). There were no significant differences in errors or factors identified influencing documentation times. Eighty percent reported a preference of using voice recognition systems, and 87% agreed this method helped speed up charting. This study can show how a voice recognition system can improve documentation times compared with keyboard charting while still having thorough documentation.

Assuntos

Cuidados de Enfermagem , Reconhecimento de Voz , Cuidados Críticos , Documentação , Humanos

17.

A Personalized Voice-Based Diet Assistant for Caregivers of Alzheimer Disease and Related Dementias: System Development and Validation.

Li, Juan; Maharjan, Bikesh; Xie, Bo; Tao, Cui.

J Med Internet Res ; 22(9): e19897, 2020 09 21.

Artigo em Inglês | MEDLINE | ID: mdl-32955452

RESUMO

BACKGROUND: The world's aging population is increasing, with an expected increase in the prevalence of Alzheimer disease and related dementias (ADRD). Proper nutrition and good eating behavior show promise for preventing and slowing the progression of ADRD and consequently improving patients with ADRD's health status and quality of life. Most ADRD care is provided by informal caregivers, so assisting caregivers to manage patients with ADRD's diet is important. OBJECTIVE: This study aims to design, develop, and test an artificial intelligence-powered voice assistant to help informal caregivers manage the daily diet of patients with ADRD and learn food and nutrition-related knowledge. METHODS: The voice assistant is being implemented in several steps: construction of a comprehensive knowledge base with ontologies that define ADRD diet care and user profiles, and is extended with external knowledge graphs; management of conversation between users and the voice assistant; personalized ADRD diet services provided through a semantics-based knowledge graph search and reasoning engine; and system evaluation in use cases with additional qualitative evaluations. RESULTS: A prototype voice assistant was evaluated in the lab using various use cases. Preliminary qualitative test results demonstrate reasonable rates of dialogue success and recommendation correctness. CONCLUSIONS: The voice assistant provides a natural, interactive interface for users, and it does not require the user to have a technical background, which may facilitate senior caregivers' use in their daily care tasks. This study suggests the feasibility of using the intelligent voice assistant to help caregivers manage patients with ADRD's diet.

Assuntos

Doença de Alzheimer/terapia , Cuidadores/normas , Demência/terapia , Dietoterapia/métodos , Dieta/métodos , Qualidade de Vida/psicologia , Idoso , Feminino , Humanos , Masculino , Reprodutibilidade dos Testes , Reconhecimento de Voz

18.

Superiority of blind over sighted listeners in voice recognition.

Pang, Wenbin; Xing, Hongbing; Zhang, Linjun; Shu, Hua; Zhang, Yang.

J Acoust Soc Am ; 148(2): EL208, 2020 08.

Artigo em Inglês | MEDLINE | ID: mdl-32873006

RESUMO

The current study examined whether the blind are superior to sighted listeners in voice recognition. Three subject groups, including 17 congenitally blind, 18 late blind, and 18 sighted, showed no significant differences in the immediate voice recognition test. In the delayed test conducted two weeks later, however, both congenitally blind and late blind groups performed better than the sighted with no significant difference between the two blind groups. These results partly confirmed the anecdotal observation about the blind's superiority in voice recognition, which resides mainly in delayed memory phase but not in immediate recall and generalization phase.

Assuntos

Reconhecimento de Voz , Voz , Cegueira/diagnóstico , Generalização Psicológica , Humanos , Visão Ocular

19.

Wheelchair Neuro Fuzzy Control and Tracking System Based on Voice Recognition.

Abdulghani, Mokhles M; Al-Aubidy, Kasim M; Ali, Mohammed M; Hamarsheh, Qadri J.

Sensors (Basel) ; 20(10)2020 May 19.

Artigo em Inglês | MEDLINE | ID: mdl-32438575

RESUMO

Autonomous wheelchairs are important tools to enhance the mobility of people with disabilities. Advances in computer and wireless communication technologies have contributed to the provision of smart wheelchairs to suit the needs of the disabled person. This research paper presents the design and implementation of a voice controlled electric wheelchair. This design is based on voice recognition algorithms to classify the required commands to drive the wheelchair. An adaptive neuro-fuzzy controller has been used to generate the required real-time control signals for actuating motors of the wheelchair. This controller depends on real data received from obstacle avoidance sensors and a voice recognition classifier. The wheelchair is considered as a node in a wireless sensor network in order to track the position of the wheelchair and for supervisory control. The simulated and running experiments demonstrate that, by combining the concepts of soft-computing and mechatronics, the implemented wheelchair has become more sophisticated and gives people more mobility.

Assuntos

Pessoas com Deficiência , Reconhecimento de Voz , Cadeiras de Rodas , Algoritmos , Computadores , Desenho de Equipamento , Humanos

20.

Evaluation of the Severity of Major Depression Using a Voice Index for Emotional Arousal.

Shinohara, Shuji; Toda, Hiroyuki; Nakamura, Mitsuteru; Omiya, Yasuhiro; Higuchi, Masakazu; Takano, Takeshi; Saito, Taku; Tanichi, Masaaki; Boku, Shuken; Mitsuyoshi, Shunji; So, Mirai; Yoshino, Aihide; Tokuno, Shinichi.

Sensors (Basel) ; 20(18)2020 Sep 04.

Artigo em Inglês | MEDLINE | ID: mdl-32899881

RESUMO

Recently, the relationship between emotional arousal and depression has been studied. Focusing on this relationship, we first developed an arousal level voice index (ALVI) to measure arousal levels using the Interactive Emotional Dyadic Motion Capture database. Then, we calculated ALVI from the voices of depressed patients from two hospitals (Ginza Taimei Clinic (H1) and National Defense Medical College hospital (H2)) and compared them with the severity of depression as measured by the Hamilton Rating Scale for Depression (HAM-D). Depending on the HAM-D score, the datasets were classified into a no depression (HAM-D < 8) and a depression group (HAM-D ≥ 8) for each hospital. A comparison of the mean ALVI between the groups was performed using the Wilcoxon rank-sum test and a significant difference at the level of 10% (p = 0.094) at H1 and 1% (p = 0.0038) at H2 was determined. The area under the curve (AUC) of the receiver operating characteristic was 0.66 when categorizing between the two groups for H1, and the AUC for H2 was 0.70. The relationship between arousal level and depression severity was indirectly suggested via the ALVI.

Assuntos

Nível de Alerta , Transtorno Depressivo Maior , Reconhecimento de Voz , Adulto , Idoso , Depressão/diagnóstico , Transtorno Depressivo Maior/diagnóstico , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Escalas de Graduação Psiquiátrica , Índice de Gravidade de Doença , Adulto Jovem

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA