Results 1 - 20 of 54
1.
Alzheimers Dement ; 20(4): 2384-2396, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38299756

ABSTRACT

INTRODUCTION: We investigated the validity, feasibility, and effectiveness of a voice recognition-based digital cognitive screener (DCS) for detecting dementia and mild cognitive impairment (MCI) in a large community sample of older adults. METHODS: Eligible participants completed demographic, cognitive, and functional assessments and the DCS. Neuropsychological tests were used to assess domain-specific and global cognition, while the diagnosis of MCI and dementia relied on the Clinical Dementia Rating Scale. RESULTS: Among the 11,186 participants, the DCS showed a high completion rate (97.5%) and a short administration time (5.9 min) across gender, age, and education groups. The DCS demonstrated areas under the receiver operating characteristic curve (AUCs) of 0.95 and 0.83 for dementia and MCI detection, respectively, among 328 participants in the validation phase. Furthermore, the DCS resulted in time savings of 16.2% to 36.0% compared to the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA). DISCUSSION: This study suggests that the DCS is an effective and efficient tool for dementia and MCI case-finding in large-scale cognitive screening. HIGHLIGHTS: To the best of our knowledge, this is the first cognitive screening tool based on voice recognition and conversational AI to be assessed in a large population of Chinese community-dwelling older adults. With the upgrade to a new multimodal understanding model, the DCS can accurately assess participants' responses, including different Chinese dialects, and provide automatic scores. The DCS not only exhibited good discriminant ability in detecting dementia and MCI cases, but also demonstrated a high completion rate and efficient administration regardless of gender, age, and education differences. The DCS is economically efficient, scalable, and showed better screening efficacy than the MMSE or MoCA, supporting wider implementation.
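The AUC figures above can be read as a rank statistic: the probability that a randomly chosen case scores higher on the screener than a randomly chosen control. A minimal sketch with simulated scores (sizes echo the 328-participant validation phase; all values are invented, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated screener scores: 50 dementia cases vs. 278 controls (invented)
cases = rng.normal(2.0, 1.0, 50)
controls = rng.normal(0.0, 1.0, 278)

# AUC = P(random case outscores a random control), the Mann-Whitney view
auc = (cases[:, None] > controls[None, :]).mean()
```

With this degree of separation the AUC lands near 0.92; an AUC of 0.95, as reported for dementia detection, implies even stronger separation between cases and controls.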


Subject(s)
Cognitive Dysfunction, Dementia, Adult, Humans, Middle Aged, Aged, Dementia/epidemiology, Feasibility Studies, Independent Living, Voice Recognition, Cognitive Dysfunction/epidemiology, Cognition, Neuropsychological Tests, Reproducibility of Results, China/epidemiology
2.
Appl Ergon ; 116: 104184, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38048717

ABSTRACT

Trust in automated vehicle systems (AVs) can impact the experience and safety of drivers and passengers. This work investigates the use of speech to measure drivers' trust in AVs. Seventy-five participants were randomly assigned to a high-trust group (AVs with 100% correctness, no crashes, and 4 system messages with visual-auditory take-over requests (TORs)) or a low-trust group (AVs with 60% correctness, a 40% crash rate, and 2 system messages with visual-only TORs). Voice interaction tasks were used to collect speech during the driving process. The results revealed that our settings successfully induced trust and distrust states. The speech features extracted from the two trust groups were used to train a back-propagation neural network, which was evaluated on its ability to predict the trust classification. The highest classification accuracy was 90.80%. This study proposes a method for accurately measuring trust in automated vehicles using voice recognition.
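As a rough illustration of the classification step (synthetic features and invented group means, not the study's data), a small back-propagation network can be trained to separate two trust states:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 150
# Pretend the 4 columns are speech features (e.g. pitch mean, intensity,
# jitter, speaking rate); group means are invented for illustration
X_high = rng.normal(0.0, 1.0, (n, 4))   # high-trust group
X_low = rng.normal(1.0, 1.0, (n, 4))    # low-trust group
X = np.vstack([X_high, X_low])
y = np.array([1] * n + [0] * n)         # 1 = high trust, 0 = low trust

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
# A small back-propagation (MLP) classifier, echoing the study's approach
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

The attainable accuracy depends entirely on how separable the induced trust states are in feature space; the 90.80% reported above suggests strong separation in the real speech data.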


Subject(s)
Automobile Driving, Autonomous Vehicles, Humans, Automation, Voice Recognition, Trust, Traffic Accidents
3.
World Neurosurg ; 183: e243-e249, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38103686

ABSTRACT

BACKGROUND: Many predictive models for estimating clinical outcomes after spine surgery have been reported in the literature. However, implementation of predictive scores in practice is limited by the time-intensive nature of manually abstracting relevant predictors. In this study, we designed natural language processing (NLP) algorithms to automate data abstraction for the thoracolumbar injury classification score (TLICS). METHODS: We retrieved the radiology reports of all Mayo Clinic patients with an International Classification of Diseases, 9th or 10th revision, code corresponding to a fracture of the thoracolumbar spine between January 2005 and October 2020. Annotated data were used to train an N-gram NLP model using machine learning methods, including random forest, stepwise linear discriminant analysis, k-nearest neighbors, and penalized logistic regression models. RESULTS: A total of 1085 spine radiology reports were included in our analysis. Our dataset included 483 compression, 401 burst, 103 translational/rotational, and 98 distraction fractures. A total of 103 reports had documented an injury of the posterior ligamentous complex. The overall accuracy of the random forest model for fracture morphology feature detection was 76.96% versus 65.90% in the stepwise linear discriminant analysis, 50.69% in the k-nearest neighbors, and 62.67% in the penalized logistic regression. The overall accuracy to detect posterior ligamentous complex integrity was highest in the random forest model at 83.41%. Our random forest model was implemented in the backend of a web application in which users can dictate reports and have TLICS features automatically extracted. CONCLUSIONS: We have developed a machine learning NLP model for extracting TLICS features from radiology reports, which we deployed in a web application that can be integrated into clinical practice.
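A minimal sketch of the n-gram-plus-random-forest idea, using an invented mini-corpus rather than the Mayo reports (fracture labels and report phrasings are illustrative only):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier

# Invented report snippets, labeled with two of the TLICS morphology classes
reports = [
    "anterior wedge compression deformity of L1 vertebral body",
    "mild compression fracture without posterior element involvement",
    "compression fracture of T12 with minimal height loss",
    "burst fracture with retropulsion into the spinal canal",
    "comminuted burst fracture of the L2 vertebral body",
    "burst fracture with canal compromise and retropulsed fragment",
]
labels = ["compression"] * 3 + ["burst"] * 3

# Unigram + bigram counts as features, fed to a random forest
vec = CountVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(reports)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)

pred = clf.predict(
    vec.transform(["burst fracture of L1 with retropulsion into the canal"])
)[0]
```

The real pipeline trains on annotated clinical reports and extracts several TLICS features (morphology, posterior ligamentous complex integrity); this toy version shows only the feature-extraction-to-classifier plumbing.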


Subject(s)
Bone Fractures, Radiology, Humans, Natural Language Processing, Voice Recognition, Lumbar Vertebrae/diagnostic imaging, Lumbar Vertebrae/injuries, Thoracic Vertebrae/diagnostic imaging, Thoracic Vertebrae/injuries
4.
Audiol., Commun. res ; 29: e2778, 2024. tab, graf
Article in Portuguese | LILACS | ID: biblio-1533839

ABSTRACT



ABSTRACT Purpose This study aimed to evaluate the contribution of assistive listening technology with wireless connectivity for cochlear implant (CI) users in reverberation and noise. Methods Prospective cross-sectional study approved by the Institutional Ethics Committee (CAAE 8 3031418.4.0000.0068). Adolescent and adult CI users with pre- or post-lingual deafness were selected. For bilateral users, each ear was assessed separately. Speech recognition was assessed using recorded lists of disyllabic words presented at 65 dBA at 0° azimuth with and without the Wireless Mini Microphone 2 (Cochlear™) connected to the Nucleus®6 speech processor. Room reverberation was measured as 550 ms. To assess the contribution of the assistive listening device (ALD) in a reverberant environment, speech recognition was assessed in quiet. To assess its contribution in reverberation and noise, speech was presented at 0° azimuth with multi-talker babble noise coming from 8 loudspeakers symmetrically arranged 2 meters from the center, at a signal-to-noise ratio of +10 dB. To avoid learning bias or fatigue, the order of the tests was randomized. Means were compared using the t test for paired samples, adopting a significance level of p < 0.005. Results Seventeen patients with a mean age of 40 years were invited and agreed to participate, including 2 bilateral participants, totaling 19 ears assessed. There was a significant positive contribution from the Mini Mic2 in reverberation and in noise+reverberation (p < 0.001). Conclusion The ALD improved speech recognition of CI users in both reverberant and noisy situations.


Subject(s)
Humans, Male, Female, Adult, Self-Help Devices, Noise Measurement, Cochlear Implantation, Deafness, Voice Recognition, Speech Intelligibility, Cross-Sectional Studies
6.
Sci Rep ; 13(1): 18742, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37907749

ABSTRACT

Human voice recognition over telephone channels typically yields lower accuracy compared to audio recorded in a higher-quality studio environment. Here, we investigated the extent to which audio in video conferencing, subject to various lossy compression mechanisms, affects human voice recognition performance. Voice recognition was tested in an old-new recognition task under three audio conditions (telephone, Zoom, studio) across all matched (familiarization and test with the same audio condition) and mismatched (familiarization and test with different audio conditions) combinations. Participants were familiarized with female voices presented in either studio-quality (N = 22), Zoom-quality (N = 21), or telephone-quality (N = 20) stimuli. Subsequently, all listeners performed an identical voice recognition test containing a balanced stimulus set from all three conditions. Results revealed that voice recognition performance (d') in Zoom audio was not significantly different from studio audio, but listeners performed significantly better in both Zoom and studio audio than in telephone audio. This suggests that the signal processing of the speech codec used by Zoom preserves information as relevant for voice recognition as studio audio. Interestingly, listeners familiarized with voices via Zoom audio showed a trend towards better recognition performance in the test (p = 0.056) compared to listeners familiarized with studio audio. We discuss future directions according to which a possible advantage of Zoom audio for voice recognition might be related to some of the speech coding mechanisms used by Zoom.
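The d' measure used here combines hits and false alarms in an old-new task into a bias-free sensitivity index: d' = z(hit rate) - z(false-alarm rate). A small sketch with made-up rates (not the study's values):

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Signal-detection sensitivity: difference of z-transformed rates."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical per-condition rates, purely for illustration
d_studio = d_prime(0.85, 0.20)   # would indicate higher sensitivity
d_phone = d_prime(0.70, 0.35)    # would indicate lower sensitivity
```

Because hit and false-alarm rates enter separately, d' distinguishes genuine sensitivity from a listener's bias toward answering "old".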


Subject(s)
Speech Perception, Voice, Humans, Female, Voice Recognition, Speech, Acoustics
7.
Mult Scler ; 29(13): 1676-1679, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37842762

ABSTRACT

BACKGROUND: We previously demonstrated the convergent validity of a fully automated voice recognition analogue of the Symbol Digit Modalities Test (VR-SDMT) for evaluating processing speed in people with multiple sclerosis (pwMS). OBJECTIVE/METHODS: We aimed to replicate these results in 54 pwMS and 18 healthy controls (HCs), demonstrating the VR-SDMT's reliability. RESULTS: Significant correlations were found between the VR-SDMT and the traditional oral SDMT in the multiple sclerosis (MS) (r = -0.771, p < 0.001) and HC (r = -0.785, p < 0.001) groups. CONCLUSION: Taken collectively, our two studies demonstrate the reliability and validity of the VR-SDMT for assessing processing speed in pwMS.


Subject(s)
Multiple Sclerosis, Voice Recognition, Humans, Reproducibility of Results, Neuropsychological Tests, Processing Speed
8.
Science ; 382(6669): 417-423, 2023 10 27.
Article in English | MEDLINE | ID: mdl-37883535

ABSTRACT

Faces and voices are the dominant social signals used to recognize individuals among primates. Yet, it is not known how these signals are integrated into a cross-modal representation of individual identity in the primate brain. We discovered that, although single neurons in the marmoset hippocampus exhibited selective responses when presented with the face or voice of a specific individual, a parallel mechanism for representing the cross-modal identities for multiple individuals was evident within single neurons and at the population level. Manifold projections likewise showed the separability of individuals as well as clustering for others' families, which suggests that multiple learned social categories are encoded as related dimensions of identity in the hippocampus. Neural representations of identity in the hippocampus are thus both modality independent and reflect the primate social network.


Subject(s)
Callithrix, Facial Recognition, Hippocampus, Neurons, Social Identification, Voice Recognition, Animals, Hippocampus/cytology, Hippocampus/physiology, Callithrix/physiology, Callithrix/psychology, Facial Recognition/physiology, Voice Recognition/physiology, Neurons/physiology, Social Network
9.
J Alzheimers Dis ; 95(1): 227-236, 2023.
Article in English | MEDLINE | ID: mdl-37482999

ABSTRACT

BACKGROUND: A rapid digital instrument is needed to facilitate community-based screening of mild cognitive impairment (MCI) and Alzheimer's disease (AD) in China. OBJECTIVE: We developed a voice recognition-based cognitive assessment (Shanghai Cognitive Screening, SCS) on mobile devices and evaluated its diagnostic performance. METHODS: Participants (N = 251) including healthy controls (N = 98), subjective cognitive decline (SCD, N = 42), MCI (N = 80), and mild AD (N = 31) were recruited from the memory clinic at Shanghai Sixth People's Hospital. The SCS is fully self-administered, takes about six minutes, and measures visual memory, language, and executive function. Participants were instructed to complete the SCS tests, gold-standard neuropsychological tests, and standardized structural 3T brain MRI. RESULTS: Cronbach's alpha for the overall scale was 0.910, indicating high internal consistency. The SCS total score had an AUC of 0.921 to detect AD (sensitivity = 0.903, specificity = 0.945, positive predictive value = 0.700, negative predictive value = 0.986, likelihood ratio = 16.42, number needed for screening utility = 0.639), and an AUC of 0.838 to detect MCI (sensitivity = 0.793, specificity = 0.671, positive predictive value = 0.657, negative predictive value = 0.803, likelihood ratio = 2.41, number needed for screening utility = 0.944). The subtests demonstrated moderate to high correlations with the gold-standard tests from their respective cognitive domains. The SCS total score and its memory scores all correlated positively with relative volumes of the whole hippocampus and almost all subregions, after controlling for age, sex, and education. CONCLUSION: The SCS has good diagnostic accuracy for detecting MCI and AD dementia and has the potential to facilitate large-scale screening in the general community.
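The screening metrics reported for the SCS all derive from a 2x2 confusion matrix. The sketch below uses hypothetical counts chosen to land near the reported AD values (the actual per-cell counts are not given in the abstract):

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from confusion-matrix counts."""
    sens = tp / (tp + fn)        # sensitivity (true positive rate)
    spec = tn / (tn + fp)        # specificity (true negative rate)
    ppv = tp / (tp + fp)         # positive predictive value
    npv = tn / (tn + fn)         # negative predictive value
    lr_pos = sens / (1 - spec)   # positive likelihood ratio
    return sens, spec, ppv, npv, lr_pos

# Hypothetical counts: 31 AD cases, 220 non-AD participants
sens, spec, ppv, npv, lr_pos = screening_metrics(tp=28, fp=12, fn=3, tn=208)
```

With these counts, sensitivity is about 0.903 and specificity about 0.945, matching the reported figures; note how the modest PPV (0.700) reflects the low prevalence of AD in the sample rather than poor discrimination.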


Subject(s)
Alzheimer Disease, Cognitive Dysfunction, Tool Use Behavior, Humans, Voice Recognition, China, Cognitive Dysfunction/diagnosis, Alzheimer Disease/diagnosis, Alzheimer Disease/psychology, Neuropsychological Tests, Cognition
10.
J Acoust Soc Am ; 154(1): 126-140, 2023 07 01.
Article in English | MEDLINE | ID: mdl-37432052

ABSTRACT

Creaky voice, a non-modal aperiodic phonation that is often associated with low pitch targets, has been found to not only correlate linguistically with prosodic boundary, tonal categories, and pitch range, but also socially with age, gender, and social status. However, it is still not clear whether co-varying factors such as prosodic boundary, pitch range, and tone could, in turn, affect listeners' identification of creak. To fill this gap, this current study examines how creaky voice is identified in Mandarin through experimental data, aiming to enhance our understanding of cross-linguistic perception of creaky voice and, more broadly, speech perception in multi-variable contexts. Our results reveal that in Mandarin, creak identification is context-dependent: factors including prosodic position, tone, pitch range, and the amount of creak all affect how Mandarin listeners identify creak. This reflects listeners' knowledge about the distribution of creak in linguistically universal (e.g., prosodic boundary) and language-specific (e.g., lexical tone) environments.


Subject(s)
Speech Perception, Voice Recognition, Phonation, Language, Linguistics
11.
PLoS One ; 18(3): e0272545, 2023.
Article in English | MEDLINE | ID: mdl-36952436

ABSTRACT

BACKGROUND: In 2013, Marshfield Clinic Health System (MCHS) implemented the Dragon Medical One (DMO) system provided by Nuance Management Center (NMC) for Real-Time Dictation (RTD), embracing the idea of streamlined clinic workflow, reduced dictation hours, and improved documentation legibility. Since then, MCHS has observed a trend of reduced time in documentation; however, the target of 100% adoption of voice recognition (VR)-based RTD has not been met. OBJECTIVE: To evaluate the uptake/adoption of VR technology for RTD in MCHS between 2018 and 2020. METHODS: DMO data for 1,373 MCHS providers from 2018-2020 were analyzed. The study outcome was VR uptake, defined as the median number of hours each provider used VR technology to dictate patient information, classified as no/yes. Covariates included sex, age, US-trained/international medical graduates, trend, specialty, and facility. Descriptive statistics and unadjusted and adjusted logistic regression analyses were performed using Stata/SE version 17. P-values less than or equal to 0.05 were considered statistically significant. RESULTS: Of the 1,373 MCHS providers, the mean (SD) age was 48.3 (12.4) years. VR uptake was more common than no uptake (72.0% vs. 28.0%). In both unadjusted and adjusted analyses, the odds of VR uptake were 4.3 and 7.7 times higher in 2019-2020 compared to 2018, respectively (OR:4.30,95%CI:2.44-7.46 and AOR:7.74,95%CI:2.51-23.86). The odds of uptake were about half as high among US-trained physicians as among internationally-trained physicians (OR:0.53,95%CI:0.37-0.76 and AOR:0.58,95%CI:0.35-0.97), and lower among physicians aged 60 or above than among physicians aged 29 or less (OR:0.20,95%CI:0.10-0.59, and AOR:0.17,95%CI:0.27-1.06). CONCLUSION: Since 2018, VR adoption has increased significantly across MCHS. However, it was lower among US-trained physicians than among internationally-trained physicians (although internationally-trained physicians were in the minority) and lower among more senior physicians than among younger physicians. These findings provide critical information about VR trends and physician factors, and identify which providers could benefit from additional training to increase VR adoption in healthcare systems.
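The odds ratios reported here are exponentiated logistic regression coefficients. A toy illustration of that relationship (the coefficient value is invented to land near the study's unadjusted OR of 4.30, not taken from its model output):

```python
import math

# A logistic-regression coefficient exponentiates to an odds ratio
beta_year = 1.459              # hypothetical coefficient, 2019-2020 vs 2018
or_year = math.exp(beta_year)  # on the scale of the reported OR of 4.30

# The inverse holds too: log of an OR recovers the coefficient
beta_back = math.log(or_year)
```

This is why ORs below 1 (e.g. 0.53 for US-trained physicians) correspond to negative coefficients, i.e. lower odds of uptake relative to the reference group.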


Subject(s)
Physicians, Voice Recognition, Humans, Retrospective Studies, Ambulatory Care Facilities, Delivery of Health Care
12.
ACS Appl Mater Interfaces ; 15(9): 12551-12559, 2023 Mar 08.
Article in English | MEDLINE | ID: mdl-36808950

ABSTRACT

Intelligent sensors have attracted substantial attention for various applications, including wearable electronics, artificial intelligence, healthcare monitoring, and human-machine interactions. However, there still remains a critical challenge in developing a multifunctional sensing system for complex signal detection and analysis in practical applications. Here, we develop a machine learning-combined flexible sensor for real-time tactile sensing and voice recognition through laser-induced graphitization. The intelligent sensor with a triboelectric layer can convert local pressure to an electrical signal through a contact electrification effect without external bias, which has a characteristic response behavior when exposed to various mechanical stimuli. With the special patterning design, a smart human-machine interaction controlling system composed of a digital arrayed touch panel is constructed to control electronic devices. Based on machine learning, the real-time monitoring and recognition of the changes of voice are achieved with high accuracy. The machine learning-empowered flexible sensor provides a promising platform for the development of flexible tactile sensing, real-time health detection, human-machine interaction, and intelligent wearable devices.


Subject(s)
Artificial Intelligence, Wearable Electronic Devices, Humans, Voice Recognition, Electricity, Machine Learning
13.
Q J Exp Psychol (Hove) ; 76(12): 2804-2822, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36718784

ABSTRACT

Voice identification parades can be unreliable due to the error-prone nature of earwitness responses. UK government guidelines recommend that voice parades should have nine voices, each played for 60 s. This makes parades resource-consuming to construct. In this article, we conducted two experiments to see if voice parade procedures could be simplified. In Experiment 1 (N = 271, 135 female), we investigated whether reducing the duration of the voice samples on a nine-voice parade would negatively affect identification performance, using both conventional logistic and signal detection approaches. In Experiment 2 (N = 270, 136 female), we first explored whether the same sample duration conditions used in Experiment 1 would lead to different outcomes if we reduced the parade size to only six voices. Following this, we pooled the data from both experiments to investigate the influence of target-position effects. The results show that 15-s sample durations yield statistically equivalent voice identification performance to the longer 60-s sample durations, but that the 30-s sample duration suffers in terms of overall signal sensitivity. This pattern of results was replicated using both a nine- and a six-voice parade. Performance on target-absent parades was at chance level in both parade sizes, and response criteria were mostly liberal. In addition, unwanted position effects were present. The results provide initial evidence that the sample duration used in a voice parade may be reduced, but we argue that the guidelines recommending a parade with nine voices should be maintained to provide additional protection for a potentially innocent suspect, given the low target-absent accuracy.


Subject(s)
Voice Recognition, Voice, Female, Humans, Voice/physiology, Male
14.
Comput Biol Med ; 152: 106336, 2023 01.
Article in English | MEDLINE | ID: mdl-36473341

ABSTRACT

Silent speech recognition (SSR) is a system that enables speech communication when an audible signal is unavailable, using surface electromyography (sEMG)-based speech recognition. Researchers have used surface electrodes to record the electrical activation potentials of the human articulation muscles to recognize speech content. SSR can be used for pilot-assisted speech recognition, communication for individuals with speech impairment, private communication, and other applications. In this feasibility study, we collected sEMG data for ten single Mandarin numeric words. After reducing power-line interference and power supply noise in the sEMG signal, short-term energy (STE) was used for voice activity detection (VAD). Power spectrum features were extracted and fed into a classifier for the final identification results. We used the hold-out method to divide the data into training and test sets in a 7:3 ratio, achieving an average accuracy of 92.3% and a maximum of 100% with a support vector machine (SVM) classifier. Experimental results showed that the proposed method has development potential and is effective at identifying isolated words from the sEMG signals of the articulation muscles.
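A hedged sketch of the final classification step, with synthetic features standing in for the sEMG power spectra (three "words" instead of ten, class means invented):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n_per_word, n_features = 40, 16
# Simulated spectral feature vectors for three "words" with distinct means
X = np.vstack(
    [rng.normal(mu, 0.5, (n_per_word, n_features)) for mu in (0.0, 1.0, 2.0)]
)
y = np.repeat(np.arange(3), n_per_word)

# Hold-out split in a 7:3 ratio, as in the study
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
accuracy = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)
```

On these cleanly separated synthetic classes the SVM scores near 100%; the 92.3% average reported above reflects the harder, noisier real sEMG features.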


Subject(s)
Speech Disorders, Voice Recognition, Humans, Electromyography/methods, Speech, Technology, Algorithms
15.
Sex., salud soc. (Rio J.) ; (39): e22202, 2023.
Article in Portuguese | LILACS | ID: biblio-1450501

ABSTRACT



Abstract In this article, we seek to reflect on the voice production of trans persons and travestis through biomedical practices and technologies of genderfication. For two years, we conducted ethnographic research, engaging in participant observation, interviews, and daily follow-up at the Trans Outpatient Service of the Federal University of São Paulo. Voice timbres evoke bodies that tend to be perceived as male or female by those who hear them. A person who does not vocalize in a manner that confirms this linearity may have their gender questioned. Individuals who cross this linearity seek to enact their voice, establishing complex negotiations with health professionals and health services. The new voice emerges from these negotiations, in encounters between hormones and speech therapy practices.




Subject(s)
Humans, Male, Female, Voice Recognition/drug effects, Voice Training
16.
Neuroimage ; 263: 119647, 2022 11.
Article in English | MEDLINE | ID: mdl-36162634

ABSTRACT

Recognising a speaker's identity by the sound of their voice is important for successful interaction. The skill depends on our ability to discriminate minute variations in the acoustics of the vocal signal. Performance on voice identity assessments varies widely across the population. The neural underpinnings of this ability and its individual differences, however, remain poorly understood. Here we provide critical tests of a theoretical framework for the neural processing stages of voice identity and address how individual differences in identity discrimination mediate activation in this neural network. We scanned 40 individuals on an fMRI adaptation task involving voices drawn from morphed continua between two personally familiar identities. Analyses dissociated neuronal effects induced by repetition of acoustically similar morphs from those induced by a switch in perceived identity. Activation in temporal voice-sensitive areas decreased with acoustic similarity between consecutive stimuli. This repetition suppression effect was mediated by the performance on an independent voice assessment and this result highlights an important functional role of adaptive coding in voice expertise. Bilateral anterior insulae and medial frontal gyri responded to a switch in perceived voice identity compared to an acoustically equidistant switch within identity. Our results support a multistep model of voice identity perception.


Subject(s)
Acoustics, Central Auditory Diseases, Cognition, Voice Recognition, Humans, Acoustic Stimulation, Cognition/physiology, Magnetic Resonance Imaging, Prefrontal Cortex/physiology, Voice Recognition/physiology, Central Auditory Diseases/physiopathology, Male, Female, Adolescent, Young Adult, Adult, Nerve Net/physiology
17.
Small ; 18(22): e2201331, 2022 06.
Article in English | MEDLINE | ID: mdl-35499190

ABSTRACT

To fabricate a high-performance, ultrasensitive triboelectric nanogenerator (TENG), choosing a suitable combination of materials from the triboelectric series is one of the prime challenges. An effective way to fabricate a TENG with a single material (abbreviated S-TENG) is proposed, comprising electrospun nylon nanofibers. The surface potential of the nanofibers is tuned by changing the voltage polarity applied between the needle and collector in the electrospinning setup. The difference in surface potential leads to a different work function, which is the key to designing an S-TENG from a single material. Further, the S-TENG is demonstrated as an ultrasensitive acoustic sensor with a mechanoacoustic sensitivity of ≈27,500 mV Pa⁻¹. Due to its high sensitivity to low-to-middle decibel (60-70 dB) sounds, the S-TENG is highly capable of recognizing different voice signals depending on the condition of the vocal cords. This effective voice recognition ability indicates high potential to open an alternative pathway for medical professionals to detect diseases such as neurological voice disorder, muscle tension dysphonia, vocal cord paralysis, and speech delay/disorder related to laryngeal complications.


Subject(s)
Nanofibers, Nanotechnology, Electric Power Supplies, Nylons, Voice Recognition
18.
Comput Intell Neurosci ; 2022: 3466987, 2022.
Article in English | MEDLINE | ID: mdl-35634052

ABSTRACT

Artistic voice is the professional instrument of trained voice users, and voice evaluation occupies an important position in selecting and cultivating artistic performing talent, so an appropriate evaluation of the artistic voice is crucial. With the development of art education, scientifically evaluating artistic voice training methods and fairly selecting artistic voice talent require objective evaluation of the artistic voice. Current evaluation methods are time-consuming, labor-intensive, and highly subjective, and the choice of acoustic parameters is critical for objective evaluation. We used speech analysis technology to extract the average energy, average frequency error, and average range error of the singing voice as objective acoustic parameters, used a neural network to evaluate singing quality objectively, and compared the results with the subjective evaluations of senior professional teachers. Specifically, voice analysis technology was used to extract the first formant, third formant, fundamental frequency, vocal range, fundamental frequency perturbation, first formant perturbation, third formant perturbation, and average energy as singing acoustic parameters, and a BP (back-propagation) neural network evaluated singing quality objectively against the subjective evaluations of senior vocal teachers. The results show that the BP neural network method can accurately and objectively evaluate singing voice quality using these parameters, which helps scientifically guide the selection and training of artistic voice talent.


Subject(s)
Music, Humans, Neural Networks (Computer), Voice Quality, Voice Recognition, Voice Training
20.
Sensors (Basel) ; 22(3)2022 Jan 19.
Article in English | MEDLINE | ID: mdl-35161507

ABSTRACT

Flexible pressure sensors have been studied as wearable voice-recognition devices for human-machine interaction. However, developing highly sensitive, skin-attachable, and comfortable sensing devices that achieve clear voice detection remains a considerable challenge. Herein, we present a wearable, flexible pressure and temperature sensor with a sensitive response to vibration, which can accurately recognize the human voice when combined with an artificial neural network. The device consists of a polyethylene terephthalate (PET) film printed with a silver electrode, a filament-microstructured polydimethylsiloxane (PDMS) film embedded with single-walled carbon nanotubes, and a polyimide (PI) film sputtered with a patterned Ti/Pt thermistor strip. The developed pressure sensor exhibited a sensitivity of 0.398 kPa⁻¹ in the low-pressure regime, and the fabricated temperature sensor shows a desirable temperature coefficient of resistance of 0.13%/°C in the range of 25 °C to 105 °C. By training and testing the neural network model with waveform data obtained from human pronunciation, the vocal fold vibrations of different words can be successfully recognized, with a total recognition accuracy of 93.4%. Our results suggest that the fabricated sensor has substantial potential for human-computer interface applications such as voice control, vocal healthcare monitoring, and voice authentication.
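As a back-of-envelope check on the thermistor specification, a temperature coefficient of resistance of 0.13%/°C over the 25-105 °C range implies roughly a 10% total resistance change (a linear approximation, assuming the TCR is constant across the range):

```python
# Linear TCR approximation: dR/R ≈ TCR * dT (illustrative arithmetic only)
tcr = 0.0013            # per °C, from the reported 0.13%/°C
delta_t = 105 - 25      # °C, the characterized range
relative_change = tcr * delta_t   # fractional resistance change over the range
```

A swing of that size is comfortably measurable with an ordinary resistance bridge, which is consistent with the temperature channel being read alongside the pressure channel.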


Subject(s)
Carbon Nanotubes, Wearable Electronic Devices, Humans, Neural Networks (Computer), Temperature, Voice Recognition