Results 1 - 20 of 1,907
1.
Hum Brain Mapp ; 45(10): e26724, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39001584

ABSTRACT

Music is ubiquitous, both in its instrumental and vocal forms. While speech perception at birth has been at the core of an extensive corpus of research, the origins of the ability to discriminate instrumental or vocal melodies are still not well investigated. In previous studies comparing vocal and musical perception, the vocal stimuli were mainly related to speaking, including language, and not to the non-language singing voice. In the present study, to better compare a melodic instrumental line with the voice, we used singing as the comparison stimulus, reducing the dissimilarities between the two stimuli as much as possible and separating language perception from vocal musical perception. Forty-five newborns were scanned, 10 full-term infants and 35 preterm infants at term-equivalent age (mean gestational age at test = 40.17 weeks, SD = 0.44), using functional magnetic resonance imaging while listening to five melodies played by a musical instrument (flute) or sung by a female voice. To examine dynamic task-based effective connectivity, we employed a psychophysiological interaction of co-activation patterns (PPI-CAPs) analysis, using the auditory cortices as seed region, to investigate moment-to-moment changes in task-driven modulation of cortical activity during the fMRI task. Our findings reveal condition-specific, dynamically occurring patterns of co-activation (PPI-CAPs). During the vocal condition, the auditory cortex co-activates with the sensorimotor and salience networks, while during the instrumental condition, it co-activates with the visual cortex and the superior frontal cortex. Our results show that the vocal stimulus elicits sensorimotor aspects of auditory perception and is processed as a more salient stimulus, while the instrumental condition engages higher-order cognitive and visuo-spatial networks. Common neural signatures for both auditory stimuli were found in the precuneus and posterior cingulate gyrus. Finally, this study adds knowledge on the dynamic brain connectivity underlying newborns' capability for early and specialized auditory processing, highlighting the relevance of dynamic approaches to studying brain function in newborn populations.
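
As a rough illustration of the co-activation pattern (CAP) idea behind the PPI-CAPs analysis above, the sketch below clusters fMRI frames acquired while a seed region is highly active. The data, threshold, and cluster count are illustrative assumptions, and the task-regressor weighting that makes the full method "PPI" is omitted for brevity.

```python
# Minimal CAP sketch: cluster brain states at moments when a seed is active.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_frames, n_voxels = 300, 1000                      # toy dimensions
data = rng.standard_normal((n_frames, n_voxels))    # stand-in preprocessed fMRI frames
seed = data[:, :50].mean(axis=1)                    # stand-in seed time series (e.g., auditory cortex)

# Keep only frames where the z-scored seed signal is high
z = (seed - seed.mean()) / seed.std()
active_frames = data[z > 1.0]

# Cluster retained frames: each centroid is one co-activation pattern
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(active_frames)
caps = kmeans.cluster_centers_                      # one spatial map per CAP
print(caps.shape)                                   # (4, n_voxels)
```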


Subject(s)
Auditory Perception , Magnetic Resonance Imaging , Music , Humans , Female , Male , Auditory Perception/physiology , Infant, Newborn , Singing/physiology , Infant, Premature/physiology , Brain Mapping , Acoustic Stimulation , Brain/physiology , Brain/diagnostic imaging , Voice/physiology
2.
J Acoust Soc Am ; 156(1): 278-283, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38980102

ABSTRACT

How we produce and perceive voice is constrained by laryngeal physiology and biomechanics. Such constraints may present themselves as principal dimensions in the voice outcome space that are shared among speakers. This study attempts to identify such principal dimensions in the voice outcome space and the underlying laryngeal control mechanisms in a three-dimensional computational model of voice production. A large-scale voice simulation was performed with parametric variations in vocal fold geometry and stiffness, glottal gap, vocal tract shape, and subglottal pressure. Principal component analysis was applied to data combining both the physiological control parameters and voice outcome measures. The results showed three dominant dimensions accounting for at least 50% of the total variance. The first two dimensions describe respiratory-laryngeal coordination in controlling the energy balance between low- and high-frequency harmonics in the produced voice, and the third dimension describes control of the fundamental frequency. The dominance of these three dimensions suggests that voice changes along these principal dimensions are likely to be more consistently produced and perceived by most speakers than other voice changes, and thus are more likely to have emerged during evolution and be used to convey important personal information, such as emotion and larynx size.
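
A minimal sketch of the analysis strategy described above, assuming synthetic stand-in data: run principal component analysis on a table that combines physiological control parameters with voice outcome measures, then check how many dimensions are needed to reach 50% of the total variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Illustrative columns: stiffness, glottal gap, subglottal pressure, F0, SPL, spectral slope
X = rng.standard_normal((5000, 6))

X_std = StandardScaler().fit_transform(X)   # PCA is scale-sensitive; standardize first
pca = PCA().fit(X_std)
explained = np.cumsum(pca.explained_variance_ratio_)
print("dimensions for >=50% variance:", int(np.searchsorted(explained, 0.50)) + 1)
```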


Subject(s)
Larynx , Phonation , Principal Component Analysis , Humans , Biomechanical Phenomena , Larynx/physiology , Larynx/anatomy & histology , Voice/physiology , Vocal Cords/physiology , Vocal Cords/anatomy & histology , Computer Simulation , Voice Quality , Speech Acoustics , Pressure , Models, Biological , Models, Anatomic
3.
J Acoust Soc Am ; 155(6): 3822-3832, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38874464

ABSTRACT

This study proposes the use of vocal resonators to enhance cardiac auscultation signals and evaluates their performance for voice-noise suppression. Data were collected using two electronic stethoscopes while each study subject was talking: one collected the auscultation signal from the chest while the other collected voice signals from one of three vocal resonator sites (cheek, back of the neck, and shoulder). The spectral subtraction method was applied to the signals. Both objective and subjective metrics were used to evaluate the quality of the enhanced signals and to identify the most effective vocal resonator for noise suppression. Our preliminary findings showed a significant improvement after enhancement and demonstrated the efficacy of vocal resonators. In a listening survey conducted with thirteen physicians, the enhanced signals received significantly better sound-quality scores than the original signals. The shoulder resonator group demonstrated significantly better sound quality than the cheek group when reducing voice sound in cardiac auscultation signals. The suggested method has the potential to be used in the development of an electronic stethoscope with a robust noise-removal function. Significant clinical benefits are expected from the expedited preliminary diagnostic procedure.
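
The spectral subtraction step named above follows a standard recipe; here is a minimal sketch under assumed parameters (sampling rate, window length) and synthetic two-channel signals: estimate the voice magnitude spectrum from the resonator channel and subtract it from the chest channel's spectrum.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 4000                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(2)
chest = rng.standard_normal(fs * 5)         # stand-in: heart sounds + voice bleed
voice_ref = rng.standard_normal(fs * 5)     # stand-in: resonator (voice) reference

f, t, S_chest = stft(chest, fs=fs, nperseg=256)
_, _, S_voice = stft(voice_ref, fs=fs, nperseg=256)

# Subtract an average voice magnitude spectrum, floored at zero
noise_mag = np.abs(S_voice).mean(axis=1, keepdims=True)
clean_mag = np.maximum(np.abs(S_chest) - noise_mag, 0.0)
S_clean = clean_mag * np.exp(1j * np.angle(S_chest))   # keep the noisy phase

_, enhanced = istft(S_clean, fs=fs, nperseg=256)
```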


Subject(s)
Heart Auscultation , Signal Processing, Computer-Assisted , Stethoscopes , Humans , Heart Auscultation/instrumentation , Heart Auscultation/methods , Heart Auscultation/standards , Male , Female , Adult , Heart Sounds/physiology , Sound Spectrography , Equipment Design , Voice/physiology , Middle Aged , Voice Quality , Vibration , Noise
4.
Proc Natl Acad Sci U S A ; 121(26): e2318361121, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38889147

ABSTRACT

When listeners hear a voice, they rapidly form a complex first impression of who the person behind that voice might be. We characterize how these multivariate first impressions from voices emerge over time across different levels of abstraction using electroencephalography and representational similarity analysis. We find that for eight perceived physical (gender, age, and health), trait (attractiveness, dominance, and trustworthiness), and social characteristics (educatedness and professionalism), representations emerge early (~80 ms after stimulus onset), with voice acoustics contributing to those representations between ~100 ms and 400 ms. While impressions of person characteristics are highly correlated, we find evidence for highly abstracted, independent representations of individual person characteristics. These abstracted representations emerge gradually over time: representations of physical characteristics (age, gender) arise early (from ~120 ms), while representations of some trait and social characteristics emerge later (~360 ms onward). The findings align with recent theoretical models and shed light on the computations underpinning person perception from voices.
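
A hedged sketch of the representational similarity analysis (RSA) named above: build a neural representational dissimilarity matrix (RDM) per time point and correlate it with a model RDM derived from a rated characteristic. Data shapes and labels are illustrative, not the study's.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
n_voices, n_channels, n_times = 30, 64, 200
eeg = rng.standard_normal((n_voices, n_channels, n_times))  # stand-in evoked responses
ratings = rng.standard_normal(n_voices)                     # stand-in trait ratings

model_rdm = pdist(ratings[:, None])         # pairwise rating differences

rsa_timecourse = np.empty(n_times)
for t in range(n_times):
    neural_rdm = pdist(eeg[:, :, t], metric="correlation")
    rho, _ = spearmanr(neural_rdm, model_rdm)
    rsa_timecourse[t] = rho
# Peaks in rsa_timecourse indicate when the characteristic is represented
```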


Subject(s)
Auditory Perception , Brain , Electroencephalography , Voice , Humans , Male , Female , Voice/physiology , Adult , Brain/physiology , Auditory Perception/physiology , Young Adult , Social Perception
5.
Commun Biol ; 7(1): 711, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862808

ABSTRACT

Deepfakes are viral ingredients of digital environments, and they can trick human cognition into misperceiving the fake as real. Here, we test the neurocognitive sensitivity of 25 participants to accept or reject person identities as recreated in audio deepfakes. We generate high-quality voice identity clones from natural speakers by using advanced deepfake technologies. During an identity matching task, participants show intermediate performance with deepfake voices, indicating levels of deception and resistance to deepfake identity spoofing. On the brain level, univariate and multivariate analyses consistently reveal a central cortico-striatal network that decoded the vocal acoustic pattern and deepfake-level (auditory cortex), as well as natural speaker identities (nucleus accumbens), which are valued for their social relevance. This network is embedded in a broader neural identity and object recognition network. Humans can thus be partly tricked by deepfakes, but the neurocognitive mechanisms identified during deepfake processing open windows for strengthening human resilience to fake information.


Subject(s)
Speech Perception , Humans , Male , Female , Adult , Young Adult , Speech Perception/physiology , Nerve Net/physiology , Auditory Cortex/physiology , Voice/physiology , Corpus Striatum/physiology
6.
Sci Rep ; 14(1): 13132, 2024 06 07.
Article in English | MEDLINE | ID: mdl-38849382

ABSTRACT

Voice production of humans and most mammals is governed by the MyoElastic-AeroDynamic (MEAD) principle, whereby an air stream is modulated by self-sustained vocal fold oscillation to generate audible air pressure fluctuations. An alternative mechanism is found in the ultrasonic vocalizations of rodents, which are established by an aeroacoustic (AA) phenomenon without vibration of laryngeal tissue. Previously, some authors argued that high-pitched human vocalization is also produced by the AA principle. Here, we investigate so-called "whistle register" voice production in nine professional female operatic sopranos singing a scale from C6 (≈ 1047 Hz) to G6 (≈ 1568 Hz). Super-high-speed videolaryngoscopy revealed vocal fold collision in all participants, with closed quotients from 30 to 73%. Computational modeling showed that the biomechanical requirements to produce such a high-pitched voice would be increased contraction of the cricothyroid muscle, vocal fold strain of about 50%, and high subglottal pressure. Our data suggest that high-pitched operatic soprano singing uses the MEAD mechanism. Consequently, the commonly used term "whistle register" does not reflect the physical principle of a whistle with regard to voice generation in high-pitched classical singing.
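
As a quick check of the pitches quoted above, equal temperament gives f = 440 · 2^((m − 69)/12) for MIDI note number m; a tiny sketch:

```python
def midi_to_hz(m: int) -> float:
    # Equal-temperament pitch relative to A4 = 440 Hz (MIDI 69)
    return 440.0 * 2.0 ** ((m - 69) / 12)

print(round(midi_to_hz(84)))  # C6 -> 1047 Hz (MIDI 84)
print(round(midi_to_hz(91)))  # G6 -> 1568 Hz (MIDI 91)
```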


Subject(s)
Singing , Vocal Cords , Humans , Female , Singing/physiology , Biomechanical Phenomena , Vocal Cords/physiology , Adult , Sound , Voice/physiology , Phonation/physiology
7.
Proc Natl Acad Sci U S A ; 121(25): e2405588121, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38861607

ABSTRACT

Many animals can extract useful information from the vocalizations of other species. Neuroimaging studies have evidenced areas sensitive to conspecific vocalizations in the cerebral cortex of primates, but how these areas process heterospecific vocalizations remains unclear. Using fMRI-guided electrophysiology, we recorded the spiking activity of individual neurons in the anterior temporal voice patches of two macaques while they listened to complex sounds including vocalizations from several species. In addition to cells selective for conspecific macaque vocalizations, we identified an unsuspected subpopulation of neurons with strong selectivity for human voice, not merely explained by spectral or temporal structure of the sounds. The auditory representational geometry implemented by these neurons was strongly related to that measured in the human voice areas with neuroimaging and only weakly to low-level acoustical structure. These findings provide new insights into the neural mechanisms involved in auditory expertise and the evolution of communication systems in primates.


Subject(s)
Auditory Perception , Magnetic Resonance Imaging , Neurons , Vocalization, Animal , Voice , Animals , Humans , Neurons/physiology , Voice/physiology , Magnetic Resonance Imaging/methods , Vocalization, Animal/physiology , Auditory Perception/physiology , Male , Macaca mulatta , Brain/physiology , Acoustic Stimulation , Brain Mapping/methods
8.
J Speech Lang Hear Res ; 67(7): 2139-2158, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38875480

ABSTRACT

PURPOSE: This systematic review aimed to evaluate the effects of singing as an intervention for the aging voice. METHOD: Quantitative studies of interventions for older adults with any medical condition that involve singing as training were reviewed, with outcomes measured in respiration, phonation, and posture, the physical functions related to the aging voice. English- and Chinese-language studies published up to April 2024 were searched across 31 electronic databases. The included articles were assessed according to the Grading of Recommendations, Assessment, Development, and Evaluations rubric. RESULTS: Seven studies were included. These studies reported outcome measures related to respiratory function only. For the intervention effect, statistically significant improvements were observed in five of the included studies, three of which had large effect sizes. The overall level of evidence of the included studies was not high: three studies had moderate levels and the rest had lower levels. The intervention activities included training other than singing; these non-singing training items may have introduced co-intervention bias into the study results. CONCLUSIONS: This systematic review suggests that singing as an intervention for older adults with respiratory and cognitive problems could improve respiration and respiratory-phonatory control. However, none of the included studies covers the other two physical functions related to the aging voice (phonation and posture), and the overall level of evidence of the included studies was not high. More research evidence is needed on singing-based interventions specifically for patients with aging voice.


Subject(s)
Aging , Singing , Humans , Aged , Aging/physiology , Voice Disorders/therapy , Phonation/physiology , Voice Quality , Voice/physiology , Respiration , Posture/physiology , Aged, 80 and over
9.
J Speech Lang Hear Res ; 67(7): 1997-2020, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38861454

ABSTRACT

PURPOSE: Although different factors and voice measures have been associated with phonotraumatic vocal hyperfunction (PVH), it is unclear what percentage of individuals with PVH exhibit such differences during their daily lives. This study used a machine learning approach to quantify the consistency with which PVH manifests according to ambulatory voice measures. Analyses included acoustic parameters of phonation as well as temporal aspects of phonation and rest, with the goal of determining optimally consistent signatures of PVH. METHOD: Ambulatory neck-surface acceleration signals were recorded over 1 week from 116 female participants diagnosed with PVH and age-, sex-, and occupation-matched vocally healthy controls. The consistency of the manifestation of PVH was defined as the percentage of participants in each group that exhibited an atypical signature based on a target voice measure. Evaluation of each machine learning model used nested 10-fold cross-validation to improve the generalizability of findings. In Experiment 1, we trained separate logistic regression models based on the distributional characteristics of 14 voice measures and the durations of voicing and resting segments. In Experiments 2 and 3, features of voicing and resting duration augmented the existing distributional characteristics to examine whether more consistent signatures would result. RESULTS: Experiment 1 showed that the difference in the magnitude of the first two harmonics (H1-H2) exhibited the most consistent signature (69.4% of participants with PVH and 20.4% of controls had an atypical H1-H2 signature), followed by spectral tilt over eight harmonics (73.6% of participants with PVH and 32.1% of controls had an atypical spectral tilt signature) and estimated sound pressure level (SPL; 66.9% of participants with PVH and 27.6% of controls had an atypical SPL signature). Additionally, 77.6% of participants with PVH had atypical resting durations, with 68.9% exhibiting atypical voicing durations. Experiments 2 and 3 showed that augmenting the best-performing voice measures with univariate features of voicing or resting durations yielded only incremental improvement in the classifier's performance. CONCLUSIONS: Females with PVH were more likely to use more abrupt vocal fold closure (lower H1-H2), phonate louder (higher SPL), and take shorter vocal rests. They were also less likely to use higher fundamental frequency during their daily activities. The difference in the voicing duration signature between participants with PVH and controls had a large effect size, providing strong empirical evidence regarding the role of voice use in the development of PVH.
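
A minimal sketch of the classification setup described in the METHOD, assuming synthetic features: one logistic regression per voice measure, evaluated with nested 10-fold cross-validation (an inner loop tunes regularization, an outer loop estimates generalization).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.standard_normal((232, 8))   # e.g., distributional features of one voice measure
y = np.repeat([0, 1], 116)          # 116 controls, 116 participants with PVH

inner = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

model = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1, 10]},
    cv=inner,
)
scores = cross_val_score(model, X, y, cv=outer)   # nested-CV accuracy estimate
print(scores.mean())
```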


Subject(s)
Machine Learning , Phonation , Humans , Female , Adult , Middle Aged , Phonation/physiology , Voice Disorders/physiopathology , Voice Disorders/diagnosis , Young Adult , Voice Quality/physiology , Vocal Cords/physiopathology , Speech Acoustics , Voice/physiology , Aged , Case-Control Studies
10.
J Matern Fetal Neonatal Med ; 37(1): 2362933, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38910112

ABSTRACT

OBJECTIVE: To study the effects of playing mothers' recorded voices to preterm infants in the NICU on the mothers' mental health, as measured by the Depression, Anxiety and Stress Scale-21 (DASS-21) questionnaire. DESIGN/METHODS: This was a pilot single-center prospective randomized controlled trial conducted at a level IV NICU. The trial was registered at clinicaltrials.gov (NCT04559620). The inclusion criterion was being the mother of a preterm infant with a gestational age between 26 and 30 weeks. The DASS-21 questionnaire was administered to all enrolled mothers in the first week after birth, followed by recording of their voices by the music therapists. In the intervention group, the recorded maternal voice was played into the infant's incubator between 15 and 21 days of life. A second DASS-21 was administered between 21 and 23 days of life. The Wilcoxon rank-sum test was used to compare DASS-21 scores between the two groups, and the Wilcoxon signed-rank test was used to compare pre- and post-intervention DASS-21 scores. RESULTS: Forty eligible mothers were randomized: 20 to the intervention group and 20 to the control group. Baseline maternal and neonatal characteristics were similar between the two groups. There was no significant difference in DASS-21 scores between the two groups at baseline or after the study intervention, and no difference in pre- and post-intervention DASS-21 scores or their individual components in the experimental group. There was a significant decrease in the total DASS-21 score and the anxiety component of the DASS-21 between weeks 1 and 4 in the control group. CONCLUSION: In this pilot randomized controlled study, recorded maternal voice played into a preterm infant's incubator did not have any effect on maternal mental health as measured by the DASS-21 questionnaire. Data obtained in this pilot study will be useful for future randomized controlled trials addressing this important issue.
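
The two tests named in the DESIGN/METHODS map directly onto SciPy; a short sketch with toy score arrays (SciPy's mannwhitneyu is the rank-sum equivalent):

```python
import numpy as np
from scipy.stats import mannwhitneyu, wilcoxon

rng = np.random.default_rng(5)
intervention_scores = rng.integers(0, 42, size=20)   # DASS-21 totals, toy values
control_scores = rng.integers(0, 42, size=20)

# Between-group comparison (Wilcoxon rank-sum / Mann-Whitney U)
stat, p_between = mannwhitneyu(intervention_scores, control_scores)

# Paired pre/post comparison within one group (Wilcoxon signed-rank)
pre = rng.integers(0, 42, size=20)
post = np.clip(pre + rng.integers(-6, 7, size=20), 0, 42)
stat, p_within = wilcoxon(pre, post)
print(p_between, p_within)
```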


Subject(s)
Anxiety , Depression , Infant, Premature , Stress, Psychological , Humans , Female , Pilot Projects , Infant, Newborn , Infant, Premature/psychology , Anxiety/therapy , Adult , Stress, Psychological/therapy , Depression/therapy , Mothers/psychology , Incubators, Infant , Prospective Studies , Music Therapy/methods , Voice/physiology
11.
Sci Rep ; 14(1): 10488, 2024 05 07.
Article in English | MEDLINE | ID: mdl-38714709

ABSTRACT

Vocal attractiveness influences important social outcomes. While most research on the acoustic parameters that influence vocal attractiveness has focused on the possible roles of sexually dimorphic characteristics of voices, such as fundamental frequency (i.e., pitch) and formant frequencies (i.e., a correlate of body size), other work has reported that increasing vocal averageness increases attractiveness. Here we investigated the roles these three characteristics play in judgments of the attractiveness of male and female voices. In Study 1, we found that increasing vocal averageness significantly decreased distinctiveness ratings, demonstrating that participants could detect manipulations of vocal averageness in this stimulus set and using this testing paradigm. However, in Study 2, we found no evidence that increasing averageness significantly increased attractiveness ratings of voices. In Study 3, we found that fundamental frequency was negatively correlated with male vocal attractiveness and positively correlated with female vocal attractiveness. By contrast with these results for fundamental frequency, vocal attractiveness and formant frequencies were not significantly correlated. Collectively, our results suggest that averageness may not necessarily significantly increase attractiveness judgments of voices and are consistent with previous work reporting significant associations between attractiveness and voice pitch.


Subject(s)
Beauty , Voice , Humans , Male , Female , Voice/physiology , Adult , Young Adult , Judgment/physiology , Adolescent
12.
Commun Biol ; 7(1): 540, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714798

ABSTRACT

The genetic influence on human vocal pitch in tonal and non-tonal languages remains largely unknown. In tonal languages, such as Mandarin Chinese, pitch changes differentiate word meanings, whereas in non-tonal languages, such as Icelandic, pitch is used to convey intonation. We addressed this question by searching for genetic associations with interindividual variation in median pitch in a Chinese major depression case-control cohort and compared our results with a genome-wide association study from Iceland. The same genetic variant, rs11046212-T in an intron of the ABCC9 gene, was one of the most strongly associated loci with median pitch in both samples. Our meta-analysis revealed four genome-wide significant hits, including two novel associations. The discovery of genetic variants influencing vocal pitch across both tonal and non-tonal languages suggests the possibility of a common genetic contribution to the human vocal system shared in two distinct populations with languages that differ in tonality (Icelandic and Mandarin).


Subject(s)
Genome-Wide Association Study , Language , Humans , Male , Female , Polymorphism, Single Nucleotide , Adult , Iceland , Case-Control Studies , Middle Aged , Voice/physiology , Pitch Perception , Asian People/genetics
13.
PLoS One ; 19(5): e0302739, 2024.
Article in English | MEDLINE | ID: mdl-38728329

ABSTRACT

BACKGROUND: Deep brain stimulation (DBS) reliably ameliorates cardinal motor symptoms in Parkinson's disease (PD) and essential tremor (ET). However, the effects of DBS on speech, voice and language have been inconsistent and have not been examined comprehensively in a single study. OBJECTIVE: We conducted a systematic analysis of the literature by reviewing studies that examined the effects of DBS on speech, voice and language in PD and ET. METHODS: A total of 675 publications were retrieved from the PubMed, Embase, CINAHL, Web of Science, Cochrane Library and Scopus databases. Based on our selection criteria, 90 papers were included in our analysis. The selected publications were categorized into four subcategories: Fluency, Word production, Articulation and phonology, and Voice quality. RESULTS: The results suggested a long-term decline in verbal fluency, with more studies reporting deficits in phonemic fluency than in semantic fluency following DBS. Additionally, high-frequency stimulation, left-sided and bilateral DBS were associated with worse verbal fluency outcomes. Naming improved in the short term following DBS-ON compared to DBS-OFF, with no long-term differences between the two conditions. Bilateral and low-frequency DBS demonstrated a relative improvement for phonation and articulation. Nonetheless, long-term DBS exacerbated phonation and articulation deficits. The effect of DBS on voice was highly variable, with both improvements and deterioration in different measures of voice. CONCLUSION: This was the first study that aimed to combine the outcomes for speech, voice, and language following DBS in a single systematic review. The findings revealed a heterogeneous pattern of results for speech, voice, and language across DBS studies, and provide directions for future studies.


Subject(s)
Deep Brain Stimulation , Language , Parkinson Disease , Speech , Voice , Deep Brain Stimulation/methods , Humans , Parkinson Disease/therapy , Parkinson Disease/physiopathology , Speech/physiology , Voice/physiology , Essential Tremor/therapy , Essential Tremor/physiopathology
14.
PLoS One ; 19(5): e0299140, 2024.
Article in English | MEDLINE | ID: mdl-38809807

ABSTRACT

Non-random exploration of infant speech-like vocalizations (e.g., squeals, growls, and vowel-like sounds or "vocants") is pivotal in speech development. This type of vocal exploration, often noticed when infants produce particular vocal types in clusters, serves two crucial purposes: it establishes a foundation for speech because speech requires formation of new vocal categories, and it serves as a basis for vocal signaling of wellness and interaction with caregivers. Despite the significance of clustering, existing research has largely relied on subjective descriptions and anecdotal observations regarding early vocal category formation. In this study, we aim to address this gap by presenting the first large-scale empirical evidence of vocal category exploration and clustering throughout the first year of life. We observed infant vocalizations longitudinally using all-day home recordings from 130 typically developing infants across the entire first year of life. To identify clustering patterns, we conducted Fisher's exact tests to compare the occurrence of squeals versus vocants, as well as growls versus vocants. We found that across the first year, infants demonstrated clear clustering patterns of squeals and growls, indicating that these categories were not randomly produced, but rather, it seemed, infants actively engaged in practice of these specific categories. The findings lend support to the concept of infants as manifesting active vocal exploration and category formation, a key foundation for vocal language.
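
A hedged sketch of the clustering test described above: tally squeals versus vowel-like sounds (vocants) in a target recording segment against all other segments and apply Fisher's exact test. The 2×2 counts are illustrative, not study data.

```python
from scipy.stats import fisher_exact

# rows: target segment vs. all other segments
# cols: squeal count vs. vocant count
table = [[18, 42],     # target segment: 18 squeals, 42 vocants
         [35, 305]]    # remaining segments: 35 squeals, 305 vocants

odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(odds_ratio, p_value)   # small p suggests squeals cluster in this segment
```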


Subject(s)
Speech , Humans , Infant , Male , Female , Speech/physiology , Language Development , Voice/physiology , Longitudinal Studies , Phonetics
15.
Cortex ; 176: 1-10, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38723449

ABSTRACT

Recognizing a talker's identity via speech is an important social skill in interpersonal interaction. Behavioral evidence has shown that listeners can identify voices of their native language better than those of a non-native language, a phenomenon known as the language familiarity effect (LFE). However, its underlying neural mechanisms remain unclear. This study therefore investigated how the LFE arises at the neural level by employing functional near-infrared spectroscopy (fNIRS). Late unbalanced bilinguals were first asked to learn to associate strangers' voices with their identities and were then tested on recognizing the talkers' identities based on their voices speaking a language either highly familiar (the native language, Chinese), moderately familiar (the second language, English), or completely unfamiliar (Ewe) to participants. Participants identified talkers most accurately in Chinese and least accurately in Ewe. Talker identification was quicker in Chinese than in English and Ewe, but reaction time did not differ between the two non-native languages. At the neural level, recognizing voices speaking Chinese relative to English/Ewe produced less activity in the inferior frontal gyrus, precentral/postcentral gyrus, supramarginal gyrus, and superior temporal sulcus/gyrus, while no difference was found between English and Ewe, indicating facilitation of voice identification by automatic phonological encoding in the native language. These findings shed new light on the interrelations between language ability and voice recognition, revealing that the brain activation pattern of the LFE depends on the automaticity of language processing.


Subject(s)
Language , Recognition, Psychology , Spectroscopy, Near-Infrared , Speech Perception , Voice , Humans , Spectroscopy, Near-Infrared/methods , Female , Male , Recognition, Psychology/physiology , Young Adult , Voice/physiology , Speech Perception/physiology , Adult , Multilingualism , Brain Mapping , Reaction Time/physiology , Brain/physiology , Brain/diagnostic imaging
16.
J Speech Lang Hear Res ; 67(6): 1731-1751, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38754028

ABSTRACT

PURPOSE: The present study examined whether participants respond to unperturbed parameters while experiencing specific perturbations in auditory feedback: for instance, whether speakers adjust voice loudness when only pitch is artificially altered in auditory feedback. This phenomenon is referred to as the "accompanying effect" in the present study. METHOD: Thirty native Mandarin speakers were asked to sustain the vowel /ɛ/ for 3 s while their auditory feedback underwent a single shift in one of three distinct ways: a pitch shift (±100 cents; coded PT), a loudness shift (±6 dB; coded LD), or a first-formant shift (±100 Hz; coded FM). Participants were instructed to ignore the perturbations in their auditory feedback. Response types were categorized based on pitch, loudness, and F1 for each individual trial; for example, Popp_Lopp_Fopp indicates opposing responses in all three domains. RESULTS: The accompanying effect appeared 93% of the time. Bayesian Poisson regression models indicate that opposing responses in all three domains (Popp_Lopp_Fopp) were the most prevalent response type across the conditions (PT, LD, and FM). The more frequently used response types exhibited opposing responses and significantly larger response curves than the less frequently used response types. Following responses became more prevalent only when the perturbed stimuli were perceived as voices from someone else (external references), particularly in the FM condition. In terms of isotropy, loudness and F1 tended to change in the same direction, rather than loudness and pitch. CONCLUSION: The presence of the accompanying effect suggests that the motor systems responsible for regulating pitch, loudness, and formants are not entirely independent but rather interconnected to some degree.
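
For reference, the perturbation sizes above convert as follows: a shift of c cents scales frequency by 2^(c/1200), and a gain of g dB scales amplitude by 10^(g/20). A tiny sketch with illustrative values:

```python
def shift_cents(f0_hz: float, cents: float) -> float:
    # 100 cents = one equal-tempered semitone
    return f0_hz * 2.0 ** (cents / 1200.0)

def gain_db(amplitude: float, db: float) -> float:
    return amplitude * 10.0 ** (db / 20.0)

print(shift_cents(200.0, 100))   # 200 Hz up 100 cents -> ~211.9 Hz
print(gain_db(1.0, 6))           # +6 dB -> amplitude x ~2.0
```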


Subject(s)
Bayes Theorem , Pitch Perception , Humans , Male , Female , Young Adult , Pitch Perception/physiology , Adult , Speech Perception/physiology , Loudness Perception/physiology , Feedback, Sensory/physiology , Voice/physiology , Acoustic Stimulation/methods , Speech Acoustics
17.
Sci Rep ; 14(1): 12407, 2024 05 30.
Article in English | MEDLINE | ID: mdl-38811832

ABSTRACT

Many lecturers develop voice problems, such as hoarseness. Nevertheless, research on how voice quality influences listeners' perception, comprehension, and retention of spoken language is limited to a small number of audio-only experiments. We aimed to address this gap by using audio-visual virtual reality (VR) to investigate the impact of a lecturer's hoarseness on university students' heard text recall, listening effort, and listening impression. Fifty participants were immersed in a virtual seminar room, where they engaged in a Dual-Task Paradigm. They listened to narratives presented by a virtual female professor, who spoke in either a typical or hoarse voice. Simultaneously, participants performed a secondary task. Results revealed significantly prolonged secondary-task response times with the hoarse voice compared to the typical voice, indicating increased listening effort. Subjectively, participants rated the hoarse voice as more annoying, effortful to listen to, and impeding for their cognitive performance. No effect of voice quality was found on heard text recall, suggesting that, while hoarseness may compromise certain aspects of spoken language processing, this might not necessarily result in reduced information retention. In summary, our findings underscore the importance of promoting vocal health among lecturers, which may contribute to enhanced listening conditions in learning spaces.


Subject(s)
Speech Perception , Virtual Reality , Voice Quality , Humans , Female , Male , Adult , Young Adult , Speech Perception/physiology , Memory/physiology , Auditory Perception/physiology , Hoarseness/etiology , Voice/physiology
18.
Multisens Res ; 37(2): 125-141, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38714314

ABSTRACT

Trust is an aspect critical to human social interaction and research has identified many cues that help in the assimilation of this social trait. Two of these cues are the pitch of the voice and the width-to-height ratio of the face (fWHR). Additionally, research has indicated that the content of a spoken sentence itself has an effect on trustworthiness; a finding that has not yet been brought into multisensory research. The current research aims to investigate previously developed theories on trust in relation to vocal pitch, fWHR, and sentence content in a multimodal setting. Twenty-six female participants were asked to judge the trustworthiness of a voice speaking a neutral or romantic sentence while seeing a face. The average pitch of the voice and the fWHR were varied systematically. Results indicate that the content of the spoken message was an important predictor of trustworthiness extending into multimodality. Further, the mean pitch of the voice and fWHR of the face appeared to be useful indicators in a multimodal setting. These effects interacted with one another across modalities. The data demonstrate that trust in the voice is shaped by task-irrelevant visual stimuli. Future research is encouraged to clarify whether these findings remain consistent across genders, age groups, and languages.


Subject(s)
Face , Trust , Voice , Humans , Female , Voice/physiology , Young Adult , Adult , Face/physiology , Speech Perception/physiology , Pitch Perception/physiology , Facial Recognition/physiology , Cues , Adolescent
19.
Int J Pediatr Otorhinolaryngol ; 180: 111962, 2024 May.
Article in English | MEDLINE | ID: mdl-38657429

ABSTRACT

PURPOSE: In this prospective study, we aimed to investigate differences in voice acoustic parameters between girls with idiopathic central precocious puberty (ICPP) and normally developing prepubertal girls. MATERIALS AND METHODS: Our study recruited 54 girls diagnosed with ICPP and randomly sampled 51 healthy prepubertal girls as controls. Tanner stages, circulating hormone levels, and bone ages of the girls with ICPP, and the age and body mass index (BMI) of all participants, were recorded. Acoustic analyses were performed using the PRAAT computer-based voice analysis software, and the mean pitch (F0), jitter, shimmer, noise-to-harmonic ratio (NHR), and harmonic-to-noise ratio (HNR) values were compared between the patient and control groups. RESULTS: The two groups did not significantly differ in age or BMI. The mean F0 and jitter values were lower in the control group than in the patient group, but the differences were not statistically significant. The mean shimmer values of the patient group were significantly higher than those of the control group. In addition, a statistically significant difference was noted for the mean HNR and NHR values (P < 0.001). A moderate negative correlation was found between shimmer and hormone levels in the patient group. CONCLUSIONS: Voice acoustic parameters are one of the defining features of girls with ICPP. Changes in acoustic parameters could reflect hormonal changes during puberty, and clinicians should suspect ICPP when there is a change in the voice.
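
A hedged sketch of extracting the listed measures with Praat through the parselmouth Python library; the file path is a placeholder and the analysis settings are common Praat defaults, not necessarily those used in the study.

```python
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("voice_sample.wav")          # placeholder path
pitch = snd.to_pitch()
mean_f0 = call(pitch, "Get mean", 0, 0, "Hertz")

point_process = call(snd, "To PointProcess (periodic, cc)", 75, 600)
jitter = call(point_process, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, point_process], "Get shimmer (local)",
               0, 0, 0.0001, 0.02, 1.3, 1.6)
harmonicity = snd.to_harmonicity()
hnr = call(harmonicity, "Get mean", 0, 0)

print(mean_f0, jitter, shimmer, hnr)
```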


Subject(s)
Puberty, Precocious , Humans , Puberty, Precocious/blood , Female , Child , Prospective Studies , Voice Quality/physiology , Speech Acoustics , Case-Control Studies , Voice/physiology , Body Mass Index
20.
Ear Hear ; 45(4): 952-968, 2024.
Article in English | MEDLINE | ID: mdl-38616318

ABSTRACT

OBJECTIVES: Postlingually deaf adults with cochlear implants (CIs) have difficulties with perceiving differences in speakers' voice characteristics and benefit little from voice differences for the perception of speech in competing speech. However, not much is known yet about the perception and use of voice characteristics in prelingually deaf implanted children with CIs. Unlike CI adults, most CI children became deaf during the acquisition of language. Extensive neuroplastic changes during childhood could make CI children better at using the available acoustic cues than CI adults, or the lack of exposure to a normal acoustic speech signal could make it more difficult for them to learn which acoustic cues they should attend to. This study aimed to examine to what degree CI children can perceive voice cues and benefit from voice differences for perceiving speech in competing speech, comparing their abilities to those of normal-hearing (NH) children and CI adults. DESIGN: CI children's voice cue discrimination (experiment 1), voice gender categorization (experiment 2), and benefit from target-masker voice differences for perceiving speech in competing speech (experiment 3) were examined in three experiments. The main focus was on the perception of mean fundamental frequency (F0) and vocal-tract length (VTL), the primary acoustic cues related to speakers' anatomy and perceived voice characteristics, such as voice gender. RESULTS: CI children's F0 and VTL discrimination thresholds indicated lower sensitivity to differences compared with their NH-age-equivalent peers, but their mean discrimination thresholds of 5.92 semitones (st) for F0 and 4.10 st for VTL indicated higher sensitivity than postlingually deaf CI adults with mean thresholds of 9.19 st for F0 and 7.19 st for VTL. Furthermore, CI children's perceptual weighting of F0 and VTL cues for voice gender categorization closely resembled that of their NH-age-equivalent peers, in contrast with CI adults. Finally, CI children had more difficulties in perceiving speech in competing speech than their NH-age-equivalent peers, but they performed better than CI adults. Unlike CI adults, CI children showed a benefit from target-masker voice differences in F0 and VTL, similar to NH children. CONCLUSION: Although CI children's F0 and VTL voice discrimination scores were overall lower than those of NH children, their weighting of F0 and VTL cues for voice gender categorization and their benefit from target-masker differences in F0 and VTL resembled that of NH children. Together, these results suggest that prelingually deaf implanted CI children can effectively utilize spectrotemporally degraded F0 and VTL cues for voice and speech perception, generally outperforming postlingually deaf CI adults in comparable tasks. These findings underscore the presence of F0 and VTL cues in the CI signal to a certain degree and suggest other factors contributing to the perception challenges faced by CI adults.
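
For reference, the semitone thresholds above use the standard conversion st = 12 · log2(f2/f1); a tiny sketch with illustrative frequencies:

```python
import math

def semitones(f1_hz: float, f2_hz: float) -> float:
    return 12.0 * math.log2(f2_hz / f1_hz)

# A 5.92-st F0 difference from a 120 Hz reference lands near 169 Hz:
print(round(120 * 2 ** (5.92 / 12)))   # -> 169
print(round(semitones(120, 169), 2))   # -> ~5.93 (back-converted)
```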


Subject(s)
Cochlear Implantation , Cochlear Implants , Cues , Deafness , Speech Perception , Humans , Deafness/rehabilitation , Male , Female , Child , Adult , Young Adult , Adolescent , Voice/physiology , Case-Control Studies , Child, Preschool , Middle Aged