Search | VHL Regional Portal

1.

Sensorimotor learning during synchronous speech is modulated by the acoustics of the other voice.

Bradshaw, Abigail R; Wheeler, Emma D; McGettigan, Carolyn; Lametti, Daniel R.

Psychon Bull Rev ; 2024 Jul 02.

Article in English | MEDLINE | ID: mdl-38955989

ABSTRACT

This study tested the hypothesis that speaking with other voices can influence sensorimotor predictions of one's own voice. Real-time manipulations of auditory feedback were used to drive sensorimotor adaptation in speech, while participants spoke sentences in synchrony with another voice, a task known to induce implicit imitation (phonetic convergence). The acoustic-phonetic properties of the other voice were manipulated between groups, such that convergence with it would either oppose (incongruent group, n = 15) or align with (congruent group, n = 16) speech motor adaptation. As predicted, significantly greater adaptation was seen in the congruent compared to the incongruent group. This suggests the use of shared sensory targets in speech for predicting the sensory outcomes of both the actions of others (speech perception) and the actions of the self (speech production). This finding has important implications for wider theories of shared predictive mechanisms across perception and action, such as active inference.

2.

Self-ownership, not self-production, modulates bias and agency over a synthesised voice.

Payne, Bryony; Addlesee, Angus; Rieser, Verena; McGettigan, Carolyn.

Cognition ; 248: 105804, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38678806

ABSTRACT

Voices are fundamentally social stimuli, and their importance to the self may be underpinned by how far they can be used to express the self and achieve communicative goals. This paper examines how self-bias and agency over a synthesised voice is altered when that voice is used to represent the self in social interaction. To enable participants to use a new voice, a novel two-player game was created, in which participants communicated online using a text-to-speech (TTS) synthesised voice. We then measured self-bias and sense of agency attributed to this synthesised voice, comparing participants who had used their new voice to interact with another person (n = 44) to a control group of participants (n = 44) who had been only briefly exposed to the voices. We predicted that the new, synthesised self-voice would be more perceptually prioritised after it had been self-produced. Further, that participants' sense of agency over the voice would be increased, if they had experienced self-producing the voice, relative to those who only owned it. Contrary to the hypothesis, the results indicated that both experimental participants and control participants similarly prioritised the new synthesised voice and experienced a similar degree of agency over it, relative to voices owned by others. Critically then, being able to produce the new voice in a social interaction did not modulate bias towards it nor participant's sense of agency over it. These results suggest that merely having ownership over a new voice may be sufficient to generate a perceptual bias and a sense of agency over it.

Subject(s)

Self Concept , Voice , Humans , Female , Male , Adult , Young Adult , Social Interaction , Ownership , Adolescent

3.

Human talkers change their voices to elicit specific trait percepts.

Guldner, Stella; Lavan, Nadine; Lally, Clare; Wittmann, Lisa; Nees, Frauke; Flor, Herta; McGettigan, Carolyn.

Psychon Bull Rev ; 31(1): 209-222, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37507647

ABSTRACT

The voice is a variable and dynamic social tool with functional relevance for self-presentation, for example, during a job interview or courtship. Talkers adjust their voices flexibly to their situational or social environment. Here, we investigated how effectively intentional voice modulations can evoke trait impressions in listeners (Experiment 1), whether these trait impressions are recognizable (Experiment 2), and whether they meaningfully influence social interactions (Experiment 3). We recorded 40 healthy adult speakers' whilst speaking neutrally and whilst producing vocal expressions of six social traits (e.g., likeability, confidence). Multivariate ratings of 40 listeners showed that vocal modulations amplified specific trait percepts (Experiments 1 and 2), which could be explained by two principal components relating to perceived affiliation and competence. Moreover, vocal modulations increased the likelihood of listeners choosing the voice to be suitable for corresponding social goals (i.e., a confident rather than likeable voice to negotiate a promotion, Experiment 3). These results indicate that talkers modulate their voice along a common trait space for social navigation. Moreover, beyond reactive voice changes, vocal behaviour can be strategically used by talkers to communicate subtle information about themselves to listeners. These findings advance our understanding of non-verbal vocal behaviour for social communication.

Subject(s)

Voice , Adult , Humans , Communication

4.

An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images.

Belyk, Michel; Carignan, Christopher; McGettigan, Carolyn.

Behav Res Methods ; 56(3): 2623-2635, 2024 Mar.

Article in English | MEDLINE | ID: mdl-37507650

ABSTRACT

Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing, expressions of emotion, and swallowing that are otherwise not accessible for external observation. However, taking quantitative measurements from these images is notoriously difficult. We introduce a signal processing pipeline that produces outlines of the vocal tract from the lips to the larynx as a quantification of the dynamic morphology of the vocal tract. Our approach performs simple tissue classification, but constrained to a researcher-specified region of interest. This combination facilitates feature extraction while retaining the domain-specific expertise of a human analyst. We demonstrate that this pipeline generalises well across datasets covering behaviours such as speech, vocal size exaggeration, laughter, and whistling, as well as producing reliable outcomes across analysts, particularly among users with domain-specific expertise. With this article, we make this pipeline available for immediate use by the research community, and further suggest that it may contribute to the continued development of fully automated methods based on deep learning algorithms.

Subject(s)

Larynx , Singing , Humans , Magnetic Resonance Imaging/methods , Larynx/diagnostic imaging , Larynx/anatomy & histology , Larynx/physiology , Speech/physiology , Mouth/anatomy & histology , Mouth/physiology

5.

Speech With Pauses Sounds Deceptive to Listeners With and Without Hearing Impairment.

Patel, Bindiya; Zhang, Ziyun; McGettigan, Carolyn; Belyk, Michel.

J Speech Lang Hear Res ; 66(10): 3735-3744, 2023 Oct 04.

Article in English | MEDLINE | ID: mdl-37672786

ABSTRACT

PURPOSE: Communication is as much persuasion as it is the transfer of information. This creates a tension between the interests of the speaker and those of the listener, as dishonest speakers naturally attempt to hide deceptive speech and listeners are faced with the challenge of sorting truths from lies. Listeners with hearing impairment in particular may have differing levels of access to the acoustical cues that give away deceptive speech. A greater tendency toward speech pauses has been hypothesized to result from the cognitive demands of lying convincingly. Higher vocal pitch has also been hypothesized to mark the increased anxiety of a dishonest speaker. METHOD: Listeners with or without hearing impairments heard short utterances from natural conversations, some of which had been digitally manipulated to contain either increased pausing or raised vocal pitch. Listeners were asked to guess whether each statement was a lie in a two-alternative forced-choice task. Participants were also asked explicitly which cues they believed had influenced their decisions. RESULTS: Statements were more likely to be perceived as a lie when they contained pauses, but not when vocal pitch was raised. This pattern held regardless of hearing ability. In contrast, both groups of listeners self-reported using vocal pitch cues to identify deceptive statements, though at lower rates than pauses. CONCLUSIONS: Listeners may have only partial awareness of the cues that influence their impression of dishonesty. Listeners with hearing impairment may place greater weight on acoustical cues according to the differing degrees of access provided by hearing aids. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.24052446.

6.

Perceptual Learning of Noise-Vocoded Speech Under Divided Attention.

Wang, Han; Chen, Rongru; Yan, Yu; McGettigan, Carolyn; Rosen, Stuart; Adank, Patti.

Trends Hear ; 27: 23312165231192297, 2023.

Article in English | MEDLINE | ID: mdl-37547940

ABSTRACT

Speech perception performance for degraded speech can improve with practice or exposure. Such perceptual learning is thought to be reliant on attention and theoretical accounts like the predictive coding framework suggest a key role for attention in supporting learning. However, it is unclear whether speech perceptual learning requires undivided attention. We evaluated the role of divided attention in speech perceptual learning in two online experiments (N = 336). Experiment 1 tested the reliance of perceptual learning on undivided attention. Participants completed a speech recognition task where they repeated forty noise-vocoded sentences in a between-group design. Participants performed the speech task alone or concurrently with a domain-general visual task (dual task) at one of three difficulty levels. We observed perceptual learning under divided attention for all four groups, moderated by dual-task difficulty. Listeners in easy and intermediate visual conditions improved as much as the single-task group. Those who completed the most challenging visual task showed faster learning and achieved similar ending performance compared to the single-task group. Experiment 2 tested whether learning relies on domain-specific or domain-general processes. Participants completed a single speech task or performed this task together with a dual task aiming to recruit domain-specific (lexical or phonological), or domain-general (visual) processes. All secondary task conditions produced patterns and amount of learning comparable to the single speech task. Our results demonstrate that the impact of divided attention on perceptual learning is not strictly dependent on domain-general or domain-specific processes and speech perceptual learning persists under divided attention.

Subject(s)

Speech Perception , Speech , Humans , Learning , Noise/adverse effects , Language

7.

Speech motor adaptation during synchronous and metronome-timed speech.

Bradshaw, Abigail R; Lametti, Daniel R; Shiller, Douglas M; Jasmin, Kyle; Huang, Ruiling; McGettigan, Carolyn.

J Exp Psychol Gen ; 152(12): 3476-3489, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37616075

ABSTRACT

Sensorimotor integration during speech has been investigated by altering the sound of a speaker's voice in real time; in response, the speaker learns to change their production of speech sounds in order to compensate (adaptation). This line of research has however been predominantly limited to very simple speaking contexts, typically involving (a) repetitive production of single words and (b) production of speech while alone, without the usual exposure to other voices. This study investigated adaptation to a real-time perturbation of the first and second formants during production of sentences either in synchrony with a prerecorded voice (synchronous speech group) or alone (solo speech group). Experiment 1 (n = 30) found no significant difference in the average magnitude of compensatory formant changes between the groups; however, synchronous speech resulted in increased between-individual variability in such formant changes. Participants also showed acoustic-phonetic convergence to the voice they were synchronizing with prior to introduction of the feedback alteration. Furthermore, the extent to which the changes required for convergence agreed with those required for adaptation was positively correlated with the magnitude of subsequent adaptation. Experiment 2 tested an additional group with a metronome-timed speech task (n = 15) and found a similar pattern of increased between-participant variability in formant changes. These findings demonstrate that speech motor adaptation can be measured robustly at the group level during performance of more complex speaking tasks; however, further work is needed to resolve whether self-voice adaptation and other-voice convergence reflect additive or interactive effects during sensorimotor control of speech. (PsycInfo Database Record (c) 2023 APA, all rights reserved).

Subject(s)

Speech Perception , Voice , Humans , Speech/physiology , Speech Perception/physiology , Voice/physiology , Phonetics , Learning

8.

Talker and accent familiarity yield advantages for voice identity perception: A voice sorting study.

Njie, Sheriff; Lavan, Nadine; McGettigan, Carolyn.

Mem Cognit ; 51(1): 175-187, 2023 01.

Article in English | MEDLINE | ID: mdl-35274221

ABSTRACT

In the current study, we examine and compare the effects of talker and accent familiarity in the context of a voice identity sorting task, using naturally varying voice recording samples from the TV show Derry Girls. Voice samples were thus all spoken with a regional accent of UK/Irish English (from [London]derry). We tested four listener groups: Listeners were either familiar or unfamiliar with the TV show (and therefore the talker identities) and were either highly familiar or relatively less familiar with Northern Irish accents. Both talker and accent familiarity significantly improved accuracy of voice identity sorting performance. However, the talker familiarity benefits were overall larger, and more consistent. We discuss the results in light of a possible hierarchy of familiarity effects and argue that our findings may provide additional evidence for interactions of speech and identity processing pathways in voice identity perception. We also identify some key limitations in the current work and provide suggestions for future studies to address these.

Subject(s)

Speech Perception , Voice , Female , Humans , Language , Speech , Recognition, Psychology

9.

A model for person perception from familiar and unfamiliar voices.

Lavan, Nadine; McGettigan, Carolyn.

Commun Psychol ; 1(1): 1, 2023.

Article in English | MEDLINE | ID: mdl-38665246

ABSTRACT

When hearing a voice, listeners can form a detailed impression of the person behind the voice. Existing models of voice processing focus primarily on one aspect of person perception - identity recognition from familiar voices - but do not account for the perception of other person characteristics (e.g., sex, age, personality traits). Here, we present a broader perspective, proposing that listeners have a common perceptual goal of perceiving who they are hearing, whether the voice is familiar or unfamiliar. We outline and discuss a model - the Person Perception from Voices (PPV) model - that achieves this goal via a common mechanism of recognising a familiar person, persona, or set of speaker characteristics. Our PPV model aims to provide a more comprehensive account of how listeners perceive the person they are listening to, using an approach that incorporates and builds on aspects of the hierarchical frameworks and prototype-based mechanisms proposed within existing models of voice identity recognition.

10.

Real-time magnetic resonance imaging reveals distinct vocal tract configurations during spontaneous and volitional laughter.

Belyk, Michel; McGettigan, Carolyn.

Philos Trans R Soc Lond B Biol Sci ; 377(1863): 20210511, 2022 11 07.

Article in English | MEDLINE | ID: mdl-36126659

ABSTRACT

A substantial body of acoustic and behavioural evidence points to the existence of two broad categories of laughter in humans: spontaneous laughter that is emotionally genuine and somewhat involuntary, and volitional laughter that is produced on demand. In this study, we tested the hypothesis that these are also physiologically distinct vocalizations, by measuring and comparing them using real-time magnetic resonance imaging (rtMRI) of the vocal tract. Following Ruch and Ekman (Ruch and Ekman 2001 In Emotions, qualia, and consciousness (ed. A Kaszniak), pp. 426-443), we further predicted that spontaneous laughter should be relatively less speech-like (i.e. less articulate) than volitional laughter. We collected rtMRI data from five adult human participants during spontaneous laughter, volitional laughter and spoken vowels. We report distinguishable vocal tract shapes during the vocalic portions of these three vocalization types, where volitional laughs were intermediate between spontaneous laughs and vowels. Inspection of local features within the vocal tract across the different vocalization types offers some additional support for Ruch and Ekman's predictions. We discuss our findings in light of a dual pathway hypothesis for the neural control of human volitional and spontaneous vocal behaviours, identifying tongue shape and velum lowering as potential biomarkers of spontaneous laughter to be investigated in future research. This article is part of the theme issue 'Cracking the laugh code: laughter through the lens of biology, psychology and neuroscience'.

Subject(s)

Laughter , Voice , Adult , Emotions , Humans , Laughter/physiology , Laughter/psychology , Magnetic Resonance Imaging , Volition

11.

Individual differences in vocal size exaggeration.

Belyk, Michel; Waters, Sheena; Kanber, Elise; Miquel, Marc E; McGettigan, Carolyn.

Sci Rep ; 12(1): 2611, 2022 02 16.

Article in English | MEDLINE | ID: mdl-35173178

ABSTRACT

The human voice carries socially relevant information such as how authoritative, dominant, and attractive the speaker sounds. However, some speakers may be able to manipulate listeners by modulating the shape and size of their vocal tract to exaggerate certain characteristics of their voice. We analysed the veridical size of speakers' vocal tracts using real-time magnetic resonance imaging as they volitionally modulated their voice to sound larger or smaller, corresponding changes to the size implied by the acoustics of their voice, and their influence over the perceptions of listeners. Individual differences in this ability were marked, spanning from nearly incapable to nearly perfect vocal modulation, and was consistent across modalities of measurement. Further research is needed to determine whether speakers who are effective at vocal size exaggeration are better able to manipulate their social environment, and whether this variation is an inherited quality of the individual, or the result of life experiences such as vocal training.

Subject(s)

Auditory Perception/physiology , Individuality , Speech Perception/physiology , Vocal Cords/anatomy & histology , Vocal Cords/physiology , Voice , Humans , Life Change Events , Magnetic Resonance Imaging , Phonetics , Social Environment , Sound , Speech Acoustics

12.

Speech timing cues reveal deceptive speech in social deduction board games.

Zhang, Ziyun; McGettigan, Carolyn; Belyk, Michel.

PLoS One ; 17(2): e0263852, 2022.

Article in English | MEDLINE | ID: mdl-35148352

ABSTRACT

The faculty of language allows humans to state falsehoods in their choice of words. However, while what is said might easily uphold a lie, how it is said may reveal deception. Hence, some features of the voice that are difficult for liars to control may keep speech mostly, if not always, honest. Previous research has identified that speech timing and voice pitch cues can predict the truthfulness of speech, but this evidence has come primarily from laboratory experiments, which sacrifice ecological validity for experimental control. We obtained ecologically valid recordings of deceptive speech while observing natural utterances from players of a popular social deduction board game, in which players are assigned roles that either induce honest or dishonest interactions. When speakers chose to lie, they were prone to longer and more frequent pauses in their speech. This finding is in line with theoretical predictions that lying is more cognitively demanding. However, lying was not reliably associated with vocal pitch. This contradicts predictions that increased physiological arousal from lying might increase muscular tension in the larynx, but is consistent with human specialisations that grant Homo sapiens sapiens an unusual degree of control over the voice relative to other primates. The present study demonstrates the utility of social deduction board games as a means of making naturalistic observations of human behaviour from semi-structured social interactions.

Subject(s)

Deception , Larynx/physiology , Speech Perception/physiology , Cues , Female , Games, Recreational , Humans , Male , Muscle Tonus , Time Factors

13.

Highly accurate and robust identity perception from personally familiar voices.

Kanber, Elise; Lavan, Nadine; McGettigan, Carolyn.

J Exp Psychol Gen ; 151(4): 897-911, 2022 Apr.

Article in English | MEDLINE | ID: mdl-34672658

ABSTRACT

Previous research suggests that familiarity with a voice can afford benefits for voice and speech perception. However, even familiar voice perception has been reported to be error-prone, especially in the face of challenges such as reduced verbal cues and acoustic distortions. It has been hypothesized that such findings may arise due to listeners not being "familiar enough" with the voices used in laboratory studies, and thus being inexperienced with their full vocal repertoire. Extending this idea, voice perception based on highly familiar voices-acquired via substantial, naturalistic experience-should therefore be more robust than voice perception from less familiar voices. We investigated this proposal by contrasting voice perception of personally familiar voices (participants' romantic partners) versus lab-trained voices in challenging experimental tasks. Specifically, we tested how differences in familiarity may affect voice-identity perception from nonverbal vocalizations and acoustically modulated speech. Large benefits for the personally familiar voice over a less familiar, lab-trained voice were found for identity recognition, with listeners displaying both highly accurate yet more conservative recognition of personally familiar voices. However, no familiar-voice benefits were found for speech perception in background noise. Our findings suggest that listeners have fine-tuned representations of highly familiar voices that result in more robust and accurate voice recognition despite challenging listening contexts, yet these advantages may not always extend to speech perception. We conclude that familiarity with voices is indeed on a continuum, with identity perception for personally familiar voices being highly accurate. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

Subject(s)

Speech Perception , Voice , Auditory Perception , Humans , Recognition, Psychology , Speech

14.

Unimodal and cross-modal identity judgements using an audio-visual sorting task: Evidence for independent processing of faces and voices.

Lavan, Nadine; Smith, Harriet M J; McGettigan, Carolyn.

Mem Cognit ; 50(1): 216-231, 2022 01.

Article in English | MEDLINE | ID: mdl-34254274

ABSTRACT

Unimodal and cross-modal information provided by faces and voices contribute to identity percepts. To examine how these sources of information interact, we devised a novel audio-visual sorting task in which participants were required to group video-only and audio-only clips into two identities. In a series of three experiments, we show that unimodal face and voice sorting were more accurate than cross-modal sorting: While face sorting was consistently most accurate followed by voice sorting, cross-modal sorting was at chancel level or below. In Experiment 1, we compared performance in our novel audio-visual sorting task to a traditional identity matching task, showing that unimodal and cross-modal identity perception were overall moderately more accurate than the traditional identity matching task. In Experiment 2, separating unimodal from cross-modal sorting led to small improvements in accuracy for unimodal sorting, but no change in cross-modal sorting performance. In Experiment 3, we explored the effect of minimal audio-visual training: Participants were shown a clip of the two identities in conversation prior to completing the sorting task. This led to small, nonsignificant improvements in accuracy for unimodal and cross-modal sorting. Our results indicate that unfamiliar face and voice perception operate relatively independently with no evidence of mutual benefit, suggesting that extracting reliable cross-modal identity information is challenging.

Subject(s)

Auditory Perception , Recognition, Psychology , Voice , Humans

15.

Singers show enhanced performance and neural representation of vocal imitation.

Waters, Sheena; Kanber, Elise; Lavan, Nadine; Belyk, Michel; Carey, Daniel; Cartei, Valentina; Lally, Clare; Miquel, Marc; McGettigan, Carolyn.

Philos Trans R Soc Lond B Biol Sci ; 376(1840): 20200399, 2021 12 20.

Article in English | MEDLINE | ID: mdl-34719245

ABSTRACT

Humans have a remarkable capacity to finely control the muscles of the larynx, via distinct patterns of cortical topography and innervation that may underpin our sophisticated vocal capabilities compared with non-human primates. Here, we investigated the behavioural and neural correlates of laryngeal control, and their relationship to vocal expertise, using an imitation task that required adjustments of larynx musculature during speech. Highly trained human singers and non-singer control participants modulated voice pitch and vocal tract length (VTL) to mimic auditory speech targets, while undergoing real-time anatomical scans of the vocal tract and functional scans of brain activity. Multivariate analyses of speech acoustics, larynx movements and brain activation data were used to quantify vocal modulation behaviour and to search for neural representations of the two modulated vocal parameters during the preparation and execution of speech. We found that singers showed more accurate task-relevant modulations of speech pitch and VTL (i.e. larynx height, as measured with vocal tract MRI) during speech imitation; this was accompanied by stronger representation of VTL within a region of the right somatosensory cortex. Our findings suggest a common neural basis for enhanced vocal control in speech and song. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.

Subject(s)

Singing , Voice , Animals , Humans , Imitative Behavior , Primates , Singing/physiology , Speech/physiology , Vocalization, Animal , Voice/physiology

16.

A dual larynx motor networks hypothesis.

Belyk, Michel; Eichert, Nicole; McGettigan, Carolyn.

Philos Trans R Soc Lond B Biol Sci ; 376(1840): 20200392, 2021 12 20.

Article in English | MEDLINE | ID: mdl-34719252

ABSTRACT

Humans are vocal modulators par excellence. This ability is supported in part by the dual representation of the laryngeal muscles in the motor cortex. Movement, however, is not the product of motor cortex alone but of a broader motor network. This network consists of brain regions that contain somatotopic maps that parallel the organization in motor cortex. We therefore present a novel hypothesis that the dual laryngeal representation is repeated throughout the broader motor network. In support of the hypothesis, we review existing literature that demonstrates the existence of network-wide somatotopy and present initial evidence for the hypothesis' plausibility. Understanding how this uniquely human phenotype in motor cortex interacts with broader brain networks is an important step toward understanding how humans evolved the ability to speak. We further suggest that this system may provide a means to study how individual components of the nervous system evolved within the context of neuronal networks. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.

Subject(s)

Larynx , Motor Cortex , Brain , Brain Mapping , Larynx/physiology , Magnetic Resonance Imaging , Motor Cortex/physiology , Movement

17.

Convergence in voice fundamental frequency during synchronous speech.

Bradshaw, Abigail R; McGettigan, Carolyn.

PLoS One ; 16(10): e0258747, 2021.

Article in English | MEDLINE | ID: mdl-34673811

ABSTRACT

Joint speech behaviours where speakers produce speech in unison are found in a variety of everyday settings, and have clinical relevance as a temporary fluency-enhancing technique for people who stutter. It is currently unknown whether such synchronisation of speech timing among two speakers is also accompanied by alignment in their vocal characteristics, for example in acoustic measures such as pitch. The current study investigated this by testing whether convergence in voice fundamental frequency (F0) between speakers could be demonstrated during synchronous speech. Sixty participants across two online experiments were audio recorded whilst reading a series of sentences, first on their own, and then in synchrony with another speaker (the accompanist) in a number of between-subject conditions. Experiment 1 demonstrated significant convergence in participants' F0 to a pre-recorded accompanist voice, in the form of both upward (high F0 accompanist condition) and downward (low and extra-low F0 accompanist conditions) changes in F0. Experiment 2 demonstrated that such convergence was not seen during a visual synchronous speech condition, in which participants spoke in synchrony with silent video recordings of the accompanist. An audiovisual condition in which participants were able to both see and hear the accompanist in pre-recorded videos did not result in greater convergence in F0 compared to synchronisation with the pre-recorded voice alone. These findings suggest the need for models of speech motor control to incorporate interactions between self- and other-speech feedback during speech production, and suggest a novel hypothesis for the mechanisms underlying the fluency-enhancing effects of synchronous speech in people who stutter.

Subject(s)

Phonation , Phonetics , Speech Acoustics , Speech/physiology , Voice/physiology , Adult , Female , Humans , Language , Male

18.

Familiarity and task context shape the use of acoustic information in voice identity perception.

Lavan, Nadine; Kreitewolf, Jens; Obleser, Jonas; McGettigan, Carolyn.

Cognition ; 215: 104780, 2021 10.

Article in English | MEDLINE | ID: mdl-34298232

ABSTRACT

Familiar and unfamiliar voice perception are often understood as being distinct from each other. For identity perception, theoretical work has proposed that listeners use acoustic information in different ways to perceive identity from familiar and unfamiliar voices: Unfamiliar voices are thought to be processed based on close comparisons of acoustic properties, while familiar voices are processed based on diagnostic acoustic features that activate a stored person-specific representation of that voice. To date no empirical study has directly examined whether and how familiar and unfamiliar listeners differ in their use of acoustic information for identity perception. Here, we tested this theoretical claim by linking listeners' judgements in voice identity tasks to complex acoustic representation - spectral similarity of the heard voice recordings. Participants (N = 177) who were either familiar or unfamiliar with a set of voices completed an identity discrimination task (Experiment 1) or an identity sorting task (Experiment 2). In both experiments, identity judgements for familiar and unfamiliar voices were guided by spectral similarity: Pairs of recordings with greater acoustic similarity were more likely to be perceived as belonging to the same voice identity. However, while there were no differences in how familiar and unfamiliar listeners used acoustic information for identity discrimination, differences were apparent for identity sorting. Our study therefore challenges proposals that view familiar and unfamiliar voice perception as being at all times distinct. Instead, our data suggest a critical role of the listening situation in which familiar and unfamiliar voices are evaluated, thus characterising voice identity perception as a highly dynamic process in which listeners opportunistically make use of any kind of information they can access.

Subject(s)

Speech Perception , Voice , Acoustics , Auditory Perception , Humans , Recognition, Psychology

19.

Human larynx motor cortices coordinate respiration for vocal-motor control.

Belyk, Michel; Brown, Rachel; Beal, Deryk S; Roebroeck, Alard; McGettigan, Carolyn; Guldner, Stella; Kotz, Sonja A.

Neuroimage ; 239: 118326, 2021 10 01.

Article in English | MEDLINE | ID: mdl-34216772

ABSTRACT

Vocal flexibility is a hallmark of the human species, most particularly the capacity to speak and sing. This ability is supported in part by the evolution of a direct neural pathway linking the motor cortex to the brainstem nucleus that controls the larynx the primary sound source for communication. Early brain imaging studies demonstrated that larynx motor cortex at the dorsal end of the orofacial division of motor cortex (dLMC) integrated laryngeal and respiratory control, thereby coordinating two major muscular systems that are necessary for vocalization. Neurosurgical studies have since demonstrated the existence of a second larynx motor area at the ventral extent of the orofacial motor division (vLMC) of motor cortex. The vLMC has been presumed to be less relevant to speech motor control, but its functional role remains unknown. We employed a novel ultra-high field (7T) magnetic resonance imaging paradigm that combined singing and whistling simple melodies to localise the larynx motor cortices and test their involvement in respiratory motor control. Surprisingly, whistling activated both 'larynx areas' more strongly than singing despite the reduced involvement of the larynx during whistling. We provide further evidence for the existence of two larynx motor areas in the human brain, and the first evidence that laryngeal-respiratory integration is a shared property of both larynx motor areas. We outline explicit predictions about the descending motor pathways that give these cortical areas access to both the laryngeal and respiratory systems and discuss the implications for the evolution of speech.

Subject(s)

Larynx/physiology , Magnetic Resonance Imaging/methods , Motor Cortex/physiology , Neural Pathways/physiology , Respiration , Speech/physiology , Adult , Female , Humans , Least-Squares Analysis , Male , Motor Cortex/diagnostic imaging , Respiratory Mechanics/physiology , Rest/physiology , Singing/physiology , Young Adult

20.

Explaining face-voice matching decisions: The contribution of mouth movements, stimulus effects and response biases.

Lavan, Nadine; Smith, Harriet; Jiang, Li; McGettigan, Carolyn.

Atten Percept Psychophys ; 83(5): 2205-2216, 2021 Jul.

Article in English | MEDLINE | ID: mdl-33797024

ABSTRACT

Previous studies have shown that face-voice matching accuracy is more consistently above chance for dynamic (i.e. speaking) faces than for static faces. This suggests that dynamic information can play an important role in informing matching decisions. We initially asked whether this advantage for dynamic stimuli is due to shared information across modalities that is encoded in articulatory mouth movements. Participants completed a sequential face-voice matching task with (1) static images of faces, (2) dynamic videos of faces, (3) dynamic videos where only the mouth was visible, and (4) dynamic videos where the mouth was occluded, in a well-controlled stimulus set. Surprisingly, after accounting for random variation in the data due to design choices, accuracy for all four conditions was at chance. Crucially, however, exploratory analyses revealed that participants were not responding randomly, with different patterns of response biases being apparent for different conditions. Our findings suggest that face-voice identity matching may not be possible with above-chance accuracy but that analyses of response biases can shed light upon how people attempt face-voice matching. We discuss these findings with reference to the differential functional roles for faces and voices recently proposed for multimodal person perception.

Subject(s)

Voice , Bias , Face , Humans , Mouth

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL