ABSTRACT
In speech, the highly flexible modulation of vocal pitch creates intonation patterns that speakers use to convey linguistic meaning. This human ability is unique among primates. Here, we used high-density cortical recordings directly from the human brain to determine the encoding of vocal pitch during natural speech. We found neural populations in bilateral dorsal laryngeal motor cortex (dLMC) that selectively encoded produced pitch but not non-laryngeal articulatory movements. This neural population controlled short pitch accents to express prosodic emphasis on a word in a sentence. Other larynx cortical representations controlling voicing and longer pitch phrase contours were found at separate sites. dLMC sites also encoded vocal pitch during a non-speech singing task. Finally, direct focal stimulation of dLMC evoked laryngeal movements and involuntary vocalization, confirming its causal role in feedforward control. Together, these results reveal the neural basis for the voluntary control of vocal pitch in human speech.
Subjects
Larynx/physiology , Motor Cortex/physiology , Speech , Adolescent , Adult , Brain Mapping , Electrocorticography , Female , Humans , Male , Middle Aged , Models, Biological , Young Adult
ABSTRACT
Vocal communication is a critical feature of social interaction across species; however, the relation between such behavior in humans and nonhumans remains unclear. To enable comparative investigation of this topic, we review the literature pertinent to interactive language use and identify the superset of cognitive operations involved in generating communicative action. We posit these functions comprise three intersecting multistep pathways: (a) the Content Pathway, which selects the movements constituting a response; (b) the Timing Pathway, which temporally structures responses; and (c) the Affect Pathway, which modulates response parameters according to internal state. These processing streams form the basis of the Convergent Pathways for Interaction framework, which provides a conceptual model for investigating the cognitive and neural computations underlying vocal communication across species.
Subjects
Language , Vocalization, Animal , Animals , Humans , Vocalization, Animal/physiology
ABSTRACT
How humans and animals segregate sensory information into discrete, behaviorally meaningful categories is one of the hallmark questions in neuroscience. Much of the research around this topic in the auditory system has centered around human speech perception, in which categorical processes result in an enhanced sensitivity for acoustically meaningful differences and a reduced sensitivity for nonmeaningful distinctions. Much less is known about whether nonhuman primates process their species-specific vocalizations in a similar manner. We address this question in the common marmoset, a small arboreal New World primate with a rich vocal repertoire produced across a range of behavioral contexts. We first show that marmosets perceptually categorize their vocalizations in ways that correspond to previously defined call types for this species. Next, we show that marmosets are differentially sensitive to changes in particular acoustic features of their most common call types and that these sensitivity differences are matched to the population statistics of their vocalizations in ways that likely maximize category formation. Finally, we show that marmosets are less sensitive to changes in these acoustic features when within the natural range of variability of their calls, which possibly reflects perceptual specializations which maintain existing call categories. These findings suggest specializations for categorical vocal perception in a New World primate species and pave the way for future studies examining their underlying neural mechanisms.
Subjects
Callithrix , Speech Perception , Animals , Humans , Vocalization, Animal , Acoustics , Species Specificity
ABSTRACT
Speech, as the spoken form of language, is fundamental for human communication. The phenomenon of covert inner speech implies functional independence of speech content and motor production. However, it remains unclear how a flexible mapping between speech content and production is achieved on the neural level. To address this, we recorded magnetoencephalography in humans performing a rule-based vocalization task. On each trial, vocalization content (one of two vowels) and production form (overt or covert) were instructed independently. Using multivariate pattern analysis, we found robust neural information about vocalization content and production, mostly originating from speech areas of the left hemisphere. Production signals dynamically transformed upon presentation of the content cue, whereas content signals remained largely stable throughout the trial. In sum, our results show dissociable neural representations of vocalization content and production in the human brain and provide insights into the neural dynamics underlying human vocalization.
Subjects
Brain , Speech Perception , Humans , Speech , Magnetoencephalography/methods , Brain Mapping
ABSTRACT
The ventrolateral prefrontal cortex (VLPFC) shows robust activation during the perception of faces and voices. However, little is known about what categorical features of social stimuli drive neural activity in this region. Since perception of identity and expression are critical social functions, we examined whether neural responses to naturalistic stimuli were driven by these two categorical features in the prefrontal cortex. We recorded single neurons in the VLPFC, while two male rhesus macaques (Macaca mulatta) viewed short audiovisual videos of unfamiliar conspecifics making expressions of aggressive, affiliative, and neutral valence. Of the 285 neurons responsive to the audiovisual stimuli, 111 neurons had a main effect (two-way ANOVA) of identity, expression, or their interaction in their stimulus-related firing rates; however, decoding of expression and identity using single-unit firing rates rendered poor accuracy. Interestingly, when decoding from pseudo-populations of recorded neurons, the accuracy for both expression and identity increased with population size, suggesting that the population transmitted information relevant to both variables. Principal components analysis of mean population activity across time revealed that population responses to the same identity followed similar trajectories in the response space, facilitating segregation from other identities. Our results suggest that identity is a critical feature of social stimuli that dictates the structure of population activity in the VLPFC, during the perception of vocalizations and their corresponding facial expressions. These findings enhance our understanding of the role of the VLPFC in social behavior.
Subjects
Prefrontal Cortex , Social Behavior , Animals , Male , Macaca mulatta , Prefrontal Cortex/physiology , Neurons/physiology , Facial Expression
ABSTRACT
Harmonics are an integral part of music, speech, and vocalizations of animals. Since the rest of the auditory environment is primarily made up of nonharmonic sounds, the auditory system needs to perceptually separate the above two kinds of sounds. In mice, harmonics, generally with two-tone components (two-tone harmonic complexes, TTHCs), form an important component of vocal communication. Communication by pups during isolation from the mother and by adult males during courtship elicits typical behaviors in female mice (dams and adult courting females, respectively). Our study shows that the processing of TTHCs is specialized in mice, providing a neural basis for perceptual differences between tones and TTHCs and also nonharmonic sounds. Responses in the primary auditory cortex (Au1) to TTHCs, measured with in vivo extracellular recordings and two-photon Ca2+ imaging of excitatory and inhibitory neurons, show enhancement, suppression, or no effect with respect to tones. Irrespective of neuron type, harmonic enhancement is maximized, and suppression is minimized, when the fundamental frequency (F0) matches the neuron's best fundamental frequency (BF0). Sex-specific processing of TTHCs is evident from differences in the distributions of neurons' best frequency (BF) and best fundamental frequency (BF0) in single units, from differences in harmonic-suppressed cases relative to BF0, independent of neuron type, and from pairwise noise correlations among excitatory and parvalbumin inhibitory interneurons. Furthermore, TTHCs elicit a higher response compared with two-tone nonharmonics in females, but not in males. Thus, our study shows specialized neural processing of TTHCs over tones and nonharmonics, highlighting local network specialization among different neuronal types.
Subjects
Acoustic Stimulation , Auditory Cortex , Animals , Female , Auditory Cortex/physiology , Male , Mice , Acoustic Stimulation/methods , Sex Characteristics , Auditory Perception/physiology , Mice, Inbred C57BL , Neurons/physiology
ABSTRACT
Audiovisual (AV) interaction has been shown in many studies of auditory cortex. However, the underlying processes and circuits are unclear because few studies have used methods that delineate the timing and laminar distribution of net excitatory and inhibitory processes within areas, much less across cortical levels. This study examined laminar profiles of neuronal activity in auditory core (AC) and parabelt (PB) cortices recorded from macaques during active discrimination of conspecific faces and vocalizations. We found modulation of multi-unit activity (MUA) in response to isolated visual stimulation, characterized by a brief deep MUA spike, putatively in white matter, followed by mid-layer MUA suppression in core auditory cortex; the later suppressive event had clear current source density concomitants, while the earlier MUA spike did not. We observed a similar facilitation-suppression sequence in the PB, with later onset latency. In combined AV stimulation, there was moderate reduction of responses to sound during the visual-evoked MUA suppression interval in both AC and PB. These data suggest a common sequence of afferent spikes, followed by synaptic inhibition; however, differences in timing and laminar location may reflect distinct visual projections to AC and PB.
Subjects
Auditory Cortex , Photic Stimulation , Animals , Auditory Cortex/physiology , Male , Photic Stimulation/methods , Acoustic Stimulation/methods , Auditory Perception/physiology , Visual Perception/physiology , Macaca mulatta , Action Potentials/physiology , Neurons/physiology , Female , Vocalization, Animal/physiology
ABSTRACT
Animal communication is central to many animal societies, and effective signal transmission is crucial for individuals to survive and reproduce successfully. One environmental factor that exerts selection pressure on acoustic signals is ambient noise. To maintain signal efficiency, species can adjust signals through phenotypic plasticity or microevolutionary response to natural selection. One of these signal adjustments is the increase in signal amplitude, called the Lombard effect, which has been frequently found in birds and mammals. However, the evolutionary origin of the Lombard effect is largely unresolved. Using a phylogenetically controlled meta-analysis, we show that the Lombard effect is also present in fish and amphibians, and contradictory results in the literature can be explained by differences in signal-to-noise ratios among studies. Our analysis also demonstrates that subcortical processes are sufficient to elicit the Lombard effect and that amplitude adjustments do not require vocal learning. We conclude that the Lombard effect is a widespread mechanism based on phenotypic plasticity in vertebrates for coping with changes in ambient noise levels.
Subjects
Biological Evolution , Noise , Vocalization, Animal , Acoustics , Animals , Mammals , Vertebrates/classification , Vocalization, Animal/physiology
ABSTRACT
Vocal production learning (the capacity to learn to produce vocalizations) is a multidimensional trait that involves different learning mechanisms during different temporal and socioecological contexts. Key outstanding questions are whether vocal production learning begins during the embryonic stage and whether mothers play an active role in this through pupil-directed vocalization behaviors. We examined variation in vocal copy similarity (an indicator of learning) in eight species from the songbird family Maluridae, using comparative and experimental approaches. We found that (1) incubating females from all species vocalized inside the nest and produced call types including a signature "B element" that was structurally similar to their nestlings' begging call; (2) in a prenatal playback experiment using superb fairy wrens (Malurus cyaneus), embryos showed a stronger heart rate response to playbacks of the B element than to another call element (A); and (3) mothers that produced slower calls had offspring with greater similarity between their begging call and the mother's B element vocalization. We conclude that malurid mothers display behaviors concordant with pupil-directed vocalizations and may actively influence their offspring's early life through sound learning shaped by maternal call tempo.
Subjects
Passeriformes , Songbirds , Animals , Female , Humans , Mothers , Vocalization, Animal/physiology , Songbirds/physiology , Learning
ABSTRACT
Autonomous sensors provide opportunities to observe organisms across spatial and temporal scales that humans cannot directly observe. By processing large data streams from autonomous sensors with deep learning methods, researchers can make novel and important natural history discoveries. In this study, we combine automated acoustic monitoring with deep learning models to observe breeding-associated activity in the endangered Sierra Nevada yellow-legged frog (Rana sierrae), a behavior that current surveys do not measure. By deploying inexpensive hydrophones and developing a deep learning model to recognize breeding-associated vocalizations, we discover three undocumented R. sierrae vocalization types and find an unexpected temporal pattern of nocturnal breeding-associated vocal activity. This study exemplifies how the combination of autonomous sensor data and deep learning can shed new light on species' natural history, especially during times or in locations where human observation is limited or impossible.
Subjects
Ranidae , Vocalization, Animal , Animals , Humans , Acoustics
ABSTRACT
Animal play encompasses a variety of aspects, with kinematic and social aspects being particularly prevalent in mammalian play behaviour. While the developmental effects of play have been increasingly documented in recent decades, understanding the specific contributions of different play aspects remains crucial to understand the function and evolutionary benefit of animal play. In our study, developing male rats were exposed to rough-and-tumble play selectively reduced in either the kinematic or the social aspect. We then assessed the developmental effects of reduced play on their appraisal of standardized human-rat play ('tickling') by examining their emission of 50 kHz ultrasonic vocalizations (USVs). Using a deep learning framework, we efficiently classified five subtypes of these USVs across six behavioural states. Our results revealed that rats lacking the kinematic aspect in play emitted fewer USVs during tactile contacts by human and generally produced fewer USVs of positive valence compared with control rats. Rats lacking the social aspect did not differ from the control and the kinematically reduced group. These results indicate aspects of play have different developmental effects, underscoring the need for researchers to further disentangle how each aspect affects animals.
Subjects
Play and Playthings , Vocalization, Animal , Animals , Male , Rats/physiology , Social Behavior , Humans , Behavior, Animal , Biomechanical Phenomena
ABSTRACT
Several animal species prefer consonant over dissonant sounds, a building block of musical scales and harmony. Could consonance and dissonance be linked, beyond music, to the emotional valence of vocalizations? We extracted the fundamental frequency from calls of young chickens with either positive or negative emotional valence, i.e. contact, brood and food calls. For each call, we calculated the frequency ratio between the maximum and the minimum values of the fundamental frequency, and we investigated which frequency ratios occurred with higher probability. We found that, for all call types, the most frequent ratios matched perfect consonance, like an arpeggio in pop music. These music-like intervals, based on the auditory frequency resolution of chicks, cannot be miscategorized into contiguous dissonant intervals. When we analysed frequency ratio distributions at a finer-grained level, we found some dissonant ratios in the contact calls produced during distress only, thus sounding a bit jazzy. Complementing the empirical data, our computational simulations suggest that physiological constraints can only partly explain both consonances and dissonances in chicks' phonation. Our data add to the mounting evidence that the building blocks of human musical traits can be found in several species, even phylogenetically distant from us.
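The ratio measure described in this abstract can be illustrated with a minimal sketch. It assumes a fundamental-frequency track given as a list of values in Hz (with 0 marking unvoiced frames) and a small reference table of just-intonation ratios; the table, the function names, and the example values are illustrative assumptions, not the authors' pipeline.

```python
import math

# Illustrative reference intervals as just-intonation ratios (an assumption,
# not the interval set used in the study).
INTERVALS = {
    "unison": 1 / 1,
    "minor third": 6 / 5,
    "major third": 5 / 4,
    "perfect fourth": 4 / 3,
    "tritone": 45 / 32,
    "perfect fifth": 3 / 2,
    "octave": 2 / 1,
}


def frequency_ratio(f0_track):
    """Ratio between the maximum and minimum voiced F0 values of one call."""
    voiced = [f for f in f0_track if f > 0]  # drop unvoiced frames
    if not voiced:
        raise ValueError("call contains no voiced frames")
    return max(voiced) / min(voiced)


def nearest_interval(ratio):
    """Return the closest reference interval and its distance in cents."""
    def cents(a, b):
        return abs(1200 * math.log2(a / b))

    name = min(INTERVALS, key=lambda k: cents(ratio, INTERVALS[k]))
    return name, cents(ratio, INTERVALS[name])


# Made-up F0 track (Hz) for a single call.
example_call = [0, 3000, 3600, 4500, 0]
ratio = frequency_ratio(example_call)
interval, distance = nearest_interval(ratio)
```

For this made-up call the ratio is 1.5, which maps to the perfect fifth. Measuring distance in cents rather than as a raw ratio difference keeps the comparison uniform on a logarithmic pitch scale.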
Subjects
Chickens , Vocalization, Animal , Animals , Chickens/physiology , Music , Emotions , Sound
ABSTRACT
Human speech and language are among the most complex motor and cognitive abilities. The discovery of a mutation in the transcription factor FOXP2 in KE family members with speech disturbances has been a landmark example of the genetic control of vocal communication in humans. Cellular mechanisms underlying this control have remained unclear. By leveraging FOXP2 mutation/deletion mouse models, we found that the KE family FOXP2R553H mutation directly disables intracellular dynein-dynactin 'protein motors' in the striatum by induction of a disruptive high level of dynactin1 that impairs TrkB endosome trafficking, microtubule dynamics, dendritic outgrowth and electrophysiological activity in striatal neurons alongside vocalization deficits. Dynactin1 knockdown in mice carrying FOXP2R553H mutations rescued these cellular abnormalities and improved vocalization. We suggest that FOXP2 controls vocal circuit formation by regulating protein motor homeostasis in striatal neurons, and that its disruption could contribute to the pathophysiology of FOXP2 mutation/deletion-associated speech disorders.
Subjects
Corpus Striatum , Speech , Humans , Mice , Animals , Speech/physiology , Corpus Striatum/metabolism , Neurons/metabolism , Neostriatum/metabolism , Speech Disorders , Mutation/genetics , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Vocalization, Animal/physiology
ABSTRACT
Ultrasonic hearing and vocalization are the physiological mechanisms controlling echolocation used in hunting and navigation by microbats and bottlenose dolphins and for social communication by mice and rats. The molecular and cellular basis for ultrasonic hearing is as yet unknown. Here, we show that knockout of the mechanosensitive ion channel PIEZO2 in cochlea disrupts ultrasonic- but not low-frequency hearing in mice, as shown by audiometry and acoustically associative freezing behavior. Deletion of Piezo2 in outer hair cells (OHCs) specifically abolishes associative learning in mice during hearing exposure at ultrasonic frequencies. Ex vivo cochlear Ca2+ imaging has revealed that ultrasonic transduction requires both PIEZO2 and the hair-cell mechanotransduction channel. The present study demonstrates that OHCs serve as effector cells, combining with PIEZO2 as an essential molecule for ultrasonic hearing in mice.
Subjects
Hair Cells, Auditory, Outer/metabolism , Hearing/physiology , Ion Channels/metabolism , Ultrasonics , Animals , Calcium/metabolism , Freezing Reaction, Cataleptic , Gene Deletion , HEK293 Cells , Humans , Mechanotransduction, Cellular , Mice, Knockout
ABSTRACT
To improve the classification of pig vocalization using vocal signals and improve recognition accuracy, a pig vocalization classification method based on multi-feature fusion is proposed in this study. With the typical vocalization of pigs in large-scale breeding houses as the research object, short-time energy, frequency centroid, formant frequency and first-order difference, and Mel frequency cepstral coefficient and first-order difference were extracted as the fusion features. These fusion features were improved using principal component analysis. A pig vocalization classification model with a BP neural network optimized based on the genetic algorithm was constructed. The results showed that using the improved features to recognize pig grunting, squealing, and coughing, the average recognition accuracy was 93.2%; the recognition precisions were 87.9%, 98.1%, and 92.7%, respectively, with an average of 92.9%; and the recognition recalls were 92.0%, 99.1%, and 87.4%, respectively, with an average of 92.8%, which indicated that the proposed pig vocalization classification method had good recognition precision and recall, and could provide a reference for pig vocalization information feedback and automatic recognition.
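Two of the features named in this abstract, short-time energy and frequency centroid, can be sketched in a few lines. The example below is a simplified illustration in pure Python: the function names, frame settings, and test tone are assumptions, and the formant, MFCC, PCA, and genetic-algorithm steps of the study are not reproduced.

```python
import cmath
import math


def split_frames(signal, frame_len, hop):
    """Cut a signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values within one frame."""
    return sum(x * x for x in frame)


def frequency_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame (naive DFT)."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(coeff))
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return sum(k * sample_rate / n * m
               for k, m in enumerate(magnitudes)) / total


# Hypothetical input: a 1 kHz tone sampled at 8 kHz, one 64-sample frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
features = [(short_time_energy(f), frequency_centroid(f, sample_rate))
            for f in split_frames(tone, frame_len=64, hop=32)]
```

Each frame yields one (energy, centroid) pair; in a fusion scheme such as the one described, pairs like these would be concatenated with the other per-frame features before dimensionality reduction. The naive DFT is quadratic in the frame length and is used here only to keep the sketch dependency-free.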
Subjects
Cough , Recognition, Psychology , Swine , Animals , Neural Networks, Computer , Principal Component Analysis
ABSTRACT
Anuran behavior and reproduction are dominated by vocalizations, rendering them vulnerable to the effects of signal masking. For anurans on display in zoos and aquaria, a major source of ambient noise is visitors, which pose a unique source of potential anthropogenic signal masking. Call characteristics (total call duration, and minimum and maximum call frequencies) of three populations of dendrobatids (Dendrobates leucomelas, Epipedobates tricolor, and Ranitomeya imitator) on public display were investigated at time periods of increasing visitor-related noise (closed, off-peak, and peak aquarium visiting hours) to determine if there were changes in call characteristics that correlated with changes in visitor noise levels. The data revealed that call length increased with more visitor noise for D. leucomelas and E. tricolor, with their longest calls during peak hours, and all three species had their shortest calls during closed hours. Both minimum and maximum call frequencies increased with more visitor noise for E. tricolor and R. imitator, with their highest frequencies during peak hours, and lowest frequencies during closed hours. This study found evidence that anurans on public display adjust their vocalizations in the presence of visitor noise. These findings support expanded monitoring of ambient noise for animals on public display to determine if noise poses significant effects that might influence well-being or reproduction.
Subjects
Animals, Zoo , Anura , Housing, Animal , Noise , Vocalization, Animal , Animals , Vocalization, Animal/physiology , Anura/physiology , Humans , Human Activities
ABSTRACT
The identity and location of vocalization pattern generating (VPG) circuits in mammals is debated. Based on physiological experiments, investigators suggested anterior brainstem circuits in the reticular formation, and anatomic evidence suggested the nucleus retroambiguus (NRA) in the posterior brainstem, or combinations of these sites as the putative mammalian VPG. Additionally, vocalization loudness is a critical factor in acoustic communication. However, many of the underlying neuronal mechanisms are still unknown. Here, we evoked calls by stimulation of the periaqueductal gray in anesthetized male rats, performed a large-scale mapping of vocalization-related activity using the activity marker c-fos, and high-density recordings of brainstem circuits using Neuropixels probes. Both c-fos expression and recording of vocalization-related activity point to a participation of the NRA in vocalization. More important, among our recorded structures, we found that the NRA is the only brainstem area showing a strong correlation between unit activity and call intensity. In addition, we observed functionally diverse patterns of vocalization-related activity in a set of regions around NRA. Dorsal to NRA, we observed activity specific to the beginning and end of vocalizations in the posterior level of the medullary reticular nucleus, dorsal part, whereas medial and lateral to the NRA, we observed activity related to call initiation. No clear vocalization-related activity was observed at anterior brainstem sites. Our findings suggest a set of functionally heterogeneous regions around the NRA contribute to vocal pattern generation in rats.
SIGNIFICANCE STATEMENT Vocalization patterns are shaped in the mammalian brainstem, but the identity and location of the circuits involved is debated. Additionally, the neuronal mechanisms of vocal intensity control are still unknown. This study consisted of a large-scale mapping of brainstem vocalization circuits based on the activity marker c-fos and high-density recordings with Neuropixels probes. The results confirm the role of nucleus retroambiguus in call production and point to a key role of neurons in this nucleus in loudness control. Dorsal to the nucleus retroambiguus and in the posterior medulla, the authors identify neurons with activity specific to the beginning and end of vocalizations. The results point to specific neural dials for various aspects of rat vocalization control in the posterior brainstem.
Subjects
Brain Stem , Vocalization, Animal , Rats , Male , Animals , Vocalization, Animal/physiology , Brain Stem/physiology , Medulla Oblongata/physiology , Periaqueductal Gray/physiology , Reticular Formation , Mammals
ABSTRACT
Speech and language play an important role in human vocal communication. Studies have shown that vocal disorders can result from genetic factors. In the absence of high-quality data on humans, mouse vocalization experiments in laboratory settings have been proven useful in providing valuable insights into mammalian vocal development, including especially the impact of certain genetic mutations. Such data sets usually consist of categorical syllable sequences along with continuous intersyllable interval (ISI) times for mice of different genotypes vocalizing under different contexts. ISIs are of particular importance as increased ISIs can be an indication of possible vocal impairment. Statistical methods for properly analyzing ISIs along with the transition probabilities have however been lacking. In this article, we propose a class of novel Markov renewal mixed models that capture the stochastic dynamics of both state transitions and ISI lengths. Specifically, we model the transition dynamics and the ISIs using Dirichlet and gamma mixtures, respectively, allowing the mixture probabilities in both cases to vary flexibly with fixed covariate effects as well as random individual-specific effects. We apply our model to analyze the impact of a mutation in the Foxp2 gene on mouse vocal behavior. We find that genotypes and social contexts significantly affect the length of ISIs but, compared to previous analyses, the influences of genotype and social context on the syllable transition dynamics are weaker.
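The data structure this abstract describes, categorical syllable sequences paired with continuous intersyllable intervals, can be illustrated by simulating a plain Markov renewal process. The sketch below is a deliberately reduced version: it uses one fixed transition table and one gamma distribution per syllable, with no mixtures, covariates, or individual random effects, and every syllable label and parameter value is a made-up assumption.

```python
import random


def simulate_markov_renewal(transitions, isi_params, start, n_steps, rng):
    """Simulate syllables and the gamma-distributed gaps between them.

    transitions: {state: {next_state: probability}}
    isi_params:  {state: (shape, scale)} for the gap that follows each state
    """
    state = start
    sequence = [state]
    intervals = []
    for _ in range(n_steps):
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        following = rng.choices(next_states, weights=weights)[0]
        shape, scale = isi_params[state]
        intervals.append(rng.gammavariate(shape, scale))  # ISI in seconds
        sequence.append(following)
        state = following
    return sequence, intervals


# Hypothetical two-syllable system with illustrative parameters.
transitions = {
    "d": {"d": 0.7, "u": 0.3},
    "u": {"d": 0.4, "u": 0.6},
}
isi_params = {"d": (2.0, 0.05), "u": (2.0, 0.10)}

rng = random.Random(0)  # fixed seed so the simulation is repeatable
syllables, isis = simulate_markov_renewal(
    transitions, isi_params, start="d", n_steps=50, rng=rng)
```

In the proposed models both ingredients would instead be mixtures whose weights depend on genotype, social context, and the individual mouse; this sketch only shows how a state sequence and its interval times are generated jointly.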
ABSTRACT
The study of ecological mechanisms influencing organisms' phenotypic variation is a central subject of evolutionary biology. In this study, we characterized morphological, plumage colour and acoustic variation in cactus wrens Campylorhynchus brunneicapillus throughout its distribution. We assessed whether Gloger's, Allen's and Bergmann's ecogeographical rules, and the acoustic adaptation hypothesis relate to geographical trait variation. We analysed specimen coloration in belly and crown plumage, beak shape and structural song characteristics. We tested whether the subspecific classification or the peninsular/mainland groups mirrored the geographical variation in phenotypes and whether ecological factors were associated with patterns of trait variation. Our results suggest that colour, beak shape and acoustic traits varied across the range, in agreement with two lineages described by genetics. The simple versions of Gloger's and Allen's rules are related to variations in colour traits and morphology. Conversely, patterns of phenotypic variation did not support Bergmann's rule. The acoustic adaptation hypothesis supported song divergence for frequency-related traits. Phenotypic variation supports the hypothesis of two taxa: C. affinis in the Baja California peninsula and C. brunneicapillus in the mainland. The ecological factors are associated with phenotypic trait adaptations, suggesting that divergence between lineages could result from ecological divergence.
Subjects
Cactaceae , Songbirds , Animals , Songbirds/genetics , Color , Mexico , Phenotype
ABSTRACT
Human language follows statistical regularities or linguistic laws. For instance, Zipf's law of brevity states that the more frequently a word is used, the shorter it tends to be. All human languages adhere to this word structure. However, it is unclear whether Zipf's law emerged de novo in humans or whether it also exists in the non-linguistic vocal systems of our primate ancestors. Using a vocal conditioning paradigm, we examined the capacity of marmoset monkeys to efficiently encode vocalizations. We observed that marmosets adopted vocal compression strategies at three levels: (i) increasing call rate, (ii) decreasing call duration and (iii) increasing the proportion of short calls. Our results demonstrate that marmosets, when able to freely choose what to vocalize, exhibit vocal statistical regularities consistent with Zipf's law of brevity that go beyond their context-specific natural vocal behaviour. This suggests that linguistic laws emerged in non-linguistic vocal systems in the primate lineage.
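Zipf's law of brevity predicts a negative association between how often a signal is used and how long it is. One common way to check this, sketched below as an assumption rather than the analysis used in the study, is a Spearman rank correlation between usage counts and durations; the call counts and durations are made-up values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: how often each call type was used and its mean duration (s).
call_counts = [120, 60, 30, 10]
call_durations = [0.05, 0.12, 0.40, 1.10]
rho = spearman(call_counts, call_durations)
```

Here the most frequent call type is also the shortest and the ordering is perfectly reversed, so rho is -1.0. With real data the coefficient would be weaker, and a permutation test would be one way to judge whether it differs from chance.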