Results 1 - 20 of 28,081
1.
Cell ; 184(18): 4626-4639.e13, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34411517

ABSTRACT

Speech perception is thought to rely on a cortical feedforward serial transformation of acoustic into linguistic representations. Using intracranial recordings across the entire human auditory cortex, electrocortical stimulation, and surgical ablation, we show that cortical processing across areas is not consistent with a serial hierarchical organization. Instead, response latency and receptive field analyses demonstrate parallel and distinct information processing in the primary and nonprimary auditory cortices. This functional dissociation was also observed causally: stimulation of the primary auditory cortex evoked auditory hallucinations but did not distort or interfere with speech perception, whereas stimulation of nonprimary cortex in the superior temporal gyrus had the opposite effects. Ablation of the primary auditory cortex did not affect speech perception. These results establish a distributed functional organization of parallel information processing throughout the human auditory cortex and demonstrate an essential, independent role for nonprimary auditory cortex in speech processing.


Subjects
Auditory Cortex/physiology, Speech/physiology, Audiometry, Pure-Tone, Electrodes, Electronic Data Processing, Humans, Phonetics, Pitch Perception, Reaction Time/physiology, Temporal Lobe/physiology
2.
Cell ; 174(1): 21-31.e9, 2018 06 28.
Article in English | MEDLINE | ID: mdl-29958109

ABSTRACT

In speech, the highly flexible modulation of vocal pitch creates intonation patterns that speakers use to convey linguistic meaning. This human ability is unique among primates. Here, we used high-density cortical recordings directly from the human brain to determine the encoding of vocal pitch during natural speech. We found neural populations in bilateral dorsal laryngeal motor cortex (dLMC) that selectively encoded produced pitch but not non-laryngeal articulatory movements. This neural population controlled short pitch accents to express prosodic emphasis on a word in a sentence. Other larynx cortical representations controlling voicing and longer pitch phrase contours were found at separate sites. dLMC sites also encoded vocal pitch during a non-speech singing task. Finally, direct focal stimulation of dLMC evoked laryngeal movements and involuntary vocalization, confirming its causal role in feedforward control. Together, these results reveal the neural basis for the voluntary control of vocal pitch in human speech. VIDEO ABSTRACT.


Subjects
Larynx/physiology, Motor Cortex/physiology, Speech, Adolescent, Adult, Brain Mapping, Electrocorticography, Female, Humans, Male, Middle Aged, Models, Biological, Young Adult
3.
Cell ; 164(6): 1269-1276, 2016 Mar 10.
Article in English | MEDLINE | ID: mdl-26967292

ABSTRACT

The use of vocalizations to communicate information and elaborate social bonds is an adaptation seen in many vertebrate species. Human speech is an extreme version of this pervasive form of communication. Unlike the vocalizations exhibited by the majority of land vertebrates, speech is a learned behavior requiring early sensory exposure and auditory feedback for its development and maintenance. Studies in humans and a small number of other species have provided insights into the neural and genetic basis for learned vocal communication and are helping to delineate the roles of brain circuits across the cortex, basal ganglia, and cerebellum in generating vocal behaviors. This Review provides an outline of the current knowledge about these circuits and the genes implicated in vocal communication, as well as a perspective on future research directions in this field.


Subjects
Speech, Vocalization, Animal, Animals, Brain/physiology, Forkhead Transcription Factors/genetics, Forkhead Transcription Factors/metabolism, Humans, Learning, Nervous System Diseases/genetics, Neural Pathways
4.
Nature ; 626(7999): 603-610, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38297120

ABSTRACT

Humans are capable of generating extraordinarily diverse articulatory movement combinations to produce meaningful speech. This ability to orchestrate specific phonetic sequences, together with their syllabification and inflection over subsecond timescales, allows us to produce thousands of word sounds and is a core component of language1,2. The fundamental cellular units and constructs by which we plan and produce words during speech, however, remain largely unknown. Here, using acute ultrahigh-density Neuropixels recordings capable of sampling across the cortical column in humans, we discover neurons in the language-dominant prefrontal cortex that encoded detailed information about the phonetic arrangement and composition of planned words during the production of natural speech. These neurons represented the specific order and structure of articulatory events before utterance and reflected the segmentation of phonetic sequences into distinct syllables. They also accurately predicted the phonetic, syllabic and morphological components of upcoming words and showed a temporally ordered dynamic. Collectively, we show how these mixtures of cells are broadly organized along the cortical column and how their activity patterns transition from articulation planning to production. We also demonstrate how these cells reliably track the detailed composition of consonant and vowel sounds during perception and how they distinguish processes specifically related to speaking from those related to listening. Together, these findings reveal a remarkably structured organization and encoding cascade of phonetic representations by prefrontal neurons in humans and demonstrate a cellular process that can support the production of speech.


Subjects
Neurons, Phonetics, Prefrontal Cortex, Speech, Humans, Movement, Neurons/physiology, Speech/physiology, Speech Perception/physiology, Prefrontal Cortex/cytology, Prefrontal Cortex/physiology
5.
Nature ; 626(7999): 593-602, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38093008

ABSTRACT

Understanding the neural basis of speech perception requires that we study the human brain both at the scale of the fundamental computational unit of neurons and in their organization across the depth of cortex. Here we used high-density Neuropixels arrays1-3 to record from 685 neurons across cortical layers at nine sites in a high-level auditory region that is critical for speech, the superior temporal gyrus4,5, while participants listened to spoken sentences. Single neurons encoded a wide range of speech sound cues, including features of consonants and vowels, relative vocal pitch, onsets, amplitude envelope and sequence statistics. Each cross-laminar recording site exhibited dominant tuning to a primary speech feature while also containing a substantial proportion of neurons that encoded other features, contributing to heterogeneous selectivity. Spatially, neurons at similar cortical depths tended to encode similar speech features. Activity across all cortical layers was predictive of high-frequency field potentials (electrocorticography), providing a neuronal origin for macroelectrode recordings from the cortical surface. Together, these results establish single-neuron tuning across the cortical laminae as an important dimension of speech encoding in human superior temporal gyrus.


Subjects
Auditory Cortex, Neurons, Speech Perception, Temporal Lobe, Humans, Acoustic Stimulation, Auditory Cortex/cytology, Auditory Cortex/physiology, Neurons/physiology, Phonetics, Speech, Speech Perception/physiology, Temporal Lobe/cytology, Temporal Lobe/physiology, Cues, Electrodes
6.
Nat Rev Neurosci ; 25(7): 473-492, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38745103

ABSTRACT

Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by directly decoding speech from intact cortical activity has the potential to restore natural communication and self-expression. Recent discoveries have defined how key features of speech production are facilitated by the coordinated activity of vocal-tract articulatory and motor-planning cortical representations. In this Review, we highlight such progress and how it has led to successful speech decoding, first in individuals implanted with intracranial electrodes for clinical epilepsy monitoring and subsequently in individuals with paralysis as part of early feasibility clinical trials to restore speech. We discuss high-spatiotemporal-resolution neural interfaces and the adaptation of state-of-the-art speech computational algorithms that have driven rapid and substantial progress in decoding neural activity into text, audible speech, and facial movements. Although restoring natural speech is a long-term goal, speech neuroprostheses already have performance levels that surpass communication rates offered by current assistive-communication technology. Given this accelerated rate of progress in the field, we propose key evaluation metrics for speed and accuracy, among others, to help standardize across studies. We finish by highlighting several directions to more fully explore the multidimensional feature space of speech and language, which will continue to accelerate progress towards a clinically viable speech neuroprosthesis.


Subjects
Brain-Computer Interfaces, Speech, Humans, Speech/physiology, Neural Prostheses, Animals
7.
Nature ; 620(7976): 1031-1036, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37612500

ABSTRACT

Speech brain-computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by decoding neural activity evoked by attempted speech into text1,2 or sound3,4. Early demonstrations, although promising, have not yet achieved accuracies sufficiently high for communication of unconstrained sentences from a large vocabulary1-7. Here we demonstrate a speech-to-text BCI that records spiking activity from intracortical microelectrode arrays. Enabled by these high-resolution recordings, our study participant, who can no longer speak intelligibly owing to amyotrophic lateral sclerosis, achieved a 9.1% word error rate on a 50-word vocabulary (2.7 times fewer errors than the previous state-of-the-art speech BCI2) and a 23.8% word error rate on a 125,000-word vocabulary (the first successful demonstration, to our knowledge, of large-vocabulary decoding). Our participant's attempted speech was decoded at 62 words per minute, which is 3.4 times as fast as the previous record8 and begins to approach the speed of natural conversation (160 words per minute9). Finally, we highlight two aspects of the neural code for speech that are encouraging for speech BCIs: spatially intermixed tuning to speech articulators that makes accurate decoding possible from only a small region of cortex, and a detailed articulatory representation of phonemes that persists years after paralysis. These results show a feasible path forward for restoring rapid communication to people with paralysis who can no longer speak.
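The word error rates reported here are the standard speech-recognition metric: word-level edit (Levenshtein) distance normalized by the length of the reference transcript. As a point of reference, a minimal sketch of that computation follows; the example sentences are invented for illustration and are not from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # cost of deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # cost of inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical decoder output against a reference: 1 substitution + 1 insertion
# over 4 reference words gives WER = 2/4 = 0.5.
print(word_error_rate("the quick brown fox", "the quack brown box fox"))
```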


Subjects
Brain-Computer Interfaces, Neural Prostheses, Paralysis, Speech, Humans, Amyotrophic Lateral Sclerosis/physiopathology, Amyotrophic Lateral Sclerosis/rehabilitation, Cerebral Cortex/physiology, Microelectrodes, Paralysis/physiopathology, Paralysis/rehabilitation, Vocabulary
8.
Nature ; 620(7976): 1037-1046, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37612505

ABSTRACT

Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity are elusive1. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant's pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.


Subjects
Face, Neural Prostheses, Paralysis, Speech, Humans, Cerebral Cortex/physiology, Cerebral Cortex/physiopathology, Clinical Trials as Topic, Communication, Deep Learning, Gestures, Movement, Neural Prostheses/standards, Paralysis/physiopathology, Paralysis/rehabilitation, Vocabulary, Voice
9.
Nat Rev Neurosci ; 24(11): 711-722, 2023 11.
Article in English | MEDLINE | ID: mdl-37783820

ABSTRACT

Is the singing voice processed distinctively in the human brain? In this Perspective, we discuss what might distinguish song processing from speech processing in light of recent work suggesting that some cortical neuronal populations respond selectively to song, and we outline the implications for our understanding of auditory processing. We review the literature regarding the neural and physiological mechanisms of song production and perception and show that this provides evidence for key differences between song and speech processing. We conclude by discussing the significance of the notion that song processing is special in terms of how this might contribute to theories of the neurobiological origins of vocal communication and to our understanding of the neural circuitry underlying sound processing in the human cortex.


Subjects
Auditory Cortex, Humans, Auditory Perception/physiology, Speech/physiology, Brain/physiology, Acoustic Stimulation
10.
Nature ; 602(7895): 117-122, 2022 02.
Article in English | MEDLINE | ID: mdl-34987226

ABSTRACT

During conversation, people take turns speaking by rapidly responding to their partners while simultaneously avoiding interruption1,2. Such interactions display a remarkable degree of coordination, as gaps between turns are typically about 200 milliseconds3, approximately the duration of an eyeblink4. These latencies are considerably shorter than those observed in simple word-production tasks, which indicates that speakers often plan their responses while listening to their partners2. Although a distributed network of brain regions has been implicated in speech planning5-9, the neural dynamics underlying the specific preparatory processes that enable rapid turn-taking are poorly understood. Here we use intracranial electrocorticography to precisely measure neural activity as participants perform interactive tasks, and we observe a functionally and anatomically distinct class of planning-related cortical dynamics. We localize these responses to a frontotemporal circuit centred on the language-critical caudal inferior frontal cortex10 (Broca's region) and the caudal middle frontal gyrus, a region not normally implicated in speech planning11-13. Using a series of motor tasks, we then show that this planning network is more active when preparing speech as opposed to non-linguistic actions. Finally, we delineate planning-related circuitry during natural conversation that is nearly identical to the network mapped with our interactive tasks, and we find this circuit to be most active before participant speech during unconstrained turn-taking. Therefore, we have identified a speech planning network that is central to natural language generation during social interaction.


Subjects
Social Behavior, Speech/physiology, Adult, Aged, Broca Area/physiology, Electrocorticography, Executive Function, Female, Humans, Male, Middle Aged, Neural Pathways, Time Factors
11.
PLoS Biol ; 22(2): e3002492, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38386639

ABSTRACT

Stuttering occurs in early childhood during a dynamic phase of brain and behavioral development. The latest studies examining children at ages close to this critical developmental period have identified early brain alterations that are most likely linked to stuttering, while spontaneous recovery appears related to increased inter-area connectivity. By contrast, therapy-driven improvement in adults is associated with a functional reorganization within and beyond the speech network. The etiology of stuttering, however, remains enigmatic. This Unsolved Mystery highlights critical questions and points to neuroimaging findings that could inspire future research to uncover how genetics, interacting neural hierarchies, social context, and reward circuitry contribute to the many facets of stuttering.


Subjects
Stuttering, Child, Adult, Humans, Child, Preschool, Speech, Brain, Neuroimaging, Case-Control Studies
12.
PLoS Biol ; 22(3): e3002534, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38466713

ABSTRACT

Selective attention-related top-down modulation plays a significant role in separating relevant speech from irrelevant background speech when vocal attributes separating concurrent speakers are small and continuously evolving. Electrophysiological studies have shown that such top-down modulation enhances neural tracking of attended speech. Yet, the specific cortical regions involved remain unclear due to the limited spatial resolution of most electrophysiological techniques. To overcome such limitations, we collected both electroencephalography (EEG) (high temporal resolution) and functional magnetic resonance imaging (fMRI) (high spatial resolution), while human participants selectively attended to speakers in audiovisual scenes containing overlapping cocktail party speech. To utilise the advantages of the respective techniques, we analysed neural tracking of speech using the EEG data and performed representational dissimilarity-based EEG-fMRI fusion. We observed that attention enhanced neural tracking and modulated EEG correlates throughout the latencies studied. Further, attention-related enhancement of neural tracking fluctuated in predictable temporal profiles. We discuss how such temporal dynamics could arise from a combination of interactions between attention and prediction as well as plastic properties of the auditory cortex. EEG-fMRI fusion revealed attention-related iterative feedforward-feedback loops between hierarchically organised nodes of the ventral auditory object-related processing stream. Our findings support models where attention facilitates dynamic neural changes in the auditory cortex, ultimately aiding discrimination of relevant sounds from irrelevant ones while conserving neural resources.
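Representational-dissimilarity-based EEG-fMRI fusion of the kind described here is typically implemented by correlating each EEG time point's representational dissimilarity matrix (RDM) with each fMRI region's RDM. The sketch below illustrates that scheme on simulated data; the dimensions, distance metric, and rank correlation are common choices and illustrative assumptions, not the study's exact pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_conditions, n_times, n_regions = 12, 50, 4
eeg = rng.standard_normal((n_times, n_conditions, 32))      # time x condition x channel
fmri = rng.standard_normal((n_regions, n_conditions, 200))  # region x condition x voxel

# One condensed RDM per EEG time point and per fMRI region.
eeg_rdms = np.array([pdist(eeg[t], metric="correlation") for t in range(n_times)])
fmri_rdms = np.array([pdist(fmri[r], metric="correlation") for r in range(n_regions)])

# Fusion: rank-correlate every (time point, region) pair of RDMs.
fusion = np.array([[spearmanr(e, f)[0] for f in fmri_rdms] for e in eeg_rdms])
print(fusion.shape)  # (50, 4): a time course of representational overlap per region
```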


Subjects
Auditory Cortex, Speech Perception, Humans, Speech Perception/physiology, Speech, Feedback, Electroencephalography/methods, Auditory Cortex/physiology, Acoustic Stimulation/methods
13.
PLoS Biol ; 22(5): e3002631, 2024 May.
Article in English | MEDLINE | ID: mdl-38805517

ABSTRACT

Music and speech are complex and distinct auditory signals that are both foundational to the human experience. The mechanisms underpinning each domain are widely investigated. However, what perceptual mechanism transforms a sound into music or speech, and how much basic acoustic information is required to distinguish between them, remain open questions. Here, we hypothesized that a sound's amplitude modulation (AM), an essential temporal acoustic feature driving the auditory system across processing levels, is critical for distinguishing music and speech. Specifically, in contrast to paradigms using naturalistic acoustic signals (which can be challenging to interpret), we used a noise-probing approach to untangle the auditory mechanism: if AM rate and regularity are critical for perceptually distinguishing music and speech, judgments of artificially noise-synthesized ambiguous audio signals should align with their AM parameters. Across 4 experiments (N = 335), signals with a higher peak AM frequency tend to be judged as speech and those with a lower peak as music. Interestingly, this principle is consistently used by all listeners for speech judgments, but only by musically sophisticated listeners for music. In addition, signals with more regular AM are judged as music over speech, and this feature is more critical for music judgment, regardless of musical sophistication. The data suggest that the auditory system can rely on a low-level acoustic property as basic as AM to distinguish music from speech, a simple principle that invites both neurophysiological and evolutionary experiments and speculation.
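The peak AM frequency manipulated here can be estimated from a signal's envelope spectrum. Below is a hedged sketch of that estimation on a synthetic noise carrier; the Hilbert-envelope approach and the 0.5-32 Hz search band are common conventions, not necessarily the study's exact method.

```python
import numpy as np
from scipy.signal import hilbert

def peak_am_frequency(x, fs):
    """Estimate the dominant amplitude-modulation frequency (Hz) of signal x."""
    envelope = np.abs(hilbert(x))            # amplitude envelope via analytic signal
    envelope -= envelope.mean()              # remove DC before taking the spectrum
    spectrum = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    band = (freqs >= 0.5) & (freqs <= 32.0)  # typical AM range for speech and music
    return freqs[band][np.argmax(spectrum[band])]

fs = 16000
t = np.arange(2 * fs) / fs
carrier = np.random.default_rng(0).standard_normal(t.size)
x = (1.0 + np.sin(2 * np.pi * 5.0 * t)) * carrier  # noise modulated at 5 Hz
print(peak_am_frequency(x, fs))                    # ~5.0, a speech-like AM rate
```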


Subjects
Acoustic Stimulation, Auditory Perception, Music, Speech Perception, Humans, Male, Female, Adult, Auditory Perception/physiology, Acoustic Stimulation/methods, Speech Perception/physiology, Young Adult, Speech/physiology, Adolescent
14.
Proc Natl Acad Sci U S A ; 121(11): e2310766121, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38442171

ABSTRACT

The neural correlates of sentence production are typically studied using task paradigms that differ considerably from the experience of speaking outside of an experimental setting. In this fMRI study, we aimed to gain a better understanding of syntactic processing in spontaneous production versus naturalistic comprehension in three regions of interest (BA44, BA45, and left posterior middle temporal gyrus). A group of participants (n = 16) was asked to speak about the events of an episode of a TV series in the scanner. Another group of participants (n = 36) listened to the spoken recall of a participant from the first group. To model syntactic processing, we extracted word-by-word metrics of phrase-structure building with a top-down and a bottom-up parser that make different hypotheses about the timing of structure building. While the top-down parser anticipates syntactic structure, sometimes before it is obvious to the listener, the bottom-up parser builds syntactic structure in an integratory way after all of the evidence has been presented. In comprehension, neural activity was found to be better modeled by the bottom-up parser, while in production, it was better modeled by the top-down parser. We additionally modeled structure building in production with two strategies that were developed here to make different predictions about the incrementality of structure building during speaking. We found evidence for highly incremental and anticipatory structure building in production, which was confirmed by a converging analysis of the pausing patterns in speech. Overall, this study shows the feasibility of studying the neural dynamics of spontaneous language production.
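The word-by-word metrics described, nodes a top-down parser opens before each word versus nodes a bottom-up parser closes after it, can be illustrated on a bracketed phrase-structure parse. This is a toy sketch of the counting logic, not the study's code; the example sentence is invented.

```python
def node_counts(parse: str):
    """Return (word, nodes_opened_before, nodes_closed_after) for each word."""
    results, opened = [], 0
    for tok in parse.replace(")", " ) ").split():
        if tok.startswith("("):
            opened += 1                       # top-down: a node opened before the next word
        elif tok == ")":
            word, td, bu = results[-1]
            results[-1] = (word, td, bu + 1)  # bottom-up: a node closed after the last word
        else:
            results.append((tok, opened, 0))
            opened = 0
    return results

print(node_counts("(S (NP the dog) (VP chased (NP the cat)))"))
# [('the', 2, 0), ('dog', 0, 1), ('chased', 1, 0), ('the', 1, 0), ('cat', 0, 3)]
```

The contrast is visible in the output: top-down costs front-load structure before words such as the sentence-initial determiner, while bottom-up costs pile up at phrase-final words such as "cat".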


Subjects
Benchmarking, Mental Recall, Humans, Language, Software, Speech
15.
Proc Natl Acad Sci U S A ; 121(3): e2308837121, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38198530

ABSTRACT

The development of individuality during learned behavior is a common trait observed across animal species; however, the underlying biological mechanisms remain poorly understood. Similar to human speech, songbirds develop individually unique songs with species-specific traits through vocal learning. In this study, we investigate the developmental and molecular mechanisms underlying individuality in vocal learning by utilizing F1 hybrid songbirds (Taeniopygia guttata crossed with Taeniopygia bichenovii), taking an integrative approach combining experimentally controlled systematic song tutoring, unbiased discriminant analysis of song features, and single-cell transcriptomics. When tutored with songs from both parental species, F1 hybrid individuals exhibit evident diversity in their acquired songs. Approximately 30% of F1 hybrids selectively learn either song of the two parental species, while others develop merged songs that combine traits from both species. Vocal acoustic biases during vocal babbling initially appear as individual differences in songs among F1 juveniles and are maintained through the sensitive period of song vocal learning. These vocal acoustic biases emerge independently of the initial auditory experience of hearing the biological father's and passively tutored songs. We identify individual differences in transcriptional signatures in a subset of cell types, including the glutamatergic neurons projecting from the cortical vocal output nucleus to the hypoglossal nuclei, which are associated with variations of vocal acoustic features. These findings suggest that a genetically predisposed vocal motor bias serves as the initial origin of individual variation in vocal learning, influencing learning constraints and preferences.


Subjects
Individuality, Songbirds, Animals, Humans, Genetic Predisposition to Disease, Speech, Acoustics, Bias
16.
Proc Natl Acad Sci U S A ; 121(22): e2316149121, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38768342

ABSTRACT

Speech impediments are a prominent yet understudied symptom of Parkinson's disease (PD). While the subthalamic nucleus (STN) is an established clinical target for treating motor symptoms, these interventions can lead to further worsening of speech. The interplay between dopaminergic medication, STN circuitry, and their downstream effects on speech in PD is not yet fully understood. Here, we investigate the effect of dopaminergic medication on STN circuitry and probe its association with speech and cognitive functions in PD patients. We found that changes in intrinsic functional connectivity of the STN were associated with alterations in speech functions in PD. Interestingly, this relationship was characterized by altered functional connectivity of the dorsolateral and ventromedial subdivisions of the STN with the language network. Crucially, medication-induced changes in functional connectivity between the STN's dorsolateral subdivision and key regions in the language network, including the left inferior frontal cortex and the left superior temporal gyrus, correlated with alterations on a standardized neuropsychological test requiring oral responses. This relation was not observed in the written version of the same test. Furthermore, changes in functional connectivity between STN and language regions predicted the medication's downstream effects on speech-related cognitive performance. These findings reveal a previously unidentified brain mechanism through which dopaminergic medication influences speech function in PD. Our study sheds light on the subcortical-cortical circuit mechanisms underlying impaired speech control in PD. The insights gained here could inform treatment strategies aimed at mitigating speech deficits in PD and enhancing the quality of life for affected individuals.


Subjects
Language, Parkinson Disease, Speech, Subthalamic Nucleus, Humans, Parkinson Disease/physiopathology, Parkinson Disease/drug therapy, Subthalamic Nucleus/physiopathology, Subthalamic Nucleus/drug effects, Male, Speech/physiology, Speech/drug effects, Female, Middle Aged, Aged, Magnetic Resonance Imaging, Dopamine/metabolism, Nerve Net/drug effects, Nerve Net/physiopathology, Cognition/drug effects, Dopamine Agents/pharmacology, Dopamine Agents/therapeutic use
17.
Annu Rev Neurosci ; 41: 527-552, 2018 07 08.
Article in English | MEDLINE | ID: mdl-29986161

ABSTRACT

How the cerebral cortex encodes auditory features of biologically important sounds, including speech and music, is one of the most important questions in auditory neuroscience. The pursuit to understand related neural coding mechanisms in the mammalian auditory cortex can be traced back several decades to the early exploration of the cerebral cortex. Significant progress in this field has been made in the past two decades with new technical and conceptual advances. This article reviews the progress and challenges in this area of research.


Subjects
Auditory Cortex/physiology, Auditory Pathways/physiology, Auditory Perception/physiology, Brain Mapping, Animals, Hearing, Humans, Music, Speech
18.
PLoS Biol ; 21(6): e3002128, 2023 06.
Article in English | MEDLINE | ID: mdl-37279203

ABSTRACT

Humans can easily tune in to one talker in a multitalker environment while still picking up bits of background speech; however, it remains unclear how we perceive speech that is masked and to what degree non-target speech is processed. Some models suggest that perception can be achieved through glimpses, which are spectrotemporal regions where a talker has more energy than the background. Other models, however, require the recovery of the masked regions. To clarify this issue, we directly recorded from primary and non-primary auditory cortex (AC) in neurosurgical patients as they attended to one talker in multitalker speech and trained temporal response function models to predict high-gamma neural activity from glimpsed and masked stimulus features. We found that glimpsed speech is encoded at the level of phonetic features for target and non-target talkers, with enhanced encoding of target speech in non-primary AC. In contrast, encoding of masked phonetic features was found only for the target, with a greater response latency and distinct anatomical organization compared to glimpsed phonetic features. These findings suggest separate mechanisms for encoding glimpsed and masked speech and provide neural evidence for the glimpsing model of speech perception.
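The temporal response function (TRF) models used here are commonly fit as regularized linear regressions from time-lagged stimulus features to neural activity. The sketch below recovers a known kernel from simulated data; the lag count, ridge penalty, and single-feature setup are illustrative assumptions, not the study's configuration.

```python
import numpy as np

def fit_trf(stimulus, response, n_lags, alpha=1.0):
    """Ridge regression from the past n_lags stimulus samples to the response."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):            # column `lag` holds the stimulus delayed by `lag`
        X[lag:, lag] = stimulus[: n - lag]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_lags), X.T @ response)

rng = np.random.default_rng(0)
stim = rng.standard_normal(5000)
true_kernel = np.array([0.0, 1.0, 0.5, 0.25, 0.0])
resp = np.convolve(stim, true_kernel)[:5000] + 0.1 * rng.standard_normal(5000)
print(np.round(fit_trf(stim, resp, n_lags=5), 2))  # approximately [0. 1. 0.5 0.25 0.]
```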


Subjects
Speech Perception, Speech, Humans, Speech/physiology, Acoustic Stimulation, Phonetics, Speech Perception/physiology, Reaction Time
19.
PLoS Biol ; 21(3): e3002046, 2023 03.
Article in English | MEDLINE | ID: mdl-36947552

ABSTRACT

Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit their capacity to contextualize to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing, by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech compared to using lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing via minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.
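The disambiguation step the model performs, combining a context-level prior with ambiguous bottom-up evidence, reduces in its simplest form to a Bayesian update over candidate word meanings. A toy sketch with invented numbers, not taken from the model:

```python
# Toy illustration only: a context prior resolves an acoustically ambiguous word.
priors = {"bank_river": 0.2, "bank_money": 0.8}       # nonlinguistic context level
likelihoods = {"bank_river": 0.5, "bank_money": 0.5}  # acoustics alone cannot decide
posterior = {m: priors[m] * likelihoods[m] for m in priors}
total = sum(posterior.values())
posterior = {m: round(p / total, 2) for m, p in posterior.items()}
print(posterior)  # context breaks the tie: {'bank_river': 0.2, 'bank_money': 0.8}
```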


Subjects
Comprehension, Speech Perception, Humans, Comprehension/physiology, Speech, Speech Perception/physiology, Brain/physiology, Language
20.
Proc Natl Acad Sci U S A ; 120(10): e2209384120, 2023 03 07.
Article in English | MEDLINE | ID: mdl-36848573

ABSTRACT

The machine learning (ML) research community has landed on automated hate speech detection as the vital tool in the mitigation of bad behavior online. However, it is not clear that this is a widely supported view outside of the ML world. Such a disconnect can have implications for whether automated detection tools are accepted or adopted. Here we lend insight into how other key stakeholders understand the challenge of addressing hate speech and the role automated detection plays in solving it. To do so, we develop and apply a structured approach to dissecting the discourses used by online platform companies, governments, and not-for-profit organizations when discussing hate speech. We find that, where hate speech mitigation is concerned, there is a profound disconnect between the computer science research community and other stakeholder groups, which puts progress on this important problem at serious risk. We identify urgent steps that need to be taken to incorporate computational researchers into a single, coherent, multistakeholder community that is working towards civil discourse online.


Subjects
Hate, Speech, Government, Machine Learning, Organizations, Nonprofit