Results 1 - 15 of 15
1.
J Neurosci ; 42(17): 3648-3658, 2022 04 27.
Article in English | MEDLINE | ID: mdl-35347046

ABSTRACT

Speech perception in noise is a challenging everyday task with which many listeners have difficulty. Here, we report a case in which electrical brain stimulation of implanted intracranial electrodes in the left planum temporale (PT) of a neurosurgical patient significantly and reliably improved the subjective quality (up to 50%) and objective intelligibility (up to 97%) of speech-in-noise perception. Stimulation resulted in a selective enhancement of speech sounds compared with the background noises. The receptive fields of the PT sites whose stimulation improved speech perception were tuned to spectrally broad and rapidly changing sounds. Corticocortical evoked potential analysis revealed that the PT sites were located between the sites in Heschl's gyrus and the superior temporal gyrus. Moreover, the discriminability of speech from nonspeech sounds increased in population neural responses from Heschl's gyrus to the PT to the superior temporal gyrus sites. These findings causally implicate the PT in background noise suppression and may point to a novel potential neuroprosthetic solution to assist in the challenging task of speech perception in noise.

SIGNIFICANCE STATEMENT: Speech perception in noise remains a challenging task for many individuals. Here, we present a case in which electrical brain stimulation of intracranially implanted electrodes in the planum temporale of a neurosurgical patient significantly improved both the subjective quality (up to 50%) and objective intelligibility (up to 97%) of speech perception in noise. Stimulation resulted in a selective enhancement of speech sounds compared with the background noises. Our local and network-level functional analyses placed the planum temporale sites between the primary auditory areas in Heschl's gyrus and the nonprimary auditory areas in the superior temporal gyrus. These findings causally implicate the planum temporale in acoustic scene analysis and suggest potential neuroprosthetic applications to assist hearing in noise.


Subjects
Auditory Cortex , Speech Perception , Acoustic Stimulation , Auditory Cortex/physiology , Brain , Brain Mapping/methods , Hearing , Humans , Magnetic Resonance Imaging/methods , Speech/physiology , Speech Perception/physiology
2.
Nat Hum Behav ; 6(3): 455-469, 2022 03.
Article in English | MEDLINE | ID: mdl-35145280

ABSTRACT

To derive meaning from sound, the brain must integrate information across many timescales. What computations underlie multiscale integration in human auditory cortex? Evidence suggests that auditory cortex analyses sound using both generic acoustic representations (for example, spectrotemporal modulation tuning) and category-specific computations, but the timescales over which these putatively distinct computations integrate remain unclear. To answer this question, we developed a general method to estimate sensory integration windows (the time window within which stimuli alter the neural response) and applied our method to intracranial recordings from neurosurgical patients. We show that human auditory cortex integrates hierarchically across diverse timescales spanning from ~50 to 400 ms. Moreover, we find that neural populations with short and long integration windows exhibit distinct functional properties: short-integration electrodes (less than ~200 ms) show prominent spectrotemporal modulation selectivity, while long-integration electrodes (greater than ~200 ms) show prominent category selectivity. These findings reveal how multiscale integration organizes auditory computation in the human brain.
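As a rough, hedged illustration of the integration-window idea described above (not the authors' exact estimation procedure), the sketch below uses synthetic data: the same segments are presented in two different contexts, and the integration window is taken as the shortest segment duration at which responses become context-invariant. All names and numbers are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def cross_context_correlation(resp_a, resp_b):
        """Correlate responses to identical segments heard in two different contexts."""
        return np.corrcoef(resp_a.ravel(), resp_b.ravel())[0, 1]

    # Synthetic example: responses to 30 segments per duration, 50 time samples each.
    durations_ms = [62, 125, 250, 500]      # candidate segment durations
    true_window_ms = 200                    # ground-truth integration window (simulation only)

    estimated = None
    for dur in durations_ms:
        shared = rng.standard_normal((30, 50))           # segment-driven response component
        # Context "leaks" into the response only when the segment is shorter than the window.
        context_gain = 1.0 if dur < true_window_ms else 0.05
        resp_ctx1 = shared + context_gain * rng.standard_normal((30, 50))
        resp_ctx2 = shared + context_gain * rng.standard_normal((30, 50))
        r = cross_context_correlation(resp_ctx1, resp_ctx2)
        print(f"{dur:>4} ms segments: cross-context r = {r:.2f}")
        if estimated is None and r > 0.9:    # first duration with context-invariant responses
            estimated = dur

    print("Estimated integration window of this toy site:", estimated, "ms")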


Subjects
Auditory Cortex , Acoustic Stimulation/methods , Auditory Perception , Brain , Brain Mapping/methods , Humans
3.
Neuroimage ; 223: 117282, 2020 12.
Article in English | MEDLINE | ID: mdl-32828921

ABSTRACT

Hearing-impaired people often struggle to follow the speech stream of an individual talker in noisy environments. Recent studies show that the brain tracks attended speech and that the attended talker can be decoded from neural data on a single-trial level. This raises the possibility of "neuro-steered" hearing devices in which the brain-decoded intention of a hearing-impaired listener is used to enhance the voice of the attended speaker from a speech separation front-end. So far, methods that use this paradigm have focused on optimizing the brain decoding and the acoustic speech separation independently. In this work, we propose a novel framework called brain-informed speech separation (BISS) in which the information about the attended speech, as decoded from the subject's brain, is directly used to perform speech separation in the front-end. We present a deep learning model that uses neural data to extract the clean audio signal that a listener is attending to from a multi-talker speech mixture. We show that the framework can be applied successfully to the decoded output from either invasive intracranial electroencephalography (iEEG) or non-invasive electroencephalography (EEG) recordings from hearing-impaired subjects. It also results in improved speech separation, even in scenes with background noise. The generalization capability of the system renders it a perfect candidate for neuro-steered hearing-assistive devices.
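A minimal sketch of the core BISS idea as described in the abstract, under the assumption that a brain-decoded estimate of the attended speech envelope conditions a mask-estimating separation network; the architecture, sizes, and data below are illustrative stand-ins, not the paper's model.

    import torch
    import torch.nn as nn

    class BrainInformedMasker(nn.Module):
        """Toy mask estimator conditioned on a brain-decoded envelope (illustrative only)."""
        def __init__(self, n_freq=128, hidden=256):
            super().__init__()
            # Input per frame: mixture spectrogram column + decoded attended-envelope sample.
            self.net = nn.Sequential(
                nn.Linear(n_freq + 1, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, n_freq), nn.Sigmoid(),   # time-frequency mask in [0, 1]
            )

        def forward(self, mix_spec, decoded_env):
            # mix_spec: (batch, time, freq); decoded_env: (batch, time)
            x = torch.cat([mix_spec, decoded_env.unsqueeze(-1)], dim=-1)
            mask = self.net(x)
            return mask * mix_spec                         # enhanced (attended) spectrogram

    # Toy usage with random tensors standing in for a mixture and a decoded envelope.
    model = BrainInformedMasker()
    mix = torch.rand(2, 100, 128)
    env = torch.rand(2, 100)
    enhanced = model(mix, env)
    print(enhanced.shape)   # torch.Size([2, 100, 128])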


Assuntos
Encéfalo/fisiologia , Eletroencefalografia , Processamento de Sinais Assistido por Computador , Acústica da Fala , Percepção da Fala/fisiologia , Estimulação Acústica , Adulto , Algoritmos , Aprendizado Profundo , Perda Auditiva/fisiopatologia , Humanos , Pessoa de Meia-Idade
4.
Elife ; 9, 2020 06 26.
Article in English | MEDLINE | ID: mdl-32589140

ABSTRACT

Our understanding of nonlinear stimulus transformations by neural circuits is hindered by the lack of comprehensive yet interpretable computational modeling frameworks. Here, we propose a data-driven approach based on deep neural networks to directly model arbitrarily nonlinear stimulus-response mappings. Reformulating the exact function of a trained neural network as a collection of stimulus-dependent linear functions enables a locally linear receptive field interpretation of the neural network. Predicting the neural responses recorded invasively from the auditory cortex of neurosurgical patients as they listened to speech, this approach significantly improves the prediction accuracy of auditory cortical responses, particularly in nonprimary areas. Moreover, interpreting the functions learned by neural networks uncovered three distinct types of nonlinear transformations of speech that varied considerably from primary to nonprimary auditory regions. The ability of this framework to capture arbitrary stimulus-response mappings while maintaining model interpretability leads to a better understanding of cortical processing of sensory signals.
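For piecewise-linear (ReLU) networks, the "collection of stimulus-dependent linear functions" can be read out as the input-output Jacobian at each stimulus (plus a stimulus-dependent offset). Below is a hedged sketch with a toy, untrained model; the shapes and names are assumptions for illustration only.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy stimulus-response model: spectrogram patch (freq x lags, flattened) -> one neural channel.
    n_freq, n_lags = 32, 20
    model = nn.Sequential(
        nn.Linear(n_freq * n_lags, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),
    )   # in practice this network would be trained on recorded responses

    stimulus = torch.rand(n_freq * n_lags)

    # For a ReLU network the response is locally linear in the stimulus, so the Jacobian
    # at this stimulus acts as the stimulus-dependent (dynamic) receptive field.
    dynamic_strf = torch.autograd.functional.jacobian(model, stimulus)   # shape (1, n_freq * n_lags)
    dynamic_strf = dynamic_strf.reshape(n_freq, n_lags)

    print(dynamic_strf.shape)   # (32, 20) spectrotemporal weighting for this particular stimulus
    # Repeating this across stimuli shows how the effective receptive field changes nonlinearly.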


Assuntos
Córtex Auditivo/fisiologia , Percepção Auditiva/fisiologia , Células Receptoras Sensoriais/fisiologia , Estimulação Acústica , Eletrocorticografia , Humanos , Modelos Neurológicos , Redes Neurais de Computação , Dinâmica não Linear , Fala
5.
Elife ; 9, 2020 03 03.
Article in English | MEDLINE | ID: mdl-32122465

ABSTRACT

Human engagement in music rests on underlying elements such as the listeners' cultural background and interest in music. These factors modulate how listeners anticipate musical events, a process inducing instantaneous neural responses as the music confronts these expectations. Measuring such neural correlates would represent a direct window into high-level brain processing. Here we recorded cortical signals as participants listened to Bach melodies. We assessed the relative contributions of acoustic versus melodic components of the music to the neural signal. Melodic features included information on pitch progressions and their tempo, which were extracted from a predictive model of musical structure based on Markov chains. We related the music to brain activity with temporal response functions, demonstrating, for the first time, distinct cortical encoding of pitch and note-onset expectations during naturalistic music listening. This encoding was most pronounced at response latencies up to 350 ms, and in both planum temporale and Heschl's gyrus.
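A temporal response function of this kind is commonly estimated with lagged (ridge) regression from stimulus features to the neural signal. The sketch below is a generic illustration on synthetic data, not the study's exact pipeline; the feature, sampling rate, and regularization value are assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    fs = 100                                   # Hz
    lags = np.arange(0, int(0.35 * fs))        # 0-350 ms latencies

    def lag_matrix(feature, lags):
        """Stack time-lagged copies of a stimulus feature into a design matrix."""
        X = np.zeros((len(feature), len(lags)))
        for j, l in enumerate(lags):
            X[l:, j] = feature[:len(feature) - l]
        return X

    # Toy data: one melodic feature (e.g., note-onset surprisal) and one neural channel.
    n = 60 * fs
    feature = rng.standard_normal(n)
    true_trf = np.exp(-((lags - 10) ** 2) / 20.0)          # simulated response peaking at 100 ms
    neural = np.convolve(feature, true_trf)[:n] + rng.standard_normal(n)

    X = lag_matrix(feature, lags)
    lam = 1.0                                  # ridge regularization
    trf = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ neural)

    print("Estimated TRF peak latency:", lags[np.argmax(trf)] / fs * 1000, "ms")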


Assuntos
Percepção Auditiva/fisiologia , Música , Lobo Temporal/fisiologia , Estimulação Acústica , Eletroencefalografia , Potenciais Evocados Auditivos/fisiologia , Humanos , Tempo de Reação
6.
Sci Rep ; 9(1): 11538, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31395905

ABSTRACT

Auditory attention decoding (AAD) through a brain-computer interface has had a flowering of developments since it was first introduced by Mesgarani and Chang (2012) using electrocorticography recordings. AAD has been pursued for its potential application to hearing-aid design, in which an attention-guided algorithm selects, from multiple competing acoustic sources, which should be enhanced for the listener and which should be suppressed. Traditionally, researchers have separated the AAD problem into two stages: reconstruction of a representation of the attended audio from neural signals, followed by determining the similarity between the candidate audio streams and the reconstruction. Here, we compare the traditional two-stage approach with a novel neural-network architecture that subsumes the explicit similarity step. We compare this new architecture against linear and non-linear (neural-network) baselines using both wet and dry electroencephalogram (EEG) systems. Our results indicate that the new architecture outperforms the baseline linear stimulus-reconstruction method, improving decoding accuracy from 66% to 81% using wet EEG and from 59% to 87% for dry EEG. Also of note was the finding that the dry EEG system can deliver comparable or even better results than the wet system, despite having only one third as many EEG channels. The 11-subject, wet-electrode AAD dataset for two competing, co-located talkers, the 11-subject, dry-electrode AAD dataset, and our software are available for further validation, experimentation, and modification.
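The traditional two-stage approach mentioned above can be sketched as follows: a linear backward model reconstructs the attended envelope from EEG, and attention is assigned to the candidate stream that correlates best with the reconstruction. The data below are synthetic, and in practice the decoder is trained and evaluated on separate trials; nothing here is the paper's actual pipeline.

    import numpy as np

    rng = np.random.default_rng(2)
    n_ch, n_t = 20, 2000                          # EEG channels, time samples

    # Toy data: two competing speech envelopes; EEG is driven mostly by the attended one (A).
    env_a = np.abs(rng.standard_normal(n_t))
    env_b = np.abs(rng.standard_normal(n_t))
    mixing = rng.standard_normal(n_ch)
    eeg = np.outer(env_a, mixing).T + 0.5 * rng.standard_normal((n_ch, n_t))

    # Stage 1: backward model (EEG -> attended envelope) via ridge regression.
    X = eeg.T                                     # (time, channels)
    lam = 1e2
    decoder = np.linalg.solve(X.T @ X + lam * np.eye(n_ch), X.T @ env_a)
    reconstruction = X @ decoder

    # Stage 2: decide attention by correlating the reconstruction with each candidate stream.
    r_a = np.corrcoef(reconstruction, env_a)[0, 1]
    r_b = np.corrcoef(reconstruction, env_b)[0, 1]
    print(f"r(A)={r_a:.2f}  r(B)={r_b:.2f}  ->  attended:", "A" if r_a > r_b else "B")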


Assuntos
Atenção/fisiologia , Córtex Auditivo/fisiologia , Interfaces Cérebro-Computador , Eletroencefalografia , Estimulação Acústica , Algoritmos , Córtex Auditivo/diagnóstico por imagem , Eletrocorticografia , Auxiliares de Audição/tendências , Humanos , Modelos Lineares , Redes Neurais de Computação , Ruído , Dinâmica não Linear , Percepção da Fala/fisiologia
7.
Sci Rep ; 9(1): 874, 2019 01 29.
Article in English | MEDLINE | ID: mdl-30696881

ABSTRACT

Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthetic to establish direct communication with the brain, and has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state of the art in speech neuroprosthesis, we combined recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated the dependence of reconstruction accuracy on linear and nonlinear (deep neural network) regression methods and on the acoustic representation used as the target of reconstruction, including the auditory spectrogram and speech synthesis parameters. In addition, we compared reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving intelligibility by 65% over the baseline method, which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which not only can restore communication for paralyzed patients but also have the potential to transform human-computer interaction technologies.
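A hedged sketch of the kind of comparison described above, contrasting a linear baseline with a nonlinear regressor for reconstructing an acoustic target from neural features. It uses synthetic data and generic scikit-learn models as stand-ins; the real study regresses speech-synthesis parameters and auditory spectrograms from intracranial recordings.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(3)
    n_t, n_elec, n_freq = 2000, 60, 32            # time frames, electrodes, spectrogram bins

    # Toy data: neural features nonlinearly related to an auditory spectrogram.
    neural = rng.standard_normal((n_t, n_elec))
    W = rng.standard_normal((n_elec, n_freq))
    spectrogram = np.tanh(neural @ W) + 0.1 * rng.standard_normal((n_t, n_freq))

    Xtr, Xte, ytr, yte = train_test_split(neural, spectrogram, test_size=0.2, random_state=0)

    linear = Ridge(alpha=1.0).fit(Xtr, ytr)
    mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0).fit(Xtr, ytr)

    def corr_score(pred, target):
        """Mean correlation across frequency bins, a common reconstruction metric."""
        return np.mean([np.corrcoef(pred[:, k], target[:, k])[0, 1] for k in range(target.shape[1])])

    print("linear baseline r =", round(corr_score(linear.predict(Xte), yte), 2))
    print("MLP regressor   r =", round(corr_score(mlp.predict(Xte), yte), 2))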


Assuntos
Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Fala/fisiologia , Estimulação Acústica/métodos , Algoritmos , Córtex Auditivo/fisiologia , Mapeamento Encefálico , Aprendizado Profundo , Potenciais Evocados Auditivos/fisiologia , Humanos , Redes Neurais de Computação , Próteses Neurais
8.
J Neural Eng ; 14(5): 056001, 2017 10.
Article in English | MEDLINE | ID: mdl-28776506

ABSTRACT

OBJECTIVE: People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment with which to compare with the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. APPROACH: We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener's neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker's voice to assist the listener. MAIN RESULTS: Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. SIGNIFICANCE: Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.
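The last two stages of such a system (attention decoding over already-separated sources, then remixing with the attended speaker amplified) might look roughly like the sketch below. The separation front-end, the pretrained decoder, and all interfaces are assumptions for illustration, not the paper's actual implementation.

    import numpy as np

    def decode_and_remix(neural, decoder, separated_sources, mixture, boost_db=12.0):
        """Pick the separated source best matching the brain-decoded envelope and amplify it.

        neural: (time, channels) recording; decoder: (channels,) pretrained backward model;
        separated_sources: list of (samples,) waveforms from a speech-separation front-end.
        This interface is an illustrative assumption.
        """
        reconstruction = neural @ decoder                      # decoded attended envelope

        def envelope(x, n=len(reconstruction)):
            env = np.abs(x)                                    # crude envelope: rectify, then resample
            return np.interp(np.linspace(0, len(env) - 1, n), np.arange(len(env)), env)

        scores = [np.corrcoef(reconstruction, envelope(s))[0, 1] for s in separated_sources]
        attended = int(np.argmax(scores))
        gain = 10 ** (boost_db / 20.0)
        remix = mixture + (gain - 1.0) * separated_sources[attended]
        return attended, remix / np.max(np.abs(remix))         # normalized output audio

    # Toy usage with random placeholders for audio and neural data.
    rng = np.random.default_rng(4)
    sources = [rng.standard_normal(16000), rng.standard_normal(16000)]
    mix = sources[0] + sources[1]
    neural = rng.standard_normal((500, 20))
    decoder = rng.standard_normal(20)
    idx, enhanced = decode_and_remix(neural, decoder, sources, mix)
    print("attended source index:", idx, "| output samples:", enhanced.shape[0])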


Assuntos
Estimulação Acústica/métodos , Córtex Auditivo/fisiologia , Eletrodos Implantados/tendências , Auxiliares de Audição/tendências , Rede Nervosa/fisiologia , Percepção da Fala/fisiologia , Percepção Auditiva/fisiologia , Eletroencefalografia/métodos , Feminino , Humanos , Masculino
9.
J Neurosci ; 37(8): 2176-2185, 2017 02 22.
Article in English | MEDLINE | ID: mdl-28119400

ABSTRACT

Humans are unique in their ability to communicate using spoken language. However, it remains unclear how the speech signal is transformed and represented in the brain at different stages of the auditory pathway. In this study, we characterized electroencephalography responses to continuous speech by obtaining the time-locked responses to phoneme instances (phoneme-related potential). We showed that responses to different phoneme categories are organized by phonetic features. We found that each instance of a phoneme in continuous speech produces multiple distinguishable neural responses occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Comparing the patterns of phoneme similarity in the neural responses and the acoustic signals confirms a repetitive appearance of acoustic distinctions of phonemes in the neural data. Analysis of the phonetic and speaker information in neural activations revealed that different time intervals jointly encode the acoustic similarity of both phonetic and speaker categories. These findings provide evidence for a dynamic neural transformation of low-level speech features as they propagate along the auditory pathway, and form an empirical framework to study the representational changes in learning, attention, and speech disorders.

SIGNIFICANCE STATEMENT: We characterized the properties of evoked neural responses to phoneme instances in continuous speech. We show that each instance of a phoneme in continuous speech produces several observable neural responses at different times occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Each temporal event explicitly encodes the acoustic similarity of phonemes, and linguistic and nonlinguistic information are best represented at different time intervals. Finally, we show a joint encoding of phonetic and speaker information, where the neural representation of speakers is dependent on phoneme category. These findings provide compelling new evidence for dynamic processing of speech sounds in the auditory pathway.
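Computing a phoneme-related potential amounts to epoching the recording around each phoneme onset and averaging within a phonetic category. The sketch below is a generic illustration on synthetic data; the annotation format, epoch window, and parameters are assumptions rather than the study's settings.

    import numpy as np

    rng = np.random.default_rng(5)
    fs = 128                                         # EEG sampling rate (Hz)
    eeg = rng.standard_normal((60 * fs, 32))         # 60 s of 32-channel EEG (toy data)

    # Assumed annotation format: (onset_seconds, phonetic_category) for each phoneme instance.
    phonemes = [(t, rng.choice(["plosive", "fricative", "vowel", "nasal"]))
                for t in np.arange(0.5, 59.0, 0.25)]

    pre, post = int(0.05 * fs), int(0.4 * fs)        # -50 ms to +400 ms around onset

    def phoneme_related_potential(eeg, events, category):
        """Average EEG epochs time-locked to onsets of one phonetic category."""
        epochs = []
        for onset, cat in events:
            if cat != category:
                continue
            i = int(onset * fs)
            if i - pre >= 0 and i + post <= len(eeg):
                epochs.append(eeg[i - pre:i + post])
        return np.mean(epochs, axis=0)               # (time, channels)

    prp_vowel = phoneme_related_potential(eeg, phonemes, "vowel")
    print("vowel PRP shape:", prp_vowel.shape)       # (epoch samples, 32 channels)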


Assuntos
Mapeamento Encefálico , Potenciais Evocados Auditivos/fisiologia , Fonética , Percepção da Fala/fisiologia , Fala/fisiologia , Estimulação Acústica , Acústica , Eletroencefalografia , Feminino , Humanos , Idioma , Masculino , Tempo de Reação , Estatística como Assunto , Fatores de Tempo
10.
J Neural Eng ; 13(5): 056004, 2016 10.
Article in English | MEDLINE | ID: mdl-27484713

ABSTRACT

OBJECTIVE: The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. APPROACH: The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. MAIN RESULTS: The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. SIGNIFICANCE: These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.
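The decoding step described above combines per-frame phoneme likelihoods with language-model transition probabilities in a Viterbi search. A generic sketch follows, with random stand-ins for the LDA likelihoods and the n-gram model; the phoneme inventory and all numbers are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(6)
    phonemes = ["sil", "k", "ae", "t"]                 # toy phoneme inventory
    n_states, n_frames = len(phonemes), 50

    # Stand-ins for the real models: per-frame log-likelihoods (e.g., from LDA on
    # high-gamma features) and bigram log transition probabilities from a language model.
    loglik = np.log(rng.dirichlet(np.ones(n_states), size=n_frames))     # (frames, states)
    trans = np.log(rng.dirichlet(np.ones(n_states), size=n_states))      # (from, to)

    def viterbi(loglik, trans):
        """Most likely phoneme-state path given frame likelihoods and transitions."""
        n_frames, n_states = loglik.shape
        delta = np.full((n_frames, n_states), -np.inf)
        back = np.zeros((n_frames, n_states), dtype=int)
        delta[0] = loglik[0]
        for t in range(1, n_frames):
            scores = delta[t - 1][:, None] + trans                        # (from, to)
            back[t] = np.argmax(scores, axis=0)
            delta[t] = scores[back[t], np.arange(n_states)] + loglik[t]
        path = [int(np.argmax(delta[-1]))]
        for t in range(n_frames - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return path[::-1]

    path = viterbi(loglik, trans)
    print("decoded frame labels:", [phonemes[s] for s in path[:10]], "...")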


Assuntos
Córtex Cerebral/fisiologia , Interface para o Reconhecimento da Fala , Estimulação Acústica , Algoritmos , Córtex Auditivo/fisiologia , Simulação por Computador , Análise Discriminante , Eletrocorticografia , Eletrodos Implantados , Feminino , Ritmo Gama , Humanos , Funções Verossimilhança , Masculino , Cadeias de Markov , Reprodutibilidade dos Testes , Caracteres Sexuais , Lobo Temporal/fisiologia