Results 1 - 11 of 11
1.
J Neurosci ; 43(44): 7429-7440, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37793908

ABSTRACT

Selective attention to one of several competing speakers is required for comprehending a target speaker among other voices and for communicating successfully with them. It has, moreover, been found to involve the neural tracking of low-frequency speech rhythms in the auditory cortex. Effects of selective attention have also been found in subcortical neural activity, in particular in the frequency-following response related to the fundamental frequency of speech (speech-FFR). Recent investigations have, however, shown that the speech-FFR contains cortical contributions as well. It remains unclear whether these are also modulated by selective attention. Here we used magnetoencephalography to assess the attentional modulation of the cortical contributions to the speech-FFR. We presented both male and female participants with two competing speech signals and analyzed the cortical responses during attentional switching between the two speakers. Our findings revealed robust attentional modulation of the cortical contribution to the speech-FFR: the neural responses were higher when the speaker was attended than when they were ignored. We also found that, regardless of attention, a voice with a lower fundamental frequency elicited a larger cortical contribution to the speech-FFR than a voice with a higher fundamental frequency. Our results show that the attentional modulation of the speech-FFR does not only occur subcortically but extends to the auditory cortex as well.

SIGNIFICANCE STATEMENT Understanding speech in noise requires attention to a target speaker. One of the speech features that a listener can use to identify a target voice among others and attend to it is the fundamental frequency, together with its higher harmonics. The fundamental frequency arises from the opening and closing of the vocal folds and is tracked by high-frequency neural activity in the auditory brainstem and in the cortex. Previous investigations showed that the subcortical neural tracking is modulated by selective attention. Here we show that attention affects the cortical tracking of the fundamental frequency as well: it is stronger when a particular voice is attended than when it is ignored.
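As a rough illustration of how a speech-FFR measure of this kind can be quantified, the following Python sketch cross-correlates a band-passed MEG channel with the fundamental waveform of the speech and compares attended and ignored conditions; the band edges, lag window, and all variable names are illustrative assumptions, not the study's actual pipeline.

```python
# Hypothetical sketch: quantify a speech-FFR by cross-correlating a MEG
# channel with the speech fundamental waveform in the f0 band.
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def ffr_response(meg, fundamental, fs, f0_lo=70.0, f0_hi=200.0, max_lag_s=0.05):
    """Correlation between MEG and the fundamental waveform at each lag."""
    m = bandpass(meg, f0_lo, f0_hi, fs)
    s = bandpass(fundamental, f0_lo, f0_hi, fs)
    lags = np.arange(int(max_lag_s * fs))
    r = np.array([np.corrcoef(m[lag:], s[:len(s) - lag])[0, 1] for lag in lags])
    return lags / fs, r

# Attentional modulation as the ratio of peak response magnitudes
# (meg_att, meg_ign: recordings while attending / ignoring one speaker):
# _, r_att = ffr_response(meg_att, fundamental, fs=1000)
# _, r_ign = ffr_response(meg_ign, fundamental, fs=1000)
# modulation = np.max(np.abs(r_att)) / np.max(np.abs(r_ign))
```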


Subject(s)
Auditory Cortex , Speech Perception , Humans , Male , Female , Speech , Speech Perception/physiology , Auditory Cortex/physiology , Magnetoencephalography , Evoked Potentials, Auditory, Brain Stem/physiology , Acoustic Stimulation , Electroencephalography/methods
2.
J Cogn Neurosci ; 35(8): 1301-1311, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37379482

ABSTRACT

The envelope of a speech signal is tracked by neural activity in the cerebral cortex. This cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has mostly been associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information of words and word sequences. However, much about the specific association between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists at different signal-to-noise ratios (SNRs), yielding different levels of speech comprehension as well as listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for the random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend for the PLV in the delta band to reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.
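A minimal sketch of the PLV computation between an EEG channel and the speech envelope in the delta band, assuming Hilbert-derived instantaneous phases; the filter settings and variable names are illustrative, not the paper's exact parameters.

```python
# PLV = |mean over time of exp(i * phase difference)|, here between
# delta-band EEG and the delta-band speech envelope.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=3):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def delta_plv(eeg, envelope, fs, band=(1.0, 4.0)):
    """Phase-locking value between one EEG channel and the envelope."""
    phi_eeg = np.angle(hilbert(bandpass(eeg, *band, fs)))
    phi_env = np.angle(hilbert(bandpass(envelope, *band, fs)))
    return np.abs(np.mean(np.exp(1j * (phi_eeg - phi_env))))
```

A PLV of 1 would indicate a perfectly constant phase lag between EEG and envelope; values near 0 indicate no consistent phase relation.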


Subject(s)
Auditory Cortex , Speech Perception , Humans , Speech/physiology , Electroencephalography , Speech Perception/physiology , Auditory Cortex/physiology , Linguistics , Acoustic Stimulation
3.
J Neural Eng ; 19(4)2022 07 06.
Article in English | MEDLINE | ID: mdl-35709698

ABSTRACT

Objective. Smart hearing aids which can decode the focus of a user's attention could considerably improve comprehension levels in noisy environments. Methods for decoding auditory attention from electroencephalography (EEG) have attracted considerable interest for this reason. Recent studies suggest that the integration of deep neural networks (DNNs) into existing auditory attention decoding (AAD) algorithms is highly beneficial, although it remains unclear whether these enhanced algorithms can perform robustly in different real-world scenarios. We therefore sought to characterise the performance of DNNs at reconstructing the envelope of an attended speech stream from EEG recordings in different listening conditions. In addition, given the relatively sparse availability of EEG data, we investigated the possibility of applying subject-independent algorithms to EEG recorded from unseen individuals. Approach. Both linear models and nonlinear DNNs were employed to decode the envelope of clean speech from EEG recordings, with and without subject-specific information. The mean behaviour, as well as the variability of the reconstruction, was characterised for each model. We then trained subject-specific linear models and DNNs to reconstruct the envelope of speech in clean and noisy conditions, and investigated how well they performed in different listening scenarios. We also established that these models can be used to decode auditory attention in competing-speaker scenarios. Main results. The DNNs offered a considerable advantage over their linear analogues at reconstructing the envelope of clean speech. This advantage persisted even when subject-specific information was unavailable at the time of training. The same DNN architectures generalised to a distinct dataset, which contained EEG recorded under a variety of listening conditions. In competing-speaker and speech-in-noise conditions, the DNNs significantly outperformed the linear models. Finally, the DNNs offered a considerable improvement over the linear approach at decoding auditory attention in competing-speaker scenarios. Significance. We present the first detailed study into the extent to which DNNs can be employed for reconstructing the envelope of an attended speech stream. We conclusively demonstrate that DNNs improve the reconstruction of the attended speech envelope. The variance of the reconstruction error is shown to be similar for both the DNNs and the linear model. DNNs therefore show promise for real-world AAD, since they perform well in multiple listening conditions and generalise to data recorded from unseen participants.
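For orientation, a sketch of the linear baseline such DNNs are compared against: a ridge-regression backward model that reconstructs the envelope from time-lagged EEG, with attention attributed to whichever speaker's envelope correlates best with the reconstruction. The lag range and regularisation constant are illustrative assumptions.

```python
# Linear backward model (ridge regression on time-lagged EEG) for
# envelope reconstruction and a simple correlation-based attention rule.
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of each channel: (time, channels * n_lags)."""
    t, c = eeg.shape
    X = np.zeros((t, c * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * c:(lag + 1) * c] = eeg[:t - lag]
    return X

def train_decoder(eeg, envelope, n_lags=26, lam=1e3):
    """Ridge solution w = (X'X + lam*I)^-1 X'y."""
    X = lag_matrix(eeg, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def decode_attention(eeg, env_a, env_b, w, n_lags=26):
    """Attribute attention to the speaker whose envelope correlates best."""
    rec = lag_matrix(eeg, n_lags) @ w
    r_a = np.corrcoef(rec, env_a)[0, 1]
    r_b = np.corrcoef(rec, env_b)[0, 1]
    return "A" if r_a > r_b else "B"
```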


Subject(s)
Speech Perception , Speech , Acoustic Stimulation/methods , Electroencephalography/methods , Humans , Neural Networks, Computer
4.
J Cogn Neurosci ; 34(3): 411-424, 2022 02 01.
Article in English | MEDLINE | ID: mdl-35015867

ABSTRACT

Speech and music are spectrotemporally complex acoustic signals that are highly relevant for humans. Both contain a temporal fine structure that is encoded in the neural responses of subcortical and cortical processing centers. The subcortical response to the temporal fine structure of speech has recently been shown to be modulated by selective attention to one of two competing voices. Similarly, music often consists of several simultaneous melodic lines, and a listener can selectively attend to a particular one at a time. However, the neural mechanisms that enable such selective attention remain largely enigmatic, not least because most investigations to date have focused on short and simplified musical stimuli. Here, we studied the neural encoding of classical musical pieces in human volunteers, using scalp EEG recordings. We presented volunteers with continuous musical pieces composed for one or two instruments. In the latter case, the participants were asked to selectively attend to one of the two competing instruments and to perform a vibrato identification task. We used linear encoding and decoding models to relate the recorded EEG activity to the stimulus waveform. We show that we can measure neural responses to the temporal fine structure of melodic lines played by a single instrument, at the population level as well as for most individual participants. The neural response peaks at a latency of 7.6 msec and is not measurable past 15 msec. When analyzing the neural responses to the temporal fine structure elicited by competing instruments, we found no evidence of attentional modulation. We observed, however, that low-frequency neural activity exhibited a modulation consistent with the behavioral task at latencies from 100 to 160 msec, similar to the attentional modulation observed for continuous speech (N100). Our results show that, much like speech, the temporal fine structure of music is tracked by neural activity. In contrast to speech, however, this response appears unaffected by selective attention in the context of our experiment.
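A sketch of the forward (encoding) direction of such linear models: a temporal response function (TRF) mapping the stimulus waveform to one EEG channel, from which the response latency can be read off. The lag window, regularisation, and edge handling (wraparound ignored) are illustrative assumptions.

```python
# Linear forward model: regularised regression of one EEG channel on
# time-lagged copies of the stimulus waveform yields a TRF over latency.
import numpy as np

def estimate_trf(stimulus, eeg_channel, fs, t_min=-0.005, t_max=0.03, lam=1e2):
    lags = np.arange(int(t_min * fs), int(t_max * fs))
    # np.roll wraps around at the edges; acceptable for a sketch on long data
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg_channel)
    return lags / fs, w   # TRF amplitude at each latency; its peak gives the
                          # response latency (here reported at 7.6 msec)
```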


Subject(s)
Music , Speech Perception , Acoustic Stimulation/methods , Auditory Perception/physiology , Electroencephalography/methods , Humans , Speech , Speech Perception/physiology
5.
J Neural Eng ; 18(5)2021 10 12.
Article in English | MEDLINE | ID: mdl-34547737

ABSTRACT

Objective. Seeing a person talking can help us understand them, particularly in a noisy environment. However, how the brain integrates visual information with the auditory signal to enhance speech comprehension remains poorly understood. Approach. Here we address this question in a computational model of a cortical microcircuit for speech processing. The model consists of an excitatory and an inhibitory neural population that together create oscillations in the theta frequency range. When stimulated with speech, the theta rhythm becomes entrained to the onsets of syllables, such that the onsets can be inferred from the network activity. We investigate how well the resulting syllable parsing performs when different types of visual stimuli are added. In particular, we consider currents related to the rate of syllables as well as currents related to the mouth-opening area of the talking faces. Main results. We find that currents targeting the excitatory neuronal population can influence speech comprehension, either boosting or impeding it, depending on the temporal delay and on whether the currents are excitatory or inhibitory. In contrast, currents that act on the inhibitory neurons do not impact speech comprehension significantly. Significance. Our results suggest neural mechanisms for the integration of visual information with acoustic information in speech and make experimentally testable predictions.
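A minimal sketch of an excitatory-inhibitory pair of this general kind, in the style of a Wilson-Cowan model, with a speech-derived current driving the excitatory population; all parameter values and names are illustrative assumptions, not the paper's model.

```python
# Wilson-Cowan-style E-I pair: the coupled populations oscillate in the
# theta range and the oscillation entrains to an external drive.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate(i_speech, dt=1e-3, tau_e=0.02, tau_i=0.04,
             w_ee=12.0, w_ei=10.0, w_ie=10.0, w_ii=2.0,
             bias_e=-1.0, bias_i=-3.0):
    """Euler integration; i_speech drives the excitatory population."""
    e, i = 0.1, 0.1
    trace = np.zeros(len(i_speech))
    for n, drive in enumerate(i_speech):
        de = (-e + sigmoid(w_ee * e - w_ei * i + bias_e + drive)) / tau_e
        di = (-i + sigmoid(w_ie * e - w_ii * i + bias_i)) / tau_i
        e, i = e + dt * de, i + dt * di
        trace[n] = e
    return trace  # peaks of the E-population activity mark inferred onsets
```

In this framing, the visual input is simply an additional current summed into the drive, with its sign and delay relative to the syllable onsets determining whether parsing improves or degrades.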


Subject(s)
Speech Perception , Speech , Acoustic Stimulation , Brain , Humans , Theta Rhythm
6.
IEEE Trans Neural Syst Rehabil Eng ; 28(1): 23-31, 2020 01.
Article in English | MEDLINE | ID: mdl-31751277

ABSTRACT

Neural activity tracks the envelope of a speech signal at latencies from 50 ms to 300 ms. Modulating this neural tracking through transcranial alternating current stimulation influences speech comprehension. Two important variables that can affect this modulation are the latency and the phase of the stimulation with respect to the sound. While previous studies have found an influence of both variables on speech comprehension, the interaction between the two had not yet been measured. We presented 17 subjects with speech in noise coupled with simultaneous transcranial alternating current stimulation. The currents were based on the envelope of the target speech but shifted by different phases, as well as by temporal delays of 100 ms and 250 ms. We also employed various control stimulations and assessed the signal-to-noise ratio at which each subject understood half of the speech. We found that, at both latencies, speech comprehension is modulated by the phase of the current stimulation. However, the form of the modulation differed between the two latencies. Phase and latency of the neurostimulation accordingly have distinct influences on speech comprehension. The different effects at latencies of 100 ms and 250 ms hint at distinct neural processes for speech processing.
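A sketch of how the "SNR at which the subject understood half of the speech" (the speech reception threshold) is commonly estimated, by fitting a logistic psychometric function; the data points and starting values below are purely hypothetical.

```python
# Fit a logistic psychometric function to proportion-correct data and
# read off the SNR at the 50% point (the speech reception threshold).
import numpy as np
from scipy.optimize import curve_fit

def psychometric(snr, srt, slope):
    return 1.0 / (1.0 + np.exp(-slope * (snr - srt)))

snr_db = np.array([-12.0, -9.0, -6.0, -3.0, 0.0])        # hypothetical
prop_correct = np.array([0.05, 0.20, 0.55, 0.85, 0.97])  # hypothetical
(srt, slope), _ = curve_fit(psychometric, snr_db, prop_correct, p0=(-6.0, 1.0))
print(f"SRT ~ {srt:.1f} dB SNR")  # SNR for 50% comprehension
```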


Subject(s)
Comprehension/physiology , Noise , Speech Perception/physiology , Transcranial Direct Current Stimulation/methods , Acoustic Stimulation , Adult , Algorithms , Female , Healthy Volunteers , Humans , Male , Nonlinear Dynamics , Signal-To-Noise Ratio , Young Adult
7.
Sci Rep ; 9(1): 14131, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31575950

ABSTRACT

People with normal hearing thresholds can nonetheless have difficulty understanding speech in noisy backgrounds. The origins of such supra-threshold hearing deficits remain largely unclear. We previously showed that the auditory brainstem response to running speech is modulated by selective attention, evidencing a subcortical mechanism that contributes to speech-in-noise comprehension. We observed, however, significant variation in the magnitude of the brainstem's attentional modulation between the different volunteers. Here we show that this variability relates to the ability of the subjects to understand speech in background noise. In particular, we assessed 43 young human volunteers with normal hearing thresholds for their speech-in-noise comprehension. We also recorded their auditory brainstem responses to running speech while they selectively attended to one of two competing voices. To control for potential peripheral hearing deficits, and in particular for cochlear synaptopathy, we further assessed noise exposure, the temporal sensitivity threshold, the middle-ear muscle reflex, and the auditory brainstem response to clicks in various levels of background noise. These tests did not show evidence of cochlear synaptopathy amongst the volunteers. Furthermore, we found that only the attentional modulation of the brainstem response to speech was significantly related to speech-in-noise comprehension. Our results therefore provide evidence that top-down modulation of brainstem activity contributes to the variability in speech-in-noise comprehension amongst the subjects.
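A sketch of the across-subject analysis this implies: summarise each subject's attentional effect as a normalised modulation index and correlate it with their speech-in-noise score. The index definition and variable names are illustrative assumptions.

```python
# Relate per-subject attentional modulation of the brainstem response
# to speech-in-noise comprehension across subjects.
import numpy as np
from scipy.stats import pearsonr

def attention_effect(r_att, r_ign, sin_score):
    """r_att, r_ign: per-subject response amplitudes (attended / ignored);
    sin_score: per-subject speech-in-noise comprehension measure."""
    modulation = (r_att - r_ign) / (r_att + r_ign)  # normalised index
    r, p = pearsonr(modulation, sin_score)
    return r, p
```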


Subject(s)
Attention/physiology , Auditory Threshold/physiology , Brain Stem/physiology , Evoked Potentials, Auditory, Brain Stem/physiology , Hearing/physiology , Speech/physiology , Acoustic Stimulation/methods , Adult , Audiometry, Speech/methods , Cochlea/physiology , Female , Hearing Loss, Noise-Induced/physiopathology , Hearing Tests/methods , Humans , Individuality , Male , Noise , Otoacoustic Emissions, Spontaneous/physiology , Speech Perception/physiology , Young Adult
8.
J Neurosci ; 39(29): 5750-5759, 2019 07 17.
Article in English | MEDLINE | ID: mdl-31109963

ABSTRACT

Humans excel at understanding speech even in adverse conditions such as background noise. Speech processing may be aided by cortical activity in the delta and theta frequency bands, which has been found to track the speech envelope. However, the rhythm of non-speech sounds is tracked by cortical activity as well. It therefore remains unclear which aspects of neural speech tracking represent the processing of acoustic features, related to the clarity of speech, and which aspects reflect higher-level linguistic processing related to speech comprehension. Here we disambiguate the roles of cortical tracking for speech clarity and comprehension by recording EEG responses to native and foreign language at different levels of background noise, for which clarity and comprehension vary independently. We then use both a decoding and an encoding approach to relate clarity and comprehension to the neural responses. We find that cortical tracking in the theta frequency band correlates mainly with clarity, whereas the delta band contributes most to speech comprehension. Moreover, we uncover an early neural component in the delta band that informs on comprehension and that may reflect a predictive mechanism for language processing. Our results disentangle the functional contributions of cortical speech tracking in the delta and theta bands to speech processing. They also show that both speech clarity and comprehension can be accurately decoded from relatively short segments of EEG recordings, which may have applications in future mind-controlled auditory prostheses.

SIGNIFICANCE STATEMENT Speech is a highly complex signal whose processing requires analysis from lower-level acoustic features to higher-level linguistic information. Recent work has shown that neural activity in the delta and theta frequency bands tracks the rhythm of speech, but the role of this tracking for speech processing remains unclear. Here we disentangle the roles of cortical entrainment in different frequency bands and at different temporal lags for speech clarity, reflecting the acoustics of the signal, and speech comprehension, related to linguistic processing. We show that cortical speech tracking in the theta frequency band encodes mostly speech clarity, and thus acoustic aspects of the signal, whereas speech tracking in the delta band encodes higher-level speech comprehension.


Subject(s)
Acoustic Stimulation/methods , Auditory Cortex/physiology , Delta Rhythm/physiology , Noise , Speech Perception/physiology , Theta Rhythm/physiology , Adult , Electroencephalography/methods , Female , Humans , Male , Speech/physiology , Young Adult
9.
Curr Biol ; 28(23): 3833-3839.e3, 2018 12 03.
Article in English | MEDLINE | ID: mdl-30471997

ABSTRACT

Recent studies identify severely brain-injured patients with limited or no behavioral responses who successfully perform functional magnetic resonance imaging (fMRI) or electroencephalogram (EEG) mental imagery tasks [1-5]. Such tasks are cognitively demanding [1]; accordingly, recent studies indicate that fMRI command following in brain-injured patients is associated with preserved cerebral metabolism and preserved sleep-wake EEG [5, 6]. We investigated the use of an EEG response that tracks the natural speech envelope (NSE) of spoken language [7-22] in healthy controls and brain-injured patients (vegetative state to emergence from minimally conscious state). As audition is typically preserved after brain injury, auditory paradigms may be preferred in searching for covert cognitive function [23-25]. NSE measures are obtained by cross-correlating EEG with the NSE. We compared NSE latencies and amplitudes with and without consideration of fMRI assessments. NSE latencies showed significant and progressive delay across diagnostic categories. Patients who could carry out fMRI-based mental imagery tasks showed no statistically significant difference in NSE latencies relative to healthy controls; this subgroup included patients without behavioral command following. The NSE may stratify patients with severe brain injuries and identify those patients demonstrating "cognitive motor dissociation" (CMD) [26], who show only covert evidence of command following on neuroimaging or electrophysiological methods that demand high levels of cognitive function. The NSE is thus a passive measure that may provide a useful screening tool to improve the detection of covert cognition with fMRI or other methods and improve the stratification of patients with disorders of consciousness in research studies.
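A sketch of the cross-correlation measure described above: correlate EEG with the natural speech envelope over a range of lags and read off the latency and amplitude of the peak, the two quantities compared across diagnostic categories. The lag window is an illustrative assumption.

```python
# Cross-correlate one EEG channel with the natural speech envelope and
# extract the latency and signed amplitude of the peak.
import numpy as np

def nse_peak(eeg, envelope, fs, max_lag_s=0.4):
    lags = np.arange(int(max_lag_s * fs))
    xc = np.array([np.corrcoef(eeg[lag:], envelope[:len(envelope) - lag])[0, 1]
                   for lag in lags])
    k = np.argmax(np.abs(xc))
    return lags[k] / fs, xc[k]   # (peak latency in s, peak amplitude)
```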


Subject(s)
Brain Injuries/physiopathology , Cognition/physiology , Speech/physiology , Adolescent , Adult , Brain Injuries/classification , Brain Injuries/diagnosis , Electroencephalography , Female , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Neuroimaging , Young Adult
10.
Proc Natl Acad Sci U S A ; 113(30): E4304-10, 2016 07 26.
Article in English | MEDLINE | ID: mdl-27407145

ABSTRACT

Low-frequency hearing is critically important for speech and music perception, but no mechanical measurements have previously been available from inner ears with intact low-frequency parts. These regions of the cochlea may function in ways different from the extensively studied high-frequency regions, where the sensory outer hair cells produce force that greatly increases the sound-evoked vibrations of the basilar membrane. We used laser interferometry in vitro and optical coherence tomography in vivo to study the low-frequency part of the guinea pig cochlea, and found that sound stimulation caused motion of a minimal portion of the basilar membrane. Outside the region of peak movement, an exponential decline in motion amplitude occurred across the basilar membrane. The moving region had different dependence on stimulus frequency than the vibrations measured near the mechanosensitive stereocilia. This behavior differs substantially from the behavior found in the extensively studied high-frequency regions of the cochlea.
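As a worked illustration of the "exponential decline in motion amplitude" reported above, one can fit an exponential to amplitude as a function of distance from the peak to obtain a space constant; the data points below are purely hypothetical.

```python
# Fit an exponential decline of vibration amplitude with distance from
# the region of peak basilar-membrane motion.
import numpy as np
from scipy.optimize import curve_fit

def exp_decline(x_um, a0, space_const_um):
    return a0 * np.exp(-x_um / space_const_um)

x_um = np.array([0.0, 50.0, 100.0, 150.0, 200.0])  # hypothetical positions
amp_nm = np.array([10.0, 5.2, 2.6, 1.4, 0.7])      # hypothetical amplitudes
(a0, lam), _ = curve_fit(exp_decline, x_um, amp_nm, p0=(10.0, 70.0))
print(f"space constant ~ {lam:.0f} um")
```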


Subject(s)
Basilar Membrane/physiology , Hair Cells, Auditory, Outer/physiology , Hearing/physiology , Organ of Corti/physiology , Acoustic Stimulation , Animals , Guinea Pigs , Interferometry , Motion , Organ of Corti/cytology , Sound , Tomography, Optical Coherence
11.
PLoS One ; 7(9): e45579, 2012.
Article in English | MEDLINE | ID: mdl-23029113

ABSTRACT

An auditory neuron can preserve the temporal fine structure of a low-frequency tone by phase-locking its response to the stimulus. Apart from sound localization, however, much about the role of this temporal information for signal processing in the brain remains unknown. Through psychoacoustic studies we provide direct evidence that humans employ temporal fine structure to discriminate between frequencies. To this end we construct tones that are based on a single frequency but in which, through the concatenation of wavelets, the phase changes randomly every few cycles. We then test the frequency discrimination of these phase-changing tones, of control tones without phase changes, and of short tones that consist of a single wavelet. For carrier frequencies below a few kilohertz we find that phase changes systematically worsen frequency discrimination. No such effect appears for higher carrier frequencies at which temporal information is not available in the central auditory system.
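A sketch of how such a phase-changing tone can be constructed: concatenate short wavelets of a single carrier frequency whose starting phase is drawn at random every few cycles. The cycle count per wavelet and the on/off ramps are illustrative assumptions.

```python
# Build a phase-changing tone from concatenated single-frequency wavelets
# with random starting phases and short amplitude ramps at each boundary.
import numpy as np

def phase_changing_tone(f0, n_wavelets, cycles_per_wavelet=4, fs=44100):
    n = int(round(cycles_per_wavelet * fs / f0))   # samples per wavelet
    t = np.arange(n) / fs
    m = n // 8                                     # ramp length in samples
    ramp = np.hanning(2 * m)
    win = np.ones(n)
    win[:m], win[-m:] = ramp[:m], ramp[-m:]        # rising / falling halves
    rng = np.random.default_rng(0)
    wavelets = [np.sin(2 * np.pi * f0 * t + rng.uniform(0, 2 * np.pi)) * win
                for _ in range(n_wavelets)]
    return np.concatenate(wavelets)
```

A control tone without phase changes is obtained by using the same starting phase for every wavelet, and a short tone by taking a single wavelet.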


Subject(s)
Auditory Perception/physiology , Sensory Receptor Cells/physiology , Sound , Acoustic Stimulation , Adult , Female , Humans , Male , Psychoacoustics