ABSTRACT
Communication in the real world is inherently multimodal. When having a conversation, sighted and hearing people typically use both auditory and visual cues to understand one another. For example, objects may make sounds as they move in space, or we may use the movement of a person's mouth to better understand what they are saying in a noisy environment. Still, many neuroscience experiments rely on unimodal stimuli to understand the encoding of sensory features in the brain. The extent to which visual information influences the encoding of auditory information, and vice versa, in natural environments is thus unclear. Here, we addressed this question by recording scalp electroencephalography (EEG) in 11 subjects as they listened to and watched movie trailers in audiovisual (AV), visual-only (V), and audio-only (A) conditions. We then fit linear encoding models that described the relationship between the brain responses and the acoustic, phonetic, and visual information in the stimuli. We also compared whether auditory and visual feature tuning was the same when stimuli were presented in the original AV format versus when the visual or auditory information was removed. In these stimuli, the visual and auditory information was relatively uncorrelated and included spoken narration over a scene as well as animated or live-action characters talking with and without their faces visible. We found that auditory feature tuning was similar in the AV and A-only conditions, and that visual feature tuning was similar in the AV and V-only conditions. In a cross-prediction analysis, we investigated whether models trained on AV data predicted responses to A-only or V-only test data as well as models trained on unimodal data. Overall, prediction performance using AV training and V-only test sets was similar to using V-only training and V-only test sets, suggesting that the auditory information had a comparatively small effect on the EEG. In contrast, prediction performance using AV training and A-only test sets was slightly worse than using matched A-only training and test sets, suggesting that the visual information has a stronger influence on the EEG, although this made no qualitative difference in the derived feature tuning. In effect, our results show that researchers may benefit from the richness of multimodal datasets, which can be used to answer more than one research question.
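The encoding and cross-prediction analyses described above follow a standard time-lagged regression recipe. Below is a minimal, hypothetical sketch of that recipe (not the authors' code), assuming made-up arrays for stimulus features and 64-channel EEG, with scikit-learn's Ridge standing in for whatever regularized estimator was actually used.

import numpy as np
from sklearn.linear_model import Ridge

def lag_features(X, n_lags):
    """Stack time-lagged copies of a (time x features) stimulus matrix."""
    T, F = X.shape
    lagged = np.zeros((T, F * n_lags))
    for lag in range(n_lags):
        lagged[lag:, lag * F:(lag + 1) * F] = X[:T - lag]
    return lagged

def fit_encoding_model(stim, eeg, n_lags=26, alpha=1e3):
    """Fit ridge weights mapping lagged stimulus features to every EEG channel."""
    return Ridge(alpha=alpha).fit(lag_features(stim, n_lags), eeg)

def cross_predict(model, stim_test, eeg_test, n_lags=26):
    """Correlation between predicted and recorded EEG, per channel."""
    pred = model.predict(lag_features(stim_test, n_lags))
    return np.array([np.corrcoef(pred[:, c], eeg_test[:, c])[0, 1]
                     for c in range(eeg_test.shape[1])])

# Hypothetical example: train on audiovisual (AV) features, test on audio-only data.
rng = np.random.default_rng(0)
stim_av, eeg_av = rng.standard_normal((5000, 20)), rng.standard_normal((5000, 64))
stim_a, eeg_a = rng.standard_normal((1000, 20)), rng.standard_normal((1000, 64))
av_model = fit_encoding_model(stim_av, eeg_av)
r_cross = cross_predict(av_model, stim_a, eeg_a)

In practice the same routine would be run with matched unimodal training data so the AV-trained and unimodal-trained prediction correlations can be compared directly.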
Subject(s)
Acoustic Stimulation, Auditory Perception, Electroencephalography, Photic Stimulation, Visual Perception, Humans, Electroencephalography/methods, Male, Female, Auditory Perception/physiology, Adult, Visual Perception/physiology, Young Adult, Brain/physiology, Models, Neurological, Computational Biology
ABSTRACT
Chronic pain syndromes are often refractory to treatment and cause substantial suffering and disability. Pain severity is often measured through subjective report, while objective biomarkers that may guide diagnosis and treatment are lacking. Moreover, it remains unclear which brain activity underlies chronic pain on clinically relevant timescales, or how this activity relates to acute pain. Here, four individuals with refractory neuropathic pain were implanted with chronic intracranial electrodes in the anterior cingulate cortex and orbitofrontal cortex (OFC). Participants reported pain metrics multiple times daily over months, coincident with ambulatory, direct neural recordings. Using machine learning methods, we predicted intraindividual chronic pain severity scores from neural activity with high sensitivity. Chronic pain decoding relied on sustained power changes in the OFC, which tended to differ from the transient patterns of activity associated with acute, evoked pain states during a task. Thus, intracranial OFC signals can be used to predict the spontaneous, chronic pain state in patients.
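As a rough illustration of the kind of decoding described above, the sketch below (hypothetical data and feature choices, not the study's pipeline) extracts band-power features from simulated intracranial snippets and cross-validates a simple classifier of high- versus low-pain reports; the channel counts, frequency bands, and the use of logistic regression are all assumptions made for the example.

import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def band_power(lfp, fs, band):
    """Mean power of each channel within a frequency band (snippets x channels x time)."""
    freqs, psd = welch(lfp, fs=fs, axis=-1)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[..., mask].mean(axis=-1)

# Hypothetical data: 200 recording snippets, 4 contacts (e.g., ACC/OFC), 2 s at 250 Hz.
rng = np.random.default_rng(1)
lfp = rng.standard_normal((200, 4, 500))
pain_high = rng.integers(0, 2, size=200)   # self-reported pain, binarized for the example

# Stack power in a few canonical bands as features, then cross-validate a simple decoder.
bands = [(4, 8), (8, 13), (13, 30), (30, 70)]
X = np.hstack([band_power(lfp, fs=250, band=b) for b in bands])
scores = cross_val_score(LogisticRegression(max_iter=1000), X, pain_high, cv=5)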
Subject(s)
Chronic Pain, Humans, Chronic Pain/diagnosis, Electrodes, Implanted, Prefrontal Cortex/physiology, Gyrus Cinguli
ABSTRACT
The neurological basis of affective behaviours in everyday life is not well understood. We obtained continuous intracranial electroencephalography recordings from the human mesolimbic network in 11 participants with epilepsy and hand-annotated spontaneous behaviours from 116 h of multiday video recordings. In individual participants, binary random forest models decoded affective behaviours from neutral behaviours with up to 93% accuracy. Both positive and negative affective behaviours were associated with increased high-frequency and decreased low-frequency activity across the mesolimbic network. The insula, amygdala, hippocampus and anterior cingulate cortex made stronger contributions to affective behaviours than the orbitofrontal cortex, but the insula and anterior cingulate cortex were most critical for differentiating behaviours with observable affect from those without. In a subset of participants (N = 3), multiclass decoders distinguished amongst the positive, negative and neutral behaviours. These results suggest that spectro-spatial features of brain activity in the mesolimbic network are associated with affective behaviours of everyday life.
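A minimal sketch of the per-participant binary decoding approach described above, using hypothetical low- and high-frequency power features and scikit-learn's random forest; the epoch counts, channel labels, and feature definitions are placeholders, not the authors' settings.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(2)
n_epochs, n_channels = 300, 8                                   # e.g., contacts across the mesolimbic network
low_freq_power = rng.standard_normal((n_epochs, n_channels))    # assumed low-frequency band (e.g., 4-30 Hz)
high_freq_power = rng.standard_normal((n_epochs, n_channels))   # assumed high-frequency band (e.g., 70-150 Hz)
X = np.hstack([low_freq_power, high_freq_power])
y = rng.integers(0, 2, size=n_epochs)                           # 1 = affective behaviour epoch, 0 = neutral

clf = RandomForestClassifier(n_estimators=500, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
accuracy = cross_val_score(clf, X, y, cv=cv).mean()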
Subject(s)
Emotions, Gyrus Cinguli, Amygdala/diagnostic imaging, Gyrus Cinguli/diagnostic imaging, Hippocampus, Humans, Prefrontal Cortex
ABSTRACT
In many experiments that investigate auditory and speech processing in the brain using electroencephalography (EEG), the experimental paradigm is lengthy and tedious. Experimenters typically err on the side of collecting more data and more trials, and therefore running a longer task, to ensure that the data are robust and effects are measurable. Recent studies have used naturalistic stimuli to investigate the brain's response to individual speech features or combinations of features using system identification techniques, such as multivariate temporal receptive field (mTRF) analyses. The neural data collected from such experiments must be divided into a training set and a test set to fit and validate the mTRF weights. While a good strategy is clearly to collect as much data as is feasible, it is unclear how much data are needed to achieve stable results. Furthermore, it is unclear whether the specific stimulus used for mTRF fitting and the choice of feature representation affect how much data are required for robust and generalizable results. Here, we used EEG data previously collected in our lab with sentence and movie stimuli, as well as EEG data from an open-source audiobook dataset, to better understand how much data must be collected for naturalistic speech experiments measuring acoustic and phonetic tuning. We found that the EEG receptive field structure tested here stabilizes with a training set of approximately 200 s of TIMIT sentences, around 600 s of movie trailers, and approximately 460 s of audiobook data. Thus, we provide suggestions on the minimum amount of data necessary for fitting mTRFs from naturalistic listening data. These recommendations are motivated by highly practical concerns when working with children, patient populations, or others who may not tolerate long study sessions, and will aid future researchers who wish to study naturalistic speech processing in healthy and clinical populations while minimizing participant fatigue and retaining signal quality.
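The question of how much training data is enough can be probed with a simple learning-curve loop: fit the mTRF on progressively larger portions of the training data and track held-out prediction accuracy until it plateaus. The sketch below uses hypothetical arrays and a plain ridge mTRF; the sampling rate, lag count, and feature dimensionality are assumptions made for illustration.

import numpy as np
from sklearn.linear_model import Ridge

def lag(X, n_lags):
    """Time-lagged copies of a (time x features) stimulus matrix."""
    T, F = X.shape
    out = np.zeros((T, F * n_lags))
    for k in range(n_lags):
        out[k:, k * F:(k + 1) * F] = X[:T - k]
    return out

rng = np.random.default_rng(3)
fs, n_lags = 64, 40                                      # ~600 ms of lags at 64 Hz (assumed)
stim = rng.standard_normal((fs * 720, 10))               # 12 min of acoustic/phonetic features
eeg = rng.standard_normal((fs * 720, 64))
stim_test, eeg_test = stim[-fs * 120:], eeg[-fs * 120:]  # hold out the final 2 min as the test set

for train_seconds in (100, 200, 400, 600):
    n = fs * train_seconds
    model = Ridge(alpha=1e3).fit(lag(stim[:n], n_lags), eeg[:n])
    pred = model.predict(lag(stim_test, n_lags))
    r = np.mean([np.corrcoef(pred[:, c], eeg_test[:, c])[0, 1] for c in range(64)])
    print(f"{train_seconds:4d} s of training data: mean r = {r:.3f}")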
ABSTRACT
In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking." Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from acoustically rich, naturalistic environments with and without background noise can be generalized to more controlled stimuli. If encoding models for acoustically rich, naturalistic stimuli are generalizable to other tasks, this could aid data collection from populations who may not tolerate listening to more controlled and less engaging stimuli for long periods of time. We recorded noninvasive scalp EEG while 17 human participants (8 male/9 female) listened to speech without noise and to audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both stimulus conditions. Our results suggested that neural responses to naturalistic stimuli were generalizable to more controlled datasets. EEG responses to speech in isolation were predicted accurately using phonological features alone, whereas predictions of responses to speech in a rich acoustic background were more accurate when both phonological and acoustic features were included. Our findings suggest that naturalistic audiovisual stimuli can be used to measure receptive fields that are comparable and generalizable to those measured with more controlled, audio-only stimuli.
SIGNIFICANCE STATEMENT Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models in which EEG data are predicted from a combination of acoustic, phonetic, and visual features in highly disparate stimuli: sentences from a speech corpus and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to simpler stimuli typically used in sensory neuroscience experiments.
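The feature-set comparison described above (phonological features alone versus phonological plus acoustic features) can be expressed as fitting two encoding models on the same EEG and comparing held-out prediction correlations. The following is a hypothetical sketch of that comparison, with made-up binary phonological features, envelope, and pitch regressors standing in for the actual stimulus representations.

import numpy as np
from sklearn.linear_model import Ridge

def lag(X, n_lags):
    """Time-lagged copies of a (time x features) stimulus matrix."""
    T, F = X.shape
    out = np.zeros((T, F * n_lags))
    for k in range(n_lags):
        out[k:, k * F:(k + 1) * F] = X[:T - k]
    return out

def test_correlation(train_X, train_Y, test_X, test_Y, n_lags=40, alpha=1e3):
    """Mean prediction correlation across EEG channels on held-out data."""
    model = Ridge(alpha=alpha).fit(lag(train_X, n_lags), train_Y)
    pred = model.predict(lag(test_X, n_lags))
    return np.mean([np.corrcoef(pred[:, c], test_Y[:, c])[0, 1]
                    for c in range(test_Y.shape[1])])

rng = np.random.default_rng(4)
T, n_channels = 20000, 64
phonological = rng.integers(0, 2, size=(T, 14)).astype(float)   # binary phonological features (assumed count)
envelope = np.abs(rng.standard_normal((T, 1)))                   # acoustic envelope regressor
pitch = np.abs(rng.standard_normal((T, 1)))                      # pitch regressor
eeg = rng.standard_normal((T, n_channels))

split = int(0.8 * T)
full = np.hstack([phonological, envelope, pitch])
r_phn = test_correlation(phonological[:split], eeg[:split], phonological[split:], eeg[split:])
r_full = test_correlation(full[:split], eeg[:split], full[split:], eeg[split:])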
Subject(s)
Acoustic Stimulation/methods, Brain/physiology, Electroencephalography/methods, Electrooculography/methods, Photic Stimulation/methods, Speech Perception/physiology, Adult, Female, Humans, Male, Motion Pictures, Young Adult
ABSTRACT
The anterior cingulate cortex (ACC) has been extensively implicated in the functional brain network underlying chronic pain. Electrical stimulation of the ACC has been proposed as a therapy for refractory chronic pain, although the mechanisms of its therapeutic action remain unclear. Because stimulation of the ACC has been reported to produce many different behavioral and perceptual responses, this region likely plays a varied role in sensory and emotional integration as well as in modulating internally generated perceptual states. In this case series, we report the emergence of subjective musical hallucinations (MH) after electrical stimulation of the ACC in two patients with refractory chronic pain. In an N-of-1 analysis from one patient, we identified neural activity (local field potentials) that distinguished MH from both the non-MH condition and a task involving music listening. Musical hallucinations were associated with reduced alpha-band activity and increased gamma-band activity in the ACC. Listening to similar music was associated with different changes in ACC alpha and gamma power, extending prior results showing that internally generated perceptual phenomena are supported by circuits in the ACC. We discuss these findings in the context of phantom perceptual phenomena and posit a framework whereby chronic pain may be interpreted as a persistent, internally generated percept.
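As an illustration of the band-power contrast described above, the sketch below (simulated signals, not the case-series data) compares alpha- and gamma-band Hilbert-envelope power between hallucination and non-hallucination epochs; the sampling rate, epoch lengths, and band edges are assumptions.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_envelope_power(x, fs, low, high):
    """Mean squared Hilbert envelope of a band-pass filtered signal (epochs x time)."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    env = np.abs(hilbert(filtfilt(b, a, x, axis=-1), axis=-1))
    return (env ** 2).mean(axis=-1)

fs = 500
rng = np.random.default_rng(5)
acc_mh = rng.standard_normal((30, fs * 4))      # epochs recorded during reported hallucinations
acc_rest = rng.standard_normal((30, fs * 4))    # matched non-hallucination epochs

alpha_change = band_envelope_power(acc_mh, fs, 8, 13) - band_envelope_power(acc_rest, fs, 8, 13)
gamma_change = band_envelope_power(acc_mh, fs, 30, 70) - band_envelope_power(acc_rest, fs, 30, 70)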
ABSTRACT
Music and speech are human-specific behaviours that share numerous properties, including the fine motor skills required to produce them. Given these similarities, previous work has suggested that music and speech may at least partially share neural substrates. To date, much of this work has focused on perception and has not investigated the neural basis of production, particularly in trained musicians. Here, we report two rare cases of musicians undergoing neurosurgical procedures in which it was possible to directly stimulate the left-hemisphere cortex during speech and piano/guitar music production tasks. We found that stimulation of the left inferior frontal cortex, including pars opercularis and ventral precentral gyrus, caused slowing and arrest of both speech and music production, as well as note-sequence errors for music. Stimulation of posterior superior temporal cortex caused production errors only during speech. These results demonstrate partially dissociable networks underlying speech and music production, with a shared substrate in frontal regions.