Results 1 - 20 of 661

1.
Commun Biol ; 7(1): 291, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459110

ABSTRACT

When engaged in a conversation, one receives auditory information from the other's speech but also from one's own speech. However, this information is processed differently, an effect known as Speech-Induced Suppression (SIS). Here, we studied the brain representation of acoustic properties of speech in natural unscripted dialogues, using electroencephalography (EEG) and high-quality speech recordings from both participants. Using encoding techniques, we reproduced a broad range of previous findings on listening to another's speech, and achieved even better performance when predicting the EEG signal in this complex scenario. Furthermore, we found no response when participants listened to their own speech, across different acoustic features (spectrogram, envelope, etc.) and frequency bands, evidencing a strong SIS effect. The present work shows that this mechanism is present, and even stronger, during natural dialogues. Moreover, the methodology presented here opens the possibility of a deeper understanding of the related mechanisms in a wider range of contexts.
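The abstract does not spell out the encoding pipeline; the following is a minimal sketch of one common way to fit such a forward model: ridge regression mapping a time-lagged speech envelope onto a single EEG channel. The variable names, sampling rate, lag window, and regularization value are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from numpy.linalg import solve

def lagged_design(stimulus, lags):
    """Build a time-lagged design matrix from a 1-D stimulus feature."""
    X = np.zeros((len(stimulus), len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stimulus[:len(stimulus) - lag]
    return X

def fit_encoder(stimulus, eeg, lags, alpha=1.0):
    """Ridge-regression encoding model (a simple temporal response function)."""
    X = lagged_design(stimulus, lags)
    w = solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ eeg)
    return w

# Hypothetical data: 60 s of speech envelope and one EEG channel at 128 Hz.
fs = 128
rng = np.random.default_rng(0)
envelope = rng.random(60 * fs)
eeg = np.convolve(envelope, rng.standard_normal(20), mode="same") + rng.standard_normal(60 * fs)

lags = np.arange(0, int(0.4 * fs))          # 0-400 ms stimulus-to-EEG lags
w = fit_encoder(envelope, eeg, lags)
prediction = lagged_design(envelope, lags) @ w
r = np.corrcoef(prediction, eeg)[0, 1]      # encoding (prediction) accuracy
print(f"encoding accuracy r = {r:.2f}")
```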


Subject(s)
Electroencephalography; Speech; Humans; Speech/physiology; Acoustic Stimulation/methods; Electroencephalography/methods; Brain; Brain Mapping/methods
2.
IEEE Trans Biomed Eng ; 71(8): 2454-2462, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38470574

ABSTRACT

Some classification studies of brain-computer interfaces (BCI) based on speech imagery show potential for improving communication skills in patients with amyotrophic lateral sclerosis (ALS). However, current research on speech imagery is limited in scope and primarily focuses on vowels or a few selected words. In this paper, we propose a complete research scheme for multi-character classification based on EEG signals derived from speech imagery. Firstly, we record 31 speech imagery items, comprising the 26 letters of the alphabet and five commonly used punctuation marks, from seven subjects using a 32-channel electroencephalogram (EEG) device. Secondly, we introduce the wavelet scattering transform (WST), which shares a structural resemblance to convolutional neural networks (CNNs), for feature extraction. The WST is a knowledge-driven technique that preserves high-frequency information and maintains the deformation stability of EEG signals. To reduce the dimensionality of the wavelet scattering coefficient features, we employ Kernel Principal Component Analysis (KPCA). Finally, the reduced features are fed into an Extreme Gradient Boosting (XGBoost) classifier within a multi-classification framework. The XGBoost classifier is optimized through hyperparameter tuning using grid search and 10-fold cross-validation, resulting in an average accuracy of 78.73% for the multi-character classification task. We use t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the low-dimensional representation of multi-character speech imagery, which enables us to observe the clustering of similar characters. The experimental results demonstrate the effectiveness of our proposed multi-character classification scheme. Furthermore, our classification categories and accuracy exceed those reported in existing research.
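As a hedged illustration of the classification stage described above, the sketch below chains KPCA dimensionality reduction with an XGBoost classifier tuned by grid search and 10-fold cross-validation, assuming wavelet scattering features have already been extracted. The feature matrix, labels, and hyperparameter grid are placeholders, not the study's actual values (requires scikit-learn and xgboost).

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

# Hypothetical input: wavelet-scattering coefficients per trial (n_trials x n_coeffs)
# and labels for the 31 imagined characters (26 letters + 5 punctuation marks).
rng = np.random.default_rng(0)
X = rng.standard_normal((310, 500))
y = np.repeat(np.arange(31), 10)

pipe = Pipeline([
    ("kpca", KernelPCA(n_components=60, kernel="rbf")),  # reduce scattering features
    ("xgb", XGBClassifier(eval_metric="mlogloss")),      # multi-class gradient boosting
])

# Illustrative hyperparameter grid; the actual search space is not specified here.
grid = {
    "xgb__n_estimators": [100, 300],
    "xgb__max_depth": [3, 6],
    "xgb__learning_rate": [0.05, 0.1],
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy", n_jobs=-1)
search.fit(X, y)
print("best 10-fold CV accuracy:", search.best_score_)
```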


Subject(s)
Brain-Computer Interfaces; Electroencephalography; Signal Processing, Computer-Assisted; Speech; Humans; Electroencephalography/methods; Speech/physiology; Algorithms; Wavelet Analysis; Imagination/physiology; Adult; Male; Female; Neural Networks, Computer
3.
J Neurosci ; 44(10)2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38199864

ABSTRACT

During communication in real-life settings, our brain often needs to integrate auditory and visual information and at the same time actively focus on the relevant sources of information, while ignoring interference from irrelevant events. The interaction between integration and attention processes remains poorly understood. Here, we use rapid invisible frequency tagging and magnetoencephalography to investigate how attention affects auditory and visual information processing and integration, during multimodal communication. We presented human participants (male and female) with videos of an actress uttering action verbs (auditory; tagged at 58 Hz) accompanied by two movie clips of hand gestures on both sides of fixation (attended stimulus tagged at 65 Hz; unattended stimulus tagged at 63 Hz). Integration difficulty was manipulated by a lower-order auditory factor (clear/degraded speech) and a higher-order visual semantic factor (matching/mismatching gesture). We observed an enhanced neural response to the attended visual information during degraded speech compared to clear speech. For the unattended information, the neural response to mismatching gestures was enhanced compared to matching gestures. Furthermore, signal power at the intermodulation frequencies of the frequency tags, indexing nonlinear signal interactions, was enhanced in the left frontotemporal and frontal regions. Focusing on the left inferior frontal gyrus, this enhancement was specific for the attended information, for those trials that benefitted from integration with a matching gesture. Together, our results suggest that attention modulates audiovisual processing and interaction, depending on the congruence and quality of the sensory input.
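A minimal sketch of the frequency-tagging readout described above, assuming a single sensor time series: spectral power is read out at the tagged frequencies (58, 63, 65 Hz) and at intermodulation frequencies (sums and differences of the tags). The toy signal and sampling rate are illustrative only.

```python
import numpy as np

def power_at(signal, fs, freq):
    """Spectral power at a single frequency via the FFT (rectangular window)."""
    spectrum = np.fft.rfft(signal) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    idx = np.argmin(np.abs(freqs - freq))
    return np.abs(spectrum[idx]) ** 2

fs = 1000                                            # assumed sampling rate
t = np.arange(0, 10, 1 / fs)
# Toy MEG-like trace containing the tagged frequencies plus an intermodulation term.
signal = (np.sin(2 * np.pi * 58 * t)                 # auditory tag
          + 0.8 * np.sin(2 * np.pi * 65 * t)         # attended visual tag
          + 0.3 * np.sin(2 * np.pi * (65 - 58) * t)  # intermodulation (difference)
          + np.random.default_rng(0).standard_normal(len(t)) * 0.1)

for f in (58, 63, 65, 65 - 58, 65 + 58):
    print(f"{f:>4} Hz power: {power_at(signal, fs, f):.4f}")
```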


Subject(s)
Brain; Speech Perception; Humans; Male; Female; Brain/physiology; Visual Perception/physiology; Magnetoencephalography; Speech/physiology; Attention/physiology; Speech Perception/physiology; Acoustic Stimulation; Photic Stimulation
4.
Cortex ; 171: 287-307, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38061210

ABSTRACT

The spectral formant structure and periodicity pitch are the major features that determine the identity of vowels and the characteristics of the speaker. However, very little is known about how the processing of these features in the auditory cortex changes during development. To address this question, we independently manipulated the periodicity and formant structure of vowels while measuring auditory cortex responses using magnetoencephalography (MEG) in children aged 7-12 years and adults. We analyzed the sustained negative shift of source current associated with these vowel properties, which was present in the auditory cortex in both age groups despite differences in the transient components of the auditory response. In adults, the sustained activation associated with formant structure was lateralized to the left hemisphere early in the auditory processing stream requiring neither attention nor semantic mapping. This lateralization was not yet established in children, in whom the right hemisphere contribution to formant processing was strong and decreased during or after puberty. In contrast to the formant structure, periodicity was associated with a greater response in the right hemisphere in both children and adults. These findings suggest that left-lateralization for the automatic processing of vowel formant structure emerges relatively late in ontogenesis and pose a serious challenge to current theories of hemispheric specialization for speech processing.


Subject(s)
Auditory Cortex; Speech Perception; Adult; Humans; Child; Auditory Cortex/physiology; Acoustic Stimulation; Auditory Perception/physiology; Magnetoencephalography; Speech/physiology; Speech Perception/physiology
5.
Article in English | MEDLINE | ID: mdl-38083588

ABSTRACT

Brain-computer interfaces (BCI) based on speech imagery can decode users' verbal intent and help people with motor disabilities communicate naturally. Functional near-infrared spectroscopy (fNIRS) is a commonly used brain signal acquisition method. An asynchronous BCI can respond to control commands at any time, which provides great convenience for users. Task state detection, defined as identifying whether a user starts or continues covertly articulating, plays an important role in speech imagery BCIs. To better distinguish the task state from the idle state during speech imagery, this work used fNIRS signals from different brain regions to study their effects on task state detection accuracy. The imagined tonal syllables included four lexical tones and four vowels in Mandarin Chinese. The measured brain regions included Broca's area, Wernicke's area, the superior temporal cortex, and the motor cortex. Task state detection accuracies for imagining tonal monosyllables with the four different tones were analyzed. The average accuracy of the four speech imagery tasks based on the whole brain was 0.67, close to the 0.69 obtained from Broca's area alone. The accuracies of Broca's area and the whole brain were significantly higher than those of the other brain regions. These findings demonstrate that using a few channels over Broca's area can yield a task state detection accuracy similar to that obtained using all channels of the brain. Moreover, speech imagery tasks with tones 2 and 3 yielded higher task state detection accuracy than those with the other tones.


Subject(s)
Motor Cortex; Speech; Humans; Speech/physiology; Brain/diagnostic imaging; Brain/physiology; Imagery, Psychotherapy; Temporal Lobe; Motor Cortex/physiology
6.
Proc Natl Acad Sci U S A ; 120(49): e2309166120, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38032934

ABSTRACT

Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
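A hedged sketch of a multivariate temporal response function (mTRF) fit, assuming MNE-Python is available; the envelope and envelope-onset predictors, sampling rate, lag window, and ridge value are placeholders rather than the study's settings.

```python
import numpy as np
from mne.decoding import ReceptiveField

fs = 100                                         # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
n_times = 20 * fs                                # ~20 s passage, as in the stimuli

# Hypothetical predictors: speech envelope and its onsets (half-wave rectified derivative).
envelope = rng.random(n_times)
onsets = np.clip(np.diff(envelope, prepend=envelope[0]), 0, None)
X = np.column_stack([envelope, onsets])          # (n_times, n_features)
meg = rng.standard_normal(n_times)               # one MEG channel (placeholder data)

# Multivariate TRF from -100 ms to 500 ms, ridge-regularized.
trf = ReceptiveField(tmin=-0.1, tmax=0.5, sfreq=fs,
                     feature_names=["envelope", "onset"],
                     estimator=1.0, scoring="corrcoef")
trf.fit(X, meg)
print("prediction score:", trf.score(X, meg))
```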


Subject(s)
Speech Intelligibility; Speech Perception; Speech Intelligibility/physiology; Acoustic Stimulation/methods; Speech/physiology; Noise; Acoustics; Magnetoencephalography/methods; Speech Perception/physiology
7.
J Neurosci ; 43(48): 8189-8200, 2023 11 29.
Article in English | MEDLINE | ID: mdl-37793909

ABSTRACT

Spontaneous speech is produced in chunks called intonation units (IUs). IUs are defined by a set of prosodic cues and presumably occur in all human languages. Recent work has shown that across different grammatical and sociocultural conditions IUs form rhythms of ∼1 unit per second. Linguistic theory suggests that IUs pace the flow of information in the discourse. As a result, IUs provide a promising and hitherto unexplored theoretical framework for studying the neural mechanisms of communication. In this article, we identify a neural response unique to the boundary defined by the IU. We measured the EEG of human participants (of either sex), who listened to different speakers recounting an emotional life event. We analyzed the speech stimuli linguistically and modeled the EEG response at word offset using a GLM approach. We find that the EEG response to IU-final words differs from the response to IU-nonfinal words even when equating acoustic boundary strength. Finally, we relate our findings to the body of research on rhythmic brain mechanisms in speech processing. We study the unique contribution of IUs and acoustic boundary strength in predicting delta-band EEG. This analysis suggests that IU-related neural activity, which is tightly linked to the classic Closure Positive Shift (CPS), could be a time-locked component that captures the previously characterized delta-band neural speech tracking. SIGNIFICANCE STATEMENT: Linguistic communication is central to human experience, and its neural underpinnings have been a topic of much research in recent years. Neuroscientific research has benefited from studying human behavior in naturalistic settings, an endeavor that requires explicit models of complex behavior. Usage-based linguistic theory suggests that spoken language is prosodically structured in intonation units. We reveal that the neural system is attuned to intonation units by explicitly modeling their impact on the EEG response beyond mere acoustics. To our understanding, this is the first time this has been demonstrated in spontaneous speech under naturalistic conditions and under a theoretical framework that connects the prosodic chunking of speech, on the one hand, with the flow of information during communication, on the other.
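A minimal sketch of the kind of per-word GLM described above, assuming hypothetical per-word EEG amplitudes: a binary IU-final predictor is entered alongside acoustic boundary strength, so the IU coefficient reflects variance beyond acoustics. Variable names and the simulated data are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-word data: EEG amplitude averaged in a window after word offset,
# a binary flag for IU-final words, and acoustic boundary strength as a covariate.
rng = np.random.default_rng(0)
n_words = 400
iu_final = rng.integers(0, 2, n_words)
boundary_strength = rng.random(n_words) + 0.5 * iu_final
eeg_amplitude = 0.8 * iu_final + 0.3 * boundary_strength + rng.standard_normal(n_words)

# GLM: does IU finality explain EEG amplitude beyond acoustic boundary strength?
X = np.column_stack([iu_final, boundary_strength])
glm = LinearRegression().fit(X, eeg_amplitude)
print("IU-final coefficient:", round(glm.coef_[0], 3))
print("boundary-strength coefficient:", round(glm.coef_[1], 3))
```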


Subject(s)
Speech Perception; Speech; Humans; Speech/physiology; Electroencephalography; Acoustic Stimulation; Speech Perception/physiology; Language
8.
Hum Brain Mapp ; 44(17): 6149-6172, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37818940

ABSTRACT

The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz), including the delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded while normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with a partialling procedure that singles out unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. We argue these results indicate that delta-band tracking may play a role in predictive coding, leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. The early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solely explained by attentional demands or listening effort. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit with the hierarchical Predictive Coding framework. Together, we suggest the distinct roles of delta and theta neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing.


Subject(s)
Auditory Cortex; Speech Perception; Humans; Speech/physiology; Electroencephalography/methods; Acoustic Stimulation/methods; Speech Perception/physiology; Auditory Cortex/physiology
9.
Neuroimage ; 282: 120404, 2023 11 15.
Article in English | MEDLINE | ID: mdl-37806465

ABSTRACT

Despite the distortion of speech signals caused by unavoidable noise in daily life, our ability to comprehend speech in noisy environments is relatively stable. However, the neural mechanisms underlying reliable speech-in-noise comprehension remain to be elucidated. The present study investigated the neural tracking of acoustic and semantic speech information during noisy naturalistic speech comprehension. Participants listened to narrative audio recordings mixed with spectrally matched stationary noise at three signal-to-noise ratio (SNR) levels (no noise, 3 dB, -3 dB), and 60-channel electroencephalography (EEG) signals were recorded. A temporal response function (TRF) method was employed to derive event-related-like responses to the continuous speech stream at both the acoustic and the semantic levels. Whereas the amplitude envelope of the naturalistic speech was taken as the acoustic feature, word entropy and word surprisal were extracted via natural language processing methods as two semantic features. Theta-band frontocentral TRF responses to the acoustic feature were observed at around 400 ms following speech fluctuation onset at all three SNR levels, and the response latencies were more delayed with increasing noise. Delta-band frontal TRF responses to the semantic feature of word entropy were observed at around 200 to 600 ms preceding speech fluctuation onset at all three SNR levels. The response latencies became more leading with increasing noise and decreasing speech comprehension and intelligibility. While the lagging responses to speech acoustics were consistent with previous studies, our study revealed the robustness of the leading responses to speech semantics, which suggests a possible predictive mechanism at the semantic level for maintaining reliable speech comprehension in noisy environments.
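For illustration only, the sketch below computes word surprisal and word entropy from a toy bigram model; the study's actual natural language processing method is not specified here, and a real analysis would use a large corpus or language model.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; a real analysis would use far more data.
corpus = "the cat sat on the mat the cat ate the fish".split()

bigrams = Counter(zip(corpus[:-1], corpus[1:]))
context_counts = Counter(corpus[:-1])
next_words = defaultdict(Counter)
for (w1, w2), c in bigrams.items():
    next_words[w1][w2] = c

def surprisal(context, word):
    """Word surprisal: -log2 p(word | context) under the bigram model."""
    p = bigrams[(context, word)] / context_counts[context]
    return -math.log2(p)

def entropy(context):
    """Word entropy: uncertainty about the next word given the context."""
    total = context_counts[context]
    return -sum((c / total) * math.log2(c / total)
                for c in next_words[context].values())

print("surprisal('the' -> 'cat'):", round(surprisal("the", "cat"), 3))
print("entropy after 'the':", round(entropy("the"), 3))
```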


Subject(s)
Comprehension; Speech Perception; Humans; Comprehension/physiology; Semantics; Speech/physiology; Speech Perception/physiology; Electroencephalography; Acoustics; Acoustic Stimulation
10.
J Neurosci ; 43(40): 6779-6795, 2023 10 04.
Article in English | MEDLINE | ID: mdl-37607822

ABSTRACT

Communication difficulties are one of the core criteria in diagnosing autism spectrum disorder (ASD), and are often characterized by speech reception difficulties, whose biological underpinnings are not yet identified. This deficit could denote atypical neuronal ensemble activity, as reflected by neural oscillations. Atypical cross-frequency oscillation coupling, in particular, could disrupt the joint tracking and prediction of dynamic acoustic stimuli, a dual process that is essential for speech comprehension. Whether such oscillatory anomalies already exist in very young children with ASD, and with what specificity they relate to individual language reception capacity, is unknown. We collected neural activity data using electroencephalography (EEG) in 64 very young children with and without ASD (mean age 3; 17 females, 47 males) while they were exposed to naturalistic-continuous speech. EEG power in frequency bands typically associated with phrase-level chunking (δ, 1-3 Hz), phonemic encoding (low-γ, 25-35 Hz), and top-down control (β, 12-20 Hz) was markedly reduced in ASD relative to typically developing (TD) children. Speech neural tracking by δ and θ (4-8 Hz) oscillations was also weaker in ASD compared with TD children. After controlling for gaze-pattern differences, we found that the classical θ/γ coupling was replaced by an atypical β/γ coupling in children with ASD. This anomaly was the single most specific predictor of individual speech reception difficulties in ASD children. These findings suggest that early interventions (e.g., neurostimulation) targeting the disruption of β/γ coupling and the upregulation of θ/γ coupling could improve speech processing coordination in young children with ASD and help them engage in oral interactions. SIGNIFICANCE STATEMENT: Very young children already present marked alterations of neural oscillatory activity in response to natural speech at the time of autism spectrum disorder (ASD) diagnosis. Hierarchical processing of phonemic-range and syllabic-range information (θ/γ coupling) is disrupted in ASD children. Abnormal bottom-up (low-γ) and top-down (low-β) coordination specifically predicts speech reception deficits, but not other cognitive deficits, in very young ASD children.
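A hedged sketch of one standard way to quantify the cross-frequency coupling discussed above: mean-vector-length phase-amplitude coupling between a low-frequency phase band (theta or beta) and gamma amplitude. The synthetic signal, frequency bands, and sampling rate are assumptions for demonstration.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, fs, lo, hi, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def pac_mvl(x, fs, phase_band, amp_band):
    """Phase-amplitude coupling via the mean-vector-length method."""
    phase = np.angle(hilbert(bandpass(x, fs, *phase_band)))
    amp = np.abs(hilbert(bandpass(x, fs, *amp_band)))
    return np.abs(np.mean(amp * np.exp(1j * phase)))

fs = 500
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(0)
theta = np.sin(2 * np.pi * 6 * t)
# Toy EEG: gamma bursts whose amplitude follows theta phase, plus noise.
eeg = theta + (1 + theta) * 0.3 * np.sin(2 * np.pi * 30 * t) + 0.2 * rng.standard_normal(len(t))

print("theta/gamma PAC:", pac_mvl(eeg, fs, (4, 8), (25, 35)))
print("beta/gamma PAC:", pac_mvl(eeg, fs, (12, 20), (25, 35)))
```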


Subject(s)
Autism Spectrum Disorder; Autistic Disorder; Male; Female; Humans; Child; Child, Preschool; Speech/physiology; Autism Spectrum Disorder/diagnosis; Electroencephalography; Acoustic Stimulation
11.
J Speech Lang Hear Res ; 66(9): 3223-3241, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37524116

ABSTRACT

PURPOSE: Children with residual speech sound disorders (RSSD) have shown differences in neural function for speech production, as compared to their typical peers; however, information about how these differences may change over time and relative to speech therapy is needed. To address this gap, we used functional magnetic resonance imaging (fMRI) to examine functional activation and connectivity on adaptations of the syllable repetition task (SRT-Early Sounds and SRT-Late Sounds) in children with RSSD before and after a speech therapy program. METHOD: Sixteen children with RSSD completed an fMRI experiment before (Time 1) and after (Time 2) a speech therapy program with ultrasound visual feedback for /ɹ/ misarticulation. Progress in therapy was measured via perceptual ratings of productions of untreated /ɹ/ word probes. To control for practice effects and developmental change in patterns of activation and connectivity, 17 children with typical speech development (TD) completed the fMRI at Time 1 and Time 2. Functional activation was analyzed using a region-of-interest approach, and functional connectivity was analyzed using a seed-to-voxel approach. RESULTS: Children with RSSD showed a range of responses to therapy. After correcting for multiple comparisons, we did not observe any statistically significant cross-sectional differences or longitudinal changes in functional activation. A negative relationship between therapy effect size and functional activation in the left visual association cortex was observed on the SRT-Late Sounds after therapy, but it did not survive correction for multiple comparisons. Significant longitudinal changes in functional connectivity were observed for the RSSD group on the SRT-Early Sounds and SRT-Late Sounds, as well as for the TD group on the SRT-Early Sounds. RSSD and TD groups showed connectivity differences near the left insula on the SRT-Late Sounds at Time 2. CONCLUSION: RSSD and treatment with ultrasound visual feedback may thus be associated with neural differences in speech motor and visual association processes recruited for speech production.


Subject(s)
Apraxias; Language Development Disorders; Speech Sound Disorder; Stuttering; Humans; Child; Speech/physiology; Speech Sound Disorder/diagnostic imaging; Speech Sound Disorder/therapy; Speech Therapy/methods; Cross-Sectional Studies; Biofeedback, Psychology/methods
12.
PLoS One ; 18(7): e0289288, 2023.
Article in English | MEDLINE | ID: mdl-37498891

ABSTRACT

The decoding multivariate temporal response function (decoder), or speech envelope reconstruction approach, is a well-known tool for assessing the cortical tracking of the speech envelope. It is used to analyse the correlation between the speech stimulus and the neural response. It is known that auditory late responses are enhanced with longer gaps between stimuli, but it is not clear whether this applies to the decoder, or whether adding gaps/pauses to continuous speech could be used to increase envelope reconstruction accuracy. We investigated this in normal-hearing participants who listened to continuous speech with no added pauses (natural speech), and then with short (250 ms) or long (500 ms) silent pauses inserted between words. The total durations of the continuous speech stimulus with no, short, and long pauses were approximately 10, 16, and 21 minutes, respectively. EEG and the speech envelope were simultaneously acquired and then filtered into delta (1-4 Hz) and theta (4-8 Hz) frequency bands. In addition to analysing responses to the whole speech envelope, the envelope was also segmented to analyse responses to onset and non-onset regions of speech separately. Our results show that continuous speech with additional pauses inserted between words significantly increases the speech envelope reconstruction correlations compared to natural speech, in both the delta and theta frequency bands. These increases in speech envelope reconstruction accuracy also appear to be dominated by the onset regions of the speech envelope. Introducing pauses in speech stimuli has potential clinical benefit for increasing auditory evoked response detectability, though with the disadvantage of the speech sounding less natural. The strong effect of pauses and onsets on the decoder should be considered when comparing results from different speech corpora. Whether the increased cortical response, when longer pauses are introduced, reflects improved intelligibility requires further investigation.
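A minimal sketch of a backward model (decoder) of the kind described above: ridge regression over time-lagged multichannel EEG reconstructs the speech envelope, and reconstruction accuracy is the Pearson correlation on held-out data. The sampling rate, lag window, channel count, and simulated data are illustrative assumptions.

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack copies of the EEG shifted so that samples lagging the stimulus
    by `lag` samples line up with the current envelope sample."""
    n_times, n_chan = eeg.shape
    X = np.zeros((n_times, n_chan * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.roll(eeg, -lag, axis=0)
        if lag > 0:
            shifted[-lag:] = 0
        X[:, j * n_chan:(j + 1) * n_chan] = shifted
    return X

def train_decoder(eeg, envelope, lags, alpha=1e2):
    X = lag_matrix(eeg, lags)
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ envelope)

fs = 64                                     # assumed sampling rate after downsampling
rng = np.random.default_rng(0)
n_times, n_chan = 5 * 60 * fs, 32           # 5 minutes, 32 channels (illustrative)
envelope = rng.random(n_times)
eeg = envelope[:, None] @ rng.standard_normal((1, n_chan)) + rng.standard_normal((n_times, n_chan))

lags = np.arange(0, int(0.25 * fs))         # EEG lags 0-250 ms after the stimulus
w = train_decoder(eeg[: n_times // 2], envelope[: n_times // 2], lags)
recon = lag_matrix(eeg[n_times // 2:], lags) @ w
r = np.corrcoef(recon, envelope[n_times // 2:])[0, 1]
print(f"reconstruction accuracy r = {r:.2f}")
```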


Subject(s)
Speech Perception; Speech; Humans; Speech/physiology; Electroencephalography/methods; Acoustic Stimulation/methods; Evoked Potentials, Auditory; Speech Perception/physiology
13.
J Neural Eng ; 20(4)2023 07 13.
Article in English | MEDLINE | ID: mdl-37406631

ABSTRACT

Objective. Many recent studies investigating the processing of continuous natural speech have employed electroencephalography (EEG) due to its high temporal resolution. However, most of these studies have explored the response mechanisms only in electrode space. In this study, we intend to explore the underlying neural processing in source space, particularly the dynamic functional interactions among different regions during neural entrainment to speech. Approach. We collected 128-channel EEG data while 22 participants listened to story speech and time-reversed speech using a naturalistic paradigm. We compared three different strategies to determine the best method for estimating neural tracking responses from the sensor space to the brain source space. After that, we used dynamic graph theory to investigate the source connectivity dynamics among regions involved in speech tracking. Main result. By comparing the correlations between the predicted neural response and the original common neural response under the two experimental conditions, we found that estimating the common neural response of participants in electrode space, followed by source localization of the neural responses, achieved the best performance. Analysis of the distribution of brain sources entrained to story speech envelopes showed that not only auditory regions but also frontoparietal cognitive regions were recruited, indicating a hierarchical processing mechanism for speech. Further analysis of inter-region interactions based on dynamic graph theory found that neural entrainment to speech operates across multiple brain regions along the hierarchical structure, among which the bilateral insula, temporal lobe, and inferior frontal gyrus are key regions that control information transmission. All of these information flows result in dynamic fluctuations in functional connection strength and network topology over time, reflecting both bottom-up and top-down processing while orchestrating computations toward understanding. Significance. Our findings have important implications for understanding the neural mechanisms of the brain during the processing of natural speech stimuli.


Subject(s)
Speech Perception; Speech; Humans; Speech/physiology; Speech Perception/physiology; Brain/physiology; Electroencephalography; Temporal Lobe/physiology; Acoustic Stimulation/methods
14.
J Cogn Neurosci ; 35(8): 1301-1311, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37379482

ABSTRACT

The envelope of a speech signal is tracked by neural activity in the cerebral cortex. The cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has been mostly associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information of words and word sequences. However, much regarding the more specific association between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists in different levels of signal-to-noise ratios (SNRs) that lead to different levels of speech comprehension as well as listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for the random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend that the PLV in the delta band might reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.
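A hedged sketch of the phase-locking value computation between an EEG channel and the speech envelope in the delta and theta bands, using band-pass filtering and the Hilbert transform; the toy envelope, delay, and sampling rate are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(x, fs, lo, hi, order=4):
    """Instantaneous phase of x in the [lo, hi] Hz band."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x)))

def plv(eeg, envelope, fs, band):
    """Phase-locking value between an EEG channel and the speech envelope."""
    dphi = band_phase(eeg, fs, *band) - band_phase(envelope, fs, *band)
    return np.abs(np.mean(np.exp(1j * dphi)))

fs = 128
rng = np.random.default_rng(0)
t = np.arange(0, 60, 1 / fs)
envelope = np.abs(np.sin(2 * np.pi * 2.5 * t)) + 0.1 * rng.random(len(t))  # toy envelope
eeg = np.roll(envelope, int(0.1 * fs)) + rng.standard_normal(len(t))       # delayed, noisy copy

print("delta PLV:", plv(eeg, envelope, fs, (1, 4)))
print("theta PLV:", plv(eeg, envelope, fs, (4, 8)))
```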


Subject(s)
Auditory Cortex; Speech Perception; Humans; Speech/physiology; Electroencephalography; Speech Perception/physiology; Auditory Cortex/physiology; Linguistics; Acoustic Stimulation
15.
Comput Biol Med ; 159: 106909, 2023 06.
Article in English | MEDLINE | ID: mdl-37071937

ABSTRACT

Speech imagery has been successfully employed in developing brain-computer interfaces because it is a novel mental strategy that generates brain activity more intuitively than evoked potentials or motor imagery. There are many methods to analyze speech imagery signals, but those based on deep neural networks achieve the best results. However, more research is necessary to understand the properties and features that describe imagined phonemes and words. In this paper, we analyze the statistical properties of speech imagery EEG signals from the KaraOne dataset to design a method that classifies imagined phonemes and words. With this analysis, we propose a Capsule Neural Network that categorizes speech imagery patterns into bilabial, nasal, consonant-vowel, and the vowels /iy/ and /uw/. The method is called Capsules for Speech Imagery Analysis (CapsK-SI). The input to CapsK-SI is a set of statistical features of EEG speech imagery signals. The architecture of the Capsule Neural Network is composed of a convolution layer, a primary capsule layer, and a class capsule layer. The average accuracies reached are 90.88%±7 for bilabial, 90.15%±8 for nasal, 94.02%±6 for consonant-vowel, 89.70%±8 for word-phoneme, 94.33%± for /iy/ vowel detection, and 94.21%±3 for /uw/ vowel detection. Finally, with the activity vectors of the CapsK-SI capsules, we generated brain maps to represent brain activity in the production of bilabial, nasal, and consonant-vowel signals.


Subject(s)
Brain-Computer Interfaces; Speech; Speech/physiology; Capsules; Electroencephalography/methods; Neural Networks, Computer; Brain/physiology; Imagination/physiology; Algorithms
16.
Hear Res ; 433: 108767, 2023 06.
Article in English | MEDLINE | ID: mdl-37060895

ABSTRACT

The goal of describing how the human brain responds to complex acoustic stimuli has driven auditory neuroscience research for decades. Often, a systems-based approach has been taken, in which neurophysiological responses are modeled based on features of the presented stimulus. This includes a wealth of work modeling electroencephalogram (EEG) responses to complex acoustic stimuli such as speech. Examples of the acoustic features used in such modeling include the amplitude envelope and spectrogram of speech. These models implicitly assume a direct mapping from stimulus representation to cortical activity. However, in reality, the representation of sound is transformed as it passes through early stages of the auditory pathway, such that inputs to the cortex are fundamentally different from the raw audio signal that was presented. Thus, it could be valuable to account for the transformations taking place in lower-order auditory areas, such as the auditory nerve, cochlear nucleus, and inferior colliculus (IC), when predicting cortical responses to complex sounds. Specifically, because IC responses are more similar to cortical inputs than acoustic features derived directly from the audio signal, we hypothesized that linear mappings (temporal response functions; TRFs) fit to the outputs of an IC model would better predict EEG responses to speech stimuli. To this end, we modeled responses to the acoustic stimuli as they passed through the auditory nerve, cochlear nucleus, and inferior colliculus before fitting a TRF to the output of the modeled IC responses. Results showed that using model-IC responses in traditional systems analyses resulted in better predictions of EEG activity than using the envelope or spectrogram of a speech stimulus. Further, it was revealed that model-IC-derived TRFs predict different aspects of the EEG than acoustic-feature TRFs, and combining both types of TRF models provides a more accurate prediction of the EEG response.


Subject(s)
Auditory Cortex; Inferior Colliculi; Humans; Speech/physiology; Auditory Pathways/physiology; Electroencephalography; Auditory Cortex/physiology; Inferior Colliculi/physiology; Acoustic Stimulation/methods; Auditory Perception/physiology
17.
Neuroimage Clin ; 38: 103394, 2023.
Article in English | MEDLINE | ID: mdl-37003130

ABSTRACT

PURPOSE: Progressive apraxia of speech (PAOS) is a neurodegenerative disorder affecting the planning or programming of speech. Little is known about its magnetic susceptibility profile, which is indicative of biological processes such as iron deposition and demyelination. This study aims to clarify (1) the pattern of susceptibility in PAOS patients, (2) the susceptibility differences between the phonetic (characterized by a predominance of distorted sound substitutions and additions) and prosodic (characterized by a predominance of slow speech rate and segmentation) subtypes of PAOS, and (3) the relationships between susceptibility and symptom severity. METHODS: Twenty patients with PAOS (nine phonetic and eleven prosodic subtypes) were prospectively recruited and underwent a 3 Tesla MRI scan. They also underwent detailed speech, language, and neurological evaluations. Quantitative susceptibility maps (QSM) were reconstructed from multi-echo gradient echo MRI images. Region-of-interest analysis was conducted to estimate susceptibility coefficients in several subcortical and frontal regions. We compared susceptibility values between PAOS and an age-matched control group and performed a correlation analysis between susceptibilities and the phonetic and prosodic feature ratings of an apraxia of speech rating scale (ASRS). RESULTS: The magnetic susceptibility of PAOS was statistically greater than that of controls in subcortical regions (left putamen, left red nucleus, and right dentate nucleus) (p < 0.01, surviving FDR correction) and in the left white-matter precentral gyrus (p < 0.05, not surviving FDR correction). The prosodic patients showed greater susceptibilities than controls in these subcortical and precentral regions. The susceptibility in the left red nucleus and in the left precentral gyrus correlated with the prosodic sub-score of the ASRS. CONCLUSION: Magnetic susceptibility in PAOS patients was greater than in controls, mainly in the subcortical regions. While larger samples are needed before QSM can be considered ready for clinical differential diagnosis, the present study contributes to our understanding of magnetic susceptibility changes and the pathophysiology of PAOS.


Subject(s)
Apraxias; Motor Cortex; Humans; Brain/diagnostic imaging; Speech/physiology; Apraxias/diagnostic imaging; Magnetic Resonance Imaging
18.
Neuroimage ; 272: 120040, 2023 05 15.
Article in English | MEDLINE | ID: mdl-36935084

ABSTRACT

During listening, brain activity tracks the rhythmic structures of speech signals. Here, we directly dissociated the contribution of neural envelope tracking in the processing of speech acoustic cues from that related to linguistic processing. We examined the neural changes associated with the comprehension of Noise-Vocoded (NV) speech using magnetoencephalography (MEG). Participants listened to NV sentences in a 3-phase training paradigm: (1) pre-training, where NV stimuli were barely comprehended, (2) training, with exposure to the original clear version of the speech stimulus, and (3) post-training, where the same stimuli gained intelligibility from the training phase. Using this paradigm, we tested whether the neural response to a speech signal was modulated by its intelligibility without any change in its acoustic structure. To test the influence of spectral degradation on neural envelope tracking independently of training, participants listened to two types of NV sentences (4-band and 2-band NV speech), but were only trained to understand 4-band NV speech. Significant changes in neural tracking were observed in the delta range in relation to the acoustic degradation of speech. However, we failed to find a direct effect of intelligibility on the neural tracking of the speech envelope in both the theta and delta ranges, in both auditory regions-of-interest and whole-brain sensor-space analyses. This suggests that acoustics greatly influence the neural tracking response to the speech envelope, and that caution needs to be taken when choosing control signals for speech-brain tracking analyses, considering that a slight change in acoustic parameters can have strong effects on the neural tracking response.
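For orientation, a minimal sketch of n-band noise vocoding as generally practiced (band-pass filter, extract each band's envelope, modulate band-limited noise, sum); the band edges, filter order, and placeholder signal are assumptions and not the study's exact stimulus-generation code.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def noise_vocode(signal, fs, n_bands, f_lo=100.0, f_hi=7000.0):
    """Noise-vocode a signal: per band, modulate band-limited noise by the
    band's amplitude envelope, then sum the bands."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)   # log-spaced band edges
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        band = filtfilt(b, a, signal)
        envelope = np.abs(hilbert(band))            # amplitude envelope of the band
        noise = filtfilt(b, a, rng.standard_normal(len(signal)))
        out += envelope * noise
    return out / np.max(np.abs(out))

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
speech_like = np.sin(2 * np.pi * 220 * t) * (1 + np.sin(2 * np.pi * 4 * t))  # placeholder signal
nv4 = noise_vocode(speech_like, fs, n_bands=4)      # 4-band NV (trained condition)
nv2 = noise_vocode(speech_like, fs, n_bands=2)      # 2-band NV (untrained control)
```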


Subject(s)
Speech Perception; Speech; Humans; Speech/physiology; Acoustic Stimulation; Speech Perception/physiology; Magnetoencephalography; Noise; Speech Intelligibility
19.
J Neural Eng ; 20(2)2023 03 09.
Article in English | MEDLINE | ID: mdl-36812597

ABSTRACT

Objective. The human brain tracks the temporal envelope of speech, which contains essential cues for speech understanding. Linear models are the most common tool to study neural envelope tracking. However, information on how speech is processed can be lost since nonlinear relations are precluded. Analysis based on mutual information (MI), on the other hand, can detect both linear and nonlinear relations and is gradually becoming more popular in the field of neural envelope tracking. Yet, several different approaches to calculating MI are applied with no consensus on which approach to use. Furthermore, the added value of nonlinear techniques remains a subject of debate in the field. The present paper aims to resolve these open questions. Approach. We analyzed electroencephalography (EEG) data of participants listening to continuous speech and applied MI analyses and linear models. Main results. Comparing the different MI approaches, we conclude that results are most reliable and robust using the Gaussian copula approach, which first transforms the data to standard Gaussians. With this approach, the MI analysis is a valid technique for studying neural envelope tracking. Like linear models, it allows spatial and temporal interpretations of speech processing, peak latency analyses, and applications to multiple EEG channels combined. In a final analysis, we tested whether nonlinear components were present in the neural response to the envelope by first removing all linear components in the data. We robustly detected nonlinear components on the single-subject level using the MI analysis. Significance. We demonstrate that the human brain processes speech in a nonlinear way. Unlike linear models, the MI analysis detects such nonlinear relations, proving its added value to neural envelope tracking. In addition, the MI analysis retains spatial and temporal characteristics of speech processing, an advantage lost when using more complex (nonlinear) deep neural networks.
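A minimal sketch of Gaussian-copula mutual information between two one-dimensional signals, following the general copula idea named in the abstract rather than the authors' implementation: each variable is rank-transformed and mapped to a standard normal, and MI is computed analytically from the correlation of the resulting Gaussians.

```python
import numpy as np
from scipy.stats import norm, rankdata

def copnorm(x):
    """Gaussian-copula transform: ranks -> uniform -> standard normal."""
    return norm.ppf(rankdata(x) / (len(x) + 1))

def gcmi(x, y):
    """Gaussian-copula mutual information (bits) between two 1-D variables."""
    cx, cy = copnorm(x), copnorm(y)
    r = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log2(1 - r ** 2)

rng = np.random.default_rng(0)
envelope = rng.random(5000)
eeg = np.tanh(2 * envelope) + 0.5 * rng.standard_normal(5000)  # nonlinear dependence

print("GCMI(envelope, EEG):", round(gcmi(envelope, eeg), 3))
print("GCMI(envelope, noise):", round(gcmi(envelope, rng.standard_normal(5000)), 3))
```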


Subject(s)
Speech Perception; Humans; Acoustic Stimulation/methods; Speech Perception/physiology; Electroencephalography/methods; Brain/physiology; Auditory Perception; Speech/physiology
20.
J Neural Eng ; 20(1)2023 01 30.
Article in English | MEDLINE | ID: mdl-36630714

ABSTRACT

Objective. Speech imagery (SI) can be used as a reliable, natural, and user-friendly activation task for the development of brain-computer interfaces (BCI), which empower individuals with severe disabilities to interact with their environment. Functional near-infrared spectroscopy (fNIRS) is regarded as one of the most suitable brain imaging methods for developing BCI systems owing to its advantages of being non-invasive, portable, insensitive to motion artifacts, and having relatively high spatial resolution. Approach. To improve the classification performance of SI BCIs based on fNIRS, a novel paradigm was developed in this work by simplifying the articulation movements in SI, making the differences in articulation movements between different word imagery tasks clearer. An SI BCI was proposed to directly answer questions by covertly rehearsing the Chinese word for 'yes' or 'no', and an unconstrained rest task was also included in this BCI. The articulation movements of SI were simplified by retaining only the jaw and lip movements of the vowels in the Chinese Pinyin of the two words. Main results. Compared with conventional speech imagery, simplifying the articulation movements in SI generated more distinct brain activity across tasks, which led to more differentiable temporal features and significantly higher classification performance. The average 3-class classification accuracies of the proposed paradigm across all 20 participants reached 69.6% and 60.2%, which were about 10.8% and 5.6% higher (a significant improvement) than those of the conventional SI paradigm operated in the 0-10 s and 0-2.5 s time windows, respectively. Significance. These results suggest that simplifying the articulation movements in SI is promising for improving the classification performance of intuitive BCIs based on speech imagery.


Subject(s)
Brain-Computer Interfaces; Humans; Speech/physiology; Imagery, Psychotherapy; Brain/physiology; Movement; Electroencephalography/methods; Imagination/physiology