Pesquisa | BVS MTCI Américas

1.

Neural tracking measures of speech intelligibility: Manipulating intelligibility while keeping acoustics unchanged.

Karunathilake, I M Dushyanthi; Kulasingham, Joshua P; Simon, Jonathan Z.

Proc Natl Acad Sci U S A ; 120(49): e2309166120, 2023 Dec 05.

Artigo em Inglês | MEDLINE | ID: mdl-38032934

RESUMO

Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRFs analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.

Assuntos

Inteligibilidade da Fala , Percepção da Fala , Inteligibilidade da Fala/fisiologia , Estimulação Acústica/métodos , Fala/fisiologia , Ruído , Acústica , Magnetoencefalografia/métodos , Percepção da Fala/fisiologia

2.

Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech.

Corcoran, Andrew W; Perera, Ricardo; Koroma, Matthieu; Kouider, Sid; Hohwy, Jakob; Andrillon, Thomas.

Cereb Cortex ; 33(3): 691-708, 2023 01 05.

Artigo em Inglês | MEDLINE | ID: mdl-35253871

RESUMO

Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.

Assuntos

Motivação , Percepção da Fala , Ruído , Eletroencefalografia , Estimulação Acústica , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia

3.

Predicting speech intelligibility from EEG in a non-linear classification paradigm.

Accou, Bernd; Jalilpour Monesi, Mohammad; Van Hamme, Hugo; Francart, Tom.

J Neural Eng ; 18(6)2021 11 15.

Artigo em Inglês | MEDLINE | ID: mdl-34706347

RESUMO

Objective.Currently, only behavioral speech understanding tests are available, which require active participation of the person being tested. As this is infeasible for certain populations, an objective measure of speech intelligibility is required. Recently, brain imaging data has been used to establish a relationship between stimulus and brain response. Linear models have been successfully linked to speech intelligibility but require per-subject training. We present a deep-learning-based model incorporating dilated convolutions that operates in a match/mismatch paradigm. The accuracy of the model's match/mismatch predictions can be used as a proxy for speech intelligibility without subject-specific (re)training.Approach.We evaluated the performance of the model as a function of input segment length, electroencephalography (EEG) frequency band and receptive field size while comparing it to multiple baseline models. Next, we evaluated performance on held-out data and finetuning. Finally, we established a link between the accuracy of our model and the state-of-the-art behavioral MATRIX test.Main results.The dilated convolutional model significantly outperformed the baseline models for every input segment length, for all EEG frequency bands except the delta and theta band, and receptive field sizes between 250 and 500 ms. Additionally, finetuning significantly increased the accuracy on a held-out dataset. Finally, a significant correlation (r= 0.59,p= 0.0154) was found between the speech reception threshold (SRT) estimated using the behavioral MATRIX test and our objective method.Significance.Our method is the first to predict the SRT from EEG for unseen subjects, contributing to objective measures of speech intelligibility.

Assuntos

Inteligibilidade da Fala , Percepção da Fala , Estimulação Acústica , Encéfalo , Eletroencefalografia/métodos , Audição/fisiologia , Humanos , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia

4.

Frontotemporal activation differs between perception of simulated cochlear implant speech and speech in background noise: An image-based fNIRS study.

Defenderfer, Jessica; Forbes, Samuel; Wijeakumar, Sobanawartiny; Hedrick, Mark; Plyler, Patrick; Buss, Aaron T.

Neuroimage ; 240: 118385, 2021 10 15.

Artigo em Inglês | MEDLINE | ID: mdl-34256138

RESUMO

In this study we used functional near-infrared spectroscopy (fNIRS) to investigate neural responses in normal-hearing adults as a function of speech recognition accuracy, intelligibility of the speech stimulus, and the manner in which speech is distorted. Participants listened to sentences and reported aloud what they heard. Speech quality was distorted artificially by vocoding (simulated cochlear implant speech) or naturally by adding background noise. Each type of distortion included high and low-intelligibility conditions. Sentences in quiet were used as baseline comparison. fNIRS data were analyzed using a newly developed image reconstruction approach. First, elevated cortical responses in the middle temporal gyrus (MTG) and middle frontal gyrus (MFG) were associated with speech recognition during the low-intelligibility conditions. Second, activation in the MTG was associated with recognition of vocoded speech with low intelligibility, whereas MFG activity was largely driven by recognition of speech in background noise, suggesting that the cortical response varies as a function of distortion type. Lastly, an accuracy effect in the MFG demonstrated significantly higher activation during correct perception relative to incorrect perception of speech. These results suggest that normal-hearing adults (i.e., untrained listeners of vocoded stimuli) do not exploit the same attentional mechanisms of the frontal cortex used to resolve naturally degraded speech and may instead rely on segmental and phonetic analyses in the temporal lobe to discriminate vocoded speech.

Assuntos

Estimulação Acústica/métodos , Implantes Cocleares , Lobo Frontal/fisiologia , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Lobo Temporal/fisiologia , Adolescente , Adulto , Feminino , Lobo Frontal/diagnóstico por imagem , Humanos , Masculino , Ruído/efeitos adversos , Espectroscopia de Luz Próxima ao Infravermelho/métodos , Lobo Temporal/diagnóstico por imagem , Adulto Jovem

5.

Signal envelope and speech intelligibility differentially impact auditory motion perception.

Warnecke, Michaela; Litovsky, Ruth Y.

Sci Rep ; 11(1): 15117, 2021 07 23.

Artigo em Inglês | MEDLINE | ID: mdl-34302032

RESUMO

Our acoustic environment contains a plethora of complex sounds that are often in motion. To gauge approaching danger and communicate effectively, listeners need to localize and identify sounds, which includes determining sound motion. This study addresses which acoustic cues impact listeners' ability to determine sound motion. Signal envelope (ENV) cues are implicated in both sound motion tracking and stimulus intelligibility, suggesting that these processes could be competing for sound processing resources. We created auditory chimaera from speech and noise stimuli and varied the number of frequency bands, effectively manipulating speech intelligibility. Normal-hearing adults were presented with stationary or moving chimaeras and reported perceived sound motion and content. Results show that sensitivity to sound motion is not affected by speech intelligibility, but shows a clear difference for original noise and speech stimuli. Further, acoustic chimaera with speech-like ENVs which had intelligible content induced a strong bias in listeners to report sounds as stationary. Increasing stimulus intelligibility systematically increased that bias and removing intelligible content reduced it, suggesting that sound content may be prioritized over sound motion. These findings suggest that sound motion processing in the auditory system can be biased by acoustic parameters related to speech intelligibility.

Assuntos

Percepção Auditiva/fisiologia , Percepção de Movimento/fisiologia , Inteligibilidade da Fala/fisiologia , Estimulação Acústica/métodos , Adulto , Limiar Auditivo/fisiologia , Sinais (Psicologia) , Feminino , Audição/fisiologia , Testes Auditivos/métodos , Humanos , Masculino , Movimento (Física) , Ruído , Mascaramento Perceptivo/fisiologia , Som , Acústica da Fala , Percepção da Fala/fisiologia , Adulto Jovem

6.

Spectrally specific temporal analyses of spike-train responses to complex sounds: A unifying framework.

Parida, Satyabrata; Bharadwaj, Hari; Heinz, Michael G.

PLoS Comput Biol ; 17(2): e1008155, 2021 02.

Artigo em Inglês | MEDLINE | ID: mdl-33617548

RESUMO

Significant scientific and translational questions remain in auditory neuroscience surrounding the neural correlates of perception. Relating perceptual and neural data collected from humans can be useful; however, human-based neural data are typically limited to evoked far-field responses, which lack anatomical and physiological specificity. Laboratory-controlled preclinical animal models offer the advantage of comparing single-unit and evoked responses from the same animals. This ability provides opportunities to develop invaluable insight into proper interpretations of evoked responses, which benefits both basic-science studies of neural mechanisms and translational applications, e.g., diagnostic development. However, these comparisons have been limited by a disconnect between the types of spectrotemporal analyses used with single-unit spike trains and evoked responses, which results because these response types are fundamentally different (point-process versus continuous-valued signals) even though the responses themselves are related. Here, we describe a unifying framework to study temporal coding of complex sounds that allows spike-train and evoked-response data to be analyzed and compared using the same advanced signal-processing techniques. The framework uses a set of peristimulus-time histograms computed from single-unit spike trains in response to polarity-alternating stimuli to allow advanced spectral analyses of both slow (envelope) and rapid (temporal fine structure) response components. Demonstrated benefits include: (1) novel spectrally specific temporal-coding measures that are less confounded by distortions due to hair-cell transduction, synaptic rectification, and neural stochasticity compared to previous metrics, e.g., the correlogram peak-height, (2) spectrally specific analyses of spike-train modulation coding (magnitude and phase), which can be directly compared to modern perceptually based models of speech intelligibility (e.g., that depend on modulation filter banks), and (3) superior spectral resolution in analyzing the neural representation of nonstationary sounds, such as speech and music. This unifying framework significantly expands the potential of preclinical animal models to advance our understanding of the physiological correlates of perceptual deficits in real-world listening following sensorineural hearing loss.

Assuntos

Percepção Auditiva/fisiologia , Potenciais Evocados Auditivos/fisiologia , Modelos Neurológicos , Estimulação Acústica , Animais , Chinchila/fisiologia , Nervo Coclear/fisiologia , Biologia Computacional , Modelos Animais de Doenças , Perda Auditiva Neurossensorial/fisiopatologia , Perda Auditiva Neurossensorial/psicologia , Humanos , Modelos Animais , Dinâmica não Linear , Psicoacústica , Som , Análise Espaço-Temporal , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Pesquisa Translacional Biomédica

7.

Long-term Outcomes in Down Syndrome Children After Cochlear Implantation: Particular Issues and Considerations.

Clarós, Pedro; Remjasz, Agnieszka; Clarós-Pujol, Astrid; Pujol, Carmen; Clarós, Andrés; Wiatrow, Andrzej.

Otol Neurotol ; 40(10): 1278-1286, 2019 12.

Artigo em Inglês | MEDLINE | ID: mdl-31634275

RESUMO

OBJECTIVE: The aim of the study was to analyze the long-term outcomes after cochlear implantation in deaf children with Down syndrome (DS) regarding age at the first implantation and refer the results to preoperative radiological findings as well as postoperative auditory and speech performance. Additionally, the influence of the age at implantation and duration of CI use on postoperative hearing and language skills were closely analyzed in children with DS. STUDY DESIGN: Retrospective analysis. SETTING: Referral center (Cochlear Implant Center). MATERIALS AND METHODS: Nine children with Down syndrome were compared with 220 pediatric patients without additional mental disorders or genetic mutations. Patients were divided into four categories depending on the age of the first implantation: CAT1 (0-3 yr), CAT2 (4-5 yr), CAT3 (6-7 yr), and CAT4 (8-17 yr). The auditory performance was assessed with the meaningful auditory integration scales (MAIS) and categories of auditory performance (CAP) scales. The speech and language development were further evaluated with meaningful use of speech scale (MUSS) and speech intelligibility rating (SIR). The postoperative speech skills were analyzed and compared between the study group and the reference group by using nonparametric statistical tests. Anatomic abnormalities of the inner ear were examined using magnetic resonance imaging (MRI) and high-resolution computed tomography of the temporal bones (HRCT). RESULTS: The mean follow-up time was 14.9 years (range, 13.1-18.3 yr). Patients with DS received a multichannel implant at a mean age of 75.3 months (SD 27.9; ranging from 21 to 127 mo) and 220 non-syndromic children from reference group at a mean age of 51.4 months (SD 34.2; ranging from 9 to 167 mo). The intraoperative neural response was present in all cases. The auditory and speech performance improved in each DS child. The postoperative mean CAP and SIR scores were 4.4 (SD 0.8) and 3.2 (SD 0.6), respectively. The average of scores in MUSS and MAIS/IT-MAIS scales was 59.8% (SD 0.1) and 76.9% (SD 0.1), respectively. Gathered data indicates that children with DS implanted with CI at a younger age (<6 years of age) benefited from the CI more than children implanted later in life, similarly in a control group. There were additional anomalies of the temporal bone, external, middle, or inner ear observed in 90% of DS children, basing on MRI or HRCT. CONCLUSIONS: The early cochlear implantation in children with DS is a similarly useful method in treating severe to profound sensorineural hearing loss (SNHL) as in non-syndromic patients, although the development of speech skills present differently. Due to a higher prevalence of ear and temporal bone malformations, detailed diagnostic imaging should be taken into account before the CI qualification. Better postoperative outcomes may be achieved through comprehensive care from parents/guardians and speech therapists thanks to intensive and systematic rehabilitation.

Assuntos

Implante Coclear , Síndrome de Down/complicações , Perda Auditiva Neurossensorial/cirurgia , Criança , Pré-Escolar , Implante Coclear/métodos , Implantes Cocleares , Feminino , Audição/fisiologia , Testes Auditivos , Humanos , Lactente , Desenvolvimento da Linguagem , Masculino , Período Pós-Operatório , Estudos Retrospectivos , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia

8.

The effect of envelope modulations on binaural processing.

Goupell, Matthew J; Fong, Stephen; Stakhovskaya, Olga.

Hear Res ; 379: 117-127, 2019 08.

Artigo em Inglês | MEDLINE | ID: mdl-31154164

RESUMO

An experiment was performed with 10 young normal-hearing listeners that attempted to determine if envelope modulations affected binaural processing in bandlimited pulse trains. Listeners detected an interaurally out-of-phase carrier pulse train in the presence of different amplitude modulations. The peaks of the pulses were constant (called "flat" or F), followed envelope modulations from an interaurally correlated 50-Hz bandwidth noise (called CM), or followed modulations from an interaurally uncorrelated noise (called UM). The pulse rate was varied from 50 to 500 pulses per second (pps) and the center frequency (CF) was 4 or 8â¯kHz. It was hypothesized that CM would cause no change or an increase in performance compared to F; UM would cause a decrease because of the blurring of the binaural detection cue. There was a small but significant decrease from F to CM (inconsistent with the hypothesis) and a further decrease from CM to UM (consistent with the hypothesis). Critically, there was a significant envelope by rate interaction caused by a decrease from F to CM for the 200-300 pps rates. The data can be explained by a subject-based factor, where some listeners experienced interaural envelope decorrelation when the sound was encoded by the auditory system that reduced performance when the modulations were present. Since the decrease in performance between F and CM conditions was small, it seems that most young normal-hearing listeners have very similar encoding of modulated stimuli across the ears. This type of task, when further optimized, may be able to assess if hearing-impaired populations experience interaural decorrelation from encoding modulated stimuli and therefore could help better understand the limited spatial hearing in populations like cochlear-implant users.

Assuntos

Percepção Auditiva/fisiologia , Localização de Som/fisiologia , Estimulação Acústica , Adulto , Implantes Cocleares/estatística & dados numéricos , Lateralidade Funcional/fisiologia , Voluntários Saudáveis , Humanos , Psicoacústica , Processamento de Sinais Assistido por Computador , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Adulto Jovem

9.

ParkinSong: A Controlled Trial of Singing-Based Therapy for Parkinson's Disease.

Tamplin, Jeanette; Morris, Meg E; Marigliani, Caterina; Baker, Felicity A; Vogel, Adam P.

Neurorehabil Neural Repair ; 33(6): 453-463, 2019 06.

Artigo em Inglês | MEDLINE | ID: mdl-31081485

RESUMO

Background. Communication impairment is one of the most common symptoms of Parkinson's disease (PD), significantly affecting quality of life. Singing shares many of the neural networks and structural mechanisms used during speech and, thus, has potential for therapeutic application to address speech disorders. Objective. To explore the effects of an interdisciplinary singing-based therapeutic intervention (ParkinSong) on voice and communication in people with PD. Methods. A controlled trial compared the effects of the ParkinSong intervention with an active control condition at 2 dosage levels (weekly vs monthly) over 3 months, on voice, speech, respiratory strength, and voice-related quality-of-life outcomes for 75 people living with PD. The interdisciplinary ParkinSong model comprised high-effort vocal and respiratory tasks, speech exercises, group singing, and social communication opportunities. Results. ParkinSong intervention participants demonstrated significant improvements in vocal intensity (P = .018), maximum expiratory pressure (P = .032), and voice-related quality of life (P = .043) in comparison to controls. Weekly ParkinSong participants increased vocal intensity more than monthly participants (P = .011). Vocal intensity declined in nontreatment control groups. No statistical differences between groups on maximum phonation length or maximum inspiratory pressure were observed at 3 months. Conclusions. ParkinSong is an engaging intervention with the potential to increase loudness and respiratory function in people with mild to moderately severe PD.

Assuntos

Exercícios Respiratórios , Comunicação , Relações Interpessoais , Musicoterapia , Doença de Parkinson/fisiopatologia , Doença de Parkinson/reabilitação , Psicoterapia de Grupo , Canto , Distúrbios da Fala/fisiopatologia , Distúrbios da Fala/reabilitação , Fonoterapia , Idoso , Idoso de 80 Anos ou mais , Exercícios Respiratórios/métodos , Terapia Combinada , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Musicoterapia/métodos , Doença de Parkinson/complicações , Psicoterapia de Grupo/métodos , Índice de Gravidade de Doença , Distúrbios da Fala/etiologia , Inteligibilidade da Fala/fisiologia , Fonoterapia/métodos , Resultado do Tratamento

10.

Effects of Task Difficulty on Neural Processes Underlying Semantics: An Event-Related Potentials Study.

Kemp, Amy; Eddins, David; Shrivastav, Rahul; Hampton Wray, Amanda.

J Speech Lang Hear Res ; 62(2): 367-386, 2019 02 26.

Artigo em Inglês | MEDLINE | ID: mdl-30950685

RESUMO

Purpose Improving the ability to listen efficiently in noisy environments is a critical goal for hearing rehabilitation. However, understanding of the impact of difficult listening conditions on language processing is limited. The current study evaluated the neural processes underlying semantics in challenging listening conditions. Method Thirty adults with normal hearing completed an auditory sentence processing task in 4-talker babble. Event-related brain potentials were elicited by the final word in high- or low-context sentences, where the final word was either highly expected or not expected, followed by a 4-alternative forced-choice response with either longer (1,000 ms), middle (700 ms), or shorter (400 ms) response time deadlines (RTDs). Results Behavioral accuracy was reduced, and reactions times were faster for shorter RTDs. N400 amplitudes, reflecting ease of lexical access, were larger when elicited by target words in low-context sentences followed by shorter compared with longer RTDs. Conclusions These results reveal that more neural resources are allocated for semantic processing/lexical access when listening difficulty increases. Differences between RTDs may reflect increased attentional allocation for shorter RTDs. These findings suggest that situational listening demands can impact the demands for cognitive resources engaged in language processing, which could significantly impact listener experiences across environments.

Assuntos

Potenciais Evocados/fisiologia , Semântica , Estimulação Acústica , Adolescente , Adulto , Limiar Auditivo/fisiologia , Comunicação , Eletroencefalografia , Feminino , Humanos , Idioma , Masculino , Desempenho Psicomotor/fisiologia , Tempo de Reação , Inteligibilidade da Fala/fisiologia , Adulto Jovem

11.

Sentence Context Facilitation for Children's and Adults' Recognition of Native- and Nonnative-Accented Speech.

Bent, Tessa; Holt, Rachael Frush; Miller, Katherine; Libersky, Emma.

J Speech Lang Hear Res ; 62(2): 423-433, 2019 02 26.

Artigo em Inglês | MEDLINE | ID: mdl-30950691

RESUMO

Purpose Supportive semantic and syntactic information can increase children's and adults' word recognition accuracy in adverse listening conditions. However, there are inconsistent findings regarding how a talker's accent or dialect modulates these context effects. Here, we compare children's and adults' abilities to capitalize on sentence context to overcome misleading acoustic-phonetic cues in nonnative-accented speech. Method Monolingual American English-speaking 5- to 7-year-old children ( n = 90) and 18- to 35-year-old adults ( n = 30) were presented with full sentences or the excised final word from each of the sentences and repeated what they heard. Participants were randomly assigned to 1 of 2 conditions: native-accented (Midland American English) or nonnative-accented (Spanish- and Japanese-accented English) speech. Participants also completed the NIH Toolbox Picture Vocabulary Test. Results Children and adults benefited from sentence context for both native- and nonnative-accent talkers, but the benefit was greater for nonnative than native talkers. Furthermore, adults showed a greater context benefit than children for nonnative talkers, but the 2 age groups showed a similar benefit for native talkers. Children's age and vocabulary scores both correlated with context benefit. Conclusions The cognitive-linguistic development that occurs between the early school-age years and adulthood may increase listeners' abilities to capitalize on top-down cues for lexical identification with nonnative-accented speech. These results have implications for the perception of speech with source degradation, including speech sound disorders, hearing loss, or signal processing that does not faithfully represent the original signal.

Assuntos

Reconhecimento Psicológico/fisiologia , Inteligibilidade da Fala/fisiologia , Estimulação Acústica , Adolescente , Adulto , Criança , Pré-Escolar , Compreensão/fisiologia , Sinais (Psicologia) , Feminino , Humanos , Masculino , Ruído , Fonética , Semântica , Vocabulário , Adulto Jovem

12.

Spectrotemporal modulation provides a unifying framework for auditory cortical asymmetries.

Flinker, Adeen; Doyle, Werner K; Mehta, Ashesh D; Devinsky, Orrin; Poeppel, David.

Nat Hum Behav ; 3(4): 393-405, 2019 04.

Artigo em Inglês | MEDLINE | ID: mdl-30971792

RESUMO

The principles underlying functional asymmetries in cortex remain debated. For example, it is accepted that speech is processed bilaterally in auditory cortex, but a left hemisphere dominance emerges when the input is interpreted linguistically. The mechanisms, however, are contested, such as what sound features or processing principles underlie laterality. Recent findings across species (humans, canines and bats) provide converging evidence that spectrotemporal sound features drive asymmetrical responses. Typically, accounts invoke models wherein the hemispheres differ in time-frequency resolution or integration window size. We develop a framework that builds on and unifies prevailing models, using spectrotemporal modulation space. Using signal processing techniques motivated by neural responses, we test this approach, employing behavioural and neurophysiological measures. We show how psychophysical judgements align with spectrotemporal modulations and then characterize the neural sensitivities to temporal and spectral modulations. We demonstrate differential contributions from both hemispheres, with a left lateralization for temporal modulations and a weaker right lateralization for spectral modulations. We argue that representations in the modulation domain provide a more mechanistic basis to account for lateralization in auditory cortex.

Assuntos

Córtex Auditivo/fisiologia , Lateralidade Funcional/fisiologia , Percepção da Altura Sonora/fisiologia , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Estimulação Acústica/métodos , Adolescente , Adulto , Eletrocorticografia/métodos , Feminino , Humanos , Magnetoencefalografia/métodos , Masculino , Psicofísica/métodos , Processamento de Sinais Assistido por Computador , Fatores de Tempo , Adulto Jovem

13.

EEG can predict speech intelligibility.

Iotzov, Ivan; Parra, Lucas C.

J Neural Eng ; 16(3): 036008, 2019 06.

Artigo em Inglês | MEDLINE | ID: mdl-30776785

RESUMO

OBJECTIVE: Speech signals have a remarkable ability to entrain brain activity to the rapid fluctuations of speech sounds. For instance, one can readily measure a correlation of the sound amplitude with the evoked responses of the electroencephalogram (EEG), and the strength of this correlation is indicative of whether the listener is attending to the speech. In this study we asked whether this stimulus-response correlation is also predictive of speech intelligibility. APPROACH: We hypothesized that when a listener fails to understand the speech in adverse hearing conditions, attention wanes and stimulus-response correlation also drops. To test this, we measure a listener's ability to detect words in noisy speech while recording their brain activity using EEG. We alter intelligibility without changing the acoustic stimulus by pairing it with congruent and incongruent visual speech. MAIN RESULTS: For almost all subjects we found that an improvement in speech detection coincided with an increase in correlation between the noisy speech and the EEG measured over a period of 30 min. SIGNIFICANCE: We conclude that simultaneous recordings of the perceived sound and the corresponding EEG response may be a practical tool to assess speech intelligibility in the context of hearing aids.

Assuntos

Estimulação Acústica/métodos , Encéfalo/fisiologia , Eletroencefalografia/métodos , Inteligibilidade da Fala/fisiologia , Percepção Auditiva/fisiologia , Feminino , Previsões , Humanos , Masculino , Estimulação Luminosa/métodos , Percepção da Fala/fisiologia , Percepção Visual/fisiologia , Adulto Jovem

14.

Time-compression thresholds for Mandarin sentences in normal-hearing and cochlear implant listeners.

Meng, Qinglin; Wang, Xianren; Cai, Yuexin; Kong, Fanhui; Buck, Alexa Nadezhda; Yu, Guangzheng; Zheng, Nengheng; Schnupp, Jan W H.

Hear Res ; 374: 58-68, 2019 03 15.

Artigo em Inglês | MEDLINE | ID: mdl-30732921

RESUMO

Faster speech may facilitate more efficient communication, but if speech is too fast it becomes unintelligible. The maximum speeds at which Mandarin words were intelligible in a sentence context were quantified for normal hearing (NH) and cochlear implant (CI) listeners by measuring time-compression thresholds (TCTs) in an adaptive staircase procedure. In Experiment 1, both original and CI-vocoded time-compressed speech from the MSP (Mandarin speech perception) and MHINT (Mandarin hearing in noise test) corpora was presented to 10 NH subjects over headphones. In Experiment 2, original time-compressed speech was presented to 10 CI subjects and another 10 NH subjects through a loudspeaker in a soundproof room. Sentences were time-compressed without changing their spectral profile, and were presented up to three times within a single trial. At the end of each trial, the number of correctly identified words in the sentence was scored. A 50%-word recognition threshold was tracked in the psychophysical procedure. The observed median TCTs were very similar for MSP and MHINT speech. For NH listeners, median TCTs were around 16.7 syllables/s for normal speech, and 11.8 and 8.6 syllables/s respectively for 8 and 4 channel tone-carrier vocoded speech. For CI listeners, TCTs were only around 6.8 syllables/s. The interquartile range of the TCTs within each cohort was smaller than 3.0 syllables/s. Speech reception thresholds in noise were also measured in Experiment 2, and were found to be strongly correlated with TCTs for CI listeners. In conclusion, the Mandarin sentence TCTs were around 16.7 syllables/s for most NH subjects, but rarely faster than 10.0 syllables/s for CI listeners, which quantitatively illustrated upper limits of fast speech information processing with CIs.

Assuntos

Limiar Auditivo/fisiologia , Implantes Cocleares , Idioma , Inteligibilidade da Fala/fisiologia , Estimulação Acústica , Adulto , Algoritmos , Criança , Implantes Cocleares/estatística & dados numéricos , Feminino , Voluntários Saudáveis , Humanos , Masculino , Psicoacústica , Processamento de Sinais Assistido por Computador , Acústica da Fala , Percepção da Fala/fisiologia , Fatores de Tempo , Adulto Jovem

15.

Towards reconstructing intelligible speech from the human auditory cortex.

Akbari, Hassan; Khalighinejad, Bahar; Herrero, Jose L; Mehta, Ashesh D; Mesgarani, Nima.

Sci Rep ; 9(1): 874, 2019 01 29.

Artigo em Inglês | MEDLINE | ID: mdl-30696881

RESUMO

Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthetic to establish a direct communication with the brain and has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state-of-the-art in speech neuroprosthesis, we combined the recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated the dependence of reconstruction accuracy on linear and nonlinear (deep neural network) regression methods and the acoustic representation that is used as the target of reconstruction, including auditory spectrogram and speech synthesis parameters. In addition, we compared the reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving the intelligibility by 65% over the baseline method which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which not only can restore communications for paralyzed patients but also have the potential to transform human-computer interaction technologies.

Assuntos

Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Fala/fisiologia , Estimulação Acústica/métodos , Algoritmos , Córtex Auditivo/fisiologia , Mapeamento Encefálico , Aprendizado Profundo , Potenciais Evocados Auditivos/fisiologia , Humanos , Redes Neurais de Computação , Próteses Neurais

16.

Effects of Slow- and Fast-Acting Compression on Hearing-Impaired Listeners' Consonant-Vowel Identification in Interrupted Noise.

Kowalewski, Borys; Zaar, Johannes; Fereczkowski, Michal; MacDonald, Ewen N; Strelcyk, Olaf; May, Tobias; Dau, Torsten.

Trends Hear ; 22: 2331216518800870, 2018.

Artigo em Inglês | MEDLINE | ID: mdl-30311552

RESUMO

There is conflicting evidence about the relative benefit of slow- and fast-acting compression for speech intelligibility. It has been hypothesized that fast-acting compression improves audibility at low signal-to-noise ratios (SNRs) but may distort the speech envelope at higher SNRs. The present study investigated the effects of compression with a nearly instantaneous attack time but either fast (10 ms) or slow (500 ms) release times on consonant identification in hearing-impaired listeners. Consonant-vowel speech tokens were presented at a range of presentation levels in two conditions: in the presence of interrupted noise and in quiet (with the compressor "shadow-controlled" by the corresponding mixture of speech and noise). These conditions were chosen to disentangle the effects of consonant audibility and noise-induced forward masking on speech intelligibility. A small but systematic intelligibility benefit of fast-acting compression was found in both the quiet and the noisy conditions for the lower speech levels. No detrimental effects of fast-acting compression were observed when the speech level exceeded the level of the noise. These findings suggest that fast-acting compression provides an audibility benefit in fluctuating interferers when compared with slow-acting compression while not substantially affecting the perception of consonants at higher SNRs.

Assuntos

Estimulação Acústica/métodos , Auxiliares de Audição , Perda Auditiva Neurossensorial/reabilitação , Espectrografia do Som/métodos , Inteligibilidade da Fala/fisiologia , Adulto , Idoso , Estudos de Casos e Controles , Feminino , Perda Auditiva Neurossensorial/diagnóstico , Humanos , Masculino , Fonética , Desenho de Prótese , Valores de Referência , Razão Sinal-Ruído , Teste do Limiar de Recepção da Fala , Adulto Jovem

17.

Vibro-Tactile Enhancement of Speech Intelligibility in Multi-talker Noise for Simulated Cochlear Implant Listening.

Fletcher, Mark D; Mills, Sean R; Goehring, Tobias.

Trends Hear ; 22: 2331216518797838, 2018.

Artigo em Inglês | MEDLINE | ID: mdl-30222089

RESUMO

Many cochlear implant (CI) users achieve excellent speech understanding in acoustically quiet conditions but most perform poorly in the presence of background noise. An important contributor to this poor speech-in-noise performance is the limited transmission of low-frequency sound information through CIs. Recent work has suggested that tactile presentation of this low-frequency sound information could be used to improve speech-in-noise performance for CI users. Building on this work, we investigated whether vibro-tactile stimulation can improve speech intelligibility in multi-talker noise. The signal used for tactile stimulation was derived from the speech-in-noise using a computationally inexpensive algorithm. Eight normal-hearing participants listened to CI simulated speech-in-noise both with and without concurrent tactile stimulation of their fingertip. Participants' speech recognition performance was assessed before and after a training regime, which took place over 3 consecutive days and totaled around 30 min of exposure to CI-simulated speech-in-noise with concurrent tactile stimulation. Tactile stimulation was found to improve the intelligibility of speech in multi-talker noise, and this improvement was found to increase in size after training. Presentation of such tactile stimulation could be achieved by a compact, portable device and offer an inexpensive and noninvasive means for improving speech-in-noise performance in CI users.

Assuntos

Estimulação Acústica/métodos , Implante Coclear/métodos , Perda Auditiva/cirurgia , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Adulto , Algoritmos , Audiometria da Fala/métodos , Percepção Auditiva/fisiologia , Limiar Auditivo/fisiologia , Implantes Cocleares , Feminino , Humanos , Masculino , Ruído , Estudos de Amostragem , Sensibilidade e Especificidade , Treinamento por Simulação , Localização de Som/fisiologia , Adulto Jovem

18.

Understanding dysrhythmic speech: When rhythm does not matter and learning does not happen.

Borrie, Stephanie A; Lansford, Kaitlin L; Barrett, Tyson S.

J Acoust Soc Am ; 143(5): EL379, 2018 05.

Artigo em Inglês | MEDLINE | ID: mdl-29857710

RESUMO

A positive relationship between rhythm perception and improved understanding of a naturally dysrhythmic speech signal, ataxic dysarthria, has been previously reported [Borrie, Lansford, and Barrett. (2017). J. Speech Lang. Hear. Res. 60, 3110-3117]. The current follow-on investigation suggests that this relationship depends on the nature of the dysrhythmia. When the corrupted rhythm cues are relatively predictable, affording some learnable acoustic regularity, the relationship is replicated. However, this relationship is nonexistent, along with any intelligibility improvements, when the corrupted rhythm cues are unpredictable. Findings highlight a key role for rhythm perception and distributional regularities in adaptation to dysrhythmic speech.

Assuntos

Estimulação Acústica/métodos , Disartria/fisiopatologia , Aprendizagem/fisiologia , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Adulto , Disartria/diagnóstico , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Adulto Jovem

19.

A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements.

Hauswald, Anne; Lithari, Chrysa; Collignon, Olivier; Leonardelli, Elisa; Weisz, Nathan.

Curr Biol ; 28(9): 1453-1459.e3, 2018 05 07.

Artigo em Inglês | MEDLINE | ID: mdl-29681475

RESUMO

Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex-that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope compared to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference of occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to absent acoustic forward speech envelope, but not to present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements.

Assuntos

Percepção da Fala/fisiologia , Fala/fisiologia , Córtex Visual/fisiologia , Estimulação Acústica , Adulto , Córtex Auditivo/fisiologia , Mapeamento Encefálico , Feminino , Humanos , Lábio , Leitura Labial , Magnetoencefalografia/métodos , Masculino , Córtex Motor/fisiologia , Movimento , Fonética , Inteligibilidade da Fala/fisiologia

20.

Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B.

Curr Biol ; 27(21): 3237-3247.e6, 2017 Nov 06.

Artigo em Inglês | MEDLINE | ID: mdl-29056453

RESUMO

Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges.

Assuntos

Percepção Auditiva/fisiologia , Mascaramento Perceptivo/fisiologia , Pessoas com Deficiência Auditiva , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia , Estimulação Acústica , Idoso , Método Duplo-Cego , Feminino , Humanos , Masculino

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA