Results 1 - 20 of 92
1.
Proc Natl Acad Sci U S A ; 120(49): e2309166120, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38032934

ABSTRACT

Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and, ultimately, meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide objective measures of speech comprehension.
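
For illustration, the three-band noise-vocoding manipulation can be sketched in a few lines of Python; the band edges, filter order, and normalization below are assumptions, since the abstract does not specify them.

```python
# Illustrative sketch of three-band noise vocoding (assumed band edges and
# filter settings; synthetic input stands in for a speech recording).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, band_edges=(100, 600, 1800, 4000)):
    """Replace the fine structure in each band with noise, keeping the envelope."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))              # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(speech)))
        out += env * carrier                     # envelope-modulated noise
    return out / np.max(np.abs(out))             # normalize

fs = 16000
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 220 * t) * (1 + np.sin(2 * np.pi * 3 * t)) / 2
vocoded = noise_vocode(demo, fs)
```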


Subject(s)
Speech Intelligibility, Speech Perception, Speech Intelligibility/physiology, Acoustic Stimulation/methods, Speech/physiology, Noise, Acoustics, Magnetoencephalography/methods, Speech Perception/physiology
2.
Cereb Cortex ; 33(3): 691-708, 2023 01 05.
Article in English | MEDLINE | ID: mdl-35253871

ABSTRACT

Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
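
The envelope-reconstruction (backward-model) analysis can be sketched as ridge regression from time-lagged EEG onto the stimulus envelope. The sketch below uses synthetic data; the lag range and regularization strength are assumptions, not the study's settings.

```python
# Minimal backward-model sketch: reconstruct the speech envelope from
# time-lagged multichannel EEG with ridge regression (synthetic data).
import numpy as np

def build_lagged(eeg, lags):
    """Stack time-lagged copies of all channels: (time, channels * lags)."""
    T, C = eeg.shape
    X = np.zeros((T, C * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        shifted[:max(lag, 0)] = 0                # zero out wrapped samples
        X[:, i * C:(i + 1) * C] = shifted
    return X

rng = np.random.default_rng(1)
T, C = 5000, 32                                  # samples, EEG channels
envelope = rng.standard_normal(T)
eeg = np.tile(envelope[:, None], (1, C)) + 2 * rng.standard_normal((T, C))

lags = range(0, 25)                              # about 0-200 ms at an assumed 128 Hz
X = build_lagged(eeg, lags)
lam = 1e2                                        # ridge parameter (illustrative)
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
reconstruction = X @ w
r = np.corrcoef(reconstruction, envelope)[0, 1]  # reconstruction accuracy
```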


Subject(s)
Motivation, Speech Perception, Noise, Electroencephalography, Acoustic Stimulation, Speech Intelligibility/physiology, Speech Perception/physiology
3.
J Neural Eng ; 18(6)2021 11 15.
Article in English | MEDLINE | ID: mdl-34706347

ABSTRACT

Objective. Currently, only behavioral speech understanding tests are available, which require active participation of the person being tested. As this is infeasible for certain populations, an objective measure of speech intelligibility is required. Recently, brain imaging data have been used to establish a relationship between stimulus and brain response. Linear models have been successfully linked to speech intelligibility but require per-subject training. We present a deep-learning-based model incorporating dilated convolutions that operates in a match/mismatch paradigm. The accuracy of the model's match/mismatch predictions can be used as a proxy for speech intelligibility without subject-specific (re)training. Approach. We evaluated the performance of the model as a function of input segment length, electroencephalography (EEG) frequency band, and receptive field size while comparing it to multiple baseline models. Next, we evaluated performance on held-out data and finetuning. Finally, we established a link between the accuracy of our model and the state-of-the-art behavioral MATRIX test. Main results. The dilated convolutional model significantly outperformed the baseline models for every input segment length, for all EEG frequency bands except the delta and theta bands, and for receptive field sizes between 250 and 500 ms. Additionally, finetuning significantly increased the accuracy on a held-out dataset. Finally, a significant correlation (r = 0.59, p = 0.0154) was found between the speech reception threshold (SRT) estimated using the behavioral MATRIX test and our objective method. Significance. Our method is the first to predict the SRT from EEG for unseen subjects, contributing to objective measures of speech intelligibility.
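
A minimal sketch of a dilated-convolution match/mismatch classifier is shown below (PyTorch). The layer counts, channel sizes, and similarity readout are assumptions for illustration; the abstract does not give the actual architecture.

```python
# Sketch of a dilated-convolution match/mismatch model: given EEG and two
# candidate speech envelopes, predict which envelope matches the EEG.
import torch
import torch.nn as nn

class MatchMismatch(nn.Module):
    def __init__(self, eeg_ch=64, hidden=16, layers=3):
        super().__init__()
        def stack(in_ch):
            mods, ch = [], in_ch
            for k in range(layers):
                # exponentially growing dilation enlarges the receptive field
                mods += [nn.Conv1d(ch, hidden, kernel_size=3, dilation=3 ** k),
                         nn.ReLU()]
                ch = hidden
            return nn.Sequential(*mods)
        self.eeg_net = stack(eeg_ch)
        self.env_net = stack(1)        # shared weights for both envelope candidates
        self.classify = nn.Linear(2, 1)

    def forward(self, eeg, env_a, env_b):
        e = self.eeg_net(eeg)
        a, b = self.env_net(env_a), self.env_net(env_b)
        # cosine similarity over channels, averaged over time
        sim_a = nn.functional.cosine_similarity(e, a, dim=1).mean(dim=1)
        sim_b = nn.functional.cosine_similarity(e, b, dim=1).mean(dim=1)
        return torch.sigmoid(self.classify(torch.stack([sim_a, sim_b], dim=1)))

model = MatchMismatch()
eeg = torch.randn(8, 64, 320)                    # batch, channels, 5 s at 64 Hz
env_match, env_mismatch = torch.randn(8, 1, 320), torch.randn(8, 1, 320)
p_match = model(eeg, env_match, env_mismatch)    # probability of the first envelope
```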


Subject(s)
Speech Intelligibility, Speech Perception, Acoustic Stimulation, Brain, Electroencephalography/methods, Hearing/physiology, Humans, Speech Intelligibility/physiology, Speech Perception/physiology
4.
Sci Rep ; 11(1): 15117, 2021 07 23.
Article in English | MEDLINE | ID: mdl-34302032

ABSTRACT

Our acoustic environment contains a plethora of complex sounds that are often in motion. To gauge approaching danger and communicate effectively, listeners need to localize and identify sounds, which includes determining sound motion. This study addresses which acoustic cues impact listeners' ability to determine sound motion. Signal envelope (ENV) cues are implicated in both sound motion tracking and stimulus intelligibility, suggesting that these processes could be competing for sound processing resources. We created auditory chimaeras from speech and noise stimuli and varied the number of frequency bands, effectively manipulating speech intelligibility. Normal-hearing adults were presented with stationary or moving chimaeras and reported perceived sound motion and content. Results show that sensitivity to sound motion is not affected by speech intelligibility but differs clearly between the original noise and speech stimuli. Further, acoustic chimaeras with speech-like ENVs and intelligible content induced a strong bias in listeners to report sounds as stationary. Increasing stimulus intelligibility systematically increased that bias, and removing intelligible content reduced it, suggesting that sound content may be prioritized over sound motion. These findings suggest that sound motion processing in the auditory system can be biased by acoustic parameters related to speech intelligibility.
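
The chimaera construction, combining the envelope of one sound with the temporal fine structure of another in each frequency band (after Smith, Delgutte, and Oxenham, 2002), can be sketched as follows; the band edges and filter settings are illustrative assumptions.

```python
# Sketch of an auditory chimaera: per band, take the Hilbert envelope of one
# sound and the fine structure (cosine of the Hilbert phase) of another.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def chimaera_band(sig_env, sig_tfs, fs, lo, hi):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    a = hilbert(sosfiltfilt(sos, sig_env))   # analytic signal of the first sound
    b = hilbert(sosfiltfilt(sos, sig_tfs))   # analytic signal of the second sound
    env = np.abs(a)                          # envelope of the first
    tfs = np.cos(np.angle(b))                # fine structure of the second
    return env * tfs

def chimaera(sig_env, sig_tfs, fs, band_edges):
    bands = [chimaera_band(sig_env, sig_tfs, fs, lo, hi)
             for lo, hi in zip(band_edges[:-1], band_edges[1:])]
    return np.sum(bands, axis=0)

fs = 16000
rng = np.random.default_rng(2)
speech_like = rng.standard_normal(fs)        # placeholders for speech and noise
noise = rng.standard_normal(fs)
# more bands -> envelope cues dominate -> higher intelligibility of the mix
mix = chimaera(speech_like, noise, fs, band_edges=np.geomspace(80, 7000, 9))
```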


Subject(s)
Auditory Perception/physiology, Motion Perception/physiology, Speech Intelligibility/physiology, Acoustic Stimulation/methods, Adult, Auditory Threshold/physiology, Cues, Female, Hearing/physiology, Hearing Tests/methods, Humans, Male, Motion, Noise, Perceptual Masking/physiology, Sound, Speech Acoustics, Speech Perception/physiology, Young Adult
5.
Neuroimage ; 240: 118385, 2021 10 15.
Article in English | MEDLINE | ID: mdl-34256138

ABSTRACT

In this study we used functional near-infrared spectroscopy (fNIRS) to investigate neural responses in normal-hearing adults as a function of speech recognition accuracy, intelligibility of the speech stimulus, and the manner in which speech is distorted. Participants listened to sentences and reported aloud what they heard. Speech quality was distorted artificially by vocoding (simulated cochlear implant speech) or naturally by adding background noise. Each type of distortion included high- and low-intelligibility conditions. Sentences in quiet were used as a baseline comparison. fNIRS data were analyzed using a newly developed image reconstruction approach. First, elevated cortical responses in the middle temporal gyrus (MTG) and middle frontal gyrus (MFG) were associated with speech recognition during the low-intelligibility conditions. Second, activation in the MTG was associated with recognition of vocoded speech with low intelligibility, whereas MFG activity was largely driven by recognition of speech in background noise, suggesting that the cortical response varies as a function of distortion type. Lastly, an accuracy effect in the MFG demonstrated significantly higher activation during correct perception relative to incorrect perception of speech. These results suggest that normal-hearing adults (i.e., untrained listeners of vocoded stimuli) do not exploit the same attentional mechanisms of the frontal cortex used to resolve naturally degraded speech and may instead rely on segmental and phonetic analyses in the temporal lobe to discriminate vocoded speech.


Subject(s)
Acoustic Stimulation/methods, Cochlear Implants, Frontal Lobe/physiology, Speech Intelligibility/physiology, Speech Perception/physiology, Temporal Lobe/physiology, Adolescent, Adult, Female, Frontal Lobe/diagnostic imaging, Humans, Male, Noise/adverse effects, Spectroscopy, Near-Infrared/methods, Temporal Lobe/diagnostic imaging, Young Adult
6.
PLoS Comput Biol ; 17(2): e1008155, 2021 02.
Article in English | MEDLINE | ID: mdl-33617548

ABSTRACT

Significant scientific and translational questions remain in auditory neuroscience surrounding the neural correlates of perception. Relating perceptual and neural data collected from humans can be useful; however, human-based neural data are typically limited to evoked far-field responses, which lack anatomical and physiological specificity. Laboratory-controlled preclinical animal models offer the advantage of comparing single-unit and evoked responses from the same animals. This ability provides opportunities to develop invaluable insight into proper interpretations of evoked responses, which benefits both basic-science studies of neural mechanisms and translational applications, e.g., diagnostic development. However, these comparisons have been limited by a disconnect between the types of spectrotemporal analyses used with single-unit spike trains and evoked responses, which arises because these response types are fundamentally different (point-process versus continuous-valued signals) even though the responses themselves are related. Here, we describe a unifying framework to study temporal coding of complex sounds that allows spike-train and evoked-response data to be analyzed and compared using the same advanced signal-processing techniques. The framework uses a set of peristimulus-time histograms computed from single-unit spike trains in response to polarity-alternating stimuli to allow advanced spectral analyses of both slow (envelope) and rapid (temporal fine structure) response components. Demonstrated benefits include: (1) novel spectrally specific temporal-coding measures that are less confounded by distortions due to hair-cell transduction, synaptic rectification, and neural stochasticity compared to previous metrics, e.g., the correlogram peak-height, (2) spectrally specific analyses of spike-train modulation coding (magnitude and phase), which can be directly compared to modern perceptually based models of speech intelligibility (e.g., that depend on modulation filter banks), and (3) superior spectral resolution in analyzing the neural representation of nonstationary sounds, such as speech and music. This unifying framework significantly expands the potential of preclinical animal models to advance our understanding of the physiological correlates of perceptual deficits in real-world listening following sensorineural hearing loss.
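
The core of the framework, as described above, is a sum/difference analysis of peristimulus-time histograms (PSTHs) to opposite-polarity stimuli: the sum emphasizes envelope coding and the difference emphasizes temporal fine structure. A toy sketch with synthetic spike counts follows; all rates and bin widths are illustrative, not the paper's.

```python
# Sketch of polarity-alternating PSTH analysis: a shared slow envelope drives
# both polarities, while fine structure flips sign with stimulus polarity.
import numpy as np

bin_s = 1e-4                                  # 0.1 ms PSTH bins
t = np.arange(0, 0.5, bin_s)
rng = np.random.default_rng(3)

envelope = 50 * (1 + np.sin(2 * np.pi * 4 * t))      # slow rate modulation
tfs = 30 * np.sin(2 * np.pi * 500 * t)               # rapid fine structure
psth_pos = rng.poisson(np.clip(envelope + tfs, 0, None) * bin_s, len(t))
psth_neg = rng.poisson(np.clip(envelope - tfs, 0, None) * bin_s, len(t))

psth_sum = (psth_pos + psth_neg) / 2          # envelope-dominated component
psth_diff = (psth_pos - psth_neg) / 2         # fine-structure-dominated component

# spectra of the two components separate slow and rapid response coding
env_spectrum = np.abs(np.fft.rfft(psth_sum))
tfs_spectrum = np.abs(np.fft.rfft(psth_diff))
```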


Subject(s)
Auditory Perception/physiology, Evoked Potentials, Auditory/physiology, Models, Neurological, Acoustic Stimulation, Animals, Chinchilla/physiology, Cochlear Nerve/physiology, Computational Biology, Disease Models, Animal, Hearing Loss, Sensorineural/physiopathology, Hearing Loss, Sensorineural/psychology, Humans, Models, Animal, Nonlinear Dynamics, Psychoacoustics, Sound, Spatio-Temporal Analysis, Speech Intelligibility/physiology, Speech Perception/physiology, Translational Research, Biomedical
7.
Otol Neurotol ; 40(10): 1278-1286, 2019 12.
Article in English | MEDLINE | ID: mdl-31634275

ABSTRACT

OBJECTIVE: The aim of the study was to analyze the long-term outcomes of cochlear implantation in deaf children with Down syndrome (DS) with regard to age at first implantation, and to relate the results to preoperative radiological findings as well as postoperative auditory and speech performance. Additionally, the influence of age at implantation and duration of CI use on postoperative hearing and language skills was closely analyzed in children with DS. STUDY DESIGN: Retrospective analysis. SETTING: Referral center (Cochlear Implant Center). MATERIALS AND METHODS: Nine children with Down syndrome were compared with 220 pediatric patients without additional mental disorders or genetic mutations. Patients were divided into four categories depending on age at first implantation: CAT1 (0-3 yr), CAT2 (4-5 yr), CAT3 (6-7 yr), and CAT4 (8-17 yr). Auditory performance was assessed with the meaningful auditory integration scale (MAIS) and the categories of auditory performance (CAP) scale. Speech and language development was further evaluated with the meaningful use of speech scale (MUSS) and the speech intelligibility rating (SIR). Postoperative speech skills were analyzed and compared between the study group and the reference group using nonparametric statistical tests. Anatomic abnormalities of the inner ear were examined using magnetic resonance imaging (MRI) and high-resolution computed tomography of the temporal bones (HRCT). RESULTS: The mean follow-up time was 14.9 years (range, 13.1-18.3 yr). Patients with DS received a multichannel implant at a mean age of 75.3 months (SD 27.9; range, 21-127 mo) and the 220 non-syndromic children of the reference group at a mean age of 51.4 months (SD 34.2; range, 9-167 mo). The intraoperative neural response was present in all cases. Auditory and speech performance improved in every DS child. The postoperative mean CAP and SIR scores were 4.4 (SD 0.8) and 3.2 (SD 0.6), respectively. The average scores on the MUSS and MAIS/IT-MAIS scales were 59.8% (SD 0.1) and 76.9% (SD 0.1), respectively. The gathered data indicate that children with DS implanted before 6 years of age benefited from the CI more than children implanted later in life, as was also seen in the control group. Additional anomalies of the temporal bone or of the external, middle, or inner ear were observed in 90% of DS children on MRI or HRCT. CONCLUSIONS: Early cochlear implantation in children with DS is as useful a method for treating severe to profound sensorineural hearing loss (SNHL) as it is in non-syndromic patients, although the development of speech skills proceeds differently. Due to the higher prevalence of ear and temporal bone malformations, detailed diagnostic imaging should be performed before CI qualification. Better postoperative outcomes may be achieved through comprehensive care from parents/guardians and speech therapists, combined with intensive and systematic rehabilitation.


Subject(s)
Cochlear Implantation, Down Syndrome/complications, Hearing Loss, Sensorineural/surgery, Child, Child, Preschool, Cochlear Implantation/methods, Cochlear Implants, Female, Hearing/physiology, Hearing Tests, Humans, Infant, Language Development, Male, Postoperative Period, Retrospective Studies, Speech Intelligibility/physiology, Speech Perception/physiology
8.
Hear Res ; 379: 117-127, 2019 08.
Article in English | MEDLINE | ID: mdl-31154164

ABSTRACT

An experiment with 10 young normal-hearing listeners attempted to determine whether envelope modulations affect binaural processing in bandlimited pulse trains. Listeners detected an interaurally out-of-phase carrier pulse train in the presence of different amplitude modulations. The pulse peaks either were constant (called "flat" or F), followed envelope modulations from an interaurally correlated 50-Hz-bandwidth noise (called CM), or followed modulations from an interaurally uncorrelated noise (called UM). The pulse rate was varied from 50 to 500 pulses per second (pps) and the center frequency (CF) was 4 or 8 kHz. It was hypothesized that CM would cause no change or an increase in performance compared to F, and that UM would cause a decrease because of the blurring of the binaural detection cue. There was a small but significant decrease from F to CM (inconsistent with the hypothesis) and a further decrease from CM to UM (consistent with the hypothesis). Critically, there was a significant envelope-by-rate interaction caused by a decrease from F to CM for the 200-300 pps rates. The data can be explained by a subject-based factor: some listeners experienced interaural envelope decorrelation when the sound was encoded by the auditory system, which reduced performance when the modulations were present. Since the decrease in performance between the F and CM conditions was small, it seems that most young normal-hearing listeners have very similar encoding of modulated stimuli across the ears. This type of task, when further optimized, may be able to assess whether hearing-impaired populations experience interaural decorrelation from encoding modulated stimuli and therefore could help better understand the limited spatial hearing in populations like cochlear-implant users.


Subject(s)
Auditory Perception/physiology, Sound Localization/physiology, Acoustic Stimulation, Adult, Cochlear Implants/statistics & numerical data, Functional Laterality/physiology, Healthy Volunteers, Humans, Psychoacoustics, Signal Processing, Computer-Assisted, Speech Intelligibility/physiology, Speech Perception/physiology, Young Adult
9.
Neurorehabil Neural Repair ; 33(6): 453-463, 2019 06.
Article in English | MEDLINE | ID: mdl-31081485

ABSTRACT

Background. Communication impairment is one of the most common symptoms of Parkinson's disease (PD), significantly affecting quality of life. Singing shares many of the neural networks and structural mechanisms used during speech and, thus, has potential for therapeutic application to address speech disorders. Objective. To explore the effects of an interdisciplinary singing-based therapeutic intervention (ParkinSong) on voice and communication in people with PD. Methods. A controlled trial compared the effects of the ParkinSong intervention with an active control condition at 2 dosage levels (weekly vs monthly) over 3 months, on voice, speech, respiratory strength, and voice-related quality-of-life outcomes for 75 people living with PD. The interdisciplinary ParkinSong model comprised high-effort vocal and respiratory tasks, speech exercises, group singing, and social communication opportunities. Results. ParkinSong intervention participants demonstrated significant improvements in vocal intensity (P = .018), maximum expiratory pressure (P = .032), and voice-related quality of life (P = .043) in comparison to controls. Weekly ParkinSong participants increased vocal intensity more than monthly participants (P = .011). Vocal intensity declined in nontreatment control groups. No statistical differences between groups on maximum phonation length or maximum inspiratory pressure were observed at 3 months. Conclusions. ParkinSong is an engaging intervention with the potential to increase loudness and respiratory function in people with mild to moderately severe PD.


Subject(s)
Breathing Exercises, Communication, Interpersonal Relations, Music Therapy, Parkinson Disease/physiopathology, Parkinson Disease/rehabilitation, Psychotherapy, Group, Singing, Speech Disorders/physiopathology, Speech Disorders/rehabilitation, Speech Therapy, Aged, Aged, 80 and over, Breathing Exercises/methods, Combined Modality Therapy, Female, Humans, Male, Middle Aged, Music Therapy/methods, Parkinson Disease/complications, Psychotherapy, Group/methods, Severity of Illness Index, Speech Disorders/etiology, Speech Intelligibility/physiology, Speech Therapy/methods, Treatment Outcome
10.
Nat Hum Behav ; 3(4): 393-405, 2019 04.
Article in English | MEDLINE | ID: mdl-30971792

ABSTRACT

The principles underlying functional asymmetries in cortex remain debated. For example, it is accepted that speech is processed bilaterally in auditory cortex, but a left hemisphere dominance emerges when the input is interpreted linguistically. The mechanisms, however, are contested, for example, which sound features or processing principles underlie laterality. Recent findings across species (humans, canines and bats) provide converging evidence that spectrotemporal sound features drive asymmetrical responses. Typically, accounts invoke models wherein the hemispheres differ in time-frequency resolution or integration window size. We develop a framework that builds on and unifies prevailing models, using spectrotemporal modulation space. Using signal processing techniques motivated by neural responses, we test this approach, employing behavioural and neurophysiological measures. We show how psychophysical judgements align with spectrotemporal modulations and then characterize the neural sensitivities to temporal and spectral modulations. We demonstrate differential contributions from both hemispheres, with a left lateralization for temporal modulations and a weaker right lateralization for spectral modulations. We argue that representations in the modulation domain provide a more mechanistic basis to account for lateralization in auditory cortex.
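
A spectrotemporal modulation decomposition of the kind such a framework relies on can be sketched as the 2D Fourier transform of a log spectrogram, which separates temporal modulation rates from spectral modulation scales. Parameters in the sketch below are illustrative assumptions.

```python
# Sketch of a spectrotemporal modulation analysis: 2D FFT of a log spectrogram.
# Axis 0 of the spectrogram -> spectral modulations (cycles per Hz here; often
# re-expressed as cycles/octave); axis 1 -> temporal modulations (Hz).
import numpy as np
from scipy.signal import spectrogram

fs = 16000
t = np.arange(fs * 2) / fs
sound = np.sin(2 * np.pi * 440 * t) * (1 + np.sin(2 * np.pi * 5 * t)) / 2

f, times, S = spectrogram(sound, fs=fs, nperseg=256, noverlap=192)
logS = np.log(S + 1e-10)

mod = np.abs(np.fft.fftshift(np.fft.fft2(logS - logS.mean())))
temporal_mod_rates = np.fft.fftshift(np.fft.fftfreq(S.shape[1], d=times[1] - times[0]))
spectral_mod_scales = np.fft.fftshift(np.fft.fftfreq(S.shape[0], d=f[1] - f[0]))
```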


Subject(s)
Auditory Cortex/physiology, Functional Laterality/physiology, Pitch Perception/physiology, Speech Intelligibility/physiology, Speech Perception/physiology, Acoustic Stimulation/methods, Adolescent, Adult, Electrocorticography/methods, Female, Humans, Magnetoencephalography/methods, Male, Psychophysics/methods, Signal Processing, Computer-Assisted, Time Factors, Young Adult
11.
J Speech Lang Hear Res ; 62(2): 367-386, 2019 02 26.
Article in English | MEDLINE | ID: mdl-30950685

ABSTRACT

Purpose Improving the ability to listen efficiently in noisy environments is a critical goal for hearing rehabilitation. However, understanding of the impact of difficult listening conditions on language processing is limited. The current study evaluated the neural processes underlying semantics in challenging listening conditions. Method Thirty adults with normal hearing completed an auditory sentence processing task in 4-talker babble. Event-related brain potentials were elicited by the final word in high- or low-context sentences, where the final word was either highly expected or not expected, followed by a 4-alternative forced-choice response with either longer (1,000 ms), middle (700 ms), or shorter (400 ms) response time deadlines (RTDs). Results Behavioral accuracy was reduced, and reaction times were faster, for shorter RTDs. N400 amplitudes, reflecting ease of lexical access, were larger when elicited by target words in low-context sentences followed by shorter compared with longer RTDs. Conclusions These results reveal that more neural resources are allocated for semantic processing/lexical access when listening difficulty increases. Differences between RTDs may reflect increased attentional allocation for shorter RTDs. These findings suggest that situational listening demands can impact the demands for cognitive resources engaged in language processing, which could significantly impact listener experiences across environments.


Subject(s)
Evoked Potentials/physiology, Semantics, Acoustic Stimulation, Adolescent, Adult, Auditory Threshold/physiology, Communication, Electroencephalography, Female, Humans, Language, Male, Psychomotor Performance/physiology, Reaction Time, Speech Intelligibility/physiology, Young Adult
12.
J Speech Lang Hear Res ; 62(2): 423-433, 2019 02 26.
Article in English | MEDLINE | ID: mdl-30950691

ABSTRACT

Purpose Supportive semantic and syntactic information can increase children's and adults' word recognition accuracy in adverse listening conditions. However, there are inconsistent findings regarding how a talker's accent or dialect modulates these context effects. Here, we compare children's and adults' abilities to capitalize on sentence context to overcome misleading acoustic-phonetic cues in nonnative-accented speech. Method Monolingual American English-speaking 5- to 7-year-old children (n = 90) and 18- to 35-year-old adults (n = 30) were presented with full sentences or the excised final word from each of the sentences and repeated what they heard. Participants were randomly assigned to 1 of 2 conditions: native-accented (Midland American English) or nonnative-accented (Spanish- and Japanese-accented English) speech. Participants also completed the NIH Toolbox Picture Vocabulary Test. Results Children and adults benefited from sentence context for both native- and nonnative-accent talkers, but the benefit was greater for nonnative than native talkers. Furthermore, adults showed a greater context benefit than children for nonnative talkers, but the 2 age groups showed a similar benefit for native talkers. Children's age and vocabulary scores both correlated with context benefit. Conclusions The cognitive-linguistic development that occurs between the early school-age years and adulthood may increase listeners' abilities to capitalize on top-down cues for lexical identification with nonnative-accented speech. These results have implications for the perception of speech with source degradation, including speech sound disorders, hearing loss, or signal processing that does not faithfully represent the original signal.


Subject(s)
Recognition, Psychology/physiology, Speech Intelligibility/physiology, Acoustic Stimulation, Adolescent, Adult, Child, Child, Preschool, Comprehension/physiology, Cues, Female, Humans, Male, Noise, Phonetics, Semantics, Vocabulary, Young Adult
13.
J Neural Eng ; 16(3): 036008, 2019 06.
Article in English | MEDLINE | ID: mdl-30776785

ABSTRACT

OBJECTIVE: Speech signals have a remarkable ability to entrain brain activity to the rapid fluctuations of speech sounds. For instance, one can readily measure a correlation of the sound amplitude with the evoked responses of the electroencephalogram (EEG), and the strength of this correlation is indicative of whether the listener is attending to the speech. In this study we asked whether this stimulus-response correlation is also predictive of speech intelligibility. APPROACH: We hypothesized that when a listener fails to understand the speech in adverse hearing conditions, attention wanes and stimulus-response correlation also drops. To test this, we measure a listener's ability to detect words in noisy speech while recording their brain activity using EEG. We alter intelligibility without changing the acoustic stimulus by pairing it with congruent and incongruent visual speech. MAIN RESULTS: For almost all subjects we found that an improvement in speech detection coincided with an increase in correlation between the noisy speech and the EEG measured over a period of 30 min. SIGNIFICANCE: We conclude that simultaneous recordings of the perceived sound and the corresponding EEG response may be a practical tool to assess speech intelligibility in the context of hearing aids.
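
The stimulus-response correlation measure can be sketched as a Pearson correlation between the speech envelope and lagged EEG. The sketch below uses a single synthetic channel; the lag range is an assumption.

```python
# Sketch of a stimulus-response correlation: correlate the speech envelope
# with the EEG across candidate response lags and keep the best value.
import numpy as np

fs = 64                                          # assumed EEG analysis rate
rng = np.random.default_rng(4)
T = fs * 60                                      # one minute of synthetic data
envelope = rng.standard_normal(T)
eeg = np.roll(envelope, 8) + 3 * rng.standard_normal(T)   # response lagged ~125 ms

def stim_resp_corr(env, resp, max_lag):
    """Best Pearson correlation across response lags 0..max_lag samples."""
    return max(np.corrcoef(env[:len(env) - lag or None], resp[lag:])[0, 1]
               for lag in range(max_lag + 1))

r = stim_resp_corr(envelope, eeg, max_lag=int(0.3 * fs))   # lags up to 300 ms
```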


Subject(s)
Acoustic Stimulation/methods, Brain/physiology, Electroencephalography/methods, Speech Intelligibility/physiology, Auditory Perception/physiology, Female, Forecasting, Humans, Male, Photic Stimulation/methods, Speech Perception/physiology, Visual Perception/physiology, Young Adult
14.
Hear Res ; 374: 58-68, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30732921

ABSTRACT

Faster speech may facilitate more efficient communication, but if speech is too fast it becomes unintelligible. The maximum speeds at which Mandarin words were intelligible in a sentence context were quantified for normal hearing (NH) and cochlear implant (CI) listeners by measuring time-compression thresholds (TCTs) in an adaptive staircase procedure. In Experiment 1, both original and CI-vocoded time-compressed speech from the MSP (Mandarin speech perception) and MHINT (Mandarin hearing in noise test) corpora was presented to 10 NH subjects over headphones. In Experiment 2, original time-compressed speech was presented to 10 CI subjects and another 10 NH subjects through a loudspeaker in a soundproof room. Sentences were time-compressed without changing their spectral profile, and were presented up to three times within a single trial. At the end of each trial, the number of correctly identified words in the sentence was scored. A 50%-word recognition threshold was tracked in the psychophysical procedure. The observed median TCTs were very similar for MSP and MHINT speech. For NH listeners, median TCTs were around 16.7 syllables/s for normal speech, and 11.8 and 8.6 syllables/s respectively for 8 and 4 channel tone-carrier vocoded speech. For CI listeners, TCTs were only around 6.8 syllables/s. The interquartile range of the TCTs within each cohort was smaller than 3.0 syllables/s. Speech reception thresholds in noise were also measured in Experiment 2, and were found to be strongly correlated with TCTs for CI listeners. In conclusion, the Mandarin sentence TCTs were around 16.7 syllables/s for most NH subjects, but rarely faster than 10.0 syllables/s for CI listeners, which quantitatively illustrated upper limits of fast speech information processing with CIs.
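
The adaptive staircase for time-compression thresholds can be sketched as a 1-up/1-down track on syllable rate that converges on the 50%-word-recognition point. The step rule and the simulated listener below are illustrative assumptions, not the study's procedure.

```python
# Sketch of an adaptive staircase for a time-compression threshold (TCT):
# speed up after a trial with >= 50% words correct, slow down otherwise.
import numpy as np

rng = np.random.default_rng(5)
true_tct = 16.7                                  # syllables/s, simulated listener

def trial_prop_correct(rate):
    """Toy psychometric function: proportion of words recognized at this rate."""
    return 1 / (1 + np.exp((rate - true_tct) / 1.5))

rate, step = 8.0, 2.0
reversals, last_dir = [], 0
while len(reversals) < 8:
    correct = trial_prop_correct(rate) + 0.05 * rng.standard_normal() >= 0.5
    direction = 1 if correct else -1
    if last_dir and direction != last_dir:       # track reversed
        reversals.append(rate)
        step = max(step / 2, 0.25)               # shrink steps after each reversal
    rate += direction * step
    last_dir = direction

tct_estimate = np.mean(reversals[-4:])           # average of the last reversals
```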


Subject(s)
Auditory Threshold/physiology, Cochlear Implants, Language, Speech Intelligibility/physiology, Acoustic Stimulation, Adult, Algorithms, Child, Cochlear Implants/statistics & numerical data, Female, Healthy Volunteers, Humans, Male, Psychoacoustics, Signal Processing, Computer-Assisted, Speech Acoustics, Speech Perception/physiology, Time Factors, Young Adult
15.
Sci Rep ; 9(1): 874, 2019 01 29.
Article in English | MEDLINE | ID: mdl-30696881

ABSTRACT

Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthesis that establishes direct communication with the brain, and has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state of the art in speech neuroprosthesis, we combined the recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated the dependence of reconstruction accuracy on linear and nonlinear (deep neural network) regression methods and the acoustic representation that is used as the target of reconstruction, including auditory spectrogram and speech synthesis parameters. In addition, we compared the reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving intelligibility by 65% over the baseline method, which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which not only can restore communication for paralyzed patients but also have the potential to transform human-computer interaction technologies.


Subject(s)
Speech Intelligibility/physiology, Speech Perception/physiology, Speech/physiology, Acoustic Stimulation/methods, Algorithms, Auditory Cortex/physiology, Brain Mapping, Deep Learning, Evoked Potentials, Auditory/physiology, Humans, Neural Networks, Computer, Neural Prostheses
16.
Trends Hear ; 22: 2331216518800870, 2018.
Article in English | MEDLINE | ID: mdl-30311552

ABSTRACT

There is conflicting evidence about the relative benefit of slow- and fast-acting compression for speech intelligibility. It has been hypothesized that fast-acting compression improves audibility at low signal-to-noise ratios (SNRs) but may distort the speech envelope at higher SNRs. The present study investigated the effects of compression with a nearly instantaneous attack time but either fast (10 ms) or slow (500 ms) release times on consonant identification in hearing-impaired listeners. Consonant-vowel speech tokens were presented at a range of presentation levels in two conditions: in the presence of interrupted noise and in quiet (with the compressor "shadow-controlled" by the corresponding mixture of speech and noise). These conditions were chosen to disentangle the effects of consonant audibility and noise-induced forward masking on speech intelligibility. A small but systematic intelligibility benefit of fast-acting compression was found in both the quiet and the noisy conditions for the lower speech levels. No detrimental effects of fast-acting compression were observed when the speech level exceeded the level of the noise. These findings suggest that fast-acting compression provides an audibility benefit in fluctuating interferers when compared with slow-acting compression while not substantially affecting the perception of consonants at higher SNRs.
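
A compressor with a nearly instantaneous attack and a selectable release, as described above, can be sketched with a one-pole level tracker whose smoothing constant depends on whether the level is rising or falling. The gain-curve details below are assumptions, not the study's exact configuration.

```python
# Sketch of a dynamic range compressor with separate attack/release constants.
import numpy as np

def compress(x, fs, ratio=3.0, thresh_db=-30.0, attack_s=0.0001, release_s=0.5):
    alpha_a = np.exp(-1 / (attack_s * fs))       # fast tracking of level increases
    alpha_r = np.exp(-1 / (release_s * fs))      # slow decay of the level estimate
    level, gain = 0.0, np.ones_like(x)
    for n, sample in enumerate(np.abs(x)):
        alpha = alpha_a if sample > level else alpha_r
        level = alpha * level + (1 - alpha) * sample
        level_db = 20 * np.log10(max(level, 1e-8))
        over = max(level_db - thresh_db, 0.0)
        gain[n] = 10 ** (-over * (1 - 1 / ratio) / 20)   # static curve above threshold
    return x * gain

fs = 16000
t = np.arange(fs) / fs
tone_burst = np.sin(2 * np.pi * 1000 * t) * (t > 0.5)   # silence then tone
out_fast = compress(tone_burst, fs, release_s=0.01)     # fast-acting (10 ms release)
out_slow = compress(tone_burst, fs, release_s=0.5)      # slow-acting (500 ms release)
```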


Subject(s)
Acoustic Stimulation/methods, Hearing Aids, Hearing Loss, Sensorineural/rehabilitation, Sound Spectrography/methods, Speech Intelligibility/physiology, Adult, Aged, Case-Control Studies, Female, Hearing Loss, Sensorineural/diagnosis, Humans, Male, Phonetics, Prosthesis Design, Reference Values, Signal-To-Noise Ratio, Speech Reception Threshold Test, Young Adult
17.
Trends Hear ; 22: 2331216518797838, 2018.
Article in English | MEDLINE | ID: mdl-30222089

ABSTRACT

Many cochlear implant (CI) users achieve excellent speech understanding in acoustically quiet conditions but most perform poorly in the presence of background noise. An important contributor to this poor speech-in-noise performance is the limited transmission of low-frequency sound information through CIs. Recent work has suggested that tactile presentation of this low-frequency sound information could be used to improve speech-in-noise performance for CI users. Building on this work, we investigated whether vibro-tactile stimulation can improve speech intelligibility in multi-talker noise. The signal used for tactile stimulation was derived from the speech-in-noise using a computationally inexpensive algorithm. Eight normal-hearing participants listened to CI simulated speech-in-noise both with and without concurrent tactile stimulation of their fingertip. Participants' speech recognition performance was assessed before and after a training regime, which took place over 3 consecutive days and totaled around 30 min of exposure to CI-simulated speech-in-noise with concurrent tactile stimulation. Tactile stimulation was found to improve the intelligibility of speech in multi-talker noise, and this improvement was found to increase in size after training. Presentation of such tactile stimulation could be achieved by a compact, portable device and offer an inexpensive and noninvasive means for improving speech-in-noise performance in CI users.
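
One plausible, computationally inexpensive way to derive such a tactile signal, sketched below under assumptions the abstract does not confirm, is to rectify and low-pass the speech-in-noise mixture to obtain its low-frequency envelope, then use that envelope to modulate a fixed vibration carrier for a fingertip actuator.

```python
# Sketch of deriving a tactile signal from speech-in-noise: cheap envelope
# follower (rectify + low-pass), then amplitude-modulate a tactile carrier.
# Cutoff and carrier frequency are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 16000
rng = np.random.default_rng(6)
speech_in_noise = rng.standard_normal(fs)        # placeholder mixture

sos = butter(2, 30, btype="lowpass", fs=fs, output="sos")
envelope = sosfiltfilt(sos, np.abs(speech_in_noise))
envelope = np.clip(envelope, 0, None)            # keep the drive non-negative

t = np.arange(fs) / fs
vibration = envelope * np.sin(2 * np.pi * 200 * t)   # 200 Hz tactile carrier
```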


Subject(s)
Acoustic Stimulation/methods, Cochlear Implantation/methods, Hearing Loss/surgery, Speech Intelligibility/physiology, Speech Perception/physiology, Adult, Algorithms, Audiometry, Speech/methods, Auditory Perception/physiology, Auditory Threshold/physiology, Cochlear Implants, Female, Humans, Male, Noise, Sampling Studies, Sensitivity and Specificity, Simulation Training, Sound Localization/physiology, Young Adult
18.
J Acoust Soc Am ; 143(5): EL379, 2018 05.
Article in English | MEDLINE | ID: mdl-29857710

ABSTRACT

A positive relationship between rhythm perception and improved understanding of a naturally dysrhythmic speech signal, ataxic dysarthria, has been previously reported [Borrie, Lansford, and Barrett. (2017). J. Speech Lang. Hear. Res. 60, 3110-3117]. The current follow-on investigation suggests that this relationship depends on the nature of the dysrhythmia. When the corrupted rhythm cues are relatively predictable, affording some learnable acoustic regularity, the relationship is replicated. However, this relationship is nonexistent, along with any intelligibility improvements, when the corrupted rhythm cues are unpredictable. Findings highlight a key role for rhythm perception and distributional regularities in adaptation to dysrhythmic speech.


Subject(s)
Acoustic Stimulation/methods, Dysarthria/physiopathology, Learning/physiology, Speech Intelligibility/physiology, Speech Perception/physiology, Adult, Dysarthria/diagnosis, Female, Humans, Male, Middle Aged, Young Adult
19.
Curr Biol ; 28(9): 1453-1459.e3, 2018 05 07.
Article in English | MEDLINE | ID: mdl-29681475

ABSTRACT

Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex, that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope than to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference of occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to the absent acoustic forward speech envelope, but not to the present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal-stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements.


Subject(s)
Speech Perception/physiology, Speech/physiology, Visual Cortex/physiology, Acoustic Stimulation, Adult, Auditory Cortex/physiology, Brain Mapping, Female, Humans, Lip, Lipreading, Magnetoencephalography/methods, Male, Motor Cortex/physiology, Movement, Phonetics, Speech Intelligibility/physiology
20.
Curr Biol ; 27(21): 3237-3247.e6, 2017 Nov 06.
Article in English | MEDLINE | ID: mdl-29056453

ABSTRACT

Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges.


Subject(s)
Auditory Perception/physiology, Perceptual Masking/physiology, Persons With Hearing Impairments, Speech Intelligibility/physiology, Speech Perception/physiology, Acoustic Stimulation, Aged, Double-Blind Method, Female, Humans, Male