Search | BVS Bolivia

1.

Band importance for speech-in-speech recognition in the presence of extended high-frequency cues.

Ananthanarayana, Rohit M; Buss, Emily; Monson, Brian B.

J Acoust Soc Am ; 156(2): 1202-1213, 2024 Aug 01.

Article in English | MEDLINE | ID: mdl-39158325

ABSTRACT

Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8-20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environments, particularly when the masker has reduced EHF levels relative to the target. This scenario can occur in natural auditory scenes when the target talker is facing the listener, but the maskers are not. In this study, we measured the importance of five bands from 40 to 20 000 Hz for speech-in-speech recognition by notch-filtering the bands individually. Stimuli consisted of a female target talker recorded from 0° and a spatially co-located two-talker female masker recorded either from 0° or 56.25°, simulating a masker either facing the listener or facing away, respectively. Results indicated peak band importance in the 0.4-1.3 kHz band and a negligible effect of removing the EHF band in the facing-masker condition. However, in the non-facing condition, the peak was broader and EHF importance was higher and comparable to that of the 3.3-8.3 kHz band in the facing-masker condition. These findings suggest that EHFs contain important cues for speech recognition in listening conditions with mismatched talker head orientations.

Subject(s)

Acoustic Stimulation , Cues , Noise , Perceptual Masking , Recognition, Psychology , Speech Perception , Humans , Female , Speech Perception/physiology , Young Adult , Adult , Male , Audiometry, Speech , Speech Intelligibility , Auditory Threshold , Sound Localization , Speech Acoustics , Sound Spectrography

2.

A Step Toward Precision Audiology: Individual Differences and Characteristic Profiles From Auditory Perceptual and Cognitive Abilities.

Cherri, Dana; Eddins, David A; Ozmeral, Erol J.

Trends Hear ; 28: 23312165241263485, 2024.

Article in English | MEDLINE | ID: mdl-39099537

ABSTRACT

Older adults with normal hearing or with age-related hearing loss face challenges when listening to speech in noisy environments. To better serve individuals with communication difficulties, precision diagnostics are needed to characterize individuals' auditory perceptual and cognitive abilities beyond pure tone thresholds. These abilities can be heterogeneous across individuals within the same population. The goal of the present study is to consider the suprathreshold variability and develop characteristic profiles for older adults with normal hearing (ONH) and with hearing loss (OHL). Auditory perceptual and cognitive abilities were tested on ONH (n = 20) and OHL (n = 20) on an abbreviated test battery using portable automated rapid testing. Using cluster analyses, three main profiles were revealed for each group, showing differences in auditory perceptual and cognitive abilities despite similar audiometric thresholds. Analysis of variance showed that ONH profiles differed in spatial release from masking, speech-in-babble testing, cognition, tone-in-noise, and binaural temporal processing abilities. The OHL profiles differed in spatial release from masking, speech-in-babble testing, cognition, and tolerance to background noise performance. Correlation analyses showed significant relationships between auditory and cognitive abilities in both groups. This study showed that auditory perceptual and cognitive deficits can be present to varying degrees in the presence of audiometrically normal hearing and among listeners with similar degrees of hearing loss. The results of this study inform the need for taking individual differences into consideration and developing targeted intervention options beyond pure tone thresholds and speech testing.

Subject(s)

Audiometry, Pure-Tone , Auditory Threshold , Cognition , Noise , Perceptual Masking , Speech Perception , Humans , Male , Cognition/physiology , Female , Aged , Auditory Threshold/physiology , Speech Perception/physiology , Middle Aged , Noise/adverse effects , Acoustic Stimulation , Auditory Perception/physiology , Aged, 80 and over , Hearing/physiology , Age Factors , Case-Control Studies , Presbycusis/diagnosis , Presbycusis/physiopathology , Predictive Value of Tests , Audiology/methods , Individuality , Persons With Hearing Impairments/psychology , Cluster Analysis , Audiometry, Speech/methods

3.

Effects of Age on Responses of Principal Cells of the Mouse Anteroventral Cochlear Nucleus in Quiet and Noise.

Postolache, Maggie; Connelly Graham, Catherine J; Burke, Kali; Lauer, Amanda M; Xu-Friedman, Matthew A.

eNeuro ; 11(8), 2024 Aug.

Article in English | MEDLINE | ID: mdl-39134409

ABSTRACT

Older listeners often report difficulties understanding speech in noisy environments. It is important to identify where in the auditory pathway hearing-in-noise deficits arise to develop appropriate therapies. We tested how encoding of sounds is affected by masking noise at early stages of the auditory pathway by recording responses of principal cells in the anteroventral cochlear nucleus (AVCN) of aging CBA/CaJ and C57BL/6J mice in vivo. Previous work indicated that masking noise shifts the dynamic range of single auditory nerve fibers (ANFs), leading to elevated tone thresholds. We hypothesized that such threshold shifts could contribute to increased hearing-in-noise deficits with age if susceptibility to masking increased in AVCN units. We tested this by recording the responses of AVCN principal neurons to tones in the presence and absence of masking noise. Surprisingly, we found that masker-induced threshold shifts decreased with age in primary-like units and did not change in choppers. In addition, spontaneous activity decreased in primary-like and chopper units of old mice, with no change in dynamic range or tuning precision. In C57 mice, which undergo early-onset hearing loss, units showed similar changes in threshold and spontaneous rate at younger ages, suggesting they were related to hearing loss and not simply aging. These findings suggest that sound information carried by AVCN principal cells remains largely unchanged with age. Therefore, hearing-in-noise deficits may result from other changes during aging, such as distorted across-channel input from the cochlea and changes in sound coding at later stages of the auditory pathway.

Subject(s)

Aging , Cochlear Nucleus , Mice, Inbred C57BL , Mice, Inbred CBA , Noise , Animals , Cochlear Nucleus/physiology , Aging/physiology , Male , Acoustic Stimulation , Neurons/physiology , Female , Auditory Threshold/physiology , Perceptual Masking/physiology , Mice , Action Potentials/physiology

4.

Intelligibility of Natively and Nonnatively Produced English Speech Presented in Noise to a Large Cohort of United States Service Members.

Bieber, Rebecca E; Makashay, Matthew J; Sheffield, Benjamin M; Brungart, Douglas S.

J Speech Lang Hear Res ; 67(7): 2454-2472, 2024 Jul 09.

Article in English | MEDLINE | ID: mdl-38950169

ABSTRACT

PURPOSE: A corpus of English matrix sentences produced by 60 native and nonnative speakers of English was developed as part of a multinational coalition task group. This corpus was tested on a large cohort of U.S. Service members in order to examine the effects of talker nativeness, listener nativeness, masker type, and hearing sensitivity on speech recognition performance in this population. METHOD: A total of 1,939 U.S. Service members (ages 18-68 years) completed this closed-set listening task, including 430 women and 110 nonnative English speakers. Stimuli were produced by native and nonnative speakers of English and were presented in speech-shaped noise and multitalker babble. Keyword recognition accuracy and response times were analyzed. RESULTS: General(ized) linear mixed-effects regression models found that, on the whole, speech recognition performance was lower for listeners who identified as nonnative speakers of English and when listening to speech produced by nonnative speakers of English. Talker and listener effects were more pronounced when listening in a babble masker than in a speech-shaped noise masker. Response times varied as a function of recognition score, with longest response times found for intermediate levels of performance. CONCLUSIONS: This study found additive effects of talker and listener nonnativeness when listening to speech in background noise. These effects were present in both accuracy and response time measures. No multiplicative effects of talker and listener language background were found. There was little evidence of a negative interaction between talker nonnativeness and hearing impairment, suggesting that these factors may have redundant effects on speech recognition. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.26060191.

Subject(s)

Noise , Perceptual Masking , Speech Intelligibility , Speech Perception , Humans , Female , Adult , Middle Aged , Male , Young Adult , Aged , Adolescent , United States , Perceptual Masking/physiology , Cohort Studies , Language , Military Personnel

5.

Learning English vowels: The effects of different phonetic training modes on Arabic learners' production and perception.

Alshangiti, Wafaa; Evans, Bronwen G.

J Acoust Soc Am ; 156(1): 284-298, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38984810

ABSTRACT

This study investigated the effect of different types of phonetic training on potential changes in the production and perception of English vowels by Arabic learners of English. Forty-six Arabic learners of English were randomly assigned to one of three high variability vowel training programs: Perception training (High Variability Phonetic Training), Production training, and a Hybrid Training program (production and perception training). Pre- and post-tests (vowel identification, category discrimination, speech recognition in noise, and vowel production) showed that all training types led to improvements in perception and production. There was some evidence that improvements were linked to training type: learners in the Perception Training condition improved in vowel identification but not vowel production, while those in the Production Training condition showed only small improvements in performance on perceptual tasks, but greater improvement in production. However, the effects of training modality were complicated by proficiency, with high proficiency learners benefitting more from different types of training regardless of training mode than lower proficiency learners.

Subject(s)

Multilingualism , Phonetics , Speech Perception , Humans , Female , Male , Young Adult , Adult , Speech Acoustics , Learning , Speech Production Measurement , Recognition, Psychology , Perceptual Masking , Noise , Language , Adolescent

6.

Echolocating bats show species-specific variation in susceptibility to acoustic forward masking.

Capshaw, Grace; Diebold, Clarice A; Sterbing, Susanne J; Lauer, Amanda M; Moss, Cynthia F.

J Acoust Soc Am ; 156(1): 511-523, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-39013168

ABSTRACT

Echolocating bats rely on precise auditory temporal processing to detect echoes generated by calls that may be emitted at rates reaching 150-200 Hz. High call rates can introduce forward masking perceptual effects that interfere with echo detection; however, bats may have evolved specializations to prevent repetition suppression of auditory responses and facilitate detection of sounds separated by brief intervals. Recovery of the auditory brainstem response (ABR) was assessed in two species that differ in the temporal characteristics of their echolocation behaviors: Eptesicus fuscus, which uses high call rates to capture prey, and Carollia perspicillata, which uses lower call rates to avoid obstacles and forage for fruit. We observed significant species differences in the effects of forward masking on ABR wave 1, in which E. fuscus maintained comparable ABR wave 1 amplitudes when stimulated at intervals of <3 ms, whereas post-stimulus recovery in C. perspicillata required 12 ms. When the intensity of the second stimulus was reduced by 20-30 dB relative to the first, however, C. perspicillata showed greater recovery of wave 1 amplitudes. The results demonstrate that species differences in temporal resolution are established at early levels of the auditory pathway and that these differences reflect auditory processing requirements of species-specific echolocation behaviors.

Subject(s)

Acoustic Stimulation , Chiroptera , Echolocation , Evoked Potentials, Auditory, Brain Stem , Perceptual Masking , Species Specificity , Animals , Chiroptera/physiology , Acoustic Stimulation/methods , Evoked Potentials, Auditory, Brain Stem/physiology , Time Factors , Male , Female , Auditory Threshold , Auditory Perception/physiology

7.

Attenuation and distortion components of age-related hearing loss: Contributions to recognizing temporal-envelope filtered speech in modulated noise.

Fogerty, Daniel; Ahlstrom, Jayne B; Dubno, Judy R.

J Acoust Soc Am ; 156(1): 93-106, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38958486

ABSTRACT

Older adults with hearing loss may experience difficulty recognizing speech in noise due to factors related to attenuation (e.g., reduced audibility and sensation levels, SLs) and distortion (e.g., reduced temporal fine structure, TFS, processing). Furthermore, speech recognition may improve when the amplitude modulation spectrum of the speech and masker are non-overlapping. The current study investigated this by filtering the amplitude modulation spectrum into different modulation rates for speech and speech-modulated noise. The modulation depth of the noise was manipulated to vary the SL of speech glimpses. Younger adults with normal hearing and older adults with normal or impaired hearing listened to natural speech or speech vocoded to degrade TFS cues. Control groups of younger adults were tested on all conditions with spectrally shaped speech and threshold matching noise, which reduced audibility to match that of the older hearing-impaired group. All groups benefitted from increased masker modulation depth and preservation of syllabic-rate speech modulations. Older adults with hearing loss had reduced speech recognition across all conditions. This was explained by factors related to attenuation, due to reduced SLs, and distortion, due to reduced TFS processing, which resulted in poorer auditory processing of speech cues during the dips of the masker.

Subject(s)

Acoustic Stimulation , Auditory Threshold , Cues , Noise , Perceptual Masking , Speech Perception , Humans , Speech Perception/physiology , Aged , Noise/adverse effects , Adult , Young Adult , Male , Female , Middle Aged , Age Factors , Recognition, Psychology , Time Factors , Aging/physiology , Presbycusis/physiopathology , Presbycusis/diagnosis , Presbycusis/psychology , Persons With Hearing Impairments/psychology , Aged, 80 and over , Case-Control Studies , Speech Intelligibility

8.

Learning effects in speech-in-noise tasks: Effect of masker modulation and masking release.

Lie, Sisi; Zekveld, Adriana A; Smits, Cas; Kramer, Sophia E; Versfeld, Niek J.

J Acoust Soc Am ; 156(1): 341-349, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38990038

ABSTRACT

Previous research has shown that learning effects are present for speech intelligibility in temporally modulated (TM) noise, but not in stationary noise. The present study aimed to gain more insight into the factors that might affect the time course (the number of trials required to reach stable performance) and size [the improvement in the speech reception threshold (SRT)] of the learning effect. Two hypotheses were addressed: (1) learning effects are present in both TM and spectrally modulated (SM) noise and (2) the time course and size of the learning effect depend on the amount of masking release caused by either TM or SM noise. Eighteen normal-hearing adults (23-62 years) participated in SRT measurements, in which they listened to sentences in six masker conditions, including stationary, TM, and SM noise conditions. The results showed learning effects in all TM and SM noise conditions, but not for the stationary noise condition. The learning effect was related to the size of masking release: a larger masking release was accompanied by an increased time course of the learning effect and a larger learning effect. The results also indicate that speech is processed differently in SM noise than in TM noise.

Subject(s)

Acoustic Stimulation , Learning , Noise , Perceptual Masking , Speech Intelligibility , Speech Perception , Humans , Noise/adverse effects , Adult , Young Adult , Male , Speech Perception/physiology , Female , Middle Aged , Speech Reception Threshold Test , Time Factors , Auditory Threshold

9.

Effect of stimulus duration on estimates of human cochlear tuning.

López-Ramos, David; Eustaquio-Martín, Almudena; López-Bascuas, Luis E; Lopez-Poveda, Enrique A.

Hear Res ; 451: 109080, 2024 Sep 15.

Article in English | MEDLINE | ID: mdl-39004016

ABSTRACT

Auditory masking methods originally employed to assess behavioral frequency selectivity have evolved over the years to infer cochlear tuning. Behavioral forward masking thresholds for spectrally notched noise maskers and a fixed, low-level probe tone provide accurate estimates of cochlear tuning. Here, we use this method to investigate the effect of stimulus duration on human cochlear tuning at 500 Hz and 4 kHz. Probes were 20-ms sinusoids at 10 dB sensation level. Maskers were noises with a spectral notch symmetrically and asymmetrically placed around the probe frequency. For seven participants with normal hearing, masker levels at masking threshold were measured in forward masking for various notch widths and for masker durations of 30 and 400 ms. Measurements were fitted assuming rounded exponential filter shapes and the power spectrum model of masking, and equivalent rectangular bandwidths (ERBs) were inferred from the fits. At 4 kHz, masker thresholds were higher for the shorter maskers but ERBs were not significantly different for the two masker durations (ERB_30ms = 294 Hz vs. ERB_400ms = 277 Hz). At 500 Hz, by contrast, notched-noise curves were shallower for the 30-ms than the 400-ms masker, and ERBs were significantly broader for the shorter masker (ERB_30ms = 126 Hz vs. ERB_400ms = 55 Hz). We discuss possible factors that may underlie the duration effect at low frequencies and argue that it may not be possible to fully control for those factors. We conclude that tuning estimates are not affected by masker duration at high frequencies but should be measured and interpreted with caution at low frequencies.

Subject(s)

Acoustic Stimulation , Auditory Threshold , Cochlea , Noise , Perceptual Masking , Humans , Cochlea/physiology , Adult , Male , Female , Time Factors , Noise/adverse effects , Young Adult
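
As a rough illustration of how the equivalent rectangular bandwidths quoted in record 9 relate to filter sharpness, the sketch below assumes the simplest symmetric rounded-exponential filter, roex(p), with weighting W(g) = (1 + pg) exp(-pg) and g = |f - fc| / fc. For that shape the ERB equals 4 fc / p. The abstract does not say which roex variant was fitted, so the filter form, the function names, and the derived p values are assumptions made for illustration and are not the authors' analysis.

```python
import math


def roex_weight(g, p):
    """Symmetric roex(p) weighting at normalized frequency deviation g >= 0."""
    return (1.0 + p * g) * math.exp(-p * g)


def erb_closed_form(fc_hz, p):
    """ERB of a symmetric roex(p) filter centred at fc_hz: 4 * fc / p."""
    return 4.0 * fc_hz / p


def p_from_erb(fc_hz, erb_hz):
    """Invert the closed form to get the slope parameter p from an ERB."""
    return 4.0 * fc_hz / erb_hz


def erb_numeric(fc_hz, p, g_max=2.0, steps=200_000):
    """Check the closed form by midpoint integration over one skirt, doubled.

    The filter tail beyond g_max is ignored; it is negligible for the
    p values used here (p of roughly 15 or more).
    """
    dg = g_max / steps
    area = sum(roex_weight((i + 0.5) * dg, p) for i in range(steps)) * dg
    return 2.0 * fc_hz * area  # two symmetric skirts, converted from g to Hz


# ERB values as reported in the abstract (Hz),
# keyed by (probe frequency in Hz, masker duration in ms).
reported = {(500, 30): 126, (500, 400): 55, (4000, 30): 294, (4000, 400): 277}

for (fc, dur_ms), erb in reported.items():
    p = p_from_erb(fc, erb)
    print(f"{fc} Hz, {dur_ms}-ms masker: ERB = {erb} Hz, p = {p:.1f}, "
          f"numeric check = {erb_numeric(fc, p):.1f} Hz")
```

Under this assumption, the broader 500-Hz ERB for the 30-ms masker (126 Hz vs. 55 Hz) corresponds to a smaller p, that is, a shallower filter slope, which matches the shallower notched-noise curves described in the abstract.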

10.

The contribution of short-term memory for sound features to speech-in-noise perception and cognition.

Lad, Meher; Taylor, John-Paul; Griffiths, Timothy D.

Hear Res ; 451: 109081, 2024 Sep 15.

Article in English | MEDLINE | ID: mdl-39004015

ABSTRACT

Speech-in-noise (SIN) perception is a fundamental ability that declines with aging, as does general cognition. We assess whether auditory cognitive ability, in particular short-term memory for sound features, contributes to both. We examined how auditory memory for fundamental sound features, the carrier frequency and amplitude modulation rate of modulated white noise, contributes to SIN perception. We assessed SIN in 153 healthy participants with varying degrees of hearing loss using measures that require single-digit perception (the Digits-in-Noise, DIN) and sentence perception (Speech-in-Babble, SIB). Independent variables were auditory memory and a range of other factors including the Pure Tone Audiogram (PTA), a measure of dichotic pitch-in-noise perception (Huggins pitch), and demographic variables including age and sex. Multiple linear regression models were compared using Bayesian Model Comparison. The best predictor model for DIN included PTA and Huggins pitch (r² = 0.32, p < 0.001), whereas the model for SIB included the addition of auditory memory for sound features (r² = 0.24, p < 0.001). Further analysis demonstrated that auditory memory also explained a significant portion of the variance (28%) in scores for a screening cognitive test for dementia. Auditory memory for non-speech sounds may therefore provide an important predictor of both SIN and cognitive ability.

Subject(s)

Acoustic Stimulation , Cognition , Memory, Short-Term , Noise , Perceptual Masking , Speech Perception , Humans , Female , Male , Noise/adverse effects , Middle Aged , Adult , Aged , Young Adult , Pitch Perception , Bayes Theorem , Aged, 80 and over , Audiometry, Pure-Tone , Hearing , Auditory Threshold , Dichotic Listening Tests

11.

Pupil dilation reflects covert familiar face recognition under interocular suppression.

Mejía, Manuel Alejandro; Valdés-Sosa, Mitchell; Bobes, Maria Antonieta.

Conscious Cogn ; 123: 103726, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38972288

ABSTRACT

In prosopagnosia, brain lesions impair overt face recognition, but not face detection, and may coexist with residual covert recognition of familiar faces. Previous studies that simulated covert recognition in healthy individuals have impaired face detection as well as recognition, thus not fully mirroring the deficits in prosopagnosia. We evaluated a model of covert recognition based on continuous flash suppression (CFS). Familiar and unfamiliar faces and houses were masked while participants performed two discrimination tasks. With increased suppression, face/house discrimination remained largely intact, but face familiarity discrimination deteriorated. Covert recognition was present across all masking levels, evinced by higher pupil dilation to familiar than unfamiliar faces. Pupil dilation was uncorrelated with overt performance across subjects. Thus, CFS can impede overt face recognition without disrupting covert recognition and face detection, mirroring critical features of prosopagnosia. CFS could be used to uncover shared neural mechanisms of covert recognition in prosopagnosic patients and neurotypicals.

Subject(s)

Facial Recognition , Pupil , Recognition, Psychology , Humans , Facial Recognition/physiology , Adult , Female , Male , Recognition, Psychology/physiology , Young Adult , Pupil/physiology , Perceptual Masking/physiology

12.

The radial-tangential anisotropy of numerosity perception.

L-Miao, Li; Reynvoet, Bert; Sayim, Bilge.

J Vis ; 24(7): 15, 2024 Jul 02.

Article in English | MEDLINE | ID: mdl-39046720

ABSTRACT

Humans can estimate the number of visually presented items without counting. In most studies on numerosity perception, items are uniformly distributed across displays, with identical distributions in central and eccentric parts. However, the neural and perceptual representation of the human visual field differs between the fovea and the periphery. For example, in peripheral vision, there are strong asymmetries with regard to perceptual interferences between visual items. In particular, items arranged radially usually interfere more strongly with each other than items arranged tangentially (the radial-tangential anisotropy). This has been shown for crowding (the deleterious effect of clutter on target identification) and redundancy masking (the reduction of the number of perceived items in repeating patterns). In the present study, we tested how the radial-tangential anisotropy of peripheral vision impacts numerosity perception. In four experiments, we presented displays with varying numbers of discs that were predominantly arranged radially or tangentially, forming strong and weak interference conditions, respectively. Participants were asked to report the number of discs. We found that radial displays were reported as less numerous than tangential displays for all radial and tangential manipulations: weak (Experiment 1), strong (Experiment 2), and when using displays with mixed contrast polarity discs (Experiments 3 and 4). We propose that numerosity perception exhibits a significant radial-tangential anisotropy, resulting from local spatial interactions between items.

Subject(s)

Pattern Recognition, Visual , Humans , Anisotropy , Adult , Male , Female , Young Adult , Pattern Recognition, Visual/physiology , Photic Stimulation/methods , Visual Fields/physiology , Perceptual Masking/physiology , Visual Perception/physiology

13.

French version of the coordinate response measure corpus and its validation on a speech-on-speech task.

Isnard, Vincent; Chastres, Véronique; Andéol, Guillaume.

JASA Express Lett ; 4(7), 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-39051871

ABSTRACT

Since its creation, the coordinate response measure (CRM) corpus has been applied in hundreds of studies to explore the mechanisms of informational masking in multi-talker situations, but also in speech-in-noise or auditory attentional tasks. Here, we present its French version, with equivalent content to the original version in English. Furthermore, an evaluation of speech-on-speech intelligibility in French shows informational masking with similar result patterns to the original data in English. This validation of the French CRM corpus supports its use for intelligibility tests in French and for comparisons with a foreign language under masking conditions.

Subject(s)

Language , Speech Intelligibility , Speech Perception , Humans , Speech Perception/physiology , Female , Male , Adult , Perceptual Masking/physiology , France , Young Adult , Noise

14.

Using deep learning to improve the intelligibility of a target speaker in noisy multi-talker environments for people with normal hearing and hearing loss.

Thoidis, Iordanis; Goehring, Tobias.

J Acoust Soc Am ; 156(1): 706-724, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-39082692

ABSTRACT

Understanding speech in noisy environments is a challenging task, especially in communication situations with several competing speakers. Despite their ongoing improvement, assistive listening devices and speech processing approaches still do not perform well enough in noisy multi-talker environments, as they may fail to restore the intelligibility of a speaker of interest among competing sound sources. In this study, a quasi-causal deep learning algorithm was developed that can extract the voice of a target speaker, as indicated by a short enrollment utterance, from a mixture of multiple concurrent speakers in background noise. Objective evaluation with computational metrics demonstrated that the speaker-informed algorithm successfully extracts the target speaker from noisy multi-talker mixtures. This was achieved using a single algorithm that generalized to unseen speakers, different numbers of speakers and relative speaker levels, and different speech corpora. Double-blind sentence recognition tests on mixtures of one, two, and three speakers in restaurant noise were conducted with listeners with normal hearing and listeners with hearing loss. Results indicated significant intelligibility improvements with the speaker-informed algorithm of 17% and 31% for people without and with hearing loss, respectively. In conclusion, it was demonstrated that deep learning-based speaker extraction can enhance speech intelligibility in noisy multi-talker environments where uninformed speech enhancement methods fail.

Subject(s)

Deep Learning , Noise , Speech Intelligibility , Speech Perception , Humans , Noise/adverse effects , Female , Male , Adult , Middle Aged , Hearing Loss/physiopathology , Hearing Loss/psychology , Young Adult , Aged , Algorithms , Hearing , Perceptual Masking

15.

Development of a Phrase-Based Speech-Recognition Test Using Synthetic Speech.

Ibelings, Saskia; Brand, Thomas; Ruigendijk, Esther; Holube, Inga.

Trends Hear ; 28: 23312165241261490, 2024.

Article in English | MEDLINE | ID: mdl-39051703

ABSTRACT

Speech-recognition tests are widely used in both clinical and research audiology. The purpose of this study was the development of a novel speech-recognition test that combines concepts of different speech-recognition tests to reduce training effects and allows for a large set of speech material. The new test consists of four different words per trial in a meaningful construct with a fixed structure, the so-called phrases. Various free databases were used to select the words and to determine their frequency. Highly frequent nouns were grouped into thematic categories and combined with related adjectives and infinitives. After discarding inappropriate and unnatural combinations, and eliminating duplications of (sub-)phrases, a total number of 772 phrases remained. Subsequently, the phrases were synthesized using a text-to-speech system. The synthesis significantly reduces the effort compared to recordings with a real speaker. After excluding outliers, measured speech-recognition scores for the phrases with 31 normal-hearing participants at fixed signal-to-noise ratios (SNR) revealed speech-recognition thresholds (SRT) for each phrase varying up to 4 dB. The median SRT was -9.1 dB SNR and thus comparable to existing sentence tests. The psychometric function's slope of 15 percentage points per dB is also comparable and enables efficient use in audiology. Summarizing, the principle of creating speech material in a modular system has many potential applications.

Subject(s)

Recognition, Psychology , Speech Perception , Humans , Male , Female , Adult , Young Adult , Acoustic Stimulation , Speech Reception Threshold Test/methods , Auditory Threshold , Reproducibility of Results , Predictive Value of Tests , Psychometrics , Speech Intelligibility , Signal-To-Noise Ratio , Perceptual Masking
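
The psychometric-function parameters quoted in record 15 can be turned into a predicted recognition score at any SNR. The sketch below assumes a logistic function whose 50% point is the SRT and whose slope at that point is 15 percentage points per dB. The abstract gives the median SRT (-9.1 dB SNR) and the slope, but not the functional form or the percent-correct level that defines the SRT, so both are assumptions here, and the function name is hypothetical.

```python
import math


def recognition_probability(snr_db, srt_db=-9.1, slope_per_db=0.15):
    """Logistic psychometric function (assumed form).

    srt_db       -- SNR at which the predicted score is 50% (assumed definition)
    slope_per_db -- slope at the SRT as a proportion per dB (0.15 = 15 points/dB)
    """
    # A logistic 1 / (1 + exp(-k * x)) has slope k / 4 at its midpoint,
    # so k = 4 * slope yields the requested slope at the SRT.
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


for snr in (-15.0, -12.0, -9.1, -6.0, -3.0):
    score = 100 * recognition_probability(snr)
    print(f"SNR {snr:6.1f} dB -> predicted score {score:5.1f}%")
```

With a slope this steep, a change of a few dB in SNR moves the predicted score across most of its range, which is why a steep psychometric function makes threshold measurement efficient.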

16.

Is Recognition of Speech in Noise Related to Memory Disruption Caused by Irrelevant Sound?

Oberfeld, Daniel; Staab, Katharina; Kattner, Florian; Ellermeier, Wolfgang.

Trends Hear ; 28: 23312165241262517, 2024.

Article in English | MEDLINE | ID: mdl-39051688

ABSTRACT

Listeners with normal audiometric thresholds show substantial variability in their ability to understand speech in noise (SiN). These individual differences have been reported to be associated with a range of auditory and cognitive abilities. The present study addresses the association between SiN processing and the individual susceptibility of short-term memory to auditory distraction (i.e., the irrelevant sound effect [ISE]). In a sample of 67 young adult participants with normal audiometric thresholds, we measured speech recognition performance in a spatial listening task with two interfering talkers (speech-in-speech identification), audiometric thresholds, binaural sensitivity to the temporal fine structure (interaural phase differences [IPD]), serial memory with and without interfering talkers, and self-reported noise sensitivity. Speech-in-speech processing was not significantly associated with the ISE. The most important predictors of high speech-in-speech recognition performance were a large short-term memory span, low IPD thresholds, bilaterally symmetrical audiometric thresholds, and low individual noise sensitivity. Surprisingly, the susceptibility of short-term memory to irrelevant sound accounted for a substantially smaller amount of variance in speech-in-speech processing than the nondisrupted short-term memory capacity. The data confirm the role of binaural sensitivity to the temporal fine structure, although its association to SiN recognition was weaker than in some previous studies. The inverse association between self-reported noise sensitivity and SiN processing deserves further investigation.

Asunto(s)

Estimulación Acústica , Umbral Auditivo , Memoria a Corto Plazo , Ruido , Enmascaramiento Perceptual , Reconocimiento en Psicología , Percepción del Habla , Humanos , Ruido/efectos adversos , Masculino , Femenino , Percepción del Habla/fisiología , Adulto Joven , Memoria a Corto Plazo/fisiología , Adulto , Inteligibilidad del Habla , Atención/fisiología , Adolescente

17.

Evidence for proactive and retroactive temporal pattern analysis in simultaneous masking.

Laback, Bernhard; Tabuchi, Hisaaki; Kohlrausch, Armin.

J Acoust Soc Am ; 155(6): 3742-3759, 2024 Jun 01.

Artículo en Inglés | MEDLINE | ID: mdl-38856312

RESUMEN

Amplitude modulation (AM) of a masker reduces its masking on a simultaneously presented unmodulated pure-tone target, which likely involves dip listening. This study tested the idea that dip-listening efficiency may depend on stimulus context, i.e., the match in AM peakedness (AMP) between the masker and a precursor or postcursor stimulus, assuming a form of temporal pattern analysis process. Masked thresholds were measured in normal-hearing listeners using Schroeder-phase harmonic complexes as maskers and precursors or postcursors. Experiment 1 showed threshold elevation (i.e., interference) when a flat cursor preceded or followed a peaked masker, suggesting proactive and retroactive temporal pattern analysis. Threshold decline (facilitation) was observed when the masker AMP was matched to the precursor, irrespective of stimulus AMP, suggesting only proactive processing. Subsequent experiments showed that both interference and facilitation (1) remained robust when a temporal gap was inserted between masker and cursor, (2) disappeared when an F0-difference was introduced between masker and precursor, and (3) decreased when the presentation level was reduced. These results suggest an important role of envelope regularity in dip listening, especially when masker and cursor are F0-matched and, therefore, form one perceptual stream. The reported effects seem to represent a time-domain variant of comodulation masking release.

Asunto(s)

Estimulación Acústica , Umbral Auditivo , Enmascaramiento Perceptual , Humanos , Adulto Joven , Adulto , Factores de Tiempo , Femenino , Masculino , Audiometría de Tonos Puros , Percepción Auditiva/fisiología

18.

Modeling the Intelligibility Benefit of Active Noise Cancelation in Hearing Devices That Improve Signal-to-Noise Ratio.

Sabin, Andrew T; McElhone, Dale; Gauger, Daniel; Rabinowitz, Bill.

Trends Hear ; 28: 23312165241260029, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38831646

RESUMEN

The extent to which active noise cancelation (ANC), when combined with hearing assistance, can improve speech intelligibility in noise is not well understood. One possible source of benefit is ANC's ability to reduce the sound level of the direct (i.e., vent-transmitted) path. This reduction lowers the "floor" imposed by the direct path, thereby allowing any increases to the signal-to-noise ratio (SNR) created in the amplified path to be "realized" at the eardrum. Here we used a modeling approach to estimate this benefit. We compared pairs of simulated hearing aids that differ only in terms of their ability to provide ANC and computed intelligibility metrics on their outputs. The difference in metric scores between simulated devices is termed the "ANC Benefit." These simulations show that ANC Benefit increases as (1) the environmental sound level increases, (2) the ability of the hearing aid to improve SNR increases, (3) the strength of the ANC increases, and (4) the hearing loss severity decreases. The predicted size of the ANC Benefit can be substantial. For a moderate hearing loss, the model predicts improvement in intelligibility metrics of >30% when environments are moderately loud (>70 dB SPL) and devices are moderately capable of increasing SNR (by >4 dB). It appears that ANC can be a critical ingredient in hearing devices that attempt to improve SNR in loud environments. ANC will become more and more important as advanced SNR-improving algorithms (e.g., artificial intelligence speech enhancement) are included in hearing devices.

Asunto(s)

Audífonos , Ruido , Enmascaramiento Perceptual , Relación Señal-Ruido , Inteligibilidad del Habla , Percepción del Habla , Humanos , Ruido/efectos adversos , Simulación por Computador , Estimulación Acústica , Corrección de Deficiencia Auditiva/instrumentación , Personas con Deficiencia Auditiva/rehabilitación , Personas con Deficiencia Auditiva/psicología , Pérdida Auditiva/diagnóstico , Pérdida Auditiva/rehabilitación , Pérdida Auditiva/fisiopatología , Diseño de Equipo , Procesamiento de Señales Asistido por Computador

19.

A one-man bilingual cocktail party: linguistic and non-linguistic effects on bilinguals' speech recognition in Mandarin and English.

Smith, Erin D; Holt, Lori L; Dick, Frederic.

Cogn Res Princ Implic ; 9(1): 35, 2024 Jun 05.

Artículo en Inglés | MEDLINE | ID: mdl-38834918

RESUMEN

Multilingual speakers can find speech recognition in everyday environments like restaurants and open-plan offices particularly challenging. In a world where speaking multiple languages is increasingly common, effective clinical and educational interventions will require a better understanding of how factors like multilingual contexts and listeners' language proficiency interact with adverse listening environments. For example, word and phrase recognition is facilitated when competing voices speak different languages. Is this due to a "release from masking" from lower-level acoustic differences between languages and talkers, or higher-level cognitive and linguistic factors? To address this question, we created a "one-man bilingual cocktail party" selective attention task using English and Mandarin speech from one bilingual talker to reduce low-level acoustic cues. In Experiment 1, 58 listeners more accurately recognized English targets when distracting speech was Mandarin compared to English. Bilingual Mandarin-English listeners experienced significantly more interference and intrusions from the Mandarin distractor than did English listeners, exacerbated by challenging target-to-masker ratios. In Experiment 2, 29 Mandarin-English bilingual listeners exhibited linguistic release from masking in both languages. Bilinguals experienced greater release from masking when attending to English, confirming an influence of linguistic knowledge on the "cocktail party" paradigm that is separate from primarily energetic masking effects. Effects of higher-order language processing and expertise emerge only in the most demanding target-to-masker contexts. The "one-man bilingual cocktail party" establishes a useful tool for future investigations and characterization of communication challenges in the large and growing worldwide community of Mandarin-English bilinguals.

Asunto(s)

Atención , Multilingüismo , Percepción del Habla , Humanos , Percepción del Habla/fisiología , Adulto , Femenino , Masculino , Adulto Joven , Atención/fisiología , Enmascaramiento Perceptual/fisiología , Psicolingüística

20.

Frequency importance functions in simulated bimodal cochlear-implant users with spectral holes.

Yoon, Yang-Soo; Whitaker, Reagan; White, Naomi.

J Acoust Soc Am ; 155(6): 3589-3599, 2024 Jun 01.

Artículo en Inglés | MEDLINE | ID: mdl-38829154

RESUMEN

Frequency importance functions (FIFs) for simulated bimodal hearing were derived using sentence perception scores measured in quiet and noise. Acoustic hearing was simulated using low-pass filtering. Electric hearing was simulated using a six-channel vocoder with three input frequency ranges, resulting in overlap, meet, and gap maps, relative to the acoustic cutoff frequency. Spectral holes present in the speech spectra were created within electric stimulation by setting amplitude(s) of channels to zero. FIFs were significantly different between frequency maps. In quiet, the three FIFs were similar with gradually increasing weights with channels 5 and 6 compared to the first three channels. However, the most and least weighted channels slightly varied depending on the maps. In noise, the patterns of the three FIFs were similar to those in quiet, with steeper increasing weights with channels 5 and 6 compared to the first four channels. Thus, channels 5 and 6 contributed to speech perception the most, while channels 1 and 2 contributed the least, regardless of frequency maps. Results suggest that the contribution of cochlear implant frequency bands for bimodal speech perception depends on the degree of frequency overlap between acoustic and electric stimulation and if noise is absent or present.

Asunto(s)

Estimulación Acústica , Implantes Cocleares , Estimulación Eléctrica , Ruido , Percepción del Habla , Humanos , Ruido/efectos adversos , Implantación Coclear/instrumentación , Personas con Deficiencia Auditiva/psicología , Personas con Deficiencia Auditiva/rehabilitación , Enmascaramiento Perceptual , Adulto
