Abstract
Acoustic feedback in hearing aids occurs due to the coupling between the hearing aid loudspeaker and microphones. In order to reduce acoustic feedback, adaptive filters are often used to estimate the feedback path. To increase the convergence speed and decrease the computational complexity of the adaptive algorithms, it has been proposed to split the acoustic feedback path into a time-invariant fixed part and a time-varying variable part. A key question of this approach is how to determine the fixed part. In this paper, two approaches are investigated: (1) a digital filter design approach that makes use of the signals of at least two hearing aid microphones and (2) a defined physical location approach using an electro-acoustic model and the signals of one hearing aid microphone and an additional ear canal microphone. An experimental comparison using measured acoustic feedback paths showed that both approaches enable one to reduce the number of variable part coefficients. It is shown that individualization of the fixed part increases the performance. Furthermore, the two approaches offer solutions for different requirements on the effort to a specific hearing aid design on the one hand and the effort during the hearing aid fitting on the other hand.
Subjects
Hearing Aids, Computer-Assisted Signal Processing, Acoustic Stimulation, Acoustics, Equipment Design, Feedback, Humans, Biological Models, Theoretical Models
Abstract
Identifying the target speaker in hearing aid applications is an essential ingredient to improve speech intelligibility. Recently, a least-squares-based method has been proposed to identify the attended speaker from single-trial EEG recordings for an acoustic scenario with two competing speakers. This least-squares-based auditory attention decoding (AAD) method aims at decoding auditory attention by reconstructing the attended speech envelope from the EEG recordings using a trained spatio-temporal filter. While the performance of this AAD method has been mainly studied for noiseless and anechoic acoustic conditions, it is important to fully understand its performance in realistic noisy and reverberant acoustic conditions. In this paper, we investigate AAD using EEG recordings for different acoustic conditions (anechoic, reverberant, noisy, and reverberant-noisy). In particular, we investigate the impact of different acoustic conditions for AAD filter training and for decoding. In addition, we investigate the influence on the decoding performance of the different acoustic components (i.e., reverberation, background noise, and interfering speaker) in the reference signals used for decoding and the training signals used for computing the filters. First, we found that for all considered acoustic conditions it is possible to decode auditory attention with a considerably large decoding performance. In particular, even when the acoustic conditions for AAD filter training and for decoding are different, the decoding performance is still comparably large. Second, when using speech signals affected by either reverberation and/or background noise there is no significant difference in decoding performance compared to when using clean speech signals as reference signals. In contrast, when using reference signals affected by the interfering speaker, the decoding performance significantly decreases. Third, the experimental results indicate that it is even feasible to use training signals affected by reverberation, background noise and/or the interfering speaker for computing the filters.
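The decoding step described in this abstract can be illustrated with a minimal sketch. It assumes that the envelope has already been reconstructed from the EEG by a trained filter, and that the attended speaker is chosen as the one whose reference envelope correlates more strongly with the reconstruction; this correlation-based decision rule is an assumption, since the abstract does not spell it out. The function names and the toy envelopes are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def decode_attention(reconstructed, env_a, env_b):
    """Pick the speaker whose reference envelope best matches the reconstruction.

    Assumed decision rule: larger correlation wins.
    """
    r_a = pearson(reconstructed, env_a)
    r_b = pearson(reconstructed, env_b)
    label = "A" if r_a > r_b else "B"
    return label, r_a, r_b

# Toy data (hypothetical): the reconstruction is a perturbed copy of speaker A.
env_a = [0.1, 0.5, 0.9, 0.4, 0.2, 0.7]
env_b = [0.8, 0.2, 0.1, 0.6, 0.9, 0.3]
reconstructed = [0.15, 0.45, 0.8, 0.5, 0.25, 0.6]

label, r_a, r_b = decode_attention(reconstructed, env_a, env_b)
print(label, round(r_a, 3), round(r_b, 3))
```

In this sketch, using degraded reference signals would simply mean passing noisy or reverberant envelopes as `env_a` and `env_b`.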
Subjects
Acoustic Stimulation, Attention/physiology, Auditory Perception/physiology, Electroencephalography/methods, Noise, Adult, Algorithms, Female, Humans, Male, Speech, Speech Intelligibility, Young Adult
Abstract
Adaptive feedback cancellation (AFC) techniques are common in modern hearing aid devices (HADs) since these techniques have been successful in increasing the stable gain. Accordingly, there has been a significant effort to improve AFC technology, especially for open-fitting and in-ear HADs, for which howling is more prevalent due to the large acoustic coupling between the loudspeaker and the microphone. In this paper, the authors propose a hybrid AFC (H-AFC) scheme that is able to shorten the time it takes to recover from howling. The proposed H-AFC scheme consists of a switched combination adaptive filter, which is controlled by a soft-clipping-based stability detector to select either the standard normalized least mean squares (NLMS) algorithm or the prediction-error-method (PEM) NLMS algorithm to update the adaptive filter. The standard NLMS algorithm is used to obtain fast convergence, while the PEM-NLMS algorithm is used to provide a low bias solution. This stability-controlled adaptation is hence the means to improve performance in terms of both convergence rate as well as misalignment, while only slightly increasing computational complexity. The proposed H-AFC scheme has been evaluated for both speech and music signals, resulting in a significantly improved convergence and re-convergence rate, i.e., a shorter howling period, as well as a lower average misalignment and a larger added stable gain compared to using either the NLMS or the PEM-NLMS algorithm alone. An objective evaluation using the perceptual evaluation of speech quality and the perceptual evaluation of audio quality measures shows that the proposed H-AFC scheme provides very high-quality speech and music signals. This has also been verified through a subjective listening experiment with N = 15 normal-hearing subjects using a multi-stimulus test with hidden reference and anchor, showing that the proposed H-AFC scheme results in a better perceptual quality than the state-of-the-art PEM-NLMS algorithm.
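For readers unfamiliar with the standard NLMS update mentioned in this abstract, the following is a minimal sketch of NLMS identifying a short FIR path. It is not the H-AFC scheme: it runs in open loop with a white-noise input and no incoming speech, so the bias that motivates the PEM variant does not appear. The path coefficients, step size `mu`, and function names are hypothetical choices for illustration.

```python
import random

def simulate_path(x, h):
    """Filter the signal x with the FIR path h (direct convolution)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR path from input x and observed output d using NLMS."""
    w = [0.0] * num_taps      # adaptive filter coefficients
    buf = [0.0] * num_taps    # most recent input samples, newest first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                   # error signal
        norm = sum(b * b for b in buf) + eps         # input power normalization
        step = mu * e / norm
        w = [wi + step * bi for wi, bi in zip(w, buf)]
    return w

random.seed(0)
true_path = [0.5, -0.3, 0.1]                      # hypothetical feedback path
x = [random.gauss(0.0, 1.0) for _ in range(2000)] # white-noise loudspeaker signal
d = simulate_path(x, true_path)                   # microphone signal (feedback only)

w_hat = nlms_identify(x, d, num_taps=len(true_path))

# Normalized misalignment between the true and the estimated path.
misalignment = sum((a - b) ** 2 for a, b in zip(true_path, w_hat)) / sum(
    a * a for a in true_path
)
print(w_hat, misalignment)
```

The normalized misalignment computed at the end is the same kind of measure the abstract refers to when comparing algorithms.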
Subjects
Acoustics, Algorithms, Auditory Perception, Correction of Hearing Impairment/instrumentation, Hearing Aids, Persons With Hearing Impairments/rehabilitation, Computer-Assisted Signal Processing, Acoustic Stimulation, Adult, Equipment Design, Humans, Theoretical Models, Music, Persons With Hearing Impairments/psychology, Sound Spectrography, Speech Intelligibility, Speech Perception
Abstract
To decode auditory attention from single-trial EEG recordings in an acoustic scenario with two competing speakers, a least-squares method has been recently proposed. This method however requires the clean speech signals of both the attended and the unattended speaker to be available as reference signals. Since in practice only the binaural signals consisting of a reverberant mixture of both speakers and background noise are available, in this paper we explore the potential of using these (unprocessed) signals as reference signals for decoding auditory attention in different acoustic conditions (anechoic, reverberant, noisy, and reverberant-noisy). In addition, we investigate whether it is possible to use these signals instead of the clean attended speech signal for filter training. The experimental results show that using the unprocessed binaural signals for filter training and for decoding auditory attention is feasible with a relatively large decoding performance, although for most acoustic conditions the decoding performance is significantly lower than when using the clean speech signals.
Subjects
Electroencephalography, Acoustic Stimulation, Attention, Noise, Speech Perception
Abstract
In many applications in which speech is played back via a sound reinforcement system such as public address systems and mobile phones, speech intelligibility is degraded by additive environmental noise. A possible solution to maintain high intelligibility in noise is to pre-process the speech signal based on the estimated noise power at the position of the listener. The previously proposed AdaptDRC algorithm [Schepker, Rennies, and Doclo (2015). J. Acoust. Soc. Am. 138, 2692-2706] applies both frequency shaping and dynamic range compression under an equal-power constraint, where the processing is adaptively controlled by short-term estimates of the speech intelligibility index. Previous evaluations of the algorithm have focused on normal-hearing listeners. In this study, the algorithm was extended with an adaptive gain stage under an equal-peak-power constraint, and evaluated with eleven normal-hearing and ten mildly to moderately hearing-impaired listeners. For normal-hearing listeners, average improvements in speech reception thresholds of about 4 and 8 dB compared to the unprocessed reference condition were measured for the original algorithm and its extension, respectively. For hearing-impaired listeners, the average improvements were about 2 and 6 dB, indicating that the relative improvement due to the proposed adaptive gain stage was larger for these listeners than the benefit of the original processing stages.
Subjects
Acoustics, Algorithms, Noise/adverse effects, Perceptual Masking, Persons With Hearing Impairments/psychology, Presbycusis/psychology, Computer-Assisted Signal Processing, Speech Intelligibility, Speech Perception, Acoustic Stimulation, Adult, Aged, Pure-Tone Audiometry, Auditory Threshold, Case-Control Studies, Female, Hearing, Humans, Male, Middle Aged, Presbycusis/diagnosis, Presbycusis/physiopathology, Speech Reception Threshold Test, Young Adult
Abstract
In many speech communication applications, such as public address systems, speech is degraded by additive noise, leading to reduced speech intelligibility. In this paper a pre-processing algorithm is proposed that is capable of increasing speech intelligibility under an equal-power constraint. The proposed AdaptDRC algorithm comprises two time- and frequency-dependent stages, i.e., an amplification stage and a dynamic range compression stage that are both dependent on the Speech Intelligibility Index (SII). Experiments using two objective measures, namely, the extended SII and the short-time objective intelligibility measure (STOI), and a formal listening test were conducted to compare the AdaptDRC algorithm with a modified version of a recently proposed algorithm in three different noise conditions (stationary car noise and speech-shaped noise and non-stationary cafeteria noise). While the objective measures indicate a similar performance for both algorithms, results from the formal listening test indicate that for the two stationary noises both algorithms lead to statistically significant improvements in speech intelligibility and for the non-stationary cafeteria noise only the proposed AdaptDRC algorithm leads to statistically significant improvements. A comparison of both objective measures and results from the listening test shows high correlations, although, in general, the performance of both algorithms is overestimated.
Subjects
Algorithms, Speech Intelligibility, Speech Perception/physiology, Adult, Data Compression, Female, Humans, Male, Signal-To-Noise Ratio, Young Adult
Abstract
A reciprocal measurement procedure to measure the acoustic feedback path in hearing aids is investigated. The advantage of the reciprocal measurement compared to the direct measurement is a significantly reduced sound pressure in the ear. The direct and reciprocal measurements are compared using measurements on a dummy head with adjustable ear canals, different earmolds, and variations in the outer sound field. The results show that the reciprocal measurement procedure can be used to obtain plausible feedback paths, while reducing the sound pressure in the ear canal by 30 to 40 dB.
Subjects
Feedback, Hearing Aids, Acoustic Stimulation, Ear Canal, Equipment Design, Humans, Anatomic Models, Noise, Pressure, Speech Acoustics, Pressure Transducers
Abstract
In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios.
Subjects
Algorithms, Auditory Perception/physiology, Hearing Aids, Sensorineural Hearing Loss/therapy, Noise/prevention & control, Speech Intelligibility/physiology, Acoustics/instrumentation, Auditory Threshold/physiology, Sensorineural Hearing Loss/diagnosis, Humans, Loudness Perception/physiology, Signal-To-Noise Ratio
Abstract
Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users.
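As background for the MVDR beamformers evaluated in this abstract, the following is a minimal sketch of the textbook MVDR weight formula w = R⁻¹d / (dᴴR⁻¹d) for two microphones at a single frequency bin. The noise covariance matrix `R`, the steering vector `d`, and the function names are hypothetical, and the sketch says nothing about how the evaluated binaural algorithms estimate these quantities.

```python
def mvdr_weights_2mic(R, d):
    """MVDR weights for two microphones at one frequency bin.

    R: 2x2 Hermitian noise covariance matrix (nested lists of complex numbers).
    d: steering vector towards the target (list of two complex numbers).
    """
    (a, b), (c, e) = R
    det = a * e - b * c
    r_inv = [[e / det, -b / det], [-c / det, a / det]]   # explicit 2x2 inverse
    v = [
        r_inv[0][0] * d[0] + r_inv[0][1] * d[1],
        r_inv[1][0] * d[0] + r_inv[1][1] * d[1],
    ]                                                    # R^-1 d
    denom = d[0].conjugate() * v[0] + d[1].conjugate() * v[1]  # d^H R^-1 d
    return [v[0] / denom, v[1] / denom]

def output_noise_power(w, R):
    """Noise power at the beamformer output, w^H R w (real-valued)."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += w[i].conjugate() * R[i][j] * w[j]
    return total.real

# Hypothetical values: partially correlated noise with unit power per microphone.
R = [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 1.0 + 0j]]
d = [1.0 + 0j, 0.8 - 0.6j]

w = mvdr_weights_2mic(R, d)
response = w[0].conjugate() * d[0] + w[1].conjugate() * d[1]   # w^H d
noise_out = output_noise_power(w, R)
print(response, noise_out)
```

The two quantities printed at the end correspond to the two defining properties of MVDR: a distortionless response towards the target (wᴴd = 1) and minimized output noise power, here compared against the unit noise power at a single microphone.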
Subjects
Auditory Threshold/physiology, Cochlear Implantation/instrumentation, Noise/prevention & control, Perceptual Masking/physiology, Prosthesis Design, Speech Intelligibility, Adult, Aged, Algorithms, Speech Audiometry/methods, Cochlear Implants, Humans, Middle Aged, Prosthesis Failure, Sampling Studies, Signal-To-Noise Ratio, Speech Reception Threshold Test, Young Adult
Abstract
When re-synthesizing individual head related transfer functions (HRTFs) with a microphone array, smoothing HRTFs spectrally and/or spatially prior to the computation of appropriate microphone filters may improve the synthesis accuracy. In this study, the limits of the associated HRTF modifications, until which no perceptual degradations occur, are explored. First, complex spectral smoothing of HRTFs into constant relative bandwidths was considered. As a prerequisite to complex smoothing, the HRTF phase spectra were substituted by linear phases, either for the whole frequency range or above a certain cut-off frequency only. The results indicate that a broadband phase linearization of HRTFs can be perceived for certain directions/subjects and that the thresholds can be predicted by a simple model. HRTF phase spectra can be linearized above 1 kHz without being detectable. After substituting the original phase by a linear phase above 5 kHz, HRTFs may be smoothed complexly into constant relative bandwidths of 1/5 octave, without introducing noticeable artifacts. Second, spatially smoother HRTF directivity patterns were obtained by levelling out spatial notches. It turned out that spatial notches do not have to be retained if they are less than 29 dB below the maximum level in the directivity pattern.
Abstract
In order to study the interaction between the intelligibility advantage in rooms due to the presence of early reflections and due to binaural unmasking, a series of speech reception threshold experiments was performed employing a single reflection of the frontal target speech source as a function of its delay ranging from 0 to 200 ms. The direction of the reflection and the spatial characteristic of the interfering noise (diotic, diffuse, or laterally localized) were varied in the experiments. For the frontal reflection, full temporal integration was observed for all three noise types up to a delay of at least 25 ms followed by gradual intelligibility decay at longer delays. At 200 ms delay the reflection introduced additional intelligibility deterioration. For short delays, intelligibility was not reduced when the reflection was spatially separated from the direct sound in the diffuse and lateral noise conditions. A release from the deterioration effect at 200 ms delay was found for all spatially separated reflections. The suppression of a detrimental reflection was symmetrical in diffuse noise, but azimuth-dependent in lateral noise. This indicates an interaction of spatial and temporal processing of speech reflections which challenges existing binaural speech intelligibility models.
Subjects
Acoustics, Facility Design and Construction, Noise/adverse effects, Perceptual Masking, Speech Intelligibility, Speech Perception, Acoustic Stimulation, Adult, Analysis of Variance, Pure-Tone Audiometry, Auditory Threshold, Female, Humans, Male, Speech Reception Threshold Test, Time Factors, Young Adult
Abstract
This paper evaluates speech enhancement in binaural multimicrophone hearing aids by noise reduction algorithms based on the multichannel Wiener filter (MWF) and the MWF with partial noise estimate (MWF-N). Both algorithms are specifically developed to combine noise reduction with the preservation of binaural cues. Objective and perceptual evaluations were performed with different speech-in-multitalker-babble configurations in two different acoustic environments. The main conclusions are as follows: (a) A bilateral MWF with perfect voice activity detection equals or outperforms a bilateral adaptive directional microphone in terms of speech enhancement while preserving the binaural cues of the speech component. (b) A significant gain in speech enhancement is found when transmitting one contralateral microphone signal to the MWF active at the ipsilateral hearing aid. Adding a second contralateral microphone showed a significant improvement during the objective evaluations but not in the subset of scenarios tested during the perceptual evaluations. (c) Adding the partial noise estimate to the MWF, done to improve the spatial awareness of the hearing aid user, reduces the amount of speech enhancement in a limited way. In some conditions the MWF-N even outperformed the MWF possibly due to an improved spatial release from masking.
Subjects
Hearing Aids, Speech Perception, Adult, Aged, Female, Humans, Male, Middle Aged, Noise, Speech Reception Threshold Test
Abstract
This paper evaluates the influence of three multimicrophone noise reduction algorithms on the ability to localize sound sources. Two recently developed noise reduction techniques for binaural hearing aids were evaluated, namely, the binaural multichannel Wiener filter (MWF) and the binaural multichannel Wiener filter with partial noise estimate (MWF-N), together with a dual-monaural adaptive directional microphone (ADM), which is a widely used noise reduction approach in commercial hearing aids. The influence of the different algorithms on perceived sound source localization and their noise reduction performance was evaluated. It is shown that noise reduction algorithms can have a large influence on localization and that (a) the ADM only preserves localization in the forward direction over azimuths where limited or no noise reduction is obtained; (b) the MWF preserves localization of the target speech component but may distort localization of the noise component. The latter is dependent on signal-to-noise ratio and masking effects; (c) the MWF-N enables correct localization of both the speech and the noise components; (d) the statistical Wiener filter approach introduces a better combination of sound source localization and noise reduction performance than the ADM approach.