1.
IEEE Access ; 11: 4350-4358, 2023.
Article in English | MEDLINE | ID: mdl-37621739

ABSTRACT

In this paper, we propose a directional signal extraction network (DSENet). DSENet is a low-latency, real-time neural network that, given a reverberant mixture of signals captured by a microphone array, aims at extracting the reverberant signal whose source is located within a directional region of interest. If there are multiple sources situated within the directional region of interest, DSENet will aim at extracting a combination of their reverberant signals. As such, the formulation of DSENet circumvents the well-known crosstalk problem in beamforming while providing an alternative and perhaps more practical approach to other spatially constrained signal extraction methods proposed in the literature. DSENet is based on a computationally efficient and low-distortion linear model formulated in the time domain. As a result, an important application of our work is hearing improvement on edge devices. Simulation results show that DSENet outperforms oracle beamformers, as well as state-of-the-art in low-latency causal speech separation, while incurring a system latency of only 4 ms. Additionally, DSENet has been successfully deployed as a real-time application on a smartphone.

2.
IEEE Access ; 9: 157800-157811, 2021.
Article in English | MEDLINE | ID: mdl-34926101

ABSTRACT

Direction-of-arrival (DOA) estimation is a fundamental technique in array signal processing because of its wide application in beamforming, speech enhancement, and many other assistive speech processing technologies. In this paper, we devise a novel DOA technique based on randomized singular value decomposition (RSVD) to improve the performance of non-uniform non-linear microphone arrays (NUNLA). Accurate and efficient singular value decomposition of large data matrices is computationally challenging, and randomization provides an effective tool for matrix approximation. The developed DOA estimation therefore uses a modified dictionary-based RSVD method to localize single speech sources under low signal-to-noise ratios (SNRs). Unlike previous methods developed for uniform linear microphone arrays, the proposed approach, with its L-shaped three-microphone setup, has no 'left-right' ambiguity. We compare the performance of the proposed method with that of other techniques. The experiments show at least a 20% performance improvement on simulated data and a 25% performance improvement on real data when compared with similar DOA estimation techniques for NUNLA. The proposed method exploits frame-based online time delay of arrival (TDOA) measurements, which allow the algorithm to run on real-time devices. We also show an efficient real-time implementation of the proposed method on a Pixel 3 Android smartphone using its three built-in microphones for hearing aid applications.

3.
J Acoust Soc Am ; 150(3): 1663, 2021 09.
Article in English | MEDLINE | ID: mdl-34598612

ABSTRACT

This work presents a single-channel speech enhancement (SE) framework based on the super-Gaussian extension of the joint maximum a posteriori (SGJMAP) estimation rule. The developed SE algorithm is an open-source research smartphone-based application for hearing improvement studies. In this algorithm, the SGJMAP-based estimation for noisy speech mixture is smoothed along the frequency axis by a Mel filter-bank, resulting in a Mel-warped frequency-domain SGJMAP estimation. The impulse response of this Mel-warped estimation is obtained by applying a Mel-warped inverse discrete cosine transform (Mel-IDCT). This helps in filtering out the background noise and enhancing the speech signal. The proposed application is implemented on an iPhone (Apple, Cupertino, CA) to operate in real time and tested with normal-hearing (NH) and hearing-impaired (HI) listeners with different types of hearing aids through wireless connectivity. The objective speech quality and intelligibility test results are used to compare the performance of the proposed algorithm to existing conventional single-channel SE methods. Additionally, test results from NH and HI listeners show substantial improvement in speech recognition with the developed method in simulated real-world noisy conditions at different signal-to-noise ratio levels.


Subjects
Hearing Aids , Sensorineural Hearing Loss , Speech Perception , Sensorineural Hearing Loss/diagnosis , Sensorineural Hearing Loss/therapy , Humans , Noise/adverse effects , Smartphone , Speech Intelligibility
4.
Article in English | MEDLINE | ID: mdl-34064080

ABSTRACT

BACKGROUND: Identifying and treating hearing loss can help improve communication skills, which often leads to improved quality of life. Many people do not seek medical treatment and therefore go undiagnosed for an extended period before realizing they have hearing loss. This study presents a self-administered, low-cost, smartphone-based hearing test application (HearTest) to quantify the pure-tone hearing thresholds of a user. The HearTest application can be used with commercially available smartphones and earphones that meet the stated specification. METHODS: Air-conduction-based pure-tone audiometry for the smartphone application was designed and implemented to detect hearing thresholds using a traditional "10 dB down and 5 dB up" approach. The smartphone-earphone combination employed was calibrated against a GSI-61 audiometer and ER-3A insert earphones to maintain clinical standards, with the help of subjective testing on 20 normal-hearing (NH) participants. RESULTS: Further subjective testing on 14 participants with NH and retesting on five participants showed that HearTest achieves a high-accuracy audiogram within clinically acceptable limits (≤10 dB HL mean difference) when compared with the reference clinical audiometer. Hardware challenges and limitations in air-conduction-based hearing tests through smartphones, and ways to improve their accuracy and reliability, are discussed. CONCLUSION: The proposed smartphone application provides a simple, affordable, and reliable means for people to learn more about their hearing health without needing access to a formal clinical facility.


Subjects
Quality of Life , Smartphone , Pure-Tone Audiometry , Auditory Threshold , Humans , Reproducibility of Results
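
The "10 dB down and 5 dB up" search named in the abstract above can be sketched as follows. This is a minimal illustration, not the HearTest implementation: the starting level, the level limits, the rule of accepting a level after two responses on ascending runs, and the names `find_threshold` and `hears` are all assumptions made for the example.

```python
def find_threshold(hears, start_db=40, min_db=-10, max_db=90,
                   needed=2, max_trials=100):
    """Staircase search: drop 10 dB after a response, rise 5 dB after a miss.

    `hears(level_db)` is a hypothetical callback returning True when the
    listener responds to a tone at that level. The threshold is taken as the
    first level that collects `needed` responses on ascending presentations
    (an assumed stopping rule).
    """
    level = start_db
    ascending = False          # True when the current level was reached by a 5 dB step up
    ascending_hits = {}        # level -> number of responses on ascending runs
    for _ in range(max_trials):
        if hears(level):
            if ascending:
                ascending_hits[level] = ascending_hits.get(level, 0) + 1
                if ascending_hits[level] >= needed:
                    return level
            if level == min_db:
                return level   # cannot go lower; report the floor
            level = max(min_db, level - 10)
            ascending = False
        else:
            level = min(max_db, level + 5)
            ascending = True
    return None                # no stable threshold found within max_trials


# Simulated listener who hears every tone at or above 25 dB HL.
threshold = find_threshold(lambda level_db: level_db >= 25)
```

A real listener responds inconsistently near threshold, which is why a repeated-response rule of this kind is used instead of stopping at the first response.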
5.
IEEE Open J Signal Process ; 2: 535-544, 2021.
Article in English | MEDLINE | ID: mdl-35495551

ABSTRACT

This work presents a methodology for the joint calibration and synchronization of two arrays of microphones and loudspeakers. The problem is modeled as estimation of the rigid motion of one array with respect to the other, as well as estimation of the synchronization mismatch between the two. The proposed method uses dedicated signals emitted by the loudspeakers of the two arrays to compute a set of time of arrival (TOA) estimates. Through a simple transformation, the estimated TOAs are converted into a set of linearly independent time difference of arrival (TDOA) measurements, which are modeled by a system of nonlinear equations in the unknown parameters of interest. A maximum likelihood estimate is then given as the solution to a nonlinear weighted least squares (NWLS) problem, which is solved using a parallelizable variant of Particle Swarm Optimization (PSO). In this paper, we also derive the Cramér-Rao lower bound (CRLB) and benchmark the proposed method against it in a series of Monte Carlo (MC) simulations. Results show that the proposed method attains performance comparable to the CRLB.

6.
Semin Hear ; 41(4): 291-301, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33364678

ABSTRACT

As part of a National Institutes of Health-National Institute on Deafness and Other Communication Disorders (NIH-NIDCD)-supported project to develop open-source research and smartphone-based apps for enhancing speech recognition in noise, an app called Smartphone Hearing Aid Research Project Version 2 (SHARP-2) was tested with persons with normal and impaired hearing using three sets of hearing aids (HAs) with wireless connectivity to an iPhone. Participants were asked to type sentences presented from a speaker in front of them while hearing noise from behind in two conditions: HA alone and HA + SHARP-2 app running on the iPhone. The signal was presented at a constant level of 65 dBA and the signal-to-noise ratio varied from -10 to +10 dB, so that the task was difficult when listening through the bilateral HAs alone. This was important to allow improvement to be measured when the HAs were connected to the SHARP-2 app on the smartphone. Benefit was achieved for most listeners with the HAs of all three manufacturers, with the greatest improvements recorded for persons with normal (33.56%) and impaired hearing (22.21%) when using the SHARP-2 app with one manufacturer's made-for-all-phones HAs. These results support the continued development of smartphone-based apps as an economical solution for enhancing speech recognition in noise for persons with both normal and impaired hearing.

7.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 952-955, 2020 07.
Article in English | MEDLINE | ID: mdl-33018142

ABSTRACT

In this paper, a dual-channel speech enhancement (SE) method is proposed. The proposed method combines a minimum variance distortionless response (MVDR) beamformer with a super-Gaussian joint maximum a posteriori (SGJMAP) based SE gain function. The proposed SE method runs on a smartphone in real time, providing a portable device for hearing aid (HA) applications. A spectral flux-based voice activity detector (VAD) is used to improve the accuracy of the beamformer output. The efficiency of the proposed SE method is evaluated using speech quality and intelligibility measures and compared with that of other SE techniques. The objective and subjective test results show the capability of the proposed SE method in three different noisy conditions at low signal-to-noise ratios (SNRs) of -5, 0, and +5 dB.


Subjects
Hearing Aids , Smartphone , Voice , Humans , Noise , Speech Intelligibility
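
For readers unfamiliar with the MVDR beamformer mentioned in the abstract above, the textbook weight formula for one frequency bin is w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance matrix and d the steering vector. The sketch below evaluates it for two channels in plain Python. It is not the authors' implementation; the function name `mvdr_weights_2ch` and the covariance and steering values are made up for illustration.

```python
import cmath


def mvdr_weights_2ch(R, d):
    """Return MVDR weights w = R^-1 d / (d^H R^-1 d) for a 2x2 covariance R."""
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[e / det, -b / det], [-c / det, a / det]]
    r_inv_d = [inv[0][0] * d[0] + inv[0][1] * d[1],
               inv[1][0] * d[0] + inv[1][1] * d[1]]
    denom = d[0].conjugate() * r_inv_d[0] + d[1].conjugate() * r_inv_d[1]
    return [v / denom for v in r_inv_d]


# Illustrative (assumed) Hermitian noise covariance and steering vector.
R = [[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]]
d = [1.0, cmath.exp(-0.7j)]
w = mvdr_weights_2ch(R, d)

# Distortionless constraint: the response toward the target, w^H d, equals 1.
response = sum(wi.conjugate() * di for wi, di in zip(w, d))
```

The check at the end reflects the "distortionless" part of the name: whatever the noise covariance, a signal arriving from the steered direction passes with unit gain.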
8.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 956-959, 2020 07.
Article in English | MEDLINE | ID: mdl-33018143

ABSTRACT

Deep neural networks (DNNs) have been useful in solving benchmark problems in various domains, including audio. DNNs have been used to improve several speech processing algorithms that improve speech perception for hearing-impaired listeners. To make use of DNNs to their full potential and to configure models easily, automated machine learning (AutoML) systems have been developed, focusing on model optimization. As an application of AutoML to audio and hearing aids, this work presents an AutoML-based voice activity detector (VAD) that is implemented on a smartphone as a real-time application. The developed VAD can be used to improve the performance of speech processing applications, such as speech enhancement, that are widely used in hearing aid devices. The classification model generated by AutoML is computationally fast and has minimal processing delay, which enables efficient, real-time operation on a smartphone. The steps involved in real-time implementation are discussed in detail. The key contributions of this work include the use of an AutoML platform for hearing aid applications and the realization of the AutoML model on a smartphone. The experimental analysis and results demonstrate the value of using AutoML for the current approach. The evaluations also show improvements over state-of-the-art techniques and reflect the practical usability of the developed smartphone app in different noisy environments.


Subjects
Hearing Aids , Smartphone , Machine Learning , Noise , Speech Intelligibility
9.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 972-975, 2020 07.
Article in English | MEDLINE | ID: mdl-33018147

ABSTRACT

Acoustic feedback cancellation is a challenging problem in the design of sound reinforcement systems, hearing aids, etc. Acoustic feedback is inevitable when the acoustic signal path forms a loop between the microphone and loudspeaker. An efficient short-duration noise injection algorithm is proposed in this paper to estimate the impulse response of the acoustic feedback path model. The algorithm does not require any prior information about the acoustic feedback path. It can optimally estimate the acoustic feedback path for cancellation and avoid the occurrence of any howling episode in varying acoustic environments. The presented algorithm is implemented efficiently on a smartphone whose loudspeaker and microphone are in close proximity, in order to emulate the feedback condition. Being platform-independent, the algorithm can also be implemented on any other set-up or system. The experimental results show that the proposed method performs satisfactorily and can track and cancel the acoustic feedback as the characteristics of the acoustic path change.


Subjects
Hearing Aids , Noise , Acoustics , Algorithms , Feedback
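
The idea of estimating a feedback path by injecting a short noise burst, as described in the abstract above, can be illustrated with a generic cross-correlation estimate: for a white probe x and a microphone signal y = h * x, the tap h[k] is approximately E[y[n]·x[n-k]] / E[x²]. This sketch is not the authors' algorithm. The three-tap path, the ±1 probe, the burst length, and the names `simulate_path` and `estimate_path` are assumptions chosen for the example.

```python
import random


def simulate_path(probe, path):
    """Convolve the probe with a (hypothetical) feedback-path impulse response."""
    return [sum(h * probe[i - k] for k, h in enumerate(path) if i - k >= 0)
            for i in range(len(probe))]


def estimate_path(probe, observed, num_taps):
    """Estimate impulse-response taps by cross-correlating probe and observation."""
    n = len(probe)
    power = sum(p * p for p in probe) / n
    taps = []
    for k in range(num_taps):
        corr = sum(observed[i] * probe[i - k] for i in range(k, n)) / n
        taps.append(corr / power)
    return taps


rng = random.Random(0)
# White +/-1 probe sequence standing in for the injected noise burst.
probe = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(8000)]
true_path = [0.5, -0.3, 0.1]
estimate = estimate_path(probe, simulate_path(probe, true_path), num_taps=3)
```

The estimate approaches the true taps as the burst gets longer, which is the trade-off a short-duration injection scheme has to manage: a shorter burst is less audible but gives a noisier estimate.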
10.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 968-971, 2020 07.
Article in English | MEDLINE | ID: mdl-33018146

ABSTRACT

A compressor in hearing aid devices (HADs) is responsible for mapping the dynamic range of input signals to the residual dynamic range of hearing-impaired (HI) patients. Gains and parameters of the compressor are set according to the HI patient's preferences. In different surroundings, depending on the noise level, the patient may seek to tune the parameters to improve performance. Traditionally, hearing aids are fitted at a clinic by an audiologist using hearing aid software and the HI patient's opinion. In this paper, we propose a frequency-based multi-band compressor implemented as a smartphone application, which can be used as an alternative to traditional HADs. The proposed solution allows the user to tune the compression parameters for each band, along with a choice of compression speed and fitting strategy. Exploiting smartphone processing and hardware capabilities, the application can be used for bilateral hearing loss. The performance of this easy-to-use smartphone-based application is compared with that of traditional HADs using a hearing aid test system. Objective and subjective evaluations are also carried out to quantify the performance.


Subjects
Data Compression , Hearing Aids , Hearing Loss , Speech Perception , Hearing Loss/therapy , Bilateral Hearing Loss , Humans
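
The per-band compression parameters mentioned in the abstract above typically enter through a static input/output curve: below the threshold the gain is constant, and above it each extra dB of input yields only 1/ratio dB of output. The sketch below shows that standard curve applied independently per band. It is not the app's code; the function names, the dictionary keys, and the example settings are assumptions.

```python
def compressor_gain_db(level_db, threshold_db, ratio, makeup_db=0.0):
    """Static compression curve: gain in dB to apply at a given input level."""
    if ratio < 1.0:
        raise ValueError("compression ratio must be >= 1")
    if level_db <= threshold_db:
        return makeup_db
    # Above threshold, output grows by 1/ratio dB per input dB.
    compressed_db = threshold_db + (level_db - threshold_db) / ratio
    return compressed_db - level_db + makeup_db


def multiband_gains_db(band_levels_db, band_settings):
    """Apply the static curve per band; each setting is a hypothetical dict."""
    return [
        compressor_gain_db(level, s["threshold_db"], s["ratio"],
                           s.get("makeup_db", 0.0))
        for level, s in zip(band_levels_db, band_settings)
    ]


# Example: two bands with different (assumed) settings.
settings = [
    {"threshold_db": 50.0, "ratio": 3.0},
    {"threshold_db": 60.0, "ratio": 2.0, "makeup_db": 5.0},
]
gains = multiband_gains_db([80.0, 55.0], settings)
```

In a running compressor the input level would first be smoothed with attack and release time constants, which is where the choice of compression speed comes in; that smoothing is omitted here.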
11.
IEEE Access ; 8: 106296-106309, 2020.
Article in English | MEDLINE | ID: mdl-32793404

ABSTRACT

Alert signals such as sirens and home alarms are important because they warn people of precarious situations. This work presents the detection and separation of these acoustically important alert signals, which should not be attenuated as noise, to assist hearing-impaired listeners. The proposed method is based on a convolutional neural network (CNN) and a convolutional-recurrent neural network (CRNN). The developed method consists of two blocks: the detector block and the separator block. The entire setup is integrated with the speech enhancement (SE) algorithms, ahead of the compression stage, in a hearing aid device (HAD) signal processing pipeline. The detector recognizes the presence of an alert signal in various noisy environments. The separator block separates the alert signal from the mixture of noisy signals before passing it through SE, to ensure minimal or no attenuation of the alert signal. The method is implemented on a smartphone as an application that works seamlessly with HADs in real time. This smartphone assistive setup lets hearing aid users know of the presence of alert sounds even when their sources are out of sight. The algorithm is computationally efficient with a low processing delay. The key contributions of this paper include the development and integration of the alert signal separator block with SE and the realization of the entire setup on a smartphone in real time. The proposed method is compared with several state-of-the-art techniques through objective measures in various noisy conditions. The experimental analysis demonstrates the effectiveness and practical usefulness of the developed setup in real-world noisy scenarios.

12.
J Acoust Soc Am ; 148(1): 389, 2020 07.
Article in English | MEDLINE | ID: mdl-32752751

ABSTRACT

This work presents a two-microphone speech enhancement (SE) framework based on a basic recurrent neural network (RNN) cell. The proposed method operates in real time, improving speech quality and intelligibility in noisy environments. The RNN model, trained on a simple feature set (the real and imaginary parts of the short-time Fourier transform, STFT), is computationally efficient with a minimal input-output processing delay. The proposed algorithm can be used on any stand-alone platform, such as a smartphone, using its two built-in microphones. The detailed operation of the real-time implementation on the smartphone is presented. The developed application works as an assistive tool for hearing aid devices (HADs). Speech quality and intelligibility test results are used to compare the proposed algorithm to existing conventional and neural network-based SE methods. Subjective and objective scores show the superior performance of the developed method over several conventional methods in different noise conditions and at low signal-to-noise ratios (SNRs).


Subjects
Hearing Aids , Sensorineural Hearing Loss , Speech Perception , Hearing , Humans , Computer Neural Networks , Speech Intelligibility
13.
Article in English | MEDLINE | ID: mdl-33972890

ABSTRACT

This work proposes a convolutional recurrent neural network (CRNN) based direction of arrival (DOA) angle estimation method, implemented on an Android smartphone for hearing aid applications. The proposed app provides a 'visual' indication of the direction of a talker on the screen of an Android smartphone to improve the hearing of people with hearing disorders. We use the real and imaginary parts of the short-time Fourier transform (STFT) as the feature set for the proposed CRNN architecture for DOA angle estimation. Real smartphone recordings are used to assess the performance of the proposed method. The accuracy of the proposed method reaches 87.33% for unseen (untrained) environments. This work also presents real-time inference of the proposed method, which is done on an Android smartphone using only its two built-in microphones and no additional component or external hardware. The real-time implementation also demonstrates the generalization and robustness of the proposed CRNN-based model.

14.
IEEE Access ; 8: 197047-197058, 2020.
Article in English | MEDLINE | ID: mdl-33981519

ABSTRACT

In this article, we present a real-time convolutional neural network (CNN)-based speech source localization (SSL) algorithm that is robust to realistic background acoustic conditions (noise and reverberation). We have implemented and tested the proposed method on a prototype (Raspberry Pi) for real-time operation. We use the combination of the imaginary-real coefficients of the short-time Fourier transform (STFT) and spectral flux (SF) with delay-and-sum (DAS) beamforming as the input feature. We train the CNN model on noisy speech recordings collected from different rooms and run inference on an unseen room. We provide a quantitative comparison with five previously published SSL algorithms under several realistic noisy conditions, and show significant improvements from incorporating SF with beamforming as an additional feature to learn the temporal variation in speech spectra. We perform real-time inference of our CNN model on the prototyped platform with low latency (21 milliseconds (ms) per frame with a frame length of 30 ms) and high accuracy (i.e., 89.68% under the babble noise condition at 5 dB SNR). Lastly, we provide a detailed explanation of the real-time implementation and on-device performance (including peak power consumption metrics) that sets this work apart from previously published works. This work has several notable implications for improving the audio-processing algorithms of portable battery-operated smart loudspeakers and hearing improvement (HI) devices.

15.
Article in English | MEDLINE | ID: mdl-34026330

ABSTRACT

In this paper, we present a real-time, noise-robust direction of arrival (DOA) estimation technique using only the three built-in microphones of a modern Android-based smartphone. The proposed method eliminates the previously reported 'front-back' ambiguity caused by the symmetry of two microphones and improves the performance of DOA estimation in noisy speech environments. Our method enhances the spatial awareness of hearing-impaired users by displaying the precise DOA angle of the speech source on their smartphone screen. To increase the efficiency, noise robustness, and accuracy of the proposed DOA estimation method, a spectral pre-filtering technique and a voice activity detector (VAD) based post-filtering are used along with a modified generalized cross-correlation (GCC) technique. Real recorded and simulated data under realistic noisy conditions are used to evaluate the proposed algorithm. The proposed system is implemented in real time on an Android-based smartphone without any additional hardware or external microphone attachments. Experimental results compare the performance of the proposed method with that of versions without pre- or post-filtering under three different noisy conditions at signal-to-noise ratios (SNRs) from 0 dB to 10 dB.
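
As background for the generalized cross-correlation (GCC) step named in the abstract above, here is a minimal sketch of the standard GCC-PHAT delay estimate for one microphone pair, followed by the far-field conversion from delay to angle. It is not the authors' modified GCC and includes none of their pre- or post-filtering. The naive DFT, the 0.13 m microphone spacing, the 48 kHz sampling rate, and all function names are assumptions for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for a short illustrative frame."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x1, x2, max_lag):
    """Delay of x2 relative to x1 in samples (positive means x2 lags x1)."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # PHAT weighting keeps only the phase of the cross-spectrum.
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    r = [v.real for v in dft(whitened, inverse=True)]
    return max(range(-max_lag, max_lag + 1), key=lambda lag: r[lag % n])


def delay_to_angle_deg(delay_samples, fs_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field angle from broadside for one microphone pair."""
    ratio = speed_of_sound * delay_samples / (fs_hz * mic_spacing_m)
    ratio = max(-1.0, min(1.0, ratio))   # guard against |ratio| > 1 from noise
    return math.degrees(math.asin(ratio))


rng = random.Random(1)
burst = [rng.uniform(-1.0, 1.0) for _ in range(32)]
mic1 = burst + [0.0] * 32
mic2 = [0.0] * 3 + burst + [0.0] * 29     # same burst, 3 samples later
delay = gcc_phat_delay(mic1, mic2, max_lag=8)
angle = delay_to_angle_deg(delay, fs_hz=48000, mic_spacing_m=0.13)
```

A single pair can only resolve the angle up to the mirror ambiguity the abstract refers to, which is why a third microphone is needed to tell front from back.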

16.
Interspeech ; 2020: 3281-3285, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33898608

ABSTRACT

In this paper, we present a deep neural network architecture comprising both convolutional neural network (CNN) and recurrent neural network (RNN) layers for real-time single-channel speech enhancement (SE). The proposed neural network model focuses on enhancing the noisy speech magnitude spectrum on a frame-by-frame basis. The developed model is implemented on a smartphone (edge device) to demonstrate the real-time usability of the proposed method. Perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) test results are used to compare the proposed algorithm to previously published conventional and deep learning-based SE methods. Subjective ratings show the performance improvement of the proposed model over the other baseline SE methods.

17.
Proc Meet Acoust ; 39(1), 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-33005287

ABSTRACT

Conventional Blind Source Separation (BSS) techniques are computationally complex. This is due either to the calculation of the demixing matrix for the entire signal or to the frequent update of the demixing matrix at every time frame index, which makes them impractical for many real-time applications. In this paper, a robust, neural network-based two-microphone sound source localization method is used as a criterion to enhance the efficiency of Independent Vector Analysis (IVA), a BSS method. IVA is used to separate speech and noise sources that are convolutively mixed. The practical usability of the proposed method is demonstrated by implementing it on a smartphone in real time to separate speech and noise in realistic scenarios for Hearing-Aid (HA) applications. The experimental results from objective and subjective tests reveal the usefulness of the developed method for real-world applications.

18.
IEEE Access ; 7: 78421-78433, 2019.
Article in English | MEDLINE | ID: mdl-32661495

ABSTRACT

This paper presents a Speech Enhancement (SE) technique based on a multi-objective learning convolutional neural network to improve the overall quality of speech perceived by Hearing Aid (HA) users. The proposed method is implemented on a smartphone as an application that performs real-time SE. This arrangement works as an assistive tool for HAs. A multi-objective learning architecture including primary and secondary features uses a mapping-based convolutional neural network (CNN) model to remove noise from a noisy speech spectrum. The algorithm is computationally fast and has a low processing delay, which enables it to operate seamlessly on a smartphone. The steps and a detailed analysis of the real-time implementation are discussed. The proposed method is compared with existing conventional and neural network-based SE techniques through speech quality and intelligibility metrics in various noisy speech conditions. The key contributions of this paper include the realization of the CNN SE model on a smartphone processor that works seamlessly with HAs. The experimental results demonstrate significant improvements over state-of-the-art techniques and reflect the usability of the developed SE application in noisy environments.

19.
Proc Meet Acoust ; 39(1), 2019 Dec.
Article in English | MEDLINE | ID: mdl-32714483

ABSTRACT

Multi-band Dynamic Range (MBDR) compression is a key part of the signal processing operation in hearing aid devices (HADs). The operating speed of the MBDR compressor plays an important role in preserving the quality and intelligibility of the output signal. A traditional fast-acting compressor preserves the audible cues in quiet speech but, in the presence of surrounding noise, can degrade the sound quality by introducing pumping and breathing effects. Alternatively, a slow-acting compressor maintains the temporal cues and the listening comfort but may provide inadequate gain for soft inputs that come right after loud inputs. HADs may operate in a variable acoustic environment. Therefore, a fixed compression speed might affect the performance of the hearing aids. In this study, we propose a frequency-domain (FFT-based) nine-band adaptive MBDR compression that uses spectral flux as a measure of the intensity change in input level to adapt the speed of the compressor in each band. The gain, threshold, and compression ratio of the compressor for the nine bands are adjusted based on the audiogram of the hearing-impaired patient. The proposed frequency-based adaptive MBDR compression method is implemented on a smartphone. The objective and subjective test results demonstrate the performance of the proposed method compared to fixed compression approaches.
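
The idea of steering the compressor speed with spectral flux, described in the abstract above, can be sketched as follows. Spectral flux is computed here with a common definition, the sum of squared differences between consecutive magnitude spectra. The mapping from flux to release time (a hard threshold switching between a fast and a slow constant) and every numeric value are assumptions for illustration; the abstract does not specify the authors' adaptation rule.

```python
def spectral_flux(prev_mag, curr_mag):
    """Sum of squared changes in magnitude between two consecutive frames."""
    return sum((c - p) ** 2 for p, c in zip(prev_mag, curr_mag))


def choose_release_ms(flux, flux_threshold=1.0, fast_ms=50.0, slow_ms=500.0):
    """Hypothetical rule: act fast when the band level is changing quickly."""
    return fast_ms if flux > flux_threshold else slow_ms


# One band's magnitude bins in two consecutive frames (made-up numbers).
steady = spectral_flux([1.0, 2.0, 1.5], [1.0, 2.1, 1.5])
onset = spectral_flux([1.0, 2.0, 1.5], [1.0, 4.0, 1.5])
release_steady = choose_release_ms(steady)   # small flux: slow-acting
release_onset = choose_release_ms(onset)     # large flux: fast-acting
```

Computing the flux separately for each of the nine bands would let each band switch speed independently, in line with the per-band adaptation the abstract describes.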

20.
Proc Meet Acoust ; 39(1), 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-32742552

ABSTRACT

Deep neural network (DNN) techniques are gaining popularity because of the performance boost they bring to many applications. In this work, we propose a DNN-based method for finding the direction of arrival (DOA) of a speech source for hearing improvement studies and hearing aid applications, using a popular smartphone with no external components as a cost-effective stand-alone platform. We treat DOA estimation as a classification problem and use the magnitude and phase of the speech signal as the feature set for the DNN training stage and for obtaining an appropriate model. The model is trained and derived using real speech and real noisy speech data recorded on a smartphone in different noisy environments under low signal-to-noise ratios (SNRs). The DNN-based DOA method with the pre-trained model is implemented and run on an Android smartphone in real time. The performance of the proposed method is evaluated objectively and subjectively in both training and unseen environments. The test results show the superior performance of the proposed method over conventional methods.
