Speech preprocessing and enhancement based on joint time domain and time-frequency domain analysis.

Zhang, Wenbo; Xie, Xuefeng; Du, Yanling; Huang, Dongmei

Affiliation

Zhang W; College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China.
Xie X; College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China.
Du Y; College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China.
Huang D; Shanghai University of Electric Power, Shanghai, 201306, China.

J Acoust Soc Am ; 155(6): 3580-3588, 2024 Jun 01.

Article in English | MEDLINE | ID: mdl-38829156

ABSTRACT

Speech enhancement aims to make noisy speech signals clearer. Traditional time-frequency domain methods struggle to differentiate between speech and noise, leading to a risk of speech distortion. This paper introduces an approach that combines the time domain and the time-frequency domain using the W-net module to suppress noise at the front end. The module is an improved version of Wave-U-Net, called TTF-W-Net. We conducted experiments using the TIMIT speech and NOISEX-92 noise datasets to evaluate the enhancement performance achieved by integrating preprocessing networks, specifically Wave-U-Net and our TTF-W-Net, into the baseline methods: Phase, FullSubNet+, and DB-AIAT. Experimental results show that TTF-W-Net outperforms the baseline Wave-U-Net by 15.7% on the PESQ metric, and that the baseline networks perform better when our preprocessing method is applied. Consequently, the TTF-W-Net preprocessing network offers effective speech enhancement.
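To illustrate the two signal representations the abstract refers to, the following is a minimal sketch that builds a time-domain waveform and its time-frequency (short-time Fourier transform) magnitude with NumPy. It is not the TTF-W-Net architecture, which the abstract does not describe. The helper `stft`, the synthetic 440 Hz tone standing in for speech, the 16 kHz sampling rate, the noise level, and the frame and hop sizes are all assumptions chosen for illustration.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Hypothetical helper: Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT per frame: shape (n_frames, frame_len // 2 + 1).
    return np.fft.rfft(frames, axis=1)

rng = np.random.default_rng(0)
sr = 16000                                  # assumed sampling rate in Hz
t = np.arange(sr) / sr                      # one second of samples
clean = np.sin(2 * np.pi * 440.0 * t)       # synthetic stand-in for speech
noisy = clean + 0.3 * rng.standard_normal(sr)

time_view = noisy                           # time-domain representation
tf_view = np.abs(stft(noisy))               # time-frequency magnitude

print(time_view.shape, tf_view.shape)
```

A network that works on `time_view` sees the raw samples, including phase, while one that works on `tf_view` sees energy per frequency band over time; a joint approach uses both views of the same signal.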

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: English Journal: J Acoust Soc Am Year: 2024 Document type: Article Country of affiliation: China Country of publication: United States
