Neural speech enhancement with unsupervised pre-training and mixture training.
Hao, Xiang; Xu, Chenglin; Xie, Lei.
Affiliation
  • Hao X; Audio, Speech and Language Processing Group, School of Computer Science, Northwestern Polytechnical University, Xi'an, China. Electronic address: xhoare@mail.nwpu.edu.cn.
  • Xu C; Kuaishou Technology, Beijing, China. Electronic address: xuchenglin03@kuaishou.com.
  • Xie L; Audio, Speech and Language Processing Group, School of Computer Science, Northwestern Polytechnical University, Xi'an, China. Electronic address: lxie@nwpu.edu.cn.
Neural Netw; 158: 216-227, 2023 Jan.
Article in En | MEDLINE | ID: mdl-36463693
ABSTRACT
Supervised neural speech enhancement methods require large amounts of paired noisy and clean speech. Since collecting adequate paired data from real-world applications is infeasible, supervised methods typically rely on simulated data. However, the mismatch between simulated and in-the-wild data causes performance inconsistency when a system is deployed in real-world applications. Unsupervised speech enhancement methods address this mismatch by using in-the-wild noisy data directly, without access to the corresponding clean speech, so simulated paired data is not required. However, their performance is not on par with that of supervised methods. To address both problems, this work proposes an unsupervised pre-training and mixture training algorithm that combines the advantages of supervised and unsupervised learning. Specifically, the proposed approach first uses large volumes of unpaired noisy and clean speech for unsupervised pre-training; the noisy data and a small amount of simulated paired data are then used for mixture training to optimize the pre-trained model. Experimental results show that the proposed method outperforms other state-of-the-art supervised and unsupervised learning methods.
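The abstract describes a two-stage recipe: unsupervised pre-training on unpaired noisy and clean speech, followed by mixture training that combines a small simulated paired set with in-the-wild noisy data. The sketch below is a minimal, hypothetical PyTorch illustration of that two-stage schedule, not the authors' implementation: the Enhancer model, the reconstruction and energy-based proxy losses, and the mixing weight alpha are all assumptions, since the record does not specify the paper's exact objectives.

```python
# Hypothetical sketch of the two-stage recipe in the abstract.
# Model and losses are placeholders, not the authors' exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Enhancer(nn.Module):
    """Toy mask-based enhancer operating on magnitude-spectrogram frames."""
    def __init__(self, n_bins: int = 257, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, hidden), nn.ReLU(),
            nn.Linear(hidden, n_bins), nn.Sigmoid(),  # mask in [0, 1]
        )

    def forward(self, noisy_mag: torch.Tensor) -> torch.Tensor:
        return self.net(noisy_mag) * noisy_mag  # masked (enhanced) magnitude

def pretrain_step(model, wild_noisy, unpaired_clean, opt):
    """Stage 1: unsupervised pre-training on *unpaired* corpora.
    Assumption: a reconstruction objective on clean speech plus a weak,
    label-free proxy term on noisy speech; the paper's loss may differ."""
    opt.zero_grad()
    loss = F.mse_loss(model(unpaired_clean), unpaired_clean)  # pass clean through
    loss = loss + model(wild_noisy).mean()  # encourage energy reduction on noise
    loss.backward()
    opt.step()
    return loss.item()

def mixture_step(model, sim_noisy, sim_clean, wild_noisy, opt, alpha=0.5):
    """Stage 2: mixture training, combining supervised loss on a small
    simulated paired set with an unsupervised term on in-the-wild noisy
    data (alpha is a hypothetical mixing weight)."""
    opt.zero_grad()
    sup = F.mse_loss(model(sim_noisy), sim_clean)  # paired supervision
    unsup = model(wild_noisy).mean()               # label-free proxy term
    loss = sup + alpha * unsup
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = Enhancer()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    frames = lambda: torch.rand(8, 257)  # stand-in spectrogram batches
    for _ in range(5):                   # stage 1: unsupervised pre-training
        pretrain_step(model, frames(), frames(), opt)
    for _ in range(5):                   # stage 2: mixture fine-tuning
        mixture_step(model, frames(), frames(), frames(), opt)
```

The key design point the abstract emphasizes is that stage 2 reuses the pre-trained weights rather than training from scratch, so only a small amount of simulated paired data is needed to anchor the supervised term.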
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Speech / Algorithms Language: En Journal: Neural Netw Journal subject: NEUROLOGY Year of publication: 2023 Document type: Article