Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures.

Park, Sangwook; Han, David K; Elhilali, Mounya

Affiliation

Park S; Department of Electronic Engineering, Gangneung-Wonju National University, Gangneung, 25457 South Korea.
Han DK; Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, 19104 USA.
Elhilali M; Department of Electrical and Computer Engineering and jointly with the Department of Psychology and Brain Sciences, Johns Hopkins University, Baltimore, MD, 21218 USA.

IEEE Trans Multimedia; 25: 4573-4585, 2023.

Article in English | MEDLINE | ID: mdl-37928617

ABSTRACT

Sound event detection is an important facet of audio tagging that aims to identify sounds of interest and to determine both the category and the time boundaries of each sound event in a continuous recording. Advances in deep neural networks have brought tremendous improvement in the performance of sound event detection systems, although at the expense of costly data collection and labeling efforts. In fact, current state-of-the-art methods rely on supervised training that leverages large amounts of data samples and corresponding labels to identify the sound category and time stamps of events. As an alternative, the current study proposes a semi-supervised method that generates pseudo-labels from unlabeled data using a student-teacher scheme that balances self-training and cross-training. Additionally, this paper explores post-processing that extracts sound intervals from the network prediction to further improve sound event detection performance. The proposed approach is evaluated on the sound event detection task of the DCASE2020 challenge. Results on both the "validation" and "public evaluation" sets of the DESED database show significant improvement over state-of-the-art systems in semi-supervised learning.
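
The post-processing step mentioned above, turning frame-level network predictions into sound intervals, can be illustrated with a generic sketch. This is not the authors' procedure, which the abstract does not specify; it assumes a common baseline of median smoothing followed by thresholding. The function names, the threshold of 0.5, the window width, and the hop size are all hypothetical choices made for illustration.

```python
from statistics import median


def median_smooth(probs, width=3):
    """Sliding-window median over per-frame probabilities (window shrinks at the edges)."""
    half = width // 2
    return [median(probs[max(0, i - half): i + half + 1]) for i in range(len(probs))]


def extract_intervals(probs, threshold=0.5, hop_seconds=0.02, width=3):
    """Return (onset, offset) pairs in seconds for runs of smoothed frames above threshold.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    active = [p > threshold for p in median_smooth(probs, width)]
    intervals, start = [], None
    for i, on in enumerate(active):
        if on and start is None:
            start = i  # event onset
        elif not on and start is not None:
            intervals.append((start * hop_seconds, i * hop_seconds))  # event offset
            start = None
    if start is not None:  # event still active at the end of the recording
        intervals.append((start * hop_seconds, len(active) * hop_seconds))
    return intervals
```

In this sketch the median filter fills single-frame dropouts and suppresses single-frame spikes before the threshold is applied, so a brief dip in the prediction does not split one event into two.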

Keywords

Sound event detection; pseudo label; self-training; semi-supervised learning

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Language: English | Journal: IEEE Trans Multimedia | Year of publication: 2023 | Document type: Article