Graph-Based Audio Classification Using Pre-Trained Models and Graph Neural Networks.

Castro-Ospina, Andrés Eduardo; Solarte-Sanchez, Miguel Angel; Vega-Escobar, Laura Stella; Isaza, Claudia; Martínez-Vargas, Juan David.

Sensors (Basel); 24(7), 2024 Mar 26.

Article in English | MEDLINE | ID: mdl-38610318

ABSTRACT

Sound classification plays a crucial role in enhancing the interpretation, analysis, and use of acoustic data, leading to a wide range of practical applications, of which environmental sound analysis is one of the most important. In this paper, we explore the representation of audio data as graphs in the context of sound classification. We propose a methodology that leverages pre-trained audio models to extract deep features from audio files, which are then employed as node information to build graphs. Subsequently, we train various graph neural networks (GNNs), specifically graph convolutional networks (GCNs), GraphSAGE, and graph attention networks (GATs), to solve multi-class audio classification problems. Our findings underscore the effectiveness of employing graphs to represent audio data. Moreover, they highlight the competitive performance of GNNs in sound classification endeavors, with the GAT model emerging as the top performer, achieving a mean accuracy of 83% in classifying environmental sounds and 91% in identifying the land cover of a site based on its audio recording. In conclusion, this study provides novel insights into the potential of graph representation learning techniques for analyzing audio data.
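
The pipeline described in the abstract (deep features as node information, a graph built over them, then a GNN) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the random `embeddings` stand in for features from a pre-trained audio model, the graph is built with a cosine-similarity k-nearest-neighbour rule (`knn_graph`, a hypothetical helper), and `gcn_layer` applies one untrained graph-convolution step with the usual symmetric normalization. The abstract does not state how the authors constructed their graphs or configured their networks.

```python
import numpy as np


def knn_graph(features, k):
    """Build a symmetric k-nearest-neighbour adjacency matrix (assumed rule)."""
    # Cosine similarity between all pairs of node feature vectors.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbour

    # Indices of the k most similar nodes for each row.
    neighbours = np.argsort(-sim, axis=1)[:, :k]

    n = features.shape[0]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected


def gcn_layer(adj, x, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ x @ weight, 0.0)


rng = np.random.default_rng(0)
# Placeholder for pre-trained audio embeddings: 12 recordings, 16 dimensions.
embeddings = rng.normal(size=(12, 16))
adj = knn_graph(embeddings, k=3)
weight = rng.normal(size=(16, 4))           # untrained layer weights
hidden = gcn_layer(adj, embeddings, weight)
print(adj.shape, hidden.shape)
```

In a real setting the weights would be learned against class labels, and a GAT or GraphSAGE layer would replace the fixed normalization with learned attention or sampled neighbourhood aggregation.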
