Search | Biblioteca Virtual em Saúde (Virtual Health Library)

Bird song comparison using deep learning trained from avian perceptual judgments.

Zandberg, Lies; Morfi, Veronica; George, Julia M; Clayton, David F; Stowell, Dan; Lachlan, Robert F.

PLoS Comput Biol ; 20(8): e1012329, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39110762

ABSTRACT

Our understanding of bird song, a model system for animal communication and the neurobiology of learning, depends critically on making reliable, validated comparisons between the complex multidimensional syllables that are used in songs. However, most assessments of song similarity are based on human inspection of spectrograms, or computational methods developed from human intuitions. Using a novel automated operant conditioning system, we collected a large corpus of zebra finches' (Taeniopygia guttata) decisions about song syllable similarity. We use this dataset to compare and externally validate similarity algorithms in widely-used publicly available software (Raven, Sound Analysis Pro, Luscinia). Although these methods all perform better than chance, they do not closely emulate the avian assessments. We then introduce a novel deep learning method that can produce perceptual similarity judgements trained on such avian decisions. We find that this new method outperforms the established methods in accuracy and more closely approaches the avian assessments. Inconsistent (hence ambiguous) decisions are a common occurrence in animal behavioural data; we show that a modification of the deep learning training that accommodates these leads to the strongest performance. We argue this approach is the best way to validate methods to compare song similarity, that our dataset can be used to validate novel methods, and that the general approach can easily be extended to other species.

Subjects

Deep Learning , Finches , Animal Vocalization , Animals , Animal Vocalization/physiology , Finches/physiology , Algorithms , Computational Biology/methods , Judgment/physiology , Male , Sound Spectrography/methods , Operant Conditioning/physiology , Humans
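
The validation idea in this abstract, scoring a similarity algorithm by how often it agrees with the birds' choices, can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not describe the format of the decisions, so the sketch assumes each one is a hypothetical (probe, chosen, rejected) triple of syllable indices, and the distance matrix and the function name `triplet_agreement` are invented for the example rather than taken from Raven, Sound Analysis Pro, Luscinia, or the paper.

```python
def triplet_agreement(dist, decisions):
    """Fraction of decisions in which the distance measure ranks the
    syllable the bird chose as closer to the probe than the rejected one."""
    agree = 0
    for probe, chosen, rejected in decisions:
        if dist[probe][chosen] < dist[probe][rejected]:
            agree += 1
    return agree / len(decisions)


# Toy symmetric distance matrix over four hypothetical syllables.
dist = [
    [0.0, 0.2, 0.9, 0.5],
    [0.2, 0.0, 0.7, 0.6],
    [0.9, 0.7, 0.0, 0.3],
    [0.5, 0.6, 0.3, 0.0],
]

# Assumed decision format: (probe, chosen, rejected) syllable indices.
decisions = [(0, 1, 2), (2, 3, 0), (1, 3, 0), (3, 2, 1)]

score = triplet_agreement(dist, decisions)
print(score)  # three of the four toy decisions agree: 0.75
```

Under this assumed two-alternative format, a measure that guesses at random would score about 0.5, which is the sense of "better than chance" used here.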

Deep perceptual embeddings for unlabelled animal sound events.

Morfi, Veronica; Lachlan, Robert F; Stowell, Dan.

J Acoust Soc Am ; 150(1): 2, 2021 Jul.

Article in English | MEDLINE | ID: mdl-34340499

ABSTRACT

Evaluating sound similarity is a fundamental building block in acoustic perception and computational analysis. Traditional data-driven analyses of perceptual similarity are based on heuristics or simplified linear models, and are thus limited. Deep learning embeddings, often using triplet networks, have been useful in many fields. However, such networks are usually trained using large class-labelled datasets. Such labels are not always feasible to acquire. We explore data-driven neural embeddings for sound event representation when class labels are absent, instead utilising proxies of perceptual similarity judgements. Ultimately, our target is to create a perceptual embedding space that reflects animals' perception of sound. We create deep perceptual embeddings for bird sounds using triplet models. In order to deal with the challenging nature of triplet loss training with the lack of class-labelled data, we utilise multidimensional scaling (MDS) pretraining, attention pooling, and a triplet mining scheme. We also evaluate the advantage of triplet learning compared to learning a neural embedding from a model trained on MDS alone. Using computational proxies of similarity judgements, we demonstrate the feasibility of the method to develop perceptual models for a wide range of data based on behavioural judgements, helping us understand how animals perceive sounds.

Subjects

Sound , Animals , Humans
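
For readers unfamiliar with the triplet models mentioned in this abstract, the following is a minimal sketch of the standard triplet margin loss on plain embedding vectors. It is not the authors' implementation: the abstract does not give the exact loss form, the margin value, or the embedding network, so the Euclidean distance, the default `margin=1.0`, and the function names are assumptions made for illustration. The MDS pretraining, attention pooling, and triplet mining steps are not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: zero once the positive is closer to
    the anchor than the negative by at least `margin` (assumed value)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Well-separated triplet: positive at distance 1, negative at distance 5.
easy = triplet_margin_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Violating triplet: the "positive" is farther away than the "negative".
hard = triplet_margin_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])

print(easy, hard)  # 0.0 5.0
```

Training pushes the loss toward zero, so that sounds judged similar end up closer in the embedding space than sounds judged dissimilar.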

NIPS4Bplus: a richly annotated birdsong audio dataset.

Morfi, Veronica; Bas, Yves; Pamula, Hanna; Glotin, Hervé; Stowell, Dan.

PeerJ Comput Sci ; 5: e223, 2019.

Article in English | MEDLINE | ID: mdl-33816876

ABSTRACT

Recent advances in birdsong detection and classification have approached a limit due to the lack of fully annotated recordings. In this paper, we present NIPS4Bplus, the first richly annotated birdsong audio dataset, which comprises recordings containing bird vocalisations along with their active species tags plus the temporal annotations acquired for them. Statistical information about the recordings, their species-specific tags and their temporal annotations is presented along with example uses. NIPS4Bplus could be used in various ecoacoustic tasks, such as training models for bird population monitoring, species classification, birdsong vocalisation detection and classification.
