1 - 8 of 8
1.
PeerJ; 12: e17320, 2024.
Article En | MEDLINE | ID: mdl-38766489

Vocal complexity is central to many evolutionary hypotheses about animal communication. Yet, quantifying and comparing complexity remains a challenge, particularly when vocal types are highly graded. Male Bornean orangutans (Pongo pygmaeus wurmbii) produce complex and variable "long call" vocalizations comprising multiple sound types that vary within and among individuals. Previous studies described six distinct call (or pulse) types within these complex vocalizations, but none quantified their discreteness or the ability of human observers to reliably classify them. We studied the long calls of 13 individuals to: (1) evaluate and quantify the reliability of audio-visual classification by three well-trained observers, (2) distinguish among call types using supervised classification and unsupervised clustering, and (3) compare the performance of different feature sets. From 46 acoustic features, we applied machine learning methods (support vector machines, affinity propagation, and fuzzy c-means clustering) to identify call types and assess their discreteness. We additionally used Uniform Manifold Approximation and Projection (UMAP) to visualize the separation of pulses using both extracted features and spectrogram representations. The original six-type classification showed low inter-observer reliability and poor supervised classification accuracy, indicating that the pulse types were not discrete. We propose an updated pulse classification approach that is highly reproducible across observers and achieves strong classification accuracy with support vector machines. Although the low number of call types suggests long calls are fairly simple, the continuous gradation of sounds appears to greatly increase the complexity of this system. This work responds to calls for more quantitative research to define call types and quantify gradedness in animal vocal systems, and highlights the need for a more comprehensive framework for studying vocal complexity vis-à-vis graded repertoires.
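As a rough illustration of the supervised step described in this abstract, the sketch below trains an RBF-kernel support vector machine on synthetic stand-ins for the 46 acoustic features. The feature values, class structure, and sample sizes are invented for the example (not orangutan long-call data); cross-validated accuracy serves here as a crude proxy for how discrete two call types are.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_per_type, n_features = 60, 46   # 46 features, as in the study

# Two hypothetical pulse types with overlapping feature distributions.
type_a = rng.normal(0.0, 1.0, size=(n_per_type, n_features))
type_b = rng.normal(0.8, 1.0, size=(n_per_type, n_features))
X = np.vstack([type_a, type_b])
y = np.array([0] * n_per_type + [1] * n_per_type)

# RBF-kernel SVM with feature standardisation; low cross-validated
# accuracy on real data would suggest the types are not discrete.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
mean_accuracy = scores.mean()
```

With heavily graded real calls, the same pipeline would return accuracies much closer to chance than this well-separated toy example does.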


Vocalization, Animal; Animals; Vocalization, Animal/physiology; Male; Pongo pygmaeus/physiology; Reproducibility of Results; Machine Learning; Acoustics; Sound Spectrography; Borneo
2.
Philos Trans R Soc Lond B Biol Sci; 379(1904): 20230444, 2024 Jun 24.
Article En | MEDLINE | ID: mdl-38705172

Passive acoustic monitoring (PAM) is a powerful tool for studying ecosystems. However, its effective application in tropical environments, particularly for insects, poses distinct challenges. Neotropical katydids produce complex species-specific calls, spanning mere milliseconds to seconds and spread across broad audible and ultrasonic frequencies. However, subtle differences in inter-pulse intervals or central frequencies are often the only discriminatory traits. These extremes, coupled with low source levels and susceptibility to masking by ambient noise, challenge species identification in PAM recordings. This study aimed to develop a deep learning-based solution to automate the recognition of 31 katydid species of interest in a biodiverse Panamanian forest with over 80 katydid species. Beyond these inherent challenges, our efforts were also encumbered by a limited and imbalanced initial training dataset comprising domain-mismatched recordings. To overcome these, we applied rigorous data engineering, improving input variance through controlled playback re-recordings and physics-based data augmentation techniques, and tuning signal-processing, model, and training parameters to produce a custom, well-fit solution. The methods developed here are incorporated into Koogu, an open-source Python-based toolbox for developing deep learning-based bioacoustic analysis solutions. The parametric implementations offer a valuable resource, enhancing the capabilities of PAM for studying insects in tropical ecosystems. This article is part of the theme issue 'Towards a toolkit for global insect biodiversity monitoring'.
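One common physics-motivated augmentation of the kind mentioned above is mixing a clean call into background noise at a controlled signal-to-noise ratio. The sketch below is a minimal, generic version of that step (the synthetic 12 kHz pulse and the 6 dB target are illustrative choices, and this is not Koogu's implementation):

```python
import numpy as np

def mix_at_snr(call, noise, snr_db):
    """Scale `call` so its power relative to `noise` equals `snr_db`,
    then add it to the noise."""
    call_power = np.mean(call ** 2)
    noise_power = np.mean(noise ** 2)
    target_power = noise_power * 10.0 ** (snr_db / 10.0)
    scaled = call * np.sqrt(target_power / call_power)
    return noise + scaled, scaled

rng = np.random.default_rng(1)
t = np.linspace(0.0, 0.05, 2400)           # 50 ms at 48 kHz
call = np.sin(2.0 * np.pi * 12000.0 * t)   # synthetic 12 kHz pulse
noise = rng.normal(0.0, 0.1, size=t.size)

mixed, scaled = mix_at_snr(call, noise, snr_db=6.0)
# Verify the realised SNR of the scaled call against the noise floor.
measured_snr = 10.0 * np.log10(np.mean(scaled ** 2) / np.mean(noise ** 2))
```

Sweeping `snr_db` over a range of values turns one clean recording into many training examples with controlled input variance.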


Acoustics; Vocalization, Animal; Animals; Panama; Deep Learning; Species Specificity
3.
Philos Trans R Soc Lond B Biol Sci; 379(1904): 20230110, 2024 Jun 24.
Article En | MEDLINE | ID: mdl-38705184

Night-time light can have profound ecological effects, even when the source is natural moonlight. The impacts of light can, however, vary substantially by taxon, habitat and geographical region. We used a custom machine learning model built with the Python package Koogu to investigate the in situ effects of moonlight on the calling activity of neotropical forest katydids over multiple years. We prioritised species whose calls were commonly detected in human-annotated data, enabling us to evaluate model performance. We focused on eight species of katydids that the model identified with high precision (generally greater than 0.90) and moderate-to-high recall (minimum 0.35), ensuring that detections were generally correct and that many calls were detected. These results suggest that moonlight has modest effects on the amount of calling, with the magnitude and direction of the effect varying by species: half of the species showed positive effects of light and half showed negative effects. These findings emphasize the importance of understanding natural history for anticipating how biological communities respond to moonlight. The methods applied in this project highlight the emerging opportunities for evaluating large quantities of data with machine learning models to address ecological questions over space and time. This article is part of the theme issue 'Towards a toolkit for global insect biodiversity monitoring'.
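The species-screening step described above (precision greater than 0.90, recall of at least 0.35) can be sketched as a simple filter over per-species detection counts. The counts below are illustrative, not the study's data:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and
    false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# species -> (true positives, false positives, false negatives); invented counts
counts = {
    "species_a": (180, 10, 120),  # precise, moderate recall -> keep
    "species_b": (40, 30, 20),    # too many false positives -> drop
    "species_c": (90, 5, 300),    # precise, but recall below 0.35 -> drop
}

kept = []
for name, (tp, fp, fn) in counts.items():
    p, r = precision_recall(tp, fp, fn)
    if p > 0.90 and r >= 0.35:
        kept.append(name)
```

High precision ensures detections are trustworthy; the recall floor ensures enough calls are caught to measure calling activity at all.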


Forests; Machine Learning; Vocalization, Animal; Animals; Light
4.
Laryngoscope; 133(10): 2517-2524, 2023 10.
Article En | MEDLINE | ID: mdl-36533566

BACKGROUND: Current protocols for bedside swallow evaluation have high rates of false-negative results. Though experts are not consistently able to screen for aspiration risk by assessing vocal quality, there is emerging evidence that vocal acoustic parameters differ significantly in patients at risk of aspiration. Herein, we aimed to determine whether the presence of material on the vocal folds affects acoustic and aerodynamic measures in an excised canine laryngeal model. METHODS: Two ex vivo canine larynges were tested. Three liquids of different viscosities (1:100 diluted glycerin, pure glycerin, and honey-thick Varibar) were placed on the vocal folds at a constant volume. Acoustic and aerodynamic measures were obtained in both adducted and abducted vocal fold configurations. Intraglottal high-speed imaging was used to approximate the maximum divergence angle of the larynges in the studied conditions and examine its relationship to vocal efficiency (VE) and acoustic measures. RESULTS: In glottic insufficiency conditions only, we found that several acoustic parameters could predict the presence of material on the vocal folds. Based on the combination of the aerodynamic and acoustic data, we found that decreased spectral energy in the higher harmonics was associated with decreased VE in the presence of material on the vocal folds and/or glottic insufficiency. CONCLUSION: Decreased spectral energy in the higher harmonics of the voice was found to be a potential biomarker of swallowing dysfunction, as it correlates with decreased vocal efficiency due to material on the vocal folds and/or glottic insufficiency, both of which are known risk factors for aspiration. LEVEL OF EVIDENCE: NA.
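The "spectral energy in the higher harmonics" measure discussed above can be illustrated on a synthetic harmonic source. The fifth-harmonic cutoff and the 1/k amplitude decay are arbitrary choices for the example, not the study's parameters, and the signal is not laryngeal data:

```python
import numpy as np

fs = 16000
f0 = 200.0                       # fundamental frequency, Hz
t = np.arange(0, 0.5, 1.0 / fs)  # 0.5 s: an integer number of periods

# Harmonic series whose amplitudes decay with harmonic number.
signal = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 11))

spectrum = np.abs(np.fft.rfft(signal)) ** 2
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)

# Fraction of spectral energy above the 5th harmonic; a drop in this
# ratio is the kind of change reported as a candidate biomarker.
cutoff = 5 * f0
high_ratio = spectrum[freqs > cutoff].sum() / spectrum.sum()
```

On real phonation recordings the same ratio would be computed per utterance and compared across conditions (dry folds versus material on the folds).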


Glycerol; Larynx; Animals; Dogs; Vocal Cords; Glottis; Acoustics; Phonation
5.
J R Soc Interface; 18(180): 20210297, 2021 07.
Article En | MEDLINE | ID: mdl-34283944

Many animals rely on long-form communication, in the form of songs, for vital functions such as mate attraction and territorial defence. We explored the prospect of improving automatic recognition performance by using the temporal context inherent in song. The ability to accurately detect sequences of calls has implications for conservation and biological studies. We show that the performance of a convolutional neural network (CNN), designed to detect song notes (calls) in short-duration audio segments, can be improved by combining it with a recurrent network designed to process sequences of learned representations from the CNN on a longer time scale. The combined system of independently trained CNN and long short-term memory (LSTM) network models exploits the temporal patterns between song notes. We demonstrate the technique using recordings of fin whale (Balaenoptera physalus) songs, which comprise patterned sequences of characteristic notes. We evaluated several variants of the CNN + LSTM network. Relative to the baseline CNN model, the CNN + LSTM models reduced performance variance, offering a 9-17% increase in area under the precision-recall curve and a 9-18% increase in peak F1-scores. These results show that the inclusion of temporal information may offer a valuable pathway for improving the automatic recognition and transcription of wildlife recordings.
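The paper's CNN + LSTM system is too large to sketch here, but the core intuition, that scores for patterned note sequences can be pooled across the expected inter-note interval, can be shown with a toy stand-in. Everything below (the 25-frame interval, the score values, the pooling window) is invented for illustration; it is not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, interval = 400, 25               # hypothetical note interval, in frames

scores = rng.normal(0.0, 0.3, n_frames)    # noisy per-frame detector scores
note_frames = np.arange(100, 300, interval)
scores[note_frames] += 1.0                 # the detector fires on each song note
scores[20] += 1.5                          # one isolated false alarm

def pooled(scores, interval, n_notes=4):
    """Average each frame's score with the scores one, two, ... notes later,
    rewarding sequences that repeat at the expected interval."""
    out = np.full(scores.size, -np.inf)
    for i in range(scores.size - (n_notes - 1) * interval):
        out[i] = scores[i : i + n_notes * interval : interval].mean()
    return out

seq_scores = pooled(scores, interval)
best = int(np.argmax(seq_scores))   # lands inside the song, not on the spike
```

A recurrent network learns this kind of temporal structure instead of hard-coding it, which is why the combined models above reduced variance and lifted precision-recall performance.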


Neural Networks, Computer; Animals; Time Factors
6.
J Acoust Soc Am; 147(5): 3078, 2020 05.
Article En | MEDLINE | ID: mdl-32486822

Automatically detecting animal signals in soundscape recordings is of benefit to passive acoustic monitoring programs which may be undertaken for research or conservation. Numerous algorithms exist, which are typically optimized for certain situations (i.e., certain animal sound types and ambient noise conditions). Adding to the library of algorithms, this paper developed, tested, and compared three detectors for Omura's whale vocalizations (15-62 Hz; <15 s) in marine soundscape recordings which contained noise from other animals, wind, earthquakes, ships, and seismic surveys. All three detectors were based on processing of spectrographic representations. The specific methods were spectrogram cross-correlation, entropy computation, and spectral intensity "blob" tracing. The latter two were general-purpose detectors that were adapted for detection of Omura's whale vocalizations. Detector complexity and post-processing effort varied across the three detectors. Performance was assessed qualitatively using demonstrative examples, and quantitatively using Receiver-Operating Characteristics and Precision-Recall curves. While the results of quantitative assessment were dominated by the spectrogram cross-correlation method, qualitative assessment showed that all three detectors offered promising performance.
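A minimal version of the spectrogram cross-correlation method named above slides a template patch along the time axis and scores each offset by normalised correlation. The spectrograms below are synthetic matrices, not Omura's whale recordings, and the template shape is invented:

```python
import numpy as np

rng = np.random.default_rng(3)
n_freq, n_time = 32, 200

template = np.zeros((n_freq, 10))
template[10:14, :] = 1.0                    # a band-limited, 10-frame call shape

spec = rng.random((n_freq, n_time)) * 0.2   # background-noise spectrogram
spec[:, 120:130] += template                # embed one call at frame 120

def xcorr_scores(spec, template):
    """Normalised cross-correlation of the template at each time offset."""
    w = template.shape[1]
    t = (template - template.mean()).ravel()
    t /= np.linalg.norm(t)
    out = np.empty(spec.shape[1] - w + 1)
    for i in range(out.size):
        patch = (spec[:, i : i + w] - spec[:, i : i + w].mean()).ravel()
        norm = np.linalg.norm(patch)
        out[i] = patch @ t / norm if norm > 0 else 0.0
    return out

scores = xcorr_scores(spec, template)
detected = int(np.argmax(scores))           # peaks at the embedded call
```

Thresholding `scores` yields detections; the entropy and blob-tracing detectors in the paper trade this template specificity for generality.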


Balaenoptera; Acoustics; Animals; Cetacea; Noise; Sound; Sound Spectrography; Vocalization, Animal
7.
J Acoust Soc Am; 147(1): 260, 2020 01.
Article En | MEDLINE | ID: mdl-32006980

Extraction of tonal signals embedded in background noise is a crucial step before classification and separation of low-frequency sounds of baleen whales. This work reports results of comparing five tonal detectors, namely the instantaneous frequency estimator, YIN estimator, harmonic product spectrum, cost-function-based detector, and ridge detector. Comparisons, based on a low-frequency adaptation of the Silbido scoring feature, employ five metrics, which quantify the effectiveness of these detectors at retrieving tonal signals over a wide range of signal-to-noise ratios (SNRs), as well as the quality of the detection results. Ground-truth data were generated by embedding 20 synthetic Antarctic blue whale (Balaenoptera musculus intermedia) calls in randomly extracted 30-min noise segments from a 79-h library recorded by an ocean-bottom seismometer in the Indian Ocean during 2012-2013. Monte Carlo simulations were performed using 20 trials per SNR, ranging from 0 dB to 15 dB. Overall, the tonal detection results show the superiority of the cost-function-based and ridge detectors over the other detectors for all SNR values. In particular, for lower SNRs (≤3 dB), these two methods outperformed the other three with high recall, low fragmentation, and high coverage scores. For SNRs ≥7 dB, the five methods performed similarly.
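A bare-bones ridge detector, one of the five methods compared above, simply traces the strongest frequency in each spectrogram frame. Serious ridge detectors add continuity constraints across frames; this toy version just follows the per-frame peak of a synthetic tonal downsweep (the sweep parameters loosely mimic a low-frequency whale tone and are not from the paper):

```python
import numpy as np
from scipy.signal import stft

fs = 250                          # Hz; a low rate suits low-frequency calls
t = np.arange(0, 20, 1 / fs)

# Linear downsweep from 28 Hz to 18 Hz over 20 s.
f_inst = 28.0 - 0.5 * t
x = np.sin(2 * np.pi * np.cumsum(f_inst) / fs)

freqs, times, Z = stft(x, fs=fs, nperseg=256)
ridge = freqs[np.abs(Z).argmax(axis=0)]   # peak frequency per frame

# Skip the zero-padded edge frames when reading off the traced contour.
start, end = ridge[1], ridge[-2]
```

In noisy recordings the per-frame argmax fragments easily, which is exactly what the fragmentation and coverage metrics above are designed to penalise.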


Balaenoptera/psychology; Signal Processing, Computer-Assisted; Sound Spectrography; Transducers; Vocalization, Animal; Animals; Signal-To-Noise Ratio
8.
J Acoust Soc Am; 137(6): 3077-86, 2015 Jun.
Article En | MEDLINE | ID: mdl-26093399

Prior research has shown that echolocation clicks of several species of terrestrial and marine fauna can be modelled as Gabor-like functions. Here, a system is proposed for the automatic detection of a variety of such signals. It is shown mathematically that the output of the Teager-Kaiser Energy Operator (TKEO) applied to Gabor-like signals can be approximated by a Gaussian function. Based on this result, a detection algorithm involving post-processing of the TKEO output is presented. The ratio of the outputs of two moving-average filters, one Gaussian and one rectangular, is shown to be an effective detection parameter. Detector performance is assessed using synthetic and real recordings (taken from the MobySound database). The detection method is shown to work readily with a variety of echolocation clicks and in various recording scenarios. The system exhibits low computational complexity and operates several times faster than real time. Performance comparisons are made to other publicly available detectors, including PAMGuard.
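The TKEO-plus-filter-ratio idea described above can be sketched on a synthetic Gabor-like click in noise. The click parameters, filter widths, and noise level below are illustrative choices, not the paper's settings:

```python
import numpy as np

def tkeo(x):
    """Discrete Teager-Kaiser Energy Operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

rng = np.random.default_rng(4)
fs = 96000
t = np.arange(-0.002, 0.002, 1 / fs)

# Gabor-like click: sinusoidal carrier under a Gaussian envelope, so its
# TKEO output is approximately Gaussian-shaped.
click = np.exp(-((t / 2e-4) ** 2)) * np.sin(2 * np.pi * 30000 * t)

x = rng.normal(0.0, 0.05, 4 * t.size)
centre = 2 * t.size                       # embed the click mid-recording
x[centre - t.size // 2 : centre + t.size // 2] += click

e = tkeo(x)

# Ratio of a narrow Gaussian moving average (matched to the click-scale
# energy pulse) to a long rectangular one (tracking the noise floor):
# near 1 in stationary noise, large at a transient click.
g = np.exp(-0.5 * (np.arange(-48, 49) / 12.0) ** 2)
g /= g.sum()
gauss = np.convolve(e, g, mode="same")
rect = np.convolve(e, np.ones(1024) / 1024, mode="same")
ratio = gauss / np.maximum(rect, 1e-12)
detected = int(np.argmax(ratio))          # peaks near the embedded click
```

The rectangular filter in the paper is paired at the click scale rather than the long noise-floor window used here; the contrast between a matched, peaked average and a flat one is the detection principle either way.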


Acoustics; Algorithms; Cetacea/classification; Cetacea/physiology; Echolocation/classification; Models, Theoretical; Signal Processing, Computer-Assisted; Vocalization, Animal/classification; Animals; Pattern Recognition, Automated; Sound Spectrography; Time Factors