ABSTRACT
This study presents an automatic vocalization recognition system for giant pandas (GPs). Over 12,800 vocal samples of GPs were recorded at the Chengdu Research Base of Giant Panda Breeding (CRBGPB) and labeled by CRBGPB animal husbandry staff. These samples were divided into 16 categories of 800 samples each. A novel deep neural network (DNN), named 3Fbank-GRU, was proposed to label GP vocalizations automatically. Unlike existing human vocalization recognition frameworks based on the Mel filter bank (Fbank), which use only the low-frequency features of the voice, we extracted high-, medium- and low-frequency features using Fbank and two self-derived filter banks, named Medium Mel Filter bank (MFbank) and Reversed Mel Filter bank (RFbank). The three sets of frequency features were fed into 3Fbank-GRU for training and testing. After training on the data sets labeled by CRBGPB animal husbandry staff and testing the trained models on recognition tasks, the proposed method achieved a recognition accuracy of over 95%, which indicates that the automatic system can be used to accurately label large data sets of GP vocalizations collected by camera traps or other recording methods.
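As a rough illustration of the filter-bank idea described above, the following Python sketch builds a standard triangular Mel filter bank and a frequency-mirrored variant whose narrow filters sit at the high end of the spectrum. The abstract does not define RFbank or MFbank, so the mirroring used here is only an assumption about what a "reversed" Mel filter bank might look like, and no attempt is made to reproduce MFbank. The function names and parameters (`mel_filter_bank`, `reversed_mel_filter_bank`, `n_filters`, `n_fft`, `sample_rate`) are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style Hz-to-Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    nyquist = sample_rate / 2.0
    # Filter edges are equally spaced on the Mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    bin_freqs = np.linspace(0.0, nyquist, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, centre, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (centre - lo)
        falling = (hi - bin_freqs) / (hi - centre)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def reversed_mel_filter_bank(n_filters, n_fft, sample_rate):
    """ASSUMPTION: mirror the Mel bank along the frequency axis so that
    resolution is finest at high frequencies. Rows are also reversed so
    that filter centres still increase with the row index."""
    return mel_filter_bank(n_filters, n_fft, sample_rate)[::-1, ::-1].copy()


fbank = mel_filter_bank(n_filters=20, n_fft=512, sample_rate=16000)
rfbank = reversed_mel_filter_bank(n_filters=20, n_fft=512, sample_rate=16000)
```

Given a power spectrum `power_spec` of shape `(n_frames, n_fft // 2 + 1)`, log filter-bank features could then be computed as `np.log(power_spec @ fbank.T + 1e-10)`, and likewise for `rfbank`.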
Subjects
Ursidae, Animals, Humans, Animal Husbandry, Neural Networks (Computer)
ABSTRACT
Non-local means (NLMs) is a state-of-the-art image denoising method; however, it sometimes oversmooths anatomical features in low-dose CT (LDCT) imaging. In this paper, we propose a simple way to improve the spatial adaptivity (SA) of NLMs using a pointwise fractal dimension (PWFD). Unlike existing fractal image dimensions, which are computed on whole images or on blocks of images, the new PWFD, named pointwise box-counting dimension (PWBCD), is computed for each image pixel. PWBCD uses a fixed-size local window centered at the pixel under consideration to fit the different local structures of the image. Based on PWBCD, we then propose a new method that improves the SA of NLMs directly: PWBCD is combined with the NLMs weight derived from the difference between local comparison windows. Smoothing results for test images and real sinograms show that PWBCD-NLMs with well-chosen parameters preserves anatomical features better while suppressing noise efficiently. In addition, PWBCD-NLMs outperforms NLMs in LDCT imaging in both visual quality and peak signal-to-noise ratio (PSNR).
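The per-pixel idea described above can be sketched in Python as follows. This is a minimal illustration, not the paper's definition: it assumes a binary mask (for example a thresholded or edge image) rather than a grayscale CT slice, a window whose side is a power of two, and reflective padding at the image borders. The function names `box_counting_dimension` and `pointwise_box_counting_dimension` are hypothetical, and the way PWBCD is combined with the NLMs weights is not reproduced because the abstract does not specify it.

```python
import numpy as np


def box_counting_dimension(patch):
    """Box-counting dimension of a square binary patch.

    ASSUMPTION: the patch side is a power of two. Returns 0.0 for an
    empty patch, where the dimension is otherwise undefined.
    """
    patch = np.asarray(patch, dtype=bool)
    side = patch.shape[0]
    if not patch.any():
        return 0.0
    sizes, counts = [], []
    s = 1
    while s <= side:
        # Tile the patch into s-by-s boxes and count the occupied ones.
        blocks = patch.reshape(side // s, s, side // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())
        sizes.append(s)
        s *= 2
    # Slope of log N(s) against log(1/s) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return float(slope)


def pointwise_box_counting_dimension(mask, window=8):
    """Dimension of the fixed-size window around every pixel of a 2-D mask."""
    mask = np.asarray(mask, dtype=bool)
    before = window // 2
    after = window - before - 1
    padded = np.pad(mask, ((before, after), (before, after)), mode="reflect")
    dims = np.zeros(mask.shape, dtype=float)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            dims[i, j] = box_counting_dimension(
                padded[i:i + window, j:j + window]
            )
    return dims


# A single horizontal line should give a local dimension close to 1.
line = np.zeros((16, 16), dtype=bool)
line[8, :] = True
line_dims = pointwise_box_counting_dimension(line, window=8)
```

In this sketch a filled region yields a local dimension near 2, a straight line near 1, and an isolated point 0, so the resulting map distinguishes flat regions from edge-like structures, which is the kind of local information a spatially adaptive filter could use.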