Results 1 - 7 of 7
1.
Ecol Evol ; 13(4): e9938, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37013098

ABSTRACT

This study is the first to quantitatively measure the courtship display flights of Latham's snipe (Gallinago hardwickii), listed as "Near Threatened" on the 2022 IUCN Red List of Threatened Species. Using a 16-channel microphone array and 8-channel microphone arrays, we localized the fine-scale movements of the courtship flights of one male performing at high altitude and high speed, and we estimated the direction from which each sound arrived using robot audition. Preliminary analyses of the azimuth and elevation angles of the courtship flights partially revealed a fine-scale flight trajectory: the male Latham's snipe gradually gained altitude while voicing sharp, harsh repeated calls until it reached its peak flight altitude, then dove toward the ground while producing a winnowing sound, following wetland zones without tall vegetation. This observation method is useful for building a better understanding of courtship flight site selection in Latham's snipe. Furthermore, it can be extended to investigate other rare nocturnal or crepuscular birds that are too timid to risk ringing or tagging.
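The azimuth estimates above come from multichannel robot-audition processing; the core idea can be sketched with a simple two-microphone cross-correlation (a simplified stand-in for the actual HARK method; the spacing, sample rate, and test signal below are made-up values):

```python
import numpy as np

C = 343.0    # speed of sound (m/s)
FS = 16000   # sample rate (Hz, assumed)
D = 0.5      # microphone spacing (m, assumed)

def azimuth_from_tdoa(sig_a, sig_b):
    """Estimate the far-field direction of arrival (degrees) from the
    time difference of arrival (TDOA) between two microphones."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_b) - 1)  # samples sig_a lags sig_b
    tdoa = lag / FS                                # seconds
    # Far-field plane-wave model: tdoa = (D / C) * sin(theta)
    s = np.clip(tdoa * C / D, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Synthetic check: broadband noise arriving from ~30 degrees delays
# microphone A relative to microphone B by D * sin(30 deg) / C seconds.
rng = np.random.default_rng(0)
src = rng.standard_normal(4096)
delay = int(round(D * np.sin(np.radians(30.0)) / C * FS))  # integer samples
mic_a = np.roll(src, delay)
est = azimuth_from_tdoa(mic_a, src)
```

With two microphones only azimuth is recoverable; the elevation angles reported in the study require the extra geometry of the 8- and 16-channel arrays.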

2.
Front Robot AI ; 9: 854572, 2022.
Article in English | MEDLINE | ID: mdl-35462782

ABSTRACT

Bioacoustic monitoring has become increasingly popular for studying the behavior and ecology of vocalizing birds. This study aims to verify the practical effectiveness of localization technology for auditory monitoring of the endangered Eurasian bittern (Botaurus stellaris), which inhabits wetlands with thick vegetation in remote areas. Their crepuscular and highly secretive nature, except during the breeding season when they give advertisement calls, makes them difficult to monitor. Because of increasing rates of habitat loss, accurately surveying their numbers and their habitat needs are both important conservation tasks. We investigated the feasibility of localizing their booming calls, which lie in a low frequency range of 100-200 Hz, using microphone arrays and the robot audition software HARK (Honda Research Institute, Audition for Robots with Kyoto University). We first simulated sound source localization of actual bittern calls for microphone arrays of radii 10 cm, 50 cm, 1 m, and 10 m under different noise levels. Second, we monitored bitterns in an actual field environment using small microphone arrays (height = 12 cm; width = 8 cm) in the Sarobetsu Mire, Hokkaido Island, Japan. The simulation results showed that spectral detectability was higher for larger microphone arrays, whereas temporal detectability was higher for smaller microphone arrays. We found that false detections in the smaller microphone arrays arose when the calculation coincidentally approximated the transfer function for the opposite side of the array. Despite technical limitations, we successfully localized the booming calls of at least two males in a reverberant wetland surrounded by thick vegetation and riparian trees.
This study is the first case of localizing such rare birds using small microphone arrays in the field, and it shows how this technology could contribute to auditory surveys of population numbers, behaviors, and microhabitat selection, all of which are difficult to investigate with other observation methods. This methodology is useful not only for a better understanding of bitterns, but can also be extended to investigate other rare nocturnal birds with low-frequency vocalizations, without direct ringing or tagging. Our results also indicate the need for a localization system robust to reverberation and echo in the field, which otherwise cause false detections of the target birds.
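The finding that larger arrays give better spectral detectability at 100-200 Hz follows from array aperture: the inter-microphone phase difference a localizer can exploit grows with the physical extent of the array. A back-of-the-envelope sketch for the radii simulated in the study (the numbers below are our own, not the paper's):

```python
# Largest inter-microphone phase difference for a far-field plane wave,
# for each simulated array radius, at the centre of the bittern call band.
C = 343.0   # speed of sound (m/s)
F = 150.0   # representative call frequency (Hz), centre of 100-200 Hz

def max_phase_diff_deg(radius_m):
    """Phase difference (degrees) across the full aperture (2 * radius)
    for end-fire incidence, the geometry giving the largest delay."""
    max_delay = 2.0 * radius_m / C   # seconds
    return 360.0 * F * max_delay     # degrees of phase at frequency F

for r in (0.10, 0.50, 1.0, 10.0):    # radii simulated in the study
    print(f"radius {r:5.2f} m -> {max_phase_diff_deg(r):8.1f} deg")
```

A 10 cm array spans only about 30 degrees of phase at 150 Hz, leaving little spatial information for a low-frequency localizer, which is consistent with the simulated advantage of the larger arrays.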

3.
Article in English | MEDLINE | ID: mdl-34501626

ABSTRACT

Drone audition techniques are helpful for listening to target sound sources from the sky and can be used for human search tasks at disaster sites. Among the many techniques required for drone audition, sound source tracking is essential, and several tracking methods have thus been proposed. The authors have previously proposed a sound source tracking method that utilizes multiple microphone arrays to obtain the likelihood distribution of sound source locations. These methods have been demonstrated in benchmark experiments; however, their performance on sound sources at various distances and signal-to-noise ratios (SNRs) has been less thoroughly evaluated. Since drone audition often needs to listen to distant sound sources, and the input acoustic signal generally has a low SNR due to drone noise, assessing performance against source distance and SNR is essential. Therefore, this paper presents a concrete evaluation of sound source tracking methods using numerical simulation, focusing on various source distances and SNRs. The simulation results capture how tracking performance changes as source distance and SNR change: the proposed approach based on location distribution estimation tended to be more robust to increasing distance, while existing approaches based on direction estimation tended to be more robust to decreasing SNR.
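Such a simulation needs test inputs at controlled distances and SNRs. A minimal sketch of synthesizing a mixture at a target SNR, with a simple 1/r amplitude-decay assumption for distance (the signals and parameters are made up; the paper's simulator is more detailed):

```python
import numpy as np

def mix_at_snr(source, noise, snr_db, distance_m=1.0):
    """Attenuate the source by 1/distance (free-field amplitude decay,
    an assumption) and scale the noise so the mixture has the requested
    signal-to-noise ratio in dB."""
    sig = source / max(distance_m, 1.0)
    p_sig = np.mean(sig ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_sig / (10.0 ** (snr_db / 10.0))
    noise_scaled = noise * np.sqrt(target_p_noise / p_noise)
    return sig + noise_scaled, noise_scaled

rng = np.random.default_rng(1)
src = rng.standard_normal(16000)
drone = rng.standard_normal(16000)   # stand-in for recorded rotor noise
mix, n = mix_at_snr(src, drone, snr_db=-5.0, distance_m=10.0)
snr_check = 10 * np.log10(np.mean((src / 10.0) ** 2) / np.mean(n ** 2))
```

Sweeping `distance_m` and `snr_db` over grids and running each tracker on the resulting mixtures reproduces the kind of distance-versus-SNR evaluation the paper describes.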


Subjects
Noise, Sound, Acoustics, Computer Simulation, Hearing, Humans
4.
Sensors (Basel) ; 20(19)2020 Oct 01.
Article in English | MEDLINE | ID: mdl-33019608

ABSTRACT

The quality of recognition systems for continuous utterances in signed languages has advanced considerably in recent years. However, research efforts often do not address specific linguistic features of signed languages, such as non-manual expressions. In this work, we evaluate the potential of a single-video-camera-based recognition system with respect to the latter. For this, we introduce a two-stage pipeline based on two-dimensional body joint positions extracted from RGB camera data. The system first separates the data flow of a signed expression into meaningful word segments using a frame-wise binary Random Forest. Next, every segment is transformed into an image-like shape and classified with a Convolutional Neural Network. The proposed system is evaluated on a data set of continuous sentence expressions in Japanese Sign Language with varying non-manual expressions. Exploring multiple variations of data representations and network parameters, we are able to distinguish word segments of specific non-manual intonations from the underlying body joint movement data with 86% accuracy. Full-sentence predictions achieve a total Word Error Rate of 15.75%, an improvement of 13.22% compared with predictions based on ground-truth labeling that is insensitive to non-manual content. Consequently, our analysis constitutes an important contribution to a better understanding of mixed manual and non-manual content in signed communication.
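In the first stage, the frame-wise binary labels must be grouped into contiguous word segments before the second-stage classifier sees them. A minimal sketch of that grouping step (the frame labels here are hard-coded stand-ins for Random Forest outputs, and `min_len` is an assumed smoothing parameter, not from the paper):

```python
def frames_to_segments(labels, min_len=2):
    """Group contiguous 1-labelled frames into (start, end) word segments,
    dropping runs shorter than min_len (a simple smoothing assumption)."""
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab == 1 and start is None:
            start = i                      # a word segment begins
        elif lab == 0 and start is not None:
            if i - start >= min_len:       # keep only long-enough runs
                segments.append((start, i))
            start = None
    if start is not None and len(labels) - start >= min_len:
        segments.append((start, len(labels)))
    return segments

# Frame-wise predictions for a short clip: two words separated by a pause,
# plus a one-frame false positive that min_len filters out.
frame_labels = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
print(frames_to_segments(frame_labels))  # → [(1, 4), (8, 10)]
```

Each (start, end) frame range would then be rendered into the image-like representation and passed to the CNN.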


Subjects
Language, Linguistics, Neural Networks (Computer), Sign Language, Humans, Japan, Movement
5.
Ecol Evol ; 8(1): 812-825, 2018 01.
Article in English | MEDLINE | ID: mdl-29321916

ABSTRACT

Acoustic interactions are important for understanding intra- and interspecific communication in songbird communities from the viewpoint of soundscape ecology. It has been suggested that birds may divide up sound space to increase communication efficiency, tending to avoid overlap with other singing birds. We are interested in clarifying the dynamics underlying this process as an example of a complex system based on short-term behavioral plasticity. However, it is very difficult to manually collect spatiotemporal patterns of acoustic events in natural habitats from a standard single-channel recording of several species singing simultaneously. Our purpose here was to investigate fine-scale spatiotemporal acoustic interactions of the great reed warbler (Acrocephalus arundinaceus). We surveyed spatial and temporal patterns of several vocalizing color-banded great reed warblers using HARK (Honda Research Institute Japan Audition for Robots with Kyoto University), an open-source software for robot audition, and three new 16-channel, stand-alone, water-resistant microphone arrays, named DACHO, spread out in the birds' habitat. We first show that our system estimated the locations of two color-banded individuals' song posts with a mean error distance of 5.5 ± 4.5 m from the observed song posts. We then evaluated the temporal localization accuracy of the songs by comparing the durations of localized songs around the song posts with those annotated by human observers, obtaining an average accuracy score of 0.89 for one bird that stayed at one song post. Using transfer entropy, we further found significant temporal overlap avoidance and an asymmetric relationship between the songs of the two singing individuals. We believe that our system and analytical approach contribute to a better understanding of fine-scale acoustic interactions in time and space in bird communities.
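Transfer entropy, used above to detect the asymmetric relationship between the two singers, can be estimated for binary singing/silent sequences by simple counting. A minimal sketch with history length 1 (the toy sequences are our own, built so that one bird avoids overlapping the other; the study's analysis is more involved):

```python
import random
from collections import Counter
from math import log2

def transfer_entropy(src, tgt):
    """Plug-in transfer entropy (bits) from binary sequence src to tgt,
    history length 1: how much knowing src's present state improves
    prediction of tgt's next state beyond tgt's own present state."""
    triples = list(zip(tgt[1:], tgt[:-1], src[:-1]))  # (x_next, x_now, y_now)
    n = len(triples)
    c_xyz = Counter(triples)
    c_yz = Counter((y, z) for _, y, z in triples)
    c_xy = Counter((x, y) for x, y, _ in triples)
    c_y = Counter(y for _, y, _ in triples)
    te = 0.0
    for (x, y, z), c in c_xyz.items():
        p = c / n
        te += p * log2((c / c_yz[(y, z)]) / (c_xy[(x, y)] / c_y[y]))
    return te

# Toy overlap avoidance: bird B sings exactly when bird A was silent one
# step earlier, so A's state carries information about B's next state,
# but not the other way around.
random.seed(0)
sings_a = [random.randint(0, 1) for _ in range(400)]
sings_b = [0] + [1 - s for s in sings_a[:-1]]
te_a_to_b = transfer_entropy(sings_a, sings_b)
te_b_to_a = transfer_entropy(sings_b, sings_a)
```

The asymmetry `te_a_to_b >> te_b_to_a` is the kind of directed leader-follower signature the study reports between the two individuals.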

6.
Sensors (Basel) ; 17(11)2017 Nov 03.
Article in English | MEDLINE | ID: mdl-29099790

ABSTRACT

In search and rescue activities, unmanned aerial vehicles (UAVs) should exploit sound information to compensate for poor visual information. This paper describes the design and implementation of a UAV-embedded microphone array system for sound source localization in outdoor environments. Four critical development problems were the water resistance of the microphone array, efficiency of assembly, reliability of wireless communication, and sufficiency of visualization tools for operators. To solve these problems, we developed a spherical microphone array system (SMAS) consisting of a microphone array, a stable wireless network communication system, and intuitive visualization tools. The performance of the SMAS was evaluated with simulated data and a field demonstration. The results confirmed that the SMAS provides highly accurate localization, water resistance, prompt assembly, stable wireless communication, and intuitive information for observers and operators.

7.
Neural Comput ; 24(1): 234-72, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22023192

ABSTRACT

This letter presents a new algorithm for blind dereverberation and echo cancellation based on independent component analysis (ICA) for actual acoustic signals. We focus on frequency-domain ICA (FD-ICA) because its computational cost and learning convergence speed are reasonable enough for practical applications such as hands-free speech recognition. In applying conventional FD-ICA as preprocessing for automatic speech recognition in noisy environments, one of the most critical problems is how to cope with reverberation. To extract a clean signal from the reverberant observation, we model the separation process in the short-time Fourier transform domain and apply the multiple input/output inverse-filtering theorem (MINT) to the FD-ICA separation model. A naive implementation of this method is computationally expensive, because its time complexity is quadratic in the reverberation time. Therefore, the main issue in dereverberation is reducing the high computational cost of ICA. In this letter, we reduce the computational complexity to linear in the reverberation time using two techniques: (1) a separation model based on the independence of delayed observed signals with MINT, and (2) spatial sphering for preprocessing. Experiments show that the computational cost grows in proportion to the reverberation time and that our method improves the word correctness of automatic speech recognition by 10 to 20 points in a reverberant environment with RT20 = 670 ms.
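Spatial sphering, the second technique, whitens the multichannel observation so its spatial covariance becomes the identity before ICA learning begins. A minimal sketch via eigendecomposition of the sample covariance (the 2-channel mixing matrix below is a made-up example, not from the letter):

```python
import numpy as np

def sphere(x):
    """Spatial sphering: whiten multichannel data x (channels x samples)
    so its covariance becomes the identity, a standard ICA preprocessing
    step. Returns the whitened data and the sphering matrix cov^(-1/2)."""
    x = x - x.mean(axis=1, keepdims=True)           # remove channel means
    cov = x @ x.T / x.shape[1]                      # sample covariance
    eigval, eigvec = np.linalg.eigh(cov)            # symmetric eigendecomp.
    w = eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T
    return w @ x, w

# Two independent sources mixed by an arbitrary (made-up) matrix.
rng = np.random.default_rng(2)
mix = np.array([[1.0, 0.6], [0.3, 1.0]]) @ rng.standard_normal((2, 5000))
z, w = sphere(mix)
cov_after = z @ z.T / z.shape[1]                    # should be ~identity
```

After sphering, ICA only has to find a rotation, which is what makes the subsequent learning cheaper and better conditioned.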


Subjects
Acoustics, Algorithms, Computer-Assisted Signal Processing, Fourier Analysis, Theoretical Models, Noise