Results 1 - 20 of 8,440
1.
Nat Commun ; 15(1): 3617, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714699

ABSTRACT

Sperm whales (Physeter macrocephalus) are highly social mammals that communicate using sequences of clicks called codas. While a subset of codas have been shown to encode information about caller identity, almost everything else about the sperm whale communication system, including its structure and information-carrying capacity, remains unknown. We show that codas exhibit contextual and combinatorial structure. First, we report previously undescribed features of codas that are sensitive to the conversational context in which they occur and are systematically controlled and imitated across whales; we call these rubato and ornamentation. Second, we show that codas form a combinatorial coding system in which rubato and ornamentation combine with two context-independent features, which we call rhythm and tempo, to produce a large inventory of distinguishable codas. Sperm whale vocalisations are more expressive and structured than previously believed, and are built from a repertoire comprising nearly an order of magnitude more distinguishable codas. These results show that context-sensitive and combinatorial vocalisation can appear in organisms with divergent evolutionary lineages and vocal apparatus.
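As an illustrative note (not taken from the cited study), duration-based descriptors in the spirit of the rhythm and tempo features described above can be derived from a coda's click times; the exact definitions below are assumptions:

```python
import numpy as np

def coda_features(click_times):
    """Derive simple rhythm/tempo descriptors from a coda's click times (seconds).

    Illustrative definitions only: 'tempo' as total coda duration and 'rhythm'
    as inter-click intervals normalized by that duration, loosely following the
    abstract's terminology.
    """
    clicks = np.sort(np.asarray(click_times, dtype=float))
    icis = np.diff(clicks)                        # inter-click intervals
    tempo = clicks[-1] - clicks[0]                # total coda duration (s)
    rhythm = icis / tempo if tempo > 0 else icis  # duration-normalized ICI pattern
    return {"n_clicks": len(clicks), "tempo_s": tempo, "rhythm": rhythm}

# Example: a hypothetical 5-click coda
print(coda_features([0.00, 0.18, 0.36, 0.55, 0.80]))
```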


Subject(s)
Sperm Whale , Animal Vocalization , Animals , Animal Vocalization/physiology , Sperm Whale/physiology , Sperm Whale/anatomy & histology , Male , Female , Sound Spectrography
2.
Ter Arkh ; 96(3): 228-232, 2024 Apr 16.
Article in Russian | MEDLINE | ID: mdl-38713036

ABSTRACT

AIM: To evaluate the possibility of using spectral analysis of cough sounds in the diagnosis of the new coronavirus infection COVID-19. MATERIALS AND METHODS: Spectral toussophonobarography was performed in 218 patients with COVID-19 [48.56% men, 51.44% women, average age 40.2 (32.4; 51.0)] and in 60 healthy individuals [50% men, 50% women, average age 41.7 (32.2; 53.0)] with induced cough (inhalation of a citric acid solution at a concentration of 20 g/l through a nebulizer). Recordings were made using a contact microphone mounted on a tripod 15-20 cm from the subject's face. The recordings were processed in a computer program, and spectral analysis of the cough sounds was performed using Fourier transform algorithms. The following parameters were evaluated: duration of the cough act (ms), ratio of low-frequency energy (60-600 Hz) to high-frequency energy (600-6000 Hz), and frequency of maximum energy of the cough sound (Hz). RESULTS: Statistical processing showed that the cough-sound parameters of COVID-19 patients differ from those of healthy individuals. The obtained data were substituted into the developed regression equation; rounded to an integer, the result was interpreted as "0" - no COVID-19, or "1" - COVID-19 present. CONCLUSION: The technique showed high sensitivity and specificity. In addition, the method is easy to use and does not require expensive equipment, so it can be used in practice for timely diagnosis of COVID-19.
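For illustration only (not the authors' software), the three cough-sound parameters listed above could be computed from a mono waveform roughly as follows; the band edges follow the abstract, everything else is assumed:

```python
import numpy as np

def cough_features(x, fs):
    """Duration (ms), low/high band energy ratio (60-600 Hz vs 600-6000 Hz),
    and frequency of maximum spectral energy (Hz) for one cough sound."""
    x = np.asarray(x, dtype=float)
    duration_ms = 1000.0 * len(x) / fs

    spectrum = np.abs(np.fft.rfft(x)) ** 2            # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)

    low = spectrum[(freqs >= 60) & (freqs < 600)].sum()
    high = spectrum[(freqs >= 600) & (freqs <= 6000)].sum()
    ratio = low / high if high > 0 else np.inf

    band = (freqs >= 60) & (freqs <= 6000)
    f_max = freqs[band][np.argmax(spectrum[band])]    # frequency of maximum energy
    return duration_ms, ratio, f_max

# Example with synthetic data: 0.3 s of noise at 44.1 kHz
fs = 44100
print(cough_features(np.random.randn(int(0.3 * fs)), fs))
```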


Asunto(s)
COVID-19 , Tos , SARS-CoV-2 , Humanos , Tos/diagnóstico , Tos/etiología , Tos/fisiopatología , COVID-19/diagnóstico , Femenino , Masculino , Adulto , Persona de Mediana Edad , Espectrografía del Sonido/métodos
3.
J Acoust Soc Am ; 155(5): 3037-3050, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38717209

ABSTRACT

Progress in fin whale research is hindered by debate over whether the two typical call types, type A and type B (characterized by central source frequencies of 17-20 Hz and 20-30 Hz, respectively), originate from a single fin whale or from two individuals. Here, hydroacoustic data are employed to study the type, vocal behavior, and temporal evolution of fin whale calls around southern Wake Island from 2010 to 2022. We find that (1) type-A and type-B calls come from two individuals, based on the large source separation of the two calls obtained through high-precision determination of source location; (2) type-A fin whales exert vocal influence on type-B fin whales, in that type-B fin whales become paired with type-A calls and vocalize regularly when type-A fin whales appear, and type-A fin whales always lead the call sequences; and (3) some type-A fin whales stop calling when another type-A fin whale approaches within about 1.6 km. During 2010-2022, type-A calls occur every year, whereas type-B calls are prevalent only after November 2018. We propose cultural transmission from type-A fin whales to type-B fin whales and/or a population increase of type-B fin whales in the region after November 2018.


Subject(s)
Acoustics , Fin Whale , Animal Vocalization , Animals , Fin Whale/physiology , Sound Spectrography , Time Factors , Islands
4.
J Acoust Soc Am ; 155(5): 3071-3089, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38717213

ABSTRACT

This study investigated how 40 Chinese learners of English as a foreign language (EFL learners) differed from 40 native English speakers in the production of four English tense-lax contrasts, /i-ɪ/, /u-ʊ/, /ɑ-ʌ/, and /æ-ε/, by examining the acoustic measurements of duration, the first three formant frequencies, and the slope of the first formant movement (F1 slope). The dynamic formant trajectory was modeled using discrete cosine transform coefficients to demonstrate the time-varying properties of formant trajectories. A discriminant analysis was employed to illustrate the extent to which Chinese EFL learners relied on different acoustic parameters. This study found that: (1) Chinese EFL learners overemphasized durational differences and weakened spectral differences for the /i-ɪ/, /u-ʊ/, and /ɑ-ʌ/ pairs, although they maintained sufficient spectral differences for /æ-ε/. In contrast, native English speakers predominantly used spectral differences across all four pairs; (2) in non-low tense-lax contrasts, unlike native English speakers, Chinese EFL learners failed to exhibit different F1 slope values, indicating a non-nativelike tongue-root placement during the articulatory process. The findings underscore the contribution of dynamic spectral patterns to the differentiation between English tense and lax vowels, and reveal the influence of precise articulatory gestures on the realization of the tense-lax contrast.
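A minimal sketch of the trajectory-modelling step named above, assuming a formant track has already been extracted; keeping the first three DCT coefficients is an assumption, not necessarily the study's choice:

```python
import numpy as np
from scipy.fft import dct

def formant_dct(track, n_coeffs=3):
    """Summarize a time-varying formant track with its first few DCT coefficients.

    c0 roughly captures the mean level, c1 the overall slope, and c2 the
    curvature of the trajectory.
    """
    track = np.asarray(track, dtype=float)
    coeffs = dct(track, type=2, norm="ortho")
    return coeffs[:n_coeffs]

# Example: a rising F1 trajectory sampled at 10 points across a vowel
f1_track = np.linspace(400, 650, 10)
print(formant_dct(f1_track))
```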


Asunto(s)
Multilingüismo , Fonética , Acústica del Lenguaje , Humanos , Masculino , Femenino , Adulto Joven , Medición de la Producción del Habla , Adulto , Lenguaje , Acústica , Aprendizaje , Calidad de la Voz , Espectrografía del Sonido , Pueblos del Este de Asia
5.
J Acoust Soc Am ; 155(4): 2724-2727, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38656337

ABSTRACT

The auditory sensitivity of a small songbird, the red-cheeked cordon bleu, was measured using the standard methods of animal psychophysics. Hearing in cordon bleus is similar to other small passerines with best hearing in the frequency region from 2 to 4 kHz and sensitivity declining at the rate of about 10 dB/octave below 2 kHz and about 35 dB/octave as frequency increases from 4 to 9 kHz. While critical ratios are similar to other songbirds, the long-term average power spectrum of cordon bleu song falls above the frequency of best hearing in this species.


Asunto(s)
Estimulación Acústica , Umbral Auditivo , Audición , Pájaros Cantores , Vocalización Animal , Animales , Vocalización Animal/fisiología , Audición/fisiología , Pájaros Cantores/fisiología , Masculino , Psicoacústica , Espectrografía del Sonido , Femenino
6.
J Acoust Soc Am ; 155(4): 2803-2816, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38662608

ABSTRACT

Urban expansion has increased pollution, including both physical (e.g., exhaust, litter) and sensory (e.g., anthropogenic noise) components. Urban avian species tend to increase the frequency and/or amplitude of songs to reduce masking by low-frequency noise. Nevertheless, song propagation to the receiver can also be constrained by the environment. We know relatively little about how this propagation may be altered across species that (1) vary in song complexity and (2) inhabit areas along an urbanization gradient. We investigated differences in song amplitude, attenuation, and active space, or the maximum distance a receiver can detect a signal, in two human-commensal species: the house sparrow (Passer domesticus) and house finch (Haemorhous mexicanus). We described urbanization both discretely and quantitatively to investigate the habitat characteristics most responsible for propagation changes. We found mixed support for our hypothesis of urban-specific degradation of songs. Urban songs propagated with higher amplitude; however, urban song fidelity was species-specific and showed lowered active space for urban house finch songs. Taken together, our results suggest that urban environments may constrain the propagation of vocal signals in species-specific manners. Ultimately, this has implications for the ability of urban birds to communicate with potential mates or kin.


Asunto(s)
Pinzones , Especificidad de la Especie , Urbanización , Vocalización Animal , Animales , Vocalización Animal/fisiología , Pinzones/fisiología , Gorriones/fisiología , Ruido , Espectrografía del Sonido , Ecosistema , Humanos , Enmascaramiento Perceptual/fisiología , Masculino
7.
PLoS One ; 19(4): e0299250, 2024.
Article in English | MEDLINE | ID: mdl-38635752

ABSTRACT

Passive acoustic monitoring has improved our understanding of vocalizing organisms in remote habitats and during all weather conditions. Many vocally active species are highly mobile, and their populations overlap; however, distinct vocalizations allow individuals or populations to be tracked and discriminated. Using signature whistles, the individually distinct calls of bottlenose dolphins, we calculated a minimum abundance of individuals, characterized and compared signature whistles from five locations, and determined reoccurrences of individuals throughout the Mid-Atlantic Bight and Chesapeake Bay, USA. We identified 1,888 signature whistles; their duration, number of extrema, and start, end, and minimum frequencies varied significantly by site. All of these characteristics were deemed important for determining the site from which a whistle originated. Given the distinct signature whistle characteristics and the lack of spatial mixing of the dolphins detected at the Offshore site, we suspect that these dolphins belong to a different population than those at the Coastal and Bay sites. Signature whistles were also found to be shorter when ambient sound levels were higher. Using only the passively recorded vocalizations of this marine top predator, we obtained information about its population and how it is affected by ambient sound levels, which will increase as offshore wind energy is developed. In this rapidly developing area, these calls offer critical management insights for this protected species.


Asunto(s)
Delfín Mular , Vocalización Animal , Animales , Espectrografía del Sonido , Ecosistema
8.
J Acoust Soc Am ; 155(4): 2627-2635, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38629884

ABSTRACT

Passive acoustic monitoring (PAM) is an optimal method for detecting and monitoring cetaceans as they frequently produce sound while underwater. Cue counting, counting acoustic cues of deep-diving cetaceans instead of animals, is an alternative method for density estimation, but requires an average cue production rate to convert cue density to animal density. Limited information about click rates exists for sperm whales in the central North Pacific Ocean. In the absence of acoustic tag data, we used towed hydrophone array data to calculate the first sperm whale click rates from this region and examined their variability based on click type, location, distance of whales from the array, and group size estimated by visual observers. Our findings show click type to be the most important variable, with groups that include codas yielding the highest click rates. We also found a positive relationship between group size and click detection rates that may be useful for acoustic predictions of group size in future studies. Echolocation clicks detected using PAM methods are often the only indicator of deep-diving cetacean presence. Understanding the factors affecting their click rates provides important information for acoustic density estimation.
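As a generic illustration of the cue-counting conversion mentioned above (not the study's analysis), an average cue production rate turns cue density into animal density:

```python
def animal_density(cue_density_per_km2_per_hr, cue_rate_per_animal_per_hr):
    """Cue counting: animals per km^2 = (cues per km^2 per hour) /
    (cues per animal per hour). Units here are illustrative; a clicks-per-whale
    rate like the one estimated in the abstract would supply the denominator."""
    return cue_density_per_km2_per_hr / cue_rate_per_animal_per_hr

# Hypothetical numbers: 1200 clicks/km^2/hr and 4000 clicks/whale/hr
print(animal_density(1200.0, 4000.0))  # 0.3 whales per km^2
```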


Asunto(s)
Ecolocación , Cachalote , Animales , Vocalización Animal , Acústica , Ballenas , Espectrografía del Sonido
9.
Sci Rep ; 14(1): 6062, 2024 03 13.
Article in English | MEDLINE | ID: mdl-38480760

ABSTRACT

With the large increase in human marine activity, our seas have become populated with vessels that can be overheard from distances of up to 20 km. Prior investigations showed that such a dense presence of vessels impacts the behaviour of marine animals, in particular dolphins. While previous explorations looked for linear changes in the features of dolphin whistles, in this work we examine non-linear responses of bottlenose dolphins (Tursiops truncatus) to the presence of vessels. We explored this response by continuously recording acoustic data with two long-term acoustic recorders deployed near a shipping lane and a dolphin habitat in Eilat, Israel. Using deep learning methods, we detected 50,000 whistles, which were clustered to associate whistle traces and to characterize their features, discriminating dolphin vocalizations by both structure and quantity. Using a non-linear classifier, the whistles were categorized into two classes representing the presence or absence of a nearby vessel. Although our database shows no linearly observable change in whistle features, we obtained true positive and true negative rates exceeding 90% on separate, left-out test sets. We argue that this classification success serves as statistical proof of a non-linear response of dolphins to the presence of vessels.
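A hypothetical sketch of the final classification step described above, with placeholder features and labels; the study's deep-learning whistle detector and actual feature set are not reproduced here:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

# Placeholder whistle-feature matrix (e.g., duration, min/max frequency,
# number of inflection points) and vessel presence labels; real features
# would come from the clustered whistle traces described in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = rng.integers(0, 2, size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
print("true positive rate:", tp / (tp + fn))
print("true negative rate:", tn / (tn + fp))
```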


Asunto(s)
Delfín Mular , Vocalización Animal , Animales , Humanos , Vocalización Animal/fisiología , Delfín Mular/fisiología , Acústica , Océanos y Mares , Navíos , Espectrografía del Sonido
10.
J Acoust Soc Am ; 155(3): 2050-2064, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38477612

ABSTRACT

The study of humpback whale song using passive acoustic monitoring devices requires bioacousticians to manually review hours of audio recordings to annotate the signals. To vastly reduce manual annotation time through automation, a machine learning model was developed. Convolutional neural networks have made major advances in the previous decade, leading to a wide range of applications, including the detection of frequency-modulated vocalizations by cetaceans. A large dataset of over 60,000 audio segments of 4 s length is collected from the North Atlantic and used to fine-tune an existing model for humpback whale song detection in the North Pacific [see Allen, Harvey, Harrell, Jansen, Merkens, Wall, Cattiau, and Oleson (2021). Front. Mar. Sci. 8, 607321]. Furthermore, different data augmentation techniques (time-shift, noise augmentation, and masking) are used to artificially increase the variability within the training set. Retraining and augmentation yield F-score values of 0.88 on a context-window basis and 0.89 on an hourly basis, with false positive rates of 0.05 on a context-window basis and 0.01 on an hourly basis. If necessary, usage and retraining of the existing model are made convenient by a framework (AcoDet, acoustic detector) built during this project. Combining the tools provided by this framework could save researchers hours of manual annotation time and, thus, accelerate their research.
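An illustrative sketch of the three augmentation operations named above (time-shift, noise addition, masking), written generically rather than as the AcoDet implementation; all parameter values are assumptions:

```python
import numpy as np

def time_shift(x, max_shift):
    """Circularly shift the waveform by a random number of samples."""
    return np.roll(x, np.random.randint(-max_shift, max_shift + 1))

def add_noise(x, snr_db):
    """Add white noise at a target signal-to-noise ratio (dB)."""
    noise = np.random.randn(len(x))
    sig_p, noise_p = np.mean(x ** 2), np.mean(noise ** 2)
    scale = np.sqrt(sig_p / (noise_p * 10 ** (snr_db / 10)))
    return x + scale * noise

def freq_mask(spec, max_bins):
    """Zero out a random contiguous band of frequency bins in a spectrogram."""
    spec = spec.copy()
    width = np.random.randint(1, max_bins + 1)
    start = np.random.randint(0, spec.shape[0] - width)
    spec[start:start + width, :] = 0.0
    return spec

# Example on a 4 s segment at 2 kHz sampling (placeholder values)
x = np.random.randn(4 * 2000)
x_aug = add_noise(time_shift(x, max_shift=2000), snr_db=10)
```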


Asunto(s)
Yubarta , Animales , Vocalización Animal , Espectrografía del Sonido , Factores de Tiempo , Estaciones del Año , Acústica
11.
J Acoust Soc Am ; 155(2): 1437-1450, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38364047

ABSTRACT

Odontocetes produce clicks for echolocation and communication. Most odontocetes are thought to produce either broadband (BB) or narrowband high-frequency (NBHF) clicks. Here, we show that the click repertoire of Hector's dolphin (Cephalorhynchus hectori) comprises highly stereotypical NBHF clicks and far more variable broadband clicks, with some that are intermediate between these two categories. Both NBHF and broadband clicks were made in trains, buzzes, and burst-pulses. Most clicks within click trains were typical NBHF clicks, which had a median centroid frequency of 130.3 kHz (median -10 dB bandwidth = 29.8 kHz). Some, however, while having only marginally lower centroid frequency (median = 123.8 kHz), had significant energy below 100 kHz and approximately double the bandwidth (median -10 dB bandwidth = 69.8 kHz); we refer to these as broadband. Broadband clicks in buzzes and burst-pulses had lower median centroid frequencies (120.7 and 121.8 kHz, respectively) compared to NBHF buzzes and burst-pulses (129.5 and 130.3 kHz, respectively). Source levels of NBHF clicks, estimated by using a drone to measure ranges from a single hydrophone and by computing time-of-arrival differences at a vertical hydrophone array, ranged from 116 to 171 dB re 1 µPa at 1 m, whereas source levels of broadband clicks, obtained from array data only, ranged from 138 to 184 dB re 1 µPa at 1 m. Our findings challenge the grouping of toothed whales as either NBHF or broadband species.
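For illustration only (not the study's processing chain), the centroid frequency and -10 dB bandwidth reported above can be computed from a single click waveform along these lines; windowing and interpolation details are omitted:

```python
import numpy as np

def click_spectral_features(x, fs):
    """Centroid frequency (Hz) and -10 dB bandwidth (Hz) of a click waveform.

    The -10 dB bandwidth is taken here as the span of frequencies whose power
    lies within 10 dB of the spectral peak.
    """
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)

    centroid = np.sum(freqs * spec) / np.sum(spec)

    above = freqs[spec >= spec.max() / 10.0]   # within 10 dB of the peak
    bw_10dB = above.max() - above.min()
    return centroid, bw_10dB

# Example: a simulated 130 kHz NBHF-like click sampled at 500 kHz
fs = 500_000
t = np.arange(0, 200e-6, 1 / fs)
click = np.sin(2 * np.pi * 130_000 * t) * np.hanning(len(t))
print(click_spectral_features(click, fs))
```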


Asunto(s)
Delfines , Ecolocación , Animales , Acústica , Vocalización Animal , Espectrografía del Sonido
12.
J Acoust Soc Am ; 155(2): 1253-1263, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38341748

ABSTRACT

The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). "Comparing measurement errors for formants in synthetic and natural vowels," J. Acoust. Soc. Am. 139(2), 713-727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
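A hedged sketch of RS-based resonance estimation using librosa's reassigned_spectrogram (assumed available in librosa 0.8+); the synthetic one-resonance signal below stands in for the printed vocal tract models, and all parameters are illustrative:

```python
import numpy as np
import librosa
from scipy.signal import lfilter

# Synthetic "vocal tract": a single 1 kHz resonator excited by a 200 Hz pulse train.
sr = 16000
f_res, bw = 1000.0, 80.0                                   # resonance frequency, bandwidth (Hz)
r = np.exp(-np.pi * bw / sr)
a = [1, -2 * r * np.cos(2 * np.pi * f_res / sr), r ** 2]   # 2-pole resonator
source = np.zeros(sr)                                      # 1 s of signal
source[:: sr // 200] = 1.0                                 # 200 Hz impulse train
y = lfilter([1.0], a, source)

freqs, times, mags = librosa.reassigned_spectrogram(y=y, sr=sr, n_fft=512)

# Crude per-frame resonance estimate: the reassigned frequency of the strongest bin.
peak_bins = np.argmax(mags, axis=0)
est = freqs[peak_bins, np.arange(mags.shape[1])]
print(np.nanmedian(est))   # should sit near the 1000 Hz resonance
```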


Asunto(s)
Voz , Niño , Humanos , Acústica , Acústica del Lenguaje , Vibración , Espectrografía del Sonido
13.
J Acoust Soc Am ; 155(1): 274-283, 2024 01 01.
Article in English | MEDLINE | ID: mdl-38215217

ABSTRACT

Echolocating bats and dolphins use biosonar to determine target range, but differences in range discrimination thresholds have been reported for the two species. Whether these differences represent a true difference in their sensory system capability is unknown. Here, the dolphin's range discrimination threshold as a function of absolute range and echo phase was investigated. Using phantom echoes, the dolphins were trained to echo-inspect two simulated targets and indicate the closer target by pressing a paddle. One target was presented at a time, requiring the dolphin to hold the initial range in memory while comparing it to the second target. Range was simulated by manipulating echo delay while the received echo levels, relative to the dolphins' clicks, were held constant. Range discrimination thresholds were determined at seven different ranges from 1.75 to 20 m. In contrast to bats, range discrimination thresholds increased from 4 to 75 cm across the ranges tested. To investigate the acoustic features used more directly, discrimination thresholds were determined when the echo was given a random phase shift (±180°). Results for the constant-phase versus the random-phase echo were quantitatively similar, suggesting that dolphins used the envelope of the echo waveform to determine the difference in range.


Asunto(s)
Delfín Mular , Quirópteros , Ecolocación , Animales , Acústica , Espectrografía del Sonido
14.
J Acoust Soc Am ; 155(1): 396-404, 2024 01 01.
Article in English | MEDLINE | ID: mdl-38240666

ABSTRACT

When they are exposed to loud fatiguing sounds in the oceans, marine mammals are susceptible to hearing damage in the form of temporary hearing threshold shifts (TTSs) or permanent hearing threshold shifts. We compared the level-dependent and frequency-dependent susceptibility to TTSs in harbor seals and harbor porpoises, species with different hearing sensitivities in the low- and high-frequency regions. Both species were exposed to 100% duty cycle one-sixth-octave noise bands at frequencies that covered their entire hearing range. In the case of the 6.5 kHz exposure for the harbor seals, a pure tone (continuous wave) was used. TTS was quantified as a function of sound pressure level (SPL) half an octave above the center frequency of the fatiguing sound. The species have different audiograms, but their frequency-specific susceptibility to TTS was more similar. The hearing frequency range in which both species were most susceptible to TTS was 22.5-50 kHz. Furthermore, the frequency ranges were characterized by having similar critical levels (defined as the SPL of the fatiguing sound above which the magnitude of TTS induced as a function of SPL increases more strongly). This standardized between-species comparison indicates that the audiogram is not a good predictor of frequency-dependent susceptibility to TTS.


Asunto(s)
Phoca , Phocoena , Animales , Estimulación Acústica , Fatiga Auditiva , Espectrografía del Sonido , Recuperación de la Función , Audición , Umbral Auditivo
15.
IEEE Trans Pattern Anal Mach Intell ; 46(6): 4234-4245, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38241115

ABSTRACT

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent years. Some questions naturally arise: whether a TTS system can achieve human-level quality, how to define and judge that quality, and how to achieve it. In this paper, we answer these questions by first defining human-level quality based on the statistical significance of a subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on benchmark datasets. Specifically, we leverage a variational auto-encoder (VAE) for end-to-end text-to-waveform generation, with several key modules to enhance the capacity of the prior from text and reduce the complexity of the posterior from speech, including phoneme pre-training, differentiable duration modeling, bidirectional prior/posterior modeling, and a memory mechanism in the VAE. Experimental evaluations on the popular LJSpeech dataset show that our proposed NaturalSpeech achieves -0.01 CMOS (comparative mean opinion score) relative to human recordings at the sentence level, with a Wilcoxon signed-rank test p-value >> 0.05, which demonstrates no statistically significant difference from human recordings for the first time.
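As a hedged illustration of the statistical comparison described above (made-up scores, not the paper's data), paired listener ratings can be compared with a Wilcoxon signed-rank test:

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder paired ratings for synthesized vs. recorded versions of the same sentences.
rng = np.random.default_rng(1)
mos_tts = rng.normal(4.5, 0.3, size=50)          # ratings of the TTS sentences
mos_human = mos_tts + rng.normal(0.0, 0.1, 50)   # ratings of the recordings

stat, p = wilcoxon(mos_tts, mos_human)
cmos = np.mean(mos_tts - mos_human)              # comparative mean opinion score
print(f"CMOS = {cmos:+.3f}, Wilcoxon p = {p:.3f}")
# A p-value well above 0.05 would indicate no statistically significant
# difference between the paired ratings.
```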


Asunto(s)
Algoritmos , Humanos , Procesamiento de Señales Asistido por Computador , Habla/fisiología , Procesamiento de Lenguaje Natural , Bases de Datos Factuales , Espectrografía del Sonido/métodos
16.
Behav Res Methods ; 56(3): 2114-2134, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37253958

ABSTRACT

The use of voice recordings in both research and industry practice has increased dramatically in recent years-from diagnosing a COVID-19 infection based on patients' self-recorded voice samples to predicting customer emotions during a service center call. Crowdsourced audio data collection in participants' natural environment using their own recording device has opened up new avenues for researchers and practitioners to conduct research at scale across a broad range of disciplines. The current research examines whether fundamental properties of the human voice are reliably and validly captured through common consumer-grade audio-recording devices in current medical, behavioral science, business, and computer science research. Specifically, this work provides evidence from a tightly controlled laboratory experiment analyzing 1800 voice samples and subsequent simulations that recording devices with high proximity to a speaker (such as a headset or a lavalier microphone) lead to inflated measures of amplitude compared to a benchmark studio-quality microphone while recording devices with lower proximity to a speaker (such as a laptop or a smartphone in front of the speaker) systematically reduce measures of amplitude and can lead to biased measures of the speaker's true fundamental frequency. We further demonstrate through simulation studies that these differences can lead to biased and ultimately invalid conclusions in, for example, an emotion detection task. Finally, we outline a set of recording guidelines to ensure reliable and valid voice recordings and offer initial evidence for a machine-learning approach to bias correction in the case of distorted speech signals.


Asunto(s)
Calidad de la Voz , Voz , Humanos , Espectrografía del Sonido , Teléfono Inteligente , Microcomputadores
17.
Audiol., Commun. res ; 29: e2826, 2024. tab, graf
Article in Portuguese | LILACS | ID: biblio-1550051

ABSTRACT

Purpose: To develop the response-process validity step of the Spectrographic Analysis Protocol (SAP). Methods: Ten speech-language pathologists and ten undergraduate speech-language pathology students were recruited; they applied the SAP to ten spectrograms, judged the SAP items, and took part in a cognitive interview. Based on the responses, the SAP was reanalyzed to reformulate or exclude items. The chi-square test and accuracy values were used to analyze the questionnaire responses, along with qualitative analysis of the cognitive interview data. Results: Participants achieved accuracy above 70% on most SAP items; only seven items reached accuracy of 70% or less. There was a difference between reports of presence versus absence of difficulty in identifying items in the spectrogram, and most participants had no difficulty identifying the SAP items. In the cognitive interview, the qualitative analysis showed that the intention of only six items was not correctly identified. In addition, participants suggested excluding five items. Conclusion: After the response-process validation step, the SAP was reformulated: seven items were deleted and two were reworded. The final version of the SAP after this stage was thus reduced from 25 to 18 items, distributed across the five domains.


Asunto(s)
Humanos , Espectrografía del Sonido/métodos , Acústica del Lenguaje , Calidad de la Voz , Trastornos de la Voz/diagnóstico por imagen
18.
Sci Rep ; 13(1): 21771, 2023 12 08.
Article in English | MEDLINE | ID: mdl-38065973

ABSTRACT

Acoustic sequences have been described in a range of species and in varying complexity. Cetaceans are known to produce complex song displays but these are generally limited to mysticetes; little is known about call combinations in odontocetes. Here we investigate call combinations produced by killer whales (Orcinus orca), a highly social and vocal species. Using acoustic recordings from 22 multisensor tags, we use a first order Markov model to show that transitions between call types or subtypes were significantly different from random, with repetitions and specific call combinations occurring more often than expected by chance. The mixed call combinations were composed of two or three calls and were part of three call combination clusters. Call combinations were recorded over several years, from different individuals, and several social clusters. The most common call combination cluster consisted of six call (sub-)types. Although different combinations were generated, there were clear rules regarding which were the first and last call types produced, and combinations were highly stereotyped. Two of the three call combination clusters were produced outside of feeding contexts, but their function remains unclear and further research is required to determine possible functions and whether these combinations could be behaviour- or group-specific.
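A generic sketch of a first-order Markov transition analysis like the one described above, with hypothetical call labels and a simple within-sequence shuffle as the chance baseline; the study's model and statistics are not reproduced here:

```python
import numpy as np

def transition_matrix(sequences, call_types):
    """First-order Markov transition counts and row-normalized probabilities."""
    idx = {c: i for i, c in enumerate(call_types)}
    counts = np.zeros((len(call_types), len(call_types)))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[idx[a], idx[b]] += 1
    probs = counts / counts.sum(axis=1, keepdims=True).clip(min=1)
    return counts, probs

# Hypothetical call sequences from tag records (labels are placeholders)
seqs = [["A", "A", "B"], ["A", "B", "C"], ["B", "B", "A", "B"]]
types = ["A", "B", "C"]
counts, probs = transition_matrix(seqs, types)

# Chance baseline: shuffle calls within each sequence many times and compare
# observed transition counts with the permuted distribution.
rng = np.random.default_rng(0)
null = np.stack([transition_matrix([rng.permutation(s).tolist() for s in seqs],
                                   types)[0] for _ in range(1000)])
p_greater = (null >= counts).mean(axis=0)   # one-sided exceedance proportions
print(probs.round(2))
print(p_greater.round(3))
```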


Asunto(s)
Orca , Humanos , Animales , Vocalización Animal , Conducta Social , Islandia , Espectrografía del Sonido
19.
J Acoust Soc Am ; 154(6): 3672-3683, 2023 12 01.
Article in English | MEDLINE | ID: mdl-38059727

ABSTRACT

Sound production capabilities and characteristics in Loricariidae, the largest catfish family, have not been well examined. Sounds produced by three loricariid catfish species, Otocinclus affinis, Pterygoplichthys gibbiceps, and Pterygoplichthys pardalis, were recorded. Each of these species produces pulses via pectoral-fin spine stridulation by rubbing the ridged condyle of the dorsal process of the pectoral-fin spine base against a matching groove-like socket in the pectoral girdle. Light and scanning electron microscopy were used to examine the dorsal process of the pectoral-fin spines of these species. Mean distances between dorsal process ridges of O. affinis, P. gibbiceps, and P. pardalis were 53, 161, and 329 µm, respectively. Stridulation sounds occurred during either abduction (type A) or adduction (type B). O. affinis produced sounds through adduction only and P. pardalis through abduction only, whereas P. gibbiceps often produced pulse trains alternating between abduction and adduction. In these species, dominant frequency was an inverse function of sound duration, fish total length, and inter-ridge distance on the dorsal process of the pectoral-fin spine and sound duration increased with fish total length. While stridulation sounds are used in many behavioral contexts in catfishes, the functional significance of sound production in Loricariidae is currently unknown.


Asunto(s)
Bagres , Sonido , Animales , Comunicación Animal , Tamaño Corporal , Espectrografía del Sonido
20.
Article in English | MEDLINE | ID: mdl-38083776

ABSTRACT

Infant cry provides useful clinical insights that help caregivers make appropriate medical decisions, for example in obstetrics. However, robust infant cry detection in real clinical settings (e.g., obstetrics) is still challenging due to the limited training data available for this scenario. In this paper, we propose a scene adaption framework (SAF) comprising two learning stages that can quickly adapt a cry detection model to a new environment. The first stage uses the acoustic principle that mixed sources in audio signals are approximately additive to imitate the sounds of clinical settings using public datasets. The second stage uses mutual learning to mine the shared characteristics of infant cry between the clinical setting and the public dataset, adapting to the scene in an unsupervised manner. The clinical trial was conducted in an obstetrics department, where cry recordings from 200 infants were collected. The four classifiers evaluated for infant cry detection improved their F1-scores by nearly 30% when using SAF, achieving performance similar to supervised learning on the target setting. SAF is thus demonstrated to be an effective plug-and-play tool for improving infant cry detection in new clinical settings. Our code is available at https://github.com/contactless-healthcare/Scene-Adaption-for-Infant-Cry-Detection.
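An illustrative sketch of the additive-mixing idea behind the first SAF stage, under the assumption that waveforms are simply summed at a chosen SNR; function names and parameters are hypothetical, not from the released code:

```python
import numpy as np

def mix_at_snr(cry, background, snr_db):
    """Additively mix an infant-cry clip with clinical background sound at a
    target SNR (dB). Both inputs are 1-D waveforms at the same sampling rate;
    the background is looped/trimmed to match the cry length."""
    bg = np.resize(background, cry.shape)
    p_cry = np.mean(cry ** 2)
    p_bg = np.mean(bg ** 2) + 1e-12
    scale = np.sqrt(p_cry / (p_bg * 10 ** (snr_db / 10)))
    return cry + scale * bg

# Example with placeholder signals (2 s cry, 3 s ward noise, 16 kHz)
fs = 16000
cry = np.random.randn(2 * fs)
ward_noise = np.random.randn(3 * fs)
mixed = mix_at_snr(cry, ward_noise, snr_db=5)
```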


Asunto(s)
Llanto , Obstetricia , Humanos , Lactante , Acústica , Sonido , Espectrografía del Sonido