Results 1 - 20 of 110
1.
Nature ; 631(8019): 118-124, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38898274

ABSTRACT

Locating sound sources such as prey or predators is critical for survival in many vertebrates. Terrestrial vertebrates locate sources by measuring the time delay and intensity difference of sound pressure at each ear(1-5). Underwater, however, the physics of sound makes interaural cues very small, suggesting that directional hearing in fish should be nearly impossible(6). Yet, directional hearing has been confirmed behaviourally, although the mechanisms have remained unknown for decades. Several hypotheses have been proposed to explain this remarkable ability, including the possibility that fish evolved an extreme sensitivity to minute interaural differences or that fish might compare sound pressure with particle motion signals(7,8). However, experimental challenges have long hindered a definitive explanation. Here we empirically test these models in the transparent teleost Danionella cerebrum, one of the smallest vertebrates(9,10). By selectively controlling pressure and particle motion, we dissect the sensory algorithm underlying directional acoustic startles. We find that both cues are indispensable for this behaviour and that their relative phase controls its direction. Using micro-computed tomography and optical vibrometry, we further show that D. cerebrum has the sensory structures to implement this mechanism. D. cerebrum shares these structures with more than 15% of living vertebrate species, suggesting a widespread mechanism for inferring sound direction.
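The phase-comparison idea summarized above can be illustrated with a toy calculation (a minimal sketch of the underlying plane-wave physics, not the authors' model): for a travelling plane wave, sound pressure and the particle-velocity component along the propagation axis are in phase for one direction of travel and in antiphase for the opposite direction, so the sign of their zero-lag correlation indicates direction along that axis. All signal parameters below are hypothetical.

```python
import numpy as np

fs = 48_000                      # sample rate in Hz (hypothetical)
t = np.arange(0, 0.05, 1 / fs)   # 50 ms of signal
f = 800.0                        # tone frequency in Hz (hypothetical)

def startle_direction(pressure, particle_velocity):
    """Toy decision rule: the sign of the zero-lag correlation between
    pressure and axial particle velocity indicates source direction."""
    return np.sign(np.dot(pressure, particle_velocity))

# Plane wave travelling toward +x: pressure and axial velocity are in phase.
p_plus = np.sin(2 * np.pi * f * t)
v_plus = np.sin(2 * np.pi * f * t)

# Plane wave travelling toward -x: axial velocity is inverted (180 deg shift).
p_minus = np.sin(2 * np.pi * f * t)
v_minus = -np.sin(2 * np.pi * f * t)

print(startle_direction(p_plus, v_plus))    # 1.0  -> source on one side
print(startle_direction(p_minus, v_minus))  # -1.0 -> source on the other side
```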


Subject(s)
Cues , Cyprinidae , Hearing , Sound Localization , Animals , Female , Male , Algorithms , Hearing/physiology , Pressure , Sound , Sound Localization/physiology , Vibration , X-Ray Microtomography , Cyprinidae/physiology , Motion , Reflex, Startle , Particulate Matter
2.
PLoS Biol ; 22(4): e3002586, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38683852

ABSTRACT

Having two ears enables us to localize sound sources by exploiting interaural time differences (ITDs) in sound arrival. Principal neurons of the medial superior olive (MSO) are sensitive to ITD, and each MSO neuron responds optimally to a best ITD (bITD). In many cells, especially those tuned to low sound frequencies, these bITDs correspond to ITDs for which the contralateral ear leads, and are often larger than the ecologically relevant range, defined by the ratio of the interaural distance and the speed of sound. Using in vivo recordings in gerbils, we found that shortly after hearing onset the bITDs were even more contralaterally leading than found in adult gerbils, and travel latencies for contralateral sound-evoked activity clearly exceeded those for ipsilateral sounds. During the following weeks, both these latencies and their interaural difference decreased. A computational model indicated that spike timing-dependent plasticity can underlie this fine-tuning. Our results suggest that MSO neurons start out with a strong predisposition toward contralateral sounds due to their longer neural travel latencies, but that, especially in high-frequency neurons, this predisposition is subsequently mitigated by differential developmental fine-tuning of the travel latencies.
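The "ecologically relevant range" referred to above is simply the largest ITD the head geometry can produce, i.e. the interaural distance divided by the speed of sound. A quick back-of-the-envelope check (the interaural distance used here is an illustrative value, not a figure from the paper):

```python
# Maximum ecologically relevant ITD = interaural distance / speed of sound.
interaural_distance_m = 0.03   # ~3 cm, an illustrative value for a gerbil-sized head
speed_of_sound_m_s = 343.0     # speed of sound in air at ~20 degrees C

max_itd_s = interaural_distance_m / speed_of_sound_m_s
print(f"Maximum ITD ~ {max_itd_s * 1e6:.0f} microseconds")  # ~87 us
```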


Subject(s)
Acoustic Stimulation , Gerbillinae , Neurons , Superior Olivary Complex , Animals , Neurons/physiology , Superior Olivary Complex/physiology , Sound Localization/physiology , Male , Olivary Nucleus/physiology , Sound , Female
3.
J Neurosci ; 44(21)2024 May 22.
Article in English | MEDLINE | ID: mdl-38664010

ABSTRACT

The natural environment challenges the brain to prioritize the processing of salient stimuli. The barn owl, a sound localization specialist, exhibits a circuit called the midbrain stimulus selection network, dedicated to representing the location of the most salient stimulus when concurrent stimuli are present. Previous competition studies using unimodal (visual) and bimodal (visual and auditory) stimuli have shown that relative strength is encoded in spike response rates. However, open questions remain concerning how auditory-auditory competition affects this coding. To this end, we present diverse auditory competitors (concurrent flat noise and amplitude-modulated noise) and record neural responses of awake barn owls of both sexes in subsequent midbrain space maps, the external nucleus of the inferior colliculus (ICx) and optic tectum (OT). While both ICx and OT exhibit a topographic map of auditory space, OT also integrates visual input and is part of the global-inhibitory midbrain stimulus selection network. Through comparative investigation of these regions, we show that while increasing the strength of a competitor sound decreases spike response rates of spatially distant neurons in both regions, relative strength determines spike train synchrony of nearby units only in the OT. Furthermore, changes in synchrony caused by sound competition in the OT are correlated with gamma-range oscillations of local field potentials associated with input from the midbrain stimulus selection network. The results of this investigation suggest that modulations in spiking synchrony between units by gamma oscillations are an emergent coding scheme representing the relative strength of concurrent stimuli, which may have relevant implications for downstream readout.


Subject(s)
Acoustic Stimulation , Inferior Colliculi , Sound Localization , Strigiformes , Animals , Strigiformes/physiology , Female , Male , Acoustic Stimulation/methods , Sound Localization/physiology , Inferior Colliculi/physiology , Mesencephalon/physiology , Auditory Perception/physiology , Brain Mapping , Auditory Pathways/physiology , Neurons/physiology , Action Potentials/physiology
4.
J Neurosci ; 44(28)2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38830759

ABSTRACT

Congenital single-sided deafness (SSD) leads to an aural preference syndrome that is characterized by overrepresentation of the hearing ear in the auditory system. Cochlear implantation (CI) of the deaf ear is an effective treatment for SSD. However, in congenital SSD the newly introduced auditory input often falls short of expectations for binaural hearing and speech perception in late-implanted CI recipients. In a previous study, a reduction of the interaural time difference (ITD) sensitivity has been shown in unilaterally congenitally deaf cats (uCDCs). In the present study, we focused on the interaural level difference (ILD) processing in the primary auditory cortex. The uCDC group was compared with hearing cats (HCs) and bilaterally congenitally deaf cats (CDCs). The ILD representation was reorganized, replacing the preference for the contralateral ear with a preference for the hearing ear, regardless of the cortical hemisphere. In accordance with the previous study, uCDCs were less sensitive to interaural time differences than HCs, resulting in unmodulated ITD responses, thus lacking directional information. Such incongruent ITDs and ILDs cannot be integrated for binaural sound source localization. In normal hearing, the predominant effect of each ear is excitation of the auditory cortex in the contralateral cortical hemisphere and inhibition in the ipsilateral hemisphere. In SSD, however, the auditory pathways reorganized such that the hearing ear produced greater excitation in both cortical hemispheres and the deaf ear produced weaker excitation and preserved inhibition in both cortical hemispheres.


Subject(s)
Auditory Cortex , Cochlear Implantation , Cues , Hearing Loss, Unilateral , Sound Localization , Cats , Animals , Sound Localization/physiology , Hearing Loss, Unilateral/physiopathology , Cochlear Implantation/methods , Auditory Cortex/physiopathology , Female , Male , Acoustic Stimulation/methods , Functional Laterality/physiology , Deafness/physiopathology , Deafness/congenital , Deafness/surgery
6.
Eur J Neurosci ; 59(7): 1770-1788, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38230578

ABSTRACT

Studies on multisensory perception often focus on simplistic conditions in which one single stimulus is presented per modality. Yet, in everyday life, we usually encounter multiple signals per modality. To understand how multiple signals within and across the senses are combined, we extended the classical audio-visual spatial ventriloquism paradigm to combine two visual stimuli with one sound. The individual visual stimuli presented in the same trial differed in their relative timing and spatial offsets to the sound, allowing us to contrast their individual and combined influence on sound localization judgements. We find that the ventriloquism bias is not dominated by a single visual stimulus but rather is shaped by the collective multisensory evidence. In particular, the contribution of an individual visual stimulus to the ventriloquism bias depends not only on its own relative spatio-temporal alignment to the sound but also the spatio-temporal alignment of the other visual stimulus. We propose that this pattern of multi-stimulus multisensory integration reflects the evolution of evidence for sensory causal relations during individual trials, calling for the need to extend established models of multisensory causal inference to more naturalistic conditions. Our data also suggest that this pattern of multisensory interactions extends to the ventriloquism aftereffect, a bias in sound localization observed in unisensory judgements following a multisensory stimulus.


Subject(s)
Auditory Perception , Sound Localization , Acoustic Stimulation , Photic Stimulation , Visual Perception , Humans
7.
Eur J Neurosci ; 59(9): 2373-2390, 2024 May.
Article in English | MEDLINE | ID: mdl-38303554

ABSTRACT

Humans have the remarkable ability to integrate information from different senses, which greatly facilitates the detection, localization and identification of events in the environment. About 466 million people worldwide suffer from hearing loss. Yet, the impact of hearing loss on how the senses work together is rarely investigated. Here, we investigate how a common sensory impairment, asymmetric conductive hearing loss (AHL), alters the way our senses interact by examining human orienting behaviour with normal hearing (NH) and acute AHL. This type of hearing loss disrupts auditory localization. We hypothesized that this creates a conflict between auditory and visual spatial estimates and alters how auditory and visual inputs are integrated to facilitate multisensory spatial perception. We analysed the spatial and temporal properties of saccades to auditory, visual and audiovisual stimuli before and after plugging the right ear of participants. Both spatial and temporal aspects of multisensory integration were affected by AHL. Compared with NH, AHL caused participants to make slow, inaccurate and imprecise saccades towards auditory targets. Surprisingly, increased weight on visual input resulted in accurate audiovisual localization with AHL. This came at a cost: saccade latencies for audiovisual targets increased significantly. The larger the auditory localization errors, the less participants were able to benefit from audiovisual integration in terms of saccade latency. Our results indicate that observers immediately change sensory weights to effectively deal with acute AHL and preserve audiovisual accuracy in a way that cannot be fully explained by statistical models of optimal cue integration.
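The "statistical models of optimal cue integration" mentioned here are usually maximum-likelihood models in which each cue is weighted by its reliability (inverse variance). A minimal sketch of that standard model, with made-up variances, shows why an earplug that makes the auditory estimate noisier should shift weight toward vision:

```python
def mle_fusion(est_a, var_a, est_v, var_v):
    """Reliability-weighted (maximum-likelihood) fusion of an auditory and a
    visual location estimate; the less variable cue gets the larger weight."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)
    w_v = 1.0 - w_a
    fused = w_a * est_a + w_v * est_v
    fused_var = 1.0 / (1 / var_a + 1 / var_v)
    return fused, fused_var, w_a, w_v

# Hypothetical numbers: plugging one ear inflates the auditory variance,
# so the fused estimate moves toward the (unchanged) visual estimate.
fused, fused_var, w_a, w_v = mle_fusion(est_a=12.0, var_a=64.0, est_v=2.0, var_v=4.0)
print(f"fused = {fused:.1f} deg, visual weight = {w_v:.2f}")  # ~2.6 deg, ~0.94
```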


Subject(s)
Sound Localization , Visual Perception , Humans , Female , Adult , Male , Visual Perception/physiology , Sound Localization/physiology , Young Adult , Saccades/physiology , Auditory Perception/physiology , Hearing Loss/physiopathology , Photic Stimulation/methods , Acoustic Stimulation/methods , Space Perception/physiology
8.
Article in English | MEDLINE | ID: mdl-38227005

ABSTRACT

The Journal of Comparative Physiology lived up to its name in the last 100 years by including more than 1500 different taxa in almost 10,000 publications. Seventeen phyla of the animal kingdom were represented. The honeybee (Apis mellifera) is the taxon with most publications, followed by locust (Locusta migratoria), crayfishes (Cambarus spp.), and fruitfly (Drosophila melanogaster). The representation of species in this journal in the past, thus, differs much from the 13 model systems as named by the National Institutes of Health (USA). We mention major accomplishments of research on species with specific adaptations, specialist animals, for example, the quantitative description of the processes underlying the axon potential in squid (Loligo forbesii) and the isolation of the first receptor channel in the electric eel (Electrophorus electricus) and electric ray (Torpedo spp.). Future neuroethological work should make the recent genetic and technological developments available for specialist animals. There are many research questions left that may be answered with high yield in specialists and some questions that can only be answered in specialists. Moreover, the adaptations of animals that occupy specific ecological niches often lend themselves to biomimetic applications. We go into some depth in explaining our thoughts in the research of motion vision in insects, sound localization in barn owls, and electroreception in weakly electric fish.


Subject(s)
Electric Fish , Sound Localization , Strigiformes , Animals , Drosophila melanogaster , Sound Localization/physiology , Vision, Ocular , Electrophorus
9.
Audiol Neurootol ; 29(3): 228-238, 2024.
Article in English | MEDLINE | ID: mdl-38190808

ABSTRACT

INTRODUCTION: Cochlear implants (CIs) can restore binaural hearing in cases of single-sided deafness (SSD). However, studies with a high level of evidence in support of this phenomenon are lacking. The aim of this study is to analyze the effectiveness of CIs using several spatialized speech-in-noise tests and to identify potential predictors of successful surgery. METHODS: Ten cases underwent standard CI surgery (MED-EL Flex24). The speech-in-noise test was used in three different spatial configurations. The noise was presented from the front (N0), toward the CI (NCI), and toward the ear (Near), while the speech was always from the front (S0). For each test, the speech-to-noise ratio at 50% intelligibility (SNR50) was evaluated. Seven different effects were assessed (summation, head shadow [HS], spatial release from masking [SRM], and squelch for the CI and for the ear). RESULTS: A significant summation effect of 1.5 dB was observed. Contralateral PTA was positively correlated with S0N0-B and S0NCI-B (CIon and unplugged ear). S0N0-B results were positively correlated with S0N0-CIoff (p < 0.0001) and with S0Near-CIoff results (p = 0.004). A significant positive correlation was found between the post-activation delay and the HS gain for the CI (p = 0.005). Finally, the HS was negatively correlated with the squelch effect for the ear. CONCLUSION: CI benefits patients with SSD in noise and can improve the threshold for detecting low-level noise. Contralateral PTA could predict good postoperative results. Simple tests performed preoperatively can predict the likelihood of surgical success in reversing SSD.
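The binaural effects named in this abstract are conventionally computed as differences between SNR50 values measured in different spatial and device configurations. The sketch below uses one common set of contrasts with made-up SRT values; the exact contrasts used in this study may differ.

```python
# Hypothetical SNR50 values (dB SNR at 50% intelligibility); lower is better.
srt = {
    ("S0N0", "CI_off"): 2.0,    # speech and noise in front, CI switched off
    ("S0N0", "CI_on"): 0.5,     # speech and noise in front, CI switched on
    ("S0NCI", "CI_on"): -1.0,   # noise toward the CI side, CI on
    ("S0NEar", "CI_on"): 3.0,   # noise toward the hearing ear, CI on
}

# Summation: benefit of switching the CI on when speech and noise are in front.
summation_db = srt[("S0N0", "CI_off")] - srt[("S0N0", "CI_on")]

# Head shadow (one possible contrast): benefit of moving the noise from the
# hearing-ear side to the CI side, with the CI on.
head_shadow_db = srt[("S0NEar", "CI_on")] - srt[("S0NCI", "CI_on")]

print(f"summation = {summation_db:+.1f} dB, head shadow = {head_shadow_db:+.1f} dB")
```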


Subject(s)
Cochlear Implantation , Cochlear Implants , Hearing Loss, Unilateral , Speech Perception , Humans , Middle Aged , Male , Female , Hearing Loss, Unilateral/surgery , Hearing Loss, Unilateral/rehabilitation , Hearing Loss, Unilateral/physiopathology , Adult , Aged , Sound Localization , Treatment Outcome , Noise
10.
Ear Hear ; 45(4): 969-984, 2024.
Article in English | MEDLINE | ID: mdl-38472134

ABSTRACT

OBJECTIVES: The independence of left and right automatic gain controls (AGCs) used in cochlear implants can distort interaural level differences and thereby compromise dynamic sound source localization. We assessed the degree to which synchronizing left and right AGCs mitigates those difficulties as indicated by listeners' ability to use the changes in interaural level differences that come with head movements to avoid front-back reversals (FBRs). DESIGN: Broadband noise stimuli were presented from one of six equally spaced loudspeakers surrounding the listener. Sound source identification was tested for stimuli presented at 70 dBA (above AGC threshold) for 10 bilateral cochlear implant patients, under conditions where (1) patients remained stationary and (2) free head movements within ±30° were encouraged. These conditions were repeated for both synchronized and independent AGCs. The same conditions were run at 50 dBA, below the AGC threshold, to assess listeners' baseline performance when AGCs were not engaged. In this way, the expected high variability in listener performance could be separated from effects of independent AGCs to reveal the degree to which synchronizing AGCs could restore localization performance to what it was without AGC compression. RESULTS: The mean rate of FBRs was higher for sound stimuli presented at 70 dBA with independent AGCs, both with and without head movements, than at 50 dBA, suggesting that when AGCs were independently engaged they contributed to poorer front-back localization. When listeners remained stationary, synchronizing AGCs did not significantly reduce the rate of FBRs. When AGCs were independent at 70 dBA, head movements did not have a significant effect on the rate of FBRs. Head movements did have a significant group effect on the rate of FBRs at 50 dBA when AGCs were not engaged and at 70 dBA when AGCs were synchronized. Synchronization of AGCs, together with head movements, reduced the rate of FBRs to approximately what it was in the 50-dBA baseline condition. Synchronizing AGCs also had a significant group effect on listeners' overall percent correct localization. CONCLUSIONS: Synchronizing AGCs allowed listeners to mitigate front-back confusions introduced by unsynchronized AGCs when head motion was permitted, returning individual listener performance to roughly what it was in the 50-dBA baseline condition when AGCs were not engaged. Synchronization of AGCs did not overcome localization deficiencies which were observed when AGCs were not engaged, and which are therefore unrelated to AGC compression.
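The core idea, that linking the two AGCs preserves the interaural level difference that independent compressors would shrink, can be sketched with a simple static broadband compressor. The threshold and compression ratio below are hypothetical and do not describe any particular processor.

```python
def agc_gain(level_db, threshold_db=60.0, ratio=3.0):
    """Static compression: above threshold, output grows at 1/ratio the input rate."""
    if level_db <= threshold_db:
        return 0.0
    return -(level_db - threshold_db) * (1 - 1 / ratio)

def apply_agcs(left_db, right_db, synchronized):
    if synchronized:
        # Both sides driven by the louder ear: one shared gain, ILD preserved.
        g = agc_gain(max(left_db, right_db))
        return left_db + g, right_db + g
    # Independent AGCs: each side compresses its own level, shrinking the ILD.
    return left_db + agc_gain(left_db), right_db + agc_gain(right_db)

left, right = 75.0, 65.0   # a 10 dB interaural level difference above AGC threshold
for sync in (False, True):
    out_l, out_r = apply_agcs(left, right, sync)
    print(f"synchronized={sync}: output ILD = {out_l - out_r:.1f} dB")
# Independent AGCs compress the ILD to ~3.3 dB; synchronized AGCs keep it at 10 dB.
```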


Subject(s)
Cochlear Implants , Sound Localization , Humans , Middle Aged , Male , Female , Aged , Adult , Cochlear Implantation , Head Movements/physiology , Noise , Aged, 80 and over
11.
Optom Vis Sci ; 101(6): 393-398, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38990237

ABSTRACT

SIGNIFICANCE: It is important to know whether early-onset vision loss and late-onset vision loss are associated with differences in the estimation of distances of sound sources within the environment. People with vision loss rely heavily on auditory cues for path planning, safe navigation, avoiding collisions, and activities of daily living. PURPOSE: Loss of vision can lead to substantial changes in auditory abilities. It is unclear whether differences in sound distance estimation exist in people with early-onset partial vision loss, late-onset partial vision loss, and normal vision. We investigated distance estimates for a range of sound sources and auditory environments in groups of participants with early- or late-onset partial visual loss and sighted controls. METHODS: Fifty-two participants heard static sounds with virtual distances ranging from 1.2 to 13.8 m within a simulated room. The room simulated either anechoic (no echoes) or reverberant environments. Stimuli were speech, music, or noise. Single sounds were presented, and participants reported the estimated distance of the sound source. Each participant took part in 480 trials. RESULTS: Analysis of variance showed significant main effects of visual status (p<0.05), environment (reverberant vs. anechoic; p<0.05), and stimulus (p<0.05). Significant differences (p<0.05) were shown in the estimation of distances of sound sources between early-onset visually impaired participants and sighted controls for closer distances for all conditions except the anechoic speech condition and at middle distances for all conditions except the reverberant speech and music conditions. Late-onset visually impaired participants and sighted controls showed similar performance (p>0.05). CONCLUSIONS: The findings suggest that early-onset partial vision loss results in significant changes in judged auditory distance in different environments, especially for close and middle distances. Late-onset partial visual loss has less of an impact on the ability to estimate the distance of sound sources. The findings are consistent with a theoretical framework, the perceptual restructuring hypothesis, which was recently proposed to account for the effects of vision loss on audition.


Subject(s)
Sound Localization , Humans , Male , Female , Middle Aged , Aged , Adult , Sound Localization/physiology , Judgment , Auditory Perception/physiology , Distance Perception/physiology , Acoustic Stimulation/methods , Young Adult , Visual Acuity/physiology , Age of Onset , Aged, 80 and over , Cues
12.
Eur Arch Otorhinolaryngol ; 281(8): 4039-4047, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38365989

ABSTRACT

PURPOSE: First-generation bone bridges (BBs) have demonstrated favorable safety and audiological benefits in patients with conductive hearing loss. However, studies on the effects of second-generation BBs are limited, especially among children. In this study, we aimed to explore the surgical and audiological effects of second-generation BBs in patients with bilateral congenital microtia. METHODS: This single-center prospective study included nine Mandarin-speaking patients with bilateral microtia. All the patients underwent BCI Generation 602 (BCI602; MED-EL, Innsbruck, Austria) implant surgery between September 2021 and June 2023. Audiological and sound localization tests were performed under unaided and BB-aided conditions. RESULTS: The transmastoid and retrosigmoid sinus approaches were implemented in three and six patients, respectively. No patient underwent preoperative planning, lifts were unnecessary, and no sigmoid sinus or dural compression occurred. The mean functional gain at 0.5-4.0 kHz was 28.06 ± 4.55 dB HL. Word recognition scores improved significantly in quiet under the BB-aided condition. The speech reception threshold in noise improved by 10.56 ± 2.30 dB in signal-to-noise ratio. Patients fitted with a unilateral BB demonstrated inferior sound source localization after the initial activation. CONCLUSIONS: Second-generation BBs are safe and effective for patients with bilateral congenital microtia and may be suitable for children with mastoid hypoplasia without preoperative three-dimensional reconstruction.


Subject(s)
Bone Conduction , Congenital Microtia , Hearing Loss, Conductive , Humans , Congenital Microtia/surgery , Congenital Microtia/complications , Male , Female , Prospective Studies , Child , Adolescent , Hearing Loss, Conductive/surgery , Hearing Loss, Conductive/etiology , Treatment Outcome , Young Adult , Adult , Sound Localization/physiology , Prosthesis Design
13.
J Acoust Soc Am ; 155(4): 2460-2469, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38578178

ABSTRACT

Head-worn devices (HWDs) interfere with the natural transmission of sound from the source to the ears of the listener, worsening their localization abilities. The localization errors introduced by HWDs have been mostly studied in static scenarios, but these errors are reduced if head movements are allowed. We studied the effect of 12 HWDs on an auditory-cued visual search task, where head movements were not restricted. In this task, a visual target had to be identified in a three-dimensional space with the help of an acoustic stimulus emitted from the same location as the visual target. The results showed an increase in the search time caused by the HWDs. Acoustic measurements of a dummy head wearing the studied HWDs showed evidence of impaired localization cues, which were used to estimate the perceived localization errors using computational auditory models of static localization. These models were able to explain the search-time differences in the perceptual task, showing the influence of quadrant errors in the auditory-aided visual search task. These results indicate that HWDs have an impact on sound-source localization even when head movements are possible, which may compromise the safety and the quality of experience of the wearer.


Subject(s)
Hearing Aids , Sound Localization , Acoustic Stimulation , Head Movements
14.
J Acoust Soc Am ; 156(1): 164-175, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38958583

ABSTRACT

Piano tone localization at the performer's listening point is a multisensory process involving audition, vision, and upper limb proprioception. The consequent representation of the auditory scene, especially in experienced pianists, is likely also influenced by their memory about the instrument keyboard. Disambiguating such components is not obvious, and first requires an analysis of the acoustic tone localization process to assess the role of auditory feedback in forming this scene. This analysis is complicated by the acoustic behavior of the piano, which does not guarantee the activation of the auditory precedence effect during a tone attack, nor can it provide robust interaural differences during the subsequent free evolution of the sound. In a tone localization task using a Disklavier upright piano (which can be operated remotely and configured to have its hammers hit a damper instead of producing a tone), twenty-three expert musicians, including pianists, successfully recognized the angular position of seven evenly distributed notes across the keyboard. The experiment involved listening to either full piano tones or just the key mechanical noise, with no additional feedback from other senses. This result suggests that the key mechanical noise alone activated the localization process without support from vision and/or limb proprioception. Since the same noise is present in the onset of the full tones, the key mechanics of our piano created a touch precursor in such tones that may be responsible for their correct angular localization by means of the auditory precedence effect. However, the significance of pitch cues arriving at a listener after the touch precursor was not measured when full tones were presented. As these cues characterize a note and, hence, the corresponding key position comprehensively, an open question remains regarding the contribution of pianists' spatial memory of the instrument keyboard to tone localization.


Subject(s)
Cues , Music , Sound Localization , Humans , Sound Localization/physiology , Adult , Male , Female , Young Adult , Acoustic Stimulation , Proprioception/physiology , Feedback, Sensory/physiology
15.
J Acoust Soc Am ; 156(1): 475-488, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39013035

ABSTRACT

Extended-wear hearing aids (EWHAs) are small broadband analog amplification devices placed deeply enough in the ear canal to preserve most of the cues in the head-related transfer function. However, little is known about how EWHAs affect localization accuracy for normal hearing threshold (NHT) listeners. In this study, eight NHT participants were fitted with EWHAs and localized broadband sounds of different durations (250 ms and 4 s) and stimulus intensities (40, 50, 60, 70, and 80 dBA) in a spherical speaker array. When the EWHAs were in the active mode, localization accuracy was only slightly degraded relative to open-ear performance. However, when the EWHAs were turned off, localization performance was substantially degraded even at the highest stimulus intensities. An electro-acoustical evaluation of the EWHAs showed minimal effects of dynamic range compression on the signals and good preservation of the signal pattern for vertical polar sound localization. Between-study comparisons suggest that EWHA active mode localization accuracy is favorable compared to conventional active earplugs, and EWHA passive mode localization accuracy is comparable to conventional passive earplugs. These results suggest that the deep-insertion analog design of the EWHA is generally better at preserving localization accuracy of NHT listeners than conventional earplug devices.


Subject(s)
Auditory Threshold , Hearing Aids , Sound Localization , Humans , Adult , Male , Female , Young Adult , Acoustic Stimulation/methods , Cues , Equipment Design
16.
J Acoust Soc Am ; 155(5): 2934-2947, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38717201

ABSTRACT

Spatial separation and fundamental frequency (F0) separation are effective cues for improving the intelligibility of target speech in multi-talker scenarios. Previous studies predominantly focused on spatial configurations within the frontal hemifield, overlooking the ipsilateral side and the entire median plane, where localization confusion often occurs. This study investigated the impact of spatial and F0 separation on intelligibility under the above-mentioned underexplored spatial configurations. The speech reception thresholds were measured through three experiments for scenarios involving two to four talkers, either in the ipsilateral horizontal plane or in the entire median plane, utilizing monotonized speech with varying F0s as stimuli. The results revealed that spatial separation in symmetrical positions (front-back symmetry in the ipsilateral horizontal plane or front-back, up-down symmetry in the median plane) contributes positively to intelligibility. Both target direction and relative target-masker separation influence the masking release attributed to spatial separation. As the number of talkers exceeds two, the masking release from spatial separation diminishes. Nevertheless, F0 separation remains a remarkably effective cue and could even facilitate spatial separation in improving intelligibility. Further analysis indicated that current intelligibility models encounter difficulties in accurately predicting intelligibility in scenarios explored in this study.


Subject(s)
Cues , Perceptual Masking , Sound Localization , Speech Intelligibility , Speech Perception , Humans , Female , Male , Young Adult , Adult , Speech Perception/physiology , Acoustic Stimulation , Auditory Threshold , Speech Acoustics , Speech Reception Threshold Test , Noise
17.
J Acoust Soc Am ; 156(2): 763-773, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39105574

ABSTRACT

The perception of a talker's head orientation is an ecologically relevant task. Humans are able to discriminate changes in talker head orientation using acoustic cues. Factors that may influence measures of this ability have not been well characterized. Here, we examined the minimum audible change in head orientation cues (MACHO) using diotic stimuli. The effects of several factors were tested: talker and gender, stimulus bandwidth (full-band vs low-pass filtered at 8 or 10 kHz), transducer (loudspeaker vs headphone), stimulus uncertainty (interleaved vs blocked presentation of four talkers), and vocal production mode (speech vs singing). The best performance of ∼41° was achieved for full-band, blocked presentation of speech over a loudspeaker. Greater stimulus uncertainty (interleaved presentation) worsened the MACHO by 26%. Bandlimiting at 8 and 10 kHz worsened performance by an additional 22% and 14%, respectively. At equivalent overall sound levels, performance was better for speech than for singing. There was some limited evidence for the transducer influencing the MACHO. These findings suggest the MACHO relies on multiple factors manipulated here. One of the largest, consistent effects was that of talker, suggesting head orientation cues are highly dependent on individual talker characteristics. This may be due to individual variability in speech directivity patterns.


Subject(s)
Cues , Head , Speech Perception , Humans , Male , Female , Head/physiology , Adult , Young Adult , Acoustic Stimulation , Sound Localization , Singing , Orientation
18.
J Acoust Soc Am ; 156(2): 1202-1213, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39158325

ABSTRACT

Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8-20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environments, particularly when the masker has reduced EHF levels relative to the target. This scenario can occur in natural auditory scenes when the target talker is facing the listener, but the maskers are not. In this study, we measured the importance of five bands from 40 to 20 000 Hz for speech-in-speech recognition by notch-filtering the bands individually. Stimuli consisted of a female target talker recorded from 0° and a spatially co-located two-talker female masker recorded either from 0° or 56.25°, simulating a masker either facing the listener or facing away, respectively. Results indicated peak band importance in the 0.4-1.3 kHz band and a negligible effect of removing the EHF band in the facing-masker condition. However, in the non-facing condition, the peak was broader and EHF importance was higher and comparable to that of the 3.3-8.3 kHz band in the facing-masker condition. These findings suggest that EHFs contain important cues for speech recognition in listening conditions with mismatched talker head orientations.
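Removing one band at a time, as described above, amounts to applying a band-stop ("notch") filter per condition. Below is a minimal sketch of that operation with a placeholder signal and hypothetical band edges; it makes no claim to match the study's exact filter design.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def notch_band(signal, fs, low_hz, high_hz, order=8):
    """Remove one frequency band with a zero-phase Butterworth band-stop filter."""
    sos = butter(order, [low_hz, high_hz], btype="bandstop", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

fs = 44_100
rng = np.random.default_rng(0)
speech_like = rng.standard_normal(fs)  # 1 s placeholder signal standing in for speech

# Hypothetical edges for the 0.4-1.3 kHz band named in the abstract.
filtered = notch_band(speech_like, fs, low_hz=400.0, high_hz=1300.0)
```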


Subject(s)
Acoustic Stimulation , Cues , Noise , Perceptual Masking , Recognition, Psychology , Speech Perception , Humans , Female , Speech Perception/physiology , Young Adult , Adult , Male , Audiometry, Speech , Speech Intelligibility , Auditory Threshold , Sound Localization , Speech Acoustics , Sound Spectrography
19.
Sensors (Basel) ; 24(11)2024 May 27.
Article in English | MEDLINE | ID: mdl-38894232

ABSTRACT

Sound localization is a crucial aspect of human auditory perception. VR (virtual reality) technologies provide immersive audio platforms that allow human listeners to experience natural sounds based on their ability to localize sound. However, because these platforms rely on a generic head-related transfer function (HRTF), and individual differences in this function are large, the sounds they simulate are often perceived and localized inaccurately. In this study, we investigated the disparities between the locations of sound sources as perceived by users and the locations generated by the platform, and asked whether users can be trained to adapt to the platform-generated sound sources. We used the Microsoft HoloLens 2 virtual platform and collected data from 12 subjects over six separate training sessions spread across 2 weeks. We employed three modes of training to assess their effects on sound localization, in particular to study how multimodal error guidance (visual and sound guidance, in combination with kinesthetic/postural guidance) affects the effectiveness of the training. We analyzed the data for the training effect between pre- and post-session tests and for the retention effect between separate sessions, using subject-wise paired statistics. The training effect between pre- and post-sessions was statistically significant, particularly when kinesthetic/postural guidance was combined with visual and sound guidance. Conversely, visual error guidance alone was largely ineffective. In contrast, we found no statistically significant retention effect between separate sessions for any of the three error-guidance modes over the 2 weeks of training. These findings can contribute to the improvement of VR technologies by ensuring they are designed to optimize human sound localization abilities.
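The "subject-wise paired statistics" used for the pre- versus post-session comparison correspond to a standard paired test. A minimal sketch with hypothetical per-subject numbers (not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject localization errors (degrees) for 12 subjects,
# before and after training; illustrative values only.
pre = np.array([24, 30, 18, 27, 22, 35, 29, 21, 26, 31, 19, 28], dtype=float)
post = np.array([18, 25, 17, 20, 19, 28, 24, 20, 21, 27, 18, 22], dtype=float)

# Paired t-test on the within-subject pre/post differences.
t_stat, p_value = stats.ttest_rel(pre, post)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```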


Subject(s)
Sound Localization , Humans , Sound Localization/physiology , Female , Male , Adult , Virtual Reality , Young Adult , Auditory Perception/physiology , Sound
20.
Sensors (Basel) ; 24(13)2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001130

ABSTRACT

In recent years, embedded system technologies and products for sensor networks and wearable devices used for monitoring people's activities and health have become the focus of the global IT industry. To enhance the speech recognition capabilities of wearable devices, this article discusses the implementation of audio positioning and enhancement in embedded systems, using embedded algorithms for direction detection and mixed source separation. The two algorithms are implemented on different embedded systems: direction detection on the TI TMS320C6713 DSK and mixed source separation on the Raspberry Pi 2. For mixed source separation, in the first experiment, the average signal-to-interference ratios (SIRs) at 1 m and 2 m distances were 16.72 and 15.76, respectively. In the second experiment, evaluated by speech recognition, the separation algorithm improved recognition accuracy to 95%.
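The signal-to-interference ratio reported here is conventionally the energy ratio, in dB, between the target component and the residual interference in a separated output. A minimal sketch of that calculation with hypothetical signals, not the authors' implementation:

```python
import numpy as np

def sir_db(target, interference):
    """Signal-to-interference ratio in dB: target energy over interference energy."""
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(interference ** 2))

rng = np.random.default_rng(1)
target = rng.standard_normal(16_000)               # hypothetical separated target
interference = 0.15 * rng.standard_normal(16_000)  # hypothetical residual interference
print(f"SIR = {sir_db(target, interference):.1f} dB")
```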


Subject(s)
Algorithms , Wearable Electronic Devices , Humans , Signal Processing, Computer-Assisted , Sound Localization