ABSTRACT
Logged and disturbed forests are often viewed as degraded and depauperate environments compared with primary forest. However, they are dynamic ecosystems [1] that provide refugia for large amounts of biodiversity [2,3], so we cannot afford to underestimate their conservation value [4]. Here we present empirically defined thresholds for categorizing the conservation value of logged forests, using one of the most comprehensive assessments of taxon responses to habitat degradation in any tropical forest environment. We analysed the impact of logging intensity on the individual occurrence patterns of 1,681 taxa belonging to 86 taxonomic orders and 126 functional groups in Sabah, Malaysia. Our results demonstrate the existence of two conservation-relevant thresholds. First, lightly logged forests (<29% biomass removal) retain high conservation value and a largely intact functional composition, and are therefore likely to recover their pre-logging values if allowed to undergo natural regeneration. Second, the most extreme impacts occur in heavily degraded forests with more than two-thirds (>68%) of their biomass removed, and these are likely to require more expensive measures to recover their biodiversity value. Overall, our data confirm that primary forests are irreplaceable [5], but they also reinforce the message that logged forests retain considerable conservation value that should not be overlooked.
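As a minimal illustration, the two thresholds reported above can be expressed as a simple categorization rule; the function name, category labels, and treatment of boundary values below are our own illustrative choices, not taken from the paper.

```python
def classify_logged_forest(biomass_removed_pct: float) -> str:
    """Categorize conservation value by percentage of biomass removed.

    The 29% and 68% thresholds follow the abstract above; the labels
    and boundary handling are illustrative assumptions.
    """
    if biomass_removed_pct < 29:
        return "lightly logged: high conservation value, natural regeneration likely"
    if biomass_removed_pct <= 68:
        return "moderately logged: considerable conservation value retained"
    return "heavily degraded: costlier restoration measures likely required"
```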
Subjects
Conservation of Natural Resources, Forestry, Forests, Trees, Tropical Climate, Biodiversity, Biomass, Conservation of Natural Resources/methods, Conservation of Natural Resources/statistics & numerical data, Forestry/statistics & numerical data, Malaysia, Trees/classification, Trees/growth & development, Animals
ABSTRACT
Spatial release from masking (SRM) in speech-on-speech tasks has been widely studied in the horizontal plane, where interaural cues play a fundamental role. Several studies have also observed SRM for sources located in the median plane, where (monaural) spectral cues are more important. However, a relatively unexplored research question concerns the impact of head-related transfer function (HRTF) personalisation on SRM, for example, whether individually measured HRTFs yield better performance than mannequin HRTFs. This study compares SRM in the median plane in a speech-on-speech virtual task rendered using both individual and mannequin HRTFs. SRM is obtained using English sentences with non-native English speakers. Our participants show lower SRM than that reported by others using native English participants. Furthermore, SRM is significantly larger when the source is spatialised using the individual HRTF, and this effect is more marked for participants with lower English proficiency. Further analyses using a spectral distortion metric and an estimate of the better-ear effect show that the observed SRM can only partially be explained by HRTF-specific factors, and that familiarity with individual spatial cues is likely the most significant element driving these results.
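For readers unfamiliar with the measure, SRM is conventionally quantified as the improvement in speech reception threshold (SRT) when maskers are spatially separated from the target rather than colocated; the values below are invented purely for illustration.

```python
# SRM = SRT(colocated) - SRT(separated); a positive value means the
# listener benefits from the spatial separation. Values are invented.
srt_colocated_db = -6.0   # dB SNR at threshold, maskers colocated with target
srt_separated_db = -11.5  # dB SNR at threshold, maskers spatially separated

srm_db = srt_colocated_db - srt_separated_db
print(f"SRM = {srm_db:.1f} dB")  # SRM = 5.5 dB
```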
Subjects
Cues (Psychology), Manikins, Humans, Language, Recognition (Psychology), Speech
ABSTRACT
Natural habitats are being impacted by human pressures at an alarming rate. Monitoring these ecosystem-level changes often requires labor-intensive surveys that are unable to detect rapid or unanticipated environmental changes. Here we have developed a generalizable, data-driven solution to this challenge using eco-acoustic data. We exploited a convolutional neural network to embed soundscapes from a variety of ecosystems into a common acoustic space. In both supervised and unsupervised modes, this allowed us to accurately quantify variation in habitat quality across space and in biodiversity through time. On the scale of seconds, we learned a typical soundscape model that allowed automatic identification of anomalous sounds in playback experiments, providing a potential route for real-time automated detection of irregular environmental behavior including illegal logging and hunting. Our highly generalizable approach, and the common set of features, will enable scientists to unlock previously hidden insights from acoustic data and offers promise as a backbone technology for global collaborative autonomous ecosystem monitoring efforts.
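A minimal sketch of this pipeline follows: embed short audio windows into a common feature space, fit a model of the "typical" soundscape, and score new windows by their distance from it. The time-averaged log-mel embedding below is a simple stand-in for the paper's CNN features, and all names are ours.

```python
# Sketch of anomaly scoring against a "typical soundscape" model.
import numpy as np
import librosa
from sklearn.neighbors import NearestNeighbors

def embed(window: np.ndarray, sr: int) -> np.ndarray:
    """Stand-in embedding: time-averaged log-mel spectrum of one window."""
    mel = librosa.feature.melspectrogram(y=window, sr=sr, n_mels=64)
    return np.log(mel + 1e-6).mean(axis=1)

def anomaly_scores(reference, queries) -> np.ndarray:
    """Distance of each query embedding to its nearest 'typical' neighbour."""
    nn = NearestNeighbors(n_neighbors=1).fit(np.stack(reference))
    dists, _ = nn.kneighbors(np.stack(queries))
    return dists.ravel()  # larger distance = more anomalous sound
```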
Subjects
Acoustics, Ecosystem, Environmental Monitoring/methods, Machine Learning, Sound Spectrography/classification, Firearms, Forestry, Sound, Speech
ABSTRACT
When performing binaural spatialisation, it is widely accepted that the choice of head-related transfer functions (HRTFs), and in particular the use of individually measured ones, can have an impact on localisation accuracy, externalisation, and overall realism. Yet the impact of HRTF choice on speech-in-noise performance in cocktail party-like scenarios has not been investigated in depth. This paper introduces a study in which 22 participants were presented with a frontal speech target and two lateral maskers, spatialised using a set of non-individual HRTFs. Speech reception threshold (SRT) was measured for each HRTF. Furthermore, using the SRT predicted by an existing speech perception model, the measured values were compensated in an attempt to remove overall HRTF-specific benefits. Results show significant overall differences among the SRTs measured using different HRTFs, consistent with the model's predictions. Individual differences between participants in their SRTs across HRTFs were also found, but their significance was reduced after the compensation. The implications of these findings are relevant to several research areas related to spatial hearing and speech perception, suggesting that when testing speech-in-noise performance within binaurally rendered virtual environments, the choice of HRTF for each individual should be carefully considered.
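The compensation step can be illustrated with simple arithmetic: subtracting the model-predicted, HRTF-specific SRT from each measured SRT leaves a residual that reflects individual rather than HRTF-wide effects. All values and labels below are invented for illustration.

```python
# Illustrative compensation: remove the predicted HRTF-specific benefit
# from each measured SRT. HRTF names and values are placeholders.
measured_srt = {"hrtf_A": -7.2, "hrtf_B": -5.9, "hrtf_C": -6.6}   # dB SNR
predicted_srt = {"hrtf_A": -7.0, "hrtf_B": -6.1, "hrtf_C": -6.4}  # model output

compensated = {h: measured_srt[h] - predicted_srt[h] for h in measured_srt}
print(compensated)  # residual per-HRTF effect after compensation
```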
Subjects
Speech Perception, Speech, Auditory Threshold, Hearing, Humans, Noise, Speech Reception Threshold Test
ABSTRACT
Reverberation is essential for the realistic auralisation of enclosed spaces. However, it can be computationally expensive to render with high fidelity and, in practice, simplified models are typically used to lower costs while preserving perceived quality. Ambisonics-based methods may be employed for this purpose, as they allow a reverberant sound field to be rendered more efficiently by limiting its spatial resolution. The present study explores the perceptual impact of two simplifications of Ambisonics-based binaural reverberation that aim to improve efficiency. First, a "hybrid Ambisonics" approach is proposed in which the direct sound path is generated by convolution with a spatially dense head-related impulse response set, separately from the reverberation. Second, the reverberant virtual loudspeaker method (RVL) is presented as a computationally efficient approach to dynamically rendering binaural reverberation for multiple sources, with the potential limitation of inaccurately simulating the listener's head rotations. Numerical and perceptual evaluations suggest that the perceived quality of hybrid Ambisonics auralisations of two measured rooms ceased to improve beyond the third order, a lower threshold than found by previous studies in which the direct sound path was not processed separately. Additionally, RVL is shown to produce auralisations with perceived quality comparable to Ambisonics renderings.
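A minimal sketch of the hybrid idea, under stated assumptions: the direct path is convolved with the HRIR pair nearest the source direction, while the reverberant part (here a precomputed binaural reverb standing in for the low-order Ambisonics render) is added separately. All signal names are placeholders.

```python
# Hybrid rendering sketch: full-resolution direct path + separate reverb.
import numpy as np
from scipy.signal import fftconvolve

def add_padded(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Sum two signals of different lengths, zero-padding the shorter."""
    out = np.zeros(max(len(a), len(b)))
    out[:len(a)] += a
    out[:len(b)] += b
    return out

def hybrid_render(dry, hrir_l, hrir_r, reverb_l, reverb_r):
    """Direct path via HRIR convolution, plus separately rendered reverb."""
    left = add_padded(fftconvolve(dry, hrir_l), reverb_l)
    right = add_padded(fftconvolve(dry, hrir_r), reverb_r)
    return left, right
```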
Subjects
Speech Perception, Sound
ABSTRACT
INTRODUCTION: Individuals with chronic lung disease (eg, cystic fibrosis (CF)) often receive antimicrobial therapy, including aminoglycosides, which can result in ototoxicity. Extended high-frequency audiometry has increased sensitivity for ototoxicity detection, but diagnostic audiometry in a sound booth is costly, time-consuming and requires a trained audiologist. This cross-sectional study analysed tablet-based audiometry (Shoebox MD) performed by non-audiologists in an outpatient setting, alongside home web-based audiometry (3D Tune-In), to screen for hearing loss in adults with CF. METHODS: Hearing was analysed in 126 adults with CF using validated questionnaires, a web self-hearing test (0.5 to 4 kHz), tablet audiometry (0.25 to 12 kHz) and sound-booth audiometry (0.25 to 12 kHz). A threshold of ≥25 dB hearing loss at ≥1 audiometric frequency was considered abnormal. Demographics and mitochondrial DNA sequencing were used to analyse risk factors, and the accuracy and usability of the hearing tests were determined. RESULTS: The prevalence of hearing loss within any frequency band tested was 48%. Multivariate analysis showed that age (OR 1.127 per year older; 95% CI 1.07 to 1.18; p<0.0001) and total intravenous antibiotic days over 10 years (OR 1.006 per additional intravenous day; 95% CI 1.002 to 1.010; p=0.004) were significantly associated with increased risk of hearing loss. Tablet audiometry had good usability and was 93% sensitive and 88% specific, with a 94% negative predictive value, for screening for hearing loss; web self-test audiometry and questionnaires had poor sensitivity (17% and 13%, respectively). Intraclass correlation (ICC) of tablet versus sound-booth audiometry showed high agreement (ICC >0.9) at all frequencies ≥4 kHz. CONCLUSIONS: Adults with CF have a high prevalence of drug-related hearing loss, and tablet-based audiometry can be a practical, accurate screening tool within integrated ototoxicity monitoring programmes for early detection.
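The screening criterion above reduces to a one-line rule, sketched below; the data structure and example values are illustrative, not from the study.

```python
# Screening rule: flag hearing loss when any tested frequency shows a
# threshold of 25 dB HL or worse. Audiogram values are invented.
def abnormal_audiogram(thresholds_db_hl: dict) -> bool:
    """thresholds_db_hl maps frequency in kHz -> threshold in dB HL."""
    return any(t >= 25 for t in thresholds_db_hl.values())

audiogram = {0.25: 10, 0.5: 15, 1: 10, 2: 20, 4: 30, 8: 45, 12: 50}
print(abnormal_audiogram(audiogram))  # True: high-frequency loss present
```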
Subjects
Cystic Fibrosis/complications, Hearing Loss/diagnosis, Hearing Loss/epidemiology, Adult, Audiometry, Handheld Computers, Cross-Sectional Studies, Cystic Fibrosis/therapy, Female, Humans, Internet, Male, Middle Aged, Prevalence, Risk Factors, Young Adult
ABSTRACT
Sound localization is essential for perceiving the surrounding world and interacting with objects. This ability can be learned across time, and multisensory and motor cues play a crucial role in the learning process. A recent study demonstrated that when training localization skills, reaching to the sound source to determine its position reduced localization errors faster and to a greater extent than simply naming the sources' positions, even though in both tasks participants received the same feedback about the correct position of sound sources in case of a wrong response. However, it remains to be established which features made reaching to sounds more effective than naming. In the present study, we introduced a further condition in which the hand is the effector providing the response but does not reach toward the space occupied by the target source: the pointing condition. We tested three groups of participants (naming, pointing, and reaching groups), each performing a sound localization task in normal and altered listening situations (i.e., mild-moderate unilateral hearing loss) simulated through auditory virtual reality technology. The experiment comprised four blocks: during the first and the last block, participants were tested in the normal listening condition, and during the second and the third in the altered listening condition. We measured their performance, their subjective judgments (e.g., effort), and their head-related behavior (through kinematic tracking). First, performance decreased when participants were exposed to asymmetrical mild-moderate hearing impairment, more specifically on the ipsilateral side and for the pointing group. Second, all groups decreased their localization errors across the altered listening blocks, but the extent of this reduction was greater for the reaching and pointing groups than for the naming group. Crucially, the reaching group showed the greatest error reduction for the side where the listening alteration was applied. Furthermore, across blocks, the reaching and pointing groups increased their head motor behavior during the task (i.e., they increased approaching head movements toward the space of the sound) more than the naming group. Third, while performance in the unaltered blocks (first and last) was comparable, only the reaching group continued to exhibit head behavior similar to that developed during the altered blocks (second and third), corroborating the previously observed relationship between reaching to sounds and head movements. In conclusion, this study further demonstrates the effectiveness of reaching to sounds, compared with pointing and naming, in the learning process. This effect could be related both to the process of implementing goal-directed motor actions and to the role of reaching actions in fostering the implementation of head-related motor strategies.
Subjects
Hearing Loss, Sound Localization, Virtual Reality, Humans, Hearing/physiology, Sound Localization/physiology, Hearing Tests
ABSTRACT
We report findings from the first randomized controlled pilot trial of virtual reality exposure therapy (VRET) developed specifically for reducing social anxiety associated with stuttering. People who stutter with heightened social anxiety were recruited through online adverts and randomly allocated to receive VRET (n = 13) or to a waitlist (n = 12). Treatment was delivered remotely using a smartphone-based VR headset. It consisted of three weekly sessions, each comprising both performative and interactive exposure exercises, and was guided by a virtual therapist. Multilevel model analyses failed to demonstrate the effectiveness of VRET at reducing social anxiety between pre- and post-treatment. We found similar results for fear of negative evaluation, negative thoughts associated with stuttering, and stuttering characteristics. However, VRET was associated with reduced social anxiety between post-treatment and one-month follow-up. These pilot findings suggest that our current VRET protocol may not be effective at reducing social anxiety amongst people who stutter, though it might be capable of supporting longer-term change. Future VRET protocols targeting stuttering-related social anxiety should be explored with larger samples. The results of this pilot trial provide a solid basis for further design improvements and for future research to explore appropriate techniques for widening access to social anxiety treatments in stuttering.
ABSTRACT
Spatial hearing is critical for communication in everyday sound-rich environments, so it is important to understand how well users of bilateral hearing devices function in these conditions. The purpose of this work was to evaluate a Virtual Acoustics (VA) version of the Spatial Speech in Noise (SSiN) test, the SSiN-VA. This implementation uses relatively inexpensive equipment and can be performed outside the clinic, allowing for regular monitoring of spatial-hearing performance. The SSiN-VA simultaneously assesses speech discrimination and relative localization with changing source locations in the presence of noise. The use of simultaneous tasks increases the cognitive load to better represent the difficulties faced by listeners in noisy real-world environments. Current clinical assessments may require costly equipment with a large footprint; consequently, spatial-hearing assessments may not be conducted at all. Additionally, as patients take greater control of their healthcare outcomes and more clinical appointments are conducted remotely, outcome measures that allow patients to carry out assessments at home are becoming more relevant. The SSiN-VA was implemented using the 3D Tune-In Toolkit, simulating seven loudspeaker locations spaced at 30° intervals at azimuths between -90° and +90°, and rendered for headphone playback using binaural spatialization. Twelve normal-hearing participants were assessed to evaluate whether SSiN-VA produced patterns of responses for relative localization and speech discrimination as a function of azimuth similar to those previously obtained using loudspeaker arrays. Additionally, the effects of the signal-to-noise ratio (SNR), the direction of the shift from target to reference, and the target phonetic contrast on performance were investigated. SSiN-VA led to patterns of performance as a function of spatial location similar to those of loudspeaker setups for both relative localization and speech discrimination. Performance for relative localization was significantly better at the highest SNR than at the lowest SNR tested, and a target shift to the right was associated with an increased likelihood of a correct response. For word discrimination, there was an interaction between SNR and word group. Overall, these outcomes support the use of virtual audio for speech discrimination and relative localization testing in noise.
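The simulated layout above is easy to reproduce; a sketch follows. Rendering each source would convolve the stimulus with the HRIR for that azimuth (the paper uses the 3D Tune-In Toolkit; the helper below is a generic placeholder of our own).

```python
# Seven virtual loudspeakers spaced 30 degrees apart, -90 to +90 azimuth.
azimuths_deg = list(range(-90, 91, 30))   # [-90, -60, -30, 0, 30, 60, 90]

def nearest_source(target_azimuth: float) -> int:
    """Index of the virtual loudspeaker closest to a target azimuth."""
    return min(range(len(azimuths_deg)),
               key=lambda i: abs(azimuths_deg[i] - target_azimuth))

print(nearest_source(-42.0))  # 2, i.e. the -30 degree loudspeaker
```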
ABSTRACT
OBJECTIVE: This study investigates how spatial working memory skills and the processing and retrieval of distal auditory spatial information are influenced by visual experience. METHOD: We developed an experimental paradigm using an acoustic simulation. The performance of congenitally blind and sighted participants (n = 9 per group) was compared when recalling sequences of spatialised auditory items in the same or reverse order of presentation. Two experimental conditions based on stimulus features were tested: non-semantic and semantic. RESULTS: Blind participants had a shorter memory span in the backward than in the forward order of presentation, whereas sighted participants did not, revealing that blindness affects spatial information processing involving greater executive resources. Furthermore, blind participants performed worse overall than the sighted group, and semantic information significantly improved performance regardless of the experimental group and the sequences' order of presentation. CONCLUSIONS: Lack of early visual experience affects the ability to encode the surrounding space. Congenital blindness influences the processing and retrieval of spatial auditory items, suggesting that visual experience plays a pivotal role in calibrating spatial memory abilities using the remaining sensory modalities.
Subjects
Short-Term Memory, Spatial Memory, Acoustic Stimulation, Blindness, Humans, Mental Recall, Ocular Vision
ABSTRACT
Early bilateral cochlear implants (CIs) may enhance attention to speech and reduce cognitive load in noisy environments. However, it is sometimes difficult to measure speech perception and listening effort, especially in very young children. Behavioral measures cannot always be obtained in young or uncooperative children, whereas objective measures are either difficult to assess or do not reliably correlate with behavioral measures. Recent studies have thus explored pupillometry as a possible objective measure. Here, pupillometry is introduced to assess attention to speech and music in noise in very young children with bilateral CIs (N = 14, age: 17-47 months) and in an age-matched group of normally hearing (NH) children (N = 14, age: 22-48 months). The results show that the response to speech was affected by the presence of background noise only in children with CIs, not in NH children. Conversely, the presence of background noise altered the pupil response to music only in NH children. We conclude that, whereas speech and music may receive comparable attention in comparable listening conditions, background noise affects attention to speech and speech processing more in young children with CIs than in NH children. Potential implications of the results for rehabilitation procedures are discussed.
ABSTRACT
Listeners can attend to and track instruments or singing voices in complex musical mixtures, even though the acoustical energy of sounds from individual instruments may overlap in time and frequency. In popular music, lead vocals are often accompanied by sound mixtures from a variety of instruments, such as drums, bass, keyboards, and guitars. However, little is known about how the perceptual organization of such musical scenes is affected by selective attention, and which acoustic features play the most important role. To investigate these questions, we explored the role of auditory attention in a realistic musical scenario. We conducted three online experiments in which participants detected single cued instruments or voices in multi-track musical mixtures. Stimuli consisted of 2-s multi-track excerpts of popular music. In one condition, the target cue preceded the mixture, allowing listeners to selectively attend to the target. In another condition, the target was presented after the mixture, requiring a more "global" mode of listening. Performance differences between these two conditions were interpreted as effects of selective attention. In Experiment 1, detection performance was generally dependent on the target's instrument category, but listeners were more accurate when the target was presented before the mixture rather than after it. Lead vocals appeared to be nearly unaffected by this change in presentation order and achieved the highest accuracy compared with the other instruments, suggesting a particular salience of vocal signals in musical mixtures. In Experiment 2, filtering was used to avoid potential spectral masking of target sounds. Although detection accuracy increased for all instruments, a similar pattern of instrument-specific differences between presentation orders was observed. In Experiment 3, adjusting the sound level differences between the targets reduced the effect of presentation order but did not affect the differences between instruments. While both acoustic manipulations facilitated the detection of targets, vocal signals remained particularly salient, suggesting that the manipulated features did not drive vocal salience. These findings demonstrate that lead vocals serve as robust attractor points of auditory attention regardless of the manipulation of low-level acoustic cues.
ABSTRACT
Acoustic indices derived from environmental soundscape recordings are being used to monitor ecosystem health and vocal animal biodiversity. Soundscape data can quickly become very expensive and difficult to manage, so data compression or temporal down-sampling are sometimes employed to reduce data storage and transmission costs. These parameters vary widely between experiments, and the consequences of this variation remain mostly unknown. We analyse field recordings from North-Eastern Borneo across a gradient of historical land use. We quantify the impact of experimental parameters (MP3 compression, recording length and temporal subsetting) on soundscape descriptors (Analytical Indices and a convolutional neural network-derived AudioSet Fingerprint). Both descriptor types were tested for their robustness to parameter alteration and their usability in a soundscape classification task. We find that compression and recording length both drive considerable variation in calculated index values. However, the effects of this variation and of temporal subsetting on the performance of classification models are minor: performance is much more strongly determined by the choice of acoustic index, with AudioSet Fingerprinting offering substantially greater (12%-16%) classifier accuracy, precision and recall. We advise using the AudioSet Fingerprint in soundscape analysis, having found superior and consistent performance even on small pools of data. If data storage is a bottleneck to a study, we recommend Variable Bit Rate encoded compression (quality = 0), which reduces files to 23% of their original size without affecting most Analytical Index values. Audio intended only for the AudioSet Fingerprint can be compressed further to a Constant Bit Rate encoding of 64 kb/s (8% of the original file size) without any detectable effect. These recommendations allow the efficient use of restricted data storage whilst permitting comparability of results between different studies.
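The two recommended settings map directly onto standard LAME encoder options; a sketch using the ffmpeg command-line tool (assumed to be installed on the system) follows, with placeholder filenames.

```python
# Hedged example of the recommended compression settings via ffmpeg.
import subprocess

def compress_vbr_q0(src: str, dst: str) -> None:
    """MP3 VBR at the highest quality setting (-q:a 0), ~23% of WAV size."""
    subprocess.run(["ffmpeg", "-i", src, "-codec:a", "libmp3lame",
                    "-q:a", "0", dst], check=True)

def compress_cbr_64k(src: str, dst: str) -> None:
    """MP3 CBR at 64 kb/s (~8% of WAV size), for AudioSet features only."""
    subprocess.run(["ffmpeg", "-i", src, "-codec:a", "libmp3lame",
                    "-b:a", "64k", dst], check=True)

compress_vbr_q0("recording.wav", "recording_vbr.mp3")
compress_cbr_64k("recording.wav", "recording_cbr64.mp3")
```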
ABSTRACT
INTRODUCTION: It is notoriously difficult to obtain a perfect fitting of hearing aids (HAs) for children, as they often struggle to understand their hearing loss well enough to discuss the fitting adequately with their audiologist. Dartanan is an 'edutainment' game developed to help children understand the functions of their HA in different sound contexts. Dartanan also has elements of a leisure game for all children, in order to create an inclusive activity. METHODS: Game prototypes were evaluated during two formative evaluations and a summative evaluation. In total, 106 children with and without hearing loss in Italy, Spain and the UK played Dartanan. A built-in virtual HA enabled children with hearing loss to use headphones to play. RESULTS AND CONCLUSIONS: During the formative stages, feedback on factors such as the audiological aspects, the extent to which children learned about HA functions, accessibility and usability was discussed during focus groups and presented to the developers. After redevelopment, a summative evaluation was performed using an online survey. It was concluded that the game had met the goals of helping children understand their HA functionalities and providing an inclusive activity. User evaluations were crucial in developing the app into a useful and usable service.
ABSTRACT
Older children and teenagers with bilateral cochlear implants often have poor spatial hearing because they cannot fuse sounds from the two ears. This deficit jeopardizes speech and language development, education, and social well-being. The lack of protocols for fitting bilateral cochlear implants and of resources for spatial-hearing training contributes to these difficulties. Spatial hearing develops with bilateral experience, and a large body of research demonstrates that sound localisation can improve with training, underpinned by plasticity-driven changes in the auditory pathways. Generalizing training to non-trained auditory skills is best achieved by using a multi-modal (audio-visual) implementation and multi-domain training tasks (localisation, speech-in-noise, and spatial music). The goal of this work was to develop a package of virtual-reality games (BEARS, Both EARS) to train spatial hearing in young people (8-16 years) with bilateral cochlear implants using an action-research protocol. The protocol used formalized cycles for participants to trial aspects of the BEARS suite, reflect on their experiences, and in turn inform changes in the game implementations. This participatory design used the stakeholder participants as co-creators. The cycles for each of the three domains (localisation, spatial speech-in-noise, and spatial music) were customized to focus on the elements that the stakeholder participants considered important. The participants agreed that the final games were appropriate and ready to be used by patients. The main areas of modification were: the variety of immersive scenarios to cover the age range and interests, the number of levels of complexity to ensure small improvements were measurable, feedback and reward schemes to ensure positive reinforcement, and an additional implementation on an iPad for those who had difficulties with the headsets due to age or balance issues. The effectiveness of the BEARS training suite will be evaluated in a large-scale clinical trial to determine whether using the games leads to improvements in speech-in-noise performance, quality of life, perceived benefit, and cost utility. Such interventions allow patients to take control of their own management, reducing reliance on outpatient-based rehabilitation. For young people, a virtual-reality implementation is more engaging than traditional rehabilitation methods, and the participatory design used here has ensured that the BEARS games are relevant.
ABSTRACT
BACKGROUND: Multiple gaming apps exist under the dementia umbrella for skills such as navigation; however, an app to specifically investigate the role of hearing loss in the process of cognitive decline is yet to be designed. There is a demonstrable gap in the utilization of games to further the knowledge of the potential relationship between hearing loss and dementia. OBJECTIVE: This study aims to identify the needs, facilitators, and barriers in designing a novel auditory-cognitive training gaming app. METHODS: A participatory design approach was used to engage key stakeholders across audiology and cognitive disorder specialties. Two rounds, including paired semistructured interviews and focus groups, were completed and thematically analyzed. RESULTS: A total of 18 stakeholders participated, and 6 themes were identified to inform the next stage of app development. These included congruence with hobbies, life getting in the way, motivational challenge, accessibility, addictive competition, and realism. CONCLUSIONS: The findings can now be implemented in the development of the app. The app will be evaluated against outcome measures of speech listening in noise, cognitive and attentional tasks, quality of life, and usability.
ABSTRACT
BACKGROUND: People with visual impairments can experience numerous challenges navigating unfamiliar environments, and systems that operate as prenavigation tools can assist them. This mixed-methods study examined the effectiveness of an interactive audio-tactile map tool for cognitive mapping and recall among people who were blind or had visual impairments. The tool was developed with the involvement of visually impaired individuals, who additionally provided further feedback throughout this research. METHODS: A mixed-methods experimental design was employed. Fourteen participants were allocated either to an experimental group exposed to an audio-tactile map or to a control group exposed to a verbally annotated tactile map. After five minutes' exposure, multiple-choice questions examined participants' recall of the spatial and navigational content. Subsequent semi-structured interviews were conducted to examine their views on the study and the product. RESULTS: The experimental group had significantly better overall recall than the control group and higher average scores in all four areas examined by the questions. The interviews suggested that the interactive component offered individuals the freedom to learn the map in several ways and did not restrict them to a sequential and linear approach to learning. CONCLUSION: Assistive technology can reduce the challenges faced by people with visual impairments, and the flexible learning approach offered by the audio-tactile map may be of particular value. Future researchers and assistive technology developers may wish to explore this further.
Subjects
Blindness/rehabilitation, Cognition, Mental Recall, Assistive Technology, Visually Impaired Persons/rehabilitation, Adult, Aged, Female, Hearing, Humans, Male, Middle Aged, Spatial Navigation, Touch
ABSTRACT
BACKGROUND: Pre-navigational tools can assist visually impaired people when navigating unfamiliar environments. Assistive technology products (e.g., tactile maps or auditory simulations) can stimulate cognitive mapping processes to provide navigational assistance to these people. OBJECTIVES: We compared how well blind and visually impaired people could learn a map presented via a tablet-computer auditory tactile map (ATM), in contrast to a conventional tactile map accompanied by a text description. METHODS: Performance was assessed with a multiple-choice test that quizzed participants on orientation and spatial awareness. Semi-structured interviews explored participant experiences and preferences. RESULTS: A statistically significant difference was found between the conditions, with participants using the ATM performing much better than those who used a conventional tactile map and text description. Participants preferred the flexibility of learning offered by the ATM. CONCLUSION: This computer-based ATM provided an effective, easy-to-use and cost-effective way of enabling blind and partially sighted people to learn a cognitive map and enhance their well-being.
Subjects
Handheld Computers, Spatial Orientation, Touch, Visually Impaired Persons, Blindness, Humans, Ocular Vision
ABSTRACT
This study examines the effect of adaptation to non-ideal auditory localization cues represented by the head-related transfer function (HRTF), and the retention of training for up to three months after the last session. Continuing from a previous study on rapid non-individual HRTF learning, subjects using non-individual HRTFs were tested alongside control subjects using their own measured HRTFs. The perceptually worst-rated non-individual HRTFs were chosen to represent the worst-case scenario in practice and to allow for the maximum potential for improvement. The methodology consisted of a training game and a localization test to evaluate performance, carried out over 10 sessions. Sessions 1-4 occurred at 1-week intervals and were performed by all subjects. During the initial sessions, subjects showed improvement in localization performance for polar error. Following this, half of the subjects stopped the training game element, continuing with only the localization task. The group that continued to train showed further improvement, with 3 of 8 subjects achieving group mean polar errors comparable to those of the control group. The majority of the group that stopped the training game retained the performance attained at the end of session 4. In general, adaptation was found to be quite subject-dependent, highlighting the limits of HRTF adaptation in the case of poor HRTF matches. No identifier that would predict learning ability was observed.
ABSTRACT
Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain's ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated: one provided simple visual positional confirmation of the sound source location, a second introduced game-design elements ("gamification"), and a final version additionally utilized head tracking to provide listeners with experience of relative sound source motion ("active listening"). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgement for the active listening group. The implications of this for the putative mechanisms of the adaptation process are discussed.
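Localization accuracy in studies like those above is commonly summarized by the angular (great-circle) error between target and response directions; a minimal sketch follows, with azimuth/elevation in degrees. The function name and example values are ours, not from the paper.

```python
# Great-circle localization error between two directions on the sphere.
import numpy as np

def angular_error_deg(az_t, el_t, az_r, el_r):
    """Angle (degrees) between target and response directions."""
    az_t, el_t, az_r, el_r = map(np.radians, (az_t, el_t, az_r, el_r))
    cos_angle = (np.sin(el_t) * np.sin(el_r)
                 + np.cos(el_t) * np.cos(el_r) * np.cos(az_t - az_r))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

print(angular_error_deg(30, 0, 45, 10))  # ~18 degrees of error
```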