Search | BVS Regional Portal

1.

Characterizing correlations in partial credit speech recognition scoring with beta-binomial distributions.

Bosen, Adam K.

JASA Express Lett ; 4(2), 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38299983

ABSTRACT

Partial credit scoring for speech recognition tasks can improve measurement precision. However, assessing the magnitude of this improvement with partial credit scoring is challenging because meaningful speech contains contextual cues, which create correlations between the probabilities of correctly identifying each token in a stimulus. Here, beta-binomial distributions were used to estimate recognition accuracy and intraclass correlation for phonemes in words and words in sentences in listeners with cochlear implants (N = 20). Estimates demonstrated substantial intraclass correlation in recognition accuracy within stimuli. These correlations were invariant across individuals. Intraclass correlations should be addressed in power analysis of partial credit scoring.

Subjects

Cochlear Implantation, Cochlear Implants, Speech Perception, Humans, Binomial Distribution, Speech
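The link between a beta-binomial distribution and intraclass correlation mentioned in the abstract above can be sketched with the standard parameterization, in which the correlation between tokens in one stimulus is 1 / (alpha + beta + 1). This is only an illustration, not the paper's estimation procedure; the function names and the example values (mean accuracy 0.5, correlation 0.2, five tokens per stimulus) are assumptions chosen for the sketch.

```python
def shape_from_mean_icc(mean, icc):
    """Convert a mean accuracy and an intraclass correlation to (alpha, beta)."""
    scale = (1.0 - icc) / icc  # equals alpha + beta
    return mean * scale, (1.0 - mean) * scale


def icc_from_shape(alpha, beta):
    """Intraclass correlation implied by beta-binomial shape parameters."""
    return 1.0 / (alpha + beta + 1.0)


def effective_tokens(n_tokens, icc):
    """Number of independent tokens that would give the same variance."""
    design_effect = 1.0 + (n_tokens - 1) * icc
    return n_tokens / design_effect


# Hypothetical values, not taken from the paper.
alpha, beta = shape_from_mean_icc(0.5, 0.2)
icc = icc_from_shape(alpha, beta)
n_eff = effective_tokens(5, icc)
```

Under these assumed values, five correlated tokens carry about as much information as 2.8 independent tokens, which is why the abstract recommends accounting for intraclass correlation in power analysis.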

2.

Identifying Links Between Latent Memory and Speech Recognition Factors.

Bosen, Adam K; Doria, Gianna M.

Ear Hear ; 45(2): 351-369, 2024.

Article in English | MEDLINE | ID: mdl-37882100

ABSTRACT

OBJECTIVES: The link between memory ability and speech recognition accuracy is often examined by correlating summary measures of performance across various tasks, but interpretation of such correlations critically depends on assumptions about how these measures map onto underlying factors of interest. The present work presents an alternative approach, wherein latent factor models are fit to trial-level data from multiple tasks to directly test hypotheses about the underlying structure of memory and the extent to which latent memory factors are associated with individual differences in speech recognition accuracy. Latent factor models with different numbers of factors were fit to the data and compared to one another to select the structures which best explained vocoded sentence recognition in a two-talker masker across a range of target-to-masker ratios, performance on three memory tasks, and the link between sentence recognition and memory. DESIGN: Young adults with normal hearing (N = 52 for the memory tasks, of which 21 participants also completed the sentence recognition task) completed three memory tasks and one sentence recognition task: reading span, auditory digit span, visual free recall of words, and recognition of 16-channel vocoded Perceptually Robust English Sentence Test Open-set sentences in the presence of a two-talker masker at target-to-masker ratios between +10 and 0 dB. Correlations between summary measures of memory task performance and sentence recognition accuracy were calculated for comparison to prior work, and latent factor models were fit to trial-level data and compared against one another to identify the number of latent factors which best explains the data. Models with one or two latent factors were fit to the sentence recognition data and models with one, two, or three latent factors were fit to the memory task data. 
Based on findings with these models, full models that linked one speech factor to one, two, or three memory factors were fit to the full data set. Models were compared via Expected Log pointwise Predictive Density and post hoc inspection of model parameters. RESULTS: Summary measures were positively correlated across memory tasks and sentence recognition. Latent factor models revealed that sentence recognition accuracy was best explained by a single factor that varied across participants. Memory task performance was best explained by two latent factors, of which one was generally associated with performance on all three tasks and the other was specific to digit span recall accuracy at lists of six digits or more. When these models were combined, the general memory factor was closely related to the sentence recognition factor, whereas the factor specific to digit span had no apparent association with sentence recognition. CONCLUSIONS: Comparison of latent factor models enables testing hypotheses about the underlying structure linking cognition and speech recognition. This approach showed that multiple memory tasks assess a common latent factor that is related to individual differences in sentence recognition, although performance on some tasks was associated with multiple factors. Thus, while these tasks provide some convergent assessment of common latent factors, caution is needed when interpreting what they tell us about speech recognition.

Subjects

Speech Perception, Speech, Young Adult, Humans, Cognition, Language, Hearing Tests

3.

FORUM: Remote testing for psychological and physiological acoustics.

Peng, Z Ellen; Waz, Sebastian; Buss, Emily; Shen, Yi; Richards, Virginia; Bharadwaj, Hari; Stecker, G Christopher; Beim, Jordan A; Bosen, Adam K; Braza, Meredith D; Diedesch, Anna C; Dorey, Claire M; Dykstra, Andrew R; Gallun, Frederick J; Goldsworthy, Raymond L; Gray, Lincoln; Hoover, Eric C; Ihlefeld, Antje; Koelewijn, Thomas; Kopun, Judy G; Mesik, Juraj; Shub, Daniel E; Venezia, Jonathan H.

J Acoust Soc Am ; 151(5): 3116, 2022 05.

Article in English | MEDLINE | ID: mdl-35649891

ABSTRACT

Acoustics research involving human participants typically takes place in specialized laboratory settings. Listening studies, for example, may present controlled sounds using calibrated transducers in sound-attenuating or anechoic chambers. In contrast, remote testing takes place outside of the laboratory in everyday settings (e.g., participants' homes). Remote testing could provide greater access to participants, larger sample sizes, and opportunities to characterize performance in typical listening environments at the cost of reduced control of environmental conditions, less precise calibration, and inconsistency in attentional state and/or response behaviors from relatively smaller sample sizes and unintuitive experimental tasks. The Acoustical Society of America Technical Committee on Psychological and Physiological Acoustics launched the Task Force on Remote Testing (https://tcppasa.org/remotetesting/) in May 2020 with goals of surveying approaches and platforms available to support remote testing and identifying challenges and considerations for prospective investigators. The results of this task force survey were made available online in the form of a set of Wiki pages and summarized in this report. This report outlines the state-of-the-art of remote testing in auditory-related research as of August 2021, which is based on the Wiki and a literature search of papers published in this area since 2020, and provides three case studies to demonstrate feasibility during practice.

Subjects

Acoustics, Auditory Perception, Attention/physiology, Humans, Prospective Studies, Sound

4.

Forward Digit Span and Word Familiarity Do Not Correlate With Differences in Speech Recognition in Individuals With Cochlear Implants After Accounting for Auditory Resolution.

Bosen, Adam K; Sevich, Victoria A; Cannon, Shauntelle A.

J Speech Lang Hear Res ; 64(8): 3330-3342, 2021 08 09.

Article in English | MEDLINE | ID: mdl-34251908

ABSTRACT

Purpose: In individuals with cochlear implants, speech recognition is not associated with tests of working memory that primarily reflect storage, such as forward digit span. In contrast, our previous work found that vocoded speech recognition in individuals with normal hearing was correlated with performance on a forward digit span task. A possible explanation for this difference across groups is that variability in auditory resolution across individuals with cochlear implants could conceal the true relationship between speech and memory tasks. Here, our goal was to determine if performance on forward digit span and speech recognition tasks are correlated in individuals with cochlear implants after controlling for individual differences in auditory resolution. Method: We measured sentence recognition ability in 20 individuals with cochlear implants with Perceptually Robust English Sentence Test Open-set sentences. Spectral and temporal modulation detection tasks were used to assess individual differences in auditory resolution, auditory forward digit span was used to assess working memory storage, and self-reported word familiarity was used to assess vocabulary. Results: Individual differences in speech recognition were predicted by spectral and temporal resolution. A correlation was found between forward digit span and speech recognition, but this correlation was not significant after controlling for spectral and temporal resolution. No relationship was found between word familiarity and speech recognition. Forward digit span performance was not associated with individual differences in auditory resolution. Conclusions: Our findings support the idea that sentence recognition in individuals with cochlear implants is primarily limited by individual differences in working memory processing, not storage. Studies examining the relationship between speech and memory should control for individual differences in auditory resolution.

Subjects

Cochlear Implantation, Cochlear Implants, Speech Perception, Humans, Psychological Recognition, Speech

5.

Serial Recall Predicts Vocoded Sentence Recognition Across Spectral Resolutions.

Bosen, Adam K; Barry, Michael F.

J Speech Lang Hear Res ; 63(4): 1282-1298, 2020 04 27.

Article in English | MEDLINE | ID: mdl-32213149

ABSTRACT

Purpose: The goal of this study was to determine how various aspects of cognition predict speech recognition ability across different levels of speech vocoding within a single group of listeners. Method: We tested the ability of young adults (N = 32) with normal hearing to recognize Perceptually Robust English Sentence Test Open-set (PRESTO) sentences that were degraded with a vocoder to produce different levels of spectral resolution (16, eight, and four carrier channels). Participants also completed tests of cognition (fluid intelligence, short-term memory, and attention), which were used as predictors of sentence recognition. Sentence recognition was compared across vocoder conditions, predictors were correlated with individual differences in sentence recognition, and the relationships between predictors were characterized. Results: PRESTO sentence recognition performance declined with a decreasing number of vocoder channels, with no evident floor or ceiling performance in any condition. Individual ability to recognize PRESTO sentences was consistent relative to the group across vocoder conditions. Short-term memory, as measured with serial recall, was a moderate predictor of sentence recognition (ρ = 0.65). Serial recall performance was constant across vocoder conditions when measured with a digit span task. Fluid intelligence was marginally correlated with serial recall, but not sentence recognition. Attentional measures had no discernible relationship to sentence recognition and a marginal relationship with serial recall. Conclusions: Verbal serial recall is a substantial predictor of vocoded sentence recognition, and this predictive relationship is independent of spectral resolution. In populations that show variable speech recognition outcomes, such as listeners with cochlear implants, it should be possible to account for the independent effects of spectral resolution and verbal serial recall in their speech recognition ability.
Supplemental Material https://doi.org/10.23641/asha.12021051.

Subjects

Cochlear Implants, Speech Perception, Humans, Language, Mental Recall, Psychological Recognition, Young Adult

6.

Acoustic-Phonetic Mismatches Impair Serial Recall of Degraded Words.

Bosen, Adam K; Monzingo, Elizabeth; AuBuchon, Angela M.

Audit Percept Cogn ; 3(1-2): 55-75, 2020.

Article in English | MEDLINE | ID: mdl-33554052

ABSTRACT

Sequences of phonologically similar words are more difficult to remember than phonologically distinct sequences. This study investigated whether this difficulty arises in the acoustic similarity of auditory stimuli or in the corresponding phonological labels in memory. Participants reconstructed sequences of words which were degraded with a vocoder. We manipulated the phonological similarity of response options across two groups. One group was trained to map stimulus words onto phonologically similar response labels which matched the recorded word; the other group was trained to map words onto a set of plausible responses which were mismatched from the original recordings but were selected to have less phonological overlap. Participants trained on the matched responses were able to learn responses with less training and recall sequences more accurately than participants trained on the mismatched responses, even though the mismatched responses were more phonologically distinct from one another and participants were unaware of the mismatch. The relative difficulty of recalling items in the correct position was the same across both sets of response labels. Mismatched responses impaired recall accuracy across all positions except the final item in each list. These results are consistent with the idea that increased difficulty of mapping acoustic stimuli onto phonological forms impairs serial recall. Increased mapping difficulty could impair retention of memoranda and impede consolidation into phonological forms, which would impair recall in adverse listening conditions.

7.

Interactions Between Item Set and Vocoding in Serial Recall.

Bosen, Adam K; Luckasen, Mary C.

Ear Hear ; 40(6): 1404-1417, 2019.

Article in English | MEDLINE | ID: mdl-31033634

ABSTRACT

OBJECTIVES: Serial recall of digits is frequently used to measure short-term memory span in various listening conditions. However, the use of digits may mask the effect of low quality auditory input. Digits have high frequency and are phonologically distinct relative to one another, so they should be easy to identify even with low quality auditory input. In contrast, larger item sets reduce listener ability to strategically constrain their expectations, which should reduce identification accuracy and increase the time and/or cognitive resources needed for identification when auditory quality is low. This diminished accuracy and increased cognitive load should interfere with memory for sequences of items drawn from large sets. The goal of this work was to determine whether this predicted interaction between auditory quality and stimulus set in short-term memory exists, and if so, whether this interaction is associated with processing speed, vocabulary, or attention. DESIGN: We compared immediate serial recall within young adults with normal hearing across unprocessed and vocoded listening conditions for multiple stimulus sets. Stimulus sets were lists of digits (1 to 9), consonant-vowel-consonant (CVC) words (chosen from a list of 60 words), and CVC nonwords (chosen from a list of 50 nonwords). Stimuli were unprocessed or vocoded with an eight-channel noise vocoder. To support interpretation of responses, words and nonwords were selected to minimize inclusion of multiple phonemes from within a confusion cluster. We also measured receptive vocabulary (Peabody Picture Vocabulary Test [PPVT-4]), sustained attention (test of variables of attention [TOVA]), and repetition speed for individual items from each stimulus set under both listening conditions. RESULTS: Vocoding stimuli had no impact on serial recall of digits, but reduced memory span for words and nonwords. This reduction in memory span was attributed to an increase in phonological confusions for nonwords. 
However, memory span for vocoded word lists remained reduced even after accounting for common phonetic confusions, indicating that lexical status played an additional role across listening conditions. Principal components analysis found two components that explained 84% of the variance in memory span across conditions. Component one had similar load across all conditions, indicating that participants had an underlying memory capacity, which was common to all conditions. Component two was loaded by performance in the vocoded word and nonword conditions, representing the sensitivity of memory span to vocoding of these stimuli. The order in which participants completed listening conditions had a small effect on memory span that could not account for the effect of listening condition. Repetition speed was fastest for digits, slower for words, and slowest for nonwords. On average, vocoding slowed repetition speed for all stimuli, but repetition speed was not predictive of individual memory span. Vocabulary and attention showed no correlation with memory span. CONCLUSIONS: Our results replicated previous findings that low quality auditory input can impair short-term memory, and demonstrated that this impairment is sensitive to stimulus set. Using multiple stimulus sets in degraded listening conditions can isolate memory capacity (in digit span) from impaired item identification (in word and nonword span), which may help characterize the relationship between memory and speech recognition in difficult listening conditions.

Subjects

Acoustic Stimulation/methods, Short-Term Memory, Mental Recall, Speech Perception, Attention, Female, Healthy Volunteers, Humans, Male, Phonetics, Vocabulary, Young Adult

8.

Individualized frequency importance functions for listeners with sensorineural hearing loss.

Yoho, Sarah E; Bosen, Adam K.

J Acoust Soc Am ; 145(2): 822, 2019 02.

Article in English | MEDLINE | ID: mdl-30823788

ABSTRACT

The Speech Intelligibility Index includes a series of frequency importance functions for calculating the estimated intelligibility of speech under various conditions. Until recently, techniques to derive frequency importance required averaging data over a group of listeners, thus hindering the ability to observe individual differences due to factors such as hearing loss. In the current study, the "random combination strategy" [Bosen and Chatterjee (2016). J. Acoust. Soc. Am. 140, 3718-3727] was used to derive frequency importance functions for individual hearing-impaired listeners, and normal-hearing participants for comparison. Functions were measured by filtering sentences to contain only random subsets of frequency bands on each trial, and regressing speech recognition against the presence or absence of bands across trials. Results show that the contribution of each band to speech recognition was inversely proportional to audiometric threshold in that frequency region, likely due to reduced audibility, even though stimuli were shaped to compensate for each individual's hearing loss. The results presented in this paper demonstrate that this method is sensitive to factors that alter the shape of frequency importance functions within individuals with hearing loss, which could be used to characterize the impact of audibility or other factors related to suprathreshold deficits or hearing aid processing strategies.

Subjects

Sensorineural Hearing Loss/physiopathology, Speech Intelligibility/physiology, Speech Perception/physiology, Adolescent, Adult, Audiometry, Female, Humans, Male, Precision Medicine, Young Adult
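The regression step described in the abstract above (scores regressed against the presence or absence of each band across trials) can be sketched as an ordinary least-squares fit on 0/1 band indicators. This is a minimal illustration under stated assumptions, not the authors' analysis: the data are noise-free and hypothetical, three bands are presented in every on/off combination rather than in random subsets, and the helper names are invented for the sketch.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def band_weights(presence, scores):
    """Least-squares fit of scores on an intercept plus 0/1 band indicators."""
    x = [[1.0] + [float(v) for v in row] for row in presence]
    k = len(x[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(x, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free trials: every on/off combination of three bands.
presence = list(product([0, 1], repeat=3))
true_weights = [0.1, 0.3, 0.2, 0.4]  # intercept, then one weight per band
scores = [
    true_weights[0] + sum(w * p for w, p in zip(true_weights[1:], row))
    for row in presence
]
weights = band_weights(presence, scores)
```

With real data each fitted band weight would be a noisy estimate of that band's contribution for one listener, which is the quantity an individualized frequency importance function summarizes.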

9.

Multiple time scales of the ventriloquism aftereffect.

Bosen, Adam K; Fleming, Justin T; Allen, Paul D; O'Neill, William E; Paige, Gary D.

PLoS One ; 13(8): e0200930, 2018.

Article in English | MEDLINE | ID: mdl-30067790

ABSTRACT

The ventriloquism aftereffect (VAE) refers to a shift in auditory spatial perception following exposure to a spatial disparity between auditory and visual stimuli. The VAE has been previously measured on two distinct time scales. Hundreds or thousands of exposures to an audio-visual spatial disparity produce enduring VAE that persists after exposure ceases. Exposure to a single audio-visual spatial disparity produces immediate VAE that decays over seconds. To determine if these phenomena are two extremes of a continuum or represent distinct processes, we conducted an experiment with normal hearing listeners that measured VAE in response to a repeated, constant audio-visual disparity sequence, both immediately after exposure to each audio-visual disparity and after the end of the sequence. In each experimental session, subjects were exposed to sequences of auditory and visual targets that were constantly offset by +8° or -8° in azimuth from one another, then localized auditory targets presented in isolation following each sequence. Eye position was controlled throughout the experiment, to avoid the effects of gaze on auditory localization. In contrast to other studies that did not control eye position, we found both a large shift in auditory perception that decayed rapidly after each AV disparity exposure, along with a gradual shift in auditory perception that grew over time and persisted after exposure to the AV disparity ceased. We modeled the temporal and spatial properties of the measured auditory shifts using grey box nonlinear system identification, and found that two models could explain the data equally well. In the power model, the temporal decay of the ventriloquism aftereffect was modeled with a power law relationship. This causes an initial rapid drop in auditory shift, followed by a long tail which accumulates with repeated exposure to audio-visual disparity. 
In the double exponential model, two separate processes were required to explain the data, one which accumulated and decayed exponentially and the other which slowly integrated over time. Both models fit the data best when the spatial spread of the ventriloquism aftereffect was limited to a window around the location of the audio-visual disparity. We directly compare the predictions made by each model, and suggest additional measurements that could help distinguish which model best describes the mechanisms underlying the VAE.

Subjects

Sound Localization, Psychological Adaptation, Adult, Eye Movements, Female, Humans, Male, Biological Models, Psychophysics, Space Perception, Time Factors, Visual Perception, Young Adult

10.

Accumulation and decay of visual capture and the ventriloquism aftereffect caused by brief audio-visual disparities.

Bosen, Adam K; Fleming, Justin T; Allen, Paul D; O'Neill, William E; Paige, Gary D.

Exp Brain Res ; 235(2): 585-595, 2017 02.

Article in English | MEDLINE | ID: mdl-27837258

ABSTRACT

Visual capture and the ventriloquism aftereffect resolve spatial disparities of incongruent auditory visual (AV) objects by shifting auditory spatial perception to align with vision. Here, we demonstrated the distinct temporal characteristics of visual capture and the ventriloquism aftereffect in response to brief AV disparities. In a set of experiments, subjects localized either the auditory component of AV targets (A within AV) or a second sound presented at varying delays (1-20 s) after AV exposure (A2 after AV). AV targets were trains of brief presentations (1 or 20), covering a ±30° azimuthal range, and with ±8° (R or L) disparity. We found that the magnitude of visual capture generally reached its peak within a single AV pair and did not dissipate with time, while the ventriloquism aftereffect accumulated with repetitions of AV pairs and dissipated with time. Additionally, the magnitude of the auditory shift induced by each phenomenon was uncorrelated across listeners and visual capture was unaffected by subsequent auditory targets, indicating that visual capture and the ventriloquism aftereffect are separate mechanisms with distinct effects on auditory spatial perception. Our results indicate that visual capture is a 'sample-and-hold' process that binds related objects and stores the combined percept in memory, whereas the ventriloquism aftereffect is a 'leaky integrator' process that accumulates with experience and decays with time to compensate for cross-modal disparities.

Subjects

Sound Localization/physiology, Vision Disparity/physiology, Visual Perception/physiology, Acoustic Stimulation, Adolescent, Adult, Analysis of Variance, Female, Humans, Male, Memory/physiology, Photic Stimulation, Young Adult

11.

Band importance functions of listeners with cochlear implants using clinical maps.

Bosen, Adam K; Chatterjee, Monita.

J Acoust Soc Am ; 140(5): 3718, 2016 Nov.

Article in English | MEDLINE | ID: mdl-27908046

ABSTRACT

Band importance functions estimate the relative contribution of individual acoustic frequency bands to speech intelligibility. Previous studies of band importance in listeners with cochlear implants have used experimental maps and direct stimulation. Here, band importance was estimated for clinical maps with acoustic stimulation. Listeners with cochlear implants had band importance functions that relied more heavily on lower frequencies and showed less cross-listener consistency than in listeners with normal hearing. The intersubject variability observed here indicates that averaging band importance functions across listeners with cochlear implants, as has been done in previous studies, may not be meaningful. Additionally, band importance functions of listeners with normal hearing for vocoded speech that either did or did not simulate spread of excitation were not different from one another, suggesting that additional factors beyond spread of excitation are necessary to account for changes in band importance in listeners with cochlear implants.

Subjects

Cochlear Implants, Acoustic Stimulation, Adult, Cochlear Implantation, Humans, Male, Middle Aged, Speech Intelligibility, Speech Perception, Young Adult

12.

Comparison of congruence judgment and auditory localization tasks for assessing the spatial limits of visual capture.

Bosen, Adam K; Fleming, Justin T; Brown, Sarah E; Allen, Paul D; O'Neill, William E; Paige, Gary D.

Biol Cybern ; 110(6): 455-471, 2016 12.

Article in English | MEDLINE | ID: mdl-27815630

ABSTRACT

Vision typically has better spatial accuracy and precision than audition and as a result often captures auditory spatial perception when visual and auditory cues are presented together. One determinant of visual capture is the amount of spatial disparity between auditory and visual cues: when disparity is small, visual capture is likely to occur, and when disparity is large, visual capture is unlikely. Previous experiments have used two methods to probe how visual capture varies with spatial disparity. First, congruence judgment assesses perceived unity between cues by having subjects report whether or not auditory and visual targets came from the same location. Second, auditory localization assesses the graded influence of vision on auditory spatial perception by having subjects point to the remembered location of an auditory target presented with a visual target. Previous research has shown that when both tasks are performed concurrently they produce similar measures of visual capture, but this may not hold when tasks are performed independently. Here, subjects alternated between tasks independently across three sessions. A Bayesian inference model of visual capture was used to estimate perceptual parameters for each session, which were compared across tasks. Results demonstrated that the range of audiovisual disparities over which visual capture was likely to occur was narrower in auditory localization than in congruence judgment, which the model indicates was caused by subjects adjusting their prior expectation that targets originated from the same location in a task-dependent manner.

Subjects

Auditory Perception, Biological Models, Animals, Bayes Theorem, Humans, Judgment, Sound Localization, Space Perception, Visual Perception
