Search | PAHO/WHO Uruguay

Web-based psychoacoustics: Hearing screening, infrastructure, and validation.

Mok, Brittany A; Viswanathan, Vibha; Borjigin, Agudemu; Singh, Ravinderjit; Kafi, Homeira; Bharadwaj, Hari M.

Behav Res Methods ; 56(3): 1433-1448, 2024 Mar.

Article in English | MEDLINE | ID: mdl-37326771

ABSTRACT

Anonymous web-based experiments are increasingly used in many domains of behavioral research. However, online studies of auditory perception, especially of psychoacoustic phenomena pertaining to low-level sensory processing, are challenging because of limited available control of the acoustics, and the inability to perform audiometry to confirm normal-hearing status of participants. Here, we outline our approach to mitigate these challenges and validate our procedures by comparing web-based measurements to lab-based data on a range of classic psychoacoustic tasks. Individual tasks were created using jsPsych, an open-source JavaScript front-end library. Dynamic sequences of psychoacoustic tasks were implemented using Django, an open-source library for web applications, and combined with consent pages, questionnaires, and debriefing pages. Subjects were recruited via Prolific, a subject recruitment platform for web-based studies. Guided by a meta-analysis of lab-based data, we developed and validated a screening procedure to select participants for (putative) normal-hearing status based on their responses in a suprathreshold task and a survey. Headphone use was standardized by supplementing procedures from prior literature with a binaural hearing task. Individuals meeting all criteria were re-invited to complete a range of classic psychoacoustic tasks. For the re-invited participants, absolute thresholds were in excellent agreement with lab-based data for fundamental frequency discrimination, gap detection, and sensitivity to interaural time delay and level difference. Furthermore, word identification scores, consonant confusion patterns, and co-modulation masking release effect also matched lab-based studies. Our results suggest that web-based psychoacoustics is a viable complement to lab-based research. Source code for our infrastructure is provided.

Subject(s)

Auditory Perception, Hearing, Humans, Psychoacoustics, Hearing/physiology, Auditory Perception/physiology, Audiometry, Internet, Auditory Threshold/physiology, Acoustic Stimulation

Speech Categorization Reveals the Role of Early-Stage Temporal-Coherence Processing in Auditory Scene Analysis.

Viswanathan, Vibha; Shinn-Cunningham, Barbara G; Heinz, Michael G.

J Neurosci ; 42(2): 240-254, 2022 Jan 12.

Article in English | MEDLINE | ID: mdl-34764159

ABSTRACT

Temporal coherence of sound fluctuations across spectral channels is thought to aid auditory grouping and scene segregation. Although prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, neurophysiological evidence suggests that temporal-coherence-based scene analysis may start as early as the cochlear nucleus (i.e., the first auditory region supporting cross-channel processing over a wide frequency range). Accordingly, we hypothesized that aspects of temporal-coherence processing that could be realized in early auditory areas may shape speech understanding in noise. We then explored whether physiologically plausible computational models could account for results from a behavioral experiment that measured consonant categorization in different masking conditions. We tested whether within-channel masking of target-speech modulations predicted consonant confusions across the different conditions and whether predictions were improved by adding across-channel temporal-coherence processing mirroring the computations known to exist in the cochlear nucleus. Consonant confusions provide a rich characterization of error patterns in speech categorization, and are thus crucial for rigorously testing models of speech perception; however, to the best of our knowledge, they have not been used in prior studies of scene analysis. We find that within-channel modulation masking can reasonably account for category confusions, but that it fails when temporal fine structure cues are unavailable. However, the addition of across-channel temporal-coherence processing significantly improves confusion predictions across all tested conditions. 
Our results suggest that temporal-coherence processing strongly shapes speech understanding in noise and that physiological computations that exist early along the auditory pathway may contribute to this process. SIGNIFICANCE STATEMENT: Temporal coherence of sound fluctuations across distinct frequency channels is thought to be important for auditory scene analysis. Prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, and it was unknown whether speech understanding in noise may be shaped by across-channel processing that exists in earlier auditory areas. Using physiologically plausible computational modeling to predict consonant confusions across different listening conditions, we find that across-channel temporal coherence contributes significantly to scene analysis and speech perception and that such processing may arise in the auditory pathway as early as the brainstem. By virtue of providing a richer characterization of error patterns not obtainable with just intelligibility scores, consonant confusions yield unique insight into scene analysis mechanisms.

Subject(s)

Auditory Pathways/physiology, Auditory Perception/physiology, Cochlea/physiology, Speech/physiology, Acoustic Stimulation, Auditory Threshold/physiology, Humans, Neurological Models, Perceptual Masking

Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble.

Viswanathan, Vibha; Shinn-Cunningham, Barbara G; Heinz, Michael G.

J Acoust Soc Am ; 150(4): 2664, 2021 Oct.

Article in English | MEDLINE | ID: mdl-34717498

ABSTRACT

To understand the mechanisms of speech perception in everyday listening environments, it is important to elucidate the relative contributions of different acoustic cues in transmitting phonetic content. Previous studies suggest that the envelope of speech in different frequency bands conveys most speech content, while the temporal fine structure (TFS) can aid in segregating target speech from background noise. However, the role of TFS in conveying phonetic content beyond what envelopes convey for intact speech in complex acoustic scenes is poorly understood. The present study addressed this question using online psychophysical experiments to measure the identification of consonants in multi-talker babble for intelligibility-matched intact and 64-channel envelope-vocoded stimuli. Consonant confusion patterns revealed that listeners had a greater tendency in the vocoded (versus intact) condition to be biased toward reporting that they heard an unvoiced consonant, despite envelope and place cues being largely preserved. This result was replicated when babble instances were varied across independent experiments, suggesting that TFS conveys voicing information beyond what is conveyed by envelopes for intact speech in babble. Given that multi-talker babble is a masker that is ubiquitous in everyday environments, this finding has implications for the design of assistive listening devices such as cochlear implants.

Subject(s)

Cochlear Implants, Speech Perception, Acoustic Stimulation, Noise/adverse effects, Perceptual Masking, Phonetics, Speech, Speech Intelligibility

Modulation masking and fine structure shape neural envelope coding to predict speech intelligibility across diverse listening conditions.

Viswanathan, Vibha; Bharadwaj, Hari M; Shinn-Cunningham, Barbara G; Heinz, Michael G.

J Acoust Soc Am ; 150(3): 2230, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34598642

ABSTRACT

A fundamental question in the neuroscience of everyday communication is how scene acoustics shape the neural processing of attended speech sounds and in turn impact speech intelligibility. While it is well known that the temporal envelopes in target speech are important for intelligibility, how the neural encoding of target-speech envelopes is influenced by background sounds or other acoustic features of the scene is unknown. Here, we combine human electroencephalography with simultaneous intelligibility measurements to address this key gap. We find that the neural envelope-domain signal-to-noise ratio in target-speech encoding, which is shaped by masker modulations, predicts intelligibility over a range of strategically chosen realistic listening conditions unseen by the predictive model. This provides neurophysiological evidence for modulation masking. Moreover, using high-resolution vocoding to carefully control peripheral envelopes, we show that target-envelope coding fidelity in the brain depends not only on envelopes conveyed by the cochlea, but also on the temporal fine structure (TFS), which supports scene segregation. Our results are consistent with the notion that temporal coherence of sound elements across envelopes and/or TFS influences scene analysis and attentive selection of a target sound. Our findings also inform speech-intelligibility models and technologies attempting to improve real-world speech communication.

Subject(s)

Speech Intelligibility, Speech Perception, Acoustic Stimulation, Acoustics, Auditory Perception, Humans, Perceptual Masking, Signal-To-Noise Ratio

Impact of Reduced Spectral Resolution on Temporal-Coherence-Based Source Segregation.

Viswanathan, Vibha; Heinz, Michael G; Shinn-Cunningham, Barbara G.

bioRxiv ; 2024 Mar 13.

Article in English | MEDLINE | ID: mdl-38586037

ABSTRACT

Hearing-impaired listeners struggle to understand speech in noise, even when using cochlear implants (CIs) or hearing aids. Successful listening in noisy environments depends on the brain's ability to organize a mixture of sound sources into distinct perceptual streams (i.e., source segregation). In normal-hearing listeners, temporal coherence of sound fluctuations across frequency channels supports this process by promoting grouping of elements belonging to a single acoustic source. We hypothesized that reduced spectral resolution, a hallmark of both electric/CI (from current spread) and acoustic (from broadened tuning) hearing with sensorineural hearing loss, degrades segregation based on temporal coherence. This is because reduced frequency resolution decreases the likelihood that a single sound source dominates the activity driving any specific channel; concomitantly, it increases the correlation in activity across channels. Consistent with our hypothesis, predictions from a physiologically plausible model of temporal-coherence-based segregation suggest that CI current spread reduces comodulation masking release (CMR; a correlate of temporal-coherence processing) and speech intelligibility in noise. These predictions are consistent with our behavioral data with simulated CI listening. Our model also predicts smaller CMR with increasing levels of outer-hair-cell damage. These results suggest that reduced spectral resolution relative to normal hearing impairs temporal-coherence-based segregation and speech-in-noise outcomes.

Intracranial Mapping of Response Latencies and Task Effects for Spoken Syllable Processing in the Human Brain.

Viswanathan, Vibha; Rupp, Kyle M; Hect, Jasmine L; Harford, Emily E; Holt, Lori L; Abel, Taylor J.

bioRxiv ; 2024 Apr 05.

Article in English | MEDLINE | ID: mdl-38617227

ABSTRACT

Prior lesion, noninvasive-imaging, and intracranial-electroencephalography (iEEG) studies have documented hierarchical, parallel, and distributed characteristics of human speech processing. Yet, there have not been direct, intracranial observations of the latency with which regions outside the temporal lobe respond to speech, or how these responses are impacted by task demands. We leveraged human intracranial recordings via stereo-EEG to measure responses from diverse forebrain sites during (i) passive listening to /bi/ and /pi/ syllables, and (ii) active listening requiring /bi/-versus-/pi/ categorization. We find that neural response latency increases from a few tens of ms in Heschl's gyrus (HG) to several tens of ms in superior temporal gyrus (STG), superior temporal sulcus (STS), and early parietal areas, and hundreds of ms in later parietal areas, insula, frontal cortex, hippocampus, and amygdala. These data also suggest parallel flow of speech information dorsally and ventrally, from HG to parietal areas and from HG to STG and STS, respectively. Latency data also reveal areas in parietal cortex, frontal cortex, hippocampus, and amygdala that are not responsive to the stimuli during passive listening but are responsive during categorization. Furthermore, multiple regions (spanning auditory, parietal, frontal, and insular cortices, and hippocampus and amygdala) show greater neural response amplitudes during active versus passive listening (a task-related effect). Overall, these results are consistent with hierarchical processing of speech at a macro level and parallel streams of information flow in temporal and parietal regions. These data also reveal regions where the speech code is stimulus-faithful and those that encode task-relevant representations.

Induced Alpha And Beta Electroencephalographic Rhythms Covary With Single-Trial Speech Intelligibility In Competition.

Viswanathan, Vibha; Bharadwaj, Hari M; Heinz, Michael G; Shinn-Cunningham, Barbara G.

bioRxiv ; 2023 May 22.

Article in English | MEDLINE | ID: mdl-36712081

ABSTRACT

Neurophysiological studies suggest that intrinsic brain oscillations influence sensory processing, especially of rhythmic stimuli like speech. Prior work suggests that brain rhythms may mediate perceptual grouping and selective attention to speech amidst competing sound, as well as more linguistic aspects of speech processing like predictive coding. However, we know of no prior studies that have directly tested, at the single-trial level, whether brain oscillations relate to speech-in-noise outcomes. Here, we recorded electroencephalography while simultaneously measuring intelligibility of spoken sentences amidst two different interfering sounds: multi-talker babble or speech-shaped noise. We find that induced parieto-occipital alpha (7-15 Hz; thought to modulate attentional focus) and frontal beta (13-30 Hz; associated with maintenance of the current sensorimotor state and predictive coding) oscillations covary with trial-wise percent-correct scores; importantly, alpha and beta power provide significant independent contributions to predicting single-trial behavioral outcomes. These results can inform models of speech processing and guide noninvasive measures to index different neural processes that together support complex listening.

Induced alpha and beta electroencephalographic rhythms covary with single-trial speech intelligibility in competition.

Viswanathan, Vibha; Bharadwaj, Hari M; Heinz, Michael G; Shinn-Cunningham, Barbara G.

Sci Rep ; 13(1): 10216, 2023 Jun 23.

Article in English | MEDLINE | ID: mdl-37353552

Subject(s)

Speech Intelligibility, Speech Perception, Speech Perception/physiology, Noise, Auditory Perception, Electroencephalography

Electroencephalographic Signatures of the Neural Representation of Speech during Selective Attention.

Viswanathan, Vibha; Bharadwaj, Hari M; Shinn-Cunningham, Barbara G.

eNeuro ; 6(5), 2019.

Article in English | MEDLINE | ID: mdl-31585928

ABSTRACT

The ability to selectively attend to speech in the presence of other competing talkers is critical for everyday communication; yet the neural mechanisms facilitating this process are poorly understood. Here, we use electroencephalography (EEG) to study how a mixture of two speech streams is represented in the brain as subjects attend to one stream or the other. To characterize the speech-EEG relationships and how they are modulated by attention, we estimate the statistical association between each canonical EEG frequency band (delta, theta, alpha, beta, low-gamma, and high-gamma) and the envelope of each of ten different frequency bands in the input speech. Consistent with previous literature, we find that low-frequency (delta and theta) bands show greater speech-EEG coherence when the speech stream is attended compared to when it is ignored. We also find that the envelope of the low-gamma band shows a similar attention effect, a result not previously reported with EEG. This is consistent with the prevailing theory that neural dynamics in the gamma range are important for attention-dependent routing of information in cortical circuits. In addition, we also find that the greatest attention-dependent increases in speech-EEG coherence are seen in the mid-frequency acoustic bands (0.5-3 kHz) of input speech and the temporal-parietal EEG sensors. Finally, we find individual differences in the following: (1) the specific set of speech-EEG associations that are the strongest, (2) the EEG and speech features that are the most informative about attentional focus, and (3) the overall magnitude of attentional enhancement of speech-EEG coherence.

Subject(s)

Attention/physiology, Speech Perception/physiology, Adult, Auditory Cortex/physiology, Electroencephalography, Female, Humans, Male
