Results 1 - 20 of 30
1.
Psychon Bull Rev ; 31(1): 104-121, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37580454

ABSTRACT

Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.


Subject(s)
Speech Perception , Humans , Speech , Phonetics , Attention
2.
Psychon Bull Rev ; 29(2): 627-634, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34731443

ABSTRACT

The mapping between speech acoustics and phonemic representations is highly variable across talkers, and listeners are slower to recognize words when listening to multiple talkers compared with a single talker. Listeners' speech processing efficiency in mixed-talker settings improves when given time to reorient their attention to each new talker. However, it remains unknown how much time is needed to fully reorient attention to a new talker in mixed-talker settings so that speech processing becomes as efficient as when listening to a single talker. In this study, we examined how speech processing efficiency improves in mixed-talker settings as a function of the duration of continuous speech from a talker. In single-talker and mixed-talker conditions, listeners identified target words either in isolation or preceded by a carrier vowel of parametrically varying durations from 300 to 1,500 ms. Listeners' word identification was significantly slower in every mixed-talker condition compared with the corresponding single-talker condition. The costs associated with processing mixed-talker speech declined significantly as the duration of the speech carrier increased from 0 to 600 ms. However, increasing the carrier duration beyond 600 ms did not achieve further reduction in talker variability-related processing costs. These results suggest that two parallel mechanisms support processing talker variability: A stimulus-driven mechanism that operates on short timescales to reorient attention to new auditory sources, and a top-down mechanism that operates over longer timescales to allocate the cognitive resources needed to accommodate uncertainty in acoustic-phonemic correspondences during contexts where speech may come from multiple talkers.


Subject(s)
Speech Perception , Adaptation, Physiological , Auditory Perception , Humans , Speech , Speech Acoustics
3.
Brain Res ; 1778: 147720, 2022 03 01.
Article in English | MEDLINE | ID: mdl-34785256

ABSTRACT

Attention is a crucial component in sound source segregation, allowing auditory objects of interest to be both singled out and held in focus. Our study utilizes a fundamental paradigm for sound source segregation: a sequence of interleaved tones, A and B, of different frequencies that can be heard as a single integrated stream or segregated into two streams (auditory streaming paradigm). We focus on the irregular alternations between integrated and segregated that occur for long presentations, so-called auditory bistability. Psychoacoustic experiments demonstrate how attentional control, a listener's intention to experience integrated or segregated, biases perception in favour of different perceptual interpretations. Our data show that this is achieved by prolonging the dominance times of the attended percept and, to a lesser extent, by curtailing the dominance times of the unattended percept, an effect that remains consistent across a range of values for the difference in frequency between A and B. An existing neuromechanistic model describes the neural dynamics of perceptual competition downstream of primary auditory cortex (A1). The model allows us to propose plausible neural mechanisms for attentional control, as linked to different attentional strategies, in a direct comparison with behavioural data. A mechanism based on a percept-specific input gain best accounts for the effects of attentional control.
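The ABA- triplet stimulus used in this entry (and several below) is simple to synthesize. A minimal NumPy sketch, with illustrative values (500-Hz A tone, 7-semitone separation, 100-ms tones at 44.1 kHz) rather than any study's actual parameters:

```python
import numpy as np

def aba_sequence(f_a=500.0, df_semitones=7.0, tone_ms=100, n_triplets=10,
                 fs=44100):
    """Synthesize an ABA- triplet sequence: tones A, B, A, then a silent
    gap one tone long. Returns a float array with values in [-1, 1]."""
    f_b = f_a * 2.0 ** (df_semitones / 12.0)   # B sits df semitones above A
    n = int(fs * tone_ms / 1000)
    t = np.arange(n) / fs
    ramp = np.minimum(1.0, np.minimum(t, t[::-1]) / 0.01)  # 10-ms on/off ramps
    tone = lambda f: ramp * np.sin(2 * np.pi * f * t)
    triplet = np.concatenate([tone(f_a), tone(f_b), tone(f_a), np.zeros(n)])
    return np.tile(triplet, n_triplets)

sig = aba_sequence()
```

Small frequency separations tend to be heard as one integrated galloping stream and large ones as two segregated streams; writing `sig` to a WAV file at sample rate `fs` reproduces the classic demonstration.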


Subject(s)
Attention/physiology , Auditory Perception/physiology , Models, Theoretical , Psychoacoustics , Adult , Female , Humans , Male
4.
Brain Lang ; 221: 104996, 2021 10.
Article in English | MEDLINE | ID: mdl-34358924

ABSTRACT

Speech is processed less efficiently from discontinuous, mixed talkers than from one consistent talker, but little is known about the neural mechanisms for processing talker variability. Here, we measured psychophysiological responses to talker variability using electroencephalography (EEG) and pupillometry while listeners performed a delayed-recall digit span task. Listeners heard and recalled seven-digit sequences with both talker (single- vs. mixed-talker digits) and temporal (0- vs. 500-ms inter-digit intervals) discontinuities. Talker discontinuity reduced serial recall accuracy. Both talker and temporal discontinuities elicited P3a-like neural evoked responses, while rapid processing of mixed-talkers' speech led to increased phasic pupil dilation. Furthermore, mixed-talkers' speech produced less alpha oscillatory power during working memory maintenance, but not during speech encoding. Overall, these results are consistent with an auditory attention and streaming framework in which talker discontinuity leads to involuntary, stimulus-driven attentional reorientation to novel speech sources, resulting in the processing interference classically associated with talker variability.


Subject(s)
Speech Perception , Speech , Electroencephalography , Humans , Memory, Short-Term , Mental Recall
5.
Front Neurosci ; 15: 666627, 2021.
Article in English | MEDLINE | ID: mdl-34305516

ABSTRACT

The massive network of descending corticofugal projections has long been recognized by anatomists, but its functional contributions to sound processing and auditory-guided behaviors remain a mystery. Most efforts to characterize the auditory corticofugal system have been inductive, wherein function is inferred from a few studies employing a wide range of methods to manipulate varying limbs of the descending system in a variety of species and preparations. An alternative approach, which we focus on here, is to first establish auditory-guided behaviors that reflect the contribution of top-down influences on auditory perception. To this end, we postulate that auditory corticofugal systems may contribute to active listening behaviors in which the timing of bottom-up sound cues can be predicted from top-down signals arising from cross-modal cues, temporal integration, or self-initiated movements. Here, we describe a behavioral framework for investigating how auditory perceptual performance is enhanced when subjects can anticipate the timing of upcoming target sounds. Our first paradigm, studied both in human subjects and mice, reports species-specific differences in visually cued expectation of sound onset in a signal-in-noise detection task. A second paradigm performed in mice reveals the benefits of temporal regularity as a perceptual grouping cue when detecting repeating target tones in complex background noise. A final behavioral approach demonstrates significant improvements in frequency discrimination threshold and perceptual sensitivity when auditory targets are presented at a predictable temporal interval following motor self-initiation of the trial.
Collectively, these three behavioral approaches identify paradigms to study top-down influences on sound perception that are amenable to head-fixed preparations in genetically tractable animals, where it is possible to monitor and manipulate particular nodes of the descending auditory pathway with unparalleled precision.

6.
J Math Neurosci ; 11(1): 8, 2021 May 03.
Article in English | MEDLINE | ID: mdl-33939042

ABSTRACT

In the auditory streaming paradigm, alternating sequences of pure tones can be perceived as a single galloping rhythm (integration) or as two sequences with separated low and high tones (segregation). Although studied for decades, the neural mechanisms underlying this perceptual grouping of sound remain a mystery. With the aim of identifying a plausible minimal neural circuit that captures this phenomenon, we propose a firing rate model with two periodically forced neural populations coupled by fast direct excitation and slow delayed inhibition. By analyzing the model in a non-smooth, slow-fast regime we analytically prove the existence of a rich repertoire of dynamical states and of their parameter-dependent transitions. We impose plausible parameter restrictions and link all states with perceptual interpretations. Regions of stimulus parameters occupied by states linked with each percept match those found in behavioural experiments. Our model suggests that slow inhibition masks the perception of subsequent tones during segregation (forward masking), whereas fast excitation enables integration for large pitch differences between the two tones.
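The delayed-inhibition model itself is not reproduced here, but the slow-fast competition logic it shares with most bistability models can be sketched with a deterministic toy: the dominant percept slowly fatigues while the suppressed one recovers, and dominance flips when the fatigued percept's effective strength falls below its rival's. All parameter values below are illustrative, the hysteresis term `h` merely stands in for mutual inhibition, and real alternations are also noise-driven:

```python
import numpy as np

def dominance_durations(gain_a=1.0, gain_b=1.0, steps=6000, dt=1.0,
                        tau=200.0, h=0.3):
    """Toy adaptation-based alternation model. The dominant percept's
    adaptation a grows toward 1 (fatigue); the suppressed percept's a
    decays toward 0 (recovery). Dominance switches when the dominant
    percept's effective strength, gain - a, falls below the suppressed
    percept's by more than the hysteresis h. Returns the mean dominance
    duration of each percept in model time units."""
    gains = (gain_a, gain_b)
    a = [0.0, 0.0]
    winner, onset = 0, 0
    durs = ([], [])
    for t in range(steps):
        loser = 1 - winner
        a[winner] += dt * (1.0 - a[winner]) / tau   # dominant percept fatigues
        a[loser] += dt * (0.0 - a[loser]) / tau     # suppressed percept recovers
        if gains[winner] - a[winner] + h < gains[loser] - a[loser]:
            durs[winner].append((t - onset) * dt)
            winner, onset = loser, t
    return float(np.mean(durs[0])), float(np.mean(durs[1]))
```

With equal gains the two percepts alternate with equal mean dominance times; raising one percept's gain (e.g., `gain_a=1.05`) lengthens its dominance, the qualitative attention effect reported in entry 3 above.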

7.
Cognition ; 204: 104393, 2020 11.
Article in English | MEDLINE | ID: mdl-32688132

ABSTRACT

Phonetic variability across talkers imposes additional processing costs during speech perception, often measured by performance decrements between single- and mixed-talker conditions. However, models differ in their predictions about whether accommodating greater phonetic variability (i.e., more talkers) imposes greater processing costs. We measured speech processing efficiency in a speeded word identification task, in which we manipulated the number of talkers (1, 2, 4, 8, or 16) listeners heard. Word identification was less efficient in every mixed-talker condition compared to the single-talker condition, but the magnitude of this performance decrement was not affected by the number of talkers. Furthermore, in a condition with uniform transition probabilities between two talkers, word identification was more efficient when the talker was the same as the prior trial compared to trials when the talker switched. These results support an auditory streaming model of talker adaptation, where processing costs associated with changing talkers result from attentional reorientation.


Subject(s)
Speech Perception , Speech , Attention , Cognition , Humans , Phonetics
8.
Cereb Cortex ; 30(8): 4563-4580, 2020 06 30.
Article in English | MEDLINE | ID: mdl-32219312

ABSTRACT

At any given moment, we experience a perceptual scene as a single whole and yet we may distinguish a variety of objects within it. This phenomenon instantiates two properties of conscious perception: integration and differentiation. Integration is the property of experiencing a collection of objects as a unitary percept and differentiation is the property of experiencing these objects as distinct from each other. Here, we evaluated the neural information dynamics underlying integration and differentiation of perceptual contents during bistable perception. Participants listened to a sequence of tones (auditory bistable stimuli) experienced either as a single stream (perceptual integration) or as two parallel streams (perceptual differentiation) of sounds. We computed neurophysiological indices of information integration and information differentiation with electroencephalographic and intracranial recordings. When perceptual alternations were endogenously driven, the integrated percept was associated with an increase in neural information integration and a decrease in neural differentiation across frontoparietal regions, whereas the opposite pattern was observed for the differentiated percept. However, when perception was exogenously driven by a change in the sound stream (no bistability), neural oscillatory power distinguished between percepts but information measures did not. We demonstrate that perceptual integration and differentiation can be mapped to theoretically motivated neural information signatures, suggesting a direct relationship between phenomenology and neurophysiology.


Subject(s)
Auditory Perception/physiology , Brain/physiology , Acoustic Stimulation , Electroencephalography , Female , Humans , Male , Young Adult
9.
Psychon Bull Rev ; 27(2): 307-314, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31965484

ABSTRACT

One of the most fundamental questions that can be asked about any process concerns the underlying units over which it operates. And this is true not just for artificial processes (such as functions in a computer program that only take specific kinds of arguments) but also for mental processes. Over what units does the process of enumeration operate? Recent work has demonstrated that in visuospatial arrays, these units are often irresistibly discrete objects. When enumerating the number of discs in a display, for example, observers underestimate to a greater degree when the discs are spatially segmented (e.g., by connecting pairs of discs with lines): you try to enumerate discs, but your mind can't help enumerating dumbbells. This phenomenon has previously been limited to static displays, but of course our experience of the world is inherently dynamic. Is enumeration in time similarly based on discrete events? To find out, we had observers enumerate the number of notes in quick musical sequences. Observers underestimated to a greater degree when the notes were temporally segmented (into discrete musical phrases, based on pitch-range shifts), even while carefully controlling for both duration and the overall range and heterogeneity of pitches. Observers tried to enumerate notes, but their minds couldn't help enumerating musical phrases - since those are the events they experienced. These results thus demonstrate how discrete events are prominent in our mental lives, and how the units that constitute discrete events are not entirely under our conscious, intentional control.


Subject(s)
Mathematical Concepts , Pattern Recognition, Visual/physiology , Adult , Humans
10.
Eur J Neurosci ; 51(5): 1191-1200, 2020 03.
Article in English | MEDLINE | ID: mdl-28922512

ABSTRACT

Integrating sounds from the same source and segregating sounds from different sources in an acoustic scene are essential functions of the auditory system. Naturally, the auditory system simultaneously makes use of multiple cues. Here, we investigate the interaction between spatial cues and frequency cues in stream segregation in European starlings (Sturnus vulgaris) using an objective measure of perception. Neural responses to streaming sounds were recorded while the bird was performing a behavioural task that results in a higher sensitivity during a one-stream than a two-stream percept. Birds were trained to detect an onset time shift of a B tone in an ABA- triplet sequence in which A and B could differ in frequency and/or spatial location. If the frequency difference, the spatial separation between the signal sources, or both were increased, behavioural time-shift detection performance deteriorated. Spatial separation had a smaller effect on performance than the frequency difference, and both cues affected performance additively. Neural responses in the primary auditory forebrain were affected by the frequency and spatial cues. However, frequency and spatial cue differences large enough to elicit behavioural effects were not accompanied by correlated neural response differences. The difference between the neuronal response pattern and the behavioural response is discussed in relation to the task given to the bird. Perceptual effects of combining different cues in auditory scene analysis indicate that these cues are analysed independently and given different weights, suggesting that the streaming percept arises subsequent to initial cue analysis.


Subject(s)
Cues , Starlings , Acoustic Stimulation , Animals , Auditory Perception , Prosencephalon
11.
Hear Res ; 383: 107807, 2019 11.
Article in English | MEDLINE | ID: mdl-31622836

ABSTRACT

We explore stream segregation with temporally modulated acoustic features using behavioral experiments and modelling. The auditory streaming paradigm, in which alternating high-frequency (A) and low-frequency (B) tones appear in a repeating ABA- pattern, has been shown to be perceptually bistable for extended presentations (on the order of minutes). For a fixed, repeating stimulus, perception spontaneously changes (switches) at random times, every 2-15 s, between an integrated interpretation with a galloping rhythm and segregated streams. Streaming in a natural auditory environment requires segregation of auditory objects with features that evolve over time. With the relatively idealized ABA- triplet paradigm, we explore perceptual switching in a non-static environment by considering slowly and periodically varying stimulus features. Our previously published model captures the dynamics of auditory bistability and predicts here how perceptual switches are entrained, tightly locked to the rising and falling phase of modulation. In psychoacoustic experiments we find that entrainment depends on both the period of modulation and the intrinsic switch characteristics of individual listeners. The extended auditory streaming paradigm with slowly modulated stimulus features presented here will be of significant interest for future imaging and neurophysiology experiments by reducing the need for subjective perceptual reports of ongoing perception.


Subject(s)
Auditory Pathways/physiology , Environment , Perceptual Masking , Pitch Perception , Acoustic Stimulation , Computer Simulation , Female , Humans , Male , Models, Neurological , Psychoacoustics , Young Adult
12.
J Neurosci ; 39(33): 6482-6497, 2019 08 14.
Article in English | MEDLINE | ID: mdl-31189576

ABSTRACT

A key challenge in neuroscience is understanding how sensory stimuli give rise to perception, especially when the process is supported by neural activity from an extended network of brain areas. Perception is inherently subjective, so interrogating its neural signatures requires, ideally, a combination of three factors: (1) behavioral tasks that separate stimulus-driven activity from perception per se; (2) human subjects who self-report their percepts while performing those tasks; and (3) concurrent neural recordings acquired at high spatial and temporal resolution. In this study, we analyzed human electrocorticographic recordings obtained during an auditory task which supported mutually exclusive perceptual interpretations. Eight neurosurgical patients (5 male; 3 female) listened to sequences of repeated triplets where tones were separated in frequency by several semitones. Subjects reported spontaneous alternations between two auditory perceptual states, 1-stream and 2-stream, by pressing a button. We compared averaged auditory evoked potentials (AEPs) associated with 1-stream and 2-stream percepts and identified significant differences between them in primary and nonprimary auditory cortex, surrounding auditory-related temporoparietal cortex, and frontal areas. We developed classifiers to identify spatial maps of percept-related differences in the AEP, corroborating findings from statistical analysis. We used one-dimensional embedding spaces to perform the group-level analysis. 
Our data illustrate exemplar high-temporal-resolution AEP waveforms in the auditory core region; explain inconsistencies in perceptual effects within auditory cortex reported across noninvasive studies of triplet streaming; show percept-related changes in frontoparietal areas previously highlighted by studies that focused on perceptual transitions; and demonstrate that auditory cortex encodes maintenance of percepts and switches between them. SIGNIFICANCE STATEMENT The human brain has the remarkable ability to discern complex and ambiguous stimuli from the external world by parsing mixed inputs into interpretable segments. However, one's perception can deviate from objective reality. But how do perceptual discrepancies occur? What are their anatomical substrates? To address these questions, we performed intracranial recordings in neurosurgical patients as they reported their perception of sounds associated with two mutually exclusive interpretations. We identified signatures of subjective percepts as distinct from sound-driven brain activity in core and non-core auditory cortex and frontoparietal cortex. These findings were compared with previous studies of auditory bistable perception and suggested that perceptual transitions and maintenance of perceptual states were supported by common neural substrates.
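The percept-classification step can be illustrated with a toy nearest-mean (template) classifier on simulated single-trial evoked responses. The waveform shapes, noise level, and trial counts below are invented for illustration and are unrelated to the patients' data or to the classifiers the study actually used:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 0.5, 200)                     # 500 ms, 200 samples

# Two hypothetical percept-specific AEP templates (arbitrary shapes)
template = {0: np.sin(2 * np.pi * 4 * t) * np.exp(-t / 0.2),
            1: 0.6 * np.sin(2 * np.pi * 4 * t + 0.8) * np.exp(-t / 0.2)}

def trials(label, n):
    """Simulate n noisy single-trial responses for one percept."""
    return template[label] + 0.5 * rng.standard_normal((n, t.size))

# "Train": estimate a mean template per percept from labelled trials
train = {k: trials(k, 50).mean(axis=0) for k in (0, 1)}

def classify(x):
    # assign the percept whose estimated template is closest (Euclidean)
    return min((np.linalg.norm(x - train[k]), k) for k in (0, 1))[1]

test_x = np.vstack([trials(0, 100), trials(1, 100)])
test_y = np.array([0] * 100 + [1] * 100)
pred = np.array([classify(x) for x in test_x])
acc = float(np.mean(pred == test_y))
```

Nearest-mean classification is the simplest member of the family of linear decoders typically applied to percept-related AEP differences; with real data, cross-validation across sessions or patients replaces the fresh test set drawn here.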


Subject(s)
Auditory Cortex/physiology , Auditory Perception/physiology , Evoked Potentials, Auditory/physiology , Acoustic Stimulation , Adult , Electrocorticography , Female , Humans , Male , Middle Aged , Young Adult
13.
Trends Hear ; 22: 2331216518773226, 2018.
Article in English | MEDLINE | ID: mdl-29766759

ABSTRACT

The role of temporal cues in sequential stream segregation was investigated in cochlear implant (CI) listeners using a delay detection task composed of a sequence of bursts of pulses (B) on a single electrode interleaved with a second sequence (A) presented on the same electrode with a different pulse rate. In half of the trials, a delay was added to the last burst of the otherwise regular B sequence and the listeners were asked to detect this delay. As a jitter was added to the period between consecutive A bursts, time judgments between the A and B sequences provided an unreliable cue to perform the task. Thus, the segregation of the A and B sequences should improve performance. The pulse rate difference and the duration of the sequences were varied between trials. The performance in the detection task improved by increasing both the pulse rate differences and the sequence duration. This suggests that CI listeners can use pulse rate differences to segregate sequential sounds and that a segregated percept builds up over time. In addition, the contribution of place versus temporal cues for voluntary stream segregation was assessed by combining the results from this study with those from our previous study, where the same paradigm was used to determine the role of place cues on stream segregation. Pitch height differences between the A and the B sounds accounted for the results from both studies, suggesting that stream segregation is related to the salience of the perceptual difference between the sounds.


Subject(s)
Acoustic Stimulation , Auditory Perception/physiology , Cochlear Implants , Cues , Adult , Aged , Female , Humans , Male , Middle Aged , Young Adult
14.
J Neurosci ; 38(11): 2844-2853, 2018 03 14.
Article in English | MEDLINE | ID: mdl-29440556

ABSTRACT

Auditory signals arrive at the ear as a mixture that the brain must decompose into distinct sources based to a large extent on acoustic properties of the sounds. An important question concerns whether listeners have voluntary control over how many sources they perceive. This has been studied using pure high (H) and low (L) tones presented in the repeating pattern HLH-HLH-, which can form a bistable percept heard either as an integrated whole (HLH-) or as segregated into high (H-H-) and low (-L-) sequences. Although instructing listeners to try to integrate or segregate sounds affects reports of what they hear, this could reflect a response bias rather than a perceptual effect. We had human listeners (15 males, 12 females) continuously report their perception of such sequences and recorded neural activity using MEG. During neutral listening, a classifier trained on patterns of neural activity distinguished between periods of integrated and segregated perception. In other conditions, participants tried to influence their perception by allocating attention either to the whole sequence or to a subset of the sounds. They reported hearing the desired percept for a greater proportion of time than when listening neutrally. Critically, neural activity supported these reports; stimulus-locked brain responses in auditory cortex were more likely to resemble the signature of segregation when participants tried to hear segregation than when attempting to perceive integration. These results indicate that listeners can influence how many sound sources they perceive, as reflected in neural responses that track both the input and its perceptual organization. SIGNIFICANCE STATEMENT Can we consciously influence our perception of the external world? We address this question using sound sequences that can be heard either as coming from a single source or as two distinct auditory streams.
Listeners reported spontaneous changes in their perception between these two interpretations while we recorded neural activity to identify signatures of such integration and segregation. They also indicated that they could, to some extent, choose between these alternatives. This claim was supported by corresponding changes in responses in auditory cortex. By linking neural and behavioral correlates of perception, we demonstrate that the number of objects that we perceive can depend not only on the physical attributes of our environment, but also on how we intend to experience it.


Subject(s)
Auditory Perception/physiology , Intention , Acoustic Stimulation , Adolescent , Adult , Attention/physiology , Auditory Cortex/physiology , Electroencephalography , Female , Humans , Magnetoencephalography , Male , Sound , Young Adult
15.
Trends Hear ; 22: 2331216517750262, 2018.
Article in English | MEDLINE | ID: mdl-29347886

ABSTRACT

Sequential stream segregation by cochlear implant (CI) listeners was investigated using a temporal delay detection task composed of a sequence of regularly presented bursts of pulses on a single electrode (B) interleaved with an irregular sequence (A) presented on a different electrode. In half of the trials, a delay was added to the last burst of the regular B sequence, and the listeners were asked to detect this delay. As a jitter was added to the period between consecutive A bursts, time judgments between the A and B sequences provided an unreliable cue to perform the task. Thus, the segregation of the A and B sequences should improve performance. In Experiment 1, the electrode separation and the sequence duration were varied to clarify whether place cues help CI listeners to voluntarily segregate sounds and whether a two-stream percept needs time to build up. Results suggested that place cues can facilitate the segregation of sequential sounds if enough time is provided to build up a two-stream percept. In Experiment 2, the duration of the sequence was fixed, and only the electrode separation was varied to estimate the fission boundary. Most listeners were able to segregate the sounds for separations of three or more electrodes, and some listeners could segregate sounds coming from adjacent electrodes.


Subject(s)
Auditory Perception , Cochlear Implants , Cues , Acoustic Stimulation , Adult , Aged , Denmark , Female , Humans , Male , Middle Aged , Young Adult
16.
Neuropsychologia ; 108: 82-91, 2018 01 08.
Article in English | MEDLINE | ID: mdl-29197502

ABSTRACT

In perceptual multi-stability, perception stochastically switches between alternative interpretations of the stimulus allowing examination of perceptual experience independent of stimulus parameters. Previous studies found that listeners show temporally stable idiosyncratic switching patterns when listening to a multi-stable auditory stimulus, such as in the auditory streaming paradigm. This inter-individual variability can be described along two dimensions, Exploration and Segregation. In the current study, we explored the functional brain networks associated with these dimensions and their constituents using electroencephalography. Results showed that Segregation and its constituents are related to brain networks operating in the theta EEG band, whereas Exploration and its constituents are related to networks in the lower and upper alpha and beta bands. Thus, the dimensions on which individuals' perception differ from each other in the auditory streaming paradigm probably reflect separate perceptual processes in the human brain. Further, the results suggest that networks mainly located in left auditory areas underlie the perception of integration, whereas perceiving the alternative patterns is accompanied by stronger interhemispheric connections.


Subject(s)
Auditory Perception/physiology , Brain/physiology , Adolescent , Adult , Auditory Pathways/physiology , Electroencephalography , Female , Humans , Male , Spectroscopy, Near-Infrared , Young Adult
17.
Article in English | MEDLINE | ID: mdl-28044020

ABSTRACT

Multistability in perception is a powerful tool for investigating sensory-perceptual transformations, because it produces dissociations between sensory inputs and subjective experience. Spontaneous switching between different perceptual objects occurs during prolonged listening to a sound sequence of tone triplets or repeated words (termed auditory streaming and verbal transformations, respectively). We used these examples of auditory multistability to examine to what extent neurochemical and cognitive factors influence the observed idiosyncratic patterns of switching between perceptual objects. The concentrations of glutamate-glutamine (Glx) and γ-aminobutyric acid (GABA) in brain regions were measured by magnetic resonance spectroscopy, while personality traits and executive functions were assessed using questionnaires and response inhibition tasks. Idiosyncratic patterns of perceptual switching in the two multistable stimulus configurations were identified using a multidimensional scaling (MDS) analysis. Intriguingly, although switching patterns within each individual differed between auditory streaming and verbal transformations, similar MDS dimensions were extracted separately from the two datasets. Individual switching patterns were significantly correlated with Glx and GABA concentrations in auditory cortex and inferior frontal cortex but not with the personality traits and executive functions. Our results suggest that auditory perceptual organization depends on the balance between neural excitation and inhibition in different brain regions. This article is part of the themed issue 'Auditory and visual scene analysis'.


Subject(s)
Auditory Cortex/physiology , Auditory Perception , Frontal Lobe/physiology , Neurotransmitter Agents/metabolism , Acoustic Stimulation , Adult , Female , Glutamic Acid/metabolism , Glutamine/metabolism , Humans , Japan , Male , Middle Aged , Young Adult , gamma-Aminobutyric Acid/metabolism
18.
Article in English | MEDLINE | ID: mdl-28044024

ABSTRACT

This study investigates the neural correlates and processes underlying the ambiguous percept produced by a stimulus similar to Deutsch's 'octave illusion', in which each ear is presented with a sequence of alternating pure tones of low and high frequencies. The same sequence is presented to each ear, but in opposite phase, such that the left and right ears receive a high-low-high … and a low-high-low … pattern, respectively. Listeners generally report hearing the illusion of an alternating pattern of low and high tones, with all the low tones lateralized to one side and all the high tones lateralized to the other side. The current explanation of the illusion is that it reflects an illusory feature conjunction of pitch and perceived location. Using psychophysics and electroencephalogram measures, we test this and an alternative hypothesis involving synchronous and sequential stream segregation, and investigate potential neural correlates of the illusion. We find that the illusion of alternating tones arises from the synchronous tone pairs across ears rather than sequential tones in one ear, suggesting that the illusion involves a misattribution of time across perceptual streams, rather than a misattribution of location within a stream. The results provide new insights into the mechanisms of binaural streaming and synchronous sound segregation. This article is part of the themed issue 'Auditory and visual scene analysis'.


Subject(s)
Auditory Cortex/physiology , Auditory Perception , Hearing , Illusions , Acoustic Stimulation , Adult , Electroencephalography , Female , Humans , Male , Psychophysics , Young Adult
19.
Front Neurosci ; 10: 524, 2016.
Article in English | MEDLINE | ID: mdl-27895552

ABSTRACT

Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility for integrating complementary aspects of the models into a more comprehensive theory of ASA.
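One mechanism the review asks about is competition between alternative interpretations of the input. A standard way to implement this (in bistability modeling generally, not in any specific model reviewed here) is two units coupled by mutual inhibition with slow adaptation: the dominant interpretation suppresses its rival until adaptation weakens it and the percept switches. The parameter values below are illustrative.

```python
import numpy as np

def simulate_competition(steps=20_000, dt=1e-3, seed=0):
    """Two units with mutual inhibition and slow adaptation; returns,
    per time step, which interpretation is currently dominant (0 or 1)."""
    rng = np.random.default_rng(seed)
    r = np.array([1.0, 0.0])   # activities of the two interpretations
    a = np.zeros(2)            # slow adaptation variables
    beta, tau_r, tau_a = 3.0, 0.01, 2.0   # inhibition strength, time constants (s)
    drive = 1.0                # equal external input to both units
    dominant = []
    for _ in range(steps):
        inhib = beta * r[::-1]                          # each unit inhibits the other
        inp = drive - inhib - a + 0.05 * rng.standard_normal(2)
        r += dt / tau_r * (-r + np.maximum(inp, 0.0))   # fast rate dynamics
        a += dt / tau_a * (-a + 3.0 * r)                # adaptation tracks activity
        dominant.append(int(r[1] > r[0]))
    return np.array(dominant)

dom = simulate_competition()
switches = int(np.sum(np.abs(np.diff(dom))))  # number of perceptual switches
print(switches)
```

Because adaptation outlasts the fast rate dynamics, the simulation alternates between the two interpretations every second or so, the qualitative signature of the multistable switching that such competition models are meant to capture.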

20.
Trends Hear ; 20, 2016 Sep 18.
Article in English | MEDLINE | ID: mdl-27641681

ABSTRACT

Research on hearing has long been challenged with understanding our exceptional ability to hear out individual sounds in a mixture (the so-called cocktail party problem). Two general approaches to the problem have been taken using sequences of tones as stimuli. The first has focused on our tendency to hear sequences, sufficiently separated in frequency, split into separate cohesive streams (auditory streaming). The second has focused on our ability to detect a change in one sequence, ignoring all others (auditory masking). The two phenomena are clearly related, but that relation has never been evaluated analytically. This article offers a detection-theoretic analysis of the relation between multitone streaming and masking that underscores the expected similarities and differences between these phenomena and the predicted outcome of experiments in each case. The key to establishing this relation is the function linking performance to the information divergence of the tone sequences, DKL (a measure of the statistical separation of their parameters). A strong prediction is that streaming and masking of tones will be a common function of DKL provided that the statistical properties of sequences are symmetric. Results of experiments are reported supporting this prediction.
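The divergence measure DKL invoked above has a closed form when the tone-sequence parameters are modeled as univariate Gaussians, a common parameterization. The sketch below computes it; the specific means and standard deviations are illustrative, not values from the experiments, and also shows the symmetric case the prediction relies on (equal variances make the divergence the same in both directions).

```python
import math

def kl_gaussian(mu1, sigma1, mu2, sigma2):
    """D_KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ), in nats."""
    return (math.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
            - 0.5)

# Target vs. masker tone-parameter distributions (illustrative values)
d_forward = kl_gaussian(0.0, 1.0, 2.0, 1.0)
d_backward = kl_gaussian(2.0, 1.0, 0.0, 1.0)
print(d_forward, d_backward)  # equal sigmas: D_KL = (delta mu)^2 / 2 = 2.0 both ways
```

When the two sequences' variances differ, the divergence is no longer symmetric, which is why the common-function prediction is stated only for statistically symmetric sequences.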


Subject(s)
Hearing , Perceptual Masking , Sound , Acoustic Stimulation , Auditory Perception , Humans , Time Factors