ABSTRACT
Humans use prior expectations to improve perception, especially of sensory signals that are degraded or ambiguous. However, if sensory input deviates from prior expectations, then correct perception depends on adjusting or rejecting prior expectations. Failure to adjust or reject the prior leads to perceptual illusions, especially if there is partial overlap (and thus partial mismatch) between expectations and input. With speech, "slips of the ear" occur when expectations lead to misperception. For instance, an entomologist might be more likely to hear "The ants are my friends" for "The answer, my friend" (in the Bob Dylan song Blowin' in the Wind). Here, we contrast two mechanisms by which prior expectations may lead to misperception of degraded speech. First, clear representations of the common sounds in the prior and input (i.e., expected sounds) may lead to incorrect confirmation of the prior. Second, insufficient representations of sounds that deviate between prior and input (i.e., prediction errors) could lead to deception. We used crossmodal predictions from written words that partially match degraded speech to compare neural responses when male and female human listeners were deceived into accepting the prior or correctly rejected it. Combined behavioral and multivariate representational similarity analyses of fMRI data show that veridical perception of degraded speech is signaled by representations of prediction error in the left superior temporal sulcus. Instead of using top-down processes to support perception of expected sensory input, our findings suggest that the strength of neural prediction error representations distinguishes correct perception from misperception.
SIGNIFICANCE STATEMENT Misperceiving spoken words is an everyday experience, with outcomes that range from shared amusement to serious miscommunication. For hearing-impaired individuals, frequent misperception can lead to social withdrawal and isolation, with severe consequences for wellbeing. In this work, we specify the neural mechanisms by which prior expectations, which are so often helpful for perception, can lead to misperception of degraded sensory signals. Most descriptive theories of illusory perception explain misperception as arising from a clear sensory representation of features or sounds that are in common between prior expectations and sensory input. Our work instead provides support for a complementary proposal: that misperception occurs when there is an insufficient sensory representation of the deviation between expectations and sensory signals.
Subjects
Brain/physiology , Illusions/physiology , Motivation/physiology , Speech Perception/physiology , Adolescent , Adult , Brain Mapping/methods , Female , Humans , Magnetic Resonance Imaging , Male , Young Adult
ABSTRACT
This study set out to investigate whether the 'phonological onset preference effect' often reported in adult studies using the visual world task (i.e., increased attention to an object that is phonologically related to a spoken target word, such as boat-bear) is also contingent upon toddler participants having sufficient preview time to inspect the picture stimuli. Picture preview is thought to support the activation of phonological codes, which can then be matched to the phonological representations extracted from incoming speech signals and the picture stimuli, supporting the 'phonological mapping hypothesis'. We found that both toddlers and adults were able to show an early phonological onset preference in short preview conditions, although adults' early phonological onset preference in the short preview condition was extinguished by the presence of a semantic competitor, replicating previous adult findings (Huettig & McQueen, 2007). Removal of the semantic competitor reinstated the phonological onset preference effect under short preview conditions for adults. Our findings indicate that toddlers are driven more by bottom-up, phonological information when selecting a referent in a visual world task, whereas adults are more inclined to exploit top-down, semantic information when directing their attention to a visual object, especially when there is insufficient preview time. We propose that, when implicit naming is improbable in short-preview conditions, a phonological onset preference effect is driven by mapping on the visual-semantic levels, which is more susceptible to top-down influences.