ABSTRACT
Multimodal integration is the formation of a coherent percept from different sensory inputs such as vision, audition, and somatosensation. Most research on multimodal integration in speech perception has focused on audio-visual integration. In recent years, audio-tactile integration has also been investigated, and it has been established that puffs of air applied to the skin, timed to coincide with auditory stimuli in listening tasks, shift the perception of voicing by naive listeners. The current study replicated and extended these findings by testing the effect of air puffs on gradations of voice onset time (VOT) along a continuum, rather than only the voiced and voiceless endpoints tested in the original work. Three continua were tested: bilabial ("pa/ba"), velar ("ka/ga"), and a vowel continuum ("head/hid") used as a control. The presence of air puffs significantly increased the likelihood of voiceless responses for the two VOT continua but had no effect on responses for the vowel continuum. Analysis of response times revealed that air puffs lengthened responses to intermediate (ambiguous) stimuli and shortened responses to endpoint (non-ambiguous) stimuli. The slowest response times were observed at the intermediate steps of all three continua, but for the bilabial continuum this effect interacted with the presence of air puffs: responses were slower when air puffs were present and faster when they were absent. This suggests that, during integration, auditory and aero-tactile inputs are weighted differently by the perceptual system, with the latter exerting greater influence when the auditory cues for voicing are ambiguous.
ABSTRACT
People can understand speech under poor conditions, even when successive segments of the waveform are reversed in time. Using a new method to measure perception of such stimuli, we show that words containing sounds that depend on rapid spectral changes (stop consonants) are far more impaired by segment reversal than words with fewer such sounds, and that words are much more resistant to this disruption than pseudowords. We then demonstrate that this lexical advantage is more characteristic of some listeners than others. Participants listened to speech that was degraded in two very different ways, and we measured each person's reliance on lexical support in each task. Listeners who relied on the lexicon to perceive one kind of degraded speech also relied on it when dealing with a quite different kind of degraded speech. Thus, people differ in their relative reliance on the speech signal versus their pre-existing knowledge.