Vision perceptually restores auditory spectral dynamics in speech.
Plass, John; Brang, David; Suzuki, Satoru; Grabowecky, Marcia.
Affiliations
  • Plass J; Department of Psychology, University of Michigan, Ann Arbor, MI 48109; jplass@umich.edu.
  • Brang D; Department of Psychology, University of Michigan, Ann Arbor, MI 48109.
  • Suzuki S; Department of Psychology, Northwestern University, Evanston, IL 60208.
  • Grabowecky M; Department of Psychology, Northwestern University, Evanston, IL 60208.
Proc Natl Acad Sci U S A; 117(29): 16920-16927, 2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32632010
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time-frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
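The comprehension test above hinges on selectively degrading either the spectral or the temporal dimension of a sentence spectrogram while leaving the other dimension intact. As a minimal sketch of that idea (not the study's actual filtering pipeline), the two manipulations can be approximated by smearing a frequency-by-time spectrogram with a moving average along one axis or the other; the function name `smear` and the toy data are illustrative assumptions:

```python
import numpy as np

def smear(spectrogram, axis, width):
    """Moving-average smoothing of a (freq x time) spectrogram along one axis.

    axis=0 blurs across frequency (spectral degradation, obscuring formant
    structure); axis=1 blurs across time (temporal degradation, obscuring
    envelope dynamics). A simplified stand-in for the study's degradation.
    """
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), axis, spectrogram)

# Toy spectrogram: 64 frequency bins x 100 time frames of random energy
rng = np.random.default_rng(0)
spec = rng.random((64, 100))

spectrally_degraded = smear(spec, axis=0, width=15)  # frequency detail lost
temporally_degraded = smear(spec, axis=1, width=15)  # temporal detail lost
```

Under the paper's account, visual speech should help most in the spectrally degraded condition, because mouth-shape information lets listeners recover the blurred frequency (formant) dynamics.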

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Speech Acoustics / Speech Perception / Visual Perception Study type: Prognostic_studies Limits: Adult / Female / Humans / Male Language: English Journal: Proc Natl Acad Sci U S A Publication year: 2020 Document type: Article