Animated virtual characters to explore audio-visual speech in controlled and naturalistic environments.

Thézé, Raphaël; Gadiri, Mehdi Ali; Albert, Louis; Provost, Antoine; Giraud, Anne-Lise; Mégevand, Pierre

Affiliation

Thézé R; Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland.
Gadiri MA; Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland.
Albert L; Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland.
Provost A; Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland.
Giraud AL; Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland.
Mégevand P; Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland. pierre.megevand@unige.ch.

Sci Rep; 10(1): 15540, 2020 Sep 23.

Article in English | MEDLINE | ID: mdl-32968127

ABSTRACT

Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized to computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are judicious, and that they can supplement natural speech with higher control over stimulus timing and content.

Subjects

Speech Perception; Speech; Visual Perception; Adult; Auditory Perception; Female; Humans; Male; Semantics; Young Adult

Full text: 1 Database: MEDLINE Main subject: Speech / Speech Perception / Visual Perception Language: En Year of publication: 2020 Document type: Article
