Results 1 - 17 of 17
1.
Lang Speech; 63(2): 264-291, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31002280

ABSTRACT

The audiovisual speech signal contains multimodal cues to phrase boundaries. In three artificial language learning studies with 12 groups of adult participants, we investigated whether English monolinguals and bilingual speakers of English and a language with the opposite basic word order (i.e., in which objects precede verbs) can use word frequency, phrasal prosody, and co-speech (facial) visual information, namely head nods, to parse unknown languages into phrase-like units. We showed that monolinguals and bilinguals used the auditory and visual sources of information to chunk "phrases" from the input. These results suggest that speech segmentation is a bimodal process, though the influence of co-speech facial gestures is rather limited and linked to the presence of auditory prosody. Importantly, a pragmatic factor, namely the language of the context, seems to determine the bilinguals' segmentation, overriding the auditory and visual cues and revealing a factor that begs further exploration.


Subject(s)
Cues, Language Development, Multilingualism, Speech Perception, Verbal Learning, Acoustic Stimulation, Adolescent, Adult, Female, Humans, Language, Male, Photic Stimulation, Semantics, Young Adult
2.
J Child Lang; 47(2): 472-482, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31599214

ABSTRACT

Fourteen-month-old infants are unable to link minimal pair nonsense words with novel objects (Stager & Werker, 1997). Might an adult's productions in a word learning context support minimal pair word-object association in these infants? We recorded a mother interacting with her 24-month-old son, and with her 5-month-old son, producing the nonsense words bin and din. We used these productions to determine whether they had a differential effect on 14-month-old infants' word-object association abilities. Female infants who heard the words spoken to the 24-month-old succeeded, but those who heard the words spoken to the 5-month-old did not. We suggest that the task-appropriateness of utterances can support infant word learning.


Subject(s)
Association Learning, Language Development, Mother-Child Relations, Adult, Child, Preschool, Female, Humans, Infant, Male, Mothers, Sex Factors, Speech Perception, Verbal Learning
3.
PLoS One; 14(11): e0224786, 2019.
Article in English | MEDLINE | ID: mdl-31710615

ABSTRACT

The input contains perceptually available cues that might allow young infants to discover abstract properties of the target language. For instance, word frequency and prosodic prominence correlate systematically with basic word order in natural languages. Prelexical infants are sensitive to these frequency-based and prosodic cues and use them to parse new input into phrases that follow the order characteristic of their native languages. Importantly, young infants readily integrate auditory and visual facial information while processing language. Here, we ask whether co-verbal visual information provided by talking faces also helps prelexical infants learn the word order of their native language, in addition to word frequency and prosodic prominence. We created two structurally ambiguous artificial languages containing head nods produced by an animated avatar, aligned or misaligned with the frequency-based and prosodic information. For 4 minutes, two groups of 4- and 8-month-old infants were familiarized with the artificial language containing aligned auditory and visual cues, while two further groups were exposed to the misaligned language. Using a modified Headturn Preference Procedure, we tested infants' preference for test items exhibiting the word order of the native language, French, vs. the opposite word order. At 4 months, infants showed no preference, suggesting that they were not able to integrate the three available cues or had not yet built a representation of word order. By contrast, 8-month-olds showed no preference when auditory and visual cues were aligned and a preference for the native word order when visual cues were misaligned. These results imply that infants at this age start to integrate the co-verbal visual and auditory cues.


Subject(s)
Cues, Language Development, Language, Speech Perception, Verbal Learning, Visual Perception/physiology, Face, Female, Humans, Infant, Male, Photic Stimulation, Speech
4.
J Acoust Soc Am; 144(4): 2447, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30404498

ABSTRACT

A noninvasive method for accurately measuring anticipatory coarticulation at experimentally defined temporal locations is introduced. The method leverages work in audiovisual (AV) speech perception to provide a synthetic and robust measure that can be used to inform psycholinguistic theory. In this validation study, speakers were audio-video recorded while producing simple subject-verb-object sentences with contrasting object noun rhymes. The coarticulatory resistance of target noun onsets was manipulated, as was the metrical context for the determiner that modified the noun. Individual sentences were then gated from the verb to sentence end at segmental landmarks. These stimuli were presented to perceivers who were tasked with guessing the sentence-final rhyme. An audio-only condition was included to estimate the contribution of visual information to perceivers' performance. Perceivers accurately identified rhymes earlier in the AV condition than in the audio-only condition (i.e., at determiner onset vs. determiner vowel). Effects of coarticulatory resistance and metrical context were similar across conditions and consistent with previous work on coarticulation. These findings were further validated with acoustic measurement of the determiner vowel and a cumulative video-based measure of perioral movement. Overall, gated AV speech perception can be used to test specific hypotheses regarding coarticulatory scope and strength in running speech.


Subject(s)
Anticipation, Psychological, Psycholinguistics/methods, Speech Perception, Female, Humans, Language, Male, Speech Acoustics, Video Recording/methods, Young Adult
5.
Cogn Dev; 42: 37-48, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28970650

ABSTRACT

The period between six and 12 months is a sensitive period for language learning during which infants undergo auditory perceptual attunement, and recent results indicate that this sensitive period may exist across sensory modalities. We tested infants at three stages of perceptual attunement (six, nine, and 11 months) to determine 1) whether they were sensitive to the congruence between heard and seen speech stimuli in an unfamiliar language, and 2) whether familiarization with congruent audiovisual speech could boost subsequent non-native auditory discrimination. Infants at six and nine months, but not at 11 months, detected the audiovisual congruence of non-native syllables. Familiarization with incongruent, but not congruent, audiovisual speech changed auditory discrimination at test for six-month-olds but not for nine- or 11-month-olds. These results advance the proposal that speech perception is audiovisual from early in ontogeny, and that the sensitive period for audiovisual speech perception may last somewhat longer than that for auditory perception alone.

6.
Front Psychiatry; 6: 39, 2015.
Article in English | MEDLINE | ID: mdl-25852578

ABSTRACT

INTRODUCTION: Advanced video technology is available for sleep laboratories. However, low-cost equipment for screening in the home setting has not been identified and tested, nor has a methodology for the analysis of video recordings been suggested. METHODS: We investigated different combinations of hardware/software for home-videosomnography (HVS) and established a process for qualitative and quantitative analysis of HVS recordings. A case vignette (HVS analysis for a 5.5-year-old girl with major insomnia and several co-morbidities) demonstrates how methodological considerations were addressed and how HVS added value to clinical assessment. RESULTS: We suggest an "ideal set of hardware/software" that is reliable, affordable (∼$500), and portable (≤2.8 kg) to conduct non-invasive HVS, which allows time-lapse analyses. The equipment consists of a netbook, a camera with infrared optics, and a video capture device. (1) We present an HVS-analysis protocol consisting of three steps of analysis at varying replay speeds: (a) basic overview and classification at 16× normal speed; (b) second viewing and detailed descriptions at 4-8× normal speed; and (c) viewing, listening, and in-depth descriptions at real-time speed. (2) We also present a custom software program that facilitates video analysis and note-taking (Annotator©), and Optical Flow software that automatically quantifies movement for internal quality control of the HVS recording. The case vignette demonstrates how the HVS recordings revealed the dimension of insomnia caused by restless legs syndrome and illustrated the cascade of symptoms, challenging behaviors, and resulting medications. CONCLUSION: The strategy of using HVS, although requiring validation and reliability testing, opens the floor for a new "observational sleep medicine," which has been useful in describing discomfort-related behavioral movement patterns in patients with communication difficulties presenting with challenging/disruptive sleep/wake behaviors.
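The movement-quantification step lends itself to a short illustration. Below is a minimal sketch, in Python with OpenCV, of how dense optical flow between consecutive video frames can be reduced to a single per-frame movement index for quality control; the function name, parameter values, and file name are illustrative assumptions, not the software described in the abstract.

```python
# Hypothetical per-frame movement index from a home video recording,
# in the spirit of the optical-flow quality-control step described above.
# Assumes OpenCV (cv2) and a local file "night_recording.mp4"; both are
# illustrative choices, not part of the published method.
import cv2
import numpy as np

def movement_index(video_path: str) -> list[float]:
    """Return one mean optical-flow magnitude per pair of consecutive frames."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise IOError(f"cannot read {video_path}")
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    indices = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense Farneback optical flow between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.linalg.norm(flow, axis=2)   # pixel-wise motion size
        indices.append(float(magnitude.mean()))    # one value per frame pair
        prev = curr
    cap.release()
    return indices

# Example use: flag epochs whose movement exceeds an arbitrary threshold
# and review them at the slower replay speeds described above.
# scores = movement_index("night_recording.mp4")
```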

7.
PLoS One; 9(8): e105036, 2014.
Article in English | MEDLINE | ID: mdl-25119189

ABSTRACT

Behavioral coordination and synchrony contribute to a common biological mechanism that maintains communication, cooperation and bonding within many social species, such as primates and birds. Similarly, human language and social systems may also be attuned to coordination to facilitate communication and the formation of relationships. Gross similarities in movement patterns and convergence in the acoustic properties of speech have already been demonstrated between interacting individuals. In the present studies, we investigated how coordinated movements contribute to observers' perception of affiliation (friends vs. strangers) between two conversing individuals. We used novel computational methods to quantify motor coordination and demonstrated that individuals familiar with each other coordinated their movements more frequently. Observers used coordination to judge affiliation between conversing pairs but only when the perceptual stimuli were restricted to head and face regions. These results suggest that observed movement coordination in humans might contribute to perceptual decisions based on availability of information to perceivers.


Subject(s)
Interpersonal Relations, Kinesics, Speech, Adult, Female, Friends, Humans, Language, Male, Movement, Perception, Recognition, Psychology, Young Adult
8.
Proc Int Semin Speech Prod; 2014: 352-355, 2014 May.
Article in English | MEDLINE | ID: mdl-26097900

ABSTRACT

Parallels in the production of clear speech and words under prosodic focus suggest that both may be realized in the same way: as hyper-articulated speech. To directly investigate this possibility, school-aged children and college-aged adults produced target words in a default conversational style, a clear speech style, and with prosodic focus. The results were that children and adults both produced target vowels more distinctly and with greater mouth opening in the clear speech and prosodic focus conditions than in the default condition. Whereas the temporal scope of production changes varied as a function of condition in adults' speech, there was no evidence of this in children's speech.

9.
J Acoust Soc Am; 131(3): 2162-72, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22423712

ABSTRACT

This paper demonstrates an algorithm for computing the instantaneous correlation coefficient between two signals. The algorithm is the computational engine for analyzing the time-varying coordination between signals, an approach called correlation map analysis (CMA). Correlation is computed around any pair of points in the two input signals, so coordination can be assessed across a continuous range of temporal offsets and detected even when it changes over time. The correlation algorithm has two major features: (i) it is structurally similar to a tunable filter, requiring only one parameter to set its cutoff frequency (and sensitivity); (ii) it can be applied either uni-directionally (computing correlation based only on previous samples) or bi-directionally (computing correlation based on both previous and future samples). Computing instantaneous correlation for a range of time offsets between two signals produces a 2D correlation map, in which correlation is characterized as a function of time and temporal offset. Graphic visualization of the correlation map provides rapid assessment of how correspondence patterns progress through time. The utility of the algorithm and of CMA is exemplified using the spatial and temporal coordination of various audible and visible components associated with linguistic performance.
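To make the idea concrete, here is a minimal sketch in Python that uses exponentially weighted running estimates of means, variances, and covariance; the single smoothing parameter eta stands in for the tunable cutoff mentioned above. This is an illustration of the general approach, not the published CMA recursion, and the variable names and the lag-alignment step are assumptions.

```python
# Sketch of an exponentially weighted "instantaneous" correlation and the
# 2D correlation map built from it over a range of temporal offsets.
import numpy as np

def instantaneous_corr(x, y, eta=0.05, eps=1e-12):
    """Uni-directional (past-only) running correlation between arrays x and y."""
    mx = my = cxy = vx = vy = 0.0
    r = np.zeros(len(x))
    for t, (a, b) in enumerate(zip(x, y)):
        mx += eta * (a - mx)                        # running means
        my += eta * (b - my)
        cxy += eta * ((a - mx) * (b - my) - cxy)    # running covariance
        vx += eta * ((a - mx) ** 2 - vx)            # running variances
        vy += eta * ((b - my) ** 2 - vy)
        r[t] = cxy / np.sqrt(vx * vy + eps)         # instantaneous correlation
    return r

def correlation_map(x, y, max_offset=50, eta=0.05):
    """Stack instantaneous correlations over a range of temporal offsets."""
    rows = []
    for lag in range(-max_offset, max_offset + 1):
        y_shifted = np.roll(y, lag)                 # crude circular alignment
        rows.append(instantaneous_corr(x, y_shifted, eta))
    return np.array(rows)                           # shape: (offsets, time)

# Example: coordination between a pitch track and a head-motion track sampled
# at the same rate (hypothetical arrays f0 and head_y).
# cmap = correlation_map(f0, head_y, max_offset=100)
```

Plotting the resulting array as an image, with time on one axis and offset on the other, gives the kind of correlation map described in the abstract.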


Subject(s)
Algorithms, Gestures, Speech/physiology, Head Movements/physiology, Humans, Noise, Perceptual Masking, Speech Acoustics
10.
Lang Speech; 53(Pt 1): 49-69, 2010.
Article in English | MEDLINE | ID: mdl-20415002

ABSTRACT

Systematic syllable-based variation has been observed in the relative spatial and temporal properties of supralaryngeal gestures in a number of complex segments. Generally, more anterior gestures tend to appear at syllable peripheries while less anterior gestures occur closer to syllable peaks. Because previous studies compared only two gestures, it is not clear how to characterize the gestures, nor whether timing offsets are categorical or gradient. North American English /r/ is an unusually complex segment, having three supralaryngeal constrictions, but technological limitations have hindered simultaneous study of all three. A novel combination of M-mode ultrasound and optical tracking was used to measure gestural relations in productions of /r/ by nine speakers of Canadian English. Results show a front-to-back timing pattern in syllable-initial position: Lip then tongue blade (TB), then tongue root (TR). In syllable-final position TR and Lip are followed by TB. There was also a reduction in magnitude affecting Lip and TB gestures in syllable-final position and TR in syllable-initial position. These findings are not wholly consistent with any theory advanced thus far to explain syllable-based allophonic variation. It is proposed that the relative magnitude of gestures is a better predictor of timing than relative anteriority or an assigned phonological classification.


Subject(s)
Cues, Gestures, Language, Lip/physiology, Space Perception, Speech Acoustics, Tongue/physiology, Visual Perception, Adult, Canada, Female, Humans, Male, Phonetics, Speech Production Measurement, Tongue/diagnostic imaging, Ultrasonography, Young Adult
11.
J Exp Psychol Hum Percept Perform; 33(4): 905-14, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17683236

ABSTRACT

Recent studies have shown that the face and voice of an unfamiliar person can be matched for identity. Here the authors compare the relative effects of changing sentence content (what is said) and sentence manner (how it is said) on matching identity between faces and voices. A change between speaking a sentence as a statement and as a question disrupted matching performance, whereas changing the sentence itself did not. This was the case when the faces and voices were from the same race as participants and speaking a familiar language (English; Experiment 1) or from another race and speaking an unfamiliar language (Japanese; Experiment 2). Altering manner between conversational and clear speech (Experiment 3) or between conversational and casual speech (Experiment 4) was also disruptive. However, artificially slowing (Experiment 5) or speeding (Experiment 6) speech did not affect cross-modal matching performance. The results show that bimodal cues to identity are closely linked to manner but that content (what is said) and absolute tempo are not critical. Instead, prosodic variations in rhythmic structure and/or expressiveness may provide a bimodal, dynamic identity signature.


Subject(s)
Face, Speech Perception, Visual Perception, Voice, Adolescent, Adult, Female, Humans, Male, Recognition, Psychology, Voice Quality
12.
J Speech Lang Hear Res; 48(3): 543-53, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16197271

ABSTRACT

The tongue is critical in the production of speech, yet its nature has made it difficult to measure. Not only does its ability to attain complex shapes make it difficult to track, but it is also largely hidden from view during speech. The present article describes a new combination of optical tracking and ultrasound imaging that allows a noninvasive, real-time view of most of the tongue surface during running speech. The optical system (Optotrak) tracks the location of external structures in three-dimensional space using infrared-emitting diodes (IREDs). By tracking three or more IREDs on the head and a similar number on an ultrasound transceiver, the transduced image of the tongue can be corrected for the motion of both the head and the transceiver and thus be represented relative to the hard structures of the vocal tract. If structural magnetic resonance images of the speaker are available, they may also allow estimation of the location of the rear pharyngeal wall. This new technique is contrasted with other currently available options for imaging the tongue. It promises to provide high-quality, relatively low-cost imaging of most of the tongue surface during fairly unconstrained speech.
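The core of the motion-correction step is a pair of rigid-body transforms: one for the head markers and one for the transceiver markers. As a rough sketch of the geometry, assuming each marker set has a known reference configuration in its own body-fixed frame, the pose of each body can be recovered with a Kabsch/Procrustes fit, and tongue points measured in probe coordinates can then be re-expressed in a head-fixed frame. The function names, marker layouts, and calibration conventions below are illustrative assumptions, not the published processing pipeline.

```python
# Sketch: recover rigid-body poses of head and ultrasound probe from tracked
# markers, then map tongue points from probe coordinates into head coordinates.
import numpy as np

def rigid_fit(ref, obs):
    """Rotation R and translation t such that obs ≈ ref @ R.T + t (Kabsch)."""
    ref_c, obs_c = ref.mean(0), obs.mean(0)
    H = (ref - ref_c).T @ (obs - obs_c)          # cross-covariance of markers
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = obs_c - R @ ref_c
    return R, t

def probe_points_in_head_frame(tongue_probe, head_ref, head_obs,
                               probe_ref, probe_obs):
    """Re-express tongue points (probe coordinates) in a head-fixed frame.

    head_ref/probe_ref: marker layouts in each body's own frame (N x 3);
    head_obs/probe_obs: the same markers as observed in the lab frame (N x 3);
    tongue_probe: tongue surface points in probe coordinates (M x 3).
    """
    R_h, t_h = rigid_fit(head_ref, head_obs)     # head pose in the lab frame
    R_p, t_p = rigid_fit(probe_ref, probe_obs)   # probe pose in the lab frame
    lab = tongue_probe @ R_p.T + t_p             # probe -> lab
    return (lab - t_h) @ R_h                     # lab -> head (inverse rigid)
```

In practice a probe-to-image calibration would also be needed to map pixels in the ultrasound image into the probe's marker frame before this step.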


Subject(s)
Movement/physiology, Speech/physiology, Tongue/diagnostic imaging, Humans, Imaging, Three-Dimensional, Infrared Rays, Magnetic Resonance Imaging, Palatine Bone/diagnostic imaging, Ultrasonography, Videotape Recording
13.
J Cogn Neurosci; 16(5): 805-16, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15200708

ABSTRACT

Perception of speech is improved when presentation of the audio signal is accompanied by concordant visual speech gesture information. This enhancement is most prevalent when the audio signal is degraded. One potential means by which the brain affords perceptual enhancement is thought to be through the integration of concordant information from multiple sensory channels at common sites of convergence, known as multisensory integration (MSI) sites. Some studies have identified potential sites in the superior temporal gyrus/sulcus (STG/S) that are responsive to multisensory information from the auditory speech signal and visual speech movement. One limitation of these studies is that they do not control for activity resulting from attentional modulation cued by, for example, visual information signaling the onsets and offsets of the acoustic speech signal, or for activity resulting from MSI of properties of the auditory speech signal with aspects of gross visual motion that are not specific to place-of-articulation information. This fMRI experiment uses spatial wavelet bandpass filtered Japanese sentences presented with background multispeaker audio noise to discern brain activity reflecting MSI induced by auditory and visual correspondence of place-of-articulation information, while controlling for activity resulting from the above-mentioned factors. The experiment consists of a low-frequency (LF) filtered condition containing gross visual motion of the lips, jaw, and head without specific place-of-articulation information, a midfrequency (MF) filtered condition containing place-of-articulation information, and an unfiltered (UF) condition. Sites of MSI selectively induced by auditory and visual correspondence of place-of-articulation information were determined by the presence of activity for both the MF and UF conditions relative to the LF condition. Based on these criteria, sites of MSI were found predominantly in the left middle temporal gyrus (MTG) and the left STG/S (including the auditory cortex). By controlling for additional factors that could also induce greater activity resulting from visual motion information, this study identifies potential MSI sites that we believe are involved with improved speech perception intelligibility.


Subject(s)
Auditory Perception/physiology, Cerebral Cortex/physiology, Gestures, Speech Perception/physiology, Visual Perception/physiology, Adult, Brain Mapping, Cerebral Cortex/anatomy & histology, Female, Functional Laterality/physiology, Humans, Magnetic Resonance Imaging/methods, Male, Physical Stimulation/methods, Psychomotor Performance/physiology
14.
Psychol Sci; 15(2): 133-7, 2004 Feb.
Article in English | MEDLINE | ID: mdl-14738521

ABSTRACT

People naturally move their heads when they speak, and our study shows that this rhythmic head motion conveys linguistic information. Three-dimensional head and face motion and the acoustics of a talker producing Japanese sentences were recorded and analyzed. The head movement correlated strongly with the pitch (fundamental frequency) and amplitude of the talker's voice. In a perception study, Japanese subjects viewed realistic talking-head animations based on these movement recordings in a speech-in-noise task. The animations allowed the head motion to be manipulated without changing other characteristics of the visual or acoustic speech. Subjects correctly identified more syllables when natural head motion was present in the animation than when it was eliminated or distorted. These results suggest that nonverbal gestures such as head movements play a more direct role in the perception of speech than previously known.


Subject(s)
Head Movements, Sound Localization, Speech Acoustics, Speech Intelligibility, Speech Perception, Adult, Biomechanical Phenomena, Facial Expression, Female, Gestures, Humans, Imaging, Three-Dimensional, Male, Perceptual Distortion, Phonetics, Semantics, Sound Spectrography, User-Computer Interface
15.
Neuroreport; 14(17): 2213-8, 2003 Dec 02.
Article in English | MEDLINE | ID: mdl-14625450

ABSTRACT

This fMRI study explores brain regions involved in the perceptual enhancement afforded by observation of visual speech gesture information. Subjects passively identified words presented in the following conditions: audio-only, audiovisual, audio-only with noise, audiovisual with noise, and visual-only. The brain may use concordant audio and visual information to enhance perception by integrating the information at a converging multisensory site. Consistent with the response properties of multisensory integration sites, enhanced activity in the middle and superior temporal gyrus/sulcus was greatest when concordant audiovisual stimuli were presented with acoustic noise. Activity found in brain regions involved with planning and execution of speech production, in response to visual speech presented with degraded or absent auditory stimulation, is consistent with the use of an additional pathway through which speech perception is facilitated by a process of internally simulating the intended speech act of the observed speaker.


Subject(s)
Acoustic Stimulation/methods, Brain/physiology, Photic Stimulation/methods, Speech Perception/physiology, Visual Perception/physiology, Adult, Humans, Magnetic Resonance Imaging/methods, Male, Middle Aged, Neural Pathways/physiology, Psychomotor Performance/physiology
16.
Curr Biol; 13(19): 1709-14, 2003 Sep 30.
Article in English | MEDLINE | ID: mdl-14521837

ABSTRACT

Speech perception provides compelling examples of a strong link between auditory and visual modalities. This link originates in the mechanics of speech production, which, in shaping the vocal tract, determine the movement of the face as well as the sound of the voice. In this paper, we present evidence that equivalent information about identity is available cross-modally from both the face and voice. Using a delayed matching to sample task, XAB, we show that people can match the video of an unfamiliar face, X, to an unfamiliar voice, A or B, and vice versa, but only when stimuli are moving and are played forward. The critical role of time-varying information is underlined by the ability to match faces to voices containing only the coarse spatial and temporal information provided by sine wave speech [5]. The effect of varying sentence content across modalities was small, showing that identity-specific information is not closely tied to particular utterances. We conclude that the physical constraints linking faces to voices result in bimodally available dynamic information, not only about what is being said, but also about who is saying it.


Subject(s)
Facial Expression, Individuality, Speech Perception/physiology, Visual Perception/physiology, Voice/physiology, Adult, Female, Humans, Japan, Male, Videotape Recording
17.
J Cogn Neurosci; 15(6): 800-9, 2003 Aug 15.
Article in English | MEDLINE | ID: mdl-14511533

ABSTRACT

Neuropsychological research suggests that the neural system underlying the perception of visible speech on the basis of kinematics is distinct from the system underlying the perception of visible speech from static images of the face, and from the system for identifying whole-body actions from kinematics alone. Functional magnetic resonance imaging was used to identify the neural systems underlying point-light visible speech, as well as perception of a walking/jumping point-light body, to determine whether they are independent. Although both point-light stimuli produced overlapping activation in the right middle occipital gyrus encompassing area KO and in the right inferior temporal gyrus, they also activated distinct areas. Perception of walking biological motion activated a medial occipital area along the lingual gyrus close to the cuneus border, and the ventromedial frontal cortex, neither of which was activated by visible speech biological motion. In contrast, perception of visible speech biological motion activated right V5 and a network of motor-related areas (Broca's area, PM, M1, and the supplementary motor area (SMA)), none of which was activated by walking biological motion. Many of the areas activated by seeing visible speech biological motion are similar to those activated while speech-reading from an actual face, with the exception of M1 and medial SMA. The motor-related areas found to be active during point-light visible speech are consistent with recent work characterizing the human "mirror" system (Rizzolatti, Fadiga, Gallese, & Fogassi, 1996).


Subject(s)
Motion Perception/physiology, Motion, Motor Cortex/physiology, Speech/physiology, Walking/physiology, Adult, Biomechanical Phenomena, Brain Mapping, Female, Humans, Light, Magnetic Resonance Imaging/instrumentation, Magnetic Resonance Imaging/methods, Male, Pattern Recognition, Visual, Photic Stimulation, Speech Perception