Búsqueda | Portal Regional de la BVS

1.

Neurobiological foundations for the theory of harmony in western tonal music.

Tramo, M J; Cariani, P A; Delgutte, B; Braida, L D.

Ann N Y Acad Sci ; 930: 92-116, 2001 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-11458869

RESUMEN

Basic principles of the theory of harmony reflect physiological and anatomical properties of the auditory nervous system and related cognitive systems. This hypothesis is motivated by observations from several different disciplines, including ethnomusicology, developmental psychology, and animal behavior. Over the past several years, we and our colleagues have been investigating the vertical dimension of harmony from the perspective of neurobiology using physiological, psychoacoustic, and neurological methods. Properties of the auditory system that govern harmony perception include (1) the capacity of peripheral auditory neurons to encode temporal regularities in acoustic fine structure and (2) the differential tuning of many neurons throughout the auditory system to a narrow range of frequencies in the audible spectrum. Biologically determined limits on these properties constrain the range of notes used in music throughout the world and the way notes are combined to form intervals and chords in popular Western music. When a harmonic interval is played, neurons throughout the auditory system that are sensitive to one or more frequencies (partials) contained in the interval respond by firing action potentials. For consonant intervals, the fine timing of auditory nerve fiber responses contains strong representations of harmonically related pitches implied by the interval (e.g., Rameau's fundamental bass) in addition to the pitches of notes actually present in the interval. Moreover, all or most of the partials can be resolved by finely tuned neurons throughout the auditory system. By contrast, dissonant intervals evoke auditory nerve fiber activity that does not contain strong representations of constituent notes or related bass notes. Furthermore, many partials are too close together to be resolved. Consequently, they interfere with one another, cause coarse fluctuations in the firing of peripheral and central auditory neurons, and give rise to perception of roughness and dissonance. The effects of auditory cortex lesions on the perception of consonance, pitch, and roughness, combined with a critical reappraisal of published psychoacoustic data on the relationship between consonance and roughness, lead us to conclude that consonance is first and foremost a function of the pitch relationships among notes. Harmony in the vertical dimension is a positive phenomenon, not just a negative phenomenon that depends on the absence of roughness--a view currently held by many psychologists, musicologists, and physiologists.

Asunto(s)

Modelos Neurológicos , Música , Fenómenos Fisiológicos del Sistema Nervioso , Mundo Occidental , Corteza Auditiva/fisiología , Humanos , Percepción de la Altura Tonal/fisiología , Psicoacústica

2.

Development of speechreading supplements based on automatic speech recognition.

Duchnowski, P; Lum, D S; Krause, J C; Sexton, M G; Bratakos, M S; Braida, L D.

IEEE Trans Biomed Eng ; 47(4): 487-96, 2000 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-10763294

RESUMEN

In manual-cued speech (MCS) a speaker produces hand gestures to resolve ambiguities among speech elements that are often confused by speechreaders. The shape of the hand distinguishes among consonants; the position of the hand relative to the face distinguishes among vowels. Experienced receivers of MCS achieve nearly perfect reception of everyday connected speech. MCS has been taught to very young deaf children and greatly facilitates language learning, communication, and general education. This manuscript describes a system that can produce a form of cued speech automatically in real time and reports on its evaluation by trained receivers of MCS. Cues are derived by a hidden markov models (HMM)-based speaker-dependent phonetic speech recognizer that uses context-dependent phone models and are presented visually by superimposing animated handshapes on the face of the talker. The benefit provided by these cues strongly depends on articulation of hand movements and on precise synchronization of the actions of the hands and the face. Using the system reported here, experienced cue receivers can recognize roughly two-thirds of the keywords in cued low-context sentences correctly, compared to roughly one-third by speechreading alone (SA). The practical significance of these improvements is to support fairly normal rates of reception of conversational speech, a task that is often difficult via SA.

Asunto(s)

Equipos de Comunicación para Personas con Discapacidad , Sordera/rehabilitación , Gestos , Procesamiento de Lenguaje Natural , Percepción del Habla , Medición de la Producción del Habla/métodos , Adulto , Niño , Simulación por Computador , Señales (Psicología) , Presentación de Datos , Humanos , Lectura de los Labios , Cadenas de Markov , Modelos Biológicos , Lengua de Signos , Inteligibilidad del Habla , Interfaz Usuario-Computador

3.

A method to determine the speech transmission index from speech waveforms.

Payton, K L; Braida, L D.

J Acoust Soc Am ; 106(6): 3637-48, 1999 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-10615702

RESUMEN

A method for computing the speech transmission index (STI) using real speech stimuli is presented and evaluated. The method reduces the effects of some of the artifacts that can be encountered when speech waveforms are used as probe stimuli. Speech-based STIs are computed for conversational and clearly articulated speech in several noisy, reverberant, and noisy-reverberant environments and compared with speech intelligibility scores. The results indicate that, for each speaking style, the speech-based STI values are monotonically related to intelligibility scores for the degraded speech conditions tested. Therefore, the STI can be computed using speech probe waveforms and the values of the resulting indices are as good predictors of intelligibility scores as those derived from MTFs by theoretical methods.

Asunto(s)

Procesamiento Automatizado de Datos/métodos , Percepción del Habla/fisiología , Humanos , Modelos Teóricos , Ruido

4.

Effects of token variability on our ability to distinguish between vowels.

Uchanski, R M; Braida, L D.

Percept Psychophys ; 60(4): 533-43, 1998 May.

Artículo en Inglés | MEDLINE | ID: mdl-9628988

RESUMEN

Even when the speaker, context, and speaking style are held fixed, the physical properties of naturally spoken utterances of the same speech sound vary considerably. This variability imposes limits on our ability to distinguish between different speech sounds. We present a conceptual framework for relating the ability to distinguish between speech sounds in single-token experiments (in which each speech sound is represented by a single wave form) to resolution in multiple-token experiments. Experimental results indicate that this ability is substantially reduced by an increase in the number of tokens from 1 to 4, but that there is little further reduction when the number of tokens increases to 16. Furthermore, although there is little relation between the ability to distinguish between a given pair of tokens in the multiple- and the 1-token experiments, there is a modest correlation between the ability to distinguish specific vowel tokens in the 4- and 16-token experiments. These results suggest that while listeners use a multiplicity of cues to distinguish between single tokens of a pair of vowel sounds, so that performance is highly variable both across tokens and listeners, they use a smaller set when distinguishing between populations of naturally produced vowel tokens, so that variability is reduced. The effectiveness of the cues used in the latter case is limited more by internal noise than by the variability of the cues themselves.

Asunto(s)

Percepción del Habla/fisiología , Humanos , Modelos Biológicos , Fonética , Psicofísica , Pruebas de Discriminación del Habla

5.

Consistency among speech parameter vectors: application to predicting speech intelligibility.

Power, M H; Braida, L D.

J Acoust Soc Am ; 100(6): 3882-98, 1996 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-8969488

RESUMEN

Previous researchers interested in physical assessment of speech intelligibility have largely based their predictions on preservation of spectral shape. A new approach is presented in which intelligibility is predicted to be preserved only if a transformation modifies relevant speech parameters in a consistent manner. In particular, the parameters from each short-time interval are described by one of a finite number of symbols formed by quantizing the output of an auditory model, and preservation of intelligibility is modeled as the extent to which a one-to-one correspondence exists between the symbols of the input to the transformation, and those of the output. In this paper, a consistency-measurement system is designed and applied to prediction of intelligibility of linearly filtered speech and speech degraded by additive noise. Results were obtained for two parameter sets: one consisting of band-energy values, and the other based on the ensemble interval histogram (EIH) model. Predictions within a class of transformation varied monotonically with the amount of degradation. Across classes of transformation, the predicted effect of additive-noise transformations was more severe than typical perceptual effects. With respect to the goal of achieving predictions that varied monotonically with human speech-perception scores, performance was slightly better with the EIH parameter set.

Asunto(s)

Percepción del Habla , Umbral Auditivo , Humanos , Ruido , Enmascaramiento Perceptual , Fonética

6.

Speaking clearly for the hard of hearing IV: Further studies of the role of speaking rate.

Uchanski, R M; Choi, S S; Braida, L D; Reed, C M; Durlach, N I.

J Speech Hear Res ; 39(3): 494-509, 1996 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-8783129

RESUMEN

The contribution of reduced speaking rate to the intelligibility of "clear" speech (Picheny, Durlach, & Braida, 1985) was evaluated by adjusting the durations of speech segments (a) via nonuniform signal time-scaling, (b) by deleting and inserting pauses, and (c) by eliciting materials from a professional speaker at a wide range of speaking rates. Key words in clearly spoken nonsense sentences were substantially more intelligible than those spoken conversationally (15 points) when presented in quiet for listeners with sensorineural impairments and when presented in a noise background to listeners with normal hearing. Repeated presentation of conversational materials also improved scores (6 points). However, degradations introduced by segment-by-segment time-scaling rendered this time-scaling technique problematic as a means of converting speaking styles. Scores for key words excised from these materials and presented in isolation generally exhibited the same trends as in sentence contexts. Manipulation of pause structure reduced scores both when additional pauses were introduced into conversational sentences and when pauses were deleted from clear sentences. Key-word scores for materials produced by a professional talker were inversely correlated with speaking rate, but conversational rate scores did not approach those of clear speech for other talkers. In all experiments, listeners with normal hearing exposed to flat-spectrum background noise performed similarly to listeners with hearing loss.

Asunto(s)

Pérdida Auditiva Sensorineural , Percepción del Habla , Adulto , Femenino , Humanos , Masculino , Persona de Mediana Edad , Espectrografía del Sonido , Factores de Tiempo

7.

Recognition of amplitude-modulation patterns in the presence of a distractor. I. Effects of correlation and frequency relation.

Takeuchi, A H; Braida, L D.

J Acoust Soc Am ; 98(1): 135-41, 1995 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-7608392

RESUMEN

Listeners' ability to compare the amplitude modulation pattern of 200- and 500-Hz targets when distractors that were also amplitude modulated were presented simultaneously was evaluated. The amplitude modulations of the distractors were either uncorrelated, partially correlated, or fully correlated with the amplitude modulations of the comparisons. Relative to the case of no distractor, performance tended to decrease when a distractor was present, and the degree of interference increased as the modulation correlation between the comparisons and distractors decreased. Although the interference was greater when the comparisons and distractors were separated by 50 Hz, there was also significant interference when the separation was 300 Hz. Whether the comparison was higher or lower in frequency than the distractor had no overall effect. However, the effect of modulation correlation was greater when comparisons were higher than distractors rather than lower. Patterns of interference are compared to those found in studies of modulation detection and discrimination interference, and implications for the use of multiple-band signals that carry the amplitude envelopes from different spectral regions of a speech signal to convey speech are discussed.

Asunto(s)

Percepción Auditiva , Humanos , Psicoacústica , Análisis y Desempeño de Tareas

8.

Recognition of amplitude-modulation patterns in the presence of a distractor. II. Effects of dichotic presentation and unmodulated distractors.

Takeuchi, A H; Braida, L D.

J Acoust Soc Am ; 98(1): 142-7, 1995 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-7608393

RESUMEN

Listeners indicated which of two comparisons had the same pattern of amplitude modulation as a target signal when distractors were presented simultaneously with the comparisons. The frequency separation between distractors and comparisons was either narrow or wide. Distractors were either modulated independently of the comparisons, comodulated with the comparisons, or unmodulated. At wide frequency separations, none of the distractors interfered significantly with modulation comparison, relative to performance with no distractors. At narrow frequency separations, comodulated distractors produced less interference than did independently modulated distractors. Unmodulated distractors also produced some interference. There was no difference between diotic presentation and dichotic presentation, in which distractors were presented to the opposite ear from targets and comparisons. Implications for the presentation of multiple envelope signals derived from different spectral regions of a speech signal to convey speech to hearing-impaired listeners are discussed.

Asunto(s)

Percepción Auditiva , Pruebas de Audición Dicótica , Humanos , Psicoacústica , Análisis y Desempeño de Tareas

9.

Effect of frequency transposition on the discrimination of amplitude envelope patterns.

Takeuchi, A H; Braida, L D.

J Acoust Soc Am ; 97(1): 453-60, 1995 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-7860826

RESUMEN

The effects of systematic training on listeners' ability to compare the amplitude envelopes of signals differing in frequency was tested. Listeners indicated which of two comparison signals had the same amplitude envelope as the target signal. During training, center frequencies of comparison signals were gradually increased or decreased relative to target signal center frequencies of 500, 1600, and 3160 Hz. After training, performance was still worse when target and comparison signals were at different frequencies rather than at the same frequency, except possibly when comparison signals were higher in frequency than the 1600-Hz target. Thus the amplitude envelopes of signals do not appear to be perceived independently of the signal itself. Listeners who received no training performed similarly to the trained listeners, except that their performances declined when comparison signals were higher in frequency than the 1600-Hz target. Training did not reduce interlistener differences in overall performance or in the extent of the decline in performance when frequency differences between the target and comparison signals were introduced. The effects of frequency lowering on amplitude envelope discrimination do not appear to be related to the reduced efficacy of frequency-lowered speech-derived amplitude envelopes in supplementing speechreading.

Asunto(s)

Percepción Auditiva , Adolescente , Adulto , Aprendizaje Discriminativo , Humanos , Ruido , Percepción del Habla , Análisis y Desempeño de Tareas

10.

Intelligibility of conversational and clear speech in noise and reverberation for listeners with normal and impaired hearing.

Payton, K L; Uchanski, R M; Braida, L D.

J Acoust Soc Am ; 95(3): 1581-92, 1994 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-8176061

RESUMEN

The effect of articulating clearly on speech intelligibility is analyzed for ten normal-hearing and two hearing-impaired listeners in noisy, reverberant, and combined environments. Clear speech is more intelligible than conversational speech for each listener in every environment. The difference in intelligibility due to speaking style increases as noise and/or reverberation increase. The average difference in intelligibility is 20 percentage points for the normal-hearing listeners and 26 percentage points for the hearing-impaired listeners. Two predictors of intelligibility are used to quantify the environmental degradations: The articulation index (AI) and the speech transmission index (STI). Both are shown to predict, reliably, performance levels within a speaking style for normal-hearing listeners. The AI is unable to represent the reduction in intelligibility scores due to reverberation for the hearing-impaired listeners. Neither predictor can account for the difference in intelligibility due to speaking style.

Asunto(s)

Pérdida Auditiva Sensorineural/psicología , Ruido , Enmascaramiento Perceptual , Inteligibilidad del Habla , Percepción del Habla , Adulto , Femenino , Pérdida Auditiva Sensorineural/diagnóstico , Humanos , Masculino , Persona de Mediana Edad , Distorsión de la Percepción , Psicoacústica , Valores de Referencia , Acústica del Lenguaje

11.

Auditory supplements to speechreading: combining amplitude envelope cues from different spectral regions of speech.

Grant, K W; Braida, L D; Renn, R J.

J Acoust Soc Am ; 95(2): 1065-73, 1994 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-8132900

RESUMEN

Many listeners with severe-to-profound hearing losses perceive only a narrow range of low-frequency sounds and must rely on speechreading to supplement the impoverished auditory signal in speech recognition. Previous research with normal-hearing subjects [Grant et al., J. Exp. Psychol. 43A, 621-645 (1991)] demonstrated that speechreading was significantly improved when supplemented by amplitude-envelope cues that were extracted from various spectral regions of speech and presented as amplitude modulations of carriers with frequencies at or below the speech band from which the envelope was derived. This experiment assessed the benefit to speechreading provided by pairs of such envelope cues presented simultaneously. In general, greater improvements in speechreading scores were observed for pairs than for single envelopes when the carrier signals were chosen appropriately. However, when pairs of envelope signals were transposed to low frequencies, the benefit to speechreading was no better than the most effective single-band envelope signal tested, or for a low-pass-filtered speech signal with the same overall bandwidth. Suggestions for improving the efficacy of frequency-lowered envelope cues for hearing-impaired listeners are discussed.

Asunto(s)

Señales (Psicología) , Pérdida Auditiva Sensorineural/rehabilitación , Lectura de los Labios , Percepción del Habla , Estimulación Acústica , Trastornos de la Comunicación/rehabilitación , Humanos , Estimulación Luminosa , Acústica del Lenguaje

12.

Automatic speech recognition to aid the hearing impaired: prospects for the automatic generation of cued speech.

Uchanski, R M; Delhorne, L A; Dix, A K; Braida, L D; Reed, C M; Durlach, N I.

J Rehabil Res Dev ; 31(1): 20-41, 1994.

Artículo en Inglés | MEDLINE | ID: mdl-8035358

RESUMEN

Although great strides have been made in the development of automatic speech recognition (ASR) systems, the communication performance achievable with the output of current real-time speech recognition systems would be extremely poor relative to normal speech reception. An alternate application of ASR technology to aid the hearing impaired would derive cues from the acoustical speech signal that could be used to supplement speechreading. We report a study of highly trained receivers of Manual Cued Speech that indicates that nearly perfect reception of everyday connected speech materials can be achieved at near normal speaking rates. To understand the accuracy that might be achieved with automatically generated cues, we measured how well trained spectrogram readers and an automatic speech recognizer could assign cues for various cue systems. We then applied a recently developed model of audiovisual integration to these recognizer measurements and data on human recognition of consonant and vowel segments via speechreading to evaluate the benefit to speechreading provided by such cues. Our analysis suggests that with cues derived from current recognizers, consonant and vowel segments can be received with accuracies in excess of 80%. This level of performance is roughly equivalent to the segment reception accuracy required to account for observed levels of Manual Cued Speech reception. Current recognizers provide maximal benefit by generating only a relatively small number (three to five) of cue groups, and may not provide substantially greater aid to speechreading than simpler aids that do not incorporate discrete phonetic recognition. To provide guidance for the development of improved automatic cueing systems, we describe techniques for determining optimum cue groups for a given recognizer and speechreader, and estimate the cueing performance that might be achieved if the performance of current recognizers were improved.

Asunto(s)

Equipos de Comunicación para Personas con Discapacidad , Pérdida Auditiva/rehabilitación , Habla , Adolescente , Adulto , Señales (Psicología) , Humanos , Modelos Teóricos , Fonética , Percepción del Habla

13.

Intensity perception. XIV. Intensity discrimination in listeners with sensorineural hearing loss.

Florentine, M; Reed, C M; Rabinowitz, W M; Braida, L D; Durlach, N I; Buus, S.

J Acoust Soc Am ; 94(5): 2575-86, 1993 Nov.

Artículo en Inglés | MEDLINE | ID: mdl-8270735

RESUMEN

Intensity discrimination of pulsed tones (also called level discrimination) was measured as a function of level in 13 listeners with sensorineural hearing impairment of primarily cochlear origin, one listener with a vestibular schwannoma, and six listeners with normal hearing. Measurements were also made in normal ears presented with masking noise spectrally shaped to produce audiograms similar to those of the cochlearly impaired listeners. For unilateral impairments, tests were made at the same frequency in the normal and impaired ears. For bilateral-sloping impairments, tests were made at different frequencies in the same ear. The normal listeners showed results similar to other data in the literature. The listener with a vestibular schwannoma showed greatly reduced intensity resolution, except at a few levels. For listeners with recruiting sensorineural impairments, the results are discussed according to the configuration of the impairment and are compared across configurations at equal SPL, equal SL, and equal loudness level. Listeners with increasing hearing losses at frequencies above the test frequency generally showed impaired resolution, especially at high levels, and less deviation from Weber's law than normal listeners. Listeners with decreasing hearing loss at frequencies above the test frequency showed nearly normal intensity-resolution functions. Whereas these trends are generally present, there are also large differences among individuals. Results obtained from normal listeners who were tested in the presence of masking noise indicate that elevated thresholds and reduced dynamic range account for some, but not all, of the effects of recruiting sensorineural impairment on intensity resolution.

Asunto(s)

Cóclea/fisiopatología , Pérdida Auditiva Sensorineural/fisiopatología , Percepción Sonora , Estimulación Acústica , Adulto , Audiometría , Umbral Auditivo , Femenino , Audición , Humanos , Masculino , Persona de Mediana Edad , Neuroma Acústico/patología , Análisis y Desempeño de Tareas , Vestíbulo del Laberinto/patología

14.

Intelligibility of frequency-lowered speech produced by a channel vocoder.

Posen, M P; Reed, C M; Braida, L D.

J Rehabil Res Dev ; 30(1): 26-38, 1993.

Artículo en Inglés | MEDLINE | ID: mdl-8263827

RESUMEN

Frequency lowering is a form of signal processing designed to match speech to the residual auditory capacity of a listener with a high frequency hearing loss. A vocoder-based frequency-lowering system similar to one studied by Lippmann was evaluated in the present study. In this system, speech levels in high frequency bands modulated one-third-octave bands of noise at low frequencies, which were then added to unprocessed speech. Results obtained with this system indicated, in agreement with Lippmann, that processing improved the recognition of stop, fricative, and affricate consonants when the listening bandwidth was restricted to 800 Hz. However, results also showed that processing degraded the perception of nasals and semivowels, consonants not included in Lippmann's study. Based on these results, the frequency-lowering system was modified so as to suppress the processing whenever low frequency components dominated the input signal. High and low frequency energies of an input signal were measured continuously in the modified system, and the decision to process or to leave the signal unaltered was based on their relative levels. Results indicated that the modified system maintained the processing advantage for stops, fricatives, and affricates without degrading the perception of nasals and semi-vowels. The results of the present study also indicated that training is an important consideration when evaluating frequency-lowering systems.

Asunto(s)

Pérdida Auditiva de Alta Frecuencia/rehabilitación , Procesamiento de Señales Asistido por Computador/instrumentación , Inteligibilidad del Habla , Pérdida Auditiva de Alta Frecuencia/fisiopatología , Humanos , Minicomputadores , Pruebas de Discriminación del Habla

15.

Analytic study of the Tadoma method: improving performance through the use of supplementary tactual displays.

Reed, C M; Rabinowitz, W M; Durlach, N I; Delhorne, L A; Braida, L D; Pemberton, J C; Mulcahey, B D; Washington, D L.

J Speech Hear Res ; 35(2): 450-65, 1992 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-1533433

RESUMEN

Although results obtained with the Tadoma method of speechreading have set a new standard for tactual speech communication, they are nevertheless inferior to those obtained in the normal auditory domain. Speech reception through Tadoma is comparable to that of normal-hearing subjects listening to speech under adverse conditions corresponding to a speech-to-noise ratio of roughly 0 dB. The goal of the current study was to demonstrate improvements to speech reception through Tadoma through the use of supplementary tactual information, thus leading to a new standard of performance in the tactual domain. Three supplementary tactual displays were investigated: (a) an articulatory-based display of tongue contact with the hard palate; (b) a multichannel display of the short-term speech spectrum; and (c) tactual reception of Cued Speech. The ability of laboratory-trained subjects to discriminate pairs of speech segments that are highly confused through Tadoma was studied for each of these augmental displays. Generally, discrimination tests were conducted for Tadoma alone, the supplementary display alone, and Tadoma combined with the supplementary tactual display. The results indicated that the tongue-palate contact display was an effective supplement to Tadoma for improving discrimination of consonants, but that neither the tongue-palate contact display nor the short-term spectral display was highly effective in improving vowel discriminability. For both vowel and consonant stimulus pairs, discriminability was nearly perfect for the tactual reception of the manual cues associated with Cued Speech. Further experiments on the identification of speech segments were conducted for Tadoma combined with Cued Speech. The observed data for both discrimination and identification experiments are compared with the predictions of models of integration of information from separate sources.

Asunto(s)

Ceguera/rehabilitación , Equipos de Comunicación para Personas con Discapacidad/normas , Sordera/rehabilitación , Terapia Asistida por Computador/normas , Tacto , Ceguera/complicaciones , Ceguera/fisiopatología , Señales (Psicología) , Sordera/complicaciones , Sordera/fisiopatología , Estudios de Evaluación como Asunto , Expresión Facial , Femenino , Humanos , Masculino , Hueso Paladar/fisiología , Pruebas de Discriminación del Habla , Lengua/fisiología

16.

Single band amplitude envelope cues as an aid to speechreading.

Grant, K W; Braida, L D; Renn, R J.

Q J Exp Psychol A ; 43(3): 621-45, 1991 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-1775660

RESUMEN

Amplitude envelopes derived from speech have been shown to facilitate speech-reading to varying degrees, depending on how the envelope signals were extracted and presented and on the amount of training given to the subjects. In this study, three parameters related to envelope extraction and presentation were examined using both easy and difficult sentence materials: (1) the bandwidth and centre frequency of the filtered speech signal used to obtain the envelope; (2) the bandwidth of the envelope signal determined by the lowpass filter cutoff frequency used to "smooth" the envelope fluctuations; and (3) the carrier signal used to convey the envelope cues. Results for normal hearing subjects following a brief visual and auditory-visual familiarization/training period showed that (1) the envelope derived from wideband speech does not provide the greatest benefit to speechreading when compared to envelopes derived from selected octave bands of speech; (2) as the bandwidth centred around the carrier frequency increased from 12.5 to 1600 Hz, auditory-visual (AV) performance obtained with difficult sentence materials improved, especially for envelopes derived from high-frequency speech energy; (3) envelope bandwidths below 25 Hz resulted in AV scores that were sometimes equal to or worse than speechreading alone; (4) for each filtering condition tested, there was at least one bandwidth and carrier condition that produced AV scores that were significantly greater than speechreading alone; (5) low-frequency carriers were better than high-frequency or wideband carriers for envelopes derived from an octave band of speech centred at 500 Hz; and (6) low-frequency carriers were worse than high-frequency or wideband carriers for envelopes derived from an octave band centred at 3150 Hz. These results suggest that amplitude envelope cues can provide a substantial benefit to speechreading for both easy and difficult sentence materials, but that frequency transposition of these signals to regions remote from their "natural" spectral locations may result in reduced performance.

Asunto(s)

Atención , Lectura de los Labios , Fonética , Percepción de la Altura Tonal , Percepción del Habla , Adulto , Humanos , Distorsión de la Percepción , Enmascaramiento Perceptual , Psicoacústica

17.

Crossmodal integration in the identification of consonant segments.

Braida, L D.

Q J Exp Psychol A ; 43(3): 647-77, 1991 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-1775661

RESUMEN

Although speechreading can be facilitated by auditory or tactile supplements, the process that integrates cues across modalities is not well understood. This paper describes two "optimal processing" models for the types of integration that can be used in speechreading consonant segments and compares their predictions with those of the Fuzzy Logical Model of Perception (FLMP, Massaro, 1987). In "pre-labelling" integration, continuous sensory data is combined across modalities before response labels are assigned. In "post-labelling" integration, the responses that would be made under unimodal conditions are combined, and a joint response is derived from the pair. To describe pre-labelling integration, confusion matrices are characterized by a multidimensional decision model that allows performance to be described by a subject's sensitivity and bias in using continuous-valued cues. The cue space is characterized by the locations of stimulus and response centres. The distance between a pair of stimulus centres determines how well two stimuli can be distinguished in a given experiment. In the multimodal case, the cue space is assumed to be the product space of the cue spaces corresponding to the stimulation modes. Measurements of multimodal accuracy in five modern studies of consonant identification are more consistent with the predictions of the pre-labelling integration model than the FLMP or the post-labelling model.

Asunto(s)

Atención , Lectura de los Labios , Fonética , Percepción del Habla , Sordera/psicología , Sordera/rehabilitación , Humanos , Modelos Logísticos , Psicoacústica

18.

Evaluating the articulation index for auditory-visual input.

Grant, K W; Braida, L D.

J Acoust Soc Am ; 89(6): 2952-60, 1991 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-1918633

RESUMEN

An investigation of the auditory-visual (AV) articulation index (AI) correction procedure outlined in the ANSI standard [ANSI S3.5-1969 (R1986)] was made by evaluating auditory (A), visual (V), and auditory-visual sentence identification for both wideband speech degraded by additive noise and a variety of bandpass-filtered speech conditions presented in quiet and in noise. When the data for each of the different listening conditions were averaged across talkers and subjects, the procedure outlined in the standard was fairly well supported, although deviations from the predicted AV score were noted for individual subjects as well as individual talkers. For filtered speech signals with AIA less than 0.25, there was a tendency for the standard to underpredict AV scores. Conversely, for signals with AIA greater than 0.25, the standard consistently overpredicted AV scores. Additionally, synergistic effects, where the AIA obtained from the combination of different bandpass-filtered conditions was greater than the sum of the individual AIA's, were observed for all nonadjacent filter-band combinations (e.g., the addition of a low-pass band with a 630-Hz cutoff and a high-pass band with a 3150-Hz cutoff). These latter deviations from the standard violate the basic assumption of additivity stated by Articulation Theory, but are consistent with earlier reports by Pollack [I. Pollack, J. Acoust. Soc. Am. 20, 259-266 (1948)], Licklider [J. C. R. Licklider, Psychology: A Study of a Science, Vol. 1, edited by S. Koch (McGraw-Hill, New York, 1959), pp. 41-144], and Kryter [K. D. Kryter, J. Acoust. Soc. Am. 32, 547-556 (1960)].

Asunto(s)

Inteligibilidad del Habla , Percepción del Habla , Percepción Visual , Adulto , Humanos , Ruido , Habla

19.

Development and testing of artificial low-frequency speech codes.

Reed, C M; Power, M H; Durlach, N I; Braida, L D; Foss, K K; Reid, J A; Dubois, S R.

J Rehabil Res Dev ; 28(3): 67-82, 1991.

Artículo en Inglés | MEDLINE | ID: mdl-1880751

RESUMEN

In a new approach to the frequency-lowering of speech, artificial codes were developed for 24 consonants (C) and 15 vowels (V) for two values of lowpass cutoff frequency F (300 and 500 Hz). Each individual phoneme was coded by a unique, nonvarying acoustic signal confined to frequencies less than or equal to F. Stimuli were created through variations in spectral content, amplitude, and duration of tonal complexes or bandpass noise. For example, plosive and fricative sounds were constructed by specifying the duration and relative amplitude of bandpass noise with various center frequencies and bandwidths, while vowels were generated through variations in the spectral shape and duration of a ten-tone harmonic complex. The ability of normal-hearing listeners to identify coded Cs and Vs in fixed-context syllables was compared to their performance on single-token sets of natural speech utterances lowpass filtered to equivalent values of F. For a set of 24 consonants in C-/a/ context, asymptotic performance on coded sounds averaged 90 percent correct for F = 500 Hz and 65 percent for F = 300 Hz, compared to 75 percent and 40 percent for lowpass filtered speech. For a set of 15 vowels in /b/-V-/t/ context, asymptotic performance on coded sounds averaged 85 percent correct for F = 500 Hz and 65 percent for F = 300 Hz, compared to 85 percent and 50 percent for lowpass filtered speech. Identification of coded signals for F = 500 Hz was also examined in CV syllables where C was selected at random from the set of 24 Cs and V was selected at random from the set of 15 Vs. Asymptotic performance of roughly 67 percent correct and 71 percent correct was obtained for C and V identification, respectively. These scores are somewhat lower than those obtained in the fixed-context experiments. Finally, results were obtained concerning the effect of token variability on the identification of lowpass filtered speech. These results indicate a systematic decrease in percent-correct score as the number of tokens representing each phoneme in the identification tests increased from one to nine.

Asunto(s)

Fonética , Procesamiento de Señales Asistido por Computador , Inteligibilidad del Habla , Estudios de Evaluación como Asunto , Femenino , Humanos , Masculino , Ruido , Espectrografía del Sonido

20.

Analytic study of the Tadoma method: effects of hand position on segmental speech perception.

Reed, C M; Durlach, N I; Braida, L D; Schultz, M C.

J Speech Hear Res ; 32(4): 921-9, 1989 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-2601321

RESUMEN

In the Tadoma method of communication, deaf-blind individuals receive speech by placing a hand on the face and neck of the talker and monitoring actions associated with speech production. Previous research has documented the speech perception, speech production, and linguistic abilities of highly experienced users of the Tadoma method. The current study was performed to gain further insight into the cues involved in the perception of speech segments through Tadoma. Small-set segmental identification experiments were conducted in which the subjects' access to various types of articulatory information was systematically varied by imposing limitations on the contact of the hand with the face. Results obtained on 3 deaf-blind, highly experienced users of Tadoma were examined in terms of percent-correct scores, information transfer, and reception of speech features for each of sixteen experimental conditions. The results were generally consistent with expectations based on the speech cues assumed to be available in the various hand positions.

Asunto(s)

Métodos de Comunicación Total , Mano , Postura , Rehabilitación , Percepción del Habla , Tacto , Adulto , Señales (Psicología) , Femenino , Humanos , Masculino , Persona de Mediana Edad , Fonación , Conducta Verbal

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA