Búsqueda | Portal Regional de la BVS

1.

Inharmonic speech reveals the role of harmonicity in the cocktail party problem.

Popham, Sara; Boebinger, Dana; Ellis, Dan P W; Kawahara, Hideki; McDermott, Josh H.

Nat Commun ; 9(1): 2122, 2018 05 29.

Artículo en Inglés | MEDLINE | ID: mdl-29844313

RESUMEN

The "cocktail party problem" requires us to discern individual sound sources from mixtures of sources. The brain must use knowledge of natural sound regularities for this purpose. One much-discussed regularity is the tendency for frequencies to be harmonically related (integer multiples of a fundamental frequency). To test the role of harmonicity in real-world sound segregation, we developed speech analysis/synthesis tools to perturb the carrier frequencies of speech, disrupting harmonic frequency relations while maintaining the spectrotemporal envelope that determines phonemic content. We find that violations of harmonicity cause individual frequencies of speech to segregate from each other, impair the intelligibility of concurrent utterances despite leaving intelligibility of single utterances intact, and cause listeners to lose track of target talkers. However, additional segregation deficits result from replacing harmonic frequencies with noise (simulating whispering), suggesting additional grouping cues enabled by voiced speech excitation. Our results demonstrate acoustic grouping cues in real-world sound segregation.

Asunto(s)

Localización de Sonidos/fisiología , Espectrografía del Sonido/métodos , Acústica del Lenguaje , Percepción del Habla/fisiología , Habla/fisiología , Estimulación Acústica , Señales (Psicología) , Humanos , Ruido

2.

The Effect of Peripheral Compression on Syllable Perception Measured with a Hearing Impairment Simulator.

Matsui, Toshie; Irino, Toshio; Nagae, Misaki; Kawahara, Hideki; Patterson, Roy D.

Adv Exp Med Biol ; 894: 307-314, 2016.

Artículo en Inglés | MEDLINE | ID: mdl-27080671

RESUMEN

Hearing impaired (HI) people often have difficulty understanding speech in multi-speaker or noisy environments. With HI listeners, however, it is often difficult to specify which stage, or stages, of auditory processing are responsible for the deficit. There might also be cognitive problems associated with age. In this paper, a HI simulator, based on the dynamic, compressive gammachirp (dcGC) filterbank, was used to measure the effect of a loss of compression on syllable recognition. The HI simulator can counteract the cochlear compression in normal hearing (NH) listeners and, thereby, isolate the deficit associated with a loss of compression in speech perception. Listeners were required to identify the second syllable in a three-syllable "nonsense word", and between trials, the relative level of the second syllable was varied, or the level of the entire sequence was varied. The difference between the Speech Reception Threshold (SRT) in these two conditions reveals the effect of compression on speech perception. The HI simulator adjusted a NH listener's compression to that of the "average 80-year old" with either normal compression or complete loss of compression. A reference condition was included where the HI simulator applied a simple 30-dB reduction in stimulus level. The results show that the loss of compression has its largest effect on recognition when the second syllable is attenuated relative to the first and third syllables. This is probably because the internal level of the second syllable is attenuated proportionately more when there is a loss of compression.

Asunto(s)

Pérdida Auditiva/fisiopatología , Enmascaramiento Perceptual , Percepción del Habla , Adulto , Femenino , Humanos , Masculino , Prueba del Umbral de Recepción del Habla

3.

Speaker perception.

Schweinberger, Stefan R; Kawahara, Hideki; Simpson, Adrian P; Skuk, Verena G; Zäske, Romi.

Wiley Interdiscip Rev Cogn Sci ; 5(1): 15-25, 2014 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-26304294

RESUMEN

While humans use their voice mainly for communicating information about the world, paralinguistic cues in the voice signal convey rich dynamic information about a speaker's arousal and emotional state, and extralinguistic cues reflect more stable speaker characteristics including identity, biological sex and social gender, socioeconomic or regional background, and age. Here we review the anatomical and physiological bases for individual differences in the human voice, before discussing how recent methodological progress in voice morphing and voice synthesis has promoted research on current theoretical issues, such as how voices are mentally represented in the human brain. Special attention is dedicated to the distinction between the recognition of familiar and unfamiliar speakers, in everyday situations or in the forensic context, and on the processes and representational changes that accompany the learning of new voices. We describe how specific impairments and individual differences in voice perception could relate to specific brain correlates. Finally, we consider that voices are produced by speakers who are often visible during communication, and review recent evidence that shows how speaker perception involves dynamic face-voice integration. The representation of para- and extralinguistic vocal information plays a major role in person perception and social communication, could be neuronally encoded in a prototype-referenced manner, and is subject to flexible adaptive recalibration as a result of specific perceptual experience. WIREs Cogn Sci 2014, 5:15-25. doi: 10.1002/wcs.1261 CONFLICT OF INTEREST: The authors have declared no conflicts of interest for this article. For further resources related to this article, please visit the WIREs website.

4.

Accurate estimation of compression in simultaneous masking enables the simulation of hearing impairment for normal-hearing listeners.

Irino, Toshio; Fukawatase, Tomofumi; Sakaguchi, Makoto; Nisimura, Ryuichi; Kawahara, Hideki; Patterson, Roy D.

Adv Exp Med Biol ; 787: 73-80, 2013.

Artículo en Inglés | MEDLINE | ID: mdl-23716211

RESUMEN

This chapter presents a unified gammachirp framework for -estimating cochlear compression and synthesizing sounds with inverse compression that -cancels the compression of a normal-hearing (NH) listener to simulate the -experience of a hearing-impaired (HI) listener. The compressive gammachirp (cGC) filter was -fitted to notched-noise masking data to derive level-dependent -filter shapes and the cochlear compression function (e.g., Patterson et al., J Acoust Soc Am 114:1529-1542, 2003). The procedure is based on the analysis/synthesis technique of Irino and Patterson (IEEE Trans Audio Speech Lang Process 14:2222-2232, 2006) using a dynamic cGC filterbank (dcGC-FB). The level dependency of the dcGC-FB can be reversed to produce inverse compression and resynthesize sounds in a form that cancels the compression applied by the -auditory system of the NH listener. The chapter shows that the estimation of compression in simultaneous masking is improved if the notched-noise procedure for the derivation of auditory filter shape includes noise bands with different levels. Since both the estimation and resynthesis are performed within the gammachirp framework, it is possible for a specific NH listener to experience the loss of a -specific HI listener.

Asunto(s)

Cóclea/fisiología , Pérdida Auditiva/fisiopatología , Audición/fisiología , Modelos Biológicos , Enmascaramiento Perceptual/fisiología , Estimulación Acústica , Umbral Auditivo/fisiología , Humanos , Ruido , Psicoacústica

5.

[Korean doctor Che-ma Yi and his Sasang Constitutional Medicine].

Kawahara, Hideki.

Zhonghua Yi Shi Za Zhi ; 41(2): 67-9, 2011 Mar.

Artículo en Chino | MEDLINE | ID: mdl-21624265

RESUMEN

Tongui suse powon, written by Chi-ma Yi (1837 - 1900), divided people into four types according to their constitution and disposition-Taiyang, Taiyin, Shaoyin and Shaoyang. He discussed the appearance, viscera, temperament and disease each type of people probably have. The formation of Sasang Consititutional Medicine is influenced by Sasang Theory in Yi Jing and 25-mode Personality Theory in Ling Shu and has nothing to do with Huangji Jingshishu written by SHAO Yong.

6.

Voice aftereffects of adaptation to speaker identity.

Zäske, Romi; Schweinberger, Stefan R; Kawahara, Hideki.

Hear Res ; 268(1-2): 38-45, 2010 Sep 01.

Artículo en Inglés | MEDLINE | ID: mdl-20430084

RESUMEN

While adaptation to complex auditory stimuli has traditionally been reported for linguistic properties of speech, the present study demonstrates non-linguistic high-level aftereffects in the perception of voice identity, following adaptation to voices or faces of personally familiar speakers. In Exp. 1, prolonged exposure to speaker A's voice biased the perception of identity-ambiguous voice morphs between speakers A and B towards speaker B (and vice versa). Significantly biased voice identity perception was also observed in Exp. 2 when adaptors were videos of speakers' silently articulating faces, although effects were reduced in magnitude relative to those seen in Exp. 1. By contrast, adaptation to an unrelated speaker C elicited an intermediate proportion of speaker A identifications in both experiments. While crossmodal aftereffects on auditory identification (Exp. 2) dissipated rapidly, unimodal aftereffects (Exp. 1) were still measurable a few minutes after adaptation. These novel findings suggest contrastive coding of voice identity in long-term memory, with at least two perceptual mechanisms of voice identity adaptation: one related to auditory coding of voice characteristics, and another related to multimodal coding of familiar speaker identity.

Asunto(s)

Reconocimiento en Psicología , Acústica del Lenguaje , Percepción del Habla , Voz , Estimulación Acústica , Adaptación Psicológica , Adulto , Expresión Facial , Femenino , Humanos , Masculino , Memoria , Detección de Señal Psicológica , Factores de Tiempo , Grabación en Video , Adulto Joven

7.

Vocal attractiveness increases by averaging.

Bruckert, Laetitia; Bestelmeyer, Patricia; Latinus, Marianne; Rouger, Julien; Charest, Ian; Rousselet, Guillaume A; Kawahara, Hideki; Belin, Pascal.

Curr Biol ; 20(2): 116-20, 2010 Jan 26.

Artículo en Inglés | MEDLINE | ID: mdl-20129047

RESUMEN

Vocal attractiveness has a profound influence on listeners-a bias known as the "what sounds beautiful is good" vocal attractiveness stereotype [1]-with tangible impact on a voice owner's success at mating, job applications, and/or elections. The prevailing view holds that attractive voices are those that signal desirable attributes in a potential mate [2-4]-e.g., lower pitch in male voices. However, this account does not explain our preferences in more general social contexts in which voices of both genders are evaluated. Here we show that averaging voices via auditory morphing [5] results in more attractive voices, irrespective of the speaker's or listener's gender. Moreover, we show that this phenomenon is largely explained by two independent by-products of averaging: a smoother voice texture (reduced aperiodicities) and a greater similarity in pitch and timbre with the average of all voices (reduced "distance to mean"). These results provide the first evidence for a phenomenon of vocal attractiveness increases by averaging, analogous to a well-established effect of facial averaging [6, 7]. They highlight prototype-based coding [8] as a central feature of voice perception, emphasizing the similarity in the mechanisms of face and voice perception.

Asunto(s)

Habla , Femenino , Humanos , Masculino , Factores Sexuales

8.

In the ear of the beholder: neural correlates of adaptation to voice gender.

Zäske, Romi; Schweinberger, Stefan R; Kaufmann, Jürgen M; Kawahara, Hideki.

Eur J Neurosci ; 30(3): 527-34, 2009 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-19656175

RESUMEN

While high-level adaptation to faces has been extensively investigated, research on behavioural and neural correlates of auditory adaptation to paralinguistic social information in voices has been largely neglected. Here we replicate novel findings that adaptation to voice gender causes systematic contrastive aftereffects such that repeated exposure to female voice adaptors causes a subsequent test voice to be perceived as more male (and vice versa), even minutes after adaptation [S.R. Schweinberger et al., (2008), Current Biology, 18, 684-688). In addition, we recorded event-related potentials to test-voices morphed along a gender continuum. An attenuation in frontocentral N1-P2 amplitudes was seen when a test voice was preceded by gender-congruent voice adaptors. Additionally, similar amplitude attenuations were seen in a late parietal positive component (P3, 300-700 ms). These findings suggest that contrastive coding of voice gender takes place within the first few hundred milliseconds from voice onset, and is implemented by neurons in auditory areas that are specialised for detecting male and female voice quality.

Asunto(s)

Adaptación Fisiológica/fisiología , Percepción Auditiva/fisiología , Caracteres Sexuales , Calidad de la Voz , Potenciales Evocados Auditivos/fisiología , Femenino , Humanos , Masculino , Voz

9.

Use of a magnetic attachment to retain an obturator prosthesis for an osseous defect.

Kawamoto, Shin-ichiro; Hamamura, Syunichi; Kawahara, Hideki; Nishi, Yasuhiro; Nagaoka, Eiichi.

J Prosthodont ; 18(4): 359-62, 2009 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-19486454

RESUMEN

Tooth loss accompanied by a massive defect of the alveolar bone can cause serious problems such as food deposit and esthetic impairment. This report describes procedures for the fabrication of an osseous defect obturator prosthesis connected to a fixed partial denture by a magnetic attachment along with the clinical outcome.

Asunto(s)

Diseño de Dentadura , Dentadura Parcial Fija , Magnetismo/instrumentación , Prótesis Periodontal , Resinas Acrílicas , Proceso Alveolar/patología , Diente Canino/cirugía , Materiales Dentales , Estudios de Seguimiento , Humanos , Masculino , Enfermedades Maxilares/cirugía , Persona de Mediana Edad , Quiste Radicular/cirugía , Propiedades de Superficie , Extracción Dental , Alveolo Dental/patología , Resultado del Tratamiento

10.

Noh voice quality.

Fujimura, Osamu; Honda, Kiyoshi; Kawahara, Hideki; Konparu, Yasuyuki; Morise, Masanori; Williams, J C.

Logoped Phoniatr Vocol ; 34(4): 157-70, 2009 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-19499458

RESUMEN

In Noh, a traditional performing art of Japan, extremely expressive voice quality is used to convey an emotional message. A periodicity of voice appears responsible for these special effects. Acoustic signals were recorded for selected portions of dramatic singing in order to study the acoustic effects of delicate voice control by a master of the Konparu school. Using a signal analysis-synthesis algorithm, TANDEM-STRAIGHT, to represent multiple candidates for pitch perception, signals deviating from the harmonic structure have been successfully displayed, corresponding to auditory impressions of pitch movements, even when narrow-band spectrograms failed to show the perceived events. Strong interaction between vocal tract resonance and vocal fold vibration seems to play a major role in producing these expressive voice qualities.

Asunto(s)

Arte , Fonación , Calidad de la Voz , Algoritmos , Percepción Auditiva , Emociones , Humanos , Japón , Lenguaje , Masculino , Procesamiento de Señales Asistido por Computador , Espectrografía del Sonido , Acústica del Lenguaje , Vibración , Voz

11.

Auditory adaptation in voice perception.

Schweinberger, Stefan R; Casper, Christoph; Hauthal, Nadine; Kaufmann, Jürgen M; Kawahara, Hideki; Kloth, Nadine; Robertson, David M C; Simpson, Adrian P; Zäske, Romi.

Curr Biol ; 18(9): 684-8, 2008 May 06.

Artículo en Inglés | MEDLINE | ID: mdl-18450448

RESUMEN

Perceptual aftereffects following adaptation to simple stimulus attributes (e.g., motion, color) have been studied for hundreds of years. A striking recent discovery was that adaptation also elicits contrastive aftereffects in visual perception of complex stimuli and faces [1-6]. Here, we show for the first time that adaptation to nonlinguistic information in voices elicits systematic auditory aftereffects. Prior adaptation to male voices causes a voice to be perceived as more female (and vice versa), and these auditory aftereffects were measurable even minutes after adaptation. By contrast, crossmodal adaptation effects were absent, both when male or female first names and when silently articulating male or female faces were used as adaptors. When sinusoidal tones (with frequencies matched to male and female voice fundamental frequencies) were used as adaptors, no aftereffects on voice perception were observed. This excludes explanations for the voice aftereffect in terms of both pitch adaptation and postperceptual adaptation to gender concepts and suggests that contrastive voice-coding mechanisms may routinely influence voice perception. The role of adaptation in calibrating properties of high-level voice representations indicates that adaptation is not confined to vision but is a ubiquitous mechanism in the perception of nonlinguistic social information from both faces and voices.

Asunto(s)

Adaptación Fisiológica , Percepción Auditiva/fisiología , Caracteres Sexuales , Voz , Adulto , Femenino , Humanos , Masculino

12.

Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements.

Irino, Toshio; Patterson, Roy D; Kawahara, Hideki.

IEEE Trans Audio Speech Lang Process ; 14(6): 2212-2221, 2006 Nov.

Artículo en Inglés | MEDLINE | ID: mdl-20191101

RESUMEN

We propose a new method to segregate concurrent speech sounds using an auditory version of a channel vocoder. The auditory representation of sound, referred to as an "auditory image," preserves fine temporal information, unlike conventional window-based processing systems. This makes it possible to segregate speech sources with an event synchronous procedure. Fundamental frequency information is used to estimate the sequence of glottal pulse times for a target speaker, and to repress the glottal events of other speakers. The procedure leads to robust extraction of the target speech and effective segregation even when the signal-to-noise ratio is as low as 0 dB. Moreover, the segregation performance remains high when the speech contains jitter, or when the estimate of the fundamental frequency F0 is inaccurate. This contrasts with conventional comb-filter methods where errors in F0 estimation produce a marked reduction in performance. We compared the new method to a comb-filter method using a cross-correlation measure and perceptual recognition experiments. The results suggest that the new method has the potential to supplant comb-filter and harmonic-selection methods for speech enhancement.

13.

The processing and perception of size information in speech sounds.

Smith, David R R; Patterson, Roy D; Turner, Richard; Kawahara, Hideki; Irino, Toshio.

J Acoust Soc Am ; 117(1): 305-18, 2005 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-15704423

RESUMEN

There is information in speech sounds about the length of the vocal tract; specifically, as a child grows, the resonators in the vocal tract grow and the formant frequencies of the vowels decrease. It has been hypothesized that the auditory system applies a scale transform to all sounds to segregate size information from resonator shape information, and thereby enhance both size perception and speech recognition [Irino and Patterson, Speech Commun. 36, 181-203 (2002)]. This paper describes size discrimination experiments and vowel recognition experiments designed to provide evidence for an auditory scaling mechanism. Vowels were scaled to represent people with vocal tracts much longer and shorter than normal, and with pitches much higher and lower than normal. The results of the discrimination experiments show that listeners can make fine judgments about the relative size of speakers, and they can do so for vowels scaled well beyond the normal range. Similarly, the recognition experiments show good performance for vowels in the normal range, and for vowels scaled well beyond the normal range of experience. Together, the experiments support the hypothesis that the auditory system automatically normalizes for the size information in communication sounds.

Asunto(s)

Percepción de la Altura Tonal , Percepción del Habla/fisiología , Humanos , Fonética , Sonido , Pruebas de Discriminación del Habla

14.

YIN, a fundamental frequency estimator for speech and music.

de Cheveigné, Alain; Kawahara, Hideki.

J Acoust Soc Am ; 111(4): 1917-30, 2002 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-12002874

RESUMEN

An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors. The algorithm has several desirable features. Error rates are about three times lower than the best competing methods, as evaluated over a database of speech recorded together with a laryngograph signal. There is no upper limit on the frequency search range, so the algorithm is suited for high-pitched voices and music. The algorithm is relatively simple and may be implemented efficiently and with low latency, and it involves few parameters that must be tuned. It is based on a signal model (periodic signal) that may be extended in several ways to handle various forms of aperiodicity that occur in particular applications. Finally, interesting parallels may be drawn with models of auditory processing.

Asunto(s)

Música , Discriminación de la Altura Tonal , Espectrografía del Sonido/estadística & datos numéricos , Acústica del Lenguaje , Calidad de la Voz , Algoritmos , Humanos , Modelos Estadísticos , Sensibilidad y Especificidad

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA