Search | BVS (Virtual Health Library) Search Portal

1.

On H1-H2 as an acoustic measure of linguistic phonation type.

Chai, Yuan; Garellek, Marc.

J Acoust Soc Am ; 152(3): 1856, 2022 Sep.

Article in English | MEDLINE | ID: mdl-36182308

ABSTRACT

The measure H1-H2, the difference in amplitude between the first and second harmonics, is frequently used to distinguish phonation types and to characterize differences across voices and genders. While H1-H2 can differentiate voices and is used by listeners to perceive changes in voice quality, its relation to voice articulation is less straightforward. Its calculation also involves practical issues with error propagation. This paper highlights some developments in the use of H1-H2 and proposes a new measure that we call "residual H1." In residual H1, the amplitude of the first harmonic is normalized against the overall sound energy (as measured by root mean square energy) instead of against H2. Residual H1 may mitigate some of the issues with using H1-H2. The current study tests the correlation between residual H1 and electroglottographic contact quotient (CQ) and compares the ability of residual H1 vs H1-H2 to differentiate statistically across phonation types in !Xóõ and utterance-level changes in phonatory quality in Mandarin. The results show that residual H1 has a stronger correlation with CQ and differentiates contrastive and allophonic phonatory quality better than H1-H2, particularly for more constricted phonation types.

Subjects

Phonation, Voice Quality, Acoustics, Female, Humans, Male, Phonetics, Speech Acoustics
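
To make the two measures in the abstract above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' procedure: the signal is synthetic, the harmonic amplitudes are read off with a single-frequency DFT projection, and the energy-normalized H1 is computed as a plain dB difference between H1 and RMS energy. The abstract does not spell out the exact normalization, so `h1_relative_to_energy` is a hypothetical stand-in for residual H1, and all function names are invented for this example.

```python
import math

def harmonic_db(signal, sample_rate, freq):
    """Level (dB) of the sinusoidal component at `freq`, via a single-frequency DFT projection."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate) for i, x in enumerate(signal))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(amplitude)

def rms_db(signal):
    """Root-mean-square energy of the whole signal, in dB."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return 20 * math.log10(rms)

def h1_minus_h2(signal, sample_rate, f0):
    """H1-H2: level of the first harmonic minus level of the second harmonic."""
    return harmonic_db(signal, sample_rate, f0) - harmonic_db(signal, sample_rate, 2 * f0)

def h1_relative_to_energy(signal, sample_rate, f0):
    """ASSUMPTION: H1 normalized against RMS energy as a simple dB difference."""
    return harmonic_db(signal, sample_rate, f0) - rms_db(signal)

# Synthetic two-harmonic "voice": H1 has amplitude 1.0, H2 has amplitude 0.5.
sample_rate = 8000
f0 = 100.0
n_samples = 800  # exactly 10 periods of f0, so the projections are leakage-free
signal = [
    1.0 * math.sin(2 * math.pi * f0 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
    for i in range(n_samples)
]

print(round(h1_minus_h2(signal, sample_rate, f0), 2))
print(round(h1_relative_to_energy(signal, sample_rate, f0), 2))
```

In this toy signal H2 has half the amplitude of H1, so H1-H2 comes out at about 6.02 dB. The point of the second function is only that it never consults H2, so an error in locating or measuring H2 cannot propagate into it. Real speech would also need f0 tracking and, typically, correction for formant effects, none of which is attempted here.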

2.

Validating a psychoacoustic model of voice quality.

Kreiman, Jody; Lee, Yoonjeong; Garellek, Marc; Samlan, Robin; Gerratt, Bruce R.

J Acoust Soc Am ; 149(1): 457, 2021 Jan.

Article in English | MEDLINE | ID: mdl-33514179

ABSTRACT

No agreed-upon method currently exists for objective measurement of perceived voice quality. This paper describes validation of a psychoacoustic model designed to fill this gap. This model includes parameters to characterize the harmonic and inharmonic voice sources, vocal tract transfer function, fundamental frequency, and amplitude of the voice, which together serve to completely quantify the integral sound of a target voice sample. In experiment 1, 200 voices with and without diagnosed vocal pathology were fit with the model using analysis-by-synthesis. The resulting synthetic voice samples were not distinguishable from the original voice tokens, suggesting that the model has all the parameters it needs to fully quantify voice quality. In experiment 2 parameters that model the harmonic voice source were removed one by one, and the voice tokens were re-synthesized with the reduced model. In every case the lower-dimensional models provided worse perceptual matches to the quality of the natural tokens than did the original set, indicating that the psychoacoustic model cannot be reduced in dimensionality without loss of fit to the data. Results confirm that this model can be validly applied to quantify voice quality in clinical and research applications.

Subjects

Psychoacoustics, Voice Disorders, Voice, Female, Humans, Male, Speech, Speech Acoustics, Voice Quality

3.

Acoustic Discriminability of the Complex Phonation System in !Xóõ.

Garellek, Marc.

Phonetica ; 77(2): 131-160, 2020.

Article in English | MEDLINE | ID: mdl-30739113

ABSTRACT

Phonation types, or contrastive voice qualities, are minimally produced using complex movements of the vocal folds, but may additionally involve constriction in the supraglottal and pharyngeal cavities. These complex articulations in turn produce a multidimensional acoustic output that can be modeled in various ways. In this study, I investigate whether the psychoacoustic model of voice by Kreiman et al. (2014) succeeds at distinguishing six phonation types of !Xóõ. Linear discriminant analysis is performed using parameters from the model averaged over the entire vowel as well as for the first and final halves of the vowel. The results indicate very high classification accuracy for all phonation types. Measures averaged over the vowel's entire duration are closely correlated with the discriminant functions, suggesting that they are sufficient for distinguishing even dynamic phonation types. Measures from all classes of parameters are correlated with the linear discriminant functions; in particular, the "strident" vowels, which are harsh in quality, are characterized by their noise, changes in spectral tilt, decrease in voicing amplitude and frequency, and raising of the first formant. Despite the large number of contrasts and the time-varying characteristics of many of the phonation types, the phonation contrasts in !Xóõ remain well differentiated acoustically.

Subjects

Language, Phonation, Speech Acoustics, Humans, Male, Phonetics, Sound Spectrography, Speech/physiology, Voice Quality
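
The linear discriminant analysis mentioned in the abstract above can be illustrated with a deliberately reduced sketch. Everything here is an assumption made for illustration: two classes instead of six phonation types, two made-up acoustic features instead of the full parameter set of the psychoacoustic model, and invented data values. The code implements the two-class Fisher discriminant with a pooled within-class covariance and a midpoint threshold, using only the Python standard library.

```python
def mean_vector(rows):
    """Per-feature mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mean):
    """2x2 within-class scatter matrix (sum of outer products of deviations)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def fit_lda(class_a, class_b):
    """Return (weights, threshold) of a two-class Fisher linear discriminant."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    dof = len(class_a) + len(class_b) - 2
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    # Discriminant direction: inverse pooled covariance times the mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    midpoint = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold

def classify(x, w, threshold, label_a, label_b):
    """Project x onto the discriminant and compare with the midpoint threshold."""
    score = w[0] * x[0] + w[1] * x[1]
    return label_b if score > threshold else label_a

# INVENTED data: (spectral tilt in dB, noise measure in dB) per vowel token.
modal = [(2.0, 22.0), (3.0, 24.0), (1.0, 23.0), (2.5, 25.0)]
breathy = [(8.0, 17.0), (9.0, 19.0), (7.0, 18.0), (10.0, 16.0)]

w, threshold = fit_lda(modal, breathy)
print(classify((2.2, 23.5), w, threshold, "modal", "breathy"))
print(classify((8.5, 17.0), w, threshold, "modal", "breathy"))
```

A multi-class analysis like the one the abstract describes would yield several discriminant functions rather than one, and the correlations between individual measures and those functions are what indicate which parameters carry the contrast. This sketch only shows the core projection step.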

4.

Modeling the voice source in terms of spectral slopes.

Garellek, Marc; Samlan, Robin; Gerratt, Bruce R; Kreiman, Jody.

J Acoust Soc Am ; 139(3): 1404-10, 2016 Mar.

Article in English | MEDLINE | ID: mdl-27036277

ABSTRACT

A psychoacoustic model of the voice source spectrum is proposed. The model is characterized by four spectral slope parameters: the difference in amplitude between the first two harmonics (H1-H2), the second and fourth harmonics (H2-H4), the fourth harmonic and the harmonic nearest 2 kHz in frequency (H4-2 kHz), and the harmonic nearest 2 kHz and that nearest 5 kHz (2 kHz-5 kHz). As a step toward model validation, experiments were conducted to establish the acoustic and perceptual independence of these parameters. In experiment 1, the model was fit to a large number of voice sources. Results showed that parameters are predictable from one another, but that these relationships are due to overall spectral roll-off. Two additional experiments addressed the perceptual independence of the source parameters. Listener sensitivity to H1-H2, H2-H4, and H4-2 kHz did not change as a function of the slope of an adjacent component, suggesting that sensitivity to these components is robust. Listener sensitivity to changes in spectral slope from 2 kHz to 5 kHz depended on complex interactions between spectral slope, spectral noise levels, and H4-2 kHz. It is concluded that the four parameters represent non-redundant acoustic and perceptual aspects of voice quality.

Subjects

Acoustics, Theoretical Models, Speech Acoustics, Voice Quality, Adult, Female, Humans, Male, Psychoacoustics, Sound Spectrography, Speech Production Measurement, Young Adult
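
The four spectral slope parameters defined in the abstract above can be computed from a list of harmonic levels as in the following sketch. It assumes the harmonic frequencies and levels (in dB) have already been measured; the function names and the synthetic spectrum, which rolls off at roughly 6 dB per octave, are invented for illustration and do not come from the paper.

```python
import math

def nearest_harmonic(harmonics, target_hz):
    """Return the (frequency_hz, level_db) pair whose frequency is closest to target_hz."""
    return min(harmonics, key=lambda h: abs(h[0] - target_hz))

def spectral_slopes(harmonics):
    """Compute the four slope parameters from harmonics ordered H1, H2, H3, ...

    harmonics: list of (frequency_hz, level_db) tuples.
    """
    h1 = harmonics[0][1]
    h2 = harmonics[1][1]
    h4 = harmonics[3][1]
    h2k = nearest_harmonic(harmonics, 2000.0)[1]
    h5k = nearest_harmonic(harmonics, 5000.0)[1]
    return {
        "H1-H2": h1 - h2,
        "H2-H4": h2 - h4,
        "H4-2kHz": h4 - h2k,
        "2kHz-5kHz": h2k - h5k,
    }

# SYNTHETIC spectrum: f0 = 100 Hz, level of harmonic k is -20*log10(k) dB.
f0 = 100.0
harmonics = [(k * f0, -20.0 * math.log10(k)) for k in range(1, 61)]

slopes = spectral_slopes(harmonics)
for name, value in slopes.items():
    print(name, round(value, 2))
```

Because each parameter is a difference between neighbouring reference points, the four values together describe a piecewise-linear approximation of the source spectrum up to 5 kHz. In this synthetic case every segment reflects the same overall roll-off, which is the kind of shared dependence the abstract attributes to the acoustic relationships among the parameters.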

5.

Perception of glottalization and phrase-final creak.

Garellek, Marc.

J Acoust Soc Am ; 137(2): 822-31, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25698016

ABSTRACT

American English has several linguistic sources of creaky voice. Two common sources are /t/-glottalization (where /t/ is produced as a glottal stop and/or with creaky voice, as in "button") and phrase-final creak. Both /t/-glottalization and phrase-final creak have similar acoustic properties, but they can co-occur in English. The goal of this study is to determine whether /t/-glottalization and phrase-final creak are perceived distinctly. Sixteen English listeners were asked to identify words in a two-alternative forced choice task. The auditory targets were (near-) minimal pairs, in which one word could have /t/-glottalization (e.g., "button") but the other could not (e.g., "bun"). Stimuli were presented with and without phrase-final creak. Listeners made few identification errors overall, even when /t/-glottalization co-occurred with phrase-final creak, suggesting that /t/-glottalization and phrase-final creak remain perceptually distinct to English listeners. This supports the view that creaky voice is not a single category, but one comprised of distinct voice qualities.

6.

Perceptual evaluation of voice source models.

Kreiman, Jody; Garellek, Marc; Chen, Gang; Alwan, Abeer; Gerratt, Bruce R.

J Acoust Soc Am ; 138(1): 1-10, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26233000

ABSTRACT

Models of the voice source differ in their fits to natural voices, but it is unclear which differences in fit are perceptually salient. This study examined the relationship between the fit of five voice source models to 40 natural voices, and the degree of perceptual match among stimuli synthesized with each of the modeled sources. Listeners completed a visual sort-and-rate task to compare versions of each voice created with the different source models, and the results were analyzed using multidimensional scaling. Neither fits to pulse shapes nor fits to landmark points on the pulses predicted observed differences in quality. Further, the source models fit the opening phase of the glottal pulses better than they fit the closing phase, but at the same time similarity in quality was better predicted by the timing and amplitude of the negative peak of the flow derivative (part of the closing phase) than by the timing and/or amplitude of peak glottal opening. Results indicate that simply knowing how (or how well) a particular source model fits or does not fit a target source pulse in the time domain provides little insight into what aspects of the voice source are important to listeners.

Subjects

Auditory Perception/physiology, Voice Quality/physiology, Acoustic Stimulation, Adolescent, Adult, Glottis/physiology, Humans, Middle Aged, Biological Models, Sound Localization/physiology, Sound Spectrography, Young Adult

7.

Voice quality and tone identification in White Hmong.

Garellek, Marc; Keating, Patricia; Esposito, Christina M; Kreiman, Jody.

J Acoust Soc Am ; 133(2): 1078-89, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23363123

ABSTRACT

This study investigates the importance of source spectrum slopes in the perception of phonation by White Hmong listeners. In White Hmong, nonmodal phonation (breathy or creaky voice) accompanies certain lexical tones, but its importance in tonal contrasts is unclear. In this study, native listeners participated in two perceptual tasks, in which they were asked to identify the word they heard. In the first task, participants heard natural stimuli with manipulated F0 and duration (phonation unchanged). Results indicate that phonation is important in identifying the breathy tone, but not the creaky tone. Thus, breathiness can be viewed as contrastive in White Hmong. Next, to understand which parts of the source spectrum listeners use to perceive contrastive breathy phonation, source spectrum slopes were manipulated in the second task to create stimuli ranging from modal to breathy sounding, with F0 held constant. Results indicate that changes in H1-H2 (difference in amplitude between the first and second harmonics) and H2-H4 (difference in amplitude between the second and fourth harmonics) are independently important for distinguishing breathy from modal phonation, consistent with the view that the percept of breathiness is influenced by a steep drop in harmonic energy in the lower frequencies.

Subjects

Phonation, Phonetics, Pitch Perception, Recognition (Psychology), Voice Quality, Acoustic Stimulation, Adult, Speech Audiometry, Cues, Female, Humans, Logistic Models, Male, Middle Aged, Psychoacoustics, Sound Spectrography, Time Factors

8.

Acoustic and perceptual effects of changes in body layer stiffness in symmetric and asymmetric vocal fold models.

Zhang, Zhaoyan; Kreiman, Jody; Gerratt, Bruce R; Garellek, Marc.

J Acoust Soc Am ; 133(1): 453-62, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23297917

ABSTRACT

At present, it is not well understood how changes in vocal fold biomechanics correspond to changes in voice quality. Understanding such cross-domain links from physiology to acoustics to perception in the "speech chain" is of both theoretical and clinical importance. This study investigates links between changes in body layer stiffness, which is regulated primarily by the thyroarytenoid muscle, and the consequent changes in acoustics and voice quality under left-right symmetric and asymmetric stiffness conditions. Voice samples were generated using three series of two-layer physical vocal fold models, which differed only in body stiffness. Differences in perceived voice quality in each series were then measured in a "sort and rate" listening experiment. The results showed that increasing body stiffness better maintained vocal fold adductory position, thereby exciting more high-order harmonics, differences that listeners readily perceived. Changes to the degree of left-right stiffness mismatch and the resulting left-right vibratory asymmetry did not produce perceptually significant differences in quality unless the stiffness mismatch was large enough to cause a change in vibratory mode. This suggests that a vibration pattern with left-right asymmetry does not necessarily result in a salient deviation in voice quality, and thus may not always be of clinical significance.

Subjects

Acoustics, Anatomic Models, Phonation, Speech Acoustics, Speech Perception, Vocal Cords/physiology, Voice Quality, Biomechanical Phenomena, Elasticity, Female, Humans, Linear Models, Male, Pressure, Computer-Assisted Signal Processing, Sound Spectrography, Vibration, Vocal Cords/anatomy & histology

9.

Voice Quality of Children With Cerebral Palsy.

Nip, Ignatius S B; Garellek, Marc.

J Speech Lang Hear Res ; 64(8): 3051-3059, 2021 Aug 09.

Article in English | MEDLINE | ID: mdl-34260269

ABSTRACT

Purpose: Many children with cerebral palsy (CP) are described as having altered vocal quality. The current study utilizes psychoacoustic measures, namely, low-amplitude (H1*-H2*) and high-amplitude (H1*-A2*) spectral tilt and cepstral peak prominence (CPP), to identify the vocal fold articulation characteristics in this population. Method: Eight children with CP and eight typically developing (TD) peers produced vowel singletons [i, ɑ, u] and a story retell task with the same vowels in the words "beets, Bobby, boots." H1*-H2*, H1*-A2*, and CPP were extracted from each vowel. Results were analyzed with mixed linear models to identify the effect of Group (CP, TD), Task (vowel singleton, story retell), and Vowel [i, ɑ, u] on the dependent variables. Results: Children with CP have lower spectral tilt values (H1*-H2* and H1*-A2*) and lower CPP values than their TD peers. For both groups, vowel singletons were associated with lower CPP values as compared to story retell. Finally, the vowel [ɑ] was associated with higher spectral tilt and higher CPP values as compared to [i, u]. Conclusions: Children with CP have more constricted and creaky vocal quality due to lower spectral tilt and greater noise. Unlike adults, children demonstrate poorer vocal fold articulation when producing vowel singletons as compared to story retell. Finally, low vowels like [ɑ] seem to be produced with less constriction and noise as compared to high vowels.

Subjects

Cerebral Palsy, Voice Quality, Adult, Cerebral Palsy/complications, Child, Humans, Phonation, Speech Acoustics, Vocal Cords

10.

Evidence against interactive effects on articulation in Javanese verb paradigms.

Seyfarth, Scott; Vander Klok, Jozina; Garellek, Marc.

Psychon Bull Rev ; 26(5): 1690-1696, 2019 Oct.

Article in English | MEDLINE | ID: mdl-31290010

ABSTRACT

In interactive models of speech production, wordforms that are related to a target form are co-activated during lexical planning, and co-activated wordforms can leave phonetic traces on the target. This mechanism has been proposed to account for phonetic similarities among morphologically related wordforms. We test this hypothesis in a Javanese verb paradigm. In Javanese, one class of verbs is inflected by nasalizing an initial voiceless obstruent: one form of each word begins with a nasal, while its otherwise identical relative begins with a voiceless obstruent. We predict that if morphologically related forms are co-activated during production, the nasal-initial forms of these words should show phonetic traces of their obstruent-initial forms, as compared to nasal-initial wordforms that do not alternate. Twenty-seven native Javanese speakers produced matched pairs of alternating and non-alternating wordforms. Based on an acoustic analysis of nasal resonance and closure duration, we present good evidence against the original hypothesis: We find that the alternating nasals are phonetically identical to the non-alternating ones on both measures. We argue that interactive effects during lexical planning do not offer the best account for morphologically conditioned phonetic similarities. We discuss an alternative involving competition between phonotactic constraints and word-specific phonological structures.

Subjects

Phonetics, Psycholinguistics, Speech/physiology, Adult, Humans, Indonesia, Speech Acoustics

11.

Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech.

Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc.

J Speech Lang Hear Res ; 59(5): 994-1001, 2016 Oct 01.

Article in English | MEDLINE | ID: mdl-27626612

ABSTRACT

Purpose: The question of what type of utterance (a sustained vowel or continuous speech) is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Method: Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a 3rd set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Results: Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Conclusions: Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.

Subjects

Phonation, Speech, Voice Quality, Adult, Analysis of Variance, Female, Humans, Male, Sound Spectrography, Young Adult

12.

Toward a unified theory of voice production and perception.

Kreiman, Jody; Gerratt, Bruce R; Garellek, Marc; Samlan, Robin; Zhang, Zhaoyan.

Loquens ; 1(1), 2014 Jan.

Article in English | MEDLINE | ID: mdl-27135054

ABSTRACT

At present, two important questions about voice remain unanswered: When voice quality changes, what physiological alteration caused this change, and if a change to the voice production system occurs, what change in perceived quality can be expected? We argue that these questions can only be answered by an integrated model of voice linking production and perception, and we describe steps towards the development of such a model. Preliminary evidence in support of this approach is also presented. We conclude that development of such a model should be a priority for scientists interested in voice, to explain what physical condition(s) might underlie a given voice quality, or what voice quality might result from a specific physical configuration.
