Results 1 - 12 of 12
1.
J Acoust Soc Am ; 152(3): 1856, 2022 09.
Article in English | MEDLINE | ID: mdl-36182308

ABSTRACT

The measure H1-H2, the difference in amplitude between the first and second harmonics, is frequently used to distinguish phonation types and to characterize differences across voices and genders. While H1-H2 can differentiate voices and is used by listeners to perceive changes in voice quality, its relation to voice articulation is less straightforward. Its calculation also involves practical issues with error propagation. This paper highlights some developments in the use of H1-H2 and proposes a new measure that we call "residual H1." In residual H1, the amplitude of the first harmonic is normalized against the overall sound energy (as measured by root mean square energy) instead of against H2. Residual H1 may mitigate some of the issues with using H1-H2. The current study tests the correlation between residual H1 and electroglottographic contact quotient (CQ) and compares the ability of residual H1 vs H1-H2 to differentiate statistically across phonation types in !Xóõ and utterance-level changes in phonatory quality in Mandarin. The results show that residual H1 has a stronger correlation with CQ and differentiates contrastive and allophonic phonatory quality better than H1-H2, particularly for more constricted phonation types.


Subject(s)
Phonation , Voice Quality , Acoustics , Female , Humans , Male , Phonetics , Speech Acoustics
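
The two quantities contrasted in the abstract above can be illustrated with a minimal sketch. It assumes a known fundamental frequency and a short, steady voiced frame. The function names (`harmonic_level_db`, `h1_minus_h2`, `h1_re_rms`) and the ±20 Hz peak search are hypothetical choices made for this illustration. In particular, `h1_re_rms` is only a plain dB difference between H1 and the frame's RMS level; the abstract does not specify the normalization procedure, so this should not be read as the authors' definition of residual H1.

```python
import numpy as np


def harmonic_level_db(signal, sample_rate, freq_hz, search_hz=20.0):
    """Peak spectral level (dB) within +/- search_hz of freq_hz."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = (freqs >= freq_hz - search_hz) & (freqs <= freq_hz + search_hz)
    return 20.0 * np.log10(spectrum[band].max() + 1e-12)


def h1_minus_h2(signal, sample_rate, f0_hz):
    """Level of the first harmonic minus level of the second (dB)."""
    h1 = harmonic_level_db(signal, sample_rate, f0_hz)
    h2 = harmonic_level_db(signal, sample_rate, 2.0 * f0_hz)
    return h1 - h2


def h1_re_rms(signal, sample_rate, f0_hz):
    """H1 level minus overall RMS level (dB); an illustrative assumption,
    not the published residual H1 procedure."""
    rms_db = 20.0 * np.log10(np.sqrt(np.mean(signal ** 2)) + 1e-12)
    return harmonic_level_db(signal, sample_rate, f0_hz) - rms_db
```

One property visible in the sketch: H1-H2 depends on two harmonic estimates, so an error in either propagates into the difference, whereas the RMS term is computed from the whole frame without locating a second harmonic.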
2.
J Speech Lang Hear Res ; 64(8): 3051-3059, 2021 08 09.
Article in English | MEDLINE | ID: mdl-34260269

ABSTRACT

Purpose: Many children with cerebral palsy (CP) are described as having altered vocal quality. The current study utilizes psychoacoustic measures, namely, low-amplitude (H1*-H2*) and high-amplitude (H1*-A2*) spectral tilt and cepstral peak prominence (CPP), to identify the vocal fold articulation characteristics in this population. Method: Eight children with CP and eight typically developing (TD) peers produced vowel singletons [i, ɑ, u] and a story retell task with the same vowels in the words "beets, Bobby, boots." H1*-H2*, H1*-A2*, and CPP were extracted from each vowel. Results were analyzed with mixed linear models to identify the effect of Group (CP, TD), Task (vowel singleton, story retell), and Vowel [i, ɑ, u] on the dependent variables. Results: Children with CP have lower spectral tilt values (H1*-H2* and H1*-A2*) and lower CPP values than their TD peers. For both groups, vowel singletons were associated with lower CPP values as compared to story retell. Finally, the vowel [ɑ] was associated with higher spectral tilt and higher CPP values as compared to [i, u]. Conclusions: Children with CP have more constricted and creaky vocal quality due to lower spectral tilt and greater noise. Unlike adults, children demonstrate poorer vocal fold articulation when producing vowel singletons as compared to story retell. Finally, low vowels like [ɑ] seem to be produced with less constriction and noise as compared to high vowels.


Subject(s)
Cerebral Palsy , Voice Quality , Adult , Cerebral Palsy/complications , Child , Humans , Phonation , Speech Acoustics , Vocal Cords
3.
J Acoust Soc Am ; 149(1): 457, 2021 01.
Article in English | MEDLINE | ID: mdl-33514179

ABSTRACT

No agreed-upon method currently exists for objective measurement of perceived voice quality. This paper describes validation of a psychoacoustic model designed to fill this gap. This model includes parameters to characterize the harmonic and inharmonic voice sources, vocal tract transfer function, fundamental frequency, and amplitude of the voice, which together serve to completely quantify the integral sound of a target voice sample. In experiment 1, 200 voices with and without diagnosed vocal pathology were fit with the model using analysis-by-synthesis. The resulting synthetic voice samples were not distinguishable from the original voice tokens, suggesting that the model has all the parameters it needs to fully quantify voice quality. In experiment 2 parameters that model the harmonic voice source were removed one by one, and the voice tokens were re-synthesized with the reduced model. In every case the lower-dimensional models provided worse perceptual matches to the quality of the natural tokens than did the original set, indicating that the psychoacoustic model cannot be reduced in dimensionality without loss of fit to the data. Results confirm that this model can be validly applied to quantify voice quality in clinical and research applications.


Subject(s)
Psychoacoustics , Voice Disorders , Voice , Female , Humans , Male , Speech , Speech Acoustics , Voice Quality
4.
Phonetica ; 77(2): 131-160, 2020.
Article in English | MEDLINE | ID: mdl-30739113

ABSTRACT

Phonation types, or contrastive voice qualities, are minimally produced using complex movements of the vocal folds, but may additionally involve constriction in the supraglottal and pharyngeal cavities. These complex articulations in turn produce a multidimensional acoustic output that can be modeled in various ways. In this study, I investigate whether the psychoacoustic model of voice by Kreiman et al. (2014) succeeds at distinguishing six phonation types of !Xóõ. Linear discriminant analysis is performed using parameters from the model averaged over the entire vowel as well as for the first and final halves of the vowel. The results indicate very high classification accuracy for all phonation types. Measures averaged over the vowel's entire duration are closely correlated with the discriminant functions, suggesting that they are sufficient for distinguishing even dynamic phonation types. Measures from all classes of parameters are correlated with the linear discriminant functions; in particular, the "strident" vowels, which are harsh in quality, are characterized by their noise, changes in spectral tilt, decrease in voicing amplitude and frequency, and raising of the first formant. Despite the large number of contrasts and the time-varying characteristics of many of the phonation types, the phonation contrasts in !Xóõ remain well differentiated acoustically.


Subject(s)
Language , Phonation , Speech Acoustics , Humans , Male , Phonetics , Sound Spectrography , Speech/physiology , Voice Quality
5.
Psychon Bull Rev ; 26(5): 1690-1696, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31290010

ABSTRACT

In interactive models of speech production, wordforms that are related to a target form are co-activated during lexical planning, and co-activated wordforms can leave phonetic traces on the target. This mechanism has been proposed to account for phonetic similarities among morphologically related wordforms. We test this hypothesis in a Javanese verb paradigm. In Javanese, one class of verbs is inflected by nasalizing an initial voiceless obstruent: one form of each word begins with a nasal, while its otherwise identical relative begins with a voiceless obstruent. We predict that if morphologically related forms are co-activated during production, the nasal-initial forms of these words should show phonetic traces of their obstruent-initial forms, as compared to nasal-initial wordforms that do not alternate. Twenty-seven native Javanese speakers produced matched pairs of alternating and non-alternating wordforms. Based on an acoustic analysis of nasal resonance and closure duration, we present good evidence against the original hypothesis: We find that the alternating nasals are phonetically identical to the non-alternating ones on both measures. We argue that interactive effects during lexical planning do not offer the best account for morphologically conditioned phonetic similarities. We discuss an alternative involving competition between phonotactic constraints and word-specific phonological structures.


Subject(s)
Phonetics , Psycholinguistics , Speech/physiology , Adult , Humans , Indonesia , Speech Acoustics
6.
J Speech Lang Hear Res ; 59(5): 994-1001, 2016 10 01.
Article in English | MEDLINE | ID: mdl-27626612

ABSTRACT

Purpose: The question of what type of utterance (a sustained vowel or continuous speech) is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Method: Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a 3rd set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Results: Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Conclusions: Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.


Subject(s)
Phonation , Speech , Voice Quality , Adult , Analysis of Variance , Female , Humans , Male , Sound Spectrography , Young Adult
7.
J Acoust Soc Am ; 139(3): 1404-10, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27036277

ABSTRACT

A psychoacoustic model of the voice source spectrum is proposed. The model is characterized by four spectral slope parameters: the difference in amplitude between the first two harmonics (H1-H2), the second and fourth harmonics (H2-H4), the fourth harmonic and the harmonic nearest 2 kHz in frequency (H4-2 kHz), and the harmonic nearest 2 kHz and that nearest 5 kHz (2 kHz-5 kHz). As a step toward model validation, experiments were conducted to establish the acoustic and perceptual independence of these parameters. In experiment 1, the model was fit to a large number of voice sources. Results showed that parameters are predictable from one another, but that these relationships are due to overall spectral roll-off. Two additional experiments addressed the perceptual independence of the source parameters. Listener sensitivity to H1-H2, H2-H4, and H4-2 kHz did not change as a function of the slope of an adjacent component, suggesting that sensitivity to these components is robust. Listener sensitivity to changes in spectral slope from 2 kHz to 5 kHz depended on complex interactions between spectral slope, spectral noise levels, and H4-2 kHz. It is concluded that the four parameters represent non-redundant acoustic and perceptual aspects of voice quality.


Subject(s)
Acoustics , Models, Theoretical , Speech Acoustics , Voice Quality , Adult , Female , Humans , Male , Psychoacoustics , Sound Spectrography , Speech Production Measurement , Young Adult
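
The four spectral slope parameters named in the abstract above can be sketched as level differences between selected harmonics. This is a minimal illustration and not the authors' implementation: it assumes a known fundamental frequency, a sample rate above 10 kHz, and a spectrum that already represents the voice source (real speech would first need inverse filtering or formant correction, which is omitted here). The names `nearest_harmonic` and `source_slopes` and the ±20 Hz peak search are hypothetical.

```python
import numpy as np


def nearest_harmonic(f0_hz, target_hz):
    """Index of the harmonic of f0_hz closest to target_hz (at least 1)."""
    return max(1, int(round(target_hz / f0_hz)))


def source_slopes(signal, sample_rate, f0_hz, search_hz=20.0):
    """Return the four slope parameters (dB) for one steady frame."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    def level_db(k):
        # Peak level near the k-th harmonic of f0_hz.
        centre = k * f0_hz
        band = (freqs >= centre - search_hz) & (freqs <= centre + search_hz)
        return 20.0 * np.log10(spectrum[band].max() + 1e-12)

    k_2k = nearest_harmonic(f0_hz, 2000.0)
    k_5k = nearest_harmonic(f0_hz, 5000.0)
    return {
        "H1-H2": level_db(1) - level_db(2),
        "H2-H4": level_db(2) - level_db(4),
        "H4-2kHz": level_db(4) - level_db(k_2k),
        "2kHz-5kHz": level_db(k_2k) - level_db(k_5k),
    }
```

For a synthetic source whose k-th harmonic has amplitude 1/k, every octave step in the sketch comes out near 6 dB, which is the kind of overall spectral roll-off the abstract mentions as linking the parameters.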
8.
J Acoust Soc Am ; 138(1): 1-10, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26233000

ABSTRACT

Models of the voice source differ in their fits to natural voices, but it is unclear which differences in fit are perceptually salient. This study examined the relationship between the fit of five voice source models to 40 natural voices, and the degree of perceptual match among stimuli synthesized with each of the modeled sources. Listeners completed a visual sort-and-rate task to compare versions of each voice created with the different source models, and the results were analyzed using multidimensional scaling. Neither fits to pulse shapes nor fits to landmark points on the pulses predicted observed differences in quality. Further, the source models fit the opening phase of the glottal pulses better than they fit the closing phase, but at the same time similarity in quality was better predicted by the timing and amplitude of the negative peak of the flow derivative (part of the closing phase) than by the timing and/or amplitude of peak glottal opening. Results indicate that simply knowing how (or how well) a particular source model fits or does not fit a target source pulse in the time domain provides little insight into what aspects of the voice source are important to listeners.


Subject(s)
Auditory Perception/physiology , Voice Quality/physiology , Acoustic Stimulation , Adolescent , Adult , Glottis/physiology , Humans , Middle Aged , Models, Biological , Sound Localization/physiology , Sound Spectrography , Young Adult
9.
J Acoust Soc Am ; 137(2): 822-31, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25698016

ABSTRACT

American English has several linguistic sources of creaky voice. Two common sources are /t/-glottalization (where /t/ is produced as a glottal stop and/or with creaky voice, as in "button") and phrase-final creak. Both /t/-glottalization and phrase-final creak have similar acoustic properties, but they can co-occur in English. The goal of this study is to determine whether /t/-glottalization and phrase-final creak are perceived distinctly. Sixteen English listeners were asked to identify words in a two-alternative forced choice task. The auditory targets were (near-) minimal pairs, in which one word could have /t/-glottalization (e.g., "button") but the other could not (e.g., "bun"). Stimuli were presented with and without phrase-final creak. Listeners made few identification errors overall, even when /t/-glottalization co-occurred with phrase-final creak, suggesting that /t/-glottalization and phrase-final creak remain perceptually distinct to English listeners. This supports the view that creaky voice is not a single category, but one comprised of distinct voice qualities.

10.
Loquens ; 1(1), 2014 Jan.
Article in English | MEDLINE | ID: mdl-27135054

ABSTRACT

At present, two important questions about voice remain unanswered: When voice quality changes, what physiological alteration caused this change, and if a change to the voice production system occurs, what change in perceived quality can be expected? We argue that these questions can only be answered by an integrated model of voice linking production and perception, and we describe steps towards the development of such a model. Preliminary evidence in support of this approach is also presented. We conclude that development of such a model should be a priority for scientists interested in voice, to explain what physical condition(s) might underlie a given voice quality, or what voice quality might result from a specific physical configuration.

11.
J Acoust Soc Am ; 133(2): 1078-89, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23363123

ABSTRACT

This study investigates the importance of source spectrum slopes in the perception of phonation by White Hmong listeners. In White Hmong, nonmodal phonation (breathy or creaky voice) accompanies certain lexical tones, but its importance in tonal contrasts is unclear. In this study, native listeners participated in two perceptual tasks, in which they were asked to identify the word they heard. In the first task, participants heard natural stimuli with manipulated F0 and duration (phonation unchanged). Results indicate that phonation is important in identifying the breathy tone, but not the creaky tone. Thus, breathiness can be viewed as contrastive in White Hmong. Next, to understand which parts of the source spectrum listeners use to perceive contrastive breathy phonation, source spectrum slopes were manipulated in the second task to create stimuli ranging from modal to breathy sounding, with F0 held constant. Results indicate that changes in H1-H2 (difference in amplitude between the first and second harmonics) and H2-H4 (difference in amplitude between the second and fourth harmonics) are independently important for distinguishing breathy from modal phonation, consistent with the view that the percept of breathiness is influenced by a steep drop in harmonic energy in the lower frequencies.


Subject(s)
Phonation , Phonetics , Pitch Perception , Recognition, Psychology , Voice Quality , Acoustic Stimulation , Adult , Audiometry, Speech , Cues , Female , Humans , Logistic Models , Male , Middle Aged , Psychoacoustics , Sound Spectrography , Time Factors
12.
J Acoust Soc Am ; 133(1): 453-62, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23297917

ABSTRACT

At present, it is not well understood how changes in vocal fold biomechanics correspond to changes in voice quality. Understanding such cross-domain links from physiology to acoustics to perception in the "speech chain" is of both theoretical and clinical importance. This study investigates links between changes in body layer stiffness, which is regulated primarily by the thyroarytenoid muscle, and the consequent changes in acoustics and voice quality under left-right symmetric and asymmetric stiffness conditions. Voice samples were generated using three series of two-layer physical vocal fold models, which differed only in body stiffness. Differences in perceived voice quality in each series were then measured in a "sort and rate" listening experiment. The results showed that increasing body stiffness better maintained vocal fold adductory position, thereby exciting more high-order harmonics, differences that listeners readily perceived. Changes to the degree of left-right stiffness mismatch and the resulting left-right vibratory asymmetry did not produce perceptually significant differences in quality unless the stiffness mismatch was large enough to cause a change in vibratory mode. This suggests that a vibration pattern with left-right asymmetry does not necessarily result in a salient deviation in voice quality, and thus may not always be of clinical significance.


Subject(s)
Acoustics , Models, Anatomic , Phonation , Speech Acoustics , Speech Perception , Vocal Cords/physiology , Voice Quality , Biomechanical Phenomena , Elasticity , Female , Humans , Linear Models , Male , Pressure , Signal Processing, Computer-Assisted , Sound Spectrography , Vibration , Vocal Cords/anatomy & histology