Results 1 - 20 of 66
1.
Behav Res Methods ; 56(6): 5588-5604, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38158551

ABSTRACT

Formants (vocal tract resonances) are increasingly analyzed not only by phoneticians in speech but also by behavioral scientists studying diverse phenomena such as acoustic size exaggeration and articulatory abilities of non-human animals. This often involves estimating vocal tract length acoustically and producing scale-invariant representations of formant patterns. We present a theoretical framework and practical tools for carrying out this work, including open-source software solutions included in R packages soundgen and phonTools. Automatic formant measurement with linear predictive coding is error-prone, but formant_app provides an integrated environment for formant annotation and correction with visual and auditory feedback. Once measured, formants can be normalized using a single recording (intrinsic methods) or multiple recordings from the same individual (extrinsic methods). Intrinsic speaker normalization can be as simple as taking formant ratios and calculating the geometric mean as a measure of overall scale. The regression method implemented in the function estimateVTL calculates the apparent vocal tract length assuming a single-tube model, while its residuals provide a scale-invariant vowel space based on how far each formant deviates from equal spacing (the schwa function). Extrinsic speaker normalization provides more accurate estimates of speaker- and vowel-specific scale factors by pooling information across recordings with simple averaging or mixed models, which we illustrate with example datasets and R code. The take-home messages are to record several calls or vowels per individual, measure at least three or four formants, check formant measurements manually, treat uncertain values as missing, and use the statistical tools best suited to each modeling context.
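The single-tube reasoning in the abstract above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of estimateVTL in soundgen: it assumes a uniform tube closed at one end, so that the n-th formant is F_n = (2n - 1) * c / (4 * L), and fits the formant spacing by least squares through the origin. The function names, the speed of sound (35000 cm/s), and the definition of the residuals as relative deviations are all assumptions made for this sketch.

```python
import math

SPEED_OF_SOUND_CM_S = 35000.0  # assumed value; depends on temperature and humidity


def formant_spacing(formants_hz):
    """Least-squares slope through the origin of F_n regressed on (2n - 1) / 2.

    Under the single-tube model the slope equals the formant spacing c / (2L).
    """
    xs = [(2 * n - 1) / 2 for n in range(1, len(formants_hz) + 1)]
    return sum(x * f for x, f in zip(xs, formants_hz)) / sum(x * x for x in xs)


def apparent_vtl_cm(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Apparent vocal tract length (cm) implied by the fitted formant spacing."""
    return c / (2 * formant_spacing(formants_hz))


def schwa_residuals(formants_hz):
    """Relative deviation of each formant from its equally spaced (schwa) position."""
    spacing = formant_spacing(formants_hz)
    return [
        f / ((2 * n - 1) / 2 * spacing) - 1
        for n, f in enumerate(formants_hz, start=1)
    ]


def geometric_mean(values):
    """Geometric mean, usable as a simple intrinsic measure of overall scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

With perfectly equally spaced formants, for example 500, 1500, 2500 and 3500 Hz, the residuals are all zero and the apparent length follows directly from the spacing. Real measurements deviate from equal spacing, and those deviations are what carry the vowel-quality information.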


Subjects
Software , Humans , Phonetics , Speech/physiology , Speech Acoustics , Vocal Cords/physiology , Acoustics
2.
Proc Biol Sci ; 290(2008): 20231029, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37817600

ABSTRACT

Variation in formant frequencies has been shown to affect social interactions and sexual competition in a range of avian species. Yet, the anatomical bases of this variation are poorly understood. Here, we investigated the morphological correlates of formant production in the vocal apparatus of African penguins. We modelled the geometry of the supra-syringeal vocal tract of 20 specimens to generate a population of virtual vocal tracts with varying dimensions. We then estimated the acoustic response of these virtual vocal tracts and extracted the centre frequency of the first four predicted formants. We demonstrate that: (i) variation in length and cross-sectional area of vocal tracts strongly affects the formant pattern, (ii) the tracheal region determines most of this variation, and (iii) the skeletal size of penguins does not correlate with the trachea length and consequently has relatively little effect on formants. We conclude that in African penguins, while the variation in vocal tract geometry generates variation in resonant frequencies supporting the discrimination of conspecifics, such variation does not provide information on the emitter's body size. Overall, our findings advance our understanding of the role of formant frequencies in bird vocal communication.


Subjects
Spheniscidae , Animals , Spheniscidae/physiology , Vocalization, Animal/physiology , Body Size , Acoustics , Communication
3.
Neuroimage ; 252: 119044, 2022 May 15.
Article in English | MEDLINE | ID: mdl-35240298

ABSTRACT

Multisensory integration enables stimulus representation even when the sensory input in a single modality is weak. In the context of speech, when confronted with a degraded acoustic signal, congruent visual inputs promote comprehension. When this input is masked, speech comprehension consequently becomes more difficult. However, it remains unclear which levels of speech processing are affected under which circumstances by occluding the mouth area. To answer this question, we conducted an audiovisual (AV) multi-speaker experiment using naturalistic speech. In half of the trials, the target speaker wore a (surgical) face mask, while we measured the brain activity of normal hearing participants via magnetoencephalography (MEG). We additionally added a distractor speaker in half of the trials in order to create an ecologically difficult listening situation. A decoding model on the clear AV speech was trained and used to reconstruct crucial speech features in each condition. We found significant main effects of face masks on the reconstruction of acoustic features, such as the speech envelope and spectral speech features (i.e. pitch and formant frequencies), while reconstruction of higher level features of speech segmentation (phoneme and word onsets) was especially impaired through masks in difficult listening situations. As we used surgical face masks in our study, which only show mild effects on speech acoustics, we interpret our findings as the result of the missing visual input. Our findings extend previous behavioural results, by demonstrating the complex contextual effects of occluding relevant visual information on speech processing.


Subjects
Speech Perception , Speech , Acoustic Stimulation , Acoustics , Humans , Mouth , Visual Perception
4.
Folia Phoniatr Logop ; 74(5): 335-344, 2022.
Article in English | MEDLINE | ID: mdl-35344948

ABSTRACT

INTRODUCTION: Voice diagnostics including voice range profile (VRP) measurement and acoustic voice analysis is essential in laryngology and phoniatrics. Due to the COVID-19 pandemic, wearing filtering face piece class 2 or 3 (FFP2/3) masks is recommended when high-risk aerosol-generating procedures like singing and speaking are being performed. The goal of this study was to compare VRP parameters when performed without and with FFP2/3 masks. Further, formant analysis for sustained vowels, singer's formant, and analysis of reading standard text samples were performed without/with FFP2/3 masks. METHODS: Twenty subjects (6 males and 14 females) were enrolled in this study with an average age of 36 ± 16 years (mean ± SD). Fourteen patients were rated as euphonic/not hoarse and 6 patients as mildly hoarse. All subjects underwent the VRP measurements, vowel, and text recordings without/with FFP2/3 mask using the software DiVAS by XION medical (Berlin, Germany). Voice range of singing voice, equivalent of voice extension measure (eVEM), fundamental frequency (F0), sound pressure level (SPL) of soft speaking and shouting were calculated and analyzed. Maximum phonation time (MPT) and jitter-% were included for Dysphonia Severity Index (DSI) measurement. Analyses of singer's formant were performed. Spectral analyses of sustained vowels /a:/, /i:/, and /u:/ (first = F1 and second = F2 formants), intensity of long-term average spectrum, and alpha-ratio were calculated using the freeware Praat. RESULTS: For all subjects, the mean values of routine voice parameters without/with mask were analyzed: no significant differences were found in results of singing voice range, eVEM, SPL, and frequency of soft speaking/shouting, except significantly lower mean SPL of shouting with FFP2/3 mask, in particular that of the female subjects (p = 0.002). Results of MPT, jitter, and DSI without/with FFP2/3 mask showed no significant differences. 
Further mean values analyzed without/with mask were ratio singer's formant/loud singing, with lower ratio with FFP2/3 mask (p = 0.001), and F1 and F2 of /a:/, /i:/, /u:/, with no significant differences of the results, with the exception of F2 of /i:/ with lower value with FFP2/3 mask (p = 0.005). With the exceptions mentioned, the t test revealed no significant differences for each of the routine parameters tested in the recordings without and with wearing a FFP2/3 mask. CONCLUSION: It can be concluded that VRP measurements including DSI performed with FFP2/3 masks provide reliable data in clinical routine with respect to voice condition/constitution. Spectral analyses of sustained vowel, text, and singer's formant will be affected by wearing FFP2/3 masks.


Subjects
Acoustics , Masks , Voice , Adult , COVID-19 , COVID-19 Testing , Female , Humans , Male , Middle Aged , Pandemics , Phonation , Speech Acoustics , Young Adult
5.
J Anat ; 236(3): 398-424, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31777085

ABSTRACT

A retractable larynx and adaptations of the vocal folds in the males of several polygynous ruminants serve for the production of rutting calls that acoustically announce larger than actual body size to both rival males and potential female mates. Here, such features of the vocal tract and of the sound source are documented in another species. We investigated the vocal anatomy and laryngeal mobility including its acoustical effects during the rutting vocal display of free-ranging male impala (Aepyceros melampus melampus) in Namibia. Male impala produced bouts of rutting calls (consisting of oral roars and interspersed explosive nasal snorts) in a low-stretch posture while guarding a rutting territory or harem. For the duration of the roars, male impala retracted the larynx from its high resting position to a low mid-neck position involving an extensible pharynx and a resilient connection between the hyoid apparatus and the larynx. Maximal larynx retraction was 108 mm based on estimates in video single frames. This was in good concordance with 91-mm vocal tract elongation calculated on the basis of differences in formant dispersion between roar portions produced with the larynx still ascended and those produced with maximally retracted larynx. Judged by their morphological traits, the larynx-retracting muscles of male impala are homologous to those of other larynx-retracting ruminants. In contrast, the large and massive vocal keels are evolutionary novelties arising by fusion and linear arrangement of the arytenoid cartilage and the canonical vocal fold. These bulky and histologically complex vocal keels produced a low fundamental frequency of 50 Hz. Impala is another ruminant species in which the males are capable of larynx retraction. In addition, male impala vocal folds are spectacularly specialized compared with domestic bovids, allowing the production of impressive, low-frequency roaring vocalizations as a significant part of their rutting behaviour. 
Our study expands knowledge on the evolutionary variation of vocal fold morphology in mammals, suggesting that the structure of the mammalian sound source is not always human-like and should be considered in acoustic analysis and modelling.


Subjects
Antelopes/anatomy & histology , Laryngeal Muscles/anatomy & histology , Larynx/anatomy & histology , Vocalization, Animal/physiology , Acoustics , Animals , Antelopes/physiology , Laryngeal Muscles/physiology , Larynx/physiology , Male , Vocal Cords/anatomy & histology , Vocal Cords/physiology
6.
Horm Behav ; 117: 104616, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31644889

ABSTRACT

Low frequency components (i.e. a low pitch (F0) and low formant spacing (ΔF)) signal high salivary testosterone and height in adult male voices and are associated with high masculinity attributions by unfamiliar listeners (in both men and women). However, the relation between the physiological, acoustic and perceptual dimensions of speakers' masculinity prior to puberty remains unknown. In this study, 110 pre-pubertal children (58 girls), aged 3 to 10, were recorded as they described a cartoon picture. 315 adults (182 women) rated children's perceived masculinity from the voice only after listening to the speakers' audio recordings. On the basis of their voices alone, boys who had higher salivary testosterone levels were rated as more masculine and the relation between testosterone and perceived masculinity was partially mediated by F0. The voices of taller boys were also rated as more masculine, but the relation between height and perceived masculinity was not mediated by the considered acoustic parameters, indicating that acoustic cues other than F0 and ΔF may signal stature. Both boys and girls who had lower F0 were also rated as more masculine, while ΔF did not affect ratings. These findings highlight the interdependence of physiological, acoustic and perceptual dimensions, and suggest that inter-individual variation in male voices, particularly F0, may advertise hormonal masculinity from a very early age.


Subjects
Child Development/physiology , Masculinity , Social Perception , Speech Acoustics , Voice/physiology , Adolescent , Adult , Age Factors , Auditory Perception/physiology , Child , Child, Preschool , Female , Humans , Male , Sex Factors , Sexual Maturation/physiology , Testosterone/blood , Young Adult
7.
Neuroimage ; 178: 574-582, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29860083

ABSTRACT

Speech sounds are encoded by distributed patterns of activity in bilateral superior temporal cortex. However, it is unclear whether speech sounds are topographically represented in cortex, or which acoustic or phonetic dimensions might be spatially mapped. Here, using functional MRI, we investigated the potential spatial representation of vowels, which are largely distinguished from one another by the frequencies of their first and second formants, i.e. peaks in their frequency spectra. This allowed us to generate clear hypotheses about the representation of specific vowels in tonotopic regions of auditory cortex. We scanned participants as they listened to multiple natural tokens of the vowels [ɑ] and [i], which we selected because their first and second formants overlap minimally. Formant-based regions of interest were defined for each vowel based on spectral analysis of the vowel stimuli and independently acquired tonotopic maps for each participant. We found that perception of [ɑ] and [i] yielded differential activation of tonotopic regions corresponding to formants of [ɑ] and [i], such that each vowel was associated with increased signal in tonotopic regions corresponding to its own formants. This pattern was observed in Heschl's gyrus and the superior temporal gyrus, in both hemispheres, and for both the first and second formants. Using linear discriminant analysis of mean signal change in formant-based regions of interest, the identity of untrained vowels was predicted with ∼73% accuracy. Our findings show that cortical encoding of vowels is scaffolded on tonotopy, a fundamental organizing principle of auditory cortex that is not language-specific.


Subjects
Auditory Cortex/physiology , Brain Mapping/methods , Phonetics , Speech Perception/physiology , Acoustic Stimulation , Adult , Female , Humans , Magnetic Resonance Imaging/methods , Male
8.
Eur J Neurosci ; 48(10): 3126-3145, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30240514

ABSTRACT

Neural encoding of the envelope of sounds like vowels is essential to access temporal information useful for speech recognition. Subcortical responses to envelope periodicity of vowels can be assessed using scalp-recorded envelope following responses (EFRs); however, the amplitude of EFRs varies by vowel spectra and the causal relationship is not well understood. One cause for spectral dependency could be interactions between responses with different phases, initiated by multiple stimulus frequencies. Phase differences can arise from earlier initiation of processing high frequencies relative to low frequencies in the cochlea. This study investigated the presence of such phase interactions by measuring EFRs to two naturally spoken vowels (/ε/ and /u/), while delaying the envelope phase of the second formant band (F2+) relative to the first formant (F1) band in 45° increments. At 0° F2+ phase delay, EFRs elicited by the vowel /ε/ were lower in amplitude than the EFRs elicited by /u/. Using vector computations, we found that the lower amplitude of /ε/-EFRs was caused by linear superposition of F1- and F2+-contributions with larger F1-F2+ phase differences (166°) compared to /u/ (19°). While the variation in amplitude across F2+ phase delays could be modeled with two dominant EFR sources for both vowels, the degree of variation was dependent on F1 and F2+ EFR characteristics. Together, we demonstrate that (a) broadband sounds like vowels elicit independent responses from different stimulus frequencies that may be out-of-phase and affect scalp-based measurements, and (b) delaying higher frequency formants can maximize EFR amplitudes for some vowels.
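The linear superposition argument in the abstract above can be illustrated with two phasors. The sketch below is not the authors' analysis: it assumes two response components of equal, arbitrary amplitude (the abstract does not report amplitudes), and the sign convention for the F2+ delay is chosen for illustration only. It shows why a 166° phase difference yields a much smaller summed amplitude than a 19° difference, and how an added delay changes the sum.

```python
import cmath
import math


def summed_amplitude(amp_f1, amp_f2, phase_diff_deg, f2_delay_deg=0.0):
    """Amplitude of the vector sum of two response components.

    phase_diff_deg is the F1-F2+ phase difference at zero delay;
    f2_delay_deg is an extra phase shift applied to the F2+ component
    (sign convention assumed for this sketch).
    """
    f1 = cmath.rect(amp_f1, 0.0)
    f2 = cmath.rect(amp_f2, math.radians(phase_diff_deg + f2_delay_deg))
    return abs(f1 + f2)


if __name__ == "__main__":
    # Hypothetical unit amplitudes; only the phase differences come from the abstract.
    for label, diff in (("/ε/", 166.0), ("/u/", 19.0)):
        sweep = [summed_amplitude(1.0, 1.0, diff, d) for d in range(0, 360, 45)]
        print(label, [round(a, 3) for a in sweep])
```

For equal amplitudes the sum reduces to 2·cos(Δφ/2), so components that are nearly opposite in phase almost cancel, while a suitable delay brings them back into alignment.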


Subjects
Brain Waves/physiology , Electroencephalography/methods , Evoked Potentials, Auditory, Brain Stem/physiology , Psychoacoustics , Speech Perception/physiology , Adolescent , Adult , Female , Humans , Male , Young Adult
9.
Clin Linguist Phon ; 32(7): 622-639, 2018.
Article in English | MEDLINE | ID: mdl-29265931

ABSTRACT

The central aim of this experiment was to compare acoustic parameters, formant frequencies and vowel space area (VSA), in adolescents with hearing impairment (HI) and their normal-hearing (NH) peers; for kinematic parameters, the movements of vocal organs, especially the lips, jaw and tongue, during vowel production were analysed. The participants were 12 adolescents with different degrees of hearing impairment. The control group consisted of 12 age-matched NH adolescents. All participants were native Chinese speakers who were asked to produce the Mandarin vowels /a/, /i/ and /u/, with subsequent acoustic and kinematic analysis. There was a significant difference between the two groups. Additionally, the HI group produced more exaggerated mouth and less tongue movements in all vowels, compared to their NH peers. Results were discussed regarding possible relationship between acoustic data, articulatory movements and degree of hearing loss to provide an integrative assessment of acoustic and kinematic characteristics of individuals with hearing loss.
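Vowel space area for the corner vowels /a/, /i/ and /u/ is commonly computed as the area of the triangle they span in the F1-F2 plane. The sketch below uses the shoelace formula for this; it is an illustration, not the procedure reported in the study, and the formant values are hypothetical placeholders rather than data from the abstract.

```python
def vowel_space_area(corners):
    """Area (Hz^2) of the polygon spanned by (F1, F2) points, via the shoelace formula.

    The points must be listed in order around the polygon.
    """
    twice_area = 0.0
    n = len(corners)
    for i in range(n):
        f1_a, f2_a = corners[i]
        f1_b, f2_b = corners[(i + 1) % n]
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical mean (F1, F2) values in Hz for one speaker.
example_corners = [
    (850.0, 1300.0),  # /a/
    (300.0, 2300.0),  # /i/
    (350.0, 800.0),   # /u/
]
example_vsa = vowel_space_area(example_corners)
```

A smaller area indicates more centralized vowels, which is why VSA is often used as a summary measure when comparing speaker groups.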


Subjects
Asian People , Hearing Loss , Speech Acoustics , Speech Production Measurement/methods , Adolescent , Biomechanical Phenomena , China , Female , Humans , Male
10.
Proc Biol Sci ; 284(1856), 2017 Jun 14.
Article in English | MEDLINE | ID: mdl-28592674

ABSTRACT

Differences in vocal fundamental (F0) and average formant (Fn) frequencies covary with body size in most terrestrial mammals, such that larger organisms tend to produce lower frequency sounds than smaller organisms, both between species and also across different sex and life-stage morphs within species. Here we examined whether three-month-old human infants are sensitive to the relationship between body size and sound frequencies. Using a violation-of-expectation paradigm, we found that infants looked longer at stimuli inconsistent with the relationship (that is, a smaller organism producing lower frequency sounds, and a larger organism producing higher frequency sounds) than at stimuli that were consistent with it. This effect was stronger for fundamental frequency than it was for average formant frequency. These results suggest that by three months of age, human infants are already sensitive to the biologically relevant covariation between vocalization frequencies and visual cues to body size. This ability may be a consequence of developmental adaptations for building a phenotype capable of identifying and representing an organism's size, sex and life-stage.


Subjects
Body Size , Cues , Voice , Female , Humans , Infant , Male , Phenotype , Sound Spectrography
11.
J Exp Biol ; 219(Pt 8): 1224-36, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-27103677

ABSTRACT

With an average male body mass of 320 kg, the wapiti, Cervus canadensis, is the largest extant species of Old World deer (Cervinae). Despite this large body size, male wapiti produce whistle-like sexual calls called bugles characterised by an extremely high fundamental frequency. Investigations of the biometry and physiology of the male wapiti's relatively large larynx have so far failed to account for the production of such a high fundamental frequency. Our examination of spectrograms of male bugles suggested that the complex harmonic structure is best explained by a dual-source model (biphonation), with one source oscillating at a mean of 145 Hz (F0) and the other oscillating independently at an average of 1426 Hz (G0). A combination of anatomical investigations and acoustical modelling indicated that the F0 of male bugles is consistent with the vocal fold dimensions reported in this species, whereas the secondary, much higher source at G0 is more consistent with an aerodynamic whistle produced as air flows rapidly through a narrow supraglottic constriction. We also report a possible interaction between the higher frequency G0 and vocal tract resonances, as G0 transiently locks onto individual formants as the vocal tract is extended. We speculate that male wapiti have evolved such a dual-source phonation to advertise body size at close range (with a relatively low-frequency F0 providing a dense spectrum to highlight size-related information contained in formants) while simultaneously advertising their presence over greater distances using the very high-amplitude G0 whistle component.


Subjects
Deer/physiology , Phonation , Vocalization, Animal/physiology , Animals , Gestures , Male , Muscles/physiology , Organ Specificity , Posture , Sound Spectrography
12.
J Exp Biol ; 219(Pt 12): 1913-21, 2016 Jun 15.
Article in English | MEDLINE | ID: mdl-27059064

ABSTRACT

The information conveyed in acoustic signals is a central topic in mammal vocal communication research. Body size is one form of information that can be encoded in calls. Acoustic allometry aims to identify the specific acoustic correlates of body size within the vocalizations of a given species, and formants are often a useful acoustic cue in this context. We conducted a longitudinal investigation of acoustic allometry in domestic piglets (Sus scrofa domesticus), asking whether formants of grunt vocalizations provide information concerning the caller's body size over time. On four occasions, we recorded grunts from 20 kunekune piglets, measured their vocal tract length by means of radiographs (X-rays) and weighed them. Controlling for effects of age and sex, we found that body weight strongly predicts vocal tract length, which in turn determines formant frequencies. We conclude that grunt formant frequencies could allow domestic pigs to assess a signaler's body size as it grows. Further research using playback experiments is needed to determine the perceptual role of formants in domestic pig communication.


Subjects
Cues , Sus scrofa/physiology , Vocalization, Animal , Animals , Body Size , Body Weight , Female , Male , Radiography/veterinary , Sound Spectrography , Sus scrofa/growth & development , Vocal Cords/anatomy & histology , Vocal Cords/diagnostic imaging
13.
Horm Behav ; 66(4): 569-76, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25169905

ABSTRACT

Men's voices contain acoustic cues to body size and hormonal status, which have been found to affect women's ratings of speaker size, masculinity and attractiveness. However, the extent to which these voice parameters mediate the relationship between speakers' fitness-related features and listeners' judgments of their masculinity has not yet been investigated. We audio-recorded 37 adult heterosexual males performing a range of speech tasks and asked 20 adult heterosexual female listeners to rate speakers' masculinity on the basis of their voices only. We then used a two-level (speaker within listener) path analysis to examine the relationships between the physiological (testosterone, height), acoustic (fundamental frequency or F0, and resonances or ΔF) and perceptual dimensions (listeners' ratings) of speakers' masculinity. Overall, results revealed that male speakers who were taller and had higher salivary testosterone levels also had lower F0 and ΔF, and were in turn rated as more masculine. The relationship between testosterone and perceived masculinity was essentially mediated by F0, while that of height and perceived masculinity was partially mediated by both F0 and ΔF. These observations confirm that women listeners attend to sexually dimorphic voice cues to assess the masculinity of unseen male speakers. In turn, variation in these voice features correlates with speakers' variation in stature and hormonal status, highlighting the interdependence of these physiological, acoustic and perceptual dimensions.


Subjects
Auditory Perception/physiology , Masculinity , Speech Acoustics , Voice/physiology , Adult , Body Height/physiology , Cues , Female , Heterosexuality , Humans , Judgment , Male , Sex Characteristics , Testosterone/metabolism , Young Adult
14.
Front Psychol ; 15: 1412372, 2024.
Article in English | MEDLINE | ID: mdl-39171236

ABSTRACT

Introduction: Previous research has investigated sexual orientation differences in the acoustic properties of individuals' voices, often theorizing that homosexuals of both sexes would have voice properties mirroring those of heterosexuals of the opposite sex. Findings were mixed, but many of these studies have methodological limitations including small sample sizes, use of recited passages instead of natural speech, or grouping bisexual and homosexual participants together for analyses. Methods: To address these shortcomings, the present study examined a wide range of acoustic properties in the natural voices of 142 men and 175 women of varying sexual orientations, with sexual orientation treated as a continuous variable throughout. Results: Homosexual men had less breathy voices (as indicated by a lower harmonics-to-noise ratio) and, contrary to our prediction, a lower voice pitch and narrower pitch range than heterosexual men. Homosexual women had lower F4 formant frequency (vocal tract resonance or so-called overtone) in overall vowel production, and rougher voices (measured via jitter and spectral tilt) than heterosexual women. For those sexual orientation differences that were statistically significant, bisexuals were in-between heterosexuals and homosexuals. No sexual orientation differences were found in formants F1-F3, cepstral peak prominence, shimmer, or speech rate in either sex. Discussion: Recommendations for future "natural voice" investigations are outlined.

15.
Ann N Y Acad Sci ; 1538(1): 107-116, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39091036

ABSTRACT

Formants, or resonance frequencies of the upper vocal tract, are an essential part of acoustic communication. Articulatory gestures, such as jaw, tongue, lip, and soft palate movements, shape formant structure in human vocalizations, but little is known about how nonhuman mammals use those gestures to modify formant frequencies. Here, we report a case study with an adult male harbor seal trained to produce an arbitrary vocalization composed of multiple repetitions of the sound wa. We analyzed jaw movements frame-by-frame and matched them to the tracked formant modulation in the corresponding vocalizations. We found that the jaw opening angle was strongly correlated with the first (F1) and, to a lesser degree, with the second formant (F2). F2 variation was better explained by the jaw angle opening when the seal was lying on his back rather than on the belly, which might derive from soft tissue displacement due to gravity. These results show that harbor seals share some common articulatory traits with humans, where the F1 depends more on the jaw position than F2. We propose additional in vivo investigations of seals to further test the role of the tongue on formant modulation in mammalian sound production.


Subjects
Vocalization, Animal , Animals , Vocalization, Animal/physiology , Male , Tongue/physiology , Jaw/physiology , Jaw/anatomy & histology , Phocoena/physiology , Humans
16.
J Voice ; 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39019670

ABSTRACT

Listeners use speech to identify both linguistic information, such as the word being produced, and indexical attributes, such as the gender of the speaker. Previous research has shown that these two aspects of speech perception are interrelated. It is important to understand this relationship in the context of gender-affirming voice training (GAVT), where changes in speech production as part of a speaker's gender-affirming care could potentially influence listeners' recognition of the intended utterance. This study conducted a secondary analysis of data from an experiment in which trans women matched shifted targets for the second formant frequency using visual-acoustic biofeedback. Utterances were synthetically altered to feature a gender-ambiguous fundamental frequency and were presented to blinded listeners for rating on a visual analog scale representing the gender spectrum, as well as word identification in a forced-choice task. We found a statistically significant association between the accuracy of word identification and the gender rating of utterances. However, there was no statistically significant difference in word identification accuracy for the formant-shifted conditions relative to an unshifted condition. Overall, these results support previous research in finding that word identification and speaker gender identification are interrelated processes; however, the findings also suggest that a small magnitude of shift in formant frequencies (of the type that might be pursued in a GAVT context) does not have a significant negative impact on the perceptual recoverability of isolated words.

17.
Physiol Behav ; 283: 114615, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38880296

ABSTRACT

This study sets out to investigate the potential effect of males' testosterone level on speech production and speech perception. Regarding speech production, we investigate intra- and inter-individual variation in mean fundamental frequency (fo) and formant frequencies and highlight the potential interacting effect of another hormone, i.e. cortisol. In addition, we investigate the influence of different speech materials on the relationship between testosterone and speech production. Regarding speech perception, we investigate the potential effect of individual differences in males' testosterone level on ratings of attractiveness of female voices. In the production study, data is gathered from 30 healthy adult males ranging from 19 to 27 years (mean age: 22.4, SD: 2.2) who recorded their voices and provided saliva samples at 9 am, 12 noon and 3 pm on a single day. Speech material consists of sustained vowels, counting, read speech and a free description of pictures. Biological measures comprise speakers' height, grip strength, and hormone levels (testosterone and cortisol). In the perception study, participants were asked to rate the attractiveness of female voice stimuli (sentence stimulus, same-speaker pairs) that were manipulated in three steps regarding mean fo and formant frequencies. Regarding speech production, our results show that testosterone affected mean fo (but not formants) both within and between speakers. This relationship was weakened in speakers with high cortisol levels and depended on the speech material. Regarding speech perception, we found female stimuli with higher mean fo and formants to be rated as sounding more attractive than stimuli with lower mean fo and formants. Moreover, listeners with low testosterone showed an increased sensitivity to vocal cues of female attractiveness. 
While our results of the production study support earlier findings of a relationship between testosterone and mean fo in males (which is mediated by cortisol), they also highlight the relevance of the speech material: The effect of testosterone was strongest in sustained vowels, potentially due to a strengthened effect of hormones on physiologically strongly influenced tasks such as sustained vowels in contrast to more free speech tasks such as a picture description. The perception study is the first to show an effect of males' testosterone level on female attractiveness ratings using voice stimuli.


Subjects
Cues , Hydrocortisone , Saliva , Speech Perception , Speech , Testosterone , Voice , Humans , Testosterone/metabolism , Testosterone/pharmacology , Male , Adult , Young Adult , Saliva/metabolism , Saliva/chemistry , Hydrocortisone/metabolism , Speech Perception/physiology , Speech Perception/drug effects , Speech/physiology , Speech/drug effects , Voice/drug effects , Female , Beauty , Acoustic Stimulation
18.
Lang Speech ; : 238309231223736, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693788

ABSTRACT

This paper presents L2 vowel remediation in a classroom setting via two real-time visual feedback methods: articulatory ultrasound tongue imaging, which shows tongue shape and position, and a newly developed acoustic formant analyzer, which visualizes a point correlating with the combined effect of tongue position and lip rounding in a vowel quadrilateral. Ten Czech students of the Swedish language participated in the study. Swedish vowel production is difficult for Czech speakers since the languages differ significantly in their vowel systems. The students selected the vowel targets on their own and practiced in two classroom groups, with six students receiving two ultrasound training lessons, followed by one acoustic, and four students receiving two acoustic lessons, followed by one ultrasound. Audio data were collected pre-training, after the two sessions employing the first visual feedback method, and at post-training, which made it possible to measure the Euclidean distance among selected groups of vowels and to observe the direction of change within the vowel quadrilateral as a result of practice. Perception tests were performed before and after training, revealing that most learners already perceived the selected vowels correctly before practice. The study showed that both feedback methods can be successfully applied to L2 classroom learning, and both lead to improvement in the pronunciation of the selected vowels, as well as of the Swedish vowel set as a whole. However, ultrasound tongue imaging seems to have an advantage, as it resulted in a greater number of improved targets.
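
The Euclidean distance mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which points were compared or on which frequency scale, so the comparison against a single reference point, the use of raw Hz, and all formant values below are hypothetical and serve only to show the arithmetic.

```python
import math

def vowel_distance(vowel_a, vowel_b):
    """Euclidean distance between two vowels given as (F1, F2) tuples in the same unit."""
    return math.dist(vowel_a, vowel_b)

# Hypothetical (F1, F2) values in Hz, invented for illustration only.
reference = (300.0, 2000.0)      # assumed reference point for one vowel
pre_training = (400.0, 1700.0)   # learner's production before training
post_training = (330.0, 1960.0)  # learner's production after training

dist_pre = vowel_distance(pre_training, reference)
dist_post = vowel_distance(post_training, reference)

# A smaller distance after training would count as movement toward the reference.
improved = dist_post < dist_pre
```

In practice the formants would probably be converted to a perceptual scale (for example Bark or mel) before distances are computed, since a difference of 100 Hz in F1 is perceptually larger than the same difference in F2.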

19.
Biol Lett ; 9(4): 20130270, 2013 Aug 23.
Article in English | MEDLINE | ID: mdl-23720522

ABSTRACT

Formants are important phonetic elements of human speech that are also used by humans and non-human mammals to assess the body size of potential mates and rivals. As a consequence, it has been suggested that formant perception, which is crucial for speech perception, may have evolved through sexual selection. Somewhat surprisingly, though, no previous studies have examined whether sexes differ in their ability to use formants for size evaluation. Here, we investigated whether men and women differ in their ability to use the formant frequency spacing of synthetic vocal stimuli to make auditory size judgements over a wide range of fundamental frequencies (the main determinant of vocal pitch). Our results reveal that men are significantly better than women at comparing the apparent size of stimuli, and that lower pitch improves the ability of both men and women to perform these acoustic size judgements. These findings constitute the first demonstration of a sex difference in formant perception, and lend support to the idea that acoustic size normalization, a crucial prerequisite for speech perception, may have been sexually selected through male competition. We also provide the first evidence that vocalizations with relatively low pitch improve the perception of size-related formant information.
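
The link between formant frequency spacing and apparent size can be sketched with the standard uniform-tube approximation (a tube closed at one end), in which formants fall at odd multiples of c/(4L) and are therefore spaced c/(2L) apart. This is a textbook simplification and not necessarily the procedure used in the study; the speed of sound of 35000 cm/s and the function names are assumptions made for illustration.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, humid air in the vocal tract

def apparent_vtl_cm(formant_spacing_hz, speed_of_sound=SPEED_OF_SOUND_CM_S):
    """Apparent vocal tract length implied by a given formant spacing (uniform tube)."""
    return speed_of_sound / (2.0 * formant_spacing_hz)

def tube_formants_hz(vtl_cm, n_formants=4, speed_of_sound=SPEED_OF_SOUND_CM_S):
    """Formants of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * vtl_cm)
            for n in range(1, n_formants + 1)]

# A spacing of 1000 Hz corresponds to a 17.5 cm tube under these assumptions.
vtl = apparent_vtl_cm(1000.0)
formants = tube_formants_hz(vtl)
```

Under this model a narrower formant spacing implies a longer vocal tract, which is why listeners can treat spacing as a cue to body size.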


Subjects
Selection, Genetic , Speech Perception , Acoustics , Adolescent , Body Size , England , Female , Humans , Male , Sex Characteristics , Young Adult
20.
J Voice ; 37(6): 971.e9-971.e16, 2023 Nov.
Article in English | MEDLINE | ID: mdl-34256982

ABSTRACT

As part of our contribution to research on the ongoing worldwide COVID-19 pandemic, we studied changes in the cough of infected people using Hidden Markov Model (HMM) speech recognition classification together with formant frequency and pitch analysis. In this paper, an HMM-based cough recognition system was implemented with 5 HMM states, 8 Gaussian Mixture Distributions (GMMs), 13 basic Mel-Frequency Cepstral Coefficients (MFCCs), and an overall feature vector of 39 dimensions. To confirm the results of the cough recognition system, formant frequency and pitch values extracted from the coughs of COVID-19-infected and healthy people were compared. The experimental results show that the recognition rates for infected and non-infected people differ by 6.7%. The difference in formants between the coughs of infected and non-infected people is clearly observed for F1, F3, and F4, and is smaller for F0 and F2.
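
The abstract does not say how the 13 basic MFCCs become a 39-dimensional feature vector. A common convention in HMM-based recognizers is to append first-order (delta) and second-order (delta-delta) time derivatives, which gives 13 × 3 = 39. The sketch below assumes that convention; the regression window width, the edge handling, and the placeholder input are illustrative choices, not details taken from the paper.

```python
def delta(frames, width=2):
    """Regression-based time derivative of per-frame features, replicating edge frames."""
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                num += n * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

def stack_features(mfcc_frames):
    """Concatenate static, delta, and delta-delta coefficients for every frame."""
    d1 = delta(mfcc_frames)
    d2 = delta(d1)
    return [static + first + second
            for static, first, second in zip(mfcc_frames, d1, d2)]

# Placeholder input: 10 frames of 13 synthetic coefficients (not real MFCCs).
mfcc_frames = [[float(t + k) for k in range(13)] for t in range(10)]
features = stack_features(mfcc_frames)  # 10 frames, 39 values each
```

Real MFCCs would come from a signal-processing front end; only the stacking step is shown here.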


Subjects
COVID-19 , Speech Recognition Software , Humans , Cough/diagnosis , Cough/etiology , Pandemics , COVID-19/complications , COVID-19/diagnosis , Speech