ABSTRACT
Fundamental frequency (fo) is the most perceptually salient vocal acoustic parameter, yet little is known about how its perceptual influence varies across societies. We examined how fo affects key social perceptions and how socioecological variables modulate these effects in 2,647 adult listeners sampled from 44 locations across 22 nations. Low male fo increased men's perceptions of formidability and prestige, especially in societies with higher homicide rates and greater relational mobility in which male intrasexual competition may be more intense and rapid identification of high-status competitors may be exigent. High female fo increased women's perceptions of flirtatiousness where relational mobility was lower and threats to mating relationships may be greater. These results indicate that the influence of fo on social perceptions depends on socioecological variables, including those related to competition for status and mates.
Subjects
Voice, Adult, Humans, Male, Female, Homicide, Social Perception, Sexual Partners
ABSTRACT
INTRODUCTION: Mutational voice disorder is the inability of the voice to adjust to the changes in the larynx during puberty, resulting in the speaking fundamental frequency failing to decrease. Standard treatments for mutational voice disorder are voice therapy and thyroplasty. However, voice therapy takes time to show its effects, and thyroplasty is highly invasive. Herein, we present a case of mutational voice disorder successfully treated with intracordal trafermin injection. CASE SUMMARY: A 31-year-old male patient was diagnosed with mutational voice disorder and offered standard treatment, but he requested a less invasive treatment with early effects. We performed intracordal trafermin injection with his consent. Two months after the procedure, the speaking fundamental frequency decreased from 155.5 Hz to 93.0 Hz, and the voice handicap index decreased from 14 to 2. DISCUSSION: This case suggests that intracordal trafermin injection is an effective treatment option for mutational voice disorder. Furthermore, compared with the standard treatment methods, it is less invasive and provides effects shortly with only one injection.
Subjects
Fibroblast Growth Factors, Peptide Fragments, Voice Disorders, Voice, Male, Humans, Adult, Voice Disorders/drug therapy, Voice Disorders/surgery, Treatment Outcome, Injections
ABSTRACT
PURPOSE: Since new evidence regarding the impact of Wendler glottoplasty (WG) on the voice in transgender women became available in the literature in recent years, we aimed to perform an updated systematic review and meta-analysis to determine the actual safety and efficacy of WG in the process of vocal feminization. METHODS: PubMed, Embase, and Cochrane were searched for English-language articles published until July 4, 2023. Studies were found eligible if they evaluated the impact of WG on the acoustic-aerodynamic measures and quality of voice in transgender women. RESULTS: Twenty-three studies were identified. After exclusion of three studies due to incomplete data, 20 studies including 656 patients were included in the meta-analysis. After WG, there was a significant increase of fundamental frequency, speaking fundamental frequency, and lower limit of the frequency range (p < 0.001). Concurrently, a significant reduction of frequency range and maximum phonation time was observed (p < 0.001). No significant differences were found between the pre- and postoperative values regarding the Grade, Roughness, Breathiness, Asthenia, and Strain scale score (p = 0.339). The overall score in the Trans Woman Voice Questionnaire (TWVQ) significantly improved after WG (p < 0.001). CONCLUSIONS: WG is an effective voice feminization method in transgender women, associated with a high procedural success and low risk of postoperative complications. Significantly improved TWVQ score after surgery suggests its positive impact on the voice-related quality of life. Postoperative decrease of maximum phonation time and frequency range does not seem to significantly impact the effectiveness of voice production.
Subjects
Glottis, Transgender Persons, Voice Quality, Female, Humans, Male, Glottis/surgery, Laryngoplasty/methods, Speech Acoustics
ABSTRACT
The use of voice recordings in both research and industry practice has increased dramatically in recent years, from diagnosing a COVID-19 infection based on patients' self-recorded voice samples to predicting customer emotions during a service center call. Crowdsourced audio data collection in participants' natural environment using their own recording device has opened up new avenues for researchers and practitioners to conduct research at scale across a broad range of disciplines. The current research examines whether fundamental properties of the human voice are reliably and validly captured through common consumer-grade audio-recording devices in current medical, behavioral science, business, and computer science research. Specifically, this work provides evidence from a tightly controlled laboratory experiment analyzing 1800 voice samples and subsequent simulations that recording devices with high proximity to a speaker (such as a headset or a lavalier microphone) lead to inflated measures of amplitude compared to a benchmark studio-quality microphone, while recording devices with lower proximity to a speaker (such as a laptop or a smartphone in front of the speaker) systematically reduce measures of amplitude and can lead to biased measures of the speaker's true fundamental frequency. We further demonstrate through simulation studies that these differences can lead to biased and ultimately invalid conclusions in, for example, an emotion detection task. Finally, we outline a set of recording guidelines to ensure reliable and valid voice recordings and offer initial evidence for a machine-learning approach to bias correction in the case of distorted speech signals.
Subjects
Voice Quality, Voice, Humans, Sound Spectrography, Smartphone, Microcomputers
ABSTRACT
Binocular rivalry is an example of bistable visual perception extensively examined in neuroimaging. Magnetoencephalography can track brain responses to phasic visual stimulations of predetermined frequency and phase to advance our understanding of perceptual dominance and suppression in binocular rivalry. We used left and right eye stimuli that flickered at two tagging frequencies to track their respective oscillatory cortical evoked responses. We computed time-resolved measures of coherence to track brain responses phase locked with stimulus frequencies and with respect to the participants' indications of alternations of visual rivalry they experienced. We compared the brain maps obtained to those from a non-rivalrous control replay condition that used physically changing stimuli to mimic rivalry. We found stronger coherence within a posterior cortical network of visual areas during rivalry dominance compared with rivalry suppression and replay control. This network extended beyond the primary visual cortex to several retinotopic visual areas. Moreover, network coherence with dominant percepts in primary visual cortex peaked at least 50 ms prior to the suppressed percept nadir, consistent with the escape theory of alternations. Individual alternation rates were correlated with the rate of change in dominant evoked peaks, but not for the slope of response to suppressed percepts. Effective connectivity measures revealed that dominant (respectively, suppressed) percepts were expressed in dorsal (respectively ventral) streams. We thus demonstrate that binocular rivalry dominance and suppression engage distinct mechanisms and brain networks. These findings advance neural models of rivalry and may relate to more general aspects of selection and suppression in natural vision.
Subjects
Magnetoencephalography, Binocular Vision, Humans, Binocular Vision/physiology, Visual Perception/physiology, Brain, Brain Mapping, Photic Stimulation, Vision Disparity
ABSTRACT
Women with the FMR1 premutation are susceptible to motor involvement related to atypical cerebellar function, including risk for developing fragile X tremor ataxia syndrome. Vocal quality analyses are sensitive to subtle differences in motor skills but have not yet been applied to the FMR1 premutation. This study examined whether women with the FMR1 premutation demonstrate differences in vocal quality, and whether such differences relate to FMR1 genetic, executive, motor, or health features of the FMR1 premutation. Participants included 35 women with the FMR1 premutation and 45 age-matched women without the FMR1 premutation who served as a comparison group. Three sustained /a/ vowels were analyzed for pitch (mean F0), variability of pitch (standard deviation of F0), and overall vocal quality (jitter, shimmer, and harmonics-to-noise ratio). Executive, motor, and health indices were obtained from direct and self-report measures and genetic samples were analyzed for FMR1 CGG repeat length and activation ratio. Women with the FMR1 premutation had a lower pitch, larger pitch variability, and poorer vocal quality than the comparison group. Working memory was related to harmonics-to-noise ratio and shimmer in women with the FMR1 premutation. Vocal quality abnormalities differentiated women with the FMR1 premutation from the comparison group and were evident even in the absence of other clinically evident motor deficits. This study supports vocal quality analyses as a tool that may prove useful in the detection of early signs of motor involvement in this population.
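The perturbation measures named in this abstract can be sketched in a few lines. The abstract does not say which jitter and shimmer definitions or which software were used, so the block below assumes the common "local" definitions (mean absolute difference between consecutive cycles, divided by the mean, in percent) and assumes that per-cycle periods and peak amplitudes have already been extracted from the sustained vowel. All function names and sample values are hypothetical.

```python
import statistics


def f0_summary(periods_s):
    """Mean and standard deviation of F0 (Hz), taking F0 as 1/period per cycle.

    Assumption: per-cycle F0; analysis software may average over frames instead.
    """
    f0_values = [1.0 / t for t in periods_s]
    return statistics.mean(f0_values), statistics.stdev(f0_values)


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values, relative to the mean.

    Applied to cycle periods this gives local jitter (%);
    applied to cycle peak amplitudes it gives local shimmer (%).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * statistics.mean(diffs) / statistics.mean(values)


# Hypothetical cycle data for one sustained /a/ (not taken from the study).
example_periods_s = [0.0050, 0.0051, 0.0049, 0.0050]
example_amplitudes = [1.0, 0.9, 1.1, 1.0]

mean_f0_hz, sd_f0_hz = f0_summary(example_periods_s)
jitter_percent = local_perturbation_percent(example_periods_s)
shimmer_percent = local_perturbation_percent(example_amplitudes)
```

Harmonics-to-noise ratio is left out because it needs the waveform itself rather than per-cycle summaries.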
Subjects
Fragile X Mental Retardation Protein, Fragile X Syndrome, Humans, Female, Fragile X Mental Retardation Protein/genetics, Fragile X Syndrome/genetics, Tremor/genetics, Ataxia/genetics, Short-Term Memory/physiology
ABSTRACT
PURPOSE: Subglottic pressure (Ps) and fundamental frequency (F0) play important roles in governing vocal fold (VF) dynamics. Theoretical description, model simulation, excised larynx and animal models have been used in previous studies, yet clinically applicable measurements are still lacking. This study aimed to evaluate the effects of surgery for benign laryngeal lesions by investigating the relationship between F0 and Ps. METHODS: Patients with benign laryngeal lesions who underwent phonosurgery were prospectively recruited. Participants were instructed to sustain voicing the vowel /o/ at three incremental frequencies four semitones apart in the modal register (F01, F02, F03). F0 was estimated by VF vibration on the accelerometer. Ps change was achieved and measured using the airflow interruption method. RESULTS: Thirteen patients with a mean age (SD) of 43.5 (12.4) years were included. The change in F0 per unit change of Ps, which is the slope (Hz/kPa) of the regression line of the frequency-pressure data pairs, decreased as the tension of the VF increased. The slopes significantly increased after the operation for F01 and F02 (36.43 ± 14.68 preoperatively, 53.91 ± 30.71 postoperatively, p = 0.011 and 26.02 ± 10.71; 34.85 ± 17.92, p = 0.046, respectively). In addition, there was a significant decrease in phonation threshold pressure and improvements in the grade, roughness, breathiness, asthenia, strain scale, and the voice handicap inventory-10. CONCLUSIONS: The relationship between F0 and Ps may serve as an objective assessment of the outcomes in the treatment of benign laryngeal diseases with clinical relevance.
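The two quantities this abstract relies on, pitch targets four semitones apart and the slope (Hz/kPa) of the regression line through frequency-pressure pairs, can be illustrated as follows. This is a minimal sketch assuming ordinary least squares and equal-tempered semitones. The abstract does not specify the fitting procedure, and the data values and function names are hypothetical.

```python
def semitone_targets(base_hz, step_semitones=4, count=3):
    """Frequencies spaced `step_semitones` apart, starting at `base_hz`."""
    return [base_hz * 2.0 ** (step_semitones * i / 12.0) for i in range(count)]


def regression_slope(pressure_kpa, f0_hz):
    """Ordinary least-squares slope of F0 (Hz) against pressure (kPa)."""
    n = len(pressure_kpa)
    if n != len(f0_hz) or n < 2:
        raise ValueError("need at least two matched data pairs")
    mean_p = sum(pressure_kpa) / n
    mean_f = sum(f0_hz) / n
    sxx = sum((p - mean_p) ** 2 for p in pressure_kpa)
    if sxx == 0:
        raise ValueError("pressure values must not all be equal")
    sxy = sum((p - mean_p) * (f - mean_f) for p, f in zip(pressure_kpa, f0_hz))
    return sxy / sxx


# Hypothetical frequency-pressure pairs for one pitch level (illustrative only).
example_pressure_kpa = [0.6, 0.8, 1.0, 1.2]
example_f0_hz = [201.6, 208.8, 216.0, 223.2]

slope_hz_per_kpa = regression_slope(example_pressure_kpa, example_f0_hz)
targets_hz = semitone_targets(200.0)  # stand-ins for F01, F02, F03
```

One slope would be fitted per pitch level and per session, so that preoperative and postoperative slopes can be compared.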
Subjects
Laryngeal Diseases, Larynx, Animals, Vocal Cords/surgery, Larynx/surgery, Phonation, Laryngeal Diseases/surgery, Computer Simulation
ABSTRACT
BACKGROUND: The expressiveness during reading is essential for a fluent reading. Reading prosody has been scarcely studied in an experimental manner, owing to the difficulties in taking objective and direct measures of this reading skill. However, new technologies development has made it possible to analyse reading prosody in an experimental way. Prosodic patterns may vary, not being the same at the beginning of the reading learning process as in adulthood. They may also be altered in disorders such as dyslexia, but little is known about the prosodic characteristics and reading fluency of people with neurodegenerative diseases that cause language impairment, such as Parkinson's disease (PD). AIMS: The aim of this work was to study reading fluency in PD considering the prosodic characteristics of its reading. METHODS & PROCEDURES: The participants were 31 Spanish adults with PD and 31 healthy controls, aged 59-88 years. Two experimental texts were designed that included declarative, interrogative, and exclamatory sentences and experimental verbs and nouns. The manipulability level of the nouns and the motor content of the verbs were considered. The reading of the participants was recorded and analysed with Praat software. OUTCOMES & RESULTS: A longer reading duration and a greater number of pauses, especially in verbs, were found in the PD group, which also showed less pitch variation than the control group in the experimental sentences. The control group showed a big initial rise in declarative and interrogative sentences, as well as a stronger final declination in declarative and exclamatory ones, when compared to the PD group. CONCLUSIONS & IMPLICATIONS: The use of experimental methodologies for the analysis of reading fluency allows learning more about the prosodic characteristics of people with different pathologies, such as PD. 
Scarce pitch variability found in the analysis, together with the great number of pauses and the longer reading duration, leads to poorly expressive reading, which compromises fluency in PD. The exhaustive evaluation of the reading fluency of PD patients will make it possible to design more complete assessment methods that will favour the diagnosis and early detection of this pathology. WHAT THIS STUDY ADDS: What is already known on this subject: The speech of people with Parkinson's disease (PD) is often impaired by the appearance of hypokinetic dysarthria. The language of people with PD is usually affected with the progression of the disease, with lexico-semantic impairment which mainly affects verbs. Previous literature on reading fluency in PD usually considers reading speed and accuracy, neglecting prosody. Other neurodegenerative diseases with language impairment, such as Alzheimer's disease, commonly cause reading fluency problems. What this paper adds to existing knowledge: This study provides direct and objective measures of the reading fluency (speed, accuracy and prosody) in patients with PD, by the design of experimental texts. Reading fluency characteristics were found to be altered in these patients, especially in pitch variations and reading duration. The reading of Parkinson's patients showed a more flattened pitch. In addition, a greater number of pauses and longer reading durations were also found in the reading of verbs compared to the control group. What are the potential or actual clinical implications of this work? The use of experimentally created texts makes it possible to analyse the influence of different psycholinguistic variables (frequency, length, motor content, manipulability) on reading fluency, and how the processing of these stimuli could be affected in PD. The objective analysis of the reading fluency characteristics in PD allows the design of more specific evaluation and diagnostic tasks.
More complete assessment methods may allow the early detection of the disease. In the same way, it may favour a differential diagnosis with other neurodegenerative diseases.
Subjects
Language Development Disorders, Parkinson Disease, Adult, Humans, Parkinson Disease/complications, Parkinson Disease/diagnosis, Reading, Language, Speech
ABSTRACT
The field of structural health monitoring (SHM) faces a fundamental challenge related to accessibility. While analytical and empirical models and laboratory tests can provide engineers with an estimate of a structure's expected behavior under various loads, measurements of actual buildings require the installation and maintenance of sensors to collect observations. This is costly in terms of power and resources. MyShake, the free seismology smartphone app, aims to advance SHM by leveraging the presence of accelerometers in all smartphones and the wide usage of smartphones globally. MyShake records acceleration waveforms during earthquakes. Because phones are most typically located in buildings, a waveform recorded by MyShake contains response information from the structure in which the phone is located. This represents a free, potentially ubiquitous method of conducting critical structural measurements. In this work, we present preliminary findings that demonstrate the efficacy of smartphones for extracting the fundamental frequency of buildings, benchmarked against traditional accelerometers in a shake table test. Additionally, we present seven proof-of-concept examples of data collected by anonymous and privately owned smartphones running the MyShake app in real buildings, and assess the fundamental frequencies we measure. In all cases, the measured fundamental frequency is found to be reasonable and within an expected range in comparison with several commonly used empirical equations. For one irregularly shaped building, three separate measurements made over the course of four months fall within 7% of each other, validating the accuracy of MyShake measurements and illustrating how repeat observations can improve the robustness of the structural health catalog we aim to build.
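One simple way to read a fundamental frequency off an acceleration record is to take the strongest non-DC peak of its discrete Fourier transform. The abstract does not describe MyShake's actual processing, so the sketch below is only an illustration of that general idea. The function name, the synthetic signal, and the sampling rate are all assumptions.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the largest-magnitude DFT bin, ignoring the DC bin.

    Uses a direct O(n^2) DFT so the example needs only the standard library;
    real records would normally be windowed and processed with an FFT.
    """
    n = len(samples)
    if n < 4:
        raise ValueError("record too short")
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove any constant offset
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate_hz / n


# Synthetic 5 s record at an assumed 50 Hz sampling rate: a 2 Hz "building"
# mode plus a weaker 7 Hz component (values are illustrative only).
rate_hz = 50.0
record = [
    1.0 * math.sin(2 * math.pi * 2.0 * i / rate_hz)
    + 0.3 * math.sin(2 * math.pi * 7.0 * i / rate_hz)
    for i in range(250)
]
estimated_hz = dominant_frequency(record, rate_hz)
```

The frequency resolution of such an estimate is the sampling rate divided by the record length in samples (0.2 Hz here), which bounds how closely repeat measurements can be compared.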
ABSTRACT
A low-resource emotional speech synthesis system for empathetic speech synthesis based on modelling prosody features is presented here. Secondary emotions, identified to be needed for empathetic speech, are modelled and synthesised in this investigation. As secondary emotions are subtle in nature, they are difficult to model compared to primary emotions. This study is one of the few to model secondary emotions in speech as they have not been extensively studied so far. Current speech synthesis research uses large databases and deep learning techniques to develop emotion models. There are many secondary emotions, and hence, developing large databases for each of the secondary emotions is expensive. Hence, this research presents a proof of concept using handcrafted feature extraction and modelling of these features using a low-resource-intensive machine learning approach, thus creating synthetic speech with secondary emotions. Here, a quantitative-model-based transformation is used to shape the emotional speech's fundamental frequency contour. Speech rate and mean intensity are modelled via rule-based approaches. Using these models, an emotional text-to-speech synthesis system to synthesise five secondary emotions (anxious, apologetic, confident, enthusiastic and worried) is developed. A perception test to evaluate the synthesised emotional speech is also conducted. The participants could identify the correct emotion in a forced response test with a hit rate greater than 65%.
Subjects
Speech Perception, Speech, Humans, Speech Perception/physiology, Emotions/physiology, Anxiety
ABSTRACT
Coordination between speech acoustics and manual gestures has been conceived as "not biologically mandated" (McClave E. J Psycholinguist Res 27(1): 69-89, 1998). However, recent work suggests a biomechanical entanglement between the upper limbs and the respiratory-vocal system (Pouw W, de Jonge-Hoekstra D, Harrison SJ, Paxton A, Dixon JA. Ann NY Acad Sci 1491(1): 89-105, 2021). Pouw et al. found that for movements with a high physical impulse, speech acoustics co-occur with the physical impulses of upper limb movements. They interpret this result in terms of biomechanical coupling between arm motion and speech via the breathing system. This coupling could support the synchrony observed between speech prosody and arm gestures during communication. The present study investigates whether the effect of physical impulse on speech acoustics can be extended to leg motion, assumed to be controlled independently from oral communication. The study involved 25 native speakers of German who recalled short stories while biking with their arms or their legs. These conditions were compared with a static condition in which participants could not move their arms. Our analyses are similar to those of Pouw et al. (Pouw W, de Jonge-Hoekstra D, Harrison SJ, Paxton A, Dixon JA. Ann NY Acad Sci 1491(1): 89-105, 2021). Results reveal that intensity peaks in the acoustic signal co-occur with the time of peak acceleration of the legs' biking movements. However, this was not observed when biking with the arms, which corresponded to lower acceleration peaks. In contrast to intensity, F0 was not affected in the arm and leg conditions.
These results suggest that 1) the biomechanical entanglements between the respiratory-vocal system and the lower limbs may also impact speech; 2) the physical impulse may have to reach a threshold to impact speech acoustics. NEW & NOTEWORTHY: The link between speech and limb motion is an interdisciplinary challenge and a core issue in motor control and language research. Our research aims to disentangle the potential biomechanical links between lower limbs and the speech apparatus, by investigating the effect of leg movements on speech acoustics.
Subjects
Leg, Speech, Movement, Arm, Upper Extremity
ABSTRACT
Callous-unemotional (CU) traits are associated with severe and persistent juvenile offending. CU traits are also associated with dampened emotional arousal, which suggests that fundamental frequency (f0), a measure of vocally encoded emotional arousal, may serve as an accessible psychophysiological marker of CU traits in youth. This study investigated the associations between f0 range measured during an emotionally evocative task, CU traits, and emotion dysregulation in a mixed-gender sample of 168 justice-involved youth. For boys, after controlling for covariates, wider f0 range (indicating greater emotional arousal) was negatively associated with CU traits and positively associated with emotion dysregulation. For girls, no significant associations with f0 range emerged; however, CU traits were positively associated with emotion dysregulation. Findings suggest that f0 range may serve as a valid indicator of CU traits in justice-involved boys, and that detained boys and girls with high CU traits are characterized by different profiles of emotion dysregulation.
ABSTRACT
Recently, the possibilities of detecting psychosocial stress from speech have been discussed. Yet, there are mixed effects and a current lack of clarity in relations and directions for parameters derived from stressed speech. The aim of the current study is - in a controlled psychosocial stress induction experiment - to apply network modeling to (1) look into the unique associations between specific speech parameters, comparing speech networks containing fundamental frequency (F0), jitter, mean voiced segment length, and Harmonics-to-Noise Ratio (HNR) pre- and post-stress induction, and (2) examine how changes pre- versus post-stress induction (i.e., change network) in each of the parameters are related to changes in self-reported negative affect. Results show that the network of speech parameters is similar after versus before the stress induction, with a central role of HNR, which shows that the complex interplay and unique associations between each of the used speech parameters is not impacted by psychosocial stress (aim 1). Moreover, we found a change network (consisting of pre-post stress difference values) with changes in jitter being positively related to changes in self-reported negative affect (aim 2). These findings illustrate - for the first time in a well-controlled but ecologically valid setting - the complex relations between different speech parameters in the context of psychosocial stress. Longitudinal and experimental studies are required to further investigate these relationships and to test whether the identified paths in the networks are indicative of causal relationships.
Subjects
Speech Acoustics, Voice, Humans, Speech, Speech Production Measurement, Psychological Stress
ABSTRACT
The frequency-following response (FFR) to periodic complex sounds has gained recent interest in auditory cognitive neuroscience as it captures with great fidelity the tracking accuracy of the periodic sound features in the ascending auditory system. Seminal studies suggested the FFR as a correlate of subcortical sound encoding, yet recent studies aiming to locate its sources challenged this assumption, demonstrating that FFR receives some contribution from the auditory cortex. Based on frequency-specific phase-locking capabilities along the auditory hierarchy, we hypothesized that FFRs to higher frequencies would receive less cortical contribution than those to lower frequencies, hence supporting a major subcortical involvement for these high frequency sounds. Here, we used a magnetoencephalographic (MEG) approach to trace the neural sources of the FFR elicited in healthy adults (N = 19) to low (89 Hz) and high (333 Hz) frequency sounds. FFRs elicited to the high and low frequency sounds were clearly observable on MEG and comparable to those obtained in simultaneous electroencephalographic recordings. Distributed source modeling analyses revealed midbrain, thalamic, and cortical contributions to FFR, arranged in frequency-specific configurations. Our results showed that the main contribution to the high-frequency sound FFR originated in the inferior colliculus and the medial geniculate body of the thalamus, with no significant cortical contribution. In contrast, the low-frequency sound FFR had a major contribution located in the auditory cortices, and also received contributions originating in the midbrain and thalamic structures. These findings support the multiple generator hypothesis of the FFR and are relevant for our understanding of the neural encoding of sounds along the auditory hierarchy, suggesting a hierarchical organization of periodicity encoding.
Subjects
Acoustic Stimulation/methods, Auditory Cortex/physiology, Auditory Pathways/physiology, Auditory Perception/physiology, Auditory Evoked Potentials/physiology, Magnetoencephalography/methods, Adult, Electroencephalography/methods, Female, Humans, Male, Young Adult
ABSTRACT
Fundamental frequency (fo), perceived as voice pitch, is the most sexually dimorphic, perceptually salient and intensively studied voice parameter in human nonverbal communication. Thousands of studies have linked human fo to biological and social speaker traits and life outcomes, from reproductive to economic. Critically, researchers have used myriad speech stimuli to measure fo and infer its functional relevance, from individual vowels to longer bouts of spontaneous speech. Here, we acoustically analysed fo in nearly 1000 affectively neutral speech utterances (vowels, words, counting, greetings, read paragraphs and free spontaneous speech) produced by the same 154 men and women, aged 18-67, with two aims: first, to test the methodological validity of comparing fo measures from diverse speech stimuli, and second, to test the prediction that the vast inter-individual differences in habitual fo found between same-sex adults are preserved across speech types. Indeed, despite differences in linguistic content, duration, scripted or spontaneous production and within-individual variability, we show that 42-81% of inter-individual differences in fo can be explained between any two speech types. Beyond methodological implications, together with recent evidence that inter-individual differences in fo are remarkably stable across the lifespan and generalize to emotional speech and nonverbal vocalizations, our results further substantiate voice pitch as a robust and reliable biomarker in human communication.
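A figure such as "42-81% of inter-individual differences explained" is commonly a squared correlation between speakers' fo values in two speech types. The abstract does not state the exact model, so the sketch below assumes a simple squared Pearson correlation. The speaker values and function names are hypothetical.

```python
import math


def variance_explained(x, y):
    """Squared Pearson correlation (r^2) between two paired measurement lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each list must contain some variation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical mean fo (Hz) for five speakers in two speech types.
vowel_fo_hz = [110.0, 125.0, 98.0, 140.0, 118.0]
paragraph_fo_hz = [115.0, 128.0, 105.0, 138.0, 121.0]
r_squared = variance_explained(vowel_fo_hz, paragraph_fo_hz)

# Under the same assumption, the reported bounds imply these correlations.
r_lower = math.sqrt(0.42)
r_upper = math.sqrt(0.81)
```

Read this way, the reported range would correspond to correlations of roughly 0.65 to 0.90 between any two speech types.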
Subjects
Speech, Voice, Adult, Female, Humans, Male, Speech Acoustics
ABSTRACT
Despite recent evidence of a positive relationship between cortisol levels and voice pitch in stressed speakers, the extent to which human listeners can reliably judge stress from the voice remains unknown. Here, we tested whether voice-based judgments of stress co-vary with the free cortisol levels and vocal parameters of speakers recorded in a real-life stressful situation (oral examination) and baseline (2 weeks prior). Hormone and acoustic analyses indicated elevated salivary cortisol levels and corresponding changes in voice pitch, vocal tract resonances (formants), and speed of speech during stress. In turn, listeners' stress ratings correlated significantly with speakers' cortisol levels. Higher pitched voices were consistently perceived as more stressed; however, the influence of formant frequencies, vocal perturbation and noise parameters on stress ratings varied across contexts, suggesting that listeners utilize different strategies when assessing calm versus stressed speech. These results indicate that nonverbal vocal cues can convey honest information about a speaker's underlying physiological level of stress that listeners can, to some extent, detect and utilize, while underscoring the necessity to control for individual differences in the biological stress response.
Subjects
Speech Perception, Voice, Cues, Humans, Hydrocortisone, Judgment
ABSTRACT
INTRODUCTION: The purpose of the present study was to determine the possible effect of allergic rhinitis (AR) on voice change in children using acoustic analysis and the Turkish children's voice handicap index-10 (TR-CVHI-10). METHODS: This is a case-control study. Forty-one children with AR and a positive skin prick test, as well as 39 control children who had a negative skin prick test and no history of allergic disease, were selected for the study. Each assessment included recordings for the purposes of acoustic voice analysis (fundamental frequency [f0], jitter %, shimmer %, and harmonics-to-noise ratio [HNR]) and aerodynamic analysis (maximum phonation time [MPT] and s/z ratio). All participants completed the TR-CVHI-10. RESULTS: The mean TR-CVHI-10 score of the AR group was significantly higher than that of the control group (p = 0.013). No difference was observed between the AR and control groups in terms of jitter, shimmer, HNR, and MPT values and s/z ratio (p > 0.05). Conversely, the f0 value was higher in controls (270.9 ± 60.3 Hz) than in the AR group (237.7 ± 54.3 Hz) (p = 0.012). CONCLUSION: The study's results revealed that AR can have an effect on fundamental frequency and voice quality in children. The diagnostic process should include AR as a potential cause of voice disorders in children.
Subjects
Allergic Rhinitis, Voice Disorders, Voice, Case-Control Studies, Child, Humans, Phonation, Allergic Rhinitis/diagnosis, Speech Acoustics, Speech Production Measurement, Voice Disorders/diagnosis, Voice Disorders/etiology
ABSTRACT
Singing voice is a human quality that requires the precise coordination of numerous kinetic functions and results in a perceptually variable auditory outcome. The use of multi-sensor systems can facilitate the study of correlations between the vocal mechanism kinetic functions and the voice output. This is directly relevant to vocal education, rehabilitation, and prevention of vocal health issues in educators; professionals; and students of singing, music, and acting. In this work, we present the initial design of a modular multi-sensor system for singing voice analysis, and describe its first assessment experiment on the 'vocal breathiness' qualitative characteristic. A system case study with two professional singers was conducted, utilizing signals from four sensors. Participants sung a protocol of vocal trials in various degrees of intended vocal breathiness. Their (i) vocal output, (ii) phonatory function, and (iii) respiratory behavior-per-condition were recorded through a condenser microphone (CM), an Electroglottograph (EGG), and thoracic and abdominal respiratory effort transducers (RET), respectively. Participants' individual respiratory management strategies were studied through qualitative analysis of RET data. Microphone audio samples breathiness degree was rated perceptually, and correlation analysis was performed between sample ratings and parameters extracted from CM and EGG data. Smoothed Cepstral Peak Prominence (CPPS) and vocal folds' Open Quotient (OQ), as computed with the Howard method (HOQ), demonstrated the higher correlation coefficients, when analyzed individually. DECOM method-computed OQ (DOQ) was also examined. Interestingly, the correlation coefficient of pitch difference between estimates from CM and EGG signals appeared to be (based on the Pearson correlation coefficient) statistically insignificant (a result that warrants investigation in larger populations). 
The study of multivariate models revealed even higher correlation coefficients. The models studied were the Acoustic Breathiness Index (ABI) and the proposed multiple regression model CDH (CPPS, DOQ, and HOQ), which was developed in order to combine analysis results from microphone and EGG signals. The combination of ABI and the proposed CDH model appeared to yield the highest correlation with perceptual breathiness ratings. Study results suggest potential for the use of a completed version of the system in vocal pedagogy and research, as the case study indicated system practicality, revealed a number of pertinent correlations, and introduced topics with further research possibilities.
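The correlation analysis described above can be illustrated with a minimal sketch of the Pearson correlation coefficient between perceptual ratings and a single acoustic parameter. The function name `pearson_r` and all numeric values below are hypothetical and are not taken from the study; they only show the shape of the computation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL illustration data (not from the study):
# perceptual breathiness ratings and CPPS values in dB for five samples.
breathiness_ratings = [0, 1, 2, 3, 4]
cpps_db = [14.0, 12.5, 10.0, 8.5, 6.0]

r = pearson_r(breathiness_ratings, cpps_db)
print(f"Pearson r = {r:.3f}")
```

In this invented data set, CPPS falls as rated breathiness rises, so the coefficient is strongly negative. Testing statistical significance and fitting a multiple regression such as CDH would need additional steps that are not shown here.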
Subjects
Singing , Voice , Acoustics , Humans , Phonation , Vocal Cords , Voice Quality
ABSTRACT
PURPOSE: The purpose of this study was to establish and characterize age- and gender-specific normative data of the singing voice using the voice range profile for clinical diagnostics. Furthermore, associations between the singing voice and socioeconomic status were examined. METHODS: Singing voice profiles of 1,578 mostly untrained children aged between 7.0 and 16.11 years were analyzed. Participants had to reproduce sung tones at defined pitches, yielding the maximum and minimum fundamental frequency and sound pressure level (SPL). In addition, maximum phonation time (MPT) was measured. Percentile curves of frequency, SPL and MPT were estimated. To examine the associations with socioeconomic status, multivariate analyses adjusted for age and sex were performed. RESULTS: In boys, the mean maximum frequency decreased from 750.9 Hz to 397.1 Hz with increasing age. Similarly, the mean minimum frequency decreased from 194.4 Hz to 91.9 Hz. In girls, the mean maximum frequency decreased from 754.9 to 725.3 Hz, and the mean minimum frequency decreased from 202.4 to 175.0 Hz. For both sexes, the mean frequency range ∆f remained roughly constant at about 24 semitones. The MPT increased with age in both boys and girls. Neither age nor sex had an effect on SPLmin or SPLmax, which ranged between 52.6 and 54.1 dBA and between 86.5 and 82.8 dBA, respectively. Socioeconomic status was not associated with the above-mentioned variables. CONCLUSION: To our knowledge, this study is the first to present large-scale normative data on the singing voice in childhood and adolescence based on a high number of measurements. In addition, we provide percentile curves for practical application in clinical practice and vocal pedagogy, which may be applied to distinguish between normal and pathological singing voices.
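As a rough check on the reported range of about 24 semitones, the interval between two frequencies can be computed as 12·log2(f_max / f_min). The sketch below applies this standard formula to the mean values quoted in the abstract. Two assumptions are made here: the labels "youngest" and "oldest" are inferred from the phrase "with increasing age", and the range between two mean frequencies is not the same as the mean of the individual ranges, so the numbers are only indicative.

```python
import math


def semitone_range(f_min_hz, f_max_hz):
    """Interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(f_max_hz / f_min_hz)


# Mean minimum and maximum frequencies (Hz) quoted in the abstract.
# The age labels are an assumption based on "with increasing age".
mean_frequencies = {
    "boys, youngest": (194.4, 750.9),
    "boys, oldest": (91.9, 397.1),
    "girls, youngest": (202.4, 754.9),
    "girls, oldest": (175.0, 725.3),
}

for label, (f_min, f_max) in mean_frequencies.items():
    print(f"{label}: {semitone_range(f_min, f_max):.1f} semitones")
```

All four values fall between 22 and 26 semitones, which is consistent with the roughly constant two-octave range stated in the results.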
Subjects
Singing , Voice , Adolescent , Child , Female , Humans , Male , Phonation , Voice Quality , Voice Training
ABSTRACT
A retractable larynx and adaptations of the vocal folds in the males of several polygynous ruminants serve for the production of rutting calls that acoustically advertise a larger-than-actual body size to both rival males and potential female mates. Here, such features of the vocal tract and of the sound source are documented in another species. We investigated the vocal anatomy and laryngeal mobility, including its acoustical effects, during the rutting vocal display of free-ranging male impala (Aepyceros melampus melampus) in Namibia. Male impala produced bouts of rutting calls (consisting of oral roars and interspersed explosive nasal snorts) in a low-stretch posture while guarding a rutting territory or harem. For the duration of the roars, male impala retracted the larynx from its high resting position to a low mid-neck position, involving an extensible pharynx and a resilient connection between the hyoid apparatus and the larynx. Maximal larynx retraction was 108 mm based on estimates from single video frames. This agreed well with the 91-mm vocal tract elongation calculated on the basis of differences in formant dispersion between roar portions produced with the larynx still ascended and those produced with a maximally retracted larynx. Judged by their morphological traits, the larynx-retracting muscles of male impala are homologous to those of other larynx-retracting ruminants. In contrast, the large and massive vocal keels are evolutionary novelties arising by fusion and linear arrangement of the arytenoid cartilage and the canonical vocal fold. These bulky and histologically complex vocal keels produced a low fundamental frequency of 50 Hz. Impala is another ruminant species in which the males are capable of larynx retraction. In addition, male impala vocal folds are spectacularly specialized compared with those of domestic bovids, allowing the production of impressive, low-frequency roaring vocalizations as a significant part of their rutting behaviour.
Our study expands knowledge on the evolutionary variation of vocal fold morphology in mammals, suggesting that the structure of the mammalian sound source is not always human-like and should be considered in acoustic analysis and modelling.
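The abstract does not state how vocal tract elongation was derived from formant dispersion. One common approximation, assumed here, treats the vocal tract as a uniform tube closed at one end, whose formants are spaced by Δf = c / (2L), giving L = c / (2Δf). The sketch below uses that approximation. The speed of sound (350 m/s) and both formant dispersion values are hypothetical and are not taken from the study.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air in the vocal tract


def vocal_tract_length_m(formant_dispersion_hz, c=SPEED_OF_SOUND_M_S):
    """Estimate vocal tract length from formant dispersion.

    Uses the uniform-tube (closed at one end) approximation,
    where adjacent formants are spaced by c / (2 * L).
    """
    if formant_dispersion_hz <= 0:
        raise ValueError("formant dispersion must be positive")
    return c / (2.0 * formant_dispersion_hz)


# HYPOTHETICAL dispersion values (Hz), not measurements from the study.
dispersion_larynx_resting = 500.0
dispersion_larynx_retracted = 400.0

length_resting = vocal_tract_length_m(dispersion_larynx_resting)
length_retracted = vocal_tract_length_m(dispersion_larynx_retracted)
elongation_mm = (length_retracted - length_resting) * 1000.0

print(f"resting: {length_resting * 1000:.1f} mm")
print(f"retracted: {length_retracted * 1000:.1f} mm")
print(f"elongation: {elongation_mm:.1f} mm")
```

A lower formant dispersion corresponds to a longer estimated vocal tract, which is why larynx retraction lowers and compresses the formants of the roar.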