Results 1 - 9 of 9
1.
J Voice ; 37(1): 26-36, 2023 Jan.
Article in English | MEDLINE | ID: mdl-33257208

ABSTRACT

OBJECTIVE: This study proposes a new computational framework for automated spatial segmentation of the vocal fold edges in high-speed videoendoscopy (HSV) data during connected speech. This spatio-temporal analytic representation of the vocal folds enables the HSV-based measurement of the glottal area waveform and other vibratory characteristics in the context of running speech. METHODS: HSV data were obtained from a vocally normal adult during production of the "Rainbow Passage." An algorithm based on an active contour modeling approach was developed for the analysis of HSV data. The algorithm was applied on a series of HSV kymograms at different intersections of the vocal folds to detect the edges of the vibrating vocal folds across the frames. This edge detection method follows a set of deformation rules for the active contours to capture the edges of the vocal folds through an energy optimization procedure. The detected edges in the kymograms were then registered back to the HSV frames. Subsequently, the glottal area waveform was calculated based on the area of the glottis enclosed by the vocal fold edges in each frame. RESULTS: The developed algorithm successfully captured the edges of the vocal folds in the HSV kymograms. This method led to an automated measurement of the glottal area waveform from the HSV frames during vocalizations in connected speech. CONCLUSION: The proposed algorithm serves as an automated method for spatial segmentation of the vocal folds in HSV data in connected speech. This study is one of the initial steps toward developing HSV-based measures to study vocal fold vibratory characteristics and voice production mechanisms in norm and disorder in the context of connected speech.


Subjects
Larynx , Speech , Phonation , Video Recording/methods , Vocal Cords , Vibration
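
The last step described in the abstract of entry 1, computing the glottal area waveform from the detected edges in each frame, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each frame's edges are given as two lists of column positions (left and right fold) sampled on the same image rows, and the function names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def glottal_area(left_edges: Sequence[float], right_edges: Sequence[float]) -> float:
    """Area (in pixels) enclosed by the two vocal fold edges in one frame.

    Each index is one image row crossing the glottis. The glottal width on
    that row is right - left, clipped at zero where the folds are in contact.
    """
    if len(left_edges) != len(right_edges):
        raise ValueError("edge lists must have the same length")
    return sum(max(r - l, 0.0) for l, r in zip(left_edges, right_edges))


def glottal_area_waveform(
    frames: Sequence[Tuple[Sequence[float], Sequence[float]]]
) -> List[float]:
    """One area value per frame, in frame order."""
    return [glottal_area(left, right) for left, right in frames]


# Hypothetical edges for three frames, three rows each.
frames = [
    ([10, 10, 10], [10, 10, 10]),  # closed glottis: all widths are 0
    ([9, 8, 9], [11, 12, 11]),     # open: widths 2, 4, 2
    ([10, 9, 10], [10, 11, 10]),   # closing: widths 0, 2, 0
]
gaw = glottal_area_waveform(frames)
print(gaw)
```

Summing row widths treats each row as one pixel tall, so the result is in pixels; converting to physical units would need a calibration that the abstract does not describe.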
2.
J Speech Lang Hear Res ; 65(6): 2098-2113, 2022 06 08.
Article in English | MEDLINE | ID: mdl-35605603

ABSTRACT

PURPOSE: Voice disorders are best assessed by examining vocal fold dynamics in connected speech. This can be achieved using flexible laryngeal high-speed videoendoscopy (HSV), which enables us to study vocal fold mechanics with high temporal detail. Analysis of vocal fold vibration using HSV requires accurate segmentation of the vocal fold edges. This article presents an automated deep-learning scheme to segment the glottal area in HSV from which the glottal edges are derived during connected speech. METHOD: Using a custom-built HSV system, data were obtained from a vocally healthy participant reciting the "Rainbow Passage." A deep neural network was designed for glottal area segmentation in the HSV data. A recently introduced hybrid approach by the authors was utilized as an automated labeling tool to train the network on a set of HSV frames, where the glottis region was automatically annotated during vocal fold vibrations. The network was then tested against manually segmented frames using two metrics, intersection over union (IoU) and Boundary F1 (BF) score, and its performance was assessed on various phonatory events on the HSV sequence. RESULTS: The designed network was successfully trained using the hybrid approach, without the need for manual labeling, and tested on the manually labeled data. The performance metrics showed a mean IoU of 0.82 and a mean BF score of 0.96. In addition, the evaluation of the network's performance demonstrated an accurate segmentation of the glottal edges/area even during complex nonstationary phonatory events and when vocal folds were not vibrating, thus overcoming the limitations of the previous hybrid approach that could only be applied to the vibrating vocal folds. CONCLUSIONS: The introduced automated scheme guarantees accurate glottis representation in challenging color HSV data with lower image quality and excessive laryngeal maneuvers during all instances of connected speech. This facilitates the future development of HSV-based measures to assess the running vibratory characteristics of the vocal folds in speakers with and without voice disorder. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.19798864.


Subjects
Deep Learning , Larynx , Voice Disorders , Glottis/diagnostic imaging , Humans , Laryngoscopy/methods , Phonation , Speech , Vibration , Video Recording , Vocal Cords/diagnostic imaging , Voice Disorders/diagnosis
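
Entry 2 reports segmentation quality as intersection over union (IoU). As a reminder of what that metric measures, here is a small sketch that compares a predicted binary glottis mask with a manually labeled one. The masks are made up for illustration, and returning 1.0 when both masks are empty (for example, a fully closed glottis) is an assumed convention, not something stated in the abstract. The Boundary F1 score is not shown.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of identical shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 2x4 masks: 1 marks glottis pixels.
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
manual = [[0, 1, 0, 0],
          [0, 1, 1, 0]]
score = iou(predicted, manual)  # 3 shared pixels out of 4 in the union
print(score)
```

A mean IoU such as the 0.82 reported above would be the average of this per-frame score over the test frames.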
3.
Appl Sci (Basel) ; 11(3)2021 Feb.
Article in English | MEDLINE | ID: mdl-33717604

ABSTRACT

Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the "Rainbow Passage." The introduced technique was based on an unsupervised machine-learning (ML) approach combined with an active contour modeling (ACM) technique (also known as a hybrid approach). The hybrid method was implemented to capture the edges of vocal folds on different HSV kymograms, extracted at various cross-sections of vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to cluster the kymograms to identify the clustered glottal area and consequently provided an initialized contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
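
The clustering step in entry 3 can be illustrated with a two-cluster k-means on kymogram pixel intensities. This is a rough sketch under stated assumptions, not the published method: it assumes a grayscale kymogram in which the glottal gap is the darker cluster, uses k = 2 with centers initialized at the minimum and maximum intensity, and stops at the mask that would seed the active contour. The active contour refinement itself is not shown, and all names and values are hypothetical.

```python
from typing import List, Sequence, Tuple


def kmeans_two_clusters(values: Sequence[float], iterations: int = 20) -> Tuple[float, float]:
    """1-D k-means with k = 2; returns (dark_center, bright_center)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iterations):
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not dark or not bright:
            break  # degenerate case, e.g. a constant image
        new_lo = sum(dark) / len(dark)
        new_hi = sum(bright) / len(bright)
        if new_lo == lo and new_hi == hi:
            break  # assignments no longer change
        lo, hi = new_lo, new_hi
    return lo, hi


def glottis_mask(kymogram: Sequence[Sequence[float]]) -> List[List[int]]:
    """Label pixels nearer the dark center as glottis (1), others as 0."""
    flat = [v for row in kymogram for v in row]
    lo, hi = kmeans_two_clusters(flat)
    if lo == hi:  # no contrast: nothing to label
        return [[0 for _ in row] for row in kymogram]
    return [[1 if abs(v - lo) <= abs(v - hi) else 0 for v in row] for row in kymogram]


# Hypothetical 2x5 kymogram: low values are the dark glottal gap.
kymogram = [[200, 190, 30, 25, 195],
            [205, 40, 20, 35, 198]]
mask = glottis_mask(kymogram)
print(mask)
```

The boundary of the `1` region is what would be handed to the contour model as its initial position.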

4.
Folia Phoniatr Logop ; 60(4): 188-94, 2008.
Article in English | MEDLINE | ID: mdl-18463414

ABSTRACT

Although effort closure techniques have a long history in the treatment of hypofunctional and psychogenic voice disorders, there have been surprisingly few studies of their specific laryngeal and phonatory effects. The present study was designed to provide preliminary data on physiologic changes in voice production associated with a weightlifting and support maneuver. Twenty vocally healthy subjects (10 men and 10 women) lifted hand-held weights and steadily supported them with outstretched arms as they either sustained comfortable phonation or repeated the syllable /pi/. Both the male and female subjects showed an increase in the electroglottographic contact quotient, long-term F(0) variability, and estimated laryngeal airway resistance attributable to an elevated driving pressure. Conversely, there were no significant changes in mean F(0), pitch perturbation quotient (jitter), or phonatory airflow between the pre-lift and lift portions of their voice production, regardless of the amount of weight supported. The results of this study indicate that simultaneous phonation and weightlifting is associated with increased laryngeal airway resistance characterized by an elevation in driving pressure and medial compression of the vocal folds. Implications for an improved understanding of normal vocal physiology and for the therapeutic use of such air-trapping exercises are addressed.


Subjects
Larynx/physiology , Phonation/physiology , Weight-Bearing/physiology , Adult , Airway Resistance , Female , Glottis/physiology , Humans , Male , Pressure , Pulmonary Ventilation
5.
J Voice ; 32(2): 256.e1-256.e12, 2018 Mar.
Article in English | MEDLINE | ID: mdl-28647431

ABSTRACT

OBJECTIVE: This study proposes a gradient-based method for temporal segmentation of laryngeal high-speed videoendoscopy (HSV) data obtained during connected speech. METHODS: A custom-developed HSV system coupled with a flexible fiberoptic nasolaryngoscope was used to record one vocally normal female participant during reading of the "Rainbow Passage." A gradient-based algorithm was developed to generate a motion window. When applied to the HSV data, the motion window acted as a filter tracking the location of the vibrating vocal folds. The glottal area waveform was estimated using a statistical-based image-processing approach. The vocal fold vibratory frequency was computed by an autocorrelation-based extraction of the fundamental frequency (f0) from the glottal area waveform. Temporal segmentation was then performed based on the f0 contour and automatic detection of the epiglottic obstructions. Additionally, visual temporal segmentation was performed by viewing the HSV images frame by frame to determine the time points of the vocalization onsets and offsets, and the epiglottic obstructions of the glottis. RESULTS: The time points resulting from the automatic and visual temporal segmentation methods were cross-validated. The f0-contour patterns of rise and fall resulting from the automatic algorithm were found to be in agreement with the visual inspection of the vibratory frequency change in the HSV data. CONCLUSIONS: This study demonstrated the feasibility of automatic temporal segmentation of HSV imaging of connected speech, which allows for mapping the video content into onsets, offsets, and epiglottic obstructions for each vocalization. Automated analysis of HSV imaging of connected speech has significant clinical potential for advancing instrumental voice assessment protocols.


Subjects
Image Interpretation, Computer-Assisted/methods , Laryngoscopy/methods , Larynx/anatomy & histology , Larynx/physiology , Phonation , Speech Acoustics , Video Recording/methods , Voice Quality , Adult , Algorithms , Automation , Feasibility Studies , Female , Humans , Laryngeal Diseases/diagnosis , Laryngeal Diseases/physiopathology , Predictive Value of Tests , Reproducibility of Results , Time Factors , Vibration , Voice Disorders/diagnosis , Voice Disorders/physiopathology
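
Entry 5 mentions an autocorrelation-based extraction of f0 from the glottal area waveform. The sketch below shows the general idea on a synthetic waveform. It is a simplification under stated assumptions: the 4000 frames-per-second rate, the 60 to 1000 Hz search range, and the choice of the largest autocorrelation value in that range are all hypothetical and not taken from the abstract.

```python
import math
from typing import Optional, Sequence


def estimate_f0(
    signal: Sequence[float],
    frame_rate: float,
    f_min: float = 60.0,    # assumed lower bound of the search range (Hz)
    f_max: float = 1000.0,  # assumed upper bound of the search range (Hz)
) -> Optional[float]:
    """Estimate the fundamental frequency from the autocorrelation peak."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset first
    min_lag = max(1, int(frame_rate / f_max))
    max_lag = min(n - 1, int(frame_rate / f_min))
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        value = sum(x[i] * x[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return frame_rate / best_lag if best_lag else None


# Synthetic glottal area waveform: 200 Hz oscillation sampled at 4000 fps.
frame_rate = 4000
gaw = [1.0 + math.sin(2 * math.pi * 200 * t / frame_rate) for t in range(400)]
f0 = estimate_f0(gaw, frame_rate)
print(f0)
```

Running this over short sliding windows would give an f0 contour of the kind the abstract uses for temporal segmentation; the function returns `None` when no positive autocorrelation peak is found, which a caller could treat as an unvoiced window.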
6.
J Commun Disord ; 45(3): 173-80, 2012.
Article in English | MEDLINE | ID: mdl-22436826

ABSTRACT

This study examined whether mean vocal fundamental frequency (F(0)) or speech sound pressure level (SPL) varies with changes in syllable repetition rate. Twenty-four young adults (12 M and 12 F) repeated the syllables /pʌ/, /pʌtə/, and /pʌtəkə/ at a modeled "slow" rate of approximately one syllable per second, at a self-selected "comfortable" rate, and at their maximum rate. For both male and female subjects there was a significant increase in F(0), but not SPL, between the "slow" and "maximal" and between the "comfortable" and "maximal" repetition rates. Conversely, there was no significant difference in mean F(0) associated with syllable type, whereas significant SPL differences were most likely due to differences in plosive aspiration, syllable stress, and juncture between the mono-, bi-, and tri-syllabic sequences. These results suggest that there is a laryngeal adjustment that attends an increase in speech rate, lending additional support for speech and voice treatment strategies that employ rate modification techniques. LEARNING OUTCOMES: The reader will be able to: (1) outline the advantages and disadvantages of using a syllable-repetition task to evaluate speech rate; (2) describe how vocal F(0) and speech SPL are affected by changes in speech rate; and (3) describe the clinical and theoretical implications of the results from this study.


Subjects
Speech , Voice , Adult , Female , Humans , Male , Phonetics , Young Adult
7.
J Voice ; 26(6): 816.e13-20, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23059188

ABSTRACT

OBJECTIVE: This investigation used synchronous high-speed videoendoscopy and electroglottography (EGG) to systematically study contact and separation behavior along the length of the vocal folds. DESIGN: Repeated measures. METHODS: Facilitated by EGG and digital kymograms derived at 20%, 35%, 50%, 65%, and 80% of the posteroanterior length of the vocal folds, the pattern of vocal-fold contact and separation was determined for seven female and seven male vocally healthy subjects while producing "breathy," "comfortable," and "pressed" phonations. RESULTS: The female subjects consistently used an anterior-to-posterior contact pattern and posterior-to-anterior separation pattern when producing a breathy or comfortable voice, with several using a simultaneous pattern of contact and/or separation for pressed phonation. The male subjects showed more variable "zipperlike" separation patterns, but consistently used a simultaneous contact pattern for pressed voice that was also commonly used when producing comfortable phonation. CONCLUSIONS: Findings indicate longitudinal phase differences in vocal-fold vibration are both common and expected in vocally healthy speakers. The implications for vocal assessment, as well as for the use and interpretation of the EGG signal, are discussed.


Subjects
Electrodiagnosis , Laryngoscopy , Phonation , Vibration , Video Recording , Vocal Cords/physiology , Voice , Adult , Biomechanical Phenomena , Female , Humans , Kymography , Male , Sex Factors , Time Factors , Young Adult
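
Entry 7 derives digital kymograms at 20%, 35%, 50%, 65%, and 80% of the posteroanterior length of the vocal folds. A digital kymogram is the same scan line taken from every frame and stacked over time. The sketch below shows that idea under simplifying assumptions that are not from the abstract: the glottal axis is taken to be aligned with the image rows, the posterior and anterior ends are given as row indices, and the function names and tiny sample video are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # one grayscale image: rows of pixel values


def kymogram_rows(
    posterior_row: int,
    anterior_row: int,
    percents: Sequence[int] = (20, 35, 50, 65, 80),
) -> List[int]:
    """Row indices at the given percentages of the posteroanterior length."""
    length = anterior_row - posterior_row
    return [posterior_row + round(p * length / 100) for p in percents]


def digital_kymogram(frames: Sequence[Frame], row: int) -> List[List[int]]:
    """Stack the same image row from every frame; one output line per frame."""
    return [list(frame[row]) for frame in frames]


# Glottis assumed to span image rows 10 (posterior) to 110 (anterior).
rows = kymogram_rows(10, 110)
print(rows)

# Tiny hypothetical video: two frames of 2x2 pixels.
video = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
kymo = digital_kymogram(video, row=1)
print(kymo)
```

Comparing when contact begins and ends on each of the five kymograms is what would reveal the anterior-to-posterior or "zipperlike" patterns described in the abstract.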
8.
J Voice ; 23(2): 164-8, 2009 Mar.
Article in English | MEDLINE | ID: mdl-18083343

ABSTRACT

The speed with which the vocal folds adduct to the midline is considered an important variable in the etiology of some voice disorders and may also be a meaningful indicator of central or peripheral neural dysfunction. It is proposed that the time lag between the rise of the sound pressure (SP) and electroglottographic (EGG) signals, measured at the onset of phonation, provides a useful index of vocal attack time. This report describes the experimental validation of this measure, whereby the SP and EGG signals were recorded synchronously with high-speed videoendoscopy, from which a digital kymogram was generated. It is shown that, after appropriate signal processing, the intersignal time delay provides a potentially useful measure that varies with vocal attack characteristics. The proposed method calls for no invasive procedures and relies on signals that are routinely obtained in most clinical settings. Unlike acoustic "rise time" measures of voice onset, the glottographic measure involves no operator intervention, requires no arbitrary decisions about measurement points, and may be accomplished quickly and automatically on any personal computer.


Subjects
Glottis/physiology , Voice , Algorithms , Female , Humans , Laryngoscopy , Male , Phonation , Signal Processing, Computer-Assisted , Time Factors , Video Recording , Voice Quality
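
Entry 8 measures the time lag between the rise of the sound pressure (SP) and electroglottographic (EGG) signals but does not say in the abstract how the delay is computed. One common way to estimate a delay between two signals is to find the lag that maximizes their cross-correlation. The sketch below uses that swapped-in technique on synthetic pulses. The 10 kHz sample rate, the pulse shapes, the 5-sample offset, and which signal leads are all hypothetical.

```python
from typing import Sequence


def delay_by_cross_correlation(
    first: Sequence[float],
    second: Sequence[float],
    sample_rate: float,
    max_lag: int,
) -> float:
    """Delay of `second` relative to `first`, in seconds.

    Positive values mean `second` rises later than `first`.
    """
    n = min(len(first), len(second))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += first[i] * second[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Synthetic onset pulses standing in for the processed signals.
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]
signal_a = [0.0] * 20 + pulse + [0.0] * 35  # pulse starts at sample 20
signal_b = [0.0] * 25 + pulse + [0.0] * 30  # same pulse, 5 samples later
sample_rate = 10_000  # assumed sampling rate in Hz
delay = delay_by_cross_correlation(signal_a, signal_b, sample_rate, max_lag=10)
print(delay)
```

On real recordings the raw signals would first need the "appropriate signal processing" the abstract refers to, since cross-correlating unprocessed step-like onsets can give ambiguous peaks.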
9.
Ann Surg ; 236(6): 823-32, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12454521

ABSTRACT

OBJECTIVE: To analyze voice function before and after thyroidectomy for patients with normal preoperative voice using a standardized multidimensional voice assessment protocol. SUMMARY BACKGROUND DATA: The natural history of post-thyroidectomy voice disturbances for patients with preserved laryngeal nerve function has not been systematically studied and characterized with the intent of using the data for postoperative voice rehabilitation. METHODS: During a prospective single-arm study, patients with normal voice underwent functional voice testing using a standardized voice grading scale and a battery of acoustic, aerodynamic, glottographic, and videostroboscopic tests before, 1 week after, and 3 months after thyroidectomy. Differences in observed sample means were evaluated using analysis of covariance or t test; categorical data were analyzed using the Fisher exact or chi-square test. RESULTS: Fifty-four patients were enrolled; 50 and 46 were evaluable at 1 week and 3 months, respectively. No patient developed recurrent laryngeal nerve injury; one had superior laryngeal nerve injury. Fifteen (30%) patients reported early subjective voice change and seven (14%) reported late (3-month) subjective voice change. Forty-two (84%) patients had significant objective change in at least one voice parameter. Six (12%) had significant alterations in more than three voice measures, of which four (67%) were symptomatic, whereas 25% with three or fewer objective changes had symptoms. Patients with persistent voice change at 3 months had an increased likelihood of multiple (more than three) early objective changes (43% vs. 7%). Early maximum phonational frequency range and vocal jitter changes from baseline were significantly associated with voice symptoms at 3 months. CONCLUSIONS: Early vocal symptoms are common following thyroidectomy and persist in 14% of patients. Multiple (more than three) objective voice changes correlate with early and late postoperative symptoms. Alterations in maximum phonational frequency range and vocal jitter predict late perceived vocal changes. Factors other than laryngeal nerve injury appear to alter post-thyroidectomy voice. The variability of patient symptoms underscores the importance of understanding the physiology of dysphonia.


Subjects
Thyroid Neoplasms/surgery , Thyroidectomy/adverse effects , Voice Disorders/diagnosis , Voice Quality , Adult , Aged , Analysis of Variance , Female , Follow-Up Studies , Humans , Laryngoscopy , Male , Middle Aged , Postoperative Period , Probability , Prospective Studies , Risk Assessment , Sensitivity and Specificity , Severity of Illness Index , Single-Blind Method , Speech Production Measurement , Thyroid Neoplasms/pathology , Thyroidectomy/methods , Vocal Cords/physiopathology , Voice Disorders/etiology
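
The percentages in the RESULTS of entry 9 can be cross-checked with simple arithmetic. The sketch below is only a consistency check under two assumptions that the abstract does not state outright: every percentage uses the 50 patients evaluable at 1 week as its denominator, and "symptomatic" refers to the 15 patients with early subjective voice change. The counts 44 and 11 are derived from those assumptions and are not reported figures.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


evaluable = 50        # patients evaluable at 1 week (from the abstract)
early_symptoms = 15   # early subjective voice change
late_symptoms = 7     # subjective voice change at 3 months
any_objective = 42    # objective change in at least one parameter
multiple = 6          # objective changes in more than three measures
multiple_symptomatic = 4

# Derived under the assumptions above, not reported in the abstract.
few_or_none = evaluable - multiple                            # 44 patients
few_or_none_symptomatic = early_symptoms - multiple_symptomatic  # 11 patients

checks = {
    "early symptoms": pct(early_symptoms, evaluable),
    "late symptoms": pct(late_symptoms, evaluable),
    "any objective change": pct(any_objective, evaluable),
    "multiple changes": pct(multiple, evaluable),
    "symptomatic among multiple": pct(multiple_symptomatic, multiple),
    "symptomatic among three or fewer": pct(few_or_none_symptomatic, few_or_none),
}
for label, value in checks.items():
    print(f"{label}: {value}%")
```

Under these assumptions the derived 11 of 44 reproduces the 25% quoted for patients with three or fewer objective changes.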