ABSTRACT
Loss of the larynx significantly alters natural voice production, requiring alternative communication modalities and rehabilitation methods to restore speech intelligibility and improve the quality of life of affected individuals. This paper explores advances in alaryngeal speech enhancement to improve signal quality and reduce background noise, focusing on individuals who have undergone laryngectomy. Speech samples were obtained from 23 Lithuanian males who had undergone laryngectomy with secondary implantation of a tracheoesophageal prosthesis (TEP). A Pareto-optimized gated long short-term memory network was trained on tracheoesophageal speech data to recognize complex temporal connections and contextual information in speech signals. The system was able to distinguish between actual speech and various forms of noise and artifacts, resulting in a 25% drop in the mean signal-to-noise ratio compared to other approaches. According to acoustic analysis, the system significantly decreased the proportion of unvoiced frames from 40% to 10% while maintaining a stable proportion of voiced speech frames and stable average voicing evidence (the average voice evidence in voiced frames), indicating the accuracy of the approach in selectively attenuating noise and undesired speech artifacts while preserving important speech information.
ABSTRACT
OBJECTIVE: This study aimed to develop a Voice Wellness Index (VWI) application combining the acoustic voice quality index (AVQI) and glottal function index (GFI) data and to evaluate its reliability in quantitative voice assessment and normal versus pathological voice differentiation. STUDY DESIGN: Cross-sectional study. METHODS: A total of 135 adult participants (86 patients with voice disorders and 49 individuals with normal voices) were included in this study. Five iOS and Android smartphones with the "Voice Wellness Index" app installed were used to estimate VWI. The VWI data obtained using smartphones were compared with VWI measurements computed from voice recordings collected from a reference studio microphone. The diagnostic efficacy of VWI in differentiating between normal and disordered voices was assessed using receiver operating characteristic (ROC) analysis. RESULTS: With a Cronbach's alpha of 0.972 and an ICC of 0.972 (0.964-0.979), the VWI scores of the individual smartphones demonstrated remarkable inter-smartphone agreement and reliability. The VWI data obtained from different smartphones and a studio microphone showed nearly perfect direct linear correlations (r = 0.993-0.998). Depending on the individual smartphone device used, the VWI cutoff scores for differentiating between normal and pathological voice groups were 5.6-6.0, giving the best balance between sensitivity (94.10-95.15%) and specificity (93.68-95.72%). The diagnostic accuracy was excellent in all cases, with an area under the curve (AUC) of 0.970-0.974. CONCLUSION: The "Voice Wellness Index" application is an accurate and reliable tool for voice quality measurement and normal versus pathological voice screening and has considerable potential to be used by healthcare professionals and patients for voice assessment.
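For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cronbach's alpha could be computed when each device is treated as an "item" and each participant as a row. The function name `cronbach_alpha` and the numbers in `example_scores` are hypothetical illustrations, not the study's data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one per participant; each row holds one score
    per item (here, one VWI value per device). Uses sample variances.
    """
    k = len(scores[0])                      # number of items (devices)
    columns = list(zip(*scores))            # transpose: one tuple per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical scores: 4 participants measured on 3 devices.
example_scores = [
    [2.1, 2.3, 2.0],
    [5.9, 6.1, 6.0],
    [7.4, 7.2, 7.5],
    [3.3, 3.1, 3.4],
]
print(round(cronbach_alpha(example_scores), 3))
```

Values close to 1 indicate that the devices rank and score participants almost identically, which is the sense in which the abstract reports inter-smartphone reliability.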
ABSTRACT
The problem of cleaning impaired speech is crucial for various applications such as speech recognition, telecommunication, and assistive technologies. In this paper, we propose a novel approach that combines Pareto-optimized deep learning with non-negative matrix factorization (NMF) to effectively reduce noise in impaired speech signals while preserving the quality of the desired speech. Our method begins by calculating the spectrogram of a noisy voice clip and extracting frequency statistics. A threshold is then determined based on the desired noise sensitivity, and a noise-to-signal mask is computed. This mask is smoothed to avoid abrupt transitions in noise levels, and the modified spectrogram is obtained by applying the smoothed mask to the signal spectrogram. We then employ a Pareto-optimized NMF to decompose the modified spectrogram into basis functions and corresponding weights, which are used to reconstruct the clean speech spectrogram. The final noise-reduced waveform is obtained by inverting the clean speech spectrogram. Our proposed method achieves a balance between various objectives, such as noise suppression, speech quality preservation, and computational efficiency, by leveraging Pareto optimization in the deep learning model. The experimental results demonstrate the effectiveness of our approach in cleaning alaryngeal speech signals, making it a promising solution for various real-world applications.
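The processing chain described in this abstract (spectrogram, frequency statistics, thresholded and smoothed mask, NMF decomposition) can be sketched roughly as follows. This is an illustrative outline only, assuming NumPy, a mean-plus-scaled-standard-deviation threshold, a moving-average smoother, and plain multiplicative-update NMF; the function names and parameters are hypothetical, the Pareto optimization step is not shown, and the final inversion to a waveform is omitted.

```python
import numpy as np


def stft_magnitude(x, n_fft=256, hop=128):
    """Return magnitude and phase spectrograms, shape (freq, time)."""
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.array([x[s:s + n_fft] * window for s in starts])
    spec = np.fft.rfft(frames, axis=1).T
    return np.abs(spec), np.angle(spec)


def spectral_gate(mag, sensitivity=1.0, smooth=3):
    """Threshold each frequency bin on its own statistics and apply a
    time-smoothed mask (1 keeps a bin, 0 suppresses it)."""
    mean = mag.mean(axis=1, keepdims=True)
    std = mag.std(axis=1, keepdims=True)
    threshold = mean + sensitivity * std      # assumed threshold rule
    mask = (mag > threshold).astype(float)
    kernel = np.ones(smooth) / smooth         # moving average over time
    mask = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, mask)
    mask = np.clip(mask, 0.0, 1.0)
    return mag * mask


def nmf(V, rank=4, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V into W @ H using multiplicative
    updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic demo signal (a tone plus noise), not real speech data.
rng = np.random.default_rng(1)
t = np.arange(4096) / 8000.0
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(t.size)
mag, phase = stft_magnitude(noisy)
W, H = nmf(spectral_gate(mag))
clean_mag = W @ H   # basis functions times weights: cleaned spectrogram
```

In a complete system, `clean_mag` would be combined with `phase` and passed through an inverse short-time Fourier transform to obtain the noise-reduced waveform.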
ABSTRACT
The aim of the study was to develop a universal-platform-based (UPB) application suitable for different smartphones for estimation of the Acoustic Voice Quality Index (AVQI) and to evaluate its reliability in AVQI measurements and in differentiating normal and pathological voices. Our study group consisted of 135 adult individuals, including 49 with normal voices and 86 patients with pathological voices. The developed UPB "Voice Screen" application installed on five iOS and Android smartphones was used for AVQI estimation. The AVQI measures calculated from voice recordings obtained from a reference studio microphone were compared with AVQI results obtained using smartphones. The diagnostic accuracy of differentiating normal and pathological voices was evaluated by applying receiver operating characteristic analysis. One-way ANOVA did not detect statistically significant differences between mean AVQI scores obtained using a studio microphone and different smartphones (F = 0.759; p = 0.58). Almost perfect direct linear correlations (r = 0.987-0.991) were observed between the AVQI results obtained with a studio microphone and different smartphones. The AVQI showed an acceptable level of precision in discriminating between normal and pathological voices, with areas under the curve (AUC) of 0.834-0.862. There were no statistically significant differences between the AUCs (p > 0.05) obtained from studio and smartphone microphones. The largest difference between the AUCs was only 0.028. The UPB "Voice Screen" application represented an accurate and robust tool for voice quality measurements and normal vs. pathological voice screening purposes, demonstrating the potential to be used by patients and clinicians for voice assessment, employing both iOS and Android smartphones.
ABSTRACT
The infiltrative growth pattern of desmoid tumors and their proximity to important anatomical structures make them difficult to manage. Mutilating surgery should be avoided, while surveillance or radiotherapy remain valid options.
ABSTRACT
This study aimed to develop a multidimensional model for the evaluation of substitution voicing (SV) after laryngeal oncosurgery. The study group consisted of 121 adult male individuals: 59 patients with SV after laryngeal oncosurgery (endolaryngeal cordectomy, partial laryngectomy, total laryngectomy with tracheoesophageal prosthesis) and 62 healthy controls. A multidimensional protocol for the assessment of SV included: 1) self-reported speech evaluation with a short version of the Speech Handicap Index, 2) auditory-perceptual assessment, and 3) acoustic speech analysis using AMPEX® (Auditory Model Based Pitch Extractor) software. Moderate correlations were observed between parameters from the self-reported, auditory-perceptual, and acoustic speech analysis domains. The multidimensional Substitution Voicing Index (SVI), including markers from these domains, was elaborated by using linear stepwise regression to determine the optimal set of parameters for categorising SV patients. The lowest mean SVI score was found in the control subgroup, corresponding to normal speech, followed by the cordectomy subgroup and the partial laryngectomy subgroup. The highest mean SVI score was found in the total laryngectomy subgroup, reflecting the most severely deteriorated quality of SV. One-way analysis of variance identified statistically significant differences between the mean SVI scores in separate subgroups. The results demonstrated the potential benefits of the SVI for a multidimensional evaluation of SV in patients after laryngeal oncosurgery.
Subject(s)
Voice Disorders , Voice , Adult , Humans , Male , Speech , Voice Quality , Laryngectomy/methods
ABSTRACT
The study aimed to investigate and compare the accuracy and robustness of the multiparametric acoustic voice indices (MAVIs), namely the Dysphonia Severity Index (DSI), Acoustic Voice Quality Index (AVQI), Acoustic Breathiness Index (ABI), and Voice Wellness Index (VWI), in differentiating normal and dysphonic voices. The study group consisted of 129 adult individuals, including 49 with normal voices and 80 patients with pathological voices. The diagnostic accuracy of the investigated MAVIs in differentiating between normal and pathological voices was assessed using receiver operating characteristic (ROC) analysis. Moderate to strong positive linear correlations were observed between different MAVIs. The ROC analysis revealed that all measurements showed a high level of accuracy (area under the curve (AUC) of 0.80 and greater) and an acceptable level of sensitivity and specificity in discriminating between normal and pathological voices. However, with an AUC of 0.99, the VWI demonstrated the highest diagnostic accuracy. The highest Youden index equaled 0.93, revealing that a VWI cut-off of 4.45 corresponds with highly acceptable sensitivity (97.50%) and specificity (95.92%). In conclusion, the VWI was found to be beneficial in describing differences in voice quality status and discriminating between normal and dysphonic voices based on clinical diagnosis, i.e., dysphonia type, implying the VWI's reliable voice screening potential.
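The Youden index quoted above follows directly from the reported sensitivity and specificity (J = sensitivity + specificity - 1). The short sketch below checks that arithmetic and shows one common way a cut-off could be chosen by maximizing J. The helper names and the toy scores are hypothetical, and the assumption that higher index values indicate a more disordered voice is made only for illustration.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic for a single cut-off."""
    return sensitivity + specificity - 1.0


def best_cutoff(normal_scores, disordered_scores):
    """Scan observed scores as candidate cut-offs and return the pair
    (cut-off, J) with the largest J. A score >= cut-off is classified
    as disordered (assumption: higher score means worse voice)."""
    best = (None, -1.0)
    for c in sorted(set(normal_scores) | set(disordered_scores)):
        sens = sum(s >= c for s in disordered_scores) / len(disordered_scores)
        spec = sum(s < c for s in normal_scores) / len(normal_scores)
        j = youden_index(sens, spec)
        if j > best[1]:
            best = (c, j)
    return best


# Check against the values reported in the abstract.
print(round(youden_index(0.9750, 0.9592), 2))  # 0.93

# Toy data, not from the study.
print(best_cutoff([1, 2, 3, 4], [3.5, 5, 6, 7]))  # (3.5, 0.75)
```

The reported percentages are also consistent with 78 of 80 patients and 47 of 49 controls being classified correctly, although the abstract does not state these counts.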
ABSTRACT
Laryngeal carcinoma is the most common malignant tumor of the upper respiratory tract. Total laryngectomy permanently and completely separates the upper and lower airways, which causes loss of voice and leaves the patient unable to communicate verbally in the postoperative period. This paper aims to exploit modern areas of deep learning research to objectively classify, extract, and measure substitution voicing after laryngeal oncosurgery from the audio signal. We propose using well-known convolutional neural networks (CNNs), originally applied to image classification, for the analysis of the voice audio signal. Our approach takes a Mel-frequency spectrogram (MFCC) as the input to a deep neural network architecture. A database of digital speech recordings of 367 male subjects (279 normal speech samples and 88 pathological speech samples) was used. Our approach showed the best true-positive rate of all the compared state-of-the-art approaches, achieving an overall accuracy of 89.47%.
ABSTRACT
PURPOSE: The study aimed to evaluate the impact of different variables on the longevity of Voice Prosthesis (VP) in patients after total laryngectomy. PATIENTS AND METHODS: This retrospective cohort study is based on data from a continuous series of 328 third-generation VPs implanted between 2016 and 2020. Data about the VP users' age, sex, place of residence, laryngeal tumor stage, neck irradiation, VP size, and the use of Heat and Moisture Exchanger (HME) were obtained and analyzed. The effect of these variables on VP lifetime was determined. RESULTS: The median lifetime of VPs in patients 65 years old and above was 182 days (95% CI 168-196), versus 146 days (95% CI 130-162) in patients younger than 65 (P = 0.033). Neck irradiation was associated with a longer VP median lifetime of 161 days (95% CI 142-180) compared to 126 days (95% CI 100-152) with no prior neck irradiation (P = 0.046). HME usage was associated with significantly increased longevity of VPs: 182 days (95% CI 156-208) with HME and 149 days (95% CI 132-166) without HME usage (P = 0.039). CONCLUSION: The results of the present study suggest that neck irradiation and routine use of HME are positively associated with the longevity of VPs.
ABSTRACT
OBJECTIVE: To assess correlations between auditory-perceptual and self-reported speech evaluation methods for substitution voicing (SV) and to investigate the robustness of these methods in a clinical setting. METHODS: Fifty-nine male patients who underwent laryngeal oncosurgery and 62 healthy male controls were included in this prospective study. Lithuanian versions of the Speech Handicap Index (SHI-LT) and Impression of voice quality (I), Impression of intelligibility (I), Unintended additive Noise (N), Fluency (F), and Quality of Voicing (Vo) scale (IINFVo-LT) were used to assess and compare self-reported and auditory-perceptual evaluations of SV. Speech samples were rated by a panel of experienced raters. RESULTS: The IINFVo-LT revealed good inter-rater reliability (ICC = 0.825) and intrarater reliability over time (ICC = 0.976) when assessing SV. Statistically significant differences (P < 0.05) of the mean scores of IINFVo-LT among the cordectomy, partial laryngectomy (22.52 [SD 9.98]), tracheoesophageal prosthesis (16.92 [SD 10.71]), and control (48.01 [SD 2.88]) groups confirmed the usefulness of IINFVo-LT for SV rating. A moderate negative correlation (r = -0.61; P < 0.001) demonstrated good concurrent validity between the IINFVo-LT and the SHI-LT total scores. A statistically significant, strong, negative correlation (r = -0.74) was obtained between the IINFVo-LT and SHI-LT speech handicap grade (P < 0.001), demonstrating good concurrent validity. CONCLUSION: The combination of IINFVo-LT and SHI-LT represents a potentially valuable and robust tool for evaluating SV and is helpful for assessing the degree of speech abnormality after laryngeal oncosurgery and its impact on patients' quality of life.
Subject(s)
Quality of Life , Speech , Humans , Laryngectomy/adverse effects , Male , Prospective Studies , Reproducibility of Results , Self Report , Speech Intelligibility
ABSTRACT
OBJECTIVE: To assess the diagnostic value of the Lithuanian version of the Glottal Function Index (GFI-LT) questionnaire in pediatric dysphonia screening. METHODS: The GFI-LT was completed by 82 children (7-16 years old): 41 patients with voice disorders (patient group) and 41 healthy subjects (control group). Auditory-perceptual evaluation of voice was performed using the Grade Roughness Breathiness (GRB) protocol. Acoustic voice analysis was performed for F0, SDF0, jitter, shimmer and NNE using Dr. Speech, Tiger Elemetrics software. Receiver operating characteristic statistics were used to evaluate the diagnostic accuracy of differentiating normal and dysphonic voices. RESULTS: Perceptually, dysphonia was found in all children of the patient group. Grade I (65.9%) was the most prevalent (p > 0.05). No dysphonia was detected in the control group. Acoustic voice analysis revealed that all acoustic voice parameters were statistically significantly (p < 0.001) worse in the patient group than in the control group. Statistically significant (p < 0.05) strong or moderate correlations were found between the GFI-LT, auditory-perceptual rating and all acoustic voice parameters of the patient group. The strongest correlations were observed between GFI-LT and G (r = 0.70), R (r = 0.69), jitter (r = 0.56) and SDF0 (r = 0.56). No statistically significant correlations between GFI-LT and the children's age or gender were found (p > 0.05). The GFI-LT cut-off score of ≥3.0 was associated with excellent test accuracy (AUC = 0.961) in distinguishing children with voice disorders from healthy controls, resulting in a balance between sensitivity and specificity (95.1% and 85.4%, respectively). CONCLUSION: GFI-LT is considered to be a valid and reliable tool for self-assessment and screening of voice disorders in children.
Subject(s)
Dysphonia/diagnosis , Speech Acoustics , Surveys and Questionnaires , Voice Quality , Adolescent , Auditory Perception , Case-Control Studies , Child , Female , Humans , Lithuania , Male , ROC Curve , Speech Production Measurement
ABSTRACT
OBJECTIVE: To culturally adapt the Speech Handicap Index (SHI) questionnaire to the Lithuanian language and validate it. METHODS: Cultural adaptation and validation of the translated Lithuanian version of the SHI (SHI-LT) was performed as described by the Scientific Advisory Committee of the Medical Outcomes Trust. The SHI-LT was completed by 46 patients after total laryngectomy and by 60 healthy subjects of the control group. Validity and reliability of the SHI-LT were evaluated. RESULTS: The SHI-LT showed statistically significant, high internal consistency and test-retest reliability (Cronbach's α = 0.96-0.98). Good validity of SHI-LT was reflected by a statistically significant (P < 0.001) difference between the mean scores of the patient and control groups (74.7 ± 26.9 and 5.5 ± 6.5, respectively). No age or gender dependence of SHI-LT was found (P > 0.05). Receiver operating characteristic analysis indicated that SHI-LT scores exceeding 17.0 points (cutoff value) distinguish patients from healthy controls, with a sensitivity of 97.8% and specificity of 95.0%. CONCLUSION: SHI-LT is considered to be a valid and reliable speech assessment tool for Lithuanian-speaking patients after laryngectomy.
Subject(s)
Disability Evaluation , Laryngectomy/adverse effects , Speech Acoustics , Surveys and Questionnaires , Voice Disorders/diagnosis , Voice Quality , Aged , Area Under Curve , Case-Control Studies , Cultural Characteristics , Female , Humans , Lithuania , Male , Middle Aged , Predictive Value of Tests , ROC Curve , Reproducibility of Results , Translating , Voice Disorders/etiology , Voice Disorders/physiopathology
ABSTRACT
BACKGROUND AND OBJECTIVE: The literature lacks data about the evaluation of throat-related symptoms proving chronic tonsillitis as the most common indication for adult tonsillectomy. Therefore, the aim of this study was to assess the most important throat-related symptoms suggestive of chronic tonsillitis in adults. MATERIAL AND METHODS: A prospective cohort study was carried out. The analysis of throat-related symptoms (complaints, tonsillitis rate, pharyngeal signs, and antistreptolysin-O titer) in 81 adults with histologically confirmed chronic tonsillitis was conducted. RESULTS: Recurrent tonsillitis was the most common complaint (74.1%). The mean number of tonsillitis episodes was 3.6 (SD, 1.9) times per year. There were no significant differences comparing the frequencies of all the analyzed pharyngeal signs (P>0.05). The antistreptolysin-O titer (mean, 279.8; SD, 211.6 UL) was pathological in 33.3% of patients. The study identified the most important throat-related symptoms revealing chronic tonsillitis: tonsillar cryptic debris (OR, 8.84; 95% CI, 1.93-40.53; P=0.005) and enlarged anterior cervical lymph nodes along with the frequency of tonsillitis episodes exceeding 3 times per year (OR, 8.27; 95% CI, 1.33-51.57; P=0.024). The classification accuracy of 85.2% was obtained. CONCLUSIONS: Tonsillar cryptic debris and enlarged regional lymph nodes along with recurrent tonsillitis could support the diagnosis of chronic tonsillitis in adults when considering tonsillectomy.