1.
J Acoust Soc Am ; 155(3): 1694-1703, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38426839

ABSTRACT

The cochlear implant (CI) is currently a vital technological device for helping deaf patients hear sounds, and it greatly enhances their appreciation of sound. Unfortunately, it performs poorly for music listening because of its insufficient number of electrodes and inaccurate identification of music features. Therefore, this study applied source separation technology with a self-adjustment function to enhance the music listening benefits for CI users. In the objective analysis, the source-to-distortion, source-to-interference, and source-to-artifact ratios were 4.88, 5.92, and 15.28 dB, respectively, significantly better than those of the Demucs baseline model. In the subjective analysis, the proposed method scored higher than the traditional baseline method VIR6 (vocal to instrument ratio, 6 dB) by approximately 28.1 and 26.4 (out of 100) in the multi-stimulus test with hidden reference and anchor test, respectively. The experimental results showed that the proposed method can benefit CI users in identifying music in a live concert, and that the personal self-fitting signal separation method gave better results than either default baseline (vocal to instrument ratio of 6 dB or of 0 dB). This finding suggests that the proposed system is a potential method for enhancing the music listening benefits for CI users.
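
As a rough illustration of the objective metrics named in this abstract, the sketch below computes a simplified source-to-distortion ratio in dB, 10·log10(‖s‖² / ‖s − ŝ‖²). This is an assumption made for illustration only: the full BSS Eval definitions split the error into interference and artifact terms, which this sketch does not do, and the function name `simple_sdr_db` and the toy signals are hypothetical rather than taken from the study.

```python
import math

def simple_sdr_db(reference, estimate):
    """Simplified source-to-distortion ratio in dB (illustrative, not BSS Eval)."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0:
        return math.inf      # perfect reconstruction
    if signal_energy == 0:
        return -math.inf     # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)

# Toy data (made up): a 440 Hz tone sampled at 16 kHz, and an estimate
# that differs from it by a constant offset of 0.1.
reference = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
estimate = [r + 0.1 for r in reference]
sdr = simple_sdr_db(reference, estimate)
```

A higher value means the estimate is closer to the reference source; the toy offset here gives roughly 17 dB.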


Subject(s)
Cochlear Implantation, Cochlear Implants, Deafness, Deep Learning, Music, Humans, Deafness/rehabilitation, Auditory Perception
2.
Sensors (Basel) ; 23(5)2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36904641

ABSTRACT

Mechanisms underlying exercise-induced muscle fatigue and recovery are dependent on peripheral changes at the muscle level and improper control of motoneurons by the central nervous system. In this study, we analyzed the effects of muscle fatigue and recovery on the neuromuscular network through the spectral analysis of electroencephalography (EEG) and electromyography (EMG) signals. A total of 20 healthy right-handed volunteers performed an intermittent handgrip fatigue task. In the prefatigue, postfatigue, and postrecovery states, the participants contracted a handgrip dynamometer with sustained 30% maximal voluntary contractions (MVCs); EEG and EMG data were recorded. A considerable decrease was noted in EMG median frequency in the postfatigue state compared with the findings in other states. Furthermore, the EEG power spectral density of the right primary cortex exhibited a prominent increase in the gamma band. Muscle fatigue led to increases in the beta and gamma bands of contralateral and ipsilateral corticomuscular coherence, respectively. Moreover, a decrease was noted in corticocortical coherence between the bilateral primary motor cortices after muscle fatigue. EMG median frequency may serve as an indicator of muscle fatigue and recovery. Coherence analysis revealed that fatigue reduced the functional synchronization among bilateral motor areas but increased that between the cortex and muscle.
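
The EMG median frequency mentioned in this abstract is the frequency that divides the signal's spectral power into two equal halves. The sketch below shows one way to compute it, assuming a direct DFT on a short toy signal; the function names, the sampling rate, and the two-tone test signals are hypothetical and do not reproduce the study's processing pipeline.

```python
import cmath
import math

def power_spectrum(samples):
    """One-sided power spectrum via a direct DFT (O(n^2); fine for short toy signals)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def median_frequency(samples, sample_rate):
    """Frequency (Hz) at which cumulative spectral power reaches half of the total."""
    power = power_spectrum(samples)
    half_total = sum(power) / 2.0
    cumulative = 0.0
    for k, p in enumerate(power):
        cumulative += p
        if cumulative >= half_total:
            return k * sample_rate / len(samples)
    return 0.0

def two_tone(amp_50hz, amp_100hz, sample_rate=400, n=200):
    """Made-up stand-in for an EMG segment: a 50 Hz plus a 100 Hz component."""
    return [amp_50hz * math.sin(2 * math.pi * 50 * i / sample_rate)
            + amp_100hz * math.sin(2 * math.pi * 100 * i / sample_rate)
            for i in range(n)]

# More power at 100 Hz ("fresh") versus more power at 50 Hz ("fatigued").
fresh_mdf = median_frequency(two_tone(1.0, 2.0), 400)
fatigued_mdf = median_frequency(two_tone(2.0, 1.0), 400)
```

In this toy case the median frequency drops from 100 Hz to 50 Hz when power shifts toward the lower component, which is the direction of change the abstract reports after fatigue.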


Subject(s)
Motor Cortex, Muscle Fatigue, Humans, Muscle Fatigue/physiology, Electromyography, Skeletal Muscle/physiology, Hand Strength/physiology, Electroencephalography, Motor Cortex/physiology
3.
Sensors (Basel) ; 22(19)2022 Sep 27.
Article in English | MEDLINE | ID: mdl-36236430

ABSTRACT

With the development of active noise cancellation (ANC) technology, ANC has been used to mitigate the effects of environmental noise on audiometric results. However, objective evaluation methods supporting the accuracy of audiometry for ANC exposure to different levels of noise have not been reported. Accordingly, the audio characteristics of three different ANC headphone models were quantified under different noise conditions and the feasibility of ANC in noisy environments was investigated. Steady (pink noise) and non-steady noise (cafeteria babble noise) were used to simulate noisy environments. We compared the integrity of pure-tone signals obtained from three different ANC headphone models after processing under different noise scenarios and analyzed the degree of ANC signal correlation based on the Pearson correlation coefficient compared to pure-tone signals in quiet. The objective signal correlation results were compared with audiometric screening results to confirm the correspondence. Results revealed that ANC helped mitigate the effects of environmental noise on the measured signal and the combined ANC headset model retained the highest signal integrity. The degree of signal correlation was used as a confidence indicator for the accuracy of hearing screening in noise results. It was found that the ANC technique can be further improved for more complex noisy environments.
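
The signal-correlation measure described in this abstract is the Pearson correlation coefficient between a processed signal and the pure tone recorded in quiet. A minimal sketch follows, assuming both signals are already time-aligned and of equal length; `pearson_r` and the toy signals are hypothetical and are not the recordings used in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Toy data (made up): a 1 kHz pure tone at 8 kHz sampling rate.
fs, n = 8000, 800
clean = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]
# Attenuated and offset copy: the tone shape is fully preserved.
scaled = [0.8 * c + 0.05 for c in clean]
# Copy with an equally strong 250 Hz interferer left in the signal.
noisy = [c + math.sin(2 * math.pi * 250 * i / fs) for i, c in enumerate(clean)]

r_scaled = pearson_r(clean, scaled)
r_noisy = pearson_r(clean, noisy)
```

A value near 1 indicates that the tone survives processing intact, while residual noise pulls the coefficient down (to about 0.71 in the toy interferer case).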


Subject(s)
Mass Screening, Noise, Pure-Tone Audiometry/methods, Feasibility Studies, Hearing
4.
J Med Internet Res ; 23(10): e25460, 2021 10 28.
Article in English | MEDLINE | ID: mdl-34709193

ABSTRACT

BACKGROUND: Cochlear implant technology is a well-known approach to help deaf individuals hear speech again and can improve speech intelligibility in quiet conditions; however, it still has room for improvement in noisy conditions. More recently, it has been proven that deep learning-based noise reduction, such as noise classification and deep denoising autoencoder (NC+DDAE), can benefit the intelligibility performance of patients with cochlear implants compared to classical noise reduction algorithms. OBJECTIVE: Following the successful implementation of the NC+DDAE model in our previous study, this study aimed to propose an advanced noise reduction system using knowledge transfer technology, called NC+DDAE_T; examine the proposed NC+DDAE_T noise reduction system using objective evaluations and subjective listening tests; and investigate which layer substitution of the knowledge transfer technology in the NC+DDAE_T noise reduction system provides the best outcome. METHODS: The knowledge transfer technology was adopted to reduce the number of parameters of the NC+DDAE_T compared with the NC+DDAE. We investigated which layer should be substituted using short-time objective intelligibility and perceptual evaluation of speech quality scores as well as t-distributed stochastic neighbor embedding to visualize the features in each model layer. Moreover, we enrolled 10 cochlear implant users for listening tests to evaluate the benefits of the newly developed NC+DDAE_T. RESULTS: The experimental results showed that substituting the middle layer (ie, the second layer in this study) of the noise-independent DDAE (NI-DDAE) model achieved the best performance gain regarding short-time objective intelligibility and perceptual evaluation of speech quality scores. Therefore, the parameters of layer 3 in the NI-DDAE were chosen to be replaced, thereby establishing the NC+DDAE_T. 
Both objective and listening test results showed that the proposed NC+DDAE_T noise reduction system achieved similar performances compared with the previous NC+DDAE in several noisy test conditions. However, the proposed NC+DDAE_T only required a quarter of the number of parameters compared to the NC+DDAE. CONCLUSIONS: This study demonstrated that knowledge transfer technology can help reduce the number of parameters in an NC+DDAE while keeping similar performance rates. This suggests that the proposed NC+DDAE_T model may reduce the implementation costs of this noise reduction system and provide more benefits for cochlear implant users.


Subject(s)
Cochlear Implantation, Cochlear Implants, Speech Perception, Humans, Noise, Speech Intelligibility
5.
Sensors (Basel) ; 22(1)2021 Dec 31.
Article in English | MEDLINE | ID: mdl-35009834

ABSTRACT

Human motion tracking is widely applied to rehabilitation tasks, and inertial measurement unit (IMU) sensors are a well-known approach for recording motion behavior. IMU sensors can provide accurate information regarding three-dimensional (3D) human motion. However, IMU sensors must be attached to the body, which can be inconvenient or uncomfortable for users. To alleviate this issue, vision-based tracking from two-dimensional (2D) RGB images has been studied extensively in recent years and has proven to perform suitably for human motion tracking. However, the 2D image system has its limitations: human motion consists of spatial changes, and the 3D motion features predicted from 2D images are correspondingly limited. In this study, we propose a deep learning (DL) human motion tracking technology using 3D image features with a deep bidirectional long short-term memory (DBLSTM) model. The experimental results show that, compared with the traditional 2D image system, the proposed system provides improved human motion tracking ability, with an RMSE in acceleration of less than 0.5 m/s² in the X, Y, and Z directions. These findings suggest that the proposed model is a viable approach for future human motion tracking applications.
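
The per-axis RMSE figure quoted in this abstract can be computed as sketched below, assuming predicted and reference accelerations are available as equal-length lists of (x, y, z) tuples in m/s². The function name `rmse_per_axis` and the sample values are hypothetical and are not data from the study.

```python
import math

def rmse_per_axis(predicted, reference):
    """Return (rmse_x, rmse_y, rmse_z) for two equal-length lists of (x, y, z) tuples."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    squared_error_sums = [0.0, 0.0, 0.0]
    for pred, ref in zip(predicted, reference):
        for axis in range(3):
            squared_error_sums[axis] += (pred[axis] - ref[axis]) ** 2
    return tuple(math.sqrt(s / len(reference)) for s in squared_error_sums)

# Made-up accelerations in m/s^2: IMU reference versus model prediction.
reference = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
predicted = [(0.3, 0.0, -0.4), (1.3, 2.0, 3.4)]
rmse_x, rmse_y, rmse_z = rmse_per_axis(predicted, reference)
```

Reporting the error separately per axis, as the abstract does, keeps a large error in one direction from being hidden by small errors in the other two.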


Subject(s)
Three-Dimensional Imaging, Short-Term Memory, Humans, Motion
6.
BMC Evol Biol ; 19(1): 212, 2019 Nov 20.
Article in English | MEDLINE | ID: mdl-31747896

ABSTRACT

Following publication of the original article [1], we have been notified that Additional file 3 was published with track changes.

7.
BMC Evol Biol ; 19(1): 64, 2019 02 27.
Article in English | MEDLINE | ID: mdl-30813905

ABSTRACT

BACKGROUND: Despite attempts to retrace the history of the Thao people in Taiwan using folktales, linguistics, physical anthropology, and ethnic studies, their history remains incomplete. The heritage of the Thao has been associated with the Pazeh Western plains peoples and several other mountain peoples of Taiwan. In the last 400 years, their culture and genetic profile have been reshaped by East Asian migrants. They were displaced by the Japanese and by the construction of a dam, and almost faced extinction. In this paper, genetic information from mitochondrial DNA (mtDNA), human leukocyte antigens (HLA), and the non-recombining Y chromosome of 30 Thao individuals is compared to that of 836 other Taiwan Mountain and Plains Aborigines (TwrIP & TwPp), 384 Non-Aboriginal Taiwanese (non-TwA) and 149 Continental East Asians. RESULTS: The phylogeographic analyses of mtDNA haplogroups F4b and B4b1a2 indicated gene flow between Thao, Bunun, and Tsou, and suggested a common ancestry from 10,000 to 3000 years ago. A claim of close contact with the heavily Sinicized Pazeh of the plains was not rejected and suggests that the plains and mountain peoples most likely shared the same Austronesian agriculturist gene pool in the Neolithic. CONCLUSIONS: Having moved repeatedly since their arrival in Taiwan between 6000 and 4500 years ago, the Thao finally settled in the central mountain range. They represent the last plains people whose strong bonds with their original culture allowed them to preserve their genetic heritage, despite significant gene flow from the mainland of Asia. Representing a considerable contribution to the genealogical history of the Thao people, the findings of this study bear on ongoing anthropological and linguistic debates on their origin.


Subject(s)
Asian People/genetics, Human Y Chromosomes/genetics, Genetic Variation, HLA Antigens/genetics, Mitochondrial DNA/chemistry, Mitochondrial DNA/genetics, Gene Flow, Population Genetics, Haplotypes, Humans, Male, Phylogeography, DNA Sequence Analysis, Taiwan
8.
Ear Hear ; 39(4): 795-809, 2018.
Article in English | MEDLINE | ID: mdl-29360687

ABSTRACT

OBJECTIVE: We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach under noisy conditions with challenging noise types at low signal to noise ratio (SNR) levels for Mandarin-speaking cochlear implant (CI) recipients. DESIGN: The deep learning-based NR approach used in this study consists of two modules: noise classifier (NC) and deep denoising autoencoder (DDAE), thus termed (NC + DDAE). In a series of comprehensive experiments, we conduct qualitative and quantitative analyses on the NC module and the overall NC + DDAE approach. Moreover, we evaluate the speech recognition performance of the NC + DDAE NR and classical single-microphone NR approaches for Mandarin-speaking CI recipients under different noisy conditions. The testing set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise, and a construction jackhammer noise, at 0 and 5 dB SNR levels. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We qualitatively compare the NR approaches by the amplitude envelope and spectrogram plots of the processed utterances. Quantitative objective measures include (1) normalized covariance measure to test the intelligibility of the utterances processed by each of the NR approaches; and (2) speech recognition tests conducted by nine Mandarin-speaking CI recipients. These nine CI recipients use their own clinical speech processors during testing. RESULTS: The experimental results of objective evaluation and listening test indicate that under challenging listening conditions, the proposed NC + DDAE NR approach yields higher intelligibility scores than the two compared classical NR techniques, under both matched and mismatched training-testing conditions. 
CONCLUSIONS: When compared to the two well-known conventional NR techniques under challenging listening condition, the proposed NC + DDAE NR approach has superior noise suppression capabilities and gives less distortion for the key speech envelope information, thus, improving speech recognition more effectively for Mandarin CI recipients. The results suggest that the proposed deep learning-based NR approach can potentially be integrated into existing CI signal processors to overcome the degradation of speech perception caused by noise.


Subject(s)
Cochlear Implantation, Cochlear Implants, Deafness/rehabilitation, Deep Learning, Noise, Speech Perception, Adult, Child, Female, Humans, Male, Middle Aged, Signal-to-Noise Ratio, Young Adult
9.
BMC Genet ; 15: 77, 2014 Jun 26.
Article in English | MEDLINE | ID: mdl-24965575

ABSTRACT

BACKGROUND: The resolution of data on the haploid non-recombining Y chromosome (NRY) haplogroup O in East Asia is still largely rudimentary and could be an explanatory factor for current debates on the settlement history of Island Southeast Asia (ISEA). Here, 81 slowly evolving markers (mostly SNPs) and 17 Y-chromosomal short tandem repeats were used to achieve higher level molecular resolution. Our aim is to investigate whether the distribution of NRY DNA variation in Taiwan and ISEA is consistent with a single pre-Neolithic expansion scenario from Southeast China to all ISEA, whether it better fits an expansion model from Taiwan (the OOT model), or whether a more complex history of settlement and dispersals throughout ISEA should be envisioned. RESULTS: We examined DNA samples from 1658 individuals from Vietnam, Thailand, Fujian, Taiwan (Han, plain tribes and 14 indigenous groups), the Philippines and Indonesia. While haplogroups O1a*-M119, O1a1*-P203, O1a2-M50 and O3a2-P201 follow a decreasing cline from Taiwan towards Western Indonesia, O2a1-M95/M88, O3a*-M324, O3a1c-IMS-JST002611 and O3a2c1a-M133 decline northward from Western Indonesia towards Taiwan. Compared to the Taiwan plain tribe minority groups, the Taiwanese Austronesian speaking groups show little genetic paternal contribution from Han. They are also characterized by low Y-chromosome diversity, thus testifying to fast drift in these populations. However, in contrast to data provided from other regions of the genome, Y-chromosome gene diversity in Taiwan mountain tribes significantly increases from North to South. 
CONCLUSION: The geographic distribution and the diversity accumulated in the O1a*-M119, O1a1*-P203, O1a2-M50 and O3a2-P201 haplogroups on one hand, and in the O2a1-M95/M88, O3a*-M324, O3a1c-IMS-JST002611 and O3a2c1a-M133 haplogroups on the other, support a pincer model of dispersals and gene flow from the mainland to the islands which likely started during the late upper Paleolithic, 18,000 to 15,000 years ago. The branches of the pincer contributed separately to the paternal gene pool of the Philippines and conjointly to the gene pools of Madagascar and the Solomon Islands. The North to South increase in diversity found for Taiwanese Austronesian speaking groups contrasts with observations based on mitochondrial DNA, thus hinting at a differentiated demographic history of men and women in these populations.


Subject(s)
Human Y Chromosomes/genetics, Population Genetics, Haplotypes, Southeast Asia, Genetic Drift, Genetic Loci, Human Migration, Humans, Male, Microsatellite Repeats, Single Nucleotide Polymorphism, DNA Sequence Analysis, Taiwan
10.
J Am Acad Audiol ; 24(8): 671-83, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24131603

ABSTRACT

BACKGROUND: Multichannel wide-dynamic-range compression (WDRC) is a widely adopted amplification scheme in modern digital hearing aids. It attempts to provide individuals with loudness recruitment with superior speech intelligibility and greater listening comfort over a wider range of input levels. However, recent surveys have shown that compression processing (operating in the nonlinear regime) usually reduces the long-term signal-to-noise ratio (SNR). PURPOSE: The purpose of this study was to determine the long-term SNR in an adaptive compression-ratio (CR) amplification scheme called adaptive wide-dynamic-range compression (AWDRC), and to determine whether this concept is better than static WDRC amplification at improving the long-term SNR for speech in noise. DESIGN AND STUDY SAMPLE: AWDRC uses the input short-term dynamic range to adjust the CR to maximize audibility and comfort. Various methods for evaluating the long-term SNR were used to observe the relationship between the CR and output SNR performance in AWDRC for seven typical audiograms, and to compare the results with those for static WDRC amplification. RESULTS: The results showed that the variation of the CR in AWDRC amplification can maintain the comfort and audibility of the output sound. In addition, the average long-term SNR improved by 0.1-5.5 dB for a flat hearing loss, by 0.2-3.4 dB for a reverse sloping hearing loss, by 1.4-4.8 dB for a high-frequency hearing loss, and by 0.3-5.7 dB for a mild-to-moderate-sloping high-frequency hearing loss relative to static WDRC amplification. The output long-term SNR differed significantly (p < .001) between static WDRC and AWDRC amplification. CONCLUSIONS: The results of this study show that AWDRC, which uses the characteristics of the input signal to adaptively adjust the CR, provides better long-term SNR performance than static WDRC amplification.
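
To make the role of the compression ratio (CR) in this abstract concrete, the sketch below implements a generic static compression input-output curve: linear gain below a threshold, and a slope of 1/CR above it. It is a crude, level-based illustration under assumed parameters (50 dB threshold, 20 dB gain), not the AWDRC algorithm or the long-term SNR measures used in the study; `wdrc_output_level` and all numeric values are hypothetical.

```python
def wdrc_output_level(input_db, threshold_db=50.0, gain_db=20.0, ratio=3.0):
    """Static compression curve: output level (dB) for a given input level (dB)."""
    if ratio < 1.0:
        raise ValueError("compression ratio must be >= 1")
    if input_db <= threshold_db:
        # Linear region: constant gain.
        return input_db + gain_db
    # Compression region: each extra input dB yields only 1/ratio output dB.
    return threshold_db + gain_db + (input_db - threshold_db) / ratio

def output_level_gap(speech_db, noise_db, ratio):
    """Level difference (dB) between speech and noise after static compression."""
    return (wdrc_output_level(speech_db, ratio=ratio)
            - wdrc_output_level(noise_db, ratio=ratio))

# Made-up levels: speech peaks at 70 dB and noise at 60 dB (10 dB apart at the input).
gap_high_cr = output_level_gap(70.0, 60.0, ratio=3.0)   # strong compression
gap_low_cr = output_level_gap(70.0, 60.0, ratio=1.5)    # milder compression
```

With both levels above the threshold, the 10 dB input gap shrinks to about 3.3 dB at CR = 3 and about 6.7 dB at CR = 1.5, which gives an intuition for why a lower or adaptively chosen CR can preserve more of the speech-to-noise level difference.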


Subject(s)
Auditory Threshold, Hearing Aids/standards, High-Frequency Hearing Loss/physiopathology, Sensorineural Hearing Loss/physiopathology, Speech Perception/physiology, Equipment Design, Follow-Up Studies, High-Frequency Hearing Loss/rehabilitation, Sensorineural Hearing Loss/rehabilitation, Humans, Time Factors
11.
IEEE Trans Biomed Eng ; 70(12): 3330-3341, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37327105

ABSTRACT

OBJECTIVE: Although many speech enhancement (SE) algorithms have been proposed to promote speech perception in hearing-impaired patients, the conventional SE approaches that perform well under quiet and/or stationary noises fail under nonstationary noises and/or when the speaker is at a considerable distance. Therefore, the objective of this study is to overcome the limitations of the conventional speech enhancement approaches. METHOD: This study proposes a speaker-closed deep learning-based SE method together with an optical microphone to acquire and enhance the speech of a target speaker. RESULTS: The objective evaluation scores achieved by the proposed method outperformed the baseline methods by a margin of 0.21-0.27 and 0.34-0.64 in speech quality (HASQI) and speech comprehension/intelligibility (HASPI), respectively, for seven typical hearing loss types. CONCLUSION: The results suggest that the proposed method can enhance speech perception by cutting off noise from speech signals and mitigating interference caused by distance. SIGNIFICANCE: The results of this study show a potential way that can help improve the listening experience in enhancing speech quality and speech comprehension/intelligibility for hearing-impaired people.


Subject(s)
Cochlear Implants, Deep Learning, Hearing Aids, Hearing Loss, Speech Perception, Humans, Speech Intelligibility
12.
Article in English | MEDLINE | ID: mdl-37938964

ABSTRACT

Dysarthria, a speech disorder often caused by neurological damage, compromises the control of vocal muscles in patients, making their speech unclear and communication troublesome. Recently, voice-driven methods have been proposed to improve the speech intelligibility of patients with dysarthria. However, most methods require a significant representation of both the patient's and target speaker's corpus, which is problematic. This study aims to propose a data augmentation-based voice conversion (VC) system to reduce the recording burden on the speaker. We propose dysarthria voice conversion 3.1 (DVC 3.1) based on a data augmentation approach, including text-to-speech and StarGAN-VC architecture, to synthesize a large target and patient-like corpus to lower the burden of recording. An objective evaluation metric of the Google automatic speech recognition (Google ASR) system and a listening test were used to demonstrate the speech intelligibility benefits of DVC 3.1 under free-talk conditions. The DVC system without data augmentation (DVC 3.0) was used for comparison. Subjective and objective evaluation based on the experimental results indicated that the proposed DVC 3.1 system enhanced the Google ASR of two dysarthria patients by approximately [62.4%, 43.3%] and [55.9%, 57.3%] compared to unprocessed dysarthria speech and the DVC 3.0 system, respectively. Further, the proposed DVC 3.1 increased the speech intelligibility of two dysarthria patients by approximately [54.2%, 22.3%] and [63.4%, 70.1%] compared to unprocessed dysarthria speech and the DVC 3.0 system, respectively. The proposed DVC 3.1 system offers significant potential to improve the speech intelligibility performance of patients with dysarthria and enhance verbal communication quality.


Subject(s)
Dysarthria, Voice, Humans, Dysarthria/etiology, Speech Intelligibility/physiology, Laryngeal Muscles
13.
Asia Pac J Ophthalmol (Phila) ; 12(1): 21-28, 2023.
Article in English | MEDLINE | ID: mdl-36706331

ABSTRACT

PURPOSE: The aim was to develop a deep learning model for predicting the extent of visual impairment in epiretinal membrane (ERM) using optical coherence tomography (OCT) images, and to analyze the associated features. METHODS: Six hundred macular OCT images from eyes with ERM and no visually significant media opacity or other retinal diseases were obtained. Those with best-corrected visual acuity ≤20/50 were classified as "profound visual impairment," while those with best-corrected visual acuity >20/50 were classified as "less visual impairment." Ninety percent of images were used as the training data set and 10% were used for testing. Two convolutional neural network models (ResNet-50 and ResNet-18) were adopted for training. The t-distributed stochastic neighbor-embedding approach was used to compare their performances. The Grad-CAM technique was used in the heat map generative phase for feature analysis. RESULTS: During the model development, the training accuracy was 100% in both convolutional neural network models, while the testing accuracy was 70% and 80% for ResNet-18 and ResNet-50, respectively. The t-distributed stochastic neighbor-embedding approach found that the deeper structure (ResNet-50) had better discrimination on OCT characteristics for visual impairment than the shallower structure (ResNet-18). The heat maps indicated that the key features for visual impairment were located mostly in the inner retinal layers of the fovea and parafoveal regions. CONCLUSIONS: Deep learning algorithms could assess the extent of visual impairment from OCT images in patients with ERM. Changes in inner retinal layers were found to have a greater impact on visual acuity than the outer retinal changes.


Subject(s)
Deep Learning, Epiretinal Membrane, Humans, Epiretinal Membrane/diagnostic imaging, Optical Coherence Tomography/methods, Retina/diagnostic imaging, Vision Disorders/etiology, Retrospective Studies
14.
J Chin Med Assoc ; 86(1): 105-112, 2023 01 01.
Article in English | MEDLINE | ID: mdl-36300992

ABSTRACT

BACKGROUND: The population of young adults who are hearing impaired increases yearly, and a device that enables convenient hearing screening could help monitor their hearing. However, background noise is a critical issue that limits the capabilities of such a device. Therefore, this study evaluated the effectiveness of commercial active noise cancellation (ANC) headphones for hearing screening applications in the presence of background noise. In particular, six confounders were used for a comprehensive evaluation. METHODS: We enrolled 12 young adults (a total of 23 ears with normal hearing) to participate in this study. A cross-sectional self-controlled study was conducted to explore the effectiveness of hearing screening in the presence of background noise, with a total of 240 test conditions (=3 ANC models × 2 ANC function statuses × 2 noise types × 5 noise levels × 4 frequencies) for each test ear. Subsequently, a linear regression model was used to prove the effectiveness of ANC headphones for hearing screening applications in the presence of background noise with six confounders. RESULTS: The experimental results showed that, on average, the ANC function of headphones can improve the effectiveness of hearing screening tasks in the presence of background noise. Specifically, the statistical analysis showed that the ANC function enabled a significant 10% improvement ( p < 0.001) compared with no ANC function. CONCLUSION: This study confirmed the effectiveness of ANC headphones for young adult hearing screening applications in the presence of background noise. Furthermore, the statistical results confirmed that as confounding variables, noise type, noise level, hearing screening frequency, ANC headphone model, and sex all affect the effectiveness of the ANC function. These findings suggest that ANC is a potential means of helping users obtain high-accuracy hearing screening results in the presence of background noise. 
Moreover, we present possible directions of development for ANC headphones in future studies.
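
The count of 240 test conditions per ear given in the METHODS of this abstract is the size of a full factorial design (3 × 2 × 2 × 5 × 4). The sketch below enumerates such a design with `itertools.product`; the factor sizes come from the abstract, while every label (model names, noise types, level and frequency identifiers) is a hypothetical placeholder because the abstract does not list them.

```python
from itertools import product

# Factor sizes follow the abstract; the labels themselves are placeholders.
anc_models = ["model_1", "model_2", "model_3"]
anc_status = ["anc_on", "anc_off"]
noise_types = ["noise_type_1", "noise_type_2"]
noise_levels = [f"level_{i}" for i in range(1, 6)]
frequencies = [f"frequency_{i}" for i in range(1, 5)]

# Every combination of the five factors is one test condition for one ear.
conditions = list(product(anc_models, anc_status, noise_types,
                          noise_levels, frequencies))
n_conditions = len(conditions)
```

Listing the conditions explicitly like this also makes it straightforward to randomize presentation order or to check that no combination was skipped.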


Subject(s)
Hearing Loss, Noise, Young Adult, Humans, Pilot Projects, Cross-Sectional Studies, Noise/prevention & control, Hearing
15.
J Voice ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36732109

ABSTRACT

OBJECTIVE: Doctors currently rely primarily on auditory-perceptual evaluation, such as the grade, roughness, breathiness, asthenia, and strain scale, to evaluate voice quality and determine the treatment. However, the results predicted by individual physicians often differ because of subjective perceptions and the diagnosis time interval when the patient's symptoms are hard to judge. Therefore, an accurate computerized pathological voice quality assessment system will improve the quality of assessment. METHOD: This study proposes a self-attention-based deep learning system, named self-attention-based bidirectional long short-term memory (SA BiLSTM). Different pitches [low, normal, high] and vowels [/a/, /i/, /u/] were added into the proposed model so that it learns, in a high-dimensional view, how professional doctors evaluate the grade, roughness, breathiness, asthenia, and strain scale. RESULTS: The experimental results showed that the proposed system provided higher performance than the baseline systems. More specifically, the macro average of the F1 score, presented as a decimal, was used to compare the accuracy of classification. The (G, R, and B) scores of the proposed system were (0.768±0.011, 0.820±0.009, and 0.815±0.009), which are higher than those of the baseline systems: deep neural network (0.395±0.010, 0.312±0.019, 0.321±0.014) and convolution neural network (0.421±0.052, 0.306±0.043, 0.325±0.032), respectively. CONCLUSIONS: The proposed system, with SA BiLSTM, pitches, and vowels, provides a more accurate way to evaluate the voice. This will be helpful for clinical voice evaluations and will improve patients' benefits from voice therapy.
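
The macro-averaged F1 score used in the RESULTS of this abstract is the unweighted mean of the per-class F1 scores, so rare severity grades count as much as common ones. A minimal sketch follows, assuming integer grade labels; `macro_f1` and the toy labels are hypothetical and are not the study's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)

# Made-up ratings on a three-level scale: clinician labels versus model output.
clinician = [0, 0, 1, 1, 2, 2]
model = [0, 1, 1, 1, 2, 0]
score = macro_f1(clinician, model)
```

Here the per-class F1 values are 0.5, 0.8, and about 0.667, giving a macro average of roughly 0.656.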

16.
Biotechnol Bioeng ; 109(5): 1239-47, 2012 May.
Article in English | MEDLINE | ID: mdl-22125231

ABSTRACT

To establish a production platform for recombinant proteins in rice suspension cells, we first constructed a Gateway-compatible binary T-DNA destination vector. It provided a reliable and effective method for the rapid directional cloning of target genes into plant cells through Agrobacterium-mediated transformation. We used the approach to produce mouse granulocyte-macrophage colony-stimulating factor (mGM-CSF) in a rice suspension cell system. The promoter for the αAmy3 amylase gene, which is induced strongly by sugar depletion, drove the expression of mGM-CSF. The resulting recombinant protein was fused with the αAmy3 signal peptide and was secreted into the culture medium. The production of rice-derived mGM-CSF (rmGM-CSF) was scaled up successfully in a 2-L bioreactor, in which the highest yield of rmGM-CSF was 24.6 mg/L. Due to post-translational glycosylation, the molecular weight of rmGM-CSF was larger than that of recombinant mGM-CSF produced in Escherichia coli. The rmGM-CSF was bioactive and could stimulate the proliferation of a murine myeloblastic leukemia cell line, NFS-60.


Subject(s)
Bioreactors, Biotechnology/methods, Granulocyte-Macrophage Colony-Stimulating Factor/metabolism, Oryza/metabolism, Genetically Modified Plants, Agrobacterium/genetics, Animals, Cell Culture Techniques, Culture Media/chemistry, Bacterial DNA, Escherichia coli/genetics, Genetic Vectors, Granulocyte-Macrophage Colony-Stimulating Factor/chemistry, Granulocyte-Macrophage Colony-Stimulating Factor/genetics, Mice, Molecular Weight, Oryza/genetics, Genetic Promoter Regions, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/metabolism
17.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 1972-1976, 2022 07.
Article in English | MEDLINE | ID: mdl-36086160

ABSTRACT

Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and they carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based model using temporal envelope features could synthesize intelligible speech, and to study the effect of reducing the number of temporal envelopes (from 8 to 2 in this work) on the intelligibility of the synthesized speech. The objective evaluation metric of short-time objective intelligibility (STOI) showed that, on average, the speech synthesized by the proposed approach achieved high STOI scores (i.e., 0.8) in each test condition, and the human listening test showed that the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system is a potential approach for synthesizing highly intelligible speech from limited envelope information in the future.


Subject(s)
Deep Learning , Speech Perception , Auditory Perception , Humans , Speech Intelligibility , Time Factors
18.
Comput Methods Programs Biomed ; 215: 106602, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35021138

ABSTRACT

BACKGROUND AND OBJECTIVE: Most dysarthric patients encounter communication problems due to unintelligible speech. Currently, there are many voice-driven systems aimed at improving their speech intelligibility; however, the intelligibility performance of these systems is affected by challenging application conditions (e.g., time variance of the patient's speech and background noise). To alleviate these problems, we proposed a dysarthria voice conversion (DVC) system for dysarthric patients and investigated its benefits under challenging application conditions. METHOD: A deep learning-based voice conversion system with phonetic posteriorgram (PPG) features, called the DVC-PPG system, was proposed in this study. An objective evaluation metric based on the Google automatic speech recognition (Google ASR) system and a listening test were used to demonstrate the speech intelligibility benefits of DVC-PPG under quiet and noisy test conditions; in addition, a well-known voice conversion system using mel-spectrograms, DVC-Mels, was used for comparison to verify the benefits of the proposed DVC-PPG system. RESULTS: Averaged over two subjects, the Google ASR metric showed that, in the duplicate and outside test conditions, the DVC-PPG system provided higher speech recognition rates (83.2% and 67.5%) than dysarthric speech (36.5% and 26.9%) and DVC-Mels (52.9% and 33.8%) under quiet conditions. The DVC-PPG system also provided more stable performance than DVC-Mels under noisy test conditions. In addition, the listening test showed that the speech intelligibility of DVC-PPG was better than that of the dysarthric speech and DVC-Mels under both the duplicate and outside conditions.
CONCLUSIONS: The objective evaluation metric and listening test results showed that the recognition rate of the proposed DVC-PPG system was significantly higher than those obtained with the original dysarthric speech and the DVC-Mels system. Therefore, it can be inferred from our study that the DVC-PPG system can improve the ability of dysarthric patients to communicate with people under challenging application conditions.


Subject(s)
Speech Intelligibility , Voice , Dysarthria , Humans , Phonetics , Speech Production Measurement
19.
JASA Express Lett ; 2(5): 055202, 2022 05.
Article in English | MEDLINE | ID: mdl-36154065

ABSTRACT

Medical masks have become necessary of late because of the COVID-19 outbreak; however, they tend to attenuate the energy of speech signals and affect speech quality. Therefore, this study proposes an optical-based microphone approach to obtain speech signals from speakers' medical masks. Experimental results showed that the optical-based microphone approach achieved better performance (85.61%) than the two baseline approaches, namely, omnidirectional (24.17%) and directional microphones (31.65%), in the case of long-distance speech and background noise. The results suggest that the optical-based microphone method is a promising approach for acquiring speech from a medical mask.


Subject(s)
COVID-19 , Hearing Aids , Speech Perception , COVID-19/prevention & control , Equipment Design , Humans , Masks , Speech , Vibration
20.
Article in English | MEDLINE | ID: mdl-36085875

ABSTRACT

Patients with dysarthria generally produce distorted speech whose intelligibility is limited for both human listeners and machines. To enhance the intelligibility of dysarthric speech, we applied a deep learning-based speech enhancement (SE) system to this task. Conventional SE approaches suppress noise components in the noise-corrupted input and thus improve sound quality and intelligibility simultaneously. In this study, we focus on reconstructing the severely distorted signal of dysarthric speech to improve its intelligibility. The proposed SE system trains a convolutional neural network (CNN) model in the training phase, which is then used to process the dysarthric speech in the testing phase. Training requires paired dysarthric-normal speech utterances, so we adopt a dynamic time warping technique to align the dysarthric-normal utterances. The resulting training data are used to train a CNN-based SE model. The proposed SE system is evaluated with the Google automatic speech recognition (ASR) system and a subjective listening test. The results showed that the proposed method improved recognition performance by more than 10% over the unprocessed dysarthric speech in both the ASR and human listening evaluations. Clinical Relevance: This study enhances the intelligibility and ASR accuracy of dysarthric speech by more than 10%.
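The alignment step named in this abstract can be illustrated with a minimal dynamic time warping sketch. This is not the authors' code: it aligns two one-dimensional sequences with an absolute-difference cost, whereas speech alignment would normally compare frame-level feature vectors. The function name `dtw_align` and the toy inputs are hypothetical.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and the warping path (index pairs) between a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which samples were matched.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    # Toy stand-ins for a "normal" and a slower "dysarthric" rendition of one contour.
    normal = [0.0, 1.0, 2.0, 3.0]
    slow = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0]
    total, path = dtw_align(normal, slow)
    print(total, path)
```

The returned path pairs each frame of one utterance with frames of the other, which is what allows frame-wise training targets to be built from utterances of different durations.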


Subject(s)
Dysarthria , Speech , Auditory Perception , Dysarthria/diagnosis , Humans , Computer Neural Networks , Sound