1.
Sensors (Basel) ; 19(9)2019 May 10.
Article in English | MEDLINE | ID: mdl-31083445

ABSTRACT

The bioelectrical impedance analysis (BIA) method is widely used to predict percent body fat (PBF). However, it requires four to eight electrodes, and it takes a few minutes to accurately obtain the measurement results. In this study, we propose a faster and more accurate method that utilizes a small dry electrode-based wearable device, which predicts whole-body impedance using only upper-body impedance values. Such a small electrode-based device typically needs a long measurement time due to increased parasitic resistance, and its accuracy varies by measurement posture. To minimize these variations, we designed a sensing system that only utilizes contact with the wrist and index fingers. The measurement time was also reduced to five seconds by an effective parameter calibration network. Finally, we implemented a deep neural network-based algorithm to predict the PBF value by the measurement of the upper-body impedance and lower-body anthropometric data as auxiliary input features. The experiments were performed with 163 amateur athletes who exercised regularly. The performance of the proposed system was compared with those of two commercial systems that were designed to measure body composition using either a whole-body or upper-body impedance value. The results showed that the correlation coefficient (r²) value was improved by about 9%, and the standard error of estimate (SEE) was reduced by 28%.


Subjects
Anthropometry/methods , Body Composition/physiology , Electrodes , Electric Impedance , Humans , Wearable Electronic Devices
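
Note on the two metrics reported in the abstract above: the sketch below shows one common way to compute a squared correlation coefficient (r²) and a standard error of estimate (SEE) from reference and predicted percent body fat values. The function names, the `n_params=2` default, and the SEE definition (residual sum of squares over n minus the number of fitted parameters) are assumptions for illustration; the abstract does not state which definitions the authors used.

```python
import math

def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((a - mean_ref) * (b - mean_pred)
              for a, b in zip(reference, predicted))
    var_ref = sum((a - mean_ref) ** 2 for a in reference)
    var_pred = sum((b - mean_pred) ** 2 for b in predicted)
    return cov * cov / (var_ref * var_pred)

def standard_error_of_estimate(reference, predicted, n_params=2):
    """SEE = sqrt(sum of squared residuals / (n - n_params)).

    n_params=2 corresponds to a simple linear fit (assumed here).
    """
    residual_ss = sum((a - b) ** 2 for a, b in zip(reference, predicted))
    return math.sqrt(residual_ss / (len(reference) - n_params))
```

A higher r² together with a lower SEE, as reported in the abstract, means the predictions both track the reference more closely and scatter less around it.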
2.
J Acoust Soc Am ; 143(1): EL37, 2018 01.
Article in English | MEDLINE | ID: mdl-29390776

ABSTRACT

In this letter, a generic search grid generation algorithm for far-field source localization (SL) is proposed. Since conventional uniform regular grid structures only consider the resolution of the distribution, it is difficult to control the number of grid points to be distributed. The proposed algorithm generates a search grid by distributing a desired number of points evenly, depending on the target criterion, in either direction of arrival or time difference of arrival domain. The experimental results show that the proposed algorithm provides optimally distributed grid points given the number of desired points and the corresponding domain for SL processing.
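
Note: the idea of spreading a chosen number of points evenly over the direction-of-arrival domain can be sketched with a Fibonacci (golden-angle) lattice on the unit sphere. This is a well-known generic construction used here only as an illustration; it is not the algorithm proposed in the letter, and the function names are hypothetical.

```python
import math

def fibonacci_sphere_grid(num_points):
    """Return num_points unit vectors spread nearly evenly over the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(num_points):
        # Heights are evenly spaced strictly inside (-1, 1).
        z = 1.0 - (2.0 * i + 1.0) / num_points
        radius = math.sqrt(1.0 - z * z)
        azimuth = golden_angle * i
        points.append((radius * math.cos(azimuth),
                       radius * math.sin(azimuth),
                       z))
    return points

def to_azimuth_elevation(point):
    """Convert a unit vector to (azimuth, elevation) in degrees."""
    x, y, z = point
    return math.degrees(math.atan2(y, x)), math.degrees(math.asin(z))
```

Unlike a uniform azimuth/elevation grid, where the point count follows from the chosen angular resolution, this construction takes the point count as the input.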

3.
J Acoust Soc Am ; 135(6): EL284-90, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24907835

ABSTRACT

This letter investigates the impact of spectral compression on the vector Taylor series-based model adaptation algorithm. Unlike mel-frequency cepstral coefficients obtained by the logarithmic compression, the fractional power compression is used for extracting features. Since the relationship between acoustic models for clean and noisy speech depends on nonlinearity of the spectrum, it is important to select an appropriate compressive operator in the model adaptation. In this letter, the dependency of spectral nonlinearity on the speech recognition system is analyzed in various noisy environments. Experimental results confirm that the replacement of the compressive operator improves the performance of the model adaptation.
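
Note: the two compressive operators contrasted in the abstract above can be sketched as follows. The exponent `0.1` and the energy floor are assumed values chosen for illustration; the abstract does not give the exponent used by the authors.

```python
import math

def log_compress(energies, floor=1e-10):
    """Logarithmic compression, as used for conventional MFCCs.

    The floor (an assumed value) avoids log(0) for empty bands.
    """
    return [math.log(max(e, floor)) for e in energies]

def power_compress(energies, exponent=0.1):
    """Fractional power compression: e ** exponent with 0 < exponent < 1."""
    return [e ** exponent for e in energies]
```

Both operators reduce the dynamic range of the filter-bank energies, but the power law stays finite at zero energy while the logarithm needs a floor, which is one reason the choice of operator changes how noise maps into the feature domain.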

4.
J Acoust Soc Am ; 134(5): EL438-44, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24181988

ABSTRACT

This letter proposes a degradation and cognition model to estimate the speech quality impairment caused by the packet loss concealment (PLC) algorithm implemented in the SILK speech codec. Considering that the quality degradation caused by packet loss is highly related to the PLC algorithm, the impact of quality degradation on various types of previous and lost packet classes is analyzed. Then, the PLC effects on the proposed class types are measured by the class-conditional expectation of the degradation scores. Finally, the cognition module is derived to estimate the total quality degradation on a mean opinion score (MOS) scale. When assessed for correlation with subjective test results, the correlation coefficient of the encoder-based class model is 0.93, and that of the decoder-based model is 0.87.


Subjects
Internet , Theoretical Models , Computer-Assisted Signal Processing , Speech Acoustics , Speech Perception , Speech Production Measurement , Voice Quality , Algorithms , Speech Audiometry , Female , Humans , Male , Speech Intelligibility
5.
J Acoust Soc Am ; 132(1): EL29-35, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22779569

ABSTRACT

This letter presents a single-channel speech dereverberation approach using a non-causal minimum variance distortionless response (MVDR) filter. The non-causal filter is adopted to utilize the additional information of the desired signal that lies in subsequent frames. Note that the desired signal output has minimal distortion due to the introduction of the MVDR criterion. The proposed system further suppresses the late reverberation by employing a statistical reverberant model. Experimental results demonstrate the superiority of the proposed algorithm to conventional approaches.


Subjects
Algorithms , Speech Perception/physiology , Fourier Analysis , Humans , Perceptual Masking/physiology , Speech Acoustics , Speech Intelligibility/physiology
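
Note: the distortionless property mentioned in the abstract above follows from the constraint in the MVDR criterion. In its textbook form (the notation is assumed here, and the letter's exact formulation may differ), the filter minimizes the output power subject to passing the desired component unchanged:

```latex
\min_{\mathbf{h}} \; \mathbf{h}^{H} \boldsymbol{\Phi}\, \mathbf{h}
\quad \text{subject to} \quad \mathbf{h}^{H} \mathbf{d} = 1,
\qquad
\mathbf{h}_{\mathrm{MVDR}}
  = \frac{\boldsymbol{\Phi}^{-1} \mathbf{d}}
         {\mathbf{d}^{H} \boldsymbol{\Phi}^{-1} \mathbf{d}}
```

Here $\boldsymbol{\Phi}$ stands for the covariance matrix of the stacked observation vector and $\mathbf{d}$ for the vector that describes how the desired signal appears across its entries. In a single-channel, non-causal setting the stacked vector would plausibly hold the current frame together with neighbouring (including subsequent) frames, but that reading is an assumption rather than a detail given in the abstract.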
6.
J Acoust Soc Am ; 131(2): 1536-46, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22352523

ABSTRACT

Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.


Subjects
Phonetics , Psychological Recognition/physiology , Speech Acoustics , Speech Perception/physiology , Telephone , Algorithms , Cues , Humans , Sound Spectrography , Speech Recognition Software
7.
Nat Commun ; 13(1): 5815, 2022 10 03.
Article in English | MEDLINE | ID: mdl-36192403

ABSTRACT

A wearable silent speech interface (SSI) is a promising platform that enables verbal communication without vocalization. The most widely studied methodology for SSI focuses on surface electromyography (sEMG). However, sEMG suffers from low scalability because of signal quality-related issues, including signal-to-noise ratio and interelectrode interference. Hence, here, we present a novel SSI by utilizing crystalline-silicon-based strain sensors combined with a 3D convolutional deep learning algorithm. Two perpendicularly placed strain gauges with minimized cell dimension (<0.1 mm²) could effectively capture the biaxial strain information with high reliability. We attached four strain sensors near the subjects' mouths and collected strain data of unprecedentedly large wordsets (100 words), which our SSI can classify at a high accuracy rate (87.53%). Several analysis methods were demonstrated to verify the system's reliability, as well as the performance comparison with another SSI using sEMG electrodes with the same dimension, which exhibited a relatively low accuracy rate (42.60%).


Subjects
Deep Learning , Speech , Algorithms , Electromyography/methods , Reproducibility of Results , Silicon
8.
J Acoust Soc Am ; 126(3): EL100-6, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19739699

ABSTRACT

This paper proposes an efficient method to improve speaker recognition performance by dynamically controlling the ratio of phoneme class information. It utilizes the fact that each phoneme contains a different amount of speaker-discriminative information, which can be measured by mutual information. After classifying phonemes into five classes, the optimal ratio of each class in both training and testing processes is adjusted using a non-linear optimization technique, i.e., the Nelder-Mead method. Speaker identification results verify that the proposed method achieves an 18% improvement in terms of error rate compared to a baseline system.


Subjects
Theoretical Models , Automated Pattern Recognition , Physiological Pattern Recognition , Phonetics , Speech , Algorithms , Animals , Information Theory , Nonlinear Dynamics
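
Note: the abstract above measures speaker-discriminative information with mutual information. A minimal sketch of that quantity, computed from a table of co-occurrence counts, is given below. The table layout (rows for speakers, columns for a quantized feature observed within one phoneme class) is an assumption for illustration and is not taken from the paper.

```python
import math

def mutual_information(joint_counts):
    """I(X;Y) in bits from co-occurrence counts (rows: X, columns: Y)."""
    total = sum(sum(row) for row in joint_counts)
    row_sums = [sum(row) for row in joint_counts]
    col_sums = [sum(col) for col in zip(*joint_counts)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count == 0:
                continue  # zero cells contribute nothing to the sum
            p_xy = count / total
            # p(x,y) / (p(x) p(y)) written in terms of raw counts
            ratio = count * total / (row_sums[i] * col_sums[j])
            mi += p_xy * math.log2(ratio)
    return mi
```

Under this reading, a phoneme class whose table yields a larger value carries more information about the speaker, which is the kind of difference the class ratios are tuned to exploit.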
9.
J Acoust Soc Am ; 122(3): EL88, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17927313

ABSTRACT

This letter describes the perceptual relevance of adopting the temporal envelope to model the 4-7 kHz frequency band (highband) of wideband speech signals. Based on theoretical work in psychoacoustics, we find that the temporal envelope can indeed be a perceptual cue for the high-band signal, i.e., a noiseless sound can be obtained if the temporal envelope is roughly preserved. Subjective listening tests verify that transparent quality can be obtained if the model is used for the 4.5-7 kHz band. The proposed model has the benefits of offering flexible scalability and reducing the cost of quantization in coding applications.


Subjects
Auditory Perception/physiology , Hearing/physiology , Speech Perception/physiology , Speech/physiology , Humans , Biological Models , Perceptual Masking , Psychoacoustics , Sound Spectrography , Speech Intelligibility
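
Note: a temporal envelope of the kind discussed in the abstract above can be approximated by the short-time RMS of the band-limited signal. The sketch below assumes the input has already been filtered to the high band and uses non-overlapping frames; the frame length and the function name are hypothetical, and the letter's actual envelope model is not specified in the abstract.

```python
import math

def temporal_envelope(samples, frame_length):
    """Short-time RMS envelope over non-overlapping frames.

    Trailing samples that do not fill a whole frame are ignored.
    """
    envelope = []
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        frame = samples[start:start + frame_length]
        energy = sum(s * s for s in frame)
        envelope.append(math.sqrt(energy / frame_length))
    return envelope
```

Coding only these per-frame values instead of the waveform is what would make such a model cheap to quantize.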
10.
Article in English | MEDLINE | ID: mdl-26737691

ABSTRACT

This paper proposes a constrained two-layer compression technique for electrocardiogram (ECG) waves, whose encoded parameters can be directly used for the diagnosis of arrhythmia. In the first layer, a single ECG beat is represented by one of the registered templates in the codebook. Since the only coding parameter required in this layer is the codebook index of the selected template, its compression ratio (CR) is very high. Note that the distribution of registered templates is also related to the characteristics of ECG waves, so it can be used as a metric to detect various types of arrhythmias. In the second layer, the residual error between the input and the selected template is encoded by wavelet-based transform coding. The number of wavelet coefficients is constrained by a predefined maximum allowable distortion. The MIT-BIH arrhythmia database is used to evaluate the performance of the proposed algorithm. The proposed algorithm achieves a CR of about 7.18 when the reference value of the percentage root mean square difference (PRD) is set to ten.


Subjects
Cardiac Arrhythmias/diagnosis , Electrocardiography , Algorithms , Data Compression , Factual Databases , Humans , Reference Values , Computer-Assisted Signal Processing , Wavelet Analysis
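
Note: the two figures of merit quoted in the abstract above are commonly defined as below. This sketch uses the plain PRD without mean removal; some ECG studies subtract the signal mean or a baseline offset first, and the abstract does not say which variant was used. Function and parameter names are hypothetical.

```python
import math

def prd(original, reconstructed):
    """Percentage root mean square difference, in percent (no mean removal)."""
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x * x for x in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)

def compression_ratio(original_bits, compressed_bits):
    """CR = size of the raw signal divided by size of the encoded signal."""
    return original_bits / compressed_bits
```

Constraining the distortion first and then reporting the CR reached, as the abstract does, keeps comparisons between coders at a fixed reconstruction quality.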