Nonlinear Regularization Decoding Method for Speech Recognition.
Zhang, Jiang; Wang, Liejun; Yu, Yinfeng; Xu, Miaomiao.
Affiliation
  • Zhang J; College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China.
  • Wang L; College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China.
  • Yu Y; College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China.
  • Xu M; College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China.
Sensors (Basel); 24(12), 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38931629
ABSTRACT
Existing end-to-end speech recognition methods typically employ hybrid decoders based on CTC and Transformer. However, error accumulation in these hybrid decoders hinders further improvements in accuracy. In addition, most existing models are built on the Transformer architecture, which tends to be complex and poorly suited to small datasets. We therefore propose a Nonlinear Regularization Decoding Method for Speech Recognition. First, we introduce a nonlinear Transformer decoder that breaks away from the traditional left-to-right or right-to-left decoding order and allows associations between any characters, mitigating the limitations of Transformer architectures on small datasets. Second, we propose a novel regularization attention module that optimizes the attention score matrix, reducing the impact of early errors on later outputs. Finally, we introduce a tiny model to address the problem of excessively large parameter counts. Experimental results indicate that our model performs well: compared to the baseline, it achieves recognition improvements of 0.12%, 0.54%, 0.51%, and 1.2% on the Aishell1, Primewords, and Free ST Chinese Corpus datasets and the Uyghur subset of Common Voice 16.1, respectively.
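For readers unfamiliar with the hybrid CTC/Transformer decoding that the abstract takes as its starting point, the sketch below shows one common way such decoders combine the two scores: a weighted interpolation of the CTC and attention-decoder log-probabilities for each hypothesis. This is a generic illustration, not the method proposed in the paper; the function names, the `ctc_weight` default, and the n-best values are all hypothetical.

```python
def hybrid_score(ctc_logp: float, att_logp: float, ctc_weight: float = 0.3) -> float:
    """Interpolate CTC and attention-decoder log-probabilities for one hypothesis."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def rescore(hypotheses, ctc_weight=0.3):
    """Return hypotheses sorted best-first by the combined score.

    `hypotheses` is a list of (text, ctc_logp, att_logp) tuples.
    """
    return sorted(
        hypotheses,
        key=lambda h: hybrid_score(h[1], h[2], ctc_weight),
        reverse=True,
    )


# Hypothetical n-best list with made-up log-probabilities (assumption,
# not data from the paper).
nbest = [
    ("hypothesis A", -4.0, -6.0),  # favoured by the CTC branch
    ("hypothesis B", -5.0, -3.0),  # favoured by the attention decoder
]

best_text = rescore(nbest, ctc_weight=0.3)[0][0]
print(best_text)
```

Raising `ctc_weight` shifts the ranking toward the CTC branch and lowering it shifts the ranking toward the attention decoder, so in this kind of interpolated setup an error in either branch can pull the final choice with it.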

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Speech Recognition Software | Limit: Humans | Language: En | Journal: Sensors (Basel) | Publication year: 2024 | Document type: Article | Affiliation country: China
