FPGA Implementation of Keyword Spotting System Using Depthwise Separable Binarized and Ternarized Neural Networks.
Bae, Seongwoo; Kim, Haechan; Lee, Seongjoo; Jung, Yunho.
Affiliations
  • Bae S; School of Electronics and Information Engineering, Korea Aerospace University, Goyang-si 10540, Republic of Korea.
  • Kim H; School of Electronics and Information Engineering, Korea Aerospace University, Goyang-si 10540, Republic of Korea.
  • Lee S; Department of Semiconductor Systems Engineering, Sejong University, Seoul 05006, Republic of Korea.
  • Jung Y; Institute of Semiconductor and System IC, Sejong University, Seoul 05006, Republic of Korea.
Sensors (Basel); 23(12), 2023 Jun 19.
Article in En | MEDLINE | ID: mdl-37420866
ABSTRACT
Keyword spotting (KWS) systems are used for human-machine communication in various applications. In many cases, KWS combines wake-up-word (WUW) recognition for device activation with voice command classification. These tasks are challenging for embedded systems because of the complexity of deep learning algorithms and the need for a network optimized for each application. In this paper, we propose a depthwise separable binarized/ternarized neural network (DS-BTNN) hardware accelerator capable of performing both WUW recognition and command classification on a single device. The design achieves significant area efficiency by reusing the same bitwise operators for the computations of the binarized neural network (BNN) and the ternarized neural network (TNN). Implemented in a 40 nm complementary metal-oxide semiconductor (CMOS) process, the DS-BTNN accelerator occupies 0.558 mm², a 49.3% area reduction compared with a design approach in which the BNN and TNN are developed independently and then integrated into the system as two separate modules. The designed KWS system, which was implemented on a Xilinx UltraScale+ ZCU104 field-programmable gate array (FPGA) board, receives real-time data from the microphone, preprocesses the data into a mel spectrogram, and uses this as input to the classifier. Depending on the processing order, the network operates as a BNN for WUW recognition or as a TNN for command classification. Operating at 170 MHz, our system achieved 97.1% accuracy in BNN-based WUW recognition and 90.5% in TNN-based command classification.
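The abstract does not describe the accelerator's datapath, so the following is only a minimal software sketch of one common way binarized and ternarized dot products can share bitwise logic: XNOR plus popcount for values in {-1, +1}, and the same XNOR gated by a "nonzero" mask for values in {-1, 0, +1}. The function names (`pack_bits`, `bnn_dot`, `tnn_dot`) and the two-bit-plane encoding are assumptions made for illustration, not the authors' implementation.

```python
def pack_bits(flags):
    """Pack an iterable of booleans into an int, element i at bit i."""
    word = 0
    for i, flag in enumerate(flags):
        if flag:
            word |= 1 << i
    return word


def popcount(x):
    """Number of set bits in a non-negative int."""
    return bin(x).count("1")


def bnn_dot(a, w):
    """Dot product of two {-1, +1} vectors via XNOR and popcount."""
    n = len(a)
    mask = (1 << n) - 1
    a_sign = pack_bits(v > 0 for v in a)
    w_sign = pack_bits(v > 0 for v in w)
    agree = ~(a_sign ^ w_sign) & mask      # XNOR: 1 where signs match
    return 2 * popcount(agree) - n         # matches minus mismatches


def tnn_dot(a, w):
    """Dot product of two {-1, 0, +1} vectors (hypothetical encoding:
    one sign bit-plane and one nonzero bit-plane per vector)."""
    n = len(a)
    mask = (1 << n) - 1
    a_sign = pack_bits(v > 0 for v in a)
    w_sign = pack_bits(v > 0 for v in w)
    active = pack_bits(v != 0 for v in a) & pack_bits(v != 0 for v in w)
    agree = ~(a_sign ^ w_sign) & mask & active  # same XNOR, gated by mask
    return 2 * popcount(agree) - popcount(active)
```

In this sketch the XNOR and popcount steps are identical in both functions; the ternary case only adds an AND with the `active` mask, which illustrates why a single set of bitwise operators could plausibly serve both network types.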

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Neural Networks, Computer | Limit: Humans | Language: En | Journal: Sensors (Basel) | Publication year: 2023 | Document type: Article
