ST-TGR: Spatio-Temporal Representation Learning for Skeleton-Based Teaching Gesture Recognition.
Chen, Zengzhao; Huang, Wenkai; Liu, Hai; Wang, Zhuo; Wen, Yuqun; Wang, Shengming.
Affiliation
  • Chen Z; Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China.
  • Huang W; National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China.
  • Liu H; Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China.
  • Wang Z; Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China.
  • Wen Y; National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China.
  • Wang S; Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China.
Sensors (Basel) ; 24(8)2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38676207
ABSTRACT
Teaching gesture recognition is a technique for recognizing the hand movements of teachers in classroom teaching scenarios. It has wide applications in education, including classroom teaching evaluation, enhancing online teaching, and assisting special education. However, current research on gesture recognition in teaching mainly focuses on detecting the static gestures of individual students and analyzing their classroom behavior. To analyze the teacher's gestures and mitigate the difficulty of single-target dynamic gesture recognition in multi-person teaching scenarios, this paper proposes ST-TGR, a spatio-temporal representation learning method for skeleton-based teaching gesture recognition. The method first uses the human pose estimation model RTMPose to extract the coordinates of the teacher's skeleton keypoints, then feeds the resulting skeleton sequence into the MoGRU action recognition network to classify gesture actions. The MoGRU module learns the spatio-temporal representation of target actions by stacking multi-scale bidirectional gated recurrent units (BiGRUs) and applying improved attention modules. To validate the generalization of the action recognition network, we conducted comparative experiments on the NTU RGB+D 60, UT-Kinect Action3D, SBU Kinect Interaction, and Florence 3D datasets. The results indicate that, compared with most existing baseline models, the proposed model achieves better recognition accuracy and speed.
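The pipeline the abstract describes (skeleton keypoint sequence → bidirectional recurrent encoding → attention pooling → gesture class) can be sketched as follows. This is a minimal illustrative sketch, not the paper's MoGRU: the single-layer setup, the 17-keypoint skeleton format, the hidden size, and the simple dot-product attention are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """A single GRU cell applied one time step at a time."""
    def __init__(self, in_dim, hid_dim):
        s = 1.0 / np.sqrt(hid_dim)
        # Stacked weights for the update (z), reset (r), and candidate gates.
        self.W = rng.uniform(-s, s, (3, in_dim, hid_dim))
        self.U = rng.uniform(-s, s, (3, hid_dim, hid_dim))
        self.b = np.zeros((3, hid_dim))
        self.hid_dim = hid_dim

    def step(self, x, h):
        z = sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])
        r = sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])
        n = np.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])
        return (1 - z) * h + z * n

def bigru(seq, fwd, bwd):
    """Run forward and backward GRUs over seq (T, in_dim); concat per-step outputs."""
    T = seq.shape[0]
    hf, hb = np.zeros(fwd.hid_dim), np.zeros(bwd.hid_dim)
    outs_f, outs_b = [], []
    for t in range(T):
        hf = fwd.step(seq[t], hf)
        outs_f.append(hf)
    for t in reversed(range(T)):
        hb = bwd.step(seq[t], hb)
        outs_b.append(hb)
    outs_b.reverse()
    return np.concatenate([np.stack(outs_f), np.stack(outs_b)], axis=1)  # (T, 2*hid)

def attention_pool(H, w):
    """Score each time step, softmax over time, return the weighted sum."""
    scores = H @ w
    a = np.exp(scores - scores.max())
    a /= a.sum()
    return a @ H  # (2*hid,)

# Toy setup: 17 COCO-style keypoints with (x, y) coords per frame,
# a 30-frame clip, and 5 hypothetical gesture classes.
T, J, C, HID, CLASSES = 30, 17, 2, 32, 5
seq = rng.standard_normal((T, J * C))            # flattened skeleton sequence
fwd, bwd = GRUCell(J * C, HID), GRUCell(J * C, HID)
w_att = rng.standard_normal(2 * HID)
W_cls = rng.standard_normal((2 * HID, CLASSES))

H = bigru(seq, fwd, bwd)           # spatio-temporal features, (30, 64)
ctx = attention_pool(H, w_att)     # attention-weighted clip descriptor
logits = ctx @ W_cls
pred = int(np.argmax(logits))
print(H.shape, pred)
```

In practice, the keypoint sequence would come from a pose estimator such as RTMPose rather than random data, and the recurrent encoder would be trained end-to-end with a cross-entropy loss over the gesture classes.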
Full text: 1 Database: MEDLINE Main subject: Gestures Language: English Journal: Sensors (Basel) Year: 2024 Document type: Article