Skeleton-based multi-features and multi-stream network for real-time action recognition

Zhiwen Deng, Qing Gao, Zhaojie Ju, Xiang Yu

Research output: Contribution to journal › Article › peer-review

Abstract

Action recognition is a hot topic in the field of computer vision. It has been widely used in human-computer/robot interaction, abnormal behavior monitoring, and medical assistance. Because skeleton data are highly robust, skeleton-based action recognition has attracted many researchers. Most current skeleton-based action recognition methods suffer from incomplete input features that generalize poorly, inadequate feature extraction by the network model, and an imbalance between recognition accuracy and model size. To solve these problems, we analyze the critical skeleton features for action recognition and propose a multi-features and multi-stream network (MM-Net) for real-time action recognition. First, three pairs of features are proposed: joint distance (JD) and joint distance velocity (JDV), joint angle (JA) and joint angle velocity (JAV), and fast-action joint position (FJP) and slow-action joint position (SJP). Second, a multi-features and multi-stream network is built using one-dimensional convolutional neural networks (1DCNN) to reduce the number of model parameters and fully extract the three pairs of features. As a result, MM-Net achieves the highest accuracies on both JHMDB (86.5%) and SHREC (96.4% on coarse and 93.3% on fine datasets). In addition, MM-Net is applied to a human-robot interaction (HRI) platform, which demonstrates its practicality.
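
To make the first feature pair concrete, the sketch below shows one plausible way to compute a joint distance (JD) feature and its velocity (JDV) from a skeleton sequence. The abstract does not give the exact definitions, so this is an illustration under stated assumptions, not the authors' implementation: JD is taken to be the pairwise Euclidean distance between joints in each frame, and JDV the frame-to-frame difference of JD. The function names, the array layout (frames, joints, coordinates), and the example shape are all hypothetical.

```python
import numpy as np


def joint_distances(seq):
    """Pairwise Euclidean joint distances per frame (assumed JD definition).

    seq: array of shape (T, J, D) with T frames, J joints, D coordinates.
    Returns an array of shape (T, J * (J - 1) // 2).
    """
    _, num_joints, _ = seq.shape
    # Indices of the upper triangle, so each joint pair is counted once.
    first, second = np.triu_indices(num_joints, k=1)
    diff = seq[:, first, :] - seq[:, second, :]
    return np.linalg.norm(diff, axis=-1)


def temporal_velocity(feat, step=1):
    """Difference between frames `step` apart (assumed velocity definition).

    feat: array of shape (T, F). Returns an array of shape (T - step, F).
    """
    return feat[step:] - feat[:-step]


# Hypothetical input: 32 frames, 15 joints, 2D coordinates.
rng = np.random.default_rng(0)
seq = rng.normal(size=(32, 15, 2))

jd = joint_distances(seq)     # shape (32, 105)
jdv = temporal_velocity(jd)   # shape (31, 105)
```

Under these assumptions, JD depends only on relative joint positions, so it is unchanged when the whole skeleton is translated in the image or scene. That is one reason distance-style features are often preferred over raw coordinates as network input.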

Original language: English
Pages (from-to): 7397-7409
Number of pages: 13
Journal: IEEE Sensors Journal
Volume: 23
Issue number: 7
Early online date: 23 Feb 2023
Publication status: Published - 1 Apr 2023

Keywords

  • cameras
  • face recognition
  • feature extraction
  • human-computer interaction
  • human-robot interaction
  • multi-feature
  • real-time
  • real-time systems
  • sensors
  • skeleton
  • skeleton-based action recognition
