Multi-view region-adaptive multi-temporal DMM and RGB action recognition

Mahmoud M. N. Al-Faris, John Chiverton, Linda Yang, David Ndzi

Research output: Contribution to journal › Article › peer-review


Abstract

Human action recognition remains an important yet challenging task. This work proposes an action recognition system built on a novel multi-view region-adaptive multi-resolution-in-time depth motion map (MV-RAMDMM) formulation combined with appearance information. Multi-stream 3D convolutional neural networks (CNNs) are trained on the different views and time resolutions of the region-adaptive depth motion maps. Multiple views are synthesised to enhance view invariance. The region-adaptive weights, based on localised motion, accentuate and differentiate the parts of actions that involve faster motion. Dedicated 3D CNN streams for multi-time-resolution appearance information are also included; these help to identify and differentiate between small object interactions. A pre-trained 3D CNN is fine-tuned for each stream and used along with multi-class support vector machines, and average score fusion is applied to the outputs. The developed approach is capable of recognising both human actions and human–object interactions. Three public-domain datasets, namely MSR 3D Action, Northwestern UCLA multi-view actions and MSR 3D daily activity, are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.
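
The final fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each stream's classifier produces one score per action class on a comparable scale. The function names, stream descriptions, labels and score values below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_score_fusion(stream_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class scores across all streams."""
    if not stream_scores:
        raise ValueError("need at least one stream")
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("every stream must provide one score per class")
    n_streams = len(stream_scores)
    return [sum(s[c] for s in stream_scores) / n_streams for c in range(n_classes)]


def predict(stream_scores: Sequence[Sequence[float]], labels: Sequence[str]) -> str:
    """Return the label whose fused (averaged) score is highest."""
    fused = average_score_fusion(stream_scores)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical per-class scores from three streams (illustrative values only).
labels = ["wave", "drink", "sit"]
scores = [
    [0.60, 0.30, 0.10],  # e.g. a depth motion map stream for one view
    [0.20, 0.50, 0.30],  # e.g. a depth motion map stream for another view
    [0.50, 0.25, 0.25],  # e.g. an appearance (RGB) stream
]
fused = average_score_fusion(scores)
prediction = predict(scores, labels)
```

In this toy example a single stream favours "drink", but averaging lets the two streams that agree on "wave" outweigh it. The sketch gives every stream equal weight, which is the plain reading of average score fusion.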

Original language: English
Number of pages: 16
Journal: Pattern Analysis & Applications
Early online date: 21 Apr 2020
Publication status: Early online - 21 Apr 2020

Keywords

  • Action Recognition
  • Depth Motion Map (DMM)
  • 3D Convolutional Neural Network
  • Region Adaptive
