Vision-based human activity analysis

Student thesis: Doctoral Thesis


Human activity recognition has been an active research topic for decades due to its potential applications in video surveillance, human-robot interaction, elderly care, and entertainment. Although significant progress has been made recently with the emergence of RGB-D sensors, applying it to practical scenarios remains a great challenge. The main contribution of this thesis is a novel human activity analysis framework comprising four algorithms: Geometry property and Bag of Semantic moving Words (GBSW) for human action recognition, Spatial Relation and temporal Moving Similarity (SRMS) for human interaction recognition, the Skeleton Motion Distribution model (SMD) for human action detection, and the Multi-stage Soft Regression (MSR) framework for online human activity recognition.

Firstly, targeting the traditional human action recognition problem, in which the action sequences are manually pre-segmented, a spatio-temporal feature descriptor, GBSW, which aggregates a bag of semantic moving words (BSW) with a geometric feature (G), is proposed to effectively represent human actions from skeleton sequences. Experimental results show that GBSW outperforms the state-of-the-art methods.

Secondly, taking advantage of the BSW feature extracted from individuals, the moving similarity between body parts is further explored to describe the mutual relationship for effective human interaction recognition. A new large RGB-D-based human-human interaction dataset, the Online Human Interaction (OHI) Dataset, is collected for the evaluation of human interaction recognition algorithms. The effectiveness of the proposed method is demonstrated by experimental results on both the public dataset and the newly collected dataset.

Thirdly, to remove the manual segmentation requirement of traditional action recognition and achieve automatic action detection for a given video sequence, a novel SMD model is developed. Specifically, an adaptive density estimation function is built to calculate the density distribution of skeleton movements. Experimental results demonstrate that the method outperforms the state-of-the-art methods in terms of both detection accuracy and recognition precision.

Fourthly, an MSR framework is developed for online activity recognition, where the action needs to be recognized immediately from a continuously incoming video stream. The developed framework carefully assembles overlapping activity observations in all periods to improve its robustness against arbitrary activity segments. Extensive experimental results on several publicly available databases demonstrate the outstanding performance of the MSR method over the state-of-the-art approaches.
Date of Award: Sep 2018
Original language: English
Supervisor: Honghai Liu (Supervisor), Zhaojie Ju (Supervisor) & Nicholas John Savage (Supervisor)
