Abstract
With the growing popularity of smart devices in daily life, affective computing has attracted increasing attention and is regarded as a fundamental requirement of human-machine interaction systems such as smartphones and virtual reality equipment. The face is the most emotionally expressive part of the body and conveys a wide range of emotion-related behaviours. Facial behaviour analysis imitates the way humans analyse and understand emotions, and is therefore essential for achieving affective computing. However, automatic facial behaviour analysis remains a very difficult task, especially in unconstrained, in-the-wild environments. This thesis addresses facial behaviour analysis from two fundamental aspects, namely eye analysis and facial expression analysis, and explores how deep learning technologies can be adapted to address the problems and challenges arising in each.
For eye analysis, eye centre localization is a crucial step and is therefore addressed first. Existing methods mainly rely on hand-crafted features, which are not robust enough and are very sensitive to the variations found in the wild. Moreover, previous work on eye centre localization has rarely used deep learning. To address these issues, this thesis proposes a novel method based on a fully convolutional network (FCN), which treats eye centre localization as a special subproblem of semantic segmentation. The proposed method has been validated on challenging databases and achieves localization accuracy competitive with state-of-the-art methods, making it an alternative solution for some challenging real-world scenarios.
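To illustrate the segmentation view of eye centre localization, the following minimal sketch shows one way a per-pixel probability map could be reduced to a single eye centre coordinate. It is not the thesis's method: the function name `eye_centre_from_mask`, the threshold value, and the probability-weighted centroid rule are all assumptions made for illustration, and the probability map stands in for the output of a segmentation network.

```python
from typing import List, Tuple


def eye_centre_from_mask(
    prob_map: List[List[float]], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return the (x, y) centroid of pixels whose probability is >= threshold.

    prob_map is a hypothetical per-pixel "eye centre region" probability map,
    indexed as prob_map[row][column]. The centroid is weighted by probability.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(prob_map):
        for x, p in enumerate(row):
            if p >= threshold:  # keep only confident pixels (assumed rule)
                total += p
                sum_x += p * x
                sum_y += p * y
    if total == 0.0:
        raise ValueError("no pixel reaches the threshold")
    return sum_x / total, sum_y / total


# Toy 4x5 map with a 2x2 block of confident pixels.
toy_map = [[0.0] * 5 for _ in range(4)]
for row_idx in (1, 2):
    for col_idx in (2, 3):
        toy_map[row_idx][col_idx] = 1.0

print(eye_centre_from_mask(toy_map))
```

A centroid gives sub-pixel coordinates and is less sensitive to a single noisy pixel than taking the maximum; other reductions, such as the location of the highest probability, would fit the same framing.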
For facial expression analysis, this thesis proposes a novel relation-aware facial expression recognition method called Relation Convolutional Neural Network (ReCNN). ReCNN adaptively captures the relationship between crucial facial regions and facial expressions and focuses on the most discriminative regions for recognition. Compared with previous methods that process the whole face, ReCNN is more accurate and robust on two large in-the-wild databases. Moreover, the relationship between crucial regions and facial expressions shows considerable potential for further improving facial expression recognition. Inspired by ReCNN, this thesis goes on to explore the role of crucial facial regions in facial expression synthesis and proposes a novel method called Local and Global Perception Generative Adversarial Network (LGP-GAN), which fully utilizes both local and global facial information during synthesis. Extensive experiments on a mainstream database demonstrate that LGP-GAN outperforms state-of-the-art methods, making it a feasible solution to the problem of inadequate training data in facial expression recognition. Towards mobile affective computing, this thesis further proposes a light-weight CNN architecture with high performance and low resource consumption, which is well suited to mobile applications. The proposed method runs in real time on an actual mobile device and can be easily ported to and integrated with other applications.
In summary, by developing the aforementioned algorithmic solutions, this thesis gains first-hand experience in adapting deep learning technologies to the task of facial behaviour analysis. This is expected to help extend facial behaviour analysis to a broader range of applications based on affective computing.
Date of Award: Apr 2021
Supervisor: Hui Yu