Abstract
Facial performance capture (or face capture), the process of reconstructing, tracking and analysing the deformable geometry and appearance of the human face from visual input (e.g. RGB or RGB-D images), is a long-standing research topic in computer graphics and computer vision. Over the past two decades, the field has witnessed rapid progress, pushing the accuracy, speed and ease of use of capture methods to a new level and benefiting a wide range of applications such as personalised facial avatar generation, face identification and facial animation. Nevertheless, improving capture robustness while keeping methods compute- and data-efficient remains an open problem. Moreover, emerging virtual reality (VR) technologies for immersive interaction pose new challenges to facial performance capture: the VR head-mounted display (HMD) occludes a large portion of the user's face, which renders conventional vision-based face capture methods far less effective.
To address these problems, this thesis first develops two novel face capture approaches for detecting sparse 2D facial landmarks and for tracking dense 3D facial geometry, respectively, from a monocular RGB camera. Both approaches have been thoroughly evaluated on benchmark face image and video datasets. Compared with previous methods, they demonstrate improved capture performance at very low data and computational cost. The proposed approaches have further been implemented in mobile and desktop facial tracking interfaces and validated on live video streams.
To capture the VR HMD user's facial expression with high fidelity, the thesis proposes combining a classic monocular 3D face reconstruction algorithm with a pioneering facial biosensing technique, Faceteq, which uses advanced electromyographic (EMG) sensors to capture facial muscle activity. This extends facial performance capture from the traditional visual setting to the novel VR context, providing a practical solution for face-to-face communication with compelling facial expressions in virtual environments.
Beyond developing robust facial performance capture approaches, this thesis explores a new direction for applying them to real-world problems. Specifically, it identifies the problem of automated facial nerve function assessment from visual face capture for facial palsy management. By systematically reviewing the principal studies on related topics, the thesis identifies the challenges in the field and indicates promising directions for future work. Furthermore, it proposes a pathway for applying the face capture methods developed in the preceding chapters to automated facial nerve function assessment. To the best of my knowledge, this is the first review of its kind. Given its interdisciplinary nature, the review can benefit multiple areas, including visual face capture, clinical facial palsy diagnosis and facial bioengineering.
Date of Award: Sep 2020
Supervisor: Hui Yu (Supervisor)