Vision-based 3D Face Reenactment

  • Shuwen Zhao

Student thesis: Doctoral Thesis


This thesis addresses the challenge of 3D face reenactment, which involves manipulating face images so that they take on target head poses, facial expressions, and lip movements. Despite significant progress in generating realistic faces, existing face reenactment technology still has several drawbacks, such as difficulty capturing and conveying facial expressions accurately and handling large pose variations. In addition, face reenactment demands substantial computational resources, and keeping this cost low is crucial for deploying it in real-world applications.
To address these issues, this thesis proposes a vision-based 3D face reenactment framework. First, it presents a facial mesh deformation-based 3D face reconstruction method (FMD-3DFR) that captures expressions accurately and combines low computational cost with high fitting performance. The proposed method uses a data augmentation scheme to improve the model's generalization ability and a pose estimation loss function to improve facial pose accuracy. Second, it proposes a method based on detailed face geometry and appearance reconstruction that generates more realistic 3D face models and allows control of large head pose variation. This method reconstructs the detailed face shape by re-rendering the face model and generates a UV map with appearance information using generative adversarial network (GAN) based algorithms. A dedicated loss function is proposed to refine individual-specific appearance details. Finally, the thesis proposes a multi-model fusion-based face reenactment method that manipulates face images into target head poses and facial expressions while incorporating driving audio. This method uses the proposed face reconstruction methods and a speech feature extraction method to regress face parameters, which are then fused and fed into a lightweight GAN-based rendering module to generate the target images. Additionally, a motion module is added to predict facial motion fields. The proposed method achieves visual quality comparable to that of existing state-of-the-art methods in significantly less time.
In summary, this thesis contributes a theoretical framework for vision-based 3D face reenactment that can be applied in real-time face-related applications. The study provides a series of novel methods for generating realistic facial expressions and controlling large pose variation, advancing research and applications in vision-based 3D face reenactment.
Date of Award: 25 Sept 2023
Original language: English
Supervisor: Honghai Liu (Supervisor), Dalin Zhou (Supervisor) & Jiacheng Tan (Supervisor)
