Facial expression synthesis has gained increasing attention with the development of Generative Adversarial Networks (GANs). However, generating high-quality facial expressions remains very challenging, since overlapping and blur commonly appear in the generated facial images, especially in regions with rich facial features such as the eyes and mouth. Existing methods generally treat the face as a whole in facial expression synthesis, without paying specific attention to the characteristics of facial expressions. In fact, according to physiological and psychological research, the differences between facial expressions mostly appear in crucial regions such as the eyes and mouth. Motivated by this observation, this paper proposes a novel end-to-end facial expression synthesis method with a two-stage cascaded structure, called the Local and Global Perception Generative Adversarial Network (LGP-GAN), which is designed to extract and synthesize the details of the crucial facial regions. LGP-GAN combines the results generated by its local and global networks into the corresponding facial expressions. In Stage I, LGP-GAN uses local networks to capture the local texture details of the crucial facial regions and generate the corresponding local regions, fully exploiting domain information about the crucial facial regions in facial expressions. In Stage II, a global network learns the whole-face information and generates the final facial expressions, building upon the local results from Stage I. We conduct qualitative and quantitative experiments on a commonly used public database to verify the effectiveness of the proposed method. Experimental results show the superiority of the proposed method over state-of-the-art methods.
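The two-stage cascade described above can be sketched in a minimal, framework-free way. Everything here is an assumption for illustration only: the region boxes, the `local_generator` and `global_generator` stand-ins, and the image size are hypothetical placeholders for the paper's actual learned networks and landmark-based crops.

```python
import numpy as np

# Hypothetical crop boxes (row0, row1, col0, col1) for the crucial regions;
# the abstract does not specify how the regions are localized.
REGIONS = {
    "left_eye": (20, 40, 15, 45),
    "right_eye": (20, 40, 55, 85),
    "mouth": (65, 90, 30, 70),
}

def local_generator(patch):
    # Stand-in for a Stage-I local generator that synthesizes texture
    # details for one crucial region; a fixed transform keeps this runnable.
    return np.clip(patch * 0.9 + 0.05, 0.0, 1.0)

def global_generator(face):
    # Stand-in for the Stage-II global generator that learns whole-face
    # information and produces the final expression.
    return np.clip(face, 0.0, 1.0)

def lgp_gan_forward(face):
    """Two-stage cascade: Stage I synthesizes the crucial local regions,
    Stage II generates the final expression from the locally updated face."""
    out = face.copy()
    for (r0, r1, c0, c1) in REGIONS.values():   # Stage I: per-region synthesis
        out[r0:r1, c0:c1] = local_generator(out[r0:r1, c0:c1])
    return global_generator(out)                # Stage II: global refinement

face = np.random.default_rng(0).random((100, 100))  # toy grayscale face
result = lgp_gan_forward(face)
print(result.shape)  # (100, 100)
```

The sketch only shows the data flow (local results feeding the global stage); the actual method trains both stages adversarially end-to-end.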
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Early online date: 19 Apr 2021
Publication status: Early online - 19 Apr 2021