HR-GCN: 2D-3D whole-body pose estimation with high-resolution graph convolutional network from a monocular camera

Mingyu Zhang, Qing Gao*, Yuanchuan Lai, Junjie Hu, Xin Zhang, Zhaojie Ju

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

3D human pose estimation plays a vital role in applications such as action recognition, human-robot interaction, and immersive technologies. While traditional methods focus on coarse body keypoints, 3D whole-body pose estimation localizes keypoints for the entire body, including hands, face, and feet, allowing for the capture of more detailed human motion and expression information, which enhances its applicability to downstream tasks. Although 3D whole-body pose estimation can be achieved using marker-based systems, wearable devices, or multi-view camera setups, employing a monocular camera is the most convenient and cost-effective approach. However, the problem of monocular 3D whole-body pose estimation remains inadequately addressed, with significant shortcomings in accuracy. This paper introduces a High-Resolution Graph Convolutional Network (HR-GCN) designed to address the challenges of 2D-3D whole-body pose estimation. The proposed HR-GCN leverages the structural properties of graph convolutional networks to model the human skeleton, enabling accurate 3D pose estimation from 2D keypoints. The framework consists of two key modules: the High-Resolution Module (HRM) for extracting 3D body keypoints and coarse-grained features, and the Fine-Grained Keypoints Prediction Module (FGKPM) for refining the 3D coordinates of hands and face. Extensive experiments demonstrate the effectiveness of HR-GCN on the H3WB dataset, showcasing a significant reduction in Mean Per Joint Position Error (MPJPE) compared to existing state-of-the-art (SOTA) methods. The code and model are available at https://github.com/Z-mingyu/HR-GCN.git.
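The abstract reports results as Mean Per Joint Position Error (MPJPE). As a reminder of what that metric measures, here is a minimal sketch in plain Python: the average Euclidean distance between each predicted 3D keypoint and its ground-truth counterpart. The function name `mpjpe` and the toy coordinates are illustrative assumptions; this is not the authors' evaluation code, and it omits any alignment or normalisation step a benchmark protocol may apply.

```python
import math


def mpjpe(predicted, target):
    """Mean Euclidean distance between corresponding 3D keypoints.

    Both arguments are equal-length sequences of (x, y, z) tuples.
    Hypothetical helper for illustration only.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, target))
    return total / len(predicted)


# Toy example with three keypoints (made-up coordinates, arbitrary units).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
prediction = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0), (0.0, 2.0, 0.0)]

# Per-keypoint errors are 3, 4 and 0, so the mean is 7/3.
error = mpjpe(prediction, ground_truth)
print(f"MPJPE: {error:.3f}")
```

Lower values indicate predictions closer to the ground truth; a perfect prediction gives an MPJPE of zero.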

Original language: English
Journal: IEEE Sensors Journal
Early online date: 10 Apr 2025
DOIs
Publication status: Early online - 10 Apr 2025

Keywords

  • 3D whole-body pose estimation
  • Graph convolutional network
  • Human pose estimation
