XHand: Real-time Expressive Hand Avatar

Qijun Gan, Zijie Zhou, Jianke Zhu
Zhejiang University
ganqijun@zju.edu.cn

Fig. 1. We present XHand, a rigged hand avatar that captures the geometry, appearance, and poses of the hand. XHand is created from multi-view videos and uses MANO pose parameters (the first image in each group of (a)) to generate high-detail meshes (the second) and renderings (the third). (b) XHand generates photo-realistic hand images in real time for a given pose sequence. (c) An example of personalized hand avatars animated according to poses from in-the-wild images.

Abstract

Hand avatars play a pivotal role in a wide array of digital interfaces, enhancing user immersion and facilitating natural interaction within virtual environments. While previous studies have focused on photo-realistic hand rendering, little attention has been paid to reconstructing hand geometry with fine details, which is essential to rendering quality. In extended reality and gaming, on-the-fly rendering is imperative. To this end, we introduce an expressive hand avatar, named XHand, that is designed to comprehensively generate hand shape, appearance, and deformations in real time. To obtain fine-grained hand meshes, we use three feature embedding modules to predict hand deformation displacements, albedo, and linear blend skinning weights, respectively. To achieve photo-realistic hand rendering on fine-grained meshes, our method employs a mesh-based neural renderer that leverages mesh topological consistency and latent codes from the embedding modules. During training, we propose a part-aware Laplace smoothing strategy that incorporates distinct levels of regularization to maintain the necessary details while eliminating undesired artifacts. Experimental evaluations on the InterHand2.6M and DeepHandMesh datasets demonstrate the efficacy of XHand, which obtains high-fidelity geometry and texture for hand animations across diverse poses in real time.
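To make the idea of a part-aware Laplace smoothing term more concrete, here is a minimal sketch in plain Python. It is not the formulation used in XHand: it assumes a uniform (umbrella) Laplacian, a per-vertex part label, and a per-part weight table, and the names `part_ids`, `part_weights`, and `part_aware_laplacian_loss` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def build_neighbors(faces):
    """Map each vertex index to the set of vertices it shares an edge with."""
    neighbors = defaultdict(set)
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors


def part_aware_laplacian_loss(vertices, faces, part_ids, part_weights):
    """Weighted sum of squared uniform-Laplacian magnitudes.

    vertices:     list of (x, y, z) tuples
    faces:        list of (i, j, k) vertex-index triples
    part_ids:     per-vertex part label (assumed, hypothetical input)
    part_weights: dict mapping part label -> regularization strength
                  (assumed, hypothetical input)
    """
    neighbors = build_neighbors(faces)
    loss = 0.0
    for i, v in enumerate(vertices):
        ring = neighbors.get(i)
        if not ring:
            continue  # isolated vertex: no smoothing term
        n = len(ring)
        # Centroid of the one-ring neighborhood.
        centroid = [sum(vertices[j][k] for j in ring) / n for k in range(3)]
        # Squared distance between the vertex and that centroid.
        delta_sq = sum((v[k] - centroid[k]) ** 2 for k in range(3))
        # Scale the penalty by the strength assigned to this vertex's part.
        loss += part_weights[part_ids[i]] * delta_sq
    return loss


# Usage: a unit square split into two triangles, with two made-up parts.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
tris = [(0, 1, 2), (0, 2, 3)]
parts = ["palm", "palm", "nail", "nail"]
weights = {"palm": 1.0, "nail": 0.1}  # smooth the palm more strongly
loss = part_aware_laplacian_loss(verts, tris, parts, weights)
```

Under these assumptions, a larger weight pulls the vertices of a part toward their neighbors' centroid (suppressing artifacts), while a smaller weight leaves room for fine detail in that part.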

Table 1. Rendering quality comparisons on the InterHand2.6M dataset. Our method delivers the best rendering quality while maintaining real-time performance.

Results on InterHand2.6M

Results on in-the-wild poses

Copyright

All code on this page is copyrighted by the authors and published under the CC BY-NC 4.0 International License. This means that you must attribute the work in the manner specified by the authors, and you may not use this work for commercial purposes.

BibTeX

@misc{gan2024xhandrealtimeexpressivehand,
      title={XHand: Real-time Expressive Hand Avatar},
      author={Qijun Gan and Zijie Zhou and Jianke Zhu},
      year={2024},
      eprint={2407.21002},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.21002},
}