Abstract: Input encoding has proven crucial to the success of methods based on neural radiance fields. Compared with general static scene modeling, input encoding for dynamic hand modeling has been less explored. Yet it is critical to modeling deformation and rendering, because it maps a sampled point in space to a representation containing all the information about the dynamic hand that is needed to infer the geometry and appearance of that point. The design of the input encoding therefore determines how well the neural network can learn photo-realistic hand rendering. We offer an in-depth examination of this key component and introduce DEHand, a new representation that uses Deformable Encoding for photo-realistic free-view and free-pose Hand rendering. DEHand combines deformable encoding with a latent code map to achieve high-quality, pose-controlled rendering. Deformable encoding adapts static input encoding techniques to view synthesis of dynamic hands, using a parametric hand mesh model as a proxy to construct encodings that map sampled points into a space capable of integrating information across different poses and providing rich information for hand modeling. Our findings demonstrate that, with our deformable encoding, a single multilayer perceptron (MLP) can achieve high-quality dynamic hand rendering while learning solely from images. Extensive experiments on InterHand2.6M validate the superior rendering quality of our method and the effectiveness of each component in our design.
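The idea of using a hand mesh as a proxy to map sampled points into a pose-independent space can be illustrated with a deliberately simplified sketch. This is not DEHand's encoding, which the abstract does not specify. It assumes a nearest-vertex lookup, a translation-only transfer to the rest pose, and a standard NeRF-style frequency encoding. All function names (`nearest_vertex`, `to_canonical`, `frequency_encode`, `deformable_encode`) and the toy two-vertex "mesh" are hypothetical.

```python
import math


def nearest_vertex(point, vertices):
    """Return the index of the mesh vertex closest to `point`."""
    return min(range(len(vertices)), key=lambda i: math.dist(point, vertices[i]))


def to_canonical(point, posed_vertices, rest_vertices):
    """Carry a point from posed space to rest-pose space via its nearest vertex.

    Assumption: only the translation of the nearest vertex is undone;
    a real system would also account for local rotation.
    """
    i = nearest_vertex(point, posed_vertices)
    offset = [p - v for p, v in zip(point, posed_vertices[i])]
    return [r + o for r, o in zip(rest_vertices[i], offset)]


def frequency_encode(coords, num_bands=4):
    """NeRF-style encoding: sin and cos of each coordinate at 2^k * pi."""
    features = []
    for k in range(num_bands):
        freq = (2.0 ** k) * math.pi
        for c in coords:
            features.append(math.sin(freq * c))
            features.append(math.cos(freq * c))
    return features


def deformable_encode(point, posed_vertices, rest_vertices, num_bands=4):
    """Encode a sampled point after mapping it into rest-pose space."""
    canonical = to_canonical(point, posed_vertices, rest_vertices)
    return frequency_encode(canonical, num_bands)


# Toy example: two vertices, with the second one displaced in pose B.
rest = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
pose_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
pose_b = [[0.0, 0.0, 0.0], [1.5, 0.2, 0.0]]

# The same surface-attached point observed under the two poses.
enc_a = deformable_encode([1.1, 0.0, 0.0], pose_a, rest)
enc_b = deformable_encode([1.6, 0.2, 0.0], pose_b, rest)
```

In this toy setting, a point that moves rigidly with its nearest vertex receives the same encoding under both poses (up to floating-point error), which is the property that lets a single network see consistent inputs across poses.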
External IDs: doi:10.1109/tmm.2025.3581749