Abstract: We propose a new method for reconstructing controllable implicit 3D human models from sparse multi-view RGB videos. Our method defines the neural scene representation on the mesh surface points and signed distances from the surface of a human body mesh. We identify an indistinguishability issue that arises when a point in 3D space is mapped to its nearest surface point on a mesh for learning a surface-aligned neural scene representation. To address this issue, we propose projecting a point onto a mesh surface using barycentric interpolation with modified vertex normals. Experiments on the ZJU-MoCap and Human3.6M datasets show that our approach achieves higher quality in novel-view and novel-pose synthesis than existing methods. We also demonstrate that our method easily supports the control of body shape and clothes. Project page: https://pfnet-research.github.io/surface-aligned-nerf/
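To make the surface-aligned parameterization concrete, the following minimal sketch shows how a point can be described by a position on a triangle (barycentric weights) plus a signed distance along a barycentrically interpolated normal. It is an illustration under stated assumptions, not the authors' implementation: the function names, the single hand-built triangle, and the vertex normals are all hypothetical, and the abstract does not specify how the vertex normals are modified. The sketch only evaluates the forward direction (surface coordinates and distance to a 3D point); the projection described above is the inverse problem and is not attempted here.

```python
import math


def weighted_sum(weights, vectors):
    """Barycentric combination: sum_i weights[i] * vectors[i], per component."""
    return tuple(
        sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(3)
    )


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def offset_point(bary, h, verts, vertex_normals):
    """Return the 3D point at signed distance h from the surface point
    given by barycentric weights `bary`, moving along the interpolated
    (then renormalized) vertex normal. Hypothetical helper for illustration."""
    surface_point = weighted_sum(bary, verts)
    normal = normalize(weighted_sum(bary, vertex_normals))
    return tuple(p + h * n for p, n in zip(surface_point, normal))


# Assumed toy data: one triangle in the z = 0 plane with hand-picked unit
# vertex normals that tilt outward at two of the corners.
VERTS = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
NORMALS = ((0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8))

if __name__ == "__main__":
    for bary in ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1 / 3, 1 / 3, 1 / 3)):
        print(bary, offset_point(bary, 2.0, VERTS, NORMALS))
```

Because the interpolated normal varies smoothly across the triangle, different barycentric weights send the offset point in different directions, which is the intuition behind pairing each 3D point with its own surface location and signed distance.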