Abstract: Neural implicit functions have proved successful in representing 3D shapes or surfaces at arbitrary resolutions and with high fidelity. Unfortunately, among the various reconstruction tasks that neural implicit representation methods target, reconstruction from discrete voxels remains limited because of the computational complexity involved. We address this problem by introducing Dual Hierarchical Representation (DHR), which enables faithful reconstruction under constrained computation through hierarchical encoding, decoding, and training procedures. A hierarchical set of latent feature codes is produced by first encoding the sparse voxelized shape into multi-scale feature grids and then grid-sampling each grid at a query point. The proposed transformer decoder then incorporates the individual latent codes in hierarchical order, directing feature-to-3D projection and modeling the interaction between latent features and occupancies via cross-attention. During training, representations derived from all feature hierarchies are integrated with varying contributions, providing a further global-to-local learning scheme. Experiments verify that DHR gains representation power, outperforming various baselines on voxel reconstruction tasks. It is also robust across different shape categories and, thanks to the generalization ability of the transformer, shows promise for use in the wild. Our code is available at https://github.com/JYeShin/DualHierarchicalRepresentation.
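The multi-scale encoding and grid-sampling step described above can be sketched as follows. This is a minimal NumPy illustration under our own assumptions: the grid resolutions, feature dimension, and function names (`trilinear_sample`, `hierarchical_codes`) are hypothetical and do not reflect the paper's actual implementation, which would typically use a learned encoder and GPU-side sampling such as PyTorch's `grid_sample`.

```python
import numpy as np

def trilinear_sample(grid, p):
    """Trilinearly interpolate a feature grid of shape (R, R, R, C)
    at a query point p in the unit cube [0, 1]^3."""
    R = grid.shape[0]
    # Map [0, 1] to continuous voxel coordinates [0, R - 1].
    x = np.asarray(p, dtype=float) * (R - 1)
    i0 = np.floor(x).astype(int)
    i1 = np.minimum(i0 + 1, R - 1)
    t = x - i0  # fractional offsets inside the cell
    out = np.zeros(grid.shape[-1])
    # Blend the 8 corner features of the enclosing cell.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0])
                     * (t[1] if dy else 1 - t[1])
                     * (t[2] if dz else 1 - t[2]))
                idx = (i1[0] if dx else i0[0],
                       i1[1] if dy else i0[1],
                       i1[2] if dz else i0[2])
                out += w * grid[idx]
    return out

def hierarchical_codes(feature_grids, query):
    """One latent code per scale, ordered coarse to fine."""
    return [trilinear_sample(g, query) for g in feature_grids]

# Toy stand-in for encoder output: grids at resolutions 4, 8, 16
# with feature dimension 8 (all values illustrative).
rng = np.random.default_rng(0)
grids = [rng.standard_normal((r, r, r, 8)) for r in (4, 8, 16)]
codes = hierarchical_codes(grids, query=(0.3, 0.7, 0.5))
```

In the full method, each per-scale code in `codes` would be fed to the transformer decoder in hierarchical order as a cross-attention key/value.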