Dig into Detailed Structures: Key Context Encoding and Semantic-based Decoding for Point Cloud Completion
Abstract: Recovering the complete shape of a 3D object from limited viewpoints plays an important role in 3D vision. Encouraged by the effectiveness of deep neural networks for feature extraction, recent point cloud completion methods adopt an encoder-decoder architecture to generate the global structure and local geometry from a set of input point proxies. In this paper, we introduce a completion method that uncovers structural details in the input point cloud and makes full use of them. Specifically, we improve both the encoding and the decoding for this task: (1) Key Context Fusion Encoding extracts and aggregates homologous key context by adaptively increasing the sampling bias towards salient structure points and distinctive contour points, which better represent the object's structure. (2) Semantic-based Decoding introduces a semantic EdgeConv module that prompts the subsequent Transformer decoder, which effectively learns and generates local geometry from semantically correlated, non-nearest neighbors. We evaluate our method on several 3D point cloud and 2.5D depth-image datasets. Both qualitative and quantitative evaluations demonstrate that our method outperforms previous state-of-the-art methods.
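To make the decoding idea concrete, below is a minimal sketch (not the authors' implementation, which is not included in this submission page) of what a "semantic EdgeConv" layer could look like: standard EdgeConv (Wang et al., DGCNN) with neighbors selected by k-NN in feature space rather than 3D coordinates, so that edges can connect semantically correlated but spatially non-nearest points. All names, sizes, and hyperparameters here are illustrative assumptions.

```python
import torch
import torch.nn as nn


def knn(features: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbors in feature space.

    features: (B, C, N) per-point features -> returns (B, N, k) indices.
    """
    # Pairwise squared distances via ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2,
    # negated so that topk returns the nearest points.
    inner = -2 * torch.matmul(features.transpose(2, 1), features)  # (B, N, N)
    sq = torch.sum(features ** 2, dim=1, keepdim=True)             # (B, 1, N)
    neg_dist = -sq.transpose(2, 1) - inner - sq                    # (B, N, N)
    return neg_dist.topk(k=k, dim=-1).indices                      # (B, N, k)


class SemanticEdgeConv(nn.Module):
    """EdgeConv whose graph is built in feature (semantic) space."""

    def __init__(self, in_dim: int, out_dim: int, k: int = 16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_dim, out_dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_dim),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, N) per-point features.
        B, C, N = x.shape
        idx = knn(x, self.k)  # semantic neighbors, possibly far apart in 3D
        # Flatten indices so we can gather neighbors across the batch.
        offset = torch.arange(B, device=x.device).view(B, 1, 1) * N
        idx = (idx + offset).view(-1)
        feats = x.transpose(2, 1).reshape(B * N, C)
        neighbors = feats[idx].view(B, N, self.k, C)               # (B, N, k, C)
        center = x.transpose(2, 1).unsqueeze(2).expand(-1, -1, self.k, -1)
        # Edge feature [x_i, x_j - x_i], as in DGCNN.
        edge = torch.cat([center, neighbors - center], dim=-1)     # (B, N, k, 2C)
        edge = edge.permute(0, 3, 1, 2)                            # (B, 2C, N, k)
        out = self.mlp(edge)
        return out.max(dim=-1).values                              # (B, out_dim, N)
```

Stacking such layers (with point coordinates or encoder features as input) lets the neighborhood graph drift from spatial to semantic as features deepen; the layer's output could then serve as the prompt to a Transformer decoder, as the abstract describes.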
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Content] Media Interpretation, [Experience] Multimedia Applications
Relevance To Conference: This paper presents a model for 3D point cloud completion, an important step towards enriching 3D and 2.5D applications by generating more accurate and detailed 3D models from partial data. Our model falls within the Generative Multimedia area of this conference. The approach not only achieves strong 3D shape completion, but also benefits downstream multimedia applications.
Supplementary Material: zip
Submission Number: 1032