Efficient Point Cloud Geometry Compression Through Neighborhood Point Transformer

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission
Abstract: Although the convolutional representation of multiscale sparse tensors has demonstrated superior efficiency in compressing Point Cloud Geometry (PCG) by exploiting cross-scale and same-scale correlations, its capacity remains bounded. This is because 1) the fixed receptive field of convolution cannot best characterize sparsely and irregularly distributed points; and 2) pretrained convolutions with fixed weights are insufficient to capture dynamic information conditioned on the input. This work proposes the Neighborhood Point transFormer (NPFormer), which replaces existing solutions by combining convolution and the attention mechanism to best exploit correlations under the multiscale representation framework for better geometry occupancy probability estimation. To this end, a Neighborhood Point Attention (NPA) layer is devised and stacked with Sparse Convolution layers (SConvs) to form the NPFormer. In NPA, each point uses its k Nearest Neighbors (kNN) to construct an adaptive local neighborhood and then leverages self-attention to dynamically aggregate information within this neighborhood. Compared with the anchor using standardized G-PCC, our method provides average BD-Rate gains of 17% and bitrate reductions of 14% in lossy and lossless modes, respectively, when compressing LiDAR point clouds (e.g., SemanticKITTI, Ford). It also achieves 20%-40% lossy BD-Rate improvements and 37%-53% lossless bitrate reductions when compressing object point clouds (e.g., MVUB, MPEG 8i). Compared with the state-of-the-art solution using an attention-optimized octree codec, our approach requires far less decoding runtime, with about a 640x speedup on average, while still delivering better compression efficiency.
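The core idea of the NPA layer described above, building a kNN neighborhood per point and aggregating features with self-attention, can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification: the single-head, projection-free attention below and the function names are illustrative only, not the paper's actual layer, which is stacked with sparse convolutions and uses learned projections.

```python
import numpy as np

def knn_indices(points, k):
    # Pairwise squared distances; return each point's k nearest
    # neighbors (a point's own index is included, distance 0).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def neighborhood_point_attention(points, feats, k=8):
    """Toy sketch of kNN self-attention aggregation (hypothetical,
    single head, no learned query/key/value projections)."""
    n, c = feats.shape
    idx = knn_indices(points, k)                  # (n, k)
    q = feats                                     # one query per point
    kv = feats[idx]                               # keys/values: (n, k, c)
    # Scaled dot-product attention restricted to each local neighborhood.
    logits = np.einsum('nc,nkc->nk', q, kv) / np.sqrt(c)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # softmax over neighbors
    return np.einsum('nk,nkc->nc', w, kv)         # aggregated features (n, c)
```

Because the neighborhood is recomputed per point from its actual coordinates and the weights depend on the input features, the receptive field and the aggregation are both adaptive, in contrast to a convolution's fixed kernel support and pretrained weights.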
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)