Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 posterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: self-supervised learning, representation learning, 3D point clouds, masked modeling, invariance learning, self-distillation, unsupervised learning
TL;DR: AsymDSD is a self-supervised learning framework for 3D point clouds that combines masked modeling and invariance learning through latent space prediction.
Abstract: Learning semantically meaningful representations from unstructured 3D point clouds remains a central challenge in computer vision, especially in the absence of large-scale labeled datasets. While masked point modeling (MPM) is widely used in self-supervised 3D learning, its reconstruction-based objective can limit its ability to capture high-level semantics. We propose AsymDSD, an Asymmetric Dual Self-Distillation framework that unifies masked modeling and invariance learning through prediction in the latent space rather than the input space. AsymDSD builds on a joint embedding architecture and introduces several key design choices: an efficient asymmetric setup, disabling attention between masked queries to prevent shape leakage, multi-mask sampling, and a point cloud adaptation of multi-crop. AsymDSD achieves state-of-the-art results on ScanObjectNN (90.53\%) and further improves to 93.72\% when pretrained on 930k shapes, surpassing prior methods.
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 23401
Loading