Keywords: Few-shot Learning, Incremental Learning, Point Cloud, Multimodal
Abstract: Few-shot class-incremental learning (FSCIL) is a practical yet challenging problem in visual representation learning, primarily due to two notorious issues: catastrophic forgetting of previously learned classes and overfitting to the currently introduced classes. Most existing approaches adopt prototype-based mechanisms to address these challenges and achieve impressive results on 2D images, but only a handful have explored FSCIL in the context of 3D point clouds. In this work, we propose GAUSS-Fusion, which combines (i) bidirectional cross-modal attention between 2D renderings and 3D points, (ii) a category-aware viewpoint selection strategy, and (iii) a Gaussian Memory for generative replay. Our method generates robust and homogeneous multi-modal representations. Extensive experiments demonstrate that our approach outperforms state-of-the-art approaches. Notably, our model reduces the relative accuracy dropping rate $\Delta$ by up to 10\% on six benchmark 3D FSCIL tasks and two noisy-data tasks, showcasing its superior robustness and adaptability.
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 16353