Keywords: Class Incremental Semantic Segmentation; Semantic Segmentation; Class Incremental Learning; Gaussian Mixture Models
TL;DR: We propose a Continual Gaussian Mixture Distribution (CoGaMiD) modeling method for class incremental semantic segmentation.
Abstract: Class incremental semantic segmentation (CISS) enables a model to continually segment new classes from non-stationary data while preserving previously learned knowledge. Recent top-performing approaches are prototype-based methods that assign a prototype to each learned class to reproduce previous knowledge. However, modeling each class distribution with only a single prototype, which remains fixed throughout the incremental process, presents two key limitations: (i) a single prototype is insufficient to accurately represent the complete class distribution when the incoming data stream for a class is naturally multimodal; (ii) the features of old classes may exhibit anisotropy during the incremental process, preventing fixed prototypes from faithfully reproducing the corresponding distribution. To address these limitations, we propose a Continual Gaussian Mixture Distribution (CoGaMiD) modeling method. Specifically, the means and covariance matrices of Gaussian Mixture Models (GMMs) are estimated to model the complete feature distributions of learned classes. These GMMs are stored and used to generate pseudo-features that support the learning of novel classes in incremental steps. Moreover, we introduce a Dynamic Adjustment (DA) strategy that uses the features of previous classes within incoming data streams to update the stored GMMs. This adaptive update mitigates the mismatch between fixed GMMs and continually evolving distributions. Furthermore, a Gaussian-based Representation Constraint (GRC) loss is proposed to enhance the discriminability of new classes, avoiding confusion between new and old classes. Extensive experiments on Pascal VOC and ADE20K show that our method achieves superior performance compared to previous methods, especially in more challenging long-term incremental scenarios.
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 6933
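
The abstract describes storing a GMM per learned class and sampling pseudo-features from it in later incremental steps. The following is a minimal sketch of that sampling idea only, not the authors' implementation. The function name `sample_pseudo_features`, the dictionary layout of `stored_gmms`, the 2-D feature size, and the use of diagonal covariances are all assumptions made for illustration; the abstract itself refers to full covariance matrices and gives no such details.

```python
import random

def sample_pseudo_features(components, n, rng):
    """Draw n pseudo-features from one class's stored mixture.

    components: list of (weight, mean, diag_variance) tuples.
    This layout is hypothetical; diagonal variances are a simplification.
    """
    weights = [w for w, _, _ in components]
    samples = []
    for _ in range(n):
        # Pick a mixture component in proportion to its weight.
        _, mean, var = rng.choices(components, weights=weights, k=1)[0]
        # Sample each feature dimension independently (diagonal covariance).
        samples.append([rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)])
    return samples

rng = random.Random(0)

# Hypothetical stored model for one old class with two modes.
stored_gmms = {
    "old_class": [
        (0.6, [0.0, 0.0], [0.01, 0.01]),
        (0.4, [5.0, 5.0], [0.01, 0.01]),
    ]
}

pseudo = sample_pseudo_features(stored_gmms["old_class"], 200, rng)
print(len(pseudo), "pseudo-features sampled")
```

With two well-separated components, the samples cluster around both modes, which a single stored prototype could not reproduce. In a training loop, such samples would stand in for old-class features that are no longer available in the incoming data.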