Abstract: Continual Semantic Segmentation (CSS) is an emerging field in which catastrophic forgetting remains a persistent problem. In this paper, we propose a Text-to-Image Knowledge Preservation (TIKP) framework to address this issue. TIKP applies text-to-image techniques to CSS through automatic prompt generation and content adaptation. It extracts associations between the labels of seen data and constructs text-level prompts based on these associations, which are preserved and maintained at each incremental step. During training, these prompts are used to generate images of previously seen classes, mitigating catastrophic forgetting. In particular, since the generated images may follow a different distribution from the original data, TIKP transfers knowledge via a content adaptation loss, which weights each generated image's contribution to incremental training according to its similarity to the original data. In addition, for the classifier, we exploit the previous model from a different perspective: it tends to misclassify new classes as old objects rather than as background. We therefore propose a knowledge distillation loss based on these wrong labels, which assigns varying weights to individual objects during distillation. Extensive experiments conducted under the same settings show that TIKP outperforms state-of-the-art methods by a large margin on benchmark datasets.
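The abstract does not give the exact formulation of the content adaptation loss; as a rough illustration only, the following PyTorch sketch assumes one plausible instantiation, where the segmentation loss on each generated replay image is weighted by the image's feature similarity to the original data. All names (`content_adaptation_loss`, `gen_feats`, `ref_feats`) are hypothetical, not from the paper.

```python
import torch
import torch.nn.functional as F

def content_adaptation_loss(logits, targets, gen_feats, ref_feats):
    """Hypothetical sketch of a content-adaptation-style loss (not the
    paper's exact formulation).

    Weights the per-image segmentation loss on each generated (replay)
    image by its cosine similarity to a feature summary of the original
    training data, so in-distribution generations contribute more to
    incremental training than off-distribution ones.

    Args:
        logits:    (B, C, H, W) model predictions on generated images.
        targets:   (B, H, W)    pseudo-labels for the generated images.
        gen_feats: (B, D)       pooled features of each generated image.
        ref_feats: (D,)         mean feature of the original (real) data.
    """
    # Per-image similarity, clipped to [0, 1]: higher = closer to real data.
    sim = F.cosine_similarity(gen_feats, ref_feats.unsqueeze(0), dim=1)
    weight = sim.clamp(min=0.0)

    # Per-image cross-entropy over all pixels, then a similarity-weighted
    # average across the batch.
    ce = F.cross_entropy(logits, targets, reduction="none").mean(dim=(1, 2))
    return (weight * ce).sum() / (weight.sum() + 1e-8)
```

Under this reading, a generated image that drifts far from the original distribution receives a near-zero weight and therefore plays little role in the incremental update, which matches the abstract's description of similarity-dependent contributions.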