# Parameter Release and Knowledge Reuse for Class-Incremental Semantic Segmentation


## Updates & News
- Our paper has been submitted to **ICLR 2026**.

## Abstract
Class-incremental semantic segmentation aims to progressively learn new classes while preserving previously acquired knowledge. The task becomes particularly challenging when earlier training samples are unavailable due to data privacy or storage restrictions, which leads to catastrophic forgetting. To address this issue, knowledge distillation is widely adopted as a constraint that maximizes the similarity of knowledge responses between the current model (learning new classes) and the previous model (retaining old ones). However, knowledge distillation inherently preserves the distribution of old knowledge with minimal modification, so when substantial information about old classes is retained, few parameters remain available for learning new classes. Furthermore, the old knowledge already acquired is rarely exploited to facilitate the learning of new knowledge, which wastes what was previously learned. Together, these two problems increase the risk of class confusion and widen the gap to joint learning. Based on this analysis, we propose Distribution-based Knowledge Distillation (DKD) via a minimization–maximization distribution strategy. On the one hand, to alleviate the parameter competition between old and new knowledge, we minimize the distribution of old knowledge by releasing parameters with low sensitivity to old classes. On the other hand, to make effective use of the valuable knowledge previously acquired, we maximize the distribution of knowledge shared between old and new classes after approximating the new knowledge distribution via Laplacian-based projection estimation. The proposed method achieves an excellent balance between stability and plasticity in nine diverse settings on Pascal VOC and ADE20K. Notably, its average performance approaches that of joint learning (the upper bound) while effectively reducing class confusion.
The source code is provided in the supplementary material and will be made publicly available upon acceptance.
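
To make the idea of "releasing parameters with low sensitivity to old classes" concrete, here is a toy sketch. It is not the DKD implementation: the abstract does not say how sensitivity is measured, so the sketch assumes a first-order score `|w * g|` (weight times old-class gradient), and the function name `release_mask` and the `release_ratio` parameter are hypothetical.

```python
def release_mask(weights, grads, release_ratio=0.25):
    """Return a 0/1 mask: 1 = keep constraining the parameter, 0 = release it.

    Assumption (not from the paper): sensitivity to old classes is scored
    as |w * g|, where g is the gradient of an old-class loss w.r.t. w.
    """
    scores = [abs(w * g) for w, g in zip(weights, grads)]
    n_release = int(len(scores) * release_ratio)
    # Indices of the n_release least sensitive parameters.
    order = sorted(range(len(scores)), key=scores.__getitem__)
    released = set(order[:n_release])
    return [0 if i in released else 1 for i in range(len(scores))]


# Toy values, chosen only for illustration.
weights = [0.8, -0.1, 0.5, 0.02]
old_class_grads = [0.9, 0.05, -0.4, 0.01]
mask = release_mask(weights, old_class_grads, release_ratio=0.5)
print(mask)  # [1, 0, 1, 0]
```

In this toy setup, the parameters with mask `0` would be left free for new classes, while those with mask `1` would stay under the distillation constraint.
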
## Requirements
Conda environment settings:
```
conda create -n DKD python=3.8
conda activate DKD
```
You need to install the following libraries:
```
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install -U openmim
mim install mmcv
```
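
After installation, a quick check can confirm which versions were actually picked up. The sketch below is not part of the repository: the helper name `report_versions` is hypothetical, and it assumes the installed distribution names match the package names used in the commands above (for example, that `mim install mmcv` registers a distribution called `mmcv`).

```python
from importlib import metadata

# Package names taken from the install commands above.
PACKAGES = ("torch", "torchvision", "torchaudio", "mmcv")


def report_versions(packages=PACKAGES):
    """Return {package: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Because it only reads package metadata, the check runs without importing `torch` and therefore also works on a machine without a GPU.
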
## Datasets
```
data_root/
    ├── VOCdevkit
    │   └── VOC2012/
    │       ├── Annotations/
    │       ├── ImageSet/
    │       ├── JPEGImages/
    │       └── SegmentationClassAug/
    ├── ADEChallengeData2016
    │   ├── annotations
    │   │   ├── training
    │   │   └── validation
    │   └── images
    │       ├── training
    │       └── validation
```
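
Before training, it can help to verify that the directories match the tree above. The following is a minimal sketch, not a script shipped with the code: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and the paths are copied verbatim from the layout shown here.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_DIRS = (
    "VOCdevkit/VOC2012/Annotations",
    "VOCdevkit/VOC2012/ImageSet",
    "VOCdevkit/VOC2012/JPEGImages",
    "VOCdevkit/VOC2012/SegmentationClassAug",
    "ADEChallengeData2016/annotations/training",
    "ADEChallengeData2016/annotations/validation",
    "ADEChallengeData2016/images/training",
    "ADEChallengeData2016/images/validation",
)


def missing_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data_root")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print(f"  data_root/{rel}")
    else:
        print("Dataset layout looks complete.")
```

Replace `"data_root"` with the actual dataset location. An empty result means every directory in the tree was found; the sketch does not check file contents.
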

## Training 
```
sh main.sh
```
We provide a training script ``main.sh`` for the 19-1 setting. Detailed training arguments are as follows:
```sh
python -m torch.distributed.launch --nproc_per_node={num_gpu} --master_port={port} main.py --config ./configs/voc.yaml --log {your_log_name}
```
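
As an illustration of how the placeholders might be filled in, the sketch below assembles the same command from Python. The helper `build_launch_cmd` is hypothetical, and the values `2`, `29500`, and `voc_19-1` are example choices, not settings prescribed by the authors; only the flags shown in the command above are used.

```python
import shlex


def build_launch_cmd(num_gpu, port, config="./configs/voc.yaml", log_name="my_run"):
    """Assemble the launch command shown above as an argv list."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpu}",
        f"--master_port={port}",
        "main.py",
        "--config", config,
        "--log", log_name,
    ]


# Example values only: 2 GPUs, port 29500, and a log name of our choosing.
cmd = build_launch_cmd(num_gpu=2, port=29500, log_name="voc_19-1")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root if launching from Python is preferred over the shell.
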

## Test
```
sh test.sh
```
We provide a test script ``test.sh`` for the 100-10 setting. Due to the 100 MB limit on supplementary materials, we are unable to upload the `.pth` file. If the paper is accepted, we will make it publicly available via a link. Detailed test arguments are as follows:
```sh
python -m torch.distributed.launch --nproc_per_node={num_gpu} --master_port={port} main.py --config ./configs/ade20k.yaml --log {your_log_name}
```



## Contributors and Contact
If there are any questions, feel free to contact the authors.
