# MedCRP-CL: Continual Medical Image Segmentation via Bayesian Nonparametric Semantic Modality Discovery

## Overview

MedCRP-CL performs **online semantic modality discovery** and **modality-aware continual learning** for medical image segmentation. Using the Chinese Restaurant Process (CRP), our method dynamically infers semantic modality assignments from clinical text prompts as tasks arrive, without requiring a predefined number of modalities or access to future tasks.

### Key Features

- **Bayesian Semantic Modality Discovery**: CRP-based inference discovers semantic modalities from clinical prompts automatically
- **Modality-Specific LoRA**: Isolated low-rank adapters prevent cross-modality interference
- **Intra-Modality EWC**: Fisher information regularization consolidates knowledge within semantic modalities
- **Replay-Free**: Stores only aggregate statistics rather than past images, which helps meet medical data privacy requirements (HIPAA/GDPR)

### Results

| Method | Dice (%) | Forgetting (%) | Params (M) |
|--------|----------|----------------|------------|
| Sequential | 48.0 | 28.3 | 1.2 |
| EWC | 56.8 | 11.3 | 1.2 |
| MoE-Adapters | 65.3 | 7.1 | 51.9 |
| **Ours** | **73.3** | **4.1** | **8.6** |
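
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It is an illustration, not the code in `metrics.py`: the function names, the empty-mask convention, and the forgetting definition (best earlier score minus final score, averaged over all but the last task) are assumptions, since the table does not spell them out.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks, in [0, 1]."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; counting this as perfect agreement is a convention.
        return 1.0
    return float(2.0 * intersection / denom)


def average_forgetting(history):
    """Average forgetting over a task sequence.

    history[i][j] is the Dice on task j measured after training on task i
    (only j <= i is available). Forgetting of task j is assumed to be the
    best score seen before the final step minus the final score.
    """
    final = history[-1]
    drops = []
    for j in range(len(final) - 1):  # the last task has not been forgotten yet
        best_earlier = max(history[i][j] for i in range(j, len(history) - 1))
        drops.append(best_earlier - final[j])
    return sum(drops) / len(drops) if drops else 0.0
```

Both functions return fractions; multiply by 100 to compare with the percentages in the table.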

## Installation
```bash
pip install torch torchvision transformers numpy pillow tqdm
```

## Pretrained Backbone

We use [CLIPSeg](https://huggingface.co/CIDAS/clipseg-rd64-refined) as the backbone. The model is downloaded automatically from the Hugging Face Hub on first run.

## Dataset

### Sources

| Imaging Type | Organ | Dataset | Link |
|--------------|-------|---------|------|
| Endoscopy | Colon | Kvasir-SEG | https://datasets.simula.no/kvasir-seg/ |
| Endoscopy | Colon | ClinicDB | https://www.kaggle.com/datasets/balraj98/cvcclinicdb |
| Endoscopy | Colon | CVC-300 / ETIS | https://drive.google.com/drive/folders/10QXjxBJqCf7PAXqbDvoceWmZ-qF07tFi |
| Endoscopy | Colon | ColonDB | https://paperswithcode.com/sota/medical-image-segmentation-on-cvc-colondb |
| Dermoscopy | Skin | ISIC | https://challenge.isic-archive.com/data/ |
| Ultrasound | Heart | CAMUS | http://humanheart-project.creatis.insa-lyon.fr/database/#collection/6373703d73e9f0047faa1bc8 |
| Ultrasound | Breast | BUSI | https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset |
| X-ray | Chest | CheXlocalize | https://stanfordaimi.azurewebsites.net/datasets/23c56a0d-15de-405b-87c8-99c30138950c |

### Structure
```
data/
├── kvasir_polyp/
│   ├── anns/
│   │   ├── train.json
│   │   ├── val.json
│   │   └── test.json
│   ├── images/
│   └── masks/
├── clinicdb_polyp/
├── cvc300_polyp/
├── colondb_polyp/
├── etis_polyp/
├── isic/
├── camus/
├── busi_benign/
├── busi_malignant/
├── chex_airspace_opacity/
├── chex_atelectasis/
├── chex_cardiomegaly/
├── chex_edema/
├── chex_pleural_effusion/
├── chex_enlarged_cardiomediastinum/
└── chex_support_devices/
```
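
Before training, it can help to confirm that every dataset folder matches this layout. The helper below is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes that every dataset folder follows the same `anns/`, `images/`, `masks/` layout shown for `kvasir_polyp`.

```python
from pathlib import Path

# Assumption: all dataset folders mirror the kvasir_polyp layout shown above.
REQUIRED_FILES = ("anns/train.json", "anns/val.json", "anns/test.json")
REQUIRED_DIRS = ("images", "masks")


def missing_entries(dataset_dir):
    """Return the required files and folders that are absent from one dataset."""
    root = Path(dataset_dir)
    missing = [f for f in REQUIRED_FILES if not (root / f).is_file()]
    missing += [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    return missing


def check_data_root(data_root="data"):
    """Map each dataset folder name under data_root to its missing entries."""
    root = Path(data_root)
    return {p.name: missing_entries(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

A dataset is complete when its list is empty, so `{name: m for name, m in check_data_root().items() if m}` lists only the folders that still need attention.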


## Prompt Format

The table below gives one example of a concise and a detailed prompt per dataset.

| Dataset | Concise Prompt | Detailed Prompt |
|---------|----------------|-----------------|
| Endoscopy (Polyp) | `polyp` | `one medium pink round polyp which is a small lump in the lining of colon located in the top left of the image` |
| ISIC | `skin melanoma` | `one large brown oval skin melanoma which is a spot with dark speckles located in right of the image` |
| CAMUS | `Left ventricular cavity` | `Left ventricular cavity in two-chamber view of the heart at the end of the diastole cycle of a 46-year-old female with poor image quality` |
| BUSI | `Benign tumor` | `Two medium square-shaped benign tumors at the center, left in the breast ultrasound image` |
| CheXlocalize | `Airspace Opacity` | `Airspace Opacity of shape rectangle, and located in right of the frontal view of a Chest Xray` |

## Usage
```bash
python train.py
```

## Method

### 1. Bayesian Semantic Modality Discovery

Tasks are assigned to semantic modalities via a CRP prior combined with adaptive similarity distributions:
```
P(z_t = k | z_{1:t-1}, e_t) ∝ n_k · exp(ℓ(s_{t,k}))     # existing modality
P(z_t = new | z_{1:t-1}, e_t) ∝ α · exp(-ℓ(s_{t,k*}))   # new modality
```

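Here `n_k` is the number of earlier tasks assigned to modality `k`, `α` is the CRP concentration parameter, `s_{t,k}` is the similarity between the prompt embedding `e_t` and modality `k`, and `k*` is the best-matching existing modality. `ℓ(s)` is the log-likelihood ratio learned from the intra- and inter-modality similarity distributions, whose running statistics are maintained with Welford's online algorithm.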
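
The assignment rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the `AdaptiveCRPModalityManager` implementation: it models both similarity distributions as Gaussians (a natural fit for Welford's mean/variance updates, but not confirmed by the text), takes `k*` to be the existing modality with the highest similarity, and uses hypothetical names throughout.

```python
import math


class RunningGaussian:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def log_pdf(self, x, min_var=1e-4):
        # Sample variance, floored so early estimates do not degenerate.
        var = max(self.m2 / (self.n - 1), min_var) if self.n > 1 else min_var
        return -0.5 * (math.log(2 * math.pi * var) + (x - self.mean) ** 2 / var)


def log_likelihood_ratio(s, intra, inter):
    """l(s): log p(s | same modality) - log p(s | different modality)."""
    return intra.log_pdf(s) - inter.log_pdf(s)


def crp_posterior(counts, sims, alpha, intra, inter):
    """Probabilities over [existing modality 0..K-1, new modality].

    counts[k] is n_k and sims[k] is s_{t,k}.
    """
    if not counts:
        return [1.0]  # the first task always opens a new modality
    llr = [log_likelihood_ratio(s, intra, inter) for s in sims]
    # Assumption: k* is the existing modality with the highest similarity.
    k_star = max(range(len(sims)), key=sims.__getitem__)
    log_w = [math.log(n) + l for n, l in zip(counts, llr)]
    log_w.append(math.log(alpha) - llr[k_star])
    # Normalize in log space for numerical stability.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]
```

With this rule, a prompt that closely matches an existing modality yields a large positive `l(s)` and is absorbed into it, while a prompt that matches nothing yields a negative `l(s)` for every modality and shifts the mass toward the new-modality entry.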

### 2. Dynamic Semantic Modality-Specific LoRA

Each semantic modality `k` maintains its own rank-`r` adapter pair (`A_k`, `B_k`), applied to the Q, K, V, and output projections on top of the frozen backbone weight `W_0`:
```
W_k = W_0 + (α_LoRA / r) · B_k · A_k
```
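
The formula can be illustrated with a small NumPy sketch of one linear layer that keeps a separate adapter pair per modality. It is not the `CLIPModalityLoRAModel` code: the class name, the initialization (small random `A_k`, zero `B_k`, as in standard LoRA), and the default hyperparameters are assumptions made for the example.

```python
import numpy as np


class ModalityLoRALinear:
    """A frozen weight W_0 plus one low-rank adapter pair per modality."""

    def __init__(self, w0, rank=4, lora_alpha=8.0, seed=0):
        self.w0 = np.asarray(w0, dtype=float)  # frozen, shape (d_out, d_in)
        self.rank = rank
        self.scale = lora_alpha / rank  # the (alpha_LoRA / r) factor
        self.rng = np.random.default_rng(seed)
        self.adapters = {}  # modality id -> (A_k, B_k)

    def add_modality(self, k):
        """Allocate a fresh adapter pair when a new modality is discovered."""
        d_out, d_in = self.w0.shape
        a = self.rng.normal(0.0, 0.01, size=(self.rank, d_in))
        b = np.zeros((d_out, self.rank))  # zero init, so W_k == W_0 at first
        self.adapters[k] = (a, b)

    def effective_weight(self, k):
        """W_k = W_0 + (alpha_LoRA / r) * B_k @ A_k."""
        a, b = self.adapters[k]
        return self.w0 + self.scale * (b @ a)

    def forward(self, x, k):
        """Apply the layer to inputs of shape (batch, d_in) for modality k."""
        return x @ self.effective_weight(k).T
```

Because each modality reads and writes only its own `(A_k, B_k)`, an update to one modality's adapters leaves every other modality's effective weight untouched.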

### 3. Intra-Modality EWC

Fisher information regularization mitigates forgetting within each semantic modality:
```
Ω_k(θ_k) = Σ_i F̄_{k,i} (θ_{k,i} - θ*_{k,i})²
```

EWC applies only within semantic modalities; tasks in different semantic modalities have no parameter interaction.
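
A minimal NumPy sketch of the per-modality penalty follows. It is not the `EWCManager` implementation: the class and method names are hypothetical, the Fisher information is assumed to be diagonal, and `F̄_k` is interpreted as a running mean of the Fisher estimates over the tasks assigned to modality `k`.

```python
import numpy as np


class IntraModalityEWC:
    """Keeps one Fisher estimate and one anchor point per semantic modality."""

    def __init__(self):
        self.fisher = {}   # k -> running mean of diagonal Fisher (assumed F-bar_k)
        self.anchor = {}   # k -> parameters theta*_k after the latest task in k
        self.n_tasks = {}  # k -> number of tasks consolidated so far

    def consolidate(self, k, theta, task_fisher):
        """Fold a finished task's Fisher estimate into modality k."""
        theta = np.asarray(theta, dtype=float)
        task_fisher = np.asarray(task_fisher, dtype=float)
        n = self.n_tasks.get(k, 0)
        if n == 0:
            self.fisher[k] = task_fisher.copy()
        else:
            self.fisher[k] = (n * self.fisher[k] + task_fisher) / (n + 1)
        self.anchor[k] = theta.copy()
        self.n_tasks[k] = n + 1

    def penalty(self, k, theta):
        """Omega_k = sum_i F-bar_{k,i} * (theta_{k,i} - theta*_{k,i})^2."""
        if k not in self.fisher:
            return 0.0  # first task in a modality has nothing to protect
        diff = np.asarray(theta, dtype=float) - self.anchor[k]
        return float(np.sum(self.fisher[k] * diff ** 2))
```

During training on a task in modality `k`, this penalty would be added to the segmentation loss with a weighting coefficient; modalities other than `k` contribute nothing, which matches the isolation described above.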

## Project Structure
```
├── train.py              # Entry point
├── src/
│   ├── model.py          # CLIPModalityLoRAModel, AdaptiveCRPModalityManager, EWCManager
│   ├── dataset.py        # MedVLSMDataset, utilities
│   └── trainer.py        # Training and evaluation
├── metrics.py            # Dice score computation
├── prompt_strategies.py  # Prompt selection strategies
└── task_orders.py        # Task sequence definitions
```