# Continual Multitask Learning (CMTL) and Learning with Preserving (LwP)
In this project, we introduce Continual Multitask Learning (CMTL), a learning paradigm for real-world scenarios where tasks arrive sequentially with different label spaces but share the same underlying data distribution. This situation arises when gathering a complete dataset at once is impractical because of privacy, resource, or annotation constraints. CMTL poses challenges beyond those of traditional Continual Learning (CL) and Multitask Learning (MTL): a model must learn new tasks without forgetting previous ones while still generalizing across all tasks.
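
To make the setting concrete, the following is a toy sketch of a CMTL-style task stream. It assumes, for simplicity, that every task labels the same pool of inputs; the task names, labeling functions, and data are hypothetical and are not taken from the repository.

```python
# Toy illustration of a CMTL-style stream: shared inputs, one label space per task.
# All names and data below are hypothetical, chosen only for illustration.
import random

random.seed(0)

# Shared input pool: every task draws on the same underlying data.
inputs = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]

# Each task annotates those inputs with its own label space.
label_fns = {
    "task_a": lambda x: int(x[0] > 0),                   # 2 classes
    "task_b": lambda x: int(x[1] > 0) + int(x[2] > 0),   # 3 classes
    "task_c": lambda x: int(x[3] > 0),                   # 2 classes
}


def task_stream(inputs, label_fns):
    """Yield (task_name, dataset) one task at a time, mimicking sequential arrival."""
    for name, fn in label_fns.items():
        yield name, [(x, fn(x)) for x in inputs]


seen = []
for name, dataset in task_stream(inputs, label_fns):
    seen.append(name)
    # Only the current task's labels are available at this step.
    print(name, "labels:", [y for _, y in dataset])
```

A model trained on this stream sees `task_a` labels first and never receives them again once `task_b` arrives, yet it is expected to perform well on all three tasks at the end.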

To tackle these challenges, we propose Learning with Preserving (LwP), a framework that preserves previously learned knowledge while keeping it relevant and applicable to future tasks. This approach removes the need for replay buffers while keeping the learned representations robust and generalizable across sequential tasks.
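
One generic way to preserve a latent space without a replay buffer is to penalize changes in the pairwise distances between latent vectors produced by a frozen copy of the old encoder and by the current encoder on the same batch. The sketch below illustrates that idea only; it is an assumption for illustration and is not necessarily the loss used by LwP. The function names and toy vectors are hypothetical.

```python
# Minimal sketch of a pairwise-distance preservation penalty (illustrative assumption,
# not necessarily the LwP objective). Uses only the standard library (Python 3.8+).
import math


def pairwise_distances(batch):
    """Euclidean distance between every pair of latent vectors in the batch."""
    n = len(batch)
    return [[math.dist(batch[i], batch[j]) for j in range(n)] for i in range(n)]


def preservation_loss(z_old, z_new):
    """Mean squared difference between the old and new pairwise-distance matrices."""
    d_old = pairwise_distances(z_old)
    d_new = pairwise_distances(z_new)
    n = len(z_old)
    total = sum((d_old[i][j] - d_new[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


# Latents from a hypothetical frozen "old" encoder on one batch.
z_old = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

# A pure translation keeps all pairwise distances, so the penalty vanishes.
z_shifted = [[a + 1.0, b - 2.0] for a, b in z_old]

# A uniform rescaling changes the distances, so the penalty is positive.
z_scaled = [[2.0 * a, 2.0 * b] for a, b in z_old]

loss_shifted = preservation_loss(z_old, z_shifted)
loss_scaled = preservation_loss(z_old, z_scaled)
print(loss_shifted, loss_scaled)
```

Because the penalty depends only on the current batch and the frozen encoder's outputs, no samples from earlier tasks need to be stored.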

# Key Contributions:
- We propose Continual Multitask Learning (CMTL) and highlight its unique challenges: tasks arrive sequentially, and models must meet the goals of both continual and multitask learning.
- We introduce Learning with Preserving (LwP), a method designed to maintain the integrity of the latent space, ensuring knowledge from past tasks is preserved and beneficial for future ones.
- Through extensive evaluations, we show that LwP outperforms existing baselines, including traditional CL and MTL models, particularly in CMTL scenarios where data is introduced incrementally.

# Train CL and MTL strategies on CelebA dataset (10 tasks, 32x32x3, 5 runs):

## CL strategies
main.py --job cl --model {lwp|der|derpp|fdr|lwf|gss|er} --dataset celeba -is 32 --num_seed 5
## MTL strategies
main.py --job mtl --model {mtl|pcgrad|imtl|nashmtl} --dataset celeba -is 32 --num_seed 5 -tsr 0.1
## Evaluation
eval.py --job {cl|mtl} --dataset celeba -is 32 --num_seed 5
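
The `{a|b|...}` notation in the commands above appears to mean "pick one value". As a convenience, the sketch below expands the CelebA CL command into one command line per model. It assumes the scripts are launched with a `python` interpreter from the repository root, which the commands above do not state; the helper name is hypothetical. The sketch only prints the commands and does not run them.

```python
# Sketch: expand the CelebA CL command into one command line per model.
# Assumption: scripts are invoked as "python main.py ..." (not stated above).
import shlex

# Model names copied from the CelebA CL command.
CL_MODELS = ["lwp", "der", "derpp", "fdr", "lwf", "gss", "er"]


def celeba_cl_commands(models=CL_MODELS, interpreter="python"):
    """Return one shell-quoted command string per CL model."""
    commands = []
    for model in models:
        argv = [
            interpreter, "main.py",
            "--job", "cl",
            "--model", model,
            "--dataset", "celeba",
            "-is", "32",
            "--num_seed", "5",
        ]
        commands.append(shlex.join(argv))
    return commands


commands = celeba_cl_commands()
for command in commands:
    print(command)
```

To actually launch the runs, each printed line could be pasted into a shell or passed to `subprocess.run` with `shlex.split`.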


# Train CL and MTL strategies on PhysiQ dataset (3 tasks, 128x6, 5 runs):

## CL strategies
main.py --job cl --model {lwp|der|derpp|fdr|lwf|gss} --dataset physiq -is 128 -l 0.01 --z_dim 128 --num_seed 5 -b 32
## MTL strategies
main.py --job mtl --model {mtl|pcgrad|imtl|nashmtl|single} --dataset physiq -is 128 -l 0.01 --z_dim 128 --num_seed 5 -b 32 -tsr 0.33
## Evaluation
eval.py --job {cl|mtl} --dataset physiq -is 128 --z_dim 128 --num_seed 5 -b 32


# Train CL and MTL strategies on FairFace dataset (3 tasks, 128x128x3, 5 runs):

## CL strategies
main.py --job cl --model {lwp|der|derpp|fdr|lwf|gss} --dataset fairface -is 128 --num_seed 5
## MTL strategies
main.py --job mtl --model {mtl|pcgrad|imtl|nashmtl} --dataset fairface -is 128 --num_seed 5 -tsr 0.333



# Convergence speed evaluation for PhysiQ and CelebA datasets:

## PhysiQ CL and MTL speed eval
eval_speed_main.py --job {cl|mtl} --model {lwp|er|der|derpp|fdr|gss|lwf|mtl|pcgrad|imtl|nashmtl} --dataset physiq -is 128 -l 0.01 --z_dim 128 --num_seed 5 -b 32 -tsr 0.33
## CelebA CL and MTL speed eval
eval_speed_main.py --job {cl|mtl} --model {lwp|der|fdr|lwf|gss|mtl|nashmtl} --dataset celeba -is 32 --num_seed 5 --eval_period 40 -tsr .1


