

## 1 IMDB
Pass the Hugging Face model name to each script through its command-line arguments.
### 1.1 Data
The IMDB dataset is downloaded automatically from Hugging Face during training.

### 1.2 Fine-tune the teacher models
```bash
cd Classification

# GPT teacher
python IMDB/GPT/run_imdb.py --model openai-community/gpt2-medium --output_dir ./teacher/gpt

# BERT teacher
python IMDB/BERT/run_imdb.py --model bert-base-uncased --output_dir ./teacher/bert/
```

### 1.3 Knowledge distillation (KD)
```bash
# --teacher_model_name_or_path must point to the fine-tuned teacher from 1.2;
# adjust it if you used a different --output_dir there.

# Flex-KD
python IMDB/GPT/run_imdb_w_distillation.py --student_model_name_or_path openai-community/gpt2 --teacher_model_name_or_path ./IMDB/teacher/gpt --output_dir ./student/Flex-KD --student_hidden_size 768 --alpha_corr 0.5

# CKA
python IMDB/GPT/run_imdb_w_distillation.py --student_model_name_or_path openai-community/gpt2 --teacher_model_name_or_path ./IMDB/teacher/gpt --output_dir ./student/CKA --alpha_CKA 0.5

# Projector
python IMDB/GPT/run_imdb_w_distillation.py --student_model_name_or_path openai-community/gpt2 --teacher_model_name_or_path ./IMDB/teacher/gpt --output_dir ./student/Projector --student_hidden_size 768 --teacher_hidden_size 1024 --alpha_corr 0.5 --do_projector True
```
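
The commands above read the teacher from `./IMDB/teacher/gpt`, while 1.2 writes it to `./teacher/gpt`, so it can help to confirm that the checkpoint directory exists before starting a long distillation run. The following is a minimal sketch of such a check. The `check_teacher` helper is hypothetical and not part of this repository.

```bash
# Pre-flight check (hypothetical helper, not part of the repo):
# confirm the teacher checkpoint directory exists before distilling.
check_teacher() {
  local dir="$1"
  if [ -d "$dir" ]; then
    echo "found teacher checkpoint: $dir"
  else
    echo "missing teacher checkpoint: $dir" >&2
    return 1
  fi
}

# Use the same path you pass to --teacher_model_name_or_path.
check_teacher ./IMDB/teacher/gpt || echo "re-run 1.2 or fix --teacher_model_name_or_path"
```

The check only tests that the directory is present; it does not verify that the checkpoint inside it is complete.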

## 2 GLUE
Pass the Hugging Face model name to each script through its command-line arguments.
### 2.1 Data
```bash
cd Classification
cd GLUE
mkdir glue_data
python download_glue_data.py
cd ..  # back to Classification/, where the commands below are run
```

### 2.2 Fine-tune the teacher models
```bash
python GLUE/run_glue.py --model_type gpt --model_name_or_path openai-community/gpt2-medium --task_name sst-2 --data_dir ./GLUE/glue_data/SST-2/ --do_lower_case --max_seq_length 128 --do_train --per_gpu_train_batch_size 32 --per_gpu_eval_batch_size 32 --learning_rate 5e-5 --num_train_epochs 3.0 --output_dir ./model/gpt/sst2/teacher/
```

### 2.3 Knowledge distillation (KD)
```bash
# --teacher_model_name_or_path must point to the fine-tuned teacher from 2.2;
# adjust it if you used a different --output_dir there.

# Flex-KD
python GLUE/run_glue_w_distillation.py --teacher_model_name_or_path ./GLUE/gpt/sst2/teacher/ --student_model_name_or_path openai-community/gpt2 --task_name sst2 --max_seq_length 128 --output_dir ./results/student/sst2/ --alpha_glue 0.5 --alpha_corr 0.5 --num_train_epochs 3.0

# CKA
python GLUE/run_glue_w_distillation.py --teacher_model_name_or_path ./GLUE/gpt/sst2/teacher/ --student_model_name_or_path openai-community/gpt2 --task_name sst2 --max_seq_length 128 --output_dir ./results/student/sst2/ --alpha_glue 0.5 --alpha_CKA 0.5 --num_train_epochs 3.0

# Projector
python GLUE/run_glue_w_distillation.py --teacher_model_name_or_path ./GLUE/gpt/sst2/teacher/ --student_model_name_or_path openai-community/gpt2 --task_name sst2 --max_seq_length 128 --output_dir ./results/student/sst2/ --alpha_glue 0.5 --alpha_corr 0.5 --do_projector True --student_hidden_size 768 --teacher_hidden_size 1024 --num_train_epochs 3.0
```
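
All three GLUE commands above write to the same `--output_dir`, so running them one after another may mix or overwrite results. Below is a minimal sketch of one way to keep them apart by appending a per-method subdirectory. The `kd_command` helper and the subdirectory names `flex-kd`, `cka`, and `projector` are illustrative assumptions, not something the scripts require. The sketch only prints the commands; pipe the output to `bash` or copy the lines to actually run them.

```bash
# Dry run: print one KD command per method, each with its own output directory.
# Flags are taken from the commands above; subdirectory names are assumptions.
common="--teacher_model_name_or_path ./GLUE/gpt/sst2/teacher/ --student_model_name_or_path openai-community/gpt2 --task_name sst2 --max_seq_length 128 --alpha_glue 0.5 --num_train_epochs 3.0"

kd_command() {
  # $1 = method subdirectory, remaining arguments = method-specific flags
  local name="$1"
  shift
  echo "python GLUE/run_glue_w_distillation.py $common --output_dir ./results/student/sst2/$name/ $*"
}

kd_command flex-kd --alpha_corr 0.5
kd_command cka --alpha_CKA 0.5
kd_command projector --alpha_corr 0.5 --do_projector True --student_hidden_size 768 --teacher_hidden_size 1024
```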
