# TLXML: Task-Level Explanation of Meta-Learning via Influence Functions
## Requirements
- learn2learn 0.2
- torch 1.13 or higher
- torchvision 
- torch-summary
- torchsummary
- tensorboard
- pandas
- matplotlib
- scipy
- opencv-python

## Setup
```
source set_pythonpath.sh
```
## 1. MAML
```
cd examples/maml_l2l
```
### 1-1. Fully connected network (Task distinction)
(Case of 128 training tasks)
#### 1-1-a. MiniImagenet
Extracting 32 centroids for BoVW (Bag of Visual Words) features (MiniImagenet only)
```
python sift_extractor.py --k 32
```
Out: ./cache/\<yyyy-mmdd-hhmmss\>/mifeature_k32_ndata38400.pkl
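
For orientation, a bag-of-visual-words encoding typically assigns each local descriptor of an image to its nearest centroid and counts the assignments, giving one k-dimensional vector per image. The sketch below shows that step in plain Python. It is not the code in `sift_extractor.py`: the function names are hypothetical, and normalizing the histogram to sum to 1 is an assumption.

```python
def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to `descriptor` (squared Euclidean distance)."""
    best_idx, best_dist = 0, float("inf")
    for i, centroid in enumerate(centroids):
        dist = sum((x - y) ** 2 for x, y in zip(descriptor, centroid))
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx


def bovw_histogram(descriptors, centroids):
    """Normalized count of descriptors per centroid (hypothetical helper)."""
    counts = [0] * len(centroids)
    for descriptor in descriptors:
        counts[nearest_centroid(descriptor, centroids)] += 1
    total = sum(counts)
    # An image with no descriptors maps to the all-zero vector.
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Toy data: 2 centroids and 3 two-dimensional "descriptors".
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [9.0, 9.0], [10.0, 11.0]]
toy_hist = bovw_histogram(toy_descriptors, toy_centroids)
```

With `--k 32` the histogram would have 32 entries, one per centroid.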

Training
```
python fc_maml.py --num-iterations 1000 --cuda --k 32 --hidden 32 --activation relu --num-tasks 128 --sift-centroids cache/<yyyy-mmdd-hhmmss>/mifeature_k32_ndata38400.pkl
```
Out (models at different checkpoints) : ./cache/\<yyyy-mmdd-hhmmss\>/maml_k32_layer32_tasks128_mbs32_ways5_shots5_\<iteration\>.pth
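
Because the run directory and the iteration are only known after training, a small helper can locate the checkpoint to pass to `--ckpt`. This is a sketch, not part of the repository: `latest_checkpoint` is a hypothetical name, and it assumes that `<iteration>` is an integer and that run directories follow the `yyyy-mmdd-hhmmss` pattern shown above (so sorting by name is chronological).

```python
import re
from pathlib import Path

# Matches the trailing "_<iteration>.pth" of a checkpoint file name.
ITERATION_RE = re.compile(r"_(\d+)\.pth$")


def latest_checkpoint(cache_root="cache", prefix="maml_"):
    """Return the highest-iteration checkpoint in the newest run directory, or None."""
    root = Path(cache_root)
    if not root.is_dir():
        return None
    run_dirs = sorted(d for d in root.iterdir() if d.is_dir())
    for run_dir in reversed(run_dirs):  # newest timestamp first
        found = []
        for path in run_dir.glob(f"{prefix}*.pth"):
            match = ITERATION_RE.search(path.name)
            if match:
                found.append((int(match.group(1)), path))
        if found:
            return max(found, key=lambda item: item[0])[1]
    return None
```

Calling `latest_checkpoint("cache")` from `examples/maml_l2l` would return a `pathlib.Path`, or `None` if no checkpoint exists yet.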

Creating an explainer

(exact Hessian)
```
python explainer_fc_maml.py --cuda --k 32 --hidden 32 --activation relu --ckpt cache/<yyyy-mmdd-hhmmss>/maml_k32_layer32_tasks128_mbs32_ways5_shots5_<iteration>.pth --num-tasks 128 --sift-centroids cache/<yyyy-mmdd-hhmmss>/mifeature_k32_ndata38400.pkl
```
Out (the explainer) : ./cache/\<yyyy-mmdd-hhmmss\>/expl_maml_k32_layer32_tasks128_mbs32_ways5_shots5_\<iteration\>.pkl
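
As background on why the explainer needs a Hessian: the classic influence-function score estimates how a test loss changes when one training item is upweighted, as minus the test gradient times the inverse Hessian times the training gradient. The toy below checks that formula against a finite difference, using one scalar parameter and quadratic per-task losses. Everything in it (the loss form and the names `fit` and `influence`) is an illustrative assumption. It does not reproduce `explainer_fc_maml.py`, which works on meta-parameters and whole tasks.

```python
def loss(theta, task):
    """Toy task loss 0.5 * a * (theta - c)**2 with task = (a, c)."""
    a, c = task
    return 0.5 * a * (theta - c) ** 2


def grad(theta, task):
    """Derivative of the toy task loss with respect to theta."""
    a, c = task
    return a * (theta - c)


def fit(tasks, upweight=None, eps=0.0):
    """Closed-form minimizer of mean task loss plus eps * loss(tasks[upweight])."""
    n = len(tasks)
    num = sum(a * c for a, c in tasks) / n
    den = sum(a for a, _ in tasks) / n
    if upweight is not None:
        a, c = tasks[upweight]
        num += eps * a * c
        den += eps * a
    return num / den


def influence(tasks, test_task, j):
    """-grad_test * H^{-1} * grad_train_j at the fitted parameter (scalar case)."""
    theta = fit(tasks)
    hessian = sum(a for a, _ in tasks) / len(tasks)
    return -grad(theta, test_task) * grad(theta, tasks[j]) / hessian


tasks = [(1.0, 0.0), (2.0, 1.0), (0.5, 4.0)]
test_task = (1.0, 0.5)
score = influence(tasks, test_task, j=2)

# Finite-difference check: upweight task 2 slightly and refit.
eps = 1e-6
theta0 = fit(tasks)
theta_eps = fit(tasks, upweight=2, eps=eps)
finite_diff = (loss(theta_eps, test_task) - loss(theta0, test_task)) / eps
```

Here `score` and `finite_diff` agree closely. In a real network the Hessian is a large matrix, which is the cost an approximation is meant to reduce.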

(approximated Hessian, case of buffer size 512)
```
python explainer_fc_maml.py --cuda --k 32 --hidden 32 --activation relu --ckpt cache/<yyyy-mmdd-hhmmss>/maml_k32_layer32_tasks128_mbs32_ways5_shots5_<iteration>.pth --num-tasks 128 --sift-centroids cache/<yyyy-mmdd-hhmmss>/mifeature_k32_ndata38400.pkl --opa --num-hessian-elements 512
```
Out (the explainer) : ./cache/\<yyyy-mmdd-hhmmss\>/opa_nh512_expl_maml_k32_layer32_tasks128_mbs32_ways5_shots5_\<iteration\>.pkl

Computing self-ranks
```
python selfrank_fc_maml.py --cuda --num-tasks 128 --k 32 --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --sift-centroids cache/<yyyy-mmdd-hhmmss>/mifeature_k32_ndata38400.pkl --nums-pos-evs="-1,1285,512,256,128,64,32,16,8"
```
Out: ./cache/\<yyyy-mmdd-hhmmss\>/selfrank_\<explainer_file_name\>.csv

Computing correlations between task degradation and scores/self-ranks
```
python degrade_fc_maml.py --cuda --num-tasks 128 --k 32 --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --sift-centroids cache/<yyyy-mmdd-hhmmss>/mifeature_k32_ndata38400.pkl --nums-pos-evs="-1,1285,512,256,128,64,32,16,8"
```
Out1 (correlation coefficients) : ./cache/\<yyyy-mmdd-hhmmss\>/corr_nev\<num-pos-evs\>_\<explainer_file_name\>.csv

Out2 (mean and std values) : ./cache/\<yyyy-mmdd-hhmmss\>/stats_\<explainer_file_name\>.csv
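
To make the idea of a correlation coefficient concrete, the sketch below computes a Pearson correlation between two equal-length lists, for example one influence score and one measured degradation value per training task. Which coefficient `degrade_fc_maml.py` actually reports is not stated here, so Pearson is an assumption, and `pearson` is a hypothetical helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values: degradation grows linearly with the score.
toy_scores = [0.1, 0.2, 0.3, 0.4]
toy_degradation = [1.0, 2.0, 3.0, 4.0]
toy_corr = pearson(toy_scores, toy_degradation)
```

A value near 1 would mean that tasks with higher scores are the ones whose degradation is larger, and a value near -1 the reverse.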

#### 1-1-b. Omniglot
```
python fc_maml.py --num-iterations 1000 --cuda --k 36 --hidden 32 --activation relu --num-tasks 128 --dataset omniglot
```
```
python explainer_fc_maml.py --cuda --k 36 --hidden 32 --activation relu --ckpt cache/<yyyy-mmdd-hhmmss>/maml_k32_layer32_tasks128_mbs32_ways5_shots5_<iteration>.pth --num-tasks 128 --dataset omniglot
```
```
python explainer_fc_maml.py --cuda --k 36 --hidden 32 --activation relu --ckpt cache/<yyyy-mmdd-hhmmss>/maml_k32_layer32_tasks128_mbs32_ways5_shots5_<iteration>.pth --num-tasks 128  --opa --num-hessian-elements 512 --dataset omniglot
```
```
python selfrank_fc_maml.py --cuda --num-tasks 128 --k 36 --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --nums-pos-evs="-1,1413,512,256,128,64,32,16,8" --dataset omniglot
```
```
python degrade_fc_maml.py --cuda --num-tasks 128 --k 36 --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --nums-pos-evs="-1,1413,512,256,128,64,32,16,8" --dataset omniglot
```

#### 1-1-c. Further analysis
See explain_fc_maml.ipynb

### 1-2. CNN (Task distribution distinction)
(Case of 1024 training tasks including 128 noise image tasks)
#### 1-2-a. MiniImagenet
Training
```
python cnn_maml.py --cuda --num-iterations 20000 --num-tasks 1024 --noise-tasks 128 --save-tasksets
```
Out1 (models at different checkpoints) : ./cache/\<yyyy-mmdd-hhmmss\>/maml_tasks1024_mbs32_ways5_shots5_nt128_\<iteration\>.pth

Out2 (task indexes of the noise image tasks) : ./cache/\<yyyy-mmdd-hhmmss\>/index_dict.pkl

For task augmentation with rotation, add
```
--num-rotations <# rotation angles>
```
For weight decay in the meta-parameter update, add
```
--weight-decay <weight decay>
```

Creating an explainer (case of buffer size 1024)
```
python explainer_cnn_maml.py --cuda --ckpt ./cache/<yyyy-mmdd-hhmmss>/maml_tasks1024_mbs32_ways5_shots5_nt128_<iteration>.pth --num-tasks 1024 --noise-tasks 128 --num-hessian-elements 1024
```
Out (the explainer) : ./cache/\<yyyy-mmdd-hhmmss\>/einv_expl_opa_nh1024_ov_maml_tasks1024_mbs32_ways5_shots5_nt128_\<iteration\>.pkl

If the model was trained with data augmentation, pass the same option to the explainer
```
--num-rotations <# rotation angles>
```
Explaining test losses
```
python explain_inference.py --cuda --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --num-tasks 1024 --noise-tasks 128 --num-test-tasks 128
```
Out (influence scores) : ./cache/\<yyyy-mmdd-hhmmss\>/df_ntest128_\<explainer_file_name\>

The resultant pandas dataframe has the following columns:

|column name|description|
|----|----|
|test_task_idx|index of the test task|
|test_accuracy|accuracy for the test data|
|adaptation_accuracy|accuracy for the adaptation data after adaptation|
|train_accuracy|accuracy for the adaptation data before adaptation|
|train_task_idx|indices of the training tasks, ordered to match train_task_score|
|train_task_score|influence function values in descending order|
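
A typical use of these columns is to take the highest-scoring training tasks for one test task and check how many of them are noise image tasks. The sketch below does this on a plain dictionary standing in for one dataframe row (for a pandas row, `row.to_dict()` gives such a mapping). It assumes each of `train_task_idx` and `train_task_score` holds an equal-length list per row. The helper names are hypothetical, and the noise indices are passed as a plain set because the layout of `index_dict.pkl` is not described here.

```python
def top_k_influential(row, k):
    """Return the k (task index, score) pairs with the highest scores."""
    pairs = list(zip(row["train_task_idx"], row["train_task_score"]))
    # Sort defensively, even though scores are documented as already descending.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:k]


def noise_fraction(top_pairs, noise_indices):
    """Share of the given (index, score) pairs whose index is a noise task."""
    if not top_pairs:
        return 0.0
    hits = sum(1 for idx, _ in top_pairs if idx in noise_indices)
    return hits / len(top_pairs)


# Toy row with four training tasks; tasks 3 and 9 play the role of noise tasks.
toy_row = {
    "test_task_idx": 0,
    "train_task_idx": [7, 3, 9, 1],
    "train_task_score": [0.9, 0.4, 0.1, -0.2],
}
toy_top = top_k_influential(toy_row, k=2)
toy_fraction = noise_fraction(toy_top, noise_indices={3, 9})
```

Comparing this fraction between test tasks is one simple way to summarize the table; the notebooks listed under further analysis remain the reference for the actual evaluation.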

#### 1-2-b. Omniglot
The dataset can be specified by
```
--dataset omniglot
```

#### 1-2-c. Further analysis
See explain_cnn_maml_miniimagenet.ipynb, explain_cnn_maml_omniglot.ipynb

## 2. Prototypical network
```
cd examples/protonet_l2l
```
### 2-1. CNN (Task distribution distinction)
(Case of 1024 training tasks including 128 noise image tasks)
#### 2-1-a. MiniImagenet
Training
```
python cnn_protonet.py --cuda --num-iterations 10000 --num-tasks 1024 --noise-tasks 128 --save-tasksets
```
Out1 (models at different checkpoints) : ./cache/\<yyyy-mmdd-hhmmss\>/protonet_tasks1024_mbs32_ways5_shots5_nt128_\<iteration\>.pth

Out2 (task indexes of the noise image tasks) : ./cache/\<yyyy-mmdd-hhmmss\>/index_dict.pkl

Creating an explainer (case of buffer size 1024)
```
python explainer_cnn_protonet.py --cuda --ckpt ./cache/<yyyy-mmdd-hhmmss>/protonet_tasks1024_mbs32_ways5_shots5_nt128_<iteration>.pth --num-tasks 1024 --noise-tasks 128 --num-hessian-elements 1024 --opa
```
Out (the explainer) : ./cache/\<yyyy-mmdd-hhmmss\>/einv_expl_opa_nh1024_ov_protonet_tasks1024_mbs32_ways5_shots5_nt128_\<iteration\>.pkl

Explaining test losses
```
python explain_inference.py --cuda --explainer cache/<yyyy-mmdd-hhmmss>/<explainer_file_name> --num-tasks 1024 --noise-tasks 128 --num-test-tasks 128
```
Out (influence scores) : ./cache/\<yyyy-mmdd-hhmmss\>/df_ntest128_\<explainer_file_name\>

#### 2-1-b. Omniglot
The dataset can be specified by
```
--dataset omniglot
```

#### 2-1-c. Further analysis
See explain_cnn_protonet.ipynb

