# Towards Mitigating Architecture Overfitting in Dataset Distillation

## Requirements

PyTorch, Torchvision, NumPy, Kornia
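A quick way to confirm that the packages listed above are installed is sketched below. The helper name `missing_packages` is hypothetical and not part of this repository; the import names (`torch`, `torchvision`, `numpy`, `kornia`) are the standard ones for these libraries.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names of the requirements listed in this README.
    required = ["torch", "torchvision", "numpy", "kornia"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```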

## Distilled Data and Checkpoints of Teacher Models

We provide the distilled data of FRePo and MTT and the checkpoints of the trained teacher models in `./distilled_data` and `./snapshots`, respectively. Due to space limitations, we only provide the distilled data and checkpoints for CIFAR10.

## Run Our Code

```
python train.py \
    --exp_name test --no_log --seed 2333 --gpu 0 \
    --data_name cifar10 --data_dir [DATA PATH] \
    --distill_data_dir ./distilled_data/cifar10_10_frepo \
    --tau 0.3 --model_name resnetdp --norm batchnorm \
    --max_epoch 4000 --lr 5e-5 --batch_size 100 --optim lion \
    --save_freq 100 --aug_mode 2 --scheduler improved \
    --kd --teacher_model_name convnetfrepo \
    --teacher_ckpt_path ./snapshots/cifar10_10_frepo_convnet/checkpoint_best.pth \
    --strategy more2less --zca
```

The command above uses our method to train ResNet18 on CIFAR10 data distilled by FRePo with 10 images per class (IPC=10). The teacher model is the FRePo version of ConvNet. To run the code with other settings, please refer to the implementation details in our paper.
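When trying several settings, it can be convenient to assemble the command programmatically. The sketch below is only an illustration: the helper `build_train_command` is hypothetical and not part of this repository, and it assumes that `--no_log`, `--kd`, and `--zca` are value-less switches, as they appear in the command above. All flag names and default values are copied from that command.

```python
import shlex


def build_train_command(data_dir, overrides=None):
    """Build the argument list for train.py from the README's example settings.

    A value of True emits a bare switch, False or None drops the flag, and
    anything else is emitted as "--name value".
    """
    args = {
        "exp_name": "test",
        "no_log": True,
        "seed": "2333",
        "gpu": "0",
        "data_name": "cifar10",
        "data_dir": data_dir,
        "distill_data_dir": "./distilled_data/cifar10_10_frepo",
        "tau": "0.3",
        "model_name": "resnetdp",
        "norm": "batchnorm",
        "max_epoch": "4000",
        "lr": "5e-5",
        "batch_size": "100",
        "optim": "lion",
        "save_freq": "100",
        "aug_mode": "2",
        "scheduler": "improved",
        "kd": True,
        "teacher_model_name": "convnetfrepo",
        "teacher_ckpt_path": "./snapshots/cifar10_10_frepo_convnet/checkpoint_best.pth",
        "strategy": "more2less",
        "zca": True,
    }
    args.update(overrides or {})

    cmd = ["python", "train.py"]
    for name, value in args.items():
        if value is True:
            cmd.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            cmd.extend([f"--{name}", str(value)])
    return cmd


if __name__ == "__main__":
    # "/path/to/cifar10" is a placeholder; replace it with your data path.
    cmd = build_train_command("/path/to/cifar10", {"seed": 0})
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch training.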

## Acknowledgement

Parts of our code are based on [GeorgeCazenavette/mtt-distillation](https://github.com/GeorgeCazenavette/mtt-distillation), the official code for the CVPR '22 paper "Dataset Distillation by Matching Training Trajectories".
