# KernelWarehouse: Rethinking the Design of Dynamic Convolution

This repository is the official PyTorch implementation of "KernelWarehouse: Rethinking the Design of Dynamic Convolution" (KW for short), submitted to ICLR 2024.

**KernelWarehouse** is a more general form of dynamic convolution that provides a high degree of freedom to fit a desired parameter budget, even when the number of kernels is large. It redefines the basic concepts of "*kernels*" and "*assembling kernels*": through kernel partition and warehouse sharing, the attentive mixture learning paradigm is applied at a dense local kernel scale instead of a holistic kernel scale.

Basic experiments are conducted on the [ImageNet](http://www.image-net.org) benchmark, and downstream experiments are conducted on the [MS-COCO](https://cocodataset.org/#home) benchmark, showing leading performance. For instance, the ResNet18|ResNet50|MobileNetV2|ConvNeXt-Tiny models trained with KernelWarehouse on ImageNet reach 76.05%|81.05%|75.92%|82.55% top-1 accuracy. Thanks to its flexible design, KernelWarehouse can even reduce the model size of a ConvNet while improving its accuracy: e.g., our ResNet18 models with 36.45%|65.10% fewer parameters than the baseline improve top-1 accuracy by 2.89%|2.29% in absolute terms.

<p align="center"><img src="fig/Fig_Architecture.png" width="800" /></p>

Schematic illustration of KernelWarehouse. Briefly, KernelWarehouse first sequentially divides the static kernel $\mathbf{W}$ at any regular convolutional layer of a ConvNet into $m$ disjoint kernel cells $\mathbf{w}_ 1, \dots, \mathbf{w}_ m$ of the same dimensions. It then computes each kernel cell $\mathbf{w}_ i$ as a linear mixture $\mathbf{w}_ i=\alpha_{i1} \mathbf{e}_ 1+\dots+\alpha_{in}\mathbf{e}_ n$ based on a predefined "warehouse" (consisting of $n$ kernel cells $\mathbf{e}_ 1,\dots,\mathbf{e}_ n$ of the same dimensions, e.g., $n=108$) that is shared across all convolutional layers in the same stage. Finally, it replaces the static kernel $\mathbf{W}$ by assembling its corresponding $m$ mixtures in order, yielding a high degree of freedom to fit a desired convolutional parameter budget. The input-dependent scalar attentions $\alpha_{i1},\dots,\alpha_{in}$ are computed with a new attention function (NAF).
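
The mixture step in the caption can be illustrated with a tiny, framework-free sketch. It is not the code in this repository: plain Python lists stand in for tensors, the cells are flattened to 1-D, and the warehouse values and attentions are hypothetical numbers fixed by hand (in KW the attentions are input-dependent and come from the attention function, which is not modeled here).

```python
# Minimal sketch of the cell-wise linear mixture w_i = sum_j alpha_ij * e_j.
# Plain lists stand in for tensors; all sizes and values are illustrative.

def mix_cell(alphas, warehouse):
    """Return one kernel cell as a linear mixture of the warehouse cells."""
    cell_size = len(warehouse[0])
    return [
        sum(a * e[k] for a, e in zip(alphas, warehouse))
        for k in range(cell_size)
    ]

def assemble_kernel(attentions, warehouse):
    """Mix each of the m cells, then concatenate them in order (flattened)."""
    kernel = []
    for alphas in attentions:  # one row of n attentions per kernel cell w_i
        kernel.extend(mix_cell(alphas, warehouse))
    return kernel

# Hypothetical warehouse with n = 3 cells, each flattened to 4 values.
warehouse = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 2.0],
]
# Hypothetical attentions for m = 2 kernel cells (fixed by hand here).
attentions = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.5],
]

kernel = assemble_kernel(attentions, warehouse)
print(kernel)  # [0.5, 0.5, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
```

Note that the assembled kernel has $m$ cells' worth of values while only the $n$ warehouse cells are stored, so the parameter budget is set by $n$ rather than by the size of $\mathbf{W}$.
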
## Dataset

Following [this repository](https://github.com/pytorch/examples/tree/master/imagenet#requirements),

- Download the ImageNet dataset from http://www.image-net.org/.
- Then, move and extract the training and validation images to labeled subfolders, using [the following script](https://github.com/pytorch/examples/blob/main/imagenet/extract_ILSVRC.sh).
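
After extraction, a quick sanity check of the folder layout can catch path mistakes before a long training run. The sketch below assumes the `train/` and `val/` split folders with one subfolder per class that the linked script produces; that `main.py` reads this same layout from `--data_path` is an assumption, and `count_class_folders` is a hypothetical helper, not part of this repository. The demo runs on a throwaway directory rather than on real data.

```python
import os
import tempfile

def count_class_folders(data_path, split):
    """Count class subfolders under data_path/split (split is 'train' or 'val')."""
    split_dir = os.path.join(data_path, split)
    if not os.path.isdir(split_dir):
        raise FileNotFoundError(f"missing split folder: {split_dir}")
    return sum(
        1 for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )

# Demo on a temporary directory that mimics the expected layout
# with two class folders per split.
with tempfile.TemporaryDirectory() as root:
    for split in ("train", "val"):
        for wnid in ("n01440764", "n01443537"):
            os.makedirs(os.path.join(root, split, wnid))
    counts = {s: count_class_folders(root, s) for s in ("train", "val")}

print(counts)  # {'train': 2, 'val': 2}
```

On the full ImageNet-1k dataset, both splits should report 1000 class folders.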

## Requirements

- python >= 3.7.0
- torch >= 1.8.1, torchvision >= 0.9.1
- timm == 0.3.2, tensorboardX, six

## Training

To train a model with KernelWarehouse:

```shell
python -m torch.distributed.launch --nproc_per_node={number of gpus} main.py --kw_config {path to config json} \
--batch_size {batch size per gpu} --update_freq {number of gradient accumulation steps}  --data_path {path to dataset} \
--output_dir {path to output folder}
```

For example, to train ResNet18 + KW (1×) on 8 GPUs with a total batch size of 4096:

```shell
python -m torch.distributed.launch --nproc_per_node=8 main.py --kw_config configs/resnet18/kw1x_resnet18.json \
--batch_size 128 --update_freq 4 --data_path {path to dataset} --output_dir {path to output folder}
```

To train MobileNetV2 + KW (4×) on 8 GPUs with a total batch size of 256:

```shell
python -m torch.distributed.launch --nproc_per_node=8 main.py --kw_config configs/mobilenetv2_100/kw4x_mobilenetv2_100.json \
--batch_size 32 --update_freq 1 --data_path {path to dataset} --output_dir {path to output folder}
```
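
The total batch sizes quoted in the two examples are consistent with multiplying the number of GPUs, the per-GPU `--batch_size`, and `--update_freq`. The sketch below works through that arithmetic and shows how `--update_freq` could be adjusted when fewer GPUs are available. The relation is inferred from the examples above, and both helper functions are hypothetical, not part of this repository.

```python
def effective_batch_size(num_gpus, batch_size_per_gpu, update_freq):
    """Total samples contributing to one optimizer step."""
    return num_gpus * batch_size_per_gpu * update_freq

def required_update_freq(total_batch_size, num_gpus, batch_size_per_gpu):
    """Gradient accumulation steps needed to reach a target total batch size."""
    per_step = num_gpus * batch_size_per_gpu
    if total_batch_size % per_step != 0:
        raise ValueError("target batch size is not a multiple of num_gpus * batch_size_per_gpu")
    return total_batch_size // per_step

resnet18_total = effective_batch_size(8, 128, 4)     # ResNet18 + KW (1x) example
mobilenetv2_total = effective_batch_size(8, 32, 1)   # MobileNetV2 + KW (4x) example
print(resnet18_total, mobilenetv2_total)  # 4096 256

# Keeping the ResNet18 total batch size on 4 GPUs instead of 8:
print(required_update_freq(4096, 4, 128))  # 8
```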

Add `--use_amp true` to enable Automatic Mixed Precision, which reduces memory usage and speeds up training.

More config files for other models can be found in [configs](configs).

## Evaluation

To evaluate a pre-trained model:

```shell
python -m torch.distributed.launch --nproc_per_node={number of gpus} main.py --kw_config {path to config json} \
--eval true --data_path {path to dataset} --resume {path to model}
```