<h2>Efficient Multi-order Gated Aggregation Network</h2>

<p align="center">
<img src="https://user-images.githubusercontent.com/44519745/202308950-00708e25-9ac7-48f0-af12-224d927ac1ae.jpg" width=95% height=100% 
class="center">
</p>

We propose **MogaNet**, a new family of efficient ConvNets designed to mine informative context with favorable complexity-performance trade-offs. Follow the instructions below to use this implementation.

<details>
  <summary>Table of Contents</summary>
  <ol>
    <li><a href="#catalog">Catalog</a></li>
    <li><a href="#image-classification">Image Classification</a></li>
    <li><a href="#license">License</a></li>
    <li><a href="#acknowledgement">Acknowledgement</a></li>
    <li><a href="#citation">Citation</a></li>
  </ol>
</details>

## Catalog

- [x] **ImageNet-1K** Training and Validation Code [[code](#image-classification)]
- [x] Downstream Transfer to **Object Detection and Instance Segmentation on COCO** [[code](detection/)]
- [x] Downstream Transfer to **Semantic Segmentation on ADE20K** [[code](segmentation/)]
- [x] Downstream Transfer to **2D Human Pose Estimation on COCO** [[code](pose_estimation/)] (baseline models are supported)
- [x] Downstream Transfer to **Video Prediction on MMNIST** [[code](video_prediction/)] (baseline models are supported)

## Image Classification

### 1. Installation

Please check [INSTALL.md](INSTALL.md) for installation instructions.

### 2. Training and Validation

See [TRAINING.md](TRAINING.md) for ImageNet-1K training and validation instructions. The parameters of a trained model can be extracted with [extract_ckpt.py](extract_ckpt.py).

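Training checkpoints usually bundle the model weights with optimizer state and other bookkeeping, so extraction amounts to keeping only the weight dictionary. The sketch below illustrates that idea on a plain Python dict. It is not the logic of `extract_ckpt.py`: the helper `extract_weights`, the key names (`state_dict`, `model`), and the `module.` prefix are assumptions based on common conventions.

```python
def extract_weights(checkpoint, candidate_keys=("state_dict", "model")):
    """Return only the parameter mapping from a checkpoint-like dict.

    Hypothetical helper: the candidate keys and the "module." prefix are
    common conventions, not confirmed details of this repository.
    """
    weights = checkpoint
    for key in candidate_keys:
        if key in checkpoint:
            weights = checkpoint[key]
            break
    # Drop the prefix that DataParallel-style wrappers add to parameter names.
    prefix = "module."
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in weights.items()
    }


# Toy checkpoint with placeholder values instead of real tensors.
toy_checkpoint = {
    "epoch": 300,
    "optimizer": {"lr": 1e-3},
    "state_dict": {"module.stem.weight": [0.1, 0.2], "module.head.bias": [0.0]},
}
extracted = extract_weights(toy_checkpoint)
print(sorted(extracted))
```

With a real checkpoint, the same filtering would typically sit between loading the file and saving the reduced dictionary.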
### 3. ImageNet-1K Trained Models

| Model | Resolution | Params (M) | FLOPs (G) | Top-1 \| Top-5 (%) | Script |
|---|:---:|:---:|:---:|:---:|:---:|
| MogaNet-XT | 224x224 | 2.97 | 0.80 | 76.5 \| 93.4 | [script](TRAINING.md) |
| MogaNet-XT | 256x256 | 2.97 | 1.04 | 77.2 \| 93.8 | [script](TRAINING.md) |
| MogaNet-T | 224x224 | 5.20 | 1.10 | 79.0 \| 94.6 | [script](TRAINING.md) |
| MogaNet-T | 256x256 | 5.20 | 1.44 | 79.6 \| 94.9 | [script](TRAINING.md) |
| MogaNet-S | 224x224 | 25.3 | 4.97 | 83.4 \| 96.9 | [script](TRAINING.md) |
| MogaNet-B | 224x224 | 43.9 | 9.93 | 84.3 \| 97.0 | [script](TRAINING.md) |
| MogaNet-L | 224x224 | 82.5 | 15.9 | 84.7 \| 97.1 | [script](TRAINING.md) |
| MogaNet-XL | 224x224 | 180.8 | 34.5 | 85.1 \| 97.4 | [script](TRAINING.md) |

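For reference, top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The following is a minimal pure-Python sketch of that metric; the function name `topk_accuracy` and the toy scores are illustrative and are not taken from the validation code.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top
    return 100.0 * hits / len(labels)


toy_scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
    [0.6, 0.3, 0.1],  # highest score: class 0
]
toy_labels = [1, 1, 2, 2]
top1 = topk_accuracy(toy_scores, toy_labels, k=1)
top2 = topk_accuracy(toy_scores, toy_labels, k=2)
print(top1, top2)
```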
### 4. Analysis Tools

(1) Use [get_flops.py](get_flops.py) to count the MACs of MogaNet variants.

```
python get_flops.py --model moganet_tiny
```
<p align="center">
<img src="https://user-images.githubusercontent.com/44519745/212429257-f0b09d7a-7503-4945-9517-68ea36d10e00.png" width=100% height=100% 
class="center">
</p>

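As a rough guide to what is being counted: a convolution performs one multiply-accumulate (MAC) per kernel weight per output element, and a common convention treats one MAC as two FLOPs, although some papers report MACs under the name FLOPs. The sketch below works through that arithmetic for a single layer. `conv2d_macs` is an illustrative helper, not part of `get_flops.py`, and the layer shape is a made-up example.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """MACs of one 2D convolution, ignoring bias.

    Each output element needs (c_in / groups) * kernel * kernel MACs,
    and there are c_out * h_out * w_out output elements.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    macs_per_output = (c_in // groups) * kernel * kernel
    return c_out * h_out * w_out * macs_per_output


# Hypothetical layer: 64 -> 64 channels on a 56x56 output map.
dense = conv2d_macs(64, 64, kernel=3, h_out=56, w_out=56)
depthwise = conv2d_macs(64, 64, kernel=3, h_out=56, w_out=56, groups=64)
print(f"dense 3x3:     {dense / 1e6:.2f} MMACs")
print(f"depthwise 3x3: {depthwise / 1e6:.2f} MMACs")
```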
(2) Use [cam_image.py](cam_image.py) to visualize Grad-CAM activation maps (or Grad-CAM variants) for MogaNet and other popular architectures.

```
python cam_image.py --use_cuda --image_path /path/to/image.JPEG --model moganet_tiny --method gradcam
```
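For orientation, plain Grad-CAM weights each feature-map channel by the spatial mean of the class-score gradient for that channel, sums the weighted channels, and applies a ReLU. The sketch below shows that computation on toy nested lists. `grad_cam` here is an illustrative stand-in, not the code in `cam_image.py`, which may compute the maps differently.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map from per-channel activations and gradients.

    Both inputs are lists of channels, each channel an H x W list of lists.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: spatial mean of the gradient for each channel.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]
    cam = [[0.0] * width for _ in range(height)]
    for weight, act in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act[i][j]
    # ReLU keeps only locations that push the class score up.
    return [[max(value, 0.0) for value in row] for row in cam]


# Toy 2-channel, 2x2 feature map with hand-written gradients.
toy_activations = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
heatmap = grad_cam(toy_activations, toy_gradients)
print(heatmap)
```

In practice the resulting map is upsampled to the input resolution and overlaid on the image.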

<p align="right">(<a href="#top">back to top</a>)</p>

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Acknowledgement

Our implementation is mainly based on the following codebases. We thank the authors for their wonderful work.

- [pytorch-image-models](https://github.com/rwightman/pytorch-image-models): PyTorch image models, scripts, and pretrained weights.
- [PoolFormer](https://github.com/sail-sg/poolformer): Official PyTorch implementation of MetaFormer.
- [ConvNeXt](https://github.com/facebookresearch/ConvNeXt): Official PyTorch implementation of ConvNeXt.
- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab Detection Toolbox and Benchmark.
- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab Semantic Segmentation Toolbox and Benchmark.
- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab Pose Estimation Toolbox and Benchmark.

<p align="right">(<a href="#top">back to top</a>)</p>
