Abstract: We present SpiralMLP, a novel architecture that introduces a Spiral FC layer as a replacement for the conventional Token Mixing approach. Unlike several existing MLP-based models that primarily emphasize axes, our Spiral FC layer is designed as a deformable convolution layer with spiral-like offsets. We further adapt Spiral FC into two variants, Self-Spiral FC and Cross-Spiral FC, which together integrate local and global features seamlessly, without additional processing steps. To thoroughly investigate the effectiveness of the spiral-like offsets and validate our design, we conduct ablation studies and explore optimal configurations. In empirical tests on ImageNet-1k, COCO, and ADE20K, SpiralMLP reaches state-of-the-art performance, comparable to that of Transformers, CNNs, and other MLPs. SpiralMLP maintains linear computational complexity O(HW) and is compatible with varying input image resolutions. Our study reveals that targeting the full receptive field is not essential for achieving high performance; instead, adopting a refined approach offers better results. Our code is available at https://github.com/Kookree/SpiralMLP.
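To make the idea of a fully connected layer with spiral-like offsets concrete, the following is a minimal NumPy sketch and not the released implementation. It assumes that each input channel samples the feature map at its own fixed spatial offset and that a shared channel-mixing matrix is then applied at every position. The function names `spiral_offsets` and `spiral_fc`, the parameters `max_radius` and `period`, and the offset formula itself are hypothetical choices made for illustration only.

```python
import numpy as np

def spiral_offsets(channels, max_radius=2, period=8):
    """Hypothetical spiral: the radius grows with the channel index
    while the angle rotates, giving one integer (di, dj) per channel."""
    c = np.arange(channels)
    radius = max_radius * c / max(channels - 1, 1)
    angle = 2.0 * np.pi * c / period
    di = np.rint(radius * np.cos(angle)).astype(int)
    dj = np.rint(radius * np.sin(angle)).astype(int)
    return di, dj

def spiral_fc(x, weight, di, dj):
    """x: (H, W, C_in), weight: (C_in, C_out).
    Channel c of the input is read at (i + di[c], j + dj[c]);
    positions outside the image are treated as zero."""
    H, W, C = x.shape
    shifted = np.zeros_like(x)
    for c in range(C):
        a, b = int(di[c]), int(dj[c])
        i0, i1 = max(0, -a), min(H, H - a)
        j0, j1 = max(0, -b), min(W, W - b)
        if i0 >= i1 or j0 >= j1:
            continue  # offset larger than the image: channel stays zero
        shifted[i0:i1, j0:j1, c] = x[i0 + a:i1 + a, j0 + b:j1 + b, c]
    # One C_in x C_out product per position, so the cost grows linearly in H*W.
    return shifted @ weight

rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 16))
di, dj = spiral_offsets(8)
for H, W in [(14, 14), (20, 12)]:
    y = spiral_fc(rng.standard_normal((H, W, 8)), weight, di, dj)
    print(y.shape)
```

Because the same weight matrix is applied at every spatial position, the sketch accepts inputs of any resolution, and its per-position cost does not depend on H or W.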