Improving Axial-Attention Network via Cross-Channel Weight Sharing

Nazmul Shahadat, Anthony S. Maida

Published: 2024, Last Modified: 05 Nov 2025, FLAIRS 2024, CC BY-SA 4.0

Abstract: In recent years, hypercomplex-inspired neural networks have improved deep CNN architectures due to their ability to share weights across input channels and thus improve the cohesiveness of representations within the layers. The work described herein studies the effect of replacing existing layers in an Axial Attention ResNet with their quaternion variants, which use cross-channel weight sharing, to assess the effect on image classification. We expect the quaternion enhancements to produce improved feature maps with more interlinked representations. We experiment with the stem of the network, the bottleneck layer, and the fully connected backend by replacing them with quaternion versions. These modifications lead to novel architectures that yield improved accuracy on the ImageNet300k classification dataset. Our baseline networks for comparison were the original real-valued ResNet, the original quaternion-valued ResNet, and the Axial Attention ResNet. Since improvement was observed regardless of which part of the network was modified, there is promise that this technique may be generally useful in improving classification accuracy for a large class of networks.
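
The abstract does not spell out the quaternion formulation, so the following is only a minimal sketch of how cross-channel weight sharing typically arises in a quaternion layer, assuming the standard Hamilton product. The function names `hamilton_product` and `quaternion_linear` are hypothetical and are not taken from the paper. Each weight is a single quaternion whose four real components are reused across all four input channels of a quaternion-valued input.

```python
# Sketch only: standard Hamilton product, assumed rather than taken from the paper.

def hamilton_product(w, x):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    wr, wi, wj, wk = w
    xr, xi, xj, xk = x
    # The same four weight components appear in every output channel,
    # which is the cross-channel weight sharing.
    return (
        wr * xr - wi * xi - wj * xj - wk * xk,
        wr * xi + wi * xr + wj * xk - wk * xj,
        wr * xj - wi * xk + wj * xr + wk * xi,
        wr * xk + wi * xj - wj * xi + wk * xr,
    )


def quaternion_linear(weights, inputs):
    """Hypothetical dense layer without bias.

    weights: n_out rows, each a list of n_in quaternions.
    inputs:  list of n_in quaternions.
    Returns a list of n_out quaternions.
    """
    outputs = []
    for row in weights:
        acc = (0.0, 0.0, 0.0, 0.0)
        for w, x in zip(row, inputs):
            product = hamilton_product(w, x)
            acc = tuple(a + p for a, p in zip(acc, product))
        outputs.append(acc)
    return outputs


# One output unit, two quaternion inputs (8 real input channels, 4 real outputs).
weights = [[(0, 1, 0, 0), (1, 0, 0, 0)]]
inputs = [(0, 0, 1, 0), (1, 2, 3, 4)]
result = quaternion_linear(weights, inputs)
```

Under these assumptions, a layer with n_in quaternion inputs and n_out quaternion outputs stores 4 · n_in · n_out real weights, whereas a real-valued dense layer over the same 4 · n_in input and 4 · n_out output channels would store 16 · n_in · n_out.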

External IDs: dblp:conf/flairs/ShahadatM24