- Abstract: Modern convolutional networks are not shift-invariant, despite their convolutional nature: small shifts in the input can cause drastic changes in the internal feature maps and output. In this paper, we isolate the cause -- the downsampling operation in convolutional and pooling layers -- and apply the appropriate signal processing fix -- low-pass filtering before downsampling. This simple architectural modification boosts the shift-equivariance of the internal representations and consequently, shift-invariance of the output. Importantly, this is achieved while maintaining downstream classification performance. In addition, incorporating the inductive bias of shift-invariance largely removes the need for shift-based data augmentation. Lastly, we observe that the modification induces spatially-smoother learned convolutional kernels. Our results suggest that this classical signal processing technique has a place in modern deep networks.
- Keywords: convolutional networks, signal processing, shift, translation, invariance, equivariance
- TL;DR: Modern networks are not shift-invariant, due to naive downsampling; we apply a signal processing tool -- anti-aliasing low-pass filtering before downsampling -- to improve shift-invariance