S2R2Net: A network architecture for efficient inference with sample-specific redundancy reduction

Published: 21 May 2025, Last Modified: 29 Jan 2026 | Pattern Recognition | Everyone | CC BY-NC-ND 4.0
Abstract: Deep models are compute-intensive and difficult to deploy on resource-constrained devices. Current efforts reduce computation by targeting two different sources of redundancy: structural redundancy and spatial redundancy. Structural redundancy is exploited through neuron or channel pruning, and this type of redundancy applies to all samples. Spatial redundancy is sample-specific: some regions of a given sample contain no useful information and can be excluded during inference. In this work, we propose a network architecture for sample-specific redundancy reduction, $S^2R^2Net$, which aims to jointly reduce both types of redundancy. To facilitate the discovery of these redundancies, our model takes a frequency-domain representation as input. Next, the sample is partitioned into detailed regions and approximate regions, guided by Discrete Cosine Transform (DCT) coefficients. A multiple-branch architecture is designed to process the two types of regions differently, with a significant reduction in computation. This is achieved by reducing structural redundancy with a reduced number of channels in both branches, and by reducing spatial redundancy with a shrunken channel size in the approximate regions. Owing to the regular architecture of $S^2R^2Net$, it can be easily applied to existing baseline models to improve their inference efficiency. Experimental results show that $S^2R^2Net$ reduces computation by nearly 70% when using VGG16 as the baseline, with only a 0.3% drop in Top-1 accuracy on Mini-ImageNet. When applied to ResNet50, $S^2R^2Net$ achieves a 53% reduction in computation, with only a 0.7% drop in Top-1 accuracy on ImageNet.
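The abstract does not specify how DCT coefficients guide the split into detailed and approximate regions. As a rough illustration of the general idea only, the sketch below labels each block of a grayscale image as "detailed" when a sizeable share of its DCT energy lies outside the DC coefficient. The 8x8 block size, the AC-energy-ratio criterion, the 0.05 threshold, and the names `dct_matrix` and `partition_blocks` are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def partition_blocks(image: np.ndarray, block: int = 8,
                     threshold: float = 0.05) -> np.ndarray:
    """Return a boolean mask with one entry per block.

    True  -> "detailed" block (assumed criterion: high AC energy share)
    False -> "approximate" block (energy concentrated in the DC term)
    """
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image dimensions must be multiples of the block size")
    d = dct_matrix(block)
    # Reshape into a grid of blocks: (h/block, w/block, block, block).
    blocks = image.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    coeffs = d @ blocks @ d.T  # 2-D DCT of every block via broadcasting
    energy = coeffs ** 2
    total = energy.sum(axis=(2, 3))
    dc = energy[:, :, 0, 0]
    ac_ratio = (total - dc) / np.maximum(total, 1e-12)
    return ac_ratio > threshold


# Hypothetical input: flat left half, checkerboard texture on the right half.
img = np.full((16, 16), 0.5)
img[:, 8:] = np.indices((16, 8)).sum(axis=0) % 2
mask = partition_blocks(img)
print(mask)
```

In a setup like this, blocks marked False could be routed to a cheaper branch and blocks marked True to a full-resolution branch; how $S^2R^2Net$ actually performs the routing is described in the paper itself, not here.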