Track: Type A (Regular Papers)
Keywords: Dense Prediction, Model Compression, Structured Pruning, Quantization, Efficient Vision Models
Abstract: Dense prediction tasks such as semantic segmentation and optical flow estimation, as well as image classification, require models that deliver high accuracy while sustaining the throughput needed for practical applications on mobile or portable computing devices. However, most state-of-the-art architectures rely on deep sequential operations that are computationally expensive and challenging to execute on consumer-grade parallel hardware; this often leads to reduced inference speed or degraded accuracy, limiting their applicability in real-time and edge scenarios. To address this challenge, we propose a novel self-compressing vision architecture that applies structured pruning and quantization to key modules (convolutional layers, transposed convolutions, and linear attention) in proportion to their parallel-time computational cost. By selectively reducing precision and pruning tensors in less critical layers, our approach achieves significant model compression. We evaluate our method on fine-grained classification (CUB-200-2011, Country211), semantic segmentation (ADE20K), and optical flow (HD1K). Our model matches the accuracy of state-of-the-art baselines (EfficientViT) at full precision (FP32) and surpasses them under lower-precision settings, while reducing storage, increasing throughput, and keeping training time comparable. Finally, we highlight that compression serves not only as a mechanism for reducing model size but also as a basis for investigating how model depth relates to inference-time performance.
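To make the self-compression idea in the abstract concrete, the following is a minimal PyTorch sketch of one way such a mechanism can be realized: a convolution whose per-output-channel bit-width is a learnable parameter, penalized by a differentiable size term. Channels whose bit-width collapses toward zero can be pruned after training (structured pruning), and the penalty can be weighted per layer by its parallel-time cost. The class name SelfCompressingConv2d, the size_penalty method, and the gamma/cost_weight terms are illustrative assumptions, not the authors' implementation from the linked repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfCompressingConv2d(nn.Module):
    # Convolution with a learnable per-output-channel quantization bit-width.
    # Channels whose bit-width falls to ~0 carry no information and can be
    # pruned after training; surviving weights are stored at low precision.
    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
        # b: learnable bit-depth, e: learnable scale exponent (per channel).
        self.b = nn.Parameter(torch.full((out_ch, 1, 1, 1), 8.0))
        self.e = nn.Parameter(torch.zeros(out_ch, 1, 1, 1))

    def _quantize(self, w):
        b = torch.relu(self.b)                 # bit-width must stay >= 0
        scale = 2.0 ** self.e
        q = torch.clamp(w / scale, -(2.0 ** (b - 1)), 2.0 ** (b - 1) - 1)
        q = q + (torch.round(q) - q).detach()  # straight-through estimator
        return q * scale

    def forward(self, x):
        return F.conv2d(x, self._quantize(self.conv.weight), self.conv.bias,
                        self.conv.stride, self.conv.padding,
                        self.conv.dilation, self.conv.groups)

    def size_penalty(self):
        # Differentiable proxy for the bits needed to store this layer.
        return torch.relu(self.b).sum() * self.conv.weight[0].numel()

A hypothetical training step would add the penalty to the task loss, with cost_weight chosen per layer in proportion to its parallel-time computational cost and gamma trading accuracy against compression:

layer = SelfCompressingConv2d(64, 128, 3, padding=1)
x = torch.randn(2, 64, 32, 32)
out = layer(x)
gamma, cost_weight = 1e-7, 1.0     # hypothetical values
loss = out.pow(2).mean() + gamma * cost_weight * layer.size_penalty()
loss.backward()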
Project at:
\url{https://github.com/adishourya/SelfcompressingDepthWiseAttn}
Serve As Reviewer: ~Guangzhi_Tang1, ~Chang_Sun1
Submission Number: 47