Keywords: Gradient Compression, Activation Compression, Resource-Constrained Training
TL;DR: This paper introduces INSTANT, a method for efficient training using low-rank gradient and activation compression.
Abstract: Deep learning has advanced at an unprecedented pace, and this progress has come with a significant increase in model complexity. However, despite extensive research on accelerating inference, training deep models directly under a resource-constrained budget remains a considerable challenge due to the high computational and memory requirements of training. In this paper, we introduce INSTANT (compressIng gradieNtS and acTivAtions for resource-efficieNt Training), a method designed to address both the computational and the memory bottlenecks of training. INSTANT reduces resource demands during backpropagation by projecting gradients and activations into a low-rank subspace and performing the computation within that compressed representation. Experimental results demonstrate that INSTANT achieves a $15\times$ reduction in computational cost and a $32\times$ reduction in activation memory with negligible impact on model performance. The code will be made publicly available upon the paper's acceptance.
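The paper's implementation is not available here, but purely as a minimal sketch of the activation-compression idea the abstract describes (assuming a PyTorch-style custom autograd function and a hypothetical fixed projection matrix `P`, neither of which is specified by the paper), storing a low-rank activation for the backward pass could look like this:

```python
import torch

class LowRankLinear(torch.autograd.Function):
    """Illustrative sketch only: a linear layer that saves its input
    activation in a rank-r subspace instead of at full width.
    The projection P (d x r) is a hypothetical stand-in for whatever
    subspace construction the paper actually uses."""

    @staticmethod
    def forward(ctx, x, weight, P):
        # Compress the activation before saving it for backward:
        # x: (batch, d) -> x_low: (batch, r), with r << d.
        x_low = x @ P
        ctx.save_for_backward(x_low, weight, P)
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_low, weight, P = ctx.saved_tensors
        # Gradient w.r.t. the input proceeds as usual.
        grad_x = grad_out @ weight
        # The weight gradient is formed from the compressed activation,
        # mapped back with P^T -- an approximation, not the exact gradient.
        grad_w = grad_out.t() @ (x_low @ P.t())
        return grad_x, grad_w, None  # no gradient for the fixed projection P

# Usage sketch with made-up sizes: d=512 compressed to r=16.
d, r, out_dim = 512, 16, 256
P = torch.randn(d, r) / d ** 0.5  # hypothetical random projection
x = torch.randn(8, d, requires_grad=True)
w = torch.randn(out_dim, d, requires_grad=True)
y = LowRankLinear.apply(x, w, P)
y.sum().backward()
```

The gradient-compression half of INSTANT would be analogous, projecting the backpropagated gradients rather than the saved activations; the sketch above covers only the activation side.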
Supplementary Material: zip
Primary Area: optimization
Submission Number: 6348