Batch Pruning by Activation Stability

Published: 26 Jan 2026, Last Modified: 09 May 2026 | ICLR 2026 Poster | CC BY 4.0
Keywords: Batch Pruning, Activation Stability, Neural Network, Activation, Deep Learning
TL;DR: A dynamic data pruning method for deep learning training that adaptively removes low-utility batches based on activation stability, significantly reducing data usage while maintaining accuracy.
Abstract: Training deep neural networks remains costly in terms of data, time, and energy, limiting their deployment in large-scale and resource-constrained settings. To address this, we propose Batch Pruning by Activation Stability (*B-PAS*), a dynamic plug-in strategy that accelerates training by removing batches that contribute little to learning. *B-PAS* monitors the stability of activation representations across epochs and prunes batches whose activation variance changes minimally from epoch to epoch, indicating diminishing learning utility. Applied to ResNet-18, ResNet-50, and the Convolutional vision Transformer (CvT) on CIFAR-10, CIFAR-100, SVHN, and ImageNet-1K, *B-PAS* reduces training batch usage by up to 57\% with no loss in accuracy, and by 47\% while slightly improving accuracy. Moreover, it achieves up to 61\% savings in GPU node-hours, outperforming prior state-of-the-art pruning methods with up to 29\% higher data savings and 21\% greater GPU node-hour savings. We further demonstrate that *B-PAS* generalizes by extending it to GPT-2 fine-tuning, showing that activation stability can serve as an effective pruning signal beyond vision models. These results highlight activation stability as a powerful internal signal for efficient training, offering a practical and sustainable path toward data- and energy-efficient deep learning.
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 9877
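
The pruning rule described in the abstract (track each batch's activation variance across epochs, and drop the batch once that variance stops changing) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ActivationStabilityPruner`, the relative-change test, and the `rel_tol` and `patience` parameters are all assumptions introduced here for illustration, and the activations are plain lists of floats rather than tensors from a real model.

```python
from statistics import pvariance


class ActivationStabilityPruner:
    """Hypothetical sketch: prune a batch once its activation variance stabilizes."""

    def __init__(self, rel_tol=0.01, patience=2):
        # rel_tol and patience are assumed hyperparameters, not values from the paper.
        self.rel_tol = rel_tol      # relative variance change counted as "stable"
        self.patience = patience    # consecutive stable epochs required before pruning
        self.last_var = {}          # batch_id -> variance seen in the previous epoch
        self.stable_epochs = {}     # batch_id -> current run of stable epochs
        self.pruned = set()         # batch_ids removed from training

    def is_active(self, batch_id):
        """Return True if the batch should still be used for training."""
        return batch_id not in self.pruned

    def update(self, batch_id, activations):
        """Record this epoch's activations; return True if the batch is now pruned."""
        var = pvariance(activations)
        prev = self.last_var.get(batch_id)
        self.last_var[batch_id] = var
        if prev is None:
            # First epoch for this batch: nothing to compare against yet.
            return False
        rel_change = abs(var - prev) / max(abs(prev), 1e-12)
        if rel_change < self.rel_tol:
            self.stable_epochs[batch_id] = self.stable_epochs.get(batch_id, 0) + 1
        else:
            self.stable_epochs[batch_id] = 0
        if self.stable_epochs[batch_id] >= self.patience:
            self.pruned.add(batch_id)
        return batch_id in self.pruned
```

In a training loop, one would call `is_active(batch_id)` before the forward pass to skip pruned batches, and `update(batch_id, activations)` after it with activations collected from a chosen layer. Which layer to monitor, and how activations are summarized, are left open here because the abstract does not specify them.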