Beyond Pruning: Neuro-inspired Sparse Training For Enhancing Model Performance, Convergence Speed, and Training Stability
Keywords: Sparse Training, Neuro-inspired, Neural Networks, Computer Vision, Pruning
TL;DR: Unveiling benefits of sparse training for artificial neural networks
Abstract: Pruning often trades accuracy for efficiency, and training sparse networks from scratch typically incurs performance loss. We introduce a simple, neuro-inspired sparse training (NIST) algorithm that simultaneously sparsifies, stabilizes, and strengthens neural networks without introducing computational overhead or complex optimizers. Our method plugs directly into standard training pipelines and achieves high sparsity while enhancing model performance, accelerating convergence, and improving training stability across diverse architectures. Empirically, it strengthens MLP-heavy architectures (e.g., VGG, AlexNet), aggressively sparsifying them ($>$90\%) while, counterintuitively, improving test accuracy by 8--10\%. Additionally, NIST accelerates convergence and reduces variance in efficient CNNs such as MobileNet. It also enables transformer training directly from 50\% initial sparsity up to 70\% final sparsity with negligible performance loss, while accelerating convergence over the first 30 epochs. Our comprehensive experiments, ablations, and comparisons against state-of-the-art pruning and sparse-training methods reveal that these gains stem not from reduced parameter counts alone, but from improved optimization dynamics and more effective parameter reallocation. This study reframes sparse training as a performance-enhancing tool rather than a compromise.
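The abstract states that the method plugs directly into standard training pipelines and trains from an initial sparsity level. As a point of reference only, the sketch below shows a generic magnitude-based, static-mask sparse training loop in PyTorch; the function names (`build_masks`, `apply_masks`, `train_sparse`), the magnitude criterion, and the fixed mask are illustrative assumptions and do not describe NIST's actual mechanism, which is not given in the abstract.

```python
import torch
import torch.nn as nn

# Hypothetical illustration: generic magnitude-based sparse training with a
# static mask, showing how sparsity can be imposed inside a standard training
# loop. This is NOT the paper's NIST algorithm.

def build_masks(model, sparsity=0.5):
    """Create per-layer binary masks keeping the largest-magnitude weights."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:  # skip biases and normalization parameters
            continue
        k = max(1, int(p.numel() * (1 - sparsity)))
        thresh = p.detach().abs().flatten().topk(k).values[-1]
        masks[name] = (p.detach().abs() >= thresh).float()
    return masks

@torch.no_grad()
def apply_masks(model, masks):
    """Zero out weights that fall outside their layer's mask."""
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])

def train_sparse(model, loader, epochs=1, sparsity=0.5, lr=1e-3):
    """Standard training loop with the mask re-applied after each step."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    masks = build_masks(model, sparsity)  # start training already sparse
    apply_masks(model, masks)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            apply_masks(model, masks)  # keep pruned weights at zero
    return model, masks
```

A dynamic sparse-training method would additionally regrow and re-prune connections during training (parameter reallocation, as the abstract mentions); the static mask above is kept fixed only to keep the sketch minimal.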
Supplementary Material: zip
Primary Area: optimization
Submission Number: 18163