SSONN: Self-Scaled Optimized Neural Network

Evgeny Bessonnitsyn; Mironov Ivan; Fedor Kutergin; Danil Fedorov; Aleksandr Ustinov; Sergey Muravyov; Valeria Efimova

18 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0

Keywords: NAS, AutoML, Pruning, Deep learning

Abstract: Current approaches to lightweight neural network design face a fundamental trade-off: reducing model size inevitably compromises accuracy. Distillation and pruning are the most commonly used methods, but they require an initially over-parameterized pretrained architecture, which increases computational cost during training. This work introduces the Self-Scaled Optimized Neural Network (SSONN), a novel method that eliminates the need for redundant initial models. Instead of following a train-then-compress paradigm, SSONN starts with a single linear layer and dynamically increases its complexity during training through adaptive reverse pruning. Rather than removing redundant parameters, the algorithm selectively adds nodes and connections only where they are critical for improving task-specific accuracy. Similar methods that dynamically expand a neural network architecture rely on well-known architectures or their parts, whereas the proposed method is independent of existing architectures. Experiments on classical datasets (MNIST, Fashion-MNIST, and Unseen NAS datasets) demonstrate that SSONN achieves accuracy comparable to state-of-the-art models while using 10 times fewer parameters. Furthermore, the method outperforms traditional training approaches in computational efficiency and enables flexible deployment in resource-constrained environments. These results highlight the potential of the suggested expansion strategy over the reduction approach for creating efficient and adaptive deep learning models. The code is available in the supplementary material; the link is removed for blind review.
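
The growth idea in the abstract (start from a single linear layer, then add nodes during training) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the class `GrowingNet`, its methods, and the choice to give a new node zero outgoing weights (a common function-preserving trick) are hypothetical and are not taken from the SSONN code. The abstract does not say how SSONN decides where or when to add nodes, so that decision is left out.

```python
import random


class GrowingNet:
    """A linear layer plus a growable set of ReLU hidden nodes.

    Hypothetical illustration only; not the SSONN implementation.
    """

    def __init__(self, n_in, n_out, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_out = n_in, n_out
        # Initial model: a single linear layer y = W x + b.
        self.W = [[self.rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out
        # Hidden nodes added later; each has input weights, a bias,
        # and output weights.
        self.hidden = []

    def forward(self, x):
        # Contribution of the original linear layer.
        y = [sum(w * xi for w, xi in zip(row, x)) + bj
             for row, bj in zip(self.W, self.b)]
        # Contribution of every hidden node added so far.
        for node in self.hidden:
            pre = sum(w * xi for w, xi in zip(node["w_in"], x)) + node["bias"]
            act = max(0.0, pre)  # ReLU
            for j in range(self.n_out):
                y[j] += node["w_out"][j] * act
        return y

    def add_node(self):
        """Add one hidden node whose outgoing weights are zero (assumption),
        so the network output is unchanged at the moment of growth."""
        self.hidden.append({
            "w_in": [self.rng.uniform(-0.1, 0.1) for _ in range(self.n_in)],
            "bias": 0.0,
            "w_out": [0.0] * self.n_out,
        })

    def num_params(self):
        linear = self.n_out * self.n_in + self.n_out
        per_node = self.n_in + 1 + self.n_out
        return linear + per_node * len(self.hidden)


net = GrowingNet(n_in=4, n_out=3)
x = [0.5, -1.0, 2.0, 0.0]
before = net.forward(x)
params_before = net.num_params()
net.add_node()  # growth step; a real method would choose where and when
after = net.forward(x)
print(before == after, params_before, net.num_params())
```

In this sketch the parameter count grows only by the cost of the added node, and the output is identical before and after growth, so subsequent training starts from the same function with more capacity.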

Supplementary Material: zip

Primary Area: learning theory

Submission Number: 11328
