Dynamic Weighting: Exploiting the Potential of a Single Weight Across Different Modes

24 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Weights Augmentation, Shadow Weights, Plain Weights, Accuracy-Oriented Mode, Desire-Oriented Mode
Abstract: Weights play an essential role in determining the performance of deep networks. This paper introduces a new concept termed ``Weight Augmentation Strategy'' (WAS), which emphasizes the exploration of the weight space rather than traditional network structure design. The core of WAS is the use of randomly transformed weight coefficients, referred to as Shadow Weights (SW), to compute the loss function and update the parameters of deep networks. In contrast, stochastic gradient descent is applied to the Plain Weights (PW), \textit{i.e.}, the original weights of the network before the random transformation. During training, numerous SW collectively form a high-dimensional space, while PW is learned directly from the distribution of SW. To maximize the benefits of WAS, we introduce two operational modes, \textit{i.e.}, the Accuracy-Oriented Mode (AOM) and the Desire-Oriented Mode (DOM). Concretely, AOM relies on PW, which ensures that the network remains highly robust and accurate. Meanwhile, DOM utilizes SW and is determined by the specific objective of our proposed WAS, such as reduced computational complexity or lower sensitivity to particular data. These dual modes can be switched at any time as needed, providing flexibility and adaptability across tasks. By extending the concept of augmentation from data to weights, our WAS offers an easy-to-understand and easy-to-implement technique that can significantly enhance almost all networks. Our experimental results demonstrate that convolutional neural networks, including VGG-16, ResNet-18, ResNet-34, GoogleNet, MobileNetV2, and EfficientNet-Lite, benefit substantially at little to no additional cost. On the CIFAR-100 and CIFAR-10 datasets, model accuracy increases by an average of 7.32\% and 9.28\%, respectively, with the highest improvements reaching 13.42\% and 18.93\%. In addition, DOM can reduce floating-point operations (FLOPs) by up to 36.33\%.
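
The abstract describes training with Shadow Weights (a random transformation of the Plain Weights) while the gradient update is applied to the Plain Weights themselves. The snippet below is a minimal PyTorch-style sketch of one such training step, assuming additive Gaussian noise as the random transformation; the function name `was_training_step`, the `noise_std` parameter, and the choice of transform are illustrative assumptions, not the authors' implementation.

```python
import torch

def was_training_step(model, x, y, loss_fn, optimizer, noise_std=0.01):
    """One WAS-style update (illustrative sketch only).

    Plain Weights (PW) live in model.parameters(); Shadow Weights (SW)
    are a random transformation of PW used for the forward/backward pass.
    The gradient computed through SW is then applied to PW by the optimizer.
    Additive Gaussian noise is an assumed transform, chosen for simplicity.
    """
    # Sample the random perturbation that turns PW into SW.
    perturbations = [torch.randn_like(p) * noise_std for p in model.parameters()]

    # Apply the transformation: PW -> SW.
    with torch.no_grad():
        for p, eps in zip(model.parameters(), perturbations):
            p.add_(eps)

    # Loss and gradients are computed with the Shadow Weights.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # Restore the Plain Weights before the SGD step,
    # so the update is applied to PW rather than SW.
    with torch.no_grad():
        for p, eps in zip(model.parameters(), perturbations):
            p.sub_(eps)

    optimizer.step()
    return loss.item()
```

In this reading, Accuracy-Oriented Mode corresponds to evaluating with the restored Plain Weights as-is, while Desire-Oriented Mode would re-apply a task-specific transformation to the weights at inference time; the exact transforms used for DOM (e.g., for reducing FLOPs) are not specified in the abstract.
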
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3545