Kernel Binary Optimizer (KBOP): Latent-free Optimizer for Binary Neural Networks

Ovanes Petrosian; Aleksandr Rozengard; Maksim Dromashko; Nikita Izmailov; Majid Abbasov; Alexander Allakhverdyan; Anastasiia Zhadan; Salimov Bogdan; Andrey Rychkov; Ilya Lukashevich; Ilia Zharikov; Li Yin; Zhiyuan Lv

20 Sept 2025 (modified: 12 Mar 2026); ICLR 2026 Conference Withdrawn Submission; CC BY 4.0

Keywords: Binary neural network, Binary-specific initialization, Momentum-based weight update, Learnable scaling factors

Abstract: Binary Neural Networks (BNNs) are receiving growing interest because they bring otherwise energy-intensive deep learning to resource-limited edge devices. Traditionally, training methods for such models rely on minimizing the quantization error in forward propagation and on approximating the sign function in full-precision models. However, such methods do not exploit the nature of training in the space of binary weights. To address this issue, we propose a latent-free method called Kernel Binary Optimizer (KBOP) to enable binary deep learning. The proposal rests on three technical innovations: (i) a sign-changing rule that reverses the sign of a binary weight according to the value of the gradient calculated at the binary point, where the rigidity of this rule is regulated by the learning rate (LR); (ii) a learnable scaling factor that partially integrates the full-precision nature of the input into the binary optimization space, using a different LR from the sign-changing rule; and (iii) a BNN Initialization (BNN Init) procedure that stabilizes the beginning of binary learning. The novelty of KBOP lies not only in its binary optimization approach but also in its seamless integration into existing architectures, such as convolutional or transformer networks. To demonstrate this, we compare our approach with state-of-the-art (SoTA) BNN training methods on the super-resolution task and on large language models, where it improves inference performance. Experimental results indicate that our method closes, on average, 36.83% of the gap between existing SoTA BNN model results and full-precision model performance. We also provide a theoretical proof of the convergence of KBOP and a theoretical justification for BNN Init.
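
To make the first two components concrete, the following is a minimal illustrative sketch, not the KBOP algorithm itself, whose exact update rule is not stated in the abstract. It assumes one plausible reading: a binary weight is flipped when a momentum-smoothed gradient step taken from the binary point would cross zero, so a larger LR makes flips easier, and the scaling factor is updated by plain gradient descent with its own LR. The function names `sign_flip_step` and `scale_step` and the parameters `beta` and `tau` are hypothetical.

```python
def sign_flip_step(weights, grads, momentum, lr, beta=0.9, tau=1.0):
    """One hypothetical latent-free update of binary weights in {-1, +1}.

    Assumption (not from the paper): a weight w is flipped when
    lr * w * m > tau, where m is an exponential moving average of the
    gradient evaluated at the binary point. No latent full-precision
    copy of the weights is stored.
    """
    new_weights, new_momentum = [], []
    for w, g, m in zip(weights, grads, momentum):
        m = beta * m + (1.0 - beta) * g      # smoothed gradient signal
        # A descent step w - lr * m changes sign exactly when lr * w * m > 1;
        # tau generalizes that threshold.
        if lr * w * m > tau:
            w = -w
        new_weights.append(w)
        new_momentum.append(m)
    return new_weights, new_momentum


def scale_step(alpha, grad_alpha, lr_scale):
    """Plain gradient step on a full-precision scaling factor.

    lr_scale is deliberately separate from the LR of the sign-changing rule.
    """
    return alpha - lr_scale * grad_alpha
```

Under these assumptions the LR plays the role of a flip threshold rather than a step size: with the same gradients, a small `lr` leaves every weight in place, while a larger `lr` flips the weights whose gradient pushes them toward the opposite sign.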

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 24616
