HYPERNETWORK-BASED THRESHOLD OPTIMIZATION FOR TERNARY NEURAL NETWORKS

18 Sept 2025 (modified: 17 Oct 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Ternary quantization, Hypernetworks, Neural network compression, Adaptive quantization, Bilevel training, Edge computing, Distributed Training
TL;DR: We develop a bilevel training method for ternary networks using hypernetworks to learn efficient parameter updates, achieving competitive accuracy with reduced communication costs
Abstract: Training and serving DNNs across heterogeneous, bandwidth-limited devices is constrained more by communication than by FLOPs. In this setting, strict-ternary forward passes help on-device efficiency, but the dominant bottleneck remains shipping dense gradients or parameters each step. We instead use a hypernetwork that generates sparse, masked weight-update proposals from a low-dimensional latent, keeping quantization strict-ternary with a global threshold. Training is bilevel: an inner loop adapts the latent to each mini-batch by gradient descent, and an outer loop updates the hypernetwork with an accept-if-better mechanism for stability. This approach enables efficient distributed training because devices only communicate low-dimensional latent codes instead of full parameter updates. Our experiments demonstrate competitive accuracy with significant communication savings, reducing the data exchanged between devices from the size of the full parameter set to the size of compact latent representations.
Primary Area: optimization
Submission Number: 14142
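To make the training recipe described in the abstract concrete, below is a minimal PyTorch sketch of the bilevel loop: a hypernetwork maps a low-dimensional latent to a sparse, masked weight-update proposal, the forward pass is strict-ternary via a single global threshold (a straight-through estimator is assumed for gradients), the inner loop adapts the latent per mini-batch, and the outer loop applies an accept-if-better update. All names and settings (HyperNet, tau, the toy linear model, step counts) are illustrative assumptions, not the submission's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def ternarize(w, tau):
    # Strict-ternary projection to {-1, 0, +1} with one global threshold tau.
    return torch.sign(w) * (w.abs() > tau).float()

class HyperNet(nn.Module):
    # Maps a low-dimensional latent z to a sparse, masked weight-update proposal.
    def __init__(self, latent_dim, n_params, keep_frac=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, n_params)
        )
        self.keep_frac = keep_frac

    def forward(self, z):
        delta = self.body(z)
        # Mask: keep only the largest-magnitude fraction of proposed updates.
        k = max(1, int(self.keep_frac * delta.numel()))
        thresh = delta.abs().flatten().topk(k).values.min()
        return delta * (delta.abs() >= thresh).float()

torch.manual_seed(0)
in_dim, out_dim, latent_dim, tau = 32, 10, 16, 0.05
w_shadow = torch.randn(out_dim * in_dim) * 0.1        # full-precision shadow weights
hyper = HyperNet(latent_dim, w_shadow.numel())
outer_opt = torch.optim.Adam(hyper.parameters(), lr=1e-3)

def loss_fn(w_flat, x, y):
    w_q = ternarize(w_flat, tau)
    # Straight-through estimator: ternary values forward, identity gradient backward.
    w_t = (w_q - w_flat).detach() + w_flat
    return F.cross_entropy(x @ w_t.view(out_dim, in_dim).t(), y)

for step in range(100):
    x = torch.randn(64, in_dim)                       # stand-in mini-batch
    y = torch.randint(0, out_dim, (64,))

    # Inner loop: adapt the latent code to this mini-batch by gradient descent.
    z = torch.zeros(latent_dim, requires_grad=True)
    inner_opt = torch.optim.SGD([z], lr=0.1)
    for _ in range(5):
        inner_opt.zero_grad()
        loss_fn(w_shadow + hyper(z), x, y).backward()
        inner_opt.step()

    # Outer loop: accept-if-better update of the hypernetwork.
    with torch.no_grad():
        loss_before = loss_fn(w_shadow, x, y)
    outer_opt.zero_grad()
    delta = hyper(z.detach())
    loss_after = loss_fn(w_shadow + delta, x, y)
    loss_after.backward()
    if loss_after.item() < loss_before.item():        # accept only if the proposal helps
        outer_opt.step()
        with torch.no_grad():
            w_shadow += delta.detach()                # commit the accepted sparse update
    # In the distributed setting, only the low-dimensional z would be exchanged.

In this sketch the communication payload per device and per step would be the latent z (latent_dim scalars) rather than the full weight vector, which is the efficiency argument made in the abstract.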