Quantization Compensator Network: Server-Side Feature Reconstruction in Partitioned IoT Systems

Isaac Sánchez Leal, Oscar Artur Bernd Berg, Silvia Krug, Eiraj Saqib, Irida Shallari, Axel Jantsch, Mattias O’Nils, Tomas Nordström

Published: 01 Jan 2025, Last Modified: 13 Jan 2026 · IEEE Access · CC BY-SA 4.0
Abstract: With the growing number of IoT devices generating data at the edge, there is a rising demand to run machine learning (ML) models directly on these resource-constrained nodes. To overcome hardware limitations, a common approach is to partition the model between the node and a more capable edge or cloud server. However, this introduces a communication bottleneck, especially for transmitting intermediate feature maps. Extreme quantization, such as 1-bit quantization, drastically reduces communication cost but causes significant accuracy degradation. Existing solutions like full-model retraining offer limited recovery, while methods such as autoencoders shift the computational burden to the IoT node. In this work, we propose the Quantization Compensator Network (QCNet)—a lightweight, server-side module that reconstructs high-fidelity feature maps directly from 1-bit quantized data. QCNet is used alongside fine-tuning of the server-side model and introduces no additional computation on the IoT node. We evaluate QCNet across diverse vision models (ResNet50, ViT-B/16, ConvNeXt Tiny, and YOLOv3 Tiny) and tasks (classification, detection), showing that it consistently outperforms standard dequantization, autoencoder-based, and Quantization-Aware Training (QAT) approaches. Remarkably, QCNet achieves accuracy close to—or even surpassing—that of the original unpartitioned models, while maintaining a favorable accuracy–latency trade-off. QCNet offers a practical and efficient solution for enabling accurate distributed intelligence on communication- and compute-limited IoT platforms.
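To make the pipeline described in the abstract concrete, the sketch below shows where the pieces sit: the IoT node binarizes an intermediate feature map to 1 bit per value, and the server dequantizes and then applies a compensator before the server-side half of the model. This is a minimal illustration under assumed details — the abstract does not specify the exact quantizer (sign plus per-channel scale is an assumption), and the `ToyCompensator` here is a trivial per-channel affine stand-in, not the paper's actual learned QCNet module.

```python
import numpy as np

def quantize_1bit(fmap):
    """Edge side: binarize a (C, H, W) feature map to 1 bit per element.
    Assumed scheme: transmit sign bits plus one per-channel scale."""
    bits = (fmap > 0).astype(np.uint8)       # 1 bit/element instead of 32
    scale = np.abs(fmap).mean(axis=(1, 2))   # per-channel scale, shape (C,)
    return bits, scale

def dequantize(bits, scale):
    """Server side: the standard-dequantization baseline (+scale / -scale)."""
    signs = bits.astype(np.float32) * 2.0 - 1.0
    return signs * scale[:, None, None]

class ToyCompensator:
    """Stand-in for QCNet: a per-channel affine correction. The real QCNet
    is a lightweight neural module; this only shows its position in the
    pipeline (server side, after dequantization, before the fine-tuned
    server-side model)."""
    def __init__(self, channels):
        self.gain = np.ones(channels, dtype=np.float32)
        self.bias = np.zeros(channels, dtype=np.float32)

    def __call__(self, x):
        return x * self.gain[:, None, None] + self.bias[:, None, None]

# IoT node: compute and compress an intermediate feature map (C=8, H=W=4).
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4)).astype(np.float32)
bits, scale = quantize_1bit(fmap)

# Server: dequantize, then reconstruct with the compensator.
recon = ToyCompensator(channels=8)(dequantize(bits, scale))
print(recon.shape)
```

Note that all compensation happens after transmission, so the node's cost is exactly the 1-bit quantization — unlike autoencoder-based schemes, no encoder runs on the node.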