Hardware-Aware Training for Multiplierless Convolutional Neural Networks

Rémi Garcia, Léo Pradels, Silviu-Ioan Filip, Olivier Sentieys

Published: 2025 (ARITH 2025). Last Modified: 27 Feb 2026. License: CC BY-SA 4.0
Abstract: Many computer vision tasks rely on convolutional neural networks (CNNs). These networks have a significant computational cost and complex implementations, in particular on embedded systems. A common way to implement CNNs on integrated circuits is to use low-precision quantized weights and activations instead of the de facto standard floating-point (FP) ones, which substantially reduces implementation cost. However, quantization degrades accuracy, and Quantization-Aware Training (QAT) is one of the most popular approaches to mitigate this issue. In this article, we introduce a multiplierless-aware training approach that significantly reduces hardware resource consumption. We propose to incrementally fix weights to their current values based on their implementation cost. To compute this cost, we rely on a Multiple Constant Multiplication (MCM) shift-and-add solving technique. With this idea, we show a global implementation cost reduction of around 25% w.r.t. a vanilla QAT approach without hardware usage in the loop. Compared to state-of-the-art multiplierless-aware training methods, the accuracy of our networks is closer to that of a vanilla QAT baseline.
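The per-weight implementation cost mentioned in the abstract can be illustrated with a common single-constant estimate: in a shift-and-add multiplier, the adder count for a constant is roughly the number of nonzero digits in its canonical signed-digit (CSD) representation minus one. The sketch below is only this simple bound, not the paper's actual MCM-based solver, which additionally shares subexpressions across multiple constants; the function name is illustrative.

```python
def shift_add_cost(c: int) -> int:
    """Estimate the adders/subtractors needed to multiply by the
    integer constant `c` using shifts and adds, via the number of
    nonzero digits in its canonical signed-digit (CSD) form.

    This is a per-constant upper-bound heuristic, not the joint
    MCM optimization used in the paper.
    """
    c = abs(c)
    nonzeros = 0
    while c:
        if c & 1:
            # In CSD, a run of ones is replaced by +1/-1 digits:
            # digit is +1 if c % 4 == 1, -1 if c % 4 == 3.
            d = 2 - (c % 4)
            c -= d  # consume this signed digit
            nonzeros += 1
        c >>= 1
    # n nonzero digits combine with n - 1 additions/subtractions.
    return max(nonzeros - 1, 0)

# Examples: 7 = 8 - 1 needs one subtractor; 11 = 16 - 4 - 1 needs two;
# powers of two are free (pure shifts).
print(shift_add_cost(7))   # 1
print(shift_add_cost(11))  # 2
print(shift_add_cost(8))   # 0
```

An MCM solver can do better than summing these per-constant bounds, because an intermediate result (e.g., 3x = 2x + x) can be reused by several output constants; the paper uses such joint costs to decide which weights to fix first during training.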