Edge AI Online Training Architecture Using Multi-Phase-Quantization Optimizer

Published: 01 Jan 2024 · Last Modified: 12 Jun 2025 · IJCNN 2024 · CC BY-SA 4.0
Abstract: Artificial intelligence (AI) based on neural networks is now widely deployed, making hardware for AI training computations vital. In a previous study, we proposed a hardware-oriented optimizer (Holmes) that enables faster training with a smaller memory footprint than conventional optimizers. As a further step toward future edge AI online training, in this study we developed a hardware architecture that incorporates Holmes and exploits parallelization and pipelining to achieve a significant throughput improvement. To evaluate this architecture, we used software simulation to investigate how throughput and memory size scale with the degree of parallelism and with the neural network model size. The results demonstrated that memory and computational resources scale linearly with the model size.
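The abstract does not specify Holmes's update rule, so the following is only a minimal, hypothetical Python sketch of the general idea behind a hardware-oriented optimizer: keeping optimizer state in a low-bit quantized form to shrink the memory footprint of on-device training. All function names, bit widths, and scale factors below are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch: a momentum-style optimizer whose state is stored as
# int8 instead of float32, illustrating how quantized optimizer state can
# reduce training memory. Bit width and scale are assumed, not from the paper.
import numpy as np

def quantize(x, bits=8, scale=0.1):
    """Uniform symmetric quantization of optimizer state to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)

def dequantize(q, scale=0.1):
    """Map quantized state back to float32 for the update computation."""
    return q.astype(np.float32) * scale

def quantized_momentum_step(w, grad, q_momentum, lr=0.01, beta=0.9, scale=0.1):
    """One training step with momentum kept in quantized (int8) storage."""
    m = beta * dequantize(q_momentum, scale) + grad   # reconstruct, then update
    w = w - lr * m                                    # parameter update
    return w, quantize(m, scale=scale)                # re-quantize the state

# Toy usage: a single step on a random gradient.
rng = np.random.default_rng(0)
w = rng.standard_normal(4).astype(np.float32)
q_m = np.zeros(4, dtype=np.int8)                      # quantized momentum state
w, q_m = quantized_momentum_step(w, rng.standard_normal(4).astype(np.float32), q_m)
print(w, q_m)
```

Storing the momentum in int8 rather than float32 cuts that state's memory by 4x per parameter, which is the kind of footprint saving the abstract attributes to Holmes; the actual multi-phase quantization scheme is described in the paper itself.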