Abstract: Deep neural networks (DNNs) have achieved notable performance in many fields, such as computer vision. Training a neural network on an edge device, commonly called on-device learning, has become crucial for applications demanding real-time processing and enhanced privacy. However, existing on-device learning methods often suffer from limitations such as reduced application accuracy, increased design and implementation complexity, and additional computational overhead, all of which hinder their effectiveness in reducing memory usage. In this paper, we address these issues by inspecting the memory usage of DNN training, analyzing the effects of different on-device learning strategies, and introducing a framework that integrates neural architecture search (NAS) and rematerialization. The NAS supernet provides a population of compressed subnets/architectures that can be trained without additional computational overhead, while rematerialization reduces memory consumption without accuracy loss. By leveraging the memory-saving effects of both supernet-based model compression and rematerialization, our proposed method obtains suitable models that fit within the memory constraint while achieving a better trade-off between training time and model performance. In the experiments, we fine-tune models on complex datasets (CIFAR-100 and CUB-200) on a Raspberry Pi. The experimental results demonstrate the effectiveness of our method in real-world on-device learning scenarios.
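The rematerialization component referred to above is the general technique of discarding intermediate activations in the forward pass and recomputing them during backpropagation. The sketch below is a minimal illustration of that idea using PyTorch's torch.utils.checkpoint; it is not the authors' framework, and the block structure and widths are hypothetical stand-ins for a subnet sampled from a supernet.

```python
# Minimal sketch (not the paper's implementation): rematerialization via
# activation checkpointing in PyTorch. Activations inside each checkpointed
# block are dropped after the forward pass and recomputed during backward,
# trading extra compute for lower peak training memory.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedSubnet(nn.Module):
    def __init__(self, width: int = 64, num_classes: int = 100):
        super().__init__()
        # Hypothetical subnet; in the described framework the width/depth
        # would be chosen by sampling from the NAS supernet.
        self.block1 = nn.Sequential(nn.Conv2d(3, width, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
        self.head = nn.Linear(width, num_classes)

    def forward(self, x):
        # Rematerialize each block's activations instead of storing them.
        x = checkpoint(self.block1, x, use_reentrant=False)
        x = checkpoint(self.block2, x, use_reentrant=False)
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)

model = CheckpointedSubnet()
loss = model(torch.randn(8, 3, 32, 32)).sum()
loss.backward()  # activations are recomputed here, reducing peak memory
```

The same checkpointing call can wrap any sequence of layers, so memory savings from rematerialization compose with the smaller footprint of a compressed subnet.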
External IDs: dblp:conf/gecco/ChenWH25