Cacomp: A Cloud-Assisted Collaborative Deep Learning Compiler Framework for DNN Tasks on Edge

Published: 01 Jan 2025, Last Modified: 24 Jul 2025 · IEEE Trans. Computers 2025 · CC BY-SA 4.0
Abstract: With the growth of edge computing, DNN services are now widely deployed on edge devices. The deployment efficiency of deep learning models depends on both inference optimization and the scheduling policy. However, traditional optimization methods on edge devices suffer from prohibitively long tuning times because of the devices' limited computational power. Meanwhile, the widely used dominant resource fairness (DRF) scheduling algorithm struggles to maximize the efficiency of model execution on edge devices and inevitably increases the average waiting time, since it is not designed for real-time distributed computing environments. In this paper, we propose Cacomp, a distributed cloud-assisted deep learning compiler framework that accelerates optimization on edge devices with assistance from the cloud and introduces a novel inference task scheduling algorithm. Our framework leverages tuning records from cloud devices and applies a two-step distillation strategy to obtain the best tuning record set for each edge device. For scheduling, we propose an RD-DRF algorithm that allocates inference tasks to edge devices based on dominant resource matching in real time. Extensive experiments show that our framework achieves up to 2.19x improvement in optimization time compared with other methods on edge devices. Our scheduling algorithm shortens the average waiting time of inference tasks by 30% and improves resource utilization by 20% on edge devices.
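The RD-DRF algorithm itself is not detailed in this abstract, so the minimal Python sketch below only illustrates the general idea of dominant-resource-based matching across heterogeneous edge devices; the device fields, demand format, and the greedy matching rule are assumptions for illustration, not the paper's actual method.

```python
# Illustrative sketch only: the paper's RD-DRF scheduler is not specified in this
# abstract. The structures and the greedy rule below are assumptions that convey
# the general idea of matching a task to the device where its dominant resource
# share is smallest.
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    capacity: dict                      # e.g. {"cpu": 4.0, "mem": 8.0}
    used: dict = field(default_factory=dict)

    def free(self, resource):
        return self.capacity[resource] - self.used.get(resource, 0.0)

    def dominant_share(self, demand):
        # Fraction of this device's capacity taken by the task's most demanded resource.
        return max(demand[r] / self.capacity[r] for r in demand)

    def fits(self, demand):
        return all(self.free(r) >= demand[r] for r in demand)

    def allocate(self, demand):
        for r, v in demand.items():
            self.used[r] = self.used.get(r, 0.0) + v


def assign_task(devices, demand):
    """Greedy dominant-resource matching: send the task to the feasible device
    where its dominant resource share is smallest, balancing load across edges."""
    candidates = [d for d in devices if d.fits(demand)]
    if not candidates:
        return None                     # no device can host the task; it must wait
    best = min(candidates, key=lambda d: d.dominant_share(demand))
    best.allocate(demand)
    return best


if __name__ == "__main__":
    edges = [Device("edge-0", {"cpu": 4.0, "mem": 8.0}),
             Device("edge-1", {"cpu": 2.0, "mem": 16.0})]
    task = {"cpu": 1.0, "mem": 2.0}     # hypothetical inference task demand
    chosen = assign_task(edges, task)
    print("assigned to", chosen.name if chosen else "queue")
```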