Track: Systems and infrastructure for Web, mobile, and WoT
Keywords: Online Continual Learning; Real-Time Inference; Device-Cloud Collaboration.
Abstract: Online continual learning (CL) is becoming a mainstream paradigm to learn incrementally from task streams without forgetting previously learned knowledge. However, current online CL primarily focuses on the learning performance, such as avoiding catastrophic forgetting, neglecting the critical demands of real-time inference. As a result, the performance of real-time inference in online CL degrades significantly due to frequent data distribution variations and time-consuming incremental model adaptation. In this work, we propose ELITE, an online CL framework with device-cloud collaboration, to realize on-device real-time inference on time-varying task streams with performance guarantee. To realize on-device real-time inference in online CL, ELITE features a new design of the model zoo comprising various pre-trained models with the assistance of the cloud, and proposes a task-oriented on-device model selection to quickly retrieve the best-fit models instead of performing time-consuming model retraining. To prevent performance degradation on new tasks not available in the cloud, we introduces a latency-aware on-device model fine-tuning strategy to adapt to new tasks with accuracy-latency trade-off, and dynamically updates the model zoo in the cloud to enhance ELITE. Extensive evaluations on five real-world datasets have been conducted, and the results demonstrate that ELITE consistently outperforms the state-of-art solutions, improving the accuracy by 16.3\% on average and reducing the response latency by up to 1.98 times.
Submission Number: 84
Loading