Abstract: Edge computing and cloud computing have been utilized in many AI applications in various fields, such as computer vision, NLP, autonomous driving, and smart cities. To benefit from the advantages of both paradigms, we introduce HiDEC, a hierarchical deep neural network (DNN) inference framework with three novel features. First, HiDEC enables the training of a resource-adaptive DNN through the injection of multiple early exits. Second, HiDEC provides a latency-aware inference scheduler, which determines which input samples should exit locally on an edge device based on the exit scores, enabling inference on edge devices with insufficient resources to run the full model. Third, we introduce a dual thresholding approach allowing both easy and difficult samples to exit early. Our experiments on image and text classification benchmarks show that HiDEC significantly outperforms existing solutions.
0 Replies
Loading