Abstract: In deep neural networks (DNNs), anytime inference refers to producing the target output from an intermediate layer rather than the final one, without re-running the input through the whole network. Models that support such inference have become essential in embedded AI, particularly for tasks that require near-real-time responses for a seamless user experience. Several existing techniques enable anytime inference by adding intermediate decision points to a DNN, but the number and placement of these points are either chosen heuristically or a point is simply placed after every layer, neither of which guarantees good performance. We therefore propose a novel framework for designing customized decision points best suited to a given task and model architecture. The decision points are placed by the ANNExR Optimal Exit Selector algorithm, which uses representative metrics such as prediction likelihood and uncertainty, together with user preference, to achieve optimal performance. For seamless execution, the framework additionally performs automatic DNN modularization based on the derived optimal decision points. Empirical results show that the adaptive decision/exit points reduce average latency while maintaining the classification model's accuracy.
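To make the anytime-inference setting concrete, the following is a minimal sketch of confidence-thresholded early exit, the general mechanism the abstract builds on. It is not the ANNExR Optimal Exit Selector itself: the exit placement, the toy layers, the `anytime_inference` helper, and the fixed confidence threshold are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def anytime_inference(x, exits, threshold=0.9):
    """Run the network block by block; return the prediction from the
    first exit whose top-class probability crosses the threshold.
    `exits` is a list of (block, classifier_head) pairs (hypothetical API)."""
    h = x
    for i, (block, head) in enumerate(exits):
        h = block(h)                      # forward through the next block
        probs = softmax(head(h))          # intermediate exit's class probabilities
        if probs.max() >= threshold or i == len(exits) - 1:
            return int(probs.argmax()), i  # (predicted class, exit index used)

# Toy two-exit "model": hand-crafted blocks and heads, purely illustrative.
exits = [
    (lambda h: h * 2.0, lambda h: h[:3]),        # exit 1
    (lambda h: h + 1.0, lambda h: h[:3] * 1.5),  # exit 2 (final, always fires)
]
x = np.array([0.1, 0.2, 3.0, 0.5])
pred, used_exit = anytime_inference(x, exits, threshold=0.8)
# On this confident input, inference stops at the first exit (used_exit == 0),
# skipping the remaining computation and reducing latency.
```

An exit-selection algorithm such as the one described above would decide, per task and architecture, where the `(block, head)` pairs go and how the confidence/uncertainty criterion is tuned, rather than exiting after every layer with a fixed threshold.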