Keywords: Selective prediction; Early Exits
TL;DR: Selective prediction methods for Early Exits
Abstract: Inference latency and trustworthiness of Deep Neural Networks (DNNs) are the bottlenecks in deploying them in critical applications like autonomous driving. Early Exit (EE) DNNs overcome the latency issue by allowing samples to exit from intermediary layers if they attain high confidence scores on the predicted class. However, DNNs are known to exhibit overconfidence, which can lead to many samples exiting early and renders EE strategies untrustworthy. We use Selective Prediction (SP) to overcome this issue by checking the hardness of the samples rather than relying on the confidence score alone. We propose SPEED, a novel approach that uses Deferral Classifiers (DCs) at each layer to check the hardness of samples before performing EEs. The DC at each layer identifies whether a sample is hard and either defers its inference to the next layer or directly sends it to an expert. Early detection of hard samples and using an expert for inference avoids wasting computational resources and improves trust. We also investigate the generalization capability of DCs trained on one domain when applied to other domains where target domain data is not readily available. We observe that EE aided by SP improves both accuracy and latency. Our method reduces the risk by 50% with a speedup of $2.05\times$ compared to the final layer. The anonymized source code is available at https://anonymous.4open.science/r/SPEED-35DC/README.md.
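The abstract describes an inference-time routing scheme: at each layer a deferral classifier flags hard samples for an expert, while easy, confident samples exit early and uncertain ones defer to the next layer. Below is a minimal, hypothetical PyTorch sketch of that control flow; the module names (`exit_heads`, `deferral_classifiers`, `expert`), thresholds, and layer sizes are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of early-exit inference gated by per-layer deferral
# classifiers (DCs). All names and threshold values are assumptions.
import torch
import torch.nn as nn

NUM_LAYERS, HIDDEN, NUM_CLASSES = 4, 32, 10

backbone = nn.ModuleList(nn.Linear(HIDDEN, HIDDEN) for _ in range(NUM_LAYERS))
exit_heads = nn.ModuleList(nn.Linear(HIDDEN, NUM_CLASSES) for _ in range(NUM_LAYERS))
# One binary DC per layer: predicts whether the sample is "hard".
deferral_classifiers = nn.ModuleList(nn.Linear(HIDDEN, 2) for _ in range(NUM_LAYERS))
expert = nn.Linear(HIDDEN, NUM_CLASSES)  # stand-in for a larger expert model

CONF_THRESHOLD = 0.9  # confidence required to exit early (assumed value)
HARD_THRESHOLD = 0.5  # DC probability above which a sample counts as hard (assumed)

@torch.no_grad()
def speed_style_inference(x: torch.Tensor):
    """Route one sample: exit early, defer to the next layer, or send to the expert."""
    h = x
    for layer, head, dc in zip(backbone, exit_heads, deferral_classifiers):
        h = torch.relu(layer(h))
        p_hard = torch.softmax(dc(h), dim=-1)[..., 1]
        if p_hard > HARD_THRESHOLD:
            # Hard sample detected early: skip the remaining layers, consult the expert.
            return "expert", torch.softmax(expert(h), dim=-1)
        probs = torch.softmax(head(h), dim=-1)
        if probs.max() > CONF_THRESHOLD:
            # Easy sample with a confident prediction: exit at this layer.
            return "early_exit", probs
        # Otherwise defer: continue to the next layer's DC and exit head.
    return "final_layer", torch.softmax(exit_heads[-1](h), dim=-1)

route, probs = speed_style_inference(torch.randn(HIDDEN))
print(route, probs.argmax().item())
```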
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12370