Strategies at a glance: A comparative analysis of training techniques for optimizing early-exit deep neural networks
Abstract: Early-exit deep neural networks (DNNs) enable adaptive inference by allowing predictions at intermediate layers, thereby reducing computational cost. However, their performance is highly sensitive to the chosen training strategy, a factor that remains underexplored. This study presents the first systematic comparison of six prominent strategies (Joint, Separate, Branch-wise, Two-stage, Distillation-based, and Hybrid) across three architectures (MobileNet, ResNet, VGG) using core benchmarks (CIFAR-10 and CIFAR-100). To evaluate scalability and domain generalization, we extend our experiments to ImageNet-100 and ChestX-ray14. For each setup, we assess convergence behavior, accuracy, overfitting, and training efficiency, supported by statistical validation via ANOVA and Tukey's HSD tests. Results reveal key trade-offs: Joint and Distillation-based strategies offer strong generalization but incur higher computational cost; Two-stage and Branch-wise training are prone to overfitting at deeper exits; Separate training underperforms at early exits. In contrast, Hybrid strategies achieve the best balance of accuracy and efficiency. These insights offer practical guidance for optimizing early-exit DNNs under resource constraints and lay a principled foundation for future research on efficient training paradigms.
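To make the strategy distinctions concrete, the following is a minimal, hypothetical PyTorch sketch of the Joint strategy: the shared backbone and all exit classifiers are optimized together under a single weighted sum of per-exit cross-entropy losses. The architecture, loss weights, and names (EarlyExitCNN, exit_weights) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: Joint training of an early-exit CNN.
# All exits and the backbone receive gradients from one combined loss.
import torch
import torch.nn as nn

class EarlyExitCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Two backbone stages with an exit head after each.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit1 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
        self.exit2 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

    def forward(self, x):
        h1 = self.stage1(x)
        h2 = self.stage2(h1)
        # Return logits from every exit for joint optimization.
        return [self.exit1(h1), self.exit2(h2)]

model = EarlyExitCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
exit_weights = [0.5, 1.0]  # illustrative per-exit loss weights

# One joint-training step on a dummy CIFAR-10-sized batch.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
logits_per_exit = model(images)
loss = sum(w * criterion(logits, labels)
           for w, logits in zip(exit_weights, logits_per_exit))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Under this framing, the other strategies differ mainly in what gets optimized when: Separate training would freeze the backbone while fitting each exit head on its own, and Two-stage training would first train the backbone with the final exit, then attach and train the intermediate exits.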