Adaptive Backbone Selection for Efficient and Real-Time Vision Inference

Syed Amir Hamza; Alexander Jesser

Published: 11 Jun 2025, Last Modified: 10 Jul 2025. Venue: ES-FoMo III. License: CC BY 4.0

Keywords: Efficient Inference, Green AI, Reinforcement Learning, Dynamic Inference, Adaptive Computation, Foundation Models

Abstract: Modern vision assistants often rely on large, static backbones regardless of input complexity, leading to unnecessary energy use and latency, especially on edge devices. We introduce Adaptive Backbone Selection (ABS), a dynamic inference framework that selects the most appropriate CNN backbone for each image in real time. ABS combines a lightweight complexity analyzer (based on edge and texture richness) with a policy network, trained via reinforcement learning, that learns to balance accuracy and latency through a custom reward function. To mitigate switching overhead, a memory-efficient Backbone Manager with LRU caching handles model reuse. Evaluated on ImageNet, ABS establishes a new operating point on the accuracy-efficiency frontier, achieving higher accuracy than strong baselines like DenseNet121 at a fraction of the computational cost. Our work presents a practical and deployable system for building more sustainable and responsive real-time AI.
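
The abstract names two mechanisms without giving their details: a Backbone Manager with LRU caching and a reward that trades accuracy against latency. The following is a minimal sketch of how such pieces could look, not the authors' implementation. The class name `BackboneManager`, the `loaders` mapping, the `capacity` parameter, the linear form of `reward`, and the `latency_weight` value are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class BackboneManager:
    """Hypothetical LRU cache that keeps at most `capacity` backbones loaded."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]], capacity: int = 2) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loaders = loaders          # name -> zero-argument function that builds the model
        self._capacity = capacity
        self._cache: "OrderedDict[str, Any]" = OrderedDict()
        self.loads = 0                   # counts cache misses, i.e. actual model loads

    def get(self, name: str) -> Any:
        """Return the backbone `name`, loading it on a miss and evicting the LRU entry."""
        if name in self._cache:
            self._cache.move_to_end(name)    # mark as most recently used
            return self._cache[name]
        model = self._loaders[name]()
        self.loads += 1
        self._cache[name] = model
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # drop the least recently used backbone
        return model

    def cached(self) -> List[str]:
        """Names currently held, ordered from least to most recently used."""
        return list(self._cache)


def reward(correct: bool, latency_ms: float, latency_weight: float = 0.01) -> float:
    """Assumed reward shape: 1 for a correct prediction minus a linear latency penalty."""
    return (1.0 if correct else 0.0) - latency_weight * latency_ms
```

In this sketch a policy would pick a backbone name per image, call `get` to obtain the model, and receive `reward` as its training signal; repeated picks of the same backbone hit the cache and avoid reload cost.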

Submission Number: 87
