Smartadapt: Multi-branch Object Detection Framework for Videos on Mobiles

Ran Xu, Fangzhou Mu, Jayoung Lee, Preeti Mukherjee, Somali Chaterji, Saurabh Bagchi, Yin Li

CVPR 2022 (modified: 17 Nov 2022)

Abstract: Several recent works seek to create lightweight deep networks for video object detection on mobiles. We observe that many existing detectors, previously deemed computationally costly for mobiles, intrinsically support adaptive inference and offer a multi-branch object detection framework (MBODF). Here, an MBODF refers to a solution that has many execution branches, from which one can dynamically choose at inference time to satisfy varying latency requirements (e.g., by varying the resolution of an input frame). In this paper, we ask, and answer, a wide-ranging question that applies across all MBODFs: how to expose the right set of execution branches, and then how to schedule the optimal one at inference time? In addition, we uncover the importance of making a content-aware decision on which branch to run, as the optimal branch is conditioned on the video content. Finally, we explore content-aware schedulers, first an Oracle one and then a practical one, leveraging various lightweight feature extractors. Our evaluation shows that, layered on a Faster R-CNN-based MBODF and compared to 7 baselines, our Smartadapt achieves a higher Pareto optimal curve in the accuracy-vs-latency space for the ILSVRC VID dataset.
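
To make the idea of scheduling a branch under a latency requirement concrete, the following is a minimal sketch, not the paper's scheduler. It assumes each branch has a profiled latency and a per-content accuracy estimate. The `Branch` type, the `pick_branch` and `pareto_front` helpers, and every number below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Branch:
    """One hypothetical execution branch of a multi-branch detector."""
    name: str
    input_resolution: int   # assumed knob: shorter side of the input frame, in pixels
    est_latency_ms: float   # assumed to come from offline profiling (made-up values)
    est_accuracy: float     # assumed per-content accuracy estimate (made-up values)


def pick_branch(branches: Sequence[Branch], budget_ms: float) -> Optional[Branch]:
    """Return the most accurate branch whose latency fits the budget, or None."""
    feasible = [b for b in branches if b.est_latency_ms <= budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda b: b.est_accuracy)


def pareto_front(branches: Sequence[Branch]) -> List[Branch]:
    """Keep branches that no other branch beats on both latency and accuracy."""
    front: List[Branch] = []
    best_acc = float("-inf")
    # Scan from fastest to slowest; a slower branch survives only if it is more accurate.
    for b in sorted(branches, key=lambda b: (b.est_latency_ms, -b.est_accuracy)):
        if b.est_accuracy > best_acc:
            front.append(b)
            best_acc = b.est_accuracy
    return front


# Illustrative branches that differ only in input resolution.
BRANCHES = [
    Branch("r224", 224, 30.0, 0.55),
    Branch("r320", 320, 55.0, 0.63),
    Branch("r448", 448, 90.0, 0.62),   # slower and less accurate than r320
    Branch("r576", 576, 140.0, 0.71),
]

chosen = pick_branch(BRANCHES, budget_ms=100.0)
print(chosen.name if chosen else "no feasible branch")
print([b.name for b in pareto_front(BRANCHES)])
```

In this sketch the accuracy estimates are fixed constants. A content-aware scheduler in the sense of the abstract would instead refresh them from features of the current video, so the selected branch could change as the content changes even when the latency budget stays the same.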
