Towards Real-time Video Compressive Sensing on Mobile Devices

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 | MM2024 Poster | CC BY 4.0
Abstract: Video Snapshot Compressive Imaging (SCI) uses a low-speed 2D camera to capture high-speed scenes as snapshot compressed measurements, and then applies a reconstruction algorithm to retrieve the high-speed video frames. Fast-evolving mobile devices and existing high-performance video SCI reconstruction algorithms motivate us to develop mobile reconstruction methods for real-world applications. Yet deploying previous reconstruction algorithms on mobile devices remains challenging because of their complex inference process, and real-time mobile reconstruction is harder still. To the best of our knowledge, no video SCI reconstruction model has been designed to run on mobile devices. To this end, we present an effective approach for video SCI reconstruction, dubbed MobileSCI, which runs at real-time speed on mobile devices for the first time. Specifically, we first build a U-shaped 2D convolution-based architecture, which is much more efficient and mobile-friendly than previous state-of-the-art reconstruction methods. In addition, we introduce an efficient feature mixing block, based on channel splitting and shuffling mechanisms, as a novel bottleneck block of MobileSCI to reduce the computational burden. Finally, a customized knowledge distillation strategy further improves the reconstruction quality. Extensive results on both simulated and real data show that MobileSCI achieves superior reconstruction quality with high efficiency on mobile devices. In particular, we can reconstruct a 256 × 256 × 8 snapshot compressed measurement in real time (about 35 FPS) on an iPhone 15. The code for this paper will be released.
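The abstract does not specify the internals of the feature mixing block, so the following is only a minimal NumPy sketch of the general channel split-and-shuffle idea, in the style popularized by ShuffleNet-type networks. The function names `channel_shuffle` and `split_mix_shuffle`, the half-and-half split, and the placeholder `transform` are all assumptions made for illustration; they are not the authors' implementation.

```python
import numpy as np


def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave the channels of a (C, H, W) tensor across `groups` groups."""
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # (C, H, W) -> (groups, C/groups, H, W) -> swap group axes -> (C, H, W)
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)


def split_mix_shuffle(x: np.ndarray, transform, groups: int = 2) -> np.ndarray:
    """Hypothetical block: transform only half of the channels, then shuffle.

    `transform` stands in for whatever cheap operation processes the active
    half (assumed here to preserve the tensor shape).
    """
    c = x.shape[0]
    passthrough, active = x[: c // 2], x[c // 2:]
    mixed = np.concatenate([passthrough, transform(active)], axis=0)
    # Shuffling lets untouched and transformed channels mix in later layers.
    return channel_shuffle(mixed, groups)


# Tiny example: 4 channels of size 1 x 1, with values equal to the channel index.
x = np.arange(4, dtype=np.float64).reshape(4, 1, 1)
y = split_mix_shuffle(x, transform=lambda a: a * 10.0)
# Channel values after the block: 0, 20, 1, 30
```

Because only half of the channels pass through `transform`, the per-block cost is roughly halved relative to transforming every channel, which is the usual motivation for this kind of design.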
Primary Subject Area: [Systems] Systems and Middleware
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: In recent years, video Snapshot Compressive Imaging (SCI) has attracted much attention because it can capture high-speed scenes using a low-speed camera with low bandwidth. A video SCI system has two stages: hardware encoding and software decoding. In the hardware encoding stage, the high-speed scene is first modulated by different masks, and the modulated scene is then compressed into a series of snapshot measurements, which are captured by a low-speed camera. In the software decoding stage, the captured snapshot measurements and the modulation masks are fed into a reconstruction algorithm to retrieve the desired video frames. Video SCI offers an elegant solution for multimedia processing scenarios with limited bandwidth, such as video surveillance. Many successful video SCI hardware systems have been built so far, and numerous deep learning-based video SCI reconstruction algorithms with impressive reconstruction quality have followed. Unfortunately, to the best of our knowledge, no video SCI reconstruction model has been designed to run on mobile devices. To this end, we propose a simple yet effective network for real-time video SCI reconstruction on mobile devices. By combining the proposed optical setup with our MobileSCI network, we contribute a promising way to build a complete mobile video SCI system with real-time performance. We believe that this work can substantially advance the development of video SCI in real multimedia applications.
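The hardware encoding stage described above can be sketched numerically as an element-wise modulation of each frame by its mask, followed by a sum over time into a single snapshot. The sketch below is a simplified, noise-free model; the function name `sci_forward`, the random binary masks, and the random test frames are assumptions for illustration, while the 256 × 256 × 8 size follows the abstract.

```python
import numpy as np


def sci_forward(frames: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """Compress T frames of shape (T, H, W) into one (H, W) snapshot.

    Each frame is multiplied element-wise by its modulation mask and the
    modulated frames are summed along the time axis.
    """
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must share the shape (T, H, W)")
    return (masks * frames).sum(axis=0)


rng = np.random.default_rng(0)
T, H, W = 8, 256, 256  # 8 frames of 256 x 256 pixels per snapshot

frames = rng.random((T, H, W))  # stand-in for the high-speed scene
# Random binary masks are an assumption; the source does not state the mask type.
masks = rng.integers(0, 2, size=(T, H, W)).astype(frames.dtype)

measurement = sci_forward(frames, masks)  # single 2D snapshot measurement
```

A reconstruction algorithm then receives `measurement` and `masks` and has to estimate all `T` frames, which is why the decoding stage carries most of the computational cost.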
Supplementary Material: zip
Submission Number: 1016