Learn to Compress (LtC): Efficient Learning-based Streaming Video Analytics

Quazi Mishkatul Alam, Israat Haque, Nael B. Abu-Ghazaleh

Published: 2024, Last Modified: 25 Jan 2025, Venue: NOMS 2024, License: CC BY-SA 4.0

Abstract: Video analytics are often performed as cloud services in edge settings, primarily to offload computation and also in situations where the results are not directly consumed at the video source. Sending high-quality video data from end devices can be expensive in terms of both bandwidth and power use. To build a streaming video analytics pipeline that makes efficient use of these resources, it is imperative to reduce the size of the video streams. Traditional video compression algorithms are unaware of the semantics of the video, and can be both inefficient and harmful to the analytics performance. In this paper, we introduce LtC, a collaborative framework between the video source and the analytics server that efficiently learns to reduce the video streams within an analytics pipeline. Specifically, LtC uses the full-size video analytics algorithm at the server as a teacher to train a lightweight student neural network, which is then deployed at the video source. The student network is trained to capture the semantic significance of different regions within a video, which is used to selectively preserve the crucial regions in high quality while aggressively compressing the remaining regions. Furthermore, LtC incorporates a novel temporal filtering algorithm based on feature differencing to omit transmitting frames that do not contribute new information. Overall, LtC reduces bandwidth usage by 28-35% and attains a response delay that is up to 45% shorter than current state-of-the-art methods, while maintaining comparable analytics performance.
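The abstract names two source-side mechanisms: temporal filtering based on feature differencing, and region-wise quality selection driven by the student network's output. The abstract does not give their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each frame has already been reduced to a feature vector and each region to an importance score. The function names (`feature_distance`, `select_frames`, `region_quality`), the Euclidean distance metric, the thresholds, and the quality levels are all hypothetical choices made for illustration.

```python
import math
from typing import List, Optional, Sequence


def feature_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the metric is illustrative; the abstract does not name one.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_frames(features: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return the indices of frames that would be transmitted.

    The first frame is always sent. A later frame is sent only if its
    features differ from those of the most recently sent frame by more
    than `threshold` (a hypothetical tuning parameter).
    """
    sent: List[int] = []
    reference: Optional[Sequence[float]] = None
    for index, feat in enumerate(features):
        if reference is None or feature_distance(feat, reference) > threshold:
            sent.append(index)
            reference = feat  # later frames are compared against this one
    return sent


def region_quality(
    saliency: Sequence[float], cutoff: float, high: int = 90, low: int = 20
) -> List[int]:
    """Map per-region importance scores to encoder quality levels.

    Regions scoring at or above `cutoff` keep `high` quality; the rest
    are compressed aggressively at `low` quality. All values are
    hypothetical placeholders.
    """
    return [high if score >= cutoff else low for score in saliency]
```

For example, `select_frames([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 1.1], [3.0, 3.0]], threshold=0.5)` returns `[0, 2, 4]`, and `region_quality([0.9, 0.2, 0.6], cutoff=0.5)` returns `[90, 20, 90]`. Comparing each frame against the last transmitted frame, rather than the immediately preceding one, is a design choice of this sketch: it prevents slow, gradual scene changes from being dropped indefinitely.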