Shortcut-V2V: Compression Framework for Video-to-Video Translation based on Temporal Redundancy Reduction

Published: 01 Jan 2023 · Last Modified: 12 Nov 2024 · ICCV 2023 · CC BY-SA 4.0
Abstract: Video-to-video translation aims to generate video frames of a target domain from an input video. Despite its usefulness, existing networks require enormous computation, necessitating model compression for wide deployment. While compression methods that improve computational efficiency exist for various image/video tasks, a generally applicable compression method for video-to-video translation has received little study. In response, we present Shortcut-V2V, a general-purpose compression framework for video-to-video translation. Shortcut-V2V avoids full inference for every neighboring video frame by approximating the intermediate features of the current frame from those of the previous frame. Moreover, our framework introduces a new block called AdaBD that adaptively blends and deforms the features of neighboring frames, enabling more accurate prediction of the intermediate features. We conduct quantitative and qualitative evaluations with well-known video-to-video translation models on various tasks to demonstrate the general applicability of our framework. The results show that Shortcut-V2V achieves performance comparable to that of the original video-to-video translation model while reducing computational cost by 3.2-5.7× and memory usage by 7.8-44× at test time. Our code and videos are available at https://shortcut-v2v.github.io/.
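To make the shortcut idea concrete, below is a minimal PyTorch-style sketch of how such keyframe-based feature reuse could look. This is not the authors' implementation: the submodule names (`early`, `heavy`, `late`, `adabd`), the fixed keyframe interval, and the exact AdaBD signature are all assumptions for illustration; the sketch only shows the control flow of running the expensive layers on keyframes and approximating their output on the frames in between.

```python
import torch.nn as nn

class ShortcutV2VSketch(nn.Module):
    """Illustrative sketch of keyframe-based feature reuse (not the paper's code).

    `early`, `heavy`, `late`, and `adabd` are hypothetical submodules:
    the heavy middle layers run on keyframes only, and an AdaBD-like block
    approximates their output for the remaining frames by adaptively
    blending/deforming cached features from the last keyframe.
    """

    def __init__(self, early, heavy, late, adabd, keyframe_interval=4):
        super().__init__()
        self.early, self.heavy, self.late = early, heavy, late
        self.adabd = adabd                  # hypothetical blend-and-deform block
        self.k = keyframe_interval          # assumed fixed keyframe schedule

    def forward(self, frames):
        outputs, ref_shallow, ref_deep = [], None, None
        for t, x in enumerate(frames):
            shallow = self.early(x)         # cheap layers run on every frame
            if t % self.k == 0:             # keyframe: full inference
                deep = self.heavy(shallow)
                ref_shallow, ref_deep = shallow, deep
            else:                           # shortcut: approximate deep features
                deep = self.adabd(shallow, ref_shallow, ref_deep)
            outputs.append(self.late(deep))
        return outputs
```

The savings come from skipping `heavy` on non-keyframes: only the cheap early/late layers and the small approximation block run on every frame, which is consistent with the reported test-time compute and memory reductions.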