Divide and Conquer: Video Inpainting for Diminished Reality in Low-Resource Settings

Published: 01 Jan 2024 · Last Modified: 10 Nov 2024 · ISBI 2024 · CC BY-SA 4.0
Abstract: Although recent deep learning-based inpainting techniques achieve excellent restoration quality, their stringent requirement for computational resources such as GPU/VRAM renders them difficult to use in settings where high-end hardware is unavailable. We address this gap between research and real-world applications in our submission to the DREAMING challenge, which explores such a use case – diminished reality – where runtime and GPU hardware are limited. Specifically, this paper proposes a method that optimally divides a video into sections of variable length for the downstream inpainting model. By maximizing the number of frames, i.e., the spatio-temporal information, in each subvideo, inference with a state-of-the-art (SOTA) model can often be performed without significant degradation in output quality. In addition, to expedite inference, we examine techniques that intelligently reduce the number of input pixels, e.g., downsampling and cropping, while maintaining decent inpainting quality.
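The abstract does not specify the partitioning algorithm, but the core idea – contiguous subvideos made as long as a pixel/VRAM budget allows, with optional downsampling to speed up inference – can be sketched as follows. This is a minimal illustration under assumed interfaces: `max_pixels_per_subvideo`, `inpaint_subvideo`, and the stride-based downsampling are hypothetical stand-ins, not the authors' actual method or API.

```python
from typing import Callable, List

import numpy as np


def split_video(num_frames: int, height: int, width: int,
                max_pixels_per_subvideo: int) -> List[range]:
    """Greedily partition a video into contiguous subvideos, each holding as
    many frames (i.e., as much spatio-temporal context) as the budget allows."""
    frames_per_chunk = max(1, max_pixels_per_subvideo // (height * width))
    return [range(start, min(start + frames_per_chunk, num_frames))
            for start in range(0, num_frames, frames_per_chunk)]


def inpaint_video(frames: np.ndarray, masks: np.ndarray,
                  inpaint_subvideo: Callable[[np.ndarray, np.ndarray], np.ndarray],
                  max_pixels_per_subvideo: int,
                  downsample: int = 1) -> np.ndarray:
    """Run a (hypothetical) video inpainting model chunk by chunk so that each
    forward pass stays within a fixed pixel budget, a proxy for VRAM."""
    n, h, w = frames.shape[:3]
    if downsample > 1:
        # Naive stride-based spatial downsampling, illustrating the
        # pixel-reduction idea from the abstract; a real system would
        # use proper resampling and upscale the result afterwards.
        frames = frames[:, ::downsample, ::downsample]
        masks = masks[:, ::downsample, ::downsample]
        h, w = frames.shape[1:3]

    out = np.empty_like(frames)
    for chunk in split_video(n, h, w, max_pixels_per_subvideo):
        idx = slice(chunk.start, chunk.stop)
        out[idx] = inpaint_subvideo(frames[idx], masks[idx])
    return out
```

The greedy split here is the simplest possible instantiation; the paper's "optimal" division of a video into variable-length sections may account for scene content or mask size, which this sketch does not model.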