Abstract: Video rescaling helps videos fit different display devices. In video rescaling systems, videos are downsampled for easier storage, transmission, and preview, and the downsampled videos can be upsampled with a neural network to restore details when needed. Previous group-based video rescaling algorithms benefit from jointly downsampling and jointly upsampling multiple frames, but are restricted by the fully joint operation. In this paper, we propose a recurrent diffusion-based framework for video rescaling. We employ a biased joint operation and recurrent diffusion to make better use of the temporal relations among the frames in each image group, and we explicitly control the direction of information propagation by arranging the processing order of the frames. In the biased joint operation, we concentrate on restoring one frame, i.e., the middle frame, while the other frames in the group are only coarsely reconstructed. Our recurrent diffusion then compensates the coarse frames by gradually propagating information from the middle frame toward the border frames, both backward and forward; the recurrent diffusion module fuses the information of adjacent frames. The biased joint operation and recurrent diffusion are trained jointly. We design several propagation variants and find that our recurrent diffusion performs best among them; it also outperforms non-recurrent diffusion in both reconstruction quality and model size. In addition, we adopt a high-resolution fine-tuning strategy to further improve the quality of high-resolution frames. Experimental results demonstrate the effectiveness of the proposed method in terms of visual quality, quantitative evaluation, and computational efficiency. The code will be released at https://github.com/5ofwind/RDVR .
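The outward propagation order described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names (`recurrent_diffusion`, `fuse`), the list-of-frames representation, and the averaging fusion are all hypothetical placeholders, not the paper's learned fusion module.

```python
# Hypothetical sketch of the recurrent diffusion processing order:
# the middle frame is fully restored first, and the coarse neighbors
# are then refined outward, each fused with its already-refined
# neighbor. The `fuse` callable stands in for the learned module.

def recurrent_diffusion(frames, mid, fuse):
    """Refine a group of coarse frames outward from index `mid`."""
    refined = list(frames)
    # Forward pass: middle frame -> right border.
    for i in range(mid + 1, len(frames)):
        refined[i] = fuse(refined[i], refined[i - 1])
    # Backward pass: middle frame -> left border.
    for i in range(mid - 1, -1, -1):
        refined[i] = fuse(refined[i], refined[i + 1])
    return refined

# Toy usage with scalar "frames" and an averaging fusion:
# only the middle frame (value 10.0) carries restored detail.
out = recurrent_diffusion([0.0, 0.0, 10.0, 0.0, 0.0],
                          mid=2,
                          fuse=lambda cur, nbr: (cur + nbr) / 2)
# out == [2.5, 5.0, 10.0, 5.0, 2.5]
```

The sketch only captures the propagation schedule; in the paper, each fusion step is a trained network operating on frame features rather than an average of scalars.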