Scene-Adaptive Temporal Stabilisation for Video Colourisation Using Deep Video Priors

Marc Górriz Blanch, Noel E. O'Connor, Marta Mrak

Published: 2022, Last Modified: 14 May 2025, ECCV Workshops (4) 2022, CC BY-SA 4.0

Abstract: Automatic image colourisation methods applied independently to each video frame usually lead to flickering artefacts or propagation of errors because of differences between neighbouring frames. While this can be partially solved using optical flow methods, complex scenarios such as the appearance of new objects in the scene limit the efficiency of such solutions. To address this issue, we propose applying blind temporal consistency, learned during the inference stage, to consistently adapt colourisation to the given frames. However, training at test time is extremely time-consuming, and its performance is highly dependent on the content, motion, and length of the input video, requiring a large number of iterations to generalise to complex sequences with multiple shots and scene changes. This paper proposes a generalised framework for colourisation of complex videos with an optimised few-shot training strategy to learn scene-aware video priors. The proposed architecture is jointly trained to stabilise the input video and to cluster its frames with the aim of learning scene-specific modes. Experimental results show performance improvement in complex sequences while requiring less training data and significantly fewer iterations.
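
The intuition behind test-time fitting with scene-specific modes can be illustrated with a deliberately tiny toy, which is not the architecture proposed in the paper. It assumes that the per-frame colouriser reduces to a scalar gain applied to each pixel, that the gain flickers from frame to frame, and that scene labels are supplied by hand, whereas the paper learns the clustering jointly with stabilisation. All names (`fit_gain`, `make_scene`, `flicker`) are hypothetical. Fitting one shared parameter across the frames of a scene cannot reproduce the frame-to-frame inconsistency, so the flicker disappears; fitting a single parameter across two different scenes instead yields a compromise that suits neither.

```python
def fit_gain(frames_in, frames_out, steps=200, lr=0.5):
    """Fit one scalar gain w by gradient descent so that w * x ~ y over all given frames."""
    w = 1.0
    n = sum(len(frame) for frame in frames_in)
    for _ in range(steps):
        grad = 0.0
        for x_frame, y_frame in zip(frames_in, frames_out):
            for x, y in zip(x_frame, y_frame):
                grad += 2.0 * (w * x - y) * x  # derivative of the squared error
        w -= lr * grad / n
    return w


def apply_gain(w, frames_in):
    """Apply the shared gain to every frame."""
    return [[w * x for x in frame] for frame in frames_in]


def flicker(frames):
    """Mean absolute change of per-frame mean intensity between consecutive frames."""
    means = [sum(f) / len(f) for f in frames]
    return sum(abs(b - a) for a, b in zip(means, means[1:])) / (len(means) - 1)


def make_scene(pixels, base_gain, n_frames=8, amp=0.2):
    """Static scene whose toy 'colourised' output flickers by +/- amp around base_gain."""
    frames_in = [list(pixels) for _ in range(n_frames)]
    frames_out = [
        [(base_gain + amp * (-1) ** t) * x for x in pixels]
        for t in range(n_frames)
    ]
    return frames_in, frames_out


# Two hand-labelled scenes with different underlying mappings (assumed values).
in_a, out_a = make_scene([0.2, 0.4, 0.6, 0.8], base_gain=1.5)
in_b, out_b = make_scene([0.1, 0.3, 0.5, 0.7], base_gain=0.6)

w_shared = fit_gain(in_a + in_b, out_a + out_b)  # one prior for the whole video
w_a = fit_gain(in_a, out_a)                      # scene-specific prior for scene A
w_b = fit_gain(in_b, out_b)                      # scene-specific prior for scene B

stable_a = apply_gain(w_a, in_a)
print("flicker before:", flicker(out_a), "after:", flicker(stable_a))
print("shared gain:", w_shared, "scene gains:", w_a, w_b)
```

In this toy the per-scene gains recover the underlying mappings, while the shared gain lands between them. This mirrors, in a very reduced form, the abstract's argument that multi-shot sequences benefit from scene-aware priors.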