A Low Complexity Convolutional Neural Network with Fused CP Decomposition for In-Loop Filtering in Video Coding

Tong Shao; Jay N. Shingala; Peng Yin; Arjun Arora; Ajay Shyam; Sean McCarthy


Published: 01 Jan 2023, Last Modified: 16 May 2025 · DCC 2023 · CC BY-SA 4.0

Abstract: This paper proposes a novel low-complexity convolutional neural network with fused CP decomposition for in-loop filtering in video coding. Starting from the baseline model in JVET-X0140, the regular 3x3 convolutional layers are replaced by pointwise and separable convolutions via CP decomposition. The 1x1 pointwise convolutional layers produced by the decomposition are then fused with their adjacent regular 1x1 convolutional layers, so that each such pair becomes a single 1x1 convolutional layer. Together, the two steps reduce the model complexity from 33.6 KMAC/Pixel to 16.265 KMAC/Pixel. Experimental results show that the model achieves a 4.45% luma BD-Rate gain over VTM NNVC-2.0. It incurs a BD-Rate loss of (0.56%, -0.63%, -1.89%) for (Y, U, V) in the random access (RA) configuration and (0.51%, 0.21%, 0.39%) in the all intra (AI) configuration, while CPU decoding time is reduced by 19% for RA and 24% for AI. These results indicate that the fused CP decomposition model reduces complexity substantially while maintaining a good trade-off with coding performance. A plot of BD-Rate against KMAC/Pixel also shows a superior trade-off between complexity and coding gain compared to state-of-the-art filters.
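
The two steps in the abstract can be illustrated with a small arithmetic sketch. It is not the authors' implementation: the channel counts, the rank, and the four-layer layout (1x1, 3x1 depthwise, 1x3 depthwise, 1x1) are assumptions chosen for illustration, and the numbers it produces are not the paper's KMAC/Pixel figures. The fusion helper also assumes that no activation or other nonlinearity sits between the two 1x1 layers being merged.

```python
"""Illustrative MAC-per-pixel arithmetic and 1x1 fusion (hypothetical sizes)."""


def conv_macs(c_in, c_out, kh, kw):
    """MACs per output pixel for a dense kh x kw convolution (stride 1)."""
    return c_in * c_out * kh * kw


def cp_macs(c_in, c_out, rank, k=3):
    """MACs per pixel for an assumed rank-`rank` CP layout:
    1x1 (c_in -> rank), k x 1 depthwise, 1 x k depthwise, 1x1 (rank -> c_out)."""
    return c_in * rank + rank * k + rank * k + rank * c_out


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_1x1(w, x):
    """Apply a 1x1 convolution (weight w: c_out x c_in) to one pixel's channels."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def fuse_1x1(w_second, w_first):
    """Merge two back-to-back linear 1x1 convolutions into one.
    Valid only when nothing nonlinear sits between them."""
    return matmul(w_second, w_first)


# Hypothetical layer sizes, not taken from the paper.
dense = conv_macs(64, 64, 3, 3)      # 64 * 64 * 9
decomposed = cp_macs(64, 64, 64)     # 64*64 + 64*3 + 64*3 + 64*64

# Tiny integer example: a 3 -> 2 layer followed by a 2 -> 3 layer.
w_first = [[1, 0, 2], [0, 1, -1]]
w_second = [[1, 1], [2, 0], [0, 3]]
x = [1, 2, 3]

two_step = apply_1x1(w_second, apply_1x1(w_first, x))
one_step = apply_1x1(fuse_1x1(w_second, w_first), x)
```

In this toy setting `two_step` and `one_step` agree, which is the property that lets a decomposed layer's pointwise convolution be folded into a neighbouring 1x1 layer. Whether the fused layer costs fewer MACs than the pair depends on the channel counts involved.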
