Deep Inter Prediction for Versatile Video Coding (VVC)

Qurat Ul Ain Aisha, Young-Ju Choi, Jongho Kim, Sung-Chang Lim, Jin Soo Choi, Byung-Gyu Kim

Published: 2024, Last Modified: 25 Jan 2026. J. Multim. Inf. Syst. 2024. License: CC BY-SA 4.0

Abstract: Sophisticated video surveillance systems face the problem of limited storage when video data must be recorded over long periods. Video compression technology is an effective solution to this problem. Inspired by the success of neural network-based approaches in computer vision, research on neural network-based video coding has emerged. Because inter prediction is central to compression efficiency, its investigation plays a crucial role in neural network-based video coding. In this paper, we propose a convolutional neural network (CNN)-based generation and enhancement method for inter prediction (GEIP) in the Versatile Video Coding (VVC) standard. By leveraging fused features and self-attended features based on an attention mechanism, the proposed method improves inter prediction performance. Compared with the VTM-11.0 NNVC-1.0 anchor, the proposed method achieves a BD-rate reduction of up to 7.06% on the Y component under the random access (RA) configuration.
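
For readers unfamiliar with the BD-rate metric quoted above, the following is a minimal sketch of the classic Bjøntegaard delta rate computation: a cubic polynomial is fitted to log-bitrate as a function of PSNR for each codec, and the average gap between the two fits over the shared PSNR range is converted to a percentage. The function name `bd_rate` and its parameters are illustrative, and the abstract does not state which BD-rate variant (polynomial fit or piecewise cubic interpolation) the authors used, so this should not be read as their evaluation script.

```python
import numpy as np


def bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr):
    """Average bitrate change (%) of the test codec versus the anchor at equal PSNR.

    Negative values mean the test codec needs fewer bits for the same quality.
    Each curve needs at least four rate/PSNR points for the cubic fit.
    """
    anchor_psnr = np.asarray(anchor_psnr, dtype=float)
    test_psnr = np.asarray(test_psnr, dtype=float)
    log_anchor = np.log10(np.asarray(anchor_rate, dtype=float))
    log_test = np.log10(np.asarray(test_rate, dtype=float))

    # Centre the PSNR axis so the cubic fit stays well conditioned.
    offset = np.mean(np.concatenate([anchor_psnr, test_psnr]))
    xa = anchor_psnr - offset
    xt = test_psnr - offset

    # Fit log10(rate) as a cubic polynomial of PSNR for each curve.
    poly_anchor = np.polyfit(xa, log_anchor, 3)
    poly_test = np.polyfit(xt, log_test, 3)

    # Integrate both fits over the PSNR interval covered by both curves.
    lo = max(xa.min(), xt.min())
    hi = min(xa.max(), xt.max())
    int_anchor = np.polyint(poly_anchor)
    int_test = np.polyint(poly_test)
    area_anchor = np.polyval(int_anchor, hi) - np.polyval(int_anchor, lo)
    area_test = np.polyval(int_test, hi) - np.polyval(int_test, lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_log_diff = (area_test - area_anchor) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


# Hypothetical rate (kbps) / PSNR (dB) points, not taken from the paper.
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]
psnr = [32.0, 35.0, 38.0, 41.0]
test_rate = [0.9 * r for r in anchor_rate]  # 10% fewer bits at every PSNR
print(round(bd_rate(anchor_rate, psnr, test_rate, psnr), 4))  # -10.0
```

In this reading, the reported reduction of up to 7.06% on the Y component corresponds to a BD-rate of about -7.06 computed from luma PSNR.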