Abstract: VVC is the next-generation video coding standard, in which inter prediction plays an important role in reducing the redundancy between adjacent frames. However, coding time increases because larger blocks and more extensive motion search are supported, and the accuracy of inter prediction is limited because the conventional algorithm uses only temporal information. This work makes use of YOLOv5 to refine inter prediction in VVC, introducing an architecture that combines detected objects and tracking results with the proposed NR-Frame, which performs faster prediction of coded blocks within the detected objects. Experimental results demonstrate that the proposed method achieves an average 11.45% (up to 13.27%) reduction in coding time under the random access (RA) configuration compared with VTM-13.0.