Abstract: This paper introduces a novel deep-learning-assisted video list decoding method for error-prone
video transmission systems. Unlike traditional list decoding techniques, our proposed system uses a
Transformer-based no-reference image quality assessment method to select the highest-scoring reconstructed
video candidate after reception. Three new components are defined and used in the Transformer-assisted
image quality evaluation metric: neighborhood-based patch fidelity aggregation, discriminant color texture
transformation, and a ranking-constrained penalty loss function. We have also created our own database of
non-uniformly distorted images, similar to those that might result from transmission errors, in a High
Efficiency Video Coding (HEVC) context. In our specific testing context, our improved Transformer-assisted
method achieves a decision accuracy of 100% for intra-coded images and 96% for errors occurring in inter-coded
images. Notably, in the few cases where a wrong candidate is selected, its quality remains close to that of
the intact frame.
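The core selection step can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `score_candidate` is a stand-in for the Transformer-based no-reference IQA model (here replaced by a trivial smoothness proxy purely so the loop runs), and the decoder is assumed to have already produced the list of reconstructed frame candidates.

```python
import numpy as np

def score_candidate(frame: np.ndarray) -> float:
    # Stand-in for the Transformer-based no-reference IQA model:
    # a trivial smoothness proxy (negative mean gradient magnitude),
    # used here only to make the selection loop self-contained.
    gy, gx = np.gradient(frame.astype(float))
    return -float(np.mean(np.abs(gx)) + np.mean(np.abs(gy)))

def select_best_candidate(candidates: list[np.ndarray]) -> int:
    # List decoding yields several reconstructions of the corrupted
    # frame; keep the index of the highest-scoring candidate.
    scores = [score_candidate(c) for c in candidates]
    return int(np.argmax(scores))

# Toy demo: a clean luminance ramp vs. the same ramp with heavy noise,
# standing in for two decoded candidates of one damaged frame.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0, 255, 64), (64, 1))
noisy = clean + rng.normal(0, 40, clean.shape)
best = select_best_candidate([noisy, clean])
```

With the smoothness proxy, the clean ramp scores higher than the noisy candidate, so `best` is the index of the clean frame; in the paper's system the proxy is replaced by the learned quality metric.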