Abstract: In this paper, we propose a new deep learning-based
quality ranking framework to assist video list decoding methods
in the context of unreliable video transmissions. The objective is
to identify an intact image (corrected video frame) among a list
of candidate images generated by a list decoding method, where
all candidates, except for the intact image, are corrupted. The
framework comprises a deep learning-based no-reference image
quality assessment (NR-IQA) system for non-uniform video
distortions (NUD) that ranks the candidate images according to their
quality, which allows identifying the best one. To show the
validity of our proposed framework, we develop an NR-IQA
system relying on a proven patch-based convolutional neural
network (CNN) architecture, which we adapt to better account
for the non-uniform distortions observed in the candidate images,
e.g., those caused by H.265 transmission errors over wireless channels.
Specifically, we modify the patch size on which our CNN for
non-uniform distortions (CNN-NUD) operates to capture a larger
and more meaningful spatial context. Moreover, we develop a
new training database using images resulting from various bit
modifications in the received video packets, to simulate the list
decoding process, and train the system using a full reference
IQA (FR-IQA) method. Experiments on intra frames of videos
encoded using H.265 show the ability of this system to identify
an intact image among a set of five candidate images with an
average accuracy of 96.6%, whereas traditional NR-IQA metrics
and the initially trained CNN system achieve poor accuracies of
only 15.7% and 33.6%, respectively.