On Self-Adaptive Perception Loss Function for Sequential Lossy Compression

TMLR Paper5878 Authors

12 Sept 2025 (modified: 13 Jan 2026) · Decision pending for TMLR · CC BY 4.0
Abstract: We consider causal, low-latency, sequential lossy compression, with mean squared error (MSE) as the distortion loss and a perception loss function (PLF) to enhance the realism of reconstructions. As the main contribution, we propose and analyze a new PLF that considers the joint distribution between the current source frame and the previous reconstructions. We establish the theoretical rate-distortion-perception function for first-order Markov sources and analyze the Gaussian model in detail. Qualitatively, the proposed loss avoids the error-permanence phenomenon while better exploiting the temporal correlation between high-quality reconstructions. We refer to the proposed loss as the self-adaptive perception loss function (PLF-SA), as its behavior adapts to the quality of the reconstructed frames. We provide a detailed comparison of the proposed perception loss function with previous approaches through both information-theoretic analysis and experiments on the moving MNIST and UVG datasets.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We would like to sincerely thank the action editor and reviewers for their careful evaluation of our work and for providing thoughtful and constructive feedback. The comments highlighted areas where additional experiments and further clarification of theoretical points would strengthen the paper. We have revised the draft accordingly, and the main updates are summarized below.

New experimental results:
- A comparison with LPIPS-style perceptual losses has been added in Figure 1.
- Temporal evaluation using the FloLPIPS metric is now included in Table 4.
- More results on fourth-frame reconstructions have been incorporated into Figure 1.
- Additional experiments on high-resolution datasets, including the Vimeo-90K validation set and the UVG-1080p dataset, have been added to broaden the empirical evaluation. The corresponding RD curves are presented in Figure 5.

Clarifications of theoretical statements: We have expanded several explanations, refined proofs, and added remarks addressing the reviewers’ questions about the RDP formulation, Markov assumptions, minimizer existence, ….

All major additions and edits are marked in blue for easy reference. We greatly appreciate the reviewers’ insights, which have helped improve the clarity, completeness, and empirical support of the paper.
Assigned Action Editor: ~Charles_Xu1
Submission Number: 5878