Keywords: compression, information theory, explainable AI, rate-distortion, learning theory
TL;DR: We identify key failure modes in a popular neural compression framework, which encompasses most published neural compressors in the literature, and offer a new, unified perspective on understanding them through an information-theoretic lens.
Abstract: Artificial Neural Networks (ANNs) have revolutionized data compression by effectively learning nonlinear transformations directly from data. The Nonlinear Transform Coding (NTC) framework has demonstrated notable success in achieving favorable rate-distortion trade-offs, particularly for real-world multimedia such as images and video. Despite this progress, fundamental questions remain about whether NTC can compress various types of input sources optimally, and if not, where and why it falls short. To investigate these questions, we focus on simpler, closed-form sources for which optimal compression strategies are well-characterized using tools from information theory. Reviewing key failure modes of NTC-based compressors reported in the literature points to a common underlying issue: their difficulty in learning high-frequency and discontinuous functions, which leads to suboptimal compression performance relative to the information-theoretic optimum in certain setups. We also review several remedies that alleviate these failure modes, including a new one based on Fourier embeddings. By drawing a connection between these suboptimalities, our work provides a unified and fresh perspective on understanding them, thereby representing a step toward improving neural data compression.
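As a rough illustration of the Fourier-embedding remedy mentioned in the abstract, the sketch below shows a standard random Fourier feature mapping applied to inputs before they reach a learned transform. All names, dimensions, and the bandwidth parameter `sigma` are hypothetical choices for this sketch; the paper's exact embedding may differ.

```python
import numpy as np

def fourier_embed(x, B):
    """Map inputs x of shape (N, d) to Fourier features of shape (N, 2m).

    B is an (m, d) matrix of random frequencies, e.g. B ~ N(0, sigma^2);
    sigma controls the bandwidth (how high-frequency the embedding can be).
    """
    proj = 2.0 * np.pi * x @ B.T                              # (N, m) projections onto frequencies
    return np.concatenate([np.cos(proj), np.sin(proj)], -1)   # (N, 2m) sinusoidal features

# Hypothetical usage: embed 1-D source samples before an NTC encoder network.
rng = np.random.default_rng(0)
sigma = 10.0                               # assumed bandwidth; would be tuned per source
B = sigma * rng.standard_normal((16, 1))   # 16 random frequencies for 1-D inputs
x = rng.uniform(size=(4, 1))               # a few sample source realizations
features = fourier_embed(x, B)             # shape (4, 32), fed to the learned transform
```

The intuition is that sinusoidal features at multiple frequencies make high-frequency or discontinuous target mappings easier for a standard MLP to fit, which is the failure mode the abstract attributes to plain NTC transforms.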
Submission Number: 62