Keywords: Lossless Semantic Compression, Vector Quantization, Representation Learning
TL;DR: We introduce SOLO-VQ, a vector-quantization method that achieves no loss in downstream task performance while reaching the information-theoretic rate lower bound in controlled environments.
Abstract: Is it possible to derive an optimally compact image representation that preserves semantic information without performance loss for a class of downstream tasks? This paper addresses this fundamental question by providing a formal definition of semantically lossless optimal compression. We introduce a framework called Semantic Optimal Lossless Vector Quantization (SOLO-VQ) as a practical realization of this concept. Unlike prior work, which often relies on heuristics and evaluates on generic image datasets where optimality is unverifiable, we propose a novel evaluation protocol. We construct a series of synthetic datasets and associated tasks for which the information-theoretic rate limits for lossless compression are computable. Within these controlled environments, we empirically demonstrate that SOLO-VQ achieves provably optimal and lossless compression, reaching the theoretical lower bounds. Our work establishes a principled foundation for goal-oriented semantic media compression and suggests a promising methodology for achieving this goal in real-world compressive image transmission.
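The abstract's two central quantities can be illustrated with a minimal sketch. SOLO-VQ's actual architecture is not described here, so the setup below is purely hypothetical: latents cluster around per-class centers, a VQ codebook assigns each latent to its nearest codeword (semantic losslessness means the codes recover the task labels), and the information-theoretic rate lower bound is the entropy H(Y) of the semantic variable. All names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical controlled environment: 4 semantic classes, each with a
# distinct latent center. Observations are centers plus small noise.
num_classes = 4
centers = rng.normal(size=(num_classes, 8))          # latent class centers
labels = rng.integers(0, num_classes, size=1000)     # semantic ground truth
latents = centers[labels] + 0.05 * rng.normal(size=(1000, 8))

# Vector quantization: assign each latent to its nearest codeword.
# Here the codebook is idealized as the true class centers.
codebook = centers
dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
codes = dists.argmin(axis=1)

# "Semantically lossless" means the discrete codes preserve the task labels.
print("semantic accuracy:", (codes == labels).mean())

# Rate lower bound: entropy H(Y), in bits, of the semantic variable.
p = np.bincount(labels, minlength=num_classes) / len(labels)
H = -(p[p > 0] * np.log2(p[p > 0])).sum()
print(f"rate lower bound: {H:.3f} bits/sample (log2 K = {np.log2(num_classes):.3f})")
```

With near-uniform labels, H(Y) approaches log2(K) bits per sample, which is the smallest average code length any lossless scheme for the semantic variable can attain; the paper's evaluation protocol checks whether a learned quantizer reaches this bound.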
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 15730