IVQ: Structured and Lightweight Vector Quantization via Binary Hierarchical Composition Inspired by $\textit{IChing}$
Abstract: Vector Quantization (VQ) has been widely used in visual and audio representation due to its effectiveness in compressing high-dimensional signals. However, existing VQ methods often rely on large, unstructured codebooks, which leads to inefficient code utilization and frequent codebook collapse. In this paper, we propose *IChing* Vector Quantization (IVQ), a lightweight and structured VQ framework inspired by *IChing*. IVQ introduces binary hierarchical composition and geometric symmetry relations into the codebook design, enabling a compact set of structured codes to represent the latent space while maintaining high utilization without codebook collapse. Experimental results show that, in audio representation, IVQ achieves superior quality with significantly smaller codebooks and consistently higher utilization rates than several VQ variants. Auxiliary experiments on visual reconstruction and cross-modal generation further validate the universality and robustness of IVQ. Code is released at https://github.com/chouliuzuo/IVQ.
Lay Summary: Vector Quantization (VQ) is a technique for compressing complex data into discrete representations, but traditional versions often suffer from unstructured and inefficient storage. Inspired by the ancient philosophy of $\textit{IChing}$, we developed IVQ, which uses a compact, hierarchical, and structured codebook to achieve high-performance and orderly data representation. This method significantly improves quantization efficiency while establishing a more stable and interpretable "common language" across diverse modalities.
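For readers unfamiliar with the terms above, the following is a minimal sketch of generic nearest-neighbor vector quantization and of the codebook utilization rate. It is not the IVQ method itself: the function names `quantize` and `utilization`, the toy codebook, and the toy latent vectors are all assumptions made for illustration, and utilization is taken here to mean the fraction of codes selected at least once.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest code (squared Euclidean distance)."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def utilization(indices, codebook_size):
    """Fraction of codebook entries that were selected at least once."""
    return len(set(indices)) / codebook_size


# Hypothetical toy data: four 2-D codes, four 2-D latent vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [1.1, 0.8], [0.2, 0.1]]

indices = quantize(latents, codebook)       # discrete representation of the latents
usage = utilization(indices, len(codebook))  # low values indicate unused codes
```

In this toy case only two of the four codes are ever selected, which is the kind of inefficient utilization the abstract attributes to large, unstructured codebooks.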
Primary Area: Deep Learning->Other Representation Learning
Keywords: IChing Vector Quantization, lightweight and structured codebook, Binary Hierarchical Composition, Geometric Symmetry Relations
Submission Number: 19999