Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization

ICLR 2026 Conference Submission 701 Authors (anonymous)

02 Sept 2025 (modified: 23 Dec 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: vector quantization, code collapse, discrete autoencoder, smoothing
TL;DR: This study proposes a simple and effective regularization method for smoothed vector quantization that simultaneously prevents code collapse and enforces tight approximation.
Abstract: Vector quantization, which discretizes a continuous vector space into a finite set of representative vectors (a *codebook*), has been widely adopted in modern machine learning. Despite its effectiveness, vector quantization poses a fundamental challenge: the non-differentiable quantization step blocks gradient backpropagation. *Smoothed* vector quantization addresses this issue by relaxing the hard assignment of a codebook vector into a weighted combination of codebook entries, represented as the matrix product of a simplex vector and the codebook. Effective smoothing requires **two properties**: 1. each smoothed quantizer should remain close to a one-hot vector, ensuring tight approximation, and 2. all codebook entries should be utilized, preventing *code collapse*. Existing methods typically address these desiderata separately. By contrast, the present study introduces **a simple and intuitive regularization that promotes both simultaneously** by minimizing the distance between each simplex vertex and its $K$-nearest smoothed quantizers. Experiments on representative benchmarks—including discrete image autoencoding and contrastive speech representation learning—demonstrate that the proposed method achieves more reliable codebook utilization and improves performance compared to prior approaches.
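The regularizer described in the abstract can be sketched as follows. This is a minimal NumPy illustration of the stated idea only, not the authors' implementation: for each one-hot simplex vertex, it finds the $K$ nearest smoothed quantizers in a batch and penalizes their mean distance to that vertex. The function name, the choice of Euclidean distance, and the batch-mean reduction are assumptions.

```python
import numpy as np

def vertex_regularizer(q, K):
    """Sketch of the vertex-pulling regularizer described in the abstract.

    q : (N, C) array, each row a smoothed quantizer on the probability simplex.
    K : number of nearest quantizers attracted to each vertex.

    Returns the mean distance from each of the C one-hot simplex vertices
    to its K nearest smoothed quantizers (distance metric is an assumption).
    """
    N, C = q.shape
    vertices = np.eye(C)  # the C one-hot simplex vertices, shape (C, C)
    # Pairwise Euclidean distances between vertices and quantizers, shape (C, N).
    d = np.linalg.norm(vertices[:, None, :] - q[None, :, :], axis=-1)
    # Keep only the K smallest distances per vertex, then average.
    knn = np.sort(d, axis=1)[:, :K]
    return knn.mean()
```

Minimizing this quantity pulls some quantizers toward every vertex (discouraging dead codes) while pushing those quantizers toward one-hot form (tight approximation), which is how the single term can serve both desiderata.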
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 701