Concept-Centric Token Interpretation for Vector-Quantized Generative Models

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Vector-Quantized Generative Models (VQGMs) have emerged as powerful tools for image generation. However, the key component of VQGMs---the codebook of discrete tokens---is still not well understood: for example, which tokens are critical to generating an image of a certain concept? This paper introduces Concept-Oriented Token Explanation (CORTEX), a novel approach for interpreting VQGMs by identifying concept-specific token combinations. Our framework employs two methods: (1) a sample-level explanation method that analyzes token importance scores in individual images, and (2) a codebook-level explanation method that explores the entire codebook to find globally relevant tokens. Experimental results demonstrate CORTEX's efficacy in providing clear explanations of token usage in the generative process, outperforming baselines across multiple pretrained VQGMs. Besides enhancing the transparency of VQGMs, CORTEX is useful in applications such as targeted image editing and shortcut feature detection. Our code is available at https://github.com/YangTianze009/CORTEX.
Lay Summary: Generative models can create impressive images from text, but it’s often unclear how they represent different concepts internally. For example, when asked to generate “a bird” or “a doctor,” what parts of the model are actually responsible for shaping the result? We developed a method called CORTEX that helps uncover which visual building blocks, known as tokens, are most important for generating a specific concept. It works in two ways: first, by identifying the key tokens used in individual images, and second, by searching the model’s entire vocabulary to find the combinations that define a concept. CORTEX helps us understand what generative models have learned and how they use that knowledge to create images. It can reveal when certain concepts are depicted with bias and identify which visual tokens are responsible. This also enables targeted image editing by changing only the relevant parts.
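To make the idea of sample-level token importance concrete, the sketch below illustrates one way such scores could be computed for a VQ-encoded image: each spatial token is perturbed by swapping it for a random codebook entry, and the resulting drop in a concept classifier's score is recorded. This is a hypothetical illustration, not the CORTEX implementation; the helpers `encode_to_tokens`, `decode_tokens`, and `concept_score` are assumed placeholders for a real VQGM encoder, decoder, and concept probe.

```python
import torch

def sample_level_importance(img, concept, encode_to_tokens, decode_tokens,
                            concept_score, codebook_size, n_samples=8):
    """Illustrative (hypothetical) token-importance scoring for a VQGM.

    Assumed interfaces (placeholders, not part of CORTEX):
      encode_to_tokens(img)      -> LongTensor of token indices, shape (H, W)
      decode_tokens(tokens)      -> reconstructed image tensor
      concept_score(img, concept)-> float, probability the image shows `concept`
    """
    tokens = encode_to_tokens(img)                       # (H, W) discrete indices
    base = concept_score(decode_tokens(tokens), concept) # score of the original image
    importance = torch.zeros(tokens.shape, dtype=torch.float)

    for i in range(tokens.shape[0]):
        for j in range(tokens.shape[1]):
            drops = []
            for _ in range(n_samples):
                perturbed = tokens.clone()
                # Replace a single token with a random codebook entry.
                perturbed[i, j] = torch.randint(codebook_size, (1,)).item()
                drops.append(base - concept_score(decode_tokens(perturbed), concept))
            # A large average drop means this token matters for the concept.
            importance[i, j] = sum(drops) / len(drops)
    return importance
```

A codebook-level explanation, by contrast, would search over the whole codebook for token combinations that reliably produce a concept across many generated images, rather than scoring tokens within a single image; the paper's actual procedures are described in the linked repository and full text.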
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/YangTianze009/CORTEX
Primary Area: Deep Learning->Everything Else
Keywords: Vector-Quantized Generative Model, Explainability, Information Bottleneck
Submission Number: 2678