Disentangling Primitive Representation Structures for Image Generation

ICLR 2026 Conference Submission 2780 Authors

07 Sept 2025 (modified: 02 Dec 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Generative Model, Explainable Artificial Intelligence, Interpretability, Image Generation
TL;DR: This paper presents a method to explain the internal representation structure of a neural network for image generation.
Abstract: This paper explains a neural network for image generation from a new perspective, i.e., by explaining its representation structures for image generation. We propose a set of desirable properties that define the representation structure of a neural network for image generation: feature completeness, spatial boundedness, and consistency. These properties enable us to propose a method for disentangling primitive feature components from intermediate-layer features, where each feature component generates a primitive regional pattern covering multiple image patches. In this way, the generation of the entire image can be explained as a superposition of these feature components. We prove that feature components satisfying the feature completeness property and the linear additivity property (derived from the feature completeness, spatial boundedness, and consistency properties) can be computed as OR Harsanyi interactions. Experiments verify the faithfulness of the disentangled primitive regional patterns.
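To make the OR Harsanyi interaction mentioned in the abstract concrete, below is a minimal, illustrative Python sketch. It is not the authors' released code: the function `or_harsanyi_interactions` and the scalar set function `v` (the network output evaluated on a masked input, keeping only the patches in a given subset) are hypothetical placeholders, and the formula assumed here is the standard definition of OR interactions from the AND-OR interaction literature, I_or(S) = -Σ_{T⊆S} (-1)^{|S|-|T|} v(N\T) for S ≠ ∅, with I_or(∅) = v(∅).

```python
from itertools import combinations

def or_harsanyi_interactions(v, n):
    """Compute OR Harsanyi interaction effects I_or(S) for every subset S
    of n input patches, given a scalar set function v defined on frozensets
    of patch indices (e.g., the network output on a partially masked input).

    Assumed formula (standard OR-interaction definition, not taken from the
    paper's text): I_or(S) = -sum_{T subseteq S} (-1)^{|S|-|T|} v(N \ T)
    for non-empty S, and I_or(empty set) = v(empty set).

    Cost is exponential in n, so this is only feasible for small patch sets.
    """
    N = frozenset(range(n))
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(range(n), r)]
    # Cache v on the complements N \ T that the formula repeatedly queries.
    v_cache = {T: v(N - T) for T in subsets}

    effects = {}
    for S in subsets:
        if not S:
            effects[S] = v(frozenset())
            continue
        total = 0.0
        for r in range(len(S) + 1):
            for T_tuple in combinations(sorted(S), r):
                T = frozenset(T_tuple)
                total += (-1) ** (len(S) - len(T)) * v_cache[T]
        effects[S] = -total
    return effects


# Toy usage: patches 0 and 1 act as an OR gate on the (surrogate) output,
# so the method should assign a strong OR interaction to the pair {0, 1}.
def v(S):
    return 1.0 if (0 in S or 1 in S) else 0.0

effects = or_harsanyi_interactions(v, n=3)
print(effects[frozenset({0, 1})])  # 1.0: a nonzero OR interaction effect
```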
Primary Area: interpretability and explainable AI
Submission Number: 2780