Understanding Compositional Generalization via Hierarchical Concept Models

18 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: compositionality, extrapolation, latent-variable models, generative models
TL;DR: We formalize the inherent structures in natural data that enable compositional generalization.
Abstract: Compositional generalization -- the ability to understand and generate novel combinations of learned concepts -- enables models to extend their capabilities beyond limited experience. While humans perform this task naturally, we still lack a clear understanding of which theoretical properties enable this crucial capability and how to incorporate them into machine learning models. We propose that compositional generalization fundamentally requires decomposing high-level concepts into basic, low-level concepts that can be recombined across similar contexts, much as humans draw analogies between concepts. For example, someone who has never seen a peacock eating rice can envision the scene by relating it to prior observations of a chicken eating rice. In this work, we formalize these intuitive processes using the principles of causal modularity and minimal changes. We introduce a hierarchical data-generating process that naturally encodes different levels of concepts and their interaction mechanisms. Theoretically, we demonstrate that this approach enables compositional generalization under complex relations between composed concepts, advancing beyond prior work that assumes simpler interactions such as additive effects. Furthermore, we show that the true latent hierarchical model can be recovered from data under weaker conditions than previously required. Applying insights from our theoretical framework, we achieve significant improvements on benchmark datasets, corroborating our theory.
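To make the setup concrete, below is a minimal toy sketch of such a two-level data-generating process. It is not the paper's actual model: the function names (`sample_low_level`, `decode`), the concept ids, the latent dimensions, and the tanh decoder are all illustrative assumptions. Each high-level concept (e.g., a subject or an action) owns a modular mechanism that produces low-level latents, and a shared decoder mixes those latents into an observation; swapping one concept changes only its own mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 4
# Fixed random "decoder" weights, shared across all concept combinations.
W = rng.standard_normal((8, 2 * LATENT_DIM))

def sample_low_level(concept_id: int) -> np.ndarray:
    """Modular mechanism: each high-level concept has its own
    distribution over low-level latents (here, a concept-specific mean)."""
    mean = np.full(LATENT_DIM, float(concept_id))
    return rng.normal(loc=mean, scale=0.1)

def decode(z_subject: np.ndarray, z_action: np.ndarray) -> np.ndarray:
    """Nonlinear mixing of low-level latents into an observation;
    the mixing function is shared, so unseen pairs reuse the same mechanism."""
    return np.tanh(W @ np.concatenate([z_subject, z_action]))

# Seen combination: subject "chicken" (id 0) + action "eating" (id 1).
x_seen = decode(sample_low_level(0), sample_low_level(1))

# Novel combination: subject "peacock" (id 2) + the same action "eating".
# Only the subject mechanism changes (minimal change); the action mechanism
# and the decoder are reused unchanged (causal modularity).
x_novel = decode(sample_low_level(2), sample_low_level(1))
print(x_seen.shape, x_novel.shape)  # (8,) (8,)
```

The point of the sketch is that the novel (peacock, eating) observation is generated by the same shared mechanisms that produced the seen combinations, which is the sense in which modularity and minimal changes license extrapolation to unseen concept pairs.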
Primary Area: causal reasoning
Submission Number: 12039