Why Transformers Succeed and Fail at Compositional Generalization: Composition Equivalence and Module Coverage

ICLR 2026 Conference Submission 21717 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | CC BY 4.0
Keywords: compositional generalization, out-of-distribution generalization, systematic generalization, compositionality, identifiability, transformers
TL;DR: We study how composition equivalence and module coverage explain compositional generalization successes and failures in transformers, showing that direct models achieve superficially high benchmark performance only by exploiting equivalences.
Abstract: Compositional generalization—the ability to train on some combinations of modules and then generalize to unseen module combinations—is an important form of out-of-distribution generalization. A large body of work has evaluated this form of reasoning in transformer-based models, but the underlying mechanisms of success and failure remain poorly understood. We systematically evaluate compositional generalization in transformer-based models, and we identify two factors that play important roles in determining performance: ***composition equivalence*** and ***module coverage***. We show that the apparent performance of direct models (models trained only on final outputs) can be entirely due to exploiting composition equivalences—different sequences of modules that reduce to identical end-to-end functions. When benchmarks eliminate these equivalences, the performance of these models drops to *near zero*, showing their inability to generalize to compositions of known modules that produce novel end-to-end functions. We also identify two key failure modes of step-by-step learning (models trained on intermediate outputs): composition equivalences encourage shortcut learning in step-by-step models, and these models fail to generalize when specific modules always appear at certain positions or in fixed combinations in the training set. These findings provide new insights into the conditions under which the atomic modules that constitute a compositional task can be correctly learned by a model class for a specific train-test distribution.
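The notion of a composition equivalence—two different sequences of atomic modules that reduce to the same end-to-end function—can be illustrated with a toy sketch (not taken from the paper; the modules `add2`, `add3`, and `mul3` are hypothetical examples):

```python
# Toy illustration of composition equivalence: distinct module
# sequences that collapse to identical end-to-end functions.

def add2(x):  # atomic module: add 2
    return x + 2

def add3(x):  # atomic module: add 3
    return x + 3

def mul3(x):  # atomic module: multiply by 3
    return x * 3

def compose(*modules):
    """Build the end-to-end function for a sequence of modules,
    applied left to right."""
    def composed(x):
        for f in modules:
            x = f(x)
        return x
    return composed

# (add2, add3) and (add3, add2) are different sequences but the
# SAME end-to-end function — a composition equivalence. A direct
# model seeing one sequence at training time could "generalize"
# to the other without learning anything compositional.
assert compose(add2, add3)(5) == compose(add3, add2)(5) == 10

# (add2, mul3) and (mul3, add2) are NOT equivalent: (5+2)*3 = 21
# but 5*3+2 = 17 — generalizing here requires a genuinely novel
# end-to-end function.
assert compose(add2, mul3)(5) == 21
assert compose(mul3, add2)(5) == 17
```

Under this framing, a benchmark that eliminates equivalences keeps only held-out sequences of the second kind, whose end-to-end behavior cannot be matched by any training-set sequence.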
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 21717