Keywords: compositional generalization, variational Bayesian methods, meta-learning, abstract reasoning, nonparametric representations
TL;DR: We propose test-time gradient search over nonparametric variational Bayesian latent representations to improve compositional generalization in abstract reasoning tasks.
Abstract: Many state-of-the-art methods in deep learning fail at systematic reasoning in settings that require compositional generalization. To address this, we propose a novel architecture that combines a nonparametric latent space, information-theoretic regularization of this space, and test-time gradient-based search to achieve strong performance on compositional meta-learning tasks such as program induction, Raven's Progressive Matrices, and linguistic systematicity tasks. Our proposed architecture, Abduction Transformer, uses nonparametric mixture distributions to represent inferred hidden causes of few-shot meta-learning instances. These representations are refined at test time via gradient descent to better account for the observed few-shot examples, a form of variational posterior inference that allows Abduction Transformer to solve meta-learning tasks requiring novel recombinations of knowledge acquired during training. Our method outperforms standard transformer architectures and a contemporary test-time adaptive variational approach, indicating a promising new direction for neural networks capable of systematic generalization.
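As a rough illustration of the test-time refinement the abstract describes, the sketch below performs gradient search over the parameters of a mixture latent so that the decoded predictions better explain a few-shot support set. This is a minimal, hypothetical PyTorch sketch: the `decoder` interface, tensor shapes, reconstruction loss, and the entropy-based regularizer (standing in for the paper's information-theoretic regularization) are all assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def refine_latents(decoder, support_x, support_y, mix_logits, components,
                   steps=50, lr=1e-2, beta=1e-3):
    """Hypothetical test-time gradient search over a mixture latent.

    mix_logits: (K,) unnormalized mixture weights over K components.
    components: (K, D) component parameters in a D-dim latent space.
    decoder:    callable mapping (inputs, latent) -> predictions (assumed).
    """
    # Treat the variational parameters as free variables to optimize at test time.
    mix_logits = mix_logits.clone().requires_grad_(True)
    components = components.clone().requires_grad_(True)
    opt = torch.optim.Adam([mix_logits, components], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        weights = torch.softmax(mix_logits, dim=-1)            # mixture weights
        z = (weights.unsqueeze(-1) * components).sum(dim=-2)   # expected latent, (D,)
        pred = decoder(support_x, z)                           # decode conditioned on z
        recon = F.mse_loss(pred, support_y)                    # fit the support set
        # Illustrative information-theoretic regularizer: an entropy bonus
        # discourages the mixture from collapsing onto a single component.
        entropy = -(weights * torch.log(weights + 1e-8)).sum()
        loss = recon - beta * entropy
        loss.backward()
        opt.step()

    return mix_logits.detach(), components.detach()
```

After refinement, the updated latent would be decoded against held-out query examples; the reconstruction term here is a placeholder and would depend on the task's likelihood (e.g., cross-entropy for discrete outputs).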
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 11600