Keywords: compositional generalization, variational Bayesian methods, meta-learning, abstract reasoning, nonparametric representations
TL;DR: We propose test-time gradient search over nonparametric variational Bayesian latent representations to improve compositional generalization in abstract reasoning tasks.
Abstract: Many state-of-the-art methods in deep learning fail at systematic reasoning in settings that require compositional generalization. To address this, we propose a novel architecture that combines a nonparametric latent space, information-theoretic regularization of this space, and test-time gradient-based search to achieve strong performance on compositional meta-learning tasks such as program induction, Raven's Progressive Matrices, and linguistic systematicity tasks. Our proposed architecture, Abduction Transformer, uses nonparametric mixture distributions to represent inferred hidden causes of few-shot meta-learning instances. These representations are refined at test time via gradient descent to better account for the observed few-shot examples, a form of variational posterior inference that allows Abduction Transformer to solve meta-learning tasks requiring novel recombinations of knowledge acquired during training. Our method outperforms standard transformer architectures and a contemporary test-time adaptive variational approach, indicating a promising new direction for neural networks capable of systematic generalization.
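As a rough illustration of the test-time refinement the abstract describes, the sketch below performs gradient search over the parameters of a mixture latent so that the decoded predictions better explain a few-shot support set. This is a minimal, hypothetical PyTorch sketch: the `decoder` interface, tensor shapes, reconstruction loss, and the entropy-based regularizer (standing in for the paper's information-theoretic regularization) are all assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def refine_latents(decoder, support_x, support_y, mix_logits, components,
                   steps=50, lr=1e-2, beta=1e-3):
    """Hypothetical test-time gradient search over a mixture latent.

    mix_logits: (K,) unnormalized mixture weights over K components.
    components: (K, D) component parameters in a D-dim latent space.
    decoder:    callable mapping (inputs, latent) -> predictions (assumed).
    """
    # Treat the variational parameters as free variables to optimize at test time.
    mix_logits = mix_logits.clone().requires_grad_(True)
    components = components.clone().requires_grad_(True)
    opt = torch.optim.Adam([mix_logits, components], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        weights = torch.softmax(mix_logits, dim=-1)            # mixture weights
        z = (weights.unsqueeze(-1) * components).sum(dim=-2)   # expected latent, (D,)
        pred = decoder(support_x, z)                           # decode conditioned on z
        recon = F.mse_loss(pred, support_y)                    # fit the support set
        # Illustrative information-theoretic regularizer: an entropy bonus
        # discourages the mixture from collapsing onto a single component.
        entropy = -(weights * torch.log(weights + 1e-8)).sum()
        loss = recon - beta * entropy
        loss.backward()
        opt.step()

    return mix_logits.detach(), components.detach()
```

After refinement, the updated latent would be decoded against held-out query examples; the reconstruction term here is a placeholder and would depend on the task's likelihood (e.g., cross-entropy for discrete outputs).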
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 11600