Towards Understanding the Relationship between In-context Learning and Compositional Generalization

Anonymous

16 Aug 2023 · ACL ARR 2023 August Blind Submission · Readers: Everyone
Abstract: According to the principle of compositional generalization, the meaning of a complex expression can be understood as a function of the meanings of its parts and of how they are combined. This principle is crucial for human language processing and also, arguably, for NLP models in the face of out-of-distribution data. However, many neural network models, including Transformers, have been shown to struggle with compositional generalization. In this paper, we hypothesize that learning to in-context learn can provide the right inductive bias to promote compositional generalization. We do this by implementing a meta-learning approach that teaches a causal Transformer to utilize earlier examples to generalize to later ones: we construct a task distribution from different orderings of the training dataset, possibly with shuffled labels, which corresponds to training the model on all possible few-shot learning problems attainable from the dataset. At evaluation, we retain the zero-shot prediction setting by providing randomly sampled training examples for the model to learn from in context. Experiments on the SCAN and COGS datasets show that our method improves compositional generalization, indicating the usefulness of in-context learning problems as an inductive bias for generalization.
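The abstract describes constructing a task distribution by sampling different orderings of the training data and optionally shuffling the labels, so that the model must rely on in-context examples. The following is a minimal sketch of how such an episode might be constructed; it is not the authors' code, and all names (`make_episode`, `SHUFFLE_LABELS`, the prompt format) are hypothetical, assuming the training data is a list of (source, target) string pairs and a causal language model is trained to predict each target given the preceding examples as context.

```python
import random

SHUFFLE_LABELS = True   # optionally remap targets so the model must rely on context
CONTEXT_SIZE = 8        # number of support examples preceding the query

def make_episode(train_pairs, context_size=CONTEXT_SIZE, shuffle_labels=SHUFFLE_LABELS):
    """Sample one few-shot episode: a random ordering of training examples,
    optionally with a permuted assignment of targets."""
    sample = random.sample(train_pairs, context_size + 1)

    if shuffle_labels:
        # Permute the targets so the input-output mapping can only be recovered
        # from the in-context examples, not from memorized training pairs.
        targets = [tgt for _, tgt in sample]
        permuted = random.sample(targets, len(targets))
        sample = [(src, new_tgt) for (src, _), new_tgt in zip(sample, permuted)]

    *support, query = sample
    # The episode is flattened into one sequence; a causal Transformer is trained
    # to predict each target from the examples that precede it.
    context = " ".join(f"IN: {src} OUT: {tgt}" for src, tgt in support)
    return f"{context} IN: {query[0]} OUT:", query[1]

if __name__ == "__main__":
    toy_data = [("jump", "JUMP"), ("run", "RUN"), ("walk twice", "WALK WALK"),
                ("jump twice", "JUMP JUMP"), ("run twice", "RUN RUN"),
                ("look", "LOOK"), ("walk", "WALK"), ("look twice", "LOOK LOOK"),
                ("jump thrice", "JUMP JUMP JUMP")]
    prompt, target = make_episode(toy_data)
    print(prompt, "->", target)
```

Because every ordering (and relabeling) of the same training set yields a distinct episode, the dataset induces a large space of few-shot problems without any extra data, which is the sense in which the model is trained on "all possible few-shot learning problems attainable from the dataset."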
Paper Type: long
Research Area: Machine Learning for NLP
Languages Studied: English