Does Entity Abstraction Help Generative Transformers Reason?

Published: 28 Jan 2022, Last Modified: 22 Oct 2023 · ICLR 2022 Submission
Keywords: Transformers, reasoning, compositional generalization, entity type, abstraction
Abstract: Pre-trained language models (LMs) often struggle to reason logically or generalize in a compositional fashion. Recent work suggests that incorporating external entity knowledge can improve language models' abilities to reason and generalize. However, the effect of explicitly providing entity abstraction remains unclear, especially since recent studies suggest that pre-trained models already encode some of that knowledge in their parameters. In this work, we study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on three NLP tasks requiring different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) multi-hop question answering (HotpotQA), and (3) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge perform slightly better than those without it. However, our experiments also show that the benefits strongly depend on the technique used and the task at hand. On CLUTRR, the best abstraction-aware model achieves an overall accuracy of 88.8%, compared to 62.3% for the baseline, and abstraction-aware models show improved compositional generalization in both interpolation and extrapolation settings. For HotpotQA and CoQA, however, F1 scores improve by only 0.5% on average. Our results suggest that explicit abstraction can be highly beneficial in formally defined logical reasoning settings such as CLUTRR, but is likely less beneficial for NLP tasks with less formal logical structure.
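As a rough illustration of method (i), the sketch below adds a learned entity-type embedding to each token embedding before the sequence enters a Transformer encoder. This is a minimal, hypothetical reconstruction from the abstract alone: the names (`EntityAwareEmbedding`, `num_entity_types`, `entity_type_ids`) and the convention that type index 0 means "not an entity" are assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class EntityAwareEmbedding(nn.Module):
    """Hypothetical sketch of method (i): entity types as additional input embeddings."""

    def __init__(self, vocab_size: int, num_entity_types: int, hidden_dim: int):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, hidden_dim)
        # One embedding per entity type (e.g., PERSON, LOCATION);
        # index 0 is reserved for tokens that are not part of any entity.
        self.type_embed = nn.Embedding(num_entity_types, hidden_dim, padding_idx=0)

    def forward(self, token_ids: torch.Tensor, entity_type_ids: torch.Tensor) -> torch.Tensor:
        # Both inputs have shape (batch, seq_len); the elementwise sum is what
        # would be fed to the Transformer encoder in place of plain token embeddings.
        return self.token_embed(token_ids) + self.type_embed(entity_type_ids)

# Example: two 4-token sequences with per-token entity type labels.
embed = EntityAwareEmbedding(vocab_size=30522, num_entity_types=10, hidden_dim=768)
tokens = torch.randint(0, 30522, (2, 4))
types = torch.tensor([[0, 3, 3, 0], [1, 0, 0, 2]])  # 0 = not an entity
hidden = embed(tokens, types)  # shape: (2, 4, 768)
```

Methods (ii) and (iii) would differ mainly in where the type information enters: as a separately encoded sequence attended to by the model, or as a prediction head trained with an auxiliary loss.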
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:2201.01787/code)