Adaptive Pre-training of Language Models for Better Logical Reasoning

Soumya Sanyal; Yichong Xu; Shuohang Wang; Ziyi Yang; Reid Pryzant; Wenhao Yu; Chenguang Zhu; Xiang Ren

Published: 21 Oct 2022, Last Modified: 05 May 2023; NeurIPS 2022 Workshop DistShift Poster; Readers: Everyone

Keywords: logical reasoning, language modeling, continued pretraining

TL;DR: We develop a logical keyword-based data filtering technique, along with two self-supervised loss functions, to continue pretraining LMs for logical reasoning.

Abstract: Logical reasoning over text is an important ability that requires understanding the logical information present in the text and reasoning through it to infer new conclusions. Prior work on improving the logical reasoning ability of language models requires complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose AERIE, an adaptively pre-trained language model with improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss that masks only words with specific parts of speech, which likely require more reasoning than basic language understanding to predict, and a sentence classification loss that teaches the model to distinguish between entailment and contradiction types of sentences. The proposed training paradigm is both simple and generalizable across tasks. We demonstrate the effectiveness of AERIE by comparing it with prior baselines on two logical reasoning datasets. AERIE performs comparably on ReClor and outperforms baselines on LogiQA.
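The data selection and selective masking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword sets, the helper names (`label_sentence`, `filter_corpus`, `selective_mask`), the 0.15 masking probability, and the `is_maskable` predicate are all assumptions made for illustration. In particular, the abstract does not list the actual keywords or the parts of speech that are masked, so a caller-supplied predicate stands in for a real part-of-speech tagger here.

```python
import random
import re

# Hypothetical keyword sets; the paper's actual lists are not given in the abstract.
ENTAILMENT_KEYWORDS = {"therefore", "because", "hence", "thus", "consequently"}
CONTRADICTION_KEYWORDS = {"however", "but", "although"}


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"[A-Za-z']+|[^\sA-Za-z']", sentence)


def label_sentence(sentence):
    """Return 'entailment', 'contradiction', or None if no keyword is present.

    If both kinds of keyword occur, this sketch arbitrarily prefers 'entailment'.
    """
    words = {tok.lower() for tok in tokenize(sentence)}
    if words & ENTAILMENT_KEYWORDS:
        return "entailment"
    if words & CONTRADICTION_KEYWORDS:
        return "contradiction"
    return None


def filter_corpus(sentences):
    """Keep only sentences containing a logical keyword, paired with their label."""
    kept = []
    for sentence in sentences:
        label = label_sentence(sentence)
        if label is not None:
            kept.append((sentence, label))
    return kept


def selective_mask(tokens, is_maskable, mask_prob=0.15, rng=None, mask_token="[MASK]"):
    """Mask only tokens accepted by `is_maskable`, each with probability `mask_prob`.

    `is_maskable` is a stand-in for a part-of-speech check (an assumption of
    this sketch). Returns the masked tokens and per-position prediction
    targets, where None marks a position that is not predicted.
    """
    rng = rng or random.Random(0)
    masked, targets = [], []
    for tok in tokens:
        if is_maskable(tok) and rng.random() < mask_prob:
            masked.append(mask_token)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets


if __name__ == "__main__":
    corpus = [
        "It rained, therefore the match was cancelled.",
        "The sky is blue.",
        "He trained hard, but he lost.",
    ]
    for sentence, label in filter_corpus(corpus):
        tokens = tokenize(sentence)
        # Toy predicate: treat the logical keywords themselves as maskable.
        maskable = lambda t: t.lower() in ENTAILMENT_KEYWORDS | CONTRADICTION_KEYWORDS
        print(label, selective_mask(tokens, maskable, mask_prob=1.0))
```

In this sketch the labels produced by `filter_corpus` could serve as targets for the sentence classification loss, while the `targets` returned by `selective_mask` would be the positions scored by the modified masked language modeling loss. A real pipeline would replace the toy predicate with tags from a part-of-speech tagger and operate on subword tokens.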
