CoSe-Co: Sentence Conditioned Generative CommonSense Contextualizer for Language Models

Rachit Bansal; Milan Aggarwal; Sumit Bhatia; Jivat Neet Kaur; Balaji Krishnamurthy

Published: 18 Sept 2021, Last Modified: 05 May 2023, CSKB, Readers: Everyone

Keywords: Language Model, Commonsense, Knowledge Graph, Sentence-to-Path, Novel Concepts/Phrases, Task Agnostic, Robust, Scalable, Generalizable

TL;DR: A sentence-conditioned, LM-based generative commonsense contextualizer trained to generate commonsense paths given a sentence as input.

Abstract: Pre-trained Language Models (PTLMs) have been shown to perform well on natural language reasoning tasks requiring commonsense. Prior work has leveraged structured commonsense present in knowledge graphs (KGs) to assist PTLMs. Some of these methods use KGs as separate static modules, which limits knowledge coverage since KGs are finite, sparse, and noisy. Other methods have attempted to obtain generalized and scalable commonsense by training PTLMs on KGs. Since these models are trained on symbolic KG phrases, applying them to natural language text during inference leads to input distribution shift. To this end, we propose a task-agnostic, sentence-conditioned generative CommonSense Contextualizer (CoSe-Co), which is trained to generate contextually relevant commonsense inferences given a natural language input. We devise a method to create semantically related sentence-commonsense pairs to train CoSe-Co. We observe that commonsense inferences generated by CoSe-Co contain novel concepts that are relevant to the entire sentence context. We evaluate CoSe-Co on multi-choice QA and open-ended commonsense reasoning tasks on the CSQA, ARC, QASC, and OBQA datasets. CoSe-Co outperforms state-of-the-art methods in both settings while being task-agnostic, and performs especially well in low-data regimes, showing that it is more robust and generalizes better.
