CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning

14 Feb 2020 (modified: 10 May 2020) | AKBC 2020 Conference Blind Submission | Readers: Everyone
  • Abstract: Given a set of common concepts like {apple (noun), pick (verb), tree (noun)}, humans find it easy to write a sentence describing a grammatical and logically coherent scenario that covers these concepts, for example, "a boy picks an apple from a tree". The process of generating these sentences requires humans to use commonsense knowledge. We denote this ability as generative commonsense reasoning. Recent work in commonsense reasoning has focused mainly on discriminating the most plausible scenes from distractors via natural language understanding (NLU) settings such as multi-choice question answering. However, generative commonsense reasoning is a relatively unexplored research area, primarily due to the lack of a specialized benchmark dataset. In this paper, we present a constrained natural language generation (NLG) dataset, named CommonGen, to explicitly challenge machines in generative commonsense reasoning. It consists of 30k concept-sets with human-written sentences as references. Crowd-workers were also asked to write the rationales (i.e. the commonsense facts) used for generating the sentences in the development and test sets. We conduct experiments on a variety of generation models with both automatic and human evaluation. Experimental results show that there is still a large gap between the current state-of-the-art pre-trained model, UniLM, and human performance.
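The constraint described in the abstract, that a generated sentence must cover every concept in the input set, can be illustrated with a rough coverage check. This is only a minimal sketch and not the paper's evaluation method: the helper names `naive_stem` and `covered_concepts` are hypothetical, and the crude suffix stripping stands in for a real lemmatizer. It ignores part-of-speech tags and says nothing about whether the sentence is grammatical or plausible.

```python
import re


def naive_stem(word: str) -> str:
    """Crude stand-in for a lemmatizer (assumption): strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words such as "is" stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def covered_concepts(concepts, sentence):
    """Return the subset of concepts whose stem appears among the sentence's word stems."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    stems = {naive_stem(token) for token in tokens}
    return {concept for concept in concepts if naive_stem(concept) in stems}


# Concept set and sentence taken from the abstract's example.
concepts = {"apple", "pick", "tree"}
sentence = "A boy picks an apple from a tree."

covered = covered_concepts(concepts, sentence)
coverage = len(covered) / len(concepts)
print(sorted(covered), coverage)
```

A sentence that mentions only some of the concepts gets a coverage below 1. Full coverage is a necessary condition for the task as described, but not a sufficient one, since the scenario must also be coherent.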