Pre-training Text-to-Text Transformers for Concept-centric Common Sense

Wangchunshu Zhou; Dong-Ho Lee; Ravi Kiran Selvam; Seyeon Lee; Xiang Ren

Published: 12 Jan 2021, Last Modified: 03 Apr 2024, ICLR 2021 Poster, Readers: Everyone

Keywords: Language Model Pre-training, Commonsense Reasoning, Self-supervised Learning

Abstract: Pretrained language models (PTLMs) have achieved impressive results in a range of natural language understanding (NLU) and generation (NLG) tasks that require a syntactic and semantic understanding of the text. However, current pre-training objectives such as masked token prediction (for BERT-style PTLMs) and masked span infilling (for T5-style PTLMs) do not explicitly model the relational and compositional commonsense knowledge about everyday concepts, which is crucial to many downstream tasks requiring commonsense reasoning. To augment PTLMs with common sense, we propose generative and contrastive objectives as intermediate self-supervised pre-training tasks between general pre-training and downstream task-specific fine-tuning. We also propose a joint training framework that unifies the generative and contrastive objectives to make them more effective. Our proposed objectives can pack more commonsense knowledge into the parameters of a pre-trained text-to-text transformer without relying on external knowledge bases, yielding better performance on both NLU and NLG tasks. We apply our method to a pre-trained T5 model in an intermediate task transfer learning fashion to train a concept-aware language model (CALM) and experiment with five commonsense benchmarks (four NLU tasks and one NLG task). Experimental results show that CALM outperforms baseline methods by a consistent margin.
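
For readers unfamiliar with the masked span infilling objective that the abstract contrasts against, the following is a minimal sketch of how a T5-style input/target pair can be built. It illustrates only that baseline objective, not the generative or contrastive objectives proposed for CALM, whose details the abstract does not give. The function name `span_infill` and the hand-picked spans are assumptions made for illustration; real T5 pre-training samples spans randomly and works on subword tokens rather than whitespace-split words.

```python
from typing import List, Tuple


def span_infill(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Build a T5-style (input, target) pair by masking the given token spans.

    `spans` holds sorted, non-overlapping (start, end) index pairs, end exclusive.
    Hypothetical helper for illustration only.
    """
    source: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"        # T5 sentinel token naming convention
        source.extend(tokens[cursor:start])  # keep the unmasked text before the span
        source.append(sentinel)              # replace the whole span with one sentinel
        target.append(sentinel)              # the target repeats the sentinel...
        target.extend(tokens[start:end])     # ...followed by the tokens it hides
        cursor = end
    source.extend(tokens[cursor:])           # remaining unmasked text
    target.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return " ".join(source), " ".join(target)


tokens = "Thank you for inviting me to your party last week".split()
src, tgt = span_infill(tokens, [(2, 4), (8, 9)])
print(src)  # Thank you <extra_id_0> me to your party <extra_id_1> week
print(tgt)  # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

Because the masked spans here are arbitrary, nothing in this objective pushes the model to relate everyday concepts to one another, which is the gap the abstract says the proposed intermediate objectives are meant to address.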

One-sentence Summary: We propose self-supervised objectives and a joint training framework to augment pre-trained language models with common sense without relying on external knowledge bases.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Data: [C4](https://paperswithcode.com/dataset/c4), [CommonGen](https://paperswithcode.com/dataset/commongen), [CommonsenseQA](https://paperswithcode.com/dataset/commonsenseqa), [OpenBookQA](https://paperswithcode.com/dataset/openbookqa), [PIQA](https://paperswithcode.com/dataset/piqa)
