Conditional set generation using Seq2seq models

Published: 28 Jan 2022, Last Modified: 13 Feb 2023
ICLR 2022 Submitted
Readers: Everyone
Keywords: natural language processing, nlp, language generation
Abstract: Conditional set generation learns a mapping from an input sequence of tokens to a set. Several popular natural language processing (NLP) tasks, such as entity typing and dialogue emotion tagging, are instances of set generation. Sequence-to-sequence models are a popular choice for set generation, but the typical approach of treating a set as a sequence does not fully leverage its key properties, namely order-invariance and cardinality. We propose a novel data augmentation approach that recovers informative orders for labels using their dependence information. Further, we jointly model the set cardinality and the output by listing the set size as the first element, taking advantage of the autoregressive factorization used by seq2seq models. Our experiments in simulated settings and on three diverse NLP datasets show that our method improves over strong seq2seq baselines by about 9% absolute F1 score. We will release all code and data upon acceptance.
One-sentence Summary: We propose a method for generating sets using sequence generation models by converting a set to a sequence in informative orders, and jointly modeling cardinality.
Supplementary Material: zip
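To make the target-sequence construction described in the abstract concrete, here is a minimal sketch of turning a label set into a seq2seq target with the cardinality as the first element. The separator format, the LABEL_FREQ table, and the frequency-based ordering are illustrative assumptions; the paper recovers informative orders from label dependence information rather than raw frequency.

```python
from typing import Iterable

# Hypothetical corpus-level label frequencies; stands in for the paper's
# dependence-based "informative order" over labels.
LABEL_FREQ = {"person": 0.4, "organization": 0.3, "location": 0.2}

def set_to_target_sequence(labels: Iterable[str], sep: str = " | ") -> str:
    """Convert a label set into a seq2seq target string.

    The set size is emitted as the first element so that cardinality is
    modeled jointly with the labels under the decoder's autoregressive
    factorization, as the abstract describes.
    """
    ordered = sorted(set(labels), key=lambda label: -LABEL_FREQ.get(label, 0.0))
    return sep.join([str(len(ordered))] + ordered)

# Example target the decoder would be trained to generate:
print(set_to_target_sequence({"location", "person"}))  # "2 | person | location"
```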