Why Do Neural Response Generation Models Prefer Universal Replies?

27 Sept 2018 (modified: 05 May 2023) · ICLR 2019 Conference Blind Submission
Abstract: Recent advances in neural Sequence-to-Sequence (Seq2Seq) models offer a purely data-driven approach to the response generation task. Despite their diverse variants and applications, existing Seq2Seq models are prone to producing short and generic replies, which prevents such neural architectures from being used in practical open-domain response generation tasks. In this research, we analyze this critical issue from the perspective of the models' optimization goal and the characteristics of human-to-human conversational corpora. Our analysis decomposes the objective of Neural Response Generation (NRG) into the separate optimizations of word selection and word ordering. From this decomposition it follows that Seq2Seq-based NRG models naturally tend to select common words when composing responses and to ignore the semantics of the query when ordering words. On the basis of this analysis, we propose a max-marginal ranking regularization term that discourages Seq2Seq models from producing generic and uninformative responses. Empirical experiments on benchmark datasets, evaluated with several metrics, validate our analysis and the proposed method.
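To make the abstract's proposal concrete, here is a minimal sketch of a max-margin ranking regularizer layered on the standard Seq2Seq negative log-likelihood, assuming a PyTorch-style model whose call `model(query, reply)` returns per-token logits for the reply conditioned on the query. This interface, the function name, and the hyperparameters are all hypothetical illustrations, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def ranking_regularized_loss(model, query, true_reply, generic_reply,
                             margin=1.0, alpha=0.1):
    """Seq2Seq NLL plus a max-margin ranking term that pushes the
    ground-truth reply to score higher than a generic ("universal") reply.

    `model(query, reply)` is assumed to return logits of shape
    (reply_len, vocab_size) -- a hypothetical interface.
    """
    def sequence_log_prob(reply):
        logits = model(query, reply)                 # (T, vocab_size)
        log_probs = F.log_softmax(logits, dim=-1)
        # Sum the log-probability assigned to each token of the reply.
        return log_probs.gather(-1, reply.unsqueeze(-1)).sum()

    true_score = sequence_log_prob(true_reply)
    generic_score = sequence_log_prob(generic_reply)

    nll = -true_score                                # standard MLE objective
    # Hinge penalty: nonzero whenever the generic reply is not at least
    # `margin` worse than the ground-truth reply under the model.
    ranking = torch.clamp(margin - (true_score - generic_score), min=0.0)
    return nll + alpha * ranking
```

In practice, `generic_reply` would be drawn from a pool of frequent universal responses (e.g., "I don't know"), so the regularizer explicitly penalizes the model's preference for them while the NLL term fits the data as usual.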
Keywords: Neural Response Generation, Universal Replies, Optimization Goal Analysis, Max-Marginal Ranking Regularization
TL;DR: We analyze why neural response generation models prefer universal replies and propose a method to avoid this.
