On the Effectiveness of Offline RL for Dialogue Response Generation

Published: 01 Jan 2023, Last Modified: 05 Feb 2024 | ICML 2023 | Readers: Everyone
Abstract: A common training technique for language models is teacher forcing (TF). TF attempts to match human language exactly, even though identical meanings can be expressed in different ways. This motivat...