Abstract: Sentences generated by neural language models (LMs) often suffer from coherence errors: they describe events and situations that are inconsistent with the state of the world established by preceding text. We show that coherence errors can arise at multiple stages of LM computation, and we describe a procedure for distinguishing errors in inferring world state from errors in generating sentences from that state. In models that make correctable errors of the first type, we show that targeted supervision can address them. We introduce two procedures that use explicit representations of world state as auxiliary supervision. These procedures efficiently improve LM coherence, in some cases providing the benefits of 1,000-9,000 training examples with only 500 state annotations.
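As a rough illustration only: the sketch below shows one generic way to use state annotations as auxiliary supervision, adding a state-prediction loss on top of the standard language-modeling loss. It is not a reproduction of the paper's two procedures; the HuggingFace-style LM interface, the `state_head` classifier, and the `lambda_state` weight are all illustrative assumptions.

```python
# Minimal sketch of auxiliary state supervision (illustrative, not the paper's method).
# Assumptions: a HuggingFace-style causal LM (returns .loss and .hidden_states),
# one discrete world-state label per annotated example, hypothetical names throughout.
import torch.nn as nn
import torch.nn.functional as F


class LMWithStateSupervision(nn.Module):
    def __init__(self, base_lm: nn.Module, hidden_size: int, num_state_labels: int):
        super().__init__()
        self.base_lm = base_lm
        # Auxiliary classifier that predicts the annotated world state
        # from the LM's final hidden representation.
        self.state_head = nn.Linear(hidden_size, num_state_labels)

    def forward(self, input_ids, labels, state_labels=None, lambda_state=0.5):
        out = self.base_lm(
            input_ids=input_ids, labels=labels, output_hidden_states=True
        )
        loss = out.loss  # ordinary next-token (language modeling) loss
        if state_labels is not None:
            # Only the (small) state-annotated subset contributes the auxiliary term.
            last_hidden = out.hidden_states[-1][:, -1]  # final layer, final position
            state_logits = self.state_head(last_hidden)
            loss = loss + lambda_state * F.cross_entropy(state_logits, state_labels)
        return loss
```

Under these assumptions, batches without state annotations train the LM as usual, while the annotated subset additionally pushes the model's hidden states to encode the described world state.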
Paper Type: short