Reinforced-Decoding: Enhancing Reinforcement Learning with Prefixes for Decoding-Time Controllable Text Generation
Abstract: Attribute-based Controllable Text Generation (CTG) aims to generate texts that contain desirable attributes. Previous work has demonstrated remarkable language generation capabilities, yet it often suffers performance degradation as the length of generations increases.
To tackle this challenge, we propose Reinforced-Decoding, a novel lightweight decoding framework for CTG, whose main idea is to strategically enhance the controllability of prefixes over target attributes so as to construct better attribute distributions. Specifically, we train prefixes via prefix-tuning to obtain the next-token distributions of class-conditional language models (CC-LMs).
We then leverage a reinforcement learning approach to learn an optimal policy that decides whether to insert prefixes, enhancing their influence on the CC-LMs' next-token distributions, and we reconstruct the attribute distribution at each time step to guide the LM toward texts with the desired attributes, effectively mitigating the performance degradation that arises as generation length increases.
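The per-step mechanism described above can be illustrated with a minimal sketch: at each decoding step, a policy decision gates whether the prefix-conditioned CC-LM distribution is blended into the base LM's next-token distribution. The function name, the binary `use_prefix` decision, and the linear blending weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def reinforced_decode_step(base_probs, prefix_probs, use_prefix, alpha=0.5):
    """One decoding step of the sketched scheme.

    base_probs   : base LM next-token distribution (sums to 1).
    prefix_probs : prefix-conditioned CC-LM next-token distribution.
    use_prefix   : the RL policy's decision for this step
                   (True = strengthen attribute control here).
    alpha        : hypothetical blending weight; the actual
                   reconstruction rule may differ.
    """
    if use_prefix:
        # Shift probability mass toward the attribute-conditioned
        # distribution, then renormalize to a valid distribution.
        mixed = (1.0 - alpha) * base_probs + alpha * prefix_probs
        return mixed / mixed.sum()
    # Policy chose not to intervene: decode from the base LM as-is.
    return base_probs

# Toy example over a 4-token vocabulary: the prefix favors token 3.
base = np.array([0.4, 0.3, 0.2, 0.1])
cc   = np.array([0.1, 0.1, 0.2, 0.6])
out  = reinforced_decode_step(base, cc, use_prefix=True)
```

Because the policy intervenes only at selected steps, the base LM's fluency is left untouched elsewhere, which is one plausible reading of how the method maintains quality at longer generation lengths.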
Extensive experiments on a range of CTG tasks demonstrate that Reinforced-Decoding outperforms existing strong baselines, with improvements of 1%-4% in accuracy (Acc), and maintains high fluency across a wide range of length settings.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: controllable text generation, reinforcement learning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 8096