Enhancing Controllable Generation with Improved Control Module Representations

ACL ARR 2024 June Submission4517 Authors

16 Jun 2024 (modified: 05 Jul 2024) · ACL ARR 2024 June Submission · License: CC BY 4.0
Abstract: Controllable generation (CG) is widely used in large language models (LLMs) for a range of language tasks, such as multi-task learning and human preference alignment. For example, prompt-based CG uses curated prompts (such as system prompts) as inputs to control LLM behavior. Fine-tuning-based CG is widely adopted when training data is available; it trains control modules (e.g., trainable prompts or LoRA weights) and controls LLM behavior by plugging these modules into the LLM. Fine-tuning-based CG can either freeze the LLM and train only the control modules for efficiency, or train the LLM together with the control modules for effectiveness. We argue that directly fine-tuning control modules together with the LLM is not the optimal optimization strategy, since the modules' representations are typically initialized with no relation to the LLM's representation space, which makes optimization harder. A better strategy is to first align the control modules with the LLM's representation space and then optimize them jointly. To this end, we propose a simple yet effective Two-step Freezing-then-Tuning (TFT) framework that achieves better optimization results for fine-tuning-based CG. Concretely, we first freeze the LLM and optimize only the control modules to align their representations with the LLM, and then optimize the control modules together with the LLM to ensure performance. Experimental results on two popular human preference alignment datasets and one multi-task learning dataset show that our approach significantly improves controllable generation quality compared with the one-step optimization widely used in related work, and achieves better or on-par performance compared with other baselines, such as direct preference optimization.
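To make the two-step schedule concrete, the sketch below shows one way the freezing-then-tuning idea could be implemented in plain PyTorch. This is an illustrative assumption based only on the abstract, not the authors' actual code: the names `base_lm`, `control_module`, `loss_fn`, `align_steps`, and `joint_steps` are hypothetical placeholders.

```python
# Hypothetical sketch of a two-step Freezing-then-Tuning (TFT) schedule:
# step 1 freezes the LLM and trains only the control module (alignment),
# step 2 unfreezes the LLM and trains both jointly (performance).
from itertools import cycle
import torch


def set_requires_grad(module: torch.nn.Module, flag: bool) -> None:
    # Enable or disable gradient updates for all parameters of a module.
    for p in module.parameters():
        p.requires_grad_(flag)


def tft_train(base_lm, control_module, dataloader, loss_fn,
              align_steps=1000, joint_steps=5000, lr=1e-4):
    data_iter = cycle(dataloader)

    # Step 1: freeze the LLM; optimize only the control module so its
    # representations align with the frozen model's representation space.
    set_requires_grad(base_lm, False)
    set_requires_grad(control_module, True)
    opt = torch.optim.AdamW(control_module.parameters(), lr=lr)
    for _ in range(align_steps):
        loss = loss_fn(base_lm, control_module, next(data_iter))
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Step 2: unfreeze the LLM; optimize the control module and the LLM jointly.
    set_requires_grad(base_lm, True)
    opt = torch.optim.AdamW(
        list(base_lm.parameters()) + list(control_module.parameters()), lr=lr
    )
    for _ in range(joint_steps):
        loss = loss_fn(base_lm, control_module, next(data_iter))
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The key design choice, as described in the abstract, is simply the ordering: the control module is adapted to the fixed LLM before any of the LLM's own weights are updated, in contrast to the common one-step setup where both are optimized from the start.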
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Language Models, Controllable Generation, Alignment, Multi-task Learning
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study
Languages Studied: English
Submission Number: 4517