FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation
Abstract: Controllable text generation (CTG) seeks to produce text that adheres to specific attributes, traditionally through learning-based techniques such as training, fine-tuning, or prefix-tuning on attribute-specific datasets. These approaches, while effective, demand extensive computational and data resources. Existing learning-free alternatives avoid this cost but often yield inferior results, reflecting the familiar machine learning trade-off between computational expense and model efficacy. To overcome these limitations, we propose FreeCtrl, a learning-free method that dynamically adjusts the weights of selected feedforward neural network (FFN) vectors to increase the likelihood of generating sentences containing keywords related to the desired attribute. Specifically, we first identify the key characteristics and challenges of using FFN layers for CTG, and then introduce a structured workflow to build and adaptively activate control centers constructed from FFN vectors, steering the language model's outputs toward the desired attributes. Extensive experiments on single- and multi-attribute control show that the learning-free FreeCtrl outperforms other learning-free and learning-based methods, resolving the dilemma between learning costs and model performance.
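The core idea in the abstract, raising the weights of selected FFN vectors so that attribute keywords become more likely, can be illustrated with a toy example. The following is a minimal sketch and not the authors' implementation: it assumes the common view of an FFN layer as a weighted sum of value vectors, uses random matrices in place of a real language model, and all names (`ffn_output`, `keyword_id`, `boost`, the top-4 selection rule) are hypothetical choices made for illustration.

```python
import numpy as np

def ffn_output(x, keys, values, extra_weight=None):
    """Toy FFN: a weighted sum of value vectors, weights = relu(keys @ x)."""
    weights = np.maximum(keys @ x, 0.0)      # one non-negative weight per value vector
    if extra_weight is not None:
        weights = weights + extra_weight     # assumed control mechanism: additive boost
    return weights @ values                  # shape (d_model,)

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 16, 64, 50            # toy sizes, not taken from the paper
keys = rng.normal(size=(d_ff, d_model))
values = rng.normal(size=(d_ff, d_model))
unembed = rng.normal(size=(vocab, d_model))  # stand-in for the output embedding
x = rng.normal(size=d_model)                 # stand-in for a hidden state
keyword_id = 7                               # hypothetical attribute keyword token

# Score each value vector by how strongly it promotes the keyword's logit.
keyword_scores = values @ unembed[keyword_id]
top = np.argsort(keyword_scores)[-4:]        # hypothetical rule: keep the 4 best
selected = top[keyword_scores[top] > 0]      # only vectors that push the logit up

# Raise the weights of the selected vectors (a toy "control center").
boost = 2.0
extra = np.zeros(d_ff)
extra[selected] = boost

base_logit = unembed[keyword_id] @ ffn_output(x, keys, values)
boosted_logit = unembed[keyword_id] @ ffn_output(x, keys, values, extra)
print(f"keyword logit: base={base_logit:.3f}, boosted={boosted_logit:.3f}")
```

Because the selected value vectors all have a positive projection onto the keyword's output embedding, adding weight to them raises the keyword's logit in this toy setting. How FreeCtrl actually selects vectors and adapts the weights during generation is described in the paper itself and is not reproduced here.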
Paper Type: long
Research Area: Generation
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Approaches to low-compute settings and efficiency
Languages Studied: English