Disentangled Learning with Synthetic Parallel Data for Text Style Transfer

Published: 01 Jan 2024 · Last Modified: 13 May 2025 · ACL 2024 · License: CC BY-SA 4.0
Abstract: Text style transfer (TST) is an important task in natural language generation, which aims to transfer the style of a text (e.g., its sentiment) while preserving its semantic content. Due to the absence of parallel datasets for supervision, most existing studies have been conducted in an unsupervised manner, where the generated sentences often diverge semantically from the input and thus preserve its meaning poorly. In this paper, we propose a novel disentanglement-based framework for TST named DisenTrans, where disentanglement means that we separate the attribute and content components in the natural language corpus and approach the task from these two perspectives. Concretely, we first create a disentangled Chain-of-Thought prompting procedure to synthesize parallel data and the corresponding attribute components for supervision. We then develop a disentanglement learning method over the synthetic data, with two losses designed to enhance the focus on attribute properties and to constrain the semantic space, benefiting style control and semantic preservation respectively. Guided by the disentanglement concept, our framework creates valuable supervised signal and exploits it effectively for TST. Extensive experiments on mainstream datasets show that our framework achieves strong performance with great sample efficiency.
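
The abstract does not spell out the prompting procedure, so the following is only a minimal sketch of what a disentangled Chain-of-Thought synthesis step could look like. The `call_llm` helper, the prompt template, and the output format are illustrative assumptions, not the paper's actual prompts.

```python
# Sketch of disentangled Chain-of-Thought prompting for synthesizing
# parallel TST data. `call_llm` is a hypothetical wrapper around any
# instruction-following LLM; the paper does not specify the model or API.

def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; plug in your own backend here."""
    raise NotImplementedError

COT_TEMPLATE = """\
Sentence: "{sentence}"
Step 1: List the words or phrases that express the {src_style} attribute.
Step 2: Keep all remaining (content) words unchanged.
Step 3: Rewrite the sentence to express the {tgt_style} attribute,
replacing only the words identified in Step 1.
Answer with:
Attribute words: ...
Rewritten sentence: ...
"""

def synthesize_parallel(sentence: str, src_style: str, tgt_style: str):
    """Return (attribute_words, rewritten_sentence) parsed from the LLM output."""
    output = call_llm(COT_TEMPLATE.format(
        sentence=sentence, src_style=src_style, tgt_style=tgt_style))
    # Assumes the model follows the requested two-line answer format.
    attr_line, sent_line = [l for l in output.splitlines() if ":" in l][:2]
    attribute_words = attr_line.split(":", 1)[1].strip()
    rewritten = sent_line.split(":", 1)[1].strip()
    return attribute_words, rewritten

# Example: sentiment transfer on a Yelp-style review.
# synthesize_parallel("the food was terrible and cold", "negative", "positive")
```

This yields both a synthetic parallel pair and the extracted attribute component, which is exactly the supervision the disentanglement learning stage consumes.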
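Likewise, the two losses are only named in the abstract, not formulated. The PyTorch sketch below shows one plausible instantiation under stated assumptions: an attribute-weighted cross-entropy (using the synthesized attribute-word positions) to sharpen style control, and a cosine similarity constraint between pooled sentence representations to preserve semantics. The weighting scheme and the pooled representations are guesses for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def disentangled_losses(logits, target_ids, attr_mask, src_repr, gen_repr,
                        attr_weight=2.0, sem_weight=1.0):
    """Illustrative combination of the two losses named in the abstract.

    logits:     (batch, seq_len, vocab) decoder outputs
    target_ids: (batch, seq_len) synthetic parallel targets
    attr_mask:  (batch, seq_len) 1.0 at attribute-word positions, else 0.0
    src_repr:   (batch, hidden) pooled encoding of the source content
    gen_repr:   (batch, hidden) pooled encoding of the generated sentence
    """
    # Per-token cross-entropy against the synthetic parallel target.
    ce = F.cross_entropy(logits.transpose(1, 2), target_ids, reduction="none")
    # Up-weight attribute-word positions to enhance focus on style properties.
    weights = 1.0 + (attr_weight - 1.0) * attr_mask
    attr_loss = (weights * ce).mean()
    # Constrain the semantic space: keep the generated sentence close to
    # the source content in representation space.
    sem_loss = 1.0 - F.cosine_similarity(src_repr, gen_repr, dim=-1).mean()
    return attr_loss + sem_weight * sem_loss
```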