Tiny-StyleWizard: Unleashing the Potential of Small Language Models in Complex Style Transfer

Submitted to ICLR 2024 on 22 Sept 2023 (modified: 11 Feb 2024)
Supplementary Material: pdf
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: complex style transfer, large language model, less is more, ungrokking, diversity
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: This paper introduces Tiny-StyleWizard, a framework for complex style transfer; experiments show that small models can produce stylized text comparable to ChatGPT's and reveal phenomena such as "less is more" and ungrokking.
Abstract: Text style transfer is a crucial task in natural language processing. While previous studies focused on simple styles like sentiment and formality, they overlooked the transfer of valuable complex styles. In this paper, we propose a framework named Tiny-StyleWizard to address this challenge. It first generates a specialized dataset retaining key aspects of the desired complex style based on diverse corpora and a large language model (LLM), and then fine-tunes a small language model to achieve complex style transfer. Additionally, a novel evaluation protocol is devised to rank the quality of the generated specialized dataset and to measure the performance of different models. Extensive experiments on two representative complex style transfer tasks reveal that small language models like BART-base/large can produce stylized text on par with ChatGPT, while even tinier ones like T5-mini (about 30M parameters) can surpass the state-of-the-art models. Intriguingly, our investigation into the efficient construction of the training corpus reveals a "less is more" phenomenon and a subsequent, related "ungrokking" observation, underscoring the paramount importance of data quality. Further exploration also demonstrates that the texts generated by our Tiny-StyleWizard framework are sufficiently diverse.
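The abstract describes a two-stage pipeline: an LLM distills a complex style into a specialized parallel dataset, and a small seq2seq model is then fine-tuned on it. Below is a minimal sketch of that pipeline using Hugging Face transformers; the toy data pair, model choice, sequence lengths, and hyperparameters are illustrative assumptions, not the authors' actual implementation.

```python
# A minimal sketch of the Tiny-StyleWizard two-stage pipeline (assumed
# details; not the paper's actual code or hyperparameters).
import torch
from transformers import (BartForConditionalGeneration, BartTokenizer,
                          Trainer, TrainingArguments)

# Stage 1 (assumed): an LLM such as ChatGPT rewrites sentences drawn from
# diverse corpora into the target complex style, producing (source, stylized)
# pairs. A single toy pair stands in for that generated dataset here.
pairs = [("The moon rose over the hill.",
          "O'er yonder hill the silver moon did climb.")]

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

class StyleDataset(torch.utils.data.Dataset):
    """Wraps (source, stylized) pairs for seq2seq fine-tuning."""
    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        enc = tokenizer(src, truncation=True, max_length=128,
                        padding="max_length", return_tensors="pt")
        lab = tokenizer(tgt, truncation=True, max_length=128,
                        padding="max_length", return_tensors="pt").input_ids.squeeze(0)
        # Mask padding so it is ignored by the cross-entropy loss.
        lab[lab == tokenizer.pad_token_id] = -100
        return {"input_ids": enc.input_ids.squeeze(0),
                "attention_mask": enc.attention_mask.squeeze(0),
                "labels": lab}

# Stage 2: fine-tune the small model on the specialized dataset.
args = TrainingArguments(output_dir="tiny-stylewizard",
                         per_device_train_batch_size=8,
                         num_train_epochs=3)
Trainer(model=model, args=args,
        train_dataset=StyleDataset(pairs)).train()
```

The same recipe would apply to even smaller backbones (e.g., T5-mini, which the abstract credits with surpassing prior state of the art), swapping in the corresponding tokenizer and model classes.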
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5406