Few-Shot Paraphrase Generation with LLMs: An Empirical Study of Models and Hyperparameters

ICLR 2026 Conference Submission 19083 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: paraphrase, LLM
Abstract: The rapid progress of Large Language Models (LLMs) has made them widely used for data augmentation, notably through paraphrase generation. For this task, paraphrases are expected to preserve the original meaning while exhibiting lexical diversity. In this work, we conduct an empirical study of various off-the-shelf LLMs for paraphrase generation. We examine different prompting and decoding strategies, and compare systems on their ability to follow predefined templates, retain semantic fidelity, and produce lexical variation. Our results show that LLMs are generally effective at generating paraphrases; however, guiding the generation process by providing the initial tokens of the expected output significantly improves adherence to the required patterns. Under this condition, repetition penalties during decoding can further enhance output diversity. Interestingly, we also find that few-shot prompting may reduce lexical diversity.
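
To make the two decoding ideas in the abstract concrete, here is a minimal sketch (not the authors' code) using the Hugging Face transformers API: the prompt ends with the initial tokens of the required template, and a repetition penalty is applied during sampling. The model name, prompt wording, and hyperparameter values are illustrative assumptions, not values reported in the paper.

```python
# Sketch of template-seeded paraphrase generation with a repetition penalty.
# Model name, prompt, and hyperparameters are hypothetical examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

sentence = "The committee approved the proposal after a brief discussion."
prompt = (
    "Rewrite the sentence so that it keeps the same meaning.\n"
    f"Sentence: {sentence}\n"
    "Paraphrase:"  # initial tokens of the template, so the model continues the pattern
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.9,
    repetition_penalty=1.3,  # discourages copying the source wording verbatim
)
# Decode only the newly generated tokens, i.e. the paraphrase itself.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Seeding the prompt with "Paraphrase:" constrains the model to continue the template rather than, say, explaining the sentence; the repetition penalty then pushes the sampled continuation away from the source tokens, which is one plausible mechanism for the diversity gains the abstract reports.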
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 19083