Keywords: Synthetic Data Generation, Language Models, LLMs, Fine-tuning
TL;DR: We introduce a general soft-prompt based data synthesis approach to generate fine-tuning data that improves upon hard-prompt baselines.
Abstract: We present a novel soft-prompt based framework, SoftSRV, that leverages a frozen pre-trained large language model (LLM) to generate targeted synthetic text sequences. Given a sample from the target distribution, our proposed framework uses data-driven loss minimization to train a parameterized "variable" soft-prompt. This soft-prompt is then used to steer the frozen LLM to generate synthetic sequences that are similar to those of the target distribution. We argue that SoftSRV provides a practical improvement over common hard-prompting approaches that rely on human-curated prompt templates, which can be idiosyncratic, labor-intensive to craft, and may need to be specialized per domain. We empirically evaluate SoftSRV and other baselines, using a frozen large decoder-only model to generate synthetic fine-tuning data for a small Gemma model. To test generality, we evaluate across three different domains (coding, math, reasoning) without any particular specialization to each domain. In this challenging setting, SoftSRV significantly improves upon hard-prompt baselines, generating data that yields superior fine-tuning performance and better matches the target distribution according to the MAUVE similarity metric.
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11598
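The abstract describes training a soft-prompt by loss minimization against a frozen model and then using it to steer generation. As a rough illustration of that idea only, the following is a minimal toy sketch in pure Python. It is not the SoftSRV implementation: the "frozen LLM" here is a hypothetical bigram-style softmax model with fixed random weights `W` and `E`, the soft-prompt is a single plain vector (not the parameterized "variable" soft-prompt of the paper), and all names, sizes, and hyperparameters are assumptions made for the example.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(W, E, prompt, prev_token):
    # Toy stand-in for a frozen LM: the soft-prompt vector is added to the
    # embedding of the previous token, then projected to vocabulary logits.
    hidden = [p + e for p, e in zip(prompt, E[prev_token])]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in W]
    return softmax(logits)

def mean_nll(W, E, prompt, sequences):
    # Average next-token negative log-likelihood on the target sample.
    total, count = 0.0, 0
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            total -= math.log(next_token_probs(W, E, prompt, prev)[nxt])
            count += 1
    return total / count

def train_soft_prompt(W, E, sequences, dim, steps=300, lr=0.05):
    # Gradient descent on the prompt only; W and E are never updated (frozen).
    prompt = [0.0] * dim
    for _ in range(steps):
        grad, count = [0.0] * dim, 0
        for seq in sequences:
            for prev, nxt in zip(seq, seq[1:]):
                probs = next_token_probs(W, E, prompt, prev)
                for v, row in enumerate(W):
                    err = probs[v] - (1.0 if v == nxt else 0.0)
                    for d in range(dim):
                        grad[d] += err * row[d]
                count += 1
        prompt = [p - lr * g / count for p, g in zip(prompt, grad)]
    return prompt

def generate(W, E, prompt, start, length, rng):
    # Sample a synthetic sequence from the frozen model steered by the prompt.
    seq = [start]
    for _ in range(length - 1):
        probs = next_token_probs(W, E, prompt, seq[-1])
        seq.append(rng.choices(range(len(probs)), weights=probs)[0])
    return seq

random.seed(0)
VOCAB, DIM = 5, 4  # hypothetical toy sizes
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
E = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
target_sample = [[0, 1, 2, 3], [1, 2, 3, 4], [0, 2, 3, 4]]  # made-up token ids

loss_before = mean_nll(W, E, [0.0] * DIM, target_sample)
soft_prompt = train_soft_prompt(W, E, target_sample, DIM)
loss_after = mean_nll(W, E, soft_prompt, target_sample)
synthetic = generate(W, E, soft_prompt, 0, 4, random.Random(1))
```

In this toy setting the only trainable quantity is `soft_prompt`, which mirrors the frozen-model setup the abstract describes; a real system would instead prepend learned embeddings to the input of an actual pre-trained LLM.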