Intent Factored Generation: Unleashing the Diversity in Your Language Model

Published: 12 Jun 2025, Last Modified: 21 Jun 2025 · EXAIT@ICML 2025 Poster · CC BY 4.0
Track: Language Modeling
Keywords: LLMs, Semantic Diversity, Exploration, RLVF, RLHF, Reasoning, Instruction Tuning, Maths, Code Generation
TL;DR: We propose a method to increase the ability of an LLM to explore through coherence-bound semantic diversity.
Abstract: Obtaining multiple meaningfully diverse, high-quality samples from Large Language Models (LLMs) for a fixed prompt remains an open challenge. Current methods for increasing diversity often operate only at the token level, paraphrasing the same response. To address this, we propose **I**ntent **F**actored **G**eneration (**IFG**), which factors the sampling process into two stages. First, in a semantically dense intent stage, we sample keywords or a summary that anchors the sample. In the second stage, we sample the final response conditioned on both the original prompt and the intent from the first stage. This factorisation allows us to use a higher temperature during the intent step to promote conceptual diversity, and a lower temperature during the final generation to ensure the outputs are coherent and self-consistent. We empirically demonstrate that this simple method is highly effective across a diverse set of tasks. For reasoning tasks, we show that it improves pass@k on maths and code problems. We demonstrate that this pass@k improvement translates to higher accuracy (pass@1) when we use IFG as an exploration method for Reinforcement Learning on maths. We also show that IFG is useful beyond reasoning: we combine IFG with Direct Preference Optimisation to increase diversity without sacrificing reward. Finally, we evaluate IFG on a general language modelling task, modelling comments on news articles, using a new dataset that we collect and open-source. On this task, we achieve higher diversity while maintaining the quality of the generations. In summary, we present a simple method for increasing the sample diversity of LLMs while maintaining performance across many tasks.
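
To make the two-stage factorisation concrete, below is a minimal sketch of IFG-style decoding using Hugging Face `transformers`. The model choice, the `Intent (keywords):` / `Response:` prompt tags, and the specific temperatures are illustrative assumptions, not the authors' exact setup.

```python
# Minimal IFG-style two-stage sampling sketch (illustrative; not the paper's exact prompts).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def sample(text: str, temperature: float, max_new_tokens: int = 64) -> str:
    """Draw one sampled continuation of `text` at the given temperature."""
    inputs = tokenizer(text, return_tensors="pt")
    out = model.generate(
        **inputs,
        do_sample=True,
        temperature=temperature,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

prompt = "Question: prove that the sum of two even integers is even."

# Stage 1: high-temperature intent sampling -> diverse, semantically dense anchor.
intent = sample(prompt + "\nIntent (keywords):", temperature=1.2, max_new_tokens=24)

# Stage 2: lower-temperature response sampling, conditioned on prompt + intent -> coherent output.
response = sample(prompt + "\nIntent (keywords):" + intent + "\nResponse:", temperature=0.7)
print(response)
```

Repeating stage 1 across seeds yields conceptually distinct intents, while the lower-temperature stage 2 keeps each individual completion coherent.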
Serve As Reviewer: ~Eltayeb_Ahmed1, ~Uljad_Berdica1
Submission Number: 58