Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
TL;DR: We propose an embedding-based search framework that optimises the first token's embedding through controlled perturbation and Bayesian refinement, improving reasoning accuracy and coherence in large language models at minimal computational cost.
Abstract: Large Language Models (LLMs) struggle with complex reasoning due to limited diversity in their sampled generations and inefficient search over candidate solutions. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation to refine embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior answer correctness at minimal computational cost, making the method a scalable, model-agnostic solution.
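To make the two-stage loop in the abstract concrete, here is a minimal sketch (not the paper's implementation): `generate_from_embedding` and `verifier_score` are hypothetical stand-ins for an LLM decoder conditioned on a soft first-token embedding and for the verifier-guided objective, with a synthetic quadratic objective and a toy embedding dimension used purely for illustration.

```python
# Sketch of Soft Reasoning's search loop: (1) perturb the first-token
# embedding for exploration, (2) refine it with Bayesian optimisation.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
DIM = 16  # toy embedding dimension, for illustration only

def generate_from_embedding(e: np.ndarray) -> str:
    """Hypothetical: decode an answer with `e` injected as the first-token embedding."""
    return f"answer@{e[:2].round(2)}"

def verifier_score(answer: str, e: np.ndarray) -> float:
    """Hypothetical verifier: higher means the answer looks more correct.
    A synthetic objective stands in for the learned, verifier-guided one."""
    return float(-np.sum((e - 0.5) ** 2))

base = rng.normal(size=DIM)          # embedding of the model's default first token
X, y = [], []
for _ in range(8):                   # (1) controlled perturbation around the base
    e = base + 0.1 * rng.normal(size=DIM)
    X.append(e)
    y.append(verifier_score(generate_from_embedding(e), e))

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
for _ in range(20):                  # (2) Bayesian refinement of the embedding
    gp.fit(np.array(X), np.array(y))
    cand = base + 0.1 * rng.normal(size=(256, DIM))    # candidate pool near the base
    mu, sd = gp.predict(cand, return_std=True)
    best = max(y)
    z = (mu - best) / np.maximum(sd, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    e = cand[int(np.argmax(ei))]
    X.append(e)
    y.append(verifier_score(generate_from_embedding(e), e))

e_star = X[int(np.argmax(y))]        # best embedding found by the search
print("best verifier score:", max(y))
print("chosen answer:", generate_from_embedding(e_star))
```

The Gaussian-process surrogate with an expected-improvement acquisition is one standard way to realise the exploration–exploitation balance the abstract describes; the actual surrogate, acquisition function, and verifier in the paper may differ.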
Lay Summary: Large language models (LLMs), such as ChatGPT, can perform impressive tasks but often fail when faced with complex reasoning problems. Typically, these models struggle because they explore possible answers inefficiently, either sticking too closely to predictable paths or introducing excessive randomness. To address this, we developed a method called Soft Reasoning, which carefully guides these models to explore different potential solutions more efficiently and accurately.
Instead of letting the model randomly guess, we slightly adjust its internal "thought process" at the very start of generating an answer. Specifically, we introduce tiny, controlled changes (like gentle nudges) to the model's initial direction, then refine these changes based on how promising each initial guess looks. Using a method called Bayesian optimisation, our approach systematically explores better starting points without random guessing or manual fine-tuning.
Our experiments show that Soft Reasoning significantly improves the accuracy of answers from LLMs on challenging reasoning tasks, while using far fewer resources. This makes our approach highly scalable and widely applicable to different language models. Ultimately, our research helps models like ChatGPT reason more effectively, making them more reliable for solving complex problems in various real-world applications.
Primary Area: Deep Learning->Large Language Models
Keywords: Large Language Models, Reasoning, Embedding Perturbation, Bayesian Optimisation
Submission Number: 10330