A Lightweight Reasoning Method with Test-Time Scaling for Preserving Diversity and Factuality in LLM-Based Decision-Making
Keywords: Large Language Models, Decision Making, Collaborative Reasoning, Test-Time Scaling, Task Decomposition
Abstract: Large Language Models (LLMs) have shown remarkable performance across various tasks, but their reasoning capabilities still face challenges. This paper aims to mitigate the limitations of LLMs in complex decision-making tasks, which require high-level reasoning ability. We introduce Smart Peers, a lightweight reasoning method designed to enhance LLMs' performance in decision-making tasks by integrating test-time scaling. Specifically, Smart Peers employs sequential and parallel self-revision to perform task decomposition, enabling the LLM to make independent decisions multiple times and giving it the opportunity to revise each decision based on all peers' decisions. In this way, the method achieves test-time scaling, ensuring diversity and factuality at each step of the decision-making task and improving overall task completion. Despite being lightweight, Smart Peers outperforms more complex trajectory planning algorithms on certain tasks in our experiments. We evaluate Smart Peers on three decision-making tasks: WebShop, ALFWorld, and Mini-Crosswords. The results demonstrate that Smart Peers achieves significant performance improvements over baseline methods. In particular, on the WebShop task, Smart Peers achieves a relative improvement of approximately 34.63% over the baseline methods. Additionally, Smart Peers offers notable advantages, including fully leveraging the LLM's capabilities and promptly correcting erroneous steps, laying a foundation for future research in complex reasoning.
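To illustrate the parallel-decision-then-peer-revision step described in the abstract, here is a minimal sketch. The helper callables `llm_decide` and `llm_revise`, the peer count, and the majority-vote aggregation are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of a Smart Peers-style decision step.
# `llm_decide` and `llm_revise` are illustrative placeholders, not the authors' API.
from typing import Callable, List


def smart_peers_step(
    observation: str,
    llm_decide: Callable[[str], str],             # proposes a decision for the current sub-task
    llm_revise: Callable[[str, List[str]], str],  # revises a decision given all peer decisions
    num_peers: int = 3,
) -> str:
    """Sample independent peer decisions, let each peer revise after seeing
    all peers' decisions, then aggregate into a single step decision."""
    # Parallel phase: each peer decides independently, preserving diversity.
    peer_decisions = [llm_decide(observation) for _ in range(num_peers)]

    # Sequential revision phase: each peer may revise its decision in light of
    # all peers' decisions, helping correct factual errors made in isolation.
    revised = [llm_revise(d, peer_decisions) for d in peer_decisions]

    # Aggregate (here: simple majority vote over the revised decisions).
    return max(set(revised), key=revised.count)
```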
Submission Number: 9