Abstract: Existing LLM-based agents have achieved strong performance on held-in tasks, but their generalizability to unseen tasks remains poor. Hence, some recent work focuses on fine-tuning the policy model with more diverse tasks to improve generalizability. In this work,
we find that fine-tuning a reward model to guide the policy model is more robust than directly fine-tuning the policy model.
Based on this finding, we propose AgentRM, a generalizable reward model, to guide the policy model for effective test-time search.
We comprehensively investigate three approaches to constructing the reward model: explicit reward modeling, implicit reward modeling, and LLM-as-a-judge.
We then use AgentRM to guide the answer generation with Best-of-N sampling and step-level beam search.
On nine agent tasks, AgentRM enhances the base policy model by $8.8$ points on average, surpassing the top general agent by $4.0$ points.
Moreover, with a specialized policy model, AgentRM outperforms the top specialized agent by $11.4$ points on held-in tasks.
All data and source code will be released to facilitate research in this area.
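To make the test-time search concrete, below is a minimal sketch of how a reward model can guide Best-of-N sampling and step-level beam search, as mentioned in the abstract. The callables `generate`, `propose`, `score`, and `is_done` are hypothetical stand-ins for the policy model and AgentRM, not the released API.

```python
# Minimal sketch of reward-model-guided test-time search.
# `generate`, `propose`, `score`, and `is_done` are hypothetical placeholders.
from typing import Callable, List


def best_of_n(
    task: str,
    generate: Callable[[str], str],          # policy: task -> full candidate trajectory
    score: Callable[[str, str], float],      # reward model: (task, trajectory) -> scalar
    n: int = 8,
) -> str:
    """Sample n candidate trajectories and return the one the reward model prefers."""
    candidates: List[str] = [generate(task) for _ in range(n)]
    return max(candidates, key=lambda c: score(task, c))


def step_beam_search(
    task: str,
    propose: Callable[[str, List[str]], List[str]],   # policy: (task, partial trajectory) -> next-step candidates
    score: Callable[[str, List[str]], float],         # reward model: (task, partial trajectory) -> scalar
    is_done: Callable[[List[str]], bool],              # whether a trajectory is complete
    beam_width: int = 4,
    max_steps: int = 10,
) -> List[str]:
    """Keep the beam_width highest-scoring (partial) trajectories after every step."""
    beams: List[List[str]] = [[]]
    for _ in range(max_steps):
        # Expand unfinished beams with candidate next steps; carry finished ones forward.
        expansions = [b + [a] for b in beams if not is_done(b) for a in propose(task, b)]
        finished = [b for b in beams if is_done(b)]
        pool = expansions + finished
        if not pool:
            break
        pool.sort(key=lambda b: score(task, b), reverse=True)
        beams = pool[:beam_width]
        if all(is_done(b) for b in beams):
            break
    return max(beams, key=lambda b: score(task, b))
```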
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: applications
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 8043