Dancing in Chains: Strategic Persuasion in Academic Rebuttal via Theory of Mind

Published: 26 Jan 2026, Last Modified: 11 Feb 2026
Venue: ICLR 2026 Poster
License: CC BY 4.0
Keywords: AI for Research; Rebuttal Agent
Abstract: Although AI has become deeply integrated into various stages of the research workflow and achieved remarkable advancements, academic rebuttal remains a significant and under-explored challenge. Rebuttal is a complex process of strategic communication under severe information asymmetry, not a simple technical debate. Current models fail because they only imitate surface-level linguistics, missing the essential element of perspective-taking required for effective persuasion. In this paper, we introduce RebuttalAgent, the first framework to ground academic rebuttal in Theory of Mind (ToM). Specifically, the agent implements ToM through a Theory-of-Mind-Strategy-Response (TSR) pipeline, which models a reviewer's mental state, formulates a persuasion strategy, and then generates a strategy-grounded response. To train our agent, we construct RebuttalBench, a large-scale synthetic dataset created via a novel critique-and-refine pipeline. Our two-stage training process begins with a Supervised Fine-tuning phase to equip the agent with ToM-based analysis and strategic planning capabilities, followed by a Reinforcement Learning phase using a novel self-reward mechanism for scalable self-improvement without an external reward model. For reliable and scalable automated evaluation, we develop Rebuttal-RM, a specialized evaluator trained on multi-source data of over 100K samples, whose scoring consistency with human preferences surpasses that of GPT-4.1. Extensive experiments show that RebuttalAgent significantly outperforms the base model by 18.3% and is competitive with advanced models such as o3 across both automated and human evaluations. Our code will be released publicly.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8600