REDEREF: RECURSIVE DELEGATION AND REFLECTION FOR MULTI-TURN LLM AGENT COLLABORATION WITH DYNAMIC CAPABILITY DISCOVERY

Ankit Shah; Mohammad Parsa Hosseini; Saiyra Qureshi; Alex Huang; Connie Miao; Wei Wei

20 Sept 2025 (modified: 12 Feb 2026) · ICLR 2026 Conference Desk Rejected Submission · CC BY 4.0

Keywords: Recursive

Abstract: Large language models (LLMs) achieve strong single-turn performance, yet real-world deployments demand multi-turn, multi-agent coordination with dynamic routing, reliable credit assignment, and long-horizon memory. We introduce REDEREF, a lightweight, training-free framework that wraps arbitrary LLM agents with four synergistic components: (i) online Bayesian delegation (Thompson sampling) for dynamic routing; (ii) calibrated self-reflection via an LLM judge to drive credit assignment and recursive re-routing; (iii) text-appropriate aggregation using selection with evidence checks; and (iv) memory-aware belief updates for long-term adaptation. Across domain-diverse, split-knowledge tasks, REDEREF attains higher success rates than static non-collaborative baselines, with ablations indicating that the recursive re-routing loop contributes the majority of gains on initially failed tasks while online Bayesian updates improve routing efficiency. These results suggest that an interpretable, probabilistic wrapper can substantially enhance multi-agent LLM coordination, enabling dynamic task routing, emergent specialization, and long-term adaptability with minimal overhead.
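The routing loop described in the abstract (Thompson sampling for delegation, a judge verdict for credit assignment, and re-routing on failure) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Beta-Bernoulli belief per agent and a binary judge verdict, neither of which the abstract specifies, and every name in it (`BetaBelief`, `ThompsonRouter`, `solve`, `max_hops`) is hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief over one agent's success rate (assumed model)."""
    alpha: float = 1.0  # 1 + number of judged successes
    beta: float = 1.0   # 1 + number of judged failures

    def sample(self, rng):
        # Draw a plausible success rate from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success):
        # Conjugate Bernoulli update from a binary judge verdict.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


class ThompsonRouter:
    """Hypothetical router: Thompson sampling plus re-routing on failure."""

    def __init__(self, agent_names, seed=0):
        self.rng = random.Random(seed)
        self.beliefs = {name: BetaBelief() for name in agent_names}

    def pick(self, exclude=()):
        # Sample once per candidate and delegate to the highest draw.
        candidates = [n for n in self.beliefs if n not in exclude]
        if not candidates:
            return None
        return max(candidates, key=lambda n: self.beliefs[n].sample(self.rng))

    def solve(self, task, agents, judge, max_hops=3):
        # Delegate, judge the answer, update beliefs, re-route if rejected.
        tried = []
        for _ in range(max_hops):
            name = self.pick(exclude=tried)
            if name is None:
                break
            answer = agents[name](task)
            ok = judge(task, answer)
            self.beliefs[name].update(ok)
            if ok:
                return name, answer
            tried.append(name)
        return None, None
```

In this sketch `agents` maps a name to any callable that takes a task and returns text, and `judge` is any callable returning `True` or `False`; in practice both would wrap LLM calls. Because the beliefs persist across calls to `solve`, agents that keep passing the judge are sampled more often over time, which is one simple way to picture the "memory-aware belief updates" the abstract mentions.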

Primary Area: foundation or frontier models, including LLMs

Submission Number: 23644
