TopoWeaver-R1: Reinforcing Difficulty-Aware Topology Evolution in Multi-Agent Competition-Level Code Generation
Keywords: Multi-Agent System, Competition-Level Code Generation, Dynamic Topology Generation, Reinforcement Learning Optimization
Abstract: Recent studies have shown that large language model (LLM)-driven multi-agent systems (MAS) are promising for addressing complex problems, with competition-level code generation as a representative domain. By emulating collaboration among human programmers, these systems leverage predefined interaction topologies to achieve notable gains. However, such fixed structures introduce interaction redundancy and excessive token costs, especially as task difficulty drops. While graph pruning and generation methods can produce sparser topologies, they remain static during inference, cannot adapt to execution feedback, and often converge to a narrow range of densities. To overcome these issues, we propose TopoWeaver-R1, a reinforcement learning–optimized MAS centered on an LLM orchestrator agent that supports end-to-end, evolutionary generation of dynamic interaction topologies. For each query, it infers agent roles and task difficulty, then constructs a task-adapted, density-aware layered directed acyclic graph (DAG) topology. The topology evolves based on execution feedback and interaction history, improving the task-solving performance of the generated code. On three competition-level and two basic code datasets, TopoWeaver-R1 achieves state-of-the-art accuracy, with up to 14.6\% higher accuracy, 13\% lower density, and 68\% lower token cost than the strongest baseline. Our approach transitions multi-agent topologies from static designs to dynamic, feedback-driven evolutionary designs with fine-grained, difficulty-aware density control.
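To make the abstract's notion of a difficulty-aware layered DAG concrete, here is a minimal sketch of how an orchestrator might map an estimated task difficulty (in [0, 1]) to the depth and edge density of a layered agent topology. All names and the density heuristic below are illustrative assumptions, not the paper's actual method or API.

```python
import random

def build_layered_dag(roles, difficulty, seed=0):
    """Sketch: construct a layered DAG whose depth and edge density
    scale with task difficulty in [0, 1]. Roles are assigned to layers
    round-robin; edges only point from earlier layers to later ones,
    so the resulting graph is acyclic by construction."""
    rng = random.Random(seed)
    # Harder tasks get more layers (a deeper collaboration pipeline).
    n_layers = max(2, round(1 + difficulty * (len(roles) - 1)))
    layers = [[] for _ in range(n_layers)]
    for i, role in enumerate(roles):
        layers[i % n_layers].append(role)
    edges = []
    # Difficulty-aware density: keep each candidate edge with
    # probability `difficulty`, but guarantee every non-source agent
    # has at least one incoming edge so no agent is disconnected.
    for li in range(1, n_layers):
        for dst in layers[li]:
            candidates = [src for prev in layers[:li] for src in prev]
            kept = [s for s in candidates if rng.random() < difficulty]
            if not kept:
                kept = [rng.choice(candidates)]
            edges.extend((s, dst) for s in kept)
    return layers, edges

def density(layers, edges):
    """Edges used, relative to the maximum a DAG on n nodes can hold."""
    n = sum(len(layer) for layer in layers)
    max_edges = n * (n - 1) / 2
    return len(edges) / max_edges if max_edges else 0.0
```

Under this sketch, an easy query (difficulty near 0) yields a shallow two-layer graph with only the mandatory edges, while a hard query approaches a fully connected layered DAG; the evolutionary step described in the abstract would then revise this initial topology using execution feedback.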
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9205