Dynamic Role-Graph Reinforcement Learning for Multi-Agent Collaborative Coding Systems

ICLR 2026 Conference Submission 25475 Authors

20 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | License: CC BY 4.0

Keywords: Role-Graph Reinforcement Learning

Abstract: We propose \textbf{Dynamic Role-Graph Reinforcement Learning (DRGRL)}, a novel framework for multi-agent collaborative coding systems that addresses the challenges of evolving team dynamics and role-based coordination. Traditional multi-agent reinforcement learning (MARL) approaches often rely on static representations of agent interactions, which fail to capture the fluid nature of real-world software development teams. The proposed method combines dynamic graph neural networks (GNNs) with role-aware attention mechanisms to model time-varying collaboration patterns: agents (i.e., developers) are represented as nodes of a graph whose topology adapts as team composition changes. A transformer-based GNN encoder propagates information across the graph, and a collaboration-complexity module estimates coordination complexity to guide decision-making. The framework uses a centralized critic with decentralized actors (CCDA) to maximize team-level rewards (e.g., fewer merge conflicts or higher test coverage) while preserving individual autonomy. Moreover, the system interfaces with standard development tools, such as version control systems, IDEs, and conflict resolvers, to simplify the integration of learned policies into existing workflows. The key novelty lies in the \textbf{role-graph duality}, where roles are both learned from data and emergent from graph dynamics, enabling hierarchical coordination strategies. For instance, high collaboration complexity can trigger the assignment of mediator roles to stabilize the team. Experiments on synthetic and real-world coding datasets show that, in simulation, the proposed method achieves significant gains in teamwork efficiency and code quality over baseline MARL methods. Given its flexibility with dynamic teams and the generality of the collaboration scenario, the framework is a potential candidate for addressing the challenges that modern software engineering faces.
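The abstract gives no equations for the role-aware attention mechanism, so the following is only an illustrative sketch, not the authors' implementation. It assumes scaled dot-product attention over each agent's current in-neighbors, with an additive per-role-pair bias on the attention logit. The function name `role_aware_attention_step`, the `role_bias` table, and the role labels are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def role_aware_attention_step(features, roles, edges, role_bias):
    """One message-passing step over the graph at a single time step.

    features:  dict agent -> list[float] (node embedding)
    roles:     dict agent -> str (hypothetical role label)
    edges:     set of directed (src, dst) pairs; pass a different set
               at each time step to model a changing team topology
    role_bias: dict (dst_role, src_role) -> float, added to the logit
               (an assumed form of "role-aware" attention)
    """
    updated = {}
    for dst, h_dst in features.items():
        # Each agent attends to itself plus its current in-neighbors.
        nbrs = [dst] + sorted(s for (s, d) in edges if d == dst and s != dst)
        scale = math.sqrt(len(h_dst))
        logits = [
            sum(a * b for a, b in zip(h_dst, features[src])) / scale
            + role_bias.get((roles[dst], roles[src]), 0.0)
            for src in nbrs
        ]
        weights = softmax(logits)
        # New embedding is the attention-weighted mean of neighbor embeddings.
        updated[dst] = [
            sum(w * features[src][k] for w, src in zip(weights, nbrs))
            for k in range(len(h_dst))
        ]
    return updated


# Toy team: three agents, one collaboration edge at this time step.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
roles = {"a": "author", "b": "reviewer", "c": "mediator"}
edges_t0 = {("a", "b")}

plain = role_aware_attention_step(features, roles, edges_t0, {})
biased = role_aware_attention_step(
    features, roles, edges_t0, {("reviewer", "author"): 5.0}
)
```

In this toy setup, agents without incoming edges keep their embeddings unchanged, and a positive bias for the (reviewer, author) pair shifts the reviewer's updated embedding toward the author's. A learned model would replace the fixed bias table and raw dot products with trained parameters.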

Primary Area: transfer learning, meta learning, and lifelong learning

Submission Number: 25475
