Automated Policy Shaping for Multi-Agent Reinforcement Learning in Pursuit-Evasion Games via Retrieval-Augmented LLMs
Keywords: Large Language Models (LLMs), multi-agent reinforcement learning (MARL), Retrieval-Augmented Generation (RAG), pursuit-evasion game (PEG).
Abstract: This paper proposes an automated framework for shaping policies in multi-agent reinforcement learning (MARL), with a focus on the pursuit-evasion game (PEG). The framework addresses the limitations of manual reward design, which is time-consuming, labor-intensive, and inflexible. Its core innovation is a tight integration of Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs). RAG supplies domain-specific knowledge so that the framework can generate semantically correct learning signals that guide agents toward mastering the complex dynamics of PEG. The framework further couples RAG with a closed-loop, data-driven evolutionary process that autonomously refines these guidance signals based on agent performance. By systematically optimizing both the components and the structure of the reward, this evolutionary mechanism discovers sophisticated emergent behaviors. Empirical validation in complex PEG scenarios demonstrates that the framework significantly outperforms established baselines.
Submission Number: 40