SKARL: Provably Scalable Kernel Mean Field Reinforcement Learning for Variable-Size Multi-Agent Systems
Keywords: Reinforcement Learning; Multi-agent Reinforcement Learning; Reproducing Kernel Hilbert Space
Abstract: Scaling multi-agent reinforcement learning (MARL) requires both scalability to large swarms and flexibility across varying population sizes. A promising approach is mean-field reinforcement learning (MFRL), which approximates agent interactions via population averages to mitigate state-action explosion. However, this approximation has limited representational capacity, restricting its effectiveness in truly large-scale settings. In this work, we introduce $\underline{S}$calable $\underline{K}$ernel Me$\underline{A}$n-Field Multi-Agent $\underline{R}$einforcement $\underline{L}$earning (SKARL), which lifts this bottleneck by embedding agent interactions into a reproducing kernel Hilbert space (RKHS). This kernel mean embedding provides a richer, size-agnostic representation that enables scaling across swarm sizes without retraining or architectural changes. For efficiency, we design an implementation based on functional gradient updates with Nyström approximations, which makes kernelized mean-field learning computationally tractable. On the theoretical side, we establish convergence guarantees for both the kernel functionals and the overall SKARL algorithm. Empirically, SKARL trained with 64 agents generalizes seamlessly to deployments ranging from 4 to 256 agents, outperforming MARL baselines.
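To illustrate the size-agnostic representation the abstract describes, below is a minimal sketch (not the paper's actual implementation) of an empirical kernel mean embedding of neighbor states with a Nyström landmark approximation; the RBF kernel, the landmark states, and parameters such as `gamma` and `reg` are assumptions made for the example.

```python
# Hypothetical sketch: a fixed-dimensional kernel mean embedding of neighbor
# states via Nystrom landmarks, so the feature size does not depend on how
# many agents are observed. Not taken from the SKARL paper.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def nystrom_mean_embedding(neighbor_states, landmarks, gamma=1.0, reg=1e-6):
    """Approximate the kernel mean embedding (1/N) * sum_i phi(x_i),
    projected onto the span of the landmark points (Nystrom features)."""
    K_nz = rbf_kernel(neighbor_states, landmarks, gamma)   # (N, m)
    K_zz = rbf_kernel(landmarks, landmarks, gamma)         # (m, m)
    # Symmetric inverse square root of the regularized landmark Gram matrix.
    evals, evecs = np.linalg.eigh(K_zz + reg * np.eye(len(landmarks)))
    K_zz_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    features = K_nz @ K_zz_inv_sqrt                        # (N, m) Nystrom features
    return features.mean(axis=0)                           # (m,) mean embedding

# Usage: embeddings computed from 8 or 256 neighbors share the same dimension
# (m = 16 landmarks here), which is what makes the representation agnostic to
# swarm size.
rng = np.random.default_rng(0)
landmarks = rng.normal(size=(16, 4))      # m hypothetical landmark states
print(nystrom_mean_embedding(rng.normal(size=(8, 4)), landmarks).shape)    # (16,)
print(nystrom_mean_embedding(rng.normal(size=(256, 4)), landmarks).shape)  # (16,)
```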
Primary Area: reinforcement learning
Submission Number: 19297