MaRCA: Multi-Agent Reinforcement Learning for Dynamic Computation Allocation in Large-Scale Recommender Systems

Published: 02 Mar 2026, Last Modified: 09 Mar 2026ICLR 2026 Workshop AIMSEveryoneRevisionsCC BY 4.0
Keywords: Resource Allocation, Deep Reinforcement Learning, Recommender System, Cooperative Multi-Agent Systems, Model Predictive Control
TL;DR: This work integrates multi‑agent reinforcement learning and model predictive control to dynamically allocate computation in large-scale recommendation systems, achieving a 16.7 % revenue increase and remaining stable during major promotions.
Abstract: Modern recommender systems face significant computational challenges due to growing model complexity and traffic scale, making efficient computation allocation critical for maximizing business revenue. Existing approaches typically simplify multi-stage computation resource allocation, neglecting inter-stage dependencies, thus limiting global optimality. In this paper, we propose MaRCA, a multi-agent reinforcement learning framework for end-to-end computation resource allocation in large-scale recommender systems. MaRCA models the stages of a recommender system as cooperative agents, using Centralized Training with Decentralized Execution (CTDE) to optimize revenue under computation resource constraints. We introduce an AutoBucket TestBench for accurate computation cost estimation, and a Model Predictive Control (MPC)-based Revenue-Cost Balancer to proactively forecast traffic loads and adjust the revenue‑cost trade‑off accordingly. Since its end-to-end deployment in the advertising pipeline of a leading global e-commerce platform in November 2024, MaRCA has consistently handled hundreds of billions of ad requests per day and has delivered a 16.67\% revenue uplift using existing computation resources.
Track: Long Paper
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 80
Loading