Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits
Abstract: We study a collaborative variant of the stochastic linear bandit problem in a multi-agent setting, motivated by real-world recommender systems where multiple content creators indirectly influence each other's outcomes. In our formulation, agents interact with a shared environment but may choose to form coalitions, sharing observations to improve collective learning efficiency. We formalize this setup as a transferable utility (TU) cooperative game, where the value of a coalition is defined as the negative sum of cumulative regrets incurred by its members. This framework allows us to examine how algorithmic design and structural assumptions about agents---such as identical vs. heterogeneous action sets---affect collaboration incentives. We show that under certain algorithmic conditions, the induced TU game exhibits desirable properties: for identical agents, the game is convex and admits a non-empty core containing the Shapley value, ensuring stable and equitable collaboration. For heterogeneous agents, we demonstrate core non-emptiness and propose a simple, implementable payoff mechanism that satisfies all but one of the Shapley value axioms. Experimental results on problem instances derived from the MovieLens-100k dataset further illustrate how the empirical payout aligns with, and diverges from, the ideal cooperative outcome across settings. Our results offer a principled lens for designing collaborative learning systems that are both effective and incentive-aligned.
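The TU-game construction described in the abstract can be illustrated with a toy computation. The sketch below uses a hypothetical 3-agent symmetric game whose coalition values (negative cumulative regrets) are invented for illustration and are not taken from the paper; they are chosen to be convex, mirroring the identical-agents case, so the resulting Shapley allocation lies in the core.

```python
from itertools import permutations

# Hypothetical TU game: v(S) = negative sum of cumulative regrets of coalition S.
# Illustrative values only; larger coalitions reduce regret, and marginal
# contributions grow with coalition size (convexity / supermodularity).
v = {
    frozenset(): 0,
    frozenset({1}): -10, frozenset({2}): -10, frozenset({3}): -10,
    frozenset({1, 2}): -16, frozenset({1, 3}): -16, frozenset({2, 3}): -16,
    frozenset({1, 2, 3}): -18,
}

def shapley(players, v):
    """Shapley value: average marginal contribution over all join orders."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            phi[p] += v[coalition | {p}] - v[coalition]
            coalition = coalition | {p}
    return {p: phi[p] / len(orders) for p in players}

phi = shapley([1, 2, 3], v)
# By symmetry each agent receives v(N)/3 = -6, strictly better than the -10
# it incurs alone; every coalition is likewise satisfied, so this allocation
# sits in the (non-empty) core, as the abstract claims for identical agents.
```

For this convex toy game one can verify the core conditions directly: the allocation is efficient (payoffs sum to v(N) = -18) and every coalition receives at least its stand-alone value, so no group of agents can profitably deviate.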
Submission Number: 1054