Optimizing the coalition gain in Online Auctions with Greedy Structured Bandits

Published: 17 Jul 2025, Last Modified: 06 Sept 2025
EWRL 2025 Poster
License: CC BY 4.0
Keywords: Bandits, Auctions, Concentration inequalities
Abstract: Motivated by online display advertising, this work considers repeated second-price auctions, where agents sample their value from an unknown distribution with cumulative distribution function F. In each auction t, a decision-maker bound by limited observations selects n_t agents from a coalition of N to compete for a prize against p other agents, aiming to maximize the cumulative reward of the coalition across all auctions. The problem is framed as an N-armed structured bandit, where each number of players sent is an arm n, with expected reward r(n) fully characterized by F and p + n. We present two algorithms, Local-Greedy (LG) and Greedy-Grid (GG), both achieving constant problem-dependent regret. This relies on three key ingredients: (1) an estimator of r(n) built from feedback collected at any arm k, (2) concentration bounds on these estimates for k within an estimation neighborhood of n, and (3) the unimodality of r under standard assumptions on F. Additionally, GG enjoys problem-independent guarantees on top of the best problem-dependent guarantees. However, by avoiding reliance on confidence intervals, LG outperforms GG in practice, as well as standard unimodal bandit algorithms such as OSUB and multi-armed bandit algorithms.
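To make the setup concrete, below is a minimal illustrative sketch (not the paper's LG or GG algorithm) of a greedy hill-climb over Monte-Carlo estimates of r(n) in simulated second-price auctions. It assumes F is Uniform(0, 1) and full knowledge of the simulator; the function names `coalition_reward` and `local_greedy` and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def coalition_reward(n, p, sample_values, n_rounds=2000):
    """Average simulated coalition gain in a second-price auction:
    n coalition agents and p outsiders draw i.i.d. values; if the top
    value belongs to the coalition, the gain is (top - second-highest)."""
    total = 0.0
    for _ in range(n_rounds):
        values = np.concatenate([sample_values(n), sample_values(p)])
        order = np.argsort(values)[::-1]          # indices sorted by decreasing value
        if order[0] < n:                          # winner is a coalition agent
            total += values[order[0]] - values[order[1]]
    return total / n_rounds

def local_greedy(N, p, sample_values):
    """Hill-climb on Monte-Carlo estimates of r(n), exploiting the
    (assumed) unimodality of the expected reward in n."""
    n = 1
    r_hat = coalition_reward(n, p, sample_values)
    while n < N:
        r_next = coalition_reward(n + 1, p, sample_values)
        if r_next <= r_hat:                       # estimated reward stops increasing
            break
        n, r_hat = n + 1, r_next
    return n, r_hat

if __name__ == "__main__":
    sample_uniform = lambda k: rng.uniform(size=k)    # F = Uniform(0, 1)
    best_n, best_r = local_greedy(N=10, p=5, sample_values=sample_uniform)
    print(f"estimated best coalition size: {best_n}, reward ~ {best_r:.3f}")
```

With uniform values, r(n) = n / ((n + p)(n + p + 1)), which is unimodal in n, so a local search of this kind finds the best coalition size; the paper's algorithms additionally handle the unknown F and limited-observation feedback, which this sketch ignores.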
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Hugo_Richard1
Track: Fast Track: published work
Publication Link: https://proceedings.neurips.cc/paper_files/paper/2024/hash/22c799f287fd05e7174fd65a3ce134af-Abstract-Conference.html
Submission Number: 8