MA4DIV: Multi-Agent Reinforcement Learning for Search Result Diversification

Published: 29 Jan 2025, Last Modified: 29 Jan 2025 · WWW 2025 Oral · CC BY 4.0
Track: Search and retrieval-augmented AI
Keywords: Learning to Rank, Search Result Diversification, Multi-Agent Cooperation, Reinforcement Learning
TL;DR: A search result diversification (SRD) method using multi-agent reinforcement learning.
Abstract: Search result diversification (SRD), which aims to ensure that the documents selected for a ranking list cover a wide range of subtopics, is a significant and extensively studied problem in Web search and Information Retrieval. Existing methods primarily follow a "greedy selection" paradigm, i.e., selecting the document with the highest diversity score one at a time. These approaches tend to be inefficient and are easily trapped in suboptimal states. Other methods optimize an approximation of the objective function, but their results still remain suboptimal. To address these challenges, we introduce \textbf{M}ulti-\textbf{A}gent reinforcement learning (MARL) for search result \textbf{DIV}ersification, which we call \textbf{MA4DIV}. In this approach, each document is an agent, and search result diversification is modeled as a cooperative task among multiple agents. Modeling the SRD ranking problem as a cooperative MARL problem allows diversity metrics such as $\alpha$-NDCG to be optimized directly, while achieving high training efficiency. We conducted experiments on public TREC datasets and a large-scale industrial dataset. The results show that MA4DIV achieves substantial improvements in both effectiveness and efficiency over existing baselines, especially on the industrial-scale dataset.
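The abstract names $\alpha$-NDCG as the diversity metric that MA4DIV optimizes directly. For readers unfamiliar with it, below is a minimal sketch of how $\alpha$-NDCG@k is typically computed, assuming binary document-subtopic relevance judgments; the function and variable names are illustrative, not from the paper, and the "ideal" ranking is built greedily, which is the common approximation since computing the exact ideal is NP-hard.

```python
import math

def alpha_dcg(ranking, judgments, alpha=0.5, k=10):
    """alpha-DCG@k: a subtopic's gain decays by (1 - alpha) for each
    earlier document in the ranking that already covered it."""
    covered = {}  # subtopic -> number of times covered so far
    score = 0.0
    for rank, doc in enumerate(ranking[:k], start=1):
        gain = sum((1 - alpha) ** covered.get(s, 0)
                   for s in judgments.get(doc, ()))
        score += gain / math.log2(rank + 1)
        for s in judgments.get(doc, ()):
            covered[s] = covered.get(s, 0) + 1
    return score

def alpha_ndcg(ranking, judgments, alpha=0.5, k=10):
    """Normalize by a greedily constructed 'ideal' ranking."""
    remaining = list(judgments)
    ideal, covered = [], {}
    while remaining and len(ideal) < k:
        # Greedily pick the document with the highest marginal diversity gain.
        best = max(remaining, key=lambda d: sum(
            (1 - alpha) ** covered.get(s, 0) for s in judgments.get(d, ())))
        ideal.append(best)
        remaining.remove(best)
        for s in judgments.get(best, ()):
            covered[s] = covered.get(s, 0) + 1
    denom = alpha_dcg(ideal, judgments, alpha, k)
    return alpha_dcg(ranking, judgments, alpha, k) / denom if denom else 0.0

# Example: d1 and d2 cover subtopic A, d3 covers subtopic B, so
# interleaving B before the redundant d2 scores higher.
judgments = {"d1": {"A"}, "d2": {"A"}, "d3": {"B"}}
print(alpha_ndcg(["d1", "d3", "d2"], judgments))  # diverse order
print(alpha_ndcg(["d1", "d2", "d3"], judgments))  # redundant order, lower
```

The greedy-selection baselines the abstract criticizes repeatedly maximize exactly this kind of marginal gain one document at a time, whereas MA4DIV scores all document agents jointly.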
Submission Number: 462
