Keywords: Exploration, Hyperspherical Embeddings, Reinforcement Learning, Scalability, von Mises-Fisher Distribution, Recommender Systems
TL;DR: We introduce a scalable method for exploring large action spaces in reinforcement learning problems where hyperspherical embedding vectors represent actions.
Abstract: This workshop paper is under review for presentation at an international conference. We introduce von Mises-Fisher exploration (vMF-exp), a scalable method for exploring large action sets in reinforcement learning problems where hyperspherical embedding vectors represent actions. vMF-exp first samples a vector from a von Mises-Fisher hyperspherical distribution centered on the state embedding representation, then explores the actions whose embeddings are nearest neighbors of this sampled vector, an approach that scales to virtually unlimited numbers of candidate actions.
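To make the two-step procedure concrete, here is a minimal sketch of the idea described above, not the authors' implementation. It assumes unit-norm state and action embeddings; the names (`vmf_exp`, `state_embedding`, `action_embeddings`, `kappa`) and the brute-force nearest-neighbor search are illustrative choices, and a production system would use an approximate nearest-neighbor index instead.

```python
import numpy as np
from scipy.stats import vonmises_fisher  # requires SciPy >= 1.11


def vmf_exp(state_embedding, action_embeddings, kappa=50.0, rng=None):
    """Sample a direction from a vMF distribution centered on the state
    embedding, then return the action whose (unit-norm) embedding is the
    nearest neighbor of the sampled vector."""
    # Step 1: draw one vector on the hypersphere, concentrated around
    # the state embedding (kappa controls the concentration).
    dist = vonmises_fisher(mu=state_embedding, kappa=kappa, seed=rng)
    v = np.asarray(dist.rvs(1)).reshape(-1)
    # Step 2: nearest neighbor under cosine similarity, i.e. the largest
    # dot product, since all embeddings lie on the unit sphere.
    return int(np.argmax(action_embeddings @ v))


# Toy usage: 10 000 candidate actions embedded on the unit sphere in R^8.
rng = np.random.default_rng(0)
A = rng.normal(size=(10_000, 8))
A /= np.linalg.norm(A, axis=1, keepdims=True)
s = A[0]  # pretend the state embedding coincides with action 0
print(vmf_exp(s, A, kappa=100.0, rng=rng))
```

Note that the per-step cost is dominated by the nearest-neighbor query, which sublinear ANN structures handle efficiently, rather than by a computation over every candidate action.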
We show that, under theoretical assumptions, vMF-exp asymptotically maintains the same probability of exploring each action as Boltzmann exploration (B-exp), a popular alternative that nonetheless suffers from scalability issues because it requires computing a softmax value for every candidate action. vMF-exp therefore serves as a scalable alternative to B-exp for exploring large action sets with hyperspherical embeddings.
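For contrast, here is a sketch of B-exp as characterized above, again under the assumption of unit-norm embeddings and with illustrative names: the softmax over all N candidate actions makes every exploration step O(N), which is precisely the bottleneck vMF-exp avoids.

```python
import numpy as np


def boltzmann_exp(state_embedding, action_embeddings, temperature=1.0, rng=None):
    """Sample an action with probability proportional to
    exp(<state, action> / temperature), computed over every action."""
    rng = rng or np.random.default_rng()
    logits = (action_embeddings @ state_embedding) / temperature  # O(N) pass
    logits -= logits.max()          # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()            # softmax over the full action set
    return int(rng.choice(len(probs), p=probs))
```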
We further validate the empirical relevance of vMF-exp by discussing its successful deployment at scale on a music streaming service to recommend playlists to millions of users.
Submission Number: 9