# ICLR 2025 -- von Mises-Fisher Exploration
Repository for the ICLR 2025 submission "Exploring Large Action Sets with Hyperspherical
Embeddings using von Mises-Fisher Sampling" by Anonymous Authors.

This paper introduces von Mises-Fisher exploration (vMF-exp), a scalable method for exploring large action sets in reinforcement learning problems where hyperspherical embedding vectors represent actions. vMF-exp involves initially sampling a state embedding representation using a von Mises-Fisher hyperspherical distribution, then exploring this representation's nearest neighbors, which scales to virtually unlimited numbers of candidate actions.
We show that, under theoretical assumptions, vMF-exp asymptotically maintains the same probability of exploring each action as Boltzmann Exploration (B-exp), a popular alternative that, nonetheless, suffers from scalability issues as it requires computing softmax values for each action.
Consequently, vMF-exp serves as a scalable alternative to B-exp for exploring large action sets with hyperspherical embeddings. 
In the final part of this paper, we further validate the empirical relevance of vMF-exp by discussing its successful deployment at scale on a music streaming service. On this service, vMF-exp has been employed for months to recommend playlists inspired by initial songs to millions of users, from millions of possible actions for each playlist.
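
The two exploration schemes described above can be illustrated on a toy action set. The following is a minimal sketch, not the code of this repository: it uses only the Python standard library, samples the von Mises-Fisher vector with Wood's rejection sampler, and performs an exhaustive nearest-neighbor search where a large-scale system would rely on an approximate index. The function names (`sample_vmf`, `vmf_exp`, `boltzmann_exp`) are hypothetical, and using `kappa` as the inverse temperature of the softmax in B-exp is an assumption of this sketch.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from vMF(mu, kappa) with Wood's rejection sampler."""
    d = len(mu)
    # Step 1: sample w, the cosine between the draw and the mean direction.
    b = (-2.0 * kappa + math.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * math.log(1.0 - x0**2)
    while True:
        z = rng.betavariate((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        log_u = math.log(1.0 - rng.random())  # log of a uniform draw in (0, 1]
        if kappa * w + (d - 1) * math.log(1.0 - x0 * w) - c >= log_u:
            break
    # Step 2: uniform direction in the subspace orthogonal to e1 = (1, 0, ..., 0).
    g = [rng.gauss(0.0, 1.0) for _ in range(d - 1)]
    scale = math.sqrt(max(0.0, 1.0 - w * w)) / math.sqrt(dot(g, g))
    x = [w] + [scale * gi for gi in g]
    # Step 3: Householder reflection that maps e1 onto mu.
    h = [-mi for mi in mu]
    h[0] += 1.0  # h = e1 - mu
    hh = dot(h, h)
    if hh < 1e-24:  # mu is already e1, nothing to reflect
        return x
    coef = 2.0 * dot(h, x) / hh
    return [xi - coef * hi for xi, hi in zip(x, h)]


def vmf_exp(state, actions, kappa, rng):
    """vMF-exp sketch: perturb the state on the sphere, return its nearest action."""
    v_tilde = sample_vmf(state, kappa, rng)
    return max(range(len(actions)), key=lambda i: dot(v_tilde, actions[i]))


def boltzmann_exp(state, actions, kappa, rng):
    """B-exp sketch: softmax over all actions (assumes kappa is the inverse temperature)."""
    logits = [kappa * dot(state, a) for a in actions]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]  # shifted for numerical stability
    return rng.choices(range(len(actions)), weights=weights, k=1)[0]


rng = random.Random(0)
actions = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
state = [0.5, 0.5, 0.5, 0.5]  # unit-norm state embedding
print("vMF-exp picks action", vmf_exp(state, actions, kappa=1.0, rng=rng))
print("B-exp picks action", boltzmann_exp(state, actions, kappa=1.0, rng=rng))
```

Note that `boltzmann_exp` touches every action to build the softmax, whereas `vmf_exp` only needs one sample followed by a nearest-neighbor query, which is the step that can be delegated to a scalable index.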

# Experiments
## Theoretical probabilities
The script `main.py` runs Monte Carlo simulations that estimate the probability that von Mises-Fisher exploration and Boltzmann exploration each sample an action with a known similarity to a given state vector. The results can then be plotted with `plot_mc_probas.py`, which compares them against the theoretical expressions of **Proposition 4.2** and **Proposition 4.4**.
For instance, to reproduce **Figure 2.a**, run the following command
```
python -m src.main -k 1.0 -a 0.5 -d 4 -N 1000000 -bs 256 -nt 30000
```
which runs the corresponding Monte Carlo simulations (about 3 hours on an NVIDIA GTX 1080), followed by the command
```
python -m src.plot_mc_probas -path simulations/k=1.0_a=0.50_d=4_N=1000000_samples=7680000/
```
which creates a plot similar to the one below and saves it in a subfolder of `simulations/` named according to the chosen parameters.

![Example Monte Carlo plot](resources/exemple_monte_carlo.png)
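
The folder name passed to `-path` in the command above appears to encode the simulation parameters, with the sample count matching the product of the `-bs` and `-nt` values (256 × 30000 = 7680000). The helper below is a hypothetical sketch that rebuilds the name under that assumption. The format is inferred from this single example, so the naming logic in `main.py` may differ.

```python
def simulation_dir(k, a, d, n_actions, bs, nt):
    """Rebuild the results folder name from the command-line values (assumed scheme)."""
    samples = bs * nt  # assumption: total samples = value of -bs times value of -nt
    return f"simulations/k={k:.1f}_a={a:.2f}_d={d}_N={n_actions}_samples={samples}/"


print(simulation_dir(k=1.0, a=0.5, d=4, n_actions=1_000_000, bs=256, nt=30_000))
```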


## Optional: 3D and 2D Voronoi tessellations
The script `plot_voronoi_3D.py` reproduces **Figure 1.b**, an example of the Voronoi tessellation of 51 vectors uniformly distributed on the sphere in 3D.
Running the command
```
python -m src.plot_voronoi_3D --shuffle
```
will sample new vectors and result in a plot similar to the following

![Example 3D Voronoi tessellation](resources/3D_voronoi_exemple.png)
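
On the sphere, the Voronoi cell of a vector is the set of points that are closer to it than to any other vector. For unit vectors, this is the same as having the largest inner product with it. The sketch below is independent of the plotting scripts of this repository and uses hypothetical function names: it samples 51 uniform vectors on the sphere in 3D, then estimates the fraction of the sphere covered by each cell by assigning random points to their nearest vector.

```python
import math
import random


def uniform_on_sphere(n, dim, rng):
    """Sample n points uniformly on the unit sphere by normalizing Gaussian vectors."""
    points = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(t * t for t in g))
        points.append([t / norm for t in g])
    return points


def voronoi_cell(query, sites):
    """Index of the site whose spherical Voronoi cell contains `query`."""
    # For unit vectors, the closest site is the one with the largest inner product.
    return max(
        range(len(sites)),
        key=lambda i: sum(q * s for q, s in zip(query, sites[i])),
    )


rng = random.Random(0)
sites = uniform_on_sphere(51, 3, rng)      # 51 vectors, as in the 3D example
queries = uniform_on_sphere(5_000, 3, rng)

counts = [0] * len(sites)
for q in queries:
    counts[voronoi_cell(q, sites)] += 1

# Monte Carlo estimate of the fraction of the sphere covered by each cell.
areas = [cnt / len(queries) for cnt in counts]
print("largest cell:", max(areas), "smallest cell:", min(areas))
```

The same nearest-vector assignment is what the nearest-neighbor step of vMF-exp performs: a sampled vector selects the action whose Voronoi cell it falls into.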

## Optional: 2D Voronoi tessellation

The script `plot_voronoi_2D.py` reproduces **Figure 1.c**, an example of the Voronoi tessellation of 11 vectors uniformly distributed on the circle in 2D.
Running the command
```
python -m src.plot_voronoi_2D --shuffle
```
will sample new vectors and result in a plot similar to the following

![Example 2D Voronoi tessellation](resources/2D_voronoi_exemple.png)
