# Companion Repository for the Paper  
**"Does Stochastic Gradient Really Succeed for Bandits?"**

This repository contains the code and experiments accompanying our paper on the stochastic gradient bandit algorithm (SGB).

---

## Reproducing the Experiments

All experiments presented in the paper can be reproduced by running the Jupyter notebook `SGB_paper.ipynb`.

---

## Repository Structure

All bandit algorithm implementations and experiment pipelines are contained in the `MAB/` folder. Notably:

- `MAB.py`: General algorithm implementations, including SGB and SAMBA (for rewards in [0, 1]).
- `RadeMAB.py`: Instantiates algorithms for Rademacher distributions.

The script `xp_helpers.py` contains helper functions for running experiments in parallel (used for Sections 2–5 of the notebook).
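The helpers themselves are not reproduced here, but the underlying pattern is simple: bandit trajectories are independent, so they can be mapped over a worker pool and averaged. A rough, self-contained sketch of that pattern is below — `one_trajectory` is a placeholder, not a function from `xp_helpers.py`, and CPU-bound experiments would typically use processes rather than the threads shown here:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def one_trajectory(seed):
    """Stand-in for a single independent bandit trajectory."""
    rng = np.random.default_rng(seed)
    return float(rng.random(1000).mean())  # placeholder "result"

# Run N independent trajectories concurrently and average the results
N = 8
with ThreadPoolExecutor() as pool:
    results = list(pool.map(one_trajectory, range(N)))
avg = float(np.mean(results))
```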

---

## Running Experiments Manually

If you'd like to test the algorithms manually without parallelization, follow these steps:

### 1. Define a Bandit Instance

For example, to define a Rademacher bandit:

```python
# Import the Rademacher bandit environment (defined in MAB/RadeMAB.py)
from MAB.RadeMAB import RadeMAB as RadeM

# Define the bandit instance with the specified arm means
model = RadeM([mu0, mu1, ...])
```
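For context, a Rademacher arm with mean `mu` pays +1 with probability (1 + mu)/2 and −1 otherwise, so its expected reward is `mu`. A minimal self-contained toy version of such an environment (illustrative only — not the repo's `RadeMAB` class):

```python
import numpy as np

class ToyRadeMAB:
    """Toy K-armed bandit with Rademacher-type rewards: arm k pays +1
    with probability (1 + mu_k) / 2 and -1 otherwise, so E[reward] = mu_k."""

    def __init__(self, means, seed=0):
        self.means = np.asarray(means, dtype=float)
        self.rng = np.random.default_rng(seed)

    def pull(self, arm):
        p = (1.0 + self.means[arm]) / 2.0  # probability of +1
        return 1.0 if self.rng.random() < p else -1.0

# Example: two arms with means 0.5 and -0.2
toy = ToyRadeMAB([0.5, -0.2], seed=42)
rewards = [toy.pull(0) for _ in range(10000)]
```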

### 2. Run an Algorithm

To run a single trajectory of an algorithm `ALG` with parameters `param1, param2, ...`, it suffices to run:

```python
model.ALG(T, param1, param2,...)
```

For instance, for SGB with learning rate 0.2 up to horizon 1000:

```python
model.SGB(1000, 0.2)  # horizon T = 1000, learning rate eta = 0.2
```

To run multiple trajectories and report the average regret, one can use the Monte Carlo wrapper:

```python
# Average regret of SGB over N independent trajectories, up to horizon T
model.MC_regret('SGB', N, T, {'eta': 0.1})
```
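For a point of reference on what such a trajectory computes, the stochastic gradient bandit maintains softmax preferences and takes an unbiased stochastic gradient step on the expected reward at each round. Below is a minimal self-contained sketch of one trajectory with Bernoulli rewards, plus a hand-rolled Monte Carlo average — illustrative only, and not the repo's `SGB`/`MC_regret` implementations, which may differ in details such as baselines or initialization:

```python
import numpy as np

def softmax(theta):
    z = theta - theta.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sgb_trajectory(means, T, eta, seed=0):
    """One stochastic gradient bandit run; returns the cumulative regret.

    Each step: sample arm A ~ softmax(theta), observe reward R, then take
    the step  theta += eta * R * (e_A - pi),  an unbiased estimate of the
    gradient of the expected reward with respect to the preferences theta.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    K = len(means)
    theta = np.zeros(K)
    best = means.max()
    regret = 0.0
    for _ in range(T):
        pi = softmax(theta)
        a = rng.choice(K, p=pi)
        r = float(rng.random() < means[a])  # Bernoulli reward with mean means[a]
        grad = -pi.copy()
        grad[a] += 1.0
        theta += eta * r * grad
        regret += best - means[a]
    return regret

# Monte Carlo estimate: average regret over 20 independent trajectories
regs = [sgb_trajectory([0.9, 0.5], T=1000, eta=0.2, seed=s) for s in range(20)]
avg_regret = float(np.mean(regs))
```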





