# Ensemble Sampling for Nonlinear Bandits

## Overview of Experiments

We implement and benchmark the following algorithms and their anytime variants:  
`Lin-ES`, `GLM-ES`, `Neural-ES`  

We test the algorithms in the following bandit settings:  
`linear`, `logistic`, `distance`, `quadform`  


## Structure of the Code

Algorithms are implemented in `./algo`, and accumulated regrets are stored in `./data`.  
Bandit environments are implemented in `./train_utils`.  
The setup of each experiment (parameters of the environment and the agent) is in `./configs/simulation`.  


## How to Run the Code

To install the necessary packages, run
```bash
pip install -r requirements.txt
```

To generate synthetic data in bandit settings, run
```bash
python3 data_generator.py --config configs/data-linear.yaml
```

Use the following command to run simulations with a specific setting and model:
```bash
python3 run_simulation.py --config_path configs/simulation/[chosen model].yaml --repeat [number of experiments to repeat] 
```
For example, to run `LMC-TS` in the linear contextual bandit setting:
```bash
python3 run_simulation.py --config_path configs/simulation/linear-LMCTS.yaml --repeat 1
```
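
To run several experiments in sequence, the command above can be wrapped in a small shell function. This is only a sketch: `run_all` and the `DRY_RUN` variable are hypothetical names that are not part of this repository. It assumes the function is called from the repository root and that every `.yaml` file directly under the given directory is a valid simulation config.
```bash
# Hypothetical helper: run run_simulation.py once per config file.
#   $1: directory holding the configs (default: configs/simulation)
#   $2: value passed to --repeat (default: 1)
# If DRY_RUN is set to a non-empty value, the commands are printed, not executed.
run_all() {
  local config_dir="${1:-configs/simulation}"
  local repeat="${2:-1}"
  local cfg
  for cfg in "$config_dir"/*.yaml; do
    [ -e "$cfg" ] || continue  # skip when the glob matches no file
    if [ -n "${DRY_RUN:-}" ]; then
      echo "python3 run_simulation.py --config_path $cfg --repeat $repeat"
    else
      python3 run_simulation.py --config_path "$cfg" --repeat "$repeat"
    fi
  done
}
```
Calling `DRY_RUN=1 run_all configs/simulation 5` lists the commands that would be executed with `--repeat 5`; dropping `DRY_RUN=1` runs them one after another.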
