# Code for Follow-the-Perturbed-Leader in Combinatorial Multi-armed Bandit Problems

## Introduction

This repository provides a research-oriented experimental framework for **combinatorial multi-armed bandit (CMAB)** problems. It implements **Follow-the-Perturbed-Leader (FTPL)** alongside several other algorithms, and supports evaluating and comparing them in both **stochastic** and **adversarial** environments.

## Implemented Algorithms

The framework supports the following combinatorial bandit algorithms:

* FTPL
* CombUCB
* TS (Thompson Sampling)
* EXP2
* LOGBAR
* HYBRID

Each algorithm is implemented in the `ALGS/` directory and inherits from a unified base class `BANDIT`. Algorithms can be configured and tested in both stochastic and adversarial environments.
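To illustrate the core idea behind FTPL in this setting, the sketch below perturbs cumulative loss estimates with random noise and plays the arms with the smallest perturbed values. It is a minimal sketch and not the code in `ALGS/`: the function name `ftpl_select`, the `learning_rate` argument, and the choice of Fréchet noise with shape 2 are assumptions made for illustration, and the loss-estimation step that a full bandit algorithm needs is omitted.

```python
import numpy as np


def ftpl_select(cum_loss_est, num_action, learning_rate, rng):
    """Return the indices of `num_action` arms with the smallest perturbed loss.

    Hypothetical helper for illustration; the repository's FTPL class may differ.
    """
    num_arms = cum_loss_est.shape[0]
    # Frechet(shape=2) noise via inverse-CDF sampling: F(x) = exp(-x**-2).
    u = rng.uniform(size=num_arms)
    perturbation = (-np.log(u)) ** (-0.5)
    # Scaled cumulative loss estimates minus the random perturbation.
    perturbed = learning_rate * cum_loss_est - perturbation
    # argpartition places the num_action smallest entries first (unordered).
    chosen = np.argpartition(perturbed, num_action - 1)[:num_action]
    return np.sort(chosen)
```

With a large gap between arms relative to the noise scale, the selection concentrates on the arms with the lowest estimated loss; with a small `learning_rate`, the noise dominates and the choice is closer to uniform exploration.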

## Environment and Settings

The environment is defined in `Environment.py` and supports:

* **Stochastic setting:** Fixed distributions for arm losses.
* **Adversarial setting:** A stochastically constrained adversarial setting for arm losses.

Key configurable parameters:

* `time_horizon`: Number of rounds to run (e.g., `10**7`)
* `num_arms`: Total number of arms
* `num_action`: Number of arms selected each round
* `Delta`: Difficulty gap in stochastic setting
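As an illustration of how these parameters could define a stochastic instance, the sketch below builds Bernoulli mean losses in which the first `num_action` arms are better than the rest by `Delta`. The helper names and the specific means (`0.5 - Delta` and `0.5`) are assumptions for this example; the actual construction lives in `Environment.py` and may differ.

```python
import numpy as np


def make_mean_losses(num_arms, num_action, Delta):
    """Hypothetical instance: the first `num_action` arms are optimal by `Delta`."""
    means = np.full(num_arms, 0.5)
    means[:num_action] = 0.5 - Delta  # assumed placement of the gap
    return means


def sample_losses(means, rng):
    """Draw one round of Bernoulli losses, one per arm."""
    return (rng.uniform(size=means.shape[0]) < means).astype(float)
```

A smaller `Delta` makes the optimal set harder to distinguish from the other arms, so more rounds are needed before the regret curves of different algorithms separate.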

## Running Experiments

To execute experiments, run:

```bash
python main.py
```

This script runs each algorithm in both environments for a number of independent trials. Regret results are saved in:

```
results_<env_setting>/T_<time_horizon>/regret_<Algorithm>_r_<trial_index>.npy
```
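For inspecting saved results outside the plotting utilities, the following sketch rebuilds a file path from the pattern above and loads the trials with NumPy. The helper names `result_path` and `load_trials` and the `root` argument are hypothetical, and the sketch assumes `<time_horizon>` appears in the directory name as a plain integer; check the directories that `main.py` actually creates.

```python
from pathlib import Path

import numpy as np


def result_path(env_setting, time_horizon, algorithm, trial_index, root="."):
    """Build the path of one saved regret file (hypothetical helper)."""
    return (
        Path(root)
        / f"results_{env_setting}"
        / f"T_{time_horizon}"
        / f"regret_{algorithm}_r_{trial_index}.npy"
    )


def load_trials(env_setting, time_horizon, algorithm, trial_indices, root="."):
    """Load the regret arrays of the given trials into a list."""
    return [
        np.load(result_path(env_setting, time_horizon, algorithm, r, root))
        for r in trial_indices
    ]
```

For example, `load_trials("stochastic", 10**7, "FTPL", range(20))` would return one array per trial, provided those files exist.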

Modify `PARAMETERS`, `METHODS`, and `EPOCHES` in `main.py` to customize the experiment.

## Plotting Results

To visualize results, use the `draws` function in `utils.py`:

```python
from utils import draws
methods = ["FTPL", "CUCB", "TS", "EXP2", "LOGBAR", "HYBRID"]
start_epoch = {m: 0 for m in methods}
end_epoch = {m: 20 for m in methods}
draws(methods, env_setting="stochastic", start_epoch=start_epoch, end_epoch=end_epoch)
draws(methods, env_setting="adversarial", start_epoch=start_epoch, end_epoch=end_epoch)
```

Plots are saved to:

```
T1_<Time1>_T2_<Time2>_plots/<env_setting>_plot.pdf
```
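Regret plots over independent trials usually show a mean curve with a spread around it. The sketch below computes both from a list of per-trial regret arrays. It assumes each saved file holds a 1-D array of the same length; the function name `summarize_regret` is hypothetical, and `draws` may aggregate the trials differently.

```python
import numpy as np


def summarize_regret(curves):
    """Return the pointwise mean and standard deviation across trials."""
    stacked = np.stack(curves, axis=0)  # shape: (num_trials, num_points)
    return stacked.mean(axis=0), stacked.std(axis=0)
```

The two returned arrays can be passed to `matplotlib` to draw the mean curve and a shaded band of one standard deviation.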

## Dependencies and Installation

Install required packages:

```bash
pip install numpy matplotlib tqdm
```

## Project Structure

* `main.py` — runs experiments
* `Environment.py` — bandit environment
* `utils.py` — regret plotting utilities
* `ALGS/` — algorithm implementations
* `results_*/` — saved experiment results

## Notes

This framework is intended for reproducing and visualizing the performance of combinatorial bandit algorithms in controlled experiments. For quick debugging, lower `time_horizon` or the number of trials; raise them for longer runs.
