# *A Direct Approach for Handling Contextual Bandits with Latent State Dynamics* (Code for Numerical Simulation)
***

## Introduction
These instructions reproduce the numerical simulation results of the article 
**A Direct Approach for Handling Contextual Bandits with Latent State Dynamics**. 

In the code, we use the name **Linear Bandits or S-UCB-Belief** to refer to **Box A: Staged UCB with estimated beliefs** 
and **Linear Bandits or Classical LinUCB** to refer to **Box C: Contextual Linear Bandits**.

---

## Step 0: Prepare the Environment and Download the Underlying Data
1. Unzip *numerical\_simulation.zip* and use the unzipped folder *numerical\_simulation* as the **root directory**.

2. Set up the Python environment:
    - Python version: 3.11
    - Dependencies: listed in *requirements.txt*
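
Before running the scripts, it can help to confirm that the interpreter and packages match the setup above. The following is a minimal sketch using only the standard library; the helper names (`parse_requirement_names`, `missing_packages`, `check_environment`) are hypothetical and not part of the repository, and the parser only handles simple `name==version` style lines.

```python
import re
import sys
from importlib import metadata
from pathlib import Path

EXPECTED_PYTHON = (3, 11)  # version stated in this README


def parse_requirement_names(text):
    """Extract distribution names from requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that importlib.metadata cannot find installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_environment(requirements_path="requirements.txt"):
    """Return (python_version_matches, list_of_missing_packages)."""
    ok_python = sys.version_info[:2] == EXPECTED_PYTHON
    text = Path(requirements_path).read_text(encoding="utf-8")
    return ok_python, missing_packages(parse_requirement_names(text))
```

Calling `check_environment()` from the root directory returns whether the Python version is 3.11 and which listed packages are not installed. It does not compare installed versions against pinned ones.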

---

## Step 1: Prepare the Dataset
- Download the underlying dataset from 
[UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients)
and save it as *./data/default of credit card clients.xls*


- Open the script *1\_Prepare\_Data.py* and verify or modify the section **AREA OF INPUT PARAMETERS** if needed 
(see the code comments for an explanation of each parameter), as follows:

~~~
PATH = "." # Root directory; should be the directory that contains this "README.md" file
PATH_DATA = f"{PATH}/data" # Path for data
PATH_MODELS = f"{PATH}/models" # Path for models

retrain_model = True # Default False. If True, retrain the PD model; if False, load the pretrained model
random_seed = 1989 # Random seed, used only when retrain_model is True. To reproduce the PD model, set it to 1989
~~~

- Run the script *1\_Prepare\_Data.py*

Note: the prepared data is written to *./data/dt\_env.parq*.
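
A quick way to confirm that Step 1 completed is to check that both the downloaded dataset and the prepared output exist and are non-empty. The sketch below uses only the file paths named in this README; the helper `missing_files` is hypothetical and does not inspect file contents.

```python
from pathlib import Path

# Paths named in this README, relative to the root directory.
RAW_DATASET = "data/default of credit card clients.xls"
PREPARED_DATA = "data/dt_env.parq"


def missing_files(root, relative_paths):
    """Return the relative paths that are absent or empty under root."""
    root = Path(root)
    missing = []
    for rel in relative_paths:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_files(".", [RAW_DATASET, PREPARED_DATA])
    print("All files present." if not problems else f"Missing or empty: {problems}")
```

An empty list means both files are in place and the later steps can read the prepared data.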

---

## Steps 2-5: Run the Scripts for Steps 2 to 5 with Default Parameters
- Run the script *2\_Simulation\_Benchmark\_Policy.py* 10 times, once for each random seed from 1986 to 1995
- Run the script *3\_Hyperparameters\_Tuning.py*
- Run the script *4\_Linear\_Bandits\_Simulations.py* 10 times, once for each random seed from 1986 to 1995
- Run the script *5\_Staged\_Bandits\_Belief\_Simulations.py* 10 times, once for each random seed from 1986 to 1995
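
The ten seeded runs can be automated instead of editing each script by hand. The sketch below is one possible approach, not part of the repository. It assumes that each simulation script sets its seed in a single line of the form `random_seed = <integer>`, as *1\_Prepare\_Data.py* does; check the **AREA OF INPUT PARAMETERS** of each script before relying on it. The helpers `with_seed` and `run_for_all_seeds` are hypothetical. For each seed, a patched temporary copy of the script is written next to the original, run from the same directory so that relative paths still resolve, and then deleted. A call such as `run_for_all_seeds("2_Simulation_Benchmark_Policy.py")` from the root directory would run that script for seeds 1986 to 1995.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path

SEEDS = range(1986, 1996)  # 1986 to 1995 inclusive, 10 seeds

# Assumption: each script sets its seed in a line like `random_seed = 1989`.
SEED_LINE = re.compile(r"^([ \t]*random_seed[ \t]*=[ \t]*)\d+", flags=re.MULTILINE)


def with_seed(source, seed):
    """Return script source with its `random_seed = <int>` line set to seed."""
    patched, count = SEED_LINE.subn(lambda m: f"{m.group(1)}{seed}", source)
    if count != 1:
        raise ValueError(f"expected one random_seed assignment, found {count}")
    return patched


def run_for_all_seeds(script_path, seeds=SEEDS):
    """Run a patched temporary copy of the script once per seed."""
    script_path = Path(script_path).resolve()
    source = script_path.read_text(encoding="utf-8")
    for seed in seeds:
        # Write the patched copy next to the original script.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".py", dir=script_path.parent, delete=False, encoding="utf-8"
        ) as tmp:
            tmp.write(with_seed(source, seed))
        try:
            # check=True stops the loop if a run fails.
            subprocess.run([sys.executable, tmp.name], cwd=script_path.parent, check=True)
        finally:
            Path(tmp.name).unlink()
```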
---

## Step 6: Collect Simulation Results and Generate the Graphs
- Run the script *6\_Plot\_Performance\_Algorithms.py*; the **graphs** are saved in 
*./pics*