# TSCE-MF: Thompson Sampling with Continuous-time Estimation for Stochastic Games

This repository contains the official implementation of **TSCE-MF**, a Thompson Sampling algorithm with continuous-time parameter estimation, applied to multi-agent stochastic differential games.

The method updates its estimate of the unknown system parameter from episode to episode and re-optimizes the control policies accordingly. In continuous-time stochastic games with linear-quadratic structure, its regret grows sublinearly.
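To make the estimate-then-control loop concrete, here is a minimal sketch on a deliberately simplified problem: a single agent with scalar dynamics `dx = (a x + u) dt + dW`, an unknown drift `a`, and unit cost weights. It is not the repository's `TSCE_MF` class and ignores the mean-field interaction. The names `lq_gain` and `thompson_lq`, the `N(0, 1)` prior, and all default parameters are assumptions made for illustration.

```python
import numpy as np


def lq_gain(a, b=1.0, q=1.0, r=1.0):
    """Stationary feedback gain k (u = -k x) for dx = (a x + b u) dt + dW
    with running cost q x^2 + r u^2, from the scalar Riccati equation."""
    # Positive root of 2 a p - (b^2 / r) p^2 + q = 0.
    p = r * (a + np.sqrt(a**2 + b**2 * q / r)) / b**2
    return b * p / r


def thompson_lq(a_true=0.5, n_episodes=20, episode_len=5.0, dt=0.01, seed=0):
    """Episodic Thompson Sampling for the unknown drift a (b = 1 is assumed known)."""
    rng = np.random.default_rng(seed)
    mean, precision = 0.0, 1.0  # assumed Gaussian prior N(0, 1) on a
    x = 0.0
    history = []
    steps = int(round(episode_len / dt))
    for _ in range(n_episodes):
        # 1. Sample a parameter from the current posterior.
        a_sample = rng.normal(mean, 1.0 / np.sqrt(precision))
        # 2. Act optimally for the sampled parameter during the episode.
        k = lq_gain(a_sample)
        info, score = 0.0, 0.0
        for _ in range(steps):
            u = -k * x
            # Euler-Maruyama step of the true dynamics.
            dx = (a_true * x + u) * dt + np.sqrt(dt) * rng.standard_normal()
            info += x * x * dt          # approximates the integral of x^2 dt
            score += x * (dx - u * dt)  # approximates the integral of x (dX - u dt)
            x += dx
        # 3. Conjugate Gaussian update from the (discretized) continuous-time path.
        new_precision = precision + info
        mean = (precision * mean + score) / new_precision
        precision = new_precision
        history.append(mean)
    return mean, precision, history
```

Calling `thompson_lq()` returns the final posterior mean, the posterior precision, and the per-episode estimates. The posterior precision can only grow, because each episode adds the nonnegative quantity `info`.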

---

## 📁 File Overview

- **TSCE.py**  
  Implements the `TSCE_MF` class for Thompson Sampling with continuous-time estimation in an $N$-player mean-field game.

- **Large T.py**  
  Runs long-horizon experiments to evaluate the asymptotic behavior of regret and validate theoretical bounds.

- **Different dimensions.py**  
  Examines the algorithm's performance under varying state dimensions $d$, illustrating how regret scales with problem size.

- **Compare with baseline.py**  
  Compares Thompson Sampling with certainty-equivalent and other baseline controllers to highlight the benefits of Bayesian exploration.

- **Conjugate prior.py**  
  Tests the algorithm under different prior distributions (e.g., Gaussian, t-distribution, exponential, beta) to study robustness with respect to prior misspecification.


---

## 📦 Requirements

Install necessary packages using pip:

```bash
pip install numpy scipy matplotlib
```

---

## 🚀 How to Run

Navigate to the folder that contains the scripts (e.g. `~/Downloads`) and run the main script:

```bash
cd ~/Downloads
python3 TSCE.py
```

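This initializes the simulation, runs the TSCE-MF algorithm for several sample paths, and generates cumulative regret plots. The other scripts run the same way; because their file names contain spaces, quote them (for example, `python3 "Large T.py"`).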

---

## 📊 Evaluation Output

The following figures will be shown:

- **Panel (a):** Cumulative regret $R(T)$ over time
- **Panel (b):** Scaled regret $R(T) / \sqrt{T \log{T}}$
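The two panels can be reproduced from any recorded regret curve. The sketch below assumes you already have arrays of times and cumulative regret values; `scaled_regret` and `plot_regret_panels` are hypothetical helper names and are not part of the repository scripts.

```python
import numpy as np


def scaled_regret(regret, times):
    """Return R(T) / sqrt(T log T) elementwise; requires T > 1 so that log T > 0."""
    regret = np.asarray(regret, dtype=float)
    times = np.asarray(times, dtype=float)
    if np.any(times <= 1.0):
        raise ValueError("times must exceed 1 so that log(T) is positive")
    return regret / np.sqrt(times * np.log(times))


def plot_regret_panels(regret, times):
    """Draw panel (a), cumulative regret, and panel (b), scaled regret."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, (ax_a, ax_b) = plt.subplots(1, 2, figsize=(10, 4))
    ax_a.plot(times, regret)
    ax_a.set_xlabel("T")
    ax_a.set_ylabel(r"$R(T)$")
    ax_a.set_title("(a) Cumulative regret")
    ax_b.plot(times, scaled_regret(regret, times))
    ax_b.set_xlabel("T")
    ax_b.set_ylabel(r"$R(T)/\sqrt{T \log T}$")
    ax_b.set_title("(b) Scaled regret")
    fig.tight_layout()
    return fig


# Example usage with a synthetic placeholder curve (not a result from the paper):
#   times = np.linspace(2.0, 1000.0, 500)
#   regret = 3.0 * np.sqrt(times * np.log(times))
#   plot_regret_panels(regret, times).savefig("regret_panels.png")
```

If the regret grows like $\sqrt{T \log T}$, the curve in panel (b) flattens out toward a constant, which is what makes the scaled view a convenient visual check.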

---

## 📌 Notes

- The simulation uses `numpy`'s random generator. For reproducible results, set:

```python
import numpy as np
np.random.seed(2)
```
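As a quick check that seeding behaves as expected, the sketch below draws the same noise twice from the same seed. It shows both the global seeding call used above and NumPy's newer `Generator` interface. Whether the repository scripts could be switched to a `Generator` is not stated here, so treat the second helper as an optional alternative; `draw_legacy` and `draw_generator` are hypothetical names.

```python
import numpy as np


def draw_legacy(seed, n=5):
    """Seed NumPy's global random state, then draw n standard normals from it."""
    np.random.seed(seed)
    return np.random.standard_normal(n)


def draw_generator(seed, n=5):
    """Draw n standard normals from an independent, locally seeded Generator."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n)


# The same seed gives identical draws within each interface.
same_legacy = np.array_equal(draw_legacy(2), draw_legacy(2))
same_generator = np.array_equal(draw_generator(2), draw_generator(2))
```

The global seed affects every later call to `np.random.*` in the process, so it should be set once before any simulation code runs. A local `Generator` keeps the random stream confined to the code that receives it.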
