This code is based on the paper *Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation*.
* The implementation is in [Experiment.ipynb](Experiment.ipynb).


## Requirements

Install the dependencies with `uv`:

```bash
uv sync
```

## Run Experiments and Plot the Results

Run all the cells in [Experiment.ipynb](Experiment.ipynb).
The results are saved as one PDF per environment:

* `results-random.pdf` for the tabular env
* `results-streaming.pdf` for the streaming env
* `results-linear.pdf` for the linear env
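
After the notebook finishes, a quick check along the following lines can confirm that every result file was produced. This is a minimal sketch and not part of the repository: the helper `missing_results` is hypothetical, and it assumes the PDFs are written to the directory the notebook was run from, which this README does not state.

```python
from pathlib import Path

# File names as listed in this README. The output directory is an assumption
# (the notebook's working directory).
EXPECTED_RESULTS = {
    "tabular": "results-random.pdf",
    "streaming": "results-streaming.pdf",
    "linear": "results-linear.pdf",
}


def missing_results(directory="."):
    """Return the environments whose result PDF is absent from `directory`."""
    base = Path(directory)
    return [
        env
        for env, name in EXPECTED_RESULTS.items()
        if not (base / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_results()
    if missing:
        print("Missing results for:", ", ".join(missing))
    else:
        print("All result files found.")
```

If any environment is reported as missing, rerunning the corresponding cells in the notebook is the likely fix.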