>Under Development
# D2T2: Decision Transformer with Temporal Difference via Steering Guidance

This repo contains the code for our work D2T2: Decision Transformer with Temporal Difference via Steering Guidance, which performs reinforcement learning via supervised learning. We evaluate D2T2 on a stochastic autonomous driving problem.

### Installation:
All Python dependencies are listed in `environment.yml`. Install them with:
```
conda env create -f environment.yml
conda activate implicit_DT
pip install -U scikit-learn
```
### Training:
Select a dataset (`stop30`, `stop40`, ..., `stop80`, `stop_random`) and an algorithm (`DT`, `TT`, `BC`, `IQL`, `BCQ`, `GDT` (D2T2), `SPLT`), then pass them with `--dataset` and `--algo`:

```
python toycar_train.py --device cuda:1 --exp_name DT_70 --dataset stop70 --algo DT
```
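
To train on several datasets in a row, the command above can be generated programmatically. The following is a minimal sketch, not part of the repo: `exp_name` and `build_train_command` are hypothetical helpers, and the `<algo>_<suffix>` naming only mirrors the `DT_70` / `stop70` pairing shown above. The flags are the ones used in the training command; the dataset list contains only names that appear in this README.

```python
import shlex


def exp_name(algo, dataset):
    """Hypothetical naming helper: ("DT", "stop70") -> "DT_70"."""
    # Drop the "stop" prefix and any leading underscore ("stop_random" -> "random").
    suffix = dataset[len("stop"):].lstrip("_")
    return f"{algo}_{suffix}"


def build_train_command(algo, dataset, device="cuda:0"):
    """Build the argv list for toycar_train.py using the flags shown in this README."""
    return [
        "python", "toycar_train.py",
        "--device", device,
        "--exp_name", exp_name(algo, dataset),
        "--dataset", dataset,
        "--algo", algo,
    ]


if __name__ == "__main__":
    # Print one command per dataset; pipe the output to a shell or copy it by hand.
    for dataset in ["stop30", "stop70", "stop_random"]:
        print(shlex.join(build_train_command("DT", dataset, device="cuda:1")))
```

Printing the commands instead of launching them keeps the sketch side-effect free; to launch the runs directly, each list can be passed to `subprocess.run`.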


### Testing:
Select a testing script (`dt_toycar_plan.py`, `tt_toycar_plan.py`, `splt_toycar_plan.py`, ...) and run it:
```
# Assumes the planning scripts accept the same --device flag as toycar_train.py
python dt_toycar_plan.py --device cuda:3
```
