# TWO-STAGE COVERAGE EXPANSION FOR CROSS-DOMAIN OFFLINE REINFORCEMENT LEARNING VIA SCORE-BASED GENERATIVE MODELING



## Install
Our code is developed on top of OTDF (Lyu et al., 2025); see https://github.com/dmksjfl/OTDF.

To run this repo, set up the environment as follows:

`conda create -n tce python=3.10` \
`conda activate tce` \
`pip install torch==1.11` \
`pip install ott-jax==0.4.5` \
`pip install jaxopt==0.8.3` \
`pip install jax==0.4.9` \
`pip install jaxlib==0.4.9` \
`pip install gym` \
`pip install pyyaml` \
`pip install tqdm` \
`pip install h5py` \
`pip install d4rl` \
`pip install 'cython<3.0'` \
`pip install ml_dtypes==0.2.0` \
`pip install pot` \
`pip install matplotlib` \
`pip install seaborn` \
`pip install numpy==1.24.3` \
`pip install six` \
`pip install git+https://github.com/aravindr93/mjrl@master#egg=mjrl` \
`pip install umap-learn` \
`pip install scikit-learn` \
`pip install d3rlpy==1.1.1` \
`pip install wandb`
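
After installing, it can help to confirm that the pinned packages actually resolved to the versions listed above, since later `pip install` calls may upgrade earlier ones. The following is a minimal sketch, not part of this repo: the helper names `matches_pin` and `check_pins` are hypothetical, and the pins are copied from the install list.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pins copied from the install list above.
PINS = {
    "torch": "1.11",
    "ott-jax": "0.4.5",
    "jaxopt": "0.8.3",
    "jax": "0.4.9",
    "jaxlib": "0.4.9",
    "ml_dtypes": "0.2.0",
    "numpy": "1.24.3",
    "d3rlpy": "1.1.1",
}


def matches_pin(installed, pin):
    """Return True if the installed version begins with the pinned components.

    Local build tags such as "+cu113" are ignored, so "1.11.0+cu113"
    satisfies the pin "1.11".
    """
    installed_parts = installed.split("+")[0].split(".")
    pin_parts = pin.split(".")
    return installed_parts[: len(pin_parts)] == pin_parts


def check_pins(pins):
    """Map each package name to "ok", "missing", or a mismatch message."""
    report = {}
    for name, pin in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if matches_pin(installed, pin):
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {installed})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}=={PINS[name]}: {status}")
```

Run it inside the `tce` environment; any line not ending in `ok` points to a package worth reinstalling with its pin.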



## Run TCE

To run the entire pipeline, from score-model training to offline RL, run `run_tce.py` with the following arguments:

```
CUDA_VISIBLE_DEVICES=0 python run_tce.py --env halfcheetah-morph --srctype medium-replay --tartype expert --tr_score 1 --state_score_epoch 10000 --tran_score_epoch 5000 --num_gen 500000 --deno_steps 500 --model_save_dir train1 --ymax=0.2 --use_z_thresh=1 --z_thresh=3.0 --max_frac=0.0 --reg_weight=0.001 --seed 4895 --idn train1
```

`run_tce.py` itself issues the `run_offlinerl.py` command, so the offline RL stage runs automatically.
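
To repeat the command above over several seeds, a small launcher can assemble one command per run. This is a hedged sketch rather than a script shipped with the repo: `build_tce_command` and `launch` are hypothetical helpers, every seed other than 4895 is a placeholder, and reusing one name for both `--model_save_dir` and `--idn` simply mirrors the example. By default it only prints the commands as a dry run.

```python
import os
import shlex
import subprocess

# Shared flags, copied verbatim from the example command above.
BASE_COMMAND = (
    "python run_tce.py --env halfcheetah-morph --srctype medium-replay "
    "--tartype expert --tr_score 1 --state_score_epoch 10000 "
    "--tran_score_epoch 5000 --num_gen 500000 --deno_steps 500 "
    "--ymax=0.2 --use_z_thresh=1 --z_thresh=3.0 --max_frac=0.0 "
    "--reg_weight=0.001"
)


def build_tce_command(seed, run_name):
    """Return the argument list for one run with its own seed and run name."""
    return shlex.split(BASE_COMMAND) + [
        "--model_save_dir", run_name,
        "--seed", str(seed),
        "--idn", run_name,
    ]


def launch(cmd, gpu="0"):
    """Run one command on the given GPU; raises if the process fails."""
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": gpu}
    subprocess.run(cmd, env=env, check=True)


# 4895 comes from the example above; the other seeds are placeholders.
SEEDS = [4895, 0, 1]
commands = [
    build_tce_command(seed, f"train{i}") for i, seed in enumerate(SEEDS, start=1)
]

# Dry run: print each command. Call launch(cmd) instead to execute it.
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the shared flags in one string means a change to, say, `--deno_steps` applies to every seed, while the per-run values stay explicit in `build_tce_command`.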



