# ASOR
[ICLR 2025 Submission] Code for the paper "ASOR: Anchor State Regularized Policy Optimization for Policy Optimization under Dynamics Shift".


## Installation

Follow the installation steps in the [OfflineRL](https://github.com/polixir/OfflineRL) repository.

## Prepare Offline Dataset

Download the dataset files from [Google Drive](https://drive.google.com/file/d/19Bc8LSE38A67LH3ZCaZDXDuHEc7tC35G/view?usp=sharing), then set the `path` parameter on line 15 of `examples/train_d4rl.py` so that it points to the downloaded files.


## Run the ASOR algorithm

```
python examples/train_d4rl.py --algo_name=srpo_plus --exp_name=ASOR --seed 1 --task density_10,body_mass@walker2d-medium-expert-v0 --rew_reg_eta 0.1 --out_train_epoch 200 --device cuda:0
```

To train on a different offline RL environment, replace `walker2d-medium-expert-v0` in the `--task` argument with that environment's name. To run a baseline algorithm instead, replace `srpo_plus` in `--algo_name` with `maple_st` (for SRPO), `maple`, `mopo`, `cql`, etc.