# Diversity-regularized RL

Code for NeurIPS 2025 submission 12302, "Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization".

#### Installation

```
conda create -n mmrl python=3.12.2
conda activate mmrl
pip install -r requirements.txt
```

#### Run Training

For DrAC, DACER and SAC:
```
python train.py [algo] --task [env_id] ...
```

For example: `python train.py DrAC --task MultiGoalPointMaze`

For SQL and SSAC:

```
python train.py --algo [algo] --task [env_id]
```

Default configurations are defined in `rl/configs.yaml`. To override a setting, pass it as a command-line argument; argument names are the same as the keys in `configs.yaml`.


To run DrAC with a diffusion actor, pass `--actor_type diffusion` as an argument.
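
The override behavior described above can be pictured with a minimal sketch. This is not the repository's actual code: the `DEFAULTS` dictionary stands in for the parsed contents of `rl/configs.yaml`, and every key and value except the `actor_type` key and its `diffusion` value is a hypothetical placeholder. It only illustrates the general pattern of exposing each config key as a command-line flag whose default comes from the file.

```python
import argparse

# Hypothetical stand-in for the parsed contents of rl/configs.yaml.
# Only the key "actor_type" appears in this README; the default value
# "gaussian" and the keys "lr" and "seed" are illustrative assumptions.
DEFAULTS = {"actor_type": "gaussian", "lr": 3e-4, "seed": 0}


def build_parser(defaults):
    """Expose every config key as a --key flag that defaults to the file value."""
    parser = argparse.ArgumentParser()
    for key, value in defaults.items():
        # Reuse the type of the default so "0.001" is parsed as a float, etc.
        parser.add_argument(f"--{key}", type=type(value), default=value)
    return parser


def resolve_config(defaults, argv):
    """Return the defaults with any command-line overrides applied."""
    args = build_parser(defaults).parse_args(argv)
    return vars(args)


# Equivalent to passing "--actor_type diffusion" on the command line.
config = resolve_config(DEFAULTS, ["--actor_type", "diffusion"])
print(config)
```

Keys that are not passed keep their default values, so only the settings that differ from `configs.yaml` need to appear on the command line.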

#### Evaluation
Evaluation runs alongside training. Results are visualized with the functions defined in `run_batched_viz.py`.
