<div align="center">

<div id="user-content-toc" style="margin-bottom: 50px">
  <ul align="center" style="list-style: none;">
    <summary>
      <h1>Transitive RL</h1>
      <br>
      <h3>Value Learning via Divide and Conquer</h3>
      <br>
    </summary>
  </ul>
</div>

</div>

## Installation

TRL requires Python 3.9+ and is built on JAX. The main dependencies are
`jax >= 0.4.26`, `ogbench == 1.1.0`, and `gymnasium == 0.29.1`.
To install all dependencies, run:
```bash
pip install -r requirements.txt
```
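
After installing, it can be useful to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of this repository: the helper names (`parse`, `satisfies`, `check`) are hypothetical, the version comparison is deliberately simplified to the numeric `major.minor.patch` components, and `requirements.txt` remains the source of truth.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Version constraints as stated in this README (assumption: they mirror requirements.txt).
EXPECTED = {
    "jax": (">=", "0.4.26"),
    "ogbench": ("==", "1.1.0"),
    "gymnasium": ("==", "0.29.1"),
}


def parse(v):
    """Turn '0.4.26' into (0, 4, 26), keeping only leading digits of each component."""
    parts = []
    for piece in v.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings under '>=' or '=='."""
    if op == ">=":
        return parse(installed) >= parse(required)
    return parse(installed) == parse(required)


def check(expected=EXPECTED):
    """Return a {package: status} report for the current environment."""
    report = {}
    for name, (op, required) in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if satisfies(installed, op, required):
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, expected {op} {required}"
    return report


if __name__ == "__main__":
    print("python:", "ok" if sys.version_info >= (3, 9) else "needs 3.9+")
    for name, status in check().items():
        print(f"{name}: {status}")
```

Any package reported as `missing` or mismatched can usually be fixed by rerunning the `pip install` command above in a clean virtual environment.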

## Usage

The main implementation of TRL is in [agents/trl.py](agents/trl.py),
and our implementations of baselines (GCIQL, GCIVL, GCBC, GCFBC, QRL, CRL, SGT, COE)
can also be found in the same directory.

Here is an example command for TRL:

```bash
# TRL on OGBench scene
python main.py \
  --env_name=scene-play-oraclerep-v0 \
  --agent=agents/trl.py \
  --agent.dp_type=mid \
  --agent.mid.lam=1.0 \
  --agent.mid.subgoal_ct=5 \
  --agent.mid.quantile=0.5 \
  --agent.mid.expectile=0.7 \
  --agent.pe_type=rpg \
  --agent.rpg.alpha=1 \
  --agent.oracle_distill=True \
  --offline_steps=1000000 \
  --agent.actor_hidden_dims="(512, 512, 512)" \
  --agent.value_hidden_dims="(512, 512, 512)" \
  --agent.actor_geom_sample=False \
  --agent.actor_p_trajgoal=1.0 \
  --agent.actor_p_randomgoal=0.0 \
  --agent.value_geom_sample=True \
  --agent.discount=0.99
```

For TD-n, set `--agent.dp_type=td --agent.td.n=_`, replacing `_` with the desired n.

For MC, set `--agent.dp_type=mc`.
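
When switching between these variants often, it may be convenient to generate the command line from a dictionary of flags rather than editing it by hand. The sketch below is an illustration only: `BASE_FLAGS` and `build_command` are hypothetical names that do not exist in this repository, and the flag values are copied from the example TRL command above. It only overrides `agent.dp_type` and leaves the remaining flags unchanged.

```python
import shlex

# Flags copied verbatim from the example TRL command in this README.
BASE_FLAGS = {
    "env_name": "scene-play-oraclerep-v0",
    "agent": "agents/trl.py",
    "agent.dp_type": "mid",
    "agent.mid.lam": "1.0",
    "agent.mid.subgoal_ct": "5",
    "agent.mid.quantile": "0.5",
    "agent.mid.expectile": "0.7",
    "agent.pe_type": "rpg",
    "agent.rpg.alpha": "1",
    "agent.oracle_distill": "True",
    "offline_steps": "1000000",
    "agent.actor_hidden_dims": "(512, 512, 512)",
    "agent.value_hidden_dims": "(512, 512, 512)",
    "agent.actor_geom_sample": "False",
    "agent.actor_p_trajgoal": "1.0",
    "agent.actor_p_randomgoal": "0.0",
    "agent.value_geom_sample": "True",
    "agent.discount": "0.99",
}


def build_command(overrides=None):
    """Merge overrides into the base flags and return a shell-quoted command string."""
    flags = {**BASE_FLAGS, **(overrides or {})}
    argv = ["python", "main.py"] + [f"--{key}={value}" for key, value in flags.items()]
    # shlex.join quotes arguments containing spaces or parentheses (e.g. the hidden dims).
    return shlex.join(argv)


# MC variant: only the dp_type flag changes.
mc_command = build_command({"agent.dp_type": "mc"})
print(mc_command)
```

For TD-n, the same helper could be called with `{"agent.dp_type": "td", "agent.td.n": n}` for a chosen `n`.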

Tuned hyperparameters for each environment and agent are provided in the paper.


## Acknowledgments

This codebase is built on top of reference implementations from [Horizon Reduction Makes RL Scalable](https://github.com/seohongpark/horizon-reduction/tree/master).
