# Representation-based Reinforcement Learning

Implementations of Soft Actor-Critic (SAC) agents enhanced with latent-variable and contrastive representations, primarily following [1] and [2].

## Project Layout

- `agent/` – SAC baselines plus feature-augmented variants (LV, CTRL, random Fourier features, bonus variants).
- `networks/` – shared neural network backbones (policy, critics, VAEs).
- `utils/` – replay buffer, evaluation helpers, plotting utilities.
- `main.py` – main training script.
- `plot.py` – utilities for aggregating TensorBoard logs into publication-ready figures.
- `log/` – sample outputs from previous runs (not needed for installation).

## Quick Start

To train an agent, pass the algorithm, environment, and seed to `main.py`:

```bash
python main.py --alg sac --env HalfCheetah-v5 --seed 0
python main.py --alg rffsac --env HalfCheetah-v5 --seed 0
python main.py --alg rffsac_bonus --env HalfCheetah-v5 --seed 0
```
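
The plotting command further down aggregates seeds 0 to 3, so each algorithm typically needs one run per seed. A minimal sketch of such a sweep follows; it only uses the `--alg`, `--env`, and `--seed` flags shown above, and the function name `sweep_commands` is a hypothetical helper, not part of this repository. The script prints the commands rather than executing them, so they can be inspected first.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Values taken from the Quick Start examples; adjust as needed.
ENV_NAME="HalfCheetah-v5"
ALGS=(sac rffsac rffsac_bonus)
SEEDS=(0 1 2 3)

# Print one training command per (algorithm, seed) pair.
# sweep_commands is an illustrative helper, not a function of this repo.
sweep_commands() {
  local alg seed
  for alg in "${ALGS[@]}"; do
    for seed in "${SEEDS[@]}"; do
      echo "python main.py --alg ${alg} --env ${ENV_NAME} --seed ${seed}"
    done
  done
}

sweep_commands
```

Saved under a name such as `sweep.sh` (hypothetical), the output can be piped into a shell to run the jobs one after another, for example `bash sweep.sh | bash`.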

TensorBoard summaries are written under `log/<env>/<alg>/<dir>/<seed>`.
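
Before plotting, it can help to confirm that every seed has a log directory. The sketch below checks the layout described above; `check_runs` is a hypothetical helper, and treating `<dir>` as `0` is an assumption based on the `--dir 0` argument in the plotting example.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Report which seed directories exist under <root>/<env>/<alg>/<dir>/.
# Usage: check_runs <root> <env> <alg> <dir> <seed>...
# check_runs is an illustrative helper, not part of this repository.
check_runs() {
  local root="$1" env_name="$2" alg="$3" run_dir="$4"
  shift 4
  local seed path
  for seed in "$@"; do
    path="${root}/${env_name}/${alg}/${run_dir}/${seed}"
    if [ -d "$path" ]; then
      echo "found   ${path}"
    else
      echo "missing ${path}"
    fi
  done
}

# Assumes <dir> is 0, as in the plotting example.
check_runs log HalfCheetah-v5 sac 0 0 1 2 3
```

The same root directory can also be browsed interactively with `tensorboard --logdir log`.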

To plot the logged results across algorithms and seeds, run:

```bash
python plot.py --alg sac rffsac rffsac_bonus --env HalfCheetah-v5 --dir 0 --seeds 0 1 2 3 --tags info/evaluation
```

## References

[1] [Ren, Tongzheng et al. "Latent variable representation for reinforcement learning." arXiv:2212.08765 (2022).](https://arxiv.org/abs/2212.08765)

[2] [Zhang, Tianjun et al. "Making linear MDPs practical via contrastive representation learning." ICML 2022.](https://arxiv.org/abs/2207.07150)
