# Representation-Based Exploration for Language Models: From Test-Time to Post-Training

This is the official code repository for the post-training (RL) part of the paper "Representation-Based Exploration for Language Models: From Test-Time to Post-Training", which is currently under submission at ICLR 2026.

## Installation

Please install verl from https://github.com/volcengine/verl.

## Running Experiments

We provide the following training scripts:

(1) RepExp
```
sh scripts/train_elliptical.sh
```
(2) Unlikeliness
```
sh scripts/train_unlikely.sh
```
(3) GRPO
```
sh scripts/train_grpo.sh
```
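
If you want to launch a method by name (for example, from a sweep script), the mapping above can be wrapped in a small helper. This is a minimal sketch and not part of the repository: the function name `script_for` and the labels `repexp`, `unlikeliness`, and `grpo` are hypothetical; only the script paths come from the list above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): map a method label to the
# training script listed above. The labels are illustrative assumptions.
script_for() {
    case "$1" in
        repexp)       echo "scripts/train_elliptical.sh" ;;
        unlikeliness) echo "scripts/train_unlikely.sh" ;;
        grpo)         echo "scripts/train_grpo.sh" ;;
        *)            echo "unknown method: $1" >&2; return 1 ;;
    esac
}

# Example usage from the repository root:
#   sh "$(script_for grpo)"
```

An unknown label prints an error to stderr and returns a non-zero status, so a calling script can stop before launching anything.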

To evaluate a trained model, run `sh scripts/eval/eval.sh`, and make sure it points to the checkpoint you want to evaluate.
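
Since a wrong checkpoint path is an easy mistake to make, it can help to check the path before launching evaluation. The following is a minimal sketch, assuming the checkpoint is a directory on disk. The function `require_ckpt` and the variable `CKPT_DIR` are hypothetical and are not read by `scripts/eval/eval.sh`, so the checkpoint still has to be set for that script separately.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of this repo): fail early if the
# checkpoint directory is missing or empty.
require_ckpt() {
    if [ ! -d "$1" ]; then
        echo "checkpoint directory not found: $1" >&2
        return 1
    fi
    if [ -z "$(ls -A "$1")" ]; then
        echo "checkpoint directory is empty: $1" >&2
        return 1
    fi
}

# Example usage (CKPT_DIR is an assumed variable you set yourself):
#   require_ckpt "$CKPT_DIR" && sh scripts/eval/eval.sh
```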