# Repository for DarkRoom I-PRL Setting 

This repo contains code for the DarkRoom environment in the Immediate Preference Reinforcement Learning (I-PRL) setting. To facilitate understanding, we provide notebooks that demonstrate how to use the implemented modules to:
- Generate Pretraining Data
- Pretrain Decision Preference Pretrained Transformer (DP2T) 
- Pretrain In-Context Preference Optimization (ICPO)
- Evaluate Pretrained Models in Unseen Tasks

## Pretraining Data Generation

See ***pref-data.ipynb***. 

## DP2T Pretraining

See ***pref-dp2t.ipynb***.

## ICPO Pretraining

See ***pref-icpo.ipynb***.

## Evaluation

See ***Evaluation.ipynb***.