# Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision

## **Overview**

## **Installation**
Experiments require MuJoCo and Metaworld. Follow the installation instructions for [MuJoCo](https://github.com/openai/mujoco-py) and [Metaworld](https://github.com/Farama-Foundation/Metaworld).
Then create a conda virtual environment and install the dependencies listed in `requirements.txt`:
```shell
conda create -n t2da python=3.8.20 -y
conda activate t2da
pip install -r requirements.txt
```

## **Train Trajectory Encoder**
Train the Trajectory Encoder on different tasks in PointRobot-v0:
```shell
python train_traj_encoder.py --env point-robot --context_horizon 20
```

The trained checkpoint will be saved in `saves_world_model/point-robot/`.

## **Align Text Encoder with Trajectory Encoder**
Fine-tune the text encoder so that its text embeddings align with the dynamics-aware decision embeddings from the trajectory encoder:
```shell
python train_align.py --text_encoder clip --env point-robot --context_horizon 20
```
Change the `--text_encoder` argument to use T5 or BERT instead of CLIP as the text encoder.
The trained checkpoint will be saved in `saves_align/point-robot/`.
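
Before moving on to downstream training, it can help to confirm that both checkpoint directories exist. The snippet below is a minimal sketch, not part of the repository: `check_dir` is a hypothetical helper, and it assumes you run it from the repository root with the default `point-robot` output paths mentioned above. Whether the downstream scripts actually load from these paths is also an assumption.

```shell
# Hypothetical helper (not part of the repository): report whether a
# checkpoint directory exists.
check_dir() {
  if [ -d "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Output directories named in the sections above.
check_dir saves_world_model/point-robot
check_dir saves_align/point-robot
```

A `missing:` line suggests that the corresponding training step has not been run yet, or that it was run from a different working directory.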

## **Downstream Task Training**
### Text-to-Decision Diffuser
Train the Text-to-Decision Diffuser on different tasks in PointRobot-v0:
```shell
python train_t2d_diffuser.py --env point-robot --prompts_type aligned_clip
```

Evaluate the Text-to-Decision Diffuser on different tasks in PointRobot-v0:
```shell
python evaluate_parallel.py --env point-robot --prompts_type aligned_clip
```

### Text-to-Decision Transformer
Train the Text-to-Decision Transformer on different tasks in PointRobot-v0:
```shell
python train_t2d_transformer.py --env point-robot --prompts_type aligned_clip
```