# DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning

This repository is the official implementation of DAIL. 

## Contents
- [DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning](#dail-beyond-task-ambiguity-for-language-conditioned-reinforcement-learning)
  - [Contents](#contents)
  - [Codebase Structure](#codebase-structure)
  - [Getting Started](#getting-started)
    - [BabyAI](#babyai)
    - [ALFRED](#alfred)
    - [Dataset Structure](#dataset-structure)
  - [Training](#training)
  - [Evaluation](#evaluation)
  - [Pre-trained Models](#pre-trained-models)


## Codebase Structure
```
\- ALFRED
  \- alfred # code for the ALFRED environment
  \- agents
  \- baselines
  \- data
  \- networks
  \- utils
  - config.py
  - eval.py
  - train.py # main training code
  - requirements.txt
\- BabyAI
  \- babyai # code for the BabyAI environment
  \- agents
  \- baselines
  \- data
  \- networks
  \- utils
  - eval.py
  - train.py # main training code
  - train_medium.py # main training code for medium dataset
  - requirements.txt
\- CLIP # code for CLIP
- README.md
```

## Getting Started
The code was run on a machine reporting `NVIDIA-SMI 535.129.03, CUDA Version: 12.2`. To run it, first prepare the environments and datasets as described below:

### BabyAI
- The following steps apply to the `BabyAI` folder.
- Prepare environment:
    - Create a new environment with `python==3.9.0`.
    - We provide a requirements file to install all the dependencies: `pip install -r requirements.txt`.
- Prepare data:
    - Download the offline datasets from [dataset](https://drive.google.com/file/d/1XoGF16dEv7owRO1zNI3Vr0u7-WNqLORU/view?usp=sharing) and put `Synth_mixed_Bot50000_IL50000_Random25000_preprocessed_merged.pk` into `BabyAI/data/trajs/`.
    - Put pre-trained models into `BabyAI/data/models/`.
    - Put `goal_tensor.pk` and `in_missions.pk` into `BabyAI/data/`.
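
Before training, it can help to confirm that the files listed above are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_babyai_data` is hypothetical, and it only checks the paths named in the steps above, relative to the repository root.

```python
from pathlib import Path

# Paths taken from the data-preparation steps above, relative to the repository root.
EXPECTED_BABYAI_FILES = [
    "BabyAI/data/trajs/Synth_mixed_Bot50000_IL50000_Random25000_preprocessed_merged.pk",
    "BabyAI/data/goal_tensor.pk",
    "BabyAI/data/in_missions.pk",
]
EXPECTED_BABYAI_DIRS = ["BabyAI/data/models"]


def missing_babyai_data(repo_root="."):
    """Return the expected paths that are not present (hypothetical helper)."""
    root = Path(repo_root)
    missing = [p for p in EXPECTED_BABYAI_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_BABYAI_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    # Print one line per missing path; print nothing if everything is in place.
    for path in missing_babyai_data():
        print(f"missing: {path}")
```

Run it from the repository root. It does not check file contents, only that each path exists.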

### ALFRED
- The following steps apply to the `ALFRED` folder.
- Prepare environment:
    - Create a new environment with `python==3.9.21`.
    - We provide a requirements file to install all the dependencies: `pip install -r requirements.txt`.
    - Install ALFRED and AI2THOR:
        - Please follow [ALFRED](https://github.com/askforalfred/alfred/tree/master) to install ALFRED and AI2THOR.
        - Run `python ALFRED/alfred/scripts/check_thor.py` to check whether ALFRED was installed successfully.
- Prepare data:
    - Download the Modeling Quickstart dataset from [ALFRED-dataset](https://github.com/askforalfred/alfred/tree/master/data).
    - Unzip the files and move the `json_feat_2.1.0` folder into `ALFRED/alfred/data`.
    - Run `python ALFRED/utils/data_preprocess.py` to preprocess the data; the output is stored in `ALFRED/data/preprocessed`.
    - Move the additional data (the `trajs` folder and `id2pt.pt`) from [dataset](https://drive.google.com/file/d/1XoGF16dEv7owRO1zNI3Vr0u7-WNqLORU/view?usp=sharing) into `ALFRED/data/additional`.
- Modify the configuration in `ALFRED/config.py` and run `python ALFRED/train.py` to start training.

### Dataset Structure

We provide the high-quality and medium-quality datasets and the in-distribution/out-of-distribution split file for BabyAI in [dataset](https://drive.google.com/file/d/1XoGF16dEv7owRO1zNI3Vr0u7-WNqLORU/view?usp=sharing). The additional noisy data used in the ALFRED environment and the pre-trained models are also included at this link.

```
\- BabyAI-dataset
  \- high-quality
  \- medium-quality
  \- models
  - in_missions.pk
  - goal_tensor.pk
\- ALFRED-dataset
  \- additional_data
    \- trajs
    - id2pt.pt
  \- models
```
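
The mapping from this download layout to the repository folders can be sketched as a small copy script. This is an illustration under assumptions, not a script shipped with the repository: the helper name `place_downloads` is hypothetical, and it assumes the archive extracts to the folder names shown in the tree above. The `high-quality` and `medium-quality` folders are left out because their contents are not listed here; the trajectory file still has to be placed in `BabyAI/data/trajs/` by hand.

```python
import shutil
from pathlib import Path

# (source inside the extracted download, destination inside the repository).
# Source names follow the tree above; destinations follow the setup steps in this README.
PLACEMENTS = [
    ("BabyAI-dataset/in_missions.pk", "BabyAI/data/in_missions.pk"),
    ("BabyAI-dataset/goal_tensor.pk", "BabyAI/data/goal_tensor.pk"),
    ("BabyAI-dataset/models", "BabyAI/data/models"),
    ("ALFRED-dataset/additional_data/trajs", "ALFRED/data/additional/trajs"),
    ("ALFRED-dataset/additional_data/id2pt.pt", "ALFRED/data/additional/id2pt.pt"),
    ("ALFRED-dataset/models", "ALFRED/data/models"),
]


def place_downloads(download_root, repo_root="."):
    """Copy known files and folders into place; return the sources that were not found."""
    not_found = []
    for src_rel, dst_rel in PLACEMENTS:
        src = Path(download_root) / src_rel
        dst = Path(repo_root) / dst_rel
        if not src.exists():
            not_found.append(src_rel)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            # Merge the folder contents into the destination (Python 3.8+).
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
    return not_found
```

Calling `place_downloads("/path/to/extracted/download")` from the repository root copies whatever it finds and returns the entries it could not locate, so a partial download is reported rather than silently skipped.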

## Training

To train DAIL models, run the following script in each environment folder:

```
python train.py
```

- [Note] To run an ablation experiment that studies the contribution of each component of our learning framework, modify the `if_clip` and `model_type` arguments and then run the training or evaluation scripts.

## Evaluation

To evaluate DAIL models, run the following script in each environment folder. You can load either your own trained models or the pre-trained models we provide after modifying the model file path in the script.

```
python eval.py
```

## Pre-trained Models

- You can download the pre-trained models from [dataset](https://drive.google.com/file/d/1XoGF16dEv7owRO1zNI3Vr0u7-WNqLORU/view?usp=sharing) and move them into `*/data/models/`.
- Modify the model file paths in `*/eval.py` so that `agent.load_model()` loads the pre-trained models.

