# REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

This paper explores how linguistic vagueness in referring expressions (REs) within human instructions affects LLM-based robot task planning and how to overcome this issue. To this end, we propose the first robot task planning benchmark with vague REs, called **REI-Bench**, where we discover that the vagueness of REs can severely degrade robot planning performance.

This repository provides:
- The **REI-Bench** benchmark datasets with vague referring expressions (REs)
- Evaluation scripts for LLM-based robot task planners
- Example usage and configuration files

## Dataset Example
Example data can be found in `dataset/ExampleData100.json`. 
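
To get a first look at the example file, a small script along the following lines can help. It is only a sketch: the schema of `ExampleData100.json` is not documented here, so the helper `summarize_json` (a hypothetical name, not part of this repo) makes no assumption about field names and simply reports how many entries the file holds and which keys the first entry has.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return (number of entries, sorted field names of the first entry)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the file is either a list of records or a dict of records.
    if isinstance(data, dict):
        records = list(data.values())
    elif isinstance(data, list):
        records = data
    else:
        records = [data]
    first = records[0] if records else {}
    fields = sorted(first) if isinstance(first, dict) else []
    return len(records), fields


if __name__ == "__main__":
    example = Path("dataset/ExampleData100.json")
    if example.exists():
        count, fields = summarize_json(example)
        print(f"{count} entries; fields of the first entry: {fields}")
    else:
        print(f"{example} not found; run this from the repository root.")
```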

## Install

1. Clone the whole repo.
    ```bash
    git clone {repo_url}
    ```

1. Set up a virtual environment.
    ```bash
    conda create -n {env_name} python=3.8
    conda activate {env_name}
    ```

1. Install PyTorch (2.0.0) first (see https://pytorch.org/get-started/locally/).
    ```bash
    # exemplary install command for PyTorch 2.0.0 with CUDA 11.7
    pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 --index-url https://download.pytorch.org/whl/cu117
    ```

1. Install the Python packages listed in `requirements.txt`.
    ```bash
    pip install -r requirements.txt
    ```
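
After the steps above, it can be worth confirming which versions ended up in the environment. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper, not part of this repo. If the install followed the commands above, the reported `torch` version should start with `2.0.0`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages pinned in the PyTorch install step.
for name in ("torch", "torchvision"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```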


### Download the ALFRED dataset
```bash
cd alfred/data
sh download_data.sh json
```

### Write the REI-Bench dataset
```bash
python dataset/write_REI.py
```

## Quick Start
### **1. Run with Default Configuration**

You can run the program directly with:

```bash
python src/evaluate.py
```

### **2. Run with a Different Planner**

You can override planner settings from the configuration file, such as the model name and planner framework, on the command line. For example, the following sets the model through the `planner.model` key:

```bash
python src/evaluate.py --config-name=config_alfred planner.model=deepseek-ai/deepseek-math-7b-instruct
```
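
The `key.subkey=value` syntax in the command above addresses a nested entry in the configuration. The sketch below illustrates that mapping in plain Python. It is not the repo's actual config loader, `apply_overrides` is a hypothetical helper, and the base config contents are made up for illustration. Values are kept as plain strings here, whereas a real config library may convert types.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b=value` style overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        # Walk (or create) the nested dicts named by the dotted key.
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Illustrative base config; the real file's contents are an assumption.
base = {"planner": {"model": "placeholder-model"}}
updated = apply_overrides(
    base, ["planner.model=deepseek-ai/deepseek-math-7b-instruct"]
)
print(updated["planner"]["model"])
```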

### **3. Run with a Prompting Method**
```bash
python src/evaluate.py --config-name=config_alfred prompting_method.TOCC=True
```
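
To run several configurations in a row, the commands above can be assembled from a small driver script. This is a sketch under assumptions: `build_command` and `main` are hypothetical helpers, and passing the `planner.model` and `prompting_method.TOCC` overrides together in one command is assumed to work rather than confirmed. By default the script only prints the commands; pass `execute=True` to `main` to actually launch them from the repository root.

```python
import shlex
import subprocess
import sys


def build_command(config_name, overrides):
    """Build the argument list for one src/evaluate.py run."""
    return [
        sys.executable,
        "src/evaluate.py",
        f"--config-name={config_name}",
        *overrides,
    ]


# Override strings copied from the examples above.
runs = [
    ["planner.model=deepseek-ai/deepseek-math-7b-instruct"],
    [
        "planner.model=deepseek-ai/deepseek-math-7b-instruct",
        "prompting_method.TOCC=True",  # assumption: the two overrides can be combined
    ],
]


def main(execute=False):
    for overrides in runs:
        cmd = build_command("config_alfred", overrides)
        print(shlex.join(cmd))
        if execute:
            # check=True stops the sweep if one evaluation fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(execute=False)
```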