# MLLM-Based Semantic Interpretation of Failure Data

## [Project Website](https://mllm-failure-clustering.github.io/) | [Paper]()

This repository contains the code for our paper ["Enhancing Robot Safety via MLLM-Based Semantic Interpretation of Failure Data"](). We present a framework that interprets the failure modes of autonomous systems by clustering their failure datasets, and we use the resulting clusters for runtime monitoring, targeted data collection, and policy refinement.


# Case Studies
We demonstrate our framework on two systems:
- **Dashcam Driving Datasets**<br>
We used the open-source [Nexar dashcam driving dataset](https://www.kaggle.com/competitions/nexar-collision-prediction/data). It contains videos of ego car failures.
- **Vision-based Indoor Navigation Robot**<br>
We used [LB-WayPtNav](https://github.com/mllm-failure-clustering/Visual-Navigation-Release/tree/failure_clustering), a vision-based indoor navigation framework.


# Installation
Unzip `failure_clustering.zip` and run the following commands to set up the environment:
```
python -m venv venv
source venv/bin/activate
cd failure_clustering
pip install -r requirements.txt
```
Add your API keys as environment variables by filling in the empty quotes in the commands below:
```
export GOOGLE_API_KEY=''
export OPENAI_API_KEY=''
```
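
Before starting a long run, it can help to confirm that both keys are visible to Python. The following is a minimal sketch, not part of the codebase; the helper name `missing_api_keys` is hypothetical.

```python
import os

# The two environment variables exported above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or empty.

    `env` defaults to the process environment; pass a dict to test the logic.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_api_keys()
print("Missing API keys:", ", ".join(missing) if missing else "none")
```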

# Dataset
We provide the [dataset](https://drive.google.com/drive/folders/1lEUGI4vUhGsxbk_jUz2k1o2IvyrS-J3U?usp=sharing) of failure trajectories for both systems (the indoor navigation robot and the car dashcam recordings). Download it with:
```
pip install gdown
gdown --folder https://drive.google.com/drive/folders/1lEUGI4vUhGsxbk_jUz2k1o2IvyrS-J3U
```
Unzip the downloaded file so that the directory has the following structure:
```
failure_clustering_datasets/
├── driving/
│   └── nexar/
├── waypointnav/
│   ├── area1/
```
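
A quick way to check the layout after unzipping is sketched below. It only looks for the directories shown in the tree above, and the helper name `missing_dataset_dirs` is hypothetical rather than part of the codebase.

```python
from pathlib import Path

# Sub-directories shown in the structure above, relative to the dataset root.
EXPECTED_DIRS = ("driving/nexar", "waypointnav/area1")


def missing_dataset_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


missing = missing_dataset_dirs("failure_clustering_datasets")
print("Missing directories:", missing if missing else "none")
```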

# Clustering Failure Datasets

### Step-1: Get failure trajectory descriptions
```
cd clustering/step1
python get_descriptions.py --experiment waypointnav
```

### Step-2: Get failure clusters
```
cd clustering/step2
python get_clusters.py
python aggregate_clusters.py
python convert_to_json.py
```

### Step-3: Assign trajectories to clusters
```
cd clustering/step3
python assign_to_clusters.py
```
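
The three steps can also be chained from a single script. The sketch below assumes that each `cd` path above is relative to the root of the unzipped `failure_clustering` directory; that is an assumption, not something the codebase states. `build_plan` and `run_pipeline` are hypothetical helpers, and the default is a dry run that only prints the commands.

```python
import subprocess
import sys
from pathlib import Path

# (working directory, script and arguments), copied from the steps above.
# ASSUMPTION: directories are relative to the repository root.
STEPS = [
    ("clustering/step1", ["get_descriptions.py", "--experiment", "waypointnav"]),
    ("clustering/step2", ["get_clusters.py"]),
    ("clustering/step2", ["aggregate_clusters.py"]),
    ("clustering/step2", ["convert_to_json.py"]),
    ("clustering/step3", ["assign_to_clusters.py"]),
]


def build_plan(repo_root):
    """Return (working_directory, command) pairs for all steps, in order."""
    root = Path(repo_root)
    return [(root / workdir, [sys.executable, *args]) for workdir, args in STEPS]


def run_pipeline(repo_root, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cwd, cmd in build_plan(repo_root):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


run_pipeline(".")  # dry run: prints the plan without executing anything
```

Call `run_pipeline(".", dry_run=False)` from the repository root to execute the steps.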

# Runtime Monitoring with Failure Clusters
Enter your OpenAI API key in `failure_monitoring/driving/clustering.py` and `failure_monitoring/waypointnav/clustering.py`.

## Driving
```
cd failure_monitoring/driving/
python run_clustering.py
```

## Waypointnav
```
cd failure_monitoring/waypointnav/
python run_clustering.py
```
This computes the failure detection metrics reported in the paper (TPR, TNR, FPR, FNR, and F1-score).
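
For reference, these metrics follow the standard confusion-matrix definitions. The sketch below is illustrative and is not taken from `run_clustering.py`; the function name `detection_metrics` is hypothetical, and it treats a detected failure as the positive class.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute TPR, TNR, FPR, FNR and F1 from confusion-matrix counts.

    A ratio with a zero denominator is reported as 0.0.
    """

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "TPR": ratio(tp, tp + fn),  # failures correctly flagged
        "TNR": ratio(tn, tn + fp),  # successes correctly passed
        "FPR": ratio(fp, fp + tn),  # successes wrongly flagged
        "FNR": ratio(fn, fn + tp),  # failures missed
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
    }


print(detection_metrics(tp=8, fp=2, tn=18, fn=2))
```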

### Engaging the Safeguard Policy
We also test our runtime monitor on the indoor navigation robot by integrating it into the original policy and triggering an expert fallback controller whenever a failure is detected. To run this experiment, first clone the navigation codebase:
```
git clone -b failure_clustering https://github.com/mllm-failure-clustering/Visual-Navigation-Release.git
cd Visual-Navigation-Release
```
Follow the [instructions](https://github.com/mllm-failure-clustering/Visual-Navigation-Release/blob/failure_clustering/README.md) to set up the codebase, then test the runtime monitor with the following command:
```
PYOPENGL_PLATFORM=egl PYTHONPATH='.' python executables/rgb/resnet50/rgb_waypoint_trainer.py test --job-dir logs --params params/rgb_trainer/reproduce_LB_WayPtNav_results/rgb_waypoint_trainer_finetune_params.py -d 0
```
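
Conceptually, the safeguard switches control from the learned policy to the expert whenever the monitor flags a failure. The sketch below illustrates only that switching logic; it is not the implementation in Visual-Navigation-Release, and `safeguarded_action`, `policy`, `expert`, and `monitor` are hypothetical names.

```python
def safeguarded_action(observation, policy, expert, monitor):
    """Return (action, source) for one control step.

    `policy`, `expert`, and `monitor` are hypothetical callables that take
    the current observation. `monitor` returns True when it detects a
    failure, in which case the expert fallback controller is used.
    """
    if monitor(observation):
        return expert(observation), "expert"
    return policy(observation), "policy"


# Toy usage with stand-in callables (placeholders, not real controllers).
action, source = safeguarded_action(
    observation={"near_obstacle": True},
    policy=lambda obs: "policy_waypoint",
    expert=lambda obs: "expert_waypoint",
    monitor=lambda obs: obs["near_obstacle"],
)
print(action, source)
```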

# Targeted Data Collection and Policy Refinement
We perform targeted data collection and policy refinement for the indoor navigation robot. For data collection, we take the regions of the environment around the failure clusters and collect expert data there with the following command:
```
PYOPENGL_PLATFORM=egl PYTHONPATH='.' python executables/rgb/resnet50/rgb_waypoint_trainer.py generate-data --job-dir logs --params params/rgb_trainer/reproduce_LB_WayPtNav_results/rgb_waypoint_trainer_finetune_params.py -d 0
```
We then combine the new data with the [original training dataset](https://drive.google.com/file/d/1yfRelD8bf-3bnhIMbW_4yvMglKimCpqY/view?usp=sharing) and fine-tune the available checkpoint with the following command:
```
PYOPENGL_PLATFORM=egl PYTHONPATH='.' python executables/rgb/resnet50/rgb_waypoint_trainer.py train --job-dir logs --params params/rgb_trainer/reproduce_LB_WayPtNav_results/rgb_waypoint_trainer_finetune_params.py -d 0
```
