# Code Implementation for SafeCoDe

This repository implements our SafeCoDe procedure for multimodal large language models (MLLMs). It supports two evaluation setups:

- **MOSSBench** — evaluates rejection rate (safety sensitivity).  
- **MSSBench** — evaluates contextual accuracy under safe/unsafe settings.

---

## Installation

```bash
pip install -r requirements.txt
```

## OpenAI API Key

Set your API key as an environment variable:
```bash
# macOS / Linux
export OPENAI_API_KEY="<your_api_key_here>"

# Windows PowerShell (setx persists the variable; open a new terminal for it to take effect)
setx OPENAI_API_KEY "<your_api_key_here>"
```
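
Before launching a long run, it can help to confirm that the variable is actually visible to Python. The snippet below is a minimal sketch of such a check; the helper name `require_openai_key` is hypothetical and not part of this repository.

```python
import os


def require_openai_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of this repository.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the benchmarks."
        )
    return key


if __name__ == "__main__":
    try:
        require_openai_key()
        print("OPENAI_API_KEY found.")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.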

## Benchmarks & Entry Points

Because of the submission size limit, the benchmark images are not included in this repository. Please download the datasets from the official MOSSBench and MSSBench repositories.

### MOSSBench (Script Location)
```text
eval_bench/MOSSBench/experiments/main.py
```

### MSSBench (Script Location)
```text
MSSBench/inference.py
```

## Command-Line Arguments

```text
--model_type            Model type (e.g., llava, llava-next, idefics)
--model_path            Path or HF hub ID for model (required)
--output_name           Saved name for output files (required)
--alpha                 Contrastive scaling factor (default: 0.3)
--max_steps             Max guided decoding steps (default: 5)
--top_k                 Top-k tokens to consider (default: 20)
--lambda_supp           Suppression strength for safe verdicts (default: 1.0)
--lambda_boost          Boost strength for unsafe verdicts (default: 1.0)
--total_max_new_tokens  Max new tokens to generate (default: 256)
--mss_data_root         Root path to MSSBench data
--mss_output_dir        Output directory for MSSBench results
--moss_data_root        Root path to MOSSBench data
--moss_output_dir       Output directory for MOSSBench results
--moss_data_list        Restrict MOSSBench to a subset of sample IDs
--moss_data_offset      Start index offset for MOSSBench
--moss_inference        Run inference (use --no-moss_inference to disable)
--moss_eval             Run evaluation (use --no-moss_eval to disable)
```
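
To make the flag semantics concrete, here is a rough sketch of how a subset of these arguments could be declared with `argparse`. It is an assumption for illustration, not the repository's actual parser: the argument types and the `True` defaults for the two boolean flags are guesses, and `build_parser` is a hypothetical name. The paired `--flag` / `--no-flag` form is what `argparse.BooleanOptionalAction` (Python 3.9+) produces.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags (types are assumed)."""
    p = argparse.ArgumentParser(description="SafeCoDe evaluation (sketch)")
    p.add_argument("--model_type", type=str, help="e.g., llava, llava-next, idefics")
    p.add_argument("--model_path", type=str, required=True)
    p.add_argument("--output_name", type=str, required=True)
    # Decoding hyperparameters, with the defaults listed in the table above.
    p.add_argument("--alpha", type=float, default=0.3)
    p.add_argument("--max_steps", type=int, default=5)
    p.add_argument("--top_k", type=int, default=20)
    p.add_argument("--lambda_supp", type=float, default=1.0)
    p.add_argument("--lambda_boost", type=float, default=1.0)
    p.add_argument("--total_max_new_tokens", type=int, default=256)
    p.add_argument("--moss_data_root", type=str)
    p.add_argument("--moss_output_dir", type=str)
    # BooleanOptionalAction registers both --moss_eval and --no-moss_eval.
    # The default of True is an assumption, not confirmed by the source.
    p.add_argument("--moss_inference", action=argparse.BooleanOptionalAction, default=True)
    p.add_argument("--moss_eval", action=argparse.BooleanOptionalAction, default=True)
    return p


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "--model_path", "HuggingFaceM4/idefics-9b-instruct",
    "--output_name", "moss_run",
    "--no-moss_eval",
])
print(args.alpha, args.moss_inference, args.moss_eval)
```

In this sketch, omitting a hyperparameter falls back to its documented default, and `--no-moss_eval` switches evaluation off while leaving inference enabled.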

## Usage Examples
### MOSSBench

```bash
cd /path/to/your_repo/eval_bench/MOSSBench/experiments

python main.py \
  --model_type idefics \
  --model_path HuggingFaceM4/idefics-9b-instruct \
  --output_name moss_run \
  --alpha 0.3 --max_steps 5 --top_k 20 \
  --lambda_supp 1.0 --lambda_boost 1.0 \
  --total_max_new_tokens 256 \
  --moss_data_root /path/to/MOSSBench/data \
  --moss_output_dir /path/to/outputs/mossbench \
  --moss_inference --moss_eval
```
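
When comparing several values of `--alpha`, the command above can be generated programmatically. The following is a minimal sketch that only assembles and prints the commands, using the flags documented in this README; the helper `build_moss_command`, the run names, and the alpha values 0.1 and 0.5 are illustrative assumptions.

```python
import shlex


def build_moss_command(alpha, output_name, data_root, output_dir,
                       model_type="idefics",
                       model_path="HuggingFaceM4/idefics-9b-instruct"):
    """Return the argv list for one MOSSBench run (hypothetical helper)."""
    return [
        "python", "main.py",
        "--model_type", model_type,
        "--model_path", model_path,
        "--output_name", output_name,
        "--alpha", str(alpha),
        "--moss_data_root", data_root,
        "--moss_output_dir", output_dir,
        "--moss_inference", "--moss_eval",
    ]


# Illustrative sweep values; only 0.3 is the documented default.
commands = [
    build_moss_command(a, f"moss_alpha_{a}",
                       "/path/to/MOSSBench/data",
                       "/path/to/outputs/mossbench")
    for a in (0.1, 0.3, 0.5)
]

for cmd in commands:
    # To execute instead of print: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Building each command as a list rather than a single string avoids shell quoting problems if a path contains spaces.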

### MSSBench

```bash
cd /path/to/your_repo/MSSBench

python inference.py \
  --mllm idefics \
  --data_root /path/to/MSSBench/data \
  --output_dir /path/to/outputs/mssbench \
  --model_path HuggingFaceM4/idefics-9b-instruct \
  --alpha 0.3 --max_steps 5 --top_k 20 \
  --lambda_supp 1.0 --lambda_boost 1.0 \
  --total_max_new_tokens 256
```




