<div align="center">
  <h1>SEC-bench</h1>
  <p>Automated Benchmarking of LLM Agents on Real-World Software Security Tasks</p>
</div>
<br>


## Setup
Clone the repository, including its submodules:

```bash
git clone --recurse-submodules https://github.com/SEC-bench/SEC-bench.git
cd SEC-bench
```

Create a conda environment and install the dependencies:

```bash
conda create -n secb python=3.12
conda activate secb
pip install -r requirements.txt
pip install -e .
```

You need to set the following environment variables:

```bash
export GITHUB_TOKEN=<your_github_token>
export GITLAB_TOKEN=<your_gitlab_token>
export OPENAI_API_KEY=<your_openai_api_key>
export ANTHROPIC_API_KEY=<your_anthropic_api_key>
export HF_TOKEN_PATH=$HOME/.cache/hf_hub_token
export HF_HOME=<path/to/huggingface>    # Directory where Hugging Face models are saved
```
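Before running the pipeline, it can help to confirm that these variables are actually exported. The snippet below is a minimal sketch and is not part of the repository: the function name `check_secb_env` is hypothetical, and it assumes all six variables are needed for your workflow (you may only need one of the API keys).

```bash
# Hypothetical helper (not shipped with SEC-bench): report which of the
# environment variables listed above are not exported or are empty.
check_secb_env() {
    local missing=0 name
    for name in GITHUB_TOKEN GITLAB_TOKEN OPENAI_API_KEY \
                ANTHROPIC_API_KEY HF_TOKEN_PATH HF_HOME; do
        # printenv only sees exported variables, which is what the Python
        # child processes will see as well.
        if [ -z "$(printenv "$name")" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing the function, `check_secb_env && echo "environment looks complete"` prints the confirmation only when every variable is set.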

## Preprocessing
```bash
# Preprocess metadata of the OSV database
python -m secbench.preprocessor.seed --input-dir [OSV_DIR] --output-file [SEED_OUTPUT_FILE_PATH] --verbose

# Extract bug reports from reference URLs in preprocessed data
python -m secbench.preprocessor.report --input-file [SEED_OUTPUT_FILE_PATH] --output-file [REPORT_OUTPUT_FILE_PATH] --reports-dir [REPORTS_DIR] --lang [LANGUAGE] --type [TYPE] --whitelist [WHITELIST_PROJECTS] --blacklist [BLACKLIST_PROJECTS] --oss-fuzz

# Generate project configurations for reproducing vulnerabilities using OSS-Fuzz projects
python -m secbench.preprocessor.project --input-file [REPORT_OUTPUT_FILE_PATH] --output-file [PROJECT_OUTPUT_FILE_PATH] --tracking-file [TRACKING_FILE_PATH] --verbose
```

Alternatively, use the `run_preprocessor.sh` wrapper script. In `report` and `project` modes, the output file path can be omitted.

```bash
Usage: ./run_preprocessor.sh <mode> [options]

Modes:
  seed    - Parse CVE/OSV files and extract relevant information
  report  - Extract bug descriptions from reference URLs
  project - Generate project configurations for reproducing vulnerabilities

Options for seed mode:
  --input-dir <dir>           Directory containing input JSON files
  --output-file <file>        Output file path (JSONL format)
  --log-file <file>           Log file path (default: logs/seed.log)
  --repo-lang-file <file>     Repository language mapping file
  --verbose, -v               Enable verbose logging

Options for report mode:
  --input-file <file>         Input JSONL file containing preprocessed data
  --output-file <file>        Output JSONL file path (with bug reports)
  --reports-dir <dir>         Directory to store extracted bug reports
  --log-file <file>           Log file path (default: logs/report.log)
  --max-entries <n>           Maximum number of entries to process
  --verbose, -v               Enable verbose logging
  --type <type>               Select vulnerability type (CVE, OSV, or ALL)
  --lang <lang>               Filter entries by programming language
  --blacklist <repos>         Exclude entries from specified repositories
  --whitelist <repos>         Include only entries from specified repositories
  --oss-fuzz [config]         Filter entries by OSS-Fuzz projects
  --fixed-only                Filter entries with non-empty fixed commit

Options for project mode:
  --input-file <file>         Input file path containing bug reports
  --output-file <file>        Output file path containing project information
  --max-entries <n>           Maximum number of entries to process
  --log-file <file>           Log file path (default: logs/project.log)
  --verbose, -v               Enable verbose logging
  --tracking-file <file>      Path to the tracking file
  --force, -f                 Force reprocessing of already processed entries
  --append, -a                Append to the output file
  --sanitizer-only            Only process entries that have a sanitizer error
  --minimal                   Generate a minimalized Dockerfile and build script instead of using the original OSS-Fuzz files
  -h, --help                  Show this help message and exit

Examples:
  ./run_preprocessor.sh seed --input-dir ./data --output-file ./output/seed.jsonl
  ./run_preprocessor.sh report --input-file ./output/seed.jsonl --type CVE --oss-fuzz --lang C,C++,Java
  ./run_preprocessor.sh project --input-file ./output/report-cve-oss-c-cpp-java.jsonl --sanitizer-only
```
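The three modes form a pipeline in which each stage reads the previous stage's output. The following is a rough sketch of chaining them, using only flags from the usage text above. The function name `preprocess_all`, the `DRY_RUN` switch, and the intermediate file names are assumptions made for illustration, and the sketch assumes the output directory already exists.

```bash
# Hypothetical wrapper (not part of SEC-bench): run the three preprocessing
# stages in order, passing each stage's output file to the next.
# Set DRY_RUN=1 to print the commands instead of executing them.
preprocess_all() {
    local osv_dir=$1 out_dir=$2
    local seed="$out_dir/seed.jsonl"
    local report="$out_dir/report.jsonl"
    local project="$out_dir/project.jsonl"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # '&&' stops the chain as soon as one stage fails.
    $run ./run_preprocessor.sh seed --input-dir "$osv_dir" --output-file "$seed" &&
    $run ./run_preprocessor.sh report --input-file "$seed" --output-file "$report" --type CVE --oss-fuzz &&
    $run ./run_preprocessor.sh project --input-file "$report" --output-file "$project" --sanitizer-only
}
```

Running `DRY_RUN=1 preprocess_all ./data ./output` prints the three commands without executing them, which is a cheap way to check the paths before starting a long run. Output paths are passed explicitly here so that each stage does not depend on the default file names.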

Finally, build the base and instance images.

```bash
# Build base images
python -m secbench.preprocessor.build_base_images

# Build instance images
python -m secbench.preprocessor.build_instance_images --input-file [PROJECT_OUTPUT_FILE_PATH] --ids [INSTANCE_IDS]

# Example: Build an instance image for `openjpeg` CVE-2024-56827
python -m secbench.preprocessor.build_instance_images --input-file ./output/project-cve-oss-c-c++-sanitizer-minimal.jsonl --ids openjpeg.cve-2024-56827

# Example: Build instance images for "gpac.cve" pattern
python -m secbench.preprocessor.build_instance_images --input-file ./output/project-cve-oss-c-c++-sanitizer-minimal.jsonl --filter gpac.cve
```
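When building many instances, one failed build should not necessarily stop the rest. The sketch below calls the build command once per instance ID and collects the IDs that failed. The function name `build_instances` is hypothetical, and the sketch assumes that `build_instance_images` exits with a non-zero status when a build fails, which the repository does not state explicitly.

```bash
# Hypothetical helper (not part of SEC-bench): build instance images one ID
# at a time so a single failed build does not stop the remaining ones.
build_instances() {
    local input_file=$1
    shift
    local id failed=""
    for id in "$@"; do
        # Assumption: the command returns non-zero when the build fails.
        if ! python -m secbench.preprocessor.build_instance_images \
                --input-file "$input_file" --ids "$id"; then
            failed="$failed $id"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed builds:$failed" >&2
        return 1
    fi
}
```

For example, `build_instances ./output/project-cve-oss-c-c++-sanitizer-minimal.jsonl openjpeg.cve-2024-56827` builds a single instance, and additional IDs can be appended as further arguments.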

The instance images are tagged with the `latest` tag.

## Verification
Refer to [SecVerifier](https://github.com/SEC-bench/SecVerifier) for more details.

## Evaluation

1. Build the evaluation images.

```bash
python -m secbench.evaluator.build_eval_instances --input-dir [REPRODUCED_INSTANCE_DIR]
```

Only verified images are saved; their names start with the `hwiwonlee/secb.eval.x86_64.` prefix.

2. Run the evaluation.

```bash
python -m secbench.evaluator.eval_instances \
    --input-dir [PATH_TO_OUTPUT_DIR] \
    --mode [MODE] \
    --split [SPLIT] \
    --agent [AGENT] \
    --num-workers [NUM_WORKERS] \
    --output-dir [OUTPUT_DIR]
```

3. View the results.

```bash
python -m secbench.evaluator.view_patch_results \
    --agent [AGENT] \
    --input-dir [PATH_TO_OUTPUT_DIR]
```
