# A lightweight heuristic for detecting unfair tests in software engineering benchmarks

**Results may be sensitive to the execution environment.**

**We recommend Python 3.12.11 with macOS.**

Steps to reproduce results:

1. Make sure you are in the project root directory.

2. Set up a virtual environment:
    ```bash
    python -m venv venv
    source venv/bin/activate
    ```

   (**Windows users:** Replace the second command with `source venv/Scripts/activate`)

4. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

5. (optional) Reproduce the preprocessed annotation files:
   ```bash
   ./run_preprocessing.sh
   ```

6. Run all experiments:
   ```bash
   ./run_experiments.sh
   ```

The results for each experiment will be in **results/**

The combined results tables will be in **results/tables/**

Any previous results will be overwritten.
