# Ladders of Thought


## Install Dependencies
```
pip install -r requirements.txt
```

## Download Data
Data files for GSM8K and Entailment-bank are included under `data/`.
If they are missing, download `openai/gsm8k` from Hugging Face and put `train.jsonl` and `test.jsonl` under `data/gsm8k`. Then download Entailment-bank (v1, which should have 1840 samples in the train split) and place `train.jsonl` under `data/entailment-bank`.
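
To confirm that a downloaded split has the expected size, a minimal sketch like the one below can count the records in a JSONL file. The helper name `count_jsonl_records` is hypothetical and not part of this repository; it only assumes one JSON object per line.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}:{line_number} is not valid JSON") from error
            count += 1
    return count


if __name__ == "__main__":
    # Path taken from this README; the train split should contain 1840 samples.
    train_path = Path("data/entailment-bank/train.jsonl")
    if train_path.exists():
        print(train_path, count_jsonl_records(train_path))
    else:
        print(f"{train_path} not found")
```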

## Progressive Data Generation

The progressively rewritten data is already included under `processed_data/`, and it has also been converted into training-ready format under `training_data/`. If you want to rerun or test the progressive rewrite, follow the example below:

```
export OPENAI_API_KEY=[Your API KEY]

python3 chatgpt_generation/batch_rewriter.py \
    --input data/entailment-bank/train.jsonl \
    --output chatgpt_generation/entailment_bank_full/train_rewrite.jsonl \
    --model gpt-5-mini --max-output-tokens 20000 \
    --poll-interval 60 \
    --extract-json-blocks
```

The rewrite script uses the OpenAI Batch API. We include a utility script, `count_versions.py`, to help find questions that are missing rewrites.
If rewrites are missing because of network failures or the max output token limit, the `retry_failures.py` script checks for them and resubmits them as another Batch API job.

Example usage:

```
python chatgpt_generation/retry_failures.py \
    --original-batch-input-path openai_api_workdir/batch_input_id.jsonl \
    --batch-results-jsonl-path openai_api_workdir/batch_output_batch_id.jsonl \
    --json-blocks-dir outputs/entailment_bank/json_blocks/ \
    --workdir outputs/entailment_bank_retry1 \
    --extract-json-blocks \
    --min-json-blocks 1 --mode submit --poll-interval 30 --new-max-output-tokens 30000
```
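
The missing-rewrite check described above can be illustrated with a small sketch. This is not the repository's `count_versions.py`; the function name `find_missing_rewrites`, the idea of keying rewrites by a question ID, and the `expected_versions` parameter are all assumptions made for illustration.

```python
from collections import Counter


def find_missing_rewrites(source_ids, rewrite_ids, expected_versions):
    """Return {question_id: number_of_missing_rewrites} for incomplete questions.

    source_ids: IDs of all questions that were submitted for rewriting.
    rewrite_ids: one ID per rewrite that was actually extracted (hypothetical layout).
    """
    counts = Counter(rewrite_ids)
    missing = {}
    for question_id in source_ids:
        shortfall = expected_versions - counts[question_id]
        if shortfall > 0:
            missing[question_id] = shortfall
    return missing


# Toy example: three questions, two rewrites expected for each.
print(find_missing_rewrites(["q1", "q2", "q3"], ["q1", "q1", "q2"], expected_versions=2))
# → {'q2': 1, 'q3': 2}
```

Questions with no rewrites at all only show up because the check iterates over the source IDs rather than over the extracted rewrites.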


Finally, `concat_json_blocks.py` concatenates all extracted JSON blocks and groups them into JSONL files.

```
python chatgpt_generation/concat_json_blocks.py \
  --json-blocks-dir outputs/entailment_bank/json_blocks/ \
  --grouped-output processed_data/entailment-bank/rewrites.grouped.jsonl \
  --flat-output processed_data/entailment-bank/rewrites.flat.jsonl
```


## Format Data into Training-Ready Format

The data has already been processed into training-ready format and placed under `training_data/`. If you want to run the processing scripts again:

```
python3 scripts/process_gsm8k.py --ablation
python3 scripts/process_entailment_bank.py --ablation
```


## Run Experiments

All experiments use Accelerate with TRL on PyTorch. A default Accelerate config is included at `configs/accelerate_config/multi_gpu.yaml`.


```
bash experiments/main_exp.sh
bash experiments/ablation_data.sh
bash experiments/ablation_schedule.sh
```

After training, the saved models will be under `trained_models/` by default.

## Run Evaluation

Evaluation uses lm-eval-harness, so make sure it is installed. All experiment YAML config files are included in `eval_tasks/`.

```
bash eval_main.sh
bash eval_ablation.sh
```

After evaluation, the results will be under `lm_eval_results/` by default.
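
To get a quick overview of finished runs, a sketch like the following can gather the per-task metrics from the result files. It assumes that the results are JSON files with a top-level `results` object keyed by task name, which is the layout lm-eval-harness commonly writes; the helper name `collect_results` is hypothetical, and the exact directory structure may differ in your setup.

```python
import json
from pathlib import Path


def collect_results(results_dir):
    """Map each results file path to its per-task metrics dict."""
    root = Path(results_dir)
    if not root.is_dir():
        return {}
    collected = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            payload = json.load(handle)
        # Assumption: metrics live under a top-level "results" key.
        if isinstance(payload, dict) and isinstance(payload.get("results"), dict):
            collected[str(path)] = payload["results"]
    return collected


if __name__ == "__main__":
    for file_name, tasks in collect_results("lm_eval_results").items():
        for task_name, metrics in tasks.items():
            print(file_name, task_name, metrics)
```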
