# Hivemind of Reasoning Agents

## Installation

``` bash
uv venv --python 3.11
uv pip install -e .
uv pip install 'flash-attn==2.8.2' --no-build-isolation
```

``` bash
cp ./dev_utils/modeling_qwen2.py ./.venv/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py
```

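The `cp` command above overwrites the installed Qwen2 modeling file with the patched copy from `dev_utils`. The patch adds the following block, which passes explicit block sizes to the kernel whenever the attention implementation is `flex_attention`:

``` python
        if self.config._attn_implementation == "flex_attention":
            kernel_options = {
                "BLOCK_M": 64,
                "BLOCK_N": 64,
                "BLOCK_M1": 32,
                "BLOCK_N1": 64,
                "BLOCK_M2": 64,
                "BLOCK_N2": 32,
            }
            kwargs["kernel_options"] = kernel_options
```

The block sits at line 166 of `.venv/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py`.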
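
A quick sanity check after copying the file is to confirm that the installed module really contains the patch. The sketch below is a hypothetical helper, not part of this repository: the name `has_flex_kernel_options` and the marker strings are assumptions chosen to match the snippet above.

```python
from pathlib import Path

# Marker strings taken from the patch snippet shown above; the helper
# itself is hypothetical and not part of this repository.
PATCH_MARKERS = ('kwargs["kernel_options"] = kernel_options', '"BLOCK_M1"')


def has_flex_kernel_options(path):
    """Return True if the file at `path` contains every patch marker."""
    text = Path(path).read_text(encoding="utf-8")
    return all(marker in text for marker in PATCH_MARKERS)


if __name__ == "__main__":
    # Path from the cp command above; adjust it if the venv lives elsewhere.
    target = Path(
        ".venv/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py"
    )
    if target.exists():
        print("patched" if has_flex_kernel_options(target) else "NOT patched")
    else:
        print(f"{target} not found")
```

Running it from the repository root prints whether the file in the virtual environment carries the patch. Reinstalling or upgrading `transformers` restores the stock file, so the check is worth repeating after any dependency change.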


## Create datasets

``` bash
for DATASET in deepmath deepmath_test amc12_22_24 omni_math olympiadbench math_test aime_train; do
    python completions.py --model_name deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --max_completion_length 4096 --max_model_len 8192 --dataset_name $DATASET --num_prompts 500 --num_completions_per_prompt 64 --temperature 1.0
    python completions_rollouts.py --input_path outputs/completions/completions_deepseek-ai_DeepSeek-R1-Distill-Qwen-1.5B_${DATASET}_500_64_1.0_4096_None_None.pt --max_model_len 8192 --rollout_max_new_tokens 40 --paragraph_delimiter_token_id 14190
    python completions_rollouts_bin.py --input_path outputs/completions_rollouts/completions_rollouts_deepseek-ai_DeepSeek-R1-Distill-Qwen-1.5B_${DATASET}_500_64_1.0_4096_None_None_append_____Final_Answer____boxed_delim_token_14190_roll_40.pt
done
```
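
Each stage reads the file written by the previous one, so a mistyped path only surfaces after a long generation run. The sketch below rebuilds the `completions.py` output path from the flags used in the loop above, so the file can be checked before the rollout stage starts. The naming pattern is inferred from the paths in the commands, not from the scripts themselves. `completions_path` is a hypothetical helper, and the two trailing `None` fields are copied verbatim because their meaning is not documented here.

```python
from pathlib import Path


def completions_path(model_name, dataset, num_prompts, num_completions,
                     temperature, max_completion_length,
                     root="outputs/completions"):
    """Rebuild the completions file path used in the loop above.

    Assumption: "/" in the model name becomes "_", and the two trailing
    "None" fields are fixed, as in the example paths.
    """
    model_tag = model_name.replace("/", "_")
    name = (
        f"completions_{model_tag}_{dataset}_{num_prompts}_{num_completions}"
        f"_{temperature}_{max_completion_length}_None_None.pt"
    )
    return Path(root) / name


path = completions_path(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", "deepmath", 500, 64, 1.0, 4096
)
print(path, "exists" if path.exists() else "missing")
```

If the reported path does not match what `completions.py` wrote, the pattern assumed here is wrong for that configuration and the actual file name in `outputs/completions` should be used instead.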

## Run experiment (Terminal Answers)

### Probe
``` bash
# SEED must be set in the shell before running this command.
python -m calib.cli.train \
--precomputed_path outputs/completions_rollouts/deepmath_64_224_balanced.pt \
--eval_precomputed_path outputs/completions_rollouts/aime_train_64_448.pt \
--seed $SEED \
--num_prompts 224 \
--eval_num_prompts 224 \
--eval_steps 0.5 \
--save_steps 0.5 \
--epochs 2 \
--effective_batch_size 64 \
--batch_size 64 \
--base_batch_size 32 \
--last_rollout_only \
--lr 1e-3 \
--lr_scheduler_type constant_with_warmup \
--group_size 1 \
--sampler_group_size 1 \
--architecture probe
```

### MSV ${GROUP_SIZE}
``` bash
# SEED must be set in the shell before running this loop.
for GROUP_SIZE in 1 4 16 64; do
    python -m calib.cli.train \
    --precomputed_path outputs/completions_rollouts/deepmath_64_224_balanced.pt \
    --eval_precomputed_path outputs/completions_rollouts/aime_train_64_448.pt \
    --seed $SEED \
    --num_prompts 224 \
    --eval_num_prompts 224 \
    --eval_steps 0.5 \
    --save_steps 0.5 \
    --epochs 2 \
    --effective_batch_size 64 \
    --batch_size 64 \
    --base_batch_size 32 \
    --last_rollout_only \
    --lr 5e-5 \
    --gating_lr 1e-1 \
    --agent_emb \
    --agent_lr 1e-3 \
    --lr_scheduler_type constant_with_warmup \
    --group_size ${GROUP_SIZE} \
    --sampler_group_size ${GROUP_SIZE} \
    --architecture full \
    --num_hidden_layers 1 \
    --attn_types omni,omni_bin,omni_indiv \
    --node_features omni_log_count \
    --no_early_node_features_projection \
    --late_node_features_projection \
    --bin_aggregate
done
```

