# Launching retrieval experiments

### Two models (pure DDP, local negatives):
```
python /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/launch_frontier.py \
    --python_script="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/train_retrieval_w_anticausal.py" \
    --rccl_installdir="${WRKSPC}/tiny_plugins_rccl.tar.gz" \
    --environment="${WRKSPC}/frontier_conda_60.tar.gz" \
    --budget_hours=4 \
    --nodes 16 \
    --config /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/XXXX-22/frontier_jobs/fineweb_retrieval_train_pythia.json \
    --run_name wbsz-2560-local_negs_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-20-ctx-var-batch_negative_ddp_RR_lr_4e-3_optim_steps_57691 \
    --extra_args='--max_iters=57691 --max_tokens=null --save_step_interval=500 --optim_config.lr=4e-3 --min_lr=4e-4 --save_n_min_before_job_done=3 --world_batch_size=2560 --micro_batch_size=20 --fabric_strategy="ddp"' \
    --disable_net_gdr \
    --sub_output_dir_name wbsz-2560-local_negs_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-20-ctx-var-batch_negative_ddp_RR_lr_4e-3_optim_steps_57691
```

### Two models (gradient-aware parallel negatives):
Note: this variant is configured through `extra_args` (e.g. `negatives_cross_device`).
```
python /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/launch_frontier.py \
    --python_script="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/train_retrieval_w_anticausal.py" \
    --rccl_installdir="${WRKSPC}/tiny_plugins_rccl.tar.gz" \
    --environment="${WRKSPC}/frontier_conda_60.tar.gz" \
    --budget_hours=18 \
    --nodes 16 \
    --config /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/XXXX-22/frontier_jobs/fineweb_retrieval_train_pythia.json \
    --run_name wbsz-256-cross_device_negs_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-2-gradacc_bsz-2560-ctx-var-batch_negative_ddp_RR_lr_1e-3_optim_steps_57691 \
    --extra_args='--negatives_cross_device=true --max_iters=576908 --max_tokens=null --save_step_interval=500 --optim_config.lr=1e-3 --min_lr=1e-4 --save_n_min_before_job_done=3 --world_batch_size=2560 --micro_batch_size=2 --fabric_strategy="ddp"' \
    --disable_net_gdr \
    --sub_output_dir_name wbsz-256-cross_device_negs_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-2-gradacc_bsz-2560-ctx-var-batch_negative_ddp_RR_lr_1e-3_optim_steps_57691
```
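The batch-size flags in this command and the previous one can be related with a little arithmetic. The following is a minimal sketch, assuming 8 GPUs per Frontier node and assuming the training script accumulates gradients until `world_batch_size` samples have been processed. Neither assumption is stated in this README, and `grad_accum_steps` is a hypothetical helper, not a function from the codebase.

```python
GPUS_PER_NODE = 8  # assumption: 8 GPUs (GCDs) per Frontier node


def grad_accum_steps(world_batch_size: int, micro_batch_size: int, nodes: int,
                     gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Micro-batches each device processes per optimizer step (hypothetical helper)."""
    # Samples consumed across all devices in one forward/backward pass.
    samples_per_micro_step = micro_batch_size * nodes * gpus_per_node
    if world_batch_size % samples_per_micro_step != 0:
        raise ValueError(
            f"world_batch_size={world_batch_size} is not a multiple of "
            f"{samples_per_micro_step} samples per micro-step"
        )
    return world_batch_size // samples_per_micro_step


# Values taken from the two commands above (16 nodes each).
local_negs = grad_accum_steps(world_batch_size=2560, micro_batch_size=20, nodes=16)
cross_device = grad_accum_steps(world_batch_size=2560, micro_batch_size=2, nodes=16)
print(local_negs, cross_device)
```

Under these assumptions the local-negatives run needs no accumulation, while the cross-device run accumulates over 10 micro-steps of 256 samples each. That would be consistent with the `wbsz-256` and `gradacc_bsz-2560` parts of its run name, and with `--max_iters=576908` being roughly ten times the 57691 optimizer steps.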

### Two models (k positive labels on the upper diagonal):
Note: this variant is configured through `extra_args` (e.g. `k_pos_labels`).
```
python /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/launch_frontier.py \
    --python_script="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/train_retrieval_w_anticausal.py" \
    --rccl_installdir="${WRKSPC}/tiny_plugins_rccl.tar.gz" \
    --environment="${WRKSPC}/frontier_conda_60.tar.gz" \
    --budget_hours=10 \
    --nodes=16 \
    --config="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/XXXX-22/frontier_jobs/fineweb_retrieval_train_pythia.json" \
    --run_name="k_pos_labels_3_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-2048-ctx-var-batch_negative_ddp_RR_lr_3e-3_max_iters_57691_reduce_both_dim" \
    --extra_args="--k_pos_labels=3 --decay_factor=1.0 --reduce=both_dim --max_iters=57691 --max_tokens=null --save_step_interval=500 --optim_config.lr=3e-3 --min_lr=3e-4 --save_n_min_before_job_done=3 --world_batch_size=2048 --micro_batch_size=16 --fabric_strategy=ddp" \
    --disable_net_gdr \
    --sub_output_dir_name="k_pos_labels_3_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-2048-ctx-var-batch_negative_ddp_RR_lr_3e-3_max_iters_57691_reduce_both_dim"
```

### Orca contrastive finetuning:
Note: this variant is configured through `extra_args` (e.g. `finetune_checkpoint`).
```
python /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/launch_frontier.py \
    --python_script="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/train_retrieval_w_anticausal.py" \
    --rccl_installdir="${WRKSPC}/tiny_plugins_rccl.tar.gz" \
    --environment="${WRKSPC}/frontier_conda_60.tar.gz" \
    --budget_hours=8 \
    --nodes 4 \
    --config /XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/XXXX-40/launch_scripts/XXXX-22/frontier_jobs/retrieval_train_pythia_natural_inst_data.json \
    --run_name orca_finetune_k_pos_labels_0_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-512-ctx-var-batch_negative_ddp_RR_lr_1e-3_max_iters_8250 \
    --extra_args=' --max_iters=8250 --save_step_interval=1000 --eval_step_interval=1000 --optim_config.lr=1e-3 --min_lr=1e-4 --save_n_min_before_job_done=3 --world_batch_size=512 --micro_batch_size=16 --finetune_checkpoint="/XXXX-30/XXXX-29/XXXX-31/scratch/XXXX-22/output/k_pos_labels_0_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-2048-ctx-var-batch_negative_ddp_RR_lr_3e-3_max_iters_57691_reduce_both_dim/checkpoints-ddp/step-00010000-k_pos_labels_0_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-2048-ctx-var-batch_negative_ddp_RR_lr_3e-3_max_iters_57691_reduce_both_dim.pth"' \
    --disable_net_gdr \
    --sub_output_dir_name orca_finetune_k_pos_labels_0_decay_factor_1.0_fineweb_100B-retrieval-dual-causal-pythia-160m-mbsz-16-wbsz-512-ctx-var-batch_negative_ddp_RR_lr_1e-3_max_iters_8250
```

### Single model (suffix = prefix):
Note: this variant is configured through `extra_args` (e.g. `suffix_is_prefix`).
```
python /XXXX-30/XXXX-29/XXXX-31/scratch/njain17/new_workspace/XXXX-40/launch_scripts/launch_frontier.py \
    --python_script="/XXXX-30/XXXX-29/XXXX-31/scratch/njain17/new_workspace/XXXX-40/train_retrieval_w_anticausal.py" \
    --rccl_installdir="${WRKSPC}/tiny_plugins_rccl.tar.gz" \
    --environment="${WRKSPC}/frontier_conda_60.tar.gz" \
    --budget_hours=10 \
    --nodes 16 \
    --config /XXXX-30/XXXX-29/XXXX-31/scratch/njain17/new_workspace/XXXX-40/launch_scripts/XXXX-22/frontier_jobs/fineweb_retrieval_train_pythia.json \
    --run_name fineweb_100B-retrieval-dual-causal-pythia-160m-suffix_is_prefix-mbsz-20-wbsz-2560-ctx-var-cross_batch_negative_ddp_RR_lr_4e-3_max_iters_57691 \
    --extra_args='--loss_type=cross_batch_negative --suffix_is_prefix=True --max_iters=57691 --max_tokens=null --save_step_interval=500 --optim_config.lr=4e-3 --min_lr=4e-4 --save_n_min_before_job_done=3 --world_batch_size=2560 --micro_batch_size=20 --fabric_strategy="ddp"' \
    --disable_net_gdr \
    --sub_output_dir_name fineweb_100B-retrieval-dual-causal-pythia-160m-suffix_is_prefix-mbsz-20-wbsz-2560-ctx-var-cross_batch_negative_ddp_RR_lr_4e-3_max_iters_57691
```
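Every command above passes the same long string to both `--run_name` and `--sub_output_dir_name`, which is easy to get wrong when copying by hand. The sketch below shows one way to assemble the argument list so the name is written only once. It uses only flags that appear in the commands above; the helper `build_launch_args` and all paths and the run name in the usage example are hypothetical placeholders, not part of the codebase.

```python
import shlex


def build_launch_args(launcher: str, python_script: str, workspace: str, config: str,
                      run_name: str, nodes: int, budget_hours: int,
                      extra_args: str) -> list[str]:
    """Assemble a launch_frontier.py command line (hypothetical helper)."""
    return [
        "python", launcher,
        f"--python_script={python_script}",
        f"--rccl_installdir={workspace}/tiny_plugins_rccl.tar.gz",
        f"--environment={workspace}/frontier_conda_60.tar.gz",
        f"--budget_hours={budget_hours}",
        f"--nodes={nodes}",
        f"--config={config}",
        f"--run_name={run_name}",
        f"--extra_args={extra_args}",
        "--disable_net_gdr",
        # Reuse run_name so the output directory always matches the run.
        f"--sub_output_dir_name={run_name}",
    ]


# Placeholder values for illustration only.
args = build_launch_args(
    launcher="/path/to/launch_scripts/launch_frontier.py",
    python_script="/path/to/train_retrieval_w_anticausal.py",
    workspace="/path/to/workspace",
    config="/path/to/frontier_jobs/fineweb_retrieval_train_pythia.json",
    run_name="example_run",
    nodes=16,
    budget_hours=4,
    extra_args="--max_iters=57691 --world_batch_size=2560 --micro_batch_size=20",
)
# shlex.join quotes each argument so the result can be pasted into a shell.
print(shlex.join(args))
```

Printing the command rather than running it keeps the sketch side-effect free; the output can be reviewed and then pasted into a shell.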