# MeZO

## Requirements

Install the latest versions of PyTorch and Transformers.
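
One way to check the environment before launching a run is sketched below. It assumes `pip` and `python` are on your `PATH`; `check_pkg` is a hypothetical helper written for this example, not part of the repository, and no versions are pinned because this README does not name any.

```bash
# check_pkg (hypothetical helper): print the installed version of a package,
# or an install hint if the import fails.
check_pkg() {
    python -c "import $1; print('$1', $1.__version__)" 2>/dev/null \
        || echo "$1 not importable; try: pip install --upgrade $1"
}

check_pkg torch
check_pkg transformers
```

If either line reports that the package is not importable, running the suggested `pip install --upgrade` command should fetch the latest release.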

## Reproduce RoBERTa-large experiments

```bash
TYPE=prompt STEPS=100000 TASK=SST-2 TAG=run SEED=42 MODEL=roberta-large K=16 WANDB_MODE=disabled \
    bash run_fewshot.sh --per_device_train_batch_size 64 --per_device_eval_batch_size 4 --learning_rate 1e-6 \
    --logging_steps 10 \
    --zero_order_optim --zero_order_eps 1e-3 --lr_scheduler_type "constant" --optimizer "sgd" --evaluate_during_training --eval_steps 10000 \
    --efficient_zero_order
```

See `run.py` for the other available arguments.

## Reproduce OPT experiments

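1. Change into the `llm_eval` directory.
2. Run the following command. It reads the weight decay from the shell variable `WD`, so set that variable first.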

```bash
python run.py \
    --model_name facebook/opt-13b \
    --task_name SST2 \
    --output_dir result/run --tag run --train_set_seed 42 --num_train 1000 --num_dev 500 --num_eval 1000 --logging_steps 10 \
    --max_steps 20000 \
    --only_train_option \
    --trainer zo --zo_inplace --load_float16 \
    --learning_rate 1e-6 --zo_eps 1e-3 --per_device_train_batch_size 16 --weight_decay $WD \
    --lr_scheduler_type "constant" \
    --load_best_model_at_end --evaluation_strategy steps --save_strategy steps --save_total_limit 1 \
    --eval_steps 4000 --save_steps 4000 
```
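
The `$WD` in the command is never defined in this README, and an empty value would leave `--weight_decay` without an argument. A minimal sketch of supplying it, where the value `0` is only a placeholder assumption and not a setting confirmed by the source:

```bash
# Placeholder (assumption): keep an already exported WD, otherwise fall back to 0.
# Replace 0 with the weight decay you actually want to use.
export WD="${WD:-0}"
echo "Using --weight_decay ${WD}"
```

Run this in the same shell session before the `python run.py` command so that the variable is expanded when the command is parsed.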