The code for our method is mainly in GPT-2/examples/NLG/src/optimizer_custom.py

Quickstart

cd examples/NLG
pip install -r requirement.txt
bash download_pretrained_checkpoints.sh
bash create_datasets.sh
cd ./eval
bash download_evalscript.sh
cd ../../..
python setup.py develop
sudo apt-get install default-jre
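
After the setup commands above, it can help to confirm that the tools the later steps call are on PATH. The snippet below is a minimal sketch; check_cmd is a hypothetical helper, not part of this repository, and the list of commands is an assumption based on the steps in this README.

```shell
# Hypothetical helper (not part of the repository): report whether a command is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the tools used by the steps in this README; keep going if one is missing.
for cmd in python pip java; do
    check_cmd "$cmd" || true
done
```

A "missing: java" line suggests the default-jre install did not complete.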

E2E Experiment

1. Enter experiment folder
cd examples/NLG

2. Train GPT-2 medium with the AdamS or LRACS optimizer (see our paper for hyperparameters)
python -m torch.distributed.launch --nproc_per_node=1 --master-port=29512 src/gpt2_ft.py --config train_config_adams.yml
or
python -m torch.distributed.launch --nproc_per_node=1 --master-port=29512 src/gpt2_ft.py --config train_config_lracs.yml

3. Generate output
python -m torch.distributed.launch --nproc_per_node=1 src/gpt2_beam.py \
    --data ./data/e2e/test.jsonl \
    --batch_size 1 \
    --seq_len 512 \
    --eval_len 64 \
    --model_card gpt2.md \
    --init_checkpoint trained_models/GPT2_M/e2e/AdamS/model_E2E_experiment.26290.pt \
    --platform local \
    --lora_dim 4 \
    --lora_alpha 32 \
    --beam 10 \
    --length_penalty 0.8 \
    --no_repeat_ngram_size 4 \
    --repetition_penalty 1.0 \
    --eos_token_id 628 \
    --work_dir trained_models/GPT2_M/e2e/AdamS \
    --output_file predict_E2E_experiment_AdamS_3e-4.jsonl

4. Decode outputs from step (3)
python src/gpt2_decode.py \
    --vocab ./vocab \
    --sample_file trained_models/GPT2_M/e2e/AdamS/predict_E2E_experiment_AdamS_3e-4.jsonl \
    --input_file ./data/e2e/test_formatted.jsonl \
    --output_ref_file e2e_ref.txt \
    --output_pred_file e2e_pred.txt

5. Run evaluation on E2E test set
python eval/e2e/measure_scores.py e2e_ref.txt e2e_pred.txt -p
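
Each step above reads files written by the previous one, so a small guard can catch a missing or empty file before a later step runs. This is a minimal sketch; require_nonempty is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper (not part of the repository): fail early if a file is missing or empty.
require_nonempty() {
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Example use before step 5 (commented out so this block has no side effects):
# require_nonempty e2e_ref.txt && require_nonempty e2e_pred.txt \
#     && python eval/e2e/measure_scores.py e2e_ref.txt e2e_pred.txt -p
```

The same guard can be placed in front of step 3 (checkpoint file) or step 4 (prediction file).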