**********************************
Acknowledgement
**********************************
The code is modified from:
1. StyleGAN3: https://github.com/NVlabs/stylegan3
2. LayoutGAN++: https://github.com/ktrk115/const_layout/blob/master/model/layoutganpp.py
3. DETR: https://github.com/facebookresearch/detr
4. BLIP: https://github.com/salesforce/BLIP

**********************************
Training
**********************************
A typical training command looks like this:
python train.py --gpus=8 --batch=8 --workers=8 --tick=1 --snap=20 \
--cfg=layoutganpp --aug=noaug \
--gamma=0.0 --pl-weight=0.0 \
--bbox-cls-weight=50.0 --bbox-rec-weight=500.0 --text-rec-weight=0.1 --text-len-rec-weight=2.0 --im-rec-weight=0.5 \
--bbox-giou-weight=4.0 --overlapping-weight=7.0 --alignment-weight=17.0 --z-rec-weight=5.0 \
--z-dim=4 --g-f-dim=256 --g-num-heads=4 --g-num-layers=8 --d-f-dim=256 --d-num-heads=4 --d-num-layers=8 \
--bert-f-dim=768 --bert-num-heads=4 --bert-num-encoder-layers=12 --bert-num-decoder-layers=2 \
--background-size=256 --im-f-dim=512 \
--metrics=layout_fid50k_train,layout_fid50k_val,overlap50k_alignment50k_layoutwise_iou50k_layoutwise_docsim50k_train,overlap50k_alignment50k_layoutwise_iou50k_layoutwise_docsim50k_val,fid50k_train,fid50k_val \
--data=data/dataset/ads_banner_collection_manual_3x_mask/zip/train.zip \
--outdir=training-runs/layoutganpp/ads_banner_collection_manual_3x_mask_50cls_2len_5z
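
The command above is long, so when sweeping loss weights it can help to assemble it from grouped settings. The following is a minimal sketch, not part of the repository: the flag names and values are copied from the command above, while the grouping into RUN, LOSS_WEIGHTS, MODEL and PATHS and the helper build_command are hypothetical names introduced only for illustration.

```python
import shlex

# Flag names and values are copied from the training command above.
# The grouping into dictionaries is an assumption made for readability.
RUN = {"gpus": 8, "batch": 8, "workers": 8, "tick": 1, "snap": 20,
       "cfg": "layoutganpp", "aug": "noaug"}

LOSS_WEIGHTS = {
    "gamma": 0.0, "pl-weight": 0.0,
    "bbox-cls-weight": 50.0, "bbox-rec-weight": 500.0,
    "text-rec-weight": 0.1, "text-len-rec-weight": 2.0,
    "im-rec-weight": 0.5, "bbox-giou-weight": 4.0,
    "overlapping-weight": 7.0, "alignment-weight": 17.0,
    "z-rec-weight": 5.0,
}

MODEL = {
    "z-dim": 4,
    "g-f-dim": 256, "g-num-heads": 4, "g-num-layers": 8,
    "d-f-dim": 256, "d-num-heads": 4, "d-num-layers": 8,
    "bert-f-dim": 768, "bert-num-heads": 4,
    "bert-num-encoder-layers": 12, "bert-num-decoder-layers": 2,
    "background-size": 256, "im-f-dim": 512,
}

METRICS = [
    "layout_fid50k_train", "layout_fid50k_val",
    "overlap50k_alignment50k_layoutwise_iou50k_layoutwise_docsim50k_train",
    "overlap50k_alignment50k_layoutwise_iou50k_layoutwise_docsim50k_val",
    "fid50k_train", "fid50k_val",
]

PATHS = {
    "metrics": ",".join(METRICS),
    "data": "data/dataset/ads_banner_collection_manual_3x_mask/zip/train.zip",
    "outdir": "training-runs/layoutganpp/"
              "ads_banner_collection_manual_3x_mask_50cls_2len_5z",
}


def build_command(*groups, script="train.py"):
    """Flatten flag groups into an argv list: ['python', script, '--k=v', ...]."""
    argv = ["python", script]
    for group in groups:
        for name, value in group.items():
            argv.append(f"--{name}={value}")
    return argv


if __name__ == "__main__":
    # Print a shell-safe, single-line version of the command.
    print(shlex.join(build_command(RUN, LOSS_WEIGHTS, MODEL, PATHS)))
```

To try a different weight, copy a group and override one entry, for example dict(LOSS_WEIGHTS, **{"alignment-weight": 10.0}), then rebuild the command. The returned list can also be passed directly to subprocess.run(argv, check=True).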

**********************************
Generation
**********************************
For a typical generation script, see ./script_gen_single_sample_API.py