# AGDC-ContLayNet
## "Autoregressive Generation of Variable-Length Sequences with Joint Discrete and Continuous Spaces" - for ContLayNet

![assets/teaser](assets/teaser.png)

AGDC is a **unified autoregressive framework that jointly models discrete and continuous values** for high-fidelity sequence generation. Unlike traditional approaches that rely on discretization, AGDC directly models continuous values using denoising score matching while handling discrete values with cross-entropy loss.
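
As a rough illustration of the two loss terms named above, the sketch below computes a cross-entropy term for one discrete token and a denoising score matching term for one continuous scalar. It is not the repository's implementation: the function names, the single noise level `sigma`, the toy score function, and the way the two terms are combined through `weight` are all assumptions made for illustration.

```python
import math
import random


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]


def dsm_loss(score_fn, x, sigma, rng):
    """Denoising score matching for one scalar at a single noise level.

    Perturb x with Gaussian noise, then regress the model score onto the
    score of the perturbation kernel: -(x_noisy - x) / sigma**2.
    """
    x_noisy = x + sigma * rng.gauss(0.0, 1.0)
    target = -(x_noisy - x) / sigma ** 2
    return 0.5 * (score_fn(x_noisy) - target) ** 2


def joint_loss(logits, token, score_fn, x, sigma, weight, rng):
    """Hypothetical combination: discrete term plus weighted continuous term."""
    return cross_entropy(logits, token) + weight * dsm_loss(score_fn, x, sigma, rng)


rng = random.Random(0)
sigma = 0.1
# Toy score function (assumption): exact score of a Gaussian centered at 0.5.
score = lambda z: -(z - 0.5) / sigma ** 2
loss = joint_loss([2.0, 0.5, -1.0], 0, score, 0.5, sigma, 1.0, rng)
print(f"joint loss: {loss:.4f}")
```

Because the toy score function is exact for the chosen `x`, the continuous term vanishes here and only the cross-entropy term remains; a learned network would replace `score` in practice.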

This repository contains the AGDC implementation for our proposed ContLayNet dataset.

## 🛠️ Installation

```bash
conda create -n agdc python=3.11 -y
conda activate agdc
pip install -r requirements.txt
```

## 📊 Dataset
![assets/contlaynet](assets/contlaynet.png)

ContLayNet is a large-scale, GDSII-formatted semiconductor layout dataset at the nanometer level, designed specifically for high-precision deep learning tasks.

### Download Instructions
1. Download the following dataset files from [this anonymous link](https://drive.google.com/drive/folders/1VOxnaDUe1YLGXWHWqE9hCzlFMJHwqNA-?usp=sharing):
   - `train.pt`
   - `val.pt`
   - `test.pt`
2. Place the files in the following directory:
   ```text
   AGDC/dataset/processed/
   ```
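
Before training, it can help to confirm that the three files are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_files` is hypothetical, and the default path assumes the script is run from the directory that contains `AGDC/`.

```python
from pathlib import Path

# File names taken from the download instructions above.
EXPECTED_FILES = ("train.pt", "val.pt", "test.pt")


def missing_dataset_files(root="AGDC/dataset/processed"):
    """Return the expected dataset files that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```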


## 🔧 Training
Run training from the `AGDC/` directory:

```bash
python main.py \
    --max_length 600 \
    --epochs 12 \
    --batch_size 4 \
    --loss_weight 100 \
    --diffloss_d 3 \
    --diffloss_w 1024 \
    --n_layer 32 \
    --n_head 16 \
    --n_embd 1024 \
    --lr 7.5e-5 \
    --exp {your_exp_name} \
    --eos_alpha 0.1 \
    --length_loss_weight 0.1 \
    --resume {if_needed.pth}
```
Model checkpoints will be saved to: `logs/{your_exp_name}/ckpt`.
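
To pick a checkpoint for `--resume` or for sampling, one option is to take the most recently modified `.pth` file in that directory. This is a small sketch under assumptions: the helper `latest_checkpoint` is hypothetical, checkpoints are assumed to carry the `.pth` extension, and modification time is used because the checkpoint naming scheme is not documented here.

```python
from pathlib import Path


def latest_checkpoint(exp_name, logs_root="logs"):
    """Return the most recently modified .pth file for an experiment, or None."""
    ckpt_dir = Path(logs_root) / exp_name / "ckpt"
    if not ckpt_dir.is_dir():
        return None
    # Sort by modification time so the newest checkpoint comes last.
    candidates = sorted(ckpt_dir.glob("*.pth"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    # "my_exp" is a placeholder experiment name.
    print(latest_checkpoint("my_exp"))
```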

## 🏃 Pre-trained Model
A pre-trained checkpoint (trained for 10 days on a single A6000 GPU, following the paper's settings) is available at [this anonymous link](https://drive.google.com/drive/folders/1VOxnaDUe1YLGXWHWqE9hCzlFMJHwqNA-?usp=sharing).

## 🎨 Sampling
Generate samples from a trained model:

```bash
# Run from the AGDC/ directory
python sample.py \
    --ckpt {path_to_your_pth}.pth \
    --diffloss_d 3 \
    --diffloss_w 1024 \
    --n_layer 32 \
    --n_head 16 \
    --n_embd 1024 \
    --batch_size 4 \
    --out_dir {directory_name_for_save} \
    --num_context_boxes 100 \
    --num_samples 1000
```
- `--num_context_boxes`: Number of initial boxes provided as context for sequence generation.
- `--num_samples`: Number of visualized outputs to save (written to `--out_dir`).

## 📸 Visualization
Convert sampling results from `.txt` files to visualized `.png` images:
```bash
python gds_txt_to_img.py {directory_for_saved_txts} {directory_for_saved_pngs} 'png'
```

## 📈 Evaluation
Run evaluation (ContLayNet Benchmark):
```bash
python eval_drc.py --folder {path_to_your_folder_with_sampled_txt}
```

Expected output:
```txt
Rule 1: mean: 0.xxx
Rule 2: mean: 0.xxx
Rule 3: mean: 0.xxx
Rule 4: mean: 0.xxx
```

Note that Rules 1, 2, 3, and 4 correspond to Rules PDC, CLC, HSC, and VSC, respectively.
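
If the scores need to be collected programmatically, the printed lines can be mapped to the rule names. The sketch below assumes the output matches the `Rule N: mean: <value>` format shown above exactly; `parse_drc_output` is a hypothetical helper, and the numbers in the example string are made up for illustration and are not results.

```python
import re

# Mapping from rule index to rule name, as stated in the note above.
RULE_NAMES = {1: "PDC", 2: "CLC", 3: "HSC", 4: "VSC"}

LINE_RE = re.compile(
    r"^Rule\s+(\d+):\s*mean:\s*(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)\s*$"
)


def parse_drc_output(text):
    """Map each 'Rule N: mean: value' line to {rule_name: value}."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            idx = int(match.group(1))
            results[RULE_NAMES.get(idx, f"Rule {idx}")] = float(match.group(2))
    return results


# Placeholder values only, not measured scores.
example = "Rule 1: mean: 0.125\nRule 2: mean: 0.5\n"
print(parse_drc_output(example))
```

Lines that do not match the pattern are skipped, so other log output mixed into the text does not raise an error.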

## 🙏 Acknowledgements
This project builds upon the following excellent open-source repositories:

- [mar](https://github.com/LTH14/mar)
- [DeepLayout](https://github.com/kampta/DeepLayout)

We thank the authors for their valuable contributions to the community.
