AGDC: Autoregressive Generation of Variable-Length Sequences with Joint Discrete and Continuous Spaces

14 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: joint discrete-continuous generation, autoregression, high-precision, layout generation, svg, length control
TL;DR: We propose AGDC, a framework that jointly models discrete and continuous values to preserve full precision in vector representations, and introduce ContLayNet, a benchmark dataset with evaluation metrics for high-precision semiconductor layouts.
Abstract: Transformer-based autoregressive models excel at data generation but are inherently constrained by their reliance on discretized tokens, which limits their ability to represent continuous values with high precision. We analyze the scalability limitations of existing discretization-based approaches for generating hybrid discrete-continuous sequences, particularly in high-precision domains such as semiconductor circuit layout design, where precision loss can lead to functional failure. To address this challenge, we propose **AGDC**, a novel unified framework that *jointly models discrete and continuous values for variable-length sequences*. AGDC combines categorical prediction for discrete values with diffusion-based modeling for continuous values and incorporates two key technical components: an end-of-sequence (EOS) logit adjustment mechanism, in which an MLP dynamically adjusts the EOS token logit based on sequence context, and a length regularization term integrated into the loss function. Additionally, we present **ContLayNet**, a large-scale benchmark comprising 334K high-precision semiconductor layout samples, with specialized evaluation metrics that capture functional correctness in a setting where precision errors significantly impact performance. Experiments on semiconductor layouts (ContLayNet), graphic layouts, and SVGs show that AGDC generates higher-fidelity hybrid vector representations than discretization-based and fixed-schema baselines, achieving scalable high-precision generation across diverse domains.
Supplementary Material: zip
Primary Area: generative models
Submission Number: 5060
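
The EOS logit adjustment described in the abstract can be illustrated with a minimal sketch. The abstract states only that an MLP adjusts the EOS token logit based on sequence context; everything else here is an assumption made for illustration: the one-hidden-layer MLP, the additive form of the adjustment, the toy dimensions, the random weights, and the names `mlp_scalar` and `adjust_eos_logit`. It is not the authors' implementation.

```python
import math
import random


def mlp_scalar(context, w1, b1, w2, b2):
    """Hypothetical one-hidden-layer ReLU MLP: context vector -> scalar offset."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, context)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_eos_logit(logits, eos_index, offset):
    """Return a copy of the logits with the offset added to the EOS entry only."""
    adjusted = list(logits)
    adjusted[eos_index] += offset
    return adjusted


# Toy setup (all values are illustrative assumptions, not from the paper).
rng = random.Random(0)
context = [rng.uniform(-1, 1) for _ in range(4)]   # stand-in for a sequence summary
w1 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
b1 = [0.0] * 8
w2 = [rng.uniform(-1, 1) for _ in range(8)]
b2 = 0.0

EOS = 3                                # index of the EOS token in a 4-token vocabulary
logits = [1.2, 0.3, -0.5, 0.1]         # raw next-token logits from the backbone

offset = mlp_scalar(context, w1, b1, w2, b2)
probs_before = softmax(logits)
probs_after = softmax(adjust_eos_logit(logits, EOS, offset))

print("EOS offset:", offset)
print("P(EOS) before:", probs_before[EOS])
print("P(EOS) after: ", probs_after[EOS])
```

Under this additive form, only the probability of stopping changes: the relative probabilities of the non-EOS tokens are untouched, so the adjustment influences sequence length without reordering the model's other predictions.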