TabularARGN: An Auto-Regressive Generative Network for Tabular Data Generation

Published: 18 Nov 2025, Last Modified: 18 Nov 2025
Venue: AITD@EurIPS 2025 Poster
License: CC BY 4.0
Submission Type: Short paper (4 pages)
Keywords: Synthetic Data, Generative AI
TL;DR: We introduce a simple and flexible framework capable of generating a wide variety of tabular data structures and types.
Abstract: Synthetic data generation for tabular datasets requires a balance of fidelity, efficiency, and adaptability to address real-world applications. We present a synthetic data framework based on the Tabular Auto-Regressive Generative Network (TabularARGN), a flexible architectural framework for modeling tabular data. TabularARGN learns the joint distribution of tabular features by training on encoded representations of the original data and using randomly sampled variable orderings to fit an auto-regressive model. This design naturally supports conditional sampling across arbitrary feature subsets, enabling use cases such as class rebalancing, missing value imputation, and controllable sampling strategies. Although the architectural flexibility of the framework allows for these extensions, our focus in this work is on evaluating the fidelity and efficiency of the generated data. We demonstrate state-of-the-art utility performance with low computational overhead across established benchmarks, making TabularARGN a practical solution for scalable tabular data generation. The framework code is available at https://github.com/mostly-ai/mostlyai-engine.
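The any-order auto-regressive idea described in the abstract can be illustrated with a toy sketch. This is not the TabularARGN implementation or the mostlyai-engine API: the function names (`sample_conditional`, `generate_row`), the column names, and the data are all hypothetical, and a simple empirical-frequency lookup stands in for the trained neural network. The sketch only shows why sampling columns in a random order, each conditioned on the columns already known, makes conditional generation on an arbitrary feature subset straightforward.

```python
import random


def sample_conditional(rows, known, target, rng):
    """Draw a value for `target` given the already-known columns.

    Assumption: an empirical lookup over training rows replaces the
    learned conditional distribution a neural model would provide.
    """
    matches = [r[target] for r in rows
               if all(r[col] == val for col, val in known.items())]
    if not matches:
        # No training row matches the context: fall back to the marginal.
        matches = [r[target] for r in rows]
    return rng.choice(matches)


def generate_row(rows, columns, rng, fixed=None):
    """Generate one row, visiting the free columns in a random order."""
    known = dict(fixed or {})           # columns the caller conditions on
    order = [c for c in columns if c not in known]
    rng.shuffle(order)                  # randomly sampled variable ordering
    for col in order:
        known[col] = sample_conditional(rows, known, col, rng)
    return known


# Hypothetical toy data with already-discretized (encoded) columns.
COLUMNS = ["age_band", "income_band", "label"]
DATA = [
    {"age_band": "young", "income_band": "low", "label": "no"},
    {"age_band": "young", "income_band": "mid", "label": "no"},
    {"age_band": "old", "income_band": "mid", "label": "yes"},
    {"age_band": "old", "income_band": "high", "label": "yes"},
    {"age_band": "old", "income_band": "low", "label": "no"},
]

rng = random.Random(0)
unconditional = generate_row(DATA, COLUMNS, rng)
# Conditioning on the label is how class rebalancing could be expressed.
rebalanced = generate_row(DATA, COLUMNS, rng, fixed={"label": "yes"})
print(unconditional)
print(rebalanced)
```

Passing a partially filled row as `fixed` would correspond to missing value imputation in the same way: the known cells are kept and only the missing ones are sampled. Note that this exact-match lookup can only reproduce training rows, whereas a learned model generalizes beyond them.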
Published Venue And Year: EurIPS'25 Workshop on AI for Tabular Data
Submission Number: 2