Abstract: The increasing demand for data privacy has spurred interest in generative models for simulating tabular data. However, the diverse data types and column distributions in tabular datasets make it difficult to develop generative models that are both efficient and expressive. Existing approaches learn joint distributions and statistics to generate distribution-consistent synthetic tables, but they often fail to capture global feature interactions or to adapt to the requirements of downstream tasks. Moreover, current methods are sensitive to the ordering of input data during training, which leads to suboptimal performance. In this paper, we introduce HyperTab, a novel generative model based on hypergraph modeling and adversarial training. HyperTab uses hypergraphs to capture the permutation invariance inherent in tabular data. Its hypergraph encoder explicitly models the interactions between features, so that the synthesized data accurately reflects global dependencies. Additionally, HyperTab adopts a multi-objective optimization strategy that simultaneously minimizes three losses: an adversarial loss to ensure data authenticity, a logical consistency loss to keep feature relationships plausible, and a task-specific loss (mean squared error for regression tasks, cross-entropy for classification tasks). Extensive evaluations across multiple benchmark datasets demonstrate that HyperTab significantly outperforms existing methods in terms of generation quality.
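The multi-objective strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three losses are combined, so the weighted sum below, the weights `w_logic` and `w_task`, and all function names are assumptions made for illustration; they are not HyperTab's actual formulation. The adversarial and logical consistency terms are passed in as plain numbers because the abstract does not define them.

```python
import math

def mse(pred, target):
    """Mean squared error, the task loss the abstract names for regression."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(probs, labels):
    """Mean negative log-likelihood, the task loss named for classification.

    probs[i] is a list of class probabilities for sample i,
    labels[i] is the index of the true class.
    """
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

def total_loss(adv, logic, task, w_logic=1.0, w_task=1.0):
    """Hypothetical weighted sum of the three objectives (assumption)."""
    return adv + w_logic * logic + w_task * task

# Regression case: the task term is MSE.
task_reg = mse([2.0, 4.0], [1.0, 5.0])
loss_reg = total_loss(adv=0.7, logic=0.2, task=task_reg, w_logic=0.5, w_task=2.0)

# Classification case: the task term is cross-entropy.
task_cls = cross_entropy([[0.5, 0.5]], [0])
loss_cls = total_loss(adv=0.7, logic=0.2, task=task_cls)
```

In a real training loop each term would be a differentiable tensor computed from the generator and discriminator outputs; the scalar version here only shows how a task-dependent term could be swapped in.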
External IDs: dblp:conf/icdar/OuyangJXZQ25