When Simplicity Wins: Efficient Knowledge Graph Generation with Sequential Decoders

ICLR 2026 Conference Submission 19352 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: knowledge graph generation, autoregressive models, structured data generation, knowledge graph, knowledge representation
TL;DR: We show that simple GRU-based models match transformer performance for knowledge graph generation while being 3.7-11× faster, challenging the assumption that complex attention mechanisms are necessary for structured data tasks.
Abstract: Knowledge Graph (KG) generation requires models to learn complex semantic dependencies between triples while maintaining domain validity constraints. State-of-the-art graph generation models rely on expensive attention mechanisms to capture complex dependencies, yet (head, relation, tail) triples can be straightforwardly represented as sequences, suggesting that simpler architectures may suffice for KGs. We present $\textbf{ARK}$ ($\textbf{A}$uto-$\textbf{R}$egressive $\textbf{K}$nowledge Graph Generation), a family of RNN- and transformer-based models that successfully perform KG generation. We show that the RNN variant requires only 9-21\% of the training time of the transformer variant, a 3.7-11× speedup. The RNN generates semantically valid graphs with 89.2-100.0\% validity on IntelliGraphs benchmarks, with less than 0.76\% degradation relative to the transformer on real-world datasets, while achieving up to 10.7\% better compression rates on synthetic datasets and 12.8-21.1\% gains on real-world datasets. Our analysis reveals that for KG generation, model capacity (hidden dimensionality $\geq$ 64) matters more than depth, with single-layer GRUs matching deep transformer performance. We also introduce $\textbf{SAIL}$, an extension of ARK that adds variational latent variables for controlled diversity and interpolation in KG generation. Both models support unconditional sampling and conditional generation from partial graphs. Our findings challenge the assumption that structured data generation requires attention mechanisms. This efficiency gain can enable the generation of larger KGs and unlock new applications.
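To illustrate the core architectural claim, that a compact sequential decoder suffices once triples are flattened into token sequences, the sketch below shows a single-layer GRU decoding a KG linearized as [head, relation, tail, head, relation, tail, ...]. This is not the authors' ARK implementation; the class name, the shared entity/relation vocabulary, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the ARK code): a single-layer GRU that autoregressively
# decodes a knowledge graph linearized as a flat token sequence
# [h1, r1, t1, h2, r2, t2, ...]. Entities and relations are assumed to share
# one vocabulary; <bos>/<eos> IDs and the hidden size are illustrative.
import torch
import torch.nn as nn


class GRUGraphDecoder(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim, num_layers=1, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, hidden=None):
        # tokens: (batch, seq_len) integer IDs; returns next-token logits.
        emb = self.embed(tokens)
        output, hidden = self.gru(emb, hidden)
        return self.out(output), hidden

    @torch.no_grad()
    def sample(self, bos_id: int, eos_id: int, max_len: int = 60):
        # Unconditional sampling: feed each sampled token back in, one step at a time.
        token = torch.tensor([[bos_id]])
        hidden, generated = None, []
        for _ in range(max_len):
            logits, hidden = self.forward(token, hidden)
            token = torch.multinomial(torch.softmax(logits[:, -1], dim=-1), 1)
            if token.item() == eos_id:
                break
            generated.append(token.item())
        # Re-group the flat sequence into (head, relation, tail) triples.
        return [tuple(generated[i:i + 3]) for i in range(0, len(generated) - 2, 3)]
```

Conditional generation from a partial graph would, under the same assumptions, amount to teacher-forcing the known triples through `forward` and then continuing `sample` from the resulting hidden state.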
Supplementary Material: pdf
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 19352