LGGBench: a Holistic Benchmark for Large Graph Generation

TMLR Paper 5809 Authors

04 Sept 2025 (modified: 16 Sept 2025) · Under review for TMLR · CC BY 4.0
Abstract: The escalating demand for robust graph data sharing between organizations has propelled the development of methodologies that assess the efficacy and privacy of these shared graphs. We present LGGBench, a comprehensive benchmark designed to evaluate large graph generation methods across multiple dimensions crucial to proprietary data sharing. The benchmark integrates a diverse array of large graph datasets, sophisticated graph generation techniques, and comprehensive evaluation schemes to address the current shortcomings in graph data sharing. It evaluates generated graphs in terms of fidelity, utility, privacy, and scalability. Fidelity is assessed through graph statistical metrics, while utility measures the practical applicability of synthetic graphs in real-world downstream tasks. Privacy is evaluated through robustness against various inference attacks, and scalability is demonstrated by the benchmark's ability to handle extensive graph datasets efficiently. Through extensive experiments, we compare existing graph generation methods, highlighting their strengths and limitations across different types of graphs and evaluation metrics. The benchmark provides a holistic approach to evaluating and improving graph generation techniques, facilitating safer and more effective data sharing practices.
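The abstract describes fidelity evaluation via graph statistical metrics. As an illustration only (the paper's actual metric suite is not specified here), a common fidelity check of this kind compares the degree distributions of the real and synthetic graphs; the sketch below uses the total variation distance between normalized degree histograms. All function names and the toy graphs are hypothetical, not part of LGGBench.

```python
from collections import Counter

def degree_distribution(edges, n):
    """Normalized degree histogram of an undirected graph on n nodes."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    max_d = max(deg.values(), default=0)
    hist = [0.0] * (max_d + 1)
    for node in range(n):
        hist[deg.get(node, 0)] += 1.0 / n
    return hist

def total_variation(p, q):
    """Total variation distance between two histograms (zero-padded to equal length)."""
    m = max(len(p), len(q))
    p = p + [0.0] * (m - len(p))
    q = q + [0.0] * (m - len(q))
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Toy example: a "real" triangle vs. a "synthetic" 3-node path.
real_edges = [(0, 1), (1, 2), (2, 0)]
synthetic_edges = [(0, 1), (1, 2)]
fidelity_gap = total_variation(degree_distribution(real_edges, 3),
                               degree_distribution(synthetic_edges, 3))
print(fidelity_gap)  # 0.0 would mean identical degree distributions
```

In a full benchmark this single statistic would be one of several (clustering coefficient, spectral properties, motif counts, etc.), aggregated across datasets; the example only shows the general pattern of comparing a graph statistic between real and generated graphs.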
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Olgica_Milenkovic1
Submission Number: 5809