Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions

Leslie O'Bray; Max Horn; Bastian Rieck; Karsten Borgwardt

Published: 28 Jan 2022, Last Modified: 22 Jun 2025, ICLR 2022 Spotlight, Readers: Everyone

Keywords: graph generative models, model evaluation

Abstract: Graph generative models are a highly active branch of machine learning. Given the steady development of new models of ever-increasing complexity, it is necessary to provide a principled way to evaluate and compare them. In this paper, we enumerate the desirable criteria for such a comparison metric and provide an overview of the status quo of graph generative model comparison, which predominantly relies on the maximum mean discrepancy (MMD). We perform a systematic evaluation of MMD in the context of graph generative model comparison, highlighting some of the challenges and pitfalls researchers may inadvertently encounter. After a thorough analysis of the behaviour of MMD on synthetically generated perturbed graphs as well as on recently proposed graph generative models, we provide a suitable procedure to mitigate these challenges and pitfalls. We aggregate our findings into a list of practical recommendations for researchers to use when evaluating graph generative models.

One-sentence Summary: We investigate the potential pitfalls of using MMD to evaluate graph generative models and propose recommendations for the practitioner on how to mitigate those challenges.
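For readers unfamiliar with MMD, the following is a minimal sketch of how a squared MMD estimate between two sets of graphs might be computed. It is not the authors' code. It assumes each graph is first summarised by a fixed-length descriptor (here a normalised degree histogram), and it uses a Gaussian kernel with a bandwidth `sigma`; the function names `gaussian_kernel`, `mmd_squared`, and `degree_histogram`, as well as the defaults, are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on the Euclidean distance between two vectors.

    The kernel and the bandwidth `sigma` are assumptions of this sketch.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def mmd_squared(xs, ys, kernel=gaussian_kernel, **kernel_args):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    `xs` and `ys` are sequences of descriptor vectors, one per graph.
    """
    k_xx = np.mean([kernel(a, b, **kernel_args) for a in xs for b in xs])
    k_yy = np.mean([kernel(a, b, **kernel_args) for a in ys for b in ys])
    k_xy = np.mean([kernel(a, b, **kernel_args) for a in xs for b in ys])
    return float(k_xx + k_yy - 2.0 * k_xy)


def degree_histogram(adjacency, max_degree):
    """Normalised degree histogram of an undirected graph.

    `adjacency` is a square 0/1 matrix. Nodes with degree above
    `max_degree` are dropped, which is a simplification of this sketch.
    """
    degrees = np.asarray(adjacency).sum(axis=1).astype(int)
    hist = np.bincount(degrees, minlength=max_degree + 1)[: max_degree + 1]
    return hist / hist.sum()


# Two tiny example graphs: a path on three nodes and a triangle.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

reference = [degree_histogram(path, max_degree=2)]
generated = [degree_histogram(triangle, max_degree=2)]

# Recomputing with several bandwidths shows how much the value depends on sigma.
for sigma in (0.1, 1.0, 10.0):
    print(sigma, mmd_squared(reference, generated, sigma=sigma))
```

Because `sigma` and the descriptor are free choices in this sketch, reporting the value for a single setting can be misleading; looping over several settings, as in the last lines, is a cheap sanity check.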

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/evaluation-metrics-for-graph-generative/code)
