Explanation-Assisted Data Augmentation for Graph Learning

26 Sept 2024 (modified: 28 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: empirical risk minimization, explainable graph neural networks
Abstract: This work introduces a novel class of Data Augmentation (DA) techniques for graph learning. In general, DA refers to techniques that enlarge the training set using label-preserving transformations. Such techniques improve robustness and generalization, especially when the original training set is small. A fundamental idea in DA is that labels are invariant under domain-specific transformations of the input samples. However, identifying such transformations for graph-structured inputs is challenging due to the complex nature of graphs and the need to preserve their structural and semantic properties. In this work, we propose explanation-assisted DA (EA-DA) for Graph Neural Networks (GNNs). A graph explanation is a subgraph that is an 'almost sufficient' statistic of the input graph with respect to its classification label. Consequently, the classification label is invariant, with high probability, to perturbations of graph edges that do not belong to the explanation subgraph. We develop EA-DA techniques that leverage such perturbation invariances. First, we show analytically that the sample complexity of explanation-assisted learning can be arbitrarily smaller than that of explanation-agnostic learning. On the other hand, we show that if the training set is enlarged using EA-DA techniques and the learning rule does not distinguish between the augmented and the original data, then the sample complexity can be worse than that of explanation-agnostic learning. We identify the out-of-distribution nature of graph perturbations as the main reason for this potential increase in sample complexity. We conclude that, in theory, EA-DA can improve sample complexity, but the learning rule must distinguish between augmented and original data. Building on these theoretical insights, we introduce practically implementable EA-DA techniques and associated learning mechanisms, and perform extensive empirical evaluations.
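To make the core EA-DA idea concrete, the sketch below illustrates the perturbation step described in the abstract: edges outside the explanation subgraph are randomly dropped, while explanation edges are kept intact, so the label is preserved with high probability. This is a minimal illustration, not the paper's actual method; the function name `ea_augment`, the edge-set representation, and the drop probability `p` are all illustrative assumptions.

```python
import random

def ea_augment(edges, explanation_edges, p=0.2, seed=None):
    """Explanation-assisted augmentation sketch (illustrative, not the paper's code).

    edges: list of (u, v) tuples describing the input graph.
    explanation_edges: set of (u, v) tuples forming the explanation
        subgraph, i.e., an 'almost sufficient' statistic for the label.
    p: probability of dropping each non-explanation edge.

    Returns a perturbed edge list whose label is unchanged with high
    probability, because only non-explanation edges are modified.
    """
    rng = random.Random(seed)
    augmented = []
    for e in edges:
        if e in explanation_edges:
            augmented.append(e)   # explanation edges are always preserved
        elif rng.random() >= p:
            augmented.append(e)   # non-explanation edge survives the perturbation
        # else: the non-explanation edge is dropped (label-preserving with high probability)
    return augmented

# Hypothetical example: a 5-node graph whose explanation is the triangle on nodes {0, 1, 2}.
graph_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (1, 4)]
explanation = {(0, 1), (1, 2), (0, 2)}
augmented_graph = ea_augment(graph_edges, explanation, p=0.5, seed=0)
print(augmented_graph)  # the explanation triangle is intact; other edges are thinned
```

Per the abstract's second theoretical finding, a learning rule consuming such augmented samples should distinguish them from the original samples (for instance, by tagging augmented graphs or weighting them separately in the loss), since the perturbed graphs are out-of-distribution relative to the original data.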
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6430