Attention-based Graph Coreset Labeling for Active Learning

Xueqi Ma; Xingjun Ma; Sarah Monazam Erfani; James Bailey

26 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: active learning; graph learning

Abstract: GNN-based Active Learning (AL) methods have been proposed to improve labeling efficiency by selecting the most informative nodes in a graph for labeling. Existing graph active learning methods employ different heuristic approaches; while sometimes efficient, they fail to explicitly explore the influence of labeled data on unlabeled data, which limits the generalizability of graph models to various types of graph data. In this paper, we propose an Attention-based Graph Coreset Labeling framework (AGCL). With a limited budget, AGCL gradually discovers core data to be labeled from a global view, so as to obtain a training dataset that efficiently depicts the whole graph space and maximizes the performance of GNNs. Specifically, we explicitly explore and exploit the correlations between nodes in the unlabeled pool and those in the labeled pool using an attention architecture, and directly connect these correlations with the prediction performance on the unlabeled set. Using influence scores, AGCL identifies the data to label that have the maximum representation difference from the existing labeled pool, which improves sample complexity. We theoretically demonstrate the superiority of the attention-based data selection strategy in reducing the covering radius bound, thereby improving the expected prediction performance on unlabeled data. Our experimental results show that the labeled coreset can improve the generalizability of various graph models across different graph datasets, as well as CNN models on image classification tasks.
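The selection idea in the abstract can be illustrated with a small sketch. This is not the AGCL implementation: the scoring rule (scaled dot-product attention from each labeled node over the unlabeled pool, summed into an influence score), the function names `influence_scores`, `select_next`, and `covering_radius`, and the toy 2-D embeddings are all assumptions made for illustration. It only shows how picking the unlabeled node that receives the least attention from the labeled pool can shrink the covering radius.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def influence_scores(labeled, unlabeled):
    # Hypothetical influence score: each labeled embedding acts as a query and
    # distributes attention over the unlabeled embeddings (keys); a node's
    # score is the total attention it receives from the labeled pool.
    d = len(labeled[0])
    scores = [0.0] * len(unlabeled)
    for q in labeled:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in unlabeled]
        for i, w in enumerate(softmax(logits)):
            scores[i] += w
    return scores

def select_next(labeled, unlabeled):
    # Assumed rule: label the node least influenced by the labeled pool,
    # i.e. the one whose representation differs most from it.
    scores = influence_scores(labeled, unlabeled)
    return min(range(len(unlabeled)), key=scores.__getitem__)

def covering_radius(labeled, pool):
    # Largest distance from any pool point to its nearest labeled point.
    return max(min(math.dist(p, c) for c in labeled) for p in pool)

# Toy embeddings (illustrative values only).
labeled = [[1.0, 0.0]]
unlabeled = [[0.9, 0.1], [0.0, 1.0], [0.8, -0.1]]

before = covering_radius(labeled, unlabeled)
pick = select_next(labeled, unlabeled)
labeled.append(unlabeled.pop(pick))
after = covering_radius(labeled, unlabeled)
print(pick, before, after)
```

In this toy setting the node orthogonal to the labeled embedding is chosen, and the covering radius of the remaining pool drops once it is labeled. A real system would use learned GNN embeddings and a trained attention module rather than raw dot products.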

Primary Area: learning on graphs and other geometries & topologies

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7367
