Algorithms and Hardness for Active Learning on Graphs

Published: 01 May 2025, Last Modified: 18 Jun 2025ICML 2025 posterEveryoneRevisionsBibTeXCC BY 4.0
TL;DR: We present the first algorithm with theoretical guarantees for an established active learning problem on general graphs
Abstract: We study the offline active learning problem on graphs. In this problem, one seeks to select k vertices whose labels are best suited for predicting the labels of all the other vertices in the graph. Guillory and Bilmes (Guillory & Bilmes, 2009) introduced a natural theoretical model motivated by a label smoothness assumption. Prior to our work, algorithms with theoretical guarantees were only known for restricted graph types such as trees (Cesa-Bianchi et al., 2010) despite the models simplicity. We present the first O(log n)-resource augmented algorithm for general weighted graphs. To complement our algorithm, we show constant hardness of approximation.
Lay Summary: In active learning, you get to choose which data points are labeled. A real world example are surveys, where respondents are usually carefully picked rather than selected arbitrarily. We give a new algorithm for selecting which data points to label based on similarity information between data points. For a natural objective, our algorithm always performs at least as well as the optimal algorithm with a more limited budget, and we observe that it outperforms previous algorithms in experiments.
Link To Code: https://github.com/mesimon/graph_label_selection
Primary Area: Theory->Active Learning and Interactive Learning
Keywords: graph, active learning, label selection, resource augmented algorithms, approximation algorithms
Submission Number: 2504
Loading