Uncertainty for Active Learning on Graphs

21 Sept 2023 (modified: 11 Feb 2024) | Submitted to ICLR 2024
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Active Learning, Uncertainty Estimation, Node Classification, Machine Learning on Graphs
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We derive ground-truth uncertainties for node classification and prove that epistemic uncertainty sampling aligns with an optimal active learning strategy. However, existing uncertainty estimators perform suboptimally.
Abstract: Active learning (AL) is a promising technique for improving the data efficiency of machine learning models by iteratively acquiring data labels during training. While Uncertainty Sampling (US), a strategy that labels the data points with the highest uncertainty, has proven effective for independent data, its implications for interdependent data, such as nodes in graphs, remain under-explored. In this work, we present the first extensive study of US for node classification. Our contribution is threefold: **(1)** We are the first to provide a benchmark for US approaches beyond predictive uncertainty. We highlight a performance gap between conventional AL strategies for graphs and US. **(2)** We develop novel ground-truth Bayesian uncertainty estimates in terms of the data-generating process. We both theoretically prove and empirically confirm their effectiveness in guiding US toward high-quality label queries. **(3)** Based on our analysis, we highlight pitfalls in modeling uncertainty and relate them to contemporary uncertainty estimators for node classification.
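As a rough illustration of the generic US acquisition step described in the abstract, the sketch below scores each node by the entropy of its predicted class distribution and queries the unlabeled node with the highest score. This is a minimal sketch under stated assumptions, not the submission's method: predictive entropy is only one possible uncertainty measure, and the names `predictive_entropy`, `select_query`, `probs`, and `labeled_mask` are hypothetical and chosen for illustration.

```python
import numpy as np


def predictive_entropy(probs):
    """Entropy of each row of an (n_nodes, n_classes) probability matrix."""
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # avoids log(0) for one-hot predictions
    return -np.sum(probs * np.log(probs + eps), axis=1)


def select_query(probs, labeled_mask):
    """Return the index of the unlabeled node with the highest entropy.

    Assumption: `probs` comes from any node classifier; `labeled_mask`
    is a boolean array marking nodes whose labels are already known.
    """
    scores = predictive_entropy(probs)
    # Exclude nodes that are already labeled from the acquisition step.
    scores = np.where(labeled_mask, -np.inf, scores)
    return int(np.argmax(scores))


# Toy example: three nodes, two classes (values are made up).
probs = np.array([[0.9, 0.1],
                  [0.5, 0.5],
                  [0.6, 0.4]])
labeled = np.array([False, True, False])
query = select_query(probs, labeled)
```

In a full AL loop, the queried node's label would be added to the training set, the classifier retrained, and the step repeated until the labeling budget is exhausted. Note that this score treats nodes independently and ignores the graph structure.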
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3559