Using Sum-Product Networks to Assess Uncertainty in Deep Active Learning

Mohamadsadegh Khosravani; Sandra Zilles

Published: 04 Mar 2024, Last Modified: 17 Sept 2024. Accepted by TMLR. CC BY 4.0

Abstract: The success of deep active learning hinges on the choice of an effective acquisition function, which ranks unlabeled data points according to their expected informativeness. Many acquisition functions are (partly) based on the uncertainty that the current model has about the class label of a point, yet there is no generally agreed-upon strategy for computing such uncertainty. This paper proposes a new and very simple approach to computing uncertainty in deep active learning with a Convolutional Neural Network (CNN). The main idea is to use the feature representation extracted by the CNN as data for training a Sum-Product Network (SPN). Since SPNs are typically used for estimating the distribution of a dataset, they are well suited to the task of estimating class probabilities that can be used directly by standard acquisition functions such as max entropy and variational ratio. The effectiveness of our method is demonstrated in an experimental study on several standard benchmark datasets for image classification, where we compare it to various state-of-the-art methods for assessing uncertainty in deep active learning.
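To illustrate the last step described in the abstract, the following is a minimal sketch of how class probabilities can feed the two acquisition functions it names: max entropy and variational ratio (often written "variation ratio"). It is not the authors' implementation. The function names, the `pool_probs` values, and the query size `k` are hypothetical, and the probabilities are simply given as lists; in the paper they would be estimated by an SPN trained on CNN features, which this sketch does not model.

```python
import math

def max_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def variation_ratio(probs):
    """One minus the probability of the most likely class."""
    return 1.0 - max(probs)

def select_queries(pool_probs, score_fn, k):
    """Return indices of the k pool points with the highest uncertainty score."""
    scores = [score_fn(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# Hypothetical class probabilities for three unlabeled points (illustrative values only).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # highly uncertain
    [0.70, 0.20, 0.10],  # moderately uncertain
]

by_entropy = select_queries(pool_probs, max_entropy, k=2)
by_ratio = select_queries(pool_probs, variation_ratio, k=2)
print(by_entropy, by_ratio)
```

Both scores depend only on the probability vector, so any estimator that outputs class probabilities can be swapped in without changing the selection step.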

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: Thanks to all reviewers and the action editor for their thorough review and their helpful comments. We followed all requests made by the action editor. In particular: (1) We added an analysis of expected calibration error (ECE); see Section 5.4. (2) We improved the writing by (2.1) splitting the introduction into two sections (see new Section 2 on Related Work), (2.2) scaling up figures to make them more readable, (2.3) removing typos, (2.4) reformulating a few sentences throughout for better readability. (3) More details on SPN: We expanded the discussion on SPN in Section 3. Moreover, we explain engineering decisions on the SPN structure used in the experiments in much more detail in Appendix A.1. (4) We address limitations of our method, in particular concerning datasets with a large number of categories, in multiple places, including Introduction, Conclusions (last paragraph), as well as "Choice of Datasets" at the beginning of Section 5.

Supplementary Material: zip

Assigned Action Editor: ~Ozan_Sener1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1639
