Responsible Active Learning via Human-in-the-loop Peer Study

TMLR Paper613 Authors

19 Nov 2022 (modified: 02 Apr 2023), Rejected by TMLR
Abstract: Active learning has been proposed to reduce data annotation effort by manually labelling only representative data samples for training. Meanwhile, recent active learning applications have benefited greatly from cloud computing services, which offer not only sufficient computational resources but also crowdsourcing frameworks that include many humans in the active learning loop. However, previous active learning methods always require passing large-scale unlabelled data to the cloud, which may raise significant data privacy issues. To mitigate this risk, we propose a responsible active learning method, namely Peer Study Learning (PSL), that simultaneously preserves data privacy and improves model stability. Specifically, we first introduce a human-in-the-loop teacher-student architecture that isolates unlabelled data from the task learner (teacher) on the cloud side by maintaining an active learner (student) on the client side. During training, the task learner instructs the light-weight active learner, which then provides feedback on the active sampling criterion. To further enhance the active learner with large-scale unlabelled data, we introduce multiple peer students into the active learner, which is trained by a novel learning paradigm comprising In-Class Peer Study on labelled data and Out-of-Class Peer Study on unlabelled data. Lastly, we devise a discrepancy-based active sampling criterion, Peer Study Feedback, that exploits the variability of the peer students to select the most informative data and thereby improve model stability. Extensive experiments demonstrate the superiority of the proposed PSL over a wide range of active learning methods in both standard and sensitive protection settings.
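The discrepancy-based sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the Peer Study Feedback criterion, so everything here is an assumption made for illustration: two peer students, an L1 distance between their class-probability outputs as the discrepancy score, and the hypothetical function names `peer_discrepancy` and `select_for_labelling`. It is not the authors' implementation.

```python
from typing import List, Sequence


def peer_discrepancy(p_a: Sequence[float], p_b: Sequence[float]) -> float:
    """L1 distance between two peers' class-probability vectors.

    Assumption: the paper's actual discrepancy measure may differ.
    """
    if len(p_a) != len(p_b):
        raise ValueError("probability vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(p_a, p_b))


def select_for_labelling(
    peer_a_probs: Sequence[Sequence[float]],
    peer_b_probs: Sequence[Sequence[float]],
    budget: int,
) -> List[int]:
    """Return indices of the `budget` unlabelled samples on which
    the two (hypothetical) peer students disagree the most."""
    scores = [
        peer_discrepancy(a, b) for a, b in zip(peer_a_probs, peer_b_probs)
    ]
    # Rank sample indices by descending disagreement.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy predictions from two peers on three unlabelled samples.
peer_a = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
peer_b = [[0.8, 0.2], [0.1, 0.9], [0.3, 0.7]]
chosen = select_for_labelling(peer_a, peer_b, budget=1)
```

In this sketch the scoring runs entirely on the peers' outputs, so under the architecture described in the abstract it could run on the client side, with only the selected samples sent out for annotation.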
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Added details to Section 1 (Introduction) and Section 3 (Peer Study Learning). The former now explains why single-model active learning methods can be prone to bias. The latter now gives an overview of why and how the responsible active learning task is conducted.
Assigned Action Editor: ~Hanwang_Zhang3
Submission Number: 613