Human Knowledge Based Efficient Interactive Data Annotation via Active Weakly Supervised Learning

Published: 2021, Last Modified: 21 Jan 2026, PerCom Workshops 2021, CC BY-SA 4.0
Abstract: Machine learning has contributed significantly to various pervasive computing systems. Wider adoption of such systems will require lowering two obstacles: the high cost of data annotation and the lack of interpretability. Weakly supervised learning is gaining attention as a way to address these problems, especially in natural language processing (NLP). Its advantage comes from human-defined labeling functions (LFs), which encode human knowledge and replace manually attaching a label to each data point. However, this method alone cannot reduce the actual process cost, because creating LFs can be very costly without insight or support. We propose an interactive data annotation method that combines weakly supervised learning with an uncertainty-based active learning strategy. The proposed method iteratively presents a few highly prioritized data points to be annotated, chosen by considering the outputs of the LFs and the uncertainty of the prediction. The human's only task is to implement, as an LF, knowledge that applies to the presented data points. We verified the effectiveness of the proposed method on two classification tasks (one NLP and one non-NLP). The experimental results indicate its effectiveness and its high potential for use beyond any single application field.
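The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary text task, two hypothetical labeling functions (`lf_contains_free`, `lf_short_greeting`), and the entropy of a smoothed LF vote distribution as a stand-in for the prediction uncertainty; the abstract does not specify the label model or the uncertainty measure.

```python
import math

ABSTAIN = -1  # assumed convention: an LF returns -1 when it does not apply


# Hypothetical labeling functions encoding simple human knowledge.
def lf_contains_free(text):
    return 1 if "free" in text.lower() else ABSTAIN


def lf_short_greeting(text):
    return 0 if text.lower().startswith("hi") else ABSTAIN


def apply_lfs(lfs, data):
    """Return one row of LF outputs per data point."""
    return [[lf(x) for lf in lfs] for x in data]


def vote_distribution(votes, n_classes):
    """Smoothed class distribution from LF votes (add-one smoothing).

    With no votes the distribution is uniform, i.e. maximally uncertain.
    """
    counts = [1.0] * n_classes
    for v in votes:
        if v != ABSTAIN:
            counts[v] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)


def select_queries(label_matrix, n_classes, k):
    """Indices of the k data points with the most uncertain LF-based prediction."""
    scores = [entropy(vote_distribution(row, n_classes)) for row in label_matrix]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


data = ["Free prize inside", "hi there", "meeting at noon", "hi, free tickets"]
label_matrix = apply_lfs([lf_contains_free, lf_short_greeting], data)
to_show = select_queries(label_matrix, n_classes=2, k=2)
print([data[i] for i in to_show])
```

In this sketch the two selected points are the one no LF covers and the one on which the LFs conflict. In an interactive loop, the annotator would look at those points, write a new LF that applies to them, and the selection would be repeated with the enlarged LF set.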