Active Learning with Human Heuristics: An Algorithm Robust to Labelling Bias
Abstract: Active learning (AL) enables prediction algorithms to achieve better performance with fewer data points by adaptively querying an oracle for output labels. In many instances, the oracle is a human. According to the behavioral sciences, humans provide labels by employing decision heuristics, which tend to yield biased labels. AL algorithms trained on such labels could in turn produce incorrect predictions, making decisions based on those models unfair. How would modelling the oracle with such heuristics affect the performance of AL algorithms? We investigate three human heuristics (fast-and-frugal tree, tallying, and Franklin's rule) combined with four active learning algorithms (entropy-based, multi-view learning, density-based, and a novel density-based algorithm) and apply them to five datasets from domains such as health, wealth and sustainability. A first novel finding is that if a heuristic leads to significant labelling bias, the performance of active learning algorithms drops significantly, sometimes below that of random sampling. Thus, it is key to design active learning algorithms that are robust to labelling bias. Our second contribution is a novel density-based algorithm that achieves an overall median improvement of 31% over current algorithms when the oracle has a significant labelling bias. In sum, the design and benchmarking of active learning algorithms should incorporate the modelling of human decision heuristics.
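To make the setting concrete, the following is a minimal sketch of two of the heuristics named above (tallying and Franklin's rule) used as labelling oracles, together with an entropy-based query rule. It uses common textbook forms of these heuristics and is not taken from the paper; the function names, the binary-cue encoding, and the default thresholds are all assumptions made for illustration.

```python
import math


def tallying_label(cues, threshold=None):
    """Tallying (assumed form): count the binary cues that favor the
    positive class, ignoring cue weights, and compare to a threshold."""
    if threshold is None:
        threshold = len(cues) / 2  # assumption: simple majority of cues
    return 1 if sum(cues) > threshold else 0


def franklin_label(cues, weights, threshold=None):
    """Franklin's rule (assumed form): weighted sum of the cues,
    compared to a threshold."""
    score = sum(w * c for w, c in zip(weights, cues))
    if threshold is None:
        threshold = sum(weights) / 2  # assumption: half the total weight
    return 1 if score > threshold else 0


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli prediction with P(y=1)=p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def entropy_query(probabilities):
    """Entropy-based sampling: return the index of the unlabelled item
    whose predicted probability is the most uncertain."""
    return max(range(len(probabilities)),
               key=lambda i: binary_entropy(probabilities[i]))


# Hypothetical item with three binary cues and illustrative weights.
cues = [1, 0, 0]
weights = [0.7, 0.2, 0.1]
print(tallying_label(cues), franklin_label(cues, weights))
print(entropy_query([0.9, 0.55, 0.1]))
```

In this toy case the two oracles label the same item differently, because tallying discards the cue weights. A disagreement of this kind between a heuristic oracle and the true labels is what the abstract refers to as labelling bias.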