Active Learning with Amazon Mechanical Turk

Florian Laws, Christian Scheible, Hinrich Schütze

2011 (modified: 10 Nov 2022), EMNLP 2011

Abstract: Supervised classification needs large amounts of annotated training data that is expensive to create. Two approaches that reduce the cost of annotation are active learning and crowdsourcing. However, these two approaches have not been combined successfully to date. We evaluate the utility of active learning in crowdsourcing on two tasks, named entity recognition and sentiment detection, and show that active learning outperforms random selection of annotation examples in a noisy crowdsourcing scenario.
