Abstract: The Dawid-Skene model is the most widely assumed model in the analysis of crowdsourcing algorithms that estimate ground-truth labels from noisy worker responses. In this work, we are motivated by crowdsourcing applications where workers have distinct skill sets and their accuracy additionally depends on a task's type. Focusing on the case where there are two types of tasks, we propose a spectral method to partition tasks into two groups such that a worker has the same reliability for all tasks within a group. Our analysis reveals a separability condition under which task types can be perfectly recovered if the number of workers $n$ scales logarithmically with the number of tasks $d$. Numerical experiments show that clustering tasks by type before estimating ground-truth labels enhances the performance of crowdsourcing algorithms in practical applications.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: This is a camera-ready version of the manuscript under submission (Paper 4271). It is currently accepted with minor revision.
**Changes since last revision:**
A discussion section has been added on how the approach could be extended to realistic settings and on the practical implications of our work. This is in response to the minor revision suggested by the action editor.
Code: https://github.com/Saptarsh/Saptarsh.github.io/blob/master/MultiTypeCrowdsourcing_Saptarshi.ipynb
Assigned Action Editor: ~Jinwoo_Shin1
Submission Number: 4271