Crowdsourcing Utilizing Subgroup Structure of Latent Factor Modeling.
Abstract: Crowdsourcing has emerged as an alternative solution for collecting large-scale labels. However, the majority of recruited workers are not domain experts, so their contributed labels can be noisy. In this paper, we propose a two-stage model to predict the true labels for multicategory classification tasks in crowdsourcing. In the first stage, we fit the observed labels with a latent factor model and incorporate subgroup structures for both tasks and workers through a multi-centroid grouping penalty. Group-specific rotations are introduced to align workers with different task categories to solve multicategory crowdsourcing tasks. In the second stage, we propose a concordance-based approach to identify high-quality worker subgroups, which are relied upon to assign labels to tasks. In theory, we show the estimation consistency of the latent factors and the prediction consistency of the proposed method. Simulation studies show that the proposed method outperforms existing competitive methods under the assumed subgroup structures within tasks and workers. We also demonstrate the application of the proposed method to real-world problems and show its superiority.