A General Fine-Grained Truth Discovery Approach for Crowdsourced Data Aggregation

Published: 01 Jan 2017, Last Modified: 22 Feb 2025. Venue: DASFAA (1) 2017. License: CC BY-SA 4.0
Abstract: Crowdsourcing has proven to be an efficient way to collect large-scale datasets. Answers provided by the crowd are often noisy and conflicting, which makes aggregating them to infer the ground truth a critical challenge. Existing fine-grained truth discovery methods address this problem by exploring the correlation between source reliability and task topics or answers. However, they work only on a limited range of tasks: they are incompatible with Writing tasks and Transcription tasks, and they make insufficient use of the global dataset. To maintain compatibility, we consider the existence of clusters in both tasks and sources and propose a general fine-grained method. The proposed approach has two integral components: kl-means and Pattern-based Truth Discovery (PTD). With the aid of ground truth data, kl-means directly applies a co-clustering reliability model to the correctness matrix to learn the patterns. PTD then aggregates the answers by incorporating the captured patterns, producing a more accurate estimation. Our approach is therefore compatible with all tasks and better captures the correlation among tasks and sources. Experimental results show that our method produces more precise estimates than other general truth discovery methods, owing to its ability to learn and utilize the patterns of both tasks and sources.
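To make the idea of block-level reliability patterns concrete, the following is a minimal sketch and not the paper's kl-means or PTD algorithm, whose details the abstract does not give. It assumes the source and task cluster assignments are already known, estimates one reliability value per (source cluster, task cluster) block from a correctness matrix, and then aggregates answers with a log-odds weighted vote, a standard technique swapped in here for illustration. All function names, parameters, and data are hypothetical.

```python
import math
from collections import defaultdict


def block_reliability(correct, source_cluster, task_cluster, k, l):
    """Estimate mean correctness for each (source cluster, task cluster) block.

    correct[s][t] is 1 (right), 0 (wrong), or None (source s did not answer task t).
    Cluster assignments are assumed to be given; learning them is out of scope here.
    """
    hits = [[0.0] * l for _ in range(k)]
    counts = [[0] * l for _ in range(k)]
    for s, row in enumerate(correct):
        for t, c in enumerate(row):
            if c is None:
                continue
            hits[source_cluster[s]][task_cluster[t]] += c
            counts[source_cluster[s]][task_cluster[t]] += 1
    # Add-one smoothing keeps every estimate strictly between 0 and 1.
    return [[(hits[i][j] + 1.0) / (counts[i][j] + 2.0) for j in range(l)]
            for i in range(k)]


def aggregate(answers, source_cluster, task_cluster_id, reliability):
    """Pick the answer with the highest log-odds weighted vote for one task.

    answers maps a source index to the answer that source gave.
    """
    scores = defaultdict(float)
    for s, answer in answers.items():
        p = reliability[source_cluster[s]][task_cluster_id]
        scores[answer] += math.log(p / (1.0 - p))
    return max(scores, key=scores.get)


# Hypothetical data: 3 sources, 4 ground-truth tasks, 2 clusters on each side.
correct = [
    [1, 1, 0, 0],  # source 0 is good on task cluster 0 only
    [0, 0, 1, 1],  # sources 1 and 2 are good on task cluster 1
    [0, 1, 1, 1],
]
source_cluster = [0, 1, 1]
task_cluster = [0, 0, 1, 1]

reliability = block_reliability(correct, source_cluster, task_cluster, k=2, l=2)

# A new task assumed to belong to task cluster 0: the lone source from the
# reliable block outweighs two sources from the unreliable block.
votes = {0: "A", 1: "B", 2: "B"}
print(aggregate(votes, source_cluster, 0, reliability))
```

In this toy setup a plain majority vote would choose "B", while the block-weighted vote follows the source cluster that has been reliable on that kind of task. That is the intuition behind using patterns over both tasks and sources, under the assumptions stated above.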