Improving crowdsourced label quality by peer-to-peer federated learning

Published: 01 Jan 2025, Last Modified: 30 Jul 2025, Appl. Intell. 2025, CC BY-SA 4.0
Abstract: Crowdsourcing can help supervised learning obtain large quantities of labeled data. However, existing crowdsourcing methods have issues from both the user and the platform perspective. From the user perspective, annotating all data on a single crowdsourcing platform makes it difficult to guarantee the user's privacy, because that platform holds the complete data. Furthermore, because label quality is controlled entirely by one platform, the user has difficulty rejecting the annotations even when their quality is very poor. From the platform perspective, commercial secrets and privacy concerns prevent data from being shared among crowdsourcing platforms, making it difficult for platforms to join forces to improve label quality. In this study, we propose a new algorithm for crowdsourcing noise correction based on label distribution (LDNC), and further propose a new peer-to-peer federated learning algorithm (P2P-LDNC), to solve the above issues. Experiments show that the algorithms support collaborative training across multiple crowdsourcing platforms in a distributed environment, effectively improving label quality.