Abstract: Crowdsourcing, as a practical way of ensuring data quality, has achieved great success in fields such as image annotation and speech recognition. Structured complex tasks such as translation are usually labor-intensive, so a task needs to be split first; otherwise workers may feel overwhelmed. However, if we randomly split a task into evenly sized subtasks, context information may be lost, resulting in low-quality data. Take translation as an example: the word ‘cookie’ could mean a baked snack or a browser cookie, which is a text file containing small pieces of data. Thus, how to divide a large structured complex task into subtasks while preserving context information becomes a challenge. In this paper, we propose a novel splitting method for structured complex tasks based on the minimum cut in graph theory. We design mechanisms to convert a structured complex task into a weighted graph, with nodes representing potential subtasks and weighted edges representing the relationships between subtasks. Experimental results on real translation tasks demonstrate that our method can achieve higher scores than other methods.
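
To make the graph formulation concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each node is a paragraph of a document to be translated, that edge weights are hand-picked relatedness scores (the abstract does not specify how weights are computed), and that a single global minimum cut, found here with the Stoer-Wagner algorithm, is used to split the task in two. The names `build_graph` and `min_cut` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Dict[str, float]]


def build_graph(edges: Iterable[Tuple[str, str, float]]) -> Graph:
    """Build an undirected weighted graph from (u, v, weight) triples."""
    graph: Graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})
        graph.setdefault(v, {})
        graph[u][v] = graph[u].get(v, 0.0) + w
        graph[v][u] = graph[v].get(u, 0.0) + w
    return graph


def min_cut(graph: Graph) -> Tuple[float, Set[str]]:
    """Global minimum cut via Stoer-Wagner on a connected graph.

    Returns the cut weight and the set of nodes on one side of the cut.
    """
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    members = {u: {u} for u in g}  # original nodes merged into each node
    best_weight = float("inf")
    best_side: Set[str] = set()

    while len(g) > 1:
        # Minimum cut phase: repeatedly add the most tightly attached node.
        start = next(iter(g))
        added = [start]
        attach = {v: g[start].get(v, 0.0) for v in g if v != start}
        cut_of_phase = 0.0
        while attach:
            nxt = max(attach, key=attach.get)
            cut_of_phase = attach.pop(nxt)
            added.append(nxt)
            for v, w in g[nxt].items():
                if v in attach:
                    attach[v] += w

        s, t = added[-2], added[-1]
        # cut_of_phase is the weight of the cut isolating t from the rest.
        if cut_of_phase < best_weight:
            best_weight = cut_of_phase
            best_side = set(members[t])

        # Merge t into s, summing the weights of parallel edges.
        for v, w in g[t].items():
            if v == s:
                continue
            g[s][v] = g[s].get(v, 0.0) + w
            g[v][s] = g[v].get(s, 0.0) + w
            del g[v][t]
        g[s].pop(t, None)
        del g[t]
        members[s] |= members.pop(t)

    return best_weight, best_side


# Assumed toy input: four paragraphs with illustrative relatedness scores.
edges = [("p1", "p2", 3.0), ("p2", "p3", 1.0), ("p3", "p4", 4.0)]
weight, side = min_cut(build_graph(edges))
print(weight, sorted(side))
```

In this toy input the weakest link is between p2 and p3, so the cut separates the paragraphs into {p1, p2} and {p3, p4}, keeping strongly related paragraphs in the same subtask. Applying the cut recursively to each side would yield smaller subtasks, although whether the paper proceeds this way is not stated in the abstract.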