Crowdsourcing has become an economical way to leverage human wisdom for large-scale data annotation. However, when annotation tasks require domain knowledge that most people lack, as is common in citizen science projects, crowd workers’ integrity and proficiency problems can significantly impair the quality of the crowdsourced data. In this project, I focused on improving crowd workers’ reliability during the label collection process. This work has been submitted to AAMAS 2018.
Crowdsourcing offers an economical means to leverage human wisdom for large-scale data annotation. However, crowdsourced labels often suffer from low quality and significant inconsistencies, since low-cost crowd workers commonly lack the relevant domain knowledge and may make cursory choices. Most research in this area emphasizes post-processing of the noisy labels after they are obtained, which cannot fundamentally improve the quality of the crowdsourcing service. In this paper, we focus on improving the worker’s reliability during the label collection process. We propose a novel game-theoretical framework for crowdsourcing, which formulates the interaction between the annotation system and the crowd workers as an incentivized pedagogical process between a teacher and students. In this framework, the system is able to infer the worker’s belief or prior from their current answers, reward them with a performance-contingent bonus, and instruct them accordingly via near-optimal examples. We develop an effective algorithm for the system to select examples, even when the worker’s belief is unidentifiable. In addition, our mathematical guarantees show that the framework not only ensures a fair payoff to crowd workers regardless of their initial priors but also facilitates value alignment between the annotation system (the requester) and the crowd workers. Our experiments further demonstrate that the approach improves crowd workers’ reliability effectively and robustly across different worker populations and worker behaviors.
The basic process of pedagogical value-aligned crowdsourcing. A small ground truth set is labeled by experts, and the candidate features and hypotheses are elicited in advance (a). In each round, the annotation system randomly samples k instances for the crowd worker to label (b). By observing the worker’s answers, the annotation system infers the worker’s belief and selects the most helpful examples (c). In the next round, the worker again labels k randomly sampled instances (d). If the worker improves, an immediate bonus credit is given along with this round’s new examples (e). The practice and teaching stages repeat until the (N-1)-th round (f), and the worker is paid at the end of the final round (g).
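The round structure in the caption can be sketched as a small simulation. This is only an illustrative sketch, not the paper's algorithm: the names `ToyWorker`, `run_session`, `bonus`, and `base_pay` are hypothetical, the worker is a toy model that memorizes taught examples, and the teaching step simply shows the instances the worker got wrong, standing in for the belief inference and near-optimal example selection described above.

```python
import random


class ToyWorker:
    """Hypothetical worker: answers 0 unless it has been taught the label."""

    def __init__(self):
        self.known = {}

    def answer(self, instance):
        return self.known.get(instance, 0)

    def learn(self, examples):
        # examples maps instance -> expert label
        self.known.update(examples)


def run_session(worker, ground_truth, n_rounds, k,
                bonus=0.1, base_pay=1.0, seed=0):
    """Simulate N practice rounds with teaching after rounds 1..N-1.

    ground_truth: dict of instance -> expert label (step a).
    Returns (total_pay, per_round_accuracy).
    """
    rng = random.Random(seed)
    instances = list(ground_truth)
    pay = base_pay
    prev_acc = None
    history = []
    for rnd in range(n_rounds):
        # Steps (b)/(d): sample k instances for the worker to label.
        batch = rng.sample(instances, k)
        answers = {x: worker.answer(x) for x in batch}
        acc = sum(answers[x] == ground_truth[x] for x in batch) / k
        # Step (e): bonus credit if accuracy improved over the last round.
        if prev_acc is not None and acc > prev_acc:
            pay += bonus
        # Steps (c)/(f): teach after every round except the final one.
        # Assumption: showing the mistaken instances replaces the
        # belief-based example selection of the actual framework.
        if rnd < n_rounds - 1:
            worker.learn({x: ground_truth[x] for x in batch
                          if answers[x] != ground_truth[x]})
        prev_acc = acc
        history.append(acc)
    # Step (g): the worker is paid after the final round.
    return pay, history


if __name__ == "__main__":
    truth = {i: i % 2 for i in range(10)}
    total_pay, accuracies = run_session(ToyWorker(), truth, n_rounds=3, k=10)
    print(total_pay, accuracies)
```

Here the bonus is tied to round-over-round improvement on sampled instances, mirroring step (e); a real system would also need to define the bonus size and the final payment rule, which the caption does not specify.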
Runzhe Yang, Yexiang Xue and Carla Gomes.