Statistica Sinica 31 (2021), 519-546

OPTIMAL STOPPING AND WORKER SELECTION IN CROWDSOURCING: AN ADAPTIVE SEQUENTIAL PROBABILITY RATIO TEST FRAMEWORK

Xiaoou Li^{1}, Yunxiao Chen^{2}, Xi Chen^{3}, Jingchen Liu^{4}, and Zhiliang Ying^{4}

Abstract: In this study, we solve a class of multiple testing problems under a Bayesian sequential decision framework. Our work is motivated by binary labeling tasks in crowdsourcing, where a requestor needs to simultaneously choose a worker to provide a label and decide when to stop collecting labels, under a certain budget constraint. We begin with the binary hypothesis testing problem of determining the true label of a single object, and provide an optimal solution by casting it under an adaptive sequential probability ratio test framework. Then, we characterize the structure of the optimal solution, that is, the optimal adaptive sequential design, which minimizes the Bayes risk using a log-likelihood ratio statistic. We also develop a dynamic programming algorithm to efficiently compute the optimal solution. For the multiple testing problem, we propose an empirical Bayes approach for estimating the class priors, and show that the average loss of our method converges to the minimal Bayes risk under the true model. Experiments on both simulated and real data show the robustness of our method, as well as its superiority over existing methods in terms of labeling accuracy.

Key words and phrases: Bayesian decision theory, crowdsourcing, empirical Bayes, sequential analysis, sequential probability ratio test.