Abstract: In studying the relationship between a binary variable and a covariate, it is very common for the value of the binary variable to be missing for some observations, so that those observations are uncategorised. In this paper we show that the uncategorised data can be treated as auxiliary information, as in the survey sampling literature. We establish a framework for parametric and nonparametric estimation by empirical likelihood. The proposed empirical likelihood estimators improve, in the leading order, the efficiency of estimators based on the categorised samples. In a comparative study with the ratio estimator, we demonstrate the robust performance of the empirical likelihood estimators. Applications to tax auditing and to genetic studies are discussed.
Key words and phrases: Auxiliary information, empirical likelihood, genetic studies, mixture model, survey sampling, tax-auditing problem.