Statistica Sinica

Sean X. Chen and Jun S. Liu

Abstract: The distribution of Z_{1} + … + Z_{n} is called Poisson-Binomial if the Z_{i} are independent Bernoulli random variables with not-all-equal probabilities of success. This distribution and its computation play an important role in a number of seemingly unrelated research areas, such as survey sampling, case-control studies, and survival analysis. In this article, we provide a general theory of the Poisson-Binomial distribution concerning its computation and applications. As by-products, we propose new weighted sampling schemes for finite populations, a new method for hypothesis testing in logistic regression, and a new algorithm for finding the maximum conditional likelihood estimate (MCLE) in case-control studies. Two of our weighted sampling schemes are direct generalizations of the "sequential" and "reservoir" methods of Fan, Muller and Rezucha (1962) for simple random sampling, which are of interest to computer scientists. Our new algorithm for finding the MCLE in case-control studies is an iterative weighted least squares method, which naturally bridges prospective and retrospective GLMs.

Key words and phrases: Case-control studies, conditional Bernoulli distribution, iterative weighted least squares, logistic regression, PPS sampling, Poisson-Binomial, survey sampling, weighted sampling.
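
To make the definition in the abstract concrete, the following minimal sketch computes the probability mass function of S = Z_{1} + … + Z_{n} by adding one Bernoulli variable at a time. This is the standard convolution recursion, used here only as an illustration; it is not necessarily the computational method developed in the article. The function name `poisson_binomial_pmf` and the argument `probs` (the list of success probabilities) are assumptions introduced for this example.

```python
def poisson_binomial_pmf(probs):
    """Return [P(S=0), ..., P(S=n)] for S = Z_1 + ... + Z_n,
    where the Z_i are independent Bernoulli(probs[i]) variables.

    Illustrative sketch only: a plain O(n^2) convolution recursion.
    """
    pmf = [1.0]  # distribution of the empty sum: P(S = 0) = 1
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # new variable equals 0: sum stays at k
            nxt[k + 1] += mass * p      # new variable equals 1: sum moves to k + 1
        pmf = nxt
    return pmf


if __name__ == "__main__":
    # Three trials with unequal success probabilities (hypothetical values).
    for k, mass in enumerate(poisson_binomial_pmf([0.1, 0.5, 0.9])):
        print(k, round(mass, 4))
```

When all entries of `probs` are equal, the result reduces to the ordinary Binomial distribution, which gives a simple sanity check on the recursion.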