Statistica Sinica

Qing Kang, Paul I. Nelson and Christopher I. Vahl

Abstract: An outcome-dependent enriched (ODE) sample results from adding a random sample to a stratified sample, where the stratification is based on the levels of a categorical outcome. In biometrics, such a sample can be generated by combining data from a cohort study with data from an independent case-control study. Suppose that the probability of an outcome is determined by covariates according to a given model. For the case where the marginal distributions of the outcome and the predictors are both unknown, Morgenthaler and Vardi (1986) proposed a weighted likelihood (WL) method for estimating the model parameters from an ODE sample. Here, we derive and study the asymptotic properties of the WL estimator. Simulations and an asymptotic comparison demonstrate that, when the presumed model is correct, the performance of the WL method is often comparable to that of the asymptotically efficient profile likelihood (PL) method. If the model is misspecified, the WL method has a clear interpretation and is more robust than the PL method. We therefore recommend the WL method, especially when the fit of the presumed model is in doubt and the sample size is large.

Key words and phrases: Case-control sample, central limit theorem, choice-based sample, empirical distribution, generalized linear model, semiparametric likelihood.