doi:http://dx.doi.org/10.5705/ss.2011.081
Abstract: We consider the problem of calculating power and sample size for tests based on generalized estimating equations (GEE) that arise in studies involving clustered or correlated data (e.g., longitudinal studies and sibling studies). Previous approaches approximate the power of such tests using the asymptotic behavior of the test statistics under fixed alternatives. We develop a more accurate approach in which the asymptotic behavior is studied under a sequence of local alternatives that converge to the null hypothesis at a rate proportional to the inverse square root of the number of clusters. Based on this approach, explicit sample size formulae are derived for Wald and quasi-score test statistics in a variety of GEE settings. Simulation results show that in the important special case of logistic regression with exchangeable correlation structure, previous approaches can inflate the projected sample size (to obtain nominal % power using the Wald statistic) by over %, whereas the proposed approach provides an accuracy of around %.
Key words and phrases: Clustered and correlated data, GEE, local alternatives, longitudinal data analysis, marginal models.
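To illustrate the general idea of a sample size calculation under local alternatives, the following is a minimal sketch for a two-sided Wald test of a single scalar parameter. It is a generic textbook-style normal approximation, not the explicit formulae derived in the paper. The names `clusters_needed`, `approx_power`, `delta` (the assumed effect size) and `sigma2` (the assumed asymptotic variance of the root-n-scaled estimator, with n the number of clusters) are all hypothetical and chosen here for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def clusters_needed(delta, sigma2, alpha=0.05, power=0.9):
    """Approximate number of clusters for a two-sided Wald test.

    Assumption (not from the paper): sqrt(n) * (estimate - null value)
    is approximately normal with variance sigma2, so the standardized
    statistic has mean sqrt(n) * delta / sqrt(sigma2) under the alternative.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = _Z.inv_cdf(power)          # quantile matching the target power
    return ceil((z_alpha + z_power) ** 2 * sigma2 / delta ** 2)


def approx_power(n, delta, sigma2, alpha=0.05):
    """Approximate two-sided power with n clusters under the same assumption."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    shift = sqrt(n) * abs(delta) / sqrt(sigma2)  # mean of the standardized statistic
    # Probability of rejecting in either tail.
    return _Z.cdf(shift - z_alpha) + _Z.cdf(-shift - z_alpha)


if __name__ == "__main__":
    n = clusters_needed(delta=0.5, sigma2=4.0)
    print(n, round(approx_power(n, delta=0.5, sigma2=4.0), 3))
```

In a GEE setting, `sigma2` would typically come from a working value of the robust (sandwich) variance, which depends on the assumed correlation structure and cluster size; that step is left out of this sketch.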