Abstract: A distinctive feature of panel data is that temporal and cross-sectional variations are often confounded with one another. Numerous models have been proposed in the literature to describe different aspects of these variations. This article attempts to integrate model selection into categorical panel data analysis. Conventional methods are often inappropriate for this purpose because most of them are designed to compare submodels that belong to the same parametric class. We introduce the generalized accumulated prediction error (GAPE) for panel data and propose it as a model selection criterion. Theoretical properties of GAPE are discussed, and the results are applied to a set of scanner data drawn from marketing research.
Key words and phrases: AIC, BIC, consumer behavior, heterogeneity, marketing research, multivariate logit model, scanner data, Simpson's Paradox.