Statistica Sinica 28 (2018), 203-228
Abstract: Model selection and model averaging are essential to regression analysis in environmental studies, but determining which of the two approaches is more appropriate, and under what circumstances, remains an active research topic. In this paper, we focus on geostatistical regression models for spatially referenced environmental data. For a general information criterion, we develop a new perturbation-based criterion that measures the uncertainty (or instability) of spatial model selection, as well as an empirical rule for choosing between model selection and model averaging. Statistical inference based on the proposed model selection instability measure is justified both in theory and via a simulation study. The predictive performance of model selection and model averaging can be quite different when the uncertainty in model selection is relatively large, but the two become more comparable as this uncertainty decreases. For illustration, a precipitation data set from the state of Colorado is analyzed.
Key words and phrases: Data perturbation, generalized degrees of freedom, geostatistics, information criterion, model complexity, spatial prediction.