Statistica Sinica 29 (2019), 1607-1630

THE RESTRICTED CONSISTENCY PROPERTY OF
LEAVE-n_v-OUT CROSS-VALIDATION FOR
HIGH-DIMENSIONAL VARIABLE SELECTION
Yang Feng and Yi Yu
Columbia University and University of Bristol

Abstract: Cross-validation (CV) methods are popular for selecting the tuning parameter in high-dimensional variable selection problems. We show that a misalignment of the CV is one possible reason for its over-selection behavior. To fix this issue, we propose using a version of leave-n_v-out CV (CV(n_v)) to select the optimal model from a restricted candidate model set for high-dimensional generalized linear models. By using the same candidate model sequence and a proper order for the construction sample size n_c in each CV split, CV(n_v) avoids potential problems when developing theoretical properties. CV(n_v) is shown to exhibit the restricted model-selection consistency property under mild conditions. Extensive simulations and a real-data analysis support the theoretical results and demonstrate the performance of CV(n_v) in terms of both model selection and prediction.

Key words and phrases: Generalized linear models, leave-n_v-out cross-validation, restricted maximum likelihood estimators, restricted model-selection consistency, variable selection.
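
The idea summarized in the abstract, scoring one fixed candidate model sequence across all CV splits with a small construction sample of size n_c and a validation sample of size n_v = n - n_c, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: it uses a Gaussian linear model (one special case of a generalized linear model), random splits instead of all possible splits, a hypothetical candidate sequence ordered by marginal correlation, and n_c = ceil(n^(3/4)) as one illustrative choice of a construction size that grows more slowly than n.

```python
import numpy as np

def cv_nv_select(X, y, candidates, n_c, n_splits=50, seed=0):
    """Pick one model from a fixed candidate list by Monte Carlo leave-n_v-out CV.

    candidates: list of column-index lists, held fixed across all splits.
    n_c: construction (fitting) sample size; the other n - n_c points validate.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    losses = np.zeros(len(candidates))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        fit_idx, val_idx = perm[:n_c], perm[n_c:]  # n_v = n - n_c validation points
        for k, cols in enumerate(candidates):
            # Least-squares fit restricted to the columns of this candidate model
            # (no intercept: the simulated data below are centered by construction).
            beta, *_ = np.linalg.lstsq(X[np.ix_(fit_idx, cols)], y[fit_idx], rcond=None)
            resid = y[val_idx] - X[np.ix_(val_idx, cols)] @ beta
            losses[k] += np.mean(resid ** 2)
    losses /= n_splits
    return int(np.argmin(losses)), losses

# Simulated example: 3 true predictors out of 50 (hypothetical setup).
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([3.0, 3.0, 3.0]) + rng.standard_normal(n)

# Hypothetical candidate sequence: nested models ordered by marginal correlation,
# computed once on the full data and reused unchanged in every split.
order = np.argsort(-np.abs(X.T @ y))
candidates = [order[:k].tolist() for k in range(1, 11)]

n_c = int(np.ceil(n ** 0.75))  # illustrative construction size, an assumption
best, losses = cv_nv_select(X, y, candidates, n_c)
print("selected model:", sorted(candidates[best]))
```

Holding `candidates` fixed across splits is the point of the sketch: every split compares the same models, so the averaged validation losses are directly comparable. In practice the candidate sequence would more likely come from a penalized solution path than from marginal correlations.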
