Statistica Sinica 26 (2016), 69-95
Abstract: We propose a penalized quantile regression and an independence screening procedure to identify important covariates and to exclude unimportant ones for a general class of ultrahigh dimensional single index models, in which the conditional distribution of the response depends on the covariates via a single index structure. We observe that linear quantile regression yields a consistent estimator of the direction of the index parameter in the single index model. This observation dramatically reduces the computational complexity of selecting important covariates in the single index model. We establish an oracle property for the penalized quantile regression estimator when the covariate dimension grows at an exponential rate in the sample size. From a practical perspective, however, when the covariate dimension is extremely large, penalized quantile regression may suffer from at least two drawbacks: high computational cost and algorithmic instability. To address these issues, we propose an independence screening procedure that is robust to model misspecification and performs reliably when the distribution of the response variable is heavy-tailed or the response realizations contain extreme values. The new independence screening procedure is a useful complement to penalized quantile regression because it reduces the covariate dimension from ultrahigh dimensionality to a moderate scale. Based on the reduced model, penalized linear quantile regression further refines the selection of important covariates at different quantile levels. We examine the finite sample performance of the proposed procedure by Monte Carlo simulations and demonstrate the methodology through an empirical analysis of a data set.
Key words and phrases: Distance correlation, penalized quantile regression, single index models, sure screening property, ultrahigh dimensionality.
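
To make the screening step concrete, the following is a minimal Python sketch of marginal screening by sample distance correlation, which the key words name. It is an illustration under stated assumptions, not the procedure as defined in the paper: the function names (distance_correlation, screen), the choice to replace the response by its scaled ranks as a guard against extreme values, and the number d of retained covariates are all hypothetical choices made here.

```python
import numpy as np

def _centered_dist(v):
    """Double-centered matrix of pairwise Euclidean distances."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=2))
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form) of x and y."""
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def screen(X, y, d):
    """Keep the d covariates with the largest marginal utility.

    Assumption of this sketch: the response enters only through its
    scaled ranks, so a few extreme responses cannot dominate the scores.
    """
    n = len(y)
    ranks = np.argsort(np.argsort(y)) / n
    scores = np.array([distance_correlation(X[:, k], ranks)
                       for k in range(X.shape[1])])
    kept = np.sort(np.argsort(scores)[::-1][:d])
    return kept, scores
```

A call such as kept, scores = screen(X, y, d=20) would return the column indices of a reduced model, on which a penalized linear quantile regression could then be fitted at the quantile levels of interest.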