Statistica Sinica 32 (2022), 1881-1909
Fei Xue and Annie Qu
Abstract: Traditional variable selection methods can fail to be sign consistent when irrepresentable conditions are violated. This is especially critical in high-dimensional settings, where the number of predictors exceeds the sample size. In this paper, we propose a new semi-standard partial covariance (SPAC) approach that reduces the correlation effects from other covariates while fully capturing the magnitude of the coefficients. The proposed SPAC is effective in choosing covariates that have direct effects on the response variable, while eliminating predictors that are not directly associated with the response but are highly correlated with the relevant predictors. We show that the proposed SPAC method with the Lasso penalty or the smoothly clipped absolute deviation (SCAD) penalty possesses strong sign consistency in high-dimensional settings. Numerical studies and a post-traumatic stress disorder data application confirm that the proposed method outperforms the existing Lasso, adaptive Lasso, SCAD, Peter-Clark-simple algorithm, and factor-adjusted regularized model selection methods when the irrepresentable conditions fail.
Key words and phrases: Irrepresentable condition, Lasso, model selection consistency, partial correlation, smoothly clipped absolute deviation.