Statistica Sinica 33 (2023), 1697-1719
Chien-Tong Lin¹, Yu-Jen Cheng² and Ching-Kang Ing²
Abstract: We examine the problem of variable selection for high-dimensional sparse Cox models. We propose using a computationally efficient procedure, the Chebyshev greedy algorithm (CGA), to sequentially include variables, and derive its convergence rate under a weak sparsity condition. Under a strong sparsity condition, we combine the CGA with a high-dimensional information criterion (HDIC) to achieve variable selection consistency. We further devise a greedier version of the CGA (gCGA). With the help of the HDIC, the gCGA not only enjoys selection consistency, but also exhibits better finite-sample performance in detecting marginally weak but jointly strong signals than the original CGA and other related high-dimensional methods, such as conditional sure independence screening. We demonstrate the proposed methods using real data from a cytogenetically normal acute myeloid leukaemia (CN-AML) data set.
Key words and phrases: Chebyshev greedy algorithm, high-dimensional information criterion, sure screening, variable selection consistency.