Statistica Sinica 29 (2019), 1939-1961
Abstract: We develop a new procedure called the “pseudo-value method” (PVM) for ultra high-dimensional variable selection problems in semiparametric survival models. Currently, the prevailing strategies available for working with ultra high-dimensional lifetime data are the sure independence screening (SIS) strategies. The proposed unified methodology covers a much broader class of survival models, including general transformation models and the accelerated failure time (AFT) model. The proposed method is versatile because the conversion involved easily casts the problem of interest as a regular linear regression. Through this translation, all existing techniques developed for linear regression problems can be leveraged at almost no extra cost. The numerical performance of the PVM shows promising results: in addition to outperforming the (iterative) SIS for the Cox model, the new method accurately selects the effective variables for probit, proportional odds, and AFT models, which have been studied in ultra high-dimensional contexts on a case-by-case basis. We apply our unified method to analyze diffuse large-B-cell lymphoma data, finding genes that may be overlooked, but that could be influential. This finding is potentially of scientific importance on its own.
Key words and phrases: Accelerated failure time model, penalized log-marginal likelihood, semiparametric models, transformation models, variable selection.