Statistica Sinica 29 (2019), 1873-1889
Abstract: In many applications, subgroups with different parameters may exist even after accounting for covariate effects, and it is important to identify meaningful subgroups for better medical treatment or market segmentation. We propose a robust subgroup identification method based on median regression with concave fusion penalization. The proposed method can simultaneously determine the number of subgroups, identify the group membership of each subject, and estimate the regression coefficients. Because it requires no parametric distributional assumptions, the proposed method is robust against outliers in the response and heteroscedasticity in the regression error. We develop a convenient algorithm based on local linear approximation, and establish the oracle property of the proposed penalized estimator and the model selection consistency of the modified Bayesian information criterion. The numerical performance of the proposed method is assessed through simulation and the analysis of a heart disease data set.
Key words and phrases: Concave fusion penalization, heterogeneous parameters, median regression, model-based clustering, robust.