
Statistica Sinica 33 (2023), 1533-1553

INTEGRATIVE ANALYSIS FOR
HIGH-DIMENSIONAL STRATIFIED MODELS

Jian Huang¹, Yuling Jiao², Wei Wang³, Xiaodong Yan³ and Liping Zhu⁴

¹University of Iowa, ²Wuhan University,
³Shandong University and ⁴Renmin University of China

Abstract: In modern economic studies, population heterogeneity across multiple strata and the high dimensionality of predictors pose major challenges. In this study, we introduce an integrative procedure that can be used to explore the group and sparsity structures of high-dimensional and heterogeneous stratified models. Furthermore, we propose K-regression modelling as a hybrid of complex and simple models that exhibits arbitrary dependence on the stratum features, but linear dependence on the other variables. K-regression models exhibit the following features: (i) they are essentially nonparametric with respect to the stratified feature, and have parametric linear effects in the other variables with a potentially integrative pattern, because the effects and the corresponding sparsity structures can be the same for the strata in common groups, but vary across different groups; (ii) the devised K-regression algorithm automatically integrates the strata pertaining to a common regression model, and simultaneously estimates the corresponding effects; (iii) the proposed method quickly recovers the subpopulation and sparsity structures of the K-regression models within massive high-dimensional strata; and (iv) the resulting estimators exhibit two-layer oracle properties, that is, the oracle estimator obtained using the known group and sparsity structures is a local minimizer of the objective function, with high probability. A stratum-specific bootstrap sampling scheme improves the integration accuracy. Simulation results show that the proposed method performs well in finite samples, and we demonstrate the usefulness of the method using real data.
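The grouping idea in feature (ii) can be illustrated with a toy sketch. This is not the authors' K-regression algorithm, which the abstract does not specify; it is a simplified K-means-style alternation, assumed here purely for illustration, that omits the sparsity penalty, the stratum-specific effects, and the bootstrap step. The function name `k_regression_sketch` and all of its parameters are hypothetical.

```python
import numpy as np

def k_regression_sketch(X, y, strata, K, init_groups=None, n_iter=50, seed=0):
    """Toy alternation (an assumption, not the paper's method):
    refit one least-squares model per group, then move each stratum
    to the group whose model gives it the smallest squared error."""
    ids = np.unique(strata)                      # sorted stratum labels
    if init_groups is None:
        rng = np.random.default_rng(seed)
        groups = rng.integers(K, size=ids.size)  # random starting groups
    else:
        groups = np.asarray(init_groups).copy()
    betas = np.zeros((K, X.shape[1]))
    for _ in range(n_iter):
        # Refit step: pooled ordinary least squares within each group.
        for k in range(K):
            mask = np.isin(strata, ids[groups == k])
            if mask.any():                       # empty groups keep old betas
                betas[k] = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
        # Assignment step: each stratum picks its best-fitting group.
        new_groups = groups.copy()
        for j, s in enumerate(ids):
            m = strata == s
            resid = y[m, None] - X[m] @ betas.T  # shape (n_s, K)
            new_groups[j] = np.argmin((resid ** 2).sum(axis=0))
        if np.array_equal(new_groups, groups):   # assignments are stable
            break
        groups = new_groups
    return groups, betas
```

Like K-means, an alternation of this kind can stop at a local solution that depends on the starting groups, so in practice one would try several initializations. Strata assigned to the same group share one coefficient vector, which mirrors the "integration" of strata described in the abstract.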

Key words and phrases: Group fixed effect, heterogeneity, high-dimensionality, integrative analysis, K-regression, massive data, stratum-specific bootstrap.
