Abstract: Distributed estimation and statistical inference for linear models have drawn much attention recently, but few studies focus on robust learning in the presence of heavy-tailed/asymmetric errors and high-dimensional covariates. Building on adaptive Huber regression to achieve the bias-robustness tradeoff, two classes of sparse and debiased lasso estimators are proposed, based on aggregated and communication-efficient approaches, respectively. Specifically, in the first stage, an aggregated ℓ1-penalized adaptive Huber estimator and a multi-round communication-efficient ℓ1-penalized adaptive Huber estimator are proposed to handle distributed data with high-dimensional covariates and heavy-tailed/asymmetric errors. In the second stage, to correct the bias induced by the lasso penalty, a unified debiasing framework based on decorrelated score equations is adopted. In the third stage, hard thresholding is applied to produce the sparse and debiased lasso estimators. The convergence rates and asymptotic properties of the two proposed estimators are established. Their finite-sample performance is studied through simulations, and a real-data application to the Communities and Crime Data Set is presented to illustrate the validity and feasibility of the proposed estimators.
Key words and phrases: Asymptotic normality, convergence rates, debiased lasso, decorrelated score, multi-round, thresholding.
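For concreteness, a minimal sketch of the first-stage objective is given below; the notation (a local sample {(x_i, y_i)}, robustification parameter τ, and regularization level λ) is generic and chosen here for illustration, not taken from the paper's own notation.

\[
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
\left\{ \frac{1}{n} \sum_{i=1}^{n} \ell_{\tau}\!\left(y_i - x_i^{\top}\beta\right) + \lambda \lVert \beta \rVert_1 \right\},
\qquad
\ell_{\tau}(u) =
\begin{cases}
u^2/2, & |u| \le \tau,\\[2pt]
\tau |u| - \tau^2/2, & |u| > \tau,
\end{cases}
\]

where ℓ_τ is the Huber loss and τ is allowed to grow with the sample size to balance bias against robustness to heavy-tailed errors, while λ controls sparsity. In the aggregated approach, each machine would typically contribute a local version of this penalized objective, whereas the communication-efficient variant refines an initial estimate over multiple rounds; the precise constructions are those given in the paper.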