Abstract
In this article, we consider change point inference for high-dimensional
linear models. For change point detection, given any subgroup of variables, we
propose a new method for testing the homogeneity of the corresponding regression
coefficients across the observations. Under some regularity conditions, the proposed
testing procedure controls the type I error asymptotically, is powerful against
sparse alternatives, and enjoys certain optimality. For change point identification,
an "argmax"-based change point estimator is proposed and shown to be consistent
for the true change point location. Moreover, by combining our method with the
binary segmentation technique, we further extend it to detect and identify multiple
change points. Extensive numerical studies justify the validity of the new method
and demonstrate its competitive performance.
Key words and phrases: Change point inference; High dimensions; Linear regression; Multiplier bootstrap; Subgroups
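The two identification ideas in the abstract, an "argmax"-based single change point estimator and its extension to multiple change points via binary segmentation, can be illustrated with a minimal sketch. The statistic below is a simple max-norm CUSUM contrast of segment means, used only as a stand-in: the paper's actual test statistic is regression-based (it compares subgroup regression coefficients and uses a multiplier bootstrap for calibration), and the threshold here is an arbitrary illustrative value, not the paper's calibrated critical value.

```python
import numpy as np

def cusum_stat(x, t):
    # Max-norm CUSUM contrast between observations before and after t.
    # A mean-shift proxy; the paper's statistic is regression-based.
    n = x.shape[0]
    diff = x[:t].mean(axis=0) - x[t:].mean(axis=0)
    return np.sqrt(t * (n - t) / n) * np.max(np.abs(diff))

def argmax_changepoint(x, min_len=10):
    # "Argmax"-style estimator: the candidate split maximizing the statistic.
    n = x.shape[0]
    stats = [cusum_stat(x, t) for t in range(min_len, n - min_len)]
    best = int(np.argmax(stats))
    return min_len + best, stats[best]

def binary_segmentation(x, threshold, min_len=10, offset=0):
    # Recursively split a segment whenever its maximal statistic
    # exceeds the threshold; recurse on both halves.
    if x.shape[0] < 2 * min_len + 1:
        return []
    t, stat = argmax_changepoint(x, min_len)
    if stat < threshold:
        return []
    return (binary_segmentation(x[:t], threshold, min_len, offset)
            + [offset + t]
            + binary_segmentation(x[t:], threshold, min_len, offset + t))

# Toy data: one shift of size 2 in the first coordinate at observation 50.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.5, size=(100, 3))
x[50:, 0] += 2.0
print(binary_segmentation(x, threshold=3.0))
```

The recursion mirrors the classical binary segmentation scheme (Vostrikova, 1981): detect, split at the estimated location, and repeat on each sub-segment until no statistic exceeds the threshold.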
Information
| Preprint No. | SS-2023-0212 |
|---|---|
| Manuscript ID | SS-2023-0212 |
| Complete Authors | Bin Liu, Xinsheng Zhang, Yufeng Liu |
| Corresponding Authors | Yufeng Liu |
| Emails | yfliu@email.unc.edu |
Acknowledgments
The authors would like to thank the editor, the associate editor, and
the reviewers for their helpful comments and suggestions.
Supplementary Materials
The online supplementary materials provide the detailed basic assumptions
and proofs of the main theory, as well as additional numerical results on
size, power, and multiple change point detection. A real data application
is also included.