Abstract
In extreme regression problems, a primary objective is to infer extreme values of the response given a set of predictors. The high dimensionality and heavy-tailedness of the predictors limit the applicability of classical tools for inferring conditional extremes. In this paper, we focus on the central extreme subspace (CES), whose existence and uniqueness are guaranteed under fairly mild conditions. Projecting the data onto the CES reduces the dimension of the predictors while retaining all the information needed for inferring conditional extremes, which effectively addresses the high-dimensionality issue. We propose COPES, a novel method that estimates the CES by utilizing contour projection. Notably, COPES is robust against heavy-tailed predictors, and we establish the theoretical justification for its consistency. Overall, our proposal not only extends the toolkit for extreme regression but also broadens the scope of dimension reduction techniques. Its effectiveness is demonstrated through extensive simulation studies and an application to Chinese stock market data.
Information
| Preprint No. | SS-2024-0159 |
|---|---|
| Manuscript ID | SS-2024-0159 |
| Authors | Liujun Chen, Jing Zeng |
| Corresponding Author | Jing Zeng |
| Email | zengjxl@ustc.edu.cn |
| Affiliation | International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, Anhui 230026, China |
Acknowledgments
The authors are grateful to the Editor, Associate Editor, and two anonymous referees, whose suggestions led to substantial improvements of this work.

All authors contributed equally and are listed in alphabetical order. Liujun Chen’s research was partially supported by Grants 12301387 and 12471279 from the National Natural Science Foundation of China (NNSFC). Jing Zeng’s research was partially supported by Grant 12301365 from the NNSFC and Grant WK2040000075 from the Fundamental Research Funds for the Central Universities.
Supplementary Materials
The Supplementary Material includes additional discussions, theories, numerical results, and technical proofs.