Semiparametric Efficient Estimation of Quantile Regression

Zhanfeng Wang, Kani Chen, Yuanyuan Lin and Zhiliang Ying

doi:10.5705/ss.202024.0378

Abstract

The linear quantile regression model assumes that the quantiles of a response at certain levels are linearly related to the covariates. If the model is assumed for a single quantile level, semiparametric efficient estimation requires estimating the conditional density of the error given the covariates, which can be prohibitively difficult because of the curse of dimensionality. However, if the model is assumed for all quantile levels, estimating the conditional density reduces to estimating the derivative of the regression coefficient functions, which is naturally available from initial estimators such as the Koenker-Bassett estimator. This paper derives the semiparametric efficient scores and the corresponding efficiency bounds for the regression coefficients. Although the estimator and the estimating function have no closed-form expression, we propose a computationally feasible procedure that leads to semiparametrically efficient estimation. Simulation studies show that the proposed method can yield substantial efficiency gains over standard methods.
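The link between the conditional density and the derivative of the coefficient functions can be made explicit with a short derivation. The notation below is assumed for illustration and is not taken from the paper: $Y$ is the response, $x$ a covariate vector, $\beta(\tau)$ the coefficient function, and $F_{Y\mid X}$, $f_{Y\mid X}$ the conditional distribution and density. The sketch further assumes that $\beta(\tau)$ is differentiable with $x^\top\beta'(\tau) > 0$. The last line is a generic central-difference approximation with a bandwidth $h$, given only as one plausible way to use an initial estimator $\hat\beta$; it is not necessarily the procedure proposed by the authors.

```latex
% Assumed notation: Q_Y is the conditional quantile function, h > 0 a small bandwidth.
\begin{align*}
Q_Y(\tau \mid x) &= x^\top \beta(\tau), \qquad \tau \in (0,1), \\
F_{Y\mid X}\bigl(x^\top \beta(\tau) \mid x\bigr) = \tau
  \quad &\Longrightarrow \quad
  f_{Y\mid X}\bigl(x^\top \beta(\tau) \mid x\bigr)\, x^\top \beta'(\tau) = 1, \\
f_{Y\mid X}\bigl(x^\top \beta(\tau) \mid x\bigr) &= \frac{1}{x^\top \beta'(\tau)}, \\
x^\top \beta'(\tau) &\approx
  \frac{x^\top \bigl\{ \hat\beta(\tau + h) - \hat\beta(\tau - h) \bigr\}}{2h}.
\end{align*}
```

The second line follows from differentiating both sides of the identity in $\tau$. Under the all-quantile model, the conditional density at each quantile is therefore a one-dimensional functional of $\beta'(\tau)$, so no multivariate smoothing over the covariates is needed.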

Key words and phrases: Quantile regression; Semiparametric efficient score; Least favorable submodel; One-step estimation

Information

Preprint No.	SS-2024-0378
Manuscript ID	SS-2024-0378
Complete Authors	Zhanfeng Wang, Kani Chen, Yuanyuan Lin, Zhiliang Ying
Corresponding Authors	Kani Chen
Emails	makchen@ust.hk

References

Bickel, P. J., C. A. Klaassen, Y. Ritov, and J. A. Wellner (1993). Efficient and Adaptive Estimation for Semiparametric Models. New York: Springer-Verlag.
Bondell, H. D., B. J. Reich, and H. Wang (2010). Noncrossing quantile regression curve estimation. Biometrika 97, 825–838.
Chung, Y. and D. B. Dunson (2009). Nonparametric Bayes conditional distribution modeling with variable selection. J. Amer. Statist. Assoc. 104, 1646–1660.
Dunson, D. B. and J. A. Taylor (2005). Approximate Bayesian inference for quantiles. J. Nonparametric Stat. 17, 385–400.
Feng, Y., Y. Chen, and X. He (2015). Bayesian quantile regression with approximate likelihood. Bernoulli 21, 832–850.
He, X. (1997). Quantile curves without crossing. The American Statistician 51, 186–192.
He, X., L. Wang, and H. G. Hong (2013). Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann. Statist. 41, 342–369.
He, X. and L. X. Zhu (2003). A lack-of-fit test for quantile regression. J. Amer. Statist. Assoc. 98, 1013–1022.
Jiang, L., H. J. Wang, and H. D. Bondell (2013). Interquartile shrinkage in regression models. J. Comp. Graph. Statist. 22, 970–986.
Kai, B., R. Li, and H. Zou (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Statist. 39, 305–332.
Kato, K. (2011). Group Lasso for high dimensional sparse quantile regression models. arXiv preprint arXiv:1103.1458.
Kim, M. O. and Y. Yang (2011). Semiparametric approach to a random effects quantile regression model. J. Amer. Statist. Assoc. 106, 1405–1417.
Koenker, R. (2005). Quantile Regression. Econometric Society Monographs 38. Cambridge: Cambridge Univ. Press.
Koenker, R. and G. J. Bassett (1978). Regression quantiles. Econometrica 46, 33–50.
Koenker, R. and O. Geling (2001). Reappraising medfly longevity: A quantile regression survival analysis. J. Amer. Statist. Assoc. 96, 458–468.
Koenker, R. and Z. Xiao (2002). Inference on the quantile regression process. Econometrica 70, 1583–1612.
Müller, P. and F. A. Quintana (2004). Nonparametric Bayesian data analysis. Stat. Sci. 19, 95–110.
Newey, W. K. (1990). Semiparametric efficiency bounds. Journal of Applied Econometrics 5, 99–135.
Newey, W. K. and J. L. Powell (1990). Efficient estimation of linear and type I censored regression models under conditional quantile restrictions. Econometric Theory 6, 295–317.
Peng, L. and Y. Huang (2008). Survival analysis with quantile regression models. J. Amer. Statist. Assoc. 103, 637–649.
Portnoy, S. (2003). Censored regression quantiles. J. Amer. Statist. Assoc. 98, 1001–1012.
Portnoy, S. and R. Koenker (1989). Adaptive L-estimation for linear models. Ann. Statist. 17, 362–381.
Portnoy, S. and R. Koenker (1997). The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators (with discussion). Statistical Science 12, 279–300.
Qu, Z. and J. Yoon (2015). Nonparametric estimation and inference on conditional quantile processes. J. Econometrics 185, 1–19.
Reich, B. J., M. Fuentes, and D. B. Dunson (2011). Bayesian spatial quantile regression. J. Amer. Statist. Assoc. 106, 6–20.
Tsiatis, A. (2007). Semiparametric Theory and Missing Data. Springer Science & Business Media.
Wang, H. J. and D. Li (2013). Estimation of extreme conditional quantiles through power transformation. J. Amer. Statist. Assoc. 108, 1062–1074.
Wang, H. J., D. Li, and X. He (2012). Estimation of high conditional quantiles for heavy-tailed distributions. J. Amer. Statist. Assoc. 107, 1453–1464.
Wang, H. J. and L. Wang (2009). Locally weighted censored quantile regression. J. Amer. Statist. Assoc. 104, 1117–1128.
Wang, L., Y. Wu, and R. Li (2012). Quantile regression for analyzing heterogeneity in ultra-high dimension. J. Amer. Statist. Assoc. 107, 214–222.
Yang, Y. and X. He (2012). Bayesian empirical likelihood for quantile regression. Ann. Statist. 40, 1102–1131.
Yu, K. and M. C. Jones (1998). Local linear quantile regression. J. Amer. Statist. Assoc. 93, 228–237.
Zheng, Q., L. Peng, and X. He (2015). Globally adaptive quantile regression with ultra-high dimensional data. Ann. Statist. 43, 2225–2258.
Zheng, Q., L. Peng, and X. He (2018). High dimensional censored quantile regression. Ann. Statist. 46, 308–343.
Zhou, K. Q. and S. L. Portnoy (1998). Statistical inference on heteroscedastic models based on regression quantiles. J. Nonparametric Stat. 9, 239–260.
Zou, H. and M. Yuan (2008). Composite quantile regression and the oracle model selection theory. Ann. Statist. 36, 1108–1126.

Acknowledgments

The authors thank the Editor, the Associate Editor and two anonymous reviewers for their insightful comments and constructive suggestions that helped improve the paper significantly. Z. Wang’s research was supported by the National Science Foundation of China (No. 12371277, No. 12231017). Y. Lin’s research was partially supported by the Hong Kong Research Grants Council (No. 14306620 and 14304523), and Direct Grants for Research, The Chinese University of Hong Kong.

Supplementary Materials

The supplementary material contains the proofs of all theorems and additional results from numerical studies.

