Abstract
Scalar-on-function regression is useful for modelling mixed data that combine scalar and
functional variables. Within this class of regression, this paper proposes model averaging
as an alternative to model selection, in order to address model selection uncertainty.
The candidate models characterize a scalar response through a parametric effect of
the scalar predictors and a nonparametric effect of a functional predictor, and a model averaging estimator is developed whose weights are chosen by a Mallows-type criterion.
The resulting estimator is shown to be asymptotically optimal in the sense of achieving the smallest possible squared error loss.
Simulation studies demonstrate that it is superior or comparable to
several model selection and model averaging estimators based on information criterion scores.
The proposed procedure is also applied to a mid-infrared spectra
dataset for illustration.
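To make the idea of Mallows-type weighting concrete, the following is a minimal sketch under simplifying assumptions, not the estimator developed in the paper. It uses nested ordinary least squares models as stand-ins for the semiparametric candidate models and takes the generic criterion $C(w) = \|y - \sum_m w_m \hat{y}_m\|^2 + 2\hat{\sigma}^2 \sum_m w_m k_m$, where $k_m$ is the effective number of parameters of model $m$. The function names (`project_to_simplex`, `mallows_criterion`, `mallows_weights`), the simulated data, and the projected-gradient solver are all illustrative choices; the exact criterion used in the paper is not stated in the abstract.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def mallows_criterion(w, y, F, dfs, sigma2):
    """Generic Mallows-type criterion: residual sum of squares plus penalty."""
    resid = y - F @ w
    return resid @ resid + 2.0 * sigma2 * (dfs @ w)

def mallows_weights(y, F, dfs, sigma2, n_iter=5000):
    """Minimize the criterion over the weight simplex by projected gradient."""
    M = F.shape[1]
    w = np.full(M, 1.0 / M)                  # start from equal weights
    L = 2.0 * np.linalg.norm(F, 2) ** 2      # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = -2.0 * F.T @ (y - F @ w) + 2.0 * sigma2 * dfs
        w = project_to_simplex(w - grad / L)
    return w

# Simulated data (assumption): five scalar predictors, the last two irrelevant.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.5, 0.25, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)

# Nested candidate models: model k uses the first k predictors.
fitted, dfs = [], []
for k in range(1, p + 1):
    Xk = X[:, :k]
    coef, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    fitted.append(Xk @ coef)
    dfs.append(float(k))

F = np.column_stack(fitted)                  # n x M matrix of fitted values
dfs = np.asarray(dfs)
sigma2 = np.sum((y - F[:, -1]) ** 2) / (n - p)  # variance from the largest model

w = mallows_weights(y, F, dfs, sigma2)
y_avg = F @ w                                # model averaging prediction
```

In a semiparametric setting such as the one described above, each column of `F` would instead come from a candidate model's own smoother, and `dfs` would plausibly hold the traces of the corresponding hat matrices; both substitutions are assumptions about how the sketch would carry over.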
Information
| Preprint No. | SS-2023-0067 |
|---|---|
| Manuscript ID | SS-2023-0067 |
| Complete Authors | Shishi Liu, Chunming Zhang, Hao Zhang, Rou Zhong, Jingxiao Zhang |
| Corresponding Authors | Jingxiao Zhang |
| Emails | zhjxiaoruc@163.com |
Acknowledgments
The authors thank two referees and the associate editor for their insightful
comments. S. Liu’s research was supported by the Zhejiang Provincial Department of
Education Scientific Research Project (No. Y202147117) and the Zhejiang Provincial
Basic Public Welfare Research Program’s Natural Science Foundation Exploration
Project (No. LQ23A010017). C. Zhang’s research was supported in part by the U.S.
National Science Foundation grants DMS-2013486 and DMS-1712418, and by
the University of Wisconsin-Madison Office of the Vice Chancellor for Research and
Graduate Education with funding from the Wisconsin Alumni Research Foundation.
J. Zhang’s research was supported by Public Health & Disease Control and Prevention, Fund for Building World-Class Universities (Disciplines) of Renmin University
of China (to J.Z.) and the MOE Project of Key Research Institute of Humanities and
Social Sciences (22JJD910001).
Supplementary Materials
The supplementary material contains additional simulations, additional details on
the real-data application, and detailed proofs of Lemma 1 and Theorems 1 and 2.