Abstract

In nonlinear time series analysis, forecasting is fundamental but challenging because of the curse of dimensionality in nonparametric regression on multiple lagged variables and the nonlinear, non-Gaussian features of responses that may be either continuous or discrete-valued. To address these challenges, we propose a unified framework of semiparametric Generalized MArginal Forecast Model Averaging (GMAFMA) under a flexible conditional exponential family of distributions for nonlinear forecasting of time series. This framework not only overcomes the curse of dimensionality in nonparametric forecasting but also adapts flexibly to both continuous and discrete-valued non-Gaussian time series data, bridging a gap in existing methods for nonlinear forecasting. The GMAFMA procedure is developed through a semiparametric conditional likelihood method for estimating the combining weights of the marginal forecasts, with asymptotic normality established under mild conditions on the time series data-generating process. Furthermore, an adaptively penalized GMAFMA (PGMAFMA) is suggested to identify the most important marginal forecasts, so that the forecasting is more interpretable and precise. The procedures are supported by both Monte Carlo simulations and various empirical applications, such as forecasting the number of strike events in labor economics and the direction of movement of the FTSE 100 index in finance, and assessing the causal effect of the seatbelt law in reducing road casualties.
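
As a rough illustration of the idea summarized above, the following Python sketch fits one nonparametric marginal forecast per lag and then estimates the combining weights by maximizing a conditional likelihood. It is not the authors' implementation. The Poisson family with log link, the Nadaraya-Watson smoother, the fixed bandwidth, the simulated count series, and the function names `nw_marginal`, `fit_poisson_weights` and `gmafma_poisson_sketch` are all assumptions made here for illustration, and the penalized variant (PGMAFMA) is not shown.

```python
import numpy as np


def nw_marginal(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E[Y | X = x] with a Gaussian kernel."""
    diffs = (x_eval[:, None] - x_train[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2)
    return (kernel @ y_train) / kernel.sum(axis=1)


def fit_poisson_weights(features, y, n_iter=50, ridge=1e-8):
    """Poisson log-link likelihood maximized by Newton steps.

    Returns [w0, w1, ..., wp]; `ridge` is only a numerical stabilizer
    for the linear solve, not a statistical penalty.
    """
    design = np.column_stack([np.ones(len(y)), features])
    w = np.zeros(design.shape[1])
    w[0] = np.log(y.mean())
    for _ in range(n_iter):
        mu = np.exp(np.clip(design @ w, -20, 20))
        grad = design.T @ (y - mu)
        hess = design.T @ (design * mu[:, None]) + ridge * np.eye(len(w))
        step = np.linalg.solve(hess, grad)
        w += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return w


def gmafma_poisson_sketch(y, p=3, bandwidth=1.0):
    """Combine p marginal (one-lag) forecasts of a count series.

    Assumed model for this sketch only:
        log E[Y_t | past] = w0 + sum_j w_j * log m_j(Y_{t-j}),
    where m_j(x) = E[Y_t | Y_{t-j} = x] is estimated by kernel smoothing.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    target = y[p:]
    # lags[j-1] holds Y_{t-j} aligned with target Y_t, t = p, ..., n-1
    lags = [y[p - j : n - j] for j in range(1, p + 1)]

    # Step 1: in-sample marginal forecasts, one per lag (not leave-one-out)
    marginal = np.column_stack(
        [nw_marginal(x, target, x, bandwidth) for x in lags]
    )
    features = np.log(np.maximum(marginal, 1e-6))

    # Step 2: combining weights by conditional (Poisson) likelihood
    weights = fit_poisson_weights(features, target)

    # Step 3: one-step-ahead forecast of Y_n from Y_{n-1}, ..., Y_{n-p}
    new_marginal = np.array(
        [
            nw_marginal(x, target, np.array([y[n - j]]), bandwidth)[0]
            for j, x in enumerate(lags, start=1)
        ]
    )
    log_mu = weights[0] + weights[1:] @ np.log(np.maximum(new_marginal, 1e-6))
    return weights, np.exp(log_mu)


# Simulated count series (hypothetical data, for demonstration only)
rng = np.random.default_rng(0)
n_obs = 400
series = np.zeros(n_obs)
series[:2] = 3
for t in range(2, n_obs):
    series[t] = rng.poisson(1.0 + 0.5 * series[t - 1] + 0.2 * series[t - 2])

weights, forecast = gmafma_poisson_sketch(series, p=3, bandwidth=1.0)
print("combining weights:", weights)
print("one-step-ahead forecast:", forecast)
```

Each marginal smoother is one-dimensional, so no multivariate kernel regression is ever fitted, and the only joint estimation is the low-dimensional likelihood fit of the weights. Another member of the exponential family (for example, a Bernoulli response with logit link) would change only the likelihood step in this sketch.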

Information

Preprint No.: SS-2024-0371
Manuscript ID: SS-2024-0371
Complete Authors: Rong Peng, Zudi Lu, Fangsheng Ge
Corresponding Authors: Zudi Lu
Emails: zudilu@cityu.edu.hk

Acknowledgments

The authors would like to express sincere gratitude to the editor Prof Huixia Wang, the Associate Editor and both referees for their careful reading and constructive comments, which have greatly improved the early version of this paper. The research of Lu was partially supported by the Startup Fund (No. 7200813) of City University of Hong Kong, which is also acknowledged.

Supplementary Materials

The supplementary material contains technical details and proofs for the results in the main paper, together with additional numerical results.


Supplementary materials are available for download.