Abstract

We develop a new methodology for forecasting matrix-valued time series using historical matrix data and auxiliary vector time series data. We focus on a time series of matrices defined on a static 2-D spatial grid and an auxiliary time series of non-spatial vectors. The proposed model, Matrix AutoRegression with Auxiliary Covariates (MARAC), contains an autoregressive component for the historical matrix predictors and an additive component that maps the auxiliary vector predictors to a matrix response via a tensor-vector product. The autoregressive component adopts the bi-linear transformation framework of Chen et al. (2021), significantly reducing the number of parameters. The auxiliary component posits that the tensor coefficient, which maps the non-spatial predictors to the spatial response, contains slices of spatially smooth matrix coefficients, obtained by evaluating smooth functions from a Reproducing Kernel Hilbert Space (RKHS) on the spatial grid. We propose to estimate the model parameters under a penalized maximum likelihood framework coupled with an alternating minimization algorithm. We establish the joint asymptotics of the autoregressive and tensor parameters under both fixed-dimensional and high-dimensional regimes. Extensive simulations and a geophysical application to forecasting the global Total Electron Content (TEC) validate the performance of MARAC.
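The model structure described in the abstract combines a bilinear autoregressive term with a mode-3 tensor-vector product for the auxiliary covariates. The sketch below is a minimal, hypothetical illustration of one forecast step of this form (all coefficient values, dimensions, and function names are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

# Illustrative MARAC-style recursion (hypothetical sketch, not the paper's code):
#   Y_t = A Y_{t-1} B' + (G x_3 x_t) + E_t,
# where A (M x M) and B (N x N) form the bilinear autoregressive term
# (following Chen et al., 2021) and G (M x N x D) maps the auxiliary
# covariate x_t (length D) to a matrix via the mode-3 tensor-vector product.

rng = np.random.default_rng(0)
M, N, D, T = 4, 5, 3, 50

A = 0.5 * np.eye(M)             # row-wise autoregressive coefficient
B = 0.5 * np.eye(N)             # column-wise autoregressive coefficient
G = rng.normal(size=(M, N, D))  # tensor coefficient for auxiliary covariates

def marac_step(Y_prev, x, sigma=0.1):
    """One step of the recursion: bilinear AR term plus auxiliary term."""
    ar_term = A @ Y_prev @ B.T                # A Y_{t-1} B'
    aux_term = np.einsum('mnd,d->mn', G, x)   # mode-3 tensor-vector product
    noise = sigma * rng.normal(size=(M, N))
    return ar_term + aux_term + noise

Y = np.zeros((M, N))
for t in range(T):
    x_t = rng.normal(size=D)
    Y = marac_step(Y, x_t)

print(Y.shape)  # (4, 5)
```

In the paper, A, B, and G are estimated jointly by penalized maximum likelihood, with an RKHS penalty enforcing spatial smoothness of the slices of G.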

Information

Preprint No.: SS-2025-0029
Manuscript ID: SS-2025-0029
Complete Authors: Hu Sun, Zuofeng Shang, Yang Chen
Corresponding Authors: Yang Chen
Emails: ychenang@umich.edu

References

  1. Akaike, H. (1998). Information Theory and an Extension of the Maximum Likelihood Principle. In Selected papers of Hirotugu Akaike, pp. 199–213. Springer.
  2. Attouch, H., J. Bolte, and B. F. Svaiter (2013). Convergence of Descent Methods for Semi-Algebraic and Tame Problems: Proximal Algorithms, Forward–Backward Splitting, and Regularized Gauss–Seidel Methods. Mathematical Programming 137(1-2), 91–129.
  3. Banerjee, A., I. S. Dhillon, J. Ghosh, S. Sra, and G. Ridgeway (2005). Clustering on the Unit Hypersphere using von Mises-Fisher Distributions. Journal of Machine Learning Research 6(9), 1345–1382.
  4. Braun, M. L. (2006). Accurate Error Bounds for the Eigenvalues of the Kernel Matrix. The Journal of Machine Learning Research 7, 2303–2328.
  5. Cai, T. T. and M. Yuan (2012). Minimax and Adaptive Prediction for Functional Linear Regression. Journal of the American Statistical Association 107(499), 1201–1216.
  6. Chen, E. Y. and J. Fan (2023). Statistical Inference for High-Dimensional Matrix-Variate Factor Models. Journal of the American Statistical Association 118(542), 1038–1055.
  7. Chen, R., H. Xiao, and D. Yang (2021). Autoregressive Models for Matrix-valued Time Series. Journal of Econometrics 222(1), 539–560.
  8. Cheng, G. and Z. Shang (2015). Joint Asymptotics for Semi-nonparametric Regression Models with Partially Linear Structure. The Annals of Statistics 43, 1351–1390.
  9. Cressie, N. (1986). Kriging Nonstationary Data. Journal of the American Statistical Association 81(395), 625–634.
  10. Cressie, N. and G. Johannesson (2008). Fixed Rank Kriging for Very Large Spatial Data Sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70(1), 209–226.
  11. Cressie, N. and C. K. Wikle (2015). Statistics for Spatio-Temporal Data. John Wiley & Sons.
  12. Cui, W., H. Cheng, and J. Sun (2018). An RKHS-based Approach to Double-Penalized Regression in High-dimensional Partially Linear Models. Journal of Multivariate Analysis 168, 201–210.
  13. Dong, M., L. Huang, X. Wu, and Q. Zeng (2020). Application of Least-Squares Method to Time Series Analysis for 4DPM Matrix. IOP Conference Series: Earth and Environmental Science 455(1), 012200.
  14. Fosdick, B. and P. Hoff (2014). Separable Factor Analysis with Applications to Mortality Data. The Annals of Applied Statistics 8(1), 120–147.
  15. Gao, Z. and R. S. Tsay (2023). A Two-way Transformed Factor Model for Matrix-Variate Time Series. Econometrics and Statistics 27, 83–101.
  16. Gao, Z. and R. S. Tsay (2025). Denoising and Multilinear Projected-Estimation of High-Dimensional Matrix-Variate Factor Time Series. IEEE Transactions on Information Theory, in press.
  17. Gu, C. (2013). Smoothing Spline ANOVA models, 2nd edition. Springer, New York.
  18. Guha, S. and R. Guhaniyogi (2021). Bayesian Generalized Sparse Symmetric Tensor-on-Vector Regression. Technometrics 63(2), 160–170.
  19. Guhaniyogi, R., S. Qamar, and D. B. Dunson (2017). Bayesian Tensor Regression. The Journal of Machine Learning Research 18(1), 2733–2763.
  20. Guo, S., Y. Wang, and Q. Yao (2016). High-dimensional and Banded Vector Autoregressions. Biometrika 103(4), 889–903.
  21. Hamilton, J. D. (2020). Time Series Analysis. Princeton University Press.
  22. Hastie, T., R. Tibshirani, and J. H. Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition. Springer, New York.
  24. Hernández-Pajares, M., J. Juan, J. Sanz, R. Orus, A. Garcia-Rigo, J. Feltens, A. Komjathy, S. Schaer, and A. Krankowski (2009). The IGS VTEC Maps: a Reliable Source of Ionospheric Information since 1998. Journal of Geodesy 83, 263–275.
  26. Hoff, P. D. (2011). Separable Covariance Arrays via the Tucker Product, with Applications to Multivariate Relational Data. Bayesian Analysis 6(2), 179–196.
  27. Hsu, N.-J., H.-C. Huang, and R. S. Tsay (2021). Matrix Autoregressive Spatio-Temporal Models. Journal of Computational and Graphical Statistics 30(4), 1143–1155.
  28. Kang, J., B. J. Reich, and A.-M. Staicu (2018). Scalar-on-Image Regression via the Soft-Thresholded Gaussian Process. Biometrika 105(1), 165–184.
  29. Kennedy, R. A., P. Sadeghi, Z. Khalid, and J. D. McEwen (2013). Classification and Construction of Closed-form Kernels for Signal Representation on the 2-sphere. In Wavelets and Sparsity XV, Volume 8858, pp. 169–183. SPIE.
  30. Kolda, T. G. and B. W. Bader (2009). Tensor Decompositions and Applications. SIAM review 51(3), 455–500.
  31. Koltchinskii, V. and E. Giné (2000). Random Matrix Approximation of Spectra of Integral Operators. Bernoulli 6(1), 113–167.
  32. Li, L. and X. Zhang (2017). Parsimonious Tensor Response Regression. Journal of the American Statistical Association 112(519), 1131–1146.
  33. Li, X., D. Xu, H. Zhou, and L. Li (2018). Tucker Tensor Regression and Neuroimaging Analysis. Statistics in Biosciences.
  34. Li, Z. and H. Xiao (2021). Multi-linear Tensor Autoregressive Models. arXiv preprint arXiv:2110.00928.
  35. Liu, Y., J. Liu, and C. Zhu (2020). Low-rank Tensor Train Coefficient Array Estimation for Tensor-on-Tensor Regression. IEEE Transactions on Neural Networks and Learning Systems 31(12), 5402–5411.
  36. Lock, E. F. (2018). Tensor-on-Tensor Regression. Journal of Computational and Graphical Statistics 27(3), 638–647.
  37. Luo, Y. and A. R. Zhang (2024). Tensor-on-tensor Regression: Riemannian Optimization, Over-parameterization, Statistical-Computational Gap and Their Interplay. The Annals of Statistics 52(6), 2583–2612.
  38. Lyu, X., W. W. Sun, Z. Wang, H. Liu, J. Yang, and G. Cheng (2019). Tensor Graphical Model: Non-convex Optimization and Statistical Inference. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(8), 2024–2037.
  39. Papadogeorgou, G., Z. Zhang, and D. B. Dunson (2021). Soft Tensor Regression. The Journal of Machine Learning Research 22(219), 1–53.
  40. Papitashvili, N., D. Bilitza, and J. King (2014). OMNI: a Description of Near-Earth Solar Wind Environment. 40th COSPAR Scientific Assembly 40, C0–1.
  41. Rabusseau, G. and H. Kadri (2016). Low-rank Regression with Tensor Responses. Advances in Neural Information Processing Systems 29.
  42. Schölkopf, B., R. Herbrich, and A. J. Smola (2001). A Generalized Representer Theorem. In International Conference on Computational Learning Theory, pp. 416–426. Springer.
  43. Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics 6(2), 461–464.
  44. Shang, Z. and G. Cheng (2013). Local and Global Asymptotic Inference in Smoothing Spline Models. The Annals of Statistics 41, 2608–2638.
  45. Shang, Z. and G. Cheng (2015). Nonparametric Inference in Generalized Functional Linear Models. The Annals of Statistics 43, 1742–1773.
  46. Shen, B., W. Xie, and Z. Kong (2022). Smooth Robust Tensor Completion for Background/Foreground Separation with Missing Pixels: Novel Algorithm with Convergence Guarantee. The Journal of Machine Learning Research 23(1), 9757–9796.
  47. Stock, J. H. and M. W. Watson (2001). Vector Autoregressions. Journal of Economic perspectives 15(4), 101–115.
  48. Sun, H., Z. Hua, J. Ren, S. Zou, Y. Sun, and Y. Chen (2022). Matrix Completion Methods for the Total Electron Content Video Reconstruction. The Annals of Applied Statistics 16(3), 1333–1358.
  49. Sun, H., W. Manchester, M. Jin, Y. Liu, and Y. Chen (2023). Tensor Gaussian Process with Contraction for MultiChannel Imaging Analysis. In International Conference on Machine Learning, pp. 32913–32935. PMLR.
  50. Sun, W. W. and L. Li (2017). STORE: Sparse Tensor Response Regression and Neuroimaging Analysis. The Journal of Machine Learning Research 18(1), 4908–4944.
  51. Tsiligkaridis, T., A. O. Hero III, and S. Zhou (2013). On Convergence of Kronecker Graphical Lasso Algorithms. IEEE Transactions on Signal Processing 61(7), 1743–1755.
  52. Tzeng, S. and H.-C. Huang (2018). Resolution Adaptive Fixed Rank Kriging. Technometrics 60(2), 198–208.
  53. van Zanten, J. and A. W. van der Vaart (2008). Reproducing Kernel Hilbert Spaces of Gaussian Priors. In Pushing the Limits of Contemporary Statistics: Contributions in honor of Jayanta K. Ghosh, pp. 200–222. Institute of Mathematical Statistics.
  54. Wang, D., X. Liu, and R. Chen (2019). Factor Models for Matrix-valued High-dimensional Time Series. Journal of Econometrics 208(1), 231–248.
  55. Wang, D., Y. Zheng, and G. Li (2024). High-Dimensional Low-rank Tensor Autoregressive Time Series Modeling. Journal of Econometrics 238(1), 105544.
  56. Wang, D., Y. Zheng, H. Lian, and G. Li (2022). High-dimensional Vector Autoregressive Time Series Modeling via Tensor Decomposition. Journal of the American Statistical Association 117(539), 1338–1356.
  57. Wang, Z., S. Zou, L. Liu, J. Ren, and E. Aa (2021). Hemispheric Asymmetries in the Mid-latitude Ionosphere During the September 7–8, 2017 Storm: Multi-instrument Observations. Journal of Geophysical Research: Space Physics 126, e2020JA028829.
  58. Williams, C. K. and C. E. Rasmussen (2006). Gaussian Processes for Machine Learning, Volume 2. MIT Press, Cambridge, MA.
  60. Xiao, H., Y. Han, R. Chen, and C. Liu (2022). Reduced Rank Autoregressive Models for Matrix Time Series. Journal of Business and Economic Statistics.
  61. Yang, Y., Z. Shang, and G. Cheng (2020). Non-asymptotic Analysis for Nonparametric Testing. In 33rd Annual Conference on Learning Theory, pp. 1–47. ACM.
  62. Younas, W., M. Khan, C. Amory-Mazaudier, P. O. Amaechi, and R. Fleury (2022). Middle and Low Latitudes Hemispheric Asymmetries in ΣO/N2 and TEC during Intense Magnetic Storms of Solar Cycle 24. Advances in Space Research 69, 220–235.
  63. Yuan, M. and T. T. Cai (2010). A Reproducing Kernel Hilbert Space Approach to Functional Linear Regression. The Annals of Statistics 38(6), 3412–3444.
  64. Zhou, H., L. Li, and H. Zhu (2013). Tensor Regression with Applications in Neuroimaging Data Analysis. Journal of the American Statistical Association 108(502), 540–552.
  65. Zhou, S. (2014). GEMINI: Graph Estimation with Matrix Variate Normal Instances. The Annals of Statistics 42(2), 532–562.

Acknowledgments

The authors thank Shasha Zou, Zihan Wang, and Yizhou Zhang for helpful discussions on the TEC data. YC acknowledges support from NSF DMS 2113397, NSF PHY 2027555, and NASA Federal Award Nos. 80NSSC23M0192 and 80NSSC23M0191.

Supplementary Materials

The supplementary material contains details of the alternating minimization algorithm, technical proofs of all theorems and propositions of the paper, additional details of the simulation experiments, and the approximate estimation algorithm based on kernel truncation. The supplementary materials are available for download.