Abstract

Because they are scale invariant, correlation matrices play a critical role in multivariate statistical analysis. Statistical inference about correlation matrices encounters enormous challenges and is fundamentally different from inference about covariance matrices in both low- and high-dimensional settings. This paper studies tests of general linear structures of high-dimensional correlation matrices, which include the commonly used banded and compound symmetry matrices as special cases. We first propose a procedure that uses the quadratic loss function to estimate the unknown parameters associated with the assumed linear structure. We then develop test statistics based on the quadratic and infinite norms, which are suitable for dense and sparse alternatives, respectively. The limiting distributions of the proposed test statistics are derived under the null and alternative hypotheses. Extensive simulation studies demonstrate the finite-sample performance of the proposed tests. Moreover, a real data example shows the applicability and practical utility of the tests.
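To make the estimation step concrete, the following is a minimal sketch of one plausible reading of "quadratic loss": the coefficients of an assumed linear structure are chosen to minimize the squared Frobenius distance to the sample correlation matrix. The function names, the banded basis, and the two unnormalized discrepancy measures are illustrative assumptions; they are not the estimator or the test statistics defined in the paper, whose centering and scaling are not given here.

```python
import numpy as np

def fit_linear_structure(R_hat, basis):
    """Least-squares theta minimizing ||R_hat - sum_k theta_k A_k||_F^2.

    Hypothetical helper: solves the normal equations G theta = b, where
    G[k, l] = tr(A_k A_l) and b[k] = tr(A_k R_hat) for symmetric A_k.
    """
    G = np.array([[np.sum(Ai * Aj) for Aj in basis] for Ai in basis])
    b = np.array([np.sum(Ak * R_hat) for Ak in basis])
    return np.linalg.solve(G, b)

def banded_basis(p, bandwidth):
    """Assumed basis: A_0 = I and A_k with ones on the k-th off-diagonals."""
    basis = [np.eye(p)]
    for k in range(1, bandwidth + 1):
        basis.append(np.eye(p, k=k) + np.eye(p, k=-k))
    return basis

# Simulated data with independent coordinates (illustration only).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
R_hat = np.corrcoef(X, rowvar=False)  # p x p sample correlation matrix

basis = banded_basis(p, bandwidth=2)
theta = fit_linear_structure(R_hat, basis)
R_fit = sum(t * A for t, A in zip(theta, basis))

# Unnormalized discrepancies in the spirit of the two norms in the abstract.
frob_stat = np.sum((R_hat - R_fit) ** 2)    # sensitive to many small deviations
max_stat = np.max(np.abs(R_hat - R_fit))    # sensitive to a few large deviations
print(theta, frob_stat, max_stat)
```

Because the basis matrices in this example have disjoint supports, the fitted coefficient for each band is simply the average of the sample correlations on that band, and the coefficient of the identity equals one.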

Information

Preprint No.: SS-2024-0078
Manuscript ID: SS-2024-0078
Complete Authors: Tingting Zou, Guangren Yang, Ruitao Lin, Guoliang Tian, Shurong Zheng
Corresponding Authors: Tingting Zou
Emails: zoutt260@jlu.edu.cn

References

  1. Aitkin, M. A. (1969). Some tests for correlation matrices. Biometrika 56(2), 443–446.
  2. Bartlett, M. S. and D. V. Rajalakshman (1953). Goodness of fit tests for simultaneous autoregressive series. Journal of the Royal Statistical Society (Series B) 15, 107–124.
  3. Cai, T. and T. T. Jiang (2011). Limiting laws of coherence of random matrices with applications to testing covariance structure and construction of compressed sensing matrices. The Annals of Statistics 39, 1496–1525.
  4. Cai, T. and T. T. Jiang (2012). Phase transition in limiting distributions of coherence of high-dimensional random matrices. Journal of Multivariate Analysis 107, 24–39.
  5. Cai, T., W. Liu, and Y. Xia (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association 108(501), 265–277.
  6. Cai, T. T. and A. Zhang (2016). Inference for high-dimensional differential correlation matrices. Journal of Multivariate Analysis 143, 107–126.
  7. Chen, J., X. Y. Wang, S. R. Zheng, B. S. Liu, and N.-Z. Shi (2020). Tests for high dimensional covariance matrices. Random Matrices: Theory and Applications 9(3), 1–25.
  8. Gao, J. T., X. Han, G. M. Pan, and Y. R. Yang (2017). High-dimensional correlation matrices: the central limit theorem and its application. Journal of the Royal Statistical Society (Series B) 79(3), 677–693.
  9. Jiang, T. (2004). The asymptotic distributions of the largest entries of sample correlation matrices. The Annals of Applied Probability 14(2), 865–880.
  10. Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika 43(4), 443–477.
  11. Kullback, S. (1967). On testing correlation matrices. Journal of the Royal Statistical Society (Series C) 16(1), 80–85.
  12. Larzelere, R. E. and S. A. Mulaik (1977). Single-sample tests for many correlations. Psychological Bulletin 84(3), 557–569.
  13. Leung, D. and M. Drton (2018). Testing independence in high dimensions with sums of rank correlations. The Annals of Statistics 46(1), 280–307.
  14. Li, D., W. D. Liu, and A. Rosalsky (2010). Necessary and sufficient conditions for the asymptotic distribution of the largest entry of a sample correlation matrix. Probability Theory and Related Fields 148, 5–35.
  15. Li, D., Y. Qi, and A. Rosalsky (2012). On Jiang’s asymptotic distribution of the largest entry of a sample correlation matrix. Journal of Multivariate Analysis 111, 256–270.
  16. Li, D. and A. Rosalsky (2006). Some strong limit theorems for the largest entries of sample correlation matrices. The Annals of Applied Probability 16, 423–447.
  17. Liu, W. D., Z. Lin, and Q. M. Shao (2008). The asymptotic distribution and Berry-Esseen bound of a new test for independence in high dimension with an application to stochastic optimization. The Annals of Applied Probability 18, 2337–2366.
  18. Liu, Y. and J. Xie (2020). Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. Journal of the American Statistical Association 115(529), 393–402.
  19. McDonald, R. P. (1975). Testing pattern hypotheses for correlation matrices. Psychometrika 40(40), 253–255.
  20. Mestre, X. and P. Vallet (2017). Correlation tests and linear spectral statistics of the sample correlation matrix. IEEE Transactions on Information Theory 63(7), 4585–4618.
  21. Noureddine, E. K. (2009). Concentration of measure and spectra of random matrices: with applications to correlation matrices, elliptical distributions and beyond. The Annals of Applied Probability 19(6), 2362–2405.
  22. Oldham, M. C., S. Horvath, and D. H. Geschwind (2006). Conservation and evolution of gene coexpression networks in human and chimpanzee brains. Proceedings of the National Academy of Sciences 103(47), 17973–17978.
  23. Opgen-Rhein, R. and K. Strimmer (2007). From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC systems biology 1(1), 37.
  24. Ramsay, J. O. and B. W. Silverman (2002). Applied functional data analysis: methods and case studies. Springer.
  25. Schott, J. R. (2005). Testing for complete independence in high dimensions. Biometrika 92(4), 951–956.
  26. Steiger, J. H. (1980). Testing pattern hypotheses on correlation matrices: Alternative statistics and some empirical results. Multivariate Behavioral Research 15, 335–352.
  27. Xiao, H. and W. Wu (2013). Asymptotic theory for maximum deviations of sample covariance matrix estimates. Stochastic Processes and their Applications 123, 2899–2920.
  28. Yin, Y., C. Li, G.-L. Tian, and S. Zheng (2022). Spectral properties of rescaled sample correlation matrix. Statistica Sinica 32(4), 2007–2022.
  29. Yin, Y. and Y. Ma (2022). Properties of eigenvalues and eigenvectors of large-dimensional sample correlation matrices. The Annals of Applied Probability 32(6), 4763–4802.
  30. Yong, A. G. and S. Pearce (2013). A beginner’s guide to factor analysis: Focusing on exploratory factor analysis. Tutorials in quantitative methods for psychology 9(2), 79–94.
  31. Zheng, S. R., Z. Chen, H. J. Cui, and R. Z. Li (2019). Hypothesis testing on linear structures of high-dimensional covariance matrix. The Annals of Statistics 47(6), 3300–3334.
  32. Zheng, S. R., G. H. Cheng, J. H. Guo, and H. T. Zhu (2019). Test for high-dimensional correlation matrices. The Annals of Statistics 47(5), 2887–2921.
  33. Zhong, P.-S., W. Lan, P. X. K. Song, and C.-L. Tsai (2017). Tests for covariance structures with high-dimensional repeated measurements. The Annals of Statistics 45(3), 1185–1213.
  34. Zhou, W. (2007). Asymptotic distribution of the largest off-diagonal entry of correlation matrices. Transactions of the American Mathematical Society 359(11), 5345–5363.
  35. Zou, T., R. Lin, S. Zheng, and G.-L. Tian (2021). Two-sample tests for high-dimensional covariance matrices using both difference and ratio. Electronic Journal of Statistics 15(1), 135–210.

Acknowledgments

The authors thank the Co-Editor, Dr. Huixia Judy Wang, the Associate Editor, and three reviewers for their constructive and insightful comments that greatly improved this paper. Shurong Zheng’s research was supported by the National Key R&D Program of China (No. 2024YFA1012200) and the National Natural Science Foundation of China (Nos. 12326606 and 12231011). Tingting Zou’s research was supported by the National Natural Science Foundation of China (No. 12301339). Guangren Yang’s research was supported by the National Social Science Fund of China (No. 24BTJ070). Guo-Liang Tian’s research was partially supported by the National Natural Science Foundation of China (No. 12171225).

Table 3: Empirical size and power of Tippett’s minimum p-value test Ttn and the Cauchy combination test Tcn under Scenarios 1–8, where n observations of dimension p are generated from a Gaussian population.

            Scenario 1  Scenario 2  Scenario 3  Scenario 4  Scenario 5  Scenario 6  Scenario 7  Scenario 8
   n     p   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn   Ttn   Tcn
Empirical size (%)
             4.2   5.0   4.5   4.8   3.9   4.7   3.5   4.7   4.1   4.6   3.4   4.7   4.3   4.8   4.7   4.8
             4.8   5.2   4.6   4.9   4.2   4.9   3.6   4.6   4.4   4.9   3.2   4.6   4.3   4.6   4.4   4.5
             5.0   5.2   4.8   4.9   4.5   4.9   3.3   4.3   4.4   4.6   3.0   4.3   4.2   4.6   4.5   4.5
             4.9   5.3   4.8   5.0   4.7   4.8   3.6   4.7   4.4   4.7   3.1   4.7   4.8   5.2   4.2   4.2
             5.2   5.3   5.3   5.3   4.9   5.1   3.7   4.7   4.6   4.7   3.4   4.8   4.7   4.8   4.3   4.1
             4.4   5.0   4.4   4.6   3.9   4.6   3.2   4.4   4.0   4.5   3.0   4.5   4.1   4.5   3.8   4.0
             4.5   5.0   4.6   5.0   4.2   5.0   3.4   4.7   4.4   5.0   3.3   4.8   4.3   4.9   4.0   4.3
             4.6   5.1   4.8   5.0   4.4   5.0   3.5   4.7   4.3   4.8   3.3   4.8   4.3   4.8   4.1   4.0
             4.3   4.6   4.6   4.6   4.2   4.5   3.7   5.0   3.9   4.3   3.2   4.8   4.2   4.6   3.9   3.8
             4.7   4.8   5.0   5.1   4.7   5.1   4.0   5.2   4.6   4.7   3.5   5.1   4.9   5.3   3.5   3.5
Empirical power (%)
           100.0 100.0  52.7  54.9  68.0  67.8  14.7  17.9  89.1  90.9  61.8  67.4  18.1  18.4  44.8   46.
       100 100.0 100.0  93.8  95.0  52.7  52.8  19.0  22.0  90.0  91.7  62.9  68.8  33.1  33.2  36.7   37.
       300 100.0 100.0 100.0 100.0  30.8  31.3  41.3  42.4  99.8  99.9  65.0  69.9  91.0  90.8  43.8   45.
       500 100.0 100.0 100.0 100.0  23.2  23.4  66.5  66.1 100.0 100.0  66.7  71.2  99.8  99.7  57.4   59.
      1000 100.0 100.0 100.0 100.0  15.7  16.2  98.5  98.5 100.0 100.0  71.4  74.7 100.0 100.0  87.5   88.
           100.0 100.0  88.2  89.1  99.9  99.9  44.0  49.8 100.0 100.0  99.8  99.9  34.3  34.4  97.1   97.
       100 100.0 100.0 100.0 100.0  99.6  99.5  53.9  58.2 100.0 100.0  99.9  99.9  56.3  56.0  95.7   96.
       300 100.0 100.0 100.0 100.0  97.7  97.5  79.6  80.7 100.0 100.0 100.0 100.0  98.3  98.3  97.0   97.
       500 100.0 100.0 100.0 100.0  96.1  95.8  93.4  93.2 100.0 100.0 100.0 100.0 100.0 100.0  98.7   98.
      1000 100.0 100.0 100.0 100.0  91.6  91.4 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0  99.9   99.
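Table 3 compares two rules for combining p-values, Tippett’s minimum p-value test and the Cauchy combination test of Liu and Xie (2020). The sketch below shows the generic form of each rule for a list of p-values. It assumes, as a reading of the abstract that is not confirmed here, that the combined quantities are the p-values of the quadratic-norm and infinite-norm tests; the function names and the equal default weights are illustrative choices, and the Tippett formula shown is exact only for independent p-values.

```python
import math

def tippett_pvalue(pvalues):
    """Tippett's minimum p-value rule, assuming independent p-values."""
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k

def cauchy_combination_pvalue(pvalues, weights=None):
    """Cauchy combination: map each p-value to a Cauchy variate and average.

    The combined statistic is sum_i w_i * tan((0.5 - p_i) * pi), and its
    p-value is the upper tail of a standard Cauchy distribution.
    """
    k = len(pvalues)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights, an illustrative default
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    return 0.5 - math.atan(t) / math.pi

# Hypothetical p-values from a dense-alternative and a sparse-alternative test.
p_quadratic, p_max = 0.30, 0.004
print(tippett_pvalue([p_quadratic, p_max]))
print(cauchy_combination_pvalue([p_quadratic, p_max]))
```

Both rules reject when either component p-value is very small, which is why a combined test can retain power against dense and sparse alternatives at once; the Cauchy rule additionally keeps an analytic p-value under dependence between the components.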

Supplementary Materials

In the supplement, we give the detailed proofs of Lemma 1, Theorems 1, 2, 3, 5, and Corollary 3. We also present the simulation results when the observations are generated from the Gamma population.


Supplementary materials are available for download.