Abstract
Due to their scale invariance, correlation matrices play a critical role in multivariate statistical
analysis. Statistical inference about correlation matrices encounters enormous challenges and is fundamentally different from inference about covariance matrices in both low- and high-dimensional settings.
This paper studies tests of general linear structures of high-dimensional correlation matrices, which
include commonly used banded matrices and compound symmetry matrices as special cases. We first
propose a procedure using the quadratic loss function to estimate the unknown parameters associated
with the assumed linear structure. We then develop test statistics based on the quadratic norm and the
infinity norm, which are suitable for dense and sparse alternatives, respectively. The limiting distributions of
our proposed test statistics are derived under the null and alternative hypotheses. Extensive simulation
studies are conducted to demonstrate the finite sample performance of our proposed tests. Moreover,
a real data example is provided to show the applicability and the practical utility of the tests.
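As a rough illustration of the estimation step, suppose the hypothesized structure is R = θ1·A1 + … + θK·AK for known symmetric basis matrices A1, …, AK. Minimizing the quadratic (Frobenius) loss between the sample correlation matrix and this combination then reduces to a small linear system. The sketch below is only one reading of the abstract: the function names, the compound symmetry basis, and the simulated data are illustrative assumptions, and the two residual norms are unnormalized quantities rather than the paper's test statistics.

```python
import numpy as np

def fit_linear_structure(R_hat, basis):
    """Coefficients theta minimizing ||R_hat - sum_k theta_k A_k||_F^2 (hypothetical helper)."""
    # Normal equations G theta = b under the Frobenius inner product <A, B> = sum(A * B).
    G = np.array([[np.sum(Ak * Al) for Al in basis] for Ak in basis])
    b = np.array([np.sum(R_hat * Ak) for Ak in basis])
    theta = np.linalg.solve(G, b)
    fitted = sum(t * A for t, A in zip(theta, basis))
    return theta, fitted

def residual_norms(R_hat, fitted):
    """Squared Frobenius norm and entrywise maximum of the residual (not normalized)."""
    D = R_hat - fitted
    return float(np.sum(D ** 2)), float(np.max(np.abs(D)))

# Assumed example: compound symmetry, R = I + rho * (J - I), with basis {I, J - I}.
rng = np.random.default_rng(0)
p, n, rho = 5, 2000, 0.3
I = np.eye(p)
J_minus_I = np.ones((p, p)) - I
R_true = I + rho * J_minus_I

X = rng.multivariate_normal(np.zeros(p), R_true, size=n)
R_hat = np.corrcoef(X, rowvar=False)   # sample correlation matrix

theta, fitted = fit_linear_structure(R_hat, [I, J_minus_I])
quad, sup = residual_norms(R_hat, fitted)
print("theta:", theta)
print("quadratic residual:", quad, "max residual:", sup)
```

For this basis the two matrices are orthogonal, so the fitted coefficient on J − I is simply the average off-diagonal sample correlation. A quadratic-type residual aggregates many small deviations, whereas the maximum-type residual reacts to a few large ones, which is the intuition behind pairing them with dense and sparse alternatives.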
Information
| Preprint No. | SS-2024-0078 |
|---|---|
| Manuscript ID | SS-2024-0078 |
| Complete Authors | Tingting Zou, Guangren Yang, Ruitao Lin, Guoliang Tian, Shurong Zheng |
| Corresponding Authors | Tingting Zou |
| Emails | zoutt260@jlu.edu.cn |
References
- Aitkin, M. A. (1969). Some tests for correlation matrices. Biometrika 56(2), 443–446.
- Bartlett, M. S. and D. V. Rajalakshman (1953). Goodness of fit tests for simultaneous autoregressive series. Journal of the Royal Statistical Society (Series B) 15, 107–124.
- Cai, T. and T. T. Jiang (2011). Limiting laws of coherence of random matrices with applications to testing covariance structure and construction of compressed sensing matrices. The Annals of Statistics 39, 1496–1525.
- Cai, T. and T. T. Jiang (2012). Phase transition in limiting distributions of coherence of high-dimensional random matrices. Journal of Multivariate Analysis 107, 24–39.
- Cai, T., W. Liu, and Y. Xia (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association 108(501), 265–277.
- Cai, T. T. and A. Zhang (2016). Inference for high-dimensional differential correlation matrices. Journal of Multivariate Analysis 143, 107–126.
- Chen, J., X. Y. Wang, S. R. Zheng, B. S. Liu, and N.-Z. Shi (2020). Tests for high dimensional covariance matrices. Random Matrices: Theory and Applications 9(3), 1–25.
- Gao, J. T., X. Han, G. M. Pan, and Y. R. Yang (2017). High-dimensional correlation matrices: the central limit theorem and its application. Journal of the Royal Statistical Society (Series B) 79(3), 677–693.
- Jiang, T. (2004). The asymptotic distributions of the largest entries of sample correlation matrices. The Annals of Applied Probability 14(2), 865–880.
- Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika 43(4), 443–477.
- Kullback, S. (1967). On testing correlation matrices. Journal of the Royal Statistical Society (Series C) 16(1), 80–85.
- Larzelere, R. E. and S. A. Mulaik (1977). Single-sample tests for many correlations. Psychological Bulletin 84(3), 557–569.
- Leung, D. and M. Drton (2018). Testing independence in high dimensions with sums of rank correlations. The Annals of Statistics 46(1), 280–307.
- Li, D., W. D. Liu, and A. Rosalsky (2010). Necessary and sufficient conditions for the asymptotic distribution of the largest entry of a sample correlation matrix. Probability Theory and Related Fields 148, 5–35.
- Li, D., Y. Qi, and A. Rosalsky (2012). On Jiang’s asymptotic distribution of the largest entry of a sample correlation matrix. Journal of Multivariate Analysis 111, 256–270.
- Li, D. and A. Rosalsky (2006). Some strong limit theorems for the largest entries of sample correlation matrices. The Annals of Applied Probability 16, 423–447.
- Liu, W. D., Z. Lin, and Q. M. Shao (2008). The asymptotic distribution and Berry-Esseen bound of a new test for independence in high dimension with an application to stochastic optimization. The Annals of Applied Probability 18, 2337–2366.
- Liu, Y. and J. Xie (2020). Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. Journal of the American Statistical Association 115(529), 393–402.
- McDonald, R. P. (1975). Testing pattern hypotheses for correlation matrices. Psychometrika 40(40), 253–255.
- Mestre, X. and P. Vallet (2017). Correlation tests and linear spectral statistics of the sample correlation matrix. IEEE Transactions on Information Theory 63(7), 4585–4618.
- Noureddine, E. K. (2009). Concentration of measure and spectra of random matrices: with applications to correlation matrices, elliptical distributions and beyond. The Annals of Applied Probability 19(6), 2362–2405.
- Oldham, M. C., S. Horvath, and D. H. Geschwind (2006). Conservation and evolution of gene coexpression networks in human and chimpanzee brains. Proceedings of the National Academy of Sciences 103(47), 17973–17978.
- Opgen-Rhein, R. and K. Strimmer (2007). From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Systems Biology 1(1), 37.
- Ramsay, J. O. and B. W. Silverman (2002). Applied functional data analysis: methods and case studies. Springer.
- Schott, J. R. (2005). Testing for complete independence in high dimensions. Biometrika 92(4), 951–956.
- Steiger, J. H. (1980). Testing pattern hypotheses on correlation matrices: Alternative statistics and some empirical results. Multivariate Behavioral Research 15, 335–352.
- Xiao, H. and W. Wu (2013). Asymptotic theory for maximum deviations of sample covariance matrix estimates. Stochastic Processes and their Applications 123, 2899–2920.
- Yin, Y., C. Li, G.-L. Tian, and S. Zheng (2022). Spectral properties of rescaled sample correlation matrix. Statistica Sinica 32(4), 2007–2022.
- Yin, Y. and Y. Ma (2022). Properties of eigenvalues and eigenvectors of large-dimensional sample correlation matrices. The Annals of Applied Probability 32(6), 4763–4802.
- Yong, A. G. and S. Pearce (2013). A beginner’s guide to factor analysis: Focusing on exploratory factor analysis. Tutorials in Quantitative Methods for Psychology 9(2), 79–94.
- Zheng, S. R., Z. Chen, H. J. Cui, and R. Z. Li (2019). Hypothesis testing on linear structures of high-dimensional covariance matrix. The Annals of Statistics 47(6), 3300–3334.
- Zheng, S. R., G. H. Cheng, J. H. Guo, and H. T. Zhu (2019). Test for high-dimensional correlation matrices. The Annals of Statistics 47(5), 2887–2921.
- Zhong, P.-S., W. Lan, P. X. K. Song, and C.-L. Tsai (2017). Tests for covariance structures with high-dimensional repeated measurements. The Annals of Statistics 45(3), 1185–1213.
- Zhou, W. (2007). Asymptotic distribution of the largest off-diagonal entry of correlation matrices. Transactions of the American Mathematical Society 359(11), 5345–5363.
- Zou, T., R. Lin, S. Zheng, and G.-L. Tian (2021). Two-sample tests for high-dimensional covariance matrices using both difference and ratio. Electronic Journal of Statistics 15(1), 135–210.
Acknowledgments
The authors thank the Co-Editor, Dr. Huixia Judy Wang, the Associate Editor, and three
reviewers for their constructive and insightful comments that greatly improved this paper.
Shurong Zheng’s research was supported by the National Key R&D Program of China (No. 2024YFA1012200), the National Natural Science Foundation of China (Nos. 12326606 and 12231011). Tingting Zou’s research was supported by the National Natural Science Foundation of China (No. 12301339). Guangren Yang’s research was supported by the National Social Science Fund of China grant (24BTJ070). Guo-Liang Tian’s research was partially supported by the National Natural Science Foundation of China (No. 12171225).
Table 3: Empirical size and power for the Tippett’s minimum p-value test Ttn and the Cauchy combination test Tcn under scenarios 1–8 (columns S1–S8), where n observations with dimension p are generated from the Gaussian population.
Empirical size (%)
| n | p | S1 Ttn | S1 Tcn | S2 Ttn | S2 Tcn | S3 Ttn | S3 Tcn | S4 Ttn | S4 Tcn | S5 Ttn | S5 Tcn | S6 Ttn | S6 Tcn | S7 Ttn | S7 Tcn | S8 Ttn | S8 Tcn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  |  | 4.2 | 5.0 | 4.5 | 4.8 | 3.9 | 4.7 | 3.5 | 4.7 | 4.1 | 4.6 | 3.4 | 4.7 | 4.3 | 4.8 | 4.7 | 4.8 |
|  |  | 4.8 | 5.2 | 4.6 | 4.9 | 4.2 | 4.9 | 3.6 | 4.6 | 4.4 | 4.9 | 3.2 | 4.6 | 4.3 | 4.6 | 4.4 | 4.5 |
|  |  | 5.0 | 5.2 | 4.8 | 4.9 | 4.5 | 4.9 | 3.3 | 4.3 | 4.4 | 4.6 | 3.0 | 4.3 | 4.2 | 4.6 | 4.5 | 4.5 |
|  |  | 4.9 | 5.3 | 4.8 | 5.0 | 4.7 | 4.8 | 3.6 | 4.7 | 4.4 | 4.7 | 3.1 | 4.7 | 4.8 | 5.2 | 4.2 | 4.2 |
|  |  | 5.2 | 5.3 | 5.3 | 5.3 | 4.9 | 5.1 | 3.7 | 4.7 | 4.6 | 4.7 | 3.4 | 4.8 | 4.7 | 4.8 | 4.3 | 4.1 |
|  |  | 4.4 | 5.0 | 4.4 | 4.6 | 3.9 | 4.6 | 3.2 | 4.4 | 4.0 | 4.5 | 3.0 | 4.5 | 4.1 | 4.5 | 3.8 | 4.0 |
|  |  | 4.5 | 5.0 | 4.6 | 5.0 | 4.2 | 5.0 | 3.4 | 4.7 | 4.4 | 5.0 | 3.3 | 4.8 | 4.3 | 4.9 | 4.0 | 4.3 |
|  |  | 4.6 | 5.1 | 4.8 | 5.0 | 4.4 | 5.0 | 3.5 | 4.7 | 4.3 | 4.8 | 3.3 | 4.8 | 4.3 | 4.8 | 4.1 | 4.0 |
|  |  | 4.3 | 4.6 | 4.6 | 4.6 | 4.2 | 4.5 | 3.7 | 5.0 | 3.9 | 4.3 | 3.2 | 4.8 | 4.2 | 4.6 | 3.9 | 3.8 |
|  |  | 4.7 | 4.8 | 5.0 | 5.1 | 4.7 | 5.1 | 4.0 | 5.2 | 4.6 | 4.7 | 3.5 | 5.1 | 4.9 | 5.3 | 3.5 | 3.5 |
Empirical power (%)
| n | p | S1 Ttn | S1 Tcn | S2 Ttn | S2 Tcn | S3 Ttn | S3 Tcn | S4 Ttn | S4 Tcn | S5 Ttn | S5 Tcn | S6 Ttn | S6 Tcn | S7 Ttn | S7 Tcn | S8 Ttn | S8 Tcn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  |  | 100.0 | 100.0 | 52.7 | 54.9 | 68.0 | 67.8 | 14.7 | 17.9 | 89.1 | 90.9 | 61.8 | 67.4 | 18.1 | 18.4 | 44.8 | 46. |
|  | 100 | 100.0 | 100.0 | 93.8 | 95.0 | 52.7 | 52.8 | 19.0 | 22.0 | 90.0 | 91.7 | 62.9 | 68.8 | 33.1 | 33.2 | 36.7 | 37. |
|  | 300 | 100.0 | 100.0 | 100.0 | 100.0 | 30.8 | 31.3 | 41.3 | 42.4 | 99.8 | 99.9 | 65.0 | 69.9 | 91.0 | 90.8 | 43.8 | 45. |
|  | 500 | 100.0 | 100.0 | 100.0 | 100.0 | 23.2 | 23.4 | 66.5 | 66.1 | 100.0 | 100.0 | 66.7 | 71.2 | 99.8 | 99.7 | 57.4 | 59. |
|  | 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 15.7 | 16.2 | 98.5 | 98.5 | 100.0 | 100.0 | 71.4 | 74.7 | 100.0 | 100.0 | 87.5 | 88. |
|  |  | 100.0 | 100.0 | 88.2 | 89.1 | 99.9 | 99.9 | 44.0 | 49.8 | 100.0 | 100.0 | 99.8 | 99.9 | 34.3 | 34.4 | 97.1 | 97. |
|  | 100 | 100.0 | 100.0 | 100.0 | 100.0 | 99.6 | 99.5 | 53.9 | 58.2 | 100.0 | 100.0 | 99.9 | 99.9 | 56.3 | 56.0 | 95.7 | 96. |
|  | 300 | 100.0 | 100.0 | 100.0 | 100.0 | 97.7 | 97.5 | 79.6 | 80.7 | 100.0 | 100.0 | 100.0 | 100.0 | 98.3 | 98.3 | 97.0 | 97. |
|  | 500 | 100.0 | 100.0 | 100.0 | 100.0 | 96.1 | 95.8 | 93.4 | 93.2 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 98.7 | 98. |
|  | 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 91.6 | 91.4 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.9 | 99. |
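The two combination rules named in the Table 3 caption can be sketched generically as follows. This is a minimal illustration of Tippett's minimum p-value rule (in its form that is exact for independent p-values) and the Cauchy combination rule of Liu and Xie (2020) with equal weights. The function names and the example p-values are hypothetical, and the paper's exact definitions of Ttn and Tcn may differ.

```python
import math

def tippett_min_p(pvalues):
    """Combined p-value from the smallest p-value, exact when the p-values are independent."""
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k

def cauchy_combination(pvalues, weights=None):
    """Cauchy combination p-value; weights are assumed nonnegative and sum to one."""
    k = len(pvalues)
    if weights is None:
        weights = [1.0 / k] * k            # equal weights by default
    # Each p-value is mapped to a standard Cauchy quantile, then averaged.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Upper tail probability of the standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi

# Hypothetical p-values from a quadratic-norm test and an infinity-norm test.
p_quad, p_max = 0.20, 0.003
print("Tippett:", tippett_min_p([p_quad, p_max]))
print("Cauchy :", cauchy_combination([p_quad, p_max]))
```

Both rules are driven mainly by the smaller p-value, so the combined test tends to inherit power from whichever component test suits the alternative at hand.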
Supplementary Materials
In the supplement, we give the detailed proofs of Lemma 1, Theorems 1, 2, 3, and 5, and Corollary 3. We also present the simulation results when the observations are generated from the Gamma population.