
Statistica Sinica 31 (2021), 1189-1214

ON SIMULTANEOUS CALIBRATION OF
TWO-SAMPLE t-TESTS FOR
HIGH-DIMENSION LOW-SAMPLE-SIZE DATA

Chunming Zhang, Shengji Jia and Yongfeng Wu

University of Wisconsin, Madison

Abstract: The exact distribution is typically unavailable for a two-sample t-statistic in a single test for equal population means if we have non-Gaussian samples, unequal population variances, or unequal sample sizes n1 and n2. In this case, a calibration method using a reference distribution offers a practically feasible substitute. This study simultaneously calibrates a diverging number m of two-sample t-statistics for inferences of significance in high-dimensional data from a small sample. For the Gaussian calibration method, we demonstrate the following. First, the simultaneous "general" two-sample t-statistics achieve the overall significance level, as long as log(m) increases at a strictly slower rate than (n1 + n2) as n1 + n2 diverges. Second, directly applying the same calibration method to simultaneous "pooled" two-sample t-statistics may substantially lose the overall level accuracy. The proposed "adaptively pooled" two-sample t-statistics overcome such incoherence, while operating as simply and performing as well as the "general" two-sample t-statistics. Third, we propose a "two-stage" t-test procedure to effectively alleviate the skewness commonly encountered in various two-sample t-statistics in practice, thus increasing the calibration accuracy. Lastly, we discuss the implications of these results using simulation studies and real-data applications.
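
To make the setting concrete, the following is a minimal sketch of Gaussian calibration applied to m simultaneous two-sample t-statistics. It is an illustration under stated assumptions, not the procedure from the paper: it takes the "general" statistic to be the unpooled form with separate variance estimates, uses N(0, 1) as the reference distribution, and controls the overall level with a Bonferroni cutoff. The function names `general_t`, `gaussian_p_value`, and `simultaneous_test` are hypothetical, and the "adaptively pooled" and "two-stage" procedures are not shown.

```python
import math
import random
from statistics import mean, variance


def general_t(x, y):
    """Two-sample t-statistic with separate (unpooled) variance estimates.

    Assumption: this is what the abstract calls the "general" statistic.
    """
    n1, n2 = len(x), len(y)
    se = math.sqrt(variance(x) / n1 + variance(y) / n2)
    return (mean(x) - mean(y)) / se


def gaussian_p_value(t):
    """Two-sided p-value using N(0, 1) as the reference distribution.

    2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


def simultaneous_test(X, Y, alpha=0.05):
    """Test m features at overall level alpha.

    X and Y are lists of m samples, one pair per feature.
    Assumption: a Bonferroni cutoff alpha / m is used for the overall level.
    """
    m = len(X)
    t_stats = [general_t(x, y) for x, y in zip(X, Y)]
    p_vals = [gaussian_p_value(t) for t in t_stats]
    rejected = [p <= alpha / m for p in p_vals]
    return t_stats, p_vals, rejected


if __name__ == "__main__":
    # Synthetic data for illustration only: m features, small unequal samples.
    rng = random.Random(0)
    m, n1, n2 = 200, 8, 12
    X = [[rng.gauss(0.0, 1.0) for _ in range(n1)] for _ in range(m)]
    Y = [[rng.gauss(0.0, 2.0) for _ in range(n2)] for _ in range(m)]
    _, _, rejected = simultaneous_test(X, Y)
    print("rejections among", m, "true nulls:", sum(rejected))
```

With small n1 and n2, the N(0, 1) tail only approximates the tail of each statistic, which is why the growth of m relative to n1 + n2 matters for the overall level.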

Key words and phrases: Familywise error rate, multiple hypothesis testing, overall significance level, simultaneous inference, skewness.
