Abstract
This paper considers the problem of testing the marginal distributions
of multiple, independent data streams, where for each data stream, multiple composite hypotheses along with an indifference zone are posed. A novel global error
metric is proposed, which aims to control the probabilities of making different
numbers of misclassifications below different, user-specified levels, and which includes the classical and the generalized misclassification probabilities as special
cases. A novel testing procedure is designed and is shown to achieve the minimum expected sample size under all possible distributions, among all tests that
control this global error metric below the same levels, asymptotically as any of
these levels goes to zero. This asymptotic optimality theory is established while allowing for temporal dependence and for general information functions, beyond the linear ones that
are considered in most of the literature. Examples are provided to illustrate the theory,
and numerical studies are presented to visualize both the asymptotic properties
and the finite-sample performance.
*Author's ORCID iD: 0000-0001-8508-7982
Key words and phrases: asymptotic optimality, multihypothesis testing, multiple testing, non-linear information function, sequential analysis
Information
| Preprint No. | SS-2025-0042 |
|---|---|
| Manuscript ID | SS-2025-0042 |
| Complete Authors | Yiming Xing |
| Corresponding Authors | Yiming Xing |
| Emails | yimingx4@tongji.edu.cn |
Acknowledgments
The author would like to thank the Editor, the Associate Editor and the
reviewers for valuable comments and constructive suggestions, which have
greatly improved the paper.
The author is supported by the National Natural Science Foundation of China (No. 12501379), the Shanghai Rising-Star
Program (No. 24YF2748500), and the Open Research Fund of the Key Laboratory
of Advanced Theory and Application in Statistics and Data Science (East
China Normal University), Ministry of Education (No. KLATASDS2501).
Supplementary Materials
In the supplementary material, we illustrate the general theory through
three concrete examples, present additional numerical studies of testing the correlation coefficient of autoregressive data, and present all proofs and a
discussion about model misspecification.