Abstract
Multivariate testing has recently emerged as a promising technique
in scientific decision-making and electronic information fields. Unlike standard
A/B/n testing, which evaluates individual variations, multivariate testing aims
to identify the best-performing combination of variations from all possible combinations. We address the challenge of robustly allocating treatments to subjects
in multivariate testing when treatment effects are confounded by covariates and
subjects are interconnected through a network. In this context, we introduce, for
the first time, the use of a mixed effect model to account for covariate uncertainty
and network structure. Based on this model, we propose a criterion to measure
the regret of efficiency due to incorrect specification of the covariance structure.
We derive minimax robust experimental schemes and introduce a novel scheme
that optimally matches the design with the robust covariance structure.
Our proposed experimental schemes demonstrate: (a) resilience to various optimality criteria, (b) efficiency against model misspecification, and (c) applicability
to complex scenarios. This work extends existing research in optimal A/B testing designs, offering theoretical foundations and practical implementations that
outperform current approaches in statistical efficiency, as demonstrated through
simulations and a case study.
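To make the contrast with A/B/n testing concrete, the following minimal sketch enumerates the full set of combinations that a multivariate test has to compare. The factor names and levels are hypothetical and chosen only for illustration; they do not come from the paper, and the sketch does not implement the proposed robust schemes.

```python
from itertools import product

# Hypothetical factors and levels, assumed here purely for illustration.
factors = {
    "headline": ["H1", "H2"],
    "button_color": ["red", "green", "blue"],
    "layout": ["grid", "list"],
}


def all_combinations(factors):
    """Return every combination of one level per factor (the full factorial)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


combos = all_combinations(factors)

# Individual variations, counted separately for each factor.
n_individual_levels = sum(len(levels) for levels in factors.values())

print(len(combos))          # 12 candidate combinations (2 * 3 * 2)
print(n_individual_levels)  # 7 individual variations (2 + 3 + 2)
```

The number of combinations grows multiplicatively with the number of factors, which is one reason the allocation of subjects to treatment combinations needs careful design.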
Information
| Preprint No. | SS-2025-0157 |
|---|---|
| Manuscript ID | SS-2025-0157 |
| Complete Authors | Shaohua Xu, Yongdao Zhou |
| Corresponding Authors | Yongdao Zhou |
| Emails | ydzhou@nankai.edu.cn |
NITFID, School of Statistics and Data Science, Nankai University, Tianjin 300071, China
Acknowledgments
The authors would like to thank the Editor, Associate Editor, and two
reviewers for their valuable comments and suggestions. This work was supported by the National Natural Science Foundation of China (12131001),
the Fundamental Research Funds for Central Universities, LPMC, and
KLMDASR.
Supplementary Materials
The Supplementary Material includes two applications of the proposed robust experimental schemes (A/B testing and sequential experiments), supplementary simulation results, and proofs of all the theoretical results.