GROS: A General Robust Aggregation Strategy

Alejandro Cholaquidis, Emilien Joly and Leonardo Moreno

doi:10.5705/ss.202024.0414

Abstract

A new, very general, robust procedure for combining estimators in metric spaces is introduced (GROS). The method is reminiscent of the well-known median of means, as described in Devroye, Lerasle, Lugosi and Oliveira (2016). Initially, the sample is divided into K groups. Subsequently, an estimator is computed for each group. Finally, these K estimators are combined using a robust procedure. We prove that this estimator is sub-Gaussian and we obtain its breakdown point, in the sense of Donoho. The robust procedure involves a minimization problem on a general metric space, but we show that the same (up to a constant) sub-Gaussianity is obtained if the minimization is taken over the sample, making GROS feasible in practice. The performance of GROS is evaluated through five simulation studies: the first focuses on classification using k-means, the second on the multi-armed bandit problem, the third on the regression problem, and the fourth on the set estimation problem under a noisy model; the fifth applies GROS to obtain a robust persistence diagram. Lastly, an application of robust estimation techniques to determine the home range of Canis dingo in Australia is implemented.
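
The three steps described in the abstract (split the sample, estimate on each group, combine robustly over the sample) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the combination rule, so the sketch assumes one natural choice, namely selecting, among the K group estimates, the one whose smallest ball containing more than half of all estimates has minimal radius. The names `gros_over_sample` and `gros_mean`, the use of the sample mean as the per-group estimator, and the Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def gros_over_sample(estimates, dist):
    """Assumed aggregation rule: return the group estimate whose smallest
    ball holding more than half of all estimates has minimal radius."""
    K = len(estimates)
    # Pairwise distances between the K group estimates.
    D = np.array([[dist(a, b) for b in estimates] for a in estimates])
    # The ball must contain strictly more than K/2 estimates (centre included).
    m = K // 2 + 1
    # In each sorted row, entry m-1 is the radius needed to cover m estimates.
    radii = np.sort(D, axis=1)[:, m - 1]
    best = int(np.argmin(radii))
    return estimates[best], float(radii[best])

def gros_mean(sample, K, rng):
    """Split the sample into K groups, average each group, then aggregate."""
    groups = np.array_split(rng.permutation(len(sample)), K)
    means = [sample[g].mean(axis=0) for g in groups]
    euclid = lambda a, b: float(np.linalg.norm(a - b))  # assumed metric
    return gros_over_sample(means, euclid)

# Toy data (hypothetical): 300 standard normal points plus 5 gross outliers.
rng = np.random.default_rng(0)
clean = rng.normal(size=(300, 2))
outliers = np.full((5, 2), 1e6)
sample = np.vstack([clean, outliers])

est, radius = gros_mean(sample, K=15, rng=rng)
print("aggregated estimate:", est)
print("plain sample mean:  ", sample.mean(axis=0))
```

With 5 outliers and K = 15, at most 5 groups are contaminated, so more than half of the group means remain close to the true centre and the selected estimate is one of them, whereas the plain sample mean is dragged far away. Because the search runs only over the K group estimates, the cost is that of one K-by-K distance matrix, which is what makes the restriction to the sample attractive in a general metric space.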

Key words and phrases: Bandits, Median of means, Robustness, Sub-Gaussian estimator, Topological data analysis

Information

Preprint No.	SS-2024-0414
Manuscript ID	SS-2024-0414
Complete Authors	Alejandro Cholaquidis, Emilien Joly, Leonardo Moreno
Corresponding Authors	Leonardo Moreno
Emails	leonardo.moreno@fcea.edu.uy

References

Aaron, C., Cholaquidis, A. and Fraiman, R. (2022). Estimation of surface area. Electron. J. Statist. 16(2), 3751–3788.
Agrawal, R. (1995). Sample mean based index policies by o(log n) regret for the multi-armed bandit problem. Advances in Applied Probability 27(4), 1054–1078.
Azzalini, A. (2013). The Skew-Normal and Related Families. Institute of Mathematical Statistics Monographs. Cambridge University Press.
Baíllo, A. and Chacón, J. E. (2021). Statistical outline of animal home ranges: an application of set estimation. Handbook of Statistics 44, 3–37.
Biau, G., Fischer, A., Guedj, B. and Malley, J. D. (2016). COBRA: A combined regression strategy. Journal of Multivariate Analysis 146, 18–28.
Boente, G., Martínez, A. and Salibián-Barrera, M. (2017). Robust estimators for additive models using backfitting. Journal of Nonparametric Statistics 29(4), 744–767.
Boursier, E. and Perchet, V. (2022). A survey on multi-player bandits. arXiv preprint arXiv:2211.16275.
Breiman, L. (1996). Stacked regressions. Machine Learning 24, 49–64.
Breiman, L. (2001). Random forests. Machine Learning 45, 5–32.
Bubeck, S., Cesa-Bianchi, N. and Lugosi, G. (2013). Bandits with heavy tail. IEEE Transactions on Information Theory 59(11), 7711–7717.
Burtini, G., Loeppky, J. and Lawrence, R. (2015). A survey of online experiment design with the stochastic multi-armed bandit. arXiv preprint arXiv:1510.00757.
Burt, W. H. (1943). Territoriality and home range concepts as applied to mammals. Journal of Mammalogy 24(3), 346–352.
Cholaquidis, A., Fraiman, R., Ghattas, B. and Kalemkerian, J. (2021). A combined strategy for multivariate density estimation. Journal of Nonparametric Statistics 33(1), 39–59.
Cholaquidis, A., Fraiman, R., Kalemkerian, J. and Llop, P. (2016). A nonlinear aggregation type classifier. Journal of Multivariate Analysis 146, 269–281.
Cholaquidis, A., Fraiman, R., Mordecki, E. and Papalardo, C. (2021). Level set and drift estimation from a reflected Brownian motion with drift. Statistica Sinica 31, 29–51.
Cholaquidis, A., Hernández, M. and Fraiman, R. (2023). Home range estimation under a restricted sampling scheme. Journal of Nonparametric Statistics, to appear.
Cuesta-Albertos, J. A., Gordaliza, A. and Matrán, C. (1997). Trimmed k-means: an attempt to robustify quantizers. Ann. Statist. 25(2), 553–576.
Cuevas, A. and Rodríguez-Casal, A. (2004). On boundary estimation. Advances in Applied Probability 36(2), 340–354.
Devroye, L., Lerasle, M., Lugosi, G. and Oliveira, R. I. (2016). Sub-Gaussian mean estimators. Ann. Statist. 44(6), 2695–2725.
Donoho, D. L. (1982). Breakdown properties of multivariate location estimators. Technical report, Harvard University, Boston.
Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer.
Edelsbrunner, H. and Harer, J. L. (2022). Computational Topology: An Introduction. American Mathematical Society.
Fernández, C. and Steel, M. F. J. (1998). On Bayesian modeling of fat tails and skewness. Journal of the American Statistical Association 93(441), 359–371.
Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139.
Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-free Theory of Nonparametric Regression. Springer.
Hartigan, J. A. (1978). Asymptotic distributions for clustering criteria. Ann. Statist. 6, 117–131.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35, 73–101.
James, W. and Stein, C. (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, 361–379.
Joly, E., Lugosi, G. and Oliveira, R. I. (2017). On the estimation of the mean of a random vector. Electron. J. Statist. 11(1), 440–451.
Kaufman, L. (1990). Partitioning Around Medoids. In: Finding Groups in Data, 344:68–125.
Kaufman, L. and Rousseeuw, P. J. (2009). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley.
Lattimore, T. and Szepesvári, C. (2020). Bandit Algorithms. Cambridge University Press.
Lecué, G. and Lerasle, M. (2020). Robust machine learning by median-of-means: Theory and practice. Ann. Statist. 48(2), 906–931.
Lugosi, G. and Mendelson, S. (2019). Mean estimation and regression under heavy-tailed distributions: A survey. Foundations of Computational Mathematics 19(5), 1145–1190.
Lugosi, G. and Mendelson, S. (2019). Sub-Gaussian estimators of the mean of a random vector. Ann. Statist. 47(2), 783–794.
Maronna, R. A., Martin, R. D., Yohai, V. J. and Salibián-Barrera, M. (2019). Robust Statistics: Theory and Methods (with R). Wiley.
MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, University of California Press, 281–297.
Nadaraya, E. A. (1964). On estimating regression. Theory of Probability & Its Applications 9(1), 141–142.
Nemirovsky, A. S. and Yudin, D. B. (1983). Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience.
Oh, H.-S., Nychka, D. W. and Lee, T. C. M. (2007). The role of pseudo data for robust smoothing with application to wavelet regression. Biometrika 94(4), 893–904.
Pollard, D. (1981). Strong consistency of k-means clustering. Ann. Probability 9(1), 135–140.
Pollard, D. (1982). A central limit theorem for k-means clustering. Ann. Probability 10(4), 919–926.
Rodriguez, D. and Valdora, M. (2019). The breakdown point of the median of means tournament. Statistics & Probability Letters 153, 108–112.
Rodríguez-Casal, A. (2007). Set estimation under convexity type assumptions. Ann. IHP Probab. Stat. 43(6), 763–774.
Salibián-Barrera, M. (2023). Robust nonparametric regression: Review and practical considerations. Econometrics and Statistics, in press (corrected proof), available online 25 April 2023. doi: 10.1016/j.ecosta.2023.04.004.
Smith, B. P., Cairns, K. M., Adams, J. W., Newsome, T. M., Fillios, M., Deaux, E. C. et al. (2019). Taxonomic status of the Australian dingo: the case for Canis dingo Meyer, 1793. Zootaxa 4564(1), 173–197.
Vishwanath, S., Fukumizu, K., Kuriki, S. and Sriperumbudur, B. K. (2020). Robust persistence diagrams using reproducing kernels. Advances in Neural Information Processing Systems 33, 21900–21911.
Vishwanath, S., Sriperumbudur, B. K., Fukumizu, K. and Kuriki, S. (2022). Robust topological inference in the presence of outliers. arXiv preprint arXiv:2206.01795.
Watson, G. S. (1964). Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A 26(4), 359–372.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks 5(2), 241–259.
Wysong, M. L., Hradsky, B. A., Iacona, G. D., Valentine, L. E., Morris, K. and Ritchie, E. G. (2020). Space use and habitat selection of an invasive mesopredator and sympatric, native apex predator. Movement Ecology 8, 1–115.

Acknowledgments

We warmly thank the two anonymous referees for their thoughtful comments and valuable suggestions, which have significantly improved the quality of the paper. We also thank the Editors for their careful handling of the manuscript and for their constructive guidance throughout the review process. The research of the first and third authors has been partially supported by grant FCE-3-2022-1-172289 from ANII (Uruguay) and grant 22520220100031UD from CSIC (Uruguay).

Supplementary Materials

This document provides complementary material to the paper GROS: A General Robust Aggregation Strategy. We present two additional application settings illustrating how the GROS principle can be used to build robust procedures under heavy-tailed noise. In Section 1 we adapt the robust aggregated mean estimator (as in Eq. (1) of the main paper) to obtain a robust UCB-type strategy for multi-armed bandits with heavy-tailed rewards, and we include a small simulation study. In Section 2 we apply GROS to nonparametric regression by aggregating kernel estimators computed on independent subsamples, discuss a practical approximation of the L2-distance used by the method, and compare the resulting estimator with classical and robust alternatives through simulations.
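
The regression setting of Section 2 can be illustrated with a rough sketch, under several assumptions that are not taken from the supplement: a Gaussian-kernel Nadaraya-Watson estimator on each subsample, a fixed bandwidth `h`, an L2-distance approximated by a Riemann sum on an evaluation grid, and the same hypothetical "smallest ball containing more than half of the estimates" selection rule. The function names and the simulated heavy-tailed data are purely illustrative.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h):
    """Gaussian-kernel Nadaraya-Watson estimate evaluated at x_eval."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

def aggregate_curves(curves, dx):
    """Assumed rule: pick the curve whose smallest L2-ball containing more
    than half of the curves has minimal radius (L2 via a Riemann sum)."""
    diff = curves[:, None, :] - curves[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2) * dx)   # approximate L2 distances
    m = len(curves) // 2 + 1                    # strictly more than half
    radii = np.sort(D, axis=1)[:, m - 1]
    return curves[int(np.argmin(radii))]

# Simulated data (hypothetical): sine signal with infinite-variance noise.
rng = np.random.default_rng(1)
n, K, h = 700, 7, 0.1
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_t(df=1.5, size=n)

grid = np.linspace(0.05, 0.95, 91)
dx = grid[1] - grid[0]

# One kernel estimate per disjoint subsample, all evaluated on the same grid.
groups = np.array_split(rng.permutation(n), K)
curves = np.array([nadaraya_watson(x[g], y[g], grid, h) for g in groups])
fitted = aggregate_curves(curves, dx)
print("max abs error on grid:", np.max(np.abs(fitted - np.sin(2 * np.pi * grid))))
```

Evaluating every subsample estimator on a common grid turns each curve into a vector, so the L2-distance between two curves reduces to a scaled Euclidean distance between vectors. A finer grid gives a closer approximation at a higher cost.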

Supplementary materials are available for download.
