Abstract

Recent research has shown that machine learning models are susceptible to adversarial attacks, wherein small but maliciously chosen perturbations of the input can significantly degrade model performance. In this paper, we theoretically analyse the limits of robustness against such adversarial attacks in a nonparametric regression setting by examining the minimax rates of convergence in an adversarial sup-norm. Our analysis reveals that the minimax rate under adversarial attacks on the input is the sum of two terms: one represents the minimax rate in the standard setting without adversarial attacks, and the other reflects the maximum deviation of the true regression function within the target function class under the allowed input perturbations. The optimal rates under the adversarial setup can be achieved by an adversarial plug-in procedure constructed from a minimax optimal estimator in the corresponding standard setting. Two specific examples are given to illustrate the established minimax results.
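To make the stated decomposition concrete, the display below is a schematic rendering of the result described above; the notation (an attack radius delta_n, a target function class F, and an adversarial sup-norm risk in which the attacker perturbs the evaluation point) is our own illustrative shorthand and not the paper's formal statement.

% Schematic decomposition described in the abstract (illustrative notation):
% adversarial sup-norm minimax rate  ~  standard sup-norm minimax rate
%                                      + maximal in-class deviation under perturbations of size delta_n.
\[
\inf_{\hat f}\ \sup_{f\in\mathcal{F}}\ \mathbb{E}\,
  \sup_{x}\ \sup_{\|x'-x\|\le\delta_n}
  \bigl|\hat f(x') - f(x)\bigr|
\;\asymp\;
\underbrace{\inf_{\hat f}\ \sup_{f\in\mathcal{F}}\ \mathbb{E}\,\bigl\|\hat f - f\bigr\|_{\infty}}_{\text{standard sup-norm minimax rate}}
\;+\;
\underbrace{\sup_{f\in\mathcal{F}}\ \sup_{x}\ \sup_{\|x'-x\|\le\delta_n}\bigl|f(x')-f(x)\bigr|}_{\text{maximum in-class deviation under perturbation}}.
\]

Read this way, the second term is a uniform modulus of continuity of the class F at scale delta_n, which is consistent with the abstract's claim that plugging a standard sup-norm optimal estimator into the adversarial problem attains the rate.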

Information

Preprint No.: SS-2024-0298
Manuscript ID: SS-2024-0298
Complete Authors: Jingfu Peng, Yuhong Yang
Corresponding Author: Yuhong Yang
Email: yangx374@umn.edu
Affiliation (Jingfu Peng): Yau Mathematical Sciences Center, Tsinghua University

Acknowledgments

The authors thank the associate editor and two anonymous referees for their insightful and constructive comments.

Supplementary Materials

The online Supplementary Material includes the theoretical proofs for the results in the main theorems and Examples 1 and 2.

