
Statistica Sinica 34 (2024), 1565-1583

ROBUST ESTIMATION OF COVARIANCE MATRICES:
ADVERSARIAL CONTAMINATION AND BEYOND

Stanislav Minsker* and Lang Wang

University of Southern California

Abstract: We consider the problem of estimating the covariance structure of a random vector Y ∈ ℝ^d from an independent and identically distributed (i.i.d.) sample Y_1, …, Y_n. We are interested in the situation in which d is large relative to n, but the covariance matrix Σ of interest has (exactly or approximately) low rank. We assume that the given sample is either (a) ε-adversarially corrupted, meaning that an ε-fraction of the observations can be replaced by arbitrary vectors, or (b) i.i.d., but the underlying distribution is heavy-tailed, meaning that the norm of Y possesses only a finite fourth moment. We propose estimators that are adaptive to the potential low-rank structure of the covariance matrix and to the proportion of contaminated data, and that admit tight deviation guarantees despite rather weak underlying assumptions. Finally, we show that the proposed construction leads to numerically efficient algorithms that require minimal tuning from the user, and demonstrate the performance of such methods under various models of contamination.

Key words and phrases: Adversarial contamination, covariance estimation, heavy-tailed distribution, low-rank recovery, U-statistics.
