Statistica Sinica 30 (2020), 557-577

METHODS FOR SPARSE AND LOW-RANK RECOVERY

UNDER SIMPLEX CONSTRAINTS

Ping Li, Syama Sundar Rangapuram and Martin Slawski

Baidu Research, Amazon Research and George Mason University

Abstract: The de facto standard approach of promoting sparsity by means of 𝓁_{1}-regularization becomes ineffective in the presence of simplex constraints, that is, when the target is known to have non-negative entries summing to a given constant. The situation is analogous for the use of nuclear norm regularization for the low-rank recovery of Hermitian positive semidefinite matrices with a given trace. In the present paper, we discuss several strategies to deal with this situation, from simple to more complex. First, we consider empirical risk minimization (ERM), which has similar theoretical properties
with respect to the prediction and 𝓁_{2}-estimation errors as 𝓁_{1}-regularization. In light of this, we argue that ERM combined with a subsequent sparsification step (e.g., thresholding) represents a sound alternative to the common heuristic of dropping the sum constraint, applying 𝓁_{1}-regularization, and normalizing afterwards. Next, we show that no sparsity-promoting regularizer can be convex under simplex constraints. A novel sparsity-promoting regularization scheme based on the inverse or the negative of the squared 𝓁_{2}-norm is proposed, which avoids
the shortcomings of various alternative methods from the literature. Our approach naturally extends to Hermitian positive semidefinite matrices with a given trace.
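The regularization idea summarized above can be illustrated with a minimal sketch: minimize a least-squares fit plus the concave penalty −λ‖x‖₂², subject to x lying on the probability simplex, via projected gradient descent. This is only an assumed, simplified instance of the negative squared 𝓁₂-norm scheme (the paper develops a D.C. programming treatment); the function names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex,
    using the standard sort-based algorithm."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def sparse_simplex_fit(A, b, lam=0.05, step=1e-3, iters=5000):
    """Projected gradient descent on
        f(x) = 0.5*||Ax - b||_2^2 - lam*||x||_2^2,  x in the simplex.
    The concave term -lam*||x||_2^2 pushes iterates toward sparse
    (vertex-like) points of the simplex; this is a heuristic sketch,
    not the paper's exact D.C. algorithm."""
    n = A.shape[1]
    x = np.full(n, 1.0 / n)          # start at the barycenter
    for _ in range(iters):
        grad = A.T @ (A @ x - b) - 2.0 * lam * x
        x = project_simplex(x - step * grad)
    return x
```

In a small synthetic test (a random design and a 2-sparse target on the simplex), the returned iterate remains feasible by construction, since every step ends with an exact simplex projection; sparsity is then encouraged by the concave penalty rather than by the constraint alone.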

Key words and phrases: D.C. programming, density matrices of quantum systems, estimation of mixture proportions, simplex constraints, sparsity.