Statistica Sinica 28 (2018), 1265-1284

SPARSE k-MEANS WITH 𝓁∞/𝓁0 PENALTY
FOR HIGH-DIMENSIONAL DATA CLUSTERING
Xiangyu Chang1, Yu Wang2, Rongjian Li3 and Zongben Xu1
1 Xi'an Jiaotong University, 2 University of California, Berkeley
and 3 Old Dominion University

Abstract: One of the existing sparse clustering approaches, 𝓁1-k-means, maximizes the weighted between-cluster sum of squares subject to an 𝓁1 penalty. In this paper, we propose a sparse clustering method based on an 𝓁∞/𝓁0 penalty, which we call 𝓁0-k-means, and design an efficient iterative algorithm for solving it. To compare the theoretical properties of 𝓁1- and 𝓁0-k-means, we show that both can be explained explicitly from a thresholding perspective, each with a different thresholding function. Moreover, 𝓁1- and 𝓁0-k-means are proven to have a screening consistency property under Gaussian mixture models. Experiments on synthetic as well as real data show that 𝓁0-k-means outperforms 𝓁1-k-means.

Key words and phrases: High-dimensional data clustering, screening property, sparse k-means.