Statistica Sinica 16(2006), 589-615

THE DOUBLY REGULARIZED SUPPORT VECTOR MACHINE

Li Wang, Ji Zhu and Hui Zou

University of Michigan and University of Minnesota

Abstract: The standard 2-norm support vector machine (SVM) is a widely used tool for classification problems. The 1-norm SVM is a variant of the standard 2-norm SVM that constrains the 1-norm of the fitted coefficients. Due to the nature of the 1-norm, the 1-norm SVM has the property of automatically selecting variables, which is not shared by the standard 2-norm SVM. It has been argued that the 1-norm SVM may have some advantage over the 2-norm SVM, especially with high dimensional problems and when there are redundant noise variables. On the other hand, the 1-norm SVM has two drawbacks: (1) when there are several highly correlated variables, the 1-norm SVM tends to pick only a few of them and remove the rest; (2) the number of selected variables is upper bounded by the size of the training data. A typical example where these occur is in gene microarray analysis. In this paper, we propose a doubly regularized support vector machine (DrSVM). The DrSVM uses the elastic-net penalty, a mixture of the 2-norm and the 1-norm penalties. By doing so, the DrSVM performs automatic variable selection in a way similar to the 1-norm SVM. In addition, the DrSVM encourages highly correlated variables to be selected (or removed) together. We illustrate how the DrSVM can be particularly useful when the number of variables is much larger than the size of the training data (p ≫ n). We also develop efficient algorithms to compute the whole solution paths of the DrSVM.

Key words and phrases: Grouping effect, p ≫ n, quadratic programming, SVM, variable selection.
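
As a rough illustration of the elastic-net idea described in the abstract, the sketch below evaluates a hinge loss plus a mixture of 1-norm and 2-norm penalties on the coefficients. The function name `drsvm_objective`, the parameter names `lam1` and `lam2`, the factor 1/2 on the 2-norm term, and the toy data are all assumptions made for this example and are not taken from the text above. With two identical variables, the 1-norm penalty alone does not distinguish between putting all the weight on one variable and sharing it equally, whereas adding the 2-norm term favors sharing, which is one way to picture the grouping effect.

```python
import numpy as np

def drsvm_objective(beta0, beta, X, y, lam1, lam2):
    """Hinge loss plus an elastic-net penalty (illustrative sketch only).

    The names and the 1/2 scaling on the 2-norm term are assumptions.
    """
    margins = y * (beta0 + X @ beta)
    hinge = np.maximum(0.0, 1.0 - margins).sum()   # sum of hinge losses
    l1 = lam1 * np.abs(beta).sum()                 # 1-norm penalty
    l2 = 0.5 * lam2 * np.dot(beta, beta)           # squared 2-norm penalty
    return hinge + l1 + l2

# Toy data (hypothetical): two perfectly correlated (identical) variables.
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])

sparse = np.array([1.0, 0.0])   # all weight on one variable
shared = np.array([0.5, 0.5])   # weight split across both variables

# Both coefficient vectors give the same fit (zero hinge loss here).
obj_sparse = drsvm_objective(0.0, sparse, X, y, lam1=1.0, lam2=1.0)
obj_shared = drsvm_objective(0.0, shared, X, y, lam1=1.0, lam2=1.0)

# With lam2 = 0 only the 1-norm penalty remains, and the two tie.
tie_sparse = drsvm_objective(0.0, sparse, X, y, lam1=1.0, lam2=0.0)
tie_shared = drsvm_objective(0.0, shared, X, y, lam1=1.0, lam2=0.0)
```

In this toy setting the shared solution has the smaller objective once `lam2 > 0`, while the two are tied when `lam2 = 0`. This only evaluates the objective at two hand-picked points; it does not compute solution paths.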