

Statistica Sinica 14(2004), 457-483





OPTIMAL SMOOTHING IN KERNEL DISCRIMINANT ANALYSIS


Anil K. Ghosh and Probal Chaudhuri


Indian Statistical Institute, Calcutta


Abstract: One well-known use of kernel density estimates is in nonparametric discriminant analysis, and its popularity is evident in its implementation in some commonly used statistical software packages (e.g., SAS). In this paper, we make a critical investigation into the influence of the value of the bandwidth on the behavior of the average misclassification probability of a classifier that is based on kernel density estimates. In the course of this investigation, we have observed some counter-intuitive results. For instance, the use of bandwidths that minimize the mean integrated squared errors of kernel estimates of population densities may lead to rather poor average misclassification rates. Further, the best choice of smoothing parameters in classification problems depends not only on the underlying true densities and sample sizes but also on the prior probabilities. In particular, if the prior probabilities are all equal, the behavior of the average misclassification probability turns out to be quite interesting when both the sample sizes and the bandwidths are large. Our theoretical analysis provides some new insights into the problem of smoothing in nonparametric discriminant analysis. We also observe that popular cross-validation techniques (e.g., leave-one-out or $V$-fold) may not be very effective for selecting the bandwidth in practice. As a by-product of our investigation, we present a method for choosing appropriate values of the bandwidths when kernel density estimates are fitted to the training sample in a classification problem. The performance of the proposed method has been demonstrated using some simulation experiments as well as analysis of benchmark data sets, and its asymptotic properties have been studied under some regularity conditions.
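
To make the setting concrete, the following is a minimal sketch of a classifier based on kernel density estimates, together with an empirical misclassification rate computed for several bandwidths. It is not the bandwidth selection method proposed in the paper. The function names, the one-dimensional Gaussian kernel, the common bandwidth shared by both classes, and the simulated location-shift data are all assumptions made purely for illustration.

```python
import numpy as np


def gaussian_kde(train, x, h):
    """Gaussian kernel density estimate at points x, built from a 1-D sample."""
    z = (x[:, None] - train[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(train) * h * np.sqrt(2.0 * np.pi))


def kda_classify(samples, priors, x, h):
    """Assign each point in x to the class maximizing prior * estimated density."""
    scores = np.stack([p * gaussian_kde(s, x, h) for s, p in zip(samples, priors)])
    return scores.argmax(axis=0)


def misclassification_rate(samples, priors, test_x, test_y, h):
    """Fraction of test points whose predicted class differs from the true class."""
    return float(np.mean(kda_classify(samples, priors, test_x, h) != test_y))


# Hypothetical two-class location-shift example: N(0, 1) versus N(2, 1).
rng = np.random.default_rng(0)
train = [rng.normal(0.0, 1.0, 100), rng.normal(2.0, 1.0, 100)]
test_x = np.concatenate([rng.normal(0.0, 1.0, 2000), rng.normal(2.0, 1.0, 2000)])
test_y = np.repeat([0, 1], 2000)
priors = [0.5, 0.5]  # equal prior probabilities (an assumption of this sketch)

# Empirical misclassification rate as a function of the common bandwidth h.
rates = {h: misclassification_rate(train, priors, test_x, test_y, h)
         for h in (0.1, 0.5, 1.0, 5.0)}
for h, r in rates.items():
    print(f"h = {h:>4}: misclassification rate = {r:.3f}")
```

Comparing the rates across bandwidths in a sketch like this gives a rough feel for how the smoothing parameter affects classification error, as opposed to density estimation error, in one simple case.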



Key words and phrases: Average misclassification probability, bandwidth selection, Bayes' risk, cross-validation techniques, location-shift models, scale space, spherical symmetry.

