Statistica Sinica

Anil K. Ghosh and Peter Hall

Abstract: There is a substantial literature on the estimation of error rate, or risk, for nonparametric classifiers. Error-rate estimation has at least two purposes: accurately describing the error rate, and estimating the tuning parameters that permit the error rate to be minimised. In the light of work on related problems in nonparametric statistics, it is attractive to argue that both problems admit the same solution. Indeed, methods for optimising the point-estimation performance of nonparametric curve estimators often start from an accurate estimator of error. However, we argue in this paper that accurate estimators of error rate in classification tend to give poor results when used to choose tuning parameters, and vice versa. Concise theory is used to illustrate this point in the case of cross-validation (which gives very accurate estimators of error rate, but poor estimators of tuning parameters) and the smoothed bootstrap (where error-rate estimation is poor but tuning-parameter approximations are particularly good). The theory is readily extended to other methods, for example to the bootstrap approach, which gives good estimators of error rate but poor estimators of tuning parameters. Reasons for the apparent contradiction are given, and numerical results are used to point to the practical implications of the theory.

Key words and phrases: Bayes risk, bootstrap, cross-validation, discrimination, error rate, kernel methods, nonparametric density estimation, risk.