

Statistica Sinica 16(2006), 635-657





COMPARING LEARNING METHODS FOR CLASSIFICATION


Yuhong Yang


University of Minnesota


Abstract: We address the consistency property of cross validation (CV) for classification. Sufficient conditions are obtained on the data splitting ratio to ensure that the better classifier between two candidates will be favored by CV with probability approaching 1. Interestingly, it turns out that for comparing two general learning methods, the ratio of the training sample size to the evaluation size does not have to approach 0 for consistency in selection, as is required for comparing parametric regression models (Shao (1993)). In fact, the ratio may be allowed to converge to infinity or to any positive constant, depending on the situation. In addition, we discuss confidence intervals and sequential instability in selection for comparing classifiers.
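To make the setup concrete, the following is a minimal sketch of the selection procedure the abstract describes: split the data once into a training part and an evaluation part, fit both candidate classifiers on the training part, and select the one with the smaller evaluation error. The dataset, the two candidate classifiers, and the 50/50 splitting ratio are illustrative assumptions, not the paper's (the paper's results concern how this ratio may scale with the sample size); scikit-learn is assumed available.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic labeled data; any classification sample would do.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

    # One CV split. The ratio of training size to evaluation size (here 1)
    # is the quantity the paper's sufficient conditions constrain; the
    # 50/50 split is purely illustrative.
    X_tr, X_ev, y_tr, y_ev = train_test_split(X, y, test_size=0.5,
                                              random_state=0)

    # Two arbitrary candidate learning methods for illustration.
    candidates = {
        "logistic": LogisticRegression(max_iter=1000),
        "tree": DecisionTreeClassifier(max_depth=5),
    }

    # Fit each candidate on the training part and record its
    # misclassification rate on the held-out evaluation part.
    errors = {}
    for name, clf in candidates.items():
        clf.fit(X_tr, y_tr)
        errors[name] = float(np.mean(clf.predict(X_ev) != y_ev))

    # CV favors the candidate with the smaller evaluation error.
    selected = min(errors, key=errors.get)
    print(errors, "->", selected)

Averaging evaluation errors over multiple random splits is a common refinement of this single-split comparison.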



Key words and phrases: Classification, comparing learning methods, consistency in selection, cross validation paradox, sequential instability.
