

Statistica Sinica 7(1997), 815-840


SPLIT SELECTION METHODS FOR CLASSIFICATION TREES


Wei-Yin Loh and Yu-Shan Shih


University of Wisconsin-Madison and National Chung Cheng University


Abstract: Classification trees based on exhaustive search algorithms tend to be biased towards selecting variables that afford more splits. As a result, such trees should be interpreted with caution. This article presents an algorithm called QUEST that has negligible bias. Its split selection strategy shares similarities with the FACT method, but it yields binary splits and the final tree can be selected by a direct stopping rule or by pruning. Real and simulated data are used to compare QUEST with the exhaustive search approach. QUEST is shown to be substantially faster and the size and classification accuracy of its trees are typically comparable to those of exhaustive search.
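The selection bias described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a two-class problem, the Gini impurity as the split criterion, and two noise predictors that are independent of the class, one binary (a single possible split) and one continuous (up to n - 1 possible splits). The function names `best_gini_gain` and `selection_rate` are hypothetical. An unbiased selection method would pick each predictor in about half of the trials.

```python
import random


def gini(pos, total):
    """Gini impurity of a two-class node with `pos` class-1 cases out of `total`."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def best_gini_gain(x, y):
    """Largest impurity decrease over all splits of the form x <= c."""
    n = len(y)
    pairs = sorted(zip(x, y))
    total_pos = sum(y)
    parent = gini(total_pos, n)
    best = 0.0
    left_pos = 0
    for i in range(n - 1):
        left_pos += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no threshold separates tied values
        n_left = i + 1
        n_right = n - n_left
        children = (n_left * gini(left_pos, n_left)
                    + n_right * gini(total_pos - left_pos, n_right)) / n
        best = max(best, parent - children)
    return best


def selection_rate(n_cases=100, n_trials=200, seed=0):
    """Fraction of trials in which exhaustive search prefers the continuous noise variable."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # Class labels and both predictors are generated independently (pure noise).
        y = [rng.randint(0, 1) for _ in range(n_cases)]
        x_binary = [rng.randint(0, 1) for _ in range(n_cases)]
        x_cont = [rng.random() for _ in range(n_cases)]
        if best_gini_gain(x_cont, y) > best_gini_gain(x_binary, y):
            wins += 1
    return wins / n_trials


rate = selection_rate()
print(f"continuous noise variable selected in {rate:.0%} of trials")
```

Because neither predictor carries any information about the class, a selection rate well above 50% for the continuous variable reflects only the larger number of candidate splits searched, which is the kind of bias the abstract refers to.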



Key words and phrases: Decision trees, discriminant analysis, machine learning.

