Rao, J. S. (1999). Bootstrap choice of cost complexity for better subset selection. Statistica Sinica, 9(1), 273-287.

Statistica Sinica 9(1999), 273-287

BOOTSTRAP CHOICE OF COST COMPLEXITY FOR BETTER SUBSET SELECTION

J. Sunil Rao

The Cleveland Clinic Foundation

Abstract: Subset selection is a long-standing problem. One goal of a selection procedure is consistency. When Akaike's Final Prediction Error criterion (FPE) is used as the selection procedure, consistency can be shown to be related to the cost complexity parameter in FPE. A second goal of a selection procedure, however, is accurate prediction, and the consistency property does not necessarily guarantee it. The issue can be thought of as a bias versus variance tradeoff for the procedure. We use the bootstrap to model this tradeoff and provide an objective way of choosing a procedure that attempts to balance the two objectives. This is done in the spirit of the cost complexity pruning algorithm of classification and regression trees. The methodology is described and illustrated on simulated and real data examples.

Key words and phrases: Adaptive estimation, Mallows' C_p, model selection, prediction error, resampling methods.
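
To make the idea in the abstract concrete, the following is a minimal sketch rather than the procedure developed in the paper. It assumes a penalized criterion of the form RSS(subset) + lambda * sigma^2 * |subset| (lambda = 2 gives the familiar FPE/C_p penalty), a pairs bootstrap, and prediction error measured on the original sample. The function names `rss`, `select_subset`, and `bootstrap_choose_lambda` are hypothetical, and the exhaustive search is only practical for a handful of predictors.

```python
import numpy as np
from itertools import combinations

def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the given columns."""
    subset = list(subset)
    if not subset:
        return float(y @ y)  # empty model predicts zero (no intercept assumed)
    beta, *_ = np.linalg.lstsq(X[:, subset], y, rcond=None)
    r = y - X[:, subset] @ beta
    return float(r @ r)

def select_subset(X, y, lam, sigma2):
    """Exhaustively minimize RSS(subset) + lam * sigma2 * |subset|."""
    p = X.shape[1]
    best, best_score = (), np.inf
    for k in range(p + 1):
        for s in combinations(range(p), k):
            score = rss(X, y, s) + lam * sigma2 * k
            if score < best_score:
                best, best_score = s, score
    return best

def bootstrap_choose_lambda(X, y, lambdas, n_boot=50, seed=0):
    """Pick the lambda with the smallest bootstrap estimate of prediction error.

    Assumption: pairs bootstrap, with each refit scored on the original data.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    sigma2 = rss(X, y, range(p)) / (n - p)  # variance estimate from the full model
    err = np.zeros(len(lambdas))
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        for j, lam in enumerate(lambdas):
            s = list(select_subset(Xb, yb, lam, sigma2))
            if s:
                beta, *_ = np.linalg.lstsq(Xb[:, s], yb, rcond=None)
                pred = X[:, s] @ beta
            else:
                pred = np.zeros(n)
            err[j] += np.mean((y - pred) ** 2)
    err /= n_boot
    return lambdas[int(np.argmin(err))], err
```

A call such as `bootstrap_choose_lambda(X, y, [0.5, 2.0, 8.0])` returns the selected penalty together with the estimated error for each candidate. A small lambda tends to admit spurious predictors (more variance), while a large lambda tends to drop real ones (more bias), which is the tradeoff the abstract refers to.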