Statistica Sinica 25 (2015), 1265-1296

DEGREES OF FREEDOM AND MODEL SEARCH
Ryan J. Tibshirani
Carnegie Mellon University

Abstract: Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed by a given procedure. Despite this fundamental role in statistics, its behavior is not completely understood, even in fairly basic settings. For example, it may seem intuitively obvious that the best subset selection fit with subset size k has degrees of freedom larger than k, but this has not been formally verified, nor has it been precisely studied. Broadly, the current paper is motivated by this problem, and we derive an exact expression for the degrees of freedom of best subset selection in a restricted setting (orthogonal predictor variables). Along the way, we develop a concept that we name “search degrees of freedom”; intuitively, for adaptive regression procedures that perform variable selection, this is the part of the (total) degrees of freedom that we attribute entirely to the model selection mechanism. Finally, we establish a modest extension of Stein’s formula to cover discontinuous functions, and discuss its potential role in degrees of freedom and search degrees of freedom calculations.

Key words and phrases: Best subset selection, degrees of freedom, lasso, model search, Stein’s formula.
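
The abstract's intuition that a best subset selection fit of size k has more than k degrees of freedom can be probed numerically. The following is a minimal Monte Carlo sketch, not the paper's derivation. It assumes the usual covariance-based definition, df = (1/σ²) Σ_i Cov(ŷ_i, y_i), Gaussian noise, and a design matrix with orthonormal columns, in which case best subset selection of size k reduces to keeping the k largest least squares coefficients in absolute value. The function names, the all-zero mean vector, and the values of n, p, k are illustrative choices and do not come from the paper.

```python
import numpy as np

def best_subset_fit_orthogonal(X, y, k):
    """Best subset fit of size k, assuming X has orthonormal columns."""
    beta_ls = X.T @ y                        # least squares coefficients when X^T X = I
    keep = np.argsort(np.abs(beta_ls))[-k:]  # indices of the k largest |coefficients|
    beta = np.zeros_like(beta_ls)
    beta[keep] = beta_ls[keep]
    return X @ beta

def monte_carlo_df(X, mu, sigma, k, n_reps=20000, seed=0):
    """Estimate df = (1/sigma^2) * sum_i Cov(fit_i, y_i) by simulation."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_reps):
        eps = sigma * rng.standard_normal(mu.shape[0])
        y = mu + eps
        fit = best_subset_fit_orthogonal(X, y, k)
        total += fit @ eps                   # eps has mean zero, so E[fit @ eps] is the covariance sum
    return total / (n_reps * sigma**2)

rng = np.random.default_rng(1)
n, p, k = 50, 10, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # n x p matrix with orthonormal columns
mu = np.zeros(n)                                   # hypothetical null signal
df_est = monte_carlo_df(X, mu, sigma=1.0, k=k)
print(f"k = {k}, estimated df = {df_est:.2f}")
```

Under this null setup the estimate should land strictly between k and p. The excess over k is the kind of quantity the abstract attributes to the model search itself.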
