doi:http://dx.doi.org/10.5705/ss.2010.216
Abstract: The small-n-large-P situation has become common in genetics research, medical studies, risk management, and other fields. Feature selection is crucial in these studies yet poses a serious challenge. Traditional criteria such as AIC, BIC, and cross-validation choose too many features. In this paper, we examine the variable selection problem under generalized linear models. We study an approach in which the prior takes specific account of the small-n-large-P situation. The resulting criterion, the extended Bayes information criterion (EBIC), is shown to be variable selection consistent under generalized linear models. We also report simulation results and a data analysis to illustrate the effectiveness of EBIC for feature selection.
Key words and phrases: Consistency, exponential family, extended Bayes information criterion, feature selection, generalized linear model, small-n-large-P.