Statistica Sinica 32 (2022), 695-718
Yanjia Yu¹, Yi Yang² and Yuhong Yang¹
Abstract: Because model selection is ubiquitous in data analysis, the reproducibility of statistical results requires that we be able to evaluate the reliability of the employed model selection method, regardless of how good the selected model appears to be. Instability measures have been proposed for evaluating model selection uncertainty. However, low instability does not necessarily indicate that the selected model is trustworthy, because low instability can also arise when a method tends to select an overly parsimonious model. F- and G-measures have become increasingly popular for assessing variable selection performance in theoretical studies and simulations. However, they cannot be computed in practice, because they depend on the unknown true model. In this work, we propose a method for estimating the F- and G-measures and prove that the resulting estimators are uniformly consistent. This gives the data analyst a valuable tool for comparing different variable selection methods based on the data at hand. Extensive simulations show that our approach performs very well in finite samples. Lastly, we apply our methods to several microarray gene expression data sets, with intriguing results.
Key words and phrases: F-measure, G-measure, gene expression, model averaging, reproducibility, variable selection performance.
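
For orientation, the F- and G-measures named in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions in terms of precision and recall (F as their harmonic mean, G as their geometric mean) applied to a selected variable set and a true variable set. The function name `f_and_g_measures` and its arguments are hypothetical, introduced only for this illustration. The sketch computes the oracle quantities, which need the true set and are therefore available only in simulations; it is not the estimation method proposed in the paper.

```python
from math import sqrt


def f_and_g_measures(selected, true_set):
    """Oracle F- and G-measures of a selected variable set.

    Assumes the standard precision/recall definitions; `true_set` is
    unknown in real data, so this is usable only in simulations.
    """
    selected, true_set = set(selected), set(true_set)
    overlap = len(selected & true_set)

    # Precision: share of selected variables that are truly relevant.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: share of truly relevant variables that were selected.
    recall = overlap / len(true_set) if true_set else 0.0

    # F is the harmonic mean, G the geometric mean, of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    g_measure = sqrt(precision * recall)
    return f_measure, g_measure


if __name__ == "__main__":
    # Four variables selected, three truly relevant, two in common.
    print(f_and_g_measures({1, 2, 3, 4}, {1, 2, 5}))
```

Both measures equal 1 only when the selected set coincides with the true set, and both penalize overly parsimonious as well as overly large selections, which is the property the abstract contrasts with instability measures.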