Statistica Sinica 33 (2023), 259-279
J. Kenneth Tay, Nima Aghaeepour, Trevor Hastie and Robert Tibshirani
Abstract: In some supervised learning settings, the practitioner might have additional information on the features used for prediction. We propose a new method that leverages this additional information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net penalty. In our simulations, fwelnet outperforms the lasso in terms of the test mean squared error, and usually improves on the true positive rate or the false positive rate for feature selection. We also compare this method with the group lasso and Bayesian estimation. Lastly, we apply the proposed method to the early prediction of preeclampsia, where fwelnet outperforms the lasso in terms of the 10-fold cross-validated area under the curve (0.84 vs. 0.80), and suggest how fwelnet might be used for multi-task learning.
Key words and phrases: Feature information, model selection/variable selection, prediction.