Statistica Sinica 29 (2019), 329-351
Abstract: Nearest neighbor regression and kernel regression have been discussed for decades as tools for imputing missing data in survey sampling. In this study, regression imputation methods are examined for estimating the mean of an incomplete variable and for predicting unidentified objects in the data. Novel convex mixtures of these two regression imputation estimators are constructed to maintain stable performance when the underlying missing data conditions are non-regular. In a simulation study of two typical non-regularity conditions, the mixture imputation is shown to yield improved estimation relative to existing competitors. The performance of the convex mixture imputation estimators in predicting unidentified classes is also examined using two data sets from the UCI Machine Learning Repository.
Key words and phrases: Convex mixtures estimation, k-nearest neighbor imputation, kernel regression imputation, machine learning.
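
As a rough illustration of the idea summarized in the abstract, the following Python sketch imputes missing responses with a convex mixture of a k-nearest neighbor fit and a kernel regression fit, then averages the completed variable to estimate its mean. It is a minimal sketch under stated assumptions, not the estimators studied in the paper: the function names, the one-dimensional covariate, the Gaussian kernel, the fixed mixing weight `alpha`, and the default values of `k` and `h` are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_predict(x_obs, y_obs, x_new, k=5):
    """Average the responses of the k nearest observed covariates (1-D x)."""
    dist = np.abs(x_new[:, None] - x_obs[None, :])
    idx = np.argsort(dist, axis=1)[:, :k]
    return y_obs[idx].mean(axis=1)

def kernel_predict(x_obs, y_obs, x_new, h=0.3):
    """Nadaraya-Watson estimate with a Gaussian kernel and bandwidth h.

    Assumes every x_new lies close enough to the observed x values that
    the kernel weights do not all underflow to zero.
    """
    w = np.exp(-0.5 * ((x_new[:, None] - x_obs[None, :]) / h) ** 2)
    return (w @ y_obs) / w.sum(axis=1)

def mixture_mean(x, y, observed, alpha=0.5, k=5, h=0.3):
    """Impute missing y by a convex mixture of both fits, then average.

    alpha is an assumed, user-supplied mixing weight in [0, 1];
    alpha = 1 gives pure k-NN imputation, alpha = 0 pure kernel imputation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    x_obs, y_obs = x[observed], y[observed]
    x_mis = x[~observed]
    imputed = (alpha * knn_predict(x_obs, y_obs, x_mis, k)
               + (1.0 - alpha) * kernel_predict(x_obs, y_obs, x_mis, h))
    y_full = y.copy()
    y_full[~observed] = imputed  # overwrite the missing entries only
    return y_full.mean()
```

Because both the imputation step and the final average are linear in the imputed values, the mean estimate at any `alpha` is the same convex combination of the pure k-NN and pure kernel mean estimates. How the weight should be chosen is not specified in the abstract and is left open here.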