Statistica Sinica 28 (2018), 2771-2794
Abstract: We study the estimation of conditional mean regression functions through the so-called subset-based kernel principal component analysis (KPCA). Instead of using one global kernel feature space, we project a target function into different localized kernel feature spaces at different parts of the sample space. Each localized kernel feature space reflects the relationship between the response and the covariates on a subset more parsimoniously. When the observations are collected from a strictly stationary and weakly dependent process, the orthonormal eigenfunctions that span the kernel feature space are consistently estimated by an eigenanalysis of the subset-based kernel Gram matrix, and the estimated eigenfunctions are then used to construct an estimator of the mean regression function. Under some regularity conditions, the resulting estimator is shown to be uniformly consistent over the subset, with a convergence rate faster than those of some well-known nonparametric estimation methods. In addition, we discuss some generalizations of the KPCA approach and consider using the same subset-based KPCA approach to estimate the conditional distribution function. Numerical studies, including three simulated examples and two data sets, illustrate the reliable performance of the proposed method. In particular, the improvement over the global KPCA method is evident.
Key words and phrases: Conditional distribution function, eigenanalysis, kernel Gram matrix, KPCA, mean regression function, nonparametric regression.
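
The pipeline summarized in the abstract (restrict to a subset, run an eigenanalysis on the subset-based kernel Gram matrix, then build the regression estimate from the leading estimated eigenfunctions) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a Nystrom-type extension of the Gram-matrix eigenvectors to eigenfunctions, and coefficients obtained as empirical inner products. The names `gaussian_gram`, `subset_kpca_fit`, `in_subset`, `d`, and `bandwidth` are hypothetical, and the simulated data are for illustration only.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    # Pairwise Gaussian kernel values exp(-||a - b||^2 / (2 * bandwidth^2)).
    # The Gaussian kernel is an assumption; the abstract does not fix a kernel.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def subset_kpca_fit(X, y, in_subset, d=3, bandwidth=0.5):
    # Keep only the observations whose covariates fall in the chosen subset.
    Xg, yg = X[in_subset], y[in_subset]
    m = Xg.shape[0]

    # Eigenanalysis of the symmetric subset-based kernel Gram matrix.
    K = gaussian_gram(Xg, Xg, bandwidth)
    evals, evecs = np.linalg.eigh(K)
    order = np.argsort(evals)[::-1][:d]      # indices of the d largest eigenvalues
    mu, V = evals[order], evecs[:, order]

    def eigenfunctions(Xnew):
        # Nystrom-type extension (assumed here):
        # phi_k(x) = sqrt(m) / mu_k * sum_i v_ik * K(x, X_i).
        return np.sqrt(m) * gaussian_gram(Xnew, Xg, bandwidth) @ V / mu

    # Coefficients: empirical inner products of the response with each eigenfunction.
    beta = eigenfunctions(Xg).T @ yg / m

    def predict(Xnew):
        # Regression estimate as a linear combination of the d eigenfunctions.
        return eigenfunctions(Xnew) @ beta

    return predict, eigenfunctions, Xg

# Simulated illustration (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)
in_subset = X[:, 0] > 0.0                    # subset: positive half of the sample space
predict, eigenfunctions, Xg = subset_kpca_fit(X, y, in_subset, d=3, bandwidth=0.5)
fitted = predict(Xg)
```

With this scaling, the estimated eigenfunctions are orthonormal with respect to the empirical measure on the subset, which is why the coefficients reduce to simple sample averages. The number of components `d` and the bandwidth are tuning choices that this sketch leaves fixed.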