Statistica Sinica

Probal Chaudhuri, Min-Ching Huang, Wei-Yin Loh*** and Ruji Yao***

Abstract: A nonparametric function estimation method called SUPPORT ("Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees") is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data space. Partitioning is carried out recursively as in a tree-structured method. If the estimate is required to be smooth, the polynomial pieces may be glued together by means of weighted averaging. The smoothed estimate is thus obtained in three steps. In the first step, the regressor space is recursively partitioned until the data in each piece are adequately fitted by a polynomial of a fixed order. Partitioning is guided by analysis of the distributions of residuals and cross-validation estimates of prediction mean square error. In the second step, the data within a neighborhood of each partition are fitted by a polynomial. In the third step, the final estimate of the regression function is obtained by averaging the polynomial pieces, using smooth weight functions each of which diminishes rapidly to zero outside its associated partition. Estimates of derivatives of the regression function may be obtained by similar averaging of the derivatives of the polynomial pieces. The advantages of the proposed estimate are that it possesses a smooth analytic form, is as many times differentiable as the family of weight functions is, and has a decision tree representation. The asymptotic properties of the smoothed and unsmoothed estimates are explored under appropriate regularity conditions. Examples comparing the accuracy of SUPPORT to other methods are given.

Key words and phrases: Consistency, cross-validation, nonparametric regression, recursive partitioning, smooth partition of unity, tree-structured regression.
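
The second and third steps described in the abstract (fitting a polynomial on a neighborhood of each partition cell, then averaging the pieces with smooth weights) can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation, and it rests on several assumptions: the partition is supplied by hand through the hypothetical argument `cuts` instead of being grown recursively, each piece is a straight line rather than a polynomial of general order, the weight is a standard infinitely differentiable bump function chosen purely for illustration, and the names `fit_line`, `bump`, `smoothed_fit` and `margin` are invented here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b).

    Assumes xs is non-empty (every neighborhood must contain data).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def bump(u):
    """Infinitely differentiable weight: positive on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0


def smoothed_fit(xs, ys, cuts, margin=0.5):
    """Glue per-cell linear fits together by smooth weighted averaging.

    cuts   : sorted cell boundaries [c0, c1, ..., ck] (assumed given).
    margin : fraction of the cell width by which each cell is extended to
             form its neighborhood (illustrative choice; must be > 0).
    """
    pieces = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        pad = margin * (hi - lo)
        # Fit on a neighborhood of the cell, not only the cell itself.
        pts = [(x, y) for x, y in zip(xs, ys) if lo - pad <= x <= hi + pad]
        a, b = fit_line([p[0] for p in pts], [p[1] for p in pts])
        centre = (lo + hi) / 2.0
        half_width = (hi - lo) / 2.0 + pad
        pieces.append((a, b, centre, half_width))

    def estimate(x):
        # Each weight vanishes outside its cell's neighborhood; dividing by
        # the total normalises the weights so that they sum to one.
        # Only valid for x between cuts[0] and cuts[-1], where total > 0.
        weights = [bump((x - c) / h) for _, _, c, h in pieces]
        total = sum(weights)
        fitted = [a + b * x for a, b, _, _ in pieces]
        return sum(w * f for w, f in zip(weights, fitted)) / total

    return estimate
```

Because the normalised weights sum to one, the sketch reproduces a regression function exactly whenever every local fit does, for instance with noiseless linear data. Derivative estimates could be formed analogously by averaging the slopes `b` with the same weights, which mirrors the averaging of derivatives of the polynomial pieces mentioned in the abstract.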