Abstract: We study the problem of adaptive estimation of a conditional probability function in a pattern recognition setting. In many applications, for more flexibility, one may want to consider various estimation procedures targeted at different scenarios and/or based on different assumptions. For example, when the feature dimension is high, to overcome the familiar curse of dimensionality one may seek a good parsimonious model among a number of candidates such as CART, neural nets and additive models. In such a situation, one wishes to have an automated final procedure that performs as well as the best candidate.
In this work, we propose a method to combine a countable collection of procedures for estimating the conditional probability. We show that the combined procedure has the property that its statistical risk is bounded above by that of any of the procedures being considered plus a small penalty. Thus, asymptotically, the strengths of the different estimation procedures are shared by the combined procedure. A simulation study shows the potential advantage of combining models compared with model selection.
Key words and phrases: Adaptive estimation, conditional probability, logistic regression, minimax-rate adaptation, nonparametric classification.