

Statistica Sinica 16(2006), 471-494





BOOSTING FOR HIGH-MULTIVARIATE RESPONSES IN HIGH-DIMENSIONAL LINEAR REGRESSION


Roman Werner Lutz and Peter Bühlmann


ETH Zürich, Switzerland


Abstract: We propose a boosting method, multivariate $L_2$Boosting, for multivariate linear regression based on a squared error loss for multivariate data. It can be applied to multivariate linear regression with continuous responses and to vector autoregressive time series. We prove, for i.i.d. as well as time series data, that multivariate $L_2$Boosting can consistently recover sparse high-dimensional multivariate linear functions, even when the number of predictor variables $p_n$ and the dimension of the response $q_n$ grow almost exponentially with sample size $n$, $p_n = q_n = O(\exp(C n^{1-\xi}))$ ($0 < \xi < 1$, $0 < C < \infty$), provided that the $\ell_1$-norm of the true underlying function is finite. Our theory seems to be among the first to address the issue of large dimension of the response variable; the relevance of such settings is briefly outlined. We also identify empirically some cases where our multivariate $L_2$Boosting performs better than repeated application of univariate methods to single response components, thus demonstrating that the multivariate approach can be very useful.
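
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a componentwise $L_2$Boosting step for a multivariate response might look like. It assumes an unweighted (identity) squared error loss, centered predictor columns with nonzero norm, and a fixed step size; the function name `multivariate_l2boost` and the parameters `n_steps` and `nu` are hypothetical and not taken from the paper.

```python
import numpy as np


def multivariate_l2boost(X, Y, n_steps=200, nu=0.1):
    """Sketch of componentwise L2Boosting for an n x q response matrix Y.

    Assumptions (not from the source): identity weighting of the response
    components, centered columns of X with nonzero norm, fixed step size nu.
    Returns a p x q coefficient matrix B.
    """
    n, p = X.shape
    q = Y.shape[1]
    B = np.zeros((p, q))
    R = Y.astype(float).copy()        # current residual matrix, n x q
    col_ss = np.sum(X ** 2, axis=0)   # squared norm of each predictor column

    for _ in range(n_steps):
        G = X.T @ R                   # inner products x_j' r_k, shape p x q
        # Reduction in residual sum of squares from a full least-squares
        # fit of residual component k on the single predictor j.
        gain = G ** 2 / col_ss[:, None]
        j, k = np.unravel_index(np.argmax(gain), gain.shape)
        # Shrunken least-squares update of the single selected coefficient.
        step = nu * G[j, k] / col_ss[j]
        B[j, k] += step
        R[:, k] -= step * X[:, j]
    return B
```

Each iteration selects one (predictor, response component) pair and updates a single entry of the coefficient matrix, so with few iterations the fit stays sparse. In this sketch the number of iterations acts as the regularization parameter and would need to be chosen by some stopping rule.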



Key words and phrases: High-multivariate high-dimensional linear regression, $L_2$Boosting, vector AR time series.
