Statistica Sinica 31 (2021), 1749-1777
Di He, Yong Zhou and Hui Zou
Abstract: Multivariate responses are commonly encountered in applications with high-dimensional input variables. Feature screening has been shown to be a very useful data analysis tool for high-dimensional data. Since the introduction of the sure independence screening approach, many variable screening methods have been proposed and studied in the literature. However, the majority of these methods focus on the classical univariate-response case, and do not apply naturally to data sets with multiple responses. We systematically study variable screening methods for multi-response data. First, we consider extensions of several popular screening methods to deal with multiple responses. Each of these extensions has its own clear drawbacks. We then propose a new model-free screening method, multi-response rank canonical correlation screening (mRCC), which not only takes into account the dependence structure among the multivariate responses, but also preserves nice properties of the rank correlation, such as robustness and invariance under monotonic transformation. The sure screening property of mRCC is established under weak regularity conditions. Extensive numerical experiments demonstrate the superior performance of mRCC over other available alternatives.
Key words and phrases: Canonical correlation, multi-response data, rank correlation, sure screening property.
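To make the idea of rank-based canonical correlation screening concrete, the following is a minimal sketch and not the authors' mRCC statistic, whose exact definition is not given in the abstract. It assumes Spearman-type ranks without ties and scores each predictor by the canonical correlation between its ranks and the column-wise ranks of the responses; with a single predictor this reduces to a multiple correlation coefficient. The names `rank_columns`, `rank_canonical_corr` and `screen`, and the retained model size `d`, are hypothetical and introduced only for illustration.

```python
import numpy as np


def rank_columns(a):
    """Replace each column by its ranks 1..n (assumes no ties; illustrative helper)."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    # Double argsort gives 0-based ranks per column when there are no ties.
    return np.argsort(np.argsort(a, axis=0), axis=0) + 1.0


def rank_canonical_corr(x, Y):
    """Canonical correlation between rank(x) and the column-wise ranks of Y.

    Assumption: for a scalar predictor, the canonical correlation equals the
    multiple correlation of rank(x) regressed on rank(Y), i.e. sqrt(R^2).
    """
    rx = rank_columns(x)[:, 0]
    rY = rank_columns(Y)
    rx = rx - rx.mean()
    rY = rY - rY.mean(axis=0)
    # Least-squares projection of the centered predictor ranks onto the response ranks.
    coef, *_ = np.linalg.lstsq(rY, rx, rcond=None)
    fitted = rY @ coef
    r2 = (fitted @ fitted) / (rx @ rx)
    return float(np.sqrt(max(r2, 0.0)))


def screen(X, Y, d):
    """Score every column of X and return the indices of the d largest scores."""
    X = np.asarray(X, dtype=float)
    scores = np.array([rank_canonical_corr(X[:, j], Y) for j in range(X.shape[1])])
    keep = np.argsort(-scores)[:d]
    return keep, scores
```

Because the score depends on the data only through ranks, applying a strictly increasing transformation to any predictor or response leaves it unchanged, which mirrors the invariance property mentioned in the abstract. Using all responses jointly, rather than one marginal correlation per response, is what lets the score reflect the dependence structure among the responses.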