Statistica Sinica 33 (2023), 2787-2808
Liuhua Peng, Guanghui Wang and Changliang Zou
Abstract: When working with large parallel data sets, it is necessary to check whether they are collected from different regression models before conducting further modeling, estimation, and inference. We propose a novel metric for such heterogeneity based on a projection strategy. We then use this metric to construct a new, fully data-driven test for the equivalence of a large number of unknown regression models. We also establish the asymptotic normality of the proposed test statistic, and apply the test to identify outlying data sets whose regression models deviate from the majority. Extensive numerical studies demonstrate that our methods perform satisfactorily.
Key words and phrases: Heterogeneity, outlier detection, parallel data sets, projections, U-statistics