Statistica Sinica 33 (2023), 2209-2231

DISTRIBUTED EMPIRICAL LIKELIHOOD APPROACH TO
INTEGRATING UNBALANCED DATASETS

Ling Zhou¹, Xichen She² and Peter X.-K. Song²

¹Southwestern University of Finance and Economics and ²University of Michigan

Abstract: This paper proposes a distributed empirical likelihood (DEL) method for performing an integrative analysis of multiple data sources with the flexibility of handling either homogeneous or heterogeneous data. The proposed DEL method does not require pooling individual data sets into a centralized operational platform, so the privacy of subject-level information in individual data sources is protected. The DEL method is shown to be almost surely equal to the centralized empirical likelihood approach that would be adopted if individual data sets were combined and stored in one place. We establish the large-sample properties and algorithm convergence of the DEL method. We also illustrate the numerical performance of the DEL method using simulation studies and a real-data example, in which the DEL method shows a clear advantage over the classical meta-estimation method when analyzing unbalanced data sets.

Key words and phrases: ADMM, data integration, data privacy, divide-and-conquer, meta estimation.
