Statistica Sinica 27 (2017), 1857-1878
Abstract: Some biomedical studies lead to mixture data. When subgroup membership is missing for some of the subjects in a study, the distribution of the outcome is a mixture of the subgroup-specific distributions. Taking into account the uncertain distribution of the group membership and the covariates, we model the relation between the disease onset time and the covariates through transformation models in each sub-population. We develop a nonparametric maximum likelihood-based estimation procedure, implemented through the EM algorithm, along with its inference procedure. We also propose methods to identify the covariates that have different or common effects across distinct populations, which enables parsimonious modeling and a better understanding of the differences across populations. The methods are illustrated through extensive simulation studies and a data example.
Key words and phrases: Censored data, EM algorithm, Laplace transformation, mixed populations, semiparametric models, transformation models, uncertain population identifier.