Abstract

With the rapidly increasing availability of aggregate data in the public domain, there has been a growing interest in synthesizing information from individual-level data and aggregate data. This article studies the maximum full likelihood estimation method to integrate the auxiliary information in the estimation of the accelerated failure time model. To overcome the computational challenges in maximizing full likelihood, we propose a novel one-step estimator, where the maximum conditional likelihood estimator without combining any auxiliary information is chosen as an initial estimator. We establish the consistency and asymptotic normality of the proposed one-step estimator and show that it is more efficient than the initial estimator. The asymptotic variance of the proposed one-step estimator has a closed form and is easily estimated by the plug-in rule. Simulation studies show that the proposed one-step estimator yields an efficiency gain over existing approaches. The proposed methodology is illustrated with an analysis of a chemotherapy study for Stage III colon cancer.
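As a generic illustration of the one-step idea mentioned in the abstract, the sketch below starts from a simple consistent estimate and takes a single Newton-type (Fisher scoring) step toward the likelihood maximizer. It is not the estimator proposed in the article: it assumes a toy Cauchy location model with unit scale and uses the sample median as the initial estimate, and the function names `cauchy_score`, `one_step`, and `simulate` are hypothetical.

```python
import math
import random
import statistics

# Assumption: toy Cauchy location model with unit scale, NOT the accelerated
# failure time model or the full likelihood studied in the article.
CAUCHY_FISHER_INFO = 0.5  # Fisher information per observation for this toy model


def cauchy_score(theta, xs):
    """Average derivative of the Cauchy log-likelihood with respect to theta."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs) / len(xs)


def one_step(theta_init, xs):
    """Take one Fisher scoring step from an initial consistent estimate."""
    return theta_init + cauchy_score(theta_init, xs) / CAUCHY_FISHER_INFO


def simulate(n, loc, rng):
    """Draw n Cauchy(loc, 1) values by inverting the distribution function."""
    return [loc + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


rng = random.Random(0)
xs = simulate(2000, 1.0, rng)
theta_init = statistics.median(xs)   # simple initial estimator
theta_one = one_step(theta_init, xs)  # single update, no iteration to convergence
print(theta_init, theta_one)
```

In this toy setting the single update avoids iterating the likelihood equations to convergence, which mirrors the computational motivation stated in the abstract; the article's own estimator, initial value, and information matrix differ and are defined in the paper.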

Information

Preprint No.: SS-2024-0105
Manuscript ID: SS-2024-0105
Complete Authors: Huijuan Ma, Manli Cheng, Yukun Liu, Donglin Zeng, Yong Zhou
Corresponding Author: Yukun Liu
Email: ykliu@sfs.ecnu.edu.cn

Huijuan Ma, Manli Cheng, Yukun Liu: KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai 200062, China

Acknowledgments

This work is partly supported by the National Key R&D Program of China (2021YFA1000100 and 2021YFA1000101), the National Natural Science Foundation of China (12471253, 72331005 and 12171157), and the Natural Science Foundation of Shanghai (23JS1400500). The authors would like to thank the editor, the associate editor, and two reviewers for their insightful comments, which have helped improve the manuscript substantially.

Supplementary Materials

The online Supplementary Material includes notations, the asymptotic results of $\hat{\beta}_{au}$, proofs of Theorems 1 and 2, the relationship between $(\hat{\beta}_{au}, \hat{\rho}_{au})$ and $(\tilde{\beta}, \tilde{\rho})$, additional simulation studies, as well as the heterogeneous case with heterogeneity in covariate distributions and uncertainty in external information.

