Abstract
We consider a generalized partially linear model with missing outcomes
in longitudinal studies. Our proposed method, the longitudinal augmented inverse probability weighted kernel-profile estimating equations, employs kernel
estimating equations for the nonparametric part and profile estimating equations
for the parametric part. Auxiliary variables are used to model both the missingness and the conditional mean. The resulting estimators for both the parametric
and nonparametric parts are doubly robust. To further understand these estimators, we derive the semiparametric efficiency bound and the asymptotic properties
of the proposed estimators. We find that the estimator for the parametric part
attains the semiparametric efficiency bound under the multivariate normal assumption. We demonstrate the empirical performance of the proposed method
through simulation studies and an application to CD4 count data.
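Double robustness can be illustrated in a much simpler setting than the one treated in the paper. The sketch below is not the proposed longitudinal kernel-profile estimator. It is a cross-sectional augmented inverse probability weighted (AIPW) estimate of a mean with outcomes missing at random. The data-generating model, the function names (`aipw_mean`, `simulate`), and the deliberately misspecified working models are all assumptions made for illustration. The estimate should stay close to the true mean when either the missingness model or the conditional mean model is correct, and drift away only when both are wrong.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def aipw_mean(y, r, pi_hat, m_hat):
    """AIPW estimate of E[Y] when Y is observed only where r == 1.

    pi_hat: working model for P(R = 1 | X)
    m_hat:  working model for E[Y | X]
    """
    # Missing outcomes (possibly NaN) are zeroed so they never enter the sum.
    y_filled = np.where(r == 1, y, 0.0)
    ipw_term = r * y_filled / pi_hat
    augmentation = (r - pi_hat) / pi_hat * m_hat
    return float(np.mean(ipw_term - augmentation))


def simulate(n=200_000, seed=0):
    """Toy data (illustrative assumption): Y = 1 + 2X + noise, so E[Y] = 1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    pi_true = expit(0.5 + x)              # missingness depends on X only (MAR)
    r = rng.binomial(1, pi_true)
    y_obs = np.where(r == 1, y, np.nan)   # outcome unobserved when r == 0

    m_true = 1.0 + 2.0 * x                # correct conditional mean model
    pi_wrong = np.full(n, 0.5)            # misspecified: ignores X
    m_wrong = np.zeros(n)                 # misspecified: ignores X

    return {
        "both_correct": aipw_mean(y_obs, r, pi_true, m_true),
        "pi_correct_m_wrong": aipw_mean(y_obs, r, pi_true, m_wrong),
        "pi_wrong_m_correct": aipw_mean(y_obs, r, pi_wrong, m_true),
        "both_wrong": aipw_mean(y_obs, r, pi_wrong, m_wrong),
    }


if __name__ == "__main__":
    for name, est in simulate().items():
        print(f"{name:>20s}: {est:.3f}")
```

In this toy setup the working models are plugged in directly rather than estimated, which keeps the example short. In practice both would be fitted from the data, and in the longitudinal setting of the paper they may also depend on auxiliary variables.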
Information
| Preprint No. | SS-2024-0380 |
|---|---|
| Manuscript ID | SS-2024-0380 |
| Complete Authors | Zhongzhe Ouyang, Chang Wang, Lu Wang |
| Corresponding Authors | Lu Wang |
| Emails | luwang@umich.edu |