Efficient Estimation of Average Treatment Effects with Unmeasured Confounding and Proxies

Chunrong Ai and Jiawei Shan

doi:10.5705/ss.202025.0104

Abstract

Proximal causal inference provides a framework for estimating the average treatment effect (ATE) in the presence of unmeasured confounding by leveraging outcome and treatment proxies. Identification in this framework relies on the existence of a so-called bridge function. Standard approaches typically postulate a parametric specification for the bridge function, which is estimated in a first step and then plugged into an ATE estimator. However, this sequential procedure suffers from two potential sources of efficiency loss: (i) the difficulty of efficiently estimating a bridge function defined by an integral equation, and (ii) the failure to account for the correlation between the estimation steps. To overcome these limitations, we propose a novel approach that approximates the integral equation with increasing moment restrictions and jointly estimates the bridge function and the ATE. We show that, under suitable conditions, our estimator is efficient. Additionally, we provide a data-driven procedure for selecting the tuning parameter (i.e., the number of moment restrictions). Simulation studies reveal that the proposed method performs well in finite samples, and an application to the right heart catheterization dataset from the SUPPORT study demonstrates its practical value.
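
As a rough illustration of the joint estimation idea described in the abstract, the following sketch stacks the bridge-function moment restrictions and the ATE moment into one system and solves it by two-step GMM. It is not the authors' implementation: the simulated data, the linear-in-parameters bridge function, the polynomial instrument functions, the fixed number of moments, and the names `simulate` and `joint_gmm` are all assumptions made for this example.

```python
import numpy as np


def simulate(n, tau=2.0, delta=0.5, seed=0):
    """Toy data (an assumption of this sketch): U is an unmeasured confounder,
    Z a treatment proxy, W an outcome proxy. The true ATE equals tau because E[U] = 0."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=n)
    A = rng.binomial(1, 1.0 / (1.0 + np.exp(-U)))   # treatment depends on U
    Z = U + rng.normal(size=n)                      # treatment proxy
    W = U + rng.normal(size=n)                      # outcome proxy
    Y = tau * A + U + delta * A * U + rng.normal(size=n)
    return Y, A, Z, W


def joint_gmm(Y, A, Z, W):
    """Two-step GMM on stacked moments for theta = (gamma, tau).

    Assumed bridge function: h(W, A; gamma) = g0 + g1*A + g2*W + g3*A*W.
    Moments: E[q(Z, A) * (Y - h)] = 0 and E[tau - (h(W, 1) - h(W, 0))] = 0.
    """
    n = len(Y)
    H = np.column_stack([np.ones(n), A, W, A * W])                       # h = H @ gamma
    Q = np.column_stack([np.ones(n), A, Z, A * Z, Z**2, A * Z**2])       # K = 6 instruments
    D = np.column_stack([np.zeros(n), np.ones(n), np.zeros(n), W])       # h(W,1)-h(W,0) = D @ gamma
    K, p = Q.shape[1], H.shape[1]

    # The sample moments are linear in theta: g_bar(theta) = b_bar - G_bar @ theta.
    b_bar = np.concatenate([Q.T @ Y / n, [0.0]])
    G_bar = np.zeros((K + 1, p + 1))
    G_bar[:K, :p] = Q.T @ H / n
    G_bar[K, :p] = D.mean(axis=0)
    G_bar[K, p] = -1.0

    def solve(weight):
        # Closed-form minimizer of g_bar' * weight * g_bar for linear moments.
        return np.linalg.solve(G_bar.T @ weight @ G_bar, G_bar.T @ weight @ b_bar)

    # Step 1: identity weighting gives a preliminary estimate.
    theta = solve(np.eye(K + 1))

    # Step 2: reweight by the inverse covariance of the stacked moments, which
    # also captures the correlation between bridge moments and the ATE moment.
    resid = Y - H @ theta[:p]
    g = np.column_stack([Q * resid[:, None], theta[p] - D @ theta[:p]])
    omega = g.T @ g / n
    theta = solve(np.linalg.inv(omega))
    return theta[:p], theta[p]


Y, A, Z, W = simulate(20000)
gamma_hat, tau_hat = joint_gmm(Y, A, Z, W)
print("bridge coefficients:", gamma_hat)
print("ATE estimate:", tau_hat)
```

Here the number of instrument functions is fixed at six; the paper instead lets the number of moment restrictions grow and selects it by a data-driven procedure, which this sketch does not attempt.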

Key words and phrases: Data-driven method; generalized method of moments; proximal causal inference; semiparametric efficiency

Information

Preprint No.	SS-2025-0104
Manuscript ID	SS-2025-0104
Complete Authors	Chunrong Ai, Jiawei Shan
Corresponding Authors	Jiawei Shan
Emails	jiawei.shan@wisc.edu

References

Abadie, A. (2003, April). Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics 113(2), 231–263.
Ai, C. and X. Chen (2003, November). Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica 71(6), 1795–1843.
Ai, C. and X. Chen (2012, October). The semiparametric efficiency bound for models of sequential moment restrictions containing unknown functions. Journal of Econometrics 170(2), 442–457.
Andrews, D. W. K. (2017, August). Examples of L2-complete and boundedly-complete distributions. Journal of Econometrics 199(2), 213–220.
Bang, H. and J. M. Robins (2005, December). Doubly robust estimation in missing data and causal inference models. Biometrics 61(4), 962–973.
Bhattacharya, R., R. Nabi, and I. Shpitser (2022, January). Semiparametric inference for causal effects in graphical models with hidden variables. Journal of Machine Learning Research 23(1), 13325–13400.
Brown, B. W. and W. K. Newey (1998). Efficient semiparametric estimation of expectations. Econometrica 66(2), 453–464.
Chen, X. (2007, January). Large sample sieve estimation of semi-nonparametric models. In J. J. Heckman and E. E. Leamer (Eds.), Handbook of Econometrics, Volume 6, pp. 5549–5632. Elsevier.
Chen, X., V. Chernozhukov, S. Lee, and W. K. Newey (2014). Local identification of nonparametric and semiparametric models. Econometrica 82(2), 785–809.
Connors, Jr, A. F., T. Speroff, N. V. Dawson, C. Thomas, F. E. Harrell, Jr, D. Wagner, N. Desbiens, L. Goldman, A. W. Wu, R. M. Califf, W. J. Fulkerson, Jr, H. Vidaillet, S. Broste, P. Bellamy, J. Lynn, and W. A. Knaus (1996, September). The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA 276(11), 889–897.
Cui, Y., H. Pu, X. Shi, W. Miao, and E. Tchetgen Tchetgen (2024). Semiparametric proximal causal inference. Journal of the American Statistical Association 119(546), 1348–1359.
D’Haultfoeuille, X. (2011, June). On the completeness condition in nonparametric instrumental problems. Econometric Theory 27(3), 460–471.
Donald, S. G., G. W. Imbens, and W. K. Newey (2009, September). Choosing instrumental variables in conditional moment restriction models. Journal of Econometrics 152(1), 28–36.
Dukes, O., I. Shpitser, and E. J. Tchetgen Tchetgen (2023). Proximal mediation analysis. Biometrika 110(4), 973–987.
Egami, N. and E. J. Tchetgen Tchetgen (2024, April). Identification and estimation of causal peer effects using double negative controls for unmeasured network confounding. Journal of the Royal Statistical Society Series B: Statistical Methodology 86(2), 487–511.
Ghassami, A., A. Yang, I. Shpitser, and E. Tchetgen Tchetgen (2024, July). Causal inference with hidden mediators. Biometrika, asae037.
Greenland, S. and J. M. Robins (1986). Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology 15(3), 413–419.
Guo, A., D. Benkeser, and R. Nabi (2023, December). Targeted machine learning for average causal effect estimation using the front-door functional. arXiv preprint arXiv:2312.10234.
Guo, A. and R. Nabi (2024, September). Average causal effect estimation in DAGs with hidden variables: Extensions of back-door and front-door criteria. arXiv preprint arXiv:2409.03962.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50(4), 1029–1054.
Hirano, K. and G. W. Imbens (2001, December). Estimation of causal effects using propensity score weighting: An application to data on right heart catheterization. Health Services and Outcomes Research Methodology 2(3), 259–278.
Hu, Y. and J.-L. Shiu (2018, June). Nonparametric identification using instrumental variables: Sufficient conditions for completeness. Econometric Theory 34(3), 659–693.
Kallus, N., X. Mao, and M. Uehara (2022, October). Causal inference under unmeasured confounding with negative controls: A minimax learning approach. arXiv preprint arXiv:2103.14029.
Kline, B. and E. Tamer (2023, September). Recent developments in partial identification. Annual Review of Economics 15, 125–150.
Kompa, B., D. R. Bellamy, T. Kolokotrones, J. M. Robins, and A. L. Beam (2022). Deep learning methods for proximal inference via maximum moment restriction. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ’22, Red Hook, NY, USA, pp. 11189–11201. Curran Associates Inc.
Kress, R. (1989). Linear Integral Equations, Volume 82 of Applied Mathematical Sciences. New York: Springer New York.
Mastouri, A., Y. Zhu, L. Gultchin, A. Korba, R. Silva, M. Kusner, A. Gretton, and K. Muandet (2021, July). Proximal causal learning with kernels: Two-stage estimation and moment restriction. In Proceedings of the 38th International Conference on Machine Learning, pp. 7512–7523. PMLR.
Miao, W., Z. Geng, and E. Tchetgen Tchetgen (2018, December). Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika 105(4), 987–993.
Miao, W., X. Shi, Y. Li, and E. J. Tchetgen Tchetgen (2024, October). A confounding bridge approach for double negative control inference on causal effects. Statistical Theory and Related Fields 8(4), 262–273.
Newey, W. K. (1997, July). Convergence rates and asymptotic normality for series estimators. Journal of Econometrics 79(1), 147–168.
Newey, W. K. and D. McFadden (1994, January). Large sample estimation and hypothesis testing. In Handbook of Econometrics, Volume 4, pp. 2111–2245. Elsevier.
Newey, W. K. and J. L. Powell (2003, September). Instrumental variable estimation of nonparametric models. Econometrica 71(5), 1565–1578.
Newey, W. K. and R. J. Smith (2004). Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 72(1), 219–255.
Pearl, J. (1995). On the testability of causal models with latent and instrumental variables. Uncertainty in Artificial Intelligence. Proceedings of the Eleventh Conference (1995), 435–43.
Qi, Z., R. Miao, and X. Zhang (2024, April). Proximal learning for individualized treatment regimes under unmeasured confounding. Journal of the American Statistical Association 119(546), 915–928.
Qiu, H., X. Shi, W. Miao, E. Dobriban, and E. Tchetgen Tchetgen (2024, June). Doubly robust proximal synthetic controls. Biometrics 80(2), ujae055.
Richardson, T. S., R. J. Evans, J. M. Robins, and I. Shpitser (2023, February). Nested Markov properties for acyclic directed mixed graphs. The Annals of Statistics 51(1), 334–361.
Scharfstein, D. O., A. Rotnitzky, and J. M. Robins (1999, December). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94(448), 1096–1120.
Shi, X., W. Miao, J. C. Nelson, and E. J. Tchetgen Tchetgen (2020, April). Multiply robust causal inference with double-negative control adjustment for categorical unmeasured confounding. Journal of the Royal Statistical Society Series B: Statistical Methodology 82(2), 521–540.
Tan, Z. (2006). Regression and weighting methods for causal inference using instrumental variables. Journal of the American Statistical Association 101(476), 1607–1618.
Tchetgen Tchetgen, E. J., A. Ying, Y. Cui, X. Shi, and W. Miao (2024). An introduction to proximal causal inference. Statistical Science 39(3), 375–390.
Vermeulen, K. and S. Vansteelandt (2015, September). Bias-reduced doubly robust estimation. Journal of the American Statistical Association 110(511), 1024–1036.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica 50(1), 1–25.
Ying, A. (2024). Proximal survival analysis to handle dependent right censoring. Journal of the Royal Statistical Society Series B: Statistical Methodology 86(5), 1414–1434.
Ying, A., W. Miao, X. Shi, and E. J. Tchetgen Tchetgen (2023, July). Proximal causal inference for complex longitudinal studies. Journal of the Royal Statistical Society Series B: Statistical Methodology 85(3), 684–704.

Acknowledgments

We sincerely thank the editor, associate editor, and two reviewers for their valuable comments, which led to a significant improvement in our paper. Chunrong Ai gratefully acknowledges funding support from the National Natural Science Foundation of China (Project No. 72133005).

Supplementary Materials

The online Supplementary Material contains additional simulation studies, lemmas, and all technical proofs.

Supplementary materials are available for download.
