Identifiability and Estimation of Causal Effects with Non-Gaussianity and Auxiliary Covariates

Shuai, Kang; Luo, Shanshan; Zhang, Yue; Xie, Feng; He, Yangbo

doi:10.5705/ss.202023.0315

Abstract

Assessing causal effects in the presence of unmeasured confounding is challenging. Although auxiliary variables, such as instrumental variables, are commonly used to identify causal effects, they are often unavailable in practice because they must satisfy stringent and untestable conditions. To address this issue, previous studies have used linear structural equation models to show that the causal effect is identifiable when the noise variables of the treatment and the outcome are both non-Gaussian. In this paper, we investigate the problem of identifying the causal effect using an auxiliary covariate and the non-Gaussianity of the treatment. Our key idea is to characterize the impact of the unmeasured confounders using an observed covariate, assuming the confounders are all Gaussian. We demonstrate that the causal effect can be identified using a measured covariate, and then extend the identification results to the multi-treatment setting. We further develop a simple procedure for estimating causal effects and derive a √n-consistent estimator. Finally, we evaluate the performance of our estimator through simulation studies and apply our method to investigate the effect of trade on income.

Key words and phrases: Auxiliary variable, Causal effects, Identification, Multiple treatments, Non-Gaussianity

Information

Preprint No.	SS-2023-0315
Manuscript ID	SS-2023-0315
Complete Authors	Kang Shuai, Shanshan Luo, Yue Zhang, Feng Xie, Yangbo He
Corresponding Authors	Yangbo He
Emails	heyb@math.pku.edu.cn

References

Anderson, T. W. and D. A. Darling (1952). Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. The Annals of Mathematical Statistics 23(2), 193–212.
Angrist, J. D., G. W. Imbens, and D. B. Rubin (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91(434), 444–455.
Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993). Efficient and Adaptive Estimation for Semiparametric Models. Berlin: Springer.
Cattaneo, M. D., R. K. Crump, and M. Jansson (2012). Optimal inference for instrumental variables regression with non-Gaussian errors. Journal of Econometrics 167(1), 1–15.
Chen, Z. and L. Chan (2013). Causality in linear non-Gaussian acyclic models in the presence of latent Gaussian confounders. Neural Computation 25(6), 1605–1641.
Chernozhukov, V., W. K. Newey, and R. Singh (2022). Automatic debiased machine learning of causal and structural effects. Econometrica 90(3), 967–1027.
D’Amour, A. (2019). Comment: Reflections on the deconfounder. Journal of the American Statistical Association 114(528), 1597–1601.
Entner, D. and P. O. Hoyer (2011). Discovering unconfounded causal relationships using linear non-Gaussian models. In JSAI International Symposium on Artificial Intelligence, pp. 181–195. Springer.
Fan, Q. and Y. Wu (2024). Endogenous treatment effect estimation with a large and mixed set of instruments and control variables. The Review of Economics and Statistics 106(6), 1655–1674.
Frankel, J. A. and D. Romer (1999). Does trade cause growth? American Economic Review 89(3), 379–399.
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis 38(4), 367–378.
Guo, Z., H. Kang, T. Tony Cai, and D. S. Small (2018). Confidence intervals for causal effects with invalid instruments by using two-stage hard thresholding with voting. Journal of the Royal Statistical Society Series B: Statistical Methodology 80(4), 793–815.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica: Journal of the Econometric Society 50(4), 1029–1054.
Hoyer, P. O., A. Hyvärinen, R. Scheines, P. L. Spirtes, J. Ramsey, G. Lacerda, and S. Shimizu (2008). Causal discovery of linear acyclic models with arbitrary distributions. In Proc. 24th Conf. on Uncertainty in Artificial Intelligence (UAI2008), Helsinki, Finland, pp. 282–289.
Hoyer, P. O., S. Shimizu, A. J. Kerminen, and M. Palviainen (2008). Estimation of causal effects using linear non-Gaussian causal models with hidden variables. International Journal of Approximate Reasoning 49(2), 362–378.
Imai, K. and Z. Jiang (2019). Comment: The challenges of multiple causes. Journal of the American Statistical Association 114(528), 1605–1610.
Kang, H., A. Zhang, T. T. Cai, and D. S. Small (2016). Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization. Journal of the American Statistical Association 111(513), 132–144.
Kasy, M. (2014). Instrumental variables with unrestricted heterogeneity and continuous treatment. The Review of Economic Studies 81(4), 1614–1636.
Kong, D., S. Yang, and L. Wang (2022). Identifiability of causal effects with multiple causes and a binary outcome. Biometrika 109(1), 265–272.
Kukla-Gryz, A. (2009). Economic growth, international trade and air pollution: A decomposition analysis. Ecological Economics 68(5), 1329–1339.
Kuroki, M. and J. Pearl (2014). Measurement bias and effect restoration in causal inference. Biometrika 101(2), 423–437.
Li, C., X. Shen, and W. Pan (2024). Nonlinear causal discovery with confounders. Journal of the American Statistical Association 119(546), 1205–1214.
Lin, Y., F. Windmeijer, X. Song, and Q. Fan (2024). On the instrumental variable estimation with many weak and invalid instruments. Journal of the Royal Statistical Society Series B: Statistical Methodology 86(4), 1068–1088.
Lipsitch, M., E. T. Tchetgen, and T. Cohen (2010). Negative controls: A tool for detecting confounding and bias in observational studies. Epidemiology (Cambridge, Mass.) 21(3), 383–388.
Liu, Z., T. Ye, B. Sun, M. Schooling, and E. T. Tchetgen (2022). Mendelian randomization mixed-scale treatment effect robust identification and estimation for causal inference. Biometrics 79(3), 2208–2219.
Miao, W., Z. Geng, and E. J. Tchetgen Tchetgen (2018). Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika 105(4), 987–993.
Miao, W., W. Hu, E. L. Ogburn, and X.-H. Zhou (2023). Identifying effects of multiple treatments in the presence of unmeasured confounding. Journal of the American Statistical Association 118(543), 1953–1967.
Park, S. and S. Gupta (2012). Handling endogenous regressors by joint estimation using copulas. Marketing Science 31(4), 567–586.
Rosenbaum, P. R. and D. B. Rubin (1983). Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society: Series B (Methodological) 45(2), 212–218.
Rubin, D. B. (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association 75(371), 591–593.
Rubin, D. B. (2005). Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association 100(469), 322–331.
Salehkaleybar, S., A. Ghassami, N. Kiyavash, and K. Zhang (2020). Learning linear non-Gaussian causal models in the presence of latent variables. The Journal of Machine Learning Research 21(1), 1436–1459.
Shimizu, S., P. O. Hoyer, and A. Hyvärinen (2009). Estimation of linear non-Gaussian acyclic models for latent factors. Neurocomputing 72(7-9), 2024–2027.
Shimizu, S., P. O. Hoyer, A. Hyvärinen, A. Kerminen, and M. Jordan (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research 7(10), 2003–2030.
Shimizu, S., T. Inazumi, Y. Sogawa, A. Hyvärinen, Y. Kawahara, T. Washio, P. O. Hoyer, and K. Bollen (2011). DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research 12, 1225–1248.
Stock, J. H., J. H. Wright, and M. Yogo (2002). A survey of weak instruments and weak identification in generalized method of moments. Journal of Business & Economic Statistics 20(4), 518–529.
Stone, J. V. (2002). Independent component analysis: an introduction. Trends in Cognitive Sciences 6(2), 59–64.
Sun, B., Y. Cui, and E. T. Tchetgen (2022). Selective machine learning of the average treatment effect with an invalid instrumental variable. Journal of Machine Learning Research 23(204), 1–40.
Tchetgen, E. J. T., A. Ying, Y. Cui, X. Shi, and W. Miao (2024). An introduction to proximal causal learning. Statistical Science 39(3), 375–390.
Wang, Y. and D. M. Blei (2019). The blessings of multiple causes. Journal of the American Statistical Association 114(528), 1574–1596.
Wang, Y. S. and M. Drton (2020). High-dimensional causal discovery under non-Gaussianity. Biometrika 107(1), 41–59.
Windmeijer, F., H. Farbmacher, N. Davies, and G. Davey Smith (2019). On the use of the Lasso for instrumental variables estimation with some invalid instruments. Journal of the American Statistical Association 114(527), 1339–1350.

Acknowledgments

This research is supported by the National Key R&D Program of China (2022ZD0160300). The authors thank the referees and an editor for their helpful comments.

Supplementary Materials

The supplementary material available online includes additional technical proofs and simulation results, including the two-treatment case and a sensitivity analysis of the Gaussianity of the unmeasured confounders.

Supplementary materials are available for download.
