Leveraging Local Distributions in Mendelian Randomization: Uncertain Opinions are Invalid

Ziya Xu and Sai Li

doi:10.5705/ss.202023.0344

Abstract

Mendelian randomization (MR) uses genetic variants as instrumental variables (IVs) to infer causal effects in observational studies. However, causal inference in MR can be compromised when some of the IVs are invalid. In this work, we propose a new method, MR-Local, to infer the causal effect in the presence of possibly invalid IVs. By leveraging the distribution of ratio estimates around the true causal effect, MR-Local selects the cluster of ratio estimates with the least uncertainty and performs causal inference within it. We establish the asymptotic normality of our estimator in the two-sample summary-data setting under either the plurality rule or the balanced pleiotropy assumption. Extensive simulations and analyses of real datasets demonstrate the reliability of our approach.
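
For readers unfamiliar with the term, the ratio estimates mentioned above are, in the standard two-sample summary-data setting, the per-variant Wald ratios of the SNP-outcome association to the SNP-exposure association. The following is a minimal sketch of that standard quantity on toy numbers, not the MR-Local procedure itself; the function names, the toy data, and the first-order standard error (which ignores uncertainty in the SNP-exposure estimates) are all assumptions made for illustration.

```python
import numpy as np

def ratio_estimates(gamma_hat, Gamma_hat, se_Gamma):
    """Per-variant Wald ratio estimates with first-order standard errors.

    gamma_hat: SNP-exposure association estimates
    Gamma_hat: SNP-outcome association estimates
    se_Gamma:  standard errors of Gamma_hat
    """
    gamma_hat = np.asarray(gamma_hat, dtype=float)
    Gamma_hat = np.asarray(Gamma_hat, dtype=float)
    se_Gamma = np.asarray(se_Gamma, dtype=float)
    beta_j = Gamma_hat / gamma_hat
    # First-order approximation: treats gamma_hat as fixed (an assumption).
    se_j = se_Gamma / np.abs(gamma_hat)
    return beta_j, se_j

def ivw(beta_j, se_j):
    """Inverse-variance weighted average of the ratio estimates."""
    w = 1.0 / se_j**2
    est = np.sum(w * beta_j) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return est, se

# Hypothetical noiseless toy data: true causal effect 0.5, and the last
# variant has a direct (pleiotropic) effect on the outcome, so it is invalid.
true_beta = 0.5
gamma_hat = np.array([0.10, 0.20, 0.15, 0.25, 0.12])
pleiotropy = np.array([0.0, 0.0, 0.0, 0.0, 0.06])
Gamma_hat = true_beta * gamma_hat + pleiotropy
se_Gamma = np.full(5, 0.01)

beta_j, se_j = ratio_estimates(gamma_hat, Gamma_hat, se_Gamma)
est_all, _ = ivw(beta_j, se_j)            # pulled away from 0.5 by the invalid IV
est_valid, _ = ivw(beta_j[:4], se_j[:4])  # uses only the valid IVs

print("ratio estimates:", beta_j)
print("IVW, all variants:", est_all)
print("IVW, valid variants only:", est_valid)
```

In this toy setting the ratio estimates from the valid variants coincide with the true effect while the invalid variant's estimate is displaced, which is the kind of structure a cluster-based selection can exploit.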

Key words and phrases: Causal inference, instrumental variable, Mendelian randomization, pleiotropy

Information

Preprint No.	SS-2023-0344
Manuscript ID	SS-2023-0344
Complete Authors	Ziya Xu, Sai Li
Corresponding Authors	Sai Li
Emails	saili@ruc.edu.cn

Acknowledgments

This work was supported by the National Natural Science Foundation of China (grant no. 12201630).

Supplementary Materials

The supplementary material includes the following sections: S1, technical lemmas; S2, proofs for the theorems; S3, supplementary propositions; S4, further information on simulations; S5, further results on real studies.

Supplementary materials are available for download.
