Semiparametric Principal Stratification Analysis Beyond Monotonicity

Jiaqi Tong, Brennan Kahan, Michael O. Harhay and Fan Li

doi:10.5705/ss.202025.0066

Abstract

Intercurrent events, common in clinical trials and observational studies, affect the existence or interpretation of final outcomes. Principal stratification addresses this challenge by defining local average treatment effect estimands within subpopulations, but often relies on restrictive assumptions such as monotonicity and counterfactual intermediate independence. To overcome these limitations, we propose a semiparametric framework for principal stratification analysis leveraging a margin-free, conditional odds ratio sensitivity parameter. Under principal ignorability, we derive nonparametric identification formulas and efficient estimation methods, including a conditionally doubly robust parametric estimator and a debiased machine learning estimator with data-adaptive nuisance learners. Our simulations show that incorrectly assuming monotonicity can frequently lead to biased inference, but incorrectly assuming non-monotonicity when monotonicity holds may maintain approximately valid inference. We demonstrate our methods in the context of a critical care trial, where monotonicity is unlikely to be valid.

Key words and phrases: Causal inference; efficient influence function; conditional double robustness; odds ratio; sensitivity analysis; intercurrent events

Information

Preprint No.	SS-2025-0066
Manuscript ID	SS-2025-0066
Complete Authors	Jiaqi Tong, Brennan Kahan, Michael O. Harhay, Fan Li
Corresponding Authors	Fan Li
Emails	fan.f.li@yale.edu

Acknowledgments

Research in this article was supported by the United States National Institutes of Health (NIH), National Heart, Lung, and Blood Institute (NHLBI, grant numbers R01-HL168202 and 1R01HL178513). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the NIH. The authors also thank the Yale University-Mayo Clinic Center of Excellence in Regulatory Science and Innovation (CERSI) for supporting this study.

Supplementary Materials

Additional technical details, derivations, proofs, and supporting information regarding the simulation experiments are provided in the Online Supplementary Material.
