Abstract

Uncertainty quantification is a critical aspect of modern statistical modeling and machine learning. Among the many methods available, conformal prediction is especially powerful because it offers finite-sample coverage guarantees under the weak assumption of exchangeability. However, the efficiency of conformal prediction in high-dimensional settings is often compromised by the overfitting of complex models or by inherent estimation bias. To address this, we propose debiased conformal threshold ridge regression (DeCThRR), a computationally efficient framework that integrates a stable thresholded ridge regression estimator with a targeted procedure that corrects regularization-induced bias before the nonconformity scores are computed. We prove that our method preserves finite-sample marginal coverage while achieving near-optimal efficiency and asymptotic conditional coverage under mild assumptions. Experiments confirm that our method produces narrower, more reliable prediction intervals than standard conformal approaches and some advanced inference methods, and that it remains robust even under model misspecification.

Key words and phrases: Conformal prediction, High-dimensional regression, Bias correction, Threshold ridge regression, Finite-sample coverage, Asymptotic efficiency, Conditional coverage
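
As a rough illustration of the pipeline described in the abstract (fit a ridge estimator, threshold it, correct for regularization bias, then calibrate residual-based nonconformity scores), a minimal Python sketch is given here. It is not the authors' DeCThRR algorithm: the function names, the hard-thresholding rule, the plug-in bias correction, the tuning values `lam` and `tau`, and the synthetic data are all assumptions made for illustration, and the calibration step is standard split conformal prediction.

```python
import numpy as np


def debiased_thresholded_ridge(X, y, lam, tau):
    """Illustrative estimator (an assumption, not the paper's exact procedure)."""
    p = X.shape[1]
    gram = X.T @ X + lam * np.eye(p)
    beta_ridge = np.linalg.solve(gram, X.T @ y)
    # Hard-threshold small coefficients to get a stable plug-in estimate.
    beta_thr = np.where(np.abs(beta_ridge) > tau, beta_ridge, 0.0)
    # In a linear model the ridge estimator has bias -lam * gram^{-1} beta;
    # add back a plug-in estimate of that bias.
    return beta_ridge + lam * np.linalg.solve(gram, beta_thr)


def split_conformal_interval(beta, X_cal, y_cal, X_new, alpha=0.1):
    """Standard split conformal intervals from absolute-residual scores."""
    scores = np.abs(y_cal - X_cal @ beta)
    n = scores.shape[0]
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.inf if k > n else np.sort(scores)[k - 1]
    center = X_new @ beta
    return center - q, center + q


# Synthetic sparse linear model (hypothetical data, for demonstration only).
rng = np.random.default_rng(0)
n_train, n_cal, n_test, p = 200, 500, 2000, 50
beta_true = np.zeros(p)
beta_true[:5] = 2.0


def simulate(n):
    X = rng.standard_normal((n, p))
    return X, X @ beta_true + rng.standard_normal(n)


X_train, y_train = simulate(n_train)
X_cal, y_cal = simulate(n_cal)
X_test, y_test = simulate(n_test)

beta_hat = debiased_thresholded_ridge(X_train, y_train, lam=1.0, tau=0.5)
lower, upper = split_conformal_interval(beta_hat, X_cal, y_cal, X_test, alpha=0.1)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}, mean width: {np.mean(upper - lower):.3f}")
```

Because the calibration scores come from data not used to fit the estimator, the marginal coverage guarantee of split conformal prediction holds under exchangeability regardless of how good the estimator is. The estimator only affects how wide the intervals are, which is the efficiency question the abstract addresses.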

Information

Preprint No.: SS-2025-0383
Manuscript ID: SS-2025-0383
Complete Authors: Jiamei Wu, Pan Shang, Yanlin Tang, Linglong Kong, Bei Jiang, Lingchen Kong
Corresponding Authors: Linglong Kong
Emails: lkong@ualberta.ca

Jiamei Wu, Beijing Jiaotong University, China

Acknowledgments

This research was partially supported by the National Natural Science Foundation of China (grants 12371322, 12371265, and 12401430).

Supplementary Materials

The supplementary file provides proofs of the theorems stated in the paper and some additional simulation results.


Supplementary materials are available for download.