Abstract

Many biomedical studies collect high-dimensional medical imaging data to identify biomarkers for the detection, diagnosis, and treatment of human diseases. Consequently, it is crucial to develop accurate models that can predict a wide range of clinical outcomes (both discrete and continuous) based on imaging data. By treating imaging predictors as functional data, we propose a residual-based alternative partial least squares (RAPLS) model for a broad class of generalized functional linear models that incorporate both functional and scalar covariates. Our RAPLS method extends the alternative partial least squares (APLS) algorithm iteratively to accommodate additional scalar covariates and non-continuous outcomes. We establish the convergence rate of the RAPLS estimator for the unknown slope function and, with an additional calibration step, we prove the asymptotic normality and efficiency of the calibrated RAPLS estimator for the scalar parameters. The effectiveness of the RAPLS algorithm is demonstrated through multiple simulation studies and an application predicting Alzheimer’s disease progression using neuroimaging data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).

Information

Preprint No.: SS-2023-0418
Manuscript ID: SS-2023-0418
Complete Authors: Yue Wang, Xiao Wang, Joseph G. Ibrahim, Hongtu Zhu
Corresponding Authors: Yue Wang
Emails: yue.2.wang@cuanschutz.edu

Acknowledgments

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wpcontent/uploads/howtoapply/ADNIAcknowledgementList.pdf. Dr. Zhu was partially supported by the National Institutes of Health (NIH) grants 1R01AR082684, 1OT2OD038045-01, and the National Institute on Aging (NIA) of the National Institutes of Health (NIH) grants U01AG079847, 1R01AG085581, RF1AG082938, and

Supplementary Materials

Online supplementary material includes additional simulations and theoretical results, proofs of the main theorems, and supporting information for the real data application.


Supplementary materials are available for download.