Abstract
We propose a functional joint modeling (FJM) framework for correlating imaging responses
with genetic markers and clinical variables. Our FJM consists of a nonlinear multivariate
functional principal component analysis (NMFPCA) and a functional multiple-index varying coefficient model (FMVCM). The NMFPCA, with unknown link functions, is used to extract meaningful
functional principal component (FPC) scores of genetic markers, while the FMVCM identifies the
varying association of the extracted FPC scores and clinical variables with imaging data. We
propose an efficient procedure to estimate the unknown functions in our FJM and a regularization approach to simultaneously select relevant features from infinite-dimensional functional
data and learn the model structure. The asymptotic convergence rate of estimators and model selection consistency are investigated. The proposed method is evaluated through simulation studies
and applied to an imaging genetic data set extracted from the Alzheimer’s Disease Neuroimaging
Initiative (ADNI) study.
∗These authors jointly supervised this work: Xinyuan Song and Hongtu Zhu.
Information
| Field | Value |
|---|---|
| Preprint No. | SS-2023-0152 |
| Manuscript ID | SS-2023-0152 |
| Complete Authors | Qingzhi Zhong, Xinyuan Song, Hongtu Zhu |
| Corresponding Authors | Hongtu Zhu |
| Emails | htzhu@email.unc.edu |
Acknowledgments
The authors would like to express their gratitude to the editor and the anonymous referees for their careful reading and useful comments, which led to an improved presentation
of the paper. Data used in preparation of this article were obtained from the Alzheimer’s
Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or
provided data but did not participate in analysis or writing of this report. A complete
listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
This work was partially supported by GRF grant 14302519 from the Research Grants Council of the HKSAR for Dr.
Song. Zhong’s work was supported by the National Natural Science Foundation of China
(12401349) and the China Postdoctoral Science Foundation (2024M751116).
Supplementary Materials
The Supplementary Material contains the theoretical proofs in Section 4, the estimation
procedure in Section 3, the ADNI analysis given APOE-ϵ4 and disease status in Section 6,
and Web Tables 1–11 and Figures 1–7 in Sections 5 and 6.