Abstract
We propose a novel method of finding principal components in multivariate
data sets that lie on an embedded nonlinear Riemannian manifold within a
higher-dimensional space. Our aim is to extend the geometric interpretation of
PCA while being able to capture non-geodesic modes of variation in the data. We
introduce the concept of a principal sub-manifold: a manifold passing through a
reference point that, at any point on the manifold, extends in the direction of
highest variation in the space spanned by the eigenvectors of the local tangent
space PCA. Compared with recent work on the case where the sub-manifold is of
dimension one (Panaretos et al., 2014), essentially a curve lying on the
manifold that attempts to capture one-dimensional variation, the current
setting is much more general. The principal sub-manifold is therefore an
extension of the principal flow that accommodates higher-dimensional variation
in the data. We show that, in Euclidean space, the principal sub-manifold yields
the ball spanned by the usual principal components. By means of examples, we
illustrate how to find, use and interpret a principal sub-manifold, and we
present an application in shape analysis.
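As a rough illustration of the local tangent space PCA mentioned in the abstract, the sketch below works on the unit sphere in three dimensions. It is not the authors' implementation: the choice of manifold, the Gaussian kernel with bandwidth `h`, the second-moment matrix taken about the base point `p`, the synthetic data, and the function names `log_map_sphere` and `local_tangent_pca` are all assumptions made for the example.

```python
import numpy as np

def log_map_sphere(p, x):
    """Log map on the unit sphere: tangent vector at p pointing toward x."""
    cos_t = np.clip(np.dot(p, x), -1.0, 1.0)
    v = x - cos_t * p                      # component of x orthogonal to p
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:                     # x coincides with p (antipode not handled)
        return np.zeros_like(p)
    return np.arccos(cos_t) * v / norm_v   # length equals the geodesic distance

def local_tangent_pca(p, data, h):
    """Kernel-weighted PCA of the log-mapped data in the tangent space at p.

    Assumption: Gaussian weights in the geodesic distance, and a second-moment
    matrix about p rather than about a local mean.
    """
    logs = np.array([log_map_sphere(p, x) for x in data])
    dists = np.linalg.norm(logs, axis=1)
    w = np.exp(-0.5 * (dists / h) ** 2)
    cov = (logs * w[:, None]).T @ logs / w.sum()
    evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]        # reorder to descending
    return evals[order], evecs[:, order]

# Hypothetical data: points scattered closely around part of the equator.
phi = np.linspace(-0.5, 0.5, 51)
lat = 0.05 * np.cos(9 * phi)
data = np.column_stack([np.cos(lat) * np.cos(phi),
                        np.cos(lat) * np.sin(phi),
                        np.sin(lat)])

p = np.array([1.0, 0.0, 0.0])
evals, evecs = local_tangent_pca(p, data, h=0.3)
print("local eigenvalues:", evals)
print("leading tangent direction:", evecs[:, 0])
```

Because the data spread mainly along the equator, the leading tangent direction at `p` should lie close to the second coordinate axis (up to sign). A principal sub-manifold of dimension two would, under these assumptions, be extended along the span of the first two local eigenvectors at each point.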
Information
| Preprint No. | SS-2021-0163 |
|---|---|
| Manuscript ID | SS-2021-0163 |
| Complete Authors | Zhigang Yao, Benjamin Eltzner, Tung Pham |
| Corresponding Authors | Zhigang Yao |
| Emails | zhigang.yao@nus.edu.sg |
References
- Akhoj, M., J. Benn, E. Grong, S. Sommer, and X. Pennec (2023). Principal subbundles for dimension reduction. arXiv preprint arXiv:2307.03128.
- Anderson, T. (1963). Asymptotic theory for principal component analysis. The Annals of Mathematical Statistics 34, 122–148.
- Chaudhuri, P. and J. S. Marron (2000). Scale space view of curve estimation. The Annals of Statistics 28, 408–428.
- Donoho, D. L. and C. Grimes (2003). Hessian eigenmaps: New locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences of the United States of America 100, 5591–5596.
- Eltzner, B., S. Huckemann, and K. V. Mardia (2018). Torus principal component analysis with applications to RNA structure. The Annals of Applied Statistics 12, 1332–1359.
- Fisher, R. (1953). Dispersion on a sphere. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science 217, 295–305.
- Fletcher, P. T. and S. Joshi (2007). Riemannian geometry for the statistical analysis of diffusion tensor data. Signal Processing 87, 250–262.
- Fletcher, P. T., C. Lu, S. M. Pizer, and S. Joshi (2004). Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Transactions on Medical Imaging 23, 995–1005.
- Gerber, S., T. Tasdizen, P. T. Fletcher, S. Joshi, R. Whitaker, and the Alzheimer's Disease Neuroimaging Initiative (ADNI) (2010). Manifold modeling for brain population analysis. Medical Image Analysis 14, 643–653.
- Guhaniyogi, R. and D. Dunson (2016). Compressed Gaussian process for manifold regression. Journal of Machine Learning Research 17, 1–26.
- Hastie, T. and W. Stuetzle (1989). Principal curves. Journal of the American Statistical Association 84, 502–516.
- Huckemann, S., T. Hotz, and A. Munk (2010). Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric Lie group actions. Statistica Sinica 20, 1–100.
- Huckemann, S. and H. Ziezold (2006). Principal component analysis for Riemannian manifolds, with an application to triangular shape spaces. Advances in Applied Probability 38, 299–319.
- Jung, S., I. L. Dryden, and J. S. Marron (2012). Analysis of principal nested spheres. Biometrika 99, 551–568.
- Jung, S., X. Liu, J. S. Marron, and S. M. Pizer (2010). Generalized PCA via the backward stepwise approach in image analysis. In Brain, Body and Machine, pp. 111–123. Springer: Berlin/Heidelberg.
- Jupp, P. E. and J. T. Kent (1987). Fitting smooth paths to spherical data. Journal of the Royal Statistical Society, Series C 36, 34–36.
- Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics 30, 509–541.
- Kendall, D. G. (1989). A survey of the statistical theory of shape. Statistical Science 4, 87–120.
- Kendall, D. G., D. Barden, T. K. Carne, and H. Le (1999). Shape and Shape Theory. New York: Wiley.
- Kenobi, K., I. L. Dryden, and H. Le (2010). Shape curves and geodesic modelling. Biometrika 97, 567–584.
- Kume, A., I. L. Dryden, and H. Le (2007). Shape-space smoothing splines for planar landmark data. Biometrika 94, 513–528.
- Panaretos, V. M., T. Pham, and Z. Yao (2014). Principal flows. Journal of the American Statistical Association 109, 424–436.
- Patrangenaru, V. and L. Ellingson (2015). Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis. CRC Press.
- Pennec, X. (2006). Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision 25, 127–154.
- Pennec, X. (2015). Barycentric subspaces and affine spans in manifolds. In Geometric Science of Information (GSI) 2015, pp. 12–21. Springer International Publishing.
- Pennec, X. and J.-P. Thirion (1997). A framework for uncertainty and validation of 3D registration methods based on points and frames. International Journal of Computer Vision 25, 203–229.
- Roweis, S. T. and L. K. Saul (2000). Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326.
- Sommer, S. (2013). Horizontal dimensionality reduction and iterated frame bundle development. In Geometric Science of Information (GSI) 2013, pp. 76–83. Springer: Berlin/Heidelberg.
- Souvenir, R. and R. Pless (2007). Image distance functions for manifold learning. Image and Vision Computing 25, 365–373.
- Yao, Z., J. Su, and S.-T. Yau (2023). Manifold fitting with CycleGAN. Proceedings of the National Academy of Sciences of the United States of America 121, e2311436121.
- Yao, Z., Y. Xia, and Z. Fan (2024). Random fixed boundary flows. Journal of the American Statistical Association 119(547), 2356–2368.
- Yao, Z. and Z. Zhang (2020). Principal boundary on Riemannian manifolds. Journal of the American Statistical Association 115, 1435–1448.
- Zhang, Z. and H. Zha (2004). Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM Journal on Scientific Computing 26, 313–338.
Acknowledgments
Z. Yao has been supported by the Singapore Ministry of Education Tier 2 grants
(A-0008520-00-00, A-8001562-00-00) and Tier 1 grants (A-0004809-00-00,
A8000987-00-00) at the National University of Singapore.
B. Eltzner gratefully acknowledges funding by the DFG CRC 803 project Z02 and
DFG CRC 1456 project B02. B. Eltzner is very grateful to the National University
of Singapore for funding a two-month long-term visit in February and March 2018,
which enabled collaboration on this project. We thank S. F. Huckemann and
J. S. Marron for helpful discussions.
Supplementary Materials
Principal Sub-manifolds – Supplementary Materials (PDF file): This file
contains additional background theory, additional simulation results that
illustrate the greedy algorithm proposed here, and additional applications.