
Statistica Sinica 34 (2024), 1023-1043

STATISTICAL INFERENCE FOR GENETIC
RELATEDNESS BASED ON HIGH-DIMENSIONAL
LOGISTIC REGRESSION

Rong Ma1, Zijian Guo2, T. Tony Cai3 and Hongzhe Li*3

1Stanford University, 2Rutgers University,
and 3University of Pennsylvania

Abstract: We examine statistical inference for genetic relatedness between binary traits, based on individual-level genome-wide association data. Specifically, for high-dimensional logistic regression models, we define parameters characterizing the cross-trait genetic correlation, genetic covariance, and trait-specific genetic variance. We develop a novel weighted debiasing method for the logistic Lasso estimator and propose computationally efficient debiased estimators. Furthermore, we study the rates of convergence for these estimators and establish their asymptotic normality under mild conditions. We also construct confidence intervals and statistical tests for these parameters, and provide theoretical justifications for the methods, including the coverage probability and expected length of the confidence intervals, and the size and power of the proposed tests. Numerical studies under both model-generated data and simulated genetic data show the superiority of the proposed methods. By analyzing a real data set on autoimmune diseases, we demonstrate their ability to obtain novel insights about the shared genetic architecture among 10 pediatric autoimmune diseases.
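
As an illustration of the three parameters named above, the following minimal sketch computes plug-in versions of them from two coefficient vectors and a covariance matrix. The abstract does not state the definitions, so the quadratic forms used here (covariance as `beta' Sigma gamma`, variance as `beta' Sigma beta`, correlation as their normalized ratio) are assumptions, as are the function name `genetic_relatedness` and the convention of returning a correlation of 0 when either variance is zero. This is a naive plug-in calculation, not the weighted debiased estimator the paper proposes.

```python
import numpy as np

def genetic_relatedness(beta, gamma, sigma):
    """Plug-in genetic covariance, variances, and correlation.

    Assumed definitions (not stated in the abstract):
      covariance  = beta' Sigma gamma
      variance    = beta' Sigma beta   (likewise for gamma)
      correlation = covariance / sqrt(variance_1 * variance_2)
    """
    beta = np.asarray(beta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    cov = float(beta @ sigma @ gamma)
    var_1 = float(beta @ sigma @ beta)
    var_2 = float(gamma @ sigma @ gamma)

    # Convention chosen for this sketch: correlation is 0 if a variance vanishes.
    denom = np.sqrt(var_1 * var_2)
    corr = cov / denom if denom > 0 else 0.0

    return {
        "covariance": cov,
        "variance_1": var_1,
        "variance_2": var_2,
        "correlation": corr,
    }

# Toy example: gamma is an exact multiple of beta, so the two traits
# share the same genetic direction under an identity covariance.
beta = np.array([1.0, 0.0, 2.0])
gamma = np.array([2.0, 0.0, 4.0])
sigma = np.eye(3)
result = genetic_relatedness(beta, gamma, sigma)
```

In practice the coefficient vectors would be estimated (for example by a logistic Lasso fit per trait), and substituting such estimates directly into these quadratic forms generally inherits the shrinkage bias of the Lasso, which is the motivation for a debiasing step.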

Key words and phrases: Confidence interval, debiasing methods, functional estimation, genetic correlation, hypothesis testing.
