Abstract

We propose a new method of statistical inference, called the method of limits (MoL), which may be viewed as an extension of the method of moments. The method is motivated by the need to analyze count data in genome-wide association studies (GWAS), where computational challenges hinder statistical inference with existing methods. We establish consistency and asymptotic normality of the MoL estimator of heritability from GWAS data, which is seen as an advantage over the existing PQLseq method. Furthermore, we derive a consistent estimator of the proportion of causal SNPs. In our simulation studies, MoL also shows an advantage over PQLseq in both statistical and computational efficiency, as measured by average statistical efficiency (ASE). We also illustrate the usefulness of MoL by applying it to UK Biobank count data to infer the heritability of weekly champagne consumption and weekly red wine consumption.

Information

Preprint No.: SS-2024-0092
Manuscript ID: SS-2024-0092
Complete Authors: Jiming Jiang, Leqi Xu, Yiliang Zhang, Hongyu Zhao
Corresponding Authors: Jiming Jiang
Emails: jimjiang@ucdavis.edu

References

  1. Booth, J. G. and Hobert, J. P. (1999), Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm, J. Roy. Statist. Soc. B 61, 265–285.
  2. Breslow, N. E. and Clayton, D. G. (1993), Approximate inference in generalized linear mixed models, J. Amer. Statist. Assoc. 88, 9–25.
  3. Bycroft, C., Freeman, C., Petkova, D., Band, G., Elliott, L. T., et al. (2018), The UK Biobank resource with deep phenotyping and genomic data, Nature 562, 203–209.
  4. Dao, C., Jiang, J., Paul, D., and Zhao, H. (2021), Variance estimation and confidence intervals from high-dimensional genome-wide association studies through misspecified mixed model analysis, J. Stat. Plan. Inference 220, 15–23.
  5. Golan, D., Lander, E. S., and Rosset, S. (2014), Measuring missing heritability: Inferring the contribution of common variants, PNAS 111, E5272–E5281.
  6. Jiang, J. (1998), Consistent estimators in generalized linear mixed models, J. Amer. Statist. Assoc. 93, 720–729.
  7. Jiang, J. and Nguyen, T. (2021), Linear and Generalized Linear Mixed Models and Their Applications, 2nd ed., Springer, New York.
  8. Jiang, J. (2022), Large Sample Techniques for Statistics, 2nd ed., Springer, New York.
  9. Jiang, J., Li, C., Paul, D., Yang, C., and Zhao, H. (2016), On high-dimensional misspecified mixed model analysis in genome-wide association study, Ann. Statist. 44, 2127–2160.
  10. Little, R. J. A. and Rubin, D. B. (2002), Statistical Analysis with Missing Data, 2nd ed., Wiley, New York.
  11. Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., et al. (2015), UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age, PLOS Medicine 12, e1001779.
  12. Sun, S., Zhu, J., Mozaffari, S., Ober, C., Chen, M., and Zhou, X. (2019), Heritability estimation and differential analysis of count data with generalized linear mixed models in genomic sequencing studies, Bioinformatics 35, 487–496.
  13. Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., Madden, P. A., Heath, A. C., Martin, N. G., Montgomery, G. W., et al. (2010), Common SNPs explain a large proportion of the heritability for human height, Nature Genetics 42, 565–569.

Acknowledgments

The research of Jiming Jiang is partially supported by the NSF grants DMS-1713120, DMS-1914465 and DMS-2210569. The research of Hongyu Zhao is partially supported by DMS-1713120 and NIH R01 GM134005. The research was conducted using the UKBB resource under approved data requests (access ref: 29900).

Supplementary Materials

The Supplementary Material contains proofs of the main theoretical results. The code for simulations and real data analysis is available at https://github.com/LeqiXu/MoL_analysis.


Supplementary materials are available for download.