

Statistica Sinica 19 (2009), 251-271





THE NESTED DIRICHLET DISTRIBUTION AND INCOMPLETE CATEGORICAL DATA ANALYSIS


Kai Wang Ng$^1$, Man-Lai Tang$^2$, Guo-Liang Tian$^1$ and Ming Tan$^3$


$^1$The University of Hong Kong, $^2$Hong Kong Baptist University,
and $^3$University of Maryland Greenebaum Cancer Center
Abstract: The nested Dirichlet distribution (NDD) is an important distribution defined on the closed $n$-dimensional simplex. It includes the classical Dirichlet distribution as a special case and is useful in incomplete categorical data (ICD) analysis. In this article, we develop the distributional properties of the NDD. New large-sample likelihood and small-sample Bayesian approaches for analyzing ICD are proposed and compared with existing likelihood/Bayesian strategies. We show that the new approaches have at least three advantages over existing approaches based on the traditional Dirichlet distribution, in both frequentist and conjugate Bayesian inference for ICD: the new methods possess closed-form expressions for both the maximum likelihood and Bayes estimates when the likelihood function is in NDD form; they produce computationally efficient EM and data augmentation algorithms when the likelihood is not in NDD form; and they provide exact sampling procedures for some special cases. The methodologies are illustrated with simulated and real data.



Key words and phrases: Data augmentation, Dirichlet distribution, EM, incomplete categorical data, matrix rate of convergence, mixing rate of a Markov chain, nested Dirichlet distribution.
