Abstract: The sample coverage of a random sample is defined as the sum of the class probabilities of the observed classes under multinomial sampling, in which exactly one class occurs in each independent observation. This study generalizes the concept of sample coverage to the case in which multiple, possibly dependent, classes can occur in each observation. A consistent estimator of the generalized sample coverage is proposed and its mean squared error properties are developed. The resulting estimator is shown to be an approximate empirical Bayes estimator. A data set on Chinese poems is given for illustration. Results of a simulation study are reported to show the general performance of the proposed estimator and to suggest that the usual estimator, which does not account for the dependence among classes, may be severely biased in some situations.
Key words and phrases: Sample coverage, discovering species, empirical Bayes.
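
As a small illustration of the classical definition stated in the abstract, the following sketch computes the sample coverage as the sum of the class probabilities of the classes that appear in a sample. It covers only the one-class-per-observation setting, not the generalized coverage or the proposed estimator. The function name `sample_coverage`, the class labels, and the probabilities are hypothetical and chosen only for this example; in practice the class probabilities are unknown, which is why the coverage must be estimated.

```python
def sample_coverage(class_probs, sample):
    """Sum of the probabilities of the classes observed in the sample.

    class_probs: dict mapping class label -> true class probability
    sample: iterable of observed class labels, one class per observation
    """
    observed = set(sample)
    return sum(p for label, p in class_probs.items() if label in observed)


# Hypothetical class probabilities (illustrative values, not from the paper).
class_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}

# A sample of six observations in which class "d" never appears.
sample = ["a", "b", "a", "a", "c", "b"]

# Coverage is the total probability of the classes seen: "a", "b", and "c".
coverage = sample_coverage(class_probs, sample)
print(round(coverage, 4))
```

Here the unobserved class "d" accounts for the probability mass missing from the coverage.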