Statistica Sinica

Irwin Guttman and Daniel Peña

Abstract: This paper develops diagnostics for data thought to be generated in accordance with the general univariate linear model. A first set of diagnostics is developed by considering posterior probabilities of models that dictate which of k observations from a sample of n observations (k < n/2) are spuriously generated, giving rise to the possible outlyingness of the k observations considered. This in turn gives rise to diagnostics to help assess (estimate) the value of k. A second set of diagnostics is found by using the Kullback-Leibler symmetric divergence, which is found to generate measures of outlyingness and influence. Both sets of diagnostics are compared and related to each other and to other diagnostic statistics suggested in the literature. An example to illustrate the use of these diagnostic procedures is included.

Key words and phrases: Spurious and outlying observations, posteriors of models, leverage, Kullback-Leibler measures, outlying and influential observations.