Abstract: A common goal in microarray data analysis is to identify genes whose expression values change between different biological conditions. Existing methods include two-sample t-statistics, modified t-statistics (SAM), Bayesian t-statistics (Cyber-T), semiparametric hierarchical Bayesian models, and nonparametric permutation tests. All of these methods essentially compare two population means. In contrast, we use Bayes factors to compare gene expression levels, which allows us to compare two population distributions. To adapt Bayes factors to microarray data, we propose a new calibration approach that weighs two types of prior predictive error probabilities differently for each gene and, at the same time, controls the overall error rate across all genes. Moreover, a new gene selection algorithm based on the calibration approach is developed and its properties are examined. In several simulations, the proposed method is shown to have a smaller false discovery rate (FDR) and a smaller false non-discovery rate (FNDR) than several existing methods. Finally, a data set from an Affymetrix microarray experiment designed to identify genes associated with mature osteoblast differentiation is used to further illustrate the proposed methodology.
Key words and phrases: Gene selection, Bayes factor, calibrating value, multilevel model, marginal likelihood.