Abstract: We discuss a method called ``cluster scoring'' for supervised learning from a set of gene expression experiments. Cluster scoring generalizes methods that rank individual genes by their correlation with an outcome measure. It begins with a clustering of the genes, for example from hierarchical clustering, and then computes outcome scores both for the individual genes and for the average gene expression of each cluster. A permutation method is used to identify the significant subset of these scores. We illustrate the method on both simulated data and data from a study of lymphoma.
Key words and phrases: Clustering, microarrays, supervised learning.
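
The procedure outlined in the abstract can be sketched roughly as follows. This is only an illustrative sketch under several assumptions that the abstract does not specify: the outcome score is taken to be the absolute Pearson correlation with the outcome, genes are clustered by average-linkage hierarchical clustering on correlation distance and cut into a fixed number of clusters, and significance is judged against the permutation distribution of the maximum score. The names `abs_correlation` and `cluster_scoring` and all parameters are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def abs_correlation(features, y):
    """Absolute Pearson correlation of each row of `features` with `y`."""
    fc = features - features.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    denom = np.sqrt((fc ** 2).sum(axis=1) * (yc ** 2).sum())
    return np.abs(fc @ yc) / denom


def cluster_scoring(X, y, n_clusters=5, n_perm=200, alpha=0.05, seed=0):
    """Score genes and cluster averages against an outcome (illustrative).

    X is a genes-by-samples expression matrix and y holds one outcome
    value per sample. Every modelling choice here is an assumption.
    """
    rng = np.random.default_rng(seed)

    # Cluster the genes (rows of X) hierarchically, then cut the tree.
    tree = linkage(X, method="average", metric="correlation")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # One extra feature per cluster: the average expression of its genes.
    cluster_ids = np.unique(labels)
    averages = np.vstack([X[labels == c].mean(axis=0) for c in cluster_ids])
    features = np.vstack([X, averages])

    # Observed scores for individual genes followed by cluster averages.
    observed = abs_correlation(features, y)

    # Permute the outcome to build a null distribution of the maximum score.
    null_max = np.array([
        abs_correlation(features, rng.permutation(y)).max()
        for _ in range(n_perm)
    ])
    threshold = np.quantile(null_max, 1 - alpha)

    return observed, observed > threshold, labels
```

The first `X.shape[0]` entries of the returned scores refer to individual genes and the remaining entries to the cluster averages, so both kinds of feature are compared against the same permutation threshold.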