Abstract: Spellman et al. (1998) used microarrays to identify cell-cycle regulated genes. In one experiment, a strain of yeast (cdc15-2) was incubated at a high temperature (C) for a long time, causing cdc15 arrest. Cells were then shifted back to a low temperature (C), and gene expression was monitored every 10 min for 300 min using cDNA microarrays. The data are available from their web site (http://cellcycle-www.standford.edu). We find a simple statistical model that can be used to describe most of the expression curves. Three ideas are involved in our analysis: (1) the use of principal component analysis to suggest basis curves; (2) the use of nested models for organizing gene expression patterns; (3) the construction of a compass plot using known cell-cycle regulated genes for phase determination.
The first two ideas are mainly statistical in nature, but some biological discretion is necessary for successful application. The third idea, by contrast, uses biological information subject to some statistical discretion. We discuss where our results agree with, and where they differ from, the 800 genes identified by Spellman et al. A rather unexpected finding is the existence of over 500 genes whose expression levels oscillate regularly every 10 min from time 70 min to time 250 min, like a biological pendulum. Extension of our analysis to other cell-cycle experiments is briefly discussed.
Key words and phrases: Analysis of variance, cell-cycle, gene expression, microarray, nested models, principal component analysis.
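As a rough illustration of ideas (1) and (2), the following minimal sketch extracts basis curves by principal component analysis and compares nested least-squares fits for one gene. It is not the analysis carried out in the paper: the data are synthetic, and the sampling grid, the 110 min period, the noise level, and the function names `pca_basis_curves`, `nested_rss` and `f_statistic` are all assumptions made for the example.

```python
import numpy as np


def pca_basis_curves(expr, n_curves=2):
    """Return the leading principal-component time profiles (one per row)."""
    # Remove each gene's mean level so the components describe shape over time.
    centered = expr - expr.mean(axis=1, keepdims=True)
    # Rows of vt are orthonormal time profiles, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_curves]


def nested_rss(y, basis):
    """Residual sums of squares for nested models:
    mean only, mean + curve 1, mean + curves 1 and 2, and so on."""
    design = np.ones((y.size, 1))
    rss = []
    for k in range(basis.shape[0] + 1):
        if k > 0:
            design = np.column_stack([design, basis[k - 1]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        rss.append(float(resid @ resid))
    return rss


def f_statistic(rss_small, rss_big, extra_params, n_obs, params_big):
    """Classical F statistic for comparing a smaller model nested in a bigger one."""
    return ((rss_small - rss_big) / extra_params) / (rss_big / (n_obs - params_big))


# Synthetic stand-in for an expression matrix (genes x time points).
rng = np.random.default_rng(0)
times = np.arange(10, 300, 10, dtype=float)  # hypothetical 10 min sampling grid
period = 110.0                               # hypothetical cycle length in minutes
n_genes = 200
phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_genes, 1))
amps = rng.uniform(0.5, 2.0, size=(n_genes, 1))
expr = amps * np.cos(2.0 * np.pi * times / period - phases)
expr = expr + 0.3 * rng.standard_normal((n_genes, times.size))

basis = pca_basis_curves(expr, n_curves=2)
rss = nested_rss(expr[0], basis)  # nested fits for the first gene
f_stat = f_statistic(rss[0], rss[2], extra_params=2,
                     n_obs=times.size, params_big=3)
print("RSS for nested models:", rss)
print("F statistic, mean-only vs. mean + two curves:", f_stat)
```

In this sketch a large F statistic suggests that the two basis curves explain the gene's profile much better than a constant level does. Converting the statistic to a p-value would require an F distribution, which is left out to keep the example dependent on NumPy only.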