Cell Cycle-regulated Gene Expression in Arabidopsis

Regulated gene expression is an important mechanism for controlling cell cycle progression in yeast and mammals, and genes involved in cell division-related processes often show transcriptional regulation dependent on cell cycle position. Analysis of cell cycle processes in plants has been hampered by the lack of synchronizable cell suspensions for Arabidopsis, and few cell cycle-regulated genes are known. Using a recently described synchrony system, we have analyzed RNA from sequential samples of Arabidopsis cells progressing through the cell cycle using Affymetrix GeneChip arrays. We identify nearly 500 genes that robustly display significant fluctuation in expression, representing the first genomic analysis of cell cycle-regulated gene expression in any plant. In addition to the limited number of genes previously identified as cell cycle-regulated in plants, we also find specific patterns of regulation for genes known or suspected to be involved in signal transduction, transcriptional regulation, and hormonal regulation, including key genes of cytokinin response. The genes identified represent both pathways that are cell cycle-regulated in other organisms and pathways involved in plant-specific processes. The range and number of cell cycle-regulated genes show the close integration of the plant cell cycle into a variety of cellular control and response pathways.

Cell division is a fundamental biological process and shares conserved features and controls in all eukaryotes (1-3). However, plants have a number of special features that give the control of cell division particular importance, including an indeterminate mode of development, the absence of cell migration, and responsiveness of growth rate and development to changes in environmental conditions. Cell division therefore plays a role both in the developmental processes that create plant architecture and in the modulation of plant growth rate in response to the environment (4,5). It is therefore not unexpected that plant cell cycle control shows a number of novel aspects, together with conservation of the types of key regulators of cell cycle transitions such as cyclin-dependent kinases (CDKs), CDK inhibitor genes, cyclins, retinoblastoma (Rb) protein homologs, and E2F (6-16). However, important differences include the absence of direct CDC25 protein phosphatase homologs and the presence of cell cycle-regulated CDKs known as CDKB (17-22). As well as the presence of such novel regulators of the cell cycle, cell division control in plants might also show interactions with plant hormones and developmental regulators as well as with plant-specific processes such as cell wall metabolism.
Regulation of gene expression in different phases is proposed to be an important mechanism for control of progression through the cell cycle in yeast and mammalian cells, and around 800 genes have been identified using microarray analysis in both systems as potentially cell cycle-regulated (23-27). The wide scale analysis of cell cycle-regulated expression in plants has been hampered to date by the lack of a suitable system for the synchronization of cells from a sequenced species, and rather few genes are documented as cell cycle-regulated (28). Almost all of these genes are directly involved in cell cycle progression, thereby giving few clues as to mechanisms by which cell cycle control may intersect with other cellular processes (22, 29-35). Using a recently developed cell synchrony system for Arabidopsis cells (22), we have carried out an analysis of gene expression on high density Affymetrix microarrays (36). Cell cycle progression was reversibly blocked using the DNA polymerase inhibitor aphidicolin, and sequential RNA samples taken at 2-h intervals over a 19-h period were analyzed for gene expression. Expression of 4010 genes was detected and tested for statistically significant cell cycle regulation above the variation shown by a randomized data set. This resulted in the identification of 463 candidate cell cycle-regulated genes, showing that cell cycle regulation of expression is found for a significant number of genes in plants. A close match was found for known regulated genes between the microarray expression analysis and RNA gel blots. Systematic analysis of their expression revealed common patterns of expression following release, suggesting coordinate regulation of a number of genes.
Genes regulated in this experiment represent both processes known or suspected to be cell cycle-regulated in plants or other organisms and genes involved in a number of other cellular processes including hormone response, signal transduction, transcription control, and metabolic regulation (37-47). The insights provided by the first wide scale analysis and identification of cell cycle-modulated gene expression in plants reflect the central role of cell division in plant development and responses and form an important foundation for future studies in plant cell biology.

order of the measurements. Data resampling was performed by allowing permutations of measurements for each gene. Subsequently, PVE values m_i were calculated for all vectors Y_i(t) in the artificial control data set.
Cluster Analysis of Cell Cycle Expression Patterns-Expression patterns of genes defined as having a statistically significantly (p < 0.05) greater periodic expression in the experiment than in the randomized data set were imported into GeneMaths (version 1.50; Applied Maths). Of the 4010 genes that passed the variation filter, a total of 493 gene expression profiles met the periodic fluctuation conditions (p < 0.05). All of the processed values for signal log ratios after comparative analysis against the sample taken directly after block release (T0) were subjected to principal component analysis (PCA) and self-organizing map (SOM) algorithms using GeneMaths 1.50 (50). Prior to PCA and SOM analysis, genes were labeled (GeneMaths) according to the annotated peak of expression found after statistical analysis for each phase as follows: S phase peak (blue), G2 (yellow), M phase (red), and G1 (green). Both the expression values of different genes (493 values) and different experiments (nine) were used as variables to calculate the PCA. Data were normalized across genes and experiments. SOM analysis was performed using a 4 × 4 map (or matrix). A dendrogram was created after the absolute expression pattern of each gene was normalized across the experiment by dividing the absolute signal at each time point by the maximum value for the same gene, independently of whether it was called present or absent by MASuite. The hierarchical clustering analysis was performed using the unweighted pair group method with arithmetic averages (large N/p) (51).

FIG. 1 (partial legend). Each block represents a sample taken at 1-h intervals from T0 (time of release) to T19 (19 h later). B, DNA histogram of flow analysis results in A. C, LI determination of S phase (◆) and metaphase/anaphase index determination of metaphase and anaphase cells (□). D, comparative mRNA analysis of gene expression by Northern blot and microarray analysis. Expression was normalized for microarray analysis by dividing the absolute detected signal by the maximum of expression (◆). Signals after Northern blot analysis were quantified using NIH Image 1.62. The level of expression (in arbitrary units) was normalized by correcting against a loading control and expressing as a proportion of the maximum signal (□).
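The normalization to each gene's maximum and the average-linkage (unweighted pair group) clustering described here can be illustrated with a minimal Python sketch. It is not the GeneMaths implementation: the function names are hypothetical, Euclidean distance is an assumption (the distance measure is not stated in this section), and any profiles passed in are the caller's own data.

```python
import math

def normalize_to_max(profile):
    """Divide every time point by the gene's own maximum signal."""
    peak = max(profile)
    return [value / peak for value in profile]

def euclidean(a, b):
    # Assumed distance measure; the text does not specify one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def upgma(profiles):
    """Average-linkage clustering of {name: normalized profile}.

    Returns the merges in order as (members_a, members_b, distance).
    """
    names = list(profiles)
    clusters = {i: (name,) for i, name in enumerate(names)}
    dist = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            dist[(i, j)] = euclidean(profiles[names[i]], profiles[names[j]])
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        # Join the two closest clusters.
        (a, b), d = min(dist.items(), key=lambda item: item[1])
        merges.append((clusters[a], clusters[b], d))
        size_a, size_b = len(clusters[a]), len(clusters[b])
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        # Distance to the new cluster is the size-weighted mean of the
        # distances to its two parts (the "arithmetic average" in UPGMA).
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = merged
        next_id += 1
    return merges
```

Because each profile is scaled to its own maximum, genes with similar timing but very different absolute signal end up close together, which is the point of normalizing before building the dendrogram.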
Data Base Search to Identify Regulatory Elements within the Promoter Region-The data base tool patmatch (available on the World Wide Web at www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl) was used to search the promoter region 1 kb upstream of each open reading frame of the selected 493 genes. The following consensus sequences were used to search for regulatory motifs: E2F (TTTYYCGYY), mitotic-specific activation (YCYAACGGYY), Oct (CGCGGATC), and Hex (CCACGTCA).
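As an illustration of how consensus sequences of this kind can be matched against a promoter, the following minimal Python sketch expands the IUPAC code Y (C or T) into a regular expression and scans both strands. It is not the patmatch tool itself; the function names are hypothetical, and only the four consensus strings are taken from the text.

```python
import re

# IUPAC nucleotide codes as regex character classes (a subset is sufficient here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Consensus sequences quoted in the text.
MOTIFS = {
    "E2F": "TTTYYCGYY",
    "MSA": "YCYAACGGYY",   # mitotic-specific activation
    "Oct": "CGCGGATC",
    "Hex": "CCACGTCA",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus)

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motifs(promoter):
    """Return {motif: [(strand, 0-based start on the given strand), ...]}."""
    promoter = promoter.upper()
    rc = reverse_complement(promoter)
    hits = {}
    for name, consensus in MOTIFS.items():
        # The lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
        found = [("+", m.start()) for m in pattern.finditer(promoter)]
        for m in pattern.finditer(rc):
            # Convert the reverse-strand hit to forward-strand coordinates.
            found.append(("-", len(promoter) - m.start() - len(consensus)))
        hits[name] = found
    return hits
```

In practice the function would be applied to the 1 kb upstream of each open reading frame; searching both strands is a design choice of this sketch and may differ from the settings used in the original search.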

RESULTS
Cell Cycle Progression after Aphidicolin-induced Synchrony-Analysis of gene expression during the cell cycle is predicated on effective synchronization and analysis of cell cycle progression. In many plant systems, the fungal toxin aphidicolin has been found to be an effective method of reversibly blocking cell cycle progression (22,52,53). It inhibits both DNA polymerase α and δ (54) and therefore blocks cell cycle progression in early S phase. Removal of the inhibitor by washing leads to release of the block and the synchronous resumption of S phase and progression through the cell cycle. However, Arabidopsis cell culture systems have proven remarkably recalcitrant to efficient synchronization using this or other methods (29,55,56). We recently developed techniques for aphidicolin synchronization of the Arabidopsis Landsberg erecta cell line MM2d (22), which was used for the synchronization experiments reported here.
After treatment of MM2d cells with aphidicolin for 24 h and subsequent washing to remove the block, cell cycle progression was followed by flow cytometry over a 19-h period (Fig. 1, A and B). The majority of cells are arrested in G1/early S phase (G1/S, G1 = 1N) directly after release of the block. Within 1 h of removal of aphidicolin, a peak corresponding to a progressive increase in DNA content of S phase cells indicated that the majority of cells proceed synchronously through S phase (Fig. 1A). The DNA content of this S phase peak increased steadily before reaching the G2 phase DNA content (2N) after 5 h. Peak analysis of the flow cytometry data shows that 77% of the cell population is in S phase 1 h after block release, and a maximum of more than 90% is found in G2 after 7-8 h (Fig. 1B). At 8 h, a rapid increase in the number of metaphase cells is observed.
Synchrony was also monitored by pulse labeling with BrdUrd and detection of newly synthesized BrdUrd-containing DNA using immunocytochemistry and indirect immunofluorescence to identify S phase cells actively synthesizing DNA (22). The proportion of BrdUrd-positive cells observed is defined as the labeling index. Independent labeling index determination of the same samples confirmed the level of S phase synchrony measured by flow cytometry, showing a labeling index peak of 76% observed 2 h after release (Fig. 1C). The metaphase/anaphase index reaches a peak value of around 11% at 11-12 h after release of the block. It should be noted that only cells in metaphase and anaphase were scored for the metaphase/anaphase index, and these phases together represent only around 35-40% of the total duration of mitosis. Other mitotic phases are difficult to score routinely because of the small genome size and late condensation of Arabidopsis chromosomes in prophase (57).
Differential Analysis of Gene Expression-RNA was prepared from samples taken immediately after washing to remove aphidicolin (0 h) and at 2-h intervals until 16 h, followed by a final sample at 19 h. RNA was labeled and hybridized to high density Affymetrix GeneChip DNA arrays that contain ~8250 gene sequences and expressed sequence tags according to the manufacturer's instructions (Affymetrix). The hybridized chips were then analyzed, and genes were filtered as described under "Experimental Procedures." Of the ~8250 genes and expressed sequence tags represented on the chip, 4010 passed the biological variation filter, indicating that they were both reliably detected on at least one chip ("present" call) and showed a change from the expression level at time 0 ("difference" call).
Previous analysis of microarray data has found that random variation can produce apparently systematic patterns of expression (49), throwing doubt on earlier identification of cell cycle-regulated genes in human fibroblasts (24). We therefore confirmed the existence of periodicity in our data set by creating a control set of randomized data (see "Experimental Procedures") for the 4010 genes passing the first filter. Fourier-PVE values, indicating the degree of cyclicity in the data, and the phase of expression were determined. Fig. 2A shows a plot of the PVE values of the 1000 most strongly expressed genes (highest mean expression; mean selection, left) or of the 1000 genes with the largest standard deviation in expression (S.D. selection, right) against the similarly ranked genes from the random set. In such a plot, points close to or above the diagonal show that there is similar or even less periodicity in the experimental data than in the random data. In contrast, points below the diagonal indicate periodic expression. Fig. 2A demonstrates considerable periodicity in the experimental data. Thus, the successful synchronization of the cell culture was confirmed, and many genes are indeed expressed in a cell cycle-dependent manner. Similar to the observations of Shedden and Cooper (49), we observe stronger periodic expression among the genes with the highest S.D. than among the genes with the highest mean expression.

Subsequently, we compared the distribution of phases between experimental and randomized data. Data in Fig. 2B show that the timing of maximal expression of the randomized data is relatively evenly distributed throughout the period of the experiment, as expected for effectively randomized data. In contrast, the distribution of phases in the experimental data strongly deviates from that of the random data. In particular, very few genes had expression maxima in G2 phase (Fig. 2C).

FIG. 2. Periodicity and phases in observed and randomized data. A, the 1000 genes with the highest average expression (left) or the 1000 genes with the highest S.D. in expression (right) are compared with randomized gene expression values. The Fourier PVE was calculated for the selected genes and for 1000 randomized genes. Lists of PVE values were ordered according to size, and randomized values were plotted versus experimental values. Points below the diagonal indicate that there is more cyclicity in the experimental genes than in the randomized genes, whereas points above the line indicate that there is more cyclicity in the randomized genes than in the experimental genes. Similar results are obtained for the entire gene set. B, maxima of expression were calculated from the fit of data to a sine wave. Distributions of phases are displayed for all 4010 experimental time series as well as for 4010 randomized time series for one entire cell cycle. C, display of density distribution for all 4010 experimental time series as well as for 4010 randomized time series for one entire cell cycle.
We then used the distribution of PVE values of the control data set to select those genes showing significantly higher periodic expression in the experiment than expected from the random data (p < 0.05). This defined 493 gene signals (12% of the 4010 expressed) as having a high probability of exhibiting significant regulation during the experiment. These included 213 gene signals with a peak of expression during S phase, nine peaking in G2, 135 in mitosis, and 136 in G1. The distribution of phases for the 493 selected signals was similar to that of the entire set of 4010 genes (data not shown). Although these genes were characterized by a low probability that their cyclical behavior was due to chance fluctuations, Fig. 2, B and C, demonstrates that a proportion of other genes among the set of 4010 is also likely to be expressed in a cell cycle-dependent manner. A number of the 493 gene signals identified as significantly regulated are represented by independent oligonucleotide sets on the array. Twenty-six genes have duplicated oligonucleotide sets, and two have triplicated sets, resulting in a total of 463 different genes identified.
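The selection step can be sketched in Python under explicit assumptions. Here PVE is taken to be the proportion of variance explained by a least-squares fit of a single sinusoid with a fixed period, and significance is estimated by permuting the time order of each gene's own measurements. The function names, the per-gene permutation scheme, and the period argument are assumptions of this sketch, not the authors' procedure, which compares against the pooled PVE distribution of a randomized control data set.

```python
import math
import random

def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def fourier_pve(times, values, period):
    """Proportion of variance explained by a*cos + b*sin + constant."""
    omega = 2 * math.pi / period
    # Centering the regressors and the data is equivalent to fitting an
    # intercept, which matters because the sampling times are uneven.
    c = _centered([math.cos(omega * t) for t in times])
    s = _centered([math.sin(omega * t) for t in times])
    y = _centered(values)
    scc = sum(a * a for a in c)
    sss = sum(a * a for a in s)
    scs = sum(a * b for a, b in zip(c, s))
    scy = sum(a * b for a, b in zip(c, y))
    ssy = sum(a * b for a, b in zip(s, y))
    syy = sum(a * a for a in y)
    det = scc * sss - scs * scs
    if syy == 0 or det == 0:
        return 0.0
    # Solve the 2 x 2 normal equations for the cosine and sine coefficients.
    b_cos = (sss * scy - scs * ssy) / det
    b_sin = (scc * ssy - scs * scy) / det
    return (b_cos * scy + b_sin * ssy) / syy

def permutation_p_value(times, values, period, n_perm=1000, seed=0):
    """Fraction of time-shuffled profiles at least as periodic as the observed one."""
    rng = random.Random(seed)
    observed = fourier_pve(times, values, period)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if fourier_pve(times, shuffled, period) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

A gene would be retained when its p value falls below 0.05. With only about ten time points per gene, pooling permuted profiles across all genes, as described in the text, gives a smoother null distribution than the per-gene version shown here.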
To confirm the reliability and sensitivity of the results obtained from the microarray analysis, the expression patterns of several genes known to be cell cycle-regulated (22) were determined by RNA gel (Northern) blot and compared with the normalized expression data from the microarray analysis (Fig. 1D). Cell cycle regulation of histone H4, CYCD2;1, CYCD3;1, CDKA, CDKB1, and CDKB2 could be readily detected both by Northern blot and among the 4010 expressed genes on the microarray. These comparative analyses clearly demonstrate that the expression profiles obtained by both methods show strikingly similar timing of their peak values and overall pattern, although small variations may be seen for individual time points. Particularly striking is the clear difference between the timing of expression of CDKB1 and CDKB2 detected by both methods.
Principal Component Analysis-PCA was performed to analyze the extent to which the variation in expression seen among the 493 gene signals can be attributed to a limited number of variable components (58). Briefly, PCA can simplify the analysis and visualization of multidimensional data sets by determining key variables that explain the differences in the observations. The matrix to be analyzed using our data set has 493 rows of genes and nine columns of conditions corresponding to each of the measured time points. Fig. 3A is a plot of the observed variances in all nine principal components. The first two principal components account for 72% of the total variability observed in our data. Plotting all 493 genes onto the first and second principal components showed that all labeled genes fall into distinct quadrants (Fig. 3B). In addition, the position of known cell cycle-regulated genes, such as histones, mitotic cyclins, and CDKs, was identified. This result clearly demonstrates that annotated S phase and M phase genes have strikingly different locations in space after PCA. Thus, PCA confirms that the simple assignation of phase specificity by peak value is indicative of co-regulated genes and that the majority of variation observed can be explained by two principal variables.

FIG. 3. Dimensionally reduced expression data after PCA. For the 493 genes showing significant periodical expression profiles, the signal log ratios after comparative analysis against the sample taken directly after block release (T0) were subjected to PCA. A, plot of variance (percentage) of the nine principal components. Most of the variance in the cell cycle data set is contained in the first three principal components. B, the rotated and dimensionally reduced expression data of all 493 genes plotted on the first and second principal components. After statistical analysis, genes were color labeled (S phase-specific genes blue; G2 phase-specific genes yellow; M phase-specific genes red; G1-specific genes green). The positions of known cell cycle-associated genes in the two-dimensional space after PCA were identified and labeled as indicated.
Cluster Analysis of Gene Expression-Since PCA demonstrated considerable structure in the expression data, clustering tools based on hierarchical neighbor joining and self-organizing maps were used to identify groups of co-regulated genes.
The relatedness of expression patterns of the 493 gene signals identified as differentially expressed was assessed by creating a dendrogram based on normalized expression levels of the absolute detected signal. This hierarchical cluster analysis clearly shows that different groups of genes show peaks of expression at specific time points throughout the time course (Fig. 4A, dark red). Abridged branch analysis resulted in the creation of sub-branches or nodes (Fig. 4A, A-G), which reflect differences in expression pattern and timing.
The hierarchical cluster analysis is based on consecutive pairwise comparisons, and although it is useful for grouping genes based on similarity of expression timing, it may not reflect the diversity of different regulatory patterns. The data set was therefore also clustered using SOMs, a neural network useful for clustering large data sets by classifying entries in a two-dimensional space or map. For this data set, SOM analysis using a 4 × 4 matrix resulted in the optimal classification of observed gene signals, as shown in Fig. 4B, where the proportion of genes in each cluster having peak expression in different phases is indicated (S phase, blue; G2, yellow; M, red; G1, green). Both hierarchical branch and cluster definitions are provided in the data tables (Tables I-IV).
The reliability of the hybridization was assessed by examining the distribution of duplicated and triplicated genes between different cell cycle phases (S, G2, M, G1), based on peak expression time, between different nodes on the hierarchical cluster and different SOM clusters. Of the 28 replicated genes, only five pairs were not assigned to the same phase by peak expression, seven pairs were not assigned to the same branch node by hierarchical clustering, and, using the most stringent clustering assessment, 19 gene pairs were assigned to the same cluster group. Of the remaining nine pairs, all but one pair were assigned to groups showing very similar trends (groups 15 and 16, 2 and 3, 8-12, 8-16, and 1-6). Taken together with the comparison of signals with Northern data above, the analysis of duplicates shows highly reproducible detection and assignment of expression patterns. We also conclude that the SOM cluster analysis reliably assigns duplicate signals for the same gene to the same expression pattern in >65% of cases and to very similar patterns in >96% of cases observed.
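The counts quoted for replicated probe sets can be checked with a few lines of arithmetic. The Python sketch below uses only numbers stated in the text; the function and variable names are hypothetical.

```python
def replicate_summary(signals=493, duplicated=26, triplicated=2,
                      same_cluster=19, similar_cluster=8):
    """Recompute the gene count and the SOM agreement percentages.

    Each duplicated gene contributes one extra signal and each triplicated
    gene contributes two, so those extras are subtracted from the signals.
    """
    distinct_genes = signals - duplicated - 2 * triplicated
    replicated = duplicated + triplicated
    pct_same = 100 * same_cluster / replicated
    # "All but one" of the nine remaining pairs fall in very similar groups.
    pct_same_or_similar = 100 * (same_cluster + similar_cluster) / replicated
    return distinct_genes, replicated, pct_same, pct_same_or_similar

genes, replicated, pct_same, pct_similar = replicate_summary()
print(genes, replicated, round(pct_same, 1), round(pct_similar, 1))
```

The result agrees with the figures in the text: 463 distinct genes, 28 replicated genes, and agreement rates just above 65% and 96%.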
Known Cell Cycle-regulated Genes-Although rather few genes are known to be cell cycle-regulated in Arabidopsis, there is direct evidence for regulation of histones (53,59), mitotic cyclins (19, 60-63), B-type CDKs (20,22), proliferating cell nuclear antigen, and the CDC6 protein involved in initiation of DNA replication (16,64). We examined the extent to which two classes of likely co-regulated genes were identified as cell cycle-regulated and whether known co-regulated genes were assigned to the same or similar clusters. The majority of histone genes are expressed primarily in S phase. 14 histone genes are represented on the Affymetrix array, of which 14 were detected as expressed and 10 different genes were identified within the set of cell cycle-regulated genes. All 10 regulated histones fall into the very similar clusters 12 (two genes), 15 (one duplicated gene signal whose pair is in 16), and 16 (eight gene signals), indicating high frequency of identification of histones and robust assignment to clusters. All histone signals are within branches A-C of the hierarchical tree. Interestingly, the only two H2A genes are both assigned to cluster 12, suggesting differential regulation compared with other histones.

FIG. 4 (partial legend). [...] Tables I-IV. B, SOM analysis of gene expression. As described in Fig. 3, genes were color-labeled to identify which chosen map or matrix results in optimum classification of the predefined phase-specific genes in different cluster models (S, blue; G2, yellow; M, red; G1, green). After SOM calculation using a 4 × 4 matrix, each node in the SOM is represented by a colored circle. The 16 hypothetical profiles obtained are displayed (clusters 1-16) with the number of genes found in each cluster.

TABLES I-IV (partial). Rows list AGI code, description, hierarchical branch, SOM cluster, and a trailing number where one is printed.
Linker histone protein, putative C c16 1
At4g30860 Putative protein B c8
At2g43590 Putative endochitinase B c16
At3g54640 Tryptophan synthase α chain C c12
At1g02920 Glutathione S-transferase, putative C c16
At3g23340 Putative casein kinase I C c8
At2g43510 Putative trypsin inhibitor C c12 2
At4g22690 Cytochrome P450-like protein B c16
At1g14900 Linker histone protein, putative C c15 1
At2g43570 Endochitinase isolog B c12
At4g36990 Heat shock transcription factor HSF4 B c8
At4g25900 Possible apospory-associated-like B c12
At3g54560 Histone H2A,F A c12 1
At1g21000 Unknown protein B c16 1
At2g45300 (EPSP) synthase C c12
At1g65470 Hypothetical protein B c4 1
At1g07610 Metallothionein Calnexin-like protein C c7
At2g24180 Putative cytochrome P450 B c12
At2g44890 Putative cytochrome P450 B c8
At2g02930 Putative glutathione S-transferase B c16
At2g30620 Histone H1 C c16 1
At1g78830 Hypothetical protein B c16
At4g38540 Monooxygenase Putative PHD-type zinc finger C c12
At3g14940 Phosphoenolpyruvate carboxylase B c4 1
At4g32400 Adenylate Putative nucleotide-sugar dehydratase B c8
At4g29520 Putative protein B c8
At1g65460 Hypothetical protein, 5′ partial B c7
At2g39420 Putative Putative glutathione peroxidase C c8
At3g54960 Protein disulfide-isomerase-like C c7
At4g23640 Putative potassium transport protein (TRH1) C c11 1
At5g06730 Peroxidase B c8
At4g01020 Pombe.PID:g1439562 B c8
At4g19420 Putative pectinacetylesterase B c7 1
At5g47220 Histone H2B B c16
At3g51030 Thioredoxin h A c12
At4g36640 Putative protein C c8 1
At2g45300 (EPSP) synthase C c8
At4g34230 Cinnamyl alcohol dehydrogenase-like B c7
At3g16530 Putative Hypothetical protein B c16
At2g40890 Putative cytochrome P450 C c12 1
At4g02110 Unknown protein B c16 1
At4g39540 Shikimate kinase-like protein B c8 1
At2g22250 Putative aspartate aminotransferase C c12 1
At1g09430 Unknown protein B c8
At2g46350 Hypothetical protein C c11 1
At2g44160 Putative methylenetetrahydrofolate reductase C c7 1
At2g29420 Putative glutathione S-transferase C c11
At5g12020 Heat shock protein 17.6-II C c12
At4g24020 Putative protein B c16
At4g22770 Putative DNA-binding protein C c16
At1g79450 Hypothetical protein B c4
At1g21750 Putative protein-disulfide isomerase precursor C c11
At1g02500 S-Adenosylmethionine synthetase C c8 1
At2g22420 Putative peroxidase C c11
At2g24490 Putative replication protein A1 A c8
At2g33630 Putative steroid dehydrogenase B c8
At4g26910 Putative dihydrolipoamide succinyltransferase C c8
At2g20980 Hypothetical protein C c7
At4g27230 Histone H2A-like protein B c16
Cinnamate-4-hydroxylase C c8
At3g09010 Putative receptor set B c8
At3g13790 β-Fructofuranosidase 1 C c8 1
At4g21810 Putative protein D c16
At2g42790 Putative citrate synthase C c15
At1g21760 Unknown protein A c16 1
At1g47710 Serpin, putative B c4
At1g06760 Histone H1, putative B c16
At1g05470 Hypothetical protein B c4
At1g30720 Putative reticuline oxidase-like B c4
At3g12500 Basic chitinase A c15
At4g35520 Putative Unknown protein B c4
At2g43020 Putative amine oxidase B c4
At2g28740 Histone H4 C c16
At4g32140 Putative protein B c4
At2g42750 Unknown protein D c8
At4g21810 Putative protein A c8
At5g26340 Hexose transporter-like B c16
At2g30490 Cinnamate-4-hydroxylase C c8
At4g24520 ATRI D c7
At4g01700 Putative chitinase B c8
At2g01680 Unknown protein C c12
At3g59970 MTHFRI C c11
At4g21910 Putative protein B c8
At2g38860 Unknown protein B c8
At4g16370 Isp4 like protein C c12
At1g59740 Oligopeptide transporter, putative C c16
At3g13790 β-Fructofuranosidase 1 C c8 1
At2g38810 Histone H2A C c12
At2g45330 Unknown protein A c12
At2g36320 Unknown protein C c15
At2g45500 Hypothetical protein B c4
At5g07360 Putative amidase C c11
At4g03020 Putative WD-repeat protein B c11
At2g22430 ATHB-6 C c15
At2g27170 Putative chromosome assoc. protein C c11
At1g75750 Unknown protein C c15
At4g17890 Putative protein B c16
At1g23560 Hypothetical protein B c4
At5g54140 IAA-amino acid hydrolase homolog ILL3 C c16 1
At2g22420 Putative peroxidase C c11
At4g35220 Putative protein C c12
At5g06910 DnaJ homologue (gb AAB91418.1) A c8
At3g52060 Putative protein C c15
At2g22480 Putative pyrophosphate-fructose-6-phatase A c12
At4g18710 Shaggy-like protein kinase etha C c12
At4g34100 Putative protein C c12
At4g21070 Putative protein (fragment) B c16
At5g05730 Anthranilate synthase component I-1 B c4
At4g17900 Putative protein B c4 1
At2g22470 Unknown protein A c8
At2g37110 Unknown protein C c12
At2g02390 Putative glutathione S-transferase B c4
At4g14630 Germin precursor oxalate oxidase B c8
At2g27190 Purple acid phosphatase precursor C c15
At1g53540 17.6-kDa heat shock protein (AA 1-156) C c8
At3g29200 Chorismate mutase B c4
At2g29500 Putative small heat shock protein A c12
At4g21070 Putative protein (fragment) D c16
At1g60420 Unknown protein B c4
At5g19530 Spermine synthase (ACL5) A c4
At5g47040 Mitochondrial Lon protease homolog 1 B c7 1
At2g33700 Putative protein phosphatase 2C B c4
At4g00300 Awaiting functional assignment B c4
At4g15070 Hypothetical protein C c15
At2g46340 Putative photomorphogenesis repressor D c15 1
At3g13870 Root hair-defective 3 (RHD3) B c8
At4g32160 Putative protein D c4 1
At1g55530 Putative protein B c4
At3g52850 Spot 3 protein and vacuolar sorting A c15 1
At2g22860 Unknown protein B c3
At2g39550 Putative geranylgeranyl transferase type B c4
At1g54110 Unknown protein B c4
At3g50780 Putative protein C c11
At2g37650 Putative SCARECROW gene regulator A c7 1
At2g41790 Putative zinc protease B c4
In contrast, CDC6, also previously reported as S phase-regulated (16,64), shows clearly different expression in branch D and cluster 7, indicating that it is down-regulated in mitosis and up-regulated during G1 of the second cycle. Mitotic cyclins of both A and B classes are primarily expressed during G2 and M in Arabidopsis and in other plants (31,32,65). Nine mitotic cyclins are present on the chip, of which all nine are detected as expressed, and eight gene signals (representing seven distinct genes) are defined as cell cycle-regulated in this experiment. All fall into branch E and the very similar clusters 9, 13, and 14. Expression of CDKB1 and CDKB2 also peaks in mitosis (Fig. 1D) (22), with CDKB1 showing earlier expression (branch D, cluster 15) than CDKB2 (branch E, cluster 13), which is thus co-regulated with mitotic cyclins. The robust identification of mitosis-specific genes validates the synchrony of the culture used, despite the relatively low metaphase/anaphase index recorded for the reasons discussed above.
Novel Regulated Genes-The genes identified as regulated fall into a wide range of cellular processes as defined by MIPS based on collapsed automatically derived functional categories (available on the World Wide Web at mips.gsf.de/proj/thal/db/tables/tables_func_frame.html; Fig. 5). Genes that are most highly expressed at the time of aphidicolin removal (t = 0 h) are grouped in sub-branch A (31 genes). The application of aphidicolin for 24 h and the treatment of cells with fresh medium during washing are likely to induce stress responses. It is therefore not surprising that we identify potential stress-associated genes within this cluster, including chitinases, peroxidase, glutathione transferase, proteolysis-related genes (F-box protein, serine carboxypeptidase), and heat shock-related proteins (see Tables I-IV for details). Nevertheless, we also observe expression of genes likely to be involved in S phase, such as histone H2A.F/Z, already known to be cell cycle-regulated at the G1/S boundary in Arabidopsis suspension cultures (29), proliferating cell nuclear antigen (cluster 11), and a DNA cytosine methyltransferase (cluster 12), which these results suggest are regulated genes. In sub-branch B, a large group of genes (147 genes) is found showing a peak of expression at 2 h after the block is released, corresponding to early to mid-S phase, including several genes involved in DNA metabolism and replication such as histones, a CDC50 homologue, and FAS1, which shows strong periodic regulation in cluster 4 (67). Also in node B/cluster 4 are found the mitogen-activated protein kinase AtMPK6, which is known to be involved in signaling of abiotic stress (68), as well as the mitogen-activated protein kinase kinase AtMKK2. In addition, a large group of genes are annotated as oxidative stress-responsive genes, such as peroxidases (3), chitinases (4), glutathione transferases (six total, all in nodes B/C), superoxide dismutase, and ethylene-responsive element-binding factors (2). Within the next node (C), 91 genes are clustered, which are highly expressed over multiple time points mainly in S phase, including further histone genes. A few genes in sub-branches C and D show high expression in S phase (2-6 h) and then decrease, with higher expression again seen in the last sample after 19 h. This expression profile is found, for example, for the CDC6 gene, which is specifically expressed in G1 and S phases (16,64), and the mitogen-activated protein kinase kinase kinase ATN1. ATN1 is related to mammalian transforming RAF kinases, but no biological role is known in plants (68). Two casein kinase I genes are also found in nodes B and C.

TABLES I-IV (partial). Rows list AGI code, description, hierarchical branch, SOM cluster, and a trailing number where one is printed.
At5g39950 Thioredoxin C c12
At2g32720 Putative cytochrome b5 C c12
At1g23020 Putative NADPH oxidase C c12
At1g59820 Chromaffin granule ATPase II homolog C c11
At2g31630 Putative SET-domain transcript regulator B c4
At1g10410 Unknown protein B c4
At1g20690 High mobility group protein, putative B c16
At2g01310 Unknown protein C c8 2
At1g05350 Unknown protein C c8
At2g47130 Putative alcohol dehydrogenase B c8
At1g09560 Germin-like protein B c8 1
At4g16660 HSP-like protein C c7
At2g31570 Putative glutathione peroxidase C c12
At5g40760 Glucose-6-phosphate dehydrogenase C c8
At1g67480 Unknown protein D c15
At2g17720 Putative prolyl 4-hydroxylase, α subunit B c8
At1g55920 Serine acetyltransferase A c12
At1g20690 High mobility group protein, putative B c8
At4g02050 Putative hexose transporter B c16
At5g44790 ATP-dependent copper transporter C c16
At2g44790 Phytocyanin B c4 1
At5g17990 Anthranilate phosphoribosyltransferase B c13
At2g30140 Putative glucosyltransferase B c12
At3g49120 Peroxidase B c8 2
At4g38540 Monooxygenase 2 (MO2) B c16
At3g51030 Thioredoxin h A c12
At4g36140 Putative disease resistance protein B c4
At1g09200 Histone H3 C c16
At4g35110 Putative protein B c8
At2g27350 Unknown protein C c16 1
At5g63570 Glutamate-1-semialdehyde2,1-aminomut, 1 B c4
At1g70250 Receptor serine B c4
At4g11600 Phospholipid hydroperoxide glutathione peroxide A c16
At2g45290 Putative transketolase precursor C c12 1
At2g36310 Hypothetical protein A c11
At2g37520 Unknown protein B c7 1
At2g34970 Putative translation initiation factor cIF-2B C c11
At2g33530 Putative serine carboxypeptidase II A c16
At4g39270 Receptor protein kinase-like protein B c4
At1g77220 Unknown protein A c11
At2g27200 Putative nucleotide-binding protein D c4
At4g02390 NAD+ ADP-ribosyltransferase C c16
At5g39950 Thioredoxin C c12
At2g37480 Unknown protein A c4
At4g16650 Growth regulator-like protein C c11
At1g74310 Heat shock protein 101 B c8
At3g08760 Putative protein kinase D c4
At4g01870 Predicted protein of unknown function B c4
At3g12500 Basic chitinase A c15
At2g43260 Hypothetical protein C c11
At2g34400 Hypothetical protein B c12
At4g39230 NAD(P)H oxidoreduct isoflavonereduct-like B c4
At5g10180 Sulfate transporter C c15
At4g27910 Putative protein B c12 1
In node E, 102 genes are grouped together, which are assigned as M phase-specific, belonging to clusters 9, 10, 13, and 14. In addition to mitotic cyclins and CDKs (see above), the Arabidopsis homolog of budding yeast CDC20 is within this node and cluster 13 and is one of the most highly regulated genes detected. CDC20 is one of two proteins required to activate the anaphase-promoting complex. Other genes with clear mitotic associations are three genes for kinesin heavy chain, two kinesin-like potential spindle proteins (At2g28620, At4g14330), a homolog of an extragenic suppressor of bimD6I involved in chromosome structure and segregation (69), a homolog of human TOG that targets CDK activity to microtubules in mitosis (70), a fimbrin involved in F-actin filament crosslinking (71), and two helicases. Putative regulatory proteins include a protein phosphatase 2C (At2g30020), MYB70, an AP2 domain protein (At3g16280), and an FCA-like protein (At2g47310).
Node F includes 71 gene signals whose expression peaks at the M/G1 boundary, and node G includes a further 27 genes expressed during G1 phase. Some genes in these nodes are also expressed in the early stages of the experiment and are hence in clusters 1 and 2. Notable in node F are the very highly regulated histidine kinases (HKs) encoding the cytokinin receptor CRE1 (AtHK4) and the osmosensor AtHK1 as well as the response regulators ARR4 and ARR7. Two MCM proteins required for prereplication complex assembly, CDC21 (72) and MCM5, which has an E2F site in its promoter as does human MCM5 (73), are both expressed at this time in cluster 5, as is the DNA mismatch repair protein MSH2, which associates with p53 in S phase in mammalian cells (74). In node G, the SKP1 homolog At2g03160 is found in cluster 3, indicating a G1/S expression pattern in both cycles. SKP1 functions as part of the SCF complex and regulates the destruction of G1 cell cycle regulators at the onset of S phase. The D-type cyclin CYCD4;1 (75), not previously known to be expressed or regulated in cell suspension cultures, is also found in this node.
Links to Other Cellular Processes-In addition to processes likely to be cell cycle-regulated based on studies in other organisms, the data hold clues to novel plant-specific processes that may be integrated with cell division control.
Plants coordinate nuclear division with mitochondrial and plastid duplication and segregation. Two genes related to yeast ABF2, a high mobility group protein involved in mitochondrial DNA segregation (76), both have very strong regulation in the mitosis peak node E/cluster 13. Since both ABF2 homologs (At4g23800, At4g11080) are predicted to be plastid-targeted (data not shown), this could provide a link between nuclear and plastid division. Moreover, the only gene in Arabidopsis for organelle methionyl-tRNA synthetase that provides both mitochondrial and chloroplastic activity (77) is also regulated in node F/cluster 5, suggesting a link to organelle protein synthesis.
The data set includes a number of genes suggesting links to hormone perception, biosynthesis, and response, including cytokinin, brassinosteroids, auxin, ethylene, and jasmonate. One of the most highly regulated genes detected is the cytokinin receptor AtHK4 (CRE1, WOL1), which shows strongly periodic expression in G1 (node F, cluster 1). It is thus expressed in early time points, decreases, and then increases in later time points. Interestingly, AtHK4 is co-regulated in the cell cycle with its downstream transcriptional targets ARR4 and ARR7, negative regulators of the cytokinin response presumably involved in a feedback mechanism (78,79). Since ARR4 is also linked to phytochrome and light signaling, it could provide a link between cytokinin, other signals, and the cell cycle (79,80). The coordinate response of the AtHK4, ARR4, and ARR7 cytokinin regulatory genes is consistent with the requirement for cytokinins for the G1/S transition through the regulation of CYCD3;1 expression (81) and suggests that roles for cytokinin at the G2/M transition (82) may be regulated by different genes.
DWARF1 encodes the enzyme that converts 24-methylenecholesterol to campesterol in brassinosteroid biosynthesis and is found in node F, cluster 5 with a G1 maximum. Brassinosteroid controls both cell growth and cell division (83) and regulates expression of the D-type cyclin CYCD3;1 (84,85), and the up-regulation of DWARF1 during G1 phase suggests a mechanism by which this may be mediated.
A unique aspect of cell division in plants is the close coordination required between cell wall synthesis and the cell cycle. It is therefore interesting to note that at least 23 of the genes identified have known or putative links to the cell wall or to biosynthesis of cell wall components and are found in several clusters. For example, expansins are a group of extracellular proteins that directly modify the mechanical properties of plant cell walls, leading to turgor-driven cell extension (91). Three expansin genes are detected as regulated, of which two are expressed in G1 (F, 5), whereas the third is expressed in both S and G1 phases (B, 2). Extensins are abundant proteins presumed to determine physical characteristics of the plant cell wall, and expression of one is found with a G1 peak (G, 3). RHD3 (ROOT HAIR-DEFECTIVE 3) encodes a putative GTP-binding protein required for appropriate cell enlargement in Arabidopsis (92), and RHD3 is found to be regulated (B, 8).
Since lignin biosynthesis is a major utilizer of methionine via S-adenosyl methionine (94), this may be linked to cell wall synthesis or alternatively reflect the control of methionine pools for protein synthesis. A number of genes identified provide clues to possible links with developmental and differentiation processes through genes previously identified as having developmental phenotypes when mutated. These include genes for an Argonaute (AGO1)-like protein (F, 5) possibly involved in RNA turnover processes (95), a homolog of tomato DEM1 (defective embryo and meristem) (96), which is mitosis-regulated and FCA-related (E, 14), and a NAM (no apical meristem)-like protein (F, 9) (97) as well as four scarecrow-like transcription factors (98), three of which are in nodes A or B.
Locus | Annotation | Node | Cluster
[locus missing] | Putative thaumatin | G | c6
At1g04300 | Unknown protein | C | c3
At4g25200 | Mitochondrion-localized small heat shock | A | c2
At1g11670 | Unknown protein | B | c6
At4g28350 | Receptor protein kinase-like protein | B | c3
At1g19050 | Response regulator 5, putative | G | c1
At2g30740 | Putative protein kinase | D | c7
At5g42650 | Allene oxide synthase (emb CAA73184.1) | G | c6
At4g35480 | RING-H2 finger protein RHA3b | B | c3
At2g29680 | Putative CDC6 protein | D | c7 | 2
At1g68560 | α-Xylosidase precursor | F | c5
At4g31340 | Putative protein | D | c3 | 1
At2g30880 | Unknown protein | G | c2
At1g04690 | Putative K+ channel, β subunit | G | c2
At1g34300 | Hypothetical protein | B | c2
At2g34080 | Putative cysteine proteinase | G | c1
At4g23850 | Acyl-CoA synthetase-like protein | B | c3 | 2
At4g35920 | Putative protein | F | c5
At5g59450 | Scarecrow-like 11-like | F | c6 | 1
At4g18910 | Major intrinsic protein (MIP)-like | B | c2
At2g04350 | Putative acyl-CoA synthetase | F | c2
At1g62180 | 5′-Adenylylphosphosulfate reductase, putative | B | c2
At2g41640 | Unknown protein | B | c3
At3g18525 | DNA mismatch repair protein MSH2, 5′ partial | F | c5
At2g32800 | Putative protein kinase | B | c2
At2g19860 | Hexokinase (ATHXK2) | B | c4
At1g79930 | Putative heat-shock protein | F | c5
At2g47650 | Putative dTDP-glucose 4-6-dehydratase | B | c3
At1g53840 | Pectinesterase, putative | G | c6 | 1
At3g51630 | MAP kinase | F | c2 | 1
At1g13220 | Putative nuclear matrix constituent protein | D | c1
At4g24280 | hsp 70-like protein | F | c5
At2g27880 | Argonaute (AGO1)-like protein | F | c5
At1g19730 | Thioredoxin | G | c6
At2g24160 | Putative disease resistance protein | B | c3
At3g55400 | Methionyl-tRNA synthetase (AtcpMetRS) | F | c5
At1g04440 | Putative casein kinase I | B | c3
At3g15353 | Putative protein | A | c2
At3g11170 | Omega-3 fatty acid desaturase, chloroplast precursor | G | c6
At5g06720 | Peroxidase (emb CAA68212.1) | G | c1
At4g29810 | MAP kinase kinase 2 | B | c4
At4g24190 | HSP90-like protein | F | c6
At5g51460 | Trehalose-6-phosphate phosphatase | B | c2
At1g59910 | Hypothetical protein | B | c3
At1g78650 | Hypothetical protein | F | c7
At1g19730 | Thioredoxin | A | c6
At2g16660 | Nodulin-like protein | G | c6
At2g40010 | 60S acidic ribosomal protein P0 | F | c6 | 1
At4g22010 | Pectinesterase like protein | F | c5
At4g29950 | Putative protein | B | c3 | 1
At1g14020 | Putative growth regulator protein | B | c2 | 1
At2g17480 | Similar to Mlo proteins from H. vulgare | F | c5
At2g33020 | Putative leucine-rich repeat disease resistance protein | G | c1
At2g30910 | Putative ARP2 | A | c3 | 1
At5g13490 | Adenosine nucleotide translocator | F | c6 | 1
At2g01610 | Unknown protein | F | c5
At5g17330 | Glutamate decarboxylase 1 (GAD 1) | F | c2
Cell Cycle-regulated Promoter Elements-Three main groups of regulatory elements have been described in plants that control cell cycle expression. E2F binding sites regulate expression by binding to activating or inhibitory E2F factors, which are themselves regulated by the recruitment of hypophosphorylated Rb to E2F sites, which inactivates expression (99). Phosphorylation of Rb in late G1 results in activation of E2F-regulated genes including ribonucleotide reductase (100) and CDC6 (16,64). Expression of mitotic cyclins has been shown to depend on specific elements conferring mitotic-specific activation in tobacco and Arabidopsis (61), whereas histone gene expression depends on octamer (Oct) and hexamer (Hex) elements (59). We searched the regions 1 kb upstream of each open reading frame within the regulated gene set for the E2F (TTTYYCGYY), mitotic-specific activation (YCYAACGGYY), Oct (CGCGGATC), and Hex (CCACGTCA) consensus sequences. Oct and Hex sequences were found in only a few genes, mostly histones. The relatively loose E2F and mitotic-specific activation consensus sequences identified a relatively large number of genes, not all of which are likely to be regulated by these factors. However, when we examined the distribution of detected sites between different clusters (Table V), we found that clusters contained either relatively few or relatively frequent sites.
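The consensus search described above can be illustrated with a minimal Python sketch. It assumes that each consensus is expanded from IUPAC codes (Y = C or T) into a regular expression, that the E2F consensus is TTTYYCGYY, and that both strands of the upstream region are scanned. The text does not state whether both strands were searched, so that choice, the function names, and the toy gene identifiers in the usage note are all hypothetical.

```python
import re

# IUPAC codes needed for the consensus sequences quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

# Consensus sequences as given in the text (MSA = mitotic-specific activation).
CONSENSUS = {
    "E2F": "TTTYYCGYY",
    "MSA": "YCYAACGGYY",
    "Oct": "CGCGGATC",
    "Hex": "CCACGTCA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus into a regex; the lookahead lets
    overlapping sites be counted."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")


def count_sites(upstream, consensus):
    """Count consensus matches on both strands of an upstream region.
    Assumption: scanning both strands is an illustrative choice; a site
    matching the consensus on both strands is counted twice."""
    seq = upstream.upper()
    pattern = consensus_to_regex(consensus)
    forward = len(pattern.findall(seq))
    reverse = len(pattern.findall(reverse_complement(seq)))
    return forward + reverse


def genes_with_site_by_cluster(genes, consensus):
    """genes maps a gene ID to (cluster label, upstream sequence).
    Returns {cluster: (genes with at least one site, total genes)}."""
    tally = {}
    for cluster, upstream in genes.values():
        with_site, total = tally.get(cluster, (0, 0))
        has_site = count_sites(upstream, consensus) > 0
        tally[cluster] = (with_site + int(has_site), total + 1)
    return tally
```

For example, with hypothetical entries `{"geneA": ("c13", "AAATTTCCCGCCAAA"), "geneB": ("c13", "AAAAAAAA")}`, calling `genes_with_site_by_cluster(genes, CONSENSUS["E2F"])` reports one of the two cluster 13 genes as carrying a site. Because the E2F and mitotic-specific activation consensus sequences are short and degenerate, matches of this kind are candidates only, as the text itself cautions.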
In all cases except cluster 13, the clusters with frequent E2F sites are distinct from those with larger numbers of genes with mitotic-specific activation sites, suggesting that these generally confer regulation at different times in the cell cycle. Cluster 13 represents mitotic peaking genes and may suggest a role for E2F in regulating expression of genes peaking in G2/mitosis as previously found for mammalian E2F-regulated genes (39).
DISCUSSION
Here we present for the first time the results of a wide scale analysis of regulated gene expression in a plant cell cycle synchronized culture. The results demonstrate that a large number of plant genes are likely to show cell cycle-dependent regulation of their expression. The identified genes are involved in a wide range of cellular processes including cell cycle control, cytoskeleton, transcription, proteolysis, phosphorylation, signal transduction, biosynthesis, carbon and amino acid metabolism, hormone response, and organelle function (Fig. 5 and Tables I-IV). Shedden and Cooper (49) have shown that microarray analysis is prone to random fluctuations, which can be interpreted as consistent regulation. We find that the data from this experiment show significantly greater regulation than a control randomized data set. We have applied statistical analysis to identify 463 genes among 4010 passing initial filters with a high probability of showing significant regulation (p < 0.05), and over 200 of these are significant (p < 0.01).
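The idea of comparing observed regulation against a randomized control can be sketched as a permutation test on a single expression time course. This section does not specify the statistic used in the analysis, so the lag-1 autocorrelation score below is a stand-in chosen for illustration, and the function names and default parameters are hypothetical. The sketch shuffles the order of the time points, which destroys any smooth temporal pattern while preserving the values themselves.

```python
import random


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a time course; smooth, cell cycle-like
    profiles score high, noisy ones score near or below zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0  # a flat profile carries no temporal structure
    cov = sum((values[i] - mean) * (values[i + 1] - mean)
              for i in range(n - 1))
    return cov / var


def permutation_p_value(values, n_perm=1000, seed=0):
    """Empirical p-value: the fraction of time-shuffled profiles whose
    score is at least as high as the observed one (with the usual +1
    correction so the estimate is never exactly zero)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = lag1_autocorrelation(values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lag1_autocorrelation(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Applied to every gene passing the initial filters, a test of this kind yields a list of genes below a chosen threshold such as p < 0.05. With several thousand genes tested, some correction for multiple testing would normally be considered as well; that step is not shown here.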
Shedden and Cooper (49,66) have also criticized analysis of cell cycle expression because of the perturbations caused by synchronization methods. It is clear that the synchronization carried out here does indeed cause induction of some stress-related genes. However, the procedure was developed to minimize stress, and indeed Arabidopsis cells readily arrest division. Moreover, we have identified almost all Arabidopsis genes previously known to be cell cycle-regulated, including genes whose cell cycle regulation has been demonstrated in vivo by in situ hybridization and is therefore independent of synchronization procedures (30). A large number of genes also fall into clusters not consistent with a simple stress response due to their periodic response to cell cycle position, and in particular clusters indicative of roles in G2/M or G1/S processes.
It is interesting to note that, compared with the analysis in mammalian cells, we identify large numbers of genes regulated during G1 phase (136 genes). This may reflect a greater role for transcriptional control in G1 in plants. It is also likely that G1 control in plants must integrate a larger number of potential signals due to the multiple developmental and environmental influences on commitment to cell division. We conclude that the analysis has not only successfully identified known cell cycle-regulated genes but also identified as cell cycle-regulated a number of other genes involved in cell cycle progression, DNA replication and its control, and cytoskeletal processes that might be suspected to be regulated but for which no evidence has previously existed. In addition, a number of novel controlling genes including kinases, phosphatases, and transcription factors have been identified as well as links to genes previously known for their role in differentiation or developmental processes.
FIG. 5. Classification of cell cycle-regulated genes in functional categories. The frequency of annotated genes having defined automatically derived collapsed functional categories was identified as indicated. It should be noted that one gene can have more than one annotated function.