A predictive scale for evaluating cyclin-dependent kinase substrates. A comparison of p34cdc2 and p33cdk2.

Protein phosphorylation by members of the Cdk (cyclin-dependent kinase) family of protein kinases is necessary for progression through the cell cycle. However, the primary sequence determinants of Cdk substrate specificity have yet to be examined quantitatively. We have used a panel of glutathione S-transferase peptide fusions to investigate the fine-structure specificity of p33cdk2 and p34cdc2. Our data indicate that the generally held consensus sequences for p34cdc2 represent a significant oversimplification of its true specificity and that this specificity is conserved between species. p33cdk2 and p34cdc2 have similar but distinct substrate specificities that are affected modestly by the associated cyclin subunit. We derive specific values of phosphorylation efficiencies by these enzymes that can be used to estimate the phosphorylation potential of proposed Cdk substrates.

The cell cycle consists of a series of strictly ordered steps, requiring the completion of one event before the next can occur. The protein kinases that control entry into and progression through various stages of the cell cycle are members of the Cdk (cyclin-dependent kinase) subfamily of protein kinases. Cdk activities fluctuate as a result of post-translational modifications and protein-protein interactions. An active Cdk is formed after binding to a cyclin partner and phosphorylation on a key threonine (Thr-161 in human p34cdc2). In vertebrates, Cdk4-cyclin D is necessary for passage through G1, p33cdk2-cyclin E is necessary for the transition from G1 to S phase, p33cdk2-cyclin A is necessary for progression through S, and p34cdc2-cyclin B is necessary for the transition from G2 to M phase (1).
Crucial to our understanding of the cell cycle is the ability to identify, for the various Cdk-cyclin complexes, the key substrates whose phosphorylation leads to progression through a particular cellular event. Many of these downstream effects could be caused directly by the Cdk; for example, p34cdc2-cyclin B can phosphorylate lamins, thus leading to their disassembly (2-4), an important event in the initiation of mitosis. Other effects could be indirect, the result of a cascade of events initiated by the Cdk; for example, Cdk4-cyclin D phosphorylates Rb, thus releasing E2F to promote the transcription of many genes important for DNA replication (5).
An understanding of the basis of substrate specificities of different Cdk-cyclin complexes is of central importance, as specificity can be influenced by many factors. Obviously the choice of a phosphorylation target site will be influenced strongly by inherent differences in the substrate binding region of a particular Cdk (6-9). In addition, the cyclin subunit could influence substrate specificity in any of the following ways: by binding a potential substrate and bringing it into contact with the Cdk; by targeting the Cdk to a particular subcellular location where it has access to only a limited number of potential substrates (10-12); or by restricting Cdk activities to a narrow window within the cell cycle so that the kinase can only affect those substrates present and able to be activated during that stage (1). Most likely, the substrate is recognized by a combination of the Cdk substrate binding pocket and long range interactions with surface residues of the cyclin subunit (13). The majority of substrates would be recognized by the Cdk in association with any cyclin, but certain subsets might be recognized or preferred by a specific Cdk-cyclin pair (14). Several recent studies have indeed demonstrated that the identity of the cyclin partner can influence substrate specificity significantly (14-17).
Several loose consensus substrate sequences have been reported for p34cdc2 based on a limited number of known in vivo and in vitro p34cdc2 substrates (for review, see Ref. 18). These include (K/R)(S/T)PX(K/R), where X is any amino acid (18) or a polar amino acid (19), and (S/T)PX(K/R), where X is any amino acid (20). It has generally been assumed that p33cdk2 has a similar specificity. The few studies investigating the substrate specificity of the Cdks have been performed primarily on p34cdc2 (14, 18, 20-23) and have examined only a small number of peptides or sites in diverse proteins. A systematic study of protein kinase substrate specificity was carried out recently by Songyang et al. (9) using a peptide library containing approximately 2.5 billion unique peptides, with a fixed serine as the phosphate acceptor, as substrates for various kinases including p33cdk2-cyclin A and p34cdc2-cyclin B. This method identified a sequence similar to one of the consensus sites as the optimal substrate for p34cdc2-cyclin B, (K/R)SP(R/P)(R/K/H).
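The consensus motifs quoted above can be written as regular expressions to scan a protein sequence for candidate sites. The following is a minimal sketch, not a method used in this study: the dictionary name `CONSENSUS`, the helper `find_sites`, and the choice of Python are illustrative assumptions, and the "polar X" variant (19) is omitted because the set of polar residues is not defined here. The test string is the terminal sequence of the wild type fusion substrate given under "Materials and Methods."

```python
import re

# Published consensus motifs rewritten as regular expressions
# (X = any amino acid is written as "."). Names are illustrative only.
CONSENSUS = {
    "(K/R)(S/T)PX(K/R)": r"[KR][ST]P.[KR]",
    "(S/T)PX(K/R)": r"[ST]P.[KR]",
    "(K/R)SP(R/P)(R/K/H)": r"[KR]SP[RP][RKH]",
}


def find_sites(sequence, pattern):
    """Return (start index, matched text) for every match, including overlaps.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", sequence)]


if __name__ == "__main__":
    wild_type_tail = "GRGGKSPRKGNSS"
    for name, pattern in CONSENSUS.items():
        print(name, find_sites(wild_type_tail, pattern))
```

A motif match only says that a site fits a consensus; it carries no information about how efficiently the site is phosphorylated, which is the gap the quantitative measurements in this paper address.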
We have investigated the substrate specificity of p33cdk2 bound to cyclin A or E and of p34cdc2 bound to cyclin A or B using a systematic series of specifically defined peptide substrates appended to the COOH terminus of glutathione S-transferase, constructed by polymerase chain reaction using degenerate oligonucleotides. These substrates allowed us to determine quantitatively the role of the primary sequence of a target site in substrate utilization. Our panel of altered target sites has allowed us to compare the inherent differences in substrate recognition between p33cdk2 and p34cdc2 as well as to examine the effects of the cyclin regulatory subunits on specificity. In addition, we have found that the data generated from these experiments can be used to predict the potential utilization of novel phosphorylation sites.

MATERIALS AND METHODS
Production of GST Fusion Substrates-Substrates were constructed by polymerase chain reaction using pGEX-3X (24) or a previously made substrate as template. The 5′ primer, which included nucleotides 67-106 of GST, introduced an internal XhoI site (CTCGAG): TCG ACT TCT GCT CGA GTA TCT TGA AGA AAA ATA TGA AGA G, and the 3′ primer introduced the substrate peptide (the five XNN codons) using either a degenerate or a specific oligonucleotide based on the following sequence: CGA TGA ATT CCC XNN XNN XNN XNN XNN ACC CCC ACG ACC TTC GAT CAG, where X = G/C and N = G/A/T/C. The amplified products were cloned into pGEX-3X containing an introduced XhoI site at nucleotide 77 of GST using XhoI and EcoRI and sequenced over the peptide region. The terminal sequence of the wild type fusion substrate was GRGGKSPRKGNSS.
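Because the 3′ primer is written as the antisense strand, each XNN triplet corresponds to an NN(G/C) codon on the coding strand. The sketch below works through that inference; it is an illustration rather than part of the original protocol. It assumes the standard genetic code and that the reading frame starts at the first base of the primer's reverse complement, an assumption that reproduces the reported GRGG and GNSS flanks. All function and variable names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons starting at the first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Fixed parts of the 3' primer as printed (5'->3', antisense strand).
PRIMER_5_FLANK = "CGATGAATTCCC"
PRIMER_3_FLANK = "ACCCCCACGACCTTCGATCAG"

# On the coding strand the order is reversed: the primer's 3' flank
# lies upstream of the variable codons and its 5' flank lies downstream.
upstream = translate(reverse_complement(PRIMER_3_FLANK))
downstream = translate(reverse_complement(PRIMER_5_FLANK))

# Antisense XNN (X = G/C) reads as NN(C/G) on the coding strand.
nns_codons = [a + b + c for a in BASES for b in BASES for c in "GC"]
encoded = {CODON_TABLE[codon] for codon in nns_codons}

if __name__ == "__main__":
    print(upstream, downstream)
    print(len(nns_codons), "codons,", len(encoded - {"*"}), "amino acids")
```

Under these assumptions the 32 possible codons cover all 20 amino acids with a single stop codon (TAG), which is consistent with the use of one degenerate oligonucleotide to generate the full substitution panel.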
Purification of GST Fusion Substrates-The constructs were transformed into Escherichia coli strains TG1 or BL21 for protein expression. 100-ml bacterial cultures were grown in LB containing 0.1 mg/ml ampicillin at 37°C until they reached an A600 of 0.6-1.0. Isopropyl-1-thio-β-D-galactopyranoside was added to 0.4 mM, and cells were incubated for 14-16 h at 23°C. Cells were pelleted and washed twice in 0.9% NaCl before resuspension in 2 ml of lysis buffer (150 mM NaCl; 5 mM EDTA; 50 mM Tris (pH 8.0); 10% (v/v) glycerol; 5 mM dithiothreitol; 10 µg/ml each of leupeptin, chymostatin, and pepstatin; and 0.5 mg/ml lysozyme). After a 30-min incubation on ice, cells were lysed by the addition of Nonidet P-40 to 0.5% followed by sonication for two 25-s periods. The lysate was clarified by centrifugation at 40,000 rpm for 30 min at 4°C using a TL100.2 rotor in a Beckman TL100 ultracentrifuge. The supernatant was applied to a column containing 200 µl of glutathione-agarose (Sigma) which had been prewashed with 3 ml of lysis buffer containing 0.5% Nonidet P-40. Following binding for at least 30 min, the column was washed with 3 ml of lysis buffer containing 0.5% Nonidet P-40 followed by a wash with 3 ml of buffer H (50 mM Hepes (pH 7.5), 100 mM NaCl, 3 mM dithiothreitol, and protease inhibitors as above). The GST-peptide fusion was eluted with 600 µl of buffer H containing 5 mM glutathione and concentrated in a Centricon-10 concentrator (Millipore, Bedford, MA) for 60-90 min at 5,000 rpm in an SA-600 rotor at 4°C. The final concentrations of the fusion proteins were determined spectrophotometrically (assuming an A280 of 1.0 for a 1 mg/ml solution) and ranged from 10 to 127 mg/ml.
Purification of Cdk-Cyclin Complexes-Sea urchin GST-cyclin B was expressed in E. coli and purified as described (25). The cyclin protein was added to a Xenopus egg extract arrested in interphase, and the activated p34cdc2-cyclin complexes were retrieved on glutathione-agarose beads, eluted, and concentrated as described (26). The final concentration of purified Xenopus p34cdc2-cyclin B complexes was 184 nM, and their activity was 3.62 × 10^3 pmol of phosphate transferred per min/µg of p34cdc2 at a saturating concentration of the wild type substrate. Human Cdk-cyclin complexes were purified from Sf9 insect cells coinfected with baculoviruses expressing GST-cyclin and Cdk. The complexes were retrieved on glutathione-Sepharose resin and eluted as described (27). The final concentrations and the specific activities of the complexes toward the wild type substrate were as follows: p33cdk2-cyclin A, 455 nM and 4.77 × 10^3 pmol of phosphate transferred/min/µg of p33cdk2; p33cdk2-cyclin E, 121 nM and 4.66 × 10^3 pmol of phosphate transferred/min/µg of p33cdk2; p34cdc2-cyclin A, 12 nM and 1.58 × 10^4 pmol of phosphate transferred/min/µg of p34cdc2; p34cdc2-cyclin B, 12 nM and 1.13 × 10^4 pmol of phosphate transferred/min/µg of p34cdc2. The concentrations of all Cdks were determined by quantitative immunoblotting using an anti-PSTAIRE antiserum and known amounts of enzyme as a reference standard.
In Vitro Kinase Assays and Data Analysis-Xenopus p34cdc2-cyclin B kinase assays were performed by incubating 5 µl of substrate with 5 µl containing 3 nM p34cdc2-cyclin B complexes, 0.25 µCi/µl [γ-32P]ATP, 0.4 mM ATP, 15 mM MgCl2, 20 mM EGTA, 10 mM dithiothreitol, 80 mM potassium β-glycerophosphate (pH 7.3), 1 mg/ml ovalbumin, and protease inhibitors as above at 25°C. Human Cdk-cyclin kinase assays were performed by incubating 5 µl of substrate with 5 µl of enzyme containing 4.6 nM p33cdk2-cyclin A, 6.1 nM p33cdk2-cyclin E, 2.4 nM p34cdc2-cyclin A, or 2.4 nM p34cdc2-cyclin B, in a mixture with 0.25 µCi/µl [γ-32P]ATP, 0.4 mM ATP, 15 mM MgCl2, 50 mM Hepes (pH 8.0), 20 mM p-nitrophenyl phosphate, 10 mM dithiothreitol, 1 mg/ml ovalbumin, and protease inhibitors as above. (The concentrations of each enzyme were chosen to give equal histone H1 kinase activities.) The reactions proceeded for 15 min at room temperature and were terminated by the addition of 10 µl of 2× sodium dodecyl sulfate-polyacrylamide gel sample buffer and analyzed on 12.5% polyacrylamide gels. Phosphorylation was quantified on a Bio-Rad GS-250 molecular imager. Imager units were converted into cpm by scintillation counting of an excised band from a wet gel. Km determinations were made from data sets that produced R^2 values of 0.99 or greater for Xenopus p34cdc2 and 0.96 or greater for human p34cdc2 and p33cdk2 when fit to the Michaelis-Menten equation using the KaleidaGraph program (version 2.1.3, Abelbeck Software, Stable Isotope Lab, University of Michigan). For Xenopus p34cdc2-cyclin B assays, all other data points are given as the means ± the S.E. from five separate experiments that have been normalized based on total counts incorporated/experiment. For human Cdk-cyclin assays, all other data points are given as the means ± the S.E. from three separate experiments that have been normalized based on a standard histone H1 kinase reaction performed with each separate experiment.
Background incorporation by each enzyme into a nonphosphorylatable substrate, KAPRK, was subtracted from each data point. As a control, human Cdk4-cyclin D was purified from infected Sf9 cells (as above) and was used in kinase reactions (as above) with 10 different peptide substrates (KSPRK, DSPRK, GSPRK, QSPRK, KSPKK, KSPMK, KSPWK, KSPRH, KSPRP, and KSPRR). The phosphorylation was at background levels, indicating that none of the phosphorylation signal measured in experiments using p34cdc2 or p33cdk2 had been due to contaminating kinases.
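The Km determinations described above rest on fitting rate data to the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]). As a minimal sketch of how such parameters can be recovered, the code below uses a Hanes-Woolf linearization ([S]/v plotted against [S]) with ordinary least squares. This is a swapped-in technique chosen to keep the example dependency-free; it is not the nonlinear fitting procedure of the original analysis software. The substrate concentrations and the noise-free rates are synthetic, and all names are hypothetical.

```python
def michaelis_menten(substrate_um, vmax, km_um):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate_um / (km_um + substrate_um)


def hanes_woolf_fit(substrate_um, rates):
    """Estimate (Vmax, Km) from the linear form S/v = S/Vmax + Km/Vmax."""
    xs = list(substrate_um)
    ys = [s / v for s, v in zip(substrate_um, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic, noise-free data for illustration only (not measured values).
concentrations_um = [10, 25, 50, 100, 250, 500, 1000]
synthetic_rates = [michaelis_menten(s, 3600.0, 98.0) for s in concentrations_um]
vmax_fit, km_fit = hanes_woolf_fit(concentrations_um, synthetic_rates)

if __name__ == "__main__":
    print(f"Vmax = {vmax_fit:.1f}, Km = {km_fit:.1f} uM")
```

With real, noisy data a direct nonlinear least-squares fit is generally preferable, because linearized forms weight experimental errors unevenly.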

RESULTS
Fine Substrate Specificity of p34cdc2-To refine the substrate specificity of p34cdc2, we constructed a series of substrates based on a histone H1 phosphorylation site, KSPRK (wild type), attached to the COOH terminus of glutathione S-transferase via gene fusion. This format allowed us to avoid the expense of producing synthetic peptides while examining a large number of sites within a single context. This approach differs from previous work investigating substrate specificity in which only a small number of sites within different protein contexts were examined, making direct comparisons of phosphorylation efficiencies difficult. We found using the wild type substrate that phosphorylation by purified Xenopus p34cdc2-cyclin B increased linearly with time over a 30-min period and a 1,000-fold range of concentrations (data not shown). A phosphorylation site mutant (KAPRK) was essentially unphosphorylated. Conditions within this linear range were chosen for all further experiments.
We replaced the charged residues at positions -1, +2, and +3 with respect to the phosphorylated serine or threonine with alanines in single, double, and triple combinations to determine the overall importance of each position to substrate recognition. Phosphorylation of these alanine substitution mutants by Xenopus p34cdc2 was carried out over a wide range of substrate concentrations. Substitution at the -1 position (ASPRK) had only a small effect on phosphorylation efficiency, substitution at the +2 position (KSPAK) had a more significant effect, and substitution at the +3 position (KSPRA) had a severe effect (Fig. 1). The data were fit to the Michaelis-Menten equation, and kinetic parameters were determined for the four substrates that approached saturation within the concentration range of the experiment. The Km values (in µM) for these substrates were as follows: KSPRK, 98.0; ASPRK, 108; KSPAK, 446; and ASPAK, 976. The Vmax values were all between 3,190 and 4,120 pmol of phosphate transferred/min/µg of p34cdc2. Thus, all of the variation in substrate utilization was accounted for by substrate binding (Km). The Km for histone H1 was 25 µM under the same assay conditions (data not shown), indicating that the Km values for our best peptide substrates were close to those of actual p34cdc2 substrates. Alanine substitutions at positions -1 and +2 had only moderate effects, increasing the Km to 1.1-fold and 4.6-fold the wild type value, respectively. However, mutation of the +3 position had a severe effect on Km, even greater than the ASPAK double substitution substrate, which had a 10-fold effect, confirming previous suggestions that the identity of the +3 position was more important than the identity of the -1 or +2 position (9, 18-20). We also examined a KTPRK substrate to determine the utilization of threonine compared with serine. The Km for this substrate was 153 µM, indicating that serine was slightly preferred (by 1.6-fold) as the phosphate acceptor.
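The fold effects quoted in this paragraph follow directly from the ratios of the reported Km values. The short sketch below reproduces that arithmetic; the dictionary and function names are illustrative, and the Km values are assumed to be in µM as stated in the text.

```python
# Km values reported in the text for Xenopus p34cdc2-cyclin B (assumed uM).
KM_UM = {
    "KSPRK": 98.0,   # wild type
    "ASPRK": 108.0,  # alanine at -1
    "KSPAK": 446.0,  # alanine at +2
    "ASPAK": 976.0,  # alanine at -1 and +2
    "KTPRK": 153.0,  # threonine as phosphate acceptor
}


def km_fold_increase(substrate, reference="KSPRK"):
    """Ratio of a substrate's Km to the reference (wild type) Km."""
    return KM_UM[substrate] / KM_UM[reference]


if __name__ == "__main__":
    for name in ("ASPRK", "KSPAK", "ASPAK", "KTPRK"):
        print(f"{name}: {km_fold_increase(name):.2f}-fold the wild type Km")
```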
We systematically replaced residues at positions -1, +2, and +3 with each possible amino acid to define any specific substrate preferences at these sites. The relative specificity of Xenopus p34cdc2-cyclin B toward these single amino acid substitutions was determined at a fixed substrate concentration of 50 µM, which is well below the Km value of the wild type substrate (Fig. 1) and thus within the linear range. The analysis of these substrates agreed well with the findings from the alanine substitution substrates regarding the overall sensitivity of each position, although there were marked preferences.
The first position was relatively insensitive to amino acid substitutions (Fig. 2A). All but five of the substrates were phosphorylated at least 80% as well as the wild type substrate. The peptide with proline at the first position was a relatively poor substrate, phosphorylated at only 46% of the wild type level. Thus we find that all substitutions at the -1 position are tolerated, but that there is a distinct variation in preference. In contrast, the conventional consensus sequences specify only lysine or arginine at the first position (18) or indicate that all amino acids are equivalent (20).
The +2 position tolerated a much more limited number of amino acid substitutions (Fig. 2B). Only two substrates approached wild type levels, lysine at 108% and methionine at 80.1%. All other substrates were phosphorylated less than 65% as well as the wild type substrate. The most poorly tolerated substitutions were aspartic acid, glutamine, and proline, which reduced phosphorylation to less than 10% of wild type. The traditional consensus sequences either do not indicate any specificity at this position (18) or require a polar side chain (19). In fact, substitutions at the +2 position show a full range of tolerance, from excellent to poor. Moreover, several polar amino acid side chains form poor substrates (e.g. aspartate and glutamine), whereas some nonpolar side chains yield excellent substrates (e.g. methionine).
The +3 position was the most sensitive to substitution (Fig. 2C). Only arginine and lysine were well tolerated. Peptides containing histidine or proline were utilized at about 20% of the wild type level, which, although low, was still considerably greater than the rest of the substrates. Aspartic acid and glutamic acid were the least tolerated changes at this position, resulting in peptides that were phosphorylated at less than 0.5% of the wild type level. The results for the +3 position agree well with the consensus sites, which specify only lysine or arginine (18-20). However, our results expand this definition by identifying histidine and proline as tolerable substitutions and by indicating that very few amino acids are entirely excluded at this position.
Based on molecular modeling, Songyang et al. (9) have proposed that the proline residue directly following the serine is necessary to anchor the substrate in the correct orientation for phosphorylation. Mutation of this proline to asparagine (KSNRK) abolished all detectable phosphorylation (data not shown). We also tested the utilization of the substrate KSTRK, a sequence based on the nonconventional p34cdc2 phosphorylation sites in myosin light chain (SSKR and KTTKK) (23, 28). This substrate was utilized at 0.03% of wild type (data not shown), which agrees well with the findings of Yamakita et al. (23), who saw extremely low levels of phosphorylation of myosin light chain by p34cdc2 relative to phosphorylation of histone H1.
Substrate Specificity of p33cdk2 versus p34cdc2-We turned to purified human Cdk-cyclin complexes, rather than Xenopus enzymes, for studies comparing the specificities of p33cdk2 and p34cdc2 since these enzymes were more readily available. We first verified that human p34cdc2 had essentially the same specificity as that from Xenopus. Both enzymes were most sensitive to alanine substitution at the +3 position, followed by the +2 and -1 positions, respectively (Figs. 1 and 3). They also showed the same general pattern of fine structure specificity based on the single amino acid substitutions (Figs. 2 and 4). The most significant difference was seen with the RSPRK substrate, which was utilized at 63.7% of the wild type level with the Xenopus enzyme (Fig. 2A) and at 140% of the wild type level for the human enzyme (Fig. 4A). This result may reflect slight differences in the structures of the substrate binding pockets of the two enzymes that allow the human enzyme to accommodate the bulky arginine residue at the -1 position more readily. The overall similarity in substrate specificity between human and Xenopus p34cdc2-cyclin B provides reassurance that the innate specificities of these enzymes have been well conserved through evolution.
We next compared the sensitivities of p33cdk2-cyclin A and p34cdc2-cyclin A with single alanine substitutions at the -1, +2, and +3 positions of our substrate (Fig. 3). The enzymes were qualitatively similar in that each was barely sensitive to substitution by alanine at the -1 position, more sensitive to substitution at the +2 position, and very sensitive to substitution at the +3 position. Quantitatively, however, p33cdk2 was much more sensitive to the +2 and +3 substitutions than was p34cdc2 and showed almost no detectable activity toward the KSPRA substrate. We obtained Km and Vmax values for all four human enzymes using the three substrates (KSPRK, KTPRK, and ASPRK) that approached saturation closely enough to permit accurate fits to the Michaelis-Menten equation (Table I). The p34cdc2 enzymes consistently had about a 2-fold or more higher affinity for the substrates than the p33cdk2 enzymes. Interestingly, p33cdk2-cyclin E had a significantly lower Km for the substrates than did p33cdk2-cyclin A. We assume that this effect represents a slight interaction between cyclin E and the GST substrates, although it could also represent a very subtle alteration in the substrate binding region of p33cdk2 induced by the cyclin-binding partner. Although the p33cdk2 enzymes had approximately equal affinities for substrates containing either Ser or Thr as the phosphorylation target, the p34cdc2 enzymes preferred Thr by approximately 2-fold, again suggesting that there may be minor differences in the substrate binding interfaces of these enzymes. We were struck that the Vmax values for the p34cdc2 enzymes were significantly higher than those for the p33cdk2 enzymes since we normalized the amounts of each enzyme we used in the assays based on histone H1 kinase activity.
One interpretation is that the p33cdk2 enzymes may have a higher binding affinity for histone H1 and that we therefore underestimated the amounts of these enzymes to use in our substrate assays.
We examined the sensitivities of all four human enzymes to our full panel of substitution substrates in order to probe their site preferences more systematically. The four enzymes displayed remarkably similar fine specificity at the -1 position (Fig. 4A). As with Xenopus p34cdc2, the human enzymes tolerated all substitutions at this position, but Pro was again the most poorly tolerated at approximately 35-45% of the wild type level. Although there were some statistically significant differences in sensitivity between p34cdc2 and p33cdk2, these were relatively modest, especially compared with some of those seen at the +2 and +3 positions (see below). The greatest differences were a relative preference of p33cdk2 for Trp and of p34cdc2 for Ile. Emphasizing the tolerance at this position, phosphorylation of a number of the substrates by all four enzymes was near or even exceeded that of the wild type substrate.
As with Xenopus p34cdc2, human p34cdc2 and p33cdk2 showed a wide range of substrate preferences at the +2 position. Again, the most poorly tolerated substitutions were Asp, Pro, and Gln, each phosphorylated at less than 10% of the wild type level. However, unlike the case at the -1 position, there were marked differences at the +2 position in the sensitivities of p33cdk2 and p34cdc2 to some substitutions. In particular, the p33cdk2 enzymes phosphorylated the Gly-containing substrate (KSPGK) at only 1.0 and 3.2% of the wild type level, whereas p34cdc2 phosphorylated the same substrate at 14.8 and 46.9% of the wild type level (Fig. 4B). Similarly, the p33cdk2 enzymes phosphorylated the Pro-containing substrate (KSPPK) at only 0.1 and 0.5% of the wild type level, whereas p34cdc2 phosphorylated the same substrate at 3.0 and 5.1% of the wild type level (Fig. 4B).
Substitutions at the +3 position produced the greatest effects. For the p34cdc2 enzymes, substitutions of basic amino acids were clearly preferred, whereas acidic substitutions were tolerated most poorly. As with Xenopus p34cdc2, His and Pro stood out somewhat above the other amino acids that made poor, but tolerated, substrates (Fig. 4C). The p33cdk2 enzymes showed the same pattern of sensitivity but to a much greater extent, thus elaborating on the heightened sensitivity seen earlier with the single alanine substitutions (Fig. 3). In particular, there were only two substrates that were phosphorylated at better than 1% of the wild type level, KSPRK (the wild type), at 100%, and KSPRR, at 4.3 and 5.0% (Fig. 4C). This sensitivity of p33cdk2 to substitution of Lys with even another basic residue was in marked contrast to the sensitivity of the p34cdc2 enzymes, which phosphorylated the KSPRR substrate at 21.1 and 59.9% of the wild type level (Fig. 4C). Most +3 substitution substrates were not detectably phosphorylated at all by the p33cdk2 enzymes, and only six (Ala, Cys, His, Lys, Pro, and Arg) were phosphorylated at greater than 0.1% of the wild type level by either enzyme (Fig. 4C), whereas we could always detect at least a low level of phosphorylation of the same substrates by the p34cdc2 enzymes (Fig. 4C).
We generally saw no or only very modest effects of the cyclin partner on the phosphorylation of the various substrates by either p34cdc2 or p33cdk2. For instance, the largest effects at the -1 position involved the ISPRK substrate, which was phosphorylated 1.4 times as well by p33cdk2-cyclin E as by p33cdk2-cyclin A, and WSPRK, which was phosphorylated 1.5 times as well by p33cdk2-cyclin A as by p33cdk2-cyclin E (Fig. 4A). At the +2 position, KSPGK was phosphorylated 3.3 times as well by p33cdk2-cyclin A as by p33cdk2-cyclin E and 3.2 times as well by p34cdc2-cyclin B as by p34cdc2-cyclin A (Fig. 4B). Of those substrates phosphorylated at all well at the +3 position, the greatest effect of the cyclin was seen with the KSPRR substrate, which was phosphorylated 2.8 times as well by p34cdc2-cyclin B as by p34cdc2-cyclin A (Fig. 4C). However, since all of the +3 position substrates were phosphorylated approximately 2-3-fold as well by p34cdc2-cyclin B as by p34cdc2-cyclin A, relative to the wild type substrate to which we normalized all of our data, we suspect that the p34cdc2-cyclin A enzyme actually prefers the wild type KSPRK sequence relative to p34cdc2-cyclin B. Although all these effects could reflect subtle alterations in the substrate binding region of p33cdk2 and of p34cdc2 caused by binding different cyclins, we are more inclined to view the generally sporadic effects as being due to relatively weak longer range interactions of the cyclins with some individual substrates. The physiological relevance of 2-fold differences in phosphorylation efficiency is doubtful.
Prediction of Substrate Utilization-Further analysis of the alanine substitution substrates indicated that the single amino acid substitutions showed an additive effect that could be used to predict the substrate utilization of a double or triple alanine substitution substrate by Xenopus p34cdc2-cyclin B.
We confirmed the generality of this finding by testing our ability to predict the phosphorylation efficiency of random multiple substitution substrates based on the data from the single amino acid substitutions. We predicted the utilization of each substrate by multiplying the percent wild type phosphorylation of the respective single amino acid substitutions. This predicted value was then compared with the experimental value for that substrate (a ratio of 1 represents a perfect prediction) (Table II). For example, to predict the utilization of the SSPNL triple mutant, we multiplied the percent wild type phosphorylation of the three single amino acid substitutions, SSPRK (75.1%), KSPNK (24.9%), and KSPRL (2.86%). The result (0.50%) was then compared with the actual utilization for this substrate (0.32%) to produce a ratio of 1.56.
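The multiplicative prediction described above can be sketched in a few lines. The lookup table below contains only the three single-substitution values quoted in the text for the SSPNL example; the table layout, the position-to-index mapping, and the function name are illustrative assumptions. Multiplying the rounded percentages as printed gives about 0.53%, slightly above the reported 0.50%, presumably because the published figure was computed from unrounded data (an assumption).

```python
WILD_TYPE = "KSPRK"

# Percent of wild type phosphorylation for single substitutions,
# keyed by (position relative to the phospho-acceptor, residue).
# Only the values quoted in the text are included.
SINGLE_SUBSTITUTION_PCT = {
    (-1, "S"): 75.1,   # SSPRK
    (+2, "N"): 24.9,   # KSPNK
    (+3, "L"): 2.86,   # KSPRL
}

# Index of each variable position within the five-residue site.
POSITION_INDEX = {-1: 0, +2: 3, +3: 4}


def predict_percent_wild_type(site):
    """Multiply the fractional efficiencies of each substituted position."""
    predicted = 100.0
    for position, index in POSITION_INDEX.items():
        residue = site[index]
        if residue != WILD_TYPE[index]:
            predicted *= SINGLE_SUBSTITUTION_PCT[(position, residue)] / 100.0
    return predicted


predicted_sspnl = predict_percent_wild_type("SSPNL")
reported_ratio = 0.50 / 0.32  # predicted / actual, as reported in the text

if __name__ == "__main__":
    print(f"SSPNL predicted: {predicted_sspnl:.2f}% of wild type")
    print(f"Reported predicted/actual ratio: {reported_ratio:.2f}")
```

Extending the table with the full set of single-substitution values from the figures would allow the same function to score any site of the form X(S/T)PXX.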
We chose the substrates to reflect a broad range of amino acid substitutions. These substrates had predicted phosphorylation efficiencies ranging from 42.6 to 0.07% of wild type. There was a good correlation between the actual and predicted values; only two of the substrates had a predicted to actual ratio of greater than 5 or less than 0.2 (Table II). We repeated this analysis with the human complexes and obtained similar results (data not shown).

DISCUSSION
We have used a panel of GST fusion proteins containing systematic alterations of a canonical p34cdc2 phosphorylation site to determine the fine specificity of p34cdc2 and p33cdk2 bound to various cyclins. Understanding the similarities and differences in the specificities of these enzymes is an essential first step toward evaluating potential substrates that could play important roles in cell cycle progression. Previous studies of p34cdc2 phosphorylation sites have involved compilations of sites found in diverse proteins and examination of modest numbers of synthetic peptide substrate variants. Recently a peptide selection approach has been used to define a p34cdc2 consensus site as (K/R)SP(R/P)(R/K/H). Although this method is extremely useful for rapidly determining optimal phosphorylation sites, it is not as well suited for determining which amino acids are poorly tolerated or excluded, it does not analyze all 20 amino acids, and it systematically overestimates the phosphorylation of suboptimal substrates (9). Our approach benefits from using a comprehensive collection of variant substrates within the same protein context. The fusion proteins are inexpensive and readily purified, and additional mutant phosphorylation sites can be engineered quite easily. We also expect that our panel of substrates will prove useful for determining the substrate specificity of other Cdks, including those involved in cell cycle control as well as those involved in other processes.
Overall, we found that Xenopus p34cdc2 was least sensitive to substitutions at the -1 position of the wild type sequence KSPRK (with respect to the phosphorylated Ser), fairly sensitive to substitutions at the +2 position, and most sensitive to substitutions at the +3 position. Although this general pattern is consistent with widely held consensus sites for phosphorylation by p34cdc2, our data significantly alter our view of what sequences can constitute good, fair, or poor phosphorylation targets. Our finding that the -1 position can accommodate any amino acid, but that there is about a 2-fold variation in phosphorylation efficiency, is closer to the consensus that posits no specificity than to those that place a basic residue at this position. At the +2 position, we found that neither consensus view, either specifying a polar residue or tolerating all amino acids, adequately fit the data. There was a strong degree of specificity at this position since some substitution mutants were phosphorylated almost 20-fold more efficiently than others. However, we have been unable to discern any simple pattern to explain this specificity. Clearly, though, both polar and nonpolar amino acid side chains could form excellent substrates, and some polar amino acids yielded quite poor substrates. At the +3 position our data are in agreement with the consensus view that basic residues are best. However, by focusing on the best sites, the consensus view fails to distinguish among the poorer sites. We would divide the substitutions at the +3 position into four classes: basic residues, which form excellent sites (about 100% of wild type phosphorylation efficiency); His and Pro, which can form surprisingly strong sites (about 20% of wild type); most other amino acids, which form weak but significant sites (about 5% of wild type); and acidic groups, which form virtually unphosphorylatable sites.
The approximately 20-fold reduction in binding affinity on substitution of Ala and most other amino acids at the ϩ3 position corresponds to a weakening of the interaction by about 1.8 kcal/mol, which is consistent with the loss of a single ionic interaction linking the ϩ3 basic residue to p34 cdc2 .
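The figure of about 1.8 kcal/mol can be checked with the standard relation ΔΔG = RT ln(fold change). The following is a minimal sketch of that arithmetic; the temperature of 30 °C (303.15 K) is an assumption, since the assay temperature is not given in this section, and the function name is purely illustrative.

```python
import math

# Molar gas constant in kcal/(mol*K): 8.314462618 J/(mol*K) / 4184 J/kcal
R_KCAL = 8.314462618 / 4184.0


def ddg_kcal_per_mol(fold_change, temp_kelvin=303.15):
    """Free-energy change (kcal/mol) equivalent to a fold change in affinity.

    temp_kelvin defaults to 30 degrees C, which is an assumed value.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return R_KCAL * temp_kelvin * math.log(fold_change)


# A 20-fold loss of affinity, as seen for most +3 substitutions
print(round(ddg_kcal_per_mol(20.0), 1))
```

The result is only weakly sensitive to the assumed temperature: between 25 and 37 °C the value stays within roughly 0.1 kcal/mol of 1.8.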
We observed two classes of modest effects of the cyclin partner on phosphorylation of the substrates. First, we noted that the cyclin A-containing complex of p33cdk2 had a consistently ~2-fold higher Km for our substrates than the cyclin E-containing complex (Table I). The Km was not measured for all substrates, but if phosphorylation efficiencies reflect changes in Km, and not in Vmax, then this difference in Km probably applies to nearly all of the substrates and not just to those shown in Table I. This result may indicate that cyclin E is a "better" cyclin in that it may be able to induce a better geometry of the binding pocket in p33cdk2 for substrates. We saw no comparable effect of cyclin A versus cyclin B in the p34cdc2-containing complexes. We also noted a more sporadic effect of cyclin partner on phosphorylation efficiencies that we are inclined to attribute to weak longer range interactions between the cyclin and individual target sequences.

Although the specificities of p34cdc2 and p33cdk2 were generally similar, we were surprised to find a number of instances where p33cdk2 was far more selective. p33cdk2 was much less tolerant of Gly or Pro at the +2 position than was p34cdc2. This effect was approximately 10-fold or more, depending on the cyclin partner. The greatest differences were seen at the +3 position, where p33cdk2 essentially did not phosphorylate (less than 0.4% of wild type efficiency) any substrate not containing Lys or Arg at this site. Even Arg, which yields a very good p34cdc2 substrate, gave only poor substrates (about 5% wild type efficiency) for p33cdk2. The presence of any amino acid other than Lys or Arg at the +3 position of a putative Cdk phosphorylation site can be taken as a strong indication that p34cdc2, rather than p33cdk2, phosphorylates that protein. The presence of Gly or Pro at the +2 position, or of Arg at the +3 position, would also tend to point toward p34cdc2 as the relevant protein kinase.
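The rules of thumb in the preceding paragraph can be written as a small screening helper. This is only a sketch: the function name, the five-residue layout (positions −1, phosphoacceptor, +1, +2, +3, as in KSPRK), and the "strong"/"weak" labels are assumptions made for illustration, and the output is a qualitative hint rather than a quantitative prediction.

```python
def cdk_preference_hints(site):
    """List the reasons a 5-residue site points toward p34cdc2 over p33cdk2.

    site is read as positions -1, 0 (phosphoacceptor), +1, +2, +3.
    An empty list means none of the rules applies; it does not by itself
    imply that p33cdk2 is the relevant kinase.
    """
    site = site.upper()
    if len(site) != 5:
        raise ValueError("expected 5 residues spanning positions -1 to +3")
    plus2, plus3 = site[3], site[4]
    hints = []
    if plus3 not in ("K", "R"):
        # Described in the text as a strong indication
        hints.append("strong: +3 is neither Lys nor Arg")
    elif plus3 == "R":
        # Arg at +3 gave only about 5% of wild type efficiency with p33cdk2
        hints.append("weak: +3 is Arg")
    if plus2 in ("G", "P"):
        hints.append("weak: +2 is Gly or Pro")
    return hints


for candidate in ("KSPRK", "YSPMH", "KSPDR", "KSPGK"):
    print(candidate, cdk_preference_hints(candidate))
```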
Our data are likely to be useful for the initial evaluation of potential substrates of p34cdc2 and of p33cdk2. We were able to predict the phosphorylation efficiencies of multiple site substitutions fairly accurately since the effects of the single amino acid substitution mutants on phosphorylation efficiency were additive (Table II). We envision that a full prediction of potential phosphorylation sites on novel proteins will help to guide experiments toward the most likely physiological sites. For example, our data predict that an intuitively poor target site, YSPMH, would be phosphorylated almost twice as efficiently by Xenopus p34cdc2-cyclin B as an intuitively excellent site, KSPDR (13.0% versus 7.0% of wild type efficiency). We do not anticipate that our predictive scale will be accurate for all sites in all protein contexts. Clearly, many factors combine to determine the phosphorylation efficiency of a given target. A theoretically excellent site could be buried within a protein or folded rigidly in an unfavorable conformation. Similarly, an otherwise weak site could be folded tightly and presented in a favorable way. Additional interactions between either subunit of the Cdk and a substrate could further influence specificity. Despite these caveats, and in particular because their contributions are difficult to evaluate, we feel that our scale presents an unbiased starting point for examination of Cdk substrates. Tables showing the phosphorylation efficiencies depicted graphically in Figs. 2 and 4 are readily available from the authors.
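One plausible reading of "additive" is additive in free energy, which is equivalent to multiplying the single-substitution efficiencies once each is expressed as a fraction of wild type. The sketch below assumes that reading, which this section does not state explicitly. The function name and table layout are hypothetical, and the numbers in `PLACEHOLDER_TABLE` are invented for illustration; they are not the measured values from Figs. 2 and 4.

```python
WILD_TYPE = "KSPRK"
POSITIONS = (-1, 0, 1, 2, 3)  # offsets relative to the phosphorylated Ser


def predict_efficiency(site, table, wild_type=WILD_TYPE):
    """Predicted phosphorylation efficiency of a site, in % of wild type.

    table maps position offset -> {residue: efficiency in % of wild type}
    for single substitutions. Raises KeyError if a needed single-site
    value is missing from the table.
    """
    if len(site) != len(wild_type):
        raise ValueError("site must have the same length as the wild type")
    fraction = 1.0
    for offset, residue, wt_residue in zip(POSITIONS, site, wild_type):
        if residue != wt_residue:
            # Assumption: independent positions, so fractions multiply
            fraction *= table[offset][residue] / 100.0
    return 100.0 * fraction


# Invented placeholder values, NOT data from the paper
PLACEHOLDER_TABLE = {
    -1: {"A": 50.0},
    2: {"A": 40.0},
    3: {"A": 5.0},
}

print(predict_efficiency("KSPRK", PLACEHOLDER_TABLE))
print(predict_efficiency("ASPAA", PLACEHOLDER_TABLE))
```

With the real single-substitution tables supplied in place of the placeholders, the same calculation could be compared against the measured multi-site values in Table II.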