A single pair of acidic residues in the kinase major groove mediates strong substrate preference for P-2 or P-5 arginine in the AGC, CAMK, and STE kinase families.

Most basophilic serine/threonine kinases preferentially phosphorylate substrates with Arg at P-3 but vary greatly in additional strong preference for Arg at P-2 or P-5. The structural basis for P-2 or P-5 preference is known for two AGC kinases (family of protein kinases A, G, and C) in which it is mediated by a single pair of acidic residues (PEN+1 and YEM+1). We sought a general understanding of P-2 and P-5 Arg preference. The strength of Arg preference at each position was assessed in 15 kinases using a new degenerate peptide library approach. Strong P-2 or P-5 Arg preference occurred not only in AGC kinases (7 of 8 studied) but also in calmodulin-dependent protein kinase (CAMK, 1 of 3) and Ste20 (STE) kinases (2 of 4). Analysis of sequence conservation demonstrated almost perfect correlation between (a) strong P-2 or P-5 Arg preference and (b) acidic residues at both PEN+1 and YEM+1. Mutation of two kinases (PKC-theta and p21-activated kinase 1 (PAK1)) confirmed critical roles of both PEN+1 and YEM+1 residues in determining strong P-2 Arg preference. PAK kinases were unique in having exceptionally strong Arg preference at P-2 but lacking strong Arg preference at P-3. Preference for Arg at P-2 was so critical to PAK recognition that PAK1 activity was virtually eliminated by mutating the PEN+1 or YEM+1 residues. The fact that this specific pair of acidic residues has been repeatedly and exclusively used by evolution for conferring strong Arg preference at two different substrate positions in three different kinase families implies it is uniquely well suited to mediate sufficiently good substrate binding without unduly restricting product release.

The term "basophilic kinase" has been used to describe kinases that preferentially phosphorylate substrates having basic residues in close proximity to the phosphorylation site (1). Studies of PKA in the late 1970s provided the first paradigm and showed a simple and powerful preference for Arg at P-3 and P-2 (2, 3) (where P-n indicates the amino acid n residues N-terminal to the phosphorylated residue and P+n indicates the amino acid n residues C-terminal to the phosphorylated residue). PKC specificity proved to be more complex and to include preference for basic residues distributed on both sides of the phosphorylation site (4-6). By 1990, some information on peptide specificity was available for about 20 kinases, of which the majority had evident preference for basic residues (7).
The structural basis for the well defined basophilic preference of PKA was solved concurrently by two different approaches: mutational analysis (8) and x-ray crystallography (9). The strong preference of PKA for Arg at P-3 was assigned to a single acidic residue (Glu-127) by both approaches. We refer to this residue as GEL+1 to provide a general nomenclature rather than kinase-specific numbering. GEL refers to the consensus sequence of a conserved motif, and the +1 refers to its position relative to the first residue in that motif (supplementary Fig. S1). The GEL+1 acidic residue is located on the hinge that connects the kinase N-lobe to the C-lobe. GEL+1 co-ordinates both with ATP and with Arg at the P-3 position (10). This finding in PKA has proven to be a general rule for many kinases. Additional mutational analyses have clarified and generalized the critical importance of this acidic residue in PKA as well as other kinases (11, 12). This feature is so general for basophilic kinases that degenerate peptide approaches generally use Arg at P-3 as the anchor residue for peptides (13).
The strong preference of PKA for Arg at P-2 was shown to be conferred by two acidic residues: PEN+1 (Glu-171) and YEM+1 (Glu-230) (8, 9). These residues are widely separated in the linear sequence (in the catalytic loop and F-helix, respectively), but in three dimensions their side chains are close to each other in the major groove. In PKA, they cooperate in creating a negatively charged pocket. This finding has been extrapolated to the closely related PKCs to explain the observed role of Arg at the P-2 position for PKCs (14). However, it is uncertain how general this will prove to be. Specifically, 1) there is major variation in Arg preference at P-2 among basophilic kinases, and 2) there is also complex variation between Ser/Thr kinases in the residues present at PEN+1 and YEM+1 (see "Results" and "Discussion"). The functional relationship between these is not clear, because there are no additional solved structures with Arg in substrate at P-2, and virtually no mutational analysis has been done of PEN+1 or YEM+1 residues. Further interest and complexity have emerged from a new finding relating to this pair of residues. The crystal structure of the AGC family kinase AKT1 shows that the P-5 Arg in a bound substrate occupies the same pocket formed by acidic residues at PEN+1 and YEM+1 (15). Thus, this pocket can function either for P-2 or P-5 Arg recognition.
It is important to understand more generally the rules regarding basophilic kinases, their preference for Arg at positions other than P-3, and the structural basis for this preference. This is of both practical and theoretical interest. Practically, it is important to understand which basophilic kinases are most similar in specificity and therefore most likely to be confused in predicting/assigning substrates. At a theoretical level, such analysis facilitates understanding the molecular basis of peptide specificity and its evolution. We have undertaken this analysis of 15 diverse kinases from the AGC, CAMK, and STE kinase families using a new degenerate peptide library approach that objectively quantifies preference for Arg at each substrate position. Our studies demonstrate that PEN+1 and YEM+1 play a singular role in conferring strong P-2 or P-5 Arg preference in these three large kinase subfamilies. Kinases in the PAK family prove to be especially interesting in their exceptionally strong dependence on P-2 Arg recognition mediated by these two acidic residues.
* This work was supported by the Intramural Research Program of the NCI, National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The on-line version of this article (available at http://www.jbc.org) contains supplemental Figs. S1-S5.

MATERIALS AND METHODS
Peptide Synthesis and Kinases. Biotinylated single sequence peptides and degenerate peptides were synthesized as C-terminal amides on Mimotopes (Clayton, Australia) SynPhase Rink amide acrylic-grafted polypropylene solid support (loading 7.5 μmol) using conventional Fmoc (N-(9-fluorenyl)methoxycarbonyl) chemistry as previously described (8). Fifteen catalytically active kinases were used. PKA, PKC-α, PKC-, PKG, and p70S6K were purchased from Calbiochem. Pim1, MST4, PAK1, and PAK4 were purchased from ProQinase. PKD2, PDK1, CAMK II, and ASK1 were purchased from Upstate. A His6-tagged construct corresponding to residues 615-874 of human PRK1 (GenBank accession BC040061) was expressed in 293T cells by calcium phosphate transfection, and the corresponding protein was purified by nickel affinity chromatography. PKC- tagged at its N terminus with hexahistidine was expressed in baculovirus and purified by nickel affinity chromatography. PAK1 and PKC-θ were cloned into the Gateway system (Invitrogen) and expressed in vectors pDEST565 and pDEST26, respectively. Purification using nickel affinity chromatography was performed after expression. For all the proteins purified by us, SDS-PAGE and silver (or Coomassie Blue) staining were carried out to verify the purity.
Western Blot. The phosphorylation status of the activation loop of PAK1 mutants was checked by Western blot using specific anti-phospho T423 antibody (Cell Signaling Technology). Purified kinases in 1× SDS sample buffer were run in 8% SDS-PAGE gel and transferred onto nitrocellulose membranes. The membranes were then blotted with the appropriate antibody and developed with the ECL kit (Amersham Biosciences) following the manufacturer's protocol.
In Vitro Kinase Assay. Peptides were phosphorylated by in vitro kinase assay as described previously (8). In brief, in vitro phosphorylation in the presence of γ-32P-labeled ATP (in 100 μM cold ATP) was performed in 50 μl of solution under standardized conditions. For assays involving degenerate peptides, stoichiometry of phosphorylation was kept to <10%. After reaction termination, 50 pmol of substrate was transferred to streptavidin-coated plates, and emissions were counted after extensive washing. Calculations of the preference score for each position (such as P-3) are based on approaches described by Yaffe and coworkers (9) for analysis of phosphorylation of degenerate peptides. We extended those approaches and validated their usefulness in predicting phosphorylation both of peptides and of proteins (8). In concept, the preference score for a particular position (such as P-3) is a log2-based score determined by comparing the magnitude of phosphorylation of the 9 degenerate peptides with Arg at that position with average phosphorylation of all 45 degenerate peptides. Details of computation are as follows. 1) cpm values for each of the 45 peptides were converted to log2 cpm. 2) For each set of 9 peptides that shared an Arg at a position, the mean of the log2 cpm values was determined (mean P-7, mean P-6, ... mean P+3). 3) The preference score for Arg at a particular position was computed as the mean of its contribution in each of the 9 sets; for the P-3 position, for example, it is the mean of the values (log2 cpm of the P-3/P-7 peptide − mean P-7), (log2 cpm of the P-3/P-6 peptide − mean P-6), ... (log2 cpm of the P-3/P+3 peptide − mean P+3). Standard error was determined from the same nine values.
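The three computation steps can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' code: the function name `preference_scores`, the dictionary layout, and the use of the sample standard deviation divided by the square root of 9 for the standard error are assumptions, and the example counts are synthetic.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# Positions that can carry a fixed Arg: P-7..P-1 and P+1..P+3 (P0 is the fixed Ser).
POSITIONS = [p for p in range(-7, 4) if p != 0]

def preference_scores(cpm):
    """cpm maps frozenset({pos_a, pos_b}) -> counts for that R-pair peptide.

    Returns {position: (score, standard_error)} in log2 units.
    """
    # Step 1: convert every cpm value to log2 cpm.
    log_cpm = {pair: math.log2(value) for pair, value in cpm.items()}
    # Step 2: mean log2 cpm of the 9 peptides that share an Arg at each position.
    pos_mean = {
        p: mean(log_cpm[frozenset((p, q))] for q in POSITIONS if q != p)
        for p in POSITIONS
    }
    scores = {}
    for p in POSITIONS:
        # Step 3: contribution of Arg at p within each partner position's set.
        contrib = [log_cpm[frozenset((p, q))] - pos_mean[q]
                   for q in POSITIONS if q != p]
        # Assumed definition of the standard error (sample SD / sqrt(n)).
        sem = stdev(contrib) / math.sqrt(len(contrib))
        scores[p] = (mean(contrib), sem)
    return scores

# Synthetic counts (not measured data): Arg at P-3 boosts phosphorylation 8-fold.
example = {frozenset(pair): (800.0 if -3 in pair else 100.0)
           for pair in combinations(POSITIONS, 2)}
scores = preference_scores(example)
print(round(scores[-3][0], 2))  # 2.67
```

Because each peptide is compared with the mean of its partner position's set, a uniformly well or poorly phosphorylated library yields scores of zero at every position.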
Mutagenesis and Kinetic Parameters. All mutagenesis was carried out by following the protocol of the QuikChange site-directed mutagenesis kit (Stratagene). An A148E mutation was made in PKC-θ to make a constitutively active form from which further mutants were derived. Mutants included D508A (because A is present in GRK2, whose kinase domain has 55% sequence similarity over 270 residues) and E571I (because I is present in CamK IV, whose kinase domain has 54% similarity over 250 residues). For PAK1, kindly provided by J. Chernoff, mutations of H83L and H86L (PAK1-LL) were first made to obtain a constitutively active kinase, and then D393A or E456I mutations were introduced. Before kinase assays to determine Km and kcat, a Western blot was done to adjust the purified kinases to equivalent concentrations. Serial 2-fold dilutions of substrate peptide from 100 μM to 0.2 μM were used, and the assays were performed at 30°C for 10 min. Km and kcat were calculated using GraphPad Prism.
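As a rough illustration of how a substrate dilution series feeds a Michaelis-Menten estimate, the sketch below builds a 2-fold series and fits Km and Vmax by a Hanes-Woolf linearization. Several things here are assumptions rather than the method used in this work: the fit in the text was done in GraphPad Prism, whereas this sketch substitutes a simple linear transform; the choice of ten dilution points (ending near 0.195 μM, presumably the quoted 0.2 μM) is inferred; and the rates are synthetic.

```python
def serial_dilution(top=100.0, factor=2.0, points=10):
    """Concentrations (uM) for a serial dilution starting at `top`."""
    return [top / factor**i for i in range(points)]

def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def hanes_woolf_fit(substrate, rates):
    """Estimate (Km, Vmax) from the line s/v = s/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept / slope
    return km, vmax

# Synthetic, noise-free rates for a hypothetical enzyme (Km = 5 uM, Vmax = 2).
concs = serial_dilution()
rates = [michaelis_menten(s, vmax=2.0, km=5.0) for s in concs]
km, vmax = hanes_woolf_fit(concs, rates)
# kcat would follow as vmax divided by the enzyme concentration in the assay.
```

With real, noisy counts a direct nonlinear fit is generally preferable, because linear transforms distort the error structure.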

RESULTS
Previous studies have shown that strong preference for Arg at P-2 or P-5 (usually in addition to preference for Arg at P-3) is characteristic of multiple basophilic kinases (1). However, it would be helpful to know whether P-2 and P-5 are uniquely important when one takes into account the many unstudied human kinases. We approached this question in an alternate way. We analyzed the overall pattern of basophilic preference among human Ser/Thr kinases by determining the distribution of Arg or Lys in reported mammalian Ser/Thr phosphorylation sites (supplemental Fig. S2). Our analysis of 1441 sites confirms a clear non-random distribution of basic preference. P-2 and P-5 are the only two positions (in addition to P-3) at which the abundance of Arg is clearly greater than the average Arg frequency in the human proteome.
We therefore undertook analysis of the structural basis of this special pattern of basophilic preference. A degenerate peptide library approach was developed to quantitatively assess the position-specific preference of a kinase for Arg. Our "R-pair" method is based on a set of 45 degenerate peptides, each having exactly 2 fixed Arg and 1 fixed Ser (supplemental Fig. S3). The positions of the fixed Arg residues are varied systematically throughout the set so that it includes all possible combinations of two Arg from position P-7 to P+3. The remaining eight positions are synthesized with a degenerate mix of residues. Arg was chosen rather than Lys because it has been shown that the strongest preferences are generally for Arg, not Lys (Refs. 8, 15, and 16 and supplemental Fig. S2). The approach uses pairs of Arg (rather than a single Arg per peptide) for two reasons. First, we wanted to be able to investigate whether there were strong synergistic effects of pairs of Arg. Second, the anticipated low efficiency of phosphorylation of degenerate peptides with single fixed arginines could cause problems due to low signal-to-noise ratios. Validation of this R-pair set was performed by analysis of three kinases whose peptide specificity and Arg preference have been analyzed systematically by other approaches: PKA, p70S6K, and PKC-α (Fig. 1). An assay with any particular kinase provides data on phosphorylation of the 45 R-pair peptides (supplemental Fig. S4). Of the 45 values, 9 correspond to a peptide with Arg at a particular position (such as P-3). Robust assessment of Arg preference at a position is obtained by comparing phosphorylation of those 9 peptides with the remainder of the set. The end result of this calculation is a log2 score for each residue position that reflects the relative preference (if positive) or disfavor (if negative) for an Arg at that position compared with Arg at other positions in the peptide (Fig. 1).
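The combinatorics of the R-pair set can be checked with a short Python sketch. It models only the 11-residue core from P-7 to P+3; the function name `r_pair_peptides` and the use of "X" for a degenerate position are illustrative assumptions, and any flanking residues or the biotin tag of the real peptides are not represented.

```python
from itertools import combinations

# Ten positions that can carry a fixed Arg: P-7..P-1 and P+1..P+3.
POSITIONS = [p for p in range(-7, 4) if p != 0]

def r_pair_peptides(degenerate="X"):
    """Return {(pos_a, pos_b): core sequence} for every pair of Arg positions."""
    peptides = {}
    for pair in combinations(POSITIONS, 2):
        residues = []
        for pos in range(-7, 4):
            if pos == 0:
                residues.append("S")         # fixed phosphoacceptor Ser at P0
            elif pos in pair:
                residues.append("R")         # one of the two fixed Arg
            else:
                residues.append(degenerate)  # degenerate mix of residues
        peptides[pair] = "".join(residues)
    return peptides

peptides = r_pair_peptides()
print(len(peptides))       # 45
print(peptides[(-3, -2)])  # XXXXRRXSXXX
```

Choosing 2 of 10 positions gives the 45 peptides, and each position is paired once with each of the other nine, which is why every position is represented by exactly 9 peptides in the scoring.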
The results from the R-pair analysis are consistent with previous knowledge about the three kinases. PKA was discovered in the mid 1970s to have a simple and strong pattern of Arg preference at P-2 and P-3 (3), which is evident with the R-pair set. The peptide with Arg at P-2 and P-3 is uniquely well phosphorylated (supplemental Fig. S4), and the position-specific score likewise shows strong preference at P-2 and P-3 (Fig. 1). PKC is known to differ from PKA in having a more broadly distributed preference for Arg not only at P-2 and P-3 but also at C-terminal positions, especially P+2 (6, 15). This broader preference is reflected in the higher relative phosphorylation of many peptides (supplemental Fig. S4). In this context the position-specific scores (Fig. 1) are particularly helpful in distinguishing the most important Arg preferences at P-3, P-2, and P+2. Finally, p70S6K is known to have an Arg preference at P-3 and P-5 (17), which is likewise evident both in individual peptides (supplemental Fig. S4) and in position-specific scores (Fig. 1).
Five additional AGC kinases were then analyzed (Fig. 2). PKG, known to be similar to PKA in Arg preference, showed strong Arg preference at P-2 and P-3 and lesser preference at P-5. PRK1, PKC-, and PKC- each share with PKC-α the three positions of strongest Arg preference: P-3, P-2, and P+2. However, the relative importance of the positions varied between these closely related kinases. For PKC- and especially PKC-θ, the P-2 position was characterized by the strongest Arg preference. The especially strong preference of PKC-θ for Arg at P-2 was consistent in multiple experiments (cf. Fig. 3B) and differed from the other AGC kinases (for example PKA, PKG, and PRK) for which the strongest preference was at P-3. PDK1, whose Arg preference has not previously been studied in detail, differed from all seven other AGC kinases studied in lacking strong Arg preference at any position, including P-2 or P-3.
The accuracy of R-pair analysis depends on the assumption that the position of a strong Arg preference should be largely independent of other important preferred residues in a substrate. We were able to test this assumption for PKC, because it prefers a hydrophobic residue such as Phe or Leu at the P+1 position (8, 18). We created an additional test set of seven degenerate peptides, all of which had a fixed Phe at the P+1 position. The peptides also had a single fixed Arg whose position varied between P-4 and P+4. Analysis of phosphorylation of this set by PKC confirmed that the P-2 position was the single most important position for Arg. The next two most important were P+2 and P-3 (Fig. 3A). Thus, the relative importance of Arg placement as determined by the R-pair analysis was confirmed by the analysis using the set with Phe at P+1.
We used mutational analysis to investigate whether the especially strong P-2 Arg preference of PKC-θ had the same structural basis as the P-2 Arg preference of PKA. PKC-θ residues Asp-508 (PEN+1) and Glu-571 (YEM+1) were each mutated to a residue found at the corresponding positions of closely related kinases: Asp-508 to Ala and Glu-571 to Ile. Assessment of the kinetic parameters of these mutants demonstrated that the mutations had not impaired the catalytic capacity of the kinase on a favorable substrate with basic residues at multiple positions (supplemental Fig. S5).
Arg preferences of the mutant kinases were determined using the R-pair set (Fig. 3B). Wild-type protein purified from transfected mammalian cells showed the same dominant P-2 Arg preference observed with the commercial kinase preparation. The E571I mutant protein had greatly diminished preference for Arg at P-2. The D508A mutant protein also had a major decrease in preference for Arg at P-2. An additional mutant protein, D544A, served as a control. This residue was chosen because Asp-544 is the only other C-lobe acidic residue whose side chain is directed into the major groove. That protein showed no diminution of P-2 Arg preference. Thus the exceptional P-2 preference of PKC-θ is strongly influenced by E571 (YEM+1) and D508 (PEN+1).
The foregoing PKC-θ data, taken together with the previous PKA data, suggest that strong P-2 Arg preference may always depend on cooperative function of the same pair of acidic residues in the C-lobe: PEN+1 and YEM+1. We therefore analyzed conservation of these residues among AGC kinases and the two kinase subfamilies most closely related to the AGC subfamily: CAMK and STE. It is useful to describe a kinase by a two-character acidic pair pattern (yy, yn, ny, or nn) describing whether the kinase domain has an acidic residue at PEN+1 (first y or n) and at YEM+1 (second y or n). The results (TABLE ONE) demonstrate two notable features related to this acidic pair pattern. First, the AGC, STE, and CAMK subfamilies differ in their predominant acidic pair pattern. Most of the AGC kinases have a yy pattern (i.e. acidic residues at both positions). Most of the CAMK have a yn pattern (i.e. acidic residue at PEN+1 but not at YEM+1), and most of the STE kinases have an ny pattern (i.e. acidic residue at YEM+1 but not at PEN+1). Second, within each subfamily 10-25% of kinases deviate from the predominant pattern.
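The two-character acidic pair pattern is simple enough to express as a small Python helper. This is a sketch for illustration only; the function name `acidic_pair_pattern` is hypothetical, and the residues passed in are the one-letter codes at PEN+1 and YEM+1 taken from the text (for example Glu-171 and Glu-230 of PKA, or Asp-508 and Glu-571 of PKC-θ).

```python
ACIDIC = {"D", "E"}  # Asp and Glu, one-letter codes

def acidic_pair_pattern(pen_plus1, yem_plus1):
    """Return 'yy', 'yn', 'ny', or 'nn' for the residues at PEN+1 and YEM+1."""
    first = "y" if pen_plus1.upper() in ACIDIC else "n"
    second = "y" if yem_plus1.upper() in ACIDIC else "n"
    return first + second

print(acidic_pair_pattern("E", "E"))  # PKA (Glu-171, Glu-230): yy
print(acidic_pair_pattern("D", "E"))  # PKC-theta wild type (Asp-508, Glu-571): yy
print(acidic_pair_pattern("A", "E"))  # PKC-theta D508A mutant: ny
print(acidic_pair_pattern("D", "I"))  # PKC-theta E571I mutant: yn
```

In the framework developed here, only the yy pattern is expected to support strong P-2 or P-5 Arg preference, although a yy pattern does not guarantee it.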
Our foregoing analysis included eight AGC family kinases (Figs. 1 and 2). PDK1, the only one with a yn pattern, lacked strong P-2 Arg preference. Of the seven with a yy pattern, all but one had a strong P-2 Arg preference. The outlier was p70S6K, which showed a strong P-5 Arg preference instead of a P-2 Arg preference. This finding for p70S6K fits the concept that emerged from analysis of AKT1 (19), namely that the same pair of acidic residues can function in an alternative mode, which mediates strong preference for Arg at P-5 if a bulky hydrophobic residue changes access to that pocket (see "Discussion").
If acidic residues at the PEN+1 and YEM+1 positions are both required for strong P-2 or P-5 Arg preference, then most CAMK kinases should not have such preferences because most lack an acidic residue at YEM+1 (i.e. yn acidic pair pattern). This view is supported by studies of peptide specificity done on at least 13 CAMK. At least 10 of 13 kinases have been noted to have strong Arg preference at P-3, and none have been indicated to have strong P-2 Arg preference (18, 20-34). Because of this strong body of existing evidence, we conducted only a limited study of CAMK (Fig. 4). Our studies of two CAMK having the typical yn acidic pair pattern revealed strong P-3 Arg preference but not strong P-2 or P-5 Arg preference. This is consistent with the concept that the yy acidic pair pattern is necessary for strong P-2 Arg preference.
The PIM family of kinases is the only one in the CAMK superfamily (3 of 83) that has a yy acidic pair pattern of residues. PIM1 has been described as having a specificity of generalized basic preference from -5 to -2 (32). Our R-pair analysis provides a more quantitative view of the position-specific preference of PIM1 for Arg (Fig. 4). It shows a very strong dominant Arg preference at P-3 and P-5, with much more limited P-2 Arg preference. Thus, the PEN+1 and YEM+1 acidic residues, which are unique to PIM kinases in the CAMK subfamily, most likely confer a P-5 preference resembling their inferred function for AKT1 (see "Discussion").
If acidic residues at the PEN+1 and YEM+1 positions are both required for strong P-2 or P-5 Arg preference, then most STE kinases should not have such preferences, because most lack an acidic residue at PEN+1 (i.e. ny acidic pair pattern). In addition, STE kinases would be expected to lack strong P-3 Arg preference, because they lack the GEL+1 acidic residue that is present in all AGC kinases and most CAMK kinases (TABLE ONE) and is a major contributor to P-3 Arg preference (35-37). The literature provides few peptide specificity studies of such representative STE kinases with the ny acidic pair pattern, perhaps because these kinases are thought to be controlled by molecular interactions that increase substrate concentration ("recruitment") more than by peptide specificity (38). Consistent with this concept, our studies of MST4 (a representative STE kinase with an ny acidic pair pattern) demonstrated no Arg preference near P-3 (Fig. 5).
Of particular interest in the STE family are the 9 kinases with a yy acidic pair pattern. Three of these kinases (PAK1, PAK2, and PAK3) belong to the PAK I subfamily and three (PAK4, PAK5, and PAK6) belong to the PAK II subfamily. Analysis of PAK1 and PAK4 demonstrated that both had an exceptionally strong preference for Arg at P-2 (Fig. 5). Unlike the other basophilic kinases we have studied, the PAKs had very limited preference for Arg at P-3, consistent with their lack of an acidic residue at GEL+1; this contrasts with their dramatic P-2 Arg preference. The remaining three STE kinases with a yy acidic pair pattern belong to a branch of the MAPKKK family (ASK, COT, and TAK1). Analysis of one of the MAPKK kinases, ASK-1, showed that its strongest Arg preference was at P-2, but that preference was so weak that its biological relevance is unclear. Those results suggest that other residues in the kinase domain prevent the PEN+1 and YEM+1 acidic residues in ASK-1 from mediating strong P-2 Arg recognition. We therefore explored possible structural explanations by creating a homology model of ASK-1 with SwissModel. The resulting model shows that the side chain of Lys-769, located on AlphaD, points into the major groove in close proximity to YEM+1, which would position it to disrupt the acidic pocket for P-2/P-5 Arg binding. Notably, that basic residue is conserved in COT and TAK1. Thus, the six members of the PAK family are the only kinases of the 48-member STE family that contain a residue pattern our analysis predicts will give a strong P-2 or P-5 Arg preference.
The singular preference of PAKs for Arg at P-2 is consistent with their yy acidic pair pattern. We therefore tested whether Asp-393 (PEN+1) and Glu-456 (YEM+1) of PAK1 confer its preference for Arg at P-2. In planning this analysis we noted that autophosphorylation is an important feature of PAK activation (39-41). Multiple residues are autophosphorylated, including a critical residue in the activation loop. Our analysis of the amino acid sequence surrounding the activation loop site in PAK family members shows that all six PAKs have Arg at the P-2 position relative to the activation loop autophosphorylation site (Fig. 6A). This is notable, because P-2 is the location of strong Arg preference of PAKs (Fig. 5). We predicted therefore that mutation of those two residues (PEN+1 and YEM+1) would prevent activation loop autophosphorylation. When the relevant mutant constructs were expressed in bacteria, they had increased mobility in PAGE gel, consistent with changes in autophosphorylation (Fig. 6B). Virtually complete loss of activation loop phosphorylation (<2%) was observed with E456I and D393A (Fig. 6C). PAK1 protein from D393A and E456I constructs had such complete loss of catalytic function that no activity could be detected above background on any of the four substrates we had found to be best for phosphorylation by PAK1 (Fig. 6D). This defect was not secondary to failure of activation loop phosphorylation, because mutation of the activation loop Thr-423 to Glu did not rescue kinase activity (data not shown).

DISCUSSION
The foregoing studies analyze the strength of Arg preference at the P-2 and P-5 positions among diverse basophilic kinases and the structural basis for that preference. The discussion will focus on the following issues: 1) characteristics of the binding pocket for P-2 Arg in the major groove of the kinase C-lobe; 2) use of the same pocket for P-5 Arg binding by some kinases; and 3) uniqueness of PAK family dependence on P-2 Arg recognition.
The current studies demonstrate that acidic PEN+1 and YEM+1 residues that were first found to dictate strong P-2 Arg preference for PKA (Fig. 7B) are also the mediators of strong P-2 Arg preference for other kinases of the AGC and STE subfamilies. This pair of residues is present in most AGC kinases (TABLE ONE). The crystal structure of an additional AGC kinase, PKC-θ, reveals the same general organization of the binding pocket in the major groove of the kinase C-lobe (Fig. 7C) (42). Our mutation analysis shows that the PEN+1 and YEM+1 residues mediate the especially strong P-2 Arg preference of PKC-θ (Fig. 3). This may not be particularly surprising, because PKC-θ is in the same kinase subfamily as PKA. However, the PAK family of kinases is much more distantly related to PKA, and their kinase domains are only 28% identical. Nevertheless they: 1) have the same general organization of the binding pocket (Fig. 7E) and 2) use exactly the same two acidic residues to mediate their strong P-2 Arg preference (Fig. 6).
Our studies are also consistent with the concept proposed by Barford and colleagues (19) that the same residues that mediate strong P-2 Arg binding (PEN+1 and YEM+1) can operate in an alternative mode in which they mediate highly preferential binding of P-5 Arg by AKT1. We demonstrate here that PIM1 resembles AKT1 in having dominant P-3/P-5 Arg preference (Fig. 4). In keeping with these findings, a recent crystal structure (PDB identifier: 2BIL) demonstrates that PEN+1 and YEM+1 co-ordinate with the P-5 Arg in bound substrate. PIM-2, which is the most divergent PIM family member (only 61% identity to PIM-1 in the kinase domain), has recently been shown to also have dominant P-3/P-5 Arg preference (34). Moreover, we have confirmed that p70S6K has dominant P-3/P-5 Arg preference (Fig. 1). Thus, AKT1, PIM-1, PIM-2, and p70S6K all have yy acidic pair patterns and have strong P-5 Arg preference. Conversely, no kinases lacking the yy acidic pair pattern have been observed to have strong P-5 Arg preference. It is thus highly likely that the PEN+1 and YEM+1 residues are the primary mediators in eukaryotic protein kinases of P-5 Arg recognition in the P-3/P-5 pattern.
Thus, PEN+1 and YEM+1 acidic residues cooperate to create strong specificity for P-2 or P-5 Arg (TABLE TWO). Cooperation of these two residues was evident in PKA as assessed both in the crystal structure (1ATP) and by mutagenesis. Two arguments strongly support a more general cooperativity of these two residues in many kinases. First, combined analysis of sequence conservation and Arg preference for P-2 or P-5 indicates that strong preference is almost always observed in kinases with the yy acidic pair pattern and only in those kinases. For example, CAMKII and PKD, which lack an acidic residue at YEM+1 (Fig. 7D), have weak P-2 Arg preference. Second, mutational analysis of both PKC-θ and PAK demonstrates that both residues contribute greatly to this strong Arg preference.
It is worth noting that mutations of PKC-θ and PAK at the YEM+1 residue cause a greater reduction in P-2 Arg preference than do mutations of the PEN+1 residue. This is notable, because the YEM+1 residue has been minimally studied. For example, the YEM+1 residue is not included in the set of kinase domain residues used by Brinkworth et al. (43) to predict kinase specificity.
The simplest interpretation of this cooperativity between the PEN+1 and YEM+1 acidic residues is that having two acidic residues provides stronger electrostatic interaction than one acidic residue. However, we infer that the underlying mechanism may be more sophisticated. In theory, evolution could have used a different pair of residues in the major groove to achieve similar cooperativity. Instead, evolution has repeatedly chosen exactly these two residues. This outcome is particularly remarkable for PAK kinases and PIM kinases, because their evolutionary distance from AGC makes it very likely that utilization of these two acidic residues is by convergent evolution (or gene recombination). Thus, these two residues appear to show a singular capacity for serving the role of Arg-based interaction between kinase and substrate.
Although the PEN+1/YEM+1 pair of acidic residues has singular importance, there are also multiple lines of evidence that the role of this pair is shaped by other sequence elements in the kinase domain as follows. 1) Whether this pair dictates P-2 Arg preference or P-5 Arg preference may be determined by a bulky hydrophobic residue on helix AlphaD (19). 2) The absolute strength of P-2 Arg preference, which PEN+1/YEM+1 confers, is likely to also be modulated by the particular arrangement of other residues near the catalytic site. In our studies the most extreme example of this is ASK-1, which has a weak P-2 Arg preference despite the presence of this pair (Fig. 5), which, as discussed under "Results," is likely due to a basic residue on AlphaD disrupting binding by the acidic pair. A more subtle example of this variation is evident in comparisons between closely related kinases. Arg P-2 preference is stronger in PKC-θ than in PKC-α and PRK1 (Figs. 1 and 2) even though those kinases are very closely related and have virtually complete conservation in the catalytic cleft (data not shown). It is possible that this reflects subtle differences in spacing or relative positioning between the PEN+1/YEM+1 residue pair.

TABLE ONE. Tabulation of patterns of acidic residues among AGC, CAMK, and STE kinases
For standardization, we used family grouping of kinases and sequence alignments from Manning et al. (45). Four distinct acidic pair patterns (yy, yn, ny, and nn) are possible, which describe the presence or absence of acidic residues at the PEN+1 and YEM+1 positions. For each acidic pair pattern, the number of kinases in the three kinase subfamilies has been tabulated. Also tabulated (bottom row) is the frequency of kinases in each family with an acidic residue at GEL+1.
The specificity of PAK is generally summarized as (K/R)RXS based on the informative studies of Traugh and colleagues (44). However, we view PAK as being a singular basophilic kinase whose differences from other basophilic kinases are not fully highlighted by the representation "(K/R)RXS." PAK specificity for Arg at P-2 differs quite dramatically from other basophilic kinases. First, kinases in the PAK I and PAK II subfamilies do not have a strong Arg preference at P-3, which sets them apart from virtually all other known basophilic kinases. This reflects a fundamental design difference in all STE kinases, which lack the GEL+1 acidic residue that is almost always used by AGC and CAMK to create a P-3 Arg preference. Second, PAK's dependence on P-2 Arg recognition is exceptional. In a screening for PAK phosphorylation of a set of 96 proteomic peptides having increased frequency of Arg in the vicinity of phosphorylatable residues, PAK1 and PAK4 each phosphorylated 16 of the peptides (i.e. 16 were at least 1/10th as well phosphorylated as the best peptide substrate). All 16 of those phosphorylated peptides had Arg at P-2, but only half of the more poorly phosphorylated peptides did (p < 10⁻⁴). Such perfect concordance of phosphorylation with Arg at P-2 was not observed with any of the other 16 kinases assayed similarly (data not shown). This is consistent with the virtual abolition of catalytic activity in PAK1 mutants at PEN+1 and YEM+1. Thus PAK is singularly dependent on Arg at P-2, which warrants strong consideration when predicting/confirming in vivo PAK phosphorylation sites.
Protein kinases are among the largest families of genes in eukaryotes (45). Although other protein folds can mediate phosphorylation, the eukaryotic protein kinase (ePK) domain has been by far the most "successful" fold in evolutionary terms, i.e. the one that has been most frequently used by evolution to mediate phosphorylation (46). The present studies highlight a special role for two acidic residues, PEN+1 and YEM+1, in the C-lobe of the ePK domain. These particular residues have been consistently used by evolution to create strong preference for a basic residue at the P-2 or P-5 position in substrates. The PAK family illustrates a singular example of this strategy, in which P-2 Arg interaction with this pair of residues is essential for kinase activity.

Table footnotes: a, The sole kinase that does not fit the correlate is ASK-1, which has a basic residue nearby that is predicted to disrupt binding by the acidic pocket (see "Results"). b, In addition to kinases studied here, studies of many CAMK (18, 20-34) demonstrate that they fit into this category.

Fig. 7. A, PKA structure (PDB identifier 1ATP) is shown to orient the viewer for B-E. The catalytic cleft is facing the viewer, the kinase domain is space-filled, and the PKI (heat-stable inhibitor of cAMP-dependent protein kinase) peptide is shown as a ribbon diagram. Color code: gray is hydrophobic; red is acidic; blue is basic; and purple is polar. B, detail for PKA that corresponds to the region highlighted by the white rectangle in A. PKI is shown as a yellow wire frame. The PKI Arg side chain at P-2 (yellow nitrogen spheres) is in contact with PEN+1 and YEM+1, and that at P-3 (green nitrogen spheres) is in contact with GEL+1. C, corresponding region of PKC-θ (PDB identifier: 1XJD). D, corresponding region of PKD1 (from model generated by SwissModel). E, corresponding region of PAK1 (PDB identifier 1F3M). F, tabulation of residue numbers in the four kinases shown. Note: all structures have acidic residues at PEN+1 (marked by "1"); all except PKD1 have acidic YEM+1 (marked by "2"); all except PAK have acidic GEL+1 (marked by "3").