Sequence Specificity of SHP-1 and SHP-2 Src Homology 2 Domains

A combinatorial phosphotyrosyl (pY) peptide library was screened to determine the amino acid preferences at the pY+4 to pY+6 positions for the four SH2 domains of protein-tyrosine phosphatases SHP-1 and SHP-2. Individual binding sequences selected from the library were resynthesized and their binding affinities and specificities to various SH2 domains were further evaluated by SPR studies, stimulation of SHP-1 and SHP-2 phosphatase activity, and in vitro pulldown assays. These studies reveal that binding of a pY peptide to the N-SH2 domain of SHP-2 is greatly enhanced by a large hydrophobic residue (Trp, Tyr, Met, or Phe) at the pY+4 and/or pY+5 positions, whereas binding to SHP-1 N-SH2 domain is enhanced by either hydrophobic or positively charged residues (Arg, Lys, or His) at these positions. Similar residues at the pY+4 to pY+6 positions are also preferred by SHP-1 and SHP-2 C-SH2 domains, although their influence on the overall binding affinities is much smaller compared with the N-SH2 domains. A structural model was generated to qualitatively interpret the contribution of the pY+4 and pY+5 residues to the overall binding affinity. Examination of pY motifs from known SHP-1 and SHP-2-binding proteins shows that many of the pY motifs contain a hydrophobic or positively charged residue(s) at the pY+4 and pY+5 positions.

enzymes share highly similar sequences and three-dimensional structures, and yet they have quite different functions in vivo (2-4). SHP-1, which is mainly expressed in hematopoietic and less in epithelial cells, predominantly acts as a negative regulator of cellular signaling pathways induced by transmembrane receptors such as EpoR, c-Kit, and FcγRIIb1 (5-8). In contrast, the ubiquitous SHP-2 is a positive regulator in the signal transduction of receptors like insulin receptor, platelet-derived growth factor receptor, and EpoR (2-4, 9). The association of the N-SH2 domain of SHP-1 or SHP-2 with an activated receptor results in the stimulation of their phosphatase activity. Indeed, both phosphatases exist in an inactive form in the cytosol, with the N-SH2 domain occluding the substrate-binding pocket of the PTP domain. The occupation of the N-SH2 domain with a phosphotyrosine-containing ligand leads to the dissociation of the inactive N-SH2·PTP complex and subsequently to the activation of the enzymes (10-12).
Specific association of an SH2 domain with a cognate pY protein is mediated by the peptide motif surrounding the pY residue. It has been assumed that for most SH2 domains, sequence specificity is dictated by the three residues immediately C-terminal to pY (position +1 to +3 relative to pY, which is defined as position 0). Indeed, most of the previous studies on SH2 domain specificity have focused on this minimal recognition motif (pY to pY+3) (13). The sequence specificity of SHP-1 and SHP-2 SH2 domains has previously been determined by using combinatorial libraries (14-16). Unlike the classical SH2 domains, SHP-1 and SHP-2 SH2 domains require amino acid residues both N- and C-terminal to pY (positions pY-2 to pY+3) for high affinity binding. The two C-SH2 domains bind pY peptides of a single consensus sequence (T/V/I/y)XpY(A/s/t/v)X(I/V/L), which is also recognized by the two N-SH2 domains. In addition, the N-SH2 domain of SHP-1 binds to a second consensus LXpY(M/F)X(F/M) (class II), whereas the SHP-2 N-SH2 domain can bind to four other distinct classes of sequences (classes II-V) (14, 15). There is accumulating evidence that residues beyond pY+3 may also significantly affect the binding affinity and specificity of SH2 domains. Case et al. (17) found that a 9-mer peptide (LNpYIDLDLV) derived from IRS-1 pY1172 bound SHP-2 N-SH2 domain more tightly than hexapeptide NpYIDLD. It is now clear that the shorter peptide is missing an important hydrophobic residue at position pY-2. However, the residues at pY+4 to pY+6 positions also contribute to the tighter binding of the 9-mer peptide. Similarly, a longer peptide corresponding to platelet-derived growth factor receptor pY1009 (DTSSVLpYTAVQPNE) bound to SHP-2 SH2 domain with a 2-fold higher affinity than a shorter peptide (VLpYTAV) (18-20).
Further evidence for a potential involvement of residue pY+5 in binding to the N-SH2 domain of SHP-2 came from studies on the consensus binding motifs of the SH2 domains of SOCS-3 and SHP-2 (16). Screening of a peptide library suggested that the N-SH2 domain of SHP-2 prefers an aromatic residue (Trp and Phe) at pY+5. Previous studies with SHP-1 showed a similar involvement of the C-terminal residues outside the region pY-2 to pY+3 for high affinity binding (10, 21).
The above data clearly demonstrate the important contributions of residues outside the pY-2 to pY+3 region to the binding affinity of SHP-1 and SHP-2 SH2 domains. However, to the best of our knowledge, a systematic study of the sequence specificity at positions from pY+4 and beyond has not been carried out. Also, a comparison of the specificities of SHP-1 and SHP-2 SH2 domains for these positions has not been described, although this could be crucial for understanding their distinct, yet overlapping, specificities. In this work, a peptide library has been designed and screened against the four SH2 domains of SHP-1 and SHP-2 to systematically assess the sequence specificity at pY+4 to pY+6 positions. The results show that the two N-SH2 domains strongly prefer a large hydrophobic or positively charged residue at these positions, whereas the C-SH2 domains have broader specificity.

MATERIALS AND METHODS
Synthesis of the pY Library-The library was prepared by manual solid-phase peptide synthesis using Fmoc/tert-butyl chemistry on 2.5 g of 90-μm TentaGel S NH2 resin with a loading capacity of 0.3 mmol/g (Advanced ChemTech, Louisville, KY). A standard procedure with double couplings in the presence of a 4-fold excess of Fmoc-amino acid, O-benzotriazole-N,N,N′,N′-tetramethyluronium hexafluorophosphate, and N-hydroxybenzotriazole and an 8-fold excess of diisopropylethylamine in N,N-dimethylformamide was used, as described previously (15). To distinguish isobaric amino acids during peptide sequencing by mass spectrometry, Ac-Gly (5%) was added to the coupling reactions of Leu and Lys, and Ac-Ala (5%) was added to the coupling reaction of norleucine (22). Individual peptides were synthesized on 70 mg of Rink resin with a loading capacity of 0.2 mmol/g (Advanced ChemTech, Louisville, KY). Biotin and a linker were added to the N terminus using Fmoc-8-amino-3,6-dioxooctanoic acid and biotin as described elsewhere (23). The peptides were cleaved from the resin using 1 ml of 95% trifluoroacetic acid and 150 μl of reagent K (0.5 ml of thioanisole, 0.25 ml of ethanedithiol, and 0.75 g of phenol) for 2-3 h at room temperature. The products were precipitated in cold diethyl ether and lyophilized. Finally, all peptides were purified by reversed-phase high pressure liquid chromatography on a Waters™ 600 system (Milford, MA) equipped with a Vydac 218TP column (5-μm particle size, 300-Å pore size, 4.6 × 25 mm). The identity of final products was confirmed by analysis on a matrix-assisted laser desorption ionization time-of-flight mass spectrometer (Bruker Reflex III) in an automated manner using α-hydroxycinnamic acid as the matrix.
Protein Expression and Purification-All SH2 domains used in the library screening were fusion proteins with the maltose-binding protein (MBP-SH2). The MBP-SH2 proteins were biotinylated prior to use. Binding studies by surface plasmon resonance (SPR) were performed with isolated SH2 domains containing an N- or C-terminal His6 tag. Expression, purification, and biotinylation of all of these SH2 proteins were previously described (15). Full-length SHP-1 and SHP-2 were also expressed and purified as previously described (10, 24).
pY Library Screening-The screening of the pY library was performed according to Sweeney et al. (15). In four independent experiments, 30-50 mg of the pY library were extensively washed with dichloromethane, methanol, double-distilled H2O, and HBST buffer (30 mM HEPES, pH 7.4, 150 mM NaCl, and 0.01% Tween 20) and blocked for 1 h using HBST buffer containing 0.1% gelatin. Incubation with different concentrations of a biotinylated MBP-SH2 domain (0.5-2.0 nM final concentration) was performed for 4-16 h at 4°C. Biotinylated MBP protein was used in the control experiment. After treatment with streptavidin-conjugated alkaline phosphatase and staining using 5-bromo-4-chloro-3-indolyl-phosphate, the resin was extensively washed with water. Beads with the highest color intensity were selected manually under a dissecting microscope and sequenced by partial Edman degradation/mass spectrometry (15, 22).
Stimulation of Phosphatase Activity-Stimulation of SHP-1 and SHP-2 activities was measured using p-nitrophenyl phosphate as substrate (10, 24). The assay reaction (total volume of 50 μl) contained 50 mM HEPES, pH 7.4, 50 mM NaCl, 1 mM EDTA, 1 mM tris(carboxyethyl)phosphine, 10 mM p-nitrophenyl phosphate, 0-500 μM pY peptide, and 0.5 μg of SHP-1 or 5 μg of SHP-2. The reaction was initiated by the addition of SHP-1 (or SHP-2) as the last component and allowed to proceed at room temperature for 30 min. The reaction was quenched by adding 950 μl of 1 M NaOH. The absorbance at 405 nm was measured on a PerkinElmer Lambda 20 UV-visible spectrophotometer. The SHP-1 (or SHP-2) activity reported is relative to that of SHP-1 (or SHP-2) in the absence of pY peptide. All measurements were carried out in duplicate, and the results are given as the average of two independent experiments.
Affinity Measurement by SPR-The binding affinity of pY peptides to the SH2 domains of SHP-1 and SHP-2 was assessed by SPR analysis on a BIAcore 3000 instrument (Amersham Biosciences). Biotinylated pY peptides were immobilized on streptavidin-coated SA 5 biosensor chips. Prior to immobilization, the sensor chip was conditioned with a solution containing 1 M NaCl and 50 mM NaOH (aqueous) according to the manufacturer's instructions. The binding assays were conducted at room temperature in Hepes-buffered saline-EP buffer (10 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.005% polysorbate 20). Each pY peptide (3-5 μl) was injected at a flow rate of 15 μl/min until a suitable and constant level of response units (400-500 RU) was obtained. Varying concentrations of histidine-tagged SH2 proteins (0.05-5.0 μM) were passed over the immobilized peptides for 2 min. To remove the protein from the peptide, a regeneration buffer (10 mM NaOH, 200 mM NaCl, 0.05% SDS) was applied. The equilibrium response unit (RUeq) at a given SH2 protein concentration was obtained by subtracting the response of the blank flow cell from that of the flow cells containing the pY peptide. The dissociation constants (KD values) were calculated using the equation $RU_{eq} = RU_{max}[\mathrm{SH2}]/(K_D + [\mathrm{SH2}])$, where RUeq is the measured response unit at a given SH2 domain concentration, [SH2] is that concentration, and RUmax is the maximum response unit.
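As an illustration of how a KD value can be extracted from equilibrium SPR responses, the following is a minimal sketch that assumes a simple one-site binding model, RUeq = RUmax[SH2]/(KD + [SH2]). The function name `fit_one_site`, the search bounds, and the synthetic data are hypothetical and are not taken from the original analysis, which may have used different fitting software.

```python
import math

def fit_one_site(conc, ru_eq, kd_lo=1e-3, kd_hi=1e3, iters=200):
    """Least-squares fit of RU_eq = RU_max * c / (K_D + c).

    For any trial K_D the best RU_max has a closed form, so only K_D
    is searched, by golden-section search on log10(K_D).
    Concentrations and K_D share the same unit (e.g. micromolar).
    """
    def best_rumax(kd):
        f = [c / (kd + c) for c in conc]
        return sum(y * fi for y, fi in zip(ru_eq, f)) / sum(fi * fi for fi in f)

    def sse(log_kd):
        kd = 10.0 ** log_kd
        rumax = best_rumax(kd)
        return sum((y - rumax * c / (kd + c)) ** 2 for c, y in zip(conc, ru_eq))

    a, b = math.log10(kd_lo), math.log10(kd_hi)
    g = (math.sqrt(5) - 1) / 2  # inverse golden ratio
    for _ in range(iters):
        x1 = b - g * (b - a)
        x2 = a + g * (b - a)
        if sse(x1) < sse(x2):
            b = x2
        else:
            a = x1
    kd = 10.0 ** ((a + b) / 2)
    return kd, best_rumax(kd)

# Synthetic, noise-free example (hypothetical values, not measured data).
conc_um = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
true_kd, true_rumax = 0.26, 450.0
responses = [true_rumax * c / (true_kd + c) for c in conc_um]
kd_fit, rumax_fit = fit_one_site(conc_um, responses)
```

Profiling out RUmax keeps the search one-dimensional, which avoids any dependence on external fitting libraries; with real, noisy data a standard nonlinear least-squares routine would serve equally well.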
In Vitro Pulldown Assay-Raw 264.7 murine macrophage cells were lysed in TN1 lysis buffer (50 mM Tris, pH 8.0, 10 mM EDTA, 10 mM Na4P2O7, 10 mM NaF, 1% Triton X-100, 125 mM NaCl, 10 mM Na3VO4, and 10 μg/ml each of aprotinin and leupeptin). Nuclei were removed by centrifugation for 10 min at 13,000 rpm at 4°C. Equal amounts of protein from each sample were incubated overnight with 5 μg of a biotinylated pY peptide at 4°C. Streptavidin-agarose beads were added to the samples, which were then incubated for 1 h at 4°C. Beads were washed twice with TN1 buffer and boiled in SDS sample buffer (60 mM Tris, pH 6.8, 2.3% SDS, 10% glycerol, 0.01% bromphenol blue, and 1% 2-mercaptoethanol) for 5 min, and the eluted proteins were separated by SDS-PAGE. Following separation, proteins were transferred to nitrocellulose membranes and then incubated with anti-SHP-1, anti-SHP-2, or anti-SHIP antibodies overnight at 4°C. After washing, the blots were incubated with horseradish peroxidase-conjugated secondary antibodies, washed again, and briefly incubated in ECL (Amersham Biosciences) for chemiluminescent detection on x-ray film. The assays were repeated twice. For binding competition, cell lysates were prepared as described above and incubated for 1 h at 4°C with 5 μg each of biotinylated wild-type FcγRIIb1 peptide and a nonbiotinylated wild-type or mutant peptide as competitor. Streptavidin-agarose beads were added to the samples and incubated for 1 h at 4°C. Electrophoresis and Western blot analysis were performed as described above. The competition experiments were repeated three times.
Molecular Modeling-All structural models were generated based on the crystal structure of SHP-2 N-SH2 domain in complex with peptide SVLpYTAVQP (Protein Data Bank code 1aya) (20). Two separate modeling strategies were employed. In the first sequential method, the last two residues of the peptide in the Protein Data Bank 1aya structure were replaced by a Trp residue (position pY+4) using the SPDBV program (25). Several different sterically possible starting structures with different side chain conformations of the Trp residue were generated (using SPDBV). These structures served as the starting structures for energy minimization and molecular dynamics (MD) simulations (20-40 ps at 310 K) using the Amber8 program package (26) and employing a generalized Born continuum solvent model (27). After the MD, the structures were again energy-minimized. During the energy minimization and MD simulations, the Trp residue was completely free to move; however, the rest of the structure was weakly restrained to the coordinates in the experimental structure (force constant: 0.05 kcal mol⁻¹ Å⁻²). This weak restraint in principle allows adjustment of the structure by up to 1-2 Å from the x-ray structure. The lowest energy structure of the search nicely filled a cavity between loops 65-68 and 87-94. Based on the coordinates of this structure, two more residues were added (Phe and Ala) at the C terminus using SPDBV. The resulting structure was energy-minimized and subjected to an MD simulation and subsequent energy minimization as described above. An independent "direct" modeling strategy was also employed by adding the three residues (WFA) at once at the C terminus of the minimal sequence and performing a replica exchange MD simulation (as implemented in Amber8) (26) using the same restraints and continuum solvent model as described above and using the following range of simulation temperatures. 
A very similar final low energy structure was obtained using this alternative modeling strategy.

RESULTS
Design, Synthesis, and Screening of pY Library-We constructed a pY peptide library (AAX1NpYX2QX3X4X5X6LNBBRM-resin) that contains partially randomized positions at pY-2 (X1 represents Leu, Ile, Val, or Thr), pY+1 (X2 represents Ala, Thr, or Val), and pY+3 (X3 represents Leu, Ile, or Val), and completely randomized positions at pY+4 to pY+6 (X4-X6 represent L-norleucine or any of the 18 proteinogenic L-amino acids other than Met and Cys; B represents β-alanine). An LNBBRM linker was added at the C terminus to facilitate mass spectrometric analysis (positively charged Arg) and cleavage from the resin (after Met with CNBr) (15). The amino acids at the partially randomized positions were chosen according to the type I consensus sequences for the SH2 domains of SHP-1 and SHP-2 (14, 15). Asn and Gln were fixed at the pY-1 and pY+2 positions, respectively, because they are tolerated at these positions by all four SH2 domains. The theoretical diversity of the library is 2.47 × 10^5. Since all of the library members contain the most preferred residues at critical positions (pY-2, pY+1, and pY+3), we anticipated that, given sufficiently high concentration, the SH2 domains would bind to most (if not all) of the beads. In order to identify sequences with the highest affinities to the SH2 domains and thus define the sequence specificity at pY+4 to pY+6 positions, lower concentrations of SH2 proteins (0.5-2.0 nM) were used in the current screening than in previous screenings (10-50 nM) (15). Typically, 150 mg of the library (~4.3 × 10^5 beads) were screened against each SH2 domain (usually in several batches), resulting in 100-150 intensely colored beads. These positive beads were isolated from the library and sequenced by partial Edman degradation/mass spectrometry (15, 22).
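The stated theoretical diversity follows directly from the number of building blocks allowed at each randomized position (4 at pY-2, 3 at pY+1, 3 at pY+3, and 19 at each of pY+4 to pY+6). The short sketch below reproduces that arithmetic and also estimates how many beads were screened per unique sequence; the variable names are illustrative only.

```python
from math import prod

# Number of residues allowed at each randomized library position.
choices_per_position = {
    "pY-2 (X1)": 4,   # Leu, Ile, Val, Thr
    "pY+1 (X2)": 3,   # Ala, Thr, Val
    "pY+3 (X3)": 3,   # Leu, Ile, Val
    "pY+4 (X4)": 19,  # Nle + 18 proteinogenic amino acids (no Met, Cys)
    "pY+5 (X5)": 19,
    "pY+6 (X6)": 19,
}

diversity = prod(choices_per_position.values())  # 4 * 3 * 3 * 19**3

# Approximate number of beads screened per SH2 domain (from the text).
beads_screened = 4.3e5
coverage = beads_screened / diversity  # beads per unique sequence

print(f"theoretical diversity: {diversity:,}")
print(f"beads per unique sequence: {coverage:.2f}")
```

The product is 246,924, which rounds to the 2.47 × 10^5 quoted above, so each sequence is represented by fewer than two beads on average in a typical screen.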
Sequence Specificity of SHP-1 and SHP-2 SH2 Domains at pY+4 to pY+6 Positions-A total of 361 peptide sequences were obtained for the four SH2 domains (Table 1). For the vast majority of the peptides, their amino acid sequences were unambiguously determined at all positions. A small fraction of the peptides had some ambiguities at one or more positions from pY-2 to pY+3; these positions are indicated by an X in Table 1. It is clear from these sequences that all four SH2 domains exhibit sequence specificity at positions pY+4 to pY+6, especially at the pY+4 and pY+5 positions. In general, the analysis showed that the SH2 domains share common structural features but also have clear differences.
The N-SH2 domain of SHP-1 strongly prefers a large hydrophobic (Trp, Tyr, Phe, Leu, or Ile) or positively charged residue (Arg, Lys, or His) at the pY+4 position (Table 1 and Fig. 1). Approximately 80% of the selected peptides contained a hydrophobic aromatic residue (Trp > Tyr >> Phe) or a basic amino acid (His, Arg >> Lys) at this position. A similar set of residues was also selected at the pY+5 position, although positively charged residues (Arg, Lys, and His) appear to be more favorable than hydrophobic ones (Trp, Tyr, Phe, and Pro). At the pY+6 position, a wide variety of amino acids were accepted, with some preference for a positively charged residue (Arg, Lys, or His). Note that the combination of residues preferred at pY+4 to pY+6 positions is not random. Whereas certain combinations such as W(Y/F)G, WH(R/K), WH(T/S), YRR, RR(F/H/K), and HR(R/H) appeared many times, other more probable combinations, such as RRR, WWR, HHR, FFG, and FYG, were not observed at all. This underscores the importance of obtaining individual sequences.
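The point about non-random combinations can be made concrete by comparing how often a pY+4 to pY+6 motif is actually observed with how often it would be expected if the three positions were selected independently, which is all that pooled sequencing can report. The sketch below uses a small, entirely hypothetical list of tripeptides (not the sequences of Table 1), and the function name `expected_count` is illustrative.

```python
from collections import Counter

def expected_count(tripeptides, motif):
    """Expected occurrences of `motif` if the three positions were
    independent, based on per-position residue frequencies."""
    n = len(tripeptides)
    expected = float(n)
    for i, residue in enumerate(motif):
        position_counts = Counter(t[i] for t in tripeptides)
        expected *= position_counts[residue] / n
    return expected

# Hypothetical pY+4..pY+6 sequences from individually sequenced beads.
selected = ["WYG"] * 4 + ["WHR"] * 3 + ["RRF"] * 3 + ["HRH"] * 2
observed = Counter(selected)

for motif in ("WYG", "WRR", "RRR"):
    print(motif, observed[motif], round(expected_count(selected, motif), 2))
```

In this toy set WYG is seen several times more often than positional frequencies alone would predict, while WRR and RRR are composed of individually frequent residues yet never occur. Only individual sequences reveal this kind of covariance.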
Like the SHP-1 N-SH2 domain, the N-SH2 domain of SHP-2 strongly prefers hydrophobic aromatic residues (e.g. Trp, Tyr, and Phe) at the pY+4 and pY+5 positions (Table 1 and Fig. 1a). Nearly all of the selected peptides contained at least one hydrophobic residue at these positions. At the pY+6 position, predominantly small residues, such as proline, glycine, alanine, serine, and threonine, and basic amino acids, such as arginine and histidine, were selected. Thus, commonly observed motifs include W(Y/F)(G/A/P), WHX (where X represents a small or basic amino acid), (Y/F)(F/W)(P/R), and M(F/Y)P. A major difference between the two N-SH2 domains is that the SHP-2 N-SH2 domain does not prefer positively charged residues at the pY+4 or pY+5 position; only histidine is found in some of the peptides. These findings are in good agreement with the results of De Souza et al. (16), who reported that the amino acids Trp and Phe are frequently found at pY+4 among SHP-2 N-SH2 domain-binding peptides.
In summary, the four SH2 domains of SHP-1 and SHP-2 show sequence specificity at pY+4 to pY+6 positions. However, unlike positions pY+1 and pY+3, where selectivity is absolute and a single amino acid is often required for binding, the SH2 domains have broader specificity at pY+4 to pY+6 positions, preferring a group of residues rather than a single amino acid. The N-SH2 domain of SHP-2 favors large hydrophobic amino acids at pY+4 and/or pY+5 positions, whereas the other three SH2 domains prefer both hydrophobic and positively charged residues at all three positions (pY+4 to pY+6).
Binding Affinities of Selected Peptides-The sequence specificity of the four SH2 domains at positions pY+4 to pY+6 was further evaluated by determining the dissociation constants (KD) of selected peptides using surface plasmon resonance (SPR). To assess the contribution of pY+4 to pY+6 residues to the overall binding affinity, we synthesized peptides 1-9, which all have the same sequence at positions pY-4 to pY+3 (AALNpYAQL) but differ at positions pY+4 to pY+6 (Table 2). Peptide 1, which contains three alanyl residues at positions pY+4 to pY+6, was used as a control. Peptides 2, 5, 7, 8, and 9 were designed to contain sequence motifs selected from the library at pY+4 to pY+6 positions (WYG, WHR, HRH, RRF, and MFP, respectively). All five motifs increased the binding affinity of the pY peptide to SHP-1 N-SH2 domain. A comparison of the binding of this SH2 domain to immobilized peptides 1 and 2 is shown in Fig. 2a; substitution of WYG for AAA dramatically increased the amount of bound protein. Further quantitative analysis revealed a 36-fold difference in their binding affinities (Fig. 2, b and c, and Table 2). Among the selected motifs, MFP was most effective, increasing the binding affinity by 59-fold (KD values of 10 and 0.17 μM for peptides 1 and 9, respectively), whereas the WHR motif increased the affinity by 24-fold. The positively charged motifs, RRF and HRH, were less effective, increasing the affinity by ~6- and ~2-fold, respectively. Mutation of the glycine at pY+6 of peptide 2 into an alanine (peptide 4) had little effect on the binding affinity, as did mutation of pY+6 arginine of the WHR motif into an alanine (compare peptides 5 and 6). Replacement of the tyrosine at the pY+5 position of peptide 4 with an alanine (peptide 3) reduced the binding affinity by 1.5-fold.
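The fold enhancements quoted here are simple ratios of dissociation constants, and they can also be expressed as differences in binding free energy. The sketch below works through the MFP example (KD of 10 μM for peptide 1 versus 0.17 μM for peptide 9). The free-energy conversion assumes a temperature of 298.15 K, which is our assumption for "room temperature" and is not stated in the text; the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def fold_enhancement(kd_ref, kd_new):
    """How many fold tighter the new peptide binds than the reference."""
    return kd_ref / kd_new

def ddg_kcal(kd_ref, kd_new, temp_k=298.15):
    """Change in binding free energy, ddG = RT ln(KD_new / KD_ref).
    Negative values mean tighter binding. temp_k is an assumed value."""
    return R_KCAL * temp_k * math.log(kd_new / kd_ref)

# Peptide 1 (AAA) vs peptide 9 (MFP) binding to SHP-1 N-SH2, KD in uM.
fold = fold_enhancement(10.0, 0.17)
ddg = ddg_kcal(10.0, 0.17)

print(f"fold enhancement: {fold:.0f}")
print(f"ddG: {ddg:.2f} kcal/mol")
```

Under these assumptions the 59-fold gain corresponds to roughly 2.4 kcal/mol of additional binding energy contributed by the pY+4 to pY+6 residues.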
To determine whether the observed affinity enhancement is unique to peptide LNpYAQL, we fused motifs AAA, WHR, and RRF (or WYG) to the C termini of another class I peptide TNpYTQL (peptides 10-12 in Table 2) and LNpYMQF, which is a class II SHP-1 N-SH2-binding peptide (peptides 13-15) (14). Whereas WHR and RRF significantly increased the binding affinity of the class I peptide (TNpYTQL), WYG and WHR motifs had only a minor effect (<2-fold) on this class II peptide (Table 2). Thus, the results suggest that the selected motifs in Table 1 are most effective in the context of class I peptides, which were used in the pY library design. The class II peptides probably bind to the SH2 domain via a somewhat different binding mode, which does not allow optimal contacts between the above pY+4 to pY+6 motifs and the SH2 domain surface. It remains to be determined whether an alternative pY+4 to pY+6 motif(s) can be found for class II peptides.
Peptides 1-15 were also tested for binding to the other three SH2 domains of SHP-1 and SHP-2. The binding affinity of peptide LNpYAQL to the N-SH2 domain of SHP-2 is similarly enhanced by the hydrophobic motifs (Table 2). For example, WYG, WHR, and MFP decreased the KD value by 13-, 8-, and 30-fold, respectively. As with the SHP-1 N-SH2 domain, glycine to alanine mutation at the pY+6 position had little effect on binding to SHP-2 N-SH2 domain (compare peptides 2 and 4), whereas mutation of the pY+5 tyrosine to alanine reduced the affinity by 2-fold (compare peptides 3 and 4). A major difference between the N-SH2 domains of SHP-1 and SHP-2 is that positively charged motifs (e.g. HRH and RRF) do not affect the binding affinity to the latter (compare peptides 1, 7, and 8 or peptides 10 and 12), consistent with the fact that these motifs were not selected by SHP-2 N-SH2 during library screening (Table 1). The selected sequence motifs also enhanced the binding affinities of class I peptides to the C-SH2 domain of SHP-2, although the effect is generally of a smaller magnitude (Table 2). Again, MFP is most effective, increasing the binding affinity of LNpYAQL by 12-fold. Other hydrophobic motifs (e.g. WYG, WAA, WYA, WHR, and WHA) improved the affinity by 2-5-fold. The positively charged RRF motif increased the affinity by 2-fold, but the frequently selected HRH motif was without effect. The C-SH2 domain of SHP-1 behaved differently from the other three SH2 domains. It exhibited very slow binding kinetics by SPR analysis (data not shown). Due to its difficulty in reaching binding equilibrium even after extended incubation times (especially at lower SH2 protein concentrations), the apparent KD values obtained represent the upper limits of the actual dissociation constants. Since all of the binding measurements were conducted under the same conditions, the apparent KD values allow us to make a relative comparison of the various peptides. 
The hydrophobic motifs, such as WYG, WYA, and MFP, marginally improved the binding affinity of class I peptides (1.4-, 5.2-, and 2.7-fold, respectively). The positively charged motifs had no effect (compare peptides 1, 7, and 8 or peptides 10 and 12).
In summary, the pY+4 and pY+5 residues are important specificity determinants for SHP-1 and SHP-2 SH2 domains. Binding to SHP-1 N-SH2 domain is greatly enhanced by both hydrophobic and positively charged residues at these positions. Large hydrophobic residues at pY+4 and pY+5 also dramatically improve binding to the N-SH2 domain of SHP-2. The binding affinity to the C-SH2 domain of SHP-2 is moderately enhanced by hydrophobic motifs, whereas the SHP-1 C-SH2 domain was only slightly affected by pY+4 to pY+6 residues. Our data also suggest that when an optimal residue (e.g. Trp) is already present at pY+4, the identity of the pY+5 residue is less critical (compare peptides 2-6). Conversely, when the pY+4 residue is not optimal, the pY+5 residue can have a larger effect on the overall binding affinity (compare peptides 16-18).
Stimulation of SHP-1 and SHP-2 Activity by Selected pY Peptides-To test whether the selected pY peptides also show enhanced binding to the full-length SHP-1 and SHP-2, we examined their ability to stimulate the catalytic activity of the two phosphatases. It has previously been established that the N-terminal SH2 domain of SHP-1/-2 directly binds to and inhibits their phosphatase domain (10-12). Binding of a pY peptide to the N-SH2 domain disengages the intramolecular SH2·PTP complex and stimulates the enzymatic activity. There is a general correlation between the binding affinity of a pY peptide to the N-SH2 domain and its ability to stimulate the enzymatic activity. Peptides 1, 4, and 9 were chosen for the study. Thus, the catalytic activities of full-length SHP-1 and SHP-2 toward p-nitrophenyl phosphate were assayed in the presence of increasing concentrations of the pY peptides. As expected, all three peptides activated both SHP-1 and SHP-2 in a concentration-dependent manner (Fig. 3). Peptide 1 increased SHP-1 activity by ~10-fold at 400 μM, the highest peptide concentration tested. Peptide 9, which has the highest affinity for SHP-1 N-SH2 domain (KD = 0.17 μM), was most potent in SHP-1 activation, producing a maximal activation of ~30-fold and an EC50 value of 20 μM (pY peptide concentration at which SHP-1 activity reaches half-maximum) (Fig. 3a). Peptide 4 (KD = 0.26 μM against SHP-1 N-SH2 domain) was slightly less potent, having an EC50 value of ~30 μM. Similar results were obtained for SHP-2 (Fig. 3b). Thus, there is an excellent agreement between the results from the stimulation assays and SPR analyses, confirming the ability of these peptides to bind intact SHP-1 and SHP-2.
Enhanced Binding of FcγRIIb1 to SHP-1 and SHP-2 by Mutations at pY+4 and pY+5 Positions-FcγRIIb1 is an inhibitory receptor on B cells. It negatively regulates B cell antigen receptor signaling by the recruitment of SH2 domain-containing inositol phosphatases SHIP-1 and SHIP-2 to its phosphorylated immunoreceptor tyrosine inhibition motif, NTITpYSLLMHP (28, 29). It has been reported that the phosphorylated immunoreceptor tyrosine inhibition motif is also capable of recruiting SHP-1 under some conditions (30). Since the pY+4 to pY+6 sequence of the receptor (MHP) is not among the most optimal binding sequences for SHP-1 or SHP-2 SH2 domains, we tested whether the binding affinity of this receptor for SHP-1 and SHP-2 can be improved by mutating the residues at the pY+4 or pY+5 position. Two single amino acid mutants were generated by changing the pY+4 methionine into a tryptophan (W mutant) or the pY+5 histidine into a phenylalanine (F mutant). The resulting motifs at pY+4 to pY+6, WHP and MFP, are both among the most preferred sequences. As expected, pY peptides corresponding to the mutant receptors (peptides 17 and 18) bound 2-6-fold more tightly to the N-SH2 domains of SHP-1 and SHP-2 than the wild-type sequence (peptide 16) (Table 2). The F mutant also had ~2-fold higher affinity to the C-SH2 domain of SHP-2. Next, biotinylated peptides 16-18 were used to precipitate SHP-1, SHP-2, and SHIP-1 from Raw 264.7 murine macrophages (pulldown assay). Cell lysates were incubated with the peptides, and bound proteins were precipitated with streptavidin-coated beads and analyzed by Western blot analysis. The wild-type sequence only precipitated trace amounts of SHP-1 and SHP-2 under the experimental conditions (Fig. 4a). In contrast, the mutant sequences, especially the F mutant, were much more effective in precipitating both SHP-1 and SHP-2 under the same conditions. 
All three peptides precipitated approximately the same amount of SHIP-1, indicating that binding to SHIP-1 SH2 domain was not affected by the pY+4 to pY+6 motifs. In an alternative experiment, the wild-type sequence (biotinylated peptide 16) was used to pull down SHP-2 from the cell lysate in the presence of peptides 16-18 (nonbiotinylated) as competitors, and the amount of precipitated proteins was analyzed by Western blots. The F mutant (peptide 17) was most potent in preventing binding of SHP-2 to biotinylated peptide 16, followed by the W mutant (Fig. 4b). These results demonstrate that the mutant FcγRIIb1 peptides bind to SHP-1 and SHP-2 more effectively than the wild-type receptor peptides.
Molecular Modeling-To gain insight into the structural basis for the observed effects of pY+4 to pY+6 residues on SH2 domain binding, two modeling strategies were applied, both starting from the crystal structure of SHP-2 N-SH2 domain in complex with a class I peptide SVLpYTAVQP (Protein Data Bank code 1aya) (20) (see "Materials and Methods"). In the sequential protocol, a Trp residue was first added to the minimal binding sequence in the crystal structure (SVLpYTAV), resulting in peptide SVLpYTAVW. In the lowest energy docked structure, the pY+4 Trp fits snugly into a hydrophobic cleft formed between the loops 65-68 and 87-94 (Fig. 5, a-c). The Trp side chain makes contacts with Leu65, Leu88, and Tyr81 of the N-SH2 domain and also forms a hydrogen bond to the carbonyl group of Gln87. Extension of the peptide by two more residues (Phe and Ala) and subsequent MD and energy minimization refinement revealed that the side chain of pY+5 Phe interacts with Gly67 and Gly68 as well as Tyr66. An independent simulation starting from the fully extended peptide performed under the same conditions, but using replica exchange MD as an advanced sampling method, yielded a very similar binding model (Fig. 5, a and b). 
Both models predict extensive interactions between the pY+4 side chain and the SH2 domain surface. The pY+5 side chain also makes contacts with the SH2 domain but to a lesser extent than the pY+4 residue. The pY+6 residue has little interaction with the protein. This result is in keeping with the smaller influence of pY+5 and pY+6 residues on binding affinity compared with the pY+4 residue (Table 2). The predicted binding cavity for the pY+4 Trp residue is large enough to accommodate other aromatic residues as well as a norleucine residue that were also selected as preferred residues.

The predicted binding mode for the pY+4 residue also offers a possible explanation for the smaller effect of the pY+4 residue on the binding affinity of the C-SH2 domains. Compared with the SHP-2 N-SH2 domain, the binding cleft for the pY+4 residue is much wider on the C-SH2 domain of SHP-2 (Fig. 5d). Consequently, the Trp side chain at the pY+4 position cannot simultaneously make contacts with both sides of the binding cleft walls. The situation is similar in the case of SHP-1, where a wider peptide-binding groove has also been noted for the C-SH2 domain as compared with its N-SH2 domain (36).

DISCUSSION
Early structural studies with the SH2 domains from Src family kinases led to the widely accepted notion that pY and the three residues immediately C-terminal to pY (pY to pY+3) are the only determinants of SH2 domain-pY peptide interaction (31,32). Previous work from this and other laboratories has demonstrated that residues N-terminal to pY and C-terminal to pY+3 can have a dramatic effect on the binding affinity of pY peptides to several SH2 domains (10, 14-21, 33-35). In particular, De Souza et al. (16) screened a peptide library against the SH2 domains of SHP-2 and SOCS-3 and found that they all have sequence specificity at the pY+4 and pY+5 positions. Unfortunately, their study involved sequencing the selected pY peptides as a pool, thus losing all of the sequence covariance information. In this work, we employed on-bead screening and a powerful mass spectrometry-based peptide sequencing technique, partial Edman degradation/mass spectrometry (15,22), to identify the individual sequences that bind to an SH2 domain of interest. Our data provide conclusive evidence that at least for the SH2 domains of SHP-1 and SHP-2, pY+4 and pY+5 residues are part of the specificity determinants, capable of enhancing the overall binding affinity by up to 2 orders of magnitude. Our data have also revealed the subtle specificity difference between SHP-1 and SHP-2 SH2 domains, which have overlapping specificities. Hydrophobic and/or basic residues are preferred at pY+4 to pY+6 positions. Importantly, the selected motifs show strong sequence covariance (e.g. WYG, WHR, and MFP were frequently selected). This illustrates the importance of obtaining individual binding sequences and the advantage of our method.

X-ray structures of SHP-1 and SHP-2 SH2 domains show that the pY+4- and pY+5-binding pocket is formed by α-helix B as the base and the EF and BG loops as the walls (Fig. 5) (20,36). The wide, shallow pocket can accommodate a variety of amino acid side chains.
The side chains of Arg and Lys can engage in hydrophobic interactions with the protein surface. In addition, their positively charged head groups may interact with negatively charged residues on the SH2 domain. We have noted that the BG loop of SHP-1 N-SH2 domain contains a negatively charged motif, LQDRDG, whereas the corresponding sequence in SHP-2 is LKEKNG (20,36).
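To put the affinity enhancement of "up to 2 orders of magnitude" noted above into energetic terms, the fold-change in dissociation constant can be converted into a free energy difference with the standard relation ΔΔG = RT ln(fold). The following is a minimal sketch only; the temperature of 298.15 K and the function name are illustrative assumptions and are not taken from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold, temperature_k=298.15):
    """Return the free energy difference (kcal/mol) for a fold-change in K_D.

    temperature_k is an assumed value chosen for illustration; it is not
    a parameter reported in the text.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold)


# A 100-fold (2 orders of magnitude) gain in affinity
print(round(ddg_from_fold_change(100), 1))  # 2.7
```

Under these assumptions, a 100-fold enhancement corresponds to roughly 2.7 kcal/mol of additional binding energy, a magnitude comparable to a few favorable side-chain contacts.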
To determine whether the extended sequence specificity is physiologically relevant, we analyzed the pY+4 to pY+6 sequences of 54 human proteins that are known to bind to SHP-1 and/or SHP-2 via their SH2 domains (Table 3). It is clear that the pY+4 to pY+6 sequences in these proteins are not random. Out of a total of 103 pY motifs that are known or presumed to mediate the binding to SHP-1 and/or SHP-2, 54 motifs contain a hydrophobic residue at the pY+5 position, 20 of which have a Phe or Tyr. Note that Phe and Tyr were also the most frequently selected amino acids at the pY+5 position from the pY library (other than the positively charged Arg, Lys, and His) (Table 1). Another 23 pY motifs have a positively charged residue at their pY+5 positions. At the pY+4 position, nearly half of the sequences (46 of them) display a hydrophobic or positively charged residue. Among the 54 proteins in Table 3, 23 were reported to bind both SHP-1 and SHP-2, whereas 13 are specific for SHP-1 and 18 bind to SHP-2 only. We noticed that the pY+4 to pY+6 sequences of the 13 SHP-1 target proteins are dominated by positively charged residues; each protein contains at least one pY motif that has a positively charged residue (Arg, Lys, or His) at the pY+4 and/or pY+5 position. Only one of the proteins (leukocyte-associated immunoglobulin-like receptor-1) has a negatively charged residue (Asp) at the pY+4 position of its first pY motif. In contrast, the pY motifs in the 18 proteins that bind specifically to SHP-2 have few positively charged residues at the pY+4 or pY+5 position. Instead, these positions typically display a combination of a hydrophobic residue at one position and a negatively charged or polar residue at the other. The proteins that bind to both SHP-1 and SHP-2 frequently have a combination of both types of motifs.
This is consistent with our screening results, which showed that binding to SHP-1 is enhanced by both hydrophobic and positively charged residues at pY+4 and pY+5 positions, whereas SHP-2 binding is only enhanced by hydrophobic residues at these positions. Thus, the pY+4 and pY+5 residues probably play important roles in distinguishing SHP-1 and SHP-2, given that their SH2 domains have very similar binding specificities at pY-2 to pY+3 positions (14-16).
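The qualitative rule stated above can be written down as a short sketch. This is a deliberately simplified rule of thumb for illustration, not a validated predictor and not a method used in this work: the function name is hypothetical, the hydrophobic class is restricted to the large hydrophobic residues named in the screening results (Trp, Tyr, Met, Phe), and covariance between positions is ignored.

```python
# Residue classes in one-letter code, as named in the text.
LARGE_HYDROPHOBIC = set("WYMF")  # Trp, Tyr, Met, Phe
POSITIVE = set("RKH")            # Arg, Lys, His


def predicted_enhancement(py4, py5):
    """Say whether a pY+4/pY+5 pair is expected to enhance SH2 binding.

    Hypothetical rule of thumb: hydrophobic residues favor both SHP-1
    and SHP-2, positively charged residues favor SHP-1 only.
    """
    residues = {py4.upper(), py5.upper()}
    hydrophobic = bool(residues & LARGE_HYDROPHOBIC)
    positive = bool(residues & POSITIVE)
    return {"SHP-1": hydrophobic or positive, "SHP-2": hydrophobic}


print(predicted_enhancement("W", "H"))  # first two residues of the WHR motif
print(predicted_enhancement("R", "K"))  # purely basic pair (illustrative)
```

In this sketch, a pair with one hydrophobic residue is flagged for both phosphatases, whereas a purely basic pair is flagged for SHP-1 only, which mirrors the trend described for the SHP-1-specific and SHP-2-specific target proteins.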
The physiological relevance of our data is also corroborated by previous cellular studies. Huber et al. (37) reported that deletion of the pY+4 to pY+6 residues (KKK) of the biliary glycoprotein pY515 motif nearly completely eliminated the binding of SHP-1 and SHP-2 to the receptor. Another example is the binding of Helicobacter pylori CagA protein to SHP-2 (103). H. pylori causes gastritis and peptic ulcer by injecting CagA into gastric epithelial cells. Once inside the host cell, CagA undergoes tyrosine phosphorylation, binds and activates SHP-2, and thereby induces morphological transformation of the cell. The CagA protein from Western H. pylori isolates binds to SHP-2 via an EPIpYATIDDL motif. In contrast, the predominant CagA proteins in H. pylori strains isolated from East Asia, where gastric carcinoma is prevalent, bind to SHP-2 through an EPIpYATIDFD motif. Substitution of Phe for Asp at the pY+5 position of Western CagA resulted in stronger binding to SHP-2 and a more virulent H. pylori strain. Conversely, mutation of Phe at the pY+5 position of East Asian CagA into Asp decreased its binding to SHP-2 and the virulence of the strain (103).
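The position numbering used for the two CagA motifs can be made explicit with a small helper that reads a residue at a given offset from pY. This is a minimal sketch; the function name and the convention of marking phosphotyrosine as the two-character token "pY" in a one-letter sequence are assumptions made for illustration.

```python
def residue_at(motif, offset):
    """Return the residue at position pY+offset (negative for N-terminal).

    The motif is in one-letter code with phosphotyrosine written as 'pY',
    for example 'EPIpYATIDDL'. Exactly one 'pY' is assumed to be present.
    """
    if motif.count("pY") != 1:
        raise ValueError("motif must contain exactly one 'pY'")
    n_term, c_term = motif.split("pY")
    if offset == 0:
        return "pY"
    if offset > 0:
        if offset > len(c_term):
            raise IndexError("offset beyond C-terminal end of motif")
        return c_term[offset - 1]
    if -offset > len(n_term):
        raise IndexError("offset beyond N-terminal end of motif")
    return n_term[offset]  # n_term[-1] is pY-1, n_term[-2] is pY-2


western = "EPIpYATIDDL"
east_asian = "EPIpYATIDFD"
print(residue_at(western, 5), residue_at(east_asian, 5))  # D F
print(residue_at(western, 4), residue_at(east_asian, 4))  # D D
```

The two motifs are identical through pY+4 and first differ at pY+5 (Asp versus Phe), the position at which the substitutions described above were made.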
Interestingly, whereas Trp is frequently the pY+4 residue among the peptides selected from the pY library, it is found at the pY+4 position of only one of the 54 human proteins (SH2 domain-containing phosphatase anchor protein-1). There are at least two possible reasons why nature has not adopted this most effective residue more widely. First, the SH2 domains of SHP-1 and SHP-2 require hydrophobic residues at the pY-2 (Val, Ile, Leu, or Thr), pY+1 (Ala, Ser, Thr, or Val), and pY+3 positions (Ile, Leu, or Val) (14-16). If one or more hydrophobic residues are added at the pY+4 and/or pY+5 positions, the resulting sequence motifs (e.g. VXYAXLWYG) would probably be too hydrophobic to exist on the protein surface, a prerequisite for phosphorylation by a protein kinase or binding to an SH2 domain. Second, since SHP-1 and SHP-2 each have two SH2 domains, a tandem pY motif with the most optimal sequences may bind to the phosphatases too tightly to allow rapid release.
It is worth noting that the peptides selected against the two C-SH2 domains are dominated by positively charged residues at pY+4 to pY+6 positions, yet SPR analysis shows that these positively charged motifs do not substantially enhance pY peptide binding to these domains (Tables 1 and 2). We believe this may have been caused by the bead-based screening method. When all of the beads carry the same minimal sequence for SH2 domain binding and the pY+4 to pY+6 residues do not confer large differences in binding affinity to the C-SH2 domains, the on-bead screening procedure can be influenced by nonspecific interactions. The positively charged sequences presumably interact with the negatively charged maltose-binding protein (pI = 5.0), resulting in higher overall binding affinity of the MBP-SH2 fusion protein to these beads. This is usually not a problem when the affinity difference is large between specific peptide ligands and the rest of the library members.

The knowledge of SH2 domain specificity will be very useful for future biomedical studies. First, the consensus sequences will allow an investigator to predict the interaction partners of the SH2 domain-containing proteins. Second, the information will facilitate the development of specific inhibitors against the SH2 domains. For example, the different specificity at pY+4 and pY+5 positions between SHP-1 and SHP-2 may serve as the basis for the development of specific peptide inhibitors against SHP-1 versus SHP-2. Such inhibitors would prevent the association of SHP-1 (or SHP-2) with its physiological targets and shed light on its function in various signaling pathways. Finally, one may be able to engineer a natural pY protein so that it only interacts with SHP-1 or SHP-2, in order to understand the cellular function of SHP-1 or SHP-2. In principle, one could also engineer novel signaling pathways by designing tailor-made pY motifs.
In conclusion, our studies have demonstrated that the sequence specificity and binding affinity of SHP-1 and SHP-2 SH2 domains are determined by amino acid residues at positions from pY-2 to pY+5, a much larger region than originally revealed by the crystallographic studies. Although it is currently unknown how many other SH2 domains also display this type of extended sequence specificity, it is clear that the picture of SH2 domain-pY protein interaction is more complex than what was originally proposed.