Assessment of Protein-tyrosine Phosphatase 1B Substrate Specificity Using “Inverse Alanine Scanning”*

An “inverse alanine scanning” peptide library approach has been developed to assess the substrate specificity of protein-tyrosine phosphatases (PTPases). In this method each Ala moiety in the parent peptide, Ac-AAAApYAAAA-NH2, is separately and sequentially replaced by the 19 non-Ala amino acids to generate a library of 153 well defined peptides. The relatively small number of peptides allows the acquisition of explicit kinetic data for all library members, thereby furnishing information about the contribution of individual amino acids with respect to substrate properties. The approach was applied to protein-tyrosine phosphatase 1B (PTP1B) as a first example, and the highly potent peptide substrate Ac-ELEFpYMDYE-NH2 (kcat/Km 2.2 ± 0.05 × 10⁷ M⁻¹ s⁻¹) has been identified. More importantly, several heretofore unknown features of the substrate specificity of PTP1B were revealed. These include the ability of PTP1B to accommodate acidic, aromatic, and hydrophobic residues at the −1 position, a strong nonpreference for Lys and Arg residues in any position, and the first evidence that residues well beyond the +1 position contribute to substrate efficacy.

Protein-tyrosine phosphatases (PTPases) are involved in the regulation of many fundamental cellular signaling processes (1). The identification of physiological substrates can be of great utility in defining the biological function of PTPases. In addition, PTPases are promising targets for new therapeutics directed against various cellular disorders and diseases (2). For example, protein-tyrosine phosphatase 1B (PTP1B) has been implicated as a negative regulator of cellular signaling mediated by the insulin receptor. Mice lacking functional PTP1B have been shown to exhibit increased sensitivity toward insulin and to be obesity-resistant, suggesting that synthetic PTP1B inhibitors could serve as potential anti-diabetes and anti-obesity drugs (3). An assessment of PTPase substrate specificity should not only help to identify physiological substrates for PTPases, but may ultimately assist in the design of PTPase-specific inhibitors.
The identity of the physiological substrates for most PTPases is currently unknown. In addition, because the physiological substrates for PTPases are tyrosine-phosphorylated proteins, even if one knows the identity of the physiological substrate, it is still difficult to obtain sufficient quantities of site-specifically and stoichiometrically tyrosine-phosphorylated protein molecules for detailed enzymological and biochemical studies. Thus, the current approach to probing PTPase substrate specificity has been to use synthetic phosphotyrosine (Tyr(P))-containing peptides that correspond to natural phosphorylation sites in proteins. It has been shown that PTPases display a range of kcat/Km values for relatively short peptide substrates (4-11). In addition, the kcat/Km values for the peptides are orders of magnitude higher than that of Tyr(P) alone, suggesting that amino acids flanking the Tyr(P) contribute to high affinity binding. From these limited studies, it appears that efficient substrate recognition by PTPases requires the presence of amino acids on both sides of Tyr(P). However, because of the differences in sequence and size of the individual peptides examined, one cannot draw any definitive conclusions regarding the structural requirements for substrate recognition. Clearly, a more systematic and thorough approach, such as the use of peptide libraries, is needed for the elucidation of structural features that control substrate specificity for individual PTPases. Indeed, combinatorial peptide libraries have been useful in the determination of optimal amino acid sequences for protein kinase and SH2 domain recognition (12,13). We describe herein a novel peptide-based library approach to assess PTPase substrate specificity.

EXPERIMENTAL PROCEDURES
Peptide Synthesis and Characterization-Peptides were synthesized on CLEAR amide resin (Peptides International, Louisville, KY) using a standard protocol for HBTU/HOBt/collidine activation of Fmoc/t-butyl-protected amino acid derivatives (Advanced Chemtech, Louisville, KY). Side chains of Asn, Cys, Gln, and His were trityl-protected; Lys and Trp were tert-butyloxycarbonyl-protected; and Arg was protected with the Pbf group. The coupling reaction was performed in DMF for 1 h using a 3-fold excess of amino acid. The couplings became more difficult with increasing peptide length, so coupling times were extended to 2 h and multiple couplings were performed when necessary. Fmoc removal was performed with 22% piperidine in DMF.
Phosphotyrosine was incorporated into the growing peptide chain as the prephosphorylated species Fmoc-Tyr(PO(OBzl)OH)-OH (where Bzl is benzyl) (Nova Biochem, San Diego, CA) by double coupling using 1.1 eq for the first coupling (overnight) and 0.5 eq for the second coupling (5 h). The NH2 terminus of the peptides was acetylated with 2.5% acetic anhydride, 5% DIPEA, in DMF for 15 min. Final cleavage and side chain deprotection were achieved with 90% trifluoroacetic acid, 5% TIS, 2.5% EDT, and 2.5% water for 2 h. Peptides containing Met, His, or Trp were deprotected with 80% trifluoroacetic acid, 10% dimethyl sulfide, 5% EDT, 4% TIS, and 1% water to prevent benzylation of these residues. The resin was removed by filtration, and the remaining trifluoroacetic acid solution was concentrated under nitrogen flow. Dry diethyl ether was added, and the precipitated peptides were collected by centrifugation. The peptides were resuspended, washed twice with ether, and lyophilized from acetic acid.
Certain peptide sequences showed contamination by (a) deletion of an alanine residue, (b) deletion of the NH2-terminal amino acid (defined as position −4), (c) incomplete Fmoc removal after residue −3 or −4, or (d) incomplete acetylation of residue −4. In particular, peptides containing amino acids with aliphatic, hydrophobic side chains, such as Val, Leu, and Ile in position +4, −3, or −4, respectively, were affected. Alanine-rich peptides are known to be difficult to synthesize due to aggregation, which prevents complete acylation and Fmoc removal. Trifluoroacetic acid is known to disrupt those aggregates, and we subsequently resynthesized these peptides using a modified synthesis protocol to allow washes with neat trifluoroacetic acid during peptide synthesis. These peptides were synthesized using the trifluoroacetic acid-stable MBHA resin (Nova Biochem, San Diego, CA), and the first 7 amino acids were coupled using standard Fmoc chemistry as described above. Peptides not containing trifluoroacetic acid-sensitive side chain protection groups were then washed with neat trifluoroacetic acid twice before the next amino acid was coupled. The Fmoc group was removed with 30% piperidine/3% DBU in DMF. Complete NH2-terminal acetylation was ensured by coupling of acetylated alanine in position −4 or by multiple acetylation with 5% acetic anhydride, 5% DIPEA in DMF. Peptides were deprotected and cleaved with 8% TFMSA in trifluoroacetic acid containing 12% thioanisole/EDT (2:1) for 10 min at 0°C, followed by 30 min at room temperature. All peptides synthesized using this protocol were obtained in high purity (>80-95%).
Peptides were analyzed using MALDI-TOF mass spectrometry and analytical HPLC. The majority of peptides appeared essentially as a single peak during HPLC analysis, and the product content for all peptides was estimated from UV absorbance at 214 nm to be 85-99%. Impurities seen during HPLC analysis are most likely traces of remaining scavengers, because peptidic contaminants were not observed during MALDI mass spectral analysis.
PTP1B Preparation and Assay-The recombinant PTP1B was expressed in Escherichia coli and purified to homogeneity as described (14). The PTP1B-catalyzed hydrolysis of Tyr(P)-containing peptides was continuously monitored at 305 nm for the increase in tyrosine fluorescence with excitation at 280 nm (5). When

RESULTS AND DISCUSSION
At the outset, we felt it advantageous to develop an information-rich library that would not only furnish a general assessment of PTPase substrate specificity, but provide detailed kinetic data for individual members of the library as well. Consequently, we employed a library based on the PTPase-catalyzed dephosphorylation of phosphotyrosine (Tyr(P)) peptides, rather than simple binding of the PTPase to a substrate mimic or binding of a catalytically inactive PTPase mutant to a Tyr(P) peptide. Unfortunately, classic combinatorial library methods do not readily lend themselves to detailed kinetic analyses, especially at the individual member level. We ruled out the one bead/one compound approach (15) as well as the positional scanning method (16,17) due to difficulties associated with quantitatively measuring kinetic constants either in the solid state or on nonequimolar mixtures of Tyr(P) peptides (18). Although these combinatorial methods cannot be used to assess the substrate efficacy of individual library members, both approaches generate a large number of peptides and therefore furnish a comprehensive appraisal of the diversity space associated with the substrate binding pocket on the target enzyme(s).
We have demonstrated that, in addition to peptides, PTPases catalyze the dephosphorylation of a variety of simple aromatic and aliphatic monophosphate esters (19,20). In addition, we have shown that the kcat/Km values for Tyr(P) peptides are orders of magnitude higher than that of Tyr(P) alone, suggesting that amino acids flanking the Tyr(P) contribute to high affinity binding (5,9,10). Based on the crystal structure of a PTP1B-Tyr(P) peptide complex (21), we predicted (and have confirmed) that a peptide composed of 8 Ala residues and a single central Tyr(P) would be recognized as a substrate by PTP1B. The classic "alanine scan" is used to probe (peptide ligand)/protein interactions by successively replacing each individual amino acid of the peptide ligand with an alanine (22). The inverse alanine scanning method illustrated in Fig. 1 separately and sequentially replaces each alanine residue with the other 19 standard amino acids. The advantages associated with a library of pure compounds include the ability to obtain detailed kinetic data for each individual library member and that, because the synthesis "history" of each library member is known, structural deconvolution of lead substrates is not necessary. However, libraries synthesized in parallel are often deemed to have the disadvantage of low relative structural diversity. To probe 8 positions around Tyr(P), a large number of peptides (20⁸ = 2.56 × 10¹⁰) would have to be synthesized by the classic combinatorial methods. In contrast, the inverse alanine scanning approach dramatically reduces this number to only 8 × 19 + 1 = 153 peptides, while retaining molecular diversity.
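The library composition and the two peptide counts quoted above can be reproduced with a short enumeration. This is only an illustrative sketch: the position labels (−4 to −1, +1 to +4), the one-letter residue alphabet, and the helper names `format_peptide` and `inverse_alanine_scan` are assumptions introduced here, not part of the published protocol.

```python
# Standard 20 amino acids in one-letter code (assumed alphabet for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Positions relative to the central Tyr(P); negative values are NH2-terminal.
POSITIONS = (-4, -3, -2, -1, 1, 2, 3, 4)


def format_peptide(flanking):
    """Render 8 flanking residues around a central pY with capped termini."""
    return f"Ac-{flanking[:4]}pY{flanking[4:]}-NH2"


def inverse_alanine_scan():
    """Return {(position, residue): sequence} for every single non-Ala substitution."""
    library = {}
    for index, position in enumerate(POSITIONS):
        for residue in AMINO_ACIDS:
            if residue == "A":
                continue  # Ala at this position is just the parent peptide
            flanking = ["A"] * 8
            flanking[index] = residue
            library[(position, residue)] = format_peptide("".join(flanking))
    return library


parent = format_peptide("A" * 8)
library = inverse_alanine_scan()
total = len(library) + 1  # 8 x 19 substituted peptides plus the parent
full_combinatorial = len(AMINO_ACIDS) ** len(POSITIONS)  # 20^8

print(parent, total, full_combinatorial)
```

Keying the library by (position, residue) keeps the synthesis "history" of each member explicit, which is the property that makes deconvolution unnecessary.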
Peptides were manually synthesized using a Teflon block designed to accommodate the simultaneous preparation of 20 peptides at a time. An Fmoc-based synthesis protocol was employed, and the phosphorylated tyrosine was utilized during solid phase peptide synthesis (see "Experimental Procedures"). All members of the inverse alanine scan library were analyzed by matrix-assisted laser desorption/ionization mass spectrometry and reverse phase high performance liquid chromatography. No significant contamination was detected, and all peptides were used in the subsequent assay without further purification. The PTPase activity assay was based on increased tyrosine fluorescence following dephosphorylation (5). The kcat/Km value, a measure of substrate specificity, was directly calculated from the reaction progress curve (see "Experimental Procedures").
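The text states only that kcat/Km was calculated directly from the reaction progress curve. One common way to do this, sketched below, assumes the substrate concentration is well below Km, so that the fluorescence approaches its endpoint as a single exponential with observed rate constant k_obs = (kcat/Km)[E]. That assumption, the function name, and the enzyme concentration, time grid, and fluorescence scale in the synthetic check are all hypothetical and are not taken from the paper.

```python
import math


def kcat_over_km_from_progress(times_s, signal, signal_end, enzyme_molar):
    """Estimate kcat/Km (M^-1 s^-1) from a first-order progress curve.

    Assumes [S] << Km, so ln(signal_end - signal) falls linearly with time
    with slope -k_obs, where k_obs = (kcat/Km) * [E].
    """
    xs, ys = [], []
    for t, f in zip(times_s, signal):
        remaining = signal_end - f
        if remaining > 0:  # skip points at or beyond the endpoint
            xs.append(t)
            ys.append(math.log(remaining))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of ln(remaining signal) versus time.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k_obs = -sxy / sxx
    return k_obs / enzyme_molar


# Synthetic, noise-free check with hypothetical conditions (not measured data).
true_value = 2.2e7            # M^-1 s^-1, used only to generate the trace
enzyme = 1e-9                 # 1 nM enzyme, hypothetical
k_obs_true = true_value * enzyme
times = [5.0 * i for i in range(40)]
trace = [100.0 * (1.0 - math.exp(-k_obs_true * t)) for t in times]
estimate = kcat_over_km_from_progress(times, trace, 100.0, enzyme)
print(f"{estimate:.3e}")
```

With real, noisy traces a nonlinear fit of the exponential is usually preferable to the log-linear regression used here, because the logarithm amplifies noise near the endpoint.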
The kcat/Km values for all 153 peptides ranged over 2 orders of magnitude and changed significantly for different substitutions within a single position. The results of the inverse alanine scanning of the entire peptide library for PTP1B are shown in Fig. 2. An examination of the scanning data revealed a number of factors that control the substrate specificity of PTP1B. Noteworthy features of PTP1B's substrate specificity include the salutary effect of acidic residues at most sites and the fact that the incorporation of a Phe residue at position −1 furnishes the single most dramatic enhancement in substrate efficacy. The preference for acidic residues at positions NH2-terminal to Tyr(P) is consistent with results from previous analysis of Tyr(P)-containing peptides corresponding to in vivo phosphorylation sites (6-8). A preference for aromatic residues at the −1 position was also noted in affinity selections from targeted peptide libraries containing the nonhydrolyzable Tyr(P) mimics phosphonodifluoromethyl phenylalanine or malonyltyrosine, using the epidermal growth factor receptor peptide DADEpYL as a template (23,24). The excellent agreement between the current results and previous studies at the −1 position validates the inverse alanine scanning approach. In addition, the results outlined in Fig. 2 have revealed an array of surprising and heretofore unknown aspects of PTP1B specificity. First, in addition to acidic and aromatic residues, PTP1B is able to accommodate a host of structurally diverse residues including Leu, Ile, Val, Cys, and Thr at the −1 position. These results suggest that the −1 binding pocket possesses a remarkable degree of flexibility such that residues with drastically different structures can be accommodated. In short, the simple notion that PTP1B has a specific, and therefore limited, preference for either aromatic or acidic residues at the −1 position is decidedly incorrect.
Second, the presence of either Lys or Arg at any position has deleterious consequences in terms of hydrolytic efficacy. Third, PTP1B exhibits a surprising preference for the two sulfur-containing amino acids, methionine and cysteine, at the +1 position and a marked aversion for Pro at this site. Finally, this study provides the first evidence that PTP1B substrate recognition extends to specific residues well beyond the +1 position. A strong preference for acidic residues is observed for position +2, and a strong preference for aromatic residues is evident for position +3.
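Per-position preferences of this kind translate into full peptide sequences by choosing one residue for each of the 8 flanking positions. The sketch below simply assembles the two sequences reported in this paper, the consensus peptide Ac-ELEFpYMDYE-NH2 and the predicted worst substrate Ac-KKKKpYPKKK-NH2, from such per-position choices. The residue choices are read off those published sequences rather than recomputed from the Fig. 2 data, and the helper name `assemble` is hypothetical.

```python
# Positions relative to Tyr(P), NH2-terminal to COOH-terminal.
POSITIONS = (-4, -3, -2, -1, 1, 2, 3, 4)


def assemble(choice_by_position):
    """Build a capped Tyr(P) peptide from one residue choice per position."""
    ordered = "".join(choice_by_position[p] for p in POSITIONS)
    return f"Ac-{ordered[:4]}pY{ordered[4:]}-NH2"


# Residues taken from the consensus peptide reported in the paper.
consensus_choices = {-4: "E", -3: "L", -2: "E", -1: "F",
                     1: "M", 2: "D", 3: "Y", 4: "E"}

# Lys everywhere except Pro at +1, matching the reported worst-case peptide.
worst_choices = {p: "K" for p in POSITIONS}
worst_choices[1] = "P"

consensus = assemble(consensus_choices)
worst = assemble(worst_choices)
print(consensus, worst)
```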
Inverse alanine scanning treats the binding contribution of each residue as independent of adjacent amino acids. To test the validity of this assumption, we synthesized a peptide corresponding to the consensus sequence, as well as a number of peptides where certain positions of the consensus peptide were substituted by "undesirable" amino acids (Table I). The consensus peptide, Ac-ELEFpYMDYE-NH2, is one of the best peptide substrates ever reported for PTP1B, exhibiting a kcat/Km value of 2.2 × 10⁷ M⁻¹ s⁻¹ at pH 7.0 (I = 0.15 M) and 30°C. Substitution of Phe at the −1 position with a Lys results in a 3-fold decrease in kcat/Km, whereas replacement of Met at the +1 position by a Pro leads to a 16-fold reduction in kcat/Km. If subsites −1 and +1 independently interact with the enzyme, one would predict a 48-fold decrease in kcat/Km for the doubly substituted consensus peptide. Indeed, this is precisely what we observed (Table I). These results are consistent with the crystal structure of a PTP1B-Tyr(P) peptide complex (21), in which the bound peptide lies in an extended conformation without any interactions between side chains of the peptide itself. Finally, Ac-KKKKpYPKKK-NH2, which was predicted to be the worst substrate based on the scanning results, is resistant to PTP1B-catalyzed hydrolysis.
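The 48-fold prediction follows from treating each subsite as an independent multiplicative factor on kcat/Km, which is equivalent to additivity on a logarithmic scale. A minimal sketch of that arithmetic, using only the fold changes and the consensus kcat/Km quoted in the text (the function and variable names are illustrative):

```python
import math


def predicted_fold_decrease(single_fold_decreases):
    """Multiply single-site fold decreases, assuming independent subsites."""
    return math.prod(single_fold_decreases)


CONSENSUS_KCAT_KM = 2.2e7   # M^-1 s^-1, consensus peptide (from the text)
fold_lys_minus1 = 3.0       # Phe -> Lys at position -1
fold_pro_plus1 = 16.0       # Met -> Pro at position +1

predicted_fold = predicted_fold_decrease([fold_lys_minus1, fold_pro_plus1])
predicted_kcat_km = CONSENSUS_KCAT_KM / predicted_fold

# Independence is additive in log space: log10(3) + log10(16) = log10(48).
log_sum = math.log10(fold_lys_minus1) + math.log10(fold_pro_plus1)

print(predicted_fold, f"{predicted_kcat_km:.2e}", round(log_sum, 3))
```

A measured double-substitution effect far from this product would indicate coupling between the −1 and +1 subsites.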
In summary, we have developed a novel combinatorial approach to identify consensus sequence motifs for PTPases. Inverse alanine scanning offers several advantages over classical combinatorial library methods. Most notable is the use of a small number of peptides of defined purity, rather than complex mixtures, to interrogate structural diversity space. Consequently, explicit kinetic data can be acquired for all library members, thereby furnishing general specificity trends in the context of rigorous kinetic information. The library data reported herein have revealed an array of heretofore unknown features of the substrate specificity of PTP1B, including clear preferences for and aversions to specific residues at sites both adjacent to and noncontiguous with the hydrolyzable Tyr(P) moiety. These data have allowed us to prepare a potent peptide-based substrate for this enzyme as well as the first example of a Tyr(P)-containing peptide that fails to serve as a PTP1B substrate. We anticipate that the subsequent analysis of other members of the large PTPase family, as well as other enzymes that catalyze a variety of post-translational modifications, should likewise furnish consensus sequence motifs.
[Table I, partial: ... 0.14 ± 0.004; Ac-ELEKpYPDYE-NH2, 0.050 ± 0.0002; Ac-KKKKpYPKKK-NH2, no reaction]