Tyrosine phosphorylation mapping of the epidermal growth factor receptor signaling pathway.

Phosphorylation is one of the most common forms of protein modification. The most frequent targets for protein phosphorylation in eukaryotes are serine and threonine residues, although tyrosine residues also undergo phosphorylation. Many of the currently applied methods for the detection and localization of protein phosphorylation sites are mass spectrometry-based and are biased against the analysis of tyrosine-phosphorylated residues because of the stability and low reactivity of phosphotyrosines. To overcome this lack of sensitive methods for the detection of phosphotyrosine-containing peptides, we have recently developed a method that is not affected by the more predominant threonine or serine phosphorylation within cells. It is based on the specific detection of immonium ion of phosphotyrosine at 216.043 Da and does not require prior knowledge of the protein sequence. In this report, we describe the first application of this new method in a proteomic strategy. Using anti-phosphotyrosine antibodies for immunoprecipitation and one-dimensional gel electrophoresis, we have identified 10 proteins in the epidermal growth factor receptor signaling pathway, of which 8 have been shown previously to be involved in epidermal growth factor signaling. Most importantly, in addition to several known tyrosine phosphorylation sites, we have identified five novel sites on SHIP-2, Hrs, Cbl, STAM, and STAM2, most of which were not predicted to be phosphorylated. Because of its sensitivity and selectivity, this approach will be useful in proteomic approaches to study tyrosine phosphorylation in a number of signal transduction pathways.

One of the most common types of post-translational modification is protein phosphorylation. Histidine, aspartic acid, and glutamic acid residues predominantly undergo phosphorylation in prokaryotes, whereas in eukaryotes serine, threonine, and tyrosine residues are the major targets of phosphorylation (1). It is estimated that approximately one third of all proteins in mammalian cells are phosphorylated at some time or another and that approximately 5% of a vertebrate genome encodes for protein kinases and protein phosphatases (2), underscoring the importance of this protein modification.
Signals from the extracellular environment that result in cellular changes ranging from membrane ruffling to mitogenesis are transmitted as a series of phosphorylation events that undergo amplification within the cytoplasm. It is therefore of pivotal importance to identify such phosphorylated residues on proteins that lead to assembly of larger multiprotein complexes as a result of interaction of phosphorylated residues with specific protein domains. This warrants the need for rapid and sensitive methods for detection of phosphorylation and precise localization of phosphorylated residues. The most sensitive methods for the detection of phosphorylation make use of the incorporation of phosphate groups labeled with radioactive phosphorus isotopes such as 32P or 33P. However, the in vivo incorporation of radioactive isotopes is inefficient because of the presence of endogenous ATP pools within cells. Therefore, almost 100-1000-fold greater amounts of radioactively labeled ATP are required to achieve a degree of in vivo phosphorylation that is sufficient for sensitive detection in comparison to in vitro kinase assays. To avoid working with such large amounts of radioactivity, nonradioactive methods are more desirable provided they are sufficiently sensitive. Most of the widely used nonradioactive methods are mass spectrometry-based because of the sensitivity and speed provided by this technology as compared with traditional biochemical methods.
Although approaches using matrix-assisted laser desorption ionization/time-of-flight MS alone, or in combination with immobilized metal affinity chromatography and/or phosphatase treatment, have been used for phosphorylation analysis, they still suffer from a number of drawbacks, the major one being the inability to directly sequence the peptides to unambiguously identify the phosphorylation site (see, e.g., Ref. 3). Therefore, (electrospray) tandem mass spectrometry-based approaches are often preferred because they allow subsequent localization of the phosphorylation sites by peptide sequencing. One commonly used electrospray method is based on the neutral loss of HPO3 (80 Da) or H3PO4 (98 Da) moieties (4). Although this technique is quite useful for phosphorylated serine or threonine residues, it is less suitable for phosphotyrosines because the phosphate group on tyrosine residues is more stable and does not detach easily. A more universal technique is the selective detection of a characteristic negatively charged fragment ion at m/z −79 (PO3−) for the identification of any phosphorylated species in peptide mixtures (5)(6)(7). However, the alkaline conditions under which this method must be performed preclude direct sequencing of peptides in a sensitive manner.
* This work was supported in part by a grant from the Danish National Research Foundation to the Center for Experimental Bioinformatics. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18
Here we describe the application of a novel tandem mass spectrometry (MS/MS)-based technique that relies on specific detection of tyrosine-phosphorylated peptides (8,9). This method, termed phosphotyrosine-specific immonium ion (PSI) scanning, is based on detection of the specific immonium ion of phosphotyrosine at 216.043 Da as a characteristic "reporter" ion for precursor tyrosine-phosphorylated species in complex peptide mixtures. As sequencing is performed immediately after the detection of the phosphotyrosine-containing peptide, this method does not require knowledge of the protein sequence for phosphotyrosine localization, i.e. it is also applicable to novel/unknown proteins. Because numerous other peptide-derived fragment ions have the same nominal mass of 216 Da but different exact masses, high resolution, high accuracy quadrupole time-of-flight tandem mass spectrometers have to be employed for PSI scanning experiments. If PSI scanning is performed using a triple quadrupole tandem mass spectrometer (the type of instrument normally employed for precursor ion experiments), numerous false positives will be encountered, because this type of mass spectrometer cannot resolve the immonium ion of phosphotyrosine from other interfering peptide-derived fragment ions.
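The need for high mass accuracy in PSI scanning can be illustrated with a short calculation. The sketch below derives the mass of the phosphotyrosine immonium ion from its elemental composition (C8H11NO4P+, i.e. the tyrosine immonium ion plus HPO3) and applies a narrow acceptance window around it. The window width of 0.02 Da and the interfering ion at m/z 216.10 are hypothetical values chosen for illustration only; they are not instrument settings or measurements from this study.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard values, rounded).
MONO = {"C": 12.0, "H": 1.0078250, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}
ELECTRON = 0.00054858


def ion_mz(formula, charge=1):
    """m/z of a positive ion given its elemental composition as {element: count}."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return (neutral - charge * ELECTRON) / charge


# Phosphotyrosine immonium ion: tyrosine immonium (C8H10NO+) plus HPO3.
PY_IMMONIUM = ion_mz({"C": 8, "H": 11, "N": 1, "O": 4, "P": 1})


def is_psi_reporter(fragment_mz, tolerance=0.02):
    """True if a fragment falls inside the (assumed) acceptance window."""
    return abs(fragment_mz - PY_IMMONIUM) <= tolerance


print(round(PY_IMMONIUM, 3))      # close to the 216.043 Da quoted in the text
print(is_psi_reporter(216.043))   # reporter ion is accepted
print(is_psi_reporter(216.10))    # hypothetical isobaric interference is rejected
```

An instrument that reports only nominal masses would treat both fragments as m/z 216, which is the source of the false positives described above for triple quadrupole instruments.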
In this report, PSI scanning was used to identify several phosphoproteins in the EGF receptor signal transduction pathway. The identified proteins included well characterized proteins such as Shc, Cbl, Hrs, and signal transduction adaptor molecule (STAM) in addition to STAM2, a molecule whose involvement in the EGF receptor pathway has recently been reported (10,11). In addition, we found Ku 70 autoantigen and Hsp 70 as tyrosine-phosphorylated proteins. A number of tyrosine phosphorylation sites on these molecules were identified; five of them have never been described in the literature despite the fact that most of these proteins have been intensively studied by several groups. None of these novel tyrosine-phosphorylated residues conform to the known consensus motifs for SH2 or phosphotyrosine binding domains. The ability of a number of computer programs to predict these identified tyrosine phosphorylation sites was compared. Quite remarkably, most of the programs failed to predict the novel tyrosine-phosphorylated sites, which emphasizes a major limitation of such programs. PSI scanning can be used in high-throughput identification of in vivo tyrosine phosphorylation sites on proteins, which is an important step in the elucidation of regulatory networks within a cell.

EXPERIMENTAL PROCEDURES
Growth Factors and Antibodies-EGF and agarose-conjugated 4G10 monoclonal anti-phosphotyrosine antibody were purchased from Upstate Biotechnology (Waltham, MA). RC20 monoclonal anti-phosphotyrosine antibody was from Transduction Laboratories (Lexington, KY).
Cell Culture and Immunoprecipitation-HeLa cells were normally grown in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum plus antibiotics. For immunoprecipitation experiments, a total of 10^8 HeLa cells were grown to 80% confluence and then cultured for an additional 15 h without serum. Cells were either untreated or treated with 1 μg/ml EGF for 5 min and subsequently lysed in 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 1% Nonidet P-40, 0.25% sodium deoxycholate, and 1 mM sodium orthovanadate in the presence of protease inhibitors. Cleared cell lysates were incubated for 4 h at 4°C with a mixture of anti-phosphotyrosine antibodies: 100 μg of 4G10 monoclonal antibody coupled to agarose beads and 50 μg of biotin-conjugated RC20 monoclonal antibody bound to streptavidin-agarose beads. Precipitated immune complexes were washed three times with lysis buffer and then eluted twice with 100 mM phenylphosphate in lysis buffer at 37°C. Proteins from control and treated conditions were separated by SDS-PAGE under reducing conditions. After visualization by silver staining, bands of interest were excised and subjected to in-gel reduction, alkylation, and digestion with trypsin (sequencing grade; Roche Diagnostics, Mannheim, Germany) as described previously (12). Chemicals used for the in-gel derivatization reactions were purchased from Sigma.
Mass Spectrometry-High purity solvents used for nanoelectrospray experiments were purchased from Labscan (Dublin, Ireland). All experiments were performed on a QSTAR Pulsar quadrupole time-of-flight tandem mass spectrometer (AB/MDS-Sciex, Toronto, Canada) equipped with a nanoelectrospray ion source (MDS Proteomics, Odense, Denmark). Precursor ion scanning experiments were acquired with a dwell time of 50 ms at a step size of 0.5 Da and with the Q2-pulsing function turned on. The Q0 voltage, which determines the collision energy, was set to a value corresponding to one tenth of the m/z value of the precursor ion (for further details, see Refs. 8 and 9).
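The acquisition parameters given above translate into simple arithmetic, sketched here. The function names and the scan range of m/z 400-1000 are hypothetical and serve only to show how the step size, dwell time, and the one-tenth rule for the Q0 voltage combine; the unit of the resulting setting is left unspecified because the text does not state it.

```python
def collision_setting(precursor_mz):
    """Q0 setting taken as one tenth of the precursor m/z, as described in the text."""
    return precursor_mz / 10.0


def scan_points(start_mz, stop_mz, step=0.5):
    """Precursor m/z values visited when stepping from start to stop (inclusive)."""
    n_steps = int(round((stop_mz - start_mz) / step)) + 1
    return [start_mz + i * step for i in range(n_steps)]


def scan_time_s(start_mz, stop_mz, step=0.5, dwell_ms=50):
    """Duration of one precursor ion scan, ignoring any instrument overhead."""
    return len(scan_points(start_mz, stop_mz, step)) * dwell_ms / 1000.0


print(collision_setting(773))      # setting for the precursor at m/z 773
print(scan_time_s(400, 1000))      # seconds per scan over an assumed range
```

Under these assumptions a single scan over 600 m/z units takes about one minute, since 1201 steps are each held for 50 ms.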
Protein digests were desalted and concentrated on a tandem arrangement comprising POROS R2 and OLIGO R3 columns (Perseptive Biosystems, Framingham, MA) prepared in GELoader tips (Eppendorf, Hamburg, Germany) as described previously (13,14). Columns were eluted in three steps (20, 40, and 60% methanol in 5% formic acid, respectively) directly into nanoelectrospray needles (MDS Proteomics), and each fraction was subjected to MS analysis. Proteins were identified employing the peptide sequence tag approach (15). Peptide sequence tags, derived from fragment ion spectra of selected peptides, were searched against the nonredundant protein database using the program PepSea (MDS Proteomics).

Isolation of Tyrosine-phosphorylated Proteins
It is important to study events such as phosphorylation in the context of a cell because the products of phosphorylation reactions carried out in vitro may not be detected under more physiological conditions. We decided to use the EGF receptor signal transduction pathway as a system in which to test our phosphotyrosine detection method. The two major challenges with characterization of in vivo phosphorylation in signaling are that first, most signaling molecules are not found in high abundance in cells, and second, only a fraction of a given protein undergoes phosphorylation in a cell. HeLa cells express endogenous EGF receptors as well as other downstream components of this signal transduction pathway and were, therefore, chosen for our experiments. Two cell cultures were used where one was stimulated with EGF while the control set was left unstimulated. To enrich for tyrosine-phosphorylated proteins, cleared cell lysates were subjected to affinity chromatography using anti-phosphotyrosine antibodies immobilized on agarose beads. After washing the column several times, the bound proteins were eluted with phenylphosphate, which mimics phosphotyrosine residues and thus competitively binds to the anti-phosphotyrosine antibody. After elution, the samples were dialyzed and resolved by SDS-PAGE. As a limited quantity of cells was used for these experiments and protein concentrations in lysates were low, we performed silver staining, which is a very sensitive method for visualizing proteins in polyacrylamide gels. Silver staining allows one to detect subpicomole amounts of proteins quite easily and is compatible with subsequent mass spectrometric analysis. The differentially tyrosine-phosphorylated gel bands ( Fig. 1) that were clearly present in the EGF-treated lane and absent in the control lane were excised, reduced, alkylated, and digested for subsequent MS analysis as described under "Experimental Procedures."

Strategy for Sequencing of Tyrosine-phosphorylated Proteins by PSI Scan
Phosphotyrosine mapping, in addition to ordinary protein identification, was accomplished with the PSI scanning approach as follows (see Fig. 2). After acquisition of a mass spectrum of the peptide mixture obtained from tryptic digests of silver-stained bands, the spectrum was evaluated to identify peptide ion signals for further MS/MS experiments. During this time, a PSI scanning experiment was performed to identify phosphotyrosine-containing peptides. Finally, those peptides that gave rise to an ion signal in the precursor ion spectrum were first analyzed by MS/MS to obtain sequence information for protein identification as well as for localization of the phosphorylated tyrosine residue. Additional peptides were subjected to MS/MS experiments to confirm the protein identified on the basis of the tyrosine-phosphorylated peptides and also to identify other proteins that may be present. Table I lists all the proteins identified from the bands excised from the silver-stained gel shown in Fig. 1.
To reduce interference by peptides that show ion signals at similar m/z values, (i.e. to decrease the complexity of the protein digest), the peptide mixtures were fractionated before mass spectrometric analysis. This was achieved by loading the protein digests onto a tandem column arrangement comprised of a POROS R2 and an OLIGO R3 microcolumn previously described by us (6). Such a procedure ensures that smaller and more hydrophilic peptides, which do not bind to the POROS R2 resin, are retained in the subsequent OLIGO R3 column. Therefore, this column setup also reduces the loss of phosphorylated peptides. Subsequently each column was eluted with 20% (fraction 1), 40% (fraction 2), and 60% methanol (fraction 3) containing 5% formic acid and each fraction subjected to nanoelectrospray analysis.

Identification of Proteins and Tyrosine Phosphorylation Sites
EGF Receptor (Band I)-Mass spectrometric analysis of band I, eluted with 20% methanol, 5% formic acid from the POROS R2 microcolumn showed several major ion signals (see Fig. 3A). MS/MS sequencing revealed that these ion signals were attributable to peptides derived from the EGF receptor. The increase in tyrosine phosphorylation of EGF receptor upon EGF stimulation was expected because EGF leads to homodimerization and an increase in catalytic activity of the EGF receptor (16). A PSI scan experiment showed one clear ion signal at approximately m/z 773 (Fig. 3B). The corresponding mass spectrum of the EGF receptor digest in the m/z range around 773 displays only one triply charged species of minor abundance. Sequencing of this peptide ion revealed it to contain Tyr-1172 (data not shown; for further detail, see Ref. 8), which is located in the C terminus of the EGF receptor and has previously been described as one of the major autophosphorylation sites (17,18). Indeed, mutation of Tyr-1172 in addition to C-terminal Tyr-1197, leads to a substantial decrease in binding of several EGF receptor substrates such as phospholipase Cγ1, ras GTPase-activating protein, p85 subunit of phosphatidyl-

TABLE I
The table lists all the proteins identified in this study along with their database accession numbers and the band number from which they were derived (Fig. 1).

SH2-containing Inositol 5-Phosphatase 2 (SHIP-2) (Band II)-Analysis of band II revealed a number of multiply charged species in the mass spectrum (fraction 3 of the POROS R2 microcolumn). Sequencing of the major peptides observed showed the presence of SHIP-2 with the EGF receptor as a minor component. The presence of EGF receptor is likely the result of smearing from the slower migrating band I, which contains the EGF receptor. SHIP-2 was previously shown to be phosphorylated in response to EGF and to associate with Shc (20,21).
Although its precise role in growth factor receptor signaling is not understood, SHIP-2 has been shown to negatively regulate insulin-induced signaling (22).
A PSI scanning experiment showed three peaks at m/z 509, 553, and 612 (data not shown). All of them corresponded to "invisible" ion signals in the mass spectrum of the protein digest. In other words, no ion signals were apparent in the mass spectrum of the protein digest as they were hidden by chemical noise. Sequencing of the fragment ion observed at m/z 612 showed that it corresponded to a tyrosine-phosphorylated tryptic peptide containing Tyr-1162 of SHIP-2 (GLPSDpYGRPLSFPPPR) (Fig. 4).
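As a consistency check, the m/z expected for a phosphopeptide can be computed from its sequence. The sketch below uses standard monoisotopic residue masses and treats phosphorylation as the addition of HPO3 to tyrosine; the `pY` notation follows the peptide sequences given in the text, and the charge state of 3 is the one stated in the legend of Fig. 4. The function name is illustrative and does not correspond to any software used in this study.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HPO3 = 79.96633
PROTON = 1.00728


def phosphopeptide_mz(sequence, charge):
    """m/z of a peptide in which each 'pY' marks a phosphotyrosine."""
    n_phospho = sequence.count("pY")
    bare = sequence.replace("pY", "Y")
    mass = sum(RESIDUE[aa] for aa in bare) + WATER + n_phospho * HPO3
    return (mass + charge * PROTON) / charge


# SHIP-2 peptide containing Tyr-1162, assumed triply charged.
print(round(phosphopeptide_mz("GLPSDpYGRPLSFPPPR", 3), 2))
```

The computed value of about 612.63 agrees with the precursor observed at m/z 612 in the PSI scan.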
Although Tyr-1162 in SHIP-2 was not reported as a tyrosine phosphorylation site before, the fact that it is phosphorylated is consistent with the evidence that residual tyrosine phosphorylation was observed, even though Taylor et al. (23) mutated the only tyrosine known to be phosphorylated in SHIP-2 (Tyr-986). This suggests that it is possible that the phosphorylation of tyrosine residue at position 1162 serves to recruit additional phosphotyrosine-interacting proteins to SHIP-2.
Hrs and Cbl (Band III)-Analysis of band III (fraction 1 of the POROS R2 microcolumn) showed several peptides derived from hepatocyte growth factor-regulated tyrosine kinase substrate, Hrs (24). A PSI scanning experiment identified two precursor ions one of which was sequenced and found to correspond to a tryptic peptide containing phosphorylated Tyr-334 of Hrs (YLNRNpYWEK; data not shown). Although Hrs is known to be rapidly phosphorylated on tyrosine residues after treatment with several growth factors including EGF, platelet-derived growth factor, and hepatocyte growth factor as well as by cytokines such as interleukin-2 and granulocyte/macrophage colony-stimulating factor, no phosphorylation sites have yet been identified on this molecule (25). Tyrosine-phosphorylated Hrs coimmunoprecipitates with STAM in a non-phosphotyrosine-dependent manner and with a 120-kDa phosphoprotein whose identity is unknown (25). It is possible that this unidentified protein associates only with the tyrosine-phosphorylated form of Hrs via Tyr-334. We also have preliminary evidence that an additional tyrosine residue (Tyr-329) is phosphorylated.
Analysis of the second elution fraction (POROS R2 microcolumn) revealed several peptides that could not be explained by a theoretical digest of the Hrs protein (see Fig. 5A). An adapter protein, Cbl, was identified as a second component present in this gel slice based on sequence information from other peptides (26). Both Cbl and Hrs have previously been shown to co-migrate on a one-dimensional gel (27). A PSI scan experiment showed that two ion signals (m/z 576 and 717) corresponded to peptides containing phosphotyrosine residues (Fig. 5B). Although no signal was evident in the mass spectrum of the digest at m/z 576 (see Fig. 5C), fragmentation of invisible peptide(s) at this m/z value gave a clear product ion spectrum that was derived from a peptide containing Tyr-674 of Cbl (IKPSSSANAIpYSLAAR) (see Fig. 5D). Although this tyrosine residue was predicted as being phosphorylated, attempts to investigate this were hindered because transfected cells did not express Cbl with a point mutation causing a residue change at Tyr-674 (28). This residue is found in a proline-rich region of Cbl, which is known to bind to the SH3 domains of several proteins including Grb2, Nck, and members of the Src and Btk family (29). Our findings suggest that Tyr-674 may mediate phosphotyrosine-dependent interactions as well. Interestingly, this peptide was not predicted by in silico tryptic digestions of the proteins present in the gel band. This is a result of the fact that there is a proline residue C-terminal to Arg-679, thereby normally preventing cleavage by trypsin at this position. This example shows the limitation of non-sequencing-based approaches for phosphorylation analysis such as matrix-assisted laser desorption ionization/time-of-flight mass spectrometry in combination with immobilized metal affinity chromatography enrichment or phosphatase treatment, which completely rely on relating expected and observed m/z values of ion signals. Because such methods do not allow further modifications or nonspecific cleavages of the protease, the novel phosphorylation site that we have identified would simply be missed. In addition, this example illustrates the power of the new method, which allows selective detection and immediate peptide sequencing of the same sample. Although the ion signal was completely hidden in the noise so that no charge state and therefore no mass assignment could be made, it was still possible to obtain a good product ion spectrum, revealing the amino acid sequence to unambiguously localize a phosphotyrosine residue in an unexpected peptide.
FIG. 4. Band II: product ion spectrum of the precursor at m/z 612.6 (3+). The major ion signals in the product ion spectrum were attributable to doubly charged y11 to y14 ions still carrying the phosphomoiety (marked with an asterisk) and singly charged y6 to y8 ions. These fragments allowed the identification of Tyr-1162 of SHIP-2 as the phosphorylated residue.
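Why the Cbl peptide escapes an in silico digest can be sketched with a minimal trypsin model that cleaves C-terminal to Lys or Arg unless the next residue is Pro. The input fragment below is assembled only from the peptide and the flanking residues given in the legend of Fig. 5, (K)IKPSSSANAIpYSLAAR(P); it is not the full Cbl sequence, and the function is an illustrative assumption rather than the software used in this study.

```python
def tryptic_digest(sequence, proline_rule=True):
    """Cleave after K or R; with proline_rule, skip sites followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last:
            if proline_rule and sequence[i + 1] == "P":
                continue  # trypsin normally does not cleave before proline
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


FRAGMENT = "KIKPSSSANAIYSLAARP"   # flanking K and P taken from the Fig. 5 legend
OBSERVED = "IKPSSSANAIYSLAAR"     # unmodified form of the sequenced peptide

predicted = tryptic_digest(FRAGMENT)
print(predicted)
print(OBSERVED in predicted)      # the observed peptide is absent from the prediction
```

Any approach that matches observed m/z values only against such a predicted list would therefore have no entry corresponding to the Tyr-674 peptide.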
STAM, STAM2, Hsp 70, and Ku 70 Autoantigen (Band IV)-Band IV was found to be a complex mixture as several proteins were identified from the second fraction eluted from POROS R2 column (Fig. 6A) by MS/MS sequencing. These proteins included bovine serum albumin, STAM, STAM2, Hsp 70, and Ku 70 autoantigen. As the cells used were human, bovine serum albumin is most likely a contaminant from the medium used for cell culture.
STAM was identified based on sequencing of a peptide that corresponds to its N terminus lacking an initiator methionine residue. This confirmed the STAM sequence reported by Takeshita et al. (30), who purified and sequenced STAM by Edman degradation and also found that it lacked the N-terminal methionine residue. In addition, a novel protein with significant homology to STAM was identified and was designated STAM2. This protein was recently cloned and investigated by two groups independently (10,11). An interesting attribute of STAM and STAM2 is that they are currently the only two cytosolic proteins that contain an immunoreceptor tyrosine-based activation motif domain, a motif normally only found in the cytoplasmic domains of immunoreceptors (31).
Two peptides corresponding to different charge states of the Tyr-198-containing phosphopeptide (QQSTTLSTLpYPSTSSLLTNHQHEGR) derived from STAM were identified by PSI scanning (Fig. 6, B and C). This phosphorylation site Tyr-198 is located near the N terminus of STAM and is not within the immunoreceptor tyrosine-based activation motif domain, which is considered to possess multiple tyrosine phosphorylation sites. The significance of this phosphorylation site was underscored by the fact that another ion signal corresponding to the homologous peptide SLpYPSSELQLNNK derived from STAM2 (Fig. 6D) was sequenced. Although the functional analysis of STAM2 showed evidence for phosphorylation in the C-terminal region (11), it could not be localized in the study presented here. As the C-terminal sequence of STAM contains only a few cleavage sites for enzymatic digestion (trypsin, GluC, AspN, LysC), in-gel digestion yields peptides that are larger than those normally observed by mass spectrometry. Thus, it is difficult to analyze this region using mass spectrometry in combination with conventional cleavage procedures. This region could conceivably be analyzed by using a less specific protease such as elastase (4) to obtain smaller peptides, which are within the normally observed mass range.
Hsp 70 proteins have been shown to undergo tyrosine phosphorylation in T and B cell receptor signaling pathways in a tyrosine kinase-dependent fashion (32,33). Hsp 70 isolated from nuclear envelope has also been shown to be tyrosine-phosphorylated. A closely related molecule, Hsp 72, was shown to accumulate in the nucleus upon tyrosine phosphorylation in response to heat stress (34). Hsp 70 was recently shown to be a component of a multiprotein complex involved in mitogen-activated protein kinase signaling (35). It is possible that Hsp 70 may also be involved in nucleocytoplasmic transport in a manner analogous to that of Hsp 72. Ku 70 autoantigen was another molecule identified from this band. It has been shown to associate with the cytoplasmic domain of CD 40 and translocate to the nucleus upon CD 40 ligation (36). To study these two proteins in more detail, we immunoprecipitated them using antibodies and subjected the immunoprecipitates to Western blotting with anti-phosphotyrosine antibodies. As shown in Fig. 7 (A and B), both of these proteins were found to be basally tyrosine-phosphorylated and there was no detectable increase in their tyrosine phosphorylation status upon stimulation with EGF. Because this band was a mixture of proteins, the most likely explanation is that Hsp 70 and Ku 70 autoantigen were tyrosine-phosphorylated in the basal state as well (i.e. present in the untreated lane) but could not be visualized under the silver-staining conditions used. Further experiments will be necessary to elucidate their role in receptor tyrosine kinase signaling, although it is likely that tyrosine phosphorylation does not play a major role.
FIG. 5. Band III: mass spectrometric analysis of the tryptic digest. A, mass spectrum of the fraction eluted with 40% methanol, 5% formic acid from the POROS R2 microcolumn. B, the PSI scanning showed that only the ion signals at m/z 576 and 717 were attributable to phosphotyrosine-containing peptides. C, an expanded view of the mass spectrum of the peptide mixture is shown. The arrow indicates the low signal intensity of the precursor ion, giving rise to the base peak in the PSI scan (see B). D, despite the low signal intensity, a product ion spectrum of the precursor at m/z 576.6 (3+) could be acquired that could be correlated to the nontryptic peptide (K)IKPSSSANAIpYSLAAR(P), which identified Tyr-674 of Cbl as the phosphorylation site.
Vimentin and Shc (Band V)-Two proteins were identified from this band: Shc and vimentin. Both proteins have been identified as phosphoproteins involved in receptor tyrosine kinase signaling pathways (37,38). MS/MS sequencing of the precursor ion observed in the PSI scan at m/z 659 identified Tyr-427 in Shc as a phosphorylated residue (ELFDDPSpYVNVQNLDK; see Fig. 8, A-C). This residue was already known to be phosphorylated upon T cell receptor as well as growth factor stimulation (38,39). The SH2 domain of the adapter protein, Grb2, has been shown to bind to this phosphorylated tyrosine residue that lies in the typical YXNX consensus motif for SH2 domain of Grb2 (40).
FIG. 6. Band IV: mass spectrometric analysis of the tryptic digest. A, mass spectrum of the fraction eluted with 40% methanol, 5% formic acid from the POROS R2 microcolumn. B, the PSI scanning showed that not all ion signals were attributable to phosphotyrosine-containing peptides. C, the product ion spectrum of the precursor at m/z 717.3 (4+) identified Tyr-198 of STAM as being phosphorylated (fragments carrying the phosphomoiety are marked with an asterisk). D, another major tyrosine phosphorylation site of STAM2 was identified as Tyr-192, based on the product ion spectrum of the precursor at m/z 786.9 (2+).

Comparison of Experimentally Determined Phosphotyrosine Residues with Predicted Phosphotyrosine Motifs
Table II lists the proteins in which a phosphotyrosine residue was identified in this study, the corresponding phosphopeptide sequence, and a comparison with predicted phosphotyrosine residues using NetPhos (www.cbs.dtu.dk/services/NetPhos/), ProSite (www.expasy.ch/prosite/), and Scansite (cansite.bidmc.harvard.edu/cantley85.phtml) prediction programs. As is clear from our analysis, the predicted phosphorylation sites need to be assessed with caution. ProSite tends to underpredict tyrosine phosphorylation sites because only the motifs [RK]-… indicating the phosphorylation sites) are used. This contrasts with NetPhos, which has the tendency to overpredict tyrosine phosphorylation sites. Scansite is a program that is essentially based on experimental data from phosphorylation studies performed in vitro. All of these programs failed to identify many of the phosphorylated tyrosine residues that we have identified. ProSite predicted one to three tyrosine phosphorylation sites/protein, but was able to predict only one phosphotyrosine residue of the several phosphotyrosines that were identified. On the other hand, the "success rate" of NetPhos and Scansite was only slightly better with the correct prediction of three of the identified tyrosine phosphorylation sites, despite the fact that up to 5-20 tyrosine phosphorylation sites/protein had been predicted by NetPhos and 1-7 sites/protein by Scansite. Four of the seven tyrosine phosphorylation sites were missed by NetPhos and Scansite, indicating numerous false negatives in addition to the false positives. For obvious reasons, it is quite difficult to experimentally label any phosphorylation prediction as a false positive based merely upon the inability to detect any given residue in its phosphorylated form. However, our data conclusively point out the limitations of the prediction programs and the consequent need to interpret such predictions with extreme caution.
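The kind of motif matching that underlies such predictions can be illustrated with the YXNX consensus for the Grb2 SH2 domain mentioned above for Shc. The sketch below encodes that consensus as a regular expression anchored on the phosphorylated tyrosine and applies it to the phosphopeptide sequences quoted in the text. This is a simplified illustration and an assumption about how a motif search might be written; it does not reproduce the algorithms of ProSite, NetPhos, or Scansite.

```python
import re

# Phosphopeptides as quoted in the text ('pY' marks the phosphotyrosine).
PHOSPHOPEPTIDES = {
    "SHIP-2": "GLPSDpYGRPLSFPPPR",
    "Hrs": "YLNRNpYWEK",
    "Cbl": "IKPSSSANAIpYSLAAR",
    "STAM": "QQSTTLSTLpYPSTSSLLTNHQHEGR",
    "STAM2": "SLpYPSSELQLNNK",
    "Shc": "ELFDDPSpYVNVQNLDK",
}

# YXNX consensus with the first position being the phosphorylated tyrosine.
GRB2_SH2_MOTIF = re.compile(r"pY.N.")


def matches_yxnx(peptide):
    """True if the phosphotyrosine is followed by any residue, Asn, any residue."""
    return GRB2_SH2_MOTIF.search(peptide) is not None


for protein, peptide in PHOSPHOPEPTIDES.items():
    print(protein, matches_yxnx(peptide))
```

Only the Shc peptide matches, which is in line with the observation that the novel sites do not conform to this consensus. Note that the Hrs peptide contains an unphosphorylated YLNR stretch that a search ignoring the phosphorylation state would report, one simple way in which sequence-only matching can overpredict.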

Conclusions
Analytical methods to identify in vivo tyrosine phosphorylation sites are currently quite limited. Mass spectrometry is a very sensitive method for the analysis of phosphorylation sites in general. However, because of properties inherent to phosphorylated tyrosine residues, it is not straightforward to use mass spectrometry for identification of tyrosine phosphorylation sites within proteins. Here we describe the application of a novel precursor ion scan that is specific for phosphotyrosine residues to study the EGF receptor signaling pathway. Tyrosine-phosphorylated proteins were isolated from EGF-stimulated and nonstimulated cells by immunoprecipitation with anti-phosphotyrosine antibodies. The proteins were separated by SDS-PAGE and visualized by silver staining. Five bands that were different between the lanes were excised and subjected to protein identification and PSI scanning for phosphotyrosine mapping. Eight of the 10 (see above) proteins have been described previously as being involved in receptor tyrosine kinase signaling pathways, such as EGF and platelet-derived growth factor signaling.
In this study, several known and novel tyrosine phosphorylation sites were identified in addition to identification of phosphoproteins. (i) Known tyrosine phosphorylation sites were confirmed for the EGF receptor (Tyr-1172) and Shc (Tyr-427).
Our results establish PSI scanning as a feasible method for the identification of tyrosine-phosphorylated proteins as well as for localization of phosphorylation sites when limited amounts of sample are available. A major advantage of this method is that it can be applied to a wide range of peptide sizes and charge states because the high resolving power of the quadrupole time-of-flight mass spectrometer allows unambiguous assignment of different charge states of highly charged (>3) peptides. Peptides up to approximately 2.8 kDa were successfully sequenced and the tyrosine phosphorylation site unambiguously identified. Finally, the low correlation between predicted and localized phosphorylation sites indicates that there are many more tyrosine phosphorylation motifs than are currently used by the prediction programs, underscoring the importance of unbiased methods for in vivo phosphorylation mapping such as the new method applied in this study.

TABLE II
A list of all tyrosine phosphorylation sites identified in this study
The table lists proteins in which a phosphotyrosine residue was identified in this study, the corresponding phosphopeptide sequence, and a comparison with predicted phosphotyrosine residues using ProSite (www.expasy.ch/prosite/), NetPhos (www.cbs.dtu.dk/services/NetPhos/) and Scansite (cansite.bidmc.harvard.edu/cantley85.phtml) prediction programs. The table also shows whether the corresponding tyrosine phosphorylation site was correctly predicted by the programs along with the total number of tyrosine phosphorylation sites predicted for each protein.