Mass spectrometric analysis of 40 S ribosomal proteins from Rat-1 fibroblasts.

Although sequences of most mammalian ribosomal proteins are available, little is known about the post-translational processing of ribosomal proteins. To examine their post-translational modifications, 40 S subunit proteins purified from Rat-1 fibroblasts and their peptides were analyzed by liquid chromatography coupled with electrospray mass spectrometry. Of 41 proteins observed, 36 corresponded to the 32 rat 40 S ribosomal proteins with known sequences (S3, S5, S7, and S24 presented in two forms). The observed masses of S4, S6-S8, S13, S15a, S16, S17, S19, S27a, S29, and S30 matched those predicted. Sa, S3a, S5, S11, S15, S18, S20, S21, S24, S26-S28, and an S7 variant showed changes in mass that were consistent with N-terminal demethionylation and/or acetylation (S5 and S27 also appeared to be internally formylated and acetylated, respectively). S23 appeared to be internally hydroxylated or methylated. S2, S3, S9, S10, S12, S14, and S25 showed changes in mass inconsistent with known covalent modifications (+220, −75, +86, +56, −100, −117, and −103 Da, respectively), possibly representing novel post-translational modifications or allelic sequence variation. Five unidentified proteins (12,084, 13,706, 13,741, 13,884, and 34,987 Da) were observed; for one, a sequence tag (PPGPPP), absent in any known ribosomal proteins, was determined, suggesting that it is a previously undescribed ribosome-associated protein. This study establishes a powerful method to rapidly analyze protein components of large biological complexes and their covalent modifications.

Ribosomes are complex structures consisting of large numbers of proteins that interact with rRNA to form a functional protein-synthesizing entity. Much progress has been made in identifying the components of prokaryotic ribosomes and in understanding their structure and function in protein synthesis (reviewed in Refs. 1 and 2). However, less information is available for the more complex eukaryotic ribosomes. Previous studies have identified several covalent modifications of eukaryotic rRNA, demonstrating their importance in the enzymology and stability of ribosomes (reviewed in Refs. 2 and 3).
Less is known about the covalent modifications of eukaryotic ribosomal proteins, although protein and cDNA sequence information is now available for nearly all of the yeast, rat, and human ribosomal proteins (reviewed in Refs. 4-6). Nevertheless, protein co- or post-translational modifications may have effects on the activity, stability, assembly, or localization of eukaryotic ribosomes.
Mass spectrometry can be used to identify and characterize protein covalent modifications (reviewed in Refs. 7 and 8) without prior knowledge of their chemistry, providing an unbiased approach that is not achievable using techniques such as metabolic radiolabeling or chemical derivatization. Electrospray ionization mass spectrometry (9) is a particularly sensitive means of obtaining masses of proteins up to 100,000 Da with uncertainties of ~0.01% from small amounts of sample (nano- to micrograms) (reviewed in Refs. 10 and 11). When electrospray ionization mass spectrometry is coupled with reversed-phase HPLC (LC/MS), components within mixtures can be analyzed by mass as they elute from columns, allowing rapid analysis of proteins or peptides. Tandem mass spectrometry (MS/MS), in which peptide ions are fragmented by collision-induced dissociation (CID), provides amino acid sequence information, enabling identification of modified residues (12). CID can be performed on all components in a sample by raising the orifice voltage (13) or on a single ion selected in the first quadrupole, activated by argon in the second quadrupole, and analyzed in the third quadrupole (14). Multisubunit complexes can also be analyzed by mass spectrometry, allowing identification of the protein components and their covalent modifications. Only two examples of this type of analysis have been reported previously, examining subunits of cytochrome bc₁ (15) and eukaryotic nucleosomes (16). In each case, protein components (<11) were first extracted and purified from these complexes prior to mass spectral analysis. In principle, the purification can be eliminated by the use of LC/MS analysis, saving time and avoiding loss by handling. Furthermore, the high sensitivity of the method enables analysis of limiting amounts of material, such as samples purified from cultured mammalian cells.
In this study, we report new information about the protein composition of 40 S ribosomal subunits and the nature of co- or post-translational modifications of ribosomal proteins. Our study demonstrates a novel application of LC/MS and LC/MS/MS to rapidly identify the protein constituents and protein modifications of a large biological complex purified from cultured mammalian cells.
dishes and allowed to grow to confluence for ribosome purification.
Preparation of 40 S Ribosomal Proteins-40 S ribosomes were purified by a procedure modified from Terao and Ogata (17). All procedures were performed at 4°C unless otherwise stated. Cells were rinsed twice with 5 ml of phosphate-buffered saline and twice with 5 ml of 0.25 M sucrose, 25 mM KCl, 5 mM MgCl₂, 1 mM benzamidine (Sigma), 20 µg/ml each leupeptin and aprotinin (U. S. Biochemical Corp.), 5 µg/ml pepstatin (U. S. Biochemical Corp.), 30 µM 3,4-dichloroisocoumarin (Sigma), and 50 mM Tris-HCl, pH 7.6. Cells were drained, scraped, and then homogenized with a tight-fitting glass homogenizer. The homogenate was centrifuged at 27,000 × g for 10 min, and the resulting supernatant was centrifuged at 350,000 × g for 100 min. Pellets were resuspended in 1.4 ml of 0.15 M sucrose, 10 mM MgCl₂, 25 mM KCl, and 35 mM Tris-HCl, pH 7.6, to which was added 0.4 ml of 2.5 M KCl and 10 mM MgCl₂ and 0.2 ml of 10% (w/v) sodium deoxycholate. Approximately 0.5 ml of the suspension was layered onto 2.5 ml of 0.3 M sucrose, 10 mM MgCl₂, 0.6 M KCl, and 35 mM Tris-HCl, pH 7.6, and centrifuged at 350,000 × g for 2 h. The resulting pellets were resuspended in 0.7-1.0 ml of 0.25 M sucrose, 5 mM MgCl₂, 50 mM KCl, 10 mM NaHCO₃, and 50 mM Tris-HCl, pH 7.6, and centrifuged at 27,000 × g for 10 min, yielding a supernatant enriched in 80 S ribosomes. 80 S ribosomes were dissociated into 40 S and 60 S subunits by addition of puromycin (Sigma) to 0.1 mg/ml and incubation for 10 min at 37°C, followed by addition of β-mercaptoethanol to 5 mM, KCl to 0.7 M, and MgCl₂ to 1.25 mM and incubation for 15 min at 37°C. To purify ribosomal subunits, the dissociated ribosomes (0.5 ml) were layered onto 15-30% linear sucrose gradients (11.5 ml) containing 0.3 M KCl, 3 mM MgCl₂, 20 mM Tris-HCl, pH 7.6, and 0.5 mM dithiothreitol, and the gradients were centrifuged at 125,000 × g for 13 h in a Beckman SW 41 Ti rotor.
Gradients were eluted at 0.5 ml/min in 0.5-ml fractions and analyzed by 15% SDS-polyacrylamide gel electrophoresis visualized by silver staining (18). Fractions containing 40 S subunits were pooled and diluted 2-fold with 0.6 M KCl, 6 mM MgCl₂, 40 mM Tris-HCl, pH 7.6, and 1 mM dithiothreitol, and subunits were pelleted in 2-3-ml volume at 490,000 × g for 4 h. The final pellets were stored at −80°C until use.
To dissociate 40 S ribosomal proteins from rRNA, each pellet was resuspended in 100 µl of 6 M guanidine HCl, pH 5.0, followed by addition of 200 µl of 10% acetic acid and incubation on ice for at least 5 min. Precipitated rRNA was removed by centrifugation at 10,000 rpm for 2 min at room temperature. Supernatants were used for direct LC/MS analyses as well as for partial protein purification and peptide generation for further mass spectral analyses.
Protein Purification and Peptide Generation for Mass Spectral Analyses-40 S ribosomal proteins (rRNA-free) were partially resolved on either a Brownlee C4 microbore (Applied Biosystems, Inc., 2.1 mm × 3 cm) or a POROS R120 PEEK reversed-phase column (PerSeptive Systems; 2.1 mm × 3 cm). Columns were equilibrated in 0.1% trifluoroacetic acid, and proteins were eluted with a gradient in 0-80% acetonitrile and 0.08% trifluoroacetic acid at 1% acetonitrile/min and 200 µl/min. Protein elution was monitored at 215 nm by UV absorbance detection (Applied Biosystems, Inc., Model 785A; 2.4-µl flow cell), and fractions were collected manually. In some cases, post-column splitting was used to direct 10-15% of the HPLC effluent to the mass spectrometer and the rest to the UV absorbance detector for manual fraction collection. Acetonitrile/trifluoroacetic acid in the fractions was removed by lyophilization, and mass spectral analyses of either intact or proteolyzed proteins were performed after solubilization with 20 µl of 8 M urea and 80 µl of 0.1 M Tris-HCl, pH 8. Proteolysis was carried out with 5% (w/w) endoproteinase Lys-C (Wako Bioproducts) for 3-4 h at 37°C. Prior to mass spectral analysis, samples were acidified by addition of formic acid to a final concentration of 1% (v/v).
Mass Spectral Analyses-Mass determinations of proteins or peptides were performed as described (19-21) using an HPLC (Applied Biosystems, Inc., Model 140B) directly coupled to a Perkin-Elmer Sciex API-III triple quadrupole mass spectrometer equipped with a nebulization-assisted electrospray source and a high pressure collision cell. Protein components of 40 S subunits were separated on fused silica capillary columns (500 µm × 20 cm) packed with POROS R120 resin. Following column equilibration in 0.1% formic acid, proteins were loaded and eluted with a gradient in 0-80% acetonitrile and 0.1% formic acid at 1% acetonitrile/min and 20 µl/min. Peptides generated after proteolysis were analyzed in a similar manner, except that Vydac C18 resin (Hewlett-Packard Co.) buffered with 0.05% trifluoroacetic acid was used; peptides were eluted with a 0-40% acetonitrile gradient at 2% acetonitrile/min. Mass spectra for proteins were obtained by scanning from 350 to 1800 Da/e with 0.13-Da/e step size and 0.5-ms dwell time. The ionspray needle was held at a potential of 4.8 kV with a nebulizing air flow rate of 0.6 liter/min and an orifice voltage of 75-80 V. Conditions for peptide analysis were identical except that spectra were obtained by scanning from 50 to 1600 Da/e with 0.2-Da/e step size, and the ionspray needle potential was decreased to 4.5 kV; to obtain CID at the orifice, the orifice voltage was increased to 95 V.
MS/MS was performed as proteins or peptides eluted from the capillary column into the mass spectrometer (LC/MS/MS) (19-22). Individual ions were selected for CID in the first quadrupole and then accelerated to a kinetic energy of 16-18 eV and collisionally activated with argon at a thickness of 3.5-4.2 × 10¹⁴ atoms/cm³ in the high pressure collision cell of the second quadrupole. MS/MS data were collected in the third quadrupole by scanning from 50 to 1600 Da/e with 0.15-Da/e step size and 1.0-ms dwell time. Fragment ions produced were usually "b" and "y" types; in some cases, doubly charged fragment ions were observed (referred to as b²⁺ and y²⁺, corresponding to (b + 1H)²⁺ and (y + 1H)²⁺, respectively, in the nomenclature of Biemann (8)). LC/MS and LC/MS/MS data were analyzed using software provided by Perkin-Elmer Sciex. Figures were prepared by importing data into Sigmaplot (Jandel Scientific), Photoshop (Adobe), or Canvas (Deneba).
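As a rough illustration of the b- and y-type fragment ions described above, the following sketch computes singly and doubly charged fragment m/z values for a peptide. It is not the instrument software used in this study: the function name `fragment_ions` is hypothetical, and the monoisotopic residue masses are standard textbook values rather than numbers taken from this paper.

```python
# Monoisotopic amino acid residue masses in Da (standard values; an assumption
# of this sketch, not data reported in the paper).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O retained by y-type (C-terminal) fragments
PROTON = 1.00728   # mass of one added proton


def fragment_ions(peptide, charge=1):
    """Return (b, y) lists of m/z values for b_1..b_(n-1) and y_1..y_(n-1).

    charge=1 gives the usual b and y ions; charge=2 gives the (b + 1H)2+ and
    (y + 1H)2+ ions, i.e. the singly charged fragment plus one extra proton.
    """
    masses = [RESIDUE[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_sum = sum(masses[:i])             # first i residues
        y_sum = sum(masses[-i:]) + WATER    # last i residues plus water
        b_ions.append((b_sum + charge * PROTON) / charge)
        y_ions.append((y_sum + charge * PROTON) / charge)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("MLMPK")
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.2f}   y{i} = {yi:.2f}")
```

A modified residue would shift every fragment that contains it by the mass of the modification, which is how a modification can be localized to the first few residues of a peptide.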

Mass Determinations of 40 S Ribosomal Proteins
Ribosomes were purified from Rat-1 fibroblasts and separated into 60 S and 40 S subunits. 40 S ribosomal proteins were dissociated from 18 S rRNA and analyzed by LC/MS. Fig. 1 displays the mass/charge ratio (m/z) of the ribosomal proteins versus HPLC elution. Each protein is observed as a harmonic series of coeluting ions with varying m/z values. For example, in scans 328-330 (Fig. 1, arrow), a discrete series is observed that is clearly separated from neighboring series. Several scans were summed to produce an ion spectrum; for example, Fig. 2A illustrates an m/z series from MH₉⁹⁺ to MH₂₀²⁰⁺. Protein masses were then derived by deconvolution of the m/z series; for example, Fig. 2B shows the deconvolution of the data in Fig. 2A, indicating a mass of 14,709 Da. Even in situations where proteins coeluted (for example, scans 195-215), each protein produced a separate ion series that deconvoluted into distinguishable masses. Masses were accepted as valid ribosomal components if they were observed in each of three independent 40 S ribosomal protein preparations. All proteins selected by this criterion were observed with signal intensities >5-fold above background. Experimental errors were typically <0.01% of total protein mass, as measured by standard deviations of independent determinations represented by each set of m/z values as well as by standard deviations of mass determinations from three independent preparations of 40 S ribosomal proteins. Observed masses did not vary when ribosomes were prepared in the presence versus absence of protease inhibitors, indicating that proteolysis after cell disruption was not significant.
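The relationship between a protein mass and its series of multiply charged ions can be sketched as follows. This is a minimal illustration, not the vendor deconvolution software used here: the function names are hypothetical, and the adduct mass of 1 Da per charge is a simplification (a more exact proton mass of about 1.007 Da could be passed instead). Each ion of charge z is taken to appear at (M + z)/z, and the charge of a peak is recovered from its spacing to the neighboring peak.

```python
import statistics


def mz_series(mass, z_min, z_max, adduct=1.0):
    """m/z expected for each charge state z, using (M + z*adduct) / z."""
    return {z: (mass + z * adduct) / z for z in range(z_min, z_max + 1)}


def mass_from_adjacent(mz_z, mz_z_plus_1, adduct=1.0):
    """Infer charge and mass from two neighboring peaks of one ion series.

    mz_z is the peak at the higher m/z (charge z); mz_z_plus_1 is its
    neighbor at lower m/z (charge z + 1).
    """
    z = round((mz_z_plus_1 - adduct) / (mz_z - mz_z_plus_1))
    return z, z * (mz_z - adduct)


def deconvolute(peaks, adduct=1.0):
    """Mean mass and standard deviation from all adjacent peak pairs."""
    ordered = sorted(peaks, reverse=True)  # highest m/z = lowest charge first
    estimates = [
        mass_from_adjacent(hi, lo, adduct)[1]
        for hi, lo in zip(ordered, ordered[1:])
    ]
    return statistics.mean(estimates), statistics.pstdev(estimates)


if __name__ == "__main__":
    # Simulated series for a 14,708.7-Da protein over charge states 9+ to 20+.
    series = mz_series(14708.7, 9, 20)
    mean_mass, spread = deconvolute(series.values())
    print(f"deconvoluted mass = {mean_mass:.1f} Da (sd {spread:.3f})")
```

Because every pair of neighboring peaks gives an independent estimate of the same mass, the spread of those estimates is one way to express the measurement error of a single determination.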
LC/MS analyses of the 40 S ribosomal proteins yielded 41 protein masses. Other investigators have identified 33 rat 40 S ribosomal proteins, and amino acid and/or cDNA sequence information is available for 32 of these proteins (reviewed in Ref. 4). Comparison of our observed masses with masses predicted from amino acid sequences showed that, within experimental error, 12 of the 41 proteins had observed masses identical to the predicted masses of S4, S6, S7, S8, S13, S15a, S16, S17, S19, S27a, S29, and S30 (Table I). Of these 12 proteins, two (S6 and S7) retained their initiator Met as indicated by amino acid sequencing (23,24), and S27a and S30 were observed without their respective N-terminal ubiquitin and ubiquitin-like fusions (25,26). (Proteins that have the initiator Met removed after translation, as demonstrated by previous studies, are noted by Footnote j in Table I.)

Identification of 40 S Ribosomal Proteins
To verify these assignments and to identify the remaining 29 proteins, ribosomal proteins were partially purified, digested with endoproteinase Lys-C, and analyzed by LC/MS. An example of peptide mapping by LC/MS is shown in Fig. 3A for a protein with a mass of 31,451 Da. Peptide masses observed by LC/MS of the endoproteinase Lys-C digests were compared with predicted peptide masses for each of the ribosomal proteins. In the example shown, 16 observed peptide masses matched, within experimental error, to peptide masses predicted from the known amino acid sequence of ribosomal protein S2, establishing the identity of this protein (Table II and Fig. 3B). In general, four categories of peptides were often not found in peptide maps: (i) small hydrophilic peptides that might not bind to the Vydac C18 resin, (ii) large hydrophobic peptides that might not release from the resin, (iii) cysteine-containing peptides that might form intramolecular or intermolecular disulfide bridges, and (iv) covalently modified peptides (e.g. peptides 1 and 1/2 from S2; see below). In many cases, peptide identities were confirmed by high orifice voltage CID, which yielded sequence information (13) (… Table II and Fig. 3B). These methods confirmed the identities of all 32 of the known 40 S ribosomal proteins, allowing mass assignments to be made. This analysis is summarized in Table I, indicating the differences in observed versus predicted mass for each protein (third and fourth columns, respectively). The residues in each protein that were identified by peptide mapping (Table I, sixth column) include those that were recognized by mass alone (Roman type in the sixth column) and those for which high orifice voltage CID sequencing was obtained (italic type in the sixth column). The percentage of total residues accounted for by the identified peptides ("coverage") ranged from 16 to 97%, with the majority >50%, and the number of peptides observed for each protein ranged from 2 to 15 (seventh and eighth columns). Peptide mapping confirmed the identities of 11 proteins with observed masses similar to predicted masses (see below for the twelfth protein, S27a); N-terminal peptide masses were observed for S7, S8, S13, S15a, S19, S29, and S30, confirming the presence or absence of the initiator Met.
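The peptide-mapping comparison described above (predicted Lys-C peptide masses matched against observed masses, then summarized as coverage) can be sketched as below. This is an illustrative assumption of how such a comparison might be coded, not the procedure or software used in this study: the function names, the 0.5-Da matching tolerance, and the toy sequence are hypothetical, and the residue masses are standard monoisotopic values. Cleavage is assumed to occur after every Lys.

```python
# Monoisotopic residue masses in Da (standard values, assumed for this sketch).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def lys_c_digest(sequence):
    """Split after every Lys; return (first_residue, last_residue, peptide)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa == "K":
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):  # C-terminal peptide that does not end in Lys
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


def mh_plus(peptide):
    """Predicted m/z of the singly protonated peptide (MH+)."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


def map_peptides(sequence, observed, tol=0.5):
    """Return matched peptides and percent coverage of the sequence."""
    matched = [
        (first, last, pep)
        for first, last, pep in lys_c_digest(sequence)
        if any(abs(mh_plus(pep) - m) <= tol for m in observed)
    ]
    covered = sum(last - first + 1 for first, last, _ in matched)
    return matched, 100.0 * covered / len(sequence)


if __name__ == "__main__":
    toy = "MLMPKKQPNSAIRK"  # hypothetical sequence built for illustration only
    hits, coverage = map_peptides(toy, observed=[619.2])
    print(hits, f"{coverage:.0f}% coverage")
```

With these masses, `mh_plus("MLMPK")` evaluates to about 619.3 Da/e, in line with the predicted value quoted later in this paper for residues 1-5 of S10. Incomplete cleavage products such as MLMPKK are not generated by this simple digest and would need to be added separately.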

Covalent Modifications of 40 S Ribosomal Proteins
Of the remaining 29 protein masses, 24 corresponded to known ribosomal proteins, deduced by peptide mapping and sequencing (Table I). Our following predictions about the modifications of these proteins are based on the comparison of observed mass differences with the masses of known covalent amino acid modifications (27). In most cases, modifications could be localized within the protein amino acid sequence by identification and partial sequencing of modified peptide(s). Mass differences were accountable by known covalent modifications in 13 cases (Sa, S3a, S5, S11, S15, S18, S20, S21, S23, S24, S26, S27, and S28), leaving seven in which mass differences were clearly apparent, but corresponded to no known protein covalent modifications (S2, S3, S9, S10, S12, S14, and S25). In each of four cases (S3, S5, S7, and S24), two protein forms were observed by LC/MS.
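The comparison of an observed mass difference with the masses of known modifications can be sketched as a simple lookup. The table below is a deliberately small, assumed list of nominal mass shifts for the modifications discussed in this paper (it is not the full catalogue of Ref. 27), and the 2-Da tolerance and the function name are illustrative choices. The combined N-terminal shift is entered as the sum of −131 and +42 Da.

```python
# Nominal mass shifts in Da for the modifications discussed in the text.
# This short list and the tolerance below are assumptions of the sketch.
KNOWN_SHIFTS = {
    "demethionylation": -131.0,
    "acetylation": 42.0,
    "demethionylation + acetylation": -89.0,
    "oxidation or hydroxylation": 16.0,
    "methylation": 14.0,
    "formylation": 28.0,
}


def candidate_modifications(delta, tol=2.0):
    """Names of known modifications whose shift lies within tol of delta."""
    return [
        name for name, shift in KNOWN_SHIFTS.items()
        if abs(delta - shift) <= tol
    ]


if __name__ == "__main__":
    # Observed-minus-predicted mass differences of the kind reported in Table I.
    for delta in (-130, 42, -90, 16, 220, 86):
        names = candidate_modifications(delta)
        print(f"{delta:+5d} Da:", ", ".join(names) if names else "no match")
```

A mass difference with no match in such a table is only flagged as unexplained; it may still reflect a combination of modifications, an unknown modification, or a sequence difference.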
Mass Differences Consistent with Known Covalent Modifications-S3a and S26 showed mass differences (−130 ± 1 Da) consistent with demethionylation (−131 Da). Most likely, demethionylation would occur at the N terminus, although the N-terminal peptides (MAVGK, S3a; MTK, S26) were not observed due to their small size and hydrophilicity. S21, S24, and S28 showed mass differences consistent with acetylation (+42 Da). Partial sequencing of the N-terminal peptide from S21 indicated that an acetyl group on this protein was located within the first four residues, in agreement with a previous report (28) of a blocked N terminus (MQNDAGEFVDLYVPRK: …).
Interestingly, S24 coeluted with a protein of mass 311 Da smaller than the predicted S24 mass or 354 Da smaller than the observed S24 mass. This protein could be a variant of acetylated S24, accounted for by proteolytic removal of the three C-terminal residues on S24 (PKE, predicted = 354 Da). The ratio of intensities of the unproteolyzed versus the putative proteolyzed form was 1.4:1 (data not shown), indicating that the proteolyzed form is minor. The ratio was not significantly altered by protease inhibitors added during ribosome preparation, suggesting that the modified form was not produced during purification.

FIG. 2. … MH₉⁹⁺ to MH₂₀²⁰⁺ (from right to left; the smallest peak that is not labeled is MH₂₀²⁰⁺ = 736.1 Da/e). Protein mass can be calculated from each peak using the formula (M + z)/z = observed mass of ion, where M is protein mass and z is charge. Protein masses in Table I are determined from these calculations. B, the deconvoluted mass spectrum of the m/z series in A indicates a major protein of 14,708.7 Da and two minor proteins of 14,740.8 and 14,766.0 Da; these minor peaks most likely represent an oxidation product of the major protein and a contaminating adduct (+57 Da) variably found in our experiments, respectively. The areas under these peaks are semiquantitative approximations of the relative amounts of each form.

S28 also differed from its predicted mass by +42 Da. However, the N-terminal peptide from S28, identified by high orifice voltage CID, showed a mass difference of +58 Da rather than +42 Da; the +58-Da modification was located within the first six residues (MDTSRVQPIK: predicted MH¹⁺ + 58/MH₂²⁺ + (58/2) = 1232.6/616.8 Da/e, observed = 1232.6/617.0 Da/e; observed fragment ions: b₆ + 58, b₇ + 58, and y₅ = 749.8, 876.2, and 585.0 Da/e, respectively). Because the intact protein differed by +42 Da from the predicted mass, we ascribe the additional 16 Da to oxidation of the acetylated initiator Met during peptide preparation.
Finally, S23 was observed with a mass 16 Da greater than predicted. A peptide observed with this mass difference corresponded to residues 61-68 of S23 (QPNSAIRK: predicted MH¹⁺ + 16 = 929.5 Da/e, observed = 929.6 Da/e). A mass difference of +16 Da is consistent with oxidation, hydroxylation, or methylation. Based on the peptide sequence, hydroxylation (of Lys or Pro) or methylation (of Lys) is more likely because oxidation generally occurs on Met. It is also possible that the observed mass difference might result from an amino acid substitution of Cys for Ser, Glu for Ile, Ile for Pro, Leu for Pro, or Ser for Ala.
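The list of substitutions above can be reproduced by a short enumeration: for each residue in the peptide, find every replacement residue whose mass differs by the observed amount. The sketch below is illustrative only; the function name and the 0.1-Da tolerance are assumptions, and the residue masses are standard monoisotopic values rather than data from this study.

```python
# Monoisotopic residue masses in Da (standard values, assumed for this sketch).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def substitutions_matching(peptide, delta, tol=0.1):
    """Sorted (original, replacement) pairs whose mass change is delta +/- tol.

    Only residues that actually occur in the peptide are considered as the
    original residue.
    """
    hits = set()
    for old in set(peptide):
        for new, new_mass in RESIDUE.items():
            if new != old and abs((new_mass - RESIDUE[old]) - delta) <= tol:
                hits.add((old, new))
    return sorted(hits)


if __name__ == "__main__":
    for old, new in substitutions_matching("QPNSAIRK", 16.0):
        print(f"{new} for {old}")
```

Applied to QPNSAIRK with a shift of +16 Da, this enumeration returns the same five substitutions named in the text. Mass alone cannot distinguish them from hydroxylation or from one another, so sequencing of the modified peptide would be needed.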
Although S7 was among those proteins observed with exact predicted mass, a modified form that was 42 Da larger eluted 0.6 min later than the unmodified form. Peptide mapping of a mixture containing both proteins yielded two forms of the N terminus differing by 42 Da …
Six proteins (Sa, S11, S15, S18, S20, and S27) showed mass differences consistent with combined demethionylation and acetylation (−90 Da). In four cases (Sa, S11, S15, and S18), N-terminal demethionylation and acetylation were confirmed by identification of correspondingly modified N-terminal peptides. Partial sequencing by high orifice voltage CID indicated that the modifications occurred within the first six residues of Sa, the first four residues of S11, the first four residues of S15, and the first five residues of S18 (… Da/e, respectively). Removal of the initiator Met and acetylation of the succeeding Ala residue of S15 have been previously documented by Edman degradation and mass spectrometry of the modified N terminus (29). In the case of S20, no modified peptide was observed; most likely, the modification occurs at the N-terminal peptide (MAFK), which would not bind to the Vydac C18 resin. In the case of S27, the N-terminal peptide was found to be demethionylated (−131 Da), but not acetylated (MPLARDLLHPSLEEEK: predicted MH₂²⁺ − (131/2) = 874.6 Da/e, observed = 874.4 Da/e). The N-terminal peptide was identified by high orifice voltage CID with observed b₆ …
S5 also displays N-terminal demethionylation and acetylation, as its N-terminal peptide was 90 Da smaller than predicted (MTEWETATPAVAETPDIK: predicted MH₂²⁺ − (90/2) = 951.1 Da/e, observed = 951.2 Da/e). In this case, peptide sequencing was carried out by LC/MS/MS, and the modifications were located within the first three residues of S5 (Fig. 5). However, the total mass observed still differed from that predicted by +29 Da after accounting for the −90-Da N-terminal modification; therefore, an additional residue modified by +29 Da is expected. This mass difference would be consistent with formylation of a free amine, such as Lys, or might result from amino acid substitution of Lys or Gln for Val. Although an internal peptide with a +29-Da mass difference was not found, this model is supported by the detection of a second form of S5 with a mass 70 Da greater than predicted. The N-terminal peptide from the second form is 42 Da greater than predicted, indicating that it contains the initiator Met and is acetylated (predicted MH₂²⁺ + (42/2) = 1017.1 Da/e, observed = 1016.6 Da/e). Thus, two acetylated forms of S5 are present, with and without the initiator Met (ratio of intensities = 1:4.2; data not shown), both of which contain a residue additionally modified by +29 Da.

TABLE I footnotes (e-q)
e Mass difference calculated as observed mass minus predicted mass.
f Peptides identified by LC/MS and high orifice voltage CID sequencing are indicated by residue numbers. Peptides identified by mass alone are indicated in light-face Roman type. Peptides identified by three or more fragment ions by CID are indicated in bold-face italic type. Peptides identified by fewer than three fragment ions by CID are indicated in light-face italic type; in these cases, fragment ions were consistent with fragmentation N-terminal to Pro. A bold-face number in parentheses indicates the observed mass difference from the predicted mass for that peptide. Brackets indicate an incomplete proteolytic product. In all cases, observed masses were within 1 Da of predicted masses. Monoisotopic masses were used for peptides <1500 Da and average masses for larger peptides.
g Numerator and denominator represent the number of observed and predicted peptides, respectively, from a complete endoproteinase Lys-C digestion (not including incomplete proteolytic products).
h Percent coverage of residues observed from LC/MS. The total number of residues in each protein is indicated in parentheses.
i A protein coeluting with S3 that appears to be a variant form of S3 (see "Results").
j N-terminal Met is removed after translation for S4 (43), S8 (44), S9 (45), S13 (46), S15a (47), S16 (48), S17 (49), S19 (50), S23 (51), and S29 (52).
k A protein coeluting with S5 that appears to be a methionylated variant of S5 (see "Results").
l A major protein eluting after S7 that appears to be an acetylated form of S7 (see "Results").
m Previously shown to be acetylated at the N terminus after removal of the initiator Met (29).
n A protein coeluting with S24 that appears to be a proteolyzed form of S24 (see "Results").
o Mass without ubiquitin N-terminal fusion domain, predicted from the cDNA sequence (25).
p Mass without ubiquitin-like protein fusion domain, predicted from the cDNA sequence (26).
q Proteins unidentifiable with any of the known ribosomal proteins.

Mass Differences Unaccounted for by Known Covalent Modifications-Seven ribosomal proteins showed differences between observed versus predicted masses that were unaccounted for by known covalent modifications, raising the possibility that ribosomal proteins contain previously undescribed post-translational modifications. For example, S2 was observed with a mass 220 Da greater than predicted. Peptide mapping revealed N-terminal peptides 1 (Fig. 3A, asterisk) and 1/2 (arrowhead), which were each 220 Da greater than predicted, indicating that the modification is located near the N terminus (peptide …).
S10 and S12 were observed with mass differences of +57 and −100 Da, respectively. Their N-terminal peptides were observed with masses that agreed with those predicted (for S10, residues 1-5 (MLMPK) and residues 1-6 (MLMPKK): predicted MH¹⁺ = 619.3 and 747.4 Da/e, respectively, and observed = 619.2 and 747.4 Da/e, respectively; for S12, MAEEGIAAGGVMDVNTALQEVLK: predicted MH₂²⁺/MH₃³⁺ = 1174.4/783.2 Da/e, observed = 1174.6/783.8 Da/e). This indicates that the sites of modification are located elsewhere within these proteins. For S10, a C-terminal peptide with an observed mass 56 ± 1 Da greater than predicted was found (KAEAGAGSATEFQFRGGFGRGRGQPPQ: predicted MH₂²⁺ + (56/2)/MH₃³⁺ + (56/3) = 1411.5/941.3 Da/e, observed = 1411.6/941.2 Da/e). This C-terminal sequence contains RGGF and GR repeats, also seen within the modified N-terminal sequence of S2 (Fig. 3B). An internal modified peptide from S12 was not identifiable by peptide mapping.
S3, S9, S14, and S25 were observed with masses that differed from those predicted by −75, +86, −117, and −103 Da, respectively. In these cases, neither N-terminal nor modified internal peptides could be identified by peptide mapping, revealing little information as to whether these represent individual modifications, combinations of more than one modification, or allelic variations in sequence. S3 coeluted with a protein 362 Da smaller than the predicted and 287 Da smaller than the observed S3 mass. Peptide mapping of this mixture revealed a peptide consistent with the C terminus of S3 with a mass 288.0 ± 1 Da smaller than predicted (GGKPEPPAMPQPVPTA: predicted MH¹⁺ − 288 = 1286.8 Da/e, observed = 1286.8 Da/e). Thus, the coeluting protein appears to be a form of S3 with an additional modification of −288 Da. This mass change is not easily explained by C-terminal proteolysis, suggesting that the modification may include amino acid residue changes. The ratio of intensities of the larger versus smaller forms of S3 was 2.4:1 (data not shown).

FIG. 3. LC/MS analysis of an endoproteinase Lys-C digest of S2.
A, S2 was partially purified by reversed-phase HPLC, lyophilized, and digested with endoproteinase Lys-C, and peptides were analyzed by LC/MS (see "Experimental Procedures"). The m/z values are plotted against elution as in Fig. 1, except that most peptides are represented by only one parent ion. In some cases, peptide ions were fragmented by high orifice voltage CID; these appear as lines of less intense fragment ions trailing below more intense parent ions, e.g. peptide 22 from S2 (arrow), located at scans 526-529 and discussed in the legend to Fig. 4. The N-terminal peptides 1 and 1/2, located at scans 431-434 (asterisks and arrowheads, respectively), showed a modification of +220 Da as discussed under "Discussion." B, the amino acid sequence of S2 is shown with peptide coverage. Slashes indicate the expected cleavages by endoproteinase Lys-C. The observed S2 peptides are underlined, and the N-terminal peptides observed with mass differences of +220 Da are italicized. Fragment ions generated by high voltage CID of the endoproteinase Lys-C peptides are marked separately for y ions and for b ions.

Identification of S27a by LC/MS/MS Sequencing of an Intact Protein
Although peptide mapping and sequencing information allowed us to unequivocally identify 31 of the 32 known rat 40 S ribosomal proteins, the situation was complicated for S27a because peptide mapping revealed no CID data and low coverage (16%). Four peptide ions from endoproteinase Lys-C digestion were observed that agreed with predicted masses (residues 7-13 (KSYTTPK) and residues 8-14 (SYTTPKK): predicted MH¹⁺ = 696.3 Da/e, observed = 696.2 Da/e; residues 8-13 (SYTTPK): predicted MH¹⁺ = 543.4 Da/e, observed = 543.4 Da/e; residues 24-28 (LAVLK): predicted MH¹⁺ = 824.4 Da/e, observed = 824.4 Da/e). However, these data were insufficient to unequivocally identify the protein as S27a. The observed mass and elution time of this protein (9403 Da, 19.1 min) were similar to those of S27 (9477 Da, 20.3 min), suggesting that the protein might be a modified form of S27. An HPLC fraction containing the candidate protein with a mass of 9403 Da was subjected to LC/MS/MS, selecting the MH₁₅¹⁵⁺ (672.6 Da/e) ion for fragmentation. Fragmentation resulted in partial CID at the C terminus, TYCFNKPEDK (Fig. 6). The partial sequencing was sufficient to identify the 9403-Da protein as S27a.

Unidentified Proteins in 40 S Ribosomal Subunits
Five of the 41 proteins observed in our LC/MS analysis could not be identified as any of the known 40 S ribosomal proteins. Based on their association with 40 S subunits in each of three independent preparations, we named them Sb, Sc, Sd, Se, and Sf, with masses of 12,084, 13,706, 13,741, 13,884, and 34,987 a Peptides generated from a digestion of S2 by endoproteinase Lys-C, numbered from the N terminus. b Elution time from the data in Fig. 3. c Residues are numbered as described by Suzuki et al. (53). d Calculated using monoisotopic masses for peptides Ͻ1300 Da and average masses for peptides Ͼ1300 Da. Da, respectively. Total ion intensities of these proteins were measured by integration of deconvoluted mass spectra (as in Fig. 2A) and were compared with those of nearby proteins. The intensity of Sb was 40 -100% of S20; Sd and Se 30 -70% of S12; and Sc and Sf 5-50% of Sa. Sc and Sf eluted late from HPLC, suggesting that they are hydrophobic; the lower intensities may be due to less efficient ionization of hydrophobic proteins or to lower recovery from HPLC. The direct comparison of summed scans of Sa, Sc, and Sf showed that, despite the lower total ion intensities, individual ions from Sc were actually more intense than those from Sa and Sf (Fig. 7A). In four of the five cases (Sb, Sd, Se, and Sf), no information beyond total mass was obtained. Further studies are needed to determine if these represent modified forms of the known ribosomal proteins. Sufficient sequence information was obtained for Sc (13,706 Da) (Fig. 7B).) These ion series differed in mass by 2 Da; speculatively, a Val residue, which is 2 Da larger than Pro, might reside near this sequence. The sequence PPGPPP or PPPGPP was absent in any of the known 40 S or 60 S ribosomal proteins. DISCUSSION In this study, we present a novel application of mass spectrometry to analyze the protein components of rat 40 S ribosomal subunits from Rat-1 fibroblasts. 
Sufficient information was provided by LC/MS peptide mapping and LC/MS/MS sequencing to assign 36 of the 41 observed masses to the 32 40 S ribosomal proteins for which sequence information is available. Of the 32 ribosomal proteins, only 12 showed observed masses that matched those predicted from amino acid sequences (S4, S6-S8, S13, S15a, S16, S17, S19, S27a, S29, and S30). Twenty showed observed masses that differed from those predicted, indicating that the majority of 40 S ribosomal proteins are co- or post-translationally modified (Sa, S2, S3, S3a, S5, S9-S12, S14, S15, S18, S20, S21, and S23-S28). Four ribosomal proteins were represented by more than one mass, indicating heterogeneity in processing (S3, S5, S7, and S24). After accounting for 36 observed proteins, five proteins remained unidentified (referred to as Sb, Sc, Sd, Se, and Sf), of which Sc appears to represent a unique ribosomal or ribosome-associated protein.
FIG. 5. LC/MS/MS sequencing of the N-terminal peptide from S5. The S5 peptide digest was applied to a Vydac C18 capillary column, and a single ion was selected on-line for MS/MS by collisional activation with argon. The parent ion was 951.6 Da/e (MH₂²⁺ of the N-terminal peptide). The sequence of this peptide is shown with cleavages that generate the observed b ions and y ions, which are labeled in the spectrum. Dashed ions were observed with intensities less than five times background (not labeled). The dehydrated ions are indicated as singly or doubly charged ions, differing by −18 or −9, respectively. The observed b3-8 and b11-14 ...
The differences between predicted versus observed protein masses most likely reflect covalent modifications of amino acids, although variations in amino acid sequence from data base information cannot be excluded. In most cases, reasonable predictions could be made about the chemistry of the modifications. The majority of observed mass differences were accounted for by acetylation and/or removal of the initiator Met. These reactions are catalyzed co-translationally by Met aminopeptidases and Nα-acetyltransferases (30-32). Four proteins (S7, S21, S24, and S28) showed mass differences consistent with N-terminal acetylation in the presence of the initiator Met. S24 and S28 have N-terminal sequences (MNDT and MDTS, respectively) with Asn and Asp as penultimate residues, which are among the residues most susceptible to acetylation at the initiator Met (Asn, Asp, and Glu) (33,34). S7 and S21 have N-terminal sequences (MFSS and MQND, respectively) that are not typically acetylated (34); however, Boissel et al. (34) have demonstrated that acetylation sometimes occurs at the initiator Met with penultimate Phe and Gln residues. Nineteen 40 S ribosomal proteins were demethionylated at the first position. Eleven of these confirmed previous studies by Edman degradation (S4, S8, S9, S13, S15, S15a, S16, S17, S19, S23, and S29); our results suggest eight more examples (Sa, S3a, S5, S11, S18, S20, S26, and S27). Six of the demethionylated proteins showed masses consistent with further N-terminal acetylation (Sa, S5, S11, S15, S18, and S20). After removal of the initiator Met, the N-terminal residues are Ala or Ser in Sa, S11, S15, S18, and S20 and Thr in S5. These results agree with previous studies showing that proteins with Gly, Ala, Ser, or Thr as the penultimate residue are usually demethionylated and acetylated (33,34) and that the majority of acetylated proteins are modified on either Ser or Ala (35). Removal of the initiator Met often destabilizes proteins, enhancing their susceptibility to degradation by the ubiquitin pathway (36). Acetylation in the presence or absence of Met prolongs the half-lives of proteins, preventing their degradation (30,36).
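The penultimate-residue tendencies summarized above can be written as a simple lookup. The following is a minimal sketch, not the procedure used in this study: the function name, the rounded mass shifts (−131 Da for loss of Met, +42 Da for acetylation), and the reduction of "usually" and "most susceptible" to fixed rules are all simplifying assumptions made for illustration.

```python
# Nominal mass shifts in Da, rounded (assumed values for illustration).
MET_REMOVAL = -131   # loss of the initiator Met residue
ACETYLATION = 42     # addition of an acetyl group

# Penultimate residues named in the text (one-letter codes).
DEMETHIONYLATED = set("GAST")    # Gly, Ala, Ser, Thr: Met removed, then acetylated
ACETYLATED_ON_MET = set("NDE")   # Asn, Asp, Glu: Met retained and acetylated


def predict_n_terminal_shift(sequence):
    """Return (description, expected mass shift in Da) for an N-terminal sequence.

    Hypothetical helper: encodes only the general tendencies cited in the text.
    """
    if len(sequence) < 2 or sequence[0] != "M":
        raise ValueError("expected a sequence starting with the initiator Met")
    penultimate = sequence[1]
    if penultimate in DEMETHIONYLATED:
        return "demethionylated and acetylated", MET_REMOVAL + ACETYLATION
    if penultimate in ACETYLATED_ON_MET:
        return "acetylated with Met retained", ACETYLATION
    return "no modification predicted", 0


# N-terminal sequences quoted in the text.
for name, n_term in [("S24", "MNDT"), ("S28", "MDTS"), ("S7", "MFSS"), ("S21", "MQND")]:
    print(name, n_term, predict_n_terminal_shift(n_term))
```

Under these rules S24 and S28 are predicted to be acetylated on the retained Met, whereas S7 and S21 are not, although both were observed in acetylated form. The mismatch illustrates why such rules serve only as a first guess that must be checked against measured masses.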
Other proteins showed mass differences that could be accounted for by (i) proteolyzed forms of S3 (−288 Da) and S24 (−354 Da), (ii) internal acetylation of S27 (+42 Da), (iii) formylation of S5 (+29 Da), and (iv) methylation or hydroxylation of S23 (+16 Da). The modifications of the remaining proteins (S2, S3, S9, S10, S12, S14, and S25) are likely to be more complex. For example, ribosomal proteins S2 and S10 showed mass differences of +220 and +56 Da, respectively. These mass differences correspond to no known naturally occurring modifications, although they might reflect combinations of several modifications, with or without amino acid substitutions. The +220-Da modification of S2 was located within the first 54 residues. Masses of known modifications similar but not identical to this value include myristoylation (+210 Da) or biotinylation (+226 Da). It is possible that the mass difference is actually 178 Da if the N terminus is acetylated, 351 Da if it is demethionylated, or 309 Da if it is demethionylated and acetylated. Known modifications with masses similar to these estimates are glucuronylation (176 Da), 4-oxyglycosylation (177 Da), and N-glycolneuraminylation (307 Da). In S10, the +56-Da modification was located within the C-terminal peptide. This mass difference does not correlate to known protein modifications, but could be accounted for by an extra Gly (+57 Da) incorporated in the protein sequence or an amino acid substitution of Ile or Leu for Gly. Further studies are needed to identify the chemistry of these modifications.
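The arithmetic behind the alternative S2 estimates can be made explicit. The sketch below is illustrative only: it uses the rounded values implied by the text (42 Da for an acetyl group, 131 Da for the initiator Met) and the modification masses quoted above. The function names and the 2-Da matching tolerance are assumptions introduced here.

```python
# Rounded mass values in Da, as implied by the text.
ACETYL = 42   # mass added by N-terminal acetylation
MET = 131     # mass lost when the initiator Met is removed

# Modification masses quoted in the text (Da).
KNOWN_MODIFICATIONS = {
    "myristoylation": 210,
    "biotinylation": 226,
    "glucuronylation": 176,
    "4-oxyglycosylation": 177,
    "N-glycolneuraminylation": 307,
}


def residual_shift(observed_delta, demethionylated=False, acetylated=False):
    """Mass shift left unexplained after assumed N-terminal processing."""
    residual = observed_delta
    if demethionylated:
        residual += MET      # Met loss would hide 131 Da of added mass
    if acetylated:
        residual -= ACETYL   # the acetyl group accounts for 42 Da of the shift
    return residual


def candidates(residual, tolerance=2):
    """Names of quoted modifications within `tolerance` Da (assumed cutoff)."""
    return sorted(
        name for name, mass in KNOWN_MODIFICATIONS.items()
        if abs(mass - residual) <= tolerance
    )


S2_DELTA = 220  # observed minus predicted mass of S2
for demet, acet in [(False, False), (False, True), (True, False), (True, True)]:
    r = residual_shift(S2_DELTA, demethionylated=demet, acetylated=acet)
    print(f"demethionylated={demet}, acetylated={acet}: {r} Da -> {candidates(r)}")
```

With these inputs the four scenarios give 220, 178, 351, and 309 Da, matching the values stated in the text. None of the quoted modifications matches exactly, which is why the identity of the S2 modification remains open.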
Both S2 and S10 are modified within sequences that contain RGGF and GR repeats, which are found within several nucleolar proteins (37-40). Wool et al. (4) previously noted these repeats within S2, suggesting that they may direct nucleolar localization of S2 or mediate S2 binding to nascent rRNA. It is tempting to speculate that the modifications of S2 and S10 somehow influence S2 and S10 translocation or rRNA binding, thus regulating 40 S ribosomal protein assembly. It is also possible that the modification of S2 regulates binding or translation of mRNA, given that S2 has RNA binding potential and is likely to be localized near the ribosome active site, based on the homology between eukaryotic S2 and Escherichia coli S5 (2) and the immune electron microscopy of rat 40 S subunits (41).
Five proteins (Sb, Sc, Sd, Se, and Sf) were observed with masses that did not correspond to any known ribosomal protein. S1 is the only 40 S ribosomal protein for which no published cDNA sequence is available. Sf (34,987 Da) is most likely identical to S1, based on estimates of S1 mass by two-dimensional SDS-polyacrylamide gel electrophoresis (42). LC/MS/MS sequencing of Sc (13,706 Da) revealed a sequence tag of six residues: PPGPPP or PPPGPP. This sequence is absent in all of the known ribosomal proteins. Sc also appears to have greater hydrophobicity than the other ribosomal proteins based on elution time and mass. Hydrophobic proteins are known to ionize inefficiently, which might explain the low intensity observed for Sc (and Sf). Nevertheless, ions from Sc (Fig. 7A) showed intensities exceeding those of the coeluting protein Sa, indicating that Sc is relatively abundant. The reproducible appearance of Sc in three separate preparations of 40 S ribosomal proteins raises the possibility that Sc might represent a previously unidentified protein associated with 40 S ribosomal subunits.
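Checking that a sequence tag is absent from a set of known proteins amounts to a substring search over both possible readings of the tag. The sketch below shows one way to do this. The function name is an assumption, and the two sequences are hypothetical toy strings, not real ribosomal protein sequences; a real check would load the published sequences instead.

```python
# Both readings of the six-residue tag reported for Sc.
TAG_VARIANTS = ("PPGPPP", "PPPGPP")


def proteins_containing_tag(sequences, tags=TAG_VARIANTS):
    """Return sorted names of proteins whose sequence contains any tag variant.

    `sequences` maps a protein name to its one-letter amino acid sequence.
    """
    return sorted(
        name for name, seq in sequences.items()
        if any(tag in seq for tag in tags)
    )


# Hypothetical toy sequences for illustration only (NOT real protein data).
toy_sequences = {
    "toyA": "MAKRTKKVGIVGKYG",
    "toyB": "MSLPPGPPPKR",
}
print(proteins_containing_tag(toy_sequences))
```

An empty result over the full set of known 40 S and 60 S sequences would correspond to the absence reported here for Sc.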
The direct coupling of HPLC with mass spectrometry simplified the analyses of ribosomal complexes by eliminating the purification of individual proteins. This allowed us to detect multiple forms of certain proteins and to estimate their relative abundances. Two forms of S3, S5, S7, and S24 were observed, one form predominating in each case. Thus, the acetylated form of S7 was more abundant than its deacetylated form, and the demethionylated acetylated form of S5 was more abundant than its methionylated acetylated form. The unproteolyzed form of S24 predominated over its corresponding proteolyzed form. Similarly, a form of S3, 75 Da smaller than predicted, predominated over a variant that was 362 Da smaller than predicted.
Our study illustrates a novel and powerful method to identify components of large protein complexes and their co- or post-translational modifications. Most biological events in cells are carried out by proteins and nucleic acids that exist and function within large complexes; thus, macromolecular assembly has emerged as an important aspect of biological regulation. The method we have developed lays the groundwork for analyzing many types of protein complexes on a scale provided by mammalian cell culture. Ultimately, these techniques will be useful for characterizing post-translational modification events that regulate cellular responses, including processes that control cell growth, transformation, and differentiation.
Acknowledgments-We are grateful to Dr. Robert M. Barkley for technical advice on mass spectrometry, Mark Kissinger for cell culture preparation, and Carla Padilla for secretarial assistance. We also thank