Testing the role of gp96 as peptide chaperone in antigen processing.

gp96 is a 96-kDa glycoprotein of the endoplasmic reticulum that is believed to be involved in antigen processing as an intermediate carrier of peptides for presentation by major histocompatibility complex (MHC) class I molecules. This function implies that gp96 carries a large array of different peptides that represent the antigenicity of the cell and can serve all MHC class I molecules. So far, the evidence regarding these peptides is largely indirect and based on experiments where mice immunized with gp96 from tumor or virus-infected cells developed T cellular immune responses with the corresponding specificities. We analyzed by mass spectrometry peptides isolated from gp96 and found a number of different peptides derived from the proteins of different cellular compartments but mostly cytoplasm and nucleus. The sequences of these peptides provide information on the specificity of antigen processing and reveal structural requirements for binding to gp96 that only partially correspond to those of peptides presented by MHC class I molecules. The yield of peptides extracted from gp96 was far substoichiometric with an estimated occupancy of this chaperone of between 0.1% and 0.4%. These results strongly argue against a regular role for gp96 as a peptide chaperone in antigen processing.

A number of different functions have been assigned to the endoplasmic chaperone gp96, including ATPase and aminopeptidase activity and a role in protein folding (1,2). Its involvement in cellular immunity first became apparent with the demonstration that it is immunogenic when injected into animals and induces immune responses against specific antigens of the cells from which it was isolated (3). This antigen transfer capacity was proposed to be the molecular basis for crosspriming, i.e. induction of T cell responses to antigens presented by MHC class I molecules of cells other than those that express the antigens (4,5), and gp96 was postulated to be a key intermediate endoplasmic carrier of peptides that are destined for presentation by MHC class I molecules. However, other studies have failed to demonstrate stimulation of tumor-specific T cells with gp96 isolated from the tumor cells (6). Also, gp96 was shown to directly stimulate dendritic cells to mature to competent antigen-presenting cells and to induce anti-tumor immune responses even when the peptide-binding domain had been deleted (7). The α2-macroglobulin receptor was shown to be the receptor for gp96 involved in these processes (8-10). All of these observations suggest that gp96 plays an important role in cellular immune responses. However, the proposed peptide chaperone function of this protein needs to be critically reevaluated (11,12). A prerequisite for the antigen-specific immune functions is that gp96 carries a wide range of different peptides from different source proteins, including T cell epitopes that can be bound and presented by different MHC molecules independent of the specific MHC alleles of the cells that express the antigen. However, so far no direct information on the composition of gp96-bound peptides has been reported.
Peptides presented by MHC class I molecules are thought to be derived mostly from defective ribosomal products, which are degraded by the proteasome system or, as recently reported, by tripeptidyl peptidase II (TPPII) (13-16). Free peptides generated by these endoproteolytic systems have not yet been isolated from cells, presumably because they are rapidly degraded. However, it is generally believed that the C terminus of most of these peptides is generated directly by proteasomal or TPPII cleavage (17). In contrast to the C terminus, the N terminus of the primary endoproteolytic products may be trimmed by aminopeptidases in the cytoplasm and endoplasmic reticulum (18-21). Peptides associated with gp96 should provide insights into the composition of the intracellular peptide pool and into the specificity of cleavage of polypeptides by proteasomes and TPPII inside the cells. To clarify the nature of gp96-bound peptides and to answer the above questions, we isolated peptides from gp96 of the cutaneous T cell lymphoma tumor cell line MyLa, determined their molecular masses, assessed their physicochemical properties, and sequenced a number of these isolates to identify their origin.

EXPERIMENTAL PROCEDURES
Cell Cultures-gp96 and MHC class I molecules were isolated from the HLA-A1- and HLA-B8-expressing cutaneous T cell lymphoma cell line MyLa (22). The cells were grown in roller flasks in Dulbecco's modified Eagle's medium supplemented with 5% heat-inactivated fetal bovine serum, 5% newborn calf serum, and 100 units of interleukin-2/ml. 80 g of cells (~8 × 10^10 cells) were used for the isolation of MHC class I molecules, and 120 g (1.2 × 10^11 cells) were used for gp96.
Isolation of Peptide from MHC Class I Molecules-MHC class I molecules were isolated as described in detail elsewhere (23). Briefly, the membrane proteins were solubilized with 20 mM Tris buffer, pH 7.4, containing 0.3% CHAPS, 0.2% Nonidet P-40, 145 mM NaCl, 1 mM EDTA, and 0.1 mM Pefabloc as proteinase inhibitors. The cell debris was removed by ultracentrifugation (1 h, 100,000 × g), and the solubilisate was precleared by passage through an immune affinity column with the monoclonal antibody 19-178, which has an irrelevant specificity. The MHC class I molecules were then adsorbed to an affinity chromatography column with the monoclonal pan-HLA class I-specific antibody w6/32. Both antibodies were coupled to CH Sepharose 4B (Amersham Biosciences) via N-hydroxysuccinimide. After adsorption of the solubilized and precleared proteins, the immune affinity column was washed successively, first with 20 mM Tris, 145 mM NaCl, pH 7.4 (TBS), second with TBS with 0.3% CHAPS, third with TBS buffer, fourth with TBS with 0.3% β-octylglycoside, fifth with TBS buffer, and sixth with ultrapure water. The peptides were then eluted directly from the column with 0.7% trifluoroacetic acid in ultrapure water, dried by lyophilization, redissolved in 1 ml of 5% acetonitrile with 0.1% trifluoroacetic acid, and separated from high molecular mass materials by ultrafiltration (3-kDa molecular mass cut-off; Centricon 3; Millipore).
Isolation of Peptide from gp96-For the isolation of gp96, the cells were Dounce homogenized in hypotonic NaHCO3 buffer (30 mM, pH 7.1) with 0.1 mM Pefabloc as protease inhibitor. The homogenate was cleared of cell debris by ultracentrifugation for 1 h at 100,000 × g, and the supernatants were applied to a concanavalin A affinity Sepharose column (Amersham Biosciences). The column was washed with homogenization buffer, and gp96 was eluted with the same buffer containing 10% α-methyl-mannopyranoside. The elution buffer was exchanged to homogenization buffer by ultrafiltration (Centriprep 30 or Centricon 30; Millipore). The concentrated eluate was collected, desalted by centrifugation through a spin column with Sephadex G10 (Amersham Biosciences), and adjusted to 2% trifluoroacetic acid. After 30 min at 4°C the peptides were separated from the high molecular weight components by ultrafiltration (Centricon 3), dried by lyophilization, and redissolved in 5% acetonitrile with 0.1% trifluoroacetic acid. The remaining gp96 was incubated for another 30 min with 40% acetonitrile with 0.1% trifluoroacetic acid and again subjected to ultrafiltration to isolate hydrophobic peptides. Both batches of peptides were analyzed separately by HPLC and mass spectrometry.
Quantification of the Proteins and Peptides-MHC class I molecules and gp96 were isolated from 10 g of MyLa cells, analyzed for purity by SDS-PAGE, and quantified by a BCA protein quantification assay (Pierce) according to the manufacturer's instructions using bovine serum albumin as a standard. The theoretical yield of MHC class I molecules was assessed by determining the average number of molecules/cell surface by flow cytometry after staining the cells with the phycoerythrin-labeled pan-MHC class I-specific monoclonal antibody w6/32 and comparing the mean fluorescence intensities with the mean fluorescence intensities obtained with phycoerythrin-labeled calibration beads (QuantiBRITE™; BD Biosciences). The number of MyLa cells/1 g of cell pellet was estimated to be 1 × 10^9. The peptides extracted from MHC class I molecules and gp96 were quantified with the CBQCA ATTO-TAG peptide quantification kit (Molecular Probes, Eugene, OR) according to the manufacturer's instructions with 10 mM borate buffer, pH 9.3, and a completely randomized nonapeptide library as standard.
HPLC Separation of the Isolated Peptides-The peptides from the above preparations were fractionated by HPLC using a C2/C18 reversed phase column (SMART HPLC with a RPC SC 2.1/10 column; Amersham Biosciences) and an acetonitrile gradient of 5-90% solvent B (solvent A: 0.1% trifluoroacetic acid; solvent B: 90% acetonitrile, 0.1% trifluoroacetic acid). 45 fractions of 200 µl were collected at a flow rate of 100 µl/min. 20-50 µl of these fractions were subjected to a second dimension reversed phase micro HPLC (C18, 300-µm inner diameter, 5-cm length; LC Packings, Amsterdam, The Netherlands). The peptides were eluted with a gradient of 5-90% solvent B as above at 2 µl/min and collected in 45 fractions of 0.5 µl directly onto MALDI targets.
Mass Spectrometric Analysis and Sequencing of the Isolated Peptides-0.5 µl of every HPLC fraction was analyzed for the presence and the masses of peptides by MALDI-TOF mass spectrometry using 4-hydroxy-α-cyanocinnamic acid as matrix. The peptides were sequenced by MALDI-TOF post-source decay (24). The amino acid sequences were deduced from the post-source decay fragmentation spectra with the de novo sequencing software Sequit (25) (www.sequit.org) and the data base search engine MASCOT (26, 27) (www.matrix-science.com).
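The collection times implied by these fraction volumes and flow rates follow from simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that fractions are collected back to back without dead time, which the text does not state.

```python
def fraction_timing(n_fractions, fraction_volume_ul, flow_rate_ul_per_min):
    """Return (minutes per fraction, total collection time in minutes).

    Assumes continuous collection with no dead volume between fractions.
    """
    minutes_per_fraction = fraction_volume_ul / flow_rate_ul_per_min
    return minutes_per_fraction, n_fractions * minutes_per_fraction


# First dimension: 45 fractions of 200 µl at 100 µl/min.
first_dimension = fraction_timing(45, 200.0, 100.0)

# Second dimension: 45 fractions of 0.5 µl at 2 µl/min.
second_dimension = fraction_timing(45, 0.5, 2.0)

print(first_dimension)   # 2 min per fraction, 90 min in total
print(second_dimension)  # 0.25 min (15 s) per fraction, 11.25 min in total
```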
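As an illustration of how a sequence assignment can be checked against a measured MALDI-TOF mass, the following sketch computes the monoisotopic mass of a peptide and of its singly protonated ion. It is not the procedure used by Sequit or MASCOT; the function names are hypothetical, and the residue masses are standard monoisotopic values rounded to five decimals. It assumes unmodified peptides observed as [M+H]+ ions.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide for the free N and C termini
PROTON = 1.00728   # mass added by single protonation


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mh_plus(sequence):
    """m/z of the singly protonated ion [M+H]+ (assumed ion species)."""
    return monoisotopic_mass(sequence) + PROTON


# YRPPELL is one of the heptapeptides named in this paper.
print(round(mh_plus("YRPPELL"), 2))  # about 887.5
```

Under these assumptions the heptapeptide YRPPELL gives an [M+H]+ value near 887.5.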

RESULTS
Peptides associated with gp96 are expected to be very heterogeneous with a low abundance of the individual peptide species. The peptides extracted from gp96 as well as those from the MHC class I molecules of the MyLa tumor cell line were therefore fractionated by reversed phase HPLC (Fig. 1A) to reduce their heterogeneity and to enrich the specific peptides in single HPLC fractions for subsequent analysis by MALDI-TOF mass spectrometry. Consistent with the supposed function of gp96 as an intermediate carrier of antigenic peptides, a highly complex mixture of peptides was found to be associated with gp96 covering a mass range of ~800-2,600 Da (Fig. 1, B and D), which corresponds to peptide lengths of ~7-24 amino acids. This size distribution covers a much wider range than what is typically found for peptides eluted from MHC class I molecules, as exemplified by the peptides isolated from the MHC class I molecules of MyLa cells, whose mass range of 800-1,500 Da is consistent with a length of MHC class I-bound peptides of 8-11 amino acids (Fig. 1, C and D). As judged by the acetonitrile concentrations required to elute these peptides from the HPLC reversed phase column, the hydrophobicity of the gp96-bound and of the MHC class I-bound peptides is comparable, although the MHC class I-associated peptides are more heterogeneous in their physicochemical properties.
For determination of the yields of gp96 and MHC class I molecules and peptides associated with these proteins, gp96 and MHC peptide complexes were isolated from 10 g of MyLa cells each. The average number of MHC class I molecules/cell surface was 30,000 as measured by flow cytometry with the phycoerythrin-labeled pan-MHC class I-specific monoclonal antibody w6/32 and phycoerythrin-labeled calibration beads. The theoretical yield of MHC molecules and peptides from the 10 g of MyLa cells would thus be 498 pmol. After acidic elution of the MHC peptide complexes from the immune affinity column and separation of protein and peptides by ultrafiltration, 400 pmol of MHC molecules and 101 pmol of peptides were recovered, which corresponds to 80% recovery of the MHC molecules and, relative to the recovered MHC molecules, 25% recovery of the peptides. Because MHC class I molecules are stable only with incorporated peptide, it is assumed that the occupancy with peptide is ~100% so that the recovery of peptide reflects the outcome of the isolation procedure rather than the original occupancy of the MHC molecules. For gp96 it is not possible to measure the number of molecules in the intact cell. We therefore cannot calculate a theoretical yield. From the 10 g of cells 525 µg of gp96 were isolated, which is 5469 pmol, ~14 times more than MHC class I molecules. From this amount of protein altogether 6.46 pmol of peptides were isolated, which corresponds to a yield of 0.1%. If we assume a recovery of peptide from gp96 comparable with that obtained from MHC class I molecules, the occupancy of gp96 would be 0.4%. Even taking into account the high excess of gp96 over MHC class I molecules, the amount of peptides bound to gp96 at steady state is only ~6% of what is bound to the MHC class I molecules.
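The yield and occupancy figures in this paragraph can be retraced with a few lines of arithmetic. The sketch below is a reader's check and not part of the original analysis; the variable and function names are hypothetical, and the molar mass of gp96 is taken as 96,000 g/mol on the basis of the stated 96 kDa.

```python
AVOGADRO = 6.022e23  # molecules per mole


def pmol_from_molecules(n_molecules):
    return n_molecules / AVOGADRO * 1e12


def pmol_from_micrograms(mass_ug, molar_mass_g_per_mol):
    return mass_ug * 1e-6 / molar_mass_g_per_mol * 1e12


# 10 g of cells at an estimated 1e9 cells per gram, 30,000 MHC class I per cell.
cells = 10 * 1e9
mhc_theoretical_pmol = pmol_from_molecules(30_000 * cells)   # about 498 pmol

mhc_recovered_pmol = 400.0
mhc_peptides_pmol = 101.0
gp96_recovered_pmol = pmol_from_micrograms(525.0, 96_000.0)  # about 5469 pmol
gp96_peptides_pmol = 6.46

mhc_recovery = mhc_recovered_pmol / mhc_theoretical_pmol     # about 0.80
peptide_recovery = mhc_peptides_pmol / mhc_recovered_pmol    # about 0.25
gp96_raw_yield = gp96_peptides_pmol / gp96_recovered_pmol    # about 0.0012
# Correct the raw yield for the peptide recovery observed with MHC class I.
gp96_occupancy = gp96_raw_yield / peptide_recovery
peptide_ratio = gp96_peptides_pmol / mhc_peptides_pmol       # about 0.064

print(f"gp96 raw yield: {gp96_raw_yield:.2%}")
print(f"gp96 occupancy, recovery-corrected: {gp96_occupancy:.2%}")
```

With unrounded intermediates the recovery-corrected occupancy comes out slightly above the 0.4% quoted in the text, which follows from dividing the rounded 0.1% yield by the rounded 25% recovery. Either way the occupancy remains well below 1%.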
To control for this far substoichiometric yield of peptides, we subjected the proteins actin, lysozyme, malate dehydrogenase, lactate dehydrogenase, and two different IgGs to the same acidic extraction procedure applied to gp96 but could not demonstrate any peptide binding.
The low abundance of individual gp96-associated peptides makes their characterization more challenging than the analysis of MHC class I-associated peptides. Nonetheless, using MALDI-TOF mass spectrometry with post-source decay fragmentation of the peptide and deduction of the sequences from the resulting fragments by de novo peptide sequencing or data base comparisons, we could determine the sequences of 14 of these peptides (Table I). These 14 peptides include samples from all but one of the HPLC fractions and thus represent the entire range of hydrophobicity of the gp96-associated peptides. Most of the peptides were obtained by extraction from gp96 with 2% trifluoroacetic acid. Only a few peptides were found in the 40% acetonitrile extracts. The sizes of the peptides were between 7 and 21 amino acids (887-2511 Da), which covers the entire size range of gp96-associated peptides detected. All of the peptides contain at least one of the aromatic amino acids phenylalanine or tyrosine, which has been suggested to be critical for binding of peptides to gp96 (28). The 14 peptides are derived from 12 different source proteins with different subcellular localization. Nine (cystatin A and B, calvasculin, Cdc2-related protein kinase 5, fus-like protein, 4-α-glucanotransferase, Hsp40, the mannose isomerase, and the ribonucleoprotein) are proteins of the cytoplasm or nucleus, one (cathepsin A) is lysosomal, and two (plexin-A2 and the aminopeptidase) are membrane proteins. For cathepsin A, two peptides were found that differ by one amino acid at the N terminus (Fig. 2) and elute in different HPLC fractions, the shorter peptide in fraction 8 and the longer, more hydrophobic peptide in fraction 10. Cystatin A was also represented by two peptides, cystatin A 78-98 and cystatin A 81-98, eluting in fractions 11 and 12, respectively. Here, the shorter peptide is reduced by three amino acids at the N terminus.
The structures of the two cathepsin A-derived peptides were determined by complete de novo sequencing (Fig. 2), and those of the longer cystatin A-derived peptides were determined by data base matches of the fragmentation profiles. No variations at the C terminus were found. Five peptides, the two variants of both cathepsin A and cystatin A as well as the cystatin B peptide, are derived from the C termini of the proteins, and the others are derived from internal sequence stretches.
To assess whether the gp96-associated peptides harbor MHC ligands, we compared their sequences with the sequences of peptides that were found bound to MHC class I molecules and are listed in the SYFPEITHI and JenPep data bases of T cell epitopes and MHC ligands (29-31). However, none of the gp96-associated peptides we identified matches or includes the sequences of the MHC class I ligands determined to date. Nonetheless, in principle, these peptides can be precursors of MHC class I ligands, as is apparent from a comparison of their sequences with the sequence motifs of MHC-binding peptides as listed in the SYFPEITHI data base. Because it is generally assumed that trimming of the precursors of the MHC class I ligands occurs only at the N terminus, we kept the C terminus of the peptides constant for these analyses and allowed for N-terminal variation. In the cases of the heptapeptides YRPPELL (Cdc2-like PK 265-271) and PVRQTFF (lysosomal mannosidase 635-641), we assumed that there could be longer precursors and extended their sequences at the C terminus according to the sequences of the parental proteins. For 8 of the 14 peptides, we were able to define potential MHC class I ligands (Table II). These potential epitopes would be presented by different MHC allomorphs. However, none of them would fit the epitope-binding motifs of the MHC class I molecules HLA-A1 and HLA-B8, which are expressed by the MyLa cells from which the gp96 was isolated. At least one of the source proteins of the gp96-associated peptides, cathepsin A, contains an independent known MHC class I ligand, which was found to be HLA-B*2705-restricted. In contrast to the lack of correlation with known MHC class I-restricted epitopes, one peptide, EKKYFAATQFEPLAAR of the leucyl/cystinyl aminopeptidase, covers part of the sequence of a peptide that was previously found to be restricted by the HLA class II molecule HLA-DPA1*0201/DPB1*0401 (32).
Comparing the peptides isolated from gp96 with those isolated from the MHC class I molecules of the MyLa cells, we could not find any correspondence. Table III lists seven MHC class I-associated peptides that were sequenced. These are mostly nonapeptides; one is a decapeptide, and two are octapeptides whose sequences correspond largely to the published epitope motifs for the HLA class I molecules expressed by MyLa, i.e. HLA-A1 and HLA-B8. None of these peptides was found on gp96. Moreover, comparing the masses and the hydrophobicity of the 71 peptides isolated from the MHC class I molecules and of the 216 peptides isolated from gp96 (Fig. 1), not a single match was found. The gp96-associated peptides, like those associated with MHC class I molecules, are products of protein degradation by the proteasome or the recently described TPPII endopeptidase system with possible subsequent N-terminal trimming by exopeptidases. The primary products of the cleavage by these systems inside cells have not been identified and analyzed yet, likely because of their low abundance and rapid degradation. The information on the specificity of the proteasomes available to date is derived from in vitro analyses of the cleavage of synthetic peptides or of the model protein elastase by proteasomes isolated from cells (33-37). Likewise, the information on the specificity of the TPPII system is deduced from in vitro experiments with purified enzyme complexes and a model polypeptide. All of these model substrates are processed independent of ubiquitin, which is in contrast to the majority of the proteins that are naturally cleaved inside cells. Chaperones such as gp96 bind peptides that are generated by these endopeptidase complexes and thereby protect them from complete degradation.
Thus, the gp96-associated peptides are close representations of their primary cleavage products, and the C termini of these peptides should reflect the specificity of endopeptidase cleavage in the cytosol. We therefore compared the sequences of the peptides that we had isolated from gp96 with the cleavage sites predicted by the three available prediction algorithms for proteasomal cleavage. These algorithms are based on experiments with proteasomes that were purified from cells and, as substrates, synthetic peptides (Fragpredict) (33,34) or human elastase as model protein (Paproc) (35,36) or on the C termini of MHC class I ligands (NetChop) (37) (Table IV). The C-terminal amino acids of the nine internal protein fragments associated with gp96 were phenylalanine (three times), asparagine (two times), and glutamic acid, arginine, proline, and leucine (one time each). It is widely accepted that the amino acid at the C terminus of the cleavage site does not, or not alone, determine the specificity of proteasome cleavage but that there are more remote structural determinants. For the evaluation of the cleavage specificity and the performance of prediction algorithms, we therefore took the entire sequences of the source proteins into consideration. Table IV compares the proteasomal cleavages predicted by the above three algorithms with the C termini of the gp96-associated peptides. The Fragpredict algorithm predicts five of the nine cleavage sites correctly, Paproc three, and NetChop five, one of these, however, only with low probability. The C termini of the two peptides derived from the Cdc2-like protein kinase and from the DnaJ homologue were consistently predicted by all three algorithms; the C termini of the calvasculin and of the mannosidase peptides were not identified at all. In the other four cases the results were inconsistent. Thus, none of the available algorithms can predict proteasome cleavage with sufficient accuracy.
These failures may be due to the input data sets used to develop and train the algorithms and could reflect differences between protein degradation in the cytosol and the cleavage specificity of the proteasomes in the in vitro experimental systems on which the Fragpredict and Paproc algorithms are based. As an alternative, it may be considered that the gp96-associated peptides are products of TPPII, which cleaves after lysine. However, none of these peptides has a C-terminal lysine. The weak performance of the NetChop algorithm, which is based on the C termini of the known MHC class I ligands, indicates that the MHC class I-associated peptide pools differ from the pools bound by gp96. It is interesting to note that in many cases the predictions by all three algorithms for proteasome cleavage are off by only one amino acid toward the C terminus, some also toward the N terminus, of the experimentally determined site, which suggests a systematic cause for the deviation of the predicted cleavage sites.
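The observation that every sequenced gp96-associated peptide contains phenylalanine or tyrosine can be expressed as a simple check. The sketch below is illustrative; the function name is hypothetical, and only sequences that are spelled out in the text of this paper are used (Table I itself is not reproduced here).

```python
AROMATIC = set("FY")  # phenylalanine and tyrosine, as named in the text


def has_aromatic(sequence):
    """True if the peptide contains at least one F or Y residue."""
    return any(aa in AROMATIC for aa in sequence)


# gp96-associated peptides whose sequences are quoted in the text.
gp96_peptides = ["YRPPELL", "PVRQTFF", "EKKYFAATQFEPLAAR"]

for peptide in gp96_peptides:
    print(peptide, has_aromatic(peptide))
```

Tryptophan is deliberately left out of the set because the text names only phenylalanine and tyrosine.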
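The search strategy described here, a fixed C terminus with variable N terminus, can be sketched as an enumeration of C-terminal substrings. This is a minimal illustration and not the authors' procedure: the function name is hypothetical, the length window of 8-11 residues is taken from the MHC class I peptide lengths stated in the Results, and the subsequent matching against allele-specific motifs is not shown.

```python
def n_terminally_trimmed(peptide, lengths=(8, 9, 10, 11)):
    """Candidate ligands that share the C terminus of `peptide`.

    Only N-terminal trimming is modeled, so every candidate is a suffix.
    Lengths longer than the peptide itself yield no candidate.
    """
    return [peptide[-n:] for n in lengths if n <= len(peptide)]


candidates = n_terminally_trimmed("EKKYFAATQFEPLAAR")
print(candidates)
# ['QFEPLAAR', 'TQFEPLAAR', 'ATQFEPLAAR', 'AATQFEPLAAR']

# A heptapeptide is shorter than the window and gives no candidates.
print(n_terminally_trimmed("YRPPELL"))  # []
```

The empty result for the heptapeptides illustrates why longer precursors had to be assumed for YRPPELL and PVRQTFF.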
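The two checks made in this passage, the absence of C-terminal lysine and the scoring of predictions that miss the observed site by one residue, can be sketched as follows. The C-terminal residues are those listed in the Results; the predicted cleavage positions are made-up numbers for illustration only and are not taken from Table IV, and the function name is hypothetical.

```python
from collections import Counter

# C-terminal residues of the nine internal gp96-associated fragments,
# as enumerated in the text: F x3, N x2, and E, R, P, L once each.
c_terminal_residues = "FFFNNERPL"
tally = Counter(c_terminal_residues)
has_c_terminal_lysine = "K" in tally  # TPPII is stated to cleave after lysine


def classify_prediction(observed_site, predicted_sites, tolerance=1):
    """Classify a prediction as 'exact', 'near' (within tolerance), or 'missed'."""
    if observed_site in predicted_sites:
        return "exact"
    if any(abs(p - observed_site) <= tolerance for p in predicted_sites):
        return "near"
    return "missed"


# Hypothetical predicted cleavage positions for a peptide ending at residue 271.
print(classify_prediction(271, {250, 271, 290}))  # exact
print(classify_prediction(271, {270, 300}))       # near
print(classify_prediction(271, {100, 200}))       # missed
print(tally["F"], tally["N"], has_c_terminal_lysine)
```

Counting "near" hits separately from exact hits is one way to make a systematic one-residue offset visible.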

DISCUSSION
The specific antigenicity of gp96-associated peptides had been deduced from immunization experiments and led to new concepts for the treatment of cancer with gp96 as a therapeutic vaccine. However, apart from a few examples where a defined tumor-associated T cell epitope was searched for and found associated with gp96, there was no information on these peptides (38-42). The data presented herein provide the first insights into the composition of peptides bound by gp96 and, thereby, of the intracellular peptide pool independent of the restrictive binding properties of the MHC class I molecules. Despite the reports that gp96 can confer antigen specificity and that peptides found associated with gp96 include potential T cell epitopes, it is not clear whether peptides have to bind to gp96 before they are passed on to and incorporated into newly synthesized MHC class I molecules. It is also conceivable that gp96 and MHC molecules compete for the endoplasmic peptide pool and bind different, maybe overlapping subsets of this pool. However, it is also possible that gp96 is not a peptide chaperone at all and that peptide association with gp96 is just an accidental side effect of the presence of both in the same compartment.

The sequences of the peptides that we isolated from gp96 and sequenced are in good agreement with recent publications on the binding of synthetic peptides to gp96 (28, 43-45). First, not only hydrophobic peptides but also hydrophilic peptides were eluted from gp96. Second, the aromatic amino acids tyrosine or phenylalanine are found in all peptide sequences. Third, the aromatic amino acids were found at different positions in the sequences, and no recurring peptide binding motif can be [...] (38), and the ovalbumin-derived SIINFEKL (39) all have at least one aromatic amino acid in their sequences. The only exception so far is the epitope IPGLPLSL of the mouse leukemia tumor antigen (40). The sequences of gp96-associated peptides provide information not only on their immunological potencies but also on the specificity of the antigen-processing machinery and, in particular, on the specificity of the endopeptidase systems that generate antigenic peptides from their source proteins. The three available algorithms for proteasomal cleavage are based on experimental cleavage of synthetic peptides or of elastase as a model protein and on the C termini of known MHC class I molecule-bound peptides (33-37). However, none of these algorithms could predict the C termini of all the gp96-associated peptides correctly. Also the known specificity of TPPII cannot account for the C termini of these peptides (15,16). It can be concluded, therefore, that either endopeptidases other than the proteasome and TPPII, with different specificities, are involved in antigen processing or the specificities of these two known proteolytic complexes inside the cells differ from those determined in the in vitro experiments with the purified enzymes. Moreover, there is no congruence of the gp96 and the MHC class I-associated peptide pools.
These differences between the peptide pools associated with these molecules could explain some of the discrepancies reported for attempts to use gp96 as a source for antigenic epitopes in vaccination experiments, where in some cases mice could be immunized against an antigen of the cell from which the gp96 was isolated but in other cases could not (3-6). The broad range of different peptides, their size distribution, the subcellular location of their source proteins, and the possibility of predicting potential MHC class I molecule ligands in the sequences of some of these peptides are consistent with the assumed function of gp96 as intermediate carrier of MHC class I-restricted T cell epitopes. On the other hand, the comparisons of the C termini and the apparent requirement for an aromatic amino acid in the sequences of gp96-associated peptides with the MHC class I allele-specific peptide motifs suggest that gp96 and MHC class I molecules bind peptides of different but partially overlapping subsets of the intracellular peptide pool. In contrast to the MHC molecules, however, only a minute fraction of gp96 is occupied with peptide. The specific nature of gp96-associated peptides together with the far substoichiometric occupancy of gp96 strongly suggests that gp96 is not a peptide chaperone involved in antigen processing.