ERAP1 enzyme-mediated trimming and structural analyses of MHC I–bound precursor peptides yield novel insights into antigen processing and presentation

Endoplasmic reticulum aminopeptidase 1 (ERAP1) and ERAP2 critically shape the major histocompatibility complex I (MHC I) immunopeptidome. The ERAPs remove N-terminal residues from antigenic precursor peptides and generate optimal-length peptides (i.e. 8–10-mers) to fit into the MHC class I groove. It is therefore intriguing that MHC class I molecules can present N-terminally extended peptides on the cell surface that can elicit CD8+ T-cell responses. This observation likely reflects gaps in our understanding of how antigens are processed by the ERAP enzymes. To better understand ERAPs' function in antigen processing, here we generated a nested set of N-terminally extended 10–20-mer peptides (RA)nAAKKKYCL covalently bound to the human leukocyte antigen (HLA)-B*0801. We used X-ray crystallography, thermostability assessments, and an ERAP1-trimming assay to characterize these complexes. The X-ray structures determined at 1.40–1.65 Å resolutions revealed that the residue extensions (RA)n unexpectedly protrude out of the A pocket of HLA-B*0801, whereas the AAKKKYCL core of all peptides adopts similar, bound conformations. HLA-B*0801 residue 62 was critical to open the A pocket. We also show that HLA-B*0801 and antigenic precursor peptides form stable complexes. Finally, ERAP1-mediated trimming of the MHC I–bound peptides required a minimal length of 14 amino acids. We propose a mechanistic model explaining how ERAP1-mediated trimming of MHC I–bound peptides in cells can generate peptides of canonical as well as noncanonical lengths that still serve as stable MHC I ligands. Our results provide a framework to better understand how the ERAP enzymes influence the MHC I immunopeptidome.

The immune recognition of surface MHC I/peptide complexes initiates activation of CD8+ T cells as a critical step in the elimination of pathogens. MHC I molecules generally bind peptides of 8–10 amino acid residues; this represents an optimal length for peptides to span the binding cleft in an elongated conformation and form stabilizing H-bond interactions with MHC residues within the A and F pockets (1, 2). Interestingly, there is increasing evidence from MS analysis that peptides with more than 11 amino acids can be eluted from human and mouse MHC I molecules (3–11). Moreover, long peptides can be presented on the surface of cells by MHC I molecules (4, 5, 7, 8) and can stimulate CD8+ T-cell responses (4, 5, 7, 8, 11–14). These can be long overlapping peptides that share much of their core sequences, nested sets of peptides that share the same core sequence and carry N- or C-terminal residue extensions, or single long peptides. These observations raise two important questions. 1) How do MHC I molecules present long peptides? 2) Have surface N-terminally extended peptides escaped trimming by the endoplasmic reticulum (ER) aminopeptidases (ERAPs)?
X-ray crystallography has revealed that peptides of 10 and 11 amino acids can fit into the MHC I groove by adopting conformations in which the middle residues zig-zag or bulge out of the groove, while still maintaining the peptide N and C termini bound within the A and F pockets, respectively (15). Peptides as long as 16-mers have been reported to adopt such highly bulged conformations (11–14, 16, 17). Structures of MHC I molecules have also shown that peptides of optimal lengths, such as 9- and 10-mers, can adopt unusual bound conformations in which a single "extra" terminal amino acid is positioned outside of the MHC I groove at the N- or C-terminal end (5, 18–23). Finally, several recent structures of MHC I molecules presenting long C-terminally extended peptides showed that the C-terminal residues exit out at the F pocket (9, 22, 24). Overall, crystallographic studies have informed us that peptides of various lengths can serve as MHC I ligands, often by adopting unconventional binding modes.
Antigenic peptides that are destined to become ligands of human MHC I molecules are generated first as long precursors by the cytosolic proteasome. These precursors are transported into the ER by the transporter associated with antigen processing (TAP), where they are N-terminally trimmed by ERAP1 and ERAP2 (25, 26) and undergo a proofreading process that assesses their ability to energetically stabilize MHC I molecules (27). We and others showed that the ER-resident protein tapasin (TPN) serves as a critical checkpoint in the generation of the MHC I immunopeptidome (28–31). Importantly, studies showed that loss of ERAP1 function in humans or mice (ERAAP) leads to alterations in the MHC I immunopeptidome (32, 33). These results established ERAP1 as yet another important protein that shapes the MHC I peptide repertoire, likely working in synergy with tapasin (34). Interestingly, evidence has been provided that ERAP1 and ERAP2 can form a heterodimer (ERAP1/ERAP2) with distinct functional properties relative to the individual aminopeptidases (33, 35). Notably, in previous studies, we showed that MHC I-bound N-terminally extended peptides are trimmed by the ERAP1/ERAP2 heterodimer to optimal lengths of 8- and 9-mers (36). These results extended other studies that support the view that both free and MHC I-bound precursors are substrates of the ERAP enzymes (37–40). Therefore, it is intriguing that long N-terminally extended peptides can be presented as surface ligands by MHC I molecules, even in cells with active ERAP function. It suggests that such peptides have escaped trimming by the ERAPs in the ER and have emerged still in their precursor forms.
In this study, we used the cell-surface presentation of N-terminally extended peptides by MHC I molecules as a unique opportunity to elucidate how these peptides bind into the groove and better understand the role of the ERAPs in generating the MHC I immunopeptidome. We determined the crystal structures and thermostability of a set of N-terminally extended peptides bound to HLA-B*0801, and monitored the trimming of these peptides by ERAP1 both in their free and MHC I-bound forms. Together, our results provide novel mechanistic insights into antigen processing that explain how both canonical and noncanonical length peptides can be generated by the ERAPs and stably presented by MHC I to CD8+ T cells.

N-terminally extended peptides
We designed a nested set of N-terminally extended peptides based on the 8-mer HIV-1 Gag immunodominant epitope GGKKKYKL (41). To eliminate possible issues with peptides dissociating from the HLA-B*0801 groove during trimming by ERAP1, we introduced a Lys-to-Cys mutation at P7 in AAKKKYKL and also introduced the complementary Glu76 → Cys mutation in HLA-B*0801, as we described previously (36). Based on AAKKKYCL, we produced the nested set of N-terminally extended 10-, 12-, 14-, and 20-mer peptides (RA)nAAKKKYCL that can be covalently bound to HLA-B*0801E76C. It is noteworthy that such long peptides would still bind if they were not covalently bound to the MHC I molecule (36). Finally, for the purpose of our crystallographic studies, the first N-terminal Ala residue extension was N-methylated (N-Me), generating (R(N-Me)A)(RA)n−1AAKKKYCL. All complexes were refolded in vitro as described under "Experimental procedures."
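The design of the nested set can be sketched in a few lines of Python. This is only an illustration of the naming scheme used in the text: the function names `nested_peptide` and `position_labels` are hypothetical, and the N-methyl modification and the disulfide link to the heavy chain are not represented.

```python
CORE = "AAKKKYCL"  # 8-mer core carrying the Lys-to-Cys substitution at P7


def nested_peptide(n_repeats: int, core: str = CORE) -> str:
    """Return the (RA)n-extended peptide for a given number of Arg-Ala repeats."""
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    return "RA" * n_repeats + core


def position_labels(peptide: str, core: str = CORE) -> list[str]:
    """Label core residues P1..P8 and extension residues P-1, P-2, ... outward."""
    n_ext = len(peptide) - len(core)
    extension = [f"P-{n_ext - i}" for i in range(n_ext)]
    core_positions = [f"P{i + 1}" for i in range(len(core))]
    return extension + core_positions


# n = 1, 2, 3, and 6 give the 10-, 12-, 14-, and 20-mers described above.
peptides = {len(nested_peptide(n)): nested_peptide(n) for n in (1, 2, 3, 6)}
for length, seq in peptides.items():
    print(f"{length}-mer  {seq}")
```

With this labeling, the residue immediately N-terminal of the core is P-1 (Ala) and the next one out is P-2 (Arg), which matches the position names used in the structural descriptions.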

(R(N-Me)A)AAKKKYCL protrudes out of the A pocket
To understand how an N-terminally extended 10-mer peptide is presented by HLA-B*0801, we determined the X-ray crystal structure of (R(N-Me)A)AAKKKYCL covalently bound to HLA-B*0801E76C at 1.65 Å (Table 1). The structure surprisingly shows that the peptide adopts an elongated conformation in which the AAKKKYCL core is bound into the groove, and the residue extensions P−1 (N-Me)Ala (one position N-terminal of P1) and P−2 Arg protrude out at the A pocket (Fig. 1A). The electron density was well-defined over the entire length of the peptide, including the extension P−1 (N-Me)Ala, although no electron density at an acceptable 1σ threshold could be discerned for the N-methyl group of P−1 (N-Me)Ala and P−2 Arg (Fig. 1B). The P−2 Arg residue was therefore omitted in our model. Comparisons with a previously reported canonical structure of GGKKKYKL bound to HLA-B*0801 (42) showed that the peptide backbones adopt nearly identical conformations, except for a small 0.71-Å shift in P1 Cα-atom positions (Fig. 1C). Remarkably, only minor structural changes in the binding groove were detected between the two structures (r.m.s. deviation of 0.46 Å, over 1–180 Cα positions), except for residue Arg62. In the canonical structure of GGKKKYKL, the side chain of Arg62 acts as a lid atop the peptide N terminus (Fig. 1D), whereas in our structure, this residue swings out and up, which "opens" the A pocket. In this "open" configuration, P−1 (N-Me)Ala and P−2 Arg can extend straight out of the groove. Thus, Arg62 appears to control the configuration of the A pocket between an "open" and "closed" form (see also the legend to Fig. 2C). A close-up view of the A pocket shows that the side-chain methyl group of P1 Ala is rotated and occupies the position of a canonical N-terminal group of bound peptide structures (Fig. 1E). Consequently, the main-chain nitrogen of P1 Ala adopts a position corresponding to a P1 side chain normally seen in MHC I structures (Fig. 1E). Similar P1 residue rotations within the A pocket have been reported previously (5, 18, 43). In this unusual configuration, the two peptide residue extensions follow an upward trajectory out of the groove. In our structure, because of the lack of a peptide N-terminal group, there are noticeable changes to the conventional network of H-bonds normally seen within the A pocket (1) (Fig. 1E): H-bonds to conserved Tyr7 and Tyr171 residues are absent, and a new H-bond is formed between the main-chain nitrogen of P1 Ala and Asn63. The main-chain carbonyl oxygen of P1 Ala also forms a key H-bond with Tyr159 (Fig. 1E).

Structures of a nested set of N-terminally extended peptides
To extend the above analysis, we determined the structures of the 12-, 14-, and 20-mer peptides (R(N-Me)A)(RA)n−1AAKKKYCL covalently bound to HLA-B*0801E76C (Table 1). The structures show that the 8-amino acid AAKKKYCL core of all peptides, even the very long 20-mer, zigzags in the groove in an almost identical manner to the 10-mer (Fig. 2A). There was, however, a noticeable upward shift of the backbone from P1 to P3 for the 12-, 14-, and 20-mers relative to the 10-mer (Fig. 2A): Cα-atom shifts of ~2.0 Å at P1 and ~0.44 Å at P3. The electron density for all peptides was well-defined over the AAKKKYCL core and P−1 Ala (Fig. S1, left panels). It is noteworthy that partial electron density for P−2 Arg was visible at 0.5σ for the 12-, 14-, and 20-mers (Fig. S1, right panels), which was not the case, however, for the 10-mer. Nonetheless, we have omitted all residues starting at P−2 Arg from our final models. The structures also showed that all P1 Ala residues had undergone a rotation within the A pocket, as exemplified in Fig. 2B with the 12-mer. Furthermore, in all structures, because P1 Ala residues are less deeply anchored within the A pocket relative to the 10-mer (Fig. 2B), this created a large cavity that was filled by a water molecule. This water molecule makes the conventional H-bonds to Tyr7 and Tyr171 (Fig. 2B). The other H-bonds are the same as those seen in the 10-mer structure (compare with Fig. 1E). Notably, the structures also showed that the most significant structural change in the groove, relative to the canonical structure of GGKKKYKL, involves Arg62. In the 12-, 14-, and 20-mer structures, Arg62 swings out and up from its canonical position, which opens the A pocket (Fig. 2C). Although the position of Arg62 in the longer peptides is different from that seen in the 10-mer, and small differences are seen between the 12-, 14-, and 20-mers, all orientations generated an open-ended groove. Thus, these structures further support the view that Arg62 plays a critical role in modulating the "open" and "closed" forms of the A pocket.

Role of P−5 (N-Me)Ala and P5 middle anchor residue in the MHC I-bound conformation of (R(N-Me)A)(RA)2AAKKKYCL
We wanted to verify that the N-methylation of the N-terminal Ala residue and the presence of the middle anchor residue at P5 were not controlling how the extensions (R(N-Me)A)(RA)n−1 protrude out of the A pocket. For this, we generated a 14-mer mutant in which P−5 is occupied by an Ala residue and the P5 anchor residue is mutated to Gly (i.e. (RA)3AAKKGYCL). The mutant peptide was refolded with HLA-B*0801E76C, and the structure of the complex was determined to 1.48 Å (Fig. 3 and Table 1). The structure shows that the AAKKGYCL core adopts an extended bound conformation with the (RA)3 extensions protruding out of the A pocket (Fig. 3). Comparisons with the structure of the 14-mer (R(N-Me)A)(RA)2AAKKKYCL (Fig. 3) show that both peptides overlap fairly well and have very similar structural features, including the P1 Ala rotation. Thus, we conclude that the bound conformations of the N-terminally extended peptides, with residues extending out of the groove, are not governed by the N-terminal (N-Me)Ala or middle anchor residues.

Stability of N-terminally extended peptides bound to HLA-B*0801E76C
To assess the stability of HLA-B*0801E76C complexes loaded with long peptides, we carried out a thermal denaturation assay based on differential scanning fluorometry. For this, we determined the stability of the 8-mer AAKKKYCL covalently bound to HLA-B*0801E76C and compared it with that of the 10-, 12-, 14-, and 20-mer peptides (RA)nAAKKKYCL (Fig. 4A). The melting temperature (Tm) of the 8-mer was 66°C, and only slightly lower Tm values of 64°C were determined for the 10-, 12-, 14-, and 20-mers (Fig. 4B). These results suggest that the N-terminally extended peptides have surprisingly similar stabilities compared with the 8-mer. This is consistent with the structures showing that the AAKKKYCL core adopts similar bound conformations within the groove (see Fig. 2A). Finally, we found that both AAKKKYKL and AAKKKYCL had the same Tm value of 66°C, indicating that the covalently bound nature of our peptides does not alter MHC I stability.

Processing of free and MHC I-bound peptides by ERAP1
We incubated the free and MHC I-bound 14- and 20-mer (RA)nAAKKKYCL with ERAP1 to assess the susceptibility of these peptides to processing. MS analysis of the free 14-mer (RA)3AAKKKYCL shows that after 20 min of trimming by ERAP1, the mixture contained a series of fragments as short as a 4-mer (Fig. 5). In contrast, and under identical conditions, ERAP1 trimmed the free 20-mer (RA)6AAKKKYCL to predominantly the 11-mer (Fig. 5). It is worth noting that ERAP1 was unable to trim a different free 20-mer, (RA)6AAKKKYKL. We attribute the inconsistencies in trimming free 20-mers to the random secondary structures that such long peptides likely adopt in solution.
Next, we examined the ability of ERAP1 to trim the MHC I-covalently bound 14-mer (RA)3AAKKKYCL. Results from MS analysis show that ERAP1 was unable to trim the bound 14-mer over a period of 6 h (Fig. 6A), even when additional ERAP1 was added and trimming was continued to 10 h. These results are in marked contrast to those obtained for the free 14-mer, which was readily trimmed by ERAP1 to small fragments (see Fig. 5). We then monitored the ability of ERAP1 to trim the MHC I-bound 20-mer (RA)6AAKKKYCL (Fig. 6B, t = 0). MS analysis of the mixture after 3.5 h showed that the major products were the covalently bound 14-, 15-, and 16-mer. After 6 h and with additional ERAP1, the remaining 20-mer was absent in the mixture, while the relative abundance of the other fragments was largely unchanged. The mixture composition remained the same even after 10 h. Overall, these results are consistent and show that trimming of the MHC I-covalently bound 20-mer generated the 14-mer as the shortest covalently linked peptide (i.e. the same molecular species that was resistant to trimming when used as a starting MHC I-bound peptide).
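The length dependence observed for the MHC I-bound peptides can be restated as a simple stop rule. The sketch below is a deterministic toy, not a kinetic model and not the analysis used in this study: the function name `trim_bound_peptide` is hypothetical, and the cutoff `min_substrate_len=15` is an assumption chosen only so that the 14-mer is the shortest product, as observed. It does not reproduce the relative abundances of the 14-, 15-, and 16-mers.

```python
def trim_bound_peptide(peptide: str, min_substrate_len: int = 15) -> list[str]:
    """Remove one N-terminal residue at a time while the current species is
    at least min_substrate_len residues long; return every species visited."""
    species = [peptide]
    while len(species[-1]) >= min_substrate_len:
        species.append(species[-1][1:])  # drop the N-terminal residue
    return species


bound_20mer = "RA" * 6 + "AAKKKYCL"
bound_14mer = "RA" * 3 + "AAKKKYCL"

# The 20-mer is shortened step by step and stops at the 14-mer.
print([len(s) for s in trim_bound_peptide(bound_20mer)])
# The 14-mer is already below the assumed cutoff and is left untouched.
print([len(s) for s in trim_bound_peptide(bound_14mer)])
```

Under this rule the end point of trimming the bound 20-mer is the same molecular species as the starting bound 14-mer, which is the consistency check described in the text.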

Model of MHC I-bound long peptides
The results obtained here help refine our model of how N-terminally extended peptides can bind into the MHC I groove (Fig. 7) (36). We propose that precursor peptides are anchored into the groove by a few C-terminal residues, with their remaining N-terminal residues extending into the solvent space. This binding mode is supported by data showing the importance of anchoring the C-terminal peptide residue within the F pocket (44, 45). Interestingly, the structures of TAPBPR (TAP-binding protein-related), a TPN-like protein, in complex with MHC I showed that a conformationally flexible "scoop" loop of TAPBPR is positioned into the F pocket region of the MHC I groove; this feature provides further evidence that the C terminus of the peptide is critical during the initial stage of binding (and editing) (46, 47). Finally, the view that the N terminus of MHC I-bound peptides can dissociate partially from the groove is supported by molecular dynamics simulations of MHC I/peptide complexes (40). Thus, taken together, we suggest that as precursor peptides bind to MHC I, from their C- to N-terminal ends, the N-terminal extensions can be concomitantly trimmed by the ERAP enzymes (see "Discussion").

Discussion
It is increasingly evident that the MHC I immunopeptidome is more diverse in peptide lengths than originally thought, and that peptides of up to 20 amino acids or more can be presented

EDITORS' PICK: ERAP1, MHC I, and precursor peptides
Our structures revealed that the N-terminally extended peptides bind into the MHC I groove by adopting noncanonical conformations with residue extensions protruding out of the A pocket. Our structures also showed that the 8-amino acid core sequences of all peptides overlap fairly well in the groove. Residue Arg62 was critical to open the A pocket in response to binding long peptides; whether or not other factors govern opening of the A pocket remains to be determined. Our analysis suggests that there is no apparent limit in peptide length that would prevent a given N-terminally extended peptide from being presented by HLA-B*0801. This is consistent with our thermostability data showing that N-terminally extended peptides form remarkably stable complexes. Overall, our study provides a structural and biochemical framework to understand how N-terminally extended peptides can be stably presented by MHC I on the surface of cells.
We compared our structures with the recent structures of the HIV-1 Gag immunodominant epitope TSTLQEQIGW (TW10) presented by HLA-B*5701 and HLA-B*5801 (5, 19). In these structures, the side chain of Ser occupied the canonical position of a P1 peptide N-terminal group within the A pocket, whereas its main-chain nitrogen occupied the canonical position of a P1 side chain; these structural features are comparable with our P1 Ala residues. Because of this P1 rotation, the N-terminal Thr of TW10 acted as an "extra" P−1 residue by exiting out of the A pocket, in a similar fashion to the P−1 residues of our peptides. Thus, overall, these three structures illustrate how peptides of different lengths bound to three different HLA-B molecules use the same molecular mechanism involving changes in the P1 peptide position to allow N-terminally extended peptides to exit out of the A pocket. Because both HLA-B*5701 and HLA-B*5801 have a Gly residue at 62, it was not possible to assess whether this residue was critical in allowing TW10 P−1 Thr to exit out of the groove. It is, however, noteworthy that residue Trp62 was shown to play a critical role in the extension of long peptides out of the HLA-F groove (48). Interestingly, an analysis of several HLA-A, -B, and -C structures presenting different 9-mer peptides showed that polymorphic residue 62 changes its configuration, depending on the size of the peptide side chain at P1. Thus, on that basis, we suggest that HLA-A and -C molecules have the potential to present long N-terminally extended peptides. Finally, recent structures of HLA-A*0201 presenting C-terminally extended peptides showed that these peptides bind with their residue extensions exiting at the F pocket (9, 24). In these structures, as seen in our study, the normally "closed" configuration of the F pocket became "open" from the movement of a single residue, namely Tyr84 or Lys146. Overall, we suggest that as more surface N- and C-terminally extended peptides are identified and characterized, it will be evident that such "unconventional" MHC I-bound conformations in which peptides exit at one end of the groove are more common than is currently realized.
We also showed that the MHC I-bound 20-mer peptide was trimmed by ERAP1 to the 14-mer as the shortest final peptide, and this result was corroborated by the inability of ERAP1 to trim the starting MHC I-bound 14-mer. Thus, there is a positive correlation between efficiency of cleavage by ERAP1 and length of peptides, where at least 5-6 amino acids have to extend out of the A pocket, based on peptide conformations seen in our structures, to be accessible by ERAP1. These experimental data support a modeling exercise based on the X-ray structure of ERAP1 that predicted that 6 peptide residues would need to protrude out of the groove to reach the zinc active site (49).

Figure 7. Shown is a model of antigen processing in which an N-terminally extended candidate peptide is bound into the MHC I groove by only a few C-terminal residues. As the peptide undergoes a dynamic binding and "sampling" into the groove (indicated by red arrows), from its C to N terminus, the N-terminal residue extensions are concurrently trimmed by the ERAPs. Inside the cells, the ERAP1 and ERAP2 enzymes likely exist in more than one molecular form, with each form shaping differently the MHC I immunopeptidome (see "Discussion"). ERAP1/ERAP2 indicates the heterodimer.
Taken together, we propose the following model for the trimming of MHC I-bound peptides by the ERAP enzymes (Fig. 7). We suggest that as a candidate precursor peptide "lands" into the MHC I groove by its C-terminal residues, the N-terminal residue extensions are solvent-exposed and thus susceptible to trimming by the ERAPs (Fig. 7). As the N-terminal extensions are being trimmed, the peptide undergoes a dynamic binding from its C to N terminus (50–52) and "sampling" (editing), possibly in the presence of MHC I accessory proteins (also see below). The editing of candidate peptides ensures that only peptides that meet a given threshold of stabilization energy become ligands of MHC I (29). We also suggest that whether the final MHC I complex will present an optimal-length peptide or a peptide that is still in its precursor form depends largely on which ERAP molecular species is engaged in processing the partially bound precursor (Fig. 7). Based on our results, ERAP1 is more likely to yield peptides of noncanonical lengths, such as nested sets of N-terminally extended peptides that share a common core of 8 or 9 residues stably anchored into the groove, as shown in our structures. This view takes into consideration the peptide length dependence exhibited by ERAP1 that prevents removal of all residue extensions. On the other hand, as we showed previously (36), the ERAP1/ERAP2 heterodimer is more likely to generate peptides of canonical lengths (see also below). In this scenario, the relative rate of peptide binding/"sampling" versus the rate of peptide trimming by the ERAPs is also likely to influence the final lengths of the bound peptides.
The recently determined cryo-EM structure of the peptide-loading complex (PLC) showed for the first time that TPN, calreticulin, ERp57, and MHC I are arranged in a pseudosymmetric manner around the transporter TAP (53). In this macromolecular organization, the N terminus of the MHC I groove is positioned away from the center of the PLC (i.e. at the outer surface of the assembly). Based on this, it seems sterically possible for the ERAP aminopeptidases to approach the PLC and access precursor peptides that are partially bound into MHC I molecules, as depicted in Fig. 7. Furthermore, from a kinetics perspective, it would be more efficient if the transported precursors that bind into PLC-engaged MHC I molecules are trimmed by the ERAPs while they are still bound in the editing module, rather than for the free precursors to diffuse away from the PLC for a chance encounter with an aminopeptidase. Thus, the PLC structure is consistent with the trimming of precursor peptides that are partially bound into PLC-engaged MHC I molecules.
To date, the intermolecular relationship between ERAP1 and ERAP2 is undefined. Hence, it is unclear how ERAP1 working with ERAP2, in contrast to ERAP1 alone, can yield MHC I-bound peptides of canonical lengths. This could be due to ERAP2 altering the structure/function of ERAP1, a molecular interplay between ERAP2 and MHC I that enhances exposure of peptide residue extensions to ERAP1, or some other reasons to be determined. The functional interdependency of ERAP1 and ERAP2, and whether the enzymes are only transiently active as a heterodimer or not, is also unclear. It is also undetermined whether ERAP2 trims MHC I-bound peptides (Fig. 7). Given that the expression levels of ERAP1 and ERAP2 vary with cell subsets (26), and not all individuals even express ERAP2 (54), these are important questions to address in future studies.
In summary, our study has provided new basic knowledge to understand how the processing of MHC I-bound precursor peptides by different molecular forms of the ERAP enzymes generates final peptides of both canonical and noncanonical lengths that contribute to the natural length distribution of the MHC I immunopeptidome. We also provided a structural and biochemical basis to explain how long peptides form stable MHC I complexes and thus can be stably presented on the cell surface. As we more fully appreciate the cell-surface presentation of unusually long MHC I peptides, a next goal is to develop a better understanding of how such structures are recognized by CD8+ T cells.

Synthetic peptides
Peptides were synthesized by the solid-phase methodology on a Symphony synthesizer (Protein Technologies Inc.) and purified by reverse-phase HPLC on a C18 Agilent column. Stock solutions in DMSO (~10 mg/ml) were stored at −80°C.

Refolding of HLA-B*0801E76C complexes
We examined the crystal structure of HLA-B*0801/GGRKKYKL (PDB code 1AGB) to identify a residue in the HLA-B*0801 heavy chain (Glu76) that is geometrically well-positioned to form a disulfide bond linkage with a peptide residue side chain, upon mutation to cysteine, as we described previously (36). The HLA-B*0801E76C heavy chain mutant was generated as described previously (36). HLA-B*0801E76C complexes were reconstituted from urea-solubilized inclusion bodies of HLA-B*0801E76C heavy chain (1 μM) and β2-microglobulin (2 μM) with a synthetic peptide (10 μM) in an oxidative refolding buffer (55). The crude refolding mixtures of HLA-B*0801E76C complexes were purified on a Superdex 200 gel-filtration column by FPLC. Stock solutions of purified complexes (10–30 mg/ml) in 20 mM Tris-HCl, pH 7.5, 150 mM NaCl were kept at −80°C.

Baculovirus expression system for ERAP1
The plasmid of full-length human ERAP1 cloned into the pFastBac vector containing a tobacco etch virus (TEV) protease cleavable C-terminal 10× histidine tag (TEV-His10-FLAG) was a gift from Addgene (Addgene plasmid no. 39174; RRID:Addgene_39174). The generation of recombinant baculovirus for the expression of recombinant ERAP1 was carried out using the Bac-to-Bac baculovirus expression system (Invitrogen) as recommended by the manufacturer.

Expression and purification of ERAP1
High Five insect cells were cultured at 27°C in serum-free Express Five medium (Gibco) supplemented with 0.75% fetal bovine serum (Gibco). Infection of High Five cells with recombinant baculovirus was carried out in 2-liter Delong flasks at a cell density of 1.8–2 × 10⁶ cells/ml, followed by gentle shaking at 27°C for 70 h. The culture medium (3 liters) was centrifuged (6500 × g, 25 min, 4°C), and the supernatants were supplemented with 10% glycerol followed by concentration to 300 ml (10-fold) using a Prep/scale-TFF cartridge (Millipore, Burlington, MA). The concentrated supernatant was dialyzed overnight in 50 mM Tris-HCl, pH 8.0, 10 mM imidazole, 10% glycerol, and 300 mM NaCl. The supernatant was then loaded onto a nickel-nitrilotriacetic acid column and washed several times with 50 mM Tris-HCl, pH 8.0, 30 mM imidazole, 300 mM NaCl. The protein was eluted with 3–5 ml of 50 mM Tris-HCl, pH 8.0, 250 mM imidazole, 10% glycerol, 300 mM NaCl. The eluate was immediately supplemented with 1 mM DTT, followed by purification on a Superdex 200 gel filtration column by FPLC. The purified protein in 20 mM Tris-HCl, pH 7.5, 10% glycerol, 150 mM NaCl, 1 mM DTT was kept at −80°C.

Crystallization
The initial crystallization condition for HLA-B*0801E76C/(R(N-Me)A)AAKKKYCL (~10 mg/ml) was identified using the Crystal Screen I (Hampton Research, Riverside, CA) as solution #15, 0.2 M ammonium sulfate, 0.1 M sodium cacodylate trihydrate, pH 6.5, 30% (w/v) PEG 8000, via the hanging-drop vapor diffusion method at room temperature. Initial crystals were optimized using different pH values (4.5–7.0) and different PEGs (10–30%; 6000–8000). A seeding solution in solution #15 was generated from these optimized crystals. Crystals used for data collection were grown by mixing 2 μl of ~10 mg/ml protein solution with 2 μl of 0.2 M ammonium sulfate, 18% PEG 4000, 0.1 M sodium cacodylate, pH 5.7, and 0.5 μl of seeding solution. Similar crystallization conditions were used to collect data for complexes involving the 12-, 14-, and 20-mer peptides.

Data collection, structural determination, and refinement
X-ray diffraction data sets were collected with a MAR-225 CCD detector at the LS-CAT beamline 21-ID-F (or 21-ID-G) of the Advanced Photon Source (Argonne National Laboratory, Argonne, IL). Data were integrated and scaled with the HKL-2000 program package (56) or XDS (57). Details of data processing are indicated in Table 1. The structures of all complexes were solved by molecular replacement using Phaser (58); the initial search model was HLA-B*0801/GGRKKYKL (PDB code 1AGB). Structure refinement of all models was carried out in Phenix (or Refmac in CCP4) (59–61) and manual building with COOT (62). Final refinement statistics are summarized in Table 1. The atomic coordinates of all structures have been deposited in the Protein Data Bank with the following accession codes: 10-mer (6P2S), 12-mer (6P23), 14-mer (6P27), 20-mer (6P2C), and 14-mer mutant (6P2F).

Thermal denaturation assay
A thermal shift assay was performed using an ABI ViiA7 RT-PCR instrument (Life Technologies, Inc., Carlsbad, CA). Reaction mixtures (total volume of 21 μl) consisted of 7 μl of complex (final concentration of 2 μM), 7 μl of 10× SYPRO orange dye (5000×, Thermo Fisher Scientific, Waltham, MA), and 7 μl of buffer (50 mM HEPES, pH 7.2, 150 mM NaCl). Each complex was analyzed in quadruplicate. A temperature gradient from 25 to 85°C with continuous increment of 0.06°C/s was used to generate the denaturation curves. The averaged denaturation curves were plotted as fluorescence intensity versus temperature. The minimum point of the first derivative of each curve provided the melting temperature (Tm).
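The derivative-based Tm readout can be illustrated with a short, self-contained sketch. It is not the instrument software's algorithm. It assumes the derivative is expressed as −dF/dT, so that an unfolding transition with rising fluorescence appears as a minimum; the function name `melting_temperature`, the `negate` flag, and the synthetic sigmoid curve are all hypothetical and used only for demonstration.

```python
import math


def melting_temperature(temps, fluorescence, negate=True):
    """Estimate Tm as the temperature at the minimum of the first derivative.

    The derivative is computed by central differences. With negate=True the
    quantity minimized is -dF/dT (an assumed sign convention), so a rising
    unfolding transition shows up as a minimum.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    sign = -1.0 if negate else 1.0
    derivative = []
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        derivative.append((sign * slope, temps[i]))
    # Temperature at which the (signed) derivative is smallest.
    return min(derivative, key=lambda pair: pair[0])[1]


# Synthetic two-state melt curve centered at 66 degrees C (illustrative data only).
temps = [25 + 0.5 * i for i in range(121)]  # 25 to 85 degrees C
signal = [1.0 / (1.0 + math.exp(-(t - 66.0) / 1.5)) for t in temps]
print(melting_temperature(temps, signal))
```

In practice, replicate curves would be averaged before differentiation, as described above, and some smoothing is usually needed because numerical derivatives amplify noise.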

Peptide trimming assay
The trimming of free and MHC I-bound peptides (~12 μg) was carried out by incubating with ERAP1 (~0.4 or ~34 μg, respectively) in 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 100 μM ZnCl2, supplemented with 1 mM DTT (omitted for trimming MHC I-bound peptides), at 37°C (total volume was 45 μl). At various times, the assay mixtures were spun down for 3 min in a microcentrifuge (14,000 × g), and aliquots (10 μl) were taken from the supernatants, followed by quenching with formic acid. The aliquots were kept frozen until MS analyses. In some experiments with MHC I-bound precursors, additional ERAP1 (1–17 μg) was added to the reaction mixtures after each aliquot. Each peptide was assayed 2–4 times, using different batches of ERAP1. The samples were analyzed by electrospray ionization and MALDI MS using an Agilent AJS-ESI QTOF 6545 system at the MS Core within the UIC Research Resources Center.
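When reading MS data from a trimming time course, it helps to know the expected masses of the successive N-terminally trimmed species. The sketch below computes neutral monoisotopic masses for such a ladder using standard residue masses. It is an illustration under stated assumptions rather than the analysis pipeline used here: it treats the peptide as free and unmodified (no N-methyl group, no disulfide link to the heavy chain), covers only the residues that occur in these peptides, and the function names are hypothetical. Observed m/z values also depend on charge state and adducts.

```python
# Standard monoisotopic residue masses (Da) for the residues used in this study.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "C": 103.00919,
    "L": 113.08406,
    "K": 128.09496,
    "R": 156.10111,
    "Y": 163.06333,
}
WATER = 18.01056  # one water added for the free N and C termini


def monoisotopic_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[res] for res in sequence) + WATER


def trimming_ladder(sequence: str, shortest: int = 4):
    """List (length, sequence, mass) for each N-terminally trimmed species."""
    return [
        (len(sequence) - i, sequence[i:], monoisotopic_mass(sequence[i:]))
        for i in range(len(sequence) - shortest + 1)
    ]


for length, seq, mass in trimming_ladder("RARARAAAKKKYCL"):
    print(f"{length:2d}-mer  {seq:<14s}  {mass:10.4f}")
```

Adjacent ladder members differ by exactly one residue mass, so the spacing between peaks identifies which residue was removed at each step.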