Unraveling the structural basis for the unusually rich association of human leukocyte antigen DQ2.5 with class-II-associated invariant chain peptides

Human leukocyte antigen (HLA)-DQ2.5 (DQA1*05/DQB1*02) is a class-II major histocompatibility complex protein associated with both type 1 diabetes and celiac disease. One unusual feature of DQ2.5 is its high class-II-associated invariant chain peptide (CLIP) content. Moreover, HLA-DQ2.5 preferentially binds the non-canonical CLIP2 over the canonical CLIP1. Understanding the structural basis of these unusual CLIP association characteristics requires the structures of the HLA-DQ2.5·CLIP complexes. To this end, we determined the X-ray crystal structures of the HLA-DQ2.5·CLIP1 and HLA-DQ2.5·CLIP2 complexes at 2.73 and 2.20 Å, respectively. We found that HLA-DQ2.5 has an unusually large P4 pocket and a positively charged peptide-binding groove that together promote preferential binding of CLIP2 over CLIP1. An α9-α22-α24-α31-β86-β90 hydrogen bond network located at the bottom of the peptide-binding groove, spanning from the P1 to P4 pockets, renders the residues in this region relatively immobile. This hydrogen bond network, along with a deletion mutation at α53, may lead to HLA-DM insensitivity in HLA-DQ2.5. A molecular dynamics simulation experiment reported here and recent biochemical studies by others support this hypothesis. The diminished HLA-DM sensitivity is the likely reason for the CLIP-rich phenotype of HLA-DQ2.5.

Class-II major histocompatibility complex (MHCII) proteins present foreign peptides to T cell receptors of CD4+ T cells (1). The membrane-associated MHCII proteins consist of one α-chain and one β-chain whose interface forms the peptide-binding groove. Humans express three MHCII isotypes, HLA-DR (DR), HLA-DP (DP), and HLA-DQ (DQ), all of which are encoded on chromosome 6. Newly synthesized MHCII proteins associate with a chaperone protein called the invariant chain (Ii) and form a nonameric complex (α3β3Ii3) in the endoplasmic reticulum (2). This complex formation prevents indiscriminate peptide loading onto the nascent MHCII and targets the nascent MHCII to the endosome for further processing (1)(2)(3). Once in the endosome, the MHCII-bound Ii is progressively proteolyzed until only a short fragment called class-II-associated invariant chain peptide (CLIP) remains attached to the MHCII peptide-binding groove (4). Subsequently, CLIP is catalytically released by HLA-DM (DM) and replaced with exogenous peptides for CD4+ T cell examination after transport to the cell surface (5). In addition to CLIP removal, DM also carries out peptide editing by catalyzing the release of low-affinity peptides (5,6). Currently, three MHC-binding regions have been identified in Ii: the canonical CLIP1 (residues 83-101), non-canonical CLIP2 (residues 92-107), and non-canonical CLIP3 (residues 98-111) (Fig. 1). Most mouse and human MHCIIs associate exclusively with CLIP1 (5). So far only DQ2.2, DQ2.5, DQ7.5, and DQ8 have been shown to bind both CLIP1 and CLIP2 (7)(8)(9). DQ7.5 binds CLIP1, CLIP2, and CLIP3 (9). Interestingly, all human MHCII alleles that bind CLIP2 or CLIP3 are associated with one or more autoimmune diseases: celiac disease (DQ2.2, DQ2.5, DQ7.5, and DQ8) (10-12) and type 1 diabetes (DQ2.5 and DQ8) (11,12). DQ2.5 is associated with celiac disease, an autoimmune-like disorder caused by a harmful immune response to ingested wheat gluten and similar proteins from rye and barley (13). Approximately 95% of celiac disease patients express DQ2.5 that is encoded by the DQA1*05 and DQB1*02 genes (10). These alleles are found on the DR3-DQ2 haplotype (cis configuration) and in the heterozygous combination of DR5-DQ7/DR7-DQ2 haplotypes (trans configuration). The gluten-specific CD4+ T cells of celiac disease patients recognize a diverse set of gluten epitopes when they are presented in the context of DQ2.5 but not in the context of other MHCII molecules, and preferential binding of deamidated gluten peptides appears to be the basis for the association of celiac disease with this HLA molecule (15)(16)(17)(18). One unusual phenotype of DQ2.5 is its high CLIP content. In DQ2.5-expressing B lymphoblastoid cells, CLIP1 and CLIP2 combined account for up to 53% of endogenous displayed peptides (7)(8)(9)19). Generally, CLIP accounts for only about 10% of MHCII-displayed peptides (20). Moreover, DQ2.5 preferentially binds the non-canonical CLIP2 over the canonical CLIP1 (7,8). The CLIP-rich phenotype of DQ2.5 was explained by MHCII·CLIP being poor substrates for DM (8,21). To better understand the structural basis of the unusual CLIP association characteristics of DQ2.5, we have determined the DQ2.5·CLIP1 complex and DQ2.5·CLIP2 complex crystal structures.

This work was supported in part by the Faculty Science and Technology Acquisition and Retention Program of University of Texas System and institutional funding from The University of Texas at El Paso (to C.-Y. K.). The work in the laboratory of L. M. S. was supported by the Research Council of Norway through its Centres of Excellence funding scheme, Project 179573/V40, as well as by the South-Eastern Norway Regional Health Authority. The authors declare that they have no conflicts of interest with the contents of this article. The atomic coordinates and structure factors (codes 5KSU and 5KSV) have been deposited in the Protein Data Bank (http://wwpdb.org/). 1 Both authors contributed equally to this work. 2 Recipient of the Singapore International Graduate Award from the Agency for Science, Technology and Research, Singapore. 3 Supported by a senior fellowship of the Wellcome Trust-Department of Biotechnology India alliance. 4 To whom correspondence may be addressed: Dept

Structural basis for the CLIP-rich phenotype of DQ2.5
DQ2.5-expressing cells have an unusually high CLIP content (up to 53%; CLIP1 and CLIP2 combined) (7)(8)(9)19). One possible explanation for this is that DQ2.5 binds CLIP with higher affinity than other MHCIIs do. However, the available structural data do not support this notion. The numbers of direct hydrogen bonds formed between CLIP1 (P−1 to P9 only) and DQ2.5, DR1, DR3, and I-Ab are 11, 13, 17, and 13, respectively; that is, DQ2.5 forms the fewest direct hydrogen bonds with CLIP1 of the four molecules compared. Therefore, we propose that the CLIP-rich phenotype of DQ2.5 arises from an impaired interaction between DQ2.5 and the catalytic DM whose function is to displace the MHC-bound CLIP peptide. Much of the current structural and mechanistic understanding of MHCII-DM interaction is derived from the DR1·HA·DM crystal structure (Protein Data Bank code 4FQX) (31). We investigated whether DQ2.5 has all the structural elements to facilitate the DM interaction that is observed for DR1·DM. To do this, we built a homology model of DQ2.5·CLIP1·DM. We first examined the electrostatic complementarity of the contact surface areas shared by DQ2.5 and DM (Fig. 6). According to our model, two regions in DQ2.5 make direct contact with DM. The first region is located adjacent to the P1 pocket in the α1 domain, and the second region is located near the transmembrane segment in the α2 domain. DQ2.5 has better charge complementarity to DM than does DR1, and therefore we can rule out surface electrostatic charge distribution as the source of impaired DQ2.5-DM interaction.

Next, we examined whether DQ2.5 is able to undergo the same set of conformational changes that DR1 undergoes upon DM binding. In DR1, Phe-α51 has been identified as a key DM-binding residue (32,33). When DR1 binds to DM, the α51-55 loop of DR1 transforms into an α-helix, which causes the side chain of Phe-α51 to move 13 Å from its initial solvent-exposed position to the P1 pocket cavity, where it forms a hydrophobic cluster with Phe-α24, Ile-α31, Phe-α32, Phe-α48, and Phe-β89 (Fig. 7) (31). This hydrophobic interaction is thought to stabilize the otherwise vacant and unstable P1 pocket. DQ2.5 contains a deletion mutation at α53. DM sensitivity of DQ2.5 was found to be partially restored upon insertion of a Gly at this position (34). Because of the deletion at α53 in DQ2.5, Phe-α51 is located at a position that is inaccessible to DM. This unconventional location of Phe-α51 may compromise the DQ2.5-DM interaction. Furthermore, we suggest that the DM insensitivity of DQ2.5 is due to the presence of an extensive hydrogen bond network (involving Tyr-α9, Tyr-α22, His-α24, Gln-α31, Glu-β86, Thr-β90, and a buried water molecule) that spans the P1 to the P4 pockets of DQ2.5 (Fig. 8a). We carried out a 50-ns molecular dynamics (MD) simulation of the DQ2.5·CLIP1 complex to assess the stability of this hydrogen bond network. MD trajectories show that all hydrogen bonds in this network, with the exception of the peripheral Glu-β86 Oε1-Thr-β90 Oγ, are stable (Fig. 9). Furthermore, we observed that Phe-α51 does not enter the P1 pocket during the course of the DQ2.5·

Hydrogen bond between the P1 main chain nitrogen of CLIP1 and the α52 carbonyl group of DQ2.5
All crystal structures of MHCII in complex with a peptide whose P1 residue is not Pro have a hydrogen bond between the amide nitrogen of the P1 residue and the main chain carbonyl group of MHCII α53. Interestingly, DQ2.5 has a deletion mutation at α53, and as gluten peptides binding to DQ2.5 frequently have Pro at P1, it was suggested that DQ2.5 is unable to form this hydrogen bond (34). The significance of this peptide main chain hydrogen bond has been assessed by comparing binding of peptides N-methylated at the P1 position with unmodified peptides. Whereas such substitution gave decreased affinity for peptide binding to DR1, no effect was seen for DQ2.5 (35,36). Interestingly, our DQ2.5·CLIP1 structure shows that there is indeed a hydrogen bond between the P1 main chain nitrogen of CLIP1 and the α52 carbonyl group of DQ2.5. The gluten-derived DQ2.5-gliadin-α1a T-cell epitope (LQPFPQPELPY where the underlined residue is P1) binds to DQ2.5 with 2-fold higher affinity than the analog peptide containing norvaline (Nva) at P1 (~25 µM) (37). Nva is a non-proteinogenic α-amino acid that is isosteric to Pro but has a primary amine group that is able to participate in hydrogen bonding. Therefore, DQ2.5-gliadin-α1a must have an overall energetic advantage over the Nva-substituted analog peptide for binding to DQ2.5 despite having a hydrogen bond deficiency at P1. We propose that DQ2.5-gliadin-α1a as well as other peptides containing Pro at P1 has an entropic advantage that compensates for the lost enthalpy associated with the P1 hydrogen bond.
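To put the 2-fold affinity difference in energetic terms, the following minimal sketch converts an affinity ratio into a binding free energy difference using ΔΔG = −RT ln(ratio). It assumes that the reported affinities can be treated as equilibrium dissociation constants and that the temperature is 298.15 K; neither is stated in the text, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_affinity_ratio(ratio, temperature_k=298.15):
    """Free energy difference (kcal/mol) for a peptide that binds `ratio`-fold
    more tightly than a reference peptide.

    Assumption: the measured affinities behave like equilibrium dissociation
    constants, so ratio = Kd(reference) / Kd(peptide).
    A negative value means the tighter binder is favored.
    """
    if ratio <= 0:
        raise ValueError("affinity ratio must be positive")
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(ratio)


# Pro-containing epitope versus its Nva analog: 2-fold higher affinity
ddg = ddg_from_affinity_ratio(2.0)
print(f"ddG = {ddg:.2f} kcal/mol")
```

Under these assumptions the net advantage of the Pro-containing peptide is roughly 0.4 kcal/mol, which is small compared with the typical energy of a hydrogen bond; this is consistent with the proposal that an entropic gain offsets the missing P1 hydrogen bond.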

Discussion
We have determined the crystal structures of DQ2.5·CLIP1 (Protein Data Bank code 5KSU) and DQ2.5·CLIP2 (Protein Data Bank code 5KSV) at 2.73 and 2.20 Å, respectively. Although crystal structures of CLIP1 in complex with DR1 (with the peptide bound in both forward and reverse directions) (23,38), DR3 (22), and I-Ab (24) have been reported previously, no crystal structure of CLIP2 bound to an MHCII has been reported so far. DQ2.5 is unusual in that it associates with CLIP1 (Ii 83-101) as well as the non-canonical CLIP2 (Ii 92-107) (19). Our study has revealed two unique structural features of DQ2.5 that may promote its association with CLIP2. First, DQ2.5 has an unusually large P4 pocket, which can accommodate the bulky P4 Met of CLIP2. Second, DQ2.5 has a positively charged peptide-binding groove, which is electrostatically more compatible with the neutral CLIP2 than with the positively charged CLIP1.
Another unusual characteristic of DQ2.5 is its CLIP-rich phenotype. CLIP1 and CLIP2 combined account for up to 53% of the eluted peptide pool (9). It was proposed that the CLIP-rich phenotype of DQ2.5 is explained by MHCII·CLIP being poor substrates for DM (8,21). During MHC maturation, DM catalyzes release of CLIP from the nascent MHC (5). Therefore, impaired DQ2.5-DM interaction will result in DQ2.5 molecules retaining their original CLIP cargo. In contrast, DR1-expressing cells have a low abundance of CLIP (20), which suggests that DR1 is a good substrate for DM. We found two structural elements in DQ2.5 that may lower its DM sensitivity. First, α51, which is a key DM-contacting residue in DR1, is positioned internally in DQ2.5 due to the α53 deletion mutation. Second, the peptide-binding groove residues that form the α9-α22-α24-α31-β86-β90 hydrogen bond network are not as free to move as the corresponding residues in DR1 (Fig. 8). Therefore, DQ2.5 is less predisposed to the drastic secondary structure changes that DR1 undergoes upon DM binding. Our MD study showed that the α9-α22-α24-α31-β86-β90 hydrogen bond network is stable and that Phe-α51 of DQ2.5 cannot move into the P1 pocket upon DM binding. This is due to the blockage of the P1 pocket entrance by a water molecule, which is part of the α9-α22-α24-α31-β86-β90 hydrogen bond network. To further test this idea, we disrupted the hydrogen bond network by changing α24, α31, and β86 to residues that cannot form hydrogen bonds (Hα24F, Qα31I, and Eβ86G) and then repeated the MD simulation. This time, Phe-α51 did translocate to fill the P1 pocket, similar to what happens in DR1 upon DM binding. Our hypothesis that the α9-α22-α24-α31-β86-β90 hydrogen bond network leads to diminished DM sensitivity and ultimately to the CLIP-rich phenotype is further supported by a recent study by Zhou et al. (21), who showed that changing β86 of DQ8 from Glu to Ala resulted in increased DM sensitivity. DQ8 has the same α9-α22-α24-α31-β86-β90 hydrogen bond network found in DQ2.5, although it does not have the α53 deletion mutation.
Our analysis, however, is based on the assumption that the DR1-DM interaction mechanism is directly applicable to DQ2.5·DM. It remains to be seen whether the DR1-DM interaction mechanism is truly universal. Even if this should prove not to be the case, the preference for bulky hydrophobic anchor residues at the P1 pocket for both DR1 (39,40) and DQ2.5 (19,30) indicates that these two molecules likely share the mechanistic feature of Phe-α51 translocating to fill the P1 pocket in the interaction with DM.
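The stability of a hydrogen bond over an MD trajectory, as discussed above for the groove-floor network, is often summarized as an occupancy: the fraction of frames in which the bond is present. The sketch below illustrates the idea using only a donor-acceptor distance criterion. The 3.5 Å cutoff, the omission of an angle criterion, the function name, and the toy coordinates are all assumptions made for illustration; this is not the analysis script used in this study.

```python
import math


def hbond_occupancy(donor_xyz, acceptor_xyz, cutoff=3.5):
    """Fraction of frames in which the donor and acceptor heavy atoms lie
    within `cutoff` angstroms of each other.

    donor_xyz, acceptor_xyz: per-frame (x, y, z) coordinates of the two atoms.
    Assumption: a distance-only criterion; no donor-H-acceptor angle is checked.
    """
    if len(donor_xyz) != len(acceptor_xyz) or not donor_xyz:
        raise ValueError("need the same, non-zero number of frames for both atoms")
    hits = sum(
        1 for d, a in zip(donor_xyz, acceptor_xyz) if math.dist(d, a) <= cutoff
    )
    return hits / len(donor_xyz)


# Toy four-frame trajectory (made-up coordinates, not simulation data)
donor = [(0.0, 0.0, 0.0)] * 4
acceptor = [(2.8, 0.0, 0.0), (3.0, 0.0, 0.0), (4.2, 0.0, 0.0), (2.9, 0.0, 0.0)]
occupancy = hbond_occupancy(donor, acceptor)
```

In this toy case the bond is present in three of four frames. A bond with occupancy near 1 over the whole trajectory would be called stable, whereas a peripheral bond that breaks and reforms would show a markedly lower value.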

Crystallization and data collection
DQ2.5·CLIP1 and DQ2.5·CLIP2 were treated with Factor Xa for 16 h at 24°C to remove the leucine zippers from the MHCII. MHCII was then purified using anion-exchange (buffer A, 25 mM Tris, pH 8.0; buffer B, 25 mM Tris, pH 8.0, 0.5 M NaCl) and size-exclusion chromatography (buffer, 25 mM Tris, pH 8.0) and concentrated to 2 mg/ml. For DQ2.5·CLIP1, 1 µl of the protein solution and 1 µl of precipitant buffer (0.1 M ammonium sulfate, 0.1 M sodium cacodylate, pH 6.5, 25% PEG 8000, 6% glycerol) were combined in a single hanging drop and kept at 18°C. For DQ2.5·CLIP2, 1 µl of the protein solution and 1 µl of precipitant buffer (0.1 M Bis-Tris, pH 5.5, 22% PEG 3350) were combined in a single hanging drop and kept at 18°C. Small crystals of DQ2.5·CLIP1 and DQ2.5·CLIP2 appeared within 1 week and grew to full size in 2 weeks. Crystals were soaked in mother liquor containing 5% glycerol and then flash frozen in liquid nitrogen. X-ray diffraction data were collected at beam line 9-3 of the Stanford Synchrotron Radiation Laboratory. Diffraction data were indexed and integrated using HKL2000 (47). The DQ2.5·CLIP1 crystal belonged to the C121 space group with cell dimensions a = 128.86, b = 69.21, and c = 146.69 Å. The DQ2.5·CLIP2 crystal belonged to the I23 space group with cell dimensions a = 137.01, b = 137.01, and c = 137.01 Å.

Structure determination and analysis
Both structures were determined by molecular replacement using Phaser (48,49). DQ2.5 coordinates from the DQ2.5·gliadin structure (Protein Data Bank code 1S9V) were used as the search model. Model refinement was carried out using Refmac (50), PHENIX (51), and Coot (49). CLIP1 and CLIP2 peptides were built at the end of refinement, guided by the Fo − Fc electron density map. Bulk solvent correction and isotropic B correction were applied throughout the refinement. Water molecules were identified from residual density greater than 1.0 σ in the 2Fo − Fc map. All water molecules were checked for valid geometry, environment, and density shape before conducting additional cycles of model building and refinement. The last two refinement rounds included TLS (translation, libration, and screw-rotation displacements) parameterization. Crystallographic data collection, processing, and refinement statistics are given in Table 1. The stereochemical quality of the final structures was assessed using PROCHECK (52).

Model building of DQ2.5(wild type)·CLIP1·DM and DQ2.5(Eβ86G,Qα31I,Hα24F)·CLIP1·DM
The wild-type and mutant DQ2.5·CLIP1·DM complexes were modeled using the MODELLER v9.10 suite of programs (53). The DR1·HA·DM (Protein Data Bank code 4FQX) and DQ2.5·CLIP1 (Protein Data Bank code 5KSU) crystal structures were used as templates. In this model, CLIP1 was truncated to the same length (from P2 to P10) as that of the HA peptide in the DR1·HA·DM crystal structure. A total of five models were created and evaluated using the discrete optimized protein energy (DOPE) statistical energy function (54). The slow refine option was applied to achieve energy-minimized models. Structure visualization and figure preparation were done using Chimera (55) and PyMOL (56).

Molecular dynamics simulations
All-atom MD simulations were carried out on three systems: DQ2.5(wild type)·CLIP1, DQ2.5(wild type)·CLIP1·DM, and DQ2.5(Eβ86G,Qα31I,Hα24F)·CLIP1·DM. All crystal water molecules were included in the starting MD structures given their importance in mediating the protein-peptide interaction (57)(58)(59). The standard protonation state at pH 7.0 was applied to all ionizable amino acid groups, i.e. Lys and Arg side chains were protonated, whereas Glu and Asp side chains were deprotonated. His side chains, by default, were singly protonated. Each system was solvated in a TIP3P water box with a minimum distance of 12 Å to any protein atom and neutralized with sodium counterions. Periodic boundary conditions were applied to the system.
Each complex was first energy-minimized using steepest descent and conjugate gradient minimization and then heated to 300 K over 800 ps under canonical (NVT) ensemble conditions. The system was then equilibrated under isothermal-isobaric (NPT) ensemble conditions for 1 ns. The production MD simulations were carried out under microcanonical (NVE) ensemble conditions for 50 ns. SHAKE was turned on for all bonds involving hydrogen. All simulations were carried out with the AMBER 12 program (60) together with the ff99SB (61) force field. The particle mesh Ewald (62) algorithm was used to calculate long-range interactions, and short-range interactions were truncated at 10.0 Å. The integration time step was set to 1 fs. Resulting trajectories were analyzed using a combination of in-house Python scripts and the Ptraj/Cpptraj module of AMBER 12.
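As a quick check on the scale of the protocol described above, the stage durations can be converted into integration step counts at the stated 1-fs time step. This is a bookkeeping sketch only: the stage labels and variable names are hypothetical, and the code is not an AMBER input file.

```python
# Durations are expressed in femtoseconds so that all arithmetic stays in integers.
TIMESTEP_FS = 1  # integration time step stated in the text

STAGE_DURATION_FS = {
    "heating_NVT": 800 * 1_000,          # 800 ps
    "equilibration_NPT": 1 * 1_000_000,  # 1 ns
    "production_NVE": 50 * 1_000_000,    # 50 ns
}


def n_steps(duration_fs, timestep_fs=TIMESTEP_FS):
    """Number of integration steps needed to cover `duration_fs`."""
    if duration_fs % timestep_fs != 0:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // timestep_fs


steps = {stage: n_steps(d) for stage, d in STAGE_DURATION_FS.items()}
for stage, count in steps.items():
    print(f"{stage}: {count:,} steps")
```

At 1 fs per step the 50-ns production run alone corresponds to 50 million integration steps per system.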

Pocket volume calculation
The volume of the P4 pocket in DQ2.5, DR1, DR3, and I-Ab was calculated using an in-house script based on the Voronoi algorithm (14).
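The in-house Voronoi script is not reproduced here. As a rough stand-in that conveys what a cavity volume estimate involves, the sketch below uses a different and much simpler technique: it counts grid cells inside a user-defined box whose centers are not covered by any atom sphere. The box, the atom radii, the 0.5 Å grid spacing, and the function name are all assumptions chosen for illustration, and the result is not expected to match a Voronoi-based value.

```python
def free_volume(box_min, box_max, atoms, spacing=0.5):
    """Estimate the unoccupied volume (cubic angstroms) inside an axis-aligned box.

    box_min, box_max: (x, y, z) corners of the box enclosing the pocket.
    atoms: iterable of (x, y, z, radius) spheres lining the pocket.
    A grid cell counts as free if its center lies outside every sphere.
    """
    atoms = list(atoms)
    # Number of grid cells along each axis
    counts = [int(round((hi - lo) / spacing)) for lo, hi in zip(box_min, box_max)]
    free_cells = 0
    for i in range(counts[0]):
        x = box_min[0] + (i + 0.5) * spacing
        for j in range(counts[1]):
            y = box_min[1] + (j + 0.5) * spacing
            for k in range(counts[2]):
                z = box_min[2] + (k + 0.5) * spacing
                if all(
                    (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2 > r * r
                    for ax, ay, az, r in atoms
                ):
                    free_cells += 1
    return free_cells * spacing ** 3


# Toy example: a 10 x 10 x 10 box containing one sphere of radius 2 at its center
empty = free_volume((0.0, 0.0, 0.0), (10.0, 10.0, 10.0), [])
with_atom = free_volume((0.0, 0.0, 0.0), (10.0, 10.0, 10.0), [(5.0, 5.0, 5.0, 2.0)])
```

A finer grid spacing improves accuracy at the cost of run time; for a real pocket, the choice of bounding region largely determines the answer, which is one reason geometric methods such as Voronoi-based approaches are used instead.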