Structural Basis of Carbohydrate Recognition by Calreticulin*

The calnexin cycle is a process by which glycosylated proteins are subjected to folding cycles in the endoplasmic reticulum lumen via binding to the membrane protein calnexin (CNX) or to its soluble homolog calreticulin (CRT). CNX and CRT specifically recognize monoglucosylated Glc1Man9GlcNAc2 glycans, but the structural determinants underlying this specificity are unknown. Here, we report a 1.95-Å crystal structure of the CRT lectin domain in complex with the tetrasaccharide α-Glc-(1→3)-α-Man-(1→2)-α-Man-(1→2)-Man. The tetrasaccharide binds to a long channel on CRT formed by a concave β-sheet. All four sugar moieties are engaged in the protein binding via an extensive network of hydrogen bonds and hydrophobic contacts. The structure explains the requirement for glucose at the nonreducing end of the carbohydrate; the oxygen O2 of glucose perfectly fits to a pocket formed by CRT side chains while forming direct hydrogen bonds with the carbonyl of Gly124 and the side chain of Lys111. The structure also explains a requirement for the Cys105–Cys137 disulfide bond in CRT/CNX for efficient carbohydrate binding. The Cys105–Cys137 disulfide bond is involved in intimate contacts with the third and fourth sugar moieties of the Glc1Man3 tetrasaccharide. Finally, the structure rationalizes previous mutagenesis of CRT and lays a structural groundwork for future studies of the role of CNX/CRT in diverse biological pathways.

Protein glycosylation plays an important role in maturation and quality control of proteins in the endoplasmic reticulum (ER) (1–3). Nascent polypeptide chains of the secretory pathway are translocated into the lumen of the ER where they immediately become N-glycosylated with Glc3Man9GlcNAc2 on the side chain of asparagine residues in Asn-X-Ser/Thr motifs (see Scheme 1). The processing of the glycan starts with removal of the outermost glucose by the action of glucosidase I. Glucosidase II then trims the next glucose, upon which the monoglucosylated proteins selectively bind to the membrane ER protein calnexin (CNX) or its soluble homolog calreticulin (CRT) to enter the calnexin cycle. In the calnexin cycle, the nascent protein bound to CNX or CRT interacts with chaperones to promote its folding. The glucose and adjacent three mannose residues are critically important for carbohydrate recognition by CNX and CRT (Scheme 1) (4–6). Removal of the terminal glucose from correctly folded protein substrates releases them from binding to CNX or CRT and allows their exit from the ER to their final destination. Non-native proteins, which lack the terminal glucose residue, can be reglucosylated by the enzyme UDP-glucose-glycoprotein glucosyltransferase to re-enter the calnexin cycle. Multiple rounds of deglucosylation-reglucosylation cycles can occur until a glycoprotein reaches its native fold or, if terminally misfolded, is targeted for degradation via the ER-associated degradation pathway (7).
The previous crystal structure of CNX revealed two main structural components, a globular lectin domain and an extended arm-like domain called the P-domain (8). The lectin domain shows a fold similar to leguminous lectins and largely consists of a β-sandwich formed by two curved β-sheets. It also contains a single high affinity calcium-binding site that plays an important role in stabilizing the protein (9) but does not participate in carbohydrate recognition (8). The proline-rich P-domain interrupts the lectin domain between residues Pro270 and Phe415. The P-domain consists of four copies of a repeat motif (termed type 1) and four copies of a second repeat motif (type 2) in a "11112222" configuration to form a long hook-like arm that interacts with the thiol oxidoreductase ERp57 (10,11). The CRT P-domain shows a similar modular arrangement but consists of only three repeats of each type (10,12,13). CRT also possesses a highly negative C terminus, which binds calcium ions with millimolar affinity, but the C terminus does not contribute to glycan binding (14,15).
A portion of the carbohydrate-binding site was identified in CNX by soaking the crystals with glucose, but the study did not address the specificity toward monoglucosylated substrates or how the remaining mannose moieties of the glycan are recognized by CNX (8). Subsequent mutagenesis studies confirmed the importance of CNX residues for carbohydrate binding (16). Parallel studies of CRT identified the residues that are essential for oligosaccharide interactions but also hinted at possible differences between carbohydrate recognition by CNX and CRT (17–19). The CNX structure did not explain the earlier observation that treatment with dithiothreitol impairs carbohydrate binding by CRT (6).
Here, we determined the high resolution crystal structure of a fragment of CRT corresponding to the lectin domain in complex with a tetrasaccharide fragment from the glucosylated arm of the Glc1Man9GlcNAc2 glycan. The structure explains CRT specificity for monoglucosylated protein substrates and rationalizes mutagenesis studies of the protein family.

MATERIALS AND METHODS
Protein Expression, Preparation, and Purification-The C163S mutant of the mouse CRT lectin domain (residues 18–206 and 301–368 linked by Gly-Ser-Gly-Ser-Gly) was cloned into pET29a (Amersham Biosciences-Pharmacia) and expressed in Escherichia coli BL21(DE3) in rich (LB) medium as a fusion protein with both N-terminal and C-terminal His-tags. Residues are numbered according to the unprocessed, immature protein sequence. For labeling for NMR experiments, cells were grown in M9 minimal medium with 15N-NH4Cl and [U-13C]glucose. For production of a selenomethionine-labeled protein, the expression plasmid was transformed into the E. coli methionine auxotroph strain DL41(DE3), and the protein was produced using LeMaster medium (20). Cells were harvested and broken in TSC buffer (50 mM Tris, 300 mM NaCl, 3 mM CaCl2, pH 8.0). The fusion protein was purified by affinity chromatography on Ni2+-charged Sepharose resin, and the N-terminal tag was removed by cleavage with thrombin, leaving the Gly-Ser-Met N-terminal extension and Leu-Glu-His6 C-terminal extension. The cleaved protein was additionally purified by size-exclusion chromatography in HPLC buffer (20 mM Tris, 100 mM NaCl, 3 mM CaCl2, pH 7.5). Selenomethionine-labeled protein was purified in a similar manner. The tetrasaccharide, α-D-glucopyranose-(1→3)-α-D-mannopyranose-(1→2)-α-D-mannopyranose-(1→2)-D-mannose (Glc1Man3), was purchased from the Alberta Research Council (Edmonton, Canada) and used without further purification.
Crystallization-Initial crystallization conditions were identified by hanging drop vapor diffusion using the Classics II screen (Qiagen). The best crystals for unliganded CRT were obtained by equilibrating a 1.0-µl drop of protein (6 mg/ml) in buffer (20 mM Tris, 100 mM NaCl, 3 mM CaCl2, pH 7.5), mixed with 1.0 µl of reservoir solution containing 29% (w/v) PEG monomethyl ether 2000, 0.2 M KSCN, 10 mM taurine, and 0.1 M Tris (pH 8.5), and suspended over 1 ml of reservoir solution. Crystals grew in 3–7 days at 22°C. For cryoprotection, the reservoir composition with the addition of 7% glycerol was used. For data collection, crystals were picked up in a nylon loop and flash cooled in an N2 cold stream (Oxford Cryosystem). The crystals contain one molecule in the asymmetric unit (Z = 4), corresponding to Vm = 2.08 Å³ Da⁻¹ and a solvent content of 40.8% (21).
The best crystals for tetrasaccharide-bound CRT were obtained by equilibrating a 1.0-µl drop of a protein (6 mg/ml)/tetrasaccharide mixture at 1:2 molar ratio in buffer (20 mM Tris, 100 mM NaCl, 3 mM CaCl2, pH 7.5), mixed with 1.0 µl of reservoir solution containing 25.5% (w/v) PEG monomethyl ether 2000, 0.15 M KSCN, and 0.1 M Tris (pH 8.0), and suspended over 1 ml of reservoir solution. Crystals grew in 3–7 days at 22°C. For cryoprotection, the reservoir composition with the addition of 7% glycerol was used. For data collection, crystals were picked up in a nylon loop and flash cooled in an N2 cold stream (Oxford Cryosystem). The crystals contain one molecule in the asymmetric unit (Z = 4), corresponding to Vm = 1.96 Å³ Da⁻¹ and a solvent content of 37.4% (21).
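The relationship between the Matthews coefficient and the solvent content quoted for the two crystal forms can be sketched in a few lines of Python. This is a minimal illustration, assuming the commonly used approximation that the solvent fraction is 1 − 1.23/Vm (which corresponds to a protein partial specific volume of about 0.74 cm³/g); the function names are hypothetical and the unit cell volumes are not reproduced here.

```python
def matthews_coefficient(cell_volume_a3, z, mol_weight_da):
    """Vm in A^3/Da: unit cell volume divided by (molecules per cell x molecular mass)."""
    return cell_volume_a3 / (z * mol_weight_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming 1.23 A^3/Da is occupied by protein."""
    return 1.0 - 1.23 / vm


# Vm values as reported in the text for the two crystal forms.
for label, vm in (("unliganded", 2.08), ("tetrasaccharide-bound", 1.96)):
    print(f"{label}: Vm = {vm:.2f} A^3/Da, solvent ~ {100 * solvent_fraction(vm):.1f}%")
```

Under this approximation the rounded Vm values give solvent contents within a few tenths of a percent of the reported 40.8% and 37.4%; the small differences are consistent with rounding of Vm.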
Structure Solution and Refinement-The single-wavelength anomalous diffraction data sets from selenomethionine-labeled crystals of tetrasaccharide-bound CRT and a native data set from a crystal of unliganded CRT were collected at a wavelength of 0.9769 Å on an Area Detector Systems Corp. Quantum-210 CCD detector at beamline A1 at the Cornell High Energy Synchrotron Source (Table 1). Data processing and scaling were performed with HKL2000 (22). The starting phases for the orthorhombic crystal of tetrasaccharide-bound CRT were obtained using molecular replacement with the CNX structure (Protein Data Bank code 1JHN) followed by direct refinement against experimentally derived selenium sites using PHASER (23). The resulting map was subjected to density modification with the program ARP/wARP (24), which allowed for automated model building of ~90% of the residues.
The partial model obtained from ARP/wARP was extended manually with the help of the program Coot (25) and was improved by several cycles of refinement using the program REFMAC (26). Of 273 residues of the construct, the final model does not include GSM of the cloning linker at the N terminus, P204PK206, GSGSG of the linker replacing the P-domain, and E363EQRLK368LEHHHHHH C-terminal residues. In addition, one Glc1Man3 molecule, one calcium ion, and 177 water molecules were included in the model. The final model had good stereochemistry with no outliers in the Ramachandran plot computed using PROCHECK (27).
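The residue bookkeeping for the construct can be checked with simple arithmetic. The sketch below, with illustrative variable names, adds up the segments described in the text (the Gly-Ser-Met extension, residues 18–206, the GSGSG linker, residues 301–368, and the Leu-Glu-His6 extension) and subtracts the segments listed as absent from the tetrasaccharide-bound model; the count of modeled residues is derived here and is not a figure quoted in the text.

```python
def span(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


# Segments of the expressed construct after thrombin cleavage.
construct = {
    "GSM N-terminal extension": 3,
    "CRT residues 18-206": span(18, 206),
    "GSGSG linker": 5,
    "CRT residues 301-368": span(301, 368),
    "LEHHHHHH C-terminal extension": 8,
}
total = sum(construct.values())

# Segments not included in the tetrasaccharide-bound model:
# GSM, P204-K206, GSGSG, E363-K368, LEHHHHHH.
omitted = 3 + span(204, 206) + 5 + span(363, 368) + 8
modeled = total - omitted

print(f"construct: {total} residues, omitted: {omitted}, modeled: {modeled}")
```

The total reproduces the 273 residues stated for the construct.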
For the unliganded CRT and monoclinic crystals of tetrasaccharide-bound CRT, the structures were obtained by molecular replacement with the orthorhombic tetrasaccharide-bound CRT structure using PHASER (23) and improved by several cycles of refinement using the program REFMAC (26) and model refitting, followed by translation-libration-screw (TLS) refinement (28). For the unliganded CRT, the final model does not include L203PPK206, GSGSG of the linker, and E364QRLK368LEHHHHHH C-terminal residues. One calcium ion and 56 water molecules were included in the model. The final model has good stereochemistry with no outliers in the Ramachandran plot computed using PROCHECK (27).
Isothermal Titration Calorimetry-Experiments were carried out on a MicroCal iTC200 titration calorimeter (GE Healthcare) using the VPViewer software for instrument control and data acquisition. The buffer used for isothermal titration calorimetry experiments contained 20 mM Tris, 100 mM NaCl, 3 mM CaCl2, pH 7.5. During a titration experiment, a sample of the CRT lectin domain was kept at 293 K in a stirred (1000 rpm) reaction cell of 0.2 ml. Nineteen injections, each of 2-µl volume and 4-s duration with a 150-s interval between injections, were carried out using a 39.4-µl syringe filled with Glc1Man3 solution. Titration experiments were performed with 30 µM protein solution in the cell and 300 µM carbohydrate solution in the syringe. The calorimetric data were processed using the software package ORIGIN (version 7) to determine the Gibbs free energy of binding, molar binding stoichiometry (N), molar binding entropy (ΔS), and molar binding enthalpy (ΔH).
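The titration schedule described above fixes the range of ligand-to-protein molar ratios sampled in the experiment. The following Python sketch computes the cumulative ratio after each injection from the stated volumes and concentrations. It is a simplified illustration that ignores dilution and displacement of the cell contents during injection, which the instrument software would normally correct for; the function and constant names are hypothetical.

```python
CELL_VOLUME_UL = 200.0   # 0.2-ml reaction cell
N_INJECTIONS = 19
INJECTION_UL = 2.0
PROTEIN_UM = 30.0        # CRT lectin domain in the cell
LIGAND_UM = 300.0        # Glc1Man3 in the syringe


def molar_ratios(n=N_INJECTIONS, inj_ul=INJECTION_UL, cell_ul=CELL_VOLUME_UL,
                 protein_um=PROTEIN_UM, ligand_um=LIGAND_UM):
    """Cumulative ligand/protein molar ratio after each injection (no dilution correction)."""
    protein_pmol = protein_um * cell_ul          # uM x ul = pmol
    return [ligand_um * inj_ul * i / protein_pmol for i in range(1, n + 1)]


ratios = molar_ratios()
total_injected_ul = N_INJECTIONS * INJECTION_UL
print(f"total injected: {total_injected_ul:.0f} ul, final molar ratio: {ratios[-1]:.2f}")
```

With these numbers the total injected volume is 38 µl, which fits within the 39.4-µl syringe, and the titration ends at roughly a twofold molar excess of tetrasaccharide.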

Crystallization and Structure Determination of CRT Lectin Domain-Previous attempts to crystallize CRT were likely hindered by the intrinsic mobility of the arm-like P-domain and unstructured C terminus (supplemental Fig. 1). To overcome this, we deleted the P-domain (residues 207–300) of mouse CRT and replaced it with a short linker. Additionally, Cys163 was mutated to serine to avoid intermolecular disulfide bond formation. Although we were able to obtain purified protein, extensive screening did not yield any crystals. We hypothesized that the presence of an unstructured C-terminal tail additionally hindered crystallization. To define the boundaries of the folded domain, we subjected full-length and P-domain-deleted mouse CRT to limited proteolysis using trypsin, chymotrypsin, proteinase K, and V8 protease. Characterization by mass spectrometry of fragments from trypsin digestion suggested that the C terminus could be removed by cleavage at Lys368 to produce a stable fragment, and NMR spectroscopy confirmed that it was soluble and well folded (supplemental Figs. 2 and 3). Based on these results, we recloned the lectin domain of mouse CRT to include residues 18–206 and 301–368. Crystallization trials produced long needle-like crystals that could be improved using additives including a Glc1Man3 tetrasaccharide. Large, well diffracting crystals were obtained in the primitive orthorhombic (space group P212121) and monoclinic (P21) forms with the best diffraction extending to beyond 2.0 Å.
We obtained unbiased, experimental phases for the orthorhombic crystals using selenomethionine-labeled protein and single wavelength anomalous diffraction (Table 1). Subsequently, this structure was used for molecular replacement to solve the structure of the monoclinic crystal form. The asym-  18 to Phe202 and from Pro301 to Glu363. No density was observed for Leu203–Lys207 and the following linker GSGSG due to disorder in the crystal. Also, the C-terminal sequence Glu364–Lys368 with the C-terminal His-tag was disordered. The structure of the CRT lectin domain displays a jelly roll fold largely formed by a sandwich of two large β-sheets (Fig. 1A). The hydrophobic interactions between the seven-stranded concave β-sheet and six-stranded convex β-sheet are crucial for structural integrity of the domain. There is also a small β-sheet covering the surface where the concave and convex sheets create space between them. The structure also shows two short α-helices (Ala32–Arg36 and Leu196–Asp199) and a long α-helix (Glu336–Asp362) that runs along and beyond the convex β-sheet. There are also several protruding loops in the domain. For example, a flap-like β6-β7 loop covers part of the concave β-sheet while packing against the α1-β2 loop that rises against the β4 strand. Another long loop between strands β2 and β3 stabilizes the C-terminal half of the long helix α3. The Cys105–Cys137 disulfide bridges the beginning of strand β6 with the end of strand β7.
The structure clearly defines a calcium-binding site in the CRT lectin domain. The calcium ion is coordinated by the side chain of Asp328, backbone carbonyls of Gln26, Lys62, and Lys64, and by two water molecules (Fig. 1B).
Structural Basis of Carbohydrate Recognition-The tetrasaccharide Glc1Man3 corresponds to the branch of the Glc1Man9GlcNAc2 glycan that binds to the lectin domain of CRT/CNX. By isothermal titration calorimetry, we confirmed that the isolated, recombinant domain retained all of the lectin functions. The affinity of 0.7 µM for Glc1Man3 is very close to the reported value for intact CRT (supplemental Fig. 4) (29).
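The reported dissociation constant can be converted into a binding free energy using the standard relation ΔG° = RT ln(Kd). The Python sketch below does this for Kd = 0.7 µM at the 293 K used in the calorimetry experiments, assuming a 1 M standard state, and also estimates the dimensionless c-value of the titration (stoichiometry × cell concentration / Kd) under the assumption of a single binding site. The function names are illustrative and the derived numbers are not quoted in the text.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def delta_g_kj_per_mol(kd_molar, temperature_k):
    """Standard binding free energy (kJ/mol) from a dissociation constant (1 M standard state)."""
    return R * temperature_k * math.log(kd_molar) / 1000.0


def c_value(protein_molar, kd_molar, n_sites=1.0):
    """Dimensionless ITC c-value; n_sites = 1 is an assumption."""
    return n_sites * protein_molar / kd_molar


dg = delta_g_kj_per_mol(0.7e-6, 293.0)
c = c_value(30e-6, 0.7e-6)
print(f"dG ~ {dg:.1f} kJ/mol ({dg / 4.184:.2f} kcal/mol), c ~ {c:.0f}")
```

A c-value of a few tens lies within the range generally considered suitable for determining the affinity and stoichiometry from a single titration.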
The electron density map showed clear, easily interpretable density for all four sugar moieties (Fig. 2A) that we refer to according to their numbering in the natural glycan, Glc(3)-Man(D1)-Man(C)-Man(4). The tetrasaccharide runs along the long channel formed by the curved β-sheet with all sugar moieties engaged in protein binding (Fig. 2B). Among them, the last mannose, Man(4), has somewhat looser contacts with CRT and higher B-factors. The first three moieties Glc(3), Man(D1), and Man(C) interact tightly with CRT and have B-factors similar to those of the protein residues.
At the nonreducing end of the tetrasaccharide, the glucose moiety lies flat in the shallow cavity, the base of which is formed by side chains of Met131 and Ile147 (Fig. 2B). In addition to these hydrophobic contacts, Glc(3)  Importantly, the structure explains the requirement for glucose at the nonreducing end of the carbohydrate. Glucose and mannose are C2 epimers of each other. Our structure shows that the equatorial O2 of glucose fits perfectly into the groove formed by CRT side chains (Fig. 2D). Mannose in this position would cause a steric clash with the sulfur atom of Met131 and the loss of hydrogen bonds with the carbonyl of Gly124 and the side chain of Lys111. Previous mutagenesis studies showed that even the single K111A mutation impairs CRT-carbohydrate interactions (17,18).
The structure also explains the specificity for monoglucosylated glycans (Glc1Man9GlcNAc2) over the precursor with two glucose residues (Glc2Man9GlcNAc2). Although the carbohydrate-binding site can accommodate a glucose residue in the second position, the sugar linkages are different. Binding of the tetrasaccharide α-Glc-(1→3)-α-Glc-(1→3)-α-Man-(1→2)-Man would result in the loss of hydrogen bonds and unfavorable interactions as the last mannose residue intersects the protein surface.
In contrast to the extensive protein contacts by the first sugar, Man(D1) and Man(C) mainly use their O4–O6 edges for interactions with CRT (Fig. 2C). In particular, O4 of Man(D1) occupies a crucial position engaging in three direct hydrogen bonds with the side chain of Tyr109 and both the side chain and backbone carbonyl of Asp317. The side chain of Asp317 also makes a direct hydrogen bond with O6 of Man(D1). Although O2 of Man(D1) is directed away from the β-sheet, it forms a water-mediated hydrogen bond with the side chain of Asp125 located in the carbohydrate-interacting loop. Asp135 of CRT is crucial for Man(C) binding, as its side chain forms direct hydrogen bonds with O4 and O6 of Man(C). Two water molecules assist the other Man(C)-CRT hydrogen bonds: between O3 and the side chains of Tyr109 and Asp135, and between O6 and the side chain of Tyr109 and backbone amide of Trp319.
Previous studies showed that treatment with dithiothreitol abrogates carbohydrate binding by CRT (6). We similarly see no binding in the presence of the non-thiol reducing agent TCEP (tris(2-carboxyethyl)phosphine) (supplemental Fig. 4B). These cysteines are also essential to the chaperone function of CRT (30). The CRT/Glc1Man3 structure provides a basis to explain these observations, as the Cys105–Cys137 disulfide bond is involved in contacts with the Man(C) and Man(4) moieties of the Glc1Man3 tetrasaccharide. Namely, the C5-C6 bond of Man(C) and C1-O1 bond of Man(4) partially wrap around the CRT disulfide bond (Fig. 2C, right). Reduction of this disulfide bond would clearly disrupt binding of the last two mannose moieties. Both Man(C) and Man(4) are also engaged in hydrophobic interactions with the side chain of Trp319.
The C1-O1 bond of Man(4) is directed away from the protein. Therefore, it is likely that the following mannose residue Man(3) of the Glc1Man9GlcNAc2 glycan does not make any significant interactions with CRT, and the full essence of CRT-carbohydrate recognition is captured in our structure. In agreement with this conclusion, Glc1Man3 was shown to compete effectively with Glc1Man9 for binding to CNX (6).
Identical tetrasaccharide conformations were observed in the two crystal forms, indicating that the conformation is not affected by crystal packing. There is a single crystal contact between the C6-O6 bond of Man(4) at the far end of the tetrasaccharide and the side chain of Glu345 of another CRT molecule in the P212121 crystal form, whereas the carbohydrate does not contact any symmetry-related molecules in the P21 crystal form. Additionally, we note that the bound calcium ion is positioned far from the carbohydrate-binding site and is not involved in glycan recognition.
We overlaid the tetrasaccharide-bound and unliganded CRT lectin domain structures to assess the conformational changes in CRT upon carbohydrate binding. The overlay results in a root mean square deviation of 0.2 Å over 229 Cα atoms, demonstrating that the two structures are nearly identical (Fig. 3A). The only significant change occurs in the flap-like loop containing Gly124 and Asp125, residues involved in glucose binding. The loop conformation in the unliganded state is stabilized by side chains of Lys111 and Asp317 that make hydrogen bonds with the backbone carbonyl of Asp125 and amides of Gly124 and Asp125, respectively (Fig. 3B). When oligosaccharide binds CRT, side chains of Lys111 and Asp317 engage in direct hydrogen bonding with sugar moieties. The released Gly124 and Asp125 rotate, allowing the carbonyl of Gly124 to form a hydrogen bond with Glc(3), whereas the side chain of Asp125 interacts with carbohydrates via an ordered water molecule (Fig. 3C).
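For readers unfamiliar with the metric, the root mean square deviation over matched Cα atoms can be sketched as follows. This minimal Python example assumes the two coordinate sets have already been optimally superposed and are listed in the same residue order; the superposition step itself is not shown, and the coordinates are toy values rather than data from the structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as input) between paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example: three hypothetical atoms, each displaced by 0.2 A along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
shifted = [(x + 0.2, y, z) for x, y, z in reference]
print(f"RMSD = {rmsd(reference, shifted):.2f} A")
```

Because every atom in the toy example is displaced by the same distance, the deviation equals that displacement; in a real comparison a value of 0.2 Å over 229 atoms implies that most atoms differ by far less than a typical bond length.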
Structure Correlates Well with Mutagenesis Results-Due to the interest in recognition of glycosylated proteins by CRT/CNX, extensive mutagenesis data have been obtained with CRT to characterize its binding to oligosaccharides. Mutagenesis of CRT by different groups showed that Tyr109, Lys111, Tyr128, Asp135, and Asp317 are critical for carbohydrate binding (17–19). The crystal structure shows that the side chains of all of these residues form direct hydrogen bonds with carbohydrate moieties (Fig. 2C). The structure is also in perfect agreement with the results showing reduced carbohydrate binding for the W319I and W319A CRT mutants (19). The bulky side chain of this residue is involved in hydrophobic contacts with the reducing end mannose residue in our structure (Fig. 2B). Similarly, substitutions of Met131 reduced but did not prevent binding (17,18). On the other hand, mutations of Asp125 did not affect oligosaccharide binding (17), as the contacts of this residue with sugar are mediated through a water molecule. The structure further explains the observation that D160G and D160A mutations do not affect carbohydrate binding (18,19), as this residue is 13 Å away from the bound ligand. The structure also reveals that the 75% decrease in carbohydrate affinity in the R73L CRT mutant (17) results from mid-range conformational changes, as this residue is >10 Å away from the bound tetrasaccharide and is unable to participate directly in binding.
Comparison with CNX and Other ER Lectins-Sequence similarity between homologous regions of CRT and CNX led to assumptions that the structures of these proteins are very similar and resulted in use of homology models of CRT based on the CNX structure. Consequently, these models were used to interpret results from single point and deletion mutagenesis that proved to be inaccurate in some cases. As an example, Glu217 of CNX, which participates in ligand binding, was incorrectly assigned as homologous to Asp160 of CRT, which is far from the carbohydrate-binding site (17,18). On the other hand, while the CRT construct used for glycan-independent substrate binding studies was missing several residues of the long C-terminal helix (31), the thermal denaturation curve for that construct is similar to that of the lectin domain studied here (supplemental Fig. 5).
The CRT structure allows for an accurate structural alignment of CRT and CNX. The structural overlay of unliganded CRT and CNX shows 33% sequence identity between their lectin domains and a root mean square deviation of 1.7 Å over 195 Cα atoms. The structural similarities are strongest in the β-sheet regions, whereas the loops often adopt divergent conformations (Fig. 4A). Conspicuous differences between the two structures occur in the helical regions. In particular, the short α1 and α2 CRT helices are absent in the CNX structure. Similarly, the C-terminal helix is much shorter (10 versus 25 residues) in the CNX structure. The reason for this divergence is unclear, as this region is well conserved between the two proteins (Fig. 4B). Although this may reflect a genuine difference between CNX and CRT, it is possible that the native CNX C-terminal helix is longer than observed previously.
Despite differences elsewhere, the oligosaccharide-binding surface is nearly identical in both proteins (Fig. 4C).  in our crystal structure and involves residues (Tyr109 and Asp317 of CRT) that form hydrogen bonds with the following mannose in the CRT-tetrasaccharide complex. This discrepancy could simply result from the low binding affinity of glucose for CNX coupled with the low resolution of the crystallographic data set, leading to some positional shift of the glucose moiety in the CNX structure. Alternatively, the isolated glucose moiety could bind at a slightly different location when it is not part of an oligosaccharide chain. Nonetheless, the superposition of the carbohydrate-binding residues is striking and suggests that CNX and CRT bind carbohydrates in an identical fashion.
The calcium ion is coordinated in the CNX structure by side chains of Asp437 (Asp328 of CRT) and Asp118 (Asp63), the carbonyl of Ser75 (Gln26) and, possibly, by the carbonyl of Lys119 (Lys64). The missing coordinating groups were not observed, possibly due to insufficient resolution. In CRT, an equivalent calcium ion is coordinated by both oxygen atoms of the Asp328 side chain, carbonyls of Gln26, Lys62, and Lys64, and by two water molecules.
A structural similarity search using the Dali database (32) showed that the CRT lectin domain is most similar to CNX (Z-score, 27.5) as expected. In addition, the CRT structure is similar to Emp47p (Z-score, 15.5), ERGIC-53 (Z-score, 14.3), and VIP36 (Z-score, 13.7). VIP36 and ERGIC-53 are transport lectin proteins that are involved in trafficking of glycosylated proteins out of the ER, whereas Emp46/47p is a yeast homolog of ERGIC-53. Interestingly, these proteins have specificity toward the deglucosylated D1 arm of high mannose glycans (33–35), the same arm recognized by CRT/CNX in its monoglucosylated state. The crystal structures of VIP36 in complex with α-Man-(1→2)-Man and α-Man-(1→2)-α-Man-(1→3)-β-Man-(1→4)-GlcNAc have been determined (36). Despite a significant overlap in oligosaccharide specificity and the use of a similar structural scaffold between CRT/CNX and VIP36, they use differing surfaces to bind carbohydrates (supplemental Fig. 6). This is an example of how a similar fold is adapted for binding somewhat differing ligands.

DISCUSSION
The high resolution crystal structure of the CRT lectin domain in complex with the Glc1Man3 tetrasaccharide illuminates the molecular basis of monoglucosylated Glc1Man9GlcNAc2 glycan function in the calnexin cycle. The structure explains the requirement for a single glucose at the nonreducing end of the carbohydrate and allows for an accurate structural alignment between CRT and CNX. The striking similarity in the sugar-binding sites suggests that CNX and CRT interact with monoglucosylated substrates in identical fashion.
To gain insight into the structure of full-length CRT with a bound glycoprotein, we overlaid the CRT lectin domain with a model of the P-domain derived from the structure of CNX (Fig. 5). As the CRT P-domain is shorter and has only three repeat modules, we removed the third module of the CNX P-domain. The CNX P-domain was chosen as a model because the crystal structure includes information about the relative orientation of the P- and lectin domains that is absent from the NMR structure of the isolated CRT P-domain (12,13). The tip of the CRT P-domain contains a binding site for ERp57, an oxidoreductase involved in disulfide bond formation in glycoproteins. The model gives an idea of how the bound glycoprotein might be positioned relative to the lectin domain, the P-domain, and ERp57. CRT also contains an ~55-residue-long C-terminal extension that is rich (>60%) in glutamate and aspartate residues. Because of its abundant negative charges, this C-terminal domain is unlikely to be structured in solution, but it becomes more ordered upon binding calcium (37).
To exit the CNX/CRT cycle, monoglucosylated glycoproteins are processed by glucosidase II to remove the terminal glucose residue. The accessibility of the glycan for processing while bound to CNX/CRT has been the subject of some debate (38,39). Our structure shows that the bond between Glc(3) and Man(D1) targeted by glucosidase II is not easily accessible. It seems likely that the glycosylated substrate has to dissociate from CRT for deglucosylation to occur.
Earlier reports also suggested that CRT undergoes conformational changes upon carbohydrate binding (40). Comparison of the tetrasaccharide-bound and unliganded CRT lectin domain structures demonstrates that the only significant change occurs in the flap-like loop that is involved in glucose binding. On the other hand, loss of the bound calcium ion is likely to strongly destabilize the CRT lectin domain. Numerous studies have shown calcium-dependent conformational changes in CRT (41–43). Although our carbohydrate-bound CRT structure confirms that the calcium-binding site is too far away to affect interactions with glycans, the tightly bound calcium ion is important for the structural integrity of the lectin domain. With a Kd of ~2 µM (14), loss of this high affinity calcium ion is unlikely to occur within the ER, but it may occur for CRT outside of the ER (44).
CNX and CRT exhibit overlapping but distinct patterns of interaction with folding glycoproteins in vivo (45–47). Given that the lectin sites of these chaperones are essentially identical, the basis for observed differences in glycoprotein binding specificity must reside in other properties. Previous studies have shown that the distinct luminal versus membrane-bound topologies of CRT and CNX contribute to selection of client glycoproteins (45,48,49). It is also likely that the reported ability of CRT and CNX to bind directly to peptides (31,50,51) or to polypeptide segments of non-native protein conformers (9,40) contributes to substrate selection. Consequently, the identification of such peptide binding sites on these chaperones is of considerable interest. However, despite the fact that the lectin domains of both CNX (50) and CRT 4 are capable of binding hydrophobic peptides with micromolar Kd, our efforts to obtain co-crystals of the CRT lectin domain with such peptides have so far been unsuccessful. Likewise, we have been unable to cocrystallize the lectin domain of CRT with ATP. Both CNX and CRT have been reported to bind ATP (41,42), and the presence of this nucleotide has been shown to enhance their abilities to bind non-native polypeptides and to suppress their aggregation in vitro (9,40). Given the conformational changes that have been reported to accompany nucleotide or peptide binding (9,41,42,50), it may be that such conformations are less amenable to crystallization.
CRT has also been reported to bind zinc, and four histidines within the lectin domain have been implicated in binding (40,43,52–56). Examination of these histidines reveals that some are buried but that His42 is exposed and adjacent to other residues, Asp118, Asp121, His123, and Asp125, that could potentially bind zinc.
In conclusion, we have determined the structure of the CRT lectin domain in complex with its physiological ligand. The structure provides the framework for the design and interpretation of mutants that affect the multiple physiological functions of CRT.