Mechanism of N-Acetylgalactosamine Binding to a C-type Animal Lectin Carbohydrate-recognition Domain*

The mammalian hepatic asialoglycoprotein receptor, a member of the C-type animal lectin family, displays preferential binding to N-acetylgalactosamine compared with galactose. The structural basis for selective binding to N-acetylgalactosamine has been investigated. Regions of the carbohydrate-recognition domain of the receptor believed to be important in preferential binding to N-acetylgalactosamine have been inserted into the homologous carbohydrate-recognition domain of a mannose-binding protein mutant that was previously altered to bind galactose. Introduction of a single histidine residue corresponding to residue 256 of the hepatic asialoglycoprotein receptor was found to cause a 14-fold increase in the relative affinity for N-acetylgalactosamine compared with galactose. The relative ability of various acyl derivatives of galactosamine to compete for binding to this modified carbohydrate-recognition domain suggests that it is a good model for the natural N-acetylgalactosamine binding site of the asialoglycoprotein receptor. Crystallographic analysis of this mutant carbohydrate-recognition domain in complex with N-acetylgalactosamine reveals a direct interaction between the inserted histidine residue and the methyl group of the N-acetyl substituent of the sugar. Evidence for the role of the side chain at position 208 of the receptor in positioning this key histidine residue was obtained from structural analysis and mutagenesis experiments. The corresponding serine residue in the modified carbohydrate-recognition domain of mannose-binding protein forms a hydrogen bond to the imidazole side chain. When this serine residue is changed to valine, a loss in selectivity for N-acetylgalactosamine is observed. The structure of this mutant reveals that the β-branched valine side chain interacts directly with the histidine side chain, resulting in an altered imidazole ring orientation.

The mammalian hepatic asialoglycoprotein receptor is best known for its role in clearance of desialylated proteins from serum (1,2). The receptor is often regarded as a galactose-binding animal lectin because of its ability to bind galactose exposed when sialic acid is removed from complex N-linked oligosaccharides. However, the affinity of the receptor for N-acetylgalactosamine-terminated sugar structures and neoglycoproteins derivatized with this monosaccharide is substantially higher than for those terminating in galactose (3,4). In competition assays, N-acetylgalactosamine (GalNAc) competes approximately 60-fold more effectively than galactose (Gal) for binding to the carbohydrate-recognition domain (CRD) of the major subunit of the receptor (rat hepatic lectin 1, RHL-1). Interestingly, preferential binding of GalNAc is not a property of the asialoglycoprotein receptor of peritoneal and tumoricidal macrophages, even though the CRD of this receptor shares 85% sequence identity with RHL-1 (5,6).
The basis for the different behavior of the hepatic and macrophage asialoglycoprotein receptors has previously been investigated by studying a series of chimeric CRDs containing different portions of the two receptors (7). This approach led to identification of three regions of the primary structure that contribute to selective binding of GalNAc. The position of the most significant of these regions relative to the monosaccharide binding site found in the three-dimensional structure of the homologous mannose-binding protein (MBP) CRD (8) suggests that these amino acids are unlikely to form direct contacts with the bound sugar. Instead, these residues probably influence the position of a critical histidine side chain (His 256 in RHL-1), which could in turn contact the N-acetyl portion of the sugar. The importance of His 256 for GalNAc selectivity was confirmed by mutagenesis (7).
Direct structural analysis of the CRD from an asialoglycoprotein receptor has not yet proven feasible. However, studies on mutant forms of the homologous mannose-binding CRD of rat serum mannose-binding protein suggest the likely arrangement of Gal and GalNAc in the binding site of the receptor. Insertion of three portions of RHL-1 into MBP results in a complete change in ligand-binding preference of this CRD from mannose to galactose (9). The mutant QPDWG, which contains the sequence Gln-Pro-Asp-Asp-Trp-Tyr-Gly-His-Gly-Leu-Gly-Gly in place of residues 185 through 191 of MBP, has binding properties almost indistinguishable from the asialoglycoprotein receptor when galactose and mannose are compared. Crystallographic analysis of galactose bound to the mutant CRD
* Supported by a grant from the Wellcome Trust (to K. D.) and Grant GM50565 (to W. I. W.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
These studies have now been extended by further mutagenesis of the CRD from MBP to mimic the selective binding of GalNAc by RHL-1. Structural analysis confirms the importance of the histidine corresponding to His 256 of RHL-1 in directly contacting the bound ligand and provides additional evidence that other regions of RHL-1 that contribute to GalNAc binding may do so indirectly by influencing the position of this key histidine side chain.

EXPERIMENTAL PROCEDURES
Preparation of Mutant Proteins-Site-directed mutagenesis was conducted using double-stranded synthetic oligonucleotides to replace restriction fragments of the cDNA for MBP following standard methods (11). Oligonucleotides were prepared on an Applied Biosystems 391 DNA synthesizer. The CRDs derived from MBP were expressed in the pINIIIompA vector as described previously (9) and those derived from RHL-1 were expressed in vector pT5T (7). All of the CRDs were isolated by chromatography on galactose-Sepharose (7,12).
Sugar Synthesis-N-propionyl and N-iso-butanoyl derivatives of galactosamine were prepared by reaction of the free amino sugar with propionic and iso-butyric anhydride (13). Thio derivatives were prepared as described previously (14). The carboxypropionyl derivative was synthesized by reacting galactosamine hydrochloride with succinic anhydride. The detailed procedure will be published elsewhere. All derivatives were characterized using ¹H-NMR spectroscopy and fast atom bombardment mass spectroscopy. Concentrations were determined using the Morgan-Elson assay (15) following hydrolysis in 1 M HCl for 3 h at 96°C.
Binding Assays-Solid phase binding and competition assays were performed using ¹²⁵I-Gal₃₄-serum albumin as reporter ligand (7). The K_I values reported represent the point of half-maximal competition for binding as determined using a nonlinear least squares fitting program (SigmaPlot, Jandel Scientific).
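As an illustration of what "the point of half-maximal competition" means, the following minimal sketch locates the competitor concentration at which reporter binding falls to 50%. It is not the SigmaPlot nonlinear least squares procedure used here: it substitutes simple interpolation on a logarithmic concentration scale, and the function name, the one-site model used to generate the data, and all numbers are hypothetical.

```python
import math


def half_maximal_concentration(conc, fraction_bound):
    """Return the competitor concentration at which reporter binding is 50%.

    conc: competitor concentrations in ascending order (any consistent unit).
    fraction_bound: reporter binding at each concentration, normalized so
    that binding in the absence of competitor equals 1.0.
    """
    points = list(zip(conc, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            # Interpolate linearly in log(concentration) between the two
            # points that bracket 50% binding.
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_c = math.log(c_lo) + t * (math.log(c_hi) - math.log(c_lo))
            return math.exp(log_c)
    raise ValueError("data do not bracket 50% competition")


# Hypothetical noise-free curve from f = 1 / (1 + c / K_I) with K_I = 1.0 mM.
conc_mM = [0.25, 0.5, 2.0, 4.0]
bound = [1.0 / (1.0 + c / 1.0) for c in conc_mM]
k_i = half_maximal_concentration(conc_mM, bound)
```

Interpolation is adequate only when the data bracket the midpoint closely; a full curve fit, as used for the reported values, also makes use of the points away from the midpoint.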
NMR Analysis-Proteins for NMR studies were further purified by reverse phase chromatography on a C3 ultrapore column (Beckman Instruments) using a gradient from 10 to 50% acetonitrile in the presence of 0.1% trifluoroacetic acid. Eluted proteins were lyophilized and prepared for NMR analysis as in previous studies (16). Sugar was added in aliquots from concentrated stock solutions in D₂O. Titrations were performed on a Varian Unity 500 spectrometer. Changes in chemical shift were fitted to a simple first order binding equation, $\Delta\delta = \Delta\delta_{\mathrm{bound}}\,[\mathrm{sugar}] / (K_D + [\mathrm{sugar}])$, where K_D is the dissociation constant and Δδ_bound is the apparent change in chemical shift for the bound form.
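The fit of chemical shift changes to a single-site binding isotherm, in which the observed change is the bound-state change scaled by [sugar]/(K_D + [sugar]), can be sketched as follows. This is a minimal standard-library illustration, not the fitting software used in this study. It assumes that the free sugar concentration is well approximated by the total added sugar, and the function names, search bounds, and titration data are all hypothetical.

```python
import math


def isotherm(sugar, dd_bound, kd):
    """Single-site binding: shift change at a given sugar concentration."""
    return dd_bound * sugar / (kd + sugar)


def fit_isotherm(conc, shifts, kd_min=1e-3, kd_max=1e3, grid=200):
    """Least-squares estimate of (dd_bound, kd) for the isotherm above."""

    def amplitude(kd):
        # For a fixed kd the model is linear in dd_bound, so the best
        # amplitude has a closed-form least-squares solution.
        f = [c / (kd + c) for c in conc]
        return sum(fi * y for fi, y in zip(f, shifts)) / sum(fi * fi for fi in f)

    def sse(log_kd):
        kd = math.exp(log_kd)
        a = amplitude(kd)
        return sum((y - isotherm(c, a, kd)) ** 2 for c, y in zip(conc, shifts))

    # Coarse scan over log(kd) to bracket the minimum.
    lo, hi = math.log(kd_min), math.log(kd_max)
    pts = [lo + i * (hi - lo) / grid for i in range(grid + 1)]
    best = min(range(grid + 1), key=lambda i: sse(pts[i]))
    a, b = pts[max(best - 1, 0)], pts[min(best + 1, grid)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    kd = math.exp((a + b) / 2)
    return amplitude(kd), kd


# Hypothetical noise-free titration: K_D = 0.2 mM, bound-state shift 0.05 ppm.
sugar_mM = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
observed = [isotherm(s, 0.05, 0.2) for s in sugar_mM]
dd_bound_fit, kd_fit = fit_isotherm(sugar_mM, observed)
```

Separating the linear parameter (the bound-state shift) from the nonlinear one (K_D) reduces the problem to a one-dimensional search, which keeps the sketch free of external dependencies.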
Crystallization and Data Collection-Protein prepared as described above was subjected to clostripain (Worthington Enzymes) digestion and repurified by affinity and reverse phase chromatography as described previously (17). Lyophilized protein was redissolved in 10 mM NaCl and 10 mM CaCl₂ to 10-20 mg/ml, and the pH was adjusted to 7 by addition of dilute sodium hydroxide. Crystals of QPDWGH and QPDWGHV were grown at 20°C by hanging drop vapor diffusion by mixing equal volumes of protein with reservoir solutions containing 12-15% polyethylene glycol 8000, 100 mM Tris-HCl, pH 8.0, 20 mM CaCl₂, 10 mM NaCl, 0.02% NaN₃. Crystals typically grew in 5-7 days to a size of 0.3 × 0.3 × 0.2 mm. Prior to data collection, the crystals were adapted in a stepwise manner to reservoir solution containing 0, 5, 7.5, 10, 15, and 20% 2-methyl-2,4-pentanediol. The solutions all contained 200 mM GalNAc (Sigma) to form the monosaccharide complexes. Crystals were flash-cooled at 100 K, and diffraction data were measured on a RAXIS II imaging plate detector mounted on a rotating copper anode. Data were processed using DENZO and SCALEPACK (18).
Structure Solution and Refinement-Crystals of QPDWGH and QPDWGHV are nearly isomorphous with those of the galactose-binding MBP-A mutant QPDWG, with one trimer comprising the asymmetric unit (10). Structure solution consisted of rigid body refinement of the QPDWG mutant model against each of the data sets from QPDWGH and QPDWGHV. Water molecules, Ca²⁺ ions, and the side chain of residue 202 were omitted from the model for both data sets. Additionally, the side chain of residue 154 was omitted during the initial QPDWGHV refinement. Temperature factors from the QPDWG model were retained. The protomers were refined as individual rigid bodies against data from 10-4.0 Å and then from 10-2.8 Å. Positional and temperature factor refinement followed using data from 10-2.2 Å. Calcium atoms were added, and the omitted residues were built into the difference (Fo-Fc) electron density using the program O (19). From this point on, a maximum likelihood refinement target and all data from 30 Å to the high resolution limit were used (20). Water molecules were added, and positional and isotropic temperature factor refinement was carried out. A bulk solvent correction and an overall anisotropic temperature factor tensor were applied throughout. The GalNAc molecule was modeled only after the difference density allowed unambiguous positioning of the sugar ring. All calculations were performed using the program CNS (21). Data collection and final refinement statistics are given in Tables I and II.

RESULTS
GalNAc-specific Binding to Mutants of MBP-As a complement to previous studies of the hepatic and macrophage asialoglycoprotein receptors (7), four regions of the hepatic receptor found to be important for selective binding of GalNAc were introduced into the CRD of MBP (Fig. 1). The starting point for these studies was a mutant CRD (QPDWG) already containing three regions from RHL-1, which are sufficient to establish high affinity binding to galactose (9). Because the histidine at position 256 of RHL-1 is absolutely essential for highly selective binding of GalNAc, this residue (region 4) was introduced into the corresponding portion of mutant QPDWG in place of Thr 202 to create mutant QPDWGH. The effect of this change was to increase by 14-fold the relative ability of GalNAc to compete for binding compared with Gal (Table III). This result confirms the importance of this histidine residue in binding to GalNAc.
The higher affinity of the MBP derivative containing His 202 for GalNAc compared with Gal was independently verified using NMR to assess binding affinities directly. The affinities were determined by examining the perturbation of amino acid side chain resonances in the aromatic region of the one-dimensional proton NMR spectrum of the His 202 mutant with increasing concentrations of sugar. The measured K_D of 2.1 mM for Gal was comparable with that previously determined for the parental Thr 202 mutant (16). However, similar analysis of GalNAc binding provided a K_D value of 0.20 mM (Fig. 2). Thus, the NMR titrations provide independent evidence that the His 202 mutant displays more than 10-fold higher affinity for GalNAc than for Gal.
Examination of individual spectra obtained during the titration also provides information about the nature of the interaction between His 202 and GalNAc. As shown in Fig. 3, a well resolved resonance can be associated with H2 (Hε1) of the imidazole ring of this residue by comparison with the parental CRD. In the presence of increasing concentrations of sugar, the peak is largely unchanged, although there may be some broadening associated with immobilization at higher concentrations. However, it is difficult to quantify this effect because of increasing interference from the adjacent resonance that shifts upfield from 7.65 ppm with increasing sugar concentrations. Shifting of this adjacent resonance is also observed in both the parental and His 202-containing CRDs titrated with galactose.
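The arithmetic behind the "more than 10-fold" statement can be made explicit with the two measured dissociation constants. The short sketch below also converts the ratio into an approximate difference in binding free energy. That conversion is an illustrative addition rather than an analysis reported here, and the temperature of 298.15 K is an assumption, because the titration temperature is not stated.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_selectivity(kd_weak, kd_tight):
    """Ratio of dissociation constants (weaker ligand over tighter ligand)."""
    return kd_weak / kd_tight


def binding_energy_difference(kd_weak, kd_tight, temperature_k=298.15):
    """Free energy by which the tighter ligand is favored, in kcal/mol.

    temperature_k is an assumed value, not one taken from the experiments.
    """
    return R_KCAL * temperature_k * math.log(kd_weak / kd_tight)


# K_D values from the NMR titrations: 2.1 mM for Gal, 0.20 mM for GalNAc.
fold = fold_selectivity(2.1, 0.20)           # about 10.5-fold
ddg = binding_energy_difference(2.1, 0.20)   # about 1.4 kcal/mol at 298.15 K
```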
In cases where sugar protons are positioned above aromatic amino acid side chains in sugar-lectin complexes, there is often a ring current-induced broadening and shifting of the resonances (22). No such effects are observed for the methyl protons in the acetamido substituent of GalNAc in the presence of the His 202-containing CRD. This result suggests that there is no direct stacking interaction with the face of the imidazole side chain of His 202. Titration of the CRD from RHL-1 similarly fails to show broadening of this resonance (data not shown), providing further evidence for the similarity of the interaction of RHL-1 and the MBP mutant with GalNAc.

TABLE II. Data collection and refinement statistics, model geometry. All residues except Asn 206 of one protomer in QPDWGH + GalNAc, which lies in a poorly ordered turn, fall within the allowed regions of the Ramachandran plot. The GalNAc in protomer 1 of QPDWGH was modeled as a mixture of both the α and β anomer with occupancies of 0.65 and 0.35, respectively. A sodium ion was modeled in the QPDWGH + GalNAc structure but was not seen in the QPDWGHV + GalNAc structure. No other significant differences are observed among the different copies except in regions of lattice contacts. The side chains of His 99 and Met 103 in protomer 1 were modeled in two conformations in QPDWGHV + GalNAc.

Crystallographic Analysis of a GalNAc-specific MBP Mutant-The crystal structure of QPDWGH complexed with GalNAc was determined. The structure is essentially superimposable with the original QPDWG structure. The Thr 202 → His mutation produces significant changes only at the His 202 and Ser 154 side chain positions (Fig. 4). The hydroxyl group of Ser 154 in QPDWGH swings about the χ1 angle by 120° relative to its orientation in QPDWG and forms a hydrogen bond with His 202. The orientation of His 202 confirms that the increase in GalNAc specificity of this mutant results from a direct interaction between this residue and GalNAc. The imidazole ring is oriented so that the methyl group in the acetamido substituent of GalNAc makes van der Waals contact with Nε2 and Cε1 of His 202 (Fig. 5). The contact distances between the acetamido carbon atom and the Nε2 and Cε1 atoms are 3.9 ± 0.2 and 3.9 ± 0.1 Å, respectively (averaged over the three copies in the trimer), and compare favorably with the expected van der Waals distance between two carbon atoms or between a carbon and a nitrogen atom. Previous studies with a macrophage galactose receptor mutant that exhibits more than 20-fold enhanced binding of GalNAc compared with Gal have indicated that histidine at the position corresponding to His 256 of RHL-1 has a unique ability to support preferential binding of GalNAc (7). This residue has been mutated to a variety of different amino acids. Substitution of either alanine or glutamine at this position reduces preferential binding of GalNAc over Gal to less than 2-fold, whereas changes to asparagine, tyrosine, aspartic acid, and lysine actually result in lower affinity for GalNAc than for Gal. Glutamine, which is similar in length to histidine and contains amide and carbonyl oxygen functionalities, does not produce strong GalNAc selectivity, suggesting that hydrogen bonding is not involved and that the presence of Cε1 is essential.
Of the amino acids tested, glutamic acid comes closest to histidine, because it supports 4-fold tighter binding of GalNAc than Gal. The unique ability of histidine to elicit strong GalNAc selectivity can be explained by the fact that none of the other amino acids is isosteric with histidine, so none can form the favorable van der Waals contacts with the GalNAc acetamido group.
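Contact distances averaged over the three copies in the trimer, such as those reported above for the acetamido methyl carbon and the histidine ring atoms, can be summarized as in the following sketch. The coordinates are hypothetical placeholders rather than values from the refined models, and treating the quoted ± value as a sample standard deviation is an assumption, because the text does not define it.

```python
import math
import statistics


def contact_summary(atom_pairs):
    """Mean and sample standard deviation of inter-atomic distances.

    atom_pairs: one (xyz_a, xyz_b) pair of coordinates in Å per protomer.
    """
    distances = [math.dist(a, b) for a, b in atom_pairs]
    return statistics.mean(distances), statistics.stdev(distances)


# Hypothetical coordinates (Å) for the GalNAc methyl carbon and His Cε1
# in each of the three protomers; chosen only to give simple distances.
pairs = [
    ((0.0, 0.0, 0.0), (3.8, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 3.9, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 4.0)),
]
mean_d, spread = contact_summary(pairs)
```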
Role of Region 1 in GalNAc Selectivity-Among the three other regions of RHL-1 that affect GalNAc binding (7), region 1 shows the largest effect but is somewhat unusual in that most amino acids at position 208 of the hepatic receptor support selective binding to GalNAc. The major exception is valine, which is found at the corresponding position in the macrophage receptor. This substitution is primarily responsible for the different binding properties of the hepatic and macrophage receptors. The tolerance of this position for a wide range of side chains other than valine suggests that the effect of the valine might reflect alterations in the positions of other side chains, particularly His 256 in region 4, caused by the presence of a β-branched amino acid in region 1.
To provide evidence for this interpretation of the role of region 1, a further mutation in RHL-1 was created by substitution of isoleucine at position 208. This change results in substantial loss of affinity for GalNAc, because the K_I,Gal/K_I,GalNAc for this mutant is 1.9 ± 0.1. This value can be compared with 1.2 ± 0.1 in the presence of Val 208 and 60 ± 8 in the presence of Asn 208 (7). Combined with previous studies introducing various amino acid substitutions into region 1 of the macrophage galactose receptor, this result provides support for the suggestion that the presence of a β-branched amino acid at the position of Asn 208 in RHL-1 leads to loss of affinity for GalNAc. Threonine also has a β-substituent, but unlike valine and isoleucine, it has a hydroxyl group that could hydrogen bond to His 202 in a manner similar to that seen for Ser 154 in QPDWGH.

FIG. 2. Binding of GalNAc to QPDWGH mutant of MBP measured by NMR. Changes in the chemical shift of the resonance at 6.01 ppm in one-dimensional proton NMR spectrum (Δδ) of the mutant protein were used to detect binding of sugar. The experimental points are indicated as filled circles, and the fitted curve is shown as a solid line.
[Table row fragment, column headers not recovered: Arg 182-Pro-Gly 184; Thr 204-Thr-Asp-Gly-His 208; Ser 154; 2.5 ± 0.5]
The possibility that the corresponding portion of MBP might similarly influence the binding of GalNAc to QPDWGH, perhaps by affecting the position of the histidine side chain, was investigated by insertion of valine in place of the residue that most nearly corresponds to Asn 208 of RHL-1. The alignment of amino acid sequences shown in Fig. 1 suggests that there are some differences in the conformation of this region of the protein, because the segment connecting α-helix 2 and β-strand 2 in MBP is 1 residue shorter than the corresponding segment of RHL-1. However, Ser 154 of MBP, at the beginning of β-strand 2, appears to correspond most closely to Asn 208 of RHL-1. Changing this residue to valine to create mutant QPDWGHV results in a 3-fold loss of preferential binding of GalNAc (Table III). This finding is consistent with the suggestion that amino acid side chains in region 1 influence the ability of the histidine residue in region 4 to mediate higher affinity binding to GalNAc.
The crystal structure of the His 202/Val 154 mutant (QPDWGHV) was determined to probe the structural basis of the effect of Val 154 on GalNAc selectivity (Fig. 6A). In this mutant, the β-methyl substituent of Val 154 makes van der Waals contact with the Cγ, Nδ1, and Cε1 atoms of the His 202 imidazole ring, which rotates about its χ2 torsion angle by 25° to accommodate the Val 154 β-methyl group (Fig. 6B). Despite this rotation, the imidazole ring of His 202 is still able to make van der Waals contact with the GalNAc acetamido group (lines in Fig. 6A). These findings are consistent with the observation that although the specificity of QPDWGHV for GalNAc is reduced 3-fold (Table III), the presence of Val 154 does not completely abolish GalNAc selectivity. The basis of the reduced preference for GalNAc is unclear, because the van der Waals contact distances between GalNAc and His 202 are not significantly altered when Ser 154 is mutated to valine. Given the difference in the length of segments connecting α-helix 2 to β-strand 2 in MBP and RHL-1, it is possible that valine at position 208 in RHL-1 clashes more severely with His 256, resulting in a larger displacement of the imidazole ring than that observed in QPDWGHV.
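A rotation of a histidine side chain about its second side-chain torsion angle, as described above, can be quantified by computing the dihedral angle defined by the Cα, Cβ, Cγ, and Nδ1 atoms in each structure and taking the wrapped difference. The sketch below is a generic dihedral calculation, not the procedure used with the refinement programs in this study, and the coordinates are hypothetical points constructed to give torsion angles of 60° and 85°.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    axis = tuple(x / length for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


def rotation_between(angle_a, angle_b):
    """Signed difference angle_b - angle_a, wrapped into (-180, 180]."""
    diff = (angle_b - angle_a) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


def _ring_atom(theta_deg):
    # Hypothetical Nδ1 position giving a torsion angle of theta_deg.
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t), 1.0)


# Hypothetical Cα, Cβ, Cγ positions shared by both model conformations.
ca, cb, cg = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
chi2_a = dihedral_deg(ca, cb, cg, _ring_atom(60.0))
chi2_b = dihedral_deg(ca, cb, cg, _ring_atom(85.0))
rotation = rotation_between(chi2_a, chi2_b)
```

Wrapping the difference avoids spurious values near ±180°, where two nearly identical ring orientations can have torsion angles of opposite sign.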
Incorporation of two additional regions of RHL-1 into the macrophage receptor increases the relative effectiveness of GalNAc as a competitor for binding by roughly 3-fold (7). The introduction of regions 2 and 3 (Fig. 1) into MBP does not result in significantly increased selectivity for GalNAc (Table III). In fact, some loss in selectivity is observed in the presence of either a short or long version of region 3. This loss of affinity for GalNAc may reflect an indirect effect on the essential His 202 in region 4, which is immediately adjacent in the amino acid sequence. The lack of positive effects from inclusion of regions 2 and 3 suggests that these portions of the MBP mutants do not assume conformations that accurately mirror the corresponding segments of RHL-1. Alternatively, they may have indirect effects in RHL-1 that are mediated by residues not present in MBP.
Interactions of the N-Acyl Substituent-To compare the selectivity of the binding site in authentic RHL-1 with QPDWGH in region 4, the abilities of different N-acylated derivatives of galactosamine to compete for binding were assessed. The results for the two proteins are summarized in Table IV. The data for thio derivatives of the N-acetyl and N-propionyl substituents reflect the same relative preferences for side chain size previously observed for a series of simple N-acyl derivatives (7). Thus, the derivative with the larger thioacetyl substituent competes for binding to RHL-1 more effectively than does GalNAc and appears to bind at least as well as N-propionyl galactosamine. However, the thiopropionyl derivative, like the n-butanoyl derivative, binds less well than the N-propionyl derivative. These results define an optimal size for the 2-substituent in line with previous studies (7). The change of oxygen to sulfur in the acyl substituent has approximately the same effect as addition of a carbon atom. Steric effects could also explain why the still larger carboxypropionyl derivative is the least effective inhibitor, although it is possible that the negative charge adversely affects the interaction of this ligand with the protein.

FIG. 6. Ribbon representation of crystal structure of the QPDWGHV mutant of MBP complexed with GalNAc. A, stereo ribbon drawing shows the GalNAc binding site in the same orientation as in Fig. 4. The effect of Val 154 can be seen to cause a rotation of His 202 about the χ2 torsion angle as compared with the QPDWGH mutant (Fig. 4). The van der Waals contact is depicted as a set of parallel lines. B, closeup of Fig. 6A showing the His 202 side chain from QPDWGHV (in gray) superimposed on His 202 from the QPDWGH structure (in white). The relative rotation of 25° between the imidazole rings is apparent.
When the same derivatives were tested with the His 202 mutant of MBP, similar effects of size were observed. The thio derivatives show enhanced binding to the mutant MBP that closely parallels that observed for the natural RHL-1 binding site, whereas the carboxypropionyl derivative is again less effective. These comparisons provide evidence that the mutant MBP displays a restricted binding site for the 2-acylamido substituent of the ligand. The properties of this binding site are therefore similar to, although not exactly like, those of RHL-1. The apparently higher affinity of the propionyl derivative for RHL-1 is not observed in the modified MBP, indicating that the favorable interactions resulting from the presence of this intermediate-sized side chain must involve regions of RHL-1 that have not been incorporated into the MBP framework. However, the N-iso-butanoyl derivative is considerably less effective than the N-propionyl derivative as a competitor for binding to both QPDWGH and RHL-1. These results suggest that the larger group clashes with portions of QPDWGH, thereby limiting the size of the substituent that can be tolerated in a manner similar to RHL-1. Modeling suggests that although the N-propionyl derivative can be accommodated in the QPDWGH sugar-binding site, one of the terminal methyl groups of the larger N-iso-butanoyl derivative would clash with His 202 and Asn 210. It is possible to model an N-iso-butanoyl rotamer such that there are no steric clashes, but there would be an entropic cost to restricting the rotation of the butanoyl group. In the propionyl derivative, this methyl group would be absent and no unfavorable contacts would be expected, which is consistent with the binding data.

DISCUSSION
Several lines of evidence suggest that the modified GalNAc-binding CRD in mutant QPDWGH provides a useful model for the structure of the GalNAc-selective binding site in RHL-1. The preferential binding of GalNAc compared with Gal and the relative selectivity of the model CRD for different N-acyl derivatives parallel, qualitatively if not quantitatively, those of the natural GalNAc-binding CRDs. In addition, the absolute dependence on a histidine residue in region 4, combined with the absence of evidence of an aromatic stacking interaction with the N-acetyl group of GalNAc, indicates that the molecular interactions that stabilize GalNAc in the binding site are similar in the two cases.
The nature of this interaction is evident from the crystallographic analysis of the QPDWGH mutant. The increase in GalNAc selectivity in the mutant is the result of a direct van der Waals contact between the critical histidine and GalNAc. Furthermore, a nearby valine in the QPDWGHV mutant has been shown to influence this interaction because its β-methyl substituent directly alters the orientation of the histidine imidazole ring. Taken together, these results explain the ability of RHL-1 to bind GalNAc preferentially over Gal because of the presence of a histidine residue at position 256 and the fact that Asn 208 lacks a β substituent. In the macrophage galactose receptor, which also contains the critical histidine residue, a valine residue is present at the position corresponding to Asn 208 in RHL-1. The β-methyl group of this valine residue likely perturbs the ring orientation of the critical histidine residue and its van der Waals contact to GalNAc, leading to a substantial loss of GalNAc selectivity. However, the QPDWGHV mutant data cannot completely explain the loss of GalNAc selectivity in the macrophage receptor because, unlike the receptor, the mutant still exhibits 3-fold selectivity for GalNAc over Gal (Table III). The structure of this region of the protein may differ between MBP and RHL-1, or other regions of the receptor that have not been incorporated into the QPDWGHV mutant are required to eliminate GalNAc selectivity completely.
The preferential binding of acyl derivatives of certain sizes suggests the presence of a binding cleft that accommodates the 2-substituent of the ligand; greater stability is achieved up to a certain size, probably because of additional interactions between protein and ligand, but still larger side chains result in steric clashes that reduce affinity. The binding and NMR data provide some insight into the nature of the interactions that define this site. The fact that the thioacetyl derivative binds with higher affinity than the acetyl derivative argues against the presence of a hydrogen bond to the carbonyl oxygen in the GalNAc complex, because sulfur would be expected to be a weaker hydrogen bond acceptor than oxygen in this position. Combined with the NMR and crystallographic evidence that the binding does not involve packing of the methyl portion of the acetyl group against the face of the histidine or any other aromatic residue, these data indicate that the predominant interactions defining the binding cleft for the 2-substituent are simple van der Waals contacts with the edge of a histidine ring.