Structural Basis for Universal Corrinoid Recognition by the Cobalamin Transport Protein Haptocorrin

Background: Haptocorrin (HC) is a cobalamin (Cbl) transport protein known to recognize a wide range of corrinoids. Results: We solved the crystal structure of human HC in complex with Cbl and cobinamide. Conclusion: HC recognizes corrinoids by establishing distinct contacts with the corrin ring. Significance: Our findings complete the molecular details for corrinoid recognition by human Cbl transport proteins. Cobalamin (Cbl; vitamin B12) is an essential micronutrient synthesized only by bacteria. Mammals have developed a sophisticated uptake system to capture the vitamin from the diet. Cbl transport is mediated by three transport proteins: transcobalamin, intrinsic factor, and haptocorrin (HC). All three proteins have a similar overall structure but a different selectivity for corrinoids. Here, we present the crystal structures of human HC in complex with cyanocobalamin and cobinamide at 2.35 and 3.0 Å resolution, respectively. The structures reveal that many of the interactions with the corrin ring are conserved among the human Cbl transporters. However, the non-conserved residues Asn-120, Arg-357, and Asn-373 form distinct interactions allowing for stabilization of corrinoids other than Cbl. A central binding motif forms interactions with the e- and f-side chains of the corrin ring and is conserved in corrinoid-binding proteins of other species. In addition, the α- and β-domains of HC form several unique interdomain contacts and have a higher shape complementarity than those of intrinsic factor and transcobalamin. The stabilization of ligands by all of these interactions is reflected in higher melting temperatures of the protein-ligand complexes. Our structural analysis offers fundamental insights into the unique binding behavior of HC and completes the picture of Cbl interaction with its three transport proteins.

Cobalamin (Cbl; vitamin B12) is an essential micronutrient synthesized only by microbes. It is a cofactor for the enzymes methionine synthase and methylmalonyl-CoA mutase. Consequently, Cbl is indispensable for central metabolic reactions, including biosynthetic pathways of nucleotides, branched-chain amino acids, and odd-chain fatty acids (1,2). Therefore, Cbl deficiency has severe consequences, including anemia and impaired function of the nervous system (3). Cbl is a very complex metalorganic compound (4). It consists of a corrin ring with seven amide side chains (a-g), coordinating a central cobalt ion through four nitrogen atoms. The fifth coordination site on the lower axial part (α-side) of the molecule is provided by a 5,6-dimethylbenzimidazole (DMB) ribonucleotide moiety connected to side chain f, whereas the sixth coordination site (β-side) can be occupied by various ligands such as a cyano (CN), hydroxyl, methyl, or 5′-deoxyadenosyl group (see Fig. 1A).
A complex transport and uptake system has evolved to guarantee efficient assimilation of the minute amounts of Cbl present in dietary sources. Three proteins with similar structure mediate transport and distribution of Cbl in the human body by capturing the ligand with very high affinity (Kd ~ 6 fM) (5,6): intrinsic factor (IF), transcobalamin (TC), and haptocorrin (HC). Protein-bound Cbl is taken up by several receptors to cross cellular membranes (7-11). IF is responsible for the absorption of dietary Cbl in the intestine, and TC mediates Cbl transport from the blood to peripheral cells. HC binds Cbl in the upper part of the gastrointestinal tract and is responsible for the stomach passage (8). HC is present in secretions, including saliva, milk, and tears, but also in blood plasma (12). There, its physiological function is still unclear. A prominent feature of HC is its high content of glycans, which contribute 28% to the molecular weight. Depending on the source of synthesis (gastric mucosa, glands, granulocytes, etc.) (13), different glycoforms of HC have been described (14). In contrast, TC completely lacks glycans.
One striking difference between HC and the other two transport proteins is its ability to capture corrinoids other than Cbl, so-called Cbl analogs, with equally high affinity (15). Notably, HC even binds corrinoids whose nucleotide is missing (cobinamide (Cbi)) (see Fig. 1B) or not coordinated to the cobalt ("base-off" ligands) (6,16). This suggests that HC has a potential role as a scavenger protein in plasma, preventing the uptake of biologically inactive corrinoids into cells and clearing them through the hepatobiliary pathway (16,17). In fact, ~40% of the corrinoids in blood plasma are Cbl analogs (18,19). The exact nature and significance of these Cbl analogs in blood are still unclear, but interestingly, some studies found a correlation of high analog concentrations in blood with Cbl deficiency, neurologic abnormalities, and Alzheimer disease (12).
In the last 6 years, the crystal structures of human TC and bovine TC (bTC) (20), IF (21), and IF in complex with its receptor cubilin (22) have been reported. However, structural information on HC that reveals the molecular determinants of its unique ability to bind Cbl analogs is still missing. Based on homology models, it was predicted that the overall architecture of HC is very similar to the structures of TC and IF (23). The ability of HC to bind the baseless ligand Cbi was explained by the presence of three bulky residues, Arg-357, Trp-359, and Tyr-362 (23), in the binding pocket. It is thought that these residues facilitate the binding of Cbi to HC by compensating for the absence of the nucleotide at the α-side of the corrin ring. They may provide hydrophobic contacts with the apolar α-side of Cbi comparable with those observed for the DMB ribonucleotide in Cbl. In contrast to HC, TC and IF possess only one of these three bulky amino acids (Tyr and Trp, respectively) and thus cannot fully compensate for the hydrophobic contacts at the α-side of Cbi as proposed for HC. Still, the homology model did not provide an explanation for the ability of HC to bind a wide range of Cbl analogs modified at the α-side, including base-off ligands.
In this study, we set out to determine the crystal structure of HC to understand the molecular details of its unique corrinoid-binding behavior compared with the other reported Cbl transport proteins. To facilitate crystallization of the heavily glycosylated protein, we expressed recombinant human HC in HEK293 GnTI⁻ cells, thereby producing proteins with a uniform glycosylation pattern to facilitate subsequent deglycosylation. We describe the crystal structures of HC in complex with the ligands cyanocobalamin (CNCbl) and Cbi. Our results complete the picture of corrinoid transport by providing the molecular details of the last of the three human transport proteins.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-Recombinant human HC and TC (referred to below as HC and TC) and murine TC (mTC) were prepared as described previously (24,25). Briefly, HC was expressed in HEK293 cells, purified on a nickel-nitrilotriacetic acid affinity column, and subjected to size exclusion chromatography using a Superdex 200 column equilibrated in 100 mM HEPES (pH 7.5) and 20 mM NaCl. Purification tags were removed by thrombin (Sigma) cleavage. For crystallization, the protein was expressed in HEK293 GnTI⁻ cells, and additional steps were included in the purification procedure to increase the homogeneity of the protein sample. After thrombin cleavage and the second Ni2+ affinity column purification, the buffer was exchanged to 50 mM citric acid at pH 5.5 (12 mM citric acid and 38 mM sodium citrate) for deglycosylation overnight at room temperature on an orbital shaker using either endoglycosidase H (5 units; New England Biolabs) or endoglycosidase F1 (10 μg/mg protein; New England Biolabs). The sample was finally polished on a HiLoad™ 16/60 Superdex 200 column at a flow rate of 48 ml/h using HEPES-buffered saline at pH 7.4 (20 mM HEPES (pH 7.4) and 100 mM NaCl). The fractions with pure and untagged protein were pooled, concentrated to 36 mg/ml, and directly used for crystallization. Recombinant IF was kindly provided by Prof. Ebba Nexø (Aarhus University Hospital, Aarhus, Denmark).
Table 1 footnotes: Redundancy-independent R-factor, where ⟨I(hkl)⟩ is the average of symmetry-related observations of a unique reflection, and each reflection contribution is adjusted by a factor of √(N/(N − 1)), where N is the multiplicity (36); merging R-factor, which describes the precision of the averaged measurement (37); d, CC1/2 = percentage of correlation between intensities from random half-data sets, as defined by Karplus and Diederichs (35); e, r.m.s.d., root mean square deviation; f, as defined by MOLPROBITY (31).
Differential Scanning Fluorometry (DSF)-A real-time PCR cycler (Rotor-Gene Q 5plex HRM, Qiagen) was used for protein stability measurement by DSF. Stability measurements were performed in triplicate in PBS at pH 7.4. SYPRO Orange (Invitrogen) was used at a 5-fold final concentration. The final protein concentration in the samples was 1 μM, and the ligand concentration was varied between 0.25 and 10 μM. The final volume of the samples was 100 μl. The real-time PCR cycler was programmed to ramp the temperature from 25 to 95°C, collecting the data every 0.5°C, with an equilibration time of 2 s. Data were normalized to the peak fluorescence value of the protein in the absence of ligands. Tm values were determined using Rotor-Gene Q Version 2.0.2.4 software (Qiagen).
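The way a melting temperature and a thermal shift can be read from such a temperature ramp may be illustrated with a minimal sketch. It assumes that Tm is taken as the temperature of the steepest fluorescence increase (maximum of the first derivative), which is a common convention; the algorithm used by the Rotor-Gene Q software is not described here and may differ. The melting curves are synthetic sigmoids, and the Tm values of 50 and 65°C as well as all function names are hypothetical.

```python
import math

def sigmoid_melt(temps, tm, slope=2.0):
    """Synthetic two-state unfolding curve (Boltzmann sigmoid), scaled 0..1."""
    return [1.0 / (1.0 + math.exp((tm - t) / slope)) for t in temps]

def melting_temperature(temps, fluorescence):
    """Temperature at which the central-difference dF/dT is largest."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        d = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if d > best_slope:
            best_t, best_slope = temps[i], d
    return best_t

# Temperature ramp as in the protocol: 25 to 95 degrees C in 0.5 degree steps.
temps = [25.0 + 0.5 * i for i in range(141)]

# Hypothetical curves for illustration only (not measured data).
apo = sigmoid_melt(temps, tm=50.0)    # protein without ligand
holo = sigmoid_melt(temps, tm=65.0)   # protein with ligand

delta_tm = melting_temperature(temps, holo) - melting_temperature(temps, apo)
print(f"Delta Tm = {delta_tm:.1f} degrees C")
```

With a 0.5°C sampling interval, a Tm obtained this way is resolved only to the nearest data point unless the derivative peak is interpolated or the curve is fitted.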
Crystallization and Structure Determination-For crystallization, proteins were supplied with a 1.5-fold molar excess of CNCbl (Sigma) or dicyanocobinamide (Sigma) and concentrated to 0.8 mM (36 mg/ml). Proteins were mixed in a 1:1 ratio with the precipitant and crystallized by the sitting drop vapor diffusion technique at a temperature of 4°C. Hexagonal pink crystals appeared in 50% PEG 400, 0.2 M MgCl2, and 0.1 M sodium cacodylate (pH 6.5) after 5-7 days and reached their full size after 15 days. They belonged to space group P6₄, with one molecule in the asymmetric unit. Crystals were flash-cooled in liquid nitrogen without additional cryoprotecting agents.
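As a quick consistency check on the stated concentrations, the pairing of 0.8 mM with 36 mg/ml implies a molar mass of roughly 45 kDa for the crystallized protein. The short sketch below only reproduces this arithmetic; the function name is hypothetical, and the result is an implied value, not a measured mass.

```python
def implied_molar_mass(mass_conc_mg_per_ml, molar_conc_mM):
    """Molar mass (g/mol) implied by a mass and a molar concentration.

    mg/ml is numerically equal to g/l, and mM / 1000 gives mol/l,
    so (g/l) / (mol/l) yields g/mol.
    """
    return mass_conc_mg_per_ml / (molar_conc_mM / 1000.0)

# Values stated for the crystallization sample.
mass_g_per_mol = implied_molar_mass(36.0, 0.8)
print(f"Implied molar mass: {mass_g_per_mol / 1000.0:.0f} kDa")
```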
Diffraction data were collected at 100 K at beamline X06SA of the Swiss Light Source (Villigen, Switzerland). Data were processed and merged using XDS (26). The structure of HC-CNCbl was determined by molecular replacement with Phaser (27) using the individual domains of the structures of human IF (chain A, Protein Data Bank code 2PMV) and bTC (chain A, code 2BB6) as search models, after mutation of the side chains with CHAINSAW (25,28) and after removal of the flexible loops. The initial molecular replacement solution (RFZ (rotation function Z-score) = 5.4, TFZ (translation function Z-score) = 15.0, LLG (log-likelihood gain) = 205; RFZ = 3.9, TFZ = 12.2, LLG = 289) was first fitted by rigid body refinement, followed by automated model rebuilding with PHENIX (29). The resulting model was further completed by iterative cycles of refinement with TLS (translation/libration/screw) in PHENIX and manual rebuilding in Coot (30). The CNCbl ligand was manually placed in the difference map and included in the final rounds of refinement. The structure of HC-Cbi was determined by molecular replacement using the refined CNCbl model as template. Data collection and refinement statistics are provided in Table 1. The quality of the structures was analyzed with MOLPROBITY (31), and figures were prepared using PyMOL Version 1.5.0.4 (Schrödinger LLC).

RESULTS
Overall Structure of HC-Crystals of HC in complex with the two ligands CNCbl and Cbi (Fig. 1, A and B) were obtained, and the structures of the complexes were determined to 2.35 and 3.0 Å resolution, respectively. In both complexes, the ligands are buried at the interdomain interface. Here, we compared the HC-CNCbl structure with the reported structures of IF-CNCbl (21) and TC-H2O-Cbl (20). Despite the low sequence identity (<25%) to TC and IF (23), HC features a very similar domain architecture. It comprises two globular domains, the N-terminal α-domain (residues 1-287) and the C-terminal β-domain (residues 309-410), which are connected by a flexible linker (Figs. 1C and 2).
The β-domain consists of two almost perpendicular antiparallel β-sheets (β1/β2/β6/β7/β8 and β3/β4/β5) and an α-helix (α13) stacked in between (20). The tight hydrophobic packing is further stabilized by an additional fourth disulfide bridge (Cys-365-Cys-370), which is unique to human HC and is absent in both TC and IF. This disulfide bridge reduces the flexibility of the β5-β6 loop, which adopts a conformation similar to that observed in the structure of IF and which contains a two-residue insertion (Asn-373 and Asn-374) compared with TC (Fig. 1D). This two-residue insertion at the tip of the β5-β6 turn establishes a contact with helix α5 and the α5-α6 loop of the α-domain and confers additional stability to the complex. The contact comprises the side chains of Asn-109, His-113, Thr-116, Lys-152, Asn-153, Asn-373, Asn-374, and Arg-376 and the main chains of Asn-114, Leu-118, and Thr-119 (Fig. 1D), which form three direct and a series of water-mediated hydrogen bonds. In TC and IF, this interaction is less prominent due to a shorter β5-β6 loop (TC) and a shorter helix α5 (two turns shorter in IF and one turn shorter in TC), which is tilted away from the β-domain (Figs. 1D and 2). In TC, no direct hydrogen bonds are observed at this interdomain contact, and in IF, only one salt bridge (Lys-365-Glu-110) and one hydrogen bond (Lys-365-Ser-105) have been described (21).
Clear electron density for seven N-linked N-acetylglucosamine moieties is present in the HC-CNCbl structure at Asn-193, Asn-293, Asn-314, Asn-320, Asn-326, Asn-331, and Asn-346. In contrast to the crystal structure of human TC, the HC linker does not participate in crystal contacts over its full length and is largely disordered (residues 296-307). Linker residues 289-294 form a parallel β-strand interaction with strand β2 of a symmetry-related molecule (residues 321-325). Moreover, the N-glycan at Asn-293 is in contact with the symmetry-related N-glycans at Asn-326 and Asn-346. This explains why HC crystallized only upon deglycosylation, as longer branched glycans at these three sites would have prevented crystal formation (data not shown).
The relative orientation between the α- and β-domains in HC varies compared with TC and IF: the central axis of the α-domain of HC relative to the β-domain is tilted by 16° compared with TC and is both tilted and twisted by 12° compared with IF. These differences result in a tighter packing of the HC-ligand complex, which features a total buried surface area of 1491 Å² compared with 1166, 1254, and 1266 Å² for TC, bTC, and IF, respectively. Accordingly, the overall structure of HC superimposes on TC, bTC, and IF with root mean square deviations of
Binding of CNCbl to HC-The CNCbl molecule is bound at the α/β-domain interface, with the corrin ring plane oriented almost parallel to the central axis of the helical barrel (Fig. 1C). The cobalt ion is coordinated by the four nitrogen atoms of the corrin ring, the N3B atom of the DMB ring, and a CN molecule at the sixth coordination site. The hydrogen bonds formed with side chains a, c, d, and g of the corrin ring are generally conserved among the three transport proteins (Fig. 1, E-G, and Table 2). Main-chain amides and/or carbonyls of β-domain residues (Leu-381, Leu-388, Ile-363, and Trp-379) form hydrogen bonds with side chains c and d, whereas amino acid side chains of the α-domain (Asp-163, Asn-217, and Gln-266) interact with side chains a and g through hydrogen bonds. As observed for IF, HC does not form any direct hydrogen bonds with the b-side chain. The most striking differences among the three transport proteins are found in the contacts that are formed with side chains e and f. In addition to the two conserved hydrogen bonds formed by Thr-119 and Gln-123 with the e- and f-side chains, respectively, HC forms unique contacts with both the e- and f-side chains with the three residues Asn-120, Tyr-410, and Asn-373 (Fig. 1, E and G; Fig. 3, A and B; and Table 2), which are not conserved in TC and IF (Fig. 2).
Similar to TC and IF, the corrin ring is further stabilized by hydrophobic interactions with amino acid side chains, including Tyr-122, Phe-219, Trp-359, and Tyr-378 (Fig. 1, E and F).
At the α-side of the corrin ring, the β-hairpin residue Arg-357 forms a water-mediated hydrogen bond with the ring oxygen of the ribose moiety of CNCbl (Fig. 1E and Fig. 3, C and D). This interaction is unique to HC because this amino acid is not conserved in IF and TC (Fig. 2). Together with the side chains of Trp-359 and Tyr-362, it further stabilizes the DMB ribonucleotide moiety by hydrophobic contacts. The stabilization of the α-side is complemented by a series of water-mediated hydrogen bonds with the phosphoryl moiety formed by the main-chain amides of the β-hairpin and by Glu-71 and Gln-123 of the α-domain (Fig. 1E and Fig. 3, C and D).
At the tip of the β-1′-β-1″ hairpin, the side chain of Phe-156 laterally stabilizes the CN moiety of the ligand on the β-side by a hydrophobic contact. Moreover, one polyethylene glycol and three water molecules fill up the cavity of the β-side (Fig. 1F). In TC, a histidine (His-173 in human TC and His-175 in bTC) (20) coordinates the cobalt ion of Cbl (Fig. 3, E and F), whereas in IF, the sixth coordination site of Cbl is empty, devoid of any ordered water molecules (Fig. 3G) (21). In the absence of the CN moiety, we cannot exclude that the side chain of Phe-156 gets reoriented to coordinate the cobalt ion by a cation-π interaction.
Binding of Cbi to HC-To understand the molecular details that permit the binding of a baseless corrinoid to HC, we determined the crystal structure of HC in complex with Cbi (Fig. 1B). Compared with CNCbl binding, no differences are observed at the β-side. The corrin ring of Cbi is stabilized by the same interactions as described for CNCbl (Fig. 4A). Of particular importance are the interactions observed at the e-side chain of Cbi. In addition to a density observed for the CN moiety at the α-side of Cbi, additional density is present in the proximity of the CN moiety, which accounts for the presence of an alternate conformation of the e-propionamide side chain (Fig. 4B). This alternate conformation fills up the space otherwise occupied by the DMB ribonucleotide moiety and orients the e-propionamide ideally to form a hydrogen bond to the guanidine group of Arg-357.
DSF-We used DSF to analyze the ability of the ligands CNCbl and Cbi to stabilize HC, IF, TC, and mTC. mTC was included in these studies because it has been shown to recognize Cbi (32). Briefly, proteins were mixed with different concentrations of ligands, and thermal unfolding upon temperature increase was monitored. Whereas HC and mTC have been described to bind both ligands, IF and TC recognize only Cbl with high affinity (6,32). Reported kinetic binding studies observed similar affinities of the three human transport proteins for CNCbl (6). CNCbl binding induced very high thermal shifts (ΔTm) of >23.5 ± 0.04°C in HC and 26.3 ± 0.03°C in mTC (Fig. 5, A and G). Interestingly, much lower ΔTm values of 7.7 ± 0.2°C and 13.9 ± 0.2°C were observed for IF and TC, respectively (Fig. 5, C and E). ΔTm values were concentration-independent at ligand concentrations of >2 μM (Fig. 5, insets), which corresponds to a protein/ligand ratio of at least 1:2.
The baseless corrinoid Cbi induced significantly lower ΔTm values in all four proteins (Fig. 5, B, D, F, and H). Notably, no thermal shift could be detected upon binding to IF, even at a 10-fold excess of the ligand. Again, HC and mTC had very similar ΔTm values of 15.9 ± 0.3°C and 16.1 ± 0.1°C, respectively. Only a very low thermal stabilization of 3.1 ± 0.1°C was observed for TC.
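The thermal shifts reported above can be gathered in one place to make the comparison between proteins and ligands explicit. The following sketch simply tabulates the stated mean values and computes, for each protein, the fraction of the CNCbl-induced shift that is retained with Cbi. The data structure and function names are hypothetical; the HC value for CNCbl is a lower bound (>23.5°C), so the HC ratio is an upper estimate, and IF is recorded as having no detectable shift with Cbi.

```python
# Mean thermal shifts in degrees C as reported in the text.
# HC/CNCbl is a lower bound (">23.5"); IF/Cbi showed no detectable shift.
delta_tm = {
    "HC":  {"CNCbl": 23.5, "Cbi": 15.9},
    "mTC": {"CNCbl": 26.3, "Cbi": 16.1},
    "TC":  {"CNCbl": 13.9, "Cbi": 3.1},
    "IF":  {"CNCbl": 7.7,  "Cbi": None},
}

def cbi_fraction(protein):
    """Fraction of the CNCbl shift retained with Cbi (None if not detected)."""
    shifts = delta_tm[protein]
    if shifts["Cbi"] is None:
        return None
    return shifts["Cbi"] / shifts["CNCbl"]

def rank_by(ligand):
    """Proteins with a detectable shift for the ligand, largest shift first."""
    measured = {p: s[ligand] for p, s in delta_tm.items() if s[ligand] is not None}
    return sorted(measured, key=measured.get, reverse=True)

for protein in delta_tm:
    frac = cbi_fraction(protein)
    label = "not detected" if frac is None else f"{frac:.2f}"
    print(f"{protein}: Cbi/CNCbl shift ratio = {label}")
```

In this tabulation, HC and mTC retain roughly two-thirds of the CNCbl-induced shift with Cbi, whereas TC retains less than one-quarter.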

DISCUSSION
It is well known that HC is the least specific Cbl transport protein, binding to a diversity of Cbl derivatives and analogs (13,16). However, the molecular details that confer this broad ligand tolerability to HC are poorly understood. Our studies provide a rational explanation for the high affinity binding of HC to a variety of corrinoids.
DSF demonstrated that not only Cbi but also CNCbl binding resulted in higher thermal stabilization of HC and mTC compared with the Cbl-specific transport proteins IF and TC. These results corroborate our structural findings that corrinoid recognition by HC generally differs compared with IF and TC and that the important determinants are involved in the binding of both Cbl and Cbi. We identified several structural features that explain the ability of HC to bind a wide range of corrinoids.
Most important, additional hydrogen bonds formed by the non-conserved residues Asn-120 and Asn-373 with both the e- and f-side chains of the corrin ring stabilize even ligands with a missing or modified DMB ribonucleotide moiety. Whereas in TC the e- and f-side chains are stabilized by only one hydrogen bond each (Thr-134 and Gln-138), no hydrogen bonds with these side chains are found in IF (Fig. 6). The better stabilization of the e- and f-side chains by a protein is reflected in (i) the ability of corrinoid ligands to induce higher thermal shifts and (ii) a lower selectivity of the protein for Cbl.
Recent studies showed that HC is present in most mammals (33). Interestingly, in species lacking HC, including mice and lower vertebrates, a TC-like protein has been shown to be less specific for Cbl and to recognize Cbi and other corrinoids, similar to human HC (32-34). According to the sequence alignment (Figs. 2 and 4C), Asn-135 in mTC corresponds to Asn-120 in human HC and is in a position to form the analogous hydrogen bonds with the e- and f-side chains. Furthermore, Asn-120 is also conserved in the sole Cbl transport protein described for some fish species (Fig. 4C). This protein is able to bind Cbi and was named HIT because it features some characteristics of HC, IF, and TC (34). Indeed, not only Asn-120 but also the entire binding motif ...TNYYQ..., which forms four hydrogen bonds with the e- and f-side chains, is conserved in fish HIT and mTC (Fig. 4C). We conclude that conservation of this binding motif is indicative of binding proteins that are able to recognize corrinoids other than Cbl. HC forms two additional hydrogen bonds with side chain e through Asn-373 and Tyr-410, but these interactions are not found in the binding proteins of fish and mice.
In the case of the baseless corrinoid Cbi, additional structural features of HC may contribute to stabilization of the ligand. First, Arg-357 stabilizes the flip-in conformation of Cbi in HC with a hydrogen bond that cannot occur in the other transport proteins. Second, at the lower side of Cbi, the three bulky side chains of Arg-357, Trp-359, and Tyr-362 provide hydrophobic and polar contacts with the corrin ring and stand in for the lacking base. Although the three large hydrophobic residues Arg-357, Trp-359, and Tyr-362 have already been identified as a possible feature for binding to the nucleotide-lacking Cbi (23), the role of Arg-357 in stabilizing the flip-in conformation of Cbi was not predicted. However, the three bulky residues are not a prerequisite for Cbi binding because only one of them is conserved in mTC.
Finally, the relative α/β-domain orientation varies among HC, TC, bTC, and IF and leads to differences in how tightly the domains are packed. The larger buried surface area in HC accounts for more interactions and higher shape complementarity between the two domains and is another molecular determinant that leads to the observed large differences in thermal shifts for HC compared with the other human Cbl transport proteins. Based on our analysis, it is likely that the stabilization of the protein-ligand complex not only occurs through electrostatic interactions but is augmented by higher shape complementarity in HC and the interdomain contact between α5 and the β5-β6 loop.
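To put the difference in domain packing into numbers, the buried surface areas given under "Results" (1491 Å² for HC compared with 1166, 1254, and 1266 Å² for TC, bTC, and IF) can be expressed as relative increases. The sketch below is plain arithmetic on those reported values; the names are hypothetical.

```python
# Total buried surface areas (in square angstroms) as reported under "Results".
buried_area = {"HC": 1491.0, "TC": 1166.0, "bTC": 1254.0, "IF": 1266.0}

def percent_larger(reference, other):
    """By how many percent the reference area exceeds the other area."""
    return 100.0 * (buried_area[reference] - buried_area[other]) / buried_area[other]

for protein in ("TC", "bTC", "IF"):
    print(f"HC buries {percent_larger('HC', protein):.1f}% more surface than {protein}")
```

By this measure, HC buries approximately 18-28% more surface than the other three proteins.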
Here, we have reported the crystal structure of the third human Cbl transport protein, HC, which has the unique ability to bind a wide range of corrinoids. The function of this protein in blood plasma and secretions is still obscure. Knowing the molecular determinants that contribute to corrinoid recognition could help to improve our understanding of the complex transport system by which the human body mediates different fates for the valuable vitamin Cbl and its potentially harmful analogs.