A Single N-Acetylgalactosamine Residue at Threonine 106 Modifies the Dynamics and Structure of Interferon α2a around the Glycosylation Site*

Background: O-Glycans are known to modulate the biological activity and physicochemical properties of type I interferon. Results: A single sugar unit modulates the structure and dynamics of the protein. Conclusion: In this study, we propose an explanation of the effects of glycosylation on biological activity. Significance: The results will facilitate the study of any O-GalNAc glycoprotein. Enzymatic addition of GalNAc to isotopically labeled IFNα2a produced in Escherichia coli yielded the O-linked glycoprotein GalNAcα-[13C,15N]IFNα2a. The three-dimensional structure of GalNAcα-IFNα2a has been determined in solution by NMR spectroscopy at high resolution. Proton-nitrogen heteronuclear Overhauser enhancement measurements revealed that the addition of a single monosaccharide unit at Thr-106 significantly slowed motions of the glycosylation loop on the nanosecond time scale. Subsequent addition of a Gal unit produced Gal(β1,3)GalNAcα-[13C,15N]IFNα2a. This extension resulted in a further decrease in the dynamics of this loop. The methodology used here allowed the first such description of the structure and dynamics of an O-glycoprotein and opens the way to the study of this class of proteins.

IFNα2 is a polypeptide of 165 amino acids. Subtypes a and b differ only by one residue at position 23 (Lys to Arg). The three-dimensional structure was determined by Klaus et al. (2) using NMR spectroscopy in solution. The fold consists of five α-helices (labeled A-E), four of which are arranged in an antiparallel fashion to form a left-handed four-helix bundle. Two disulfide bonds stabilize the conformation. During the development of recombinant versions of IFNα2 from Escherichia coli for therapeutic use, Adolf et al. (3) discovered that three subtypes (a-c) of native IFNα2 are O-glycosylated at Thr-106. This residue is located in the CD loop at one end of the helix bundle and may not interact with the receptor. In fact, the recently reported structures of complexes between IFNα2a and its receptor by NMR (4) and x-ray (5) do not shed light on such possible interactions. This may explain the similar biological activity observed for both the recombinant IFNα2 produced in bacteria and the glycosylated version purified from human leukocytes. However, Loignon et al. (6) reported that O-glycosylated IFNα2b produced in HEK293 cells inhibits viral activity more than IFNα2b prepared in E. coli. This suggests that the glycan does modulate biological activity.
In general, glycosylation of serine/threonine residues of proteins plays important roles in protein localization and trafficking, protein solubility, antigenicity, and cell-cell interactions (7). Recent evidence suggested that the glycan at Thr-106 might not improve the stability of the conformation under thermal stress. Using far-UV circular dichroism and calorimetry to monitor thermal unfolding, Johnston et al. (8) showed that glycosylated IFNα2b is less thermostable by ~2°C than the nonglycosylated form. In contrast, the opposite effect was observed for glycosylated granulocyte colony-stimulating factor. The latter is more thermostable by ~5°C compared with the nonglycosylated E. coli-derived version (9). These reports thus suggest that glycosylation modulates conformational stability. It may also play a role in protein solubility and half-life in serum (10). In addition, IFNα2 with covalently linked PEG exhibits a significantly increased bioavailability. One injection per week of the N-terminally PEGylated version of IFNα2b achieved the same therapeutic result as three injections per week of the non-PEGylated protein (11). Moreover, a PEGylated version of GalNAcα-IFNα2b, in which the PEG moiety was linked to the sugar, had a significantly greater plasma concentration and half-life and 25-fold greater antiviral activity than E. coli-derived IFNα2b (12). The role played by the glycan in modulating these biophysical properties (thermostability and solubility) may be explained in terms of perturbations of the conformation and/or dynamics of the polypeptide chain. Therefore, we aimed to apply NMR spectroscopy techniques to study the effects of glycosylation of IFNα2a on its structure and dynamics.
The main challenge of using NMR spectroscopy lies in the production of glycoprotein samples enriched with stable isotopes such as carbon-13 and nitrogen-15, which are required to determine the three-dimensional structure and measure parameters sensitive to protein dynamics. Recently, a eukaryotic expression system based on human cell lines (HEK293) has been reported to produce high levels of glycosylated IFNα2b (6). Although isotope incorporation may be feasible using defined serum-free media with such eukaryotic expression systems, it is a very expensive endeavor. In addition, this approach produces glycosylated IFNα2b with a heterogeneous glycan composition. Adolf et al. (3) detected three major glycan structure variants in naturally produced IFNα2: the core-1 (or T antigen) structure Gal(β1,3)GalNAc and the latter substituted with either lactosamine linked to the GalNAc unit or sialic acid linked to the Gal moiety. Less abundant structures may incorporate fucose (3). Therefore, to address both challenges of isotope incorporation and glycan heterogeneity, we chose a strategy based on the production of labeled IFNα2a in E. coli, followed by an in vitro enzymatic stepwise addition of the various monosaccharide units. This approach, first proposed by Wong and co-workers (13), provides a flexible labeling scheme and facilitates the production of a homogeneous glycoprotein for NMR studies. A priori, one may consider making a fully 13C,15N-labeled O-glycoprotein to take full advantage of triple resonance NMR techniques. Because UDP-[13C,15N]GalNAc is not commercially available, its synthesis from the very expensive [13C,15N]GlcNAc (14,15) would be required. The many steps involved would further increase the final cost of producing a fully labeled sample. Instead, we opted for the preparation of a fully labeled polypeptide chain bearing unlabeled glycans. This labeling scheme was more practical in view of our intention to first study the effects of a single GalNAc residue on the protein structure and dynamics.
Purification of Labeled Glycoproteins-The reaction mixtures were desalted on a HiPrep 26/10 desalting column (GE Healthcare). The column was eluted at 5 ml/min with 1.5 column volumes of 25 mM NaOAc (pH 5.0). The desalted fractions were pooled and loaded on a 5-ml HiTrap SP HP column (GE Healthcare) equilibrated in 25 mM NaOAc buffer (pH 5.0). The material was eluted at 5 ml/min using a 0-0.8 M NaCl gradient over 16 column volumes. Fractions (2 ml) absorbing at 280 nm were analyzed by SDS-PAGE (15% acrylamide) to determine which fractions contained the labeled glycoprotein.
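As a rough check of the elution step described above, the gradient volume, run time, and number of fractions follow directly from the stated column size, flow rate, and fraction size. The Python sketch below works through that arithmetic. It assumes a linear gradient and ignores system dead volume, neither of which is stated in the text, and the function names are illustrative only.

```python
def gradient_plan(column_volume_ml, gradient_cv, flow_ml_per_min,
                  fraction_ml, start_molar, end_molar):
    """Summarize a salt gradient, assuming it is linear (an assumption)."""
    gradient_volume_ml = column_volume_ml * gradient_cv
    return {
        "gradient_volume_ml": gradient_volume_ml,
        "duration_min": gradient_volume_ml / flow_ml_per_min,
        "n_fractions": gradient_volume_ml / fraction_ml,
        "slope_molar_per_ml": (end_molar - start_molar) / gradient_volume_ml,
    }


def salt_at_volume(plan, start_molar, volume_ml):
    """Nominal NaCl concentration after volume_ml of a linear gradient."""
    return start_molar + plan["slope_molar_per_ml"] * volume_ml


# Values from the text: 5-ml HiTrap SP HP column, 16 column volumes,
# 5 ml/min, 2-ml fractions, 0-0.8 M NaCl.
plan = gradient_plan(5.0, 16, 5.0, 2.0, 0.0, 0.8)
print(plan)
print(salt_at_volume(plan, 0.0, 40.0))  # nominal concentration at mid-gradient
```

Under these assumptions the gradient spans 80 ml, runs for 16 min, and is collected in 40 fractions, so each 2-ml fraction corresponds to a nominal step of about 0.02 M NaCl.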
Structure Calculation-Structure calculations were performed using CYANA 2.1 (21). The amino acid library of CYANA had been modified to incorporate torsion angle parameters for an O-glycosylated threonine residue, where the hydroxyl oxygen of threonine was covalently attached to the reducing end (C1) of the GalNAc moiety in an axial orientation, resulting in an α-glycosidic bond. Cartesian coordinates of the GalNAcα-Thr residue were obtained by linking a threonine residue and GalNAc using the Chimera software package (22). A new CYANA entity was then created using the generated atomic coordinates and editing torsion angle definitions following the template for threonine (from the CYANA standard library) and the template for GalNAc. A total of 258 backbone torsion angles were derived from chemical shifts using TALOS+ (23). In addition, calculations were performed with 66 hydrogen bond constraints inferred from the sequential short- and medium-range NOE pattern (Table 1). Structure validation was performed using the following packages: the Protein Structure Validation Software Suite (PSVS; version 1.4) from the Northeast Structural Genomics Consortium (24), the structure validation program PROCHECK (25), and the structure validation software suites PDBsum (26) and WHATIF (27). No further refinement was necessary to obtain a well resolved NMR ensemble of structures. The NMR structure ensemble was deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics (RCSB), under code 2LMS. Visualization of the structure was performed with the Chimera software package (22).

RESULTS
Sample Preparation-The preparation of glycosylated IFNα2a was carried out in two steps. First, the polypeptide was produced with carbon-13 and nitrogen-15 enrichment in E. coli. Then, an unlabeled GalNAc unit was added enzymatically in vitro to the fully labeled folded protein. This labeling scheme has two advantages: the cost effectiveness of using unlabeled monosaccharide units and the possibility to use isotope-filtered NMR experiments that can distinguish between protein and glycan resonances.
The polypeptide contains 14 serine and 10 threonine residues, half of which are well exposed to the bulk water. Considering that there are no consensus sequences for the addition of O-GalNAc at Ser/Thr residues, the choice of the appropriate glycosyltransferase isoform was determined from the work of DeFrees et al. (12). The authors subjected a synthetic peptide (residues 100-113) covering the potential glycosylation site of IFNα2a to six isoforms of GalNAc-T. Analysis of the reaction products by matrix-assisted laser desorption ionization time-of-flight mass spectrometry suggested that GalNAc-T2 added a single GalNAc unit at Thr-106. The addition of the first unit of the core-1 glycan, also referred to as the Tn antigen, was quantitative as shown by SDS-PAGE (supplemental Fig. 1). The initial analysis of the glycosylation reaction product was carried out using two-dimensional 1H,15N HSQC experiments to verify that GalNAc-T2 added the GalNAc unit at Thr-106 only. Comparison of the proton-nitrogen NMR correlation spectra of GalNAcα-IFNα2a and non-glycosylated IFNα2a showed that the small glycan induced the largest chemical shift perturbations at the glycosylation site and surrounding residues (Fig. 1A). The overlay revealed that both Thr-106 and Thr-108 showed large chemical shift differences. Considering that amide chemical shifts are very sensitive to the smallest change in the local magnetic environment, the large differences indicated either that Thr-108 was highly perturbed by the glycosylation of Thr-106 or that both residues had been glycosylated. Fortunately, carbon chemical shifts are better probes to detect changes in the covalent structure of the side chains of amino acids after post-translational modification. The large change in chemical shift (+9.92 ppm) observed for the β-carbon of Thr-106 in the HNCACB and CBCA(CO)NH experiments confirmed this residue as the glycosylation site (Fig. 1B). By comparison, an average difference of only 0.17 ppm was measured for the β-carbon chemical shifts of all other Ser/Thr residues.
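The site assignment above amounts to looking for the one Ser/Thr residue whose β-carbon shift changes far more than the others. A minimal Python sketch of that comparison follows. The absolute chemical shifts, the residues labeled Ser-X and Thr-Y, and the 2 ppm cutoff are hypothetical placeholders chosen for illustration; only the +9.92 ppm change at Thr-106 is taken from the text.

```python
def cbeta_shift_changes(free_ppm, glyco_ppm):
    """Per-residue C-beta shift difference (glycosylated minus non-glycosylated)."""
    return {res: glyco_ppm[res] - free_ppm[res]
            for res in free_ppm if res in glyco_ppm}


def flag_modified(changes, cutoff_ppm):
    """Residues whose absolute C-beta change exceeds the cutoff."""
    return sorted(res for res, d in changes.items() if abs(d) > cutoff_ppm)


# Hypothetical C-beta shifts (ppm); placeholders, not measured values.
# Ser-X and Thr-Y stand in for the other Ser/Thr residues.
free = {"Thr-106": 69.80, "Thr-108": 70.10, "Ser-X": 63.50, "Thr-Y": 69.20}
glyco = {"Thr-106": 79.72, "Thr-108": 70.15, "Ser-X": 63.65, "Thr-Y": 69.40}

changes = cbeta_shift_changes(free, glyco)
flagged = flag_modified(changes, cutoff_ppm=2.0)  # cutoff is an assumption
print(flagged)
```

With any cutoff between the sub-ppm background and the roughly 10 ppm change at the modified residue, only Thr-106 is flagged, which is why the carbon data resolve the ambiguity left by the amide shifts.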
Resonance Assignment-Protein resonance assignment was carried out following the standard protocol based on the usual array of triple resonance NMR techniques. Resonance assignment of the GalNAc unit was carried out using 13C,15N-filtered experiments that select glycan (unlabeled) resonances by filtering out all carbon- and nitrogen-bound proton signals. Despite several attempts, direct observation of the GalNAc anomeric proton using filtered experiments was not successful. Spectra from filtered experiments did, however, show what could be attributed to a correlation between H1 and H2, from which we could obtain the chemical shift of H1, although the symmetrical correlation (H2 to H1) was not observed (supplemental Fig. 2A). Using a two-dimensional 1H,13C heteronuclear multiple quantum correlation experiment recorded over 5 days, it was possible to measure unambiguously the unique anomeric 1H and 13C chemical shift pair at natural abundance. Although this proton-carbon correlation resonates in a specific region of the spectrum that is sufficiently distant from the intense protein signals (supplemental Fig. 2B), this is not the case for the other proton-carbon correlations of the pyranose ring. These resonate at the edge of the fully labeled α-carbon region and are masked by the latter. The remaining GalNAc proton chemical shifts were assigned using two-dimensional 13C,15N-filtered NOESY (supplemental Fig. 2A).
Backbone Dynamics-The steady-state heteronuclear Overhauser effect between nitrogen-15 and proton was measured at two different field strengths to assess the effects of glycosylation on the protein backbone dynamics. The addition of a single GalNAc unit at Thr-106 resulted in a reduction of the backbone motions of the neighboring residues (Fig. 2), whereas the motions of all other residues were not significantly changed (supplemental Fig. 3). Elongation of the glycan chain with a galactose unit further reduced the motions of the loop surrounding the glycosylation site. Although the effects of the disaccharide on the dynamics were small, they were nevertheless significant. However, we did not pursue the determination of the three-dimensional structure of Gal(β1,3)GalNAcα-IFNα2a.
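The steady-state heteronuclear NOE for each residue is conventionally taken as the ratio of the peak intensity recorded with proton saturation to that recorded without it, and larger values indicate more restricted fast backbone motion. The Python sketch below shows this ratio with a simple propagated uncertainty and a comparison between two samples. The intensities, noise levels, and the two-sigma criterion are hypothetical; the text does not describe how the data were processed or how significance was judged.

```python
import math


def het_noe(i_sat, i_ref, noise_sat, noise_ref):
    """Return (NOE, uncertainty) for one residue from peak intensities."""
    noe = i_sat / i_ref
    # Relative errors of the two intensities add in quadrature for a ratio.
    err = abs(noe) * math.sqrt((noise_sat / i_sat) ** 2 + (noise_ref / i_ref) ** 2)
    return noe, err


def noe_change(free, glyco, n_sigma=2.0):
    """Difference (glyco minus free) and whether it exceeds n_sigma combined errors."""
    delta = glyco[0] - free[0]
    combined = math.hypot(free[1], glyco[1])
    return delta, abs(delta) > n_sigma * combined


# Hypothetical intensities (arbitrary units) for a single loop residue.
free = het_noe(i_sat=45.0, i_ref=100.0, noise_sat=2.0, noise_ref=2.0)
glyco = het_noe(i_sat=65.0, i_ref=100.0, noise_sat=2.0, noise_ref=2.0)
delta, significant = noe_change(free, glyco)
print(round(free[0], 2), round(glyco[0], 2), round(delta, 2), significant)
```

Applied residue by residue, a comparison of this kind would single out the loop residues whose NOE rises on glycosylation while leaving the rest of the sequence within error.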
Three-dimensional Structure-The ensemble of structures of GalNAcα-IFNα2a shown in Fig. 3A was calculated using NOE-derived distance constraints and backbone torsion angle constraints derived from chemical shifts. The calculations were carried out with a fully extended polypeptide chain as a starting point to eliminate the possibility of any bias. The ensemble of structures shows the same overall conformation as reported by Klaus et al. (2) for the non-glycosylated protein in solution (Protein Data Bank code 1ITF) with two differences. The first nine residues at the N-terminal end of GalNAcα-IFNα2a adopt a well defined pseudo-helical (twisted coil) conformation, whereas these residues are not as well defined in the IFNα2a structure. The glycosylation loop (residues 101-108) is also better defined in the GalNAcα-IFNα2a ensemble compared with the IFNα2a ensemble (Fig. 3B). The orientation of the GalNAc unit (Fig. 3C) results from the measurement of 20 distance constraints between the glycosylation loop residues and the methyl and anomeric protons of the GalNAc moiety. Among the sugar-protein NOE contacts, most constraints are measured between the acetyl group methyl and Val-105 on one side and between the anomeric proton and residues Glu-107 and Thr-108 on the other.

The resonance for Cβ of Thr-106 was significantly altered upon glycosylation, whereas that of Thr-108 was unchanged. Note that the Cβ resonance for Thr-106 is aliased and appears at the top portion of the strip (asterisk).

FIGURE 2. Plot of the normalized 15N{1H} NOE measured at 25°C and 700 MHz for [13C,15N]IFNα2a (red bars), GalNAcα-[13C,15N]IFNα2a (green bars), and Gal(β1,3)GalNAcα-[13C,15N]IFNα2a (black bars) plotted for residues 90-120. The decrease in mobility is prominent for residues 105, 106, 108, and 112 (green bars). It is noteworthy that although the nanosecond time scale motions of these residues were not further altered upon addition of the second saccharide moiety, residues 90 and 95 displayed a further decrease in mobility when the second saccharide unit was attached (black bars). For the complete plot, see supplemental Fig. 3.

DISCUSSION
The structure of IFNα2a is a five-helix bundle fold with mobile segments. These are the first eight N-terminal residues, residues 42-52, and the glycosylation loop comprising residues 101-108 linking helices 3 and 4. The dynamic behavior of the N terminus is somewhat surprising considering that the first residue in the sequence is a cysteine involved in a disulfide bridge with Cys-98 located in helix 3. In contrast, residues 22-44 adopt an elongated peptide stretch terminated by a one-turn helix that displays a similar rigidity to all α-helices. The presence of backbone motions produced an averaging of interproton distances, resulting in the measurement of fewer NOE-derived distance constraints (see Fig. 1a in Ref. 2). Thus, these mobile segments were not well defined (high root mean square deviation (r.m.s.d.)) in the ensemble of conformers shown in Fig. 3B. This dynamic behavior is probably an important factor responsible for the absence of coordinates for these residues in the x-ray structure of the IFNα2a-IFN receptor complex (5).
Attenuation of backbone dynamics was induced upon addition of a single GalNAc unit at Thr-106 (Fig. 2). A priori, this attenuation should reduce the averaging of interproton distances, thus facilitating the measurement of more distance constraints involving flexible segments. The ensemble of structures (Fig. 3A) determined in solution by NMR for GalNAcα-IFNα2a shows that the glycosylation loop and the N-terminal residues have a lower r.m.s.d. compared with the ensemble obtained for IFNα2a (Fig. 3B). Although glycosylation-induced rigidification of the mobile segment neighboring Thr-106 may have contributed to obtaining a better defined structure, it is most probably not the major factor. The total number of restraints used by the software CYANA to calculate the structure of GalNAcα-IFNα2a is almost double the number used for IFNα2a. Although the same three- and four-dimensional versions of NOESY experiments were recorded on both proteins, the use of cryogenic probe head technology in this study was certainly the major factor explaining the difference. Indeed, a plot of the number of restraints per residue for each amino acid in the sequence (Fig. 4) revealed that the N terminus and the glycosylation loop both show a similar increase in the number of restraints. Although it is not possible to tease apart the respective contributions of the more sensitive instrumentation and the glycosylation to this increase in the number of constraints, the latter probably benefited from the less mobile glycosylation loop.
A first glance at Fig. 3A suggests that the conformation of the GalNAc unit is not well defined. This is partly due to the fact that the glycosylation loop experiences mobility and is not as well defined as the helices. Therefore, we assessed the conformation of the GalNAc unit in relation to the protein backbone of tripeptide 105-107. This was accomplished by superimposing the backbone atoms of residues 105-107 (Fig. 3C) to use these atoms as a common reference for all conformers. Considering that only a few distance restraints were measured between the sugar (anomeric and methyl protons of the acetyl group only) and protons on the glycosylation loop, the conformation of the GalNAc unit was well defined with respect to the backbone of this loop (average r.m.s.d. of 1.3 Å). The orientation of the GalNAc moiety brings the amide proton of the acetyl group in proximity to the backbone carbonyl of Thr-106 with an average distance of 3.0 Å over all 20 structures. A similar conformation has been observed in synthetic O-GalNAc glycopeptides (28), suggesting that the acetyl group modulates the orientation of the GalNAc unit with the peptide backbone. In addition, the 3-hydroxyl, the attachment point of the next sugar, is pointing away from the protein, toward the bulk of the solvent. This indicates that the sugar unit adopts an appropriate conformation to allow recognition by glycoprotein-GalNAc 3β-galactosyltransferase (C1GALT1). This enzyme is responsible for elongation of the chain with galactose to make the core-1 glycan. Because a three-dimensional structure of C1GALT1 was not available, it was not possible to model the interaction.
The higher potency observed for glycosylated IFNα2b (6) triggered our interest to determine whether glycan-protein interactions could be present during the receptor-binding event. Using the structure of the ternary complex that includes both the α- and β-chains of the type I interferon α/β receptor (5), we substituted IFNα2a with our NMR structure of GalNAcα-IFNα2a into the complex to look for possible glycan-protein interactions (Fig. 3D). Before searching for potential interactions with the receptor, it is worth mentioning that polar (hydrogen bonds) and hydrophobic contacts stabilize carbohydrate-protein complexes. Usually, a residue with an aromatic side chain mediates the latter. One example of this can be seen in the structure of the complex between lectin II from Ulex europaeus and its Gal(β1,3)GalNAc ligand (29), where a tyrosine interacts with the hydrophobic side of the GalNAc residue. Examination of the receptor area susceptible to interaction with the GalNAc unit revealed that a number of side chain atoms of residues on the α-chain near the glycosylation loop of GalNAcα-IFNα2a were missing from the crystal structure of the ternary complex. Of particular interest, the side chain of the tryptophan at position 183 was missing, and it was added manually. The resulting model suggests that an interaction between the hydrophobic side of the pyranose ring of the sugar and Trp-183 is possible. The ideal geometry to properly position the hydrophobic face of the sugar on top of the aromatic ring is not present. However, a minor reorientation of the backbone would achieve this accommodation. In fact, among the 20 conformers of the NMR ensemble, the model was built using the one conformer that best positioned the sugar near the tryptophan side chain. We refrained from manually fitting the interactions, but a theoretical simulation might produce a more realistic model. In addition, the heteronuclear Overhauser enhancement data indicate that this loop does remain sufficiently mobile and could accommodate the appropriate conformation with a negligible energy penalty. Furthermore, this reorientation of GalNAc would still leave the 3-hydroxyl pointing away toward the solvent, thus indicating that the interaction between GalNAc and Trp-183 could exist with the core-1 glycan or longer chains. This interaction may also explain the higher potency of O-glycosylated IFNα2b versus the non-glycosylated version. The additional contact between the GalNAc unit and Trp-183 would provide an increase in affinity between the cytokine and its receptor. Therefore, additional sugar residues on the glycan that increase glycan complexity and heterogeneity would display a similar potency if the longer chains did not possess excessive dynamics or steric interference. Moreover, the lower thermostability observed by Johnston et al. (8) for O-glycosylated IFNα2b may be explained by intermolecular interactions involving hydrophobic surfaces of the protein exposed during the unfolding process and the hydrophobic faces of sugar units of the glycan. The absence of such carbohydrate-protein interactions in non-glycosylated IFNα2b might explain the higher melting temperature observed.
In summary, the application of the reported enzymatic synthesis approach and the simple labeling scheme utilized to produce GalNAcα-IFNα2a allowed the determination of a three-dimensional structure at high resolution and the study of the dynamics. It is noteworthy that a single GalNAc unit suffices to modulate the backbone dynamics near the glycosylation site. To our knowledge, this is the first example of the characterization of the structure and dynamics of an O-GalNAc glycoprotein in solution. In conjunction with the recently reported structures of IFN complexes, our results shed light on the biological role of glycans.