Eukaryotic N-Glycosylation Occurs via the Membrane-anchored C-terminal Domain of the Stt3p Subunit of Oligosaccharyltransferase*

Background: Stt3p is the catalytic subunit of oligosaccharyltransferase (OT) that catalyzes protein N-glycosylation.
Results: We report the first high resolution NMR structure of the acceptor-binding domain of yeast OT.
Conclusion: This work provides a structural basis for the function of the C-terminal domain of the Stt3p subunit.
Significance: Structure determination of this critical domain is an important step toward understanding the mechanisms of the eukaryotic N-glycosylation process.

N-Glycosylation is an essential and highly conserved protein modification. In eukaryotes, it is catalyzed by a multisubunit membrane-associated enzyme, oligosaccharyltransferase (OT). We report the high resolution structure of the C-terminal domain of eukaryotic Stt3p. Unlike its soluble β-sheet-rich prokaryotic counterparts, our model reveals that the C-terminal domain of yeast Stt3p is highly helical and has an overall oblate spheroid-shaped structure containing a membrane-embedded region. Anchoring of this protein segment to the endoplasmic reticulum membrane is likely to bring the membrane-embedded donor substrate closer, thus facilitating glycosylation efficiency. Structural comparison of the region near the WWDYG signature motif revealed that the acceptor substrate-binding site of yeast OT strikingly resembles its prokaryotic counterparts, suggesting a conserved mechanism of N-glycosylation from prokaryotes to eukaryotes. Furthermore, comparison of the NMR and cryo-EM structures of yeast OT revealed that the molecular architecture of this acceptor substrate-recognizing domain has interesting spatial specificity for interactions with other essential OT subunits.

due in the nascent polypeptide chain defined by the consensus sequon NX(T/S) (X ≠ P) (1,2). The transferred N-glycans play essential roles in modulating protein function in many critical processes, including development, inflammation, cancer, and the immune response (3,4). For many higher eukaryotes, protein N-glycosylation is a highly coordinated and complex process. In the most extensively studied eukaryotic system, Saccharomyces cerevisiae, OT contains nine non-identical integral membrane protein subunits: Ost1p-Ost6p, Stt3p, Wbp1p, and Swp1p (2). Among these, five subunits (Stt3p, Wbp1p, Swp1p, Ost1p, and Ost2p) are essential for the viability of the cell, whereas Ost3p and Ost6p are homologous interchangeable subunits (5). In contrast, a single protein chain defines the OT activity in prokaryotes, emphasizing the complexity of the glycosylation process in eukaryotes.
Although the detailed enzymatic reaction mechanism and the roles of the other subunits are not yet fully understood, multiple lines of experimental evidence suggest that Stt3p contains the catalytic site (6-8). As shown in supplemental Fig. S1, eukaryotic Stt3p is highly conserved. In fact, it is the most conserved subunit in the OT complex (6). The most convincing evidence that Stt3p is the catalytic subunit has come from the composition of the OT in some lower eukaryotes, such as trypanosomatids (9), as well as a few prokaryotes in which N-glycosylation occurs. In these organisms, OT is a single integral membrane protein homologous to the Stt3p subunit, such as the PglB proteins in the bacteria Campylobacter jejuni (10) and Campylobacter lari (11) or the AglB protein in the archaeon Pyrococcus furiosus (12). Further studies showed that when transferred into Escherichia coli or stt3-deficient yeast cells, these single-polypeptide OTs can either enable or complement N-glycosylation in these host cells (13,14). Moreover, it has also been demonstrated that Leishmania major Stt3p, when introduced into yeast, can replace the entire OT complex (15).
Despite the biological significance of N-glycosylation, structural investigation of eukaryotic OT has met with very limited success. The scarcity of structural knowledge significantly hampers our understanding of the enzymatic mechanism of the OT complex. To date, there are only three atomic resolution structural reports on eukaryotic OT subunits: the NMR structures of the minimembrane protein Ost4p from yeast (16) and humans (17) and the crystal structure of the N-terminal luminal domain of Ost3p/Ost6p (18). A low resolution structure (at 12 Å) of the yeast OT complex has also been determined by EM techniques (19). In comparison with eukaryotic OT, the structures of prokaryotic OT are relatively well understood. Two crystal structures of the soluble C-terminal domain of the prokaryotic homologs of Stt3p have been determined in succession: the AglB protein of the archaeon P. furiosus (20) and the PglB protein of the bacterium C. jejuni (12). Recently, the crystal structure of a full-length bacterial Stt3p homolog, the PglB protein of C. lari, was determined in complex with an acceptor peptide (21). However, sequence alignment reveals that the bacterial and archaeal Stt3p homologs show very limited sequence similarity to eukaryotic Stt3p.
We have previously reported the preparation and NMR assignments of the 274-residue C-terminal domain of Stt3p (residues 466-718 plus 21 His-tagged residues) in detergent micelles (22,23). Our results revealed that the C-terminal domain of yeast Stt3p is highly helical. Here, we present the NMR structure of this 31.5-kDa helical membrane protein fragment in detergent micelles. Our results demonstrate the feasibility of de novo structure determination of a medium-to-high molecular weight helical membrane protein fragment using solution NMR methodologies. Structural comparison of the acceptor substrate-binding site of yeast OT with that of its prokaryotic homologs suggests an evolutionarily conserved N-glycosylation mechanism. Furthermore, fitting the NMR structure of the C-terminal domain of Stt3p to the EM model provides insight into the interaction between Stt3p and other OT subunits.

EXPERIMENTAL PROCEDURES
NMR Experiments and Resonance Assignments-All spectra were acquired at 55°C using either a Bruker 600-MHz AVANCE spectrometer equipped with a triple-resonance cryoprobe in the Department of Chemistry and Biochemistry at Auburn University or a Varian Inova 900-MHz NMR spectrometer equipped with a cold probe in the Complex Carbohydrate Research Center at the University of Georgia. NMR data collected were subsequently processed using the NMRPipe program and analyzed using NMRView software (42).
Topology Determination-16-Doxylstearic acid (16-DSA) was used as the hydrophobic paramagnetic spin probe to determine the transmembrane domain of the C-terminal domain of Stt3p in SDS micelles. Titrations were performed by stepwise addition of 16-DSA over a concentration range of 0-2 mM to a constant amount of U-15N-labeled protein sample (0.1 mM). 1H,15N HSQC experiments were carried out using the same parameters, except for P1 (the 90° hard pulse) and shimming values. The peak intensities were measured at each titration point to assess the amount of paramagnetic-induced line broadening.
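The intensity comparison described above can be sketched as follows. This is a minimal illustration and not the analysis script used in this work: the function names, the 0.5 ratio cutoff, and the minimum segment length are assumptions chosen for the example, and the input is taken to be a mapping from residue number to HSQC peak intensity.

```python
from typing import Dict, List, Tuple


def intensity_ratios(reference: Dict[int, float],
                     with_probe: Dict[int, float]) -> Dict[int, float]:
    """Return I(with probe) / I(reference) for residues assigned in both spectra."""
    return {
        res: with_probe[res] / reference[res]
        for res in sorted(reference)
        if res in with_probe and reference[res] > 0
    }


def broadened_segments(ratios: Dict[int, float],
                       cutoff: float = 0.5,       # assumed cutoff, not from the text
                       min_length: int = 3) -> List[Tuple[int, int]]:
    """Group consecutive residues whose ratio is below `cutoff` into segments."""
    segments: List[Tuple[int, int]] = []
    run: List[int] = []

    def close_run() -> None:
        # Keep the current run only if it is long enough to count as a segment.
        if len(run) >= min_length:
            segments.append((run[0], run[-1]))
        run.clear()

    for res in sorted(ratios):
        if ratios[res] < cutoff:
            # A gap in residue numbering (e.g. an unassigned residue) ends the run.
            if run and res != run[-1] + 1:
                close_run()
            run.append(res)
        else:
            close_run()
    close_run()
    return segments
```

Applied to the spectra recorded without and with 2 mM 16-DSA, a function of this kind would return contiguous stretches of strongly attenuated residues; the cutoff would need to be tuned to the noise level of the actual data.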
Structure Calculation and Refinement-Structure calculation was done with CYANA 3.0. NOESY spectra were peak-picked and integrated interactively. Dihedral angle constraints were determined by TALOS+ (31). Structure calculation of the C-terminal domain of Stt3p was done in two steps. In the first step, the structure of each secondary structural element was determined to high resolution. The structures of these elements were then fixed by medium-range NOE restraints and relatively tight backbone dihedral angle restraints. In the second step, these segments were folded together with long-range NOE restraints. 100 structures were calculated, and 10 structures of the lowest total energy were used to represent their ensemble conformation. In the final calculated structures, the percentages of residues that reside in the most favored, additionally allowed, and generously allowed regions of the Ramachandran diagram are 60.5, 28.6, and 8.1%, respectively.
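The ensemble selection and the Ramachandran bookkeeping above amount to simple arithmetic, sketched here for illustration. The function names are hypothetical and this is not CYANA output handling; the only numbers taken from the text are the 100 calculated structures, the 10 retained structures, and the three reported percentages. The remainder computed from those percentages (2.8%) is an inference, presumably corresponding to residues outside the three reported classes.

```python
from typing import List, Sequence


def lowest_energy_indices(energies: Sequence[float], n_keep: int = 10) -> List[int]:
    """Indices of the n_keep conformers with the lowest total energy, best first."""
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    return order[:n_keep]


def remaining_percent(*reported: float) -> float:
    """Percentage of residues not covered by the reported Ramachandran classes."""
    return round(100.0 - sum(reported), 1)


# Reported classes: most favored, additionally allowed, generously allowed.
unreported = remaining_percent(60.5, 28.6, 8.1)
```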

RESULTS AND DISCUSSION
Structure Determination of the C-terminal Domain of Stt3p-De novo structure determination of membrane proteins remains a challenge. In x-ray crystallography, the presence of detergents often hinders sample crystallization (24), whereas in NMR spectroscopy, the slow tumbling of membrane proteins embedded in detergent micelles often dramatically broadens the resonance line widths due to their rapid transverse relaxation rates. The difficulty in structure determination for α-helical membrane proteins by NMR is even more striking in comparison with their β-barrel counterparts. This is mainly because, unlike the β-barrel membrane proteins, where the long-range interstrand NOEs are abundant and readily provide adequate restraints to define the global folds, there are typically very few long-range NH-NH NOEs for α-helical membrane proteins (25). Additionally, the NMR spectral dispersion for the amide region is very narrow for α-helical membrane proteins, which results in severe overlapping of resonances, making unambiguous assignments much harder. Consequently, to date, successful cases of α-helical membrane protein structure determination using NMR methods have been largely limited to proteins of low molecular weights (25-30).
To achieve unambiguous assignments, we have used a combination of NOEs from 13C,15N-double-labeled, partially deuterated (50%) triple-labeled, and uniformly 13C,2H,15N-triple-labeled samples, as well as an ILV methyl-protonated, otherwise uniformly 13C,2H,15N-triple-labeled sample, together with backbone dihedral angles from chemical shift analysis (TALOS+) (31). It is noteworthy that four-dimensional NOESY data proved to be very helpful in the unambiguous identification of many medium-range NOEs, which are characteristic of α-helices, together with many long-range NOEs. The detailed statistics of restraints used for the structure determination are provided in Table 1 under "Experimental Procedures."
Overall Structure and Topology of the C-terminal Domain of Stt3p-As presented in Fig. 1A, the C-terminal domain of Stt3p reveals a globally compact oblate spheroid-shaped structure, with a major axis of ∼680 Å and minor axis of ∼400 Å. This domain of Stt3p is primarily helical, containing 11 helices (α1-α11) and a disordered C terminus (Fig. 1B). These 11 helices are connected by short- or medium-length loops. This high helicity is consistent with our previous experimental data from far-UV CD and chemical shift index analysis (22,23). Although both the TALOS+ program and chemical shift index predict the formation of a β-strand for Arg-592-Trp-598, we failed to unambiguously assign supportive NOEs to confirm this prediction, presumably due to peak overlapping. In fact, the absence of a β-strand in the C-terminal domain of eukaryotic Stt3p has been predicted by some structure prediction programs (20).
The topology of full-length yeast Stt3p has been proposed to be Ncyt-Clum with 11 transmembrane helices and a soluble C-terminal domain (32). However, some prediction programs predict that there is a transmembrane domain near the middle of the C-terminal domain (22,33), whereas other programs predict that there is only a hydrophobic region at this position (32). Therefore, the detailed topology of the C-terminal domain of Stt3p remains controversial. In this study, we used a nonpolar paramagnetic probe, 16-DSA, to investigate the topology of the C-terminal domain of Stt3p with respect to the membrane. This nonpolar probe accelerates the relaxation of neighboring nuclear spins via dipole-dipole interactions for residues located in the transmembrane domain, thus reducing the intensities of their HSQC peaks (34). The HSQC spectra were acquired in both the absence and presence of 2 mM 16-DSA (supplemental Fig. S2). Further analysis showed that the addition of 16-DSA resulted in a significant reduction in the peak intensities for residues 488-504, 511-526, 539-551, and 566-582 (Fig. 2). From the structure calculated in this study, we observed that, except for segment 566-582, which partially protrudes outward, all other segments could form a hydrophobic pocket. This explains the resonance intensity reduction in those segments because the hydrophobic probe 16-DSA can fit into a hydrophobic pocket. Taken together with the results obtained from transmembrane prediction programs (22), it is likely that the C-terminal segment encompassing residues 566-582 (α5) and the adjoining loop residues are membrane-embedded. Because it has been established that the C-terminal domain of Stt3p is located on the luminal side of the endoplasmic reticulum (ER) (32), we concluded that this membrane-embedded segment only penetrates the lipid bilayer without spanning it completely and therefore is not a transmembrane domain. We propose that the protein domain under study is monotopic. This is similar to the case of bacterial peptidoglycan glycosyltransferase, which was shown to be a monotopic membrane protein (35). As shown in Fig. 3, this topology model is consistent with the structure presented in this study, in which the highly hydrophobic C terminus of α5 and the adjoining loop protrude into the lipid bilayer. Accordingly, combined with existing results, we propose that full-length Stt3p has an Ncyt-Clum topology with 11 transmembrane domains in its N-terminal domain and a luminal monotopic C-terminal domain (Fig. 3B).
We previously reported that different concentrations of detergent led to changes in the chemical shift positions of some residues in the 1H,15N HSQC spectra of the C-terminal domain of Stt3p (22). To confirm the topology of the C-terminal domain of Stt3p, we further analyzed the above data. Resonances in the HSQC spectra collected at different detergent concentrations were assigned (23). The effect of increasing detergent concentrations (50-200 mM) on the chemical shift positions of each residue in the protein is shown in supplemental Fig. S3A. When the residues that underwent significant chemical shift perturbations (≥0.05 ppm) were mapped onto the protein structure (supplemental Fig. S3B), most of them were found at the proposed interface region between the protein and the ER membrane (shown in blue), except for a few that are located in a peripheral helix (shown in red). These results support our topology model of the C-terminal domain of Stt3p. Additionally, the last transmembrane segment of yeast Stt3p is predicted to encompass residues 442-464 (32), which suggests that the N-terminal residue of the protein under study (residue 466) should be located close to the ER membrane. Our model is in agreement with the above studies, showing close proximity between residue 466 and the lipid bilayer (Fig. 3A).
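The thresholding step described above can be illustrated with a short sketch. The text gives only the 0.05 ppm cutoff; the combined 1H/15N perturbation formula and the 15N scaling factor of 0.2 used below are common conventions adopted here as assumptions, not necessarily the definition used for supplemental Fig. S3. Function names and the input layout (residue number mapped to a pair of 1H and 15N shifts in ppm) are likewise hypothetical.

```python
import math
from typing import Dict, List, Tuple

Shift = Tuple[float, float]  # (1H ppm, 15N ppm)


def combined_csp(d_h: float, d_n: float, n_scale: float = 0.2) -> float:
    """Combined perturbation sqrt(dH^2 + (n_scale * dN)^2); n_scale is an assumption."""
    return math.sqrt(d_h ** 2 + (n_scale * d_n) ** 2)


def perturbed_residues(low_detergent: Dict[int, Shift],
                       high_detergent: Dict[int, Shift],
                       threshold: float = 0.05) -> List[int]:
    """Residues whose combined perturbation between two conditions is >= threshold."""
    hits = []
    for res in sorted(low_detergent):
        if res not in high_detergent:
            continue  # skip residues not assigned in both spectra
        h0, n0 = low_detergent[res]
        h1, n1 = high_detergent[res]
        if combined_csp(h1 - h0, n1 - n0) >= threshold:
            hits.append(res)
    return hits
```

The resulting residue list is what would then be mapped onto the structure to see whether the perturbed positions cluster at the proposed protein-membrane interface.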

Comparison with the Crystal Structures of Prokaryotic Stt3p Homologs-Two crystal structures of the C-terminal soluble domain of prokaryotic Stt3p homologs have been reported previously: the AglB protein of P. furiosus (12) and the PglB protein of C. jejuni (20). Recently, the crystal structure of full-length PglB of C. lari was also determined (21). The C-terminal domain of these structures comprises mainly an α-helical "central core" domain and some β-sheet-rich domains either encircling or inserting into the central core domain. The central core domain contains the well conserved WWDYG motif and was therefore previously proposed as a catalytic domain (12,20). However, recent crystal structural studies on full-length Stt3p of C. lari demonstrated that the WWDYG motif is indeed involved in binding of the signature sequon NX(T/S) (X ≠ P) on the acceptor peptide, but not directly in the catalysis process (21).
Comparison of the solution structure of the C-terminal domain of yeast Stt3p with the structures of its prokaryotic homologs reveals two major differences. First, the C-terminal domains of prokaryotic Stt3p homologs are water-soluble, lacking membrane-embedded segments, whereas our data suggest that the C-terminal domain of yeast Stt3p contains an ER membrane-embedded region, namely the C terminus of α5 and residues located in the adjoining loop. We postulate that the anchoring of the C-terminal domain of Stt3p, which contains the acceptor-binding site, to the ER membrane brings it closer to its donor substrate, dolichol-linked oligosaccharide, which is also embedded in the ER membrane. This might increase the effective local concentration of the donor substrate and hence facilitate the N-glycosylation process. Second, the counterparts of the β-sheet-rich domains in AglB and PglB proteins are missing in the C-terminal domain of yeast Stt3p. It is thus reasonable to postulate that the C-terminal domain of Stt3p as a whole corresponds to the central core domain in the prokaryotic homologs, although it is larger and contains more helical elements. The function of these β-sheet-rich domains might be fulfilled by the other subunit(s) in the case of yeast OT.
Acceptor Substrate Binding Studies on the C-terminal Domain of Stt3p-As mentioned above, the WWDYG motif, which is conserved across the three domains of life, has been shown to be the acceptor substrate-binding site of OT (21). To investigate the interaction between the C-terminal domain of Stt3p and the acceptor substrate, NMR titration studies were carried out using the Asn-Asp-Thr-NH2 acceptor peptide, which contains the consensus N-linked glycosylation sequon. Binding results were evaluated by monitoring the changes in chemical shift positions in the 1H,15N HSQC spectra of the C-terminal domain of Stt3p as described previously (22). As shown in Fig. 4A and supplemental Fig. S4, the resonance positions of some residues were perturbed upon the addition of the acceptor substrate peptide, confirming the interaction between the protein and peptide. Mapping of these residues within the structure of the C-terminal domain of Stt3p revealed that they can be divided into two groups, either located in the peripheral loop regions (shown as blue spheres) or clustered around the WWDYG motif area (with the WWDYG motif and the two preceding Ala residues shown as red spheres and the adjacent residues shown as yellow spheres), as shown in Fig. 4B. Excluding the residues in the loop regions, the other perturbed residues were found to be part of a pocket that is formed around the WWDYG motif. On the basis of the NMR mapping studies of the C-terminal domain of Stt3p with the acceptor peptide, we suggest that this pocket is the acceptor substrate-binding site located in the luminal C-terminal domain of Stt3p. As reported previously, the apparent K0.5 for the interaction between the substrate peptide and the C-terminal domain of Stt3p is ∼10 mM (22). The weak interaction could be due to the fact that we are not working with the whole OT complex. Moreover, in vitro experimental conditions can never be the same as those within a cell. However, if the in vivo interaction is actually weak, it may provide the enzyme with fast association with and dissociation from the substrate, thus facilitating an efficient screening and glycosylation process of a nascent polypeptide chain.
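To give a sense of what an apparent K0.5 of ∼10 mM implies, the sketch below evaluates a simple single-site hyperbolic binding curve and recovers K0.5 from titration shifts by a grid search. The hyperbolic model, the assumption that free peptide is approximately equal to total peptide, and all function names are illustrative assumptions; the fitting procedure actually used is described in Ref. 22 and is not reproduced here.

```python
from typing import Optional, Sequence, Tuple


def fraction_bound(ligand_mm: float, k_half_mm: float) -> float:
    """Single-site occupancy L / (K0.5 + L), with ligand assumed in excess."""
    return ligand_mm / (k_half_mm + ligand_mm)


def fit_k_half(ligand_mm: Sequence[float],
               shifts_ppm: Sequence[float],
               k_grid_mm: Sequence[float]) -> Tuple[float, float]:
    """Grid search over K0.5; the maximal shift is solved by linear least squares.

    Returns (best K0.5 in mM, fitted maximal shift in ppm).
    """
    best: Optional[Tuple[float, float, float]] = None  # (sse, k_half, d_max)
    for k in k_grid_mm:
        x = [fraction_bound(l, k) for l in ligand_mm]
        denom = sum(v * v for v in x)
        if denom == 0.0:
            continue  # no information at this grid point
        d_max = sum(a * b for a, b in zip(x, shifts_ppm)) / denom
        sse = sum((d_max * a - b) ** 2 for a, b in zip(x, shifts_ppm))
        if best is None or sse < best[0]:
            best = (sse, k, d_max)
    if best is None:
        raise ValueError("no usable grid point")
    return best[1], best[2]
```

Under this model, half of the binding sites are occupied when the peptide concentration equals K0.5, so millimolar peptide concentrations are needed to observe appreciable chemical shift changes.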
Comparison of the Acceptor-binding Site of Stt3p and Its Prokaryotic Homologs-The limited sequence similarity between eukaryotic Stt3p and its prokaryotic homologs does not allow a meaningful sequence alignment. Because the architecture surrounding the WWDYG motif is highly relevant to understanding the mechanisms of N-glycosylation, structural comparison of these regions of Stt3p and its prokaryotic homologs should indicate how well the mechanism of N-glycosylation is conserved throughout the three domains of life.
As shown in Fig. 5 (A-D), despite very low overall amino acid sequence similarities, the structural architecture of the region surrounding the WWDYG motif of Stt3p is strikingly similar to that of its prokaryotic homologs: AglB protein from the archaeon P. furiosus and PglB proteins from the bacteria C. jejuni and C. lari. This observation suggests that OT shares a common mechanism of acceptor substrate binding in all three domains of life. In the binding sites of both eukaryotic and prokaryotic OTs (Fig. 5), the local helical structural feature of the WWDYG motif is strictly conserved, although the N-terminal part of the WWDYG motif in AglB adopts a rare left-handed helical conformation compared with the typical right-handed helix in the other three structures. Structural studies of PglB from the bacterium C. lari in complex with an acceptor peptide have revealed that the side chains of the two tryptophans in this motif directly interact with the β-hydroxyl group of Thr at position +2 of the acceptor sequon via hydrogen bonds (21). Additionally, an Ile residue, which is located adjacent to the WWDYG motif, was proposed to stabilize the interaction between the enzyme and the glycosylatable acceptor sequon through van der Waals contact (21). In our model, we also observed an Ile residue (Ile-572) in close proximity to the WWD residues (Fig. 5A). This observation is reminiscent of PglB of C. lari, suggesting that Ile-572 might play a similar role in the binding of an acceptor substrate. In addition, sequence alignment showed that Ile-572 is absolutely conserved from yeast to human (supplemental Fig. S1).
Fig. 3 legend (fragment): A, the C terminus of α5 and its adjoining loop (mesh) are proposed as the ER membrane-embedded region. Residue 466, which is known to be adjacent to the ER membrane, is depicted as blue sticks. B, topology of yeast Stt3p. The last transmembrane domain has been shown to be from residues 442 to 464 (32), suggesting that residue 466 is adjacent to the ER membrane.
Close comparison of the acceptor-binding sites of these different models revealed that the aspartate residue side chain orientation within the WWDYG motif is different. In both yeast Stt3p and the archaeon P. furiosus AglB, the side chain of this aspartate residue orients away from the two Trp residues (Fig. 5, A and B). In contrast, in PglB from the bacteria C. jejuni and C. lari, the side chain of the aspartate residue in the WWDYG motif points toward the two Trp residues (Fig. 5, C and D). This observation implies that the aspartate of the motif in Stt3p (Asp-518) and in AglB might play a different role from that in the bacterial homolog PglB, in which the corresponding aspartate, together with the two tryptophans of the motif (Asp-518, Trp-516, and Trp-517 in yeast numbering), directly interacts with the acceptor sequon via hydrogen bonds (21). Indeed, if Asp-518 in yeast OT also formed a hydrogen bond with the acceptor substrate, the OT activity would be expected to be largely unaffected upon its mutation to a glutamate residue because aspartate and glutamate have comparable ability to form hydrogen bonds with the acceptor substrate due to their similar side chains. However, it has been shown that the D518E mutation in yeast Stt3p results in a lethal phenotype, rendering the enzyme completely nonfunctional (6). We proposed a structural role for Asp-518 in our earlier studies (22). Further structural and functional studies are needed to clarify the exact role of Asp-518 in the glycosylation process of higher organisms.
Based on phylogenetic tree analysis and structural studies of AglB protein from the archaeon P. furiosus, it was postulated that the acceptor substrate-binding site of eukaryotic OT (which was tentatively proposed as the catalytic site of OT in that study) was formed by the WWDYG motif and the so-called DK motif (20). Our model reveals that the DK motif of Stt3p (Asp-583-Lys-586) is indeed located in close proximity to the WWDYG motif (Fig. 5A), in agreement with the above proposal. However, the spatial orientation of the side chains of the residues in the DK motif cannot be precisely positioned in our model due to the missing or weak NMR resonances for these residues. For the same reason, we are uncertain about the local secondary structure of the DK motif. However, chemical shift index analysis and TALOS+ prediction based on the neighboring residues showed that it is very likely that Asp-582-Glu-587 form a small helix.
Structural Insights into the Interaction between the C-terminal Domain of Stt3p and Other OT Subunits-Arguably, one of the most striking features of eukaryotic OT is its structural complexity. Yeast OT has two isoforms. Each isoform is composed of eight membrane protein subunits containing either Ost3p or Ost6p (36-38). In the absence of a high resolution structure of the whole OT complex, any information on the subunit-subunit interactions might help to understand the functional role of each subunit, as well as the mechanism of enzyme catalysis. The EM structure of the whole OT complex has been previously reported (19). Comparison of the NMR structure of the C-terminal domain of Stt3p with the corresponding region in the EM structure demonstrated a very good spatial fit despite the fact that protein expression and reconstitution methods were different (Fig. 6, A and B). Because the OT samples used for EM studies were enzymatically active, we conclude that the solution structure presented here represents the native fold of the C-terminal domain of Stt3p.
Besides the catalytic Stt3p subunit, the EM model also demonstrates the spatial orientations of Ost1p, the proposed acceptor substrate-recognizing subunit (39), and Wbp1p, the proposed donor substrate-recognizing subunit (40). Based on the above information, our structure reveals that α4, which is aligned approximately parallel to the interface region of Ost1p, is responsible for the primary contact between the C-terminal domain of Stt3p and the luminal domain of Ost1p (Fig. 6C). More specifically, Asp-534, Asn-536, Thr-537, Asn-540, Thr-541, Ala-544, and Lys-548, which are oriented toward Ost1p, are very likely responsible for interactions between these two subunits. Sequence alignment of the C-terminal domain of Stt3p from yeast to human showed that these residues are all highly conserved (supplemental Fig. S1), suggesting their biological importance. With the exception of Ala-544, these residues are polar and can form electrostatic interactions or hydrogen bonds with the residues of Ost1p at the interface, thus providing a hydrophilic environment in the groove formed between Stt3p and Ost1p. Further investigation will be needed to confirm the roles of those amino acid residues and to address the physiological significance of the hydrophilic properties of this proposed protein-conducting groove. Similarly, structural fitting of the C-terminal domain of Stt3p to the EM model allows us to identify the C terminus of α2 and the loop between α2 and α3 (loop L23) as the putative site for interactions with Wbp1p. Sequence alignment demonstrated that the C terminus of the α2 helix is highly conserved, whereas loop L23 is only partially conserved (supplemental Fig. S1). The presence of a disordered loop in the interface region between Stt3p and Wbp1p is in agreement with the dynamic feature of Wbp1p (19). This might provide a structural prerequisite for efficient N-glycosylation because the substrate of the eukaryotic N-glycosylation reaction has been shown to be in a flexible form (41).
In summary, we have presented the high resolution structure of the C-terminal domain of Stt3p, the proposed acceptor-binding domain of OT. Furthermore, considering the high sequence homology of eukaryotic Stt3p, determination of the first high resolution structure of the acceptor-binding domain of yeast OT is an important step toward understanding the mechanisms of the eukaryotic N-glycosylation process.