Structural basis for the glycosyltransferase activity of the Salmonella effector SseK3

The Salmonella-secreted effector SseK3 translocates into host cells, targeting innate immune responses, including NF-κB activation. SseK3 is a glycosyltransferase that transfers an N-acetylglucosamine (GlcNAc) moiety onto the guanidino group of a target arginine, modulating host cell function. However, a lack of structural information has precluded elucidation of the molecular mechanisms in arginine and GlcNAc selection. We report here the crystal structure of SseK3 in its apo form and in complex with hydrolyzed UDP-GlcNAc. SseK3 possesses the typical glycosyltransferase type-A (GT-A)-family fold and the metal-coordinating DXD motif essential for ligand binding and enzymatic activity. Several conserved residues were essential for arginine GlcNAcylation and SseK3-mediated inhibition of NF-κB activation. Isothermal titration calorimetry revealed SseK3's preference for manganese coordination. The pattern of interactions in the substrate-bound SseK3 structure explained the selection of the primary ligand. Structural rearrangement of the C-terminal residues upon ligand binding was crucial for SseK3's catalytic activity, and NMR analysis indicated that SseK3 has limited UDP-GlcNAc hydrolysis activity. The release of free N-acetyl α-d-glucosamine, and the presence of the same molecule in the SseK3 active site, classified it as a retaining glycosyltransferase. A glutamate residue in the active site suggested a double-inversion mechanism for the arginine N-glycosylation reaction. Homology models of SseK1, SseK2, and the Escherichia coli orthologue NleB1 reveal differences in the surface electrostatic charge distribution, possibly accounting for their diverse activities. This first structure of a retaining GT-A arginine N-glycosyltransferase provides an important step toward a better understanding of this enzyme class and their roles as bacterial effectors.

Salmonella enterica is an intracellular Gram-negative pathogen for which strains of the various serovars cause many diseases in humans and animals worldwide. In immunocompetent humans, non-typhoidal Salmonella serovars, including S. enterica serovar Typhimurium, typically cause self-limiting gastroenteritis (1). However, in immunocompromised individuals, non-typhoidal Salmonella serovars frequently cause an invasive disease that results in significant morbidity (2).
Following invasion or phagocytic uptake into the host cell, one of the key virulence determinants is the Salmonella pathogenicity island 2 (SPI-2)-encoded type III secretion system (T3SS), which delivers ~28 effector proteins into the host cell (3-5). A number of Salmonella effectors target and modify host proteins that have a role in mediating host inflammatory responses (3,6). Three highly related Salmonella effectors, SseK1, SseK2, and SseK3, orthologues of the enteropathogenic and enterohemorrhagic Escherichia coli T3SS effector NleB1 (7), translocate via the T3SS into the host cell (8). Similarly to NleB1 (9,10), SseK1 and SseK3 are N-acetylglucosamine (GlcNAc) transferases that modify the TNFR1-associated death domain protein TRADD and inhibit activation of the proinflammatory transcription factor NF-κB, as well as necroptotic host cell death (8). In in vitro experiments, SseK2 modifies the FAS-associated protein with death domain (FADD) (11), but despite detectable translocation into host cells, SseK2 is only able to inhibit NF-κB when highly overexpressed in 293ET cells (8). This suggests that a loss either of catalytic activity or of substrate interaction occurs after delivery of physiologically relevant levels of SseK2. Whether SseK2 has unidentified functions during infection that are mediated by different targets or a different sugar modification remains unknown.
Unlike mammalian O-linked GlcNAcylation, where GlcNAc is attached to the oxygen of the hydroxyl group of serine and threonine residues of numerous cytosolic and nuclear proteins, including those that mediate signaling (12), GlcNAcylation mediated by NleB from pathogenic E. coli or by the SseK bacterial effectors results in the addition of GlcNAc to arginine residues through an N-glycosidic linkage (9,10). This distinct modification can be recognized in mammalian cells using an antibody that does not detect O-linked GlcNAcylation (13). Both mammalian GlcNAcylation and bacterially mediated arginine GlcNAcylation involve the transfer of sugar from an activated uridine-diphosphate carbohydrate donor substrate, UDP-GlcNAc. NleB and SseK proteins have a conserved DXD motif, typical of the GT-A family of glycosyltransferases, in which the aspartate side chains are required for the coordination of a divalent metal cation necessary for enzymatic activity (14).
Although, to date, there are a number of structures describing O-glycosyltransferases and the O-GlcNAcylation catalytic event, the only known structure of a bacterial arginine-specific glycosyltransferase is that of the B pattern-type inverting glycosyltransferase EarP, which modifies translation elongation factor P (EF-P) by arginine rhamnosylation (15,16). Another study reported glucosylation of arginine by the sweet corn protein amylogenin, which mediates self-glucosylation in vitro (17), but there have been no follow-up reports.
Sequence analysis suggests a structure for the catalytic core of NleB1 protein similar to that of Photorhabdus asymbiotica tyrosine-glycosyltransferase protein toxin (18) and Clostridium difficile toxin A and toxin B (19,20), which glycosylate host Rho GTPases involved in the regulation of the host cytoskeleton, and the Legionella pneumophila Lgt1 that targets a serine residue of eEF1A, blocking protein biosynthesis (21). As well as the conserved DXD motif in SseK effectors, additional amino acids are conserved, including a tyrosine and glutamic acid residue, equivalent to Tyr-219 and Glu-253 in NleB1, both required for NleB1-mediated arginine GlcNAcylation of the death domain-containing protein, FADD (22). Arginine GlcNAcylation of target proteins is not observed in uninfected host cells and is irreversible by host enzymes (23) and therefore represents a potent bacterially mediated virulence mechanism. However, in the absence of structural information, the molecular details underlying arginine and UDP-GlcNAc selection are unknown.
This study presents the crystal structures of the N-glycosyltransferase SseK3 in its free form and bound to UDP, GlcNAc, and manganese. SseK3 adopts the classical GT-A glycosyltransferase family fold and is able to hydrolyze UDP-GlcNAc in the absence of a protein substrate. Mutational analysis of amino acids predicted to be important for substrate binding identified several conserved residues in SseK3 that are essential for arginine GlcNAcylation of TRADD and SseK3-mediated inhibition of TNFα-induced NF-κB activation, directly correlating enzymatic activity to virulence function. Structural analysis reveals the presence of an active site glutamate residue, which is conserved only in NleB and SseK proteins and not in other structurally related glycosyltransferases and is essential for enzymatic activity. The close proximity of this residue to the reactive anomeric carbon supports a double-inversion catalytic mechanism, in which an enzyme-sugar intermediate is stabilized for the nucleophilic attack of the acceptor arginine side chain of the host substrates. This is the first structure of a retaining arginine glycosyltransferase bound to a hydrolyzed form of its donor substrate, representing an important step toward the understanding of this class of effector proteins.

A divalent cation is necessary for binding of the ligand to SseK3
Mass spectrometry analysis of bacterially expressed full-length SseK3 revealed limited proteolysis, with the protein losing its first 13 and last 2 amino acids (data not shown). For the majority of T3SS effector proteins, the N-terminal 15-25 residues, often predicted to be unstructured, are required for translocation and remain uncleaved following delivery into the host cell cytosol (24,25). This region is frequently removed before crystallization, as in the case of the E. coli effector protease NleC (26), or disordered, as in the case of E. coli virulence factor NleE (27). We therefore prepared two forms of SseK3, 14-333 and 14-335, for subsequent structural and functional characterization.
Glycosyltransferases require the presence of a divalent cation for activity. To test the selectivity of SseK3 for divalent cations and UDP-glucose derivatives, we compared the ability of SseK3 to interact with Mg²⁺, Mn²⁺, and three UDP-carbohydrate derivatives (UDP-GlcNAc, UDP-Glc (glucose), and UDP-Gal (galactose)) by isothermal titration calorimetry (ITC) (Fig. 1, Fig. S1, and Table 1). No binding was observed for UDP-GlcNAc to SseK3(14-335) in the absence of a coordinating metal cation (Fig. S1). When the titration experiments were performed in the presence of MgCl₂ or MnCl₂, the binding affinity for UDP-GlcNAc in the presence of Mn²⁺ (Kd = 1.9 μM) was ~5-fold higher than that in the presence of Mg²⁺ (Kd = 10 μM) (Fig. 1 and Fig. S1). This binding affinity is similar to that observed for C. difficile toxin A for UDP-Glc and manganese (Kd = 11.4 μM) (28). The binding studies showed that ligand interaction is influenced by the presence of the two C-terminal residues in the longer SseK3 construct (Trp-334 and Arg-335), but both SseK3(14-333) and SseK3(14-335) bind UDP-Gal with a lower affinity than UDP-Glc and UDP-GlcNAc. UDP-GlcNAc has the highest affinity for SseK3(14-335), although it binds only marginally more strongly than UDP-Glc, and shows the largest increase in affinity due to the presence of the two C-terminal residues Trp-334 and Arg-335. Interestingly, the change in affinity appears to be associated with an increase in binding enthalpy and a loss of entropy, suggesting the occurrence of a structural reorganization upon ligand binding to SseK3 (Table 1).
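As a rough worked illustration of the affinities quoted above, a dissociation constant can be converted to a standard binding free energy with ΔG° = RT ln(Kd / 1 M). The sketch below assumes that the quoted Kd values are in micromolar, a temperature of 298.15 K, and a 1 M standard state. The temperature of the ITC runs is not stated in this section, and the function and variable names are hypothetical.

```python
import math

R_J_PER_MOL_K = 8.314462618  # molar gas constant, J/(mol*K)


def binding_free_energy_kj_per_mol(kd_molar, temperature_k=298.15):
    """Standard binding free energy, dG = RT ln(Kd / 1 M), in kJ/mol.

    The default temperature is an assumption, not a value from the text.
    """
    return R_J_PER_MOL_K * temperature_k * math.log(kd_molar) / 1000.0


# Kd values for UDP-GlcNAc quoted in the text, read here as micromolar.
kd_mn = 1.9e-6   # in the presence of Mn2+
kd_mg = 10e-6    # in the presence of Mg2+

dg_mn = binding_free_energy_kj_per_mol(kd_mn)
dg_mg = binding_free_energy_kj_per_mol(kd_mg)
fold_tighter = kd_mg / kd_mn   # ratio of the two dissociation constants
ddg = dg_mn - dg_mg            # negative: binding is more favorable with Mn2+

print(f"dG(Mn2+) = {dg_mn:.2f} kJ/mol, dG(Mg2+) = {dg_mg:.2f} kJ/mol")
print(f"Kd ratio = {fold_tighter:.2f}, ddG = {ddg:.2f} kJ/mol")
```

Under these assumptions, the roughly 5-fold difference in Kd corresponds to a free energy difference of only about 4 kJ/mol, which is one way to read the manganese preference as real but modest.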

Structure of SseK3
Crystals of SseK3(14-333) enriched with selenomethionine diffracted to good resolution, but most of them exhibited severe anisotropy, were non-isomorphous, and showed translational non-crystallographic symmetry. Attempts to solve the structure by molecular replacement using available glycosyltransferase X-ray structures or by selenium SAD phasing failed. Instead, we recorded data at the I23 long-wavelength beamline at Diamond Light Source to reveal the positions of the selenium and sulfur atoms within SseK3, based on the small anomalous signals present at long wavelengths, and so obtain an initial model (29). This model was used as a template in a molecular replacement search with a complete 2.21 Å resolution data set that did not exhibit severe diffraction anisotropy. The SseK3(14-333) crystal belongs to space group P2₁2₁2₁ with two molecules in the asymmetric unit (a.u.) (Table 2) that overlap with a root mean square deviation (RMSD) (Cα of residues 27-328) of 0.91 Å. The structure of apo-SseK3(14-333) is reported in Fig. 2A. [...] are not visible in the electron density for either chain, nor are amino acids 329-333 of chain B and the last serine residue (Ser-333) of chain A.

SseK3 is a retaining type-A glycosyltransferase
[...] replacement using the apo-structure. SseK3(14-335) crystals belong to the same space group as the shorter unbound construct, with two molecules in the asymmetric unit ([...]). The SseK3 catalytic core domain (~250 residues) shows the classic features of the large glycosyltransferase type-A family of enzymes (GT-A), consisting of a single module composed of a central parallel β-sheet core flanked by a number of α-helices (14). The structure contains an additional small domain, spanning residues 134-171, containing two extra α-helices that are here named the α-helical insertion (Fig. 2). This protruding region, especially the position of the residues in the loop 148-154, is differently tilted in the two SseK3 copies present in the crystallographic asymmetric unit, most likely due to non-identical crystal contacts.
Due to the length of the crystallization process and to the hydrolysis capabilities of SseK3(14-335), in the structure co-crystallized in the presence of UDP-GlcNAc and Mn²⁺, the ligand is hydrolyzed, with the UDP and N-acetyl α-D-glucosamine still present in the active site (Fig. 2B and Fig. S2). A similar observation was made for C. difficile toxB co-crystallized in the presence of UDP-Glc (19). A comparison with the apo-structure reveals that the C-terminal amino acids in the ligand-bound SseK3 fold back onto the ligand to produce a short segment of α-helix, with the last two residues Trp-334 and Arg-335 directly interacting with the UDP and GlcNAc (Fig. 3, B and C).
To investigate the conformation of this C-terminal region in the apo state, we recorded a 1D proton NMR spectrum of apo-SseK3(14-335). The spectrum has a downfield region, encompassing resonances from the backbone and side-chain amides and aromatic side-chain aliphatic protons that show an overall chemical shift dispersion typical of a folded protein (Fig. 3D). The spectrum has a surprising number of sharp resonances, including a signal at 10 ppm, absent in the spectrum of SseK3(14-333) (Fig. S2C), probably corresponding to the side-chain Hε1 of the C-terminal residue Trp-334. The sharp signals in the spectrum of the free protein are broadened when UDP-GlcNAc and MgCl₂ are added to the solution, suggesting a conformational rearrangement and loss of flexibility. Reorganization of the C terminus in the SseK3 structure upon ligand binding correlates both with the NMR downfield resonance collapse in the 1D proton spectrum and with the loss of entropy upon binding detected by the ITC experiments.

The active site
SseK3 contains the classic GT-A DXD motif (²²⁶DAD²²⁸) that coordinates the divalent manganese cation in an octahedral geometry (Fig. 3, A and B). Mn²⁺ interacts with the carboxylate oxygens of Asp-228 and Asp-325, the UDP diphosphate, and the hydroxyl group of Ser-327, as well as an ordered water molecule, which bridges, via hydrogen bonds, the metal atom and the carboxyl group of Asp-226 of the DXD motif (Fig. 3B). All of the metal-coordinating residues are conserved across this effector family apart from SseK3 Asp-325, which is an asparagine residue in the other family members (Fig. 3A). Most C-terminal residues participate in interactions that lock the phosphate groups in the correct orientation. Ser-333, Trp-334, and Arg-335 all interact via hydrogen bonds with the oxygen atoms of the diphosphate. The last residue visible in the structure of SseK3(14-333), Ser-332, is displaced by 17 Å from its position in the apo-structure to interact directly with a β-oxygen of the UDP (Fig. 3B).
The aromatic portion of the ligand is held in place by a series of interactions that include hydrogen bonds of the backbone amide and carbonyl of Phe-53 with the N3 and O2 of the ring and the π-stacking of the uracil by Trp-52 and Phe-190 (Fig. 3C). The ribose C2 hydroxyl group interacts with both the backbone carbonyl of Gln-51 and the hydroxyl group of Tyr-224, whereas the C3-OH group forms a hydrogen bond with the backbone amide of Ala-227.
The N-acetyl α-D-glucosamine in the active site is stabilized by a series of interactions that involve the anomeric C1 hydroxyl group, at a distance of 3.6 Å from the phosphorus atom to which it was bound, and the guanidine group of Arg-335, with hydrophobic interactions also established between the sugar and the aromatic ring of Trp-334 (Fig. 3C). The C2 acetyl group oxygen, absent in UDP-Glc, forms a hydrogen bond (3.13 Å) with the backbone amide of Gly-260. The same interaction can be seen between Gly-405 and the sugar acetyl group in the structure of the pneumococcal transferase GtfA in complex with UDP and GlcNAc (PDB code 4PQG) (33) and might contribute to the binding preference of SseK3 for UDP-GlcNAc compared with UDP-Glc. The structural position of Gly-260 is substituted by a bulky glutamine side chain in clostridial toxin A and toxin B (Fig. 3C), suggesting an explanation for the preference of these toxins for UDP-Glc (19,20). The GlcNAc C4 and C6 hydroxyl groups in the structure of ligand-bound SseK3 are locked in the active site by the side-chain carboxylate of Asp-191, which may explain why UDP-Gal, which has a C4 inverted stereochemistry compared with glucose, shows the lowest affinity for SseK3.
The UDP and GlcNAc present in the SseK3 catalytic pocket are in an arrangement similar to UDP-Glc bound to toxA and the hydrolyzed UDP and Glc in toxB (Fig. S2) (19,20). Unlike the clostridial toxins, the active site in SseK3 contains a glutamic acid residue, Glu-258, 3.2 Å away from the anomeric carbon. This residue is conserved in all SseK effectors and NleB but is replaced by an isoleucine in toxA, toxB, and LT (19,20,31). Previous data suggested an important role of this residue for NleB1 function (22).

Analysis of UDP-GlcNAc hydrolysis
The presence of hydrolyzed UDP-GlcNAc in the active site shows that, in the absence of a protein substrate, SseK3 can function as a hydrolase. To further explore the ability of SseK3 to hydrolyze UDP-GlcNAc, we used ³¹P and ¹H NMR spectroscopy. At time 0 (Fig. 4), the 1D ³¹P spectrum of UDP-GlcNAc shows two doublet signals for the Pα, at −13.06 ppm, and Pβ, at −11.41 ppm, of the diphosphate group of the UDP. A well-resolved quadruplet signal in the 1D proton spectrum of the intact UDP-GlcNAc at 5.43 ppm can also be observed for the anomeric H1 proton of the sugar moiety, resulting from the H1-H2 (3.56 Hz) and H1-Pβ (7.02 Hz) couplings (34,35). The addition of SseK3(14-335) results in a progressive reduction of both NMR signals with time. The ³¹P spectrum (Fig. 4A) shows the emergence of the Pα and Pβ doublet signals from UDP in a free form, whereas the quadruplet signal of the sugar C1 proton (Fig. 4B) gradually disappears, replaced by a rising doublet at 5.12 ppm corresponding to the same proton no longer attached to the β-phosphorus of the UDP. The doublet has a ³J(H1,H2) = 3.58 Hz, a value typical for an α-anomeric equatorial-axial stereochemistry for the sugar C1 (the β-form has a value around 7-8 Hz) (36,37). The chemical shift and proton-proton coupling constant of the glucosamine derived from the hydrolysis of the UDP-GlcNAc are identical to those in the spectrum of free N-acetyl α-D-glucosamine. Moreover, mass spectrometry analysis of the reaction mixture does not show SseK3 modification (data not shown), excluding the possibility of intramolecular N-GlcNAcylation.
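The splitting pattern and anomer assignment described above can be sketched numerically. The example below assumes ideal first-order coupling, in which a proton coupled to two different spin-1/2 nuclei gives four lines (a doublet of doublets), and uses a 5 Hz cutoff between "small" and "large" H1-H2 couplings. That cutoff and the function names are illustrative assumptions, chosen only to separate the ~3.6 Hz (α) and 7-8 Hz (β) ranges mentioned in the text.

```python
def doublet_of_doublets(center_hz, j1_hz, j2_hz):
    """Line positions (Hz) of a first-order doublet of doublets.

    Each coupling splits the signal by +/- J/2 around the center.
    """
    return sorted(
        center_hz + s1 * j1_hz / 2.0 + s2 * j2_hz / 2.0
        for s1 in (-1, 1)
        for s2 in (-1, 1)
    )


def anomer_from_j_h1_h2(j_hz, cutoff_hz=5.0):
    """Heuristic anomer call from 3J(H1,H2); the cutoff is an assumption."""
    return "alpha" if j_hz < cutoff_hz else "beta"


# Anomeric proton of intact UDP-GlcNAc: H1-H2 = 3.56 Hz, H1-P(beta) = 7.02 Hz.
# Offsets are given relative to the center of the multiplet (0 Hz).
intact_lines = doublet_of_doublets(0.0, 3.56, 7.02)

# After hydrolysis the phosphorus coupling is lost and only 3J(H1,H2) remains.
released_sugar_anomer = anomer_from_j_h1_h2(3.58)

print([round(x, 2) for x in intact_lines])
print(released_sugar_anomer)
```

Losing the H1-Pβ coupling collapses the four-line pattern to a simple doublet, which is the change the proton spectra are reported to show as hydrolysis proceeds.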
The UDP-GlcNAc hydrolysis proceeds with retention of the C1 chirality, suggesting that SseK3 is a retaining glycosyltransferase enzyme. Although able to hydrolyze the ligand in the absence of its target substrate, SseK3 does so slowly. There is a reduction of 36 ± 2% of the signal intensities in the ³¹P spectrum for the α- and β-phosphorus nuclei of the diphosphate group of the UDP-GlcNAc after 90 min at 30 °C. The peaks are still visible in the spectrum, albeit very close to the noise level, after the reaction is allowed to proceed for 12 h. UDP-GlcNAc hydrolysis is not observed with SseK3(14-333) (Fig. S3A) or in the absence of enzyme (Fig. S3B), indicating that together, the C-terminal residues Trp-334 and Arg-335 are necessary for SseK3 functional activity.
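To put the two reported time points on a common scale, one can assume, purely for illustration, that the loss of UDP-GlcNAc signal follows simple first-order decay. The text does not state a rate law, so the rate constant and half-life below are back-of-the-envelope estimates rather than measured values, and the function names are hypothetical.

```python
import math


def first_order_rate_constant(fraction_remaining, minutes):
    """Rate constant k (per minute) from one time point, assuming
    fraction_remaining = exp(-k * t)."""
    return -math.log(fraction_remaining) / minutes


def fraction_remaining_at(k_per_min, minutes):
    """Fraction of substrate left after `minutes` under first-order decay."""
    return math.exp(-k_per_min * minutes)


# Reported: 36% loss of the 31P signal intensity after 90 min.
k = first_order_rate_constant(1.0 - 0.36, 90.0)
half_life_min = math.log(2.0) / k

# Extrapolate to the 12 h time point mentioned in the text.
left_after_12h = fraction_remaining_at(k, 12 * 60)

print(f"k = {k:.5f} per min, half-life = {half_life_min:.0f} min")
print(f"fraction remaining after 12 h = {left_after_12h:.3f}")
```

Under this assumption only a few percent of the starting signal would remain after 12 h, which is at least qualitatively in line with peaks described as visible but close to the noise level.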

Structure-based analysis of SseK3 function
To test the role of active site residues in the enzymatic activity of SseK3 in a functional framework, multiple amino acid substitutions were tested for their ability to modify the SseK3 substrate TRADD as well as prevent NF-κB activation. We focused on amino acids from our structural analysis of SseK3 (Fig. 3) that are required for coordination of UDP and GlcNAc and hence enzymatic activity. In addition, we were particularly interested to test whether Glu-258 was critical for enzymatic activity, as this residue is only conserved in NleB and SseK glycosyltransferases and has been shown to be required for the function of NleB1 (22).
Individual SseK3 mutants were co-transfected into 293ET cells together with FLAG-TRADD, and following anti-FLAG immunoprecipitation, the GlcNAcylation of TRADD was analyzed by immunoblotting. As a control, transfected SseK3 K251A remained as active as wild-type SseK3. In contrast, and as expected, mutation of the manganese-coordinating DXD motif (D226A/D228A) ablated SseK3-induced arginine GlcNAcylation in cell lysates and the modification of immunoprecipitated TRADD (Fig. 5A) (8). The conserved glutamate (Glu-258, predicted to coordinate GlcNAc), which is present in SseK and NleB proteins but not in other structurally related glycosyltransferases, was also essential for arginine GlcNAcylation of TRADD. In addition, mutation of residues predicted to be required for coordination of uracil (W52A and Y224A) and GlcNAc (D191A, N259A, and W334A) also eliminated SseK3-induced protein arginine GlcNAcylation, and, more specifically, no modification of TRADD was detected (Fig. 5A). Transfected SseK3 R194A (predicted to disrupt GlcNAc binding) appeared inactive, but this mutant was not stably expressed, so this result should be interpreted with caution (Fig. 5A). Surprisingly, SseK3 with a mutation in the C-terminal residue Arg-335 (to alanine) retained the ability to modify numerous proteins, including TRADD. However, the pattern of GlcNAcylation was different, with significantly fewer proteins modified when compared with wild-type SseK3 (Fig. 5A).
Figure 5. [...] Arginine-GlcNAcylation of post-nuclear supernatants (input) and immunoprecipitated FLAG-TRADD was tested using anti-Arg-GlcNAc antibody. Expression of GFP-tagged SseK3 variants was tested using anti-GFP antibody in cell lysates. Antibodies to tubulin were used as a loading control. The immunoblots shown are representative of four independent experiments. B, 293ET cells were co-transfected with an NF-κB-dependent luciferase reporter plasmid, pTK-Renilla luciferase, and the indicated GFP-tagged SseK3 mutants for 24 h before overnight stimulation with 50 ng/ml TNFα. Luciferase activity was measured in cell lysates, and results are presented as fold activation relative to unstimulated controls expressing each SseK3 variant. Data are the mean ± S.E. of five independent experiments, for which individual data points are indicated. Cell lysates from B were analyzed by immunoblotting in Fig. S4. *, p < 0.05 (one-way analysis of variance). DXD corresponds to the SseK3 D226A/D228A mutant.
Next, to test whether enzymatic activity correlated with the ability of SseK3 to inhibit TNFα-induced NF-κB activation, variants of SseK3 were transfected into 293ET cells together with an NF-κB-dependent luciferase reporter plasmid. We predicted that Q51A would still be catalytically active, as it is the backbone carbonyl group and not the side chain that is involved in interactions with the ligand (Fig. 2C). Interestingly, whereas we were not able to detect arginine GlcNAcylation of TRADD, SseK3 Q51A was still a potent inhibitor of NF-κB activation (Fig. 5), suggesting that catalytic activity is retained but greatly reduced below the level of detection. The side-chain Nε of residue His-247 is 4.4 Å away from the oxygen of the GlcNAc acetyl group, yet a similar finding was observed for SseK3 H247A, with inhibition of NF-κB activation occurring without detectable modification of TRADD (Fig. 5). This suggests that very low and non-detectable levels of arginine GlcNAcylation are sufficient for SseK3-mediated inhibition of NF-κB activation, presumably due to the irreversible nature of the arginine modification. As the rest of the mutations in SseK3 that ablated the ability of the enzyme to arginine-GlcNAcylate TRADD (DXD, W52A, Y224A, D191A, E258A, N259A, and W334A) also caused an inability of SseK3 to inhibit TNFα-induced NF-κB activation (Fig. 5B and Fig. S4), these data reveal that enzymatic activity is required for SseK3 function.
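The reporter readout used in these experiments (fold activation relative to unstimulated controls, summarized as mean ± S.E. over independent experiments) can be sketched as follows. Normalizing the NF-κB reporter signal to the co-transfected Renilla signal is an assumption based on common dual-luciferase practice, since the text does not spell out the normalization. All numbers in the example are made-up placeholders, not data from this study, and the function names are hypothetical.

```python
import math
import statistics


def fold_activation(reporter_stim, renilla_stim, reporter_unstim, renilla_unstim):
    """Fold activation of a reporter after stimulation.

    Each reporter reading is first normalized to its Renilla control
    (assumed normalization), then the stimulated ratio is divided by
    the unstimulated one.
    """
    stimulated = reporter_stim / renilla_stim
    unstimulated = reporter_unstim / renilla_unstim
    return stimulated / unstimulated


def mean_and_standard_error(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical readings for one SseK3 variant across five experiments:
# (reporter_stim, renilla_stim, reporter_unstim, renilla_unstim)
placeholder_readings = [
    (800.0, 100.0, 200.0, 100.0),
    (900.0, 100.0, 300.0, 100.0),
    (1000.0, 100.0, 200.0, 100.0),
    (600.0, 100.0, 200.0, 100.0),
    (700.0, 100.0, 350.0, 100.0),
]

folds = [fold_activation(*reading) for reading in placeholder_readings]
mean_fold, sem_fold = mean_and_standard_error(folds)
print(f"fold activation = {mean_fold:.2f} +/- {sem_fold:.2f} (n = {len(folds)})")
```

Computing the fold change separately for each variant, against that variant's own unstimulated control, keeps differences in expression between mutants from being mistaken for differences in NF-κB inhibition.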

Discussion
Here we report the structure of the Salmonella arginine N-glycosyltransferase, SseK3, in its apo and ligand-bound forms. The protein belongs to the GT-A family of glycosyltransferase enzymes and binds the ligand in a metal ion-dependent manner via a DXD motif. The dynamic features of the last 15 residues of SseK3 are important, as they experience a large conformational change upon ligand and metal binding. As suggested by solution NMR spectroscopy, the C-terminal residues are disordered in the unbound SseK3 structure and undergo structural reorganization following primary substrate binding, in an open-close catalytic cycle that, as in the case of a number of other glycosyltransferases, is likely to support SseK3 enzymatic activity (38,39). Residue Trp-334 in the active site pocket occupies the same structural position as Trp-520 in the catalytic fragment of the clostridial lethal toxin in complex with UDP-Glc and manganese (PDB code 2VKD) (31), and it is conserved in toxin A (Trp-519) (20) and toxin B (Trp-520) (19) of the same organism (Fig. 6 and Fig. S2). A tryptophan residue (Trp-520) is also present in the active site of the L. pneumophila Lgt1 in complex with UDP-Glc and magnesium (PDB code 3JSZ) (21) and in a structurally equivalent position in the unbound form of the α-toxin from Clostridium novyi (PDB code 2VK9) (31). As for the other toxins, Trp-334 is a pivotal residue in the opening and closure of the catalytic cleft (40). In the ligand-bound SseK3, its Nε1 forms a hydrogen bond with the UDP Pβ oxygen, whereas the equivalent tryptophan residue (Trp-519) in the structure of toxin A hydrogen-bonds the glycosidic oxygen of the intact UDP-Glc (Fig. S2B) (31). Trp-334 is therefore in a position to stabilize the emerging negative charge of the former glycosidic oxygen transferred onto the UDP.
Once the enzymatic reaction has occurred, the opening of the catalytic site and the release of the UDP reinitiate the cascade; indeed, mutation of Trp-334 ablates the catalytic and functional activity of SseK3.
The presence of a hydrolyzed UDP-GlcNAc in the protein active site suggests that SseK3 can function as a hydrolase, a process observed also in the case of toxin A (28) and toxin B (19). The hydrolysis of UDP-GlcNAc, in the absence of target substrate, is likely to be a proxy for its physiological glycosyltransferase activity, and the release observed by NMR of an α-anomeric form of GlcNAc, as well as the presence of the same molecule in the enzyme active site, indicates that SseK3 is likely to be a retaining N-glycosylating enzyme.
Reaction mechanisms of retaining glycosyltransferases have been controversial, but it is generally accepted that they operate through the stabilization of an oxocarbenium-like transition state (41). Depending on the presence of a nucleophile in the active site correctly positioned on the β-face of the donor substrate, the reaction could proceed either via a front-face or a double-inversion mechanism (42). For the clostridial O-glycosylating toxins, in the absence of other available bases in the active site, the deprotonation of the acceptor nucleophile is carried out by the nucleotide Pβ oxygen. Following the formation of a transient oxocarbenium state, the departure of the leaving sugar O1 and the formation of the new glycosidic bond occur on the same side of the anomeric carbon, retaining its stereochemistry (31). In the double-displacement reaction, the presence of a nucleophile in close proximity to the anomeric carbon, directly under the sugar substrate β-face, stabilizes an oxocarbenium-like enzyme-ligand adduct. The reaction is then followed by a direct substitution by the target nucleophile on the opposite side of the anomeric C1, giving a double-inversion mechanism that retains the original carbon stereochemistry. This is the case for lysozyme (43) and the GT6 family member α3GalT (44). For these enzymes, a well-located nucleophile within the active site (Asp-52 for lysozyme and Glu-317 for α3GalT) is a necessary element for the creation of a stereochemically inverted intermediate set for nucleophilic substitution. Despite their structural similarity, the presence of a glutamate side chain in the active site of SseK3, 3.2 Å from the anomeric carbon, suggests that this enzyme could operate as a retaining GT more similarly to α3GalT than to the clostridial toxins.
The Glu-258 side chain is important for SseK3 catalytic activity and is a conserved active-site residue across the SseK effector family and in NleB1, substituted by uncharged residues in structurally related toxins (Fig. 6). Mutation to an alanine residue, similarly to the mutation of the equivalent residue in NleB1 (E253A), impairs the arginine GlcNAcylation ability of SseK3 and NleB1 without, for the latter, abrogating binding to the target protein (22). To date, the only available structure of an arginine-glycosylation enzyme is that of the Pseudomonas aeruginosa EarP, which catalyzes the transfer of a rhamnose molecule onto the bacterial translation elongation factor EF-P (16). EarP is an inverting glycosyltransferase belonging to the GT-B family and is structurally unrelated to SseK3. Interestingly, EarP has three negatively charged residues in the active site (Asp-13, Asp-17, and Glu-273), in close proximity to the sugar, that have been shown to be important for catalytic activity and to play a crucial role in the stabilization of the positive charge of the acceptor guanidino group. Mutation of those residues to alanine, as for the E258A mutation in SseK3, severely impairs EarP enzymatic activity without disrupting substrate binding (16).
Figure 6. Glycosyltransferase structural alignment and SseK1, SseK2, and NleB1 homology structural models. Top, structural alignment of SseK3 with C. sordellii LT (PDB code 2VKD, 19% sequence identity), C. difficile toxin A (3SRZ, 17%), toxin B (2BVL, 15%), L. pneumophila Lgt1 (3JSZ, 11%), and P. asymbiotica toxin (4MIX, 19%). The catalytically important Glu-258 residue is marked with an asterisk in the alignment. Bottom, solvent-accessible surface representation colored according to the electrostatic potential (blue, positive; red, negative) of the structure of SseK3(14-335) bound to its UDP-GlcNAc ligand and homology structural models of SseK1 (55% sequence identity with SseK3), SseK2 (72%), and NleB1 (53%). Ligand bound in the active site is highlighted.
We therefore propose a catalytic mechanism in which the glutamate side chain of Glu-258 acts as the intramolecular nucleophile in the first step of a double-displacement mechanism, forming an SseK3-GlcNAc intermediate state primed for nucleophilic attack on C1, from the sugar α-side, by the guanidino group of the target arginine. Glu-258 could aid in the selection of the arginine as the N-GlcNAcylation site and, similarly to the phosphate role in the clostridial toxins, increase the nucleophilic character of the acceptor by deprotonation of its side chain. A double-displacement mechanism, where the C1-O1 bond is broken before the nucleophilic attack, could also be favored by the presence of the guanidino group of Arg-335 tucked into the SseK3 active site. Arg-335 sits above the sugar O1, obstructing access to the sugar α-side and sterically hindering a direct front-face reaction.
In SseK3, the well-located Arg-335 side chain hints at a potential intramolecular GlcNAcylation process. However, in vitro auto-GlcNAcylation of SseK3(14-335) is unlikely, as NMR and mass spectrometry experiments only detected unbound glucosamine following UDP-GlcNAc hydrolysis by SseK3. Nevertheless, a specific role of this residue in SseK3 function cannot be excluded. In the E. coli MurG, an O-GlcNAc GT-B family glycosyltransferase, the presence of a positively charged arginine residue (Arg-261) in the active pocket provides extra stability for the negatively charged UDP emerging from the catalytic reaction (45). In the structure of the UDP and GlcNAc-bound GtfA, an O-GlcNAc GT-B, an active-site arginine residue (Arg-328), crucial for catalytic activity, forms a strong hydrogen bond with the Pβ oxygen, stabilizing the negative charge generated from the processing of the substrate (33). The active site of EarP has an arginine residue (Arg-271) whose guanidino Nε1 is 2.5 Å from the phosphate oxygen; this residue participates in the stabilization of the enzymatic products, and its mutation to alanine reduces EarP glycosylation activity (16). In SseK3, Arg-335 is tucked in the active site at 2.62 Å from the sugar O1 and 3.24 Å from the β-phosphate oxygen, and its mutation R335A induced a severely reduced arginine-GlcNAcylation pattern compared with wild-type SseK3 (Fig. 5). As for EarP, this could reflect a decrease in enzymatic activity due to a decreased stabilization of the enzymatic products. Alternatively, as this residue is only conserved in SseK2, which like SseK3 localizes to the Golgi network during infection (8), but is substituted by an alanine in SseK1 and absent in NleB1, it could be indirectly involved in substrate specificity via reduced Golgi network association.
During infection of macrophages, similarly to their E. coli ortholog, SseK1 and SseK3 both function to mediate inhibition of NF-κB activation as well as inhibition of an inflammatory necroptotic host cell death (8). The structure of SseK3 bound to the hydrolysis products of its donor substrate revealed which amino acids are necessary for ligand binding and hence arginine GlcNAcylation of TRADD and SseK3-mediated inhibition of NF-κB activity. In addition to the ion-coordinating DXD motif, previous mutational analysis of NleB1 identified residue Tyr-219 as required for function (22). This amino acid is conserved in SseK proteins (Tyr-224) and is important for the coordination of the uracil of the UDP and therefore necessary for both SseK3 catalytic activity and function.
SseK effector proteins share a high sequence identity, but most of the conserved residues are those in the catalytic domain (Fig. 1 and Fig. S5). Sequence variability is mainly concentrated around residues in the α-helical insertion, suggesting that this region might direct substrate specificity. Despite its similarity to the other effectors, SseK2 does not show activity in cell-based assays (11). However, purified SseK2 can arginine-GlcNAcylate FADD in vitro, without forming a stable interaction, suggesting that this enzyme is functional (11). As all residues tested that are required for enzymatic activity in SseK3 are conserved in SseK2, it is not immediately clear why SseK2 is not active after translocation into host cells. In the absence of structural information for the other members of the SseK family, we created homology models of SseK1, SseK2, and NleB1 based on the SseK3 crystal structure (Fig. 6). The distribution of surface charges shows that both SseK1 and NleB1 have the active site surrounded by acidic residues, whereas the surfaces of SseK2 and SseK3 appear to be more similar, with 72% sequence identity, having the ligand site surrounded by larger positively charged areas compared with SseK1. Also, as the C-terminal residue in both SseK2 and SseK3 is a positively charged residue (Arg-348 and Arg-335), substituted by a smaller side chain in SseK1 (Ala-332) and absent in NleB1, the active site in SseK1 and NleB1 appears to be more accessible than in SseK2 and SseK3. Recently, it has been reported that NleB1 preferentially GlcNAcylates the death domain (DD) of FADD at residue Arg-117 (23), although it can interact with and modify the DD of other proteins, including TRADD and RIPK1 (9,10). Interestingly, SseK1 is the only SseK family member that can modify FADD (also at Arg-117) after bacterial delivery into host cells (8).
The structure of FADD-DD shows an extensive basic surface patch whose residues have been shown to be important in the interaction with Fas receptor (CD95) (46). In particular, mutation of Arg-117, part of the FADD-DD extended positively charged surface, has been shown to abolish interaction with the cytoplasmic death domain of CD95 (47). Therefore, the contiguous acidic surfaces of SseK1 and NleB1 could act as a complementary surface, binding the death domain of FADD and allowing the side chain of residue Arg-117 to position itself in the active site for sugar transfer. The lack of activity in cell-based assays of SseK3 toward FADD and the absence of a phenotype in Salmonella for SseK2 could then be the result of the interplay between a less accessible active site and altered substrate specificity, with SseK3 still able to modify TRADD.
In summary, we have solved the first X-ray crystal structure of a GT-A arginine glycosyltransferase, identifying core-conserved residues that are required for catalytic activity and virulence function of SseK3, allowing us to propose a potential enzymatic mechanism for N-arginine glycosylation.

Protein cloning, expression, and purification
Wildtype SseK3(14–333) and SseK3(14–335) constructs (Uniprot: A0A0H3NMP8) were cloned in pGEX-6P1 (GE Healthcare) vector by the standard Gibson assembly protocol (48) and expressed with a cleavable N-terminal GST tag. The proteins were expressed in E. coli BL21 (DE3) Gold in LB supplemented with 100 µg/ml ampicillin. Cells were grown at 30°C until A₆₀₀ ≈ 0.6–0.8, induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside, and incubated at 18°C for overnight expression. Proteins were purified by GST-Sepharose affinity chromatography followed by cleavage of the GST N-terminal tag by 3C-protease and size-exclusion chromatography in 25 mM HEPES buffer, pH 7.5, 150 mM NaCl, and 0.5 mM TCEP. The 3C-cleavage leaves amino acids GPLGS preceding the first amino acid of SseK3. SseK3(14–335) purification was performed at 4°C, in the presence of EDTA-free protease inhibitor mixture and 1 mM phenylmethylsulfonyl fluoride in the lysis buffer. Selenomethionine labeling of SseK3(14–333) was achieved by using the Molecular Dimensions standard labeling protocol of M9 supplemented with L-selenomethionine with the auxotrophic strain E. coli 834 (DE3) (49). SseK3(14–333) and SseK3(14–335) concentrations were calculated using the values for ε at λ = 280 nm of 29,340 and 34,840 M⁻¹ cm⁻¹, respectively. Mass spectrometry was used to check selenomethionine incorporation and protein molecular masses. The server SWISS-MODEL was used to create structural homology models of SseK1, SseK2, and NleB1 (50).
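As an illustration of how the extinction coefficients quoted above translate into protein concentrations, the Beer-Lambert law (A = ε·c·l) can be sketched in Python. This is a minimal sketch rather than the procedure actually used: the 1 cm path length, the example absorbance reading, and the function name `molar_concentration` are assumptions introduced here.

```python
# Extinction coefficients at 280 nm as reported in the text (M^-1 cm^-1).
EPSILON_280 = {
    "SseK3(14-333)": 29_340.0,
    "SseK3(14-335)": 34_840.0,
}


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law, c = A / (epsilon * l); result in mol/l."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


# Hypothetical reading: A280 = 0.50 in a 1 cm cuvette (assumed, not measured data).
for construct, eps in EPSILON_280.items():
    conc_uM = molar_concentration(0.50, eps) * 1e6
    print(f"{construct}: {conc_uM:.1f} uM")
```

Because the two constructs differ in their extinction coefficients, the same absorbance reading corresponds to different molar concentrations, so the construct-specific value has to be used.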

SseK3 mutant plasmid construction
SseK3 mutant variants were created by overlap mutagenesis PCR from the m4pGFP-SseK3 wildtype plasmid (8) and ligated into the mammalian expression vector m4pGFP to create plasmids encoding SseK3 mutants with an N-terminal GFP tag. All plasmids were checked by sequencing.

Isothermal titration calorimetry
ITC experiments were performed at 293 K using a Microcal iTC200 calorimeter (Malvern). The protein solutions were prepared in buffer containing 25 mM HEPES buffer, pH 7.5, 150 mM NaCl, 0.5 mM TCEP, and 5 mM of either MgCl₂ or MnCl₂. All experiments were performed by placing the solution containing SseK3 in the cell at concentrations of 50 µM and the solution containing the UDP-glucose derivatives in the syringe at 500 µM. The concentrations of UDP derivatives were estimated using the UDP extinction coefficient at λ = 262 nm of 9800 M⁻¹ cm⁻¹. For each titration 20 injections of 2 µl were performed. Integrated data, corrected for heats of dilution, were fitted using a nonlinear least-squares algorithm to a 1:1 binding model, using the MicroCal Origin version 7.0 software package. The fitting parameters are ΔH (reaction enthalpy change in kcal mol⁻¹), K_b (equilibrium binding constant in M⁻¹), and n (number of binding sites). The entropic contribution values (defined as −TΔS) were calculated from the values of ΔH and K_b. Each experiment was repeated at least twice, and average values are reported in Table 1.
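The text states that −TΔS was derived from the fitted ΔH and K_b but does not spell out the relations used. A minimal sketch, assuming the standard thermodynamic identities ΔG = −RT ln K_b and ΔG = ΔH − TΔS, is given below; the function name and the example input values are hypothetical and are not taken from Table 1.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal mol^-1 K^-1


def itc_thermodynamics(delta_h, k_b, temperature=293.0):
    """Return (delta_g, minus_t_delta_s) in kcal/mol.

    delta_h     -- fitted binding enthalpy in kcal/mol
    k_b         -- fitted equilibrium binding constant in M^-1
    temperature -- experimental temperature in K (293 K in the text)
    """
    if k_b <= 0:
        raise ValueError("k_b must be positive")
    delta_g = -R_KCAL * temperature * math.log(k_b)  # dG = -RT ln(K_b)
    minus_t_delta_s = delta_g - delta_h              # -TdS = dG - dH
    return delta_g, minus_t_delta_s


# Hypothetical fit result (assumed values, for illustration only).
dg, mtds = itc_thermodynamics(delta_h=-10.0, k_b=1.0e5)
print(f"dG = {dg:.2f} kcal/mol, -TdS = {mtds:.2f} kcal/mol")
```

A positive −TΔS in this convention indicates an entropic penalty that partly offsets a favorable binding enthalpy.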

Crystallization
Initial screens for SseK3(14–333) were set up by the sitting-drop method at two protein concentrations (20 and 10 mg/ml) by combining 0.1 µl of protein solution with 0.1 µl of reservoir. Commercially available crystallization screens were dispensed by using an automated Mosquito machine (TTP Labtech). The most promising initial crystal hits grew from drops set up with 10 mg/ml protein solution in 0.1 M Tris, pH 8.5, 0.2 M NaCl, and 25% PEG 3350. No single crystals were obtained from the initial screening, and multiple rounds of seeding were necessary to improve crystal quality. The seed stock was prepared by harvesting the small crystals, obtained in 0.1 M Tris, pH 8.5, 0.2 M NaCl, and 25% PEG 3350, in 50 µl of solution of the same condition. The stock was then homogenized by vortexing for 30 s using a Hampton seed bead. Sitting drops were dispensed by mixing 0.2 µl of protein solution at 9 mg/ml, 0.18 µl of reservoir solution (0.1 M Tris, pH 8.5, 0.2 M NaCl, and 25% PEG 3350), and 0.02 µl of seed stock. Single crystals of better diffraction quality could be grown with the selenomethionine SseK3(14–333) protein sample in the same condition but using a total protein concentration of 7 mg/ml. For X-ray data acquisition, crystals were cryoprotected with mother liquor containing 0.1 M Tris, pH 8.5, 0.2 M NaCl, 25% PEG 3350, and 20% ethylene glycol.
To crystallize SseK3(14–335) in complex with UDP-GlcNAc, a stock solution was prepared by mixing the protein at 10 mg/ml with a 3-fold molar excess of UDP-GlcNAc (900 µM) and a final concentration of MnCl₂ of 5 mM. Small single crystals were obtained in 0.1 M Tris, pH 8.0, 0.2 M NaI, and 20% PEG 3350 and improved by multiple rounds of seeding. For X-ray data acquisition, crystals were cryoprotected with mother liquor containing 0.1 M Tris, pH 8.0, 0.2 M NaI, 20% PEG 3350, and 30% trehalose. All of the crystallization experiments were done at 20°C.

Data collection, phasing, and refinement
Despite keeping crystallization and cryoprotection protocols the same, in total, four polymorphs were observed for SeMet-labeled SseK3(14–333) crystals. Data from a monoclinic crystal were collected at the long-wavelength MX beamline I23 at Diamond Light Source (29) for experimental SAD phasing using a wavelength of λ = 2.7751 Å and processed with XDS (51). The anomalous substructure of 20 selenium and 20 sulfur atoms (corresponding to four molecules in the asymmetric unit) was located by SHELXD (52) at 3.5 Å resolution. The substructure solution was confirmed by the presence of a 4-fold non-crystallographic symmetry. Initial experimental electron density maps were of poor quality, possibly due to the presence of a pseudotranslation and the anisotropy of the data. Because automatic building programs failed to provide a starting model, the density map was inspected manually, which revealed the position of two helices that were placed as polyalanine helices using Coot (53). The 4-fold non-crystallographic symmetry was applied to locate the corresponding helices in the asymmetric unit. The anomalous substructure and the initial model of the asymmetric unit comprising eight helices were fed to CRANK2 (54) to produce improved maps benefiting from 4-fold averaging and an almost complete model.
Complete high-resolution data for SeMet-labeled crystals of SseK3(14–333) were collected at the Swiss Light Source (Villigen, Switzerland) and processed using DIALS (http://dials.diamond.ac.uk/). The structure of SeMet-labeled SseK3(14–333) was solved at a single wavelength of 0.97640 Å using the structural model obtained by long-wavelength S-SAD as template in a molecular replacement with MOLREP (55). Data for SseK3(14–335) bound to UDP-GlcNAc and Mn²⁺ were collected at Diamond Light Source, I04 (Oxford, UK) and processed using XDS (51). The structure of SseK3(14–335) co-crystallized in the presence of UDP-GlcNAc and Mn²⁺ was solved using the apo-structure as template for molecular replacement with MOLREP (55).
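For readers who think in photon energies rather than wavelengths, the two data-collection wavelengths quoted above (2.7751 Å and 0.97640 Å) can be converted with E = hc/λ. The following is a small illustrative sketch; the function name is hypothetical, and the constant is the standard value of hc expressed in keV·Å.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV * Angstrom


def photon_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths reported in the text for the two data collections.
for label, wavelength in [("long-wavelength SAD data", 2.7751),
                          ("high-resolution SeMet data", 0.97640)]:
    print(f"{label}: {wavelength} A -> {photon_energy_kev(wavelength):.3f} keV")
```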
Models were iteratively improved by manual building in Coot (53) and refined using REFMAC5 (56) and Phenix (57). The parameters for the ligand stereochemistry were obtained from the standard Coot library. All structural figures were prepared in PyMOL (Schrödinger, LLC) and Chimera (58). Further details on data collection and refinement statistics are summarized in Table 2.

NF-κB reporter assays
1 × 10⁵ 293ET cells (seeded the day before use) were transfected with a mixture of 50 ng of p4kB:Luc, 20 ng of pRLTK, and 500 ng of m4pGFP-SseK3 variants (or m6pPAC-FLAG-GFP control) using Lipofectamine 2000 (Invitrogen). After 24 h of transfection, the cells were stimulated with 50 ng/ml human TNFα (Sigma) for 17 h, and luciferase activity was measured using the Dual-Luciferase reporter assay system (Promega) and a Tecan Infinite200 PRO plate reader. NF-κB-regulated luciferase activity was first normalized to Renilla luciferase activity, and then the fold activation relative to unstimulated conditions of each SseK3 variant was calculated.
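The two-step normalization described above can be sketched as follows. This is an illustrative outline, not the analysis script used for the experiments: the function names and the luminescence readings are hypothetical, and replicate averaging and statistics are deliberately left out.

```python
def normalized_activity(firefly, renilla):
    """Step 1: NF-kB-driven (firefly) signal divided by the Renilla control signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_activation(stim_firefly, stim_renilla, unstim_firefly, unstim_renilla):
    """Step 2: normalized activity after TNF stimulation relative to unstimulated cells."""
    stimulated = normalized_activity(stim_firefly, stim_renilla)
    unstimulated = normalized_activity(unstim_firefly, unstim_renilla)
    if unstimulated == 0:
        raise ValueError("unstimulated activity must be non-zero")
    return stimulated / unstimulated


# Hypothetical raw luminescence readings (arbitrary units, assumed values).
readings = {
    "GFP control": (8000.0, 400.0, 1000.0, 500.0),
    "SseK3 variant": (1500.0, 500.0, 1000.0, 500.0),
}
for name, values in readings.items():
    print(f"{name}: {fold_activation(*values):.1f}-fold activation")
```

Normalizing to Renilla first corrects for differences in transfection efficiency and cell number between wells, so the fold activation can be compared across SseK3 variants.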

Nuclear magnetic resonance
All spectra were recorded at 25°C on a Bruker AVANCE spectrometer operating at 700 MHz, equipped with a cryogenically cooled quadruple-resonance (¹H, ¹⁵N, ¹³C, and ³¹P) probe including z-axis pulse field gradients. Data were acquired and processed with Topspin version 3.5 (Bruker). Reaction solutions of 10 µM of both SseK3 constructs (Santa Cruz Biotechnology, Inc.; weight >98%) were prepared in 20 mM Tris, pH 7.5, 100 mM NaCl, 0.5 mM TCEP, and 5 mM MgCl₂. 500 µM UDP-GlcNAc was added to the protein solutions, and hydrolysis was allowed to proceed at 30°C under gentle stirring. The reaction was quenched at different time points by adding 10 mM EDTA (pH 7.5) and 5% D₂O required for the magnetic field lock. The same reaction conditions were used to monitor UDP-GlcNAc hydrolysis in the absence of enzyme. 1D ¹H and ³¹P NMR spectra were recorded using the standard excitation sculpting pulse sequences zgesp and zgpg30 implemented in Topspin acquisition software.