Structure of a Novel O-Linked N-Acetyl-d-glucosamine (O-GlcNAc) Transferase, GtfA, Reveals Insights into the Glycosylation of Pneumococcal Serine-rich Repeat Adhesins*

Background: O-GlcNAcylation of surface adhesin PsrP catalyzed by GtfA-GtfB complex is involved in the pathogenicity of Streptococcus pneumoniae. Results: The crystal structure of GtfA reveals a novel O-GlcNAc transferase with a β-meander “add-on” domain. Conclusion: The add-on domain is crucial for complex formation and acceptor recognition. Significance: The structure provides insights into the catalytic mechanism of a novel bacterial O-GlcNAc transferase. Protein glycosylation catalyzed by the O-GlcNAc transferase (OGT) plays a critical role in various biological processes. In Streptococcus pneumoniae, the core enzyme GtfA and co-activator GtfB form an OGT complex to glycosylate the serine-rich repeat (SRR) of adhesin PsrP (pneumococcal serine-rich repeat protein), which is involved in the infection and pathogenesis. Here we report the 2.0 Å crystal structure of GtfA, revealing a β-meander add-on domain beyond the catalytic domain. It represents a novel add-on domain, which is distinct from the all-α-tetratricopeptide repeats in the only two structure-known OGTs. Structural analyses combined with binding assays indicate that this add-on domain contributes to forming an active GtfA-GtfB complex and recognizing the acceptor protein. In addition, the in vitro glycosylation system enables us to map the O-linkages to the serine residues within the first SRR of PsrP. These findings suggest that fusion with an add-on domain might be a universal mechanism for diverse OGTs that recognize varying acceptor proteins/peptides.

of an N-terminal domain of unknown function (named DUF1975) and a core catalytic domain of GT-B fold. However, the lack of structural information has been a major impediment to elucidating the molecular mechanism and developing inhibitors against this unique OGT complex.
Here we report the 2.0 Å crystal structure of S. pneumoniae GtfA in complex with GlcNAc and UDP. Structural comparison reveals that GtfA represents a novel OGT with a β-meander add-on domain of DUF1975, which is distinct from the all-α-TPRs of XcOGT. The DUF1975 domain of GtfA is involved in the recognition of the acceptor PsrP and its co-activator GtfB and thus is critical for the intact OGT activity. Further glycoproteomic analyses provide new insights into the O-GlcNAcylation of the serine repeat cluster. Structure-based analysis of this novel OGT may assist in the rational design of novel inhibitors for combating the diseases resulting from infection by Gram-positive pathogens. Moreover, our findings imply that fusion with an add-on domain might be a universal mechanism adopted by diverse OGTs.

EXPERIMENTAL PROCEDURES
Cloning, Expression, and Purification of GtfA, GtfB, and Mutants-The coding regions of GtfA and GtfB were amplified from the genomic DNA of S. pneumoniae TIGR4 and cloned into a pET28a-derived vector with a His6 tag at the N terminus. The Escherichia coli BL21 (DE3) strain was used for the expression of recombinant proteins. The transformed cells were grown at 37°C in LB culture medium (10 g of NaCl, 10 g of Bacto-Tryptone, and 5 g of yeast extract per liter) containing appropriate antibiotics until the A600 nm reached about 0.6. Protein expression was then induced with 0.2 mM isopropyl 1-thio-β-D-galactopyranoside for another 16 h at 16°C. Cells were harvested by centrifugation (6000 × g, 4°C, 10 min) and resuspended in 40 ml of lysis buffer (50 mM Tris-Cl, pH 8.0, 150 mM NaCl, 5% (v/v) glycerol). After 5 min of sonication and centrifugation at 12,000 × g for 30 min, the supernatant containing the soluble target protein was collected and loaded onto a nickel-nitrilotriacetic acid column (Qiagen, Mississauga ON) equilibrated with the binding buffer (20 mM Tris-Cl, pH 8.0, 150 mM NaCl). The target protein was eluted with 400 mM imidazole and further loaded onto a Superdex 200 column (GE Healthcare) equilibrated with 20 mM Tris-Cl, pH 7.5, 100 mM NaCl. Protein purity was evaluated by electrophoresis, and samples were stored at −80°C. Before crystallization, the protein sample was concentrated to 6 mg/ml by ultrafiltration (Millipore Amicon).
The selenomethionine (SeMet)-labeled GtfA protein was expressed in E. coli strain B834 (DE3) (Novagen, Madison, WI). Transformed cells were inoculated into LB medium at 37°C overnight. The cells were harvested and washed twice with the M9 medium. Then the cells were cultured in SeMet medium (M9 medium with 50 mg/liter SeMet and other essential amino acids at 50 mg/liter) to an A600 nm of ~0.6. Protein expression and purification steps were carried out as described above for the native protein.
Site-directed mutagenesis was performed using the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA) with the plasmid encoding the wild-type GtfA as a template.
The mutant proteins were expressed, purified, and stored in the same manner as the wild-type protein.
Crystallization, Data Collection, and Processing-Crystallization trials of GtfA were done using a Mosquito robot (TTP Labtech) in 96-well plates (Greiner) at 16°C. The crystals were obtained using the hanging drop vapor-diffusion method, with the initial condition of equilibrating 0.1 μl of 6 mg/ml protein (mixed with UDP and GlcNAc to a final concentration of 10 mM) with an equal volume of the reservoir solution (0.1 M sodium citrate, pH 6.0, 30% PEG 3350). SeMet-substituted protein crystals were obtained under the conditions of 0.1 M Tris-Cl, pH 7.5, 0.2 M Li2SO4, 15% polyethylene glycol 3350. All the crystals were transferred to cryoprotectant (reservoir solution supplemented with 30% glycerol) and flash-cooled with liquid nitrogen. The data were collected at 100 K in a liquid nitrogen stream using beamline 17U with a Q315r CCD (ADSC, MARresearch, Germany) at the Shanghai Synchrotron Radiation Facility (SSRF).
Structure Determination and Refinement-All diffraction data were integrated and scaled with the program HKL2000 (26). The native and SeMet-substituted GtfA proteins in the presence of GlcNAc and UDP were crystallized in the space group P2₁2₁2₁. The crystal structure of GtfA was determined using the single-wavelength anomalous dispersion phasing method (27) from a single SeMet-substituted protein crystal to a highest resolution of 2.3 Å. The SHELXD program (28) implemented in IPCAS was used to locate the heavy atoms, and the phase was calculated by OASIS (29) and further improved with the programs RESOLVE and Buccaneer (30-32). Electron density maps showed clear features of secondary structural elements. Automatic model building was carried out using Autobuild in PHENIX (33). The initial model was refined using the maximum likelihood method implemented in REFMAC5 (34) as part of the CCP4i (35) program suite and rebuilt interactively using the program COOT (36). The structure was refined to an R-factor/R-free of 19.9%/24.8% and was evaluated with the programs MOLPROBITY (37) and PROCHECK (38). Crystallographic parameters are listed in Table 1. All structure figures were prepared with PyMOL (39).
Hydrolytic Activity Assays-The hydrolytic activities of GtfA and/or GtfB were assayed by high performance liquid chromatography (HPLC). All assays were performed at 37°C in the buffer containing 50 mM Tris, 100 mM NaCl, pH 7.5, 10 mM β-mercaptoethanol with UDP-GlcNAc as the sugar donor and PsrP SRR1 as the acceptor. The donor substrate UDP-GlcNAc (Sigma) was diluted to a series of concentrations from a 100 mM stock solution. The reaction was triggered by adding the purified protein solution and terminated by heating at 100°C for 5 min. All samples were centrifuged at 10,000 × g for 10 min, and the supernatant was subjected to an HPLC system (Agilent 1200 Series). A buffer of 100 mM NH4H2PO4, pH 6.2, was used for equilibration of the column (Eclipse XDB-C18 column, 4.6 × 150 mm, Agilent) and separation of the components at a flow rate of 1 ml/min. The product UDP was used as the standard and was quantified by the absorption at 254 nm. The initial velocities and substrate concentrations were fitted non-linearly to the Michaelis-Menten equation to calculate the Km and Vmax values. Three independent kinetic determinations were made to calculate the means and standard deviations for the reported Km and kcat values.
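The non-linear fit described above can be illustrated with a minimal sketch. The fitting software used in this study is not stated, so the routine below is an assumption for illustration only: it fits v = Vmax[S]/(Km + [S]) by least squares, using the fact that Vmax has a closed-form optimum for any fixed Km and searching over Km by golden-section search. The substrate concentrations, the enzyme concentration, and the "true" parameters used to simulate velocities are hypothetical and are not data from this work.

```python
import math

def mm_rate(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def best_vmax(km, substrate, velocity):
    # For a fixed Km the model is linear in Vmax, so the least-squares
    # estimate of Vmax has a closed form.
    x = [s / (km + s) for s in substrate]
    return sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)

def sse(km, substrate, velocity):
    """Sum of squared residuals at this Km, with Vmax set to its optimum."""
    vmax = best_vmax(km, substrate, velocity)
    return sum((v - mm_rate(s, vmax, km)) ** 2
               for s, v in zip(substrate, velocity))

def fit_michaelis_menten(substrate, velocity, iters=200):
    """Return (vmax, km) from a golden-section search over log(Km)."""
    lo = math.log(1e-3 * min(substrate))
    hi = math.log(1e3 * max(substrate))
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if sse(math.exp(c), substrate, velocity) < sse(math.exp(d), substrate, velocity):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    km = math.exp((lo + hi) / 2.0)
    return best_vmax(km, substrate, velocity), km

# Hypothetical, noise-free data simulated with Vmax = 10 and Km = 0.5
# (arbitrary units); real assays would supply measured initial velocities.
substrate = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
velocity = [mm_rate(s, 10.0, 0.5) for s in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, velocity)

enzyme_conc = 2.0                 # hypothetical total enzyme concentration
kcat = vmax_fit / enzyme_conc     # kcat = Vmax / [E]total
print(vmax_fit, km_fit, kcat)
```

With triplicate determinations, each data set would be fitted separately and the means and standard deviations of Km and kcat taken across the three fits.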
In Vitro SRR1 O-GlcNAcylation Assays-The PsrP SRR1 glycosylation assays were performed in a volume of 20 μl with 20 μM recombinant GST-PsrP SRR1 and a series of concentrations of the sugar donor UDP-GlcNAc in the buffer containing 50 mM Tris-Cl, 100 mM NaCl, pH 7.5, 10 mM β-mercaptoethanol. The glycosylation reaction was started by adding 8 μM GtfA and/or GtfB, proceeded for 2 h at 37°C, and was terminated by heating at 100°C for 5 min with the addition of 5× loading buffer. The reaction mixtures were then separated in a 12% SDS-PAGE gel and transferred to polyvinylidene difluoride membranes. GlcNAc-modified GST-PsrP SRR1 proteins were detected on the polyvinylidene difluoride membrane using wheat germ agglutinin-HRP conjugate (WGA-HRP, 20363).
Identification of the O-GlcNAcylation Sites-Completely glycosylated GST-PsrP SRR1 and native GST-PsrP SRR1 were reduced, alkylated, digested with GluC (Sigma), and analyzed by reverse phase liquid chromatography according to a standard quality control method for enzymatic mapping. Using the in-gel digestion protocol (40), the samples were diluted in 50 mM NH4HCO3 and reduced with 10 mM freshly made dithiothreitol at 56°C for 45 min, carboxyamidomethylated with 55 mM iodoacetamide, and incubated in the dark for 30 min; 200 μl of 30% or 50% ACN (acetonitrile) in 50 mM NH4HCO3 was added to remove the Coomassie stain, and the dried gel was protease-digested overnight at 37°C. After digestion, the reaction was quenched with 200 μl of 60% ACN, 5% trifluoroacetic acid (TFA). The resulting peptides were suspended in 25 μl of 0.1% formic acid and stored at −20°C until analysis by LTQ Orbitrap Velos (Thermo Fisher) MS.
The data were searched against a protein database using the TurboSequest algorithm (Bio-Works, Thermo Fisher). To aid in the identification of glycopeptides, we allowed for a variable mass increase on serine residues corresponding to the stepwise addition of GlcNAc. All remaining spectra were manually evaluated for the presence of glycopeptides and sites of modification and were further validated by TurboSequest searches against the GST-PsrP SRR1 FASTA sequence combined with the TurboSequest common contaminants database.
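The mass increase searched for on serine residues can be sketched as follows. Each O-linked GlcNAc adds one HexNAc residue (C8H13NO5, monoisotopic mass 203.0794 Da) to a peptide, so a peptide with n serines has n + 1 possible glycoform masses. The peptide sequence below is hypothetical and chosen only for illustration; it is not taken from the SRR1 sequence, and the actual search settings used in this study are not given in the text.

```python
# Monoisotopic masses in Da.
HEXNAC = 203.07937   # mass added per O-GlcNAc (HexNAc residue, C8H13NO5)
WATER = 18.010565    # added once per peptide for the free termini
RESIDUE = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "T": 101.04768,
    "E": 129.04259,
}

def peptide_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[r] for r in seq) + WATER

def glycoform_masses(seq):
    """Masses for 0..n GlcNAc additions, where n is the number of serines."""
    base = peptide_mass(seq)
    return [base + k * HEXNAC for k in range(seq.count("S") + 1)]

# Hypothetical GluC-like peptide (ends in Glu); not a real SRR1 fragment.
example = "SASASTE"
masses = glycoform_masses(example)
for k, m in enumerate(masses):
    print(f"{k} GlcNAc: {m:.4f} Da")
```

Only serines are counted here because the modification was allowed on serine residues in the search; extending the count to threonine would be a one-line change.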
Construction of a GtfA-GtfB and PsrP Co-expression System and Purification of Recombinant Proteins-An E. coli recombinant system that co-expresses GtfA-GtfB with its acceptor PsrP was used to determine binding of GtfA to PsrP and the function of DUF1975 in the binding. The recombinant E. coli strains were generated as follows. Full-length gtfA-gtfB was amplified from S. pneumoniae TIGR4 genomic DNA using a primer set of gtfA-gtfB-BamHI-1F and gtfA-gtfB-XhoI-1266R. The DNA fragment was digested and cloned into pGEX-6P-1 to construct pGEX-6P-1-gtfA-gtfB in E. coli BL21. A 1266-bp 5′ DNA fragment of psrP was amplified and cloned into pET-27b, creating pET-27b-PsrP-His. The resulting plasmid pET-27b-PsrP-His was used as a template to amplify the full expression unit (including the T7 promoter, psrP, a His6 tag, and the transcriptional terminator). The amplified DNA fragment was inserted into the single XhoI site of pGEX-6p-1-GtfA-GtfB, creating the final construct pGEX-6p-1-GtfA-GtfB-PsrP-His in E. coli BL21 (DE3).
Site-directed mutagenesis was used to construct GtfA variants by PCR using a QuikChange mutagenesis kit (Stratagene). In brief, the plasmid pGEX-6p-1-GtfA-GtfB-PsrP-His was used as a template. Mutant alleles were confirmed by sequencing. The resulting plasmids were transformed into E. coli BL21 (DE3) for protein expression.
The interaction of PsrP with GtfA or GtfA variants was determined by co-purification of GST-fused GtfA variants and PsrP from recombinant E. coli strains that harbor GST-GtfA variants/GtfB and PsrP. Production of recombinant proteins was carried out as follows. Recombinant E. coli BL21 strains were grown at 37°C to an optical density of 0.7. Protein expression was induced by the addition of isopropyl 1-thio-β-D-galactopyranoside to a final concentration of 0.1 mM and incubation overnight at 25°C before harvesting cell pellets. Cell lysates were prepared from the cell pellets by sonication, and all recombinant proteins were purified by glutathione-Sepharose 4B beads (Amersham Biosciences). Equal amounts of purified proteins were subjected to SDS-PAGE analysis followed by Coomassie Brilliant Blue staining and Western blot analysis using anti-GST, anti-His, and anti-PsrP to detect GtfA and PsrP.

RESULTS AND DISCUSSION
Overall Structure of GtfA-Crystals of the SeMet-substituted recombinant GtfA were obtained, and the structure in the presence of UDP and GlcNAc was solved at a resolution of 2.0 Å. Each asymmetric unit of the crystal structure contains two molecules of GtfA with an overall root mean square deviation (r.m.s.d.) of 0.56 Å over 501 Cα atoms. Although the crystal packing results in a buried interface of 1600 Å², both size-exclusion chromatography and ultraperformance liquid chromatography indicated that GtfA exists as a monomer in solution. The molecules of GlcNAc and UDP at the active site could be well defined in the final model. Similar to members of the GT-B superfamily, GtfA comprises a core structure of two Rossmann-like domains: N-Cat and C-Cat (Fig. 1, a and b). The N-Cat is composed of two segments, residues Met-1-Ser-78 and Gly-196-Gly-306, interrupted by DUF1975, whereas the C-Cat (residues Ser-307-Asp-503) has a tail of helix α15 packing on N-Cat (Fig. 1b). Structural homology search using the DALI server (41) revealed that the core structure of GtfA resembles GT-B enzymes at a Z-score ≥14.0 despite sharing a sequence identity of 18% (42 (46). Protruding from the N-terminal core structure is an add-on domain of DUF1975 (residues Val-79-Tyr-195), which adopts a β-meander structure of a twisted 10-stranded antiparallel β-sheet (β4-β13). The hydrophobic outer side of DUF1975 is shielded from the solvent by a short helix α4 (Fig. 1b).
The molecules of UDP and GlcNAc are accommodated at the interdomain cleft with most contacts to C-Cat (Fig. 1c). In detail, the uracil base of UDP is stabilized by hydrophobic interactions with His-384 and Leu-387 in addition to hydrogen bonds with Ala-385 and Tyr-19. The ribose ring is stacked by Tyr-19, and its oxygen atoms O2 and O3 are fixed by the carboxylate group of Glu-412 via two conserved hydrogen bonds as found in E. coli NDP-glycosyltransferase MurG (47) and Mycobacterium smegmatis phosphatidylinositol mannosyltransferase PimA (45). The oxygen atoms of the α-phosphate group are stabilized by the backbone amides of Leu-408 and Thr-409, respectively, whereas those of the β-phosphate group interact with basic residues Arg-328 and Lys-333 and with the backbone amide of Gly-16 at the α1-β1 loop (Fig. 1c). This loop at N-Cat usually undergoes significant conformational changes upon sugar-nucleotide binding (47,48). The GlcNAc molecule is approximately perpendicular to the UDP molecule and adopts a "bent-back" conformation toward the pyrophosphate of UDP (Fig. 1c). This conformation exposes the anomeric sugar carbon (carbon 1 (C1)) for nucleophilic attack (23,45,46). GlcNAc is also stabilized by several hydrogen bonds: two of the hydroxyl O3 with the backbone amides of Phe-406 and Gly-407, one of O4 with the amide nitrogen of Gly-407, one of O6 with Nδ1 of His-242, and two of the acetyl group with residues Gly-405 and Glu-404 (Fig. 1c). All these active-site residues, especially the two basic residues Arg-328 and Lys-333, which were proposed to neutralize the negative charge of the phosphate groups of UDP (46), are generally conserved in GT-B enzymes.
Insertion of DUF1975 Makes GtfA a Novel OGT-The N-Cat domain of GtfA is interrupted by DUF1975, a domain of unknown function that is predominantly found in the N-terminal region of various putative bacterial glycosyltransferases. Structural homology search revealed that DUF1975 is similar to a  (50), and Pseudomonas aeruginosa pyoverdine-iron transporter FpvA (PDB code 2W78, Z score 5.6, r.m.s.d. 6.5 Å over 100 Cα atoms) (51). However, DUF1975 could be only partially superimposed onto these structures, indicating that DUF1975 represents a novel structure.
Four different add-on domains have previously been revealed in the structures of GT-B glycosyltransferases (Fig. 2a). They are the N-terminal TPR domain of hOGT (PDB code 3PE4) (6) and XcOGT (PDB code 2JLB) (8), the N-terminal all-α domain of N-glucosyltransferase (PDB code 3Q3E) (52), the C-terminal SH3 domain of mammalian α-1,6-fucosyltransferase (PDB code 2DE0) (53), and the N-terminal immunoglobulin-like β-sandwich fold domain of α-2,6-sialyltransferase (PDB code 2Z4T) (54). It was assumed that during evolution these add-on domains imparted the defined acceptor binding specificity to glycosyltransferases (55). Among these interrupted GT-B glycosyltransferases, GtfA has a molecular function similar to the two OGTs of known structure, hOGT and XcOGT, which have an N-terminal all-α-TPR domain that plays an important role in recognizing acceptor and/or partner proteins. Notably, structural superpositions show that the β-meander DUF1975 domain of GtfA adopts an orientation similar to that of the TPRs in hOGT and XcOGT, although it possesses a totally different composition of secondary structure elements (Fig. 2, b and c).
DUF1975 Is Necessary for the Formation of an Active GtfA-GtfB Complex-Our previous studies reported that S. parasanguinis Gtf2 functions as a chaperone of Gtf1 for the glycosylation of its acceptor protein Fap1 (24). Accordingly, we hypothesized that S. pneumoniae GtfB may bind to GtfA to form a holoenzyme. Indeed, isothermal titration calorimetry experiments indicated that they form a tight complex with a dissociation constant (Kd) of 51.5 nM. Size-exclusion chromatography combined with gel electrophoresis also confirmed that GtfA and GtfB form a stable complex of 1:1 stoichiometry. Moreover, we mapped a region (⁵⁸LYFNQL⁶³) in the DUF1975 domain of Gtf2, corresponding to the region ⁵⁷LYFNQV⁶² in GtfB, which was critical for the binding to Gtf1 (25).
Thus, we determined the Kd value of the individual DUF1975 domain of GtfB (GtfB-DUF1975, residues Met-1-Phe-171) toward GtfA and obtained a Kd of 11.0 μM, suggesting that the DUF1975 domain is necessary but not sufficient for the formation of a stable GtfA-GtfB complex.
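To put the two dissociation constants on a common scale, the short calculation below converts each Kd into a standard binding free energy, ΔG° = RT ln(Kd / 1 M). This is a sketch under an assumption: the temperature of the calorimetry experiments is not stated in the text, so 298.15 K is used purely for illustration.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
T = 298.15        # assumed temperature in K (not stated in the text)

def binding_free_energy(kd_molar, temperature=T):
    """Standard binding free energy in kJ/mol from a Kd given in mol/L."""
    return R * temperature * math.log(kd_molar) / 1000.0

kd_full = 51.5e-9   # full-length GtfB binding to GtfA, 51.5 nM
kd_duf = 11.0e-6    # GtfB-DUF1975 binding to GtfA, 11.0 uM

fold_weaker = kd_duf / kd_full
ddg = binding_free_energy(kd_duf) - binding_free_energy(kd_full)

print(f"GtfB-DUF1975 binds about {fold_weaker:.0f}-fold more weakly")
print(f"Difference in binding free energy: {ddg:.1f} kJ/mol")
```

Under this assumption the isolated DUF1975 domain retains most of the binding free energy of full-length GtfB but binds roughly two orders of magnitude more weakly, in line with the conclusion that the domain is necessary but not sufficient.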
In fact, GtfB also functions as a co-activator/chaperone for the full activity of GtfA. Our in vitro hydrolytic assays showed that the GtfA-GtfB complex had an activity (kcat/Km) of 69.1 min⁻¹ mM⁻¹ toward UDP-GlcNAc, >100-fold that of the individual GtfA (0.6 min⁻¹ mM⁻¹), whereas GtfB did not exhibit any activity (Table 2). In contrast, the complex of GtfA with GtfB-DUF1975 had an activity of 1.1 min⁻¹ mM⁻¹, which was comparable with that of individual GtfA (Table 2). Moreover, mutation of either Arg-328 or Lys-333 to alanine completely abolished the activity, suggesting that these two UDP binding residues are crucial for catalysis. To determine the O-GlcNAcylation activity of GtfA-GtfB toward PsrP, in vitro O-GlcNAcylation assays were performed using UDP-GlcNAc as the sugar donor and SRR1 of PsrP as the peptide acceptor. Neither individual GtfA or GtfB nor the complex of GtfA with GtfB-DUF1975 could catalyze the glycosylation of SRR1 (Fig. 3a). In contrast, the GtfA-GtfB complex glycosylated SRR1, resulting in a band of O-GlcNAcylated SRR1 of higher molecular weight (Fig. 3a). These results clearly demonstrated that formation of the GtfA-GtfB complex is crucial for the O-GlcNAcylation activity. Notably, no glycosylated SRR1 was detected with the GtfA variants R328A and/or K333A, in agreement with the results of the hydrolytic activity assays.
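The fold changes quoted above follow directly from the reported kcat/Km values; the arithmetic is sketched below using only the three efficiencies given in the text (the dictionary keys are labels chosen here for illustration).

```python
# Catalytic efficiencies (kcat/Km) toward UDP-GlcNAc, in min^-1 mM^-1,
# as reported in the text.
efficiency = {
    "GtfA": 0.6,
    "GtfA-GtfB": 69.1,
    "GtfA + GtfB-DUF1975": 1.1,
}

# Express each efficiency relative to GtfA alone.
reference = efficiency["GtfA"]
fold = {name: value / reference for name, value in efficiency.items()}

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold relative to GtfA alone")
```

The full complex is about 115-fold more efficient than GtfA alone, whereas the complex with the isolated DUF1975 domain of GtfB is less than 2-fold more efficient.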
Despite the growing interest in the study of protein glycosylation of conserved SRRPs, it remains unknown how the SRRs are O-GlcNAcylated. To elucidate the O-GlcNAcylation pattern of PsrP, the full-length SRR1 and five truncated variants (termed SRR1-N12, -N20, -N25, -N29, and -N36, representing the peptides of the corresponding number of N-terminal residues) were subjected to O-GlcNAcylation assays. Only the four peptides of 25 or more residues were O-GlcNAcylated in vitro, whereas no O-GlcNAcylation of either SRR1-N12 or SRR1-N20 was detected (Fig. 3b). These data suggested that a peptide of proper length is necessary for the in vitro O-GlcNAcylation. Furthermore, liquid chromatography mass spectrometry analyses showed that only the Ser, but not the Thr, residues were O-GlcNAcylated. In detail, O-GlcNAcylation of serine residues (Ser-2, -4, -5, -7, -9, -11, -23, -29, -37, -39, -47, -49, and -50) within SRR1 was identified by LC-MS (Fig. 3c). Although only half of the Ser residues were detected to be O-GlcNAcylated, it is possible that most or even all Ser residues within SRR1 are modified by O-GlcNAcylation upon saturation with UDP-GlcNAc, as the molecular mass of the homogeneous band of O-GlcNAcylated SRR1 is much greater than that of the unmodified SRR1 (Fig. 3a).
DUF1975 Is Required for Binding of GtfA to the Acceptor PsrP-To dissect the acceptor binding site of GtfA, we initially tried co-crystallization of GtfA or the GtfA-GtfB complex with various lengths of N-terminal peptide of SRR1 but did not succeed. Alternatively, we docked the 25-residue peptide SRR1-N25 recognized by the complex to our GtfA structure using the HADDOCK program (56). All output models suggested that GtfA only interacts with the N-terminal 18 residues, which were subsequently used as input to improve the docking model. Among the 12 output clusters, the first cluster of the lowest energy with 47 members best satisfies the interaction restraints and has the largest buried interface area of ~2160 Å². The overall backbone r.m.s.d. of 0.8 ± 0.5 Å for the 47 members indicated that the binding model was rather reliable in terms of the theoretical calculation. In the model, the octadecapeptide lies over the UDP and GlcNAc binding sites, well complementary in both shape and charge to the interdomain cleft (Fig. 4a). The peptide chain runs along the cleft with its N terminus proximal to β4 and β5 of the DUF1975 domain. The octadecapeptide is mainly stabilized by residues Trp-12, Ala-13, Glu-18, Glu-244, Glu-248, Asn-249, Asn-260, and Tyr-261 of N-Cat in addition to three residues (Arg-328, Glu-332, and Ser-403) of C-Cat and three residues (Asn-98, Arg-103, and Tyr-116) of the DUF1975 domain (Fig. 4b). Our docking model is also consistent with the previous conclusion that the donor nucleotide-sugar commonly binds to the C-Cat of GT-B enzymes, whereas the acceptor binds to the N-Cat (57). Sequence analysis of these interacting residues suggests that most of them are highly conserved in GtfA homologs from Gram-positive bacteria that encode SRRPs.
The docking model represents a pose in which Ser-11′ of PsrP is ready to be O-GlcNAcylated. The hydroxyl group of Ser-11′, stabilized by residues Arg-328 and Glu-332, lies 2.9 Å from the β-phosphate oxygen of UDP and 3.1 Å from the anomeric oxygen of GlcNAc (Fig. 4b). The acceptor SRR1 peptide runs over the UDP-GlcNAc binding pocket, similar to the structure of the hOGT-UDP-peptide complex (7). This conformation makes UDP-GlcNAc inaccessible to the solvent, consistent with the ordered bi-bi mechanism, in which the sugar donor UDP-GlcNAc binds before the acceptor peptide (58). However, Ser-11′ and UDP are aligned on the same face of the GlcNAc plane, indicating that GtfA is a retaining OGT, different from hOGT and XcOGT, which both adopt an inverting mechanism. For retaining glycosyltransferases, it has been suggested that the β-phosphate of UDP acts as a general base (59). In the model the hydroxyl group of Ser-11′ of PsrP is 3.5 Å from Glu-332-Oε2 and 2.9 Å from the β-phosphate oxygen of UDP, suggesting that Glu-332 of GtfA or UDP may serve as a general base to activate the nucleophilic serine residue. Moreover, Glu-332-Oε1 also makes a hydrogen bond with the backbone amide of Ser-11′ of PsrP.
To test the docking model, we first performed in vitro glycosyltransferase activity assays. Mutation of the highly conserved residue Glu-244 and the putative catalytic residue Glu-332 to alanine completely abolished the O-GlcNAcylation activities (Fig. 4c). Meanwhile, mutation of Ser-403 to alanine significantly diminished the activity (Fig. 4c). Furthermore, we used the E. coli in vivo glycosylation system to test the O-GlcNAcylation activity and binding to PsrP. Co-expression of GtfA-GtfB with the recombinant substrate PsrP led to O-GlcNAcylation of PsrP (Fig. 4dIII, second lane), which is also evident from the appearance of an upper PsrP band (Fig. 4dII, second lane). PsrP proteins, both the glycosylated upper band and the non-glycosylated lower band, were readily co-purified with GtfA (Fig. 4dII, second lane), suggesting in vivo binding between GtfA and PsrP. Mutation of Glu-332 to alanine completely inhibited glycosylation of PsrP but minimally affected the binding (Fig. 4dII, third lane), suggesting that the Glu-332 residue is catalytically important, as this subtle change has a dramatic impact on the GtfA activity. Importantly, site-directed mutagenesis of three residues within DUF1975 reduced binding of GtfA to the acceptor PsrP and concurrently inhibited O-GlcNAcylation of PsrP (Fig. 4d, II and III, fourth-sixth lanes). Notably, the binding between GtfA and GtfB was not dramatically changed in any of the site-directed mutants, as indicated by co-purification of GtfA variants and GtfB (Fig. 4dII), demonstrating the unique requirement of DUF1975 for substrate binding, a new function in addition to its ability to bind to the co-activator GtfB.
Both hOGT (6) and XcOGT (9) contain catalytic domains of GT-B fold after an all-α-TPR domain that mediates the recognition of a broad range of proteins (10). Previous studies have proven that TPR, a scaffold for the recognition of partner proteins or acceptor peptides, is crucial for the activity of OGTs (9,60). Similar to TPR, the DUF1975 domain of GtfA has also been implicated in mediating protein-protein interactions and is essential for the glycosylation of SRRPs (25,61). In our docking model, we found that the two most N-terminal residues of SRR1, Asn-1′ and Ser-2′, form side-chain hydrogen bonds with residues Asn-98, Arg-103, and Tyr-116 of DUF1975 (Fig. 4b). Mutation of these three residues to alanine reduced the binding of GtfA to the peptide acceptor PsrP and decreased both the in vitro and in vivo glycosyltransferase activity of the GtfA-GtfB complex (Fig. 4, c and d). Thus, despite a totally different overall structure, DUF1975 acts as a domain for recognizing the acceptor, a role similar to that of the TPRs in hOGT and XcOGT.
Taken together, we report here the structure of GtfA, the core subunit of a novel bacterial OGT, which is responsible for the O-GlcNAcylation of alternate serine residues of the serine-rich repeat adhesin PsrP. The β-meander add-on domain DUF1975 of GtfA is critical for the recognition of both the co-activator GtfB and the acceptor PsrP. The glycoproteomic analysis provides insights into a novel mode of protein O-glycosylation on multiple sites. Structure-based analysis may also assist in the future development of inhibitors against the biogenesis of bacterial SRRPs.