Cytoplasmic N-Glycosyltransferase of Actinobacillus pleuropneumoniae Is an Inverting Enzyme and Recognizes the NX(S/T) Consensus Sequence

N-Linked glycosylation is a frequent protein modification that occurs in all three domains of life. This process involves the transfer of a preassembled oligosaccharide from a lipid donor to asparagine side chains of polypeptides and is catalyzed by the membrane-bound oligosaccharyltransferase (OST). We characterized an alternative bacterial pathway wherein a cytoplasmic N-glycosyltransferase uses nucleotide-activated monosaccharides as donors to modify asparagine residues of peptides and proteins. N-Glycosyltransferase is an inverting glycosyltransferase and recognizes the NX(S/T) consensus sequence. It therefore exhibits an acceptor site specificity similar to that of eukaryotic OST, despite the unrelated predicted structural architecture and the apparently different catalytic mechanism. The identification of an enzyme that integrates some of the features of OST in a cytoplasmic pathway defines a novel class of N-linked protein glycosylation found in pathogenic bacteria.

N-Linked glycosylation is characterized by an N-glycosidic linkage between the side chain amide of an asparagine residue of proteins and an oligosaccharide. This type of glycosylation occurs in both prokaryotes and eukaryotes and requires the assembly of an oligosaccharide on a polyisoprenoid lipid by sequential addition of monosaccharides, catalyzed by cytosolic glycosyltransferases. The resulting lipid-linked oligosaccharide is translocated to the luminal side of the endoplasmic reticulum membrane or the plasma membrane of prokaryotes, where it may be further elongated. The glycan is then transferred to the δ-amino group of asparagine residues within the consensus sequence NX(S/T) of polypeptides. This reaction is catalyzed by oligosaccharyltransferase (OST), a membrane-bound enzyme that can be composed of several different subunits (1). As oligosaccharide transfer takes place in the endoplasmic reticulum or in the periplasm, N-glycosylation affects proteins trafficking along the secretory pathway.
N-Glycosylation exhibits important physiological functions. In the early secretory pathway of eukaryotes, N-glycans present on newly synthesized proteins orchestrate the folding of glycoproteins and act as a signal for directing misfolded polypeptides to degradation (2). After being processed in the Golgi organelle, N-linked glycans are relevant, among other processes, for the modulation of the immune system and for the control of immune cell homeostasis and inflammation (3,4). The importance of this protein modification is supported by its incidence: more than half of all eukaryotic proteins are glycosylated (5).
About 8 years ago, a study revealed that the extracellular HMW1A adhesin of the Gram-negative bacterium Haemophilus influenzae is modified with hexose monosaccharides on asparagine residues (6). The pioneering work of St Geme and co-workers (7,8) subsequently showed that the modified asparagine residues are found within the same consensus sequence recognized by OST (i.e. NX(S/T)) and that the enzyme responsible for this modification is the HMW1C protein, which is capable of transferring Glc or Gal from nucleotide-activated substrates. Moreover, it has been proposed that HMW1C also forms hexose-hexose bonds. Modification of the HMW1A adhesin prevents its premature degradation and promotes its display at the cell surface, a prerequisite for HMW1A-mediated adherence to human epithelial cells (8). Additionally, a recent report demonstrated that an HMW1C homolog from Actinobacillus pleuropneumoniae mediates N-linked glycosylation of the H. influenzae HMW1A protein (9).
To characterize accurately the mechanisms of this alternative N-glycosylation pathway and to gain insight into the specificity of the key enzyme N-glycosyltransferase (NGT), we established a platform suitable for in vitro glycosylation and undertook a detailed analysis of the reaction products. We adopted MS and gel electrophoresis for detection of glycosyltransferase activity using peptides and proteins as substrates and performed NMR studies to characterize the reaction adducts. We show that the HMW1C homolog from A. pleuropneumoniae is an inverting NGT that transfers a glucose or galactose moiety to asparagine, but it does not further elongate the N-linked monosaccharide. Instead, we found that another glucosyltransferase was able to elaborate the N-linked glucose. We compared the acceptor substrate range of NGT and OST and observed a highly similar specificity of the two different enzymes. We concluded that NGT integrates polypeptide recognition common to the OST-based modification into a novel framework, resulting in a general N-glycosylation system that operates in the cytoplasm.

EXPERIMENTAL PROCEDURES
Materials-Restriction enzymes were purchased from Fermentas. T4 DNA ligase was from PerkinElmer Life Sciences. UDP-Glc, UDP-GlcNAc, and UDP-GalNAc were from Sigma. UDP-Gal was obtained from VWR International. Synthetic peptides were purchased from JPT Peptide Technologies.
Construction of Plasmids-Escherichia coli DH5α was chosen as host for cloning. The NGT and α1,6-glucosyltransferase (α6GlcT) genes were amplified by PCR using genomic DNA from Yersinia enterocolitica strain 8081, A. pleuropneumoniae strain L20, or A. pleuropneumoniae strain AP76 as a template. Fragments containing the NGT gene were cut with XhoI and ligated into pEC(AcrA-cyt), previously digested with NdeI, blunted by treatment with Klenow fragment, and digested with XhoI. The α6GlcT gene was cloned into the pEXT21 plasmid. All open reading frames are in-frame with a hexahistidine tag at the C terminus. All plasmid constructs were verified by sequencing of relevant fragments (Microsynth AG).
Protein Expression, Purification, and Analysis-E. coli DH5α cells harboring a plasmid for expression of a relevant protein were grown in volumes of 1 liter at 37°C in LB medium. Ampicillin (100 mg/liter) or chloramphenicol (25 mg/liter) was added to the medium as needed. When cultures reached A600 = 0.5, 0.2% arabinose or 1 mM isopropyl β-D-thiogalactopyranoside was added for induction of protein expression. After 4 h of incubation at 37°C, cells were harvested by centrifugation, resuspended in 30 mM Tris (pH 8) and 300 mM NaCl supplemented with 1 mM EDTA and 1 g/liter lysozyme, and incubated for 1 h at 4°C. MgCl2 and DNase I (Roche Applied Science) were added to final concentrations of 5 mM and 0.1 mg/ml, respectively. Cells were broken with a French press. Extracts were spun at 150,000 × g for 30 min at 4°C. Supernatants were supplemented with 20 mM imidazole and loaded on a HisTrap column (GE Healthcare). Purification of proteins was done as recommended by the provider. Purification of Xanthomonas campestris O-GlcNAc-transferase (OGT) was performed according to a published procedure (10). Buffer exchange to 25 mM Tris (pH 7.2) and 150 mM NaCl was performed by gel filtration chromatography using HiTrap desalting columns (GE Healthcare). Proteins were analyzed by SDS-PAGE and quantified by measuring absorbance at 280 nm.
Glycosylation Analysis of Synthetic Peptides-Enzymatic activity using different sugar donors or peptide acceptors was assessed with 1.4 µg (0.46 µM) of NGT and/or α6GlcT in a 50-µl final volume of 25 mM Tris buffer (pH 7.2). Acceptor peptides and sugar donors were mixed at a 1:100 molar ratio. Glycosylation reactions were incubated for 16 h at 30°C. Analysis of the reaction products was performed by MALDI-TOF/TOF-MS, NMR, or gel electrophoresis. For electrophoretic analysis, carboxytetramethylrhodamine (TAMRA)-labeled peptides were supplemented with reducing sample buffer (0.0625 M Tris-HCl (pH 6.8), 2% (w/v) SDS, 5% (v/v) β-mercaptoethanol, 10% (v/v) glycerol, and 0.01% (w/v) bromphenol blue), boiled at 95°C for 5 min, and separated by Tricine/SDS-PAGE (11). Fluorescence was acquired with a Bio-Rad RX imager. For removal of salts and enzyme, peptides were bound to a Sep-Pak C18 cartridge (Waters) or to a ZipTip C18 pipette tip (Millipore), washed with 0.1% formic acid, and eluted with a solution of 70% acetonitrile and 0.1% formic acid.
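As a consistency check on the enzyme amount quoted above, the stated mass, molarity, and reaction volume together imply a molar mass for the enzyme. The short sketch below works through that arithmetic. It assumes the quantities are 1.4 µg, 0.46 µM, and 50 µl (the micro prefix is an assumption based on what is physically plausible), and the function name is illustrative only.

```python
# Assumed units: 1.4 ug of enzyme at 0.46 uM in a 50-ul reaction.
ENZYME_MASS_G = 1.4e-6          # grams
REACTION_VOLUME_L = 50e-6       # liters
ENZYME_MOLARITY = 0.46e-6       # mol/liter


def implied_molar_mass(mass_g: float, volume_l: float, molarity: float) -> float:
    """Return the molar mass (g/mol) consistent with a mass, volume, and molarity."""
    moles = volume_l * molarity
    return mass_g / moles


implied_mw = implied_molar_mass(ENZYME_MASS_G, REACTION_VOLUME_L, ENZYME_MOLARITY)
print(f"Implied molar mass: {implied_mw / 1000:.1f} kDa")
```

Under these assumptions the implied molar mass is about 61 kDa, a plausible size for a soluble His-tagged glycosyltransferase, so the three quoted numbers are mutually consistent.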
Glycosylation Analysis of Proteins-50 µg of AcrA was incubated with 1 mM UDP-Glc and 1.4 µg of NGT in 25 mM Tris (pH 7.2) and 150 mM NaCl for 16 h at 30°C. Samples were digested with trypsin (Promega) overnight at 37°C. Peptides were bound to a C18 cartridge for removal of salts, eluted with a solution of 70% acetonitrile and 0.1% formic acid, and subjected to MS analysis.
MS Analysis-MALDI-MS and MS/MS analyses of synthetic peptides were performed on a model 4800 proteomics analyzer (Applied Biosystems) operated in the positive reflectron mode. The peptides eluted from the C18 cartridge or the ZipTip C18 pipette tip were mixed 1:1 with α-cyano-4-hydroxycinnamic acid (5 mg/ml in 70% acetonitrile and 0.1% trifluoroacetic acid) as a matrix for spotting onto the target plate. Peptide mixtures from the AcrA glycosylation reaction were analyzed with an Eksigent nano-HPLC system using an autosampler with a self-made reverse-phase tip column (75 µm × 80 mm) packed with C18 material (AQ, 3 µm, 200 Å; Bischoff GmbH, Germany). The gradient consisted of 10-30% acetonitrile in 0.2% formic acid at a flow rate of 250 nl/min for 60 min and 30-50% acetonitrile in 0.2% formic acid for 5 min. High accuracy mass spectra were acquired with an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) in the mass range of m/z 300-1700 and a target value of 1 × 10^6 ions. Up to 20 data-dependent MS/MS spectra of the most intense ions with a charge state of 2+ or higher were recorded in parallel at the ion trap using collision-induced dissociation.
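The two-segment acetonitrile gradient described above can be written as a simple piecewise function. The sketch below assumes that both segments are linear ramps and that the second segment starts immediately after the first; neither detail is stated explicitly, and the function name is hypothetical.

```python
def acetonitrile_percent(t_min: float) -> float:
    """Percent acetonitrile at time t_min, assuming two consecutive linear ramps:
    10-30% over the first 60 min, then 30-50% over the next 5 min."""
    if not 0 <= t_min <= 65:
        raise ValueError("time must lie within the 65-min gradient")
    if t_min <= 60:
        return 10 + (30 - 10) * t_min / 60
    return 30 + (50 - 30) * (t_min - 60) / 5


for t in (0, 30, 60, 62.5, 65):
    print(f"{t:5.1f} min: {acetonitrile_percent(t):.1f}% acetonitrile")
```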
NMR Analysis-Unless indicated otherwise, samples were lyophilized and dissolved in D2O at 1 mM. All samples were measured at 303 K on a 500-MHz AVANCE III spectrometer equipped with an inverse triple-resonance cryogenic probe (Bruker). Standard two-dimensional 13C-1H heteronuclear single-quantum coherence (HSQC), 13C-1H heteronuclear multiple-bond correlation, and 1H-1H total correlation spectroscopy (TOCSY) (τmix = 13 and 120 ms) spectra were recorded to assign the peptide and glycan chemical shifts. In addition, a two-dimensional 13C-1H relayed HMQC-COSY spectrum (12,38) was measured to assist the assignment process. To detect through-bond correlations between glycan protons and the asparagine Hδ21, a two-dimensional 1H-1H TOCSY spectrum was recorded in 95% H2O and 5% D2O with a mixing time of 80 ms. All spectra are referenced to 2,2-dimethyl-2-silapentanesulfonic acid. 13C chemical shifts are indirectly referenced using a scaling factor (Ξ) of 0.251449530 (13). All spectra were processed with Topspin 2.1 (Bruker) and analyzed by Sparky (39).
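Indirect referencing with a scaling factor means that the 13C zero-ppm frequency is obtained by multiplying the 1H frequency of the reference compound by Ξ. The sketch below illustrates the calculation. The proton reference frequency of 500.13 MHz is a hypothetical value chosen for illustration (the text states only a 500-MHz instrument), and the function names are not taken from any NMR software.

```python
XI_13C = 0.251449530          # scaling factor quoted in the text
H1_REFERENCE_MHZ = 500.13     # hypothetical 1H frequency of the reference at 0 ppm


def carbon_zero_frequency(h1_reference_mhz: float, xi: float = XI_13C) -> float:
    """13C frequency (MHz) that corresponds to 0 ppm under indirect referencing."""
    return xi * h1_reference_mhz


def carbon_shift_ppm(observed_mhz: float, zero_mhz: float) -> float:
    """Chemical shift in ppm of a 13C signal relative to the indirect zero."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


zero_13c = carbon_zero_frequency(H1_REFERENCE_MHZ)
# A signal 82 ppm downfield of the zero, as an example of the reverse conversion.
example_signal = zero_13c * (1 + 82e-6)
print(f"13C zero frequency: {zero_13c:.5f} MHz")
print(f"Example signal: {carbon_shift_ppm(example_signal, zero_13c):.2f} ppm")
```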

RESULTS
A. pleuropneumoniae and Y. enterocolitica HMW1C Homologs Modify a DANYTK Peptide-To gain a comprehensive view of potential NGT proteins, we searched the genome database for bacteria encoding HMW1C homologs and identified a restricted group of candidate proteins (supplemental Fig. 1). Interestingly, many of the microorganisms that encode a putative NGT are pathogenic and cluster in the γ-proteobacteria group, but only some strains within the same species appear to carry an hmw1c-like gene. When we extended the search to retrieve low score hits, we found that HMW1C of H. influenzae displays low similarity to the C-terminal region of the product of the XCC0866 gene from X. campestris (24% identity within a 236-amino acid sequence). Although there is no exhaustive evidence of the activity of XCC0866 protein, it is considered to be a bona fide OGT due to the high sequence identity to its eukaryotic counterpart (10,14). OGT is an inverting glycosyltransferase that utilizes UDP-GlcNAc as a donor to modify hydroxyl groups of serine or threonine residues of nuclear, mitochondrial, and cytoplasmic proteins (15,16). Notably, HMW1C homologs are classified in the same family (GT41) as OGT in the CAZy Database, probably due to their low sequence identity.
We expressed the HMW1C homologs from Y. enterocolitica strain 8081, A. pleuropneumoniae strain L20, A. pleuropneumoniae strain AP76, and X. campestris ATCC 33913 in E. coli and purified them. To test the proteins for glycosyltransferase activity, we adapted an in vitro assay developed for analysis of OST activity (17). We incubated the purified proteins with UDP-Glc and the hexapeptide DANYTK labeled at the N terminus with the fluorescent dye TAMRA. After separation of the reaction products by Tricine/SDS-PAGE and detection of fluorescence signals, we observed that Y. enterocolitica and the two A. pleuropneumoniae homologs modified the TAMRA-labeled peptide, visualized by a shift in electrophoretic mobility (Fig. 1). By contrast, X. campestris OGT did not exhibit glycosyltransferase activity for this acceptor peptide in the presence of UDP-Glc, UDP-Gal, UDP-GlcNAc, or UDP-GalNAc under the experimental conditions tested (supplemental Fig. 2). In the following, we focused our study on the A. pleuropneumoniae strain AP76 enzyme.
A. pleuropneumoniae HMW1C Homolog Is an Inverting N-Glucosyltransferase-To prove glycosylation of the peptide directly and to characterize the site and chemical structure of the modification, we analyzed the glycosylation products using NMR spectroscopy. The assignment of the peptide and glycan was achieved using through-bond short- and long-range J couplings. The 13C-1H HSQC spectrum, a fingerprint of the glycopeptide, showed 13C-1H correlations, with the signals of the glycan being separated from those of the peptide (Fig. 2A). As is typical for a hexose, seven carbohydrate-specific signals were observed. The anomeric C1 13C chemical shift of 82 ppm was a strong indication of a linkage to nitrogen. Similar anomeric C1 shifts have been found in bacterial and eukaryotic N-glycans (18-20), in sharp contrast to chemical shifts around 100 ppm typical for O-glycosidic linkages. A comparison of all observed 1H and 13C chemical shifts of the glycan with those obtained for published model compounds revealed a striking coincidence with those of Glc-βAsn (supplemental Table 1) (21). With a perfect match of all six 13C chemical shift values and the single reported 1H chemical shift (22), we identified the Glc-βAsn structure. To confirm the linkage, we measured a J(H1,H2) coupling constant of 9.9 Hz typical for a β-configuration and recorded a two-dimensional TOCSY spectrum in H2O (Fig. 2B). Through-bond correlations between the asparagine side chain NH and glucose protons were clearly visible, with the asparagine side chain NH chemical shift at 8.88 ppm being similar to those reported for bacterial and eukaryotic N-glycans of 8.62-8.73 ppm (18,23). Thus, we established that the A. pleuropneumoniae HMW1C homolog is an inverting NGT.
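The two NMR criteria used in this argument (anomeric 13C shift for the linkage atom, H1-H2 coupling for the anomeric configuration) can be summarized as a rule of thumb. The sketch below is only an illustration of that reasoning: the cutoffs of 90 ppm and 7 Hz are assumptions chosen to separate the values discussed in the text (about 82 versus about 100 ppm, and a large trans-diaxial coupling versus the small coupling expected for an α-gluco anomer), and they apply only to gluco- or galacto-configured pyranoses.

```python
def classify_anomeric(c1_ppm: float, j_h1h2_hz: float) -> tuple[str, str]:
    """Rule-of-thumb reading of an anomeric signal.

    Illustrative cutoffs (assumptions, not values from the study):
    - C1 below 90 ppm suggests an N-glycosidic linkage, otherwise O-glycosidic.
    - J(H1,H2) of at least 7 Hz suggests a beta anomer, otherwise alpha.
    """
    linkage = "N-glycosidic" if c1_ppm < 90.0 else "O-glycosidic"
    anomer = "beta" if j_h1h2_hz >= 7.0 else "alpha"
    return linkage, anomer


# Values reported for the NGT product: C1 at 82 ppm, J(H1,H2) = 9.9 Hz.
print(classify_anomeric(82.0, 9.9))
```

Because the UDP-Glc donor carries an α-linked glucose, a β-linked product is what identifies the transfer as inverting.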
Next, we analyzed the donor specificity of this NGT in vitro. The enzyme transferred glucose or galactose, but not GlcNAc or GalNAc, to the DANYTK peptide (Fig. 2C). In the presence of a 100-fold molar ratio of donor to acceptor, the conversion to glycopeptide was quantitative in the presence of UDP-Glc, whereas it was marginal in the presence of UDP-Gal. Importantly, NGT glycosylated the peptide in the presence of EDTA, proving that glycosyl transfer does not require metal ions. We also monitored the products of the reaction by MS (Fig. 2D). Analysis of unmodified TAMRA-DANYTK resulted in two major species, matching with TAMRA-DANYTK (calculated molecular mass, 1122.19 Da; observed, 1122.49 Da) and a byproduct, TAMRA-(DANYTK)2 (calculated molecular mass, 1814.50 Da; observed, 1813.78 Da). After incubation with NGT and UDP-Glc, we detected species corresponding to TAMRA-DANYTK-Glc (calculated molecular mass, 1284.35 Da; observed, 1284.58 Da) and TAMRA-(DANYTK)2-Glc (calculated molecular mass, 1975.66 Da; observed, 1975.88 Da). A similar result was obtained after incubation with UDP-Gal. In all cases, we observed addition of a single hexose moiety to the asparagine residue.
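The conclusion that a single hexose was added follows from the mass difference between modified and unmodified species. The sketch below reproduces that check using the observed masses quoted above. The average mass of an anhydrohexose residue (C6H10O5, about 162.14 Da) is a standard value, whereas the 0.5-Da tolerance and the function name are assumptions made for illustration.

```python
HEXOSE_RESIDUE_DA = 162.14  # average mass of C6H10O5 (hexose minus water)

# Observed masses (Da) quoted in the text: (unmodified, after NGT + UDP-Glc).
OBSERVED = {
    "TAMRA-DANYTK": (1122.49, 1284.58),
    "TAMRA-(DANYTK)2": (1813.78, 1975.88),
}


def hexose_count(before: float, after: float, tol: float = 0.5):
    """Number of hexose residues that explains a mass shift, or None if no
    whole number of residues fits within the tolerance (in Da)."""
    shift = after - before
    n = round(shift / HEXOSE_RESIDUE_DA)
    if n < 0 or abs(shift - n * HEXOSE_RESIDUE_DA) > tol:
        return None
    return n


for name, (before, after) in OBSERVED.items():
    print(f"{name}: shift {after - before:.2f} Da, hexoses = {hexose_count(before, after)}")
```

The same function applied to the elongated products would return the number of glucose units added by a polymerizing transferase.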
A. pleuropneumoniae Encodes a Polymerizing Glucosyltransferase That Elongates N-Linked Glucose-We inspected the NGT-encoding genomic region of A. pleuropneumoniae strain AP76 (supplemental Fig. 3). This region contains genes encoding putative proteins involved in the uptake of mannitol and its conversion to glucosamine 6-phosphate, two isomerases, a nucleosidase, and a methylthiotransferase. Interestingly, the ORF next to the NGT gene encodes a putative glycosyltransferase (APP7_1696). We expressed this protein with a C-terminal tag in E. coli. When we incubated the purified protein with the TAMRA-labeled product of the NGT reaction, we detected a mobility shift upon Tricine/SDS-PAGE indicative of elaboration of the glycopeptide (Fig. 3A). This modification occurred in the presence of UDP-Glc, but not in the presence of UDP-Gal, UDP-GlcNAc, or UDP-GalNAc. The glucosyltransferase activity appeared to be cation-independent (supplemental Fig. 4). We analyzed the reaction product by MS and found an addition of two glucose moieties in the presence of a 1:100 acceptor/donor ratio (Fig. 3B). Notably, we observed addition of up to six glucose units in the presence of an excess of the donor and with increasing amounts of glucosyltransferase (supplemental material).
To determine the chemical structure and stereochemistry of the reaction product, we analyzed the glycopeptide by NMR spectroscopy. The 13C-1H HSQC spectrum displayed signals of three different glucose units (Fig. 3C). Two new signals appeared in the anomeric region at ~100 ppm, in addition to the previously observed signal at ~82 ppm originating from the N-linked glucose (supplemental Fig. 6). The signals were assigned with a two-dimensional TOCSY spectrum (Fig. 3D) and 13C-1H long-range correlations via J couplings. The first set of signals belonging to the N-linked glucose displayed a C6 chemical shift of 68.3 ppm that differed from that of the initial glycopeptide harboring a single glucose unit (C6, 63.3 ppm). This was indicative of a carbohydrate attachment at O6. The signals of a second glucose unit originated from a terminal glucose (C6, 63.2 ppm), whose chemical shifts coincided with those of Glc-α1,6Glc. The third set of signals displayed chemical shifts of a bridging glucose unit that is α1,6-linked on either side. Chemical shifts of Glc-α1,6Glc-α1,6Glc reported previously (24,25) fitted perfectly the experimental data of the terminal and bridging glucoses, providing strong evidence for α1,6-linked glucose residues (supplemental Table 1). Chemical shifts calculated with the algorithm CASPER (26) further supported the assignment. We concluded that the APP7_1696 gene encodes a polymerizing α6GlcT that elaborates the product of the NGT reaction.
NGT Exhibits Acceptor Site Specificity Overlapping OST-We analyzed glycosylation of the SIVNPGGSNLTYIER peptide present in the yeast glycoprotein lysophospholipase 2 (Plb2). After incubation with NGT and UDP-Glc, analysis of the reaction products by MS indicated that the peptide was modified (Fig. 4A, upper panels). The fragmentation spectrum of the ion at m/z 1882.93 was consistent with glucosylation of the asparagine within the NLT site (supplemental Fig. 7). Importantly, alteration of the NLT sequon to QLT or NPT abolished glycosylation (Fig. 4A), demonstrating that NGT resembles OST with respect to acceptor sequon specificity (27).
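The sequon rule tested here (asparagine, then any residue except proline, then serine or threonine) is straightforward to express as a pattern search. The sketch below is a minimal illustration using Python's standard `re` module; the function name is hypothetical, and the two variant peptides are constructed by substituting the NLT site of the Plb2 peptide as described above.

```python
import re

# N, followed by any residue except P, followed by S or T. The lookahead keeps
# the match to the asparagine alone so that overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide: str) -> list[int]:
    """Return 1-based positions of asparagines within N-X-(S/T) sequons, X != P."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide.upper())]


for peptide in ("SIVNPGGSNLTYIER", "SIVNPGGSQLTYIER", "SIVNPGGSNPTYIER", "DANYTK"):
    print(peptide, find_sequons(peptide))
```

A match is a necessary condition only: the presence of a sequon says nothing about peptide length or context, which may also influence whether a site is modified.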
We extended our analysis and examined glycosylation of a group of model peptides with the NX(S/T) consensus sequence in different positions (N terminus, central, C terminus). We chose sequences found to be glycosylated in yeast glycoproteins (Fig. 4B). Glycosylation at the NX(S/T) site was observed for all peptides (supplemental Fig. 8), indicating that the NX(S/T) consensus sequence was recognized by NGT. Although our analysis by MS/MS was not quantitative, the position of the consensus sequence within the peptide did not seem to affect glycosylation, and we did not observe a sequence preference for Ser or Thr at position +2. Importantly, we did not detect glycosylation of short peptides such as DQNAT and DFNVT (data not shown), known substrates identified in vitro for bacterial OST from Campylobacter jejuni (28).
To further probe the acceptor site specificity of NGT, we tested glycosylation of the AcrA protein from C. jejuni, a substrate of bacterial and eukaryotic OSTs (29). MS analysis revealed that NGT was able to modify all four glycosites present in AcrA with a glucose moiety (Fig. 4C and supplemental Fig. 9).
Altogether, these experiments proved that NGT and OST share the basis for recognition of acceptor substrates. Moreover, glycosylation of AcrA demonstrated that NGT operates on folded proteins.

DISCUSSION
In this work, we have provided direct proof that the soluble enzyme NGT glycosylates asparagine side chains of peptides and proteins. The modified asparagines are found within the NX(S/T) sequence. Proline is not tolerated in the X position, and asparagine cannot be replaced by glutamine. Any amino acid is allowed in positions next to the sequon. NGT shows preference for long polypeptides, indicating that binding of the substrate to the enzyme might require an extended contact surface. At the same time, modification of different NX(S/T) sequences on one protein suggested that NGT does not recognize a specific target but acts as a key component of a general protein glycosylation system.
NGT transfers one glucose unit from UDP-glucose to acceptor sites, forming ␤-glycosidic linkages in a metal ion-independent manner. UDP-galactose appears to be accepted as a substrate donor, albeit with low efficiency.
These features mark a clear distinction from the conventional N-glycosylation, wherein OST is localized in a membrane, uses a complex lipid-bound oligosaccharide as a donor, requires a metal ion for catalysis, and exclusively modifies proteins trafficking along the secretory pathway. Moreover, the two enzymes are structurally unrelated, as NGT is predicted to have a GT-B fold, whereas OST clusters in the GT-C group (30). Given these premises, it is remarkable that NGT and OST exhibit the same NX(S/T) acceptor sequence specificity. The amino acid at position +2 appears not to be essential for catalysis by OST, but is rather involved in binding of the acceptor substrate by the enzyme (31). Indeed, both NGT and OST can glycosylate sites that do not contain Ser or Thr at position +2 (7,32-34). It will be important to determine the substrate recognition mode of NGT, the biophysical properties of the consensus sequence, and the mechanisms that selected this sequon for general glycosylation systems in two convergent evolutionary processes. In this context, it is interesting to note that the glycoproteins produced by the cytoplasmic glycosylation system of H. influenzae are secreted proteins.
Cytosolic N-glycosylation does not appear to be a single-monosaccharide modification, as the N-glucose can be extended with α1,6-linked glucose in A. pleuropneumoniae. The enzyme responsible for glucose transfer is considered a distant relative of family GT4 glycosyltransferases based on sequence similarity. Remarkably, whereas we confirmed that the enzyme exhibits a retaining mechanism, its polymerase activity appears to be a novel feature of a member of this class of glycosyltransferases.
FIGURE 5. N-linked protein glycosylation in A. pleuropneumoniae. NGT recognizes NX(S/T) sequences within polypeptides and glycosylates asparagine residues. NGT is an inverting glycosyltransferase with specificity for UDP-glucose as a donor substrate. The N-linked glucose can be elongated by a polymerizing α6GlcT, which is sensitive to the UDP-glucose concentration.
In summary, this study illustrates a cytoplasmic pathway of N-linked protein glycosylation (Fig. 5) whose properties overlap with those of the conventional N-glycosylation process. Future research will have to address the mechanism, extent, and physiological roles of this protein modification. Furthermore, as NGTs are encoded in the genomes of relevant pathogens, it will be interesting to study how the immune system of the host reacts to this type of N-glycosylation. Finally, NGTs represent a promising tool that complements established platforms for production of glycoconjugates and for site-specific modification of proteins (35-37).