Characterization of a cytoplasmic glucosyltransferase that extends the core trisaccharide of the Toxoplasma Skp1 E3 ubiquitin ligase subunit

Skp1 is a subunit of the SCF (Skp1/Cullin 1/F-box protein) class of E3 ubiquitin ligases that are important for eukaryotic protein degradation. Unlike its animal counterparts, Skp1 from Toxoplasma gondii is hydroxylated by an O2-dependent prolyl-4-hydroxylase (PhyA), and the resulting hydroxyproline can subsequently be modified by a five-sugar chain. A similar modification is found in the social amoeba Dictyostelium, where it regulates SCF assembly and O2-dependent development. Homologous glycosyltransferases assemble a similar core trisaccharide in both organisms, and a bifunctional α-galactosyltransferase from CAZy family GT77 mediates the addition of the final two sugars in Dictyostelium, generating Galα1,3Galα1,3Fucα1,2Galβ1,3GlcNAcα1-. Here, we found that Toxoplasma utilizes a cytoplasmic glycosyltransferase from an ancient clade of CAZy family GT32 to catalyze transfer of the fourth sugar. Catalytically active Glt1 was required for the addition of the terminal disaccharide in cells, and cytosolic extracts catalyzed transfer of [3H]glucose from UDP-[3H]glucose to the trisaccharide form of Skp1 in a glt1-dependent fashion. Recombinant Glt1 catalyzed the same reaction, confirming that it directly mediates Skp1 glucosylation, and NMR demonstrated formation of a Glcα1,3Fuc linkage. Recombinant Glt1 strongly preferred the full core trisaccharide attached to Skp1 and labeled only Skp1 in glt1Δ extracts, suggesting specificity for Skp1. glt1-knock-out parasites exhibited a growth defect not rescued by catalytically inactive Glt1, indicating that the glycan acts in concert with the first enzyme in the pathway, PhyA, in cells. A genomic bioinformatics survey suggested that Glt1 belongs to the ancestral Skp1 glycosylation pathway in protists and evolved separately from related Golgi-resident GT32 glycosyltransferases.


Skp1 is an adaptor subunit of the Skp1/Cullin-1/F-box protein (SCF) class of E3 ubiquitin ligases that target proteins for polyubiquitination and degradation via the 26S proteasome (1). In the agent for toxoplasmosis, Toxoplasma gondii, Skp1 is hydroxylated by the cytoplasmic prolyl 4-hydroxylase PhyA at Pro-154 (2) and subsequently modified by a linear pentasaccharide (3). In the social amoeba Dictyostelium, Skp1 is also modified by hydroxylation and a pentasaccharide, which represents a novel form of SCF regulation (4). Biochemical and interactome studies indicate that full glycosylation of Skp1 promotes association with three different F-box proteins (FBPs) (5,6) which, in a developmentally regulated manner, is associated with their reduced steady-state levels in cells. Many FBPs are substrate receptors for ligands whose polyubiquitination controls their abundance, whereas others are considered to have enzymatic or other functions (2). FBPs possess a 40-amino acid F-box domain that binds to the C-terminal region of Skp1 (7,8). Recent studies show that glycosylation influences the organization and range of motions of this region of Skp1, in part by hydrogen bonding along the polypeptide in cis (9). Glycosylation is regulated by PhyA, whose action on Skp1 is rate-limited by O2 availability in the cell (6,10). This biochemical mechanism underlies cellular O2 sensing, which controls the slug-to-fruit switch and sporulation during starvation-induced development (4). O2 signifies positional information in the native soil environment of Dictyostelium, and sensing O2 is key for the ability of developing cells to navigate to the soil surface for fruiting body formation (11).
PhyA is important for Toxoplasma tachyzoite proliferation on cultured human fibroblasts (12), which involves successive cycles of invasion, intracellular replication, egress, and reinvasion to form cell-free plaques. Glycosylation of Toxoplasma Skp1 is important too, because disruption of genes that mediate the addition of the first three monosaccharides also results in reduced parasite growth (3). However, the significance of the full pentasaccharide has not been examined, because the gene that mediates the addition of the final two sugars on the Dictyostelium pentasaccharide, agtA (13), is evidently absent from the Toxoplasma genome (14).
To investigate the function of the terminal disaccharide of the Toxoplasma Skp1 glycan, we searched for glycosyltransferase-like genes in the parasite's genome whose protein products are predicted to reside in the cytoplasm or nucleus and whose phylogenetic distribution correlates with the presence of the pgtA gene that mediates the addition of the second and third monosaccharides (15). This search netted two genes, not found in the amoebozoa, including Dictyostelium. Characterization of one of these (TGGT1_205060), which we named glt1, revealed a previously non-annotated gene that encodes a cytoplasmic glucosyltransferase from CAZy glycosyltransferase family 32 that modifies the Fuc terminus of the Skp1 trisaccharide. Gene disruption and complementation studies show that the fourth sugar contributes to Skp1 functionality in parasite growth. Thus, these unrelated protists employ different mechanisms to assemble a related but distinct pentasaccharide to regulate, presumably, the critical process of protein turnover in cells. A deeper understanding of the enzymes involved may offer unique strategies to control toxoplasmosis, which affects a large fraction of the world's human population and for which control of acute and chronic phases is lacking (16).

Prediction of candidate TgSkp1-modifying glycosyltransferases
To identify candidate proteins for catalyzing the addition of the fourth and fifth Skp1 monosaccharides, genome sequence databases were searched for predicted glycosyltransferase domains that reside in the cytoplasm and exist only in protists whose genomes harbor other predicted Skp1 modification pathway enzymes, phyA, gnt1, and pgtA, but not agtA, as outlined in Fig. 1. Briefly, the predicted proteome (8460 proteins) of the T. gondii Type I GT1 strain was searched using (i) the SUPERFAMILY server, which assigns protein domains at the SCOP "superfamily" level using hidden Markov models; (ii) dbCAN, an automated carbohydrate-active enzyme annotation database, which utilizes a CAZyme signature domain-based annotation based on a CDD (conserved domain database) search, literature curation, and a hidden Markov model; and (iii) the Pfam database. The resulting 45 sequences were scanned by SignalP version 4.1 and TMHMM servers for signal sequences or transmembrane domains, which yielded 10 candidate cytoplasmic glycosyltransferases. Among those, TgGnt1 and TgPgtA were already identified as TgSkp1-modifying glycosyltransferases (3). The remaining eight sequences were subjected to BlastP analysis against the NCBI non-redundant database to search for their phylogenetic co-distribution with gnt1 and pgtA. This yielded two candidates, one from CAZy family GT32, TGGT1_205060, and another from CAZy family GT8. The GT32 sequence, here referred to as Glt1, is the subject of this study.
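The filtering funnel described above (domain hit, then exclusion of secretory-pathway signals, then co-distribution with gnt1 and pgtA) can be sketched as a pair of list filters. This is only an illustrative sketch: the `Protein` record, its boolean fields, and the `toy_*` identifiers are hypothetical stand-ins for the outputs of the servers named in the text, not real annotations or the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    gene_id: str            # hypothetical toy identifier, not a real ToxoDB ID
    gt_domain: bool         # glycosyltransferase domain found by any domain search
    signal_peptide: bool    # predicted signal sequence
    transmembrane: bool     # predicted transmembrane domain
    codistributed: bool     # co-occurs phylogenetically with gnt1 and pgtA
    known_skp1_enzyme: bool = False  # already characterized (e.g. Gnt1, PgtA)


def cytoplasmic_gts(proteome):
    """Keep glycosyltransferase-like proteins lacking secretory-pathway signals."""
    return [p for p in proteome
            if p.gt_domain and not (p.signal_peptide or p.transmembrane)]


def new_candidates(cytoplasmic):
    """Drop known pathway enzymes, then require co-distribution with gnt1/pgtA."""
    return [p for p in cytoplasmic
            if not p.known_skp1_enzyme and p.codistributed]


# Toy input (assumed values chosen only to exercise each filter step).
toy_proteome = [
    Protein("toy_A", True, False, False, True, known_skp1_enzyme=True),
    Protein("toy_B", True, True, False, False),
    Protein("toy_C", True, False, False, True),
    Protein("toy_D", True, False, False, False),
    Protein("toy_E", False, False, False, False),
    Protein("toy_F", True, False, True, True),
]

cytoplasmic = cytoplasmic_gts(toy_proteome)
candidates = new_candidates(cytoplasmic)
print([p.gene_id for p in cytoplasmic])
print([p.gene_id for p in candidates])
```

In the study itself the corresponding counts were 45 domain hits, 10 predicted cytoplasmic glycosyltransferases, 8 after removing TgGnt1 and TgPgtA, and 2 final candidates.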

Glt1 sequence characteristics and phylogeny
In the Toxoplasma database ToxoDB V29, TGGT1_205060 (type 1 GT1 strain) is annotated as a 605-amino acid protein encoded by five exons that is conserved in ME49 (type II) and VEG (type III) strains with 99% nucleotide sequence identity. The protein contains a DXD motif typical of and essential for CAZy GT32 family and superfamily A glycosyltransferases (17). Proteomics analyses of Toxoplasma, as reported on ToxoDB, identified five Glt1 peptides, one of which was phosphorylated at Ser-601 (supplemental Fig. S1), confirming that the gene is expressed in tachyzoites grown in fibroblasts. To investigate the evolutionary origin of Glt1, 68 related and representative sequences from CAZy family GT32 were aligned according to their catalytic domains and subjected to a phylogenetic analysis using a maximum likelihood algorithm (Fig. 2). Glt1 was found in clade 2, which was almost uniquely populated by sequences from other protists that possess a CAZy family GT74-like sequence related to PgtA, which mediates the addition of the third Skp1 monosaccharide, upon which Glt1 is postulated to act. (Instances of GT74 association with sequences in other clades merely represent a second GT32-like sequence in a clade 2 organism.) The sole exception was a bacterial sequence that lacks the Skp1 glycosylation pathway. All sequences in this clade are predicted to encode cytoplasmic proteins based on the absence of signal peptide or anchor sequences (Fig. 2). Thus, they potentially modify Skp1, which also resides in this compartment and possesses the equivalent of Pro-154 in at least one

Figure 2 legend (partial): Local support values for the branch splits are shown. Known enzymatic activity for experimentally characterized sequences is provided in black after the leaf name as per the CAZy database. Teal-colored circles indicate the presence of a GT74 fucosyltransferase-related protein in the species; empty circles indicate its absence. Orange coloration indicates evidence for a signal peptide or signal anchor sequence expected to direct the protein to the secretory pathway.

A novel cytoplasmic glucosyltransferase
exclusive to this clade, which support the uniqueness of the Glt1 clade. The other two major clades contain sequences from all kingdoms, including bacteria, whose members are usually cytoplasmic as is common for bacterial glycan biogenesis, and protists/fungi/metazoan/chlorophytes, whose members are usually localized to the secretory pathway. Known functions of clade 1 of Fig. 2 include α4Gal- and α4GalNAc-transferases, whereas known functions in clade 3 include α6-Man-transferases. This analysis suggests that clade 2 with predominantly Glt1-like sequences diverged from the other two clades during the earliest radiation of the GT32 family and might have evolved to uniquely modify Skp1. Glt1 is likely to be a retaining enzyme that attaches a D-sugar in α-linkage, but the identity of the sugar cannot be predicted.

Glt1 is required for TgSkp1 glycosylation
To determine whether glt1 is involved in Skp1 glycosylation, the gene was replaced by double-crossover homologous recombination in the type 1 RHΔΔ strain, as illustrated in Fig. 3A. glt1 deletion mutants were confirmed by negative PCRs for glt1 coding DNA and positive PCR products for the insertion of the selection marker hxgprt between glt1 flanking sequences (Fig. 3B). To control for off-target genetic modifications, a complementation construct containing a version of the original genomic DNA was used to replace the hxgprt locus in glt1 disruption clone-1 using counterselection for loss of hxgprt (Fig. 3A). The same set of PCRs was used to confirm the desired gene restoration in clonal isolates (Fig. 3C). The same clone was also transformed with a mutant version in which three conserved Asp residues, including the DXD motif at positions 363 and 365 and another at codon 348, were changed to generate a potentially inactive mutant Glt1(D363A/D365A/D348N). Strains are listed in Table 1.
Western blot analysis of whole cells suggested that Skp1 from parasites lacking glt1 (Fig. 4A, lane 3) migrates slightly more rapidly than wild-type Skp1 (lanes 1 and 6) but slower than Skp1 from a phyAΔ strain (lane 2). Complementation with wild-type glt1 (lane 4), but not the triple mutant (lane 5), restored normal mobility. This confirmed the specificity of the genetic disruption to glt1 and further implicated Glt1 as a Skp1 glycosyltransferase. To confirm an effect on Skp1 glycosylation and to pinpoint its location, Skp1 was immunoprecipitated, converted to peptides using trypsin, and analyzed by nanoLC/MS-MS in an Orbitrap Fusion mass spectrometer. As shown in the extracted ion chromatograms from a sample from the glt1-complemented strain (Fig. 4B, left, dashed box 1), doubly and triply charged ions with exact mass matches to the peptide (¹⁴⁵IFNIVNDFTPEEEAQVR¹⁶¹) that bears the known modification site, Pro-154, were readily detected. As expected, the pentasaccharide form was also readily detected at an earlier elution time (Fig. 4E, left, dashed box 2), as reported previously for the normal parental strain (3) and shown in Table 2. In addition, a very low level of the disaccharide form was apparent in Fig. 4C, but no trisaccharide form was detected (Fig. 4D). In contrast, Skp1 from a glt1Δ clone yielded no detectable pentasaccharide but abundant trisaccharide (box 3), as well as the unmodified peptide and a minor level of the disaccharide form.
The doubly charged version of this putative dHex-Hex-HexNAc-peptide (Fig. 5A) was subjected to further CID analysis, yielding a series of cleavage products confirming the order of sugar residues (Fig. 5B and Table 2) and the sequence of the peptide and its attachment, as expected, at a hydroxylated form of Pro-154 (Fig. 5C). In a clone in which glt1Δ was complemented with mutated (D363A/D365A/D348N) glt1 genomic DNA, the profiles resembled those of the glt1Δ strain (Table 2), demonstrating the importance of its predicted glycosyltransferase activity. These findings implicate Glt1 as the glycosyltransferase that catalyzes the addition of the fourth sugar and indicate that the addition of the fifth sugar depends on the fourth.
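The exact-mass match behind the trisaccharide assignment can be checked with a short calculation. The sketch below sums standard monoisotopic residue masses for the tryptic peptide 145-161, adds one oxygen for prolyl hydroxylation and the dHex, Hex, and HexNAc residues, and converts the result to m/z for a doubly protonated ion. The function name and the restricted mass table are illustrative choices, not part of the original analysis.

```python
# Monoisotopic residue masses (Da) for the amino acids present in the peptide.
RESIDUE_MASS = {
    "I": 113.084064, "F": 147.068414, "N": 114.042927, "V": 99.068414,
    "D": 115.026943, "T": 101.047679, "P": 97.052764, "E": 129.042593,
    "A": 71.037114, "Q": 128.058578, "R": 156.101111,
}
# Monoisotopic masses (Da) of glycosidically linked sugar residues.
SUGAR_MASS = {"HexNAc": 203.079373, "Hex": 162.052824, "dHex": 146.057909}
WATER = 18.010565    # added once for the free peptide termini
OXYGEN = 15.994915   # hydroxylation of proline
PROTON = 1.007276


def glycopeptide_mz(sequence, sugars, charge, hydroxylated=True):
    """Return the m/z of a protonated (hydroxy)glycopeptide ion."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if hydroxylated:
        mass += OXYGEN
    mass += sum(SUGAR_MASS[s] for s in sugars)
    return (mass + charge * PROTON) / charge


peptide = "IFNIVNDFTPEEEAQVR"
mz = glycopeptide_mz(peptide, ["HexNAc", "Hex", "dHex"], charge=2)
print(f"[M+2H]2+ of the trisaccharide glycopeptide: {mz:.4f}")
```

The computed value agrees with the m/z of 1274.5954 reported for the doubly charged precursor to within about 1 mDa.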

Toxoplasma expresses a glt1-dependent Skp1 glucosyltransferase activity
To characterize the implied activity of Glt1, exogenous radioactive sugar nucleotide donors and Skp1 glycoforms were introduced into soluble parasite extracts in an attempt to recapitulate reactions in the cell. Because Toxoplasma and Dictyostelium apparently share the same core trisaccharide on their Skp1s, and their Skp1 sequences are highly conserved and serve as substrates for each other's PhyA, Gnt1, and PgtA enzymes (3,12), we used recombinantly expressed Dictyostelium FGGn-Skp1 (13) as a surrogate acceptor substrate. In vitro reactions were performed with cytosolic extracts in the presence of UDP-[3H]Gal, based on the occurrence of Gal in Dictyostelium Skp1 (Fig. 6A). Substantial activity was detected, which was time- and Skp1-dependent and absent from glt1Δ extracts. To confirm the chemical identity of the transferred 3H, the radioactive Skp1 band was excised after electroblot transfer to a PVDF membrane and subjected to acid hydrolysis and high-pH anion exchange chromatography. Surprisingly, radioactivity co-chromatographed with D-Glc (Fig. 6B). New reactions using UDP-[3H]Glc revealed more efficient incorporation (Fig. 6A), whose product also co-eluted with the Glc standard (data not shown). The incorporation of 3H from UDP-[3H]Gal was probably the result of the action of TGGT1_225880, a putative UDP-Glc 4-epimerase in the extract, which would explain the lower incorporation of 3H from this substrate. Therefore, Glt1 is inferred to utilize UDP-Glc as its substrate.

Activity of recombinant Glt1
To test whether Glt1 is capable of directly glycosylating FGGn-Skp1, a codon-optimized Toxoplasma glt1 cDNA (supplemental Fig. S1) was expressed in and purified from Escherichia coli as an N-terminally His6-tagged protein. After enrichment over a Co2+-Talon column (Fig. 7A), Western blot analysis of fractions from a Q-anion exchange column revealed an anti-His6-reactive band at the expected Mr value of 69,500 (Fig. 7C). To simplify detection of enzymatic activity, a synthetic small molecule, FGGn-pNP, was used as a surrogate for FGGn-Skp1, and its validity is described below. A glucosyltransferase activity catalyzing the transfer of 3H from UDP-[3H]Glc to FGGn-pNP coeluted with the 69,500 band (Fig. 7B). Inspection of a parallel gel stained for total protein with Coomassie Blue indicated a prominent band that co-migrated with the His6 band (Fig. 7D), suggesting that the Glt1 protein was a major component of the preparation. The great majority of Glt1 was full-length (not shown), and densitometry indicated that Glt1 represented 14% of the protein in fraction 21 (not shown). Pilot studies showed that the transferase activity was stable on ice for days and to freeze-thawing, most active in the presence of bovine serum albumin, unaffected by NaCl concentration over the range of 10-400 mM, and more active at increasing pH values up to the highest value tested (pH 8.5). Activity was blocked by the addition of EDTA, consistent with the importance of the DXD sequence for divalent cation coordination. A recombinant version of the triple (D348N/D363A/D365A) mutant described above, as well as each of the individual point mutants, expressed well but were inactive (data not shown). MnCl2, but not MgCl2, supported activity, and 2 mM was sufficient for maximal activation. These data guided the design of the standard reaction to examine the substrate specificities of the enzyme.
Most glycosyltransferases are capable of specifically hydrolyzing their donor substrates by transferring the sugar to water in the absence of an appropriate acceptor (18). A screen for the ability of purified His6-Glt1 to hydrolyze six different UDP-sugar donors, based on generation of UDP, revealed strong selectivity for UDP-Glc (Fig. 8A), which was consistent with the transferase activity in cytosolic extracts (Fig. 6). GDP-Man was not a substrate in a glycosyltransferase assay utilizing FG-Bn, which mimics the non-reducing terminal disaccharide of the Skp1 glycan, using a variation of the UDP-Glo assay that measures the generation of GDP or UDP (Fig. 8B). Kinetic analysis of UDP generation yielded a Km for UDP-Glc of 7.5 ± 2.3 μM. Transferase activity toward FG-Bn was also monitored as the incorporation of 3H from UDP-[3H]Glc into FG-Bn. These data confirmed the expected inverse hyperbolic dependence on UDP-Glc concentration and yielded an apparent Km of 6.0 ± 0.77 μM (Fig. 8C), consistent with the value from the other assay. UDP was a more potent inhibitor than GDP of this transferase reaction (data not shown), confirming that GDP-Man is not a substrate. A comparison of FG-Bn and FGGn-pNP acceptors showed markedly improved time-dependent activity toward the full trisaccharide (Fig. 8D), although an effect of the different aglycon moieties cannot be excluded. Fuc-pNP alone was inactive as an acceptor, as were various other mono- and disaccharides representing facets of the Skp1 or related glycans (Fig. 8E). Glt1 transferase activity toward varied concentrations of FGGn-pNP yielded a Km of 13 mM and a Vmax of 330 nmol/h/μg, or about 6 nmol of Glc/nmol of Glt1/s (Fig. 8F). To compare with the native substrate, Toxoplasma Skp1 was co-expressed in E. coli with Dictyostelium PhyA and Gnt1, purified to near homogeneity under non-denaturing conditions, and modified to completion ex vivo using Dictyostelium PgtA. Activity toward Toxoplasma FGGn-Skp1 yielded an apparent Km over 3 orders of magnitude lower than toward FGGn-pNP, 4.2 μM, and an apparent Vmax that was reduced by about 200-fold (Fig. 8G). Thus, at the low micromolar concentrations expected in the cell, Glt1 is calculated to exhibit a strong preference for the FGGn-trisaccharide when attached to Skp1 compared with unknown potential aglycons.
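The conversion from the specific activity toward FGGn-pNP to a turnover number can be written out explicitly. The sketch below assumes that the Vmax of 330 nmol/h is expressed per microgram of Glt1 protein and that the Mr of 69,500 of the His6-tagged protein applies; the function name is illustrative.

```python
def turnover_per_second(vmax_nmol_per_h_per_ug, mr):
    """Convert a specific activity (nmol product/h/ug enzyme) to turnovers/s.

    Assumes the activity is normalized to micrograms of the enzyme itself.
    """
    # 1 ug of enzyme = 1e-6 g / Mr (g/mol) = 1e3 / Mr nmol
    nmol_enzyme_per_ug = 1e3 / mr
    per_hour = vmax_nmol_per_h_per_ug / nmol_enzyme_per_ug
    return per_hour / 3600.0


kcat = turnover_per_second(330.0, 69_500.0)
print(f"apparent turnover number: {kcat:.2f} per second")
```

Under these assumptions the result is close to the value of about 6 nmol of Glc per nmol of Glt1 per second given in the text.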
To address the question of whether Glt1 has cellular targets other than Skp1, we searched for substrates that might accumulate in glt1Δ parasites and be susceptible to glucosylation in extracts. This was tested by incubation of desalted wild-type or glt1Δ parasite extracts in the presence of His6-Glt1 and UDP-[3H]Glc, followed by separation of the entire reaction mixture by SDS-PAGE and scintillation counting of gel slices. As shown in Fig. 8H, a high level of incorporation of 3H was observed at the position of Skp1 in the glt1Δ extract. Incorporation was dependent on the addition of His6-Glt1 and was not observed in parental (RHΔΔ) extracts, consistent with Skp1 being already modified before cell extraction. Minimal incorporation was detected at higher Mr positions, but it was also observed in the controls. Thus, within the sensitivity of the method and assuming that potential other Glt1 substrates are accessible and not alternatively processed in its absence, Glt1's main or only substrate is Skp1.

Linkage position of Glc
To characterize the glycosidic linkage of the Glc residue, FGGn-pNP was used as the acceptor substrate in a scaled up reaction with Glt1, and the glycan product was recovered by solid phase extraction on a C18 Sep-Pak cartridge and analyzed by NMR. A 1D 1H NMR spectrum revealed that the sample was of high purity but also contained many overlapping peaks in the proton dimension (Fig. 9A, between ~3.5 and 5.5 ppm). Much of the peak overlap in the proton dimension was resolved in the two-dimensional 1H-13C HSQC spectrum, which was provisionally assigned using the CASPER program (19) (Fig. 9B). Assignments were confirmed by analysis of the 2D COSY, TOCSY, and HMBC spectra (data not shown). The HMBC spectra of the Glt1 reaction product also revealed through-bond connectivities between the anomeric carbons and the ring protons (Fig. 9C) and between the ring carbons and anomeric protons (Fig. 9D), clearly demonstrating the glycosidic linkage between the terminal αGlc and underlying αFuc as 1→3. Taken together, the NMR analyses are most consistent with the Skp1 glycan structure: Glcα1,3Fucα1,2Galβ1,3GlcNAcα1-. Thus, Glt1 is a UDP-Glc:fucoside α1,3-glucosyltransferase.

MS detection of Skp1 glycopeptides in strains
Isoforms of the Skp1 peptide ¹⁴⁵IFNIVNDFTPEEEAQVR¹⁶¹ were detected as described in Fig. 4. The distribution of raw ion counts among the detected isoforms is shown for the strains analyzed.

Glt1 is important for Toxoplasma proliferation
The ability of the parasite to infect and proliferate on a monolayer culture of fibroblasts, as measured from the area of clearance (plaques) of cells, is a model for potential virulence in animals. Confluent monolayers were infected with parental, mutant, and complemented parasites, and plaque areas were analyzed 5.5 days later. As shown in Fig. 10, the average plaque area generated by glt1Δ clones is reduced compared with that of parental (RHΔΔ) cells, but larger than that of phyAΔ strains that lack the entire modification on Skp1. There was no evidence for reduced plating efficiency (data not shown). Genetic complementation with the original glt1 sequence at the same locus restored normal growth, verifying that the growth defect in the original disruption strain was due to disruption of glt1. In contrast, complementation with the enzymatically inactive mutant version of glt1 failed to rescue the growth defect of glt1Δ cells. Therefore, the slow growth of the strain can be attributed to loss of the enzymatic activity itself rather than another potential function of Glt1.

Figure 5 legend (partial): ... Fig. 4, whose m/z (1274.5954) matched that expected of IFNIVNDFTP(HexNAc-dHex-Hex)EEEAQVR. B, CID fragmentation of the doubly charged precursor ion yields a sequential loss of monosaccharide residues corresponding to dHex, Hex, and HexNAc, indicating the presence of a linear trisaccharide. C, the full CID fragmentation spectrum showing b (blue annotations) and y (red annotations) ion series that match the predicted peptide sequence, as illustrated in the inset. The glycan is linked via a hydroxylated derivative of Pro-154. Peptides with residual sugars are annotated in green.

Discussion
glt1Δ parasites exhibit a growth defect in fibroblast cultures. The effect was specific for glt1, as complementation with glt1 restored normal growth. Failure to complement with a catalytically inactive sequence demonstrated the importance of the catalytic activity of Glt1. The deficit was not as strong as that of disrupting phyA, an earlier gene in the Skp1 modification pathway. Key to understanding how glt1 contributes to parasite growth, and interpreting the difference between disrupting phyA and glt1, is characterizing the biochemical contributions of the glt1 gene product.

Glt1 is a novel α3-glucosyltransferase
Here we find that Glt1, an enzyme from the CAZy GT32 family of retaining glycosyltransferases, mediates the addition of the fourth sugar on Toxoplasma Skp1. This enabled us, in turn, to infer the sugar to be the pyranose form of D-Glc in α-linkage to the 3-position of the underlying Fuc. This conclusion is based on (i) the mass spectrometric analysis of Skp1 that shows that disruption of glt1 leads to accumulation of the truncated trisaccharide form of Skp1 (FGGn-Skp1), suggesting an inability to transfer the fourth sugar, a residue of hexose (Figs. 4 and 5 and Table 2); (ii) the ability of parasite cell extracts to transfer Glc from UDP-α-D-Glcp to FGGn-Skp1 by a mechanism that depends upon glt1 (Fig. 6); and (iii) the ability of recombinant His6-Glt1 (Fig. 7) to directly, specifically, and efficiently catalyze the addition of Glc to the Skp1 trisaccharide (FGGn-Skp1) and synthetic glycan models (Fig. 8). UDP-Glc was the only UDP-sugar, of the six tested, to be efficiently hydrolyzed by purified Glt1 (Fig. 8A). GDP-Man was not a substrate (Fig. 8B). Furthermore, UDP-Glc is likely to be the native substrate of Glt1 due to its ability to efficiently transfer Glc to the synthetic acceptor FGGn-pNP (Fig. 8F) at a calculated turnover number of about 6/s at maximal velocity, which is rapid for a glycosyltransferase. The Glt1 protein sequence is related to the CAZy GT32 family, and the loss of enzymatic activity of point mutants that inactivate other members of this family supports this association (data not shown). The CAZy GT32 family includes several characterized retaining glycosyltransferases that use either UDP-Gal or GDP-Man, although none are known to utilize UDP-Glc, and catalyze formation of α1,3, α1,4, or α1,6 linkages. The NMR-HMBC analysis confirms the α-linkage and furthermore establishes glycosidic attachment to the 3-OH of the underlying Fuc (Fig. 9).

Skp1 is probably the natural substrate of Glt1
Glt1 and Skp1 both reside in cytoplasmic and possibly nuclear compartments based on the absence of apparent signal peptides or transmembrane domains, consistent with their recovery from cytosolic Toxoplasma extracts. Glt1 has a strong preference for non-reducing terminal Fuc in the context of the native Skp1 trisaccharide, relative to the non-reducing disaccharide, and did not modify GlcNAc or Gal in different contexts (Fig. 8, D and E). Furthermore, it did not modify Fuc alone, indicating that it will not target the abundant O-Fuc modifications in the nucleus of Toxoplasma (21). FGGn-Skp1 is also an excellent substrate, with an apparent Km of 4 μM, dramatically lower than that of FGGn-pNP and consistent with favorable selectivity for the trisaccharide in the context of Skp1. What FGGn-Skp1 gains in Km is, however, sacrificed in Vmax. Nevertheless, at 1 μM concentrations, FGGn-Skp1 is still calculated to enjoy a 15-fold catalytic advantage over FGGn-pNP. Evidence that Skp1 is a primary target of Glt1 comes from a biochemical complementation experiment, in which Skp1 is the only acceptor substrate detected after incubation of glt1Δ extracts with His6-Glt1 and radioactive UDP-Glc (Fig. 8H). Although other substrates could have been missed because they were excluded from or not accessible in the extract, or alternatively modified in the absence of Glt1, the interpretation that Glt1 is dedicated to Skp1 is consistent with evidence that the two earlier glycosyltransferases in the pathway are specific for Skp1 (3) and with more extensive evidence that the Skp1 modification pathway is specific for Skp1 in Dictyostelium (4,14).
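The trade-off between Km and Vmax can be made concrete with the Michaelis-Menten rate law, v = Vmax[S]/(Km + [S]). The sketch below compares the two acceptors at 1 micromolar using the reported constants (Km of 13 mM for FGGn-pNP, apparent Km of 4.2 micromolar for FGGn-Skp1). It assumes, for illustration only, that the Vmax toward FGGn-Skp1 is exactly 200-fold lower, whereas the text gives this factor only approximately, so the computed ratio should be read as an order-of-magnitude check rather than a reproduction of the 15-fold figure.

```python
def mm_rate(vmax, km_um, s_um):
    """Michaelis-Menten velocity; Km and substrate concentration in micromolar."""
    return vmax * s_um / (km_um + s_um)


VMAX_PNP = 330.0             # nmol/h/ug toward FGGn-pNP (reported)
KM_PNP_UM = 13_000.0         # 13 mM expressed in micromolar (reported)
VMAX_SKP1 = VMAX_PNP / 200   # assumption: exactly 200-fold lower Vmax
KM_SKP1_UM = 4.2             # apparent Km toward FGGn-Skp1 (reported)

substrate_um = 1.0
advantage = (mm_rate(VMAX_SKP1, KM_SKP1_UM, substrate_um)
             / mm_rate(VMAX_PNP, KM_PNP_UM, substrate_um))
print(f"rate advantage of FGGn-Skp1 at 1 uM: {advantage:.1f}-fold")
```

With these inputs the advantage comes out at roughly an order of magnitude, in the same range as the value quoted in the text; a somewhat smaller Vmax penalty than 200-fold would bring it closer to 15-fold.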

Functional variations between Toxoplasma Glt1 and Dictyostelium AgtA
Skp1 from Dictyostelium also possesses a pentasaccharide on the corresponding 4-hydroxyproline residue, and evidence suggests that the core trisaccharide of Toxoplasma Skp1 is identical to that of the recently confirmed (9) core trisaccharide of Dictyostelium Skp1, Fucα1,2Galβ1,3GlcNAcα1-. The evidence for identity is based primarily on sequence homologies of their Gnt1 and PgtA enzymes (3) and confirmation of their donor substrates as UDP-GlcNAc and as UDP-Gal and GDP-Fuc, respectively. Identity of the trisaccharide cores is also supported by the robust activity of His6-Glt1 toward endogenous Toxoplasma Skp1 in glt1Δ extracts (Fig. 8H). More direct confirmation of the linkages of the core trisaccharide is confounded by the small amounts of the intracellular pathogen that are available and difficulties encountered in expressing soluble protein from Toxoplasma pgtA cDNA in E. coli or Dictyostelium. The fourth sugar in the Dictyostelium Skp1 glycan, an αGal, differs from the αGlc in Toxoplasma but conserves the 3-linkage to Fuc. The enzyme catalyzing its addition, AgtA, belongs to CAZy family GT77 and is evolutionarily unrelated to Glt1. AgtA also catalyzes the addition of the fifth sugar, an αGal that is 3-linked to the fourth sugar to form a linear chain. In contrast, in Toxoplasma, the fifth sugar is added by a separate enzyme. The mechanism of recognition of FGGn-Skp1 is distinct for the two enzymes. Whereas AgtA strongly prefers FG- relative to Fuc- or FGGn- (as conjugates to small aglycons) as an acceptor (22), Glt1 strongly favors the full trisaccharide and is unable to modify Fuc alone (Fig. 8, D and E). Genetic and biochemical studies reveal that Dictyostelium AgtA has an independent function in modulating Skp1 activity that probably involves a physical interaction via a separate WD40 repeat domain (13).
Unlike AgtA, Glt1 lacks an identifiable domain separate from the catalytic domain, although it does possess short sequences at various positions within the conserved catalytic domain (supplemental Fig. S2). However, these insert sequences, variations of which are commonly observed in Toxoplasma proteins (23), are not conserved at the sequence level in another apicomplexan expected to harbor the Skp1 modification pathway, Sarcocystis neurona. Furthermore, the insert sequences tend to be absent from other protists that possess glt1-like genes. Therefore, Glt1 lacks a structural basis for the second function possessed by AgtA.
Structural studies of Dictyostelium Skp1 suggest that the full-length glycan encourages an ensemble of conformations that promote interactions with at least certain FBPs (9). The trisaccharide form of Dictyostelium Skp1 exhibits intermediate interaction with two of the FBPs (6), and, if an analogous mechanism operates in Toxoplasma, this might explain why the glt1 deletion results in a growth phenotype intermediate between complete absence of the glycan and its full assembly. Further studies, to be conducted when the final linkage of the Toxoplasma glycan is ascertained, will address this possible explanation.

Glt1 is ancestral for Skp1 glycosylation in protists
A comprehensive reconstruction of the evolution of CAZy GT32-related sequences indicates that Glt1-like sequences form a distinct clade within the GT32 group and have evolutionarily diverged from other members through variations in the GT domain. This interpretation is supported by conservation of amino acids at 10 positions throughout the protein, including three that are exclusive to the Glt1 clade (supplemental Fig. S2). Glt1-like sequences are found only in protists (Fig. 2), with one exception in a bacterium, which might reflect horizontal transfer. But the broad distribution within this diverse phylogeny suggests that Glt1 was present in the last common protistan ancestor and selectively lost or modified where it did not afford a selective advantage in, for example, O2 sensing. Furthermore, the Glt1 clade is populated only by protists that also possess CAZy GT74-like sequences predicted to encode the Skp1 PgtA-like enzyme that assembles the Fuc residue on which Glt1 acts in Toxoplasma. This evidence of co-evolution implicates all members of this clade in the assembly of the Skp1 glycan. However, Glt1-like sequences are absent from the amoebozoa (using BlastP search at E < 10⁻¹), which include Dictyostelium, where the CAZy family GT77 member AgtA performs a related biochemical function. The simplest explanation is that Glt1 is the ancestral Skp1 glycosyltransferase whose function was replaced in amoebozoa, which might have occurred to compensate for a loss of Glt1 or the final glycosyltransferase (Gat1),⁷ or because of the selective advantage afforded by AgtA's additional function in Skp1 suppression (13).

glt1 gene replacement and complementation
DNA for the gene replacement was generated from pminiGFP.ht, in which the hxgprt gene is flanked by multiple cloning sites, as described (3). Briefly, 5′-flanking and 3′-flanking targeting sequences of glt1 from RHΔΔ were PCR-amplified with primer pairs a and a′ and pairs b and b′, respectively (supplemental Table S1). The 5′-fragment was released by digestion with ApaI and HindIII and cloned into similarly digested pminiGFP.ht. The 3′-fragment was similarly inserted using XbaI and NotI. The resulting vector was linearized with ApaI and electroporated into strain RHΔΔ. Transformants were selected, and GFP-negative parasites were cloned by limiting dilution. Genomic DNA was prepared and screened by PCR (3).
To complement a glt1Δ clone, the hxgprt cassette of pminiGFP.ht was replaced with a ~5-kb gDNA fragment containing the glt1 coding region from RHΔΔ (2860 nucleotides) plus ~1 kb each of 5′-flanking and 3′-flanking DNA, using the complementary annealing mediated by exonuclease cloning method (25). Briefly, the vector and insert were PCR-amplified separately for 20 cycles by Q5 high-fidelity DNA polymerase (primers in supplemental Table S1). The amplified insert contained 15-base overhangs matching the termini of the amplified vector. The gel-purified amplicons were mixed at a 1:3 molar vector/insert ratio (total ~100 ng), incubated with T4 DNA polymerase (Novagen) at 22°C for 2.5 min to generate 5′-overhangs, incubated at 75°C for 20 min to inactivate the polymerase, and annealed at 50°C for 30 min. 2 µl was transformed into E. coli Top10 competent cells. The recovered plasmid was electroporated into glt1Δ parasites, which were subjected to selection (3). A triple point mutant (D348N/D363A/D365A) of the complementation plasmid was generated by site-directed mutagenesis (primers in supplemental Table S1) as described (12) and separately electroporated.
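As a worked illustration of the 1:3 molar vector/insert ratio, the following minimal sketch splits a total DNA mass between the two amplicons. The ~5-kb insert length comes from the text; the 6000-bp vector length and the 650 Da per base pair conversion are assumptions made only for this example, and the function names are hypothetical.

```python
def split_mass(total_ng, vector_bp, insert_bp, insert_per_vector=3.0):
    """Return (vector_ng, insert_ng) that give the requested molar ratio."""
    # Moles of dsDNA scale with mass / length, so the insert:vector mass
    # ratio equals the molar ratio times insert_bp / vector_bp.
    mass_ratio = insert_per_vector * insert_bp / vector_bp
    vector_ng = total_ng / (1.0 + mass_ratio)
    return vector_ng, total_ng - vector_ng


def pmol(mass_ng, length_bp, da_per_bp=650.0):
    """Convert a dsDNA mass in ng to pmol (650 Da/bp is an approximation)."""
    return mass_ng * 1000.0 / (length_bp * da_per_bp)


# Assumed lengths: 6000-bp vector amplicon (hypothetical), 5000-bp insert.
vector_ng, insert_ng = split_mass(100.0, vector_bp=6000, insert_bp=5000)
molar_ratio = pmol(insert_ng, 5000) / pmol(vector_ng, 6000)
print(f"vector {vector_ng:.1f} ng, insert {insert_ng:.1f} ng, "
      f"insert/vector = {molar_ratio:.2f}")
```

With a different vector length the split changes, but the same function applies.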

Mass spectrometry of Skp1 peptides
Skp1 was immunoprecipitated from urea-solubilized parasite extracts as described (3), except that the sample was first precleared by incubation with 50 µl of anti-rabbit antibody-bound beads for 1 h at 4°C. The dried, purified Skp1 samples were dissolved in 100 µl of 10 mM dithiothreitol in 50 mM NH4HCO3, incubated at 56°C for 1 h, alkylated with 22.5 mM iodoacetamide for 45 min in the dark, and digested with trypsin (Promega) at 37°C overnight. The resulting peptides were recovered by addition to a C18 spin column (MicroSpin™ column, The Nest Group), elution with 0.1% formic acid in 80% acetonitrile, and drying under vacuum. Peptides were reconstituted in 19.5 µl of solvent A (0.1% formic acid) and 0.5 µl of solvent B (0.1% formic acid in 80% acetonitrile), separated on an Acclaim PepMap RSLC C18 column (75 µm × 15 cm), and eluted into the ion source of an Orbitrap Fusion Lumos Tribrid™ mass spectrometer (Thermo Fisher Scientific) with a linear gradient consisting of 0.5-100% solvent B over 150 min at a flow rate of 200 nl/min. The spray voltage was set to 2.2 kV, and the temperature of the heated capillary was set to 280°C. Full MS scans were acquired from m/z 300 to 2000 at 120,000 resolution, and MS2 scans following collision-induced fragmentation were collected in the ion trap for the most intense ions in the Top-Speed mode within a 3-s cycle using Fusion instrument software (version 2.0, Thermo Fisher Scientific). The acquired raw spectra were analyzed using SEQUEST (Proteome Discoverer version 1.4, Thermo Fisher Scientific) with a full MS peptide tolerance of 20 ppm and MS2 peptide fragment tolerance of 0.5 Da and filtered to generate a 1% target decoy peptide-spectrum-match false discovery rate for protein assignments. Spectra assigned as glycosylated TgSkp1 peptides were manually validated.
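The two mass tolerances quoted above are on different scales: the precursor tolerance is relative (ppm) and the fragment tolerance is absolute (Da). The sketch below shows how each would be checked for a single candidate match. It is not the SEQUEST implementation, the function names are hypothetical, and the m/z values are invented for illustration.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_ok(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the precursor is within the relative (ppm) tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_ok(observed_mz, theoretical_mz, tol_da=0.5):
    """True if the fragment is within the absolute (Da) tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# Invented example values: a 10-ppm precursor error passes, 50 ppm fails.
print(precursor_ok(1000.01, 1000.00))   # 10 ppm error
print(precursor_ok(1000.05, 1000.00))   # 50 ppm error
print(fragment_ok(500.30, 500.00))      # 0.3 Da error
```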

Preparation of Toxoplasma FGGn-Skp1
The E. coli dual expression plasmid for Dictyostelium Skp1A (5) and Dictyostelium PhyA was modified by substitution with synthetic cDNA encoding Toxoplasma Skp1 and introduced into E. coli together with the plasmid encoding Dictyostelium DdDp-Gnt1. Gn-Skp1 was purified as for its Dictyostelium counterpart and modified in vitro using Dictyostelium FLAG-PgtA. The reaction consisted of 6.25 µM Toxoplasma Gn-Skp1, 112 nM Dictyostelium FLAG-PgtA, 25 µM UDP-Gal, 50 µM GDP-Fuc, and 120 mM NaCl in 50 mM Tris-HCl (pH 7.5). The reaction was monitored by dot blotting onto nitrocellulose filters and probing with mAb 1C9, which is specific for Dictyostelium Gn-Skp1 relative to other modified glycoforms (15), and a 1:1000 dilution of polyclonal antibody UOK104, which is specific for Dictyostelium FGGn-Skp1 relative to other glycoforms (26). Alexa680-coupled secondary antibodies were applied and detected in an Odyssey infrared scanner (LI-COR). Similar specificity was observed for the Dictyostelium and Toxoplasma glycoforms, and the glycosylation reaction was taken to completion within the sensitivity of the method (data not shown).

Expression and purification of recombinant His 6 -Glt1
The predicted coding sequence of Glt1 (TGGT1_205060) was codon-optimized for expression in E. coli, chemically synthesized by GenScript (Piscataway, NJ), and inserted into the pUC57 vector between its NdeI and BamHI sites (supplemental Fig. S1). After treatment with NdeI and BamHI, the released coding fragment was inserted into similarly digested pET15b (Invitrogen), which resulted in the full-length 605-amino acid coding sequence preceded by an N-terminal His6 tag and tobacco etch virus protease cleavage site (MGSSHHHHHHSSGRENLYFQGH-). E. coli Gold cells expressing His6-Glt1 were incubated for 24 h in 8 × 1 liter of Terrific Broth medium in the presence of 100 µg/ml ampicillin, 2 g/liter lactose, and 125 µM isopropyl 1-thio-β-D-galactopyranoside at 20°C. Pilot studies showed that autoinduction during bacterial growth (27) was much superior to standard induction at high density for expression of soluble Glt1 (data not shown). After 24 h, cells were collected by centrifugation at 2000 × g for 10 min and resuspended in 50 mM Na+/K+ phosphate (pH 7.8), 300 mM NaCl, 2 mM benzamidine, 0.5 µg/ml pepstatin A, 5 µg/ml aprotinin, 5 µg/ml leupeptin, and 0.5 mM phenylmethylsulfonyl fluoride at 4°C; Na+/K+ phosphate buffer was prepared by titrating monosodium phosphate into dipotassium phosphate of equal molarity. Bacteria were lysed using a probe sonicator (model 500, Thermo Fisher Scientific). The lysate was immediately centrifuged at 21,000 × g for 30 min at 4°C, and the supernatant (S21) was applied to a column containing 1.5 ml of Co2+ TALON resin (Clontech) pre-equilibrated at 4°C in the buffer described above. Protein was eluted with 300 mM imidazole in the same buffer. The major A280 peak was dialyzed against 40 mM Tris-HCl (pH 8.0) overnight and applied to a 1-ml Hi-Trap Q-Sepharose column (GE Healthcare) pre-equilibrated at 4°C in 40 mM Tris-HCl (pH 8.0), 2 mM DTT, 2 mM MgCl2, 10% (v/v) glycerol, aprotinin, and leupeptin (as above).
The column was eluted with a 0-1 M gradient of NaCl in the same buffer. Fractions with the highest enzymatic activity were confirmed for the presence of His6-Glt1 by Western blotting with anti-His6 monoclonal antibody (Novagen catalog no. 70796-3), pooled, and frozen as aliquots at −80°C.

Glt1 enzyme activity assays
Hydrolysis of UDP-sugars was measured using the UDP-Glo assay (Promega) as described (18). Briefly, His6-Glt1 (after Q-column purification) was incubated in the presence of 50 µM sugar nucleotides in 20-µl reactions containing 50 mM HEPES-NaOH (pH 7.4), 2 mM MnCl2, and 5 mM DTT at 37°C for 16 h, and activity was quantitated based on conversion of the UDP reaction product to ATP.
Alternatively, Glt1 transferase activity was assayed based on release of GDP or UDP from unlabeled GDP-Man or UDP-Glc, using GDP-Glo or UDP-Glo assays (Promega) according to the manufacturer's protocol. Conditions were as described above.
Kinetic parameters were determined assuming Michaelis-Menten kinetics using GraphPad Prism software. Acceptor substrate kinetics were determined at 40 µM UDP-Glc, and donor substrate kinetics were determined at 2 mM FG-Bn.
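For readers who want to reproduce a Michaelis-Menten estimate without GraphPad Prism, the sketch below fits Vmax and Km with a Hanes-Woolf linearization ([S]/v against [S]). This is a substitute technique chosen because it needs only the standard library. It is not the nonlinear regression that Prism performs, and it weights experimental error differently. The substrate concentrations and kinetic constants are synthetic values invented for the example.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (vmax, km) from a least-squares line through [S]/v vs [S].

    Rearranged model: S/v = S/Vmax + Km/Vmax, so the slope is 1/Vmax
    and the intercept is Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, intercept / slope


# Synthetic, noise-free data (arbitrary units) generated from known constants.
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
v_obs = [michaelis_menten(s, vmax=10.0, km=0.5) for s in conc]
vmax_fit, km_fit = hanes_woolf_fit(conc, v_obs)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f}")
```

On real, noisy data the linearized estimate is best treated as a starting guess for a nonlinear fit.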
For assay of transferase activity in parasite extracts, the typical 50-µl reaction, containing 30 µl of S100 fraction (1 mg of protein/ml), 50 pmol of FGGn-DdSkp1 (5) (13), in 50 mM HEPES-NaOH (pH 7.4), 10 mM MgCl2, 2 mM MnCl2, 2 mM DTT, 3 mM NaF, and protease inhibitors, was incubated at 37°C for 1.5 or 3 h. Reactions were stopped by the addition of 4× Laemmli electrophoresis sample buffer, and incorporation was measured after SDS-PAGE as described (3). The chemical form of incorporated radioactivity was determined after electroblot transfer to a polyvinylidene difluoride membrane, acid hydrolysis, and high-pH anion exchange chromatography, using 1.5 nmol each of L-Fuc, D-Glc, D-Gal, and D-Man as internal standards (3).
For assay of acceptor substrate activity in extracts, reactions were modified to contain 2 µCi of UDP-[3H]Glc, 170 µl of desalted S100 fraction (1 mg/ml), and His6-Glt1 in a final volume of 200 µl. After incubation for 3 h at 37°C, the reaction was concentrated by centrifugal ultrafiltration using a Nanosep 3K concentrator (Pall Corp.). The samples were resolved by SDS-PAGE, and each lane was sliced into 36 equal pieces, which were counted for radioactivity as above.

Glc-Fuc linkage analysis
To determine the linkage of the Glt1 product, 4 µmol of FGGn-pNP and 8 µmol of UDP-Glc were incubated with 34 µg (~0.65 nmol) of His6-Glt1 and 40 units of calf intestinal alkaline phosphatase (Promega) in 2 ml of reaction buffer (50 mM HEPES-NaOH (pH 8.0), 5 mM DTT, 8 mM MnCl2, and 10 mM MgCl2) at 37°C for 5 h. The reaction was terminated with 8 ml of 5 mM Na-EDTA, pH 8.0. The product was recovered using a Sep-Pak C18 cartridge as above, dried under N2, and reconstituted in 50 µl of H2O. Quantitative conversion to the tetrasaccharide was confirmed using MALDI-TOF mass spectrometry (ABSciEx 5800, Applied Biosystems), and the concentration was determined spectrophotometrically using an extinction coefficient of 1.15 × 10⁴ M⁻¹ cm⁻¹ at 300 nm (22). 3 µmol of dried GFGGn-Skp1 was resuspended in 99.96% D2O and analyzed on an Agilent 900-MHz DD2 spectrometer equipped with a 5-mm cryogenically cooled probe. NMR experiments were performed at 25°C after stabilization and shimming and used standard pulse sequences (PRESAT, gCOSY, zTOCSY, HSQCAD, and gHMBCAD) from the Agilent library. Two-dimensional data were collected with default values except for increased digital resolution in both dimensions. Spectra were processed with MestReNova software (Mestrelab Research S.L.). Peaks of the 1H-13C HSQC spectra were provisionally assigned by comparison with predicted chemical shifts calculated by the CASPER program, and all signals and residue linkages were confirmed from analysis of the 2D data.
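The spectrophotometric concentration step follows the Beer-Lambert law, c = A / (ε · l). The sketch below applies it with the extinction coefficient quoted in the text. The absorbance reading, 1-cm path length, and dilution factor are assumptions for illustration only, and the function name is hypothetical.

```python
EPSILON_300NM = 1.15e4  # M^-1 cm^-1 at 300 nm, as quoted in the text (22)


def concentration_molar(absorbance, epsilon=EPSILON_300NM,
                        path_cm=1.0, dilution=1.0):
    """Beer-Lambert concentration of the undiluted sample, in mol/liter."""
    return absorbance * dilution / (epsilon * path_cm)


# Assumed reading: A300 = 0.575 for a sample diluted 100-fold before reading.
c = concentration_molar(0.575, dilution=100.0)
print(f"{c * 1e3:.2f} mM")
```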

Phylogenetic analysis
Full-length sequences for all of the experimentally characterized GT32 family proteins listed in the CAZy database (accessed May 10, 2017, http://www.cazy.org/)⁸ (29) were collected. To identify Glt1-related sequences, T. gondii (EPR61400.1) and Vitrella brassicaformis (CEM02366.1) Glt1 protein sequences were used as BLAST queries to search against the NCBI nr database and available protist proteome databases (Ensembl protist and JGI) (http://protists.ensembl.org/index.html and http://genome.jgi.doe.gov/).⁸ The putative ortholog from the chromerid V. brassicaformis, a close relative of apicomplexa that is predicted to express the Toxoplasma-like Skp1 modification pathway, was included due to its lack of apicomplexan-specific insertions within its proteins (30). For the GT74 family, the only characterized sequence from D. discoideum (AAF82378.1) was used as a BLAST query. For both families (GT32 and GT74), the collected BLAST best-hit sequences were then aligned using the MAFFT L-INS-i strategy (31), and the alignment was used to generate a hidden Markov model (HMM) (http://hmmer.org/)⁸ (32). The built HMM profile was used to identify additional related sequences in NCBI nr from diverse taxonomic groups using an e-value cut-off of 1e-5. A total of 68 GT32-related sequences from bacteria, protists, fungi, chlorophyte, and metazoans were collected and used for the phylogenetic tree construction. Hits collected for the GT74 family were used to identify the presence of a GT74-related sequence in the species.
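The e-value cut-off can be applied programmatically to HMMER output. The sketch below assumes the search was run with `hmmsearch --tblout`, whose per-target table lists the full-sequence E-value in the fifth whitespace-separated column. The text does not state which HMMER program or output format was used, so that is an assumption, and the sequence names and scores in the sample table are invented.

```python
def filter_hits(tblout_text, max_evalue=1e-5):
    """Return (target, evalue) pairs at or below the E-value cut-off."""
    hits = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(None, 18)   # column 19 is a free-text description
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    return hits


# Invented sample in the per-target table layout (not real search output).
SAMPLE = """\
# target name accession query name accession E-value score bias ...
seq_A - GT32_domain - 3.2e-40 140.1 0.1 4.0e-40 139.8 0.1 1.0 1 0 0 1 1 1 1 made-up hit A
seq_B - GT32_domain - 2.0e-03 15.2 0.0 3.1e-03 14.6 0.0 1.2 1 0 0 1 1 1 0 made-up hit B
"""

for name, e in filter_hits(SAMPLE):
    print(name, e)
```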
The 68 full-length sequences representing the GT32 family members were aligned using the MAFFT L-INS-i strategy. The boundaries of the GT32 domain were marked by identifying conserved regions in the alignment. The sequences were initially trimmed to extract only the GT32 domain and realigned using the same strategy. This domain alignment was further refined by removing large insert segments and poorly aligned regions. The trimmed alignment was used to generate the final phylogenetic tree. FastTree (33) was used to build the tree with default parameters using the following options: -wag for the WAG (34) model of amino acid evolution, -gamma for the rescaling of the branch lengths and the computation of Gamma20-based likelihood scores, and -pseudo to add pseudo counts for highly gapped segments in the alignment. Local support values for the internal nodes were computed by FastTree using the Shimodaira-Hasegawa test and are displayed in the figure. Trees generated using full-length GT domain alignments also resulted in similar tree topologies. Runs using RAxML also generated similar topologies (not shown). Functional annotations were collected from the CAZy database for the characterized sequences. SignalP 3.0 (www.cbs.dtu.dk/services/SignalP-3.0/)⁸ was used to predict signal peptide sequences, using separate runs for eukaryotes, Gram-negative bacteria, and Gram-positive bacteria, and TMHMM version 2.0 (www.cbs.dtu.dk/services/TMHMM/)⁸ was used to predict the transmembrane regions.
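The FastTree options named above translate into a single command line. The sketch below assembles it and runs it only if a `FastTree` executable is found on the PATH. The file names are hypothetical placeholders, and the executable name may differ between installations (for example `fasttree`).

```python
import shutil
import subprocess


def fasttree_command(alignment_path, executable="FastTree"):
    """Argument list for a protein tree with the options named in the text."""
    # -wag: WAG substitution model; -gamma: rescale branch lengths and report
    # Gamma20 likelihood; -pseudo: pseudocounts for highly gapped sequences.
    return [executable, "-wag", "-gamma", "-pseudo", alignment_path]


def run_fasttree(alignment_path, tree_path):
    """Run FastTree and write the Newick tree it prints on stdout."""
    exe = shutil.which("FastTree")
    if exe is None:
        raise FileNotFoundError("FastTree executable not found on PATH")
    with open(tree_path, "w") as out:
        subprocess.run(fasttree_command(alignment_path, exe),
                       stdout=out, check=True)


# Hypothetical input file name, shown only to display the command.
print(" ".join(fasttree_command("gt32_domain_trimmed.fasta")))
```

FastTree computes Shimodaira-Hasegawa-like local support values by default, so no extra flag is needed for them.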