Dissection of Hexosyl- and Sialyltransferase Domains in the Bifunctional Capsule Polymerases from Neisseria meningitidis W and Y Defines a New Sialyltransferase Family*

Background: Capsule polymerases of Neisseria meningitidis serogroups W and Y comprise hexosyl- and sialyltransferase activity. Results: Hexosyltransferase activity is encoded by the predicted N-terminal GT-B fold. Sialyltransferase activity requires 168 additional amino acids upstream of the predicted C-terminal GT-B fold. Conclusion: The sialyltransferase domains of NmW/Y define a new glycosyltransferase (CAZy) family. Significance: The new CAZy family comprises sequences from distantly related species. Crucial virulence determinants of disease causing Neisseria meningitidis species are their extracellular polysaccharide capsules. In the serogroups W and Y, these are heteropolymers of the repeating units (→6)-α-d-Gal-(1→4)-α-Neu5Ac-(2→)n in NmW and (→6)-α-d-Glc-(1→4)-α-Neu5Ac-(2→)n in NmY. The capsule polymerases, SiaDW and SiaDY, which synthesize these highly unusual polymers, are composed of two predicted GT-B fold domains separated by a large stretch of amino acids (aa 399–762). We recently showed that residues critical to the hexosyl- and sialyltransferase activity are found in the predicted N-terminal (aa 1–398) and C-terminal (aa 763–1037) GT-B fold domains, respectively. Here we use a mutational approach and synthetic fluorescent substrates to define the boundaries of the hexosyl- and sialyltransferase domains. Our results reveal that the active sialyltransferase domain extends well beyond the predicted C-terminal GT-B domain and defines a new glycosyltransferase family, GT97, in CAZy (Carbohydrate-Active enZYmes Database).

Bacterial capsules are a protective layer of extracellular polysaccharides that are firmly attached to the cell surface (1). Capsules both enhance bacterial survival in the environment by providing a highly hydrated physical barrier surrounding the cell (2) and contribute to symbiotic and pathogenic interactions with their host (1,3). Capsules are divided into four groups based on characteristics of the polysaccharide and export machinery (4,5). Group II capsules are characterized by polysaccharides with a high charge density and export via an ATP-binding cassette transporter-dependent system (6). Neisseria meningitidis (also referred to as meningococcus), a group of strictly human pathogens, produce group II capsules and are divided into 12 serogroups based on the chemical properties of the capsular polysaccharides (CPSs) (7). Six of these serogroups (A, B, C, W, X, and Y) are important pathogens (8) (the serogroup nomenclature is simplified according to Harrison et al. (7)). In serogroups B, C, W, and Y, the negative charge of the CPS results from the building block sialic acid (Sia or Neu5Ac), a nine-carbon sugar carrying a carboxylate function in position 1 (9). Whereas CPSs in NmB and NmC are homopolymers with Neu5Ac in α2,8- and α2,9-glycosidic linkage, respectively, the CPSs of NmW and NmY represent heteropolymers of the repeating units (→6)-α-D-Gal-(1→4)-α-Neu5Ac-(2→)n in NmW and (→6)-α-D-Glc-(1→4)-α-Neu5Ac-(2→)n in NmY (10). These latter structures are highly unusual because sialic acid occurs as an internal sugar. As a rule, sialic acid is a terminal (non-reducing end) sugar and, if internal, is conjugated with other sialic acid residues (11).
We recently succeeded with the molecular cloning and functional characterization of the capsule polymerases of NmW (SiaDW) and NmY (SiaDY) and demonstrated bifunctionality for both enzymes (12). The large proteins with molecular masses of 120 kDa are 98% identical at the amino acid level and were predicted to comprise two glycosyltransferase (GT) domains of the GT-B fold (13). Bioinformatics and rational mutagenesis showed that the N-terminal domains encompass hexosyltransferase (HexTF) activity and that the C-terminal domains encode the sialyltransferase (SiaTF) activity (12).
Point mutations introduced to destroy one catalytic function fully inactivated the capsule polymerases SiaDW and SiaDY. However, if the single mutant enzymes were combined in the same reaction tube, 70% of wild-type activity was restored. Beyond the demonstration that the two functional domains are capable of complementing each other in trans, these data indicated that chain elongation needs the successive activity of HexTF and SiaTF domains (12). Additional proof for the independence of the HexTF and SiaTF domains was obtained by saturation transfer difference NMR, showing the simultaneous binding of both sugar-nucleotides (CMP-Neu5Ac and UDP-Gal) and the priming oligosaccharide acceptor (12).
The N-terminal HexTF domains of SiaDW and SiaDY are classified in CAZy family GT4 (14). On the other hand, due to the absence of similarity to known sialyltransferase families and the lack of defined boundaries for the SiaTF domains in SiaDW and SiaDY, these domains could not be assigned to a CAZy family.
Here we aimed to determine whether the putative SiaTF domains of SiaDW and SiaDY could be physically separated from the respective HexTF domains and to delineate the boundaries of the SiaTF and HexTF domains. Crucial toward this goal was the availability of a sensitive assay system that would enable the unequivocal detection of single sugar transfers. To this end, we synthesized fluorescently labeled acceptor substrates with which HexTF and SiaTF could be specifically primed. High performance liquid chromatography separation and fluorescence detection (HPLC-FD) of the labeled reaction products provided the single-product resolution necessary to delineate the two glycosyltransferase activities. We show that the N-terminal GT-B folds contain all of the information for HexTF activity, whereas the predicted C-terminal GT-B domains were insufficient to generate active SiaTFs. Using a mutational approach, the SiaTFs were shown to require an additional 168 amino acids immediately upstream of the predicted GT-B domains. Bioinformatics analyses demonstrate that the newly identified SiaTF domains are the first characterized members of a new CAZy family, GT97.

EXPERIMENTAL PROCEDURES

Generation of Truncation Mutants and Expression of Recombinant Proteins-To separate HexTF and SiaTF domains, truncated variants of SiaDW and SiaDY (Fig. 4) were designed with care not to destroy secondary structure elements as predicted by the Phyre2 server (13). These constructs were generated by PCR using pHC4 (SiaDW) and pHC5 (SiaDY) (15) as templates. Hot Start Phusion-DNA-Polymerase (Thermo Scientific/Fermentas) was used in these experiments, and PCR conditions were as follows: 1 cycle of 98°C/120 s; 30 cycles of 98°C/15 s, 65°C/30 s, 72°C/30 s; and 1 cycle of 72°C/300 s. The primers used, which contain NdeI or XhoI sites, are listed in Table 1 together with the resulting truncation variants. PCR products were digested with NdeI and XhoI (New England Biolabs, Inc.), purified, and ligated into the respective sites of the expression vector pET22-b (Novagen). After transformation into Escherichia coli XL-1 Blue (Stratagene), transformed colonies were selected on ampicillin, and constructs were verified by restriction analysis and sequencing. Expressed proteins carried a C-terminal His6 epitope. Where recombinant proteins were purified, the protocol described by Romanow et al. (12) was used.
Synthesis and Purification of Fluorescently Labeled Acceptors to Prime SiaDW and SiaDY Reactions-Because activity testing in the classical radioactive incorporation assay depends on polymer formation (short oligosaccharides are mobile in the paper chromatography step (16)), the reliable testing of monofunctional mutants required a test system in which single sugar transfers could be clearly observed. We exploited the capability of SiaDW and SiaDY to extend sialic acid derivatives carrying the fluorescent label 2′-(4-methylumbelliferyl) (4-MU) at the reducing end. The recombinant C-terminally His6-tagged monofunctional full-length enzymes NmW-(S972A)-His6 and NmY-(S972A)-His6 were used as HexTFs, and NmW-(E307A)-His6 and NmY-(E307A)-His6 were used as SiaTFs. 4-MU-Sia-Gal/Glc were obtained by mixing 4 mM 4-MU-Sia (Sigma-Aldrich or Iris Biotech GmbH) with 4 mM donor sugar UDP-Gal/UDP-Glc (Sigma) and 20 µg ml−1 of the respective HexTF (NmW-(S972A)-His6 or NmY-(S972A)-His6) in reaction buffer (20 mM Tris/HCl, pH 8.0, 20 mM MgCl2, and 2 mM DTT). After 24 h at 25°C, enzymes were removed by ultrafiltration (Amicon Ultra 10 molecular weight cut-off, Millipore), and filtrates containing the reaction products (4-MU-Sia-Gal or 4-MU-Sia-Glc) were lyophilized (Alpha 1-2 LD Plus, Martin Christ). After dissolution in water, samples were desalted on P2-gel filtration columns (Bio-Rad). Reaction products were identified by high performance liquid chromatography using a UFLC-RX system (Shimadzu) coupled to a fluorescence detector (RF-10A XL). Samples were excited at 315 nm and monitored at 375 nm. Although, under the conditions used, this reaction did not give product yields of >90% (see Fig. 2), further product purification was omitted because the acceptor quality increased significantly from 4-MU-Sia to 4-MU-Sia-Hex, making 4-MU-Sia an irrelevant contaminant.
The obtained 4-MU-Sia-Gal/Glc were then the starting material for Sia transfer to obtain 4-MU-Sia-Gal/Glc-Sia. This reaction was carried out with either NmW-(E307A)-His6 or NmY-(E307A)-His6 in the presence of 2 mM CMP-Neu5Ac (Nacalai Tesque). Reaction conditions were identical to those described in the first reaction step. However, due to the significantly improved acceptor, the sialylation in this step was complete after 1 h of incubation. After the removal of enzymes and desalting, obtained compounds were used in iterative rounds to synthesize primers of the needed size.
Enzyme Testing-Optimal buffer conditions (pH and metal ion concentration) for SiaDW/Y activity were established using a radioactive incorporation assay (12). As with other neisserial capsule polymerases, SiaDW/Y activity was optimal at slightly basic pH. Conditions identified with the wild-type enzymes were maintained for all assays in this study. For activity testing with fluorescent compounds, reactions were carried out in a total volume of 25 µl. Mixtures contained 50 mM Tris/HCl, pH 8.0, 20 mM MgCl2, 2 mM DTT, 1 mM acceptor (4-MU-Sia-Gal/Glc, 4-MU-Sia-Gal/Glc-Sia, or 4-MU-Sia-Gal/Glc-Sia-Gal/Glc) plus 2 mM of the nucleotide sugar (UDP-Gal/Glc (Sigma) and/or 2 mM CMP-Neu5Ac (Nacalai Tesque)), depending on the tested enzyme. Reactions were started by the addition of 20 µg ml−1 purified enzyme or, if the soluble fraction of bacterial lysates was used as the enzyme source, with 72-100 µg ml−1 total protein. After appropriate incubation times, reactions were stopped by shock freezing in liquid nitrogen.
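The reaction composition above can be turned into pipetting volumes with the usual dilution relation (stock concentration × stock volume = final concentration × total volume). The following is a minimal sketch only: the final concentrations and the 25 µl total volume are taken from the text, whereas every stock concentration is a hypothetical value chosen for illustration and is not reported in this study.

```python
# Final concentrations (mM) and total volume (µl) are taken from the text.
TOTAL_VOLUME_UL = 25.0

FINAL_MM = {
    "Tris/HCl pH 8.0": 50.0,
    "MgCl2": 20.0,
    "DTT": 2.0,
    "acceptor": 1.0,
    "nucleotide sugar": 2.0,
}

# Hypothetical stock concentrations (mM); NOT from the paper.
ASSUMED_STOCK_MM = {
    "Tris/HCl pH 8.0": 500.0,
    "MgCl2": 200.0,
    "DTT": 50.0,
    "acceptor": 10.0,
    "nucleotide sugar": 20.0,
}


def pipetting_volumes(final_mm, stock_mm, total_ul):
    """Return the volume (µl) of each stock needed for one reaction."""
    volumes = {}
    for name, final in final_mm.items():
        stock = stock_mm[name]
        if stock <= final:
            raise ValueError(f"stock of {name} is too dilute")
        # C_stock * V_stock = C_final * V_total
        volumes[name] = final * total_ul / stock
    return volumes


volumes = pipetting_volumes(FINAL_MM, ASSUMED_STOCK_MM, TOTAL_VOLUME_UL)
# Volume left over for the enzyme preparation and water.
remaining_ul = TOTAL_VOLUME_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µl")
print(f"enzyme + water: {remaining_ul:.2f} µl")
```

With other stock solutions, only the ASSUMED_STOCK_MM table needs to change.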
Synthesized oligo- and polymers were analyzed and quantified via the fluorescent tag using an ultrafast HPLC system (UFLC-RX, Shimadzu) with coupled FD (detector RF-10A XL). The separation of 4-MU-labeled products was achieved with an anion exchange column (CarboPac PA-100 column, Dionex). Before loading onto the column, samples were diluted 500-fold in water. The buffers A (20 mM NaNO3) and B (1 M NaNO3) were used to establish a curved elution gradient, reaching 21.65% buffer B over 35 min at 0.6 ml min−1 and a column temperature of 50°C. The curved portion of the gradient is described by the following formula, in which the index −1.425 describes the slope (LC Solution, Shimadzu) (17-19).
Elution profiles were monitored via fluorescence emission at 375 nm with 315 nm as the excitation wavelength. Under these conditions, the separation of polysaccharides up to a degree of polymerization (DP) of 21 was easily achieved (see "Results"). The HPLC profiles can be quantified to determine reaction progress. Peak areas represent the relative amount of each DP in the product profile and were calculated by integration of HPLC chromatograms with the LC Solution software (Shimadzu). Normalizing peak areas to the total area under the curve, weighting each peak according to the number of transfers required to form this product, and summing over all peaks gives the "normalized and weighted area" according to the formula,

$$\text{normalized and weighted area} = \sum_{i} n_i \, \frac{A_i}{\sum_{j} A_j}$$

where A is the peak area, and n is the number of transfers required to form this product. The normalized and weighted area is directly proportional to the total number of sugar transfers at that reaction time point.

TABLE 1. Primer pairs used for cloning. Recoverable sequences: 5′-CAAATAAGCTAGTTGCATAAACACCTACAGCAACGCGAGAT-3′; 5′-CCGCTCGAGTTTTTCTTGGCCAAAAAACTG-3′.

SDS-PAGE and Immunoblotting-SDS-PAGE was performed under reducing conditions using 2.5% (v/v) β-mercaptoethanol. Western blot analysis was done on a PVDF membrane (Millipore). For detection of the hexahistidine (His6) tag, penta-His antibody (Qiagen) was used as primary antibody at a concentration of 1 µg ml−1 and subsequently detected with 0.05 µg ml−1 IgG1 anti-mouse IR Dye800 antibody (Odyssey Infrared Imaging). Protein bands were visualized and quantified with the infrared fluorescence detection system (LI-COR) according to the manufacturer's instructions.
Bioinformatics-The most similar homologs of the identified SiaTF domain were found by iterative PSI-BLAST searches of the non-redundant protein sequences database using the PSI-BLAST server (20). In each round, sequences with identity of >20% and coverage of >70% were included in the following iteration. After four iterations, the collected sequences were filtered for an E-value of better than 1e−50 and for the presence of conserved motifs (see "Results"). These sequences were aligned using the MAFFT server (G-INS-i strategy; BLOSUM30 scoring matrix; unalignlevel: 0.4) (21). Poorly fitting sequences were removed with MaxAlign (22) and by visual inspection of alignments and preliminary trees. The maximum likelihood tree was calculated from 359 ungapped positions using the PhyML software with bootstrapping (23). The tree was visualized with Archaeopteryx (24).
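The two filtering stages described above (per-iteration inclusion, then final selection) can be sketched as follows. This is an illustrative sketch, not the procedure as implemented by the authors: the Hit record, its field names, the has_motifs flag (assumed to be computed beforehand), and the example hits are all hypothetical; only the thresholds (>20% identity, >70% coverage, E-value better than 1e−50) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one PSI-BLAST hit."""
    accession: str
    identity_pct: float
    coverage_pct: float
    evalue: float
    has_motifs: bool  # assumed to be determined in a separate motif scan


def keep_for_next_iteration(hit: Hit) -> bool:
    """Inclusion rule applied in each PSI-BLAST round."""
    return hit.identity_pct > 20.0 and hit.coverage_pct > 70.0


def passes_final_filter(hit: Hit, max_evalue: float = 1e-50) -> bool:
    """Final selection: E-value better (smaller) than 1e-50 plus motifs."""
    return hit.evalue < max_evalue and hit.has_motifs


# Invented example hits, for illustration only.
example_hits = [
    Hit("hypothetical_A", 35.0, 92.0, 1e-80, True),
    Hit("hypothetical_B", 18.0, 95.0, 1e-60, True),   # identity too low
    Hit("hypothetical_C", 28.0, 75.0, 1e-30, True),   # E-value too weak
    Hit("hypothetical_D", 41.0, 88.0, 1e-70, False),  # motifs missing
]

included = [h for h in example_hits if keep_for_next_iteration(h)]
selected = [h.accession for h in included if passes_final_filter(h)]
print(selected)
```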

RESULTS

Synthesis of Fluorescently Labeled Oligosaccharide Substrates-To delineate HexTF and SiaTF activities and functionally express the individual enzyme domains, it was first necessary to have defined acceptors to prime each of these activities. As detailed under "Experimental Procedures," this was achieved using the previously described active site mutants bearing only the HexTF or SiaTF activity (12). The commercially available, but little used, fluorescent substrate 4-MU-Sia was first extended by a single hexose using either NmW-(S972A)-His6 or NmY-(S972A)-His6 to give 4-MU-Sia-Gal or 4-MU-Sia-Glc, respectively (henceforth referred to as 4-MU-Sia-Hex or 4-MU-DP2). The 4-MU-DP2 was then extended by the respective SiaTF (NmW-(E307A)-His6 or NmY-(E307A)-His6) to give 4-MU-Sia-Hex-Sia (4-MU-DP3), which proved to be an efficient acceptor for the HexTFs. Structures of the new acceptors (as an example shown for oligosaccharides with Gal as hexose) are given in Fig. 1.
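The alternating build-up of the acceptors can be summarized in a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and the rule it encodes (Sia at odd positions counted from the 4-MU label, hexose at even positions) simply restates the DP1 to DP3 structures named in the text.

```python
def acceptor_name(dp: int, hexose: str = "Hex") -> str:
    """Name of the 4-MU-labeled acceptor with the given degree of polymerization.

    Residues alternate Sia, hexose, Sia, ... starting from the 4-MU label.
    """
    if dp < 1:
        raise ValueError("DP must be at least 1")
    residues = ["Sia" if i % 2 == 0 else hexose for i in range(dp)]
    return "4-MU-" + "-".join(residues)


def next_transferase(dp: int) -> str:
    """Activity needed to extend an acceptor of the given DP by one sugar.

    Odd DP ends in Sia, so a hexose is added next (HexTF);
    even DP ends in a hexose, so Sia is added next (SiaTF).
    """
    if dp < 1:
        raise ValueError("DP must be at least 1")
    return "HexTF" if dp % 2 == 1 else "SiaTF"


for dp in range(1, 5):
    print(dp, acceptor_name(dp, "Gal"), "extended by", next_transferase(dp))
```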
Anion exchange separation of 4-MU-Sia and the elongated products 4-MU-DP2 to 4-MU-DP4 was achieved by HPLC with a CarboPac PA-100 column and NaNO3 gradient. Product profiles for the NmW and NmY substrates show baseline resolution of all compounds (Fig. 1, A and B). Although the elution behavior of each product is dominated by the total charge, charge density also plays a role. This interplay is manifested as large increases in retention time upon the addition of a negatively charged sialic acid and a smaller reduction in retention time upon the addition of a neutral hexose. The subtle effect of different Gal and Glc stereochemistries on the conformation of NmW and NmY oligosaccharides also results in small differences in the observed retention times. Although transfer of the first hexose onto 4-MU-DP1 remained incomplete under the conditions used, subsequent transfer reactions proceeded to completion. Notably, the wild-type enzymes were unable to initiate polymerization on 4-MU-Gal and 4-MU-Glc. The finding suggests that the SiaTF domain requires at least a disaccharide to prime activity or, alternatively, recognition of the acceptor may require the presence of at least one sialic acid residue.
The Fluorescent Substrates Enable Sensitive and Quantitative Detection of Capsule Polymerase Activity-To observe the initial steps in chain elongation, the wild-type SiaDW/Y were primed with 1 mM 4-MU-DP3 in the presence of a 4-fold excess of donor sugar (2 mM CMP-Sia and 2 mM UDP-Gal/Glc). Reactions were sampled over 40 min, and products were examined by HPLC-FD (Fig. 2). Shown are representative results obtained with SiaDW. Under these conditions, we observed the successive elongation of 4-MU-DP3 up to a maximum of DP19 (DP16 in the case of SiaDY).
The new assay system enabled a precise quantification of SiaDW/Y activity. The normalized and weighted area (see "Experimental Procedures") is directly proportional to the total number of sugars transferred at a given reaction time point. Progress curves are generated by plotting these values against reaction time (Fig. 2C). The congruence of reaction progress curves obtained in three independent experiments demonstrates the reliability of the assay system.
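As a worked illustration of how a progress curve follows from integrated peak areas, the sketch below normalizes each peak to the total area, weights it by the number of transfers needed to form that product, and sums over all peaks. It rests on two stated assumptions: the number of transfers is taken as the product DP minus the primer DP (one sugar per transfer), and the peak areas and time points are invented values, not data from this study.

```python
def normalized_weighted_area(peak_areas, primer_dp):
    """Sum over peaks of (transfers needed) * (peak area / total area).

    peak_areas maps the degree of polymerization (DP) of each product
    to its integrated fluorescence peak area.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    if any(dp < primer_dp for dp in peak_areas):
        raise ValueError("product DP cannot be below the primer DP")
    # Assumption: forming a product of a given DP takes (DP - primer DP) transfers.
    return sum((dp - primer_dp) * area / total for dp, area in peak_areas.items())


# Invented chromatogram summaries: time (min) -> {DP: peak area}.
hypothetical_time_course = {
    0: {3: 100.0},
    10: {3: 50.0, 4: 30.0, 5: 20.0},
    20: {3: 10.0, 4: 20.0, 5: 30.0, 6: 40.0},
}

progress_curve = {
    t: normalized_weighted_area(areas, primer_dp=3)
    for t, areas in hypothetical_time_course.items()
}
print(progress_curve)
```

Plotting the values of progress_curve against time gives a curve of the kind described for Fig. 2C.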
Observation of Single Sugar Transfers Confirms Monofunctionality of Active Site Mutants-We previously generated active site mutants that disabled either the SiaTF or HexTF function of SiaDW (12). However, monofunctionality was tested in a complementation assay using paper chromatography to detect incorporation of radiolabeled sugars into long polysaccharides. In order to unequivocally observe the transfer of single sugar residues, we decided to test the single point mutant enzymes with the new fluorescent acceptor substrates. In separate reactions, NmW-(S972A)-His6 and NmW-(E307A)-His6 were incubated with both donor sugars and primed with either 4-MU-DP2 or 4-MU-DP3 (Fig. 3). As expected, each enzyme transferred a single sugar residue onto the appropriate acceptor, clearly demonstrating the monofunctionality of these enzymes. These monofunctional enzymes served as positive controls in the subsequent experiments to delineate the HexTF and SiaTF domains.
Truncation Mutagenesis to Separate HexTF and SiaTF in SiaDW and SiaDY-A schematic representation of the linear sequence of SiaDW/Y (Fig. 4A) shows the previously predicted HexTF and SiaTF domains and a stretch of 365 amino acids linking these domains (12). In bioinformatics analyses, neither sequence homologies nor structural folds could be identified in the central 365 amino acids. Therefore, we initially hypothesized that this sequence is not part of the catalytic domains but is necessary for tertiary organization of the two GT-B domains. To test this hypothesis, constructs harboring the predicted catalytic domains CΔ639 and NΔ777 (see [...]). In contrast, no activity could be detected with the predicted SiaTF domain (NΔ777; Fig. 5E), suggesting that the linker region, or part thereof, might be necessary to form the active SiaTF domain. Thus, the NΔ398 truncation, comprising the SiaTF domain and the entire linker region (see Fig. 4A), was generated and tested. Indeed, this construct showed SiaTF activity comparable with the positive control (compare Figs. 5B and 3A), confirming that the SiaTF domain extends beyond the predicted GT-B fold. Further truncation constructs were tested to more precisely define the boundary of the SiaTF domain. We found that NΔ562 and NΔ609 (Fig. 5, C and D) retained SiaTF activity, but further truncation completely inactivated the enzyme. The results described for the HexTF and SiaTF domains of SiaDW were similarly obtained for the two domains in SiaDY. The expression of SiaDY truncation mutants is displayed in Fig. 4B. Activity data are not shown.
DECEMBER 5, 2014 • VOLUME 289 • NUMBER 49

The Capsule Polymerases of N. meningitidis W and Y
Taken together, the truncation studies defined the boundaries of the SiaTF domains of these capsule polymerases and demonstrated that the domains can be separated and expressed as active transferases. As originally predicted, the N-terminal HexTF domain comprises a classic GT-B fold belonging to CAZy family GT4 (25). In contrast, the C-terminal SiaTF domain extends beyond the predicted GT-B fold and includes ~170 additional amino acids of the linker region. The sequence stretch connecting the two GT-B folds is thus not a linking element but part of the functional SiaTF domains in SiaDW/Y.
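The residue ranges implied by the truncation names can be checked with simple arithmetic. The sketch below assumes that NΔk denotes removal of the first k residues and CΔk removal of the last k residues, and it takes the full length of 1037 residues from the domain ranges given in the summary; the helper name is hypothetical.

```python
import re

FULL_LENGTH = 1037  # last residue of the C-terminal domain range in the summary


def retained_residues(construct: str, full_length: int = FULL_LENGTH):
    """Return (first, last) residue kept in a truncation named NΔk or CΔk.

    Assumption: NΔk lacks the first k residues, CΔk lacks the last k residues.
    """
    match = re.fullmatch(r"([NC])Δ(\d+)", construct)
    if match is None:
        raise ValueError(f"unrecognized construct name: {construct!r}")
    terminus, deleted = match.group(1), int(match.group(2))
    if not 0 < deleted < full_length:
        raise ValueError("deletion must be shorter than the protein")
    if terminus == "N":
        return deleted + 1, full_length
    return 1, full_length - deleted


for name in ("CΔ639", "NΔ777", "NΔ609", "NΔ562", "NΔ398"):
    first, last = retained_residues(name)
    print(f"{name}: aa {first}-{last} ({last - first + 1} residues)")

# Residues present in the shortest active SiaTF construct (NΔ609)
# but absent from the inactive predicted GT-B construct (NΔ777).
extra = retained_residues("NΔ777")[0] - retained_residues("NΔ609")[0]
print("additional residues:", extra)
```

Under these assumptions the difference between NΔ609 and NΔ777 is 168 residues (aa 610-777), which matches the number of additional amino acids reported for the active SiaTF domain.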
The Sialyltransferase Domains in SiaDW/Y Define a New Glycosyltransferase Family-BLAST searches carried out with the newly identified SiaTF domain (SiaDW aa 563-1037) failed to reveal similarity to any of the known SiaTF families classified in the CAZy database (14). Further, searching the Pfam database (26) revealed no matches to HMM profiles (27,28) constructed with the known SiaTF families. However, iterative PSI-BLAST searches of the non-redundant protein sequence database (29) revealed similarity to a number of uncharacterized protein sequences. The identified sequences were filtered for E-value (better than 1e−50) and for the presence of conserved motifs that have previously been identified in SiaDW (12). The presented data support the inclusion of these proteins as a new family of glycosyltransferases, GT97, in the CAZy database.
Members of the new family are only found in a handful of taxonomically scattered species of bacteria and archaea. No homologues were found in eukaryotes. Notably, the identified sequences are primarily found in two types of organisms with very distinct habitats: (i) commensal bacteria inhabiting humans and animals, including a number of opportunistic pathogens, as well as (ii) extremely halophilic bacteria and archaea isolated from saturated brines, salt lakes, and ponds (Fig. 6). Because archaea are not known to incorporate sialic acid into their glycans but have been shown to use homologous biosynthetic pathways for the incorporation of the related nonulosonic acids (NulOs), legionaminic acid and pseudaminic acid (30,31), it is possible that GT97 glycosyltransferases exhibit specificity for these CMP-activated sugars. Supporting the putative function of GT97 family members as NulO transferases, the genome sequences of all of the identified organisms were found to harbor homologs of other nonulosonic acid biosynthesis genes, specifically NulO synthases (32,30) and CMP-NulO synthetase (33,30). Although the role of sialylated structures in commensal organisms and opportunistic pathogens is generally understood to be the avoidance of the immune system by molecular mimicry of host structures (34), the functional significance of GT97 family proteins in extreme halophiles is less clear. The phylogenetic tree for the new family revealed clustering that is largely at odds with the true phylogenetic relationships among these organisms (Fig. 6), suggesting that convergent evolution or horizontal gene transfer may have played a role in evolution of the GT97 family.
An alignment of the new family (Fig. 7) reveals slight differences from the known bacterial sialyl motifs (indicated in the alignment in red letters). The (D/E)(D/E)G motif (35), which is part of the catalytic center in bacterial sialyltransferases of CAZy families GT38, GT52, and GT80, was found to be replaced by QYA or QHG in the new family. Two other bacterial sialyl motifs, the HP motif (35) and the S(S/T) motif (36), which are involved in binding the nucleotide sugar (35), are also highly conserved in GT97. However, the S(S/T) motif is strictly ST in the GT97 family. The functional importance of the HP and the ST motifs has previously been demonstrated for SiaDW/Y (12). To interrogate the importance of the QYA motif, the point mutant SiaDW-Y836F was prepared; this substitution completely abolished activity (data not shown).
Taken together, the work presented in the current study (i) defines the minimum boundaries of the SiaTF domains of SiaDW/Y, (ii) shows that these domains are functional in isolation, and (iii) allows us to define a novel glycosyltransferase family, GT97, which is present in commensal bacteria of humans and animals and in extremely halophilic archaea and bacteria.

DISCUSSION
With the recent demonstration that the capsule polymerases of the N. meningitidis serogroups W and Y (SiaDW and SiaDY) are chimeric proteins comprising family GT4 HexTFs at their N terminus and a SiaTF at their C terminus (12), the question arose of whether the two enzymatic functions depend on their common presence in one polypeptide chain or if the functional subunits can be expressed separately. Here we demonstrate that the latter is the case, and we report a truncation study that delineates the boundary of the functionally active C-terminal SiaTF domain. This functional domain is ~170 residues longer than the previously predicted GT-B fold domain (12). However, from our results, it is not clear whether the additional region (amino acids 610-777) serves a chaperone function ensuring correct folding of the SiaTF domain or if it directly participates in catalysis.
This study sees SiaDW/Y classified into two CAZy families, GT4 based on the HexTF domain and GT97 based on the newly delineated SiaTF domain. Several other bifunctional GTs also belong to different CAZy families, including the Pasteurella multocida heparosan synthases (GT2 and GT45) (37) and the bifunctional human glycosyltransferase LARGE (GT8 and GT49) (38).
Successful delineation of the two catalytic units required a novel assay system for the detection of single sugar transfers. To this end, we exploited two monofunctional active site mutants, capable of transferring either a single hexose or sialic acid residue, for the controlled synthesis of fluorescently labeled acceptor substrates. The new substrates (4-MU-Sia-Hex for SiaTFs and 4-MU-Sia-Hex-Sia for HexTFs), combined with HPLC-FD analysis of reaction products, provided a highly sensitive and robust assay of these enzymatic activities. The stability and specificity of the new substrates for the target activity enabled activity measurements directly from bacterial lysates, which accelerates mutant testing enormously. In addition, we demonstrate that the new assay system can be used to quanti- [...]

FIGURE 6. Phylogenetic tree of family GT97. Phylogenetic analysis was carried out using the PhyML software with bootstrapping (23) using the alignment shown in Fig. 7 as input. Protein accession numbers and species names are indicated, and bootstrapping values are given at the nodes. GT97 family members were only found in the bacterial and archaeal domains of life, and the majority of these species are either human or animal commensals (including several opportunistic pathogens) or extreme halophiles, as indicated.

[...] wild-type nor monofunctional SiaTFs used 4-MU-Gal/Glc as an acceptor, but these enzymes were highly efficient with 4-MU-Sia-Hex. These results indicate that a minimum of one Sia residue is necessary for recognition by the acceptor binding sites of both the HexTF and SiaTF domain. Further, the addition of the hydrophobic group MU improved acceptor quality because free Sia was not a substrate for the enzymes (12). These findings may reflect the nature of the in vivo priming acceptor. Willis and Whitfield (2) recently demonstrated for a number of Gram-negative bacteria, including N. meningitidis serogroup B, that the capsular polysaccharide is anchored via the reducing end to a unique glycolipid consisting of a lysophosphatidylglycerol moiety β-linked to oligo-3-deoxy-D-manno-oct-2-ulosonic acid. The observed acceptor specificities of SiaDW/Y agree well with the proposition that this universal anchor structure also primes in vivo capsule synthesis in NmW and NmY. This would explain both the requirement for a Sia residue, because this is a structural analog of 3-deoxy-D-manno-oct-2-ulosonic acid, and the improvement of acceptor quality by the addition of a hydrophobic group that may occupy the lipid binding site.

Using the newly defined SiaTF domain to search protein databases revealed a new family of glycosyltransferases that is classified as GT97 in the CAZy database. The phylogenetic tree of GT97 sequences showed anomalous clustering, with high similarity occurring between sequences from distantly related species. This may suggest that horizontal gene transfer or convergent evolution has played a role in the history of the GT97 family. A case in point is the grouping of the Bacteroidetes Salinibacter ruber protein (WP_013061143) with those from haloarchaea Natronorubrum sulfidifaciens, Halorhabdus tiamatea, and Halorubrum kocurii. Indeed, the genome sequence of the extremely halophilic bacterium S. ruber revealed that considerable amounts of genetic material have been shared with its haloarchaea coinhabitants in concentrated sea water brines (39). These findings are in agreement with the well studied phylogeny of NulO biosynthetic pathways, which include  pathways for sialic acid, pseudaminic acid, and legionaminic acid (30,31). These pathways are also taxonomically scattered among bacteria and archaea and show complex phylogenetic relationships suggestive of horizontal gene transfer and convergent evolution (30,31,40). The presence of GT97 sequences in archaea raises the question of whether this family exhibits specificity for different CMP-activated donor sugars in the different domains of life. Archaea are not known to incorporate sialic acid into their glycans but rather pseudaminic acid and legionaminic acid, which share related biosynthetic pathways (30). Thus, it is possible, if not likely, that some of the identified GT97 proteins are actually pseudaminic acid or legionaminic acid transferases.
A notable finding is the presence of GT97 proteins in several extreme halophiles. Interestingly, in this respect, adaptation to hypersaline environments has been shown to involve acidification of the proteome, the replacement of neutral amino acids with acidic residues to maintain hydration (41). Thus, one may speculate that nonulosonic acid biosynthesis pathways confer a selective advantage by similarly increasing the acidity of the glycome. However, further work is needed to confirm the function of other members of the GT97 family and to interrogate the possibility of glycome acidification in extreme halophiles.