Deletion of Two Exons from the Lymnaea stagnalis β1→4-N-Acetylglucosaminyltransferase Gene Elevates the Kinetic Efficiency of the Encoded Enzyme for Both UDP-sugar Donor and Acceptor Substrates*

Lymnaea stagnalis UDP-GlcNAc:GlcNAcβ-R β1→4-N-acetylglucosaminyltransferase (β4-GlcNAcT) is an enzyme with structural similarity to mammalian UDP-Gal:GlcNAcβ-R β1→4-galactosyltransferase (β4-GalT). Here, we report that the exon organization of the genes encoding these enzymes is also very similar. The β4-GlcNAcT gene (12.5 kilobase pairs, spanning 10 exons) contains four exons, encompassing sequences that are absent in the β4-GalT gene. Two of these exons (exons 7 and 8) show a high sequence similarity to part of the preceding exon (exon 6), suggesting that they have originated by exon duplication. The exon in the β4-GalT gene corresponding to β4-GlcNAcT exon 6 encodes a region that has been proposed to be involved in the binding of UDP-Gal. The question therefore arose whether the repeating sequences encoded by exons 7 and 8 of the β4-GlcNAcT gene would determine the specificity of the enzyme for UDP-GlcNAc, or for the less preferred UDP-GalNAc. It was found that deletion of only the sequence encoded by exon 8 resulted in a completely inactive enzyme. By contrast, deletion of the amino acid residues encoded by exons 7 and 8 resulted in an enzyme with an elevated kinetic efficiency for both UDP-sugar donors, as well as for its acceptor substrates. These results suggest that at least part of the donor and acceptor binding domains of the β4-GlcNAcT are structurally linked and that the region encompassing the insertion contributes to acceptor recognition as well as to UDP-sugar binding and specificity.

Glycosyltransferases form a large family of functionally related, membrane-bound enzymes that are involved in the biosynthesis of the carbohydrate moieties of glycoproteins and glycolipids (1, 2). Recently we have identified a novel glycosyltransferase by the isolation of a UDP-GlcNAc:GlcNAcβ-R β1→4-N-acetylglucosaminyltransferase (β4-GlcNAcT) cDNA from the prostate gland of the snail Lymnaea stagnalis (3). In vitro, the recombinant β4-GlcNAcT catalyzes the transfer of GlcNAc from UDP-GlcNAc in β1→4 linkage to various β-N-acetylglucosaminides (3, 4). The β4-GlcNAcT cDNA appeared to show a significant sequence similarity to the mammalian UDP-Gal:GlcNAcβ-R β1→4-galactosyltransferase (β4-GalT) cDNAs, with an overall resemblance between the predicted amino acid sequences of about 30% (3, 5-7). Based on the genetic and enzymatic relationship of the β4-GlcNAcT and the β4-GalTs, we have suggested that these enzymes constitute a separate glycosyltransferase gene family, the members of which are capable of catalyzing the transfer of a specific sugar from their respective UDP-sugar donors in a β1→4 linkage toward a terminal β-linked GlcNAc residue in the acceptor (3, 8). Based on enzymatic properties, we have proposed that UDP-GalNAc:GlcNAcβ-R β1→4-N-acetylgalactosaminyltransferase (β4-GalNAcT), detected in several non-vertebrate species (9-13), also belongs to this family (14). The primary structure of this enzyme, however, is still unknown.
The reaction catalyzed by glycosyltransferases typically involves two substrates and often a divalent cation cofactor. This suggests that the enzymes consist of several functional domains involved in substrate and cofactor binding, respectively. Comparison of the conserved and variable regions of genetically related glycosyltransferases with different properties would open possibilities to address structure-function relationships. As snails and mammals are evolutionarily distant species, comparison of the genomic organization of the genes that encode these enzymes might give insight into the way in which they diverged into genes encoding enzymes with a different UDP-sugar specificity. The genomic organization of the murine and human β4-GalT genes has been described previously (15, 16). Here we report the organization of the L. stagnalis gene that codes for the β4-GlcNAcT. The intron-exon distribution was determined and compared with that of the β4-GalT gene. Mutant β4-GlcNAcT cDNAs were constructed by deletion of sequences that do not have a counterpart in the β4-GalT gene, and expressed in COS cells. Comparison of the kinetic parameters of the resulting mutant and parental enzymes showed that the insertion in the β4-GlcNAcT and its surrounding regions contribute to acceptor recognition as well as to UDP-sugar binding and specificity.
DNA Techniques-The genomic clone 5 was isolated previously (3) from a genomic EMBL3A library of L. stagnalis (17). Isolation of plasmid DNA was carried out by a modification of the minilysate method, as described in Ref. 18. Plasmids used for transfection of COS cells were isolated by the Qiagen plasmid protocol, using a QIAGEN-tip 100 minicolumn. Restriction enzymes and other DNA-modifying enzymes were used according to the manufacturer's instructions.
Dideoxynucleotide chain-terminating sequencing reactions (19) were performed on double-stranded plasmid DNA with the T7 DNA sequencing kit (Pharmacia Biotech Inc.) and [α-35S]dATP (Amersham), using M13 universal primer, the KS and SK primers, and several sequence-specific synthetic oligonucleotide primers. Southern blotting was performed as described previously (3). PCR with sequence-specific primers was performed using Ultma polymerase (Perkin-Elmer) for 25 cycles (1 min at 95°C, 1 min at 63°C, 1 min at 72°C). For cloning purposes, amplified fragments were subsequently purified according to the QIAquick PCR purification protocol (Qiagen Inc.).
Construction of pPROTA Hybrid Plasmids-The plasmid pMC135, containing a fusion between part of the protein A sequence and β4-GlcNAcT cDNA, was constructed as follows. A BamHI-EcoRI adapter was ligated in the BamHI site of pVTBac-P11.4, carrying 5′-truncated β4-GlcNAcT cDNA (3). The resulting 1.4-kb EcoRI fragment from this construct was ligated into an EcoRI-digested pPROTA vector (20). Plasmids carrying mutant β4-GlcNAcT genes were derived from pMC135 by exchange of the 0.52-kb XhoI-BglII fragment for a PCR fragment carrying the desired deletion. For PCR of the deletion fragments, a sense primer (bases 439-456 of the β4-GlcNAcT cDNA; Ref. 3) and the antisense primer ID8 (ATAGATCTTAAATCTGTCCGGGTTCACATTCCAG) or ID16 (ATAGATCTTAAATCTGTTCGGGTGCACGTTC) were used. The antisense primer ID8, used for construction of pMC142, consists of a part complementary to the 3′ end of exon 6, and a part (11 base pairs) complementary to the 5′ end of exon 9 (the primer carries a BglII restriction site, AGATCT, near its 5′ end). The antisense primer ID16, used for construction of pMC166, consists of a part complementary to the 3′ end of exon 7, and the same 5′ part of exon 9 as ID8. PCR fragments obtained with these primers were digested with XhoI and BglII, and ligated into XhoI-BglII-digested pMC135.
After transformation and plasmid isolation of several transformants, the desired plasmids (pMC142 encoding protA-β4-GlcNAcTΔ7-8 and pMC166 encoding protA-β4-GlcNAcTΔ8) were selected by size determination of the internal SalI fragment followed by determination of the nucleotide sequence of the complete XhoI-BglII fragment with sequence-specific primers.
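The primer design described above can be checked with a few lines of code. The following is a minimal sketch, not part of the original procedure: it takes the two antisense primer sequences as quoted in the text (with any line-break hyphen removed), locates the BglII recognition sequence AGATCT, and prints the reverse complement of each primer, which is the sense-strand sequence the primer anneals to. The function names are hypothetical helpers introduced only for illustration.

```python
# Antisense primers as quoted in the text (5'->3').
ID8 = "ATAGATCTTAAATCTGTCCGGGTTCACATTCCAG"
ID16 = "ATAGATCTTAAATCTGTTCGGGTGCACGTTC"

# BglII recognition sequence (palindromic).
BGLII_SITE = "AGATCT"

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_position(primer: str, site: str = BGLII_SITE) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer.find(site)


if __name__ == "__main__":
    for name, primer in (("ID8", ID8), ("ID16", ID16)):
        print(name, "length:", len(primer))
        print("  BglII site at index:", site_position(primer))
        print("  sense-strand target:", reverse_complement(primer))
```

Both primers carry the restriction site two bases from the 5′ end, so that a short overhang remains for digestion of the PCR product.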
Expression in COS-7M6 Cells-Recombinant pPROTA chimeric constructs were transiently transfected into COS cells (3 × 10⁵ cells/10-cm dish), using the calcium phosphate precipitation technique as described previously (21); after 24 h, the medium was replaced by fresh medium; medium was harvested at 48, 72, and 96 h after transfection and each time replaced by fresh medium. The harvested media were pooled, stored at −20°C in portions, and used as enzyme source for glycosyltransferase assays and Western blotting. Membrane-bound recombinant β4-GlcNAcT was produced with a pMT2-based construct as described previously (3).
Western Blot Analysis-Western blot analysis was aimed at detecting the protein A part of the fusion proteins. The proteins in 1 μl of pooled medium collected after transfection were separated by SDS-polyacrylamide gel electrophoresis on 10% gels using the Mini-PROTEAN II system (Bio-Rad). Western blotting was performed essentially as described previously (21). As first antibody, an arbitrary mouse IgG monoclonal antibody (ED3, Ref. 22) was used, which reacts with the protein A part of the hybrid proteins. The second antibody used was a goat anti-mouse peroxidase conjugate (Tago, Inc. Immunodiagnostic Reagents).
Glycosyltransferase Assays-Standard glycosyltransferase assays were performed in a 50-μl reaction mixture containing either 25 nmol of UDP- As enzyme source, 10 μl of COS cell medium, which had been concentrated 10 times with Centriprep-10 concentrators (Amicon), was used. The product was isolated using Sep-Pak C-18 cartridges (Waters) (23). For acceptor specificity studies, acceptor substrate concentrations were kept at 1 mM, in terms of terminal GlcNAc residues. Control assays lacking the acceptor substrate were carried out to correct for incorporation into endogenous acceptors. Enzyme activity was expressed as pmol·min⁻¹·ml⁻¹ of the original medium. As enzyme source for the experiments studying the UDP-Gal specificity, the enzyme was isolated from the medium by binding to IgG-agarose (Sigma). 10 ml of medium was incubated with 10 μl of IgG-agarose beads for 16 h at 4°C with gentle shaking; the beads were subsequently collected by centrifugation and resuspended in 500 μl of 0.1 M sodium cacodylate buffer, pH 7, containing 1 mg/ml bovine serum albumin; 10 μl of this suspension was used in a standard assay.
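The way an activity per milliliter of original medium follows from a single assay can be sketched as below. This is an illustrative calculation only: the incubation time and the product amounts in the example are hypothetical values, not data from this study, and the function name is invented for the sketch. The only quantities taken from the text are the enzyme volume (10 μl), the 10-fold concentration of the medium, and the subtraction of a control lacking acceptor.

```python
def activity_per_ml_original(
    pmol_with_acceptor: float,
    pmol_without_acceptor: float,
    minutes: float,
    enzyme_volume_ul: float = 10.0,      # volume of concentrated medium in the assay
    concentration_factor: float = 10.0,  # medium was concentrated 10 times
) -> float:
    """Return activity in pmol/min per ml of the original (unconcentrated) medium."""
    # Subtract incorporation into endogenous acceptors (control without acceptor).
    net_pmol = pmol_with_acceptor - pmol_without_acceptor
    # 10 ul of 10x concentrated medium corresponds to 0.1 ml of original medium.
    original_medium_ml = enzyme_volume_ul / 1000.0 * concentration_factor
    return net_pmol / (minutes * original_medium_ml)


if __name__ == "__main__":
    # Hypothetical numbers: 130 pmol product with acceptor, 10 pmol without, 60 min.
    print(activity_per_ml_original(130.0, 10.0, 60.0))
```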
Kinetic parameters (Km and V) were estimated from Lineweaver-Burk plots by varying the sugar-donor concentrations from 0.05 to 0.5 mM for UDP-GlcNAc and from 0.25 to 5 mM for UDP-GalNAc while keeping the GlcNAc-S-pNP concentration at 1 mM, or by varying the acceptor substrate concentration from 0.025 to 1 mM while keeping the UDP-GlcNAc concentration at 0.5 mM. The inhibitory effect of UDP was studied at fixed concentrations of UDP-GlcNAc (0.5 mM) and GlcNAc-O-pNP (1 mM), whereas the UDP concentration was varied from 0.1 to 5 mM.
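A Lineweaver-Burk estimate amounts to a straight-line fit of 1/v against 1/[S], with slope Km/V and intercept 1/V. The sketch below illustrates this with synthetic, noise-free rates generated from assumed parameters; the values of `ASSUMED_KM` and `ASSUMED_V` and the concentration series are hypothetical and chosen only to lie within the donor range quoted above. It is not the fitting procedure used by the authors, only a minimal instance of the same linearization.

```python
def michaelis_menten(s: float, k_m: float, v_max: float) -> float:
    """Initial rate at substrate concentration s."""
    return v_max * s / (k_m + s)


def lineweaver_burk(substrate, rates):
    """Estimate (Km, V) by least-squares fit of 1/v = (Km/V) * (1/[S]) + 1/V."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept   # intercept on the 1/v axis is 1/V
    k_m = slope * v_max       # slope is Km/V
    return k_m, v_max


# Hypothetical parameters used only to generate example data.
ASSUMED_KM = 0.2   # mM
ASSUMED_V = 100.0  # arbitrary rate units

donor_mM = [0.05, 0.1, 0.2, 0.3, 0.5]
rates = [michaelis_menten(s, ASSUMED_KM, ASSUMED_V) for s in donor_mM]
km_est, v_est = lineweaver_burk(donor_mM, rates)

if __name__ == "__main__":
    print("Km estimate (mM):", km_est)
    print("V estimate:", v_est)
```

With real, noisy data the double-reciprocal transform gives most weight to the lowest substrate concentrations, so the estimates are sensitive to errors in those points.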

Isolation and Characterization of the L. stagnalis β4-GlcNAcT Gene-A genomic clone, denoted 5, was isolated previously from an EMBL3A library of L. stagnalis, and was shown to contain two short DNA sequences identical to β4-GlcNAcT cDNA sequences (3). A rough genomic map of clone 5 was constructed by PCR and Southern blot hybridization using specific β4-GlcNAcT cDNA fragments as probes (Fig. 1). Clone 5 was found to encompass the complete coding sequence of the β4-GlcNAcT gene, spanning 12.5 kb of DNA, that was divided into 10 exons (Fig. 1, Table I). Because part of the 5′-noncoding sequence is probably lacking from the cDNA (3), we cannot exclude the presence in the gene of one or more exons upstream of the denoted exon 1. Exon 10 was found to encompass the complete 3′-noncoding region. All exon sequences in the genomic clone were identical to those of the cDNA (3). Donor and acceptor splice junction sequences (Table II) are in agreement with consensus sequences reported (25).
Comparison of the L. stagnalis β4-GlcNAcT Gene with the Murine and Human β4-GalT Genes-The cDNA encoding β4-GlcNAcT shows sequence identity with the mammalian β4-GalT cDNAs identified (3, 5-7). A comparison of the protein-coding exons of the L. stagnalis β4-GlcNAcT gene with those of the murine and human β4-GalT genes (15, 16) is shown in Fig. 2. The β4-GalT gene was found to be divided into six exons, whereas the β4-GlcNAcT gene appeared to contain 10 exons. Exons 3, 4, 5, 6, and 9 of the β4-GlcNAcT gene were found to show similarity to exons 2-6 of the β4-GalT gene, corresponding to the catalytic domain of the enzyme (Fig. 2). This similarity was not only confined to a high degree of sequence identity; intron/exon boundaries were also found at identical positions within the gene.
The coding sequence of the β4-GlcNAcT gene is larger than that of the β4-GalT gene. These additional sequences appeared to be mainly encoded by three exons (exons 7, 8, and 10) that were not present in the β4-GalT gene. Exons 7 and 8 were found to encode a partial repeat of the sequence of exon 6 (Table III). The sequence analysis of the intron/exon borders (Table II) shows that exons 6, 7, and 8 are symmetrical exons. Additionally, it was observed that the acceptor splice junction sites of introns 6 and 7 as well as the donor splice junction sites of introns 6, 7, and 8 and the adjacent exon sequences are identical. These data strongly suggest that exons 7 and 8 of the β4-GlcNAcT gene arose by duplications (26), originating from the downstream part of exon 6.
Enzyme Activity of the Chimeric Proteins-The enzyme activities of the mutant hybrid proteins and the native protA-β4-GlcNAcT were assayed using similar amounts of enzyme (Fig. 4). As the native membrane-bound β4-GlcNAcT shows a low GalNAcT activity (about 7% of the β4-GlcNAcT activity), the activity of the enzymes was measured with both UDP-GlcNAc and UDP-GalNAc. ProtA-β4-GlcNAcTΔ7-8 reproducibly showed an almost 2 times higher GlcNAcT activity and a 4 times higher GalNAcT activity than the parental protA-β4-GlcNAcT (Table IV). In contrast to the other chimeric proteins, protA-β4-GlcNAcTΔ8 appeared to be enzymatically inactive. To determine if the enzymes would show GalT activity, the hybrid proteins were purified using IgG-agarose beads. In this way, the recombinant enzymes were completely freed of COS cell-derived β4-GalT. The bead-associated chimeras did not show detectable GalT activity, whereas they showed GlcNAcT and GalNAcT activities similar to those for the concentrated media (results not shown).
Acceptor specificity studies, using UDP-GlcNAc as sugar donor, showed no significant differences in acceptor preference between the two active hybrid enzymes at an acceptor concentration of 1 mM (Table V; Ref. 4). The acceptor specificity of the mutant chimeric protein using UDP-GalNAc as a sugar donor appeared very similar to the preference of the enzyme when using UDP-GlcNAc. By contrast, the prostate gland β4-GalNAcT is less specific for the linkage type of the terminal β-linked GlcNAc, and its acceptor substrate requirement resembles that of the albumen gland β4-GalNAcT (9).
Kinetic Properties of the Chimeric β4-GlcNAcTs-To explain the differences in donor specificity in more detail, the Km and V values for UDP-GlcNAc and UDP-GalNAc were determined for both protA-β4-GlcNAcT and protA-β4-GlcNAcTΔ7-8. It appears that the increase in GlcNAcT activity of protA-β4-GlcNAcTΔ7-8 is due to a 3-fold reduction in Km for UDP-GlcNAc (Table VI), whereas the increased GalNAcT activity of the mutant enzyme is mainly due to an enhanced maximum velocity. These effects result in an elevated kinetic efficiency for both types of transfer. The data suggest an involvement of the region encoded by exons 7 and 8 in UDP-sugar donor binding. However, it was found that both active chimeras were inhibited to almost the same extent by UDP (50% at 4-5 mM UDP, results not shown).
The Km and V values of two acceptor substrates were estimated (Table VII). While the V values of both enzymes appeared to be of the same order for both compounds, the Km values were decreased in the mutant enzyme. Deletion of the two exons clearly elevates the kinetic efficiency with both acceptor substrates.
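Kinetic efficiency is taken here as the ratio V/Km, so that either a lower Km or a higher V raises it. The short sketch below shows how a 3-fold reduction in Km at unchanged V translates into a 3-fold gain in efficiency. The numerical values are hypothetical and serve only to illustrate the arithmetic; they are not the values of Tables VI and VII, and the function names are invented for this example.

```python
def kinetic_efficiency(v: float, k_m: float) -> float:
    """Kinetic efficiency expressed as V/Km."""
    return v / k_m


def fold_change(parent_v: float, parent_km: float,
                mutant_v: float, mutant_km: float) -> float:
    """Ratio of the mutant efficiency to the parental efficiency."""
    return kinetic_efficiency(mutant_v, mutant_km) / kinetic_efficiency(
        parent_v, parent_km
    )


if __name__ == "__main__":
    # Hypothetical case 1: same V, Km reduced 3-fold (0.3 mM -> 0.1 mM).
    print("Km effect:", fold_change(100.0, 0.3, 100.0, 0.1))
    # Hypothetical case 2: same Km, V doubled.
    print("V effect:", fold_change(100.0, 0.3, 200.0, 0.3))
```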

Characterization of the N-Acetylgalactosaminylated Product Obtained with β4-GlcNAcTΔ7-8-The N-acetylgalactosaminylated product formed by incubating protA-β4-GlcNAcTΔ7-8, UDP-GalNAc, and GlcNAc-S-pNP was anticipated to be GalNAcβ1→4GlcNAc-S-pNP. To confirm this, the product was compared with authentic GalNAcβ1→4GlcNAc-S-pNP obtained by the action of L. stagnalis albumen gland β4-GalNAcT (9). Both product and reference were subjected to lectin affinity chromatography with immobilized WFA, known to bind with high affinity to glycans containing terminal GalNAc in a β1→4 linkage (24, 28, 29). More than 90% of both compounds bound to the immobilized WFA, and could be eluted with 10 mM GalNAc. In addition, reference and product eluted on HPAEC with the same retention time (data not shown). These results indicate that β4-GlcNAcTΔ7-8 shows a β4-GalNAcT activity.

DISCUSSION
Based on the sequence identity of the mammalian β4-GalTs and L. stagnalis β4-GlcNAcT, these enzymes have been proposed to be members of one gene family (3). The observed conservation of the exon-intron organization of the genes encoding these enzymes strengthens this view. All exons of the β4-GalT gene show sequence identity to parts of the β4-GlcNAcT gene, which suggests that both genes originated from a common ancestor, and diverged to genes encoding enzymes with a different sugar donor specificity.
The most remarkable difference between the genes is found in an insertion of two exons (exons 7 and 8) in the β4-GlcNAcT gene. These exons were found to encode partial repeats of exon 6. The sequences of these exons and those of the bordering introns strongly suggest that exons 7 and 8 were generated by internal exon duplications, and originate from exon 6. This is the first glycosyltransferase described in which exon duplication seems to have taken place during evolution. Generally, exon duplication is thought to play an important role in the evolution of genes, and many complex genes have been described that have evolved by internal exon duplication and subsequent modification of the primordial genes (30, 31).

For the GlcNAcT assays 100% activity is 128 and 338 pmol·min⁻¹·ml medium⁻¹, respectively. For the GalNAcT assays 100% activity is 110 pmol·min⁻¹·ml medium⁻¹ for the mutant enzyme, and 5 nmol·min⁻¹·ml⁻¹ for the prostate-derived enzyme.

TABLE VI
Kinetic parameters of protein A-β4-GlcNAcT chimeras and a deletion mutant derived thereof for two UDP-sugar donors
The recombinant enzymes were produced as protein A fusion proteins in COS cells.
Enzyme UDP-GlcNAc UDP-GalNAc

To study the effect of the exon duplications on enzyme catalysis, we deleted the sequence (52 amino acids) corresponding to the two additional exons from the cDNA sequence. The catalytic domain of the mutant protein thus obtained might resemble that of a putative ancestor of the L. stagnalis β4-GlcNAcT. The deletion enhanced the kinetic efficiency of the enzyme for both the transfer of GlcNAc and GalNAc, as well as for its acceptor substrates. The acceptor substrate specificity, i.e. the preference for a terminal β6-linked GlcNAc, was not affected. The mutant enzyme thus has an enhanced catalytic potential, but also an increased sugar-donor promiscuity. Surprisingly, deletion of only the sequence encoded by exon 8 resulted in a completely inactive enzyme. An explanation could be that the enzyme is not properly folded with only one additional repeat, whereas the presence of two additional copies of the repeated sequence allows correct folding.
From the results obtained here, it is difficult to deduce the selective advantage that was obtained by the exon duplications for the biological function of the β4-GlcNAcT. As can be inferred from the sequence, changes have occurred in the repeats after the exon duplications, and most likely also in other regions of the protein. It is possible, however, that reduction of the capacity of the primordial β4-GlcNAcT to transfer GalNAc might have been advantageous for the snail. The increase in the specificity of the enzyme for UDP-GlcNAc would then have been of more importance for the snail than the loss of catalytic potency that coincided with the introduction of the additional exons.
In β4-GalT the corresponding region (encoded by exon 5) has been proposed to be involved in UDP-Gal binding (32-34). In the same region a tetrapeptide (DKKN) has been found that is conserved between β4-GalT and α3-GalT, which is in support of this proposition (35). In the human β4-GalT a second region (in exon 4) has been proposed to be involved in UDP-Gal binding (36, 37). Interestingly, this latter region shows a high degree of similarity between the β4-GalTs and the β4-GlcNAcT, whereas the more downstream region of β4-GalT (encoded by exon 5) shows much less sequence identity with the corresponding region in the β4-GlcNAcT (encoded by exons 6, 7, and 8). As the mammalian β4-GalTs and the L. stagnalis β4-GlcNAcT use UDP-Gal and UDP-GlcNAc, respectively, it is tempting to assume that the domain that is most conserved between these enzymes is involved in the interaction with the UDP part of the nucleotide-sugar, while the more downstream region is responsible for the specificity for donor Gal and GlcNAc, respectively. The observation that both native β4-GlcNAcT and the deletion mutant were inhibited to a similar extent by UDP, suggesting that the UDP-binding domain is not affected by the deletion, is in support of this supposition. The lower Km for UDP-GlcNAc that was found for the mutant enzyme could be explained by a higher affinity of this enzyme for the GlcNAc part of the UDP-sugar.
The change in kinetic properties observed with the mutant enzyme suggests that the region around the insertion in β4-GlcNAcT is not only involved in interaction with the sugar donor, but also in binding of the acceptor substrate. Additionally, in β4-GalT a region (in exon 4) has been identified that is involved in interaction of both donor and acceptor substrates (36, 37). So in both enzymes the donor and acceptor substrate binding domains seem to be structurally linked. This is conceivable, as both substrates have to be close together for the transfer reaction. In β4-GalT another binding domain for GlcNAc, however, has been localized in exon 2/3 (34). This region shows a high sequence identity with the corresponding region in β4-GlcNAcT, which supports the suggestion that this sequence is involved in acceptor binding.
We have shown here that the L. stagnalis β4-GlcNAcT can be transformed to an enzyme that is capable of catalyzing the transfer of GlcNAc and GalNAc with a similar maximum velocity. This suggests that the β4-GalNAcT, which has been observed in L. stagnalis tissues (Ref. 9 and this study), might be encoded by a structurally related gene belonging to the β4-GalT gene family. This is further supported by several studies that have shown a similarity between invertebrate β4-GalNAcTs and the mammalian β4-GalTs in acceptor specificity (9-12), and sometimes in responsiveness to α-lactalbumin (13). Furthermore, β4-GalT shows sugar donor promiscuity at high concentrations of UDP-GalNAc (38), or in the presence of α-lactalbumin (24), resulting in the transfer of GalNAc. Interestingly, β4-GlcNAcT is not sensitive to α-lactalbumin (4), and this enzyme cannot utilize UDP-Gal, but shows a relatively high β4-GalNAcT activity (7%). By mutagenesis, we have constructed an enzyme with an even higher GalNAcT activity. A similar experiment has been documented for the blood group A (α3-GalNAcT) and B (α3-GalT) enzymes (39). These enzymes also show some nucleotide sugar donor promiscuity (40), and by construction of hybrids an enzyme with both activities was obtained (41). Our observations suggest that, in addition to the facilitation of the transfer of the desired sugar, the prevention of sugar donor promiscuity might have been a driving force in the evolution of enzymes of the β4-GalT gene family.