Molecular cloning of a developmentally regulated N-acetylgalactosamine alpha2,6-sialyltransferase specific for sialylated glycoconjugates.

A cDNA encoding a novel sialyltransferase has been isolated employing the polymerase chain reaction using degenerate primers to conserved regions of the sialylmotif that is present in all eukaryotic members of the sialyltransferase gene family examined to date. The cDNA sequence revealed an open reading frame coding for 305 amino acids, making it the shortest sialyltransferase cloned to date. This open reading frame predicts all the characteristic structural features of other sialyltransferases including a type II membrane protein topology and both sialylmotifs, one centrally located and the second in the carboxyl-terminal portion of the cDNA. When compared with all other sialyltransferase cDNAs, the predicted amino acid sequence displays the lowest homology in the sialyltransferase gene family. Northern analysis shows this sialyltransferase to be developmentally regulated in brain with expression persisting through adulthood in spleen, kidney, and lung. Stable transfection of the full-length cDNA in the human kidney carcinoma cell line 293 produced an active sialyltransferase with marked specificity for the sialoside, Neu5Ac-alpha2,3Gal-beta1,3GalNAc and glycoconjugates carrying the same sequence such as G(M1b) and fetuin. The disialylated tetrasaccharide formed by reacting the sialyltransferase with the aforementioned sialoside was analyzed by one- and two-dimensional 1H and 13C NMR spectroscopy and was shown to be the Neu5Ac-alpha2,3Gal-beta1,3(Neu5Ac-alpha2,6)GalNAc sialoside. This indicates that the enzyme is a GalNAc alpha-2,6-sialyltransferase. Since two other ST6GalNAc sialyltransferase cDNAs have been isolated, this sialyltransferase has been designated ST6GalNAc III. Of these three, ST6GalNAc III displays the most restricted acceptor specificity and is the only sialyltransferase cloned to date capable of forming the developmentally regulated ganglioside G(D1alpha) from G(M1b).

Positioned in the protective and interactive glycocalyx of the cell surface, sialylated glycoconjugates are optimally situated to mediate initial communication events between two cells. Examples demonstrating the significance of biological events mediated by these interactions range from neurodevelopment, where sialosides confer migratory properties to cells, to trafficking of leukocytes from blood to sites of inflammation and lymphoid organs (1-4). These interactions are controlled by the structural diversity of sialosides, which undergo dramatic alterations throughout ontogeny, and also by the site-specific expression of sialic acid recognition molecules such as the selectins and I-type lectins (1, 5-10, 55, 60).
Sialosides are generated by a family of glycosyltransferases termed the sialyltransferases, which transfer sialic acid from its high energy donor, CMP-sialic acid, to the nonreducing terminus of oligosaccharides (33,34). Sialyltransferases generate considerable structural diversity by transferring sialic acid with remarkable specificity for the underlying oligosaccharide substrate (33,34). The regulated expression of sialosides is dependent on many factors including the availability of sugar nucleotide to the Golgi lumen; competing glycosyltransferases; co-localization of appropriate acceptors, transferases, and sugar nucleotide transporters within a particular Golgi cisterna; and transit time of acceptors through the Golgi apparatus (56-59). However, the greatest determinant of sialoside expression is probably the site-specific expression observed for each member of the sialyltransferase gene family (30). Based on known structures for sialylated glycoconjugates, the sialyltransferase gene family has been estimated to consist of 10-12 independent gene products, although it is becoming evident that the final number of sialyltransferases may ultimately prove to be larger. To date, 11 enzymatically distinct mammalian sialyltransferases have been cloned by direct and indirect methods (10-24, 27-29, 42). Analysis of their amino acid sequences has revealed two conserved motifs. The longer is characterized by a centrally located 48-50-amino acid region, and the shorter motif consists of a 20-24-amino acid stretch (14,16,26). These have been designated L-sialylmotif and S-sialylmotif, respectively (13,20). Site-directed mutagenesis of L-sialylmotif indicates that it plays a role in the recognition of the sugar nucleotide donor common to all sialyltransferases, CMP-Neu5Ac (61).
When the eukaryotic sialyltransferase cDNAs are compared, the greatest conservation of amino acids is found at opposing ends of L-sialylmotif, enabling the use of PCR to clone additional members of this gene family.

EXPERIMENTAL PROCEDURES
Materials-ZAPII vector, Escherichia coli strain XL-1 Blue, Gigapack II packaging extracts, and Bluescript plasmid vector were purchased from Stratagene. [α-32P]dCTP was obtained from Amersham Corp.; CMP-[14C]Neu5Ac was obtained from DuPont NEN, and unlabeled CMP-Neu5Ac was from Sigma. Restriction enzymes were purchased from New England Biolabs, Pharmacia Biotech Inc., and Life Technologies, Inc. Newcastle disease virus was obtained from Oxford Glycosystems, and Salmonella typhimurium sialidase was from New England Biolabs. Galβ1,3GalNAcβ1,O-benzyl was a kind gift from Dr. Shawn Defrees of Cytel Corp.
Isolation of RNA-Total RNA from rat tissues was prepared as described previously (25). Poly(A)+ mRNA was selected by two cycles of binding to oligo(dT)-cellulose type 2 (Collaborative Research) as described previously (11).
PCR Cloning with Degenerate Oligonucleotides-Based on the sequence information of the sialylmotif (14), two degenerate oligonucleotides were synthesized (Genosys), which were predicted to yield a 150-base pair amplified fragment. The sequences of the 5′ and 3′ primers were 5′-GGAAGCTTTGSCRNMGSTGYRYCRTCGT and 3′-CCG- and Y = C + T), respectively. For PCR amplification, first-strand cDNA synthesized from rat brain total RNA was combined with 100 pmol of each primer. 30 cycles (95°C for 1 min, 37°C for 1 min, and 73°C for 2 min) were run using Pfu polymerase (Stratagene), and the products were digested with BamHI and HindIII and subcloned into these sites of Bluescript SK (Stratagene). Fifty clones were sequenced using a T7 primer (Stratagene), and 24 clones contained the new sialylmotif fragment, SMY.
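The degenerate 5′ primer above can be unpacked with a short script. The following is a minimal sketch, assuming the standard IUPAC nucleotide ambiguity codes; the function names are illustrative and are not part of the original protocol. It counts how many distinct oligonucleotides the printed 5′ primer represents and builds a pattern for matching it against a template. The 3′ primer is not covered because its sequence is incomplete in the text.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# 5' primer exactly as printed in the text.
FIVE_PRIME = "GGAAGCTTTGSCRNMGSTGYRYCRTCGT"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def to_regex(primer: str) -> re.Pattern:
    """Compile a regex matching every sequence the primer can encode."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


if __name__ == "__main__":
    print("length:", len(FIVE_PRIME))
    print("degeneracy:", degeneracy(FIVE_PRIME))
    # The primer carries a HindIII recognition site (AAGCTT) near its 5' end.
    print("HindIII site present:", "AAGCTT" in FIVE_PRIME)
```

The HindIII site found in the primer is consistent with the HindIII digestion used for subcloning. A high degeneracy combined with the low 37°C annealing step trades specificity for the ability to prime on divergent family members.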
Cloning of a Rat Sialyltransferase Containing SMY-Two separate reactions, one primed with oligo(dT) primers and one with random hexamer primers, were performed for the first-strand synthesis reaction. The two reactions were then combined for second-strand synthesis. The oligo(dT)-primed and random primed rat brain cDNA was ligated with EcoRI linkers and then ligated into EcoRI-digested ZAPII (Stratagene). The resultant library was packaged using a Stratagene Gigapack II packaging extract and plated on E. coli XL-1 Blue (Stratagene). Approximately 1,000,000 plaques were screened with the cloned PCR fragment SMY described above. Five positive clones, STY-1-STY-5, were plaque-purified and then excised into Bluescript vectors by in vivo excision with R408 helper phage. Nucleotide sequencing was carried out on double-stranded templates using Sequenase (modified T7 polymerase from U. S. Biochemical Corp.) by the dideoxynucleotide chain termination method.
Northern Analysis-Newborn RNA samples were isolated from rat pups within 4 days of birth. Samples of poly(A)+ RNA were electrophoresed in 1.2% agarose gels containing 2.2 M formaldehyde and transferred to a nitrocellulose filter (Schleicher and Schuell). Multiple tissue Northern blots of poly(A)+ RNAs were purchased from Clontech Laboratories for the analysis. 2 μg of polyadenylated RNA was loaded in each lane. The blots were probed with a gel-purified, radiolabeled (>1 × 10^9 cpm/mg), 1.3-kb EcoRI fragment isolated from STY-5.
Enzymatic Synthesis of Acceptors-GM1b was generated using recombinant ST3Gal I as described below. Briefly, 100 nmol of asialo-GM1 was resuspended in 100 μl of 20 mM cacodylate buffer, pH 6.0, containing 0.1% Triton CF-54, 20 mM CMP-Neu5Ac, and 5 milliunits of ST3Gal I. The reaction mixture was allowed to proceed overnight, at which time the enzyme was heat-killed. The reaction was passed over a 1-ml C-18 column (Baker), and the GM1b product was eluted from the column with 90% methanol. The product was judged to be 95% pure by high performance thin-layer chromatography analysis. Neu5Acα2,3-sialylated antifreeze glycoprotein was synthesized by incubating 10 mg of antifreeze glycoprotein in 500 μl with 20 mM cacodylate, pH 6.0, 0.1% Triton CF-54, 15 milliunits of ST3Gal I, and 20 mM CMP-Neu5Ac. The reaction was allowed to proceed overnight, the enzyme was heat-killed (80°C for 5 min), and the product was purified by gel filtration.
Construction and Transfection of STY Expression Vector-Although a STY cDNA construct lacking the putative cytosolic and transmembrane domains was translated in mammalian cells, this soluble form lacked any detectable sialyltransferase activity. Therefore, an EcoRI fragment encoding the STY coding region in its entirety was subcloned into this site of pcDNA3 (Invitrogen). The correct orientation was confirmed by digestion with various restriction endonucleases as well as sequencing of the junctions between the vector and open reading frame of STY. This plasmid was stably transfected into the human kidney carcinoma cell line 293 by the calcium chloride method using a mammalian transfection kit (Stratagene). Transfection was carried out exactly as described by the manufacturer. Cells in which the plasmid was stably integrated were selected by growth in media (Dulbecco's modified Eagle's medium containing 10% fetal calf serum) containing G418 (100 μg/ml).
Assay of Sialyltransferase STY Activity-293 cells stably transfected with full-length STY cDNA were grown to confluence in 225-cm2 tissue culture flasks and harvested by scraping cells into phosphate-buffered saline. Cells were pelleted, phosphate-buffered saline was removed, and cells were resuspended in 1 ml of 1% Triton X-100, 50 mM NaCl, 5 mM MnCl2, 25 mM MES, pH 6.0. It should be noted that STY requires the presence of either magnesium or manganese for optimal activity. The cell pellet was solubilized by 10 passes through a 1-ml Pipetteman tip followed by vortexing for 10 s at a setting of seven. This crude homogenate was centrifuged for 10 min at 1000 × g; supernatant was removed and used directly as enzyme source. The assay mixture consisted of 50 μM CMP-Neu5Ac with 250,000 cpm of CMP-[14C]Neu5Ac added as tracer, 0.1% Triton CF-54, 20 mM cacodylate, pH 6.0, and 10 μl of enzyme extract in a 30-μl reaction mixture. Glycoprotein and glycolipid products were separated from CMP-Neu5Ac by gel filtration as described previously (62). The sialic acid content of each glycoconjugate was fixed at 0.5 mM. In the case of oligosaccharides, disialylated product formed by STY was separated from monosialylated acceptor and CMP-Neu5Ac by Mono Q HPLC as described previously (63). Oligosaccharide acceptor concentration was fixed at 0.1 mM, and 20 μM CMP-Neu5Ac plus 200,000 cpm of CMP-[14C]Neu5Ac was included as tracer in a 30-μl assay mix.
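The tracer amounts quoted above fix the conversion between counts and moles of sialic acid transferred. The sketch below works through that arithmetic under one stated assumption: the mass of the radiolabeled tracer is negligible next to the unlabeled CMP-Neu5Ac, so the donor pool is taken as the nominal concentration times the assay volume. The function names and the example count values are hypothetical and do not come from the paper.

```python
def tracer_specific_activity(cpm_added: float, donor_um: float, volume_ul: float) -> float:
    """Return cpm per pmol of CMP-Neu5Ac in the assay.

    Concentration in uM multiplied by volume in ul gives pmol of donor.
    Assumes the labeled tracer contributes negligible mass.
    """
    donor_pmol = donor_um * volume_ul
    return cpm_added / donor_pmol


def pmol_transferred(product_cpm: float, background_cpm: float, specific_activity: float) -> float:
    """Convert background-corrected product counts to pmol of Neu5Ac transferred."""
    return max(product_cpm - background_cpm, 0.0) / specific_activity


if __name__ == "__main__":
    # Glycoconjugate assay: 50 uM donor, 250,000 cpm tracer, 30-ul reaction.
    sa_glyco = tracer_specific_activity(250_000, 50, 30)
    # Oligosaccharide assay: 20 uM donor, 200,000 cpm tracer, 30-ul reaction.
    sa_oligo = tracer_specific_activity(200_000, 20, 30)
    print(f"glycoconjugate assay: {sa_glyco:.1f} cpm/pmol")
    print(f"oligosaccharide assay: {sa_oligo:.1f} cpm/pmol")
    # Hypothetical counts, for illustration only.
    print(f"{pmol_transferred(5250, 250, sa_glyco):.1f} pmol transferred")
```

Because the two assay formats use different donor concentrations and tracer amounts, raw counts from the glycoconjugate and oligosaccharide assays are only comparable after this conversion.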
Differential Sialidase Treatment of STY Product-Fetuin was sialylated with CMP-[14C]Neu5Ac using cellular extract from 293 cells expressing STY as described above. STY-sialylated fetuin was isolated by gel filtration, concentrated, and washed with water to remove salts and any remaining CMP-Neu5Ac using Centricon 10 (Amicon) microfiltration devices. This material was used to test the ability of Newcastle disease virus (NDV) sialidase (known to cleave α2,3 and α2,8 linkages) (40) or Salmonella typhimurium sialidase (known to cleave both α2,3 and α2,6 linkages under the conditions used, New England Biolabs) to hydrolyze the sialic acid added by STY. 10,000 cpm of STY-sialylated fetuin was incubated with 8 milliunits of NDV or 50 units (1 μl) of S. typhimurium sialidase in 10 μl for 12 h, after which the sialidase was heat-killed. Control substrates included asialo-α1-acid glycoprotein sialylated with purified ST3Gal III or purified ST6Gal I. Sialidase-released sialic acid was resolved from sialylated glycoprotein by gel filtration.
Enzymatic 3GalNAcβ1-benzyl-20 mg of Galβ1,3GalNAcβ1-benzyl was sialylated to near completion using 70 milliunits of recombinant ST3Gal I purified from baculovirus supernatants by CDP-hexanolamine chromatography as described previously (65). Briefly, 49 mg of CMP-Neu5Ac (20 mM final), 25 mM cacodylate, pH 6.0, 0.5% Triton CF-54, and 70 milliunits of ST3Gal I were mixed in a total reaction volume of 4 ml. The progress of the reaction was monitored by TLC using isopropyl alcohol, 1 M ammonium acetate to develop the chromatogram and resorcinol spray for detection. After an overnight incubation at 37°C, the reaction was approximately 95% complete as judged by TLC analysis. This material was passed over a 5-ml C-18 column, washed with 20 ml of water, and finally eluted with 20% methanol. The reverse phase purification step effectively removed all traces of protein and detergent from the oligosaccharide products. Approximately 50% of the monosialylated trisaccharide (Neu5Acα2,3Galβ1,3GalNAcβ1-benzyl) bound to the column and was eluted, yielding approximately 9 mg after drying under vacuum. The monosialylated trisaccharide eluted from the C-18 column was approximately 95% pure as judged by TLC and required no further purification prior to sialylation with STY. The remaining 50% was recovered in the run through/wash fractions and could be desalted and purified by anion-exchange HPLC as described below.
Synthesis and Purification of the Disialylated Tetrasaccharide Formed by STY-293 cells stably transfected with full-length STY cDNA were grown to confluence in three 225-cm2 tissue culture plates. Cells were harvested and solubilized as described above just prior to enzymatic synthesis of STY sialoside. In a 4-ml reaction volume, 49 mg of CMP-Neu5Ac (20 mM), 9 mg of Neu5Acα2,3Galβ1,3GalNAcβ1-benzyl, and 2 ml of STY extract in 0.5% Triton X-100, 25 mM NaCl, 2.5 mM MnCl2, 12.5 mM MES, pH 6.0, 2.5 mM cacodylate, pH 6.0, were incubated overnight at 37°C.
The progress of the reaction was monitored by TLC as described above. After an overnight incubation, about 40% of the sialylated trisaccharide had been converted to the disialylated tetrasaccharide. The reaction mixture was directly passed over a 5-ml C-18 column. Approximately 85% of the disialylated STY product failed to bind to the column and was recovered in the aqueous run through and wash fractions. The remaining 15% was eluted from the C-18 column with 20% methanol, dried, and resolved by anion-exchange HPLC as described below. The run through and wash fractions were lyophilized, resuspended in 600 μl of 20% ethanol, and desalted over a Bio-Gel P-2 column equilibrated in 20% ethanol. 0.5-ml fractions were collected, and elution of the P-2 column (1.8 × 20 cm) was monitored by spotting every other fraction on a TLC plate and subsequent detection by resorcinol spray. Fractions containing the STY sialoside were dried for further purification. This column provided substantial enrichment of disialylated tetrasaccharide STY sialoside over its monosialylated trisaccharide precursor. The desalted fractions containing the STY sialoside were resuspended in 2 ml of Milli Q water and fractionated over a strong anion HPLC column (HEMA-IEC, Alltech) utilizing a 500-μl injection loop. Elution was monitored by following the absorbance at 214 nm. To avoid overloading the column, 4-5 separate runs were performed. Elution buffers were 1 mM Tris, pH 8.0 (Buffer A) and 1 mM Tris, pH 8.0, 0.2 M NaCl (Buffer B), utilizing a gradient of 0% Buffer B for 6 min, increasing to 22.4% over 12 min, and then to 90% over 17 min, at 1.2 ml/min and collecting 0.6-ml fractions. Fractions containing STY sialoside were dried, resuspended in 20% ethanol, and desalted over a Bio-Gel P-2 column. Those fractions containing STY sialoside were pooled and dried for analysis by NMR.
This system easily resolved the disialylated STY sialoside from its monosialylated precursor, CMP-Neu5Ac, and any free sialic acid.
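The anion-exchange gradient described above is a piecewise-linear program, which can be written down explicitly. The following is a minimal sketch under one assumption: "over 12 min" and "over 17 min" are read as ramp durations that follow the initial 6-min hold, so the breakpoints fall at 6, 18, and 35 min. If the times were instead meant as absolute run times, only the breakpoint list would change. All names are illustrative.

```python
# (time in min, % Buffer B) breakpoints, assuming ramp durations of 12 and 17 min
# that follow the initial 6-min hold at 0% Buffer B.
BREAKPOINTS = [(0.0, 0.0), (6.0, 0.0), (18.0, 22.4), (35.0, 90.0)]

# Buffer B contains 0.2 M NaCl; Buffer A contains none.
BUFFER_B_NACL_M = 0.2


def percent_b(t_min: float) -> float:
    """Percent Buffer B at a given time, by linear interpolation between breakpoints."""
    if t_min <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]
    for (t0, b0), (t1, b1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    # After the last breakpoint, hold the final composition.
    return BREAKPOINTS[-1][1]


def nacl_molar(t_min: float) -> float:
    """NaCl concentration (M) delivered to the column at a given time."""
    return BUFFER_B_NACL_M * percent_b(t_min) / 100.0


if __name__ == "__main__":
    for t in (0, 6, 12, 18, 26.5, 35):
        print(f"t = {t:>4} min: {percent_b(t):5.1f}% B, {nacl_molar(t) * 1000:6.1f} mM NaCl")
```

Under this reading, the shallow first ramp spans roughly 0 to 45 mM NaCl, the range in which singly and doubly charged sialosides would be expected to separate, while the steep second ramp clears more strongly retained material.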
Nuclear Magnetic Resonance Spectroscopy-Samples (~2 mmol) of the trisaccharide precursor and the STY-produced tetrasaccharide product were first dissolved in D2O (99.6% d) at pH 7.0, lyophilized, and then re-exchanged 4 times with D2O (99.99% d, Cambridge Isotope Laboratories). The final solutions of the oligosaccharides in 0.7 ml of D2O had concentrations of ~2.5 mM and pH 6.7.
NMR spectra were recorded at 23°C on Bruker AM-500 and AMX-600 spectrometers, operating at frequencies of 500 and 600 MHz, respectively, for 1H NMR. In all experiments, low power presaturation was applied to the residual HDO signal. 1H chemical shifts (δ) are expressed in ppm downfield from sodium 4,4-dimethyl-4-silapentane-1-sulfonate, with an accuracy of 0.002 ppm, and were measured relative to internal acetone at δ 2.225 ppm. 13C chemical shifts are expressed in ppm downfield from sodium 4,4-dimethyl-4-silapentane-1-sulfonate, with an accuracy of 0.02 ppm, and were measured from indirect 13C detection experiments with the RF carrier at a position calibrated to be 80.0 ppm (relative to the methyl signal of internal acetone at δ 32.90 ppm).
One-dimensional TOCSY (43,46) and ROESY (45) experiments were performed with selective excitation of accessible structural-reporter-group signals by DANTE pulse trains (50). The one-dimensional TOCSY pulse program contained a 100-ms DIPSI-2 mixing sequence (52). The one-dimensional ROESY experiments used a 240-ms, 2.2-kHz, CW spin-lock pulse train flanked by two 90° pulses for offset compensation (47). Two-dimensional double quantum filtered COSY (51), HSQC (44), and HMQC-TOCSY (48) datasets were collected in phase-sensitive mode using the time-proportional phase incrementation method (49). For the two-dimensional double quantum filtered COSY experiments, 800 FIDs of 2048 complex data points were collected; 16 scans/FID were acquired, the spectral width was set to 3623 Hz, and the RF carrier was placed at 4.0 ppm. For the HSQC spectra, 256 FIDs of 1024 complex points were acquired with 16 and 64 scans/FID for the tri- and tetrasaccharides, respectively. The HMQC-TOCSY experiment used a 40-ms MLEV-17 mixing sequence (43). The GARP-1 sequence (53) was used for 13C decoupling during 1H acquisition. The spectral width in the 13C dimension was set to 60 ppm, with the carrier at 80.00 ppm referenced to internal acetone at 32.90 ppm, with respect to sodium 4,4-dimethyl-4-silapentane-1-sulfonate. The two-dimensional datasets were typically processed with a Lorentzian-to-Gaussian weighting function applied in the t2 dimension, and a shifted squared sine bell function and zero-filling applied in the t1 dimension. Processing was performed with the Felix software package, version 2.3 (BioSym Technologies, Inc.) on a Sun Sparc workstation.

PCR Amplification of a Sialylmotif Fragment-To extend the PCR homology approach for cloning additional members of the gene family, rat brain cDNA was used as a template for PCR reactions, since brain is an abundant source of sialyloligosaccharides that are synthesized by sialyltransferases that have not yet been cloned. The PCR experiments resulted in the amplification of a 150-base pair band equal to the size of the highly conserved region of the sialylmotif. Subcloning and sequencing of the amplified PCR products revealed that the band was a mixture of three DNA fragments. Of 50 clones characterized, 26 corresponded to the sialylmotif sequences of the previously cloned sialyltransferases, ST6Gal II, and ST3Gal III. The remaining 24 clones encoded a new sialylmotif fragment, designated SMY, which contained 7 of 8 amino acids that were found to be invariant in the 12 previously cloned sialyltransferases (Fig. 1). The greatest homology is observed between SMY and the sialylmotif of ST3Gal II (54% identity), while the lowest homology is observed between SMY and the sialylmotif of ST8Sia I (33% identity).
Primary Structure of the STY cDNA-In order to clone the complete coding sequence of the gene containing the new sialylmotif, the SMY 150-base pair fragment was used to screen a rat brain cDNA library. Five positive clones, STY-1-STY-5, were isolated. Characterization of the positive clones revealed that clone STY-1 contained a 2.5-kb insert, clones STY-2 and -3 were 2.4 kb, clone STY-4 was 1.8 kb, and clone STY-5 was 1.3 kb in length. Northern analysis indicated that the STY mRNA was 3.0 kb (see below), suggesting that clone STY-1 was near full-length. Sequence analysis revealed that clones STY-1, -2, and -5 contained the complete open reading frame. STY-5 had the longest 5′-untranslated region, which contained one in-frame ATG codon and three additional upstream ATG codons that were followed by a short open reading frame ("mini-cistrons") ranging from 10 amino acids to 37 amino acids up to the stop codons, while the 5′ ends of STY-1 and -2 were −51 and −47, respectively (Fig. 2). We assigned the second in-frame ATG codon of STY-5 as the initiator codon, which was embedded within a better sequence for translation initiation than the upstream ATG, based on Kozak's rules (31). Additionally, STY-1, lacking the first initiation site, and STY-5, containing the entire 5′ end, including the first potential initiation site, were stably transfected into 293 cells. The result showed that STY-1 cell extracts had considerably more activity than STY-5 extracts did when identical amounts of protein were assayed, providing preliminary evidence that the appropriate start site is most likely the second in-frame ATG site (data not shown). The open reading frame encodes a protein of 305 amino acids and two N-linked glycosylation sites (Fig. 2). Hydropathy analysis (32) revealed one potential membrane-spanning region consisting of 17 hydrophobic residues, located 8 residues from the amino terminus (Fig. 2).
This structural feature suggests that the STY protein has a type II membrane topology characteristic of all other glycosyltransferases cloned to date (33).
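A hydropathy scan of the kind used to locate the membrane-spanning region can be sketched in a few lines. The example below assumes the Kyte-Doolittle hydropathy scale and a sliding window of 17 residues; the text does not state which scale or window was used, so both are assumptions. The test sequence is an invented toy peptide (8 polar residues, 17 hydrophobic residues, then a polar tail) and is not the STY sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the paper does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 17) -> list:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophobic_window(seq: str, window: int = 17):
    """Return (0-based start index, mean hydropathy) of the most hydrophobic window."""
    profile = hydropathy_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    toy = "MSRKDEQN" + "LLVIALLVIALLVIALL" + "RKDEQNSTRKDE"
    start, score = most_hydrophobic_window(toy)
    print(f"most hydrophobic 17-residue window starts after residue {start}, mean = {score:.2f}")
```

A single strongly hydrophobic window close to the amino terminus, preceded by a short polar segment, is the signature expected for a type II membrane protein with a signal-anchor sequence.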
Comparison of the primary structure of STY protein and the 11 other cloned sialyltransferases indicates that it is the shortest of the 12 enzymes, which range from 305 to 566 amino acids in length, and that there is no significant similarity except in two regions, the so-called sialylmotifs (14,16,26). These results indicate that this protein belongs to the sialyltransferase gene family, although outside of the sialylmotifs the homology is the lowest of all sialyltransferases cloned to date.
Expression of Full-length STY Yields Sialyltransferase Activity-Although insertion of an epitope-tagged soluble form of STY into an expression vector resulted in the translation of secreted STY when transfected into COS cells, no sialyltransferase activity was detected. Since the length of the transmembrane domain is only predicted and the stem region of STY is predicted to be relatively short, it is possible that the cDNA was truncated within the catalytic domain of STY, precluding the detection of activity. Therefore, the entire open reading frame of STY was inserted into the mammalian expression vector pcDNA3 (Invitrogen) and stably transfected into the kidney carcinoma cell line 293. Triton extracts from these cells displayed strikingly elevated levels of a unique sialyltransferase activity relative to 293 cells stably transfected with the vector alone. As displayed in Table I, STY transferred sialic acid only to sialylated glycoconjugates displaying the Neu5Acα2,3Galβ1,3GalNAc sequence, including α2,3-sialylated antifreeze glycoprotein, the ganglioside GM1b, and fetuin. In contrast, the asialo derivatives of these substrates were not acceptors for STY. Although we observed a significant incorporation rate into antifreeze glycoprotein, this was likely due to ST3Gal I endogenously expressed by 293 cells forming the Neu5Acα2,3Galβ1,3GalNAc sialoside acceptor for STY. While the ganglioside GM1b was a good acceptor for STY, other gangliosides containing the Neu5Acα2,3Galβ1,3GalNAc sequence, such as GD1a and GT1b, were not acceptors (Table I). Since GD1a and GT1b both contain a terminal Neu5Acα2,3Galβ1,3GalNAc sequence, it is clear that the α2,3-linked sialic acid attached to the internal galactose of GD1a and GT1b (see Table I for structures) abolishes the ability of the sialyltransferase to transfer sialic acid to this sequence.
To confirm that Neu5Acα2,3Galβ1,3GalNAc is required for activity with STY, we analyzed several sialylated oligosaccharides at a concentration of 0.1 mM to test for their ability to act as STY acceptors. As displayed by Table II, the best oligosaccharide acceptor was Neu5Acα2,3Galβ1,3GalNAc followed by LSTa (Neu5Acα2,3Galβ1,3GlcNAc-lactose), to which STY transferred sialic acid at only approximately 3% of the rate of transfer to Neu5Acα2,3Galβ1,3GalNAc. As was the case with the glycoconjugate acceptors (Table I), the nonsialylated Galβ1,3GalNAc sequence is not an acceptor for STY. From the above data, the acceptor specificity of STY is restricted to sialylated glycoconjugates carrying the Neu5Acα2,3Galβ1,3GalNAc sialoside in which the GalNAc is either α-linked to serine or threonine or β-linked to galactose (GM1b).

FIG. 1. Comparison of sialylmotif L (SMY) and sialylmotif S of ST6GalNAc III with that of 11 previously cloned sialyltransferases.
The 11 previously cloned sialyltransferases are the rat ST6Gal I (11), the rat ST3Gal I (12), the mouse ST3Gal II (13), the rat ST3Gal III (14), the human ST3Gal IV (15,23), the chick ST6GalNAc I (16), the chick ST6GalNAc II (17), the rat ST6GalNAc III (this publication), the mouse ST8Sia I (18,22,24), the rat ST8Sia II (19), the mouse ST8Sia III (20), and the hamster ST8Sia IV (21). The sialyltransferase motifs are grouped by the linkage that they form. Consensus sequences were designated using the following rules: 1) 7 of 12 sialylmotifs must contain the indicated amino acid at a particular aligned site, 2) if only 2 amino acids are found at a particular aligned site, the designated amino acid(s) must be present in at least three sialylmotifs.
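The first consensus rule in the Fig. 1 legend (an amino acid is reported only if at least 7 of the 12 aligned sialylmotifs carry it at that site) can be expressed as a simple column-wise count. The sketch below implements only that first rule; the second rule is not implemented because its exact reading is not clear from the legend. The aligned sequences are invented placeholders, not the sialylmotif alignment, and the gap and unknown symbols are arbitrary choices.

```python
from collections import Counter


def consensus(aligned, min_count: int = 7, gap: str = "-", unknown: str = ".") -> str:
    """Column-wise consensus: report a residue only if at least min_count sequences share it.

    Implements only rule 1 of the figure legend (7 of 12 by default).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(residue for residue in column if residue != gap)
        if counts:
            residue, n = counts.most_common(1)[0]
            result.append(residue if n >= min_count else unknown)
        else:
            result.append(unknown)
    return "".join(result)


if __name__ == "__main__":
    # Twelve hypothetical aligned fragments, for illustration only.
    toy_alignment = ["CAVG"] * 6 + ["CSVG"] * 3 + ["CTLG"] * 3
    print(consensus(toy_alignment))
```

With a threshold of 7 out of 12, at most one residue can qualify per column, so no tie-breaking is needed for the default setting.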
Based on the acceptor specificity of STY and previously described sialyltransferase activities, we reasoned that STY formed either Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc, commonly found O-linked to glycoproteins as well as on the ganglioside GD1α, or possibly the sialoside Neu5Acα2,8Neu5Acα2,3Galβ1,3GalNAc found on the ganglioside GD1c (39). Differential sialidase and mild periodate experiments failed to elucidate the linkage formed by STY. To resolve this issue, the STY sialoside was extensively characterized by one- and two-dimensional 1H and 13C NMR experiments as described below.
One- and Two-Dimensional 1H and 13C NMR Analysis Establishes That STY Forms an α2,6-GalNAc Linkage-To unambiguously determine the structure of the sialyloligosaccharide generated by STY, we produced approximately 3 mg of purified disialylated STY tetrasaccharide utilizing Neu5Acα2,3Galβ1,3GalNAcβ1-benzyl as an acceptor and detergent extracts from 293 cells stably transfected with STY as an enzyme source. This was deemed a reasonable approach since 293 cells mock transfected with vector alone generate at most only 3% of the level of disialylated product formed by STY-transfected cells utilizing Neu5Acα2,3Galβ1,3GalNAcβ1-benzyl as an acceptor (data not shown). After purification, the STY disialylated tetrasaccharide was analyzed by a variety of NMR experiments as described below. The one-dimensional 1H NMR spectra of the acceptor monosialoside Neu5Acα2,3Galβ1,3GalNAcβ1,O-benzyl and its disialyl tetrasaccharide product, recorded at 600 MHz in D2O at pH 6.7, are shown in Fig. 3 (a and b). The signals on the tetrasaccharide trace that are indicative of the presence of a second sialic acid residue not present in the acceptor trisaccharide include the H-3ax (δ ~1.72), H-3eq (δ ~2.75), and NAc methyl (δ 2.04) signals characteristic for a Neu5Ac residue. They are more or less distinguishable from their counterparts attributable to the Neu5Acα2,3 residue already present in the trisaccharide. Also, the doublet signals of the Gal and GalNAc H-1 atoms (δ 4.45 and 4.52, respectively) and the Gal H-3 (δ 4.04) and GalNAc NAc methyl (δ 1.88) signals are readily identifiable. However, signals other than those of these structural reporter groups (54) cannot be identified merely on the basis of inspection of the one-dimensional 1H NMR spectra.
Complete assignment of the 1H spectra is nevertheless necessary to provide, through correlation to the 13C NMR spectra of the two saccharides, unequivocal information about the residue and position of attachment of the second Neu5Ac residue in the product tetrasaccharide.
... [14C]Neu5Ac added as a tracer. Sialic acid content of each acceptor was fixed at 0.1 mM except in the case of Galβ1,3GalNAc, which was assayed at 0.1 mM. Equivalent amounts of protein from 293 cells stably transfected with the pcDNA3 expression vector were assayed in parallel and subtracted from the values obtained with cells expressing ST6GalNAc III. Reaction mixtures were separated by anion exchange HPLC, and cpm eluting in the disialylated region was used to determine the relative activities. Assays performed with extracts from cells transfected with vector alone displayed incorporation rates ranging from 0 to 250 cpm depending on the acceptor. *, activity is most likely due to endogenous ST3Gal I activity present in 293 cells forming the acceptor sialoside for STY.

Our initial attempt to assign the 1H spectra of the tri- and tetrasaccharide involved tracing 3JHH′ scalar coupling connectivities by a two-dimensional COSY experiment in conjunction with one-dimensional selective TOCSY experiments (results not shown). Thus, for example, starting from the already assigned Neu5Ac H-3ax and H-3eq signals, those of the Neu5Ac H-4, H-5, and H-6 atoms were located (see Table III). Analogously, starting from the Gal H-1 and GalNAc H-1 signals, respectively, the resonances of H-2, H-3, and H-4 in each of these rings were identified. However, due to the small Neu5Ac ... charide) (see Table III). Having obtained the partial assignment of the 1H NMR spectra of the two saccharides, we then attempted to assign their 13C NMR spectra. Fig. 4 shows the pertinent portion of the two-dimensional 1H, 13C HSQC spectrum of the tetrasaccharide. An HSQC experiment as conducted here is a two-dimensional 1H, 13C correlation experiment that connects signals belonging to 1H and 13C nuclei that are directly (through one bond) attached to each other in the chemical structure.
First of all, such an HSQC spectrum provides the assignments of the 13C signals from already assigned 1H signals through one-on-one connectivities. Furthermore, HSQC spectra can provide the missing entries to complete the assignment of the 1H spectra, by revealing the C-6 (or C-9, in case of Neu5Ac) methylene protons. Not only are methylene carbons coupled to two protons each, it also happens that the carbon signals of carbohydrate CH2 groups in 13C NMR spectra (55 < δ < 70 ppm) are well separated from the CH signals (70 < δ < 110 ppm). Thus, the HSQC experiment provides a relatively convenient way of assigning the CH2 protons in the 1H spectra. The HSQC CH2 correlations are marked in Fig. 4. The spectrum of the acceptor trisaccharide shows three CH2 signals (at δ 63.40, 63.41, and 64.94), while that of the product tetrasaccharide shows four of them (δ 63.56, 65.08, 65.27, and 66.11). Their 1JCH coupled protons in the 1H spectra were found at the positions listed in Table III. To assign the CH2 signals to specific glycosyl residues, the 13C signals need to be "linked" to 1H signals already assigned by COSY, TOCSY, and/or ROESY experiments. Two-dimensional HMQC-TOCSY experiments were conducted to provide these links for two of the observed 13C CH2 signals, namely to the Gal and GalNAc H-5 signals (results not shown). The HMQC-TOCSY spectra show partial TOCSY spectra in 1H rows superimposed on 1H, 13C correlations. For example, the tetrasaccharide CH2 signal at δ 66.11 ppm showed an HMQC connectivity to two C-6 protons (at δ 3.66 and 3.98 ppm) that in turn are TOCSY-correlated to a proton signal at 3.76 ppm; the latter had been assigned to GalNAc H-5 by a one-dimensional ROESY experiment (see Table III). Analogously, the C-6 signals of Gal in the tri- and tetrasaccharide and C-6 of GalNAc in the trisaccharide were assigned. By default, the Neu5Ac C-9 signals were identified as the remaining CH2 signals in the 13C spectra.
These CH2 signals are displayed in Table IV. Thus, it became obvious that the crucial difference between the 13C NMR spectra of the tri- and tetrasaccharide was the chemical shift increment of Δδ ~2.7 ppm shown by the GalNAc C-6 signal, which is typical for glycosylation at that position. Additionally, the C-5 of the GalNAc unit is shielded by 1.5 ppm, which is characteristic of glycosylation at the C-6 position of GalNAc (69). No other 13C signals in the spectrum of the acceptor trisaccharide underwent a similar chemical shift increment (see Table III). It was therefore deduced that the newly introduced sialic acid residue in the tetrasaccharide is attached to the C-6 position of GalNAc. The product disialoside formed by incubation of the acceptor trisaccharide and CMP-Neu5Ac in the presence of STY was thus identified by NMR spectroscopy as Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAcβ1,O-benzyl.
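The shift comparison described above can be illustrated with a small arithmetic sketch. It uses only the 13C CH2 chemical shifts quoted in the text (the three trisaccharide signals and the tetrasaccharide signal at δ 66.11 assigned to GalNAc C-6). The per-residue assignments of the trisaccharide signals are given in the tables and are not reproduced here, so the sketch only checks which trisaccharide signals are numerically consistent with a downfield shift of about 2.7 ppm. The function names and the 0.1 ppm matching window are assumptions made for illustration and are not part of the reported analysis.

```python
# 13C chemical shifts (ppm) of the CH2 signals quoted in the text.
TRI_CH2 = [63.40, 63.41, 64.94]   # acceptor trisaccharide
TETRA_GALNAC_C6 = 66.11           # product tetrasaccharide, assigned to GalNAc C-6

EXPECTED_SHIFT = 2.7  # approximate glycosylation shift quoted in the text (ppm)
TOLERANCE = 0.1       # assumption: matching window chosen for illustration only


def downfield_shifts(product_ppm, acceptor_ppms):
    """Return (acceptor_ppm, delta) pairs with delta = product - acceptor,
    rounded to 0.01 ppm to avoid floating-point noise."""
    return [(a, round(product_ppm - a, 2)) for a in acceptor_ppms]


def consistent_candidates(product_ppm, acceptor_ppms,
                          expected=EXPECTED_SHIFT, tol=TOLERANCE):
    """Return the acceptor signals whose shift to the product signal
    lies within `tol` of the expected glycosylation shift."""
    return [a for a, delta in downfield_shifts(product_ppm, acceptor_ppms)
            if abs(delta - expected) <= tol]


if __name__ == "__main__":
    for acceptor, delta in downfield_shifts(TETRA_GALNAC_C6, TRI_CH2):
        print(f"{acceptor:.2f} ppm -> {TETRA_GALNAC_C6:.2f} ppm: {delta:+.2f} ppm")
    print("consistent with ~2.7 ppm:",
          consistent_candidates(TETRA_GALNAC_C6, TRI_CH2))
```

The two nearly coincident trisaccharide signals (δ 63.40 and 63.41) both fall within the window, which is why the arithmetic alone cannot identify the GalNAc C-6 signal and the HMQC-TOCSY link to GalNAc H-5 is needed to make the assignment.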
Since two other α2,6-GalNAc sialyltransferase cDNAs have been isolated, we have designated STY as ST6GalNAc III to conform with current sialyltransferase nomenclature.¹ The newly isolated sialyltransferase cDNA will be referred to as ST6GalNAc III throughout the remainder of the manuscript.
Expression of ST6GalNAc III in Adult and Newborn Tissues-In order to determine the pattern of expression and message size of the ST6GalNAc III gene, Northern blots with mRNA from adult and newborn rat tissues were probed with a 1.3-kb EcoRI fragment isolated from ST6GalNAc III-5. As shown in Fig. 5, the probe detects an mRNA band of approximately 3.0 kb. In adult tissues, expression of the mRNA is highest in spleen, followed by kidney and lung, while signals are not detectable for liver and skeletal muscle even when analyzed by PCR (data not shown). The gene is abundantly expressed in both newborn brain and kidney. These differential expression patterns are distinct from those observed for the other cloned sialyltransferases (30). Although the expression of ST6GalNAc III in adult brain is below the detectable limits of Northern analysis, the cDNA encoding ST6GalNAc III was isolated from an adult rat brain cDNA library constructed by reverse transcription of poly(A) RNA, so small amounts of this sialyltransferase mRNA must be present in this tissue. This attests to the extreme sensitivity of the PCR-based cloning strategy employed in the current studies.
DISCUSSION
With the isolation of ST6GalNAc III cDNA, 12 enzymatically distinct sialyltransferases have been isolated. Three of these, including ST6GalNAc III, encode N-acetylgalactosaminide α2,6-sialyltransferases (16,17). As summarized in Table V, each ST6GalNAc sialyltransferase utilizes the Neu5Acα2,3Galβ1,3GalNAcα1,O-Thr/Ser glycoconjugate as an acceptor. However, the acceptor specificity of ST6GalNAc III is considerably more restricted than that of ST6GalNAc I and II. While ST6GalNAc I and II utilize various asialo O-linked structures as acceptors, ST6GalNAc III displays an absolute requirement for the sialylated structure (16,17).
Indeed, the only acceptor for ST6GalNAc III identified in the current study is Neu5Acα2,3Galβ1,3GalNAc, both in its free oligosaccharide form and attached to glycoconjugates. The next best oligosaccharide acceptor, LSTa (see Table II for structure), has an incorporation rate of only 3% of that of Neu5Acα2,3Galβ1,3GalNAc, suggesting that transfer to glycoconjugates carrying the LSTa sequence is unlikely to be of physiological relevance. The acceptor specificity of ST6GalNAc III most closely resembles a sialyltransferase activity described in fetal liver by Bergh et al. (66) and in adult rat brain by Baubichon-Cortay et al. (67). Both of these tissues express an α2,6-GalNAc sialyltransferase activity that, like ST6GalNAc III, utilizes the Neu5Acα2,3Galβ1,3GalNAcα1,O-Thr/Ser sialoside of fetuin as an acceptor but not its asialo derivative.
The most striking enzymatic difference between ST6GalNAc III and other members of the N-acetylgalactosamine sialyltransferase subfamily is that ST6GalNAc III is the only sialyltransferase cloned to date capable of forming the developmentally regulated ganglioside G(D1α) from G(M1b). ST6GalNAc I and II do not utilize G(M1b) as an acceptor (17).³ Thus, ST6GalNAc I and II transfer sialic acid to α-linked GalNAc (GalNAcα1,O-Thr/Ser) but not to β-linked GalNAc such as is found in G(M1b) (see Table I for structure). In contrast, ST6GalNAc III does not discriminate between α- and β-linked GalNAc. Although common gangliosides such as G(D1a) and G(T1b) carry the Neu5Acα2,3Galβ1,3GalNAc moiety, unlike G(M1b) they are not acceptors for ST6GalNAc III. The only difference between G(D1a) and G(M1b) is a sialic acid residue linked to the internal galactose of G(D1a). Thus, this sialic acid residue abolishes the catalytic activity of ST6GalNAc III, perhaps by sterically hindering the access of the sialyltransferase to the C-6 position of GalNAc. It has recently been shown that an α2,6-GalNAc sialyltransferase activity exists in rat liver that utilizes G(D1a) and G(T1b) as acceptors, forming G(T1aα) and G(Q1bα), respectively (36). Since ST6-
³ S. Tsuji, personal communication.
(39). Previously, linkage analysis of sialosides generated by novel cDNAs relied on indirect methods such as sialidase treatments. In the current studies, we were unable to confidently characterize the sialoside generated by ST6GalNAc III with sialidase treatments alone,⁴ forcing us to synthesize and purify quantities of sialoside sufficient for NMR analysis. NMR analysis confirmed that the acceptor monosialoside utilized in these studies was Neu5Acα2,3Galβ1,3GalNAcβ1,O-benzyl. Furthermore, the product tetrasaccharide was identified by a combination of one-dimensional and two-dimensional 1H and 13C NMR experiments as Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAcβ1,O-benzyl.
Careful comparison of the HSQC spectra of the tri- and tetrasaccharide revealed that one of the CH2 signals in the 13C spectrum of the trisaccharide had undergone a chemical shift increment typical for glycosylation at that site. That CH2 group was attributed to GalNAc based on an HMQC-TOCSY experiment. Thus, the linkage position of the second Neu5Ac residue was identified, not (as would usually be the case) by an HMBC experiment (as Neu5Ac does not have an anomeric proton), but by a 13C-edited TOCSY experiment. Synthesis of quantities of ST6GalNAc III sialoside sufficient for these experiments was made possible by employing an expression vector in which the entire open reading frame of ST6GalNAc III was placed under control of the cytomegalovirus promoter. Stable transfection of this construct into 293 cells yielded remarkably high levels of ST6GalNAc III sialyltransferase activity that allowed for the enzymatic synthesis of milligram quantities of the ST6GalNAc III sialoside using only detergent lysates as an enzyme source. Indeed, levels of ST6GalNAc III were high enough to render endogenous sialidase and sialyltransferase activities insignificant relative to the recombinant sialyltransferase activity. This expression system may be of future utility for expression of sialyltransferases, particularly when relatively high levels of enzyme activity are required.
In certain instances, sialyltransferases share sequence identity outside the sialylmotifs, greatly enhancing the probability that they form identical linkages. This is apparent when the sequences of four α2,8-sialyltransferase cDNAs are compared with one another. Throughout their open reading frames, they share 28 to 60% identity, with the closest identity occurring between ST8Sia IV and ST8Sia II (19,21,42). Two distinct ST6GalNAc transferases (ST6GalNAc I and II) differing slightly in their substrate specificity share 32% sequence identity throughout their coding region. The identity increases to 48% when the sequences are compared from their respective sialylmotif L to the carboxyl terminus (17). However, it is difficult to predict the linkage that a novel sialyltransferase cDNA will form if no homology with previously characterized sialyltransferase cDNAs is observed outside of the sialylmotifs. For instance, outside of the sialylmotifs, ST6GalNAc III shares no amino acid identity with any previously isolated sialyltransferase cDNA, including ST6GalNAc I and II. Thus, even though ST6GalNAc I-III each form the Neu5Acα2,6-GalNAc linkage, it was impossible to predict the linkage formed by ST6GalNAc III from analysis of its primary sequence.
The spatial and temporal expression of ST6GalNAc III correlates well with that of G(D1α). The expression of G(D1α) is highest in embryonic brain, decreasing to low levels in adults (35). Of the other tissues examined, G(D1α) is found in spleen and lung. It is expressed on all T-cells and enriched in Th-1 cells, explaining its high expression levels in spleen (38). While ST6GalNAc III is abundantly expressed in newborn brain and kidney, a survey of adult tissues reveals that its expression is restricted to spleen, kidney, and, to a small extent, lung. Interestingly, ST6GalNAc III message was below the detectable limits of Northern analysis in adult brain. However, the ST6GalNAc III cDNA was isolated from an adult rat brain cDNA library; thus, minute levels of ST6GalNAc III message are present in this tissue, corresponding to the low levels of G(D1α) detected in adult brain tissue (35). Since ST6GalNAc III forms G(D1α) in vitro and the tissue-specific expression of the sialyltransferase and the ganglioside correlate well, it is likely that one function of ST6GalNAc III is to synthesize G(D1α) in vivo. Since ST6GalNAc III only utilizes the Neu5Acα2,3Galβ1,3GalNAc sequence as an acceptor, it must be co-expressed with ST3Gal I, ST3Gal II, or ST3Gal IV in the same cell to synthesize G(D1α) in vivo. Without such co-expression, the activity of ST6GalNAc III would be functionally null. G(D1α) has been implicated as a molecular component of a variety of important biological processes. These include metastasis of highly virulent lymphomas and motor learning as elaborated by Purkinje cells (37,35). In the future, it will be important to determine whether ST6GalNAc III is co-expressed with G(D1α) in particular cell types and, if so, to genetically manipulate ST6GalNAc III in different systems to ultimately determine the biological relevance of G(D1α).