Molecular Cloning of Brain-specific GD1α Synthase (ST6GalNAc V) Containing CAG/Glutamine Repeats*

A novel member of the mouse CMP-NeuAc: β-N-acetylgalactosaminide α2,6-sialyltransferase (ST6GalNAc) subfamily, designated ST6GalNAc V, was identified by BLAST analysis of expressed sequence tags. The sequence of the longest cDNA clone of ST6GalNAc V encoded a type II membrane protein with 8 amino acids comprising the cytoplasmic domain, 21 amino acids comprising the transmembrane region, and 306 amino acids comprising the catalytic domain. The predicted amino acid sequence showed homology to the previously cloned ST6GalNAc III and IV, with common amino acid sequences in sialyl motifs L and S among these three enzymes. Eleven CAG repeats were found in the stem region. A fusion protein with protein A and extracts from L cells transfected with ST6GalNAc V in an expression vector showed α2,6-sialyltransferase activity almost exclusively toward GM1b, and none toward glycoproteins. Sialidase treatment and thin layer chromatography immunostaining revealed that the product was GD1α. Northern blotting revealed that three transcripts of the gene were expressed specifically in brain tissues. It is concluded that this enzyme is involved in the synthesis of GD1α in nervous tissues, and the CAG repeats may have implications in neurodegenerative diseases.

Gangliosides are glycosphingolipids containing sialic acids in the carbohydrate moiety and have been considered critical to a wide variety of cellular events, such as cell-cell interaction, cell adhesion, mediation of invasion of vectors, and protein targeting (1,2). The sialic acids in particular are thought to be important to the biological functions of gangliosides. In all the ganglioside structures so far defined, four main linkages of sialic acids are present, i.e. α2,3galactose (Gal), α2,6Gal, α2,8sialic acid (Sia), and α2,6N-acetylgalactosamine (GalNAc).
To biosynthesize sialyl compounds containing one or more sialic acids with the linkages described above, a number of sialyltransferases are needed. To date, more than 15 sialyltransferase genes have been isolated (3,4): six genes for α2,3Gal (ST3Gal), one gene for α2,6Gal (ST6Gal), five genes for α2,8Sia (ST8Sia), and four genes for α2,6GalNAc (ST6GalNAc) have been cloned as sialyltransferase genes involved in the synthesis of sialylated carbohydrates on glycoproteins and glycolipids (4). Some of them act on both glycoproteins and glycolipids, while others utilize only one of the two as an acceptor. The expression pattern of these genes varies, i.e. the expression of some genes is restricted to certain tissues or cells or to specific stages of development. However, many sialyltransferase genes are expressed in a ubiquitous manner.
For the glycosyltransferase genes responsible for the synthesis of gangliosides, the majority of cDNAs have been isolated (5). Namely, sialyltransferases designated SAT I, SAT II, SAT III, SAT IV, and SAT V have been cloned and well characterized, although there is some ambiguity as to their identities and specificities (5). However, enzymes to extend the carbohydrate chains, or those to further modify the carbohydrate structures, have not been well characterized as their cDNAs are not available. To analyze the significance of the minor structures present in the ganglio-series gangliosides and regulatory mechanisms for the restricted and/or universal expression of those enzymes, molecular cloning of the genes is essential.
α-Series gangliosides were defined as a new series of gangliosides containing NeuAc linked to the C6 position of GalNAc of the gangliotetraosyl backbone (6,7). They have been thought to be only a minor component (8), and little is known about them. In contrast with O-glycans, α2,6-sialylated GalNAc structures are rarely detected in the carbohydrate moiety of glycosphingolipids. However, the expression of GD1α, a typical α-series ganglioside, was restricted to a particular region and a particular population in brain tissues (9), suggesting that the expression level of GD1α is fairly high in some regions.
In the present study, we have isolated a cDNA of the GD1α synthase (ST6GalNAc V) gene specifically expressed in the brain, which contains an interesting CAG repeat. Although several ST6GalNAc cDNAs that may synthesize GD1α have been reported (10,11), ST6GalNAc V is specific for GM1b in contrast with other members of the ST6GalNAc family. Moreover, ST6GalNAc V showed brain-specific expression, suggesting a critical role of ST6GalNAc V in the synthesis of GD1α in brain tissues.
Isolation of ST6GalNAc V cDNA-Mouse expressed sequence tags (GenBank™ accession numbers AU035329, AA462934, and AA968060) with similarity to mouse ST6GalNAc IV were identified using the tBLASTn algorithm against the dbEST data base at the National Center for Biotechnology Information. The expressed sequence tag cDNA clone (AU035329) with the longest 5′-region was obtained from the Japanese Collection of Research Bioresources. To isolate cDNA clones, the reverse transcription-polymerase chain reaction (RT-PCR) method using total RNA from mouse brain was performed. A sense primer containing a XhoI site, 5′-ACTCGAGCCCAAAATGAAGACCCT-3′ (nucleotides 188-204 in Fig. 1A), and an antisense primer containing a SpeI site, 5′-CACTAGTCAGAACACAGGCTTACCCT-3′ (nucleotides 1182-1200), were used for the PCR, which was carried out as follows: 94°C for 1 min, 25 cycles of (94°C for 1 min, 55°C for 1 min, and 72°C for 1 min), and 72°C for 1 min. The RT-PCR-amplified product (1010 base pairs) was subcloned into pCR®2.1-TOPO vector (Invitrogen, San Diego, CA). The nucleotide sequence was determined by the dideoxy termination method using an ABI PRISM 310 genetic analyzer (Applied Biosystems, Foster City, CA).
Construction of Expression Vector-An expression vector of the cloned cDNA was prepared by insertion of the subcloned cDNA fragment into the XhoI and SpeI sites of pMIKneo vector (kindly provided by Dr. K. Maruyama at Tokyo Medical and Dental University). To prepare a soluble fusion enzyme, a truncated form of ST6GalNAc V, lacking 34 amino acids from the NH2 terminus, was prepared by PCR using a 5′ primer containing an EcoRI site, 5′-TTGTTGGAATTCTACAGCAGCCTCGGCAGC-3′ (nucleotides 269-286), and a 3′ primer containing a XhoI site, 5′-CATGTTCTCGAGTCAGAACACAGGCTTACC-3′ (nucleotides 1184-1201), and the cloned cDNA fragment as a template. The product was digested with EcoRI and XhoI and subcloned into these sites of pCD SA vector (kindly provided by Dr. Tsuji, RIKEN Institute, Wako, Japan).
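As a quick sanity check on the four primers quoted above, the following minimal Python sketch confirms that each primer carries the restriction site the text attributes to it. The primer strings are copied from the text with line-break hyphens removed; the dictionary keys and function names are illustrative labels introduced here, not names used by the authors.

```python
# Recognition sequences of the three restriction enzymes named in the text.
SITES = {"XhoI": "CTCGAG", "SpeI": "ACTAGT", "EcoRI": "GAATTC"}

# Primers as quoted in the text (5'->3'); the keys are hypothetical labels.
PRIMERS = {
    "full_length_sense": ("ACTCGAGCCCAAAATGAAGACCCT", "XhoI"),
    "full_length_antisense": ("CACTAGTCAGAACACAGGCTTACCCT", "SpeI"),
    "soluble_5prime": ("TTGTTGGAATTCTACAGCAGCCTCGGCAGC", "EcoRI"),
    "soluble_3prime": ("CATGTTCTCGAGTCAGAACACAGGCTTACC", "XhoI"),
}


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


for name, (seq, enzyme) in PRIMERS.items():
    # A non-negative index means the expected site is present in the primer.
    print(name, enzyme, site_position(seq, enzyme))
```

All three recognition sequences are palindromic, so the same check applies whether a primer is read as the sense or the antisense strand.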
Preparation of Membrane Fraction-Mouse fibroblast L cells (provided by Dr. A. Albino at Memorial Sloan-Kettering Cancer Center, New York) were grown in Dulbecco's modified Eagle's medium (DMEM) supplemented with 7.5% fetal calf serum. L cells at 80% confluence were transfected by the DEAE-dextran method (17). After 48 h, the cells were collected and lysed in ice-cold phosphate-buffered saline containing 1 mM phenylmethylsulfonyl fluoride using a nitrogen cavitation apparatus as described previously (18). Nuclei were removed by low speed centrifugation, and the supernatant was centrifuged at 100,000 × g for 1 h at 4°C. The pellet was resuspended in ice-cold 100 mM sodium cacodylate buffer (pH 6.0) and used as an enzyme source.
Preparation of Soluble Forms of ST6GalNAc V-L cells were transfected with pCD SA-ST6GalNAc V by the DEAE-dextran method and cultured for 16 h in DMEM containing 7.5% fetal calf serum. The medium was replaced with DMEM containing ITS™ culture supplement (Becton Dickinson, Bedford, MA), and the cells were cultured for another 32 h. The culture medium was then collected, concentrated 100-fold, and dialyzed against 100 mM sodium cacodylate buffer (pH 6.0) as described previously (19).
Sialyltransferase Assay-The sialyltransferase assay was performed in a mixture containing 10 mM MgCl2, 0.3% Triton CF-54, 100 mM sodium cacodylate buffer (pH 6.0), 0.66 mM CMP-NeuAc (Sigma), 4400 dpm/μl CMP-[14C]NeuAc (Amersham Pharmacia Biotech), the enzyme solution, and substrates in a total volume of 50 μl for glycolipid acceptors, and 25 μl for glycoproteins. The reaction mixture was incubated at 37°C for 1 h. For glycolipid acceptors, the reaction was terminated by addition of 1 ml of water. The products were isolated using a C18 Sep-Pak cartridge (Waters, Milford, MA) and analyzed by thin layer chromatography (TLC) with a solvent system of chloroform/methanol/12 mM MgCl2 (50:40:10). High performance TLC plates (E. Merck, Darmstadt, Germany) were used. For glycoprotein acceptors, the reaction was terminated by the addition of 25 μl of SDS-polyacrylamide gel electrophoresis loading buffer and the mixtures were directly subjected to SDS-polyacrylamide gel electrophoresis. The radioactivity on each plate and gel was visualized with a BAS 2000 image analyzer (Fuji Film, Tokyo, Japan).
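The donor quantities in this assay lend themselves to a short worked calculation. The Python sketch below is illustrative only and rests on two assumptions that the text does not state: that 4400 dpm/µl refers to the final reaction mixture, and that 0.66 mM is the total CMP-NeuAc concentration (labeled plus unlabeled). The function names and the 10,000 dpm product reading are hypothetical.

```python
CMP_NEUAC_MM = 0.66       # donor concentration from the text (mM)
LABEL_DPM_PER_UL = 4400   # label density from the text (dpm per microliter)


def donor_nmol(volume_ul):
    """Total donor in the reaction; mM multiplied by microliters gives nmol."""
    return CMP_NEUAC_MM * volume_ul


def specific_activity_dpm_per_nmol():
    """dpm per nmol of donor; independent of volume under the assumptions."""
    return LABEL_DPM_PER_UL / CMP_NEUAC_MM


def nmol_transferred(product_dpm):
    """Convert radioactivity recovered in product into nmol of NeuAc."""
    return product_dpm / specific_activity_dpm_per_nmol()


print(donor_nmol(50))            # donor in a 50 microliter glycolipid reaction
print(donor_nmol(25))            # donor in a 25 microliter glycoprotein reaction
print(nmol_transferred(10_000))  # hypothetical product reading of 10,000 dpm
```

Under these assumptions a 50 µl reaction holds about 33 nmol of donor, so a product band carrying 10,000 dpm would correspond to roughly 1.5 nmol of transferred NeuAc.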
Exoglycosidase Digestion-One μg of GM1b was sialylated with a soluble form of ST6GalNAc V (ProtA-ST6GalNAc V). The products were purified with a C18 Sep-Pak cartridge, dried, and redissolved in 25 μl of 50 mM sodium citrate (pH 6.0) and 100 mM NaCl containing 100 μg/ml bovine serum albumin. Salmonella typhimurium LT2 sialidase (0.85 unit) (New England Biolabs, Beverly, MA) was added to the resultant products, which were then incubated overnight at 37°C. After purification with a C18 Sep-Pak cartridge, the digestion product was further treated with 7 milliunits of bovine testes β-galactosidase (Sigma) in 40 μl of 10 mM potassium acetate (pH 5.0) containing 0.2% sodium taurocholate and 1 mg/ml bovine serum albumin at 37°C for 48 h. For linkage analysis of sialic acids, 0.5 μg of the product was treated with one of the following linkage-specific sialidases: 0.85 unit of S. typhimurium LT2 sialidase (specific for α2,3-linked sialic acids and weakly active for α2,6 linkage), 0.85 unit of Clostridium perfringens sialidase (specific for α2,3- and α2,6-sialic acids, New England Biolabs), or 5 milliunits of Newcastle disease virus sialidase (specific for α2,3- and α2,8-linked sialic acids, Roche Molecular Biochemicals) (20). The enzyme reaction was performed at 37°C for 24 h.
TLC Immunostaining-Five μg of GM1b was sialylated with ProtA-ST6GalNAc V for 6 h, purified with a C18 Sep-Pak cartridge, dried, and subjected to TLC. TLC immunostaining was performed as described previously (21) according to the method of Taki et al. (22). In brief, the TLC plate was heat-blotted to a polyvinylidene difluoride membrane after chromatography of the glycolipids. The membrane was incubated with monoclonal antibody (mAb) KA-17 at a 1:100 dilution for 90 min, washed, and incubated with biotinylated horse anti-mouse IgG for 1 h. The antibody binding was revealed with ABC-PO (Vector, Burlingame, CA) and HRP-1000 (Konica, Tokyo, Japan) as described previously (23).
Northern Blot Analysis-mRNA was isolated from mouse tissues using an mRNA isolation kit (Miltenyi Biotec, Bergisch Gladbach, Germany) according to the manufacturer's instructions. Two μg of poly(A)+ RNA was separated on a 1.2% agarose, 2% formaldehyde gel, then transferred onto a GeneScreen Plus® membrane (DuPont). After baking, the filter was prehybridized for 2 h at 42°C in a solution consisting of 5 × SSPE (saline/sodium phosphate/EDTA), 50% formamide, 5 × Denhardt's solution, 1% SDS, and 10% dextran sulfate. Hybridization was carried out for 16 h at 42°C in the same solution containing 5 × 10^5 dpm/ml of the 32P-labeled probes. Alternatively, a mouse Multiple Choice™ Northern blot was obtained from OriGene Technologies Inc. (Rockville, MD) and hybridized according to the manufacturer's instructions. The filters were washed and then exposed to the imaging plate to be analyzed in a FUJIX BIO-Imaging Analyzer BAS 2000.

RESULTS
Isolation of ST6GalNAc V cDNA-Using the mouse expressed sequence tag data base, we found sequences (GenBank™ accession numbers AU035329, AA462934, and AA968060) with similarity to mouse ST6GalNAc IV and obtained an expressed sequence tag cDNA clone AU035329 from the Japanese Collection of Research Bioresources. Then a corresponding cDNA fragment was obtained by RT-PCR using total RNA from mouse brain. The nucleotide sequence revealed that the cDNA contains an open reading frame encoding a protein of 335 amino acids with a calculated molecular mass of 38,301 daltons, with two potential N-linked glycosylation sites (Fig. 1A). The initiation codon at the beginning of the open reading frame is embedded within a sequence similar to the Kozak consensus initiation sequence (24,25). Inspection and hydropathy analysis of the predicted protein sequence suggested that this enzyme molecule has the structural organization of a membrane protein with type II topology, which is commonly detected in glycosyltransferases. A single hydrophobic segment with 21 amino acids was present near the amino terminus. This putative signal anchor sequence would place 8 residues within the cytosolic compartment and 306 amino acids within the Golgi lumen as a catalytic domain (Fig. 1B). Comparison of the primary structure of the newly cloned sialyltransferase and the 16 other cloned sialyltransferases indicated that there is significant similarity in two regions, the L sialyl motif and S sialyl motif (Fig. 2). In particular, this new enzyme has several common amino acid residues specifically conserved among the three members of ST6GalNAc (ST6GalNAc III, IV, and this gene, as shown by gray boxes in Fig. 2), although there were few conserved residues commonly detected among all five ST6GalNAc members. Thus, this new gene was tentatively designated as ST6GalNAc V. These results suggested that ST6GalNAc III, IV, and V share similar functions such as substrate specificity.
The most characteristic finding in the primary structure of this gene product was the presence of CAG repeats at Gln 38 to Gln 48 (11 CAGs in total) located in the stem region (Fig. 1A).
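The domain arithmetic and the position of the glutamine stretch can be checked with a few lines of Python. This is a minimal sketch that assumes the three segments are contiguous and numbered from residue 1, which the text implies but does not spell out; the helper names and the toy coding sequence are hypothetical and do not reproduce the ST6GalNAc V sequence.

```python
def domain_ranges(cytoplasmic, transmembrane, lumenal):
    """1-based inclusive residue ranges for a type II membrane protein."""
    tm_start = cytoplasmic + 1
    lumen_start = cytoplasmic + transmembrane + 1
    return {
        "cytoplasmic": (1, cytoplasmic),
        "transmembrane": (tm_start, cytoplasmic + transmembrane),
        "lumenal": (lumen_start, cytoplasmic + transmembrane + lumenal),
    }


def longest_codon_run(cds, codon="CAG"):
    """Longest in-frame run of a codon, reading from the first base."""
    best = run = 0
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] == codon:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


ranges = domain_ranges(8, 21, 306)   # segment lengths from the text
polyq = (38, 48)                     # Gln 38 to Gln 48 from the text
polyq_len = polyq[1] - polyq[0] + 1

# Toy sequence (NOT the real cDNA): start codon, 11 CAG codons, stop codon.
toy_cds = "ATG" + "CAG" * 11 + "TAA"

print(ranges)
print(polyq_len, longest_codon_run(toy_cds))
```

With these numbers the transmembrane segment spans residues 9-29 and the lumenal portion residues 30-335, so the glutamine stretch begins nine residues past the membrane anchor, consistent with a location in the stem region.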
Sialyltransferase Activity of the Cloned cDNA Product-To analyze the sialyltransferase activity of ST6GalNAc V, the expression vector of the cloned cDNA, pMIKneo-ST6GalNAc V, was transfected into L cells, and the extracts were assayed for sialyltransferase activity using CMP-[14C]NeuAc as a donor. The enzyme sialylated GM1b almost exclusively, and no other asialo or sialosyl compounds were significantly utilized as an acceptor (data not shown). No activity was detected in the extracts prepared from mock-transfected cells. The apparent Km value for GM1b was 0.65 mM (data not shown).
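To put the apparent Km in context, the sketch below evaluates the fraction of maximal velocity expected at a given GM1b concentration. It assumes simple Michaelis-Menten kinetics, which the text does not demonstrate (the kinetic data are not shown); the function name is illustrative, and 0.1 mM is the acceptor concentration quoted in the legend to Fig. 3.

```python
KM_GM1B_MM = 0.65  # apparent Km for GM1b reported in the text (mM)


def fractional_velocity(substrate_mm, km_mm=KM_GM1B_MM):
    """v / Vmax = [S] / (Km + [S]) under Michaelis-Menten kinetics."""
    return substrate_mm / (km_mm + substrate_mm)


for conc in (0.1, 0.65, 2.0):
    # Print the acceptor concentration and the predicted fraction of Vmax.
    print(conc, round(fractional_velocity(conc), 3))
```

Under this assumption an acceptor concentration of 0.1 mM would support only about 13% of Vmax, so assays run at that concentration operate well below saturation.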
Substrate Specificity of Cloned ST6GalNAc V-To analyze the substrate specificities of ST6GalNAc V, a fusion gene consisting of the IgM signal peptide sequence, the protein A IgG binding domain, and the putative catalytic domain of ST6GalNAc V (residues 26-335) was constructed (ProtA-ST6GalNAc V) and transfected into L cells. Using secreted fusion enzyme in the supernatant, we analyzed the sialyltransferase activity for various glycolipids and glycoproteins. As shown in Fig. 3, no glycolipids except for GM1b showed significant acceptor activity. Fetuin, BSM, and their desialylated forms were also completely inactive as acceptors for ST6GalNAc V (Table I). The fact that GA1 was inactive indicates that a sialic acid linked to galactose at the non-reducing end by an α2,3 linkage was critical for the substrate activity.
TLC Immunostaining-To confirm that the enzyme product is GD1α, TLC immunostaining of the products using an anti-GD1α mAb KA-17 was performed. As shown in Fig. 5, a product with ProtA-ST6GalNAc V from GM1b was clearly stained as standard GD1α at the same migration site. GM1b itself was faintly stained as previously reported (9). None of the major gangliosides from bovine brain were stained, confirming the specificity of the mAb. Thus, the product was confirmed to be GD1α.
Expression of the ST6GalNAc V Gene-To determine the expression pattern and the size of the ST6GalNAc V mRNA, Northern blotting was performed. Among 11 tissues examined, only the sample from brain showed three bands, at 6.5, 3.0, and 2.3 kilobases (Fig. 6). Apart from brain, only the spleen sample showed a very faint band at 6.5 kilobases. Consequently, this ST6GalNAc V gene was expressed in brain tissues in a very restricted manner.

DISCUSSION
Four ST6GalNAc genes have been reported to date. Among these four, ST6GalNAc I (12) and ST6GalNAc II (13,14) were isolated as sialyltransferases which mainly utilize O-glycans as an acceptor. ST6GalNAc I acts toward GalNAc-Ser/Thr, and ST6GalNAc II acts on Galβ1,3GalNAc-Ser/Thr. ST6GalNAc III (10,11) and ST6GalNAc IV (11) showed a similar substrate specificity and preferred a terminal sialic acid with an α2,3 linkage on galactose as an acceptor structure. However, ST6GalNAc III better utilizes glycolipid acceptors, while ST6GalNAc IV preferred O-glycans as acceptors. Therefore, the ST6GalNAc V reported in this study is more similar to ST6GalNAc III and IV in terms of major substrate structure than to ST6GalNAc I and II. The amino acid sequence alignment of these five ST6GalNAc enzymes also demonstrated that ST6GalNAc V is closer to ST6GalNAc III and IV in primary structure (Fig. 2).
ST6GalNAc V is similar to ST6GalNAc III in terms of the nature of acceptor structures they prefer, i.e. NeuAcα2,3Galβ1,3GalNAc on glycolipids. However, their fine substrate specificities and expression patterns were fairly different. ST6GalNAc V does not utilize glycoproteins as an acceptor in contrast to ST6GalNAc III which acts on glycoproteins with significant efficiency. Furthermore, the expression of the ST6GalNAc III gene was not confined to brain tissues, i.e. it was also expressed in heart or lung etc. In contrast, ST6GalNAc V was almost specifically expressed in the brain as shown in Fig. 6. ST6GalNAc V might be the best candidate for GD1α synthase specifically expressed in brain tissues.
FIG. 3. Thin layer chromatography of sialylated glycosphingolipids. Various glycosphingolipids (0.1 mM) were used as acceptors for ST6GalNAc V, and the products were separated on a TLC plate with a solvent system of chloroform/methanol/12 mM MgCl2 (50:40:10). The plate was exposed to a BAS imaging plate and then analyzed with a BAS 2000 radioimage analyzer.
CAG (glutamine) repeats have been detected in a number of proteins which are relevant to inherited neurodegenerative diseases (26) such as Huntington disease (27) and Machado-Joseph disease (28). In patients with these diseases, the number of CAG repeats is more than 40, and the abnormal proteins aggregate and form intranuclear inclusions, resulting in neuronal apoptosis (27,29). Similar findings were demonstrated in transgenic mice of the human Huntington's disease gene carrying CAG repeat expansions (30). The number of CAG repeats in ST6GalNAc V is 11. Therefore, it may not be directly involved in the pathogenesis of neurodegenerative diseases. L cells transfected with the expression vector of ST6GalNAc V actually behaved as untransfected cells. However, it is possible that there are expansions of CAG repeats in the human ST6GalNAc V gene in some neurodegenerative diseases, being involved in the pathogenesis.
Screening for mutations of this gene in a large population of patients with neurological disorders is needed to clarify this point. In any case, the fact that a brain-specific sialyltransferase has a CAG repeat structure is very interesting and may imply that this protein has functions other than sialyltransferase activity.
GD1α was reported to be a minor ganglioside in bovine brain tissues (8). It was demonstrated to be accumulated in the proximal dendrites and cell bodies of Purkinje cells in murine cerebellum using a specific mAb (9). It was also detected in macrophages (31) and mammary glands during lactation (32). Furthermore, it was reported that GD1α is a functional molecule on mouse lymphoma cells (33), playing important roles in tumor cell metastasis as an adhesion molecule. Therefore, there may be several tissue-specific ST6GalNAc members capable of synthesizing GD1α, and ST6GalNAc V might be a brain-specific isotype responsible for the synthesis of α-series gangliosides in nervous tissues. These results, in addition to the characteristics of GD1α synthase as predicted from the cDNA cloned here, suggest that GD1α is a critical molecule in the communication and interaction between neuronal cells and their supportive cells, particularly in brain tissues. The availability of the GD1α synthase (ST6GalNAc V) gene would enable us to clearly demonstrate the roles of GD1α in neuronal development and in tumor metastasis.
Most studies on the α-series of gangliosides have been done with animal tissues or cells. That no human studies on the expression of GD1α have been conducted to date suggests either that α-series gangliosides are minor components of human tissues and cells or merely that no rigorous investigation of the presence of α-series gangliosides in humans has been done, if in fact they exist. Either way, the use of ST6GalNAc V should enable us to clearly investigate the presence and significance of α-series gangliosides in the human body, especially in the nervous system.
NeuAc-labeled sialyl-GM1b using ProtA-ST6GalNAc V. As a control, the same reaction was performed without the enzyme (lane 2). Lane 4 was acidic glycosphingolipids (2.5 μg) extracted from bovine brain (B.B.) containing GM1, GD1a, GD1b, and GT1b as major components. Glycosphingolipids were separated by TLC and blotted onto a PVDF membrane. Immunostaining was done using mAb KA-17 to detect GD1α.
FIG. 6. Restricted expression of ST6GalNAc V gene in mouse tissues. Northern blots with poly(A)+ RNA from various adult mouse tissues were probed with a mouse ST6GalNAc V cDNA fragment (nucleotides 269-1201 in Fig. 1A) as described under "Experimental Procedures." The same filters were probed with glyceraldehyde-phosphate dehydrogenase (GAPDH) cDNA after removing the radioactivity. The sizes of ST6GalNAc V transcripts are indicated at the left.