Molecular Cloning and Expression of a Human Chondroitin Synthase*

We have identified a human chondroitin synthase from the HUGE (human unidentified gene-encoded large proteins) protein data base by screening with two keywords: “one transmembrane domain” and “galactosyltransferase family.” The identified protein consists of 802 amino acids with a type II transmembrane protein topology. The protein showed weak homology to the β1,3-galactosyltransferase family on the amino-terminal side and to the β1,4-galactosyltransferase family on the carboxyl-terminal side. The expression of a soluble recombinant form of the protein in COS-1 cells produced an active enzyme, which transferred not only glucuronic acid (GlcUA) from UDP-[14C]GlcUA but also N-acetylgalactosamine (GalNAc) from UDP-[3H]GalNAc to the polymer chondroitin. Identification of the reaction products demonstrated that the enzyme was chondroitin synthase, with both β1,3-GlcUA transferase and β1,4-GalNAc transferase activities. The coding region of the chondroitin synthase was divided into three discrete exons and localized to chromosome 15. Northern blot analysis revealed that the chondroitin synthase gene exhibited ubiquitous but markedly differential expression in the human tissues examined. Thus, we demonstrated that, analogous to human heparan sulfate polymerases, the single polypeptide chondroitin synthase possesses the two glycosyltransferase activities required for chain polymerization.

Chondroitin sulfate belongs to the glycosaminoglycan family and occurs as proteoglycans that are expressed on the surface of most cells and in extracellular matrices. The chains are covalently linked to a wide range of core protein families. The glycosaminoglycan chains are increasingly implicated as important regulators of many biological processes, such as cell proliferation and recognition, extracellular matrix deposition, and morphogenesis (for reviews see Refs. 1-3). Recently, chondroitin sulfate and heparan sulfate have attracted much attention because they play an important role in neural network formation in the developing mammalian brain (for reviews see Refs. 4 and 5). Although a number of sulfotransferases involved in chondroitin sulfate synthesis as well as glycosyl- and sulfotransferases required for heparan sulfate synthesis have been molecularly cloned (for a review see Ref. 6), glycosyltransferases that are responsible for chondroitin sulfate synthesis have not been cloned except for a bacterial chondroitin synthase (7). This lack of cloning has hampered vigorous investigation into the biological functions of this essential molecular species in mammalian systems.
Sulfated glycosaminoglycans including chondroitin sulfate and heparin/heparan sulfate have a linear polymer structure that is composed of repeating disaccharide units of glucuronic acid (GlcUA) and N-acetylgalactosamine (GalNAc) or of GlcUA and GlcNAc, respectively. Both chains are covalently attached to serine residues in core proteins via a unique tetrasaccharide structure, GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1 (for reviews see Refs. 8 and 9). In glycosaminoglycan biosynthesis, the disaccharide units are formed by alternate monosaccharide addition from the corresponding UDP sugars to the nonreducing end of the elongating chain (for reviews see Refs. 9 and 10). Glycosyltransferases involved in the biosynthesis of the repeating disaccharide region of heparin/heparan sulfate have been purified from bovine serum, and cDNA cloning has demonstrated that a single polypeptide, designated heparan sulfate polymerase, catalyzes both the GlcNAc and GlcUA transferase reactions and that it is encoded by a member of the hereditary multiple exostoses gene family of tumor suppressors (11). In contrast, although GlcUA transferase II (GlcAT-II) and GalNAc transferase II (GalNAcT-II), which are involved in the biosynthesis of chondroitin sulfate disaccharide units, have been extensively purified from chick cartilage (12) and fetal bovine serum (13), cDNA cloning has not been achieved because of the difficulty in purifying the enzymes to homogeneity.

Materials-UDP-[U-
In Silico Cloning of a Novel Human Glycosyltransferase cDNA-Screening of the HUGE protein data base (www.kazusa.or.jp/huge/) at the Kazusa DNA Research Institute (Chiba, Japan) with two keywords, "one transmembrane domain" and "galactosyltransferase family," revealed a clone (ID KIAA0990; GenBank™ accession number AB023207) that was then obtained from the Kazusa DNA Research Institute. Analysis of this clone revealed a single open reading frame with weak sequence similarity to human core 1 UDP-Gal:GalNAcα-R β1,3-Gal transferase (GenBank™ accession number AF155582) on the amino-terminal side and to human UDP-Gal:GlcNAcβ-R β1,4-Gal transferase II (GenBank™ accession number AB024434) on the carboxyl-terminal side. In addition, a data base search of the Human Genome Project, which recently became available, showed a genome sequence (accession number NT_010274.3) identical to the cDNA sequence. Comparison between the cDNA and the genome sequence revealed the genomic organization of this novel glycosyltransferase gene.
Construction of a Soluble Form of the Novel Glycosyltransferase-The cDNA fragment encoding a truncated form of the novel glycosyltransferase, lacking the first 46 amino-terminal amino acids, was amplified by polymerase chain reaction with the KIAA0990 cDNA as a template using a 5′-primer (5′-CCCTCGAGGGGCTGCCGGTCCGGGC-3′) containing an XhoI site and a 3′-primer (5′-CCCTCGAGCAATCTTAAAGGAGTCCTATGTA-3′) containing an XhoI site located 138 base pairs downstream of the stop codon. Polymerase chain reactions were carried out with Pfu polymerase (Stratagene, La Jolla, CA) for 34 cycles at 94°C for 30 s, 55°C for 30 s, and 72°C for 180 s in 5% (v/v) dimethyl sulfoxide. The polymerase chain reaction fragments were digested with XhoI, and both ends of the fragments were partially filled in with Klenow fragment (New England Biolabs, Beverly, MA) in the presence of dCTP and dTTP. The pGIR201protA (21) vector digested with BamHI was also partially filled in with dATP and dGTP. The resultant fragment was subcloned into pGIR201protA, resulting in fusion of the novel glycosyltransferase with the insulin signal sequence and the protein A sequence present in the vector. An NheI fragment containing the above fusion protein sequence was inserted into the XbaI site of the expression vector pEF-BOS (22).
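The partial fill-in strategy described above can be checked with a short sketch. It assumes the standard cut sites of XhoI (C^TCGAG, 5′ overhang TCGA) and BamHI (G^GATCC, 5′ overhang GATC); the function names are illustrative and are not part of the original procedure. The sketch extends each recessed 3′ end only as far as the supplied dNTPs allow and then tests whether the shortened overhangs can anneal.

```python
# Sketch: partial Klenow fill-in of 5' overhangs and a compatibility check.
# Function names are hypothetical; cut-site overhangs are the standard ones
# for XhoI (TCGA) and BamHI (GATC).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def partial_fill(overhang_5p, dntps):
    """Return the 5' overhang (written 5'->3') left after partial fill-in.

    The recessed 3' end copies the overhang starting at the template base
    nearest the duplex (the last base of the string) and stops as soon as
    the required dNTP is not among those supplied.
    """
    remaining = overhang_5p
    while remaining and COMPLEMENT[remaining[-1]] in dntps:
        remaining = remaining[:-1]
    return remaining


def compatible(overhang_a, overhang_b):
    """Two 5' overhangs anneal if one is the reverse complement of the other."""
    return overhang_a == revcomp(overhang_b)


# Both primers from the text should carry an XhoI recognition site.
PRIMERS = [
    "CCCTCGAGGGGCTGCCGGTCCGGGC",
    "CCCTCGAGCAATCTTAAAGGAGTCCTATGTA",
]
XHOI_SITE = "CTCGAG"

insert_end = partial_fill("TCGA", {"C", "T"})   # XhoI end, dCTP + dTTP
vector_end = partial_fill("GATC", {"A", "G"})   # BamHI end, dATP + dGTP

if __name__ == "__main__":
    print(all(XHOI_SITE in p for p in PRIMERS))
    print(insert_end, vector_end, compatible(insert_end, vector_end))
```

Under these assumptions the insert retains a 5′-TC overhang and the vector a 5′-GA overhang, which are complementary to each other, whereas neither end remains self-complementary. That is the usual reason for choosing a partial fill-in over direct ligation of the original overhangs.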
Identification of the Enzyme Reaction Products-The isolation of each product from the GalNAc transferase reaction or the GlcUA transferase reaction using polymer chondroitin as an acceptor was carried out by gel filtration on a Superdex Peptide column equilibrated with 0.25 M NH4HCO3, 7% 1-propanol. The radioactive peak containing each enzyme reaction product was pooled and evaporated to dryness. To assess its digestibility, the isolated GalNAc transferase reaction product (about 120 μg) was digested with 100 mIU of chondroitinase AC-II in a total volume of 30 μl of 50 mM sodium acetate buffer, pH 6.0, at 37°C overnight. The isolated GlcUA transferase reaction product (about 180 μg) was digested with 100 mIU of chondroitinase AC-II in a total volume of 30 μl of 50 mM sodium acetate buffer, pH 6.0, or with 22 mIU of β-glucuronidase in a total volume of 30 μl of 0.05 M sodium citrate buffer, pH 4.5, at 37°C overnight. Each enzyme digest was analyzed using the same Superdex Peptide column as described above.
Northern Blot Analysis-A commercial human 12-lane multiple tissue Northern blot (CLONTECH) membrane was used for the analysis. One μg of polyadenylated RNA was loaded in each lane. The membrane was probed with a gel-purified, radiolabeled (>1 × 10^9 cpm/μg), 0.84-kb chondroitin synthase-specific fragment corresponding to nucleotides 631-1469 of the KIAA0990 cDNA (GenBank™ accession number AB023207).
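As a quick consistency check on the probe coordinates, the inclusive span of nucleotides 631-1469 can be computed directly and compared with the stated 0.84 kb. The helper name below is illustrative only.

```python
# Sketch: length of a probe given inclusive 1-based start and end coordinates.
# The function name is hypothetical; the coordinates come from the text.

def span_length(start, end):
    """Number of nucleotides from start to end, counting both endpoints."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


probe_nt = span_length(631, 1469)
probe_kb = round(probe_nt / 1000, 2)

if __name__ == "__main__":
    print(probe_nt, probe_kb)  # 839 0.84
```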

RESULTS
In Silico Cloning of a Putative Human Chondroitin Synthase cDNA-Two keywords, "one transmembrane domain" and "galactosyltransferase family," were used to screen the HUGE protein data base, because a type II transmembrane protein topology is characteristic of many other glycosyltransferases cloned to date, and GalNAcT-II is a β1,4-GalNAc transferase that might belong to the β1,4-Gal transferase family. HUGE data base screening identified a clone (KIAA0990) containing a 5′-untranslated region of 494 base pairs, a single open reading frame of 2406 base pairs coding for a protein of 802 amino acids with three potential N-glycosylation sites (Fig. 1), and a 3′-untranslated region of 1.7 kb with a presumptive polyadenylation signal. Northern blot analysis indicated that the mRNA was about 5.0 kb in length in various human tissues (see below), suggesting that the cDNA was approximately full length. The deduced amino acid sequence corresponded to a 91,728-Da polypeptide. The predicted translation initiation site conformed to the Kozak consensus sequence for initiation (29), and an in-frame stop codon was present upstream of the assigned initiating ATG codon. A Kyte-Doolittle hydropathy analysis (30) revealed one prominent hydrophobic segment of 17 amino acid residues in the NH2-terminal region, predicting that the protein has a type II transmembrane topology characteristic of many Golgi-localized glycosyltransferases cloned to date (Fig. 1). Data base searches indicated that the amino acid sequence displayed weak sequence similarity to human core 1 UDP-Gal:GalNAcα-R β1,3-Gal transferase (GenBank™ accession number AF155582) on the amino-terminal side and to human UDP-Gal:GlcNAcβ-R β1,4-Gal transferase II (GenBank™ accession number AB024434) on the carboxyl-terminal side.
A characteristic feature of homologous glycosyltransferase genes is that different members may encode enzymes that have different donor or acceptor sugar specificities, but the nature of the sugar linkage formed is often retained (31). Thus, the features of the identified protein sequence suggest that the identified gene product might possess both β1,3-GlcUA transferase (GlcAT-II) and β1,4-GalNAc transferase (GalNAcT-II) activities. In addition, homologues of the identified human gene were found in the Caenorhabditis elegans and Drosophila genomes. The sequences and alignment of the hypothetical proteins from human, C. elegans, and Drosophila are shown in Fig. 1. The human sequence shared 36 and 42% identity with those of C. elegans and Drosophila, respectively. Each of the three proteins contained DDD on the amino-terminal side and DVD on the carboxyl-terminal side (Fig. 1), both of which appear to correspond to the conserved DXD motif found in most glycosyltransferases (32).
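The hydropathy analysis mentioned above can be illustrated with a minimal sliding-window sketch using the published Kyte-Doolittle index values. The window of 17 residues is an assumption chosen to match the reported segment length, the function names are illustrative, and the test sequence is a made-up toy peptide, not the KIAA0990 sequence.

```python
# Sketch: sliding-window Kyte-Doolittle hydropathy profile.
# TOY_SEQ is a hypothetical peptide (hydrophilic ends, hydrophobic core),
# not the protein described in the text.

KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=17):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(seq, window=17):
    """Return (0-based start, mean score) of the most hydrophobic window."""
    profile = hydropathy_profile(seq, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


TOY_SEQ = "MKRSDE" + "LLVLLALIVGLLLAFVL" + "RSDQNGKTEDSPRKQENDSGKT"

if __name__ == "__main__":
    start, score = most_hydrophobic_window(TOY_SEQ)
    print(start, TOY_SEQ[start:start + 17], round(score, 2))
```

A single strongly positive window near the amino terminus, with no comparable peak elsewhere, is the kind of profile that leads to a type II transmembrane prediction. A real analysis would be run on the full deduced sequence.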
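Because both DDD and DVD are instances of the DXD pattern (aspartate, any residue, aspartate), such motifs can be located with a simple pattern search. The sketch below uses Python's standard re module with a lookahead so that overlapping hits are not missed. The function name and the toy sequence are hypothetical and do not come from the alignment in Fig. 1.

```python
# Sketch: locate DXD motifs (Asp, any residue, Asp) in a protein sequence.
# TOY_PROTEIN is an invented example, not one of the aligned sequences.
import re

DXD_PATTERN = re.compile(r"(?=(D[A-Z]D))")  # lookahead allows overlapping hits


def find_dxd(seq):
    """Return (1-based position, motif) for every DXD occurrence."""
    return [(m.start() + 1, m.group(1)) for m in DXD_PATTERN.finditer(seq)]


TOY_PROTEIN = "MKTDDDLLPSGRDVDWA"

if __name__ == "__main__":
    print(find_dxd(TOY_PROTEIN))  # [(4, 'DDD'), (13, 'DVD')]
```

A pattern hit alone does not establish function, since a three-residue motif occurs frequently by chance. Conservation of the motif at aligned positions across species, as described above, is the stronger evidence.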
Genomic Organization and Chromosomal Localization-Comparison of the identified cDNA sequence with the genome sequence deposited in the Human Genome Project Data Base revealed the genomic structure and chromosomal localization of the gene. The gene spans over 40 kb, and the coding region of the gene was divided into three discrete exons as shown in Fig. 2. The intron/exon junctions followed the GT/AG rule (33) and were flanked by conserved sequences (data not shown). This gene is located on human chromosome 15.
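The GT/AG rule referred to above states that introns typically begin with the dinucleotide GT and end with AG. A minimal check is sketched below. The function name and the example intron strings are invented for illustration and are not taken from the genomic sequence.

```python
# Sketch: test whether candidate intron sequences obey the GT/AG rule.
# The example sequences are hypothetical.

def follows_gt_ag(intron):
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


TOY_INTRONS = {
    "canonical": "gtaagtctttcag",
    "bad_donor": "gcaagtctttcag",
    "bad_acceptor": "gtaagtctttcac",
}

if __name__ == "__main__":
    for name, seq in TOY_INTRONS.items():
        print(name, follows_gt_ag(seq))
```

With a coding region split into three exons, two such introns would be expected within the coding region, and each would be checked in the same way after aligning the cDNA to the genomic sequence.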
Expression of a Soluble Form of the Putative Glycosyltransferase and Characterization as Chondroitin Synthase-To facilitate the functional analysis of the putative glycosyltransferase, a soluble form of the protein was generated by replacing the first 46 amino acids of the putative glycosyltransferase with a cleavable insulin signal sequence and a protein A IgG-binding domain as described under "Experimental Procedures," and then the soluble putative glycosyltransferase was expressed in COS-1 cells as a recombinant enzyme fused with the protein A IgG-binding domain. The fused enzyme expressed in the medium was adsorbed onto IgG-Sepharose beads to eliminate endogenous glycosyltransferases, and then the enzyme-bound beads were used as an enzyme source. The bound fusion protein was assayed for glycosyltransferase activity using a variety of acceptor and donor substrates. As shown in Table I, activity was detected using polymer chondroitin with either UDP-GlcUA or UDP-GalNAc and using chondro-oligosaccharides with UDP-GalNAc. In addition, weak activity was detected using chondroitin sulfate A with UDP-GlcUA and using chondroitin sulfate C with UDP-GalNAc (Table II). In contrast, no other activities were detected using other acceptor substrates with either UDP-GlcUA, UDP-GalNAc, UDP-GlcNAc, or UDP-Gal as a donor substrate. These included GlcAT-I, GalNAc transferase I involved in the initiation of chondroitin sulfate biosynthesis, hyaluronan synthase, core 1 UDP-Gal:GalNAcα-R β1,3-Gal transferase, and UDP-Gal:GlcNAcβ-R β1,4-Gal transferase activities. No detectable glycosyltransferase activity was recovered by the affinity purification from a control pEF-BOS transfection sample. These findings clearly indicate that the expressed protein is a GlcUA/GalNAc transferase with marked specificity for polymer chondroitin and chondro-oligosaccharides.
To identify the GlcUA transferase reaction products, polymer chondroitin was labeled with [14C]GlcUA using the enzyme-bound beads as an enzyme source, and the products were isolated and then subjected to gel filtration analysis after β-glucuronidase or chondroitinase AC-II treatment. As shown in Fig. 3A, the labeled products were completely digested by β-glucuronidase or chondroitinase AC-II, quantitatively yielding a 14C-labeled …
To identify the GalNAc transferase reaction products, polymer chondroitin was labeled with [3H]GalNAc using the enzyme-bound beads as an enzyme source, and the products were isolated and then subjected to gel filtration analysis after chondroitinase AC-II treatment. As shown in Fig. 3B, the labeled products were completely digested by chondroitinase AC-II …
FIG. 3. Identification of the putative human glycosyltransferase reaction products. A, the GlcUA transferase reaction products recovered from a Superdex Peptide column were subjected to digestion with chondroitinase AC-II or β-glucuronidase as described under "Experimental Procedures." The undigested sample (filled squares), the chondroitinase AC-II digest (filled circles), or the β-glucuronidase digest (filled triangles) was applied to a column of Superdex Peptide, and the respective effluent fractions (0.4 ml each) were analyzed for radioactivity. Arrows indicate the elution positions of the authentic saturated disaccharide (arrow 1, GlcUAβ1-3GalNAc) or free GlcUA (arrow 2, [14C]GlcUA). B, the GalNAc transferase reaction products recovered from a Superdex Peptide column were subjected to digestion with chondroitinase AC-II as described under "Experimental Procedures." The undigested sample (filled squares) or the chondroitinase AC-II digest (filled circles) was applied to a column of Superdex Peptide, and the respective effluent fractions (0.4 ml each) were analyzed for radioactivity. Arrows indicate the elution positions of the authentic saturated disaccharide (arrow 1, GlcUAβ1-3GalNAc) or free GalNAc (arrow 2, [3H]GalNAc).
Acceptor                                      Donor         Activity(a)
(GlcUAβ1-3GalNAc)3                            UDP-GalNAc    0.4
(GlcUAβ1-3GalNAc)4                            UDP-GalNAc    0.5
GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser            UDP-GalNAc    ND
α-Thrombomodulin(b)                           UDP-GalNAc    ND
(GlcUAβ1-3GlcNAc)5                            UDP-GlcNAc    ND
Asialo-ovine submaxillary mucin(c)            UDP-Gal       ND
GlcNAcβ1-3Galβ1-4GlcNAcβ1-3Galβ1-4GlcNAc      UDP-Gal       ND
(a) The values represent the averages of two independent experiments.
(b) α-Thrombomodulin contains a tetrasaccharide linkage GlcUAβ1-3Galβ1-3Galβ1-4Xyl (16).
… (Fig. 4). The chondroitin synthase gene exhibited ubiquitous but differential expression in the human tissues examined. Notably, expression was particularly abundant in placenta, followed by spleen, lung, and peripheral blood leukocytes. These findings are in accordance with the observations that chondroitin sulfate proteoglycans are distributed on the surfaces of most cells and in the extracellular matrices in virtually every tissue.

DISCUSSION
We have identified a human chondroitin synthase and demonstrated that the single polypeptide possesses both GlcAT-II and GalNAcT-II activities. This feature is similar to that of heparan sulfate polymerases, which have both GlcUA transferase and GlcNAc transferase activities (11, 34). Recently, a chondroitin synthase with both GlcUA transferase and GalNAc transferase activities was molecularly cloned from bacteria, which produce chondroitin or chondroitin-like capsular polysaccharides (7). Notably, however, the bacterial enzyme, composed of 965 amino acid residues, shows no homology with the human chondroitin synthase described here. In fact, a homologue of the human chondroitin synthase is absent in prokaryotes and yeast but present in higher eukaryotic species such as C. elegans and Drosophila (Fig. 1), consistent with the findings that chondroitin and chondroitin 4-sulfate are found in C. elegans and Drosophila, respectively (35-37).
In addition, although the bacterial enzyme is 87% identical to the bacterial hyaluronan synthase at the nucleotide and the amino acid sequence levels, the human chondroitin synthase shows no homology with the three cloned human hyaluronan synthases. This seeming discrepancy is not unexpected, considering that the bacterial chondroitin synthase appears to be associated with plasma membranes in view of the production of capsular polysaccharides lacking core proteins, whereas vertebrate enzyme(s) are Golgi resident(s) and vertebrate chondroitin sulfate chains are covalently attached to core proteins. In view of the fact that the chondroitin-synthesizing enzyme derived from bovine serum appears to have a molecular mass of 160 kDa as estimated by gel filtration (13), the cloned enzyme may form a dimer, as has been observed for other Golgi glycosyltransferases (38, 39). In this context, it is noteworthy that the GlcAT-II purified from chick cartilage microsomes has been reported to have a molecular mass of 80 kDa as determined by SDS-polyacrylamide gel electrophoresis (12). It will be interesting to evaluate whether it has GalNAcT-II activity as well. The cDNA probe for the human chondroitin synthase has now become available and will be a useful tool for investigating the biological functions of chondroitin sulfate, which is ubiquitously expressed and plays an indispensable role in many tissues, particularly the brain. It will also help in investigating the possible association of the gene with any disease pathology.
It is presently unknown why the GlcAT-II and GalNAcT-II activities detected in the assays were low with polymer chondroitin and chondro-oligosaccharides as acceptor substrates (Table I). Incubation of polymer chondroitin with the recombinant chondroitin synthase in the presence of both UDP-GlcUA and UDP-GalNAc did not augment the chondroitin synthase activity (data not shown). Rather, it is possible that the recombinant chondroitin synthase was not secreted well into the culture medium, because the transfectant homogenates showed higher chondroitin synthase activity than the medium. More efforts to address this issue will be required. Besides, because the GlcUA or GalNAc incorporation into polymer chondroitin was proportional to the incubation time for at least 8 h (data not shown), the difference between the GlcUA and GalNAc transfer to polymer chondroitin (Table I) …
Characterization of the acceptor substrate specificity of the human recombinant chondroitin synthase revealed that the enzyme showed marked specificity for polymer chondroitin and chondro-oligosaccharides (Tables I and II). Neither a trisaccharide linkage, Galβ1-3Galβ1-4Xylβ1, nor α-thrombomodulin containing a tetrasaccharide linkage, GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1, on the native core protein (16) was used as an acceptor substrate. These findings demonstrate that the transfer of GlcUA to polymer chondroitin is mediated by GlcAT-II, distinct from the GlcAT-I that was previously cloned and is involved in the formation of the tetrasaccharide linkage GlcUAβ1-3Galβ1-3Galβ1-4Xyl (24). Additionally, the transfer of GalNAc to polymer chondroitin is mediated by GalNAcT-II, distinct from GalNAc transferase I involved in the initiation of chondroitin sulfate biosynthesis (40). Hence, at least one other gene, encoding GalNAc transferase I, likely exists in the human genome.
It should be noted, however, that a single GalNAc transferase has been proposed to catalyze transferase reactions not only for chain elongation but also for chain initiation of chondroitin sulfate, based on the soluble enzyme system derived from serum-free culture medium of a human melanoma cell line (41). Both reactions are greatly stimulated by the α-thrombomodulin core protein, which appears to be recognized by the putative chondroitin synthase. In contrast, the chondroitin sulfate-synthesizing enzyme system that was highly purified from bovine serum had GalNAcT-II and GlcAT-II activities but no GalNAc transferase I activity, even when α-thrombomodulin was used as an acceptor (13). Hence, the chondroitin synthase identified here seems to be similar or identical to the serum enzyme rather than the melanoma enzyme, which possibly possesses three enzyme activities: GalNAcT-II, GlcAT-II, and GalNAc transferase I. Indeed, data base searches yielded two other human genes that are homologous to the chondroitin synthase gene, with weak but significant sequence similarities and a conserved DXD motif, suggesting that a few more enzymes may be involved in chondroitin sulfate biosynthesis. Characterization of the two gene products is now in progress.