Molecular Cloning and Expression of Human Chondroitin N-Acetylgalactosaminyltransferase
THE KEY ENZYME FOR CHAIN INITIATION AND ELONGATION OF CHONDROITIN/DERMATAN SULFATE ON THE PROTEIN LINKAGE REGION TETRASACCHARIDE SHARED BY HEPARIN/HEPARAN SULFATE*

Based on sequence homology with the recently cloned human chondroitin synthase, we identified a novel β1,4-N-acetylgalactosaminyltransferase, which consisted of 532 amino acids with a type II transmembrane protein topology. The amino acid sequence displayed 27% identity to that of human chondroitin synthase. The expression of a soluble form of the protein in COS-1 cells produced an active enzyme, which transferred β1,4-N-acetylgalactosamine (GalNAc) from UDP-[3H]GalNAc not only to a polymer chondroitin representing growing chondroitin chains (β-GalNAc transferase II activity) but also to GlcUAβ1-3Galβ1-O-C2H4NH-benzyloxycarbonyl, a synthetic substrate for β-GalNAc transferase I that transfers the first GalNAc to the core tetrasaccharide in the protein linkage region of chondroitin sulfate. Hence, the enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulfate and is the

cellular matrix of connective tissues and are also found at the surface of many cell types and in intracellular secretory granules. The GAG chains exhibit tissue-specific and developmentally regulated expression and have been implicated in the regulation and maintenance of cell proliferation, cytodifferentiation, and tissue morphogenesis (for reviews, see Refs. 1-3). Recent studies of chondroitin/dermatan sulfate chains have indicated important roles in neural network formation in the developing mammalian brain, attracting much attention (for reviews, see Refs. 4 and 5).
Chondroitin/dermatan sulfate and heparin/heparan sulfate are produced on the so-called GAG-protein linkage region, GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-O-, attached to specific Ser residues of core proteins, which is common to the GAGs (for reviews, see Refs. 6 and 7). The synthesis of this region is initiated by the addition of Xyl to Ser followed by the addition of two Gal residues and is completed by the addition of GlcUA, with each reaction being catalyzed by the respective specific glycosyltransferase (6, 7). The GAGs are built up on this linkage region tetrasaccharide by the alternate addition of N-acetylhexosamine and GlcUA residues. Chondroitin/dermatan sulfate is synthesized once GalNAc is transferred to the common linkage region, whereas heparin/heparan sulfate is formed if GlcNAc is added first. The first hexosamine transfer is therefore considered to be critical in determining whether chondroitin/dermatan sulfate or heparin/heparan sulfate chains are selectively assembled on the common linkage region. Although such mechanisms have long been proposed based on data from conventional structural and enzymological studies (7), not all of the glycosyltransferases involved have been molecularly cloned. In particular, the molecular mechanisms underlying the selective biosynthetic assembly of different GAG chains have been a long-standing enigma, although the biosynthesis of heparin/heparan sulfate has recently been clarified by cDNA cloning of the responsible glycosyltransferases (for a review, see Ref. 8).
Recent cDNA cloning of the glycosyltransferases involved in the biosynthesis of heparin/heparan sulfate has revealed the biosynthesis to be associated with the EXT gene family, the hereditary multiple exostoses gene family of tumor suppressors (for a review, see Ref. 8). It has been demonstrated that human EXT1 and EXT2 are the heparan sulfate copolymerases that polymerize GlcUA and GlcNAc alternately (9-11). The human EXTL1 protein is GlcNAc transferase II, which is involved in the chain elongation of heparan sulfate (12); the human EXTL2 protein is GlcNAc transferase I (13), which determines and initiates the synthesis of heparan sulfate on the common GAG-protein linkage region (14); and the human EXTL3 protein exhibits both GlcNAc transferase I and II activities (12). In contrast, the mechanisms of the chain assembly of chondroitin sulfate are not yet fully understood. We have recently cloned cDNA encoding a single large polypeptide, namely, chondroitin synthase, with the GlcUA transferase II (β1,3-glucuronyltransferase II) and GalNAc transferase II (GalNAcT-II) activities responsible for the biosynthesis of repeating disaccharide units of chondroitin sulfate (15). The feature of a single protein with dual glycosyltransferase activities is reminiscent of that of heparan sulfate copolymerases encoded by EXT1 and EXT2. However, characterization of the acceptor substrate specificity of the human recombinant chondroitin synthase revealed that the enzyme showed no GalNAcT-I activity, which determines and initiates the synthesis of chondroitin sulfate on the common GAG-protein linkage region (16), indicating that at least one other gene encoding GalNAcT-I likely exists in the human genome. A data base search revealed two other human genes homologous to the chondroitin synthase gene. Here, we describe the identification of the gene product of one of the two genes in the novel gene family as the key enzyme (chondroitin GalNAcT) responsible for the selective chain assembly, i.e. initiation and elongation, in chondroitin/dermatan sulfate.

* This work was supported in part by the Science Research Promotion Fund of the Japan Private School Promotion Foundation and by Grant-in-aid for Scientific Research 13470493 and Grant-in-aid for Scientific Research on Priority Areas 10178102 from the Ministry of Education, Science, Culture and Sports of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™

(19) were provided by Dr. T. Ogawa (Riken, The Institute of Physical and Chemical Research, Saitama, Japan). GlcUAβ1-3Galβ1-O-C2H4NHCbz was chemically synthesized. Chondro-hexasaccharide (GlcUAβ1-3GalNAc)3 was prepared from chondroitin as described previously (20). A Superdex Peptide HR10/30 column was obtained from Amersham Biosciences, Inc.
In Silico Cloning of a Human Putative Chondroitin GalNAcT cDNA-tBLASTn analysis of the GenBank™ data base with sequences from human chondroitin synthase (15) revealed two additional distinct clones. Analysis of one clone (GenBank™ accession number AX092340) revealed a single open reading frame with significant sequence similarity to the COOH-terminal side of human chondroitin synthase. In addition, a search of the Human Genome Project data base, which recently became available, showed a genome sequence (accession number NT_029338.1) identical to the cDNA sequence. Comparison between the cDNA and the genome sequence revealed the genomic organization of this putative chondroitin GalNAcT gene.

(21), resulting in the fusion of the putative GalNAcT with the insulin signal sequence and the protein A sequence present in the vector. An NheI fragment containing the above fusion protein sequence was inserted into the XbaI site of the expression vector pEF-BOS (22). The nucleotide sequence of the amplified cDNA was determined in a 377 DNA sequencer (PE Applied Biosystems).

Expression of a Soluble Form of Putative Chondroitin GalNAcT and Enzyme Assays-The expression plasmid (6.7 μg) was transfected into COS-1 cells on 100-mm plates using FuGENE 6 (Roche Molecular Biochemicals) according to the manufacturer's instructions. Two days after transfection, 1 ml of the culture medium was collected and incubated with 10 μl of IgG-Sepharose (Amersham Biosciences, Inc.) for 1 h at 4°C. The beads recovered by centrifugation were washed with the assay buffer and then resuspended in the assay buffer and tested for GalNAcT and GlcUA transferase activities as described below. The acceptor used for GalNAcT was polymer chondroitin (167 μg), chondrohexasaccharide (GlcUAβ1-3GalNAc)3 ]GlcUA by gel filtration using a syringe column packed with Sephadex G-25 (superfine) or a Superdex Peptide column or by HPLC on a Nova-Pak C18 column (3.9 × 150 mm; Waters, Tokyo, Japan) as described previously (13, 23-26). The recovered labeled products were quantified by liquid scintillation spectrophotometry.
Identification of the Enzyme Reaction Products-Isolation of the products from the GalNAcT reaction using polymer chondroitin as an acceptor was carried out by gel filtration on a Superdex Peptide column equilibrated with 0.25 M NH4HCO3/7% 1-propanol. The radioactive peak containing the enzyme reaction product was pooled and evaporated to dryness. The isolated GalNAcT reaction product (about 120 μg) was digested with 100 mIU of chondroitinase AC-II to assess the digestibility in a total volume of 30 μl of 50 mM sodium acetate buffer, pH 6.0, at 37°C overnight. The enzyme digest was analyzed using the same Superdex Peptide column as described above.
The isolation of the products from the GalNAcT reaction using GlcUAβ1-3Galβ1-O-C2H4NHCbz was performed by HPLC on a Nova-Pak C18 column (3.9 × 150 mm; Waters) in an LC-10A system (Shimadzu Co., Kyoto, Japan). The column was developed isocratically for 15 min with H2O at a flow rate of 1.0 ml/min at room temperature; thereafter, a linear gradient was applied to increase the methanol concentration from 0% to 100% over a 5-min period, and the column was then developed isocratically for 40 min with 100% methanol. The radioactive peak containing the product was pooled and evaporated to dryness. The isolated product (about 74 pmol) was incubated with 100 mIU of chondroitinase AC-II to assess the digestibility in a total volume of 30 μl of 50 mM sodium acetate buffer, pH 6.0, at 37°C overnight or with 40 mIU of α-N-acetylgalactosaminidase in a total volume of 20 μl of 50 mM sodium citrate buffer, pH 4.5, at 37°C overnight (13). The enzyme digest was analyzed using the same Nova-Pak C18 column as described above.
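The elution program described above (15 min of H2O, a 5-min linear ramp to 100% methanol, then 40 min at 100% methanol) can be written as a simple piecewise function of time. The following is only an illustrative sketch; the function name `methanol_percent` is hypothetical and is not taken from the instrument software.

```python
def methanol_percent(t_min: float) -> float:
    """Methanol concentration (%) at t_min minutes into the 60-min program.

    Assumes the three phases stated in the text:
    0-15 min isocratic H2O, 15-20 min linear 0 -> 100% methanol,
    20-60 min isocratic 100% methanol.
    """
    if t_min < 0 or t_min > 60:
        raise ValueError("time lies outside the 60-min elution program")
    if t_min <= 15:
        return 0.0
    if t_min <= 20:
        # Linear ramp: 100% spread evenly over the 5-min gradient.
        return (t_min - 15) / 5 * 100.0
    return 100.0
```

At a flow rate of 1.0 ml/min, the elapsed time in minutes also equals the eluted volume in milliliters.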

In Silico Cloning of a Putative Human Chondroitin GalNAcT cDNA-We recently identified and characterized human chondroitin synthase (15). Screening of the nonredundant data base at the National Center for Biotechnology Information (National Institutes of Health, Bethesda, MD) using the deduced amino acid sequence of human chondroitin synthase identified a clone (GenBank™ accession number AX092340) containing a 5′-untranslated region of 654 bp, a single open reading frame of 1596 bp coding for a protein of 532 amino acids with two potential N-glycosylation sites (Fig. 1), and a 3′-untranslated region of about 1.6 kb with four presumptive polyadenylation signals. Northern blot analysis indicated that the mRNA was about 4.0 kb in length in various human tissues (see below), suggesting that the cDNA was approximately the full length. The deduced amino acid sequence corresponded to a 61,350-Da polypeptide. The predicted translation initiation site conformed to the Kozak consensus sequence for initiation (27), and an in-frame stop codon was present upstream of the assigned initiating ATG codon. A Kyte-Doolittle hydropathy analysis (28) revealed one prominent hydrophobic segment of 20 amino acid residues in the NH2-terminal region, predicting that the protein has a type II transmembrane topology characteristic of many Golgi-localized glycosyltransferases cloned to date (Fig. 1). Data base searches indicated that the amino acid sequence displayed 27% identity to human chondroitin synthase (Fig. 1) and weak sequence similarity to human UDP-Gal:GlcNAcβ-R β1,4-Gal transferase II (GenBank™ accession number AB024434), with the highest sequence identity found in the COOH-terminal domain with catalytic activity. Notably, each protein shared a conserved DVD motif (Fig. 1), which appears to correspond to the conserved DXD motif found in most glycosyltransferases (29). Thus, the features of the identified protein sequence suggest that the identified gene product might possess GalNAcT-I and/or -II activities.
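For readers unfamiliar with the hydropathy analysis mentioned above, the following minimal sketch computes a Kyte-Doolittle sliding-window profile. The scale values are the published Kyte-Doolittle hydropathy indices; the window size of 19 residues is a common convention for detecting membrane-spanning segments and is an assumption here, since the window actually used in this study is not stated. The function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy indices for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence.

    Entry i is the average index of residues i .. i+window-1.
    Raises KeyError for any non-standard residue symbol.
    """
    scores = [KD_SCALE[residue] for residue in seq.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

A single strongly positive stretch near the NH2 terminus of such a profile, with no comparable stretch elsewhere, is the pattern that suggests a type II transmembrane topology.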
Intriguingly, a homologue of the identified human gene was found in the Drosophila but not in the Caenorhabditis elegans genome. The human sequence shared 38% identity with that of Drosophila (data not shown).
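Percent identity values such as those quoted above depend on how the denominator is defined. The sketch below uses one common convention, counting only alignment columns in which neither sequence has a gap; this convention is an assumption, as the text does not state which one was applied. The function name `percent_identity` and the example strings are hypothetical and are not real protein sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Only columns where both rows carry a residue are counted
    (assumed convention; other definitions divide by the full
    alignment length or by the shorter sequence length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either row.
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)
```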
Genomic Organization and Chromosomal Localization-Comparison of the identified cDNA sequence with the genome sequence deposited in the Human Genome Project Data base revealed the genomic structure and chromosomal localization of the gene. The gene spans over 280 kb, and the coding region of the gene was divided into seven discrete exons as shown in Fig. 2. The intron/exon junctions followed the GT/AG rule (30) and were flanked by conserved sequences (data not shown). This gene is located on human chromosome 8.
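The GT/AG rule referred to above states that an intron begins with the dinucleotide GT at its 5′ end and ends with AG at its 3′ end. A minimal check might look like the following; the function name `follows_gt_ag_rule` is hypothetical, and the sequences in the usage note are invented for illustration rather than taken from the gene.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if an intron sequence starts with GT and ends with AG.

    The input is the intron alone (DNA, 5' to 3'), without flanking
    exon bases. Sequences shorter than 4 nt cannot satisfy the rule.
    """
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")
```

For example, `follows_gt_ag_rule("GTAAGTCCCTTTCAG")` evaluates to `True`, whereas a sequence beginning with GC does not pass.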
Expression of a Soluble Form of the Putative Glycosyltransferase and Characterization as Chondroitin GalNAcT-To facilitate functional analysis of the putative GalNAcT, a soluble form of the protein was generated by replacing the first 41 amino acids of the putative glycosyltransferase with a cleavable insulin signal sequence and a protein A IgG-binding domain as described under "Experimental Procedures," and then the soluble putative glycosyltransferase was expressed in COS-1 cells as a recombinant enzyme fused with the protein A IgG-binding domain. When the expression plasmid containing the putative glycosyltransferase/protein A fusion construct was expressed in COS-1 cells, a protein of approximately 95 kDa was secreted as shown by Western blotting using IgG (data not shown). The apparent molecular mass of the fused protein was reduced to about 85 kDa after N-glycosidase treatment (data not shown), suggesting that either one of the two potential N-linked glycosylation sites of the putative glycosyltransferase was utilized. The fused enzyme expressed in the medium was adsorbed onto IgG-Sepharose beads to eliminate endogenous glycosyltransferases, and then the enzyme-bound beads were used as an enzyme source. The bound fusion protein was assayed for glycosyltransferase activity using a variety of acceptors and either UDP-GalNAc or UDP-GlcUA as a donor substrate. GlcUAβ1-3Galβ1-O-C2H4NHCbz used as an acceptor substrate for the GalNAcT-I reaction shares the disaccharide sequence with the GAG-protein linkage region tetrasaccharide.
As shown in Table I, marked GalNAc transferase activity was detected with polymer chondroitin, chondrohexasaccharide (GlcUAβ1-3GalNAc)3, α-thrombomodulin containing a linkage region tetrasaccharide (GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1) on the native core protein (18), and GlcUAβ1-3Galβ1-O-C2H4NHCbz but not with the tetrasaccharide serines (GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser or GlcUAβ1-3Gal(4-O-sulfate)β1-3Galβ1-4Xylβ1-O-Ser) as acceptor substrates. In contrast, no GlcUA transferase activity was observed using polymer chondroitin. No detectable glycosyltransferase activity was recovered by affinity purification from a control pEF-BOS transfection sample. These findings clearly indicate that the expressed protein is a GalNAcT.
To identify the GalNAcT reaction products, the representative acceptor substrates, polymer chondroitin and GlcUAβ1-3Galβ1-O-C2H4NHCbz, were individually labeled by the respective transferase reaction using UDP-[3H]GalNAc as a donor substrate and the enzyme-bound beads as an enzyme source. Both labeled products were completely digested by chondroitinase AC-II, which cleaves a β1,4-N-acetylgalactosaminidic linkage in an eliminative fashion, quantitatively yielding a 3H-labeled peak at the position of free [3H]GalNAc, as demonstrated by gel filtration (Fig. 3A) or hydrophobic HPLC (Fig. 3B). In contrast, they were inert to the action of α-N-acetylgalactosaminidase, which is in marked contrast to the product of the α1,4-GalNAc transferase reaction catalyzed by EXTL2 (13). These findings indicate that a GalNAc residue was indeed transferred to the nonreducing terminal GlcUA of polymer chondroitin or GlcUAβ1-3Galβ1-O-C2H4NHCbz through a β1-4 linkage. Taken together, the above-mentioned results indicate that the identified protein is a novel and unique β1,4-GalNAc transferase I/II involved in the biosynthetic initiation and elongation of chondroitin/dermatan sulfate.
Expression Pattern of Chondroitin GalNAcT-Northern blot analysis of mRNA demonstrated a single band of ~4.0 kb for all human tissues examined (Fig. 4). The gene exhibited a ubiquitous but differential expression in the human tissues examined. Notably, the expression pattern was similar to that of chondroitin synthase. Array analysis also indicated that the gene was expressed in all human tissues examined (Fig. 5). The strongest signals were seen in the placenta, thyroid, bladder, prostate, and adrenal gland. These findings are in accordance with the observation that chondroitin/dermatan sulfate proteoglycans are distributed on the surfaces of most cells and in the extracellular matrices in virtually every tissue.

DISCUSSION

We have identified human chondroitin GalNAcT and demonstrated that the enzyme possessed both GalNAcT-I and -II activities responsible for chain initiation and elongation in chondroitin/dermatan sulfate. This enzyme is similar to a heparan sulfate GlcNAc transferase encoded by human EXTL3 or C. elegans rib-2, which has both GlcNAc transferase-I and -II activities responsible for chain initiation and elongation in heparin/heparan sulfate (12, 31). Thus, in the synthesis of both chondroitin/dermatan sulfate and heparin/heparan sulfate, at least two kinds of enzymes, one for chain initiation and elongation (chondroitin GalNAcT and heparan sulfate GlcNAc transferase EXTL3, respectively) and the other for chain polymerization (chondroitin synthase and heparan sulfate copolymerases EXT1 and EXT2, respectively), are required. Intriguingly, these findings, taken together, suggest that although the chondroitin backbone is synthesized by the novel gene family proteins distinct from the EXT gene family proteins that assemble a heparan backbone, the assembly of the polysaccharide backbones of chondroitin sulfate and heparan sulfate shares a similar mechanism of chain construction.
Until recently, little information was available about the role of GalNAcT-I in the assembly of chondroitin sulfate chains. It has been believed that GalNAcT-I exists in addition to GalNAcT-II, which synthesizes the repeating disaccharide region, since the pioneering work in 1985 of Rohrmann et al. (16), who reported two distinct activities of β-GalNAc transferases I and II that are different in their thermostability. While GalNAcT-I was being studied, a unique α-GalNAc transferase, which shared similar thermostability with the previously reported GalNAcT-I (16), was discovered in mammalian tissues and cells that transferred GalNAc from UDP-GalNAc to the linkage tetrasaccharide serine GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser (20, 32) through an α1,4-linkage (33). The amino acid sequencing of the enzyme purified from serum-free media of human sarcoma cells revealed it to be EXTL2 (13). To our surprise, the protein showed not only α-GalNAc transferase activity toward the linkage tetrasaccharide serine but also α-GlcNAc transferase activity toward GlcUAβ1-3Galβ1-naphthalenemethanol, which is an authentic acceptor for the previously reported GlcNAc transferase I (14), indicating that EXTL2 is identical to GlcNAc transferase I, which selectively assembles a heparan sulfate chain in the common tetrasaccharide linkage region. Thus, the alleged GalNAcT-I, which is thermostable, appears to be identical to EXTL2 exhibiting GlcNAc transferase I activity for heparin/heparan sulfate chain assembly as well as α-GalNAc transferase activity, the physiological significance of which still remains to be established (13).
Genuine GalNAcT-I activity was detected for the first time in the spent medium of a human melanoma cell line when α-thrombomodulin was used as an acceptor substrate (24). α-Thrombomodulin bears the truncated linkage region tetrasaccharide (GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-) (18) and accepted a β1,4-GalNAc residue at the nonreducing terminal, GlcUA, of the tetrasaccharide. The partially purified enzyme preparation also showed GalNAcT-II activity for chondroitin sulfate chain elongation. Interestingly, this activity was competitively inhibited by α-thrombomodulin, suggesting that both activities are catalyzed by a single GalNAcT (24). Furthermore, both activities are greatly augmented by the α-thrombomodulin core protein, which appears to be recognized by the GalNAcT. In this context, the chondroitin GalNAcT identified in the present study also appears to recognize a specific peptide sequence in the core protein or an aglycone structure attached to the linkage region tetrasaccharide, as in the case of heparan sulfate GlcNAc transferases EXTL2 and EXTL3 (12, 13), because the tetrasaccharide serine GlcUAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser derived from the linkage region did not serve as an acceptor substrate (Table I). The aglycone of GlcUAβ1-3Galβ1-O-C2H4NHCbz probably mimicked the peptide sequence required, as observed for GlcNAc transferase I reactions catalyzed by EXTL2 and EXTL3, both of which also utilize GlcUAβ1-3Galβ1-O-C2H4NHCbz as acceptor (12, 13). Thus, the chondroitin GalNAcT identified here seems to be similar or identical to the GalNAc transferase detected in the spent medium of a human melanoma cell line.
As described previously (15), a homologue of human chondroitin synthase is present in C. elegans as well as in Drosophila, consistent with the findings that chondroitin or chondroitin 4-sulfate is found in C. elegans and Drosophila, respectively (34-36). Notably, however, a homologue of human chondroitin GalNAcT is absent in C. elegans but present in higher species such as Drosophila (data not shown). Based on these findings, we hypothesize that a single enzyme protein, chondroitin synthase, in C. elegans is responsible for all the transferase activities of GalNAcT-I and -II as well as β1,3-glucuronyltransferase II. It is likely that as in the biosynthesis of heparan sulfate (31), the biosynthetic mechanism for chondroitin sulfate in C. elegans is distinct from that in mammals. Characterization of chondroitin synthase in C. elegans is now in progress.
The mechanism for the selective chain assembly of chondroitin/dermatan sulfate and heparin/heparan sulfate on the common linkage region tetrasaccharide has been an enigma since the discovery of the common linkage region tetrasaccharide in the mid-1960s (6, 7). This was mainly because of a lack of cDNA cloning of the two key enzymes, GalNAcT-I and GlcNAc transferase I. As described here and previously (12, 13), however, cDNAs coding for the enzymes harboring GalNAcT-I or GlcNAc transferase I activity have finally been cloned. These cDNA probes will be useful for investigating the biological functions of chondroitin/dermatan sulfate as well as the mechanism of selective chain assembly of chondroitin/dermatan sulfate and heparin/heparan sulfate, both of which are ubiquitously expressed and play a variety of indispensable yet different biological roles in many tissues.