Molecular Cloning and Expression of Glucuronyltransferase I Involved in the Biosynthesis of the Glycosaminoglycan-Protein Linkage Region of Proteoglycans*

We isolated a cDNA encoding a novel glucuronyltransferase from human placenta cDNA with the use of the degenerate reverse transcriptase-polymerase chain reaction method. Degenerate primers were designed based upon the amino acid sequence alignment of rat glucuronyltransferase (GlcAT-P) involved in the biosynthesis of the carbohydrate epitope HNK-1 with putative proteins in Caenorhabditis elegans and Schistosoma mansoni. The new cDNA sequence revealed an open reading frame coding for a protein of 335 amino acids with a type II transmembrane protein topology. The amino acid sequence displayed 43% identity to the rat GlcAT-P, and the highest sequence identity was found in the COOH-terminal catalytic domain. The expression of a soluble recombinant form of the protein in COS-1 cells produced an active glucuronyltransferase with marked specificity for a glycoserine, Galβ1-3Galβ1-4Xylβ1-O-Ser. In contrast, asialoorosomucoid, which contains the Galβ1-4GlcNAc sequence and is a good acceptor substrate for the GlcAT-P, did not serve as an acceptor. These results indicate that the newly cloned enzyme is the previously described glucuronyltransferase I that forms the glycosaminoglycan-protein linkage region, GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser, of proteoglycans.

Proteoglycans are polyanionic molecules that are distributed on the surfaces of most cells and the extracellular matrixes in virtually every tissue and comprise a wide range of structures involving different core proteins and different classes, numbers, and lengths of sulfated glycosaminoglycans (GAGs) 1 (for reviews, see Refs. 1 and 2). Despite the ubiquity of this family of molecules, a wide variety of proteoglycans with characteristic sulfated GAG chains exhibit tissue-specific and developmentally regulated expression (for a review, see Ref. 3) and have been implicated in the regulation and maintenance of cell proliferation, cytodifferentiation, and tissue morphogenesis (for a review, see Ref. 4).
Sulfated GAGs, including heparin/heparan sulfate, chondroitin sulfate, and dermatan sulfate, are covalently bound to Ser residues in the core proteins through the common carbohydrate-protein linkage structure, GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser (1,2). Heparin/heparan sulfate is synthesized once GlcNAc is transferred to the common linkage region, while chondroitin sulfate is formed if GalNAc is first added. The biosynthesis of these GAGs is initiated by the addition of Xyl to specific Ser residues in the core protein and is followed by the addition of two Gal residues, a GlcA residue and then alternating addition of N-acetylhexosamine and GlcA residues. The addition of each sugar is thought to be catalyzed by a specific glycosyltransferase, which uses the corresponding uridine diphosphate sugar as a donor substrate (1,2).
The glucuronyl transfer to the Gal residue is catalyzed by a β1,3-glucuronyltransferase termed glucuronyltransferase I (GlcAT-I), which is thought to be distinct from the glucuronyltransferases involved in the formation of the repeating disaccharide units characteristic of various GAG species (5-8). GlcAT-I was first detected in an embryonic chick cartilage extract (5) and was subsequently partially purified from embryonic chick brain (9) and mouse mastocytoma cells (6). However, attempts to purify GlcAT-I to homogeneity have not been successful due to the low concentrations and the difficulty in solubilization of the enzyme. Recently, a cDNA encoding a glucuronyltransferase involved in the biosynthesis of the HNK-1 carbohydrate epitope on glycoproteins (GlcAT-P) was cloned (10). The GlcAT-P is a β1,3-glucuronyltransferase that utilizes glycoprotein acceptor substrates with the terminal Galβ1-4GlcNAc sequence. Mixed substrate experiments have indicated that GlcAT-I and the GlcAT-P are distinct enzymes, although the two enzymatic reactions are similar (11). In view of the resemblance of the reactions catalyzed by GlcAT-I and GlcAT-P, we anticipated that a molecular similarity might exist between the two enzymes and attempted to exploit this suspected similarity to isolate a cDNA encoding GlcAT-I.
The alignment of the amino acid sequence of rat GlcAT-P with putative proteins in Caenorhabditis elegans and Schistosoma mansoni revealed the presence of four highly conserved motifs (I-IV) in the large catalytic region (10). Accordingly, we used degenerate primers to conserved elements found in the motifs II and III for a PCR-based approach to clone new members of this gene family. In the present study, we describe the successful use of the approach for cloning a cDNA from human placenta which encodes GlcAT-I responsible for the formation of the GAG-protein linkage region, GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser, of proteoglycans.
PCR-based Cloning of a New Glucuronyltransferase-Based upon the amino acid sequence alignment of rat GlcAT-P with putative proteins in C. elegans and S. mansoni (see Fig. 2 in Ref. 10), two degenerate oligonucleotides to conserved elements found in the motifs II and III were synthesized, which were predicted to yield amplified fragments of around 210 bp. The sequences of the 5′ and 3′ primers were 5′-TSGTSTAYTTYGCYGAYGAYGA-3′ (the nucleotide sequence corresponding to 608-629 of the cDNA for rat GlcAT-P (10)) and 5′-TTSACRGCRAASCCRGCCATRT-3′ (the nucleotide sequence corresponding to 800-821) (R = A + G, S = C + G, and Y = C + T), respectively. For PCR amplification, first strand cDNA synthesized with 25 ng of human placenta poly(A)+ RNA (OriGene, Rockville, MD) was combined with 50 pmol of each primer. Thirty cycles (94°C for 45 s, 50°C for 45 s, and 72°C for 90 s) were run using EX Taq polymerase (Takara Shuzo Co., Kyoto), and the products were subcloned into the pGEM-T Easy vector (Promega, Madison, WI). Twelve clones were sequenced using a T7 primer, and 10 of the 12 clones contained a novel putative glucuronyltransferase fragment. To obtain the entire coding sequence containing the cDNA fragment, a cloning strategy of rapid amplification of cDNA ends was employed according to the manufacturer's instructions (5′/3′ RACE kit, Boehringer Mannheim, Mannheim, Germany), using 0.2 µg of human placenta poly(A)+ RNA as a template. The nucleotide sequence of the amplified cDNA was determined in a 377 DNA sequencer (PE Applied Biosystems, Foster City, CA). Several clones were sequenced to compensate for misreading by EX Taq polymerase.
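As a rough check on the primer design described above, the following Python sketch (an illustration added here, not part of the original procedure) counts how many distinct oligonucleotides each degenerate primer represents and computes the amplicon length implied by the quoted rat GlcAT-P coordinates. The function names are hypothetical, the primers are written as continuous 22-mers, and only the ambiguity codes defined in the text (R, S, Y) are handled.

```python
from itertools import product

# Ambiguity codes as defined in the text: R = A + G, S = C + G, Y = C + T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "S": "CG", "Y": "CT",
}

# Degenerate primers quoted in the text (5' to 3').
FORWARD = "TSGTSTAYTTYGCYGAYGAYGA"
REVERSE = "TTSACRGCRAASCCRGCCATRT"


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def amplicon_length(first_nt, last_nt):
    """Length of a fragment spanning first_nt..last_nt inclusive."""
    return last_nt - first_nt + 1


if __name__ == "__main__":
    print(degeneracy(FORWARD), degeneracy(REVERSE))
    print(amplicon_length(608, 821))
```

Under these assumptions the 5′ primer corresponds to 128 sequences and the 3′ primer to 64, and positions 608 through 821 span 214 nucleotides, in line with the expected product of around 210 bp.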
Construction of a Soluble Form of the Glucuronyltransferase (GlcAT)-A truncated form of GlcAT, lacking the first NH2-terminal 43 amino acids of the GlcAT, was amplified by PCR using a 5′ primer (5′-GGAAGATCTCTACGGCAGAAGGATCTGAGGAT-3′) containing an in-frame BglII site and a 3′ primer (5′-GGAAGATCTGTGCCTGAAAAGAGGTGGTAG-3′) containing a BglII site located 40 bp downstream of the stop codon. PCR reactions were carried out with Pfu polymerase (Stratagene, La Jolla, CA) by 30 cycles of 96°C for 30 s, 60°C for 45 s, and 72°C for 90 s. The PCR fragment was subcloned into the BamHI site of pGIR201protA (14), resulting in the fusion of GlcAT to the insulin signal sequence and the protein A sequence present in the vector. An NheI fragment containing the fusion protein was inserted into the XbaI site of the expression vector pSVL.
Expression of the Soluble Form of the GlcAT and Enzyme Assay-The expression plasmid (11 µg) was transfected into COS-1 cells on 100-mm plates using LipofectAMINE (Life Technologies, Inc.) according to the instructions provided by the manufacturer. Two days after transfection, 1 ml of the culture medium was collected and incubated with 10 µl of IgG-Sepharose (Amersham Pharmacia Biotech) for 1 h at 4°C. The beads recovered by centrifugation were washed with and then resuspended in the assay buffer and tested for glucuronyltransferase activities using linkage oligosaccharide-serines (1 nmol each), chondroitin (300 µg), and asialoorosomucoid (20 µg) as acceptor substrates as described (5, 15, 16), respectively.
Characterization of the Reaction Products-The isolation of the products from the GlcAT reaction using the linkage trisaccharide-serine was carried out by gel filtration on a Superdex 30 column equilibrated with 0.25 M NH4HCO3, 7% 1-propanol. The radioactive peak containing the product was pooled and evaporated to dryness. The isolated product was digested with 100 mIU of β-glucuronidase overnight at 37°C in a total volume of 50 µl of 0.1 M sodium acetate buffer, pH 4.5. The digest was analyzed using the same Superdex 30 column as that noted above. To further identify the reaction product, it was subjected to the recently developed analytical method for linkage glycoserines using chromophore labeling in conjunction with HPLC. 2 The product purified by the Superdex 30 column was derivatized with 4-dimethylaminoazobenzene-4′-sulfonyl chloride (DABS-Cl) (Wako Pure Chemicals, Osaka, Japan) essentially as described (17). Briefly, the product (1 nmol) was dissolved in 20 µl of a 50 mM sodium bicarbonate buffer (pH 8.1) and mixed with 40 µl of a DABS-Cl solution (4 nmol/µl in acetone). The mixture was heated at 70°C for 10 min, dried, and redissolved in 1 ml of 5% (v/v) ethanol. The labeled product was analyzed by HPLC on a Nova-Pak C18 column (3.9 × 150 mm; Waters, Tokyo) in an LC-10AS system (Shimadzu Co., Kyoto) using a linear acetonitrile gradient in 25 mM sodium acetate, pH 6.5, containing 4% dimethylformamide. The gradient was made by increasing the acetonitrile concentration from 14 to 28% over a 60-min period at a flow rate of 1.0 ml/min at room temperature. Eluates were monitored at 436 nm. The identification of the reaction product was accomplished by co-chromatography with the authentic linkage tetrasaccharide-serine labeled with DABS-Cl as described above.
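The linear gradient quoted above (14 to 28% acetonitrile over 60 min at 1.0 ml/min) can be sketched as a simple interpolation. This is only an illustration with hypothetical function names; holding the concentration constant before 0 min and after 60 min is an assumption of the sketch, since the text does not describe the program outside the gradient window.

```python
def acetonitrile_percent(t_min, start=14.0, end=28.0, duration=60.0):
    """Acetonitrile concentration (%) at t_min for a linear gradient.

    Values outside the 0..duration window are clamped to the end points;
    this clamping is an assumption, not something stated in the text.
    """
    if t_min <= 0:
        return start
    if t_min >= duration:
        return end
    return start + (end - start) * t_min / duration


def eluted_volume_ml(t_min, flow_ml_per_min=1.0):
    """Mobile phase volume delivered after t_min at a constant flow rate."""
    return flow_ml_per_min * t_min


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(t, acetonitrile_percent(t), eluted_volume_ml(t))
```

With these parameters the slope is 14 percentage points over 60 min, so the midpoint of the run (30 min) corresponds to 21% acetonitrile and 30 ml of eluent.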

PCR-based Cloning of a New Glucuronyltransferase-
The alignment of the amino acid sequence of rat GlcAT-P with those of the corresponding putative proteins in C. elegans and S. mansoni revealed the presence of four highly conserved motifs (I-IV) in the large catalytic region (see Fig. 2 in Ref. 10). Accordingly, we first designed degenerate oligonucleotide primers to the conserved regions found in the motifs II and III for a PCR-based approach to clone new members of this gene family (see "Experimental Procedures"). Human placenta was chosen as a source for cDNA isolation in the initial PCR experiments, since it is an abundant source of sulfated GAGs and does not express the GlcAT-P gene. 3 With human placenta cDNA as the template, the PCR using degenerate primers resulted in the amplification of a major ~210-bp product (data not shown). After subcloning the PCR product and sequencing individual clones, we found that 10 of the 12 characterized clones had an identical sequence that was similar to but distinct from that of the GlcAT-P, suggesting that these clones encoded a novel member of the glucuronyltransferase gene family.
Primary Structure of the New Gene-To clone the complete coding sequence of the novel cDNA, the cloning strategy of rapid amplification of cDNA ends was employed (18). The sequence of the overlapping cDNA fragments indicated a single open reading frame of 1005 bp coding for a protein of 335 amino acids, including one potential N-glycosylation site (Fig. 1). The deduced amino acid sequence corresponds to a 37,061-Da polypeptide. The predicted translation initiation site conformed to the Kozak consensus sequence for initiation (19). However, since no in-frame stop codon is present upstream of the assigned initiating ATG codon, the possibility that the identified cDNA still lacks a 5′-terminal sequence cannot be excluded (Fig. 1). A Kyte-Doolittle hydropathy analysis (20) revealed one potential membrane-spanning region consisting of 18 hydrophobic amino acid residues, which was located 7 residues away from the amino terminus (Fig. 1) and appears to result in a type II transmembrane orientation characteristic of many of the other glycosyltransferases cloned to date. An additional characteristic feature in the amino acid sequence of the newly cloned gene was a proline-rich domain (from Pro-30 to Pro-69) next to the transmembrane region, as is seen in several other glycosyltransferases including the rat GlcAT-P (10). Data base searches indicated that the amino acid sequence displayed 42.7% identity to the rat GlcAT-P, and the highest sequence identity was found in the COOH-terminal catalytic domain containing the four previously identified highly conserved motifs (I-IV) (10), following the proline-rich region (252 amino acids between Pro-68 and Glu-319 overlap with 56.7% identity) (Fig. 2). The sequence conservation strongly suggests that the newly cloned gene product is a novel glucuronyltransferase.
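To illustrate the kind of hydropathy analysis referred to above, the following Python sketch computes a sliding-window mean of Kyte-Doolittle hydropathy values and flags windows that look like candidate membrane-spanning segments. It is a minimal sketch under stated assumptions: the 19-residue window and the 1.6 cutoff are commonly cited Kyte-Doolittle criteria and are not parameters reported in this paper, the function names are hypothetical, and the toy sequence is invented for demonstration and is not the GlcAT-I sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows whose mean exceeds threshold.

    The window size and threshold are assumptions of this sketch.
    """
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > threshold]


# Hypothetical toy sequence: short polar NH2 terminus, an 18-residue
# hydrophobic stretch, then a proline-rich polar tail.
TOY = "MKLSRDE" + "L" * 18 + "PPSPEPKRDE"

if __name__ == "__main__":
    print(candidate_windows(TOY))
```

A single hydrophobic stretch near the amino terminus, followed by a polar stem region, is the pattern that such a profile would show for a type II transmembrane protein.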
Expression of a Soluble Form of the New Glucuronyltransferase and Characterization as GlcAT-I-To facilitate the functional analysis of the putative glucuronyltransferase, a soluble form of the protein was generated by replacing the first 43 amino acids of the putative glucuronyltransferase with a cleavable insulin signal sequence and a protein A IgG binding domain as described under "Experimental Procedures." The soluble putative glucuronyltransferase was then expressed in COS-1 cells as a recombinant enzyme fused with the protein A IgG binding domain. The fused enzyme secreted into the medium was adsorbed on IgG-Sepharose beads to eliminate endogenous glucuronyltransferase, and the enzyme-bound beads were then used as an enzyme source. The bound fusion protein was assayed for glucuronyltransferase activity using a variety of acceptor substrates. As shown in Table I, activity was detected only with the linkage trisaccharide-serine, Galβ1-3Galβ1-4Xylβ1-O-Ser, as the substrate, whereas no activity was detected using polymer chondroitin, the linkage pentasaccharide-serine, or heptasaccharide-serine as acceptors. In contrast, asialoorosomucoid, which contains a Galβ1-4GlcNAc sequence and is a good acceptor substrate for the GlcAT-P (10), did not serve as an acceptor. In addition, no detectable glucuronyltransferase activity was recovered by the affinity purification from a control pSVL transfection sample. These results clearly indicate that the expressed protein is a glucuronyltransferase with marked specificity for Galβ1-3Galβ1-4Xylβ1-O-Ser.
To identify the glucuronyltransferase reaction products, the linkage trisaccharide-serine was labeled with [14C]GlcA using the enzyme-bound beads as an enzyme source, and the products were isolated and then subjected to a gel filtration analysis after β-glucuronidase treatment. The labeled products were completely digested by β-glucuronidase, quantitatively yielding a 14C-labeled peak at the position of free [14C]GlcA (data not shown). These results indicate that a GlcA residue was indeed transferred to the linkage trisaccharide-serine through a β-linkage. The labeled products were also derivatized with DABS-Cl and analyzed by HPLC as described under "Experimental Procedures." As shown in Fig. 3, only a single radioactive peak was detected at the elution position of a DABS-derivative of the linkage tetrasaccharide-serine, GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser, being separated from those of Galβ1-3Galβ1-4Xylβ1-O-Ser and GalNAcβ1-4GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser. Moreover, the identity of the reaction product was confirmed by co-chromatography with the authentic linkage tetrasaccharide-serine labeled with DABS-Cl (data not shown). These results indicate that a GlcA residue was transferred exclusively to the nonreducing terminal Gal of the linkage trisaccharide-serine, most likely through a β1-3 linkage.

DISCUSSION
GlcAT-I is the first cloned glycosyltransferase among those responsible for the biosynthesis of sulfated GAGs, including heparin/heparan sulfate, chondroitin sulfate, and dermatan sulfate. It should be emphasized that no protein sequence information had been available for GlcAT-I, although the identification and partial purification of the enzyme had been reported in a few tissues. The initial characterization of crude enzyme preparations showed that the enzyme catalyzed the GlcA transfer not only to GAG-protein linkage region fragments with the characteristic structure (e.g. Galβ1-3Gal) but also to disaccharides with analogous structures such as lactose (Galβ1-4Glc) and N-acetyllactosamine (Galβ1-4GlcNAc) (5,6,9). Curenton et al. (11) suggested that these glucuronyl transfer reactions might be catalyzed by two distinct β1,3-glucuronyltransferases, GlcAT-I and GlcAT-P, respectively. This hypothesis has now been confirmed by the cDNA cloning of both enzymes in a recent study (10) and in the present study. It is not surprising, however, that a molecular similarity exists between the two enzymes, although the product glycan types are different (the GAG linkage region and the HNK-1 epitope on N-linked glycoproteins), since these two enzymes exhibit a certain similarity in their substrate recognition (both enzymes recognize the terminal Gal moiety of their acceptor substrates). In addition, the molecular similarity revealed in this study suggests that both genes may be evolutionarily related and originate from one primordial gene.
The expression of GlcAT-I in COS-1 cells, where a fusion was made among the cleavable insulin signal sequence, the protein A IgG binding domain, and the truncated GlcAT-I, resulted in the secretion of the truncated soluble enzyme into the medium, which exhibited the glucuronyltransferase activity despite the finding that GlcAT-I in embryonic chick brain was firmly membrane-associated (11). The present result demonstrated that the cytoplasmic and transmembrane domains are not required for the catalytic activity. The characterization of the acceptor substrate specificity of the recombinant GlcAT-I revealed that the enzyme showed strict specificity toward the linkage trisaccharide-serine, Galβ1-3Galβ1-4Xylβ1-O-Ser (Table I). Neither other linkage oligosaccharide-serines nor polymer chondroitin was utilized as an acceptor substrate. These findings clearly indicate that the transfer of GlcA to the linkage trisaccharide primer is mediated by GlcAT-I, distinct from the enzyme that has been termed glucuronyltransferase II involved in the formation of the repeating disaccharide units of chondroitin sulfate, as proposed previously (5,7,8). Helting and Rodén (5) reported that linkage region oligosaccharides such as Galβ1-3Gal and Galβ1-3Galβ1-4Xyl as well as Galβ1-3Galβ1-4Xylβ1-O-Ser were good acceptors for GlcAT-I in an embryonic chick cartilage extract. Likewise, the recombinant GlcAT-I efficiently utilized Galβ1-3Galβ1-4Xyl as an acceptor, 4 being consistent with the property of the previously characterized enzyme and the finding that the serine residue is not essential for the GlcA transfer.
The availability of the GlcAT-I cDNA now provides an essential tool for investigating the biological functions of sulfated GAGs, especially since GlcAT-I is responsible for the synthesis of the common GAG-protein linkage region. Hence, the gene knockout would result in the complete elimination of chondroitin sulfate, dermatan sulfate, heparan sulfate, and heparin unless functional redundancy exists.