Human Homolog of Caenorhabditis elegans sqv-3 Gene Is Galactosyltransferase I Involved in the Biosynthesis of the Glycosaminoglycan-Protein Linkage Region of Proteoglycans*

A cDNA encoding a novel galactosyltransferase was identified based on BLAST analysis of expressed sequence tags, and the cDNA clones were isolated from a human melanoma line library. The new cDNA sequence encoded a type II membrane protein of 327 amino acids and showed 38% homology to the Caenorhabditis elegans sqv-3 gene involved in vulval invagination and oocyte development. Extracts from L cells transfected with the galactosyltransferase cDNA in an expression vector and a fusion protein with protein A exhibited marked galactosyltransferase activity specific for p-nitrophenyl-β-D-xylopyranoside. Moreover, transfection with the cloned cDNA restored glycosaminoglycan synthesis of galactosyltransferase I-deficient Chinese hamster ovary mutant pgsB-761 cells. Analysis of the enzyme product by β-galactosidase digestion, mass spectroscopy, and NMR spectroscopy revealed that the reaction product was formed via β-1,4 linkage, indicating that the enzyme is galactosyltransferase I (UDP-galactose:O-β-D-xylosylprotein 4-β-D-galactosyltransferase, EC 2.4.1.133) involved in the synthesis of the glycosaminoglycan-protein linkage region of proteoglycans.

Proteoglycans are polyanionic molecules consisting of different core proteins and different types, numbers, and lengths of glycosaminoglycans (GAGs) and are present not only on the cell surface but also in the extracellular matrices of various tissues (1). A wide variety of proteoglycans containing characteristic sulfated GAG chains are generated in a cell type-specific manner, and their strictly regulated expression patterns have suggested their roles in the regulation of cell proliferation/differentiation (2), tissue development and organogenesis (3), and infections (4).
The biosynthesis of the sulfated GAGs on proteoglycans is initiated by the addition of Xyl to Ser residues in the core proteins, followed by the addition of two Gal residues and a GlcA residue (1). Alternatively, the addition of GlcNAc or GalNAc residues to the common linkage structure leads to the formation of heparin/heparan sulfate or that of chondroitin sulfate/dermatan sulfate, respectively. The sequential transfer of individual sugars has been considered to be catalyzed by specific glycosyltransferases, and some glycosyltransferase genes responsible for individual steps have recently been isolated, i.e. glucuronyltransferase I (5,6), GlcA/GlcNAc transferases to elongate the heparan sulfate chain (7), and a GalNAc/GlcNAc transferase to initiate this process (8).
In the present study, we isolated cDNA clones encoding human galactosyltransferase I (XGalT-1), which is involved in the biosynthesis of the common carbohydrate-protein linkage structure GlcAβ1,3Galβ1,3Galβ1,4Xylβ1-O-Ser (1), based on the BLAST analysis of expressed sequence tags (EST) using the cDNA sequence of the Caenorhabditis elegans sqv-3 gene (9). We demonstrate here the substrate specificities of the cloned cDNA product and the restoration of GAG expression on the mutant CHO cells deficient in galactosyltransferase I after the introduction of the cloned cDNA. These results, as well as the structure analyses of the enzyme product, indicated that the cloned gene encodes human galactosyltransferase I.
EST Data Base Search-Data base searches were performed with the coding sequence of the C. elegans sqv-3 gene (GenBank™ accession no. AJ005867) using tBLASTn algorithms against the EST data base at the NCBI. Four human EST clones (GenBank™ accession nos. AI040029, AA100869, AA442547, and T82170) were found, and overlapping sequences (nucleotides 644-1126) were obtained by reverse transcription PCR using total RNA from human colon cancer cell line Lovo as a template.
Screening of cDNA Library-The 5′ RACE product (nucleotides 29-687) prepared using the 5′ RACE kit (Life Technologies, Inc.) was 32P-labeled with a Megaprime™ DNA labeling system (Amersham Pharmacia Biotech) and used to screen the SK-MEL-37 cDNA library. Approximately 4 × 10^5 recombinant clones were screened by colony hybridization. The nucleotide sequence was determined by the dideoxy termination method using an ABI PRISM™ 310 genetic analyzer (Applied Biosystems). The newly cloned gene was designated XGalT-1 for the reasons described below.
Construction of Expression Vectors-A cDNA fragment encoding the open reading frame of XGalT-1 was prepared by PCR using a 5′ primer

* This work was supported by grants-in-aid for scientific research (10470029), for priority areas (10178104), and for Center of Excellence research from the Ministry of Education, Science, Sports and Culture of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number AB028600.
Cell Culture-Mouse fibroblast L cells and CHO-K1 cells were grown in Dulbecco's modified Eagle's minimum essential medium supplemented with 7.5% FCS at 37°C in a 5% CO2 atmosphere. CHO mutant pgsB-761 (10) was obtained from the American Type Culture Collection and grown in F-12K medium (Life Technologies, Inc.) supplemented with 10% FCS.
Preparation of Membrane Fraction-L cells were transiently transfected with an expression plasmid (4 µg) by the DEAE-dextran method (11). After 48 h of culture, the cells were harvested, and the membrane fraction was prepared as described (12).
Preparation of Soluble Forms of XGalT-1-L cells (10-cm dish) were transfected with pCDSA-XGalT-1 (4 µg) by the DEAE-dextran method, and soluble forms of XGalT-1 were obtained as described (13). To prepare the sample for NMR spectroscopic analysis, the soluble enzyme was further purified using IgG-Sepharose (Amersham Pharmacia Biotech). The beads-enzyme complex was washed with and then resuspended in 100 mM MES buffer, pH 6.0.
Galactosyltransferase Assay-The galactosyltransferase activity was determined according to Lugemwa et al. (14) with modification. The assay mixture contained 1 µl of Me2SO, 15 mM MnCl2, 50 mM KCl, 1% Triton X-100, 100 mM MES buffer, pH 6.0, 0.6 mM UDP-Gal, 5000 dpm/µl UDP-[14C]Gal (NEN Life Science Products), and 1 µg of the enzyme and substrates in a total volume of 25 µl. After incubation at 37°C for 30 min, the reaction mixture was applied onto a Sep-Pak C18 cartridge (Waters), and the product was eluted with 5 ml of methanol.
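The conversion from counted radioactivity to transferred galactose can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes that the stated 5000 dpm/µl refers to the final assay mixture, and the product and background counts in the usage lines are hypothetical values chosen only to show the arithmetic.

```python
def udp_gal_specific_activity(dpm_per_ul, udp_gal_mM):
    """Return the specific activity of UDP-Gal in dpm per pmol.

    Assumes dpm_per_ul is the label concentration in the final assay
    mixture (an assumption; the text does not say which volume it refers to).
    """
    pmol_per_ul = udp_gal_mM * 1000.0  # 1 mM = 1 nmol/ul = 1000 pmol/ul
    return dpm_per_ul / pmol_per_ul


def gal_transferred_pmol(product_dpm, background_dpm, specific_activity):
    """Convert background-corrected product counts to pmol of Gal transferred."""
    return (product_dpm - background_dpm) / specific_activity


def activity_pmol_per_h_per_mg(pmol, minutes, enzyme_ug):
    """Normalize transferred Gal to incubation time (h) and protein (mg)."""
    return pmol / (minutes / 60.0) / (enzyme_ug / 1000.0)


# Assay conditions from the text: 0.6 mM UDP-Gal, 5000 dpm/ul, 30 min, 1 ug enzyme.
sa = udp_gal_specific_activity(5000.0, 0.6)

# Hypothetical counts recovered in the methanol eluate (not from the paper).
pmol = gal_transferred_pmol(product_dpm=2600.0, background_dpm=100.0,
                            specific_activity=sa)
rate = activity_pmol_per_h_per_mg(pmol, minutes=30.0, enzyme_ug=1.0)
print(f"{sa:.2f} dpm/pmol, {pmol:.0f} pmol, {rate:.0f} pmol/h/mg")
```

Under these assumptions the donor pool is 15 nmol of UDP-Gal per 25-µl reaction, so the counts in the eluate remain a small fraction of the available donor.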
β-Galactosidase Digestion-One µg of [14C]Galβ1-4Xylβ1-p-Nph formed using XGalT-1-protA was dissolved in 200 µl of solution containing 50 mM Tris-HCl, pH 7.3, 50 mM NaCl, and 20 µg of β-galactosidase from Escherichia coli (Roche Molecular Biochemicals) and incubated for 7 h at 37°C. One µg of the labeled product was also digested with 20 milliunits of diplococcal β-galactosidase (Roche Molecular Biochemicals) for 23 h at 37°C in a total volume of 100 µl of 50 mM sodium citrate buffer, pH 6.0, containing 100 mM NaCl and 100 µg/ml bovine serum albumin. The digested product was separated on a Sep-Pak C18 cartridge as described above.
Purification and Identification of the Enzyme Product-The enzyme reaction was performed in a mixture consisting of 50 µl of the enzyme-bound IgG-Sepharose, 2.7 mg of p-Nph-β-D-Xyl, 10 µl of Me2SO, 15 mM MnCl2, 50 mM KCl, 1% Triton X-100, 100 mM MES buffer, pH 6.0, and 75 mM UDP-Gal in a final volume of 250 µl. The complete conversion of p-Nph-β-D-Xyl to a more slowly migrating compound was confirmed by thin layer chromatography with a solvent system of ethanol/pyridine/n-butanol/acetate/water (100:10:10:3:30). The reaction product was purified on a Sep-Pak C18 cartridge as described above and then further purified by high performance liquid chromatography (Jasco 880-PU pumps and MD915 detector) using a Develosil™ ODS HG-5 column (4.6 × 250 mm, Nomura Chemical). Elution solvents were aqueous 10% acetonitrile (solvent A) and aqueous 30% acetonitrile (solvent B) utilizing a gradient at 40°C. Before NMR analysis, the sample was dissolved in D2O (99.9 atom % deuterium), dried in vacuo, and finally dissolved in 600 µl of D2O (99.96 atom % deuterium). 1H NMR spectra (600 MHz) were obtained on a JEOL JMN α-600 in D2O, and one-dimensional, one-dimensional HOHAHA, and NOE difference spectra and COSY were recorded as ppm from internal t-BuOH (δ 1.23) at 25°C. FAB-MS data were recorded on a JEOL MStation using m-nitrobenzyl alcohol.
Northern Blotting-Human Multiple Tissue Northern blot® was purchased from CLONTECH Laboratories and was probed with a gel-purified [α-32P]dCTP-labeled XGalT-1 cDNA or glyceraldehyde-phosphate dehydrogenase.

Molecular Cloning of Human XGalT-1 Gene-C. elegans sqv-3 was similar in its amino acid sequence to the β4-galactosyltransferase family of human and other vertebrates (9) and was required for vulval invagination and oocyte development (17). To find a human homolog, the NCBI Data Bank of EST cDNA clones was probed with the deduced amino acid sequence of sqv-3 cDNA. Several human partial-length cDNA clones were obtained, and the nucleotide sequence was confirmed by reverse transcription PCR (nucleotides 644-1126 in Fig. 1A) with total RNA from a human colon cancer line. Five independent cDNAs homologous to sqv-3 were obtained from a library, and three of them contained a whole open reading frame encoding a protein of 327 amino acids with a molecular mass of 37,405 daltons (Fig. 1A). It has one potential N-linked glycosylation site. The position of the AUG start codon was determined according to the Kozak consensus sequence (18). Hydropathy analysis (19) indicated one prominent hydrophobic segment of 28 residues in length in the amino-terminal region, predicting that the protein had the type II transmembrane topology characteristic of many other glycosyltransferases cloned to date (Fig. 1B). A comparison of the primary structure of the identified cDNA and human β4GalT-1 (GenBank™ accession no. X14085) (20) revealed that 82 of the 327 amino acids (25%) were identical (Fig. 1C). Similar results were observed between the newly cloned gene and other human β4-galactosyltransferases (β4GalT-II, -III, -IV, -V, and -VI) (21-25). In contrast, this gene shares 38% identity with the sqv-3 gene, suggesting that this gene represents its human ortholog (Fig. 1C). These results indicate that the newly cloned gene product is a novel member of the β4-galactosyltransferase gene family.
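A hydropathy scan of the kind used to predict the amino-terminal membrane anchor can be sketched as below. This is an illustration only: it assumes a Kyte-Doolittle sliding-window average (the text does not specify the method behind reference 19), uses the conventional 19-residue window and 1.6 cutoff for membrane-spanning segments, and runs on a made-up toy sequence rather than the XGalT-1 sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Return 0-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, value in enumerate(profile) if value >= threshold]


# Hypothetical type II-like toy sequence: short charged N terminus,
# a 24-residue hydrophobic stretch, then a hydrophilic tail.
toy = "MKRRSD" + "LLVALLLGLLVLLALLLVGLLLAL" + "RKDESNQRKDESNQRKDESNQ"
hits = hydrophobic_windows(toy)
print("windows above cutoff start at:", hits)
```

A single cluster of high-scoring windows near the amino terminus, with no comparable peak elsewhere, is the pattern expected for a type II membrane protein.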
Galactosyltransferase Activity of the Newly Cloned Enzyme-To analyze the galactosyltransferase activity of the human sqv-3 ortholog, the expression vector of the cloned cDNA, pMIKneo-XGalT-1, was transfected into L cells, and their extracts were used for galactosyltransferase assay using a variety of acceptor substrates (Table I) (26). Similar results were obtained using a soluble fusion enzyme XGalT-1-protA (data not shown).
Characterization of the Enzyme Product-To identify the enzyme reaction products, p-Nph-β-D-Xyl was labeled with [14C]Gal using XGalT-1-protA and then subjected to treatment with E. coli β-galactosidase (27). More than 99% of the labeled products were digested (data not shown). Furthermore, the labeled products were also completely cleaved by diplococcal β-galactosidase, which specifically digests terminal β-1,4-galactosyl linkage (data not shown), indicating that XGalT-1-protA fusion protein catalyzes the galactose transfer from UDP-Gal to the acceptor in β-1,4 linkage.
Structural Determination of the Enzyme Product-To characterize the structure of the reaction product, FAB-MS was performed and revealed that the molecular weight of the product is 433 from signals at 434 m/z (M + 1) and 456 m/z (M + Na+), which was attributable to the galactosyl product of the starting Xylβ1-p-Nph. The 1H NMR spectrum showed the p-nitrophenyl moiety and two glycosyl residues. All of the signals of the sugar moieties were assigned by 1D-HOHAHA spectra irradiating at each anomeric signal (δ = 4.49 and δ = 5.25) and COSY (Fig. 2). The structure of the p-nitrophenyl-β-xylopyranosyl moiety was confirmed by comparison with the spectrum of the starting molecule. The coupling constants of the galactosyl residue (J1,2 = 7.7 Hz, J2,3 = 9.9 Hz, J3,4 = 3.7 Hz, J4,5 = 0 Hz) indicated that the galactoside is a β-pyranoside. The linkage of the galactoside was determined by NOE difference spectra. By irradiation at Gal H-1, strong NOEs were observed at Xyl H-4, Gal H-3, and Gal H-5, and also, by irradiation at Xyl H-4, Gal H-1 was strongly enhanced. Thus, the structure of the product was identified as Galβ1-4Xylβ1-p-Nph (28).
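The mass assignment can be checked with a short calculation. The sketch below assumes standard average atomic masses and the molecular formula C11H13NO7 for p-nitrophenyl β-D-xylopyranoside; the helper names are illustrative and not part of the original analysis.

```python
# Standard average atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Na": 22.990}


def formula_mass(formula):
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def add_formulas(a, b):
    """Combine two {element: count} formulas into one."""
    combined = dict(a)
    for element, count in b.items():
        combined[element] = combined.get(element, 0) + count
    return combined


# Acceptor: p-nitrophenyl beta-D-xylopyranoside, C11H13NO7.
PNP_XYL = {"C": 11, "H": 13, "N": 1, "O": 7}
# Glycosylation adds one galactosyl residue, C6H10O5 (galactose minus water).
GAL_RESIDUE = {"C": 6, "H": 10, "O": 5}

product = add_formulas(PNP_XYL, GAL_RESIDUE)       # C17H23NO12
m = formula_mass(product)
m_plus_h = m + ATOMIC_MASS["H"]                    # protonated ion
m_plus_na = m + ATOMIC_MASS["Na"]                  # sodium adduct
print(f"M = {m:.1f}, M+H = {m_plus_h:.1f}, M+Na = {m_plus_na:.1f}")
```

The computed values round to 433, 434, and 456, which matches the reported signals for a product carrying exactly one added galactose.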
Restoration of GAG Synthesis by the Cloned XGalT-1 in a GAG-deficient Mutant Cell-To confirm that this enzyme is involved in the biosynthesis of GAGs in vivo, CHO mutant pgsB-761 (galactosyltransferase I-deficient) cells were transiently transfected with pMIKneo-XGalT-1. As shown in Fig. 3A, about 50% of the transfected cells were stained with mAb HeppSS-1, whereas the mock-transfected cells were negative, indicating that the cloned cDNA actually encodes galactosyltransferase I. This result was also confirmed by 35S-labeled GAGs with CSase or HSase digestion (Fig. 3B).
Expression of the XGalT-1 Gene-Northern blotting with cDNA as a probe revealed that the XGalT-1 gene was expressed in all human tissues examined (including the heart, brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas), and 1.8-kilobase pair transcripts were detected (data not shown).

DISCUSSION
Newly cloned XGalT-1 was identified as a possible member of the β4-galactosyltransferase family in the BLAST analysis of EST using the cDNA sequence of C. elegans sqv-3. sqv-3 was identified as one of the genes possibly encoding the components of a conserved glycosylation pathway and required for vulval invagination (9,17). Because of the similarity of its amino acid sequence to the cloned mammalian β4-galactosyltransferases, sqv-3 seemed to be a β1,4-galactosyltransferase that creates the galactose β1,4-N-acetylglucosamine linkage (9). The predicted amino acid sequence of the new gene had a higher homology to sqv-3 than to other mammalian β4-galactosyltransferases, suggesting that the gene encodes a β4-galactosyltransferase distinct from known β4-galactosyltransferases involved in the synthesis of Galβ1,4GlcNAc (or Glc) structures. As expected, the substrate specificity analysis revealed that the new gene encodes XGalT-1, involved in the biosynthesis of the GAG-protein linkage region of proteoglycans.
XGalT-1 was studied by Rodén's group (26,27) and was further characterized in the study of CHO mutants defective in galactosyltransferase I (10) and in the study of clinical cases (29). The substrate specificities and storage stability were similar to the characteristics of the XGalT-1 enzyme analyzed in this study. The Km value for p-Nph-β-D-Xyl reported (29) was also in good agreement with our results. All of these data, in addition to the results with the mutant CHO cells and those with FAB-MS and NMR, supported the identity of the new gene as human galactosyltransferase I.
Northern blotting showed a ubiquitous expression pattern, indicating the universal importance of the gene product. Since the activating and inhibitory effects of proteoglycans on cell proliferation have been elucidated (30,31), important roles of GAGs on proteoglycans have been increasingly recognized. The phenotypes of C. elegans defective in the sqv-3 gene strongly suggested critical roles of its homolog in the morphogenesis and development of mammalian tissues. Thus, the linkage between the genetic study of C. elegans and mammalian glycobiology research would promote further understanding of the biological significance of GAGs on proteoglycans.