Molecular Cloning and Characterization of a Novel Human β1,4-N-Acetylgalactosaminyltransferase, β4GalNAc-T3, Responsible for the Synthesis of N,N′-Diacetyllactosediamine, GalNAcβ1–4GlcNAc*

We found a novel human glycosyltransferase gene carrying a hypothetical β1,4-glycosyltransferase motif during a BLAST search, and we cloned its full-length open reading frame by using the 5′-rapid amplification of cDNA ends method. It encodes a type II transmembrane protein of 999 amino acids with homology to chondroitin sulfate synthase in its C-terminal region (GenBank™ accession number AB089940). Its putative orthologous gene was also found in mouse (accession number AB114826). The truncated form of the human enzyme was expressed in HEK293T cells as a soluble protein. The recombinant enzyme transferred GalNAc to GlcNAc β-benzyl. The product was deduced to be GalNAcβ1–4GlcNAcβ-benzyl based on mass spectrometry and NMR spectroscopy. We renamed the enzyme β1,4-N-acetylgalactosaminyltransferase-III (β4GalNAc-T3). β4GalNAc-T3 effectively synthesized N,N′-diacetylgalactosediamine, GalNAcβ1–4GlcNAc, at non-reducing termini of various acceptors derived not only from N-glycans but also from O-glycans. Quantitative real time PCR analysis showed that its transcript was highly expressed in stomach, colon, and testis. As some glycohormones contain N,N′-diacetylgalactosediamine structures in their N-glycans, we examined the ability of β4GalNAc-T3 to synthesize N,N′-diacetylgalactosediamine structures in N-glycans on a model protein. When fetal calf fetuin treated with neuraminidase and β1,4-galactosidase was utilized as an acceptor protein, β4GalNAc-T3 transferred GalNAc to it. Furthermore, the majority of the signal from GalNAc disappeared on treatment with glycopeptidase F. These results suggest that β4GalNAc-T3 could transfer GalNAc residues, producing N,N′-diacetylgalactosediamine structures at least in N-glycans and probably in both N- and O-glycans.

* This work was performed as part of the R & D Project of the Industrial Science and Technology Frontier Program (R & D for Establishment and Utilization of a Technical Infrastructure for Japanese Industry) supported by the New Energy and Industrial Technology Development Organization. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AB089940 and AB114826.
The bovine β4Gal-T1, one of the best understood glycosyltransferases (17,18), was crystallized, and its structure was analyzed (19-21). An acidic short sequence, GWGGED, which is shared by six β4Gal-Ts and three β4GalNAc-Ts, has been recognized to contain the general catalytic base of the family. Recently, Ramakrishnan and Qasba (22) reported that Tyr-289 of bovine β4Gal-T1 is essential for donor binding. Elimination of a hydrogen bond by mutating the Tyr-289 residue to Leu, Ile, or Asn enhances the GalNAc-T activity. One GalNAc-T of Caenorhabditis elegans, which can synthesize GalNAcβ1-4GlcNAc (N,N′-diacetyllactosediamine, LacdiNAc), has been cloned and named Ceβ4GalNAcT (23). Ceβ4GalNAcT has Ile at the same position as Tyr-289 of β4Gal-T1, and transfers GalNAc to a variety of acceptor substrates with terminal β-linked GlcNAc, resulting in the synthesis of the LacdiNAc structure.
LacdiNAc, a special structure in human N-glycans, has been found in certain glycoproteins and glycohormones, such as lutropin (LH) (24), thyrotropin (TSH) (25), glycodelin-A (26,27), and tissue factor pathway inhibitor (TFPI) produced in human embryonic kidney (HEK) 293 cells (28). LH is a pituitary glycoprotein hormone that is essential for the regulation of follicular maturation, ovulation, and the secretion of estradiol and progesterone. Both LH and follicle-stimulating hormone are synthesized in the same cells, i.e. gonadotrophs of the anterior pituitary; however, the terminal structures of their N-glycans differ. LH has SO4-4GalNAcβ1-4GlcNAcβ1-2Man at the non-reducing terminus of N-glycans, whereas follicle-stimulating hormone has NeuAcα2-3/6Galβ1-4GlcNAcβ1-2Man (29,30). The SO4-4GalNAcβ1-4GlcNAcβ1-2Man structure in LH is captured by a receptor in hepatic endothelial cells and Kupffer cells, resulting in the rapid removal of LH from blood (31). There are reports that the GalNAc-T responsible for the synthesis of LacdiNAc recognizes the amino acid sequence of the protein in addition to the carbohydrate structure. In the case of LH, the PLRXKK sequence on the N-terminal side of a glycosylated Asn is essential for recognition by this enzyme (32,33).
It has been reported that some cell lines can synthesize LacdiNAc structures in N-glycans (34-38). Human protein C (HPC) expressed in HEK293 cells shows stronger anticoagulant activity than plasma HPC. The HPC from HEK293 cells has the LacdiNAc structure in its N-glycans, whereas plasma HPC does not (34). LacdiNAc may contribute to the increased anticoagulant activity of the HPC produced by HEK293 cells (34). In HT-1080 fibrosarcoma cells, LacdiNAc, sialylated LacdiNAc, and fucosylated LacdiNAc have been found (36). Thus, certain cells can produce the LacdiNAc structure, which may confer specific functions on carrier proteins.
Ceβ4GalNAcT of C. elegans is highly homologous, and probably orthologous, to human β4Gal-T1 (23). However, the human β4GalNAc-T gene(s) responsible for the synthesis of LacdiNAc have not been cloned. Although such a gene may not be orthologous to Ceβ4GalNAcT, it should contain the motif shared by all the β1,4-glycosyltransferase (β4GT) family members. Therefore, we searched for a novel glycosyltransferase utilizing the amino acid sequences of the β4GT family and paid particular attention to the GWGGED motif of β4GT. In this study, we cloned a novel human β4GalNAc-T cDNA, the product of which showed catalytic activity for the synthesis of LacdiNAc structures in N-glycans.

EXPERIMENTAL PROCEDURES
Isolation of Human β4GalNAc-T3 cDNA-We performed a BLAST search of the GenBank™ data base by using β4GT motifs as query sequences and identified a genomic DNA with accession number AC006205, which was estimated to contain a partial open reading frame (ORF) using GENSCAN software (39) and which showed a high level of homology to the C-terminal region of the β4GT family. To obtain the complete ORF, the four-step 5′-rapid amplification of cDNA ends (5′-RACE) method was employed using a Marathon-Ready™ cDNA amplification kit (Clontech, Palo Alto, CA) and eight reverse primers, the nucleotide sequences of which are summarized in Table I. The sequences of the DNA fragments obtained by the 5′-RACE method were determined using a DYEnamic ET Terminator Cycle sequencing kit (Amersham Biosciences). Finally, a cDNA sequence encoding the full-length ORF was obtained by PCR using the Marathon-Ready™ cDNA of human stomach (Clontech) as a template.
Construction and Purification of Human and Mouse β4GalNAc-T3 Proteins Fused with FLAG Peptides-The putative catalytic domain of human β4GalNAc-T3 (amino acids 57-999) was expressed as a secreted protein fused with a FLAG peptide in HEK293T cells. An ~2.7-kb DNA fragment was amplified by PCR using the Marathon-Ready™ cDNA derived from human stomach as a template and two primers, 5′-GGAATTCGAGGTACGGCAGCTGGAGAGAA-3′ and 5′-ACGCGTCGACCTACAGCGTCTTCATCTGGCGA-3′. The amplified fragment was digested with the restriction endonucleases EcoRI and SalI and then inserted into pcDNA3.1 (Invitrogen). Following digestion of the vector with restriction endonucleases EcoRI and PmeI, the 2.9-kb fragment was inserted into the EcoRI-EcoRV site of pFLAG-CMV-1 (Sigma) to construct pCMV-β4GalNAc-T3. The catalytic domain of β4GalNAc-T3 was expressed in HEK293T cells. A 50-ml volume of culture medium was mixed with anti-FLAG M1 antibody resin (Sigma) and incubated with rotation at 4°C overnight. The resin was washed twice with 50 mM Tris-buffered saline (50 mM Tris-HCl, pH 7.4, and 150 mM NaCl) containing 1 mM CaCl2 and suspended in 100 μl of the assay buffer described below. The mouse β4GalNAc-T3 (mβ4GalNAc-T3) gene encoding its putative catalytic domain (amino acids 57-986) was amplified with two primers, 5′-CCCAAGCTTCGGCCCAGGCCGGCGGGAACC-3′ and 5′-GGAATTCTCACGGCATCTTCATTTGGCGA-3′, by using the cDNA derived from mouse stomach as a template. The amplified 2.7-kb fragment was digested with the endonucleases HindIII and EcoRI, and the digested fragment was then inserted into pFLAG-CMV-1.
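As a quick consistency check on the cloning strategy above, the following minimal sketch scans the four expression primers for the recognition sequences of the restriction endonucleases named in the text. The primer sequences are copied from this section; the recognition sequences (EcoRI GAATTC, SalI GTCGAC, HindIII AAGCTT) are the standard ones and are not stated in the text, and the dictionary keys and function name are illustrative only.

```python
# Standard recognition sequences for the enzymes named in the text
# (assumed here; the paper itself does not spell them out).
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC", "HindIII": "AAGCTT"}

# Expression primers as printed in the text, written 5' to 3'.
# The key names are illustrative labels, not names used by the authors.
PRIMERS = {
    "human_forward": "GGAATTCGAGGTACGGCAGCTGGAGAGAA",
    "human_reverse": "ACGCGTCGACCTACAGCGTCTTCATCTGGCGA",
    "mouse_forward": "CCCAAGCTTCGGCCCAGGCCGGCGGGAACC",
    "mouse_reverse": "GGAATTCTCACGGCATCTTCATTTGGCGA",
}


def sites_in(primer):
    """Return the sorted names of the listed enzymes whose site occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


if __name__ == "__main__":
    for label, sequence in PRIMERS.items():
        print(label, sites_in(sequence))
```

Under these assumptions each primer carries exactly one of the sites, matching the enzyme pairs used for digestion (EcoRI and SalI for the human fragment, HindIII and EcoRI for the mouse fragment).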
For the reaction in the GalNAc-T assay, 50 mM MES buffer (pH 6.5) containing 0.1% Triton X-100, 1 mM UDP-GalNAc, 10 mM MnCl2, and 500 μM acceptor substrate was used. A 10-μl volume of enzyme solution was added to each 20-μl reaction mixture, which was incubated at 37°C for various periods. After the incubation, the mixture was filtered with an Ultrafree-MC column (Millipore, Bedford, MA), and a 10-μl aliquot was subjected to reversed phase-high performance liquid chromatography (HPLC) on an ODS-80Ts QA column (4.6 × 250 mm; Tosoh, Tokyo, Japan). 0.1% trifluoroacetic acid/H2O with 12% acetonitrile was used as a running solution. An ultraviolet spectrophotometer (absorbance at 210 nm), SPD-10A VP (Shimadzu, Kyoto, Japan), was used for detection of the peaks. Pyridylamino-labeled oligosaccharides as acceptor substrates were added to the reaction mixtures at 50 nM. For the analyses of the products derived from these oligosaccharides, 100 mM acetic acid/triethylamine, pH 4.0, was used as a running solution, and the products were eluted with a 30-70% gradient of 1% 1-butanol in running solution at a flow rate of 1.0 ml/min at 55°C.
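The concentrations above translate into absolute amounts per reaction by simple arithmetic (mM × μl = nmol). The sketch below works this through, assuming that the stated concentrations are final concentrations in the 20-μl reaction mixture; that reading, and all names in the code, are assumptions made for illustration.

```python
REACTION_VOLUME_UL = 20.0  # reaction volume stated in the text

# Concentrations from the text, all expressed in mM.
# Assumption: these are final concentrations in the 20-ul mixture.
# The PA-oligosaccharide (50 nM) is used instead of, not together with,
# the 500 uM monosaccharide acceptor.
COMPONENTS_MM = {
    "UDP-GalNAc": 1.0,
    "MnCl2": 10.0,
    "acceptor substrate": 0.5,       # 500 uM
    "PA-oligosaccharide": 50e-6,     # 50 nM
}


def amount_nmol(conc_mm, volume_ul):
    """Amount of solute in nmol: (mmol/l) * (ul) = nmol."""
    return conc_mm * volume_ul


AMOUNTS_NMOL = {
    name: amount_nmol(conc, REACTION_VOLUME_UL)
    for name, conc in COMPONENTS_MM.items()
}

if __name__ == "__main__":
    for name, nmol in AMOUNTS_NMOL.items():
        print(f"{name}: {nmol:g} nmol")
```

On this reading, each reaction contains 20 nmol of donor and 10 nmol of monosaccharide acceptor, whereas a pyridylamino-labeled oligosaccharide is present at only 1 pmol. This is the concentration difference that later makes the efficiencies in Tables II and IV not directly comparable.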
Determination of Products by β4GalNAc-T3 with Mass Spectrometry (MS)-An additional peak obtained with UDP-GalNAc and GlcNAcβ-O-Bz by reversed phase chromatography was isolated and analyzed by matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) MS (Reflex IV, Bruker Daltonics, Billerica, MA). The products (10 pmol) were dried, dissolved in 1 μl of H2O, and applied. 2,5-Dihydroxybenzoic acid was used for the matrix.
Determination of Products by β4GalNAc-T3 with 1H NMR Spectra-The reaction product (200 μg) was dissolved in 150 μl of D2O using a micro cell and used as a sample for 1H NMR experiments. One-dimensional and two-dimensional 1H NMR spectra were recorded with DMX750 (Bruker, Germany, 750.13 MHz for 1H nucleus) and ECA800 (JEOL, Tokyo, Japan, 800.14 MHz for 1H nucleus) spectrometers at 25°C. The methylene proton of a benzyl group in a higher field (4.576 ppm) was used as a reference for the 1H NMR chemical shifts.

TABLE I
Series / primer               Sequence
1st series / 1st primer       5′-CAACAGTTCAAGCTCCAGGAGGTA-3′
1st series / nested primer    5′-CTGACGCTTTTCCACGTTCACAAT-3′
2nd series / 1st primer       5′-CACCCCGTCTCTGCTCTGCGAT-3′
2nd series / nested primer    5′-GTCTTCCTGGGGCTGTCACCA-3′
3rd series / 1st primer       5′-CACCTCATCCATCTGTAGGAACGT-3′
3rd series / nested primer    5′-CTGTCGCCATGCAACTTCCACGT-3′
4th series / 1st primer       5′-AATGTCGTGGTCCTCGAGGCTCA-3′
4th series / nested primer    5′-GATGGTAGAACTGGAGGTGTGGAT-3′

Quantitative Analysis of β4GalNAc-T3 Transcripts in Human Tissues by Real Time PCR-For the quantification of human β4GalNAc-T3 transcripts, we employed the real time PCR method, as described in detail previously (40,41). Total RNA of various human tissues was purchased from Clontech. Total RNA of peripheral blood mononuclear cells and various cell lines was extracted with an RNeasy kit (Qiagen, Hilden, Germany). cDNA templates were synthesized from the total RNA with a SuperScript™ II first-strand synthesis system (Invitrogen). Standard curves for the endogenous control, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) cDNA, were generated by serial dilution of a pCR2.1 (Invitrogen) DNA containing the GAPDH gene. The primer set and the probe for the β4GalNAc-T3 gene were as follows: the forward primer was 5′-CTGGACTTCAGGCTGGCATAG-3′, the reverse primer was 5′-GAATGGCATCGATGACTCCAG-3′, and the probe was 5′-CTCGTGAAGGACCCGCA-3′ with a minor groove binder (42). PCR products were continuously measured with an ABI PRISM 7700 Sequence Detection System (Applied Biosystems, Foster City, CA). The relative amount of the transcript was normalized to the amount of the GAPDH transcript in the same cDNA.
Detection of GalNAc on Asialo/Agalacto-FCF-Fetal calf fetuin (FCF), neuraminidase, β1,4-galactosidase, and glycopeptidase F were purchased from Sigma, Nacalai Tesque (Kyoto, Japan), Calbiochem, and Takara, respectively. Asialo/agalacto-FCF was prepared from 200 μg of FCF by incubating with 4 microunits of neuraminidase and 12 microunits of β1,4-galactosidase at 37°C for 16 h. The transfer of GalNAc by β4GalNAc-T3 to glycoprotein was performed in 20 μl of a standard reaction mixture containing 50 μg of asialo/agalacto-FCF produced by glycosidase treatment. After incubation at 37°C for 16 h, 5 μl of the reaction mixture was digested with glycopeptidase F (GPF) according to the manufacturer's instructions. For detection of transferred GalNAc, horseradish peroxidase (HRP)-conjugated lectin, Wisteria floribunda agglutinin (WFA) (EY Laboratories, San Mateo, CA), was used. A 1-μl aliquot of reaction mixture was subjected to 12.5% SDS-PAGE, transferred to nitrocellulose membrane (Schleicher & Schuell), and stained with 0.1% HRP-conjugated WFA lectin. The signals were detected using enhanced chemiluminescence (ECL) and Hyperfilm ECL (Amersham Biosciences).

RESULTS
Determination of Nucleotide Sequence of β4GalNAc-T3 cDNA and Its Putative Amino Acid Sequence-We determined a novel full-length cDNA sequence by the 5′-RACE method and registered it in the GenBank™ data base with the accession number AB089940. The nucleotide sequence and putative amino acid sequence are shown in Fig. 1. The gene was located at 12p13.3 and composed of at least 20 exons. This open reading frame consisted of 2,997 bp encoding a predicted 999-amino acid protein with a typical type II topology, which is common in glycosyltransferases. It contained four potential N-glycosylation sites and a DXH sequence, which is conserved in UDP-GalNAc:polypeptide-GalNAc-Ts (pp-GalNAc-T) and is thought to participate in divalent cation binding. In amino acid sequence, β4GalNAc-T3 was more similar to β4GalNAc-Ts than to β4Gal-Ts in the human β4GT family (Fig. 2). β4GalNAc-T3 was included in a GalNAc-T cluster in the phylogenetic tree, not in a Gal-T cluster (Fig. 2). The glycosyltransferase most homologous to β4GalNAc-T3 was chondroitin sulfate synthase 1 (CSS1), which is an enzyme polymerizing chondroitin, (GalNAcβ1,4GlcUAβ1,3)n. The two showed 29.3% identity in the C-terminal 140-amino acid region. In this region, there was the β4GT motif, GWGGED, which is highly conserved in the β4GT family except for β4GalNAc-T1 and -T2. Furthermore, a gene possessing a sequence quite similar to that of human β4GalNAc-T3 was found in mouse (82.2% identity in amino acid sequence) as shown in Fig. 3. This mouse gene is probably orthologous to human β4GalNAc-T3 and was named mβ4GalNAc-T3. It encoded a hypothetical 986-amino acid protein carrying four potential N-glycosylation sites, a DXH sequence, and a β4GT motif, the same as human β4GalNAc-T3.
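The sequence features described above (a GWGGED motif, a DXH sequence, and an ORF length consistent with the protein length) are the kind of thing a short script can verify once a sequence is in hand. The sketch below is illustrative only: the toy peptide is invented for the example and is not the β4GalNAc-T3 sequence, and the function and constant names are hypothetical.

```python
import re

# Motifs named in the text, written as regular expressions.
# "X" in DXH stands for any residue, hence the "." wildcard.
B4GT_MOTIF = "GWGGED"
DXH_MOTIF = "D.H"


def motif_positions(protein, motif):
    """Return 1-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=(" + motif + "))")
    return [match.start() + 1 for match in lookahead.finditer(protein)]


def codons_in_orf(orf_length_bp):
    """Return the number of whole codons, or raise if the length is not a multiple of 3."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_bp // 3


# Invented toy peptide for demonstration; NOT the real enzyme sequence.
TOY_PEPTIDE = "MKLDAHPPGWGGEDLLK"

if __name__ == "__main__":
    print("GWGGED at:", motif_positions(TOY_PEPTIDE, B4GT_MOTIF))
    print("DXH at:", motif_positions(TOY_PEPTIDE, DXH_MOTIF))
    print("codons in a 2,997-bp ORF:", codons_in_orf(2997))
```

The codon count confirms that 2,997 bp corresponds to exactly 999 codons, in agreement with the predicted 999-amino acid protein.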
Substrate Specificity of β4GalNAc-T3-We determined the substrate specificity of the truncated soluble β4GalNAc-T3 expressed in HEK293 cells. By utilizing a variety of UDP donors and monosaccharide acceptors with a pNp or Bz group, donor and acceptor substrates for β4GalNAc-T3 were screened with HPLC. As shown in Fig. 4, a peak (P) appeared at 28.4 min (Fig. 4B) in addition to the acceptor substrate peak (S) at 23.2 min (Fig. 4, A and B) when UDP-GalNAc and GlcNAcβ-O-Bz were used as a donor and an acceptor substrate, respectively. The peak P in Fig. 4B (Table II, acceptor numbers 1-13). β4GalNAc-T3 showed no activity toward chondroitin-related acceptors containing GlcUA at their non-reducing termini, although its amino acid sequence was similar to that of CSS1 (data not shown). mβ4GalNAc-T3 exhibited the same substrate specificity as human β4GalNAc-T3 did (data not shown).
Determination of β4GalNAc-T3 Product with 1H NMR-1H NMR spectroscopy was performed to determine the newly formed glycosidic linkage of the β4GalNAc-T3 product. A one-dimensional 1H NMR spectrum of the β4GalNAc-T3 product is shown in Fig. 5. In the NMR spectra, signal integrals (not shown; 5 phenyl protons of Bz, 2 methylene protons of Bz, 2 anomeric protons, 12 sugar protons other than anomeric protons, and 6 methyl protons of two N-acetyl groups) corresponded well with the structure of GalNAc-GlcNAc-O-Bz. As shown in Fig. 5 and in Table III, two anomeric protons revealed resonances at a very close magnetic field with a coupling constant (J1,2) larger than 8 Hz. This indicates that the two pyranoses in the sample are in the β-gluco-configuration. All 1H signals could be assigned after high-resolution detection in COSY, TOCSY, and NOESY experiments. The anomeric resonance in the lower field showed NOE with two methylene protons of the benzyl group in the sample (not shown); on the other hand, the anomeric resonance in the higher field did not show NOE with methylene protons (not shown). These results indicate that the anomeric resonance in the lower field arises from the anomeric proton of the substrate pyranose (β-GlcNAc, defined as A) and that the anomeric resonance in the higher field corresponds to the anomeric proton of the transferred pyranose (β-GalNAc, defined as B). The chemical shifts and coupling constants of the sugar part of the sample are shown in Table III. The chemical shift and signal splitting of the B-4 resonance were characteristic of the β-Gal configuration (43), and the order of the chemical shifts of the A1-A6 protons was characteristically similar to the observed spectrum of β-GlcNAc in LNnT (Galβ1-4GlcNAcβ1-3Galβ1-4Glc). As shown in Fig. 6, a weak NOE cross-peak between B1 and A4 and very weak NOE cross-peaks between B1 and two A6 were observed in addition to strong intra-residue NOEs between B1 and B5 and between A1 and A5. These results suggest the existence of a β1-4 linkage between the two pyranoses. The results of the NMR experiments thus clearly indicated that the product of β4GalNAc-T3 is GalNAcβ1-4GlcNAc-O-Bz.
Comparison of Acceptor Substrates-To investigate the specificity for acceptor substrates, N- and O-glycans containing GlcNAc at their non-reducing termini were utilized. As shown in Tables II and IV, a GalNAc residue could be transferred to all acceptor substrates examined, although the transfer efficiency differed. The most efficient acceptor among the O-glycan structures was core 6-pNp (GlcNAcβ1-6GalNAcα-pNp), which accepted GalNAc with a 2.2-fold higher efficiency than GlcNAcβ-Bz. However, core 2-pNp, whose non-reducing terminal structure is similar to that of core 6-pNp, was a less efficient acceptor than core 3-pNp. On the other hand, the most efficient acceptor substrate among the N-glycans examined was a non-fucosylated bi-antennary one (acceptor substrate number 1 in Table IV). The efficiency decreased as the number of antennae increased (data not shown). However, the presence of Fuc residues had little effect (numbers 1 and 2, 3 and 4, and 5 and 6) on the activity of β4GalNAc-T3. In contrast, a difference in transfer efficiency was observed between the two antennae, i.e. the GlcNAcβ1-2Manα1-3 antenna was preferred over the GlcNAcβ1-2Manα1-6 antenna as an acceptor for β4GalNAc-T3 (numbers 3 and 5 and 4 and 6).
Quantitative Analysis of the β4GalNAc-T3 Transcripts in Human Tissues and Cell Lines by Real Time PCR-We determined the tissue distribution and expression levels of the human β4GalNAc-T3 transcript by the real time PCR method. The expression levels of β4GalNAc-T3 in various tissues are shown relative to the GAPDH transcript (Fig. 7A). The transcript was highly expressed in stomach, colon, and testis, with relatively low levels found in other tissues. In the case of cell lines, the expression levels were comparatively low; however, the transcript was certainly expressed in SW1116 (colon cancer), KATO III (stomach cancer), HEK293, and some other cell lines (Fig. 7B).
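The normalization to GAPDH described here rests on standard curves generated by serial dilution of plasmid DNA. A minimal sketch of that calculation follows, assuming the usual linear relationship between threshold cycle (Ct) and log10 of the template copy number. The Ct-based model, the function names, and every number in the example are assumptions for illustration; none of them are data or procedures taken from this study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(target_ct, target_curve, reference_ct, reference_curve):
    """Target copies divided by reference (e.g. GAPDH) copies in the same cDNA."""
    target = copies_from_ct(target_ct, *target_curve)
    reference = copies_from_ct(reference_ct, *reference_curve)
    return target / reference


if __name__ == "__main__":
    # Hypothetical dilution series: plasmid copies and the Ct measured for each.
    dilution_copies = [1e2, 1e3, 1e4, 1e5]
    dilution_ct = [33.0, 29.7, 26.4, 23.1]
    curve = fit_standard_curve(dilution_copies, dilution_ct)
    # Hypothetical sample: same curve used for target and reference for brevity.
    print(relative_expression(29.7, curve, 23.1, curve))
```

In practice a separate curve would be fitted for each gene, as the legend to Fig. 7 indicates was done for β4GalNAc-T3 and GAPDH; the example reuses one curve only to keep the sketch short.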

DISCUSSION
We found a novel human glycosyltransferase carrying the β4GT motif, WGGED, in its C-terminal region. From its similarity in amino acid sequence to CSS1, which transfers GalNAc to GlcA with a β1,4-linkage (10), we expected it to have β4GalNAc-T activity. As expected, it transferred GalNAc to GlcNAc and synthesized the LacdiNAc structure. Its ortholog was also found in mouse. Kawar et al. (23) have reported the cloning of a nematode LacdiNAc synthase, Ceβ4GalNAc-T, which is probably an ortholog of human β4Gal-T1. β4GalNAc-T3 in the present study is a novel glycosyltransferase, and this is the first report of the molecular cloning and characterization of mammalian LacdiNAc synthases, β4GalNAc-T3 and mβ4GalNAc-T3.
When β4GalNAc-T3 is compared with Ceβ4GalNAc-T, the greatest difference is the length, i.e. β4GalNAc-T3 and Ceβ4GalNAc-T consist of 999 and 383 amino acids, respectively (Fig. 1). Furthermore, Ceβ4GalNAc-T has relatively strong homology (35.5% identity) with human β4Gal-T1 (23), and the two are probably orthologous. However, β4GalNAc-T3 was more similar to CSS1 than to any other β4GT (Fig. 2). Although the C-terminal sequence of β4GalNAc-T3 had a conserved β4GT motif, WGGED, the N-terminal sequence was unique. As shown in Fig. 3, no protein homologous with the N-terminal sequence of β4GalNAc-T3 was found in any data bases by a BLAST search except for mβ4GalNAc-T3. In the middle region, the homology between β4GalNAc-T3 and mβ4GalNAc-T3 was comparatively low, although both enzymes contain many prolines and acidic amino acids, Glu and Asp, in this region (Fig. 3). The functions of the N-terminal region and the middle region, which compose the unique amino acid sequence, remain to be elucidated.
On screening with monosaccharide acceptors and UDP-sugar donors, both β4GalNAc-T3 and mβ4GalNAc-T3 showed GalNAc-T activity toward GlcNAc (Fig. 4). Although the amino acid sequence of β4GalNAc-T3 showed a high level of homology with that of CSS1, no GalNAc-T activity toward GlcUA was observed. The GalNAc-GlcNAc structure has been found as LacdiNAc in humans, and this enzyme had a motif conserved in the β4GT family; therefore, we expected that it might be a β4GalNAc-T. The results of NMR spectroscopy revealed that the product was GalNAcβ1-4GlcNAc, LacdiNAc, as expected (Figs. 5 and 6). β4GalNAc-T3 did not show any β3GT activity despite having homology with CSS1, which has both β4 and β3GT activities. However, this is not unexpected because the β3GT motif was not found in β4GalNAc-T3.
TABLE III, footnote a: The chemical shifts were set as the higher field signal of the benzyl methylene protons in ppm.
Ramakrishnan and Qasba (22) reported that the elimination of a hydrogen bond by mutating the Tyr-289 residue to Leu, Ile, or Asn in bovine β4Gal-T1 enhances the GalNAc-T activity in place of Gal-T activity. In fact, Ceβ4GalNAc-T possesses Ile-257 at the same position as Tyr-289 in bovine β4Gal-T1 (23). The primary sequence of β4GalNAc-T3 is not very homologous with that of bovine β4Gal-T1 or Ceβ4GalNAc-T; however, Asn-927, Leu-931, or Leu-932 might be an important residue for donor binding, judging from the distance between these amino acids and the conserved WGGED sequences.
TABLE IV. Substrate specificity of β4GalNAc-T3
FIG. 6. Two-dimensional NMR spectrum of the β4GalNAc-T3 product. Two-dimensional NOESY spectrum (800.14 MHz in JEOL ECA800) of the reaction product. NOE cross-peaks of the reaction product were very weak and detected in both the positive (B1-A4 and two B1-A6) and negative phase (A1-A5 and B1-B5).
We analyzed the acceptor preference of β4GalNAc-T3 for oligosaccharides that were derived from N- and O-glycans. The levels of efficiency in Tables II and IV are not comparable, because the concentrations of acceptor substrates are different. However, it was demonstrated that β4GalNAc-T3 could transfer GalNAc to any of the non-reducing terminal GlcNAcβ residues in vitro. Among O-glycan-derived substrates, core 6-pNp was the most efficient acceptor. The core 6 structures and core 6 synthesizing activity have been found in ovarian cyst fluid, seminal fluid, and meconium (44-46). As shown in Fig. 7, the β4GalNAc-T3 gene was highly expressed in testis; therefore, there is some possibility that carrier proteins of β4GalNAc-T products on the core 6 structure exist in testis. Core 2 and core 3 structures have been found in stomach. β3Gn-T6, which synthesizes core 3, is highly expressed in stomach as reported in our previous paper (40). Although the GalNAcβ1-4core 3 and GalNAcβ1-4core 2 structures have not been found, these O-glycans would be candidates for acceptors of β4GalNAc-T3 in vivo. On the other hand, some human glycoproteins are known to contain the LacdiNAc structure in their N-glycans; therefore, these N-glycans are possible native substrates for β4GalNAc-T3 in vivo. β4GalNAc-T3 could transfer GalNAc residues to N-glycans in asialo/agalacto-FCF (Fig. 8). It is known that FCF is a glycoprotein containing both N- and O-glycans (47). FCF is a convenient acceptor for the screening of N- and O-glycosylation. However, FCF is probably not a physiological acceptor for β4GalNAc-T3, because there is no report that it has the LacdiNAc structure. The major sugar chains in FCF are O-glycans, NeuAcα2-3Galβ1-3GalNAc and NeuAcα2-3Galβ1-3(NeuAcα2-6)GalNAc, each of which accounts for ~30% of all glycans, with another 10% composed of NeuAcα2-3Galβ1-4GlcNAcβ1-6(NeuAcα2-3Galβ1-3)GalNAc (48). FCF also possesses bi- and tri-antennary N-glycans. The results in Fig. 8 demonstrated that β4GalNAc-T3 efficiently transferred GalNAc to the N-glycan of asialo/agalacto-FCF because the strong signal probed with WFA, which binds to GalNAc, almost disappeared after the GPF treatment.
The molecular size of FCF was reduced to ~45 and 50 kDa because of the loss of N-glycans. However, two bands having faint positive signals for WFA remained at the 45- and 50-kDa positions even after the GPF treatment (lane 6). This indicated that GalNAc was transferred to the O-glycan of asialo/agalacto-FCF that had lost its N-glycans. One of the O-glycans, NeuAcα2-3Galβ1-4GlcNAcβ1-6(NeuAcα2-3Galβ1-3)GalNAc, could be an acceptor by exposing a non-reducing terminal GlcNAc after neuraminidase and β1,4-galactosidase treatments. Both N-glycan and O-glycan in asialo/agalacto-FCF are possible acceptors for β4GalNAc-T3.
FIG. 7. Quantitative real time PCR analysis of the β4GalNAc-T3 transcript in human tissues and cell lines. Standard curves for β4GalNAc-T3 and GAPDH were generated by serial dilution of each plasmid DNA. The expression level of the β4GalNAc-T3 transcript was normalized to that of the GAPDH transcript, which was measured in the same cDNAs from human tissues (A) and cell lines (B). Data were obtained from triplicate experiments and are indicated as the mean ± S.D. PBMC, peripheral blood mononuclear cells; GOTO and SCCH-26, neuroblastomas; T98G and U251, glioblastomas; PC-7, lung adenocarcinoma; PC-1 and EBC-1, lung squamous cells; A431, esophagus cancer; MKN45, KATO III, and HSC43, stomach cancers; Colo205, HCT15, LSC, LSB, SW480, and SW1116, colorectal cancers; HepG2, hepatocarcinoma; Capan-2, pancreas cancer; SW1736, thyroid cancer; HL-60, promyelocytic leukemia; Namalwa, B cell lymphoma; Daudi, B cell (Burkitt's) lymphoma; K562, erythroid leukemia; HEK293, embryonic kidney cell; HeLa and HeLa-S3, cervix cancers.
Two cell lines, HEK293T and HT1080, have been reported to produce the LacdiNAc structure in N-glycans (34-38). HPC and TFPI produced by HEK293 cells have LacdiNAc in their N-glycans (25,41). Quantitative real time PCR analyses revealed that the β4GalNAc-T3 transcript was highly expressed in HEK293 cells. This indicated that LacdiNAc in HPC and TFPI produced by HEK293 cells is probably synthesized by this enzyme.
In the human body, typical LacdiNAc structures have been found in N-glycans of LH (29,30), TSH (25), glycodelin-A (26,27), TFPI (28), and tenascin-R (49). LH and TSH are produced in the pituitary gland. Glycodelin-A and tenascin-R are mainly produced in the uterus and Purkinje cells in the molecular layer of the cerebellum. The expression levels of the β4GalNAc-T3 transcript in brain and uterus were very low as shown in Fig. 7. We do not know whether the LacdiNAc structures in these glycoproteins are synthesized by β4GalNAc-T3 or not. Very recently, we have cloned another gene that has high homology with the gene for β4GalNAc-T3. There is a possibility that both enzymes have similar types of activity, sharing roles in various tissues and cells but differing in tissue distribution.
FIG. 8. Determination of GalNAc in the asialo/agalacto-FCF with and without glycosidase treatments. Reaction mixtures containing 1.5 μg (for Coomassie staining) and 0.5 μg (for WFA lectin blotting) of asialo/agalacto-FCF were separated by SDS-PAGE and stained with Coomassie (lanes 1-3) or WFA lectin conjugated with HRP (lanes 4-6) as described under "Experimental Procedures." Untreated asialo/agalacto-FCF (lanes 1 and 4), asialo/agalacto-FCF reacted with β4GalNAc-T3 (lanes 2 and 5), and the same protein treated with glycopeptidase F after the β4GalNAc-T3 reaction (lanes 3 and 6) were used.