A Family of Human β3-Galactosyltransferases

BLAST analysis of expressed sequence tags (ESTs) using the coding sequence of a human UDP-galactose:β-N-acetyl-glucosamine β-1,3-galactosyltransferase, designated β3Gal-T1, revealed no ESTs with identical sequences but a large number with similarity. Three different sets of overlapping ESTs with sequence similarities to β3Gal-T1 were compiled, and complete coding regions of these genes were obtained. Expression of two of these genes in the baculovirus system showed that one represented a UDP-galactose:β-N-acetyl-glucosamine β-1,3-galactosyltransferase (β3Gal-T2) with kinetic properties similar to those of β3Gal-T1. Another gene represented a UDP-galactose:β-N-acetyl-galactosamine β-1,3-galactosyltransferase (β3Gal-T4) involved in GM1/GD1 ganglioside synthesis, and this gene was highly similar to a recently reported rat GD1 synthase (Miyazaki, H., Fukumoto, S., Okada, M., Hasegawa, T., and Furukawa, K. (1997) J. Biol. Chem. 272, 24794–24799). Northern analysis of mRNA from human organs with the four homologous cDNAs revealed different expression patterns. β3Gal-T1 mRNA was expressed in brain, β3Gal-T2 was expressed in brain and heart, and β3Gal-T3 and -T4 were more widely expressed. The coding regions for each of the four genes were contained in single exons. β3Gal-T2, -T3, and -T4 were localized to 1q31, 3q25, and 6p21.3, respectively, by EST mapping. The results demonstrate the existence of a family of homologous β3-galactosyltransferase genes.

The data base of expressed sequence tags (ESTs) is now estimated to contain sequence information from more than half of human genes; it therefore provides a unique source for identifying novel members of homologous gene families by conserved sequence motifs (1). The identification of novel genes by sequence similarity was used recently to identify a large family of homologous UDP-galactose:β-N-acetyl-glucosamine β-1,4-galactosyltransferases (β4Gal-T). At least six novel genes were found, of which four have been shown to represent functional β4Gal-Ts (2–5). Perhaps surprisingly, there are no sequence similarities between the β-4-galactosyltransferases and a putative UDP-galactose:β-N-acetyl-glucosamine β-1,3-galactosyltransferase gene submitted to GenBank in 1996 and here designated β3Gal-T1. The disaccharides Galβ1-3GlcNAcβ (type 1 chain) and Galβ1-4GlcNAcβ (type 2 chain) are core structures in glycosphingolipids and glycoproteins, where they occur in linear or branched repeated structures (6,7). The two core structures are differentially expressed in cells and organs (8) and synthesized by at least two independent galactosyltransferase activities (9,10). However, a glycosyltransferase may transfer to either C-3 or C-4 of β-GlcNAc (e.g. the α-3,4-fucosyltransferases (11)), suggesting that some enzymes may recognize common features of the two acceptor sites.
In the present study, we found that β3Gal-T1 is one member of a β3-galactosyltransferase family. The human EST sequence data base was used to identify several novel members of this gene family. The β3Gal-T1 gene was not found in the EST data base, but a large number of ESTs were identified that shared short sequence stretches with high similarity and had conserved cysteine residues. The full coding sequences of three of these genes were established; expression studies demonstrated that one gene encoded a new UDP-Gal:βGlcNAc β-1,3-galactosyltransferase (β3Gal-T2) and another encoded a UDP-Gal:βGalNAc β-1,3-galactosyltransferase (β3Gal-T4) with a different acceptor substrate specificity.

Identification of β3Gal-T1 Homologous Genes
Data base searches were performed with the reported coding sequence of human β3Gal-T1 (GenBank accession number E07739) using the tBLASTn algorithm against the dbEST data base at the National Center for Biotechnology Information, U.S.A., as described previously (2). Overlapping sequences were merged (Fig. 1), and the Unigene data base was used to select cDNA clones with the longest inserts and chromosomal assignments. EST cDNA clones were obtained from Genome Systems Inc.

* This work was supported by The Danish Cancer Society, the Velux Foundation, the Danish Medical Research Council, the Lundbeck Foundation, PECS/P/SAU/253/95, NIH 1 RO1 CA66234, RO1 CA41521, RO1 CA70740, and National Institute of Health Resource Center for Biomedical Complex Carbohydrates Grant NIH 5 P41 RR05351.
The nucleotide sequences reported in this paper have been submitted to the GenBank™/EBI Data Bank with accession numbers Y15060, Y15061, and Y15062.

Cloning and Sequencing of the Full Coding Sequence of β3Gal-T2
Four partly overlapping ESTs with approximately 360 bp of open reading frame and sequence similarity to the C-terminal sequence of β3Gal-T1 were identified (Fig. 1). Additional 5′ sequence was obtained by 5′ rapid amplification of cDNA ends using human fetal brain Marathon-Ready cDNA (CLONTECH) in combination with antisense primers EBER405 (5′-GGTGCATATCCTCGCATTAGG), EBER409 (5′-GGTGCTAGACTTTCATTGCCCC), and EBER412 (5′-TTCTTTCCAAATGTTCCGAAGG) for 35 cycles at 95°C, 45 s; 55°C, 15 s; 68°C, 3 min, using the Expand kit enzyme (Boehringer Mannheim). The rapid amplification of cDNA ends products were cloned into the BamHI site of pT7T3U19, and multiple clones were sequenced. The entire sequence was confirmed by sequencing genomic P1 clones. The composite sequence contained an open reading frame of 1266 bp encoding a putative protein with a type II domain structure (Fig. 2) and an overall sequence identity of approximately 42% to β3Gal-T1.
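As a quick sanity check on the primer sequences listed above, the following minimal Python sketch reports the length, GC content, and reverse complement of each antisense primer (the reverse complement being the sense-strand stretch the primer anneals to). The function names are illustrative only and are not part of the original methods.

```python
# Antisense primers used for 5' RACE of β3Gal-T2, as listed in the text (5'->3').
PRIMERS = {
    "EBER405": "GGTGCATATCCTCGCATTAGG",
    "EBER409": "GGTGCTAGACTTTCATTGCCCC",
    "EBER412": "TTCTTTCCAAATGTTCCGAAGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_count(seq: str) -> int:
    """Count G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_percent = 100.0 * gc_count(seq) / len(seq)
        print(f"{name}: {len(seq)} nt, GC {gc_percent:.0f}%, "
              f"sense strand 5'-{reverse_complement(seq)}")
```

The same helpers apply unchanged to any other primer written 5′ to 3′ with the letters A, C, G, and T.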

Cloning and Sequencing of the Full Coding Sequence of β3Gal-T3
Five partly overlapping ESTs with approximately 900 bp of open reading frame and sequence similarity to β3Gal-T1 were identified (Fig. 1). Additional 5′ sequence was obtained by 5′ rapid amplification of cDNA ends using primers EBER606 (5′-GCAGTTTGAATGCTCTCGAAGTGTG) and EBER612 (5′-AGCAGCAGGAGGCTCCATTTG) as described for β3Gal-T2 above. The composite sequence contained an open reading frame of 993 bp encoding a putative protein with a type II domain structure (Fig. 2) and an overall sequence identity of approximately 40% to β3Gal-T1 and 33% to -T2.

Cloning and Sequencing of the Full Coding Sequence of β3Gal-T4
Two overlapping EST clones with sequence similarity to the N-terminal sequence of β3Gal-T1 were identified (Fig. 1). Sequencing of the inserts revealed an open reading frame of 1134 bp potentially encoding a protein with a type II domain structure (Fig. 2) with an overall sequence identity of approximately 33% to β3Gal-T1.

Expression of β3-Galactosyltransferases in Sf9 Cells
The plasmids with pAcGP67 or pVL1193 were co-transfected with Baculo-Gold™ DNA (Pharmingen) as described previously (12). Recombinant baculoviruses were obtained after two successive amplifications in Sf9 cells grown in serum-containing medium. Controls included the pAcGP67-GalNAc-T3-sol (12). Standard assays were performed in 50 μl of total reaction mixtures containing 25 mM Tris (pH 7.5), 10 mM MnCl2, 0.25% Triton X-100, 100 μM UDP-[14C]Gal (2,300 cpm/nmol) (Amersham Pharmacia Biotech), and varying concentrations of acceptor substrates (Sigma) (see Table I for structures). Reaction products were quantified by Dowex-1 chromatography. Assays with hen egg ovalbumin (Sigma) were performed with the standard reaction mixture modified to contain 200 μM UDP-Gal, 54 mM NaCl, and 1 mg of ovalbumin. The transfer of galactose was evaluated after separation by filtration through Whatman GF/C glass fiber filters. Constructs that encoded soluble secreted enzymes (lacking the signal-anchor sequence) were assayed with 5-20 μl of culture supernatant from infected cells, whereas the full-length enzymes were assayed with 1% Triton X-100 homogenates of cells. Assays used for assessment of Km of acceptor substrates and donor substrates were modified to include 500 μM UDP-[14C]Gal (2,300 cpm/nmol) or 100 mM GlcNAcβ-benzyl. Assays with glycolipid acceptors were conducted as described previously (13) in reaction mixtures containing 2.5 μmol of HEPES buffer (pH 7.2), 1 μmol of MnCl2, 100 μg of TDOC (for β3Gal-T1 and -T2) or Triton CF-54 (for β3Gal-T4), 20 μg of acceptor glycolipid, 15 nmol of UDP-[14C]Gal (13,000 cpm/nmol), and enzyme in a total volume of 100 μl.
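The glycolipid assay above is specified in absolute amounts per 100 μl, whereas the standard assay is given in concentrations. The short Python sketch below shows the arithmetic that relates the two, together with the conversion of radioactive counts to nmol of galactose transferred via the stated specific activity. The function names and the example count values are hypothetical and serve only as illustration.

```python
def millimolar(amount_umol: float, volume_ul: float) -> float:
    """Concentration in mM: µmol per µl equals mol per litre, so scale by 1000."""
    return amount_umol / volume_ul * 1000.0


def nmol_transferred(cpm: float, specific_activity_cpm_per_nmol: float) -> float:
    """Convert product-associated counts to nmol of [14C]Gal transferred."""
    return cpm / specific_activity_cpm_per_nmol


if __name__ == "__main__":
    volume_ul = 100.0  # total volume of the glycolipid assay
    print("HEPES   (mM):", millimolar(2.5, volume_ul))     # 2.5 µmol
    print("MnCl2   (mM):", millimolar(1.0, volume_ul))     # 1 µmol
    print("UDP-Gal (mM):", millimolar(0.015, volume_ul))   # 15 nmol
    # Hypothetical count value, for illustration only.
    print("nmol Gal:", nmol_transferred(26000.0, 13000.0))
```

On this arithmetic the glycolipid assay corresponds to 25 mM HEPES, 10 mM MnCl2, and 150 μM UDP-Gal, the first two matching the buffer and Mn2+ concentrations of the standard assay.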

Characterization of the Products Formed with β3Gal-T2 and -T4
Terminal glycosylation of Lc3Cer (43) with β3Gal-T2 was performed in a reaction mixture consisting of 1 milliunit of β3Gal-T2 (specific activity determined with βGlcNAc-benzyl), 150 μg of Lc3Cer, 25 mM Tris (pH 7.4), 10 mM MnCl2, 50 μg of taurodeoxycholate, and 0.5 μmol of UDP-Gal in a final volume of 100 μl. The secreted form of β3Gal-T2 was partially purified by sequential DEAE and S-Sepharose chromatographies from serum-free medium as described previously (14). Terminal glycosylation of GM2 (Sigma) was performed with 50 μl of a 1:1 suspension of a membrane fraction prepared from High Five™ (Invitrogen) cells infected with the full coding expression construct of β3Gal-T4. Briefly, cells were harvested 2 days postinfection, lysed, and homogenized in 1.0% Triton CF-54, 150 mM cacodylate (pH 6.5), 10 mM MnCl2, and 2 mM EDTA. The extract was pelleted by low speed centrifugation (3,000 rpm for 10 min), and the supernatant from this was pelleted again by high speed centrifugation (30,000 rpm for 30 min). Most of the enzyme activity was retained in the high speed-pelleted fraction, which was used as the enzyme source. The reaction mixture included 250 μg of GM2, 150 mM cacodylate (pH 6.5), 10 mM MnCl2, 0.3% Triton CF-54, and 5 mM UDP-Gal in a final volume of 200 μl. The glycosylations were monitored by high performance TLC, and the products were purified on octadecyl silica cartridges (Bakerbond, J. T. Baker Inc.) and deuterium-exchanged as described previously (2). One-dimensional 1H NMR spectroscopy of the product of β3Gal-T2 was performed on a Bruker AMX-500 spectrometer (temperature, 308 K; spectral width, 5000 Hz acquired over 16,000 data points; relaxation delay, 2 s; solvent suppression by presaturation pulse). One-dimensional 1H NMR spectroscopy of the product of β3Gal-T4 was performed on a Varian Unity-INOVA 600 MHz spectrometer (temperature, 308 K; spectral width, 6000 Hz acquired over 16,000 data points; relaxation delay, 1.5 s; solvent suppression by presaturation pulse).

Northern Analysis
The human multiple tissue northern blot was obtained from CLONTECH and used once for the experiments shown. The soluble expression constructs were used as probes. Probes were random-primed-labeled using [α-32P]dCTP (Amersham) and an oligo labeling kit (Amersham). The blots were probed overnight at 42°C as described previously (12).
The chromosomal localizations of β3Gal-T2, -T3, and -T4 were determined using 3′ EST mapping data (National Center for Biotechnology Information). No ESTs corresponding to the available sequence of β3Gal-T1 were found; thus, the localization of this gene was not determined.

RESULTS
Identification and Cloning of Human β3Gal-T2, β3Gal-T3, and β3Gal-T4 - The search and cloning strategy outlined in Fig. 1 produced three novel genes with significant sequence similarity to β3Gal-T1 (Fig. 2). Additionally, three genes with less similarity were identified (not shown). Multiple sequence alignment of the three β3-galactosyltransferases as well as a homologous Drosophila gene designated Brainiac (15) (GenBank accession number U41449) is shown in Fig. 2. The sequence similarities between the four human genes are limited to the central regions; there are no significant similarities in the N-terminal regions. Several sequence motifs in the putative catalytic domains are conserved between all the sequences. In the high similarity region, β3Gal-T2 is most similar to β3Gal-T1, β3Gal-T3 is the second most similar, and β3Gal-T4 is the least similar. At least three cysteine residues align within all the human genes; an additional two align within β3Gal-T1, -T2, and -T3 (Fig. 2). Although Brainiac has most of the sequence motifs conserved among the human genes, none of the conserved cysteines in the human genes were found in Brainiac. There are two potential N-linked glycosylation sites in β3Gal-T1, five each in β3Gal-T2 and -T3, and one in β3Gal-T4. The glycosylation sites are mainly in the N-terminal region, to the carboxyl side of the putative transmembrane region. Interestingly, one site occurs in the region of high sequence similarity and is conserved among the four human genes and Brainiac.
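Potential N-linked glycosylation sites such as those counted above are conventionally predicted from the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The text does not state how the sites were identified, so the following Python sketch is only an assumption about the kind of scan involved; the function name and the example peptide are hypothetical.

```python
import re

# Asn-X-Ser/Thr with X != Pro; the lookahead lets overlapping sequons be reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in _SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide, not taken from any β3Gal-T sequence.
    example = "MNASNPTNGT"
    print(find_sequons(example))
```

A sequon marks only a potential site; whether it is actually glycosylated cannot be decided from the sequence alone.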
The predicted coding region of β3Gal-T2 has two potential initiation codons, both of which are in agreement with Kozak's rule (16). The coding sequence thus depicts a type II transmembrane glycoprotein with two different N-terminal cytoplasmic domains of 24 or 11 residues, a transmembrane segment of 19 residues flanked by charged residues, and a stem region plus catalytic domain of 379 residues (Fig. 2). β3Gal-T2 differs significantly from the other genes by having an extended putative stem region (approximately 50 residues) that is hydrophilic, as is found in most glycosyltransferase genes (Figs. 2 and 3). The 3′-UTR with one polyadenylation signal at position 2497 (1228-bp 3′-UTR) was found in all the identified EST clones. The 3′ ESTs (WI-13433) were linked to chromosome 1q31 between microsatellite markers D1S461 and D1S412 at 208-213 centimorgan.
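Hydropathy plots of the kind referred to in Fig. 3 are typically computed as a sliding-window average of per-residue hydropathy values. The text does not name the scale or window used, so the Python sketch below assumes the Kyte-Doolittle scale and a caller-chosen window purely for illustration; the function name and test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of each full window along the sequence.

    Returns an empty list when the window does not fit the sequence.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch followed by a charged one.
    for value in hydropathy_profile("LLLLLLLLLLDDDDDDDDDD", window=5):
        print(f"{value:+.2f}")
```

In such a profile a membrane-spanning segment appears as a sustained positive plateau, whereas a hydrophilic stem region stays near or below zero.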
The predicted coding region of β3Gal-T3 has two potential initiation codons; only the second is in agreement with Kozak's rule (16). The coding sequence depicts a type II transmembrane glycoprotein with two potentially different N-terminal cytoplasmic domains of 19 or 7 residues, a transmembrane segment of 14 residues, and a stem region plus catalytic domain of 298 residues (Fig. 2). The 774-bp 3′-UTR obtained from all EST clones did not include a consensus polyadenylation signal. The 3′ ESTs (WI-9638) were linked to chromosome 3q25 between microsatellite markers D3S1275 and D3S3702 at 176-179 centimorgan.
The predicted coding region of β3Gal-T4 has a single initiation codon, in agreement with Kozak's rule (16). The coding sequence yields a type II transmembrane glycoprotein with an N-terminal cytoplasmic domain of 8 residues, a short transmembrane segment of 11 residues, and a stem region plus catalytic domain of 359 residues (Fig. 2). All conserved sequence motifs within the human β3-galactosyltransferase genes are found in β3Gal-T4. Whereas these align fully within β3Gal-T1, -T2, and -T3, there is an inserted sequence element in β3Gal-T4 that is illustrated by inserted spacings in the multiple sequence alignment analysis (Fig. 2) and in the hydropathy plot (Fig. 3). The hydropathy plot of β3Gal-T4 shows a very hydrophobic putative stem region that differs significantly from the other genes. This has not been observed for other animal glycosyltransferases except for a ceramide galactosyltransferase (17). The human β3Gal-T4 exhibits relatively high similarity to a recently reported rat gene (overall amino acid sequence identity of 79%) (18). Although this is lower than the similarity found in comparisons between most rat/human glycosyltransferase genes, it suggests that these two genes are homologues. A 3′-UTR with a polyadenylation signal at position 1387 (250-bp 3′-UTR) was included in the EST clones. The 3′ ESTs (stSG4027) were linked to chromosome 6p21.3 between microsatellite markers D6S276 and D6S439 at 44-48 centimorgan.
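The open reading frame lengths given under "Cloning and Sequencing" and the domain sizes given above can be cross-checked against each other: the number of codons in each open reading frame should equal the sum of the cytoplasmic, transmembrane, and stem-plus-catalytic residues. The Python sketch below performs this check, assuming the longer cytoplasmic domain (first initiation codon) for β3Gal-T2 and -T3; the data structure and function names are illustrative.

```python
# Values taken from the text; domains are
# (cytoplasmic, transmembrane, stem + catalytic) residue counts.
GENES = {
    "β3Gal-T2": {"orf_bp": 1266, "domains": (24, 19, 379)},
    "β3Gal-T3": {"orf_bp": 993, "domains": (19, 14, 298)},
    "β3Gal-T4": {"orf_bp": 1134, "domains": (8, 11, 359)},
}


def codons_in_orf(orf_bp: int) -> int:
    """Number of codons in an open reading frame of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    return orf_bp // 3


def is_consistent(orf_bp: int, domains: tuple[int, ...]) -> bool:
    """True when the domain sizes add up to the codon count of the ORF."""
    return codons_in_orf(orf_bp) == sum(domains)


if __name__ == "__main__":
    for name, record in GENES.items():
        codons = codons_in_orf(record["orf_bp"])
        status = "consistent" if is_consistent(**record) else "mismatch"
        print(f"{name}: {codons} codons, domains sum to "
              f"{sum(record['domains'])} ({status})")
```

Taken at face value, the agreement for all three genes would imply that the quoted open reading frame lengths count coding codons only, without the stop codon, although the text does not say so explicitly.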
Expression of β3Gal-T1 and -T2 - Expression of soluble constructs of β3Gal-T1 and -T2 in Sf9 cells resulted in a marked increase in galactosyltransferase activity using βGlcNAc-benzyl as an acceptor substrate, compared with uninfected cells or cells infected with control constructs for polypeptide GalNAc-transferases (12) (Table I). Analysis of the substrate specificity of β3Gal-T1 and -T2 activities showed that only saccharides with a terminal βGlcNAc residue, and not αGlcNAc or αGalNAc, were acceptor substrates. It was not possible to determine Km for benzyl-βGlcNAc or Umb-βGlcNAc with β3Gal-T1 and -T2 due to substrate inhibition at concentrations above 100 mM.
The Km values for UDP-Gal of β3Gal-T1 and -T2 were 90 ± 5 μM and 37 ± 9 μM, respectively, using benzyl-βGlcNAc as an acceptor substrate. β3Gal-T2 catalyzed glycosylation of hen egg ovalbumin, whereas β3Gal-T1 showed poor activity with this substrate (Table II). Analysis of enzyme activities with a panel of glycolipid substrates revealed that both enzymes were capable of catalyzing glycosylation of the βGlcNAc-terminating structures Lc3Cer and nLc5Cer, but β3Gal-T1 showed lower activity with nLc5Cer. Both enzymes showed low activities with GlcCer (Table III). β3Gal-T2 and to a lesser extent β3Gal-T1 incorporated galactose into nLc4Cer. Secreted forms of β4-galactosyltransferases were included as controls (2). Both β3Gal-T1 and -T2 showed strict donor substrate specificity for UDP-Gal and did not utilize UDP-GalNAc or UDP-GlcNAc with the acceptor substrates tested (data not shown). The reaction product formed by β3Gal-T2 with Lc3Cer was shown by 1H NMR to be Lc4Cer, thus verifying that T2 formed the Galβ1-3GlcNAc linkage. In the downfield region (3.7-5.7 ppm) (Fig. 4), the one-dimensional 1H NMR spectrum indicated the presence of Lc4Cer as well as residual Lc3Cer, and this was in agreement with results of TLC analysis, which indicated approximately 40% conversion of Lc3Cer to Lc4Cer. The spectral features of the two components were virtually identical with those previously observed (19,20) (Fig. 4) and clearly distinct from those obtained for nLc4Cer (21).
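To put the reported Km values for UDP-Gal in context, the fraction of maximal velocity expected at a given donor concentration can be estimated from the Michaelis–Menten relation v/Vmax = [S]/(Km + [S]). The Python sketch below applies it to the 100 μM and 500 μM UDP-Gal concentrations used in the assays; it assumes simple Michaelis–Menten behaviour with a saturating acceptor, which the text does not establish, and the function name is illustrative.

```python
def fractional_velocity(substrate_um: float, km_um: float) -> float:
    """v/Vmax for simple Michaelis-Menten kinetics: S / (Km + S)."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    km_values_um = {"β3Gal-T1": 90.0, "β3Gal-T2": 37.0}  # Km for UDP-Gal, from the text
    for udp_gal_um in (100.0, 500.0):  # donor concentrations used in the assays
        for enzyme, km in km_values_um.items():
            fraction = fractional_velocity(udp_gal_um, km)
            print(f"{enzyme} at {udp_gal_um:.0f} µM UDP-Gal: "
                  f"{100 * fraction:.0f}% of Vmax")
```

Under these assumptions the standard assay at 100 μM UDP-Gal would run β3Gal-T1 at roughly half of Vmax, so activities of the two enzymes measured at that concentration are not directly proportional to their maximal rates.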
Expression of β3Gal-T3 - Two different soluble constructs and two full-length coding constructs were expressed in Sf9 cells, but enzyme activity was not detected with any of the substrates listed in Table I, Lc3, GM2, or globoside (not shown).
Expression of β3Gal-T4 - Expression of the full-length coding construct for β3Gal-T4 in Sf9 cells produced no detectable activity with any of the simple sugar derivatives listed in Table I. However, analysis of enzyme activity with a panel of glycolipids revealed that products migrating as Gg4 and GM1 were formed with Gg3 and GM2 glycolipid substrates, respectively (Table IV). The reaction product formed by β3Gal-T4 with GM2 was analyzed by 1H NMR to verify that T4 formed the Galβ1-3GalNAc linkage to make GM1 (Fig. 5). Aside from the appearance of a number of impurities mainly ascribed to membrane phospholipids not separated from the ganglioside product, the proton NMR spectral data of the reaction product were comparable with those of Koerner et al. (22) for nonexchangeable CH, CH2, and CH3 signals of GM1, allowing for slight differences in temperature (308 versus 303 K), % D2O, and ganglioside concentration.
Northern Analysis of β3Gal-T1, -T2, -T3, and -T4 - Northern analysis with mRNA from eight adult human organs revealed different patterns of expression by the four genes (Fig. 6). β3Gal-T1 expression was detected in brain with a transcript size of 6.5 kb, but no expression was detected in the other human organs tested or in 25 human tumor cell lines derived from the pancreas and colon. The human cancer cell line used for the original cloning of β3Gal-T1 was not included in this study. β3Gal-T2 produced a major transcript of 3.6 kb and a minor transcript of 3.2 kb in heart and brain, but it was not expressed in any of the other organs tested. All human ESTs derived from β3Gal-T2 were obtained from brain libraries, but two mouse ESTs were from a mammary gland library. β3Gal-T3 produced a major transcript of 3.8 kb and a minor transcript of 3.0 kb, similarly in heart and brain, and expression was also observed in placenta, kidney, and pancreas. Human ESTs derived from β3Gal-T3 were obtained from brain, liver/spleen, and heart libraries. β3Gal-T4 yielded multiple transcripts of 5.0, 3.0, and 2.2 kb in all organs with some variations in detected levels of expression, and the expression in brain was relatively low. The two available human ESTs derived from β3Gal-T4 were obtained from brain libraries.

Table III (partial; column headings and the label of the first row were not recovered)
        GlcNAcβ1-3Galβ1-4Glcβ1-Cer                       0.36    0.29
nLc4    Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-Cer                0.04    0.08
nLc5    GlcNAcβ1-3Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-Cer      0.16    0.29
a Assayed using 100 μg of taurodeoxycholate per 100 μl reaction mixture. Enzyme source was medium from infected Sf9 cells.
b ND, not detectable.

FIG. 4. Downfield and NAc regions of 600-MHz 1H NMR spectrum of glycosphingolipid mixture isolated after enzymatic glycosylation of Lc3Cer with β3Gal-T2. The substrate (S) and product (P) isolated together from the crude reaction mixture are shown. 2048 free induction decays were accumulated at 308 K. Arabic numerals refer to ring protons of residues designated by Roman numerals or capital letters in the corresponding structures. R refers to protons of the sphingosine backbone, and FA refers to protons of the fatty N-acyl moiety. The downfield region of the spectrum displayed five distinct β-anomeric resonances (3J1,2 ≈ 7-9 Hz). Two of these, at 4.166 ppm (3J1,2 = 7.7 Hz) and 4.263 ppm (3J1,2 ≈ 7 Hz), correspond to β-Glc I-1 and β-Gal II-1, respectively, of both Lc3Cer and Lc4Cer. A third, at 4.620 ppm (3J1,2 = 8.1 Hz), corresponds to β-GlcNAc III-1 of Lc3Cer. The two resonances of lower intensity, at 4.781 ppm (3J1,2 = 8.4 Hz) and 4.136 ppm (3J1,2 = 7.0 Hz), correspond to β-GlcNAc III-1 and β-Gal IV-1, respectively, of Lc4Cer. By contrast, under these conditions, the resonance for H-1 of the terminal Galβ1→4 of nLc4Cer was previously found at 4.214 ppm (3J1,2 = 6.7 Hz), whereas that of β-GlcNAc III-1 was found at 4.664 ppm (3J1,2 = 7.9 Hz) (21,39). Thus, the data clearly show evidence of partial glycosylation of Lc3Cer with Galβ1→3 to make Lc4Cer. Two other pairs of resonances are consistent with the presence in the mixture of both Lc3Cer and Lc4Cer. These are the characteristic β-Gal II-4 signals, shifted slightly downfield in the latter (3.851 ppm; 3J3,4 = 3.3 Hz) relative to the former (3.837 ppm; 3J3,4 = 2.6 Hz), both found within ±0.002 ppm of previously observed values (19,20); and the β-GlcNAc NAc singlets, found at 1.815 ppm in the latter and 1.834 ppm in the former.

DISCUSSION
In the present study, four members of a human β3-galactosyltransferase gene family were characterized. A human β3-galactosyltransferase gene, here designated β3Gal-T1, indicated to catalyze the formation of the type 1 core structure, Galβ1-3GlcNAc, was released in GenBank in October 1996 and described in some detail in a patent application (JP1994181759-A/1). β3Gal-T1 was isolated from the melanoma cell line WM266-4 by a transfection-cloning strategy that used expression of Le a and sialosyl-Le a in KJM-1 cells for selection. In the present study, the function of β3Gal-T1 was verified by expression in insect cells. Through the application of an EST cloning strategy previously used to identify a family of homologous β4-galactosyltransferases (2), a novel family of homologous β3-galactosyltransferase genes was identified. Three genes with a high degree of similarity, β3Gal-T2, -T3, and -T4, were studied here, and recombinant forms of two of these have β3-galactosyltransferase activity. β3Gal-T2 has kinetic properties similar to those of β3Gal-T1, but there was a striking difference in their activities with the glycoprotein ovalbumin. β3Gal-T4 was found to catalyze the addition of galactose to the gangliosides GM2 and Gg3. The function of the human β3Gal-T3 gene reported here was not identified.
During the course of this work, a rat UDP-Gal:βGalNAc β-1,3-galactosyltransferase gene was isolated by the transfection-cloning strategy (18). The rat enzyme has an overall sequence identity of 79% to human β3Gal-T4, and the substrate specificities are similar. Thus, we propose that these represent homologues among these species. However, it remains possible that the two genes are different but closely related variants, since the sequence similarity is low compared with what has been found for other human and mouse homologues of glycosyltransferases. During review of this manuscript, human β3Gal-T1 and -T2 were reported by Kolbinger et al. (23), and in agreement with our finding that β3Gal-T2 utilized nLc4Cer as acceptor, these authors reported 16% activity of this enzyme with GalβO-(CH2)8-CO2CH3. Mouse homologues of β3Gal-T1, -T2, and -T3 were reported simultaneously (24), and in this study, the murine β3Gal-T2 and -T3 were found to have low levels of activity with GlcNAc-β-pNP, approximately 30-fold lower than β3Gal-T1. Interestingly, the kinetic properties of all three murine enzymes were poor, with Km for UDP-Gal between 0.6 and 2.3 mM, which is much higher than the Km of other glycosyltransferases (5-100 μM) (25). In the present study, the Km values for UDP-Gal with human β3Gal-T1 and -T2 were 90 and 37 μM, respectively. Furthermore, the kinetic properties of human β3Gal-T1 and -T2 with simple acceptors were comparable, but we did not detect significant activity with β3Gal-T3. Whether these differences are due to experimental problems or to species-related variations is not known.
Analysis of sequence similarities between the four β3-galactosyltransferase genes using the ClustalW algorithm (Fig. 2) revealed comparatively low overall sequence identities of 29-42%; however, several conserved short sequence motifs were found. Interestingly, a single potential N-glycosylation site is conserved in all genes, a feature not generally found among homologous glycosyltransferases, although a single conserved site is also found in most of the β4-galactosyltransferases (2, 3). Support for an evolutionary relationship among the three genes was provided by analysis of their genomic organizations, which showed that the coding regions of all four genes were located in a single exon. The same organization was found for the three mouse genes (24). At least four β4-galactosyltransferase genes have the same genomic organization, including the conservation of five intron positions (2). It appears that genes for several glycosyltransferase families have similar organizations. Several members of the fucosyltransferases and the β1-6N-acetylglucosaminyltransferase families are encoded by a single exon (26,27). In contrast to these gene families, all members of the β3Gal-T family have different chromosomal localizations. The significance of this is presently unknown.
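Overall sequence identities such as the 29-42% quoted above are derived from pairwise comparisons within an alignment. The Python sketch below shows one common way to compute percent identity from two aligned sequences, counting only columns in which neither sequence has a gap; other denominators are also in use, and the text does not say which convention was applied. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Fig. 2.
    print(percent_identity("ACDE-FGHIK", "ACDQGFGH-K"))
```

Because the choice of denominator (ungapped columns, full alignment length, or shorter sequence length) changes the result, identity values are comparable only when the same convention is used throughout.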
The existence of multiple β3-galactosyltransferases suggests a surprising redundancy in genes with seemingly similar functions, which could represent a comprehensive genetic back-up. However, it is equally likely that this high number of enzymes has evolved as a result of specific requirements for enzymes with different functions. Apparent redundancies in substrate specificities of glycosyltransferases have been found in sialyltransferases (28,29), fucosyltransferases (30-32), β4-galactosyltransferases (2–4), and polypeptide GalNAc-transferases (14,33). There are differences in the kinetic parameters of some members of each of these families that relate to type and complexity of acceptor glycoconjugate or acceptor peptide sequence for the polypeptide GalNAc-transferases. In the present study, β3Gal-T2 catalyzed transfer of galactose to ovalbumin, whereas β3Gal-T1 did not. Analogously, one member of the β4-galactosyltransferase family failed to utilize this substrate (Table II). It is not clear whether this indicates that the enzymes have different preferences for glycoprotein and glycolipid substrates or have selective specificities for particular antennae of branched mannose and poly-N-acetyllactosamine structures. Ovalbumin contains unsubstituted GlcNAcβ1-2Man and GlcNAcβ1-4Man structures (34), and apparently none of these are utilized by β3Gal-T1. However, β3Gal-T1 utilized the disaccharide GlcNAcβ1-6Man-1-OMe more efficiently than β3Gal-T2 (Table I), indicating that perhaps tri- or tetraantennary N-linked glycoproteins may serve as substrates for β3Gal-T1. Unfortunately, a complete panel of disaccharide acceptors for different antennae of N-linked structures was not available for this study. None of the β3Gal-Ts utilized the disaccharide GlcNAcβ1-3GalNAc representing mucin-type core 3 (Table I), whereas four β4Gal-Ts efficiently use this substrate (2). β3Gal-T1 and -T2 showed similar activities with β-GlcNAc-terminating lactoseries glycosphingolipids. Interestingly, β3Gal-T2 in particular exhibited significant incorporation into nLc4Cer (Table III). The structure of the product was not determined, but it is likely to be Galβ1-3Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-Cer, which was originally isolated by Stellner and Hakomori (35). Thus, the substrate specificities of β3Gal-T1 and -T2 are different, but further studies are required to define in detail the full range of functions of each enzyme.
The human β3-galactosyltransferase genes appear to be distantly related to the Drosophila gene Brainiac, which is involved in contact and adhesion between germ-line and follicle cells (36,37) (Fig. 2). Previously, Yuan et al. (15) compared sequences of Brainiac and a related gene, Fringe, to a number of known bacterial glycosyltransferases and suggested that the two Drosophila genes may represent glycosyltransferases. These investigators also analyzed the human EST data base for potential related human genes, and in fact two ESTs (GenBank accession numbers R13867 and W26435) were suggested to represent human genes homologous to Brainiac. EST R13867 was shown in the present study to be derived from β3Gal-T2, and EST W26435 is from another β3-galactosyltransferase homologous gene that is presently under study. Yuan et al. (15) identified five conserved sequence motifs between Lex1, Fringe, and Brainiac subfamilies, and all of these motifs fall within the highly conserved sequence regions between the four human β3-galactosyltransferase genes (Fig. 2). Although several sequence motifs are shared between these genes, none of the three conserved cysteine residues in these are found in Brainiac. One cysteine in the C-terminal region of Brainiac and β3Gal-T4 was aligned. The potential N-glycosylation site conserved in all human β3-galactosyltransferases is also found in Brainiac. Preliminary attempts to express Brainiac and identify glycosyltransferase activity were not successful with the substrates described here.
A family of β3-galactosyltransferases is described in the present study. The substrate specificities and expression patterns for the β3-galactosyltransferases characterized to date were different, although some redundancy in function may exist for β3Gal-T1 and -T2. The finding that β3Gal-T4 transferred galactose to β-GalNAc in GM2 suggests that other related β3-galactosyltransferases may belong to this gene family. Candidates include the Galβ1-3Gb4 glycolipid synthase, the Galβ1-3GalNAcα1-3(Fucα1-2)Galβ1-R synthase initiating the repetitive histo-blood group A-associated glycosphingolipids, as well as the Galβ1-3GalNAcα1-O-Ser/Thr mucin-type core 1 synthase.