Cloning and Characterization of a Close Homologue of Human UDP-N-acetyl-α-d-galactosamine:Polypeptide N-Acetylgalactosaminyltransferase-T3, Designated GalNAc-T6

The UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase, designated GalNAc-T3, exhibits unique functions. Specific acceptor substrates are used by GalNAc-T3 and not by other GalNAc-transferases. The expression pattern of GalNAc-T3 is restricted, and loss of expression is a characteristic feature of poorly differentiated pancreatic tumors. In the present study, a sixth human UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase, designated GalNAc-T6, with high similarity to GalNAc-T3, was characterized. GalNAc-T6 exhibited high sequence similarity to GalNAc-T3 throughout the coding region, in contrast to the limited similarity that exists between homologous glycosyltransferase genes, which is usually restricted to the putative catalytic domain. The genomic organizations of GALNT3 and GALNT6 are identical with the coding regions placed in 10 exons, but the genes are localized differently at 2q31 and 12q13, respectively. Acceptor substrate specificities of GalNAc-T3 and -T6 were similar and different from other GalNAc-transferases. Northern analysis revealed distinct expression patterns, which were confirmed by immunocytology using monoclonal antibodies. In contrast to GalNAc-T3, GalNAc-T6 was expressed in WI38 fibroblast cells, indicating that GalNAc-T6 represents a candidate for synthesis of oncofetal fibronectin. The results demonstrate the existence of genetic redundancy of a polypeptide GalNAc-transferase that does not provide full functional redundancy.

The initiation of mucin-type O-linked protein glycosylation is controlled by a family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases (GalNAc-transferases)¹ (EC 2.4.1.41) (1). To date five distinct members of the animal GalNAc-transferase family have been reported (2)(3)(4)(5)(6)(7). The GalNAc-transferase family appears to be highly conserved as nine distinct genes have been identified in Caenorhabditis elegans (8). Four human GalNAc-transferases have been characterized, and several characteristics of the human family of GalNAc-transferases are now apparent. (i) The overall sequence similarity is approximately 40-45%, with regions of high similarity (80%) in GalNAc-transferase motifs but little similarity among the N-terminal regions encoding the cytoplasmic tail, the putative signal anchor sequence, and the stem regions (1); (ii) the chromosomal localizations and genomic organizations are different (1,9); (iii) the substrate specificities as determined by in vitro assays are different, but there is overlap among some substrates, especially those derived from mucin tandem repeats (5,6,10); and (iv) the patterns of expression in human cells and organs are different (2,3,5,6). Furthermore, the catalytic action of the different GalNAc-transferases can be cooperative, since glycosylation of certain acceptor sites in the MUC1 tandem repeat by one GalNAc-transferase is required before other sites can be glycosylated by another GalNAc-transferase (6). These data suggest that each GalNAc-transferase has distinct biological functions that are mainly determined by the kinetic properties and expression patterns of the enzymes.
Many properties of the enzyme function are still not fully understood: the importance of sub-Golgi localization (11), the importance of large variation in the length of stem regions (1,7), and the significance of a putative C-terminal lectin-like domain of approximately 100 amino acids, which does not appear to be essential for catalytic activity (12,13).
GalNAc-T3 exhibits acceptor substrate specificities not seen with other enzymes, including glycosylation of a single in vivo defined O-glycosylation site in fibronectin, which forms the oncofetal fibronectin isoform (14) and a single site in the V3 loop of HIV gp120 (5). The unique specificity of GalNAc-T3 for the oncofetal fibronectin peptide was reproduced with plasma fibronectin, clearly indicating the importance of the primary sequence context as a major factor in determining O-glycosylation (10). Furthermore, the specificity of GalNAc-T3 for the HIV sequence has been confirmed in vivo. A reporter construct containing the acceptor sequence was O-glycosylated only if GalNAc-T3 was co-expressed in the host cell (15). This demonstrated that specificity in vitro reflects in vivo specificity. Recent studies with a panel of monoclonal antibodies to human GalNAc-transferases demonstrated that GalNAc-T3 is not expressed in connective tissue cells in normal or tumor tissues or in a fibroblast cell line synthesizing oncofetal fibronectin (16). It is therefore unlikely that GalNAc-T3 represents the native fibronectin GalNAc-transferase activity found in tumor tissues and fibroblast cell lines, as originally described by Matsuura et al. (17). This suggests the existence of an additional GalNAc-transferase with similar properties as GalNAc-T3 but with a different expression pattern.
* This work was supported by the Danish Cancer Society, the Mizutani Foundation for Glycoscience, the Ingeborg Roikjer Foundation, the Velux Foundation, the Danish Medical Research Council, the Danish Natural Science Research Council, the Novo Nordisk Foundation, National Institutes of Health Grant 1 RO1 CA66234, funds from the European Union Biotech 4th Framework, and the Dutch Cancer Society. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) Y08565 and AJ133523.
** To whom correspondence should be addressed: School of Dentistry, Nørre Alle 20, DK-2200 Copenhagen N, Denmark. Tel.: 45 35326835; Fax: 45 35326505; E-mail: henrik.clausen@odont.ku.dk.
Here, we report the cloning and expression of such a novel human GalNAc-transferase, which appears to represent a high similarity duplicate gene of GalNAc-T3. The novel GalNAc-transferase, designated GalNAc-T6, displayed an acceptor substrate specificity similar to GalNAc-T3, although GalNAc-T6 showed better kinetic properties with the fibronectin substrate. The two GalNAc-transferases exhibit different expression patterns as analyzed by Northern analysis and immunocytology. GalNAc-T6 was expressed in a fibroblast cell line synthesizing oncofetal fibronectin. The existence of high similarity pairs of genes within the GalNAc-transferase gene family is significant to the biological function of this large gene family and of practical significance for studies of transgenic knock-out models.

EXPERIMENTAL PROCEDURES
Identification and Cloning of cDNA for GalNAc-T6 -The dbEST database at the National Center for Biotechnology Information (NCBI) was searched for sequences similar to the coding region of the human GalNAc-T3 gene (5) using the tBLASTn and BLASTn algorithms. The 5′ region of GalNAc-T3, with no known similarity to other human GalNAc-transferases (bp 1-600), was used to identify a rat EST (GenBank accession number H32001) predicted to encode a protein sequence with 63% similarity to residues 99-181 of GalNAc-T3. The putative human counterpart of this rat sequence was isolated by PCR with a sense primer, EBHC500 (5′-AGCGGATCCACTCCTGCCTTCCGGGGTTC-3′), derived from the rat EST sequence, and an antisense primer, EBHC106N (5′-AGCGGATCCGTATTCGTCCATCCAIACITCTG-3′), derived from the conserved GalNAc-transferase motif (5) (Fig. 1, panel A). Four cDNA libraries from MKN45 (3), Colo205 (Stratagene), salivary glands (5), and spleen were screened by PCR with 0.5 µM primers and library lysates containing 1 × 10^6 plaque-forming units. A spleen ZAP (Stratagene) random-primed cDNA library was prepared from 2.0 µg of human spleen mRNA (CLONTECH) using a time saver cDNA kit (Amersham Pharmacia Biotech) and constructed as recommended by the manufacturer. PCR was performed by using 35 cycles of 95°C for 20 s, 55°C for 5 s, 72°C for 2 min under standard conditions. Two cDNA libraries derived from salivary glands and spleen yielded a single PCR product, whereas two libraries derived from the cell lines Colo205 and MKN45 gave no product. The PCR products from the salivary gland and spleen libraries were cloned and sequenced, and the sequences were found to be identical. The identified novel human sequence (clone #1) covering the central part of a putative GalNAc-transferase gene was further studied by cloning 5′ and 3′ sequences using rapid cDNA library screening (5) (Fig. 1, panel B).
Briefly, the spleen library was aliquoted into 20 sublibraries, and these were screened by PCR using primer pairs based on the identified novel sequence and the ZAP vector primers T3 and T7, and the products were confirmed by hybridization with either EBHC500 or EBHC505 (5′-AGCGGATCCACTCTGCCCCTCTGGACGGGC-3′). EBHC503 (5′-AGCGGATCCGACAAGACAGTGGTGGTGAGC-3′) was used for the 3′ PCR, and EBHC504 (5′-AGCGGATCCGGGTCTCCAGGGGGGTCCAC-3′) was used for the 5′ PCR (Fig. 1, panel B). PCR with EBHC503/T7 yielded a single product of 2 kbp for the 3′ sequence, and PCR with EBHC504/T3 gave a single product of 250 bp. Both products were blunt end-cloned and sequenced. The 3′ sequence contained an in-frame stop codon. The 5′ sequence had a potential open reading frame but was shorter than the coding region of GalNAc-T3 and had neither a translation initiation codon nor a hydrophobic sequence, suggesting the existence of further 5′ sequence. Attempts to obtain additional 5′ sequence information by use of various 5′ rapid amplification of cDNA ends (RACE) techniques failed.
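As a small illustration of how the primer sequences quoted above can be inspected, the following Python sketch takes the three primers from this paragraph, finds the 5′ sequence they share, checks for the hexamer GGATCC (the BamHI recognition sequence), and reports length and GC content. This is not part of the published protocol: the helper name `gc_fraction` is hypothetical, and the text does not state whether the GGATCC site was used for cloning.

```python
import os

# Primer sequences copied from the text (written 5'->3').
PRIMERS = {
    "EBHC503": "AGCGGATCCGACAAGACAGTGGTGGTGAGC",
    "EBHC504": "AGCGGATCCGGGTCTCCAGGGGGGTCCAC",
    "EBHC505": "AGCGGATCCACTCTGCCCCTCTGGACGGGC",
}

BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (hypothetical helper)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# os.path.commonprefix compares strings character by character,
# so it also works for finding a shared 5' tail.
shared_tail = os.path.commonprefix(list(PRIMERS.values()))

for name, seq in PRIMERS.items():
    site = "contains GGATCC" if BAMHI_SITE in seq else "no GGATCC"
    print(f"{name}: {len(seq)} nt, GC={gc_fraction(seq):.2f}, {site}")
print("shared 5' tail:", shared_tail)
```

The gene-specific part of each primer is whatever follows the shared tail, which is the portion expected to anneal to the template.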
Genomic Cloning and Characterization of the Organization of GALNT6 -A P1 human foreskin genomic library (DuPont Merck Pharmaceutical Co. Human Foreskin Fibroblast P1 Library) was screened using primer pairs EBHC500/EBHC504. Three clones were obtained from Genome Systems: DMPC-HFF#1-235-B10 (P1-12423), DMPC-HFF#1-826-D3 (P1-12424), and DMPC-HFF#1-994-B6 (P1-12425). DNA from P1 phage was prepared as recommended by Genome Systems Inc., and P1-12423 was selected for partial sequence analysis. The entire sequence of the open reading frame compiled from the PCR cDNA cloning strategy was confirmed with minor sequence corrections. The most 5′-coding sequence of the putative GalNAc-transferase gene was obtained by 5′ sequencing of the P1 DNA (Fig. 1, panel A). This sequence included a translation initiation codon, a putative cytoplasmic tail, and a putative hydrophobic transmembrane-spanning domain. The intron/exon organization of the gene was determined by comparing cDNA and P1 sequences.
Polypeptide GalNAc-transferase Assay-Standard assays were performed in 50 µl of total reaction mixtures containing 25 mM Tris (pH 7.4), 10 mM MnCl2, 0.25% Triton X-100, 50 µM UDP-[14C]GalNAc (2,000 cpm/nmol) (Amersham Pharmacia Biotech), 200-500 µM acceptor peptide (see Tables I and II for structures), and 5-10 µl of culture supernatants. In some experiments, as indicated, purified recombinant GalNAc-transferase preparations were used. GalNAc-T6 was purified as described previously (10) using sequential ion-exchange chromatographies on Amberlite (IRA95, Sigma) and DEAE-Sephacel, S-Sepharose fast flow, and Mono-S (PC1.6/5, Smart-System) (Amersham Pharmacia Biotech) columns. Final purification to apparent homogeneity was performed on S12 gel filtration (Superose 12 pc3.2/30, Smart-System, Amersham Pharmacia Biotech). Purity and protein concentration of the final fractions were assessed by S12 gel filtration and SDS-polyacrylamide gel electrophoresis using bovine serum albumin as a standard. The specific activity of the purified GalNAc-T6 was estimated to be 2.35 units/mg using 250 µM Muc1a (Table I) as the acceptor peptide. Previously, soluble forms of human GalNAc-T1, -T2, and -T3 were expressed in Sf9 cells and purified to near homogeneity with specific activities of 0.6 unit/mg for GalNAc-T1, 0.5 unit/mg for GalNAc-T2, 0.5 unit/mg for GalNAc-T3, and 0.05 unit/mg for GalNAc-T4, measured using peptides derived from MUC2, MUC1, and MUC7 tandem repeats (6, 10). Peptides were synthesized by ourselves, by Carlbiotech (Copenhagen), or Neosystems (Strasbourg), and quality was ascertained by amino acid analysis and mass spectrometry. For analysis of the donor substrate specificity, assays were performed with 100 µM UDP-[14C]Gal or UDP-[14C]GlcNAc (4,000 cpm/nmol). Products were routinely quantified by scintillation counting after Dowex-1 formic acid cycle chromatography.
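The conversion from scintillation counts to transferred GalNAc follows directly from the specific radioactivity of the donor (2,000 cpm/nmol). The Python sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, background subtraction is shown only as an optional step, and one unit is assumed to mean 1 µmol of GalNAc transferred per minute, a definition the text does not give.

```python
def nmol_transferred(cpm: float,
                     cpm_per_nmol: float = 2000.0,
                     background_cpm: float = 0.0) -> float:
    """Convert measured counts to nmol of sugar incorporated.

    cpm_per_nmol is the specific radioactivity of the donor substrate
    (2,000 cpm/nmol for UDP-[14C]GalNAc in the standard assay).
    """
    return (cpm - background_cpm) / cpm_per_nmol


def specific_activity(nmol: float, minutes: float, mg_protein: float) -> float:
    """Specific activity in units/mg.

    ASSUMPTION: 1 unit = 1 umol transferred per minute; the source text
    does not define the unit.
    """
    umol_per_min = (nmol / 1000.0) / minutes
    return umol_per_min / mg_protein


# Illustrative numbers only (not data from the paper).
counts = 4500.0
blank = 500.0
print(nmol_transferred(counts, background_cpm=blank))  # nmol GalNAc transferred
```

With the donor at 4,000 cpm/nmol, as used for the UDP-Gal and UDP-GlcNAc controls, the same counts would correspond to half as many nmol, so `cpm_per_nmol` should be set per experiment.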
At least once for all combinations of enzyme sources and peptides, the products were evaluated by C-18 reverse phase chromatography (PC3.2/3 or µRPC C2/C18 SC2.1/10 Amersham Pharmacia Biotech, Smart System) with scintillation counting of peptide peak fractions. Peptides and products produced by in vitro glycosylation were confirmed by mass spectrometry, and reaction kinetics were monitored by capillary electrophoresis. Reaction mixtures for preparative glycosylation included 2 mM cold UDP-GalNAc and 25 µg of acceptor peptide in a total volume of 100 µl. Reactions were incubated in the sample carousel of an Applied Biosystems model HT270 (Perkin-Elmer) at 30°C, and injections were performed at 60-min intervals. Capillary zone electrophoresis was performed on coated fused silica capillaries, 72 cm × 50 µm, with 49 cm between sample injection and optical cell. Electrophoresis was performed at 30°C using 50 mM phosphate buffer (pH 2.5). Voltage across the capillary was 20 kV in positive mode with the anode at the injection side, and the runs were monitored at 210 nm. At the beginning of each cycle the capillary was flushed with 0.1 M NaOH for 2 min followed by flushing with 50 mM phosphate buffer (pH 2.5) for 4 min. After 8 h of reaction the glycopeptides were purified by C-18 high performance liquid chromatography and analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF). Spectra were acquired on a Voyager-DE mass spectrometer (Perseptive Biosystem Inc.) equipped with delayed extraction. The matrix used was 2,5-dihydroxybenzoic acid (25 mg/ml) dissolved in a 2:1 mixture of 0.1% trifluoroacetic acid in water and acetonitrile. Samples dissolved in 0.1% trifluoroacetic acid to a concentration of approximately 80 fmol to 2 pmol/µl were prepared for analysis by placing 1 µl of sample solution on a probe tip followed by 1 µl of matrix.
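When glycopeptide products are read out by MALDI-TOF, each GalNAc added to the peptide increases the mass by one N-acetylhexosamine residue (C8H13NO5, monoisotopic mass 203.0794 Da). The Python sketch below lists the expected mass ladder for a peptide carrying 0 to n GalNAc residues. It is an illustration only: the function name and the example peptide mass are hypothetical, and adduct or protonation masses are deliberately left out.

```python
HEXNAC_RESIDUE_MONO = 203.0794  # Da, C8H13NO5 (GalNAc minus water)


def glycoform_masses(peptide_mass: float, max_sites: int) -> list[float]:
    """Expected neutral masses for a peptide with 0..max_sites GalNAc residues."""
    return [round(peptide_mass + n * HEXNAC_RESIDUE_MONO, 4)
            for n in range(max_sites + 1)]


# Hypothetical peptide mass, chosen only to show the spacing of the ladder.
ladder = glycoform_masses(1500.0, 6)
for n, mass in enumerate(ladder):
    print(f"{n} GalNAc: {mass:.4f} Da")
```

Counting the number of peaks spaced by about 203 Da above the unglycosylated peptide gives the number of GalNAc residues incorporated.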
Generation of Monoclonal Antibodies and Immunocytology-The production and characterization of the anti-GalNAc-T6 monoclonal antibody, UH7 (2F3), was essentially as described previously (16). Balb/c mice were immunized three to four times with 10 µg of undenatured purified GalNAc-T6 protein. Fusion to NS-1 and the cloning procedure were as described previously (16). Hybridomas were selected by immunocytology on air-dried, acetone-fixed Sf9 cells infected with various full-coding or secreted GalNAc-transferase baculovirus constructs as well as by ability to differentially immunoprecipitate active recombinant enzymes (16). Western blot analysis with purified recombinant enzymes was also performed.
Immunocytology was performed with a series of human cell lines. Human fibroblast (WI38), leukocyte (HL60), epidermoid carcinoma (A431), colon carcinoma (Colo205), cervix carcinoma (HeLa), pancreatic carcinoma (Suit2, ASPC1), and gastric carcinoma (MKN45) were grown to subconfluency in the appropriate media as recommended by American Type Culture Collection. Cells were fixed in ice-cold acetone for 10 min and then kept at −70°C before staining. In addition, cell lines (WI38, MKN45) were fixed in 3% paraformaldehyde, quenched with 50 mM ammonium chloride in phosphate-buffered saline, and permeabilized in 0.1% Triton X-100 before antibody staining. At this stage, 0.2% fish skin gelatin (Sigma) was added as a blocking agent. The two different fixation protocols produced similar results. Cells were incubated with undiluted hybridoma supernatants for 1-24 h at 4°C. Bound mAbs were detected with FITC-conjugated rabbit anti-mouse immunoglobulin absorbed with human serum (code F-261, Dako, Denmark). Slides were mounted in glycerol containing p-phenylenediamine and examined in a Zeiss fluorescence microscope using epi-illumination. The microscope was equipped with FITC interference filters and a 75-W xenon lamp (FITC).

RESULTS
Cloning of GalNAc-T6 -GalNAc-T6 was identified from a rat EST (H32001), and the human sequence was obtained by a combination of PCR-based cDNA cloning and by genomic cloning. The composite sequence contained an open reading frame of 1869 bp (GenBank accession number Y08565), which is similar to GalNAc-T3 (1902 bp). An additional 33 bp in the coding region of GalNAc-T3 compared with T6 is attributable to the position of the translation initiation codons with GalNAc-T3, having a longer N-terminal cytoplasmic sequence (Fig. 1, panel A). The entire coding sequence was confirmed by sequencing of P1 clones covering the entire coding sequence. Sequencing on P1 clone, P1-12423, revealed that the coding region of GALNT6 was contained in 10 exons (Fig. 1, panel A). Sequences flanking the introns are shown in Fig. 2. A comparison of intron/exon boundaries between GALNT6 and GALNT3 showed that these were positioned identically.
FIG. 2. Intron junctions in the coding region of GALNT6. Exon sequences are shown in capital letters with the nucleotide position from initiation codon in subscript and the predicted amino acid sequence in single-letter code above the sequence. Flanking intron sequences are shown in small letters, and exons are labeled according to Fig. 1. Sequences were aligned to best fit of the gt/ag consensus rule (38).
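The open reading frame lengths reported above translate into protein lengths by simple codon arithmetic. The Python sketch below works this out under one explicit assumption that the text does not state: that the quoted lengths (1869 bp and 1902 bp) include the stop codon. The function name is hypothetical.

```python
def orf_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by an open reading frame.

    ASSUMPTION: by default the quoted ORF length is taken to include
    the 3-bp stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


t6_len = orf_residues(1869)  # GalNAc-T6
t3_len = orf_residues(1902)  # GalNAc-T3
extra_codons = (1902 - 1869) // 3

print(f"GalNAc-T6: {t6_len} residues")
print(f"GalNAc-T3: {t3_len} residues")
print(f"GalNAc-T3 has {extra_codons} additional codons")
```

The 33-bp difference corresponds to 11 codons regardless of the stop codon assumption, which fits the description of a longer N-terminal cytoplasmic sequence in GalNAc-T3.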
The deduced sequence of GalNAc-T6 predicts a type II transmembrane protein with a hydrophobic signal anchor sequence in residues 9-27 (Fig. 1, panel A). Both GalNAc-T3 and -T6 have 2 potential N-glycosylation sites; one is conserved in the C-terminal region, whereas the other is located in the putative stem region in a nonconserved position.
Results of fluorescence in situ hybridization revealed that GALNT6 is located on chromosome 12q13 (Fig. 3). No specific hybridization signals were observed at other chromosomal sites. A total of 20 cells in metaphase were analyzed. Several human ESTs with sequences identical to the coding region and available 3′-untranslated-region sequence of GalNAc-T6 were identified.
Expression of GalNAc-T6 -Expression of the secreted construct of GalNAc-T6 in Sf9 cells resulted in GalNAc-transferase activity in the culture medium of infected cells that was significantly greater than background values obtained with uninfected controls or cells infected with the histo blood group O2 gene (Table I). In general, GalNAc-T6 showed activity and specificity similar to GalNAc-T3, including greater activity with Muc1a as compared with Muc1b (derived from the tandem repeat of MUC1) and unique activities with the HIV IIIB gp120 V3 loop and the fibronectin peptide. GalNAc-T3 and -T6 did not transfer to the hCG-β peptide derived from human choriogonadotropin β-chain. One striking difference was the peptide derived from CD59 (Table II), which was a reasonable substrate for GalNAc-T6 but a poor substrate for GalNAc-T3. The CD59 sequence was identified as a putative O-glycosylation site in a recent study by Rudd et al. (20). CD59 was not a substrate for GalNAc-T2 and -T4, and GalNAc-T1 showed very low activity. GalNAc-T6 had a Km of approximately 2 mM with CD59, but the Km of GalNAc-T3 with this substrate could not be measured because of the quantities of peptide required. A sequence from the tandem repeat of zonadhesin (21) was a substrate for GalNAc-T1, -T2, -T3, and -T6 (Table II). Analysis of the incorporation of GalNAc residues into the zonadhesin peptide by capillary zone electrophoresis and MALDI-TOF revealed that up to 6 mol of GalNAc could be incorporated. Minor differences in the rate of incorporation were found between the enzymes (not shown).
This study identified another novel unique substrate for GalNAc-T3 and -T6 in the prion protein (22). A disulfide-bonded loop sequence consisting of 36 amino acids is found in the prion protein, and this loop contains two N-glycosylation sites that are utilized both in the normal and disease form of the protein (179-CVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMC-214) (23,24). The 15 residues between these two N-linked glycosylation sites contain a cluster of likely O-glycosylation sites. No other obvious potential O-glycosylation sites are found in the prion protein. Two peptides derived from the prion loop, prion-a and prion-b (Table II), were tested as substrates, and only GalNAc-T3 and -T6 utilized the prion-a peptide, whereas no enzymes transferred to the prion-b peptide. GalNAc-T1 did show very low activity with the prion-a peptide. Terminal glycosylation with GalNAc-T3 and -T6 of the prion-a peptide revealed that up to four sites out of the five potential sites could be glycosylated (not shown), but the fourth site could not be quantitatively glycosylated. GalNAc-T3 and -T6 showed similar low activities with substrates based on MUC7 and rat submaxillary gland mucin (EA2 peptide), which are derived from mucin tandem repeats and represent efficient substrates for several other GalNAc-transferases including GalNAc-T1 and -T2 (Table I) and rat GalNAc-T5 (7).
A comparative analysis revealed differences in the kinetic parameters of purified recombinant GalNAc-T3 and -T6. GalNAc-T3 showed better catalytic activity with the HIV V3-loop peptide, and GalNAc-T6 showed better activity with the fibronectin peptide (Tables I and II). The former was due to a higher Vmax of GalNAc-T3, whereas the latter was related to both a lower Km and higher Vmax for GalNAc-T6.
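The way Km and Vmax combine to determine activity at a given peptide concentration can be illustrated with the Michaelis-Menten rate equation, v = Vmax·[S]/(Km + [S]). The Python sketch below is a minimal illustration that assumes simple Michaelis-Menten behavior (the text does not state which kinetic model was fitted); the function name and the two parameter sets are hypothetical and are not values from Tables I and II.

```python
def mm_rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten initial rate at substrate concentration s.

    s and km must share the same concentration unit; the result has
    the unit of vmax.
    """
    return vmax * s / (km + s)


# Hypothetical parameter sets, for illustration only.
enzyme_a = {"vmax": 1.0, "km": 1.0}   # higher Km, lower Vmax
enzyme_b = {"vmax": 2.0, "km": 0.5}   # lower Km, higher Vmax

for s in (0.1, 0.5, 2.0):  # substrate concentration, mM
    va = mm_rate(s, **enzyme_a)
    vb = mm_rate(s, **enzyme_b)
    print(f"[S]={s} mM: A={va:.3f}, B={vb:.3f}, B/A={vb / va:.2f}")
```

A lower Km raises the rate mainly at low substrate concentrations, whereas a higher Vmax raises it at all concentrations. As a further example, with a Km of 2 mM (the approximate value reported for GalNAc-T6 with the CD59 peptide) and 0.5 mM peptide, this model gives a rate of 20% of Vmax, which shows why assays run well below Km need large amounts of peptide to define the kinetic constants.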
Northern Blot Analysis of Human Organs and Tumor Cell Lines-Northern blots with mRNA from 23 human adult organs showed that GalNAc-T6 hybridized to a single mRNA of approximately 5 kbp in placenta and trachea, with weak signals in brain and pancreas (Fig. 4). The transcript size in brain appeared to be different, with two faint bands of approximately 6.5 and 3 kb. This pattern is different from that found previously for GalNAc-T3 (5), although both GalNAc-T3 and -T6 were expressed in placenta.
Further analysis of mRNA expression among a panel of pancreatic and breast adenocarcinoma cell lines showed a differential expression pattern for the two genes. GalNAc-T3 was strongly expressed in nine cell lines, weakly expressed in four cell lines, and not detected in five cell lines (Fig. 5). GalNAc-T3 expression in several of these cell lines has previously been confirmed by Western blotting (19). GalNAc-T6 transcript (5 kb) was detected in WI38 fibroblasts (Fig. 6) and in only one of the carcinoma cell lines analyzed, MiaPaca (Fig. 5). MiaPaca did not express GalNAc-T3.
Generation of Monoclonal Antibodies to Human GalNAc-T6 -A hybridoma line, UH7 (2F3), secreting IgG1 specifically reacting with GalNAc-T6 was selected for use. The specificity of the antibody was confirmed by immunocytology with Sf9 cells expressing human GalNAc-T1, -T2, -T3, -T4, and -T6, and by differential immunoprecipitation of GalNAc-transferase activities derived from the medium of infected Sf9 cells. This strategy for selection and characterization has been described in detail in Mandel et al. (16). UH7 reacted with Sf9 cells expressing GalNAc-T6 and not with Sf9 cells expressing other GalNAc-transferases or irrelevant proteins (not shown). UH7 immunoprecipitated recombinant GalNAc-T6 from spent medium of infected Sf9 cells, whereas GalNAc-T1, -T2, -T3, and -T4 activities did not immunoprecipitate. UH7 removed all GalNAc-T6 activity from the medium, but only low activity (approximately 10%) was recovered in the immunoprecipitate (not shown). UH7 did not react by Western blot analysis. Another antibody, UH8 (2E11), was reactive in Western blots but did not react with the native enzymes by immunocytology or immunoprecipitation. The same pattern was found for antibodies to GalNAc-T2 and -T3, where antibodies reactive with the native, active enzymes failed to react by Western blot, whereas other antibodies reactive by Western blot failed to react with the native proteins (16).
Immunolabeling of Cell Lines and Sperm Cells-The immunocytological staining of GalNAc-T3 and -T6 was distinct in human tumor cell lines (Fig. 7). Previously, we have shown that GalNAc-T3 is not expressed in fibroblast cell lines or connective tissue in normal and tumor specimens (16). This is significant because cultured fibroblasts and fetal and tumor tissues express a GalNAc-transferase activity capable of utilizing the fibronectin peptide substrate from oncofetal fibronectin (17). GalNAc-T6 was expressed in the fibroblast cell line WI38, which produces oncofetal fibronectin (14,25), as evaluated by immunostaining with mAb UH7 and Northern analysis (Figs. 6 and 7). mAb UH7 did not label the pancreatic carcinoma lines, SUIT2 and ASPC1, in agreement with the Northern analysis (Figs. 5 and 6). Anti-GalNAc-T3 mAb UH5 labeled SUIT2 cells heterogeneously (Fig. 7, panel B), which is in agreement with the Northern analysis and the finding that sublines derived from SUIT2 show differences in expression of GalNAc-T3 (Fig. 5) (19). GalNAc-T6 was also detected by immunocytology in A431 cells (not shown).
Previously, we showed that GalNAc-T3 was strongly expressed in ejaculated spermatozoa, whereas GalNAc-T1, -T2, and -T4 were not expressed (16). As shown in Fig. 7, GalNAc-T6 was not expressed in spermatozoa. Thus, GalNAc-T3 remains the only GalNAc-transferase found in spermatozoa. One likely substrate for GalNAc-T3 in spermatozoa is the zonadhesin cell membrane adhesion molecule, which plays a role in sperm-egg binding (21). Zonadhesin contains a mucin-like tandem repeat region with a degenerate eight-amino acid repeat sequence, and zonadhesin is heavily O-glycosylated. GalNAc-T3 utilized the zonadhesin peptide substrate but exhibited excess substrate inhibition (Table II).

DISCUSSION
This study demonstrates the existence of a subfamily of genes with high similarity within the large polypeptide GalNAc-transferase gene family. GalNAc-T3 and -T6 have high similarity in DNA and amino acid sequence throughout the coding region, identical organization for nine conserved intron/exon boundaries in the coding regions, and similar kinetic properties that are distinct from other GalNAc-transferases. These parameters strongly indicate that GalNAc-T3 and -T6 are derived from a late gene duplication event and raise the possibility that they provide genetic and functional redundancy. However, there were minor differences in the activities of GalNAc-T3 and -T6, and more importantly, the expression patterns of the two transferases were shown to be entirely different by Northern analysis and immunocytology. Thus, although they may represent duplicated genes, the products of the genes must have different functions and apparently do not provide functional back-up in all cells. The pattern of sequence similarity found between GalNAc-T3 and -T6 differs from that generally found between members of homologous glycosyltransferases.
Other members of the GalNAc-transferase family share sequence similarity primarily in the putative catalytic domain, consisting of a major part of the mid- and C-terminal sequence (6,7). The recently cloned rat GalNAc-T5 had a stem region of more than 500 residues, which underscores the fact that there are large differences outside the catalytic domain (7). One previous exception to this pattern of similarity was found with three of the α-1,4-fucosyltransferases (FUT3, FUT5, and FUT6) (26,27), which share an exceptionally high degree of sequence similarity. Interestingly, these three fucosyltransferases also share a similar simple genomic organization (one coding exon), and the genes localize to a single locus on chromosome 19 (28). A single putative common ancestral gene for the α-1,4-fucosyltransferase family was identified in cow, suggesting that a duplication event of this gene occurred very late in evolution (29). GALNT3 and GALNT6 share genomic organizations but do not co-localize to a single locus and are found on different chromosomes. Orthologous rodent genes for both GALNT3 (30) (GenBank accession number U70535) and GALNT6 (GenBank accession number AJ133523) exist, but it is possible that lower organisms only have one copy of these high similarity copy genes. Thus, Hagen and Nehrke (8) found no evidence of such closely related pairs of GalNAc-transferase genes in the GalNAc-transferase family in C. elegans.
The kinetic properties of GalNAc-T6 were similar to GalNAc-T3 and quite different from those of other GalNAc-transferases (Table I). It is clear that a number of peptide sequences derived from mucin tandem repeats with high density of O-glycosylation sites serve as substrates for several GalNAc-transferases; however, it is equally clear that each GalNAc-transferase has unique acceptor specificities. Unique substrate peptides have been identified for all the human GalNAc-transferases characterized to date (6,10). The unique specificities of GalNAc-T3 and GalNAc-T6 have been extensively studied, and several substrates have been identified. The first unique substrate defined for a GalNAc-transferase was the HIV IIIB gp120-derived peptide (5). Recently, Nehrke et al. (15) showed that this unique specificity was not solely an in vitro phenomenon but is reproduced in an in vivo model. Interestingly, immunocytological and Northern analysis revealed that GalNAc-T6 and not -T3 is expressed in CD4-positive H9 cells, which are often used for in vitro propagation of HIV.² However, the expression pattern in freshly isolated CD4-positive lymphocytes is not known. These data suggest that the repertoire of GalNAc-transferases in a cell plays a major role in determining the O-glycosylation pattern. Thus, a glycoprotein may be O-glycosylated at certain positions in one cell type but when expressed in another cell type that lacks a single GalNAc-transferase, specific positions may not be O-glycosylated.
Minor differences in the kinetic properties of GalNAc-T3 and -T6 were identified. It is important to note that for practical and economical reasons only representative panels of potential substrates can be studied, which may not reflect the full range of specificities exhibited by the enzymes. GalNAc-T6 showed almost 10-fold better activity with a CD59-derived substrate, which was recently identified as an in vivo O-glycosylation site (20). Detailed kinetic parameters of this substrate with GalNAc-T3 and -T6 could not be studied, since the Km of GalNAc-T6 was at least 2 mM. GalNAc-T3 and -T6 activities also differed for HIV and fibronectin peptide substrates. GalNAc-T6 showed the best kinetic properties with the fibronectin peptide. The finding that GalNAc-T6 showed relatively better kinetic properties with the fibronectin peptide may be significant, since this enzyme and not GalNAc-T3 is expressed in the WI38 fibroblast cell line, which produces O-glycosylated oncofetal fibronectin (Figs. 6 and 7). O-Glycosylation of the fibronectin IIICS domain was previously suggested to be related to an onco-developmentally regulated polypeptide GalNAc-transferase activity (17). Previously, we suggested that GalNAc-T3 was a candidate for this activity (10). However, recent immunocytological studies with anti-GalNAc-T3 mAbs revealed that GalNAc-T3 is not expressed in fibroblast cell lines, connective tissue cells in normal organs, or in connective tissue associated with carcinomas (16). Since GalNAc-T6 shows better kinetic properties for synthesis of oncofetal fibronectin and is expressed in WI38 fibroblasts, it is likely that this enzyme represents the GalNAc-transferase activity responsible for induction of oncofetal fibronectin. The finding that GalNAc-T6 shows a more restricted expression pattern than GalNAc-T3 and other GalNAc-Ts corroborates this hypothesis.
In preliminary immunohistological studies of squamous cell carcinomas of the mouth, which produce oncofetal fibronectin in the connective tissue compartment of carcinomas (25, 31, 32), we have not yet been able to demonstrate expression of GalNAc-T6 in the connective tissue compartment.³ GalNAc-T6 was not expressed in normal stratified squamous epithelium of the oral cavity, but variable expression was detected in the epithelial compartment of squamous carcinomas. Further studies are needed to evaluate this, but the initial implication is that expression of GalNAc-T6 is cancer-associated in stratified squamous epithelium.
The highly similar genes, GalNAc-T3 and -T6, are differentially expressed in human organs and cell lines. Previously, Northern analysis of human organs showed that GalNAc-T3 was highly expressed in pancreas and testis, with weaker expression in several other organs including kidney, prostate, and intestine (5). A similar expression pattern of GalNAc-T3 was found in the mouse, and additional organs with high expression levels were identified (30). GalNAc-T6 has a more restricted expression pattern and was expressed in placenta and trachea and weakly in brain and pancreas (Fig. 4). The Northern analysis of cell lines shown in Fig. 5 further illustrates the restricted expression of GalNAc-T6 and the differential expression of GalNAc-T3 and -T6. Human GalNAc-T1, -T2, and -T4 were found to be more ubiquitously expressed (2,5,6). The murine GalNAc-transferases designated GalNAc-T4 and -T5 have more restricted expression patterns, with high expression being found only in stomach, small intestine, colon, and sublingual glands (7,33). Recently, we have initiated the production of monoclonal antibodies to human GalNAc-transferases to circumvent inherent technical and practical problems and limitations associated with Northern analysis and in situ hybridization techniques (6,16). One antibody (UH7) to GalNAc-T6 described here showed the same characteristics as those developed to other GalNAc-transferases: no cross-reactivity with other GalNAc-transferases and exclusive specificity for the native enzyme protein. Application of this antibody further confirmed the differential expression patterns of GalNAc-T3 and -T6 (Fig. 7). The panel of antibodies developed provides a valuable tool for detailed studies of the role of individual members of the GalNAc-transferase family in normal cells and tissues as well as in disease states.
The novel strategy for the generation and selection of hybridomas applied here appears to have overcome past difficulties in making antibodies to glycosyltransferases.
The identification of potential O-glycosylation sites in the prion protein is intriguing. Presently, there is no evidence that the prion protein carries O-glycans (22); however, this may relate to lack of contact with GalNAc-T3 and/or -T6. Northern analysis did show expression of GalNAc-T6 (and not GalNAc-T3) in brain using whole brain mRNA (Fig. 4), but we have no evidence that GalNAc-T6 is co-expressed with the prion protein. Full appreciation of the expression pattern of these enzymes in brain tissues requires detailed immunohistological analysis with monoclonal antibodies. If co-expression occurs, it is highly likely that O-glycosylation would occur in the prion sequence. This is supported by the in vivo data provided for the HIV V3-loop sequence by Nehrke et al. (15) but also by the fact that the acceptor sequence in the prion protein is contained on a sequence with flanking N-glycosylation (22), which would be predicted to be exposed to the appropriate Golgi compartment.
Despite evidence indicating distinct specificities for individual GalNAc-transferases, the first knock-out experiment of a GalNAc-transferase revealed no apparent phenotypic changes in homozygously deficient mice (34). However, it is now clear that the targeted gene was not the intended GalNAc-T1. Instead, a close homologue with high sequence similarity and a similar genomic organization was targeted (35). The inactivated gene has not yet been cloned and expressed; nonetheless, the lack of an observed phenotype when the gene was targeted may be explained by several possibilities: the gene lacks function, its function is dispensable, or its function is compensated for by another gene. The impaired gene has higher sequence similarity to GalNAc-T1 than to other GalNAc-transferases. Furthermore, the genomic organizations of the two genes appear to be identical (35).⁴ This may indicate that these two genes represent another subfamily in the GalNAc-transferase family, but it is also important to note that one pseudogene derived from GalNAc-T1 has been identified (36). The information gathered here for the GalNAc-T3/T6 subfamily is therefore an important basis for studies of the function of both of these genes using the knock-out strategy. A parallel type of mannosyl O-glycosylation in yeast has been studied in more detail; although this type of protein glycosylation is initiated by a family of seven homologous mannosyl transferases, disruption studies showed that at least two genes must be targeted to yield changes in growth and viability (37).
In conclusion, the present results demonstrate the existence of subfamilies of polypeptide GalNAc-transferase genes. Members of one subfamily have similar kinetic properties; however, they were shown to have entirely different expression patterns, suggesting that they play different roles in different cell types.