Substrate specificities of three members of the human UDP-N-acetyl-alpha-D-galactosamine:Polypeptide N-acetylgalactosaminyltransferase family, GalNAc-T1, -T2, and -T3.

Mucin-type O-glycosylation is initiated by UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferases (GalNAc-transferases). The role each GalNAc-transferase plays in O-glycosylation is unclear. In this report we characterized the specificity and kinetic properties of three purified recombinant GalNAc-transferases. GalNAc-T1, -T2, and -T3 were expressed as soluble proteins in insect cells and purified to near homogeneity. The enzymes have distinct but partly overlapping specificities with short peptide acceptor substrates. Peptides specifically utilized by GalNAc-T2 or -T3, or preferentially by GalNAc-T1 were identified. GalNAc-T1 and -T3 showed strict donor substrate specificities for UDP-GalNAc, whereas GalNAc-T2 also utilized UDP-Gal with one peptide acceptor substrate. Glycosylation of peptides based on MUC1 tandem repeat showed that three of five potential sites in the tandem repeat were glycosylated by all three enzymes when one or five repeat peptides were analyzed. However, analysis of enzyme kinetics by capillary electrophoresis and mass spectrometry demonstrated that the three enzymes react at different rates with individual sites in the MUC1 repeat. The results demonstrate that individual GalNAc-transferases have distinct activities and the initiation of O-glycosylation in a cell is regulated by a repertoire of GalNAc-transferases.

To date three human UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferases (1-3) (GalNAc-transferases) 1 have been identified and characterized (1-4). Although the three GalNAc-transferases show similarities in primary structure with regard to predicted domain structures, sequence motifs, and conserved cysteine residues, the overall amino acid sequence similarity of only 45% suggests that the members of the GalNAc-transferase family have undergone significant changes during evolution. The genes encoding these enzymes are located on different chromosomes and have distinct structures, although some intron positions are conserved, suggesting an evolutionary relationship. 2 The genes are differentially expressed in organs as revealed by Northern analysis (1-3); in particular GalNAc-T3 exhibited a restricted expression pattern. One question addressed here is whether these three GalNAc-transferases are isoenzymes with redundant or unique functions.
Hennet et al. (5) recently addressed this question by analyzing mice rendered deficient in a close homologue of GalNAc-T1 by gene targeting. No obvious phenotypic differences were observed, and preliminary characterization of the residual GalNAc-transferase activity with a few substrates did not reveal differences in enzyme activities. There was a reduction in GalNAc-transferase activity in ES cells in which the gene was inactivated. It is difficult to assess the full significance of these findings because the enzyme deleted in these studies is not well characterized with respect to substrate specificity and tissue expression pattern. Disruption of Dol-P-Man:polypeptide mannosyltransferases, which initiate O-glycosylation in yeast, showed that loss of one enzyme did not severely affect cell growth or O-glycosylation. In contrast, disruption of two or more genes affected growth or was lethal (6).
The parameters that determine sites of O-glycan attachment to glycoproteins are poorly understood (7-10). Unlike N-linked glycosylation and most other types of protein glycosylation, a consensus peptide sequence motif for acceptor sites has not emerged for either GalNAc- or Man-type O-glycosylation (6, 8-11) despite extensive studies of the acceptor substrate specificities of different GalNAc-transferase preparations (8, 12-14). Analysis of sequences around confirmed sites O-glycosylated in vivo failed to reveal a simple model for prediction of glycosylation (9,15,16). Attempts to infer sequence specificity from analysis of the substrate specificities of GalNAc-transferase activities obtained from extracts are likely to be misleading due to variable expression of a number of different GalNAc-transferases, which may show distinct specificities for acceptor substrates. Some GalNAc-transferases may compete for acceptor substrate sites, even if they do not glycosylate the acceptor substrate site (17). Mathematical models designed to predict sequence preferences of O-glycan sites are flawed because of the limited number of sites identified to date and the selected class of glycoproteins these represent (9,16), and because the analyzed sites were obtained from a number of different organisms and cell types that probably express different repertoires of GalNAc-transferases.
In the present study we investigated the in vitro specificity and kinetic properties of purified recombinant GalNAc-transferases, GalNAc-T1, -T2, and -T3. The results demonstrate unique but partly overlapping acceptor substrate specificities among the three enzymes. Specific sites on peptides were glycosylated; however, there were differences in kinetic properties at these sites. The same sites were glycosylated in a 20-mer or a 105-mer peptide based on the MUC1 tandem repeat. Selective specificity of GalNAc-T3 for a 6-mer sequence in fibronectin was maintained with the intact fibronectin molecule. The results indicate that the acceptor substrate specificities of the GalNAc-transferases are largely dependent on the primary sequence of the acceptor substrate.

Expression and Purification of Recombinant GalNAc-transferases
Expression constructs of soluble human GalNAc-T1, -T2, and -T3 were prepared in the vector pAcGP67 as described previously (2,3). The constructs for GalNAc-T1 and -T2 were designed to correspond to the previously identified N-terminal sequence of the purified soluble enzymes (1,2). The N terminus of the expressed recombinant forms included the following residues (underlined) derived from the vector construct: T1, NH2-DLGSRGL; T2, NH2-DPGTLLEPKKK; and T3, NH2-DLGSSTMER-. Sf9 cells were grown at 27°C in TMN-FH medium containing 10% fetal calf serum (Pharmingen). Plasmids pAcGP67-GalNAc-T1-sol, pAcGP67-GalNAc-T2-sol, and pAcGP67-GalNAc-T3-sol were cotransfected with Baculo-Gold DNA (Pharmingen) as described previously (3). Recombinant Baculovirus was obtained after two successive amplifications in Sf9 cells grown in serum-containing medium, and titers of virus were estimated by titration in 24-well plates with monitoring of enzyme activities. Controls included soluble human blood group A GalNAc-transferase (18), and the enzymatically non-functional blood group O2 allele (19). For large scale expression in serum-free medium, Sf9 cells were adapted to growth in 2.5 expanded surface area roller bottles (in vitro) in 200 ml of 30% Grace containing 10% fetal calf serum and 70% SF-900 II medium (Life Technologies) in a 27°C roller at 0.6 rpm. Cells which could be loosened by centrifugal agitation were transferred to regular roller bottles maintained upright in a shaking bath (27°C, 140 rpm) and 200 ml of SF-900 II medium containing 1 mM glutamine and 0.025% F-68 (Life Technologies) were added. Cells were grown for 2 days, split 1:1 with SF-900 II medium, and grown for another 2 days. Cell densities approached approximately 1 × 10^6/ml. Cells were infected as follows: cells were harvested and resuspended in 100 ml of SF-900 II medium in the original shaker bottle and infected with 1:1,000 to 1:5,000 of a stock of the second amplification of virus (3).
After 1 h, 350 ml of SF-900 II medium containing 2 mM glutamine, 0.1% glucose, 0.25% lipid mixture (Life Technologies), and 0.2% yeast extract ultrafiltrate (200 g/liter) (Sigma) were added and flasks were shaken for 72-96 h (27°C, 140 rpm). The spent medium was harvested by centrifugation at 1,000 rpm and stored at 4°C in the presence of 0.02% NaN3. Attempts to use Sf9 cells maintained throughout in SF-900 II medium failed to yield significant quantities of recombinant proteins.

Purification of Recombinant GalNAc-transferases from Serum-free Medium
Purification of recombinant enzymes from serum-free medium was performed as follows (Table I). Approximately 400 ml of medium containing 2-5 units of enzyme were harvested and processed individually. Medium was dialyzed against 25 mM Bis-Tris, pH 6.0, 10 mM NaCl, 2 mM MnCl2, and 2 mM EDTA, centrifuged at 10,000 × g, and passed through a 120-ml DEAE (Sigma) column equilibrated in dialysis buffer without EDTA. The excluded fractions were applied to a 30-ml S-Sepharose Fast-flow (Pharmacia) column equilibrated in the same buffer and GalNAc-transferases were eluted with a gradient of NaCl from 10 to 500 mM. Fractions containing enzyme were pooled and simultaneously dialyzed and concentrated using a Spectrum Dialysis Concentrator with 10,000 cut off (Spectrum). The concentrated GalNAc-T1 and -T2 were diluted 5-fold in Bis-Tris buffer with 10 mM NaCl, applied to a Mono-S column (HR 5/5, Pharmacia), and eluted with a NaCl gradient from 10 to 500 mM. GalNAc-T3 was not subjected to the second cation exchange chromatography as this step inactivated the enzyme. Mono-S fractions of GalNAc-T1 and -T2 as well as concentrated S-Sepharose fractions of GalNAc-T3 were further purified by S12 gel filtration chromatography (HR 10/30, Smart System, Pharmacia) run in phosphate buffer (pH 7.4) with 1.15 M NaCl. The purity and protein concentration of final fractions of the GalNAc-transferases were assessed by S12 gel filtration chromatography (PC3.2/30, Smart System, Pharmacia) and SDS-PAGE using bovine serum albumin as a standard.

Polypeptide GalNAc-transferase Assay
Standard assays were performed in 50 µl of total reaction mixtures containing 25 mM Tris (pH 7.4), 10 mM MnCl2, 0.25% Triton X-100, 50 µM UDP-[14C]GalNAc (2,000 cpm/nmol) (Amersham), 5 mM 2-mercaptoethanol (only in assays to determine Km and Vmax), 0.01-0.5 milliunits of GalNAc-transferase, and 25 µg of acceptor peptide (see Table II for structures). Peptides were synthesized by ourselves or by Carlbiotech (Copenhagen) and Neosystems (Strasbourg), and quality was ascertained by amino acid analysis and mass spectrometry. Products were routinely determined by scintillation counting after Dowex-1 formic acid cycle chromatography. At least once for all combinations of enzyme sources and peptides, the products were evaluated by C-18 reverse phase chromatography (PC3.2/3 or RPC C2/C18 SC2.1/10 Pharmacia, Smart System) with scintillation counting of peptide peak fractions. Finally, peptides and products produced by in vitro glycosylation were in most cases also confirmed by mass spectrometry.
[...] specific activity estimated with Muc2 peptide). Assays were performed in duplicate or quadruplicate.
Table I, footnote a: One unit of enzyme is defined as the amount of enzyme that will transfer 1 µmol of GalNAc from UDP-GalNAc in 1 min using the standard reaction mixture as described under "Experimental Procedures" with 25 µg of Muc2 peptide as acceptor substrate for GalNAc-T1, 25 µg of Muc1b as acceptor substrate for GalNAc-T2, and 25 µg of Muc1a as acceptor substrate for GalNAc-T3.
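The conversion from scintillation counts to enzyme activity implied by the assay description can be sketched as follows. This is a minimal illustration, assuming the conventional definition of one unit as 1 µmol of GalNAc transferred per minute (so that 1 milliunit equals 1 nmol/min) and the stated specific radioactivity of 2,000 cpm/nmol. The blank subtraction, the 30-min incubation time, and the example counts are hypothetical values chosen for illustration and are not taken from the paper.

```python
# Sketch: convert counts from a UDP-[14C]GalNAc assay into enzyme activity.
# Assumptions (not from the paper): a no-enzyme blank is subtracted and the
# incubation time is known; example numbers below are hypothetical.

def nmol_transferred(cpm_sample, cpm_blank, cpm_per_nmol=2000.0):
    """nmol of GalNAc incorporated, from blank-corrected counts."""
    return (cpm_sample - cpm_blank) / cpm_per_nmol


def milliunits(nmol, minutes):
    """Activity in milliunits, taking 1 unit = 1 umol/min (1 mU = 1 nmol/min)."""
    return nmol / minutes


def specific_activity_units_per_mg(mu, mg_protein):
    """Specific activity in units/mg from milliunits and mg of protein."""
    return (mu / 1000.0) / mg_protein


if __name__ == "__main__":
    incorporated = nmol_transferred(cpm_sample=1300.0, cpm_blank=100.0)
    activity = milliunits(incorporated, minutes=30.0)
    print(f"{incorporated:.2f} nmol incorporated, {activity:.3f} mU")
```

With these definitions, a preparation of 0.5 units/mg corresponds to 0.5 nmol of GalNAc transferred per minute per µg of protein.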
Assays to determine the metal ion requirement were performed in standard reaction mixtures without MnCl2 using 0.25 milliunits of GalNAc-transferase (specific activity estimated with Muc5C peptide) purified by gel filtration (run in phosphate-buffered saline with 1 M NaCl) in the absence of MnCl2. Analysis of GalNAc-transferase activity without addition of Mn2+ revealed no detectable activity. The activity was assessed with Muc2 peptide in the presence of 5, 10, or 20 mM CaCl2 or MgCl2 with MnCl2 as control.
Preparative glycosylation of peptides was performed with 10-50 nmol of peptide, 0.5-2.5 mmol of UDP-[14C]GalNAc (10-fold excess of potential Ser/Thr acceptor sites), and 0.25-5 milliunits of GalNAc-transferase (specific activity determined using the relevant acceptor peptide to be glycosylated) in a final volume of 200 µl. Reactions were incubated for 24-48 h at 37°C, and at 18-24 h additional enzyme and UDP-GalNAc (50% of the amounts originally added) were added. A peptide was considered terminally glycosylated by a GalNAc-transferase when addition of enzyme and UDP-GalNAc did not result in further incorporation over 4 h as estimated by [14 [...]
[...] heat-inactivated (5 min 95°C) enzyme. SDS-PAGE Western blotting using monoclonal antibodies FDZ to fibronectin and FDC-6 and 5C10, which specifically detect the oncofetal fibronectin epitope, as well as enzyme-linked immunosorbent assay using the same antibodies.
Figure legend (fragment): The mass analysis corresponds with the molecular weights estimated by SDS-PAGE in panels A-C. The higher than predicted mass of GalNAc-T1 is most likely due to N-glycosylation, and this is also suggested by the heterogeneous peaks obtained. GalNAc-T2 has only one potential N-linked glycosylation site, and this was not predicted to be utilized (2). In agreement with this, the obtained mass of GalNAc-T2 corresponds well with the predicted mass. The observed mass of GalNAc-T3 was lower than predicted, and this could be attributed to cleavage in the stem region. In contrast to T1 and T2, the expression construct of T3 included the entire stem region, as the cleavage site for this enzyme is unknown.

Monitoring of in Vitro O-Glycosylation by Capillary Electrophoresis
A reaction mixture for preparative glycosylation was used with cold 1-2 mM UDP-GalNAc and 0.5-1 mM acceptor peptides in a total volume of 50-100 µl. The assay was incubated in the sample carousel at 30°C, and injections were performed at 30-60-min intervals. Capillary zone electrophoresis was performed on an Applied Biosystems model HT270 (Perkin-Elmer). Coated fused silica capillaries, 72 cm × 75 µm, with a 35-cm length between sample injection and optical cell were used. Electrophoresis was performed at 30°C using 50 mM phosphate buffer (pH 2.5). Voltage across the capillary was 20 kV in the positive mode with the anode at the injection side, and the runs were monitored at 210 nm. At the beginning of each cycle the capillary was flushed with 0.1 M NaOH for 2 min, followed by flushing with 50 mM phosphate buffer (pH 2.5) for 4 min. The composition of the separated products was assessed in two ways: (i) parallel reactions were stopped at time points with maximum peak height for each component, and glycopeptides were then purified by C-18 HPLC and analyzed by MALDI-TOF and amino acid sequence analysis; (ii) purified standard glycopeptides were co-injected with reaction mixtures to assess co-migration of products.

Structure Determination
Matrix-assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-TOF)-All mass spectra, except for the Muc1 105-mer peptide, were acquired on a Voyager-Elite MALDI time of flight mass spectrometer (Perseptive Biosystem Inc., Framingham, MA), equipped with delayed extraction. The MALDI matrix was a 9:1 mixture of 25 g/liter 2,5-dihydroxybenzoic acid and 25 g/liter 2-hydroxy-5-methoxy benzoic acid (Aldrich) dissolved in a 2:1 mixture of 0.1% trifluoroacetic acid in water and acetonitrile. Samples dissolved in 0.1% trifluoroacetic acid to a concentration of approximately 2 pmol/µl were prepared for analysis by placing 1 µl of sample solution on a probe tip followed by 1 µl of matrix. All spectra were obtained in the linear mode and calibrated using external calibration. Data processing was carried out using GRAMS/386 software.
All mass spectra for the Muc1 105-mer peptide were obtained on a Bruker reflex time of flight mass spectrometer (Bruker-Franzen Analytik, Bremen, Germany). Data were acquired by a LeCroy 9450A 400 megasamples/s digital storage oscilloscope (LeCroy Corp., Chestnut Ridge, NY) from which single shot spectra were transferred to a Macintosh Quadra 950 computer (Apple Computer Inc., Cupertino, CA) via a National Instruments NI DAQ GPIB controller board (National Instruments, Austin, TX). Samples were dissolved in 0.1% trifluoroacetic acid to a concentration of approximately 2 pmol/µl. One µl of sample solution was placed on a stainless steel probe tip followed by 1 µl of matrix solution (α-cyano-4-hydroxycinnamic acid dissolved in 70% acetonitrile, 15 g/liter). All mass spectra were obtained in the linear mode and calibrated using a singly charged matrix ion, which provided a mass accuracy of approximately 0.1%. Data processing was carried out using the computer program LaserOne, which was written in ThinkC (Symantec Corporation, Cupertino, CA) by M. Mann and P. Mortensen, EMBL, Heidelberg, Germany.
Asp-N Endoproteinase Digestion-Reactions containing Muc1 105-mer (2 nmol) in 100 µl of 0.1 M sodium phosphate (pH 8.0) with 0.2 mg of Asp-N endoproteinase were incubated for 18 h at 37°C. The digest was injected directly into a reverse phase HPLC column and cleaved peptides were eluted by a gradient of 0-90% acetonitrile in 0.1% trifluoroacetic acid.
Amino Acid Sequencing-Automated Edman degradation was performed on a Knauer 910 pulsed-liquid gas-phase sequencer using polyvinylidene difluoride membranes as immobilizing support. The peptide samples were dissolved in 0.1% trifluoroacetic acid and spotted onto Polybrene-coated polyvinylidene difluoride membranes. The PTH derivatives were separated on-line on a 2 × 250-mm column packed with 5 ml of SuperSpher C18 (Merck) using a narrow bore HPLC.

Expression and Purification of Secreted Forms of GalNAc-T1, -T2, and -T3
Expression of recombinant GalNAc-transferases in serum-free media yielded 2-5 units/400 ml in shaker flasks harvested 72-96 h post-infection. The combination of anion and cation exchange chromatography resulted in concentration and partial purification of the enzymes (Table I, Fig. 1). A gel filtration step yielded enzyme preparations in high yields with few contaminants as estimated by Coomassie staining of SDS-PAGE gels. The specific activities of transferase preparations were estimated at 0.5-0.6 units/mg. GalNAc-T3 consistently produced two faster migrating bands, and we believe that these are different glycoforms and/or proteolytic products because all three components were recognized by a monoclonal antibody to the enzyme as detected by Western blotting (not shown). Furthermore, immunoprecipitation of an expressed full-length GalNAc-T3 enzyme with a C-terminal myc-tag using an antibody to the myc-tag also produced the same banding pattern. 3

Comparison of the Kinetic Properties of GalNAc-T1, -T2, and -T3
The kinetic parameters of the three GalNAc-transferases in this study are potentially affected by the following factors: (i) the enzymes are derived from human cDNA sequences; (ii) the enzymes are soluble constructs with GalNAc-T1 and -T2 having an N-terminal sequence similar to the forms originally purified and GalNAc-T3 having a longer N-terminal sequence designed to exclude the hydrophobic retention signal and 12 residues from the predicted stem region (3); and (iii) the enzymes are expressed as secreted products in insect cells grown in serum-free medium. Comparisons of the kinetic parameters to purified human enzymes in membrane bound and/or secreted forms would determine whether or not these factors influence the results; however, this is currently not possible.
3 T. Nilsson, personal communication.
The following observations suggest that the data obtained are not significantly affected by the design of the expression constructs and the expression system. The kinetic parameters of purified, recombinant GalNAc-T2 were found to be similar to those of the originally purified enzyme (2, 17) with respect to Km for the acceptor substrates including Muc2. The acceptor substrate specificity of GalNAc-T3 is unique and consistent with an activity detected in organ extracts that was not seen with GalNAc-T1 or -T2 (3,17). Human GalNAc-T1 has not been purified, but data with purified bovine colostrum GalNAc-T1 and recombinant GalNAc-T1 (bovine placenta cDNA) expressed in the Baculo-system or in COS-7 cells are available (1,4,20). The acceptor substrate specificity of purified GalNAc-T1 matched that of bovine GalNAc-T1 expressed in COS-7 cells with respect to Muc1a, Muc1b, and Muc2 peptides. In contrast, however, the Km for UDP-GalNAc estimated with purified, recombinant GalNAc-T1 (human cDNA) in the present study was higher than that measured with the purified bovine enzyme (14,21) and recombinant bovine enzyme expressed in COS-7 cells (22).
Acceptor Substrate Specificities-An analysis of acceptor substrate specificity is summarized in Table II. Peptides derived from the tandem repeat region of secreted mucins, MUC2 and MUC5AC, which have a high density of Ser/Thr, were utilized efficiently by all three enzymes. The MUC2 sequence exhibited the lowest Km with all three enzymes. Previously, affinity chromatography of a GalNAc-transferase preparation from placenta with the Muc2 peptide as ligand was found to selectively bind GalNAc-T2, and the non-bound GalNAc-transferase activity was measured to have a higher Km for the Muc2 peptide (17). The relatively low Km for recombinant GalNAc-T2 (Table II) is consistent with this finding. The estimated Km of the non-bound fraction was 0.254 mM, which is close to the Km values obtained for both GalNAc-T1 and -T3 (Table II).
All three recombinant enzymes utilized MUC1-derived peptide substrates; however, the kinetics with each of the peptides were very different. GalNAc-T2 showed a lower Km for MUC1 peptides that included the GSTAP sites, whereas T1 and T3 showed lower Km for the peptides with GVTSA sites. This differential activity was also found in previous analysis of the specificity of crude organ extracts (17). Enzyme activity was greater for MUC1 peptides with GVTSA in extracts from rat testis and salivary glands, and human placenta, whereas enzyme activity in extracts from rat kidney and human liver was greater with peptides that included the GSTAP sites. It is hypothesized that these differences in specificity among organ extracts reflect the differential expression of GalNAc-transferases; however, interpretation of these results is restricted by our limited knowledge of the expression pattern of the enzymes and the possible expression of additional uncharacterized GalNAc-transferases. 4 Acceptor substrate peptides selectively utilized by a single GalNAc-transferase were identified. Previously, an acceptor peptide, HIV IIIB-V3, derived from the V3-loop of HIV gp120, was found to be glycosylated exclusively by GalNAc-T3 (3), and this was verified using the purified recombinant enzymes (Table II). Another peptide derived from fibronectin (Table II) was also found to be glycosylated only by GalNAc-T3; its Km could not be estimated, apparently because of substrate inhibition at concentrations over 1 mM (not shown). Essentially, no incorporation of GalNAc into the fibronectin peptide was detected when GalNAc-T1 or -T2 were used, with less than 5% incorporation after 24-h reactions. The peptide substrate human choriogonadotropin β chain contains serine sites previously shown to be utilized by Muc2-affinity purified placenta GalNAc-T2 (17). Only recombinant GalNAc-T2 utilized this substrate, and the Km was higher than for most other peptides (1.20 mM). The peptide derived from a fragment (LSESTTQLP-) of ovine submaxillary mucin (23), which exhibits sequence similarity to the N terminus of glycophorin A, was preferentially utilized by GalNAc-T1 with a Km of 0.30 mM. GalNAc-T3 showed low activity with this peptide.
A serine glycosylation site on erythropoietin was previously shown to be an efficient in vitro serine acceptor sequence with a Km of 4.4 mM for purified porcine GalNAc-transferase (24). This porcine enzyme was later suggested to represent a GalNAc-T1 homologue (25). Hagen et al. (4) found that the same peptide with a Thr substitution of the Ser site was a 58-fold better substrate for bovine GalNAc-T1 than the Ser-containing peptide. The same Ser-containing erythropoietin peptide was evaluated with the three human recombinant GalNAc-transferases (Table II). There was very poor in vitro O-glycosylation of this peptide by the three enzymes, and Km values could not be determined. The Thr-substituted peptide was a better substrate for all three enzymes. To evaluate the kinetics of glycosylation of the erythropoietin peptides, the reaction was monitored by CE (Fig. 2), which showed that all three enzymes exhibited similar reaction patterns with both peptides. The data presented here further support the hypothesis that one GalNAc-transferase may utilize both Ser and Thr sites (2,4,24). However, as yet the best Ser-containing peptide identified is the human choriogonadotropin β chain peptide specifically utilized by GalNAc-T2, and the Km of this reaction is more than 100-fold higher than that of the best Thr-containing peptide substrate. In agreement with the results presented here, the in vivo glycosylation of the single Ser glycosylation site in erythropoietin is less efficient than that of the Thr-substituted site (26). It remains possible that undiscovered enzymes will show better reaction kinetics with the Ser-containing peptide.
Donor Substrate Specificities-The Km values for UDP-GalNAc with GalNAc-T2 and -T3 were 10 and 29 µM, respectively, and GalNAc-T1 was found to have a higher Km of 62 µM. Purified bovine colostrum GalNAc-T1 and recombinant T1 expressed in COS-7 cells were previously found to have Km values of 5-9 µM. The reason for this discrepancy is not known at this time.
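For readers who want to see how Km and Vmax values of this kind can be extracted from initial-rate data, the following is a minimal sketch using the Hanes-Woolf linearization of the Michaelis-Menten equation, S/v = S/Vmax + Km/Vmax. The fitting procedure actually used in this study is not stated in the text, so this method is an assumption, and the substrate concentrations and the synthetic velocities in the usage example are hypothetical.

```python
# Sketch: estimate Km and Vmax from initial rates via the Hanes-Woolf plot.
# The paper does not state its fitting method; this is one common approach.

def michaelis_menten(s, km, vmax):
    """Initial velocity at substrate concentration s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, velocity):
    """Least-squares line through (S, S/v); returns (Km, Vmax)."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


if __name__ == "__main__":
    # Hypothetical donor concentrations (uM) and noise-free synthetic rates.
    conc = [5.0, 10.0, 25.0, 50.0, 100.0, 200.0]
    rates = [michaelis_menten(s, km=62.0, vmax=1.0) for s in conc]
    km, vmax = fit_hanes_woolf(conc, rates)
    print(f"Km = {km:.1f} uM, Vmax = {vmax:.2f} (relative units)")
```

With real, noisy data a direct nonlinear fit is generally preferable, because linear transformations distort the error structure.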
Analysis of the specificity of the enzymes for other donor substrates (Table III) revealed that, with the peptide panel tested, GalNAc-T1 and -T3 showed exclusive specificities for UDP-GalNAc at a Muc2 acceptor concentration of 500 µM. In contrast, GalNAc-T2 also utilized UDP-Gal, but only in combination with the Muc2 acceptor peptide, which is the best acceptor substrate for this enzyme. The Km of GalNAc-T2 for UDP-Gal with Muc2 was estimated at 27 µM, less than 3-fold higher than that for UDP-GalNAc, indicating that this may be a naturally relevant substrate.
Requirements for Divalent Metal Ions-All three enzymes were found to show a strict requirement for Mn2+; no activity was detected when Mg2+ or Ca2+ was substituted for Mn2+ (not shown). Previous studies have shown that porcine GalNAc-transferase could be partially reactivated by 20 mM Co2+, Cd2+, Ni2+, and Cu2+, but not Mg2+ or Ca2+ (27). Matsuura et al. (28) found that a GalNAc-transferase activity capable of glycosylating the fibronectin-derived peptide (Table II) was equally active with Mg2+ or Ca2+ as with Mn2+. Thus, recombinant human GalNAc-T3 was different with respect to metal ion requirements.

Substrate Specificities: The MUC1 Model
GalNAc glycosylation of acceptor peptides with multiple potential acceptor sites (as is always found in mucin-derived sequences) may be random, or all sites in a peptide may be utilized simultaneously, or there may be an ordered processing that is determined by the kinetic properties for glycosylation of each acceptor site. Furthermore, the length of the acceptor peptide may influence the order of glycosylation, especially if the size of the peptide contributes to formation of defined tertiary structure. An initial observation that the GalNAc-transferases showed different activities with peptides covering the GVTSA and the GSTAP sequons in MUC1 (Table II) prompted us to use the Muc1 peptide as a model for studying the order of glycosylation of a peptide with multiple acceptor sites.
Analysis of a 105-mer Peptide Representing 5 Repeats of MUC1-Terminal glycosylation of a 105-residue peptide based on the MUC1 tandem repeat by using either GalNAc-T1 or -T2 produced a glycopeptide with 15 mol of GalNAc residues incorporated (Fig. 3, panel A). Endo-Asp cleavage of the unglycosylated and glycosylated peptides yielded 20-residue peptides corresponding to a single repeat, and minor amounts of a 19-mer peptide as a result of partial loss of the C-terminal Pro residue. MALDI-TOF analysis of the HPLC purified digest showed that each tandem repeat moiety contained 3 mol of GalNAc residues. A minor HPLC peak from the GalNAc-T1 reaction was found to contain only 2 mol of GalNAc, and MALDI-TOF analysis of the intact glycosylated 105-mer detected some peptide with only 14 mol of GalNAc incorporated. Amino acid sequencing of the glycosylated 20-residue peptides revealed that both GalNAc-T1 and -T2 utilized T in GVTSA and ST in GSTAP (Fig. 3, panel B). Evidence for glycosylation of serine included disappearance of the Ser-PTH peak and appearance of a pseudo-peak between Asp-PTH and Asn-PTH. Evidence of threonine glycosylation was a reduction in the Thr-PTH peak and emergence of pseudopeaks between Asn-PTH and Ser-PTH. Previously, we observed that threonine GalNAc glycosylation resulted in complete disappearance of the Thr-PTH peak, and emergence of pseudopeaks around Gln-PTH (17), but the data presented here were obtained with a different automated sequencer, which has a different sensitivity for detection of Ser-PTH and Thr-PTH derivatives. The peptide fragment with only 2 mol of GalNAc incorporated by GalNAc-T1 was also sequenced and found to be glycosylated in Thr at GVTSA and Thr at GSTAP (not shown).
Fig. 3 legend (fragment): Below: mass spectra analysis of endo-Asp cleaved and HPLC purified digests. Two major HPLC peaks were separated and the difference in mass was consistently 98-99, suggesting within the mass accuracy 0.1% that the C-terminal Pro was partially lost in the digestion. Both GalNAc-T1 and -T2 digests yielded two fractions with masses of 1900 and 1800 suggesting that 3 GalNAc residues were incorporated. Panel B, amino acid sequencing of the HPLC purified 20-mer peptide and glycopeptides. Only cycles corresponding to the five potential glycosylation sites are shown. In all cases the first Thr in -DTR- yielded a single Thr-PTH peak and the last Ser in GVTSA yielded a single intact Ser-PTH peak. In contrast, the single Ser in GSTAP showed no Ser-PTH peak, but a pseudopeak between Asp-PTH and Asn-PTH standards. The two Thr's in GSTAP and GVTSA yielded a split pattern with a Thr-PTH peak and pseudopeaks around Gln-PTH. The amino acid sequencing results, combined with the mass spectrometry result demonstrating that each glycopeptide had three GalNAc residues incorporated, indicate that the same three positions are utilized by the two enzymes.
Fig. 4 legend (fragment): [...] (Table II), and three of these are utilized by all three enzymes. GalNAc-T2 also glycosylates a fourth site after the first three sites are occupied and this corresponds to the N-terminal Thr in TAP24. The three sites utilized by all enzymes were Thr in GVTSA and ST in GSTAP. The kinetics of the reactions were different with respect to timing of the order of addition of the first, second, and third mole of GalNAc (see "Results" for details). Injections were performed directly from the ongoing glycosylation reaction at the time points indicated. Assignments of peak identity with each enzyme were performed by HPLC purification of products of parallel run reactions followed by MALDI-TOF and amino acid sequencing (Fig. 5). Peak B represented monoglycosylated peptide with GalNAc attached at Thr in GVTSA for GalNAc-T1 and -T3 reactions, and Thr in GSTAP for GalNAc-T2 reactions. The diglycosylated peptide for all three enzymes represented incorporation in Thr in GVTSA and GSTAP, and the triglycosylated peptide represented incorporation in Ser in GSTAP.
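The way a GalNAc count follows from a MALDI-TOF mass difference can be sketched as below. This is an illustrative calculation only: it uses the average residue mass of N-acetylhexosamine (C8H13NO5, about 203.19 Da per added GalNAc), and the peptide masses in the usage example are hypothetical rather than values from this study. The helper names are introduced here for illustration.

```python
# Sketch: infer the number of GalNAc residues from a mass shift.
# HEXNAC_RESIDUE is the average mass added per O-linked GalNAc (C8H13NO5).
HEXNAC_RESIDUE = 203.19


def galnac_count(mass_unglycosylated, mass_glycosylated):
    """Nearest whole number of GalNAc residues explaining the mass shift."""
    return round((mass_glycosylated - mass_unglycosylated) / HEXNAC_RESIDUE)


def shift_is_resolvable(mass, relative_accuracy=0.001):
    """True if the mass error is below half of one GalNAc increment."""
    return mass * relative_accuracy < HEXNAC_RESIDUE / 2.0


if __name__ == "__main__":
    # Hypothetical masses for a peptide before and after glycosylation.
    before = 2000.0
    after = before + 3 * HEXNAC_RESIDUE
    print(galnac_count(before, after), shift_is_resolvable(after))
```

At the stated mass accuracy of approximately 0.1%, the uncertainty for a peptide of roughly 10 kDa is about 10 Da, well below a single GalNAc increment, which is why whole-residue counts can be assigned even for the long peptide.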
Analysis of a Single Repeat of MUC1 (TAP24)-The surprising finding that GalNAc-T1 and -T2 produced the same final product with the 105-mer peptide in long-term assays despite large differences in the reaction kinetics with the short MUC1-based peptide substrates prompted us to analyze the kinetics of the glycosylation of Muc1 peptides in more detail. Glycosylation of the TAP24 MUC1 peptide, which contains 6 potential acceptor sites, was analyzed by CE at different time points during a 24-h reaction (Fig. 4). Reactions with GalNAc-T1 and -T3 produced three peaks, which corresponded to 1, 2, or 3 mol of GalNAc incorporated at Thr in GVTSA and ST in GSTAP. Reaction with GalNAc-T2 produced the same three peaks representing the same sites of incorporation. In addition, a fourth peak was detected, which corresponded to incorporation of a fourth mole of GalNAc in the N-terminal Thr residue. There was no evidence of incorporation at Ser in GVTSA and Thr in -DTR- in any of the peptides analyzed.
The kinetics of the reactions were different with respect to timing of the order of addition of the first, second, and third moles of GalNAc. The relative product developments of 1 and 2 mol of GalNAc incorporation were slightly different, with GalNAc-T2 being the fastest enzyme to convert from 1 to 2 mol, indicating a lower Km for the monoglycosylated peptide than for the unglycosylated peptide. GalNAc-T1 was the slowest in this respect and GalNAc-T3 was intermediate. Similarly, GalNAc-T2 initiated the third mole of incorporation before all substrate was converted to the diglycosylated glycoform, whereas both GalNAc-T1 and -T3 initiated the third mole only after all peptide was converted to the diglycosylated form.
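The qualitative behavior described above (how much monoglycosylated peptide accumulates before the second GalNAc is added) can be illustrated with a deliberately simplified model. The sketch below treats the three additions as consecutive first-order steps, P0 -> P1 -> P2 -> P3, rather than as Michaelis-Menten reactions, and all rate constants are hypothetical values chosen for illustration, not values measured in this study.

```python
def simulate(k1, k2, k3, hours=24.0, dt=0.01):
    """Euler integration of P0 -> P1 -> P2 -> P3 with first-order steps.

    Returns a list of (time, [p0, p1, p2, p3]) with fractions summing to 1.
    """
    p = [1.0, 0.0, 0.0, 0.0]
    k = [k1, k2, k3]
    t = 0.0
    trace = [(t, p[:])]
    for _ in range(int(round(hours / dt))):
        # Amount converted in each step during this time increment.
        flux = [k[i] * p[i] * dt for i in range(3)]
        p[0] -= flux[0]
        p[1] += flux[0] - flux[1]
        p[2] += flux[1] - flux[2]
        p[3] += flux[2]
        t += dt
        trace.append((t, p[:]))
    return trace


def peak_fraction(trace, species):
    """Largest fraction reached by one glycoform (0 = unglycosylated)."""
    return max(state[species] for _, state in trace)


# Hypothetical rate constants (per hour): same first step, different second step.
fast_second = simulate(k1=0.5, k2=2.0, k3=0.1)
slow_second = simulate(k1=0.5, k2=0.2, k3=0.1)

print(round(peak_fraction(fast_second, 1), 3))
print(round(peak_fraction(slow_second, 1), 3))
```

In this toy model, a faster second step keeps the monoglycosylated intermediate at a low level, which corresponds to the CE pattern in which the diglycosylated peak grows while little monoglycosylated peptide accumulates.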
Analysis of the reaction products of these glycosylation reactions by mass spectrometry at a time point where mainly 1 mol of GalNAc was incorporated into the peptide substrate gave the following results: monoglycosylated peptide produced by GalNAc-T1 and -T3 showed incorporation of GalNAc in the GVTSA motif, whereas the product from reactions with GalNAc-T2 showed GalNAc incorporated in the GSTAP motif (Fig. 5). Amino acid sequencing of these products showed that the component in the first peak B(1) (Fig. 4) observed with GalNAc-T1 and -T3 was the peptide with incorporation at Thr in GVTSA; however, sequencing of the component in the first peak obtained from reactions with GalNAc-T2 revealed only that Thr in -DTR- was not glycosylated (further sequencing into the peptide was not possible).
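Mass spectrometric assignment of the number of GalNAc residues relies on each O-linked GalNAc adding one HexNAc residue mass (203.0794 Da monoisotopic, C8H13NO5) to the peptide or fragment. A minimal sketch of that arithmetic follows; the function name, the tolerance, and the example masses are hypothetical and are not taken from the spectra in Fig. 5.

```python
HEXNAC_RESIDUE_MASS = 203.0794  # monoisotopic mass of one GalNAc residue (Da)


def galnac_count(unmodified_mass, observed_mass, tolerance=0.5):
    """Infer how many GalNAc residues explain the observed mass shift.

    Raises ValueError when the shift is not a whole number of HexNAc
    residues within the given tolerance (in Da).
    """
    delta = observed_mass - unmodified_mass
    n = round(delta / HEXNAC_RESIDUE_MASS)
    if n < 0 or abs(delta - n * HEXNAC_RESIDUE_MASS) > tolerance:
        raise ValueError(f"mass shift {delta:.3f} Da is not a multiple of HexNAc")
    return n


# Hypothetical fragment mass, for illustration only.
fragment_mass = 1100.0
print(galnac_count(fragment_mass, fragment_mass + 203.08))
print(galnac_count(fragment_mass, fragment_mass + 406.16))
```

Applied separately to each endo-Asp fragment, the same calculation indicates which half of the peptide carries the GalNAc.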
All three enzymes incorporated a third GalNAc only when a substantial fraction of the peptide substrate was converted to a diglycosylated form. The enzymes also differed in their ability to incorporate the third GalNAc, in that GalNAc-T2 incorporated the third GalNAc much more quickly than GalNAc-T1 and -T3. Furthermore, GalNAc-T2 incorporated a fourth GalNAc at the initial threonine in TAP24. To ensure that the observed differences in activity between GalNAc-T1, -T2, and -T3 were not due to differences in stability, the loss of activity for each enzyme was assessed by incubation at 30°C in the reaction buffer without substrates over a time period of 24 h. The three enzymes showed comparable levels of inactivation over time, with GalNAc-T2 being the least stable (not shown). GalNAc-T1 and -T3 lost 40% of activity and GalNAc-T2 lost 60% of activity in 24 h.
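To put the reported 24-h activity losses on a common scale, they can be converted to approximate half-lives. The sketch below assumes simple first-order inactivation, which is an assumption made here for illustration; the time course of inactivation was not shown.

```python
import math


def half_life_hours(fraction_lost, elapsed_hours=24.0):
    """Half-life implied by a fractional activity loss, assuming first-order decay."""
    remaining = 1.0 - fraction_lost
    rate_constant = -math.log(remaining) / elapsed_hours  # per hour
    return math.log(2.0) / rate_constant


t_half_t1_t3 = half_life_hours(0.40)  # GalNAc-T1 and -T3: 40% lost in 24 h
t_half_t2 = half_life_hours(0.60)     # GalNAc-T2: 60% lost in 24 h

print(f"GalNAc-T1/-T3: {t_half_t1_t3:.1f} h")
print(f"GalNAc-T2:     {t_half_t2:.1f} h")
```

Under this assumption the half-lives differ by less than 2-fold, which supports reading the kinetic differences between the enzymes as differences in specificity rather than in stability.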

Analysis of 11-mer Peptides Covering Either GVTSA or GSTAP of MUC1 Repeat-Further details of the sequence of glycosylation were tested using a mixture of two short 11-mer peptides covering either GVTSA or GSTAP. CE analysis was possible because these two peptides and their glycoforms were well separated by CE. As shown in Fig. 6 (panels A-C), glycosylation of Muc1a (GVTSA) by all three transferases resulted in one glycoform that was glycosylated at Thr in GVTSA as evaluated by MALDI-TOF and amino acid sequencing (not shown). Glycosylation of Muc1b (GSTAP) by all three transferases also yielded one glycoform that was glycosylated at Thr in GSTAP (not shown). The peptide design apparently did not allow incorporation of GalNAc into the serine site because the N-terminal sequence was too short, as the 15-mer Muc1b peptide incorporated GalNAc at ST in GSTAP.⁵ Nevertheless, CE monitoring of the glycosylation reaction with the mixed peptides clearly demonstrated that GalNAc-T1 and -T3 glycosylate the Thr in GVTSA site with better efficiency than the GSTAP sites, whereas GalNAc-T2 glycosylates Thr in GSTAP more efficiently than Thr in GVTSA. These data are in accordance with the observed Km values for the peptides as well as with the set of longer peptides (Table II).

FIG. 5. MALDI-TOF analysis of monoglycosylated and diglycosylated reaction products of TAP25 corresponding to the analysis in Fig. 4. The diglycosylated products (peak C(2), Fig. 4) were purified by HPLC from parallel run reactions with the three GalNAc-transferases at maximum product time points. Sites with GalNAc incorporated were assigned by analysis of an endo-Asp digest (fragments with residues 1-12 and 13-24). Panel A shows the diglycosylated glycoform produced with GalNAc-T1, showing incorporation of 1 mol in both fragments. Amino acid sequencing of these fragments revealed incorporation at Thr in GVTSA (fragment 1-12) and Thr in GSTAP (fragment 13-24) (not shown). The same products were characterized for GalNAc-T2 and -T3 reactions. Panels B-D show the monoglycosylated products (peak B(1), Fig. 4) of GalNAc-T1, -T2, and -T3 reactions, respectively, analyzed by the same method. The monoglycosylated glycoforms produced with GalNAc-T1 and -T3 have GalNAc attached at Thr in GVTSA (fragment 1-12), whereas the product with GalNAc-T2 has the GalNAc attached to Thr in GSTAP (fragment 13-24).

Matsuura et al. (29) originally identified an O-glycosylation site in the sequence VTHPGY, which was derived from fibronectin, and showed that extracts from fetal and tumor tissues expressed GalNAc-transferase activity capable of O-glycosylating different peptide designs with this sequence under in vitro conditions (28). As discussed above, GalNAc-T3 exclusively utilized this substrate sequence (Table II), and therefore is a candidate for the enzyme activity identified previously. To determine whether or not GalNAc-T3 glycosylates fibronectin, we evaluated plasma fibronectin as a substrate for the enzyme. Plasma fibronectin lacks O-glycosylation at the -VTHPGY- sequence, and two monoclonal antibodies are available to monitor glycosylation at this specific sequence (29,30). As shown in Fig. 7, O-glycosylation of fibronectin by GalNAc-T3, but not GalNAc-T1 and -T2, created the FDC-6 and 5C10 epitope on fibronectin as evaluated by enzyme-linked immunosorbent assay and SDS-PAGE Western analysis, supporting the hypothesis that fibronectin is a physiological substrate for GalNAc-T3.

DISCUSSION

Mucin-type O-glycosylation is initiated by a family of polypeptide GalNAc-transferases. The data in this report support the hypothesis that each individual enzyme has a unique function that includes the ability to glycosylate different acceptor substrates as evaluated by in vitro analysis. However, it is also apparent that there is overlap in acceptor substrate specificities, suggesting that the enzymes have some redundant functions.
Within the inherent limitations of in vitro analysis, our results suggest that the primary sequence of the acceptor site is one determining factor for the position and rate of O-glycosylation.

Substrate Specificities: The Fibronectin Model
It is clear that the GalNAc-transferases evaluated here utilize a wide range of acceptor sequences. GalNAc-T3 is unique in its ability to utilize substrates with sequences flanking the glycosylation site that contain charged side chains and do not contain proline, which is often associated with O-glycan sites (8,9,15). Recently, Nehrke et al. (31) developed an in vivo model for assessment of the influence of flanking sequences on O-glycosylation. Prior studies of the acceptor specificity of bovine GalNAc-T1 suggested that charged residues in the flanking region of the acceptor sequence PHMAQVTVGPGL severely affected the activity (14); however, similar substitutions introduced in a chimeric reporter construct expressed in COS-7 cells revealed little if any effect. One interpretation of this finding is that analysis of acceptor substrate specificities of GalNAc-transferases by in vitro methods does not correctly reflect the in vivo condition. Our results suggest another possibility. In vivo evaluation of the specificity of GalNAc-T1 with the total repertoire of GalNAc-transferases in COS-7 cells (or other cell types) will reflect the activities of all endogenous enzymes. The observation of overlapping acceptor specificity among the GalNAc-transferases suggests that several enzymes could be involved in the glycosylation of a single substrate. It is also possible that single site substitutions with charged residues may have different influences on the acceptor specificity of different GalNAc-transferases and thus not reveal effects in this system. Preliminary studies of the HIV-V3 sequence in the same reporter construct show that glycosylation in wild-type COS-7 cells is very low, whereas co-transfection with a GalNAc-T3 construct resulted in full in vivo glycosylation.⁶ Studies of O-mannosylation in yeast have similarly shown that loss of a single Man transferase can be accompanied by a selective loss of in vitro as well as in vivo glycosylation of a specific site without affecting the general activity (32). Thus, further studies are needed to fully understand the in vivo specificity of these enzymes.
Mucin-like domains are one major site of O-glycosylation. Analysis of the factors that regulate initiation of O-glycosylation of mucin is important for understanding some aspects of disease-associated aberrant O-glycosylation. The cell membrane-associated mucin termed MUC1 was originally identified as a cancer-associated mucin found in breast and pancreatic cancers (33,34). Subsequent studies showed that cancer-associated forms of MUC1 arise by differential O-glycosylation of the tandemly repeated region in cancer cells (35,36). Recent studies have indicated that MUC1 in tumor cells has fewer O-glycan chains and the O-glycans have reduced β1-6GlcNAc branching (37,38). The data presented here demonstrate that three of five serine and threonine residues in the repeat are acceptors for in vitro glycosylation that is catalyzed by GalNAc-T1, -T2, and -T3. Previously, Stadie et al. (39) reported glycosylation of the same three sites by crude enzyme preparations from skimmed milk and from colon and breast carcinoma cell lines, although Ser in GSTAP was only partially glycosylated. Nishimori et al. (40) found no difference in in vitro O-glycosylation of MUC1-based peptides with GalNAc-transferase preparations from normal and cancer tissues. Stadie et al. (39) found that O-glycosylation was initiated at Thr in the GVTSA motif followed by Thr in the GSTAP motif in extracts from all sources tested. This pattern was similar to that determined by us for GalNAc-T1 and -T3, and is in agreement with our previous studies of crude extracts from salivary glands (17). GalNAc-T2 showed the opposite pattern of reactivity, with highest efficiency for the GSTAP motif. This activity may not be present in sufficient quantities to be detected in the sources tested by Stadie et al. (39). It is presently unclear what effect differences in kinetic parameters of the individual GalNAc-transferases may have on mucin O-glycosylation in vivo.
It is possible that alterations in the expression of particular GalNAc-transferases in cancer cells may result in the production of glycoforms that differ in the number of attached O-glycans. Stadie et al. (39) found that MUC1 tandem repeat peptides with 2-3 mol of GalNAc showed reduced reactivity with the antibody SM3, which defines one cancer-associated form of MUC1 (35,41).
The use of capillary electrophoresis for analysis of in vitro O-glycosylation of peptides derived from mucins with multiple acceptor sites proved to be a useful technique because it allowed direct assessment of enzyme kinetics on individual acceptor sites. CE was recently used by Hennebicq-Reig et al. (42) to analyze glycosylation of a MUC5AC tandem repeat peptide. We adapted this technique to allow direct sampling and electrophoresis of ongoing reaction mixtures with individual GalNAc-transferases. Structural analysis of glycoforms corresponding to the peaks separated by CE revealed that each peak represented a unique glycoform with GalNAc residues attached at a specific site(s). CE is a powerful tool for monitoring glycosyltransferase reactions, and the sensitivity may be increased dramatically by using fluorescent acceptor or donor substrates (43).

⁶ K. Nehrke, F. Hagen, and L. A. Tabak, personal communication.
Among the first evidence for the existence of multiple GalNAc-transferases was the finding that fetal and cancer tissues selectively expressed GalNAc-transferase activity capable of utilizing a peptide sequence, VTHPGY, derived from fibronectin (28). We showed that only recombinant GalNAc-T3 utilized the VTHPGY substrate. Thus, GalNAc-T3 is a candidate for the GalNAc-transferase activity previously observed in cancer and fetal tissues. Northern blot analysis reveals a restricted expression pattern for GalNAc-T3 (3); however, more detailed in situ localization is required to determine if GalNAc-T3 expression is regulated in tumors and fetal tissues. Interestingly, all three recombinant GalNAc-transferases were dependent on Mn2+, and only low activity was detected when using Ca2+ and Mg2+. In contrast, the oncofetally regulated activity previously identified (28) utilized either Mn2+ or Ca2+. This does not support the hypothesis that GalNAc-T3 is the GalNAc-transferase involved in glycosylation of the fibronectin site and suggests the existence of additional enzymes with this specificity. Additional putative GalNAc-transferase genes have been identified and cloned, and several of these have recently been expressed and found to have GalNAc-transferase activity.⁷

One potentially important finding from this work is that GalNAc-T2 utilized UDP-Gal as a donor substrate, and that this activity was specifically associated with one acceptor substrate. Initiation of O-glycosylation with galactose has not been reported for eukaryotic cells, but it is found in prokaryotes (44). It is not known if GalNAc-T2 performs Gal O-glycosylation in vivo, since such structures have not been reported in association with Muc2 or any other mucins to the best of our knowledge; however, it is not clear that this has been looked for exhaustively.
As expected, anti-Tn antibodies and VVA lectin did not react with the Gal-glycosylated glycopeptide, whereas the glycopeptide was highly reactive with Jacalin (data not shown). Future studies should address whether this donor substrate specificity is found for membrane-bound GalNAc-T2 and if it leads to Gal O-glycosylation in vivo. Other glycosyltransferases have been shown to be able to use two different sugar nucleotides as donor substrates. The histo-blood group A enzyme, which normally utilizes UDP-GalNAc, shows weak specificity for UDP-Gal, although with a higher Km value (45). Similarly, the histo-blood group B enzyme, which normally utilizes UDP-Gal, can utilize UDP-GalNAc at high concentrations (46). A chimeric AB enzyme that efficiently utilizes both sugar nucleotides was produced by introducing a single amino acid substitution into the protein (47). Furthermore, it was recently shown that α-lactalbumin alters not only the acceptor substrate specificity but also the donor substrate specificity of the β1,4Gal transferase so that the enzyme utilizes both UDP-Gal and UDP-GalNAc (48).
In summary, the studies reported here show that the kinetic properties of three GalNAc-transferases, including acceptor and donor substrate specificities, differ significantly. The results suggest that the initiation of O-glycosylation is a selective process that is controlled by the repertoire of GalNAc-transferases expressed. These findings have wide implications for the understanding of O-glycosylation in disease and also for recombinant expression technology.