Characterization of a UDP-GalNAc:Polypeptide N-Acetylgalactosaminyltransferase That Displays Glycopeptide N-Acetylgalactosaminyltransferase Activity

We report the cloning, expression, and characterization of a novel member of the mammalian UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase (ppGaNTase) family that transfers GalNAc to a GalNAc-containing glycopeptide. Northern blot analysis revealed that the gene encoding this enzyme, termed ppGaNTase-T6, is expressed in a highly tissue-specific manner. Significant levels of transcript were found in rat and mouse sublingual gland, stomach, small intestine, and colon; trace amounts were seen in the ovary, cervix, and uterus. Recombinant constructs were expressed transiently in COS7 cells but demonstrated no transferase activity in vitro against a panel of unmodified peptides, including GTTPSPVPTTSTTSAP (MUC5AC). However, when incubated with the total glycosylated products obtained by action of ppGaNTase-T1 on MUC5AC (mainly GTT(GalNAc)PSPVPTTSTT(GalNAc)SAP), additional incorporation of GalNAc was achieved, resulting in new hydroxyamino acids being modified. The MUC5AC glycopeptide failed to serve as a substrate for ppGaNTase-T6 after modification of the GalNAc residues by periodate oxidation and sodium borohydride reduction, indicating a requirement for the presence of intact GalNAc. This suggests that O-glycosylation of multisite substrates may proceed in a specific hierarchical manner and underscores the potential complexity of the processes that regulate O-glycosylation.

O-Linked glycans are involved in a number of biological functions including leukocyte trafficking (1) and sperm-egg adhesion (2). In addition, clusters of O-linked oligosaccharides impart a "stalk-like" conformation that is common among several membrane receptors (3). In contrast to N-linked glycosylation, O-linked glycans are synthesized stepwise. Thus, the acquisition of GalNAc represents the first step in mammalian (mucin-type) O-glycosylation. A family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase enzymes (ppGaNTase, EC 2.4.1.41) is responsible for this initial enzymatic step. Five family members (ppGaNTase-T1 (4,5), -T2 (6), -T3 (7,8), -T4 (9), and -T5 (10)) have been identified in mammals thus far and have been shown to have unique expression patterns as well as substrate specificities. However, little is known regarding their respective activities on native substrates as well as potential inter-relationships with one another.
In the present study, we have cloned a novel member of this enzyme family termed ppGaNTase-T6. When recombinant enzyme was expressed as a secreted product from COS7 cells, no ppGaNTase activity was detected in vitro against a panel of unmodified peptides, including the peptide GTTPSPVPTTSTTSAP, which is derived from the human MUC5AC gene sequence (11). However, when this MUC5AC peptide was first glycosylated with ppGaNTase-T1 to yield mainly GTT(GalNAc)PSPVPTTSTT(GalNAc)SAP (but also mono- and tri-substituted species), the ppGaNTase-T6 isoform was active toward the glycopeptidic preparation. This suggests that the addition of the initial O-linked sugar may occur in a hierarchical manner with the action of certain ppGaNTases necessary prior to the action of others.

EXPERIMENTAL PROCEDURES
Isolation of ppGaNTase-T6 Probes and Full-length cDNAs-The conserved amino acid regions EIWGGEN and VWMDEYK were used to design sense and antisense PCR primers, d(GARATHTGGGGNGGNGARAA) (321 sense) and d(TTRTAYTCRTCCATCCANAC) (379 antisense). These were used to perform PCR reactions on rat sublingual gland (rat SLG) cDNA; the resultant 200-base pair PCR products were cloned into M13 vehicles and screened as described previously (10). Positively hybridizing M13 clones were sequenced with infrared fluorescent dye-labeled primers on an LI-COR DNA 4000L DNA sequencer. The insert from a unique clone was used to generate an asymmetrically labeled PCR probe using the oligonucleotide 379 antisense. This probe was then used to screen 1 × 10⁶ plaques from an oligo(dT)-primed Uni-Zap XR rat sublingual gland cDNA library (10) according to standard procedures (12). One of the four positive clones obtained was fully sequenced. The N-terminal transmembrane domain was determined by a Kyte-Doolittle hydrophobicity plot. Sequence alignments were performed using the Clustal method of Megalign (DNASTAR) and began at the conserved region FNXXXSD in the putative lumenal domain (amino acid position 84 in ppGaNTase-T1, 100 in ppGaNTase-T2, 150 in ppGaNTase-T3, 102 in ppGaNTase-T4, 454 in ppGaNTase-T5, and 175 in ppGaNTase-T6).
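The degenerate primers above are written in IUPAC nucleotide codes (R = A/G, Y = C/T, H = A/C/T, N = any base). The following minimal sketch, which is illustrative and not part of the original procedure, counts how many distinct oligonucleotides each primer pool contains and reverse-complements the antisense primer so that it can be read against the VWMDEYK block. The function and constant names are hypothetical.

```python
from math import prod

# Number of concrete bases represented by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Complement of each IUPAC code (a degenerate code maps to the code
# that covers the complements of its member bases).
IUPAC_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "N": "N",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences in a degenerate primer pool."""
    return prod(IUPAC_SIZE[base] for base in primer)

def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer written in IUPAC codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(primer))

# Primer sequences as given in the text (names here are illustrative).
SENSE_321 = "GARATHTGGGGNGGNGARAA"      # designed from EIWGGEN
ANTISENSE_379 = "TTRTAYTCRTCCATCCANAC"  # designed from VWMDEYK

print(degeneracy(SENSE_321))              # size of the sense pool
print(degeneracy(ANTISENSE_379))          # size of the antisense pool
print(reverse_complement(ANTISENSE_379))  # coding-strand view of primer 379
```

Read in frame, the reverse complement of primer 379 gives the codons GTN TGG ATG GAY GAR TAY followed by AA, which corresponds to V-W-M-D-E-Y and the first two bases of a lysine codon.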
Amino Acid Similarity Determinations-Amino acid sequences were aligned, one pair at a time, using the pairwise ClustalW (1.4) algorithm in MacVector (Oxford Molecular Group). The following alignment modes and parameters were used: slow alignment, open gap penalty = 10, extended gap penalty = 0.1, similarity matrix = blosum, delay divergence = 40%, and no hydrophilic gap penalty. The percent amino acid sequence similarity displayed in Table I represents the sum of the percent identities and similarities. Sequences comprising the conserved domains used in Table I begin with the first amino acid in Fig. 2 and end with a conserved proline (amino acid position 425 in ppGaNTase-T1, 440 in ppGaNTase-T2, 499 in ppGaNTase-T3, 438 in ppGaNTase-T4, 796 in ppGaNTase-T5, and 524 in ppGaNTase-T6). The segment of conserved sequences is approximately 340 amino acids in length in the various isoforms.
Northern Blot Analysis-Total RNA from BALB/c mouse and Wistar rat tissues was extracted according to the single step isolation method described in Ausubel et al. (13). Following electrophoresis in a 1% formaldehyde-agarose gel, rat and mouse total RNA samples were transferred to Hybond-N membranes (Amersham Pharmacia Biotech) according to Sambrook et al. (12). A segment of the ppGaNTase-T6 cDNA region, from nucleotide position 1305 to 1473, was labeled by asymmetric PCR (14) using the antisense oligonucleotide d(GACTTCCACAACACGCACAT) and then used as a probe for ppGaNTase-T6 transcripts. ppGaNTase-T1 and -T4 were detected as described previously (9). Antisense 18 S ribosomal subunit oligonucleotide d(TATTGGAGCTGGAATTACCGCGGCTGCTGG) was end-labeled as described (12) and used to normalize sample loading by hybridizing with 5 M excess of probe. All hybridizations were performed in 5× SSPE, 50% formamide at 42°C with two final washes in 2× SSC, 0.1% SDS at 65°C for 20 min.
Generation of Secretion Constructs for ppGaNTase-T6-The 2.2-kilobase full-length cDNA (isolated from the rat sublingual gland cDNA library) for ppGaNTase-T6 was cloned into the PstI site of Phagescript SK (Stratagene). Oligonucleotide-directed mutagenesis (15) was performed on deoxyuracil-containing single-stranded DNA from this construct using the oligonucleotide d(ACGACCCGAACGCGTTGAGCAGGAT), which generates an MluI site 3′ of the putative hydrophobic transmembrane domain (at nucleotide position 107 of ppGaNTase-T6). This modified vector was used to clone a 5′-truncated form of the ppGaNTase-T6 cDNA (from the newly introduced MluI site to the PstI site at nucleotide position 2155) into the mammalian expression vector pIMKF1 (9) to create the vector pF1-rT6. pF1-rT6 is an SV40-based expression vector that generates a fusion protein containing the following, in order: an insulin secretion signal, a metal-binding site, a heart muscle kinase site, a FLAG™ epitope tag, and the truncated rat ppGaNTase-T6 cDNA (rT6).
Expression, Labeling, and Gel Analysis of Secreted Isoforms-COS7 cells were grown to 90% confluency in Dulbecco's modified Eagle's medium (Life Technologies, Inc.) + 10% fetal calf serum at 37°C and 5% CO2. One μg of pIMKF1 (9), pF1-mT1 (9), or pF1-rT6 and 8 μl of LipofectAMINE (Life Technologies, Inc.) were used to transfect a 35-mm well of COS7 cells as described previously (9). Recombinant enzymes were assayed and quantitated directly from the culture media of transfected cells. Levels of recombinant enzymes were analyzed by Tricine SDS-PAGE (16) after labeling with [γ-32P]rATP using heart muscle kinase (HMK) as described previously (data not shown) (10). Gels were dried under vacuum and exposed to film (XAR, Eastman Kodak Co.) or quantitated on a PhosphorImager (Molecular Dynamics).
Functional Assays of Secreted Recombinant ppGaNTase-T6 from COS7 Cells-Activities of ppGaNTase-T1 and -T6 were initially measured against the following panel of peptide substrates as described previously (9, 10): EA2 (PTTDSTTPAPTTK) from the tandem repeat of rat submandibular gland mucin (17); HIV (RGPGRAFVTIGKIGNMR) from the human immunodeficiency virus gp120 protein (7); MUC2 (PTTTPISTTTMVTPTPTPTC) from human intestinal mucin (18); MUC1b (PDTRPAPGSTAPPAC) from human MUC1 mucin (19); EPO-T (PPDAATAAPLR) from human erythropoietin (4); rMUC-2 (SPTTSTPISSTPQPTS) from rat intestinal mucin (20); mG-MUC (QTSSPNTGKTSTISTT) from mouse gastric mucin (21); and MUC5AC (GTTPSPVPTTSTTSAP) from human MUC5AC mucin (11). Equivalent amounts (units) of each enzyme (as determined by SDS-PAGE gel quantitation) were used in each assay. No enzymatic activity for ppGaNTase-T6 was detected in any of these initial assays. Subsequent assays of ppGaNTase-T1 and -T6 activity To minimize the possibility of dipeptidylaminotransferase or other peptidase activities that could confound the MALDI-MS analysis, the thiol inhibitor trans-epoxysuccinyl-L-leucylamido-3-methylbutane and serine peptidase inhibitors were included in the reaction mixture as described previously (22). Reactions were stopped by the addition of 8 volumes of 20 mM sodium borate, 1 mM EDTA (pH 9.1). Reaction products were passed through AG1-X8 resin and eluted with 3 ml of water, and incorporation was determined by scintillation counting. Background values obtained from controls incubated without peptide substrate were subtracted from each experimental value. Products from the above mentioned incubations using ppGaNTase-T1 cell culture media were recovered by using Sep-Pak C18 reverse-phase cartridges (Waters Corp., Milford, MA) as described previously (23).
The products of the reaction by ppGaNTase-T1 were used as substrates (in place of MUC5AC parent peptide) in subsequent incubations with ppGaNTase-T6 and mock (pIMKF1) media. For this second step of N-acetylgalactosaminylation, the conditions were identical to those described above, except that 1 nM UDP-[14C]GalNAc (54.7 mCi·mmol⁻¹; 2.02 GBq·mmol⁻¹; 0.02 mCi·ml⁻¹) replaced 1.25 nM UDP-[3H]GalNAc. Reactions were performed for 24 h at 37°C and were stopped as described above.
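The specific activities of the radiolabeled donors are quoted in both curies and becquerels. As a quick consistency check (a sketch added for illustration, with hypothetical names), the conversion uses the definition 1 Ci = 37 GBq and can be applied to the 14C-labeled donor above and to the tritiated donor (7.8 Ci·mmol⁻¹) used in the preparative incubation described later in this section.

```python
# 1 curie is defined as 3.7e10 becquerels, i.e. 37 GBq.
GBQ_PER_CI = 37.0

def ci_to_gbq(curies: float) -> float:
    """Convert an activity (or specific activity) from Ci to GBq."""
    return curies * GBQ_PER_CI

# Specific activities quoted in the text, expressed in Ci per mmol.
c14_donor_ci_per_mmol = 54.7e-3   # UDP-[14C]GalNAc, 54.7 mCi/mmol
h3_donor_ci_per_mmol = 7.8        # UDP-[3H]GalNAc, 7.8 Ci/mmol

c14_gbq = ci_to_gbq(c14_donor_ci_per_mmol)
h3_gbq = ci_to_gbq(h3_donor_ci_per_mmol)

print(f"UDP-[14C]GalNAc: {c14_gbq:.2f} GBq/mmol")
print(f"UDP-[3H]GalNAc:  {h3_gbq:.1f} GBq/mmol")
```

Both results agree with the becquerel values quoted alongside the curie values in the text.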
Reaction products from the aforementioned enzyme assays were analyzed by mass spectrometry and/or capillary electrophoresis. To desalt samples prior to capillary electrophoresis and/or mass spectrometry, Sep-Pak C18 cartridges were used as described previously (23). Matrix-assisted laser desorption ionization mass spectrometry was performed using a Vision 2000 time-of-flight instrument (Finnigan MAT, Bremen, Germany) equipped with a 337 nm UV laser. The mass spectra were acquired in reflectron mode under 6 kV acceleration voltage and positive detection. The samples were prepared by mixing directly onto the target 1 μl of analyzed solution (typically 50 pmol) and 1 μl of a 2,5-dihydroxybenzoic acid matrix solution (12 mg·ml⁻¹ in CH3OH/H2O, 70:30, v/v) and then allowed to crystallize at room temperature. External calibration was performed using the MUC5AC peptide (Mr 1502.7). 10-30 shots were accumulated for the mass spectrum.
Capillary electrophoresis was performed on a P/ACE system model 5000 (Beckman, Fullerton, CA) under conditions previously described (23). For the separation of the hexadecapeptides, 2 N formic acid buffer with 2.5% polyvinyl alcohol (Mr 15,000) (v/v) (24) was used. To determine O-linkage sites, a preparative scale procedure was performed as described by Bielher and Schwartz (25), and the recovered fractions were then analyzed by Edman degradation using an Applied Biosystems gas-phase sequencer, model 477A, as described previously (26).
Periodate Oxidation, Sodium Borohydride Reduction, and Enzyme Assays-Large quantities of glycosylated MUC5AC were prepared by incubation with Pichia pastoris-derived recombinant ppGaNTase-T1. Briefly, the ppGaNTase-T1 coding segment from pF3-mT1 (27) was inserted into the EcoRI site of a modified Pichia expression vector, pPIC (Invitrogen). Pichia containing this vector were grown and expression was induced according to the manufacturer's instructions (Invitrogen). Recombinant ppGaNTase-T1 was purified as described previously (4). Pichia-derived recombinant ppGaNTase-T1 and ppGaNTase-T1 expressed from COS7 cells displayed similar substrate specificities and kinetic parameters. Pichia-derived ppGaNTase-T1 (0.028 μg) was incubated with 1 mg of MUC5AC under the conditions described above, using UDP-[3H]GalNAc (7.8 Ci·mmol⁻¹; 288.6 GBq·mmol⁻¹; 0.1 mCi·ml⁻¹) at a final concentration of 128.2 μM and cold UDP-GalNAc at a final concentration of 30 mM. Reaction products were passed through an AG1-X8 column, and incorporation was determined by scintillation counting. The reaction products were isolated on a Waters 265 HPLC using a Vydac C-18 reverse phase column (0.46 × 25 cm) with a flow rate of 1 ml/min using a linear gradient of 5% acetonitrile, 0.1% trifluoroacetic acid to 20% acetonitrile, 0.1% trifluoroacetic acid for 20 min at 22°C.
Mass spectrometry was performed on purified products at the Louisiana State University Mass Spectrometry Facility using pulsed extraction and in reflector mode on a Bruker (Billerica, MA) ProFLEX III MALDI-TOF mass spectrometer. The matrix used was α-cyano-4-hydroxycinnamic acid. Two-point calibrations were performed using peptides that have masses above and below the range of the masses of our samples. The most abundant product recovered (the tri-glycopeptide) was subjected to Edman degradation as described previously (28) using a PE Applied Biosystems 473A protein sequencer (Foster City, CA).
This purified MUC5AC tri-glycopeptide (100 nmol) and the MUC5AC parent peptide (100 nmol) were oxidized with 200 μl of 0.08 M NaIO4 in 0.05 M acetate buffer (pH 4.5) at 4°C for 60 h in the dark (29) in side by side reactions. Excess periodate was destroyed by adding 20 μl of ethylene glycol. The reaction mixtures were adjusted to pH 7.5 with 1 N NaOH. Sodium borohydride was added to a final concentration of 0.2 M and reduction continued for 24 h at 4°C. Excess borohydride was destroyed by the addition of 20 μl of glacial acetic acid, and released boric acid was evaporated several times with methanol. The reaction mixtures were purified by HPLC as described in the previous paragraph. Capillary electrophoresis was performed on an Applied Biosystems 270A-HT capillary electrophoresis system using 2 N formic acid, 2.5% polyvinyl alcohol (v/v) and a fused silica capillary column (0.75 μm inner diameter) with 50 cm to the optical path and running voltage of 15 kV.
Periodate-treated MUC5AC and the MUC5AC tri-glycopeptide as well as untreated MUC5AC and MUC5AC tri-glycopeptide were then used as substrates in reactions with COS7 cell-derived ppGaNTase-T1, ppGaNTase-T6, or mock media. Reactions were carried out in duplicate as described above with the following modifications: 15 μg of each peptide/glycopeptide substrate were used in each reaction; the final concentration of UDP-[14C]GalNAc (54.7 mCi·mmol⁻¹; 2.02 GBq·mmol⁻¹; 0.02 mCi·ml⁻¹) was 21.78 nM and the final concentration of cold UDP-GalNAc was 0.96 mM; the final reaction volume was 50 μl. Reaction products were passed through AG1-X8 resin; incorporation was determined by scintillation counting.

RESULTS
cDNA Cloning and Sequence Analysis of ppGaNTase-T6 -Primary sequence alignments of previously identified members of the ppGaNTase family revealed many conserved regions within an approximately 420-amino acid segment of the proteins (9). We designed degenerate PCR primers to short blocks of highly conserved sequences, EIWGGEN and VWMDEYK. PCR was performed on cDNA from rat SLG, and clones were purified and sequenced to identify the nature of the insert as described previously (10). From this screening, a novel PCR product was identified that shared homology with previously characterized isoforms. The insert from this clone was used as a probe to screen a rat sublingual gland cDNA library. A cDNA containing a complete open reading frame was sequenced and given the designation, ppGaNTase-T6.
As shown in Fig. 1, the cDNA encoding ppGaNTase-T6 contains a 2228-base pair insert encoding a unique 657-amino acid protein. No upstream termination codon or Kozak sequence was found. Conceptual translation of this cDNA revealed a type II membrane protein architecture, typical of the ppGaNTase family. The enzyme consists of a potentially short N-terminal cytoplasmic region, a 27-amino acid hydrophobic region, a 147-amino acid stem region, and a 483-amino acid lumenal region. As shown in Fig. 2, ppGaNTase-T6 is distinct from previously identified mammalian isoforms yet shares many blocks of sequence similarity or identity between consensus amino acid 174 and 657. Table I summarizes the degree of amino acid similarity between each of the known isoforms within the conserved lumenal region; ppGaNTase-T6 has the lowest similarity when compared with the other isoforms.
Northern Blot Analysis-Northern blots of mouse and rat total RNA were probed with a ppGaNTase-T6-specific probe (Fig. 3) as well as probes specific for previously characterized isoforms. The highest levels of the 4.8-kilobase ppGaNTase-T6 message were found in the SLG, with lower levels seen in stomach, small intestine, and colon of both rat and mouse. Trace amounts were detectable in ovary, cervix, and uterus. As reported previously, ppGaNTase-T4 transcripts were found in the digestive and reproductive tracts as well as other tissues.
The ppGaNTase-T1 message was present in all tissues examined. The tissue specificity of expression for each isoform was found to be conserved between rat and mouse.
Functional Expression-The truncated coding region of ppGaNTase-T6 was cloned downstream of the insulin secretion signal, HMK site, and FLAG™ epitope tag in the vector pIMKF1 (9). The ppGaNTase-T6 truncation began at amino acid position 38. The ppGaNTase-T6 expression construct as well as a similar construct containing ppGaNTase-T1 were independently transfected into COS7 cells. The expressed products from these transfections were secreted into the culture media. Initially, equivalent amounts of each isoform (as judged by densitometric scanning of Tricine SDS-PAGE gels) (data not shown) were used for in vitro glycosylation assays against a panel of peptides (10). Although ppGaNTase-T1 glycosylated a number of peptide substrates, no in vitro glycosylation activity was seen for ppGaNTase-T6 (data not shown). Further assays were then conducted using the MUC5AC peptide and cell culture media from cells transfected with either ppGaNTase-T1 or ppGaNTase-T6. When the MUC5AC peptide was incubated with media from ppGaNTase-T6-transfected cells, capillary electrophoresis revealed only a single peak that displayed a mass (m/z = 1525.2, [M + Na]+) corresponding to the parent peptide (Fig. 4A). Incubation of the same peptide with ppGaNTase-T1 resulted in incorporation of GalNAc into the peptide fraction (258.3 nmol of GalNAc/h/unit of recombinant ppGaNTase-T1, where a unit of ppGaNTase-T1 is defined as an arbitrary amount of ppGaNTase-T1 normalized to ppGaNTase-T6 after gel quantitation as described under "Experimental Procedures"). The capillary electrophoresis profile revealed the formation of one major peak (36% of initial peptide substrate) (peak 2, Fig. 4B) and two minor species (6.0 and 3.9% of initial peptide substrate, respectively) (peaks 1 and 3, Fig. 4B). Thus, the total of glycosylated products represented 45.9% of the initial peptide presented to the enzyme. MALDI-MS confirmed that peaks 1-3 consisted of mono- (m/z = 1728.5, i.e. 203 greater than the parent peptide), di- (m/z = 1931.5), and tri-substituted (m/z = 2134.4) glycopeptides, respectively. Direct sequence analysis revealed that threonines 3 and 13 were substituted with GalNAc in the major purified fraction, obtained by capillary electrophoresis at preparative scale (peak 2, Fig. 4B), corresponding to the di-substituted species. When ppGaNTase-T1 and ppGaNTase-T6 were employed in combination, an increase in the level of GalNAc incorporation into the MUC5AC peptide was observed over that obtained with ppGaNTase-T1 alone (294.5 nmol of GalNAc/h/unit of recombinant ppGaNTase-T1 and -T6); 49% of the initial peptide presented was distributed in five discrete fractions resolved by capillary electrophoresis (peaks 1-5; 17.1, 10.0, 11.8, 7.4, and 2.7% of initial peptide presented, respectively) (Fig. 4C). Analysis of the products by MALDI-MS indicated that they corresponded to glycopeptides that were substituted with one to five residues of GalNAc (m/z = 1728.5, 1931.5, 2134.4, 2337.6, and 2540.5, respectively). Insufficient amounts of material were present to determine the positions of GalNAc residues in these species. Collectively, these results suggested that ppGaNTase-T6 catalyzes the transfer of GalNAc from UDP-GalNAc to a GalNAc-containing glycopeptide (i.e. a UDP-GalNAc glycopeptide-GaNTase (gpGaNTase)).
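The assignment of each peak to a glycoform rests on simple mass arithmetic: every O-linked GalNAc adds one N-acetylhexosamine residue to the sodiated parent peptide. The sketch below is illustrative only. It assumes an average residue mass of 203.19 Da for GalNAc (a standard value, not stated in the text) and uses the observed parent ion at m/z 1525.2 as the starting point; all names are hypothetical.

```python
# Average mass added per O-linked GalNAc (HexNAc residue, C8H13NO5).
# Assumption: standard average residue mass, not a value from the text.
HEXNAC_RESIDUE_MASS = 203.19

# Observed sodiated parent peptide ion reported in the text.
PARENT_MZ = 1525.2

# Observed m/z values reported for glycopeptides carrying 1-5 GalNAc.
OBSERVED_MZ = {1: 1728.5, 2: 1931.5, 3: 2134.4, 4: 2337.6, 5: 2540.5}

def expected_mz(n_galnac: int) -> float:
    """Expected m/z of the sodiated peptide carrying n GalNAc residues."""
    return PARENT_MZ + n_galnac * HEXNAC_RESIDUE_MASS

# Difference between observed and expected m/z for each glycoform.
deviations = {
    n: observed - expected_mz(n) for n, observed in OBSERVED_MZ.items()
}

for n, delta in deviations.items():
    print(f"{n} GalNAc: expected {expected_mz(n):.1f}, "
          f"observed {OBSERVED_MZ[n]:.1f}, difference {delta:+.2f}")
```

Under these assumptions every reported value falls within 1 mass unit of the expected one, which is consistent with the stepwise addition of one to five GalNAc residues.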
To confirm the presence of gpGaNTase activity, the substrate GTTPSPVPTTSTTSAP was first incubated for 24 h with recombinant ppGaNTase-T1 cell culture media using UDP-[3H]GalNAc as the sugar donor. The reaction products (containing the di-substituted glycopeptide and unmodified parent peptide as well as small amounts of mono- and tri-substituted peptide) were next incubated with ppGaNTase-T6 in the presence of UDP-[14C]GalNAc (40,000 dpm) as the sugar donor. As a control, an equivalent quantity of culture media from mock-transfected (pIMKF1) COS7 cells was also used as an "enzyme" source. As expected, little incorporation of [14C]GalNAc was obtained when the mock-transfected material was used as the enzyme source (360 dpm; <1% of the initial tritiated substrate was labeled with 14C). In contrast, significant incorporation of [14C]GalNAc was obtained when ppGaNTase-T6 was used (5,960 dpm; 18.6% of initial substrate was 14C-labeled, corresponding to 283.8 nmol of GalNAc/h/unit). Fig. 5 compares the products generated after the second incubation with mock-transfected supernatant and ppGaNTase-T6. In contrast to the products obtained after incubation with media from the mock-transfected control, ppGaNTase-T6 yielded 3 additional glycopeptide fractions with longer retention times; fractions 4-6 correspond to glycopeptides substituted with four to six residues of GalNAc (m/z = 2337.6, 2540.5, and 2743.6). The relative level of the di-substituted glycopeptide (peak 2) present after ppGaNTase-T6 incubation versus mock incubation was much reduced (2.7% of total profile versus 32.7%, respectively), suggesting that it had been converted to the more heavily glycosylated species (peaks 3-6), whereas the quantity of the parent peptide (~50% of total profile) and the mono-substituted glycopeptide (peak 1) (~7% of total profile) remained unchanged (Fig. 5).
As an initial step in defining the requirement of the ppGaNTase-T6 isoform for a GalNAc-containing substrate, we modified GalNAc residues by periodate oxidation and sodium borohydride reduction. To obtain sufficient amounts of glycosylated MUC5AC, we used the P. pastoris expression system to generate large quantities of recombinant ppGaNTase-T1. The ppGaNTase-T1 coding segment used in the COS7 cell expression system was cloned into a Pichia expression vector (pPIC; Invitrogen) and was expressed under methanol induction conditions. Approximately 500 μg/liter ppGaNTase-T1 was purified as described (4) and a portion of that was incubated with MUC5AC and UDP-[3H]GalNAc as outlined under "Experimental Procedures." This incubation resulted in the production of 4 glycopeptide fractions (1887.6 nmol of GalNAc/h/μg of ppGaNTase-T1 incorporated), corresponding to mono- (m/z = 1725.9) (21.7% of initial peptide presented), di- (m/z = 1930.2) (28.7% of initial peptide presented), tri- (m/z = 2133.1) (35.3% of initial peptide presented), and tetra-substituted (m/z = 2336.1) (1.3% of initial peptide presented) glycopeptides, respectively. The most abundant peak recovered after HPLC purification of all reaction products was that representing the tri-glycopeptide, as determined by mass spectrometry (m/z = 2133.1). Edman degradation of this species revealed that a GalNAc residue was present at serine 5 and, like the di-substituted species generated by COS7 cell-derived ppGaNTase-T1, at threonines 3 and 13.
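The site assignments above can be placed in context by listing every hydroxyamino acid in the MUC5AC peptide. The following sketch is purely illustrative (the helper names are hypothetical): it enumerates the serine and threonine positions of GTTPSPVPTTSTTSAP using 1-based numbering and confirms that the three sites identified by Edman degradation are among them.

```python
# MUC5AC-derived peptide used throughout the study.
MUC5AC = "GTTPSPVPTTSTTSAP"

def hydroxy_sites(peptide: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) for every Ser/Thr in the peptide."""
    return [
        (position, residue)
        for position, residue in enumerate(peptide, start=1)
        if residue in "ST"
    ]

# Sites reported as GalNAc-substituted in the tri-glycopeptide.
REPORTED_SITES = {3: "T", 5: "S", 13: "T"}

sites = hydroxy_sites(MUC5AC)
print(sites)
print(f"{len(sites)} potential O-glycosylation sites")

# Each reported site should coincide with a Ser/Thr at that position.
for position, residue in REPORTED_SITES.items():
    assert MUC5AC[position - 1] == residue
```

The peptide contains nine serine or threonine residues, so the glycoforms described in this section, which carry at most six GalNAc residues, leave some potential sites unoccupied.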
This purified tri-glycopeptide along with the MUC5AC parent peptide were subjected to periodate oxidation followed by sodium borohydride reduction. Periodate-treated and untreated tri-glycopeptide and MUC5AC parent peptide were purified by HPLC, analyzed for integrity by capillary electrophoresis (data not shown), and subsequently incubated with COS7 cell-derived ppGaNTase-T1, ppGaNTase-T6, or mock-transfected (pIMKF1) media in the presence of UDP-[14C]GalNAc. Table II compares the counts incorporated into each substrate by each enzyme. Treatment of the tri-glycopeptide with periodate and sodium borohydride clearly reduces the ability of ppGaNTase-T6 to use it as a substrate (compare 3960 cpm incorporated into untreated material to 648 cpm incorporated into treated material). This reduction in incorporation is not due to the peptide itself being compromised during periodate treatment because ppGaNTase-T1 works equally well on both treated and untreated MUC5AC (Table II). These data suggest that ppGaNTase-T6 requires the presence of intact GalNAc on the MUC5AC peptide for it to be used as a substrate.

DISCUSSION
Through the use of degenerate PCR, we have cloned a novel isoform of the ppGaNTase family. ppGaNTase-T6 is a type II membrane protein, consisting of a potentially short N-terminal cytoplasmic domain, a transmembrane domain, a stem region, and a lumenal domain, characteristic of the other previously identified isoforms. This isoform displays the lowest level of amino acid similarity within the putative catalytic domain among the members of the ppGaNTase family and is the only isoform identified to date that lacks any potential N-glycosylation sites. Unlike previously identified isoforms, ppGaNTase-T6 fails to act on a panel of 8 peptide substrates but rather catalyzes the transfer of GalNAc from UDP-GalNAc to a GalNAc-containing peptide substrate.
Furthermore, the modification of the GalNAc residues on the glycopeptide substrate by periodate oxidation and sodium borohydride reduction inhibits further incorporation of GalNAc by ppGaNTase-T6. Our data, therefore, indicate that at least two free GalNAc residues must be incorporated (by ppGaNTase-T1) into the MUC5AC peptide before it can be used as a substrate by ppGaNTase-T6; this requirement is specific to the GalNAc structure itself and is not simply satisfied by the presence of the chemical constituents that make up the GalNAc residue. Whether or not there exist strict positional requirements for these GalNAc residues as well as their effect on the site of transfer of subsequent GalNAcs by ppGaNTase-T6 remains to be determined. Nonetheless, ppGaNTase-T6 appears to require the prior addition of GalNAc residues by another isoform, highlighting a potential hierarchical relationship between members of the ppGaNTase family.
Glycosylation of the MUC5AC peptide by ppGaNTase-T1 also appears to occur in a regulated manner. The di-glycopeptide contains GalNAc residues at threonines 3 and 13; the tri-glycopeptide has an additional GalNAc at serine 5. It will be of interest to determine and compare the Km values of the threonine positions versus the serine. These data demonstrate that there is a hierarchical addition of GalNAc within the MUC5AC substrate by ppGaNTase-T1.
The pattern of transcript expression for ppGaNTase-T6 is very restricted, being found predominantly in the SLG, with lower levels in the remainder of the digestive tract and female reproductive tract. This distinct expression pattern is conserved across species, between rat and mouse. Thus, ppGaNTase-T6 expression is most abundant within the SLG, which contains all of the functional isoforms of the ppGaNTase family identified to date. Recently, the MUC5B gene product has been identified as one of the major human sublingual gland mucins (30,31). The MUC5B gene encodes a highly complex 3570-amino acid protein containing four super-repeats of 528 amino acids within the central exon; each super-repeat is composed of 11 irregular tandem repeats of 29 amino acids enriched in serine and threonine residues, a segment of 111 amino acids that is enriched in hydroxyamino acids but contains no obvious repeating sequence, and a cysteine-rich domain (32). We have recently identified rat SLG clones that show similarity to MUC5B. We speculate that the glycosylation of such complex substrates as the MUC5B gene product and rat SLG mucins requires the coordinated action of multiple ppGaNTase isoforms and that this may account for the large number of isoforms found within this tissue type. The O-glycosylation potential of the different ppGaNTase isoforms toward the MUC5B substrate and rat SLG mucin is still under investigation.
There has been some debate about whether the addition of O-linked GalNAc occurs simultaneously or not (e.g. compare Refs. 33 and 34). Nevertheless, the present work shows that at least one form of ppGaNTase requires the prior activity of another. While this work was under review, a report appeared by Bennett et al. (35), who described a role for ppGaNTase-T4 (9,35) in glycosylating sites in a peptide derived from MUC1 that were not glycosylated by the action of ppGaNTase-T1, -T2, and -T3. The type of regulatory control observed in the present work and the findings of Bennett et al. (35) suggest that maximal occupancy of potential O-glycosylation sites requires the coordinated action of multiple transferases. Röttger et al. (36) have recently demonstrated that epitope-tagged recombinant ppGaNTase-T1, -T2, and -T3 localize throughout the Golgi stack of HeLa cells following transient expression. Whether the collaborating enzymes described here are spatially co-localized remains to be determined. We are currently determining whether there are other collaborations among the ppGaNTase family members and what their functional interrelationships are. This should help determine if there is a strict hierarchy to the order in which the different hydroxyamino acids acquire O-linked sugar and what unique role each isoform may play in the glycosylation status of native substrates.