Molecular Cloning and Functional Characterization of a Lepidopteran Insect β4-N-Acetylgalactosaminyltransferase with Broad Substrate Specificity, a Functional Role in Glycoprotein Biosynthesis, and a Potential Functional Role in Glycolipid Biosynthesis

A degenerate PCR approach was used to isolate a lepidopteran insect cDNA encoding a β4-galactosyltransferase family member. The isolation and initial identification of this cDNA were based on bioinformatics, but its identification as a β4-galactosyltransferase family member was experimentally confirmed. The newly identified β4-galactosyltransferase family member had unusually broad donor and acceptor substrate specificities in vitro, as it transferred galactose, N-acetylglucosamine, and N-acetylgalactosamine to carbohydrate, glycoprotein, and glycolipid acceptors. However, the enzyme preferentially utilized N-acetylgalactosamine as the donor for all three acceptors, and its derived amino acid sequence was closely related to a known N-acetylgalactosaminyltransferase. These data suggested that the newly isolated cDNA encodes a β4-N-acetylgalactosaminyltransferase that functions in insect cell glycoprotein biosynthesis, glycolipid biosynthesis, or both. The remainder of this study focused on the role of this enzyme in N-glycoprotein biosynthesis. The results showed that the purified enzyme transferred N-acetylgalactosamine, but no detectable galactose or N-acetylglucosamine, to a synthetic N-glycan in vitro. The structure of the reaction product was confirmed by chromatographic, mass spectroscopic, and nuclear magnetic resonance analyses. Co-expression of the new cDNA product in insect cells with an N-glycoprotein reporter showed that it transferred N-acetylgalactosamine, but no detectable galactose or N-acetylglucosamine, to this N-glycoprotein in vivo. Confocal microscopy showed that a GFP-tagged version of the enzyme was localized in the insect cell Golgi apparatus.
Figure legend (fragment): ... family member encoded by the recombinant baculovirus, AcHPC4-TnGalNAcT. Note that this protein is expressed under the transcriptional control of the polyhedrin promoter and secreted under the control of the Autographa californica multicapsid nucleopolyhedrovirus gp64 signal peptide. ... the results of an immunoaffinity purification of the extracellular fusion protein expressed by AcHPC4-TnGalNAcT-infected Sf9 cells. The protein was analyzed by ... with ... staining (CB) and by immunoblotting with anti-protein C ...

For the past decade, several laboratories have been studying protein N-glycosylation pathways in insect cell systems. This work is important because insects occupy an intermediate evolutionary niche in glycobiology between lower and higher eukaryotes (1-3) and because a better understanding of this pathway is necessary to support the use of insect-based systems for recombinant mammalian glycoprotein production (4-7). Although insect systems are used routinely for recombinant glycoprotein production, we still have relatively little basic information on glycoprotein biosynthesis and processing in these systems to support this important practical application. Thus, our group has focused on the use of a molecular genetic approach to provide an unequivocal view of the processing enzymes that constitute insect N-glycosylation pathways and the means to rigorously examine their functions.
Previously, we molecularly cloned genes encoding class I and class II processing α-mannosidases from a lepidopteran insect cell line and characterized the properties of the gene products in detail (8-13). In parallel, other investigators cloned Drosophila melanogaster genes encoding class I and class II processing α-mannosidases and characterized those gene products, as well (14-17). Together with earlier biochemical studies (18-21), these studies clearly established that insect cells encode and express processing α-mannosidases similar to those of higher eukaryotes and provided new information about how these enzymes function in insect protein N-glycosylation. Similar progress in understanding insect core fucosylation reactions has come from the isolation of a core α-1,3-fucosyltransferase gene from D. melanogaster and functional characterization of the gene product (22), which extended earlier biochemical studies on insect core fucosyltransferase activities (23,24).
Relatively few reports have focused on insect genes encoding the glycosyltransferases mediating elongation of N-glycan termini. One reason for this is that insects are generally considered to lack functional levels of these enzymes. Historically, this conclusion has been based on the absence of detectable levels of these enzyme activities in insect cells and the fact that the processed N-glycans of insect-derived glycoproteins are usually trimmed, paucimannose structures consisting of Man3GlcNAc2 with or without core fucose residues, rather than elongated, complex N-glycans like those produced by mammalian cells (reviewed in Refs. 1-3 and 25-27). However, more recent reports have shown that lepidopteran insect cell lines actually do have low levels of some of the glycosyltransferase activities mediating elongation of N-glycan termini, including N-acetylglucosaminyltransferase I (28,29), N-acetylglucosaminyltransferase II (28), β4-galactosyltransferase (30,31), and β4-N-acetylgalactosaminyltransferase (30). In addition, various reports have documented the presence of terminal N-acetylglucosamine, galactose, N-acetylgalactosamine, and even sialic acid residues on N-glycans produced by insect cells (31-43). Finally, putative N-acetylglucosaminyltransferase, galactosyltransferase, and sialyltransferase genes have been identified in the D. melanogaster data bases (44), and the biochemical functions of the putative fly N-acetylglucosaminyltransferase I (45) and sialyltransferase (46) gene products have been experimentally confirmed in published studies. These and other relatively recent observations support the hypothesis that insect cells have a branched protein N-glycosylation pathway, which includes a previously unrecognized repertoire of terminal glycosyltransferases that can mediate N-glycan elongation (26,47).
The goals of the present study were to further test this branched pathway hypothesis and to extend the previous report of van Die et al. (30), who found that insect cell lines derived from Trichoplusia ni and, to a lesser extent, Spodoptera frugiperda, have low levels of β4-galactosyltransferase and β4-N-acetylgalactosaminyltransferase activities. This finding was among the first to suggest that insect cells could produce complex N-glycans. It was subsequently extended by demonstrations that a subpopulation of the N-glycans isolated from recombinant glycoproteins produced in T. ni cells have terminal galactose residues (32,38,39) and that some N-glycans isolated from uninfected T. ni have terminal N-acetylgalactosamine residues (43). Our specific aims were to molecularly clone a member of the β4-galactosyltransferase gene family from T. ni and examine the function of the gene product. Both goals were successfully accomplished. We isolated a T. ni cDNA that belongs to the β4-galactosyltransferase family and showed that the gene product is closely related to the recently described β4-N-acetylgalactosaminyltransferase of Caenorhabditis elegans (48). The T. ni product had unusually broad donor and acceptor substrate specificities in vitro, but clearly preferred N-acetylgalactosamine as the donor for monosaccharide, glycoprotein, and glycolipid acceptor substrates. These results suggested that the function of the T. ni cDNA product is to transfer N-acetylgalactosamine residues to newly synthesized glycoproteins, glycolipids, or both. This finding had significant implications for the notion that a C. elegans β4-N-acetylgalactosaminyltransferase might be able to participate in both glycoprotein and glycolipid biosynthesis, as will be detailed under "Discussion." However, in accordance with our interest in insect protein N-glycosylation pathways, we focused the remainder of our present study on the potential role of the T. ni enzyme in N-glycoprotein biosynthesis.
The results unequivocally confirmed that the new cDNA isolated in this study encodes a β4-N-acetylgalactosaminyltransferase involved in N-glycoprotein glycan elongation. Thus, this cDNA encodes the enzyme involved in the synthesis of GalNAc-β4GlcNAc-R, also known as LacDiNAc or LDN, which is found in the terminal regions of some N-glycans (49,50), including those of some T. ni glycoproteins (43). This gene product can account for the β4-N-acetylgalactosaminyltransferase activity originally detected in T. ni cells by van Die et al. (30). It could be the source of the low level galactosyltransferase activity observed in that study, as well. The molecular cloning of the β4-N-acetylgalactosaminyltransferase cDNA and characterization of the gene product contribute to the growing body of molecular genetic evidence that insects can produce complex N-glycans. In addition, the T. ni cDNA isolated in this study is the first β4-N-acetylgalactosaminyltransferase gene to be described in any insect system and only the second experimentally confirmed β4-N-acetylgalactosaminyltransferase gene to be reported in any system.

EXPERIMENTAL PROCEDURES
Cells and Cell Culture-The two insect cell lines used in this study were Sf9 (6), which is a subclone of the IPLB-Sf21-AE line originally isolated from S. frugiperda ovaries (51), and Tn-5B1-4, also known as High Five™ (Invitrogen), which was originally isolated from T. ni embryos (52). Both cell lines were routinely maintained as suspension cultures at densities of about 0.3-3.0 × 10⁶ cells per ml in TNM-FH medium (6) supplemented with 10% (v/v) fetal bovine serum (Hyclone Inc., Logan, UT) and 0.1% (w/v) pluronic F68 (BASF Wyandotte Corp., Parsippany, NJ; Ref. 53). This same medium was also used to culture Tn-5B1-4 cells as adherent cultures in 25-cm² flasks (Corning Inc., Corning, NY) containing about 1.5-6.0 × 10⁶ total cells. In addition, a separate culture of Sf9 cells was maintained in suspension at densities of about 0.3-3.0 × 10⁶ cells per ml in SFX-INSECT serum-free medium (Hyclone) for use in the enzyme purification experiments.
Preparation of cDNA from Membrane-bound T. ni RNA-Microsomal membranes were isolated from log phase Tn-5B1-4 cell cultures using a previously described method (54). Briefly, about 1 × 10⁸ cells were washed with ice-cold Tris-buffered saline (50 mM Tris-HCl, pH 7.0, containing 0.9% w/v NaCl), then the cells were resuspended in 1 mM MgCl2 and Dounce-homogenized on ice until most of the cells appeared to be broken under a phase contrast microscope. The homogenates were centrifuged for 5 min at 1000 × g at 4°C, then the supernatants were harvested and mixed with an equal volume of 60% (w/v) sucrose in membrane buffer (1 mM Tris-HCl, pH 7.5, containing 1 mM MgCl2). 10-ml aliquots of this solution were then applied to hand-layered step sucrose gradients in membrane buffer, which consisted of 3 ml of 60% (w/v) sucrose, 7 ml of 45% (w/v) sucrose, and 7 ml of 40% (w/v) sucrose. The samples were overlaid with 7 ml of 25% sucrose, and then the gradients were centrifuged for 20 h at 20,500 rpm at 4°C in a Beckman SW28 rotor. The membrane band at the 45-60% sucrose interface was collected by side puncture, diluted 1:5 with membrane buffer, and pelleted for 1 h at 29,000 rpm at 4°C in a Beckman Ti45 rotor. The pellets were resuspended in TE buffer (0.1 mM Tris-HCl, pH 8.0, containing 1 mM EDTA), and the membrane-bound RNA was extracted with phenol and then with phenol:chloroform (1:1), ethanol-precipitated, redissolved, and quantified by spectrophotometry. The RNA was further purified by oligo dT-cellulose column chromatography, as previously described (55), then converted to cDNA with the GeneRacer™ kit (Invitrogen), according to the manufacturer's protocol, with GeneRacer™ RNA Oligo as the 5′ anchor and GeneRacer™ Oligo dT as the reverse transcription primer.
Isolation of a T. ni cDNA Encoding a β4-Galactosyltransferase Family Member-The cDNA produced with the membrane-bound T. ni RNA was subsequently used as the template for nested PCR (Ref. 56) with degenerate oligonucleotide primers designed against conserved regions of known β4-galactosyltransferase family members. The sequences of the conserved regions and degenerate primers that ultimately yielded specific PCR products are shown in Table I. The primary PCRs were performed in a total volume of 0.050 ml and contained 10 mM Tris-HCl (pH 9.0), 0.1% (v/v) Triton X-100, 50 mM KCl, 1.5 mM MgCl2, 200 μM dNTPs, 1.25 units of Taq polymerase (Promega), 40 μM each primer, and 100 ng of cDNA. These reactions were incubated for 2 min at 95°C prior to addition of the primers, then the primers were added and the reactions were cycled 15 times using: (i) 2 min at 95°C, (ii) 25 s at 55°C, and (iii) 1 min at 72°C in a PerkinElmer Applied Biosystems Model 2400 GeneAmp thermal cycler (Foster City, CA). A touchdown PCR method (57) was used in which the annealing temperature, which was 55°C in the first cycle, was reduced by 1°C per cycle during these first 15 cycles, and then 35 additional cycles were performed using (i) 30 s at 95°C, (ii) 20 s at 40°C, and (iii) 1 min at 72°C. After a final extension period of 6 min at 72°C, aliquots of the spent reactions were used as the templates for secondary PCRs under the same conditions as for the primary PCRs. The spent secondary PCRs were harvested and analyzed on 1% Seaplaque® low melting temperature agarose gels. A specific amplification product of about 135 bp was recovered from the gel, cloned into pCR2.1-TOPO® (Invitrogen), and sequenced using universal primers. The resulting nucleotide sequence was used to design an internal, gene-specific primer, which was used for nested 3′-RACE with the GeneRacer™ kit.
The primary reactions utilized a gene-specific sense primer (5′-CTCGACACCAGGAACCTGTAC-3′) and the antisense GeneRacer™ 3′ primer, while the secondary reactions utilized a downstream, gene-specific sense primer (5′-CCGCGACATATGTCCGCTTCC-3′) and the antisense GeneRacer™ nested 3′ primer. The template for the primary reactions was 100 ng of the T. ni cDNA preparation described above and the template for the secondary reactions was an aliquot of the spent primary reaction. In both cases, the 3′-RACE reactions were incubated for 2 min at 94°C prior to addition of the primers, then the primers were added and the reactions were cycled according to the manufacturer's protocol, which included five cycles of (i) 30 s at 94°C, (ii) 1 min at 72°C; and then five cycles of (i) 30 s at 94°C, (ii) 30 s at 70°C, (iii) 1 min at 72°C; and then 20 cycles of (i) 30 s at 94°C, (ii) 30 s at 50°C, (iii) 1 min at 72°C. After a final extension period of 10 min at 72°C, the spent secondary 3′-RACE reactions were harvested and analyzed by agarose gel electrophoresis, as described above, and a specific amplification product of about 210 bp was recovered from the gel, cloned into pCR2.1-TOPO® (Invitrogen), and sequenced using universal primers. The resulting sequence data were used to design two pairs of additional gene-specific primers, which were used to screen a UNZAP (Stratagene) T. ni cDNA library by sibling selection and PCR, as described previously (8,9). This library was derived from poly(A)+ RNA isolated from Tn-368 cells (58) and was kindly provided by Dr. Paul Friesen of the University of Wisconsin. Ultimately, a specific lambda clone was identified and plaque-purified, and the plasmid insert was excised using the ExAssist® method (Stratagene) according to the manufacturer's protocol.
Bacterial colonies containing this plasmid were identified and amplified, and the plasmid, designated pTnβ4GalT, was extracted, purified, and used to sequence both strands of the entire cDNA insert with universal and gene-specific primers. The cDNA sequence was assembled from the raw nucleotide sequence data using AssemblyLIGN v. 1.0.9b (Accelrys; San Diego, CA) and deposited in GenBank™ as accession number AY601103. This sequence was translated and multiple sequence alignments were performed using SIXFRAME and ClustalW, respectively, via the San Diego Supercomputing Center Biology Workbench website (workbench.sdsc.edu/). Biology Workbench also was used to analyze the nucleotide and amino acid sequences of the putative β4-galactosyltransferase family member by AASTATS, which was used to calculate the theoretical molecular weight of the apoprotein, and TMHMM and TMAP, which were used to identify potential transmembrane domains. NetNGlyc (www.cbs.dtu.dk/services/NetNGlyc/; Ref. 59) was used to identify potential N-glycosylation sites and BLAST-P (60) via the National Center for Biotechnology Information website (www.ncbi.nlm.nih.gov/BLAST/) was used to identify related sequences in the nonredundant protein data base.
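The touchdown profile described above for the primary and secondary PCRs can be written out cycle by cycle. The following is a minimal sketch in Python, using only the temperatures and hold times stated in the text; the function name and the (temperature, seconds) data layout are illustrative assumptions, and ramp times and the final 6-min extension are not modeled.

```python
def touchdown_program():
    """Build the per-cycle steps as lists of (temperature_C, seconds) tuples.

    Phase 1: 15 cycles, annealing starts at 55 C and drops 1 C per cycle.
    Phase 2: 35 cycles with a fixed 40 C annealing step.
    """
    program = []
    for cycle in range(15):
        anneal = 55 - cycle  # 55, 54, ..., 41
        program.append([(95, 120), (anneal, 25), (72, 60)])
    for _ in range(35):
        program.append([(95, 30), (40, 20), (72, 60)])
    return program


program = touchdown_program()
# Annealing temperature of each cycle (second step of every cycle).
anneal_temps = [cycle[1][0] for cycle in program]
# Total programmed hold time in seconds, excluding ramps and final extension.
hold_seconds = sum(seconds for cycle in program for _, seconds in cycle)

if __name__ == "__main__":
    print(anneal_temps)
    print(hold_seconds)
```

Listing the annealing temperatures makes it easy to see that the last touchdown cycle anneals at 41°C, one degree above the fixed 40°C used for the remaining 35 cycles.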
Expression of the T. ni cDNA with an Immediate Early Baculovirus Vector-The putative T. ni β4-galactosyltransferase cDNA was excised by digesting pTnβ4GalT with BamHI and DraI and then the 2179-bp product was gel-purified and subcloned into the BglII and StuI sites of the immediate early baculovirus transfer plasmid, pAcP(+)IETV3 (61). The resulting plasmid, designated pAcP(+)IETnβ4GalT, encoded the full-length putative β4-galactosyltransferase family member under the transcriptional control of a baculovirus ie1 promoter and hr5 enhancer. This plasmid was purified by a standard CsCl/ethidium bromide density gradient centrifugation method (62), mixed with Bsu36I-digested BakPak6 baculoviral DNA (63), and the mixture was used to co-transfect Sf9 cells by a modified calcium phosphate precipitation method (6). The baculovirus progeny were resolved by plaque purification and an immediate early recombinant was identified by its occlusion-positive/white plaque phenotype and subjected to two additional rounds of plaque purification. Working stocks of this virus, designated AcP(+)IETnβ4GalT, were prepared by low multiplicity infection of Sf9 cells, titered by plaque assays in the same cell line, and used to express the T. ni cDNA product. Briefly, Sf9 cells were infected with either AcP(+)IETnβ4GalT or wild-type baculovirus at a multiplicity of infection of about 5 plaque-forming units per cell. At 24 h post-infection, the infected cells were harvested, washed once with 75 mM sodium cacodylate, pH 7.0, and extracted for 10 min on ice with β4-galactosyltransferase buffer (50 mM sodium cacodylate, pH 7.0; 20 mM MnCl2, 1.0% (v/v) Triton X-100, 4 mM ATP, and Complete™ protease inhibitor mixture (EDTA-free, Roche Applied Science, Indianapolis, IN)).
The cell extracts were clarified for 10 min at top speed in a refrigerated microcentrifuge (Hermle Model Z180M), the supernatants were harvested, and total protein concentrations were determined using a commercial bicinchoninic acid assay kit (Pierce).
β4-Galactosyltransferase Family Member Assays-Triplicate samples of infected cell extracts containing 50 μg of total protein were assayed in a total reaction volume of 0.050 ml containing 1 mM p-nitrophenyl N-acetyl-β-D-glucosaminide (pNP-GlcNAc) as the acceptor substrate and 0.3 μCi of UDP-galactose, [1][2][3] Ci/mmol) as the donor substrate. The reactions were incubated for 1 h at 37°C and then terminated by adding 1.0 ml of ice-cold water. The spent reactions were then immediately applied to Sep-Pak C18 cartridges (Millipore Corp., Bedford, MA), which were subsequently washed with water and eluted with methanol. The eluants were added to 5 ml of Scintisafe Plus 50% scintillation mixture (Fisher Scientific), and incorporated radioactivity was measured in a Model LS-6500 liquid scintillation spectrometer (Beckman Coulter Instruments). These same assay conditions were used to measure the activity of a purified affinity-tagged, soluble domain of the cDNA product (shown diagrammatically in Fig. 5A), which is described below. We also measured the ability of the purified enzyme to transfer each of the three donor substrates to ovalbumin with a trichloroacetic acid precipitation assay, as described previously (64). Finally, we assayed the ability of the purified enzyme to transfer each of the three donor substrates to the glycolipid acceptor, 65), using the Sep-Pak method described above. The results of all of these assays are presented as the average amounts of monosaccharide transferred/g protein/h with standard deviations calculated from the individual measurements.
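The way triplicate scintillation counts are reduced to an average transfer rate with a standard deviation can be sketched as follows. This is an illustrative calculation only: the specific activity passed in, the example counts, the per-microgram normalization, and the assumption that counts are already expressed as disintegrations per minute (dpm) are all hypothetical choices, not values taken from the assays above.

```python
from statistics import mean, stdev

DPM_PER_CI = 2.22e12  # disintegrations per minute in one curie


def pmol_transferred(dpm, specific_activity_ci_per_mmol):
    """Convert incorporated radioactivity (dpm) to pmol of monosaccharide."""
    curies = dpm / DPM_PER_CI
    mmol = curies / specific_activity_ci_per_mmol
    return mmol * 1e9  # 1 mmol = 1e9 pmol


def transfer_rate(dpm_values, specific_activity_ci_per_mmol,
                  protein_ug=50.0, hours=1.0):
    """Return (mean, sample SD) of pmol transferred per ug protein per hour."""
    rates = [pmol_transferred(d, specific_activity_ci_per_mmol)
             / protein_ug / hours for d in dpm_values]
    return mean(rates), stdev(rates)


# Hypothetical triplicate counts and a hypothetical 15 Ci/mmol donor.
avg_rate, sd_rate = transfer_rate([33300, 66600, 99900], 15.0)

if __name__ == "__main__":
    print(avg_rate, sd_rate)
```

The defaults of 50 μg of protein and 1 h mirror the extract assay conditions described in the text; for purified enzyme the protein amount would need to be replaced accordingly.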
Expression and Purification of an Affinity-tagged, Soluble Form of the T. ni cDNA Product-The nucleotide sequence encoding the predicted soluble domain of the putative T. ni β4-galactosyltransferase family member (amino acids 33-421; Fig. 1) was amplified using standard PCR conditions (56) with pTnβ4GalT as the template and sense (5′-GAATTCATGAAGACCAAGTGGACCCGCGCTTGATTGACGGCAAATCGCCGCTCCGGACCTATCTC-3′) and antisense (5′-GAATTCCCATCAGCTACGCTCGTC-3′) primers, which added a sequence encoding the 12 amino acid HPC4 epitope (66) onto the 5′-end and EcoRI sites onto both ends of the product. The PCR product was gel-purified, cloned into pCR2.1-TOPO®, and a clone with the correct nucleotide sequence was identified by sequencing with universal primers. The EcoRI fragment was then excised, gel-purified, and subcloned into the unique site of pAcSecG2T (BD Biosciences), in which the glutathione S-transferase (GST) coding sequence had been deleted from EcoNI to BamHI. Sequencing revealed that the insert was out-of-frame with respect to the upstream sequence encoding the baculovirus gp64 signal peptide. This problem was solved by partially digesting the plasmid with EcoRI and religating it, as verified by sequencing the resulting plasmid to confirm the correct reading frame. This final plasmid was used to isolate a recombinant baculovirus designated AcHPC4solTnβ4GalT in Sf9 cells, as described above, except this virus was identified by its occlusion-negative, white plaque phenotype, and it encodes the HPC4-tagged predicted soluble domain of the T. ni cDNA product (shown diagrammatically in Fig. 5A) under the control of the strong polyhedrin promoter. This virus was amplified and titered, and working stocks were used to infect 50 ml of Sf9 cells cultured in suspension to a density of about 1 × 10⁶ cells/ml in Sfx-INSECT serum-free medium (HyClone) at a multiplicity of about 5 plaque-forming units per cell.
The infected cells were harvested at 72 h post-infection, and cell-free medium was prepared by low speed centrifugation. The cell-free medium was adjusted to 3 mM CaCl2 and tumbled for 2 h at room temperature with 0.25 ml of mouse anti-protein C beads (4 mg of HPC4 antibody/ml; Roche Applied Science). The beads were then gently pelleted (5 min at 750 × g), washed three times with 2 ml of wash buffer (100 mM sodium cacodylate, pH 7.0; 3 mM CaCl2, 0.1% Triton X-100), and the bound protein was eluted with 0.5 ml of elution buffer (50 mM sodium cacodylate, pH 7.0, 5 mM EDTA).
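The sense primer given above can be checked for the epitope it is said to add by translating its three forward reading frames. The sketch below does this with the standard genetic code. The 12-residue HPC4 sequence used for the search (EDQVDPRLIDGK) is not stated in this text and is supplied here as an assumption from general knowledge of the protein C epitope tag; the function and variable names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Sense primer from the text (5' to 3'), with the line-break hyphen removed.
SENSE_PRIMER = ("GAATTCATGAAGACCAAGTGGACCCGCGCTTGATTGACGGCAAATCG"
                "CCGCTCCGGACCTATCTC")
# Assumed HPC4 epitope sequence (not given in the text).
HPC4_EPITOPE = "EDQVDPRLIDGK"


def translate(dna, offset=0):
    """Translate complete codons of dna starting at the given offset."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Translate all three forward frames and record which contain the epitope.
frames = {offset: translate(SENSE_PRIMER, offset) for offset in range(3)}
epitope_frames = [off for off, pep in frames.items() if HPC4_EPITOPE in pep]

if __name__ == "__main__":
    for offset, peptide in frames.items():
        print(offset, peptide)
    print("epitope found in frame offsets:", epitope_frames)
```

Under these assumptions the epitope appears in only one of the three frames, followed by the first residues of the amplified domain. Which frame is actually read in the final construct depends on the upstream vector sequence, which is not given here.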
Characterization of the Purified, Soluble Form of the T. ni cDNA Product-The affinity-purified fusion protein described above was initially analyzed by discontinuous SDS-PAGE (67) and immunoblotting (68) with commercial mouse anti-protein C (Roche Applied Science) and alkaline phosphatase-conjugated goat anti-mouse IgG (Promega) as the primary and secondary antibodies, respectively, and a standard alkaline phosphatase-based color reaction as the detection method (69). The enzymatic activity of the affinity-purified protein was examined using the in vitro assays described above, and its apparent Km values for UDP-N-acetylgalactosamine and pNP-GlcNAc were determined using modified versions of the same type of assay. To determine the apparent Km for the donor substrate, assays were performed in the presence of 1.0 mM pNP-GlcNAc, nonradioactive UDP-N-acetylgalactosamine (Sigma) concentrations ranging from 0.150 to 12 mM, and a constant ratio of tritiated UDP-N-acetylgalactosamine with a specific activity of 15 Ci/mmol. To determine the apparent Km for the acceptor substrate, assays were performed in the presence of 1.56 mM nonradioactive UDP-N-acetylgalactosamine, 0.2 μCi of tritiated UDP-N-acetylgalactosamine, and pNP-GlcNAc concentrations ranging from 0.05 to 1.0 mM. The incubation time used for the Km determinations was 15 min, which was within the linear range for the purified enzyme, as shown by preliminary time course experiments (data not shown).
Finally, the ability of the affinity-purified protein to transfer galactose, N-acetylglucosamine, and N-acetylgalactosamine to a synthetic oligosaccharide acceptor substrate was examined in 0.05 ml β4-galactosyltransferase assays containing 50 mM sodium cacodylate, pH 7.0, 20 mM MnCl2, 4 mM ATP, 10 mM donor substrate (UDP-galactose, UDP-N-acetylglucosamine, or UDP-N-acetylgalactosamine; Sigma) and 0.02 μM [GlcNAcβ1-Manα6](GlcNAcβ1-Manα3)Manβ1GlcNAcβ1GlcNAc labeled with pyridylamine at its reducing terminus (GnGn-PA; Calbiochem). These reactions were incubated overnight at 37°C, and then the reaction products were analyzed by high performance liquid chromatography (HPLC) using a Dionex (Sunnyvale, CA) Model DX-500 system equipped with a Model RF 2000 fluorescence detector and a 5-μm C18 reverse phase column (4.6 × 250 mm, Dionex). The HPLC profiles were developed using buffers A (0.1 M ammonium acetate, pH 4) and B (30% methanol) with a step gradient that included 12% buffer B for 5 min, 12-20% buffer B from 5-15 min, and 20-100% buffer B from 15-30 min, with a flow rate of 1 ml/min. In some experiments, the transferase reaction was followed by treatment with β-N-acetylhexosaminidase. In these experiments, a single 0.100-ml N-acetylgalactosaminyltransferase reaction was evaporated to dryness, redissolved in 50 mM sodium citrate buffer, pH 5.0, split into two aliquots, and incubated overnight at 37°C with either buffer alone or 0.8 units of β-N-acetylhexosaminidase (Seikagaku Corporation, Tokyo, Japan) prior to the HPLC analysis.
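The text does not state how the apparent Km values were extracted from the rate measurements. One common option, shown here purely as an assumed illustration, is a Hanes-Woolf linearization, in which [S]/v is regressed on [S] so that the slope is 1/Vmax and the intercept is Km/Vmax. The substrate concentrations below fall within the donor range given above, but the rates are synthetic values generated from a hypothetical Km of 1.0 mM and Vmax of 100 arbitrary units.

```python
def hanes_woolf(substrate, velocity):
    """Fit [S]/v = [S]/Vmax + Km/Vmax by least squares; return (Km, Vmax)."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free data from hypothetical kinetic constants.
TRUE_KM, TRUE_VMAX = 1.0, 100.0
donor_mM = [0.15, 0.5, 1.5, 4.0, 12.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in donor_mM]

km_fit, vmax_fit = hanes_woolf(donor_mM, rates)

if __name__ == "__main__":
    print(km_fit, vmax_fit)
```

With real, noisy measurements a direct nonlinear fit of the Michaelis-Menten equation is often preferred, because linearized plots weight the errors unevenly.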
Mass Spectroscopic Analysis of the Transferase Reaction Products-The in vitro transferase reaction with the affinity-purified enzyme, UDP-N-acetylgalactosamine donor, and GnGn-PA acceptor substrate was scaled up 10-fold relative to the conditions described above and the two major products were partially resolved by HPLC, collected as they eluted from the column, desalted by batch absorption with a Dowex 50WX8-200 cation exchange resin, and dialyzed by drop dialysis (70) against ultrapure water. The samples were then mixed with 2,5-dihydroxybenzoic acid matrix (100 mg/ml in 50% acetonitrile, 0.1% trifluoroacetic acid), dried onto matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) plates, and analyzed by MALDI-TOF mass spectroscopy on a Voyager DE-PRO instrument operating in positive ion mode.
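When interpreting spectra of this kind, it helps to know the mass increment expected for each monosaccharide residue added to the acceptor. The sketch below computes monoisotopic residue masses from elemental composition; the residue formulas and atomic masses are standard values supplied here rather than taken from the text, and the helper names are illustrative. Note that N-acetylgalactosamine and N-acetylglucosamine residues share the same composition (HexNAc), so mass alone would not be expected to distinguish them.

```python
# Monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of glycosidically linked residues (sugar minus H2O).
RESIDUES = {
    "Hex": {"C": 6, "H": 10, "O": 5},             # e.g. galactose
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},  # e.g. GalNAc or GlcNAc
}


def residue_mass(name):
    """Monoisotopic mass of one residue, summed over its elements."""
    return sum(MONOISOTOPIC[el] * count
               for el, count in RESIDUES[name].items())


def expected_shift(name, n_added):
    """Mass increase expected after adding n_added residues of one type."""
    return n_added * residue_mass(name)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, "HexNAc:", round(expected_shift("HexNAc", n), 4))
        print(n, "Hex:", round(expected_shift("Hex", n), 4))
```

A product differing from the acceptor by one or two HexNAc increments would be consistent with transfer to one or both GlcNAc termini of GnGn-PA, although the observed m/z also depends on the adduct ion, which is not modeled here.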
¹H Nuclear Magnetic Resonance Spectroscopic Analysis of the Transferase Reaction Products-For NMR analysis, the N-acetylgalactosaminyltransferase reaction was similarly scaled up to include 2 μg of affinity-purified enzyme, 50 mM UDP-N-acetylgalactosamine, and 50 nmol of GnGn-PA, then incubated for 48 h under the other conditions described above. The reaction products were resolved by HPLC, collected as they eluted from the column, and desalted as described above. Hydroxyl hydrogens in the glycan were exchanged with deuterium by repeated dissolution in 99.9% D2O and intermittent lyophilization. The deuterium-exchanged glycan (~40 nmol) was dissolved in 0.6 ml of 99.96% D2O (Cambridge Isotope Labs, Andover, MA) and subjected to ¹H NMR analysis at 500 MHz. The NMR spectrum was recorded at 27°C on a Bruker Avance 500 spectrometer equipped with a 5 mm TBI probe and interfaced with an HP work station running XWinNMR 3.5 under Linux 7.1 (College of Chemistry, University of California, Berkeley, CA). Acquisition parameters included digital quad detection, 64 K complex data points, spectral width 20 ppm, pulse flip angle 70°, and number of scans 1024. Data were processed off-line using Bruker's TopSpin for MS Windows 2000/XP software (version 1.2.b; Bruker BioSpin, Billerica, MA). Chemical shifts are reported relative to DSS (but were actually measured by reference to a trace of internal acetate at δ 1.910 in D2O) with an accuracy of 0.002 ppm.
Functional Analysis of the T. ni cDNA Product in Vivo-The reporter glycoprotein used for these studies was GST-SfManI, a GST-tagged soluble domain of the S. frugiperda class I Golgi α-mannosidase (8), which was expressed under the control of the strong polyhedrin promoter by infecting Sf9 cells with a recombinant baculovirus designated AcGST-SfManI (10). Three 50-ml Sf9 suspension cultures were grown to a density of about 1 × 10⁶ cells/ml in TNM-FH and then simultaneously infected at a total multiplicity of infection of 10 plaque-forming units per cell (5 plaque-forming units per cell for each virus) with mixtures of AcGST-SfManI plus wild-type baculovirus, AcP(+)IEβ4GalT (71), or AcP(+)IETnβ4GalT. The infected cells were harvested at 72 h post-infection and the GST-SfManI was affinity-purified, as described previously. Samples of the purified fusion protein were then analyzed by SDS-PAGE with immunoblotting or lectin blotting, as described previously (72,73). Parallel samples of the model glycoprotein isolated from each infected cell type were treated with buffer alone or PNGase-F (New England Biolabs, Beverly, MA) to demonstrate the specificity of the lectin blotting assay, as described previously (72,73). The primary and secondary antisera used for the immunoblotting analyses were rabbit polyclonal anti-GST (Sigma) and alkaline phosphatase-conjugated goat anti-rabbit IgG (Sigma), respectively. The lectins used in this study were biotinylated Ricinus communis agglutinin (RCA; EY Laboratories, San Mateo, CA), Wisteria floribunda agglutinin (WFA; Vector Laboratories, Burlingame, CA), and Griffonia simplicifolia lectin II (GS-II; Vector), with alkaline phosphatase-conjugated streptavidin (Vector) as the secondary detection reagent. Both types of assay were developed using a standard alkaline phosphatase color reaction (69).

Intracellular Distribution of the T. ni cDNA Product-A DNA fragment encoding the full-length putative T. ni β4-galactosyltransferase family member (amino acids 1-421) was produced by PCR under standard conditions (56) with pTnβ4GalT as the template and sense (5′-AGATCTACCATGGGCGGGCGGGCGACGCGCGCG-3′) and antisense (5′-GGATCCGCGCTACGCTCGTCAATGTTCACTAGTAT-3′) primers that added unique BglII and BamHI sites to its 5′- and 3′-ends, respectively. This fragment was gel-purified and cloned into pCR2.1-TOPO®, and a clone with the correct nucleotide sequence was identified by sequencing with universal primers. The 1290 bp BglII-BamHI fragment was then excised, gel-purified, and subcloned into the corresponding sites of pEGFP-N1 (BD Biosciences Clontech), which positioned the full-length T. ni protein coding sequence upstream of and in-frame with a sequence encoding the red-shifted variant of Aequorea victoria green fluorescent protein (GFP; Ref. 74). The fused coding sequence was subsequently excised with BglII and NotI, gel-purified, and subcloned into the corresponding sites of the immediate early baculovirus transfer plasmid, pAcP(+)IE1TV3 (61). Finally, the resulting plasmid was used to produce an immediate early recombinant baculovirus, designated AcP(+)IETnβ4GalT-GFP, as described above. This virus was identified by its occlusion-positive, white plaque phenotype and purified, amplified, and titered as described above. The intracellular distribution of the fusion protein was determined by living cell fluorescence experiments. Sf9 cells were infected with AcP(+)IETnβ4GalT-GFP at a multiplicity of infection of 5 plaque-forming units per cell, and then the infected cells were harvested and washed with fresh medium at 24 h post-infection. The infected cells were stained for 1 h with 450 nM BODIPY-TR ceramide (Molecular Probes Inc., Eugene, OR), washed with fresh medium, and examined under a Leica TCS SP2 confocal laser-scanning microscope (Leica Microsystems, Heidelberg, Germany).
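The primer design described above can be checked with a short script. This is only an illustrative sketch: the primer sequences are copied from the text, the BglII (AGATCT) and BamHI (GGATCC) recognition sequences are standard, and the helper names are hypothetical.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BglII": "AGATCT", "BamHI": "GGATCC"}

# Primer sequences as given in the text (5' to 3').
SENSE = "AGATCTACCATGGGCGGGCGGGCGACGCGCGCG"
ANTISENSE = "GGATCCGCGCTACGCTCGTCAATGTTCACTAGTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def leading_site(primer: str):
    """Return the name of the restriction site at the 5' end of a primer, if any."""
    for name, site in SITES.items():
        if primer.upper().startswith(site):
            return name
    return None


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


sense_site = leading_site(SENSE)            # site added to the 5' end of the product
antisense_site = leading_site(ANTISENSE)    # site added to the 3' end of the product
start_codon_index = SENSE.find("ATG")       # 0-based position of the first ATG
antisense_on_coding_strand = reverse_complement(ANTISENSE)
```

Because both recognition sequences are palindromic, the BamHI site reads identically when the antisense primer is viewed on the coding strand, where it lies at the 3′ end of the amplified fragment.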

Isolation and Characterization of an Insect β4-Galactosyltransferase Family Member-The cDNA described in this study was isolated on the basis of short, conserved amino acid sequences in β4-galactosyltransferase family members, which were noted in the literature by several different investigators during the late 1990s (75-80). Degenerate oligonucleotides encoding four of these conserved amino acid sequences (Table I) were synthesized and used as primers for nested PCRs with cDNA prepared from T. ni membrane-bound RNA, as described under "Experimental Procedures." No specific amplification products were observed upon analysis of the primary PCRs by agarose gel electrophoresis and ethidium bromide staining (primers S6 and A5; data not shown). However, a specific product of about the expected size (135 bp) was observed in the secondary, nested PCRs performed with primers S4 and A6 (data not shown). This PCR product was cloned, sequenced, and analyzed by BLAST-P (60), and the results indicated that it was most closely related to β4-galactosyltransferase family members (data not shown). Thus, the sequence of this PCR product was used to design a gene-specific primer for 3′-RACE, as described under "Experimental Procedures." The 3′-RACE reactions yielded a specific amplification product of about 210 bp, which was also cloned and sequenced. The resulting sequence was used to design exact-match primers, which were used to screen a TN-368 cell cDNA library by a sibling selection and PCR method, as described previously (8,9). This screen ultimately yielded a positive lambda clone, and the cDNA insert was excised and sequenced, as described under "Experimental Procedures." The nucleotide and deduced amino acid sequences of this new T. ni cDNA are shown in Fig. 1 and have been deposited in GenBank™ (acc. no. AY601103). The 1266 nucleotide open reading frame encodes a theoretical protein of 421 amino acids with a calculated molecular weight of 48,305.
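The relationship between the open reading frame length and the protein length quoted above can be verified with simple arithmetic. The sketch below assumes that the 1266 nucleotide figure includes the stop codon, which is the reading that reproduces the stated 421 residues; the function name is illustrative.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given nucleotide length."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


n_residues = protein_length(1266)            # ORF length quoted in the text
mean_residue_mass = 48305 / n_residues       # calculated molecular weight / residues
```

The resulting mean residue mass of roughly 115 Da is in the range commonly seen for proteins, so the three quoted figures are mutually consistent.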
Computer analysis of the putative amino acid sequence with various algorithms indicated that it includes a single transmembrane domain near the N terminus. For example, the TMHMM algorithm predicted a single transmembrane domain from amino acids 7 to 29 of the putative protein. Further analysis using NetNGlyc (59) revealed that the putative protein has five consensus N-glycosylation sites. BLAST-P analysis revealed that the putative cDNA product is most similar to known members of the β4-galactosyltransferase family (Table II). The most statistically significant hits were putative β4-galactosyltransferase family members from Anopheles gambiae and D. melanogaster, as well as a known β4-N-acetylgalactosaminyltransferase from C. elegans (48). The specific similarities between the putative T. ni β4-galactosyltransferase family member and selected known or predicted β4-galactosyltransferase family members are shown in a ClustalW multiple sequence alignment (Fig. 2). The key features conserved among the members of this family include a metal binding site (DXD), four cysteine residues, and the major sequence motif WGWGGEDDD (76-78, 80, 81). The deduced T. ni cDNA product includes the metal binding site (amino acids 274-276), three of the four conserved cysteine residues (amino acids 196, 269, and 288), and the WGWGGEDDD motif (amino acids 334-342; Fig. 2). However, the sequence of the metal binding site is DID, rather than the DVD found in most other family members. A phylogenetic tree (82) clearly revealed that the putative T. ni β4-galactosyltransferase family member is most closely related to the predicted insect and known worm enzymes (Fig. 3). Together, the results of these in silico analyses strongly suggested that the new T. ni cDNA isolated in this study encodes a member of the β4-galactosyltransferase family. In addition, the close relationship of the derived cDNA product to the recently characterized C. elegans family member (48) indicated that this product might be a β4-N-acetylgalactosaminyltransferase.
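A motif scan of the kind summarized above can be sketched with regular expressions. The example below is illustrative only: the sequence is a short hypothetical fragment invented for the demonstration, not the T. ni protein, and the function and variable names are assumptions.

```python
import re

# Family features named in the text, written as regular expressions.
MOTIFS = {
    "DXD": r"D[A-Z]D",            # metal binding site; X is any residue
    "WGWGGEDDD": r"WGWGGEDDD",    # major conserved sequence motif
}


def find_motifs(protein: str) -> dict:
    """Return 1-based (start, end, match) tuples for each motif (non-overlapping)."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start() + 1, m.end(), m.group())
            for m in re.finditer(pattern, protein.upper())
        ]
    return hits


# Hypothetical toy sequence containing a DID site and the conserved motif.
toy_sequence = "MKTLDIDAACWGWGGEDDDLLK"
toy_hits = find_motifs(toy_sequence)
```

Note that the DXD pattern is short and matches often by chance (here it also matches the DDD at the end of the longer motif), so in practice a hit is only meaningful at the position aligned with the metal binding site of other family members.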
Expression and Biochemical Characterization of the Native T. ni cDNA Product-The enzymatic activity of the new T. ni cDNA product was initially examined in crude lysates of Sf9 cells infected with a baculovirus expression vector designed to express the full-length, native gene product during the immediate early phase of infection. This expression strategy was chosen because the specific activities of intracellular forms of some recombinant secretory pathway enzymes are higher when they are expressed earlier in infection with this type of baculovirus vector than when they are expressed with a conventional baculovirus vector (61). Lysates from Sf9 cells infected in parallel with a wild-type baculovirus were used to measure the background in each assay, which was subtracted from the levels measured with the recombinant virus-infected cell lysates. The results of assays with UDP-galactose, UDP-N-acetylglucosamine, or UDP-N-acetylgalactosamine as the donor substrates and pNP-GlcNAc as the acceptor substrate are shown in Fig. 4. In each case, the [...] soluble, HPC4-tagged form of the cDNA product, which is shown diagrammatically in Fig. 5A, was used to confirm and further characterize its enzyme activity. A fragment encoding the predicted soluble domain of the cDNA product was PCR-amplified and fused to an HPC4-epitope coding sequence, and the chimeric sequence was transferred into a conventional recombinant baculovirus. This virus was used to express the fusion protein under the control of the strong polyhedrin promoter in Sf9 cells, and then the secreted product was immunoaffinity-purified, as described under "Experimental Procedures." This conventional baculovirus-based expression strategy was used for these experiments because the polyhedrin promoter provides higher levels of foreign gene expression than the ie1 promoter and extracellular forms of recombinant secretory pathway proteins usually have high specific activities (7).
SDS-PAGE with Coomassie Blue staining revealed that the fusion protein had been effectively purified, and immunoblotting with a commercial antiserum against protein C confirmed its identity (Fig. 5B). The purified protein also contained a major immunoreactive lower molecular weight species, which appeared to be a degradation product. Subsequently, aliquots of the purified fusion protein were used for in vitro transferase assays with UDP-galactose, UDP-N-acetylglucosamine, or UDP-N-acetylgalactosamine as the donor substrates and pNP-GlcNAc, ovalbumin, or AP3Cer (65) as carbohydrate, glycoprotein, and glycolipid acceptor substrates, respectively. The results showed that the putative T. ni β4-galactosyltransferase family member could transfer each donor substrate to each acceptor substrate tested (Fig. 6). The purified enzyme transferred only tiny amounts of galactose and N-acetylglucosamine to the glycoprotein and glycolipid acceptors, relative to acceptor-negative controls, and had a clear preference for UDP-N-acetylgalactosamine as the donor, irrespective of the acceptor substrate. Focusing on the N-acetylgalactosaminyltransferase activity, we found that the apparent Km values of this form of the enzyme were about 1.64 mM and 0.38 mM for UDP-N-acetylgalactosamine (Fig. 7A) and pNP-GlcNAc (Fig. 7B), respectively. These results indicated that the T. ni cDNA isolated in this study encodes a β4-N-acetylgalactosaminyltransferase with a role in N-glycoprotein biosynthesis, glycolipid biosynthesis, or both. Because our laboratory is primarily interested in insect protein glycosylation pathways, the remainder of the experiments performed in this study focused on the potential role of this cDNA product in N-glycoprotein biosynthesis.
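To give a sense of what the apparent Km values above imply, the following sketch evaluates the simple Michaelis-Menten relationship, v/Vmax = [S]/(Km + [S]). It assumes single-substrate kinetics with the other substrate held at a fixed concentration, which is a simplification of the two-substrate reaction; the function names are illustrative and no Vmax value is assumed because none is quoted here.

```python
def fractional_rate(substrate_mm: float, km_mm: float) -> float:
    """Fraction of Vmax reached at a given substrate concentration (same units as Km)."""
    return substrate_mm / (km_mm + substrate_mm)


def substrate_for_fraction(fraction: float, km_mm: float) -> float:
    """Substrate concentration needed to reach a given fraction of Vmax."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return fraction * km_mm / (1.0 - fraction)


# Apparent Km values quoted in the text (mM).
KM_UDP_GALNAC = 1.64
KM_PNP_GLCNAC = 0.38

half_rate_donor = fractional_rate(KM_UDP_GALNAC, KM_UDP_GALNAC)     # rate at [S] = Km
donor_for_90pct = substrate_for_fraction(0.9, KM_UDP_GALNAC)        # 9 x Km
acceptor_for_90pct = substrate_for_fraction(0.9, KM_PNP_GLCNAC)     # 9 x Km
```

Under this simplified model, reaching 90% of the maximal rate would require roughly nine times the Km of each substrate.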
The next set of assays was designed to examine the ability of the purified enzyme to transfer galactose, N-acetylglucosamine, or N-acetylgalactosamine to GnGn-PA, a synthetic N-glycan acceptor, in vitro. HPLC analyses showed that there was no change in the elution profile of this acceptor after it was treated with the purified enzyme plus either UDP-galactose or UDP-N-acetylglucosamine (Fig. 8, A-C). Conversely, the acceptor peak disappeared and two new peaks appeared when this same N-glycan acceptor was treated with the purified enzyme plus UDP-N-acetylgalactosamine (Fig. 8D). In time course experiments, the first peak (20.38 min) was observed earliest, the second peak (21.35 min) was observed later, and the first peak disappeared as the second peak appeared (data not shown). These results showed that the purified enzyme preferentially transferred N-acetylgalactosamine to the synthetic N-glycan acceptor substrate and suggested that the enzyme produced a terminally di-N-acetylgalactosaminylated N-glycan through one or both terminally mono-N-acetylgalactosaminylated intermediates. This interpretation was confirmed by MALDI-TOF analysis of the material in the first and second peaks observed in the HPLC profile in Fig. 8D. Because it was necessary to perform scaled-up, incomplete glycosyltransferase reactions to obtain adequate amounts of both the intermediate and end product for the mass spectroscopic analyses, the GnGn-PA acceptor, putative intermediate, and putative end-product were not resolved in the preparative HPLC runs (data not shown). Nevertheless, MALDI-TOF showed that one of the two major species in the first peak from the HPLC run had a mass of about 1620 daltons (Fig. 9A), which was nearly identical to the theoretical mass of a sodium adduct of the presumed intermediate, consisting of GnGn-PA with one terminal N-acetylgalactosamine residue (1621.3 daltons). 
The other major peak in this profile had a mass of about 1417 daltons, which was nearly identical to the mass of a sodium adduct of the acceptor substrate (1418.3 daltons). Similarly, MALDI-TOF showed that one of the major species in the second peak from the HPLC run had a mass of about 1823 daltons (Fig. 9B), which was nearly identical to the theoretical mass of a sodium adduct of the presumed end-product, GnGn-PA with two terminal N-acetylgalactosamine residues (1824.3 daltons). The other major peak in this profile had a mass nearly identical to the expected mass of the intermediate. The additional minor peaks preceding the major peaks differ by the size of a single sodium ion and those following the major peaks are presumed to be isotopic hydrogen variants, as they differ by a single dalton.
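The mass assignments above rest on a simple increment: each added N-acetylhexosamine residue should raise the mass by about 203 daltons. The sketch below checks this using the masses quoted in the text and a residue mass of 203.08 Da (the monoisotopic mass of a HexNAc residue, C8H13NO5); the function name and the choice of monoisotopic rather than average mass are assumptions made for illustration.

```python
HEXNAC_RESIDUE_DA = 203.08  # monoisotopic mass of one N-acetylhexosamine residue


def residues_added(acceptor_mass: float, product_mass: float) -> int:
    """Estimate how many HexNAc residues account for a mass difference."""
    return round((product_mass - acceptor_mass) / HEXNAC_RESIDUE_DA)


# Theoretical sodium-adduct masses quoted in the text (daltons).
THEORETICAL = {"acceptor": 1418.3, "intermediate": 1621.3, "end_product": 1824.3}
# Approximate observed masses quoted in the text (daltons).
OBSERVED = {"acceptor": 1417.0, "intermediate": 1620.0, "end_product": 1823.0}

added_intermediate = residues_added(THEORETICAL["acceptor"], THEORETICAL["intermediate"])
added_end_product = residues_added(THEORETICAL["acceptor"], THEORETICAL["end_product"])
observed_added = residues_added(OBSERVED["acceptor"], OBSERVED["end_product"])

# Offset between observed and theoretical values for each species.
offsets = {name: OBSERVED[name] - THEORETICAL[name] for name in THEORETICAL}
```

The observed values are uniformly about 1.3 daltons below the theoretical ones, so the spacing between species, which is what identifies the added residues, is preserved.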
The nature of the terminal N-acetylgalactosamine linkage to GnGn-PA was initially examined by digesting the products of the transferase reaction with β-N-acetylhexosaminidase prior to HPLC analysis. The results showed that β-N-acetylhexosaminidase treatment (Fig. 10C) eliminated both of the reaction products observed in the buffer control (Fig. 10B) and produced a single peak that comigrated with the trimannosyl core standard, Man3GlcNAc2-PA (Fig. 10A). Thus, treatment with this exoglycosidase removed the terminal N-acetylgalactosamine and penultimate N-acetylglucosamine residues from the reaction products, indicating that the T. ni enzyme transferred terminal N-acetylgalactosamine residues to GnGn-PA in β-linkages. This conclusion was directly confirmed and extended by 500-MHz ¹H NMR analysis. The NMR spectrum indicated that the sample contains a bi-antennary N-type oligosaccharide ending in PA-derivatized N,N′-diacetylchitobiose (Fig. 11). The bi-antennary nature of this glycan was evident [...]

Together, the results of the HPLC, MALDI-TOF, and NMR analyses clearly demonstrated that the T. ni cDNA isolated in this study encodes a β4-galactosyltransferase family member, which we may now designate as a T. ni β4-N-acetylgalactosaminyltransferase (Tnβ4GalNAcT) to reflect the fact that it preferentially utilizes UDP-N-acetylgalactosamine as a donor substrate. The ability of this enzyme to transfer N-acetylgalactosamine to both glycoprotein and synthetic N-glycan acceptors in vitro suggested that it might be involved in N-glycoprotein biosynthesis in vivo. Thus, the final part of this study was designed to examine this possibility.
In Vivo Function of Tnβ4GalNAcT-The postulated in vivo role of Tnβ4GalNAcT in N-glycoprotein biosynthesis was examined by co-expressing the full-length, untagged enzyme together with a secreted N-glycoprotein, GST-SfManI, which we have used as a reporter in several previous insect N-glycan processing studies (10,72,85,86). Sf9 cells were co-infected with a conventional baculovirus encoding GST-SfManI plus an immediate early recombinant baculovirus encoding either Tnβ4GalNAcT or bovine β4GalT-I as a control, and then the model glycoprotein was affinity-purified from each infected culture and analyzed by lectin blotting, as described under "Experimental Procedures." These experiments were specifically designed to include a temporal difference in the expression of the modifying enzymes and the reporter glycoprotein. The β4-galactosyltransferase family members had to be expressed early in infection so each would be available intracellularly prior to expression of the reporter, which occurred later, under polyhedrin control. The results of these analyses clearly showed that the GST-SfManI produced by cells co-infected with AcP(+)IETnβ4GalNAcT reacted with WFA, a lectin specific for β4-linked N-acetylgalactosamine residues (Fig. 12B), but not with RCA, a lectin specific for β4-linked galactose residues (Fig. 12C). The opposite results were obtained in lectin blots of the GST-SfManI produced by cells co-infected with the control virus, AcP(+)IEβ4GalT. Standard immunoblotting analysis with polyclonal anti-GST showed that approximately equal amounts of GST-SfManI were loaded in each lane (Fig. 12A). The specificity of the lectin blotting assays was verified by pre-digesting the GST-SfManI with PNGase-F (Fig. 12, lanes marked +), which precluded any detectable lectin binding. We also found that GST-SfManII produced by cells co-infected with AcP(+)IETnβ4GalNAcT failed to bind to GS-II, a lectin specific for terminal N-acetylglucosamine residues (data not shown).
In contrast, this lectin bound to a control GST-SfManII preparation isolated from cells co-infected with AcP(+)IEHGnTI, which expresses human N-acetylglucosaminyltransferase I (87). Thus, these results demonstrated that Tnβ4GalNAcT can participate in N-glycoprotein biosynthesis in vivo and that it transfers N-acetylgalactosamine, but no detectable galactose or N-acetylglucosamine, to the reporter N-glycoprotein used in this study.

Figure legend: Sf9 cells were infected with either a wild-type baculovirus or AcP(+)IETnβ4GalNAcT, extracts were prepared, and normalized amounts were assayed for glycosyltransferase activity toward pNP-GlcNAc with either UDP-galactose, UDP-N-acetylglucosamine, or UDP-N-acetylgalactosamine as the donor substrates, as described under "Experimental Procedures." The graph shows the average transferase activities measured in triplicate assays with wild-type background subtracted and standard deviations indicated by the error bars.
Intracellular Distribution of Tnβ4GalNAcT-If Tnβ4GalNAcT participates in the elongation of N-glycoprotein glycans, it would be expected to perform this function in the Golgi apparatus. Thus, the intracellular distribution of Tnβ4GalNAcT was examined in living cell co-localization experiments. The full-length Tnβ4GalNAcT coding sequence was positioned in-frame with a downstream GFP coding sequence and the resulting construct was used to produce an immediate early recombinant baculovirus. This type of baculovirus vector provides relatively low expression levels, which avoids potential localization artifacts, and allows the recombinant protein to be visualized earlier in infection, prior to the onset of adverse effects of baculovirus infection on host secretory pathway function (88). Sf9 cells were infected for 24 h with AcP(+)IETnβ4GalNAcT-GFP, counterstained with BODIPY®TR ceramide, and examined by confocal laser scanning microscopy, as described under "Experimental Procedures." The results showed that the fusion protein had a punctate distribution throughout the cytoplasm in these unfixed, living cells (Fig. 13A). This pattern overlapped almost perfectly with the staining pattern obtained with BODIPY®TR ceramide (Fig. 13C, merged with GFP pattern in Fig. 13B). BODIPY®TR ceramide is a Golgi-specific dye that preferentially stains the trans-Golgi. In previous studies, we have shown that BODIPY®TR ceramide co-localizes with several different GFP-tagged insect and mammalian N-glycan processing enzymes, including a classic Golgi marker, bovine β4-galactosyltransferase I (11,12,89). Thus, the results shown in Fig. 13 indicated that the Tnβ4GalNAcT-GFP protein is mainly localized in the Golgi apparatus of these baculovirus-infected insect cells. This suggests that the native protein is predominantly localized in the insect cell Golgi apparatus, as well, though it is formally possible that addition of the GFP tag could have altered its intracellular distribution.
DISCUSSION

Generally, the N-glycan structures of insect glycoproteins suggest that insect protein N-glycosylation pathways are truncated versions of the mammalian pathway, with all of the N-glycan trimming reactions, but few of the elongation reactions (reviewed in Refs. 1, 2, 25-27). However, there is growing evidence that this is an overly simplistic view, as insect cells apparently can process N-glycans more extensively than has been previously recognized. Thus, we are beginning to appreciate that at least some insect cells, under some conditions, have the potential to produce elongated N-glycans with hybrid or complex structures. We believe the N-glycan processing potential of insect cells will be fully appreciated only after insect glycosyltransferase genes have been molecularly cloned and the gene products functionally analyzed. Whereas some published studies have utilized this general approach, most have focused on the processing exoglycosidases (8-17), the core fucosyltransferase (22), and the early acting glycosyltransferase, N-acetylglucosaminyltransferase I (45). The only published study to date that describes an insect terminal glycosyltransferase gene together with a supporting functional analysis of the gene product is a recent publication on a D. melanogaster sialyltransferase (46). This study provided exciting new evidence that insect cells have the potential to produce complex N-glycans. However, it did not demonstrate that this enzyme participates in N-glycoprotein biosynthesis in vivo. Thus, the present study was distinguished by the fact that it showed that insect cells encode and express a terminal glycosyltransferase, specifically a β4-galactosyltransferase family member, and also provided evidence that this enzyme can function in N-glycoprotein biosynthesis in vivo. These results represented a major step forward in our ongoing effort to obtain unequivocal proof that insect cells have the potential to produce complex N-glycans.
The cabbage looper, T. ni, was chosen as the model insect system for this project because a previous study had shown that an established T. ni cell line has low levels of galactosyltransferase and N-acetylgalactosaminyltransferase activities (30). In addition, several studies had shown that T. ni cells could produce recombinant glycoproteins with terminally galactosylated N-glycans when infected with baculovirus expression vectors (32,35,38,39). Thus, it was theoretically possible to isolate a β4-galactosyltransferase family member from this organism.
PCRs with degenerate primers corresponding to conserved regions of β4-galactosyltransferase family members yielded a related DNA fragment, which ultimately led to the isolation of a full-length cDNA from a T. ni library. Bioinformatic analysis of the derived amino acid sequence suggested that this cDNA encoded a membrane glycoprotein with type II topology, which is characteristic of all known Golgi glycosyltransferases (90). Furthermore, the predicted protein had significant similarity to known and predicted β4-galactosyltransferase family members. The closest relatives of the derived T. ni protein were putative β4-galactosyltransferase family members from mosquito and fly. The closest relative of known function was a β4-N-acetylgalactosaminyltransferase from C. elegans (48). Thus, it appeared that we had molecularly cloned a β4-galactosyltransferase family member from T. ni, but it appeared to encode a β4-N-acetylgalactosaminyltransferase rather than a β4-galactosyltransferase.
A baculovirus vector was used to express the full-length T. ni cDNA product for biochemical assays of crude enzyme preparations. The results showed that this form of the enzyme could transfer galactose, N-acetylglucosamine, and N-acetylgalactosamine from their respective nucleotide sugar donors to the artificial acceptor substrate, pNP-GlcNAc, in vitro, but had a clear preference for N-acetylgalactosamine. These results were confirmed by additional in vitro assays of a purified affinity-tagged soluble domain of the cDNA product. This form of the enzyme transferred galactose, N-acetylglucosamine, and N-acetylgalactosamine to pNP-GlcNAc in vitro. It also transferred all three sugars to glycoprotein and glycolipid acceptors in vitro, with a strong preference for N-acetylgalactosamine. These results indicated that the T. ni cDNA encodes a β4-N-acetylgalactosaminyltransferase with an unusually broad substrate specificity that could potentially function in N-glycoprotein biosynthesis, glycolipid biosynthesis, or both. Because our research interests focus on insect protein glycosylation pathways, the remainder of the present study focused on the potential role of this gene product in N-glycoprotein biosynthesis. However, the broader significance of the results obtained with the glycolipid acceptor is worthy of further discussion. Recent studies have shown that the acquisition of resistance to Cry5B, an invertebrate-specific toxin produced by Bacillus thuringiensis, results from mutations in four C. elegans genes designated bre-2, -3, -4, and -5 (91,92). Each of these genes encodes a known or putative glycosyltransferase. Moreover, bre-4 specifically encodes the C. elegans N-acetylgalactosaminyltransferase that is most closely related to the T. ni enzyme identified in this study. 
It has been proposed that the enzymes encoded by bre-3, -5, and -4 catalyze a stepwise series of glycosyltransferase reactions culminating with the production of a glycosphingolipid (GalNAcβ4GlcNAcβ3Manβ4Glc-ceramide) required for Cry5B toxicity. The bre-4 product has been implicated in N-glycoprotein (48), but not glycolipid, biosynthesis. Therefore, our finding that the T. ni enzyme can transfer N-acetylgalactosamine to both glycoprotein and glycolipid acceptors in vitro provides at least some evidence to support the idea that the C. elegans enzyme might participate in both glycoprotein and glycolipid biosynthesis, as well.
Focusing on its potential role in N-glycoprotein biosynthesis, we found that the T. ni enzyme could transfer N-acetylgalactosamine, but no detectable N-acetylglucosamine or galactose, to a synthetic biantennary N-glycan in vitro. The enzyme produced one or both mono-N-acetylgalactosaminylated intermediates and a di-N-acetylgalactosaminylated product, as observed by HPLC. The masses of both the intermediate(s) and the product were verified by MALDI-TOF and the terminal N-acetylgalactosamine residues were shown to be β1,4-linked by exoglycosidase digestions and NMR analysis. These results conclusively demonstrated that the T. ni cDNA encodes a β4-N-acetylgalactosaminyltransferase, designated Tnβ4GalNAcT, and strongly supported the idea that it functions in insect N-glycoprotein biosynthesis.
This tentative conclusion was further supported by in vivo experiments in which Tnβ4GalNAcT was co-expressed with a reporter N-glycoprotein in baculovirus-infected insect cells. The results showed that Tnβ4GalNAcT transferred N-acetylgalactosamine, but no detectable galactose or N-acetylglucosamine, to the reporter. Finally, a Tnβ4GalNAcT-GFP fusion protein was localized predominantly in the Golgi apparatus of baculovirus-infected insect cells. Thus, all of the in silico, in vitro, and in vivo data obtained in this study were consistent with the conclusion that the new T. ni cDNA encodes an enzymatically active β4-N-acetylgalactosaminyltransferase which functions in N-glycoprotein biosynthesis. More specifically, Tnβ4GalNAcT yields the N-glycan GalNAcβ4GlcNAc-R, also known as LacDiNAc or LDN. This disaccharide is an increasingly well-recognized alternative to the more common outer chain structure, Galβ4GlcNAc-R, also known as LacNAc, or LN, which can be found in both vertebrate and invertebrate glycoproteins (49). Among the vertebrates, LDN is a common terminal disaccharide of pituitary hormone N-glycans, in which the N-acetylgalactosamine residues are 4-O-sulfated and function as recognition markers for hormonal clearance (93,94). Among the invertebrates, LDN is a common antigenic determinant of parasitic worms (reviewed in Ref. 95) and also has been documented in native T. ni glycoproteins (43).

FIG. 11. NMR analysis of the N-glycan reaction product. A preparative N-acetylgalactosaminyltransferase reaction was performed, and the major N-glycan product, which eluted around 21.35 min in Fig. 8D, was recovered from an HPLC run. The glycan was then desalted, exchanged with D2O, and analyzed by 500-MHz ¹H NMR spectroscopy, as described under "Experimental Procedures." A, ¹H NMR spectrum of the glycan product (500 MHz, 27°C, D2O). B, resolution-enhanced H-1 and NAc regions. Italicized numbers in the spectrum refer to the corresponding residues in the structure.
The presence of a β4-N-acetylgalactosaminyltransferase gene in T. ni cells and the antigenicity of worm LDN residues raise important questions about the potential antigenicity of recombinant glycoproteins produced in these insect cells. To date, we are unaware of any structural evidence that LacDiNAc has been found in the N-glycans of any recombinant glycoprotein produced in baculovirus-infected T. ni cells or larvae. In fact, considering that the expression of most or all host cell genes is repressed during baculovirus infection (96), and that β4-N-acetylgalactosaminyltransferase activity is dramatically reduced during baculovirus infection (30), it seems rather unlikely that LacDiNAc will occur in recombinant N-glycoproteins produced using the baculovirus expression system.
It is interesting to consider in retrospect that we actually intended to isolate an insect β4-galactosyltransferase cDNA in this study. It appears that we did not isolate this cDNA, suggesting that T. ni encodes another β4-galactosyltransferase family member, which is responsible for the galactosyltransferase activity observed in cell lysates (30) and the terminal galactose residues observed on some N-glycans produced in these cells (32,38,39). Alternatively, it is possible that the cDNA isolated in this study encodes a bifunctional enzyme that has both β4-N-acetylgalactosaminyltransferase and β4-galactosyltransferase activities in vivo. In fact, we detected some galactosyltransferase activity in lysates from insect cells expressing this cDNA product, but it was much lower than the N-acetylgalactosaminyltransferase activity detected in these same lysates. van Die et al. (30) observed the same quantitative relationship between these two endogenous activities in T. ni cell lysates. Thus, it is possible that the T. ni cDNA product can also function as a β4-galactosyltransferase in vivo, but only at levels below the sensitivity of our lectin blotting assays. It will be interesting to address this possibility in future studies. However, irrespective of whether or not the cDNA isolated in this study encodes a bifunctional enzyme, its isolation, together with functional characterization of the gene product, is significant because the results provide unequivocal, molecular genetic evidence that insect cells have the potential to produce complex N-glycans. The presence or absence of this capability has been a controversial subject in insect glycobiology for the past decade, and the old idea that insect cells cannot produce complex N-glycans has begun to crumble under the growing weight of evidence to the contrary. 
The present study, together with the recently published study on a fly sialyltransferase gene coupled with functional characterization of the gene product, should contribute to the demise of this old idea.