The Caenorhabditis elegans gene, gly-2, can rescue the N-acetylglucosaminyltransferase V mutation of Lec4 cells.

UDP-N-acetylglucosamine:alpha-6-d-mannoside beta-1,6-N-acetylglucosaminyltransferase V (GlcNAc-TV) is a regulator of polylactosamine-containing N-glycans and is causally involved in T cell regulation and tumor metastasis. The Caenorhabditis elegans genome contains a single orthologous gene, gly-2, that is transcribed and encodes a 669-residue type II membrane protein that is 36.7% identical to mammalian GlcNAc-TV (Mgat-5). Recombinant GLY-2 possessed GlcNAc-TV activity when assayed in vitro, and protein truncations demonstrated that the N-terminal boundary of the catalytic domain is Ile-138. gly-2 complemented the Phaseolus vulgaris leucoagglutinin binding defect of Chinese hamster ovary Lec4 cells, whereas GLY-2(L116R), an equivalent mutation to that which causes the Lec4A phenotype, could not. We conclude that the worm gene is functionally interchangeable with the mammalian form. GlcNAc-TV activity was detected in wild-type animals but not those homozygous for a deletion allele of gly-2. Activity was restored in mutant animals by an extrachromosomal array that encompassed the gly-2 gene. Green fluorescent protein reporter transgenes driven by the gly-2 promoter were expressed by developing embryos from the late comma stage onward, present in a complex subset of neurons in larvae and, in addition, the spermathecal and pharyngeal-intestinal valves and certain vulval cells of adults. However, no overt phenotypes were observed in animals homozygous for deletion alleles of gly-2.

quence-unrelated GlcNAc-T enzymes that create branches in complex-type N-glycans (1). These branches can be further elongated by galactosyltransferase and other enzymes to create the mature glycoprotein oligosaccharides. The GlcNAcβ1,6 branch resulting from GlcNAc-TV action is distinct in that it is the preferred site for elongation with polylactosamine chains, repeating lactosamine units that themselves can be further branched and carry a variety of terminal structures. GlcNAc-TV is thus a potential regulator of polylactosamine-containing N-glycan chains on target glycoproteins. GlcNAc-TV is also distinct from the other N-acetylglucosaminyltransferases in that it has a specific temporal and spatial expression pattern in the developing mouse embryo. Expression is concentrated in neuronal tissues, specialized epithelium, and regions with stem cell-like populations. Zygotic expression increases at about 9.5 days post coitus, which coincides with the onset of organogenesis (2).
Mice deficient in GlcNAc-TV activity through mutation of the Mgat-5 locus are viable but develop glomerulonephritis with age, which is associated with T cell hypersensitivity, apparently as a result of altered activation kinetics of the T cell receptor complex (3). When the Mgat-5 o allele is combined with a mouse mammary tumor virus-promoted Polyomavirus middle T antigen transgene, multifocal tumorigenesis is delayed, and metastasis caused by the Polyomavirus middle T antigen is dramatically suppressed (4). This result is consistent with prior observations that tumor cell lines selected by resistance to the cytotoxic lectin Phaseolus vulgaris leucoagglutinin (L-PHA) deficient in GlcNAc-TV also failed to metastasize in syngeneic mice (5).
Although the Mgat-5 o mouse is highly informative, systematic analysis of a complex viable phenotype remains difficult, particularly the identification of the dependent molecules and pathways. We therefore sought a simpler model organism in which synthetic genetics could be carried out rapidly to characterize the complex pleiotropic phenotypes expected from disruption of the glycosylation machinery. Because of the cellular non-autonomy typical of glycosylation phenotypes and of the phylogenetic restriction of complex-type N-glycans to metazoans, a whole animal model is necessary. Caenorhabditis elegans is the simplest and most highly characterized animal, its adult anatomy and developmental lineage have been completely determined (6,7), and its genome is essentially completely sequenced (8). C. elegans is highly tractable to experimental phenotypic and genetic analysis, and there are numerous examples demonstrating that genetic pathways found in mammals are also conserved in this nematode (9 -13).
Surveys of the C. elegans genome sequence revealed a coding potential for most known glycosyltransferase genes (14). Genes encoding active polypeptide GalNAc-transferases (15), GlcNAc-TI (16), and a fucosyltransferase (17) have been characterized. In addition, there are at least three sqv genes that are elements of a proteoglycan glycosylation pathway that when mutated cause severe and pleiotropic defects (18-21). A recent NMR-mass spectrometry study identified the abundant N- and O-glycans in C. elegans (22). The canonical oligomannose series of N-glycans were observed, but atypical O-glycans were found where polypeptide-linked GalNAc was β1-6-branched as in mammals but substituted with glucose or galactose rather than GlcNAc. We characterized the 6 homologues of core 2 GlcNAc-T (23) and demonstrated that gly-1 transfers glucose from UDP-glucose to core 1 acceptor, consistent with the inference based on the structural analysis (24).
We observed that the C. elegans genome encodes a single gene, designated gly-2, which is homologous to mammalian GlcNAc-TV sequences. In this paper, we establish that the nematode orthologue is functionally equivalent to that from mammals and that C. elegans is an appropriate model in which to pursue investigations of the contributions to fitness made by β6-GlcNAc-branched N-glycans.
Molecular Biology Procedures-Unless otherwise noted, standard molecular biology techniques were employed (27).
5′ RACE-Poly(A)+ RNA was isolated from mixed populations of C. elegans using a QuickPrep Micro mRNA purification kit (Pharmacia). The 5′ RACE system (Invitrogen) was used according to the manufacturer's instructions. First strand cDNA synthesis was primed with yk5′rc0. First round PCR using AmpliTaq Gold (PerkinElmer Life Sciences) was primed with yk5′rc1. The second round PCR used Pfu DNA polymerase (CLONTECH) and yk5′rc2 as the gene-specific primer. Amplimer was sequenced directly and subcloned into the EcoRV site of pZErO-2 (Invitrogen). Independent recombinants were analyzed by colony PCR using SL1, SL2, or RACE anchor and yk5′rc2 primers.
Northern Analysis-Non-starved mixed stage animals from Bristol N2 and him-5(e1490) strains were used to prepare poly(A)+ RNA using a Dynabeads kit (Dynal A. S.) after disruption in a Polytron (Kinematica). ~1 μg of mRNA was fractionated, blotted, probed with the α-³²P-labeled SalI/SmaI fragment of yk126h8, and analyzed with a PhosphorImager (Storm/ImageQuant, Molecular Dynamics).
Construction of Mammalian Expression Vectors-pISTH1 was constructed from pIMKF1 (15) by replacing the NdeI-BamHI segment upstream of the cloning site with an NdeI-BglII fragment from pCITE4b(+) (Novagen). N-terminal truncations of GLY-2 were generated by PCR from yk126h8 as template using Pfu DNA polymerase primed by yk* 670 r and one of ykI 28 f, gly2-Δ133, gly2-Δ137, or gly2-Δ138. Amplimers were subcloned into the EcoRV site of pZErO-2 (Invitrogen) and sequenced. BamHI fragments of error-free subclones were ligated into the BamHI site of pISTH1. Ligation junctions, frame, and orientation were checked by DNA sequencing. A yk5′rc2 and SL1-primed Taq DNA polymerase PCR product of the RACE amplimer was subcloned into EcoRV-cut and T-tailed pGEM5zf(+) (Promega), forming pYS. pCDNA3::yk126h8(+) was created by subcloning the PvuII-SmaI fragment of yk126h8 into EcoRV-cut pCDNA3 (Invitrogen). An expression construct for mature SL1 trans-spliced cDNA (pCSYK-1) was constructed by combining the SpeI-NarI fragment of pYS with the NarI-NotI fragment of pCDNA3::yk126h8(+) in SpeI-NotI-cut pZErO-2, the BamHI-NotI fragment of which was subcloned into pCDNA3. The amplified region and ligation junctions were checked by DNA sequencing. The GLY-2(L116R) mutation was introduced into pCSYK-1 by mutagenesis directed by primer GLY2-L116R using the Chameleon kit according to the manufacturer's instructions (Stratagene). The complete transcriptional unit of the resulting construct, pCSYK-L116R, was sequenced. pEGFP-GLY2 was constructed by subcloning the ykR 2 fyk* 670 r product generated by PCR amplification with Pfu DNA polymerase from yk126h8 template into the BamHI site of pEGFP-C3 (CLONTECH). The insert and ligation junctions were completely sequenced and found to be in-frame and error-free. pEGFP-L116R was derived by replacing the BstXI-EcoRV fragment with the equivalent section of pCSYK-L116R. The introduced segment was confirmed by sequencing.
Transient Expression and Secretion of GLY-2 in Lec4 Cells-3 × 10⁵ Lec4 cells (ATCC) were plated in each well of 6-well tissue culture clusters (Costar). The following morning, 1 μg of DNA (QIAgen) of pISTH1-based truncation constructs was transfected at 37°C in a humidified 5% CO₂ atmosphere for 5-6 h using 8 μl of LipofectAMINE (Invitrogen) in 1 ml of OptiMEM-I (Invitrogen)/well. One ml of α-minimal essential medium containing 20% fetal bovine serum was added to the wells, and the clusters were transferred to a humidified 5% CO₂ atmosphere at 30°C overnight. The following day, well contents were aspirated and replaced with 2 ml of α-minimal essential medium containing 10% fetal bovine serum, and incubation was continued until 78 h post-transfection. Conditioned medium was clarified by centrifugation at 1800 × g for 10 min and stored at 4°C after the addition of sodium azide to 0.05% w/v.
Immunopurification of Recombinant Proteins-Recombinant proteins directed by pISTH1-based plasmids bear an N-terminal S-tag that was assayed according to the manufacturer's instructions in conditioned media from the transient transfections (Novagen). 1.25 pmol of recombinant protein in conditioned medium was immunoprecipitated and diluted into 1 ml of dilution buffer (10 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.025% w/v NaN₃, 0.1% v/v Triton X-100, 0.1% w/v bovine serum albumin). Aliquots were preadsorbed with 35 μl of a 50% v/v slurry of goat anti-rabbit IgG polyclonal antibody-agarose (Sigma) overnight at 4°C then centrifuged briefly before supernatants were transferred to fresh tubes. 1 μg of rabbit polyclonal anti-S-tag antibody and a fresh aliquot of anti-rabbit IgG polyclonal antibody-agarose were added for 2 h at 4°C before the beads were pelleted by centrifugation at 3000 rpm for 10 s in a microcentrifuge. Beads were washed 3 times with 1-ml aliquots of dilution buffer before 3 more washes with 100 mM MES, pH 6.5, 0.1% v/v Triton X-100, 100 μg/ml bovine serum albumin. A final aspiration of supernatant left the beads as a 50% slurry in a total volume of 35 μl, which was used for assay of GlcNAc-TV enzyme activity.
Assay of GlcNAc-TV Enzyme Activity-Enzyme activity was measured using synthetic specific acceptors (28). Assays contained 1 mM βGlcNAc(1,2)αMan(1,6)βGlc-O(CH₂)₇CH₃ acceptor, 1 mM [6-³H]UDP-GlcNAc (44,400 dpm/nmol) in 50 mM MES, pH 6.5, in total volumes between 30 and 100 μl. Enzyme sources were nematode microsomal membranes, cell lysates, conditioned media (either directly or as dialysates against 10 mM MES, pH 6.0), or immunoprecipitates. Assays using microsomal membranes contained 2 mM each of acceptor and donor, the latter at 10⁵ dpm/nmol. In addition these samples contained 5 mM adenosine 5′-monophosphate and 500 μM 2-acetamido-1,2-dideoxynojirimycin (Toronto Research Chemicals). After 3 h at the appropriate incubation temperature, 1 ml of ice-cold water was added to stop further reaction, and assays were either frozen or processed immediately. Enzyme products were separated from radioactive substrates by binding them to 50-mg C₁₈ cartridges (Alltech) preconditioned with methanol rinsing and water washing. Reactions were loaded, and the columns were washed 5 times with 1 ml of water. Radiolabeled products were eluted directly into scintillation vials with 2 separately applied 0.5-ml aliquots of methanol, and the radioactivity was determined by liquid scintillation counting.
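The conversion from scintillation counts to enzyme activity implied by the assay above can be sketched as follows. This is a minimal illustration, not the authors' calculation: the blank subtraction (counts from a control reaction) is an assumption, the function names are hypothetical, and the count values are invented purely for the arithmetic. Only the specific radioactivities (44,400 and 10⁵ dpm/nmol), the 3-h incubation, and the 386 μg of microsomal protein come from the text.

```python
def product_nmol(sample_dpm, blank_dpm, dpm_per_nmol):
    """Convert background-corrected counts (dpm) to nmol of GlcNAc transferred.

    blank_dpm is an assumed control value; the source does not describe
    how background was handled.
    """
    if dpm_per_nmol <= 0:
        raise ValueError("specific radioactivity must be positive")
    net_dpm = max(sample_dpm - blank_dpm, 0.0)
    return net_dpm / dpm_per_nmol


def specific_activity(sample_dpm, blank_dpm, dpm_per_nmol, hours, protein_mg):
    """Return activity in nmol of product per hour per mg of protein."""
    if hours <= 0 or protein_mg <= 0:
        raise ValueError("incubation time and protein mass must be positive")
    return product_nmol(sample_dpm, blank_dpm, dpm_per_nmol) / (hours * protein_mg)


# Hypothetical counts for a standard assay (donor at 44,400 dpm/nmol).
nmol_example = product_nmol(4940.0, 500.0, 44_400.0)

# Hypothetical counts for a microsomal assay (donor at 1e5 dpm/nmol,
# 3 h incubation, 0.386 mg of protein).
rate_example = specific_activity(10_500.0, 500.0, 100_000.0,
                                 hours=3.0, protein_mg=0.386)

print(f"{nmol_example:.3f} nmol product")
print(f"{rate_example:.4f} nmol/h/mg")
```

Keeping the specific radioactivity as an explicit argument matters here because the two assay formats use donors of different specific activity, so raw counts are not directly comparable between them.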
Fluorescence Analysis of Lec4 Cells Transfected with gly-2-Transient transfections were performed essentially as above except that Lec4 or CHO-K1 cells were plated at 10⁶ cells on 6-cm tissue culture dishes (Falcon) that were cotransfected with 0.5 μg of pCMVCD20 (29) and 2.5 μg of pLec4, pCHO-K1, pCSYK-1, or pCSYK-L116R DNA using 18 μl of LipofectAMINE in 3 ml of OptiMEM-I. At 71 h post-transfection, the media were aspirated, and the plates were rinsed with ice-cold PBS followed by ice-cold PBS, 0.1% w/v EDTA (PBSE). Cells were dissociated from the dish by incubation in 0.5 ml of PBSE for 10 min at room temperature before triturating with 4.5 ml of PBS, 1% v/v fetal bovine serum, 0.1% w/v NaN₃ (PBSFN). Aliquots of 1.2 × 10⁶ transfected cells were transferred to 6-ml polypropylene tubes (Falcon) on ice, filled with PBSFN, and centrifuged at 500 × g for 5 min at 4°C, and the supernatants were decanted. FITC-conjugated L-PHA (Sigma) was preadsorbed against Lec4 cells by incubating 40 μg of FITC-L-PHA with 4 × 10⁷ untransfected Lec4 cells (harvested using PBSE) in a total volume of 800 μl of PBSFN for 15 min on ice, then clarified. Lec4-absorbed FITC-L-PHA (0.5 μg) and 10 μl of phycoerythrin-conjugated monoclonal anti-CD20 (BD PharMingen) were added to each sample, and the cells were resuspended. After a 30-min incubation on ice, tubes were filled with PBSFN and centrifuged at 500 × g for 5 min at 4°C, and the supernatants were decanted. Washes were repeated twice more before a final resuspension in 1 ml of PBSFN. FACS was carried out on a FACStar (BD PharMingen). Live single cells were selected based on forward and side scatter gates, and data acquisition and analysis used the CellQuest package. Transfected cells were gated based on the phycoerythrin anti-CD20 fluorescence, and the FITC L-PHA staining of at least 10⁴ transfected live single cells was measured for each sample.
3 μg of pEGFP-C3, pEGFP-GLY2, or pEGFP-L116R were transfected into Lec4 or CHO-K1 similarly. After harvesting, cells were stained with 2 μg of biotinylated L-PHA (Sigma), washed four times with PBSFN, then developed with 1 μg of streptavidin-CyChrome (BD PharMingen). After 3 washes with PBSFN, CyChrome staining of 2 × 10⁴ transfected GFP+ cells was measured for each sample. The remaining cells after analysis were immediately washed twice in PBS before cell pellets were flash-frozen and stored at -70°C. Cell pellets were lysed in 50 mM MES, pH 6.5, 0.5% v/v Triton X-100, 10 mM EDTA containing 1× Complete protease inhibitor mixture (Roche Molecular Biochemicals). After 5 min on ice, lysates were clarified at 14,000 rpm for 5 min in a microcentrifuge, and supernatants were transferred and assayed immediately for GlcNAc-TV activity.
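The two-color gating logic described above (select transfected cells by the cotransfection marker, then read L-PHA staining only on those cells) can be sketched in a few lines. This is an illustrative sketch and not the CellQuest workflow: the event dictionaries, field names, thresholds, and fluorescence values are all hypothetical, and real data would first pass the forward and side scatter gates.

```python
from statistics import median


def gate_transfected(events, marker_threshold):
    """Keep only events whose transfection-marker signal exceeds the gate."""
    return [e for e in events if e["marker"] > marker_threshold]


def lpha_summary(events, marker_threshold, lpha_threshold):
    """Summarize L-PHA staining within the transfected (marker-positive) gate."""
    gated = gate_transfected(events, marker_threshold)
    if not gated:
        raise ValueError("no events passed the transfection gate")
    lpha = [e["lpha"] for e in gated]
    n_positive = sum(1 for value in lpha if value > lpha_threshold)
    return {
        "n_gated": len(gated),
        "median_lpha": median(lpha),
        "fraction_positive": n_positive / len(gated),
    }


# Hypothetical events: "marker" stands for PE anti-CD20 (or GFP) intensity,
# "lpha" for FITC (or CyChrome) L-PHA intensity, in arbitrary units.
events = [
    {"marker": 850.0, "lpha": 420.0},
    {"marker": 12.0, "lpha": 15.0},
    {"marker": 640.0, "lpha": 35.0},
    {"marker": 910.0, "lpha": 510.0},
    {"marker": 8.0, "lpha": 600.0},
]

summary = lpha_summary(events, marker_threshold=100.0, lpha_threshold=100.0)
print(summary)
```

Gating on an independent marker first means that untransfected cells in the same dish, which remain L-PHA negative in Lec4, do not dilute the measured rescue.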
Western Analysis-Conditioned media from Lec4 cells that had been transiently transfected with pISTH1-based truncation constructs were subjected to SDS-PAGE, electroblotted to polyvinylidene difluoride (Waters), and blocked with TBS, 0.1% v/v Tween 20, 5% skimmed milk (TBSTM). Filters were washed with TBS, 0.1% v/v Tween 20 (TBST), then incubated with 0.5 μg/ml polyclonal rabbit anti-S-tag (CLONTECH) at 4°C overnight. The blot was washed again with TBST then developed with 1:12,500 horseradish peroxidase-conjugated donkey anti-rabbit Ig (Amersham Biosciences) in TBST for 2 h at room temperature before extensive washes with TBST then TBS and visualization of the signal by ECL (Amersham Biosciences), recorded using X-Omat Blue XB-1 film (Eastman Kodak Co.). Clarified lysates prepared from samples that had been subjected to FACS analysis and GlcNAc-TV assay were separated by electrophoresis in a MOPS buffer system on 4-12% BisTris NuPage gels (Novex) then electroblotted as above. After methanol washing and air-drying, filters were incubated with 1:5000 monoclonal anti-GFP (CLONTECH) in TBSTM for 30 min at room temperature. After 5 rinses in TBS, the blot was developed with 1:2000 horseradish peroxidase-conjugated sheep anti-mouse Ig (Amersham Biosciences) in TBSTM. After 5 rinses and a 15-min wash in TBS, chemiluminescence signals (Supersignal, Pierce) were recorded.
GFP Reporter Transgenes-The 7461-bp NsiI fragment of C55B7 was subcloned into the PstI site of pPD95.69 and pPD95.77. A partial NarI digest was performed, and the overhangs were blunted (Klenow). A SmaI digestion was used to excise the intervening fragment, and the construct was reclosed. The ligation junctions were found to be correct after DNA sequencing. This procedure created an in-frame fusion between the NarI site in codon 3 of GLY-2 and the GFP segment of the vector, preceded by 6.7 kbp of upstream genomic DNA corresponding to bases 19,280 to 25,991 of C55B7. CB1282 hermaphrodites were transformed by gonad injection (30) of a mixture of reporter construct and pMH#6, a plasmid containing a region of C. elegans genomic DNA capable of rescuing the dpy-20(e1282) mutation. Several non-Dpy F1 progeny were selected for each reporter construct, transgenic lines were established from them, and epifluorescence microscopy was performed using a Leica DMR photomicroscope. The inheritance of extra-chromosomal arrays is mosaic, and the fine structure of the array in each strain is different. Consequently, several individuals from each line were examined to compile consensus expression patterns. Cell identification was accomplished using the position and morphology of the expressing cells, the number and position of their nuclei, and by comparison to anatomical landmarks visualized by differential interference contrast microscopy.
Mutagenesis-The gly-2 alleles ev581 and qa700 were generated by Tc1-mediated mutagenesis with minor modifications (31). qa703 was isolated from ethylmethanesulfonate-induced deletion libraries using minor variances from published procedures (32). Tc1 mutagenesis relies on transposon mobilization, so the founding strain contains mut-2 alleles. Animals bearing qa700 were therefore crossed eight times with N2 before out-crossing with BC107 and recombination of the dpy-14 locus with gly-2 to break the chromosome between gly-2 and mut-2. This strain was then further out-crossed an additional 4 times with N2 to remove the dpy-14 allele and create strain XA728 gly-2(qa700**14) I. qa703-bearing animals were crossed three times with N2 then with DR435 to recombine the mutagenized chromosome with wild-type material either side of gly-2. dpy-5(e61), gly-2(qa703), and unc-13(e51) I animals were derived and subsequently crossed another 5 times with N2 to remove the markers and generate strain XA762 gly-2(qa703**10) I. Both alleles were mapped by recombination frequencies with dpy-5 and unc-13 using PCR to score for the presence of the qa700 or qa703 alleles. The deletion boundaries of the alleles were characterized by sequencing DNA that had been PCR-amplified from genomic DNA using primers that encompassed the deletions.
Genetic Mapping of gly-2-DR435 hermaphrodites were mated with XA728 males, and cross-progeny hermaphrodites were picked and allowed to segregate F2. Animals carrying chromosomes that had recombined between the dpy-5 and unc-13 loci were genotyped by single-worm PCR (26) using primer sets that specifically detected wild-type and deletion alleles to determine the frequency of recombination between gly-2 and both marker loci, dpy-5 and unc-13. Of 26 Dpy non-Unc chromosomes, 15 recombinations occurred in the dpy-5-gly-2 interval, and of 28 Unc non-Dpy chromosomes, 13 recombinations occurred between gly-2 and unc-13.
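The three-factor arithmetic behind these counts can be sketched as follows. In this kind of cross, the fraction of recombinant chromosomes whose crossover falls to the left of gly-2 estimates the position of gly-2 as a fraction of the dpy-5 to unc-13 interval. The interval length of 2.0 map units used below is an assumption made for illustration (it is not stated in the text), and the function name is hypothetical; only the recombinant counts come from the paragraph above.

```python
def position_from_left_marker(left_recombinants, total_recombinants, interval_mu):
    """Estimate a gene's distance (map units) from the left marker.

    The fraction of crossovers falling between the left marker and the gene
    is scaled by the genetic length of the whole marker interval.
    """
    if total_recombinants <= 0 or not 0 <= left_recombinants <= total_recombinants:
        raise ValueError("recombinant counts are inconsistent")
    return interval_mu * left_recombinants / total_recombinants


# Assumed dpy-5 to unc-13 genetic distance; NOT given in the source.
ASSUMED_INTERVAL_MU = 2.0

# Dpy non-Unc class: 15 of 26 crossovers lay in the dpy-5 to gly-2 interval.
pos_dpy_class = position_from_left_marker(15, 26, ASSUMED_INTERVAL_MU)

# Unc non-Dpy class: 13 of 28 crossovers lay between gly-2 and unc-13,
# so the remaining 28 - 13 = 15 lay between dpy-5 and gly-2.
pos_unc_class = position_from_left_marker(28 - 13, 28, ASSUMED_INTERVAL_MU)

print(f"Dpy non-Unc estimate: {pos_dpy_class:.2f} map units right of dpy-5")
print(f"Unc non-Dpy estimate: {pos_unc_class:.2f} map units right of dpy-5")
```

Under that assumed interval length the two recombinant classes give estimates of about 1.15 and 1.07 map units, which fall inside the 1.07 to 1.18 range reported in the Results; a different interval length would scale both values proportionally.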
Construction of Precomplementation Lines-The 13,806-bp XbaI fragment corresponding to bases 17,188-30,994 of cosmid C55B7 was subcloned into the XbaI site of pZErO-2 to create pResLng-9E, and the structure was verified by restriction digests. This genomic region encompassed all gly-2 sequences detected in transcripts as well as an additional 4248 bases upstream of the 5′ limit of yk126h8 and 1281 bases downstream of the site of polyadenylation. XA762 hermaphrodites were transformed by gonad injection (30) of a mixture of pResLng-9E and pRF4, a plasmid containing a region of C. elegans genomic DNA carrying the rol-6(su1006) mutation that acts dominantly by causing animals bearing the array to roll. Several independent rolling lines were established, and the percentage of rolling self-progeny from each was characterized. GlcNAc-TV activity was assayed using microsomal membranes prepared from 2 such lines, XA766 gly-2(qa703) I; qaEx743[gly-2(+), rol-6(su1006)] and XA768 gly-2(qa703) I; qaEx745[gly-2(+), rol-6(su1006)], both of which transmitted the array to 30-50% of their progeny.
Microsomal Membrane Preparation-Cultures were established by picking 50 rolling L4 hermaphrodites (or 20 animals from non-transgenic lines) to each of 5 100-mm diameter complete nematode growth medium plates that were then grown at room temperature until the animals cleared the Escherichia coli OP50 lawn. Nematodes were rinsed from the plates in cold 100 mM NaCl, washed twice, then floated on sucrose (60% w/v). After 2 washes with 100 mM NaCl, the pellet was snap-frozen in an ethanol-dry ice bath and stored at -70°C. Samples were thawed by adding 1 ml of TSEC (20 mM Tris-HCl, 250 mM sucrose, 1 mM EDTA, 1× Complete™; Roche Molecular Biochemicals) then sonicated on ice 5 times using a 10-s pulse before dilution with a further 3 ml of TSEC. After centrifugal clarification for 10 min at 3000 rpm at 4°C (Sorvall RT6000), the supernatant was ultracentrifuged at 55,000 rpm for 1 h at 4°C (Beckman L8-80M with a 70.1Ti rotor). The microsomal pellet was suspended in a minimal volume of 100 mM MES, pH 6.5, 2% v/v Triton X-100, 2× Complete™, 20 mM EDTA, the protein concentration was determined by BCA assay (Pierce) standardized with bovine serum albumin, then 386 μg of each preparation was subjected immediately to GlcNAc-TV assay.

RESULTS
The gly-2 Gene of C. elegans-TBLASTN queries of the GenBank™ dbEST data base using rat GlcNAc-TV polypeptide (GB:AAA41665) revealed two homologous C. elegans ESTs, cm20c4 and yk126h8 (33), which were obtained and sequenced (Fig. 1). A single reverse transcriptase-specific product after 5′ RACE was observed, and direct sequencing revealed a trans-spliced SL1 sequence attached to position -14, where a splice acceptor site occurs immediately upstream in the genomic sequence. Comparison of the genomic and yk126h8 sequences confirms an intron at this point. All 35 independent subclones of the RACE product that were tested for the presence of SL1 and SL2 sequences by colony PCR and 5 that were sequenced contained SL1. This transcript structure is concordant with the Northern analysis that indicated a single poly(A)+ RNA species of ~2.25 kb (Fig. 2). Comparison of the cDNA and genomic sequences indicates that the gene organization is typical, with 10 exons of 82-589 bp separated by 44-882 bp of introns (Fig. 3A) (34). Notably, the majority of the exon boundaries in human and C. elegans genes occur at equivalent residues, and in most cases, the phase is conserved too. We named the gene gly-2 as a member of the GLYcosylation class. BLAST searches using the cDNA or deduced polypeptide sequences revealed that the C. elegans genome contains a single homologous region, implying that gly-2 is the nematode orthologue of GlcNAc-TV.
The conceptual translation of the open reading frame encodes a 669-amino acid polypeptide that is 59.9% similar and 36.7% identical to rat GlcNAc-TV. When the sequence was queried against GenBank™ using BLAST, only mammalian GlcNAc-TV sequences were returned as significant hits. There are five potential N-linked glycosylation sites, but they are not conserved with the mammalian homologues. Hydropathy plots indicated that GLY-2 is a type II membrane protein with the secondary structural characteristics of Golgi glycosyltransferases (Fig. 4A). This plot reveals four distinct regions in GLY-2; a hydrophilic cytosolic tail precedes the putative TMD, whereas the lumenal part of the molecule consists of a consistently hydrophilic 112-residue stretch before an amphiphilic C-terminal portion. Consistent with this model, alignments between GLY-2 and mammalian homologues showed increased conservation in the C-terminal portion of the molecule (Fig. 4B). A conserved peptide (C110-P124) lies in the otherwise diverged stem that encompasses a conserved leucine residue equivalent to that mutated in the GlcNAc-TV gene of Lec4A cells (Fig. 3B). Based on the hydropathy and similarity profiles, we postulated that the N-terminal limit of the catalytic domain is the boundary between exon 3 and 4, the first junction after the C110-P124 peptide. This is the equivalent region to that observed to be essential for catalytic activity in rat GlcNAc-TV (36). Constructs directing the secretion into the medium of soluble, truncated versions of the protein (structures indicated in Fig. 3B) were transfected into Lec4 cells, a CHO-K1 derivative lacking endogenous GlcNAc-TV activity. Transfections were incubated at 30°C to reduce the anticipated denaturation of GLY-2, which as a C. elegans enzyme is adapted for growth at 20°C.
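A hydropathy profile of the kind used above to identify the transmembrane domain can be computed with a sliding-window average. The sketch below assumes the Kyte-Doolittle scale and a 19-residue window, a common choice for detecting membrane-spanning segments; the text does not state which scale or window was actually used. The sequence is a made-up toy peptide, not GLY-2.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the source).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(sequence) - window + 1)
    ]


# Toy type II layout: short charged tail, 19-residue hydrophobic segment,
# then a hydrophilic stretch. Entirely hypothetical.
toy_sequence = "MKRRDEK" + "LIVLAILLVLGAVLLIVFA" + "DKRSEENQDK"

profile = hydropathy_profile(toy_sequence, window=19)
peak_start = profile.index(max(profile))  # 0-based start of the best window

print(f"peak window starts at residue {peak_start + 1}, "
      f"mean hydropathy {max(profile):.2f}")
```

In the toy example the highest-scoring window coincides with the hydrophobic segment placed after the charged tail, which is the pattern expected for a single N-terminal signal-anchor in a type II membrane protein.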
The resulting conditioned medium contained soluble fusion protein at ~1 μg/ml, and GlcNAc-TV activity was detected from transfections with pISTH1-GLY2 series plasmids but not from vector-only controls. The nematode enzyme is markedly inhibited by NaCl above 50 mM (Fig. 5A). This is analogous to the suppression of rat GlcNAc-TV by NaCl above physiological levels (37). The pH optimum of GLY-2 is around pH 6.5 (Fig. 5B), typical of most Golgi glycosyltransferases, and is the ambient pH of the Golgi apparatus (38). As expected GLY-2 is progressively thermolabile, and no differences were apparent among truncation variants (Fig. 5C). As with other β6-N-acetylglucosaminyltransferases, GLY-2 was active in the presence of EDTA, and Mn²⁺ addition did not stimulate the reaction (data not shown).
Conditioned media from the truncation series containing equivalent amounts of S-tag fusion protein were assayed directly (Fig. 5D). The inferred initiator methionine and transmembrane domain are confirmed by the detection of soluble enzyme from the construct that lacked the first 27 deduced residues. Deletion of more than 137 residues severely impaired the specific activity. Since all truncation variants were equally thermolabile, the most plausible reason is that the catalytic domain boundary resides at Ile-138. To confirm this and demonstrate that GlcNAc-TV activity was an intrinsic property of the recombinant polypeptide, the fusion protein was immunoprecipitated from the conditioned medium using anti-S-tag antibody. These assays were performed with equivalent amounts of S-tagged fusion protein, allowing direct comparisons between the various truncated forms (Fig. 5E). A band at the expected size (~81 kDa) was observed when the immunoprecipitate of GLY-2Δ27 was Western-blotted for S-tag. In the other truncations an unavoidable background band masked the region at the expected size range (~60 kDa) (Fig. 5F). As with conditioned medium, deleting the first 137 residues of GLY-2, a region comprising the initiator methionine, the TMD, and the predicted stem region, including the C110-P124 peptide, had little effect on specific activity. Removing a single additional residue reduced activity by 75%. Therefore, the boundary of the catalytic domain does indeed correspond to the 5′ limit of the exon initiated by Ile-138.
gly-2 Can Rescue the Cell Surface Phenotype of Chinese Hamster Ovary Lec4 Cells-The complementation of a genetic defect by a heterologous allele is a stringent test of equivalence since all the salient properties of the endogenous gene must be fulfilled by the introduced allele in the physiological environment. Lec4 mutant cells lack GlcNAc-TV activity and the mature glycan products, GlcNAcβ1,6 branched N-linked oligosaccharides on cell surface glycoproteins, which can be specifically detected as determinants of L-PHA binding (Fig. 6A). The parental phenotype was restored to Lec4 by transfecting the wild-type CHO-K1 GlcNAc-TV cDNA expression constructs (Fig. 6B). Transfection with wild-type gly-2 also rescued the Lec4 phenotype, and the profile is qualitatively identical to that of Lec4 cells rescued by transfection of CHO-K1 GlcNAc-TV (Fig. 6D). The partially rescued population is probably the result of low levels of activity expressed in these cells, itself due to thermolability of the nematode enzyme at 30°C. Thus, gly-2 is functionally equivalent to the mammalian gene product, able to act on the natural glycoprotein substrates found in mammalian cells and create glycans recognized by L-PHA.
GlcNAc-TV must be present in the medial-Golgi for the elaboration of β6-GlcNAc-branched N-glycans, and Lec4A mutant cells cannot bind L-PHA at the cell surface because they mislocalize active enzyme (35). The equivalent of the Lec4A missense mutation in GLY-2 was assayed. Protein truncations removing this region are catalytically active, yet GLY-2(L116R) failed to rescue the Lec4 phenotype in three independent experiments (Fig. 6C). Thus, although the wild-type GLY-2 enzyme complements Lec4 and, therefore, must be expressed and functional, the L116R mutant might not be. To address this, since attempts to raise anti-GLY-2 antibodies were unsuccessful, as were assays for activity in these transfected samples, constructs expressing GFP fused to the N terminus of GLY-2 were tested. Transfection of pEGFP-C3 alone does not affect the L-PHA binding properties of Lec4 or CHO-K1 (Fig. 7, A and B). GFP::GLY-2(+), however, results in complete restoration of the parental phenotype in Lec4 cells and is more effective than native GLY-2 (compare Figs. 7D to 6D). Consistent with this enhancement, GFP::GLY-2(L116R) can now partially rescue the cell surface phenotype and must therefore be catalytically competent (compare Figs. 7C to 6C). The FACS analysis indicated that transfection efficiencies were the same for all samples; therefore, cell extracts were Western-blotted for GFP epitopes, and GlcNAc-TV was assayed. Slightly more GFP epitope, as well as GlcNAc-TV enzyme activity, can be detected per cell transfected with GFP::GLY-2(L116R), but there is no indication of appreciable differences in specific activity (Fig. 7, E and F). Transfected cells were examined by deconvolution microscopy (data not shown), but fluorescent signals from both native and mutant forms were present in membranous compartments other than medial-Golgi.
Overexpression by transient transfection may overwhelm retention and trafficking mechanisms, but nevertheless, GLY-2(ϩ) and GLY-2(L116R) have different rescue behaviors.
Expression Pattern of gly-2p::GFP during Nematode Development-Transcriptional fusions of 6.7 kbp of upstream genomic DNA corresponding to bases 19,280-25,991 of cosmid C55B7 to nuclear localized and cytosolic forms of GFP provided by vectors pPD95.69 and pPD95.77, respectively, were used as reporter constructs. This stretch includes the 3′ end (base 19,436) of the next confirmed gene upstream on the same strand as gly-2. It encompasses all of the 5′-untranslated region sequences found in yk126h8 (which starts at base 21,436) as well as the region that is conserved in the genome of Caenorhabditis briggsae, a closely related species (alignment starts at base 23,114). By these criteria, the constructs should contain a fully qualified promoter of gly-2.
The distribution of signal in transgenic worms was unique and highly restricted with respect to tissue and/or stage of development but did not correspond to the descendants of a particular branch of the cell lineage. Fluorescence was first detectable at the comma stage (Fig. 8A) in cells that divided and appeared to migrate during the 2-fold (Fig. 8B) and 3-fold stages (Fig. 8C). Neuronal staining was obvious from L1 onward and by early L4 was seen to occur in both the dorsal and ventral nerve cords (Fig. 8D). During this stage, a strong signal was noted in the developing vulva (most likely the vulE and/or vulF cells). By late L4 an intense GFP signal in the spermathecal valve as well as other vulval and/or uterine structures was evident (Fig. 8E). Expression in the uv1 and uv2 cells was suggested by the pattern of fluorescence around the vulva. However, the nuclear-localized reporter construct stained more nuclei than can be accounted for by expression in these cells alone (Fig. 8F). With this construct, nuclear localized signal was observed in all four nuclei of the syncytial spermathecal valve cell (Fig. 8G). Although GFP fluorescence was seen to be strongest in the late L4 and early adult for the spermathecal valve and vulval/uterine structures previously noted, it was seen to persist throughout adulthood (Fig. 8, H-J). The M8 cell of the terminal bulb of the pharynx, all six cells of the pharyngeal-intestinal valve, and neuronal cell bodies within the metacorpus and around the isthmus of the pharynx also expressed gly-2p::GFP (Fig. 8K). At least 37 neurons with cell bodies lying next to the ventral nerve cord were positive for gly-2-directed reporter expression in the adult hermaphrodite, although with widely varying levels of staining. There was also GFP fluorescence present in other neurons associated with the pre-anal, dorso-rectal, and/or lumbar ganglia.
In adult males, expression was similar in non-sexually dimorphic tissues and was also observed in axons that project into rays 2, 3, 5, 6, and either 8 or 9 of the copulatory bursa (data not shown).
gly-2 Is a Non-essential Gene-ev581 is a Tc1 insertion allele into the 7th intron of gly-2 from which qa700 was derived by imprecise excision, an event that deleted 1165 bp containing ~2.5 exons that contribute to the catalytic domain (Table I). qa703 is a deletion created by ethylmethanesulfonate-induced deletion mutagenesis that removes 494 bp containing exon 6 and half of the largest exon, 7, both of which contribute to the catalytic domain. Both deletion alleles are probably null, but animals homozygous for either are viable. To check that no gross rearrangements occurred during mutagenesis, genetic mapping of the genotypes was performed. This placed the alleles on linkage group I between 1.07 and 1.18 map units to the right of dpy-5, exactly where expected from interpolations of the physical map.
GlcNAc-TV activity could be detected in microsomal extracts of wild-type C. elegans but not in the deletion mutant strain XA762 gly-2(qa703) (Fig. 9). Enzyme activity was restored in transgenic lines carrying a genomic region encompassing the gly-2 gene on the deletion mutant background. Thus, gly-2, which is the sole cognate homologue of Mgat-5 in C. elegans, encodes nematode GlcNAc-TV.
The strain XA728 gly-2(qa700**14) I had fertility defects arising from abnormal sperm function (Spe) that were not observed in XA762 gly-2(qa703**10) I. Compound heterozygotes of a qa700/qa703 genotype were non-Spe, confirming that this defect is caused by a linked but extragenic mutation in a complementation group unrelated to gly-2 (data not shown). Although gly-2 is expressed in many neurons, the vulva, and spermatheca, XA762 was wild type with respect to morphology, egg laying and hatching, locomotion, brood size, dauer switching, male incidence, developmental timing, and mechanosensory axon path-finding (data not shown). GFP reporter patterns were also unaffected by the mutant background.
DISCUSSION
The genomic structure of the gly-2 gene is significantly related to that of human GlcNAc-TV. The majority of exon boundaries, particularly in the catalytic domain, occur at equivalent residues and are in-frame. The N-terminal boundary of the catalytic region starts at exon 4; exon 3 contains the "Lec4A" region. Retention of phase zero introns in ancient genes is a feature of the "introns-early" model (39). These observations support the notion that exon shuffling of functional domains may have been the mechanism by which the ancestral GlcNAc-TV gene originated.
The deduced polypeptide sequence of gly-2 is stereotypical of Golgi glycosyltransferases, being a type II membrane protein with a 20-residue TMD starting six residues from the N terminus. A TMD of this length is efficiently retained by the Golgi apparatus and is the sole element in the polypeptide that appears to have bilayer-spanning properties (40). The lumenal portion starts with a hydrophilic region that may position the following catalytic domain away from the membrane and so promote efficient interactions with macromolecular substrates. Heterologous expression of recombinant gene product demonstrated that GLY-2 does indeed possess GlcNAc-TV enzyme activity and other properties in common with the mammalian homologue. The putative initiator codon and the TMD were confirmed since soluble recombinant fusion proteins were produced when truncated. The proposed stem could also be removed without affecting GlcNAc-TV enzyme activity in vitro. Several other C. elegans glycosyltransferase-related sequences have been found to possess the catalytic activity expected from their homologies. gly-3, gly-4, and gly-5 are polypeptide GalNAc-Ts (15), and gly-12 and gly-14 encode active GlcNAc-TI (16), whereas CeFT-1 is an α1,3-fucosyltransferase (17). gly-1 and possibly the other core 2 GlcNAc-T homologues may be an exception (23). GLY-1 transfers glucose rather than GlcNAc to core 1 acceptors (24), an observation concordant with the available structural data on C. elegans glycoprotein glycans (22). The components of the proteoglycan pathway encoded by sqv-3, sqv-7, and sqv-8 all possess the biochemical activity expected from their homologies (20,21). The proper functioning of GlcNAc-TV depends not only on catalytic competence but also upon being able to interact with nascent glycoprotein substrates in the ambient milieu, correct localization, and domain structure (35). We found unequivocally that gly-2 could rescue the surface lectin binding phenotype of Lec4 cells. Thus, GLY-2 retains all of the salient properties of the mammalian GlcNAc-TV despite having diverged for >500 Myr (41).
Alignment of mammalian GlcNAc-TV and GLY-2 identified a region that is highly conserved despite being N-terminal to the catalytic domain. This region contains a leucine that is mutated in Lec4A cells, causing otherwise active GlcNAc-TV to mislocalize and fail to elaborate cell surface β6-GlcNAc-branched N-glycans in consequence (35). The equivalent mutation in native GLY-2 did not rescue the Lec4 defect, but a GFP fusion product could do so inefficiently. It may be that the fusion protein is better expressed than the native nematode enzyme in Lec4 cells or that the addition of GFP stabilizes the product (42). The simplest interpretation is that GFP::GLY-2(L116R) is mislocalized as in Lec4A, but due to overexpression typical of transient transfections, a portion overwhelms the endoplasmic reticulum retention system and proceeds to the medial-Golgi (43). BLAST searches indicated that the conserved 15-residue peptide encompassing the critical leucine is unique to GlcNAc-TV but has been conserved throughout metazoan radiation. Because mutations affect subcellular localization, it may be that the region is conserved because of a role in targeting to the medial-Golgi. If so, this mechanism is either GlcNAc-TV-specific or acts via its conformational properties, plausible since the peptide is bounded by two conserved cysteines. Conformational elements participate in the subcellular localization of lysosomal hydrolases where a common surface is recognized to initiate formation of the mannose 6-phosphate-targeting signal (44).
Our data are concordant with the dominant transcript being SL1 trans-spliced to the first splice acceptor upstream of the initiator codon, which is typical of monocistronic C. elegans genes with a proximal promoter (34). yk126h8 contains an additional 383 nucleotides that occur in 4 non-coding exons 3994-4533 bp upstream and may represent a minor isoform from a distal upstream basal promoter. Distal promoters driving expression of this type of transcript at low levels are observed in C. elegans, for example pkc-1 (45). The genomic fragment used for constructing the GFP reporter transgenes included both potential promoters. From these, GLY-2 expression can be crudely summarized as occurring in some of the structures that have valve properties: the vulva, the spermathecal valve, and the pharyngeal-intestinal valve. The other major locus of expression is neuronal, present in many but not all 302 neurons in the adult hermaphrodite (46). Curiously, mammalian brain is rich in GlcNAc-TV transcripts, but enzyme activity is barely detectable, and Mgat-5 o mice are not obviously affected (4). However, failure to nurture pups is significantly more common in Mgat-5 o mice in a 129/Sv background.³
The essentially complete sequence of the C. elegans genome (8) contains a single gene that is orthologous to mammalian Golgi GlcNAc-TV proteins at both the primary sequence and domain organization level. This is unusual for glycosylation-related genes in the nematode. The C. elegans genome contains many gene families, and glycosyltransferases are well represented (14,47,48). Multiple glycosyltransferase homologues, C-type and S-type lectin domains as well as nucleotide-sugar synthases, occur in a cluster (49). Core 2 GlcNAc-T-like sequences are the 167th largest gene family (23,48); there are nine polypeptide GalNAc-T-like sequences (15), three homologues of GlcNAc-TI (16), and evidence for at least two α1,3-fucosyltransferases (17).
There are two β4-galactosyltransferase homologues, of which mutations in one, sqv-3, affect epithelial morphogenesis, resulting in defects in vulval invagination as well as oocyte receptiveness to sperm (18,19). Many mammalian glycosyltransferases are also present in multiple copies (50), but as in the worm, GlcNAc-TV has only one functional copy. Disruption of the Mgat-5 locus in mice results in a complete loss of both enzyme activity and GlcNAcβ1,6-branched structures (4). Although structural studies have yet to observe complex N-glycans in C. elegans, GlcNAc-TV activity in wild-type animals is detectable, absent in animals with gly-2 deleted, and restored by transgenes containing gly-2 genomic DNA. From our present study, it appears that Ce-gly-2 is orthologous to Mgat-5, structurally conserved at both genomic and polypeptide levels, and functionally interchangeable with mammalian GlcNAc-TV. Such "deep homology" is a feature of ancient and pivotal genes that occur in conserved pathways (41), but ablation of the gly-2 gene in C. elegans is without visible defects despite resulting in an enzymatically null phenotype. This situation is not unusual; many genes with severely defective alleles are viable in C. elegans (e.g. 23,54). It may be that the contributions are subtle under laboratory growth conditions. Mgat-5 o mice are also without overt phenotype but display several phenotypes that are dependent on extrinsic conditions. Suppression of tumor growth and metastasis induced by the Polyomavirus middle T-antigen is observed (4), and abnormalities in T-cell function, although significant, do not appear to compromise the animals greatly under laboratory conditions (3). The tractability of screens in C. elegans to uncover synthetic phenotypes enables this conundrum to be addressed and should yield mutations in genes that interact genetically with gly-2. These would reveal GlcNAc-TV-dependent pathways and phenotypes, identifying the contributions to fitness made by β6-GlcNAc-branched N-glycans.