Expression of Three Caenorhabditis elegans N-Acetylglucosaminyltransferase I Genes during Development*

UDP-N-acetylglucosamine:α-3-D-mannoside β-1,2-N-acetylglucosaminyltransferase I (GnT I) is a key enzyme in the synthesis of Asn-linked complex and hybrid glycans. Studies on mice with a null mutation in the GnT I gene have indicated that N-glycans play critical roles in mammalian morphogenesis. This paper presents studies on N-glycans during the development of the nematode Caenorhabditis elegans. We have cloned cDNAs for three predicted C. elegans genes homologous to mammalian GnT I (designated gly-12, gly-13, and gly-14). All three cDNAs encode proteins (467, 449, and 437 amino acids, respectively) with the domain structure typical of previously cloned Golgi-type glycosyltransferases. Expression in both insect cells and transgenic worms showed that gly-12 and gly-14, but not gly-13, encode active GnT I. All three genes were expressed throughout worm development (embryo, larval stages L1–L4, and adult worms). The gly-12 and gly-13 promoters were expressed from embryogenesis to adulthood in many tissues. The gly-14 promoter was expressed only in gut cells from L1 to adult developmental stages. Transgenic worms that overexpress any one of the three genes show no obvious phenotypic defects. The data indicate that C. elegans is a suitable model for further study of the role of complex N-glycans in development.

UDP-N-acetylglucosamine:α-3-D-mannoside β-1,2-N-acetylglucosaminyltransferase I (GnT I) is a key enzyme in the synthesis of Asn-linked complex and hybrid glycans because branching cannot occur until GnT I has acted (1,2). Recent work on mice and humans in which the synthesis of complex N-glycans is defective has provided excellent evidence that these structures play essential roles in development. Although somatic Chinese hamster ovary cell mutants lacking the GnT I gene show essentially normal growth, mouse embryos with a null mutation in this gene do not survive beyond 10.5 days postfertilization and show severe developmental abnormalities, particularly of the brain (3,4). Mice with a homozygous null mutation in the gene encoding UDP-N-acetylglucosamine:α-6-D-mannoside β-1,2-N-acetylglucosaminyltransferase II (GnT II) survive to term but are born stunted with various congenital abnormalities and die shortly after birth (5). Carbohydrate-deficient glycoprotein syndrome (CDGS) is a group of congenital diseases in which there is a defect in protein N-glycosylation (6). Children with CDGS types 1 and 2 show severe psychomotor retardation and other multisystemic abnormalities. About 80% of CDGS type 1 children have a defect in the phosphomannomutase gene (7-9). Another variant of CDGS type 1 has been described recently with a defect in the phosphomannose isomerase gene (10). Two children with CDGS type 2 have inactivating point mutations in the GnT II gene (11-13). Several other congenital diseases are associated with defective complex N-glycan synthesis (14).
These studies indicate that although complex N-glycans are not essential for the growth of cells in tissue culture, they play critical roles in mammalian morphogenesis. Complex N-glycans are absent from bacteria (15) and yeast (16) and are present in very small amounts, if at all, in protozoa (Trypanosoma cruzi (17), Leishmania (18), and Plasmodium (19)) and Dictyostelium discoideum (20). All of the above organisms except bacteria are capable of making N-glycans of the oligomannose type. Complex N-glycans are present in most of the multicellular invertebrate and vertebrate animals that have been analyzed (nematodes (21-23), schistosomes (24-26), molluscs (27,28), insects (29), fish (30), birds (31-34), and mammals) and in plants (35). However, a mutant Arabidopsis plant, which lacks GnT I and is unable to synthesize complex N-glycans, shows no apparent phenotype (36,37), suggesting that complex N-glycans do not play an essential role in plant development. The data indicate that complex N-glycans appeared in evolution just prior to the appearance of multicellular organisms and that, at least in mammals, they play important roles in the interactions between a cell and its cellular and fluid environment.
Because of the complexities encountered in the study of mammalian development, we have initiated studies on the role of complex N-glycans in the development of a simpler organism, the nematode worm Caenorhabditis elegans. Over 80% of the C. elegans genome has been sequenced, and detailed information is available on the morphology, development, and physiology of this worm. The GnT I gene (MGAT1) has been cloned from several mammalian and nonmammalian species (38). A computer search of the C. elegans genomic DNA sequence data base for sequences similar to the rabbit GnT I protein sequence using the BLASTP algorithm (39) revealed three homologous sequences, the products of predicted genes F48E3.1, B0416.6, and M01F1.1 (40). We report in this paper the sequences of the cDNAs of these three genes, which we have designated gly-12, gly-13, and gly-14, respectively; all C. elegans glycosylation-related genes are named gly (41). We present an analysis of the spatial and temporal pattern of gene expression during C. elegans development and the enzyme activities the genes encode when expressed in insect cells and in transgenic worms. Preliminary reports of this work have appeared (42-45).

Molecular Biology Procedures
Unless otherwise stated, standard molecular biology procedures were used (46,47). Oligonucleotides were synthesized on a Pharmacia DNA synthesizer and purified by the cartridge method (Hospital for Sick Children-Amersham Pharmacia Biotech Center, Toronto, Canada). All cDNAs and DNA constructs were sequenced in both directions by the double strand dideoxy method (48) using the Amersham Pharmacia Biotech T7 Sequencing Kit.

Cloning of gly-13 cDNA by Phage Library Screening
The polymerase chain reaction (PCR) was used to prepare three gly-13 gene-specific probes (based on the genomic sequence; GenBank™ accession number U23516), as follows (see Table I for PCR primer sequences): probe A (142 nt), primers CEBF1 and CEBR2; probe B (145 nt), primers CEBF3 and CEBR4; probe C (168 nt), primers CEBF5 and CEBR6. PCR products were purified by electrophoresis on a 1% agarose gel. DNA probes were labeled with [α-³²P]dCTP (Amersham Pharmacia Biotech; 3000 Ci/mmol) using the Amersham Pharmacia Biotech "Ready To Go" DNA labeling kit. The labeled probes were purified using Sephadex G50 DNA grade nick columns (Amersham Pharmacia Biotech). An oligo(dT)-primed C. elegans cDNA library in λgt10 (provided by Drs. S. Kim and H. R. Horvitz, MIT) was screened with a mixture of probes A, B, and C. Positive plaques were rescreened with each probe individually. Only one plaque hybridized to all three probes, and it was purified for further analysis.

Subcloning and Sequencing of gly-13 cDNA
Since there is an EcoRI site in the open reading frame of gly-13 near the 3′-end, the full-length gly-13 cDNA was obtained by partial digestion of the λgt10 plaque DNA with EcoRI, subcloned into pGEM7Zf(+) (Promega), and sequenced. A partial cDNA was also produced by the RT-PCR/3′-RACE procedure (49) using a preparation of C. elegans total RNA as template. An oligo(dT) primer was used for reverse transcription. A gene-specific forward primer (CEBF5, Table I) and a reverse adaptor primer (AP2) were used for 3′-RACE (Marathon cDNA amplification kit, CLONTECH).

Cloning of gly-14 cDNA by RT-PCR
The gly-14 cDNA was cloned by an RT-PCR approach using the Marathon cDNA Amplification Kit (CLONTECH) as recommended by the manufacturer. Adaptor-ligated double-stranded cDNA was synthesized by reverse transcription of adult C. elegans total RNA followed by second strand synthesis and ligation of Marathon cDNA adaptor to both ends of the double-stranded cDNA. The Marathon cDNA adaptor has two primer binding sites: AP1 (outer) and AP2 (inner) (see Table I). PCR was then carried out three times in succession using the adaptor-ligated cDNA as template and the following primer pairs (Table I; gene-specific primers based on the genomic sequence, GenBank™ accession number Z46381): CEMR5-AP1 (once) followed by CEMF1-CEMR6 (twice). PCR was also carried out three times in succession using the following primer pairs (Table I): CEMF7-AP1 (once) followed by CEMF8-CEMR4 (twice). The PCR products could be visualized by ethidium bromide staining of agarose gels only after the third round of PCR. Fusion of the two PCR products was carried out by PCR using Vent DNA polymerase (New England Biolabs) and the primer pair CEMF1-CEMR4 (Table I) to yield a cDNA fragment encoding the GLY-14 protein sequence containing the STOP codon but lacking 30 amino acids at the amino terminus, including the putative transmembrane domain. This truncated cDNA was subcloned into the NotI and KpnI sites of the baculovirus transfer vector pVT-Bac-His (kindly donated by Dr. David Joziasse, Vrije Universiteit, Amsterdam) (38) and sequenced. Some of the missing 5′-sequence of gly-14 cDNA was obtained by PCR using the AP2-CEMR10 primer pair (Table I) and adaptor-ligated cDNA as template. Attempts to determine the remainder of the 3′-end of the cDNA by 3′-RACE were not successful.

Cloning of gly-12 cDNA by Phage Library Screening
A 1.2-kb gly-12 hybridization probe was made by PCR using the Marathon adaptor-ligated C. elegans cDNA (described above) as template and primers CEFF2-CEFR3 (Table I; gene-specific primers based on the genomic sequence, GenBank™ accession number U28735). The C. elegans cDNA library was screened with this probe, as described above for gly-13. After three rounds of screening with the same probe, nine positive plaques were identified. Phage DNA from five of these plaques was prepared, and inserts were excised with EcoRI, subcloned into pGEM7Zf(+), and sequenced. A portion of the 3′-end of the gly-12 cDNA (88 nt) was obtained by PCR using adaptor-ligated cDNA as template and three successive PCR reactions with primer pairs CEFF1-AP1, CEFF2-AP2, and CEFF5-CEFR7, respectively (Table I). Attempts to determine the remainder of the 3′-end of the cDNA by 3′-RACE were not successful.

Determination of 5′-Ends of gly-12, gly-13, and gly-14 cDNAs
Total cDNA was prepared by RT-PCR using as substrate total RNA prepared from the L2 larval stage. The following PCR reactions were carried out using this cDNA as template and the SL1 primer (Table I) as the forward primer and reverse primers as shown in Table I.
gly-12: PCR was carried out using CEFR15 as the reverse primer. The solution was reamplified using the nested primer CEFR4. A PCR product of the expected size was observed and further amplified with reverse primer CEFR10.
gly-13: PCR was carried out using CEBR4 as the reverse primer. A PCR product of the expected size was seen and reamplified with the nested primer CEBR2.
gly-14: A procedure similar to that used for gly-12 was carried out using CEMR6, CEMR10, and CEMR2, respectively, as reverse primers.
The three final PCR products were sequenced, and all three messages showed trans-splicing to SL1.

Expression of GnT I in the Baculovirus/Sf9 Insect Cell System
C. elegans GnT I was expressed in the baculovirus/Sf9 insect cell system as described previously (38,50,51). DNA fragments encoding truncated GLY-12, GLY-13, and GLY-14 GnT I proteins lacking the amino-terminal cytoplasmic and transmembrane domains and parts of the stem region were synthesized by PCR amplification using Vent DNA polymerase and GnT I cDNAs as templates (43, 39, and 31 amino acids were removed from the amino-terminal end, respectively). The primer pairs used for gly-13 and gly-14 are shown in Table I. The PCR products were subcloned into the baculovirus transfer vector pVT-Bac-His (38) downstream from and in frame with the ATG start site of the plasmid using restriction enzyme sites introduced by the primers (Table I). This vector encodes a cleavable signal sequence for secretion from the Sf9 cells.
Full-length gly-12 cDNA was excised from plasmid p78F-Myc (see below) and subcloned into pBlueBacHis C (Invitrogen) downstream from and in frame with the ATG start site of the plasmid to create a recombinant transfer vector encoding full-length GLY-12 (51).

Northern Analysis of C. elegans mRNA
Total (~20 μg) and poly(A)+ RNA (~2.5 μg) from a mixed stage population of N2 hermaphrodites and total RNA from staged synchronous populations were subjected to Northern analysis by electrophoresis in a denaturing 1.0% agarose gel (10% formaldehyde) (55). DNA size markers were prepared by digestion of λ phage DNA (5 μg) with EcoRI and HindIII; DNA fragments were end-labeled with [α-³²P]dATP (3000 Ci/mmol, NEN Life Science Products) and Klenow DNA polymerase (New England Biolabs). Probes for gly-12 (126-1504 nt relative to the ATG start codon at +1) and gly-13 (114-1350 nt relative to the ATG start codon at +1) were made by excision of the truncated gly-12 and gly-13 inserts from the respective recombinant baculovirus transfer vectors (see above). Probes were randomly labeled with [α-³²P]dATP (3000 Ci/mmol, NEN Life Science Products). After hybridization of Northern blots with these probes (1 × 10⁷ cpm/ml for gly-12 and 3.2 × 10⁶ cpm/ml for gly-13), the blots were stripped with 0.1% SDS at 100°C and reprobed with a ³²P-labeled probe for the fem-1 gene (1-2 × 10⁶ cpm/ml) as a sample loading control (56).

Quantitation of gly-12 and gly-14 Messages by Competitive RT-PCR
We were able to detect gly-14 mRNA only by RT-PCR, not by Northern blot analysis, suggesting that gly-14 is expressed at lower levels than gly-12 and gly-13. We used competitive quantitative RT-PCR to estimate the relative abundance of both gly-12 and gly-14 mRNA during development. Total cDNA was obtained by oligo(dT)-primed reverse transcription of total RNA from all six developmental stages in the presence of 1 μCi of [α-³²P]dCTP (800 Ci/mmol, Amersham Pharmacia Biotech) using the Advantage RT-for-PCR kit (CLONTECH). Competitor cDNA was made by PCR with primer pairs (Table I) CEFF-DEL/CEFR3 (gly-12, 660-nt product, internal deletion of 233 nt) and CEMF7/CEMR-DEL (gly-14, 726-nt product, internal deletion of 213 nt) using the respective gly-12 and gly-14 cDNAs as templates. PCR was then carried out with gene-specific primer pairs (Table I) CEFF5/CEFR3 (gly-12, 893-nt product) and CEMF7/CEMR4 (gly-14, 939-nt product) using as template a mixture of [α-³²P]dCTP-labeled total cDNA (at a constant concentration) and purified competitor cDNA (at variable concentrations). The PCR products were resolved in a 1.5% agarose gel stained with ethidium bromide. Since the amount of added competitor cDNA is known, and on the assumption that the molar ratio of wild type cDNA to competitor cDNA remains approximately constant throughout the amplification, estimates can be made of the amount of gly-12 and gly-14 cDNA present at the start of the PCR reaction by scanning of the agarose gels. The expression level of message at each worm developmental stage was normalized with the amount of [α-³²P]dCTP-labeled total cDNA added to each PCR reaction.
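The estimate described above reduces to a ratio calculation, sketched here in Python. This is an illustration under stated assumptions, not the authors' analysis procedure: it assumes that ethidium bromide signal scales with DNA mass, so each band intensity is divided by its product length before the molar ratio is taken, and it uses a single lane rather than a titration series. The function name and the band intensities are hypothetical; the product lengths (893 and 660 nt for gly-12) come from the text.

```python
def estimate_target_amount(competitor_amount, target_intensity,
                           competitor_intensity, target_len, competitor_len):
    """Estimate the starting amount of target cDNA in one competitive PCR lane.

    Assumes the target:competitor molar ratio is unchanged by amplification
    and that band intensity is proportional to DNA mass (assumption), so
    intensity / length is proportional to moles of product.
    The result is in the same molar units as competitor_amount.
    """
    target_molar = target_intensity / target_len
    competitor_molar = competitor_intensity / competitor_len
    return competitor_amount * target_molar / competitor_molar


# Hypothetical scan values for a gly-12 lane (893-nt target, 660-nt competitor).
# Intensities in proportion to product length imply equimolar products.
equal = estimate_target_amount(1.0, 893.0, 660.0, 893, 660)
double = estimate_target_amount(1.0, 1786.0, 660.0, 893, 660)
print(equal, double)
```

In practice the lane in which the two bands are closest to equimolar is usually the most reliable one to read, because the constant-ratio assumption is least strained there.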

Preparation of DNA Constructs for Promoter Analysis
We used transcriptional fusion of GnT I genomic DNA to the lacZ reporter gene to examine the spatial pattern of GnT I expression during C. elegans development. Plasmids encoding lacZ were provided by Dr. A. Fire, Carnegie Institute of Washington (Baltimore, MD).
gly-13: Cosmid B0416 (40) was grown overnight in 500 ml of LB medium containing 50 μg/ml of ampicillin. DNA was prepared by using the Qiagen Maxi-prep kit, and two gene fragments were cut out using SalI and PstI, respectively. A DNA fragment (1008 nt) containing the putative promoter region immediately upstream of the first exon, the complete 30-nt 5′-UTR, and the first 8 nucleotides of the open reading frame (the initiation codon ATG was mutated to TTG) was obtained by PCR using the SalI fragment from cosmid B0416 as a template and primers PRBF53 and PRBR1040 (Table I). The PCR product was subcloned into the SalI and BamHI sites of plasmid pPD95.11 upstream of the lacZ gene, creating plasmid p11B/prom. A genomic DNA fragment (3428 nt) containing the remainder of the open reading frame and the complete 3′-UTR of gly-13 was obtained by PCR using the PstI fragment from cosmid B0416 as a template and primers PRBF618 and PRBR4027 (Table I). The PCR product was subcloned into the SpeI and AflII sites of plasmid p11B/prom, downstream of the lacZ gene, creating plasmid p11B/prom-ORF.
gly-14: Cosmid M01F1 (40) DNA was prepared as above. A DNA fragment (3.4 kb) containing the complete open reading frame except for the first nine nucleotides and the 3′-UTR was amplified by PCR from cosmid M01F1 using primers PRMF1158 and PRMR4530 (Table I) and subcloned into the SpeI and AflII sites of plasmid pPD95.11 downstream of the lacZ gene to create plasmid p11M/ORF. A DNA fragment (2.6 kb) containing the putative promoter region immediately upstream of the first exon, the complete 5′-UTR, and the first nine nucleotides of the open reading frame (the initiation codon ATG was mutated to ACG) was amplified by PCR from cosmid M01F1 DNA with primers PRMF33 and PRMR2664 (Table I) and subcloned into the SalI and BamHI sites of plasmid p11M/ORF upstream of the lacZ gene to create plasmid p11M/prom-ORF.
Plasmid p57M/prom2.6, containing the same 2.6-kb putative promoter region as p11M/prom-ORF but in which the region downstream of the lacZ gene was replaced with the 3′-UTR of the unc-54 gene, was constructed by inserting the 2.6-kb PCR product from the promoter region (see above) into the SalI and BamHI sites of pPD95.57. Deletion of ~1.3- and ~2.0-kb fragments from the 5′-end of p57M/prom2.6 with EcoRV and NheI resulted in plasmids p57M/prom1.3 and p57M/prom0.6, respectively.
gly-12: Cosmid F48E3 (40) DNA was prepared as above. A DNA fragment (1740 nt) containing the putative promoter region immediately upstream of the first exon, the complete 5′-UTR, and four nucleotides from the open reading frame (the initiation codon ATG was mutated to AAG) was amplified by PCR from the cosmid DNA with primers PRFF5 and PRFR6 (Table I) and subcloned into the SalI and BamHI sites of pPD95.57 upstream of the lacZ gene to create plasmid p57F/prom.

DNA Constructs for Heat Shock-induced Expression of GnT I by Transgenic Worms
gly-13 Expression Construct: A NotI DNA fragment was excised from the gly-13 baculovirus transfer plasmid (see above) encoding the open reading frame downstream of the transmembrane domain (nucleotides 114-1350 relative to the initiation ATG codon at +1), blunt-ended, and digested with ApaLI. A DNA fragment covering the first exon, first intron, and second exon was amplified by PCR from cosmid B0416 using primers BEXF1 and BEXR2 (Table I) and digested with NheI and ApaLI. These two fragments were then subcloned into the NheI and EcoRV sites of plasmids pPD49.78 and pPD49.83 downstream of the heat shock promoter (57), producing plasmids p78B and p83B, respectively. Myc spacer was amplified from plasmid AS#1309 using primers SC#001 and SC#002 (Table I). This fragment contains sequences encoding a Myc epitope tag (underlined) and 18 amino acids from the FEM-1 amino terminus (italics) (MAAEQKLISEEDLGRTPNGHHFRTVIYNAAAVGGMH). The Myc spacer was subcloned into plasmids p78B and p83B immediately upstream of the gly-13 sequence, producing plasmids p78B-Myc and p83B-Myc, respectively.
gly-14 Expression Construct: A KpnI DNA fragment was excised from the gly-14 baculovirus transfer plasmid (see above) encoding the open reading frame downstream of the transmembrane domain (nucleotides 90-1314 relative to the initiation ATG codon at +1), blunt-ended, and digested with HindIII. A DNA fragment covering the first three exons and introns, exon 4, and part of intron 4 was amplified by PCR from cosmid M01F1 using primers MEXF1 and MEXR2 (Table I) followed by digestion with NheI and HindIII. These two fragments were then subcloned into the NheI and EcoRV sites of plasmids pPD49.78 and pPD49.83 downstream of the heat shock promoter, producing plasmids p78M and p83M, respectively. Myc spacer was subcloned into these two plasmids, as described above, immediately upstream of the gly-14 sequence to produce plasmids p78M-Myc and p83M-Myc, respectively.
gly-12 Expression Construct: The amino terminus of the gly-12 open reading frame was amplified by PCR using as template one of the gly-12 cDNA clones isolated from the cDNA library (see above) and primers FEXF1 and CEFR4 (Table I). An NsiI site was introduced near the ATG start codon. The carboxyl terminus of gly-12 was similarly amplified by PCR using primers CEFF5 and FEXR2 (Table I). The middle of the gly-12 cDNA was obtained directly from the cDNA clone. These three fragments and Myc spacer were subcloned into the NheI and SacI sites of pPD49.78 and pPD49.83, producing plasmids p78F-Myc and p83F-Myc, respectively.

Preparation of Transgenic Worms
Promoter Analysis: DNA injection into the C. elegans germ line was carried out as described by Mello et al. (57,58). Transgenic lines were established from F2 descendants of animals injected with 10 or 50 ng/μl of GnT I::lacZ constructs and 50 ng/μl of plasmid pRF4, which carries the dominant rol-6 allele rol-6(su1006) that serves as a transformation marker. The total DNA concentration of the injection mixture was adjusted to 100 ng/μl by the addition of pBlueScript SK(−) (Stratagene). In some experiments, the F1 progeny of injected animals were analyzed directly for reporter expression; in these experiments, the concentration of GnT I::lacZ constructs was 100 ng/μl. LacZ expression was examined in a smg-1(e1228) background. The smg-1 mutation stabilizes aberrant transcripts with long 3′-UTRs (59). Transgenic lines or F1 progeny were cultured at 25°C. For F1 lacZ assays, 15-20 adult hermaphrodites were injected on each of three consecutive days. β-Galactosidase staining of late larvae and adults was carried out as described by Xie et al. (60). Staining of embryos and young larvae was carried out by the method of Fire (61). All animals were co-stained with 1 μg/ml of 4′,6-diamidino-2-phenylindole to visualize the cell nuclei. In the figures, anterior is to the left and dorsal is up.
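The make-up of the injection mixtures used to establish transgenic lines follows from simple subtraction, sketched below in Python. The function name and defaults are hypothetical conveniences; only the concentrations (10 or 50 ng/μl construct, 50 ng/μl pRF4, 100 ng/μl total) are taken from the text. The F1 experiments at 100 ng/μl construct are not modeled, since the text does not give the rest of that mixture.

```python
def filler_concentration(construct, marker=50.0, total=100.0):
    """Return the pBlueScript concentration (ng/ul) needed to reach `total`.

    construct: reporter construct concentration in ng/ul
    marker:    pRF4 (rol-6 marker) concentration in ng/ul
    total:     target total DNA concentration in ng/ul
    """
    filler = total - construct - marker
    if filler < 0:
        raise ValueError("construct plus marker already exceed the target total")
    return filler


for construct in (10.0, 50.0):
    print(construct, filler_concentration(construct))
```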

C. elegans Culture, Heat Shock, and Worm Lysis
The standard laboratory wild type strain N2 or a smg-1 (e1228) mutant derived from N2 nematodes was grown on MYOB (62) agar plates seeded with Escherichia coli strain OP50 (a leaky uracil-requiring strain). To examine the consequences of overexpressing GnT I, gravid adults were allowed to lay eggs for 2-3 h at 20°C. Eggs were incubated for a further 6 h at 20°C and subjected to heat shock treatment at 33°C for 1 h at 12-h intervals until the animals reached adulthood.
To measure the activity of overexpressed enzyme, heat shock was carried out at 33°C for 2 h followed by recovery at 20°C for a further 2 h. Worms from 10-15 agar plates were harvested by washing off the plates with M9 buffer, pelleted by centrifugation at 700 rpm for 2 min, washed with 10 ml of M9 buffer, and suspended in 5 ml of ice-cold M9 buffer. An equal volume of ice-cold 60% (w/w) sucrose was added, and the suspension was mixed by inversion and centrifuged at 700 rpm for 5 min to remove bacteria. Worms were collected; washed twice with 10 ml of M9 buffer; resuspended in 1 ml of buffer containing 20 mM Tris-HCl (pH 7.5), 250 mM sucrose, and protease inhibitor mixture (Boehringer); and stored at −70°C. Worms were lysed by sonication, five times with 5-s pulses at 30-s intervals. The sonicate was centrifuged at 3500 rpm in a Beckman JA17 rotor for 10 min, and the supernatant was centrifuged at 55,000 rpm for 1 h (Beckman 100.3 rotor). The microsomal pellet was resuspended in lysis buffer (25 mM MES, pH 6.1, 1% Triton X-100, and protease inhibitor mixture).

RESULTS
Cloning of gly-12, gly-13, and gly-14 cDNAs: The three C. elegans GnT I cDNA sequences are not shown in this paper and have been submitted to the GenBank™ data base. We have determined the 3′-end of gly-13 but not of gly-12 and gly-14.
The gly-13 cDNA has a 342-nt 3′-untranslated region and an AATAAA polyadenylation initiation sequence 23 nt upstream of a long poly(A) sequence. Over 70% of C. elegans mRNAs carry a 22-nt trans-spliced 5′ leader sequence known as SL1 (66); we showed the presence of SL1-bearing transcripts at the 5′-ends of all three C. elegans GnT I cDNAs but have not determined whether the mRNAs of the three genes are exclusively trans-spliced to SL1. As expected, there are consensus 3′ intron splice acceptor sites at the 5′-end of the first exon of all three GnT I genes (data not shown).
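Locating an AATAAA signal relative to a poly(A) tail, as reported above for gly-13, can be sketched in a few lines of Python. This is a toy illustration: the sequence is invented (the cDNA sequences are not shown in this paper), the function name and the minimum tail length are hypothetical, and the distance is assumed to be counted from the end of the hexamer to the first base of the tail.

```python
def polya_signal_gap(cdna, signal="AATAAA", min_tail=10):
    """Return the number of nucleotides between the last `signal` and the poly(A) tail.

    Returns None if there is no tail of at least `min_tail` A residues or no
    signal upstream of it. A signal that abuts the tail directly is not
    handled, because its trailing A residues are counted as tail here.
    """
    seq = cdna.upper()
    tail_start = len(seq.rstrip("A"))      # index of the first tail residue
    if len(seq) - tail_start < min_tail:
        return None
    pos = seq.rfind(signal, 0, tail_start)  # last signal fully upstream of tail
    if pos == -1:
        return None
    return tail_start - (pos + len(signal))


# Invented 3'-end: 4-nt prefix, signal, 23-nt spacer, 30-nt poly(A) tail.
toy = "GGCC" + "AATAAA" + "CTG" * 7 + "CT" + "A" * 30
print(polya_signal_gap(toy))
```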
Comparison of the gly-12, gly-13, and gly-14 genomic DNA sequences in the C. elegans data base with the cDNA sequences indicates that the three genes contain 14, 12, and 12 exons, respectively, with conservation of all 5′-intron splice donor and 3′-intron splice acceptor sequences. This is in marked contrast to the mammalian GnT I gene, in which the entire coding region is on a single exon (55). The exon-intron junctions predicted by computer analysis (Genefinder) of the C. elegans genomic DNA sequences were correct for the gly-14 gene. Errors for the gly-13 gene were relatively minor. Computer analysis of cosmid F48E3 predicted a gene F48E3.1 consisting of 17 exons. However, we found that the predicted gene F48E3.1 consisted of two separate genes. The most 3′ 11 predicted exons, together with three additional exons not predicted by the Genefinder program, constitute an open reading frame that can encode a protein similar to mammalian GnT I. We have designated this gene as gly-12. The six predicted exons upstream of gly-12 form a separate gene, designated F48E3.1a, that can encode a protein of unknown function with no similarity to mammalian GnT I. A series of PCR reactions using adaptor-ligated C. elegans cDNA as template was carried out to establish the presence of these two distinct genes (data not shown). PCR products of the expected sizes were obtained with the following primer pairs (Table I): CEFF8-CEFR11 (from the upstream gene) and CEFF5-CEFR3 (from the downstream gly-12 gene). No PCR products were obtained with primer pairs CEFF9-CEFR4 and CEFF8-CEFR10; CEFF9 and CEFF8 were derived from the upstream gene sequence, whereas CEFR4 and CEFR10 were derived from the downstream gly-12 gene.
The gly-12, gly-13, and gly-14 cDNA sequences contain open reading frames of 1401, 1347, and 1311 nt encoding putative proteins of 467, 449, and 437 amino acid residues, respectively (Fig. 1). Hydropathy plots (67) of all three protein sequences (not shown) predict a domain structure typical of all previously cloned Golgi-type glycosyltransferases, namely a short N-terminal cytoplasmic domain, a hydrophobic noncleavable signal-anchor transmembrane domain, a stem region, and a long C-terminal catalytic domain. Whereas the mammalian GnT I proteins do not contain any putative N-glycosylation sites (NX(S/T) sequons), all three C. elegans proteins contain such sequences (GLY-12 at Asn111, Asn128, Asn337, and Asn402; GLY-13 at Asn159; GLY-14 at Asn129, Asn188, and Asn242). Comparison of the aligned full-length protein sequences (Fig. 1) shows 44, 44, and 60% identities for the GLY-12/GLY-13, GLY-12/GLY-14, and GLY-13/GLY-14 pairs, respectively, and comparison of the 350-394 amino acid C-terminal catalytic domains with the corresponding mammalian GnT I sequences shows 36, 47, and 48% identities for the GLY-12, GLY-13, and GLY-14 sequences, respectively. The three C. elegans GnT I protein sequences show no similarity to the mammalian GnT I proteins in the cytoplasmic, transmembrane, and stem regions (Fig. 1). The gly-13 and gly-14 genes share 9 of 11 intron positions, whereas four introns occur at the same positions in all three genes (Fig. 1). Genes gly-12 and gly-13 are on C. elegans chromosome X, and gly-14 is on chromosome III. The data indicate that the mammalian GnT I genes and three C. elegans GnT I genes are derived from a common ancestor.
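A scan for NX(S/T) sequons and the relation between the quoted open reading frame lengths and protein lengths can be sketched in Python as follows. This is a minimal illustration, not the method used for Fig. 1: the toy protein sequence and both function names are hypothetical, the exclusion of proline at position X is the usual convention and an assumption here, and the reported lengths are read as excluding the stop codon because each is exactly three times the residue count.

```python
import re

# N-X-(S/T) sequon. Excluding proline at X is the usual convention and is an
# assumption here, since the text writes the motif simply as NX(S/T).
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues that begin an N-X-(S/T) sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def residues_from_orf(orf_nt):
    """Number of encoded residues for an ORF length that excludes the stop codon."""
    if orf_nt % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3


toy_protein = "MANKSAANPTGNGT"  # hypothetical sequence, not a GLY protein
print(find_sequons(toy_protein))                             # [3, 12]
print([residues_from_orf(n) for n in (1401, 1347, 1311)])    # [467, 449, 437]
```

In the toy sequence the Asn at position 8 is skipped because it is followed by proline.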
Table I. PCR primers (fragment): 5′-GTCGTCTTCAGTCACAATG-3′; 5′-TTCTATGCATTCCTCCAACCGCCGCAGCATTAT-3′.
a Unknown gene upstream of gly-12, designated as F48E3.1a (see "Results"). b These primers were used to make PCR products for insertion into the baculovirus transfer vector (see "Materials and Methods"). c These primers contain a STOP codon.
Expression of C. elegans GnT I cDNAs in the Baculovirus/Sf9 System: Sf9 cell lysates have been reported to contain GnT I activity (68), but endogenous GnT I activity was low in both cell lysates and supernatants under our assay conditions (0.03-3.4 nmol/10⁵ cells/h at 48-120 h after infection). The culture medium of Sf9 cells infected with recombinant baculovirus encoding truncated GLY-14 contained levels of enzyme activity equivalent to those previously obtained with mammalian GnT I expression (data not shown). Expression of GLY-12 either as a truncated or full-length protein yielded enzyme activities above background levels, but this activity was appreciably less than the intracellular activity of truncated GLY-14 (data not shown). Western blot analysis using mouse monoclonal antibody raised against the enterokinase cleavage site (Anti-Xpress antibody kit, Invitrogen) (38) showed recombinant protein bands at the expected molecular weights for GLY-12 (weak) and GLY-14 (strong). Attempts to express recombinant baculovirus encoding truncated GLY-13 were not successful; we could not detect a protein band by Western analysis, nor could we detect any enzyme activity either in cell lysates or supernatants (data not shown).
Kinetic analysis of GLY-14 in cell supernatant and GLY-12 in Sf9 cell extracts gave linear 1/v versus 1/S plots (where v is the initial velocity and S is the substrate concentration) consistent with an ordered sequential Bi Bi mechanism if one assumes steady state rather than rapid equilibrium kinetics (53). Kinetic constants (data not shown) indicate no major differences between C. elegans and previously published data on rabbit GnT I (54); GLY-12 was assayed only with M3-octyl, whereas GLY-14 and the rabbit enzyme were assayed with both M3-octyl and M5-glycopeptide. Rabbit GnT I had a higher temperature optimum (37°C) than GLY-12 and GLY-14 (20-30°C, data not shown), and the rabbit enzyme remained active at pH 5.0-5.5, whereas GLY-14 did not (data not shown). The rabbit and GLY-14 enzymes showed very similar metal requirements (data not shown); there was an absolute requirement for Mn²⁺ and little (<20% of maximum activity) or no activity with Mg²⁺, Ni²⁺, Ba²⁺, Ca²⁺, Cd²⁺, Fe²⁺, or Cu²⁺. The products of GLY-12 and GLY-14 with M3-octyl were analyzed by thin layer chromatography and shown to co-migrate with standard Manα1-6(GlcNAcβ1-2Manα1-3)Manβ-O-octyl (69) in the following solvent systems: (i) acetonitrile/water (5:1) and (ii) dichloromethane/methanol/water (55:35:6) (data not shown). The product of GLY-14 was purified and shown to have an NMR spectrum at 500 MHz identical to the Manα1-6(GlcNAcβ1-2Manα1-3)Manβ-O-octyl standard (69) (data not shown).
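For readers unfamiliar with 1/v versus 1/S plots, the following Python sketch shows how apparent kinetic constants are read from such a line. It is an illustration only: the data are synthetic (the kinetic constants in this paper are not shown), the function name is hypothetical, and the fit treats one substrate at a fixed concentration of the other, so the values it returns are apparent Km and Vmax rather than the full set of Bi Bi constants.

```python
def lineweaver_burk_fit(substrate, velocity):
    """Fit 1/v = (Km/Vmax)(1/S) + 1/Vmax by least squares.

    Returns (apparent Km, apparent Vmax) in the units of the inputs.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # Km / Vmax
    intercept = mean_y - slope * mean_x    # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free Michaelis-Menten data (Km = 2, Vmax = 10, arbitrary units).
S = [0.5, 1.0, 2.0, 4.0, 8.0]
v = [10.0 * s / (2.0 + s) for s in S]
km, vmax = lineweaver_burk_fit(S, v)
print(round(km, 6), round(vmax, 6))
```

With real data the double-reciprocal transform weights low-substrate points heavily, so a direct nonlinear fit is often preferred for the final constants; the linear plot remains useful for judging mechanism.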
Expression of GnT I mRNA at Various Stages in C. elegans Development-As a first step toward understanding the role of N-glycans in development, we studied the expression of GnT I mRNA in six developmental stages of C. elegans, an embryo fraction containing a mixture of embryo stages, the four larval stages L1-L4, and adult worms (70). Northern analysis detected messages for both gly-12 (a major band at ϳ2.1 kb, Fig.  2A) and gly-13 (a major band at ϳ1.9 kb, Fig. 2B) in all six developmental stages. Assuming that the 3Ј-untranslated regions of gly-12 and gly-14 are not excessively long, the mRNA sizes are consistent with the cDNA lengths determined by sequencing (Ͼ1982, 1719, and Ͼ1322 nt for gly-12, gly-13, and gly-14, respectively). We used quantitative RT-PCR (Figs. 3, A-C) to establish the relative abundance of gly-12 and gly-14 mRNA. There were no major variations between different stages of development except for a higher level of gly-12 message at the embryo stage (Fig. 3, A and C). The gly-12 mRNA levels are 6 -38 times higher than the gly-14 mRNA (Fig. 3C).
Expression of GnT I Promoters at Various Stages in C. elegans Development-Expression of the β-galactosidase reporter gene in the F1 progeny of worms injected with the gly-12 promoter construct p57F/prom was observed throughout all developmental stages and in many tissues (intestine, muscle, hypodermis, and other epithelial cells and in ganglia in the head and tail region) (Fig. 4, A-C, and data not shown).
The F1 progeny of worms injected with the gly-13 promoter construct p11B/prom-ORF expressed β-galactosidase from late embryogenesis to adulthood. Expression in L1 larvae was confined to the gut cells (Fig. 4D). From L2 to adulthood, β-galactosidase was expressed in many different tissues, including gut, muscle, hypodermis, and other epithelial cells and the nervous system (ganglia in the head and tail region and the ventral nerve cord) (Fig. 4, E and F, and data not shown).
Transgenic animals carrying the gly-14 promoter construct p57M/prom2.6, containing a 2.6-kb promoter fragment and the 3′-UTR of the unc-54 gene, expressed the reporter gene only in the gut (Fig. 4G and data not shown). The same expression pattern was observed in animals ranging from L1 to adulthood.
The most anterior gut cells (int 1) and those in the posterior third of the body expressed the reporter gene most strongly. The remainder of the gut often failed to express the reporter. We did not detect reporter expression in embryos, despite finding that embryos contain approximately the same amount of gly-14 mRNA as other stages (Fig. 3, B and C). As is typical of reporter transgenes in C. elegans (71), none of our reporter constructs directed expression in the germ line. Therefore, we cannot exclude the possibility that gly-14 mRNA is maternally contributed to the embryo. Injection of p11M/prom-ORF (lacking the 3′-UTR of the unc-54 gene) and the truncated constructs p57M/prom1.3 and p57M/prom0.6 did not result in reporter gene expression.
Heat Shock-induced Overexpression of GnT I-Transgenic worms that overexpress gly-12, gly-13, or gly-14 under the control of the hsp-16 heat shock promoters show no obvious phenotypic defects. To test whether functional GnT I was produced in these transgenic worms, microsomal fractions were prepared to determine enzyme activity. Heat shock induction of gly-13 resulted in little or no increase in GnT I activity compared with wild type N2 worms (wild type GnT I activity <0.1 nmol/h/mg) (Table II). Dramatic increases in enzyme activity were observed on heat shock induction of both gly-12 (27-157-fold) and gly-14 (39-182-fold) (Table II). Lysates of all three transgenic lines overexpressing GnT I showed protein bands of the expected size by Western blotting (Fig. 5). This finding suggests that the low GLY-13 activity is not due to poor expression but rather to a low specific activity, at least with the acceptor substrate used for assay; GLY-13 may have a high activity with an as yet unknown physiological acceptor. Immunolocalization experiments showed that all three GnT I gene products stained as focal areas in the perinuclear region of the cytoplasm, suggesting a Golgi complex location (Fig. 6 and data not shown).

DISCUSSION
Our studies on "knockout" mice with a null mutation in the GnT I gene (3) and on humans with carbohydrate-deficient glycoprotein syndrome type II (13) have indicated that interference with complex N-glycan synthesis is associated with severe defects in embryonic development particularly of the nervous system. We have therefore initiated studies on the expression of GnT I in C. elegans in the hope that this relatively simple and thoroughly characterized organism will provide important information on the role of complex N-glycans in development.
A search of the genomic DNA data base using the BLASTP algorithm (39) indicated the presence of three C. elegans genes that show significant similarity to mammalian GnT I. These genes have been designated gly-12, gly-13, and gly-14. Only a single functional copy of the GnT I gene has been reported in mammals. We have cloned the C. elegans cDNAs and have demonstrated that two of them (gly-12 and gly-14) encode an active GnT I in both a heterologous (Sf9 insect cells) and a homologous (transgenic worms) host. However, no protein was detected by Western analysis when we attempted to express gly-13 in Sf9 insect cells. Expression of gly-13 in transgenic worms yielded a protein of the expected size on Western blots, but this protein showed no GnT I enzyme activity. The data suggest that gly-13 may encode a glycosyltransferase with a specificity different from GnT I. Similarly, only 5 of the 11 C. elegans UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase cDNA homologues cloned by Hagen and Nehrke (41) were shown to possess enzyme activity. Bakker et al. (72) attempted to clone a snail UDP-GalNAc:GlcNAcβ-R β4-GalNAc-transferase by screening a snail cDNA library with a UDP-Gal:GlcNAcβ-R β4-Gal-transferase probe but instead cloned a novel UDP-GlcNAc:GlcNAcβ-R β4-GlcNAc-transferase. Studies are under way on the large scale expression of gly-13 so that a search can be made for other enzyme activities.
In contrast to the mammalian GnT I genes in which the entire open reading frame is on a single exon, the gly-12, gly-13, and gly-14 genes have multiple exons (14, 12, and 12, respectively). However, although the identity between the C. elegans and mammalian GnT I amino acid sequences is less than 50%, the GLY-12 and GLY-14 enzymes show kinetic parameters very similar to the rabbit enzyme. The major differences detected were the pH profiles (rabbit GnT I maintains its activity to a significantly lower pH than GLY-14) and the lower temperature optimum for GLY-12 and GLY-14 relative to rabbit GnT I. C. elegans GalNAc-transferase also has a lower temperature optimum than the mammalian enzyme (41).
We have previously shown that removal of 106 amino acids from the N terminus of rabbit GnT I does not inactivate the enzyme (38). This region contains the cytoplasmic, transmembrane, and stem domains and shows marked differences in amino acid sequence between the mammalian enzymes and each of the three C. elegans enzymes (Fig. 1). The catalytic domain of GnT I contains 341 amino acids for the mammalian enzymes and 350-394 amino acids for the C. elegans enzymes. Comparison of mammalian and C. elegans sequences indicates five highly conserved regions that are probably essential for catalytic activity (118-159, 200-211, 221-265, 277-330, and 431-461 in Fig. 1). Of the five cysteine residues in the consensus sequence (128, 137, 159, 256, and 322 in Fig. 1), only two are conserved for all mammalian and C. elegans sequences (256 and 322). Site-directed mutagenesis of invariant amino acids is being carried out to determine whether these residues are indeed essential for enzyme activity.
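The kind of comparison described above, asking which cysteine positions are invariant across an alignment and which are present in only some sequences, can be sketched as follows. This is a minimal illustration that assumes a pre-computed, gapped alignment of equal-length strings; the toy sequences and function names are hypothetical and do not reproduce the GnT I alignment of Fig. 1.

```python
def conserved_columns(aligned_seqs, residue):
    """Return 1-based alignment columns where every sequence has `residue`."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    return [
        col + 1
        for col in range(length)
        if all(seq[col] == residue for seq in aligned_seqs)
    ]


def partially_conserved_columns(aligned_seqs, residue):
    """Return 1-based columns where some, but not all, sequences have `residue`."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    cols = []
    for col in range(length):
        count = sum(seq[col] == residue for seq in aligned_seqs)
        if 0 < count < len(aligned_seqs):
            cols.append(col + 1)
    return cols


# Toy alignment ("-" marks a gap). Hypothetical sequences, NOT GnT I.
toy_alignment = [
    "ACDLCKW-C",
    "ACELSKWGC",
    "GCDLCRW-C",
]

fully = conserved_columns(toy_alignment, "C")
partly = partially_conserved_columns(toy_alignment, "C")
print("Cys in all sequences:", fully)
print("Cys in some sequences:", partly)
```

Columns reported by the first function correspond to invariant residues, which are the natural candidates for site-directed mutagenesis; the second function flags positions that are tolerated as non-cysteine in at least one sequence.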
FIG. 6. Immunolocalization of C. elegans heat shock-induced Myc-GLY-12. The transgenic worm was stained with antibody 9E10, which recognizes the Myc epitope tag (right panel) and with 4′,6-diamidino-2-phenylindole to visualize nuclei (left panel). A portion of the intestine has extruded, thereby permitting a clear view of three gut cells. It is seen that the Myc epitope is localized to punctate perinuclear areas, suggestive of localization in the Golgi complex.
pressed in all six stages of worm development. Except for a relative increase of gly-12 expression in the embryo, there is no significant difference in expression between the various developmental stages. Analysis of reporter gene expression in transgenic animals confirms the expression of gly-12 and gly-13 at all stages of development and shows that the gly-12 and gly-13 promoters are expressed in most cell types. Expression of the gly-14 promoter was detected only postembryonically and only in gut cells, suggesting a tissue-specific expression of enzyme activity. Confirmation of its gut-specific expression will require detection of endogenous gly-14 gene products.
Overexpression of the three C. elegans GnT I genes in transgenic worms under the control of worm heat shock promoters caused no obvious phenotypic changes despite marked increases in the enzyme activities of GLY-12 and GLY-14 in worm lysates. This is perhaps not surprising, since mammalian GnT I is probably a housekeeping gene (55) and is expressed in non-rate-limiting amounts. However, mice with a null mutation in the GnT I gene do not survive beyond 10 days of embryonic life, and it is therefore of great interest to study the effects of null mutations in the three C. elegans genes. Attempts to create mutant worms lacking GnT I by injection of single-stranded and double-stranded RNA have to date not produced any obvious phenotypes. Attempts to create mutant worms by other methods are under way.
The C. elegans genomic DNA and expressed sequence tag data bases contain sequences that show significant homologies to at least 17 enzymes involved in the synthesis of N- and O-glycans, glycosyl-phosphatidylinositol anchors, and proteoglycans, e.g. UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase (41), UDP-GlcNAc:polypeptide N-acetylglucosaminyltransferase (73), GnT I (this study), UDP-Gal:GlcNAc-R β1,4-galactosyltransferase, β1,6-N-acetylglucosaminyltransferase V, and α1,3-fucosyltransferase. The only enzymatically active C. elegans glycosyltransferases published to date are the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases (41) and GnT I (this study). Lectins such as wheat germ agglutinin (74,75) and concanavalin A (76) have been shown to bind to C. elegans tissues, suggesting the presence of glycoproteins in these organisms. Although no detailed glycan structures have as yet been reported for C. elegans, both N-glycan (21-23) and O-glycan (77) fine structures have been determined for several parasitic nematodes using mass spectrometric analysis. Some of these nematode N-glycan structures contain the GlcNAcβ1,2-Manα1,3-Manβ-R moiety indicative of a functional GnT I enzyme. The data available to date therefore show that C. elegans is highly active in the synthesis of glycoproteins and is an excellent organism in which to study the role of protein glycosylation in the development of a multicellular organism.