A UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase is essential for viability in Drosophila melanogaster.

We report the first demonstration that the activity of a member of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase gene family is necessary for viability in Drosophila melanogaster. Expression of the wild-type recombinant pgant35A gene in COS7 cells resulted in in vitro activity against peptide and glycopeptide substrates, demonstrating that this gene encodes a biochemically active transferase. Previous mutagenesis studies identified recessive lethal mutations that were rescued by a genomic fragment containing the pgant35A gene; however, the presence of additional open reading frames within this fragment left open the possibility that another gene was responsible for rescue of the observed lethality. Here, we have determined the molecular nature of the mutations in three independent mutant alleles. Two of the mutant alleles contain premature stop codons within the coding region of pgant35A. The third mutant contains an arginine to tryptophan amino acid change, which, when expressed in COS7 cells, resulted in a dramatic reduction of transferase activity in vitro. PCR amplification of this gene from Drosophila cDNA panels and Northern analysis revealed that it is expressed throughout embryonic, larval, and pupal stages as well as in adult males and females. This study provides the first direct evidence for the involvement of a member of this conserved multigene family in eukaryotic development and viability.

O-Linked protein glycosylation is initiated by the action of a family of enzymes known as the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases (ppGaNTases) (EC 2.4.1.41). The members of this enzyme family catalyze the transfer of GalNAc from the nucleotide sugar UDP-GalNAc to the hydroxyl group of either serine or threonine in protein substrates, performing the first committed step in the synthesis of mucin-type O-glycans. To date, eight distinct mammalian isoforms of this enzyme family have been functionally characterized: ppGaNTase-T1 (1, 2), -T2 (3), -T3 (4, 5), -T4 (6), -T5 (7), -T6 (8), -T7 (9, 10), and -T9 (11); additionally, isoforms have been identified and characterized in Caenorhabditis elegans (12). Each mammalian isoform displays a unique combination of expression in adult tissues, spatial and temporal expression during development, and in vitro substrate specificity. Whereas some isoforms are found across a broad range of adult tissues and also act on a large repertoire of substrates in vitro (ppGaNTase-T1, -T2, -T3, and -T6), others are more restricted in their expression patterns and their substrate preferences (ppGaNTase-T4, -T5, -T7, and -T9). Most recently, a class of isoforms (ppGaNTase-T7 (9) and -T9 (11)) has emerged that requires the prior action of other ppGaNTases, indicating the existence of a potentially complex hierarchy within this enzyme family.
A number of functional roles have been proposed for mucin-type O-glycans (13); however, direct evidence implicating this enzyme family in biological functions as well as demonstrating their necessity for mammalian viability is still lacking. While ablation of members of this gene family in mice is ongoing, no definitive phenotypes are yet evident. A strain deficient in ppGaNTase-T1 (14), while displaying altered lectin staining patterns, is viable and fertile, as is a knockout of another putative isoform (15).
Through sequence homology searches of the Drosophila melanogaster genome data base and text searches of FlyBase, potential members of the ppGaNTase enzyme family are readily recognized. One putative isoform (l(2)35Aa; hereafter referred to as pgant35A) is documented in FlyBase as having lethal allelic mutations generated by ethylmethanesulfonate mutagenesis (16). These mutations are recessive lethal, arresting during early pupal development (16), and are rescued by a genomic fragment containing the putative pgant35A gene (17). While these studies suggest that the mutations responsible for the recessive lethal phenotype may lie in the pgant35A gene, the molecular nature of these mutations was not defined, and it remains possible that other open reading frames within this genomic fragment may have been responsible for the rescue.
To begin to address its role in the lethal phenotype observed, we have cloned pgant35A from a D. melanogaster embryonic cDNA library and expressed it as a recombinant protein, demonstrating that it encodes a biochemically active transferase. Sequencing of the pgant35A gene from homozygous mutant embryos of three independent mutant strains revealed unique mutations within the coding region of pgant35A, all of which would result in the production of truncated or enzymatically compromised proteins. This study demonstrates that the molecular defect resulting in the lethal phenotype of these mutants lies within the pgant35A gene and provides the first direct evidence for the involvement of a member of the ppGaNTase family in normal eukaryotic development.

EXPERIMENTAL PROCEDURES
Isolation of pgant35A Full-length cDNA-The amino acid consensus sequence SPTMAGGLFAVNRKYFQHLGEY, derived from the conserved region of previously characterized mammalian ppGaNTases, was used to perform a tBLASTn search against the existing D. melanogaster genome data base present in NCBI to identify all potential members of this enzyme family. The 14 predicted D. melanogaster ppGaNTase gene sequences obtained were aligned to identify highly conserved regions on which to base degenerate probes to screen cDNA libraries. The primers MAGGLF-S (dATGGCCGGCGGNCTGTTTGCCAT) and WGGEN-AS (dATCTCCANATTCTCGCCGCCCCA) were used to amplify a 100-bp fragment from D. melanogaster genomic DNA. This amplified genomic fragment was then radioactively labeled using the Random Primers DNA Labeling System (Invitrogen) and used to probe a D. melanogaster Canton-S embryo (2-14-h-old) UniZap cDNA library (Stratagene; catalog no. 937602). Hybridizations were performed in 5× SSPE plus 50% formamide at 42°C with washes in 2× SSC plus 0.5% SDS for 5 min at room temperature and for 15 min at 65°C. Positively hybridizing clones were further screened with a pgant35A isoform-specific primer, FlyH-762S (dTGCACGCGAGGCCGTGGGCGATG), hybridizing in 5× SSPE, 50% formamide at 30°C with washes in 2× SSC, 0.5% SDS for 5 min at room temperature and 15 min at 42°C. From this second round of screening, two independent cDNAs were isolated corresponding to pgant35A. One clone (FlyH-2a) containing a complete open reading frame was completely sequenced on both strands (Lark Technologies, Inc.).
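The relationship between the two degenerate primers and the consensus peptide can be checked with a short sketch. It expands the degenerate base N, translates the sense primer directly, and translates the antisense primer after reverse complementation. The primer sequences are those given above (with the line-break hyphen removed); the helper names and the choice of reading frame 0 are illustrative assumptions, not part of the original methods.

```python
from itertools import product

# IUPAC codes needed here; N stands for any of the four bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in primer))]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons in frame 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Primer sequences as reported in the text (hypothetical variable names).
MAGGLF_S = "ATGGCCGGCGGNCTGTTTGCCAT"  # sense primer
WGGEN_AS = "ATCTCCANATTCTCGCCGCCCCA"  # antisense primer

sense_peptides = {translate(s) for s in expand(MAGGLF_S)}
antisense_peptides = {translate(reverse_complement(s)) for s in expand(WGGEN_AS)}

if __name__ == "__main__":
    print("MAGGLF-S encodes:", sorted(sense_peptides))
    print("WGGEN-AS (reverse complement) encodes:", sorted(antisense_peptides))
```

Under these assumptions, the sense primer reads through the MAGGLFA stretch of the consensus sequence, and the reverse complement of the antisense primer begins with the WGGEN motif that gives the primer its name.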

Generation of Green Fluorescent Protein (GFP) Balancer Stocks and DNA Isolation from Homozygous Mutant Embryos-In FlyBase, the locus occupied by pgant35A (l(2)35Aa) is documented as having three potential mutant alleles generated through ethylmethanesulfonate mutagenesis (16): Bloomington Stock b 1 l(2)35Aa 3 Adh n4/CyO (hereafter referred to as 3775); b l(2)35Aa SF32 Adh n2 pr cn/In(2LR)O, Cy dp lvI pr cn 2 (hereafter referred to as SF32); and l(2)35Aa HG8 Adh nc1 cn bw/In(2LR)O, Cy dp lvI pr cn 2 (hereafter referred to as HG8). Mutant stocks were obtained from the Bloomington Stock Center and were also the kind gifts of M. Ashburner and C. Flores. Mutant stocks were crossed to w; In(2LR)noc 4L Sco rv9R, b 1/CyO, P{w+mc=ActGFP} JMR1, which contains a green fluorescent protein (GFP) marker on the CyO chromosome 2 balancer (18), to generate GFP balancer/mutant stocks. Embryos and larvae from these three balanced mutant stocks were screened on a fluorescent microscope, and those displaying no GFP fluorescence (and therefore homozygous for the mutant chromosomes) were isolated. DNA was extracted using DNAzol Reagent (Invitrogen) according to the manufacturer's instructions. Primers within the genomic region flanking the pgant35A gene (FlyH-408-S (dTAGCATCTTCGGTGGCATC) and FlyH-2691-AS (dATATGCAGACATAACATATTCGTACAC)) were used to amplify a 2.3-kb fragment containing the pgant35A gene from the isolated genomic DNA. PCR products obtained from the DNA of each homozygous mutant were directly sequenced on both strands (ACGT, Inc.).
PCR and Northern Blot Analysis-cDNA panels from various D. melanogaster tissues and stages of development were obtained from OriGene Technologies, Inc. Primers from within the coding region of pgant35A, FlyH-1330-S (dCCTCATCAAGTCGGAGAACG) and FlyH-2175-AS (dAGGCACAGCAACTTGTCCAG), were used to specifically amplify an 845-bp product under the following conditions: 35 cycles of 95°C for 1 min, 67°C for 1 min, and 72°C for 1 min followed by one cycle of 72°C for 10 min. rp49 control PCR amplifications were performed according to the manufacturer's instructions. Reaction products were electrophoresed in a 1% TAE-agarose gel and photographed on a Bio-Rad Fluor-S TM MultiImager.
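As a quick consistency check on the reported product sizes, the sketch below compares the numbers embedded in the primer names with the stated amplicon lengths. It assumes, without confirmation from the text, that the numeric field in each primer name is a nucleotide coordinate and that the product length is approximately the difference between the two coordinates; the function names are hypothetical.

```python
def coordinate(primer_name):
    """Extract the numeric field from a name such as 'FlyH-1330-S'.

    Assumption: this number is a sequence coordinate; the source does not say so.
    """
    return int(primer_name.split("-")[1])


def implied_span(sense_name, antisense_name):
    """Difference between the antisense and sense primer coordinates."""
    return coordinate(antisense_name) - coordinate(sense_name)


# (sense primer, antisense primer, product size reported in the text, in bp)
PRIMER_PAIRS = {
    "coding-region PCR": ("FlyH-1330-S", "FlyH-2175-AS", 845),
    "genomic flanking PCR": ("FlyH-408-S", "FlyH-2691-AS", 2300),
}

if __name__ == "__main__":
    for label, (sense, antisense, reported) in PRIMER_PAIRS.items():
        span = implied_span(sense, antisense)
        print(f"{label}: implied {span} bp, reported about {reported} bp")
```

Under this reading, the coding-region pair implies the 845-bp product exactly and the flanking pair implies a span close to the reported 2.3 kb.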
Poly(A)+ RNA from Drosophila embryos, larvae, and adults (CLONTECH) was electrophoresed in a 0.7% formaldehyde agarose gel and transferred to Hybond-NX membranes (Amersham Biosciences) according to Sambrook et al. (19). A unique 520-bp segment of the pgant35A cDNA region was amplified from the FlyH-2a plasmid using the primers FlyH-IS+ (dATAGGTACCAAGCTTATCTAATTTATTCCGATCATCATGAAAGTGAC) and FlyH-IS- (dATAGAGCTCGAGTCCGCTTCCGTGGTCTCCTG). The PCR product was cut with KpnI/SacI and cloned into the KpnI/SacI sites of pBluescript KS+ to generate the vector pBSFlyH-IS. pBSFlyH-IS was digested with HindIII, and the antisense strand of the pgant35A insert was radiolabeled with [32P]UTP using the MAXIscript TM In Vitro Transcription Kit (T7 RNA polymerase) (Ambion). Hybridizations with the riboprobe were carried out in 5× SSPE plus 50% formamide at 68°C with a final wash in 2× SSC plus 0.1% SDS at 65°C for 20 min. The northern blot was stripped according to the manufacturer's instructions. A 600-bp EcoRI/HindIII fragment of the rp49 gene (20) was labeled with [32P]dCTP using the Random Primers DNA Labeling System (Invitrogen) and used as a probe on the stripped northern blot to control for loading variations and RNA integrity in each lane. rp49 hybridizations were performed in 5× SSPE plus 50% formamide at 42°C with two final washes in 2× SSC plus 0.1% SDS at 65°C for 20 min and in 0.2× SSC plus 0.1% SDS at 65°C for 30 min.
Generation of Secretion Constructs for Wild-type pgant35A and SF32 Mutant-An MluI site was introduced into a fragment of the FlyH-2a cDNA by PCR amplification using the primers FlyH-305S (dTCTACGCGTACAGCCTGCGC) and FlyH-BglII-AS (dGAAGATCTTCCGCTTCCGTGGTCTCCTG). This amplified product was digested with MluI and BglII and cloned into the vector pIMKF4 to create the vector pF4-FlyH-349. Sequencing was performed to verify that no PCR-induced mutations had been sustained in the cloned product. A 1.8-kb BstEII/BamHI fragment from the FlyH-2a cDNA was then cloned into the BstEII/BglII sites of pF4-FlyH-349 to generate the mammalian expression vector pF4-f35A. pF4-f35A is an SV40-based expression vector that generates a fusion protein containing the following, in order: an insulin secretion signal, a metal binding site, a heart muscle kinase site, a FLAG TM epitope tag, and the truncated pgant35A gene.
The SF32 mutant allele of pgant35A was cloned into the pIMKF4 expression vector by using the genomic amplification product obtained from the homozygous mutant embryos from the GFP balancer/SF32 stock described above. Because this gene contains no introns, we were able to clone directly from the genomic PCR product. The 2.3-kb PCR fragment was digested with BstEII to generate a cloning fragment with a 5′ BstEII end and a blunt 3′ end; this fragment was then cloned into the BstEII/NotI(blunt) sites of pF4-FlyH-349 to generate the SV40-based recombinant expression vector, pF4-SF32.

RESULTS
Degenerate PCR probes, based on amino acid consensus sequences derived from putative D. melanogaster ppGaNTases found in the data base, were used to isolate the pgant35A cDNA from a D. melanogaster Canton S embryonic cDNA library (Stratagene). As shown in Fig. 1, conceptual translation of the pgant35A cDNA reveals a type II membrane protein consisting of an 8-amino acid N-terminal cytoplasmic region, a 23-amino acid hydrophobic/transmembrane region, an 82-amino acid stem region, and a 519-amino acid putative catalytic region. The nucleotide sequence within the coding region of this cDNA clone differs from that of the predicted cDNA sequence found in FlyBase at 13 positions; one of these differences results in a threonine at amino acid position 72 (within the stem region) in place of the isoleucine found in the FlyBase predicted protein. Table I summarizes the degree of amino acid similarity (within the conserved region) between the conceptually translated PGANT35A protein from Fig. 1 and each of the functionally characterized mammalian isoforms. PGANT35A displays the greatest degree of similarity within this region to ppGaNTase-hT2 and the lowest degree of similarity to ppGaNTase-rT7, the previously characterized glycopeptide transferase. The PGANT35A amino acid sequence was also used to search the public human genome data base (NCBI) for putative orthologues. One putative human ppGaNTase (FLJ21634) shares the greatest degree of amino acid similarity to PGANT35A (71%) of all isoforms identified (Table I). Alignments of the putative catalytic C-terminal regions for the isoforms mentioned above are shown in Fig. 2. The shaded boxes illustrate the regions of conservation between isoforms and across species.
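The domain lengths quoted above can be turned into residue ranges with a few lines of arithmetic. The sketch below assumes that the four regions are contiguous and together span the whole protein, which the text implies but does not state; the resulting total of 632 residues is therefore derived rather than reported, and the names used are illustrative.

```python
# Region lengths (in amino acids) as given in the text, N terminus to C terminus.
REGIONS = [
    ("cytoplasmic", 8),
    ("transmembrane", 23),
    ("stem", 82),
    ("catalytic", 519),
]


def region_bounds(regions=REGIONS):
    """Map each region name to its (first, last) residue, 1-based and inclusive.

    Assumption: regions are contiguous with no gaps or overlaps.
    """
    bounds = {}
    start = 1
    for name, length in regions:
        bounds[name] = (start, start + length - 1)
        start += length
    return bounds


def region_of(position, regions=REGIONS):
    """Return the name of the region containing a 1-based residue position."""
    for name, (first, last) in region_bounds(regions).items():
        if first <= position <= last:
            return name
    raise ValueError(f"position {position} lies outside the protein")


if __name__ == "__main__":
    for name, (first, last) in region_bounds().items():
        print(f"{name}: residues {first}-{last}")
    print("residue 72 lies in the", region_of(72), "region")
```

With these boundaries, position 72 falls in the stem region, in agreement with the placement of the threonine/isoleucine difference described above.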
The truncated coding region of pgant35A (beginning at amino acid position 33) was cloned into a mammalian expression vector and transfected into COS7 cells as described previously (6). The expressed products from cells transfected with either pgant35A, ppGaNTase-T5, or ppGaNTase-T9 expression vectors were harvested from the culture media. Equal relative amounts of the recombinant proteins (as determined by SDS-PAGE) were used in in vitro glycosylation reactions. Reactions of recombinant PGANT35A against a panel of peptides showed substantial transferase activity only with EA2; activity 2-3-fold above background was seen with the IgAh peptide and MUC5AC-13 glycopeptide (Table II). The other peptides tested showed values less than 2-fold above background. By comparison, recombinant ppGaNTase-T5 showed much less transfer of GalNAc to the EA2 peptide than recombinant PGANT35A. The recombinant glycopeptide transferase, ppGaNTase-T9, showed substantial activity against the MUC5AC-3 and MUC5AC-3/13 glycopeptide substrates, on which PGANT35A was not active. However, both ppGaNTase-T9 and PGANT35A showed comparable activities (2-3-fold above background) on the MUC5AC-13 glycopeptide substrate.
Reactions to determine Km values for the acceptor peptide EA2 and the donor substrate UDP-GalNAc were then performed. The Km value for the EA2 acceptor peptide with recombinant PGANT35A (0.595 mM) is comparable with those determined previously for the mammalian ppGaNTases (0.042-1.02 mM). Likewise, the Km value determined for UDP-GalNAc with PGANT35A (0.02 mM) is similar to values determined for the mammalian isoforms (0.022-0.051 mM).
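To give a sense of what these Km values mean in practice, the following sketch evaluates the fraction of maximal velocity at a given substrate concentration. It assumes simple single-substrate Michaelis-Menten behavior with the second substrate held at saturation; the text does not describe the kinetic model or fitting procedure used, so this is an illustration only, and the function and constant names are hypothetical.

```python
def fractional_velocity(substrate_mM, km_mM):
    """Return v / Vmax = [S] / (Km + [S]) under Michaelis-Menten kinetics.

    Assumption: one varied substrate, the other saturating, no inhibition.
    """
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


# Km values reported in the text for recombinant PGANT35A (mM).
KM_EA2_MM = 0.595
KM_UDP_GALNAC_MM = 0.02

if __name__ == "__main__":
    for conc in (0.1, 0.595, 2.0):
        frac = fractional_velocity(conc, KM_EA2_MM)
        print(f"EA2 at {conc} mM: {frac:.2f} of Vmax")
```

By definition, the velocity is half-maximal when the substrate concentration equals Km, so under this model an EA2 concentration of 0.595 mM would give half of Vmax.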
TABLE I. Amino acid similarity (%) within the conserved region between PGANT35A and the mammalian ppGaNTase isoforms
            35A   mT1   hT2   mT3   mT4   rT5   hT6   rT7   rT9   FLJ21634
35A         NA    64    65    62    62    60    61    57    58    71
mT1         64    NA    70    70    70    67    67    62    67    71
hT2         65    70    NA    63    68    63    64    60    64    69
mT3         62    70    63    NA    69    64    85    57    62    65
mT4         62    70    68    69    NA    65    66    57    66    68
rT5         60    67    63    64    65    NA    60    57    66    67
hT6         61    67    64    85    66    60    NA    58    58    61
rT7         57    62    60    57    57    57    58    NA    63    58
rT9         58    67    64    62    66    66    58    63    NA    67
FLJ21634    71    71    69    65    68    67    61    58    67    NA
35A, PGANT35A; FLJ21634, predicted protein from public human genome data base (NCBI); m, mouse; r, rat; h, human.

FIG. 2. Amino acid sequence alignments of PGANT35A and mammalian ppGaNTases. Amino acid sequences were aligned within the putative catalytic domain, beginning with the consensus sequence FNXXXSD and extending to the C terminus. The shaded blocks indicate regions of similarity or identity. A consensus sequence is given below the alignments for positions that are greater than 50% conserved among the isoforms shown.

Analysis of Mutants-The three independent strains that represent candidate mutants in the pgant35A gene were crossed to Bloomington Stock 4533, which contains a GFP marker on the CyO chromosome 2 balancer (18). Because the candidate mutations also reside on chromosome 2, these crosses generated GFP balancer/mutant stocks. The GFP balancer chromosomes allow one to then visualize the chromosomal composition of D. melanogaster embryos and larvae by looking for GFP fluorescence; only embryos or larvae having at least one copy of the GFP balancer chromosome will appear green under a fluorescent microscope. Each of the three GFP balancer/mutant stocks produced three types of progeny: one-quarter were GFP balancer/GFP balancer, one-half were GFP balancer/mutant, and one-quarter were mutant/mutant. The homozygous mutant embryos and larvae were therefore the only ones that did not exhibit GFP fluorescence. The GFP-negative embryos and larvae from these mutant stocks were harvested under a fluorescent microscope, and genomic DNA was extracted. The pgant35A coding region was then PCR-amplified and sequenced to determine whether a nucleotide change had occurred in these mutants. The results are summarized in Fig. 1 (boxed residues). The HG8 mutant contained a C to T transition at nucleotide 265, resulting in a glutamine to stop codon change at amino acid position 89; this mutation truncates the protein within the putative stem region, thereby eliminating the entire conserved region. The 3775 mutant contained a premature stop codon at amino acid position 195 (the result of a T to A transversion at nucleotide position 584), eliminating greater than half of the putative catalytic region and many conserved regions known to be important for enzymatic activity based on prior mutagenesis studies (29). The SF32 mutant was found to contain a C to T transition at nucleotide 679, thereby changing an arginine to tryptophan at amino acid position 227. Although this arginine is a highly conserved residue, prior mutagenesis to determine its effect on enzymatic activity had not been performed. We therefore cloned this mutant into the SV40-based expression vector and expressed the recombinant protein in COS7 cells. SDS-PAGE was performed on labeled recombinant wild type PGANT35A and SF32 mutant harvested from the media to quantitate relative levels and to verify the production of the mutant protein (Fig. 3). An appropriately sized band of ~78 kDa was seen in both the PGANT35A and SF32 lanes. In vitro assays using equivalent amounts of recombinant wild type PGANT35A and mutant SF32 protein were performed to determine the effect of this mutation. As shown in Table II, the SF32 mutant displays activity barely detectable above background. Using the EA2 peptide as an acceptor substrate, the SF32 mutant shows a substantial reduction in activity from that of the wild type enzyme, demonstrating that this arginine to tryptophan mutation has a severe effect on the enzymatic activity of this protein in vitro. Km values were not measurable for either substrate with the SF32 mutant protein.
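The nucleotide and amino acid positions reported for the three alleles can be cross-checked with simple codon arithmetic. The sketch below assumes that nucleotide 1 is the first base of the initiator codon, so that nucleotide n falls in codon (n - 1) // 3 + 1, and it lists which wild-type codons are compatible with each reported change under the standard genetic code. The original amino acid at position 195 is not stated in the text and is therefore left unconstrained; all names in the code are illustrative.

```python
# Standard genetic code in the conventional T, C, A, G ordering ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_number(nt_pos):
    """1-based codon containing a 1-based coding-sequence nucleotide."""
    return (nt_pos - 1) // 3 + 1


def offset_in_codon(nt_pos):
    """0-based position of the nucleotide within its codon (0, 1, or 2)."""
    return (nt_pos - 1) % 3


def candidate_codons(nt, ref, alt, aa_to, aa_from=None):
    """List wild-type codons compatible with a reported point mutation."""
    offset = offset_in_codon(nt)
    hits = []
    for codon, amino_acid in CODON_TABLE.items():
        if codon[offset] != ref:
            continue
        if aa_from is not None and amino_acid != aa_from:
            continue
        mutated = codon[:offset] + alt + codon[offset + 1:]
        if CODON_TABLE[mutated] == aa_to:
            hits.append(codon)
    return sorted(hits)


# Changes as reported in the text; aa_from is None where the text is silent.
MUTATIONS = {
    "HG8": dict(nt=265, ref="C", alt="T", aa_from="Q", aa_to="*"),
    "3775": dict(nt=584, ref="T", alt="A", aa_from=None, aa_to="*"),
    "SF32": dict(nt=679, ref="C", alt="T", aa_from="R", aa_to="W"),
}

if __name__ == "__main__":
    for allele, change in MUTATIONS.items():
        print(allele, "codon", codon_number(change["nt"]),
              "compatible wild-type codons:", candidate_codons(**change))
```

Under the stated numbering assumption, nucleotides 265, 584, and 679 map to codons 89, 195, and 227, matching the amino acid positions given for HG8, 3775, and SF32, and each reported base change is compatible with the reported amino acid outcome.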
Expression of pgant35A-To address the temporal expression pattern of pgant35A, we used semiquantitative PCR amplification of panels containing cDNA from staged D. melanogaster embryos, larvae, pupae, and adult males and females. Primers were designed from within the coding region of pgant35A and used to amplify an 845-bp product. As shown in Fig. 4A, a PCR product of the appropriate size can be seen throughout the embryonic stages, gradually declining as embryogenesis proceeds. A PCR amplification product is seen in the first, second, and third instar larval stages, increasing as the larval stages progress. Expression is also seen in the pupal lane and to varying degrees in both sexes of adult flies. No product is seen in the negative control, which lacked cDNA (Fig. 4A, lane 14). Fig. 4B shows amplification of an rp49 control to demonstrate the presence of cDNA in each well of the panel. Again, no product is seen in the negative control that lacked cDNA (Fig. 4B, lane 14).
Because PCR amplification would detect both sense and antisense transcripts within the region of the designated primers, we confirmed the PCR results by probing a northern blot with a strand-specific probe. Poly(A)+ RNA from embryonic, larval, and adult stages was hybridized with an antisense riboprobe from the coding region of pgant35A. A 2.3-kb band, corresponding to the sense transcript generated from the pgant35A gene, can be seen intensely in the embryonic and adult stages and to a lesser degree in the larval stage (Fig. 4C). rp49 expression on the same northern blot demonstrates RNA loading and integrity (Fig. 4C, lower panel).

DISCUSSION
We report that the activity of a member of the ppGaNTase family is essential for development and viability in a eukaryotic organism. Our studies demonstrate that not only is the product of the pgant35A open reading frame a functional ppGaNTase in vitro, but each of the three independent mutant alleles, which cause arrest during early pupal development (16), contains a mutation within the coding region of pgant35A. Two of the mutants contain premature stop codons that eliminate all or most of the catalytic region known to be necessary for enzymatic activity. The third mutant contains an arginine to tryptophan change at a highly conserved position within the catalytic domain. Recombinant expression of this mutant demonstrates that virtually all enzymatic activity has been lost; incorporation levels are barely above background values and show a substantial reduction from those of the wild type enzyme. Previous studies had rescued the lethal phenotype of these mutants with a genomic fragment containing the pgant35A gene (17). However, the possibility remained that other open reading frames within this genomic region may have been responsible for the rescue. The molecular characterization of the independent mutants reported here, along with the previous rescue studies, conclusively demonstrates that mutations within the coding region of pgant35A are responsible for the early pupal lethality observed in these strains.
The pgant35A gene shows striking homology to other functionally characterized members of the ppGaNTase family. Homology searches against the human sequence databases further reveal a putative ppGaNTase with even greater amino acid similarity to PGANT35A (71% within the conserved domain). It will be interesting to functionally characterize this putative orthologue of pgant35A and determine whether deficiencies in the mouse homologue result in lethality as well.
PGANT35A displays transferase activity in vitro, transferring GalNAc from UDP-GalNAc onto the hydroxyl group of either serine or threonine residues in selected peptide substrates. Among the panel of peptide substrates tested, PGANT35A showed substantial activity on EA2. Kinetic measurements indicate similar affinities of this enzyme for donor and acceptor substrates relative to other mammalian ppGaNTases characterized previously. PGANT35A also showed activity greater than 2-fold above background using the glycopeptide substrate MUC5AC-13, indicating its ability to transfer GalNAc to both peptide and glycopeptide substrates, albeit at different efficiencies. The previously characterized glycopeptide transferase, ppGaNTase-T9, preferred MUC5AC-3 and MUC5AC-3/13 over the MUC5AC-13 glycopeptide, whereas PGANT35A did not transfer GalNAc to MUC5AC-3 or MUC5AC-3/13. These results highlight different preferences among isoforms for specific glycopeptide substrates as well as peptide substrates. PCR panels of cDNA from different stages of D. melanogaster development as well as northern blots demonstrate that the pgant35A gene is expressed during embryonic, larval, pupal, and adult stages. The embryonic signal seen may represent a maternal RNA contribution rather than zygotic expression. However, expression is seen to increase during larval development and continue through pupal and adult stages. We now have evidence that embryos lacking the maternal pgant35A RNA can be rescued by zygotic expression, indicating that the maternal RNA may not be strictly required for passage through the early stages of development (data not shown).
Recent studies in D. melanogaster have shed some light on the biological effects of other types of protein glycosylation. Studies on the Drosophila fringe gene have revealed that it encodes a glycosyltransferase that modulates the ability of Notch to signal events affecting cell fate decisions, clearly demonstrating its importance during development (30). Other genes known to be crucial for development in D. melanogaster, such as sugarless and sulfateless (31,32), have subsequently been shown to be involved in the synthesis of heparan sulfate glycosaminoglycans. From these studies, it is clear that Drosophila is an experimental system in which the biological consequences of protein modification can be addressed. Through future studies using this model organism, we hope to characterize the ppGaNTase enzyme family, define in vivo substrates, and begin to dissect the role of O-linked glycosylation in eukaryotic development.