Identification of a sialidase encoded in the human major histocompatibility complex.

Mammalian sialidases are important in modulating the sialic acid content of cell-surface and intracellular glycoproteins. However, the full extent of this enzyme family and the physical and biochemical properties of its individual members are unclear. We have identified a novel gene, G9, in the human major histocompatibility complex (MHC), that encodes a 415-amino acid protein sharing 21-28% sequence identity with the bacterial sialidases and containing three copies of the Asp-block motif characteristic of these enzymes. The level of sequence identity between human G9 and a cytosolic sialidase identified in rat and hamster (28-29%) is much less than would be expected for analogous proteins in these species, suggesting that G9 is distinct from the cytosolic enzyme. Expression of G9 in insect cells has confirmed that it encodes a sialidase, which shows optimal activity at pH 4.6, but appears to have limited substrate specificity. The G9 protein carries an N-terminal signal sequence and immunofluorescence staining of COS7 cells expressing recombinant G9 shows localization of this sialidase exclusively to the endoplasmic reticulum. The location of the G9 gene, within the human MHC, corresponds to that of the murine Neu-1 locus, suggesting that these are analogous genes. One of the functions attributed to Neu-1 is the up-regulation of sialidase activity during T cell activation.

Sialidase (neuraminidase) enzymes catalyze the removal of sialic (N-acetylneuraminic) acid moieties from glycoproteins and glycolipids. In microorganisms, such as viruses, parasites, and bacteria, sialidases are thought to be important in nutrition and pathogenesis (1), while in mammalian species these enzymes are involved in modulating cellular events such as activation, differentiation, maturation, and growth, which are all accompanied by changes in sialic acid levels (2). The existence of many different types of sialic acid linkage is consistent with a requirement for multiple sialidases, and the sialidase activities detected in mammals do differ in their relative levels between cell types as well as in subcellular localization and biochemical properties (e.g. substrate specificity). A cytosolic form and two lysosomal forms of sialidase have been reported, as well as activities associated with microsomes, the plasma membrane, and the Golgi apparatus (3-7). However, variation in the procedures used to purify and characterize sialidases from mammalian cells and tissues has led to confusion in the literature. This can only be resolved by the cloning and sequencing of the individual sialidases at the DNA level, which will allow functional characterization of recombinant proteins, and by the subcellular localization of these enzymes using specific antibodies. The only mammalian sialidase to have been cloned and sequenced to date is the cytosolic enzyme, which has been isolated from rat skeletal muscle (8) and from the Chinese hamster ovary cell line (9).
A number of inherited human diseases are associated with deficiencies in sialidase activity (10). The genetic basis of these disorders, which are characterized by developmental and neurological abnormalities, is poorly understood (11,12). However, there has been one report of a combined deficiency in sialidase and cytochrome P450 steroid 21-hydroxylase (P450c21) activities (13). The P450c21B gene is located in the class III region of the human major histocompatibility complex (MHC) in chromosome band 6p21.3 (14,15). In the patient described, the P450c21B gene is not deleted, but there is the possibility of linked mutations in the P450c21B gene and a sialidase gene in the same region of chromosome 6. The presence of a sialidase gene in this region was indicated by the localization of the murine Neu-1 locus, which encodes a liver sialidase, between the C2 and BAT5 genes in the S region of the histocompatibility-2 complex (the equivalent of the human MHC class III region) by a combination of linkage studies in backcross mice and the examination of Neu-1 phenotypes in congenic strains (16-18).
The human MHC spans ~4000 kb of DNA and can be divided into three regions (14). The class I and class II regions contain genes that encode the highly polymorphic histocompatibility antigens required for the presentation of antigen to T cells. The 1100 kb of DNA separating the class I and class II regions, which is termed the class III region, is densely populated and contains at least 63 genes, most of which are unrelated to each other and to the histocompatibility antigens (14,15). These include genes encoding the complement components C2, C4, and factor B, three members of the 70-kDa heat shock protein family, and the cytokines tumor necrosis factor, lymphotoxin-α, and lymphotoxin-β. The organization of the MHC class III region is highly conserved between man and mouse and, in those cases that have been tested to date, the genes found in the human class III region also map to the mouse histocompatibility-2S region in the same relative order (17,19,20).
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) X78687.
‡ Supported by a Medical Research Council Training Research Fellowship.
§ Supported by the Spanish Government (Ministry of Education and Science (F.P.U.)).
G9, one of the newly identified genes in the human MHC class III region (21), lies within the same segment of DNA as the Neu-1 gene in the murine histocompatibility-2S region.
Here we describe the characterization of the G9 gene and provide evidence that it encodes a novel human sialidase.

EXPERIMENTAL PROCEDURES
Isolation of cDNA Clones and Nucleotide Sequence Analysis-A 1.6-kb KpnI fragment (probe I), isolated from the MHC class III-linked cosmid J8b (21), was labeled with 32P and used to probe a U937 (monocyte) cDNA library constructed in λgt10. Six clones were identified that corresponded to a single gene (defined as G9) and the ~1.8-kb insert of the largest, 3, was subcloned into PvuII-cut pATX (designated pG9-3). The insert from pG9-3 was digested with HinfI, MspI, or StyI/Asp718 and the DNA fragments generated were ligated into SmaI-cut M13mp10 and sequenced. The sequence data were assembled into contigs using the SAP program from the Staden package (22). Genomic DNA fragments were cloned into SmaI-cut M13mp10 or HincII-cut pGEM-3Zf(+) vectors (Promega). Single-stranded DNA was recovered from clones in pGEM-3Zf(+) using the helper phage M13K07 in the presence of kanamycin. All sequencing was carried out by the dideoxy chain termination method using the Sequenase system (U. S. Biochemical Corp.) with the M13 universal primer (5′-GTAAAACGACGGCCAGT-3′) or sequence-specific oligonucleotide primers.
The predicted G9 amino acid sequence was analyzed using the programs SigCleave and PepWindow, from the Genetics Computer Group (GCG) package, to identify potential signal cleavage sites (using the von Heijne algorithm (23)) and to generate a hydropathy plot (using the Kyte-Doolittle scale (24)), respectively. The Macintosh program TopPred, which uses the GES hydrophobicity scale (25), was also used to predict transmembrane regions. The derived amino acid sequence of G9 was compared with the National Biomedical Research Foundation and SwissProt protein data bases using the FASTA program from the GCG package (26). The significance of protein sequence similarity detected by data base searching was determined using the ALIGN program (27) and multiple alignment of related proteins was carried out using the AMPS program (28). A bias of 6 added to each term of the mutation data matrix and a break penalty of 5 were used. For each pairwise comparison, 100 random runs were performed to allow calculation of mean random scores.
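The way a score in S.D. units can be derived from the randomized runs may be sketched as follows. This is a minimal illustration only: it assumes the score is the real alignment score minus the mean of the randomized-run scores, divided by their standard deviation, and it uses the sample standard deviation. The function name and the numbers are hypothetical and are not taken from the ALIGN program or from this study.

```python
import statistics


def score_in_sd_units(real_score, random_scores):
    """Express an alignment score as standard deviations above the random mean.

    real_score    -- score of the genuine pairwise alignment
    random_scores -- scores from alignments of randomized sequences
    """
    if len(random_scores) < 2:
        raise ValueError("need at least two randomized scores")
    mean_random = statistics.mean(random_scores)
    # Sample standard deviation; the choice of estimator is an assumption.
    sd_random = statistics.stdev(random_scores)
    if sd_random == 0:
        raise ValueError("randomized scores have zero spread")
    return (real_score - mean_random) / sd_random


# Hypothetical numbers for illustration only (not data from this study).
example = score_in_sd_units(20.0, [10.0, 12.0, 14.0])
print(example)
```

In practice the list of random scores would hold one value per random run (100 per comparison in the procedure described above).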
Southern and Northern Blot Analysis-Cosmid J8b DNA (1 μg) and genomic DNA (5 μg), from the human leukocyte antigen homozygous cell line ICE5 (used to construct the library from which cosmid J8b was isolated), were digested with restriction enzymes under the conditions recommended by the supplier. The digested DNA was fractionated on 0.8% (w/v) agarose gels and transferred to nitrocellulose membranes (29). Total RNA (15 μg) from cell lines representing monocytes, macrophages, hepatocytes, T lymphocytes, B lymphocytes, and epithelial cells was fractionated on 1% (w/v) agarose, 1.9% (v/v) formaldehyde denaturing gels and transferred to nitrocellulose.
Blots were hybridized with the pG9-3 cDNA insert, labeled with 32P by random hexanucleotide priming (30), and then washed as described by Hsieh and Campbell (31) and autoradiographed between two intensifying screens at -70°C for 1-5 days.
Transcription Mapping by RNase Protection-RNA probes of high specific activity were synthesized in vitro using the Riboprobe Gemini System II (Promega). A 700-bp BamHI/HindIII genomic fragment containing the 5′ end of the G9 gene was subcloned into HincII-cut pGEM-3Zf(+) vector (Promega). The DNA (1 μg) was linearized by digestion with SmaI and a 32P-labeled antisense RNA probe was transcribed using bacteriophage SP6 polymerase under the conditions specified by the supplier (Promega). Transcription mapping was carried out as described in Hsieh and Campbell (31) using 5 × 10^5 cpm of RNA probe and 10 μg of total RNA in each reaction. The 32P-labeled RNA duplexes were analyzed by electrophoresis in 6% (w/v) polyacrylamide, 7 M urea gels.
Preparation and Purification of Antisera-A 737-bp BstEII/EcoO109 fragment encoding amino acids 60-302 of the G9 protein was isolated from the pG9-3 cDNA and cloned into EcoRI-cut pGEX-2T vector (32), in-frame with the coding sequence for the 26-kDa glutathione S-transferase from Schistosoma japonicum (Sj26). The Sj26-G9 fusion protein was expressed in Escherichia coli and purified on glutathione-agarose beads as described previously (33). Fusion protein was eluted from the beads with 5 mM reduced glutathione in 50 mM Tris-HCl (pH 8), 0.03% (w/v) SDS or cleaved on the beads by digestion with thrombin in 150 mM NaCl, 2.5 mM CaCl2.
A rabbit was immunized with the Sj26-G9 fusion protein by multiple intradermal injections containing 100-200 μg of antigen in 50% (v/v) Freund's complete adjuvant (first and second immunizations) or Freund's incomplete adjuvant (third and fourth immunizations). Sera were recovered from blood taken prior to immunization (preimmune sera) and 20 days after the fourth immunization (anti-G9 sera). Antisera were passed twice through a column of Sj26 coupled to CNBr-activated Sepharose-4B (1.5 g of Sj26/ml resin) to remove anti-Sj26 antibodies.
Antibodies to G9 were affinity purified, for use in immunofluorescence staining, using recombinant G9 expressed in Spodoptera frugiperda (Sf)21 cells (see below). Antisera were first passed through a column of proteins from Sf21 cell lysates coupled to CNBr-activated Sepharose-4B (~3.5 g of total cell protein/ml of resin) and then through an identical column prepared from Sf21 cell lysates 4 days after infection with recombinant G9-baculovirus. G9-specific antibodies were eluted from the latter column with 3 M MgCl2 (pH 6.8) and concentrated and desalted using a Centricon-30 spin column (Amicon). Cross-reaction of the purified antibodies with recombinant G9, expressed in Sf21 and COS7 cells, was confirmed by Western blot analysis.
Western Blot Analysis-Cultured cells in phosphate-buffered saline (PBS) or other protein samples were mixed with an equal volume of 2× sample loading buffer (6% (w/v) SDS, 20% (v/v) glycerol, 1.4 mM β-mercaptoethanol, 0.12 M Tris-HCl (pH 6.8), 0.025% (v/v) bromphenol blue) and boiled for 5 min. Samples were electrophoresed on SDS-10% (w/v) polyacrylamide gels and proteins were transferred onto nitrocellulose by electroblotting. Blots were incubated for 1 h in PBS, 0.1% (v/v) Tween 20 (PBST) containing 10% (w/v) dried milk. The primary antibody (either antisera or a monoclonal antibody) was then added at an appropriate dilution in PBST, 1% (w/v) dried milk and blots were incubated for a further 1 h. After washing in PBST, blots were incubated for 1 h with horseradish peroxidase-conjugated goat anti-rabbit IgG (Amersham) or horseradish peroxidase-conjugated sheep anti-mouse IgG (Sigma) diluted in PBST, 1% (w/v) dried milk. The blots were then washed in PBST prior to addition of ECL development reagent (Amersham) and detection of bound secondary antibody by autoradiography. All incubations were carried out at room temperature with constant agitation.
Preparation of Recombinant Baculovirus and Expression in Insect Cells-Wild-type Autographa californica nuclear polyhedrosis virus and recombinant viruses were propagated in Sf21 cells grown in TC100 medium, 10% (v/v) fetal calf serum at 28°C (34).
Nucleotides 33-1330 of the G9 mRNA were amplified by polymerase chain reaction from a cDNA template generated by reverse transcription of RNA derived from the cell line HepG2 (hepatocytes) using the oligonucleotide primers OLG9Bam (5′-CTGTGGATCCTAGCTGCCAGG-3′, sense) and OLG92Sma (5′-ATACCCGGGTGGCAGTGGCA-3′, antisense). Products were digested with BamHI and SmaI, cloned into SmaI/BamHI-cut pBluescript, and sequenced. A fragment encoding the full-length G9 protein was excised from a construct containing no deleterious polymerase chain reaction errors (G9-BS-4/7-54) by digestion with SmaI and PvuII and cloned into SmaI-cut baculovirus transfer vector, pAcCL29.1 (35). 2 μg of the resultant plasmid DNA and 20 μg of Bsu36I-cut wild-type virus DNA, BacPAK6 (Clontech), were combined with 20 μg of Lipofectin and co-transfected into ~1.5 × 10^6 Sf21 cells growing as a monolayer (36). The transfection supernatant was recovered after 72 h and recombinant virus (G9-bac) was purified by three rounds of plaque purification (37). Wild-type virus was prepared in parallel for use as a control. Large stocks of virus were prepared by infecting ~10^8 Sf21 cells at a multiplicity of infection of ~2 plaque forming units (pfu) per cell. The supernatant was collected 72 h later and titrated by plaque assay. In subsequent experiments cells were infected for 2 h with virus then returned to fresh TC100 medium for the appropriate time. For metabolic labeling experiments cells were infected at a multiplicity of infection of 10 pfu/cell for 2 h then, at an appropriate time point after infection, transferred to methionine- and cysteine-free TC100 without fetal calf serum containing 25 mCi of Trans35S-label (ICN) for 3 h prior to harvesting.
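The relationship between cell number, multiplicity of infection, and virus titer can be sketched as below. This is an illustrative calculation only: the stock titer used in the example is hypothetical (no titer is reported here), and the estimate of the uninfected fraction assumes that virus particles are distributed over cells according to a Poisson distribution.

```python
import math


def inoculum_volume_ml(n_cells, moi_pfu_per_cell, titer_pfu_per_ml):
    """Volume of virus stock needed to infect n_cells at the given MOI."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return n_cells * moi_pfu_per_cell / titer_pfu_per_ml


def fraction_uninfected(moi_pfu_per_cell):
    """Poisson estimate of the fraction of cells receiving no infectious virus."""
    return math.exp(-moi_pfu_per_cell)


# ~10^8 cells at ~2 pfu/cell, with a hypothetical stock titer of 1e8 pfu/ml.
volume = inoculum_volume_ml(1e8, 2, 1e8)
print(f"inoculum volume: {volume:.1f} ml")
print(f"uninfected at 2 pfu/cell:  {fraction_uninfected(2):.3f}")
print(f"uninfected at 10 pfu/cell: {fraction_uninfected(10):.6f}")
```

Under this assumption, raising the multiplicity of infection from 2 to 10 pfu/cell makes the uninfected fraction negligible, which is one common reason for using a higher multiplicity in labeling experiments.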
To determine the presence of N-linked sugars on expressed protein, Sf21 cells were infected at a multiplicity of infection of 10 pfu/cell for 2 h and then grown in fresh TC100, 10% (v/v) fetal calf serum, to which tunicamycin was added at a final concentration of 2 μg/ml at appropriate time points prior to harvesting.
Cell lysates and media were analyzed by electrophoresis on SDS-10% (w/v) polyacrylamide gels as described above. Gels containing 35S-labeled samples were stained with Coomassie Brilliant Blue in 50% (v/v) methanol, 10% (v/v) acetic acid, destained, incubated in Amplify (Amersham) for 30 min and dried, prior to autoradiography at -70°C for 4-48 h. Unlabeled proteins were detected by Western blot analysis with anti-G9 sera as described above.
Sialidase Assays-Baculovirus-infected Sf21 cells (48 h after infection with 2 pfu/cell), or uninfected controls, were harvested, washed with PBS, resuspended in a small volume of PBS, and lysed by gentle sonication on ice. The protein concentration of cell lysates was determined by BCA assay (Pierce). The activity of G9 (in total insect cell lysates) toward the substrate 2′-(4-methylumbelliferyl)-α-D-N-acetylneuraminic acid (4MU-NANA) (Sigma) was determined by fluorimetric assay (38,39). Reactions were set up in duplicate using 200 μg of total protein in 0.14 M NaOAc or 0.1 M phosphate buffer (of the appropriate pH) with 0.7 mM 4MU-NANA in a final volume of 100 μl and incubated at 37°C for 2-10 min. Reactions were stopped by the addition of 1 ml of 0.085 M glycine-carbonate (pH 10) on ice and spun down to remove insoluble material. Fluorescence was measured on a Hoefer TK100 fluorometer with excitation at 365 nm and emission at 450 nm using 4-methylumbelliferone (4-MU) as a standard. Assays without cell lysates were set up in parallel and readings due to nonspecific hydrolysis were subtracted. Similar assays were set up to determine the activity of G9 toward the substrates 2-O-(o-nitrophenyl)-α-D-N-acetylneuraminic acid (ONP-NANA) and 2-O-(p-nitrophenyl)-α-D-N-acetylneuraminic acid (PNP-NANA) (Sigma), at pH 4.6. After incubation at 37°C for 2 min to 1 h the addition of 0.085 M glycine-carbonate (pH 10) to stop the reactions caused any nitrophenol, released by the action of G9, to turn yellow. This was quantified by measuring the A405 of the reaction mixtures with o-nitrophenol or p-nitrophenol as standard.
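The arithmetic behind such an assay (standard curve, subtraction of the no-lysate blank, and normalization to time and protein) can be sketched as follows. This is a minimal illustration under stated assumptions: the standard curve is taken to be linear through the origin, duplicates are averaged, and all readings and amounts in the example are hypothetical rather than values from this study.

```python
def slope_through_origin(standard_amounts, standard_readings):
    """Least-squares slope (reading per unit of product) for a line through zero."""
    numerator = sum(a * r for a, r in zip(standard_amounts, standard_readings))
    denominator = sum(a * a for a in standard_amounts)
    if denominator == 0:
        raise ValueError("standards must include a non-zero amount")
    return numerator / denominator


def specific_activity(sample_readings, blank_readings, slope, minutes, protein):
    """Product released per minute per unit of protein.

    Duplicate readings are averaged and the no-lysate blank is subtracted
    before converting the signal to an amount of product with the slope.
    """
    mean_sample = sum(sample_readings) / len(sample_readings)
    mean_blank = sum(blank_readings) / len(blank_readings)
    product = (mean_sample - mean_blank) / slope
    return product / (minutes * protein)


# Hypothetical standards: nmol of 4-MU versus fluorescence reading.
slope = slope_through_origin([1.0, 2.0, 4.0], [50.0, 100.0, 200.0])
# Hypothetical duplicate reactions: 5 min incubation, 0.2 units of protein.
rate = specific_activity([262.0, 258.0], [10.0, 10.0], slope, 5.0, 0.2)
print(slope, rate)
```

The same calculation applies to the nitrophenyl substrates if absorbance readings and a nitrophenol standard curve are substituted for the fluorescence values.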
The activity of G9 toward the substrates fetuin (supplied by Prof. G. Laver, Canberra, Australia), α(2-3)- and α(2-6)-sialyllactose (Oxford Glycosystems), and mixed gangliosides (Sigma) was determined by incubating 200 μg of total protein (insect cell lysate) with 100-200 nmol of substrate in 0.2 M NaOAc (pH 4.6) at 37°C for 30 min. The amount of released sialic acid was determined by Warren's thiobarbituric acid assay using sialic acid as standard (40). The concentration of free sialic acid in the substrates was determined by hydrolysis in 0.1 M sulfuric acid for 30 min at 37°C.
Expression of G9 in COS7 Cells and Immunofluorescence Staining-The expression of recombinant G9 with a C-terminal peptide tag (T7.Tag) was achieved as follows. Two oligonucleotides, Tag-sense (5′-ATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCCATA-3′) and Tag-antisense (5′-GATCTATGGATCCCGACCCATTTGCTGTCCACCAGTCATGCTAGCCAT-3′) were phosphorylated at their 5′ ends using polynucleotide kinase, and annealed together by slow cooling from 65°C to room temperature in polynucleotide kinase buffer. The resultant adaptor was cloned into SmaI/BamHI-cut pBluescript-KS+, to regenerate the BamHI site and generate an NcoI site in place of the SmaI site (designated pBS-Tag). Nucleotides 247-1311 of the G9 mRNA were amplified from pG9-3 using the oligonucleotide primers OLG93Sma (5′-GCTGGTGCCCGGGGAGCAACTGC-3′, sense) and OLG94Sma (5′-CAGTGGCACACCCGGGAGTGTCC-3′, antisense). The polymerase chain reaction products were cut with SmaI, cloned into NcoI-cut and end-filled pBS-Tag (G9-BS-Tag-C), and sequenced. Nucleotides 60-1330 were amplified from a full-length G9 cDNA template, generated by reverse transcription of RNA derived from the cell line HepG2 (hepatocytes) using the oligonucleotide primers OLG9Pst (5′-GCAGCTGCAGGGAGAGATGAC-3′, sense) and OLG92Sma (5′-ATACCCGGGTGGCAGTGGCA-3′, antisense). Products were digested with PstI to generate a fragment corresponding to nucleotides 60-1026 and this was cloned into G9-BS-Tag-C cut with PstI, to excise nucleotides 247-1026 of G9, and sequenced. An EcoRI/XbaI fragment, containing the G9 coding sequence with the Tag coding sequence at the 3′ end, was excised from G9-BS-Tag-C-13.8 and cloned into the corresponding sites of pcDNA3 (Invitrogen), downstream of the human cytomegalovirus promoter (pcDNA3-G9-Tag-1). To allow expression without a tag the G9 coding sequence was excised from G9-BS-4/7-54 (see above) by digestion with SmaI/PvuI and cloned into EcoRV-cut pcDNA3 to generate the clone pcDNA3-G9-4.2.
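The design of the annealed tag adaptor can be checked with a short script. The sketch below uses only the two oligonucleotide sequences given above (with line-break hyphens removed) and the standard recognition sequences of BamHI (GGATCC), NcoI (CCATGG), and SmaI (CCCGGG, blunt cut after CCC); the function names are illustrative. It confirms that the antisense oligonucleotide is the reverse complement of the sense oligonucleotide plus a four-base 5′ extension, so the annealed duplex has one blunt end and one 5′ overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TAG_SENSE = "ATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCCATA"
TAG_ANTISENSE = "GATCTATGGATCCCGACCCATTTGCTGTCCACCAGTCATGCTAGCCAT"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(sense, antisense):
    """Bases at the 5' end of the antisense strand left unpaired by the sense strand."""
    paired = reverse_complement(sense)
    if not antisense.endswith(paired):
        raise ValueError("strands are not complementary over the sense length")
    return antisense[: len(antisense) - len(paired)]


overhang = five_prime_overhang(TAG_SENSE, TAG_ANTISENSE)
print("5' overhang on antisense strand:", overhang)

# The adaptor carries an internal BamHI site.
print("BamHI site within adaptor:", "GGATCC" in TAG_SENSE)

# Joining the blunt end to the CCC half of a cut SmaI site creates CCATGG.
print("NcoI site at SmaI junction:", "CCATGG" in "CCC" + TAG_SENSE)
```

The GATC overhang is the one left by BamHI digestion, which is consistent with directional cloning of the adaptor into SmaI/BamHI-cut vector.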
COS7 cells were seeded into 80-cm2 flasks and grown overnight to ~50% confluency in Dulbecco's modification of Eagle's medium, 10% (v/v) fetal calf serum. The following day cells were washed with serum-free Dulbecco's modified Eagle's medium and incubated for 2 h at 37°C, 5% (v/v) CO2 with 5 ml of transfection mixture, i.e. serum-free Dulbecco's modified Eagle's medium containing 0.4 g/ml DEAE-dextran, 0.1 mM chloroquine, and 5-10 μg of DNA. After removal of the transfection mixture, cells were washed with PBS, incubated for 2 min at room temperature with PBS, 10% (v/v) dimethyl sulfoxide, washed again with PBS, and then returned to fresh media for 48 h prior to harvesting.
To determine the presence of N-linked glycosylation, tunicamycin was added to the culture medium at a final concentration of 2.5 μg/ml at appropriate time points after dimethyl sulfoxide shock.
For immunofluorescence, COS7 cells were seeded onto 4.8-cm2 glass coverslips in 6-well plates and transfected as described above, but using 0.65 ml of transfection mixture/well. Cycloheximide was added to the culture media of some cells to a final concentration of 10 μg/ml 24 h prior to fixation and staining. 48 h or 72 h after transfection, cells were washed with PBS, fixed with 4% (w/v) paraformaldehyde in 250 mM HEPES (pH 7.4) for 10 min at 4°C, followed by 8% (w/v) paraformaldehyde for 50 min at room temperature and then washed with PBS. The cells were incubated with 50 mM NH4Cl in PBS for 15 min, washed with PBS, and then incubated with PBS, 0.2% (w/v) gelatin, 0.05% (w/v) saponin (buffer A) for 15 min to block nonspecific binding. Primary antibody, diluted in buffer A, was added to the cells for 45 min. This was followed by washing with PBS, 0.05% (w/v) saponin and then blocking with buffer A. Fluorescein 5-isothiocyanate-conjugated goat anti-mouse IgG (Sigma), donkey anti-rabbit IgG (Jackson Immunoresearch Laboratories), or tetramethylrhodamine B isothiocyanate-conjugated goat anti-mouse IgG (Sigma) were diluted, as recommended by the supplier, in buffer A and added to the cells for 45 min, after which they were washed with PBS, 0.05% (w/v) saponin and then with PBS, prior to mounting in Mowiol. Fluorescence microscopy was carried out using a Bio-Rad MRC 1024 confocal microscope.

RESULTS
Characterization of the G9 cDNA and Gene-The ~1.8-kb insert of the cDNA clone pG9-3 was used to probe Northern and Southern blots. A single mRNA of ~1.9 kb was detected in cell lines representing monocytes, macrophages, hepatocytes, epithelial cells, and B and T lymphocytes, with the level of expression in hepatocytes being significantly elevated compared to the other cell types tested, as previously observed with probe I (21). The pattern of hybridization observed in Southern blot analysis localized G9 in the MHC class III region ~75 kb telomeric of the C2 gene (Fig. 1) and showed that it has a maximum size of ~5.5 kb. Identical hybridization patterns were observed on cosmid and genomic DNA blots, indicating that G9 is a single copy gene in the human genome.
The complete nucleotide sequence of the G9 insert (1763 bp) from pG9-3 was obtained on both strands to a degeneracy of ~5 by shotgun sequencing (Fig. 2). A non-consensus polyadenylation signal (AATGAA) was identified at one end of the pG9-3 cDNA indicating that this corresponds to the 3′ end of G9. However, the longest continuous open reading frame, ORF1, did not start with a Met codon. Sequence data obtained from clone pGEM-9-M/H (a ~700-bp BamHI/HindIII genomic DNA fragment, containing the 5′ end of the G9 gene) extended upstream of the pG9-3 cDNA by 193 bp and extended ORF1, placing an in-frame Met codon at nucleotide 67 and an in-frame stop codon at nucleotide 34 (Fig. 2). This suggests that the G9 coding sequence comprises 1248 nucleotides, with an initiation methionine encoded by nucleotides 67-69.
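The coordinates implied by these figures can be worked through in a few lines. The sketch below assumes that the 1248-nucleotide coding sequence includes the termination codon (415 codons alone account for 1245 nucleotides); the function name is illustrative.

```python
def coding_coordinates(start_nt, n_residues):
    """Return (last sense nucleotide, stop codon span, length including stop).

    start_nt   -- position of the first nucleotide of the initiation codon
    n_residues -- number of amino acids encoded, initiator Met included
    """
    last_sense_nt = start_nt + 3 * n_residues - 1
    stop_codon = (last_sense_nt + 1, last_sense_nt + 3)
    length_with_stop = 3 * (n_residues + 1)
    return last_sense_nt, stop_codon, length_with_stop


last_sense_nt, stop_codon, coding_length = coding_coordinates(67, 415)
print(last_sense_nt)   # last nucleotide of the final sense codon
print(stop_codon)      # positions of the termination codon
print(coding_length)   # coding sequence length, stop codon included
```

On these assumptions the final sense codon ends at nucleotide 1311, which is consistent with the 3′ limit of the fragment (nucleotides 247-1311) amplified for the C-terminally tagged construct described under "Experimental Procedures."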
The pGEM-9-M/H clone was used in the determination of the G9 transcriptional start site by RNase mapping. A specific product of ~225 bp (taking into account the differing mobilities of RNA and DNA in acrylamide gels) was observed after electrophoresis of the reaction products under denaturing conditions. The size of this product, corresponding to exon 1 of G9, indicates that the transcriptional start site lies 66 bp upstream of the initiation methionine in the G9 mRNA. This has been defined as position +1 (Fig. 2).
Comparison of nucleotide sequence obtained from overlapping genomic DNA fragments with the G9 cDNA sequence showed that the G9 gene is ~3.5 kb in length and consists of six exons. Intron/exon boundaries are located between nucleotides 225 and 226, 418 and 419, 681 and 682, 864 and 865, and 1087 and 1088 (Fig. 2) and these all conform to the consensus for eukaryotic genes (underlined in Table I).
The Derived Amino Acid Sequence of G9-The longest open reading frame in the G9 mRNA encodes a protein of 415 amino acids which has three potential N-linked glycosylation sites (Fig. 2). Protein data base searches showed that G9 has significant amino acid sequence identity with several mammalian and bacterial sialidases. The scores (in S.D. units) for an alignment of the sequences of G9, the sialidases of Salmonella typhimurium (41), Clostridium perfringens (42), and Clostridium sordellii (43) and the rat and hamster cytosolic sialidases, over the regions indicated, are listed in Table II. These scores provide a good indication that G9 shares a common ancestor with the bacterial sialidases. An alignment of these sequences and that of the Vibrio cholerae sialidase (44) is shown in Fig. 3. G9 also contains amino acid motifs characteristic of the bacterial sialidases. The first of these is the highly conserved consensus sequence, termed the Asp block (Ser-X-Asp-X-Gly-X-Thr-Trp) (45). G9 contains one perfect copy of the Asp block consensus (SMDQGSTW, amino acids 114-121), two near matches (SKDDGVSW, amino acids 188-195; and SDDHGASW, amino acids 248-255), and one poorly conserved copy (SYDACDTL, amino acids 295-302) (Fig. 3). The sequence motif Phe-Arg-Ile-Pro (FRIP), which occurs close to the N terminus of the bacterial sialidases, is also found in G9 (amino acids 77-80).
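The grading of the four G9 Asp-block copies can be reproduced by counting matches at the five fixed positions of the consensus. The sketch below uses only the consensus and the four octapeptides quoted above; the scoring function is an illustrative device and is not the method used to identify the motifs.

```python
# Asp block consensus Ser-X-Asp-X-Gly-X-Thr-Trp; "x" marks unconstrained positions.
CONSENSUS = "SxDxGxTW"

G9_ASP_BLOCKS = ["SMDQGSTW", "SKDDGVSW", "SDDHGASW", "SYDACDTL"]


def conserved_positions_matched(octamer, consensus=CONSENSUS):
    """Count the fixed consensus positions at which the octamer agrees."""
    if len(octamer) != len(consensus):
        raise ValueError("octamer and consensus must have the same length")
    return sum(
        1
        for expected, observed in zip(consensus, octamer)
        if expected != "x" and expected == observed
    )


scores = {block: conserved_positions_matched(block) for block in G9_ASP_BLOCKS}
for block, score in scores.items():
    print(block, f"{score}/5 fixed positions conserved")
```

On this count the perfect copy matches all five fixed positions, the two near matches differ only by a Thr-to-Ser substitution, and the poorly conserved copy lacks both the Gly and the Trp.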
The three-dimensional structure of the 42-kDa sialidase from S. typhimurium LT2 has been shown by x-ray crystallography (at 2.2 Å) to comprise six β-sheets each composed of four β-strands (46). The catalytic subunit of V. cholerae sialidase adopts a structure highly related to this (47). The positions of regions of secondary structure and the active site residues in the S. typhimurium and V. cholerae sialidases are shown in Fig. 3. The S. typhimurium sialidase crystal structure has been used to generate a model of the G9 structure (data not shown) on the basis of protein sequence alignment. One of the features that supports the validity of this model is the conservation, in topologically equivalent locations, of 3 out of 4 Asp boxes and 9 out of the 13 residues involved in the active site (Fig. 3). However, Met-99, Trp-121, Trp-128, and Leu-175 in the S. typhimurium enzyme, which form a hydrophobic pocket in the active site, are replaced in the G9 model structure by Pro-134, Ser-156, Gly-163, and Ala-197, respectively. These residues also differ in the V. cholerae sialidase and the rat and hamster cytosolic sialidases (Fig. 3) and may be associated with variation in the substrate specificities of these enzymes. The Asp boxes in the S. typhimurium and V. cholerae sialidases lie on the turns between the third and fourth strands of four of the β-sheets. The role of these motifs is unclear, but it has been suggested that they may be involved in maintaining structure.
G9 differs from the bacterial sialidases in having an extended N terminus, which may constitute a cleaved signal sequence or a transmembrane domain. The program SigCleave predicts an N-terminal signal sequence and a cleavage site between amino acids 45 and 46, with a score of 10.7 (the cut-off score being 3.5). Upstream of the predicted signal cleavage site is a polar region (amino acids 41-45) preceded by a hydrophobic segment (amino acids 28-40). This potential signal sequence lies some distance from the N terminus of G9. However, the maximum length for an N-terminal region upstream of a signal sequence has not been clearly defined and this region of G9 (amino acids 1-33) does fulfill the requirement of having a net positive charge (20). A hydropathy plot, generated by the program PepWindow, shows the G9 protein to have a mean hydrophobicity of 1.9 on the Kyte-Doolittle scale (24) over residues 19-42. A mean hydrophobicity greater than 1.6 over a 19-residue window is defined as the minimum criterion for a transmembrane segment. The program TopPred (25), which uses the GES hydrophobicity scale, predicts that G9 has a transmembrane domain spanning amino acids 25-45 with the N terminus of the protein in the cytoplasm, on the basis of net charge differences between the sequences flanking the transmembrane domain. These data indicate that G9 has a signal-anchor sequence between residues 19 and 45. Cleavage of this signal sequence would give a mature protein of 370 amino acids (Fig. 2). If there is no cleavage at residue 45 the full-length G9 protein is likely to be a type II membrane protein, most probably anchored via residues 28-45. The TopPred program suggests the presence of a second transmembrane domain in G9 at residues 316-336. However, this is inconsistent with our model for the structure of G9 and, in the orientation predicted, would place two of the three N-linked glycosylation sites of G9 in the cytoplasm.
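The hydropathy criterion quoted above (mean Kyte-Doolittle hydrophobicity greater than 1.6 over a 19-residue window) can be sketched as a sliding-window calculation. This is a minimal illustration and not a reimplementation of PepWindow or TopPred. The hydropathy values are the published Kyte-Doolittle indices; the test sequence is hypothetical, since the G9 sequence itself is not reproduced in this text.

```python
# Kyte-Doolittle hydropathy indices for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(sequence, window=19):
    """Mean hydropathy of every full-length window, in sequence order."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_window_starts(sequence, window=19, threshold=1.6):
    """1-based start positions of windows whose mean exceeds the threshold."""
    return [
        index + 1
        for index, mean in enumerate(window_means(sequence, window))
        if mean > threshold
    ]


# Hypothetical sequence: a 19-residue Leu stretch flanked by charged residues.
example_sequence = "K" * 5 + "L" * 19 + "D" * 5
means = window_means(example_sequence)
print(round(max(means), 2))
print(candidate_window_starts(example_sequence))
```

Applied to a real sequence, overlapping windows above the threshold would be merged into a single candidate segment before comparison with signal-peptide and topology predictions.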
Expression of G9 in Insect Cells Using Baculovirus-Recombinant baculovirus (G9-bac), with G9 expression under the control of the viral polyhedrin promoter, was prepared by homologous recombination in Sf21 cells. Monolayers of Sf21 cells infected with G9-bac or with wild-type virus were labeled with 35S at 24, 48, 72, and 96 h after infection. In the labeled lysates from cells infected with wild-type virus a major band of 29 kDa, corresponding to polyhedrin, was observed on SDS-PAGE. Labeled lysates from cells infected with G9-bac contained unique bands in the size range 38-48 kDa (data not shown). On a Western blot of lysates from Sf21 cells harvested 24 h, 48 h (shown in Fig. 4a), 72 h and 96 h after infection two species of ~44 and 48 kDa were detected by anti-G9 sera (diluted 1:1000) in G9-bac-infected cells (Fig. 4a, lane 2), but not in uninfected cells (Fig. 4a, lane 1) or cells infected with wild-type virus (Fig. 4a, lane 3). In some experiments two additional proteins of ~74 and 72 kDa were detected by the anti-G9 sera (as in Fig. 4b) in cells infected with G9-bac and also in uninfected cells and in cells infected with wild-type virus, indicating that these are unrelated to G9. No protein was detected in the culture media (Fig. 4a, lanes 4-6) indicating that G9 is not secreted from Sf21 cells.
Sf21 cells infected with G9-bac were grown in the presence of tunicamycin (an inhibitor of N-linked glycosylation) to determine the presence of N-linked sugars on the expressed G9 protein. Two proteins of ~44 and 48 kDa were detected in cells grown without tunicamycin and with tunicamycin added 3, 6, or 9 h before harvesting (Fig. 4b, lanes 6, 5, 4, and 3, respectively). These species were not seen after more prolonged treatment with tunicamycin (Fig. 4b, lanes 1 and 2) and were replaced by a protein of ~38 kDa in cells grown in tunicamycin for 28 h prior to harvesting (Fig. 4b, lane 1). The observed reduction in molecular weight following treatment with tunicamycin indicates that G9 undergoes N-linked glycosylation in insect cells. The ~38-kDa unglycosylated species is closer to the expected size of G9 following cleavage of the predicted 45-residue signal sequence (41 kDa) than that of the full-length protein (46 kDa). However, size determination based on SDS-PAGE is not sufficiently accurate to allow the nature of this species to be concluded.
[Table I, residual row and footnotes; table body not recovered] Consensus: AGgta tncagG g. a, Exonic sequence is shown in uppercase type. Nucleotide positions in the cDNA sequence are indicated by numbers below the sequence. Amino acids are indicated by single-letter codes above the second nucleotide of each codon and amino acid positions are indicated by numbers above the sequence. b, Intronic sequence is shown in lowercase type and the length of each intron is indicated.
Functional Characterization of G9 - Total lysates from uninfected Sf21 cells and from Sf21 cells infected with G9-bac or wild-type virus (harvested 48 h post-infection) were used in sialidase assays with 4MU-NANA as substrate. The conversion of 4MU-NANA to 4-MU, measured at pH 4.6, was found to be linear over ~10 min at 37°C with 1.4-3.8 nmol of 4-MU produced per min/g of total protein in lysates from G9-bac-infected cells (Fig. 5a). In contrast, sialidase activity in lysates of uninfected Sf21 cells or cells infected with wild-type virus was negligible (~10 pmol of 4-MU/min/g of protein) (Fig. 5a). The rate of conversion of 4MU-NANA to 4-MU by G9-bac-infected Sf21 cell lysates was measured over the pH range 4-5.6 in NaOAc buffer (Fig. 5b) and 5.8-8 in phosphate buffer (Fig. 5b, and data not shown). Optimal activity was observed in the pH range 4.4-4.8, with maximum activity at pH 4.6 (Fig. 5b). This activity was unaffected by freeze-thawing of G9-bac-infected Sf21 cell lysates.
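Because the reaction is reported as linear over the assay period, a rate can be read from the slope of product against time and then normalized to the amount of protein in the assay. The following is a minimal sketch of that calculation, not the procedure used in this study: the function names are hypothetical, the readings are invented for illustration, and the protein amount is expressed in whatever mass unit the activity is to be reported per.

```python
def linear_slope(times_min, product_nmol):
    """Least-squares slope of product formed versus time (nmol/min)."""
    n = len(times_min)
    if n < 2 or n != len(product_nmol):
        raise ValueError("need at least two paired time/product readings")
    mean_t = sum(times_min) / n
    mean_p = sum(product_nmol) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(times_min, product_nmol))
    return sxy / sxx


def specific_activity(times_min, product_nmol, protein_amount):
    """Rate of product formation per unit of protein in the assay."""
    if protein_amount <= 0:
        raise ValueError("protein amount must be positive")
    return linear_slope(times_min, product_nmol) / protein_amount


# Invented example readings (NOT data from this study):
# product measured at the 2, 5, and 10 min time points.
times = [2, 5, 10]
product = [1.0, 2.5, 5.0]
rate = linear_slope(times, product)
activity = specific_activity(times, product, protein_amount=0.2)
print(f"rate = {rate:.2f} nmol/min; specific activity = {activity:.2f} per unit protein")
```

Fitting a slope through all time points, rather than dividing a single end-point reading by the incubation time, makes departures from linearity easier to spot.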
G9 was also shown to hydrolyze the substrates ONP-NANA and PNP-NANA at pH 4.6 and 37°C (data not shown). Reaction rates were similar to those observed for 4MU-NANA, with 0.7-2.5 nmol of o-nitrophenol or p-nitrophenol produced per min/g of total protein in lysates from G9-bac-infected cells. The activity of G9-bac-infected cell lysates toward the substrates fetuin, α(2-3)- and α(2-6)-sialyllactose, and gangliosides was tested, but there was no detectable increase in the hydrolysis of these substrates when incubated with G9-bac-infected cell lysates compared to control lysates from cells infected with wild-type virus.
(44). The S. typhimurium and V. cholerae sequences were aligned on the basis of topologically equivalent residues in the x-ray crystal structures of S. typhimurium sialidase (46) and V. cholerae sialidase (47). The other sequences were aligned using the AMPS program (28) in its multiple alignment mode, weighted such that penalties were increased for gaps within secondary structural elements. Minor adjustments to the alignment were made by eye. Residues that are identical or conservatively replaced in all six protein sequences are shaded in gray. Residues in Asp boxes that conform to the consensus sequence for this motif are shown in white on black. The positions of active site residues in the S. typhimurium sequence (46) are indicated by asterisks (*) below the sequence. Residues that form β-sheets in the S. typhimurium and V. cholerae sialidases are indicated by double dashed lines (====) above the sequence.
Subcellular Localization of G9 - Immunofluorescence staining of HeLa cells using anti-G9 sera gave rise to a nuclear staining pattern (data not shown). Preincubation of the antisera with lysates of Sf21 cells expressing G9 eliminated detection of G9 on Western blots, but had little effect on the results of immunofluorescence, indicating that the nuclear staining was unrelated to G9. The construct pcDNA3-G9-Tag-1 was, therefore, used to express G9 with a C-terminal peptide tag (Met-Ala-Ser-Met-Thr-Gly-Gly-Gln-Gln-Met-Gly-Arg-Asp-Pro) in transfected COS7 cells. Transient expression of G9 was verified by Western blot analysis using the monoclonal antibody T7.Tag (Novagen) (Fig. 6, lane 1). Species of ~49, 45, and 42.5 kDa were detected in cells grown in the absence of tunicamycin. These were replaced by a single ~40-kDa species after a minimum of 28 h growth in the presence of tunicamycin (Fig. 6, lanes 2-5), indicating that G9 does undergo N-linked glycosylation in mammalian cells. The predicted molecular weights of tagged G9 with and without cleavage of the predicted N-terminal signal sequence are 42 and 47 kDa, respectively, but it is not possible to conclude which of these species the ~40-kDa unglycosylated species corresponds to on the basis of size determination by SDS-PAGE. No G9 was detected in the culture media from transfected COS7 cells by Western blot analysis (data not shown), indicating that G9 is not secreted from these cells.
Immunofluorescence staining was carried out in COS7 cells transiently transfected with pcDNA3-G9-Tag-1, using the T7.Tag monoclonal antibody (Novagen). At 48 and 72 h post-transfection the pattern of staining in transfected cells closely resembled that seen with a monoclonal antibody to protein disulfide isomerase, which is resident in the lumen of the endoplasmic reticulum (ER) (Fig. 7, panels A, B, and C). An ER staining pattern was observed in cells with both low and high levels of G9 expression. In some cells additional rod-like structures and crystals of protein were detected by the T7.Tag antibody (Fig. 7, panel B). These were most prevalent at 72 h post-transfection and in cells transfected with larger quantities of DNA, suggesting that overexpression of G9 results in its accumulation and aggregation or oligomerization in the ER. To determine whether staining of other organelles was being masked by the extensive ER staining pattern, transfected cells were grown in the presence of cycloheximide for 24 h prior to fixation and staining. While this reduced the level of ER staining and the occurrence of rod-like structures and crystals (due to inhibition of protein synthesis), there was no evidence that G9 was transported to other organelles (Fig. 7, panel D). Staining of pcDNA3-G9-Tag-1-transfected cells with anti-G9 sera (Fig. 7, panel E) and purified antibodies to G9 (not shown) showed a pattern of staining identical to that seen with the T7.Tag antibody.
6) were collected and mixed with equal volumes of 2× sample loading buffer and cells (lanes 1-3) were harvested and lysed by boiling in sample loading buffer. Proteins were fractionated by SDS-PAGE (10% (w/v) acrylamide) and transferred to nitrocellulose membrane. The blot was incubated with anti-G9 sera and antibody bound to G9 was detected by incubation with horseradish peroxidase-conjugated goat anti-rabbit IgG followed by ECL developing reagents. The positions of protein standards (kDa) are shown on the left. b, Sf21 cells were infected with G9-bac (lanes 1-6) for 2 h and then grown in fresh media for 28 h prior to harvesting. Uninfected cells were grown in the same way as a control (lane 7). Tunicamycin was added to the media of the infected cells at a final concentration of 2 µg/ml immediately after removal of the inoculum (i.e.
FIG. 5. Sialidase activity of G9 expressed in Sf21 cells. a, sialidase activity toward the substrate 4MU-NANA was determined by measurement of nanomoles of 4-MU released/min/g of total cell protein. Uninfected Sf21 cells or cells infected with G9-bac or wild-type virus (2 pfu/cell) were harvested and lysed by sonication 48 h after infection. 200 µg of total cell protein was incubated for 2, 5, or 10 min at 37°C with 70 nmol of 4MU-NANA at pH 4.6. Hydrolysis was stopped with glycine-carbonate pH 10 and the liberation of 4-MU was determined by measurement of fluorescence with 4-MU as standard. Sialidase activities are shown by hatched boxes on the graph. For G9-bac-infected cell lysates variation in the observed activity over six experiments is indicated by the error bar, with the top of the hatched box corresponding to the mean activity. b, assays were carried out as described in a using G9-bac-infected Sf21 cell lysates with 4MU-NANA as substrate in NaOAc buffer, over the pH range 4-5.6, or in phosphate buffer, over the pH range 5.8-6. The rate of reaction in nmol 4-MU released/min/g total cell protein at each pH tested is indicated.
To determine whether the observed ER localization of G9 might be the result of altered trafficking due to the C-terminal peptide tag, COS7 cells were stained with purified antibodies to G9 72 h post-transfection with pcDNA3-G9-4.2. In transfected cells G9 was seen to co-localize with protein disulfide isomerase in the ER (Fig. 7, panels G and H), but was also found in rod-like structures and crystals in some cells (Fig. 7, panel F).
There was no evidence of ER staining in transfected cells incubated with preimmune sera (Fig. 7, panel I).
DISCUSSION
Nucleotide sequence analysis of a cDNA clone has shown that the G9 gene from the human MHC encodes a protein with significant sequence similarity to bacterial sialidases. Subsequent expression of the G9 protein in insect cells has confirmed that it does have sialidase activity. Amino acid sequence comparison indicates that G9 and the rat and hamster cytosolic sialidases (8, 9) share a common ancestor with the bacterial sialidases. However, while the two rodent cytosolic sialidases are 81% identical at the amino acid level, and the G9 proteins of man and mouse are 86% identical,² human G9 shows only 28-29% sequence identity with the cytosolic enzymes, indicating that it is a distinct sialidase.
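Identity figures of this kind are computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the aligned positions where neither sequence has a gap. The text does not state which denominator was used, so this convention is an assumption; the function name is hypothetical and the two aligned strings are invented toy sequences, not G9 or sialidase sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) give
    different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped aligned positions to compare")
    return 100.0 * matches / compared


# Invented toy alignment for illustration only.
toy_a = "SWSTDGGKTW"
toy_b = "SRSTDHG-TW"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

In the toy alignment, 7 of the 9 ungapped columns match. Whatever convention is chosen, it should be applied uniformly when comparing values such as 81%, 86%, and 28-29%.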
The G9 genes in man and mouse² map to the same segment of DNA as the murine Neu-1 locus (16-18), which has been shown to control liver sialidase activity. The activity of the enzyme encoded by Neu-1 has been demonstrated using the substrate 4MU-NANA, which we have also shown to be hydrolyzed by G9. It is, therefore, most likely that G9 corresponds to Neu-1. Three allelic forms of Neu-1 have been described, which are associated with variations in the sialic acid content of lysosomal enzymes including α-mannosidase and liver acid phosphatase (48). Sialidase-deficient mice that carry the Neu-1a, low activity, allele show hypersialylation of several lysosomal enzymes, as defined by electrophoretic mobility (48, 49). This has also been observed in patients with sialidosis, although the plasma proteins of these individuals showed normal patterns of sialylation (50).
In an immune response the activation of T cells is accompanied by increased intracellular sialidase activity (51), which may account for the observed hyposialylation of glycoproteins, such as MHC class I molecules, on the surfaces of activated T cells (52). This is necessary to render the cells responsive to antigen-presenting cells, such as B cells (53). In a comparison of activated T cells from different mouse strains, sialidase activity was found to be significantly lower in cells from mice carrying the Neu-1a allele compared to other strains (51). This suggests that Neu-1, i.e. G9, is responsible for increased sialidase activity following T cell activation and may thus be important in regulation of the immune response.
G9 contains an N-terminal signal-anchor peptide suggesting that it is translocated into the ER and this is supported by evidence of glycosylation and the results of immunofluorescence staining. However, it is not possible to say on the basis of the available data whether the signal sequence is subsequently cleaved. The acidic pH optimum observed for the activity of G9 is indicative of a sialidase located in endosomes, lysosomes, or the Golgi apparatus, but in transfected COS7 cells G9 was detected only in the ER. Proteins such as β1,4-galactosyltransferase that are normally retained in the Golgi, probably as a result of oligomerization via their transmembrane domains (reviewed in Ref. 54), have been found to accumulate in the ER when overexpressed, due to premature oligomerization. However, we observed localization of G9 to the ER at both low and high levels of expression. Sialidase activity associated with microsomes, which are usually composed to a large extent of ER-derived vesicles, has been previously reported. For example, sialidase activity with a pH optimum of 4.4 was detected in the microsomal fraction of human thymocytes (7). However, in the majority of studies involving subcellular fractionation the correlation of sialidase activity with the activities of ER-associated enzymes has not been tested. The role of a sialidase in the ER is not immediately obvious, since the addition of sialic acids to glycoproteins is understood to be mediated by sialyltransferases, for example, α2,6-sialyltransferase, localized in the Golgi apparatus (55). However, retrograde transport from all levels of the Golgi stack to the ER has been demonstrated (reviewed in Refs. 56 and 57) and, in cells treated with brefeldin A (which blocks transport out of the ER), Golgi enzymes have been found to be fully functional following redistribution to the ER (58). Although it is not entirely understood to what extent and by what mechanism(s) Golgi-to-ER redistribution occurs in vivo, it is clearly possible that sialylated proteins might enter the ER.
Intracellular protein sorting is regulated by transport signals that specify retention within, or movement between, compartments and in their absence proteins may be transported to the cell surface by bulk flow. ER lumenal proteins are characterized by the sequence Lys-Asp-Glu-Leu (KDEL) at their C termini (59), while the cytoplasmic tails of type I and type II ER membrane proteins contain the motifs Lys-Lys-X-X (at the C terminus) and X-X-Arg-Arg or X-X-Arg-X-Arg (at or very close to the N terminus), respectively (60, 61). G9 does not carry any of these known signals.
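The three signal classes named above lend themselves to a simple pattern check. The sketch below is an illustration under stated assumptions rather than the analysis performed in this study: the function name and dictionary keys are hypothetical, the N-terminal motif is required to sit exactly at the N terminus (the text allows "at or very close to" it), and membrane topology is not considered. Apart from the C-terminal Tyr-Gly-Thr-Leu reported for G9, the test sequences are invented placeholders.

```python
import re

# Patterns follow the motifs as described in the text; anchoring the
# N-terminal motif exactly at position 1 is a simplifying assumption.
ER_SIGNALS = {
    "KDEL": re.compile(r"KDEL$"),           # lumenal proteins, C terminus
    "KKXX": re.compile(r"KK..$"),           # type I membrane proteins, C terminus
    "XXRR/XXRXR": re.compile(r"^..R.?R"),   # type II membrane proteins, N terminus
}


def find_er_signals(sequence: str) -> list:
    """Return the names of ER retention motifs found in a one-letter sequence."""
    seq = sequence.strip().upper()
    return [name for name, pattern in ER_SIGNALS.items() if pattern.search(seq)]


# Placeholder sequences; only the final four residues of the last one
# (YGTL, the reported C terminus of G9) come from the text.
examples = {
    "lumenal-like": "MAAAAAKDEL",
    "type I-like": "MAAAAAKKLS",
    "type II-like": "MSRARAAAAA",
    "G9-like C terminus": "AAAAYGTL",
}
for label, seq in examples.items():
    print(label, find_er_signals(seq))
```

A placeholder ending in Tyr-Gly-Thr-Leu matches none of the three patterns, in line with the statement that G9 carries no known retention signal. A real analysis would scan the full sequence and take its predicted topology into account.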
Bulk flow of proteins out of the ER may also be prevented if oligomeric proteins fail to assemble correctly or if proteins are misfolded, in which case they often form aggregates that are retained in the ER by association with chaperones (62). The crystals of G9 that we detected in COS7 cells expressing high levels of recombinant protein may result from a blockage in the secretory pathway, possibly as a consequence of aberrant glycosylation or slight misfolding. Alternatively, retention of G9 in the ER may reflect a requirement for it to interact with another protein, prior to trafficking out of the ER, that is absent or expressed at very low levels in COS7 cells. This is typified by the following example. Copurification of the lysosomal enzyme β-galactosidase with the protective protein (a carboxypeptidase) and a sialidase has led to the suggestion that these proteins form a specific complex (63, 64). When human β-galactosidase is expressed in COS-1 cells, some protein is detected in lysosomes, but much of it is secreted in an unprocessed form (65). Expression of human protective protein in COS-1 cells gave the same result, while various mutant forms of this protein were retained in the ER and degraded over several hours (66). In contrast, we have seen no evidence that G9 is secreted following expression in COS7 cells and it appears to be stable in the ER over long periods of time. In COS-1 cells co-transfected with protective protein and β-galactosidase these proteins were shown to associate soon after synthesis in the ER and to co-localize in lysosomes (66). This implies that association of these two species is required for their efficient intracellular transport and processing. However, the sialidase that has been co-purified with the β-galactosidase-protective protein complex does not seem to be essential for this, suggesting that its association may occur in the lysosomes. Due to the limited and conflicting data on the properties of this sialidase it is not possible to say how it compares to G9. The sequence at the C terminus of G9 (Tyr-Gly-Thr-Leu) does resemble targeting sequences identified in the cytoplasmic tails of several lysosomal and endosomal membrane proteins (reviewed in Ref. 67). However, the C terminus of G9 is more likely to be lumenal than cytoplasmic, in which case it could not be involved in targeting.
FIG. 7. Immunofluorescence of COS7 cells. COS7 cells growing on coverslips were transfected with pcDNA3-G9-Tag-1 (panels A-E) or pcDNA3-G9-4.2 (panels F-I) and grown for 48 h (panels A and C) or 72 h (panels B and D-I) prior to fixation and staining. Cycloheximide (10 µg/ml final concentration) was added to the cells in panel D, 24 h prior to fixation. Cells were fixed with 4% (w/v) paraformaldehyde (10 min at 4°C) then 8% (w/v) paraformaldehyde (50 min at room temperature) and permeabilized with 0.05% (w/v) saponin. Cells were treated with: the T7.Tag monoclonal antibody (panels A, B, and D); anti-G9 sera (panel E); a monoclonal antibody to protein disulfide isomerase (panels C and H); purified antibodies to G9 (panels F and G); or preimmune sera (panel I). Fluorescein 5-isothiocyanate-conjugated anti-mouse IgG (panels A, B, C, and D), fluorescein 5-isothiocyanate-conjugated anti-rabbit IgG (panels E, F, G, and I), and tetramethylrhodamine B isothiocyanate-conjugated anti-mouse IgG (panel H) were used as second antibodies. Panels G and H show the same cells stained with antibodies to G9 and protein disulfide isomerase, respectively. Immunofluorescence was observed using a Bio-Rad MRC 1024 confocal microscope. The images shown are magnified by 250× (panels B and D-I) or 500× (panels A and C).
The presence of an N-terminal signal peptide and the absence of any known motifs for retention in or targeting to intracellular organelles raise the possibility that G9 may be a secreted protein. If exit from the ER were the rate-limiting step in secretion, a predominant ER staining pattern would be expected. We did not detect any G9 in the culture media of transfected COS7 cells, but this does not rule out the possibility that endogenously expressed G9 may be secreted from human cells under appropriate conditions. Sialidase activity has been detected in cell-conditioned media from cultured human fibroblasts (68). In preconfluent cultures a sialidase that hydrolyzed GM3 ganglioside with optimal activity at pH 6.5 was detected, while in confluent cultures the predominant sialidase was optimally active at pH 4.5 and hydrolyzed both the substrates tested: GM3 ganglioside and sialyllactol. The authors suggest that the removal of sialic acid from GM3 ganglioside on the extracellular aspect of the cell membrane may be a requisite for cell growth. U937 cells have also been shown to release a sialidase, in response to certain stimuli, which hydrolyzes 4MU-NANA with a pH optimum of 4-4.4 (69).
Of the substrates tested in this study G9 showed detectable activity only toward 4MU-NANA, ONP-NANA, and PNP-NANA. This limited substrate specificity suggests that G9 may be highly selective in its activity toward intrinsic substrates. The full extent of the enzymatic properties of G9 and its interesting localization are the subjects of further investigation.