Dynamic Glycosylation of Nuclear and Cytosolic Proteins

O-Linked N-acetylglucosamine (O-GlcNAc) glycosylation is a dynamic modification of eukaryotic nuclear and cytosolic proteins analogous to protein phosphorylation. We have cloned and characterized a novel gene for an O-GlcNAc transferase (OGT) that shares no sequence homology or structural similarities with other glycosyltransferases. The OGT gene is highly conserved (up to 80% identity) in all eukaryotes examined. Unlike previously described glycosyltransferases, OGT is localized to the cytosol and nucleus. The OGT protein contains multiple tandem repeats of the tetratricopeptide repeat motif. The presence of tetratricopeptide repeats, which can mediate protein-protein interactions, suggests that OGT may be regulated by protein interactions that are independent of the enzyme’s catalytic site. The OGT is also modified by tyrosine phosphorylation, indicating that tyrosine kinase signal transduction cascades may play a role in modulating OGT activity.

shown to regulate a number of cellular functions. For example, recent studies have shown the following. 1) O-GlcNAcylation modulates the DNA binding activity of the p53 tumor suppressor (8). 2) O-GlcNAcylation of p67 regulates protein synthesis by controlling the phosphorylation state of eukaryotic initiation factor 2 (eIF-2α) (9,10). 3) O-GlcNAcylation of the head domain of neurofilament-H appears to regulate neurofilament assembly (11). 4) The O-GlcNAc and phosphate modifications of the RNA polymerase II COOH-terminal domain are reciprocal and are likely to regulate transcription (12,13). 5) O-GlcNAc has a reciprocal relationship with phosphorylation at the site on the c-Myc protein that has been implicated in modulating its oncogenic activity (14).
Consistent with the dynamic nature of O-GlcNAcylation, both a UDP-N-acetylglucosamine:peptide N-acetylglucosaminyltransferase (O-GlcNAc transferase) (15), specific for the attachment of O-GlcNAc to proteins, and a soluble N-acetyl-β-D-glucosaminidase with a neutral pH optimum (O-GlcNAcase) (16), specific for the removal of O-GlcNAc from proteins, have been purified and characterized. These two enzymes appear to regulate the attachment and removal of O-GlcNAc in much the same way that kinases and phosphatases regulate protein phosphorylation. Taken together, these observations suggest that O-GlcNAc may play a role in modulating either the phosphorylation state or the assembly and disassembly of multimeric protein complexes in several key cellular systems, including transcription, nuclear transport, and cytoskeletal organization.
The O-GlcNAc transferase (OGT) (EC 2.4.1) purified from rat liver cytosol appears to be a heterotrimer composed of two catalytic 110-kDa (p110) subunits and one 78-kDa (p78) subunit (17). Here we describe the cloning and characterization of the gene encoding the catalytic p110 subunit. The gene is highly conserved throughout evolution, consistent with the ubiquitous nature of the O-GlcNAc modification. We also find that, like many other regulatory proteins, OGT contains several tandem repeats of the tetratricopeptide repeat (TPR) motif (reviewed in Refs. 18 and 19), suggesting that OGT can interact with other proteins via the TPR domain to form a regulatory complex. Examination of the posttranslational modifications of OGT shows that the enzyme is modified by both O-GlcNAc and tyrosine phosphorylation. The subcellular localization of the cloned gene product is consistent with O-GlcNAcylation as a nuclear and cytosolic modification.

EXPERIMENTAL PROCEDURES
Preparation of Tryptic Peptides-The O-GlcNAc transferase was purified and concentrated on a Q-Sepharose column as described previously (17). The purified protein was separated by SDS-PAGE (20). The material corresponding to the 110-kDa subunit was visualized, excised, and subjected to in-gel protease digestion with trypsin (Boehringer Mannheim sequencing grade or Worthington tosylphenylalanyl chloromethyl ketone-treated trypsin) by the method of Rosenfeld et al. (21). The resulting peptides were separated by reverse phase high performance liquid chromatography (RP-HPLC). In addition, a second sample was electroblotted to Immobilon-P SQ (Millipore) after SDS-PAGE separation, and the protein was stained with Amido Black 10B (Sigma) as per manufacturer's recommendations. The band corresponding to the 110-kDa subunit was excised and submitted to the Harvard Microchem protein sequencing facility (Boston, MA) for both NH2-terminal sequencing and internal sequence analysis.
RP-HPLC-The tryptic peptides were purified by several rounds of RP-HPLC. A Vydac 5-μm C18 column (4.6 × 250 mm) or a Rainin Microsorb-MV 5-μm C18 column (4.6 × 250 mm) was used for first- and second-dimension RP-HPLC. A Vydac narrow-bore C18 column (2.1 × 150 mm) was used for third-dimension RP-HPLC. The peptides were bound to the column in either 0.05% trifluoroacetic acid or 0.1% phosphoric acid containing 100 mM sodium perchlorate at pH 2 or pH 7 (22), and eluted with a linear gradient from 0 to 60% acetonitrile.
Peptide Sequencing-The RP-HPLC-purified peptides were sequenced by gas phase automated Edman degradation on either a Porton Instruments (Tarzana, CA) model PI 2090E microsequencing system, or an Applied Biosystems Inc. (Foster City, CA) model 470A gas phase sequenator.
Rabbit Antiserum-Polyclonal antibodies to the OGT protein were generated as follows. His-tagged protein was expressed in Escherichia coli using the pTrcHis vector, and the protein was purified as described below. The purified protein was separated by SDS-PAGE, visualized, and excised from the gel as described below. The gel slices were homogenized and used directly as immunogens by Hazelton Research Products (Denver, PA) to produce polyclonal antisera in two rabbits designated AL-24 and AL-25. Immunoglobulin G (IgG) was purified by passing the rabbit antiserum over a protein A-Sepharose column (Pharmacia) as per manufacturer's recommendations.
Western Blot Analysis-Crude protein extracts were prepared either from the dissected tissue of 3-6-month-old Harlan Sprague Dawley rats or from transfected HEK293 cells by following the first two steps of the purification protocol as described previously (17). The proteins were separated by SDS-PAGE and transferred to polyvinylidene difluoride membrane (23). Purified rabbit polyclonal IgG AL-24 (1:5000), or monoclonal anti-phosphotyrosine (Sigma, 1:2000) was used as a primary antibody with anti-rabbit or anti-mouse IgG coupled to horseradish peroxidase (Amersham) as the secondary antibody (1:20,000 dilution). Detection of the horseradish peroxidase activity was by enhanced chemiluminescence (ECL) and fluorography as described by the manufacturer (Amersham).
Carbohydrate Characterization-Purified OGT from rat liver (through step 8, as described by Haltiwanger et al. (17)) was probed for terminal GlcNAc using Galβ(1-4)galactosyltransferase and UDP-[3H]galactose as described (24). The subunits were then resolved on a 7.5% SDS-acrylamide gel, and proteins tagged with [3H]galactose were detected by fluorography of the gel treated with 1 M sodium salicylate. The p110 subunit of OGT was excised, and the O-linkage of the labeled sugar on p110 was demonstrated by its sensitivity to β-elimination and reduction (24). The released O-linked sugar was desalted over Dowex AG 50W-X8 (hydrogen form) and Dowex AG 1-X8 (formate form) columns run in series. Identification of the β-eliminated products was performed by high pH anion-exchange chromatography with pulsed-amperometric detection (HPAE-PAD) by isocratic elution in 200 mM NaOH at 0.4 ml/min for 25 min on a Dionex Bio-LC equipped with a CarboPac-MA1 column.
Immunoprecipitation-Partially purified OGT from rat liver (through step 4, and desalted as described; Ref. 17) was incubated with AL-25 or preimmune IgG (see above) on ice for 3 h. The IgGs were then precipitated with protein A-Sepharose CL4-B (Pharmacia) in Tris-buffered saline plus 0.1% Tween 20 (TBST). The resin was washed extensively in TBST, followed by a desalting wash in 20 mM Tris, pH 7.8, 20% glycerol (desalt buffer), while the supernatant was desalted over a 1-ml Sephadex G-50 (Sigma) column in desalt buffer. The resin and supernatant were then resuspended to equivalent volumes in either SDS-PAGE sample buffer for Western blot analysis or desalt buffer for OGT activity assay as described previously (17).
General Recombinant DNA Techniques-Restriction endonuclease digestions and ligations were carried out as described (25). Plasmids were isolated using Wizard Prep Kits (Promega) according to the manufacturer's directions.
Polymerase Chain Reaction (PCR)-Two oligonucleotides were synthesized (ATGGGAAATACTTTGAAA = forward, ATGGATTATATATCACT = reverse), and PCR was carried out in 50-μl reactions containing ≈10^7 λ-ZAPII rat liver cDNA library (no. 936513, Stratagene) phage, 2.5 mM MgCl2, 50 mM KCl, 10 mM Tris, pH 9.0, 0.1% Triton X-100, 200 μM dNTPs, 1.5 μM each primer, and 2.5 units of Taq DNA polymerase (Promega). The λ-ZAPII phage were first denatured by boiling for 10 min in dH2O and then added to the PCR reactions and amplified in a DNA Thermal Cycler (MJ Research Inc.) using a step gradient annealing temperature cycle starting at 48°C (1 min) and decreasing 0.5°C/cycle for 20 cycles; an additional 20 cycles were performed at a 38°C (1 min) annealing temperature. A 65°C (2 min) elongation step followed by a 92°C (1 min) denaturing step was used in all cycles. The PCR products were resolved on a 1% agarose gel containing ethidium bromide (0.5 μg/ml), gel-purified using a silica suspension (26), and subcloned into the pGEM-T cloning vector (Promega).
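The step-gradient annealing program described above can be written out cycle by cycle. The following is a minimal sketch in Python, not the program actually entered into the thermal cycler: the function name `annealing_schedule` and the constant names are hypothetical, and the total time counts only the stated hold times (ramp times between steps are not given in the text and are ignored).

```python
# Sketch of the step-gradient ("touchdown") annealing schedule described in the text.
# All names are illustrative assumptions; only the temperatures, times, and cycle
# counts come from the protocol.

def annealing_schedule(start_c=48.0, step_c=0.5, gradient_cycles=20,
                       final_c=38.0, final_cycles=20):
    """Return the annealing temperature (deg C) for every cycle, in order."""
    # Cycle 1 anneals at start_c; each later gradient cycle is step_c cooler.
    gradient = [start_c - step_c * i for i in range(gradient_cycles)]
    # The remaining cycles all anneal at the fixed final temperature.
    plateau = [final_c] * final_cycles
    return gradient + plateau

# Hold times per cycle, in minutes, as stated in the protocol.
ANNEAL_MIN = 1     # annealing at the scheduled temperature
ELONGATE_MIN = 2   # elongation at 65 deg C
DENATURE_MIN = 1   # denaturation at 92 deg C

schedule = annealing_schedule()
total_minutes = len(schedule) * (ANNEAL_MIN + ELONGATE_MIN + DENATURE_MIN)

if __name__ == "__main__":
    for cycle, temp in enumerate(schedule, start=1):
        print(f"cycle {cycle:2d}: anneal {temp:.1f} C, elongate 65 C, denature 92 C")
    print(f"total hold time: {total_minutes} min")
```

Under these assumptions the last gradient cycle anneals at 38.5°C, so the program steps down smoothly into the 20 fixed cycles at 38°C.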
DNA Sequencing-Subcloned PCR products and excised cDNA λ-ZAPII clones were subjected to double-stranded DNA sequencing using deoxyadenosine 5′-[α-[35S]thio]triphosphate and Sequenase II (U. S. Biochemical Corp.), as described by the manufacturer. Additional sequence information of the cDNA clones was obtained by automated DNA sequencing on an Applied Biosystems model 373A automated DNA sequencer.
Screening the cDNA Libraries-Two rat liver cDNA λ-ZAPII libraries (nos. 936513 and 936507, Stratagene) were plated on XL-1 Blue host cells (Stratagene) and screened by hybridization in 50% formamide (25), with the PCR-22b probe (see above) labeled to a specific activity of ≈8 × 10^8 dpm/μg using the Ready⋅To⋅Go DNA Labeling Kit (Pharmacia). Hybridization was performed at 42°C, and the filters were washed in 0.2× SSC at 70°C. Nine positive clones were isolated, ranging in size from 1 to 3.2 kb (designated H1-H9). The longest, H1 (Fig. 1A), contained a putative start site, 310 base pairs of upstream untranslated DNA, and an open reading frame encoding 958 residues; however, no in-frame stop codon was present. A rat hippocampus λ-ZAPII library kindly provided by Anthony Lanahan (Johns Hopkins School of Medicine, Baltimore, MD) was screened as above, except that a gel-purified SacI restriction enzyme fragment representing the 3′ end of the partial clone was used as probe (Fig. 1A); all other conditions were identical. Five clones were isolated (designated LTP1-LTP5), ranging in size from 2.5 to 4.6 kb. All of these clones overlapped the H1 clone by 0.3-2.2 kb, and all contained a poly(A) tail. A total of 8 × 10^5 recombinant phage from each library were screened. Positive clones were subjected to the in vivo excision protocol according to the manufacturers' directions to recover a pBlueScript plasmid containing the cDNA clone of interest for further analysis. A full-length cDNA clone was constructed by ligation of the 5′-end of the H1 clone to the 3′-end of LTP3 at a convenient XhoI site (Fig. 1A).
DNA and RNA Blot Analysis-Genomic-DNA analysis was performed on a prepared Zoo-Blot (Clontech) as per manufacturer's directions with a PCR-22b probe labeled as above. In addition, total genomic DNA was isolated from HEK293 cells (25), digested to completion with EcoRI, resolved on a 0.8% agarose gel, and transferred to a Nytran (Schleicher & Schuell) filter according to manufacturer's directions. This blot was probed with a PCR-22b probe as described above. Total RNA was isolated from the dissected tissue of 3-6-month-old Harlan Sprague Dawley rats using RNeasy Total RNA kit (Qiagen) according to manufacturer's directions. Each RNA (25 μg) was resolved on 1% agarose gels, transferred to a Nytran (Schleicher & Schuell) filter according to manufacturer's directions, and probed with the PCR-22b fragment (see above) as described (25).
Expression of the Ogt cDNA-The coding region of the GTF was assembled from two cDNA clone fragments at a unique XhoI site. The 5′-untranslated region was removed and replaced with a linker restoring the start codon, and a BamHI site was added 5′ to the start (Fig. 1B). The coding region was then subcloned as a BamHI/HindIII fragment into the polylinker of the pTrcHis Xpress vector (Invitrogen), designated pLK51, for expression in E. coli, or into pGW1 (a kind gift of Mike Lee), designated pLK61, for transient expression in mammalian cells. pLK51 was transformed into the XL-1 Blue (Stratagene) strain of E. coli and grown to midlog phase, and protein expression was induced by the addition of isopropyl-1-thio-β-D-galactopyranoside to a final concentration of 1 mM. Cells were harvested 6 h after induction, and protein was purified on Hi-Trap Chelating Columns (Pharmacia) under urea denaturing conditions as per manufacturer's instructions. HEK293 cells were grown in six-well plates for protein expression assays, or on glass coverslips for immunolocalization, in Dulbecco's modified Eagle's medium, 10% fetal bovine serum until 50% confluence. The cells were then transfected with pLK61 by calcium phosphate-mediated transfection (27) and grown for an additional 24-48 h to allow protein expression.
Immunofluorescence-Transfected HEK293 cells (see above) or CHO cells, grown on glass coverslips in Dulbecco's modified Eagle's medium/Ham's F-12 (1:1) supplemented with 10% fetal bovine serum until 50% confluence, were washed twice in serum-free medium and fixed in 4% formaldehyde for 30 min. The cells were then washed four times in phosphate-buffered saline, pH 7.5 (PBS), and permeabilized in 0.5% Triton X-100 in PBS for 5 min. Cells were washed in PBS and blocked with goat serum and 3% bovine serum albumin in PBS (1:3) for 15 min at 37°C. The cells were then incubated in primary antibody for 30 min at 37°C. Excess primary antibody was washed away with four 10-min incubations in PBS, and the cells were incubated in secondary antibody at room temperature for 30 min in the dark. The cells were rinsed as for the primary antibody and mounted onto slides in 0.1% paraphenylenediamine in 90% glycerol. Secondary antibody alone gave no signal, and no signal was observed in non-permeabilized cells. Primary antibodies, AL-25 or preimmune, were used at a 1:500 dilution. The secondary antibody, FITC-conjugated goat anti-rabbit IgG (Jackson ImmunoResearch Laboratories), was diluted 1:200; DAPI stain was used at a final concentration of 0.1 μg/ml. All antibodies were diluted in 3% bovine serum albumin in PBS.

RESULTS
The Gene Encoding p110 Is Evolutionarily Conserved-Fourteen unique peptide sequences were obtained from protease digests of the p110 subunit of OGT purified from rat liver (underlined in Fig. 1A). Polymerase chain reaction amplification and standard cDNA library screening techniques were combined to clone the gene encoding the p110 subunit (see "Experimental Procedures"). Two overlapping clones were isolated and the open reading frame was reconstructed at a convenient XhoI restriction site (Fig. 1B). The full-length cDNA contains a single open reading frame encoding a protein of 1037 residues, designated p110 OGT.
Computer searches of the standard GenBank™ data bases using the BLAST algorithm (28) revealed 61% sequence identity between p110 OGT and a hypothetical 1194-residue protein encoded at locus K04G7.3 of Caenorhabditis elegans (accession number U21320). A matrix plot of the predicted C. elegans protein and p110 OGT is shown in Fig. 2A. The homology extends through the entire clone with regions as long as 350 residues sharing >80% identity. A recently cloned gene in Arabidopsis, SPINDLY, involved in gibberellin signal transduction (29) also shares extensive homology with p110 OGT throughout the entire coding region. Thus, both K04G7.3 and SPINDLY are likely to encode homologues of p110 OGT. In addition, searches of the dbEST data base of expressed sequence tags (30) revealed homology between p110 OGT and the conceptual translation products of expressed sequence tags from human (R7594; 93% identity over 414 residues), Schistosoma mansoni (T14553; 65% identity over 442 residues), and rice (D24403; 67% identity over 326 residues).
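The identity figures quoted above are percentages of matching residues over an aligned region. As an illustration of the arithmetic only, the sketch below computes percent identity from two already-aligned sequences in Python. It is not the BLAST algorithm and does not reproduce the reported values: the function name `percent_identity` and the toy sequences are hypothetical, and the choice to skip gapped columns is an assumption (alignment programs differ on whether gaps count toward the alignment length).

```python
# Illustrative percent-identity calculation over a pairwise alignment.
# Assumption: columns containing a gap in either sequence are excluded from
# the comparison; other conventions divide by the full alignment length.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

if __name__ == "__main__":
    # Made-up toy alignment, not taken from p110 OGT or any real protein.
    print(round(percent_identity("MKT-AYL", "MKSWAYL"), 1))
```

In the toy alignment, six columns are ungapped and five of them match.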
The amino-terminal portion of p110 OGT shares homology with a diverse group of proteins all containing a common motif designated the TPR motif (18,19), while the carboxyl terminus shares no significant homology to any known protein in the data bases. Thus, the p110 OGT appears to consist of two distinct domains: the amino-terminal 463 residues containing 11.5 tandem repeats of the TPR motif (Fig. 1C), and the carboxyl-terminal 563 residues representing a novel polypeptide perhaps encoding the catalytic activity.
Ogt Is Not a Member of a Multigene Family and Is Present in Many Organisms-Southern blot analysis was used to determine if the OGT clone represents a family of O-GlcNAc transferase genes. Rat genomic DNA was digested with several restriction enzymes (BamHI, EcoRI, HindIII, PstI, and SacI) and probed with an 850-base pair fragment (PCR-22b, see Fig. 1B). Only one or two bands of equal intensity are seen in most lanes (Fig. 2B). The EcoRI lane has three bands, which is consistent with the expected restriction pattern of the cloned cDNA. The absence of several bands of varying intensity in each lane indicates that the Ogt gene is not a member of a closely related multigene family.
To further examine the level of conservation among higher eukaryotes, genomic DNA from rat, mouse, dog, cow, rabbit, and human was probed with PCR-22b (see above). A specific signal is seen in all lanes (Fig. 2C), demonstrating that a single related gene is present in many higher eukaryotes.
OGT Activity Is Immunoprecipitated from Rat Liver Extracts by an Antibody against Recombinant p110 OGT Expressed in E. coli-Polyclonal rabbit antibody (designated AL-25) was prepared against purified, recombinant p110 OGT overproduced in E. coli (see "Experimental Procedures"). AL-25 immunoglobulin G (IgG) is highly specific for the OGT and shows no cross-reactivity to other proteins present in a partially purified preparation of rat liver OGT designated the Q-Sepharose pool (17) (Fig. 3A, compare lanes 1 and 2). Preimmune IgG shows no reactivity (data not shown). On Western blots, AL-25 antibody recognizes both the p110 and the p78 subunits of the rat liver OGT (lane 2), suggesting that p110 and p78 are related at the polypeptide level. Similar results were obtained with antibodies raised against synthetic peptides derived from the p110 subunit sequence (data not shown).
Both the p110 and p78 subunits of the native OGT are immunoprecipitated from the Q-Sepharose pool using AL-25, while no protein is precipitated using preimmune IgG (lanes 3 and 4). The pellets and supernatants from the immunoprecipitation were assayed for OGT enzyme activity (17). OGT enzymatic activity is precipitated from the Q-Sepharose pool using AL-25, while no activity is precipitated by buffer alone or preimmune IgG (Fig. 3B). These studies demonstrate that the cloned cDNA indeed represents the p110 subunit of the rat OGT.
Levels of OGT RNA, Protein, and Activity Vary in Different Tissues-Northern blot analysis (Fig. 4A) indicates that there are four transcripts of ~8.0, 6.0, 4.2, and 1.7 kb present in all rat tissues examined thus far. The 6.0-kb transcript is closest in size to the cloned cDNA (5.7 kb). The larger 8.0-kb transcript may be an alternate splicing product containing additional exons, as is seen in C. elegans, which has two distinct cDNAs representing alternative splicing events at the 5′ end of the message. The longer message produces a 130-kDa protein as predicted by Wilson et al. (31) (accession number U21320), while the smaller message would produce a 112-kDa protein. The pattern of protein expression was examined by Western blot analysis of 30% ammonium sulfate cytosolic pellets (30% pellet) from rat tissues using the AL-25 antibody (Fig. 4B). The p110 band is clearly detectable in all tissues except the kidney, while the p78 band is detectable only in kidney, liver, and muscle, and an 80-kDa band is present only in muscle. Additional bands are visible upon longer exposure (data not shown), including a faint 110-kDa band in kidney as well as a 190-kDa band in liver, indicating that there are several Ogt-derived proteins present in most tissues.
The 30% pellets were also assayed for enzyme activity (17); the results are shown in Fig. 4C. All the extracts contained enzymatically active OGT, with brain and thymus having the highest specific activities and liver the lowest. However, the amount of enzyme activity did not always correlate well with the amount of OGT protein present in each extract, or with the transcript levels seen in each tissue.
OGT Is Modified by Tyrosine Phosphorylation and O-GlcNAcylation-Western blot analysis of the Q-Sepharose pool using an anti-phosphotyrosine antibody shows that the p110 and p78 subunits are immunoreactive. This reactivity is blocked by the addition of 10 mM phosphotyrosine (Fig. 5A, compare lanes 1 and 2), but not 10 mM tyrosine (data not shown). Similar experiments using antibodies against phosphoserine and phosphothreonine showed no immunoreactivity (data not shown). Examination of the p110 OGT amino acid sequence indicates that there is only one well conserved receptor protein-tyrosine kinase phosphorylation site, Tyr 979 (32, 33) (outlined in Fig. 1A).
To determine if OGT was itself modified by O-GlcNAc, highly purified OGT was probed with galactosyltransferase (see "Experimental Procedures"). Galactosyltransferase is a specific probe for terminal GlcNAc residues (24,34) that is commonly used to detect O-GlcNAc by covalently labeling the GlcNAc with UDP-[3H]galactose. Both the p110 and the p78 subunits are labeled with [3H]galactose (Fig. 5A, lane 3), indicating that they are modified by GlcNAc. The labeled p110 band was excised from the gel and subjected to alkaline β-elimination. The label was released by this treatment, indicating that the sugar was an O-linked glycan (data not shown). The released sugars were then analyzed by HPAE-PAD chromatography (see "Experimental Procedures"). The radioactivity was seen to migrate with the disaccharide alditol of Galβ1,4GlcNAc (Fig. 5B), indicating that p110 is modified by O-GlcNAc.
Overexpression of p110 OGT in Human Cells Increases OGT Enzyme Activity-Overexpression of the Ogt cDNA in HEK293 cells by transient transfection produces a protein that comigrates with the p110 subunit of purified liver OGT and is recognized by AL-25 IgG (Fig. 6A, compare lane 1 to lanes 4 and 5). The endogenous OGT is not visible at the short exposure time shown (lanes 2 and 3); a faint p110 band is seen in the vector alone controls upon longer exposure (data not shown). The protein level of p110 OGT increases dramatically over time (compare lanes 4 and 5); however, overexpression of the p110 OGT in HEK293 cells gives only a modest (20-30%) increase in OGT activity over control cells transfected with vector alone (Fig. 6B). To address the possibility that mislocalization of the overexpressed p110 OGT was preventing high levels of activity in transfected cells, we examined the localization of OGT in transfected HEK293 cells (see below). There are differences in the pattern of expression that may account for the unusually small increase in enzyme activity of the transfected cells. It is also possible that additional factors not in abundance in HEK293 cells are required for the activity of OGT, or that these cells are down-regulating the activity of the overexpressed OGT by some as yet unknown mechanism.
OGT Is Present in Both the Cytosol and Nucleus-Immunolocalization of p110 OGT in CHO cells using AL-25 IgG shows that the p110 OGT is located specifically in the nucleus and cytosol (Fig. 7, panels a-c), where previous studies have shown the activity is located (15,35). The nucleus, where most of the O-GlcNAcylated proteins are found (36), stains evenly and brightly, while the cytosol shows a weak, diffuse, punctate staining pattern. No reactivity is seen in CHO cells using preimmune IgG (panels d-f). HEK293 cells overproducing p110 OGT during a transient transfection show a somewhat different pattern of expression (panels g and h). The cytosolic staining in these cells is also punctate but it is significantly more intense. In addition, the level of nuclear staining is much reduced (compare panels b and h). Several non-transfected cells in the same field have no significant signal (panels g-i).

DISCUSSION
The OGT cloned in the present study displays several features that are unique and also provides clues with respect to the general functional significance of the O-GlcNAc modification. 1) The high evolutionary conservation of the enzyme suggests that it has a fundamental cellular function. 2) The enzyme's nuclear and cytosolic localization is consistent with its action on a myriad of proteins in both compartments. 3) The tyrosine phosphorylation of the enzyme implies that it may be regulated by one or more of the receptor tyrosine kinases, linking O-GlcNAcylation to signal transduction cascades (37,38). 4) The presence of multiple TPR repeats suggests a two-binding site model for the regulation of the O-GlcNAcylation of proteins (Fig. 8).
The Ogt gene described above represents a novel glycosyltransferase that has no structural or sequence similarities to any previously described glycosyltransferase (39). It has been highly conserved among higher eukaryotes such as rats, nematodes, and plants. This level of conservation indicates strong evolutionary pressure and suggests that Ogt encodes a protein with an essential cellular function. However, aside from the common TPR motif domain, p110 OGT does not share any significant homology at the primary sequence level with any protein in the Saccharomyces cerevisiae data bases. OGT also shares no sequence homology with the α-toxin from Clostridium novyi that catalyzes the incorporation of O-GlcNAc into the Rho family of proteins in a manner very analogous to OGT (40). However, the α-toxin does share some sequence homology with an uncharacterized open reading frame from yeast (accession no. Z73530). Thus, a protein unrelated to p110 OGT at the amino acid level could perform a similar enzymatic function in yeast or other eukaryotes.
Southern blot analysis and sequence comparisons indicate that Ogt is not a member of a closely related gene family. However, O-GlcNAc is found on a diverse group of proteins at a multitude of glycosylation sites. Thus, it seems unlikely that one enzyme could be responsible for the specific addition of O-GlcNAc to all these proteins. We cannot rule out the possibility that a family of OGT proteins exists, which, like the Golgi glycosyltransferases, share no significant sequence homology, only structural similarity (39,41). Alternatively, there may exist a mechanism for the regulation of a single OGT enzyme, which could confer both temporal and substrate specificity in response to cellular signals.
While there is only one Ogt gene, there are multiple transcripts and proteins related to p110 OGT, some of which are tissue-specific. These related proteins likely arise from one gene by a combination of alternative RNA splicing and specific proteolysis. The presence of the p110 OGT subunit in nearly every tissue examined leads us to postulate that this form of the enzyme provides the majority of the basal cellular OGT activity. However, the levels of activity in the various tissues do not always correlate with the p110 protein and RNA levels, indicating that additional factors regulate the activity of OGT. These additional factors may be limiting when the p110 OGT is overexpressed in mammalian cells. Thus we do not see a proportional increase in OGT activity with protein expression.
The expression of tissue-specific forms of OGT is one mechanism by which the substrate specificity of the enzyme could be regulated. Posttranslational modifications of OGT present another mechanism for modulating activity or specificity. We have shown that OGT is modified by O-GlcNAc. In addition, we have shown that OGT is modified by tyrosine phosphorylation. A single receptor protein-tyrosine kinase phosphorylation consensus site is present in the putative catalytic domain, where it could act as a regulatory modification modulating the activity of OGT in response to signal transduction cascades (38).
The presence of TPR motifs in p110 OGT is interesting, as TPRs have been found in a large number of proteins of diverse function and are believed to play a role in modulating a variety of cellular processes, including cell cycle (42-44), transcription regulation (45-47), and protein transport (48). Direct evidence for TPR-mediated protein-protein interactions regulating cellular functions is seen for the yeast transcription factor Cyc8. Cyc8 contains 10 TPR domains that are directly involved in recruitment of the Cyc8-Tup1 co-repressor regulating transcription from a distinct set of genes (47). Other examples are the yeast Cdc proteins Cdc16p, Cdc23p, and Cdc27p, which directly interact with each other via their TPRs in a sequence-dependent manner during mitosis. Mutational analysis of the TPR domains of the Cdc proteins shows that a given TPR modulates a specific protein interaction (43). The ability of TPRs to regulate cellular processes via protein-protein interactions suggests that the TPR motifs of p110 OGT could mediate specific protein interactions with accessory proteins, thereby modulating the activity or specificity of OGT. The tyrosine phosphorylation and O-GlcNAc modifications of OGT would provide an additional level of regulation.
A model for OGT regulation, combining the TPR accessory proteins and the posttranslational modifications, is presented in Fig. 8. In this model OGT has a basal level of activity for a narrow range of substrates. Binding of TPR accessory proteins would allow O-GlcNAcylation of additional specific substrates. The basal activity of OGT is up-regulated by changes in the phosphorylation or O-GlcNAcylation state of the protein. This model is not without precedent, as both the activity and specificity of RNA polymerase II are regulated by a large array of transcription factors as well as posttranslational modifications. Although RNA polymerase II does not contain TPR motifs, many of the transcription factors required for activation bind directly to the protein (reviewed in Refs. 49-51). In addition, both O-GlcNAcylation (12,13,52) and phosphorylation (52) of the COOH-terminal domain of RNA polymerase II have been documented. Additional study of OGT will allow further elucidation of the mechanisms regulating O-GlcNAcylation and will facilitate the direct evaluation of O-GlcNAc's functions in cellular metabolism.
Acknowledgments-We thank Bill Kelly for work on the characterization of the C. elegans transcripts, Anthony Lanahan from the Department of Neuroscience at Johns Hopkins Medical School for generously providing the rat hippocampus library, Betty Jean Earles for peptide sequencing and synthesis, and all the members of the Hart laboratory for helpful discussions.