Identification and characterization of a missense mutation in the O-linked β-N-acetylglucosamine (O-GlcNAc) transferase gene that segregates with X-linked intellectual disability

O-GlcNAc is a regulatory post-translational modification of nucleocytoplasmic proteins that has been implicated in multiple biological processes, including transcription. In humans, single genes encode enzymes for its attachment (O-GlcNAc transferase (OGT)) and removal (O-GlcNAcase (OGA)). An X-chromosome exome screen identified a missense mutation, which encodes an amino acid in the tetratricopeptide repeat, in OGT (759G>T (p.L254F)) that segregates with X-linked intellectual disability (XLID) in an affected family. A decrease in steady-state OGT protein levels was observed in isolated lymphoblastoid cell lines from affected individuals, consistent with molecular modeling experiments. Recombinant expression of L254F-OGT demonstrated that the enzyme is active as both a glycosyltransferase and an HCF-1 protease. Despite the reduction in OGT levels seen in the L254F-OGT individual cells, we observed that steady-state global O-GlcNAc levels remained grossly unaltered. Surprisingly, lymphoblastoids from affected individuals displayed a marked decrease in steady-state OGA protein and mRNA levels. We observed an enrichment of the OGT-containing transcriptional repressor complex mSin3A-HDAC1 at the proximal promoter region of OGA and correspondingly decreased OGA promoter activity in affected cells. Global transcriptome analysis of L254F-OGT lymphoblastoids compared with controls revealed a small subset of genes that are differentially expressed. Thus, we have begun to unravel the molecular consequences of the 759G>T (p.L254F) mutation in OGT that uncovered a compensation mechanism, albeit imperfect, given the phenotype of affected individuals, to maintain steady-state O-GlcNAc levels. Thus, a single amino acid substitution in the regulatory domain (the tetratricopeptide repeat domain) of OGT, which catalyzes the O-GlcNAc post-translational modification of nuclear and cytosolic proteins, appears causal for XLID.


Intellectual disability (ID) affects 1-3% of the world population (1-4). ID is a leading socio-economic health care problem in Western countries, as reported by the Centers for Disease Control and Prevention (5, 6), because of the lifetime support that affected individuals require. ID is characterized by an intelligence quotient of 70 or lower. In addition, affected individuals exhibit two or more behavioral deficits in social, conceptual, or practical adaptation (3, 7). The causes of ID are varied, ranging from malnutrition during pregnancy to chromosomal abnormalities (8). Recently, deletions, duplications, and missense mutations have been the focus of the field, given the availability of the human genome sequence (9) and the subsequent explosion in sequencing technologies. Monogenic causes have been mainly attributed to genes found on the X-chromosome. Between 5 and 10% of ID in males is inherited in an X-linked pattern (10), and epidemiological studies consistently show a 30-50% excess of males over females diagnosed with ID. To date, mutations in at least 102 genes result in 81 of the known 160 X-linked intellectual disability (XLID) syndromes (10). However, the cause of ID in nearly 50% of males with XLID remains unknown (10).
Nuclear and cytoplasmic proteins can be modified with O-linked β-N-acetylglucosamine (O-GlcNAc) on the hydroxyl groups of serines or threonines to influence various intracellular processes (11-16). The O-GlcNAc modification is a dynamic and inducible post-translational modification that cycles on and off intracellular proteins in response to the cellular environment, similar to protein phosphorylation (17-19). In mammals, individual genes encode the enzymes responsible for the addition and removal of O-GlcNAc, O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), respectively (20-22). The sugar nucleotide UDP-GlcNAc, which is the end product of the hexosamine biosynthetic pathway and whose cellular concentrations are responsive to nutritional status, is the donor substrate for OGT (23-25). O-GlcNAc modification of transcription factors, chromatin remodeling enzymes, and histones has established this modification as a regulator of gene expression (26). The O-GlcNAc modification has also been associated with several human diseases that involve changes in metabolism, including cancer, diabetes, and Alzheimer's disease (25, 27-30).
The O-GlcNAc modification of proteins is regulated by the opposing actions of OGT and OGA (11-16). OGT is encoded by a single gene (OGT) that maps to chromosome Xq13.1 (31). OGT was cloned and partially characterized in 1997 (20, 32, 33) and contains a C-terminal catalytic domain (20, 32) and an N-terminal tetratricopeptide repeat (TPR) domain that varies in length (33, 34). This TPR domain is thought to regulate protein-protein interactions and to bring the catalytic domain into proximity to its protein substrates (35, 36). Murine ogt knockouts are embryonic lethal, demonstrating its requirement for cell survival (31). OGA is encoded by a single gene (MGEA5) located on chromosome 10q24.1-24.3 (37) and was cloned (38) and partially characterized in 2002 (39). OGA has a catalytic N-terminal O-GlcNAcase domain and a C-terminal domain with low sequence identity to histone acetyltransferase that lacks its enzymatic activity (40). Oga homozygous null mice are perinatal lethal (41).
In this study, we have characterized a missense mutation in OGT (759G>T (p.L254F)) that segregates with XLID in a family. This mutation results in an amino acid substitution in the TPR domain of OGT (p.L254F). We demonstrate that the protein is active but unstable. Surprisingly, O-GlcNAc levels remain unaltered in the L254F-OGT-expressing cells. Reduced OGA protein levels compensate for lowered OGT protein levels in the cells derived from XLID individuals. OGA protein levels are decreased by means of transcriptional regulation, because both OGA mRNA and promoter reporter expression are decreased in XLID lymphoblastoid cells. OGT, O-GlcNAc-modified proteins, mSin3A, and HDAC1 are all enriched at the OGA promoter in L254F-OGT individual cells. These results suggest that L254F-OGT regulates OGA gene expression in XLID lymphoblastoids in a co-repressor fashion. In parallel, RNA deep sequencing has revealed a small subset of genes that are regulated in a disorder-specific context, including several chromatin components. Taken together, these data begin to delineate the molecular impact of the L254F substitution on OGT function and uncover a cellular compensation mechanism, albeit imperfect, for maintaining global O-GlcNAc levels.

Mutations in OGT segregate with XLID
Exome sequencing of multiple members of family K9427 revealed a mutation (759G>T (p.L254F)) encoding an amino acid in the seventh TPR of OGT that segregated perfectly with XLID (Fig. 1A). A total of 26 single nucleotide substitutions in 22 annotated genes were identified after filtering by affected kindred/cross-cohort analysis. Among the 26 variants, 14 were located in deep introns far from conserved splicing consensus sequences, and seven were located within 3′-UTR regions. Among the five variants located within coding regions, two resulted in no change in amino acids, and three resulted in missense variants. Among the three missense variants, one, GABRE-p.R87C, did not segregate with the ID phenotype in the proband's family. A second missense change, p.R175C, in PRICKLE3 did segregate and had a CADD score of 15.9, which could be indicative of pathogenesis. However, the gene does not appear to be expressed in the brain based on data in UNIGENE and the lack of amplification using a multiple tissue cDNA panel (data not shown). In contrast, the OGT variant, p.L254F, did segregate with the ID phenotype in family K9427, and bioinformatic analysis (CADD score of 26.5) as well as the presence of expression in the brain made this variant the only likely candidate for the phenotype in K9427; it was therefore pursued further. Three family members displayed ID (Fig. 1A and Table 1). The affected males (II-4, III-4, and IV-1; Fig. 1) spanned three generations and also presented with a small head circumference, 5th finger clinodactyly, and genital abnormalities. Other minor findings were also noted (Table 1). Lymphoblastoids from affected individuals and controls (II-1, -3, -7, and -8; Fig. 1) were isolated for further studies. Corroborating the prediction of causality was the fact that all examined carrier females exhibited highly skewed X-inactivation (98-100% in patients II-5, -6, and -8 and III-3; Fig. 1), as has been previously seen in many families with XLID (42).

L254F-OGT protein is unstable
We first modeled the impact of the L254F substitution on the known structure of the human OGT TPR repeats. The known structure of the OGT TPR repeats (PDB code 1W3B) was used to calculate ΔΔG for the L254F variant. The ΔΔG predictions for the variant, obtained from seven different web services, gave an average loss in stability of 0.72 kcal/mol, with six of the seven programs predicting that the L254F variant was destabilizing (Table 2). The average predicted ΔΔG upon substitution is relatively small, indicating that the single amino acid replacement probably does not grossly affect overall protein stability. Consistent with this finding, we visualized the structures of the wild-type and L254F variant OGT TPR domain with UCSF Chimera, with the substitution sites marked in red (Fig. 1B). The side chain conformations of the residues within 3 Å of the wild-type position or variant position are shown with a stick representation. The amino acid replacement is located in a helix region, but it is also close to a tight turn connecting two helices. The wild-type Leu residue is almost totally buried with low accessible surface area. Thus, the substitution with Phe, which is a bulkier residue, would be difficult for the protein to accommodate. This is consistent with the folding free energy predictions that L254F is a destabilizing variant of OGT.
Because modeling predicted that the L254F-OGT protein is partially unstable, we examined steady-state levels of OGT in both affected and control lymphoblastoids. Immunoblotting with multiple antibodies to OGT showed that affected males P1 (III-4) and P2 (II-4) exhibited significantly lower OGT steady-state levels (Fig. 2A), compared with a heterozygous female carrier CA (II-8) as well as unaffected control males C1 (II-1), C2 (II-3), and C3 (II-7) (Fig. 2B). We used two different loading controls, β-actin and α-tubulin, to account for any possible changes in housekeeping proteins. We next investigated the half-life of L254F-OGT to explain the decrease in steady-state levels observed in affected samples. Following blocking of translation, the L254F-OGT in affected P1 turned over faster than the WT OGT in control C1 (Fig. 2C). The half-life of WT OGT in normal control C1 was 13 h, whereas the half-life of L254F-OGT in affected male P1 was 5.5 h (Fig. 2C). β-Actin was used as the loading control due to its extended half-life of 2-3 days (43). Thus, L254F-OGT appears to be a slightly unstable protein.
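The two reported half-lives can be compared with a simple first-order decay model. The sketch below is illustrative only: it assumes single-exponential decay once translation is blocked and, for the steady-state estimate, identical synthesis rates for WT and L254F-OGT. Neither assumption is established in the text, and the function names are hypothetical.

```python
def fraction_remaining(t_hours, half_life_hours):
    """Fraction of protein left after t hours, assuming first-order decay."""
    return 0.5 ** (t_hours / half_life_hours)


def steady_state_ratio(half_life_variant, half_life_wt):
    """Variant/WT steady-state level, ASSUMING equal synthesis rates.

    At steady state, level = synthesis_rate / k_deg with k_deg = ln(2) / half-life,
    so the ratio of levels reduces to the ratio of half-lives.
    """
    return half_life_variant / half_life_wt


WT_HALF_LIFE = 13.0    # h, reported for control C1
L254F_HALF_LIFE = 5.5  # h, reported for affected P1

# Time points match the cycloheximide harvest times given in the methods.
for t in (0, 1, 2, 4, 8, 16, 24):
    wt = fraction_remaining(t, WT_HALF_LIFE)
    mut = fraction_remaining(t, L254F_HALF_LIFE)
    print(f"{t:>2} h  WT {wt:.2f}  L254F {mut:.2f}")

ratio = steady_state_ratio(L254F_HALF_LIFE, WT_HALF_LIFE)
print(f"predicted L254F/WT steady-state ratio: {ratio:.2f}")
```

Under these assumptions the model predicts an L254F/WT steady-state ratio of about 0.42. The immunoblots are not quantified in the text, so this figure is a consistency check on the direction of the effect, not a reported result.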

L254F-OGT protein is an active enzyme
We next wanted to explore whether the OGT variant protein had enzymatic activity. When recombinant L254F-OGT is exogenously introduced into HEK293F cells via transient transfection, it is expressed and capable of elevating O-GlcNAc levels on nucleocytoplasmic proteins comparable with WT OGT overexpression (Fig. 3A). Immunopurified recombinant wild-type and L254F variant proteins from HEK293F cells were also catalytically competent using a CK2α-derived peptide as an acceptor (data not shown). We also evaluated purified, recombinantly expressed WT and L254F variant proteins from an E. coli expression system, as well as a catalytically inactive variant (K852M) (44), for activity toward recombinantly expressed CK2α protein and saw no difference in the ability of wild type and L254F to glycosylate CK2α protein in vitro (Fig. 3B). Because OGT has also been demonstrated to have catalytic protease and glycosylation activity toward HCF-1 (45-47), we examined these activities for the wild type and L254F variant but found no differences between the WT and variant enzyme (Fig. 3C). Thus, L254F-OGT appears to be an active enzyme.

Steady-state O-GlcNAc levels remain unaltered in XLID lymphoblastoids
Considering that steady-state levels of OGT are decreased in L254F-OGT-expressing lymphoblastoids, we expected to observe a global decrease in O-GlcNAc levels. Surprisingly, global O-GlcNAc levels, as detected by immunoblotting with anti-O-GlcNAc antibodies, remained unaltered in affected males (P1 and P2) relative to both the female carrier (CA) and unaffected male relatives (C1, C2, and C3) (Fig. 4). We probed for O-GlcNAc-modified proteins using multiple anti-O-GlcNAc antibodies (110.6 and mAb14 antibodies (shown in Fig. 4) and mAb10, mAb3, and RL-2 (data not shown)). Thus, steady-state O-GlcNAc levels remain unaltered in the L254F-OGT-containing lymphoblastoids compared with controls.

Steady-state OGA protein and mRNA levels and OGA promoter expression are decreased in L254F-OGT XLID lymphoblastoids
As a result of global O-GlcNAc persisting at normal levels, we examined the other cycling enzyme of this dynamic modification in the L254F-OGT individual cells. Interestingly, OGA steady-state protein levels were severely diminished in affected individuals (P1 and P2) when compared with a female carrier (CA) and unaffected male relatives (Fig. 5, A and B). Thus, we have apparently uncovered a compensation mechanism that is in play in affected lymphoblastoids that accounts for the lack of change in O-GlcNAc levels in the XLID cell lines with reduced OGT levels.
To investigate this reduction in OGA, we asked whether the decrease in OGA steady-state levels was due to a decrease in transcription. We performed quantitative RT-PCR to assess mRNA levels of OGA. XLID lymphoblastoids exhibited a significant decrease in OGA mRNA (Fig. 5C). We also performed cycloheximide half-life assessment and ruled out the possibility that lowered OGA protein levels might be due to protein turnover (data not shown), because there was no change in protein stability during the 24-h treatment. To further strengthen our hypothesis that OGA is transcriptionally down-regulated, we transfected a 2-kb proximal promoter region of OGA tagged to a luciferase reporter into the XLID and control lymphoblastoids. After 48 h of transfection, we observed that there was significantly lower expression from the OGA promoter in the affected lymphoblastoids compared with the control (Fig. 5D). These results suggest that OGA is transcriptionally regulated by OGT.

Figure 2. A, immunoblotting (IB) of equal amounts of crude lysates displays a decrease in steady-state OGT levels in XLID lymphoblastoids (P1 and P2) when compared with a control female carrier (CA) using three independent antibodies to OGT. β-Actin immunoblotting was performed to confirm equal loading. B, immunoblotting of equal amounts of crude lysates from XLID lymphoblastoids (P1 and P2) displays less OGT protein when compared with control male relatives (C1-C3) or the control female carrier (CA) using a fourth α-OGT (11576-2-AP) antibody. α-Tubulin immunoblotting was performed to confirm equal loading. C, OGT has a reduced half-life in an XLID lymphoblastoid (P1) compared with control (C1) as measured by immunoblotting over time following blocking of translation with cycloheximide. Immunoblotting for β-actin, which has a half-life in excess of 48 h, was performed to confirm equal loading. Blots shown are representative of results from three independent replicates.

OGT, O-GlcNAc-modified proteins, and components of a chromatin silencing complex are enriched at the OGA promoter in XLID cells
The transcription-dependent down-regulation of OGA led us to further probe the OGA promoter region. ChIP was performed using OGT- and O-GlcNAc-specific antibodies and qPCR with primers designed to amplify the proximal promoter region of OGA. OGT was significantly enriched, whereas O-GlcNAc-modified proteins displayed a similar trend of enrichment at the XLID OGA promoter (Fig. 6, A and B). These results suggest that whereas there is less OGT protein in the whole cell, there is more OGT at the OGA promoter in the XLID lymphoblastoids. It has been established previously that OGT exists in co-repressor complexes that down-regulate gene expression (48). This led us to hypothesize that OGT might exist in a similar co-repressor complex at the OGA promoter. Thus, we tested for mSin3A and HDAC1 enrichment at the OGA promoter by performing ChIP using the same proximal promoter primers to OGA as described above. We observed an increase in both HDAC1 and mSin3A at the OGA promoter in affected cells compared with controls (Fig. 6, C and D). It has previously been demonstrated that both mSin3A and HDAC1 are substrates of OGT and that OGT activity is required for maximal gene silencing (48). These results suggest that L254F-OGT, as part of the mSin3A-HDAC1 complex, acts preferentially at the OGA promoter, compared with wild-type OGT, to down-regulate OGA transcription.

Subsets of differentially expressed genes segregate by XLID as revealed by global transcriptome analysis
Given the established role of OGT in transcription and the transcriptional down-regulation of OGA in the L254F-OGT-expressing cells, we elected to perform Illumina HiSeq 2000 RNA sequencing (Illumina, Inc., San Diego, CA) to see whether genes beyond OGA were being differentially regulated by the L254F-OGT. We chose to compare two unaffected males (C1 and C2) with the two affected males (P1 and P2) in the L254F-OGT XLID family. Three of the four males (C1, C2, and P2) are brothers, whereas the other affected male (P1) is a nephew. Using R software (for the algorithm, see the supplemental material) to perform a Spearman correlation (49), we compared each set of data with all of the other sets to determine which data sets showed the highest degree of similarity. We demonstrated that transcript expression segregates more tightly with disease than with generation (Fig. 7A). A plethora of genes involved in various cellular processes exhibited a differential expression pattern in the affected cells compared with the controls. Following the stringency applied to our data set described under "Methods," we were able to quantify 8800 genes (see supplemental Table 1). The data discussed in this publication have been deposited in the NCBI Gene Expression Omnibus (50). When we applied -fold filters, there were 349 genes that were differentially expressed 2-fold, 89 genes that changed 3-fold, and 38 genes that changed 4-fold (Fig. 7B). Given that our data set is from only four males, we generated two mock data sets by taking averages of groupings of our samples not based on disease (see "Experimental procedures" and supplemental Table 1). When we compared the mock set with the set sorted by XLID diagnosis, we observed a 1.3-fold enrichment of disease over natural variation in the human subjects for all 8800 genes. Of note, there was a 3.9-fold enrichment of disease over natural variation in the list of genes changing 3-fold in expression (Fig. 7, B and D, and Tables 3 and 4). In total, we saw ~1% of all quantified genes being differentially regulated in the affected versus the controls in the 3-fold group (Fig. 7C). Of the 89 genes in the 3-fold group, 67% were up-regulated in disease, whereas 33% were down-regulated (Fig. 7C and Tables 3 and 4). We orthogonally validated HIST1H4B (histone H4) to be up-regulated and HIST1H3A (histone H3.1) to be down-regulated in the XLID samples, as seen in the transcriptome data (data not shown). This analysis, in conjunction with the previously described role of OGT in transcriptional regulation under various settings, demonstrates that OGT influences the expression of a subset of genes in the L254F-OGT XLID cells.
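The sample-to-sample comparison and the -fold filters described above can be sketched in a few lines. This is not the authors' R code (which is in their supplemental material); it is a minimal pure-Python illustration of a rank-based (Spearman) correlation and a symmetric fold-change filter. The expression values are hypothetical, and the function names are introduced here only for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def passes_fold_filter(control_mean, affected_mean, fold):
    """True if expression differs by at least `fold` in either direction."""
    return abs(math.log2(affected_mean / control_mean)) >= math.log2(fold)


# Hypothetical expression values for five genes in four samples (not study data).
samples = {
    "C1": [10.0, 52.0, 3.1, 80.0, 7.0],
    "C2": [11.0, 50.0, 3.0, 85.0, 7.5],
    "P1": [10.5, 15.0, 9.8, 82.0, 7.2],
    "P2": [10.2, 17.0, 9.0, 79.0, 7.1],
}
names = list(samples)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(spearman(samples[a], samples[b]), 2))

print(round(math.log2(3), 2))             # 3-fold threshold on the log2 scale
print(passes_fold_filter(51.0, 16.0, 3))  # second gene, group means 51 vs 16
```

Working on the log2 scale makes the filter symmetric, so a 3-fold increase and a 3-fold decrease are treated alike. The reported counts are also easy to check by hand: 89 of 8800 quantified genes is about 1%.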

Discussion
Although both XLID and O-GlcNAc have been studied for several decades, we have for the first time characterized a mutation in OGT that segregates with XLID. O-GlcNAc modification of almost 2000 proteins impacts several crucial cellular processes, and there is only one enzyme (OGT) for its addition (51). A novel mutation has been identified in OGT by focused X-chromosome exome sequencing in XLID-affected individuals. The observation that a missense mutation in OGT segregates with disease is the first example of the OGT enzyme being directly linked to a disease. OGT is highly expressed in the brain (20), and a recent study has demonstrated that placental OGT is crucial for hypothalamic gene expression in the mouse neonatal brain (52).
The X-chromosome only contains 4% of all human genes; however, 10% of all Mendelian diseases are assigned to the X-chromosome (53). X-linked inheritance pattern was observed in ID as early as the 1930s (54), and it was only in the 1980s that the first successes were seen in gene mapping and causal gene identification (55)(56)(57). Subsequent studies have consistently illustrated a 30% excess in males with XLID (10). 1:600 to 1:1000 males of the total world population are afflicted with XLID (1, 10), and only 50% of the cases have been assigned a defective gene (10). The remaining cases are of unknown etiology.
In this study, we have identified a mutation in OGT, resulting in an amino acid replacement in the TPR domain region of the encoded enzyme, that is associated with XLID in a family (Fig. 1 and Table 1). For L254F-OGT, we demonstrated that OGT steady-state protein levels are decreased in lymphoblastoids from affected individuals as a result of reduced half-life (Fig. 2). This finding validates the bioinformatic predictions of instability for this variant of OGT (Table 2). Complete loss of OGT is lethal in mammals (31). Therefore, we moved forward with the knowledge that OGT would probably be active in cells derived from affected individuals and validated that the variant is indeed catalytically competent (Fig. 3). Given that the mutation resulted in a variant in the regulatory protein-protein interaction TPR domains, exploration of the OGT interactome of L254F-OGT compared with wild-type OGT is warranted, given that such interactions are thought to temporally and spatially regulate OGT (58) and that several OGT interactors, such as HUWE1, HCFC1, and histone deacetylases, have been associated with XLID.

Figure 7. A, Spearman correlation analysis shows segregation of RNA-seq data by disease (P1 and P2 expression data are more similar to one another than to either control, and this is also true for C1 and C2). B, the vast majority of transcripts change by <3-fold (log2 = 1.58). C, approximately 1% of quantifiable transcripts are altered 3-fold in XLID lymphoblastoids (P1 and P2) compared with control males (C1 and C2). D, number of quantifiable genes under differential -fold expression sets by disease versus natural variation.
Notably, global O-GlcNAc levels remained unaltered in all affected lymphoblastoids compared with controls (Fig. 4). This was a surprising finding for L254F-OGT, given the reduction observed in steady-state enzyme levels. This led us to evaluate the other cycling enzyme of the O-GlcNAc modification, OGA, in these cells. Steady-state OGA protein levels were concomitantly lowered in XLID, leading to the observation that a compensation mechanism exists in the affected lymphoblastoids (Fig. 5). This result suggests that there is an active mechanism for attempting to maintain global O-GlcNAc levels in lymphoblastoids. Further evaluation is required to elucidate whether this compensation mechanism is cell type-specific, especially in affected tissues of the XLID individuals.
Reduction of OGA protein levels in L254F-OGT lymphoblastoids was due to a decrease in OGA transcription, as seen by reduced transcript levels in affected cells compared with controls (Fig. 5). To further validate this finding, we performed promoter luciferase reporter assays, which confirmed reduced transcription from the OGA promoter in affected cells (Fig. 5). To potentially explain this finding, we were able to demonstrate the enrichment of an OGT-containing corepressor complex at the OGA proximal promoter region in cells derived from affected individuals (Fig. 6). Interestingly, P1 appeared to be slightly more affected than P2 in terms of lack of OGA promoter activity (Fig. 5D) as well as increased binding of the OGT-containing corepressor complex at the OGA promoter. To further explore the impact of the L254F-OGT variant on transcription, we examined global gene expression and found that ~1% of the 8800 genes quantified by RNA-seq were altered 3-fold or more (Fig. 7). Panther Gene list analysis software was used to assign gene ontology-based grouping (59). This assignment revealed that 16.4% of the genes up-regulated and 8.1% of the genes down-regulated in the 3-fold set are involved in modulating transcription factor activity (Tables 3 and 4).
Our initial studies presented here were conducted in the lymphoblastoid cell lines available. We have also identified a second mutation in OGT (1013A>G (p.E338G)) that would also lead to an amino acid replacement in the N-terminal TPR repeats of OGT in a separate XLID family but have not yet fully characterized this mutation. During the preparation of this manuscript, a meeting abstract appeared from another independent exome-sequencing study (60) that identified a third mutation, c.775G>A (p.A259T), resulting in another amino acid substitution in the seventh TPR domain, the same repeat as the L254F mutation, of OGT associated with XLID. Future studies are aimed at obtaining neural lineages derived from a single human embryonic stem cell line, which has the benefit of a homogenous genetic background, that is edited to produce the relevant OGT mutant lines using CRISPR/Cas9 technology. These studies will further enhance our understanding of XLID and the causal role of OGT. In addition to studying XLID, these studies will also serve to illuminate the regulation (and dysregulation in disease states) of OGT, OGA, and O-GlcNAc and their importance in maintaining normal cellular function.
Finally, to summarize our findings, we have characterized a mutation identified in OGT for the first time that segregates with disease in families with XLID. Thus, approximately 30 years after identification of the O-GlcNAc modification in the Hart laboratory (16), we now have mutations in one of the genes encoding a cycling enzyme (OGT) that segregate with a specific disease. Further, during the characterization of this mutation, a compensatory mechanism to maintain global O-GlcNAc levels by OGT-dependent transcriptional regulation of OGA in lymphoblastoids was uncovered.

Study samples
Individuals with XLID and control males with normal cognitive function were recruited from Greenwood Genetic Center (Greenwood, SC). Human subject research protocols for these studies were approved by the institutional review board. Informed consent was obtained from each study member and/or their parents or legal guardians. These individuals were evaluated by clinical geneticists and underwent comprehensive laboratory evaluations for ID. All individuals were found to have a normal karyotype, a negative molecular test for fragile X syndrome, and a negative screen for common inborn errors of metabolism. For each individual, 5-10 ml of blood was collected from affected male probands with XLID to establish Epstein-Barr virus-transformed lymphoblast cell lines for preparation of genomic DNA samples.

Human X chromosome exome sequencing
Sequence libraries were prepared using a TruSeq™ genomic DNA library preparation kit (Illumina), enriched for the human X chromosome exome using a SureSelect target enrichment kit (Agilent), and sequenced using the 75-bp paired-end sequencing module on HiSeq2000 (Illumina). Alignment of the fastq reads, base recalibration, and variant calling were completed using Bowtie2 and Unified Genotyper (GATK). To enrich for disease-causing mutations, we first utilized variant filters based on dbSNP, the male-restricted portions of the 1000 Genomes Project, or the Exome Variant Server data sets for variants with a frequency of 1% or less. Later, we developed and optimized a strategy, affected kindred/cross-cohort analysis, which utilizes a cohort of affected male kindred pairs and an additional small cohort of affected unrelated males to enrich for potentially pathological variants and to remove neutral variants. This strategy achieved a substantial enrichment for mutations in known XLID genes as compared with variant filters from public reference databases (63). Evolutionary conservation of the amino acid residues involved in the identified mutations was evaluated by multiple-sequence alignment of HomoloGene. Standard bioinformatics algorithms, including SIFT and PolyPhen-2, were used to predict the functional impact of the identified mutations (63).
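The first filtering step (population frequency of 1% or less) combined with a within-family segregation check can be illustrated as follows. This is a simplified sketch and not the affected kindred/cross-cohort analysis itself, which is described in reference 63. The `Variant` record, the gene names, the frequencies, and the carrier sets are all hypothetical; individual IDs reuse the pedigree labels only for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    population_frequency: float  # allele frequency in a reference data set
    carriers: frozenset          # IDs of sequenced males carrying the variant


def rare_and_segregating(variants, affected, unaffected, max_frequency=0.01):
    """Keep variants at or below max_frequency that are carried by every
    affected male and by no unaffected male."""
    kept = []
    for v in variants:
        is_rare = v.population_frequency <= max_frequency
        in_all_affected = affected <= v.carriers
        in_no_unaffected = not (unaffected & v.carriers)
        if is_rare and in_all_affected and in_no_unaffected:
            kept.append(v)
    return kept


# Hypothetical inputs, for illustration only.
affected = frozenset({"II-4", "III-4", "IV-1"})
unaffected = frozenset({"II-1", "II-3", "II-7"})
candidates = [
    # too common in the reference population
    Variant("GENE_A", "c.100A>G", 0.20, frozenset({"II-4", "III-4", "IV-1"})),
    # rare, but absent from one affected male
    Variant("GENE_B", "c.200C>T", 0.001, frozenset({"II-4", "III-4"})),
    # rare and present only in the affected males
    Variant("GENE_C", "c.300G>T", 0.0, frozenset({"II-4", "III-4", "IV-1"})),
    # rare, but also present in an unaffected male
    Variant("GENE_D", "c.400T>C", 0.0, frozenset({"II-4", "III-4", "IV-1", "II-1"})),
]

for v in rare_and_segregating(candidates, affected, unaffected):
    print(v.gene, v.change)
```

Because the locus is X-linked, the sketch considers hemizygous males only; carrier females would need separate handling.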

Mutation validation, segregation analysis, and polymorphism study
Sanger sequencing was used for validation, segregation analysis, and polymorphism studies of each mutation using the BigDye Terminator version 3.1 cycle sequencing kit on an ABI3100 automatic DNA analyzer (Applied Biosystems) following the manufacturer's instructions. Variant analysis was completed with standard sequence alignment software (CodonCode and MacVector) followed by manual inspection of the chromatograms. X-inactivation in females available in family K9427 was determined using the androgen receptor locus as described previously (64).

Bioinformatics and modeling analysis
OGT comprises an N-terminal TPR domain, which mediates the recognition of a broad range of target proteins, and C-terminal catalytic regions. The X-ray structure of the OGT TPR repeats (PDB code 1W3B), which contains 11.5 TPR units of human OGT and covers residues 26 to 410, was used to model the effects of the L254F variant of OGT. The change in folding free energy upon missense mutation (ΔΔG) for the L254F variant was calculated from this structure with multiple web servers: DUET, Eris, I-Mutant 2.0, mCSM, PopMuSiC, SDM, and SAAFEC (65,66). Both the wild type and L254F variant were visualized with UCSF Chimera.

Site-directed mutagenesis
To create a recombinant HA-tagged L254F-OGT, we designed primers using the QuikChange Primer Design program available through Agilent Technologies. Mutagenesis was set up using the QuikChange site-directed mutagenesis kit (catalog no. 200523) as per the manufacturer's protocol using Gateway pENTR vector. Sanger sequencing was done at the Johns Hopkins School of Medicine Synthesis and Sequencing facility to validate the mutant DNA sequences. Following validation, we performed an LR reaction using the Gateway technology to obtain the mammalian expression vector pDEST26 containing the HA-tagged WT or L254F-OGT.

Creation of lymphoblastoid cell line
Cell lines were obtained by immortalization of lymphocytes from blood samples using Epstein-Barr virus with standard protocols (67).

Tissue culture, transfection, and cycloheximide treatment
Lymphoblastoids were cultured in RPMI 1640 medium containing 15% fetal bovine serum and 1% antibacterial/antimycotics. Cells were passaged every week and grown in suspension to the desired density for assays at 37°C in 10% CO2.
Cycloheximide was added to cells at a concentration of 50 μM, and cells were harvested at 0, 1, 2, 4, 8, 16, and 24 h following treatment.
Recombinant OGT constructs containing the mutation in the pDEST26 backbone were transfected in HEK293F cells using 293Fectin™ (Life Technologies, catalog no. 12347-019) reagent. Enhanced GFP transfection was used as both a transfection control and a vector control. We followed the manufacturer's protocol for transfection.

Escherichia coli expression, purification, and analysis of glycosyltransferase and protease activity of WT and L254F-OGT
A plasmid encoding CK2 was obtained from Gerald Hart (Johns Hopkins University and the NHLBI (National Institutes of Health) P01HL107153 Core C4). CK2α (residues 1-365) was cloned into the expression vector pGEX6P2 using the BamHI and SalI sites. The plasmids encoding ncOGT and HCF1-rep1 were obtained from Suzanne Walker (Harvard University) and Winship Herr (University of Lausanne), respectively. All plasmids were expressed in E. coli and purified as described previously (44). CK2α glycosylation and HCF1-rep1 cleavage/glycosylation assays were performed as described previously with minor modifications (45)(46)(47). Briefly, reaction mixtures contained 1 μM OGT, 1.5 μM CK2α or HCF1-rep1, and 1 mM UDP-GlcNAc in a final volume of 30 μl with buffer composed of 20 mM Tris, pH 8.0, 20 mM MgCl2, 150 mM NaCl, and 1 mM DTT. Reactions were incubated at 37°C for 1.5 h (CK2α) or 6 h (HCF1-rep1) and terminated by boiling in the presence of 1× Laemmli buffer. One-fourth of the final volume of the samples was resolved by SDS-PAGE and subjected to Western blotting. The following primary antibodies were used at a 1:1000 dilution: O-GlcNAc (Thermo, MA1072), O-GlcNAc transferase (Santa Cruz Biotechnology, sc-32921), CK2α (Santa Cruz Biotechnology, sc-373894), and GST (to detect GST-tagged HCF1-rep1) (Santa Cruz Biotechnology, sc-33613). Secondary antibodies were purchased from LI-COR and used at a 1:10,000 dilution. Experiments were performed in triplicate with separate preparations of recombinant WT and L254F-OGT.

RNA isolation and quantitative RT-PCR
Total RNA was extracted from affected and control lymphoblastoids using the Qiagen RNeasy Plus minikit (catalog no. 74134) by following the manufacturer's protocol. cDNA was prepared using the Bio-Rad iScript cDNA synthesis kit (catalog no. 170-8890) as per the manufacturer's instructions. The resulting cDNA was used as a template for amplification in a Bio-Rad 96-well MyIQ single-color real-time PCR detection instrument using the SYBR protocol (catalog no. 170-8880). All Quantitect assay primers for qPCR were purchased from Qiagen and used with the Bio-Rad iQ SYBR Green supermix (catalog no. 170-8880). Changes in gene expression of OGA (QT00085862), HIST1H4B (QT00207207), and HIST1H3A (QT00246764) were normalized using B2M, RPL4, GAPDH, CYCG, and GUS as the housekeeping genes. Quantification was performed using the ΔΔCt method (69).
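For orientation, the ΔΔCt calculation can be sketched as below. This is an illustrative sketch under stated assumptions and not the analysis code used in this study: it assumes 100% amplification efficiency (a doubling per cycle) and assumes that the several housekeeping genes are combined by averaging their Ct values, which the text does not specify. The function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the housekeeping genes."""
    return target_ct - mean(reference_cts)


def relative_expression(target_ct_sample, ref_cts_sample,
                        target_ct_control, ref_cts_control):
    """Relative expression of sample versus control, computed as 2^(-ddCt)."""
    ddct = (delta_ct(target_ct_sample, ref_cts_sample)
            - delta_ct(target_ct_control, ref_cts_control))
    return 2.0 ** (-ddct)
```

With these definitions, a target that crosses threshold one cycle later in the affected sample than in the control, with unchanged housekeeping Ct values, yields a relative expression of 0.5.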

Reporter luciferase assays
Lymphoblastoids from both affected individuals (P1 and P2) were transfected with pGL4.10 luciferase vectors using the Roche X-tremeGENE HP DNA transfection reagent (catalog no. 06366244001). pGL4.10-OGA contained a 2-kb proximal promoter region of OGA subcloned upstream of the coding region of the luciferase gene. The proximal promoter region of OGA was determined using Promoter 2.0 prediction, and the resulting plasmid was confirmed by DNA sequencing. Controls used were the Renilla luciferase (pGL4.74 hRluc/TK) (Promega, catalog no. E6921), SV40 promoter (pGL4.13 luc2/SV40) (Promega, catalog no. E6681), and the pGL4.10 luciferase empty vector (Promega, catalog no. E6651). After 48 h of transfection, cells were pelleted, followed by lysis and detection using the Promega Dual-Glo luciferase assay system (catalog no. E2980) as per the manufacturer's protocol on a luminometer. The resulting data were all normalized to a control male (C1 = 1).
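The normalization to the control male can be sketched as follows. This is a minimal illustration only: dividing the firefly signal by the Renilla signal before scaling is the usual dual-reporter practice and is assumed here, because the text states only that data were normalized to C1 = 1. The function name and input layout are hypothetical.

```python
def normalized_activity(readings, reference="C1"):
    """Scale firefly/Renilla ratios so that the reference sample equals 1.

    readings maps a sample name to a (firefly, renilla) pair of luminescence values.
    """
    ratios = {name: firefly / renilla
              for name, (firefly, renilla) in readings.items()}
    ref = ratios[reference]
    return {name: ratio / ref for name, ratio in ratios.items()}
```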

ChIP
ChIP was performed as described previously (70). Briefly, DNA and protein were cross-linked using 2% formaldehyde and quenched with glycine. Sonicated DNA extract was precleared using protein A/G-agarose beads and mouse or rabbit IgG linked to agarose conjugate. Chromatin from 3 × 10^6 cells was used for each immunoprecipitation. Lysates were incubated with anti-OGT (Abcam, catalog no. 50273), anti-O-GlcNAc (mAb14), anti-HDAC1, or anti-mSin3A antibodies overnight at 4°C with rotation. Protein-DNA complexes were incubated with protein A/G-agarose beads for 2 h and washed three times using buffers containing 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris, 150-500 mM NaCl, and protease inhibitors. DNA was eluted from beads using elution buffer containing 0.1% SDS and 100 mM NaHCO3. Cross-linking was reversed by the addition of NaCl to a final concentration of 325 mM, and DNA was incubated overnight at 65°C. DNA was extracted using phenol/chloroform after RNase and proteinase K treatment and analyzed by quantitative RT-PCR using primers against the proximal OGA promoter (forward, 5′-aggggaaacagcggaagac-3′; reverse, 5′-tgccacctctgcgggt-3′). The primers were designed using the free UCSC In-Silico PCR tool. Results are shown as percentage of input.
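The percentage-of-input calculation can be sketched as below. This is a generic illustration and not the analysis code used in this study: it assumes 100% amplification efficiency, and the fraction of chromatin set aside as input (`input_fraction`) is a hypothetical parameter because the text does not report it.

```python
def percent_input(ct_ip, ct_input, input_fraction):
    """Percentage of input chromatin recovered in the immunoprecipitate.

    ct_ip          -- Ct of the immunoprecipitated DNA
    ct_input       -- Ct of the input sample
    input_fraction -- fraction of chromatin saved as input (e.g. 0.01 for 1%)
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)
```

For example, if the input represents 1% of the chromatin and the immunoprecipitate crosses threshold two cycles later than the input, the recovery is 0.25% of input.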

RNA sequencing analysis and bioinformatics
RNA from lymphoblastoids from both affected individuals (P1 and P2) and related controls (C1 and C2) was extracted using the Qiagen RNeasy Plus minikit (catalog no. 74134) by following the manufacturer's protocol. Samples were submitted to the Genomics Services Laboratory at the HudsonAlpha Institute for Biotechnology (Huntsville, AL) for poly(A) mRNA library preparation and further sequencing and analysis. The concentration and integrity of the extracted total RNA were estimated by a Qubit version 2.0 fluorometer (Invitrogen) and Agilent 2100 bioanalyzer (Applied Biosystems, Carlsbad, CA), respectively. RNA samples with an RNA integrity number value of ≥ 9.5 were used for further processing. From each of the four samples, 1 μg of RNA was used for poly(A) mRNA library preparation using the NEBNext poly(A) magnetic isolation module (New England Biolabs Inc., Ipswich, MA), according to the manufacturer's protocol. Samples were individually barcoded with unique in-house Genomics Services Laboratory primers and amplified through 10 cycles of PCR using KAPA HiFi Hot-Start Ready Mix (Kapa Biosystems, Inc., Woburn, MA). The quality of the libraries was assessed by a Qubit version 2.0 fluorometer, and the concentration of the libraries was estimated by utilizing a DNA 1000 chip on an Agilent 2100 bioanalyzer.
The prepared mRNA libraries were accurately quantified for downstream sequencing applications using the qPCR-based KAPA Biosystems library quantification kit (Kapa Biosystems). Each library was then diluted to a final concentration of 12.5 nM and pooled equimolar before clustering. Cluster generation was carried out on a cBot version 1.4.36.0 using the TruSeq paired-end cluster kit version 3.0 (Illumina). Paired-end sequencing was performed using a 200-cycle TruSeq SBS HS version 3 kit on an Illumina HiSeq2000, running HiSeq Control Software (HCS) version 1.5.15.1 (Illumina). Image analysis and base calling were performed using the standard Illumina Pipeline consisting of RTA (Real-Time Analysis) version 1.13. Raw reads were demultiplexed using bcl2fastq conversion software version 1.8.3 (Illumina) with default settings.
Postprocessing of the sequencing reads from RNA-seq experiments from each sample was performed as per our unique in-house pipeline. Briefly, quality control checks on raw sequence data from each sample were performed using FastQC (Babraham Bioinformatics, London, UK). Raw reads were mapped to the reference human genome hg19/GRCh37 using TopHat version 1.4 (71,72) with two mismatches allowed and other default parameters. TopHat is a splice junction mapping tool for RNA-seq reads that utilizes an ultrafast high-throughput short read aligner Bowtie (71) in the background and then takes the mapping result and identifies the splice junctions. The alignment metrics of the mapped reads were estimated using SAMtools (73). Aligned reads were then imported onto the commercial data analysis platform, Avadis NGS (Strand Scientific Intelligence, Inc.). After quality inspection, the aligned reads were filtered on the basis of read quality metrics, where reads with a base quality score < 30, alignment score < 95, and mapping quality < 40 were removed. Remaining reads were then filtered on the basis of their read statistics, where missing mates and translocated, unaligned, and flipped reads were removed. The reads list was then filtered to remove duplicates. To reduce noise from low-signal reads, the minimum intensity was set to 8, and reads were removed if neither the average of affected individual values nor control values were > 16. The intent of this filter was to ensure conservative interpretation of -fold change for low signal values. The final list was created by removing reads where the relative S.D. of the affected individuals' values (P1 and P2) or control values (C1 and C2) was greater than 25%. This resulted in 8800 quantifiable transcripts. R was used to perform Spearman correlation analysis (see the supplemental material).
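The noise-reduction and reproducibility filters described above can be sketched as follows. This is one possible reading and not the Avadis NGS implementation: "minimum intensity was set to 8" is interpreted here as raising values below 8 to 8, and the relative S.D. is computed from the sample standard deviation (n − 1); both choices are assumptions, as are the function and constant names.

```python
from statistics import mean, stdev

FLOOR = 8.0            # minimum intensity, from the text
MIN_GROUP_MEAN = 16.0  # at least one group average must exceed this
MAX_REL_SD = 0.25      # relative S.D. cutoff (25%)


def passes_filters(p1, p2, c1, c2):
    """Return True if a transcript survives the low-signal and relative S.D. filters."""
    affected = [max(p1, FLOOR), max(p2, FLOOR)]
    control = [max(c1, FLOOR), max(c2, FLOOR)]
    # Remove the entry if neither group average exceeds the signal threshold.
    if not (mean(affected) > MIN_GROUP_MEAN or mean(control) > MIN_GROUP_MEAN):
        return False
    # Remove the entry if either group is too variable between its two members.
    for group in (affected, control):
        if stdev(group) / mean(group) > MAX_REL_SD:
            return False
    return True
```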
Differential expression of genes was calculated on the basis of -fold change (using thresholds of ±2.0, ±3.0, and ±4.0) observed between defined conditions. To assess data quality, two "mock" versions of the final reads list were created by grouping P1 with C1 and P2 with C2 (mock 1) and P1 with C2 and P2 with C1 (mock 2) and filtering as described above (from noise reduction on). Enrichment for differentially expressed genes (disease versus natural variation) was obtained by dividing the number of genes for experimental sets at preset -fold changes by the average number of genes of the two mock sets at the same -fold change.
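The mock-grouping enrichment calculation can be sketched as below. This is a simplified illustration, not the code used in this study: it applies the same transcript list to all three groupings and omits the per-grouping refiltering mentioned above, and it assumes that the -fold change is computed from group means with a sign indicating direction. All names are hypothetical.

```python
from statistics import mean

EXPERIMENT = (("P1", "P2"), ("C1", "C2"))  # affected versus control
MOCK1 = (("P1", "C1"), ("P2", "C2"))       # mock 1 grouping from the text
MOCK2 = (("P1", "C2"), ("P2", "C1"))       # mock 2 grouping from the text


def fold_change(group_a, group_b):
    """Signed fold change: +x if a exceeds b by x-fold, -x if b exceeds a."""
    a, b = mean(group_a), mean(group_b)
    return a / b if a >= b else -(b / a)


def count_changed(rows, grouping, threshold):
    """Count transcripts whose absolute fold change meets the threshold."""
    names_a, names_b = grouping
    return sum(
        1 for row in rows
        if abs(fold_change([row[k] for k in names_a],
                           [row[k] for k in names_b])) >= threshold
    )


def enrichment(rows, threshold):
    """Experimental count divided by the average count of the two mock sets."""
    observed = count_changed(rows, EXPERIMENT, threshold)
    mock_avg = mean([count_changed(rows, MOCK1, threshold),
                     count_changed(rows, MOCK2, threshold)])
    return float("inf") if mock_avg == 0 else observed / mock_avg
```

Because the mock groupings mix affected and control samples, the changes they detect reflect variation between individuals rather than disease status, so a ratio well above 1 suggests that the experimental differences exceed that background.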

Statistical analysis
Data are expressed as mean ± S.E. The differences between means and the effects of treatments were analyzed using Excel or GraphPad to determine statistically significant values.
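For reference, the mean and standard error can be computed as in the short sketch below. It assumes the standard error is the sample standard deviation (n − 1) divided by the square root of the number of replicates; the function name is hypothetical, and the study itself used Excel or GraphPad.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    return mean(values), stdev(values) / sqrt(len(values))
```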