Crystal Structure of the Human Histone Methyltransferase ASH1L Catalytic Domain and Its Implications for the Regulatory Mechanism

Absent, small, or homeotic disc1 (Ash1) is a trithorax group histone methyltransferase that is involved in gene activation. Although there are many known histone methyltransferases, their regulatory mechanisms are poorly understood. Here, we present the crystal structure of the human ASH1L catalytic domain, showing its substrate binding pocket blocked by a loop from the post-SET domain. In this configuration, the loop limits substrate access to the active site. Mutagenesis of the loop stimulates ASH1L histone methyltransferase activity, suggesting that ASH1L activity may be regulated through the loop from the post-SET domain. In addition, we show that human ASH1L specifically methylates histone H3 Lys-36. Our data imply that ASH1L histone methyltransferase activity is subject to a regulatory mechanism.

Nucleosomes, the fundamental unit of the highly ordered chromatin structure, are composed of DNA wrapped around histone octamers. Histones have N-terminal tails that are exposed on the outside of nucleosomes. These tails are subjected to several post-translational modifications, including acetylation, phosphorylation, ubiquitination, sumoylation, and methylation (1).
The site-specific methylation of histone lysine residues is important for the epigenetic control of gene expression. These marks serve to regulate epigenetically the organization of chromatin structure and to recruit other chromatin modifiers (2,3). Methylation can occur at multiple lysine residues, including lysines 4, 9, 27, 36, and 79 of histone H3 and lysine 20 of histone H4.
Absent, small, or homeotic disc1 (Ash1) is a member of the trithorax group proteins, which are essential for epigenetic gene activation (4,5). Previous studies have shown that Drosophila Ash1 activates homeotic gene ultrabithorax expression in imaginal discs of the third leg (6) and interacts with trithorax to regulate and maintain ultrabithorax expression (7). It has also been reported that the mammalian homolog of Ash1, ASH1L, is a histone methyltransferase (HMTase) that is associated with transcribed regions of active genes (8,9). ASH1L has several domains, including an associated with SET domain (AWS), a SET domain, a post-SET domain, a bromodomain, a bromo-adjacent homology domain (BAH), and a plant homeodomain finger (SMART database (10)). The SET domain in HMTases is responsible for catalyzing the formation of monomethylated, dimethylated, and trimethylated lysine, establishing an additional complex system with respect to methylated lysine recognition in signaling (11). Several crystal structures of SET domain proteins have been solved, and they revealed that the SET domain forms a knot-like structure that constitutes the active site of HMTases (12). Notably, ASH1L contains a SET domain in the middle of the protein, whereas other proteins possess a SET domain at the C terminus. Drosophila Ash1 has been previously shown to affect H3K4 methylation levels genetically and to methylate H3K4, H3K9, and H4K20 in in vitro assays (8,13). The Tanaka group (14) has found H3K36 to be the enzymatic target of the human ASH1L (hASH1L) SET domain, whereas another group has demonstrated that hASH1L methylates H3K4 (9).
To elucidate the molecular function of mammalian ASH1L, we determined the crystal structure of the hASH1L catalytic domain, including AWS, SET, and post-SET domains, bound to S-adenosylmethionine (AdoMet). Surprisingly, the crystal structure shows that the substrate binding pocket is blocked by a loop in the post-SET domain. Further mutagenesis studies and biochemical analyses suggest that the loop regulates the HMTase activity of hASH1L.

EXPERIMENTAL PROCEDURES
... size exclusion column and concentrated up to 14 mg/ml in a buffer of 50 mM Tris-HCl (pH 8.0) and 100 mM NaCl. Mutants were generated with the QuikChange site-directed mutagenesis kit (Qiagen) and purified in the same way as the wild-type protein.
Crystallization and Structure Determination-Crystals of the hASH1L catalytic domain were obtained using the hanging drop vapor diffusion method at 18°C in 70 mM Bis-Tris (pH 7.5), 30 mM citric acid, 20% polyethylene glycol (PEG) 3350, and 10 mM spermine tetrahydrochloride. For cryoprotection, crystals were soaked for 1 min in a crystallization solution containing 45% PEG 3350. All data were collected under cryogenic conditions (105 K) at beamline 4A-HFMX at the Pohang Accelerator Laboratory (PAL, Pohang, Korea). Data were processed with HKL2000. Phases were calculated by the single-wavelength anomalous dispersion method at the zinc peak using the program SOLVE/RESOLVE (16). Models were built using the program O, and crystallographic refinement was performed with the program CNS.
Methyltransferase Assay-For the in vitro methyltransferase assay, hASH1L proteins (3.5 μM) were incubated in a reaction mixture containing mononucleosomes, a G5E4 nucleosomal array, or histone octamers (17) and 2 μM S-adenosyl-L-[methyl-³H]methionine (15 Ci/mmol, PerkinElmer Life Science NET155) for 30 min at 25°C. Samples were then separated by 15% SDS-polyacrylamide gel electrophoresis and transferred to a PVDF membrane. The membrane was analyzed on an FLA-5000 (Fuji Film) for radioactive imaging. The membrane was then stained with Ponceau S (Sigma) and scanned with an LAS-3000 (Fuji Film) for protein quantification. The intensities of the detected bands were compared with Multi-Gauge (version 3.2). Histone purification and nucleosome reconstitution were carried out as described previously (18). H3K4A and H3K36A mutants were purified and reconstituted into nucleosomes in the same way as the wild-type histone H3. Antibodies against H3K4me2, H3K36me2, or H3K36me3 (Upstate Biotech Millipore) were used to detect histone methylation. The intensities of the detected bands were scanned and compared using an LAS-3000 (Fuji Film) with Multi-Gauge (version 3.2). Recombinant Xenopus histones or HeLa histones were used for the HMTase assay.
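The band comparison described above can be illustrated with a short sketch. It assumes, as one plausible reading of the procedure, that each lane's radioactive signal is divided by its Ponceau S signal to correct for loading and then expressed relative to the wild-type lane. The function name `normalize_bands`, the lane labels, and every number are hypothetical placeholders, not values from this study, and the actual quantification in Multi-Gauge may have differed.

```python
def normalize_bands(signal, loading, reference="WT"):
    """Return loading-corrected band intensities relative to a reference lane.

    signal  -- lane name -> radioactive band intensity (arbitrary units)
    loading -- lane name -> Ponceau S band intensity (arbitrary units)
    """
    corrected = {}
    for lane, counts in signal.items():
        if loading[lane] <= 0:
            raise ValueError(f"non-positive loading signal in lane {lane!r}")
        # Correct each lane for the amount of protein loaded.
        corrected[lane] = counts / loading[lane]
    ref = corrected[reference]
    # Express every lane as a fold change over the reference lane.
    return {lane: value / ref for lane, value in corrected.items()}


# Hypothetical intensities for illustration only; NOT measured data.
signal = {"WT": 1000.0, "mutant_1": 2400.0, "mutant_2": 4500.0}
loading = {"WT": 1.0, "mutant_1": 1.2, "mutant_2": 0.9}

relative = normalize_bands(signal, loading)
for lane, fold in relative.items():
    print(f"{lane}: {fold:.2f}-fold relative to WT")
```

Dividing by the loading signal first keeps a lane with more protein on the membrane from being mistaken for a more active enzyme.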

RESULTS
Crystal Structure of the hASH1L Catalytic Domain-We determined a crystal structure of the hASH1L catalytic domain bound to AdoMet, including the AWS, SET, and post-SET domains (amino acids 2069-2288; Fig. 1, C and D), to 2.9 Å resolution using the zinc single-wavelength anomalous dispersion method (Table 1, Fig. 1B). The structure of hASH1L shows that the N and C termini of the AWS domain are interwoven with the SET domain (Fig. 1A). The AWS domain is composed mostly of loops and two short α-helices. These loops are held together by two zinc atoms, which are coordinated by eight conserved cysteines in the AWS domain. The fact that the AWS domain in one of the two molecules of the asymmetric unit is partially disordered and that the temperature factor of the AWS domain is high suggests that the AWS domain is highly flexible. The structure of the hASH1L SET domain is similar to those of other known SET domain proteins.
The structure of the hASH1L catalytic domain, however, is striking because of its substrate binding pocket. It was previously determined that the SET-I subdomain serves as a substrate binding platform with the post-SET domain (12). It was also determined that an AdoMet methyl donor is positioned at the inner pocket between the SET-I subdomain and the post-SET domain. In our structure, however, a loop from the post-SET domain is found in between the SET-I subdomain and the post-SET domain, blocking the substrate binding pocket (Figs. 1A and 2A). We refer to the loop occupying the substrate binding pocket as the "auto-inhibitory loop." The auto-inhibitory loop is highly flexible, as indicated by its high temperature factor, suggesting that the loop can be repositioned upon substrate binding. In this configuration, the methyl donor AdoMet is totally buried inside the protein, and the auto-inhibitory loop must be repositioned before the substrates can bind. The two molecules in the asymmetric unit have basically identical conformations of the auto-inhibitory loop, indicating that the loop blocking conformation is not due to crystal packing.
Configuration of the Auto-inhibitory Loop of hASH1L-To better understand the configuration of the auto-inhibitory loop of hASH1L, we compared our structure of the hASH1L catalytic domain with that of SET8 bound to the histone H4 peptide (Fig. 2A). In the structure of the SET8-histone H4 substrate complex, the histone H4 substrate forms a parallel β-sheet with one of the two hairpin β-strands in the SET-I subdomain via hydrogen bonding between the backbones (19). Although the auto-inhibitory loop in hASH1L also forms hydrogen bonds with the β-strand in the SET-I subdomain in an anti-parallel manner, the major interactions between the loop and the SET-I subdomain are contributed by the following (Fig. 2, A and B). First, the highly flexible auto-inhibitory loop is held by an interaction between the side chain of Asn-2197 from the SET-I subdomain and the carbonyl oxygens of His-2258 and Ser-2259 at the start of the loop. This interaction is particularly interesting because the auto-inhibitory loop makes a sharp turn between His-2258 and Ser-2259, suggesting that the turning conformation of the auto-inhibitory loop is assisted by the interaction between Asn-2197, His-2258, and Ser-2259. Second, the auto-inhibitory loop is held by the hydrophobic interaction between Phe-2260 in the middle of the loop and the Phe-2179 and Val-2203 residues in the SET-I subdomain. Last, the auto-inhibitory loop interacts with the AdoMet molecule. The Gln-2266 side chain in the loop forms a hydrogen bond with the 3′-hydroxyl group of the ribose ring of AdoMet at the final position of the loop. This 3′-hydroxyl group of AdoMet has been shown to interact with the histone substrates in the SET8-H4 complex (19). In this configuration, the auto-inhibitory loop is hinged by Ser-2259 and Gln-2265, and the middle of the loop is held by Phe-2260 (Fig. 2B).
It has been previously identified that a Phe/Tyr in the SET domain guides the side chain of the Lys substrate (12). When we superimposed the hASH1L and SET8-H4 structures, the Tyr-2255 in hASH1L can be identified as the guide Phe/Tyr, maintaining an identical position to other known guide phenylalanines (Fig. 2C). The Ser-2259 in the auto-inhibitory loop occupies the position of the target Lys at the substrate binding pocket. These data also suggest that the auto-inhibitory loop must be reconfigured for the substrate to bind to hASH1L.
Mutations on the Auto-inhibitory Loop Stimulate hASH1L HMTase Activity-Previous work has suggested that the post-SET domain may be involved in regulating HMTase activity (20). To investigate the effect of the auto-inhibitory loop of hASH1L on enzyme activity, we generated several mutations in the loop that would be predicted to affect the interaction between the loop and the other parts of hASH1L. We then measured the HMTase activity using mononucleosomes or nucleosomal arrays as substrates.
In the first set of mutants, which were designed to make the auto-inhibitory loop more flexible, we mutated Asn-2197 to Ala to disrupt the interaction between the loop and the SET-I subdomain at the beginning of the loop. We also mutated Gln-2265 to Ala at the final position of the loop to make the hinge region of the loop more flexible. In these mutants, the auto-inhibitory loop would be expected to become more flexible due to loss of the interaction between the loop and the SET-I subdomain. We crystallized the Q2265A mutant protein, and preliminary structural analysis of the mutant protein shows that the auto-inhibitory loop is highly disordered. With these mutants, we observed an increased level of HMTase activity for hASH1L, with higher levels of activity observed for Q2265A than for N2197A (Fig. 2D). This observation suggests that mutating Asn-2197 or Gln-2265 to alanine makes it easier for the substrate to access the active site by making the auto-inhibitory loop more flexible. As a second mutant, we generated F2260A to disrupt the hydrophobic interaction between the loop and the SET-I subdomain and measured its HMTase activity. Unexpectedly, the F2260A mutant showed no activity toward either the nucleosomal array or the mononucleosome substrates. Phe-2260 is highly conserved in the hASH1L subfamily; the finding that mutation of Phe-2260 abolishes HMTase activity indicates that this residue may also have other critical roles in regulating HMTase activity. In a third mutant, we mutated the Gln-2266 residue to Ala to perturb the interaction of the protein with AdoMet. With this mutant, we were not able to detect any HMTase activity, suggesting that the Gln-2266 residue is critical for activity. In a fourth mutant, Glu-2263 was mutated to Ala. This mutant showed neither increased nor decreased HMTase activity (supplemental Fig. S1).
To examine whether the increased activity of the two mutants results from altered substrate binding, we performed kinetic analysis with the wild-type, Q2265A, and N2197A proteins (supplemental Fig. S2). Our kinetic analysis shows roughly similar Vmax values for all proteins. However, there is a significant decrease in the Km values of the mutant proteins compared with that of the wild-type protein, indicating that the mutants have higher substrate binding affinity.
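The Vmax and Km comparison rests on the Michaelis-Menten relation, v = Vmax[S]/(Km + [S]). The fitting procedure is not described in the text, so the following is only a minimal sketch under stated assumptions: it estimates the two parameters with a double-reciprocal (Lineweaver-Burk) least-squares fit, which is one of several possible methods. The function names, substrate concentrations, and parameter values are hypothetical and chosen only to mimic the qualitative pattern of a similar Vmax with a lower Km. They are not the data of supplemental Fig. S2.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rate):
    """Estimate (vmax, km) from a least-squares line through (1/[S], 1/v).

    The double-reciprocal form is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax,
    so the slope is Km/Vmax and the intercept is 1/Vmax.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rate]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Hypothetical, noise-free rates generated from assumed parameters
# (arbitrary units); NOT measurements from this study.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
wt_rates = [michaelis_menten(s, vmax=10.0, km=4.0) for s in substrate]
mut_rates = [michaelis_menten(s, vmax=10.0, km=1.0) for s in substrate]

wt_vmax, wt_km = lineweaver_burk_fit(substrate, wt_rates)
mut_vmax, mut_km = lineweaver_burk_fit(substrate, mut_rates)
print(f"wild type: Vmax={wt_vmax:.2f}, Km={wt_km:.2f}")
print(f"mutant:    Vmax={mut_vmax:.2f}, Km={mut_km:.2f}")
```

With real, noisy measurements the double-reciprocal transform overweights the lowest substrate concentrations, so direct nonlinear regression on the untransformed rates is generally preferred.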
It should also be noted that all mutant proteins behaved identically to the wild-type protein in gel filtration chromatography, indicating that the mutant proteins maintain overall structures that are similar to that of the wild type. Our mutagenesis study on the auto-inhibitory loop indicates that the interaction between the loop and the SET-I domain may be involved in regulating hASH1L HMTase activity. These results also suggest that there may be a regulatory mechanism to reconfigure and open up the auto-inhibitory loop.
hASH1L Catalytic Domain Methylates Histone H3 Lys-36-Drosophila Ash1 was previously shown to affect the levels of histone H3 Lys-4 (H3K4) methylation genetically and to methylate H3K4, H3K9, and H4K20. Recently, however, the Tanaka group (14) showed using nucleosomal substrates that the hASH1L catalytic domain methylates H3K36 but not H3K4. The other group used free histone as the substrate to demonstrate that hASH1L methylates H3K4 (9).
To investigate the substrate specificity of hASH1L, we performed an HMTase assay with the wild type as well as two mutant hASH1L proteins (F2260A, Q2265A) using mononucleosomes as the substrate. The methylation state was examined with H3K4me2-, H3K36me2-, and H3K36me3-specific antibodies (Fig. 3A). Fig. 3A clearly shows that the methylated H3K36-specific antibodies, but not the methylated H3K4-specific antibodies, recognize the product of hASH1L. These data indicate that hASH1L methylates histone H3 Lys-36 in vitro under our conditions. Moreover, with the hyperactive mutant (Q2265A), the signal specific for H3K36me2 greatly increased, whereas that specific for the H3K4 remained undetectable. We were also able to detect trimethylation of histone H3 Lys-36 with the hyperactive hASH1L mutant. This observation is also consistent with the notion that hASH1L specifically methylates H3K36. To rule out the possibility that this result was caused by cross-reactivity of the antibody to other methylated histones, we assembled nucleosomes with mutant histones that have Lys-4 mutated to Ala or Lys-36 mutated to Ala. HMT assays were performed with the wild type and hyperactive hASH1L mutants (Fig. 3B). hASH1L was not able to methylate nucleosomes containing histones with H3 Lys-36 mutated to Ala, whereas it was found to have full activity toward the H3 K4A mutant nucleosomes, as well as wild-type nucleosomes. Similar results were obtained when the hyperactive mutant was used in the HMT assay. In addition, when we used histone H3 or the histone octamer, we also observed that hASH1L specifically methylates histone H3 Lys-36 (Fig. 3C). Taken together, the hASH1L catalytic domain specifically methylates histone H3 Lys-36 in vitro.

DISCUSSION
The results presented here reveal that the hASH1L substrate binding pocket is blocked by an auto-inhibitory loop. Our data suggest that this loop regulates hASH1L activity. At this moment, however, it is not clear what the physiological role of the auto-inhibitory loop is.
We were able to measure the HMTase activity with the hASH1L catalytic domain alone; thus, the auto-inhibitory loop may be reconfigured upon substrate binding. Alternatively, there may be other regulatory mechanisms that function to open the auto-inhibitory loop. This mechanism needs to be further studied.
FIGURE 3 (legend, partial). A, the methylation product of the hASH1L catalytic domain is specifically recognized by the anti-methyl-H3K36 antibody but not by the anti-methyl-H3K4 antibody. The hyperactive mutant (Q2265A) is also able to produce H3K36 trimethylation. B and C, the wild-type hASH1L SET catalytic domain and the hyperactive mutant methylate the wild-type or H3K4A mutant histone, octamer, and nucleosome (Nuc.) but not the H3K36A mutant histone, octamer, and nucleosome.
Considering that an auto-inhibitory mechanism has also been observed in other enzymes (21-23) and that histone methyltransferases need to be tightly regulated to coordinate gene expression, it is likely that the auto-inhibitory loop plays an important role in coordinating histone methylation. Based on sequence alignment (Fig. 1D) and the accompanying study (25), other proteins containing a SET domain in the middle, such as SET2 and nuclear receptor SET domain-containing protein (NSD1), may adopt a loop configuration in the post-SET domain similar to that of hASH1L.
The auto-inhibitory conformation of the hASH1L catalytic domain is unlikely to be an artifact of using a fragment of the protein. The accompanying study (25) on the structure of the NSD1 catalytic domain shows a similar auto-inhibitory conformation, suggesting that the auto-inhibitory conformation may be a common mechanism of the methyltransferases having a SET domain in the middle. In addition, the auto-inhibitory loop is well ordered in the crystal structure, and the interaction between the auto-inhibitory loop and the SET domain is specific. Also, the disruption of the specific interaction increases the activity. These observations suggest that the auto-inhibitory conformation is physiologically relevant, although it is possible that the auto-inhibitory conformation is transient. Consistent with our data, it has been suggested that the post-SET domain may be involved in regulating the activity of other methyltransferases (mixed-lineage leukemia) (20). However, the exact mechanism by which the auto-inhibitory loop functions needs to be studied in the full-length protein. It would be interesting to examine the roles of plant homeodomains, bromodomains, and BAH domains in regulating HMTase activity.
Recent work on the nuclear receptor NSD1 protein shows that substrate specificity depends on the nature of the substrate (24). It would be interesting to investigate whether the regulation of the specificity is also through an auto-inhibitory loop in the post-SET domain of the NSD protein.
Although the Drosophila Ash1 mutant has a defect in histone H3 Lys-4 methylation, the substrate specificity of mammalian ASH1L is controversial (9,14). However, in our in vitro assay, hASH1L clearly has histone methyltransferase activity that is specific for H3 Lys-36. This result is consistent with those obtained by the Tanaka group (14), whereas the Blobel group (9) observed specificity for H3 Lys-4. This discrepancy may be attributed to the different substrates that were used. We and the Tanaka group used nucleosomes, whereas the Blobel group used a histone peptide as the substrate and used a long incubation time. We also detected activity for histone H3 Lys-36 when octamers or histone H3s were used as substrates. Furthermore, the hyperactive mutant of hASH1L (Q2265A) is able to trimethylate histone H3 Lys-36, suggesting that hASH1L is a histone H3 Lys-36-specific methyltransferase. We cannot totally rule out the possibility that hASH1L may have dual specificity toward histone H3K4 as well as H3K36 or that it actually prefers H3K36 over H3K4. It has been reported recently that the substrate specificity of NSD1 methyltransferase depends on the nature of the substrate (24). The substrate specificity of hASH1L may be similarly regulated, although we have not observed any activity other than the H3 Lys-36-specific activity when nucleosomes, nucleosomal arrays, histone octamers, or even histone H3 molecules are used as substrates. To unequivocally address the substrate specificity of hASH1L, further structural studies on hASH1L in complex with its substrate will be required. Taken together, our data demonstrate that the hASH1L SET domain adopts a novel configuration of the substrate binding pocket in which access is limited by an auto-inhibitory loop from the post-SET domain. This result suggests that there is a mechanism to regulate hASH1L HMTase activity.