Chemical Approaches for Studying Histone Modifications

Histones form the protein core around which genomic DNA is wrapped in eukaryotic chromatin. Numerous genetic studies have established that the structure and transcriptional state of chromatin are closely related to histone post-translational modifications. Further elucidation of the precise mechanistic roles for individual histone modifications requires the ability to isolate and study homogeneously modified histones. However, the highly heterogeneous nature of histone modifications in vivo poses a significant challenge for such studies. Chemical tools that have enabled biochemical and biophysical studies of site-specifically modified histones are the focus of this minireview.

Genomic DNA is stored as chromatin in the nuclei of eukaryotic cells. The fundamental repeating unit of chromatin is the mononucleosome. Each mononucleosome consists of ~147 bp of double-stranded DNA wrapped around an octameric protein complex composed of two copies each of the four core histones, H2A, H2B, H3, and H4 (Fig. 1A) (1).
The histones were first isolated and characterized in 1884 by Albrecht Kossel (2). However, it was not until 1950 that Stedman and Stedman (3) identified multiple forms of histones in the nuclei of cells and put forth the hypothesis that different cellular phenotypes in an organism may arise from the suppression of different genes by cell-specific histones. A decade later, in vitro experiments with cell-free systems demonstrated that histones were indeed inhibitory to DNA-templated RNA synthesis (4). This period also saw the discovery of histone acetylation by Phillips (5) and of histone lysine ε-N-methylation by Murray (6). As the protein synthesis inhibitor puromycin did not inhibit these histone modifications, Allfrey et al. (7) suggested that acetylation and methylation were post-translational modifications (PTMs) of histones. On the basis of the observation that chemical acetylation of histones greatly reduced their inhibitory effect on RNA synthesis, they put forth the prescient hypothesis that small reversible PTMs of histones could switch RNA synthesis on or off at different loci along the chromosome.
Since these early studies, a large number of histone PTMs (Fig. 1B), along with the various proteins responsible for installing (writers), removing (erasers), and binding (readers) these PTMs, have been identified (8). It is now well established that both the position and chemical property of histone modifications dictate the structure of chromatin as well as its functions in transcription, replication, and DNA repair (9). This has led to the histone code hypothesis for epigenetic control of cellular events, whereby distinct histone modifications, on one or more tails, act sequentially or in combination to bring about distinct downstream events (9). Understanding the specific roles for histone modification, either individually or in combination with other modifications, and histone interactions with chromatin-associated proteins is key to understanding the mechanisms underlying epigenetic control of cellular activity. Chemistry provides a growing arsenal of tools to study the roles for histone PTMs, and these will be discussed below.
The generation of peptide libraries has allowed high-throughput screening of histone-protein interactions in a microarray format. Toward this goal, Bedford and co-workers (15) have designed a chromatin-associated domain array (CADOR) chip containing an array of immobilized glutathione S-transferase-tagged histone-binding domains, including tudor and MBT domains, bromodomains, and chromodomains. Binding experiments with fluorophore-tagged N-terminal peptides from H3 and H4 bearing varying sites and degrees of methylation revealed novel interactions with chromodomains and tudor and MBT domains from various chromatin-associated proteins. Rathert et al. (16) have utilized SPOT synthesis to generate arrays of as many as 420 mutant H3 tail peptides (residues 1-21) and tested the substrate specificity of the H3 Lys 9 methyltransferase Dim-5 from Neurospora crassa. Results from these assays suggested an important role for Thr 11 and Gly 12 in the H3 tail in conferring specificity for Dim-5 activity and its discrimination against other lysines.
An impressive example of the combinatorial power of peptide synthesis was reported by Denu and co-workers (17), who developed a one-bead one-compound combinatorial library of 800 peptides bearing all possible permutations of the known modifications within the 21 N-terminal amino acids of histone H4. Peptide modifications included phosphorylation, acetylation, citrullination, and all possible methylation states of Lys and Arg. From the initial library, 512 members were used to elucidate the binding preferences of the double tudor domain of the human demethylase JMJD2A for the H4 tail. Interestingly, binding hits with various combinations of modifications revealed a rheostat-like continuum of binding affinities for human JMJD2A from 1 μM to 1 mM. Finally, a self-assembled monolayer for matrix-assisted laser desorption-ionization (SAMDI) mass spectrometric assay was applied by Gurard-Levin and Mrksich (18) to characterize the activity of HDAC8 for H4 tail peptides. Their results indicated the importance of both distal residues (residues 16-19) and those immediately adjacent to acetylated Lys 12 for HDAC8 activity.

Amber Suppression Strategies
An understanding of the physiological roles for histone modifications requires the ability to study them in the context of nucleosomes and chromatin. Although peptide models are often sufficient for studies of binary protein interactions, they cannot address the effects of modifications on directing trans-tail histone modifications, on nucleosomal or higher order chromatin structure, and on chromatin-remodeling complexes.
Amber suppression mutagenesis of proteins with orthogonal pairs of amber suppressor tRNAs and their cognate aminoacyl-tRNA synthetases could be employed to incorporate modified amino acids into full-length histones. Toward this goal, Schultz and co-workers (19) have evolved a mutant Methanococcus jannaschii tyrosyl amber suppressor tRNA, MjtRNA-Tyr(CUA)/tyrosyl-tRNA synthetase pair to site-specifically incorporate (Se)-phenylselenocysteine in response to the amber TAG codon in Escherichia coli (Fig. 2A). Oxidative elimination of phenylselenic acid yielded dehydroalanine (Dha) (20), which underwent Michael addition with N-acetylated or N-methylated derivatives of 2-aminoethanethiol to produce the thiol-containing analogs of N-acetylated and N-methylated lysine (Fig. 2B). This methodology was employed to generate an analog of histone H3 acetylated at Lys 9, which underwent phosphorylation at Ser 10 by the Aurora B kinase. A potential limitation of this methodology may be the well established absence of diastereoselectivity in non-enzymatic thiol additions to Dha (20), although the local protein conformation may significantly influence the final diastereomeric ratios of the addition products (21).
An alternate approach developed by Neumann et al. (22) utilized an evolved Methanosarcina barkeri pyrrolysyl-tRNA synthetase and its cognate amber suppressor, tRNA(CUA), to genetically incorporate Nε-acetyllysine in response to the TAG codon in E. coli (Fig. 2A). This was employed to generate H3 acetylated at Lys 56 that was incorporated into mononucleosomes and nucleosomal arrays. Förster resonance energy transfer-based experiments with fluorophore-labeled nucleosomes permitted the direct observation of DNA unwrapping. Acetylation was found to increase the extent of DNA unwrapping within the last turn of DNA on the nucleosome core by 7-fold. It also accelerated nucleosomal repositioning by remodeling complexes ~20% over unmodified nucleosomes. However, acetylation did not affect the ATP-dependent H2A/H2B dimer transfer from mononucleosomes or higher order chromatin structure formation. Thus, the overall effects of H3 Lys 56 acetylation on nucleosome structure and stability are fairly subtle and are manifest at the level of DNA breathing.
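The logic of amber suppression can be illustrated with a toy translation routine. The following is a minimal sketch and not a model of either published system: the coding sequence, the truncated codon table, and the function names `introduce_amber` and `translate` are hypothetical, and the orthogonal suppressor tRNA/synthetase pair is represented only by the residue label it delivers at TAG.

```python
# Toy illustration of amber (TAG) suppression. The codon table is deliberately
# truncated and the coding sequence is hypothetical, not a histone gene.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "AAA": "K", "TCT": "S",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def introduce_amber(cds, residue_index):
    """Replace the codon at a 0-based residue index with the amber codon TAG."""
    start = 3 * residue_index
    if residue_index < 0 or start + 3 > len(cds):
        raise ValueError("residue index outside the coding sequence")
    return cds[:start] + "TAG" + cds[start + 3:]


def translate(cds, amber_residue=None):
    """Translate codon by codon.

    Without a suppressor (amber_residue=None) TAG terminates translation.
    With a suppressor, TAG is read through and decoded as amber_residue.
    """
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon == "TAG" and amber_residue is not None:
            protein.append(amber_residue)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


wild_type = "ATGGCTAAATCTTAA"             # hypothetical: Met-Ala-Lys-Ser-stop
mutant = introduce_amber(wild_type, 2)    # Lys codon -> TAG
truncated = translate(mutant)             # no suppressor: truncated product
modified = translate(mutant, "(AcK)")     # suppressor delivers acetyllysine
```

In this sketch the truncated product stands in for what a host lacking the orthogonal pair would produce, which is why the full-length protein reports on successful incorporation.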

Cysteine Modification Strategies
Cys is the most convenient amino acid for selective modification because of its highly nucleophilic side chain sulfhydryl group (pKa ~8.5). Shokat and co-workers (24) have taken advantage of the unique reactivity of Cys to generate N-methylated aminoethylcysteine residues (Fig. 2C). Coupled with mutation of the single Cys in H3 (Cys 110) to Ala, their methodology permitted the site-specific installation of methyl-lysine analogs (MLAs) in histone H3. Thiol-containing analogs of H3 methylated at Lys 9 (H3-KC9me), Lys 4, Lys 36, and Lys 79 and of H4 methylated at Lys 20 were successfully recognized by methylation-specific antibodies. Furthermore, both an H3-KC9me2 peptide (residues 1-14) and nucleosome-associated full-length H3-KC9me2 were demonstrated to bind the heterochromatin-binding protein HP1α, which is known to bind H3-K9me2. Similar degrees of methylation of H3 Lys 9 and the H3-KC9 analog by the methyltransferase SUV39H1 demonstrated the equivalence of MLAs in biochemical assays. The ease of access to MLAs in histones also permitted structural determination of mononucleosomes where both copies of the native histones were substituted with either H4-KC20me3 or H3-KC79me2 (25). In either case, methylation did not significantly affect mononucleosome structure. In solution, however, sedimentation velocity analysis indicated that 12-mer nucleosomal arrays reconstituted with octamers containing H4-KC20me3 had an enhanced propensity to form maximally folded and higher order structures with increasing Mg2+ concentrations compared with H3-KC79me2-containing or wild-type nucleosomal arrays. MLAs have also been applied in studies of DNA replication (26), recruitment of HDACs to histones (27), the propagation of repressive modifications (28), and cross-talk between modifications (29).
An alternate Cys-directed modification strategy developed by Davis and co-workers (30) is the oxidative elimination of Cys to Dha by the reagent O-mesitylenesulfonylhydroxylamine. Site-specific Cys mutagenesis in histones could in principle be coupled with this methodology to generate Dha, which would readily be converted to methylated and acetylated lysine analogs, similar to the methodology reported by Schultz and co-workers (19).
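The selectivity of Cys-directed chemistry rests in part on how much of the side chain is present as the reactive thiolate at a given pH. The following is a minimal sketch using the Henderson-Hasselbalch relationship. The Cys pKa of 8.5 is the value quoted above; the Lys ε-amine pKa of 10.5 and the reaction pH of 8.0 are assumed, typical values, and the function name is hypothetical. Intrinsic nucleophilicity and the local protein environment are ignored.

```python
def deprotonated_fraction(pka, ph):
    """Fraction of an ionizable group in its deprotonated form
    (Henderson-Hasselbalch): 1 / (1 + 10**(pKa - pH))."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


CYS_THIOL_PKA = 8.5    # value quoted in the text
LYS_AMINE_PKA = 10.5   # assumed typical value, not from the text
REACTION_PH = 8.0      # assumed, for illustration only

thiolate = deprotonated_fraction(CYS_THIOL_PKA, REACTION_PH)
free_amine = deprotonated_fraction(LYS_AMINE_PKA, REACTION_PH)

print(f"Cys thiolate fraction at pH {REACTION_PH}: {thiolate:.3f}")
print(f"Lys free amine fraction at pH {REACTION_PH}: {free_amine:.4f}")
```

Under these assumptions a much larger fraction of Cys than of Lys is deprotonated near neutral pH, which is consistent with the preference for alkylating a single engineered Cys in the presence of many lysines.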

Native Chemical and Expressed Protein Ligation
The first report of expressed protein ligation (EPL) in 1998 by Muir et al. (31) marked a new avenue for employing chemistry to explore protein function. EPL extends the synthetic technique known as native chemical ligation (32), whereby two peptide halves, one bearing an N-terminal Cys and the other bearing a C-terminal thioester, are joined by a native amide bond. The peptide fragments can be obtained by solid-phase peptide synthesis employing either Boc or Fmoc (N-(9-fluorenyl)methoxycarbonyl) protecting group chemistry (33). Given the inevitable limitation of native chemical ligation by the length of the synthetic peptide halves, EPL significantly expands the range of proteins accessible for chemical modification. EPL employs an expressed protein in its C-terminal thioester form obtained by thiolysis of a C-terminally fused intein rather than by synthetic means. The expressed protein thioester undergoes trans-thioesterification when reacted with a second peptide/protein bearing an N-terminal Cys (Fig. 3). A subsequent S-to-N-acyl shift generates a native amide bond between the two halves, leading to the full-length target. EPL may also be reversed to enable the ligation of a C-terminal expressed protein half with an N-terminal synthetic peptide thioester. Most importantly, because any desired protein modification may be introduced in the synthetic half of the ligation partners, in principle, EPL permits the incorporation of any histone modification observed in Nature.

Site-specific Histone Modification by EPL
Shogren-Knaak et al. (34) first reported the use of EPL to generate Xenopus laevis histone H3 bearing a phosphoserine (pSer) residue at position 10. Nucleosomal arrays reconstituted with H3 pSer 10 were efficiently remodeled by the yeast SWI/SNF remodeling complex, demonstrating that semisynthetic H3 pSer 10 did not drastically affect nucleosome structure. These arrays were also used to probe the substrate specificity of the histone acetyltransferase Gcn5 and revealed its different activities when presented with peptide versus nucleosomal substrates. This indicates the need for investigating the mechanisms of histone-modifying enzymes on intact nucleosomes rather than peptide substrates.
In another study, site-specifically Lys 16-acetylated histone H4 was generated (35). This was incorporated into nucleosomal arrays, and its effect on chromatin compaction was determined by sedimentation velocity analysis during ultracentrifugation. The single acetylation of H4 Lys 16 inhibited 30-nm fiber formation in nucleosomal arrays, similar to the absence of the histone tail. These results suggested a structural role for Lys 16-acetylated H4 in establishing transcriptionally active euchromatic regions by decondensing chromatin.
EPL has permitted the generation of multiply modified histones with as many as three acetyl groups at Lys 5, Lys 8, and Lys 12 in histone H4 and five acetyl groups at Lys 4, Lys 9, Lys 14, Lys 18, and Lys 23 in H3 (36). An additional desulfurization step was added to these syntheses, converting the Cys introduced for ligation to a native Ala in the final step and rendering the ligation traceless. Histones are particularly amenable to chemical desulfurization because only a single Cys (Cys 110 in H3) is present in the four core histones in higher eukaryotes, and Cys is altogether absent in yeast histones. Surprisingly, substituting wild-type H3 with semisynthetic pentaacetylated H3 did not interfere with RSF (remodeling and spacing factor) complex-mediated assembly of a chromatinized plasmid. Additionally, H3/H4 tetramers generated with pentaacetylated H3 were also demonstrated to be a substrate for the HDAC Sir2 and subsequently for the methyltransferase G9a.
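The traceless ligation strategy described above constrains where a protein can be split: the junction must fall at a native Ala, which is temporarily replaced by Cys for ligation and restored by desulfurization. The following is a minimal sketch of that bookkeeping under stated assumptions. The sequence is a hypothetical toy sequence rather than a real histone, the function names are invented for illustration, and desulfurization is modeled as converting every Cys to Ala, which is only appropriate when the target contains no native Cys.

```python
def traceless_junctions(seq, max_peptide_len):
    """1-based positions of native Ala residues that could head the C-terminal
    fragment, given that the synthetic N-terminal peptide (residues before the
    junction) may be at most max_peptide_len residues long."""
    return [
        i + 1
        for i, residue in enumerate(seq)
        if residue == "A" and 1 <= i <= max_peptide_len
    ]


def split_for_ligation(seq, junction):
    """Split at a 1-based Ala junction; the Ala becomes the N-terminal Cys."""
    if seq[junction - 1] != "A":
        raise ValueError("junction must be a native Ala")
    n_fragment = seq[:junction - 1]      # made as a C-terminal thioester
    c_fragment = "C" + seq[junction:]    # Ala -> Cys for ligation
    return n_fragment, c_fragment


def ligate_and_desulfurize(n_fragment, c_fragment):
    """Join the fragments, then convert Cys to Ala (non-selective here)."""
    return (n_fragment + c_fragment).replace("C", "A")


toy_protein = "MKTAGKSAPLKQAT"   # hypothetical sequence without native Cys
sites = traceless_junctions(toy_protein, max_peptide_len=8)
n_half, c_half = split_for_ligation(toy_protein, sites[0])
product = ligate_and_desulfurize(n_half, c_half)
```

Because the product is compared with the starting sequence, the sketch makes explicit why a protein with native Cys residues would need those residues protected or a Cys-selective desulfurization.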
Owen-Hughes and co-workers (37) employed semisynthetically tetraacetylated derivatives of H3 (Lys 9, Lys 14, Lys 18, and Lys 23) and H4 (Lys 5, Lys 8, Lys 12, and Lys 16) to generate specifically modified chromatin templates to study the interaction of various yeast remodeling complexes with differentially acetylated nucleosomes. It was shown that the ATP-dependent remodeling enzyme complex RSC (remodel the structure of chromatin) preferentially remodeled chromatin containing tetraacetylated H3, but not H4, ~16-fold faster than unmodified chromatin. Kinetic analysis revealed this to be due to a 3-fold lower Km for the tetraacetylated nucleosomes and that a single acetylation at H3 Lys 14 contributed most to this effect. On the other hand, tetraacetylation of H4 inhibited nucleosome remodeling by the Isw2 enzyme by ~1.5-fold relative to unmodified nucleosomes. This was due to a reduction in the rate of ATP hydrolysis (kcat) by Isw2. These experiments categorically demonstrated that histone modifications affect nucleosome remodeling through distinct pathways for different remodeling enzymes.
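The distinction drawn above between a Km effect and a kcat effect can be made concrete with the Michaelis-Menten rate law. The following is a minimal sketch that assumes simple single-substrate Michaelis-Menten behavior, which may not hold for remodeling complexes acting on nucleosomes. All parameter values are arbitrary and hypothetical, and the sketch is not intended to reproduce the reported fold changes.

```python
def mm_rate(kcat, km, substrate, enzyme=1.0):
    """Michaelis-Menten rate: v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


KCAT = 1.0   # arbitrary units, hypothetical
KM = 1.0     # arbitrary units, hypothetical

# A 3-fold lower Km matters most when substrate is limiting.
low_s, high_s = 0.01 * KM, 1000.0 * KM
km_gain_low_s = mm_rate(KCAT, KM / 3.0, low_s) / mm_rate(KCAT, KM, low_s)
km_gain_high_s = mm_rate(KCAT, KM / 3.0, high_s) / mm_rate(KCAT, KM, high_s)

# A 1.5-fold lower kcat scales the rate equally at every substrate level.
kcat_loss_low_s = mm_rate(KCAT / 1.5, KM, low_s) / mm_rate(KCAT, KM, low_s)
kcat_loss_high_s = mm_rate(KCAT / 1.5, KM, high_s) / mm_rate(KCAT, KM, high_s)

print(f"Km effect:   {km_gain_low_s:.2f}x at low [S], {km_gain_high_s:.3f}x at high [S]")
print(f"kcat effect: {kcat_loss_low_s:.3f}x at low [S], {kcat_loss_high_s:.3f}x at high [S]")
```

In this simplified picture a Km change alters the rate only at sub-saturating substrate, whereas a kcat change persists at saturation, which is one way the two mechanisms can be told apart experimentally.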
EPL has also enabled investigations of the effects of acetylation near the nucleosome dyad pseudo-symmetry axis, where key histone-DNA contacts occur (38). Results from nucleosome competitive reconstitution experiments revealed that acetylation at H3 Lys 115 near the dyad axis reduced DNA binding significantly more than acetylation at H3 Lys 122. However, mononucleosomes acetylated at H3 Lys 122 underwent thermal repositioning about twice as fast as those acetylated at Lys 115. These results suggest that the different sites of acetylation near the nucleosome dyad may have different physiological consequences in vivo, such as their effect on genome positioning and nucleosome assembly/disassembly. Another interesting result from this study was the observation that the Lys-to-Gln mutation that is commonly employed to mimic lysine acetylation in vivo showed significant differences from acetyllysine in in vitro competitive reconstitution assays.
Our own laboratory has reported several advances toward the semisynthesis of histones for biochemical studies. Chiang et al. (39) have demonstrated the utility of a thiol-protected 2-hydroxy-3-mercaptopropionic acid linker (40) and a diaminobenzoic acid linker (41) to synthesize a histone H2B N-terminal peptide thioester containing pSer at position 14 as well as acetyllysines at positions 5, 11, 12, and 15. These synthetic strategies avoided epimerization at the peptide C terminus, which is known to occur during the solution-phase activation of side chain-protected peptides (42). Furthermore, a mild radical-based desulfurization methodology specific for cysteine (43) was employed to render the ligation product traceless. The phosphorylated and polyacetylated full-length H2B was probed with a commercial H2B pSer 14-specific antibody. Surprisingly, the presence of multiple acetylations in the H2B tail prevented the recognition of pSer 14 by this antibody. This result indicates a limitation of antibody-based ChIP (chromatin immunoprecipitation)-on-chip experiments for whole genome analysis because histones are often decorated with multiple modifications in vivo that may interfere with the detection of specific modifications. However, the acetyl groups did not interfere with phosphorylation of Ser 14 by the human MST1 (mammalian sterile twenty-like 1) kinase, suggesting an unusual mechanism for H2B recognition by this enzyme.
One of the most dramatic PTMs of proteins is their conjugation with the small protein ubiquitin. Ubiquitylation is undertaken by a family of E1-E3 ligases that activate the C terminus of ubiquitin and catalyze its condensation with specific lysine side chain ε-amines in target proteins (44). In contrast to the typical role of ubiquitylation in proteasome-assisted degradation, the ubiquitylation of histones is associated with DNA damage repair and both transcription elongation and repression (45). Ubiquitylation of H2B occurs at Lys 123 in yeast (Lys 120 in humans) and has been linked to transcription elongation and trans-tail methylation of H3 Lys 4 and Lys 79 through genetic studies (45). However, the small fraction (1-2%) of ubiquitylated H2B (uH2B) and the heterogeneity of histone modifications in vivo posed a serious challenge in the isolation of uH2B for biochemical studies aimed at identifying its mechanistic role in H3 methylation.
We have developed several semisynthetic approaches for the site-specific ubiquitylation of histones. McGinty et al. (46) have reported an EPL strategy to access uH2B that employs traceless peptide ubiquitylation (47). In this synthetic scheme (Fig. 4), the H2B protein was divided into a short synthetic C-terminal fragment (1) and recombinant N-terminal thioester (4). In the first step, ubiquitylation of the H2B C-terminal peptide (1) was accomplished with a photolytically removable ligation auxiliary that was coupled to the side chain of the residue corresponding to Lys 120 in full-length H2B. The ligation auxiliary acted as an N-terminal cysteine surrogate and facilitated EPL with a recombinant ubiquitin thioester lacking its C-terminal residue Gly 76 (2). Upon ligation, photolysis of the auxiliary with UV irradiation yielded native uH2B peptide (3) and simultaneously released a photoprotected Cys at the N terminus of the H2B peptide. In a second ligation step, the ubiquitylated peptide was reacted with the recombinant H2B thioester (4) to generate full-length uH2B(A117C). In a final step, the Cys was desulfurized to yield native uH2B (5). Biochemical assays of chemically ubiquitylated mononucleosomes with the human histone lysine methyltransferase DOT1L revealed that ubiquitylation directly stimulated intranucleosomal methylation of H3 Lys 79 (46). This was the first direct biochemical evidence of cross-talk between PTMs on different histones.
Very recently, we reported an alternate methodology for histone ubiquitylation that bypasses the need for the synthetically challenging ligation auxiliary (48). This was achieved by incorporating the mutation G76A at the ubiquitin C terminus, which allowed ligation with cysteine both at the lysine side chain and within H2B. Avoiding ligation onto a sterically hindered secondary amine, which was unavoidable in the ligation auxiliary approach, led to significantly reduced reaction times and higher yields. Furthermore, substitution of UV irradiation with chemical unmasking of the second cysteine greatly facilitated sample handling. Simultaneous desulfurization of both cysteines yielded the u(G76A)H2B protein, which was recognized by a uH2B-specific antibody and by the ubiquitin-specific hydrolase UCHL3. u(G76A)H2B also stimulated robust methylation at H3 Lys 79 by human DOT1L, thus demonstrating similarity to native uH2B in a nucleosomal context. Subsequent kinetic and structure-activity relationship analyses with u(G76A)H2B have revealed a non-canonical role for ubiquitin in the enhancement of the chemical step of H3 Lys 79 methylation. In particular, the hydrophobic patch on the surface of ubiquitin centered around Ile 44, which forms critical interactions with most helical ubiquitin-binding domains, was found to be non-essential for stimulation of human DOT1L activity. Mutagenic studies aimed at identifying the specific surface residues of ubiquitin involved in human DOT1L stimulation are currently under way in our laboratory.

Conclusions and Future Directions
Beginning with their discovery more than 40 years ago, histone modifications have been shown to play critical roles in directing key cellular events such as transcription activation, gene silencing, DNA damage repair, and DNA replication. Several semisynthetic methodologies for the generation of homogeneously modified histones have been developed, and these have led to investigations of the mechanistic roles for individual histone modifications in these processes. Advances in protein chemistry have made accessible synthetically challenging histone modifications, such as those found in the globular core domains of mononucleosomes and those involving large proteins such as ubiquitin and SUMO (small ubiquitin-like modifier). In the future, experiments with semisynthetic histones in our own laboratories will be aimed at testing the histone code hypothesis in nucleosomal arrays bearing controlled modifications for studies of gene transcription, replication, and repair.