PTE, a novel module to target Polycomb Repressive Complex 1 to the human cyclin D2 (CCND2) oncogene

Polycomb group proteins are essential epigenetic repressors. They form multiple protein complexes of which two kinds, PRC1 and PRC2, are indispensable for repression. Although much is known about their biochemical properties, how mammalian PRC1 and PRC2 are targeted to specific genes is poorly understood. Here, we establish the cyclin D2 (CCND2) oncogene as a simple model to address this question. We provide evidence that the targeting of PRC1 to CCND2 involves a dedicated PRC1-targeting element (PTE). The PTE appears to act in concert with an adjacent cytosine-phosphate-guanine (CpG) island to arrange for the robust binding of PRC1 and PRC2 to repressed CCND2. Our findings pave the way to identify sequence-specific DNA-binding proteins implicated in the targeting of mammalian PRC1 complexes and provide a novel link between polycomb repression and cancer.

Polycomb group (PcG) proteins compose a family of epigenetic repressors that prevent unscheduled transcription of hundreds of developmental genes (1-4). PcG proteins act in concert as multisubunit complexes. These are usually grouped into two evolutionarily conserved classes called Polycomb Repressive Complexes 1 and 2 (PRC1 and PRC2). Several auxiliary complexes, whose repertoire varies between species, further aid PRC1 and PRC2 to achieve robust repression (5).
The systematics of the PRC1 group are more complicated as RING2 (or the closely related RING1 protein), the central subunit of PRC1, is incorporated into a large variety of protein complexes (4,30). Here, we reserve the PRC1 name for complexes that consist of one of the five variant chromodomain proteins (CBX2, CBX4, CBX6, CBX7, or CBX8), one of the three polyhomeotic-like proteins (PHC1, PHC2, and PHC3), the SCMH1 protein (or related SCML2), and a heterodimer between RING2 (or RING1) and one of the two closely related PCGF proteins MEL18 (also known as PCGF2) and BMI1 (also known as PCGF4). These complexes, sometimes called canonical PRC1, can recognize H3K27me3 produced by PRC2 via the chromodomain of their CBX subunits (23-25). Mutations in genes encoding PRC1 subunits lead to embryonic lethality and misexpression of HOX genes, indicating that PRC1 complexes are essential for PcG repression (26-29).
Much is known about the biochemical properties of PRC1 and PRC2, but how they are targeted to specific genes is not well understood. The process is better described for Drosophila, where the PcG system contains fewer variant proteins and has been studied for a longer time. In flies, the targeting to specific developmental genes depends on designated Polycomb-response elements (PREs). These ~1-kb-long elements are the strongest genomic binding sites for PRC1 and PRC2 (39,40) and are sufficient to generate novel binding sites for both complexes when integrated elsewhere in the genome (41). The pervasive di-methylation of H3K27 within intergenic regions and inactive genes suggests that PRC2 transiently interacts with most of the genome (21,22). Likewise, the increase of intergenic transcription in cells where PRC1 is depleted suggests that this complex also scans the entire genome (22). In this view, Drosophila PREs are the sites where PRC1 and PRC2 are retained much longer than elsewhere in the genome. A subset of transcriptionally active gene promoters can also retain PRC1 but not PRC2 (42-45). The amount of PRC1 detected at these sites is an order of magnitude lower than at PREs (43-45).
Multiple lines of evidence indicate that Drosophila PREs contain combinations of recognition sequences for different DNA-binding factors. These factors act cooperatively to anchor PRC1 and PRC2, which themselves cannot bind DNA in a sequence-specific fashion (41). Recent studies indicate that PREs continue to retain PRC1 when PRC2 and H3K27 methylation are removed by mutation, but PRC2 binding at many PREs is significantly reduced if PRC1 is ablated (46). Consistently, in cells where PRE-equipped Drosophila genes are active, they often lose H3K27me3 and PRC2 but have PRC1 strongly bound at PRE sites (40,47,48).
PRC1 and PRC2 complexes are evolutionarily conserved, and target many of the same developmental genes in Drosophila and mammals (4). Therefore, it is likely that the mechanisms that retain PRC complexes at these genes were in place before flies and mammals split from their last common ancestor. To what extent these mechanisms remain similar and which of them have diverged are open questions. The majority of studies in mammalian models have focused on PRC2 targeting. Their results concur that DNA sequences with a high density of unmethylated CpG dinucleotides (so-called CpG islands) that lack binding sites for transcriptional activators are sufficient to generate new binding sites for mammalian PRC2 but not PRC1 (49-51). However, in the context of the mouse HoxD locus, factors other than high CpG density appear more important (52). Much less is known about the DNA sequences involved in the targeting of mammalian PRC1. Two studies reported DNA elements capable of autonomous PRC1 targeting (53,54). Although neither of the elements was mapped to high precision, the two elements appear to be very different. One of them was reported to generate new binding sites for both PRC1 and PRC2 (54), whereas the other seems to be targeting just PRC1 (53).
Here, we establish the human cyclin D2 (CCND2) oncogene as a simple model system to investigate the targeting of mammalian PcG complexes. Using this system, we find that the targeting of PRC1 to CCND2 involves a dedicated targeting element (PTE). This element may further cooperate with an adjacent CpG-island to support the robust binding of PRC1 and PRC2 at repressed CCND2.

Results
PRC1 and PRC2 complexes usually act together to effect epigenetic repression. However, in experiments with cultured Drosophila cells, we noted that some of the PcG target genes, when transcriptionally active, have their PREs bound by PRC1 in the absence of PRC2 and H3K27me3 (40,48). This fueled our interest in a region ~4.8 kb upstream of the transcription start site (TSS) of the human CCND2 gene. Analyzing previously published chromatin immunoprecipitation (ChIP)-binding profiles (55), we noticed that in the human embryonic teratocarcinoma NT2-D1 cells this region is strongly immunoprecipitated with antibodies against BMI1 and MEL18 but very weakly with antibodies against EZH2 and H3K27me3 (Fig. 1, A and B). This is in stark contrast to all other sites on human chromosomes 8, 11, and 12, profiled by Kahn et al. (55), which show strong precipitation with anti-EZH2 and anti-H3K27me3 antibodies whenever they are strongly precipitated with antibodies against BMI1 or MEL18 (Fig. 1A).
In striking analogy with the PRC1 binding at PREs of transcriptionally active Drosophila genes, in the NT2-D1 cells, the CCND2 gene is highly transcribed (Fig. 1C). In contrast, in TIG-3 human embryonic fibroblasts, the CCND2 gene was reported to be transcriptionally inactive (56) (confirmed by our RT-qPCR measurements in Fig. 1C), decorated with H3K27me3, CBX8, and SUZ12, and up-regulated upon the knockdown of PRC1 and PRC2 (56). This indicates that CCND2 is a regular PcG target gene that, when transcriptionally inactive, acquires the chromatin state characteristic of PcG repression but binds a lot of PRC1 and little PRC2 and H3K27me3 in cells where it is transcriptionally active. Altogether, these observations raised the possibility that the PRC1-binding peak upstream of CCND2 marks an element that targets PRC1 to this locus.

PRC1 and PRC2 binding to the CCND2 gene in alternative transcriptional states
To explore this possibility, we first performed quantitative ChIP analysis of PRC1, PRC2, and H3K27me3 binding at the CCND2 gene in the NT2-D1 and TIG-3 cell lines. To select informative protein targets for immunoprecipitation, we measured the mRNA levels for genes encoding PRC1 subunits to know which of the alternative variants are available for interrogation in each of the cell lines (Fig. S1). As shown by RT-qPCR, MEL18 mRNA is abundant in NT2-D1 cells and slightly less abundant in TIG-3 cells (Fig. S1). The BMI1 mRNA levels are lower than those of MEL18 but at the same level in both NT2-D1 and TIG-3 cells. RING1 mRNA level is low in both cell lines, but RING2 mRNA is abundant (Fig. S1). Of the five CBX genes implicated in PcG regulation (5), mRNA levels for CBX4, CBX6, and CBX7 are low in both NT2-D1 and TIG-3 cells. CBX8 mRNA is at the edge of detection in both cell lines, and CBX2 mRNA is abundant in NT2-D1 but barely detectable in TIG-3 cells.
Immunoprecipitations of the formaldehyde cross-linked NT2-D1 chromatin with antibodies against MEL18, BMI1, CBX2, and RING2 give essentially the same results (Fig. 2 and Fig. S2). The ChIP signals peak within a putative PRC1 PTE and recede steeply at both sides to reach a background level halfway between the PTE and the CCND2 TSS. In TIG-3 cells, where CCND2 is transcriptionally inactive, the putative PTE remains the strongest precipitated site with ChIP signals similar to those detected in NT2-D1 cells. However, in these cells, in addition to the PTE, the entire upstream region of the CCND2 gene, including the TSS, is also immunoprecipitated, albeit at a 10-fold lower level. In contrast to PRC1, ChIPs with antibodies against SUZ12 and H3K27me3 give enrichment profiles that differ dramatically between the two cell lines. In NT2-D1 cells, SUZ12 ChIP signals are very weak compared with those at the positive control ALX4 gene (Fig. 2), which is transcriptionally inactive and bound by PRC1 and PRC2 in both NT2-D1 and TIG-3 cells (55,57,58). These are paralleled by weak ChIP signals for H3K27me3. In contrast, in TIG-3 cells, ChIP signals for both antibodies are very strong and on par with those of the ALX4 gene. Strikingly, their profiles do not match those for PRC1. Instead, the SUZ12 and H3K27me3 profiles are broad

CCND2 PTE can generate new PRC1-binding sites
If the DNA fragment underneath the CCND2 PRC1 binding peak is a PRC1-targeting element, it should be able to generate new binding sites for PRC1 when integrated elsewhere in the genome. To test this, we cloned the 2.4-kb fragment covering the putative PTE into a lentiviral vector (Fig. 3A) and integrated it back into the genome of NT2-D1 cells by viral transduction. To distinguish transgenic and endogenous copies of the putative PTE, we identified a small stretch of nucleotides close to the summit of the MEL18/BMI1 peak that shows little conservation within mammalian species and substituted five of those nucleotides in the transgenic copy to create an annealing site for a specific PCR primer (Fig. S3). An analogous construct containing a 2.4-kb fragment from a gene desert region on chromosome 12 that showed no binding of PcG proteins and H3K27me3 (Fig. S3) and an empty vector were integrated in parallel as negative controls. The transduced cell lines were genotyped by PCR to validate their identity (Fig. 3A and Fig. S4).
As summarized in Fig. 3B [...] MEL18. The precipitation is robust and as strong as that of the endogenous PTE upstream of CCND2. This is in contrast to the transgenic insertion of the negative control fragment from the gene desert (Fig. 3C) or the empty vector (Fig. S5) whose precipitation is very low and close to the ChIP background. The immunoprecipitation of the transgenic 2.4-kb fragment with the antibodies against SUZ12 and H3K27me3 is weak and comparable with that of the endogenous site (Fig. 3B). Here, and in the following experiments, mixed (nonclonal) populations of cells with insertions at multiple random genomic locations were used for ChIP assays. Because yields of ChIP reactions are normalized to the amount of input material, comparable immunoprecipitation of transgenic and endogenous PTEs indicates that, at the majority of the insertion sites, the transgenic 2.4-kb PTE fragment is bound by PRC1. Altogether, these observations suggest that the 2.4-kb DNA fragment underneath the CCND2 PRC1-binding peak contains an element capable of autonomous recruitment of PRC1 complexes.

Figure 3 legend (fragment): Following viral transduction and puromycin selection, NT2-D1 cells were genotyped and collected for ChIP-qPCR assays. The representative analysis of PCR products shows that the transgene-specific primers (tran) amplify the expected product from the genomic DNA of cells transduced with the 2.4-kb PTE but not with the control (desert) construct (see Fig. S4 for additional details). The primers specific for the endogenous CCND2 PTE (endo) give the product in both cases. The no template control (NTC) lane shows that no product is produced when the template DNA is omitted. Lane M contains DNA molecular weight markers. Comparison of immunoprecipitations of the 2.4-kb PTE fragment (B) and the control 2.4-kb fragment from chromosome 12 gene desert (C) indicates that the 2.4-kb PTE fragment generates a new strong binding site for PRC1. The precipitation of endogenous (endo) PTE fragment was assayed using amplicon 2 (Fig. 2A). The immunoprecipitations of the ALX4 and chromosome 12 gene desert regions were assayed in parallel as positive and negative controls. The PCR amplicon used to assay the immunoprecipitation of the gene desert fragment in C does not discriminate between transgenic and endogenous copies. Because the endogenous site does not bind PcG, any elevated ChIP signal would come from the transgenic copy. All histograms show the average and the scatter (whiskers) between two independent experiments.
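The input normalization and the average/scatter summaries mentioned above can be illustrated with a short calculation. The following Python sketch is an illustration only and not the authors' analysis pipeline: it assumes that qPCR-derived DNA quantities for the immunoprecipitated and input samples are already available in the same arbitrary units, and the function names (`chip_yield_percent`, `mean_and_scatter`), the `input_fraction` parameter, and all numbers are hypothetical.

```python
from statistics import mean


def chip_yield_percent(ip_quantity, input_quantity, input_fraction=1.0):
    """Express a ChIP yield as a percentage of the input chromatin.

    ip_quantity and input_quantity are DNA amounts in the same arbitrary
    units. input_fraction is the (assumed) share of the total input that
    was actually measured, e.g. 0.1 if 10% of the input was set aside.
    """
    if input_quantity <= 0 or not 0 < input_fraction <= 1:
        raise ValueError("input quantity and fraction must be positive")
    total_input = input_quantity / input_fraction
    return 100.0 * ip_quantity / total_input


def mean_and_scatter(values):
    """Return the mean and half the range of replicate measurements."""
    return mean(values), (max(values) - min(values)) / 2


# Hypothetical yields from two independent experiments at one amplicon.
replicates = [
    chip_yield_percent(0.8, 10.0, input_fraction=0.1),
    chip_yield_percent(1.0, 10.0, input_fraction=0.1),
]
average, scatter = mean_and_scatter(replicates)
print(f"ChIP yield: {average:.2f} +/- {scatter:.2f} % of input")
```

Because every amplicon is scaled by its own input, yields from a transgenic amplicon and from the endogenous site can be compared directly even when the two are present at different copy numbers in the cell population.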

CCND2 PTE is evolutionarily conserved
Evolutionary conservation is a good indicator that a regulatory element is functionally important. To investigate this question, we first looked at the evolutionary conservation of the DNA sequence underneath and around CCND2 PTE. Mouse Ccnd2 resides within a large 32-Mbp block of shared synteny between mouse chromosome 6 and human chromosome 12. Along with the ORF, which has high DNA sequence conservation, the 10-kb sequence upstream of the Ccnd2 TSS contains multiple blocks predicted as evolutionarily conserved DNA elements (Fig. 4A). One of these blocks corresponds to an ~300-bp sequence that is directly below the PRC1 binding peak within the human CCND2 PTE.
In mouse F9 testicular teratoma cells, where the Ccnd2 gene is transcriptionally inactive (Fig. 4B), its upstream region is immunoprecipitated with antibodies against Cbx7 and Suz12 (Fig. 4C). This indicates that the mouse Ccnd2 is a PcG target gene. Similar to what is seen at inactive human CCND2 in TIG-3 cells, the ChIP signal for Cbx7 is highest at the site corresponding to the evolutionarily conserved block within the PTE, but the highest Suz12 ChIP signal is shifted from the Cbx7 peak toward the TSS into the CpG-rich area. In contrast, in mouse NIH3T3 cells where the Ccnd2 gene is transcriptionally active (Fig. 4B), the upstream region shows very little precipitation with anti-Suz12 antibodies. However, much like in human NT2-D1 cells, ChIP signals for Ring2 (used to track PRC1 because the Cbx7 gene is not expressed in NIH3T3 cells) are high and peak within the 300-bp sequence orthologous to the human CCND2 PTE (Fig. 4D). To extend the parallel between human and mouse Ccnd2 further, we cloned 3.6- and 1.4-kb fragments covering the putative mouse PTE into the same lentiviral vector used to test the human PTE (Fig. 4A), and we integrated them into the genome of the human NT2-D1 cells by lentiviral transduction. ChIP-qPCR analysis indicates that the AT-rich region common between the 3.6- and 1.4-kb fragments is precipitated with antibodies against MEL18, CBX2, and RING2 and weakly with antibodies against SUZ12 (Fig. 4, E and F). This indicates that the mouse Ccnd2 gene is also equipped with a PTE.

CCND2 PTE is a composite element whose activity depends on specific DNA sequences
A typical Drosophila PRE is ~1 kb long and contains recognition sequences for multiple unrelated DNA-binding proteins that cooperate to provide robust PcG targeting. Often it may be subdivided into fragments of a few hundred bp that can still recruit PcG proteins in transgenic assays, but the recruitment is less robust (59-61). Therefore, we wondered whether the CCND2 PTE is smaller than the 2.4-kb fragment tested in our initial transgenic assay and whether the PTE contains a single core recruiting element or multiple weaker elements that cooperate. To address these questions, we examined PRC1 binding by sub-fragments of the 2.4-kb CCND2 PTE (Fig. 5A). ChIP analysis of PRC1 binding in cells transduced with corresponding lentiviral constructs indicates that the 1-kb fragment (PTE 1.2) centered on the summit of the MEL18- and BMI1-binding peak and the larger overlapping fragments (PTE 1.1 and PTE 1.3) bind MEL18, BMI1, and CBX2 as efficiently as the full-length 2.4-kb fragment or the endogenous CCND2 PTE (Fig. 5).

Further dissection indicates that smaller sub-fragments of PTE 1.2 cannot bind PRC1 as efficiently as the full-length fragment (Fig. 6 and Fig. S6). When integrated elsewhere in the genome, the left (PTE 1-6) or the central (PTE 6-8) parts are immunoprecipitated with anti-MEL18 antibodies very weakly (Fig. 6, B and C), and their precipitation with anti-BMI1 antibodies is at the edge of detection (Fig. S6). This is likely because in NT2-D1 cells BMI1 is less abundant than MEL18 (Fig. S1). The immunoprecipitation of larger fragments that combine the left and the central parts (PTE 1-8) or the central and the right parts (PTE 6-11) is more efficient (Fig. 6, D and E, and Fig. S8). Consistent with the asymmetric shape of the MEL18/BMI1 ChIP-chip peak (Fig. 1B), the PTE 6-11 fragment shows the strongest immunoprecipitation among all PTE 1.2 sub-fragments tested. This suggests that the central and the right parts of the 1-kb PTE make a greater contribution to the PRC1 retention. Importantly, the synthetic fragment (PTE 1.2Δ) that includes both the left and the right parts of PTE 1.2 but lacks the central part is still immunoprecipitated (Fig. 6F and Fig. 6). Overall, these results indicate that the CCND2 PTE consists of at least three separable modules, all of which contribute to the PRC1 retention with the central and the right modules being more important.
Both sequence-specific DNA-binding activity and cis-acting noncoding RNAs have been implicated in the targeting of mammalian PcG proteins (62). Recent genome annotation indicates two long noncoding RNAs CCND2-AS1 and CCND2-AS2, which originate within the second exon or the first intron of CCND2 (Fig. 1B). They are transcribed in the opposite direction to CCND2 and traverse over the PTE to terminate some 27 kb away from their TSS (63). These lncRNAs are unlikely to play a critical role in retaining PRC1 at CCND2 PTE as their transcription start sites were not included in our lentiviral constructs. Therefore, we sought evidence of a specific DNA-binding activity targeting the PTE 1. [...] (Fig. S7A), the sequence-specific DNA-binding proteins previously implicated in the recruitment of mammalian PcG proteins (54, 64-66). Consistently, we and others detect no YY1 binding to the CCND2 PTE (55,67). Mining the ENCODE data shows that the PTE 1.2 fragment does not bind any of the sequence-specific DNA-binding proteins mapped to date (Fig. S7A), although multiple transcription factors bind the adjacent CpG-rich region.

Because we did not find any binding sites for known sequence-specific proteins, we analyzed the CCND2 PTE for sequences that might bind proteins not yet implicated in PcG targeting. We hypothesized that some of the other PcG target genes, represented by high-confidence BMI1/MEL18-binding sites on the three human chromosomes surveyed by ChIP (55), may have elements similar to the CCND2 PTE. In this case, they may use some of the same DNA-binding proteins for PcG targeting and may be distinguished by the same sequence motifs. To test this conjecture, we applied multivariate modeling (68,69) and searched for all possible 6-nucleotide-long sequence words predictive of high-confidence MEL18/BMI1-binding sites compared with 1000 random sequences not bound by MEL18 or BMI1 and matched for the distance to the closest TSS. This approach yielded two motifs that we dubbed "CGA" and "CGCG" (Fig. 6G and Fig. S7B). Although the CGCG motif is likely an outcome of the close proximity between PTEs and CpG-islands, the CGA motif is interesting. First, it is present right at the summit of the CCND2 MEL18/BMI1 peak, and its position within the CCND2 PTE is highly evolutionarily conserved (Fig. S7C). Second, the CGA motif is significantly enriched in high-confidence MEL18/BMI1-binding sites (Fig. 6H). Overall, these observations suggest that the CGA motif is a common feature of MEL18/BMI1-bound sites and may be involved in binding of PRC1 at CCND2 PTE.
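The idea of scoring 6-nucleotide words by how well they separate bound sites from control sequences can be illustrated with a much simpler calculation than the multivariate modeling cited above. The Python sketch below is a deliberately reduced stand-in and not the method used in this study: it ranks each 6-mer by the ratio of the fraction of bound sequences containing it to the fraction of control sequences containing it. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmers_present(seq, k=6):
    """Return the set of distinct k-mers (A/C/G/T only) found in seq."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return {w for w in words if set(w) <= set("ACGT")}


def rank_kmers(bound, control, k=6, pseudo=0.5):
    """Rank k-mers by presence in bound versus control sequences.

    The score is the ratio of smoothed presence fractions; the pseudocount
    keeps words absent from one group from producing a zero or an
    infinite ratio. Ties are broken alphabetically.
    """
    in_bound, in_control = Counter(), Counter()
    for s in bound:
        in_bound.update(kmers_present(s, k))
    for s in control:
        in_control.update(kmers_present(s, k))
    scores = {}
    for word in set(in_bound) | set(in_control):
        frac_bound = (in_bound[word] + pseudo) / (len(bound) + 2 * pseudo)
        frac_control = (in_control[word] + pseudo) / (len(control) + 2 * pseudo)
        scores[word] = frac_bound / frac_control
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy sequences (hypothetical): one 6-mer is shared by all "bound" sites.
bound_sites = ["AACGAATCGG", "TTCGAATCTT", "GGCGAATCAA"]
control_sites = ["AAAAAAAAAA", "TTTTTTTTTT", "ACACACACAC"]
ranked = rank_kmers(bound_sites, control_sites)
for word, score in ranked[:3]:
    print(word, round(score, 2))
```

A per-word ratio of this kind ignores correlations between words, which is one reason a multivariate model is preferable on real data; the sketch only conveys what "predictive of bound sites" means for a single word.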
To test this hypothesis, we generated a lentiviral construct that contained the PTE 1.2 fragment in which the CG dinucleotide within the CGA motif was replaced with the AA dinucleotide. ChIP-qPCR analysis of transduced NT2-D1 cells showed that antibodies against PRC1 components precipitated the mutated PTE1.2 fragment (PTE1.2mCGA) 4-fold more weakly than the WT variant (Fig. 6I and Fig. S8). The reduction of ChIP signals is comparable with that seen after deleting the entire central part of the PTE (PTE1.2Δ construct). This suggests that the CGA motif makes an important contribution to PRC1 binding by the evolutionarily conserved central part of the PTE. The loss of ChIP signals upon mutation of the CGA motif or deletion of the central part of the PTE is significant but not complete. This indicates that other sequences within the 1-kb PTE contribute to binding. Consistently, electrophoretic mobility shift assays with nuclear protein extracts from NT2-D1 cells and a set of partially overlapping 100-150-bp fragments indicate that three different fragments (N2, N5, and N7) within the 1-kb PTE 1.2 element bind nuclear proteins in a sequence-specific fashion (Fig. S9). Altogether, our observations suggest that the CCND2 PTE is a composite element that may require multiple sequence-specific DNA-binding proteins to anchor PRC1.

PRC2 activity promotes PRC1 binding to the CCND2 PTE
In the NT2-D1 cells, where CCND2 is transcriptionally active, the PTE displays strong binding of PRC1 and very weak binding of PRC2 and H3K27me3. In these cells, the adjacent CpG-island binds neither PRC1 nor PRC2 nor H3K27me3. In the TIG-3 cells, where CCND2 is inactive, the PTE retains as much PRC1 as in the NT2-D1 cells, and we see weak PRC1 binding (~10-fold lower compared with the PTE) within the adjacent CpG-island. In these cells, the CpG-island is covered with PRC2 and H3K27me3. Taken together, our observations argue that H3K27me3 is not sufficient to retain PRC1 to an extent seen at the PTE. At best, the interaction between H3K27me3 and PRC1 may account for the low-level PRC1 binding within the CpG-island.
Although not sufficient to drive the strong PRC1 binding at the PTE, H3K27me3 may still be necessary for it. To address this question, we used the CRISPR/Cas9-mediated genome editing to knock down SUZ12 in the NT2-D1 cells (Fig. S10). To our surprise, the SUZ12 knockdown (Fig. 7, A and C) reduced the PTE immunoprecipitation by the antibodies against PRC1 subunits (Fig. 7B). Western blot analysis argues that the reduced immunoprecipitation of the CCND2 PTE is not due to the lower PRC1 abundance in the SUZ12-depleted cells (Fig. 7A). In fact, the NT2-D1 cells seem to require at least a low level of PRC2 for proliferation as after multiple attempts we failed to recover any cell lines completely deficient for SUZ12. In the clonal isolate used here (S1H6), the level of SUZ12 dropped to a few percent of that in the original NT2-D1 cells and the overall H3K27me3 is reduced about 10-fold (Fig. 7, A and C). Under these conditions, SUZ12 and H3K27me3 are no longer detectable at the CCND2 PTE. At the same time, a very low SUZ12 ChIP signal is still detected at the ALX4 gene (used as a positive control example of a PcG-repressed gene in NT2-D1 cells). With this level of SUZ12 binding, ALX4 shows significant immunoprecipitation with anti-H3K27me3 antibodies, and ChIP signals for CBX2 and MEL18 are not reduced (Fig. 7B).

Figure 4 legend (fragment): [...] vertebrate genomes illustrate that the putative mouse PTE is an evolutionarily conserved element. The heat map below the conservation score graph shows the number of CpG nucleotides within the 100-bp sliding window (ranging from dark blue = 0 to bright yellow = 10). Similar to the human CCND2 locus, the CpG-poor PTE is adjacent to a CpG-island that extends toward the Ccnd2 TSS. The Ccnd2 gene is transcribed from left to right. Also shown is lncRNA (9330179D12Rik) of unknown function. Positions of PCR amplicons analyzed in C-F are shown below the scale in mm10 genomic coordinates. Green and red rectangles indicate positions of the 3.6- and 1.4-kb fragments tested for the ability to generate a new PRC1-binding site in human cells. White rectangle marks the position of the evolutionarily conserved DNA sequence that corresponds to the ~300-bp fragment directly below the PRC1 binding peak within the human CCND2 PTE. B, RT-qPCR measurements indicate that mouse Ccnd2 is strongly expressed in NIH3T3 but very little in F9 cells. Histograms show the average and the scatter (whiskers) between two independent experiments. Values are normalized to the expression of the housekeeping β-actin gene. C, Cbx7 and Suz12 ChIP profiles across the upstream region of Ccnd2 in the mouse F9 cells. Note that the Suz12 profile is offset from the PTE and the Cbx7 peak into the CpG-rich region. D, Ring2 and Suz12 ChIP profiles across the upstream region of Ccnd2 in the mouse NIH3T3 cells. The immunoprecipitation of the β-actin gene (shown as bars to the right of each graph) was used as negative control. Here and in C the bar plots and graphs are to the same scale and show the average and the scatter (whiskers) between two independent experiments. ChIP analyses of 3.6-kb (E) and 1.4-kb (F) mouse PTE fragments integrated in the genome of human NT2-D1 cells show that both generate new binding sites for PRC1. Amplicons 3-5 shown in A were used to assay the transgenic mouse fragments. Immunoprecipitation of the endogenous human CCND2 PTE (amplicon 2, Fig. 2A) and the ALX4 locus was assayed in parallel as positive controls, and precipitation of the chromosome 12 gene desert was assayed as negative control. In both E and F, the overall difference between the MEL18, CBX2, and RING2 ChIP signals at the control ("desert") amplicon and the transgenic amplicon 3 or the transgenic amplicon 4 is statistically significant (p value = 0.01563, Wilcoxon signed rank test).
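The Wilcoxon signed rank p value of 0.01563 quoted above can be put in context with a small exact calculation. The text does not state how many paired values entered the test or whether it was one-sided, so the Python sketch below is only an illustration under stated assumptions: the function name and the example differences are hypothetical, and the enumeration handles only data without ties or zero differences. It shows that when every paired difference has the same sign, the exact p value is 1/64 = 0.015625 for six pairs (one-sided) and 2/128 = 0.015625 for seven pairs (two-sided); both round to the quoted figure.

```python
from itertools import product


def signed_rank_exact_p(diffs, alternative="two-sided"):
    """Exact Wilcoxon signed rank p value by full enumeration.

    Assumes no zero differences and no tied absolute values, so every
    rank 1..n is used exactly once. Under the null hypothesis each rank
    carries a plus or minus sign with probability 1/2.
    """
    magnitudes = sorted(abs(d) for d in diffs)
    if 0 in magnitudes or len(set(magnitudes)) != len(magnitudes):
        raise ValueError("zeros and ties are not handled in this sketch")
    rank_of = {m: i + 1 for i, m in enumerate(magnitudes)}
    n = len(diffs)
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    # Null distribution of the positive-rank sum over all 2**n sign patterns.
    null = [
        sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        for signs in product((False, True), repeat=n)
    ]
    p_greater = sum(w >= w_plus for w in null) / len(null)
    p_less = sum(w <= w_plus for w in null) / len(null)
    if alternative == "greater":
        return p_greater
    if alternative == "less":
        return p_less
    return min(1.0, 2 * min(p_greater, p_less))


# Hypothetical paired differences (transgene minus control), all positive.
six_pairs = [1.0, 2.0, 3.5, 4.1, 5.2, 6.3]
seven_pairs = six_pairs + [7.4]
print(signed_rank_exact_p(six_pairs, "greater"))
print(signed_rank_exact_p(seven_pairs, "two-sided"))
```

With so few pairs the test cannot return a smaller p value than these, so the quoted figure is consistent with every paired comparison pointing in the same direction.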
Taken together, these results suggest that the physical presence of PRC2 or its enzymatic activity is required for the robust binding of PRC1 at the CCND2 PTE.
To distinguish between the two possibilities, we inhibited PRC2 methyltransferase activity with a small molecule inhibitor UNC1999 shown to be highly specific for EZH2 and EZH1 (70). Notably, UNC1999 does not lead to degradation of PRC2 and does not prevent its binding to chromatin (71). Consistent with observations in other cultured cell lines (70,71), 12-day treatment of NT2-D1 cells with the recommended (2 μM) concentration of UNC1999 leads to partial inhibition of the PRC2 activity without reducing the overall levels of PRC2 and PRC1 (Fig. 7D). Under these conditions, ChIP signals for H3K27me3 at the CCND2 PTE drop to a background level but decrease less than 2-fold at the control ALX4 gene. At both sites, the SUZ12 ChIP signals remain unchanged (Fig. 7E). Importantly, following the UNC1999 treatment, the MEL18 and CBX2 ChIP signals decrease 4-fold at the CCND2 PTE but remain unchanged at ALX4 (Fig. 7E). To exclude the possibility that UNC1999 treatment causes indiscriminate loss of proteins bound next to active genes, we tested the binding of transcription factor CTCF upstream of the CLBP, VDAC2, and STK17A genes. According to a published expression array profile of the NT2-D1 cells (58), these genes are expressed within the same range (2-16% of GAPDH) as CCND2. As illustrated by Fig. 7F, the CTCF ChIP signals are not affected by the UNC1999 treatment. Taken together, our experiments suggest that the enzymatic activity of PRC2, but not its physical presence, is critical for the strong binding of PRC1 to the PTE.

Figure 5 legend (fragment): [...] Fig. 3A) is compared with that of the endogenous CCND2 PTE region (endo, amplicon 2, Fig. 2A), unrelated PcG-repressed gene ALX4, and the gene desert region (desert). Here and in C and D the histograms show the average and the scatter (whiskers) between the two independent experiments. ChIP-qPCR analyses of transgenes containing PTE 1.1 (C) and PTE 1.3 (D) indicate that both fragments bind PRC1 to an extent seen at the endogenous CCND2 PTE.

Figure 6 legend (fragment): [...] Fig. 3A. Note that some transgenic constructs lack some of the amplicons. Precipitation of transgenic fragments is compared with internal positive and negative controls (white box). The histograms show average ChIP yields normalized to that of the endogenous CCND2 PTE (amplicon 2) with whiskers indicating the scatter between two independent experiments. G, logo representation of the CGA motif. H, occurrence of the CGA motif within high-confidence genomic BMI1/MEL18-bound sites (blue) is much higher than that within control regions (red). The whiskers mark the 95% confidence interval of the mean based on a hundred permutations of the control group. I, ChIP analysis of MEL18 binding to the transgenic 1-kb PTE 1.2 fragment with the mutated CGA motif. The 4-fold reduction of ChIP signals suggests that the CGA motif is important for PRC1 recruitment by the PTE.

Discussion
Epigenetic repression by polycomb group mechanisms is essential for development of all multicellular animals and is frequently disrupted in cancers (4,72,73). Yet, our understanding of how the repression is targeted to specific genes is far from complete. Here, we analyzed the binding of canonical PcG complexes within the human CCND2 gene, which led to the following main conclusions. First, ChIP experiments identified a high-affinity PRC1-binding site upstream of the CCND2 TSS. This site binds PRC1 regardless of whether CCND2 is transcriptionally active or silent, and in both conditions it represents the strongest bound region within the locus. Second, transgenic analyses showed that the DNA of this high-affinity PRC1-binding site is sufficient to generate new PRC1-binding events when integrated elsewhere in the genome and therefore represents a novel PTE. Third, the comparison of PRC1-, PRC2-, and H3K27me3-binding profiles in cells where CCND2 is transcriptionally active with those in cells where CCND2 is repressed indicates that the high level of H3K27me3 is not sufficient to retain PRC1 to an extent seen at the PTE. Hence, other mechanisms must contribute to PRC1 binding at the PTE. Fourth, although H3K27me3 is not sufficient to account for the strong PRC1 binding at the CCND2 PTE, the enzymatic activity of PRC2 is necessary.
Historically, our view of the mammalian PcG system has been influenced by concepts developed in the Drosophila model. One of them is the concept of PREs that, in flies, are found at all developmental genes regulated by PcG mechanisms and serve as high-affinity binding sites for both PRC1 and PRC2. Similar to fly PREs, the CCND2 PTE is short (~1 kb), modular, and able to generate new binding sites for PRC1 when integrated elsewhere in the genome. However, in contrast to fly PREs, its ability to retain PRC2 is less clear cut. In cells where CCND2 is silent and bound by large quantities of PRC2, most of PRC2 binds outside the PTE within the adjacent CpG-island. This agrees with the documented ability of CpG-islands to retain PRC2, as long as their DNA is unmethylated and contains no enhancers or promoters engaged in the transcriptional activity (49-51, 74). Consistently, the PTE binds little PRC2 and H3K27me3 when integrated elsewhere in the genome or in cells where CCND2 is transcriptionally active. Altogether, our observations argue that, by itself, the CCND2 PTE is not very efficient in retaining PRC2. We speculate that at CCND2, the PTE and the adjacent CpG-island act in concert to target PRC1 and PRC2 in quantities necessary for repression. In this view, the combination of the PTE and the CpG-island represents the CCND2 PRE.
Similar to Drosophila, where H3K27me3 is often excluded from PREs (40,75,76), the tri-methylation of H3K27 is not sufficient to account for the strong binding of PRC1 at the CCND2 PTE. Yet, unlike the fly case, where PRC1 does not require PRC2 or H3K27me3 to bind PREs (46), the CCND2 PTE relies on the catalytic activity of PRC2 to bind PRC1 efficiently. How the catalytic activity of PRC2 helps the binding of PRC1 at the PTE is not entirely clear. Interactions with H3K27me3, deposited by the small amount of PRC2 present at the PTE, may combine with individually weak interactions between PRC1 and the PTE-bound sequence-specific adapter proteins to yield robust PRC1 binding. Alternatively, the binding of PRC1 at the PTE may require the global hit-and-run di-methylation of H3K27, which is known to make chromatin refractory to transcription and, possibly, more accommodating for PRC1 binding (22). More work will be needed to discriminate between the two possibilities.
The discovery of the CCND2 PTE presents new opportunities to study the targeting of mammalian PcG complexes. Of obvious interest is the nature of the DNA-binding proteins that may retain PRC1 at the CCND2 PTE. Another interesting problem that could be addressed using the CCND2 model is the question of what impairs the PRC2 binding to target genes when these are transcriptionally active. Finally, CCND2 is an oncogene (77)(78)(79), and multiple lines of evidence link the PcG mechanisms to cancer progression (73,80). It is still puzzling why some tumor types depend on the overexpression of PcG proteins and others require the loss of PcG function. The oncogenic effect of the overexpression has to some extent been explained by the erroneous silencing of the INK4A/ARF locus (81), but the link between malignant transformation and the loss of PcG proteins remains elusive. We speculate that a failure to repress CCND2 may, at least in part, explain this link. From this, we predict that the CCND2 PTE may be disrupted in some cancers that overexpress CCND2 but have the PcG system intact.

Transgenic constructs
Lentiviral constructs were produced by in vitro recombination of fragments of interest into the Eco47III site of the pLenti-CMVTRE3G-eGFP-ICR-Puro or pLenti-ICR-Puro vectors. In vitro recombination was done using the In-Fusion HD system (Clontech). All fragments were amplified using high-fidelity Pfu DNA polymerase (ThermoFisher Scientific) and human or mouse genomic DNA or, in case of CCND2 PTE sub-fragments, DNA of the PTE 2.4-kb construct as a template. PCR primers and their sequences are indicated in Tables S1 and S2. pLenti-CMVTRE3G-eGFP-ICR-Puro was generated based on the pLenti-CMVTRE3G-eGFP-Puro backbone (82). As the first step, pLenti-CMVTRE3G-eGFP-Puro was cut with HpaI and EcoRI and recombined with two DNA fragments produced by PCR with the following pairs of primers: 5′-TCGACGGTATCGGTTAACTT-3′; 5′-AGCGCTAGTCTCGTGATCGATAAA-3′ and 5′-CACGAGACTAGCGCTGAGAGTTGG-3′; 5′-CTACCCGGTAGAATTCCACGTGGGGAG-3′, and DNA of pLenti-CMVTRE3G-eGFP-Puro as a template. This step introduced a unique PmlI site between eGFP and Pac (puromycin resistance gene) and the unique Eco47III site upstream of the eGFP gene. As the second step, the resulting construct was digested with PmlI and recombined with the mouse Igf2/H19 ICR insulator sequence PCR-amplified from mouse genomic DNA using the forward 5′-CCGGTAGAATTCCACTGTCACAGCGGACCCCAACCTATG-3′ and the reverse 5′-GGGCCGCCTCCCCACTCGTGGACTCGGACTCCCAAATCA-3′ primers. The native Igf2/H19 ICR sequence contained one Eco47III site. As the final step, this site was removed by cutting the above construct with Eco47III and recombining it with the DNA fragment produced by PCR with the following primer pair: forward 5′-GATCACGAGACTAGCGCTGAGAG-3′ and reverse 5′-TTTTCACACAATGGCGCTGATGGCC-3′, using DNA of the above construct as a template.
... Fig. 2A. Here and in E and F, the histograms show the average and the scatter (whiskers) between two independent experiments. The overall difference of the SUZ12 and H3K27me3 or the MEL18 and CBX2 ChIP signals at amplicons 2 and 3 in the SUZ12-depleted and the control cells is statistically significant (p value = 0.003906, Wilcoxon signed rank test). C, immunofluorescent analysis of H3K27me3 reduction in SUZ12-depleted cells. Staining for total histone H3 is used as a positive control, and 4′,6-diamidino-2-phenylindole (DAPI) staining as a nuclear marker. D, Western blot analysis of 2-fold serial dilutions of nuclear protein from the UNC1999-treated and control cells shows that H3K27me3 is reduced, whereas the levels of SUZ12 and CBX2 remain unchanged. E, ChIP analysis of SUZ12, H3K27me3, MEL18, and CBX2 in control (red bars) and UNC1999-treated (blue bars) cells. The overall difference of the MEL18 and CBX2 ChIP signals at amplicons 2 and 3 in the UNC1999-treated and the control cells is statistically significant (p value = 0.003906, Wilcoxon signed rank test). F, ChIP analysis of the CTCF occupancy at the CLPB, VDAC2, and STK17A genes and the negative control locus (desert) in the UNC1999-treated (blue bars) and control (red bars) cells.
We originally planned to use the pLenti-CMVTRE3G-eGFP-ICR-Puro-based constructs and integrate them into NT2-D1 cells that had been modified to express the TetR protein from the constitutive CMV promoter. In theory, this should have allowed us to induce eGFP by adding doxycycline to the medium. Unfortunately, we soon discovered that the CMV promoter is not active in NT2-D1 cells. Therefore, to simplify the transgenes and reduce their size, we removed the TRE3G-eGFP part from pLenti-CMVTRE3G-eGFP-ICR-Puro to yield the pLenti-ICR-Puro vector (Fig. S5A). This was done by digesting pLenti-CMVTRE3G-eGFP-ICR-Puro with EcoRV and Eco47III and recombining it with the short dsDNA fragment produced by annealing the 5′-GATCACGAGACTAGCGCTGAGAGTTGGCTTCACGTGCTAGACCCAGCTTTC-3′ and 5′-GAAAGCTGGGTCTAGCACGTGAAGCCAACTCTCAGCGCTAGTCTCGTGATC-3′ oligonucleotides. Side-by-side comparison of PRC1 binding to transgenic PTE 1.2 in either vector showed that the presence of the CMVTRE3G-eGFP part had no effect on the recruitment of PRC1. The analyses in Fig. 3, B and C, and Fig. S5B were done with transgenic constructs based on pLenti-CMVTRE3G-eGFP-ICR-Puro. All other analyses used transgenes based on pLenti-ICR-Puro.
To produce viral particles, 5 × 10⁶ 293T cells were plated in a 75-cm² flask 24 h prior to transfection. The packaging plasmids pCMV-dR8.2dvpr (6 µg) and pCMV-VSV-G (3 µg) (gift from M. Roth, University of Medicine and Dentistry of New Jersey) were combined with a transfer construct (9 µg) and co-transfected using X-tremeGene HP (Roche Applied Science) at a 1:3 ratio of DNA to transfection reagent. After 24 h of incubation, the medium was changed. Lentiviral supernatant was collected after another 24 h, filtered (0.45-µm filters), and used for infection directly or stored at −80 °C.
For lentiviral infection, cells were plated at a confluence of 40-60% 24 h in advance. Viral supernatant was added in serial dilution to cells in combination with 8 µg/ml Polybrene (Millipore). After overnight incubation, the medium was changed to remove Polybrene. Transduced cells were selected for 14 days by growth on culture medium supplemented with 4 µg/ml puromycin (Invitrogen).
To generate the SUZ12 knockout NT2 cell line, the SUZ12g1.1 (caccgGGTGGCGGCGGCGACGGCTT) and SUZ12g1.2 (aaacAAGCCGTCGCCGCCGCCACCc) DNA oligonucleotides were annealed and cloned into the lentiCRISPRv2 plasmid (Addgene catalog no. 52961) linearized by digestion with BsmBI. The construct was introduced into NT2-D1 cells by lentiviral infection, and transduced cells were selected by growth on the culture medium supplemented with 4 µg/ml puromycin (Invitrogen). Cells were cloned, and the loss of SUZ12 was assayed by Western blotting and immunostaining with the antibodies listed in Table S3.
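The two SUZ12 oligonucleotides above share a simple layout: the first is the 20-nt guide preceded by lowercase `caccg`, and the second is the reverse complement of the guide preceded by `aaac` and followed by `c`. The sketch below rebuilds both oligonucleotides from the guide sequence, which can serve as a quick check against transcription errors. The function names are illustrative, and the flanking letters are inferred from the two sequences given here rather than taken from a cloning protocol.

```python
# Minimal sketch: rebuild the annealed-oligo pair from a 20-nt guide.
# Function names are hypothetical; flanks are inferred from the oligos in the text.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def guide_oligos(guide: str) -> tuple:
    """Return (sense, antisense) oligos with lowercase cloning flanks."""
    guide = guide.upper()
    sense = "caccg" + guide
    antisense = "aaac" + reverse_complement(guide) + "c"
    return sense, antisense


SUZ12_GUIDE = "GGTGGCGGCGGCGACGGCTT"
sense, antisense = guide_oligos(SUZ12_GUIDE)
print(sense)      # caccgGGTGGCGGCGGCGACGGCTT
print(antisense)  # aaacAAGCCGTCGCCGCCGCCACCc
```

Both printed sequences match SUZ12g1.1 and SUZ12g1.2 as listed.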
To inhibit EZH2/EZH1, the NT2-D1 cells were grown in complete DMEM supplemented with 2 µM UNC1999 (Cayman Chemical Co., catalog no. 14621) or 0.2% DMSO as a negative control. The medium was replaced every second day for 12 days, after which the cells were cross-linked for ChIP analyses. Fractions of the same cell cultures were taken before cross-linking, and their total nuclear protein was analyzed by Western blotting.

ChIP and RT-qPCR analyses
ChIP reactions were performed as described in Refs. 40 and 55. The antibodies used for ChIP are listed in Table S3. Total RNA from cultured cells was isolated using TRI Reagent (Sigma) according to the manufacturer's instructions. cDNA was prepared with the first-strand cDNA synthesis kit (ThermoFisher Scientific) using 2 µg of RNA and random hexamer primers and purified as described (48). qPCR analysis of cDNA and ChIP products was performed essentially as described previously (48,75) except that the iQ5 real-time PCR detection system (Bio-Rad) and the KAPA SYBR FAST qPCR kit (Kapa Biosystems) were used for all analyses. The Wilcoxon signed rank test implemented in R was used to evaluate the statistical significance of the difference in the immunoprecipitation of transgenic and control amplicons (Fig. 4, E and F; test parameters: wilcox.test(my.data$transgene, my.data$spacer, paired = TRUE, alternative = "greater")) or the difference in immunoprecipitation of the PTE amplicons in SUZ12 knockout or PRC2-inhibited cells compared with control cells (Fig. 7, B and E; test parameters: wilcox.test(my.data$knockout, my.data$WT, paired = TRUE, alternative = "less")). To compare gene expression between different cell lines and experimental conditions, the number of cDNA molecules was normalized to that of the stably and constitutively expressed glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (83,84). The primers used for qPCR analyses are described in Tables S4-S6.
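For readers who want to see what the one-sided paired test computes, the following is a minimal Python sketch of an exact Wilcoxon signed rank test. It assumes no zero differences and no tied absolute differences (in those cases R falls back to a normal approximation, which this sketch does not implement). The function name, the number of pairs, and all values are made up for illustration; they are not data from this study.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y, alternative="greater"):
    """Exact one-sided paired test; returns (V, p) with V = sum of positive ranks."""
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    diffs = [a - b for a, b in zip(x, y)]
    abs_d = [abs(d) for d in diffs]
    if 0 in abs_d or len(set(abs_d)) != len(abs_d):
        raise ValueError("sketch requires non-zero, untied differences")
    # Rank the absolute differences from smallest (1) to largest (n).
    order = sorted(range(len(diffs)), key=lambda i: abs_d[i])
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    v = sum(r for r, d in zip(ranks, diffs) if d > 0)
    n = len(diffs)
    # Under the null, each rank is positive or negative with probability 1/2,
    # so enumerate all 2**n sign patterns (fine for small n).
    hits = 0
    for signs in product((0, 1), repeat=n):
        v_null = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if (alternative == "greater" and v_null >= v) or (
            alternative == "less" and v_null <= v
        ):
            hits += 1
    return v, hits / 2 ** n


# Hypothetical ChIP yields (arbitrary units) for eight paired amplicons.
transgene = [12, 15, 9, 20, 17, 11, 14, 18]
spacer = [2, 4, 3, 5, 1, 6, 7, 0]
v, p = wilcoxon_signed_rank_exact(transgene, spacer, alternative="greater")
print(v, p)  # 36 0.00390625
```

With eight pairs that all differ in the same direction, the exact one-sided p value is 1/256, the smallest value attainable for that number of pairs.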
dsDNA fragments were labeled with [α-³²P]dATP (Perkin-Elmer Life Sciences) using PCR and the DNA of the PTE 2.4-kb construct as a template and purified by passing through Illustra MicroSpin S300 HR columns (GE Healthcare). The corresponding PCR primers are indicated in Table S7.
Binding reactions (final volume of 20 µl) were assembled by combining 5 µl of 4× binding buffer (80 mM HEPES, pH 7.4-7.9, 20 mM MgCl₂, 20% glycerol, 400 mM NaCl, 4 mM dithiothreitol, 4 mM EDTA), 5-10 µg of nuclear protein, 1-5 µg of salmon sperm DNA (Life Technologies, Inc.), 40 fmol of labeled probe, and a 150-fold excess of specific or nonspecific competitor. The binding was allowed to proceed for 20 min at room temperature, after which samples were run on an 8% acrylamide gel (29:1 acrylamide to bisacrylamide solution) in 1× TBE for 4-6 h at 160 V. The resulting gels were dried for 2 h and exposed to X-ray film (AGFA Healthcare).
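Because the buffer is listed at 4× strength, the working concentrations in the 20-µl reaction follow from a fourfold dilution (5 µl in 20 µl). The short sketch below does this arithmetic. It also computes the competitor amount on the assumption that "150-fold excess" means a molar excess over the 40 fmol of probe; that reading, and the variable names, are assumptions made for illustration.

```python
# Components of the 4x binding buffer as listed in the text.
STOCK_4X = {
    "HEPES (mM)": 80,
    "MgCl2 (mM)": 20,
    "glycerol (%)": 20,
    "NaCl (mM)": 400,
    "dithiothreitol (mM)": 4,
    "EDTA (mM)": 4,
}


def final_concentrations(stock, stock_volume_ul, final_volume_ul):
    """Scale each stock concentration by the dilution factor."""
    factor = stock_volume_ul / final_volume_ul
    return {name: conc * factor for name, conc in stock.items()}


final = final_concentrations(STOCK_4X, stock_volume_ul=5, final_volume_ul=20)
for name, conc in final.items():
    print(f"{name}: {conc:g}")

# Assumption: the excess is molar, relative to the labeled probe.
probe_fmol = 40
competitor_fmol = 150 * probe_fmol
print(f"competitor: {competitor_fmol} fmol ({competitor_fmol / 1000:g} pmol)")
```

Under these assumptions the reaction contains 20 mM HEPES, 5 mM MgCl₂, 5% glycerol, 100 mM NaCl, 1 mM dithiothreitol, and 1 mM EDTA, with 6 pmol of competitor.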

Computational analyses
Definition of bound regions. The MEL18- and BMI1-bound regions were defined as coordinates of clusters of microarray features that satisfied the following three criteria: (i) smoothed ChIP/input hybridization intensity ratios of the features were above the 99.8 percentile cutoff; (ii) the maximum distance to the neighboring feature above the intensity cutoff was equal to or greater than 200 bp; and (iii) the length of the cluster was equal to or greater than 200 bp. The peak center of a bound region was set at the center of the five consecutive microarray features with the highest hybridization intensity values. Regions of 1 kb, centered on overlapping binding peaks of MEL18 and BMI1, were considered as high-confidence MEL18/BMI1-binding sites. CpG-islands were defined using default parameters in the EMBOSS package (85).
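The clustering rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the analysis: each microarray feature is reduced to a single coordinate, the percentile cutoff is passed in as a precomputed number, criterion (ii) is read as an upper limit on the gap between neighboring above-cutoff features within one cluster, and the "five consecutive features with the highest intensity" are taken to be the five-feature window with the largest summed ratio. All names and example values are hypothetical.

```python
def call_bound_regions(features, cutoff, max_gap=200, min_length=200, window=5):
    """features: list of (position_bp, smoothed_ratio), sorted by position."""
    above = [(pos, ratio) for pos, ratio in features if ratio > cutoff]

    # Group above-cutoff features whose neighbors lie within max_gap bp.
    clusters, current = [], []
    for feat in above:
        if current and feat[0] - current[-1][0] > max_gap:
            clusters.append(current)
            current = []
        current.append(feat)
    if current:
        clusters.append(current)

    regions = []
    for cluster in clusters:
        start, end = cluster[0][0], cluster[-1][0]
        if end - start < min_length:
            continue  # cluster too short to count as a bound region
        w = min(window, len(cluster))
        # Pick the window of w consecutive features with the largest summed ratio.
        best = max(
            range(len(cluster) - w + 1),
            key=lambda i: sum(ratio for _, ratio in cluster[i:i + w]),
        )
        peak = (cluster[best][0] + cluster[best + w - 1][0]) / 2
        regions.append({"start": start, "end": end, "peak_center": peak})
    return regions


# Toy data: one feature every 50 bp, a broad peak at 500-900 and a lone spike at 1500.
signal = {500: 2.0, 550: 2.5, 600: 3.0, 650: 4.0, 700: 5.0,
          750: 4.5, 800: 3.0, 850: 2.5, 900: 2.0, 1500: 3.0}
features = [(pos, signal.get(pos, 1.0)) for pos in range(0, 2000, 50)]
regions = call_bound_regions(features, cutoff=1.5)
print(regions)  # [{'start': 500, 'end': 900, 'peak_center': 700.0}]
```

The isolated spike at 1500 bp passes the intensity cutoff but is discarded because its cluster is shorter than 200 bp.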
Motif analysis. The 317 most predictive words from the 1-kb MEL18/BMI1 high-confidence regions derived by multivariate modeling, as described previously (68,69), were pairwise aligned, and each alignment was assigned a score reflecting the maximum number of identical nucleotides in the alignment. Based on these scores, we generated a hierarchical tree (Euclidean distance and complete linkage) using Cluster 3.0 (86). The tree was divided into eight groups, and the words from each group were realigned using MUSCLE (87). The aligned words were used to build the position weight matrix (PWM) as described in Ref. 88. As the final step, the resulting PWMs were optimized with the Bound/Surveyed Sequence Discrimination Algorithm (89). The prediction of potential binding sites for the YY1, RUNX1/CBFβ, REST, and SNAIL proteins was done as described previously (90).
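Two steps of this workflow are easy to illustrate: scoring a pair of words by their maximum number of identical nucleotides, and turning a set of aligned words into per-position nucleotide frequencies. The Python sketch below makes simplifying assumptions that are not taken from the cited methods: the pairwise alignment is ungapped (one word slides along the other), and the matrix is a plain frequency matrix with an optional pseudocount rather than the PWM construction of Ref. 88. Function names and example words are hypothetical.

```python
def max_identity_score(a, b):
    """Largest number of matching positions over all ungapped offsets of b along a."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = 0
        for j, base in enumerate(b):
            i = offset + j
            if 0 <= i < len(a) and a[i] == base:
                matches += 1
        best = max(best, matches)
    return best


def position_frequency_matrix(aligned_words, pseudocount=0.0):
    """Per-column A/C/G/T frequencies; gap characters ('-') are ignored."""
    length = len(aligned_words[0])
    if any(len(word) != length for word in aligned_words):
        raise ValueError("aligned words must have equal length")
    matrix = []
    for col in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for word in aligned_words:
            if word[col] in counts:
                counts[word[col]] += 1
        total = sum(counts.values())
        if total == 0:
            matrix.append({base: 0.25 for base in "ACGT"})  # all-gap column
        else:
            matrix.append({base: counts[base] / total for base in "ACGT"})
    return matrix


print(max_identity_score("GATTACA", "TTAC"))  # 4

aligned = ["ACGT-", "ACGTA", "-CGTA", "ACCTA"]
pfm = position_frequency_matrix(aligned)
print(pfm[2])  # {'A': 0.0, 'C': 0.25, 'G': 0.75, 'T': 0.0}
```

A matrix of pairwise scores of this kind is what a hierarchical clustering tool would take as input, and the frequency matrix is the usual starting point for a log-odds PWM.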