Arid4b is critical for mouse embryonic stem cell differentiation towards mesoderm and endoderm, linking epigenetics to pluripotency exit.

Distinct cell types emerge from embryonic stem cells through a precise and coordinated execution of gene expression programs during lineage commitment. This is established by the action of lineage-specific transcription factors along with chromatin complexes. Numerous studies have focused on epigenetic factors that affect embryonic stem cell (ESC) self-renewal and pluripotency. However, the contribution of chromatin to lineage decisions at the exit from pluripotency has not been as extensively studied. Using a pooled epigenetic shRNA screen strategy, we identified chromatin-related factors critical for differentiation toward mesodermal and endodermal lineages. Here we reveal a critical role for the chromatin protein ARID4B. Arid4b-deficient mESCs are similar to WT mESCs in the expression of pluripotency factors and their self-renewal. However, ARID4B loss results in defects in up-regulation of the meso/endodermal gene expression program. It was previously shown that ARID4B resides in a complex with SIN3A and HDACs 1 and 2. We identified a physical and functional interaction of ARID4B with HDAC1 rather than HDAC2, suggesting that functionally distinct Sin3a subcomplexes might regulate cell fate decisions. Finally, we observed that ARID4B deficiency leads to increased H3K27me3 and reduced H3K27Ac levels at key developmental gene loci, whereas a subset of genomic regions gain H3K27Ac marks. Our results demonstrate that epigenetic control through ARID4B plays a key role in the execution of lineage-specific gene expression programs at pluripotency exit.
During early embryonic development, a series of differentiation and cleavage events lead to the formation of distinct cell types that later form the organism. The emergence of various cell types is a complex process that requires a precisely timed mechanism for successful development. Embryonic stem cells (ESCs) provide an in vitro model for studying early cell fate decisions. ESCs self-renew limitlessly in vitro. Because they have the capacity to form all cell types (pluripotency), they can be directed to desired lineages under the guidance of specific cytokines.
Cell fate decisions are executed by changes in gene expression. While the gene expression program of a particular lineage is being established, unrelated programs are simultaneously extinguished. The chromatin environment plays a critical role in regulating the timing and the level of gene expression. The ESC-specific gene expression program is stabilized by the interactions of core pluripotency transcription factors and chromatin complexes (1-3). The plasticity of ESC differentiation potential is reflected in an open chromatin structure. Progressively during differentiation, ESCs undergo reorganization of chromatin architecture and genomic topology (4-9). Alterations in the chromatin environment of ESCs, therefore, may impact lineage commitment dynamics.
Studies have identified chromatin factors regulating ESC self-renewal and pluripotency (10-18). It is becoming increasingly clear that the chromatin architecture and histone modifications at the ESC stage can affect cell fate specification and differentiation kinetics at later stages (17,19). However, a comprehensive study of the epigenetic regulators acting subsequent to the loss of self-renewal and pluripotency has been lacking. Therefore, we sought to determine the role of chromatin factors in an unbiased manner during meso/endodermal lineage commitment. To accomplish this goal, we monitored the expression of the first lineage-specific master transcription factors to enable a more precise look at the chromatin-related requirements at cell fate decisions. Our approach departs from previous reports focusing on epigenetic effects on ESC characteristics (10,12).

Functional RNAi screen identifies candidate chromatin factors required for endoderm and mesoderm commitment
We used a pooled shRNA library screen to identify epigenetic factors that impact mouse embryonic stem cell differentiation toward mesoderm and endoderm (Fig. 1a). The shRNA library consisted of 5 previously validated shRNAs per gene targeting ~300 chromatin-related proteins. A Brachyury-GFP; Foxa2-hCD4 reporter mESC line was transduced at low multiplicity of infection with the pooled shRNA library, enabling single shRNA knockdown per cell. After puromycin selection of … (Fig. 1, a and b). We found that the loss of chromatin factors more frequently led to the differentiated phenotype with variable strength (Fig. 1b).
However, this was to be expected for the screening design as the differentiation efficiency of WT cells was ~70% as calculated from the BRACHYURY-positive cell population on day 5 (Fig. 1c). Consistent with published data (20-26), depletion of several members of PcG and TrX complexes affected mESC differentiation (Fig. 1b). shRNAs that were depleted at least 2-fold in differentiated cell pools versus undifferentiated cell pools were selected as potential candidates and further validated by single shRNA knockdown experiments (Fig. 1c). Observed differentiation defects were similar for mesoderm and endoderm lineages (data not shown). This observation suggests that under the conditions of this screen the candidate chromatin factors might impact a common mesendodermal cell population that gives rise to both lineages.
Figure 1. ARID4B loss leads to meso/endodermal differentiation defects. a, design of the shRNA screen. b, waterfall plot of shRNAs ranked by log2 of the enrichment score in differentiated over undifferentiated cells. Negative controls are in red (not visible since their enrichment score is close to zero) and positive controls (PcG complex members) are in black. c, endoderm differentiation efficiency is plotted as % BRACHYURY-positive cells on day 5 of differentiation. Negative control: nontargeting shRNA. d, flow cytometry data for endoderm differentiation. Negative control: nontargeting shRNA. Bry-GFP, Foxa2-hCD4 mESCs were transduced with either nontargeting or Arid4b-targeting shRNAs. After differentiation toward endoderm, expression of BRACHYURY (GFP, x axis) and FOXA2 (hCD4, y axis) was determined by flow cytometry. e-l, RT-qPCR of selected transcripts during endoderm differentiation time course in WT, arid4bD, or arid4bD cells that re-express human ARID4B.

ARID4B is essential for successful mESC differentiation toward endoderm and mesoderm
The ARID family protein ARID4B was chosen for in-depth study as its knockdown led to compromised mesoderm and endoderm differentiation. ARID family proteins exhibit DNA-binding activity with little or no sequence specificity and display diverse functions in development and disease progression (27,28). ARID4B and the related ARID4A protein contain a Tudor domain and a chromobarrel domain that recognizes methylated histones (29). In adult tissues, Arid4b expression is restricted to the testis and is important for spermatogonial development (30-32). Reactivation of expression has been reported in cancer (33-37). Deficiency of ARID4A and ARID4B results in a decrease in repressive chromatin modifications in the Prader-Willi/Angelman imprinting cluster (38). In our experiments, knockdown of Arid4b with two independent shRNAs severely compromised differentiation of reporter mESCs toward mesodermal or endodermal lineages (Fig. 1, c and d), and SSEA1 remained high (Fig. S1, a and b), even with a modest decrease in the Arid4b level (Fig. S1c).
ARID4B is reported to be a component of the Sin3a corepressor complex (39,40). Through its several protein interaction domains, SIN3A serves as a scaffold for histone deacetylases HDAC1/2 and several other proteins that regulate HDAC function and activity (41). Although the Sin3a complex was originally classified as a transcriptional repressor, more recent evidence suggests a role in transcriptional activation (42-44). In addition to Arid4b, knockdown of other members of the Sin3a complex, including Phf12, Mbd4, and Phf21a, led to defects in commitment of mESCs to mesoderm and endoderm (Fig. 1c, Fig. S1, a and b).
To validate the shRNA knockdown findings, we deleted the Arid4b gene in mESCs with CRISPR/Cas9 (Fig. S1d). Arid4b-deleted mESCs expressed Oct4 and Nanog at similar levels to WT mESCs (Fig. 1, e and f, Fig. S1, e and f). Oct4 and Nanog expression was suppressed with similar kinetics as in WT cells during endoderm or mesoderm commitment. Moreover, Arid4b-deleted mESCs failed to express Brachyury, Foxa2, or Sox17 during endoderm (Fig. 1, g-i) or mesoderm differentiation (Fig. S1g). Upon extension of endoderm differentiation from 5 to 8 days, we observed markedly reduced expression of Brachyury and Foxa2 in Arid4b-deleted cells (Fig. S1, h and i). Importantly, expression of human ARID4B in Arid4b-deleted mESCs rescued the endoderm differentiation defect (Fig. 1, j-l, Fig. S1j). Due to this differentiation defect, we refer to arid4bD cells that are exposed to the same differentiation protocol as WT cells as "meso/endoderm directed" rather than "arid4bD meso/endoderm cells."

Hdac1 and Hdac2 exert different roles in lineage commitment
Given the reported presence of HDAC1 and HDAC2 (45) in Arid4b/Sin3a corepressor complexes, we tested whether the differentiation defect upon ARID4B loss is phenocopied by loss of histone deacetylase activity. First, we used a Class I HDAC inhibitor, Merck 60, which is selective toward HDAC1 and HDAC2 with IC50 values of 1 and 8 nM, respectively. Histone deacetylation has key functions in maintaining a balance between self-renewal and differentiation (46-51). To prevent confounding effects of Merck 60 treatment at the ESC stage, we limited its use to the differentiation phase. We assessed endoderm/mesoderm differentiation efficiency upon inhibitor treatment in the Brachyury-GFP; Foxa2-hCD4 reporter mESCs. Increasing concentrations of Merck 60 were associated with elevated histone 3 acetylation (Fig. S2a). BRACHYURY and FOXA2 expression was reduced upon Merck 60 treatment during endoderm differentiation (Fig. S2b). However, SSEA1 levels were unchanged in DMSO or Merck 60-treated cells. Similar results were obtained for Merck 60 treatment during mesoderm differentiation (Fig. S2c).
To resolve ambiguities from inhibitor treatment, we generated independent CRISPR/Cas9-mediated Hdac1 or Hdac2 deletions in mESCs (Fig. 2, a and b). Similar to Arid4b-deleted cells, Hdac1-deleted mESCs fail to express Brachyury, Foxa2, or Sox17 during endoderm differentiation, whereas Hdac2 deletion had no evident effect (Fig. 2, c-e). Mesoderm differentiation was also defective in hdac1D cells (Fig. 2f). On the other hand, Nanog suppression during differentiation followed with similar kinetics in WT, hdac1D, and hdac2D cells (Fig. S2, d-f). These results are consistent with a critical role of HDAC1, but not HDAC2, in early embryogenesis (52,53). In essence, HDAC1 loss phenocopies aspects of ARID4B deficiency.
We next asked whether the loss of ARID4B or HDAC1 affected neuroectodermal lineage commitment. In contrast to mesoderm or endoderm differentiation, the loss of ARID4B or HDAC1 failed to affect commitment toward the neuroectodermal lineage, as evidenced by the expression of Sox1, Pax6, or Jag1 marker genes (Fig. 2, g and h, Fig. S2g). We conclude that the function of ARID4B is essential for meso/endodermal commitment and dispensable for the neuroectodermal lineage.
Although both HDAC1 and HDAC2 are present in the Sin3a complex, it is interesting that only Hdac1 deletion phenocopies Arid4b deletion. We tested whether this result might be due to a preferential physical interaction between ARID4B and HDAC1. We performed coimmunoprecipitation using Arid4b antibody in WT and arid4bD mESC nuclear extracts (Fig. 2i). As expected, ARID4B successfully immunoprecipitated SIN3A. HDAC1 was coimmunoprecipitated with ARID4B in WT nuclear extracts. Of note, HDAC2 was not detected in the pulldown with ARID4B even though it is expressed in these cells.
We then performed glycerol gradient centrifugation to analyze intact complex composition. ARID4B and SIN3A peaks coincided in high molecular weight fractions. A greater proportion of HDAC1 was found in these same fractions. Although there was some HDAC2 in these fractions, the proportion of HDAC2 was more pronounced in lower molecular weight fractions that lacked SIN3A (Fig. 2j).
Next, we performed a proximity ligation assay (PLA) to detect in situ ARID4B interaction with HDAC1 or HDAC2. This technique utilizes a pair of oligonucleotide-bound antibodies to enable continuous DNA synthesis only if the epitopes are in close proximity (<40 nm) and is used for intracellular visualization of protein-protein interactions. Consistent with previous results, we observed more ARID4B-HDAC1 interactions than ARID4B-HDAC2 interactions in mESCs (Fig. 2k). The majority of interactions colocalized with 4′,6-diamidino-2-phenylindole, consistent with the subcellular localization and function of these proteins. The greater number of ARID4B-HDAC1 interactions was not due to differences in abundance, because HDAC1 and HDAC2 were expressed at similar levels in mESCs (Fig. S2h). These results suggest that the observed mesodermal and endodermal differentiation defect of ARID4B deficiency is associated with loss of HDAC1 activity in the Sin3a complex.

arid4bD and hdac1D cells exhibit similar global histone modification profile
Next, we investigated the global chromatin profile of endoderm committed WT, arid4bD, hdac1D, and hdac2D cells. To this end, we performed a quantitative analysis of histone posttranslational modifications by MS, which allowed for an unbiased examination of histone modifications, as well as their combinatorial constitution in each cell type. The results were normalized to WT and clustered using the Euclidean distance metric (Fig. 2l). WT cells clustered closely with hdac2D cells. arid4bD cells clustered away from WT cells and were more similar to hdac1D cells than hdac2D cells. These observations are consistent with the similarities in phenotype of ARID4B and HDAC1 loss.
Given the differentiation defect of arid4bD cells, it is possible that the arid4bD cells maintain an ESC stage histone modification profile. To test this, we compared the global histone modification profile of WT ESCs to those of endoderm-differentiated WT, arid4bD, hdac1D, and hdac2D cells. Interestingly, WT ESCs clustered away from endoderm-differentiated cells, regardless of the genotype (Fig. S2i). These observations support a model in which arid4bD cells do not remain as ESCs during differentiation but are unable to successfully execute commitment to endoderm or mesoderm lineages.
Figure 2. ARID4B functionally and physically interacts with HDAC1. a, validation of Hdac1 knockout by Western blotting in WT and CRISPR-mediated knockout cells during endoderm differentiation. b, validation of Hdac2 knockout by Western blotting in WT and CRISPR-mediated knockout cells during endoderm differentiation. c-e, RT-qPCR of Brachyury (c), Foxa2 (d), and Sox17 (e) during endoderm differentiation time course in WT, hdac1D, and hdac2D cells. f, RT-qPCR of Brachyury during mesoderm differentiation time course in WT, hdac1D, and hdac2D cells. g and h, RT-qPCR of Sox1 (g) and Pax6 (h) during neuroectoderm differentiation time course in WT, hdac1D, hdac2D, and arid4bD cells. i, coimmunoprecipitation using ARID4B antibody. Nuclear extracts from WT or arid4bD mESCs were used. j, cosedimentation assay using glycerol gradient centrifugation. k, PLA of ARID4B-HDAC1, ARID4B-HDAC2 in WT mESCs. Red dots depict interactions between tested proteins. 4′,6-Diamidino-2-phenylindole was used to stain the nuclei. A PLA reaction without the use of primary antibodies was used as a negative control. l, proteomic analysis of histone modifications in endoderm directed WT, arid4bD, hdac1D, and hdac2D cells.

arid4bD cells fail to up-regulate meso/endodermal gene expression program
To further investigate the role of ARID4B in mESC lineage commitment, we conducted RNA expression profiling of WT or arid4bD cells directed toward mesoderm or endoderm. Hierarchical clustering of the samples showed that RNA-seq retained high reproducibility in replicates (Fig. S3a). Compared with WT cells, arid4bD cells showed a reduction in the expression of primitive streak and endodermal markers (Fig. 3a). Comparative analysis of transcriptomes revealed that 171 genes were significantly down-regulated (fold-change > 2, adjusted p value < 0.01) in endoderm-directed arid4bD cells and 35 genes were up-regulated. We validated the expression of a larger set of lineage-specific genes (Fig. 3, b-j). Gene set enrichment analyses (GSEA) demonstrated that the loss of ARID4B was associated with reduced representation of pathways related to proper lineage commitment and embryonic development (Fig. 3, k-m). Signaling pathways activated in stem cell differentiation were down-regulated in arid4bD cells (Fig. 3, n-o). On the other hand, type I interferon pathway and cellular viral defense response pathways were strongly activated in arid4bD cells (Fig. S3, b and c).

Chromatin landscape is altered upon Arid4b loss in lineage commitment
To interrogate changes in the chromatin structure of differentiating mESCs in arid4bD cells, we performed ChIP for the histone marks H3K4me3, H3K27me3, and H3K27Ac. We compared ChIP-seq intensities of these chromatin marks between WT and arid4bD cells using a quantitative algorithm called MAnorm (54). H3K27Ac signal was up-regulated in arid4bD mesoderm- or endoderm-differentiated cells (Fig. 4a, Fig. S4a). There was a small but notable change in H3K27me3 levels as well (Fig. 4b, Fig. S4b). In contrast, H3K4me3 peaks were largely unchanged (Fig. 4c, Fig. S4c). Further analysis of H3K27Ac signal revealed the increase to be in regions distal, rather than proximal, to the transcription start site (TSS) (Fig. 4, d and e, Fig. S4, d and e).
Genes responsible for a specific biological process might be coregulated through chromatin changes. Therefore, we used Genomic Regions Enrichment of Annotations Tool (GREAT) to identify biological processes enriched for each chromatin mark (55). Consistent with previous results, genes that lose H3K27Ac and H3K4me3 signal, and genes that gain H3K27me3 signal in mesoderm-directed arid4bD cells were strongly enriched in pathways related to embryonic development, pattern specification, and differentiation (Fig. S4, f-h).
H3K27 acetylation is observed in active enhancers. Superenhancers (SE) are large clusters of enhancers that are marked by broad H3K27Ac and a high concentration of transcription activators. They define cell identity by regulating the expression of key cell fate genes (56-58). Given the essential role of ARID4B in mesodermal and endodermal commitment, we assessed whether H3K27Ac changes in arid4bD cells correlate with any changes in SEs. We found that the number of SEs is greater in arid4bD cells as compared with mesoderm-differentiated WT cells (Fig. 4, f and g). There was a similar increase in the number of SEs in endoderm-differentiated arid4bD cells (Fig. 4, h and i). The changes in the number of SEs in arid4bD cells might underlie the cell fate defects. Next, we analyzed the genes and pathways enriched in SEs using the GREAT database. We found that SEs unique to endoderm-differentiated WT mESCs were enriched in morphogenetic and developmental processes as well as regulation of transcription (Fig. S4i). No pathways were enriched in common SEs or arid4bD unique enhancers. It should also be noted that many of the common SEs exhibited an increased H3K27Ac mark in arid4bD cells. These results indicate that developmental genes critical for endoderm development might fail to acquire the H3K27Ac mark in arid4bD cells.
Next, we investigated a possible correlation between changes in chromatin landscape and gene expression. Using SitePro analysis, we found that the genes that are down-regulated in mesoderm-directed arid4bD cells show increased H3K27me3 signal around their TSS. Interestingly, higher H3K4me3 modification around the TSS accompanied the H3K27me3 mark in these genes (Fig. 5, a-c). Genes that are down-regulated in endoderm-directed arid4bD cells had higher H3K27me3 and H3K4me3, and a pronounced decrease in H3K27Ac compared with WT (Fig. S5, a-c). On the other hand, up-regulated genes exhibited higher H3K4me3 and H3K27Ac marks around the TSS in arid4bD cells (Fig. 5, d-f, Fig. S5, d-f). These results indicate that the alterations in H3K27 rather than H3K4 are associated with changes in the gene expression program observed in Arid4b deficiency.
We compared the distribution and the intensity of chromatin marks of WT and arid4bD cells using Integrative Genomics Viewer (IGV). Genes required for the establishment of meso/endodermal lineage (such as Bry (T), Eomes, Mixl1, Foxa2, Gsc, Hoxa1, Hoxb1, Lhx1) were found to be generally marked with …
Figure 5. a-c, Sitepro analysis of H3K27Ac (a), H3K27Me3 (b), and H3K4Me3 (c) ChIP-seq on transcriptionally down-regulated genes in mesoderm-directed WT (blue) and arid4bD (red) cells. x axis, average signal profile; y axis, relative distance from the center (TSS). d-f, Sitepro analysis of H3K27Ac (d), H3K27Me3 (e), and H3K4Me3 (f) ChIP-seq on transcriptionally up-regulated genes in mesoderm-directed WT (blue) and arid4bD (red) cells. x axis, average signal profile; y axis, relative distance from the center (TSS). g, Integrative Genomics Viewer visualization of ChIP-seq tracks for selected lineage specific genes (Bry, Eomes, Mixl1, Fgf8, Foxa2, Gsc, Hoxa1, Hoxb1, and Lhx1) and ESC specific genes (Oct4 (Pou5f1), Nanog) in mesoderm-directed WT and arid4bD cells. y axes of WT and arid4bD tracks are set to the same data range.

Discussion
Prior analysis of the role of the Sin3a complex in ESC biology has led to apparently conflicting findings. Sin3a knockout results in embryonic lethality around E3.5 and 6.5 (59,60). However, loss of the highly related SIN3B protein is lethal only later during development (61). Although Arid4a knockout mice are viable, Arid4b knockout mice die between E3.5 and 7.5 (38). Hdac1 knockout mice are similarly embryonic lethal, whereas Hdac2 deletion is viable (52,62,63). Although both HDAC1 and HDAC2 independently interact with SIN3A, it is unclear whether these proteins function within the same complex or are present in alternate Sin3a complexes (45). Taken together with the previous findings, our results point to a unique role of a SIN3A, HDAC1, and ARID4B containing complex in ESC biology and differentiation. We found that, similar to Arid4b deletion, the deletion of Hdac1 but not Hdac2, prevents mesoderm and endoderm differentiation. Our findings support previous reports on the role of HDAC1 in ESCs (50,53,64). Moreover, we observe ARID4B interaction with SIN3A and HDAC1, but not with HDAC2, despite considerable Hdac2 expression in these cells. Recently, an ESC-specific variant Sin3a complex was identified, supporting the notion that the composition of the Sin3a complex may vary among cell types and during cell differentiation (65).
A genetic perturbation of a member of a protein complex may lead to formation of residual complexes with different functional outcomes, as recently described for SWI/SNF complexes in cancer (66-68). Accordingly, the phenotype observed in arid4bD ESCs might be due to the function of an ARID4B-less Sin3a complex rather than the complete loss of Sin3a complex function.
In endoderm-directed arid4bD cells, transcripts for 41 genes were up-regulated and 170 genes were down-regulated more than 2-fold (adjusted p < 0.05). Similarly, for mesoderm differentiation, transcripts for 39 genes were up-regulated and 308 genes were down-regulated in arid4bD cells. Although these genes represent both direct and indirect targets of the Sin3a complex, the observation that a majority of genes are down-regulated upon ARID4B loss is consistent with a role of the Sin3a complex in transcriptional activation. Indeed, evidence from yeast, Drosophila, and mammals reveals that histone deacetylation by the Sin3a complex has a fine-tuning function for transcribed genes (42-44, 59, 69-79).
Our ChIP-seq experiments revealed critical changes associated with the loss of ARID4B during meso/endodermal differentiation, exemplified by modification at H3K27. Downregulated genes, many of which have key developmental roles, have H3K4me3 around their TSS in arid4bD cells, suggesting ARID4B loss does not compromise MLL complex function. However, high H3K27me3 modification accompanies H3K4me3 and there is very little transcriptional output.
These results suggest that the loss of ARID4B function might alter H3K27me3 deposition or removal in lineage-specific genes upon differentiation and might prevent their transcriptional up-regulation.
On the other hand, we observed elevated H3K27Ac mark and SEs in a subset of genes unrelated to ESC differentiation. SEs harbor a dense population of master regulators of cell fate and the Mediator complex components along with many chromatin factors (56-58). It is possible that the aberrant H3K27Ac-high SE regions in arid4bD cells may compete for and sequester away some of these factors required for the chromatin reorganization and transcription of ESC differentiation genes.
Remodeling of the ESC cell cycle is coincident with exit from pluripotency (80,81). Even though there appears to be a link between these two events, the notion that the change in cell cycle is directly linked to differentiation has been challenged (82,83). ARID4A has a unique LXCXE motif that mediates interaction with pRB (27). ARID4A recruits the Sin3a corepressor complex (and thus HDAC1) to pRB targets for transcriptional suppression (84-86). This enables cell cycle control through the G1 phase. Interestingly, ARID4B lacks the LXCXE motif and is not predicted to interact with pRB. We also did not detect changes in the number of cycling ESCs or the distribution among cell cycle phases in arid4bD ESCs (data not shown). It is conceivable that a change in the composition of the Sin3a complex in arid4bD cells might indirectly affect the cell cycle. Similar changes in chromatin complex architecture and function are observed for chromatin remodeling complexes (66, 87-91).
Our Arid4b knockdown and knockout experiments resulted in protein deficiency at the ESC stage, whereas differentiation defects were observed later. Even though arid4bD ESCs are similar to WT ESCs on the basis of pluripotency marker expression and cell cycle analyses, we cannot rule out the possibility that the differentiation defect in arid4bD cells originates already at the ESC stage. A more detailed analysis of the transcriptomic changes observed in ESCs and throughout the differentiation time course is needed to identify precisely when and where ARID4B function is critical.

Generation of CRISPR deletion mESCs
Paired single guide RNAs were designed to limit off-target cleavage and delete critical coding exons of the selected candidate genes. mESC deletions were performed as previously described for MEL cells (94,95). mESC clones were screened using conventional PCR and validated by Western blotting.

Generation of Arid4b rescue mESCs
Full-length human ARID4B cDNA was purchased from Dharmacon (clone number 40146449). The human ARID4B ORF was amplified with AscI and XbaI restriction sites and cloned into pEF1a-FlagBio plasmid (96). arid4bD mESCs were electroporated with 10 μg of plasmid using a Bio-Rad electroporator. Clones were screened by Western blotting using anti-FLAG antibody.

shRNA screen and analysis
A list of epigenetic factors was prepared through literature, chromatin-related domain homology search, and other database searches. shRNA selection and library production were done through the RNAi Consortium at the Broad Institute.
Brachyury-GFP; Foxa2-hCD4 reporter mESCs were transduced by centrifugation at 2000 rpm at 37°C for 2 h in serum-free mESC medium that contains 4 μg/ml of Polybrene. The transduced cells were immediately washed and plated in conventional mESC medium on a gelatinized tissue culture dish. Transductions were performed at >200 cells/shRNA to allow for adequate library representation. After 24 h, transduced cells were selected using 1 μg/ml of puromycin for 3-4 days. mESCs were allowed to recover for 2 days. Mesoderm and endoderm differentiations were performed as explained above. The day 0 mESC sample was taken as the starting shRNA population. At day 5 of differentiation, the top 5% of differentiated cells (for mesoderm: highest BRACHYURY expression, for endoderm: highest BRACHYURY and FOXA2 expression) as well as the bottom 5% of undifferentiated cells (lowest BRACHYURY and/or FOXA2 expression and highest SSEA1 expression) were sorted on a BD Aria (DFCI Flow Cytometry Core Facility). Library transductions were performed in three independent replicates. Genomic DNA was isolated from sorted cells and was sent to the Broad Institute for sequencing.
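The representation requirement translates into a minimum number of transduced cells. The following is a minimal sketch of that arithmetic, assuming a library of about 300 genes with 5 shRNAs each and at least 200 cells per shRNA, as stated in the text. The function names and the multiplicity of infection (MOI) of 0.3 are illustrative assumptions: the text only says "low multiplicity of infection", and the Poisson model of integration is a common simplification, not something reported here.

```python
import math


def min_transduced_cells(n_genes, shrnas_per_gene, cells_per_shrna):
    """Transduced cells needed to reach the requested per-shRNA coverage."""
    return n_genes * shrnas_per_gene * cells_per_shrna


def single_integrant_fraction(moi):
    """Among infected cells, the fraction carrying exactly one integration,
    assuming integrations per cell follow a Poisson distribution."""
    p_one = moi * math.exp(-moi)
    p_at_least_one = 1.0 - math.exp(-moi)
    return p_one / p_at_least_one


# ~300 genes x 5 shRNAs, with 200 cells per shRNA as the lower bound.
needed = min_transduced_cells(n_genes=300, shrnas_per_gene=5, cells_per_shrna=200)
print(needed)  # 300000

# Hypothetical MOI; the text does not give a value.
print(round(single_integrant_fraction(0.3), 2))
```

Under these assumptions, each replicate needs at least 300,000 cells to survive selection, and lowering the MOI raises the share of infected cells that carry a single shRNA.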
The analysis of the shRNA screen results was done using the average of the shRNAs for each gene as well as the Weighted Sum method in the GENE-E program developed by the Broad Institute. Day 5 shRNA representation was compared with day 0 mESC shRNA representation. Additionally, a comparison of day 5 differentiated to undifferentiated cells was performed. Genes with fewer than three scored shRNAs were eliminated from the analyses. Genes that were depleted at least 2-fold compared with the day 0 or day 5 undifferentiated population were selected as candidates. Of these candidate genes, those that appeared in only one of the three biological replicates were eliminated. Known Polycomb and Trithorax group proteins were also discarded from further study. The candidate genes in the final list were tested one by one with three independent shRNAs in mESCs for differentiation toward mesoderm and endoderm.
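The gene-level averaging and 2-fold depletion filter can be sketched as follows. This is an illustration only, not the GENE-E implementation: it covers the per-gene average but not the Weighted Sum method, and the read-depth normalization, the pseudocount, the function names, and the toy counts are all assumptions made for the example.

```python
import math
from collections import defaultdict


def gene_scores(counts_diff, counts_undiff, shrna_to_gene,
                pseudocount=1.0, min_shrnas=3):
    """Mean log2(differentiated / undifferentiated) per gene.

    Counts are scaled by total reads per sample. Genes with fewer than
    `min_shrnas` shRNAs are dropped."""
    total_diff = sum(counts_diff.values())
    total_undiff = sum(counts_undiff.values())
    per_gene = defaultdict(list)
    for shrna, gene in shrna_to_gene.items():
        d = (counts_diff.get(shrna, 0) + pseudocount) / total_diff
        u = (counts_undiff.get(shrna, 0) + pseudocount) / total_undiff
        per_gene[gene].append(math.log2(d / u))
    return {g: sum(v) / len(v) for g, v in per_gene.items()
            if len(v) >= min_shrnas}


def candidates(scores, min_fold_depletion=2.0):
    """Genes whose mean score shows at least the requested fold depletion."""
    cutoff = -math.log2(min_fold_depletion)
    return sorted(g for g, s in scores.items() if s <= cutoff)


# Toy read counts, invented for illustration only.
shrna_to_gene = {"a1": "GeneA", "a2": "GeneA", "a3": "GeneA",
                 "b1": "GeneB", "b2": "GeneB", "b3": "GeneB",
                 "c1": "GeneC", "c2": "GeneC"}
diff = {"a1": 10, "a2": 10, "a3": 10, "b1": 100, "b2": 100, "b3": 100,
        "c1": 100, "c2": 100}
undiff = {k: 100 for k in shrna_to_gene}

scores = gene_scores(diff, undiff, shrna_to_gene)
print(candidates(scores))  # ['GeneA']
```

In the toy data, GeneA is depleted from the differentiated pool and passes the filter, GeneB is unchanged, and GeneC is excluded because it has only two shRNAs. The replicate-consistency filter described above would be applied on top of this, per replicate.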

Flow cytometry
mESCs or differentiated cells were dissociated into single cells and stained with anti-SSEA1-Alexa Fluor 647 (eBioscience, 51-8213) and anti-human CD4-PE (eBioscience, 12-0049). Flow cytometry was performed on BD Fortessa and analyzed on FlowJo software. Cell sorting was done in DFCI Flow Cytometry Core Facility on BD FACSAria cell sorters.

RT-qPCR and RNA-seq
Cells were collected and resuspended in TRIzol (Thermo, 15596018). RNA was extracted using Qiagen RNeasy plus kits according to the provided protocols. The concentration of purified RNA samples was measured on a Nanodrop. Equal amounts of total RNA (250 ng to 1 μg) were converted into cDNA using an iScript cDNA synthesis kit (Bio-Rad, 1708890). qPCR was performed with the primers listed in Table 1 and iQ SYBR Green supermix (Bio-Rad) using Bio-Rad CFX96 and CFX384 machines according to the manufacturer's protocols.
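The text does not state how relative expression was calculated from the qPCR data. One widely used option is the 2^(-ddCt) method, sketched below as an assumption rather than a description of the authors' analysis. The function name, the choice of reference gene and calibrator sample, and the Ct values are hypothetical, and the formula assumes roughly 100% primer efficiency.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target transcript relative to a calibrator sample,
    normalized to a reference gene, using the 2^(-ddCt) method."""
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: target at day 5 versus day 0 (calibrator),
# with a reference gene that stays at Ct 18 in both samples.
print(relative_expression(22.0, 18.0, 26.0, 18.0))  # 16.0
```

A target that crosses threshold four cycles earlier than in the calibrator, at constant reference Ct, corresponds to a 16-fold increase.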
For RNA-seq, genomic DNA was eliminated in a column during RNA extraction using DNase (Qiagen, 79254). The quality of the RNA samples was tested on an Agilent BioAnalyzer (DFCI CCCB Core Facility). Libraries were prepared using New England Biolabs reagents (NEBnext ultra directional RNA library prep kit (E7420S), NEBnext rRNA depletion kit (E6310S), and NEBnext multiplex oligos for Illumina sequencing (E7335S)). The concentrations of library cDNA samples were analyzed using Qubit. Sequencing was performed using Illumina HiSeq2000.

Co-immunoprecipitation
Nuclear extracts were prepared from WT (CJ9) and arid4bD mESCs using the Universal Magnetic CoIP Kit (Active Motif, catalog number 54002) according to the manufacturer's protocol. For co-immunoprecipitation, the kit protocol was followed. 400 μg of nuclear extract was incubated with 5 μg of anti-Arid4b (A302-233A; Bethyl) antibody. After immunoprecipitation and washes, beads were boiled in 2× Laemmli buffer (Bio-Rad) supplemented with β-mercaptoethanol at 95°C for 10 min.

Glycerol sedimentation assay
WT (CJ9) mESCs were grown and glycerol sedimentation assay was performed as previously described (67).

Histone proteomics
Quantitative analysis of histone post-translational modifications was performed in collaboration with Dr. Jacob Jaffe of the Broad Institute Proteomics Platform. WT, arid4bD, hdac1D, and hdac2D mESCs as well as endoderm-directed cells were collected and processed to isolate histones. The procedure was completed as described in Ref. 97. The enrichment results for each modification in knockout cells were normalized to the WT counterpart and visualized using Morpheus tool of the Broad Institute.
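The normalization to WT and the Euclidean distance used for clustering can be illustrated with a small sketch. It assumes that normalization means a log2 ratio to the WT value of each modification, which the text does not specify. The function names and abundance values are invented for the example, and the actual analysis was done in Morpheus.

```python
import math


def log2_ratio_to_wt(profile, wt_profile):
    """Per-modification log2 ratio of a sample to its WT counterpart."""
    return {mark: math.log2(profile[mark] / wt_profile[mark])
            for mark in wt_profile}


def euclidean(p, q):
    """Euclidean distance between two normalized modification profiles."""
    return math.sqrt(sum((p[m] - q[m]) ** 2 for m in p))


# Toy relative abundances, invented for illustration only.
wt = {"H3K27ac": 1.0, "H3K27me3": 2.0}
knockout = {"H3K27ac": 2.0, "H3K27me3": 1.0}

wt_norm = log2_ratio_to_wt(wt, wt)              # all zeros by construction
knockout_norm = log2_ratio_to_wt(knockout, wt)  # {'H3K27ac': 1.0, 'H3K27me3': -1.0}
print(euclidean(wt_norm, knockout_norm))
```

Hierarchical clustering then groups samples with the smallest pairwise distances, so a genotype whose profile barely deviates from WT sits next to it in the dendrogram.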

RNA-seq data analysis
RNA-seq reads were aligned to the reference mouse genome mm10 using STAR (98) with default parameters. Aligned reads were counted over the transcript annotations from GenomicFeatures (99) using Rsamtools (Morgan M, 2016). Differential expression analysis was performed with DESeq2 (100) using thresholds of adjusted p value < 0.01 and fold-change > 2.
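Applying these thresholds to a DESeq2 results table can be sketched as below. The `log2FoldChange` and `padj` field names follow the DESeq2 results columns, while the function name, the row format, and the example rows are assumptions made for illustration. A fold-change of 2 corresponds to an absolute log2 fold-change of 1.

```python
import math


def split_de_genes(rows, padj_max=0.01, min_fold=2.0):
    """Return (up, down) gene lists passing the adjusted p value and
    fold-change thresholds. Rows with a missing padj are skipped."""
    cutoff = math.log2(min_fold)
    up, down = [], []
    for row in rows:
        padj = row["padj"]
        if padj is None or padj >= padj_max:
            continue
        if row["log2FoldChange"] > cutoff:
            up.append(row["gene"])
        elif row["log2FoldChange"] < -cutoff:
            down.append(row["gene"])
    return up, down


# Toy rows, invented for illustration only (knockout versus WT).
rows = [
    {"gene": "GeneA", "log2FoldChange": -3.2, "padj": 1e-8},
    {"gene": "GeneB", "log2FoldChange": 1.6, "padj": 0.002},
    {"gene": "GeneC", "log2FoldChange": -0.4, "padj": 1e-5},
    {"gene": "GeneD", "log2FoldChange": 2.5, "padj": 0.2},
    {"gene": "GeneE", "log2FoldChange": -2.0, "padj": None},
]
up, down = split_de_genes(rows)
print(up, down)  # ['GeneB'] ['GeneA']
```

GeneC fails the fold-change cutoff, GeneD fails the adjusted p value cutoff, and GeneE has no adjusted p value, which DESeq2 can report for genes removed by its filtering steps.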

ChIP-seq data analysis
ChIP-seq reads were aligned to the mm10 reference genome using Bowtie2 (101) with default parameters. Duplicate reads were removed using PICARD tools (RRID:SCR_006525).

Data availability
Data have been deposited in the Gene Expression Omnibus with accession numbers GSE153633 and GSE153634 (Tables 3 and 4).