Interferon regulatory factor 1 and a variant of heterogeneous nuclear ribonucleoprotein L coordinately silence the gene for adhesion protein CEACAM1

The adhesion protein carcinoembryonic antigen–related cell adhesion molecule 1 (CEACAM1) is widely expressed in epithelial cells as a short cytoplasmic isoform (S-iso) and in leukocytes as a long cytoplasmic isoform (L-iso) and is frequently silenced in cancer by unknown mechanisms. Previously, we reported that interferon regulatory factor 1 (IRF1) biases alternative splicing (AS) to include the variable exon 7 (E7) in CEACAM1, generating long cytoplasmic isoforms. We now show that IRF1 and a variant of heterogeneous nuclear ribonucleoprotein L (Lv1) coordinately silence the CEACAM1 gene. RNAi-mediated Lv1 depletion in IRF1-treated HeLa and melanoma cells induced significant CEACAM1 protein expression, reversed by ectopic Lv1 expression. The Lv1-mediated CEACAM1 repression resided in residues Gly71–Gly89 and Ala38–Gly89 in Lv1's N-terminal extension. ChIP analysis of IRF1- and FLAG-tagged Lv1-treated HeLa cells and treatment with the global epigenetic modifiers 5-aza-2′-deoxycytidine and trichostatin A indicated that IRF1 and Lv1 together induce chromatin remodeling, restricting IRF1 access to the CEACAM1 promoter. In interferon γ–treated HeLa cells, the transcription factor SP1 did not associate with the CEACAM1 promoter, but binding by upstream transcription factor 1 (USF1), a known CEACAM1 regulator, was greatly enhanced. ChIP-sequencing revealed that Lv1 overexpression in IRF1-treated cells induces transcriptional silencing across many genes, including DCC (deleted in colorectal carcinoma), associated with CEACAM5 in colon cancer. Notably, IRF1, but not IRF3 and IRF7, affected CEACAM1 expression via translational repression. We conclude that IRF1 and Lv1 coordinately regulate CEACAM1 transcription, alternative splicing, and translation and may significantly contribute to CEACAM1 silencing in cancer.

Alternative splicing (AS) is an evolutionary strategy that increases the proteomic diversity of the cell by facilitating removal of large noncoding sequences (introns) and ligation of short coding sequences (exons) (1). The functional consequence is processing of a single gene into multiple different transcripts that result in proteins with different functions. Splicing regulatory factors acting in trans control spliceosome activity through a complex network of RNA-processing events that include splice-site and alternative promoter selection as well as overall transcript levels (2). The participation of splicing factors in the regulation of gene expression was proposed several years ago because of subnuclear structures associated with sites of transcription (3).
Although transcriptional co-regulation is presently well-understood, the role of splicing regulatory proteins in gene silencing or activation is not well-studied. Whereas examples exist of RNA splicing factors regulating transcription pathways, such as the ATPase-dependent RNA helicase DDX3 and its association with the p21(waf1/cip1) promoter in collaboration with the transcription factor SP1 (1), the equally important role of modulation of epigenetic regulation by alternative splicing is just beginning to emerge. For example, chromatin structure and histone modifications can affect the assembly of pre-mRNAs in the pre-spliceosome (2). In particular, H3K36me3 methylation of histones, a subject of this study, is a factor influencing recognition of both constitutive and alternative exons (3). Furthermore, H3K36me3 loss is associated with changes in chromatin accessibility (4). Mechanisms that exert a regulatory role on nucleosomal packaging and positioning in gene promoters could also affect other aspects of the regulation of gene expression, including AS (5).
Several reports have also raised the possibility that some splicing factors act as oncogenic or anti-oncogenic factors. One particular class involves the heterogeneous nuclear ribonucleoprotein (hnRNP) family that binds to sequences near splice sites (6), and comprises global regulators of mRNA splicing (7). For example, down-regulation of hnRNP L, also a subject of this study, induced loss of tumorigenic capacity in non-small cell lung cancer cells (8), and more recently, hnRNP L was shown to promote prostate cancer progression by inhibiting apoptosis (9). Normally, hnRNP L is expressed at low levels, but in diverse human cancers, including lung, liver, ovarian, colorectal, and breast cancers, hnRNP L is overexpressed (10). Other splicing factors, including SRSF2 (11) and RBM4 (12), are also frequently overexpressed in cancer, whereas hnRNPA2B1, hnRNP D, and hnRNP L exhibit intense nuclear staining in gastric cancer compared with adjacent normal tissues (13). Although these studies demonstrate that many hnRNPs, and hnRNP L in particular, are associated with aggressive neoplastic characteristics, the mechanism of their role remains enigmatic.
The binding specificity for hnRNP L, involving CA-repeat motifs, is retained in exon or intron sequences leading to exon repression, as in the expression of CEACAM1, the main subject of this study (14). In humans and rat, CEACAM1 pre-mRNA undergoes extensive AS, generating isoforms consisting of an N-terminal and a variable number of multiple extracellular Ig-like domains, a transmembrane domain, and either short isoform (S-iso) or long isoform (L-iso) cytoplasmic domains (15). The short cytoplasmic domain isoform, the predominant isoform in epithelial cells, has been shown to bind actin, tropomyosin, calmodulin, and annexin II and is involved in lumen formation (16). S-iso is the predominant isoform in normal breast, whereas in breast cancer the S-iso/L-iso ratio is greatly reduced (17). L-iso with two cytosolic phosphotyrosine residues in immunoreceptor tyrosine-based inhibitory (ITIM) or switch (ITSM) motifs, predominant in immune cells, binds SHP-1 when phosphorylated and conveys inhibitory activities to L-iso (18). Furthermore, in T lymphocytes, L-iso is significantly upregulated at the cell surface in response to IFN-γ treatment, thereby mediating cell adhesion to other lymphocytes or tumor cells (19,20).
Accumulating evidence has indicated that CEACAM1 expression is altered during oncogenesis. We previously demonstrated that the S-iso/L-iso ratio of CEACAM1 is altered at the invasive front in MDA-MB-468 tumor cells orthotopically implanted into mammary fat pads of NOD/SCID mice (21). CEACAM1 isoform switching occurs in proliferating and quiescent epithelial cells, although the mechanism is not well-understood (22). In particular, expression of CEACAM1 in several carcinomas, including breast and prostate cancer, is down-regulated during neoplastic transformation (23,24). Despite many studies, the molecular mechanism of how CEACAM1 is systematically down-regulated in breast, and other cancers, is still not understood.
To address this problem, we have considered the possibility that the normal coordination of RNA transcription and splicing is disrupted in cancer, leading to chromatin remodeling as a mechanism of gene silencing. In this regard, the dependence of inclusion (or not) of E7 in CEACAM1 AS on the transcription factor IRF1 and splicing factor hnRNP L occurred to us as a logical starting point to investigate this possibility. During this study we found that a variant of hnRNP L, Lv1, generated from an alternative promoter of hnRNP L, plays a critical role in CEACAM1 gene silencing using methylation changes in H3K36me3 as a ubiquitous hallmark of productive transcriptional regulation. This study may help explain why CEACAM1 expression is down-regulated in a subset of cancers.

IRF1 requires E7 for L-iso-dependent AS
The AS of CEACAM1 mRNA includes E7 in immune cells to produce L-iso encoding an ITIM and an ITSM, whereas AS in epithelial cells produces predominantly S-iso by skipping E7. Previously, we showed that RNA splicing regulators hnRNP L and hnRNP A1 interact directly with CEACAM1 E7 to mediate production of S-iso whereas hnRNP M is essential for L-iso production (14). Although we established that IRF1 biases the AS of CEACAM1 mRNA to generate the L-iso via the inclusion of E7 (21), we did not address the possibility that the RNA-binding proteins would coordinately affect transcription, splicing, and translation.
Because in other studies, the Mediator complex was shown to cross-talk with the splicing machinery through interactions with hnRNP L (25), we first tested whether hnRNP L was necessary for IRF1-dependent splicing. This approach utilized our previously reported tripartite splicing reporter minigene constructs (21). The WT construct contains 1135 nucleotides of the CEACAM1 promoter, followed by exons 6, 7, and 8 plus their introns of the CEACAM1 gene (CAM1p (6-7-8)) in translational reading frame with GFP, such that GFP expression occurs from a single mRNA reporting the inclusion of the variant E7. As a control, we generated CAM1p (6-rc7-8), similar to the WT construct but with the reverse complement of the 53-nucleotide (nt) E7 sequence (Fig. 1A). Whereas the WT construct previously established that IRF1 cross-talks with E7 to bias AS, here our intent was to disrupt hnRNP L association with the binding motif CACA of E7 (nt 22-25) in the presence of IRF1 (14). Notably, early on we observed that our promoter constructs were inducible with IRF1 in HeLa cells, a cell line in which CEACAM1 protein is not natively expressed. Because GFP was expressed, this result suggested that the promoter of CEACAM1 was epigenetically silenced in HeLa cells but that the cells otherwise possess the required machinery to express CEACAM1 mRNA from the transfected promoter construct. When IRF1-treated HeLa cells were transfected with CAM1p (6-7-8) or CAM1p (6-rc7-8) reporter constructs and analyzed for RNA expression, we observed that induction of AS occurred only when IRF1 was present and that this splicing event was ablated when E7 was in the reverse orientation (Fig. 1B, lane 6 versus 7). When we analyzed these constructs using flow cytometry in the absence (Fig. 1C) or presence of IRF1 (Fig. 1D), the GFP expression of CAM1p (6-rc7-8) was significantly attenuated compared with its WT counterpart, in agreement with our RNA splicing data. These GFP expression data (Fig. 1E) suggest that E7 but not introns 6 or 7 coordinates with IRF1 to regulate splicing and that E7 acts as a binding platform to place IRF1 in trans to RNA splicing regulators of the AS pathway.
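To illustrate why placing E7 in the reverse-complement orientation would be expected to remove a CACA-type binding motif, the following minimal sketch reverse-complements a toy 53-nt sequence. The sequence is a made-up placeholder (not the actual CEACAM1 E7 sequence), and the function names are hypothetical.

```python
# Toy illustration only: the sequence below is NOT the real CEACAM1 exon 7.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def motif_positions(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    return [
        i + 1
        for i in range(len(seq) - len(motif) + 1)
        if seq[i:i + len(motif)] == motif
    ]


# Hypothetical 53-nt exon with a single CACA motif placed at nt 22-25.
toy_e7 = "G" * 21 + "CACA" + "T" * 28
rc_e7 = reverse_complement(toy_e7)

print(motif_positions(toy_e7, "CACA"))  # motif present in the forward exon
print(motif_positions(rc_e7, "CACA"))   # motif lost after reverse complement
```

In this toy case the reverse-complemented exon carries TGTG where the forward exon carried CACA, so a factor that recognizes CACA no longer has a site, while exon length and flanking introns are unchanged.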

Nucleotides 16-35 of E7 coordinate with IRF1 to control RNA splicing
Because we previously identified hnRNP L and hnRNP A1 as candidates that guide spliceosome processing relative to E7 (14), we tested whether hnRNP L plays a role in IRF1-dependent splicing. Our approach was to sterically block hnRNP L access to E7 using antisense morpholinos (MO) and test whether IRF1 could still induce AS. Because we have previously established the CACA-binding motif in E7 (14), we targeted nucleotides 16-35 of E7 (E7:MO) in MDA-MB-468 breast cancer cells to block regulators of that region of E7 (Fig. 2A). MOs were chosen because of their demonstrated ability to disrupt RNA-protein interactions by steric hindrance (26). Our approach was not to deplete hnRNP L but to limit its access to E7. MDA-MB-468 cells were used to take advantage of their endogenous expression of S-iso, allowing quantitation of the ratio of the two isoforms under conditions in which the L-iso is induced by IRF1. As expected, addition of IRF1 both significantly induced transcription of CEACAM1 mRNA (Fig. 2, B and C, lanes 3, 6, and 7 versus 2, 4, and 5) and induced L-iso formation (lanes 3 and 6). Notably, addition of the nonspecific MO in the presence of IRF1 had no effect on the percentage E7 inclusion, whereas E7:MO completely abolished L-iso production by AS (lane 7 versus 6). These data suggest that nucleotides 16-35 of E7 are the binding site for hnRNP L and coordinate with IRF1 to direct AS of CEACAM1.
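The percent E7 inclusion used to compare lanes is L-iso mRNA as a percentage of L-iso plus S-iso mRNAs. A minimal sketch of that calculation follows; the function name and the band intensities in the usage lines are hypothetical and are not values from this study.

```python
def percent_e7_inclusion(l_iso: float, s_iso: float) -> float:
    """Percent E7 inclusion = 100 * L-iso / (L-iso + S-iso).

    l_iso and s_iso are background-corrected band intensities (arbitrary
    units) for the E7-included and E7-skipped RT-PCR products.
    """
    if l_iso < 0 or s_iso < 0:
        raise ValueError("band intensities must be non-negative")
    total = l_iso + s_iso
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * l_iso / total


# Hypothetical densitometry readings, for illustration only.
print(percent_e7_inclusion(l_iso=300.0, s_iso=100.0))
print(percent_e7_inclusion(l_iso=0.0, s_iso=250.0))
```

Because the measure is a ratio within each lane, it is independent of overall CEACAM1 mRNA abundance, which is why a change in total transcription and a change in E7 inclusion can be reported separately.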

A variant of hnRNP L coordinates with IRF1 to down-regulate expression of CEACAM1
Next we asked how IRF1 guides AS through hnRNP L-directed RNA splicing. Although E7-specific MOs blocked hnRNP L access to E7 leading to a block in AS (Fig. 2), its availability to recruit the spliceosome to the transcription machinery was not disrupted. To show clear disruption of the AS cross-talk between IRF1 and hnRNP L, we studied the effects of RNAi depletion of hnRNP L on CEACAM1 regulation. To accomplish this, we established an inducible expression system that allowed us to follow CEACAM1 gene expression dynamics. Based on our finding that the native promoter of CEACAM1 was inducible with IRF1 in HeLa cells (Figs. 1 and 3A, lane 3 versus 2), we were able to ask if hnRNP L was wholly or partially responsible for CEACAM1 silencing in these cells. Notably, there are three protein variants of hnRNP L observed by Western blot analysis that are greatly reduced by siRNAs directed to exon 8 of hnRNP L (Fig. 3A, lane 5 versus lanes 2-4). Quantitation of the knockdown of total hnRNP L levels as compared with siScram + IRF1 was ~93.3% (100 - 1/0.15) (Fig. 3B). Unexpectedly, under these conditions, and only when IRF1 was present, we observed a dramatic up-regulation of CEACAM1 protein expression compared with scrambled siRNA controls (Fig. 3A, lane 5 versus 3). Quantitation of the normalized ratio of CEACAM1 as compared with siScram + IRF1 protein levels revealed an almost 2-fold increase of protein (Fig. 3C). To determine whether hnRNP L affected expression of CEACAM1 at both the mRNA and protein levels in another cell line, we repeated the hnRNP L knockdown in melanoma A2058 cells (Fig. S1) with a similar result. This unexpected finding suggests that hnRNP L acts at both the transcriptosome and spliceosome to regulate expression of CEACAM1. To determine whether AS was also affected, RNAs from HeLa cells treated with siRNAs to exon 8 of hnRNP L in the presence or absence of IRF1 were collected and analyzed by RT-PCR (Fig. 3D). A significant up-regulation of L-iso was observed, from ~78 to 86% (Fig. 3E).
Figure 1. A, schematic diagram of the reporter constructs: pZsG V lacks a promoter to drive GFP expression; CAM1p (6-7-8) contains 1135 nucleotides of the CEACAM1 genomic promoter sequence (p) fused in translational reading frame to a minigene containing exons 6-7-8 (introns included) and GFP, as a reporter for L-iso expression. Construct CAM1p (6-rc7-8) is similar but contains a reverse complement E7 sequence in place of the WT sequence. Primers for CEACAM1 RT-PCR recognize the pZsG V backbone and E8 of CEACAM1 as shown. B, analysis of RNAs derived from HeLa cells transfected with the indicated reporter GFP constructs in the presence or absence of IRF1. RT-PCR of CEACAM1 is shown (upper); GAPDH (lower) was used as a loading control. The mean percent E7 inclusion was calculated as (% L-iso mRNA/(L-iso + S-iso) mRNAs) and is shown below the panel, as described previously (14). **, p < 0.01, CAM1p (6-rc7-8) versus CAM1p (6-7-8). C-E, parallel samples were analyzed for GFP expression by flow cytometry (C and D), with the key shown above, and quantitated in E, by measuring the percent of cells expressing GFP and expressed as mean fluorescent intensity. **, p < 0.01, CAM1p (6-rc7-8) versus CAM1p (6-7-8). Ad-Ψ5 is the Ad-null empty vector control. Number of replicates for B-D were at least n = 3. Error bars, S.D.

An hnRNP L variant and IRF1 silence CEACAM1
To better understand the dynamics of how hnRNP L controls IRF1-dependent transcription and to understand which hnRNP L variant is responsible, we analyzed the exon structure of each hnRNP L variant using annotations from the UCSC and Ensembl Genome browsers (Fig. 3F). Using an in silico approach to determine the apparent molecular size for each exon (Table S1), we concluded that our hnRNP L siRNA treatment at exon 8 down-regulated hnRNP L variant 1 (Lv1, Ensembl ID 201), variant 2 (Lv2, Ensembl transcript 210), and a previously uncharacterized variant which we refer to as variant 3 (Lv3, Ensembl ID 212). Lv1 is the largest but a less abundant variant, with a total of 13 exons compared with Lv2, which includes only 11 exons. The most abundant isoform in our Western blotting, Lv3, also expresses 13 exons but, as with all the observed variants, expresses a unique exon 1. Using this information, we screened the available hnRNP L EST dataset and found only one clone, H04D135H08, with a unique exon 1 structure, found in testis, for Lv3. Both Lv1 and Lv2 were otherwise well-represented in other tissues. The predicted molecular mass of Lv1 is 65.0 kDa (589 amino acids), whereas the predicted molecular mass of Lv2 is 50.1 kDa (456 amino acids), and Lv3 has a predicted molecular mass of 58.1 kDa (530 amino acids). Unique transcription start sites and distinct promoter regions mark the differences between these isoforms. Lv1 initiates translation from a proximal exon 1 whereas Lv2 initiates translation from an alternate start site in exon 3. Amino acids found in exon 3 of Lv1 (residues 130-133, YVVV) re-establish the reading frame such that the primary amino acid structure of Lv2 is identical thereafter to Lv1. Importantly, only the Lv1 exon 1 encodes an extra 89 amino acids at the N terminus, with a composition of 33.7% Gly and 15.7% Arg (27); this exon is not found in Lv2, and Lv3 uses an alternate exon 1.
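As a rough plausibility check on the sizes quoted above, the sketch below estimates molecular mass from residue count using a rule-of-thumb average of about 110 Da per amino acid. That average is an assumption, not the method used for the in silico predictions here. The residue counts of 30 Gly and 14 Arg are likewise back-calculated from the stated percentages of the 89-residue extension and are not given in the text.

```python
# Assumption: ~110 Da average residue mass, a common rule of thumb.
AVG_RESIDUE_MASS_DA = 110.0

# variant: (amino acids, predicted mass in kDa) as quoted in the text
REPORTED = {
    "Lv1": (589, 65.0),
    "Lv2": (456, 50.1),
    "Lv3": (530, 58.1),
}


def approx_mass_kda(n_residues: int) -> float:
    """Crude mass estimate in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def percent_composition(count: int, length: int) -> float:
    """Percentage of a segment made up by one residue type (1 decimal)."""
    return round(100.0 * count / length, 1)


for name, (n_aa, reported_kda) in REPORTED.items():
    estimate = approx_mass_kda(n_aa)
    print(f"{name}: ~{estimate:.1f} kDa estimated, {reported_kda} kDa reported")

# Inferred counts (assumption): 30 Gly and 14 Arg in the 89-residue extension.
print(percent_composition(30, 89), percent_composition(14, 89))
```

The crude estimates fall within 1 kDa of the quoted predictions, which is as close as a single average residue mass can be expected to get for a Gly-rich protein.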
Using this information we designed a new siRNA targeting exon 1 to distinguish whether Lv1 was responsible for the change in translational activation of CEACAM1. Our data show that compared with our scrambled treatment or GAPDH control siRNAs, siLv1.E1 down-regulates Lv1 with approximately the same efficiency as the siRNA that targets total hnRNP L (Fig. 3A, lane 6 versus 5). This was quantitated in Fig. 3G, where we observed an approximate 98% knockdown of the Lv1 isoform (100 - 1/0.46). Importantly, once again we observed significant up-regulation of CEACAM1 compared with the scrambled siRNA control, quantitated in Fig. 3H, where the normalized ratio of CEACAM1 relative to siScram + IRF1 protein levels increased almost 3-fold. To rule out that this effect was due to the up-regulation of Lv2 observed with siRNAs to Lv1 (Fig. 3A, lane 6 versus 5), siRNAs targeting Lv2 exon 1 were similarly analyzed, but we did not see evidence for up-regulation of CEACAM1 (data not shown). These data establish that Lv1, a variant of hnRNP L, is a novel regulator of CEACAM1.

Deletion analyses of Lv1 in the presence of IRF1 de-represses hnRNP L activity
Because our RNAi analysis targeting Lv1 suggests a coordinating role with IRF1 in CEACAM1 regulation, we generated HeLa cells stably expressing a C-terminal 3×FLAG-tagged version of Lv1 (Lv1-F) under the control of a CMV promoter (Fig. 4A). We predicted that high expression of both Lv1 and IRF1 should down-regulate CEACAM1. As expected, when these cells were treated with IRF1, the protein expression level of CEACAM1 was dramatically reduced by 57.2% (100 - 42.8) compared with cells transfected with control vector (Fig. 4B, lane 7 versus 3). As a further control, the cells were also treated with IFN-γ, a natural inducer of IRF1. CEACAM1 expression was also lower under these conditions, albeit not as low as in IRF1-treated cells. The fact that IFN-γ does not repress CEACAM1 as strongly as IRF1 may be explained by its more pleiotropic role as a master regulator of the inflammatory response.
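The percent reduction quoted above is the complement of the residual, loading-normalized signal relative to control (100 - 42.8 = 57.2). The sketch below shows one way such a value can be derived from densitometry. The function names and band intensities are hypothetical, chosen only so that the arithmetic reproduces the quoted 42.8% residual; they are not measurements from this study.

```python
def normalized_signal(target: float, loading_control: float) -> float:
    """Target band intensity divided by its loading-control band intensity."""
    if loading_control <= 0:
        raise ValueError("loading control must be positive")
    return target / loading_control


def percent_of_control(treated: float, control: float) -> float:
    """Treated normalized signal expressed as a percentage of control."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * treated / control


def percent_reduction(treated: float, control: float) -> float:
    """Reduction relative to control, i.e. 100 minus the residual percent."""
    return 100.0 - percent_of_control(treated, control)


# Hypothetical band intensities (arbitrary units), for illustration only.
control = normalized_signal(target=1000.0, loading_control=500.0)
treated = normalized_signal(target=428.0, loading_control=500.0)

print(round(percent_of_control(treated, control), 1))
print(round(percent_reduction(treated, control), 1))
```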
The finding that Lv1 could specifically regulate gene expression differences in CEACAM1 suggested that the composition of exon 1 with its predominant Gly-rich domain merited further investigation. Several constructs of Lv1 that had deletions of amino acids (ΔGly71–Gly89 or ΔAla38–Gly89) in the N-terminal extension of hnRNP L in exon 1 (Fig. 4C) were transfected into HeLa cells in the presence of IRF1, and as predicted by the deletion of residues, we observed the expected change in their apparent size on Western blots with no effect on the levels of expression compared with the WT Lv1-F control (Fig. 4D). Surprisingly, in Fig. 4E we observed strong de-repression by hnRNP L in both mutants as compared with Lv1 (lanes 6 and 7 versus 5) along with a concomitant 11-fold increase for the ΔGly71–Gly89 mutant and 16-fold increase for the ΔAla38–Gly89 mutant in CEACAM1 levels (Fig. 4F). These results suggest that the polyglycine-rich domain of Lv1 is responsible for the regulation of CEACAM1 expression.

Changes in the chromatin landscape regulate CEACAM1 expression
These results and the finding that hnRNP L co-purifies with human KMT3a, an enzyme responsible for histone H3 lysine 36 methylation (H3K36me3) and gene silencing (28), suggested the possibility that the transcriptional regulation we observed occurred through changes in the chromatin landscape. To test whether Lv1-induced gene expression involves chromatin remodeling, we quantitated expression changes of several chromatin markers, including H3K36me3, H4AcK8, and DNA methyltransferase 1 (DNMT1), before and after treatment of HeLa cells with Lv1 and IRF1 (Fig. 5A). H3K36me3 was chosen because it correlates with highly transcribed genes and is specifically down-regulated upon siRNA treatment of hnRNP L in HeLa cells (28). H4AcK8 plays a critical role in decompaction of chromatin structures during DNA replication (29) and was included as a test for euchromatinization of DNA. DNMT1, which transfers methyl groups to cytosine nucleotides of genomic DNA, maintaining methylation patterns following DNA replication, was included as a general marker of gene activity (30). When HeLa cells were analyzed for expression of H3K36me3 by Western blotting in untreated cells, we found high enrichment of active histone modification marks that are largely absent upon introduction of the control virus (Fig. 5A, lanes 3 and 4). When IRF1 was introduced to cells expressing the control vector, we observed significant enrichment of active H3K36me3 modification marks, restored to levels similar to those seen in the basal untreated state (lane 5). By contrast, upon introduction of Lv1-F, we observed modification of H3K36me3 marks corresponding to gene silencing (lane 6), comparable with that seen in the control virus treatment (lanes 3 and 4). Quantification of the normalized total H3K36me3 levels is shown in Fig. 5B. These findings correlate with the repression pattern of CEACAM1 following Lv1 and IRF1 treatment observed using parallel samples in Western blots (Fig. 5C, lane 5 versus 6). To further test the hypothesis that Lv1 induces silencing by a chromatin remodeling mechanism, we treated HeLa cells with the global epigenetic modifiers 5-aza-2′-deoxycytidine (DAC), a DNA methyltransferase inhibitor, or trichostatin A (TSA), a histone deacetylase inhibitor, and analyzed CEACAM1 expression. As expected, we observed that uncoupling the chromatin from Lv1 activity using these treatments enhanced euchromatinization, leading to restoration of CEACAM1 expression by ~1.8-fold as quantitated in Fig. 5C (183/100; lane 8 versus 6). When the corresponding H3K36me3 modification marks in Fig. 5A were analyzed further, we observed restoration of high enrichment of active H3K36me3 marks, suggesting an altered chromatin landscape (lane 8 versus 6). In contrast, DNMT1 and H4AcK8 marks do not show similar changes in expression. This was also the case when we ectopically expressed USF1, a known transcription factor associated with the CEACAM1 promoter (31), in combination with Lv1: we detected no statistically significant changes in H3K36me3 modification marks (Fig. 5D). Collectively, these data suggest that IRF1 and Lv1 coordinate to remodel chromatin structures that affect CEACAM1 expression.

Coordination of Lv1 with IRF1 mediates changes in chromatin of multiple promoters including CEACAM1
The effects of IRF1 and hnRNP L on the CEACAM1 promoter and AS appear to be part of a global pattern, as exemplified by the global increase or decrease in the expression of H3K36me3 (Fig. 5A). To extend these results broadly across the genome, we performed MNase ChIP-Seq analysis on HeLa cells with or without Lv1-F before and after treatment with IRF1 (Fig. 6). MNase acts as a single-strand-specific endo-exonuclease that digests linker DNA between nucleosomes and has been applied to the study of chromatin structure (32). As shown in the nucleosome occupancy assay schematic (Fig. 6A), a nucleosome "free" region occurs globally at transcription start sites (TSS) surrounded by nucleosomes. Regions with well-positioned nucleosome arrays are detected as well as enrichments in potential regulatory regions. Our data show that the global distribution of nucleosome occupancy around the TSS is markedly different when Lv1-F is introduced with IRF1 (Fig. 6B), suggesting disruption of chromatin accessibility of promoter regions. As expected, CEACAM1 was identified as a gene affected at the TSS under both treatment conditions (Fig. 6C). One of the genes identified by this analysis, DCC (deleted in colorectal carcinoma), stands out because it is deleted in 70% of colorectal cancers (33), is a putative tumor-suppressor gene (34), may influence the prognosis of breast carcinoma patients (35), and belongs to the immunoglobulin superfamily with similarity to the N-CAM transmembrane proteins. Colon carcinoma cells transfected with DCC also exhibited suppressed proliferation and down-regulation of CEACAM5 expression (36). To validate that DCC is down-regulated because of the interaction between Lv1-F and IRF1, we chose MCF7 breast cancer cells that express DCC and CEACAM5 under the regulation of IFN-γ (37) but have undetectable levels of CEACAM1 (38). When the expression levels of DCC were determined by Western blotting with or without Lv1-F in the presence of IRF1, we observed more than 58% down-regulation of DCC (Fig. 6, D (lane 4 versus 3) and E (100 - 41.6)), in agreement with our ChIP-Seq dataset. This novel finding shows an Lv1- and IRF1-directed restructuring of chromatin in regulating gene expression.
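A nucleosome profile relative to the TSS, such as the 10-bp-resolution profile in Fig. 6B, is in essence a histogram of fragment positions binned by their distance from annotated start sites. The sketch below is a minimal, assumption-laden illustration of that idea, not the pipeline used in this study. The input layout, function name, and coordinates are hypothetical, and real analyses typically add depth normalization and smoothing.

```python
from collections import Counter


def tss_profile(midpoints, tss_list, window=1000, bin_size=10):
    """Count fragment midpoints in fixed-width bins relative to each TSS.

    midpoints: dict mapping chromosome -> list of fragment midpoint positions
    tss_list:  list of (chromosome, tss_position, strand) with strand "+"/"-"
    Returns a dict mapping bin start offset (bp from TSS) -> count, where
    negative offsets are upstream in the direction of transcription.
    """
    counts = Counter()
    for chrom, tss, strand in tss_list:
        for pos in midpoints.get(chrom, []):
            offset = pos - tss
            if strand == "-":
                offset = -offset  # flip so upstream is always negative
            if -window <= offset < window:
                counts[(offset // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Hypothetical coordinates, for illustration only.
toy_midpoints = {"chr1": [1000, 1004, 1150, 995]}
print(tss_profile(toy_midpoints, [("chr1", 1000, "+")], window=200))
print(tss_profile(toy_midpoints, [("chr1", 1000, "-")], window=200))
```

The nested loop is quadratic and only suitable for a toy example; a sorted-position lookup or an interval index would be needed at genome scale. Comparing two such profiles, one per condition, is the kind of contrast a plot like Fig. 6B summarizes.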
Because our data suggest that Lv1 in the presence of IRF1 induces chromatin remodeling, we asked if this affects IRF1's ability to interact with its cognate ISRE (interferon-stimulated response element) recognition sequence in the promoter of CEACAM1. Because we had previously identified IRF1-, USF1-, and SP1-binding sites in the CEACAM1 proximal promoter (31), we performed ChIP studies using antibodies to each of these transcription factors on HeLa cells expressing Lv1-F before and after treatment with IRF1 or IFN-γ. In particular, enrichment of the CEACAM1 promoter using antibodies to IRF1 showed that occupancy decreases significantly in Lv1-F- and IRF1-treated cells compared with control virus treatment, in accordance with our hypothesis (Fig. 6F). Surprisingly, the situation with IFN-γ treatment is the opposite, suggesting that IRF1 alone does not explain the effects of IFN-γ on CEACAM1 expression. When SP1 binding was analyzed, we observed no association with the CEACAM1 promoter in the presence of IRF1 regardless of whether Lv1-F was present or absent, whereas Lv1-F caused significant dissociation of SP1 from the promoter under IFN-γ stimulation (Fig. 6G). Additionally, IRF1 has little effect on USF1 occupancy with or without Lv1-F expression, but USF1 occupancy is profoundly affected by the presence or absence of Lv1-F after treatment with IFN-γ (Fig. 6H). We observed that IFN-γ stimulation in the presence of Lv1-F caused significant enrichment of USF1 at the CEACAM1 promoter, suggestive of chromatin rearrangement under these conditions as well. The fact that USF1 and SP1 in the presence of Lv1 show opposite enrichment profiles may be because of steric hindrance or competition between these two transcription factors for adjacent sites on the proximal promoter of CEACAM1 (31) and suggests that IFN-γ can potentiate chromatin remodeling by regulating factors in addition to IRF1.
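Promoter occupancy in ChIP-qPCR experiments of this kind is commonly expressed as percent of input and then compared between conditions. The sketch below shows that generic calculation as an assumption; the text does not state which normalization was used for the relative enrichment values here. The function names, the 1% input fraction, and the Ct values are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.01) -> float:
    """Percent-of-input for one ChIP-qPCR reaction.

    The input Ct is first adjusted to what it would be at 100% input,
    then compared with the immunoprecipitated (IP) sample Ct, assuming
    perfect doubling per PCR cycle.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def relative_enrichment(sample_percent: float, control_percent: float) -> float:
    """Occupancy in a sample relative to its matched control condition."""
    if control_percent <= 0:
        raise ValueError("control percent input must be positive")
    return sample_percent / control_percent


# Hypothetical Ct values, for illustration only.
vector = percent_input(ct_ip=27.0, ct_input=25.0)
lv1_f = percent_input(ct_ip=28.0, ct_input=25.0)
print(relative_enrichment(lv1_f, vector))
```

With these toy numbers the IP sample crosses threshold one cycle later in the second condition, which corresponds to half the occupancy under the perfect-doubling assumption.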

IRF3 and IRF7, similar to IRF1, bias AS toward exon inclusion
Having established that Lv1 controls AS of CEACAM1 through IRF1, it was of interest to test whether other members of the IRF family share similar regulatory potential. It is notable that IRF proteins share binding to their cognate ISREs in the promoters of many genes, including members of the CEACAM gene family. For example, IRF3 was shown to mediate CEACAM1 expression in fibroblasts in response to several RNA viruses including influenza (39). We included IRF7 in our study because it has been shown to regulate anti-viral responses (40) and is often silenced in breast cancer (41). In other studies, we have shown that CEACAM1 regulates TLR4 signaling that, in turn, involves activation of IRF3 and IRF7 (42,43). To examine how IRF family members differentially regulate CEACAM1 AS, we first analyzed expression levels of IRF1, IRF3, and IRF7 in HeLa cells (Fig. 7A). Using primers that detect the different isoforms of CEACAM1 (distinguished by the number of N-terminal Ig-like ectodomains, 3 or 4), we observed by RT-PCR that all the IRFs tested except IRF9 induce the expression of L-iso, with the greatest effect observed for IRF1, suggesting differential efficiency of promoter usage by the IRFs (Fig. 7B). This result was also observed at the protein level, where these cells produced substantial amounts of CEACAM1 L-iso after transfection with IRF1, and to a lesser extent with IRF3, whereas IRF7 treatment failed to induce any CEACAM1 expression (Fig. 7C and quantitated in D). To determine whether the effect of the IRFs on RNA splicing affected expression of CEACAM1 at both the mRNA and protein level in other cell lines, we repeated the IRF induction in melanoma A2058 cells and observed similar findings (Fig. S2). In this case, IRF3 and IRF7 could significantly increase the levels of CEACAM1. Finally, when HeLa cells expressing Lv1-F were treated with the IRFs, we observed no repression by IRF3 or IRF7 as compared with that seen in IRF1-treated samples (Fig. 7E, lanes 9 and 7 versus 5, and quantitated in F). Based on this finding, we conclude that there is a general mechanism of CEACAM1 gene regulation under the control of IRF1 and the hnRNP L variant 1 regulator, Lv1.
Figure 6. Coordination of Lv1 with IRF1 mediates changes in chromatin of multiple promoters including CEACAM1. A, schematic diagram of the chromatin accessibility assay adapted from (67) using micrococcal nuclease (+MNase) near transcription factors (TF) with IRF1 as an example. DNA fragments generated are expected to produce the data signal obtained in areas rich in nucleosomes. B, nucleosome profile relative to TSS in 10-bp resolution. C, peak enrichment comparison between vector + IRF1 versus Lv1-F + IRF1 at the IRF1 locus (upper) and CEACAM1 locus (lower). D, validation of target DCC by treating MCF7 cells with and without Lv1-F and IRF1. Cell lysates were subjected to Western blotting and probed using antibodies to hnRNP L, GAPDH, and DCC. E, samples from D were quantitated where DCC was normalized to vector + IRF1 protein levels. *, p < 0.05, Lv1 versus vector both in the presence of IRF1. Experiments were repeated in triplicate. Equal protein amounts (50 μg) from protein lysates were loaded on each lane. F-H, MNase ChIP was performed in HeLa cells using antibodies directed to IRF1 (F), SP1 (G), or USF1 (H) expressing vector or Lv1-F with and without treatment with IRF1 or IFN-γ. Quantitative PCR was conducted using primers specific for the CEACAM1 ISRE or USF1 binding locus as shown on the y axis. Bar graphs represent the mean relative enrichment values (± S.D.) of three independent experiments. *, p < 0.05; **, p < 0.01; ***, p < 0.001 Lv1-F versus vector control for each treatment.

Discussion
In this study, we identified novel interactions of IRF1 with hnRNP L and an hnRNP L variant (Lv1) at the level of the transcriptosome and the spliceosome. We have now established that the variable E7 of CEACAM1 serves as a binding site to tether IRF1 to hnRNP L and its variant Lv1 (Fig. 1). Although our previous attempts to affinity-purify factors associated with E7 did not detect IRF1 but did identify hnRNP L (14), we now show that MOs directed to the binding site of hnRNP L in E7 blocked that interaction only in the presence of IRF1. Thus, IRF1 spatially influences AS through an indirect association with hnRNP L (Fig. 2).
Additionally, the finding that hnRNP L interacts with IRF1 to down-regulate CEACAM1 expression (Fig. 4) adds to the repertoire of known interactions between the spliceosome and transcriptosome (44) and provides further evidence for the "recruitment model" (45) where splicing regulators are brought in close proximity to the promoter. Proteomic analysis of the human spliceosome reveals that many spliceosomal proteins are candidates for coupling splicing with gene expression-associated factors (46). For instance, the hormone-inducible transcriptional repressor SHARP and the transcription-coupled repair protein XAB2 were identified as components of functional spliceosomes. In the case of viral promoters, the HIV promoter interacts with the spliceosome to affect AS of the primary HIV transcript (47), and the human papillomavirus E2 protein, which is involved in transcriptional regulation of viral genes, interacts with splicing factors Srp75, Srp55, and Srp40 (48). One model proposes that the carboxyl-terminal domain (CTD) of RNA polymerase is a binding platform for the recruitment of these proteins (49). An example is the Mediator-CTD-binding complex that associates with a number of mRNA processing factors, including hnRNP L (25). A second example of an adaptor system similar to hnRNP L linking RNA splicing to IRF1 is the model system that brings splicing regulator PTB to histone marks through the interaction with chromatin-binding protein MRG15 (50).

Figure 7. IRF3 and IRF7 like IRF1 bias AS toward exon inclusion. A, HeLa cells were infected with adenoviruses expressing mIRF1 (murine), IRF3, IRF7, IRF9, or control Ad-Ψ5 and harvested after 24 h of treatment. Total cell lysates were analyzed by Western blotting using antibodies to each IRF, and α-tubulin was used as a loading control. STAT1 was included as a downstream effector control. B, RT-PCR analysis of total RNA isolated from IRF-induced cells using primers for CEACAM1 or GAPDH. Primers to detect CEACAM1-4ecto and CEACAM1-3ecto mRNA isoforms have been described elsewhere (65). Asterisks (* and **) indicate the presence of uncharacterized mRNA species. C, Western blot analysis of total protein isolated from IRF-induced cells using antibodies to CEACAM1 or α-tubulin. The asterisk indicates the presence of nonspecific interactions. D, samples from C were quantitated where normalized CEACAM1 levels were compared with control Ad-Ψ5 treatment. *, p < 0.05, IRF3 versus Ad-Ψ5; *, p < 0.05, IRF1 versus Ad-Ψ5. E, HeLa cells were infected with adenoviruses expressing IRF1, IRF3, or IRF7 versus control Ad-Ψ5 in the presence or absence of Lv1-F, and total cell lysates were probed for CEACAM1 expression or β-actin by Western blotting. F, samples from E were quantitated where normalized CEACAM1 levels were compared with control Ad-Ψ5. ***, p < 0.001, Lv1-F + IRF1 versus IRF1-treated cells alone. A control lysate expressing CEACAM1 S-iso in MCF7 cells was used as CEACAM1 migration control. Equal protein amounts from protein lysates were loaded on each lane, 25 μg for A and 50 μg for C and E. Ad-Ψ5 is the Ad-null empty vector control. Number of replicates for A-E were n = 2. Error bars, S.D.

Because IRF1 had no effect on the expression of hnRNP L isoforms but their interaction significantly affected mRNA expression (Fig. 3, A and D), it will be important to show whether IRF1 and hnRNP L are part of a transcriptional complex that exists at the CEACAM1 locus. It is also possible that IRF1 controls the abundance of CEACAM1 mRNA through interactions between hnRNP L and RNA-binding protein HuR. We discovered HuR in a screen of RNA factors using E7 as bait in our proteomics study (14) and HuR is known to promote alternative polyadenylation site usage (51). This suggests that future studies should test whether the 3′UTR of CEACAM1 in combination with E7 is regulated by IRF1 and hnRNP L. This would help explain the curious finding that a single mutation in a microsatellite region of the 3′UTR of CEACAM1 causes disruption of its mRNA stability (52).
Given the central role of hnRNP L in these examples, our study leads us to propose yet another mode of co-transcriptional regulation involving hnRNP L that may be specific to inflammation, where IRF1 plays a central role (Fig. 8). In cells where IRF1 is not expressed, CEACAM1 is found mostly as a cis-homodimer expressing S-iso (53). During inflammation and induction of IRF1, inclusion of E7 leads to expression of L-iso, which, if chronically expressed, may ultimately promote cancer in epithelial cells. We showed previously that differential expression of CEACAM1 and alteration of its splicing pattern are linked to tumorigenesis in the breast (17). The interaction of IRF1 with the spliceosome not only generates different splice forms but, if acting chronically, may promote tumorigenesis and explain yet another connection between inflammation and cancer. From this perspective, a secondary level of regulation based on the gene silencing observed for Lv1 and IRF1 in cancer cells explains why not only CEACAM1 but also a number of other genes involved in tumorigenesis are down-regulated. This suggests that chronic inflammation leads to global gene silencing, a common feature of cancer progression. Although such a program may be an attempt to terminate chronic inflammation, it may have the unanticipated effect of promoting cancer by loss of expression of critical normal genes such as CEACAM1.
Although this is the first report that a variant of hnRNP L (Lv1) may be involved in cancer, there are examples in the literature that suggest hnRNP L, without reference to specific variants, may be involved. For example, recent studies report hnRNP L protein levels are significantly up-regulated in gastric (13) and pancreatic cancers (54). Factors leading to the regulation of the many hnRNP L variants remain to be discovered. Indeed, a novel hnRNP L variant was recently described that contains a short "poison" exon in intron 6 that, when included, results in the nonsense-mediated decay of its own mRNA (55). Nevertheless, the induction of Lv1 and the mechanism by which Lv1 induces chromatin remodeling remain open questions. However, our data showing Lv1-F caused significant dissociation of SP1 and association of USF1 at the CEACAM1 promoter under IFN-γ stimulation (Fig. 6) suggest this area should be explored further.
Additionally, the RNAi studies presented here provide an initial clue to the functional consequences of the complex exon structure of the different variants of hnRNP L (Fig. 3). Nonetheless, the fact that Lv1 arises from transcription at an alternative exon, resulting in differences in an N-terminal extension, provides an important clue, suggesting that unidentified transcription factors regulate Lv1 expression. The composition of the Lv1 N-terminal extension, which is absent in the predominant variant Lv2, is another clue in that it is rich in arginine and glycine residues. Gly- and Arg-rich domains are common in RNA-binding proteins including hnRNP A1 (56) and hnRNP D (57), but their function is poorly understood. Gly-rich sequences are expected to form intrinsically unstructured domains that can adopt an active configuration when interacting with other proteins or with RNA. The role of Gly-rich domains in disease, as in the case of TAR DNA-binding protein 43 and FUS (fused in sarcoma), is related to their formation of abnormal aggregates in ALS and frontotemporal dementia (58). In our studies using deletion constructs of either ΔGly71–Gly89 or ΔAla38–Gly89 in the polyglycine domain of Lv1, we observed strong de-repression of CEACAM1. The exact targets of the Gly-rich region remain to be discovered, but we suggest that future studies focus on the relevance of amino acids 1-89 at the N-terminal extension as a key to the function of Lv1.

Figure 8. Model of the role of Lv1 and IRF1 in regulating CEACAM1 in inflammation and cancer. A, E7 acts as a binding platform to place IRF1 in trans in close proximity to RNA splicing regulators of the AS pathway. Shown is Lv1 association with the spliceosome and factors in the transcriptosome, including IRF1, SP1, and USF1. B, relationship between IRF1, Lv1, and CEACAM1 phenotype in normal conditions, inflammation, and cancer. Acute inflammation causes IRF1 to express the L-iso of CEACAM1, and its expression causes a resolution to revert to S-iso in normal conditions. Chronic inflammatory conditions in cancer cause high expression of both IRF1 and Lv1, leading to global gene silencing. Low and High refer to expression level.

One possibility is the families of otherwise nonfunctional transcription factors that undergo conformational changes in the presence of binding factors that activate their regulatory domains (59). In another respect, our findings regarding Lv1 could explain why knockdown of hnRNP L leads to reduced expression of CEACAM1 in mouse spermatogonia GC-1 cells (60).
This study expands on our previously reported findings that IRF1 mediates global AS in a large subset of human genes involved in a variety of cellular functions (Fig. 7B of Ref. 21). We now establish that IRF1 utilizes hnRNP L's interaction with variable E7 in the formation of L-iso. Additionally, a variant of hnRNP L, Lv1, plays a role together with IRF1 to silence a subset of human genes, including CEACAM1. Analysis of the Gly-rich domain of Lv1 suggests that a highly unstructured domain controls gene silencing, perhaps through modulation of protein-protein interactions. For example, other studies on the C-terminal domain of hnRNP A1 suggest that homocomplexes and heterocomplexes with other hnRNP proteins form through its Gly-rich domain (61). Future studies may demonstrate that Lv1 coordinates with factors such as hnRNP K that also function as transcription factors (62) as a mechanism to link RNA splicing activity to gene silencing. In fact, in our study we demonstrated that IRF1 and Lv1 together induce chromatin remodeling, restricting IRF1 access to the CEACAM1 promoter. Furthermore, because interferon γ treatment of HeLa cells also caused the binding of USF1 to the CEACAM1 promoter in the presence of Lv1, this mechanism may apply to transcription factors other than IRF1.
In conclusion, we have provided evidence of a new mechanism for the coordination of the transcriptosome and spliceosome that relies on an RNA splicing factor to control CEACAM1 isoform expression. Studying CEACAM1 gene expression dynamics in the context of IRF1 and Lv1 showed for the first time that CEACAM1 can be silenced with implications for its role as a common mechanism in cancer development.

Induction of IRFs
All IRFs in this study were introduced by adenovirus delivery. For IRF1, mouse Ad-IRF1 was used essentially as described previously (21,64). Plasmid construction of the remaining IRFs was as follows: pUNO1-saIRF3 (containing a single point mutation of Ser396 to Asp) (InvivoGen) was subcloned into pEntCMV shuttle vector and packaged into adenovirus at 3 × 10¹² viral particles (vp)/ml (Welgen, Inc.). For IRF7, pUNO1hsa-IRF7Δ (constructed by deleting the auto-inhibitory domain; Δ238–410) and IRF9 were purchased from InvivoGen and subcloned and packaged as detailed above. Induction of the IRFs (Ad-IRF1, Ad-IRF3, Ad-IRF7, and Ad-IRF9; hereafter referred to as only IRF1, IRF3, IRF7, and IRF9 for simplicity) was performed essentially as described previously (21) except for the following modifications. Cells were seeded at a density of 0.35 × 10⁶ cells in 6-well plates 24 h prior to treatment. Either the IRF-expressing adenovirus or the Ad-null empty vector control (hereafter referred to as Ad-Ψ5) was added to cultures at an estimated 100 particles per pfu.

RNA isolation and RT-PCR
Total RNA isolation and preparation of cDNA has been described elsewhere (21). Amplification of CEACAM1 mRNA (Figs. 1-3 and Fig. S1) was performed using sense exon 6 prim-

RNAi treatment
For reverse transfection of siRNA, cells were seeded at a density of 0.25 × 10⁶ cells in 15-mm plates at time of complex formation. Lipofectamine RNAiMAX was used at a dose of 5 μl per 500 μl Opti-MEM, with 2 μl of 20 μM siRNA in a 2.5-ml final volume per well in media without antibiotics. For hnRNP L and Lv1, 1 μl of 20 μM siRNA was used. IRF1 or the Ad-null empty vector control virus was added to cultures after 48 h, and after 72 h cells were analyzed for RNA and protein expression differences.
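As a quick check on the dosing described above, the following minimal sketch computes the final siRNA concentration per well. It assumes the stock is 20 μM, the added volumes are in μl, and the siRNA is diluted evenly into the 2.5-ml final volume. The function name and parameters are illustrative and do not come from the original protocol.

```python
def final_concentration_nM(stock_uM: float, stock_volume_ul: float,
                           final_volume_ml: float) -> float:
    """Final concentration (nM) after diluting a stock into a final volume.

    Assumption: the stock is fully and evenly mixed into the final volume.
    """
    # 1 uM x 1 ul = 1 pmol of siRNA
    amount_pmol = stock_uM * stock_volume_ul
    # 1 pmol per ml = 1 nM
    return amount_pmol / final_volume_ml


# General siRNA dose: 2 ul of 20 uM stock in 2.5 ml
general_dose = final_concentration_nM(20, 2, 2.5)
# hnRNP L and Lv1 dose: 1 ul of 20 uM stock in 2.5 ml
reduced_dose = final_concentration_nM(20, 1, 2.5)
print(f"general: {general_dose:.0f} nM, hnRNP L/Lv1: {reduced_dose:.0f} nM")
```

Under these assumptions the general dose works out to 16 nM and the hnRNP L and Lv1 dose to 8 nM.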

Flow cytometry
For samples expressing GFP, detached cells were washed twice with ice-cold PBS before analyses. Percentage of FITC-positive cell population was expressed as percent of maximum. For cells expressing CEACAM1, 0.15 × 10⁶ cells were incubated with 80 μl growth media containing 10% FBS with 50 μl human serum for 20 min at 4°C. Following this, CD66A-PE antibody (R&D Systems) was added to each sample at a final concentration of 1 μg/ml for 30 min at 4°C. Samples were then washed twice with ice-cold PBS before analysis by flow cytometry (BD FACSAria II, BD Biosciences). GFP fluorescence intensities were detected with a 488-nm laser. Data were analyzed with FlowJo software, version 8.1.1 (Tree Star).

MNase ChIP-Seq
ChIP-Seq data were analyzed using CAM (66). The aggregation plots of both the vector + IRF1 sample and the Lv1-F + IRF1 sample over promoter regions were generated in 10-bp resolution.
Multiple key quality control measurements such as sequencing coverage and nucleosome organization profiles were generated and compared with a set of 268 historical MNase-Seq datasets in human and mouse that serve as a reference.
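To illustrate what an aggregation plot at 10-bp resolution represents, the sketch below bins fragment midpoints by their strand-aware distance to each transcription start site (TSS) and averages the counts across sites. It is a simplified illustration and not the CAM implementation. The function name, the single-chromosome input, and the 1-kb flank are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def aggregate_profile(midpoints: Sequence[int],
                      tss_sites: Sequence[Tuple[int, str]],
                      flank: int = 1000,
                      bin_size: int = 10) -> List[float]:
    """Mean fragment-midpoint count per bin around a set of TSSs.

    midpoints: fragment midpoint coordinates on one chromosome (assumed input).
    tss_sites: (position, strand) pairs, with strand '+' or '-'.
    Returns one value per bin, ordered from -flank to +flank relative to TSS.
    """
    if not tss_sites:
        raise ValueError("at least one TSS is required")
    n_bins = (2 * flank) // bin_size
    totals = [0] * n_bins
    for tss, strand in tss_sites:
        for pos in midpoints:
            offset = pos - tss
            if strand == "-":
                offset = -offset  # orient minus-strand genes 5' to 3'
            if -flank <= offset < flank:
                totals[(offset + flank) // bin_size] += 1
    # Average over TSSs so profiles from different gene sets are comparable
    return [count / len(tss_sites) for count in totals]


# Toy example: three midpoints around a single plus-strand TSS at position 100
profile = aggregate_profile([95, 105, 250], [(100, "+")], flank=200)
print(len(profile), sum(profile))
```

Comparing two samples, such as vector + IRF1 and Lv1-F + IRF1, would amount to computing one such profile per sample over the same promoter set. The nested loop is written for clarity; a real dataset would call for an indexed or vectorized approach.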