Differential and overlapping targets of the transcriptional regulators NRF1, NRF2, and NRF3 in human cells

The nuclear factor (erythroid 2)-like (NRF) transcription factors are a subset of cap'n'collar transcriptional regulators. They consist of three members, NRF1, NRF2, and NRF3, that regulate the expression of genes containing antioxidant-response elements (AREs) in their promoter regions. Although all NRF members regulate ARE-containing genes, each is associated with distinct roles. A comprehensive study of differential and overlapping DNA-binding and transcriptional activities of the NRFs has not yet been conducted. Here, we performed chromatin immunoprecipitation (ChIP)-exo sequencing, an approach that combines ChIP with exonuclease treatment to pinpoint regulatory elements in DNA with high precision, in conjunction with RNA-sequencing to define the transcriptional targets of each NRF member. Our approach, done in three U2OS cell lines, identified 31 genes that were regulated by all three NRF members, 27 that were regulated similarly by all three, and four genes that were differentially regulated by at least one NRF member. We also found genes that were up- or down-regulated by only one NRF member, with 84, 84, and 22 genes that were regulated by NRF1, NRF2, and NRF3, respectively. Analysis of the ARE motifs identified in ChIP peaks revealed that NRF2 prefers binding to AREs flanked by GC-rich regions and that NRF1 prefers AT-rich flanking regions. Thus, sequence preference, likely in combination with upstream signaling events, determines NRF member activation under specific cellular contexts. Our analysis provides a comprehensive description of differential and overlapping gene regulation by the transcriptional regulators NRF1, NRF2, and NRF3.

The nuclear factor (erythroid 2)-like (NRF) 5 transcription factors are a subset of basic leucine zipper (bZip) cap'n'collar (CNC) transcription factors sharing close homology. There are three NRF family members: NRF1, NRF2, and NRF3 (1). The discovery of different NRF family members followed an effort to find transcription factors controlling expression of ␤-globin, a gene required for hemoglobin production and proper oxygen transport in vertebrates. NRF1 was the first discovered NRF family member and was identified in a cDNA library screen for binding to the erythroid-specific NF-E2/AP1 consensus sequence found in the 5Ј ␤-globin locus control region, 5Ј-GC-TGAGTCA-3Ј (2). Shortly thereafter, NRF2 was identified in a comparable cDNA library screen (3). Intriguingly, the isolation and characterization of NRF3 arose a few years later following a logical deduction: as the genomic locations of nrf1, nrf2, and the NF-E2 subunit p45 (NF-E2p45) were found near hoxB, hoxD, and hoxC genes, respectively, nrf3 was predicted and shown to map close to the hoxA gene (4). Thus, the CNC families of NF-E2, NRF1, NRF2, and NRF3 were all predicted to have arisen from a common genetic lineage, diverging after chromosomal duplication.
Almost immediately after the discovery of NF-E2, NRF1, and NRF2, investigators began dissecting the differing roles of these three transcription factors. As they had all been discovered as mediators of ␤-globin expression, initial investigations looked into their roles in erythropoiesis and hematology. NF-E2p45 knockout mice exhibited an absence of platelets and hemorrhaging, but had very little effect on erythroid cell lineages and were able to survive to adulthood (5). In contrast, NRF1 knockout mice displayed severe anemia due to impaired fetal liver erythropoiesis and died in utero (6,7). Homozygous Nrf2 knockout mice, however, displayed no obvious defects in embryogenesis, fertility, or litter sizes (8). The role of NRF2 became more apparent following challenge with carcinogens, where it was initially shown to protect mice against chemical carcinogenesis. Similarly, Nrf3 knockout mice displayed no overt phenotype (9). Further insight into the role of Nrf3 may become more apparent after appropriate challenges: for example, Nrf3 Ϫ/Ϫ animals were more likely to develop cell lymphoblastic lymphoma following benzo-apyrene exposure (10). The NRF transcription factors also differed in their tissue localization; Nrf1 and Nrf2 are expressed relatively ubiquitously, whereas Nrf3 expression is highly localized to placental tissue (2)(3)(4).
Although these early studies into the differential underpinnings of the NRF members showed stark contrasts in expression and mediation of downstream responses between family members, mechanistic investigations into their transcriptional modes of action indicated a more conserved process. To control transcription, all CNC members utilize a similar mechanism. Nuclear CNC transcription factors dimerize with the small musculoaponeurotic fibrosarcoma, or sMaf proteins (sMafG, sMafK, and sMafF), to bind to DNA. The resulting sMaf/bZip factor complexes have the capacity to bind to several related target DNA sequences: the palindromic Maf recognition element (5Ј-TGCTGAC(G)TCAGCA-3Ј), the NF-E2 recognition element (5Ј-TGCTGACTCAT-3Ј), and the antioxidant-response element (ARE, 5Ј-TGABNNNGC-3Ј) (1,11,12). Historically, genes under ARE control were of particular interest; many of these genes were transcriptionally inducible by electrophilic xenobiotics and consisted predominately of detoxifying enzymes like GSH-S-transferase or NAD(P)H quinone dehydrogenase 1 (NQO1) (13)(14)(15). In a paradigm-shifting work, AREcontaining genes were discovered to be under the transcriptomic control of NRF members, particularly NRF2 (16).
Given the distinct, and occasionally conflicting, roles that have been attributed to the NRF family members, we combined ChIP and RNA-Seq analyses to catalog the overlapping and distinct genes regulated by each NRF family member, providing insights into the transcriptional program governing the NRF family members.

System design
The incomplete mechanistic understanding of NRF3 regulation, as well as the absence of a well-established NRF3-specific antibody, complicates systematic evaluation of the transcriptional targets of each NRF family member. To overcome these limitations, we created three derivative U2OS cell lines; each of these cell lines carries a doxycycline-inducible FLAG-tagged NRF family member inserted into the same locus using flp-in technology (Fig. S1). These cell lines allow us to modulate the expression of each NRF family member using doxycycline induction and detection using the highly-specific anti-FLAGtag antibody (Fig. 1A). To ensure that the induced NRF proteins will translocate to the nucleus to mediate their transcriptional functions, we removed the ER targeting sequences of NRF1 and NRF3, and we replaced the DLG and ETGE motifs of the Neh2 degron on NRF2 with AAA and AAAA, respectively (Fig. 1, B and C). The mutations to Neh2 degron on NRF2 is required to prevent it from interacting with kelch-like ECH-associated protein 1 (KEAP1), which will ubiquitylate and target NRF2 for proteasomal degradation. By establishing this system, we sought to compare gene expression and genomic binding sim-ilarities and differences among NRF members in the same cell line with comparable induction systems.

ChIP-exo
Using the constructed cell line, we performed ChIP-exo sequencing using anti-FLAG antibody-conjugated magnetic beads. Although the ChIP studies have been conducted with great success on individual NRF members (17), no studies have compared binding sites among all NRF members in the same system. All genomic loci found from the ChIP-exo analysis can be found in Table S1. The sequencing results showed good enrichment for binding sites containing antioxidant-response elements for all three NRF members ( Fig. 2A); 73, 46, and 66% of NRF1, NRF2, and NRF3 peaks contained one or more AREs, respectively. The ChIP peaks also showed good resolution for AREs, with 96, 83, and 96% of canonical AREs appearing within 200 bp of the peak summits (Fig. 2B). All three NRF members showed binding across all chromosomes (Fig. 2C). In terms of , and NRF3 proteins are inducible by doxycycline in their respective cell lines. Each NRF member was fused with a FLAG peptide sequence. GAPDH was used as a loading control. B, amino acids 1-121 and 1-173 were removed from the open reading frames of NRF1 and NRF3, respectively, to eliminate their ER targeting sequences. Similarly, the DLG and ETGE degron motifs within the Neh2 domain of NRF2 were mutated to AAA and AAAA, respectively, to allow for NRF2 accumulation in the ChIP-exo experiments. The DLG and ETGE degrons must be removed to prevent KEAP1-mediated ubiquitylation of NRF2. C, following doxycycline induction (0.5 g/ml), NRF members translocate to the nucleus. The scale bar represents a length of 10 m.

Transcription targets of NRF1, NRF2, and NRF3
genomic annotations, NRF1 and NRF3 peaks showed relatively similar distributions, with peaks mainly residing in distal intergenic regions (Fig. 2D), whereas NRF2 showed relatively fewer distal intergenic binding sites, and a relative increase in binding sites residing within the promoter and the first exon of genes. All annotated peaks are provided in Table S2. As expected from ChIP of transcription factors, binding sites from the promoter region congregate near the transcription start site (TSS) of genes (Fig. 2E).
To determine whether NRF1, NRF2, or NRF3 showed any binding preferences to specific ARE sequences or flanking region sequences, we grouped ChIP-exo binding sites by their specificity to each NRF member or overlaps among different NRF members, searched the peaks for AREs, and identified potential consensus sequences for a 50-bp region flanking the ARE at both the 5Ј and the 3Ј ends. We then compared whether the sequences were different between groups and identified sequences specific to each NRF member and to those shared by two or more NRF members (Fig. 3, A and B). NRF1 and NRF2 specific binding sites have ARE sequences that differ from each other. ARE sequences from binding sites shared by all three NRF members significantly differed from ARE sequences from binding sites specific to NRF1, NRF2, and NRF3 and ARE sequences from binding sites shared by NRF1 and NRF3. We identified similar trends for sequences flanking both the 5Ј and the 3Ј regions.
We compared the GC to AT ratio at each position in the sequences flanking the ARE from binding sites specific to each NRF members or shared among them (Fig. 4). Sites bound by all three NRF members were particularly enriched for AT at the position immediately preceding the ARE. We found that 28 positions were enriched for GC in binding sites specifically bound by NRF2, whereas seven positions were enriched for AT in NRF1-specific binding sites. None of the other groups showed consensus sequence enrichment for GC or AT at any particular sequence position.

RNA sequencing
We next assessed differences and similarities in genes differentially regulated by NRF1, NRF2, and NRF3 in the RNASeq data. To overexpress each NRF member, we exposed the U2OS TetR -NRFX to doxycycline for 16 h. U2OS TetR -NRF2 cells were also treated with sulforaphane to facilitate NRF2 translocation (18 -21). All differentially expressed genes can be found in Table S3. Overexpression of all NRF members yielded increased transcript levels of more than a thousand genes, with NRF1 up-regulating 3155 genes; NRF2 up-regulating 2619, and NRF3 up-regulating 1212 (Fig. 5A). Next, we looked at genes that were regulated by more than one member. All three NRF members commonly up-regulated 557 genes. Additionally, NRF1 and NRF2 commonly up-regulated 849 genes; NRF1 and NRF3 commonly up-regulated 277 genes, and NRF2 and NRF3 commonly up-regulated 149 genes.
In summary, NRF1 and NRF2 have the most genes in common relative to all NRF family members. This includes both

Transcription targets of NRF1, NRF2, and NRF3
up-regulated and down-regulated genes. Additionally, NRF1 and NRF2 alter transcript levels of more genes than NRF3. These results highlight the similarities of NRF1 and NRF2 and the uniqueness of NRF3 in downstream regulation of ARE-containing genes.

Combining RNAseq and ChIP-exo data
We then analyzed the ChIP-exo data to compare genes regulated by NRF members. We limited genes to those with binding sites falling within 5 kb upstream of the TSS, the first intron, or the first exon. These binding sites were categorized as likely regulatory elements (LREs), which may or may not contain an ARE. NRF1, NRF2, and NRF3 each showed more than 1000 genes harboring LREs (Fig. 5B). 167 genes were shared between all three NRF members. Beyond the 167 genes shared by all NRF family members, NRF3 shared 84 genes with NRF2 and 580 genes with NRF1. NRF1 and NRF2 shared 67 LREs.
We were interested in genes that showed differential expression from the RNA-Seq data and had an LRE; we called these genes "high-confidence genes." Based on these criteria, we identified 262 high-confidence genes. We stratified these genes into 17 groups based on how each NRF member affected their transcript levels ( Fig. 5C and summarized in Table 1). Several high-confidence genes were regulated by only one NRF member. For example, ADGRG7 was uniquely up-regulated by NRF1, and NRF1 is the only NRF member showing an LRE in the promoter region of ADGRG7 (Fig. 5D). A similar pattern was observed for PHF20L1 with NRF2 (Fig. 5E). Additionally, TACC1 was down-regulated by NRF2 and contained an NRF2 LRE, whereas MRPL52 is down-regulated by NRF3 and contained an NRF3 LRE (Fig. 5, F and G). 84 high-confidence genes were solely regulated by NRF1 (67 up-regulated and 17 downregulated). Similarly, NRF2 itself regulates 84 high-confidence genes (55 up-regulated and 29 down-regulated), and NRF3 independently regulates 22 high-confidence genes (3 up-regulated and, 19 down-regulated).
To determine which of the genes were regulated by canonical NRF signaling, we evaluated the ChIP peaks in the high-confidence genes for the presence of a core ARE consensus sequence, 5Ј-TGABNNNGC-3Ј. Indeed, 100% of the 18 highconfidence genes up-regulated by NRF1, NRF2, and NRF3 all contained AREs. This is in contrast to the nine genes downregulated by NRF1, NRF2, and NRF3, of which only three (33%) had an ARE. When we looked at all 262 high-confidence genes, 62% of them contained an ARE (Fig. 5C).
Some genes showed different up-or down-regulation patterns dependent on which NRF member binds to its LRE (Fig.  5C). For example, TNFRSF1A was up-regulated by NRF1 and NRF2, but down-regulated by NRF3 (Fig. 5I). Similarly, DST was found to be up-regulated by NRF1 and NRF3 but downregulated by NRF2 (Fig. 5J). Some genes were regulated by just two members. For example, GNA12 expression was found to be up-regulated by NRF1, unaffected by NRF2, and down-regulated by NRF3; as expected, only NRF1 and NRF3 have an LRE (Fig. 5K).
These different groupings and regulation patterns demonstrate the heterogeneity by which NRF members affect gene expression. Some genes were regulated by just one member, whereas others were regulated by two or more members. Moreover, NRF members binding to the same LRE of a gene sometimes altered gene expression in different directions. All genes and their respective groupings can be found in Table 1.

ADGRG7 and PHF20L1 are transcriptional targets of NRF1 and NRF2, respectively
Following the identification of different groupings described in the previous section (Table 1), we decided to focus on the NRF1-specific ADGRG7 and the NRF2-specific PHF20L1. To test whether the NRF member specificity of these two genes is generalizable across different cell lines, we performed validation studies in four different cell lines: H1299 (a nonsmall cell lung cancer cell line); HK2 (an immortalized human renal cortex/proximal tubule cell line); BEAS-2B (an immortalized human bronchial epithelia cell line); and MDA-MB-231 (a breast cancer cell line). These cell lines are not known to have NRF1, NRF2, or NRF3 activation. We pharmacologically induced NRF1 using tunicamycin (22) and induced NRF2 using sulforaphane (23). Accordingly, tunicamycin only increased NRF1 protein levels, and sulforaphane only increased NRF2 protein levels across the four tested cell lines (Fig. 6A). We then evaluated transcript levels of ADGRG7 and PHF20L1 using RT- Figure 3. Comparison of GC to AT ratios between NRF-binding-site groups for all ChIP-exo AREs. A, binding sites that were unique to a single NRF family member, or those that overlapped between multiple NRF members, are identified by ChIP-exo sequencing. Note that the sum of binding sites for each member does not always equal the number of binding sites in Fig. 2A due to peak resolution. B, proportions of GC to AT base pairs in 50-bp regions flanking the AREs were compared among NRF member binding-site groups identified from ChIP-exo sequencing. Red-colored scaled squares indicate one group being significantly different from the other as determined by the 2 test, with a darker color indicating a smaller p value. NRF members of each binding-site group are indicated with solid green squares to the right or below the p value grid. Transcription targets of NRF1, NRF2, and NRF3 Figure 5. Identification of high-confidence genes regulated by NRF members. A, number of intersecting or unique genes whose expression was up-regulated or down-regulated following overexpression of NRF1, NRF2, or NRF3. Data were determined by RNA-Seq. B, number of intersecting or unique genes harboring an LRE following overexpression of NRF1, NRF2, or NRF3. Data were determined by ChIP-exo sequencing. Note the number of genes harboring an LRE with an overlap between NRF1 and NRF2 is higher than that found in Fig. 3A due to some LREs annotating to multiple genes. C, number of high-confidence genes that concurrently had an LRE and showed altered transcription following overexpression of NRF1, NRF2, or NRF3. Different high-confidence groups are annotated according to whether the different members up-regulated (red) or down-regulated (blue) transcription, as determined by RNA-Seq. The percentage of high-confidence genes that harbored a canonical ARE sequence, 5Ј-TGABNNNGC-3Ј is presented in black (proportion with ARE) and white (proportion without ARE). D-K, ChIP-exo coverage profiles of binding sites for selected high-confidence genes, showing how genes not regulated by a certain NRF member did not have a nearby LRE. Scale bars represent 1 kb. Rectangles and arrowed lines below coverage profiles indicate gene transcripts. Red lines indicate locations of canonical AREs.

Transcription targets of NRF1, NRF2, and NRF3
qPCR and found that ADGRG7 mRNA level was induced (ϳ1.5-3.0-fold) by tunicamycin in all four cell lines (Fig. 6B). Sulforaphane did not induce ADGRG7 in MDA-MB-231, BEAS-2B, or H1299 cell lines and only induced ADGRG7 ϳ1.2fold in the HK2 cell line (Fig. 6C). Tunicamycin did not induce PHF20L1 mRNA in any cell lines (Fig. 6B), whereas sulforaphane induced PHFC20L1 mRNA in MDA-MB-231, BEAS-2B, and H1299 cell lines (Fig. 6C). To ensure canonical transcription targets of NRF1 and NRF2 were up-regulated with both treatments, we measured transcript levels of FTL, GCLM, and NQO1 in the four cell lines and confirmed that all three genes were induced (Fig. S2).

Species conservation
To evaluate whether the identified different grouping of genes are conserved across members of the subphyla vertebrata, we evaluated the distribution of phastCons scores for ChIP peaks specific to NRF1-up-regulated, NRF2-up-regu-lated, NRF3-up-regulated, and NRF1/NRF2/NRF3-up-regulated groups. PhastCons score estimates the likelihood that a specific nucleotide position belongs to a conserved element, and the scoring system range is from 0 to 1, with 1 representing the highest probability that the nucleotide position belongs to a conserved element. We found that ChIP-peaks from NRF2and NRF1/NRF2/NRF3-up-regulated groups are likely to contain a conserved element, with up to 70% of the peaks containing regions with a phastCons score of 1. ChIP-peaks from the NRF1-up-regulated group are moderately likely to contain a conserved element, with around 30% of the ChIP-peaks with a phastCons score of 1, whereas ChIP peaks from the NRF3up-regulated group are least likely to contain a conserved element (Fig. S3). However, because ChIP-peaks from NRF1-and NRF2-up-regulated groups are likely to contain a conserved element, we wanted to know whether AREs in ADGRG7 and PHF20L1 contributed to the observed high  Note the lowered approximate molecular weight of NRF1 following its activation is consistent with previous reports (22). B and C, RT-qPCR of mRNA levels of ADGRG7 (B) and PHF20L1 (C) following a 16-h treatment with either vehicle, 5 g/ml tunicamycin, or 5 M sulforaphane. *, p Յ 0.05, and N, not significant (p Ͼ 0.05) for treatment differences, as determined by ANOVA with post hoc Tukey's highly-significant difference test.

Transcription targets of NRF1, NRF2, and NRF3
phastCons score. By looking at position-wise phastCons scores for ADGRG7 and PHF20L1 ChIP-peaks, we found that the AREs for the two genes are not well-conserved (Fig.  S3). This indicates that those elements are likely to be more specific to higher primates.

Flanking sequence identification
We evaluated sequences flanking the canonical AREs of high-confidence genes to see whether any transcriptional program is determining NRF member binding. We extracted all ARE sequences found within the LREs (Table S4). We next developed consensus sequences covering the AREs and 50 bp flanking the canonical sequence (Fig. 7A). A handful of trends emerged. NRF2-specific binding sites had 21 positions that were preferentially enriched for GC base pairs. NRF1-specific binding sites had seven positions that were preferentially enriched for AT base pairs. NRF3-specific binding sites had two positions that were preferentially enriched for AT base pairs. Because the high-confidence genes solely regulated by NRF2 showed a preponderance of GC base pairs in the flanking regions, and high-confidence genes solely regulated by NRF1 showed a preponderance of AT base pairs in the flanking regions, we sought to determine whether the transcription factors showed any bias toward these sequences. We generated luciferase constructs that placed a firefly luciferase ORF under the control of the dominating consensus sequences for both the NRF1-specific and NRF2-specific binding sites. After transfecting these constructs into the U2OS TetR -NRF1 and U2OS TetR -NRF2 cells and activating the different NRFX members with doxycycline, we compared relative luciferase activities (Fig. 7B). We found that the NRF1-AREflank showed higher luciferase activity in U2OS TetR -NRF1, whereas the NRF2-AREflank showed higher luciferase activity in the U2OS TetR -NRF2 cells. This indicates that NRF1 showed a slight preference for AREs flanked by AT-rich sequences, whereas NRF2 showed a slight preference for AREs flanked by GC-rich sequences.
We expanded on the synthetic promoter sequence results by evaluating whether the promoter regions of the NRF1-specific gene, ADGRG7, and the NRF2-specific gene, PHF20L1, were sufficient to elicit NRF member specificity. We placed the promoter regions of ADGRG7 or PHF20L1 found immediately upstream of the TSS in front of a firefly luciferase vector. The 50-bp regions flanking the 5Ј and 3Ј ends of the AREs contained 48.27% GC (ADGRG7) and 81.08% GC (PHF20L1). After transfecting these constructs into the U2OS TetR -NRF1 and U2OS TetR -NRF2 cells and activating the different NRFX members with doxycycline, we compared relative luciferase activities as before (Fig. 7C). We found that the ADGRG7 promoter vector only showed high firefly luciferase activity in U2OS TetR -NRF1 cells treated with doxycycline, whereas the PHF20L1 promoter vector only showed high firefly luciferase activity in U2OS TetR -NRF2 cells treated with doxycycline. These results are consistent with the hypothesis that genes under the control of the ADGRG7 and PHF20L1 promoter regions are preferentially activated by NRF1 and NRF2, respectively.

Functional annotation
We sought to determine which biological processes each of these gene sets may be controlling by performing a gene ontology (GO) analysis on each gene set and identifying all statistically significantly enriched terms containing two or more genes. A full list of statistically-significant enriched terms, including those with just one gene, are included in Table S5. The high-confidence genes up-regulated by all three NRF members had the most enriched terms. As expected, terms around redox processes, iron homeostasis, and oxidative stress were enriched. Other key terms centered around apoptosis regulation, autophagy, angiogenesis, response to stressors (heat, mechanical stimuli, and cadmium), and proteostasis ("chaperone binding" and ubiquitination) (Fig. 8). This was in contrast to high-confidence genes up-regulated by NRF1 and NRF2 but down-regulated by NRF3; only "protein binding" and "viral processes" were enriched.
The high-confidence genes composed of two NRF members showed different themes. Those up-regulated by NRF1 and NRF3 were enriched for RNA binding, proliferation, and thermogenesis, whereas those up-regulated by NRF1, but downregulated by NRF3, showed enrichment for terms focused on cellular structure like "actin cytoskeleton," "membrane," or "intermediate filament cytoskeleton." High-confidence genes up-regulated by NRF2 alone were enriched for terms involved in proteostasis and protein trafficking. Indeed, several GO terms associated with protein folding were specifically associated with NRF2. These include "protein folding in endoplasmic reticulum," "chaperone cofactor-dependent protein refolding," and "protein-disulfide isomerase activity." NRF2-regulated genes involved in these processes include ERO1A, TOR1B, ERP44, and VAPA. GO processes involving protein trafficking included the "GPI-GnT complex" and the "GPI anchor biosynthetic process" with genes, including PIGY and PYURF. These processes involve protein glypiation, which is a critical post-translational modification for anchoring proteins to the cell membrane.
The GO enrichment analysis indicates the NRF members are involved in a wide variety of biological processes; some of these processes may be regulated by multiple NRF members, whereas others are more specific to certain members.
We also looked at GO for all high-confidence genes regulated by each NRF member, independent of positive or negative regulation ( Fig. 9 and Table S6). For the high-confidence genes regulated by NRF1, several terms around splicing ("U1 snRNP" and "mRNA 5Ј-splice site recognition") were enriched, as were terms around cellular structure such as "actin filament network formation" and "laminin-10 complex." NRF2-specific ontologies include "macroautophagy" and "protein-disulfide oxidoreductase activity." NRF3-specific ontologies include processes involved in metabolism ("L-ascorbic acid metabolic process" and "fatty acid catabolic process"), structure ("cadherin binding" and "cellular response to mechanical stimulus"), and protein sorting ("extracellular exosome" and "protein localization to organelle").
Some terms overlapped between the various NRF member groups (Fig. 9). No terms were commonly enriched between all Transcription targets of NRF1, NRF2, and NRF3 proportions of GC to AT base pairs in regions flanking the AREs were compared among NRF high-confidence gene groups within each position. NRF members of each high-confidence gene group are indicated with solid green squares to the right of the consensus sequences. Blacked out rows indicate the group being compared against the others. Red-colored scaled squares indicate one group being significantly different from the other as determined by the 2 test, with the darker color indicating a smaller p value. Bars below a particular position indicate the group was specifically enriched for GC (blue) or AT (red) base pairs at that position when compared with the other groups. B, consensus ARE sequences from A for NRF1 and NRF2 were placed in front of a luciferase ORF to generate NRF1-AREflank and NRF2-AREflank plasmids. The constructs were transfected into U2OS TetR -NRF1 and U2OS TetR -NRF2 cell lines treated with doxycycline, and luciferase activity was assessed. Bars represent mean luciferase readings from triplicate samples. Error bars represent Ϯ S.D. Asterisks represent statistically significant differences between the two samples, as determined by two-tailed Student's t test. C, relative luciferase units (RLU) of U2OS TetR -NRF1 or U2OS TetR -NRF2 transfected with the pGL4.21 vector with No Promoter (empty pGL4.21), the ADGRG7 promoter region, or the PHF20L1 promoter region as described under "Experimental procedures." Cells were treated with and without doxycycline. *, p Յ 0.05, and NS, not significant (p Ͼ 0.05) for treatment differences, as determined by ANOVA with post hoc Tukey's highly-significant difference test.

Transcription targets of NRF1, NRF2, and NRF3
three NRF members; however, NRF2 and NRF3 shared "cell redox homeostasis," and NRF1 was enriched for "regulation of cellular response to oxidative stress," which are closely related terms. This is not unexpected, as one of the primary functions attributed to NRF members is the antioxidant response. NRF2 and NRF3 shared the terms "erythrocyte homeostasis" and "iron ion homeostasis." Additionally, NRF1 and NRF3 shared "heme catabolic process." This is unsurprising considering the history of NRF members and globin regulation. Consistent with their individual processes, NRF1 and NRF3 shared other terms around structure, such as "membrane" or "hemidesmosome." Only one term was shared by NRF1 and NRF2: "neurofilament cytoskeletal organization."

Discussion
Here, we present a comprehensive comparison of NRF1, NRF2, and NRF3 transcriptional targets by comparing differentially-regulated genes determined from RNA-Seq with genes harboring nearby NRF-binding sites as determined by ChIPexo sequencing. Although several genes were up-regulated or down-regulated by all three members, each member also individually regulated its own subset of genes. Indeed, 235 of the 262 high-confidence genes were regulated differently by the NRF members. Based on this, it is likely that the observed transcriptional differences were determined by disparities in DNAbinding sequence preference, upstream-signaling pathways, and specific cellular contexts (i.e. biochemical/xenobiotic/nu-  Table S5 for details on which genes comprised the terms.

Transcription targets of NRF1, NRF2, and NRF3
tritional status, cofactor availability, competitive transcriptional programs, etc.). These features dictate which NRF family member becomes activated, where it binds in the genome, and what binding partners are available for determining gene induction or repression.
Supporting this, the known regulatory networks governing the different NRF family members appear to vary significantly. This is due in part to the dominant role that the discovery of NRF2 mediation of the antioxidant response had on the field, as investigations on NRF members subsequently diverged into disparate studies that generally focused on individual members. Furthermore, localization of each NRF member also plays a key role in dictating to which context each member is activated. For example, Nrf1 resides in an inactive, glycosylated state in the endoplasmic reticulum (22,24,25); once activated, Nrf1 translocates to the cytosol where it is deglycosylated and processed into several active isoforms, each of which harbors differential capacity to regulate transcription (26). Thus, glucose homeostasis, proteostasis, and other cellular conditions control Nrf1 activity (27). To control protein levels, Nrf1 is constantly degraded by the ubiquitin-proteasome system by several ubiquitin ligase systems, including HRD1 and ␤-TrCP complexes (28,29). Moreover, NRF1 is a well-described regulator of proteasomal function (30, 31).
The main negative regulator of NRF2 is KEAP1, which resides in the cytosol. KEAP1 is part of an E3 ubiquitylation complex that constantly ubiquitylates NRF2, marking it for proteasomal degradation (32). KEAP1 binds to the Neh2 domain of NRF2 and mediates NRF2 ubiquitylation. Following adduction of reactive cysteines on KEAP1 by electrophiles, the complex no longer effectively ubiquitylates NRF2, and newlysynthesized NRF2 can accumulate, translocate to the nucleus, and facilitate or repress transcription (33). This mechanism makes NRF2 the primary responder to chemical stressors. Intriguingly, KEAP1 stabilizes NRF1 when it binds to its Neh2 domain, contrasting the regulation of NRF2 from NRF1 (34). As more mechanistic insights into NRF2 regulation comes to light, the importance of NRF2 in maintaining cellular homeostasis, even in the absence of chemical stressors, continues to expand. For example, other ubiquitylation pathways, including HRD1and ␤-TrCPdependent mechanisms, also control the NRF2 protein level under specific physiological conditions (35,36). Additionally, we recently discovered that NRF2 is negatively regulated by many autophagy-related genes, implicating cellular proteostasis and metabolism also play important roles in NRF2 regulation (37).
Less is definitively known about NRF3 regulation; however, studies have shown that NRF3 is controlled in part by a PEST Figure 9. GO for all high-confidence genes regulated by NRF1, NRF2, or NRF3. Significantly-enriched GO terms for high-confidence genes altered by NRF1, NRF2, or NRF3 member groups. Only enriched GO terms with more than one gene represented were included. Venn diagram indicates number of unique or intersecting terms between each NRF member. See Table S6 for details on which genes comprised the terms.

Transcription targets of NRF1, NRF2, and NRF3
degron sequence and that NRF3 has three isoforms with varying subcellular localizations, including isoform "A" in the ER, isoform "B" in the cytosol, and isoform "C" in the nucleus (38). Similar to NRF1, NRF3 is targeted to and eventually cleaved from the ER (40). NRF3 degradation is also proteasome-dependent (40). Hence, the different mechanisms controlling the signaling and processing of NRF transcription factors, despite their structural homology, highlight the functional differences among them. Reviews are available describing the different signaling nuances of the NRF members (27,(41)(42)(43).
Accordingly, a better understanding of the functional differences between NRF members can further our understanding of the different roles each plays in physiology and pathology. We identified that some processes, like antioxidant response, are under the control of all three members. Other processes, like maintenance of cellular structure, may be limited to just a subset, like NRF1 and NRF3. Still, other processes may be completely under the jurisdiction of just one member. For the less well-studied NRF members, NRF1 and NRF3, this study provides unique target genes and their associated biological processes that could form the groundwork for further elucidation of their biological functions.
Functionally, NRF1 has been shown to be involved in various metabolic processes, including insulin resistance, diabetes, and inflammation (44 -47). This is, in part, due to Nrf1-validated control of enzymes involved in glycolysis, gluconeogenesis, and lipid metabolism (43,48); some of these regulated genes appear to be NRF1-specific and not shared by other NRF members like NRF2. NRF1 also regulates genes involved in cell differentiation, proteostasis, cell survival, and mitosis (43); however, NRF2 shares several of these functions, such as energy metabolism and cell differentiation (49 -56), while still controlling several unique genes in its own right. Metabolically, a recent work (57) has shown that loss of NRF2 in a rodent model can result in a lean phenotype. Beyond the aforementioned processes, a main function of NRF2 has been its role in antioxidation and detoxification, and while NRF1 and NRF3 have been shown to regulate several detoxification genes (58,59), NRF2 has been designated the master regulator of the antioxidant response, due in part to the KEAP1 electrophile sensory mechanism and several studies indicating NRF2 induction of this subset of genes to be the most robust. Specific roles of NRF3 have remained the most enigmatic; although several target genes have been identified, little clear evidence has determined the in vivo relevance of those genes (42). Indeed, a consistent lack of information exists on NRF3, particularly its physiological roles, again highlighting the need for more studies clarifying the role of the lesser-studied NRF family members.
To uncover potential unique roles of the different NRF members, we first conducted RNA-Seq to establish which genes each member regulates. This was then combined with ChIP-exo sequencing to identify LREs near those genes that were either similarly or differentially regulated. Many of the genes that were differentially regulated at the transcription level by an NRF member did not harbor an LRE. Similarly, many of the genes with ChIP-exo peaks were not annotated to a differentially-regulated gene. One possible explanation for these discrepancies could be distant enhancer regions. Enhancers have been shown to interact with promoters over 150 -200 kb or even span several megabases. Enhancers can even effect transcription across chromosomes due to the inherent flexibility in chromatin (60,61). To further complicate the situation, genes regulated by certain enhancers can vary from tissue to tissue and cell to cell depending on the chromatin state. Furthermore, we also recently reported that NRF2 can repress gene expression by interacting with replication protein A1 (RPA1) to bind 5Ј-TGABNNNGCAAACTTCA-3Ј consensus sequence (62). Although several NRF2-RPA1 target genes were indeed downregulated in the RNA-Seq data, no ChIP peaks were identified for those target genes possibly due to the lack of statistical power in the genome-wide approach or to cell type-specific effects. Regardless of these constraints, this remains the first study to systematically investigate NRF member dynamics at both a transcriptional and DNA-binding level simultaneously.
The ChIP-exo results showed a majority of NRF1-and NRF3-binding sites harbored canonical AREs. Interestingly, more than 50% of the binding sites for NRF2 did not contain a canonical ARE. This indicates NRF2 may be less specific than NRF1 or NRF3. Although regulating the antioxidant response is one of the primary functions of NRF2, and NRF2-based literature has often focused on this role, the 50% of NRF2-binding sites not harboring an ARE supports a role for NRF2 outside of xenobiotic elimination. Indeed, Gene Ontology for genes upregulated by NRF1, NRF2, and NRF3 centered on the stressresponse pathways, and 100% of those genes harbored an ARE. In contrast, the processes enriched from other gene groups associated with NRF2 did not contribute to antioxidation or stress-response processes, participating instead in areas like protein quality control and trafficking. Future investigations on NRF2 and protein quality control could uncover more specific roles of NRF2 in the process of ER protein folding and subsequent targeting to the membrane. In the past decade, new roles are increasingly being uncovered for NRF2 in areas such as metabolism, proliferation and differentiation, iron homeostasis, and inflammation (49 -56, 63-66). Perhaps some of these newer roles require NRF2 binding to less restrictive ARE sequences and may be an avenue of future investigations.
After combining the ChIP-exo and RNA-Seq data, we identified 262 high-confidence genes that fell into varying regulatory categories: some genes were regulated similarly by all NRF family members, whereas others were differentially regulated. We hypothesize that genes regulated similarly by all NRF members may be compensatory, i.e. deficiency in one gene could be compensated by up-regulation of another. For example, the following genes were up-regulated by all three NRF members and are lethal upon knockout in mice: VCP, GCLC, TRIO, TBK1, and TXNRD1 (67)(68)(69)(70)(71)(72). Perhaps the loss of some of these transcription targets is partly to blame for the lethality seen with Nrf1 knockout animals as Nrf2 and Nrf3 may not be expressed at the right space and time to compensate during development. However, because Nrf1, Nrf2, and Nrf3 share several transcription targets, could activation of Nrf2 or Nrf3 in an Nrf1 knockout mouse rescue the embryonic lethal phenotype? It is important to note that under different cellular contexts, the jurisdiction of different NRF members may be swapped according to upstream signaling, cross-talk with other pathways, and

Transcription targets of NRF1, NRF2, and NRF3
proficiency/abundance of negative regulators. For example, some genes show cross-talk with aryl hydrocarbon receptor transcriptional programs (73). The relative location of each NRF member on the genome may dictate their roles during development as is the case with KRAS and HRAS genes, whereby knocking-in HRAS into the KRAS locus circumvents the embryonic lethality seen with KRAS loss (74). Assessing whether embryonic lethality is maintained in NRF1 Ϫ/Ϫ animals by substituting NRF2 or NRF3 at the NRF1 locus would allow us to identify functional redundancy of NRF members and the importance of their genomic locations in their function. A putative understanding of the epigenetic and signaling pathways determining NRF member activation during development would help clarify their roles. Importantly, consensus sequence analyses also identified sequence preferences for NRF1 and NRF2. Thus, certain target genes may be preferentially regulated by NRF1 and NRF2, implicating that the two genes are not always interchangeable.
From a gene regulatory standpoint, an intriguing area to consider were those genes and GO processes that were regulated differentially by the NRF1, NRF2, and NRF3 members. Dystonin (DST) was the single gene that was up-regulated by NRF1 and NRF3 but down-regulated by NRF2. DST is a cytoskeletal linker whose loss-of-function is responsible for dystonia musculorum. Defects in DST are associated with malformations in the ER (75,76). Considering NRF1 and NRF3 are targeted to the ER whereas NRF2 is not, the up-regulation of DST by NRF1 and NRF3 may be necessary for ER organization and thus proper functioning of NRF1 and NRF3. Interestingly, NRF2 is activated following ER stress and contributes to cell survival (77,78). Why DST is not up-regulated by NRF2 when DST could help restore ER homeostasis remains enigmatic, but NRF2 did exclusively up-regulate the genes ARFGAP1, ERGIC1, TOR1B, and VAPA, which are involved in "endoplasmic reticulum organization" and "retrograde vesicle-mediated transport, Golgi to ER." More inquiry into the divergent roles of NRF members in ER biology would be valuable in unraveling the differing transcriptional programs.
BRD2, TXN, and TNFRSF1A were genes up-regulated by both NRF1 and NRF2 but down-regulated by NRF3. Bromodomain-containing 2 (BRD2) is a transcriptional regulator known to inhibit NRF member activity (79). Thus, for NRF1 and NRF2 it may serve as a negative feedback mechanism, whereas for NRF3 it is a feed-forward pathway. Thioredoxin (TXN) is a bona fide NRF2 target gene; the contrarian role of NRF3 in regulating this gene and the others in this group is intriguing, but not unheard of, as NRF3 has been shown to negatively regulate levels of other bona fide antioxidant genes before, such as NQO1 and PRDX6 by "sitting" on the ARE sequence without activating transcription (80 -82). Additionally, all three of these genes are involved in the viral infection process, which is supported from the GO analysis for "viral process." The NRF transcription factors differ in their tissue localization; Nrf1 and Nrf2 are expressed across almost all tissues, whereas the Nrf3 expression is predominately placental (2)(3)(4). NRF2-specific processes (see Fig. 9) included several metabolic processes, such as fatty acid catabolism. It has been well-described that maternal-to-fetal nutrient exchange is a critical function of the placenta, and this includes fatty acid transport and catabolism (83).
Perhaps the more intriguing proposition is the developmental activation and inactivation of NRF members over time. Nrf1 transcripts have been shown to be expressed in all tissues and at all stages of development investigated (84). In contrast, NRF2 is expressed solely in fetal liver and muscle and was absent in all other embryonic tissues investigated (3). Nrf3 transcript expression has been investigated in avian species, where it was first noticed in heart myocardium (85). Nrf3 transcripts then disappear from the heart and are next observed in the myotomal compartment of maturing somites. Central nervous system expression appears gradually and persists at low levels in ventricular neuroepithelial cells until at least embryonic day 6. Strong expression is observed in the early epiphysis, in the collecting ducts of the developing kidney, and in individual cells of the yolk sac (85). Studies in zebrafish have indicated that the differing expressions of Nrf transcripts in various tissue and developmental stages are due to the varying needs of oxidative stress regulation during development (86). Particularly, the absence of NRF2 during development may be to facilitate oxidative stress, which is required for some developmental processes (87). Nrf3 presents a second interesting case study; it is expressed early in avian heart development, and it is also critical in smooth muscle differentiation (81,85). As we saw here, biological processes related to vascularization and muscle development such as "M Band" and "endothelial cell proliferation" also suggest a role for NRF3 in circulatory system development.
However, until a detailed description of NRF1, NRF2, and NRF3 expressions at all different stages of development in the same model system is completed, we can only guess whether the different genes specific to each NRF member are relevant for a developmental function. Much of the available data on the developmental patterns of NRF member expression has utilized transcript levels instead of protein levels, and thus the preponderance of post-translational regulations of NRF members (i.e. KEAP1-mediated degradation of NRF2 or ER compartmentalization of NRF1) lessens the data's utility.
To help understand some of these differences between NRF members, we investigated whether any genomic sequences flanking the AREs were consistent among particular NRF groups. While looking for motifs flanking the AREs, we identified that NRF2 prefers GC-rich flanking bases, whereas NRF1 prefers AT-rich flanking bases. Although these moderate preferences may help determine differences between NRF1 and NRF2, there were no other trends among the NRF members or their overlaps.
Perhaps NRF members differentially regulate genes to some degree based on their relative promiscuities for the ARE. As has been shown, mutations to several bases of the canonical 5Ј-TGABNNNGC-3Ј do not always abolish NRF2 binding, and moreover, they can actually enhance binding. Particularly, mutations to the distal GC have a wide array of effects, but none abolish NRF2 binding (89). If NRF2 binding and lack of NRF1/ NRF3 binding to noncanonical AREs could be validated on a host of NRF2-specific genes, it may explain some of the differential regulatory capacities between the three. For example, the NRF2-specific up-regulated genes SYT14, SIAH1, and UBE2E2

Transcription targets of NRF1, NRF2, and NRF3
harbor 5Ј-TGACCACAT-3Ј, 5Ј-TGAGTGCGG-3Ј, and 5Ј-T-GAGTGCGG-3Ј sequences near the ChIP-exo peaks that closely mimic the canonical ARE and could still retain binding activity. An exhaustive comparison of binding capabilities for each NRF member to ARE derivatives may inform the differential binding capacities more completely.
Additionally, it is unclear how differing binding affinities may play a role. As has been shown previously (4), NRF2 binds more loosely to canonical AREs compared with NRF1 and NRF3. It is possible that NRF2 binding to some ARE sequences was blocked due to the presence of different ARE-binding proteins, such as dimers of Maf proteins and either p45, NRF1, NRF3, or BACH1, even with NRF2 overexpression. Similarly, the different affinities of NRF members to the sMAF proteins are not known. Thus, the context of other ARE-binding proteins and their binding sites may determine NRF2 binding as well, given it has less affinity in some cases.
In addition to a more refined understanding of binding affinities and promiscuities for each NRF member, a complete understanding of upstream events leading to transcriptional activation by NRFs could also uncover mechanisms governing their similarities and differences. For example, SETD6 was shown to inhibit NRF2 after ARE binding (90). As mentioned above, RPA1 is an additional inhibitor of NRF2 activity (62). In the context of the project presented here, the following questions arise. Can inhibitory proteins like SETD6 and RPA1 impede NRF1 and NRF3 activity, or just NRF2 activity? Does SETD6 bind to DNA near all AREs or just select ones? Understanding the interactions between NRF members and other transcription modulators is critical for interpreting differences between NRF members.
Here, we provide a compendium of the similarities and differences between the NRF1, NRF2, and NRF3 transcriptional programs. All three NRF members identify and bind to the ARE, indicating upstream signaling and cellular context, as well as other factors such as competitive binding of other transcription factors and ARE promiscuity, could play a major role in dictating the observed phenotypic differences among them. As such, determining the pathological and physiological relevance of the overlap and the disparities among the NRF family member transcriptional programs are key steps in targeting these pathways in human disease.

Cell culture and reagents
All cell lines were obtained from American Tissue Culture Collection (ATCC), and grown in Dulbecco's modified Eagle's medium with high glucose (4.5 g/liter) and no pyruvate and supplemented with heat-inactivated 10% fetal bovine serum. Cells were cultured in atmospheric air enriched with 5% CO 2 in an incubator maintained at 37°C. Zeocin, blasticidin, hygromycin, and doxycycline were obtained from Thermo Fisher Scientific and used at concentrations of 400, 10, 300, and 0.5 g/ml, respectively. Tunicamycin was purchased from Sigma, and sulforaphane was from LKT Labs (S8044).

Generation of U2OS TetR -lacZ cells
All plasmids were obtained from Thermo Fisher Scientific (Flp-In T-REx TM core kit, K6500-01). pcDNA5/FRT/TO/ 3ϫFLAG vector was generated by cloning a triple FLAG sequence into the pcDNA5/FRT/TO vector (Thermo Fisher Scientific) after multiple cloning sequences. NRF member open reading frames were reverse-transcribed and amplified by PCR from U2OS total RNA and modified using site-directed mutagenesis as outlined in Fig. 1B. U2OS TetR -lacZ cells were generated according to the manufacturer's protocol. Briefly, U2OS cells were transfected with pFRTlacZ-Zeocin; cells with stable incorporation of this plasmid were selected with Zeocin and termed U2OS-lacZ. U2OS-lacZ cells were transfected with pCDNA/TetR; cells with stable incorporation of this plasmid were selected for with blasticidin and termed U2OS TetR -lacZ. While U2OS TetR -lacZ cells were being generated, the modified ORFs of NRF members NRF1, NRF2, and NRF3 (collectively NRFX) were each cloned into pcDNA5/FRT/TO/3ϫFLAG vector (see Fig. 1B for modifications to NRF ORFs). The pcDNA5/FRT/TO/NRFX/3ϫFLAG vectors were co-transfected with pOG44 into U2OS TetR -lacZ to generate U2OS TetR -NRFX. U2OS TetR -NRFX cells with proper incorporation of pcDNA5/FRT/TO/NRF/3ϫFLAG at the Flp recombinase site were selected with hygromycin. Thus, U2OS TetR -NRFX cells expressed the hygromycin resistance gene, and NRFX was tagged with triple FLAG under tet-operon control. See Fig. S1 for a schematic.

Cell line validations
U2OS TetR -NRF1, U2OS TetR -NRF2, and U2OS TetR -NRF3 cells were treated with vehicle or doxycycline for 16 h and harvested for immunoblotting and immunofluorescence.

Immunofluorescence
For immunofluorescence assays, cells were grown on glass coverslips in 35-mm dishes. Cells were fixed in ice-cold methanol for 20 min and then blocked in 5% BSA for 1 h. FLAG-NRFX was detected using an anti-FLAG antibody (Sigma, catalog no. 3165) and an Alexa Fluor 488-conjugated anti-mouse secondary antibody. Images were taken with a Zeiss Observer.Z1 microscope. Images were acquired with Slidebook 4.2.0.11 software.

Luciferase assay-synthetic sequence
The dominating base pair sequences containing the NRF1and NRF2-flanking regions on either side of a core ARE sequence (5Ј-GTT AGT TGC ATA GAG TGT GTT AAT TTC TAG CCA ATC GAC TAG GAA AGA CA TGACTCAGC AAT TAC GTA ACA GTA TGA AAG GGA AAC ATA AAA AAA GTC TCA ATT GAA TA-3Ј and 5Ј-GCC GGG GGC GGG GGG GGG GCC GCG GGG GGG CGG GCG GGG CGG GCG GGG GC TGAGGCGGC CGG GCG GCG TGC GGC CGC GCG GCG GGC GGG GCC GCG GGC GCC GCT GCG GC-3Ј, respectively) were cloned into pGL 4.37 vector (Promega) in place of the commercially-available endogenous ARE sequence, to make NRF1-AREflank and NRF2-AREflank. U2OS TetR -NRF1 and U2OS TetR -NRF2 cells were co-transfected with a Renilla luciferase control vector (Promega E2261) and the respective NRF1-flank or NRF2-flank luciferase constructs. Cells were treated with doxycycline as described above. After 24 h, a dual-luciferase assay (Promega E1910) was conducted according to the manufacturer's instructions. Readouts from U2OS TetR -NRF1 and U2OS TetR -NRF2 were compared using Student's t test.

Luciferase assay-ADGRG7 and PHF20L1
Promoter regions were amplified by PCR from U2OS genomic DNA. Target regions included chr8:132775076 -132777861 (PHF20L1) and chr3:100630731-100637341 (ADGRG7). The amplicons were cloned 5Ј to the firefly luciferase ORF within the pGL4.21 vector (Promega) using Gibson cloning. Using these vectors and Renilla luciferase control vector, cells were subjected to dual-luciferase assay as with the synthetic vector above. All experiments were conducted in triplicate. Readouts were compared using ANOVA with post hoc Tukey's highly-significant difference test.

RNA sequencing
U2OS TetR -NRFX cells expressing NRF1, NRF2, and NRF3 were treated with doxycycline. The NRF1 and NRF3 open reading frames had their ER targeting sequences removed in U2OS TetR -NRF1 and U2OS TetR -NRF3, respectively. U2OS TetR -NRF2 cells were treated with 10 M sulforaphane (LKT Laboratories) for 10 h. An empty vector (U2OS TetR -FLAG) was used as a control. Total RNA from duplicate samples of the above cells was isolated using TRIzol and a column-based RNeasy kit (Qiagen). rRNA was depleted using Ribo-Zero Gold rRNA removal kit (Illumina) and prepared for sequencing using the Ultra Directional RNA library prep kit from Illumina (New England Biolabs). Samples were run on a HiSeq3000 Illumina Sequencing Platform as 50-bp single-end read runs. FASTQ files from RNA-Seq were mapped to the reference genome (UCSC hg38) using TopHat2, a splicing aligner for RNA-Seq data. Resultant binary equivalent files were sorted and indexed for quicker access using SAMtools. The number of reads per gene was annotated using a custom in-house generated script. Differentially expressed genes between control and NRF-activated samples were identified using the DESeq2 package in the R statistical environment on the read count data, which utilizes the negative binomial test to assess statistical significance. Differentially-expressed genes were defined as all genes with multiple testing corrected and adjusted p value of Ͻ0.05.

ChIP-exo sequencing
U2OS TetR -NRFX cells expressing NRF1, NRF2, and NRF3 were treated with doxycycline. NRF1 ORFs and NRF3 ORFs had their ER-targeting sequences removed. The NRF2 ORF had its two degron motifs, DLG and ETGE, mutated to AAA and AAAA, respectively. ChIP assay was performed according to the manufacturer's instructions (EZ-CHIP TM , Merck, Germany). In brief, U2OS TetR -NRFX cells were seeded in a 150-mm dish. Upon ϳ95% confluence, DNA-protein complexes were cross-linked by adding formaldehyde to a final concentration of 1%. The cells were washed with cold PBS and suspended in 1 ml of SDS lysis buffer containing 1 mM phenylmethylsulfonyl fluoride and 1% protease inhibitor mixture. Next, the samples were sonicated (nine times for 20 s and performed on ice) and centrifuged at 15,000 rpm for 30 min at 4°C. The solubilized chromatin was diluted with ChIP dilution buffer (1:9 ratio) for ChIP assays. 0.1 ml of diluted solubilized chromatin was saved for total chromatin input. The chromatin was pre-cleared using protein G-agarose beads for 1 h at 4°C and then incubated with 4 g of anti-FLAG antibody or mouse normal IgG overnight at 4°C with rotation. The cross-linked immunoprecipitates and total chromatin input were reverse cross-linked, and the DNA was isolated via phenol chloroform extraction ethanol precipitation. The library was created using adapter ligation methods, and each library was sequenced along with an input control library on a HiSeq3000 Illumina Sequencing Platform as 50-bp single-end read runs.
Sequencing FASTQ files for each NRF treatment group were mapped to UCSC hg38 using Bowtie2, a genomic aligner for DNA sequences. The resulting binary alignment sequence files were sorted, indexed, and purged of duplicate reads using SAMtools. ChIP peaks were identified using two algorithms: the Genome-wide Event finding and Motif discovery (GEM) algorithm and the Model-based Analysis for ChIP-Seq (MACS) algorithm. GEM was chosen for its utility in resolving ChIP-exo Transcription targets of NRF1, NRF2, and NRF3 data, which usually has narrow, sharp peaks; the analysis was complemented with MACS, which can better resolve wider peaks in data with high background. Overlapping binding sites were calculated using Bedtools (92). Peaks were annotated by their location to nearby genes using a custom script in the R programming environment that identified all peaks that fell within genes' exons, introns, or within 5 kb of the transcription start and end sites. We defined peaks that fell within the first exon, first intron, or within 5 kb of the transcription start site as LREs, and genes harboring LREs were used for further analyses. ChIP peaks and genomic regions were visualized using the ChIPSeeker package, Gviz package, or in-house scripts in the R statistical environment (88,(93)(94).

Combining ChIP-exo and RNA sequencing
Only genes harboring both LREs and differential RNA expression were taken forward for further analysis as high-confidence genes likely to be regulated by one or more NRF family members. If genes were inconsistent between any NRF members then that gene was excluded from analysis. Inconsistent genes include genes that had LREs but did not have differential RNA expression or genes that had differential RNA expression but did not harbor an LRE. Genes were then categorized by their regulatory network (i.e. NRF1 and NRF2 up-regulate Gene A).

Gene ontology annotation
High-confidence genes within each gene grouping or NRF categorization were subjected to a Gene Ontology (GO) analysis. GO terms were tested for statistically-significant enrichment by hypergeometric test (threshold p value Ͻ0.05 was considered significant).

Consensus sequence flanking analysis
Across the whole genome, the previously-identified LREs from high-confidence genes were scoured for sequences conforming to the core ARE sequence, 5Ј-TGABNNNGC-3Ј. All core sequences plus 50 flanking base pairs were extracted. For the different sets of genes presented, position weight matrices harboring the relative proportions of each base pair (A, T, G, and C) at each position were evaluated. Relative proportions of GC and AT base pairs were compared between gene sets and between flanking positions using the 2 test. p values were corrected for multiple testing using the Benjamini-Hotchberg methodology. Position weight matrices were visualized using the ggseqlogo package in the R environment.

ChIP peak sequence conservation
To evaluate conservation of ChIP-peaks' sequences across different members of the subphylum vertebrata, we retrieved phastCons scores for genomic regions covered by the highconfidence ChIP-peaks directly from the UCSC genome browser's Cons 7 Verts track. This track compares sequences from seven different species within the subphylum vertebrata by multiple alignment. The seven species being compared are human, chimp, rhesus, mouse, rat, dog, and opossum.

Data and materials availability
Datasets from the sequencing studies provided here are available at the NCBI short read archive (BioProject 541591).