Structural Studies of IRF4 Reveal a Flexible Autoinhibitory Region and a Compact Linker Domain

Background: IRF4 is a master transcription regulator critical in immune cell development and thermogenic gene expression.
Results: We report the crystal structure of IRF4-IAD and SAXS studies of full-length IRF4.
Conclusion: IRF4 has a flexible autoinhibitory region and a compact semistructured linker.
Significance: These studies identified new structural features that provide insights into the function and regulation of IRF4.

IRF4 is a unique member of the interferon regulatory factor (IRF) family, playing critical regulatory roles in immune cell development, regulation of obesity-induced inflammation, and control of thermogenic gene expression. The ability of IRF4 to control diverse transcriptional programs arises from its capacity to interact with numerous transcriptional partners. In this study, we present the structural characterization of full-length IRF4. Using a combination of x-ray crystallography and small angle x-ray scattering studies, we reveal unique features of the interferon activation domain, including a set of β-sheets and loops that serve as the binding site for PU.1, and also show that, unlike other IRF members, IRF4 has a flexible autoinhibitory region. In addition, we have determined the small angle x-ray scattering solution structure of full-length IRF4, which, together with circular dichroism studies, suggests that the linker region is not extended but folds into a domain structure.

Interferon regulatory factors (IRFs) are versatile transcription regulators that mediate homeostatic mechanisms of host defense against pathogens (1-4). They also participate in cell growth regulation, differentiation of hematopoietic cells, and apoptosis (5). Promoters of genes regulated by IRFs have copies of the consensus IRF recognition sequence 5′-AANNGAAA-3′ found in cis-acting elements called interferon-stimulated response elements (ISREs) (6-9). Members of the IRF family (IRF1 through IRF9 in mice and humans) have two conserved functional domains: an N-terminal helix-turn-helix DNA-binding domain (DBD) with a signature of five conserved tryptophan residues and a C-terminal interferon activation domain (IAD) critical in mediating protein-protein interactions (1, 9-12). IRF3 is the best studied member of the family and has provided the overall mechanism of IRF activation mediated by viral infection (13, 14). In this model, IRF3, which is ubiquitously and constitutively expressed, is localized in the cytoplasm as an inactive monomer. Dimerization and nuclear translocation are induced upon viral infection via IKKε/TBK1-mediated phosphorylation of specific serine-threonine clusters present in the C-terminal autoinhibitory region (AR) of the IAD (13-15). Phosphorylation of corresponding residues in IRF7 and IRF5 has also been shown to be important for their activation in the regulation of IFN α/β genes (16, 17). Thus, virus-induced phosphorylation/dimerization appears to be the mechanism by which a subset of IRF proteins (IRF3, IRF7, and IRF5) regulate gene expression. However, other IRF members regulate genes through different activation mechanisms.
IRF4 (also called Pip (PU.1-interacting protein), ICSAT (interferon consensus sequence-binding protein for activated T-cells), and LSIRF (lymphocyte-specific IRF)) is the only IRF that is not regulated by interferons (IFNs) and differs from other family members in multiple ways (18-20). Originally, IRF4 expression was thought to be restricted to cells of the immune system, but recently, it has been detected in the heart, kidney, liver, and brain (21-23). In contrast to other IRF proteins, IRF4 binds DNA with low affinity and requires interaction with different binding partners to bind DNA. These include other IRF family members, the leucine zipper heterodimer BATF-JunB, STAT6, PU.1, and PGC-1α, among others (24-29). The low DNA binding affinity of IRF4 has been attributed to the presence of an AR residing in the last 30 residues of the IAD. It has been suggested that this region physically interacts with the DBD and maintains the protein in an autoinhibited state (30-32). Upon interaction with a binding partner, the inhibitory mechanism is relieved, allowing IRF4 to bind its recognition DNA sequence (32). However, how this inhibition is relieved when IRF4 binds to ISRE sites as a homodimer is not known. Thus, there are multiple ways by which IRF4 can be recruited to DNA. For instance, during activation of genes containing Ets-IRF composite elements (EICEs), IRF4 interacts with PU.1, phosphorylated at serine 148, through residues located in the IAD, particularly Lys-399 and Arg-398 (32). On the other hand, binding of IRF4 to genes regulated by AP1-IRF consensus elements requires cooperative interaction with the BATF-JunB heterodimer through residues Glu-77, Lys-63, and His-55 of BATF (26). Moreover, at high local concentrations, IRF4 regulates genes containing ISRE sites, presumably by IRF4 dimerization (33).
The diversity of mechanisms by which IRF4 can be recruited to DNA sites suggests that it is capable of providing binding sites to accommodate a diverse set of interacting partners and that the autoinhibitory mechanism can be released in multiple ways. To delineate the structural and molecular details unique to IRF4, we determined the x-ray structure of an IAD construct lacking the last 30 residues (IAD ΔC). The structure reveals several features that are exclusive to IRF4, such as an open binding pocket that serves as the PU.1 binding site. Small angle x-ray scattering (SAXS) studies of the complete IAD show a flexible AR, which is not folded into the IAD. Furthermore, we show that the full-length protein is an elongated molecule with the putative linker region most likely folded into a domain. Taken together, our crystallography and solution scattering data reveal key differences between the IADs of IRF4 and other members of the IRF family and provide a low resolution structure of a full-length IRF protein.

Experimental Procedures
Protein Purification-Murine IRF4 IAD ΔC construct 238-420 was subcloned into the pET-15b TEV vector (pet15TEV_NESG (EvNO00338203) from the DNASU plasmid repository; the vector has a tobacco etch virus (TEV) protease-cleavable site following the His6 tag). Positive clones were sequenced and subsequently transformed into BL21-pLysS* Escherichia coli expression cells, which were used to start an overnight preculture with 100 μg/ml ampicillin and 25 μg/ml chloramphenicol at 37°C. The next day, cells were grown at 37°C, induced with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside at an optical density of ~0.8, and harvested after 4 h. Protein was eluted from an Ni-NTA column with 25 mM Tris-base, pH 8.0, 500 mM NaCl, 2 mM TCEP, 300 mM imidazole, 10% glycerol. Next, the His tag was removed using the TEV protease (34). The cleaved protein was further purified on an Ni-NTA column to remove the His tag as well as the TEV protease, followed by gel filtration on a Superdex 75 column in 25 mM Tris-base, pH 8.0, 300 mM NaCl, 2 mM TCEP, and 5% glycerol. Purified protein was concentrated to 16 mg/ml (0.753 mM) and stored at −80°C in 60-μl aliquots.
Murine IRF4 construct 238-450 (IAD) was subcloned into pET-15b, and positive clones were grown at 37°C until the optical density reached ~0.8, induced with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside, and harvested after 4 h. Following Ni-NTA-based affinity purification, the His affinity tag was removed using thrombin (1 unit of thrombin/mg of protein). Thrombin was precipitated with p-aminobenzamidine-agarose beads, and finally the protein was purified on a Superdex 75 column in 25 mM Tris-base, pH 8.0, 500 mM NaCl, 5 mM β-mercaptoethanol, 10% glycerol, after which the protein was concentrated to 1 mg/ml (0.04 mM) and stored in 100-μl aliquots.
The full-length IRF4 construct 1-450 (IRF4 FL) was subcloned into pET-15b TEV using the appropriate primers. Positive clones of IRF4 WT were transformed into BL21-pLysS* E. coli expression cells, and an overnight preculture was grown using the appropriate antibiotics. IRF4 FL was grown at 37°C until the optical density reached ~0.6, induced with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside, and harvested after 6 h. Protein was eluted from an Ni-NTA column with 300 mM imidazole, and the His tag was removed using TEV protease. The cleaved protein was further purified on an Ni-NTA column to remove the His tag as well as the TEV protease. This was followed by additional purification on a phenyl-Sepharose column equilibrated in 25 mM Tris-base, pH 8.0, 2 M NaCl, 1 mM TCEP, 1 mM EDTA and eluted with low salt. The final purification step was gel filtration on a Superdex 75 column in 25 mM Tris-base, pH 8.0, 300 mM NaCl, 2 mM TCEP, and 5% glycerol. The protein was concentrated to ~10 mg/ml (0.196 mM) and stored at −80°C in 100-μl aliquots. IRF4 ΔNC was cloned using the appropriate primers and expressed and purified using a protocol similar to that described here for IRF4 FL.
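The paired mass and molar concentrations quoted for the three stocks imply a molecular mass for each construct, because mg/ml divided by mM gives kDa. The minimal Python sketch below makes that arithmetic explicit. The function name and the dictionary labels are illustrative choices, not part of the original protocol; the numbers are the ones stated in the text.

```python
def implied_mass_kda(mg_per_ml: float, millimolar: float) -> float:
    """Molecular mass (kDa) implied by a mass and a molar concentration.

    mg/ml is numerically g/l, and g/l divided by mmol/l gives g/mmol, i.e. kDa.
    """
    if millimolar <= 0:
        raise ValueError("molar concentration must be positive")
    return mg_per_ml / millimolar


# Concentration pairs as quoted in the text: label -> (mg/ml, mM).
# The labels are hypothetical shorthand for the three constructs.
STOCKS = {
    "IAD deltaC (238-420)": (16.0, 0.753),
    "IAD (238-450)": (1.0, 0.04),
    "IRF4 FL (1-450)": (10.0, 0.196),
}

for name, (mg_ml, mm) in STOCKS.items():
    print(f"{name}: ~{implied_mass_kda(mg_ml, mm):.1f} kDa")
```

Because the quoted molar values are rounded, the implied masses are approximate and serve only as a consistency check on the stated concentrations.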
Crystallization-Crystals of IRF4 IAD ΔC (amino acids 238-420) were obtained in 2.5 M NaCl, 0.1 M imidazole, pH 8.0, at 20°C. Optimization led to separate, single crystals growing in 1.5-1.7 M KCl, 0.1 M imidazole, pH 8.0, at 4°C. The crystals were flash-frozen in liquid nitrogen after soaking in a solution of 75% mother liquor and 25% glycerol for 15-20 s.
Analytical Ultracentrifugation Studies-Analytical ultracentrifugation experiments were carried out in an XL-I analytical centrifuge (Beckman Coulter) using 12-mm Epon double sector cells with sapphire windows loaded into an An-60 Ti 4-hole rotor. All protein constructs were buffer-exchanged into 25 mM Tris·HCl, 300 mM NaCl, 1 mM EDTA, 3 mM TCEP (pH 7.9). Sedimentation velocity experiments were performed at different loading concentrations (between 0.5 and 2 mg/ml) at 20°C. Density and viscosity of solutions and partial specific volumes were calculated using the program SEDNTERP (35, 36). Samples were centrifuged at 35,000 rpm, and data were collected with both absorbance and interference detectors. The data were fit using the continuous distribution c(S) model in SEDFIT (37).
SAXS-SAXS data were collected at the undulator-based beamline X9 at the National Synchrotron Light Source, part of Brookhaven National Laboratory, using an MAR165 charge-coupled device area detector located at a distance of around 3.5 meters from the sample and an x-ray beam energy of ~2 keV with an exposure time of 60 s each. Merging, trimming, and scaling were performed using PRIMUS, a part of the ATSAS suite (38). Buffer subtraction was carried out using beamline-specific software. Radii of gyration (Rg) were evaluated using the Guinier approximation with sRg < 1.3. Distance distribution functions and maximum diameters (Dmax) were calculated using the program GNOM (39). SAXS molecular envelopes were generated using GASBOR and DAMMIN (40, 41).
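The Guinier evaluation mentioned above rests on the low-angle relation ln I(s) = ln I(0) − (Rg²/3)s², valid roughly while sRg < 1.3. As a hedged illustration only (the analysis in this work used PRIMUS, not this code), the following Python sketch fits that line and shrinks the fit window until the limit is satisfied. The function name, the iteration scheme, and the synthetic curve are assumptions made for the example, and s is assumed to be sorted in ascending order.

```python
import numpy as np


def guinier_fit(s, intensity, s_rg_max=1.3, max_iter=50):
    """Estimate Rg and I(0) from ln I(s) = ln I(0) - (Rg^2 / 3) s^2.

    The fit window is adjusted iteratively until s_max * Rg <= s_rg_max.
    Assumes s is sorted ascending and all intensities are positive.
    """
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n = len(s)
    for _ in range(max_iter):
        slope, intercept = np.polyfit(s[:n] ** 2, np.log(intensity[:n]), 1)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; data not usable")
        rg = float(np.sqrt(-3.0 * slope))
        # Count the points that satisfy the Guinier limit for this Rg.
        n_new = int(np.sum(s * rg <= s_rg_max))
        if n_new < 3:
            raise ValueError("too few points in the Guinier region")
        if n_new == n:
            break
        n = n_new
    return rg, float(np.exp(intercept))


# Synthetic, noise-free demonstration (values are made up, not from the paper).
s_demo = np.linspace(0.005, 0.2, 200)  # 1/Angstrom
i_demo = 100.0 * np.exp(-(20.0 ** 2) * s_demo ** 2 / 3.0)
rg_demo, i0_demo = guinier_fit(s_demo, i_demo)
print(f"Rg = {rg_demo:.2f} A, I(0) = {i0_demo:.1f}")
```

With real data, curvature in the Guinier region (for example from aggregation) would make the result depend on the window, which is why dedicated tools report fit quality alongside Rg.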
Circular Dichroism Experiments-CD experiments were carried out at the University of Richmond Biochemistry Department. Spectra of IRF4 protein constructs were obtained using a JASCO J-720 spectropolarimeter. CD measurements (190-260 nm) were collected in quartz cells of 0.1-cm path length at 20°C with a bandwidth of 0.1 nm. IRF4 ΔNC and IAD ΔC were buffer-exchanged into a CD buffer, 50 mM sodium phosphate (pH 8.0), whereas the DBD was in the same buffer but at pH 7.0. Protein concentrations were measured on a NanoDrop spectrophotometer using the calculated extinction coefficient for each of the proteins. The spectra were analyzed by BestSel (42).

Results
Structure Determination-IRF4 IAD ΔC (amino acids 238-420) was crystallized in 1.5-1.7 M KCl, 0.1 M imidazole, pH 8.0, at 4°C. Diffraction data were collected from multiple single thin plate crystals at beamlines X4A and X25 at the National Synchrotron Light Source. The crystals belong to the P2₁22₁ space group with unit cell dimensions a = 45.5 Å, b = 84.9 Å, c = 149.9 Å, and α = β = γ = 90°. The structure was solved by molecular replacement using the structure of IRF5 IAD (Protein Data Bank entry 3DSH) as a search model with the program Phaser in CCP4 (43). Model building was carried out using COOT (44). The structure was refined to an R-factor of 19.3% and a final Rfree of 25.5%. The data collection and refinement statistics are presented in Table 1. The crystals contain two polypeptide chains per asymmetric unit, namely subunit A (residues 239-420) and subunit B (residues 240-420), and 115 solvent molecules. Overall, we obtained an excellent Ramachandran plot with ~98% of residues in the most favored region, the remaining 2% of residues in the generally allowed region, and no residues in the disallowed region.
Overall Structure of IRF4 IAD ΔC-The IRF4 IAD ΔC structure contains the modified MH2 fold of the Smad family of proteins previously seen in IRF3 and IRF5 structures (11, 12, 45). The domain has a sickle-like shape with four α-helices (labeled α1-α4) surrounding a β-barrel (β1-β11) (Fig. 1A). Helices α1, α3, and α4 are at one end of the molecule, forming a helix bundle where the N- and C-terminal ends of the domain are also located. The remaining helix, α2, sits at the opposite end of the domain and loosely packs against one of the β-sheets (Fig. 1A). Five long loops connect different secondary structure elements protruding from the main core of the domain. The two molecules in the asymmetric unit form a dimer through interaction with α3 of the helix bundle, burying a surface area of ~1400 Å² (Fig. 1B). However, this interface appears to be an effect of crystallization because the protein is monomeric in solution (Fig. 6A). The two molecules are almost identical, superimposing with an r.m.s. deviation of 0.98 Å for 175 Cα. However, helix α1 is in a different conformation with an r.m.s. deviation of 3.7 Å (Fig. 1C). This difference suggests that α1 is loosely packed against helices α3-α4 and has a dynamic character. To determine structural differences with other IRF IAD structures, we aligned the IRF4 IAD to equivalent regions of IRF3 (amino acids 189-382) and IRF5 (amino acids 222-422), resulting in r.m.s. deviations of 1.35 and 1 Å, respectively (Fig. 2). The analysis reveals several features that are unique to IRF4, in particular the structure of loops L3 and L5. In IRF3, loop L3 hovers on top of the β-barrel, covering strands 7, 10, and 11 (Fig. 2A). In contrast, the loop in IRF4 projects out of the core of the protein in a conformation similar to IRF5 (Fig. 2B). In IRF3 and IRF5, loop L5 has a short α-helix that is absent in IRF4. Moreover, the overall direction of the loop is opposite to that in IRF4, with an r.m.s. deviation of 9.7 Å for IRF3 and 7.5 Å for IRF5 (Fig. 2). The conformation of both loops produces IADs with different accessible surface areas; in particular, the accessible surface area in IRF3 is ~1500 Å² smaller than in IRF4 (Fig. 2C). Structure and sequence alignment of the three IADs reveal features that lead to the different loop L3 conformations. In IRF3, β7 and β10 are longer, causing L3 to "tilt" in the direction of strand β10 (Fig. 3A). In contrast, β7 in IRF4 and IRF5 is shorter by three residues and is followed by Gly-Pro. This combination breaks the β-strand and causes a turn in the polypeptide chain away from the core of the protein (Fig. 3B). The other IRF member that may have a similar loop L3 conformation is IRF6, which is highly similar to IRF5 in sequence (Fig. 5B).
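The r.m.s. deviations quoted in this section are computed after optimally superimposing matched Cα positions. The text does not state which program produced them, so the Python sketch below is only a generic illustration of the idea using the Kabsch algorithm; the function name and the random test coordinates are assumptions for the example and do not correspond to IRF4 atoms.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Covariance matrix and its SVD give the optimal rotation.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p_c @ rot.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q_c) ** 2, axis=1))))


# Demonstration with made-up coordinates: a rigidly moved copy gives RMSD ~ 0.
rng = np.random.default_rng(0)
coords = rng.normal(size=(175, 3))
angle = np.deg2rad(40.0)
rot_z = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
moved = coords @ rot_z.T + np.array([5.0, -3.0, 2.0])
print(f"RMSD after superposition: {kabsch_rmsd(coords, moved):.2e} A")
```

In practice the hard part is choosing which residues are equivalent between two structures; the sketch assumes that pairing is already given.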
Structural Basis of the PU.1 Binding Site-IRF4 requires interaction with phosphorylated PU.1 to bind to EICE sites found in many promoters and enhancers (46). Previous studies, including alanine-scanning mutational analysis, identified several IRF4 residues that are critical for the interaction with phosphorylated PU.1, in particular Lys-399 and Arg-398 (Fig. 4, A and B) (32). Our structure shows that the mutations affecting complex formation with PU.1 and transcription from EICE sites are distributed along loop L5 and strand β11 and are part of a binding pocket for phosphorylated PU.1 (Fig. 4B). Analysis of the structure using CASTp (Computed Atlas of Surface Topography of Proteins) was carried out to identify binding pockets in the IAD. CASTp can identify and measure protein pockets using precise computational geometry methods that include α shape and Delaunay triangulation (47). The main IAD pocket includes, in addition to β11 and loop L5, residues from β7 and β8 (Fig. 4C). Residue Lys-399 points straight up into the solvent and is found in a highly positive patch region (Fig. 4D). This binding pocket appears to be specific to IRF4 because of the structural architecture of loops L3 and L5 described previously. In IRF3, loop L3 acts as a lid blocking any access to residues residing in β11 and loop L5 (Fig. 3A). In IRF5, there is a similarly sized pocket localized between β11 and β12; however, the presence of several bulky hydrophobic residues in β10, such as Trp-393 and Phe-386, restricts access to the loop L5 residues. In addition, the Lys-399 equivalent residue (Lys-401) is located underneath the pocket in IRF5 and is not accessible for interactions (data not shown). Thus, in IRF4, the open structure of the IAD generates a large surface that can be used for interaction with different transcriptional partners.
IRF4 IAD Helical Bundle-The x-ray structure of IRF3 IAD showed that the helical bundle located at one end of the IAD keeps the C-terminal AR in a conformation that maintains IRF3 in a monomeric state (11, 12). The AR structural elements pack against a mostly hydrophobic surface generated by α1, α3, and α4 (11, 12). In IRF4, the hydrophobicity of this region is significantly less than in IRF3 (Fig. 5). Sequence and structural alignment reveal that key hydrophobic residues found in IRF3 are missing in IRF4 (Fig. 5C). For example, residues corresponding to Val-391, Leu-393, and Ile-395 in helix α5 of IRF3 are not found in IRF4. Hence, it is likely that the conformation of the AR in IRF4 will be different from the one seen in IRF3. Moreover, IRF4 lacks the serine residues found in IRF3 and IRF5, whose phosphorylation releases the autoinhibitory conformation, inducing dimerization. In addition, the IRF4 structure shows that although the conformation of helices α3 and α4 does not change much among the three IRF proteins (r.m.s. deviation of 1.04 Å), their respective helices α1 are in different configurations (Fig. 5B). For instance, superposition of IRF3 and IRF4 gives an r.m.s. deviation of 3.22 Å for α1. Hence, α1 in IRF4 has a dynamic character, which can be seen in the two different conformations of the two molecules in the asymmetric unit (Fig. 1D). Interestingly, several studies have shown that the integrity of the helical bundle has a direct effect on the transcriptional ability of IRF proteins. For instance, mutation of Leu-368 to proline results in an IRF4 molecule that is unable to form a ternary complex with phosphorylated PU.1 (48). The corresponding residues in IRF8 and IRF9 were also shown to be important in IFN signaling (49). Leu-368 is part of a hydrophobic core in IRF4 that includes Leu-246 and Glu-243 in α1; Phe-364, Leu-365, and Phe-371 in α3; and Leu-409, Leu-413, and Tyr-414 in α4.

SAXS Studies Reveal That the IRF4 Autoinhibitory Region Is Flexible and Does Not Dock into the IAD Domain-To obtain structural information about the IRF4 autoinhibitory region (residues 420-450), we performed solution studies using sedimentation velocity and SAXS on the complete IAD and the IAD ΔC constructs. Sedimentation velocity studies show that both constructs sediment around 2 S and are monomeric (Fig. 6A and Table 2). SAXS-calculated parameters for IAD show larger Rg and Dmax values than IAD ΔC, suggesting that the AR is probably a flexible tail, unlike in IRF3 (Table 2). This is supported by the Kratky plots, which show a departure from a bell shape for IAD but not for IAD ΔC (Fig. 6B). We generated ab initio models using the programs DAMMIN and GASBOR with merged data sets from different concentrations. The generated SAXS molecular envelopes have overall shapes that resemble the IAD ΔC x-ray structure (Fig. 6D). However, the envelope for IAD is larger, with additional density at one end of the envelope that most likely corresponds to the autoinhibitory region. To obtain a more detailed description of the conformation of this region, we made a model of the complete IAD (residues 238-450) and used BILBOMD to generate minimum ensemble search models consistent with the SAXS data (50). The initial IAD model had the last 30 residues in an extended conformation, and several BILBOMD models showed that the AR does not fold into the core of the helical bundle as it does in IRF3 (Fig. 6E). The χ of the fit between the experimental and the model scattering curves was 0.21. Interestingly, if we generate an IRF4 model with the AR acquiring an IRF3-like conformation, the χ of the fit is the worst of all of the models (data not shown). Incorporation of 2-3 alternative conformations marginally improves the fit to a χ of 0.19. Taken together, our data show a flexible unstructured IRF4 autoinhibitory region that does not dock into the helical bundle, as seen in IRF3.
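The goodness-of-fit values quoted above compare an experimental scattering curve with one computed from a model. The exact statistic reported by BILBOMD is not spelled out in the text, so the Python sketch below uses one common definition as an assumption: the model curve is scaled by a least-squares factor, and the fit statistic is the square root of the error-weighted mean squared residual. The function name and the synthetic curves are illustrative only.

```python
import numpy as np


def chi_of_fit(i_exp, sigma, i_model):
    """Return (chi, scale) for a model curve fitted to experimental intensities.

    scale is the error-weighted least-squares factor applied to the model;
    chi is sqrt( sum(((I_exp - scale * I_model) / sigma)^2) / (N - 1) ).
    This is one common convention, assumed here for illustration.
    """
    i_exp, sigma, i_model = (
        np.asarray(a, dtype=float) for a in (i_exp, sigma, i_model)
    )
    if not (i_exp.shape == sigma.shape == i_model.shape):
        raise ValueError("all three arrays must have the same shape")
    if i_exp.size < 2:
        raise ValueError("need at least two data points")
    weights = 1.0 / sigma ** 2
    scale = np.sum(weights * i_exp * i_model) / np.sum(weights * i_model ** 2)
    residuals = (i_exp - scale * i_model) / sigma
    chi = np.sqrt(np.sum(residuals ** 2) / (i_exp.size - 1))
    return float(chi), float(scale)


# Made-up example: the "experimental" curve is exactly twice the model curve.
q_demo = np.linspace(0.01, 0.3, 100)
model_demo = np.exp(-(18.0 ** 2) * q_demo ** 2 / 3.0)
exp_demo = 2.0 * model_demo
sigma_demo = np.full_like(q_demo, 0.01)
chi_demo, scale_demo = chi_of_fit(exp_demo, sigma_demo, model_demo)
print(f"chi = {chi_demo:.3f}, scale = {scale_demo:.3f}")
```

Under this convention a value near 1 means the residuals are comparable to the stated errors, so values well below 1 usually indicate that the experimental uncertainties are overestimated rather than that the model is unusually good.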

SAXS Studies Show That IRF4 FL Has an Elongated Shape with Flexible N- and C-terminal Inhibitory Regions-It has been hypothesized that IRF4 is in an autoinhibited "closed" conformation in which the C-terminal AR of the IAD interacts with the DBD, preventing it from binding DNA (51). To obtain structural information about the arrangement of the IRF4 domains and the conformations of the N- and C-terminal autoinhibitory regions in the context of the full-length protein, we performed sedimentation velocity and SAXS-based structural characterization of IRF4 FL (residues 1-450). We obtained a sedimentation coefficient (s20,w) of 3.4 S for IRF4 FL, which is significantly smaller than expected for a globular protein with the same molecular weight (4.9 S), suggesting that IRF4 FL has an elongated shape. Removal of the N- and C-terminal autoinhibitory regions produces a protein construct, IRF4 ΔNC (residues 20-420), that sediments at 3.2 S, implying that the elongated shape is not due to the N- and C-terminal ends but due to the DBD-linker-IAD domain arrangement (Fig. 7A). This conclusion is supported by the SAXS parameters showing that Rg and Dmax have similar values in both constructs (Table 2).
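The comparison with a globular protein of the same mass can be illustrated with the Svedberg and Stokes relations for a compact sphere, s = M(1 − v̄ρ)/(N_A · 6πηR). The Python sketch below is a rough estimate under stated assumptions, not a reproduction of the authors' calculation: it assumes a molar mass of about 51,000 g/mol (as implied by the quoted 10 mg/ml ≈ 0.196 mM stock), a typical protein partial specific volume of 0.73 cm³/g, water at 20°C, and no hydration shell.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def sphere_s_value(mass_g_per_mol, vbar=0.73, rho=0.9982, eta=0.01002):
    """Sedimentation coefficient (in Svedberg units) of an unhydrated sphere.

    mass_g_per_mol: molar mass; vbar: partial specific volume (cm^3/g, assumed);
    rho: solvent density (g/cm^3); eta: solvent viscosity (poise, cgs units).
    """
    volume = mass_g_per_mol * vbar / AVOGADRO                  # cm^3 per molecule
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)   # cm
    friction = 6.0 * math.pi * eta * radius                    # Stokes friction, g/s
    s_seconds = mass_g_per_mol * (1.0 - vbar * rho) / (AVOGADRO * friction)
    return s_seconds / 1e-13                                   # 1 S = 1e-13 s


ASSUMED_MASS = 51_000.0  # g/mol, inferred from the quoted concentrations
s_sphere = sphere_s_value(ASSUMED_MASS)
ratio_fl = s_sphere / 3.4  # 3.4 S is the measured value for IRF4 FL in the text
print(f"compact-sphere estimate: {s_sphere:.2f} S, ratio to measured: {ratio_fl:.2f}")
```

Under these assumptions the sphere estimate comes out close to 5 S, in line with the 4.9 S quoted for a globular protein, and the ratio to the measured value is a crude frictional ratio; values well above 1 point to an elongated or hydrated, non-compact particle.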
Comparison of the Kratky plots shows that removal of the N- and C-terminal regions (IRF4 ΔNC) results in a flat region at high q, whereas the IRF4 FL data show an upward behavior that is characteristic of folded proteins with flexible tails (Fig. 7B) (52). This conclusion is supported by the Porod-Debye plot, with a loss of the plateau region in IRF4 FL but not in IRF4 ΔNC (Fig. 7C). Furthermore, the Kratky-Debye plot also supports a model of IRF4 FL containing flexible regions (Fig. 7, D and E) (53). SAXS data were next used to calculate ab initio envelopes using GASBOR, resulting in models with elongated shapes (Fig. 7F). The dimensions of the averaged envelopes are consistent with the experimentally determined Rg and Dmax values (Table 2).
NOVEMBER 13, 2015 • VOLUME 290 • NUMBER 46
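The three flexibility plots discussed above are simple transforms of the same scattering curve. As a hedged sketch (the axis conventions below follow common usage and are assumed rather than taken from this paper), a Kratky plot shows q²I(q) against q, a Kratky-Debye plot shows q²I(q) against q², and a Porod-Debye plot shows q⁴I(q) against q⁴. The function name and the synthetic power-law curves are illustrative.

```python
import numpy as np


def flexibility_transforms(q, intensity):
    """Return (x, y) arrays for Kratky, Kratky-Debye, and Porod-Debye plots."""
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    if q.shape != intensity.shape:
        raise ValueError("q and intensity must have the same shape")
    return {
        "kratky": (q, q ** 2 * intensity),
        "kratky_debye": (q ** 2, q ** 2 * intensity),
        "porod_debye": (q ** 4, q ** 4 * intensity),
    }


# Idealized high-q behavior (synthetic): compact particles decay as q^-4,
# fully flexible chains as q^-2.
q_hi = np.linspace(0.1, 0.3, 50)
compact = flexibility_transforms(q_hi, q_hi ** -4.0)
flexible = flexibility_transforms(q_hi, q_hi ** -2.0)

# A compact particle plateaus in the Porod-Debye plot ...
print("Porod-Debye spread (compact):", np.ptp(compact["porod_debye"][1]))
# ... whereas a flexible chain plateaus in the Kratky-Debye plot instead.
print("Kratky-Debye spread (flexible):", np.ptp(flexible["kratky_debye"][1]))
```

This matches the qualitative reading used in the text: a plateau that is lost in the Porod-Debye plot but appears in the Kratky-Debye plot is taken as a sign of flexibility.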

SAXS Rigid Body Modeling of Full-length IRF4 Shows That the Linker Region Folds into a Compact Domain-like Structure-Docking of the DBD and IAD into the IRF4 ΔNC SAXS envelope was carried out using the volume-fitting function in Chimera and SUPCOMB from the ATSAS package to obtain a rough estimate of the relative position of the two functional domains (38, 54). The docking leaves a small volume to be occupied by the linker region. However, because the linker region contains ~104 residues, it is most likely to fold into a compact domain structure. We should point out that the docking of the domains represents just one of the many configurations that can fit into the SAXS envelope, and we have used it to illustrate that the linker is not in an extended conformation. To determine whether the linker region has secondary structure elements, we performed CD experiments on DBD, IAD ΔC, and IRF4 ΔNC. Secondary structure content was calculated from the individual CD spectra using the program BestSel (42). The CD spectra calculated for each sample are shown in Fig. 8B. The calculated percentages of helix, strand, turn, and disorder are shown in Table 3. The data show an increase in the overall secondary structure character in IRF4 ΔNC compared with the calculated content from the individual domains. Moreover, secondary structure prediction of the linker region using the program Jpred predicts a short β-strand region between residues 214 and 221 (42, 55). Rigid body modeling was performed on IRF4 ΔNC with BUNCH, using IRF4 DBD residues 20-134 and IAD residues 238-420 (56, 57). The resulting model has a linker region that is highly condensed and is located between the DBD and IAD domains (Fig. 8C). The IRF4 ΔNC rigid body model fits well with the experimental data, with a χ value of 0.9 (Fig. 8B). A similar model was obtained using the program CORAL (56). Next, we used BUNCH to generate an IRF4 FL model, resulting in a structure with a comparable arrangement of DBD and IAD domains and a highly folded linker region (Fig. 8C). The IRF4 FL and IRF4 ΔNC models superimpose with an r.m.s. deviation of ~6 Å, and although there are differences in the rotational position of the domains, the distance between them is the same. The difference in the positions may reflect conformational changes occurring upon removal of the N- and C-terminal regions or may simply be a result of the low resolution of the data. Nevertheless, the IRF4 FL model shows that the AR is extended, thus supporting the results found for the IAD.
The conformation of the linker shown in the two models in Fig. 8 can be interpreted only as an indicator of the high compactness that it must have to fit the SAXS data and not as an accurate representation of its three-dimensional structure. Thus, IRF4 has an elongated structure with the DBD and IAD domains at either end of the molecule and separated by a compact linker domain.

Discussion
Our studies provide a structural view of a full-length IRF protein. The crystal structure of IRF4 IAD ΔC shows some of the structural features that make it unique among IRF family members. The structure displays an open binding pocket on one side of the β-barrel that exposes residues that are critical for the interaction with phosphorylated PU.1 and probably other transcriptional partners. The PU.1 binding site is located on one face of the β-barrel, where the conformation of loops L3 and L5 generates an open surface that is accessible for binding. In other members of the IRF family, the conformation of loop L3 acts as a lid that prevents access to the binding pocket. In addition, loop L5 has a small α-helix that partially blocks the binding pocket. Mutations in IRF4 that affect complex formation with PU.1 have been determined previously, and they validate our structure (32). Not surprisingly, they are all located in loop L5 and include critical residues Lys-399 and Arg-398 that interact directly with PU.1 phosphoserine 148 (32). Mutations in residues that weakly affect complex formation, such as Glu-389, Glu-390, Phe-391, and Pro-392, are all located in the N-terminal half of loop L5 and may affect its conformation, preventing access to the binding site. However, mutations of Leu-400 and Ile-401 also affect complex formation, and both are located in the buried face of L5 and β11, where they are part of a hydrophobic pocket interacting with helix α2 that stabilizes the overall structure (Fig. 4E). Our SAXS studies show that the AR in IRF4 is flexible and does not fold into the helical bundle as in IRF3. This difference in the AR reflects the diversity of the sequences found in this region among IRF family members (Fig. 5B). In particular, the IRF4 AR is significantly shorter than that of IRF3 or IRF5 and does not have many of the hydrophobic residues that participate in the folding and docking of the AR into the helical bundle (Fig. 5B). This parallels the properties of the different helical bundles. In IRF3 and IRF5, helices α1, α3, and α4 generate a large hydrophobic region, whereas in IRF4, this is not the case. Interestingly, Leu-368 in helix α3 seems to be critical in stabilizing the helix bundle in IRF4 by making hydrophobic interactions with Leu-413 in helix α4. A mutation of Leu-368 results in a protein incapable of forming the ternary complex with PU.1 and DNA (48). Thus, the stability of the helical bundle has further repercussions on the overall function of IRF4.

Our SAXS studies show the overall domain architecture of a full-length IRF protein. They show that the linker region most likely adopts a folded conformation, where it may interact with the DBD and IAD domains located at either end of the molecule. This finding is supported by our CD studies and a report suggesting that the linker in IRF3 is not unfolded but may also adopt a folded conformation (58). Thus, our SAXS IRF4 envelope may represent the general domain architecture for all IRF proteins and suggests that the linker domain may play a role in the regulation of IRF function. Indeed, a study by Mamane et al. (59) showed that FKBP52, a peptidyl-prolyl isomerase, inhibited the transactivation activity of IRF4 and its binding to EICE DNA sites. They mapped the site of interaction to the linker region and proposed a mechanism of posttranslational modification of IRF4 activity. In another study, Wang et al. (60) showed that phosphorylation of residues in the linker region of IRF3, which has 15% serine content, negatively regulates its transactivation activity. Moreover, it was shown that ubiquitination of IRF8 enhances its ability to regulate expression of IL-12p40, and the linker is the site of interaction for the E3 ligase Ro52 (61). Thus, it appears that the linker region can be used to regulate IRF activity through different mechanisms. Based on biochemical data and insights from the structures of IRF3 and IRF5, it has been proposed that upon phosphorylation, the AR undergoes a large conformational change promoting dimerization and binding to DNA. The fact that the ARs of IRF proteins are diverse in terms of their sequence homology and their length suggests the possibility of alternative mechanisms that could induce IRF dimerization. IRF4 in particular has evolved two mechanisms to bind DNA: 1) interaction with other transcription factors to bind composite sites and 2) formation of homodimers at high concentrations to bind ISREs.
Thus, the critical event in the activation of IRF4 in both cases is the formation of homo- or heterodimers that leads to an increase in DNA affinity. The fact that the C-terminal autoinhibitory region is flexible supports the notion that IRF4 homodimerization may not require a trigger, such as phosphorylation, but is prompted by an increase in protein concentration, as suggested by the fact that IRF4 binds to ISRE sites as a homodimer at high concentrations, and this property is critical for B-cell differentiation into plasma cells (33, 62-64). Further studies are required to determine the structural determinants of IRF4 oligomerization.

FIGURE 7. SAXS studies on full-length IRF4. A, sedimentation coefficient distribution profiles of IRF4 FL (black) and IRF4 ΔNC (red). B, Kratky plots for IRF4 FL (black) and IRF4 ΔNC (red). C, Porod-Debye plots of IRF4 FL (black) and IRF4 ΔNC (red). D, Kratky-Debye plot for IRF4 FL. E, Kratky-Debye plot for IRF4 ΔNC. The plot shows that whereas IRF4 FL plateaus, the plot for IRF4 ΔNC has a downward behavior. F, GASBOR-generated envelopes for IRF4 FL (gray) and IRF4 ΔNC (red).
Author Contributions-S. G. R. cloned, expressed, and purified the proteins; collected and determined the x-ray structure and the SAXS studies; performed initial analytical ultracentrifugation experiments; analyzed CD data; and wrote the original draft of the paper. V. S. performed analytical centrifugation and CD experiments and protein preparation for SAXS studies and analyzed CD data. C. R. E. conceived and analyzed data, wrote the manuscript, and secured funding.