Cryo-EM structure of Escherichia coli σ70 RNA polymerase and promoter DNA complex revealed a role of σ non-conserved region during the open complex formation

The first step of gene expression is transcription of the genetic information stored in DNA into RNA by the transcription machinery, which includes RNA polymerase (RNAP). In Escherichia coli, a primary σ70 factor forms the RNAP holoenzyme to express housekeeping genes. The σ70 contains a large insertion between the conserved regions 1.2 and 2.1, the σ non-conserved region (σNCR), but its function remains to be elucidated. In this study, we determined the cryo-EM structures of the E. coli RNAP σ70 holoenzyme and its complex with promoter DNA (open complex, RPo) at 4.2 and 5.75 Å resolutions, respectively, to reveal native conformations of RNAP and DNA. The RPo structure presented here revealed an interaction between the σNCR and promoter DNA just upstream of the −10 element, which was not observed in a previously determined E. coli RNAP transcription initiation complex (RPo plus short RNA) structure by X-ray crystallography because of restraints imposed by crystal packing. Disruption of the σNCR and DNA interaction by amino acid substitutions (R157A/R157E) influences the DNA opening around the transcription start site and therefore decreases the transcription activity of RNAP. We propose that the σNCR and DNA interaction is conserved in proteobacteria, and that RNAP in other bacteria replaces its role with a transcription factor.

During formation of the transcription-ready open complex (RPo) in Escherichia coli, the conserved regions 2.4 and 4.2 of the primary σ70 factor in the RNA polymerase (RNAP) holoenzyme recognize the −10 and −35 promoter elements, respectively, to form the closed complex (RPc). This step is followed by DNA strand separation, which is stabilized by region 2.3, completing the formation of the transcription bubble from the −11 to +2 positions (1). E. coli RNAP forms RPo via at least two intermediates starting from the RPc. Although a series of large conformational changes of RNAP and DNA during RPo formation have been characterized by kinetic and biophysical studies (2), the structures of the intermediates have not yet been determined because their transient and heterogeneous nature makes them difficult to capture by X-ray crystallography.
A series of technical advances, such as the development of direct electron detectors and image processing techniques, has moved cryo-electron microscopy (cryo-EM) to the forefront of structural biology for analyzing large macromolecular complexes in unprecedented detail (3). Several structures of E. coli RNAP in complex with nucleic acids (DNA/RNA and 6S RNA) (4,5) and transcription activators (NtrC and CAP) (6,7) have been determined by cryo-EM, demonstrating the feasibility of using this experimental approach to investigate the structure and function of E. coli RNAP. Cryo-EM has two advantages over X-ray crystallography: 1) the 3D classification of individual particles from a single cryo-EM grid makes it possible to reveal the heterogeneity of macromolecules and determine their structures individually and 2) cryo-EM grid preparation takes less than 10 s instead of the days or longer needed for macromolecular crystallization, increasing the chances of capturing elusive and unstable intermediates for cryo-EM structure determination.
The structural study of bacterial RNAP began in 1996 with the high-resolution X-ray crystal structure of the E. coli σ70, which provided insight into the DNA sequence recognition and double-strand DNA melting by the σ factor conserved regions 2.4 and 2.3, respectively (8). The structure also revealed the σ non-conserved region (σNCR) between the conserved regions 1.2 and 2.1. In this work, we determined the cryo-EM structures of the E. coli RNAP σ70 holoenzyme and the holoenzyme-promoter DNA complex from a single cryo-EM grid. Comparing these structures revealed that promoter DNA binding to RNAP not only triggers conformational changes of RNAP around the domains that bind the −35 and −10 elements, but also establishes a direct interaction between the σNCR and the DNA upstream of the −10 element. Our structural and biochemical data reveal a function of the σNCR during the formation of the transcription-ready RPo.

Sample preparation and cryo-EM structure determination
In this study, we aimed to determine the structure of the E. coli RNAP σ70 holoenzyme and promoter DNA complex with KdpE (KDP operon transcriptional regulatory protein) to obtain the structural basis of transcription activation by the OmpR/PhoB family of response regulators. We assembled a ternary complex (Fig. S1) containing E. coli RNAP, a constitutively active variant of KdpE (KdpE-E216A) (9), and an E. coli kdpFABC promoter DNA derivative containing both an RNAP-binding site (−35 and consensus −10 elements; synthetic bubble from −6 to +1) and tandem KdpE-binding sites (from −67 to −62 and from −56 to −51) (pFABC-1, Fig. S2) for the cryo-EM grid preparation. We used the kdpFABC promoter derivative to form homogeneous RPo for the cryo-EM structure determination.
Of the many grid types on which the sample was vitrified, lacey carbon grids coated with a single layer of graphene oxide followed by a layer of pyrene-nitrilotriacetic acid gave images with more widely oriented particles, which are better suited for single-particle 3D reconstruction. From 1037 usable movie stacks, 286,424 particles were picked and used for single-particle reconstruction. After 2D classification, the 54 classes with clear structural features, representing 257,539 particles, were selected for 3D classification. This resulted in two major 3D classes, one representing an E. coli RNAP σ70 holoenzyme-promoter DNA complex (RPo) and the other the E. coli RNAP σ70 holoenzyme by itself (Fig. S3), indicating that DNA dissociated from a fraction of the RPo during the cryo-EM grid preparation. We failed to obtain a density map representing a ternary complex (RPo with KdpE), suggesting that KdpE dissociated from the RPo. Determining the cryo-EM structures of the RPo and the apo-form RNAP from a single cryo-EM grid allows us to reveal conformational changes of RNAP upon promoter DNA binding under identical experimental conditions in solution.
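The particle counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers stated in the text; the function and variable names are illustrative and are not part of the processing pipeline used in this study.

```python
def retained_fraction(kept: int, total: int) -> float:
    """Fraction of items kept at a processing step."""
    if total <= 0:
        raise ValueError("total must be positive")
    return kept / total


# Counts quoted in the text.
usable_movies = 1037      # usable movie stacks
picked = 286_424          # particles picked
after_2d = 257_539        # particles in the 54 selected 2D classes

# Average number of particles picked per usable movie stack.
particles_per_movie = picked / usable_movies

# Share of picked particles carried forward to 3D classification.
kept_2d = retained_fraction(after_2d, picked)

print(f"{particles_per_movie:.0f} particles per movie on average")
print(f"{kept_2d:.1%} of picked particles kept after 2D classification")
```

On these numbers, roughly nine of every ten picked particles were carried into 3D classification.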

Cryo-EM structure of the E. coli RNAP open complex
One of the 3D classes represents the E. coli RPo determined at 5.75 Å resolution (Fig. 1 and Fig. S3). In addition to density for the RNAP holoenzyme, strong densities are observed for dsDNA (from −45 to −12 bases) and single-stranded non-template DNA (nt-DNA) in the transcription bubble (from −11 to −3 bases). However, density for the template DNA (t-DNA) in the transcription bubble (from −9 to +2 bases) is not traceable and density for the downstream DNA duplex is weak and scattered, indicating their mobility within the DNA-binding main channel of RNAP (Fig. 1C). The positions of the DNA elements (−35, −10, transcription bubble, and downstream DNA) are nearly identical to those in a previously determined X-ray crystal structure of the E. coli transcription initiation complex (TIC) (10) (PDB ID: 4YLN) (Fig. S4A). Although the overall resolution was 5.75 Å, local resolution calculations indicate that the central part of the structure, including the N-terminal domain of the α subunit (αNTD) as well as the β and β′ subunits around the active site of RNAP, was determined at close to 5 Å resolution, with peripheral areas of the structure such as σNCR (Asn-128 to Arg-373, 246 residues), β′ insertion 6 (β′i6, Ala-944 to Gly-1129, 186 residues), and the ω subunit at 7-10 Å resolution (Fig. 1D). Density for the C-terminal domains of the α subunit (αCTD) and the σ region 1.1 domain (σ1.1) could not be traced because of the flexible nature of these domains. The distance between the two pincers of RNAP (Thr-212 in the β′ clamp domain and His-165 in the β gate loop) is 28 Å, indicating the closed conformation of the RNAP clamp as observed in the E. coli TIC (27.7 Å). However, the position of σNCR in the RPo structure determined by cryo-EM in this study is different from the one in the TIC determined by X-ray crystallography (Fig. S4A). Compared with the structure of TIC, the loop comprising residues 151-158 of σNCR moved ~10 Å, facing the sugar-phosphate backbone of the template DNA in the cryo-EM structure. In particular, the σNCR contacts the t-DNA at the −16/−17 position via the Arg-157 residue, whereas this interaction was not observed in the TIC, possibly because of restraints imposed by crystal packing (Fig. S4B).

Cryo-EM structure of the E. coli RNAP σ70 holoenzyme
The second 3D class represents the apo-form E. coli RNAP σ70 holoenzyme with an overall resolution of 4.2 Å (Fig. 2A and Fig. S3). Local resolution calculations indicate that the central part of the structure is determined at a resolution of 4 Å, whereas the peripheral areas of the structure are determined at around 6-10 Å (Fig. 2B). Densities for the C-terminal domains of the α subunit and σ1.1 were not traceable. Densities for σ4, σNCR, and the β flap tip helix are sparse and weaker than their counterparts in the RPo, indicating their flexible nature before binding to promoter DNA. The distance between the two pincers around the main channel is 26 Å, indicating that the RNAP clamp adopts a closed conformation without DNA binding.
Density around the active site located at the center of the RNAP molecule showed not only main chains but also side chains. For example, the density map of the rifampin-binding pocket of the β subunit shows the main chain as well as the side chains of the His-526 and Ser-531 residues, which play key roles in the RNAP and rifampin interaction (11) (Fig. 2C). The cryo-EM map shown here is of the same quality as that of the X-ray crystal structure of E. coli RNAP determined at 3.6 Å resolution (PDB: 4YG2) (12), indicating that cryo-EM is able to provide structural information on the RNAP and inhibitor interaction.
In the cryo-EM structures of both apo-form RNAP and RPo, the gap between the β′i6 domain and the β′ rim helix is wide open. In comparison, the β′i6 shifts toward the β′ rim helix by about 31 Å in the X-ray crystal structure of TIC, showing the complete closure of the downstream DNA-binding cleft of RNAP (Fig. 2D and Movie S1). The significance of the movement of β′i6 is discussed below.

Cryo-EM structure reveals a novel interaction between σNCR and DNA upstream of the −10 element
Although densities of the σ4 and σNCR domains are blurred in the apo-form RNAP, these domains are well ordered in the RPo structure. Interestingly, the σNCR is involved in the binding of DNA upstream of the −10 element. The Arg-157 residue of the σNCR establishes a long-range electrostatic interaction with a phosphate between the −16/−17 DNA positions (Fig. 1E).
To investigate the role of the interaction between the σNCR and promoter DNA, we prepared E. coli RNAPs containing the σ70-R157A or -R157E substitution and tested their transcription activities. We used the WT kdpFABC promoter DNA without a synthetic transcription bubble (pFABC-3) (Fig. S2) to evaluate the effects of the amino acid substitutions not only on DNA binding but also on DNA unwinding during the transcription process. The E. coli RNAP derivatives showed a major defect in transcription, with the R157A and R157E variants showing 50 and 30% of the activity of the WT enzyme, respectively (Fig. 3, A and B). To identify the step of transcription influenced by the Arg-157 substitutions, we investigated the promoter DNA binding of RNAP using an electrophoretic mobility shift assay (EMSA) and found no effect of these substitutions (Fig. 3C). To test the effect of the Arg-157 substitutions on DNA unwinding, we used the fluorescence signal of 2-aminopurine (2-AP) substitutions at the −7 and −1 positions in double-strand promoter DNA (pFABC_2AP-1 and pFABC_2AP-7) (Fig. S2). The fluorescence signal of 2-AP is quenched in dsDNA, whereas an increase in 2-AP fluorescence intensity is observed when RNAP unwinds dsDNA (13). Both WT and derivative RNAPs show a similar increase in 2-AP fluorescence intensity at the −7 position, suggesting that the R157A/R157E substitutions do not influence the early step of DNA opening. However, 2-AP fluorescence at the −1 position was 70 and 40% in the R157A and R157E derivatives, respectively, compared with the WT (Fig. 3D), indicating that the σNCR plays a role in DNA unwinding around the transcription start site, the final step of transcription-ready RPo formation. Consistent with this functional role, the Arg-157 substitutions had no effect (Fig. 3E) on the transcription of a premelted DNA (TIS) (Fig. S2).
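One way to express 2-AP signals as a percentage of WT is to subtract the quenched duplex signal and divide by the WT increase. The sketch below illustrates that arithmetic only. The normalization scheme is an assumption (the text does not state how the percentages were derived), and the readings are hypothetical arbitrary-unit values chosen so that the output reproduces the 70 and 40% quoted above; they are not measured data.

```python
def relative_unwinding(f_complex: float, f_dsdna: float, f_wt_complex: float) -> float:
    """Fluorescence increase over duplex DNA, as a fraction of the WT increase.

    Assumed normalization; not taken from the methods of this study.
    """
    wt_increase = f_wt_complex - f_dsdna
    if wt_increase <= 0:
        raise ValueError("WT signal must exceed the duplex DNA signal")
    return (f_complex - f_dsdna) / wt_increase


# HYPOTHETICAL readings (arbitrary units) for the -1 position probe.
f_dsdna = 100.0  # 2-AP quenched in duplex DNA, no RNAP
readings = {"WT": 200.0, "R157A": 170.0, "R157E": 140.0}

relative = {
    name: relative_unwinding(signal, f_dsdna, readings["WT"])
    for name, signal in readings.items()
}

for name, value in relative.items():
    print(f"{name}: {value:.0%} of the WT fluorescence increase")
```

With this normalization, a variant that binds DNA but does not open the −1 position would score near 0%, and a variant that opens it as well as WT would score 100%.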

Discussion
Using single-particle cryo-EM reconstruction, we determined the structures of the apo-form E. coli RNAP σ70 holoenzyme and RPo at 4.2 Å and 5.75 Å resolution, respectively. Although the overall resolution of the apo-form RNAP is better than that of RPo, the density maps of the RNAP regions involved in −35 element recognition, such as σ4 and the β flap tip helix, are better resolved in the RPo (Figs. 1B and 2A), suggesting that RNAP needs flexibility in its −35 element-binding domain to recognize promoters with different spacer lengths (16 to 18 bases in most cases) between the −10 and −35 elements.
The density map of σNCR in the apo-form holoenzyme is weak and sparse, whereas the σ2, which directly links to the σNCR via two α helices, is well resolved (Fig. 2A). The promoter DNA binding to RNAP enhances the rigidity of σNCR (Fig. 1B) and establishes an electrostatic interaction between the Arg-157 residue and the t-DNA strand at the −16/−17 position (Fig. 1E). The electrostatic nature of this interaction is supported by the finding that the RNAP derivative with a side chain charge reversal (R157E) showed a greater effect on DNA unwinding and transcription than the derivative with a charge deletion (R157A) compared with WT (Fig. 3). This interaction was not observed in a previously determined E. coli RNAP TIC structure determined by X-ray crystallography (10), likely because of restraints imposed by protein and DNA packing in the crystal (Fig. S4B), whereas the cryo-EM structure represents the structure of the molecule in solution without any effects of crystal packing. It is also possible that the TIC structure represents a conformational stage in which the σNCR and t-DNA interaction does not occur. The σNCR and t-DNA interaction presented here is distinct from the σ3 and DNA interaction, which is formed with the nt-DNA strand upstream of the −10 element (14). Our structure-based biochemical experiments revealed a role of the σNCR and t-DNA interaction, which facilitates DNA opening at the transcription start site and formation of the transcription-ready RPo (Fig. 3). The Arg-157 residue is widely conserved in Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, and Chlamydiae (Fig. 4A), further supporting the importance of the σNCR and DNA interaction.
The size of the σNCR found in the primary σ factor (σ70 in E. coli and SigA in other bacteria) depends on the bacterial phylum. For example, E. coli σ70 contains 247 residues, whereas Thermus aquaticus, Mycobacterium smegmatis, and M. tuberculosis (MTB) SigA contain 72, 29, and 32 residues, respectively (Fig. 4B). There is a good correlation between the size of σNCR and the stability of RPo. Compared with MTB RNAP, which forms very unstable RPo with the two promoters tested, E. coli RNAP formed highly stable and irreversible complexes with the same promoters (15). Stable RPo formation by MTB RNAP requires the accessory proteins RbpA and CarD (16). Because of its small size, the σNCR of MTB and M. smegmatis SigA cannot touch DNA in the RPo (16,17). However, in the presence of RbpA, the basic linker (BL) and σ-interacting domain (SID) of RbpA occupy a space between the promoter DNA and σNCR, acting like an extension of σNCR to reach DNA upstream of the −10 element (16). Basic residues (Lys-76 and Arg-79) of the RbpA-BL form salt bridges with phosphates of the nt-DNA upstream of the −10 element (Fig. 4B) and, in particular, the Arg-79 and DNA interaction is critical for stable RPo formation (16). Thus, it is likely that RbpA in MTB is a functional counterpart of the σNCR in the E. coli RNAP transcription system.
The β′i6 is a lineage-specific insertion found in the middle of the trigger loop, and it changes its position at different stages of the transcription cycle. The β′i6 domain is in the open conformation in the cryo-EM structures of RNAP presented here, whereas it is in the closed conformation in the X-ray crystal structure of TIC (Fig. 2D and Movie S1). The movement of the β′i6 domain toward the βlobe/i4 domain results in closure of the gap between the β′jaw and βlobe/i4 domains. The β′jaw and β′i6 form the downstream mobile element (DME) of RNAP, and kinetic studies of E. coli RNAP transcription proposed a conformational change of the DME during stable RPo formation (2, 18). Deletion of β′i6 drastically reduced the stability of the RPo (19), which is consistent with the idea that the β′i6 may tighten the grip of RNAP on the downstream DNA at the late stage of RPo formation. Although both the cryo-EM and X-ray crystallographic studies used the same sequence and length of downstream DNA to prepare the RNAP-DNA complex, the downstream DNA in the crystal forms a longer DNA as a result of head-to-tail binding with the upstream DNA of an adjacent symmetry-related molecule (Fig. S4C). It is tempting to speculate that certain length of downstream DNA accommo-

Conclusion
The cryo-EM structure of the RNAP holoenzyme was determined at 4.2 Å resolution and its density map quality is equivalent to that of the X-ray crystal structure of the E. coli RNAP holoenzyme determined at 3.6 Å resolution, indicating that single-particle cryo-EM is a promising alternative method for structural studies of RNAP inhibitors because it eliminates the crystallization step. The cryo-EM structure of RPo presented here revealed a novel interaction between σNCR and DNA upstream of the −10 element, which facilitates the formation of stable RPo. RNAP derivatives containing σ70-R157A/R157E accumulate intermediate species between the closed and open complexes; therefore, these derivatives, together with cryo-EM, could be useful tools for capturing elusive intermediates during RPo formation. Such experiments are underway.

Protein expression and purification
E. coli RNAP core enzyme and σ70 proteins were prepared and the RNAP holoenzyme was reconstituted as described (20). The R157E mutation in the E. coli rpoD gene was obtained by site-directed mutagenesis of plasmid pGEMD, and the σ70 derivative was prepared and the holoenzyme containing the σ70 derivative was reconstituted as described (20). The response regulator KdpE-E216A was expressed and purified as described (9).

Sample preparation for cryo-EM
E. coli RNAP σ70 holoenzyme, response regulator KdpE-E216A, and synthetic DNA with an artificial transcription bubble (pFABC-1) (Fig. S2) were mixed at a 1:2:3 molar ratio in sample buffer (10 mM Hepes, pH 8, 100 mM NaCl, 5% glycerol, 10 mM MgCl2) and incubated at room temperature for 30 min. The ternary complex was purified by size-exclusion chromatography on a Superose 6 column (Fig. S1) equilibrated with buffer containing 10 mM Hepes, pH 8, 100 mM NaCl, 5% glycerol, 1 mM MgCl2. The ternary complex in the peak fractions was pooled and cross-linked with 0.1 mM glutaraldehyde for 30 min at room temperature. In the cross-linked sample, the RNAP subunits formed a single band representing a large cross-linked complex, whereas the mobility of KdpE-E216A remained the same as in the non-cross-linked sample (Fig. S1D), suggesting that KdpE-E216A does not directly interact with RNAP. After cross-linking, the buffer was exchanged to reduce the glycerol concentration to <1% and the sample was applied to lacey carbon grids coated with a layer of graphene oxide followed by a layer of pyrene nitrilotriacetic acid.

Grid preparation for cryo-EM
Lacey carbon grids (Ted Pella, Inc.) were glow discharged for 90 s, and graphene oxide solution (0.2 mg/ml, 3 μl) was applied and incubated for 1 min at room temperature. Grids were then washed with water (25 μl droplets) to remove excess graphene oxide. Pyrene nitrilotriacetic acid solution (1.91 mM, 3 μl) was applied to the graphene oxide-coated grid and incubated at room temperature for 5 min. Grids were further washed with water (50 μl droplets) before the ternary complex (250 μg/ml, 3 μl) was applied, followed by blotting for 6 s and immediate plunge freezing in liquid ethane with the Cryoplunge 3 (Cp3, Gatan, Inc.).

Cryo-EM data collection and image processing
Data were collected using the Titan Krios (Thermo Fisher) microscope equipped with a K2 Summit direct electron detector (Gatan) at the Purdue Cryo-EM Facility (Table 1 and Fig. S5). Sample grids were imaged at 300 kV, with an intended defocus range of 1.0-5.0 μm, a nominal magnification of 22,500× in super-resolution mode (0.668 Å per pixel), and a dose rate of ~8 electrons per pixel per second. Movies were collected with a total dose of ~45 electrons per Å² (~1.12 electrons per frame per Å²) at 5 frames per s for 8 s. Of the total 1731 movies collected, 1037 usable movies were aligned and dose weighted using MotionCor2 (21). Contrast transfer function fitting was performed with Gctf (22). A total of 286,424 particles were picked using Gautomatch (Dr. Kai Zhang, Medical Research Council, UK). Subsequent 2D and 3D classifications, 3D refinement, post-processing, particle polishing, and local resolution estimation were performed in a beta release of Relion 2.0 (23). A de novo initial model was constructed using EMAN2 (24).
Table 1. Cryo-EM data collection and refinement statistics
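The per-frame dose follows directly from the frame rate, exposure time, and total dose given above. The short sketch below works through that arithmetic, together with the fraction of movies retained; the variable names are illustrative and the inputs are the approximate values quoted in the text.

```python
# Values quoted in the text (the total dose is approximate).
frame_rate_per_s = 5         # frames per second
exposure_s = 8               # seconds per movie
total_dose_e_per_A2 = 45.0   # electrons per square angstrom, approximate

movies_collected = 1731
movies_usable = 1037

# Number of frames in each movie.
frames_per_movie = frame_rate_per_s * exposure_s

# Dose delivered in each frame, assuming the dose is spread evenly over frames.
dose_per_frame = total_dose_e_per_A2 / frames_per_movie

# Fraction of collected movies that were kept for processing.
usable_fraction = movies_usable / movies_collected

print(f"{frames_per_movie} frames per movie")
print(f"{dose_per_frame:.3f} electrons per frame per square angstrom")
print(f"{usable_fraction:.1%} of movies were usable")
```

The result, 1.125 electrons per frame per Å² over 40 frames, agrees with the ~1.12 quoted above given that the total dose is itself approximate.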

Structure refinement
To refine the apo-form RNAP structure, the E. coli RNAP holoenzyme crystal structure (PDB ID: 4YG2) was manually fit into the cryo-EM density map using Chimera (25) and real-space refined using Phenix (26). In the real-space refinement, domains of RNAP were rigid-body refined, then subsequently refined with secondary structure, Ramachandran, rotamer, and reference model restraints. To refine the structure of RPo, the E. coli RNAP TIC crystal structure (PDB ID: 4YLN) without RNA was manually fit into the cryo-EM density map using Chimera. Upstream DNA from the −35 element was manually built using Coot (27). The structure was refined in the same way as the apo-form RNAP.

Electrophoretic mobility shift assay
To test promoter-specific RNAP and DNA complex formation, 10 pmol RNAP was mixed with 20 pmol DNA (pFABC-3) (Fig. S2) in binding buffer (10 mM Tris, pH 8, 50 mM NaCl, 10 mM MgCl2, 5% glycerol, 0.01% Triton X-100, and 1 mM DTT) and incubated at 37°C for 10 min. Heparin was added to final concentrations of 10 or 50 μg/ml and incubated at 37°C for 5 min. The RNAP-DNA complex was separated from free DNA by 6% polyacrylamide-Tris borate-EDTA gel electrophoresis and DNA was visualized by ethidium bromide staining. The experiments were conducted twice.

2-Aminopurine fluorescence assay
Synthetic DNA oligos (Integrated DNA Technologies) were obtained with 2-AP substituted at the −7 or −1 position of the template strand DNA and annealed to the complementary strand to form double-strand labeled DNA (pFABC_2AP-1 and pFABC_2AP-7) (Fig. S2). 100 nM ssDNA, dsDNA, and dsDNA mixed with 200 nM of purified WT RNAP or its σ70-R157A/R157E variants were incubated at 37°C for 10 min in a total reaction volume of 100 μl. The buffer composition of the mixture was kept at 10 mM Hepes, pH 8, 50 mM NaCl, 5% glycerol, 1 mM MgCl2. The fluorescence signal from the samples was measured in a SpectraMax M5 spectrophotometer (Molecular Devices) at an excitation wavelength of 320 nm and an emission wavelength of 380 nm.
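The concentrations and reaction volume above translate into absolute amounts by a simple unit conversion. The sketch below shows that conversion for the 2-AP reaction; the helper name is illustrative and is not taken from any protocol software.

```python
def pmol_from(conc_nM: float, volume_ul: float) -> float:
    """Amount in pmol for a solute at conc_nM in volume_ul.

    1 nM x 1 ul = 1e-9 mol/L x 1e-6 L = 1 fmol, so divide by 1000 for pmol.
    """
    if conc_nM < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_nM * volume_ul / 1000.0


reaction_volume_ul = 100.0   # total reaction volume from the text
dna_pmol = pmol_from(100.0, reaction_volume_ul)    # 100 nM labeled DNA
rnap_pmol = pmol_from(200.0, reaction_volume_ul)   # 200 nM RNAP

print(f"DNA: {dna_pmol:g} pmol, RNAP: {rnap_pmol:g} pmol")
print(f"RNAP:DNA molar ratio = {rnap_pmol / dna_pmol:g}:1")
```

Each 100 μl reaction therefore contains 10 pmol of labeled DNA and 20 pmol of RNAP, a twofold molar excess of enzyme over DNA.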