Cryo-EM structure of Escherichia coli σ70 RNAP and promoter DNA complex revealed a role of σ non-conserved region during the open complex formation

First step of gene expression is transcribing the genetic information stored in DNA to RNA by the transcription machinery including RNA polymerase (RNAP). In Escherichia coli, a primary σ70 factor form the RNAP holoenzyme to express housekeeping genes. The σ70 contains a large insertion at between the conserved regions 1.2 and 2.1, the σ non-conserved region (σNCR), but its function remains to be elucidated. In this study, we determined the cryo-EM structures of the E. coli RNAP σ70 holoenzyme and its complex with promoter DNA (open complex, RPo) at 4.2 and 5.75 Å resolutions, respectively, to reveal native conformations of RNAP and DNA. The RPo structure presented here found an interaction between R157 residue in the σNCR and promoter DNA just upstream of the −10 element, which was not observed in a previously determined E. coli RNAP transcription initiation complex (RPo plus short RNA) structure by X-ray crystallography due to restraint of crystal packing effect. Disruption of the σNCR and DNA interaction by the amino acid substitution (R157E) influences the DNA opening around the transcription start site and therefore decreases the transcription activity of RNAP. We propose that the σNCR and DNA interaction is conserved in proteobacteria and RNAP in other bacteria replace its role with a transcription factor.


INTRODUCTION
In the event of formation of the transcription ready open complex (RPo) in Escherichia coli, the conserved regions 2.4 and 4.2 of the primary s 70 factor in RNA polymerase (RNAP) holoenzyme recognize the -10 and -35 promoter elements, respectively, to form the closed complex (RPc). This step is followed by DNA strands separation that is stabilized by the s region 2.3, completing the formation of the transcription bubble from -11 to +2 positions (1). E. coli RNAP forms RPo with at least two intermediates starting from the RPc. Although a series of large conformation changes of RNAP and DNA during the RPo formation have been characterized by kinetic and biophysical studies (2), the structures of intermediates have not been determined yet due to their transient and heterogeneous nature that make them difficult to be captured by X-ray crystallography approach.
A series of technical advances such as development of a direct electron detector and also image processing techniques have moved cryoelectron microscopy (cryo-EM) to the forefront of structural biology to analyze large macromolecule complexes in unprecedented details (3). Several structures of E. coli RNAP in complex with nucleic acids (DNA/RNA and 6S RNA) (4,5) and transcription activators (NtrC and CAP) (6,7) have been determined by cryo-EM, demonstrating the feasibility of using this experimental approach to investigate the structure and function of E. coli RNAP. There are a couple of advantages of the cryo-EM over Xray crystallography: 1) the 3D classification of individual particles from a single cryo-EM grid makes it possible to reveal the heterogeneity of macromolecule and determine their structures individually; 2) a cryo-EM grid preparation takes less than 10 sec instead of taking days or longer for macromolecular crystallization, increasing our chances to capture elusive and unstable intermediates for cryo-EM structure determination.
The structural study of bacterial RNAP began in 1996 with the high-resolution X-ray crystal structure of the E. coli s 70 , which provided insight into the DNA sequence recognition and double-strand DNA melting by the s factor conserved regions 2.4 and 2.3, respectively (8). The structure also revealed the s non-conserved region (sNCR) at between the conserved regions 1.2 and 2.1.
In this work, we determined the cryo-EM structures of the E. coli RNAP s 70 holoenzyme and the holoenzyme-promoter DNA complex from a single cryo-EM grid. Comparing these structures revealed that the promoter DNA binding to RNAP not only triggers conformational changes of RNAP around the domains for binding the -35 and -10 elements, but also establishes the direct interaction between the sNCR and the DNA upstream of the -10 element. Our structural and biochemical data introduced a function of the sNCR during the formation of transcription ready RPo.

Sample preparation and cryo-EM structure determination
In this study, we aimed to determine the structure of E. coli RNAP s 70 holoenzyme and promoter DNA complex with KdpE (KDP operon transcriptional regulatory protein) to obtain structural basis of the transcription activation by the response regulator OmpR/PhoB family. We assembled a ternary complex (Fig. S1) containing E. coli RNAP, a constitutively active variant of KdpE (KdpE-E216A, (9)) and the E. coli kdpFABC promoter DNA derivative containing both a RNAP binding site (-35 and -10 elements; synthetic bubble from -6 to +1) and tandem KdpE binding sites (from -67 to -62 and from -56 to -51) (pFABC-1, Fig. S2) for the cryo-EM grid preparation.
Out of many grid types on which the sample was vitrified, lacey-carbon grids coated with a single layer of graphene oxide followed by a layer of pyrene-NTA had images with more widely oriented particles, better suited for single particle 3-D reconstruction. From 1,037 usable movie stacks, 286,424 particles were picked and used for single particle reconstruction. The 54 classes with clear structural features representing 257,539 particles after 2D classification were selected for 3D classification. This resulted in two major 3D classes, one representing an E. coli RNAP s 70 holoenzyme-promoter DNA complex (RPo) and the other of E. coli RNAP s 70 holoenzyme by itself (Fig. S3) indicating that the dissociation of DNA from some fractions of the RPo during the cryo-EM grid preparation. We failed to obtain a density map representing a ternary complex (RPo with KdpE), suggesting the dissociation of KdpE from the RPo. Determination of the cryo-EM structures of the RPo and the apo-form RNAP from a single cryo-EM grid allows us to reveal conformational changes of RNAP upon the promoter DNA binding in the identical experimental condition in solution.

Cryo-EM structure of the E. coli RNAP open complex
One of the 3D classes represents the E. coli RPo determined at 5.75 Å resolution ( Figs. 1 and  S3). In addition to density for RNAP holoenzyme, strong densities are observed for double-stranded DNA (from -45 to -12 bases) and single-stranded non-template DNA (nt-DNA) in the transcription bubble (from -11 to -3 bases). However, density for the template DNA (t-DNA) in the transcription bubble (from -9 to +2 bases) is not traceable and density for downstream DNA duplex is weak and scattered, indicating their mobility within the DNA binding main channel of RNAP (Fig. 1C). Positions of each DNA elements (-35, -10, transcription bubble and downstream DNA) are nearly identical to a previously determined X-ray crystal structure of E. coli transcription initiation complex (TIC) (10) (PDB: 4YLN, Fig. S4A). Although the overall resolution was 5.7 Å, local resolution calculations indicate the central part of the structure including the N-terminal domain of a subunit (aNTD) as well as b and b' subunits around the active site of RNAP was determined close to 5 Å resolution, with peripheral areas of the structure such as sNCR (N128 to R373, 246 residues), b' insertion 6 (b'i6, A944 to G1129, 186 residues) and w subunit at 7-10 Å resolution (Fig. 1D). Density for the C-terminal domains of a subunit (aCTD) and s region 1.1 domain (s1.1) could not be traced due to flexible nature of these domains. The distance between two pincers of RNAP (T212 in the b' clamp domain and H165 in the b gate loop) is 28 Å, indicating the closed conformation of the RNAP clamp as observed in the E. coli TIC (27.7 Å). However, the position of sNCR in the RPo structure determined by the cryo-EM in this study is different from the one in the TIC determined by the X-ray crystallography (Fig. S4A). Particularly, the sNCR contacts the t-DNA at -16/17 position via R157 residue, while this interaction was not observed in the TIC due to restraint of crystal packing effect (Fig. S4B).

Cryo-EM structure of the E. coli RNAP s 70 holoenzyme
The second 3D class represents the apoform E. coli RNAP s 70 holoenzyme with an overall resolution of 4.2 Å (Figs 2A and S3). Local resolution calculations indicate that the center part of the structure is determined at a resolution of 4 Å, while the peripheral areas of the structure are determined around 6-10 Å (Fig 2B). Densities for aCTD and s1.1 were not traceable. Densities for s4, sNCR and b flap tip helix are sparse and weaker than their counterparts in the RPo, indicating their flexible nature before binding to promoter DNA. The distance between the two pincers around the main channel is 26 Å, indicating that the RNAP clamp adopts a closed conformation without DNA binding.
Density around the active site located at the center of the RNAP molecule showed not only main chains but also side chains. For example, the density map of the Rifampin binding pocket of b subunit shows main chain as well as the side chains of H526 and S531 residues that play key roles in the RNAP and Rifampin interaction (11) (Fig. 2C). The cryo-EM map shown here is the same quality as the X-ray crystal structure of E. coli RNAP determined at 3.6 Å resolution (PDB: 4YG2) (12), indicating that cryo-EM is able to provide the structural information for the RNAP and inhibitor interaction.
In the cryo-EM structures of both apo-form RNAP and RPo, a gap between the b'i6 domain and b' rim helix is widely opened. In comparison, the b'i6 shifts toward the b' rim helix about 31 Å in the X-ray crystal structure of TIC, showing the complete closure of the downstream DNA binding cleft of RNAP (Fig. 2D). Significance of the movement of b'i6 will be discussed below.

Cryo-EM structure reveals a novel interaction between sNCR and DNA upstream of the -10 element
Although densities of the s4 and sNCR domains are blurred in the apo-form RNAP, the RPo showed well-ordered these domains. Interestingly, the sNCR is involved in the binding of upstream DNA of the -10 element. R157 residue of the sNCR establishes a long-range electrostatic interaction with a phosphate between the -16/-17 DNA positions (Fig. 1E).
To investigate a role of the interaction between the sNCR and promoter DNA, we prepared an E. coli RNAP containing the s 70 -R157E substitution and tested its transcription activity with a double-stranded promoter DNA (pFABC-3, Fig. S2). The E. coli RNAP derivative showed major defect in transcription, expressing only 22 % activity compared with the wild-type enzyme (Figs. 3A and B). To identify the step of transcription influenced by the R157E substitution, we investigated the promoter DNA binding of RNAP by using Electrophoretic Mobility Shift Assay (EMSA) and found no effect by the substitution (Fig. 3C). To test the effect of the R157E substitution in DNA unwinding, we used fluorescence signal of 2aminopurine (2-AP) substitutions at -7 and -1 positions in double-stranded promoter DNA (pFABC_2AP-1 and pFABC_2AP-7, Fig. S2).
Fluorescence signal of 2-AP is quenched in double-stranded DNA, whereas the increase of 2-AP fluorescence intensity is observed when RNAP unwinds double-stranded DNA (13). Both wild type and derivative RNAP show similar increase of the 2-AP fluorescence intensity at -7 position, suggesting that the R157E substitution does not influence the early step of DNA opening. However, 2-AP fluorescence at -1 position was reduced by 2-folds in the derivative compared with the wild-type (Fig. 3D), indicating that the sNCR plays a role in the DNA unwinding around the transcription start site, the final step of transcription ready RPo formation. Consistent with this functional role, the R157E substitution had no effect (Fig. 4E) on the transcription of a pre-melted DNA (TIS, Fig. S2).

DISCUSSION
Using single-particle cryo-EM reconstruction, we determined the structures of the apo-form E. coli RNAP s 70 holoenzyme and RPo at 4.2 Å and 5.7 Å resolution, respectively. Although the overall resolution of the apo-form RNAP is better than RPo, the density maps of RNAP involved in the -35 element recognition such as s4 and b flap tip helix are better resolved in the RPo (Fig. 1B, 2A), suggesting that RNAP needs its flexibility of the -35 element binding domain to recognize promoters with different lengths of spacer (16 to 18 bases in most case) between the -10 and -35 elements.
The density map of sNCR in the apo-form holoenzyme is weak and sparse, whereas the s2, which directly links to the sNCR via two a helixes, is well resolved (Fig. 2A). The promoter DNA binding to RNAP enhances the rigidity of sNCR (Fig. 1B) and establishes a long-range electrostatic interaction between R157 residue and t-DNA strand at -16/-17 position (Fig. 1E). This interaction was not observed in a previously determined E. coli RNAP TIC determined by the X-ray crystallography (10) likely due to restraint of protein and DNA packings in the crystal (Fig.  S4B). The sNCR and t-DNA interaction presented here is distinct from the s3 and DNA interaction, which is formed with the nt-DNA strand upstream of the -10 element (14).
Our structure-based biochemical experiments revealed a role of the sNCR and t-DNA interaction, which facilitates the DNA opening of the transcription start site and formation of the transcription-ready RPo (Fig. 3). R157 residue is widely conserved in Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria and Chlamydiae (Fig. 4A), further supporting the importance of the sNCR and DNA interaction.
The size of sNCR found in the primary s factor (s 70 in E. coli and SigA in other bacteria) depends on bacterial phyla. For examples, E. coli s 70 contains 247 residues, whereas Thermus aquaticus, Mycobacterium smegmatis and Mycobacterium tuberculosis (MTB) SigA contain 72, 29 and 32 residues, respectively (Fig.  4B). There is a good correlation between the size of sNCR and the stability of RPo. Compared to MTB RNAP that forms very unstable RPo with two promoters tested, E. coli RNAP formed highly stable and irreversible complexes with the same promoters (15). Stable RPo formation of MTB RNAP requires accessary proteins RbpA and CarD (16). Due to its small size, the sNCR of MTB and M. smegmatis SigA cannot touch DNA in the RPo (16,17). However, in the presence of RbpA, the basic linker (BL) and s interacting domain (SID) of RbpA occupies a space between the promoter DNA and sNCR, acting like an extension of sNCR to reach DNA upstream of the -10 element (16). Basic residues (K76 and R79) of the RbpA-BL form salt bridges with phosphates of the nt-DNA upstream of the -10 elements (Fig. 4B) and particularly, R79 and DNA interaction is critical for the stable RPo formation (16). Thus, it is likely that the RpbA in MTB is a functional counterpart of the sNCR in the E. coli RNAP transcription system. The b'i6 is a lineage specific insertion found between the middle of the trigger loop and changes its position at deferent stages in the transcription cycle. The b'i6 domain is in the open-conformation in the cryo-EM structures of RNAP presented here whereas it is in the closedconformation in the X-ray crystal structure of TIC, respectively (Fig. 2D). The movement of the b'i6 domain toward the blobe/i4 domain results in closure of the gap between the b'jaw and blobe/i4 domains. The b'jaw and b'i6 form the downstream mobile element (DME) of RNAP and the kinetic studies of E. coli RNAP transcription proposed the conformational change of DME during the stable RPo formation (2,18). Deletion of b'i6 drastically reduced the stability of the RPo (19), which is consistent with the idea that the b'i6 may tighten the grip of RNAP on the downstream DNA at the late stage of RPo formation. Although both the cryo-EM and X-ray crystallographic studies used the same sequence and length of downstream DNA to prepare the RNAP-DNA complex, the downstream DNA in crystal forms longer DNA as a result of head-totail binding with the upstream DNA of adjacent symmetrically related molecule (Fig. S4C). It is tempting to speculate that certain length of downstream DNA accommodated in the DNA binding main channel triggers the conformational change of the DME for the establishment of the stable RPo formation.

CONCLUSION
The cryo-EM structure of RNAP holoenzyme was determined at 4.2 Å resolution and its density map quality is equivalent to the Xray crystal structure of E. coli RNAP holoenzyme determined at 3.6 Å resolution, indicating that single-particle cryo-EM is an alternative and promising method for structural studies of RNAP inhibitors due to eliminating the crystallization step. The cryo-EM structure of RPo presented here revealed a novel interaction between sNCR and DNA upstream of the -10 element, which facilitates the formation of stable RPo. RNAP derivative containing s 70 -R157E accumulates intermediate species between the closed and open complexes, therefore, it, along with the cryo-EM, could be a useful tool to capture elusive intermediates during the RPo formation. Such experiments are underway.

EXPERIMENTAL PROCEDURES Protein expression and purification
E. coli RNAP core enzyme and s 70 proteins were prepared and RNAP holoenzyme was reconstituted as described (20). R157E mutation in the E. coli rpoD gene was obtained by site directed mutagenesis of plasmid pGEMD, and s 70 derivative was prepared and holoenzyme containing s 70 derivative was reconstituted as described (20). Response regulator KdpE-E216A was expressed and purified as described (9).

Sample preparation for cryo-EM
E. coli RNAP s 70 holoenzyme, response regulator KdpE-E216A and synthetic DNA (pFABC-1, Fig. S2) were mixed at 1:2:3 molar ratio in sample buffer (10 mM Hepes pH 8, 100 mM NaCl, 5% glycerol, 10 mM MgCl2) and incubated at room temperature for 30 minutes. The ternary complex was purified using size exclusion Superose 6 column chromatography (Fig. S1) equilibrated with buffer containing 10 mM Hepes pH 8, 100 mM NaCl, 5% glycerol, 1 mM MgCl2. The ternary complex in peak fractions were pooled and cross-linked with 0.1 mM glutaraldehyde for 30 min at room temperature. The cross-linked sample showed that the RNAP subunits formed a single band representing a large crosslinked complex, while the mobility of KdpE-E216A remains the same as the non-cross-linked sample (Fig. S1D) suggesting that KdpE-E216A does not directly interact with RNAP. After crosslinking, buffer was exchanged to reduce glycerol concentration to < 1% and the sample was applied to laceycarbon grids coated with a layer of graphene oxide followed by a layer of pyrene NTA.

Grid preparation for cryo-EM
Graphene oxide solution (10 µg/ml, 3 µl) was applied to a lacey carbon grids (Ted Pella, Inc.) and incubated for 1 min at room temperature. Grids were then washed with water (25 µl droplets) to remove excess graphene oxide. Pyrene-NTA solution (1.91 mM, 3 µl) was applied to the graphene oxide coated grid and incubated at room temperature for 5 min. Grids were further washed with water (50 µl droplets) before applying the ternary complex (250 µg/ml, 3 µl) followed by blotting for 6 sec and immediately plunge frozen into liquid ethane with the Cp3 Cryoplunger (Gatan, Inc.).

Cryo-EM data collection and image processing
Data was collected using the Titan Krios (ThermoFisher) microscope equipped with a K2 Summit direct electron detector (Gatan) at Purdue Cryo-EM Facility (Table 1 and Fig. S5). Sample grids were imaged at 300 kV, with an intended defocus range of 1.0 -5.0 µm, a nominal magnification of 22,500X in super-resolution mode (0.668 Å per pixel), and at a dose rate of ~8 electrons per pixel per second (e.p.s.). Movies were collected with a total dose of ~45 electrons per Å 2 (~1.12 electrons per frame per Å 2 ) at 5 frames per seconds for 8 seconds. Of the total 1,731 movies collected, 1037 usable movies were aligned and dose weighted using MotionCor2 (21). CTF fitting was performed with Gctf (22). A total of 286,424 particles were picked using Gautomatch (Dr. Kai Zhang, MRC, UK). Subsequent 2D and 3D classification, 3D refinement, post-processing, and local resolution estimation were performed in a beta release of Relion 2.0 (23). De novo initial model was constructed using EMAN2 (24).

Structure refinement
To refine the apo-form RNAP structure, E. coli RNAP holoenzyme crystal structure (PDB ID 4YG2) was manually fit into the cryo-EM density map using Chimera (25) and real-space refined using Phenix (26). In the real-space refinement, domains of RNAP were rigid-body refined, then subsequently refined with secondary structure, Ramachandran, rotamer and reference model restraints. To refine the structure of RPo, E. coli RNAP TIC crystal structure (PDB ID 4YLN) without RNA was manually fit into the cryo-EM density map using Chimera. Upstream DNA from the -35 element was manually built by using Coot (27). The structure was refined the same as in the apo-form RNAP.

2-Aminopurine (2-AP) fluorescence assay
Synthetic DNA oligos (Integrated DNA Technologies) were obtained with 2-amino purine substituted at -7 or -1 positions of template strand DNA and annealed to the complimentary strand forming double stranded labelled DNA (pFABC_2AP -1 and pFABC_2AP-7, Fig. S2). 100 nM single stranded DNA, double stranded DNA and double stranded DNA mixed with 200 nM of purified wild-type RNAP or its s 70 -R157E variant were incubated at 37 0 C for 10 min, in a total of 100 µl reaction volume. The buffer composition of the mixture was kept as 10 mM Hepes pH 8, 50 mM NaCl, 5% glycerol, 1 mM MgCl2. Fluorescence signal from the samples were measured in a SpectraMax-M5 spectrophotometer (Molecular Devices) at excitation wavelength 320 nm and emission wavelength 380 nm.