Co-overexpression of Escherichia coli RNA Polymerase Subunits Allows Isolation and Analysis of Mutant Enzymes Lacking Lineage-specific Sequence Insertions

The study of mutant enzymes can reveal important details about the fundamental mechanism and regulation of RNA polymerase, the central enzyme of gene expression. However, such studies are complicated by the multisubunit structure of RNA polymerase and by its indispensability for cell growth. Previously, mutant RNA polymerases have been produced by in vitro assembly from isolated subunits or by in vivo assembly upon overexpression of a single mutant subunit. Both approaches can fail if the mutant subunit is toxic or incorrectly folded. Here we describe an alternative strategy, co-overexpression and in vivo assembly of RNA polymerase subunits, and apply this method to characterize the role of sequence insertions present in the Escherichia coli enzyme. We find that co-overexpression of its subunits allows assembly of an RNA polymerase lacking a 188-amino acid insertion in the β′ subunit. Based on experiments with this and other mutant E. coli enzymes with precisely excised sequence insertions, we report that the β′ sequence insertion and, to a lesser extent, an N-terminal β sequence insertion confer characteristic stability to the open initiation complex, frequency of abortive initiation, and pausing during transcript elongation relative to RNA polymerases, such as that from Bacillus subtilis, that lack the sequence insertions.

The amino acid sequences of the two largest subunits of cellular RNA polymerases (RNAPs), called β′ and β in bacteria, are remarkably conserved among multisubunit RNAPs from eubacteria to the human homologues, RNAPI, RNAPII, and RNAPIII (1,2). Recently determined crystal structures of yeast and bacterial RNAPs (3,4) reveal the structural congruity responsible for this conservation, which extends far beyond the catalytically important residues (5-7). Representing the bulk of the enzyme, β′ and β (and their yeast homologues, Rpb1 and Rpb2) form two halves of a crab claw-shaped molecule, in which the secondary structures of the homologous subunits are nearly identical within 25 Å of the active site (located at the internal junction between the claws). The divergence among the large subunits of RNAP generally increases toward the exposed surface of the molecule and frequently is manifest as insertions of up to several hundred amino acids that are characteristic of different evolutionary lineages (3). This pattern of surface sequence variation in RNAPs, which in eukaryotes includes additional subunits bound at the enzyme's periphery, led to the idea that the conserved catalytic core of RNAP is adapted to various environments and cellular milieus by the addition of surface modules that interact with different regulatory factors (4,7,8).
In bacteria, the most studied examples of lineage-specific sequence insertions (SIs) occur in proteobacteria, in which three easily recognizable insertions protrude from the surface of the enzyme (Figs. 1 and 6), two in β (SI1 between the conserved regions B and C and SI2 between regions G and H) and one in β′ (SI3 between regions G and H) (9,10). These insertions are absent in Gram-positive bacteria and in Deinococci, from which the bacterial RNAP structures have been determined; the Deinococci contain a different SI between conserved regions A and B of β′ (3,11). Both βSI1 and βSI2 tolerate insertions and partial deletions without affecting cell viability or in vitro RNAP activities (9,12,13). In contrast, partial deletions in β′SI3 are more deleterious; they reduce cell viability and confer defects in transcript cleavage and elongation, whereas the complete removal of β′SI3 inhibits assembly of the mutant β′ into core RNAP (14).
The observations that SIs can tolerate significant structural alterations without the loss of RNAP function led to their being described as dispensable (13-15). SIs in related bacteria exhibit somewhat greater variability than the sequences of full-length β and β′ (e.g. SIs are ~70% conserved between Escherichia coli and Haemophilus influenzae and ~60% conserved between E. coli and Shewanella violacea, whereas the remaining parts of β and β′ are ~85 and ~80% conserved between the two pairs, respectively). Nonetheless, their retention in distinct bacteria with significant sequence conservation suggests that they play some functional role. One role could be the recruitment of cellular regulatory proteins. For instance, βSI1 appears to be the target of the phage T4 Alc termination factor (15). SI modules could also affect biochemical properties of RNAP more directly; both β′SI3 and βSI1 are proposed to make downstream DNA contacts that explain the greater stability and longer downstream footprints of E. coli RNAP open complexes (relative to, for instance, Bacillus subtilis RNAP) (7,16).
We became interested in the sequence insertions in proteobacterial RNAP as possible explanations for the inability of B. subtilis RNAP to recognize hairpin-dependent pause signals as well as the decreased open complex longevity and decreased abortive initiation of B. subtilis RNAP (17). To facilitate study of RNAPs with precise SI deletions, we developed a polycistronic co-overexpression system for E. coli RNAP that relies on T7 RNAP-dependent transcription of all three core RNAP subunit genes (rpoA, rpoB, and rpoC) on a single plasmid. To facilitate purification of the recombinant RNAPs, we appended an intein-chitin-binding protein (CBP) module to the β′ C terminus and a hexahistidine-hemagglutinin tag to the N terminus of β. This system allowed us to purify and test precise deletions of all three SIs in E. coli RNAP and provides a generally useful method for the study of mutant E. coli RNAP enzymes that avoids the loss of activity and assembly problems that sometimes arise with in vitro reconstitution methods (18,19).

Reagents and Proteins-Oligonucleotides (listed in Table I) were obtained from Operon Technologies (Alameda, CA); dNTPs were from U.S. Biochemical Corp.; NTPs were from Amersham Biosciences; [α-32P]CTP was from PerkinElmer Life Sciences; and other chemicals were from Sigma. Restriction and modification enzymes were obtained from New England Biolabs. Linear DNA templates for in vitro transcription were generated by PCR amplification and purified using reagents (cat. A7170) from Promega (Madison, WI).
Construction of Deletion Mutants-To delete βSI1, an rpoB fragment was PCR-amplified from pRL702 with oligonucleotides 343 and 3135. The PCR product was digested with NruI and cloned between two NruI sites of pRL702. To delete βSI2 and β′SI3, site-directed PCR mutagenesis was performed with two fully complementary oligonucleotides flanking the deleted fragment in pIA160 (β) or pRL663 (β′). Each oligonucleotide (see Table I) annealed on both sides of the deletion, forcing the intervening fragment to loop out. The shortened rpoB fragment located between the unique NcoI and ClaI sites was sequenced, excised, and recloned into NcoI-, ClaI-cut pIA160 (pIA319) or NcoI-, ClaI-cut pIA178 (pIA302). The shortened rpoC fragment located between the unique SalI and BspEI sites was sequenced, excised, and recloned into SalI-, BspEI-cut pRL663. To obtain overexpression plasmids encoding βΔSI1 and βΔSI2, the mutant rpoB fragments were transferred to the T7 RNAP-based expression plasmid pIA423 (Fig. 2) on an NcoI to SdaI fragment. To obtain overexpression plasmids encoding β′Δ(943-1130), the mutant rpoC fragments were transferred to the same expression plasmid on a BsmI to XhoI fragment.
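The looping-out design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy gene, the arm length, and the placeholder protein are hypothetical and are not the oligonucleotides of Table I. The sketch joins the sequence immediately upstream of a deleted fragment to the sequence immediately downstream (the mutagenic primer and its fully complementary partner) and confirms that removing residues 943-1130 excises 188 amino acids.

```python
def delete_residues(protein: str, first: int, last: int) -> str:
    """Remove residues first..last (1-based, inclusive) from a protein sequence."""
    if not (1 <= first <= last <= len(protein)):
        raise ValueError("deletion boundaries fall outside the sequence")
    return protein[: first - 1] + protein[last:]


def looping_out_primer(gene: str, del_start: int, del_end: int, arm: int = 15) -> str:
    """Join `arm` nt upstream of the deletion to `arm` nt downstream of it.

    del_start and del_end are 1-based, inclusive nucleotide positions of the
    fragment that is forced to loop out when the primer anneals.
    """
    upstream = gene[max(0, del_start - 1 - arm): del_start - 1]
    downstream = gene[del_end: del_end + arm]
    return upstream + downstream


def reverse_complement(dna: str) -> str:
    """Return the fully complementary strand, written 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical toy gene; positions 8-15 (GGGGCCCC) are to be deleted.
toy_gene = "GATTACAGGGGCCCCTTCAGT"
forward = looping_out_primer(toy_gene, 8, 15, arm=5)
reverse = reverse_complement(forward)

# Placeholder protein of 1407 residues (sequence content is not real).
placeholder = "A" * 1407
shortened = delete_residues(placeholder, 943, 1130)
print(forward, reverse, len(placeholder) - len(shortened))
```

In practice the arms would need to be long enough to anneal stably on both sides of the deletion; the 5-nt arms here only keep the toy example readable.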
Design of the Polycistronic Overexpression Plasmid-The original plasmid for co-overexpression of E. coli RNAP subunit genes, pIA423 (Fig. 2; GenBank™ accession number AF533984), contains a polycistronic operon rpoA-rpoB-rpoC*, flanked by a single T7 promoter and terminator sequences derived from pET21 (Novagen, Madison, WI); each ORF is preceded by a separate ribosome-binding site (Fig. 2). Introduction of an XhoI site at the 3′-end of the rpoC ORF resulted in the addition of two amino acids (LE) to the C terminus of the β′ subunit, whereas a new EagI site in the rpoB ORF is silent. In the absence of induction, expression is repressed by the product of the lacIq gene (carried on the same plasmid) through lacO positioned upstream of rpoA. To facilitate purification of the recombinant RNAP, the rpoC* fusion contains a CBP-intein module from the pTYB3 plasmid (New England Biolabs) fused in-frame to the 3′-end of the rpoC ORF. Derivatives of pIA423 containing hexahistidine and a hemagglutinin epitope tag at the N terminus of β were prepared by ligation of NcoI-SdaI fragments from pIA160 and pIA178 (rpoB SF531) into the corresponding sites of pIA423.
The co-overexpression plasmid pIA423 was created in the following steps (see also Table I). pET21α was constructed to express rpoA under the control of an IPTG-inducible T7 RNAP promoter. The HindIII site in the pET21α rpoA gene was eliminated without altering the encoded amino acid sequence by site-directed PCR mutagenesis with primers 3683 and 3684 to yield pIA287. pIA287 was converted to pIA299 first by introducing a new sequence between the BamHI and NcoI sites downstream of rpoA. The resulting plasmid was then modified by insertion between its NcoI and HindIII sites of an NcoI to SbfI fragment from pRW408 carrying most of rpoB, followed by an SbfI to BsmI fragment from pNF1346, carrying the remainder of rpoB, the rpoBC intergenic region, and the first half of rpoC, followed by a BsmI to HindIII fragment from pRL663, carrying the remainder of rpoC, a C-terminal XhoI site, and a His6 tag. pIA423 was then created by insertion between XhoI sites of pIA299 of an XhoI-SalI-digested PCR product from pTYB3 (amplified with primers 3741 and 3742; Table I).
RNAP Purification-Plasmids encoding variants of E. coli RNAP (wild type and mutants) were transformed into BL21 DE3 (20). A single colony was inoculated into 500 ml to 2 liters of LB + 100 μg of ampicillin/ml at 37°C and grown until apparent A600 reached 0.3-0.5, at which point protein production was induced by the addition of IPTG to 1 mM. Cells were grown for 3 h at 37°C, rapidly chilled on ice, collected by centrifugation for 15 min at 5000 × g and 4°C, and resuspended in 50 ml of column buffer (20 mM Tris, pH 7.9, 5% glycerol, 500 mM NaCl, 1 mM EDTA). Protease inhibitor mixture (Sigma catalog no. P8465) was then added as recommended by the manufacturer, and the cells were disrupted by sonication. The resulting lysate was cleared by centrifugation for 20 min at 27,000 × g and 4°C and then filtered through a 0.4-μm syringe filter (Nalgene). Chitin beads (5 ml; New England Biolabs) were equilibrated with 10 volumes of column buffer in a 20-ml disposable column (Bio-Rad Econo-Pac), and cleared lysate was passed through the column by gravity flow, followed by 20 volumes of column buffer. To induce intein cleavage, the column was washed with 3 bed volumes of column buffer containing 50 mM DTT (to exchange buffer) and then incubated at 4-8°C for 8-16 h (overnight). To elute the protein, column buffer (~4 ml) was added, and 0.2-ml fractions were then collected and tested for protein by Bradford assay and SDS-PAGE (4-12% NuPAGE gels; Invitrogen). Fractions containing RNAP were pooled, concentrated using Centriplus 100 or Ultrafree concentrators (Millipore Corp.) to 1-5 ml (depending on total RNAP recovered), and then dialyzed against loading buffer for heparin affinity or anion exchange chromatography. Chromatography was carried out using Hi-Trap columns and an Akta Prime low pressure chromatography system (Amersham Biosciences).
For heparin affinity separation, samples were loaded onto the HiTrap Heparin HP column in 50 mM sodium phosphate (pH 6.9), 0.1 mM DTT buffer. For quaternary amine chromatography (Hi-Trap Q Sepharose Fast Flow), protein was loaded in 50 mM Tris-HCl (pH 8.0), 5% glycerol, 0.1 mM Na-EDTA, 0.1 mM DTT. Columns were washed with 10 column volumes of the loading buffer and eluted with a gradient (0-1.5 M) of NaCl in loading buffer over 20 column volumes. Elution peaks were identified by monitoring A280, and their contents were further characterized using SDS-PAGE. Fractions containing RNAP were pooled and dialyzed against storage buffer (10 mM Tris, pH 7.9, 50% glycerol, 0.1 mM EDTA, 0.1 mM DTT, 0.1 M NaCl) for 12-14 h at 4°C. Enzymes containing deletions in the β subunit were additionally purified by adsorption to Ni2+-nitrilotriacetic acid beads (Qiagen), followed by washing and imidazole elution prior to dialysis against storage buffer. Typical yields from 2-liter cultures were 0.5-2 mg of purified RNAP, depending on properties of the particular enzyme. For the wild type RNAP, versions containing or lacking hexahistidine tags on β were purified and found to behave indistinguishably in the assays used here. B. subtilis core RNAP (21) and σ70 (22) were purified as described. Wild type and mutant RNAP holoenzyme (core β′βα2 plus σ70) was prepared by incubating a 5-fold molar excess of σ70 with core enzyme for 30 min at 30°C.
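For orientation, the salt concentration reached at any point of the elution can be estimated with a short calculation. The sketch below assumes that the 0-1.5 M NaCl gradient is linear over the 20 column volumes (the gradient shape is an assumption, not stated in the text), and the function name is hypothetical.

```python
def nacl_molar(column_volumes: float,
               start: float = 0.0,
               end: float = 1.5,
               length_cv: float = 20.0) -> float:
    """NaCl concentration (M) after a given number of column volumes,
    assuming a linear gradient from `start` to `end` over `length_cv`."""
    if not 0.0 <= column_volumes <= length_cv:
        raise ValueError("position lies outside the gradient")
    return start + (end - start) * column_volumes / length_cv


# Concentration at the start, midpoint, and end of the gradient.
for cv in (0, 10, 20):
    print(cv, nacl_molar(cv))
```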
Open Complex Longevity Assays-Linear DNA template (40 nM) carrying the T7A1 promoter (pIA171) (23)

TABLE I (fragment; entries given as plasmid, description, source)
Impact-CN™ system vector for the in-frame insertion into a polylinker upstream of an intein tag and CBP; New England Biolabs
pET21α; NdeI-BamHI PCR fragment of E. coli rpoA ligated between NdeI and BamHI of pET21c (Novagen) such that rpoA starts in the NdeI site and stops two nt (CC) before the BamHI site; This work
pRW408; lacIq-Ptrc-His6-HA-rpoB-bla-ColEI ori plasmid. Insertion of an NcoI-, BsaBI-digested 2195-106 PCR fragment from pRW408 into NcoI-, BsaBI-digested pRW408.
The N-terminal sequence of β from pRL702 is MAHHHHHHAYPYDVPDYAMVY (His6 tag, hemagglutinin tag, and wild type β start are underlined).

Single Round Pause Assays-Linear DNA template encoding the his pause signal was generated by PCR amplification of pIA171 (23). Halted A29 elongation complexes were formed during a 15-min incubation of 40 nM DNA template and 50 nM RNAP holoenzyme at 37°C in 50 μl of transcription buffer in the absence of UTP, with ApU at 150 μM, ATP and GTP at 2.5 μM, and CTP at 1 μM, with 32P derived from [α-32P]CTP (3000 Ci/mmol). Transcription was restarted by the addition of 20 μM GTP; 150 μM ATP, UTP, and CTP; and heparin to 100 μg/ml. Samples were removed at the times listed in the figure legends and after a final 5-min incubation with 250 μM each NTP (chase) and were quenched as above.
Sample Analysis-Samples were heated for 2 min at 90°C and separated by electrophoresis in denaturing acrylamide (19:1) gels (7 M urea, 0.5× TBE; 8% for pause assays, 15% for abortive initiation and open complex stability assays). RNA products were visualized and quantified using a PhosphorImager and ImageQuant Software (Amersham Biosciences). Pause half-life (the time during which half of the complexes reenter the elongation pathway) was determined by nonlinear regression analysis as described previously (25).
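The way a pause half-life can be extracted from a time course is sketched below. This is not the nonlinear regression procedure of Ref. 25; it swaps in a simpler log-linear least-squares fit and assumes that the fraction of complexes at the pause decays as a single exponential, f(t) = E·exp(−kt), where E is the pause efficiency extrapolated to time zero and the half-life is ln 2/k. The function name and the time course are hypothetical (the data are synthetic, generated with E = 0.8 and a 40-s half-life).

```python
import math


def fit_pause(times, fractions):
    """Fit ln(fraction paused) versus time by ordinary least squares.

    Returns (efficiency, half_life): the fraction extrapolated to t = 0
    and ln(2) divided by the fitted first-order escape rate.
    """
    logs = [math.log(f) for f in fractions]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    escape_rate = -slope  # first-order rate constant, per second
    return math.exp(intercept), math.log(2) / escape_rate


# Synthetic, noise-free time course (seconds, fraction of RNA at the pause).
times = [10, 20, 40, 80, 160]
fractions = [0.8 * 0.5 ** (t / 40) for t in times]
efficiency, half_life = fit_pause(times, fractions)
print(efficiency, half_life)
```

With real gel data, a weighted or truly nonlinear fit would usually be preferable, because the log transform inflates the noise of the late, low-signal time points.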

Co-overexpression of RNAP Subunits Allows Study of the Role of Proteobacterial Sequence Insertions-To allow facile expression, assembly, and purification of poorly assembled or toxic mutant RNAPs, such as β′ΔSI3, we designed a vector that expresses α, β, and β′ polypeptides from a single T7 promoter; the β′ subunit is fused to the CBP tag (see "Experimental Procedures"). We predicted that expressing the mutant core RNAP from this plasmid would facilitate its assembly in vivo and would allow purification via the CBP tag (Fig. 2). This approach has several advantages. First, only the fully assembled core (α2ββ′) RNAP is purified because β′ recruitment is the last step in the assembly pathway (26) and because the expression levels from this vector follow the assembly pathway of RNAP (α>β>β′; data not shown). Second, intein-mediated cleavage removes the CBP tag to release the assembled RNAP from the matrix; therefore, the purified enzyme does not carry additional protein segments. Third, the entire purification can be completed quickly, is relatively inexpensive, and yields RNAP that is sufficiently pure for in vitro transcription (Fig. 2). Fourth, the plasmid-encoded β and β′ subunits assemble preferentially with each other (and not with the chromosomal subunits; data not shown), allowing combination of substitutions in different subunits and a more homogeneous population of purified RNAPs. To purify RNAP with an altered β subunit, we also constructed a version of the co-overexpression plasmid with a hexahistidine tag at the N terminus of β (see Table I).
Using the co-overexpression plasmid, we were able to obtain active Δβ′SI3 RNAP (Fig. 2) as well as RNAPs with precise excisions of βSI1 or βSI2. We found that RNAP eluted directly from chitin beads was suitable for in vitro transcription but sometimes exhibited reduced activity relative to RNAP purified by conventional methods (27). We hypothesized that this low activity arose from residual binding of nucleic acids to the RNAPs; the addition of one chromatography step on heparin- or quaternary amine-Sepharose yielded overexpressed RNAP of an activity and purity comparable with that obtained by conventional purification (Fig. 2 and data not shown).
When the β′ΔSI3-intein-CBP fusion was expressed from a T7 promoter plasmid that did not encode the other RNAP subunits, essentially all tagged β′ was found in inclusion bodies (data not shown). Therefore, co-overexpression with α, β, and β′ subunits facilitates RNAP assembly. Perhaps translation of β and β′ from the same mRNA facilitates proper interaction, or the elevated level of the initially formed α2β complex simply allows plasmid-encoded β′ to compete effectively with chromosomally encoded β′ for assembly.
β′SI3 Stabilizes Open Initiation Complexes-Using RNAPs with precise SI deletions obtained by co-overexpression, we first tested the effects of the E. coli SIs on open initiation complexes. Fully mature E. coli open complexes, which form on many but not all cellular promoters, exhibit extended contacts of RNAP with DNA from ~−55 to ~+20 relative to the transcription start site and melting of the DNA duplex between ~−12 and ~+2. These E. coli open complexes are long lived and collapse back to closed complexes at rates of <0.01 s⁻¹ (28-31). In contrast, B. subtilis RNAP, which lacks insertion sequences, forms open complexes that are in rapid equilibrium with closed promoter complexes (32,33). In B. subtilis RNAP open complexes, the DNA downstream of the start site is not efficiently protected against DNase I digestion (32,34), and the melted region is shortened in the downstream direction (35). Interestingly, a deletion variant of the E. coli β subunit that includes part of βSI1 but extends beyond its boundaries (Δ186-433) forms open complexes that (at 37°C) exhibit a melting pattern similar to those formed by the B. subtilis enzyme (35).
To measure the relative stability of open complexes formed by wild type or ΔSI RNAPs, we challenged open complexes formed on the T7 A1 promoter with the polyanion heparin. Although for many promoters heparin binds only free RNAP that is in equilibrium with closed initiation complexes and has no effect on open complexes, heparin directly attacks open complexes formed at the T7 A1 promoter and displaces RNAP from the promoter DNA in addition to binding free RNAP (36). Upon heparin addition to 14 μg/ml, transcriptionally competent open complexes (as assayed by the addition of NTPs and formation of an RNA transcript at different time intervals following heparin challenge) disappeared at a pseudo-first order rate of 0.006 s⁻¹ for wild type and ΔβSI2 RNAP, about twice as fast for ΔβSI1 RNAP, and about 10 times faster for Δβ′SI3 RNAP and B. subtilis RNAP (Fig. 3A). We conclude that the characteristic instability of B. subtilis open complexes is mimicked by Δβ′SI3 E. coli RNAP and that βSI1, but not βSI2, also contributes to the stability of E. coli open complexes.
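To put these rates in more intuitive terms, a pseudo-first order rate constant k corresponds to a half-life of ln 2/k. The sketch below converts the reported wild type rate (0.006 s⁻¹) and the approximate 2-fold and 10-fold faster rates into half-lives. The multipliers are taken from the rounded descriptions in the text ("about twice", "about 10 times"), so the derived values are only approximate, and the function and variable names are hypothetical.

```python
import math

K_WT = 0.006  # s^-1, reported for wild type and the SI2 deletion RNAP

# Approximate rate constants implied by the text (multipliers are rounded).
rates = {
    "wild type / SI2 deletion": K_WT,
    "SI1 deletion (about 2x)": 2 * K_WT,
    "SI3 deletion / B. subtilis (about 10x)": 10 * K_WT,
}


def half_life(rate: float) -> float:
    """Half-life (s) of a first-order decay with rate constant `rate` (s^-1)."""
    return math.log(2) / rate


def fraction_remaining(rate: float, seconds: float) -> float:
    """Fraction of open complexes still competent after `seconds`."""
    return math.exp(-rate * seconds)


for label, k in rates.items():
    print(f"{label}: t1/2 = {half_life(k):.0f} s, "
          f"remaining after 60 s = {fraction_remaining(k, 60):.2f}")
```

Under these assumptions the wild type complexes have a half-life of roughly 2 min in heparin, whereas complexes decaying 10 times faster have a half-life of roughly 12 s.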
β′SI3 Promotes Abortive Initiation-A second property that distinguishes B. subtilis and E. coli RNAPs is an ability of B. subtilis RNAP to escape promoters at which E. coli RNAP is trapped in abortive synthesis. We found this previously using a variant of the T7 A1 promoter whose initially transcribed sequence (5′-AUCCCACACC... versus wild type 5′-AUCGAGAGGG...) appears to act cooperatively with strong contacts of E. coli RNAP to promoter DNA to trap the enzyme in abortive synthesis (24); B. subtilis RNAP is able to escape from the mutant T7 A1 promoter with relatively few abortive products made, and the patterns of abortive products also differ between the B. subtilis and E. coli enzymes (17).
To investigate whether any of the sequence insertions in E. coli RNAP might explain its characteristically poor escape from the mutant T7 A1 promoter, we measured the abortive to productive RNA product ratios for the various RNAPs on this promoter (Fig. 3B). Interestingly, deletion of β′SI3 reduced the abortive to productive product ratio, although only to a level still ~2-fold greater than found for B. subtilis RNAP. However, deletion of either SI in β dramatically increased the ratio. Thus, the E. coli sequence insertions profoundly affect promoter escape, although in different directions; βSI1 and βSI2 promote escape, whereas β′SI3 inhibits escape.
The patterns of abortive products produced also are revealing (Fig. 3B). Both ΔβSI1 and ΔβSI2 RNAPs produce an altered pattern of abortive products relative to wild type or β′ΔSI3 RNAPs (compare the 60-min lanes for each RNAP in Fig. 3B). However, although β′ΔSI3 RNAP exhibited increased promoter escape, like B. subtilis RNAP, it does not recapitulate the distinctive pattern of B. subtilis abortive products. We conclude that the absence of the proteobacterial sequence insertions substantially, although not completely, accounts for the distinctive initiation properties of B. subtilis RNAP.
β′SI3 Slows Escape from Pause Sites-Another key difference between the E. coli and B. subtilis RNAPs is their responses to certain hairpin-dependent pause sites. Both enzymes can recognize hairpin-independent pause signals as well as a B. subtilis P RNA pause site at which a nascent RNA hairpin 12 nt upstream from the pause is important in the presence, but not in the absence, of NusA (17). However, B. subtilis RNAP fails to recognize the his pause site, at which a nascent RNA hairpin 11 nt upstream from the pause site contributes a factor of 5-10 to the delay in pause escape through an interaction with the β flap domain of the E. coli RNAP (17,37).
The structural basis for this difference is not known; B. subtilis RNAP might be unable to respond to the RNA hairpin formation or to other components of this signal (the downstream DNA, the 3′-proximal RNA, and the nucleotides in the active site) that also slow pause escape independently of the RNA hairpin. We previously found that replacement of the E. coli β flap with the corresponding fragment of the B. subtilis β did not alter pausing (17), suggesting that the difference in their response to the his pause signal lay elsewhere. Thus, the sequence insertions in E. coli RNAP were the next logical possibility.
To ask if the E. coli sequence insertions play a role in pause site recognition, we first tested pausing on the his pause template. We found that, unlike B. subtilis RNAP, ΔβSI1, ΔβSI2, and Δβ′SI3 RNAPs all recognized the his pause site (Fig. 4). However, Δβ′SI3 RNAPs escaped from the pause site ~5 times faster than wild type and exhibited an ~2-fold decrease in the efficiency of pause site recognition. To ask if the residual pausing by Δβ′SI3 RNAPs was still hairpin-dependent, we tested the effect of an antisense oligonucleotide that pairs to nascent pause RNA including the two 5′-most nt of the pause hairpin stem (Fig. 4A) and that eliminates the 5-10-fold contribution of the hairpin to pausing by disrupting its structure (23). If β′SI3 somehow mediated the effect of the pause hairpin, despite their locations on opposite sides of the paused TEC, we would expect the antisense oligonucleotide to have little effect on pausing by Δβ′SI3 RNAP. However, the antisense oligonucleotide reduced the pause half-life of Δβ′SI3 RNAP by a factor of ~3 (Fig. 4D), suggesting that β′SI3 contributes to pausing through interactions that are largely independent of the hairpin interaction. We verified this conclusion by testing the effect of Δβ′SI3 on pausing at the hairpin-independent ops pause site, where the deletion had effects on half-life and efficiency nearly identical to the effects observed at the his pause site (Fig. 5). However, Δβ′SI3 did not fully recapitulate the pausing behavior of B. subtilis RNAP, which bypasses the his pause site without apparent delay (Fig. 4B) and transcribes through the ops pause site even more rapidly than does Δβ′SI3 RNAP (17).
We conclude that β′SI3 plays a central role in the strong pausing behavior of E. coli RNAP but that differences between E. coli and B. subtilis RNAP in addition to β′SI3 must contribute to reduced pausing by the latter enzyme. Interestingly, a partial deletion in β′SI3 (Δ1091-1130) has the opposite effect on pausing from that of the precise β′SI3 deletion that we studied; i.e. it increases rather than decreases pausing (14). We return to the implications of this discrepancy under "Discussion."

Multiple Deletions of E. coli RNAP Sequence Insertions-Given the distinct effects of βSI1 and β′SI3 on open complex longevity, abortive initiation, and pausing, we wondered how these effects would combine in an enzyme lacking both SIs. All attempts to produce such an enzyme failed, however (data not shown). Derivatives of pIA423 that expressed βΔSI1 and β′ΔSI3 never yielded assembled RNAPs bearing the deletions, even when we included on the co-expression plasmid rpoZ, the gene encoding the dispensable RNAP subunit that is reported to promote RNAP assembly (38).
Deletions of the Sequence Insertions Impair Growth of Bacteria-Finally, we wished to test the dispensability of sequence insertions in RNAP for bacterial growth. Studies conducted prior to the availability of high resolution RNAP structures on RNAPs lacking portions of βSI1, βSI2, or β′SI3 led to the idea that these were dispensable regions, at least for core RNAP functions (9,12-14,39). Now that we could define precise deletions of the sequence insertions and express functional enzymes containing these deletions in vivo, we wished to revisit this idea and ask to what extent the sequence insertions are required for bacterial growth.
To test the requirement for sequence insertions for bacterial growth, we expressed the wild type or ⌬SI subunits from plasmids on which their expression either singly or in combination with the other RNAP subunits could be regulated by IPTG and lac repressor encoded on the same plasmids (Table I). We transformed the plasmids into strains that express wild type RNAP from the chromosome (rich medium, wild type subunit from chromosome expressed) (Table II) or in which expression of the corresponding subunit from the chromosome could be inactivated at 39 or 42°C (for rpoC and rpoB, respectively; wild type subunit from chromosome not expressed, Table II).
We found that plasmid-encoded expression of mutant β subunit lacking SI1 or SI2 in the presence of wild type β encoded on the chromosome had little effect on cell growth. However, βΔSI1 was unable to substitute for wild type β at 42°C, and it compromised growth at 30°C when it contained a rifampicin-resistant amino acid substitution (SF531) not present in the chromosomal copy of rpoB and when rifampicin was added to the medium (Table II). Growth on rifampicin at 30°C was not restored when β SF531 ΔSI1 was co-expressed with β′ and α (conditions that yielded assembled βΔSI1 for purification). Thus, βΔSI1 appears to be incorporated into RNAP but unable to support cell growth. βΔSI2 was able to support growth on rich medium even at 42°C or in the presence of rifampicin at 30°C when combined with SF531. However, β SF531 ΔSI2 was unable to support growth on minimal medium containing rifampicin, when expressed either singly or together with α and β′. Thus, both sequence insertions in E. coli β appear necessary for growth on minimal medium, suggesting that they play some regulatory role in RNAP function that becomes essential in stringent growth conditions.
We found that β′SI3 also was required for growth in some conditions; however, the requirements for this sequence insertion are complex. A β′ΔSI3 plasmid could not be transformed into RL602 (Table I) in which expression of chromosomally encoded β′ is temperature-sensitive, even in the absence of IPTG (probably because low, uninduced expression of β′ΔSI3 produced too much defective enzyme when combined with the lowered expression of β′ in RL602). The β′ΔSI3 co-expression plasmid could be transformed into the conditional rpoC strain but blocked growth when induced at the permissive temperature and failed to support growth at the nonpermissive temperature. Although induction of β′ΔSI3 in a strain producing wild type β′ lowered plating efficiency by 10², it actually raised plating efficiency by >10³ relative to wild type β′, when plated in the presence of an antibiotic (microcin J25) that inhibits RNAP upon binding at a site near the β′SI3 in the RNAP secondary channel (62). Co-expression of β′ΔSI3 with β and α was not inhibitory in a strain expressing wild type RNAP (Table I) and conferred nearly 100% plating efficiency in the presence of microcin J25 (62). However, the growth rate of these colonies in the presence of microcin J25 is drastically slower than that of strains carrying the prototypical microcin J25 resistance mutation (rpoC T931I). On balance, it appears that β′SI3 is essential for normal growth of E. coli but that, at least in the presence of microcin J25, RNAP lacking β′SI3 can still support slow growth on rich medium.
We conclude that sequence insertions, although dispensable for basic RNAP function at all stages of the transcription cycle, nonetheless are important to growth of the bacterium. This may reflect effects of the sequence insertions on the basic steps in transcription by RNAP that become crucial for expression of certain genes or in some growth conditions but could also reflect roles in yet undescribed but essential interactions of transcription factors similar to the previously described interaction of the T4 Alc protein with βSI1 (15).

DISCUSSION

We have described a co-overexpression system for mutant E. coli RNAPs and its use to investigate whether sequence insertions in the β and β′ subunits could explain the different initiation and elongation properties of E. coli RNAP compared with B. subtilis RNAP, which lacks these insertions. We found that co-overexpression facilitates assembly of otherwise difficult to assemble or toxic mutant RNAPs and that the sequence insertions, most notably β′SI3, confer some of its distinctive biochemical properties on E. coli RNAP and partially account for its differences from B. subtilis RNAP. We will discuss the implications of these findings for the study of mutant RNAPs, the function of the E. coli sequence insertions with focus on β′SI3, and the evolution of sequence insertions in RNAP.
An RNAP Overexpression System That Promotes Efficient Assembly-Previous studies of toxic RNAP mutants have relied either on conditional expression of a tagged mutant subunit followed by tag affinity separation from wild type RNAP (or tagging and removal of the wild type enzyme from the RNAP preparation) (40) or in vitro reconstitution of RNAP by gradual renaturation of mixtures of the denatured subunits (19,41-43). β′ΔSI3 illustrates the limitation of these approaches. When expressed as an individual subunit in vivo, β′ΔSI3 fails to out-compete wild type β′ for assembly into useful amounts of mutant enzyme; when used in in vitro reconstitution protocols, it fails to assemble into RNAP (see "Results" and Ref. 14).
Co-overexpression of wild type and mutant tagged subunits solved these problems for β′ΔSI3 and has allowed us to isolate substantial quantities of many mutant RNAPs, including notably toxic mutants such as the βΔ(900-909) flap mutant RNAP (37). Thus, the co-overexpression plasmid provides a powerful tool for genetic and biochemical analysis of RNAP. The resulting enzymes are substantially pure after two steps, primarily because chitin affinity chromatography is much more selective than the hexahistidine-Ni2+-nitrilotriacetic acid-agarose method that has been widely used to date (44), and they have the advantage of being assembled by the normal pathway in vivo. We anticipate that the co-overexpression plasmid will also prove useful for selection of RNAP suppressor mutants of inviable amino acid substitutions in the enzyme. Finally, because RNAP is pure enough for in vitro transcription assays after just the single chitin column step, the method should facilitate rapid screening of sets of closely related alterations to RNAP.
In using the co-overexpression method for mutant RNAP assembly and purification, we encountered a few problems that require care to avoid. First, the chitin matrix apparently can break down when batch binding and elution are used, which prevents retention of the tagged enzyme on the solid chitin matrix. We used slow passage of lysates through a column of chitin matrix to avoid this problem. Second, the efficiency of DTT-mediated cleavage of the Sce VMA intein that connects β′ to the CBP was usually less than 100%, and both the binding capacity and the cleavage efficiency appeared to vary among lots of chitin matrix. It is advisable to test new batches of matrix before committing valuable samples; some loss of material due to inefficient cleavage appears unavoidable. Third, we sometimes observed loss of expression of one or more of the RNAP subunits when the co-expression plasmid was maintained in recA+, T7 RNAP-expressing strains, apparently due to plasmid rearrangements, mutations, or recombination with the chromosomal copies of rpoA, rpoB, or rpoC. The last problem was substantially eliminated by maintaining the plasmids in recA strains such as DH5α except during expression. Finally, we emphasize that not all RNAP mutants can be successfully recovered by the co-overexpression approach. For instance, a deletion slightly larger than β′ΔSI3 (Δ932-1137 versus Δ943-1130 in β′ΔSI3) failed to assemble and yielded only insoluble β′Δ(932-1137) aggregates in vivo.
[Footnotes to the plating-efficiency table]
a RNAP subunits were expressed singly from pRL702 (β) or pRL663 (β′), or co-expressed from pIA423 (β′βα2) derivative plasmids (see Table I). To measure plating efficiencies, plasmids were transformed into the indicated strains, grown to A600 of ~0.3, and incubated with 1 mM IPTG for 1 h to induce expression of WT or mutant subunit. Serial dilutions of these cultures were then plated on agar medium containing 100 μg of ampicillin/ml, with or without 1 mM IPTG for calculation of plating efficiencies. All plating efficiencies are the average of quadruplicate measurements among which the S.E. was <30%. Although the co-expression plasmid ordinarily depends on T7 RNAP for overexpression of RNAP, the amount of RNAP generated from transcripts that initiate at other E. coli RNAP promoters on the plasmid is sufficient to support growth of an E. coli strain not containing T7 RNAP, provided that IPTG is present to remove lac repressor from the operator site at the beginning of rpoA on the plasmid.
b Plating efficiency = (colonies formed with IPTG/colonies formed without IPTG) in strain RL585 at 30°C for the β subunit and strain DH5α at 37°C for the β′ subunit (see Table I and "Experimental Procedures"). Under these conditions, the WT β and β′ subunits are present.
c Plating efficiency = (colonies formed at 42°C with IPTG/colonies formed at 30°C without IPTG) in strain RL585. At 42°C, the RL585 chromosomal copy of β is inactivated (57). Plating efficiency when WT β′ was absent could not be determined, because the strain that conditionally expresses β′ (RL602) (40) could not be transformed with plasmids encoding β′ΔSI3 even under noninducing conditions. Plating efficiencies for the co-overexpression plasmids were determined using derivatives of strains RL585 and RL602 in which DE3 was integrated in their chromosomes.
d,e Plating efficiency on antibiotic (inhibitor) plates = (colonies formed with IPTG and antibiotic/colonies formed with IPTG but without antibiotic). β mutants (top part) were grown in RL585 (d) or DH5α (e) at 30°C with 25 μg of rifampicin/ml. β′ mutants (bottom part) were grown in DH5α at 37°C with 40 μg of microcin J25/ml.
f ND, not determined.
g RL585 transformed with βΔSI2 plasmid formed small colonies (~1/5 of WT in diameter) in the absence of the WT β (at 42°C or in the presence of rifampicin).
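The plating efficiencies reported for the mutant subunits are ratios of colony counts (colonies formed under the test condition divided by colonies formed under the reference condition), averaged over quadruplicate measurements. The arithmetic can be sketched as follows. This is only an illustration: the colony counts are hypothetical, the function names are ours, and treating the reported S.E. as a fraction of the mean is an assumption rather than something stated in the text.

```python
from math import sqrt
from statistics import mean, stdev


def plating_efficiency(colonies_test, colonies_reference):
    """Ratio of colonies under the test condition (e.g. with IPTG)
    to colonies under the reference condition (e.g. without IPTG)."""
    if colonies_reference <= 0:
        raise ValueError("reference colony count must be positive")
    return colonies_test / colonies_reference


def summarize(replicates):
    """Return (mean efficiency, standard error as a fraction of the mean).

    Assumption: the S.E. threshold is interpreted relative to the mean.
    """
    effs = [plating_efficiency(t, r) for t, r in replicates]
    avg = mean(effs)
    se = stdev(effs) / sqrt(len(effs))
    rel_se = se / avg if avg else float("nan")
    return avg, rel_se


# Hypothetical counts (test, reference) for four replicate platings.
counts = [(90, 100), (110, 100), (95, 100), (105, 100)]
avg, rel_se = summarize(counts)
print(f"plating efficiency = {avg:.2f}, relative S.E. = {rel_se:.1%}")
```

A strongly toxic mutant would give an efficiency far below 1, whereas a well-tolerated subunit would give a value near 1.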
Role of β′SI3 in RNAP Function-Although both SI1 and SI2 in β exhibited some effects on the biochemical properties of RNAP (and were essential for full viability in vivo), the most dramatic effects on RNAP properties were observed for β′SI3. β′SI3 profoundly affected initiation and pausing by E. coli RNAP, in both cases accounting for many but not all of its differences from B. subtilis RNAP. We will consider the effects on initiation and pausing separately, although the underlying changes in RNAP's properties could be related.
β′SI3 stabilized open complexes formed at the T7 A1 promoter against disruption by heparin. This stabilization could reflect either direct contact with the downstream DNA, as previously proposed (16) (11), also favoring its possible interaction with downstream DNA. However, this structure is of an enzyme that lacks β′SI3 and was obtained in the absence of DNA. The β′G loop is poorly ordered in both the yeast RNAPII and T. aquaticus RNAP crystal structures and was found in different conformations in the ordered portions of the two structures, suggesting that it is capable of significant motion. Further, β′SI3 itself is not visible in a recent electron microscopy crystal structure of E. coli RNAP, suggesting that it is even more mobile than the β′G loop (10). Thus, β′SI3 folding could be coupled either directly or indirectly to interactions of RNAP with downstream DNA. Many DNA-binding proteins undergo folding transitions in protein segments distant from the actual DNA binding surface (45). These indirect but coupled protein folding events are thought to contribute significantly to the avidity of protein-DNA interactions. Thus, stabilization of open complexes by β′SI3 could arise not via its direct interaction with the downstream DNA but via an indirect coupling of its folding to open complex formation. Further study will be required to distinguish these possibilities.
Both the direct and indirect explanations of β′SI3 function also could apply to its effect on transcriptional pausing. Both formation of paused transcription elongation complexes and their slow escape are enhanced by an interaction of downstream DNA with RNAP (46-49). Thus, β′SI3 could strengthen the pause-enhancing downstream DNA interaction either directly or by the indirect mechanism described above. Interestingly, the effects on pausing of deleting β′SI3 are quite similar to the effects of deleting the β′ jaw domain of E. coli RNAP, for which a variety of data suggest a direct DNA interaction (49).
[Figure legend, partial] ... is more open in the E. coli structure than in the T. aquaticus structure). Pink, β′; cyan, β; white, α and ω; magenta, Mg2+ at catalytic center. (Note that the ω subunit is included based on the T. aquaticus RNAP structure; it was not seen in the E. coli RNAP electron microscopy structure.) The locations of βSI1 and βSI2 are depicted in red space fill (the density for βSI1 represents only a portion of the space expected to be occupied by this insertion; the remainder is disordered in the electron crystal structure (10)). The location of β′SI3, which is completely disordered and not visible in the electron crystal structure, is depicted based on its junction to the β′G loop as red balls. The locations of sequence insertions shown in Fig. 1 are marked by green letters and lines. A, view into the main channel of RNAP with the enzyme oriented for transcription from left to right. B, view into the NTP entry channel of RNAP.
In the case of pausing, however, a third explanation is possible. β′G, in which β′SI3 is inserted, appears to cross-link to the RNA 3′ nt in the his paused transcription complex 3 as well as to the RNA 3′ nt in arrested complexes (50) and in certain artificial transcription complexes that may mimic paused transcription complexes (51). Further, amino acid substitutions in β′G, near the site of β′SI3 insertion, strongly affect chain elongation and transcriptional pausing and termination (40,51). If movements of the β′G loop are involved in transcriptional pausing, this could readily explain why deletion of β′SI3 has such a strong effect on pausing. It also might explain why a partial deletion in β′SI3 (Δ1091-1130) as well as monoclonal antibody binding to β′1091-1130 in wild type RNAP greatly increases pausing (14), whereas complete deletion of β′SI3 (Δ943-1130) decreases pausing (Figs. 4 and 5). By altering the structure of β′SI3, both β′Δ(1091-1130) and antibody binding could interfere with movements of the β′G loop necessary for rapid nucleotide addition, whereas removal of β′SI3 could favor such movements.
Roles of βSI1 and βSI2 in Abortive Initiation-βSI1 and βSI2 both exhibited effects on initiation complexes, either in open complex longevity, abortive initiation, or both. Interestingly, both SIs increased abortive initiation significantly, although they decreased open complex longevity modestly (βSI1) or had no effect on it (βSI2). This suggests that RNAP must be exceptionally sensitive to changes in its structure during abortive initiation. This could be related to the sequential rearrangements of contacts between core RNAP and σ that are thought to occur during the initial stages of RNA synthesis (52). Perhaps βSI1 and βSI2 somehow facilitate the core-σ rearrangement during initial transcript synthesis. Alternatively, they could indirectly affect the conformation of the main channel of RNAP such that their removal increases the rate of abortive transcript release.
Sequence Insertions Are Ubiquitous in Bacterial RNAPs-Finally, we note that sequence insertions are not limited to those found in the RNAPs from enteric and thermophilic bacteria. A search of sequences now available for β and β′ subunits in bacteria revealed that sequence insertions are ubiquitous and that the positions of these insertions are not restricted to those observed in the enteric and thermophilic bacteria (Figs. 1 and 6). Rather, the sequence insertions differ in size, sequence, and location but are predicted to be surface-exposed (Fig. 6). Such a distribution of sequence insertions is consistent with the idea that they confer species-specific properties on core RNAP, either by modulating its enzymatic activity directly or through interactions with transcription factors. The relatively low degree of divergence among SIs in closely related proteobacterial species (see Introduction) also lends support to this idea.
This pattern of surface-exposed sequence insertions is reminiscent of the small subunits of eukaryotic RNAPs, which also are surface-exposed (at least in RNAPII); vary in composition to some extent among RNAPI, RNAPII, and RNAPIII; and can affect the biochemical properties of the enzyme yet are dispensable for core function. The RNAPII subunits 4 and 7, for instance, are readily dissociated from RNAP and affect its biochemical properties but are dispensable for nucleotide addition (53,54); deletion of RNAPII subunit 9, like deletions in β′SI3, affects responses of the RNAPs to their respective RNA cleavage factors, GreB (14) and TFIIS (55). Perhaps the bacterial sequence insertions are like these small eukaryotic RNAP subunits, conferring distinctive properties on the RNAPs of different bacterial species much as the small RNAP subunits may confer distinctive properties on RNAPI, RNAPII, and RNAPIII.