Formation and Fate of a Complete 31-Protein RNA Polymerase II Transcription Preinitiation Complex

Background: Analyses of pol II transcription are hampered by the difficulty of preparing an abundant functional preinitiation complex (PIC).
Results: We reconstituted milligram quantities of a complete 31-subunit PIC.
Conclusion: An intermediate comprising TBP, TFIIE, TFIIH, and DNA could be isolated and combined with TFIIB and pol II-TFIIF to generate the PIC.
Significance: The results enable definitive biochemical and structural studies of the transcription initiation machinery.

Whereas individual RNA polymerase II (pol II)-general transcription factor (GTF) complexes are unstable, an assembly of pol II with six GTFs and promoter DNA could be isolated in abundant homogeneous form. The resulting complete pol II transcription preinitiation complex (PIC) contained equimolar amounts of all 31 protein components. An intermediate in assembly, consisting of four GTFs and promoter DNA, could be isolated and supplemented with the remaining components for formation of the PIC. Nuclease digestion and psoralen cross-linking mapped the PIC between positions −70 and −9, centered on the TATA box. Addition of ATP to the PIC resulted in quantitative conversion to an open complex, which retained all 31 proteins, contrary to expectation from previous studies. Addition of the remaining NTPs resulted in run-off transcription, with an efficiency that was promoter-dependent and was as great as 17.5% with the promoters tested.

The initiation of RNA polymerase II (pol II) transcription is a multistage process, most likely for the purpose of multifactorial control. It extends far beyond the formation of the first phosphodiester bond; a transcript of ~25 nucleotides must be synthesized before a transition occurs from initiation to RNA chain elongation (1,2). The proteins responsible for initiation, a set of general transcription factors (GTFs) and the polymerase, associate in a so-called preinitiation complex (PIC) (3,4) and largely dissociate at every round of transcription. Evidence for a PIC was previously obtained with nuclear extract or with partially purified GTFs assembled on immobilized promoter DNA (5-9). Due to the poor efficiency of the reaction and the trace amounts of protein involved, detection was possible only by the synthesis of a radiolabeled transcript and by immunoblotting.
An abundant, homogeneous, and soluble PIC is required to elucidate the mechanisms of initiation and regulation of transcription. We sought to assemble a PIC with pure GTFs and pol II from the yeast Saccharomyces cerevisiae. Four GTFs (transcription factor (TF) IIA, TFIIB, the TATA box-binding protein (TBP), and TFIIE) were available in recombinant form, but the remaining GTFs (TFIIF and TFIIH) could be obtained only by isolation from yeast and presented technical difficulties. Pure TFIIF is largely insoluble, and TFIIH, an 11-subunit complex, invariably dissociates upon isolation (10). The TFIIF problem was solved by identifying a detergent capable of solubilizing the protein without an effect on transcriptional activity. The instability of TFIIH was traced to a previously unrecognized subunit, termed Tfb6 (11), which provokes the dissociation of Ssl2, the helicase responsible for conversion of a closed (fully double-stranded) promoter to the open state (with ~15 bp unwound in the form of a "transcription bubble"). Isolation of TFIIH from a tfb6 deletion strain of yeast resulted in a good yield of the complete 10-subunit protein, lacking only Tfb6 (therefore referred to here as TFIIH*) and active in transcription.
With all the GTFs in suitable form in hand, we investigated the assembly of a PIC and arrived at an efficient procedure for obtaining a stable functional complex. We found intermediates consistent with the emerging picture of PIC assembly in vivo. The results were also informative about the fate of the complex upon the initiation of transcription. The way is now open to structure determination of the PIC and to dissection of the initiation process.

EXPERIMENTAL PROCEDURES
Oligonucleotides-Short fragments of the HIS4 promoter DNA (HIS4(−81/+1) and HIS4(−81/+19)) were obtained by annealing equimolar amounts of complementary oligonucleotides (Integrated DNA Technologies). Other HIS4 promoter templates were obtained by restriction digestion of a concatemeric form as described below. HIS4 promoter DNA was amplified by PCR using two primers with EcoRV sites at both ends and was cloned into the pDrive vector (Qiagen). The plasmid construct was digested with EcoRI, and the promoter DNA fragment was purified and concentrated to 240 μg/ml using a Vivaspin 500 concentrator (5000 Mr cutoff; Vivascience). The fragment was self-ligated in 20 μl of ligation buffer with 2 units of T4 ligase (New England Biolabs) to obtain a concatemer (usually four to six copies). The concatemer DNA was purified by agarose gel electrophoresis and cloned into pUC18. The XL10-Gold strain (Stratagene) harboring the plasmid was grown in 2-6 liters of LB medium, and the plasmid was isolated using a plasmid Gigaprep kit (Qiagen). After restriction digestion with EcoRV, the plasmid DNA fragment (~3.8 kbp) was precipitated by adding PEG 6000 and NaCl to final concentrations of 6-8% and 500 mM, respectively. The promoter DNA in the supernatant was extracted with phenol/chloroform/isoamyl alcohol (25:24:1) and chloroform/isoamyl alcohol (24:1) and precipitated with ethanol. The DNA pellet was resuspended in gel filtration buffer (10 mM HEPES (pH 7.6), 300 mM potassium acetate, and 5 mM DTT) and fractionated on a Superose 6 column (GE Healthcare), yielding 1-3 mg of fragment. Thus, non-native ATC and GAT were retained at the 5′- and 3′-ends (see Fig. 1A).
PIC Reconstitution on a Preparative Scale-All reconstitution experiments were performed on ice or at 4°C with proteins purified as described previously (11,13). Promoter DNA (0.5 nmol) was mixed with 0.75 nmol of TFIIB, 0.75 nmol of TFIIA, 0.5 nmol of TBP, 0.7 nmol of TFIIE, and 0.3 nmol of Tfb6Δ-TFIIH (TFIIH*) in 40 μl of buffer(500) (20 mM HEPES (pH 7.6), 5 mM DTT, 2 mM magnesium acetate, 5% glycerol, and the mM concentration of potassium acetate in parentheses). The protein mixture was dialyzed in steps of buffer(300), buffer(220), and buffer(150) for at least 4 h at each step and then combined with 0.25 nmol of pol II-TFIIF complex. The mixture was further dialyzed into buffer(100) and buffer(80) before loading on a 10-40% (v/v) glycerol gradient containing 20 mM HEPES (pH 7.6), 5 mM DTT, 2 mM magnesium acetate, and 80 mM potassium acetate. After centrifugation at 40,000 rpm in a Beckman SW 60 rotor for 9 h, the gradient was fractionated using a PGF Piston Gradient Fractionator™ (BioComp Instruments, Inc.). The fractions were kept at −80°C without loss of transcriptional activity. For isolation of the open complex, the PIC was reconstituted in the same way; incubated with 1.6 mM ATP, 0.5 mM GTP, and 0.5 mM CTP for 15 min; and sedimented on a glycerol gradient containing 20 mM HEPES (pH 7.6), 5 mM DTT, 2 mM magnesium acetate, 80 mM potassium acetate, 1.6 mM ATP, 0.5 mM GTP, and 0.5 mM CTP. The pol II-TFIIF complex was omitted in the reconstitution of an intermediate complex, and glycerol gradient centrifugation was performed for 5 h at 60,000 rpm in a Beckman SW 60 rotor.
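The stoichiometry of the preparative mixture can be tabulated with a short script. This is only an illustrative sketch: the amounts are those listed in the paragraph above, while the function name `molar_excess` and the choice of pol II-TFIIF (the smallest amount listed) as the reference component are assumptions made here for the example.

```python
# Amounts (nmol) as listed in the preparative reconstitution paragraph.
amounts_nmol = {
    "promoter DNA": 0.5,
    "TFIIB": 0.75,
    "TFIIA": 0.75,
    "TBP": 0.5,
    "TFIIE": 0.7,
    "TFIIH*": 0.3,
    "pol II-TFIIF": 0.25,
}


def molar_excess(amounts, reference="pol II-TFIIF"):
    """Return each component's molar ratio relative to the reference component.

    The reference is an assumption for illustration: pol II-TFIIF is simply
    the component present in the smallest amount in the mixture.
    """
    ref = amounts[reference]
    return {name: value / ref for name, value in amounts.items()}


ratios = molar_excess(amounts_nmol)
for name, ratio in ratios.items():
    print(f"{name:>13}: {ratio:.1f}x")
```

Under this assumption, every GTF and the promoter DNA come out in molar excess over the pol II-TFIIF complex, in line with the design of the analytical reactions described in RESULTS.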
Exonuclease Footprinting-For determination of the downstream boundary of the PIC, the 5′-end of an upstream primer (5′-GGATATGACTATGAACAGTAG-3′) was labeled with [γ-32P]ATP using T4 polynucleotide kinase. HIS4(−96/+112) was amplified by PCR in a 1-2-ml reaction using the 32P-labeled upstream primer and downstream primer (5′-TATTCCATGAGGCCAGATC-3′) and purified by electrophoresis on a 2% agarose gel. The labeled DNA (1 pmol) was incubated for 1 h at room temperature with 2 pmol of TFIIB, 1.6 pmol of TFIIA, 1.1 pmol of TBP, 2.4 pmol of TFIIE, 1.5 pmol of TFIIH*, and 1.2 pmol of pol II-TFIIF complex in 8 μl of buffer A; combined with 12 μl of buffer B; and incubated for 1 h at 4°C. The reconstituted PIC was combined with an equal volume of 2× NTP buffer (1.6 mM NTP(s) or 0.5 mM non-hydrolyzable analog ATPγS, 10 mM magnesium acetate, and 5 units of RNaseOUT) and incubated for 30 min at 30°C. Exonuclease III digestion was performed with 200 units of exonuclease III (New England Biolabs) for 9 min at 30°C and stopped by adding 185 μl of stop buffer II (10 mM Tris (pH 7.5), 300 mM sodium acetate (pH 5.5), 5 mM EDTA, 0.7% SDS, 0.1 mg/ml glycogen, 0.013 mg/ml proteinase K, and 0.5 mg/ml salmon sperm DNA (Invitrogen)). The products were precipitated with 650 μl of 100% ethanol and kept overnight at −20°C. The DNA pellet was recovered by centrifugation at maximum speed for 1 h, dried at 37°C, and resuspended in 10 μl of gel loading buffer (95% formamide, 0.02% bromphenol blue, 5 mM EDTA, and 0.025% xylene cyanol). The products were analyzed by denaturing 6% polyacrylamide gel electrophoresis and detected with a PhosphorImager.
Psoralen Cross-linking of the PIC-The glutaraldehyde-fixed PIC was combined with psoralen (20 μg/ml) and irradiated with a 360-nm long-wavelength ultraviolet fluorescent lamp. DNA was denatured in the presence of glyoxal and spread for electron microscopy as described (15). Grids were scanned at a magnification of ×20,000 on a charge-coupled device (4096 × 4096 pixels, Gatan UltraScan™ 4000) with a JEOL 1230 electron microscope. The size of denatured bubbles was calculated from the average of the lengths of the two halves of the bubble.
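The bubble-size calculation described above can be sketched as follows. This is a minimal illustration, assuming that each half of a bubble (each single strand) has already been traced and converted to bases; the function name `bubble_size` and the example values are hypothetical and are not taken from the measurements reported here.

```python
from statistics import mean, stdev


def bubble_size(half_a, half_b):
    """Size of one denatured bubble: the average of the lengths of its two halves."""
    return (half_a + half_b) / 2.0


# Hypothetical traced lengths (in bases) for the two halves of three bubbles.
traced_halves = [(66.0, 70.0), (64.5, 67.5), (69.0, 71.0)]

sizes = [bubble_size(a, b) for a, b in traced_halves]
print("per-bubble sizes:", sizes)
print(f"mean = {mean(sizes):.1f} bases, sd = {stdev(sizes):.1f} bases")
```

Averaging the two halves reduces the effect of tracing error on either strand before the per-molecule values are pooled.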

RESULTS
Assembly and Isolation of Yeast pol II PIC-To identify a suitable DNA for assembly of a PIC, we performed transcription with a series of fragments of the HIS4 promoter and a mixture of pure transcription proteins (TFIIA, TFIIB, TBP, TFIIE, TFIIH*, and pol II-TFIIF complex). All fragments extending from −84 to +74 with respect to the first transcription start site at position +1 (Fig. 1A) yielded run-off transcripts of the expected lengths (Fig. 1B). A fragment truncated at position +50 failed to support transcription, consistent with a previous study that identified a requirement for promoter DNA extending at least 50 bp downstream from the transcription start site (17). All proteins except TFIIA were required for transcription (Fig. 1C), consistent with previous studies of transcription in vitro (12,18). The amount of pol II-TFIIF complex was limiting, with all other proteins and DNA added in molar excess and saturating for activity; the yield of the reaction was 0.076 transcripts per pol II-TFIIF complex. Run-off transcription was ~95% complete in 15 min and was limited to a single round (Fig. 1, D and E). Similar results were obtained with other promoters, except for variation in the level of transcription (Table 1).
It was not possible to perform the reaction on a preparative scale by increasing the concentration of components, as a 10-fold increase resulted in precipitation, and no transcription was obtained (data not shown). A concentrated mixture was, however, soluble at elevated ionic strength. TFIIH* could be combined with excess TFIIA, TFIIB, TBP, TFIIE, and HIS4 promoter DNA fragment −96/+112 in 0.5 M potassium acetate and dialyzed to 0.15 M without precipitation. The resulting GTF-DNA complex was combined with the pol II-TFIIF complex, dialyzed to 0.1 M potassium acetate, and sedimented on a 10-40% glycerol gradient. A single peak in the center of the gradient (Fig. 2A, lanes 7 and 8) contained equimolar amounts of all transcription proteins (Fig. 2E), as shown by SDS-PAGE and densitometry, which resolved the 31 proteins in the PIC.

MARCH 1, 2013 • VOLUME 288 • NUMBER 9

JOURNAL OF BIOLOGICAL CHEMISTRY 6327
The same results were obtained with a minimal DNA fragment (−81/+1) (Fig. 2E, lane 2). Electron microscopy provided support for the assembly of a complete PIC. Peak glycerol gradient fractions, embedded in stain, disclosed fields of particles virtually identical to one another except for differences in direction of view (Fig. 3A). The particles often appeared bipartite, with each part comparable in size to a pol II-TFIIB-TBP-DNA complex. No uniform particles were obtained when the experiment was repeated with the omission of any one of the GTFs.
The location of the PIC on the HIS4 promoter DNA fragment was mapped by reaction with psoralen, which cross-linked the two DNA strands (Fig. 3B). A region of DNA bound by protein was protected from cross-linking and appeared as a single-stranded bubble following denaturation. Analysis of the PIC in this way gave rise to a bubble at one end of the promoter fragment (Fig. 3B), in keeping with the location of the TATA box toward one end of the fragment. The contour length of the bubble corresponded to 67.6 ± 2.6 bp (average of 57 molecules) (Table 2) and increased by 10.9 bases (to 78.5 ± 5.0 bp) upon the addition of ATP, GTP, and CTP (average of 85 molecules), although the percentage of molecules with a bubble was much lower in this case (Table 2).
The fully assembled PIC isolated by gradient sedimentation utilized the same start site as PICs obtained by simple mixing (Fig. 2B). The level of transcription, 0.106 ± 0.012 transcripts per PIC, was similar to levels obtained by simple mixing of pol II with all factors added in excess. There was no increase in transcription upon supplementation with additional TFIIB, TBP, TFIIE, or TFIIH*, further attesting to the completeness of the isolated PIC (Fig. 2C).
Further evidence that the PIC represents a stable entity, rather than a mixture capable of transcription with additional factors, came from a template challenge experiment. A PIC formed on HIS4(−96/+112) was transcribed in the presence of the shorter promoter fragment HIS4(−96/+74). Only the pair of run-off transcripts from the longer fragment was observed (Fig. 2D, lanes 1-3), but not the transcripts expected from the shorter fragment (lane 4).
Isolation of the GTF-DNA Intermediate-We investigated the intermediate formed in the first step of PIC assembly by sedimentation on a glycerol gradient. Two complexes were resolved, one containing equimolar amounts of four GTFs (TFIIA, TBP, TFIIE, and TFIIH*) as well as promoter DNA (Fig. 4A), and a second, slower sedimenting complex containing TFIIA, TBP, TFIIB, and promoter DNA. When supplemented with TFIIB and pol II-TFIIF, the four-GTF complex supported transcription (Fig. 4B), whereas little transcription was obtained when supplemented with the pol II-TFIIF complex alone, confirming the absence of TFIIB from the gradient-purified preparation. The four-GTF complex therefore provides a platform onto which pol II-TFIIF and TFIIB assemble to form the PIC.
Isolation of the Open Complex-The PIC assembled by simple mixing of all components at low concentration exhibited downstream and upstream barriers to digestion by exonuclease III at about positions −9 and −70, respectively, with respect to the first transcription start site (Fig. 5A). Upon the addition of ATP (but not the non-hydrolyzable analog ATPγS), the downstream barrier disappeared, and pauses or stops in digestion downstream (between +7 and +46) were intensified, indicative of essentially complete conversion from closed to open complexes. Open complex formation was dependent on the inclusion of TFIIE and TFIIH* in the PIC (Fig. 5). The shift in the downstream boundary of the PIC is consistent with the idea of "promoter scanning" proposed to explain the location of transcription start sites 40-120 bp downstream of the TATA box in yeast (19). There was no further change in the pattern of exonuclease III digestion upon the addition of GTP, CTP, and UTP, consistent with the small percentage of open complexes that give rise to run-off transcripts (~7.6%, as noted above).
The PIC in the presence of ATP (and also GTP and CTP) appeared in faster sedimenting fractions (Fig. 6, A and D). Following sedimentation on the glycerol gradient, the PIC retained its entire complement of 31 polypeptides (Fig. 6B). The largest subunit of pol II (Rpb1) was hyperphosphorylated (Fig. 6C), producing a quantitative mobility shift from the starting position (IIa) to that characteristic of a hyperphosphorylated state (IIo). Electron microscopy revealed a bipartite structure of the same size, regardless of the presence or absence of nucleotides (Fig. 6E).

DISCUSSION
The notable finding from this work is the isolation of an abundant homogeneous pol II transcription PIC. It was uncertain, even doubtful, that such a complex could be formed and that it would resist the rigors of isolation. In our experience, complexes of pol II with TFIIB, with TFIIF, and with DNA dissociate during handling; we hoped, however, that a complex with all of the factors would be more stable than those with individual ones, and this hope was realized.
The procedure we developed for the assembly of the complete complex may resemble the pathway for PIC formation in vivo. An intermediate comprising TBP, TFIIE, TFIIH, and DNA could be isolated and then combined with TFIIB and pol II-TFIIF to generate the PIC. Support for the significance of this intermediate comes from studies of transcription reinitiation in vitro and in vivo. Following the addition of NTPs to a PIC formed with nuclear extract on immobilized promoter DNA, four GTFs (TFIIA, TFIID, TFIIE, and TFIIH), as well as Mediator, were retained on the DNA as revealed by immunoblot analysis. The addition of TFIIB, TFIIF, and pol II to the immobilized complex enabled transcription. The complex retained on the DNA was termed a reinitiation "scaffold" (7). A similar complex may be responsible for the identification by ChIP-chip analysis of genes enriched for TFIIH and Mediator (20). The assembly intermediate we have isolated may be regarded as a scaffold in the full sense of the word: it supports promoter DNA positioned with respect to pol II, following a path around pol II with little, if any, direct contact with the pol II surface.
Promoter-pol II interaction takes place upon the addition of ATP and open complex formation. We have shown that virtually all PICs undergo the transition to the open complex (Fig. 5), which can be re-isolated without loss of any of the 31 protein components (Fig. 6). In this respect, our findings differ from the unstable open complex obtained by others with immobilized templates. In the human system, PICs lose activity 1 min after the addition of ATP (21,22), and the activity can be partially restored by additional TFIIE (8); in the yeast nuclear extract system, PICs incubated with ATP rapidly undergo dissociation of pol II, TFIIB, and TFIIF (7,23). The loss of pol II is unexpected in light of studies demonstrating the high affinity of pol II for open promoter DNA. Indeed, if pol II is truly lost from the complex formed on immobilized DNA, it is unlikely to represent an intermediate on the pathway to transcription. We cannot explain this previous result of others, but the very stable open complex we have isolated, retaining all protein components, is demonstrably relevant, as shown by its capacity for transcription.
We have mapped the location and extent of the PIC on promoter DNA in both closed and open complexes by nuclease digestion and psoralen cross-linking. The barriers to exonuclease III digestion of the PIC around positions −70 to −9 are in good agreement with the observed size and location of the 68-bp psoralen bubble at the end of the DNA fragment and are consistent with previous protein-DNA cross-linking studies performed with yeast nuclear extract (24) and with a reconstituted human system (25). The TATA sequence, between positions −63 and −56, would be located off-center within the PIC, with an additional 40 bp lying within the PIC on the downstream side.
There was no loss of transcriptional activity of the PIC upon isolation. The level of transcription with the purified PIC was approximately the same as that obtained by direct mixing of pol II with excess GTFs but was much higher than that obtained by direct mixing of pol II and equimolar amounts of GTFs (data not shown), indicating that one or more of the GTFs was only partially active.
The level of transcription was promoter-dependent, ranging from 0.04 to 0.175 transcripts per PIC with the three promoters tested. In view of the homogeneity of the isolated PIC and the virtually complete transformation from closed to open complexes, it seems likely that the failure to achieve 100% transcription efficiency reflects some limitation(s) of the initiation process itself, in formation of the first phosphodiester bond, in events leading to a transcript length of ~25 nucleotides, or in the subsequent transition from initiation to elongation. It is noteworthy that the observed initiation rate in vitro (on the order of 0.1/min) (Fig. 1, D and E) is much slower than the initiation rate in vivo, e.g. ~0.15/s at an enhanced HIS3 promoter (26) and 0.25/s at the hsp70 promoter (27). The fraction of productive PICs and the initiation rate are doubtless influenced by other factors, such as activators and coactivators (28). There is now the possibility of separating inactive complexes from those engaged in transcription and determining the basis for the difference between them.
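To put the rate comparison on a common scale, the quoted values can be converted to the same time unit. The sketch below uses only the numbers stated in the paragraph above; the variable and function names are illustrative, and the in vitro rate is taken as exactly 0.1/min even though the text gives it only as an order of magnitude, so the resulting ratios are rough estimates.

```python
SECONDS_PER_MINUTE = 60

# Rates quoted in the text; the in vitro value is "on the order of 0.1/min".
in_vitro_per_min = 0.1
in_vivo_per_s = {"HIS3 (enhanced)": 0.15, "hsp70": 0.25}

# Transcripts per PIC for the least and most efficient promoters tested.
efficiency_range = (0.04, 0.175)


def fold_difference(rate_per_s, reference_per_min):
    """Ratio of a per-second rate to a per-minute reference rate."""
    return rate_per_s * SECONDS_PER_MINUTE / reference_per_min


folds = {
    name: fold_difference(rate, in_vitro_per_min)
    for name, rate in in_vivo_per_s.items()
}
for name, fold in folds.items():
    print(f"{name}: roughly {fold:.0f}-fold faster in vivo")

low, high = (100 * value for value in efficiency_range)
print(f"transcription efficiency: {low:.1f}% to {high:.1f}% of PICs")
```

On these assumptions the in vivo rates exceed the in vitro rate by about two orders of magnitude, which gives a sense of how much room remains for factors such as activators and coactivators to act.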