DNA bends in TATA-binding protein-TATA complexes in solution are DNA sequence-dependent.

The TATA-binding protein (TBP) initiates assembly of transcription preinitiation complexes on eukaryotic class II promoters, binding to and restructuring consensus and variant "TATA box" sequences. The sequence dependence of the DNA structure in TBP-TATA complexes has been investigated in solution using fluorescence resonance energy transfer. The mean 5'dye-3'dye distance varies significantly among oligomers bearing the adenovirus major late promoter sequence (AdMLP) and five single-site variants bound to Saccharomyces cerevisiae TBP, consistent with solution bend angles for AdMLP of 76 degrees and for the variants ranging from 30 degrees to 62 degrees. These solution bends contrast sharply with the corresponding co-crystal structures, which show approximately 80 degrees bends for all sequences. Transcription activities for these TATA sequences are strongly correlated with the solution bend angles but not with TBP-DNA binding affinities. Our results support a model in which transcription efficiency derives primarily from the sequence-dependent structure of the TBP-TATA binary complex. Specifically, the distance distribution for the average solution structure of the TBP-TATA complex may reflect the sequence-dependent probability for the complex to assume a conformation in which the TATA box DNA is severely bent. Upon assumption of this geometry, the binary complex becomes a target for binding and correctly orienting the other components of the preinitiation complex.

The TATA-binding protein (TBP) binds to eukaryotic class II promoters at specific sequences of DNA of the consensus sequence TATA(a/t)A(a/t)N, nucleating assembly of the proteins required for transcription. Atomic resolution co-crystal structures of complexes of DNA bearing consensus strong promoter sequences bound to Saccharomyces cerevisiae (1), Arabidopsis thaliana (2), and human (3,4) TBPs are extremely similar, characterized by a TBP-induced ~80° bend in the DNA helix. TBP also binds to numerous variant TATA sequences, many of which occur naturally in promoters (5,6). For 21 such single-point mutants of the adenovirus major late promoter (AdMLP) TATA box sequence, in vitro transcription activity was found to range from <1% to 107% of that of the reference AdMLP TATA sequence (6).
The wide range of observed transcription activities suggested that TBP does not bind similarly to all TATA elements. Gel electrophoresis circular permutation analysis of TBP·DNA complexes shows that the electrophoretic mobility of the complexes is TATA sequence-dependent, with bend angles from <34° to 106° inferred from the gel mobility patterns (7). In contrast, the co-crystal structures of 11 TATA sequence variants of varying affinity bound to A. thaliana TBP are all very similar, with the DNA helix bent as in the strong promoters (3,8).
The present study was undertaken to further explore the TATA box sequence dependence of TBP binding and DNA structure using native, full-length S. cerevisiae TBP together with the AdMLP TATA sequence and five single-base-pair variant sequences. End-to-end distance distributions for these duplexes, free and TBP-bound, were extracted from measurements of time-resolved fluorescence emission in conjunction with fluorescence resonance energy transfer (FRET). Bend angles for the DNA within each of the TBP·DNA complexes were determined using three models. The reference AdMLP and five variant TATA sequences bound to TBP have significantly different mean end-to-end distances in solution. These distances are consistent with DNA bend angles ranging from 29.9° to 61.8° for the variant sequences and 76.2° for the native AdMLP. The latter bend angle is in excellent accord with the bends observed in the co-crystal structures. A strong correlation is observed between the solution bend angles and the transcription activities (6). These findings are consistent with the structure of TBP·TATA complexes being a principal determinant of TATA-box-dependent transcription activity. A model is proposed that reconciles the sequence dependence of bend angles and transcription activities measured in solution with the DNA structures observed in the co-crystals.

EXPERIMENTAL PROCEDURES
Protein, DNA, and Solution Conditions-Full-length S. cerevisiae TBP was prepared as described previously (9,10). The double-labeled 14-base oligonucleotides, with 5′-TAMRA and 3′-fluorescein, were as described previously (11). The specific sequences are shown in Table I. The corresponding single-labeled oligonucleotides (denoted 14mer*F) were identical except that each had 3′-fluorescein but no 5′-TAMRA. The double- and single-labeled probes and unlabeled complementary oligomers were synthesized by Sigma-Genosys (The Woodlands, TX). The double- and single-labeled probes were purified by high performance liquid chromatography/polyacrylamide gel electrophoresis and by polyacrylamide gel electrophoresis, respectively. Studies were conducted at 30 ± 0.05 °C in 10 mM Tris-HCl (pH 7.4), 100 mM KCl, 2.5 mM MgCl2, 1 mM CaCl2, and 1 mM dithiothreitol.
Theory, Instrumentation, Data Acquisition, and Analysis-Extensive discussions of Förster resonance energy transfer and its application to these studies have been published (Refs. 11-13 and references therein). Very briefly, FRET is the process whereby excited-state energy is transferred nonradiatively from a donor to an acceptor fluorophore. Because both dyes are attached to an oligomer by flexible tethers, their distance apart is variable and the donor decay depends upon the probability distribution, P(R), of all possible such distances (13,14) according to Eq. 1, where I_d and I_da are the fluorescence emission intensities of the donor in the absence and presence of acceptor, respectively, and the inverse of the ith donor lifetime, 1/τ_di, equals (k_F + k_I)_di and [(1/τ_D*)(R_0/R)^6] = k_t. (k_F and k_I are the respective rate constants for fluorescence and nonradiative decay, and k_t is the rate constant for energy transfer.) R_0 is the Förster distance, for which the efficiency of transfer is 0.5. τ_D* in Eq. 1 is the donor lifetime uniquely associated with a particular value of R_0, and remains constant as long as the acceptor absorption and donor emission spectra remain unchanged. Thus, P(R) may be extracted from measurements of the fluorescence lifetime decay of the donor in the presence and absence of acceptor.
Fluorescence lifetime measurements were made in the time domain using a LaserStrobe spectrofluorometer (Photon Technology International, Inc., Lawrenceville, NJ) with PTI dye PL481 to generate pulsed 488-nm excitation light. A 520-nm interference filter (Oriel Corp., Stratford, CT) between the sample compartment and the detector isolated the fluorescein emission. In obtaining the fluorescein decay, the following experimental procedure was used in every case: The duplex or duplex·TBP complex was formed in the cuvette and equilibrated for ~12 min. Data acquisition was initiated, with detection beginning just prior to fluorescence emission from the sample and ending at ~10× the longest fluorescence lifetime, determined in preliminary measurements. Three such complete decays were collected and averaged by the software to generate one representative decay curve. The true fluorescence decay was extracted from the total emission, which included the instrument response function, using an iterative reconvolution procedure incorporated into the minimization procedure.
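As an illustration of how a distance distribution enters the donor decay, the following Python sketch evaluates a donor-acceptor decay averaged over P(R). The functional form is an assumption inferred from the definitions given for Eq. 1 (each distance R contributes an additional decay rate k_t = (1/τ_D*)(R_0/R)^6). The plain Gaussian P(R), the amplitudes and lifetimes, the value of τ_D*, and all function names are hypothetical choices for the example; R_0 = 61.0 Å and a mean distance of 54.5 Å are taken from the values reported under "Results" for the free duplexes.

```python
import math

def donor_decay(t, alphas, taus):
    """Donor-only decay: I_d(t) = sum_i alpha_i * exp(-t / tau_i)."""
    return sum(a * math.exp(-t / tau) for a, tau in zip(alphas, taus))

def gaussian_distribution(r_grid, r_mean, sigma):
    """Discretized Gaussian P(R), normalized so the grid weights sum to 1."""
    weights = [math.exp(-0.5 * ((r - r_mean) / sigma) ** 2) for r in r_grid]
    total = sum(weights)
    return [w / total for w in weights]

def donor_acceptor_decay(t, alphas, taus, tau_d_star, r0, r_grid, p_grid):
    """Assumed form of I_da(t): the donor decay averaged over P(R), with a
    distance-dependent transfer rate k_t = (1/tau_D*) * (R0/R)**6."""
    intensity = 0.0
    for r, p in zip(r_grid, p_grid):
        k_t = (1.0 / tau_d_star) * (r0 / r) ** 6
        intensity += p * sum(
            a * math.exp(-t * (1.0 / tau + k_t)) for a, tau in zip(alphas, taus)
        )
    return intensity

# Hypothetical bi-exponential donor parameters (ns); not values from the study.
ALPHAS, TAUS = [0.8, 0.2], [4.0, 1.0]
TAU_D_STAR = 4.0                                   # hypothetical, ns
R0 = 61.0                                          # Angstrom, free duplex value
R_GRID = [20.0 + 0.5 * i for i in range(161)]      # 20 to 100 Angstrom
P_GRID = gaussian_distribution(R_GRID, r_mean=54.5, sigma=8.0)  # sigma is hypothetical

for t in (0.0, 2.0, 8.0):
    print(t, donor_decay(t, ALPHAS, TAUS),
          donor_acceptor_decay(t, ALPHAS, TAUS, TAU_D_STAR, R0, R_GRID, P_GRID))
```

In an actual analysis the parameters of P(R) would be adjusted until the computed I_da(t), convolved with the instrument response, matches the measured decay; the sketch only shows the forward calculation.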
Measurements of the fluorescence lifetimes of the 3′-fluorescein donor fluorophores were made for the reference AdMLP sequence and the five variant sequences, for both the single- and double-labeled duplexes, and for the duplexes both free and TBP-bound. The concentrations of the double- and single-labeled duplexes were 50 and 20 nM, respectively, with a 2-fold excess of complement and a total sample volume of 750 µl. The DNA·TBP complexes for all variants except A3 were formed by adding protein to the duplexes to final concentrations ranging from 350 to 910 nM, sufficient to ensure >96% bound DNA in accord with the previously determined equilibrium constants. Five replicate representative curves (each an average of three decays) were defined as one set. Four such sets were collected for each of the following cases: free ML_dpx*F, TBP-bound ML_dpx*F, free T*ML_dpx*F, and TBP-bound T*ML_dpx*F. Four data sets were also obtained for each case of the C7 sequence. Three data sets were collected for each case of the T5, G6, and T6 sequences. Collection and analysis of data for the A3 sequence is discussed separately.
These normalized data were fit to both bi- and tri-exponential decay models with the relative quality of the fits assessed according to the values of χ², the Durbin-Watson parameter (15,16), and the runs test parameter (17). The five α and τ values composing a given set were then averaged; the corresponding average decay curve, henceforth a composite curve, characterized 15 separate decays. Four such composite curves were thus obtained for each case of AdMLP and C7, with three composite curves for each case of the other sequences. For the free AdMLP duplex, a 4 × 4 matrix was then constructed from the four composite ML_dpx*F decays and the corresponding four T*ML_dpx*F decays. The resulting 16 combinations of donor/donor-acceptor decays were individually analyzed as described (13) to obtain 16 values of P(R)_free using Eq. 1. Sixteen values were thus obtained for R_free (the mean 5′dye-3′dye distance for the free duplex) and for σ_free characterizing the distribution. These values were averaged to yield the reported values of R_free and σ_free and the corresponding standard deviations.
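The fit-quality criteria named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the fitting software used in the study: the function names are hypothetical, the χ² is assumed to be the reduced (per degree of freedom) statistic, and `count_runs` only counts sign runs in the residuals, since the exact runs test parameter used is not specified here.

```python
import math

def biexponential(t, a1, tau1, a2, tau2):
    """Bi-exponential decay model: a1*exp(-t/tau1) + a2*exp(-t/tau2)."""
    return a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2)

def reduced_chi_square(observed, fitted, sigmas, n_params):
    """Sum of squared weighted residuals divided by the degrees of freedom."""
    weighted = [(o - f) / s for o, f, s in zip(observed, fitted, sigmas)]
    return sum(r * r for r in weighted) / (len(observed) - n_params)

def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 indicate uncorrelated residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(r * r for r in residuals)
    return numerator / denominator

def count_runs(residuals):
    """Number of runs of same-sign residuals (input to a runs test)."""
    signs = [r >= 0 for r in residuals]
    return 1 + sum(1 for i in range(1, len(signs)) if signs[i] != signs[i - 1])
```

A bi-exponential model has four adjustable parameters, so `n_params=4` would be the natural choice when comparing it against a six-parameter tri-exponential fit of the same decay.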
For the bound AdMLP duplex, 16 values of P(R) were likewise obtained from the corresponding 4 × 4 matrix. Because all solutions of bound DNA contained a small amount (<1% for AdMLP) of free duplex, each of the 16 probability distributions for the bound DNA was subsequently fit to the sum of two P(R) values corresponding to bound and free duplex. P(R)_bound and P(R)_free were weighted to reflect their respective fractional populations, determined using K_a. The values for R_free and σ_free were fixed at the previously determined values and the optimal values for R_bound and σ_bound obtained. The mean values and standard deviations for R_bound and σ_bound were determined as for the free duplex. The C7 data were analyzed identically to AdMLP. The same analysis was applied to T5, G6, and T6 but with 3 × 3 matrices.
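The matrix approach described above pairs every composite donor-only decay with every composite donor-acceptor decay and then averages the per-pair results. A minimal Python sketch of that bookkeeping follows; `analyze` is a hypothetical stand-in for the full P(R) analysis of one pair, and the use of the sample standard deviation is an assumption.

```python
from itertools import product
from statistics import mean, stdev

def pairwise_analysis(donor_only_curves, donor_acceptor_curves, analyze):
    """Apply `analyze` to every donor-only/donor-acceptor pairing and return
    (mean, sample standard deviation, number of pairings)."""
    results = [
        analyze(d, da) for d, da in product(donor_only_curves, donor_acceptor_curves)
    ]
    return mean(results), stdev(results), len(results)

# Toy stand-ins: numbers in place of decay curves and a trivial analysis
# function, used only to show that 4 x 4 inputs give 16 pairings.
toy_donor = [1.0, 2.0, 3.0, 4.0]
toy_donor_acceptor = [10.0, 20.0, 30.0, 40.0]
toy_mean, toy_sd, n_pairs = pairwise_analysis(
    toy_donor, toy_donor_acceptor, lambda d, da: da - d
)
print(n_pairs, toy_mean, toy_sd)
```

With three composite curves per case, as for T5, G6, and T6, the same function yields nine pairings.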
The titrations of the double-labeled duplexes with TBP, conducted using a spectrofluorimeter (Photon Technology International, Inc., Model A-1010) and steady-state fluorescence emission, and subsequent determination of the equilibrium binding constants (K_a) were as described (11) with the exception of T*A3_dpx*F.
The value of K_a for the A3 variant was determined indirectly using time-resolved measurements, because the steady-state emission of the T*A3_dpx*F changed very little upon TBP binding. Three composite bi-exponential decays characterizing free A3_dpx*F were obtained as described. To 25 nM A3_dpx*F, TBP was added to final concentrations of 153 and 178 nM (lower), 306 and 333 nM (intermediate), and 457 and 660 nM (higher), resulting in six different mixtures of free and TBP-bound A3_dpx*F. Measurements of the fluorescence lifetimes of each of these mixtures were made as described, to obtain one composite curve (five data sets) for each; mean values of α_i and τ_i, describing the decay for each mixture, were determined. The observed decay, F(t), from each of these mixtures derived from both free and bound A3_dpx*F and may be described by Eq. 2, where X is [A3_dpx*F·TBP], D and T are the respective total concentrations of DNA and TBP, K_a is the association constant for the A3·TBP interaction, α_i and τ_i characterize the free A3_dpx*F decay, and α_i′ and τ_i′ characterize the TBP-bound A3_dpx*F decay. The mean values for α_i and τ_i for the free A3_dpx*F were fixed in Eq. 2 at the mean values for the three composite curves previously obtained. The decay curves for the mixtures were analyzed globally in groups of three for four parameters:
To determine P(R) for free A3, three composite bi-exponential decays characterizing free T*A3_dpx*F were determined as described. These α_i and τ_i values were used together with those for free A3_dpx*F to construct a 3 × 3 matrix and obtain mean values and error estimates for R_free and σ_free as described.
The determination of P(R) for TBP-bound A3 differed from that of the other sequences, because the value of K_a was too low to achieve saturation of the duplex. TBP was added to 50 nM T*A3_dpx*F to final concentrations of 656 and 365 nM, with the latter repeated in two independent experiments. According to the previously determined value for K_a, the resulting mixtures were 66 and 51% bound T*A3_dpx*F, respectively, with the remainder being free T*A3_dpx*F. The observed decays, I_da(t), from these three mixtures were used in an expansion of Eq. 1, with the values of the fractions of free (F_free) and bound (F_bound) T*A3_dpx*F fixed according to K_a. The values of α_i′ and τ_i′ in the second term on the right-hand side of Eq. 4 reflect bound donor-only decay and were fixed at the mean values previously determined. P(R) was then obtained using a matrix approach, analyzing separately all nine combinations of the three I_da values and, in the first term on the right-hand side, the three sets of α_i and τ_i values from the three composite curves corresponding to free T*A3_dpx*F. The resulting nine values for R_bound and σ_bound for T*A3_dpx*F were averaged, and the S.D. was determined.
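The fractions of free and bound duplex in such mixtures follow from K_a and the total DNA and TBP concentrations. The Python sketch below assumes simple 1:1 binding (D + T ⇌ D·T) and solves the resulting quadratic for the complex concentration X. The value K_a ≈ 3.1 × 10^6 M^-1 (0.0031 nM^-1) is not quoted from the paper; it is an assumption back-calculated here so that the stated 66 and 51% saturations are reproduced, and the function names are hypothetical.

```python
import math

def complex_concentration(d_total, t_total, k_a):
    """Concentration X of the 1:1 DNA-TBP complex, from
    K_a = X / ((D - X) * (T - X)); concentrations in nM, K_a in 1/nM."""
    b = d_total + t_total + 1.0 / k_a
    return (b - math.sqrt(b * b - 4.0 * d_total * t_total)) / 2.0

def fraction_bound(d_total, t_total, k_a):
    """Fraction of the total DNA that is TBP-bound."""
    return complex_concentration(d_total, t_total, k_a) / d_total

K_A_A3 = 0.0031  # 1/nM; assumed value, back-calculated from the stated percentages

for tbp_nm in (656.0, 365.0):
    print(tbp_nm, round(100.0 * fraction_bound(50.0, tbp_nm, K_A_A3)))
```

The same two functions give F_bound and F_free = 1 − F_bound for any of the mixtures once K_a is known.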
To confirm that the value of R_0 remained essentially constant for all cases studied, R_0 was determined independently for AdMLP and the T6 variant for both the free and TBP-bound duplexes, using the solvent refractive index of 1.332. The overlap integrals were determined independently as described (13) using emission spectra for free and bound ML_dpx*F and T6_dpx*F and absorption spectra for free and bound T*ML_dpx and T*T6_dpx. Emission and absorption spectra were collected on a steady-state fluorimeter (Photon Technology International, Inc., model A-1010) and a Hewlett-Packard diode array spectrophotometer (model HP8452A), respectively.
To establish sufficient dye mobility consistent with κ² = 2/3, time-resolved anisotropy decays were also measured for free and TBP-bound ML_dpx*F, T*ML_dpx, T6_dpx*F, and T*T6_dpx. Semi cone angles for the dyes in each of these eight conditions were determined as described (18). Anisotropy decay measurements were made using the LaserStrobe spectrofluorometer with 488- and 500-nm excitation light for fluorescein and TAMRA, respectively, and with the corresponding emission isolated by a 520-nm interference filter and a 530-nm long pass filter.
Thermostability of TBP-Because a time of ~80 min was required for one set of five replicate fluorescence lifetime measurements, the thermostability of TBP both free and DNA-bound was investigated. For the latter, the complex was formed using 1:1 TBP:DNA, both at 2.5 µM, with 25 nM T*ML_dpx*F and the remainder of the top strand being unlabeled DNA. (The reliability of the double-labeled duplexes as trace probes has been demonstrated previously (11).) Equilibrium was established with 95% saturation, within 1 min, as reflected by the steady-state emission spectrum (9). The spectrum was monitored, and the fractional saturation determined, at 5- to 10-min intervals for ~60 min. Because the double-labeled duplexes are stable for several hours at 30 °C, any change in the spectrum of the bound complex reflected a change in the fraction of DNA bound. The time dependence of the fraction of bound DNA was extrapolated to obtain an estimate of the half-time for inactivation of DNA-bound TBP.
To determine the stability of free TBP, a control experiment was done first, as follows: TBP was added to 250 µl of buffer to a concentration of 550 nM. To this solution was immediately added T*ML_dpx*F (as a trace probe), unlabeled top strand, and complement to 30, 520, and 890 nM, respectively, to form 550 nM duplex. The fractional saturation was determined from the steady-state emission spectrum at 2, 5, and 10 min following addition of the complementary strand. The duplex had formed and bound TBP to 90% saturation within 2 min, as reflected by the steady-state spectrum, in precise accord with K_a. Identical measurements were subsequently made with the 550 nM protein solution incubated at 30 °C for 5, 7, 10, 15, 30, and 60 min prior to addition of the duplex. Binding of less than 90% saturation within 2 min was attributed to inactivation of the free protein during the preincubation. An estimate of the half-time for inactivation of free TBP was determined directly from these data.
Simulations were conducted to demonstrate the significance of thermal inactivation of TBP on the fluorescence measurements under the experimental conditions of these studies. The two-intermediate linear model and the six rate constants previously determined for this model for AdMLP binding to yeast TBP (11) were used as the basis for the simulations, with irreversible pathways added for inactivation of free and bound TBP. For the latter, the first-order rate constants determined from the measurements just described were used.

RESULTS
TATA Sequence-dependent DNA·TBP Affinity-That native S. cerevisiae TBP is monomeric at the protein concentrations and solution conditions used in these studies has been demonstrated by analytical ultracentrifugation (19,20) and biochemical studies (11). Stopped-flow binding studies conducted at stoichiometric and equimolar concentrations of TBP and DNA (1 µM), thus sampling the entire population of TBP molecules, are well-described by the rate constants associated with a complex model for monomeric TBP binding (11). These results confirm the absence of a slow, rate-limiting step in binding (which might derive from a dimer → monomer process (21)) under the experimental conditions of this study.
The sequences of the reference AdMLP and five variant TATA boxes are listed in Table I. The top strand of each of the double-labeled 14-base pair duplexes (denoted T*14-mer_dpx*F) had 6-carbon linkers to 5′-TAMRA and 3′-fluorescein identical to those used previously (11,13). These oligonucleotides differ only by a single base pair within the core sequence. The equilibrium association constants (K_eq) of native S. cerevisiae TBP binding these duplexes, determined by steady-state FRET, varied over a ~75-fold range (Table I). Of the variant sequences, only T5 binds TBP more tightly than the native AdMLP.
TBP Stability-Control experiments were conducted to demonstrate that a constant concentration of TBP·DNA complex was maintained throughout the course of a set of fluorescence lifetime measurements. The half-times for inactivation of the free and DNA-bound S. cerevisiae TBP preparation used in these studies (9, 10) were determined to be ~1 and ~10 h, respectively, at 30 °C under the experimental conditions of these studies. The protein remains fully active in DNA binding even after 24 h at 0 °C. These and similar results obtained by other assays contrast sharply with a recent report of the loss of the "vast majority" of the DNA binding activity of S. cerevisiae TBP after 0.3 min of incubation at 30 °C and all binding activity after 45 min even at 0 °C (21).
Numerical simulations mimicking the experimental conditions and incorporating these rates of TBP inactivation demonstrated the effects of TBP inactivation to be entirely negligible for at least the 80-min time period over which fluorescence lifetime data were acquired. This result is consistent with the experimental observation that the first and last curves were nearly identical for a given set of five replicate fluorescence lifetime decays (Fig. 1), showing no detectable loss of the protein·DNA complex with time.
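The logic of such a simulation can be conveyed by a much-reduced Python sketch. It is not the two-intermediate kinetic model with six rate constants used in the study: binding is assumed to re-equilibrate instantly as a single 1:1 step, inactivated bound TBP is assumed to release its DNA, and K_a ≈ 0.16 nM^-1 is an assumption back-calculated here from the control in which 550 nM TBP and 550 nM duplex gave 90% saturation. The half-times (~1 h free, ~10 h bound) and the concentrations (50 nM duplex, 440 nM TBP, as in the Fig. 1 legend) are taken from the text; the function names are hypothetical.

```python
import math

def complex_concentration(d_total, t_active, k_a):
    """1:1 complex concentration (nM) at equilibrium; K_a in 1/nM."""
    b = d_total + t_active + 1.0 / k_a
    return (b - math.sqrt(b * b - 4.0 * d_total * t_active)) / 2.0

def simulate_bound_fraction(d_total, t_total, k_a, half_free_min,
                            half_bound_min, minutes, dt=0.1):
    """Euler integration of first-order loss of active TBP, with free and
    bound protein inactivating at different rates. Returns a list of
    (time in min, fraction of DNA bound)."""
    k_free = math.log(2.0) / half_free_min
    k_bound = math.log(2.0) / half_bound_min
    t_active = t_total
    trace = []
    for step in range(int(round(minutes / dt)) + 1):
        x = complex_concentration(d_total, t_active, k_a)
        trace.append((step * dt, x / d_total))
        # Active TBP lost in this step from the free and bound pools.
        t_active -= (k_free * (t_active - x) + k_bound * x) * dt
    return trace

trace = simulate_bound_fraction(
    d_total=50.0, t_total=440.0, k_a=0.16,        # k_a is an assumed value
    half_free_min=60.0, half_bound_min=600.0, minutes=80.0,
)
print(trace[0], trace[-1])
```

Under these assumptions the large excess of TBP keeps most of the duplex bound over the 80-min window, changing the bound fraction by only a few percent; the quantitative conclusion in the text rests on the full kinetic simulation, which this sketch does not reproduce.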

End-to-End Distance Distributions, P(R), in Solution for Free and TBP-bound TATA Duplexes-The mean end-to-end distances (R) for the six duplexes, both free and TBP-bound, together with the corresponding values of σ for the distribution of distances, are listed in Table II. The values of R were similar for all of the free duplexes except T*A3_dpx*F, differing by no more than 0.4 Å with an average value of 54.5 Å. The R value of 53.4 ± 0.1 Å for the A3 variant is consistent with inherent curvature and/or flexibility related to the length of an uninterrupted A tract (23). A detailed study of these distance distributions for unbound DNA duplexes and their relationship to DNA sequence is in progress and will be reported elsewhere.
In contrast to the results obtained for the free DNA, the mean end-to-end distances for the TBP-bound duplexes varied significantly (Table II). The largest decrease in R upon TBP binding, 7.3 Å, was measured for T*ML_dpx*F. A decrease in the end-to-end distance was also observed upon TBP binding each of the variant duplexes, ranging from 4.9 to 0.1 Å.
The values of σ, the S.D. of the distribution, increased upon binding for all sequences, although not uniformly (Table II). A 5% increase in the breadth of the distribution for T*ML_dpx*F contrasts with a 144% increase for T*T6_dpx*F. The range of values of σ for the bound duplexes, 7.5 ± 0.1 to 10.5 ± 0.3 Å, greatly exceeds the confidence limits of the measurements.
Control experiments were conducted to ensure that the changes in the value of R did not derive from changes in R_0 due to sequence or to TBP binding. Semi cone angles, Ξ_0, were determined for both fluorescein and TAMRA for the AdMLP and T6 duplexes, free and TBP-bound. (The transition moment of the fluorophore wobbles within a cone with the vertex at the center of the transition moment. The angle Ξ_0 is half of the apical angle of this cone.) For these eight conditions, the semi cone angles ranged from 56° to 70° with an average value of 64°. Because the error for each angle was estimated to be ±7°, none of these angles differed significantly from the mean. The fast and slow rotational correlation times correspond to the free dye and to the macromolecule to which the dye is attached, respectively (18). For these eight conditions, the fast correlation time was 0.15 ± 0.03 ns (free and bound), and the slow correlation times were 5 ± 2 ns for the free duplexes and 23 ± 2 ns for the bound duplexes. These values of Ξ_0 and of the fast correlation time reflect a high degree of rotational freedom for both dyes, free and bound, for two sequences with disparate values for R_bound. Furthermore, the fluorophores are in very similar environments for the six duplexes, because all sequences are identical for ≥4 base pairs both 5′ and 3′. The value of κ² was therefore assumed to equal 2/3 in all R_0 calculations. (The invariance of the slow correlation time for the bound duplexes further confirms the absence of condition-dependent TBP aggregation.) The independently collected 3′-fluorescein emission and 5′-TAMRA absorption spectra for the free AdMLP and T6 duplexes were invariant, yielding identical overlap integrals and R_0 = 61.0 Å. The integrals were likewise invariant for TBP-bound AdMLP and T6, with R_0 = 61.2 Å, an increase of 0.3% upon binding. These values of R_0,free and R_0,bound were therefore assumed for the other four sequences. The results of these control experiments confirm that the sequence dependence of R_bound does not derive from changes in R_0.
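The scale of the R_0 effect can be checked with the standard single-distance Förster relation, E = 1/(1 + (R/R_0)^6), for which E = 0.5 at R = R_0. The Python sketch below is a simplification (one fixed distance rather than the distribution P(R) used in the actual analysis); the function names are hypothetical, and the bound distance of 47.2 Å is only an illustrative value obtained from the 54.5 Å average free distance minus the 7.3 Å decrease reported for AdMLP.

```python
def transfer_efficiency(r, r0):
    """Single-distance Forster transfer efficiency."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def distance_from_efficiency(efficiency, r0):
    """Invert the Forster relation for the donor-acceptor distance."""
    return r0 * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)

R0_FREE, R0_BOUND = 61.0, 61.2     # Angstrom, from the control experiments
R_BOUND_EXAMPLE = 47.2             # Angstrom, illustrative (54.5 - 7.3)

# Efficiency expected for the bound complex, then the distance that would be
# inferred if the free-duplex R0 were used by mistake.
efficiency = transfer_efficiency(R_BOUND_EXAMPLE, R0_BOUND)
apparent_r = distance_from_efficiency(efficiency, R0_FREE)
shift = R_BOUND_EXAMPLE - apparent_r
print(round(efficiency, 3), round(apparent_r, 2), round(shift, 2))
```

Because the inferred distance scales linearly with R_0 at fixed efficiency, a 0.3% change in R_0 shifts a distance near 47 Å by well under 0.2 Å, small compared with the sequence-dependent changes of up to 7.3 Å.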
TATA Sequence-dependent Solution Bend Angles for TBP-bound Duplexes-We have shown previously that the P(R) determined for TBP-bound T*ML_dpx*F using FRET fluorometry is generally consistent with the bent DNA observed in the co-crystals bearing strong promoter sequences (9). The relationship of R to the solution bend angle has been further explored by consideration of three models of DNA bending (Fig. 2). Although these models do not account for phasing of the dyes and unwinding of the duplex, the present data do not allow the critical testing of more complex models incorporating such detailed parameters of DNA structure.
Of these three models, the smooth bend and single central bend models (Fig. 2, C and A, respectively) correspond to the upper and lower limits for the bend angles for a given ratio of R_bound/R_free (Table III). Both models are universal in that the calculated bend angle depends only on the ratio of R_bound/R_free for bent versus linear DNA and is independent of DNA length.

TABLE II
Values for the mean end-to-end distance, R, and for σ of the distance distribution corresponding to P(R) for the AdMLP duplex and each of the five variants, both free in solution and TBP-bound
A total of 368 decay curves, each one an average of three curves, were collected and analyzed. All curves were very well described by bi-exponential decay, with a mean value for χ² of 0.97 ± 0.08. These data were subsequently analyzed to obtain the probability distributions. P(R) was modeled in all cases as a shifted Gaussian, determined previously from Hermite polynomial expansions to best approximate these distributions (13,22). P(R) values obtained for TBP-bound DNA were refit to P(R)_bound + P(R)_free, weighted using K_a, to correct for the small amount (<4%) of free duplex. All P(R) values fit to the summed distributions with correlation coefficients >0.999.

FIG. 1. Typical time-resolved fluorescence decays, for T*ML_dpx*F free in solution (dotted line) and TBP-bound. The decay collected initially for bound T*ML_dpx*F (solid line), following equilibration at 30 °C of 50 nM duplex with 440 nM TBP, is very similar to the fifth decay collected in the data set on the same material (dashed line). The latter curve was obtained after ~60 min and shows no trend toward the time progression of the free duplex, due both to the stability of the bound protein and to the large excess of TBP.
Because of the nature of the DNA bend observed within the TBP·TATA complex, with sharp kinks at either end of the TATA sequence (1, 2), a two-kink model (Fig. 2B) has also been considered. The bend angles derived from this model are intermediate between the smooth and single central bend models, and depend on both the total length of the oligonucleotide and the position of the kinks. The bend angles that derive from the ratio of R_bound/R_free corresponding to each model are as follows, where α and L_2 are as described in Fig. 2 and all distances are in angstroms. For all models, the linear distance is assumed to be the measured R for the free duplex. The DNA bend observed in all the co-crystal structures is relatively smooth, with most of the total bend occurring at the flanking Phe intercalation sites and the remainder in between. For TBP-bound T*ML_dpx*F, the solution bend angle associated with the two-kink model, 76.2 ± 0.2°, closely corresponds to that observed in the co-crystals (Table III). The solution angle derived from the smooth bend model, 105.0°, is significantly larger, and from the single central bend model, 60.1°, significantly smaller. As we show in the accompanying paper (24), the close correspondence of the AdMLP bend in solution and in the co-crystal demonstrates clearly the appropriateness of the two-kink model to the interpretation of the FRET data.
The values of R for the variant sequences are consistent with solution bend angles, determined using the two-kink model, ranging from 29.9° (for A3) to 61.8° (for T5). With the exception of the A3 variant, the solution bend angles reported in Table III derive from ratios of R_bound/R_free with the latter specific for the given sequence. Because free T*A3_dpx*F appears to have inherent curvature, the bend angle reported for TBP-bound A3 was determined using the average value of R_free for the other five sequences of 54.5 Å. It is notable that the A3 sequence, alone among those studied, is not known to occur naturally and has transcription activity that is <1% of the AdMLP (6). The differences in bend angles observed among the AdMLP and variant promoter sequences are highly significant due to the high precision in the values of R (see "Experimental Procedures").
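The three bending models can be made concrete with a short Python sketch. The geometric relations used are assumptions inferred from the model descriptions and the Fig. 2 legend rather than the published equations: R_bound/R_free = cos(α/2) for a single central bend, R_bound = L_2 + (R_free − L_2) cos(α/2) for the symmetric two-kink model with L_2 = 20.4 Å, and R_bound/R_free = sin(α/2)/(α/2) for a smooth circular arc. The inputs R_free = 54.5 Å and R_bound = 47.2 Å are illustrative values (the average free distance and that distance less the 7.3 Å decrease reported for AdMLP), not the Table II entries, and the function names are hypothetical.

```python
import math

def single_central_bend(r_bound, r_free):
    """Bend angle (degrees) for one central kink: R_bound/R_free = cos(alpha/2)."""
    return math.degrees(2.0 * math.acos(r_bound / r_free))

def two_kink_bend(r_bound, r_free, l2=20.4):
    """Bend angle (degrees) for two symmetric kinks separated by l2 (Angstrom):
    R_bound = l2 + (R_free - l2) * cos(alpha/2)."""
    return math.degrees(2.0 * math.acos((r_bound - l2) / (r_free - l2)))

def smooth_bend(r_bound, r_free, tol=1e-10):
    """Bend angle (degrees) for a circular arc, solving
    R_bound/R_free = sin(x)/x for the half-angle x by bisection."""
    ratio = r_bound / r_free
    lo, hi = 1e-9, math.pi          # sin(x)/x decreases from 1 to 0 on (0, pi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > ratio:
            lo = mid
        else:
            hi = mid
    half_angle = 0.5 * (lo + hi)
    return math.degrees(2.0 * half_angle)

R_FREE, R_BOUND = 54.5, 47.2        # Angstrom, illustrative inputs
print(round(single_central_bend(R_BOUND, R_FREE), 1),
      round(two_kink_bend(R_BOUND, R_FREE), 1),
      round(smooth_bend(R_BOUND, R_FREE), 1))
```

With these approximate inputs the three functions return angles close to the 60.1°, 76.2°, and 105.0° reported for AdMLP, which supports the assumed forms; small differences are expected because the exact Table II distances are not used.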
The similarity of the AdMLP bend angles determined in the co-crystal and in solution using the two-kink model demonstrates the adequacy of this model in describing the overall AdMLP conformation, without detailed consideration of DNA structure such as helical unwinding. Comparisons of the variant sequences to the reference AdMLP thus focused on this model, with the larger values of R_bound for the variants interpreted as relatively smaller bends. These larger R values could in principle also arise from an increase in the contour length of the DNA due to increased variant helical unwinding. However, maximum unwinding would be expected to be associated with maximum bending to relieve the strain introduced by bending. Furthermore, the inter-phenylalanine distances (measured from the α carbons) differ by <1 Å in the TBP crystal structures with and without bound DNA bent to 80° (1, 2), the conformations corresponding to the maximum structural distortion. The distance between the kink sites, L_2, was therefore held constant in Eq. 6 for all sequences. Differences in R_bound are then assumed to derive primarily from differences in the details of phenylalanine intercalation in the minor groove at the kink sites, with greater penetration resulting in increased compression of the major groove and greater bending.

TABLE III
Derived solution bend angles according to three bending models for the six duplex DNAs bound to TBP
The bend angles derive from the ratio of R_bound/R_free for each sequence except A3. Error estimates in the angles corresponding to the two-kink model were obtained using a matrix approach analogous to that for the errors associated with R and σ. From the 16 (AdMLP and C7) or 9 values (T5, G6, and T6) for both R_bound and R_free, 16 × 16 or 9 × 9 matrices were constructed. Bend angles were determined for the resulting 256 or 81 independent ratios, respectively. The mean values and standard deviations reported correspond to a model-specific parameter (Fig. 2B) and do not imply that changes in trajectory of the DNA helical axes are being measured in solution to accuracy <1°. For the TBP-bound A3 duplex, the nine values for R_bound were compared to the mean value of R_free for the other five sequences, because the free A3 duplex appears to be inherently bent (see text). These nine ratios were averaged to obtain bend angles and standard deviation, resulting in much larger error estimates for A3 than for the other sequences.

FIG. 2. Three simple models for describing the TBP-induced DNA bend. A, a single central bend; B, a symmetric two-kink model; and C, a continuous smooth bend model. The two kinks in the TBP·TATA co-crystal structure envelop the six central core base pairs. L_2 in model B = 20.4 Å, consistent with the structure of B DNA and also closely approximating the distance from the midpoints of the helix between steps 1-2 and steps 7-8 in the co-crystal structures (1,2). L_1 = L_3 = (R_free − 20.4 Å)/2. The bend angles reported herein corresponding to all three models are those described by "α."
Correlation between Bend Angle and the Breadth of the Distribution-Because the end bases and linker arms are identical in all oligonucleotides studied, the differences in the values of $\sigma$ are assumed to derive primarily from the duplex DNA rather than the linker arms. The changes in $\sigma$, $\Delta\sigma$, upon TBP binding, rather than $\sigma$ itself, then provide the most informative comparison among the sequences. The relationship between $\Delta\sigma$ and derived bend angle is shown in Fig. 3. The native AdMLP sequence (T*ML dpx *F), with the largest bend angle, has the smallest increase in $\sigma$ of only 0.4 Å upon TBP binding.

DISCUSSION

Time-resolved fluorescence resonance energy transfer provides a rigorous approach to the determination of the structure and dynamics of macromolecules in solution. The primary experimental findings from this work are 1) the existence in solution of DNA sequence-dependent differences in the trajectory of the DNA as it passes through TBP·TATA complexes and 2) the inverse correlation between the observed DNA bend angle and the breadth of the corresponding distance distribution.
DNA Bend Angles in TBP·TATA Complexes and the Corresponding Probability Distributions Are DNA Sequence-dependent-The FRET data clearly demonstrate sequence-dependent differences in the trajectory of the DNA as it passes through TBP·DNA complexes. In sharp contrast to this result and similar conclusions drawn from circular permutation and DNA phasing studies (7,25), eleven variant TATA sequences bound to TBP, including all of the sequences in this study, have essentially identical ~80° DNA bends in the atomic resolution structures determined for TBP·DNA co-crystals (3,8). These contrasting results are accommodated within a two-state allosteric model, based on an equilibrium between transcriptionally active and inactive TBP·DNA conformations (discussed below). The apparent conundrum presented by the solution and co-crystal structures is then definitively explained in the accompanying paper (24).
Also important for consideration of the underlying mechanism of this observation are the differences in the breadths of the corresponding distance distributions provided by the time-resolved FRET data. Clearly both structure and dynamics contribute to TBP·TATA function. The AdMLP sequence alone shows only a slight increase in the value of $\sigma$, the S.D. of the end-to-end distance distribution, upon TBP binding. A plausible hypothesis is that the complementarity of the protein-duplex interface confines the helix and restricts additional motion. This slight increase in the breadth of the distribution for the tightly bound AdMLP may derive from the presence of multiple conformers at equilibrium, each with bent DNA but differing, for example, in the extent of phenylalanine intercalation (11). An integrated hydroxyl radical footprinting and molecular dynamics study of the TBP-AdMLP interface supports this view of its dynamic nature (26).
The variant sequences show a general trend toward progressively broader distributions as the extent of bending decreases, up to $\Delta\sigma$ = 6.2 Å for the T6 variant. The inverse correlation between bending extent and distribution broadening may derive from the increasing misfit along the protein-DNA interface as helical bending decreases, including retention of solvent molecules at the interface. Indeed, complexes of TBP with the variant duplexes may be present in multiple conformations with the DNA bent very differently among those conformers, as discussed further in the following section. The broadened distribution would then result from equilibrium exchange among such conformers occurring on a time scale that is slow relative to the nanosecond time scale of the measurements, i.e. microseconds. In this case, the broader distribution of distances would not derive from any high frequency torsional and bending motions of the duplex that occur on time scales faster than nanoseconds, because such motion would be averaged out in these measurements (27).
A Bi-modal Distribution Model Reconciling the Solution and Co-crystal Bend Angles-A two-state model is hypothesized, unifying into a coherent perspective the sequence-dependent solution bend angles reported herein and the x-ray results in which only an AdMLP-like structure was crystallized. Each variant duplex bound to TBP is proposed to exist in two conformations, one (conformer ML) with the DNA bound and bent as in the AdMLP·TBP complex and the other (conformer TI) with the DNA significantly less bent (Fig. 4A). Only conformer ML has the correct geometry to allow binding of subsequent transcription proteins and effect measurable mRNA synthesis. Conformer TI is transcriptionally inactive and has the same overall conformation for all variant DNA·TBP complexes, although the local structural and energetic features of the protein-DNA interface are sequence-dependent. The presence of conformer ML is assumed for all variants, because such a conformer is crystallized (except A3 (8)), although the solution conditions for the crystallizations differed from those employed in the present study. The two-state model provides a unifying and simple relationship among the variants to explain their observed differences in bend angle and distance distribution variance, rather than necessitating that each variant, with a unique set of conformers, be considered separately.⁷

FIG. 3. The correspondence for the naturally occurring sequences between bend angle (from Eq. 6) and the change in $\sigma$ upon TBP binding, $\sigma_{\mathrm{diff}}$. The sequence associated with each data point is noted along the top axis. Because independent variances are additive, $\sigma_{\mathrm{diff}}$ was calculated according to $\sigma_{\mathrm{diff}} = (\sigma_{\mathrm{bound}}^{2} - \sigma_{\mathrm{free}}^{2})^{1/2}$, which assumes that $\sigma_{\mathrm{diff}}$ reflects the broadening of $\sigma$ due to the binding process; i.e. the contribution from the tethers is removed.

If the equilibrium for conformer TI ⇌ conformer ML occurs on a time scale significantly slower than that of the nanosecond measurements, the measured fluorescence decay for a given sequence would derive from both conformers. In fact, the probability distributions for all of the bound variant duplexes, $P(R)_{i,\mathrm{bound}}$, are very well fit globally by two constrained Gaussian distributions⁸ corresponding to conformer ML and conformer TI,

$$P(R)_{i,\mathrm{bound}} = \Phi_{ML}\,P(R)_{ML,\mathrm{bound}} + \Phi_{TI}\,P(R)_{TI,\mathrm{bound}} \qquad \text{(Eq. 8)}$$

where $i$ specifies the variant, $\Phi$ = mole fraction, and $\Phi_{TI} = 1 - \Phi_{ML}$. In this analysis, the values of $R_{ML,\mathrm{bound}}$ and $\sigma_{ML,\mathrm{bound}}$ for conformer ML were fixed at 47.1 and 8.5 Å (Table II), respectively. The values obtained for the two fitted parameters were $R_{TI,\mathrm{bound}}$ = 53.3 Å and $\sigma_{TI,\mathrm{bound}}$ = 9.9 Å. The relative mole fractions of conformer ML and conformer TI for each variant, $i$, were determined concurrently as a function of the fitted value of $R_{TI,\mathrm{bound}}$ and the measured values of $R_{ML,\mathrm{bound}}$ and $R_{i,\mathrm{bound}}$.
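The two-state description of Eq. 8 can be sketched numerically. The block below uses the fixed and fitted means and widths quoted above. The plain normalized Gaussians stand in for the "constrained" Gaussians of the global fit, whose exact form is not given here. The relation used for the mole fraction (the observed mean taken as the population-weighted average of the two conformer means) is an assumption about how the fractions follow from the three distances, and all function names are hypothetical.

```python
import math

R_ML, SIGMA_ML = 47.1, 8.5  # conformer ML, fixed values (Angstrom)
R_TI, SIGMA_TI = 53.3, 9.9  # conformer TI, fitted values (Angstrom)


def gaussian(r: float, mean: float, sigma: float) -> float:
    """Normalized Gaussian density (a simplification of the fitted form)."""
    z = (r - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def two_state_distribution(r: float, phi_ml: float) -> float:
    """Eq. 8: P(R) = phi_ML * P_ML(R) + (1 - phi_ML) * P_TI(R)."""
    return (phi_ml * gaussian(r, R_ML, SIGMA_ML)
            + (1.0 - phi_ml) * gaussian(r, R_TI, SIGMA_TI))


def mole_fraction_ml(r_mean_bound: float) -> float:
    """Assumed relation: <R_i> = phi_ML * R_ML + (1 - phi_ML) * R_TI."""
    return (R_TI - r_mean_bound) / (R_TI - R_ML)
```

With this reading, a variant whose mean bound distance equals that of AdMLP would be entirely conformer ML, and one whose mean equals the fitted conformer TI distance would be entirely conformer TI, with intermediate distances interpolating linearly between the two.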
Bend angles corresponding to the two-state model were then calculated for each variant, $i$, using Eq. 6 with appropriate weighting for $R_{ML,\mathrm{bound}}$ and $R_{TI,\mathrm{bound}}$ (Eq. 9), where $\langle R_{i,\mathrm{free}} \rangle$ is the observed mean end-to-end distance for a given free duplex. The relationship of the calculated (Eq. 9) and observed bend angles (Table III) with the mole fraction of conformer ML is shown in Fig. 4B.
If the exchange between conformer TI and conformer ML is fast relative to subsequent binding processes, the transcription factors "see" and appear macroscopically to bind to an average TBP·DNA structure that is sequence-dependent. The model predicts that the more AdMLP-like the average binary structure, the more efficiently transcription will proceed. Implicit in this model is a correspondence between the structure of the TBP·TATA complex and transcription activity, which is explored further below.
Minimal Correspondence of TBP·DNA Complex Lifetime to Bend Angle or Transcriptional Activity-Hawley and coworkers (7), inferring bend angles from gel mobility shifts for TBP-bound AdMLP and eight variant sequences, also observed sequence-dependent differences in bend angles. However, for the sequences common to both studies, those angles differ from the ones reported herein in magnitude, by up to a factor of two, and, more significantly, in the ordering of sequences by decreasing bend.
Although a correlation was asserted between bend angles inferred from circular permutation analysis and TBP·TATA complex stability (7), careful inspection of those data reveals a minimal correspondence between these two properties. A plot of the lifetime of the TBP·DNA complex versus bend angle from Table I (7) shows no general linear correlation (correlation = 0.76, coefficient of determination = 0.59); rather, the data form two distinct sets. The first of these sets of five sequences is composed of unstable TBP·TATA complexes, with lifetimes ≤0.08 that of the wild type, but with bend angles ranging from <34° to 80°. The second set of five sequences, constituting a step function relative to the first set, includes only severe bends, from 80° to 106°, but lifetimes that vary 23-fold, from 0.08 to 1.85 that of the wild type. This conclusion is further supported by the recent work of Bareket-Samish et al. (25), who report no correlation between TBP·TATA complex stability and DNA bend angles determined similarly using gel phasing analysis.
The data of Hawley and coworkers also show a minimal correspondence between transcription activities and the lifetimes of the TBP·DNA complex. A plot of the lifetime of the TBP·DNA complex versus transcription activity (Ref. 7, Table I) shows roughly two data sets, with a poor linear relationship (correlation = 0.73, coefficient of determination = 0.54).
⁷ This two-state model is conceptually analogous to the allosteric model, with an R-state active form, a T-state inactive form and L, the allosteric constant, defining the R-T equilibrium. Myriad hemoglobins are well described by these two states but have widely variable values of L, depending on how substitutions alter the α1β1/α2β2 interface.

⁸ Whereas at least two distinct distributions are required to fit various higher-order DNA structures (28), each of the structures investigated herein was very well fit by a single distribution. It is only in the model-dependent global analysis of all bound sequences that two distributions can be distinguished, in varying proportions, accounting for all observed decays.

FIG. 4. A, the two-state model in which each TBP-bound duplex is distributed between two populations, conformer ML, in which the DNA is bent ~80° as for AdMLP, and conformer TI, in which the DNA is bent only slightly. $K_{eq}$ is sequence-dependent. Transcription proceeds from conformer ML, with the DNA in the correct geometry. B, the correspondence of the mole fraction of conformer ML with the observed (□, Table III) and calculated (◆, Eq. 9) bend angles, the latter corresponding to the two-state model. The theoretical values derive from only one fitted parameter, $R_{TI,\mathrm{bound}}$. The bend angle for conformer TI using the two-kink model is ~30°.

DNA Bends in TBP·DNA Complexes Are Highly Correlated with Relative Transcription Activity-The correlations between the solution bend angles determined in the present study and the corresponding in vivo and in vitro transcription activities reported by Wobbe and Struhl (6) are shown in Fig. 5, A and B, respectively. The same correlation is observed upon comparison with either the HeLa TFIID or yeast TBP in in vitro transcription assays. Two possible explanations for this correlation present themselves. First, the observed differences in transcription activity are structurally based, resulting fundamentally from the sequence-dependent differences in the DNA bend angles in the binary complexes, or second, they derive simply from different levels of saturation of the TATA site by TBP, due to sequence-dependent differences in binding affinity. The concentrations of HeLa and yeast TBP used in the in vitro assays were reported to be saturating under the experimental conditions of those studies (6). We therefore conclude that the >100-fold differences observed in transcription efficiency could not have arisen from differences in TBP·DNA affinity.
Suppose, however, that only the tightly bound AdMLP sequence was saturated and the variant sequences were fractionally saturated in accord with their respective binding constants, so that transcription activity did reflect differences in affinity. Then, for example, were the AdMLP sequence 95% bound (as a lower limit), the transcription activity for the T6 sequence would be 86% that of AdMLP, based on the $K_a$ values shown in Table I. In contrast, the experimentally observed transcription activity for T6 was only 10% that of AdMLP (6). Thus, several independent lines of evidence support the conclusion that differences in TBP·DNA binding affinity cannot account for the observed differences in transcription efficiency.
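The fractional-saturation argument can be made concrete with a simple binding isotherm. The sketch below assumes 1:1 binding, the same free TBP concentration for both promoters, and activity proportional to TATA-site occupancy. None of these assumptions is spelled out in the text. The $K_a$ values of Table I are not reproduced here, so `ka_ratio` ($K_a$ of the variant divided by $K_a$ of AdMLP) is a hypothetical input.

```python
def relative_occupancy(ka_ratio: float, ref_fraction_bound: float = 0.95) -> float:
    """Occupancy of a variant site relative to the reference site.

    Assumes a 1:1 isotherm, f = K[TBP] / (1 + K[TBP]), evaluated at the
    free TBP concentration that leaves the reference site bound to the
    extent given by ref_fraction_bound.
    """
    if not 0.0 < ref_fraction_bound < 1.0:
        raise ValueError("ref_fraction_bound must lie strictly between 0 and 1")
    x_ref = ref_fraction_bound / (1.0 - ref_fraction_bound)  # K_ref * [TBP]
    x_var = ka_ratio * x_ref                                 # K_var * [TBP]
    var_fraction_bound = x_var / (1.0 + x_var)
    return var_fraction_bound / ref_fraction_bound
```

Near saturation the isotherm is flat, so under these assumptions even a two-fold weaker binder (`ka_ratio = 0.5`) would retain about 95% of the reference occupancy. That is the qualitative reason affinity differences alone predict far smaller activity changes than those observed.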
In contrast, a significant correlation is observed between the solution bend angles and transcription activity. Wobbe and Struhl (6) similarly concluded that the in vivo activity of a TATA element is directly affected by the binary TBP·TATA structure. This conclusion was based on the close similarity between the in vitro activity of yeast TBP (and human TFIID) and transcription activity in yeast cells. The strong correspondence between the solution geometry of the TBP·DNA complex and transcription activity is further supported by a comparison of Figs. 4B and 5B. The relationship between transcription efficiency and bend angle is strikingly similar to the relationship between the fractional population of the allosteric conformer ML and bend angle. The extent to which conformer ML is populated, for a given sequence, thus closely corresponds to the relative transcription activity.
The relatively large values of $\sigma$ determined herein for the bound duplexes with less favorable TATA box sequences are consistent with low frequency DNA flexibility within the binary complexes. Such duplex motions cannot be effectively distinguished from multiple conformations (29). The observed correlation between the extent of DNA bending and transcription activity thus leads us to propose that the probability for a given TBP·TATA complex to assume the conformation required for binding of subsequent proteins determines the corresponding transcription efficiency. For the bound variants, as the deviation from ~80° increases, severe distortions of the duplex DNA to approach 80° become increasingly less probable. In terms of such fluctuations, a dependence of transcription efficiency on the average conformation of the binary TBP·promoter complex seems reasonable. Both biochemical and crystallographic results show that flanking sequences up- and downstream of the TATA box are contacted by TFIIA (30,31) and TFIIB (32)(33)(34), with TFIIB contacting both. Appropriately bent DNA in the TBP·DNA target may thus be critical for formation of stable ternary and quaternary complexes involving these proteins.
The trajectories of the helical axes resulting from different bends diverge rapidly (Fig. 6). For example, for a 14-bp duplex centered on the TATA box, the difference in the 5′-3′ distance between a 40° and an 80° bend is ~4 Å. Extension of the duplex by only 6 bp up- and downstream, for example, more than triples that difference, from ~4 to ~13 Å. TBP-bound T6 and AdMLP have angles of ~40° and ~80°, respectively, and 6-bp extensions correspond generally to the flanking contact regions for TFIIA and -B. Formation of a stable higher-order structure is thus predicted to be less probable for the TBP·T6 complex than for the TBP·AdMLP complex, due to the spatial requirements.
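The growth of the end-to-end difference with arm length can be illustrated with a simplified planar calculation. The sketch assumes a symmetric two-kink geometry with each kink deflecting its arm by half the total bend, a rise of 3.4 Å per base-pair step, and the 20.4 Å kink-to-kink distance of Fig. 2. It ignores the helical phasing of the flanking DNA, so it is not expected to reproduce the quoted ~13 Å value exactly. The function names are hypothetical.

```python
import math

RISE_PER_STEP = 3.4  # Angstrom per base-pair step, canonical B-DNA (assumption)
L2_ANGSTROM = 20.4   # kink-to-kink distance from the Fig. 2 caption


def end_to_end(n_bp: int, alpha_deg: float) -> float:
    """End-to-end distance of an n_bp duplex bent by alpha_deg in total.

    Planar symmetric two-kink sketch: R = L2 + (contour - L2) * cos(alpha / 2).
    """
    contour = (n_bp - 1) * RISE_PER_STEP
    arms = contour - L2_ANGSTROM  # combined length of the two flanking arms
    if arms < 0:
        raise ValueError("duplex is shorter than the kink-to-kink distance")
    return L2_ANGSTROM + arms * math.cos(math.radians(alpha_deg) / 2.0)


def bend_difference(n_bp: int, alpha_small: float = 40.0,
                    alpha_large: float = 80.0) -> float:
    """Difference in end-to-end distance between the two bend angles."""
    return end_to_end(n_bp, alpha_small) - end_to_end(n_bp, alpha_large)
```

Comparing `bend_difference(14)` with `bend_difference(26)` (the 14-mer extended by 6 bp on each side) shows the same qualitative behavior as the text: the difference is close to 4 Å for the short duplex and grows several-fold once the flanking arms are lengthened, because it scales with the arm length rather than with the fixed central segment.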
FIG. 5. The correlation between the solution bend angles of the TBP-bound duplexes and the corresponding in vivo (A) and in vitro (B) transcription activities reported by Wobbe and Struhl (6), with the activity of the AdMLP sequence set at ++ and 1.00, respectively. The bend angles were determined using the two-kink model (Fig. 2B). For B, the correlation = 0.98 and the coefficient of determination = 0.97. These data show a minimal relationship between transcription activity and the association equilibrium constant (correlation = 0.88 and coefficient of determination = 0.77).

FIG. 6. The helical trajectories corresponding to 40° (dotted lines) and 80° (solid lines) DNA bends. The lighter segments of each trajectory correspond to that part of the 14-mer duplex beyond the 5′ and 3′ phenylalanine insertion sites, beginning with positions −31 and −24, respectively. The −20 position downstream of the TATA box (q) and the −38 position upstream (E) delineate the TFIIB contact regions and the −42 position (R), the TFIIA contact region. The distance between the up- and downstream TFIIB contacts (double arrows) differs by ~7 Å for 40° (T6) and 80° (AdMLP) bends.

In drawing a correlation between the apparent bend angle and transcriptional activity, however, consideration must be given to the experimental conditions of the respective studies. The in vitro transcription assays were performed in the presence of osmolyte (6,35). As shown in the accompanying paper (24), the conformations of some bound variant sequences are sensitive to the presence of osmolyte. Because a significant correlation is observed between bend angle and transcriptional activity both in vitro and in vivo (Fig. 5, A and B), it is plausible that the extremely small differences in energy between conformers for these sequences (24) are compensated in osmolyte by protein·protein interactions among multiple transcription factors. How the binding of even one additional transcription protein, in osmolyte, might affect the equilibrium among sequence-dependent TBP·DNA conformers is not known. Thus, effects of osmolytes on the conformation of the binary complex within multiprotein complexes require further exploration.
However, unequivocal new insight is provided by elucidation of the solution structures of TBP·AdMLP and TBP·A3. These binary complexes with the high and low extremes of the observed bend angles correspond to the high and low extremes of transcriptional activity. The solution geometries of these two complexes are insensitive to the presence of osmolyte (24) and establish clearly the relationship between transcription activity and the structure of the binary complex.
Conclusions-The geometries of the TBP-bound variant TATA sequences in solution vary significantly and differ from their corresponding co-crystal structures. These solution conformations are consistent with DNA bend angles ranging from ~30° to ~76° based on a two-kink bending model. A strong correlation between the solution bend angles and relative transcription activity, but not with TBP·DNA affinity, is observed. This correlation is particularly notable, because efficient transcription requires complex geometric relationships among many proteins, and summarizing such complexity with a single, simple bend angle must be, to some extent, an oversimplification.
This model contrasts with models in which the TBP·DNA binary complex structure is conserved (8) and sequence-dependent differences in transcription efficiency derive primarily from sequence-dependent differences in the stability of that complex (7,8). Our results support a model in which transcription efficiency derives in significant part from the sequence-dependent structure of the TBP·TATA binary complex. More specifically, the distance distribution for the average solution structure of the TBP·TATA complex may reflect the sequence-dependent probability for the complex to assume a conformation in which the TATA box DNA is severely bent. Upon assumption of this geometry, the binary complex becomes a target for binding and correctly orienting the other components of the preinitiation complex.