Base pair opening in three DNA-unwinding elements.

DNA-unwinding elements are specific base sequences that are located in the origin of DNA replication where they provide the start point for strand separation and unwinding of the DNA double helix. In the present work we have obtained the first characterization of the opening of individual base pairs in DNA-unwinding elements. The three DNA molecules investigated reproduce the 13-mer DNA-unwinding elements present in the Escherichia coli chromosome. The base sequences of the three 13-mers are conserved in the origins of replication of enteric bacterial chromosomes. The exchange of imino protons with solvent protons was measured for each DNA as a function of the concentration of exchange catalyst using nuclear magnetic resonance spectroscopy. The exchange rates provided the rates and the equilibrium constants for opening of individual base pairs in each DNA at 20 degrees C. The results reveal that the kinetics and energetics of the opening reactions for AT/TA base pairs are different in the three DNA-unwinding elements due to long range effects of the base sequence. These differences encompass the AT/TA base pairs that are conserved in various bacterial genomes. Furthermore, a qualitative correlation is observed between the kinetics and energetics of opening of AT/TA base pairs and the location of the corresponding DNA-unwinding element in the origin of DNA replication.

DNA replication is a fundamental process in living cells that ensures transmission of genetic information between generations. In prokaryotic organisms, such as enterobacteria, replication starts at a unique origin site called oriC (1). The oriC provides a defined starting point for DNA replication, where the two parental strands of the DNA double helix separate such that semiconservative replication can be initiated. These functions are achieved by the selection of suitable base sequences in the oriC site and by its interactions with specific origin-binding proteins.
The organization of the oriC site in enterobacteria is shown in Fig. 1. The sites labeled R1-R4 contain 9 base pairs each and are the binding sites for the initiator DnaA protein (hence, the sites are also termed DnaA boxes). To the left, there are three tandem 13-mer sites, labeled right (R), middle (M), and left (L). The base sequences of these sites are highly conserved in enterobacteria (1,2). Their consensus sequence (see Fig. 2) occurs nowhere else in bacterial chromosomes but in replication origins.
In the currently accepted model, DNA replication initiates with the binding of several copies of DnaA protein to the DnaA boxes. The binding induces unwinding and opening of the double helix at the neighboring M and R sites (1). The resulting "open complex" or "bubble" provides the entry site for DnaB helicase. Two hexameric DnaB molecules, assisted by DnaC protein, assemble onto the open M and R sites (3). DnaB encircles the M and R sites and forces the remaining 13-mer site L to unwind also. This event completes formation of the replication bubble, with the two DnaB helicase molecules now in position to unwind the DNA duplex bidirectionally from oriC.
The unwinding of the three 13-mer sites is the main event that must precede the onset of replication (1). The unwinding is specifically localized at the 13-mer sites and is initiated by the long range strain induced in the DNA double helix by binding of DnaA protein to the DnaA boxes. Interestingly, the unwinding occurs even in the absence of origin-binding proteins. Kowalski et al. (4,5) were the first to show that in the absence of replication proteins the three 13-mer sites are specifically cleaved by single-strand-specific nucleases, when placed in supercoiled plasmids. This result demonstrated that DNA supercoiling alone can induce unwinding of the three 13-mer sites. Specific unwinding in the absence of proteins and under negative supercoiling has been identified in a variety of other replication origin and regulatory sites (6-9). Because of their unique functional properties, these sites have been termed DNA-unwinding elements or DUEs (5).
The molecular origin of the unwinding properties of DUEs is not yet understood. Several software packages have been developed to analyze the free energy required to unwind and separate the strands of the double helix in various DNA base sequences. Some analyses (10,11) use the free energies of dinucleotide steps deduced from DNA-melting experiments (12). Other analyses incorporate the energetic coupling that occurs between DNA sites under negative supercoiling (13). Experimental characterizations of DUEs by high-resolution structural methods have not yet been obtained.
In the present work, we have investigated the properties of three DUEs using imino proton exchange and NMR spectroscopy. The combined use of these two approaches allows characterization of the stability and dynamics of the three DNA double helices at the level of individual base pairs (14,15). The DNA molecules investigated are shown in Fig. 2. The sequences of the central 13 base pairs (positions 3-15) correspond, respectively, to the sequences of the L, M, and R DUE sites in the oriC of Escherichia coli. Two GC/CG base pairs are added at each end of each duplex to minimize the effects of fraying on the exchange of imino protons.

MATERIALS AND METHODS
DNA Samples-For each DNA duplex of interest, the two strands were synthesized separately, using phosphoramidite chemistry on an automated DNA synthesizer (Applied Biosystems model 381A). The six DNA strands were purified by reverse-phase high pressure liquid chromatography on a PRP-1 column (Hamilton) in 50 mM triethylammonium acetate buffer at pH 7, with a gradient of 5-32% acetonitrile for 39 min, at 60°C. The counterions were replaced with Na+ by repeated centrifugation (3-4 times) through Centricon YM-3 tubes (Amicon Inc.) using 0.5 M NaCl. The DNA solutions were desalted by repeated centrifugation (8-10 times) against water. The corresponding two strands were annealed by equilibrating them in a water bath at 80°C for 30 min, followed by slow cooling down. The extinction coefficients were calculated as described by Cantor et al. (16): 336 OD260/mol for L duplex, 329 OD260/mol for M duplex and 330 OD260/mol for R duplex. Accordingly, the DNA concentrations in the NMR samples were 1.7 mM for duplex L, 2.0 mM for duplex M, and 1.9 mM for duplex R. All three DNA samples were in 90% H2O, 10% D2O, in 0.1 M NaCl and 1 mM EDTA at a pH between 8 and 9. Each DNA sample also contained 1 mM triethanolamine, which was used to measure the pH of the DNA solution in each NMR experiment. The pH was determined directly in the NMR tube from the difference in chemical shifts of the two methylene proton resonances of triethanolamine as we described previously (17).
NMR Experiments-The rates of exchange of imino protons with solvent protons were measured by NMR on a Varian INOVA 500 spectrometer using transfer of magnetization from water (18,19). The water proton resonance was selectively inverted using a Gaussian 180° pulse (6-7 ms). A weak gradient (0.21 gauss/cm) was applied during the exchange delay following water inversion, to prevent the effects of radiation damping upon the recovery of water magnetization to equilibrium. A second Gaussian pulse (2-4 ms) was applied at the end of the exchange delay to bring the water magnetization back to the z axis. The signals were observed with the Jump-and-Return pulse sequence (20). The exchange delays were in the range from 2 to 600 ms.
The dependence of the intensity of an imino proton resonance on the exchange delay (t) is given by (15)

$$I(t) = I^0 + \left[I(0) - I^0 - A\right] e^{-(R_1 + k_{ex})\,t} + A\, e^{-R_{1w}\,t} \qquad (1)$$

where I^0 is the equilibrium intensity, I(0) is the intensity immediately after the first selective pulse on water, k_ex is the exchange rate, and R_1 is the longitudinal relaxation rate of the observed proton. The factor A is defined as

$$A = \frac{k_{ex}}{R_1 + k_{ex} - R_{1w}} \cdot \frac{I_w(0) - I_w^0}{I_w^0} \cdot I^0 \qquad (2)$$

where I_w(0) and I_w^0 are the intensities of the water proton resonance after the inversion pulse and at equilibrium, respectively, and R_1w is the longitudinal relaxation rate of water protons. I_w(0), I_w^0, and R_1w were measured in separate experiments. The exchange rate of each imino proton was obtained by fitting the intensity of the corresponding resonance to Equation 1 using a non-linear least-squares program.
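The magnetization-transfer model can be sketched numerically as follows. This is a minimal illustration, assuming the standard two-term form of the recovery curve (a fast term decaying at R_1 + k_ex and a slow term following the water recovery at R_1w); the function names and all numerical values are hypothetical and are not taken from the measurements reported here.

```python
import math


def transfer_factor(k_ex, r1, r1w, iw_init, iw_eq, i_eq):
    """Factor A: weight of the water-recovery term in the imino signal.

    Undefined when r1 + k_ex equals r1w (the two decay rates coincide).
    """
    return k_ex / (r1 + k_ex - r1w) * (iw_init - iw_eq) / iw_eq * i_eq


def imino_intensity(t, i_eq, i_init, k_ex, r1, r1w, iw_init, iw_eq):
    """Imino proton intensity after an exchange delay t (in seconds)."""
    a = transfer_factor(k_ex, r1, r1w, iw_init, iw_eq, i_eq)
    fast = (i_init - i_eq - a) * math.exp(-(r1 + k_ex) * t)
    slow = a * math.exp(-r1w * t)
    return i_eq + fast + slow


# Hypothetical parameters for illustration only: fully inverted water,
# imino proton untouched by the selective pulse, rates in s^-1.
params = dict(i_eq=1.0, i_init=1.0, k_ex=10.0, r1=2.0, r1w=0.4,
              iw_init=-1.0, iw_eq=1.0)

for delay in (0.002, 0.05, 0.1, 0.6):
    print(f"t = {delay:5.3f} s  I(t) = {imino_intensity(delay, **params):+.3f}")
```

In an actual analysis, k_ex (and, if needed, R_1) would be the adjustable parameters of a non-linear least-squares fit of this function to the measured intensities, with the water parameters held at their separately measured values.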
The exchange rate measurements were carried out as a function of the concentration of ammonia, in the presence of 0.1 M NaCl, 1 mM EDTA, and 1 mM triethanolamine. Increasing concentrations of ammonia were obtained by adding to the DNA sample small aliquots of a stock ammonia solution in 0.1 M NaCl, 1 mM EDTA, and 1 mM triethanolamine at pH 8.0. The final concentration of ammonia in the sample was measured by 14N NMR using a VXR-400 NMR spectrometer. The intensity of the 14N NMR resonance of ammonia was calibrated separately for a range of ammonia concentrations from 0.1 to 4.5 M. The concentration of the ammonia base NH3 was calculated from the total ammonia concentration (C_0) as

$$[\mathrm{NH_3}] = C_0 \, \frac{10^{-\mathrm{p}K}}{10^{-\mathrm{pH}} + 10^{-\mathrm{p}K}} \qquad (3)$$

with a pK value of ammonia of 9.40 at 20°C (21).
Theory of Imino Proton Exchange in DNA-Characterization of the kinetics and energetics of base pair opening in DNA relies upon the exchange of imino protons (N3H in thymines and N1H in guanines) with water protons. In double helical DNA, imino protons are hydrogen bonded and are not accessible to water or exchange catalyst molecules. Therefore, for the exchange to occur, the base pair must open (14,15). In the opening reaction, the hydrogen bond at the imino group breaks, and the base swings into an open state where the imino proton can be abstracted by water or by exchange catalysts.
The exchange rate measured experimentally, k_ex, depends on the rates of opening and closing of the base pair, k_op and k_cl, respectively, and on the concentration of exchange catalyst. When the ammonia base NH3 is the catalyst for exchange, as in the present work, this dependence is (14,22)

$$k_{ex} = \frac{k_{op}\left(k_0 + k_{\mathrm{NH_3}}[\mathrm{NH_3}]\right)}{k_{cl} + k_0 + k_{\mathrm{NH_3}}[\mathrm{NH_3}]} \qquad (4)$$

where k_0 is the rate of proton transfer from the open state of the base pair in the absence of ammonia, and k_NH3 is the rate constant for the transfer of the proton from the open state to NH3. The rates of opening and closing of an individual base pair are obtained by fitting the experimentally measured exchange rate as a function of the concentration of ammonia base to Equation 4. The rate constants k_NH3 for DNA double helices have been calculated based on results for isolated mononucleotides (15,23), for example, 1×10^9 M^-1 s^-1 for the imino proton in guanine and 4.3×10^8 M^-1 s^-1 for the imino proton in thymine, at 20°C. The equilibrium constant of the opening reaction is calculated from the rates of opening and closing as

$$K_{op} = \frac{k_{op}}{k_{cl}} \qquad (5)$$

The equilibrium constant K_op provides a measure of the energetic stability of a base pair (15,22); the higher the K_op value, the lower the base pair stability. Two regimes for imino proton exchange are generally observed depending on the concentration of ammonia base. The EX1 regime occurs at high concentrations of NH3 such that k_0 + k_NH3·[NH3] >> k_cl. In this case, as Equation 4 shows, the exchange is rate-limited by the opening of the base pair. Hence,

$$k_{ex} = k_{op} \qquad (6)$$

and the experimentally measured exchange rate equals the opening rate. Equation 4 can also be approximated as

$$k_{ex} = K_{op}\left(k_0 + k_{\mathrm{NH_3}}[\mathrm{NH_3}]\right) \qquad (7)$$

when the concentration of ammonia base is small such that k_0 + k_NH3·[NH3] << k_cl. In this regime (EX2 regime), the equilibrium constant of the opening reaction, K_op, can be determined from the linear dependence of the exchange rate on the concentration of ammonia base.
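The two exchange regimes can be illustrated with a short numerical sketch. It assumes the two-state opening model, in which k_ex = k_op·k_tr/(k_cl + k_tr) with k_tr = k_0 + k_NH3·[NH3], and the usual acid-base relation between total ammonia and ammonia base. The function names are hypothetical; the values of k_op and K_op are chosen only for illustration, whereas k_NH3 for thymine and the pK of ammonia are the values quoted in the text.

```python
def ammonia_base(c_total, ph, pk=9.40):
    """Concentration of the base form NH3 (M) from total ammonia and pH."""
    return c_total / (1.0 + 10.0 ** (pk - ph))


def exchange_rate(nh3, k_op, k_cl, k_nh3, k0=0.0):
    """Exchange rate k_ex (s^-1) for a two-state opening/closing model."""
    k_transfer = k0 + k_nh3 * nh3          # proton transfer from the open state
    return k_op * k_transfer / (k_cl + k_transfer)


K_NH3_THYMINE = 4.3e8                      # M^-1 s^-1 at 20 C (from the text)
k_op = 100.0                               # s^-1, hypothetical
K_op = 1.0e-5                              # hypothetical equilibrium constant
k_cl = k_op / K_op                         # closing rate implied by K_op

# Low ammonia base: k_ex grows linearly with [NH3] with slope K_op * k_NH3 (EX2).
low = 1.0e-3
ex2_estimate = K_op * K_NH3_THYMINE * low
print(exchange_rate(low, k_op, k_cl, K_NH3_THYMINE), ex2_estimate)

# Very high ammonia base (hypothetical 10 M): k_ex levels off at k_op (EX1).
print(exchange_rate(10.0, k_op, k_cl, K_NH3_THYMINE), k_op)

# At pH equal to the pK, half of the total ammonia is in the base form.
print(ammonia_base(0.05, ph=9.40))
```

With these illustrative numbers, the low-concentration exchange rate lies within a few percent of the linear estimate, and the high-concentration rate approaches k_op, which is why data covering both regimes are needed to obtain k_op and k_cl separately.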

RESULTS
The NMR resonances of imino protons in the three DNA duplexes investigated are shown in Fig. 3. The assignments of resonances to individual protons were obtained from nuclear Overhauser enhancement spectroscopy experiments as described previously (17,18). The connectivities in the nuclear Overhauser enhancement spectroscopy spectra were those expected for B-DNA conformations of the three duplexes (data not shown).
To characterize the opening of individual base pairs in each DNA we have measured the exchange rates of imino protons as a function of the concentration of ammonia base (NH3) at 20°C. A representative example of these measurements is shown in Fig. 4. As clearly seen, increasing concentrations of ammonia accelerate the exchange, as predicted by Equation 4. The dependence of the exchange rates on ammonia base concentration is illustrated in Figs. 5 and 6. Fig. 5 shows the results for five base pairs, which, in at least two of the duplexes investigated, are the same as in the consensus sequence, namely, GC3, CG6, TA9, TA10, and TA15 (Fig. 2). In duplex R, the exchange rate of the G6 imino proton was measured only at low ammonia concentration (i.e. up to 50 mM total ammonia, which corresponds to the EX2 regime of exchange). At higher ammonia concentrations the resonance of this proton shifts slightly upfield and overlaps the G8 imino proton resonance (Fig. 3). As Fig. 5 shows, for several imino protons, such as T9 in R, and T10 and T15 in M, the EX1 regime could not be reached. This is because at increasing ammonia concentrations the exchange of these protons becomes too fast to be measurable by transfer of magnetization, i.e. exchange rate higher than ~100 s^-1. Fig. 6 compares the exchange of imino protons for several base pairs, which have the same nearest neighbors in at least two of the duplexes investigated. The base carrying the imino proton is shown in bold, and its 5'- and 3'-neighboring bases are as follows: 5'-AGA-3' for guanines and 5'-ATA-3', 5'-ATC-3', and 5'-TTG-3' for thymines. The exchange rate of the T11 imino proton in the R duplex was measured only at low concentrations of ammonia (i.e. up to 75 mM total ammonia). At higher ammonia concentrations, the resonance of this proton shifts slightly upfield and overlaps the imino proton resonances from G2 and G16 (Fig. 3).
The dependence of the exchange rates on ammonia base concentration was fitted to Equation 4 (or to Equation 7 if only data in the EX2 regime were available). The fits provided the rates of opening and closing (k_op and k_cl, respectively) for each base pair. The equilibrium constants of the opening reaction (K_op) were calculated based on Equation 5. The results for k_op and K_op are summarized in Table I. The table contains only the base pairs whose imino proton resonances can be individually resolved in the NMR spectra (Fig. 2, shown in bold). The first two and the last two GC/CG base pairs are not included, because they are not part of the L, M, or R sites, and the exchange of their imino protons is affected by fraying at the ends of the duplex structure.

DISCUSSION
The results presented in Table I show that the opening of GC/CG base pairs in the three duplexes is significantly different from that of AT/TA base pairs. For GC/CG base pairs most opening rates are ~20 s^-1, whereas the opening equilibrium constants are ~2×10^-7. In contrast, for AT/TA base pairs, the opening rates are much higher and span a wider range of values, namely, from 15 to 240 s^-1. Similarly, the opening equilibrium constants are increased relative to those for GC/CG base pairs, and range from 6×10^-7 to 340×10^-7. It has been generally thought that the ability of the three 13-mer sites to unwind during initiation of DNA replication results from their high AT/TA content, i.e. 84% for the L site and 69% for the M and R sites (1,5). Our results are consistent with this view. They show that the kinetic and energetic propensities for opening of AT/TA base pairs are enhanced relative to those of GC/CG base pairs. Closer analysis of the results in Table I reveals [...] if one compares the base pairs that are conserved in the three DUEs. Fig. 2 shows the consensus sequence of the oriC DUE sites in enterobacteria. This consensus sequence was derived by comparing minimal origin sites in E. coli, Salmonella typhimurium, Enterobacter aerogenes, Klebsiella pneumoniae, Erwinia carotovora, and in the distant marine bacterium Vibrio harveyi (2). The L, M, and R sites in the oriC of E. coli contain 9 (or 10) of the 11 base pairs that constitute this consensus sequence (Fig. 2). The resolution of the NMR spectra allows comparison of opening processes for five of these base pairs in at least two duplexes, namely, GC3, CG6, TA9, TA10, and TA15 (Fig. 5). The exchange rates of guanine imino protons in the conserved GC3 and CG6 base pairs are the same, or very similar, in all three duplexes. Accordingly, the rates and equilibrium constants for opening of these base pairs are also the same, within experimental errors (Table I). In contrast to these guanines, the exchange rates of imino protons in the conserved thymines differ. For all protons shown (i.e. T9, T10, and T15), the exchange rates in the M duplex are consistently higher than those in the L duplex. Furthermore, for the T9 imino proton, the exchange rate in the R duplex is higher than that in either the M or the L duplex. These higher exchange rates reflect increases in the rates and/or equilibrium constants for opening of the corresponding AT/TA base pairs in the M and R duplexes relative to the L duplex (Table I). Hence, despite the fact that these AT/TA base pairs are conserved, the rates and equilibrium constants for their opening vary. The implication of this result is that the presence of a given base pair in a certain position of several origin sites does not necessarily ensure that its dynamic or energetic properties are the same. In fact, as our results show, the opening properties of the conserved base pair can vary dramatically depending on the extended base sequence context of the DNA. These variations are especially significant for the L, M, and R sites, because the role of these sites is to open and unwind the DNA for its replication.
The differences in the opening parameters of TA9 and TA10 that we observed could be because of the fact that in every duplex each of these base pairs is flanked by a different base pair. For example, TA9 is flanked on one side by an AT base pair in duplex L, by a GC base pair in duplex M, and by a CG base pair in duplex R. Thermal melting studies have shown that the energy of the stacking interactions at a TA base pair varies by up to 1 kcal/mol depending on the neighboring base pair (12,24). Based on these results one could expect the rates and equilibrium constants for opening of TA9 to be different in the three duplexes. However, for TA15, the nearest neighbors are the same in L and M. Yet this base pair exhibits different opening parameters in these duplexes (Fig. 5 and Table I).
FIG. 5. Dependence of the exchange rates of imino protons on ammonia base concentration for the conserved base pairs GC3, CG6, TA9, TA10, and TA15 in duplex L (circles), duplex M (squares), and duplex R (diamonds). For clarity, the exchange rates of the G6 imino proton in duplex R are shown in the insert. The curves represent non-linear least-squares fits to Equation 4. For G6 and T9 in the R duplex, the lines represent linear fits to Equation 7.
To further characterize these effects of base sequence on base pair opening we have compared several base pairs, which have the same nearest neighbors in at least two of the DUEs investigated. As shown in Fig. 6, for all guanines that are flanked on both sides by adenines (5'-AGA-3'), the exchange rates are the same. This finding parallels the invariance observed for the opening of the conserved GC3 base pair (Fig. 5). Together, the two results suggest that the opening of GC/CG base pairs is mostly determined by the nature of the nearest neighboring base pairs. It is interesting to note also that GC3 and CG6 are part of the sequence 5'-GATC-3'/3'-CTAG-5', which is the target for methylation at the N6 positions of adenines by Dam methylase. In the oriC of enteric bacteria there are 11 of these Dam sequences, and three of them are located within the DUE sites investigated here (2). Methylation of these sequences has been proposed as a mechanism to regulate initiation of DNA replication (1). Our present results show that the dynamics and energetics of the GC/CG base pairs at the Dam methylation sequences within the DUE sites are insensitive to the surrounding base sequences.
In contrast to the guanines, for all AT/TA base pairs shown in Fig. 6, the opening is not a simple function of the nearest neighbors. For example, for the 5'-ATA-3' near neighbor context, the exchange rates of T13 in duplex M and T11 in duplex R are both higher than those of T8 and T12 in duplex L. Accordingly, the rates and equilibrium constants for opening of AT13 and AT11 base pairs in the M and R duplexes are larger than those of AT8 and AT12 in the L duplex (Table I). Furthermore, within the L duplex, the rate and the equilibrium constant for opening of AT8 are larger than those of AT12. These results clearly demonstrate that the opening of AT/TA base pairs in the three DNA duplexes is strongly influenced by long range effects of the base sequence. These effects are such that the opening of these base pairs in duplexes M and R is faster and energetically more favorable than that in duplex L. This trend suggests a correlation with the function of the three 13-mer sites in DNA replication. As described in the Introduction, the L, M, and R sites are strategically located in the origin of replication (Fig. 1). The M and R sites are the closest to the DnaA boxes. When DnaA protein binds to the DnaA boxes, the M and R sites unwind. At this stage, the L site, which is located farthest away from the DnaA boxes, is intact (3). Its unwinding requires intervention of DnaB helicase in complex with DnaC protein. Our results are consistent with this directionality of unwinding; base pair opening in the L site is the slowest and the least favorable energetically, whereas base pair opening in the M and R sites is kinetically and energetically enhanced. This variability does not involve any of the GC or CG base pairs. Instead, our results suggest that the differential stability of the three sites is conferred by AT or TA base pairs. This differential stability cannot be accounted for simply by base composition. The L site, for which base pair opening is the slowest and the least favorable, has the highest AT/TA content among the three sites (i.e. 84% as compared with 69% for the M and R sites). Therefore, the variability in base pair opening must result from the different sequences of AT and TA base pairs in the three duplexes. The exact features of these sequences that are responsible for the observed variability are not known. One possibility is suggested by the fact that the L site contains a continuous tract of nine AT/TA base pairs with the sequence TAT3AT3. The sequence encompasses two shorter tracts AT3, which belong to the family of AnTm tracts. Previous work from this and other laboratories has shown that the opening of AT/TA base pairs in AnTm tracts is much slower, and energetically less favorable, than that of AT/TA base pairs in random base sequence context (25-27). These properties are present when the tract contains at least four AT/TA base pairs (i.e. n + m ≥ 4) and does not include a 5'-TpA-3' step. Based on these previous observations we suggest that the base sequence of the L site has been selected in evolution such that two AT3 tracts are present therein. The presence of these tracts stabilizes the double helix and could prevent its unwinding by the binding of DnaA protein to the distant DnaA boxes. In the M and R sites, the number of GC/CG base pairs is larger than in the L site. GC/CG base pairs are found in the 8th and 11th positions, which are variable in the consensus sequence (Fig. 2). Furthermore, the orientation of the TA base pairs in the 13th and 14th positions, which are conserved in the consensus sequence, is changed. The overall consequence of these changes is that the two AT3 tracts are eliminated or shortened to less than four base pairs. As a result, the opening of AT/TA base pairs becomes faster and more favorable than in the L site.
In summary, in the present work we have shown that the opening of individual AT/TA base pairs in the three DUEs investigated differs from that of GC/CG base pairs in at least two respects. First, the opening of AT/TA base pairs is significantly more favorable, kinetically and energetically, than that of GC/CG base pairs. Second, the opening of AT/TA base pairs is affected by long range effects of the DNA base sequences, whereas the opening of GC/CG is insensitive to the base sequence beyond that of nearest neighbors. The first difference confirms the earlier proposal that the high content of AT and TA base pairs is necessary to lower the stability and enhance the dynamics of the three DUE sites. The second difference suggests that the sequences of these AT and TA base pairs have been selected by evolution as a means to modulate the energetic stability of the three DUE sites according to their location and function in the origin of DNA replication.