Nucleotide Sequence Context Effect of a Cyclobutane Pyrimidine Dimer upon RNA Polymerase II Transcription*

We have studied the role of sequence context upon RNA polymerase II arrest by a cyclobutane pyrimidine dimer using an in vitro transcription system consisting of templates containing a specifically located cyclobutane pyrimidine dimer (CPD) and purified RNA polymerase II (RNAP II) and initiation factors. We selected a model sequence containing a well-characterized site for RNAP II arrest in vitro, the human histone H3.3 gene arrest site. The 13-base pair core of the arrest sequence contains two runs of T in the nontranscribed strand that impose a bend in the DNA. We hypothesized that arrest of RNAP II might be affected by the presence of a CPD, based upon the observation that a CPD located at the center of a dA6·dT6 tract eliminates bending (Wang, C.-I., and Taylor, J.-S. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 9072–9076). We examined the normal H3.3 sequence and a mutant sequence containing a T → G transversion, which reduces bending and efficiency of arrest. We show that a CPD in the transcribed strand at either of two locations in the arrest site is a potent block to transcription. However, a CPD in the nontranscribed strand only transiently pauses RNAP II. The CPD in concert with a mutation in the arrest site can reduce the extent of bending of the DNA and improve readthrough efficiency. These results demonstrate the potential importance of sequence context for the effect of CPDs within transcribed sequences.

DNA repair rates along the genome are clearly heterogeneous (2). Certain lesions in actively transcribed genes are repaired much faster than in inactive loci. This can be generally ascribed to the preferential repair of the transcribed strand over the nontranscribed strand (3). Transcription-dependent repair has been shown to occur in mammalian cells (4, 5), in Escherichia coli (6), and in Saccharomyces cerevisiae (7-9), suggesting that it is a widespread phenomenon reflecting the existence of a mechanism that directly couples nucleotide excision repair and transcription.
An additional level of repair heterogeneity is represented by the sequence-specific rates of repair of UV-induced cyclobutane pyrimidine dimers (CPDs) (10-13). Analysis of repair of CPDs along the tumor suppressor gene p53 (11) revealed that the sites at which CPDs were poorly repaired were also hot spots for mutations found in skin cancers, demonstrating a direct correlation between slow repair of CPDs and mutation hot spots in skin cancer. A correlation between slow repair rates and occurrence of mutations has also been observed in the hypoxanthine-guanine phosphoribosyltransferase gene in human fibroblasts treated with benzo[a]pyrene diol epoxide (14). Slow repair of CPDs at particular sites on the transcribed strand (10, 11) may reflect efficient bypass of CPDs by RNA polymerase II (RNAP II) in specific sequence contexts. A dimer might even be masked in particular sequence contexts such that transcription-coupled repair would not be initiated.
The role of the sequence context of a lesion on transcription can be analyzed more closely by using substrates containing site-specific lesions and a transcription assay with purified RNAP II and initiation factors. Using this approach, we have previously shown that N-2-aminofluorene and N-2-acetylaminofluorene adducts located on the nontranscribed strand enhance arrest by rat liver RNAP II in vitro at a pause site located ~15 bp downstream from the site of the lesion. Removal of that sequence-specific pause site abolished the adduct-specific arrest, suggesting that the adduct is responsible for the enhanced RNAP II pausing at this site (15). This finding reveals the general importance of sequence context on polymerase arrest. Some clues to understanding the effect of DNA sequence context on repair rates of transcription-blocking lesions are provided by the conditions under which transcriptional arrest occurs. Studies of transcription in vitro using partially purified components and substrates containing specific sequences have resulted in the identification and characterization of RNAP II arrest sites. In some cases, arrest occurs at a bend in the helix axis of template DNA (16). Examples include sites within the first intron of the human c-myc gene, the first intron of the human histone H3.3 gene, and an early transcription unit of SV40. Polymerase arrest can also be induced by DNA-binding drugs (17) and by sequence-specific DNA-binding proteins (18).
The human histone H3.3 gene arrest site has been well characterized for RNAP II in vitro (Ref. 19 and references therein). It includes several runs of thymine in the nontranscribed strand that are required for transcription arrest. Circular permutation experiments have shown DNA to be bent at this sequence (16). The core of the arrest site contains the sequence 5′-TTTTTTTCCCTTTTTT-3′ in the nontranscribed strand. Arrest occurs near the end of the first T run. When the entire sequence is present in the opposite orientation it does not cause transcription arrest, indicating a polarity in the recognition of the signal by RNAP II. Point mutations in the T runs that eliminate bending by this sequence are associated with readthrough of the arrest site by RNAP II, suggesting that bending of this sequence is involved in the arrest effect. However, a bend is not sufficient to cause transcription arrest, because the arrest signal is orientation-dependent, whereas the sequences containing the signal cause a bend when inserted in either orientation.
Studies on the effect of thymine dimer formation on dAn·dTn tract bending (1) showed that a TT CPD located at the center of a dA6·dT6 tract eliminates bending of the sequence. Destruction of the A tract structure at the site of the dimer could have accounted for this effect, resulting in intact A tracts of reduced length and of reduced bending.
Based upon these findings, we have hypothesized that the efficiency of RNA polymerase arrest at the H3.3 arrest site might be affected by the presence of a thymine dimer in the center of the dA7·dT7 tract contained in the H3.3 arrest site. Using an in vitro transcription assay with purified RNAP II and initiation factors and substrates containing specifically located CPDs, we have analyzed the effect of a CPD on RNAP II arrest at the H3.3 sequence.

EXPERIMENTAL PROCEDURES
Proteins and Reagents-RNAP II and transcription initiation factors were purified from rat liver or recombinant sources as described previously (19,20). T4 polynucleotide kinase and T4 DNA ligase were from Boehringer Mannheim. Photolyase from Anacystis nidulans was a gift from Dr. Anders Eker (Erasmus University, Rotterdam, The Netherlands). D44 IgG anti-RNA antibodies (21) were purified from ascites fluid as described previously (20). Highly purified NTPs were purchased from Pharmacia Biotech Inc., and formalin-fixed Staphylococcus aureus was from Life Technologies, Inc. Radiolabeled nucleotides were purchased from Amersham Corp.
Plasmids-Construction of plasmids pAdBam and pAdSma has been described (22). To construct plasmids to receive CPD-adducted oligonucleotides, 52-base oligonucleotides with the sequences 5′-GATCCCCAAAGAGGGACGTTTTTTTCCCTTTTTTGGAGAGGCGGAAACTGGG-3′ and 5′-GATCCCCAAAGAGGGACGTTTTTTTCCCTTGTTTGGAGAGGCGGAAACTGGG-3′ were annealed to the complementary strands and ligated into the BamHI site of pAdBam in one orientation to yield pAdH1 (see Fig. 1) and pAdG1 and in the opposite orientation to yield pAdH2 and pAdG2, respectively. These plasmids are identical to pAdSma except that the SmaI recognition sequence is interrupted by a 42-bp fragment containing the normal (H3.3) or the mutant (H3.3 T → G) H3.3 arrest site (see Table I).
Insertion of Adducted Oligonucleotides into Plasmids-14- and 16-mer oligonucleotides of the sequence 5′-AAAGAGGGACGTTT-3′ and 5′-AAAGAGGGACGTTTTT-3′ containing a site-specific CPD at positions 12 and 13 or 14 and 15, respectively, were obtained from Dr. John-Stephen Taylor (Washington University, St. Louis, MO) (23). DNA templates containing a single CPD adduct on either the transcribed or nontranscribed strand were constructed by utilizing gapped heteroduplexes (GHD) as described previously (22). An adducted 14-mer and a 28-mer or an adducted 16-mer and a 26-mer were phosphorylated by T4 polynucleotide kinase and inserted into GHD using T4 DNA ligase (see Fig. 1). DNA ligation products were purified from an agarose gel containing 0.5 µg/ml ethidium bromide. Under these conditions, covalently closed circular DNA migrates as supercoiled DNA and can be resolved from nicked plasmids that resulted from incomplete ligation of the oligonucleotides into the GHD. Plasmids pAdSma and pAdH1 or pAdG1 were used to make GHD in which the adducted oligonucleotides containing the normal (pAdH1, Fig. 1) or the mutant (pAdG1) H3.3 sequence were ligated into the nontranscribed strand. Plasmids pAdSma and pAdH2 or pAdG2 were used to make GHD in which the adducted oligonucleotides containing the normal (pAdH2) or the mutant (pAdG2) arrest site were ligated into the transcribed strand.
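The relationship between the adducted oligonucleotides and the arrest-site inserts can be checked with a short script. The sketch below is an illustration only, not part of the published procedure. It assumes that the hyphens printed inside the 52-base sequences are line-break artifacts, and the names `mismatches` and `map_positions` are hypothetical helpers. It locates the single T → G substitution and maps the stated dimer positions in the 14-mer and 16-mer onto the normal 52-base sequence.

```python
# Sequences copied from the text (5'->3'); internal hyphens are assumed
# to be line-break artifacts and have been removed.
NORMAL = "GATCCCCAAAGAGGGACGTTTTTTTCCCTTTTTTGGAGAGGCGGAAACTGGG"
MUTANT = "GATCCCCAAAGAGGGACGTTTTTTTCCCTTGTTTGGAGAGGCGGAAACTGGG"
OLIGO_14 = "AAAGAGGGACGTTT"    # CPD at positions 12 and 13 (1-based)
OLIGO_16 = "AAAGAGGGACGTTTTT"  # CPD at positions 14 and 15 (1-based)


def mismatches(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def map_positions(oligo, positions, target):
    """Map 1-based positions within `oligo` onto 1-based positions in `target`."""
    start = target.find(oligo)  # 0-based offset of the first match
    if start < 0:
        raise ValueError("oligonucleotide not found in target sequence")
    return [start + p for p in positions]


if __name__ == "__main__":
    print("T -> G substitution at:", mismatches(NORMAL, MUTANT))
    print("14-mer dimer maps to:", map_positions(OLIGO_14, [12, 13], NORMAL))
    print("16-mer dimer maps to:", map_positions(OLIGO_16, [14, 15], NORMAL))
```

Under these assumptions, the substitution falls at position 31 of the 52-base sequence, in the second T run, and the dimers of the 14-mer and 16-mer fall on thymines 19-20 and 21-22, both within the first T run (positions 19-25).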
Transcription Reactions-Transcription reactions were performed as described previously (15, 22). Briefly, 50-100 ng of DNA templates were linearized with HindIII and incubated at 28°C with rat liver protein fractions D (2 µg, containing TFIID and TFIIH), B′ (1 µg, TFIIF/TFIIE), recombinant rat TFIIB (3 ng), and rat liver RNAP II (0.5 µg) to form preinitiation complexes. Nascent transcripts were radiolabeled at their 5′ ends by the addition of ATP, UTP, and 40 µCi of [α-32P]CTP. Elongation proceeded until position 15, at which the first GTP is required for incorporation. Heparin was added to prevent further initiation, and then 800 µM of each NTP was added to allow elongation to continue, typically for 15 min. Reactions were stopped with SDS and proteinase K. The RNA was precipitated with D44 anti-RNA antibodies and formalin-fixed S. aureus and then washed in reaction buffer (20 mM Tris-HCl, pH 7.9, 3 mM HEPES-NaOH, pH 7.9, 60 mM KCl, 0.5 mM EDTA, 2 mM dithiothreitol, 0.2 mg/ml acetylated bovine serum albumin, 2.2% (w/v) polyvinyl alcohol). RNA was resuspended in formamide loading dye, heat denatured, and electrophoresed through a polyacrylamide gel in TBE (89 mM Tris, 89 mM boric acid, 1 mM EDTA, pH 8) with 8.3 M urea. Gels were dried and autoradiographed using intensifying screens.
Preparation of Multimers of Synthetic Duplexed Oligonucleotides and Measurement of Their Gel Mobility-For each construction, a 5′ end-labeled 14-mer or 16-mer containing a single CPD was mixed with a 28-mer or 26-mer and a complementary 42-mer. All duplexes were constructed with 4-base 5′ protruding sticky ends. 300 pmol of CPD-containing oligo was labeled at the 5′ end by 2.5 units of T4 polynucleotide kinase [...] The samples were irradiated 10 cm from a Sylvania Blue F15T8/B lamp (λmax, 425 nm) for 1 h at 4°C, and DNA was then purified by phenol-chloroform extraction and ethanol precipitation. The ligation products, treated or mock-treated with photolyase, were analyzed on a native 8% polyacrylamide gel at room temperature and at 4°C. Gels were run at 7 V/cm until the bromphenol blue dye reached the bottom of the gel. HaeIII digests of φX174 RF DNA were labeled at the 5′ end as described above and were used as size markers.

RESULTS
Characterization of a 42-mer DNA Sequence Containing the Human Histone H3.3 Arrest Site-We selected from the H3.3 arrest site a 42-mer sequence containing the signal required for arrest (Table I) and characterized its structural and functional properties. To study DNA bending, 42-mer duplexes were constructed with 4-base 5′ protruding sticky ends, ligated together to form multimers, and run on nondenaturing 8% polyacrylamide gels. Fig. 2A shows the migration of multimers of the 42-mer sequence, ranging in length from 84 to 210 bp, containing the H3.3 arrest signal (lane H3.3) and a mutant containing a T → G transversion that interrupts the second T run of the H3.3 arrest site (lane H3.3 T → G). Both sequences migrate anomalously during electrophoresis when compared with the migration of the φX174/HaeIII molecular mass markers (lane M). For example, the bands corresponding to 168-bp multimers migrate near the 194-bp band of the marker. However, the migration anomaly of the mutant sequence was less than that of the normal sequence, suggesting that the mutation affected DNA bending (Fig. 2A, lane H3.3 T → G). This was confirmed by quantitation of bending in plots of RL versus the number of base pairs, where RL indicates the ratio between the apparent length, determined from comparison with the migration of electrophoretic markers, and real length (Fig. 2B).
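The definition of RL given above can be illustrated with the one numerical example stated in the text, the 168-bp multimer that migrates near the 194-bp marker band. The following is a minimal sketch; the function name `relative_length` is hypothetical, and the apparent length is taken as approximately 194 bp only because the text says the band migrates "near" that marker.

```python
def relative_length(apparent_bp, real_bp):
    """RL: apparent length (from marker comparison) divided by real length."""
    if real_bp <= 0:
        raise ValueError("real length must be positive")
    return apparent_bp / real_bp


MONOMER_BP = 42
# Multimers of two to five 42-mer units span the 84-210 bp range in the text.
multimer_sizes = [MONOMER_BP * n for n in range(2, 6)]

# Approximate apparent length of the 168-bp multimer, per the text.
rl_168 = relative_length(194, 168)

if __name__ == "__main__":
    print(multimer_sizes)
    print(f"RL for the 168-bp multimer is about {rl_168:.2f}")
```

An RL of 1 corresponds to a fragment that migrates according to its real length, so values above 1 indicate retarded migration of the kind attributed here to bending.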
To correlate bending with transcription arrest, the normal or the mutant H3.3 arrest sequences were inserted into GHD downstream of the major late promoter of adenovirus and assayed for transcription in vitro. Previous work using tailed templates (16) revealed 19% of full arrest activity in the 13-bp core sequence, whereas a T → G mutation in the second T run of the core sequence had a slight effect on arrest. Point mutations in the first T run of the full size (285 bp) arrest site had little effect on arrest, presumably because they are in the context of a strong, full size arrest site. We found that the template containing the mutant H3.3 arrest signal caused 20% arrest, whereas the template containing the normal H3.3 sequence caused 30% arrest (data not shown), suggesting a correlation between the degree of bending and the efficiency of arrest. When the normal or the mutant H3.3 sequence was present in the opposite orientation we observed complete readthrough of the arrest site by RNAP II, confirming that there is a polarity in the recognition of the arrest signal by RNAP II (16). We concluded that the sequence we selected from the H3.3 arrest site maintained its structural and functional properties.
Effect of a Specifically Located CPD on Transcription Arrest at the H3.3 Arrest Site-A CPD was placed either at a single site at the center (Table I, H-T∧T-16 and G-T∧T-16) or at the beginning (Table I, H-T∧T-14 and G-T∧T-14) of the dA7·dT7 tract of the H3.3 arrest site using a cis-syn thymine dimer building block (1). These CPD-containing arrest signals were inserted into GHD so that in the product they were located either in the transcribed or in the nontranscribed strand downstream of the major late promoter of adenovirus (Fig. 1). When the CPD-containing arrest signals were located in the transcribed strand, we observed only transcripts shorter than the full-length RNA present in the control (Fig. 3, lanes 2, 3, 5, and 6). Comparison of the size of these transcripts with the molecular mass markers indicated that these RNAs were extended up to or near the site of the CPD. Thus, a CPD in this sequence context is a complete block for RNAP II, as we had previously observed for a TT CPD located in the HpaI restriction site (22).
We then analyzed the effect of a site-specific CPD located in the normal or in the mutant H3.3 arrest signal when the arrest sequence was in the natural orientation. Based upon the observation that a TT CPD located at the center of a dA6·dT6 tract reduces bending of the sequence (1), we anticipated that a dimer in the center of the dA7·dT7 tract in the H3.3 arrest signal might affect the efficiency of RNAP II arrest at the H3.3 arrest site. In the controls (Fig. 4, lanes 1 and 4) we observed two transcripts corresponding to the full-length RNA and to the incomplete RNA resulting from RNAP II arrested at the normal (lane 1) or the mutant (lane 4) arrest sites. When a CPD was located in the normal arrest signal (lanes 2 and 3), the relative amount of incomplete transcripts did not change significantly from that present in the control (lane 1). Quantitation indicated that the normal arrest site caused 30% arrest (lane 1) compared with 26% (lane 2) and 21% (lane 3) for the damaged templates. When a CPD was located in the mutant arrest signal (lanes 5 and 6), we observed instead a decrease in the band corresponding to the incomplete transcript, from 20% arrest for the mutant site (lane 4) to 8% (lane 5) and 1% (lane 6) for the damaged templates, suggesting that the presence of a dimer in this sequence context reduced bending to an extent sufficient to allow RNA polymerase readthrough.
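The arrest percentages quoted above can be put on a common scale by expressing each damaged template relative to its own undamaged control. The sketch below is illustrative arithmetic on the reported values only. The measure `relative_reduction` is an assumption introduced here, not a quantity used by the authors, and the lanes are identified by number because the lane-to-construct assignment is not restated in this paragraph.

```python
# Percent arrest as quoted in the text (Fig. 4 quantitation).
ARREST_PERCENT = {
    "normal": {"control": 30, "lane 2": 26, "lane 3": 21},
    "mutant": {"control": 20, "lane 5": 8, "lane 6": 1},
}


def relative_reduction(control, damaged):
    """Fractional decrease in arrest relative to the undamaged control."""
    if control <= 0:
        raise ValueError("control arrest must be positive")
    return (control - damaged) / control


# For each arrest site, compare every damaged lane with that site's control.
reductions = {
    site: {
        lane: relative_reduction(values["control"], pct)
        for lane, pct in values.items()
        if lane != "control"
    }
    for site, values in ARREST_PERCENT.items()
}

if __name__ == "__main__":
    for site, lanes in reductions.items():
        for lane, frac in lanes.items():
            print(f"{site} site, {lane}: arrest reduced by {frac:.0%}")
```

On this scale the mutant site loses 60% and 95% of its arrest in lanes 5 and 6, whereas the normal site loses roughly 13% and 30% in lanes 2 and 3, changes the authors describe as not significant.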
Effect of a Specifically Located CPD on Bending of the H3.3 Arrest Site-To determine the degree of bending of the CPD-containing arrest signals, we analyzed the migration of ligated multimers on nondenaturing polyacrylamide gels (Fig. 5A). The presence of a dimer in the center of the first T run (H-T∧T-16 and G-T∧T-16) almost completely abolished bending of the normal and of the mutant H3.3 arrest signals (Fig. 5A). This was confirmed by quantitation of the RL values for each multimer. In contrast to the large RL value characteristic of the undamaged H3.3 and H3.3 T → G multimers, the RL values of the multimers with the dimer at the center of the A tract (H-T∧T-16 and G-T∧T-16) were almost equal to unity (Fig. 5B). To confirm that a change in migration was due to the presence of a CPD, ligated multimers were treated with A. nidulans photolyase. After photolyase treatment, the damaged ligated multimers migrated at the same position as the undamaged sequences (Fig. 5A).

DISCUSSION
We have confirmed that the presence of a CPD in the nontranscribed strand in the H3.3 arrest site allows readthrough by RNAP II (15). Additionally, we have shown that when a T → G base substitution is introduced in the second T run of that arrest site, we observe almost complete readthrough when the CPD is located in the center of the first T run of the mutant site, and this correlates with removal of bending of the H3.3 sequence by the dimer. When the CPD is located in the beginning of the first T run we observe only partial readthrough that correlates with the persistence of bending in the sequence. A CPD located in either position on the nontemplate strand in the normal H3.3 arrest site did not affect readthrough by RNAP II.
The ability of a CPD to allow readthrough of RNAP II at a natural arrest site is similar to the effect of base substitutions on the efficiency of RNAP II arrest at the H3.3 arrest site (16). Arrest signals with single or double nucleotide substitutions that interrupt either one or both runs of T residues allowed efficient readthrough by RNAP II, suggesting that both runs of T residues are required for arrest. Restriction fragments containing mutant signals migrated with mobilities closer to those predicted by the DNA sequence length, indicating that the sequences that cause transcription arrest are identical with those that cause the bend in the DNA helix (16). Footprinting analysis of the H3.3 arrest site showed that the footprint of RNAP II covers ~35 nucleotides, with the 3′ end of the transcript about 10 nucleotides upstream of the leading edge of the polymerase (24). A similar footprint has been observed in a mutant arrest site containing point mutations on both runs of T residues. However, the RNAP II complex can resume elongation after pausing at the mutant arrest site without the need for accessory elongation factors like SII, and it translocates 14-15 bp downstream, resuming the usual conformation characteristic of an elongation-competent complex (25). A cyclobutane dimer in the first run of T residues and a base substitution in the second run of T residues may have similar effects.
Current models for transcriptional arrest propose that at certain template locations RNAP fails to continue nucleotide addition, resulting in a strained configuration, characterized by a decreased distance from the 3′ end of the transcript to the leading edge of the polymerase (24, 26-28). The "sliding clamp" model suggests that when RNAP II approaches pause, arrest, or termination sites it can slide backward along the RNA and DNA chains, shifting the RNA·DNA hybrid and the transcription bubble with it and displacing the RNA 3′ end from the active site (29). This model implicates the RNA·DNA hybrid in maintaining the RNA 3′ terminus engaged with the active site of RNA polymerase (29, 30). This may explain why at certain template positions where the DNA·DNA hybrid at the leading edge of the transcription bubble is energetically favored over the RNA·DNA hybrid, as in the case of dA·dT base pairing versus dA·rU base pairing, RNAP becomes efficiently arrested at these sequences in the nontemplate strand. Conversely, the presence of a CPD in this sequence context may shift the equilibrium toward the RNA·DNA hybrid rather than the DNA·DNA hybrid (31). This is also in agreement with our previous observation that a CPD is not inherently a block to RNAP II when it is located in the nontranscribed strand (22).
A CPD in a dA6·dT6 sequence affects DNA curvature depending upon its location in the A tract. This has been explained by the "junction model" for DNA containing an A tract (1). According to this model, the base planes in A tracts are inclined relative to the helix axis, resulting in a bend at the junction with B DNA. Bends are formed in A tracts when the A tract is longer than 3 bp, at which point the unfavorable base stacking configuration is offset by favorable minor groove hydration. Formation of a CPD causes a small deformation of the double helix consisting of unwinding and bending of at least 7° relative to B form (1, 32, 33). The dimerization of the pyrimidines requires rotation of the neighboring pyrimidines from their usual B-form DNA alignment with overlapping of the 5,6 bonds. This affects B-DNA duplex stability and Watson-Crick hydrogen bonding interaction, which would interfere with cooperative A tract bending formation by preventing thymines from adopting the appropriate conformation. This results in intact A tracts of reduced length and reduced bending.
Bending of the H3.3 arrest site appears to be a structural element required for arrest by RNAP II (1). A bend in the DNA could influence the ability of the polymerase to move along the DNA helix, the contacts between RNAP II and the DNA template may be altered, or the conformation of RNAP II may change as it transcribes through this sequence (24). However, bending is not sufficient for arrest, because the arrest site is orientation-dependent, whereas the sequence is bent in both orientations. The bend geometry is likely to be important. Because the polymerase is travelling unidirectionally, the bend geometry of the sequence in one orientation would not be the same as the geometry when the sequence is positioned in the opposite orientation. This correlates with our finding that a site-specific CPD has a similar effect on bending in the normal or in the mutant arrest site, even though we observe improved readthrough efficiency of RNAP II only when the CPD is located in the mutant sequence.
Control of the rate and extent of transcription elongation is an important mechanism for regulation of gene expression in both prokaryotes and eukaryotes (26, 34). RNA polymerase pause and termination signals have been identified (29), and transcription elongation factors have been isolated (35). A lesion, such as a CPD, can interfere with normal regulation of transcription through modulation of "natural" pause sites. These effects could lead to transitory destabilization of cellular growth control if the lesion were strategically located in the appropriate gene. Recently, it was shown that CPDs also interfere with binding of cell cycle regulatory proteins, DNA damage-responsive transcription factors (36), and nucleosomes (37), suggesting that interference with DNA-protein interactions by CPDs may be another relevant effect of UV-induced DNA damage on cellular metabolism.