J. Biol. Chem., Vol. 281, Issue 20, 14119-14128, May 19, 2006
Quantitative Analysis of in Vivo Initiator Selection by Yeast RNA Polymerase II Supports a Scanning Model*
From the Department of Biomolecular Chemistry, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin 53706
Received for publication, February 28, 2006, and in revised form, March 29, 2006.
Initiation of transcription by RNA polymerase II (RNAP II) on Saccharomyces cerevisiae messenger RNA (mRNA) genes typically occurs at multiple sites 40–120 bp downstream of the TATA box. The mechanism that accommodates this extended and variable promoter architecture is unknown, but one model suggests that RNAP II forms an open promoter complex near the TATA box and then scans the template DNA strand for start sites. Unlike transcription of most protein-coding genes, transcription of small nuclear RNA genes starts predominantly at a single position. We identify a highly efficient initiator element as the primary start site determinant for the yeast U4 small nuclear RNA gene, SNR14. Consistent with the scanning model, transcription of an SNR14 allele with tandemly duplicated start sites initiates primarily from the upstream site, yet the downstream site is recognized with equivalent efficiency by the diminished population of RNAP II molecules that encounter it. A quantitative in vivo assay revealed that SNR14 initiator efficiency is nearly perfect (~90%), which explains the precision of U4 RNA 5' end formation. Initiator efficiency was reduced by cis-acting mutations at −8, −7, −1, and +1 and trans-acting substitutions in the TFIIB B-finger. These results expand our understanding of RNAP II initiation preferences and provide new support for the scanning model.
Eukaryotes rely on RNA polymerase II (RNAP II) to synthesize all messenger RNAs (mRNAs) and most of the small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs) encoded within their nuclear genomes. Efficient and accurate transcription initiation is vital to ensure the proper expression and function of these RNAs. The recruitment of RNAP II to gene promoters is mediated through the assembly of a pre-initiation complex (PIC). RNAP II accessory proteins provide promoter specificity and the structural core for assembly of the PIC. These accessory proteins include the general transcription factors TFIID, TFIIB, TFIIF, TFIIE, and TFIIH (1–3). Some transcription factors engage in sequence-specific contacts with core promoter elements (4); one of the most fundamental interactions for PIC assembly is between the TATA-binding protein subunit of TFIID and the TATA box (5–6). In a stepwise model for PIC assembly, TATA-binding protein binding is followed by the addition of TFIIB, RNAP II-TFIIF, TFIIE, and TFIIH (7).
In metazoans, the assembly of a PIC at the TATA box results in start site selection 25–30 bp downstream (4). The architecture of the PIC is such that the transcription start site is placed precisely within the active center of RNAP II (8–9). In the yeast Saccharomyces cerevisiae, RNAP II initiation typically occurs at multiple sites at variable distances from the TATA box, with most start sites ranging from 40 to 120 bp downstream of the TATA box (10). The initiation mechanism that accommodates this extended and variable promoter architecture is unknown, but it does not appear to be dependent on assembling the yeast PIC in a manner different from that of metazoans. Yeast promoter melting has been shown to begin at the same position as in metazoans.
The initial sequence comparisons and mutational analysis of a relatively small set of yeast mRNA genes helped define three related yeast start site consensus sequences, RRYRR, TCRA, and YA(A/T)R in the non-template strand, where the initiation site is underlined, Y is pyrimidine, and R is purine (12–14). Recently, an alignment of sequences flanking 4637 yeast transcription start sites has provided a more refined consensus sequence: A(Arich)5NYA(A/T)NN(Arich)6 (15). The DNA sequences encompassing yeast transcription start sites are sometimes termed initiator elements. In metazoans the initiator is defined as a core promoter element distinct from the TATA box that nucleates PIC assembly and is sufficient for accurate transcription (16). Although there is evidence to suggest that some yeast initiators may function in this way (17–18), most appear to play a more limited role in transcriptional control and influence accuracy but not overall efficiency (2). It seems likely that the recognition and efficient utilization of yeast start sites involves a sequence-specific interaction between the yeast initiator element and either RNAP II, an accessory protein, or both.
RNAP II and TFIIB have been shown to dictate the distance from TATA boxes to start sites in yeast (19). TFIIB substitutions that confer downstream shifts in yeast start site selection map to the "B-finger" domain, which encompasses residues 55–88 of the N-terminal region (9, 20–22). The promoter sequence immediately upstream of yeast start sites can influence the severity with which TFIIB B-finger substitutions alter start site selection (23). A yeast RNAP II-TFIIB crystal structure model shows the TFIIB B-finger inserted through the RNA exit pore into the polymerase active site, suggesting that start site selection may be mediated by a direct interaction between the B-finger and promoter DNA (9). Here we report the characterization of cis- and trans-acting determinants of start site selection at the yeast U4 snRNA gene, SNR14. In contrast to the heterogeneous transcription start site selection exhibited at most mRNA genes, yeast snRNAs typically have one major start site, thus providing a model system for the study of accurate initiation. We identified a highly efficient initiator element within the SNR14 promoter, defined the positions most critical for its function in start site selection, and quantified its efficiency relative to other initiator sequences. Substitutions within the TFIIB B-finger genetically interact with SNR14 initiator mutations in a sequence-dependent manner. Quantitative analysis of the utilization of tandemly duplicated initiator elements strongly supports the proposed scanning model for yeast transcription start site selection and demonstrates that scanning by RNAP II is processive.
Plasmid Construction
SNR14 (positions −224 to +701, relative to the +1 transcription start site) was cloned by PCR amplification of a genomic DNA template isolated from yeast strain PJ432b and ligated into the BamHI site of pRS313 (CEN4, ARS1, HIS3). 5' end truncation constructs were generated by the same method but using pRS313-SNR14 as the template. pRS313-SNR14-StDup was created by using QuikChange PCR mutagenesis (Stratagene) to insert 14 bp of DNA (−13 to +1 relative to the SNR14 start site) between positions +1 and +2 of pRS313-SNR14, creating an overlapping 20-bp duplication. pRS317 (LYS2)-SUA7 contains the entire SUA7 promoter and coding region (TFIIB gene) and was constructed by ligation of the ClaI/SacI fragment of pRS314 (TRP1)-yIIBN (kindly provided by A. Ponticelli, State University of New York at Buffalo) into pRS317 (LYS2). TFIIB expressed from these constructs contains an N-terminal hexahistidine tag. All mutations within pRS313-SNR14-StDup and pRS317-SUA7 were created using the QuikChange method (Stratagene). pRS316 (URA3)-SNR14, SUA7 was constructed by ligating a PCR-amplified region of SNR14 (−224 to +701) into the SalI/XhoI sites of pRS316 (URA3)-yIIBN (kindly provided by A. Ponticelli, State University of New York at Buffalo). Oligonucleotide sequences are available upon request.
Yeast Strains
Genomic DNA isolated from PJ432b (MAT
DNase I Chromatin Footprinting
Chromatin footprinting was performed as previously described (27) using the yeast strain PJ432b. After digestion of lysed yeast cells or purified genomic DNA with DNase I (Invitrogen), cleavage sites on the non-template strand of the SNR14 promoter were mapped by primer extension using 32P-labeled oligo U4-14C, which is complementary to non-template strand residues +32 to +51. Sequencing ladders were generated by primer extension of genomic DNA using 32P-labeled oligo U4-14C and a dNTP mix containing dideoxy-ATP or -GTP.
RNA Analysis
Total cellular RNA was isolated using the guanidinium thiocyanate method including a 65 °C phenol extraction (28). Reverse transcription for the determination of the Sec3 mRNA-processing sites was performed in a 50-µl reaction volume containing 5 µg of total RNA from strain PJ432b, 50 mM Tris-HCl (pH 8.3), 8 mM MgCl2, 50 mM NaCl, 11 mM dithiothreitol, 1 mM dNTPs, 40 units of RNasin (Promega), 250 pmol of T16-EcoR1 oligo, and 37.5 units of avian myeloblastosis virus reverse transcriptase (United States Biochemical). cDNA synthesis proceeded at 42 °C for 1 h. Ten µl of the 50-µl RT reaction was used as a template for PCR in a 100-µl volume containing 20 mM (NH4)2SO4, 50 mM Tris-HCl (pH 9.0), 0.75 mM MgCl2, 50 pmol of T16-EcoR1 oligo, 100 pmol of SEC3-RT-PCR oligo, and 1 unit of MasterAmp™ Tfl DNA polymerase (Epicentre). Each PCR cycle consisted of denaturation at 94 °C for 30 s, annealing at 42 °C for 30 s, and elongation at 72 °C for 1 min. A total of 30 cycles was performed with an additional extension at 72 °C for 5 min. RT-PCR products were gel-purified and ligated into the BamHI/EcoRI sites of pRS316. Recovered plasmids were sequenced using an M13F oligo. RNA immunoprecipitations were done as described previously (29).
Each immunoprecipitation mixture contained 20 µl of swollen protein A-Sepharose CL-4B beads (Amersham Biosciences) that had previously been incubated with 5 µg of 2,2,7-trimethylguanosine Ab-1 antibody (Calbiochem), 50 units of RNasin (Promega), and 5 µg of total cellular RNA in a 200-µl volume. The precedent for Ab-1 recognition of both 7-methylguanosine and 2,2,7-trimethylguanosine caps came from Bochnig et al. (30). Primer extension analysis of 5 µg of total cellular RNA was carried out using 32P-labeled oligonucleotide U414B (complementary to nucleotides 140–159 of yeast U4 RNA) or SCR1 (complementary to nucleotides 75–92 of yeast scR1 RNA) (31). Sequencing ladders were generated using the Sequitherm EXCEL II DNA sequencing kit (Epicentre). The cDNA products were electrophoresed on 6% polyacrylamide, 8.3 M urea gels. Gels were visualized with a Storm PhosphorImager (Amersham Biosciences), and data were quantitated with Amersham Biosciences ImageQuant software (Version 5.2).
Conserved Sequence Elements Upstream of the Yeast U4 snRNA Gene, SNR14
To begin characterizing SNR14 promoter architecture, we used comparative sequence analysis to identify conserved elements upstream of the transcription start site and downstream of the 5'-adjacent gene, SEC3 (Fig. 1A). An alignment of sequences upstream of SNR14 in four different species of Saccharomyces (32) helped identify several conserved elements (Fig. 1B). The most strikingly conserved regions include the sequence immediately upstream of the transcription start site, a TATA box located 100 base pairs upstream of the start site, a T-stretch just upstream of the TATA box, and a region located 31–44 base pairs upstream of the TATA box. The most upstream conserved region may be an upstream activating sequence and in S. cerevisiae exactly matches the consensus binding site of the transcriptional activator Abf1 (33–34). In addition to promoting SNR14 transcription, another likely function for conserved sequences in this intergenic region is to direct cleavage and polyadenylation of Sec3 mRNA. RT-PCR was used to identify the predominant Sec3 mRNA 3'-processing sites. The most efficiently recovered site, which appeared in 12 of 21 clones, mapped to the middle of the putative Abf1 binding site (Fig. 1B). Six other nearby sites were represented by 1–2 clones each. This result implies that the sites of SEC3 transcription termination and SNR14 PIC assembly overlap.
DNase I chromatin footprinting was used to complement comparative sequence analysis in the search for potential SNR14 promoter elements. This procedure probed in vivo assembled chromatin by lysing yeast cells directly into a solution of DNase I. For comparison, purified genomic DNA was digested with DNase I. Cleavage sites were detected by primer extension (27). No obvious DNase I footprint was observed between the SNR14 TATA box and start site (Fig. 1C) despite the fact that the gene is single-copy and highly transcribed. This finding suggests that there is no high occupancy protein binding site in the region of the promoter that separates the location of PIC recruitment from that of transcription initiation. Subtle changes in DNase I protection and enhancement were observed at the putative upstream activating sequence, TATA box, and initiator region, consistent with partial occupancy of these elements. We could not draw any conclusions regarding the protein occupancy of the T-stretch given that it was not efficiently cleaved by DNase I. Because a scanning RNAP II complex is unlikely to provide sufficient promoter occupancy for detectable DNase I protection, the footprinting results obtained for the SNR14 promoter are consistent with the scanning model for transcription initiation.
The Conserved SNR14 TATA Box Is Not a Determinant of Start Site Position in Vivo
Functional upstream SNR14 promoter elements were roughly mapped using 5'-truncation analysis of a plasmid-borne allele (Fig. 2A). Inserts were tested in both orientations within the vector to control for effects of plasmid sequences. Primer extension of U4 RNA synthesized from its chromosomal locus showed the single major transcription start site designated as +1 (Fig. 2B, lane 1). Strains bearing SNR14 with 224 base pairs of upstream DNA on a plasmid yielded the same initiation pattern as the chromosomal locus (Fig. 2B, lanes 2 and 9).
Deletion of the putative upstream activating sequence and T-stretch had no effect on transcription efficiency or accuracy in the context of the "forward" plasmid-borne allele but decreased efficiency 2-fold in the "reverse" orientation (Fig. 2B, lanes 3 and 10). The efficiency of SNR14 transcription was reduced an additional 4- to 11-fold upon deletion of the TATA box, but transcription start site selection was changed little (Fig. 2B, lanes 4 and 11). Upon further truncation to position −74 and beyond, transcription start site selection became increasingly aberrant, resulting in the appearance of additional U4 RNA 5' ends (Fig. 2B, lanes 5-8 and 12-14). Except for the −3 reverse allele, all SNR14 truncation mutant strains are viable. Surprisingly, even strains bearing deletions of upstream DNA to position −3 expressed transcripts initiating from the +1 start site, albeit at a very low level (Fig. 2C). The effects of promoter truncations varied somewhat depending on the orientation of SNR14 alleles within the plasmid, with the reverse orientation yielding more severe effects on growth and U4 RNA synthesis. Because the differences appeared to be limited to transcription efficiency and not transcription start site selection, they are likely due to functional promoter elements in the vector sequences.
To confirm that the additional U4 RNA 5' ends observed upon truncation of the SNR14 promoter were due to misinitiation and not degradation, we tested for the presence of the methylguanosine 5' cap that is added to nascent RNAP II transcripts (35). Because the 7-methylguanosine cap attached co-transcriptionally to RNAP II transcripts is hypermethylated to trimethylguanosine on snRNAs, we used a monoclonal antibody that recognizes both cap structures. RNAs corresponding to all major U4-specific primer extension products were efficiently immunoprecipitated with an anti-cap antibody (Fig. 3A), confirming that they are primary transcripts. The scR1 RNA is synthesized by RNAP III and was not immunoprecipitated with anti-cap antibody, showing that immunoprecipitation is specific for capped transcripts. The major alternative start sites resulting from SNR14 promoter truncations were mapped to positions −35, −32, and −27, which are located within plasmid sequence, and +20, +32, +83, and +88 in the U4 coding region. Interestingly, all of the start sites mapped to a purine residue, and 5 of the 8 start sites (including the wild-type start site) immediately follow the dinucleotide CC (Fig. 3B).
Thus, transcription initiation of SNR14 from the normal +1 start site is remarkably resistant to deletion of conserved upstream promoter elements. Much like what has been observed previously at mRNA genes, the SNR14 TATA box primarily affects the frequency of RNAP II transcription and not the position of initiation (10). Alternative SNR14 initiation sites minimally include a purine preceded by a pyrimidine and most often by two cytosines. These results suggest that start site selection at SNR14 is directed by an initiator-like element, potentially in combination with a downstream (intragenic) element.
A Polar Effect on SNR14 Start Site Selection Supports a Unidirectional Scanning Model
A scanning model for yeast transcription start site selection posits that after melting DNA near the TATA box, RNAP II translocates along downstream DNA until a suitable initiation sequence is located (11). We directly tested the validity of this model by constructing an SNR14 allele with tandemly duplicated start sites, SNR14-StDup (Fig. 4A, construct 2). The SNR14-StDup construct has a 14-base pair insertion that creates two start sites identical in sequence from positions −13 to +7, which should be indistinguishable to factors binding directly to these sequences, including RNAP II. Therefore, if start site selection is driven by a random collision of RNAP II or another initiator-binding protein with the DNA, we should detect roughly equal utilization of the two sites. If the start site is defined by its distance from factors bound upstream or by a polymerase scanning from the TATA box, we should detect primarily upstream starts. In contrast, if the start site is measured from factors bound to intragenic promoter elements, we should detect primarily downstream starts (36).
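The prediction of the scanning model for two identical initiators can be made concrete with a small numerical sketch. The Python function below is an illustration only and is not part of the original analysis: it assumes that a fixed population of polymerases scans downstream from the TATA box and that each initiator captures a constant fraction of the polymerases that reach it. The function name `scanning_yields` and the efficiency value of 0.9 are hypothetical choices made for the example.

```python
def scanning_yields(efficiencies):
    """Predicted fraction of all polymerases initiating at each site under
    a unidirectional scanning model (an illustrative assumption).

    efficiencies: intrinsic initiator efficiencies (0..1), listed from the
    most upstream site to the most downstream site.
    """
    remaining = 1.0  # fraction of polymerases still scanning
    yields = []
    for eff in efficiencies:
        used = remaining * eff  # polymerases that initiate at this site
        yields.append(used)
        remaining -= used       # the rest read through and keep scanning
    return yields


# Two identical initiators with a hypothetical efficiency of 0.9.
upstream, downstream = scanning_yields([0.9, 0.9])
print(f"upstream: {upstream:.2f}, downstream: {downstream:.2f}")
```

Under these assumptions two sites of identical sequence give very unequal yields (0.90 versus 0.09), whereas a random collision model would predict roughly equal use of the two sites.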
In fact, the upstream start site (+1u) is preferentially utilized (Fig. 4A, lane 2), consistent with upstream recruitment and a directional scanning model. When the initiating non-template nucleotide of the upstream start is changed from A to T, the downstream start (+1d) is more heavily utilized, confirming that the start sites are in competition and that precise spacing from an upstream recruitment site is not required (Fig. 4A, lane 3). Changing the downstream start from A to T essentially abolishes its usage (Fig. 4A, lane 4). When both start sites are changed from A to T, weak alternative sites are used at −8d and +7d (Fig. 4A, lane 5). Weak initiation at −8 and +7 is also observed in the wild-type allele. Interestingly, initiation at −8u, the most upstream observed start site in the SNR14-StDup allele, does not increase when the +1u and +1d start sites are mutated. This finding is also consistent with the directional scanning model, in which read-through of a site should not affect initiation at sites upstream.
Yeast Start Site Efficiency Is an Intrinsic, Quantifiable Property of an Initiator Sequence
To obtain a quantitative estimate of initiation sequence preference, we assumed that a homogeneous population of initiation-competent polymerases scans unidirectionally through the −8u to +7d interval of the SNR14-StDup allele in search of a good match to the ideal initiator consensus. Start site efficiencies were calculated by dividing the relative yield of a start site product by RNAP II flux at that site. The relative yield of a transcript from a given start site was determined by dividing its signal intensity by the total signal intensity of products from all detectable start sites (−8u, +1u, −8d, +1d, +7d). RNAP II flux was defined as the relative number of polymerases encountering a given start site and was arbitrarily assigned a value of 100 units at the −8u site.
Because 2% of the U4 cDNA ends at the −8u position, 2 units of RNAP II must have initiated at this site, and 98 units continued to scan (assuming there is no loss of RNAP II except by detectable initiation in the −8u to +7d interval). When start site efficiency is determined without considering flux, the +1u and +1d start site efficiencies differ by about 10-fold (87% versus 8%). When RNAP II flux at the two positions is included in the calculation, the +1u and +1d start site efficiencies are found to be equal at 89% (Fig. 4B). Even when the efficiency of the +1u site is reduced more than 10-fold by the A+1uT mutation, the efficiency of +1d and other downstream sites remains about the same when flux is considered (Fig. 4C). These results indicate that initiator efficiency is an intrinsic property that is largely independent of start site position. We can, therefore, use this value to classify initiator strength. For example, we can deduce that the +1 initiator sequence of the U4 gene is a nearly perfect initiator, with an efficiency of 89–92%.
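The flux correction described above can be sketched in a few lines of Python. This is a minimal illustration under the stated assumption that polymerases leave the scanning population only by detectable initiation; the function name `flux_corrected_efficiencies` is hypothetical. The relative yields for −8u, +1u, and +1d (2, 87, and 8%) are taken from the text, whereas the values used for −8d and +7d (2 and 1%) are placeholders chosen only so that the yields sum to 100, not measured values.

```python
def flux_corrected_efficiencies(signals):
    """signals: list of (site, signal_intensity), ordered upstream to downstream.

    Returns a dict mapping site -> (relative_yield, flux, efficiency), where
    flux is the percentage of polymerases reaching the site (100 at the first
    site) and efficiency = relative_yield / flux.
    """
    total = sum(intensity for _, intensity in signals)
    flux = 100.0
    out = {}
    for site, intensity in signals:
        rel_yield = 100.0 * intensity / total
        efficiency = rel_yield / flux if flux > 0 else float("nan")
        out[site] = (rel_yield, flux, efficiency)
        flux -= rel_yield  # polymerases that initiated stop scanning
    return out


example = [
    ("-8u", 2.0),   # from the text
    ("+1u", 87.0),  # from the text
    ("-8d", 2.0),   # placeholder, not a measured value
    ("+1d", 8.0),   # from the text
    ("+7d", 1.0),   # placeholder, not a measured value
]
for site, (rel, flux, eff) in flux_corrected_efficiencies(example).items():
    print(f"{site}: yield {rel:.0f}%, flux {flux:.0f}, efficiency {eff:.0%}")
```

With these inputs the +1u and +1d sites both come out at about 89% once flux is taken into account, even though their raw yields differ by roughly 10-fold. The most downstream detectable site always receives an efficiency of 100% under this bookkeeping, because its flux equals its own yield; that value is an artifact of the normalization and should not be interpreted.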
Mutations in SNR14-StDup Define Preferred Sequences at −1 and −8 of the Initiator
Having developed a quantitative assay for in vivo start site selection, we next tested the sequence requirements for RNAP II initiation through site-directed mutagenesis of phylogenetically conserved base pairs in the SNR14 major initiator. In addition to the +1 position, the nucleotide identities of positions −8, −7, −5, −4, −2, and −1 of SNR14 are conserved across the Saccharomyces genus (Fig. 1). A double transversion mutation at positions −8u, −7u (A→T) or −2u, −1u (C→G) of the SNR14-StDup allele significantly shifted initiation toward downstream start sites, reducing efficiency of the +1u site by about 5-fold (Fig. 5A, lanes 2 and 5). Upon separation of the −8u/−7u double mutation into single point mutations, it became clear that the −8u mutation contributes more to the initiation defect than the −7u mutation (Fig. 5A, lanes 3 and 4). A similar dissection of the −2u/−1u double mutation showed that the change at position −1u accounted for all of the downstream shift (Fig. 5A, lanes 6 and 7). Transversions at −5u or −4u on their own had little if any effect on start site selection (Fig. 5B, lanes 8 and 9). The base preference at positions −8u and −1u was explicitly tested by creating all possible base substitutions. The A-8uT mutation reduced usage of the +1u start site by about 2-fold, an effect that was slightly greater than that observed for A-8uG and A-8uC (Fig. 5B, lanes 2-4). The C-1uG mutation reduced usage of the +1u start site by at least 5-fold, an effect that was followed closely by C-1uA (Fig. 5B, lanes 5 and 6). The C-1uT mutation had a very minor effect on start site selection (Fig. 5B, lane 7). Overall, it appears that the strength of the SNR14 start site is dependent on a purine at +1, a pyrimidine at −1, and an adenine at −8.
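The positional preferences summarized above can be expressed as a simple sequence filter. The Python sketch below only illustrates that rule (purine at +1, pyrimidine at −1, adenine at −8 on the non-template strand); it is not a predictor of initiator efficiency, it ignores other positions that may also contribute, and the function name `candidate_start_sites` and the test sequences are hypothetical.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def candidate_start_sites(non_template, require_a_minus8=True):
    """Return 0-based indices of nucleotides that could serve as +1.

    A position qualifies if it is a purine, the preceding base (-1) is a
    pyrimidine and, optionally, the base 8 nt upstream (-8) is an adenine.
    There is no position 0, so -1 is index i - 1 and -8 is index i - 8.
    """
    seq = non_template.upper()
    first = 8 if require_a_minus8 else 1
    hits = []
    for i in range(first, len(seq)):
        if seq[i] not in PURINES or seq[i - 1] not in PYRIMIDINES:
            continue
        if require_a_minus8 and seq[i - 8] != "A":
            continue
        hits.append(i)
    return hits


# Made-up sequences: A at -8, C at -1, A at +1 (index 8) in the first one.
print(candidate_start_sites("ATTTTTTCA"))                          # [8]
print(candidate_start_sites("GTTTTTTCA"))                          # []
print(candidate_start_sites("GTTTTTTCA", require_a_minus8=False))  # [8]
```

Relaxing the −8 requirement recovers the minimal pyrimidine-purine rule, which matches many more positions and is correspondingly less selective.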
There are other sequences within the RNAP II scanning window between the SNR14 TATA box and +1 start site that resemble start sites but at which initiation does not efficiently occur. We predicted that changing these sites to match the bases preferred at positions −8, −7, and −1 would contribute to more efficient start site usage. Weak initiation occurs at −8u in a wild-type SNR14-StDup allele (Fig. 5C, lane 1). The C-16u/15uA or T-9uC mutations on their own increase usage of the −8u site by about 5- or 2-fold, respectively (Fig. 5C, lanes 2 and 3). Combining these mutations has an additive effect, resulting in a 10-fold increase in −8u start site strength (Fig. 5C, lane 4). The optimization of the −8u start site toward a higher efficiency further demonstrates the importance of the −8, −7, and −1 positions in SNR14 initiator function. The efficiencies of 26 different wild-type and mutant yeast initiators calculated from various SNR14-StDup alleles are shown in Table 1, with values ranging from about 89 to 4%. The wild-type SNR14 +1 site (A+1d, A+1u) is the most efficient, and divergence from this sequence reduces start site efficiency. Changes at the −8, −7, −1, and +1 positions reduce start site efficiency anywhere from about 2- to 15-fold. Although the efficiency of the wild-type A-8u start was improved 10-fold by changing the −8, −7, and −1 positions toward a more preferred initiator sequence, it is interesting to note that the efficiency of this site is still around 2-fold less than the wild-type A+1 start site. The −8u and −8d start site efficiencies differ by 6-fold (4 versus 24%) even when the flanking sequences are identical from −9 to +7, as is the case in the StDup-A+1uT allele (Table 1). Taken together, these results indicate that there are positions other than −8, −7, −1, and +1 at which nucleotide identity influences initiator efficiency.
Substitutions in the TFIIB B-finger Exacerbate the Effect of Initiator Mutations at −8 and −1
One possible cause of the start site selection defects exhibited by SNR14 initiator mutants is a disruption of direct amino acid/nucleotide contact(s) made between a protein component of the yeast PIC and the initiator. To analyze the role of TFIIB as the potential trans-acting component of the PIC that interacts with the yeast initiator element, we generated a double knock-out strain that has disrupted chromosomal copies of the SNR14 and SUA7 (TFIIB) genes and carries wild-type copies of these genes on a URA3-marked plasmid. Standard plasmid shuffle protocols were used to introduce mutant alleles of SNR14 and SUA7. The effect of TFIIB B-finger substitutions previously shown to alter initiation on protein-coding genes (22–23) was tested in the context of the SNR14-StDup allele. For the most part, substitutions in residues 63, 64, 66, and 78 of TFIIB all caused a similarly modest shift in transcription initiation from upstream to downstream sites, reducing +1u start site efficiency by about 1.5-fold (Fig. 6A). The effect of the W63R substitution was less severe than W63P and the other TFIIB substitutions, consistent with what has been observed at the CYC1 and ADH1 genes (22). Overall, the effect of SUA7 mutations on SNR14 start site selection was not as dramatic as has been observed on some mRNA genes, but this is not surprising given that the sensitivity of genes to TFIIB substitutions is known to be dependent on the sequence immediately upstream of the start site (23).
The superimposition of the RNA-DNA hybrid from the RNAP II transcribing complex structure upon a recent crystal structure of yeast RNAP II-TFIIB suggested that template-strand DNA is adjacent to conserved residues of the TFIIB B-finger domain (9). The points of closest contact included nucleotides −6 to −8 (relative to the nucleotide addition site at +1) and B-finger residues 62–66. These structurally predicted contacts are consistent with results showing that the archaeal TFIIB homologue cross-links to template DNA near the transcription start site (37–38). We tested whether mutations in SUA7 genetically interact with mutations in the initiator element at positions that are required for accurate start site selection. An analysis of +1u start site efficiency in the context of the TFIIB R64A substitution revealed a difference in the level of enhancement between −8 and −1 initiator mutants, suggesting that the effect of R64A on start site selection is influenced by the initiator sequence (Fig. 6B). The C-1uG/R64A mutant exhibited a 2-fold reduction in +1u efficiency relative to C-1uG alone, similar to the effect of the R64A substitution with the wild-type +1u site (1.5-fold). In contrast, the A-8uT/R64A mutant exhibited about a 4-fold reduction in +1u efficiency relative to A-8uT alone. The differential level of sensitivity of +1u variants to R64A was not observed at the +1d site, consistent with the fact that A-8uT and C-1uG specifically alter usage of the upstream site in SNR14-StDup. Of the remaining start sites tested, −8d was very sensitive to R64A (9-fold reduction), whereas −5d was relatively insensitive. Overall, these data support the notion that TFIIB B-finger substitutions affect RNAP II start site selection in a manner dependent on the sequence of the initiator, an element that extends upstream to at least the −8 position.
The synthesis of non-coding RNAs, especially snRNAs and snoRNAs, puts strong demands on the accuracy and efficiency of transcription initiation by RNAP II. The transcription start site of such RNAs usually corresponds to a unique mature 5' end, and its precise placement may be required for optimal RNA function. Yeast snRNAs and snoRNAs typically have steady-state levels of hundreds of copies per cell, so their genes must be actively transcribed. Non-coding RNA gene promoters are, therefore, interesting subjects for study of the optimal sequences for directing initiation by RNAP II. Here we provide evidence that the S. cerevisiae U4 snRNA gene, SNR14, fulfills these stringent requirements by coupling a consensus TATA box with a nearly perfect initiator element. Furthermore, the DNA between the TATA box and initiator is devoid of initiator-like sequences that might divert RNAP II from the proper start site as it scans downstream from the TATA box. The differential utilization of duplicated initiator elements in artificial variants of the SNR14 promoter strongly supports the scanning model of start site selection by RNAP II in budding yeast and demonstrates that initiator element efficiency is an intrinsic property dependent primarily on the sequence at positions −8, −7, −1, and +1 relative to the start site.
Architecture of the SNR14 Promoter
In terms of their general promoter architecture, yeast snRNA genes bear a strong resemblance to mRNA genes both in the position and function of their core elements. In agreement with their observed roles in yeast mRNA genes, the conserved TATA box and initiator elements of the SNR14 promoter primarily influence RNAP II transcription efficiency and accuracy, respectively. A more distinctive feature of the SNR14 promoter is the presence of a conserved T-stretch and putative Abf1 binding site. Abf1 sites have previously been identified upstream of T-rich stretches in the promoters of ribosomal protein genes (39) and snoRNA genes (40).
Both microarray-based readout of chromatin immunoprecipitation (ChIP-chip) and protein binding to microarrays (PBM) have identified SNR14 as a target gene for Abf1 (33, 41). The similarity in upstream sequences between genes encoding components of the spliceosome, ribosome, and RNA modification machinery raises the possibility of coordinate regulation of these activities at the level of transcription. In one orientation of a plasmid-borne SNR14 allele, deletion of the putative Abf1 site and T-stretch led to a 2-fold reduction in U4 RNA level, suggesting a potential role for one or both of these elements in transcription efficiency. In the context of the ribosomal protein-coding gene RPS28A, a mutation that destroys Abf1 binding in vitro reduced transcription by 10-fold, whereas substitutions in the T-rich element reduced transcription by 2-fold (39). Other potential functions for Abf1 within the SNR14 promoter (e.g. genome partitioning, nucleosome organization) would likely require a more native chromosomal environment than was provided in our study (42). The potential role for Abf1 in genome partitioning of this region is made more interesting given the observation that the sites of SEC3 transcription termination and SNR14 PIC assembly overlap.
Refinement of the Yeast RNAP II Initiator Consensus Sequence
The preference for particular nucleotides at positions −1 and +1 of the yeast initiator has been widely reported in the literature. Although a YA initiator consensus applies well to many documented yeast start sites, it is too minimal to have predictive value. A point mutation analysis of an RRYRR consensus initiator from the TRP4 gene revealed that the central pyrimidine and at least one of the 3'-flanking purine nucleotides were essential but alone insufficient to define a functional initiator element (43). Previous reports have identified regions immediately 5' of the original yeast initiator consensus that influence start site selection.
Maicas and Friesen (44) identified a region centered at −9 of the TCM1 gene and 95 other mRNA genes that they termed the "locator." The locator was defined as a region where the base composition of the non-template strand sharply switched from a preponderance of thymine residues to predominantly adenine residues. Rathjen and Mellor (45) identified a region from −10 to −4 (ACAGATC) of the major PGK1 start site as a "determinator" element. Deletion of the determinator resulted in a loss of initiation from the normal start site and increased use of more downstream sites. Healy and Zitomer (46) were able to show that insertion of CAAG upstream of the CYC7 gene could direct initiation at a site at which it did not normally occur, and it is interesting to note that their insertion also introduced an adenine at the −8 position on the non-template strand. Our genetic evidence supporting a preference for adenine at positions −8 and −7 of the SNR14 initiator expands the older yeast initiator consensus and can account for the earlier observations described above for the TRP4, TCM1, PGK1, and CYC7 genes. The general importance of the −8, −7, −1, and +1 positions for initiator efficiency across all RNAP II-transcribed yeast genes is supported by a recent bioinformatics study that compared 4637 yeast transcription start sites. Sequence alignment produced the consensus A(A-rich)₅NYA(A/T)NN(A-rich)₆, where the A of the YA dinucleotide (underlined in the original) is the initiation site (15). This yeast initiator consensus is more expansive than that reported for higher organisms such as Drosophila, TCA(G/T)T(T/C), or mammals, YYAN(T/A)YY (4). Although they all share the minimal consensus YA, the latest evidence suggests that yeast initiator sequence preferences extend beyond −1/+1 to include at least 8 nucleotides upstream and downstream.
Zhang and Dietrich (15) could not conclude whether the A-richness of the yeast initiator consensus sequence was important for transcription initiation or a consequence of some other aspect of genome structure. Here, we present direct evidence indicating that the adenine at position −8 and, to a lesser extent, the adenine at −7 contribute to the functionality of the yeast initiator as a start site determinant. The fact that substitution of any base besides adenine at −8 significantly decreases start site efficiency suggests that the functional impairment is not merely related to the melting potential of an A:T base pair. Rather, it suggests that the −8 position is recognized in a sequence-specific manner. How the yeast initiator sequence determines start site usage is as yet unknown. It seems likely that the initiator is recognized by a protein component of a scanning pre-initiation complex. Mutations that alter yeast transcription start site selection have been identified in numerous protein components of the PIC, including RNAP II (Rpb1, Rpb2, Rpb9), TFIIB, and TFIIF (20–21, 47–52). Of these proteins, Rpb1, Rpb2, TFIIB, and TFIIF (Tfg1, Tfg2) have also been cross-linked to DNA at or near a transcription start site (53). Faitar et al. (23) determined that mutations in yeast initiator sequences genetically interact with substitutions in the TFIIB B-finger, making certain start sites more or less sensitive to downstream shifts in transcription start site selection. Specifically, they showed that among a set of mutations made from positions −6 to +5 of an ADH1 initiator, changes at −2, +1, and +2 significantly increased or decreased the sensitivity of the +1 transcription start site to TFIIB-V79L. Here, we present evidence supporting a genetic interaction between the −8 position of the SNR14 initiator and the TFIIB B-finger, expanding our view of what constitutes an initiator element and where potential protein-DNA interactions may occur.
Furthermore, the fact that the −8d start site is highly sensitive to the TFIIB-R64A substitution in both the StDup-A-8uT and -C-1uG alleles whereas the −5d start site is essentially insensitive in both contexts indicates that sensitivity correlates with sequence and not simply initiator strength.

Implications for a Scanning Model of RNAP II Start Site Selection

A scanning model is currently the best supported explanation for how yeast start sites are selected, but experiments directly testing the basic implications of the model are scarce in the literature. We constructed an allele of SNR14 with tandemly duplicated start sites as a means to test the yeast scanning model in both a qualitative and quantitative fashion. We observed that although the upstream start site had a higher relative yield than the downstream start site, a reduced level of RNAP II flux fully accounted for the lower relative yield from the downstream site. RNAP II flux is an inherent property of a unidirectional scanning model for yeast transcription initiation. Thus, the fact that RNAP II flux can be used to resolve the observed differences in relative utilization of two identical start sites is in itself strong support for the model. The fact that our estimations of RNAP II flux so closely agree with shifts in utilization of start sites 14–22 bp apart in response to mutations suggests that RNAP II scanning is reasonably processive. A scanning mechanism of start site selection requires RNAP II to be processive to accommodate the large and variable distances between yeast TATA boxes and initiator elements. It will be interesting to test the limits of processive scanning by RNAP II. The identification of initiators with a range of efficiencies (Table 1) should aid such studies.

The yeast initiator sequence consensus is readily apparent among RNAP II-transcribed snRNA genes (SNR14, SNR19, SNR20, SNR7) and snoRNA genes.
A Weblogo alignment (54) using a pool of 22 yeast snRNA and snoRNA transcription start sites results in an initiator consensus very similar to that reported for mRNA genes, A(A-rich)₃NNYYA(A/T)N(A-rich)₂. Given that yeast RNAP II transcribes all mRNA genes and most snRNA and snoRNA genes, it is expected that the cis-acting sequence requirements for their start site selection would be similar. However, the identification of a strong match to the initiator consensus from such a small sample size of snRNA/snoRNA genes suggests a basis for why yeast snRNA/snoRNA transcription initiation is much more precise than that of mRNA genes. Simply stated, genes with a highly efficient initiator have fewer start sites because a smaller population of polymerases is available to scan further downstream. Nagawa and Fink (55) touched on this idea when they suggested that one reason that yeast genes HIS1 and CYC1 have multiple weak start sites is because they lack a strong start site. In addition to snRNA genes evolving highly efficient initiators, the regions between snRNA TATA boxes and coding regions may have also undergone negative selection against the initiator consensus to minimize the usage of non-optimal start sites. Consistent with this idea, an analysis of all 10 potential YR start sites within the RNAP II scanning window between the SNR14 TATA box and the +1 start site revealed that none contained the preferred adenine at the −8 or −7 positions. The driving force behind this proposed kind of snRNA promoter evolution would be to increase the functional capacity of the RNA gene products, which are essential for yeast viability. snRNA genes encode structural RNAs that require precise 5' ends for their function. For example, the 5' end of yeast U4 RNA engages in base-pairing interactions with the yeast U6 RNA during the splicing cycle. In contrast, mRNA transcripts contain 5'-untranslated regions that typically have no precise length requirement for proper expression.
Consequently, the 5' ends of mRNAs typically need not be formed in as precise a fashion as those of snRNAs, and their gene promoters would likely not have undergone the same type of evolutionary selection. The study of transcription initiation on yeast non-coding RNA genes has provided useful insight into the fundamental process by which RNAP II initiates RNA synthesis, particularly with regard to its accuracy. Additional genetic, biochemical, and structural studies are necessary to elucidate the underlying mechanism by which both initiator DNA and PIC proteins function in the process of RNAP II transcription start site selection.
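
The flux argument made in the scanning-model discussion above can be illustrated with a short numerical sketch. This is not the authors' analysis code. It assumes a strictly unidirectional scan in which every polymerase that does not initiate at a site continues downstream, and it assumes yields are expressed as fractions of all scanning polymerases. The function names and the example efficiency of 0.9 are hypothetical choices made for illustration.

```python
from typing import List


def yields_from_efficiencies(efficiencies: List[float]) -> List[float]:
    """Relative yield at each start site, ordered upstream to downstream.

    Assumption: a polymerase reaching a site initiates there with that
    site's efficiency; otherwise it keeps scanning downstream.
    """
    flux = 1.0  # fraction of polymerases still scanning when the site is reached
    yields = []
    for eff in efficiencies:
        yields.append(flux * eff)   # polymerases that initiate at this site
        flux *= (1.0 - eff)         # polymerases that pass through
    return yields


def efficiencies_from_yields(yields: List[float]) -> List[float]:
    """Invert the model: divide each yield by the flux that reached the site."""
    flux = 1.0
    efficiencies = []
    for y in yields:
        efficiencies.append(y / flux if flux > 0 else 0.0)
        flux -= y                   # remove polymerases that already initiated
    return efficiencies


if __name__ == "__main__":
    # Two identical initiators in tandem, each with an illustrative efficiency of 0.9.
    observed = yields_from_efficiencies([0.9, 0.9])
    print("relative yields:", observed)
    print("flux-corrected efficiencies:", efficiencies_from_yields(observed))
```

Under these assumptions, two identical initiators give very different relative yields, because only the polymerases that pass the upstream site ever reach the downstream one. Dividing each yield by the flux that reached the site recovers the same efficiency for both, which is the sense in which reduced flux can account for the lower yield from the downstream site.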
* This work was supported by National Institutes of Health, Public Health Service Grant GM44665 (to D. A. B.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The work is dedicated to the memory of Kathryn A. Kuehner (1939–2006).
1 To whom correspondence should be addressed: Dept. of Biomolecular Chemistry, University of Wisconsin School of Medicine and Public Health, 1300 University Ave., Madison, WI 53706-1532. Tel.: 608-262-1475; Fax: 608-262-5253; E-mail: dabrow{at}wisc.edu.
2 The abbreviations used are: RNAP II, RNA polymerase II; snRNA, small nuclear RNA; snoRNA, small nucleolar RNA; PIC, pre-initiation complex; TF, transcription factor; RT, reverse transcription.
We thank E. Craig, P. James, C. Guthrie, K. Shannon, and A. Ponticelli for providing plasmids and/or yeast strains and R. Landick, V. Gerlach, and current members of the Brow laboratory for discussions.