Identification of Proximal Sequence Element Nucleotides Contributing to the Differential Expression of Variant U4 Small Nuclear RNA Genes*

The two U4 genes in the chicken genome code for distinct sequence variants of U4 small nuclear RNA that are differentially expressed during development. Whereas U4B RNA is constitutively expressed, U4X RNA is specifically down-regulated relative to U4B in a tissue-specific manner during development. To investigate mechanisms controlling the differential expression of the U4B and U4X genes, chimeric U4 genes were constructed and their transcriptional activities assayed by injection into Xenopus oocytes or by transfection of CV-1 cells. The proximal regulatory region of the U4B gene and the enhancers of both the U4B and U4X genes functioned efficiently in each expression system. However, the proximal region of the U4X gene was inactive. To localize and identify the responsible nucleotides, reciprocal point mutations were introduced into the U4X and U4B proximal regulatory regions. The results indicate that the U4X gene contains a suboptimal proximal sequence element, and that this results primarily from the identities of the nucleotides at positions −61 and −57 relative to the transcription start site.
The small nuclear RNAs (snRNAs) of the U family (U1, U2, U3, etc.) are evolutionarily conserved and metabolically stable RNAs that function in pre-mRNA and pre-rRNA processing in the nuclei of eukaryotic cells. The genes for vertebrate snRNAs possess similar promoter structures and, with the exception of U6, are transcribed by RNA polymerase II. Two distinct evolutionarily conserved regions in the 5'-flanking DNA contribute to the expression of vertebrate snRNA genes (1-4). The more distal region, generally positioned between nucleotides −250 and −180 relative to the transcription start site, functions as a transcriptional enhancer. It normally contains at least one copy of the octamer motif (recognized by the Oct-1 transcription factor), and it usually contains functional binding sites for one or more additional proteins, such as a GC box, SPH motif, CCAAT box, cyclic AMP response element, etc. (5-12).
The proximal sequence element (PSE) is an essential promoter element centered near position −55 and is required for accurate initiation of snRNA transcription. The PSE is recognized by a transcription factor variously named PSE binding protein (13,14), PSE transcription factor (15), or snRNA activating protein complex (16). The PSE binding factor is believed to play a role in recruiting the TATA box-binding protein to snRNA promoters (16-19).
In the chicken, there are two and only two genes that code for U4 snRNA (20,21). These genes, designated U4B and U4X, are closely linked within 500 base pairs (bp) of each other in the chicken genome. The two genes encode distinct sequence variants of U4 RNA that differ at seven nucleotide positions. Moreover, the relative amount of U4X and U4B RNA is differentially regulated during chicken development. The U4X and U4B genes are expressed at nearly equal levels during early embryogenesis, but U4X gene expression is specifically down-regulated in a tissue-specific manner relative to U4B as development proceeds (22).
Although there is no direct evidence linking the differential expression of the chicken U4X and U4B RNA genes to transcriptional control, their promoter structures are consistent with such a mechanism. First, there is no obvious similarity between the U4B and U4X gene 5'-flanking DNA other than the PSE and the enhancer region. Second, both the enhancer and PSE contain a number of base differences in crucial regulatory regions that could affect the relative expression of the two U4 genes (20,23). Third, just 5' of, and partially overlapping, the U4X PSE there exists a unique dyad symmetry element that is not found in the U4B proximal region. This palindromic sequence is recognized by a sequence-specific DNA binding factor, proximal palindrome binding factor, whose DNA binding activity is inversely related to the accumulation of U4X RNA (23).
To gain further insight into the molecular basis for the differential expression of the U4X and U4B genes, we have generated constructs that possess chimeric U4B/U4X promoters, including families of constructs with reciprocal point mutations in the PSE and dyad symmetry elements. We find that the proximal regulatory region, not the enhancer, is responsible for the differential activity of the U4B and U4X genes in transient expression assays. Moreover, we identify two specific nucleotide positions that differ between the U4X and U4B PSEs that are crucial for U4 snRNA gene expression in these assays.

MATERIALS AND METHODS
Construction of Plasmids-All constructs were in the pUC19 vector. Plasmid BBB contained the chicken U4B gene, 280 bp of 5'-flanking DNA, and 283 bp of 3'-flanking DNA. Plasmid XXX contained the U4X gene, 323 bp of 5'-flanking DNA, and 116 bp of 3'-flanking DNA. These plasmids were previously designated pU4B(Sst) and pU4X(Sst) since the cloned wild type fragments were flanked by natural SstI sites (20).
Template BBX contained the U4B gene 5'-flanking DNA but the coding region and 3'-flanking DNA of the U4X gene; it was constructed by utilizing an AluI site that occurs at the beginning of the U4 coding region. The reciprocal exchange of SstI-AluI fragments produced the template XXB.
To construct template BXX, U4B sequences from −227 to −142 were joined to position −132 of the U4X gene. Template -BX was constructed by deleting U4B sequences from positions −280 to −118 in the 5'-flanking region of the BBX construct. Template XBX contained, in order, U4X sequences from −323 to −97, two nucleotides (AT) of linker sequence, and U4B 5'-flanking DNA from position −117 to the beginning of the gene, followed by the U4X RNA coding region and 3'-flanking DNA.
To construct the U4B maxigene, 17 nucleotides from the pUC19 polylinker were inserted between positions 8 and 9 of the U4B gene as follows. The construct U4BΔ-227 (9) was cut with EcoRI and HindIII, and the fragment containing the U4B gene sequence was isolated. This fragment was then cut with restriction endonuclease FspI, giving blunt ends between nucleotides 8 and 9 of the U4B DNA coding region. These fragments were then cloned between the EcoRI and SmaI sites and between the HindIII and HincII sites of the pUC19 vector. Thus the plasmid polylinker sequence between the HincII and SmaI sites became a part of the U4B maxigene transcript. The U4mB/AS template was constructed by exchanging the 5'-flanking DNA of U4mB with that of U4B/AS (24). The remaining U4X and U4B constructs that contained point mutations were generated by site-directed mutagenesis using polymerase chain reaction or by cloning synthetic oligonucleotides between pre-existing restriction sites.
Expression Assays in Xenopus Oocytes-Microinjection of oocytes and analysis of labeled RNA were performed as described previously (20). Oocytes were each injected with ~30 nl of a solution containing 400 µg/ml U4 template DNA, 1 µg/ml Xenopus 5 S gene plasmid, and 8 µCi/ml [α-32P]GTP. After an overnight incubation, RNA was isolated from oocytes, analyzed by electrophoresis in denaturing polyacrylamide gels, and autoradiographed. Normalization for efficiency of injection and recovery of RNA was carried out by adjusting the amounts of sample loaded in subsequent gels to give a uniform 5 S RNA signal. Relative transcription efficiencies were determined by cutting out the bands from the gels and quantitating in a liquid scintillation counter.
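The normalization logic described above can be illustrated with a short Python sketch. It is not the authors' procedure: they equalized the 5 S signal by adjusting gel loading, whereas this sketch divides numerically by the 5 S counts. The function name, the construct labels, and all count values are hypothetical and serve only to show the arithmetic.

```python
def relative_efficiency(u4_counts, five_s_counts, reference):
    """Return the U4 signal per unit 5 S signal, as percent of a reference construct.

    u4_counts and five_s_counts map construct names to band counts (cpm).
    """
    ratios = {name: u4_counts[name] / five_s_counts[name] for name in u4_counts}
    ref_ratio = ratios[reference]
    return {name: 100.0 * ratio / ref_ratio for name, ratio in ratios.items()}


# Hypothetical scintillation counts, for illustration only (not data from the paper).
u4 = {"construct_ref": 5000.0, "construct_a": 4500.0, "construct_b": 50.0}
five_s = {"construct_ref": 10000.0, "construct_a": 9000.0, "construct_b": 10000.0}

result = relative_efficiency(u4, five_s, reference="construct_ref")
```

With these made-up counts, construct_a has fewer raw U4 counts than the reference but the same efficiency once its lower 5 S recovery is taken into account.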
Transfection of CV-1 Cells-Monkey kidney CV-1 cells were grown in Dulbecco's modified Eagle's medium from Life Technologies, Inc. supplemented with 5% bovine serum for transfection experiments. Plates containing confluent cells were split 1:6 on 100-mm plates 24 h prior to transfection. Transfection was carried out using the ProFection Mammalian Transfection System and accompanying protocols from Promega. Each plate was transfected with 15 µg of the U4 plasmid DNA together with 5 µg of pcDNA3-CAT plasmid (Invitrogen) as control for transfection efficiency. The cells were exposed to the precipitate for 3 h, after which the medium was replaced, and the cells were further incubated for 24 h before harvesting. RNA from the transfected CV-1 cells was prepared using the RNAgents Total RNA Isolation System employing guanidinium isothiocyanate (Promega). U4 transcripts were measured by primer extension as described below, and chloramphenicol acetyltransferase transcripts were quantified by an RNase protection assay using a kit purchased from Ambion (Austin, TX). Transfections were repeated a minimum of three times, and typical results are presented in the figures.
The primer extension reactions were performed as follows: 5 ng of the specific primer (~500,000 cpm) was mixed with 10 µg of total RNA in 10 µl of 1× annealing buffer solution (250 mM KCl, 10 mM Tris-Cl, pH 8.3). The mixture was heated at 80°C for 3 min and transferred to a 45°C water bath for 45 min. After 45 min of annealing, 10 µl of ice-cold nuclease-free water was added to each sample. To 10 µl of the annealed primer/RNA mixture, 12 µl of reverse transcription mixture (24 mM Tris-Cl, pH 8.3, 16 mM MgCl2, 0.8 mM each dNTP, 8 mM dithiothreitol, 125 ng actinomycin D/µl, and 1.5 units of avian myeloblastosis virus reverse transcriptase/µl) were added. This mixture was incubated at 45°C for 45 min to allow extension of the primer. The reaction was stopped by adding 8 µl of stop solution (100% formamide with 0.3% bromphenol blue and 0.3% xylene cyanol FF). The samples were boiled for 3 min and then loaded on either a 20 or 10% denaturing polyacrylamide gel for analysis of U4maxiB or U4X RNA, respectively. The labeled bands were detected by autoradiography and quantitated by PhosphorImager analysis.
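The final concentrations in the 22-µl extension reaction follow from the volumes and stock concentrations given above. The Python sketch below works through that dilution arithmetic. It assumes microliter volumes and ideal mixing, and it treats the 10-µl annealing reaction as 1× buffer throughout; the helper name `mix` and the dictionary keys are illustrative, not taken from the paper.

```python
def mix(components):
    """Combine (volume_ul, {solute: concentration}) pairs.

    Returns the total volume and the volume-weighted final concentrations.
    """
    total = sum(volume for volume, _ in components)
    final = {}
    for volume, solutes in components:
        for name, conc in solutes.items():
            final[name] = final.get(name, 0.0) + conc * volume / total
    return total, final


# Step 1: 10 ul annealing reaction in 1x buffer, diluted with 10 ul of water.
_, annealed = mix([
    (10, {"KCl_mM": 250, "Tris_mM": 10}),
    (10, {}),  # nuclease-free water
])

# Step 2: 10 ul of the diluted annealing mix plus 12 ul of reverse transcription mix.
rt_mix = {"Tris_mM": 24, "MgCl2_mM": 16, "dNTP_each_mM": 0.8, "DTT_mM": 8}
total_ul, final = mix([(10, annealed), (12, rt_mix)])

for name, conc in sorted(final.items()):
    print(f"{name}: {conc:.2f}")
```

Under these assumptions the extension reaction contains roughly 57 mM KCl, 15 mM Tris-Cl, 8.7 mM MgCl2, 0.44 mM each dNTP, and 4.4 mM dithiothreitol.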

RESULTS
Differential Activity of the U4B and U4X Genes Governed by the Proximal Regulatory Region-We previously observed that the chicken U4B gene was efficiently transcribed from plasmids injected into the nuclei of frog oocytes (20). In contrast, the U4X gene was not detectably expressed in the frog oocyte system (20). To localize the cis-acting sequences contributing to the differential activity of the U4X and U4B genes in oocytes, chimeric constructs were made by switching the proximal and/or enhancer region of the U4X gene with the U4B gene proximal and/or enhancer region, respectively. A schematic representation of these constructs is shown at the bottom of Fig. 1. The expression of each construct was assayed by microinjection of the plasmids into frog oocytes together with [α-32P]GTP. The accumulation of chicken U4 RNA was assayed by electrophoresis of the labeled RNA products in denaturing polyacrylamide gels. The expression level of a co-injected 5 S RNA gene was used to normalize for the efficiency of the injection into the oocyte nuclei. As noted previously (20), the wild type U4B gene was efficiently expressed in the oocyte assay, whereas the wild type U4X gene was not detectably expressed (Fig. 1A, lanes 2, 4, and 6). However, when the U4B gene 5'-flanking DNA was placed upstream of the U4X coding region, U4X RNA was efficiently synthesized (BBX, lanes 1 and 9). In contrast, the U4X gene 5'-flanking DNA was unable to drive efficient expression from the U4B coding region (XXB, lane 5). (An extremely low level of expression detectable from this construct suggests that the U4B gene coding region or 3'-flanking DNA makes a very minor contribution to the greater efficiency of U4B gene expression. However, since expression of the construct was very low, this observation has not been further investigated.) More definitive conclusions were obtained from experiments employing the chimeric constructs that combined the U4B enhancer with the U4X proximal region, and vice versa. The U4B enhancer region was unable to activate U4X gene expression from a template containing the U4X proximal region (BXX, Fig. 1A, lane 3). In contrast, the U4X enhancer fully supported a high level of transcriptional activity from a template containing the U4B proximal sequence (XBX, cf. lanes 8 and 9 versus lane 7). Taken together, these results indicate that the U4X proximal region, not the enhancer, is the regulatory region primarily responsible for the inactivity of the U4X gene in frog oocytes.
FIG. 1. Transcriptional activities of chimeric U4 gene constructs. A, chimeric constructs (diagrammed in C) were expressed by injection into the nuclei of Xenopus oocytes together with [α-32P]GTP to label the synthesized RNA. A 5 S RNA gene plasmid was coinjected to serve as an internal control for injection efficiency and sample recovery. RNA was isolated on the day following injection and aliquots were electrophoresed through a denaturing polyacrylamide gel. Bands corresponding to U4B or U4X RNA and to 5 S RNA are indicated. B, four of the chimeric constructs were transfected into CV-1 cells, total RNA was isolated, and the accumulation of U4X RNA was assayed by primer extension using a primer specific for chicken U4X RNA. The upper U4X band represents the full-length U4X extension product, and the lower band corresponds to a shorter product resulting from a strong stop for reverse transcriptase (21). C, a diagrammatic representation of the chimeric U4 constructs is pictured to the left. Relative expression levels in oocytes or in CV-1 cells are shown in the corresponding columns to the right (nt, not tested). All values are normalized to a U4XBX expression efficiency of 100%.
We next examined the relative activities of the enhancer and proximal regions of the U4X and U4B genes in a different expression system. Four of the chimeric constructs that contain the U4X gene coding region were transfected into monkey CV-1 cells, total RNA was prepared, and the accumulation of U4X RNA was assayed by primer extension using a 17-mer primer complementary to the 3' end of U4X RNA. This primer was chosen because it does not cross-hybridize with any form of U4 RNA found in CV-1 cells (21,22). As shown in Fig. 1B, the two constructs that contained the U4X proximal region, XXX and BXX, were not detectably expressed (lanes 1 and 3). In contrast, the two constructs that contained the U4B proximal region, BBX and XBX, were efficiently expressed (lanes 2 and 4). Overall, the results in Fig. 1 indicate that the U4X proximal region is unable to support transcription in either frog oocytes or monkey CV-1 cells, whereas the U4B gene proximal region is an efficient promoter in both expression systems.
Base Alterations That Activate the U4X PSE-To further localize and identify the nucleotides responsible for the differential activities of the U4X and U4B promoters, we introduced a series of point mutations into the region near and within the U4X PSE that, in a stepwise fashion, made the U4X PSE more similar to the U4B PSE. The ability of these specific base alterations to rescue U4X expression was measured by transient transfection of CV-1 cells. As previously observed, a chimeric construct (XBX) that carried the wild type U4B proximal region between the U4X enhancer and the U4X coding region was efficiently expressed and served as the positive control in this experiment (Fig. 2, lane 1). A 3-bp change in the U4X PSE (at positions −49, −51, and −52) was not sufficient to activate U4X expression (Fig. 2, lane 2). The conversion of 8 additional bp between positions −37 and −48, bringing them into agreement with the U4B sequence, resulted in a low level of U4X expression (lane 3). Interestingly, nearly full activity was obtained from a construct in which the two bases at positions −57 and −61 were changed to match the U4B sequence (lane 4). These two changes not only increased the similarity of the U4X PSE to the U4B PSE, but also destroyed the perfect dyad symmetry of the palindromic sequence that overlaps the U4X PSE. A marginal further increase in activity was observed from a construct that combined the entire set of mutations extending from −37 to −61 (lane 5). These results indicate that the nucleotides at positions −57 and −61 play an especially important role in determining the activities of the U4X and U4B promoters in the transient expression assay. The downstream DNA sequences in the 3' portion of the PSE (−37 to −52) appear to make only a minor contribution toward the relative strengths of the U4B and U4X promoters.
FIG. 2. ... Fig. 1B, using the U4XBX construct as a positive control for normalization to 100% efficiency. The sequence alignment shows the point mutations that were introduced into the PSE region of the U4X gene, gradually converting it into identity with the U4B PSE. Dashes indicate nucleotides identical to those in the wild type U4X sequence, whereas differences are explicitly shown as lower case letters in the mutant U4X constructs or as capital letters in the wild type U4B proximal region sequence (bottom line). A 16-bp dyad symmetry element (palindromic sequence) that overlaps the U4X PSE is underlined. Note that when the U4X and U4B PSE sequences are aligned to maximize homology, the U4B PSE is two nucleotides closer to the transcription start site. (The numbering above the sequence alignment indicates U4X nucleotide positions, and the numbering below is for U4B.)
Reciprocal Base Alterations That Inactivate the U4B PSE-To further investigate and confirm the importance of the bases at positions −61 and −57, we made the reciprocal changes in the U4B PSE. However, it is important to note that, when the U4X and U4B PSEs are aligned for maximum similarity, the U4B PSE is located two nucleotides nearer to the transcription start site than the U4X PSE. Thus positions −59 and −55 in the U4B PSE are the counterparts of positions −61 and −57 in the U4X PSE (note nucleotide numbering in Fig. 2). For the following studies, the parent construct was U4mB/AS, which contains 17 nucleotides from the pUC19 polylinker inserted between positions 8 and 9 in the U4B coding region. This allowed the resultant U4maxiB product to be distinguished from the endogenous U4 RNA in primer extension assays. The U4mB/AS construct also has two point mutations at positions −64 and −48 that introduce ApaI and SstI sites, respectively. These two point mutations have previously been shown to have no detectable effect on the activity of the U4B gene in an oocyte expression assay (24).
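The two-nucleotide offset between the aligned PSEs amounts to a simple coordinate conversion, sketched below in Python. The function and constant names are illustrative and not from the paper, and the fixed offset is assumed to hold only within the aligned PSE region, not across the whole 5'-flanking DNA.

```python
# Offset between aligned PSE coordinates: the U4B PSE lies two nucleotides
# closer to the transcription start site than the U4X PSE.
PSE_OFFSET = 2


def u4x_to_u4b(position):
    """Map a U4X PSE position (negative, relative to the start site) to U4B."""
    return position + PSE_OFFSET


def u4b_to_u4x(position):
    """Map a U4B PSE position back to the homologous U4X position."""
    return position - PSE_OFFSET


for u4x_pos in (-61, -57):
    print(f"U4X {u4x_pos} corresponds to U4B {u4x_to_u4b(u4x_pos)}")
```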
The U4mB/AS construct was efficiently expressed in CV-1 cells (Fig. 3, lanes 1 and 7). However, a single C to T point mutation at position −55 resulted in a 10-fold reduction in promoter activity (lane 2) and a single T to G change at position −59 resulted in a 6-fold reduction (lane 3). When the two point mutations were combined (lane 4), the residual level of expression was very low and was detectable only on long exposures or by PhosphorImager analysis.
Role of the Palindromic Sequence-We next investigated, in the context of the U4B promoter, the role of the gene-specific dyad symmetry element that normally overlaps the U4X PSE. Additional point mutations were introduced into the U4B proximal region at positions −62, −66, and −70. This construct, U4mB/pal, contains the entire 16-bp U4X palindromic sequence at a position in the U4B promoter homologous to its location in the U4X promoter. It overlaps the U4B PSE and includes the two strong down-mutations at positions −55 and −59. When this construct was transfected into CV-1 cells, there was no detectable expression (Fig. 3, lanes 5 and 8). Restoring the nucleotides at positions −55 and −59 to the wild type (which consequently also destroys the dyad symmetry element) restored transcription to high levels (lanes 6 and 9).
In an attempt to divorce the effect of the nucleotide substitutions at positions −55 and −59 from potential regulatory effects of the palindromic sequence, a construct was made in which the palindrome was displaced 4 bp in the 5' direction. In this construct (U4mB/pal59), the wild type U4B PSE sequence was maintained to position −63, and the complete palindromic sequence extended 16 bp upstream beginning at position −59. Fig. 3, lane 10, shows that this construct was expressed at a wild type level in CV-1 cells. Moreover, a construct that was mutated at six positions within the palindromic sequence was expressed at a similar level (lane 11). Thus, expression of the U4B gene in CV-1 cells was not significantly affected by either the presence or the absence of the palindromic sequence near, but not extensively overlapping, the U4B PSE.

DISCUSSION
We have investigated the cis-acting DNA sequences important for the differential expression of the U4X and U4B genes in frog oocytes and monkey CV-1 cells. The chicken U4B gene was efficiently expressed in both oocytes and CV-1 cells, whereas the U4X gene was not detectably expressed in either system. To localize the DNA sequences responsible for this differential activity, enhancer and promoter swap experiments were carried out using chimeric constructs. The results indicated that the proximal region of the U4X gene was inactive in both transient expression systems.
To define the specific nucleotides responsible for the difference in activity, point mutations were introduced into the U4X proximal region to determine whether specific changes (designed to match the sequence of the U4B PSE) could rescue U4X expression. Indeed, the alteration of two nucleotides at positions −57 and −61 was sufficient to raise the activity of the U4X PSE to a level comparable to that of the U4B PSE (Fig. 2). Moreover, some increased level of activity was alternatively obtained by replacing the U4X sequences between −52 and −37 with the corresponding nucleotides of the U4B PSE. These findings strongly suggest that the U4X gene contains a suboptimal PSE sequence, and that the nucleotides responsible for this effect are to some extent delocalized throughout the PSE. However, it is clear that the greatest down-effects on expression result from the presence of a G at position −61 and a T at position −57.
FIG. 3. Transcriptional activity of U4maxiB gene constructs having mutations within or near the PSE. Total RNA was prepared from transfected CV-1 cells and primer extension analysis was carried out using a U4maxiB specific primer. The sequence alignment at the bottom shows the alterations from the U4mB/AS sequence shown in the top line. The underlined regions in the U4mB/pal and U4mB/pal59 sequences indicate a 16-bp palindromic sequence identical to that found upstream of the U4X PSE. Expression levels relative to U4mB/AS (taken as 100%) are shown in the column to the right.
The importance of the bases at these two positions was confirmed by making the reciprocal changes at the homologous positions in the U4B PSE (positions −59 and −55). These two nucleotide changes resulted in 10- and 6-fold reductions in the U4B PSE activity, respectively, when the mutations were assayed individually, and, when present together in the same construct, resulted in essentially complete loss of activity (Fig. 3). It is perhaps significant that none of the other cloned chicken U1, U2, or U4B genes contain either a G or a T at these respective positions (20). Our data localizing the critical nucleotides to the proximal region are reminiscent of studies of developmentally regulated sea urchin U2 RNA genes performed by Stefanovic and Marzluff (25). In those studies, the sequences responsible for the differential expression of the early and late U2 genes were localized to a 14-bp region within the U2 PSE.
The two critical nucleotide positions identified above are not only located within the U4X PSE, but they also are members of a 16-bp dyad symmetry element that overlaps the U4X PSE. Thus, in addition to disrupting DNA recognition by the PSE-binding factor, mutations at these positions could also affect DNA recognition by another factor, proximal palindrome binding factor, that binds specifically to the palindromic element (23). In an attempt to separate and distinguish between these two potential effects, a construct was prepared that contained an intact palindromic sequence upstream of the U4B PSE (U4mB/pal59, Fig. 3). In this construct, the wild type U4B PSE was maintained, but the palindromic sequence was displaced 4 bp upstream relative to its homologous position in the wild type U4X proximal region. As shown in Fig. 3, the palindromic sequence at this position had no significant effect on expression of the U4mB transcript. Thus, in this transient expression system, U4 gene transcription was more closely correlated with the identities of the nucleotides at U4B positions −59 and −55 (U4X positions −61 and −57) than with the presence or the absence of the intact palindromic sequence. Nonetheless, it remains a possibility that the palindromic sequence in its natural chromosomal context in chicken cells may play a role in regulating the developmental expression of the U4X gene in vivo. Such effects may not be observable in the heterologous transient expression systems investigated here. Another possibility, not addressed by our experiments, is that U4X gene activity conceivably could be regulated in response to the methylation state of the multiple CpG dinucleotides in the palindromic sequence (CGCGCCGGCGCG) (26).
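The dyad symmetry and CpG content of the sequence quoted above (CGCGCCGGCGCG) can be checked computationally. The Python sketch below tests whether a sequence equals its own reverse complement and counts CpG dinucleotides on the given strand. The function names are illustrative, and only the 12-nucleotide sequence quoted in the text is examined; the full 16-bp element is not reproduced here.

```python
# Translation table for complementing an uppercase DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_dyad_symmetric(seq):
    """True if the sequence reads the same on both strands (a DNA palindrome)."""
    return seq == reverse_complement(seq)


def count_cpg(seq):
    """Count CpG dinucleotides on the given strand."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


palindrome = "CGCGCCGGCGCG"  # sequence as quoted in the text
print(is_dyad_symmetric(palindrome), count_cpg(palindrome))
```

Because the sequence is its own reverse complement, each CpG on one strand is paired with a CpG on the other, which is the arrangement relevant to symmetric CpG methylation.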