Overlapping cis sites used for splicing of HIV-1 env/nef and rev mRNAs.

Alternative splicing is used to generate more than 30 human immunodeficiency virus type 1 (HIV-1) spliced and unspliced mRNAs from a single primary transcript. The abundance of HIV-1 mRNAs is determined by the efficiencies with which its different 5' and 3' splice sites are used. Three splice sites (A4c, A4a, and A4b) are upstream of the rev initiator AUG. RNAs spliced at A4c, A4a, and A4b are used as mRNAs for Rev. Another 3' splice site (A5) is immediately downstream of the rev initiator. RNAs spliced at A5 are used as mRNAs for Env and Nef. In this report, primer extension analysis of splicing intermediates was used to show that there are eight branch points in this region, all of which map to adenosine residues. In addition, cis elements recognized by the cellular splicing machinery overlap; the two most 3' branch points overlap with the AG dinucleotides at rev 3' splice sites A4a and A4b. Competition of the overlapping cis sites for different splicing factors may play a role in maintaining the appropriate balance of mRNAs in HIV-1-infected cells. In support of this possibility, mutations at rev 3' splice site A4b AG dinucleotide dramatically increased splicing of the env/nef 3' splice site A5. This correlated with increased usage of the four most 3' branch points, which include those within the rev 3' splice site AG dinucleotides. Consistent with these results, analysis of a mutant in which three of the four env/nef branch points were inactivated indicated that use of splice site A5 was inhibited and splicing was shifted predominantly to the most 5' rev 3' splice site A4c with preferential use of the two most 5' branch points. Our results suggest that spliceosomes formed at rev A4a-4b, rev A4c, and env/nef A5 3' splice sites each recognize different subsets of the eight branch point sequences.

Splicing of mRNA precursors in the nucleus of metazoan cells occurs by cleavage and joining at the 5' and 3' splice sites, which removes the intron between them. This is mediated by recognition of conserved cis-acting sequences near the two splice sites. The splicing process is catalyzed by the cellular splicing machinery within spliceosome complexes (for reviews, see Refs. 1-3). The assembly of spliceosomes involves an initial interaction of U1 snRNP with the 5' splice site. The 3' splice site is then recognized by the factor U2AF, which binds to a polypyrimidine tract of more than 12 nt upstream of the AG dinucleotide that borders the site of 3' cleavage. Another cis element, the branch point sequence, is located 18-40 nt upstream of the AG dinucleotide and has the loosely conserved consensus sequence YNYURAY (the adenosine is the branch point). The branch point sequence is recognized by U2 snRNP, whose binding is facilitated by U2AF. This step is followed by binding of the U5 and U4/U6 snRNPs as a triple snRNP to form the complete spliceosome.
The splicing process for most cellular mRNA precursors is normally efficient. However, processing of some pre-mRNAs occurs by tissue-specific or developmentally regulated alternative splicing pathways, resulting in multiple mRNAs produced from a single pre-mRNA. This involves the use of alternative 5' or 3' splice sites, exon skipping, intron inclusion, and mutually exclusive exons. Such regulated splicing results in the production of different proteins from the same gene and is an important mechanism for expanding the repertoire of cellular gene expression (1).
HIV-1 uses the cellular splicing machinery to express its genes. Over 30 different HIV-1 mRNAs are derived by alternative splicing from a single RNA precursor. In addition, approximately half of the RNA transcripts remain unspliced. These primary transcripts are packaged into progeny virions and are used as mRNA for the gag and pol gene products. It has been shown that HIV-1 RNA splicing efficiency is primarily determined by the relative strengths of the various 3' splice sites in the viral RNA (4-6). Several cis elements have been shown to affect HIV-1 3' splice site usage. These include suboptimal splice sites (polypyrimidine tracts and branch point sequences) as well as enhancer and silencer elements mapping within the exons downstream of the splice sites (4-12).
The levels of tat, rev, nef, and env mRNAs in HIV-1-infected cells are determined in part by splicing. Several alternative 3' splice sites are present in a 300-nt region in the middle of the HIV genome (splice sites A3, A4a, A4b, A4c, and A5; Fig. 1). In addition, a 5' splice site (D4) downstream of these alternative 3' splice sites is spliced to 3' splice site A7 within the env coding sequence (Fig. 1). When both splicing events occur, multiply spliced mRNAs are created. tat mRNAs are spliced at site A3, whereas rev mRNAs are spliced at A4a, A4b, or A4c. Most env and nef mRNAs are spliced at site A5; however, env mRNA remains single-spliced (13). Splicing between sites D4 and A7 creates a noncoding exon upstream of the nef reading frame. In addition to alternative splicing, the levels of single-spliced and unspliced RNA are also selectively regulated by the HIV-1 Rev protein (for a review, see Ref. 14). This small basic protein binds to an RNA element within the env gene called the Rev-responsive element and facilitates transport of the unspliced and single-spliced RNA from the nucleus to the cytoplasm.
We have previously shown that the efficiency of splicing at splice site A3, which is primarily used for synthesis of tat mRNA, is determined by both a suboptimal polypyrimidine tract and by an exon-splicing silencer element (ESS) (6). However, the ESS, which maps approximately 60 nt downstream of the tat splice site, does not significantly inhibit splicing at the rev and env splice sites A4a, A4b, A4c, and A5, which are downstream of the ESS (10). Splicing at these latter closely spaced 3' splice sites, which are utilized for approximately 90% of the spliced mRNAs in HIV-1-infected cells (13), may involve the assembly of a single spliceosome complex upstream of the A4c splice site followed by scanning to the four different AGs. The relative usage of the different splice sites would then be determined by competition between the AGs based on their position and sequence context (15, 16). This predicts that only one branch point sequence or one set of branch point sequences would be recognized by the U2 snRNP. Alternatively, separate spliceosomes may form upstream of each AG. This model predicts that different branch points or sets of branch points corresponding to different binding sites for U2 snRNP would be used for splicing at each of these AGs. To distinguish these models, we identified branch points in the 80-nt region containing the four 3' splice sites by primer extension analysis. Our results indicate that separate spliceosomes form and that there is overlap of the cis splicing sequences recognized by the cellular splicing machinery.
Plasmid EB124, in which branch points 5, 6, and 8 are inactivated, was created by using sense primer ESSEQ (5'-TAGTATGGGCAAGCAGGGAG-3') and antisense primer NOBRNCH (5'-TCTGATGAGCTCTTCGTCGCTGTCTCCGCTTCTTCCTGCCATAGGAGATGCCCAAGGCTTTTGACACGA-3') to amplify the 4B-G DNA template by polymerase chain reaction. The resulting 300-nt polymerase chain reaction product was cleaved with AccI and SacI and ligated with the 3.24-kb AccI/SacI restriction fragment of pHS1-ΔSac. A similar strategy was used to construct mutant plasmid 4B-W (containing the mutation AG↓G to CG↓G in 3' splice site A4b), with sense primer ESSEQ and antisense primer FWS (5'-TCTGATGAGCTCTTCGTCGCTGTCTCCGCTTCTTCCTGCCATAGGAGATGCCGAAGG-3') used to amplify the polymerase chain reaction product. The expected changes in the plasmids used as templates for splicing were confirmed by DNA sequencing. Plasmid pSP64-HβΔ6, containing the human β-globin gene downstream of an SP6 RNA polymerase promoter, has been previously described (18).
In Vitro Splicing. Splicing reactions were carried out essentially as described previously (9). In brief, a 25-µl reaction mixture consisting of 60% (v/v) nuclear extract in Dignam's buffer D (19), 20 mM creatine phosphate, 3 mM MgCl2, 0.8 mM ATP, and 2.6% (w/v) polyvinyl alcohol was incubated with approximately 8 fmol of radiolabeled RNA substrate for 2 h at 30°C. In order to increase the yield of branched lariat intermediate product for subsequent primer extension analysis, the splicing reaction was stopped at 1 h, and approximately 100 fmol of RNA substrate was used. Except where noted, RNA splicing products were electrophoresed on 6% (w/v) polyacrylamide, 7 M urea gels for 19 h at 300 V.
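The per-reaction quantities implied by these concentrations can be worked out with simple unit arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that the 100-fmol reactions were also assembled in 25 µl, which the text does not state explicitly.

```python
# Illustrative unit arithmetic for the 25-µl splicing reaction described above.
# Helper names are hypothetical; the concentrations come from the text.
REACTION_VOLUME_UL = 25.0


def volume_ul(percent_v_v, total_ul=REACTION_VOLUME_UL):
    """Volume in µl of a component specified as % (v/v)."""
    return total_ul * percent_v_v / 100.0


def amount_nmol(conc_mM, total_ul=REACTION_VOLUME_UL):
    """Amount in nmol of a solute at conc_mM (1 mM x 1 µl = 1 nmol)."""
    return conc_mM * total_ul


def mass_mg(percent_w_v, total_ul=REACTION_VOLUME_UL):
    """Mass in mg of a component specified as % (w/v) (1% = 0.01 mg/µl)."""
    return percent_w_v * 0.01 * total_ul


def substrate_nM(fmol, total_ul=REACTION_VOLUME_UL):
    """Substrate concentration in nM (1 fmol per µl = 1 nM)."""
    return fmol / total_ul


if __name__ == "__main__":
    print("nuclear extract (µl):", volume_ul(60))
    print("creatine phosphate (nmol):", amount_nmol(20))
    print("MgCl2 (nmol):", amount_nmol(3))
    print("ATP (nmol):", amount_nmol(0.8))
    print("polyvinyl alcohol (mg):", mass_mg(2.6))
    print("substrate at 8 fmol (nM):", substrate_nM(8))
    # Assumption: the 100-fmol reaction uses the same 25-µl volume.
    print("substrate at 100 fmol (nM):", substrate_nM(100))
```

Under that assumption, raising the substrate from about 8 to about 100 fmol corresponds to a change from roughly 0.3 nM to 4 nM RNA.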
Primer Extension. Primer extension analysis was carried out essentially as described previously (20). 32P-labeled PED133 was also used for fmol® DNA cycle sequencing (Promega). The extended products were electrophoresed on an 8% (w/v) polyacrylamide, 7 M urea gel for 3.5 h at 2000 V. Dideoxynucleotide sequencing reactions of DNA corresponding to the splice site region, primed with PED133, were run on the same gel in order to map the locations of the branch points. Primer extension analysis of β-globin pSP64-HβΔ6 was carried out using the exact conditions described above, except that primer ABLOB (5'-AGGAGTGGACAGATCCCCAAAGGACTCAA-3') was used instead of primer PED133.

Multiple Branch Points Are Present in the rev and env/nef 3' Splice Site Region. We first performed primer extension analysis to identify branch points in the 80-nt region containing the four rev and env 3' splice sites (A4a, A4b, A4c, and A5). The products of a splicing reaction of an RNA substrate transcribed from minigene template pHS1-ΔSac (see Fig. 1B) were separated on a denaturing polyacrylamide gel (Fig. 2A). Branched lariat-exon intermediates were isolated, and a primer complementary to the exon (PED133) was extended to the branch points using avian myeloblastosis virus reverse transcriptase. As a control for the specificity of the stops, we isolated linear RNA precursor, annealed it to the same primer, and used it as substrate for reverse transcriptase. As a positive control, we carried out a similar primer extension analysis to determine the branch point used in the splicing of exons 1 and 2 of minigene substrate SPβΔ6 derived from the human β-globin gene (Figs. 1D and 2B). As expected from previous results (21), the branch point mapped to an adenosine within the sequence CACUGAC, 37 nucleotides upstream of the 3' splice site. In the case of the HIV-1 substrate, specific stops corresponding to eight branch points were detected, and all of these branch points mapped to adenosine residues (Figs. 1C and 2C). Examination of the sequences containing these branch points indicated that their correspondence to the mammalian consensus branch point sequence YNYURAC ranged from three out of seven to six out of seven nucleotides (Table I). Interestingly, the most 3' branch point (branch point 8) mapped to the adenosine residue within the AG dinucleotide of the rev A4b 3' splice site. In addition, a minor amount of branching occurred at the rev A4a 3' splice site (branch point 7).
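The "n out of seven" comparison with the consensus can be made concrete with a short sketch. The function below is a hypothetical helper, not the procedure used to build Table I; it assumes that every consensus position, including the unconstrained N, is counted as a possible match, and it uses the standard IUPAC codes (Y = C or U, R = A or G, N = any base). The example sequences are the β-globin branch point sequence given above and two HIV-1 branch point sequences reported by others and cited in the Discussion.

```python
# Minimal sketch: count positions of a 7-nt sequence that agree with the
# branch point consensus YNYURAC. The scoring convention is an assumption.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "N": "ACGU",  # any base
}


def consensus_matches(heptamer, consensus="YNYURAC"):
    """Return the number of positions in heptamer that fit the consensus."""
    seq = heptamer.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(seq, consensus))


if __name__ == "__main__":
    for name, seq in [
        ("beta-globin", "CACUGAC"),
        ("HIV-1 A7 (ref. 11)", "UACUUUC"),
        ("HIV-1 A2 (ref. 12)", "UAGCAGA"),
    ]:
        print(name, seq, consensus_matches(seq), "of 7")
```

With this convention the β-globin sequence fits the consensus at all seven positions, whereas the two previously reported HIV-1 sequences fit at five and three positions, respectively.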

Mutations in the rev A4b 3' Splice/Branch Point Site Enhance Splicing at the env/nef A5 3' Splice Site and Increase Usage of Downstream Branch Points. The fact that branch points 7 and 8 overlapped with the rev AG dinucleotides strongly suggested that these branch points were used in splicing at the env/nef 3' splice site A5. To further test this possibility, we correlated the use of the different branch points with the use of specific splice sites. It has been previously reported that certain mutations of the rev A4b AG dinucleotide cause a significant inhibition of virus replication and reduced usage of the tat and the remaining rev 3' splice sites (17). We tested four different mutant substrates: AG↓G → GG↓G (mutant 4B-G), AG↓G → AC↓G (mutant 4B-S), AG↓G → GA↓G (mutant 4B-P), and AG↓G → CG↓G (mutant 4B-W). Fig. 3 shows that, as expected, splicing at 3' splice site A4b was blocked with all mutant substrates. Splicing at 3' splice site A5 of substrates with AG to GG, AC, and CG mutations was dramatically increased compared with wild type. On the other hand, splicing of the AG to GA mutant substrate was elevated only slightly compared with wild type. In each case, there was a reduction in splicing at the rev A4a 3' splice site; this was most pronounced for the 4B-S substrate. Concomitant with the increase in splicing at A5, there was a shift in the distribution of lariats and lariat-exon intermediates to the slowest migrating bands (Fig. 3). This shift occurred to a lesser extent with the AG to GA mutant. Since RNA intermediates containing larger lariats would be expected to migrate more slowly on such gels, this result suggested that the branch points used for splicing at the env/nef 3' splice site A5 are farther downstream than those that are used for splicing at the rev 3' splice sites A4a, A4b, and A4c.
To determine the locations of the branch points used in splicing of the rev A4b mutants, we carried out primer extension analysis of the mutant lariat-exon intermediates. It can be seen in Fig. 4 that enhanced splicing at the A5 3' splice site was correlated with a relative increase in lariat-exon intermediate products of the four most 3' branch point sites (branch points 5-8) in substrate 4B-S and three of the four distal branch point sites (branch points 5-7) in substrates 4B-G and 4B-W. The 4B-G and 4B-W substrates, in which the AG dinucleotide was mutated to GG and CG, respectively, did not branch at the mutated A4b 3' splice site (branch point 8). Branching did occur at this site with mutant substrate 4B-S, in which the AG dinucleotide was mutated to AC. Thus, the presence of the AG adenosine is necessary for branch point formation at the A4b 3' splice site. The increase in the use of branch points 5-8 relative to the use of branch points 1-4 correlated with increased use of 3' splice site A5, suggesting that these branch points are used for splicing at this site. On the other hand, with the 4B-P mutant in which the AG dinucleotide was mutated to GA, the shift to splicing at A5 was less pronounced (see Fig. 3), and there was a more uniform usage of all eight branch points. The strong correlation that existed between the enhanced splicing to the A5 splice site of the A4b mutants and the increased use of downstream branch points in the lariat-exon intermediates suggested that the four most 3' branch points were used for splicing to A5.
Mutation of env A5 Branch Points Shifts Splicing to the rev 3' Splice Site A4c. To further confirm the role of the branch points identified above in A5 splicing, we used substrate EB124, in which the candidate branch points for A5 were mutated from adenosine to other nucleotides. EB124 combines the mutation present in substrate 4B-G (the A4b and branch point 8 mutation AG↓G → GG↓G) with mutations of the remaining two major A5 branch points (branch points 5 and 6). The minor branch point 7 within the A4a AG dinucleotide was not changed. If these three branch points were necessary for splicing to the A5 3' splice site, we expected that the splicing pattern of substrate EB124 should resemble that of a substrate in which both the rev A4b and env/nef A5 3' splice sites were inactivated (mutant 4B-/5-). As shown in Fig. 5A, the splicing pattern of 4B-/5- was indeed similar to that of the EB124 substrate. Interestingly, mutant substrates EB124 and 4B-/5- both showed an increased amount of product spliced at rev splice site A4c compared with wild type. As shown in Fig. 5B for EB124, this correlated with a relative increase in the usage of branch points 1 and 2. This result suggested that these branch points may be used preferentially for splicing at A4c. There was also a residual amount of splicing at 3' splice site A5 (Fig. 5A), and this was correlated with a small amount of branching at the rev A4a 3' splice site (branch point 7), which remains intact in EB124 (Fig. 5B). The shift to increased splicing at A4c required mutations of both the A4b AG dinucleotide and either the A5 AG dinucleotide or the A5 branch points. Mutation of the A5 AG dinucleotide alone (mutant 5-) did not result in preferential use of the A4c 3' splice site. The AG to GG mutation at A4b caused the expected increase in splicing at A5 (Fig. 5A).

DISCUSSION
In this study, we showed that different sets of branch points are used for splicing at the different HIV-1 rev and env/nef 3' splice sites. This implies that the separate spliceosome complexes used for splicing of the rev and the env/nef mRNAs form upstream of the different AG dinucleotides. It further implies that there are a number of different binding sites for U2 snRNPs. From the locations of the branch points, we can deduce the locations of the polypyrimidine tracts used for assembling spliceosomes at the different 3' splice sites (Fig. 6). The polypyrimidine tract is positioned downstream of the branch points used for splicing at a given site. Thus, the polypyrimidine tract used for the env/nef 3' splice site A5 at nt 5975 is almost certainly the sequence spanning nt 5963-5970 (5'-UCUCCUAU-3'). Consistent with this, branch points 5-8, which are used for splicing at site A5, are all upstream of this sequence. A second polypyrimidine tract located between nt 5922 and 5929 (5'-UUGCUUUC-3') is upstream of branch points 3 and 4 but downstream of branch points 1 and 2. Thus, branch points 1 and 2 may be used for splicing at rev 3' splice site A4c. This is in agreement with the data shown in Fig. 5, since an increase in product spliced at site A4c correlated with the increased use of branch points 1 and 2. It is likely that the rev 3' splice sites A4a and A4b use a common set of branch points, since mutation of the A4a AG in the context of HIV-1 infection has been shown to result in a compensatory increase in splicing at the A4b splice site but with little or no change in splicing at A5 (17). It follows that a likely polypyrimidine tract for rev 3' splice sites A4a and A4b is the sequence from nt 5936 to 5944 (5'-UUUGUUUC-3'), which is downstream of branch points 1-4 but upstream of branch points 5-8. Interestingly, all of these potential polypyrimidine tracts are shorter than the eukaryotic consensus (greater than 12 pyrimidines) and have interspersed purine residues. Thus, as has been shown with other HIV 3' splice sites, the polypyrimidine tracts of the rev and env/nef splice sites also appear to be suboptimal (4-6).
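The statement that these tracts are short and interrupted by purines can be checked directly from the three sequences quoted above. The following is a minimal sketch; the function name and the returned fields are hypothetical, and the threshold of 12 is the consensus length cited in the text.

```python
# Minimal sketch: summarize the pyrimidine content of the three candidate
# polypyrimidine tracts quoted in the text. Names are illustrative only.
PYRIMIDINES = set("CU")
CONSENSUS_MIN_LENGTH = 12  # "greater than 12 pyrimidines" in the text


def tract_summary(seq):
    """Return length, pyrimidine and purine counts, and longest pyrimidine run."""
    seq = seq.upper().replace("T", "U")
    pyrimidines = sum(base in PYRIMIDINES for base in seq)
    longest = run = 0
    for base in seq:
        run = run + 1 if base in PYRIMIDINES else 0
        longest = max(longest, run)
    return {
        "length": len(seq),
        "pyrimidines": pyrimidines,
        "purines": len(seq) - pyrimidines,
        "longest_run": longest,
    }


if __name__ == "__main__":
    tracts = {
        "A5 (nt 5963-5970)": "UCUCCUAU",
        "A4c (nt 5922-5929)": "UUGCUUUC",
        "A4a/A4b": "UUUGUUUC",
    }
    for site, seq in tracts.items():
        info = tract_summary(seq)
        suboptimal = info["pyrimidines"] <= CONSENSUS_MIN_LENGTH
        print(site, seq, info, "suboptimal:", suboptimal)
```

Each of the three sequences contains seven pyrimidines and one purine, so none approaches the consensus length, and the longest uninterrupted pyrimidine runs are only four to six residues.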
We also showed in this study that, although most of the branch point sequences differ significantly from the metazoan consensus sequence, all of the branch points in the region of the HIV-1 rev and env/nef 3' splice sites map to adenosine residues. Previous studies of several other HIV-1 branch points showed that nucleotides other than adenosine residues are used. Dyhr-Mikkelsen and Kjems found that the major branch point for the tat/rev 3' splice site A7 maps to a uridine residue within the sequence 5'-UACUUUC-3' (11). Damier et al. mapped the branch point for 3' splice site A2 to a guanosine residue within the sequence 5'-UAGCAGA-3' (12). Our results show that the use of nonadenosine branch points is not universal for all HIV-1 3' splice sites. Interestingly, the adenosine residues within the AG dinucleotides of the rev 3' splice sites A4a and A4b also both serve as branch points for splicing at env/nef 3' splice site A5. To our knowledge, there is no precedent for overlap of these two essential cis elements within two adjacent alternative 3' splice sites. We also showed that each of the HIV-1 splice sites appears to use multiple branch points. This has previously been shown for a number of other alternatively spliced viral and cellular mRNAs (22-25).
We showed above that substrates with different mutations of the rev A4b AG dinucleotide behave differently. Mutations in which the AG was changed to GG, AC, or CG resulted in dramatic increases in splicing at the env/nef 3' splice site A5, whereas a mutation in which the AG was changed to GA did not show this increase. Interestingly, these differences in splicing can be correlated with previous results in which the effects of these mutations were studied in the context of HIV-1 infection. Two of these mutants (AG to GG and AG to CG) are Rev-deficient and replication-defective (17, 26). In the case of the AG to GG mutant, this was shown to involve a relative decrease in levels of mRNAs spliced at the remaining rev 3' splice sites as well as the tat 3' splice site A3 (17). In contrast, the AG to GA mutant has wild type Rev function and is replication-competent (13). Our results suggest that the Rev-deficient HIV-1 phenotypes may be caused by unbalanced splicing of the viral RNA, resulting in an increase in splicing at the env/nef 3' splice site A5 and a consequent failure to splice efficiently at the remaining rev 3' splice sites. In the in vitro system, the 3' splice site A4b mutations do not exhibit the same inhibitory effect on rev and tat splicing as they do in the context of virus infection. We do not yet understand this difference between the two systems. One possibility is that in infected cells the alternative 3' splice sites may compete for limiting splicing factors, so that an increase in splicing efficiency at the env/nef 3' splice site causes a diminution in splicing at the tat and rev 3' splice sites. In the in vitro splicing system, these factors may not be limiting. Further studies comparing splicing in vivo and in vitro will be necessary to resolve this issue.
Our results suggest that the rev A4b 3' splice site AG dinucleotide contributes to the maintenance of balanced HIV-1 RNA splicing by inhibiting splicing at the env/nef 3' splice site A5. The only A4b mutation tested that blocks splicing at A4b but fails to significantly increase splicing at 3' splice site A5 is AG↓G to GA↓G. In both this mutant and the AG↓G to AC↓G mutant, the env/nef branch point 8 adenosine was used (see Fig. 4), indicating that the ability to branch within this AG dinucleotide is not correlated with repression of splicing at 3' splice site A5. Furthermore, the use of branch point 8 was abolished by the AG↓G to GG↓G and AG↓G to CG↓G mutations, yet the inhibition of splicing at 3' splice site A5 was relieved. Interestingly, the AG↓G to GA↓G mutation is unique in that a new AG dinucleotide is created one nucleotide downstream. Several different protein factors that interact with the AG dinucleotide both early and late in spliceosome assembly have been identified by UV cross-linking experiments (27, 28). Thus, it is possible that factors bound at or near the rev A4b 3' splice site may interfere with the assembly of the spliceosomes that carry out splicing at env/nef 3' splice site A5. Alternatively, the three A4b AG mutations that activate 3' splice site A5 splicing may elicit a change in the RNA secondary structure that allows more efficient formation of spliceosomes at the A5 splice site.
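The observation that only the GA mutation creates a new AG dinucleotide one nucleotide downstream follows directly from the sequences. The sketch below is illustrative: each substrate is represented only by the mutated dinucleotide plus the first exon nucleotide (the G that follows the cleavage site in the notation used above), and the helper names are hypothetical.

```python
# Minimal sketch: locate AG dinucleotides in the three nucleotides spanning
# the A4b 3' splice site for the wild type and each mutant. The triplets are
# the dinucleotide plus the first exon nucleotide; helper names are hypothetical.
A4B_TRIPLETS = {
    "wild type": "AGG",
    "4B-G": "GGG",
    "4B-S": "ACG",
    "4B-P": "GAG",
    "4B-W": "CGG",
}


def ag_positions(seq):
    """Return the 0-based start positions of every AG dinucleotide in seq."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]


def retains_adenosine(seq):
    """True if the first two positions (the original AG site) still contain an A."""
    return "A" in seq[:2]


if __name__ == "__main__":
    for name, triplet in A4B_TRIPLETS.items():
        print(name, triplet,
              "AG at:", ag_positions(triplet),
              "adenosine retained:", retains_adenosine(triplet))
```

In this representation the wild type has an AG at position 0, 4B-P has one at position 1, and the other three mutants have none; an adenosine is retained only in 4B-S and 4B-P, the two mutants in which branch point 8 was used.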