Mechanism of HIV-1 RNA Dimerization in the Central Region of the Genome and Significance for Viral Evolution*

Background: HIV-1 genome dimerization is important for viral survival. Results: The central region of HIV-1 genome contains a G-rich sequence promoting recombination and capable of forming a dimeric G-quartet. Conclusion: Parallel alignment and proximity of two HIV-1 genomes associated at the 5′ end can be maintained through dimeric G-quartet formed in the central region of viral genome. Significance: Determining mechanism and factors involved in HIV-1 genome dimerization.
The genome of HIV-1 consists of two identical or nearly identical RNA molecules. The RNA genomes are held in the same, parallel orientation by interactions at the dimer initiation site (DIS). Previous studies showed that in addition to interactions at DIS, sequences located 100 nucleotides downstream from the 5′ splice site can dimerize in vitro through an intermolecular G-quartet structure. Here we report that the highly conserved G-rich sequence in the middle portion of the HIV-1 genome near the central polypurine tract (cPPT) dimerizes spontaneously under high ionic strength in the absence of protein. The antisense RNA does not dimerize, strongly indicating that RNA dimerization does not exclusively involve A:U and G:C base pairing. The cation-dependent reverse transcriptase pausing profile, CD spectra profile, and cation-dependent association and thermal dissociation characteristics indicate G-quartet structures. Different forms of G-quartets are formed including monomers and, significantly, intermolecular dimers. Our results indicate that RNA genome dimerization and parallel alignment initiated through interactions at DIS may be greatly expanded and stabilized by formation of an intermolecular G-quartet at a distant site near the cPPT. It is likely that formation of G-quartet structure near the cPPT in vivo keeps the RNA genomes in proximity over a long range, promoting genetic recombination in numerous hot spots.

HIV-1 efficiently diversifies its population within an infected person, allowing it to evade host immune response and therapeutic treatments. The error-prone HIV-1 reverse transcriptase (RT) contributes significantly to viral diversity. Viral evolution is further accelerated through the process of genetic recombination. The presence of two often non-identical copies of viral genome in a single virion is a fundamental requirement for genetically significant recombination. One copy of genomic RNA is sufficient for viral replication. The role of the second copy is to allow the RT to switch templates during reverse transcription. This mixes mutations from two genetically different viral genomes, allowing the virus to evolve into more robust forms.
The two genomic RNAs adhere within the cytoplasm; thus, the viral genomes are packaged into the virion as a dimer rather than two RNA monomers (1,2). The sequences involved in initial viral genome dimerization and encapsidation partially overlap and are localized close to the 5′ end (3-5). One of these sequences forms the stem-loop 1 secondary structure with a highly conserved palindromic sequence called the dimer initiation site (DIS). The first stable interaction is a contact within the loops of two stem-loop 1 hairpins called RNA kissing (6,7). The interaction is then thought to spread to the stem regions, with their palindromic sequences promoting base pairing; however, this contact is not sufficient by itself to produce dimers in vivo that resist extraction and gel electrophoresis (8). Other interactions along the genome would help maintain a consistent proximity, facilitating a generally uniform probability of template switching by the RT throughout reverse transcription, as is desirable for efficient recombination.
Studies in vitro show that in HIV-1 and other retroviruses, certain G-rich genomic sequences can form G-quartet structures (9-13). Four guanine bases can associate through Hoogsteen hydrogen bonding and form a G-tetrad, and two or more guanine tetrads can stack on top of each other to form a G-quartet. The G-tetrads can form by repeated folding of either a single nucleic acid molecule or by association of two or four molecules. The structure has been intensively analyzed in telomeres, and in recent years G-quartets have been shown to regulate activity of promoters, formation of DNA flaps, stability of mRNA, and alternative splicing and translation (14-21). Very little is known about G-quartets in viruses, including HIV-1, and their possible function in the viral life cycle. G-quartet structure formation was found to be essential to maintain the Epstein-Barr virus genome during latency in proliferating cells (22). In HIV-1 and Moloney murine sarcoma virus (MuSV), G-rich sequences in the gag region near DIS were shown to form an intermolecular G-quartet (9-13). This structure adheres the two RNA genomes; thus, it represents a different mode of genome dimerization in retroviruses from that of the DIS hairpin interaction. Formation of dimeric G-quartet structure correlates with hot spots of recombination, exhibiting an increased rate of template switching (23,24).
Here we show that short RNA templates from the central region of the HIV-1 genome, containing the G-rich sequences near the central polypurine tract (cPPT), are able to form both monomer and dimer G-quartet structures. The cPPT is located in the integrase gene and is a region where one of two primers for synthesis of (+) strand DNA is produced during reverse transcription. The G tract in the cPPT and two other G runs located downstream from the cPPT are highly conserved among different HIV-1 isolates and several closely related simian immunodeficiency virus (SIV) species. By application of assays in vitro and affinity selection methods, we found that the RNA templates dimerize through the G-rich regions, indicating that in addition to contacts near the 5′ end, the central regions of the viral RNA genomes are likely to be maintained in proximity through dimer G-quartet formation. In reconstituted systems of reverse transcription, the formation of G-quartets facilitates RT in switching templates during synthesis of minus strand DNA, suggesting that the structure supports an increased recombination rate.

EXPERIMENTAL PROCEDURES
Materials-DNA oligonucleotides and the HPLC-purified RNA strand used for CD spectra analyses were purchased from Integrated DNA Technologies, Inc. (Coralville, IA). HIV-1 nucleocapsid protein (NC; 55 amino acids) was generously provided by Dr. Robert J. Gorelick. HIV-1 reverse transcriptase (p66/p51 heterodimer) was purified as described previously (25). The [γ-32P]ATP was purchased from PerkinElmer Life Sciences. The sequences of viruses were analyzed with the HIV database (www.hiv.lanl.gov).
Preparation of RNA Templates-RNA molecules were transcribed in vitro (Ambion T7-MEGAshortscript kit; Applied Biosystems) from DNA templates amplified by PCR using Vent DNA polymerase (New England Biolabs) and two overlapping oligomers with the sequence of the desired region. The following RNA strands were used in our studies. (a) For the reverse transcription assay, the wild-type and mutant RNA template strands with the cPPT region (4309-4396 in the RNA genome) sequences from the NL4-3 HIV-1 were made from DNA templates synthesized with a pair of oligomers, 1/2 and 3/4, respectively. (b) For affinity selection analysis, the non-tagged RNAs and poly(A)-tagged RNAs were made from DNA generated using oligomers 1/2 and 1/5 for the cPPT region (4309-4396) of NL4-3 HIV-1, 6/7 and 6/8 for the gag region (290-403) of NL4-3 HIV-1, 9/10 and 9/11 for the gag region (303-415) of MAL HIV-1, and 12/13 and 14/15 for 5′-UTR with DIS (1-520 and 183-520) of NL4-3 HIV-1. (c) For analysis of dimerization in a native gel, the RNA template of the cPPT region (4309-4396) in NL4-3 HIV-1 was made from DNA generated using oligomers 1/2, and for the antisense sequence of the cPPT region, oligomers 16/17 were used. (d) For transfer assays, the donor and acceptor templates for cPPT region (4309-4396) of NL4-3 HIV-1 were made from DNA generated using oligomers 1/5 and 18/2, respectively. After transcription in vitro, the RNA templates were purified by polyacrylamide/urea gel electrophoresis and resuspended in water. RNAs were quantitated by UV absorption using a GeneQuant II from Amersham Biosciences.
Preparation of the 5′-Radiolabeled RNA Template and the DNA Primer-DNA primers were labeled at the 5′ end using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP (6000 Ci/mmol). Preparation of 5′-radiolabeled RNA template was performed as follows. The gel-cleaned RNA template was treated with shrimp alkaline phosphatase (Fermentas) at 37°C for 60 min and then incubated at 65°C for 25 min to inactivate the enzyme. After cooling on ice, the reaction mixture was treated with [γ-32P]ATP (6000 Ci/mmol), 10× PNK T4 Polynucleotide Kinase Reaction buffer and T4 polynucleotide kinase (New England Biolabs). After incubation for 1 h at 37°C, the radiolabeled RNA and DNA primers were separated from unincorporated radionucleotides using a Micro Bio-Spin column (Bio-Rad).
Reverse Transcription Assay-RNA (200 fmol) and the 5′ end-labeled DNA primer 19 (300 fmol) were heated to 95°C for 2 min and annealed by slow cooling to 37°C in 9 μl. HIV-1 RT was added to the substrate and incubated for 4 min before the reaction was initiated with MgCl2 and dNTPs. Reactions were carried out in 12.5 μl at a final concentration of 50 mM Tris-HCl, pH 8.0, 50 mM KCl or LiCl, 1 mM DTT, 1 mM EDTA, 32 nM HIV-1 RT, 6 mM MgCl2, and 50 μM dNTPs. After 30 min of incubation at 37°C, reverse transcriptions were stopped with 1 volume of termination buffer (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each xylene cyanol and bromphenol blue). Extension products were resolved on a 6% polyacrylamide, 8 M urea gel and analyzed using phosphorimaging (GE Healthcare). Sizes of DNA products were estimated by using a 5′-radiolabeled 10-bp DNA ladder (Invitrogen).
Circular Dichroism-CD spectra were obtained at 25°C over a wavelength range of 220-340 nm using an AVIV circular dichroism spectrometer, model 202. The RNA was at a concentration of 4 μM in 10 mM Tris-HCl, pH 7.5, 0.3 mM EDTA and in the presence of KCl, NaCl, or LiCl at 10 or 50 mM. Before analysis, the samples were heated to 90°C for 10 min, gently cooled at a rate of 1°C/5 min, and incubated at 4°C overnight. Spectra were recorded using a quartz cell of 1-mm optical path length, with data collected every nanometer at a bandwidth of 1 nm. Each spectrum was recorded three times and base-line-corrected for signal contributions from the buffer. The data were processed with AVIV Biomedical, Inc. software and reported as ellipticity (millidegrees) versus wavelength (nm).
Thermal Denaturation of G-quadruplex Structure-Melting curves were obtained by monitoring the change at a wavelength of 265 nm, chosen for having the highest CD value. The temperature was raised from 20 to 90°C; the heating rate was 2°C/min. The melting temperature Tm was defined as the temperature of the mid-transition point.
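The mid-transition definition of the melting temperature can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the processing performed by the instrument software: the function name `melting_temperature` is hypothetical, the curve is normalized between its own minimum and maximum (which presumes both plateaus are reached within the measured range), and the crossing point is found by linear interpolation.

```python
def melting_temperature(temps, signal):
    """Estimate Tm as the temperature at which the normalized signal crosses 0.5.

    temps:  temperatures in ascending order (degrees C)
    signal: CD signal at each temperature (e.g. ellipticity at 265 nm)
    Returns None when the curve is flat or never crosses the midpoint.
    Assumption: min and max of the measured signal represent the unfolded
    and folded plateaus, respectively.
    """
    hi, lo = max(signal), min(signal)
    if hi == lo:
        return None
    # Fraction folded: 1 at the highest signal, 0 at the lowest.
    folded = [(s - lo) / (hi - lo) for s in signal]
    for i in range(1, len(temps)):
        f0, f1 = folded[i - 1], folded[i]
        if f0 >= 0.5 > f1:
            # Linear interpolation between the two bracketing points.
            step = temps[i] - temps[i - 1]
            return temps[i - 1] + (f0 - 0.5) * step / (f0 - f1)
    return None


# Hypothetical two-point curve, for illustration only.
print(melting_temperature([50, 70], [1.0, 0.0]))
```

A curve that has not finished melting at the highest temperature would violate the plateau assumption, so a value returned for such data should not be read as a true Tm.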
Affinity Selection with Oligo-d(T)25 Magnetic Beads-About 40 pmol of poly(A)-tagged RNA and non-tagged RNA templates were mixed (ratio 1:1) in the presence of 50 mM Tris-HCl, pH 8.0, 200 mM KCl, and 1 mM EDTA in a final volume of 20 μl. The mixtures were heated to 95°C for 3 min, then chilled on ice and incubated at room temperature for 2 h. Before use, a 50-μl suspension of oligo-d(T)25 magnetic beads (New England Biolabs) was washed once with binding buffer (20 mM Tris-HCl, pH 7.5, 500 mM KCl, 1 mM EDTA), resuspended in 180 μl of binding buffer, and added to the RNA. The mixture was agitated at room temperature for 10 min and then placed in a magnetic rack to separate the magnetic beads from solution. The beads were washed once with binding buffer and three times with wash buffer (20 mM Tris-HCl, pH 7.5, 200 mM KCl, 1 mM EDTA), each time for 1 min with gentle agitation. To elute the RNA, the beads were resuspended in 15 μl of elution buffer (20 mM Tris-HCl, pH 7.5, 1 mM EDTA), incubated for 3 min at 95°C, and placed in a magnetic rack to separate the magnetic beads from solution. One volume of loading buffer (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each xylene cyanol and bromphenol blue) was added to the eluted solution, and the products were resolved in 4 or 6% polyacrylamide, 8 M urea gels. The gel was stained with ethidium bromide.
Formation of Higher Order Products and Native Gel Analysis-The mixture of the 5′ end 32P-labeled (about 500,000 cpm) and unlabeled RNA at a concentration of 4 μM and a final volume of 6 μl was heated to 95°C for 3 min, chilled, and incubated for 60 min at room temperature in a buffer containing 10 mM Tris-HCl, pH 7.5, 1 M KCl (or 1 M LiCl), 1 mM EDTA. Before electrophoresis, samples were mixed with 1 volume of loading dye (30% glycerol in 1× Tris-EDTA and loading pigments) and loaded onto a 6% non-denaturing polyacrylamide gel (0.5× Tris borate-EDTA + 10 mM KCl). Electrophoresis was performed at 4°C and 7 V/cm for 3-4 h, and the gel was dried onto Whatman No. 3MM chromatography paper and analyzed using phosphorimaging (GE Healthcare).
Cation-dependent Dimerization and Thermal Dissociation Analysis-Dimerization and melting experiments with RNA dimers of the cPPT region were conducted in parallel at the same molar concentration (4 μM) for each reaction setting. To form a dimer, the RNA was heated to 95°C for 3 min, chilled, and incubated for 16 h at room temperature in a buffer containing 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, and one of three different salts (KCl, NaCl, or LiCl), each at 1 M. After incubation, the mixtures were placed on ice, and 1 volume of 1× Tris-EDTA buffer was added. Aliquots of 15 μl were transferred to new tubes and incubated for 8 min at a specific temperature between 40 and 90°C, then returned to the ice. Samples were mixed with 1 volume of loading dye (30% glycerol in 1× Tris-EDTA and loading dyes) and loaded onto 6% non-denaturing gels run at 4°C. All gels were dried onto Whatman No. 3MM chromatography paper and analyzed using phosphorimaging (GE Healthcare).
Antisense DNA Oligonucleotide Binding Assay-A 5-fold molar excess of each oligomer (a-g) was added to separate dimerization incubations (4 μM RNA, 10 mM Tris-HCl, pH 7.5, 1 M LiCl, 1 mM EDTA) before the denaturation step. The samples were chilled on ice and incubated for 16 h at room temperature in a final volume of 7 μl. Products were resolved on a 6% non-denaturing gel run at 4°C in 0.5× Tris borate-EDTA + 50 mM KCl.
Strand Transfer Assay-The DNA primer 20 was heat-annealed to donor RNA by incubation at 95°C for 5 min and slow cooling to 37°C. Acceptor template was also present in the mixture. NC at the 200% polymer substrate level (100% NC is 7 nt/NC polymer substrate molecule) was added and incubated for 3 min. Next, the RT was added to the mixture and incubated for another 4 min to pre-bind the RT with substrates before reactions were initiated with MgCl2 and dNTPs. Primer, donor, and acceptor were mixed at a ratio of 2:1:1. The final reaction contained 50 mM Tris-HCl, pH 8.0, 50 mM KCl, 1 mM DTT, 1 mM EDTA, 32 nM HIV-1 RT, 6 mM MgCl2, and 50 μM dNTPs. For reactions in the presence of lithium ions, KCl was replaced with 50 mM LiCl. Reactions were incubated at 37°C and terminated after 1, 5, 15, and 30 min with 1 volume of termination dye (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each xylene cyanol and bromphenol blue). Products were then resolved by 6% polyacrylamide, 8 M urea gels and analyzed using phosphorimaging (GE Healthcare) and ImageQuant (Version 2.1). Sizes of DNA products were estimated by using a 5′-radiolabeled 10-bp DNA ladder (Invitrogen).

RESULTS
G-rich Sequences Capable of G-quartet Formation Are Present in the Middle of the Genome in Various HIV Species-Intensive research on the gag region near the dimerization initiation site of different retroviruses showed that formation of an RNA genomic dimer is also supported by interactions through G-quartet structure (9-13). To discover whether there are more distant G-rich sequences potentially capable of adhering the HIV-1 genomes, we used QGRS Mapper, a software program designed to predict formation of this structure (26). Our computational analyses revealed that the central region of the HIV-1 genome near the cPPT can form G-quartet structures with high probability. The sequence contains three G runs, in which the first G-rich element is a part of the cPPT sequence (Fig. 1). Putative G-quartets would also require downstream G-rich elements in HIV-1 that would be part of either an intramolecular structure containing two G-tetrads or a bimolecular structure with up to six G-tetrads. To form the RNA dimer, only one or two G-rich elements on each RNA molecule would need to participate in the interaction.
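As a rough illustration of the kind of sequence feature such a scan looks for, the sketch below locates runs of consecutive G residues in an RNA sequence. It is not the QGRS Mapper algorithm and applies no scoring; the function name `find_g_runs`, the default minimum run length, and the toy sequence are assumptions made purely for illustration.

```python
import re


def find_g_runs(seq, min_len=2):
    """Return (start, end, length) for each run of at least min_len G residues.

    Positions are 1-based and inclusive. The min_len default of 2 is an
    illustrative assumption, not a parameter taken from the paper.
    """
    pattern = re.compile(r"G{%d,}" % min_len)
    return [(m.start() + 1, m.end(), m.end() - m.start())
            for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence, not taken from the HIV-1 genome.
toy = "AAGGGGGGAUUGGGGUACAGUGCAGGGGAAAG"
for start, end, length in find_g_runs(toy, min_len=3):
    print(f"G run at {start}-{end} ({length} nt)")
```

A list of runs like this only flags candidate elements; whether they fold into a monomeric or intermolecular structure has to be tested experimentally, as in the assays that follow.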
Conservation of these elements would attest to their likely function in genome dimerization. The cPPT sequences are conserved among viral species, as expected from their multiple functions. To determine whether cPPT plus flanking sequences are also predicted to participate in G-quartet formation in different viruses, we analyzed genomes of various isolates of HIV-1 (A, B, C, D, F, G, H, U, O), HIV-2 (A, B, G, U), and several SIV species closely related to HIV-1. All analyzed sequences of HIV-1 and SIV have a similar distribution of cPPT-region G-rich elements and are also predicted to form G-quartet structures (Table 1). In the case of HIV-2, two G-rich elements are present and might be involved in formation of a dimer. In summary, our computational analyses of viral sequences suggest that G-rich sequences within and flanking the cPPT are involved in G-quartet formation in all analyzed species.
G-rich Regions Near the HIV-1 cPPT Induce Cation-dependent Pauses of RT-G-quartets are formed preferentially in the presence of specific monovalent cations and can be stabilized by some proteins and chemical agents. The order of cation preference is usually K+ > Na+ > Cs+ > Li+. The quadruplex formation rates increase with the salt concentration. Previous studies revealed cation-dependent pauses of RT progression immediately before and within a G-rich RNA template region involved in G-quartet formation (23,27). In the presence of low concentrations of potassium ions, pause sites caused by hairpin structures and G-quartets were observed; however, in the presence of a low concentration of lithium ions the pauses caused by hairpin structures were not affected, whereas the pauses caused by G runs involved in G-quartet formation were greatly reduced. This cation-dependent pause profile behavior reliably indicates locations where an analyzed sequence can form a G-quartet structure.
The RNA sequence of the cPPT region (4309-4396 in the RNA genome) was synthesized by transcription in vitro. Then, reverse transcription by HIV-1 RT was performed using the transcript as a template in the presence of K+ or Li+ to analyze the cation-dependent pausing profile for this region. Results in Fig. 2A show that cation-dependent pauses of RT were observed in the presence of 50 mM concentrations of salt for the template folded at a concentration of 22 nM. Three RT pause sites in the presence of potassium ions correspond to the first, second, and third runs of G residues in the RNA template and presumably result from formation of different G-quartet structures. The pauses were not observed in the presence of lithium ions. We also performed the reactions in the presence of K+, Na+, and Li+, with folding at a higher salt concentration (200 mM). The mixtures were then diluted to 50 mM concentrations of salt before the addition of RT and initiation of the reaction. The pausing profile of RT showed that the pause at the third G-rich element (G-3) was stronger in the reaction performed in the presence of K+ than in Na+, which is consistent with known cation preferences for forming G-quartet structure (Fig. 2B). Weak RT pauses at G-rich elements in the presence of Li+ indicate that the structure can also be formed in a higher salt concentration, which is also consistent with previous reports showing that G-quartets might be formed efficiently in the presence of lithium ions but are less stable (9,10). Significantly, for an RNA template with base substitutions in the second and third G-rich elements (MT in Fig. 2), the RT pause corresponding to the third G run was eliminated. This indicates that the third G-rich element contributes to formation of a G-quartet, which could be a monomer, dimers (between G-1 and G-3 or between G-2 and G-3), or tetramer, all able to pause the RT.
The pauses at first and second G-rich elements were clearly visible with the mutant template in the presence of potassium ions but less so in the presence of sodium and lithium, suggesting that both elements could still form G-quartets. However, these pause signals might only result from G-quartets formed between two and four RNA templates, as at least two more G residues are needed for this template to form a monomer G-quartet. In summary, these results demonstrate that G-rich elements in the HIV-1 cPPT region can fold into different forms of G-quartets, including monomers and intermolecular structures.
CD Spectra Analysis Confirms G-quartet Formation by RNA in the cPPT Region of HIV-1-G-quartets have a unique CD spectrum that depends on topology, which might be parallel or antiparallel. The parallel configuration refers to the structure in which the 5′-3′ direction of all strands that form the G-tetrad is the same. If one or more strands have a 5′-3′ direction opposite to the other strands, the G-quadruplex is said to adopt an antiparallel topology. The CD signature for a parallel configuration has a positive peak at around 265 nm and a negative peak at around 240 nm, whereas antiparallel G-quadruplexes have positive and negative peaks at 295 and 260 nm, respectively (28,29). For RNA molecules, only a parallel configuration of G-quadruplex structure was observed.
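The peak positions quoted above can be turned into a simple rule-of-thumb check. The sketch below is an illustration only: the function name `classify_cd_topology`, the dictionary input format, and the 5-nm tolerance are assumptions, and a real assignment would rest on the full spectral shape rather than on two extreme points.

```python
def classify_cd_topology(spectrum, tol=5):
    """Assign a topology from the positions of the CD maximum and minimum.

    spectrum: dict mapping wavelength (nm) to ellipticity (mdeg)
    tol:      allowed deviation in nm from the quoted peak positions
              (an illustrative assumption)
    """
    peak = max(spectrum, key=spectrum.get)      # wavelength of the maximum
    trough = min(spectrum, key=spectrum.get)    # wavelength of the minimum
    # Both signatures require a positive peak and a negative peak.
    if spectrum[peak] <= 0 or spectrum[trough] >= 0:
        return "unassigned"
    if abs(peak - 265) <= tol and abs(trough - 240) <= tol:
        return "parallel"
    if abs(peak - 295) <= tol and abs(trough - 260) <= tol:
        return "antiparallel"
    return "unassigned"


# Hypothetical spectrum values, for illustration only.
print(classify_cd_topology({240: -3.0, 265: 8.0, 295: 0.5}))
```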
The CD spectra were acquired for 4 μM RNA in a buffer containing 50 or 10 mM KCl, NaCl, or LiCl. Before analysis, the samples were heated to 90°C for 10 min, gently cooled at the rate of 1°C/5 min, and incubated at 4°C overnight. Results show that the 28-nt-long RNA of the cPPT region (4340-4367) in HIV-1 has a typical CD spectral profile for the parallel G-quadruplex configuration (Fig. 3A). The structure is formed in the presence of 50 mM K+, Na+, or Li+, although the intensity of the maximum peak at 265 nm is reduced for lithium ions. For RNA folded at 10 mM Li+, the CD profile almost overlaps with data recorded for RNA incubated in the absence of any salt. Significantly, the positive ellipticity at 265 nm is only slightly reduced for RNA folded in 10 mM KCl, indicating that the sequence easily adopts G-quadruplex configuration even at low salt concentration.
The results from the RT progression assays did not show G-quartet formation in 50 mM LiCl. However, in this method the RNA concentration used for folding was also much lower (22 nM) than for the CD spectra (4 μM), and the ability of a sequence to form a G-quartet is known to increase with the salt and template concentration.
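The size of this concentration gap can be checked with simple arithmetic. The sketch below assumes that the 22 nM figure corresponds to the 200 fmol of RNA annealed in a 9-μl volume in the reverse transcription assay, and that the CD concentration is 4 μM; that correspondence is an inference from the procedures, and the helper name `molar_concentration` is hypothetical.

```python
def molar_concentration(amount_mol, volume_l):
    """Concentration in mol/l from an amount (mol) and a volume (l)."""
    return amount_mol / volume_l


# Assumed inputs: 200 fmol RNA in 9 microliters (RT assay annealing step).
rt_assay_rna = molar_concentration(200e-15, 9e-6)
# Assumed input: 4 micromolar RNA used for the CD spectra.
cd_rna = 4e-6

print(f"RT assay folding concentration: {rt_assay_rna * 1e9:.1f} nM")
print(f"CD sample is about {cd_rna / rt_assay_rna:.0f}-fold more concentrated")
```

Under these assumptions the RT assay RNA comes out at about 22 nM, roughly 180-fold more dilute than the CD samples, which is the scale of difference the comparison between the two methods has to account for.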
The thermal stability analysis also indicates that the G-quadruplex structure formed in the cPPT region is very stable in the presence of KCl. Melting curves were obtained by monitoring changes at a wavelength of 265 nm at a heating rate of 2°C/min. Fig. 3B shows normalized CD melting curves for G-quartets folded in the presence of K+, Na+, and Li+. The thermal denaturation for G-quadruplexes formed in 50 mM KCl showed that ~80% of them were still present at 90°C. The unfolding of the structure is likely prevented by the presence of potassium ions. Surprisingly, the melting curve for the structures formed in 10 mM KCl showed that only 40% of them were unfolded at the highest temperature, indicating that the G-quadruplex is very stable. The G-quadruplexes formed in 50 mM NaCl or LiCl completely melted with Tm values of 68 and 59°C, respectively. Because potassium is the dominant monovalent cation inside cells, these results suggest that G-quadruplex configuration could be easily adopted by viral RNA sequence of the cPPT region.
Interacting RNA Strands with the HIV-1 cPPT Region Sequence Are Selected by an Affinity Isolation Method-One, two, or four nucleic acid molecules can form a G-quadruplex. To test whether G-quartets that we detected are formed as intermolecular structures, we developed an affinity selection method, with which the interactions between nucleic acids can be tested by selection of interacting partners with one tagged by a poly(A) sequence. Magnetic beads conjugated with oligo-d(T)25 were used for affinity selection. The method has been used to select mRNAs and was modified here by using buffers suitable for G-quartet formation. The interacting partners are distinguished in a denaturing gel stained with ethidium bromide. We tested the specificity of this approach using RNA strands with the HIV-1 DIS synthesized by transcription in vitro. The poly(A)-tagged RNA with the sequence 183-520 of the HIV-1 RNA genome (with the DIS at position 257-262) could select another RNA having the DIS (1-520) but devoid of poly(A). Non-tagged DIS RNA could not be selected by magnetic beads in the absence of the poly(A)-tagged partner, demonstrating that the observed interactions are not the result of nonspecific binding to the magnetic beads (Fig. 4A).
To determine whether RNA molecules with the sequence of the HIV-1 cPPT region interact, the RNA strands corresponding to positions 4309-4396 of the HIV-1 NL4-3 RNA genome were synthesized with poly(A) tails and were co-incubated with equivalent RNA strands but devoid of poly(A) sequence. Because the gag regions near DIS from HIV-1 MAL and NL4-3 were already shown to form dimers through G-quartet structure, we used RNA strands of these regions as positive controls for dimerization (9,10,12). Our RNA strands having the relevant gag sequences (303-415 in HIV-1 MAL, 290-403 in HIV-1 NL4-3) did not include the DIS. The RNA partners were combined and incubated to allow possible dimerization and subsequently used for affinity selection with magnetic beads. As shown in Fig. 4B, the non-tagged RNA strands of the gag region of the two different HIV-1 species (MAL and NL4-3) and non-tagged RNA strands of the cPPT region were co-selected with corresponding poly(A)-tagged RNAs. This indicates that the G-rich cPPT region of the HIV-1 RNA genome is a likely additional point of contact between the two RNA genomes that augments dimerization initiated through DIS.
Dimerization of RNA Strands from the HIV-1 cPPT Region-The complexes of interacting RNA partners from the cPPT region selected by magnetic beads might have represented a mixture of dimers and tetramers in which G-quartets are formed between two and four nucleic acid molecules, respectively. To determine what types of complexes are formed between these RNA molecules, we used a native gel analysis assay in which RNA templates of the cPPT region were radiolabeled, incubated in the presence of 1 M KCl or LiCl, and then resolved in a native polyacrylamide gel. As a control, we used RNA strands with the antisense sequence.
An 88-nt HIV-1 fragment of the cPPT region (4309-4396) including the G-rich segments self-associated to form mostly a dimeric complex and a lesser amount of tetrameric complex, both with reduced electrophoretic mobility. In contrast, the antisense RNA template remained unfolded, which strongly suggests that strand interaction resulted from structures that cannot exclusively depend on A:U and G:C base pairs (Fig. 5). Surprisingly, the higher order structures were formed more efficiently in the presence of LiCl than in the presence of KCl, a result not observed at the same concentration of RNA in the CD spectral analysis. However, the concentration of salt in both experiments was also significantly different (1 M for RNA dimerization and 50 mM for CD spectra).
The higher rate of G-quartet dimer formation in the presence of LiCl is consistent with results obtained for G-quartet dimers formed from gag region sequences in which the yield of dimers correlated inversely with the size of monovalent cation (i.e. Li+ > Na+ > K+) (9,10). However, although the RNA G-quartet dimers of the gag region fold at a slower rate in the presence of potassium ions, the complexes are much more stable than those folded in the presence of lithium ions. Our results indicate that this is also the case with the G-rich sequences of the cPPT region. A thermal dissociation experiment (Fig. 6) revealed that although only 12% dimerization occurred in the presence of potassium ions, the dimers remained stable when incubated for 8 min at temperatures between 40 and 90°C. Formation of dimers was more efficient in the presence of sodium ions, with ~34% of the strands forming complexes, and they also remained stable during the incubation at higher temperatures. However, in the presence of lithium ions, although up to 49% of the strands formed dimers, they exhibited substantial breakdown after incubation at 50°C and were completely disrupted after incubation at 90°C. These results demonstrate that complexes formed by the RNA template of the cPPT region display characteristics known for intermolecular RNA G-quartet structures.
To confirm that the G-rich sequences contributed to dimerization of the templates, we attempted to form dimers with the G-rich sequences in the additional presence of different antisense 16-nt-long DNA oligonucleotides that were expected to prevent G-quartet formation by interacting with complementary regions of sense RNA template (Fig. 7). Because dimers were more efficiently formed in the presence of lithium ions, we used this salt for this experiment. Inclusion of a 5-fold molar excess of each DNA oligomer confirmed that dimer formation is affected by DNA oligomers that bind the G-rich sequences. In particular, G-rich element G-1 within the cPPT and G-2 were indicated to be critical for RNA template dimerization. These results establish that dimer formation within the cPPT region depends on G residues and involves two of the most conserved G-rich elements.

G-quartet Structure Near the cPPT Promotes Template Switching by the RT-We previously showed that a major recombination hot spot in the HIV-1 gag region near DIS correlates with sequences rich in G residues that can form G-quartet structures (23,24,30). We showed enhanced strand transfer resulting from G-quartet dimer formation that holds the templates in close proximity and/or G-quartet monomer or dimer formation that increases the frequency of RT RNase H cleavage by pausing the RT (Fig. 8A).
We have now acquired evidence that the G-quartet structure formed in the cPPT region also promotes template switching by the RT. We were initially encouraged in this expectation because the distribution of recombination breakpoints across multiple HIV-1 genomes revealed several concentrations, one of which was within a 200-nt-long sequence spanning from near the cPPT to the 3′ end of the pol gene that contains runs of G residues (31). To test the basis of the observed template switching, we constructed a reconstituted system to determine whether G-quartet structure induces the RT to transfer the synthesis of minus-strand DNA from one RNA template to another. This system consisted of HIV-1 RT, HIV-1 NC, a primer (DNA oligonucleotide), and two RNA templates representing two copies of the HIV-1 RNA genome denoted as donor and acceptor for primer strand transfer (Fig. 8B). The reaction was initiated from a 32P-labeled oligo(dT) DNA primer annealed to the donor RNA. The donor and acceptor templates shared a homology spanning an 86-nt sequence of cPPT region (4311-4396). The donor RNA template (4309-4396) was elongated with a poly(A) sequence at its 3′ end to serve as a primer binding site. The acceptor RNA (4311-4396) was elongated at the 5′ end with GGAAAAAAAAAA, so that transfer products could be separated and distinguished on a denaturing gel from DNA synthesized on the donor RNA. Moreover, the acceptor template does not share homology with two nucleotides at the 5′ end of the donor RNA (4309-4310, green dot in Fig. 8B) to prevent transfers from the end of the donor template. Thus, all transfers to the acceptor would only originate from internal regions of the donor template, as they do in vivo during synthesis over the cPPT region. When the primer switched from the donor onto the acceptor, the completion of the synthesis on the acceptor yielded a 115-nt transfer product (TP).
However, if the primer completed its synthesis on the donor without transfer, it yielded a 104-nt donor extension product (DE). To determine whether formation of G-quartets influences transfer efficiency, we performed strand transfer assays in the presence of a low concentration of either K+ or Li+ (Fig. 8C). The transfer efficiency of reactions was calculated by comparison of values of donor extension products and transfer products using the formula: transfer efficiency = 100 × TP/(TP + DE). We also note that monovalent cations added to the reaction do not significantly affect the enzymatic activity of RT (32,33).
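The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the quantification software used in this study: the function names and the band intensities in the usage note are hypothetical, and the inputs are assumed to be background-corrected band intensities for TP and DE taken from the same gel lane.

```python
def transfer_efficiency(tp: float, de: float) -> float:
    """Percent transfer efficiency: 100 * TP / (TP + DE).

    tp and de are band intensities (arbitrary units) of the transfer
    product and the donor extension product from the same lane.
    """
    total = tp + de
    if total <= 0:
        raise ValueError("TP + DE must be positive")
    return 100.0 * tp / total


def fold_change(efficiency_a: float, efficiency_b: float) -> float:
    """Ratio of two transfer efficiencies, e.g. K+ reaction over Li+ reaction."""
    if efficiency_b <= 0:
        raise ValueError("reference efficiency must be positive")
    return efficiency_a / efficiency_b


# Illustrative numbers only; these are NOT measurements from this study.
eff_k = transfer_efficiency(tp=30.0, de=70.0)   # hypothetical K+ lane
eff_li = transfer_efficiency(tp=15.0, de=85.0)  # hypothetical Li+ lane
ratio = fold_change(eff_k, eff_li)
```

With these made-up intensities the two efficiencies are 30% and 15%, a 2-fold difference. Because the formula normalizes TP to the sum of both full-length products, lanes with different total yields can still be compared.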
In these strand transfer reactions, the major cation-dependent pause site of RT synthesis was clearly visible at the first G-rich element (GQ) in the presence of K+ but not Li+. The strong pause evidently caused dissociation of the RT, as fewer final products were made when the pause was prominent. Significantly, the strong pause induced by potassium, presumably resulting from G-quartet formation, resulted in a 2.6-fold increase in transfer efficiency when compared with the reaction in the presence of lithium ions, wherein the structure was not formed. The strong pausing of the RT most likely promoted cleavages in the RNA donor template catalyzed by the RT RNase H, as previously demonstrated (23,34). It is also probable that G-quartet-induced template dimerization promoted transfer through template proximity. Thus, this result strongly indicates that the high recombination rate in the cPPT region observed in vivo is associated with formation of G-quartet structure.

DISCUSSION
Electron microscopic evidence suggests that the two genomic RNAs in HIV-1, murine leukemia virus, Rous sarcoma virus, avian reticuloendotheliosis virus, endogenous feline type C virus RD-114, and woolly monkey sarcoma virus are linked but only within the region close to their 5′ ends (35-39). However, the denaturing conditions used in electron microscopy likely disrupt most of the RNA secondary and tertiary structures. Under milder conditions additional points of contact were observed between the two RNA genomes of Rous sarcoma virus, suggesting that interactions at the 5′ end could be the most stable point of a multiple contact association (40). It was also shown that nicked viral RNA of avian leukosis virus migrates as a dimer, suggesting that additional interactions in other parts of the genome maintain integrity of damaged RNA (41). Moreover, a significant number of HIV-1 viruses with dimerized genomes were formed even when the DIS was made non-functional with dimerization-disruptive mutations, although the viruses were less infectious (8,42). Intensive analysis of the HIV-1 gag region RNA-RNA interactions revealed that in addition to dimerization at DIS, RNA templates with the gag sequence adhere by formation of intermolecular G-quartets folded from G-rich sequences located ~100 nt 3′ of DIS (9-13). The initial interactions through DIS and parallel alignment of the RNA strands raise the local RNA concentration to a point at which additional interactions are favored at other locations of the genome through intermolecular G-quartets. One of those places appears to be the central region of the HIV-1 genome near the cPPT, where three runs of G residues are present. The G-rich elements in this region are conserved among different HIV-1 isolates and can also be found in closely related SIV strains. In the case of HIV-2, there are two G-rich elements, G-1 and G-2, in this region, sufficient to form a dimer, and our results indicate that they are critical for forming an intermolecular G-quartet. With an affinity selection approach and native gel analysis we show that two RNA molecules with the cPPT region sequence can self-associate, and the dimeric complex has features and properties resembling those described for RNA dimers formed in the gag region through an intermolecular G-quartet. The G-2 and G-3 elements in the HIV-1 cPPT region were previously shown to form G-quartet structure in DNA, and it was suggested that the elements are involved in formation of the cPPT DNA flap (43).
FIGURE 8. G-quartet formation in the cPPT region facilitates RT template switching during reverse transcription. A, G-quartet formation promoted minus-strand DNA transfer (gray line) through pausing of the RT (blue oval), which increased RNA cleavages (blue triangle) and allowed interaction with the second RNA template. The transfers were also promoted through an RNA template proximity effect resulting from template dimerization through a G-quartet dimer. B, shown is a reconstituted system to analyze the influence of G-rich elements on strand transfer during HIV-1 minus-strand DNA synthesis in vitro. Donor and acceptor RNA templates represent two copies of the viral RNA genome in which reverse transcription is initiated from a 32P-labeled DNA primer (P) annealed to donor RNA. The transfer reactions onto acceptor RNA will result in synthesis of TP distinguished in a denaturing gel from the DE by its greater length. The acceptor RNA does not share a homology (green dot) with two nucleotides at the 5′ end of the donor RNA. These alterations prevent end transfers of donor extension products; thus, transfer products should only originate internally. C, shown is a time course of strand transfer reactions performed in the presence of potassium and lithium ions. Samples were collected at 1, 5, 15, and 30 min after the reaction was initiated. Formation of a G-quartet in the RNA templates paused the RT during minus-strand DNA synthesis and influenced the yield of the final products. The transfer efficiency was 2.6-fold higher in reactions in which the templates could form a G-quartet than for reactions with substrates that could not form the structure.
Structurally, G-quartets can be polymorphic and adopt several different forms depending on their sequence and concentration and the milieu of monovalent cations. Our results also show this diversity, and whereas G-quartet dimers are evident at a high concentration of RNA and salt, monomers are almost solely formed in the presence of potassium ions at a low salt and RNA template concentration. Of course, our results cannot fully simulate the structural arrangements of the cPPT region in a virion. However, they reliably demonstrate that the sequences having G-rich elements in the middle portion of the RNA genome are capable of forming a G-quartet dimer and are likely a recombination-relevant position of interaction between the two viral genomes.
Our results also confirm that G-quartets formed near the HIV-1 cPPT can effectively pause synthesis by the RT and induce DNA primer strand transfer from a donor to an acceptor RNA template, suggesting that a G-quartet formed near the cPPT facilitates RT template switching. In fact, an analysis of 271 sequences of the HIV-1 group M subtype indicates that the sequence near the 3′ end of the pol gene, containing the cPPT, is a common site of recombinant breakpoints (31). Very likely, the increased rate of recombination in this region derives from the presence of G-quartets.
Because monomers were formed more efficiently than dimers in our experiments, the stimulation of transfers in vitro probably did not primarily result from a proximity effect of dimerization. However, dimerization of templates through DIS and G-quartets in gag was previously shown to stimulate strand transfer (24,44). Moreover, in the context of whole genomes, the opportunity for multiple contact points should favor dimeric over monomeric G-quartet formation. Consequently, we expect that dimerization is also a substantial contributor to recombination in the cPPT region. Significantly, adherence of the two genomic RNAs through an intermolecular G-quartet near the cPPT would have a global effect on recombination because, together with the gag adherence point and possibly others, it would maintain the proximity of the viral genomes over a long distance. This would contribute to increased recombination rates in many other locations.
Previous studies showed that pol gene sequences and others located downstream display recombination hot spots (30,31,45). Because reverse transcription proceeds from the 3′ to the 5′ end of the genomic RNA, the dimerization near cPPT would facilitate recombination in downstream (3′ on the template) regions only, as all interactions between genomes would be disrupted by synthesis of minus-strand DNA. Investigation of the distribution of recombination breakpoints across multiple HIV-1 genomes revealed the presence of a recombination hot spot lying ~1 kb downstream of the cPPT, within the 600-nt-long sequence window of the first exon of tat, vpu, rev and the beginning of env (31). In fact, this region together with the second exon of tat and rev exhibits the highest rate of recombination throughout the whole genome. We propose that global genome spatial alignment, mediated by multiple sites of dimerization, augments the recombination rate at this and other major hot spots. We also suspect that local dimerization of genomic RNA in different regions plays a role in distinguishing RNA genomes from messenger RNA molecules.
The ability of viral RNA genomes with binding-disruptive mutations in DIS to form some dimers indicates that the mechanism of genome dimerization also involves other regions (8,42). Our results show that the sense RNA template of the cPPT region can spontaneously dimerize through G-quartet formation. Thus, we suggest that alignment of HIV-1 RNA in vivo is also mediated by formation of G-tetrads between viral genomes. Because self-association of RNA templates through the G-quartet structure occurs at a slower rate than through DIS, G-dimers would appear after interaction at DIS and parallel alignment of two viral genomes, which would bring homologous regions into proximity and initiate formation of intermolecular G-quartets in gag, near cPPT, and likely in other locations. This is in agreement with the observation that genomic RNA dimerization undergoes a maturation process, as genomic RNA dimers of newly released viral particles are less stable than genomic RNA dimers of 48-h-old virions (8).
Formation of G-quartet dimers between homologous sequences throughout retroviral genomes could be a spontaneous reaction, similar to formation of double-stranded DNA or formation of helices involving hybridization of complementary sequences. Any sequence with two runs of two or more G residues could potentially be a region where the RNA genomes associate, although the distance between two G runs and the sequence context will be significant factors determining the probability of the formation of G-quartet structure. For example, the G residues might be occluded in stable hairpins so that formation of G tetrads cannot occur. Thus, more advanced analyses with application of computational approaches are needed to predict genomic regions more prone to form G-quartet dimers between associated retroviral genomes. Moreover, the sequence requirements for formation of G-quartets are not yet fully understood, as the structure might also involve some adenines, as was proposed for the gag region (9,10). The cPPT sequence also has a long tract of A residues that might participate in formation of a quadruplex. The pausing of RT during reverse transcription occurs primarily at G residues, but that does not rule out involvement of A residues, particularly those after G residues, forming a purine run.
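As a starting point for such a computational analysis, the simplest criterion mentioned above (two runs of two or more G residues separated by a limited distance) can be sketched as follows. This is an illustrative first-pass filter under stated assumptions, not a validated predictor: the function names, the `max_gap` threshold of 20 nt, and the toy sequence are all hypothetical, and the sketch ignores hairpin occlusion, adenine participation, and any other sequence context.

```python
import re


def find_g_runs(seq: str, min_len: int = 2):
    """Return (start, length) for each maximal run of >= min_len G residues.

    Positions are 0-based. The input is upper-cased before scanning.
    """
    pattern = re.compile(r"G{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


def candidate_pairs(seq: str, min_len: int = 2, max_gap: int = 20):
    """Return neighboring G-run pairs separated by at most max_gap nucleotides.

    Each entry is (first_run, second_run, gap). max_gap is an arbitrary,
    assumed threshold chosen only for illustration.
    """
    runs = find_g_runs(seq, min_len)
    pairs = []
    for (s1, l1), (s2, l2) in zip(runs, runs[1:]):
        gap = s2 - (s1 + l1)  # nucleotides between the two runs
        if gap <= max_gap:
            pairs.append(((s1, l1), (s2, l2), gap))
    return pairs


# Toy sequence invented for this example; it is NOT an HIV-1 sequence.
toy = "AUGGGACUUGGAAAAGGGGCU"
runs = find_g_runs(toy)
pairs = candidate_pairs(toy)
```

For the toy sequence, the scan reports three G runs and two neighboring pairs, each separated by 4 nt. A realistic analysis would combine such a scan with secondary-structure prediction to exclude G runs buried in stable hairpins.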
In summary, little is still known about the mechanisms and factors involved in dimerization of viral genomes despite the fact that retroviral RNA dimerization is a key element in virus propagation and survival. Thus, viral genome dimerization might be a good target for inhibitors of replication of HIV-1 and other retroviruses. However, understanding of this process is a prerequisite for development of suitable therapeutic approaches.