A Preferred Target DNA Structure for Retroviral Integrase in Vitro

The retroviral integrase protein catalyzes the insertion of linear viral DNA ends into the host cell DNA. Although integration in vivo is not site-specific, the detection of local and regional preferences within cellular DNA suggests that the integration reaction can be influenced by specific features of host DNA or chromatin. Here we describe highly preferred in vitro integration sites for avian sarcoma virus and human immunodeficiency virus-1 integrases within the stems of plasmid DNA cruciform structures. The preferred sites are adjacent to the loops in the cruciform and are strand-specific. We suggest that the observed preference is due to the end-like character of the stem loop structure that allows DNA unpairing. From these results we propose that such unpairing may enhance both the processing and the joining steps in the integration reaction, and perhaps other cellular recombination reactions as well.

The integration of retroviral DNA into the host cell chromosome is an obligatory step in the retroviral replication cycle and is catalyzed by a virus-encoded protein, integrase (IN) (1-3). With respect to viral DNA, integration is site-specific and occurs at the ends of the linear DNA. In contrast, many sites in host DNA can be targets for integration. IN is sufficient to perform the integration reaction in vitro, and two well defined steps have been described. In the first step, the "processing" reaction, linear viral DNA is nicked 3′ of a conserved dinucleotide (5′-CA-3′) that is usually located two nucleotides from the 3′-ends of the viral DNA strands (4,5). In the second step, "joining" (5,6), the same active site (7,8) is used to catalyze a coupled cleavage-ligation reaction via the direct attack on host DNA phosphates by the newly formed CA 3′-OH groups at the viral DNA ends (9). The two target DNA phosphates selected for joining are staggered by 4-6 base pairs on the two DNA strands. IN functions as a multimer that may facilitate the positioning of the two viral DNA ends at the integration site. Subsequent repair and sealing of the 5′-ends of the viral DNA strands gives rise to the characteristic 4-6-base pair duplication of host sequences flanking the integration site. This joining of 5′-ends of the viral strands is thought to be carried out by host mechanisms. In vitro, IN can utilize synthetic duplex viral DNA substrates whose sequence corresponds to a single viral DNA terminus, as well as longer substrates that contain two viral-like DNA ends. To study the latter concerted reaction, we have also constructed substrates comprised of two viral DNA ends held together by a single-stranded DNA tether (10).
Although purified IN can join 3′-ends of viral DNA strands to naked target DNA in vitro, the in vivo reaction is likely influenced by components of host chromatin or DNA structure. In a recent study (11), it was found that many regions of avian host DNA are accessible for ASLV integration in vivo. Within these regions, preferred integration sites were observed, and it was suggested that such preferences may be influenced by local structural variations (11). There is also evidence that specific components of the transcriptional apparatus may play a role in recruiting HIV-1 (12) or yeast Ty3 retrotransposon (13) preintegration complexes to specific sites through protein-protein interactions. Nonrandom integration patterns have also been observed in vitro (14-20) using both purified IN and retroviral preintegration complexes isolated from infected cells. These in vitro results indicate that specific features of DNA structure, either native or protein-induced, are capable of providing favorable targets for integration.
Here we describe a strong local preference for IN-mediated joining of model viral DNA ends to plasmid target DNA in vitro. The preferred sites are situated in the stems of a DNA cruciform, immediately adjacent to the loops. The polarity of joining site selection, with respect to this structure, suggests that a crucial feature of the preferred sites may be a resemblance to DNA ends. The potential for DNA unpairing at viral DNA ends is apparently critical for IN processing activity on viral DNA substrates (21), and we suggest that partial melting of target DNA may promote IN activity in a similar manner.

EXPERIMENTAL PROCEDURES
DNA Plasmids, Substrates, and PCR Primers-The construction of the pBR711-2 target plasmid has been described previously (22). A synthetic DNA duplex containing two tandem lexA operators was inserted between the BamHI (position 375) and SalI (position 651) sites of pBR322 (see Fig. 2). pBR880-1 was constructed in the same manner except that the potential stem sequences were flipped (see Figs. 4B and 5C). Plasmid DNAs were prepared using the Qiagen Plasmid Maxi Kit. The ASV DNA substrate was a synthetic 16 mer/18 mer duplex corresponding to the "processed" U3 end of viral DNA, as described previously (22). The HIV-1 oligonucleotide duplex substrate (19 mer/21 mer) corresponded to a "processed" viral U5 end: top strand, 5′-GTGTGGAAAATCTCTAGCA-3′; bottom strand, 5′-ACTGCTAGAGATTTTCCACAC-3′. Two "fixed" oligonucleotide primers complementary to pBR322 sequences were used for PCR-based detection of integration events on both DNA strands. Bottom strand primer: 5′-CACCTGTCCTACGAGTTG-3′, positions 751-768 of pBR322; top strand primer: 5′-TCTCGGAGCACTGTCCGACCG-3′, positions 271-291 of pBR322. The top strand primer detects integration events on the bottom strand, while the bottom strand primer detects integration events on the top strand. The ASV viral-specific primer for PCR-based detection of integration events was as described previously (22). Similarly, the HIV-1 viral-specific PCR primer corresponds to the U5 top strand described above.
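The geometry of the "processed" HIV-1 U5 duplex can be checked directly from the two oligonucleotide sequences given above. The following is a minimal sketch, not part of the original methods: the function names are hypothetical, and the only inputs are the 19-mer and 21-mer sequences quoted in this section. It confirms that the top strand ends in the conserved CA-3′ dinucleotide and reports the unpaired 5′ extension on the bottom strand.

```python
# Sequences are taken from the text; function names are illustrative only.
HIV_U5_TOP = "GTGTGGAAAATCTCTAGCA"       # 19-mer, written 5' -> 3'
HIV_U5_BOTTOM = "ACTGCTAGAGATTTTCCACAC"  # 21-mer, written 5' -> 3'

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(top: str, bottom: str):
    """Return the unpaired 5' extension of `bottom` when the rest of
    `bottom` pairs perfectly with `top`; return None otherwise."""
    paired = reverse_complement(top)
    if bottom.endswith(paired):
        return bottom[: len(bottom) - len(paired)]
    return None


if __name__ == "__main__":
    print("top strand ends in CA:", HIV_U5_TOP.endswith("CA"))
    print("bottom strand 5' overhang:", five_prime_overhang(HIV_U5_TOP, HIV_U5_BOTTOM))
```

With these sequences the bottom strand carries a two-nucleotide 5′ overhang, as expected for a substrate that mimics a viral end after the processing step.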
Detection of Cruciform Structures-Potential cruciform loops were detected by S1 nuclease sensitivity, as described previously (23,24). Supercoiled pBR711-2 DNA and control DNA (pBRN1, a pBR322 derivative) were incubated at 37°C in a volume of 200 µl with 80 units of S1 nuclease (Boehringer Mannheim) for the indicated times using the suggested buffer. The reactions were stopped with EDTA, and the products were fractionated on 1% agarose gels containing ethidium bromide.
Unpaired adenines were detected as described previously (23,25) using diethylpyrocarbonate (DEPC) modification. In our protocol, PstI-cleaved or supercoiled pBR711-2 plasmid DNAs were treated with DEPC, cleaved with SalI, and repair-labeled at the SalI site with ³²P. A uniquely end-labeled PstI-SalI fragment was isolated from both substrates and subjected to cleavage with piperidine. The products were fractionated on a 20% acrylamide DNA sequencing gel.
Integration Reactions-ASV IN was purified from Escherichia coli as described previously (22). ASV IN (1 µl, 35 pmol) and 300 ng (~0.1 pmol) of PstI-linearized or supercoiled plasmid target DNA were first incubated on ice for 30 min in 20 mM Tris-HCl, pH 7.4, in a reaction volume of 16 µl. (The ASV IN storage buffer contained 0.5 M NaCl, 40% glycerol, 50 mM Hepes, pH 8.1, 1% thiodiglycol, and 0.1 mM EDTA.) One picomole of viral DNA substrate corresponding to a processed U3 ASV terminus (U3 16 mer/18 mer) was added, followed by MgCl₂ to a final concentration of 5 mM. The reactions (20 µl) were then incubated at 37°C for the indicated times and were stopped by transfer to a proteinase K solution. After proteinase K digestion (minimum 2 h), the DNAs were purified by phenol extraction and ethanol precipitation, and 10% of the sample was used for PCR-based detection of single-end integration events (22,26). The fixed PCR primers complementary to pBR322 sequences were end-labeled using [γ-³²P]ATP and T4 polynucleotide kinase. The PCR primer complementary to the viral DNA was unlabeled. PCR conditions were: 1 min, 94°C; 1 min, 37°C; 2 min, 72°C; for 30 cycles. PCR products were fractionated on a 7% DNA sequencing gel in parallel with a 10-base pair ladder (Gibco BRL) that was also end-labeled. The reaction conditions described here for in vitro integration were similar to those used for sequence-specific targeting experiments using a LexA repressor-ASV IN fusion protein (22). The major modification was to use short time points such that less than one integration event per plasmid target was detected (see "Results"). We have not yet investigated other parameters, such as the requirement for the IN-target DNA preincubation step.
For HIV-1 integration assays, a bacterially produced derivative of IN was used. This derivative displays wild-type processing and joining activities, but can be stored and assayed at much higher concentrations than the wild-type protein.² The improved solubility properties do not appear to be relevant to the experiments described here. The buffer conditions and order of addition were similar to those described for ASV integration reactions. Reactions contained 55 pmol of HIV-1 IN, 10 mM MnCl₂ and target DNA, and were incubated for 60 min at 37°C. Integration events were assayed by PCR as described above.
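The molar amounts quoted for the ASV reaction can be reproduced with a short calculation. This is a rough sketch under stated assumptions: the average mass of 650 g/mol per base pair is a common approximation, and the plasmid length of 4,100 base pairs is a hypothetical round figure for a pBR322-sized target (the exact length of pBR711-2 is not given in the text). The function names are illustrative.

```python
# Assumption: average mass of one base pair of duplex DNA, in g/mol.
AVG_BP_MASS = 650.0
# Assumption: approximate plasmid length in base pairs (not stated in the text).
ASSUMED_PLASMID_BP = 4100


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of duplex DNA (ng) to picomoles of molecules."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_BP_MASS)
    return moles * 1e12


def final_conc_nM(amount_pmol: float, volume_ul: float) -> float:
    """Final concentration in nM (pmol per microliter equals micromolar)."""
    return amount_pmol / volume_ul * 1000.0


if __name__ == "__main__":
    target_pmol = ng_to_pmol(300, ASSUMED_PLASMID_BP)
    print(f"target plasmid: {target_pmol:.2f} pmol")
    print(f"IN:         {final_conc_nM(35, 20):.0f} nM in the 20 µl reaction")
    print(f"viral DNA:  {final_conc_nM(1, 20):.0f} nM in the 20 µl reaction")
    print(f"IN per target plasmid: about {35 / target_pmol:.0f}-fold molar excess")
```

Under these assumptions, 300 ng of plasmid corresponds to roughly 0.1 pmol, in line with the value stated above, and IN is present in a large molar excess over both the viral DNA substrate and the target plasmid.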

RESULTS
An Enhanced Target Region for ASV IN Is Dependent on DNA Structure-We have previously described in vitro targeting of an ASV IN-LexA repressor fusion protein to tandem lexA operator sites on a linear plasmid DNA (22). The results suggested that the IN-repressor fusion protein can bind to the cognate operator sequences and promote enhanced integration into adjacent regions. Similar results have been described by others (27,28). Subsequently, we observed that if the plasmid target substrate was supercoiled, the enhanced integration into adjacent regions was abrogated and strongly preferred integration was observed near the junction of the tandem lexA operators. This new preference was seen with wild-type ASV IN, as described below. As in previous experiments, we used a PCR-based assay to detect integration events (22,26,27). The reaction components include a short viral DNA substrate that represents a single processed ASV DNA end (U3 16 mer/18 mer), purified ASV IN, and a plasmid DNA target containing two tandem copies of the lexA operator (pBR711-2). Joining events near the lexA operators were detected using a PCR primer complementary to fixed positions in the plasmid target DNA and another primer complementary to the viral DNA. The sizes of the PCR products indicate the distance between the fixed primer and the joining site (including the length of the viral-specific primer).
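The relationship between PCR product size and joining site can be written as simple arithmetic. The sketch below is a simplified model, not the authors' analysis procedure: it assumes that each product consists of the full viral-specific primer sequence plus the target DNA running from the 5′ end of the fixed primer to the joining site, and it ignores off-by-one conventions at the junction. The function name and the example product length of 150 nucleotides are hypothetical; the primer lengths (19 and 21 nucleotides) are those of the HIV-1 viral-specific primer and the top strand fixed primer given in "Experimental Procedures".

```python
HIV_VIRAL_PRIMER_LEN = 19   # U5 top strand primer, from "Experimental Procedures"
TOP_FIXED_PRIMER_LEN = 21   # top strand fixed primer, pBR322 positions 271-291


def target_span(product_len: int, viral_primer_len: int, fixed_primer_len: int) -> int:
    """Length of target DNA in a PCR product, counted from the 5' end of
    the fixed primer to the joining site (simplified model)."""
    span = product_len - viral_primer_len
    if span < fixed_primer_len:
        raise ValueError("product is too short to contain both primers")
    return span


if __name__ == "__main__":
    product_len = 150  # hypothetical band size read against the 10-base pair ladder
    span = target_span(product_len, HIV_VIRAL_PRIMER_LEN, TOP_FIXED_PRIMER_LEN)
    print("target DNA in product:", span, "nt")
    print("distance beyond the fixed primer:", span - TOP_FIXED_PRIMER_LEN, "nt")
```

In this model a longer product always corresponds to a joining site farther from the fixed primer, which is why each band on the sequencing gel can be read as a distinct joining position.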
As seen in Fig. 1, many joining events can be detected on both strands of the linear pBR711-2 plasmid substrate in the region scanned in this assay (lanes 3 and 5). However, when the pBR711-2 plasmid target DNA was supercoiled, a strong local enhancement of joining was observed (lanes 4 and 6). As shown, the enhanced sites on both strands mapped to the tandem lexA operators (denoted OP). In some experiments, the enhancement appeared to occur at the expense of other sites; this is seen most clearly in Fig
The percentage of supercoiled target DNA available decreases as the reaction proceeds because a single joining event anywhere on the target plasmid should relax the DNA. Therefore, in the experiments described above, we used suboptimal conditions for the joining reaction so that the initial events could be measured (see "Experimental Procedures"). The starting plasmid preparation was estimated to be 80% supercoiled and, under these conditions, it was converted to approximately 30% supercoiled after 60 min (data not shown). Therefore, under the conditions used, the supercoiled DNA represents a significant percentage of the target DNA population. Thus, these assays detect "on average" less than one integration event per target plasmid molecule.
² E. Asante-Appiah and A. M. Skalka, unpublished results.

FIG. 1. PCR-based assay for detecting ASV IN-mediated joining events. Joining of model viral DNA substrates was detected by a PCR-based assay. ³²P-Labeled PCR products were fractionated on a 7% DNA sequencing gel. Each band represents a joining event into the target plasmid DNA (pBR711-2). The form of plasmid DNA, linear (L) or supercoiled (S), that was included in the joining reaction is indicated above each lane. Samples were removed from the integration reaction for PCR analysis at three time points as indicated. Joining events on the two strands of DNA (top and bottom) were detected with strand-specific primers designed to detect joining events in the region of the tandem lexA operators (see Fig. 2). Regions denoted OP indicate size of PCR fragments corresponding to joining events within the tandem lexA operator region. See text and "Experimental Procedures" for details.

The Preferred Joining Sites Map to a Potential DNA Cruciform Structure-A high resolution sequencing gel (see Fig. 5B) was used to map precisely the enhanced joining sites, and the results are summarized in Fig. 2. The enhanced region can be resolved into approximately seven contiguous joining sites on each strand. This region corresponds to an extensive inverted repeat (IR) sequence comprising the two tandem copies of the lexA operator with an intervening spacer, shown in boldface (Fig. 2). The enhanced joining sites on the two strands are offset symmetrically in the 5′-direction from the axis of symmetry of the IR. However, this assay cannot reveal whether the joining events on both strands are coordinated.
The enhanced target sequence (5′-TATACAGT-3′) is present twice in this region on each strand due to the tandem operators, but only the sequence 5′ to the axis of symmetry is preferred in the supercoiled plasmid substrate. These observations, along with the requirement for supercoiling, suggested a structural rather than sequence basis for the preferred joining sites. It has been observed previously that certain IR sequences are capable of forming cruciform structures under conditions of DNA supercoiling (23,29). The IR corresponding to the tandem lexA operators in pBR711-2 is predicted to contain perfectly base paired 18-base pair stems and 5-nucleotide loops. We asked whether the IR in pBR711-2 could form such a structure. S1 nuclease produces double-stranded breaks at cruciform loops, and these break points can be mapped using secondary cleavage with a restriction endonuclease (24). As shown in Fig. 3A, S1 nuclease treatment of the pBR711-2 plasmid resulted in a significant increase in the rate of linearization, as compared with another pBR322 derivative that lacked the extensive IR sequence (pBRN1). Linear pBR711-2 DNA produced by S1 nuclease was isolated and subjected to cleavage by PstI (Fig. 3B). Two discrete fragments, of approximately 3 and 1 kilobase pair, were observed. The sizes of these fragments indicated that treatment with S1 nuclease resulted in preferential linearization by cleavage within or near the lexA operator IR region (Fig. 3B). The pBR322 backbone of pBR711-2 is known to contain at least one other IR sequence that has the potential to form a cruciform structure (24). However, under the conditions used here, the introduced lexA operator IR was the major S1-sensitive site detected. If the pBR711-2 was first linearized with PstI and then treated with S1 nuclease, no significant cleavage within the IR region was observed (Fig. 3A).
These results indicate that a major S1-sensitive site is present within or near the IR under conditions of supercoiling. This is consistent with the presence of a cruciform structure.
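The logic of the S1 mapping experiment can be illustrated with a small calculation: a circular plasmid that is linearized at one position and then cut at a second, unique position yields two fragments whose sizes depend only on the distance between the two cuts. The sketch below is illustrative only. The function name is hypothetical, and the plasmid length and cut coordinates are round numbers chosen for the example, not measured values from this study.

```python
def fragments_after_two_cuts(plasmid_len: int, first_cut: int, second_cut: int):
    """Sizes (smaller, larger) of the two linear fragments produced when a
    circular plasmid is cut at two positions (coordinates in base pairs)."""
    for pos in (first_cut, second_cut):
        if not 0 <= pos < plasmid_len:
            raise ValueError("cut position lies outside the plasmid")
    arc = abs(second_cut - first_cut)
    return tuple(sorted((arc, plasmid_len - arc)))


if __name__ == "__main__":
    # Hypothetical coordinates: a 4,100-bp plasmid, S1 cleavage near
    # position 400, and a unique restriction site near position 3,400.
    small, large = fragments_after_two_cuts(4100, 400, 3400)
    print(f"expected fragments: about {large} bp and {small} bp")
```

Discrete bands are expected only when S1 cleavage occurs at a defined location; cleavage at random positions would instead give a smear of fragment sizes after the secondary digest.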
Chemical modification studies were carried out to confirm and fine-map the potential cruciform structure (23,25). Supercoiled and linear forms of pBR711-2 were treated with DEPC, a reagent that preferentially modifies unpaired adenosine residues. As shown in Fig. 4A, a major modification site in the supercoiled DNA corresponds to the single adenosine residue predicted to fall within the cruciform loop on the top strand (see Fig. 4B). This adenosine is nonreactive in the linear form of pBR711-2 (Fig. 4A). Preferential modifications of the three adenosine residues predicted for the bottom strand loop were also detected (see Fig. 4B; data not shown). It can be noted in Fig. 2 that the stems of the cruciform themselves contain IR elements that also have the potential to form hairpins. However, results of the modification studies suggest that these alternative structures are not favored under these conditions. As shown in Fig. 4B, the preferred integration sites map to the stem regions, adjacent to the loops. Most notably, the integration pattern is strand-specific with respect to the stem loop structure.

DNA Sequence Is Not a Major Determinant for the Preferred Joining Sites in the Potential Cruciform Stem-We next asked if the sequence of the cruciform stem loop was crucial for the observed enhancement. We note that the dinucleotide 5′-CA-3′ is embedded within the enhanced joining region near the loop, and this dinucleotide sequence is found near the termini of all retroviral and retrotransposon DNAs. Thus, it seemed possible that the stem loop might be recognized by IN through this appropriately positioned sequence element. To test this hypothesis, a new plasmid substrate was constructed, pBR880-1, in which each stem sequence was replaced by the complementary sequence (see Fig. 5C). The resulting inversion of each base pair in the stem produced a new stem sequence, but allowed a comparison of two stem-loops of equivalent stability. Results of the PCR-based joining assay are shown in Fig. 5 (only the bottom strand was analyzed). The enhanced joining observed with the supercoiled pBR880-1 target plasmid was quite similar to that with the original pBR711-2 target plasmid (Fig. 5A, compare lanes 4 and 5 with lanes 8 and 9). As expected, the pBR880-1 IR can apparently form a cruciform structure analogous to pBR711-2. More precise mapping of the integration sites in supercoiled pBR880-1 revealed that the pattern was similar to that observed with pBR711-2 (Fig. 5, B and C). Thus, the primary sequence of the stem is not a major determinant for the enhanced joining events. Rather, we conclude that the structure itself is an important feature.

FIG. 2. Sequence, and potential structural elements, at the preferred joining site. Top, sequence of extensive IR corresponding to tandem lexA operators showing fine mapping of enhanced joining sites in Fig. 1. The core sequence of each operator is underlined and the spacer between operators is shown in boldface type. Enhanced joining sites observed with ASV IN using supercoiled DNA are indicated by arrows. Numbers correspond to observed PCR fragment lengths from which the position of the target sites could be extrapolated (see Fig. 5B). Thicker arrows indicate increased intensities of the corresponding bands observed with the pBR711-2 substrate in Fig. 5B. Bottom, schematic representation of inverted repeats within the tandem lexA operator region. Dashed lines indicate dyad symmetry and borders of each lexA operator. Solid lines with arrows denote the long perfect IR created by the two tandem operators. The open box indicates a spacer between the tandem operators.

FIG. 3. Detection and mapping of S1 nuclease sensitive sites in supercoiled plasmid target DNA. Plasmid DNAs that were treated with S1 nuclease for the indicated times were fractionated on 1% agarose gels containing ethidium bromide. A, comparison of S1 sensitivities of supercoiled (SC) pBR711-2, pBRN1, and PstI-linearized (L) pBR711-2 plasmids. B, mapping of linear ends produced by S1 nuclease treatment of supercoiled pBR711-2. The production of the linear product indicates double strand breaks. The linear product from panel A was purified and digested with PstI. Arrows indicate the positions of the two discrete fragments that are produced.

FIG. 4. Detection of unpaired adenines in supercoiled plasmid by DEPC modification. A, supercoiled (S) and linear (L) pBR711-2 DNAs were treated with DEPC, and modified sites were detected as described (see "Experimental Procedures"). ³²P end-labeled products were fractionated on a 20% sequencing gel with G and G plus A chemical sequence ladders. A marker lane containing a 10-base pair ladder is indicated (M). B, predicted structures of DNA cruciforms based on DEPC modification studies. Adenines that are highly accessible to DEPC modification are indicated by filled circles. In the bottom strand, two of the adenines were less accessible (open circles); nonuniform modification of loop adenines has been noted previously. Enhanced integration sites (arrows) are superimposed on the structures. Thicker arrows indicate the end points of the more intense bands (see Fig. 5B).

FIG. 5. Only bottom strands were analyzed. A, the pBR711-2 substrate was compared with pBR880-1, which contains an inversion of stem sequences. Conditions were the same as described in Fig. 1. B, high resolution of integration sites using 7% sequencing gels. C, predicted stem-loop structure for pBR880-1. Enhanced integration sites (arrows) are superimposed on the structure. Thicker arrows indicate the end points of the more intense bands shown in panel B.
HIV-1 IN Displays a Similar Preference for a Potential DNA Cruciform Structure-We next asked whether HIV-1 IN displays a similar preference. For these experiments, we used a derivative of HIV-1 IN that shows increased solubility.² As in the case of ASV, we chose conditions where only a portion of the pBR711-2 supercoiled target DNA was utilized, such that the integration occurred primarily in a supercoiled target DNA rather than target DNA that had been relaxed by a previous single-end integration event. Because HIV-1 IN is generally less active than is ASV IN, these conditions included a longer incubation time (60 min) as well as use of the preferred metal co-factor, Mn²⁺. As shown in Fig. 6, HIV-1 IN, like ASV IN, exhibits a preference for the IR region in the supercoiled, but not the linear, form of the plasmid. (Only integration events on the top strand were measured in this experiment.) We conclude that the enhanced integration catalyzed by HIV-1 IN into the IR region in the supercoiled plasmid also reflects targeting to the extruded cruciform structure induced by DNA supercoiling. This preference may, therefore, be a general property of retroviral integrases and, perhaps, related recombinases as well.

DISCUSSION
It is well established that retroviral DNA integration shows little specificity with respect to host-cell target sites. On the other hand, target sites do not appear to be selected entirely at random. For example, several earlier studies have suggested that regions associated with active transcription are preferred in vivo (see Craigie (30) for a review). However, more recent analyses of ASLV DNA integration events in turkey embryo fibroblasts have demonstrated that many (perhaps all) regions of the host DNA are accessible (11). Highly preferred sites were also identified within accessible regions, but the underlying mechanisms for such preferences are not yet understood.
Local preferences are also observed in vitro with purified IN and naked DNA, and these results have been interpreted to mean that sequence-determined structural variations can influence IN activity (15,16,19). Other studies suggest that integration site selection may be mediated by specific cellular proteins. One cellular protein, denoted Ini-1, has been shown to interact with HIV-1 IN (12). As this protein is a component of a transcriptional activator complex, it has been proposed that the HIV-1 pre-integration complex may be recruited to specific sites in the host target DNA through protein-protein interactions (12). The yeast Ty3 retrotransposon, which expresses a retroviral-like integrase, inserts almost exclusively into the start sites of tRNA genes (31). Recent studies suggest that this specificity may involve a protein-protein recruitment mechanism via the transcription factors TFIIIB and TFIIIC (13). It is also possible that this transcriptional complex may alter DNA structure in a manner that promotes integration (13,32).
Several studies have addressed the effects of protein binding and target DNA structure on IN activity or retroviral integration (14, 16-20). One general conclusion to be derived from these studies is that certain proteins are capable of bending or distorting DNA in a manner that promotes selective integration into the exposed face of the DNA helix. For example, wrapping of DNA around nucleosomes promotes a pattern of integration sites with a 10-base pair periodicity in vitro (17,20). Naked, bent DNA promotes a similar pattern in vitro, suggesting that DNA structure is a major determinant of this periodicity (16). However, the underlying mechanism or biological significance of these observations is unknown.
Here we describe a local target preference in vitro that is clearly based on DNA structure. It is well established that, under conditions of negative supercoiling, certain IR sequences have the potential to form cruciform structures (29). We provide strong evidence for the existence of such a structure in our model target DNA using two standard methods, S1 nuclease sensitivity and DEPC modification. The preferred sites for joining are observed under conditions of DNA supercoiling and correspond to the stem region adjacent to the loops. Within each stem loop of the cruciform, the joining pattern is asymmetric. A polarity of site selection was observed on each strand of the cruciform, whereby only sites corresponding to 3′-ends (sites on the 5′-side of the stem) were preferred.
The structure of the preferred joining site that we describe here appears to be distinct from those previously reported. The joining is sequence-independent and does not correlate with an obvious DNA bend. For example, one study has reported that the stem of a DNA hairpin adopts a B-form structure with no obvious distortion (33). What features of the stem loop might be responsible for the enhanced joining activity, and how might this be relevant to integration in vivo? Clues may be found by comparing the processing and joining reactions. IN uses a single active site and a similar chemical mechanism for both the processing of viral DNA ends and the joining of these ends to target DNA (7-9). In the processing reaction, IN nicks the viral DNA on the 3′-side of the highly conserved CA dinucleotide, which is usually located two nucleotides from the 3′-terminus of both viral strands. If additional duplex sequences are added beyond the terminus, processing is severely inhibited (10,21,34). The location of the viral processing site relative to the termini may reflect the requirement for partial unwinding of the viral DNA ends during the processing reaction (10,21). In this respect, a contiguous DNA duplex, the potential target for the joining reaction, is structurally dissimilar from viral DNA ends. As IN uses a common mechanism for processing and joining, we speculate that the integration machinery may exploit target sites that have an end-like character.
It is difficult to exclude certain biases that might be imposed by the assay used in this study. However, our results seem consistent with other observations regarding DNA structure and in vitro target site selection. For example, using ASV IN we generally observe that with short, linear duplex target DNAs, there is a graded preference for joining close to 3′-termini, as reported by others previously (35). The preferred target sites in the stem loop fall within a region near the loop and this may reflect a structural resemblance to DNA termini. The observation that there is a polarity of site selection may be relevant to such a model; the "3′-end" of the top strand of the stem is selected. This is analogous to the polarity of recognition of viral ends in the processing reaction. Therefore, we suggest that the stem loops may provide unique end-like structures within the supercoiled plasmid, and that the loop may provide the potential for end-like unpairing. It has previously been suggested that DNA distortion might also cause local unpairing, and this may be the basis for the observed preferences for integration into bent DNA (16,18,19). Chemical modification studies have indicated that ASV IN may promote unpairing of DNA termini.³ However, unpairing of internal target sequences poses a more significant energetic barrier, and the preference that we observe may reflect an exploitation of partially unpaired target DNA.
The relevance of our observations to in vivo integration remains to be investigated. As noted above, we suggest that the critical structural feature of the preferred target site described here is the junctions between paired and unpaired DNA that occur at the tips of the stem loops. Thus, we speculate that retroviral integration machinery may exploit partially unpaired cellular DNA sequences. Our results also provide further support for some common elements in recognition and utilization by IN of two apparently diverse DNA substrates, the viral termini and internal cellular target sequences. These results are consistent with the idea that IN uses the same active site for both processing and joining. Although we speculate that the stem loops provide important determinants for IN activity, we cannot exclude a role for the cruciform four-way junction. For example, it is possible that the four-way junction provides a favored binding site for IN, and the target sites are selected through positioning. There is evidence that cruciform structures exist in vivo and they may participate in a wide range of processes, including the initiation of DNA replication, repair, and transcription (29). We note that IN belongs to a structural superfamily that includes the E. coli RuvC protein (36), an enzyme that recognizes four-way DNA junctions.
In general, very little is known about how retroviral IN recognizes DNA. However, such knowledge is critical to the design of specific inhibitors that may be used to treat HIV infection. Here we identify a strong DNA structure-based preference for IN joining activity. This preference may be based on the resemblance to a structural intermediate in the joining reaction that involves unpaired DNA. Our results may also be relevant to understanding substrate preferences of certain cellular recombinases. It has been shown that the RAG-1/RAG-2 recombinase system (which mediates immunoglobulin and T cell receptor gene rearrangements) shares mechanistic features with the retroviral IN reaction (37). A role for DNA unpairing has also been proposed for RAG-1/RAG-2-mediated recombination from independent lines of evidence (38). Thus, our results are likely to reflect fundamental features of both types of reactions.