Changes in the Mechanism of DNA Integration in Vitro Induced by Base Substitutions in the HIV-1 U5 and U3 Terminal Sequences*

We have reconstituted concerted human immunodeficiency virus type 1 (HIV-1) integration with specially designed mini-donor DNA, a supercoiled plasmid acceptor, purified bacterial-derived HIV-1 integrase (IN), and host HMG-I(Y) protein (Hindmarsh, P., Ridky, T., Reeves, R., Andrake, M., Skalka, A. M., and Leis, J. (1999) J. Virol. 73, 2994–3003). Integration in this system is dependent upon the mini donor DNA having IN recognition sequences at both ends and the reaction products have all of the features associated with integration of viral DNA in vivo. Using this system, we explored the relationship between the HIV-1 U3 and U5 IN recognition sequences by analyzing substrates that contain either two U3 or two U5 terminal sequences. Both substrates caused severe defects to integration but with different effects on the mechanism indicating that the U3 and the U5 sequences are both required for concerted DNA integration. We have also used the reconstituted system to compare the mechanism of integration catalyzed by HIV-1 to that of avian sarcoma virus by analyzing the effect of defined mutations introduced into U3 or U5 ends of the respective wild type DNA substrates. Despite sequence differences between avian sarcoma virus and HIV-1 IN and their recognition sequences, the consequences of analogous base pair substitutions at the same relative positions of the respective IN recognition sequences were very similar. This highlights the common mechanism of integration shared by these two different viruses.

Integration of retroviral DNA is an obligatory step in viral replication. Integration is catalyzed by the viral encoded enzyme, integrase (IN). The properties of concerted DNA integration have been reconstituted in vitro using purified HIV-1 (3,4) or ASV (3,5–11) IN and MgCl2. The donor DNAs contain only 20 base pairs for HIV-1 or 15 base pairs for ASV derived from the ends of the LTRs, respectively. These viral DNA end sequences correspond to the nearly perfect inverted repeats that define the relationships between the U3 and U5 RNA ends. The inverted repeat for RSV is 12 of 15, while that for HIV-1 is 12 of 20. A comparison of the RSV and HIV-1 IN recognition sequences indicates that they are unique. The only common feature is the presence of a conserved CA dinucleotide at positions 3 and 4 from the terminus. Short oligodeoxynucleotide duplexes representing the ends of HIV-1 U5 LTR are more efficient substrates for IN processing (12–15) and strand transfer (15) reactions in vitro than those corresponding to the U3 LTR. In the case of ASV IN, the U3 LTR end is preferred over the U5 LTR end (7,9,16). It has also been observed that mutations at the viral U3 LTR end have different effects than those at the U5 end. U3 mutations in ASV reduce the integration rate to a greater extent than comparable mutations in U5 (5,6). Also, regions critical for integration are very close to the ends of the LTRs, adjacent to and including the conserved CA dinucleotide (5,6,8,9,12–15,17–20). Alteration of these sequences results in retroviral strains deficient in integration. Assays that utilize oligodeoxynucleotide duplexes that represent either U3 or U5 HIV-1 LTR ends have also demonstrated the importance of positions 3-6 to the efficiency of the processing reaction (14,19).
The U5 3AC4, U5 5GA6, U5 4CGAT7, and U3 for U5 oligodeoxyribonucleotides were used to prepare HIV-1 donor-concerted DNA integration substrates with mutations in the U5 terminus sequence. In each case, the sequence refers to the 3′-cleaved strand of the U5 LTR IN recognition sequence. The U3 5CT6, 3AC4, 4CCTT7, and U5 for U3 oligodeoxynucleotides were used to prepare comparable donor DNAs with mutations in the U3 IN recognition sequence. The U5seq and U3seq oligodeoxyribonucleotides were used as sequencing primers. The U3seq primer is complementary to plasmid vx nucleotides 180-151, and the U5seq primer is complementary to plasmid vx nucleotides 312-341.
Bacterial Strains and Growth Conditions-Escherichia coli DH5α (Invitrogen) and MC1061/P3 (Invitrogen) strains were used for these studies. MC1061/P3 is a derivative of MC1061 containing the male episome, P3, which can be selected for the presence of an encoded Kanr gene. In addition, P3 possesses amp (Am) and tet (fAm) genes, the expression of which can be rescued by the supF amber suppressor tRNA. Under these conditions, MC1061/P3 can be selected for ampicillin, tetracycline, and kanamycin resistance.
Plasmid Constructions and Preparations-Plasmid pHHIV2 was used in this study as a template to amplify donor DNA and is a variation of pBCSK+ in which a wild type HIV-1 donor DNA PCR product was inserted into pBCSK+ catalyzed by IN, resulting in the loss of two base pairs from the LTR ends. This plasmid was propagated in E. coli MC1061/P3 under the conditions described above. The integration acceptor was plasmid pBCSK+ (Stratagene, La Jolla, CA), which was propagated in E. coli DH5α. Plasmids were purified with Qiaprep columns (Qiagen, Chatsworth, CA) according to the manufacturer's instructions. The growth of DH5α containing pBCSK+ was selected for by addition of chloramphenicol (35 μg/ml).
Preparation of Donor DNAs-Integration donors were amplified by using thermostable Vent DNA polymerase and the primers listed above. Twenty-five pmol of each primer and 50 ng of pHHIV2 DNA, as the template, were used for each PCR reaction. Vent DNA polymerase was used according to the manufacturer's instructions. A total of 20 rounds of amplification were performed in each reaction. The amplification conditions were 94°C for 2 min, 50°C for 1 min, and 72°C for 1 min for three rounds. This was followed by amplification conditions that used 94°C for 2 min, 57°C for 1 min, and 72°C for 45 s for 17 additional rounds. The resultant product donor DNA was isolated after electrophoresis through 2% agarose gels equilibrated with 0.5× Tris borate-EDTA (3). The purified DNA (600 ng) was recovered using a QIAquick gel extraction kit (Qiagen). The integration donors were ~300 base pairs in length and were internally labeled during the PCR by the inclusion of [α-32P]dCTP (3000 Ci/mmol, 10 mCi/ml). The final concentrations of deoxyribonucleoside triphosphates during amplification reactions were 0.25 mM each of unlabeled dATP, dGTP, and dTTP. The final dCTP concentration was 0.0502 mM (12 Ci/mmol, 0.6 mCi/ml).
Standard Integration Reaction Conditions-The concerted integration reaction conditions were similar to those described by Hindmarsh et al. (3). Briefly, 15 ng (0.15 pmol of ends) of donor DNA was mixed with 50 ng of acceptor DNA (0.02 pmol) and 80 ng of HIV-1 IN (1.25 pmol) in an 8.5-μl preincubation reaction mixture containing, at final concentrations, 25 mM MOPS, pH 7.2, 23 mM NaCl, 10 mM dithiothreitol, 5% polyethylene glycol 8000, 10% dimethyl sulfoxide, 0.05% Nonidet P-40, 1% glycerol, 1.6 mM HEPES, pH 8.0, and 3.3 mM EDTA. The IN was diluted in a buffer containing 30% glycerol, 0.5 M NaCl, 50 mM HEPES, pH 8.0, 1 mM dithiothreitol, and 0.1 mM EDTA. Where specified, 100 ng of HMG-I(Y) was added to the reaction mixtures. The preincubation reaction mixtures were placed on ice overnight. The volume of each preincubation mixture was then increased to 10 μl with the addition of MgCl2 to a final concentration of 7.5 mM, and the integration assay mixture was incubated at 37°C for 2 h. The reactions were stopped by increasing the volume to 150 μl by the addition of EDTA (final concentration of 4.25 mM), sodium dodecyl sulfate (final concentration of 0.44%), and proteinase K (final concentration of 0.06 mg/ml). After digestion for 60 min at 37°C, the reaction mixtures were extracted with phenol followed by phenol-chloroform-isoamyl alcohol (25:24:1 mixture). Fifteen μl of 3 M sodium acetate, pH 5.2, was added along with 1 μl of glycogen (10 mg/ml stock solution). The reaction products were precipitated by the addition of 450 μl of 100% ethanol and washed twice with 70% ethanol prior to electrophoresis and autoradiography. The reaction products were separated on a 1% agarose gel run in 0.5× Tris borate, EDTA, and ethidium bromide at 10 V/cm for 2 h. Following electrophoresis, gels were submerged in 5% trichloroacetic acid for 20 min or until the bromphenol blue dye turned bright yellow.
After being washed with water, the gels were dried on DE-81 paper (Whatman) in a Bio-Rad slab gel dryer at 80°C for ~2 h under a vacuum. Quantitation of reaction products was carried out using a phosphorimaging device and ImageQuant 5.0 software. Experiments with wild-type donor integrants always accompanied experiments with mutant donor integrants as controls. All experiments were repeated at least two times.
Cloning and Sequencing of Integrants-In all experiments, integration products were used directly for transformation of bacteria. The integration products were introduced into E. coli MC1061/P3 by electroporation, using a Bio-Rad electroporator with 0.1-cm electroporation cuvettes, 1.8-kV voltage, 25-μF capacitance, and 200-ohm resistance. The P3 episome is maintained at a low copy number. Therefore, only 40 μg/ml ampicillin, 15 μg/ml kanamycin, or 10 μg/ml tetracycline were required for selection. Under these conditions, we detected no colonies after supF selection when the donor, acceptor, or donor and acceptor were electroporated into cells in the absence of IN. Plasmid DNAs were recovered from individual clones, and integration junctions were sequenced by using primers U3seq (for sequencing the U3 junction) and U5seq (for sequencing the U5 junction). Sequencing was performed using the Thermo-Sequenase kit (U.S. Biochemical, Cleveland, Ohio).
Statistical Analysis-We used the chi-square test to examine the statistical significance of the difference between numbers of non-concerted events for different integration reactions. A binomial probability was used to determine the significance of integration events into the same site in the target DNA. Since the total number of sequenced concerted integrants was 203 and the target plasmid length is 3400 base pairs, the formula used for calculations was the following: $p = \frac{203!}{x!\,(203-x)!}\left(\frac{1}{3400}\right)^{x}\left(1-\frac{1}{3400}\right)^{203-x}$, where x is the number of integration events into the same site. For calculation of the probability of integration into a region we divided 3400 by the number of base pairs in the region.
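The binomial calculation described above can be sketched in a few lines. This is an illustrative reading of the method, not the authors' own code: the function names are hypothetical, and the per-site probability of 1/3400 assumes every base pair of the target plasmid is equally likely to be used.

```python
from math import comb

N_INTEGRANTS = 203  # sequenced concerted integrants (from the text)
PLASMID_BP = 3400   # target plasmid length in base pairs (from the text)


def same_site_probability(x, n=N_INTEGRANTS, sites=PLASMID_BP):
    """Binomial probability that exactly x of n integrants use one given site,
    assuming all sites are equally likely."""
    p_site = 1.0 / sites
    return comb(n, x) * p_site**x * (1.0 - p_site) ** (n - x)


def region_probability(x, region_bp, n=N_INTEGRANTS, plasmid_bp=PLASMID_BP):
    """Same calculation for a region: as described in the text, the number of
    'sites' becomes the plasmid length divided by the region length."""
    return same_site_probability(x, n=n, sites=plasmid_bp / region_bp)


if __name__ == "__main__":
    for hits in range(1, 6):
        print(hits, same_site_probability(hits))
```

A small probability for the observed number of hits at one position indicates that the site is used more often than uniform targeting would predict.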

RESULTS
The HIV-1 U5 Is the "Dominant" IN Recognition Sequence-In the present study, we have used a reconstituted HIV-1 concerted DNA integration system that employs a specially designed mini donor DNA with HIV-1 U3 and U5 IN recognition sequences flanking a supF transcription unit (Fig.  1A), a supercoiled plasmid acceptor, purified bacterial-derived HIV-1 IN, and host HMG-I(Y) protein. The DNA integration products resulting from in vitro reactions have the characteristics associated with viral DNA integrated in vivo (3). When the products are analyzed by agarose gel electrophoresis, integration is detected by the insertion of the small donor into the larger acceptor DNA (see Fig. 1B). As much as 15% of the wild type donor DNA was converted to RFII products so that the molar amount of integrated donor DNA is half that of the target DNA in the reaction. The integration intermediates that are present in the different bands are diagrammatically represented to the right of the gel. RFIII products form via a nucleophilic attack of two one-ended integration events into the same site on the target plasmid. Thereby the amount of RFIII product is dependent on the donor DNA efficiency as an integrase substrate. If the U5 end, which is more active in one-ended events, is changed to be a less efficient substrate for integrase, the amount of RFIII product would decrease parallel to that of RFII product. Among the RFII-like products are integration intermediates that represent one-ended and two-ended donor insertions into the acceptor. The presence of one-ended versus two-ended donor insertions can be distinguished when the products of a reaction are introduced into bacteria (see Fig. 1C). One-ended donor integration products are not maintained and are thereby lost. Two-ended DNA integration products can be recovered from individual colonies and sequenced to establish the junctions between donor and target DNA. 
Examination of these junction sequences distinguishes whether the two-ended insertion products were derived by a concerted or a non-concerted mechanism. For instance, for a wild type donor, ~93% of the two-ended insertion events have characteristics associated with a concerted DNA integration mechanism (Table I). This includes the loss of two base pairs from the ends of the LTRs, wide distribution of insertion sites of the donor into acceptor DNAs, and five base pair duplications introduced at the site of insertion. Only a small percentage of the two-ended integrants occurred by a non-concerted mechanism detected by deletions rather than five base pair duplications introduced into the acceptor (Table I).
We reexamined the effect of removing one of the two IN recognition sequences from the ends of the HIV-1 donor DNA. When the U5 IN recognition sequence was replaced by random sequence, the RFII-like product detected by gel electrophoresis decreased 89% compared with wild type (Fig. 2A, lanes 1 and 2). When the U3 IN recognition sequence was replaced by random sequence, the RFII-like product decreased only 10% compared with wild type (Fig. 3A, lane 3). As a control, when both IN recognition sequences were deleted, no integration into the acceptor DNA was detected on the gels (Fig. 3A, lane 2). When the products from the ΔU3 or ΔU5 reaction were introduced into bacteria, no colonies above the background level were obtained, indicating that the removal of either of these IN recognition sequences resulted in the loss of two-ended DNA integration products (data not shown). The finding that there was very little decrease in the RFII-like products observed with the ΔU3 donor DNA indicates that the U5 IN recognition sequence efficiently promotes one-ended insertion events in the absence of a U3 IN recognition sequence.
Effect of an HIV-1 mini donor DNA is presented in Fig. 2. We also analyzed the effect of substitution of the conserved CA dinucleotide at positions 3-4 (TAGCAGT → TAGGGGT). Integration reactions were carried out as described under "Experimental Procedures", and products were analyzed first by agarose gel electrophoresis. Base pair substitutions introduced at positions 5-6 into the U5 LTR IN recognition sequence caused more than a 2-fold decrease of integration efficiency compared with wild type (Fig. 2A, lane 3). Substitutions at positions 4-7 and 3-4 resulted in more dramatic decreases of integration efficiency (Fig. 2A, lanes 4 and 5).

TABLE I
Sites of DNA integration of a wild type donor DNA into an acceptor
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. The sequence for only the 3′-cleaved strand of the duplex for U5 and U3 is shown. Therefore the complementary strands of the duplex are presented. Lowercase letters denote duplication of the cell DNA; uppercase letters indicate the processed viral DNA sequences, which have lost two base pairs from each end unless otherwise indicated. Shaded entries are derived from a separate experiment from the non-shaded entries. In a third experiment, the ratio of concerted to non-concerted integrants was 14:1.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.

FIG. 1. Reconstitution of HIV-1 IN-dependent concerted DNA integration with wild type donor DNA. A, diagrammatic representation of a mini HIV-1 donor DNA substrate of 310 base pairs in length. The 20 base pairs shown are from the U3 and U5 HIV-1 LTR termini. The bolded deoxynucleotides denote the highly conserved CA dinucleotide found near the ends of the LTR sequences. The solid box represents an expression cassette for the supF suppressor tRNA. B, diagrammatic representation and gel electrophoresis migration positions of products from integration reactions reconstituted with purified IN, HMG-I(Y), mini donor, and acceptor DNA as described in "Experimental Procedures." Possible products include those that result from concerted DNA integration (majority product) (a), from non-concerted integration by two (b) or one (c) donor DNA(s) via one-ended insertion events, or (d) by one donor DNA via two-ended insertion events. C, assay for biological selection of integrants.
Biological selection of integration products after their introduction in bacteria showed that when substitutions were made at positions 5-6 the number of two-ended integrants (judged by the number of recovered colonies) decreased to 40% that of a wild type donor (Fig. 2B). The number of selected double-ended integrants that resulted from reactions with donors containing substitutions at positions 4-7 and 3-4 was 5.7 and 3%, respectively, that of wild type (Fig. 2B). These decreases followed the percentage loss of integrants detected by the gel electrophoresis analysis (Fig. 2B). We further analyzed the products recovered from bacteria by sequencing individual integrants. In comparison to wild type, a donor with base pair inversions at positions 5-6 caused a 3-fold increase in non-concerted integration events, which introduced deletions rather than small duplications into the acceptor DNA (Table II). This increase is statistically significant (p = 0.04). Interestingly, the larger base pair inversions at positions 4-7, which include positions 5-6, resembled wild type in that most integrants arose by a concerted mechanism (Fig. 2C, Table III).

FIG. 2. ... Integration efficiency of wild type mini donor DNA was set as 100%. The data shown are an average of three independent experiments; the standard deviation between experiments was 0.25-2%. C, percent of integrants derived from Tables II-IV formed by a concerted mechanism involving two ends of the same donor DNA.

FIG. 3. ... 6, 9). B, quantification of RFII products shown in A (closed bars) and total number of colonies containing two-ended integrants (open bars) as described in the legend to Fig. 2. The data shown are an average of two independent experiments; the standard deviation between experiments was 1-2%. C, percent of integrants derived from Tables V-VII formed by a concerted mechanism involving two ends of the same donor DNA.
Base pair substitutions introduced at the conserved CA dinucleotide had a different effect on the mechanism of integration. As noted above, there was a substantial decrease in the integration efficiency of the modified donor (Fig. 2B). Among the recovered integrants were those that partly altered the processing mechanism. For instance, in five clones sequenced, one rather than two deoxyribonucleotides were excised from the LTR ends (Table IV). As with a wild type donor, integration sites were distributed across the acceptor DNA, and small duplications were introduced at the site of insertions. However, the sizes of the base pair duplications were more heterogeneous than seen with a wild type donor. Six and four base pair duplications were observed among the integrants instead of the almost exclusive five base pair duplications observed with a wild type donor (Table IV). In two cases, integrants had the original mutations deleted such that IN used the first internal CA dinucleotide for the nucleophilic attack reaction. This was also observed with one integrant where the donor contained a base pair inversion at U5 positions 5-6 (Table II).
Effect of Base Pair Inversions at Positions 3-7 in the U3 IN Recognition Sequence on Integration-We introduced base pair inversions at the same positions as analyzed in U5 but into the U3 IN recognition sequence, and integration was reconstituted as described under "Experimental Procedures." The effect on integration as analyzed by gel electrophoresis is shown in Fig. 3A. Most of the mutations in U3 had little effect on the efficiency of the donor insertion into the acceptor DNA. However, when integration products from these reactions were introduced into bacteria, the number of recovered colonies greatly diminished (Fig. 3B). This suggested that changes introduced into the U3 LTR IN recognition sequence lead to an increase in one-ended integration events, presumably through integration with the wild type U5 IN recognition sequence. In the case of the U3, position 5 and 6 modified donor, the recovery of colonies was only 30% of that obtained using a wild type donor. Moreover, when the two-ended insertion integrants were sequenced (Table V), more than half arose by a non-concerted mechanism, which introduced deletions into the acceptor DNA (Fig. 3C). This difference from the data with a wild type donor was statistically significant (p = 1.9 × 10⁻⁵). Furthermore, the increase in non-concerted integration events was statistically significant not only compared with wild type data but also compared with data obtained with a U5 5GA6 donor (p = 0.01).
Substitution of positions 4-7 resulted in more than a 5-fold decrease of biologically recoverable integrants, while a mutation to the conserved CA dinucleotide reduced this number even further (Fig. 3B). However, in contrast to the U3 5-6 mutant, the majority of integrants recovered from reactions with the U3, 4-7 and 3-4 modified donor DNAs were formed by a concerted mechanism comparable with integration of a wild type donor (Fig. 3C, Tables VI and VII). Among the integrants obtained using the U3, position 3-4 modified donor, were several in which one rather than two base pairs were excised from the LTR end (Table VII). This is similar to what was observed when the conserved CA dinucleotide was changed in U5. To further examine efficiency of integration of mutated U3 IN recognition sequences in the absence of a wild type U5 IN recognition sequence, we deleted the U5 LTR end from donor molecules containing the mutated U3 IN recognition sequences. As shown in Fig. 3A (lane 7), substitutions at positions 5-6 in the U3 IN recognition sequence resulted in a donor DNA that integrated four times less efficiently than wild type. This suggests that the one-ended integration events seen in Fig. 3A, lane 4 occurred by either the U3 or U5 end of the donor molecule. However, U3 LTR ends with base pair substitutions at positions 4-7 or 3-4 were poor substrates for IN (Fig. 3A, lanes 8 and 9). This suggests that the one-ended insertion events seen in Fig. 3A (lanes 5 and 6) were due to insertions using the wild type U5 rather than the mutated U3 IN recognition sequence.

TABLE II
Sites of DNA integration of a HIV-1 donor DNA with base pair substitutions at U5-5GA6 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I. Shaded entries are from a separate experiment from the non-shaded entries. In a third experiment, the ratio of concerted to non-concerted integrants was 9:3.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.
d Denotes a 19-base pair deletion in the donor DNA in the U5 LTR end that uses the first internal CA dinucleotide.

TABLE III
Sites of DNA integration of a HIV-1 donor DNA with base pair substitutions at U5-4CGAT7 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.

Mechanism of Integration Using a Donor DNA Containing U5 LTR IN Recognition Sequence at Both Ends-To further analyze the role of the U5 LTR end in promoting concerted DNA integration, we substituted the U3 LTR IN recognition sequence in the wild type donor DNA with a wild type U5 LTR IN recognition sequence. The resulting donor contained U5 LTR IN recognition sequences at both ends and is referred to as the U5-U5 donor. Gel electrophoresis of products obtained from a reaction with U5-U5 donor revealed that the total number of RFII-like products was almost one and a half times greater than that obtained with a wild type donor (Fig. 4A, lanes 1 and 2). Seventy-two percent of these integrants were biologically selected during the antibiotic screening procedure (Fig. 4B). Thus, 28% of the U5-U5 donor integrants represent one-ended insertion events. However, of the two-ended integrants sequenced, only 23% arose by a concerted DNA integration mechanism (Fig. 4C, Table VIII). The majority of integrants showed deletions rather than small base pair duplications in the acceptor DNA and therefore arose via a non-concerted mechanism (Table VIII). This result is statistically significant (p = 1.2 × 10⁻⁸). Thus, overall only 16.5% of the total integration events using a U5-U5 donor DNA occurred by a concerted integration mechanism characteristic of a wild type donor DNA.

Mechanism of Integration Using a Donor DNA Containing U3 LTR IN Recognition Sequences at Both Ends-In the complementary experiment, we analyzed a donor DNA that contained two wild type U3 IN recognition sequences. This substrate is referred to as a U3-U3 donor. As judged by gel analysis (Fig. 4A, lane 3), the formation of the RFII-like integration products was 23% of that formed with a wild type donor DNA (Fig. 4B). The biological selection yielded only 5% the number of colonies obtained with a wild type donor. Thus, as many as 78% of all integrants with this donor arose by a one-ended insertion event. For integrants that were recovered, sequence analysis demonstrated that 60% had characteristics of a wild type two-ended concerted DNA integration event (Fig. 4C, Table IX). The remainder contained deletions rather than base pair duplications at the site of insertion into the acceptor. The difference in the number of non-concerted events between integration reactions with U3-U3 donor DNA and wild type donor as well as U5-U5 donor was statistically significant, as probabilities calculated by the chi-square test were p = 0.00187 and p = 0.0016, respectively. Overall for a U3-U3 donor, only 13% of integrants arose by a concerted DNA integration mechanism. Thus, donor DNAs that contain either two copies of wild type U3 or two copies of wild type U5 IN recognition sequences have lost the ability to efficiently undergo a concerted DNA integration reaction.

TABLE IV
Sites of DNA integration of a HIV-1 donor DNA with base pair substitutions at U5-3AC4 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I. Bolded G indicates the removal of one rather than two base pairs from ends of the LTR.
b Denotes a 19-base pair deletion in the donor DNA in the U5 LTR end that uses the first internal CA dinucleotide.
c Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
d Denotes deletion introduced into the acceptor DNA.
e Denotes that g (in bold) adjacent to viral CA nucleotide in U3 may be derived from either donor or acceptor DNA.

TABLE V
Sites of DNA integration of a HIV-1 donor DNA with base pair substitutions at U3-5CT6 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I. In a third experiment the ratio of concerted to non-concerted integrants was 3:5.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.
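The overall concerted fractions quoted for the U5-U5 and U3-U3 donors follow from multiplying the fraction of integrants that are two-ended by the fraction of two-ended integrants that are concerted. The sketch below reproduces that arithmetic from the percentages in the text. The function name is hypothetical, and for the U3-U3 donor the two-ended fraction is taken, as an assumed reading of the text, to be the colony yield (5% of wild type) divided by the RFII-like gel yield (23% of wild type).

```python
def overall_concerted(two_ended_fraction, concerted_among_two_ended):
    """Fraction of all integration events that are two-ended AND concerted."""
    return two_ended_fraction * concerted_among_two_ended


# U5-U5 donor: 72% of integrants biologically selected (two-ended),
# 23% of the sequenced two-ended integrants concerted.
u5u5_overall = overall_concerted(0.72, 0.23)

# U3-U3 donor: colonies at 5% of wild type, RFII-like products at 23% of wild type
# (assumed interpretation: their ratio is the two-ended fraction).
u3u3_two_ended = 0.05 / 0.23
u3u3_one_ended = 1.0 - u3u3_two_ended
u3u3_overall = overall_concerted(u3u3_two_ended, 0.60)

print(f"U5-U5 overall concerted: {u5u5_overall:.1%}")
print(f"U3-U3 one-ended:         {u3u3_one_ended:.0%}")
print(f"U3-U3 overall concerted: {u3u3_overall:.0%}")
```

Under this reading the numbers agree with the approximately 16.5%, 78%, and 13% reported in the text.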
Location of Integrants in the Acceptor Plasmid-For all donor DNAs analyzed, the integration into the acceptor DNA was distributed with large numbers of integration sites clustered between plasmid positions 300-1200. However, as previously reported (22), there were "hot spots" for integration. In several instances we detected integration at the same site to the base (in either orientation) (Tables I-VIII). The most frequently used integration sites were found around plasmid positions 354 and 1051 (Table X). Interestingly, one of the preferred sites for integration, plasmid positions 349-355, has the same sequence as positions 4-10 of the U3 LTR terminus of viral DNA. This integration into the same target site was statistically significant; binomial probabilities of these events were obtained as described under "Experimental Procedures" and are shown in Table XI.

DISCUSSION
The HIV-1 reconstituted system was used to examine the roles of the U3 and the U5 IN recognition sequences in catalyzing concerted DNA integration. In this system, we have found that the HIV-1 U5 IN recognition sequence is almost nine times more efficient a substrate for HIV-1 IN than the U3 IN recognition sequence. This is shown by the analysis of integration of donor DNAs that contain only a wild type U5 and no U3 IN recognition sequence, which produced almost wild type levels of RFII-like products as detected on gels. In contrast, a donor DNA that contained a wild type U3 and no U5 IN recognition sequence showed significant decreases in the efficiency of the formation of RFII-like products compared with wild type. This confirms oligodeoxynucleotide substrate-derived data that suggested that the HIV-1 U5 IN recognition sequence was the catalytically more active or the "dominant" LTR end (12–15). The difference in activity associated with the two ends is related in part to the deoxyribonucleotides at positions 5 and 6 adjacent to the conserved CA dinucleotide.
In the HIV-1 U5, G and A are at positions 5 and 6, respectively. In the U3 LTR end these positions contain a C and a T, respectively. When C and T are substituted for the G and A in the U5 LTR termini in the above donor substrates, the activity of this IN recognition sequence significantly decreases (data not shown). In the converse experiment, when G and A are substituted into the U3 LTR termini, the activity of this IN recognition sequence increases. This is consistent with previous findings that HIV-1 IN selected G and A at positions 5 and 6 from a pool of oligodeoxyribonucleotide substrates where these positions were randomized (19,24). In addition, the processing reaction using oligodeoxyribonucleotide substrates was more efficient when G was at position 5 and A at position 6 (14,19).
substitutions at U3-4CCTT7 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.

substitutions at U5-3AC4 positions
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I. Bolded G denotes removal of one rather than two base pairs from the LTR ends.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.

To examine the roles of the U5 and the U3 IN recognition sequences in the concerted DNA integration reaction, donor DNAs were prepared in which mutations were introduced into one of the LTR IN recognition sequences, leaving the second as wild type, and integration was reconstituted in vitro. The mutations examined included base pair inversions at positions 5-6, at positions 4-7, or at positions 3-4. These mutations were chosen since comparable base pair inversions were previously analyzed in Rous sarcoma virus, both in a reconstituted integration system (3) and in vivo (25). It was also shown using HIV-1 oligodeoxyribonucleotide substrates that single base changes at positions 3-6 affect the efficiency of the processing reaction (14,19). With donor DNAs that contained a wild type U5 and a mutated U3 IN recognition sequence, there were small reductions in the amount of RFII-like products detected on gels. However, in most instances only one end of the U3 mutant donor was inserted into the acceptor plasmid. This result supports the notion that a wild type U5 in a donor DNA containing mutated (or deleted) U3 IN recognition sequences promotes non-concerted one-ended donor DNA insertions into an acceptor. A sequence analysis of two-ended integrants recovered from reactions with donor DNAs containing mutations at HIV-1 U3 positions 5 and 6 indicated that 58% were derived from a non-concerted DNA integration mechanism (Table IV). This change in mechanism was not observed in the sequence analysis of integration products recovered from reactions with other mutant U3 donor DNAs (containing base pair inversions at positions 4-7 and 3-4). While surprising, these data are consistent with the in vivo and in vitro analysis of comparable mutants in the RSV system (5,6,25).
Among the sequenced products derived from reactions using donors with mutations at the conserved CA dinucleotide (positions 3 and 4) were some that showed changes in the nucleophilic attack mechanism where one rather than two base pairs were excised from the ends of the IN recognition sequences (Table VII). This phenomenon was not observed with wild type or the other HIV-1 U3-mutated donors including the position 4-7 mutant, which

recognition sequences at both ends
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.

With donor DNAs that contained a wild type U3 and comparable mutated U5 IN recognition sequences, there was a marked reduction in RFII-like products detected on gels. The most severe defect was found with donors containing mutated position 4, the C of the conserved CA dinucleotide. When products were analyzed in bacteria, there was a correspondence between the percentage decrease in RFII-like products detected on gels and the number of colonies compared with wild type. This indicates that there were very few one-ended insertion events and that mutations in U5 had little effect on the mechanism of integration of these donors beyond the overall decrease in their integration efficiency. This is consistent with the finding that most U5 mutant donors integrated by a two-ended concerted mechanism. The presence of mutations at positions 5 and 6 in HIV-1 U5 increased the number of non-concerted mechanism-derived two-ended integration products, but not nearly to the extent seen when these base pair inversions were introduced into the U3 IN recognition sequence. Mutations to positions 3 and 4 of U5, like those of U3, produced integrants that formed by a change in the nucleophilic attack mechanism. In this case, however, as many as 16% of the integrants sequenced contained one rather than two bases excised from the mutated end (Table IV). Such products were not observed with the other U5 mutants.
The effect of donor DNAs containing two wild type copies of the HIV-1 U5 IN recognition sequence was more enlightening. As predicted, such donors were more efficient substrates than wild type, due in part to an increase in single-ended insertion events. However, the number of biologically selected integrants was the same as wild type. Nevertheless, there is a striking difference from wild type in that most of the two-ended integrants were derived by a non-concerted mechanism that introduced deletions into the target DNA (Table VIII). In the case of the donor DNAs containing two wild type copies of U3, there was a substantial decrease in the efficiency of integration. A sequence analysis of recovered two-ended integrants indicated that many occurred by a non-concerted mechanism, though not as many as obtained with the donor containing two U5 IN recognition sequences.
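The distinction drawn above between concerted and non-concerted two-ended integrants can be illustrated with a minimal sketch. It assumes that each sequenced clone is summarized by the acceptor coordinates flanking the donor, and it uses the 5-base pair target-site duplication characteristic of HIV-1 integration as the signature of a concerted event. The function names, the category labels, and the example coordinates are hypothetical and are not taken from the tables; the "atypical duplication" category is an assumption added for completeness.

```python
from collections import Counter

# Size of the target-site duplication expected for concerted HIV-1 integration.
HIV1_DUPLICATION_BP = 5


def junction_offset(left_last: int, right_first: int) -> int:
    """Return the overlap (+) or gap (-) between the two donor-acceptor junctions.

    left_last   : last acceptor position preceding the donor on one side
    right_first : first acceptor position following the donor on the other side
    A positive value is the number of duplicated acceptor base pairs,
    a negative value is the number of deleted acceptor base pairs.
    """
    return left_last + 1 - right_first


def classify_integrant(left_last: int, right_first: int,
                       expected: int = HIV1_DUPLICATION_BP) -> str:
    """Classify a two-ended integrant from its junction coordinates."""
    offset = junction_offset(left_last, right_first)
    if offset == expected:
        return "concerted"
    if offset == 0:
        return "non-concerted (no duplication)"
    if offset < 0:
        return "non-concerted (deletion)"
    # Duplications of an unexpected size are kept separate (assumed category).
    return "atypical duplication"


if __name__ == "__main__":
    # Hypothetical junction coordinates, for illustration only.
    clones = [(354, 350), (354, 355), (354, 400), (1051, 1047)]
    counts = Counter(classify_integrant(a, b) for a, b in clones)
    for label, n in counts.items():
        print(f"{label}: {n}")
```

Passing `expected=6` would adapt the same sketch to ASV, whose integration produces a 6-base pair duplication.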
Taken together, these results indicate that to catalyze concerted DNA integration, HIV-1 IN requires both a U5 and a U3 IN recognition sequence in a protein-DNA complex. The need for both a U5 and U3 LTR end was previously suggested by the in vivo data of Murphy and Goff (26), where a mutation introduced into one of the Moloney murine leukemia virus IN recognition sequences caused a change in processing of both ends. More recently, Wei et al. (27) reported that formation of the preintegration complex required two functional viral DNA ends and that IN alone might be able to bring the two ends into the complex. We presented HIV-1 IN with substrates containing several U3 and U5 positions in which the deoxyribonucleotides were randomized. 2 Sequence analysis of the recovered integrants indicated that IN selected sequences in one LTR end based on the deoxyribonucleotide content of the other.
Somewhat different results were published by Masuda et al. (28). In their note, it was suggested that in vivo integrase recognized the two ends independently. They examined the effect on integration of using a donor with two HIV-1 U3 or two U5 sequences created by placing 11 base pairs derived from one LTR terminus into the other. The product integrants were not sequenced. They reported that the U5-U5 construct integrated at a level roughly similar to wild type, with a decrease in integration of only 10%. This could be consistent with our results in that integration occurs, but in our case with the loss of the concerted mechanism. They also reported that the integration efficiency of a U3-U3 construct was relatively high, around 75%. This is in contrast to what is observed in this study. These differences may be related to the inherent differences in the two systems used for analysis. An alternative explanation could be related to what is defined as the IN recognition sequence. The mini donor DNAs used here contained 20 base pairs as the HIV-1 IN recognition sequence. We chose this site size for the HIV-1 IN recognition sequence because in a separate series of experiments we introduced random bases into a donor DNA substrate and allowed IN to select those sequences required for integration in vitro. One of the positions where there was a statistically significant selection of the U5 wild type base pair was position 20. 3 If 20 base pairs represent a more complete HIV-1 IN recognition sequence, then the IN recognition sequences analyzed in the Masuda et al. (28) study could be variants that use an additional nine base pairs from adjacent sequence. Such sequence substitutions are tolerated by IN and in some cases will alter activity.
Formation of the integration complex involves multimers of IN (29-33). Yang et al. (34) suggested that an integrase tetramer is both necessary and sufficient to catalyze concerted DNA integration in vivo. DNase protection studies examining the interaction of ASV IN with its respective LTR ends suggested that multimer formation was different depending upon whether it was a U3 or a U5 end (8). This ASV model predicts that for a one-ended insertion reaction only dimers of IN are required, that for a two-ended concerted integration a tetramer is assembled on the ASV U5 end, and that a higher order multimer forms at the dominant ASV U3 end (8). The differential effect of similar mutations placed in the HIV-1 U3 and U5 IN recognition sequences described here has several possible implications that would lend support to the above model. One is that the two LTR ends bind to different regions of a multimer IN complex. Alternatively, the IN complex contains proteins with different conformations that bind to the different LTR ends or are induced into different conformations upon binding the different LTR ends. By the nature of the effect of the different mutations, we speculate that (a) the positioning of the conserved CA dinucleotide in the IN complex greatly influences the excision of two deoxynucleotides from the LTR ends, (b) the HIV-1 U3 IN recognition sequence, particularly around positions 5 and 6, is important for positioning the LTR ends in a functional concerted DNA integration complex, since mutations to these positions result in large numbers of two-ended non-concerted DNA integration products, and (c) the HIV-1 U5 probably binds to IN more tightly than the U3 IN recognition sequence.

recognition sequences at both ends
DNA integration products from the HIV-1 reconstituted integration system were introduced into bacteria, and individual clones were isolated and sequenced as described under "Experimental Procedures."
a Deoxyribonucleotide sequence of the junction of the donor integration into the acceptor DNA. All notations are as in the legend to Table I.
b Zero denotes no base pair duplication indicative of a non-concerted DNA integration mechanism.
c Denotes deletion introduced into the acceptor DNA.
We have now been able to compare the mechanism of integration between HIV-1 and ASV and the effect of comparable mutations introduced into the respective IN recognition sequences using the reconstituted concerted DNA integration systems. Despite the wide differences in the sequences of integrases and the corresponding LTR IN recognition sequences, HIV-1 and ASV share common properties for integration such as the presence of a dominant LTR end (U5 in the case of HIV-1, U3 in the case of ASV). Mutations in the dominant LTR end have larger effects on the efficiency of the integration reaction than comparable substitutions in the "weak" LTR end. In both systems, inversions at positions 5-6 in either LTR IN recognition sequence significantly decrease the efficiency of two-ended integration events. In addition, substitutions at positions 5-6, especially in the weak LTR end, change the concerted mechanism of integration in both HIV-1 and ASV. Base pair inversions at positions 4-7 in either LTR end in HIV-1 or ASV DNA decrease two-ended integration efficiency to an almost undetectable level but do not change the concerted mechanism of two-ended integration events. Based upon this shared conservation of mechanism, we would predict that the three-dimensional structures of the respective IN proteins with their IN recognition sequences would be very similar. While complete structural information is not available, it has been noted that the α carbon backbones of the central catalytic cores of the HIV-1 and ASV IN proteins superimpose to within 1.4 Å r.m.s. (35), with the deviations in the active site itself being less than 0.9 Å (36).