Strategy to discriminate between high and low affinity bindings of human immunodeficiency virus, type 1 integrase to viral DNA.

The last decade has contributed to our understanding of the three-dimensional structure of the human immunodeficiency virus, type 1 (HIV-1) integrase (IN) and to the description of how the enzyme catalyzes the integration of viral DNA into the host DNA. Recognition of the viral DNA termini by IN is sequence-specific, whereas that of the host DNA does not require a particular sequence, although in physicochemical studies IN fails to discriminate between the two interactions. Here, such discrimination was made possible by a model system using designed oligonucleotides and peptides as binding structures. Spectroscopic (circular dichroism, NMR, and fluorescence anisotropy) techniques and biochemical (enzymatic and filter binding) assays clearly indicated that the amphipathic helix alpha4, located at the catalytic domain surface, is responsible for the specific high affinity binding of the enzyme to viral DNA. Analogues of the alpha4 peptide having increased helicity and still bearing the biologically relevant lysines 156 and 159 on the DNA binding face, and oligonucleotides conserving an intact attachment site, are required to achieve high affinity complexes (Kd of 1.5 nM). The data corroborate previous in vivo results obtained with mutated viruses.

Nearly 20 years into the human immunodeficiency virus (HIV)/AIDS epidemic, an estimated 40 million people worldwide are currently living with the virus, and some 20 million people have already died (1,2). If the spreading of HIV (3) continues at the current rate, even the most devastating scenario one can anticipate from the present landscape will pale in comparison with the reality. Even though a small number of vaccines are in clinical tests, none has lived up to its early promise (4). Thus, treatment of AIDS still requires the development of effective inhibitors of HIV replication (5-7). Those targeted to reverse transcriptase and protease have demonstrated their efficiency in antiviral therapy. New drugs, acting on integrase (IN), would be a valuable complement in this therapy.
The IN of HIV is essential for viral replication (8-10). As it has no cellular counterpart, it is considered a potential target for anti-HIV drugs (7,11,12). IN uses a multistep reaction to integrate a linear DNA copy (cDNA) of the retroviral genome into the host cell DNA (8,9,13). In the first step, termed 3′ end processing, IN removes two nucleotides from the 3′ terminus of each strand of the viral cDNA. In the second step, the free terminal 3′-hydroxyl groups attack the target host DNA, and the viral cDNA is integrated by a transesterification reaction into the cell genome (14).
The HIV-1 enzyme (288 amino acid residues) is organized into an N-terminal domain, a central catalytic domain, or catalytic core, and a C-terminal domain (15-17). Several crystal structures of the catalytic domain fragment and of two two-domain fragments (catalytic domain linked either to the C-terminal domain or to the N-terminal domain) have already been solved by x-ray crystallography (18-26). The N-terminal and C-terminal domains have also been analyzed in solution by NMR spectroscopy (27,28). The N-terminal domain includes a conserved HHCC motif that binds zinc and an HTH motif (27,29). The C-terminal domain, although less conserved, contains an SH3 fold (28,30). The catalytic domain contains five β strands surrounded by six α helices, numbered from one to six, as well as a highly conserved catalytic D,DX35E motif embedded in a protein RNase H fold (19-22). The three domains, taken separately, form dimers, as do the two two-domain fragments (18-30). The dimer of the catalytic domain is organized around a 2-fold axis with an interface between the helix α1 of one unit and the helix α5 of the other one (18), whereas the α4 helix is located at the enzyme surface (Fig. 1).
Yet, despite this ensemble of data, a clear physicochemical demonstration of the involvement of helix α4 in viral cDNA recognition has not yet been established, undoubtedly because of the absence of crystallographic or NMR information on the DNA-protein complexes. The failure of the spectroscopic methods to discriminate between the high affinity (specific) and the low affinity (nonspecific) binding modes in experiments using the entire enzyme and several DNA substrates does not help to fill this gap (48,49).
Here, we carried out a detailed physicochemical study, combining CD, NMR, fluorescence anisotropy, and filter binding assay, aiming to decipher the role of the helix α4 of IN in the cDNA recognition events. The principle of our simplified approach rested on the design of target oligonucleotides and ligand peptides with optimized binding structures. We assumed that, to achieve good binding interactions, the partners would have to have stable secondary structures resembling the secondary structures of their parent segments within the entire cDNA and IN. Indeed, our results show that the flexibility and the poor helicity of the synthetic α4 peptide, reproducing the helix α4 sequence, prevent the formation of a high affinity complex with the oligonucleotide target. In contrast, the α4 peptide analogue K156, which presents a higher helix content, generated by appropriate helicogenic mutations in the sequence, and still bears the residues Lys-156 and Lys-159, critical for the cDNA recognition (37-40), displays high affinity binding (Kd = 2 nM). High affinity binding further requires oligonucleotide targets with (i) a stable double-helix structure under the low concentration conditions used in fluorescence anisotropy experiments, obtained by using hairpin folds (monomolecular structures) instead of linear duplexes (bimolecular structures), and (ii) an intact attachment sequence (att site): AGCAGT3′/5′ACTGCT.
All in all, our simplified system demonstrates the utility of employing selected protein and DNA fragments to decipher the mechanisms of interaction of IN with its viral DNA target. Three criteria have proved to be imperative for the occurrence of primary (specific) or high affinity binding: (i) an optimized helical content of the peptide involved in the recognition, which is necessary for a minimal loss of binding free energy during the adjustment to the DNA partner; (ii) conservation of the basic residues Lys-156 and Lys-159 in the optimized peptide structure; and (iii) good stability of the DNA target with integrity of the IN binding locus att.

EXPERIMENTAL PROCEDURES
Structure Predictions-α helix secondary structure predictions were carried out using the AGADIR and GOR computer programs (50,51). The first method considers short range interactions between residues and provides the helicity per residue of peptides lacking tertiary interactions in solution, at selected pH and temperature values. The second method provides structure predictions more suitable for peptide segments fixed in a tertiary structure, i.e. within the protein context. Thus, from the comparison of the AGADIR and GOR predictions one can learn about the influence of the protein environment on the stability of helix α4. The AGADIR predictions can also be used to choose the mutations that improve the helix content of the α4 peptide. Such mutations are needed to generate a peptide secondary structure resembling as much as possible the helix α4 folding in the protein tertiary structure. Maximization of the helical structure is expected to overcome the entropy problems linked to the otherwise large conformational freedom of the peptide in the DNA binding.
Peptide Samples-The peptides (see Fig. 2) were synthesized according to the Fmoc (N-(9-fluorenyl)methoxycarbonyl) procedure on an Applied Biosystems model 432A automatic solid phase synthesizer and were purified by reverse-phase high pressure liquid chromatography on an Aquapore column using a linear gradient from 0 to 100% acetonitrile, 0.1% trifluoroacetic acid in water. The molecular mass and purity of each peptide were confirmed by electrospray ionization mass spectrometry on a Platform quadrupole instrument (VG Biotech). Peptide concentrations were determined from the UV signal of Tyr and Trp purposely added at the C terminus, using a molar absorption coefficient at 280 nm equal to 1197 M⁻¹·cm⁻¹ (K156 and E156) and equal to 5600
Protein Samples-The plasmid encoding double mutant IN (IN1-288/F185K/C280S) was kindly provided by R. Craigie (National Institutes of Health). The enzyme was expressed in Escherichia coli strain BL21(DE3) as described previously (52). Purification was performed at 4°C under native conditions without detergent on a His-trap column (Amersham Biosciences) using a zinc chelate. Cells expressing double mutant IN from 250 ml of culture medium were resuspended in 10 ml of ice-cold lysis buffer (10 mM imidazole in Buffer A (20 mM Tris-HCl, pH 7.5, 1 M NaCl)). The cell suspension was treated with 1 mg/ml lysozyme for 30 min at 4°C, and bacteria were disrupted using a French press. Lysed cells were centrifuged for 30 min at 10,000 × g, and the supernatant was filtered (0.22 μm) and loaded on the column equilibrated with lysis buffer. The column was washed with Buffer A plus 10 to 70 mM imidazole, and the protein was eluted with Buffer A plus 400 to 500 mM imidazole. Fractions containing integrase were pooled and dialyzed overnight against storage buffer (20 mM Tris-HCl, pH 7.5, 0.8 M NaCl, 2 mM dithiothreitol, 50 μM ZnCl2, and 10% (v/v) glycerol).
Concentrations of purified double mutant IN were determined with the Bradford kit (Promega) and by UV absorption using a calculated extinction coefficient of 46542 M⁻¹·cm⁻¹ at 280 nm, based on the amino acid composition. Finally, aliquots of purified enzyme were stored at −80°C.
DNA Samples-The target oligonucleotides were purchased from CYBERGENE ESGS.
For fluorescence anisotropy experiments, the fluorescein group grafted to oligonucleotides was used as a reporter (see Fig. 3). Hairpin structures were preferred to linear duplexes for their higher stability at the low concentrations dictated by fluorescence spectroscopy. They contain a three-thymine loop and a 17-bp stem reproducing or deriving from the outermost part of the U5 LTR of HIV-1 cDNA. The fluorescein group was introduced either at the 5′ end or on the central thymine residue, which allows us to assess the impact of the position of the bulky fluorescein on the peptide-DNA interactions. A hexadecamer oligonucleotide reproducing the cAMP-responsive element (CRE) sequence was also used to determine the nonspecific interactions with the peptides.
For the autointegration assays, we used U5b, 5′-GTGTGGAAAATCTCTAGCAGT; and U5a, 3′-CACACCTTTTAGAGATCGTCA; and LTR34, 5′-ACTGCTAGAGATTTTCCTTTGGAAAATCTCTAGCAGT (hairpin) as substrate/target DNAs. 15 pmol of U5b and LTR34 were radiolabeled using T4 polynucleotide kinase (New England Biolabs) and 50 μCi of [γ-32P]ATP (3000 Ci/mmol) (Amersham Biosciences). The T4 kinase was heat-inactivated and then 20 pmol of U5a and NaCl at a final concentration of 0.1 M were added to U5b. Both samples, LTR34 and U5a/U5b, were heated to 95°C for 3 min, and the DNA was then annealed by slow cooling in the case of U5a/U5b and quickly cooled in the case of LTR34. Unincorporated nucleotides were removed using a Sephadex G-10 column (Amersham Biosciences).
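The base pairing implied by these sequences can be checked with a few lines of code. The following is a minimal sketch, not part of the published protocol; the function names are illustrative, and the hairpin is assumed to fold with its central three residues unpaired, as described for the three-thymine loop.

```python
# Illustrative check of the substrate sequences quoted in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base Watson-Crick complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


def hairpin_stem_length(seq, loop_len):
    """Count consecutive base pairs from the two ends inward,
    leaving the central `loop_len` residues unpaired."""
    arm = (len(seq) - loop_len) // 2
    five_arm = seq[:arm]
    three_arm = seq[len(seq) - arm:]
    paired = 0
    for a, b in zip(five_arm, reversed(three_arm)):
        if COMPLEMENT[a] != b:
            break
        paired += 1
    return paired


U5B = "GTGTGGAAAATCTCTAGCAGT"  # written 5'->3'
U5A = "CACACCTTTTAGAGATCGTCA"  # written 3'->5', as in the text
LTR34 = "ACTGCTAGAGATTTTCCTTTGGAAAATCTCTAGCAGT"  # written 5'->3'

# U5a is written 3'->5', so it should equal the plain complement of U5b.
duplex_ok = complement(U5B) == U5A
stem_bp = hairpin_stem_length(LTR34, loop_len=3)
```

Because U5a is listed 3′ to 5′, a plain complement (without reversal) is the appropriate comparison for the duplex, whereas the hairpin arms are compared end to end.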
Integration Assays-The autointegration reaction was performed using either U5a/U5b or LTR34 in a buffer containing 20 mM Hepes, pH 6.7, 10 mM MnCl2, 10 mM dithiothreitol. The enzyme was added at a final concentration of 200 nM. Reactions were initiated by addition of the DNA and incubated for 90 min at 37°C. They were stopped by phenol/chloroform extraction, and DNA products were precipitated with ethanol. Samples were dissolved in the loading buffer containing 95% formamide, 0.05% bromphenol blue, 0.05% xylene cyanol, and 20 mM EDTA and heated for 3 min at 95°C. Products were separated on a denaturing (7 M urea) 16 or 20% polyacrylamide gel in 1× TBE, pH 8, and visualized using a STORM PhosphorImager (Amersham Biosciences).
NMR Spectroscopy-Proton NMR spectra were recorded on a Bruker Avance 500-MHz spectrometer and processed with XWIN-NMR and FELIX 97.0 (Accelrys) software running on Silicon Graphics workstations. Samples of 400 μl were prepared with K156 and E156, at 4.5 and 3.2 mM, respectively, diluted in 90% H2O-10% D2O. Solutions were adjusted to pH 5.5 by addition of aliquots of NaOH (1 N). TSP was used as an internal chemical shift reference. Standard two-dimensional NMR experiments were recorded at 10, 20, 30, and 40°C; NOESY (53) and Clean-TOCSY (54) data sets were collected with 2048 real points in t2 and 800 points in t1 using the TPPI scheme for quadrature detection in the t1 dimension. The relaxation delay was set to 1.5 s, and the spectral widths were enlarged to 10 ppm. In the NOESY experiments, mixing times were varied from 50 to 300 ms, whereas Clean-TOCSY spectra were collected with MLEV-17 spin-locking pulses ranging from 80 to 120 ms. Solvent suppression was achieved by applying a WATERGATE pulse sequence (55). Spectra were zero-filled (final data matrix in real points: 2048 × 1024) and subjected to sine bell weighting functions in both dimensions before Fourier transformation. The quantification of nuclear Overhauser effect data was achieved by calculating the ratio of the number of levels in contour plot spectra between sequential HN(i)/HN(i+1) or Hβ(i)/HN(i+1) connectivities and the Hα(i)/HN(i+1) ones, taking into account (averaging) the parts below and above the diagonal of the spectra.
Fluorescence Anisotropy Measurements-Fluorescence anisotropy (A = (I∥ − I⊥)/(I∥ + 2I⊥)) titration measurements were carried out using a Jobin-Yvon Fluoromax II instrument. Parallel (I∥) and perpendicular (I⊥) emission components were measured in L-format. The denominator of A is simply the total light that would be observed if no polarizers were used. With fluorescein as the fluorophore, the excitation from the xenon lamp (150-watt ozone-free) was set at 492 (488) nm with a 4-nm slit width, and the emission was observed at 516 nm with a 5-nm slit width.
The fluorescein-labeled oligonucleotides were diluted to the desired concentration in 800 μl of assay buffer (pH 6.0, 40 mM Na/Na2 phosphate, 0.5 mM EDTA). Samples were placed in thermally jacketed 1-cm × 0.5-cm quartz cuvettes, and measurements, at least 10 data points for each titration point, were recorded at 5°C with an integration time of 5 s. The peptides were stepwise diluted. For each anisotropy measurement, the parallel (I∥) and the perpendicular (I⊥) intensities of the background solution (i.e. buffer and protein contributions) were subtracted from the sample value.
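The anisotropy calculation with background subtraction can be sketched as follows. This is a minimal illustration of the formula given above; the function and argument names are hypothetical, and instrument correction factors (which the text does not describe) are left out.

```python
def anisotropy(i_par, i_perp, bg_par=0.0, bg_perp=0.0):
    """Fluorescence anisotropy A = (I_par - I_perp) / (I_par + 2*I_perp),
    computed after subtracting the background intensities."""
    par = i_par - bg_par
    perp = i_perp - bg_perp
    total = par + 2.0 * perp  # total emitted intensity
    if total <= 0:
        raise ValueError("background-corrected total intensity must be positive")
    return (par - perp) / total


def mean_anisotropy(readings, bg_par=0.0, bg_perp=0.0):
    """Average anisotropy over repeated (I_par, I_perp) readings
    taken at a single titration point."""
    values = [anisotropy(p, s, bg_par, bg_perp) for p, s in readings]
    return sum(values) / len(values)
```

Averaging the repeated readings per titration point mirrors the "at least 10 data points" recorded for each point; whether the original analysis averaged intensities or anisotropies is not stated, so the choice made here is an assumption.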
CD Spectroscopy-CD spectra were recorded on a Jobin-Yvon CD6 dichrograph. Measurements were calibrated with (+)-10-camphorsulfonic acid. Peptide concentrations were 40 μM in pure H2O. Samples were placed in thermally jacketed cuvettes with a 1-mm path length. Spectra, recorded with 1-nm steps and corrected for the base line, were averaged over 10 scans. Before spectral recording, samples were incubated 10 min at the chosen temperature to allow the solutions to reach their equilibrium state. Spectra were presented as circular dichroism per residue, Δε (M⁻¹·cm⁻¹), as a function of wavelength, nm, between 260 and 185 nm. The α helix content was estimated by the relation, Pα = −[Δε222 × 10] (Pα, percentage of α helix; Δε222, circular dichroism per residue at 222 nm) (56).
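The helix content relation is simple enough to express directly. The sketch below assumes the per-residue Δε at 222 nm has already been obtained from the baseline-corrected spectrum; the function name and the example input are hypothetical.

```python
def helix_percent(delta_eps_222):
    """Estimate the percentage of alpha helix from the per-residue
    circular dichroism at 222 nm (M^-1 cm^-1): P_alpha = -(delta_eps_222 * 10)."""
    return -(delta_eps_222 * 10.0)


# Hypothetical input: a per-residue signal of -4.5 M^-1 cm^-1 at 222 nm.
example = helix_percent(-4.5)
```

Under this relation a more negative signal at 222 nm translates linearly into a higher estimated helix percentage.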
Filter Binding Assay-The reaction conditions have been described previously (56). Typically, in a 20-μl volume, the peptide K156 was incubated at different concentrations in the presence of its 32P-labeled DNA substrate LTR 34fm at 5 nM in a 20 mM phosphate buffer, pH 6. After 15 min of incubation at 34°C, samples were filtered through a nitrocellulose membrane (0.2-μm reinforced cellulose nitrate membrane; Schleicher & Schuell) in a dot-blot apparatus. Wells were washed three times with 0.5 ml of washing buffer (20 mM phosphate buffer, pH 6, 100 mM NaCl) prior to filtration. At the end, the wells were washed three times with 0.5 ml of washing buffer. The membrane was then washed for 15 min in the same buffer and dried for 30 min at 40°C. Finally, the membrane was scanned, and the radioactivity was recorded using a PhosphorImager Scanner (Amersham Biosciences).

RESULTS
Peptide α4 reproduces the natural helix α4 sequence of the catalytic domain; K156 derives from α4 through selected helix stabilizing mutations, E156 is obtained from K156 through replacement of Lys-156 with Glu, and E159 results from substitution of Lys-159 with Glu. These are presented in Fig. 2. The oligonucleotides, that is their lengths and sequences, were selected according to the experimental requirements and on the basis of the known recognition properties of the viral cDNA termini by IN. They are shown in Fig. 3.
In the following, the conformations of the hairpin oligonucleotides were all of the B-DNA type according to our CD measurements (not shown). The secondary structures of the peptides were predicted by the AGADIR and GOR methods and determined by CD and 1H NMR experiments. The interactions of the peptides with DNA were measured by fluorescence anisotropy using fluorescein-labeled oligonucleotides.
Peptide Secondary Structure Predictions and Choice of Analogues-The results are reported in Fig. 4. The AGADIR method predicts a very low helix content for the natural α4 peptide (Fig. 4A), whereas the corresponding peptide segment adopts a helical conformation in the enzyme crystal structure (Fig. 1), as indeed predicted by the GOR algorithm (Fig. 4B). This important difference strongly suggests that within the tertiary structure of the protein the helical secondary structure of helix α4 is substantially stabilized.
It is generally agreed that the association of two molecules is accompanied by a loss of conformational (translational and rotational) entropy. The larger the entropy (flexibility, motions within the molecules), the higher the cost in terms of binding energy, and the lower the resulting binding affinity. Thus, the α4 peptide, whose secondary structure presents higher degrees of freedom compared with the helix α4 in the enzyme, is also expected to associate more weakly with the target DNA and constitutes a poor model for the protein binding studies.
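To give a sense of scale for the link between binding energy and affinity, the sketch below converts dissociation constants into standard binding free energies using ΔG° = RT ln(Kd / 1 M). The two Kd values (2 nM and 14 μM) and the temperature (5°C) are those reported in this work for K156 and LTR 34fm; the function name and the 1 M standard state are assumptions of the illustration, not part of the original analysis.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_free_energy(kd_molar, temp_kelvin):
    """Standard binding free energy (J/mol) from a dissociation constant,
    dG = R*T*ln(Kd / 1 M). More negative means tighter binding."""
    return R * temp_kelvin * math.log(kd_molar)


T = 278.15  # 5 degrees C, the temperature of the anisotropy measurements
dg_high = binding_free_energy(2e-9, T)    # high affinity binding
dg_low = binding_free_energy(14e-6, T)    # low affinity binding
ddg_kj = (dg_low - dg_high) / 1000.0      # difference in kJ/mol
```

With these inputs the difference between the two binding modes is on the order of 20 kJ/mol, which illustrates how a moderate free energy penalty can shift an affinity from the nanomolar to the micromolar range.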
A more valuable model is embodied by peptide analogues having both a better preorganized structure, that is a higher helix content, and the functional amino acids, important for the interactions, in a correct spatial orientation. The reduction of the degrees of freedom of the α4 peptide was achieved through selected mutations (Fig. 2), whose effects on secondary structure were predicted by the AGADIR method (Fig. 4A). The resulting analogues, K156, E156, and E159, are expected to mimic the helix α4 secondary structure better than their parent α4 peptide. To obtain K156, the following changes were made: (i) residues Gly-3, Gly-17, Val-5, Ile-15, and Ile-16 were replaced by the more helicogenic residues Ala and Leu, respectively; (ii) residue Val-4 was substituted by Lys, and residue Ser-7 was substituted by Glu, to engender stabilizing electrostatic interactions (i → i+3 and i → i+4) between side chains. It can be noticed, first, that despite the various mutations performed, the hydrophobic face of the helix remains unaltered (compare in Fig. 2 the residues in positions a and d of the different analogues), allowing the helix to maintain its amphipathic properties. Second, the two basic residues Lys-156 and Lys-159, which have been repeatedly shown to be essential for the binding of IN to viral cDNA and for 3′-processing (37,40), are conserved in K156, whereas one of them is replaced by Glu in each of E156 and E159. Together with residue Gln-148, also important for the binding of IN to viral DNA, and residue Glu-152 of the catalytic triad, they are situated in the helical heptad positions a, b, and e (Fig. 2), on the same face of the helix (Fig. 1) and can participate in face-to-face contacts between the peptide helix and the DNA (Fig. 5). Third, the three analogues display very similar secondary structures. This resemblance was necessary within this study, which aims to evaluate the impact on the binding affinities of the basic side chains in positions 156 and 159, to avoid the interference of the conformational aspects in the comparison.
CD and Secondary Structures-CD is a very sensitive method for the analysis of peptide secondary structures and conformational changes (57). CD spectra of the α4, K156, E156, and E159 peptides were recorded in aqueous solution at pH 5.6 and 20°C (Fig. 6). The spectrum of α4, with its intense negative band arising close to 190 nm, is typical of peptides devoid of regular structure. On the other hand, K156, E156, and E159 display spectra with two negative signals at about 222 and 208 nm and a positive one at about 190 nm, characteristic of peptides containing a fair amount of α helix. Only small differences are observed; the spectrum of E156 is almost identical to that of K156, whereas that of E159 displays slightly more intense signals. As expected from the AGADIR prediction data, the mutations significantly increased the helix content.
NMR and Helical Structures-The NMR structure of the α4 peptide in aqueous solution has been reported previously (45). The results have not provided any evidence of a stable secondary structure, in agreement with the AGADIR predictions (Fig. 4A) and the CD data (Fig. 6).
Proton resonances of peptides K156 and E156 were assigned by combining the P-COSY, Clean-TOCSY, and NOESY spectra (not shown), following the standard strategy developed by Wüthrich (58). The comparative analysis of P-COSY and TOCSY spectra, recorded at 20°C in water, gave the spin system signatures of the peptides. Spin systems were then sequentially assigned using NOESY spectra by virtue of sequential Hα(i)/HN(i+1), HN(i)/HN(i+1), and Hβ(i)/HN(i+1) connectivities. Spectra recorded at different temperatures were used to confirm and clarify our initial assignments. As first evidence, the analysis of NOESY spectra revealed, by the presence of numerous HN/HN correlations, a high degree of secondary structure elements. The sequential and medium range connectivities thus obtained, summarized in Fig. 7A, indicate a large proportion of helical content in both peptides, covering almost their entire sequences, with well spread Hα(i)/HN(i+1), Hα(i)/HN(i+3), Hα(i)/HN(i+4), and HN(i)/HN(i+4) correlations of medium intensities. The occurrence of several Hα(i)/HN(i+2) connectivities draws attention to the presence, besides the α helix, of a small amount of 3(10) helix.
The Hα and HN chemical shifts have also been used to detect the presence in proteins of elements of regular secondary structures (59). The chemical shift index profiles, which are the differences between the observed chemical shifts, δobs, and the random coil chemical shifts, δRC (δobs − δRC) (58), are shown in Fig. 7B. The HN-chemical shift index profiles show significant periodicity, already mentioned in the case of amphiphilic helices (60). The Hα-chemical shift index values obtained for both peptides are negative overall, especially for the most internal residues. A chemical shift difference of −0.39 ppm is commonly accepted for 100% of helix content, and the values displayed by the residues Lys-156 and Lys-159 in K156 are close to this limit. Indeed, the majority of the residues are involved in helical secondary structures, especially at the center of the peptides.
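The per-residue reasoning above can be sketched in code. The following is a rough illustration only: it computes the Hα shift difference δobs − δRC and scales it by the −0.39 ppm value taken as 100% helix. The function names, the clipping to the 0..1 range, and the example shift values are assumptions introduced here, not measured data or the authors' procedure.

```python
FULL_HELIX_SHIFT = -0.39  # ppm, H-alpha shift difference taken as 100% helix


def secondary_shift(delta_obs, delta_rc):
    """Chemical shift difference (ppm): observed minus random coil."""
    return delta_obs - delta_rc


def helix_fraction(delta_obs, delta_rc):
    """Rough per-residue helix fraction from the H-alpha shift difference.
    Clipping to 0..1 is an assumption of this sketch."""
    frac = secondary_shift(delta_obs, delta_rc) / FULL_HELIX_SHIFT
    return max(0.0, min(1.0, frac))


# Hypothetical shifts (ppm) for one residue: observed 4.16, random coil 4.36.
example_fraction = helix_fraction(4.16, 4.36)
```

A residue whose Hα resonance is shifted upfield by about half of 0.39 ppm would thus be scored as roughly half helical, whereas a downfield shift is scored as zero.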
A DNA Hairpin Structure as a Model Substrate of IN-For susceptibility to processing or to DNA joining by IN, viral cDNA substrates must be double-stranded and at least 15 bp long (61). To make sure that our designed hairpin structures are suitable DNA targets for the binding experiments, the susceptibility to IN of the LTR34 stem of 17 bp was compared with that of a classical 21-bp substrate duplex (62) in an autointegration assay (Fig. 8). The appearance of autointegration products when LTR34 is used as a substrate indicates that the hairpin presents all the features allowing its recognition by the enzyme and that neither the three-thymine loop nor the shortening of the DNA stretch (17 versus 21 bp) alters its susceptibility to processing and DNA joining.
Fluorescence Anisotropy-The relationship between size and anisotropy provides information on both the stoichiometry of the complex and the binding constants. Here, fluorescence anisotropy was used for monitoring the complexation of peptides (Fig. 2) with the fluorescein-labeled oligonucleotides (Fig. 3). Heyduk and Lee (63) were the first to use such labeled oligonucleotides for anisotropy titration studies with a DNA-binding protein. The authors rapidly understood that fluorescein, which had both a high quantum yield and a remarkable sensitivity to its environment, was a good probe of complexation events. Since then, the technique has been successfully applied for the analysis of specific interactions of transcription factors with target DNAs (64).
The anisotropy titration profiles provided by the binding of K156 to LTR 34fm at three concentrations of DNA (10, 30, and 50 nM) are presented in Fig. 9. LTR 34fm is the only hairpin DNA substrate that contains both the appropriate att site of IN within the 17-bp stem and the fluorescein moiety grafted on the loop. The three binding curves are biphasic, with two increases in anisotropy, in the nanomolar range and the micromolar range, respectively. The fitting of the experimental data, by a non-linear least squares procedure, within a model that takes into account two binding sites, provides dissociation constant (Kd) values of 2 nM and 14 μM, corresponding to high and low affinities, respectively. In the case of the high affinity binding, we observed the same increase in anisotropy upon addition of K156 to LTR 34fm, whatever the DNA concentration used (10, 30, and 50 nM). This permitted the application of the procedure of Bujalowski and Lohman (65), which yields a 1:1 stoichiometry for the high affinity complex between K156 and LTR 34fm, indicating that one molecule of peptide binds to one site on the DNA double helix.
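A biphasic titration of this kind can be illustrated with a simple two-phase forward model. The sketch below is not the authors' fitting model, whose exact form is not given; it assumes two independent hyperbolic phases and that free peptide is approximately equal to total peptide. That approximation is poor for the tight phase when the DNA concentration exceeds Kd, so a rigorous fit would have to account for bound peptide. All names and amplitude values are hypothetical.

```python
def two_site_anisotropy(peptide, a0, da1, kd1, da2, kd2):
    """Anisotropy predicted for two independent binding phases.
    Assumes free peptide ~ total peptide (no ligand depletion)."""
    phase1 = da1 * peptide / (kd1 + peptide)  # high affinity phase
    phase2 = da2 * peptide / (kd2 + peptide)  # low affinity phase
    return a0 + phase1 + phase2


def sum_squared_residuals(points, a0, da1, kd1, da2, kd2):
    """Objective that a least squares procedure would minimize;
    `points` is a list of (peptide concentration, anisotropy) pairs."""
    return sum(
        (a - two_site_anisotropy(p, a0, da1, kd1, da2, kd2)) ** 2
        for p, a in points
    )


# Hypothetical amplitudes; Kd values (in M) are those reported in the text.
PARAMS = dict(a0=0.05, da1=0.02, kd1=2e-9, da2=0.05, kd2=14e-6)
curve = [(c, two_site_anisotropy(c, **PARAMS)) for c in (1e-9, 1e-8, 1e-6, 1e-4)]
```

Minimizing `sum_squared_residuals` over the five parameters with any non-linear optimizer would recover them from noise-free data generated this way; with real data, the wide separation of the two Kd values is what makes the two phases distinguishable.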
The Kd values for the various DNA-peptide complexes are gathered in Table I. We identified the following main features. The α4 peptide, which lacks helix stability compared with its structural analogues K156, E156, and E159, is unable to form high affinity complexes whatever the DNA target, including the one containing the appropriate att site (LTR 34fm). In contrast, the analogue K156, which fulfils both the condition of α helix stability and of sequence integrity on the polar/charged face, is the only peptide that forms a high affinity complex. The two analogues E156 and E159, lacking the basic residues Lys-156 and Lys-159, respectively, do not present this high affinity binding, despite their helix stabilities at least equal to that of K156.
The high affinity binding of K156 to DNA also requires the integrity of the att site on the DNA target. Any change performed within the six outermost base pairs . . . AGCAGT3′/5′ACTGCT . . . , such as deletion of the GT3′ dinucleotide (LTR 32fm), replacement of A-6 with G (LTR 34fmG), and grafting of the bulky fluorescein moiety on the terminal residue A5′, facing the processing site on the complementary strand (LTR 34f5′), impairs the high affinity binding. On the other hand, the internal substitutions (namely, in positions 9 and 14) have no effect, and the high affinity binding of K156 to DNA is maintained intact.
Table I further shows that K156 and the two analogues are unable to bind single stranded DNAs with high affinity, although the low affinity binding persists. Indeed, the low affinity binding (micromolar range) is also observed for the complexation of K156 with the unrelated 16-bp oligonucleotide CRE, used as a negative control, proving that it corresponds to a nonspecific binding (not shown).
Filter Binding Assay-The filter binding assay was used to verify the tight binding (Kd = 1.5 nM) provided by the fluorescence anisotropy method for the interaction of K156 with LTR 34fm bearing the intact IN att site. The data reported in Fig. 10, A and B indicate that a strong binding interaction occurs in the 1 to 10 nM range (Kd ~ 3 nM) between the peptide and its DNA target, confirming the information provided by the fluorescence anisotropy method.

DISCUSSION
Biochemical studies have identified the basic mechanisms of integration mediated by the IN enzyme (8-10, 13). Yet, in in vitro experiments IN tends to aggregate. This behavior of IN has both prevented the determination of its active unit and made difficult the visualization of its specific interactions with the viral DNA, likely because of an increased number of parasitic nonspecific interactions. These features preclude the use of the entire IN protein and of long DNA fragments, which multiply the number of nonspecific binding sites, as partners in a physicochemical study aiming to provide details on the complexation events. Therefore, we turned toward a simplified model directed to establishing the conformational, the functional side chain, and the DNA base requirements for the specific binding of IN to the viral DNA termini.
DNA Bases Important for Interactions-We have shown that hairpin DNA structures with a 17-bp stem reproducing or deriving from the distal end of the viral cDNA can be used as targets in binding studies. The choice of the stem length was justified by the fact that DNA fragments of 15 to 21 bp constitute good in vitro IN substrates (61,62), whereas fragments longer than 25 bp should favor nonspecific interactions. The choice of monomolecular folded structures relies on the fact that bimolecular duplex structures are only weakly stable at the low concentrations used in fluorescence anisotropy experiments. Autointegration assays performed prior to fluorescence anisotropy experiments demonstrate that hairpin structures are perfectly recognized by IN. Various hairpin analogues were designed, based on the fact that not all the nucleotide residues at the viral DNA termini are equally important for the virus cycle and the DNA recognition (47). For instance, mutations performed within the processing site CA↓GT3′ are not tolerated at all; substitution of the highly conserved A residue results, in vivo, in replication-incompetent viruses, and in model DNA substrates they generate an up to 100-fold reduction of 3′ end processing. The same A, when replaced with G in our model oligonucleotides, leads to suppression of the high affinity binding, thus confirming the prominent role of this residue in the recognition process. This is also true when the ↓GT3′ step is deleted or when it bears the G → A transition, known for its deleterious effect on the 3′ processing rate (35,36,40,62). These different analogues exhibit only the low affinity binding in our experiments, as does the analogue with the bulky fluorescein moiety at the 5′ end, supposed to spatially interfere with the binding of the peptide at the att site.
More internal nucleotide positions contribute to DNA recognition, as well. The A → G transition in position six suppresses the high affinity complex in our binding experiments. This observation is interesting, as the above mutation strongly affects the 3′ processing rate in integration assays with model DNA substrates (62,66). On the other hand, even more internal mutations, such as those at positions 9 and 14, which have been shown to impair the binding of IN to cDNA, do not affect the high affinity binding of our K156 peptide to the target oligonucleotide. Indeed, several residues of IN, situated outside the α4 helix, have been found to interact with the cDNA in photocross-linking experiments (47).
Altogether, the positions that contribute to the high affinity binding in our system have been shown to participate in critical protein-DNA contacts. Indeed, the strongest cross-links involve the A and C residues of the invariant CA↓3′ but also the C and A residues of the terminal CA5′ facing the ↓GT3′ dinucleotide deleted upon cDNA processing (37)(38)(39)(40).
Active Conformations-The synthetic α4 peptide, reproducing the primary structure of the α4 helix, is, according to our CD and previous NMR experiments, almost completely devoid of α helix content compared with its analogue K156 (45). The active α helical conformation is largely absent in the α4 peptide, and the large energy expenditure required for its molecular adjustment to DNA results in a low binding affinity. In contrast, K156, which displays a much larger proportion of the binding conformation, pays a lower free energy cost on fit because it does not need to reduce its degrees of freedom to the same extent and, hence, binds more tightly to DNA. This corresponds to one of the founding principles of host-guest chemistry (67). There are, in fact, many cases of small proteins that bind to DNA with very low Kd. For instance, in their pioneering work using fluorescence anisotropy, Le Tilly and Royer (68) found a Kd of 0.13 nM for the binding of the trp repressor protein to its specific binding sites. A high affinity has also been obtained for the interaction between DNA and the DNA binding domain of c-Myb, with a Kd of ~5 nM (69). In such complexes the contacts between the protein and the DNA are not very numerous, hardly more than those exchanged between an oligopeptide and its nucleic acid site; nevertheless, the affinities achieved are considerable. An illustrative example is the picomolar affinity displayed by a short arginine-containing peptide (22 residues) in a helical conformation for a small RNA hairpin (70). This example provides evidence that oligopeptides can be obtained that bind nucleic acids with an affinity that rivals that of proteins. Moreover, it is not unusual in pharmacological studies for peptide antagonists with restricted motions to display affinities for their target receptors that are increased by 2 or 3 logs compared with the natural agonists.
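As a rough illustration of the energetic scale behind these dissociation constants, the standard relation ΔG° = RT ln(Kd) (with Kd expressed relative to a 1 M standard state) converts each Kd quoted above into a binding free energy, and RT ln(ratio) converts a fold-change in affinity into a free energy difference. The sketch below is not part of the experimental analysis; the temperature of 298.15 K and the function names are assumptions made only for this example.

```python
import math

# Gas constant in kJ/(mol*K)
R_KJ = 8.314462618e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R*T*ln(Kd / 1 M). The default temperature is an assumption
    for illustration, not a value taken from the experiments.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ * temperature_k * math.log(kd_molar)


def fold_change_free_energy(fold, temperature_k=298.15):
    """Free energy difference (kJ/mol) for a given fold-change in Kd."""
    if fold <= 0:
        raise ValueError("fold-change must be positive")
    return R_KJ * temperature_k * math.log(fold)


if __name__ == "__main__":
    # Kd values quoted in the text, converted from nM to M
    examples = {
        "high affinity peptide-DNA complex (1.5 nM)": 1.5e-9,
        "trp repressor (0.13 nM)": 0.13e-9,
        "c-Myb DNA binding domain (~5 nM)": 5e-9,
    }
    for label, kd in examples.items():
        print(f"{label}: {binding_free_energy(kd):.1f} kJ/mol")

    # "2 or 3 logs" of affinity expressed as free energy
    for logs in (2, 3):
        ddg = fold_change_free_energy(10 ** logs)
        print(f"{logs} logs of affinity: {ddg:.1f} kJ/mol")
```

Under these assumptions, a gain of 2 to 3 logs in affinity corresponds to roughly 11 to 17 kJ/mol, which gives a sense of the free energy that preorganization of the binding conformation has to account for.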
Peptide Side Chains Important for the Interactions-As expected, changing any one of the catalytic residues (Asp-64, Asp-116, Glu-152) in IN blocks HIV-1 replication in several cell lines and integration in in vitro assays (47,71). Yet other mutations in the core domain, such as K156E and K159E, are deleterious as well for both the 3′-processing and strand transfer reactions (37), although the interaction with nonspecific DNAs is globally maintained. Our results obtained with the stabilized helices are therefore in full agreement with the literature. Although K156 displays both the high and the low affinity binding modes, the analogues E156 and E159, bearing the mutations K156E and K159E, respectively, are devoid of high affinity binding but still bind the natural DNA termini with low affinity.

CONCLUDING REMARKS
In conclusion, the implication of the α4 helix in the recognition of viral cDNA by IN is underscored by our studies using a model for the binding of peptides to oligonucleotides, similar to the host-guest model of organic chemistry. Attention was directed to the judicious design of the primary sequences and, correlatively, of the secondary structures of the partner molecules. To achieve measurable affinity, the entropic gain in the binding reaction was optimized by reducing the dynamics of the partner molecules and increasing the number of binding conformations. In the peptide, this was obtained by replacing residues not involved in the DNA binding interactions with more helicogenic ones. In the DNA, the binding site was incorporated in a hairpin fold that is more stable than a duplex structure at low concentrations. We have thus established a simplified approach to the complexation events using oligonucleotides and peptides that assume preorganized binding conformations. This resulted in binding affinities that rival those of macromolecular complexes and that better agree with the behavior of the parent compounds, in which the secondary structures of the elements involved in recognition are generally frozen by stabilizing interactions arising from the tertiary or quaternary folding.
Finally, our strategy has revealed the high affinity binding of IN to viral DNA, which had not otherwise been detectable. We firmly believe that this will be useful for the development of IN inhibitors.