Integration Requires a Specific Interaction of the Donor DNA Terminal 5′-Cytosine with Glutamine 148 of the HIV-1 Integrase Flexible Loop

Integration is essential for retroviral replication and gene therapy using retroviral vectors. Human immunodeficiency virus, type 1 (HIV-1), integrase specifically recognizes the terminal sequences of each long terminal repeat (LTR) and cleaves the 3′-end terminal dinucleotide 5′-GT. The exposed 3′-hydroxyl is then positioned for nucleophilic attack and subsequent strand transfer into another DNA duplex (target or chromosomal DNA). We report that both the terminal cytosine at the protruding 5′-end of the long terminal repeats (5′-C) and the integrase residue Gln-148 are critical for strand transfer. Proximity of the 5′-C and Gln-148 was demonstrated by disulfide cross-linking. Cross-linking is inhibited by the inhibitor 5CITEP, 1-(5-chloroindol-3-yl)-3-hydroxy-3-(2H-tetrazol-5-yl)-propenone. We propose that strand transfer requires a conformational change of the integrase-viral (donor) DNA complex with formation of an H-bond between the N-3 of the 5′


Integration of the HIV-1 DNA into a host chromosome is required to complete viral infection. Integration is also required for retroviral gene transfer. HIV-1 integrase (integrase) catalyzes integration in two steps (1-3). First, an endonucleolytic cleavage reaction releases a terminal 5′-GT dinucleotide from the end of each viral long terminal repeat (LTR) in a reaction called 3′-processing (3′-P). Subsequently, the viral and host DNAs are joined by insertion of each exposed viral DNA 3′-hydroxyl end into a host chromosome (strand transfer, ST). The integrase-DNA complex during ST consists of several integrase monomers, the two ends of the viral DNA, and the target DNA. Although the arrangement of the integrase dimers has been elucidated by x-ray diffraction of the structures of various integrase domains (4-10), determination of the structure of an integrase-DNA co-crystal has remained elusive.
HIV-1 integrase binds to the tips ("att" sites) of each viral LTR. The roles of several bases at the extremities of the viral DNA have been examined. The release of the 5′-GT dinucleotide during 3′-P leaves a complementary 5′-AC overhang (see Fig. 1A for the sequence of the terminal 21 bp of the HIV-1 U5 LTR). The 5′-AC overhang is required for ST (11) and has been proposed to stabilize the viral DNA end in the proper position for ST (11,12). Yet the roles of these individual nucleotides, as well as their functional association with an integrase amino acid during ST, have not been established.
A flexible loop consisting of amino acids 140-149 resides over the integrase active site. The conformational flexibility of this loop is suggested to be important for the catalytic step following DNA binding (4). Amino acids Gln-148 and Tyr-143 have been identified through photocross-linking as possible contacts for the 5′-adenine of the 5′-AC overhang (13). Additionally, a Q148L integrase mutation reduces the stability of the integrase-viral DNA complex as well as ST activity (14). Together, these results suggest that integrase Gln-148 resides in the region of the 5′-AC overhang and plays a role in the ST reaction.
The aim of the present study was to elucidate the molecular interactions between integrase and the viral LTR DNA ends. We find the cytosine in the 5′-AC overhang (hereafter referred to as the "5′-C", Fig. 1A) and integrase Gln-148 to be critical for ST, especially in the presence of magnesium, the probable biological metal cofactor. We demonstrate the proximity of the 5′-C and Gln-148 through disulfide cross-linking, and propose that ST requires the formation of a hydrogen bond between the cytosine N-3 and the glutamine amine group. These results directly relate to the mechanism and structure of the ST reaction as well as to the mechanism of action of integrase inhibitors that block 3′-P and ST.

EXPERIMENTAL PROCEDURES
Oligonucleotide Synthesis-Oligonucleotides, except those used for disulfide cross-linking, were commercially synthesized by Integrated DNA Technologies (Coralville, IA). O-4-Triazolyl-dU-CE phosphoramidite was purchased from Glen Research (Sterling, VA). Oligonucleotides containing convertible nucleosides were synthesized on an Applied Biosystems 392 DNA synthesizer. The two-carbon tether (cystamine) was added post-synthetically using the convertible nucleoside approach (15), reduced with excess dithiothreitol, and purified using a NAP-5 column. Incubation of the DNA with a 10-fold molar excess of 5,5′-dithiobis(2-nitrobenzoic acid) (Sigma) in pH 8.5 phosphate buffer and subsequent ethanol precipitation provided the activated DNA (16). The activated DNA was annealed with the complementary strand before cross-linking experiments were performed.
Mutagenesis-Integrase mutants were created using the Stratagene QuikChange site-directed mutagenesis kit (La Jolla, CA), according to the manufacturer's recommendations. Primers containing mutations were as follows: for C56S, 5′-CAAGTAGACAGTAGCCCAGGA-3′; for C65S, 5′-GGCAGCTAGATTCTACACATTTAG-3′; for C280S, 5′-GGCAGGTGATGATAGTGTGGCAAGTAG-3′; for Q148N, 5′-CCCCAAAGTAACGGGGTAATAG-3′; for Q148C, 5′-CCCCAAAGTTGCGGGGTAATAG-3′. A complementary primer was used for each mutant. The presence of desired mutations and the integrity of the remainder of the integrase sequence were verified by DNA sequencing.
* This research was supported in part by the Intramural Research Program of NCI, Center for Cancer Research, National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
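The codon changes carried by the two Q148 mutagenic primers can be checked with a short translation sketch. The reading frame used here (the first primer base taken as the first base of a codon) is an assumption inferred from the flexible loop sequence, not something stated in the methods, and the helper names are illustrative only.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate the complete codons of `dna`, starting at offset `frame`."""
    seq = dna.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the complementary primer, written 5' to 3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(dna.upper()))


# Primer sequences as listed in the Mutagenesis paragraph.
PRIMERS = {
    "Q148N": "CCCCAAAGTAACGGGGTAATAG",
    "Q148C": "CCCCAAAGTTGCGGGGTAATAG",
}

# ASSUMPTION: frame 0, i.e. the primer starts on a codon boundary.
peptides = {name: translate(seq) for name, seq in PRIMERS.items()}

for name, peptide in peptides.items():
    print(name, peptide, reverse_complement(PRIMERS[name]))
```

Under that frame assumption, the fourth codon of each primer is the one that differs between the two mutants, which is consistent with position 148 lying between Ser-147 and Gly-149.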
Integrase Purification-Recombinant wild-type or mutant integrase was purified from Escherichia coli as described (17) with the addition of 10% glycerol to all buffers. Reducing agents were omitted from the elution and dialysis steps of proteins used in disulfide cross-linking reactions.
Integrase Reactions-Integrase reactions contained 400 nM integrase, 20 nM DNA, 7.5 mM MgCl2 or MnCl2, 7.5 mM NaCl, 14 mM 2-mercaptoethanol, and 20 mM Mops, pH 7.2. Magnesium was used as the divalent metal unless otherwise noted. Reactions were incubated at 37°C for 1 h and quenched by addition of an equal volume of gel loading dye (formamide containing 0.25% bromphenol blue and xylene cyanol). Products were separated on 20% polyacrylamide denaturing sequencing gels. Dried gels were visualized using a 445 SI PhosphorImager (Amersham Biosciences). Densitometry analyses were performed using ImageQuant software (Amersham Biosciences).
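As a worked illustration of the reaction composition above, the sketch below converts the stated final concentrations into pipetting volumes with the usual dilution relation C1V1 = C2V2. The stock concentrations and the 10 µl reaction volume are hypothetical placeholders chosen for the example; the paper does not report them.

```python
# Final concentrations from the Integrase Reactions paragraph, in micromolar.
FINAL_UM = {
    "integrase": 0.4,
    "DNA": 0.02,
    "MgCl2": 7500.0,
    "NaCl": 7500.0,
    "2-mercaptoethanol": 14000.0,
    "Mops pH 7.2": 20000.0,
}

# HYPOTHETICAL stock concentrations (micromolar), each 10x the final value.
STOCK_UM = {name: 10.0 * conc for name, conc in FINAL_UM.items()}

TOTAL_UL = 10.0  # HYPOTHETICAL reaction volume in microliters


def component_volumes(final, stock, total_ul):
    """Return the volume (ul) of each stock plus water, using C1*V1 = C2*V2."""
    volumes = {name: final[name] / stock[name] * total_ul for name in final}
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested volume")
    volumes["water"] = water
    return volumes


volumes = component_volumes(FINAL_UM, STOCK_UM, TOTAL_UL)
protein_to_dna = FINAL_UM["integrase"] / FINAL_UM["DNA"]

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
print(f"integrase:DNA molar ratio = {protein_to_dna:.0f}:1")
```

The molar ratio is independent of the assumed stocks: 400 nM integrase over 20 nM DNA is a 20-fold excess of protein over DNA duplex.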
Schiff Base Cross-linking-Inhibition of DNA binding by 5CITEP was examined by using a 5′-32P-labeled oligonucleotide containing an abasic site (X) substitution for the adenine of the conserved 5′-CA, 5′-GTGTGGAAAATCTCTAGCXGT, annealed to the thiol-modified DNA. Q148C/SSS integrase (500 nM) was incubated with 111 μM 5CITEP for 15 min at room temperature in the presence of 50 mM Mops, pH 7.2, 143 mM 2-mercaptoethanol, and 7.5 mM MgCl2. Schiff base DNA (20 nM) was added, and the reaction was incubated at 37°C for 10 min. NaBH4 (100 mM) was added to stabilize the Schiff base cross-link, followed by an equal volume of SDS-polyacrylamide gel loading buffer. Samples were heated at 95°C for 3 min and electrophoresed by SDS-PAGE. For a more detailed description of the assay see Ref. 18.

RESULTS
The 5′-C Is Required for ST-Integrase 3′-P and ST activities can be measured by using a duplex DNA derived from the terminal 21 bp of the HIV-1 U5 LTR (Fig. 1A). The base pair on the 3′-side of the integrase 3′-P site, which normally is guanine/cytosine, was replaced by each of the other 15 possible base pair combinations at the X/Y position (Fig. 1B) to examine the importance of the base pair sequence for the 3′-P and ST reactions. With a full-length substrate, the 3′-P reaction must occur prior to ST. Only two combinations fully inhibited 3′-P (T/A and T/T) and therefore also inhibited subsequent ST in magnesium-containing reactions (Fig. 1C). The presence of an upper strand guanine (X = G) or a lower strand cytosine (Y = C), both corresponding to the natural sequence, led to the most efficient 3′-P in the presence of magnesium. Notably, cytosine as the lower strand base (Y = C) was strongly preferred for ST, regardless of the identity of the upper strand base. When reactions were performed with manganese, the DNA base requirements for 3′-P and ST were less stringent (Fig. 1D). The identity of the base pair had no effect on 3′-P, and all of the base combinations yielded some level of ST. Yet, even with manganese, the 5′-C on the lower strand was preferred for ST. Removal of the terminal 5′-nucleotide (dA; see Fig. 1A) had no effect on ST (Fig. 2). By contrast, when the 5′-C was removed, integrase failed to produce ST products but retained a normal level of 3′-P (Fig. 2). These results demonstrate the selective requirement of the 5′-C for ST.
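The substrate panel described above can be enumerated as a quick sanity check on the counts. This is only a sketch of the combinatorics; the variable names are illustrative, and no activity data are encoded.

```python
from itertools import product

BASES = "ACGT"
# (upper-strand X, lower-strand Y) at the position 3' of the processing site;
# G/C is the natural pair stated in the text.
NATURAL = ("G", "C")
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

all_pairs = list(product(BASES, repeat=2))             # every X/Y combination
substituted = [p for p in all_pairs if p != NATURAL]   # the non-natural ones
lower_c = [p for p in all_pairs if p[1] == "C"]        # Y = C, any upper base
mismatched = [p for p in all_pairs if p not in WATSON_CRICK]

print(len(all_pairs), "combinations in total")
print(len(substituted), "substituted combinations")
print("Y = C substrates:", ["/".join(p) for p in lower_c])
print(len(mismatched), "combinations are not Watson-Crick pairs")
```

Four bases on each strand give 16 X/Y combinations, so 15 remain once the natural G/C pair is set aside, and most of the panel consists of mismatched pairs.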
To examine which portion of the 5′-C confers efficient ST, exocyclic substitutions were introduced sequentially to convert the base from cytosine (preferred) to thymine (nonpermissive) (Fig. 3A). Addition of a 5-methyl group had no effect on ST (Fig. 3B). Conversion of the 4-amine to an O-4-methyl group resulted in a slight enhancement of the ST reaction. The O-4-methyl group can be a weak hydrogen bond acceptor, in contrast to the hydrogen bond donating amine group of cytosine. ST remained effective without any substituent at position 4 (zebularine), indicating that this position does not play a role in ST. Most critically, ST was lost when the N-3 group was changed to NH, as in thymine, which reverses the direction of hydrogen bonding at this position. Also, the ST observed in the presence of the O-4-methyl group indicates that the thymine carbonyl is not inhibitory to ST (Fig. 3). These results suggest that the N-3 group, as a hydrogen bond acceptor, makes a critical contact with integrase or the target DNA for ST.
Glutamine 148 of Integrase Promotes ST-We next looked for a potential integrase hydrogen bond donor residue. An alignment of the retroviral flexible loop residues through the adjacent catalytic glutamate residue, together with the DNA base corresponding to the 5′-C, is shown in Table 1. Of the potential hydrogen bond donating residues, only Asn-144 and Gln-148 are conserved and required for viral replication (19,20). We chose those residues as potential hydrogen bond donors, as well as Tyr-143 because of published results indicating that this amino acid cross-linked the 5′-A of the LTR tip (13). Additionally, residues Thr-122 and Lys-127 were selected from modeling studies of integrase and viral DNA (R. Karki and M. Nicklaus, personal communication). Mutating Thr-122 (to Ala or Cys) or Tyr-143 (to Ser, Phe, or Ile) had no effect on ST. Mutating Asn-144 (to Lys, Ala, or Cys) had a minimal effect on ST. Mutation of Lys-127 (to Ala or Cys) blocked 3′-P and ST (data not shown). Therefore, positions 122, 127, 143, and 144 were not investigated further. Fig. 4A shows the close proximity of Gln-148 (blue) to the three acidic, catalytic amino acids (red) (from the crystal structure 1BI4 (21)). We created the mutants Q148N and Q148A to examine the efficiency of ST when the amino acid 148 side chain is shortened by one methylene group or changed to a methyl group (Fig. 4B). Q148N completed 3′-P at a level similar to wild-type integrase but had a 40% reduction in ST activity (Fig. 4B and data not shown). Q148A had a slight decrease in 3′-P and lacked ST. Other Gln-148 mutants (Glu and Lys) were also deficient for ST (data not shown). The 5′-C requirement was tested with the Q148N mutant, and the results were similar to wild-type integrase (see Fig. 1) for both 3′-P and ST, except for the globally lower ST efficiency (data not shown).

TABLE 1 Alignment of flexible loop residues for selected retroviruses
The base refers to the identity of the base corresponding to the 5′-C in the HIV LTR (see Fig. 1A). Boldface letters indicate amino acid residues differing from the HIV sequence. The underline indicates the conserved glutamine (Gln-148 in HIV-1).
Direct Cross-linking of Integrase Gln-148 to the 5′-C-Because changes to either the 5′-C or Gln-148 diminished ST, and both the 5′-C and Gln-148 are plausible partners for hydrogen bonding, we tested their potential direct interaction using disulfide cross-linking. Disulfide cross-linking between proteins and DNA has been accomplished for several protein-DNA complexes, including HIV-1 reverse transcriptase and DNA repair proteins (for review see Ref. 22). Additionally, the integrase mutant E246C has been cross-linked previously with viral DNA substrate at the seventh base (adenine) from the 5′-end of the lower DNA strand by using a similar method (23).
Three surface cysteine residues of integrase were mutated to serine to eliminate background cross-linking. Therefore, in cross-linking experiments the control protein had the mutations C56S/C65S/C280S ("SSS"). Sequential mutations indicated Cys-56 was responsible for background cross-linking, in contrast to a previous study where Cys-65 and Cys-280 provided background cross-linking to the 5′-adenine of the lower DNA strand (23). The mutation Q148C was added to the triple serine mutant integrase to place a reactive sulfur at residue 148 ("Q148C/SSS").
The activity of these mutants is shown in Fig. 4B. The triple serine mutant (SSS) retains wild-type activity for both 3′-P and ST. The Q148C/SSS mutant exhibits no ST and altered 3′-P specificity. 3′-P was reduced at the canonical cleavage site (CA↓GT-3′, generating the 19-mer) and enhanced at the adjacent site (CAG↓T-3′), resulting in the formation of a 20-mer product and a similar level of total 3′-P. Altered 3′-P has also been reported for a Q148L mutant of HIV-2 integrase with enhanced generation of circular dinucleotide products (24). Thus, it is possible that Gln-148 helps position the 3′-P reactants, allowing dinucleotide release, and the role of Gln-148 in ST becomes important only once the 5′-C is unpaired. Activity of the Q148C/SSS mutant was tested with precleaved DNA in the presence of manganese. These conditions permitted a partial "rescue" (19%) of ST. Presumably the rescue occurs with manganese, and not magnesium, because of the ability of manganese to bind cysteine (25).
The cross-linking components and reaction are shown schematically in Fig. 5A. DNA was synthesized with an alkanethiol tether at the 5′-C position. The tether was activated by 5,5′-dithiobis(2-nitrobenzoic acid) to yield the substrate shown before reaction with integrase in Fig. 5A.
The DNA substrates used to examine position-specific cross-linking are shown in Fig. 5B. Cross-linking of the Q148C/SSS integrase to the thiol-modified 5′-C-containing DNA (DNA X-1) was observed as the band migrating between the monomer and dimer species of integrase (highlighted by horizontal arrow in Fig. 5C). Cross-linking efficiency was between 30 and 50% relative to the monomer concentration. The cross-link band was not observed when unmodified DNA was used (Fig. 5, C and D). As expected, dithiothreitol reduced the level of protein-DNA cross-link, and no cross-link was observed with the SSS integrase lacking Q148C (Fig. 5C). Thus, the core domain surface residue Cys-130 does not cross-link with the thiol-containing DNA. Cross-linking was not observed when the reactive thiol group was placed distal from the 5′-end of the duplex (DNA X-2), which rules out nonspecific cross-linking (Fig. 5D). Finally, cross-linking was observed in the presence of manganese and magnesium, with precleaved or full-length DNA (Fig. 5, E and F), in agreement with the previous observation of DNA binding in the absence and presence of metal (26,27). The recently reported flexibility of the viral DNA end may allow cross-linking of nonprocessed substrate (28). Together, these experiments indicate that the integrase residue 148 is in the vicinity of the cytosine of the 5′-overhang and that catalysis is not required for DNA cross-linking.
Cross-linking Interference by Integrase Inhibitors-Diketo acid derivatives have emerged recently as specific ST inhibitors with antiviral activity (29,30) and are reviewed in Refs. 3 and 31. Interaction between 5CITEP and Gln-148 was observed crystallographically (25). Inhibition of cross-linking by the integrase inhibitor 5CITEP was examined at concentrations of integrase and DNA used in standard catalytic experiments in the presence of magnesium. The DNA was radioactively labeled to permit visualization of integrase-DNA complexes. A time course experiment indicated cross-linking was complete in less than 2 min (Fig. 6, A and B). Addition of 5CITEP decreased the rate of cross-link formation. A 1-min cross-linking time was chosen to examine the inhibition of cross-linking at different 5CITEP concentrations (Fig. 6C). The IC50 of cross-link inhibition was 29 μM. At 111 μM, 5CITEP had no effect on overall binding of integrase to DNA, as measured by Schiff base assay (Fig. 6, D and E), suggesting inhibition of disulfide cross-linking reflects the 5CITEP-binding site.
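To put the two 5CITEP concentrations quoted above on a common scale, the sketch below evaluates a simple one-site inhibition curve. The model form and the Hill slope of 1 are assumptions made for illustration; the text does not state how the IC50 was fitted, and the function name is hypothetical.

```python
def fraction_remaining(conc_um, ic50_um, hill=1.0):
    """Fraction of uninhibited signal under an ASSUMED one-site model:
    1 / (1 + (conc / IC50) ** hill)."""
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentrations must be non-negative, IC50 positive")
    return 1.0 / (1.0 + (conc_um / ic50_um) ** hill)


IC50_CROSSLINK_UM = 29.0  # cross-link inhibition IC50 reported in the text

at_ic50 = fraction_remaining(29.0, IC50_CROSSLINK_UM)
at_111 = fraction_remaining(111.0, IC50_CROSSLINK_UM)

print(f"predicted cross-link signal at 29 uM 5CITEP:  {at_ic50:.2f}")
print(f"predicted cross-link signal at 111 uM 5CITEP: {at_111:.2f}")
```

Under these assumptions, roughly one fifth of the disulfide cross-link signal would be expected to remain at 111 μM, the concentration at which the Schiff base assay showed no effect on overall DNA binding.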

DISCUSSION
The present study demonstrates that efficient ST requires both the presence of a 5′-C at the end of the donor (viral) DNA and Gln-148 in the flexible loop of HIV-1 integrase. The 5′-C is conserved in the LTR sequences of HIV-1, HIV-2, simian immunodeficiency virus, and feline immunodeficiency virus (Table 1). In contrast, avian sarcoma virus and Rous sarcoma virus LTRs contain an adenine at that position, where the nitrogen at the first ring position could contribute to a similar interaction in an appropriately structured active site. Gln-148 is conserved among retroviral integrases, including HIV-1, simian immunodeficiency virus, avian sarcoma virus, Rous sarcoma virus, and feline immunodeficiency virus (32). Gln-148 resides in a flexible loop that is disordered in most integrase crystal structures or crystallized in a conformation that is unsuitable for catalysis (4-9). It has been suggested that the loop becomes ordered upon DNA binding and acts to stabilize the 5′-end of the viral DNA (5,13).
Mutations that reduce the flexibility of the loop containing Gln-148 impair catalysis without affecting DNA binding (4). A conformational change following 3′-P has been observed (23). This flexibility might allow Gln-148 and the 5′-C to interact following the 3′-P reaction, allowing efficient ST. This structural change could involve movement of the flexible loop, rotation of the 5′-C base out of the DNA duplex, or both, as illustrated in Fig. 7. In our model, release of the 5′-GT dinucleotide provides rotational freedom to the lower strand 5′-C, leaving the area near the catalytic 3′-hydroxyl available for subsequent ST. The 5′-AC overhang is then anchored away from the catalytic 3′-hydroxyl by a hydrogen bond between the Gln-148 side chain amine and the cytosine N-3 group. Note that integrase is active as a multimer, and our model does not confine the active site performing catalysis and the Gln-148 facilitating ST to the same subunit.
A precedent for a similar glutamine-cytosine N-3 hydrogen-bonding interaction exists for human deoxycytidine kinase (33). Deoxycytidine kinase efficiently phosphorylates deoxycytidine but also deoxyguanosine and deoxyadenosine by interaction with Gln-97. The bound base dictates rotation of the glutamine side chain into an appropriate hydrogen-bonding position. Discrimination of deoxycytidine by deoxycytidine kinase is achieved through H-bonding of the cytosine N-3 with the amino group of Gln-97 (33).
The use of magnesium in the integrase active site is an important component of our model. Divalent metals influence the effectiveness of integrase inhibitors (34,35), integrase sequence specificity during DNA binding (13,36), 3′-P (Fig. 1) and ST (37), 3′-P nucleophile selection (24,38), and the sensitivity of drug-resistant mutant integrases to inhibitors (34). Integrase undergoes a divalent metal-induced protein conformational change (39), which may enhance specific recognition of the viral DNA end (26). The strict coordination requirements of magnesium (40) and the base specificity we observed in the presence of magnesium in vitro may point to the importance of the cytosine base requirement during ST in vivo.
This specificity may be reduced in our in vitro studies done in the presence of manganese because of altered metal coordination (40). Magnesium, a hard metal, prefers coordination to hard atoms such as oxygen (reviewed in Ref. 41). Coordination of magnesium may dictate that the Gln-148 carbonyl group supplies a direct or water-mediated contact to magnesium (Mg-specific contact in Fig. 7). This contact restricts rotation of the Gln-148 side chain and pulls electrons from the glutamine amino group, facilitating H-bond formation (Fig. 7). As a result, the side chain amine is directed to specifically contact and stabilize the 5′-C during ST. In contrast, the Gln-148 carbonyl group may not be needed in the same capacity in the presence of manganese, a soft metal. An altered metal coordination could allow the conformation of both of the glutamine side chain functional groups to accommodate any of the four DNA bases. This would explain the loss of stringency for the base that stabilizes the viral DNA during ST in the presence of manganese, a loss that may be biologically irrelevant.
Because of the flexibility of the loop containing Gln-148 and the single-stranded nature of the 5′-C following 3′-P, other loop residues may contact the 5′-C. The loop contains residues 140-GIPYNPQSQG-149, with the underlined residues capable of acting as H-bond donors. The functional relevance of Tyr-143 and Gln-146 can be excluded because mutants have no effect on viral replication (19), and Ser-147 is not conserved among retroviruses. We tested N144C in the disulfide cross-linking assay. Only a small amount of cross-linking was observed, and furthermore, aggregation of the protein, presumably from protein self-cross-linking, precluded accurate assessment of integrase-DNA cross-link specificity (data not shown).
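The residue numbering of the loop, and which positions carry a side chain that can donate a hydrogen bond, can be tabulated with a few lines of code. The set of donor-capable amino acids below is a simplified textbook assumption (side-chain OH, NH, or SH groups), not a list taken from this paper.

```python
LOOP_START = 140
LOOP_SEQUENCE = "GIPYNPQSQG"  # residues 140-149 as given in the text

# ASSUMPTION: one-letter codes of amino acids whose side chains can donate
# a hydrogen bond (Tyr, Asn, Gln, Ser, Thr, Lys, Arg, His, Trp, Cys).
SIDE_CHAIN_DONORS = set("YNQSTKRHWC")

# Map each residue number to its one-letter amino acid code.
numbered = {LOOP_START + i: aa for i, aa in enumerate(LOOP_SEQUENCE)}

# Keep only positions whose side chain could act as an H-bond donor.
donor_positions = {
    pos: aa for pos, aa in numbered.items() if aa in SIDE_CHAIN_DONORS
}

for pos, aa in donor_positions.items():
    print(pos, aa)
```

With this simplified donor set, the candidate positions are 143, 144, 146, 147, and 148, which matches the residues discussed individually in the paragraph above.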
Gln-148 may form part of the binding pocket for diketo acid integrase inhibitors, which are selective against the integrase ST reaction (for review see Ref. 3). A co-crystal between integrase and 5CITEP revealed a hydrogen bond between Gln-148 and the nitrogen of the indole ring of 5CITEP (42), and docking studies suggest the participation of Gln-148 in the drug-binding pocket for several other integrase inhibitors (35,43,44). A drug-resistant mutation, Q148K, developed after exposure to S-1360, a diketo acid integrase inhibitor in clinical trial, and this mutant exhibited poor viral replication, lending support to the important role of Gln-148 (20). Diketo acids have been proposed to bind the integrase-DNA complex and block ST by interacting with the 5′-AC overhang (45). We observed that the cross-link between the LTR 5′-C and residue 148 of integrase was inhibited by 5CITEP with an IC50 of 29 μM (Fig. 6, C and D). 5CITEP inhibits 3′-P and ST at IC50 values of 35 and 0.65 μM, respectively (45). Inhibition of Q148C/5′-C cross-linking by 5CITEP suggests the binding site for this inhibitor overlaps the region of the Gln-148/5′-C interaction. This is supported by the lack of inhibition of Schiff base formation between integrase and an abasic site substitution for the conserved adenine at the 3′-P site (Fig. 6, D and E). The cross-linking approach could be used to scan the molecular contacts between integrase and its DNA substrates, as well as inhibitor-binding sites. Stabilization of integrase-DNA complexes may also enable co-crystal structure determination.