Dynamic Modulation of HIV-1 Integrase Structure and Function by Cellular Lens Epithelium-derived Growth Factor (LEDGF) Protein

The mandatory integration of the reverse-transcribed HIV-1 genome into host chromatin is catalyzed by the viral protein integrase (IN), and IN activity can be regulated by numerous viral and cellular proteins. Among these, LEDGF has been identified as a cellular cofactor critical for effective HIV-1 integration. The x-ray crystal structure of the catalytic core domain (CCD) of IN in complex with the IN binding domain (IBD) of LEDGF has furthermore revealed essential protein-protein contacts. However, mutagenic studies indicated that interactions between the full-length proteins were more extensive than the contacts observed in the co-crystal structure of the isolated domains. Therefore, we have conducted detailed biochemical characterization of the interactions between full-length IN and LEDGF. Our results reveal a highly dynamic nature of IN subunit-subunit interactions. LEDGF strongly stabilized these interactions and promoted IN tetramerization. Mass spectrometric protein footprinting and molecular modeling experiments uncovered novel intra- and inter-protein contacts in the full-length IN-LEDGF complex that lie outside of the observable IBD-CCD structure. In particular, our studies defined the IN tetramer interface important for enzymatic activities and high affinity LEDGF binding. These findings provide new insight into how LEDGF modulates HIV-1 IN structure and function.

Abbreviations: D, donor; FS, full-site; HS, half-site; IBD, integrase-binding domain; IN, integrase; mt, mutant; NTA, nitrilotriacetic acid; NTD, N-terminal domain; CCD, catalytic core domain; CTD, C-terminal domain; LEDGF, lens epithelium-derived growth factor; NHS, N-hydroxysuccinimide; HPG, p-hydroxyphenylglyoxal; S, substrate; P, product; PIC, preintegration complex; Di, dimer; Tet, tetramer; M, modifier; MOPS, 4-morpholinepropanesulfonic acid; CHAPS, 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonic acid.

Integration of the reverse-transcribed RNA genome into a host chromosome is an obligatory step for HIV-1 replication (reviewed in Ref. 1). This process is catalyzed by the retroviral enzyme integrase (IN) in two reaction steps. In the first step, which is called 3′-processing and takes place shortly after the cDNA is made, IN hydrolyzes a GT dinucleotide from each end of the viral DNA. In the second step, IN catalyzes concerted integration of the processed viral DNA ends into chromosomal DNA. The sites of attack on the two target DNA strands are separated by 5 bp, which leads to dissociation of the small double-stranded DNA fragment between the attachment sites. The subsequent repair of the intermediate species by cellular enzymes completes the integration reaction.
HIV-1 IN consists of three distinct structural and functional domains. The N-terminal domain (NTD) (residues 1-50) contains conserved pairs of histidine and cysteine residues that bind zinc (2,3), which contributes to IN multimerization and its catalytic function (4,5). The catalytic core domain (CCD) (residues 51-212) contains three acidic residues, Asp-64, Asp-116, and Glu-152, which play a key role in coordinating active site divalent metal ions (6,7). The C-terminal domain (CTD) (residues 213-288) also contributes to functional IN multimerization (8,9). Structural studies revealed each individual domain as a dimer (3,6,7,10,11), and more recent two-domain crystal structures comprising the CCD and CTD (12) or the NTD and CCD (13) likewise unveiled dimeric organizations. Functional studies suggested that a dimer of full-length IN could suffice to process each 3′ end, whereas a tetramer is required to integrate both viral DNA ends into chromosomal DNA (14-16). Efforts to determine the complete IN structure have been impeded by limited protein solubility and/or the inherent flexibility of the three-domain enzyme. Interestingly, full-length IN exists as a mixture of monomers, dimers, tetramers, and higher order species in the absence of DNA (9, 17-19).
While in vitro analysis with model DNA substrates demonstrated that IN alone could catalyze 3′-processing and DNA strand transfer reactions, the in vivo function of the enzyme is likely to be regulated by a number of viral and cellular proteins. Following the completion of reverse transcription, the newly synthesized cDNA remains associated with several viral proteins and recruits host factors to form the preintegration complex (PIC) (20-29). Of these, transcriptional co-activator p75, also known as lens epithelium-derived growth factor (LEDGF), is the principal cellular interactor of HIV-1 IN (27-29). A number of recent studies have indicated that LEDGF is critically important for effective HIV-1 integration and viral replication (30-33). RNA interference (RNAi)-mediated knock-down of endogenous LEDGF to below detectable levels reduced infection to 3.5% of that observed in normal cells (33). Similarly, significantly reduced levels of HIV-1 infection were detected in LEDGF knock-out mouse embryo fibroblasts (34,35). Expression of recombinant HIV-1 IN in human cells revealed that LEDGF protects the viral protein from proteasomal degradation and tethers it to chromosomal DNA (25, 28, 36-38). Accordingly, LEDGF primarily functions during HIV-1 infection to tether PICs to active genes during integration (34). In vitro assays with purified recombinant proteins furthermore demonstrated that LEDGF binds directly to IN, which significantly stimulates its enzymatic activities (27, 39-44).
The N-terminal part of LEDGF contains a PWWP domain, a nuclear localization signal, and two copies of the AT-hook DNA binding motif (reviewed in Ref. 45 and Fig. 1). These conserved elements primarily mediate LEDGF association with chromatin (38,40). An evolutionarily conserved domain in the C-terminal region (residues 347-429) mediates the interaction with IN and was thus termed the IN-binding domain (IBD) (39,46). The solution structure of the LEDGF IBD and its co-crystal structure with the IN CCD have recently been reported (47,48). Interestingly, the IBD docks into a relatively small cavity at the CCD dimer interface, contacting both IN subunits (48). The importance of the interacting amino acids revealed by the crystal structure has been validated by site-directed mutagenesis in the context of full-length recombinant proteins (29,36,47,49,50) and by the outgrowth of resistant viral strains in the presence of a dominant-interfering LEDGF fragment (32). However, mutagenesis studies have also indicated that full-length IN-LEDGF interactions extend beyond the contacts observed in the co-crystal structure of the isolated domains (28).
We have used a number of innovative biochemical approaches to characterize the structural and mechanistic basis of the interaction between the full-length partners. These approaches revealed a highly dynamic nature for the interactions between free IN subunits. LEDGF moreover strongly stabilized the IN subunit-subunit contacts. Mass spectrometric surface topology studies furthermore uncovered novel protein-protein contacts, which lie outside of the central IBD-CCD co-crystal structure. Mutational analysis confirmed the importance of the identified residues and indicated a strong correlation between IN tetramer formation and high affinity LEDGF binding. These findings provide new insight into how LEDGF modulates HIV-1 IN structure/function, and highlight the potential to exploit the highly dynamic nature of IN subunit interactions as a novel therapeutic target.

EXPERIMENTAL PROCEDURES
Expression Plasmids and Recombinant Proteins-HIV-1 IN proteins were expressed from pKBIN6Hthr, which was derived from pKB-IN6H (28) by replacing amino acids VDKLAAALE upstream from the C-terminal His6 affinity tag with LVPRGSALE (which introduces the thrombin cleavage site LVPRGS) by PCR-directed mutagenesis. Mutations were also introduced into pKBIN6Hthr using PCR, and the coding regions of plasmids created via PCR were verified by DNA sequencing. Wild-type and mutant IN proteins were purified according to the previously described procedure (41). Purified recombinant LEDGF, mutant (mt) LEDGF, and IBD (Fig. 1) were obtained as described previously (28,39,47).
IN 3′-Processing and DNA Strand Transfer Activities-A 32P-labeled 21-mer synthetic double-stranded DNA (50 nM) mimicking the U5 viral end sequence was used as substrate. The concentrations of wild-type and mutant IN proteins as well as LEDGF and LEDGF IBD included in the reactions are indicated in the figure legends. The reactions were carried out at 37°C for 1 h in buffer containing 50 mM MOPS (pH 7.2), 2 mM β-mercaptoethanol, 10 mM MnCl2, 1 mM CHAPS, and 50 mM NaCl, and were stopped with 50 mM EDTA. Reaction products were subjected to denaturing polyacrylamide gel electrophoresis and visualized using a Storm 860 Phosphorimager (Amersham Biosciences).
Concerted Integration Assay-These assays were performed as described previously (43). Briefly, the 972 bp ScaI-DraIII restriction fragment from pU3U5 (51) served as donor DNA and was 5′-end-labeled with [γ-32P]ATP and T4 polynucleotide kinase. HIV-1 IN (400 nM) was assembled with the labeled donor substrate (18 nM) in the presence of 20 mM HEPES (pH 7.0), 5 mM dithiothreitol, 10 mM MgCl2, 25 µM ZnCl2, 100 mM NaCl, 5% DMSO, and 10% PEG 6000. Ligands (IBD or LEDGF) were added before preincubation for 20 min at room temperature. Reactions (25-µl final volume) were initiated by adding 500 ng of circular target DNA (pGEM, Promega), and the mixtures were incubated for 1 h at 37°C. Reactions were stopped by adding 10 mM EDTA, 0.2% SDS, and 1 mg/ml proteinase K. After ethanol precipitation, samples were subjected to 0.6% agarose gel electrophoresis for 6 h at 50 V. The gels were dried, and the labeled DNA products were detected using the Storm 860 Phosphorimager.

Fig. 1 (legend fragment): ... (47,50). In contrast, the IBD interacts with IN but lacks the ability to bind DNA and chromatin (38,40).
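The stoichiometry implied by the concerted integration conditions can be made explicit with a short calculation. The sketch below is illustrative arithmetic only: it uses the IN and donor DNA concentrations and the reaction volume stated above, and the helper names (`picomoles`, `molar_ratio`) are hypothetical. It assumes only that each linear donor fragment presents two DNA termini.

```python
# Illustrative arithmetic for the concerted integration assay conditions.
# Concentrations and volume are taken from the text; helper names are hypothetical.

def picomoles(conc_nM, volume_ul):
    """Amount in pmol. nM x ul gives fmol, so divide by 1000."""
    return conc_nM * volume_ul / 1000.0


def molar_ratio(conc_a_nM, conc_b_nM):
    """Molar ratio of species A to species B at the stated concentrations."""
    return conc_a_nM / conc_b_nM


IN_NM = 400.0        # HIV-1 IN concentration stated in the text
DONOR_NM = 18.0      # labeled donor DNA concentration stated in the text
VOLUME_UL = 25.0     # final reaction volume stated in the text
ENDS_PER_DONOR = 2   # assumption: a linear fragment has two termini

in_pmol = picomoles(IN_NM, VOLUME_UL)
donor_pmol = picomoles(DONOR_NM, VOLUME_UL)
in_per_donor = molar_ratio(IN_NM, DONOR_NM)
in_per_end = in_per_donor / ENDS_PER_DONOR

print(f"IN: {in_pmol:.2f} pmol, donor DNA: {donor_pmol:.2f} pmol")
print(f"IN protomers per donor molecule: {in_per_donor:.1f}")
print(f"IN protomers per DNA terminus: {in_per_end:.1f}")
```

Under these conditions IN protomers are in roughly 11-fold molar excess over donor DNA termini, which is more than the four protomers of a single tetramer per pair of ends.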
Subunit Exchange Assay-His-tagged IN (1 µM) was preincubated with or without ligand (2 µM LEDGF or 2 µM mtLEDGF) in exchange buffer (25 mM HEPES, pH 7.1, 200 mM NaCl, 4% glycerol, 2 mM β-mercaptoethanol) for 30 min at room temperature. Tag-free IN (1 µM) was then added and incubated for the indicated times. Aliquots were then briefly centrifuged for 2 min at 1,000 × g, and supernatants were pulled down with Ni-nitrilotriacetic acid (NTA) resin (GE Healthcare) for 10 min in the presence of bovine serum albumin (0.1 mg/ml). The IN-bound resin was then washed three times with buffer containing 50 mM HEPES (pH 7.1), 200 mM NaCl, 2 mM MgCl2, 100 mM imidazole, and 0.1% (v/v) Nonidet P40. The bound proteins were subjected to SDS-PAGE separation and visualized by Coomassie Blue stain.
Mass Spectrometric Footprinting-In parallel reactions, free IN and IN+LEDGF were first incubated at room temperature for 30 min and then subjected to treatments at 37°C with 1 mM N-hydroxysuccinimide (NHS)-biotin for 30 min or 20 mM p-hydroxyphenylglyoxal (HPG) for 60 min. These concentrations of modifying reagents were chosen because comparative pull-down experiments with untreated and modified IN-LEDGF complexes indicated that under these conditions the integrity of the preassembled protein-protein complex was fully preserved (data not shown). NHS-biotin treatment was carried out in buffer containing 50 mM HEPES (pH 8.0), 150 mM NaCl, and 10 mM MgCl2. The HPG modifications were performed in 50 mM HEPES (pH 8.0), 50 mM boric acid, and 150 mM NaCl. The reactions were quenched with an excess of free Lys and Arg. IN-LEDGF complexes were selectively pulled down using Ni-NTA resin. The bound proteins were separated by denaturing SDS-PAGE and visualized by Microwave Blue stain (Protiga, Gaithersburg, MD). IN bands were excised, destained, and subjected to in-gel proteolysis with 0.5 µg of trypsin. The tryptic peptides were analyzed with the Axima-CFR MALDI-ToF instrument (Shimadzu) using α-cyano-4-hydroxycinnamic acid as a matrix.
Molecular Modeling-The model of the NTD-CCD tetramer bound to the LEDGF IBD was generated by overlaying the CCDs within PDB structures 2B4J (48) and 1K6Y (13) using the Insight II software package (Accelrys Inc., San Diego) on a Silicon Graphics O2 workstation. The constructed model was then energy-minimized with the same software package using the CFF91 force field and the steepest descent method.
LEDGF Binding Affinities to Wild-type and Mutant INs-LEDGF (50-650 nM) was incubated with 100 nM His-tagged IN (WT or mutant) in binding buffer (50 mM HEPES (pH 7.1), 200 mM NaCl, 2 mM MgCl2, 100 mM imidazole, 0.1% (v/v) Nonidet P40) for 60 min at room temperature. Samples were then briefly centrifuged for 2 min at 1,000 × g, and supernatants were pulled down with Ni-NTA resin for 30 min in the presence of bovine serum albumin (0.1 mg/ml). The resin was then washed three times with the same buffer, and the bound proteins were separated by SDS-PAGE. LEDGF was detected by Western blot analysis using a mouse monoclonal LEDGF antibody (BD Biosciences) and quantified using Image software (NIH). Plotting and curve fitting were performed with Origin 8 software (OriginLab). Nonspecific signal was not detected when LEDGF was incubated with Ni-NTA beads in the absence of IN (data not shown).

RESULTS
We previously reported that LEDGF significantly stimulated the in vitro activities of HIV-1 IN whereas the isolated IBD failed to do so (39,40). As these assays utilized relatively long blunt-ended viral DNA substrates and DNA strand transfer product formation as read-out, we reanalyzed the effects of these two proteins on IN function using an oligonucleotide-based assay that monitors the formation of 3′-processing and DNA strand transfer reaction products on denaturing sequencing gels (Fig. 2) (52). The results in panel B (lanes 1-6) revealed stimulation of IN DNA strand transfer activity by the LEDGF IBD under these assay conditions. It should be noted that in this setting the 19-mer 3′-processing reaction product is the substrate for the second catalytic step (Fig. 2A). To determine whether the IBD directly enhanced IN 3′-processing activity, a selective IN strand transfer inhibitor (53,54) was included in the experiment (supplemental Fig. S1). The 19-mer reaction product accumulated under these conditions, revealing significant stimulation of IN 3′-processing activity by the LEDGF IBD (supplemental Fig. S1). For control experiments we analyzed the D366N point mtLEDGF, which is defective for IN binding in vitro (47) and in yeast cells (50), but retains LEDGF DNA binding activity (Fig. 1). In contrast with the LEDGF IBD, formation of 3′-processing and DNA strand transfer reaction products decreased with increasing mtLEDGF concentrations (Fig. 2B, lanes 7-12), probably due to competition between mtLEDGF and IN for binding to DNA. A bell-shaped enhancement of IN activities was observed in the presence of wild-type LEDGF (Fig. 2B, lanes 13-18), likely due to competition with IN for DNA binding at high LEDGF:IN ratios. The IBD, which is the weaker stimulant, could display this effect at relatively high stoichiometry (compare lanes 6 and 18 in Fig. 2B) because it lacks the N-terminal LEDGF regions that mediate DNA binding (40) (Fig. 1).
We next analyzed the activity of the IBD using a longer donor DNA substrate and a second circular target DNA, a design that can distinguish the formation of single-end half-site (HS) integration products from those that form by the pairwise concerted integration of two viral DNA ends (labeled full-site (FS) in Fig. 2C) (42,43). As expected (42,43), wild-type LEDGF modestly stimulated FS and HS product formation at LEDGF:IN ratios of <1, while higher LEDGF concentrations selectively inhibited FS product formation (Fig. 2D, lanes 7-9). The selective reduction of FS product formation was also observed with increasing concentrations of IBD (Fig. 2D, lanes 4-6).
The above experiments ( Fig. 2 and supplemental Fig. S1) demonstrated that direct binding of LEDGF or its IBD could differentially influence IN activities. The co-crystal structure did not show any significant changes in the tertiary structure of the CCD upon IBD binding, but did reveal that the host factor engaged both monomers of the IN dimer at the CCD interface (48). Therefore, one possible mechanism could be that LEDGF binding influences the dynamics of IN subunit-subunit interactions. Indeed, prior analyses using IBD-based peptides hinted at this possibility (55,56). We therefore devised the following experiment to analyze the dynamics of IN subunit exchange  preincubation with IN2 due to the effective exchange of IN protein subunits. Kinetic analyses determined exchange within 10 min of mixing, reflecting the highly dynamic nature of protein-protein interactions between free IN subunits under these conditions (Fig. 3B, lanes 1-4).
We next asked how the IBD and full-length LEDGF would affect IN subunit exchange. For this, tag-free IN1 pre-incubated with LEDGF was then exposed to His-tagged IN2. LEDGF effectively prevented IN subunit exchange (Fig. 3B, lanes 5-8). Very similar results were obtained with the IBD (data not shown). In contrast, due to the inability to effectively bind IN, mtLEDGF failed to affect the dynamics of IN subunit exchange (Fig. 3B, lanes 9 -12).
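As a rough guide to how much tag-free IN could be recovered by Ni-NTA pull-down once exchange is complete, the following sketch assumes that subunits from the two pools mix randomly into oligomers of a fixed size. The random-mixing assumption, the neglect of finite-pool effects, and the function name `recoverable_fraction` are ours and are not part of the experiments reported here.

```python
# A minimal sketch, assuming random (binomial) mixing of tagged and tag-free
# subunits at equilibrium. Not a model fitted to the data in this study.

def recoverable_fraction(oligomer_size, tagged_fraction=0.5):
    """Fraction of tag-free subunits residing in oligomers that contain
    at least one His-tagged partner (and are thus retained on Ni-NTA)."""
    if oligomer_size < 1:
        raise ValueError("oligomer_size must be at least 1")
    if not 0.0 <= tagged_fraction <= 1.0:
        raise ValueError("tagged_fraction must lie between 0 and 1")
    # Probability that none of the other (n - 1) partners carries a tag.
    none_tagged = (1.0 - tagged_fraction) ** (oligomer_size - 1)
    return 1.0 - none_tagged


# Equal amounts of tagged and tag-free IN, as in the exchange assay.
for name, size in (("monomer", 1), ("dimer", 2), ("tetramer", 4)):
    print(f"{name}: {recoverable_fraction(size):.3f}")
```

With equal pools, a freely exchanging dimer would allow recovery of at most half of the tag-free subunits and a tetramer up to 87.5%, whereas no tag-free IN is recovered if exchange is blocked.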
These results indicated that LEDGF or its IBD markedly affected IN subunit-subunit interactions but did not distinguish whether the cofactor stabilizes ("locks") IN into a specific multimeric state or prevents the multimer from forming by interfering with subunit-subunit interactions. To delineate how the proteins modulated IN subunit exchange, we employed size exclusion chromatography (Fig. 4). The lowest detectable concentration of IN (2.5 µM) was used to approach the concentrations employed above in activity-based assays. Under these conditions, wild-type IN exhibited two main peaks, with the predominant and minor species corresponding to tetramer and dimer, respectively (Fig. 4A and supplemental Table S1). The "shoulder" observed to the right of the dimer peak suggests that some monomeric protein was also present under these conditions. The addition of the IBD resulted in a single shifted peak with a retention time consistent with two LEDGF IBD molecules bound to the IN tetramer (Fig. 4A and supplemental Table S1). Because the CCD used in the crystallographic studies contained the F185K solubilizing mutation, we also assayed the F185K/C280S double mutant IN (dmIN), which is a soluble version of the full-length protein (9). In contrast to the wild type, dmIN contained greater quantities of the dimer than of the tetramer (Fig. 4B). As with the unmutated IN, the IBD shifted the equilibrium sharply in favor of the dmIN tetramer-IBD complex compared with the dimer dmIN-IBD complex. These experiments demonstrate that the IBD preferentially binds and stabilizes the IN tetramer.
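To put the oligomeric species discussed above on an approximate mass scale, the sketch below estimates masses from residue counts. It assumes an average residue mass of 110 Da, uses the residue ranges given in the text (IN, residues 1-288; IBD, residues 347-429), and ignores affinity tags, so the values are rough estimates and not the measured values reported in the supplemental tables.

```python
# Rough mass estimates from residue counts; an illustrative sketch only.
AVG_RESIDUE_MASS_DA = 110.0   # assumption: average mass per amino acid residue

IN_RESIDUES = 288             # full-length HIV-1 IN (domains span residues 1-288)
IBD_RESIDUES = 429 - 347 + 1  # LEDGF IBD, residues 347-429 inclusive (83 residues)


def approx_mass_kda(n_residues):
    """Approximate polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def complex_mass_kda(n_in, n_ibd=0):
    """Approximate mass of a complex of n_in IN protomers and n_ibd IBDs."""
    return n_in * approx_mass_kda(IN_RESIDUES) + n_ibd * approx_mass_kda(IBD_RESIDUES)


species = {
    "IN monomer": (1, 0),
    "IN dimer": (2, 0),
    "IN tetramer": (4, 0),
    "IN tetramer + 2 IBD": (4, 2),
}
for name, (n_in, n_ibd) in species.items():
    print(f"{name}: ~{complex_mass_kda(n_in, n_ibd):.1f} kDa")
```

On this estimate, two bound IBD molecules add roughly 18 kDa to a tetramer of about 127 kDa, a modest shift relative to the twofold difference between dimer and tetramer.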
To obtain more detailed information on the interaction between full-length IN and LEDGF, we turned to our mass spectrometric protein footprinting approach (Fig. 5). The method enables the comparison of surface topologies of free protein versus a protein-ligand complex using small, amino acid-selective chemical modifiers. The concentrations of the modifying reagents are optimized to ensure mild reaction conditions such that the integrity of the protein-ligand complex is preserved. Subsequent SDS-PAGE separation, proteolysis, and mass spectrometric analyses are carried out to reveal surface amino acids readily modified in free protein but shielded from modification by the interacting partner (Fig. 5, top). This approach has proved instrumental for studying a number of nucleic acid-protein interactions including the IN-viral DNA complex, and consistently revealed biologically relevant contact amino acids (57-64). Here, we used the method for the first time to examine protein-protein interactions. For this, free IN and pre-formed IN-LEDGF complexes were subjected in parallel reactions to treatments with Lys- and Arg-modifying reagents. Despite tight binding, the solution mixture of IN and LEDGF was likely to contain unliganded proteins in addition to the protein complex. To ensure selective analysis, we utilized His-tagged LEDGF and tag-free IN proteins. Following chemical modification, unliganded LEDGF and IN-LEDGF complexes recovered using Ni-NTA resin, and in parallel free IN, were separated by SDS-PAGE. IN amino acids protected from modification by LEDGF binding were then deciphered by comparing the modification patterns obtained with the different species of gel-isolated IN (Fig. 5, top). Control experiments utilized mtLEDGF, which lacks the ability to bind IN.
Representative mass spectrometric fragments are shown in Fig. 5, and two distinct sets of modification patterns were readily observed. Three peaks containing modified Arg-107, Lys-186, and Lys-14 were detected with free IN and IN+mtLEDGF but were markedly reduced in the IN-LEDGF complex (Fig. 5, A-C). In contrast, two peaks containing modified Arg-228 and Lys-273, as well as an unmodified fragment spanning residues 265-273, persisted in all three samples. A detailed summary of the modification patterns is presented in Table 1: of the 12 Lys and 9 Arg residues readily modified in free IN, Lys-14, Arg-107, Arg-166, Lys-186, Arg-187, and Lys-188 were selectively protected by LEDGF binding. These results are fully consistent with previous deletion analyses that revealed the CCD as the primary viral recognition determinant with the NTD donating secondary affinity-enhancing contacts. The CTD, by contrast, was dispensable for high affinity LEDGF-IN binding (28).
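The comparison underlying the protection assignments can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in this study: the peak intensities are hypothetical placeholders, the 0.5 cutoff is an arbitrary assumption, and normalization to an unmodified reference peak that persists in all samples is one plausible way to make spectra comparable.

```python
# A minimal sketch of protection calling from MALDI-ToF peak intensities.
# All intensity values below are HYPOTHETICAL and for illustration only.

def normalized_intensity(peak, reference):
    """Intensity of a modified peptide peak relative to an unmodified
    reference peak from the same spectrum."""
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    return peak / reference


def is_protected(free_norm, complex_norm, cutoff=0.5):
    """Call a residue protected when its normalized modified-peak intensity
    in the complex falls below `cutoff` times that in free protein.
    The cutoff is an assumed, arbitrary threshold."""
    if free_norm <= 0:
        raise ValueError("residue must be modified in free protein")
    return complex_norm / free_norm < cutoff


# (modified peak, reference peak) in arbitrary units; hypothetical numbers.
hypothetical_peaks = {
    "Lys-14": {"free": (800.0, 1000.0), "complex": (90.0, 1000.0)},
    "Arg-228": {"free": (600.0, 1000.0), "complex": (570.0, 1000.0)},
}

calls = {}
for residue, spectra in hypothetical_peaks.items():
    free_norm = normalized_intensity(*spectra["free"])
    complex_norm = normalized_intensity(*spectra["complex"])
    calls[residue] = is_protected(free_norm, complex_norm)

print(calls)
```

With these placeholder values the sketch flags Lys-14 as protected and Arg-228 as unaffected, mirroring the two classes of peaks described above.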
The footprinting results were next analyzed in the context of available protein structures to gain further insight into the details of the LEDGF-IN interaction. Protections of Arg-107 and Arg-166 are consistent with available data from unliganded (6, 7, 12, 13) as well as IBD-bound (48) CCD dimers. For example, Arg-166 is part of the so-called α4/5 connector that forms the primary IBD recognition determinant from one of the IN monomers (48). The guanidino group of Arg-107 furthermore directly participates in IN dimerization through interactions with Glu-85 of the other CCD subunit (6,7,48). Intriguingly, our results indicated a relatively rapid exchange between IN subunits, consistent with Arg-107 accessibility in the free protein. In contrast, LEDGF binding stabilized the interacting IN subunits, rendering Arg-107 inaccessible to chemical modification (Figs. 3 and 5).
Our footprinting studies revealed new protein-protein contact residues Lys-14, Lys-186, Arg-187, and Lys-188 in the full-length IN-LEDGF complex. Of these, the latter three were unliganded and freely surface exposed in the IBD-CCD co-crystal structure (48). The co-crystal comprised a CCD dimer, while the results of size exclusion chromatography indicated affinity of the IBD for the full-length IN tetramer (Fig. 4). Interestingly, the two-domain NTD/CCD crystal structure revealed interactions between two dimers, with Lys-14 from the NTD of one dimer and Lys-186/Arg-187/Lys-188 from the CCD of another dimer contributing to a tetrameric interface (13). In particular, supplemental Fig. S2 depicts the hydrogen bonding network between the two dimers involving the side chains of Lys-14, Lys-186, and Arg-187. The primary amine of Lys-188 is not directly involved in dimer-dimer interactions. However, two acidic residues (Glu-198 and Asp-25) located ~4.5 Å from Lys-188 could effectively restrict the access of NHS-biotin to this site (supplemental Fig. S2). Therefore, shielding of Lys-14, Lys-186, Arg-187, and Lys-188 in the context of the com-  Fig. 4).
To test our hypothesis that protections of Lys-14, Lys-186, Arg-187, and Lys-188 resulted from LEDGF-mediated stabilization of tetrameric IN, we conducted site-directed mutagenesis experiments (Fig. 6). Single point mutations of the target residues significantly compromised tetramer formation (Fig. 6A). For example, K14A, K186A, and R187A mutants were predominantly dimeric, even at relatively high protein concentration (10 µM) (Fig. 6A and supplemental Table S2). The K188A substitution had a relatively modest effect on tetramer formation (Fig. 6A), consistent with the lesser role of this residue in direct dimer-dimer interactions (supplemental Fig. S2).
We next examined the effects of these mutations on the LEDGF-IN interaction. For this, increasing concentrations of wild-type tag-free LEDGF were incubated with His-tagged IN proteins, and the fractions of LEDGF recovered by Ni-NTA pull-down were quantitated (Fig. 6, B and C). These experiments yielded an apparent Kd of ~200 nM for the interaction between wild-type IN and LEDGF. The K14A, K186A, and R187A mutants exhibited significantly reduced affinity for LEDGF, while the K188A protein was relatively more effective at binding (Fig. 6, B and C). In fact, the comparison of size exclusion chromatography (Fig. 6A) and binding (Fig. 6, B and C) results suggested a strong correlation between IN tetramer formation and high affinity LEDGF binding.
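An apparent Kd of this kind can be extracted from pull-down titrations by fitting a binding isotherm. The sketch below is not the Origin procedure used here, whose model is not specified in the text. It assumes a simple single-site hyperbola, B = Bmax·L/(Kd + L), ignores ligand depletion, and demonstrates the fit on synthetic, noise-free data generated with Kd = 200 nM. Function names and the grid-search strategy are our own choices.

```python
# A minimal single-site binding fit in pure Python; an illustrative sketch.
# The titration "data" are SYNTHETIC, generated from the model itself.

def bound_signal(ligand_nM, kd_nM, bmax=1.0):
    """Single-site hyperbolic isotherm (assumed model)."""
    return bmax * ligand_nM / (kd_nM + ligand_nM)


def fit_single_site(ligand_nM, signal, kd_grid=None):
    """Grid search over Kd; for each Kd, Bmax is the closed-form
    least-squares solution. Returns (kd, bmax, sum_of_squared_errors)."""
    if kd_grid is None:
        kd_grid = [float(k) for k in range(1, 2001)]  # 1-2000 nM in 1 nM steps
    best = None
    for kd in kd_grid:
        shape = [l / (kd + l) for l in ligand_nM]
        bmax = sum(y * s for y, s in zip(signal, shape)) / sum(s * s for s in shape)
        sse = sum((y - bmax * s) ** 2 for y, s in zip(signal, shape))
        if best is None or sse < best[2]:
            best = (kd, bmax, sse)
    return best


# LEDGF concentrations spanning the 50-650 nM range used in the assay.
ledgf_nM = [50, 100, 200, 350, 500, 650]
synthetic_signal = [bound_signal(l, 200.0) for l in ledgf_nM]  # synthetic, Kd = 200 nM

kd_fit, bmax_fit, sse = fit_single_site(ledgf_nM, synthetic_signal)
print(f"fitted Kd = {kd_fit:.0f} nM, Bmax = {bmax_fit:.2f}")
```

Because IN was present at 100 nM, a concentration comparable to the apparent Kd, a more careful treatment would account for depletion of free LEDGF; the hyperbolic form is used here only for simplicity.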
The HIV-1 IN dimer can suffice to process viral DNA 3′ ends whereas the tetramer has been implicated in DNA strand transfer activity (14-16); results of a separate study using chimeric IN, however, indicated that the tetramer was required for efficient 3′-processing activity (65). Because the relevant protein-protein interfaces within the catalytic complexes are for the most part unknown, we investigated whether the tetrameric interface important for high affinity LEDGF binding played a significant role in IN catalysis. Notably, the K14A, K186A, and R187A substitutions adversely affected 3′-processing and DNA strand transfer activities (Fig. 6D). In contrast, the wild-type level of activities was observed with the K188A protein. Taken together, our results suggest that the tetramer interface involving basic residues Lys-14, Lys-186, and Arg-187 is important for HIV-1 IN 3′-processing and DNA strand transfer activities.

Fig. 5 (legend fragment): ... Surface residues in free IN and the complex were modified, but the interacting amino acids in the complex were shielded from modification. His-tagged LEDGF and tag-free IN proteins were used. Following the modification reactions, the complex was pulled down by NTA beads, which enabled recovery of only the LEDGF-bound form of IN from the reaction mixture. The interacting proteins were then separated by SDS-PAGE. The IN band was excised and subjected to in-gel proteolysis. Subsequent comparative mass spectrometry (MS) analyses revealed modification patterns in free protein and the complex. Lower panel, representative segments of the MALDI-ToF mass spectra. A, free IN was treated with HPG or NHS-biotin. B, IN was preincubated with mtLEDGF and then exposed to treatments with HPG or NHS-biotin. C, IN-LEDGF complexes were preformed and then exposed to HPG or NHS-biotin treatments. Start and end amino acid numbers of the detected peptide peaks are indicated. IN residues affected by modification are depicted in brackets.
While the NTD-CCD tetramer was observed in a crystal lattice at high concentrations of the two-domain protein, it is less clear how LEDGF could promote tetramer formation in solution at significantly lower protein concentrations. To address this, we superimposed the structures of the CCD-IBD complex (two IBD molecules bound to the CCD dimer, see Ref. 48) and the two-domain (NTD+CCD) tetramer (13) using molecular modeling (Fig. 7). The results indicated asymmetric interactions of the IBD molecules with the NTD+CCD tetramer. For example, two (colored magenta) of the four IBD molecules could effectively bridge between the two IN dimers by coordinating the CCD of one dimer and establishing additional electrostatic interactions with the NTD of another dimer (Fig. 7). These additional contacts could contribute to the high affinity IBD-IN interactions and stabilize the IN tetramer. The other two IBD molecules (colored gray) interacted with the CCD dimer interfaces but could not establish additional charge-charge contacts with the NTDs due to the spatial separation between the IN domains (Fig. 7). The lack of such interactions may reduce the binding affinity for these IBD molecules. In other words, we propose that the IN tetramer has two high affinity and two lower affinity binding sites for the LEDGF IBD.
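The proposal of two high affinity and two lower affinity sites per tetramer can be expressed as a simple occupancy calculation. The sketch below assumes that the four sites bind independently and takes purely hypothetical dissociation constants; neither value was measured in this study, and the function name `mean_bound_ibd` is ours.

```python
# A minimal two-class, independent-site occupancy sketch.
# Both Kd values used in the example are HYPOTHETICAL.

def mean_bound_ibd(free_ibd_nM, kd_high_nM, kd_low_nM, n_high=2, n_low=2):
    """Average number of IBD molecules bound per IN tetramer, assuming
    n_high high affinity and n_low lower affinity independent sites."""
    high = n_high * free_ibd_nM / (kd_high_nM + free_ibd_nM)
    low = n_low * free_ibd_nM / (kd_low_nM + free_ibd_nM)
    return high + low


KD_HIGH_NM = 200.0    # hypothetical value for the bridging sites
KD_LOW_NM = 2000.0    # hypothetical value for the non-bridging sites

for free_nM in (200.0, 2000.0, 20000.0):
    bound = mean_bound_ibd(free_nM, KD_HIGH_NM, KD_LOW_NM)
    print(f"free IBD {free_nM:.0f} nM -> {bound:.2f} IBD per tetramer")
```

In such a model, occupancy near two IBD molecules per tetramer is expected over a range of ligand concentrations between the two dissociation constants.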

DISCUSSION
The present studies revealed a highly dynamic nature of interactions between IN subunits, which could be essential for its biological function as well as exploited as a novel therapeutic target (see below). The cellular cofactor LEDGF strongly modulated the dynamic structure of HIV-1 IN by stabilizing subunit-subunit interactions. Unlike the published co-crystal structure of the isolated domains that indicated binding of two IBD molecules to the CCD dimer, our experiments with the full-length proteins demonstrated the importance of the IN tetramer for high affinity LEDGF binding. These results are consistent with previous findings that endogenous LEDGF protein associated with tetrameric recombinant HIV-1 IN in human cells (27).
Our results also indicate that the stabilized IN tetramer is very effective at catalyzing IN 3′-processing and DNA strand transfer activities (Figs. 2, 6, and supplemental Fig. S1). Whereas free IN exists in solution as a mixture of monomer, dimer, tetramer, and higher order species, the IBD strongly promoted formation of the IN tetramer (Fig. 4) and stimulated IN catalytic function (Fig. 2 and supplemental Fig. S1).
Our footprinting and molecular modeling studies uncovered novel intra- and inter-protein contacts in the full-length IN complex with LEDGF (Figs. 5 and 7). Site-directed mutagenesis experiments confirmed the importance of the Lys-14, Lys-186, and Arg-187 residues for tetramer formation and high affinity LEDGF binding (Fig. 6). Furthermore, the K14A, K186A, and R187A mutations also compromised IN 3′-processing and DNA strand transfer activities. These results suggest the importance of the common tetramer interface for IN interactions with DNA substrates and LEDGF. Consistently, our previous footprinting analysis of the IN-DNA complex revealed a role for Lys-14 in DNA binding (62). The protections in the nucleoprotein complex could arise from direct protein-DNA or DNA-induced protein-protein interactions (62). The present studies clarify that Lys-14 could be a critical dimer-dimer contact essential for effective binding of IN with both DNA and LEDGF. Of note, our findings corroborate results of a number of virus-based assays that indicated essential roles for Lys-14, Lys-186, and Arg-187 (62, 66-70) in HIV-1 infection, whereas the K188A mutation resulted in reduced but reproducible levels of virus spread (69). Therefore, we propose that the IN tetramer is the biologically relevant form responsible for catalytic activities and high affinity binding to LEDGF.
The fact that LEDGF selectively impaired concerted integration of two HIV-1 DNA ends (Fig. 2, see also Refs. 42,43) has been rather puzzling. How can this observation be reconciled in vivo? The following two scenarios can be considered. Free IN could first assemble onto the viral DNA ends before encountering LEDGF (43). The high degree of flexibility within individual IN subunits (62,71,72) as well as their dynamic interplay (Fig. 3) could be critical for effective assembly of the synaptic complex, where the two catalytic sites position themselves for concerted integration. The preassembled IN tetramer-viral DNA complex would engage LEDGF, which in turn would tether the nucleoprotein complex to active genes without significantly affecting structural arrangements of IN with its DNA substrates. Accordingly, in vitro experiments with model DNA substrates and purified proteins indicated the importance of the order of IN binding to DNA and LEDGF for effective concerted integration (42,43). An alternative possibility is that the viral DNA-IN-LEDGF complex engages an as yet unidentified cellular chromatin factor(s). Binding this hypothetical cellular partner could trigger structural changes within the nucleoprotein complex to enable concerted integration. It has been reported that the IN tetramer catalyzes the sequential joining of the two viral DNA ends to target DNA in vitro (16), and that LEDGF strongly facilitates single-end HIV-1 integration (41-43). Protein-protein interactions at chromatin could prompt LEDGF to partly or fully disengage the PIC to allow IN to regain its flexibility and complete the integration of the second viral DNA end.
Table footnotes: a, + denotes surface residues susceptible to modification; b, − denotes residues protected from modification because of protein-protein contacts.
Our structure-function studies have significant implications for exploiting the highly dynamic nature of HIV-1 IN as a novel therapeutic target. For example, inhibition of functional IN could be accomplished by the following mechanisms: 1) interfering with subunit-subunit assembly and 2) restricting IN flexibility by "locking" the protein into a functionally compromised multimeric state. Of note, such compounds would be complementary to the clinically approved IN active site inhibitor raltegravir, as the resistance mutations that arise against this drug are located at a significant distance from the protein-protein interfaces (73). In general, protein-protein interactions are thought to be rather challenging targets for drug intervention because such interfaces are typically flat and comprise large surface areas (~750-1,500 Å²) (reviewed in Ref. 74). However, the subset of interface contacts that contribute to high affinity binding could be significantly smaller and could be affected by small molecule inhibitors. Indeed, significant progress has been made in discovering a number of protein-protein interaction inhibitors in recent years (reviewed in Ref. 74).
Our results indicate that IN monomers could also be a plausible target for interfacial inhibitors. For example, the fact that IN subunits exchange rapidly (Fig. 3) exposes interfacial surfaces as targets for effective binding of small molecules. Consistently, our mass spectrometric footprinting revealed that Arg-107, which is located at the dimer interface, is readily modified by HPG (Fig. 5). While HPG cannot be viewed as a lead compound, detailed analysis of the CCD structure revealed the following intriguing role for this residue. Arg-107, together with Gly-106, forms a bulge that effectively docks into a cavity within the interacting subunit (supplemental …).

An alternative intriguing mechanism could be to restrict the catalytically important flexibility of IN by stabilizing a multimeric state. For example, we showed that the LEDGF IBD forms a stable complex with the IN tetramer, which is active for 3′-processing and single-end DNA strand transfer but not concerted integration in vitro (Fig. 2). Previous reports showed that overexpression of LEDGF IBD proteins in target cells effectively impaired HIV-1 replication (30, 33). Furthermore, the IBD was significantly more effective at suppressing HIV-1 replication in LEDGF-deficient cells (555-fold) as compared with cells containing normal LEDGF levels (~30-fold) (33). These observations cannot be fully explained by a competition between the IBD and endogenous LEDGF for IN binding. Instead, our findings provide mechanistic clues that direct binding of the IBD to the IN tetramer restricts protein …

While the studies with the LEDGF IBD provide proof-of-concept for a new mechanism of IN inhibition, the main interest is to discover small molecule inhibitors. Such compounds do not have to compete with IN subunit-subunit or IN-LEDGF interactions or overcome the large energy barriers created by protein-protein interfaces.
Instead, they could specifically exploit the structural "pockets" present in tetrameric or dimeric proteins to stabilize the interacting subunits and compromise the catalytically essential dynamic structure. Recent reports indicate that IBD-derived short peptides impaired IN catalytic activities in vitro (55, 56) and inhibited HIV-1 infection (55). In fact, earlier observations that certain small molecule inhibitors selectively bind at the IN dimer interface gain particular interest in the context of the recent elucidation of structural and mechanistic details of LEDGF-IN interactions. For example, 3,4-dihydroxyphenyltriphenylarsonium bromide has been reported to bind the IN dimer at a site that overlaps the IBD-binding pocket (75). Our previous studies revealed highly selective binding of another small molecule, methyl N,O-bis(3,4-diacetoxycinnamoyl)serinate, to the IN dimer at a site that is immediately adjacent to the IBD-binding pocket (60). A coumarin-based inhibitor was likewise determined to bind IN near the IBD interaction site (76). Further research in this direction may well lead to the discovery of novel types of clinically useful HIV-1 IN inhibitors.