Architecture and Assembly of HIV Integrase Multimers in the Absence of DNA Substrates*

Background: No full-length structure of HIV integrase alone has been reported. Results: We elucidated the architectures of dimers and tetramers of full-length HIV-1 integrase in solution. Conclusion: HIV apo-integrase can assemble in two alternate dimer forms: a reaching and a core-core dimer. The tetramer comprises two stacked reaching dimers, stabilized by core-core interactions. Significance: New insights into HIV integrase architecture and its inhibition are suggested. We have applied small angle x-ray scattering and protein cross-linking coupled with mass spectrometry to determine the architectures of full-length HIV integrase (IN) dimers in solution. By blocking interactions that stabilize either a core-core domain interface or N-terminal domain intermolecular contacts, we show that full-length HIV IN can form two dimer types. One is an expected dimer, characterized by interactions between two catalytic core domains. The other dimer is stabilized by interactions of the N-terminal domain of one monomer with the C-terminal domain and catalytic core domain of the second monomer as well as direct interactions between the two C-terminal domains. This organization is similar to the “reaching dimer” previously described for wild type ASV apoIN and resembles the inner, substrate binding dimer in the crystal structure of the PFV intasome. Results from our small angle x-ray scattering and modeling studies indicate that in the absence of its DNA substrate, the HIV IN tetramer assembles as two stacked reaching dimers that are stabilized by core-core interactions. These models of full-length HIV IN provide new insight into multimer assembly and suggest additional approaches for enzyme inhibition.

Integration of retroviral DNA into the host cell genome is an essential step in viral replication catalyzed by the viral integrase (IN) protein (for a recent review, see Ref. 1). The integration reaction takes place in two temporally and spatially separated steps known, respectively, as 3′-processing and end joining (or strand transfer). In the first step a di- or trinucleotide is hydrolytically removed from each viral DNA end, exposing 3′-hydroxyl groups of invariant CA dinucleotides. In the second step the two processed viral DNAs are joined to host DNA in a concerted cleavage and ligation reaction. The functional nucleoprotein complex that includes two viral DNA ends and a multimer containing at least four IN monomers is known as an intasome (2). The pathway by which intasomes are assembled from apoIN proteins is unknown. In the absence of the viral DNA substrate, soluble full-length IN proteins from different viruses can exist as monomers, dimers, or tetramers at similar concentrations. For example, at ~1-4 mg/ml, the prototype foamy virus (PFV) IN is a monomer, avian sarcoma virus (ASV) IN is a dimer, and the human immunodeficiency virus (HIV) IN is a tetramer. Furthermore, the three domains of IN proteins, the N-terminal domain (NTD), catalytic core domain (core), and C-terminal domain (CTD), play important roles in the conformational dynamics and multimerization of the apo-protein that are not fully understood.
We recently employed small and wide angle x-ray scattering (SAXS/WAXS) in combination with chemical cross-linking to elucidate the full-length architecture of the apoIN monomer and dimer of ASV (3). We showed that the ASV IN dimer derives its stability from interactions of the NTD of one monomer with the CTD and core of the second monomer and from CTD-CTD contacts between the monomers; core-core interactions are only observed in tetramers. We called this unique and unexpected architecture a reaching dimer (illustrated in Fig. 1, right). We also found that the substitution of a single hydrophobic residue at an interaction hotspot in the CTD-CTD interface was sufficient to disrupt the reaching dimer and noted that a similar hydrophobic hotspot could be predicted for other retroviral IN proteins, including that of HIV. The arrangement of monomers in the ASV IN reaching dimer structure resembles that of the inner dimer of the PFV intasome that engages the viral and target DNA substrates and catalyzes the concerted joining reaction. We, therefore, proposed a dual role for hydrophobic residues in the CTD hotspot; that is, reaching dimer stabilization in the absence of DNA and binding to the non-cleaved strand of viral DNA (3). To test the generality of reaching dimer assembly and to gain further insight into the modes of assembly of apoIN, we analyzed the solution structure of HIV IN using approaches that proved successful with the ASV IN protein.

Static/Size Exclusion Chromatography (SEC)-SAXS/WAXS of ApoHIV IN-
X-ray scattering experiments were performed at the Advanced Photon Source at Argonne National Laboratories, 5ID-D beamline, Chicago, IL. Data were collected either directly from the homogeneous protein solutions or with protein fractions that were eluted at 600 μl/min from a Tricorn column (Superdex™ 200, 10/300 GL, GE Healthcare) immediately upstream of the SAXS flow cell. In the latter case, because the proteins were eluting at high concentrations, 3 scans at the retention time were averaged at an interval of 14 s. Specific details concerning protein expression and purification, experimental setup, data collection, fitting, and shape modeling are described in the supplemental Experimental Procedures.
Protein Cross-linking-Tag-less HIV IN proteins were buffer-exchanged by dialysis in 0.1 M MES-HCl, 1 M NaCl, pH 6.0, 1 mM Tris(2-carboxyethyl)phosphine, 20% glycerol (4). For wild type HIV IN cross-linking, a 1:1 mixture of unlabeled and isotopically labeled protein (final concentration 450 nM) was equilibrated overnight (5), and freshly prepared 1-ethyl-3-[3-dimethylaminopropyl]carbodiimide hydrochloride (EDC; Pierce) bifunctional zero-length cross-linker was added at increasing concentrations. After 5-10 min at 37°C, the reactions were quenched by addition of 20 μl of 1 M mercaptoethanol and then left on ice for 60 min. After centrifugation at 14,000 × g at 4°C for 10 min to remove unwanted aggregates, the supernatant fractions were transferred to new Eppendorf tubes. The reactants were then precipitated with acetone and resuspended in 20 mM HEPES, pH 7.8, 0.5 M NaCl, 2 mM DTT, 10% glycerol. For HIV IN F181T cross-linking at 25 μM or 250 nM concentration, the 1:1 mixture of unlabeled and isotopically labeled IN was first treated with 10 mM EDTA for 10-15 min on ice and then dialyzed on ice in 0.1 M MES-HCl, 1 M NaCl, pH 6.0, 20% glycerol supplemented with 2 mM DTT, 20 mM MgCl2, and 50 μM ZnSO4. After 60 min to allow for refolding of the NTD, the mixture was dialyzed in 0.1 M MES, pH 5.8, 1 M NaCl, 1 mM Tris(2-carboxyethyl)phosphine, 20% glycerol. Cross-linking of the IN F181T mixtures was as described for wild type IN. The cross-linked products were separated by electrophoresis in denaturing NuPAGE 4-12% BisTris gels using MES running buffer and Coomassie Blue stain. Sample recovery was only slightly diminished by acetone precipitation. The dimer bands from all EDC reactions were excised, trypsin-digested, and analyzed for cross-links by mass spectrometry as described previously (3) and in the supplemental Experimental Procedures.
HADDOCK Docking and Fine Model Fit-To model the flexible HIV F181T IN dimer interface, we used HADDOCK docking (Guru Interface) (6) together with SAXS-driven refinement parameters and distance constraints from the mass spectrometric analysis of the protein chemical cross-linking. Starting models for docking were based on homology with the model of the ASV IN dimer (3) using the SWISS MODEL resource (7-9), and cross-linking residues were defined to have a proximity of ~4 Å between each pair. Based on the flexibilities of the IN domains, docking was grouped into three classes that satisfied the identified chemical cross-links. Structures were selected for further refinement based on the HADDOCK score, and models were clustered with a cutoff root mean square of 10 Å that satisfied the SAXS maximum distance (Dmax). All the final models from each group that have a maximum dimension equivalent to experimental SAXS data were compared using CRYSOL analysis fit (ATSAS) (10). In addition, P(r) functions were plotted for each model and compared with the experimental data to assess the quality of the dock model using the Igor Pro package (Irena macro) (11).
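As an illustration of the kind of quantity compared above, the following minimal sketch builds an unweighted pair-distance histogram and the maximum dimension from a set of model coordinates. It is not the CRYSOL or Irena calculation (a true P(r) is weighted by scattering contrast); the function name, the bin width, and the four-point toy model are assumptions made for illustration only.

```python
import math
from itertools import combinations


def pair_distance_histogram(coords, bin_width=5.0):
    """Return (bin_starts, counts, d_max) for all pairwise distances (in Å).

    This is an unweighted histogram, i.e. only a rough stand-in for P(r).
    """
    distances = [math.dist(a, b) for a, b in combinations(coords, 2)]
    d_max = max(distances)
    n_bins = int(d_max // bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int(d // bin_width)] += 1
    bin_starts = [i * bin_width for i in range(n_bins)]
    return bin_starts, counts, d_max


# Hypothetical toy model: four pseudo-atoms on a line, 10 Å apart.
toy_model = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
bins, counts, d_max = pair_distance_histogram(toy_model)
print(d_max, counts)
```

A model whose d_max differs markedly from the SAXS-derived Dmax would be excluded before any profile comparison, which is the filtering step described in the text.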

Preparation and Characterization of Wild Type HIV ApoIN-
HIV IN is notoriously difficult to maintain at moderate to high concentration because of its tendency to aggregate. After investigating a variety of buffer conditions we, as others (12), found that solubility and stability could be optimized by the inclusion of 1 M urea during protein purification. Analysis of wild type HIV IN (8 μM) by CD spectroscopy revealed no significant differences in the α-helical structural elements of the protein in the region of interest (218-223 nm) in the presence of 0-1 M urea; denaturation was not detected until the urea concentration reached 2 M or higher (Fig. 2A). In addition, our enzymatic assays showed that single viral DNA end processing and joining activities were unaffected in the presence of 1 M urea; concerted integration into a target DNA was unaffected by 250 mM urea, and this activity was still detectable in 1 M urea (Fig. 2, B-D). We concluded from these results that HIV IN retains its native structure in the presence of this chaotropic reagent at concentrations employed in our biophysical analyses.
Static and dynamic light scattering analyses were used to determine molecular mass and to gauge the homogeneity of our HIV IN protein preparations. We observed that wild type HIV IN exhibited properties expected of a homogeneous tetramer in the presence of 250 mM and 1 M urea at protein concentrations in the 1-2 mg/ml range (Table 1). In contrast, the protein appeared to be a mixture of tetramers and larger aggregates at similar concentrations in the absence of urea (data not shown).

FIGURE 1. Schema for alternate apoIN assemblies. Two possible HIV apoIN dimer forms are illustrated in this schematic. The three domains common to retroviral integrase proteins are depicted in color-coded shapes identified in the monomer as N for the red NTD, C for the green CTD, and core for the blue catalytic core domain. A small white circle symbolizes the active site in the core. The arrangement of domains in the monomer and reaching dimer are adapted from the published architectures of ASV monomers and dimers (3). The arrangement of the core-core dimer is adapted from the "outer" dimers in the crystal structure of the PFV intasome (2). A crystal structure of the full-length apoIN tetramer is not available. An architecture based on our solution studies of HIV IN is proposed in the present study (cf. Fig. 10).

Disruption of Hydrophobic Interactions in the Core-Core Interface of HIV IN Blocks Tetramer but Not Dimer Formation-
We have demonstrated that in the absence of DNA, full-length ASV IN forms two distinct subunit interfaces. A reaching dimer interface is stabilized by CTD-CTD interactions and interactions of the NTD from one monomer with the CTD and core domain of the second monomer. A second interface stabilized by core-core domain interactions is observed in ASV IN tetramers, which we and others have shown are required for catalysis of concerted integration but not 3′-end processing of viral DNA (3). In the case of HIV, Neamati and co-workers (13, 14) have noted that a four-tiered interaction among several conserved hydrophobic amino acids at the core-core interface (Trp-132, Met-178, Phe-181, Phe-185) is likely to be critical for its stability. Although molecular masses were not determined for proteins with non-conservative substitutions of these residues, these authors showed that such changes resulted in loss of the single end-joining activity of HIV IN but had less effect on 3′ end-processing, as might be expected for some capacity to form dimers (15, 16).
By analogy with ASV IN, we hypothesized that substitution of one or more of these conserved hydrophobic residues in the HIV IN core domain might block formation of tetramers but not reaching dimers. To test this idea, we introduced three independent, non-conservative substitutions for residue Phe-181 in wild type HIV IN and analyzed these proteins by dynamic light scattering. The results showed that the HIV IN derivatives with either threonine (1.2 mg/ml) or alanine (3.2 mg/ml) at position 181 were, indeed, homogeneous dimers in the presence of 1 M urea; the protein that contained glycine (2.3 mg/ml) at this position had properties expected for a mixture of dimers and tetramers (Table 1). Our enzymatic assays showed that the IN F181A derivative was essentially inactive for 3′ end-processing, but F181T and F181G retained ~12% of the catalytic rate exhibited by wild type HIV IN (Fig. 3A). Partial processing activity for the F181G substitution was reported previously (13). These data indicate that the HIV IN protein can form partially active dimers when core-core interactions are disrupted.
Destabilization of NTD Structure Blocks Formation of Reaching Dimers but Not Core-Core Dimers-The NTD of HIV IN contains a conserved Zn2+ binding motif (HH-CC), and the presence of this ion is required for conformational integrity of this domain (17-19). As the reaching dimer of ASV IN is stabilized by interactions of the NTD with the core and CTD of the second IN monomer, we reasoned that disruption of the NTD structure would prevent formation of an HIV IN reaching dimer but not a dimer formed by core-core interactions (illustrated in Fig. 1, left). To remove the NTD-bound Zn2+ ion, wild type HIV-1 IN (1.5 mg/ml) was dialyzed overnight in our standard buffer (including 1 M NaCl and 1 M urea) supplemented with 10 mM EDTA and then concentrated to 5 mg/ml. A sample of this protein was then applied to a SEC column that had been pre-equilibrated with the same EDTA-supplemented buffer. A homogeneous peak of protein was eluted from this column with retention time (24.3 min) expected for a dimer. The F181T derivative eluted as a dimer in SEC (24.75 min) in the absence of EDTA treatment. When the IN F181T was treated with EDTA and chromatographed in the presence of EDTA, its retention time was 26.2 min, consistent with a monomer. Light scattering analysis of untreated IN F181T and EDTA-treated wild type IN confirmed that both were dimers (Tables 1 and 2). These results support the hypothesis that destabilization of the NTD by removal of Zn2+ ion leads to disruption of the reaching dimer interface but will allow core-core stabilized dimers to assemble.

Figure legend (partial): Fluorescence-based assays for single end processing (B) and joining (C) have been described previously (44). Reactions included the indicated concentrations of urea. D, to measure concerted integration, a modification of the conditions described by Li and Craigie (45) was used. HIV IN was first treated with 10 mM EDTA overnight at 4°C in a buffer that included 1 M NaCl to optimize activity (19) [...]
Further support for this interpretation comes from the demonstration by Hare et al. (20) that an E11K substitution, which disrupts a salt bridge between the NTD and Lys-186, results in a shift in the equilibrium of multimeric forms of full-length wild type HIV-1 IN from tetramers to a dimer-monomer mixture. Our light scattering data, at higher concentrations (Table 1), also revealed dimers, and an expected reduction in enzymatic activity of this derivative is illustrated by analysis of 3′ processing (Fig. 3B).

SAXS Analysis of Wild Type HIV IN and the F181T and E11K Dimers-
The homogeneous preparations of wild type HIV IN tetramers and derivative dimers were next analyzed by SAXS at protein concentrations ranging from ~1 to 2 mg/ml. A summary of the SAXS-determined parameters and apparent multimeric state of each of the proteins analyzed is provided in Table 2. Consistent with the light-scattering results, SAXS data for all proteins in the Guinier regions and Kratky plots confirmed the absence of aggregation or unfolding of IN (data not shown). A PRIMUS analysis (10) of scattering intensity versus a q range of 0.01-0.04 for four independent wild type IN concentrations also showed that there was no concentration-dependent aggregation in the range tested (data not shown).
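The Guinier analysis mentioned above rests on the low-angle approximation ln I(q) ≈ ln I(0) − (Rg² q²)/3. The sketch below fits that line by ordinary least squares to synthetic, noise-free data; it is a generic illustration, not the PRIMUS or IRENA procedure used in this study. The function name, the I(0) value, and the q grid are assumptions; the Rg of 36 Å is taken from the value quoted in the text for the F181T dimer.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares fit of ln I versus q^2; returns (Rg, I0).

    Assumes the points lie in the Guinier region (commonly q*Rg < ~1.3).
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope: Rg is undefined")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic data (hypothetical I0; q in Å^-1 from 0.010 to 0.030).
true_rg, true_i0 = 36.0, 100.0
q_values = [0.010 + 0.002 * k for k in range(11)]
i_values = [true_i0 * math.exp(-((true_rg * qi) ** 2) / 3.0) for qi in q_values]
rg, i0 = guinier_fit(q_values, i_values)
print(round(rg, 3), round(i0, 3))
```

With real data, an upturn of ln I at the lowest q values (a slope steeper than the fitted line) is the usual sign of aggregation that such a plot is used to rule out.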
The scattering profiles for wild type HIV IN and the F181T and E11K dimers are shown in Fig. 4A; their pair distance distribution P(r) functions using data to a qmax of 0.4 Å⁻¹ revealed no major deviations in the Dmax as a function of concentration in the range of 1-1.5 mg/ml (Fig. 4B). [...] (Tables 1 and 2). However, the significant differences in the Dmax and Rg values of F181T and E11K dimers are indicative of distinct assemblies; that is, a compact dimer in the case of F181T and a more extended dimer for E11K. SAXS envelopes for these proteins were derived using GASBOR modeling (Fig. 4C). We note that the dimensions of the HIV F181T IN dimer are similar to those of the reaching dimer of full-length ASV IN (3), with minor differences in the contours of the envelopes. Furthermore, the Dmax of 117 Å observed for the HIV IN tetramer is only slightly larger than that of the F181T dimer, whereas the volume calculated for the tetramer is approaching twice that of the F181T dimer (Table 2). The envelope derived for E11K is larger than that of F181T and the shape is longer than either the wild type tetramer or the F181T dimer with a distinct contour and narrower ends.

Table footnote: a, SAXS scattering data obtained from IN proteins at the listed concentrations were processed with the program IRENA to determine the radius of gyration (Rg) and the maximum length of the scattering multimer (Dmax). The volume of the Situs-derived envelope was calculated with Chimera software (UCSF).

SAXS Analysis of Wild Type HIV IN and Derivatives in the Presence of the Metal (Mg2+) Cofactor-
As the enzymatic activities of HIV IN are highly cofactor-dependent, we asked if the presence of Mg2+ would alter the overall architecture of the protein in solution. These experiments included wild type IN, a D64N derivative that cannot bind the metal cofactor at the active site, and the F181T dimer. The results from light scattering studies indicated that the size of these three proteins was not altered significantly in the presence of Mg2+ (Table 1). Comparison of the SAXS parameters in the absence or presence of the metal showed minor variations in the Dmax values but no drastic change in the Rg values ([...] monomers were also analyzed by SAXS. Envelopes derived for these EDTA-treated proteins are shown in Fig. 6 and, judging from their Kratky plots (not shown), both of these proteins have lost some secondary structure compared with untreated proteins, as expected for destabilized NTDs (18). SAXS parameters for the EDTA-derived wild type IN dimer show values of 33 Å for the Rg and 100 Å for Dmax, which are quite similar to the values of 36 and 113 Å, respectively, for the SEC-isolated IN F181T dimer. However, the volume of the wild type EDTA-derived dimer is larger than that of the IN F181T dimer, indicating that the conformations are likely to be distinct (Table 2). Indeed, the envelopes derived for the wild type IN EDTA dimer and the F181T IN dimer are quite different, consistent with the notion that there are two distinct modes of stabilization of their respective dimer interfaces. The difference in contours of the wild type EDTA dimer and the E11K dimer likely reflects the disorder of the NTD associated with the removal of Zn2+ ion. The EDTA-treated HIV IN F181T monomer exhibited reduced values for Rg and Dmax compared with the untreated protein. These results are consistent with a monomer with disordered NTD (Fig. 6).
Identification of Intersubunit Proximities in the HIV IN Reaching Dimer-Previously we used isotopic labeling followed by chemical cross-linking and mass spectrometry to map the interacting interfaces in ASV IN dimers and tetramers (3). In the present study we employed similar methods to identify the intermolecular proximities of protein domains in the HIV IN F181T dimer, and the wild type IN dimer(s) that exist at low protein concentration (supplemental Table S1).
HIV-1 Integrase (Dimer and Tetramer) Solution Structures
The strategy was to mix equal amounts of separate preparations of unlabeled and isotopically labeled lysine (13C, 15N) and arginine (13C, 15N) proteins and allow the mixtures to equilibrate such that they formed mixed multimers (5). In preliminary tests, we found that labeled and unlabeled monomers of the F181T derivative did not exchange as readily as those of wild type IN, indicating that the F181T dimer was somewhat more stable than the wild type. To facilitate exchange we treated the F181T protein with 10 mM EDTA to form monomers; mixed dimers were readily assembled upon the addition of Zn2+ ion through slow dialysis (see "Experimental Procedures"). After treatment with the EDC cross-linking reagent, the wild type IN or F181T IN products were separated by electrophoresis in a denaturing gel. Protein excised from the dimer band was then subjected to trypsin digestion and mass spectrometry. Intersubunit cross-linked peptides are recognized uniquely by their hybrid mass. The observed mass differences for cross-linked labeled and unlabeled tryptic peptides were consistent with the expected values of K+8.014 and R+10.008, where K and R are masses of unlabeled lysine and arginine, respectively.
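The expected shifts quoted above can be reproduced from isotope mass differences. The sketch below assumes uniform labeling, that is, all six carbons as 13C with both lysine nitrogens or all four arginine nitrogens as 15N; this assumption is consistent with the quoted values but is not stated explicitly in the text. The helper function and the example peptide are hypothetical.

```python
# Monoisotopic mass differences (Da) between heavy and light isotopes.
DELTA_13C = 13.0033548 - 12.0          # 13C minus 12C
DELTA_15N = 15.0001089 - 14.0030740    # 15N minus 14N

# Assumed uniform labeling: Lys has 6 C and 2 N; Arg has 6 C and 4 N.
LYS_SHIFT = 6 * DELTA_13C + 2 * DELTA_15N
ARG_SHIFT = 6 * DELTA_13C + 4 * DELTA_15N


def peptide_label_shift(sequence):
    """Mass shift (Da) of a fully labeled peptide relative to the unlabeled one."""
    return sequence.count("K") * LYS_SHIFT + sequence.count("R") * ARG_SHIFT


print(round(LYS_SHIFT, 3), round(ARG_SHIFT, 3))
# Hypothetical tryptic peptide with one internal Lys and a C-terminal Arg.
print(round(peptide_label_shift("AEKLDR"), 3))
```

In an intersubunit cross-link only one of the two linked peptides carries the label, so the adduct appears at a hybrid mass between the all-light and all-heavy forms, which is what distinguishes it from an intrasubunit cross-link.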
The reagent used in these experiments, EDC, promotes formation of trypsin-resistant, irreversible cross-links between the carboxyl groups of acidic amino acids, such as aspartate and glutamate, with the side chains of lysine residues that act as salt bridge partners. Gel electrophoresis of the IN F181T products revealed robust cross-linked dimer bands at 25 μM concentration (Fig. 7Ai). At 250 nM, only dimers were detected (Fig. 7Aii), and the same was true for wild type IN at 450 nM (Fig. 7Aiii). Unfortunately, this method is not useful for mapping oligomers higher than dimers, as cross-links from more than two interacting monomers are difficult to distinguish. Consequently, we focused our efforts on mapping the dimer interfaces.
Data obtained with F181T IN at 25 μM revealed numerous cross-links between residues in the NTD of the unlabeled monomer with the CTD of the labeled monomer (Fig. 7B, supplemental Table S1C). This included interactions between Glu-11, Glu-13, and Glu-35 in the NTD with lysines at positions 215, 240, and 264 in the CTD. Cross-links of NTD to NTD, NTD to core, and CTD to CTD were also detected. The IN F181T derivative includes an N-terminal extension of three amino acid residues, Gly, His, and Met, which remains after removal of the His tag. This Gly-1 (G1) is also capable of forming cross-links to the core domain at Glu-157 and Glu-170. The C-terminal tail extremities of both the labeled and unlabeled monomers were found to form cross-links to G1 in the other subunit. As the labeled IN only incorporates isotopes of Lys and Arg, the origin of the cross-linked tryptic fragments that include the extreme tail, with sequence QDED, could not be determined. G1 of the unlabeled monomer of F181T IN is observed to form cross-links with labeled IN at helical regions of the NTD at positions Glu-10, Glu-11, and Glu-13. Glu-35 from unlabeled IN is cross-linked with G1 of the labeled IN. These cross-links reveal the proximities of the NTD helical regions in the unlabeled and labeled IN subunits.
In addition to cross-links to secondary structural elements of the NTD, core, and CTD tail ends, G1 was observed to cross-link at CTD β-sheet elements at positions Asp-229 and Glu-246 in IN F181T. Furthermore, Glu-212 from both labeled and unlabeled IN was found to cross-link with G1 on unlabeled and labeled IN, respectively (Fig. 7B). Helical regions of the NTD from the unlabeled IN at positions Asp-6, Glu-11, Glu-13, and Glu-35 were also cross-linked to the labeled lysines in either the core or CTD regions. Cross-links between β-sheet regions in the CTDs of the unlabeled and labeled F181T IN were observed between residues Glu-212 and Glu-246, with Lys-236, Lys-240, and Lys-264 (Fig. 7B). Overall these results indicate that the HIV IN F181T dimer is stabilized by interactions of the NTD of one monomer subunit with the core, CTD, and NTD of the second subunit. In addition, CTD-CTD interactions are detected between the two subunits. Finally, no cross-links were observed between the core domains of the monomer subunits, indicating that this dimer is not stabilized by core-core interactions but rather more closely resembles the reaching dimer architecture observed with wild type ASV IN. Results from mass spectrometry analysis of cross-linked peptides formed with IN F181T at 250 nM protein concentration are summarized in Fig. 7C and supplemental Table S1D. As expected, fewer interactions were detected under these conditions. However, all of the cross-links found in this sample were included in the set obtained at the higher protein concentration (Fig. 7B).
Results from mass spectrometry analysis of cross-linked peptides from the dimer band of wild type HIV IN treated with EDC at 450 nM concentration are summarized in Fig. 7D and supplemental Table S1E. We detected 17 cross-links between labeled and unlabeled IN. Although we again observed prominent interactions of the NTD with the core and CTD, the cross-links are at different positions than observed with the IN F181T dimer. For example, the CTD to CTD cross-links in the wild type IN dimer were Glu-212-Lys-211, Asp-229-Lys-264, Asp-232-Lys-264, and Lys-264-Asp-232, whereas with IN F181T, the cross-links were Glu-212-Lys-236, Glu-246-Lys-240, and Glu-246-Lys-264 (compare Fig. 7, B and D). Other distinct features of the wild type dimer were observed in the region of the core-CTD interface, including cross-links of Glu-170-Lys-264, Asp-229-Lys-159, and Asp-167-Lys-264, and NTD-core and NTD-CTD cross-links at Glu-35-Lys-160 and Asp-3-Lys-264. Furthermore, core-core cross-links at Glu-157-Lys-186 and Asp-167-Lys-160 were interactions observed only with the wild type dimer. Most notably, unlike the F181T dimer, no cross-links were observed between the NTDs from each interacting monomer in the wild type dimer.
A number of the intermolecular cross-links observed with the wild type HIV IN were mutually exclusive, likely arising from a mixture of oligomers. The possibility that some of the cross-links (e.g. core-core adducts) reflect inclusion in the SDS-PAGE dimer band of products from cross-linked tetramers cannot be excluded. That caveat aside, the results are most consistent with a cross-linked population comprising alternate reaching dimer configurations as well as some core-core dimer forms. With respect to the wild type reaching dimer architecture, the NTD-core cross-link between Glu-35 and Lys-160 is likely to represent a biologically relevant interface, as a careful analysis using PROTCID (21) reveals that a common interface with these residues in close proximity exists in all two-domain (NTD + core) IN proteins that have been crystallized to date, including those of HIV-1, HIV-2, PFV, and maedi visna virus. The interface (supplemental Fig. S1) is observed in both intra- and intermolecular interactions. In contrast to the wild type, there is no evidence for this common interface in the IN F181T reaching dimer. It seems probable that the non-conservative substitution of Phe-181, which is located in this interface (supplemental Fig. S1), disrupts this common NTD-core interaction leading to a shifted reaching dimer conformation that is manifested most clearly in the adjacent arrangement of NTDs in the IN F181T dimer. Such a shift could account for the increased stability of the F181T dimer noted above.

Data-driven Docking and Model Fit into SAXS Envelopes-
To obtain a more detailed model of the HIV IN F181T dimer, we used the mass spectrometric data obtained from the cross-linking experiment summarized in Fig. 7B. The starting template for a monomer of F181T IN was a homology model derived from the ASV IN reaching dimer (3), with data-driven docking using the HADDOCK module. We noted that the presence of mutually exclusive β-strand cross-links at the CTD-CTD interface (Fig. 7, B and C) is likely to reflect dynamic movement at this interface. For example, NTD residues are observed with cross-links to either core or CTD, and further cross-links between CTD domains engage different β-strands (Fig. 7B). Therefore, to characterize the predominant conformational state of the IN F181T dimer, cross-link contacts were parsed into three groups for which the distance constraints are compatible and satisfy the maximum distance (Dmax) from SAXS data (Fig. 8).
In the first group, dock model A (Fig. 8A, top), the distance constraints of the Glu-246-Lys-240 were maintained in addition to NTD-NTD interactions with cross-links of G1 with Glu-10, Glu-11, Glu-13, and Glu-35; NTD cross-links to CTD at G1-Glu-212 and Glu-11-Lys-215 are also included (Fig. 7B). In the second group, dock model B, the salt bridge distances of Glu-246-Lys-264 were maintained in addition to the links between Glu-11-Lys-215, Glu-35-Lys-264, and G1-Asp-116. In this conformation, the interactions between the Trp-243 residues from each monomer are disrupted. In the third group, dock model C, interactions are similar to dock model A, except NTD-NTD proximities were relaxed. NTD-to-core distances were enforced at G1-Asp-116, G1-Glu-157, Asp-6-Lys-159, and Asp-6-Lys-188. In this model, NTD cross-links to CTD at G1-Glu-212 and Glu-11-Lys-215 are not taken into consideration. During successive cycles of the optimization for each group, we included newly observed features that increased the hydrophobic packing in the interacting domains at the interface and that satisfied the SAXS determined maximum distance Dmax.
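Whether a candidate model honors a given group of cross-link restraints can be checked with a simple distance test, sketched below. This is not HADDOCK's restraint handling; the function, the dictionary layout, and all coordinates are hypothetical, and only the residue pair names and the ~4 Å proximity are taken from the text.

```python
import math


def satisfied_restraints(coords_a, coords_b, crosslinks, cutoff=4.0):
    """Return the cross-links whose reactive atoms lie within `cutoff` Å.

    coords_a / coords_b map a residue label to the (x, y, z) position of its
    reactive atom in subunit A / subunit B of a candidate dimer model.
    """
    hits = []
    for res_a, res_b in crosslinks:
        if math.dist(coords_a[res_a], coords_b[res_b]) <= cutoff:
            hits.append((res_a, res_b))
    return hits


# Hypothetical coordinates (Å) for illustration only.
subunit_a = {"Glu-246": (0.0, 0.0, 0.0), "Glu-11": (10.0, 0.0, 0.0)}
subunit_b = {"Lys-240": (3.0, 0.0, 0.0), "Lys-215": (10.0, 0.0, 9.0)}
group = [("Glu-246", "Lys-240"), ("Glu-11", "Lys-215")]

hits = satisfied_restraints(subunit_a, subunit_b, group)
print(hits)
```

Restraints that cannot be satisfied simultaneously in any single model are the "mutually exclusive" cross-links that motivate parsing the contacts into separate groups.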
Because of the dynamic nature of movements of the NTD and CTD with respect to the core domain, in all three dock models we endeavored to determine which configuration might represent the closest parity with the actual experimental data and, therefore, the predominant dimer architecture. Although independently obtained, comparison of the CRYSOL-derived P(r) function for each of the models with the experimental data and analysis of the predicted scattering profiles (Fig. 8, B and C) indicated that the closest identity was obtained with model A. Furthermore, a predicted inner dimer model based on the PFV intasome crystal structure (22) with viral DNA removed (i.e. intasome inner dimer sans DNA; Fig. 8A) showed a large difference of 30 Å in Dmax and scattering when compared with experimental data. These analyses showed good correlation between the data-driven docking reaching dimer model A and the observed SAXS parameters.
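Ranking candidate models against an experimental profile is commonly done with a reduced chi-square after fitting a single scale factor. The sketch below shows that generic statistic; it is not CRYSOL's implementation, and the intensities, errors, and model names are hypothetical toy values rather than data from this study.

```python
def scaled_chi_square(i_exp, sigma, i_calc):
    """Reduced chi-square between experimental and scaled model intensities.

    The scale factor is the least-squares optimum for the given errors.
    """
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    residuals = sum(((e - scale * c) / s) ** 2
                    for e, c, s in zip(i_exp, i_calc, sigma))
    return residuals / (len(i_exp) - 1)


def rank_models(i_exp, sigma, models):
    """Return model names sorted from best (lowest chi-square) to worst."""
    return sorted(models, key=lambda name: scaled_chi_square(i_exp, sigma, models[name]))


# Hypothetical toy profiles sampled at the same four q values.
i_exp = [100.0, 80.0, 50.0, 20.0]
sigma = [1.0, 1.0, 1.0, 1.0]
models = {
    "toy_model_1": [50.0, 40.0, 25.0, 10.0],   # same shape, different scale
    "toy_model_2": [100.0, 60.0, 50.0, 30.0],  # different shape
}
ranking = rank_models(i_exp, sigma, models)
print(ranking)
```

Because the scale factor is fitted, only the shape of the predicted curve affects the ranking, which is why a model with the wrong Dmax tends to score poorly regardless of absolute intensity.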
The unique features of the HIV IN F181T model A reaching dimer are a CTD-CTD interface with prominent Trp-243 hydrophobic interactions (yellow spheres, Fig. 8A) and stacking of both NTD domains above the CTD domains. The final, most accurate model from the HADDOCK docking was fitted into the experimental SAXS envelope, and its occupancy inside the envelope was 95% (Fig. 9A, left). Although model A in Fig. 8A includes only residues 1-270, our analyses indicated that the addition of the 18-amino acid tail and disordered N-terminal His6 into the final model had insignificant effects on the CRYSOL profiles (data not shown).

FIGURE 8. Models of the HIV IN F181T reaching dimer and selection of the best fit. A, shown is a depiction of the three HADDOCK-derived models, generated by docking with distance constraints from EDC cross-links, and published intasome models sans DNA. The best-fit Model A, which includes NTD and CTD stacking interactions at the interface, is shown in two orthogonal views. The interface of this reaching dimer includes interactions between the NTD of one monomer (red) to the NTD of the second monomer (pink) and also Trp-243:Trp-243 stacking between the subunit CTDs (yellow spheres). Model B was derived without Trp-243 stacking, and model C was derived with inclusion of NTD to core interactions. The intasome and intasome inner dimer models are derived from the published HIV intasome model (22) by omitting the viral DNAs. IN domains are colored as in Fig. 1, and the second monomer is shown in faded colors. B and C, shown is a comparison of the theoretical P(r) functions and scattering profiles (generated with CRYSOL) of the HADDOCK models and the intasome inner dimer with experimental data.

MARCH 8, 2013 • VOLUME 288 • NUMBER 10

HIV-1 Integrase (Dimer and Tetramer) Solution Structures
As noted above, the SAXS-derived envelope of the E11K IN dimer is longer than that of either the F181T dimer or the wild type tetramer; its unique shape includes narrow extremities and bulging occupancy at the center (Fig. 4C). As the core interface residues are not changed in the E11K substitution, this dimer should include a core-core domain interaction. To model an architecture that would satisfy these features, residues 60-186 of the core domain crystal structure (PDB ID 1BIS) were selected to represent the core domain dimer, which could only be accommodated in the central bulge of the E11K envelope. Attempts to place NTD and CTD dimer pairs at the narrow extremities of the envelope (PDB IDs 1E0E and 1QMC, respectively) resulted in a model that did not fit the experimental data (not shown). The alternative arrangement, with an NTD and CTD at both ends, was a better fit, and this model was further refined with SWISS-MODEL by placing one CTD at either extremity with the NTDs nested between the CTD and core domains; the missing linkers were built between the domains as shown in Fig. 9A, right. The SAXS scattering data and P(r) function derived for this E11K dimer model compared well with the experimental data (Fig. 9, B and C) and, as expected, were distinct from the F181T data.
Building an HIV IN Tetramer-As described above, disruption of core-core interactions at Phe-181 of HIV IN revealed a unique homo-dimer whose SAXS analyses yielded a D max very similar to that of the homo-tetramer of wild type IN (Fig. 4B). The volumes from the SAXS-derived envelopes of the wild type IN tetramer and F181T IN dimer, 260 and 130 Å³, respectively (Table 2), indicated that the homo-tetramer has twice the volume of a homo-dimer with similar D max. Comparison of these parameters and the SAXS envelope shapes suggests that IN tetramers assemble by the stacking of two reaching dimers through core-core interactions of both dimers. Fig. 10A shows a plausible model for the formation of a wild type tetramer by such stacking. Coalescence of the two reaching dimers in this manner results in a tetramer that has the same maximum dimension as a reaching dimer. The corresponding wild type tetramer model, which places the NTDs in close proximity to the core interfaces, fits into the SAXS envelope as shown in Fig. 10B.
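The geometric reasoning behind this inference (doubled volume with an essentially unchanged D max) can be illustrated with a toy bead model. This is a sketch under stated assumptions only: the bead coordinates, the 10 Å spacing, and the 30 Å offset are hypothetical, the bead count is used as a crude proxy for envelope volume, and the helper names `d_max` and `translate` are introduced here for illustration.

```python
from itertools import combinations
from math import dist
from typing import List, Tuple

Point = Tuple[float, float, float]


def d_max(beads: List[Point]) -> float:
    """Largest pairwise distance between bead centres (a stand-in for D max)."""
    return max(dist(a, b) for a, b in combinations(beads, 2))


def translate(beads: List[Point], dx: float = 0.0, dy: float = 0.0,
              dz: float = 0.0) -> List[Point]:
    """Return a rigidly shifted copy of a bead model."""
    return [(x + dx, y + dy, z + dz) for x, y, z in beads]


# Hypothetical elongated "dimer": 12 beads spaced 10 Å apart along x.
DIMER = [(10.0 * i, 0.0, 0.0) for i in range(12)]

# Two ways to build a "tetramer" from two copies of the dimer.
STACKED = DIMER + translate(DIMER, dy=30.0)      # side by side
END_TO_END = DIMER + translate(DIMER, dx=120.0)  # head to tail

if __name__ == "__main__":
    for label, beads in [("dimer", DIMER), ("stacked", STACKED),
                         ("end to end", END_TO_END)]:
        print(f"{label}: {len(beads)} beads, D max = {d_max(beads):.1f} Å")
```

In this toy model both assemblies double the bead count, but only the side-by-side arrangement leaves the maximum dimension nearly unchanged (an increase of under 5%), whereas the head-to-tail arrangement roughly doubles it.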
CRYSOL was applied to gauge the correctness of this tetramer model without any bias with respect to the experimental data. The intasome sans DNA model (22) was also included in this comparison. The results indicate that the apoform of the HIV IN tetramer comprising stacked reaching dimers is most compatible with the experimental data (Fig. 10C). Furthermore, our SAXS data for apoIN are inconsistent with the elongated model of the HIV intasome sans DNA tetramer derived from the crystal structures (2, 22) or solution structures (23) of PFV.

DISCUSSION
Elucidation of the Organization of Two Dimer Forms of HIV IN-In this study we applied SAXS, protein cross-linking coupled with mass spectrometry, and molecular modeling to reveal the architectures of full-length HIV IN dimers in the absence of DNA substrates (apoIN). Our analyses have allowed us to distinguish two dimer forms of the protein. One form, stabilized by core-core interactions, is observed when a charge substitution, E11K, is introduced into the NTD (Table 1; Fig. 4) or when the NTD is destabilized through removal of an essential Zn2+ ion from the wild type protein (Table 2; Fig. 6). Previous investigations have shown that wild type EDTA-treated IN formed monomers, but at much lower concentrations (19) that may be below those required for stable core-core interactions and are not suitable for SAXS analysis. Our preliminary modeling experiments indicate that the envelopes of the E11K derivative (Fig. 9) and EDTA-derived wild type dimer (not shown) can accommodate a centrally located core-core interface, with the NTDs and CTDs extended to either side. It is noteworthy that neither of these core-core-stabilized dimer envelopes resembles that predicted for the HIV IN dimer model derived from consolidation of the crystal structures of two-domain fragments (3).
The second dimer form has a reaching dimer architecture similar to that derived for wild type ASV IN (3). In this dimer, the core domains lie at opposite poles, and the structure is stabilized by interactions of the NTD of one monomer with the core and CTD of the second monomer as well as CTD-CTD interactions, mediated by Trp-243 stacking (Table 2 and Fig. 8). We showed that interruption of the 4-tiered interaction between conserved hydrophobic amino acids at the HIV IN core-core interface resulted in formation of a homogeneous reaching dimer. Although several residues were proposed to participate in the 4-tiered interaction (Trp-132, Met-178, Phe-181, and Phe-185) (13), in our studies only substitution of Phe-181 with either Thr or Ala produced monodisperse dimer preparations. Proteins with substitutions at the other positions formed either aggregates or equilibrium mixtures. As expected, when the core-core interface is compromised by the F181T substitution, disruption of NTD interactions by EDTA treatment leads to production of IN monomers (Fig. 6).
The HIV IN F181T reaching dimer is highly stable at high protein concentration and suitable for SAXS analysis (Table 2). Our cross-linking analyses showed that the positions of the terminal domains in the F181T reaching dimer varied somewhat from those in the reaching dimers of wild type HIV IN and ASV IN (3). Although the reason for this difference is unknown, we note that in the HIV IN F181T core domain, a critical hydrophobic amino acid that could interact with residues in the NTD (supplemental Fig. S1) has been replaced with a non-aromatic amino acid. Furthermore, the HIV-1 NTD has four surface-exposed aromatic residues (Phe-1, Tyr-15, Trp-19, and Phe-26) that are available for interaction with another NTD as well as the core and CTD. In contrast, ASV IN lacks aromatic residues in its NTD and may have less propensity to make such contacts. We speculate, therefore, that the NTD-NTD interaction in HIV IN F181T may reflect a "domain swapping" phenomenon (24) in which an interaction of the NTD with the core domain that involves an aromatic interface is replaced by an alternative interaction with the other NTD to obtain maximum hydrophobic packing. It is possible that such swapping may also occur transiently in a wild type reaching dimer.

Modeling an HIV ApoIN Tetramer-Stability of the HIV IN tetramer has been investigated extensively by various groups (13, 20, 25-28). Single amino acid substitutions of particular residues were observed to result in either equilibrium mixtures of dimers with tetramers or pure homo-dimers. Although these dimers were usually assumed to be core-core dimers (29, 30), the structure of these HIV dimers or the tetramer was unknown. Our SAXS analysis indicates that the wild type IN tetramer has a D max similar to that of our F181T dimer but that the envelope volumes of this tetramer and the F181T IN dimer are 260 and 130 Å³, respectively (Table 2). With the same D max but twice the volume, the simplest interpretation is that the tetramer comprises two stacked reaching dimers, likely stabilized by core-core interactions at both ends (Fig. 10A). Interestingly, our SAXS-derived tetramer model resembles a combined model of Cai et al. (25) that was proposed solely on consideration of structures and interactions of the isolated domains of HIV IN. Our tetramer model is also consistent with the SAXS parameters in the preferred model 1 of a LEDGF(IBD)-bound HIV IN tetramer of Gupta et al. (31), as IBD attachment to distally placed catalytic cores can account for the increased D max. Our results also showed that the HIV IN tetramer conformations are only slightly affected by the presence of the metal cofactor Mg2+, without any gross change in the overall architecture. The minor changes observed in the D max values of these two HIV IN proteins (wild type and D64N) in the presence of metal (Fig. 5) might reflect some of the conformational alterations detected in earlier studies (32, 33).
After an analysis of many homo-multimeric enzymes, Powers and Powers (34) have observed that although two alternate dimer assemblies could be envisioned, one dimer form is utilized predominantly in pathways of homo-tetramerization. They also present theoretical reasons for why interconversion of dimeric forms is not permitted without dissociation to monomers and why formation of trimers by monomer addition to a dimer is not observed. The dissociative requirement in the pathway of monomer-dimer-tetramer formation is interesting in light of a previously reported observation that intasome formation is enhanced by IN monomerization before DNA addition (19).
Application of PISA (Proteins Interfaces Surfaces and Assemblies) analysis (35) to our F181T reaching dimer (model A) provides a value for the free energy of formation (ΔG) of this dimer interface of -15.8 kcal/mol. This number is similar to the -14.0 kcal/mol that can be obtained from the outer core-core dimer interface in a model of the HIV IN intasome (22). Comparable values were obtained from the PFV intasome crystal structure: ΔG for the NTD-core interface of the PFV inner dimer is -14.5 kcal/mol, and for the core-core interface in the outer dimer the value is -15.3 kcal/mol. Although this calculation only relates to protein-protein interactions in the reaching dimer interface of the PFV NTD + NED with the core residues, it is likely that bound viral DNA contributes to complex formation and stability of the intasome. Overall, the HIV and PFV examples suggest that both dimer interfaces have similar stabilities. These estimates imply that wild type HIV IN can exist in two dimer forms in solution. If one interface has been compromised, the predominant form will be the alternate dimer.
Possible Modes of Intasome Assembly-An HIV IN tetramer model with a longer D max and a smaller R g has been derived by removal of the DNA substrate from the intasome model (22). Given the existence of flexible linkers from the core to the NTD and CTD and the potential for dynamic motion of these domains, as suggested earlier (3), it is possible that a transient tetramer conformation can favor viral DNA capture by one of the stacked reaching dimers, which then becomes the inner dimer of an intasome that performs catalysis. The terminal domains of the other stacked reaching dimer might simultaneously disengage to assume auxiliary functions. Others have proposed scenarios in which HIV IN subunits exchange and form core-core dimers, which bind DNA and come together to form a tetramer (29), or in which formation of functional intasomes involves the association of free monomers (19), as appears to be the case for PFV IN (2). Further experiments are needed to distinguish among these differing modes of intasome assembly and to determine the extent to which high salt concentration, a necessary condition of our biophysical analyses, affects the dimer-tetramer equilibrium or the disposition of terminal domains.
ApoHIV IN Is a Viable Drug Target-The lack of full-length HIV apoIN structures and of knowledge of their multimer architectures has been an impediment in the development of non-active site inhibitors. Previous studies with single chain variable fragments derived from NTD- and CTD-specific anti-HIV IN monoclonal antibodies revealed that their intracellular expression can protect human T cells from HIV-1 infection at a stage before integration, implying that the apoIN form is susceptible to drug intervention (36). As IN does not engage its viral DNA substrate productively until reverse transcription completes the viral DNA ends, there is a substantial window of opportunity after infection for compounds that target the apoIN multimers specifically.
Hayouka et al. (37) have demonstrated that LEDGF-derived peptides that bind to the core-core interface of HIV IN inhibit binding of the viral DNA substrate by stabilizing an inactive tetramer conformation. They called such allosteric inhibitors "shiftides" to highlight their ability to shift the equilibrium between active and inactive multimers. More recent studies have shown that small molecules that bind to the LEDGF binding site can also inhibit conformational changes that are required for DNA engagement at the active site (29, 38-40). We speculate that small molecules that attach to other sites on the apoIN multimer and either increase the stability of an inactive conformation or prevent establishment of the active conformation could also effectively block IN activity. The presence of alternate multimeric IN conformations is consistent with the inclusion of this protein in the morpheein paradigm, which describes a mechanism of enzyme regulation via modulation between alternate, functionally distinct oligomeric assemblies (41, 42). Proteins that assume such alternate conformations represent attractive targets for small molecule allosteric inhibitors (43), which in the case of HIV IN would be effective against emerging viruses that are resistant to active site inhibitors.
In summary, solution studies including SAXS have proven to be a powerful approach for investigating the flexible 3-domain IN proteins. In this study SAXS analysis in combination with chemical cross-linking and modeling has revealed the modular architecture of HIV IN multimers and suggested the most likely mode of tetramer assembly. The findings have provided new insight into IN protein structure and function and suggested valuable new strategies for inhibition.