Solution Conformation and Dynamics of the HIV-1 Integrase Core Domain*

The human immunodeficiency virus type 1 (HIV-1) integrase (IN) is a critical enzyme involved in infection. It catalyzes two reactions to integrate the viral cDNA into the host genome, 3′ processing and strand transfer, but the dynamic behavior of the active site during catalysis of these two processes remains poorly characterized. NMR spectroscopy can reveal important structural details about enzyme mechanisms, but to date the IN catalytic core domain has proven resistant to such an analysis. Here, we present the first NMR studies of a soluble variant of the catalytic core domain. The NMR chemical shifts are found to corroborate structures observed in crystals, and confirm prior studies suggesting that the α4 helix extends toward the active site. We also observe a dramatic improvement in NMR spectra with increasing MgCl2 concentration. This improvement suggests a structural transition not only near the active site residues but also throughout the entire molecule as IN binds Mg2+. In particular, the stability of the core domain is linked to the conformation of its C-terminal helix, which has implications for relative domain orientation in the full-length enzyme. 15N relaxation experiments further show that, although conformationally flexible, the catalytic loop of IN is not fully disordered in the absence of DNA. Indeed, automated chemical shift-based modeling of the active site loop reveals several stable clusters that show striking similarity to a recent crystal structure of prototype foamy virus IN bound to DNA.

tion, strand transfer, the target DNA is cleaved and viral cDNA is integrated by joining the 5′ end of the target sequence to the recessed 3′ end of the viral DNA (1,7,8,11,12). The final removal of the unpaired 5′ viral DNA ends and the ligation of the nicked DNA are thought to be carried out by cellular enzymes (13-15). Additionally, IN is observed to catalyze the reverse reaction, whereby the spliced cDNA is removed from the host genome (16). The biological relevance of this disintegration reaction is presently unclear, but it provides a useful means to characterize functionality of IN variants. Because of its functional requirement for viral replication, IN is an attractive target for inhibition, and drug development targeting of IN has been commercially successful (17). Further work inhibiting the mechanism of IN is ongoing and shows promise (18,19).
HIV-1 integrase is organized into three domains: an N-terminal zinc-binding domain (residues 1-50), a catalytic core domain (residues 50-212), and a C-terminal Src homology 3 domain (3, 20-22). Full-length IN exists in a dimer-tetramer equilibrium (23), but it is thought that at least a tetramer is required for biological activity (24). The function of each domain during integration remains unclear, but the C-terminal domain has nonspecific DNA affinity, and constructs without this domain fail to form tetramers (23). Importantly, the catalytic core domain alone can catalyze the disintegration reaction, and mutations to catalytic residues Asp64, Asp116, or Glu152 (the Asp-Asp-35-Glu motif) abolish all activity of the enzyme (3,25). Thus, whereas all three domains work in concert to perform integration, the core domain alone is directly responsible for the chemistry of 3′ processing and strand transfer.
The core domain has been studied extensively, and crystallography has yielded insight into its function, both when isolated (24, 26-31) and when paired with either the N-terminal (32) or C-terminal (33) domains. The structures reveal a dimeric core domain with an RNase H-type fold. The dimer is thought to be biologically relevant (24), but several regions of the structure, including a loop containing catalytic residue Glu152 (residues 140-153), are often disordered or involved in crystal packing interactions. Furthermore, mutagenesis studies, where inherently flexible Gly residues in this catalytic loop are replaced with Ala, suggest that the dynamics of the active site are important for catalysis (29). This is not surprising, as the core domain presumably must catalyze both the 3′ processing as well as the strand transfer reactions. The active site is known to complex divalent cations such as Mn2+ and Mg2+ (27), and binding of these cations stabilizes IN as evidenced by protease protection assays (34-36).
Recently, an IN homolog from prototype foamy virus (PFV) was crystallized in complex with DNA (37). The catalytic loop in this structure extends helix α4 in the N-terminal direction, and residues 146-148 immediately precede α4 as a short stretch of 3₁₀ helix. It is this 3₁₀ helix that fits into the pocket formed by the recessed 3′ viral DNA end. This study represents a significant step forward in understanding integrase biochemistry; nevertheless, it is not currently known whether the α4 helix kinks in the absence of DNA, nor is the interplay between dynamics and catalysis well understood. Computational studies on the core domain particle have yielded conflicting results: short time scale simulations suggest that the active site loop samples α-helical conformations (38), whereas longer time scale simulations suggest that this loop acts as a gate over the other catalytic residues in IN (39). Experimental validation of these simulations to date has been limited.
Solution NMR provides a powerful means to characterize biological macromolecules, elucidating both structure and dynamics, provided that adequate spectra can be obtained. It has the additional advantage of bypassing the requirement for crystallization, although it has its own difficulties for sample preparation. Solution NMR has been particularly useful for the study of HIV proteins, including studies of components of the Gag polyprotein (40-45). Although the structures of the N- and C-terminal domains of IN have also been determined by solution NMR (46-51), poor solubility and the need for high salt concentrations have hindered NMR characterization of the catalytic domain (52). Here, we present the solution characterization of a soluble dimeric variant of the IN core domain (IN50-212), Q53E,C56S,W131E,F185K,Q209E, which yields good NMR spectra in the presence of 40 mM MgCl2. Having assigned backbone resonances of 97% of the residues, we show that the chemical shifts are consistent with crystal structures of the core domain. We report the dynamics of IN50-212 and find that the catalytic loop is only moderately dynamic on the picosecond to nanosecond time scale. Additionally, we investigate a structural transition induced by Mg2+ binding that dramatically improves the NMR spectral quality. This conformational shift, while requiring relatively high concentrations of MgCl2, may reveal important clues about the structural stability of IN, including effects on the N- and C-terminal domain orientations. Finally, we study the interaction of the IN core domain with raltegravir, a commercially available drug for IN inhibition (17).

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-The F185K,W131E IN50-212 construct was used in earlier crystallographic studies (30). Initial screening revealed that this construct had limited solubility at NaCl concentrations below 250 mM and pH ≤ 7.0, so mutagenesis was performed to lower the pI of the protein. Mutagenesis sites were located at the protein termini to avoid undesired effects on catalysis. The final variant, Q53E,C56S,W131E,F185K,Q209E, was created using the QuikChange II mutagenesis kit (Agilent) and confirmed using DNA sequencing. Further testing revealed this variant to be more soluble and better suited for NMR conditions (up to 700 μM at pH 7.0, 150 mM NaCl).
Deuterated, 15N- and 13C-labeled IN50-212 was expressed in BL21 Star(DE3) Escherichia coli cells (Invitrogen). A modified M9 growth medium was used (53), supplemented with 1 g/liter of 13C/2H/15N-IsoGro (Sigma). [1H/13C]Glucose was used in the medium to reduce 1H T1 values by retaining partial protonation of methyl groups. After an overnight growth at 37°C in 90% D2O, 1 ml of culture was transferred to 1 liter of medium in 99% D2O. The resulting protein is nearly fully deuterated at all Hα positions, and characterization by mass spectrometry indicated that the overall deuteration fraction of the protein was 85-90%. The culture was grown at 37°C to an A600 of 0.6 and induced with 1 mM isopropyl thiogalactoside for 6 h. Cells were then harvested by centrifugation and stored overnight at −80°C.
Residue-specifically labeled IN50-212 was expressed using methods as outlined previously (54,55). Briefly, unlabeled amino acids were added to unlabeled M9 growth medium at a concentration of 60 mg/liter. This mixture contained all amino acids except those that would ultimately be labeled, e.g. Val would be excluded if 15N-Val labeling was desired. BL21 Star(DE3) cells were grown in this medium and induced with 1 mM isopropyl thiogalactoside when they reached A600 of 0.6. Fifteen minutes before induction, labeled amino acids were added to the medium (60 mg/liter), as well as the transaminase inhibitors disodium succinate, oxaloacetate, and sodium maleate (250 mg/liter each) (56). Cells were grown for 1.5 h and harvested by centrifugation. Residue-specifically labeled constructs were made for 15N-Ile, 15N-Leu, 15N-Lys, 15N-Met, 15N-Phe/Tyr, and 15N-Val. Scrambling with other amino acids was very low, typically less than 5%.
Protein purification was performed as described by Goldgur et al. (27). The final dialysis buffer for IN50-212 was 150 mM NaCl, 20 mM HEPES, pH 6.8, 40 mM MgCl2, 6% D2O, and 0.02% NaN3. Additionally, EDTA-free Complete Protease Inhibitor (Roche Applied Science) was added to reduce proteolytic degradation during storage. The dialyzed protein was then concentrated by ultrafiltration to 500-700 μM and used for NMR measurements. The core domain was confirmed to be dimeric in both the presence and absence of MgCl2 using a Wyatt DAWN EOS multiangle laser light scattering detector, yielding an estimated molecular mass of 35.5 kDa.
NMR Measurements-NMR measurements were carried out at 298 K on Bruker Avance 600, DRX800, and Avance 900 spectrometers equipped with cryogenic triple-resonance probeheads and z axis pulsed field gradients. Backbone assignment was performed using TROSY-based versions of the HNCO, HNCA, HNCB, HN(CO)CA, HN(CA)CO, and HN(COCA)CB triple resonance experiments (57) with modifications to the TROSY block for gradient selection and improved water suppression (58). Triple resonance pulse programs were also modified to allow for long nitrogen evolution times using the mixed constant time strategy (59). Triple resonance experiments used acquisition times of 45 ms (90 complex points) in the 15N dimension. For experiments measuring Cα and Cβ resonances, 30 complex points were measured in the 13C dimension, with total acquisition times of 12 (Cα) and 2 ms (Cβ). Acquisition times in the 13C′ dimension were 36 ms (60 complex points). 1H acquisition times varied from 60 to 70 ms using 512 complex points. A three-dimensional 15N-1H MT-PARE HMQC-NOESY spectrum was acquired on IN50-212 at the 800 MHz field with a 150-ms NOE mixing time (59). This experiment used EBURP and reverse EBURP-selective 1HN pulses (60) for the first two 90° pulses as well as for the final readout block, to improve water suppression and increase the repetition rate of the experiment (61). Acquisition time in the 15  TROSY HNCO-based experiments were employed to measure R1 and R1ρ relaxation rates at 600 and 900 MHz (62). An additional hyperbolic secant (600 MHz) or WURST (900 MHz) pulse was applied to 13C nuclei at the midpoint of the R1 and R1ρ relaxation periods to eliminate nitrogen-carbon cross-correlated relaxation effects. At 600 MHz (900 MHz), the 13C′ acquisition time was 22 ms (9 ms) over 20 (12) complex points, and 15N evolution was measured for 70 ms (55 ms) using 140 (165) complex points.
The spectra were folded extensively in the 13C′ dimension to reduce overall measurement time. At 600 MHz, R1 and R1ρ rates were measured using two points assuming an exponential decay, measured at 40 and 880 ms for R1 and at 2 and 35 ms for R1ρ. At 900 MHz these times are identical for R1, but 2 and 33 ms for R1ρ. To improve sensitivity, the latter time point was measured for twice as many scans as the earlier time point (63). Steady state 15N-{1H} NOE values in IN50-212 were measured at 600 MHz. Because of the long recycle delays required to avoid proton saturation during the no-NOE reference experiment, a two-dimensional TROSY readout was used instead of the three-dimensional TROSY-HNCO. In this experiment, 15N was acquired for 150 ms (300 complex data points), and 1H acquisition was 60 ms (512 complex points). R1ρ was measured using a 2-kHz spin lock field, and R2 was calculated for this field using R1 and R1ρ (64).
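The two-point rate measurement and the conversion from R1ρ to R2 described above can be illustrated with a minimal Python sketch. It assumes a single-exponential decay for the two-point estimate and the commonly used tilted-frame relation R1ρ = R1 cos²θ + R2 sin²θ with tan θ = (spin lock field)/(resonance offset); the function names are hypothetical, and the exact procedure of Ref. 64 may differ in detail.

```python
import math

def two_point_rate(i_short, i_long, t_short, t_long):
    """Decay rate (s^-1) from intensities at two delays, assuming I(t) = I0 * exp(-R * t)."""
    if i_short <= 0 or i_long <= 0:
        raise ValueError("intensities must be positive")
    return math.log(i_short / i_long) / (t_long - t_short)

def r2_from_r1rho(r1rho, r1, spinlock_hz, offset_hz):
    """R2 from R1rho and R1 (assumed relation, exchange contributions ignored).

    R1rho = R1 * cos^2(theta) + R2 * sin^2(theta), with tan(theta) = spinlock / offset.
    """
    theta = math.atan2(spinlock_hz, offset_hz)  # tilt angle of the effective field
    sin2 = math.sin(theta) ** 2
    return (r1rho - r1 * (1.0 - sin2)) / sin2
```

For a resonance exactly on the spin lock carrier (zero offset), θ is 90° and the sketch returns R2 = R1ρ; the correction grows as the offset becomes comparable to the 2-kHz spin lock field.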
Data Processing and Chemical Shift Assignment-All NMR data were processed using the NMRPipe/NMRDraw software package (65). Peak picking and initial assignment were performed with the aid of the AutoLink package (66) within CARA (67). Site-specifically labeled variants were used to confirm and complete the assignments manually. For relaxation experiments, peaks were picked using NMRDraw and fit to exponential decays using in-house scripts.
Structure Ensemble Generation-Chemical shift-derived structures were created using the CS-ROSETTA protocol (68). In the first set of calculations, all backbone and 13Cβ chemical shifts were used to construct a backbone model for the core domain monomer using the standard CS-ROSETTA algorithm. Note that use of CS-ROSETTA to generate a monomer structure for this homodimeric protein requires prior knowledge from the x-ray structures that the protein is not a domain-swapped dimer. The latter would require an alternate computational approach, for which IN falls beyond the current size limit of CS-ROSETTA (69). In the monomer structures generated by CS-ROSETTA, the conformation of the active site is not sampled optimally because the ROSETTA energy is calculated over the entire molecule. Hence, models that make favorable interactions at the active site, but have higher energies elsewhere would not be selected in the final scoring process as the total energy would remain high. To better sample low energy conformations of the catalytic loop that are in agreement with NMR chemical shifts, a second set of CS-ROSETTA calculations was performed. For this round, fragment selection (70) was once again carried out by using the backbone and 13Cβ chemical shifts; however, during refinement, all residues were fixed to the x-ray coordinates (PDB entry 1QS4, chain C) except for the catalytic loop (residues 139-153) and N-terminal residues 50-57. Specifically, the loop-relax protocol in ROSETTA 3.0 was used to generate 3,000 all-atom models, where conformational sampling was confined to the catalytic loop. These 3,000 models were clustered into 20 groups according to the conformations sampled by the active site loop. For each model in each cluster, the ROSETTA full-atom energy was calculated for the residues in the active site loop only.
To this energy a chemical shift component was added to favor agreement between experimental and SPARTA-predicted chemical shifts, following the standard protocol (68). Then, for each of the lowest energy clusters, a representative low energy model was selected. Six such models, representing the six lowest energy clusters, were finally selected as representative of the ensemble. Repeating the analysis on PDB entry 1BL3 produced nearly identical clusters.
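The final selection step can be sketched in Python as follows. This is only an illustration under stated assumptions: each model is taken to carry a single combined score (loop energy plus chemical shift term), and a cluster is ranked by its lowest-scoring member, a criterion the text does not specify. The function name and tuple layout are hypothetical and not part of ROSETTA.

```python
def select_representatives(models, n_clusters=6):
    """Pick the lowest-energy model from each of the n lowest-energy clusters.

    models: iterable of (model_id, cluster_id, energy) tuples, where energy is
    assumed to be the loop-only score plus the chemical shift term.
    Returns a list of (cluster_id, model_id, energy), best cluster first.
    """
    best = {}
    for model_id, cluster_id, energy in models:
        # Keep only the lowest-energy member seen so far for each cluster.
        if cluster_id not in best or energy < best[cluster_id][1]:
            best[cluster_id] = (model_id, energy)
    # Assumption: clusters are ranked by their best member's energy.
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(cid, mid, e) for cid, (mid, e) in ranked[:n_clusters]]
```

Applied to 3,000 models in 20 clusters with the default n_clusters=6, the function returns six representatives, one per cluster.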
Relaxation Analysis-15N R1, R2, and 15N-{1H} NOE values at 600 MHz and 15N R1 and R2 values at 900 MHz were used to determine an isotropic diffusion tensor for IN50-212. Residues used for tensor optimization were filtered by the following criteria: only residues existing in regular secondary structure were selected for analysis. Additionally, residues were removed with a chemical shift order parameter <0.7 (71, 72), or a 15 (73) and FAST-ModelFree (74). Lipari-Szabo order parameters (75-77) were obtained for all residues from the 600 MHz data. Model selection was performed by FAST-ModelFree (74) using the previously determined, fixed isotropic diffusion tensor.
Magnesium and Manganese Titrations-Magnesium titrations were performed by direct addition into 340 μM IN50-212, using a 2 M stock solution of MgCl2. 15N-1H TROSY-HSQC spectra were recorded at 2, 10, 20, and 40 mM MgCl2 on a Bruker Avance 750 MHz spectrometer. Spectral quality improved substantially at MgCl2 concentrations above 10 mM, but reasonable TROSY HNCA/HN(CO)CA spectra could also be obtained at 5 mM MgCl2, and these were used to follow changes in Cα chemical shifts. To test for specific binding of Mg2+/Mn2+, we added 10 and 50 μM MnCl2 to the 40 mM MgCl2 standard NMR buffer. pH stability before and after the titrations was confirmed by direct measurement. Chemical shift changes as a function of Mg2+ concentration were quantified using a weighted mean chemical shift change, as described by Grzesiek et al. (78). Peaks present in at least three of the four spectra were used to determine a two-state Kd value. Fit parameters were bootstrapped using the R statistical computing package (79). Examination of the MnCl2 peak attenuation and comparison with the crystal structure 1BL3 (28) revealed the possibility of a second Mn2+ binding site not seen in the crystal structure. In the fast exchange limit, the weighted average relaxation rate for a single proton site experiencing paramagnetic relaxation enhancement (PRE) resulting from a bound Mn2+ is given by, where R0 is the intrinsic proton relaxation rate, Kd is the Mn2+ dissociation constant, x is the concentration of Mn2+, and A is a scaling constant for the PRE term, determined primarily by the protein tumbling time τc (80-82). If NMR line shapes are Lorentzian, and ignoring magnetization decay during the fixed duration transfer delays in the NMR pulse sequence, the peak height attenuation is simply the ratio of rates without and with Mn2+.
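A minimal Python sketch of this attenuation model is given below. The functional form is an assumption built from the definitions above: the PRE term is taken as the bound fraction x/(Kd + x) multiplied by A/r⁶, where r is the proton to Mn2+ distance, and the attenuation is the ratio R0/R. All function names are hypothetical, units are left to the caller, and the grid search over candidate positions merely stands in for the optimization of the ion position.

```python
import math

def bound_fraction(x, kd):
    """Fraction of sites occupied by Mn2+ at concentration x (two-state binding)."""
    return x / (kd + x)

def pre_rate(x, r0, kd, a, r):
    """Population-weighted proton relaxation rate (assumed form: R0 + f_bound * A / r^6)."""
    return r0 + bound_fraction(x, kd) * a / r ** 6

def attenuation(x, r0, kd, a, r):
    """Peak height ratio with/without Mn2+, assuming Lorentzian lines: R0 / R."""
    return r0 / pre_rate(x, r0, kd, a, r)

def chi_squared(observed, predicted, sigma=1.0):
    """Sum of squared, error-normalized residuals."""
    return sum(((o - p) / sigma) ** 2 for o, p in zip(observed, predicted))

def best_site(candidates, protons, observed, x, r0, kd, a):
    """Candidate ion position (x, y, z) that minimizes chi-squared over all protons."""
    def score(site):
        predicted = [attenuation(x, r0, kd, a, math.dist(site, h)) for h in protons]
        return chi_squared(observed, predicted)
    return min(candidates, key=score)
```

Because the PRE term falls off as r⁻⁶, only protons within a short distance of a candidate site contribute appreciably to the χ-squared statistic, which is why the attenuation pattern can localize the ion.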
Using the crystallographic Mg2+ ion, and assuming equivalent binding and relaxation rates across residues 60-180, we fit R0, Kd, and A to the data, and then used those values to optimize a second Mg2+ binding site location near the C-terminal helix. The χ-squared statistic measuring the difference between the observed and predicted attenuation of residues 181-212 was optimized as a function of Mn2+ position using the R statistics package (79). The resulting fit (red curve in Fig. 4B) was robust to small changes in R0, Kd, and A and consistently produced the same Mn2+ site (Fig. 2, B and C). Repeating the analysis for PDB entries 1BIU (27) and 1QS4 (30) yielded very similar locations for the second Mn2+.

Raltegravir Purification and Titration-Raltegravir (Isentress, Merck) tablets were obtained commercially. After removal of the tablet coating, tablets were crushed and suspended in deuterated dimethyl sulfoxide (DMSO). After centrifugation for 10 min at 16,000 × g, the supernatant was collected. A one-dimensional 1H NMR spectrum was recorded and compared with synthetic, pure raltegravir obtained from Dr. Yves Pommier. The comparison revealed the tablet extract to be ~90% pure, and this material was used in titrations without further purification. Raltegravir was quantified by NMR using a trimethylsilyl propionic acid standard, and stock solution concentrations were 45-50 mM in DMSO. The raltegravir extinction coefficient in DMSO at 341 nm was determined to be 9960 ± 120 M−1 cm−1. The uncertainty in this number was estimated from the variation of peak volumes in the one-dimensional 1H NMR spectrum.
Titrations were performed on a Bruker Avance 750 MHz spectrometer using a 49 mM stock solution of raltegravir and an NMR sample of 245 μM IN50-212. Raltegravir was added in steps of 200-1000 μM up to a drug:protein ratio of 14:1. The final DMSO concentration was 6.5% (v/v). To assess the effects of DMSO on IN50-212, the titration was repeated with DMSO alone, and peak positions of a 15N-1H TROSY-HSQC were compared at corresponding titration points. Although chemical shift changes were observed during the raltegravir titration, nearly identical changes were observed with DMSO alone, indicating that the effects of raltegravir on the isolated core domain are very small.
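The stated final DMSO concentration follows from the stock concentration and the final drug:protein ratio, as the short Python check below illustrates. It assumes that the stock is raltegravir in neat DMSO, that volumes are additive, and that the protein is diluted by the added stock; the function name is hypothetical.

```python
def stock_volume_fraction(stock_mM, protein_uM, ratio):
    """Fraction of the final volume contributed by the drug stock.

    With f the stock volume fraction: drug = stock * f and protein = P0 * (1 - f),
    so ratio = stock * f / (P0 * (1 - f)), which is solved here for f.
    """
    k = ratio * (protein_uM / 1000.0) / stock_mM  # equals f / (1 - f)
    return k / (1.0 + k)

f = stock_volume_fraction(stock_mM=49.0, protein_uM=245.0, ratio=14.0)
final_drug_mM = 49.0 * f             # raltegravir concentration at the end point
final_protein_uM = 245.0 * (1 - f)   # protein concentration after dilution
print(f"DMSO {100 * f:.1f}% (v/v), drug {final_drug_mM:.2f} mM, protein {final_protein_uM:.0f} uM")
```

Under these assumptions the stock accounts for about 6.5% of the final volume, in agreement with the DMSO concentration quoted above.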

RESULTS
Chemical Shift Assignments-The F185K,W131E IN50-212 variant has been used in crystallization studies and is catalytically competent for 3′ processing (27). However, when studied by NMR, this construct yielded very poor spectra and required high salt concentrations, which had a strong adverse impact on the sensitivity of our experiments. Site-directed mutagenesis was employed to develop a soluble variant of IN50-212 amenable to NMR characterization. First, we introduced C56S to eliminate potential disulfide bonding between monomers. It is known that this mutation does not hinder disintegration (27). Then, working from the termini to avoid interfering with the active site, we introduced two Glu residues (Q53E and Q209E) to lower the protein pI. These residues are not visible in most crystal structures and therefore should not affect the conformation of the active site. The final variant, Q53E,C56S,W131E,F185K,Q209E, exhibited good solubility at 150 mM NaCl, pH 6.8, a substantial improvement over the original variant, which required NaCl concentrations larger than 250 mM. In this paper, IN50-212 designates this particular variant of the integrase core domain.
Although IN50-212 was soluble, spectra were still unfit for NMR assignment purposes (Fig. 1A). Initial experiments included 5 mM MgCl2 (27), but increasing the MgCl2 concentration was found to dramatically increase the spectral quality to the point where triple resonance assignment techniques could be employed (Fig. 1B). Not only did the line width of existing peaks become more uniform, but more peaks were observed as well. At 40 mM MgCl2, the marginal improvement in spectral quality diminished, and no new peaks were observed beyond this concentration. Characterizing the IN50-212 variant by gel filtration chromatography revealed that, while still eluting as a dimer, the particle had diminished in size somewhat, possibly indicating a structural collapse of a partially disordered region (data not shown).
The standard suite of TROSY triple resonance experiments was employed to assign backbone HN, N, C′, Cα, and Cβ resonances on 15N-, 13C-, 2H-labeled IN50-212 (57). However, because of spectral overlap, several assignments remained ambiguous even after all other connectivities were established. To resolve this problem, 15N residue-specific labeling was employed to identify particular classes of amino acids and to confirm existing assignments. Additionally, a three-dimensional 15N-1H MT-PARE HMQC-NOESY spectrum was recorded to determine HN-HN connectivities through space where traditional J-correlated methods failed to yield unambiguous results. The collected spectra allowed for the assignment of 97% of all non-prolyl backbone and 13Cβ resonances from residues 51 to 212 as well as two sets of assignments for residues 205 to 212. Assignments have been submitted to the BMRB data base (accession number 16695; supplemental Table S1).
Structural Information from Chemical Shifts-Because of the size of dimeric IN50-212 (36 kDa) and its relatively unfavorable NMR properties, high quality NMR spectra of uniformly protonated samples were not obtainable, precluding extensive side chain 1H assignment and traditional 1H NOE-based methods for structure determination that are successful on smaller proteins (83,84). Although work is ongoing to apply small angle x-ray scattering and residual dipolar coupling constraints to develop refined models of the integrase core domain, recent developments have allowed a wealth of structural information to be inferred from chemical shifts alone.
It is well known that Cα and Cβ secondary chemical shifts (i.e. their deviations from random coil values) are strongly correlated with backbone torsion angles and thereby with secondary structures in folded proteins (85,86). We characterized secondary shifts in IN50-212 and compared them with the secondary structure reported for chain C in the crystal structure of PDB entry 1BL3 (28) (supplemental Fig. S1). This is the highest resolution crystal structure of the core domain obtained to date in the presence of Mg2+. The agreement between the NMR observations and the x-ray data is very good, and the TALOS+ secondary structure prediction from chemical shifts (72) identifies every unit of secondary structure observed in the crystal structure (supplemental Fig. S1). Some minor differences are seen at the ends of helices and strands, but these may be attributed to fraying in solution, especially considering the dynamic nature of the molecule (see below).
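The idea behind secondary chemical shifts can be illustrated with a short Python sketch. It is a deliberately simplified heuristic and not the TALOS+ algorithm: the combined quantity ΔCα − ΔCβ tends to be positive in helices and negative in strands, and the 1.0-ppm threshold, the function names, and the approximate Ala random coil values in the usage lines are illustrative assumptions.

```python
def secondary_shift(observed_ppm, random_coil_ppm):
    """Secondary chemical shift: observed minus random coil value (ppm)."""
    return observed_ppm - random_coil_ppm

def classify(delta_ca, delta_cb, threshold=1.0):
    """Crude per-residue call from the combined shift (delta_ca - delta_cb).

    Assumption: values above +threshold suggest helix, below -threshold suggest
    strand, anything in between is reported as coil.
    """
    combined = delta_ca - delta_cb
    if combined > threshold:
        return "helix"
    if combined < -threshold:
        return "strand"
    return "coil"

# Illustrative usage with approximate random coil values for Ala (assumed, in ppm).
d_ca = secondary_shift(54.9, 52.5)
d_cb = secondary_shift(18.4, 19.1)
print(classify(d_ca, d_cb))
```

In practice such calls are smoothed over neighboring residues, and near-zero values for both nuclei, as reported below for the alternate C-terminal assignments, indicate a disordered segment.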
To further characterize the structure of IN50-212 in solution, we created a CS-ROSETTA model of a core domain monomer (68). This algorithm relies on backbone atom chemical shifts to select possible peptide fragments, which are then assembled and scored according to the ROSETTA scoring functions. The resulting structure is not a true NMR solution structure and is less than ideal for two reasons. First, no experimental data were used to determine side chain conformations, although nine detected long range 1HN-1HN NOEs were included and helped to ensure correct pairing of the β-strands. Second, dimer calculations performed on IN50-212 with CS-ROSETTA did not converge because of the size of the system. As a result, we were limited to generating models for the monomeric unit (Fig. 2A, red conformation). In the top 10 low-energy CS-ROSETTA models, the position of the C-terminal helix is ill-defined, retaining its helical fold yet sampling multiple orientations. The R1/R2 values for this helix are similar to those for residues 60-186 (Fig. 3, discussed below), indicating that the C-terminal helix has the same rotational diffusion properties as the rest of the core domain and is not actually sampling multiple conformations. The CS-ROSETTA simulation likely lacks sufficient restraints to orient the C-terminal helix correctly, because it has no dimeric partner to pack against. Thus, we have no reason to doubt the crystallographic orientation, which pairs the two helices along the dimer interface. Excluding the C-terminal helix, the CS-ROSETTA model for the remainder of the core domain agrees well with crystal structures, and the backbone root mean square deviation for residues 60-186 is only 1.6 Å relative to PDB entry 1BL3, chain C (Fig. 2A, blue conformation).
Partial Unfolding of the C-terminal Helix-During the assignment process, more resonances were observed in the HNCO spectrum than were expected given the number of backbone amides in the core domain. Additionally, two strong resonances were observed near 7.9 ppm (1H) and 128 ppm (15N), where the C-terminal residue typically appears. Linking both C termini to their preceding residues resulted in two sets of connectivities: one that became less and less intense and impossible to trace after seven residues, and one that could be assigned with the rest of the core domain. Because both fragments show Cβ resonances consistent with the C-terminal residues, initially we suspected that the alternate assignment was simply a proteolytic fragment of the C terminus. However, closer inspection of the NOESY data reveals cross-peaks between both sets of assignments, suggestive of conformations in slow exchange (supplemental Fig. S3). The secondary shifts for the shorter assignment fragment are near zero for both Cα and Cβ resonances, indicating that this conformation is highly disordered. At 40 mM MgCl2, it is also the less populated conformation based on peak intensity. Assuming a two-state model and using peak volumes determined by NMRPipe (65), we estimate an apparent Keq of 1.7 ± 0.6 favoring the folded conformation of the C-terminal helix at 40 mM MgCl2. Given the ratio of cross-peaks to diagonal peaks as well as the NOE mixing time (150 ms), it is possible to calculate kex, the sum of forward and backward exchange rates (87). Although kinetics are harder to determine because of the overlap observed for cross-peaks in the NOESY spectrum, the ratios observed for residues 206-207 and 209-212 are consistent and averaged to 0.8 ± 0.2 s−1.
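The relationship between these quantities can be sketched in Python using the textbook expressions for two-site exchange in a NOESY/EXSY experiment. The sketch assumes equal longitudinal relaxation rates in both states and no NOE contribution to the exchange cross-peak, so that diagonal and cross-peak intensities scale as pA(pA + pB·E) and pA·pB·(1 − E) with E = exp(−kex·tmix); it is not necessarily the exact procedure of Ref. 87, and the function names are hypothetical.

```python
import math

def populations(k_eq):
    """Fractional populations (major, minor) for a two-state equilibrium constant."""
    p_major = k_eq / (1.0 + k_eq)
    return p_major, 1.0 - p_major

def exchange_ratio(kex, p_self, p_other, t_mix):
    """Cross-peak / diagonal-peak intensity ratio for the state with population p_self."""
    e = math.exp(-kex * t_mix)
    return p_other * (1.0 - e) / (p_self + p_other * e)

def kex_from_ratio(ratio, p_self, p_other, t_mix):
    """Invert exchange_ratio for kex (s^-1), the sum of forward and backward rates."""
    e = (p_other - ratio * p_self) / (p_other * (1.0 + ratio))
    if not 0.0 < e <= 1.0:
        raise ValueError("ratio is inconsistent with two-site exchange at these populations")
    return -math.log(e) / t_mix

p_folded, p_unfolded = populations(1.7)  # apparent Keq favoring the folded helix
```

With Keq = 1.7 the folded state is populated to roughly 63%, and at a 150-ms mixing time a kex near 0.8 s−1 corresponds to cross-peaks that are only a few percent of the folded-state diagonal peak, which illustrates why overlap makes the kinetics difficult to quantify.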
To probe whether the original W131E,F185K construct also showed C-terminal helix unfolding, we recorded tr-HSQC spectra at 40 mM MgCl2. Two C-terminal peaks were observed, although the peak corresponding to the unfolded form was much weaker, less than 1% of the peak corresponding to the primary conformation (data not shown). Therefore, it is likely that the Q209E mutation used in our studies affects the stability of the final helix. Interestingly, even at 40 mM MgCl2, ~30% of the backbone amide resonances remain missing in the F185K,W131E spectrum. If Q209E destabilizes the C-terminal helix, one would expect the original construct to yield a better spectrum, but this is not the case. It is possible that this helix modulates the conformation of the rest of IN50-212 depending on how it folds, and this may be important for full-length integrase function.
Fast Time Scale Dynamics-To probe dynamics of the core domain, we measured 15 Table S2). These values characterize the flexibility of the core domain on the picosecond to nanosecond time scale and are useful for determining which parts of the chain are disordered. Assessed by heteronuclear NOE values <0.6, the most dynamic regions of the protein (after the termini) are residues 140-153, near the active site loop, and residues 185-195, before the final C-terminal helix. 15N relaxation data indicate that N-terminal residues 50-56, lacking electron density in the X-ray structure, are also dynamically disordered in solution, suggesting that the Q53E,C56S mutations have little impact on the ordered structure in the core domain. The active site is known to be conformationally promiscuous (26), and corresponding residues in RNase H (88) and bacteriophage Mu transposase (89) are also flexible. Many crystal structures lack density for the loop containing the catalytic residue Glu152 (24,27). Similarly, multiple conformations have been observed for residues 180-195, and in the extreme cases both short stretches of α-helix (27) as well as β turns (32) have been observed. The conformational heterogeneity of these regions seen in crystals is therefore consistent with the backbone dynamics observed in solution.
[Fig. 3 legend, partial: (108). D, Lipari-Szabo order parameters determined using measurements from panels A-C, as described in the text. IN exhibits some dynamics on the picosecond-nanosecond time scale in the active site region (residues 140-153), but several of the residues in the 188-196 loop are markedly more dynamic on this time scale. The alternate assignments for C-terminal residues 204-212 (broken axis at the right) indicate that these assignments correspond to a dynamically disordered state. The secondary structure from PDB entry 1BL3, shown in panels A and D, is plotted as helices (outlined rectangles) and strands (solid black arrows).]

To further investigate IN50-212 dynamics, we used the ModelFree (73) and Fast ModelFree (74) programs to determine the isotropic tumbling time τc and Lipari-Szabo order parameters (75-77). The τc for the core domain particle is 20.5 ± 0.2 ns, which is somewhat larger than expected for a 36-kDa dimeric system. Multiangle light scattering confirmed that IN50-212 is fully dimeric at concentrations of 55 μM, but partial unfolding of the C-terminal helix or transient aggregation of the core domain particles at the higher concentrations used in NMR may explain the elevated τc value. Of note is that the loop containing Glu152 is somewhat dynamic on the picosecond-nanosecond time scale, whereas the backbones of catalytic residues Asp64 and Asp116 are found to be well ordered. In the ModelFree analysis, residues 185-195 remain highly dynamic compared with residues 140-153, suggesting that the C-terminal loop is the most disordered in the molecule, even more so than the active site loop. Comparing the dynamics observed at the active site and the 185-195 loop, it is interesting to note that the latter shows behavior typical for a disordered loop, with the order parameter lowest near its midpoint and gradually increasing when approaching the ends of the loop.
In contrast, residues 140 -153 show a more limited degree of internal dynamics, which remains relatively flat across the entire active site loop, suggestive of concerted motions between distinct states, rather than simple increasing disorder toward its midpoint.
In agreement with the secondary chemical shift data, the 15N relaxation data and NOE values also indicate that residues 205-212 in the alternate conformation are disordered (Fig. 3, right). Although this conformation appears to be promoted by the mutations used to solubilize IN50-212, it is nevertheless surprising that these residues are in slow exchange, as helix fraying typically occurs on the fast exchange, submicrosecond time scale. Because of the disorder seen for the alternate conformation, it is likely that the entire C-terminal helix unfolds, with residues prior to residue 205 not observable because of base-catalyzed hydrogen exchange, which increases with distance from the carboxyl terminus. This would reconcile the observed dynamics with the slow exchange implied by multiple resonance peaks.
Magnesium and Manganese Binding-The addition of MgCl2 to NMR samples of IN50-212 dramatically improved the spectral quality. To investigate the causes of this improvement, we performed titrations on the core domain, from 2 to 40 mM MgCl2. During this titration, peaks were observed to both shift and sharpen as MgCl2 was added, indicating intermediate exchange (supplemental Figs. S3-S5). The weighted mean chemical shift differences between 10 and 40 mM MgCl2 reveal that most residues in IN50-212 are affected by Mg2+ concentration to some extent, but catalytic residues Asp64 and Asp116 are very strongly affected, as indicated by the fact that they do not become visible in NMR spectra until at least 20 mM MgCl2 is present (supplemental Fig. S4). The catalytic loop (residues 140-150) and the start of α4 are visible at lower MgCl2 concentrations, and significant chemical shift changes are observed for these residues (Fig. 4A; mapped onto the structure in Fig. 2B). Another region that is particularly sensitive to MgCl2 concentration comprises residues 90-96. This region is not known to be involved in 3′ processing or strand transfer. Nevertheless, Glu92 coordinates the Mg2+ ion observed crystallographically (27), and mutations to this residue confer resistance to the IN inhibitors raltegravir (90) and elvitegravir (91,92), suggesting that this loop may be important for catalysis. Although many peaks disappear entirely below 5 mM MgCl2 (including all of the catalytic residues), we are able to determine apparent Kd values from our experiments. Residues Gly118, Asn120, Thr115, and Asn155, proximal to the active site, are most sensitive to MgCl2, with a Kd of 1.75 ± 0.18 mM. Residues Ala91, Thr93, Glu96, and Tyr99 have a Kd of 1.84 ± 0.28 mM and probably track the same binding event. Other residues titrate with weaker affinities but correspondingly larger uncertainties, e.g. Gly163 with a Kd of 3.7 ± 1.0 mM (supplemental Fig. S3). The Kd values for all residues generally differ by no more than 2 S.D. of experimental uncertainty.
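To illustrate how an apparent Kd can be extracted from a chemical shift titration of this kind, the following is a minimal sketch and not the fitting procedure used in this work. It assumes a single-site isotherm, Δδ = Δδmax·[Mg]/(Kd + [Mg]), with Mg2+ in large excess over protein so that free and total metal concentrations are taken as equal. The function name `fit_kd`, the concentration points, and the synthetic shifts are all hypothetical.

```python
# Sketch: fit a single-site binding isotherm to chemical shift titration data.
# Assumption: free [Mg2+] ~ total [Mg2+] (metal in large excess over protein).

def fit_kd(conc_mM, delta_ppm, kd_grid=None):
    """Return (Kd, delta_max) minimizing the squared error of
    delta = delta_max * c / (Kd + c), scanning Kd over a grid."""
    if kd_grid is None:
        kd_grid = [0.01 * i for i in range(1, 5001)]  # 0.01 to 50 mM
    best = None
    for kd in kd_grid:
        # For a fixed Kd the model is linear in delta_max, so the
        # least-squares delta_max has a closed form.
        x = [c / (kd + c) for c in conc_mM]
        sxx = sum(v * v for v in x)
        sxy = sum(v * d for v, d in zip(x, delta_ppm))
        dmax = sxy / sxx
        sse = sum((d - dmax * v) ** 2 for v, d in zip(x, delta_ppm))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]

# Hypothetical, noise-free data generated from Kd = 1.75 mM.
conc = [2.0, 5.0, 10.0, 20.0, 40.0]          # mM MgCl2 (illustrative points)
true_kd, true_dmax = 1.75, 0.20               # mM, ppm (illustrative values)
shifts = [true_dmax * c / (true_kd + c) for c in conc]

kd_fit, dmax_fit = fit_kd(conc, shifts)
print(f"fitted Kd = {kd_fit:.2f} mM, delta_max = {dmax_fit:.3f} ppm")
```

With real data, the uncertainty on Kd could be estimated by repeating the fit on resampled or noise-perturbed shifts; residues whose peaks are missing at low MgCl2 would contribute fewer points and correspondingly larger errors.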
Integrase is functional when Mg2+ or Mn2+ is present as a divalent cation (93,94), and it is therefore reasonable to expect that the core domain will bind Mn2+ just as it binds Mg2+. Manganese has the additional property of being paramagnetic and therefore increases relaxation rates of nearby nuclei when bound. To identify potential Mn2+ binding sites, we performed titrations of MnCl2 into IN50-212 at 40 mM MgCl2. This high background concentration of MgCl2 masks any nonspecific Mn2+ binding effects that would otherwise be observed. As assessed by peak intensity changes, the attenuation profile at 50 μM MnCl2 reveals several clusters of residues in IN50-212 where attenuation caused by Mn2+ binding is strong (see Fig. 4B; mapped onto the structure in Fig. 2C).

[Caption fragment, apparently Fig. 4; beginning missing] Black circles show the experimental data points, and the red curve shows the predicted attenuation using the Mg2+ position in PDB entry 1BL3 as well as the additional C-terminal binding site described in the text and shown in Fig. 2, B and C. C, distances between HN atoms and the crystallographic Mg2+ observed in PDB entries 1BIU (●), 1BL3 (□), and 1QS4 (×). Residues that are close to the Mg2+ in C are more attenuated in B, except for residues near the C terminus. Thus, the crystallographic Mg2+ alone cannot account for the observed attenuation in B. The absence of PRE effects in the C-terminal alternate assignments (broken axis at right in A and B) indicates that this conformation does not bind Mn2+. The secondary structure from 1BL3 is overlaid in panel A.
We compared the observed attenuation profile (Fig. 4B) with crystal structures to determine whether the crystallographic Mg2+ binding site could account for the observed PRE. For most of the residues in IN50-212, the minimum distance between HN atoms (added using REDUCE (95)) and the crystallographically observed Mg2+ ion correlates very well with the observed attenuation (Fig. 4, C, compare with B). Specifically, the correlation is very good for residues 60-180. After residue 180, however, the correlation breaks down, indicating that the crystallographic Mg2+ site is insufficient to describe Mn2+ binding for these residues. This finding suggests that another Mn2+ binding site exists near the C-terminal helix. To test this, we used the crystallographic Mg2+ binding site to calibrate a 1/r⁶ potential for residues 60-180 (see "Experimental Procedures"). Then, this potential was used to position an additional Mn2+ binding site that could reproduce the broadening seen under identical conditions for residues 181-212. The resulting curve, using the x-ray coordinates as a reference frame, shows good agreement between experimental and predicted PRE rates (Fig. 4B). The additional Mn2+/Mg2+ binding site is able to reproduce the data for the final helix, and when combined with the original binding site, the overall attenuation is reproduced very well.
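The two-site reasoning above can be sketched numerically. The following is an illustrative toy model only; the actual calibration is described in "Experimental Procedures" and may differ. It assumes that the PRE rate at each amide proton is a sum of k/r⁶ terms over metal sites and converts that rate to a peak intensity ratio with the commonly used expression R2·exp(-Γt)/(R2 + Γ). The coordinates, the constant `k`, the intrinsic rate `r2`, and the delay are all hypothetical placeholders.

```python
import math

def pre_rate(h_xyz, metal_sites, k):
    """Summed 1/r^6 PRE rate (s^-1) at proton position h_xyz (angstroms).
    k is a calibration constant in angstrom^6 per second (hypothetical)."""
    return sum(k / math.dist(h_xyz, site) ** 6 for site in metal_sites)

def intensity_ratio(gamma, r2=20.0, delay=0.01):
    """Paramagnetic/diamagnetic peak height ratio for PRE rate gamma.
    r2 (s^-1) and delay (s) are assumed values, not taken from this work."""
    return r2 * math.exp(-gamma * delay) / (r2 + gamma)

# Toy geometry: one metal at the active site, a second near the C terminus.
active_site = (0.0, 0.0, 0.0)
c_term_site = (30.0, 0.0, 0.0)
c_term_proton = (25.0, 0.0, 0.0)   # 25 A from site 1, 5 A from site 2
k = 1.0e7                           # hypothetical calibration constant

one_site = intensity_ratio(pre_rate(c_term_proton, [active_site], k))
two_site = intensity_ratio(pre_rate(c_term_proton, [active_site, c_term_site], k))
print(f"one site: {one_site:.3f}, two sites: {two_site:.5f}")
```

In this toy case a proton far from the active site metal is essentially unattenuated by one site but strongly attenuated once a nearby second site is included, which is the qualitative pattern that motivates placing an additional site near the C-terminal helix.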
Because of the symmetry requirements, the second Mn2+ binding site is forced to lie on the C2 axis of the symmetrical dimer (top of the molecule in Fig. 2, B and C). However, attempting to fit multiple Mn2+ atoms without this constraint identified essentially the same position. The Mn2+ binding site lies in a cleft between the two Glu209 side chains, and this suggests that the binding site was introduced artificially via the Q209E mutation made to stabilize IN50-212. To confirm this observation, a similar set of experiments was carried out on F185K,W131E IN50-212, without the additional Q209E mutation (data not shown). Spectral quality was poorer without the Q209E mutation, but MgCl2 binding was observed near the catalytic loop (residues 64 and 140-157), and spectral quality improved for these residues as MgCl2 was added, similar to what was observed with Q209E (supplemental Fig. S4). Constructs with and without Q209E exhibit virtually indistinguishable active site chemical shifts, and bind Mg2+ at the active site in a similar way, indicating that the Q209E mutation does not significantly affect the catalytic loop. On the other hand, Mn2+ or Mg2+ binding effects at the C terminus were absent for the F185K,W131E variant. Moreover, we also do not observe binding for the IN50-212 conformer that contains the disordered C-terminal helix (Fig. 4, A and B, right). Given these results and the position of the additional binding site, we hypothesize that the C-terminal helix in wild type IN50-212 is unstable, and the additional engineered Mg2+ binding site in our variant helps rigidify this helix, thereby stabilizing the entire core domain particle, leading to better NMR spectra. However, the Q209E mutation destabilizes this helix when Mg2+ or Mn2+ is not present (supplemental Fig. S4). At this point it is unclear to what extent the stability of the C-terminal helix impacts the function of the full-length enzyme. However, the structure of this helix controls the position of the DNA binding domain relative to the catalytic core domain, which differs between models of the full-length enzyme and crystal structures of the hexameric catalytic core plus DNA binding domain construct (33).
Raltegravir Binding-Raltegravir is a member of a class of diketo acid IN inhibitors currently approved for clinical use (17). It functions primarily by blocking integrase at the strand transfer step, but at higher concentrations it also inhibits 3′ processing (96,97). Previous studies have shown that raltegravir fails to bind integrase tightly in the absence of DNA, nor does it appear to interact with the core domain alone (96). However, given that NMR is well suited to detect even very weak binding, with Kd values greater than 1 mM, we attempted to titrate raltegravir into solutions containing IN50-212. Raltegravir (obtained commercially) was solubilized to high concentrations in DMSO and added to a final drug:protein ratio of 14:1 and a final DMSO concentration of 6.5% (v/v). Residues 143-152 were observed to shift, but shift changes were very small after correction for the contribution from DMSO alone, with the maximum chemical shift change, Δδavg, being 0.03 ppm. Binding curves had not approached a plateau by this point in the titration. Assuming a typical lower limit of 0.1-0.2 ppm chemical shift change upon ligand binding for at least some of the protein backbone amide resonances, this indicates less than about 20% drug binding at this point in the titration, i.e. Kd ≥ ~10 mM. Thus, in agreement with previous studies, we find no evidence of significant binding to the core domain alone in the absence of target DNA, and work is continuing to develop a minimal system that binds both DNA and inhibitors.
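The estimate above follows from simple arithmetic, sketched here under stated assumptions. The bound fraction is taken as the ratio of the observed shift change to the shift change expected at saturation, and a single-site model with ligand in excess gives Kd = [L](1 - f)/f. The free drug concentration is not stated in this section, so the 2.5 mM value below is a hypothetical placeholder used only to show how a bound fraction maps to a Kd limit; the function names are likewise illustrative.

```python
def fraction_bound(observed_shift_ppm, saturated_shift_ppm):
    """Fraction of protein bound, assuming fast exchange so that the
    observed shift scales linearly with the bound population."""
    return observed_shift_ppm / saturated_shift_ppm

def kd_from_fraction(free_ligand_mM, frac):
    """Single-site Kd implied by bound fraction frac at a given free
    ligand concentration: f = L / (L + Kd)  =>  Kd = L * (1 - f) / f."""
    return free_ligand_mM * (1.0 - frac) / frac

observed = 0.03  # ppm, maximum shift change reported in the text
for saturated in (0.10, 0.15, 0.20):  # assumed shift change at saturation
    f = fraction_bound(observed, saturated)
    print(f"saturated shift {saturated:.2f} ppm -> bound fraction {f:.2f}")

# Hypothetical free drug concentration (not given in this section).
ligand_mM = 2.5
kd_limit = kd_from_fraction(ligand_mM, fraction_bound(observed, 0.15))
print(f"implied Kd ~ {kd_limit:.1f} mM at {ligand_mM} mM free drug")
```

Because the saturated shift is only a typical lower limit, the bound fraction computed this way is an upper bound and the resulting Kd a lower bound, consistent with the inequality quoted in the text.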

DISCUSSION
This work represents the first study of the IN catalytic core domain by NMR. The data presented here confirm that many prior findings about integrase apply to the solution state, but they also reveal a few surprises. In particular, the dramatic effect of the high magnesium concentration on stability of the core domain was unexpected and may provide clues about the importance of dynamics during catalysis. Additionally, the ModelFree 15N relaxation analysis presented here confirms that the active site catalytic loop from residues 140 to 153 is dynamic on the picosecond-nanosecond time scale, but it does not show the characteristics often seen for flexible linker regions in proteins. Instead, it shows dynamic behavior of moderate amplitude that is relatively homogeneous across the active site loop, suggestive of concerted motions between distinctly different conformations. Finally, our work with drug binding highlights some of the difficulties that will arise for IN as structure-activity relationships are studied using NMR.
Previous studies have characterized a Mg2+- or Mn2+-induced conformational change in integrase. Asante-Appiah and Skalka (34) were the first to characterize this effect on the full-length IN construct. They found, surprisingly, that increased metal concentrations (up to 10 mM) conferred resistance to proteolytic digestion of IN. Furthermore, they found that this behavior was unique to HIV-1 IN, and they proposed that the relevant conformational change occurred between the core and C-terminal domains. Later work confirmed that the metal-driven conformational changes were influenced by at least two binding sites: one at the active site and one at an unknown binding site elsewhere in the molecule (35). It was hypothesized that this second binding site changes the relative orientation of the IN domains, which resulted in the increased proteolytic protection observed at higher metal concentrations.
Our work confirms that magnesium induces significant conformational changes in the catalytic domain of IN. In particular, the active site residues Asp64, Asp116, and Glu152 show peak broadening at low MgCl2 concentrations, indicating exchange with at least one other (possibly disordered) conformation (supplemental Fig. S4). Thus, we can confirm that metal binding at the active site is essential for conformational stability in IN. An unexpected result is the degree to which other, noncatalytic residues are affected by the addition of MgCl2 to the sample. Even in the original F185K,W131E construct, spectral quality improves as magnesium concentration is increased, indicating that this effect is not simply a result of the IN50-212 mutations used in our study. Examining the chemical shift changes as a function of MgCl2 (Fig. 2B and supplemental Figs. S1 and S5), it appears that magnesium interacts with multiple regions of the core domain and acts as a general stabilizing agent.
The highest quality NMR spectra were obtained with the Q209E mutation, which introduces an additional, specific Mg2+/Mn2+ binding site between the two C-terminal helices in the catalytic core dimer. Introducing this binding site was unexpected, because Gln209 is rarely seen as a helix in IN crystal structures, but we hypothesize that this binding site may be directly responsible for the improved NMR spectra over F185K,W131E integrase. Given that the final loop between helices α5 and α6 is highly dynamic, the orientation of the C-terminal helix may be ill-defined in solution. Introducing the binding site may help stabilize the helical packing, thereby stabilizing the entire core domain fold. Although the Q209E binding site is artificial, it may nonetheless hint at another, biologically relevant metal binding site. Many Asp and Glu residues line the helix connecting the core and C-terminal domains, and a slight reorientation of the crystallographically observed helix may be sufficient to form such a site (33). This site may correspond to the additional site hypothesized by Asante-Appiah et al. (35).
Uncertainty exists as to whether the IN active site binds one or two Mg2+/Mn2+ ions in the absence of DNA substrate. The proposed mechanism requires two metal cations (98), and two Mn2+ ions are observed to bind the active site in DNA-bound PFV IN (37). Non-physiological metals such as Zn2+, Cd2+, and Ca2+ have been complexed to avian sarcoma virus IN as well as HIV IN in the absence of DNA (33,99), but to date no crystal structure exists with two Mg2+ or Mn2+ ions in the active site. Because it has poorer geometry, it is likely that the second binding site has weaker affinity (100). Unfortunately, our data cannot resolve this issue. We do not observe positive evidence for two binding sites from chemical shift changes during titrations, and the chemical shift titrations for residues near the active site can each be fit with a single apparent Kd value. This suggests that if multiple metal binding sites are present, their Kd values cannot be very different. The fact that apparent Kd values for different active site residues do not differ from one another much beyond the experimental error (supplemental Fig. S3) supports this conclusion. So, even though our data are compatible with two Mg2+ ions binding with similar affinities, they are equally compatible with the case where only one Mg2+ binds.
We also note that our observations on IN stability versus magnesium concentration may be helpful for producing more crystal structures of the core domain as well as other multidomain constructs. To date, crystal structures have been determined at relatively low concentrations of MgCl2, typically 5 mM or less (27,29,31). At this concentration, the conformational change we observe is far from complete, and our spectra continue to improve up to concentrations of 40 mM MgCl2. Although the lower protein concentrations used in crystallization translate to a somewhat lower required concentration of MgCl2, we expect that fully saturating IN metal binding will aid in the production of new high-quality crystal structures.
A topic of key interest in IN biochemistry is the structure and function of the active site. Given that the core domain must catalyze both 3′ processing and strand transfer, it is not surprising that the domain would be dynamic and populate multiple conformations in its unbound state. Despite the wealth of structural information available on the core domain, little is known about the conformation of IN during catalysis. Recently, a cocrystal of PFV integrase and DNA has been determined, and this promises to guide our understanding of HIV integrase structure and function (37). In their structure, Hare et al. (37) observed an extended α4 helix with a kink for residues homologous to Pro145 and Gln146. These two residues are in the 3₁₀ helix and interact with the viral DNA end to stabilize the nucleoprotein particle. In the apo state, it is known from other x-ray studies that the catalytic loop can sample multiple conformations and is relatively unstable compared with the rest of the molecule (24,26,27). It is not known, however, whether the active site samples the kinked conformation seen in the PFV IN structure. One insightful study by Greenwald and co-workers (29) made Gly to Ala mutations to rigidify the catalytic loop. They found that G140A and G149A variants of IN reduced catalytic rates to 6-12% of their normal levels, and a double mutant construct virtually abolished activity. Their crystal structures indicated that these variants stabilize a helical conformation, extending the N-terminal end of the α4 helix to Pro145 in the G140A,G149A variant. Given that the PFV IN crystal structure is kinked at Gly218 (corresponding to Gly149 in HIV-1 IN), it is easy to see how mutations to this residue could affect catalysis. It is less clear why mutations at Gly140 would influence catalysis. In PFV, the analogous residue is Ser209, which is even more sterically constrained than Ala in the G140A variant. It is therefore likely that dynamics play an important role in catalysis as well.
Because of the uncertainty in the mechanism of catalysis, computational studies have examined the conformation and dynamics of the active site loop in the core domain apo state. An initial molecular dynamics study by Lins et al. (38) observed a tendency for the helix extension at the N-terminal end of α4, and dynamics on the picosecond time scale were commensurate with elevated crystallographic temperature factors. A later study extended the time scale of simulations to 40 ns (39). During this time, the catalytic loop was observed to close across the surface of the other catalytic residues, leading the authors to propose a gating mechanism for these residues. Additionally, both papers support an important role for Tyr143, where the side chain increases enzymatic efficiency by helping to position an activated water nucleophile during catalysis.
Our CS-ROSETTA model for the core domain lacks the long range constraints typically used to generate high quality NMR macromolecular models, but the correlation between backbone torsions and chemical shifts is well established (68,72,85,101,102), and chemical shift-based modeling can provide some insight on the structural organization of the active site. Because the CS-ROSETTA energy of the entire IN50-212 monomer is dominated by its overall molecular topology, low energy conformers do not necessarily correspond to optimal low energy active site conformations. Therefore, we also carried out CS-ROSETTA calculations for the catalytic loop (residues 139-153) on an otherwise fixed crystal structure of the core domain. The backbone conformations of the six low-energy CS-ROSETTA clusters suggest that the kinked 3₁₀ helical conformation observed in PFV is also sampled in solution along with other conformations (Figs. 5 and 6). The lowest energy model (Figs. 5 and 6, red conformation) is in striking agreement with the PFV IN structure for residues 145-153. In addition to being kinked at residue 149, this model adopts a 3₁₀ helical conformation for residues 145-147 as defined by the DSSP software package (103). Beyond the DNA-bound conformation, the catalytic loop clearly samples other conformations as well. Chemical shifts and sequential HN-HN NOE intensities (supplemental Figs. S7 and S8) suggest extensive sampling of helical conformations from Pro145 onwards, but we also find two clusters that trace an extended conformation for these residues (Figs. 5 and 6, green and yellow conformations), similar to what is observed in crystal structures complexed to Mg2+ (PDB entry 1BL3; Fig. 6, gray conformation). Although the other clusters possess near α-helical φ,ψ torsions, the hydrogen bonding patterns suggest that helix α4 undergoes fraying as well (Fig. 6).
Two remaining residues are also of interest, Tyr143 and Asn144. The chemical shifts for Asn144 are most compatible with the northwest quadrant of the Ramachandran plot, but the residues have a broad distribution roughly centered at the φ,ψ values for polyproline II helix. The backbone torsions for Tyr143 are even less well converged, and only one of our clusters is in agreement with the left-handed helical conformation observed in the PFV IN crystal structure (Fig. 5, purple conformation). The TALOS+ fragments are split approximately evenly between right-handed helix and strand conformations, and correspondingly most of our clusters sample the left-hand side of the Ramachandran plot for Tyr143. Even though this residue (Tyr212) differs in PFV, the overall topology of the catalytic loop is conserved for our models.
In light of prior simulations on the core domain, our data reconcile two conflicting models for the catalytic loop conformation. The structures of Greenwald et al. (29) and the simulations of Lins et al. (38) both suggest a helical conformation for the catalytic loop, in general agreement with the PFV integrase-DNA crystal structure. Alternatively, longer simulations by Lee et al. (39) suggest that Tyr143 shifts conformations as helix α4 partially unfolds, effectively gating the active site. Our CS-ROSETTA calculations, which are based on experimental chemical shifts, and our HN-HN NOE data (supplemental Fig. S7) support both models. Most of the time, residues 139-153 sample helical basins, but large scale, occasional structural excursions are also possible. Thus, even without the Gly to Ala mutations made by Greenwald and co-workers (29), the residues N-terminal of helix α4 appear to sample helix in solution. We hypothesize that their mutations lock in the helical conformation and prevent the excursions into other conformations seen in the wild type protein. These excursions may be required to sample the appropriate 3₁₀ helix observed in PFV integrase, or they may be needed to accommodate the structural strain experienced during DNA binding.
Although structural modeling based on chemical shifts alone is tentative, our 15N relaxation data support the above conclusions. With order parameters in the 0.7-0.8 range and heteronuclear NOE values ranging from 0.4 to 0.7, the residues in the catalytic loop are not nearly as dynamic as would be expected for a random coil (cf. the N and C termini in Fig. 3). Instead, the dynamics indicate that the region is fairly stable on the picosecond-nanosecond time scale. Except for the N and C termini, and in agreement with temperature factors and simulations, the most dynamic region of the molecule at this time scale is the loop from residues 185 to 195 (38,39). As a point of reference, a Lipari-Szabo order parameter, S², of 0.7 corresponds to angular diffusion within a cone subtending 28 degrees (77). Thus, the dynamics of the active site loop fall somewhere between the extremes of disordered and fully folded. Although our chemical shift-based simulations support conformational excursions of the catalytic loop (39), we do not find evidence for gating on the picosecond-nanosecond time scale. Given the CS-ROSETTA models, it is likely that these motions occur but that they happen on longer time scales. In fact, other studies have found helix formation to occur on a time scale of microseconds (104), and loop gating can even occur on time scales up to milliseconds in some cases (105). The conformational change observed in molecular dynamics simulations may therefore be slower than we could reasonably expect to detect with 15N relaxation measurements, which then report on dynamics in each of the stable clusters shown in Figs. 5 and 6 individually.

Now that all three domains of IN have been assigned by NMR, it is tempting to design experiments testing structure-activity relationships with inhibitors and DNA substrates. Such experiments would in principle be faster and more straightforward than crystallographic studies. Unfortunately, our attempts to characterize full-length IN by NMR so far have remained unsuccessful (data not shown), and our results with raltegravir binding demonstrate that studies of IN by NMR will continue to be challenging. Diketo acid inhibitors of IN have been shown to require the presence of DNA substrate (96), and the extremely weak binding we observe for the free core domain confirms this. Moreover, we have shown that the solvents used for IN inhibitors, such as DMSO, affect the chemical shifts of IN itself at concentrations below 5% (v/v). Therefore, structure-activity studies will have to account for these effects. An additional complication is the sheer size of the protein-DNA complex, which for PFV approaches 270 kDa (37). Although de novo studies of such large systems tend to be extremely difficult, novel TROSY-based technology combined with extensive deuteration can make such studies feasible provided that assignments of the individual domains are available.
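The cone angle quoted in the discussion of the relaxation data (S² = 0.7 corresponding to roughly 28 degrees) can be checked with a short calculation. The sketch below assumes the standard diffusion-in-a-cone relation, S² = [cos θ (1 + cos θ)/2]², where θ is the cone semi-angle, and inverts it analytically; the function name is illustrative.

```python
import math

def cone_semi_angle_deg(s2):
    """Cone semi-angle (degrees) for a given order parameter S^2, assuming
    the diffusion-in-a-cone model S = cos(t) * (1 + cos(t)) / 2.
    Solving x^2 + x - 2S = 0 for x = cos(t) gives the positive root below."""
    s = math.sqrt(s2)
    cos_t = (-1.0 + math.sqrt(1.0 + 8.0 * s)) / 2.0
    return math.degrees(math.acos(cos_t))

for s2 in (0.7, 0.8):
    print(f"S^2 = {s2:.1f} -> semi-angle = {cone_semi_angle_deg(s2):.1f} degrees")
```

Under this model the 0.7-0.8 range of order parameters reported for the catalytic loop corresponds to fairly restricted backbone motion, in line with the conclusion that the loop is neither rigid nor random coil on the picosecond-nanosecond time scale.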