Thermodynamic dissection of a low affinity protein-protein interface involved in human immunodeficiency virus assembly.

Homo-dimerization of the capsid protein CA of human immunodeficiency virus through its C-terminal domain constitutes an early crucial step in the virion assembly pathway and a potential target for antiviral inhibitors. We have truncated to alanine the 20 amino acid side chains per monomer that participate in intersubunit contacts at the CA dimer interface and analyzed their individual energetic contribution to protein association and stability. About half of the side chains in the contact epitope are critically involved in the energetic epitope, as their truncation essentially prevented dimerization. However, dimerization affinity is kept low partly because of the presence of interfacial side chains whose individual truncation improves affinity by 2-20-fold. Many side chains at the interface are also energetically important for the folding of a monomeric intermediate and for its conformational rearrangement during dimerization. The thermodynamic description of this low affinity interface (dissociation constant of approximately 10 µM) was compared with those obtained for the other protein-protein interfaces, nearly all of them of much higher affinity, that have been systematically analyzed by mutation. The results reveal differences that may have been evolutionarily selected and that may be exploited for the design of an effective interfacial inhibitor of human immunodeficiency virus assembly.

Assembly and stability of the complex oligomeric proteins that constitute the capsids of viruses are mediated by multiple noncovalent interactions between many protein subunits. An attractive antiviral approach involves the use of compounds able to interfere with critical intersubunit interactions in viral oligomers. Potential inhibitors include small molecules and also larger "interfacial inhibitors" that mimic a part of the interacting epitope (1)(2)(3)(4). Rational design of antiviral agents aimed at inhibiting assembly or facilitating dissociation of a virus capsid may require not only a detailed structural knowledge of capsid subunit interfaces but also a thermodynamic understanding of the individual energetic contribution of interfacial residues and interactions to subunit association. This is particularly important if the inhibitor must meet the conflicting requirements of a minimum size with maximum affinity and specificity.
Alanine scanning mutagenesis (5) provides an excellent strategy to experimentally determine the energetic contributions of interfacial side chains to protein-protein interaction. Mutation to Ala of residues other than Gly or Pro eliminates the targeted side chain beyond Cβ and disrupts any interaction that involves that side chain, without introducing new interactions and with the lowest probability of altering the conformation of the polypeptide backbone (5,6). A few groups have applied this approach for a systematic thermodynamic dissection of interfaces in dimeric heterocomplexes (7)(8)(9)(10)(11)(12)(13)(14)(15)(16) or small homo-oligomers (17)(18)(19)(20)(21). These studies have substantially advanced the molecular understanding of protein-protein recognition. In many of the interfaces analyzed, a few centrally located residues contribute most of the binding energy and are surrounded by energetically less important contacts (22); in a few cases, however, the energy of interaction appears to be distributed across most of the interface (15,21).
The structural complexity of viral capsids has generally prevented the application of quantitative experimental thermodynamic approaches based on the use of mutants to dissect the energetics of capsid subunit association during assembly. Fortunately, assembly of some capsids, including that of human immunodeficiency virus (HIV), involves a number of discrete oligomerization interfaces between protein domains that may be amenable to quantitative thermodynamic analyses in isolation. Dimerization of the capsid protein (CA) of HIV-1 through its C-terminal domain (CA-C) is a major driving force in virus morphogenesis and budding from the cell (23)(24)(25)(26). The three-dimensional structures of CA from HIV-1 and other retroviruses are known (27)(28)(29)(30)(31)(32)(33)(34). Image reconstruction by cryo-electron microscopy (35) and measurements of amide hydrogen exchange (36) of HIV-1 capsids formed in vitro have provided direct evidence for the relevance in the assembled capsid of the homotypic CA-C interactions determined by x-ray crystallography of the soluble CA-C dimer (29,30). Each subunit in the HIV-1 CA-C dimer (CA amino acid residues 147-231) is composed of a short 3₁₀ helix followed by an extended strand and four α-helices connected by short loops. The dimerization interface is essentially formed by the parallel packing of helix 2 from each subunit (Fig. 1A) (29,30). The isolated CA-C domain is not only correctly folded but also dimerizes with the same affinity as the full-length protein (29), which has validated its thermodynamic study. Folding and dimerization of CA-C of HIV-1 is a three-state reaction in which the polypeptide folds to yield a monomeric intermediate of low stability, and the intermediate dimerizes in a process that also involves a tertiary conformational reorganization of each monomeric subunit (Fig. 1B) (37).
We have chosen the CA-C dimerization interface to undertake the first experimental thermodynamic dissection of a protein-protein interface critically involved in assembly of a virus capsid. The use of different equilibrium techniques allowed us to perform separate calculations on the specific free energy contributions of each interfacial side chain to each of the different processes observed during folding/association of CA-C (37). The implications for HIV capsid assembly, the design of interfacial antiviral inhibitors, and protein-protein recognition are discussed.

EXPERIMENTAL PROCEDURES
Mutagenesis and Protein Purification-Site-directed mutagenesis was performed using the QuikChange kit (Stratagene) on recombinant plasmid pET21b(+) containing the segment corresponding to CA-C of HIV-1 (strain BH10; CA residues 146-231). The mutations introduced were confirmed by sequencing the entire CA-C coding region. The mutant proteins were expressed in Escherichia coli BL21(DE3) and purified as described for wild-type CA-C (37). The proteins were stored and analyzed in 25 mM sodium phosphate buffer, pH 7.3, unless specified otherwise. Purified CA-C mutants were run in overloaded SDS-PAGE gels and found free of contaminants. Their concentration was determined by UV-spectrophotometry as described (37).
Determination of Affinity Constants by Analytical Gel Filtration Chromatography-Association constants for the homodimeric mutants were determined by gel filtration using frontal (large zone) elution (38). 15-ml aliquots of CA-C samples with a total protein (monomer) concentration (C_t) usually ranging from 0.5 to 60 µM were serially applied to a recalibrated Superdex 75 HR FPLC column (Amersham Biosciences), which had been thoroughly equilibrated with the appropriate buffer (25 mM sodium phosphate, pH 7.3, unless specified otherwise) and kept at 23°C. Samples were eluted at 1 ml/min and continuously monitored with an on-line UV detector at a wavelength of 280 nm. The elution volume V_e at each C_t was determined as the midpoint of the ascending frontal profile in the chromatogram. The weight average partition coefficient (39) at each C_t (σ_w) was calculated using the expression,

$$\sigma_w = \frac{V_e - V_0}{V_t - V_0} \qquad \text{(Eq. 1)}$$

where V_t is the total column volume and V_0 is the void volume, which were determined as described (18). The experimental σ_w values obtained at different C_t were fitted to the equation (39),

$$\sigma_w = \sigma_m + (\sigma_d - \sigma_m)\,\frac{4K_aC_t + 1 - \sqrt{8K_aC_t + 1}}{4K_aC_t} \qquad \text{(Eq. 2)}$$

where σ_m and σ_d are the partition coefficients of monomer and dimer, respectively, and K_a is the equilibrium association constant. The free energy of association ΔG_a was calculated by using the equation,

$$\Delta G_a = -RT \ln K_a \qquad \text{(Eq. 3)}$$

where R is the gas constant and T the absolute temperature. Zonal (small zone) elution experiments were carried out in the same way, except that 200-µl aliquots were applied to the column and that the V_e was defined by the peak position in the chromatogram. Gel filtration analysis using small zone elution does not normally provide good K_a values (38), and it was not used for most CA-C mutants. However, we found for nonmutated CA-C (and for another, unrelated dimeric protein) that in our conditions, the K_a obtained using small zone gel filtration was coincident within error with that obtained using frontal elution gel filtration or other standard techniques (see below).
Thus, for those CA-C mutants that we found essentially in the monomeric form at all but the highest protein concentrations tested, a rough estimation of the K_a was obtained as follows: the frontal elution data obtained at µM concentrations were complemented by small zone elution data obtained at much higher (around 1 mM) concentrations (that were prohibitive for large zone experiments), and the combined data were fitted using Equation 2. Because of the extremely low affinity of these particular mutants, to achieve meaningful fitting values for σ_m and σ_d, we had to assume that at extremely high (unreachable) concentrations (around 1 M), the protein would be essentially in the dimeric form and that σ_d did not significantly change upon mutation. Use of these reasonable theoretical assumptions allowed a good fitting of the experimental data for the very low affinity mutants.
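The monomer-dimer model that is fitted to the gel filtration data can be illustrated with a short Python sketch. It is a minimal sketch rather than the fitting procedure actually used: it assumes the standard mass-action relations K_a = [D]/[M]² and C_t = [M] + 2[D], and the function names, column volumes, and partition coefficients shown are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def partition_coefficient(v_e, v_0, v_t):
    """Partition coefficient from elution (v_e), void (v_0) and total (v_t) volumes."""
    return (v_e - v_0) / (v_t - v_0)


def monomer_conc(k_a, c_t):
    """Free monomer concentration (M) for 2M <-> D.

    Assumes K_a = [D]/[M]^2 and C_t = [M] + 2[D] (monomer units),
    which gives the quadratic 2*K_a*[M]^2 + [M] - C_t = 0.
    """
    return (math.sqrt(1.0 + 8.0 * k_a * c_t) - 1.0) / (4.0 * k_a)


def sigma_w(k_a, c_t, sigma_m, sigma_d):
    """Weight-average partition coefficient of a monomer-dimer mixture."""
    f_m = monomer_conc(k_a, c_t) / c_t  # weight fraction of monomer
    return sigma_m * f_m + sigma_d * (1.0 - f_m)


def delta_g_a(k_a, temp_k=296.15):
    """Free energy of association (kcal/mol, 1 M standard state) at 23 degrees C."""
    return -R_KCAL * temp_k * math.log(k_a)


if __name__ == "__main__":
    k_a = 1.0e5  # M^-1, i.e. an assumed K_d of 10 uM
    for c_t in (0.5e-6, 10e-6, 60e-6):
        # sigma_m = 0.5 and sigma_d = 0.4 are hypothetical column values
        print(c_t, sigma_w(k_a, c_t, sigma_m=0.5, sigma_d=0.4))
    print(delta_g_a(k_a))
```

With an assumed K_a of 10⁵ M⁻¹, half of the protein by weight is monomeric at C_t = 10 µM, so the computed σ_w lies midway between the two hypothetical partition coefficients. In a real analysis σ_m, σ_d, and K_a would be obtained by nonlinear least-squares fitting of the measured σ_w values.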
FIG. 1. ... (37). TS1 and TS2 are transition states. ΔG_um is the free energy difference corresponding to unfolding of I into U, as determined in equilibrium dissociation/denaturation analysis by far-UV circular dichroism. ΔG_ud is the free energy difference at a standard 1 M protein concentration, corresponding to dissociation of N₂ into I, as determined in equilibrium dissociation/denaturation experiments probed by intrinsic Trp fluorescence. ΔG_dis is the free energy difference at a standard 1 M protein concentration corresponding to the dissociation of N₂ by dilution as determined by analytical gel filtration. ΔG_ut is the free energy difference at a standard 1 M protein concentration corresponding to the complete process of dissociation/unfolding of N₂ into 2U.

Fluorescence and Circular Dichroism (CD) Spectroscopy and Dissociation/Unfolding Equilibrium Data Analyses-Dissociation/denaturation of CA-C mutants by guanidinium chloride (GdmHCl) was spectroscopically probed as described for wild-type CA-C (37) using a Varian Cary Eclipse luminescence spectrophotometer and a Jasco 500 spectropolarimeter equipped with temperature control units. Total protein (monomer) concentration was 200 µM unless indicated otherwise. Variations in fluorescence probed dissociation of the dimer into a monomeric form, and the data were fitted to a dimer-to-monomer two-state transition. Variations in CD probed the unfolding of the monomeric intermediate, and the data were fitted to a unimolecular two-state transition. The equations given in Ref. 37 were used, and the fits were performed with the program Kaleidagraph (Abelbeck Software).
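The unimolecular two-state analysis can be illustrated with a short Python sketch. It assumes the widely used linear extrapolation model, in which the unfolding free energy decreases linearly with denaturant concentration; the equations actually used are those of Ref. 37 and are not reproduced here, and the default temperature, the parameter values, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(denaturant_m, dg_water, m_value, temp_k=298.15):
    """Fraction of unfolded monomer in a unimolecular two-state transition.

    Assumes dG([D]) = dg_water - m_value * [D] (linear extrapolation model).
    temp_k is an assumed default, not a value taken from the text.
    """
    dg = dg_water - m_value * denaturant_m
    k_u = math.exp(-dg / (R_KCAL * temp_k))  # unfolding equilibrium constant
    return k_u / (1.0 + k_u)


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the molecules are unfolded."""
    return dg_water / m_value


def observed_signal(denaturant_m, dg_water, m_value, native, unfolded):
    """Population-weighted signal (for example ellipticity) with flat baselines."""
    f_u = fraction_unfolded(denaturant_m, dg_water, m_value)
    return native * (1.0 - f_u) + unfolded * f_u


if __name__ == "__main__":
    # dg_water (kcal/mol) and m_value (kcal/(mol*M)) are hypothetical numbers
    for conc in (0.0, 1.0, 1.5, 2.0, 3.0):
        print(conc, fraction_unfolded(conc, dg_water=3.0, m_value=2.0))
```

Fitting an experimental curve would additionally require sloping baselines and a least-squares optimizer; the sketch only shows how ΔG extrapolated to the absence of denaturant and the m value shape the transition.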
Molecular Modeling and Computer Analyses-The three-dimensional structure of CA-C (29, 30) was inspected using personal computers, a Silicon Graphics work station, and the programs InsightII (Biosym Technologies) and RasMol (40). The program Whatif (41) was used for analyses of the number, nature, and structural parameters of the interdimer contacts, solvent accessibility calculations, and modeling of the mutations introduced.

RESULTS
Energetic Contribution of Interfacial Residues to Dimerization of CA-C Monomers-Analysis of intersubunit interactions in the crystallographic structure of the CA-C dimer revealed that 22 side chains per monomer are involved in the contact epitope (29,30). We have individually truncated all of these side chains (except Ala-177 and Pro-207) to alanine. Dissociation of nonmutated CA-C and each mutant by simple dilution was quantitated in analytical gel filtration experiments (Fig. 2 and Table I). This allowed determination of the association constant K_a and the free energy difference ΔG_a between dimer and monomer, and thus, the contribution of quaternary interactions to the dissociation process (37,42). The difference between ΔG_a for nonmutated CA-C and ΔG_a for each mutant (ΔΔG_a) was taken as a measure of the relative energetic contribution of the truncated side chain (beyond Cβ) (Fig. 3, black bars). Only five (about one-fourth) of the side chains in the contact epitope could be truncated with no or very minor effects on dimerization (absolute values of ΔΔG_a below 0.3 kcal/mol; Table I and Fig. 3). These included Thr-148 and Val-191, which participated in a few intersubunit hydrophobic contacts; Asn-193, which was involved in two reciprocal, contiguous intersubunit hydrogen bonds with Gln-192; and Asp-152 and Lys-203, which participated in no intersubunit contacts except two reciprocal, solvent-exposed salt bridges. These interactions may thus contribute to the free energy of association in a negligible way. However, an alignment of more than 240 CA sequences from HIV-1, HIV-2, or simian immunodeficiency virus (SIV) from the HIV sequence data base (hiv-web.lanl.gov) showed that residue 152 was invariably Asp (Glu in one variant) and that residue 203 was either Lys or Arg. Conservation of charge at these positions could be explained because the two 152-203 pairs were located at opposite sides of the interface rim and may favor, through electrostatic steering (43), the correct orientation of the monomers during association.
Two of the side chains at the interface (Thr-188 and Lys-199) had a significant contribution to dimerization (ΔΔG_a = 1.1 or 0.5 kcal/mol, respectively; Fig. 3). In the dimer, Thr-188 participated in two intersubunit hydrophobic contacts, whereas Lys-199 was substantially exposed to solvent. Remarkably, as many as nine interfacial side chains (nearly one-half of the contact residues) were most critically involved in the energetic epitope, as their individual truncation led to monomeric CA-C even at high protein concentrations (Table I). We were able to estimate the extremely low association constants for the essentially monomeric mutants (see "Experimental Procedures" and Fig. 3). The most critical side chain was Trp-184 (estimated ΔΔG_a > 10 kcal/mol). Mutant W184A (and also M185A) had already been found to be monomeric (29). In the dimer, Trp-184 from one monomer is essentially buried within a cavity formed by residues from the opposite monomer and makes multiple (nine) intersubunit hydrophobic contacts (30). Most of the remaining eight side chains that also led to dramatic losses in the free energy of association (estimated ΔΔG_a = 6-8 kcal/mol, depending on the mutation) are also nonpolar (Ile-150, Leu-151, Leu-172, Val-181, Met-185, and Leu-189) and deeply buried in the dimer interface, where they make between one and seven intersubunit C-C contacts. Together with Trp-184, they delineate a large central area within the contact epitope (Fig. 4A). Two charged residues (Arg-154 and Glu-175) were also critical for dimerization. Neither is involved in intersubunit hydrogen bonds or salt bridges, but both are partially buried in the dimer (compare Ref. 30 and Table I). The energetically critical residues at the interface were all, except for Leu-150 and Arg-154, almost absolutely conserved in HIV and SIV. Leu-150 was either completely conserved (in HIV-1) or replaced only by the chemically similar residues Val (in most HIV-2 viruses) or Ile (in all SIV variants).
Arg-154 was found replaced by Lys in many variants, but this substitution preserved the positive charge (see below). It must be noted here that in sequence analyses of CA residue conservation, we occasionally found very rare variants that did not fit the observed trends. Among other possibilities, such exceptional mutations may derive from nonfunctional proviruses (44). These exceptions were not found in an alignment of a subset of CA sequences that were clearly derived from infectious virions.
Finally, it was found that truncation to Ala of any of the four remaining interfacial side chains (Ser-178, Glu-180, Glu-187, and Gln-192, one-fifth of the residues in the contact epitope) reproducibly increased the affinity between 2- and 20-fold (Table I). Very similar results were obtained at two different ionic strengths (in the presence or absence of 150 mM NaCl; compare Fig. 2 and Table I). Truncation of Gln-192 increased the association constant by more than one order of magnitude. The Gln-192 side chains from the two subunits contact each other and are located just underneath a cluster of four positive charges (the guanido group of Arg-154 and amino group of Lys-199 from the two monomers), located at the interface rim. These four charged groups are close to each other (Fig. 4B), and truncation of Gln-192 may allow some separation of those four positive charges, thus diminishing their mutual electrostatic repulsion. In all but one of the 240 CA variants analyzed, Arg-154 and Lys-199 were either preserved or mutated to Lys or Arg, respectively, and Gln-192 was almost absolutely conserved. Other effects may, however, complicate the situation, as truncation of Arg-154 prevented dimerization and that of Lys-199 led to a small decrease in K_a (see above). Ser-178 and Glu-180, whose individual truncation also improved affinity, are involved in reciprocal intersubunit hydrogen bonds, and the two pairs of side chains form a cluster at the interface rim (Fig. 4B). The carboxyl group of Glu-180 also binds a water molecule that is in turn hydrogen-bonded to the main chain oxygen of Gln-176 from the opposite subunit. As discussed for Arg-154/Lys-199, the two Glu-180 carboxylates are spatially very close in the dimer structure (Fig. 4B, 3.7 Å between both Oδ) and may electrostatically repel each other. This could explain the higher affinity of the E180A mutant.
Truncation of the two serine side chains in S178A may allow some separation of the Glu-180 carboxylates, thus decreasing their mutual repulsion and leading to an increase in affinity. Ser-178 and Glu-180 were not completely conserved. Yet, in practically every HIV or SIV variant analyzed, including several hundred HIV-1 variants from the European Molecular Biology Laboratory (EMBL) data base, a negatively charged side chain was preserved either at position 180 (Glu or Asp) or at position 178 (Asp), and a small size, neutral side chain was found at the other position (in most cases, Ser or Thr at 178 or Ala at 180). This and the relative orientations of these side chains (Fig. 4B) suggest that a repulsive intersubunit interaction between a pair of negatively charged residues in this particular spot of the CA-C dimer interface may be preserved in essentially all HIV and SIV variants. Further experiments (underway) are needed to validate the above hypotheses on the negative effects and biological conservation of some interfacial residues on CA-C dimerization (see "Discussion").
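The relationship between a fold change in K_a and the corresponding ΔΔG_a can be checked with a short Python sketch. It is an illustration under stated assumptions: the temperature is taken as 23°C (the column temperature given under "Experimental Procedures"), the sign convention is the one used in the text (positive ΔΔG_a for loss of affinity, negative for gain), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temp_k=296.15):
    """ddG_a (kcal/mol) for a mutant whose K_a is fold_change times wild type.

    Sign convention assumed here: negative values mean higher affinity.
    """
    return -R_KCAL * temp_k * math.log(fold_change)


def fold_change_from_ddg(ddg, temp_k=296.15):
    """Inverse relation: ratio K_a(mutant) / K_a(wild type) for a given ddG_a."""
    return math.exp(-ddg / (R_KCAL * temp_k))


if __name__ == "__main__":
    for fold in (2.0, 20.0):
        print(fold, round(ddg_from_fold_change(fold), 2))
```

Under these assumptions, 2-fold and 20-fold increases in K_a correspond to about −0.4 and −1.8 kcal/mol, in line with the range quoted for the affinity-enhancing truncations.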
TABLE I (footnotes)
a Thermodynamic parameters corresponding to dissociation of dimeric CA-C by dilution were obtained by gel filtration. K_a is the association constant, and ΔG_a is the free energy difference at a standard 1 M protein concentration. K_a values in italics were obtained using 25 mM phosphate buffer, pH 7.3, that contained 150 mM NaCl. All other values were obtained using the same buffer with no NaCl added. For the essentially monomeric mutants, the K_a (in parentheses) and ΔG_a values were estimated as described under "Experimental Procedures."
b Thermodynamic parameters corresponding to dissociation of dimeric CA-C into a folded monomeric intermediate were determined in equilibrium experiments using GdmHCl and measuring the intrinsic Trp fluorescence. ΔG_ud is the free energy difference at a standard 1 M protein concentration, extrapolated to absence of denaturant, and m_ud is the variation in free energy with GdmHCl concentration corresponding to dissociation of native dimeric CA-C into the intermediate. For T188A and N193A the native baseline could not be experimentally determined, and the ΔG_ud values were obtained assuming the baseline values determined for nonmutated CA-C, which were also very similar to those obtained for all other dimeric mutants. ΔG_ur (the free energy difference corresponding to the tertiary rearrangement of each monomeric intermediate during dissociation of the CA-C dimer) was calculated from ΔG_ur = (ΔG_ud − ΔG_dis)/2 (37,42).
c Thermodynamic parameters corresponding to unfolding of the CA-C monomeric intermediate were determined in equilibrium experiments using GdmHCl and measuring the far-UV protein ellipticity. ΔG_um is the free energy difference extrapolated to absence of denaturant, and m_um is the variation in free energy with GdmHCl concentration. ΔG_ut (the free energy difference at a standard 1 M protein concentration corresponding to the complete process of dissociation/unfolding of dimeric CA-C into denatured monomers) was calculated using the expression ΔG_ut = ΔG_ud + 2ΔG_um.

Energetic Dissection of HIV Capsid Protein Dimerization

Energetic Contribution of the Interfacial Side Chains to the Conformational Reorganization of the CA-C Monomer during Dimerization-In equilibrium denaturation analyses of CA-C using GdmHCl, dissociation of the dimer into the monomeric intermediate could be specifically probed by following the intrinsic fluorescence of the only Trp present (Trp-184). Unfolding of the intermediate did not contribute to the fluorescence change because Trp-184 is nearly fully exposed to solvent both in the intermediate and in the denatured protein (37). The free energy difference thus determined (ΔG_ud = 12.3 kcal/mol at a 1 M standard state) was much higher than the free energy of dissociation of the native dimer by simple dilution, as determined by gel filtration (ΔG_dis = 6.9 kcal/mol at a 1 M standard state). Comparison of the two values allowed calculation of the relative contributions of quaternary and tertiary structure changes to the dissociation (or conversely, association) step (37,42). For wild-type CA-C, the value ΔG_ur = (ΔG_ud − ΔG_dis)/2 was 2.7 kcal/mol and corresponds to the tertiary reorganization of each CA-C monomeric intermediate molecule during dissociation of the dimer (or during dimerization) (37,42).
We have now likewise determined ΔG_ud for each interfacial mutant using GdmHCl and fluorescence analysis (Fig. 5 and Table I) and compared these values with the corresponding free energy of dissociation ΔG_dis = −ΔG_a as determined by gel filtration (Table I). As expected, the nine mutants identified as essentially monomeric by gel filtration were also found to be monomeric by fluorescence analysis. For these nine mutants, the fluorescence intensity and maximum emission wavelength at the highest protein concentrations that could be reasonably used in these experiments (200 µM) corresponded very nearly to those of the monomeric intermediate found for nonmutated CA-C and did not change substantially at any GdmHCl concentration (not shown). For each of the 11 alanine mutants that dimerized, the value ΔG_ur was obtained. The difference between ΔG_ur for nonmutated CA-C and each mutant (ΔΔG_ur) was taken as a measure of the relative energetic contribution of the truncated side chain to the tertiary rearrangement of each monomer during dissociation (or conversely, association) (Fig. 3, gray bars). Only 4 of the 11 interfacial side chains had no or very little effect on this rearrangement, whereas all others, irrespective of their role in the establishment of quaternary interactions, contributed energetically to the rearrangement in a substantial way (ΔΔG_ur between 0.8 and 2.3 kcal/mol).
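The bookkeeping that separates the tertiary rearrangement from the quaternary dissociation can be written out as a short Python sketch. The two relations are those stated in the text and in the footnotes to Table I; the function names are hypothetical, and the ΔG_um value used in the example is a placeholder, not a measured value.

```python
def dg_rearrangement(dg_ud, dg_dis):
    """Per-monomer tertiary rearrangement free energy: (dG_ud - dG_dis) / 2.

    The division by 2 reflects that both monomers of the dimer rearrange.
    """
    return (dg_ud - dg_dis) / 2.0


def dg_total_unfolding(dg_ud, dg_um):
    """Complete dissociation/unfolding of the dimer into two unfolded chains:
    dG_ut = dG_ud + 2 * dG_um.
    """
    return dg_ud + 2.0 * dg_um


if __name__ == "__main__":
    # Wild-type values quoted in the text (kcal/mol, 1 M standard state)
    print(dg_rearrangement(dg_ud=12.3, dg_dis=6.9))
    # dg_um = 1.0 is a placeholder used only to show the call
    print(dg_total_unfolding(dg_ud=12.3, dg_um=1.0))
```

For the wild-type values quoted in the text (12.3 and 6.9 kcal/mol), the first relation gives the 2.7 kcal/mol per monomer reported for the tertiary reorganization.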
FIG. 3. Differences in free energy (ΔΔG) between nonmutated CA-C and each mutant for association of the CA-C monomeric intermediate as determined by gel filtration (ΔΔG_a, black bars), for the conformational rearrangement of the monomeric intermediate during association (−ΔΔG_ur, gray bars), and for folding of the monomeric intermediate (−ΔΔG_um, white bars). Positive ΔΔG values in this figure indicate that the mutation stabilized the product of the folding/association reaction probed. For the essentially monomeric mutants (ΔΔG_a < −6 kcal/mol), ΔΔG_ur could not be determined. The free energy values ΔG_a, ΔG_ur, and ΔG_um are given in Table I.

FIG. 4. Energetic dissection of the CA-C dimerization interface. A, effect on CA-C association of side-chain truncations. A space-filling model of one of the monomeric subunits in the CA-C dimer is represented. The interfacial residues are color-coded according to the effect of side-chain truncations on K_a and ΔΔG_a. Red, truncation led to monomeric CA-C at all but the highest concentrations (estimated ΔΔG_a > 6 kcal/mol); orange, truncation had a substantial but not dramatic effect on affinity (Thr-188, ΔΔG_a = 1.1 kcal/mol); green, truncation did not significantly affect the affinity (ΔΔG_a < ±0.3 kcal/mol) or had only a small negative effect (Lys-199, ΔΔG_a = 0.5 kcal/mol); violet, truncation increased the affinity (ΔΔG_a = −0.4 to −1.8 kcal/mol). B, location of residues Arg-154 and Lys-199 (cyan), Gln-192 (yellow), Ser-178 (green), and Glu-180 (red), represented as space-filling models, on the CA-C dimer structure (ribbon model). Residues are labeled a or b to distinguish between subunits. Individual truncation of the side chains of Gln-192, Ser-178, or Glu-180 increased dimerization affinity (see "Results").

Energetic Contributions of the Interfacial Side Chains to the Folding Stability of the CA-C Monomer-Dissociation of the wild-type CA-C dimer into the folded monomeric intermediate had no effect on the secondary structure and was invisible by far-UV CD. Thus, the unfolding of the CA-C monomer could be specifically analyzed in GdmHCl denaturation experiments followed by far-UV CD (37). The free energy of unfolding ΔG_um of the monomeric form of each CA-C variant mutated at the interface (Fig. 6 and Table I) has now been determined. The difference between ΔG_um for nonmutated CA-C and each mutant (ΔΔG_um) was taken as a measure of the relative energetic contribution of the truncated side chain to the stability of the isolated monomeric intermediate (Fig. 3, white bars). Truncation of eight interfacial residues did not substantially affect monomer stability. However, truncation of 10 other residues destabilized the monomeric intermediate by 1-2 kcal/mol each, ~20-40% of the free energy of folding of this already weakly stable form. As expected, destabilization of the intermediate did not correlate with the role of these residues in dimerization. The nondestabilizing mutations included several polar side chains (Thr-148, Glu-175, Glu-187, ...). Most of them were, at least in the dimer, substantially exposed to solvent. Fluorescence spectroscopy had shown that the side chain of Trp-184 was essentially solvent-exposed in the monomeric intermediate and its truncation, as expected, had no effect on its stability.
Met-185 did not contribute to stability either, which was perhaps unexpected if its substantially buried position in each subunit (as seen in the dimer) is preserved in the monomeric intermediate. Among the clearly destabilizing truncations were those of the nonpolar side chains Ile-150, Leu-151, Leu-172, Val-181, or Leu-189, all of which were also critical for dimerization, and of polar residues Asp-152, Arg-154, Thr-188, and Asn-193, most of which were not. Interestingly, truncation of Glu-180 or Ser-178 not only increased dimerization affinity (see above) but also stabilized the monomeric intermediate by 1.3 or 2.5 kcal/mol, respectively. Despite the fact that folding and dimerization of CA-C are not coupled, many interfacial side chains provide a substantial energetic contribution to both processes.

DISCUSSION
The CA-C dimerization interface is structurally typical of a protein-protein interface in many respects (22,45,46). It involves as many as 22 residues from each monomer and buries as much as 1846 Å² of the solvent-accessible area (if the simplification of a rigid-body interaction is accepted), two-thirds of which is contributed from nonpolar side chain atoms that cluster in a large central area of the contact epitope (Fig. 4A) (29,30). In addition, the energetic dissection of the interface described here has revealed that all of the hydrophobic side chains buried at the central region of the contact epitope are critically involved in dimerization, whereas most largely polar side chains at the interface rim are not.
However, the thermodynamic results have also revealed major differences from other interfaces in heterocomplexes and homo-dimeric proteins that have been subjected to complete alanine scanning: (i) Nearly half of the side chains involved in the CA-CA contact epitope are so critically involved in the energetic epitope that their individual truncation led to essentially monomeric protein. In sharp contrast, almost no other contact side chain showed a significant positive contribution to affinity. (ii) CA-C dimerization is a very low affinity process when compared with nearly any other structurally similar interface that has been analyzed by alanine scanning. The results obtained indicate that such low affinity is partly due to the presence of several side chains in the contact epitope (Ser-178, Glu-180, Glu-187, and especially Gln-192) that not only do not favor association but individually decrease the association constant by up to more than one order of magnitude. An increase in affinity has rarely been found upon mutation of residues in other protein-protein interfaces, and when encountered, this effect was generally very small (5,8,15,21). The interfacial residues that negatively affect affinity, and/or the interactions they make, including possible electrostatic repulsions at opposite sides of the interface rim (see "Results"), appeared highly conserved in HIV, even more than some of the critical residues in the energetic epitope. This remarkable conservation could be explained if those residues were needed for the conformational stability of the individual partners. However, truncation of any of the CA-C side chains that impaired association either had no effect or substantially increased the conformational stability of both the monomeric and dimeric forms.
It is tempting to speculate that interfacial side chains and interactions that impair CA-CA affinity may have been evolutionarily conserved because of a selective functional advantage of a low stability and dimerization affinity of this CA domain for the productive assembly and maturation of the conformationally flexible HIV capsid (31,37). A nonexclusive explanation is that these residues may be involved in other interactions needed for completion of the HIV life cycle. Specifically, Ser-178 may be phosphorylated in HIV. Mutation of this residue affected viral infectivity, and the evidence suggested that Ser-178 phosphorylation is essential for the viral uncoating process (47). Thr is found at the equivalent position in many HIV-1 variants, but this residue also has the potential to be phosphorylated. Glu-180, Glu-187, and Gln-192 were all found to be involved in an alternative intermolecular interface between the N- and C-terminal domains as observed in crystals of CA of HIV-1 complexed with antibody (31). The involvement of some residues at the CA-C dimerization interface in the folding stability of the CA-C monomer, its conformational rearrangement, CA phosphorylation, and/or possible alternative types of CA-CA interactions during capsid assembly adds strength to the notion that structurally overlapping functions may be frequent in viral capsids and may impose severe constraints on virus evolution (48). For many protein heterocomplexes or homo-oligomers including CA-C, protein-protein recognition is not coupled to their folding but involves association of already folded monomers, in a process that frequently includes additional conformational rearrangements of the interacting polypeptides. A molecular interpretation of free energy differences found upon mutation of residues at protein-protein interfaces thus requires careful consideration of the specific process that is being observed by the probe used (49).
The combined thermodynamic analyses carried out for CA-C allowed us to distinguish for each interfacial side chain its energetic contributions to the folding stability of the CA-C monomer, to the tertiary conformational rearrangement of the monomers during dimerization, and to their direct association through the establishment of quaternary interactions (42). Many residues in protein-protein interfaces have their side chains oriented toward the interacting partner, and it is often assumed that the effects of interfacial residues on the folding stability of the monomeric subunits are generally minor or even negligible. In fact, the results with CA-C show that as many as half of the interfacial side chains can make important contributions to the stability of the monomeric form. Several of these residues clearly participate in the hydrophobic core of each subunit as they appear in the dimer structure, but other residues important for monomer stability have their side chains oriented toward the interface. Determination of differences in tertiary structure between the isolated subunit and the oligomer may be needed for a satisfactory molecular explanation of the thermodynamic results. For CA-C, this could be achieved by determining the structure of the monomeric W184A mutant (underway). Because for CA-C the establishment of quaternary interactions is accompanied by an energetically substantial tertiary rearrangement of the interacting subunits, it was also important to differentiate between the contributions of each interfacial side chain to these processes. Most of the side chains were involved in both, but to widely diverse extents. For example, Asp-152 and Gln-192 had a positive contribution to the conformational rearrangement of the monomers during dimerization, whereas Asp-152 had no significant effect on, and Gln-192 actually impaired, the association step itself. Even with such a combined thermodynamic approach, the energetic dissection of any protein-protein interface may still be complicated by nonadditive (cooperative) effects (49). This is also clearly seen with CA-C, as individual truncation of each of nearly half of the interfacial residues led to differences in the free energy of association large enough to essentially abolish dimerization.

FIG. 6. GdmHCl denaturation of nonmutated CA-C and some mutants at 200 µM followed by their ellipticity at 222 nm. The curves were fitted to a unimolecular two-state transition using Equation 13 in Ref. 37. Circles, parental CA-C; inverted triangles, mutant K203A; triangles, L172A; squares, S178A.
The isolated wild-type CA-C domain has been shown previously to inhibit HIV assembly (4). Now, alanine scanning has allowed us to identify several residues at the CA-C dimerization interface that do not positively contribute to the stability and/or association of this small protein domain (Table I and Fig. 3). Correctly folded CA-C variants with substantially increased affinity for wild-type CA could be obtained by rational or combinatorial mutagenesis of some of these residues, in particular those that keep dimerization affinity low. CA-C mutant Q192A, with its 20-fold increased affinity, could serve as a starting point for this approach to design a high affinity interfacial inhibitor of native CA dimerization and HIV-1 assembly and infectivity.