Solution Structure of the Mature HIV-1 Protease Monomer

We present the first solution structure of the HIV-1 protease monomer spanning the region Phe1–Ala95 (PR1–95). Except for the terminal regions (residues 1–10 and 91–95), which are disordered, the tertiary fold of the remainder of the protease is essentially identical to that of the individual subunit of the dimer. In the monomer, the side chains of buried residues stabilizing the active site interface in the dimer, such as Asp25, Asp29, and Arg87, are now exposed to solvent. The flap dynamics in the monomer are similar to those of the free protease dimer. We also show that the protease domain of an optimized precursor flanked by 56 amino acids of the N-terminal transframe region is predominantly monomeric, exhibiting a tertiary fold that is quite similar to that of the PR1–95 structure. This explains the very low catalytic activity observed for the protease prior to its maturation at its N terminus as compared with the mature protease, which is an active stable dimer under identical conditions. Adding as few as 2 amino acids to the N terminus of the mature protease significantly increases its dissociation into monomers. Knowledge of the protease monomer structure and critical features of its dimerization may aid in the screening and design of compounds that target the protease prior to its maturation from the Gag-Pol precursor.

Catalytic activity of retroviral proteases requires dimer formation, unlike that of cellular aspartic proteases, which are monomeric. The active site of HIV-1 protease, similar to that of the Rous Sarcoma virus (RSV), Avian Myeloblastosis virus (AMV), and other retroviral proteases, is formed along the dimer interface. In HIV-1, a single copy of the protease, composed of 99 amino acids, is synthesized as part of the Gag-Pol polyprotein (Fig. 1) (1). Thus, the initial critical step in the maturation of the protease involves the folding and dimerization of the protease domain in the form of a Gag-Pol precursor in order to catalyze the hydrolysis of the peptide bonds at its termini. The released active mature protease is expected to process the remainder of the Gag-Pol precursor and the Gag precursor at specific sites into the necessary mature functional and structural proteins required for viral maturation (1,2).
Stage-specific regulation of the protease is crucial in the viral replication cycle, as is evident from studies showing that premature activation or partial inhibition of protease activity leads to impaired maturation of the virus (3-6). Kinetics of the protease maturation from a model precursor containing only the two native cleavage sites, p6 pol /PR at the N terminus and PR/RT at the C terminus (Fig. 1), showed that the reaction takes place in two independent sequential steps (7). The first step involves an intramolecular cleavage at the N terminus of the protease domain concomitant with a large increase in mature-like enzymatic activity and the appearance of the transient protease intermediate containing the flanking C-terminal polypeptide. The transient protease intermediate, which exhibits kinetic parameters and a dissociation constant similar to those of the mature protease, is converted in a second step to release the mature protease via an intermolecular cleavage (8).
The native transframe region that flanks the N terminus of the protease in the Gag-Pol polyprotein comprises two domains, the transframe octapeptide (TFP) followed by the 48 amino acid p6 pol , both separated by protease cleavage sites (9,10). Reactions using the full-length TFP-P6 pol -PR (Fig. 1) at pH 5.0, optimal for catalytic activity of the mature protease and the autocatalytic maturation reaction, showed the release of the protease to occur in two distinct steps (2,11). The first cleavage occurs at the TFP-P6 pol site to generate the intermediate precursor P6 pol -PR. In the second step, P6 pol -PR is converted to the mature protease concomitant with a large increase in catalytic activity. Thus, the two proteins, TFP-P6 pol -PR and P6 pol -PR, exhibit nearly the same low catalytic activity, and the rate-limiting intramolecular cleavage at the p6 pol -PR site is indeed concomitant with the appearance of mature-like enzymatic activity and stable tertiary structure formation characteristic of a protease dimer (2,11). These results are consistent with studies showing that HIV-1 particles of four different strains obtained from different cell lines contained only the 11-kDa mature protease and no p6 pol -PR precursor (12). Importantly, a mutation of the N-terminal protease cleavage site p6 pol /PR leading to the production of an N-terminally extended 17-kDa protease species caused a severe defect in Gag polyprotein processing and a complete loss of viral infectivity (12).
The mature protease has been the target of drug development, and hundreds of crystal structures of the protease dimer bound to various inhibitors have been solved (13,14). This has contributed to a large database for improved design of drugs that bind to the active site. However, treatment of HIV-1 infection on a longer term without the emergence of drug resistance against protease inhibitors has been a challenge for the past decade (15). Although second generation active-site inhibitors are being developed in an effort to overcome the problem of drug resistance, strategies to define an alternative mode of inhibition, such as disrupting or preventing dimer formation, may provide another avenue toward inhibitor design (16). These latter inhibitors may have greater success in curbing the emergence of drug resistance. There have been published reports of dimerization inhibitors of the protease (17,18); however, to date, no structural information on such monomer-inhibitor complexes is available. These studies have been hampered by the fact that monomeric protease is difficult to obtain, since the mature protease is predominantly dimeric in solution with a dissociation constant < 5 nM (2,19,20).
A network of interactions around the active site and termini of the mature protease is critical for its dimerization (21-23). To our knowledge there is no evidence for the existence of a folded monomer species during folding of the wild-type mature protease, nor structural data to support the dissociation of the dimer into a folded monomer. The existence of a folded monomer was recently observed only when unique dimer interface contacts were disrupted via mutations. These mutations increased the dissociation constant of the protease dramatically such that a monomer could be studied by solution NMR at a high protein concentration of up to 1 mM (in monomer) (24,25). A series of mutants, PR R87K, PR D29N, PR T26A, PR 5-99, and PR 1-95, has been described to exhibit a monomer fold in the absence of inhibitor. Of these, only PR 1-95 did not form a ternary complex with DMP323, an inhibitor that binds tightly (Ki < 10⁻⁹ M) (26) to the protease dimer, up to protease concentrations of 1 mM.
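The relationship between the dissociation constant and the monomer population at NMR concentrations can be illustrated with a short calculation. The sketch below assumes a simple two-state monomer-dimer equilibrium, Kd = [M]²/[D], with total protein expressed in monomer units, Pt = [M] + 2[D]. The function name and all Kd values other than the 5 nM upper bound quoted above are hypothetical and serve only to show the trend.

```python
import math


def monomer_fraction(kd_molar, total_monomer_molar):
    """Fraction of protein chains present as free monomer for 2M <-> D.

    Assumes Kd = [M]^2 / [D] and Pt = [M] + 2[D] (Pt in monomer units).
    Solving 2[M]^2/Kd + [M] - Pt = 0 for the positive root gives
    [M] = 2*Pt / (1 + sqrt(1 + 8*Pt/Kd)), a form that avoids
    cancellation when Pt/Kd is small.
    """
    x = 8.0 * total_monomer_molar / kd_molar
    free_monomer = 2.0 * total_monomer_molar / (1.0 + math.sqrt(1.0 + x))
    return free_monomer / total_monomer_molar


if __name__ == "__main__":
    total = 1e-3  # 1 mM protein, in monomer units
    # 5e-9 M is the upper bound quoted for the mature protease;
    # the larger values are hypothetical, for illustration only.
    for kd in (5e-9, 1e-6, 1e-3, 1e-1):
        frac = monomer_fraction(kd, total)
        print(f"Kd = {kd:.0e} M -> monomer fraction = {frac:.3f}")
```

Under these assumptions a Kd of 5 nM leaves well under 1% of the chains monomeric at 1 mM, which is why a folded monomer becomes observable only when the interface is destabilized enough to raise Kd by many orders of magnitude.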
As a prelude to future structural studies of protease monomer-inhibitor complexes, here we present the first structure of the HIV-1 protease monomer, PR 1-95 as determined by solution NMR. The structure of the monomer together with the NMR relaxation results has allowed comparison of the structure of the monomer with the subunit of the uninhibited dimer. In addition, using an inactive precursor bearing the active site D25N mutation, termed TFP-P6 pol -PR D25N (Fig. 1), we show that the protease domain of the uninhibited precursor is mainly monomeric, adopting a tertiary fold very similar to that of the PR 1-95 monomer. Systematic analyses of NMR and kinetic data of protease constructs, flanked either by 4 residues or 1 residue of the p6 pol sequence, suggest that local interaction and packing of the terminal Pro 1 and Phe 99 residues in the mature protease are critical to achieve native-like dimer stability. Finally, the above results are discussed in the context of a model for the regulation of the HIV-1 protease in the viral replication cycle (2).

EXPERIMENTAL PROCEDURES
Protease Constructs-The protease (PR) domain in all constructs, optimized for NMR and kinetic studies, bears 5 mutations: Q7K, L33I, and L63I to minimize the autoproteolysis of the protease, and C67A and C95A to prevent cysteine-thiol oxidation (2). Plasmid DNA (pET11a, Novagen, Madison, WI) encoding TFP-p6 pol -PR and PR (2) were used with the appropriate oligonucleotide primers to generate the constructs TFP-p6 pol -PR D25N and PR D25N . Similarly, a stop codon was introduced to produce TFP-p6 pol -PR 1-95 and PR 1-95 . The PR-encoding plasmid was sequentially extended one codon at a time to produce SFNF PR. The SFNF PR template was then used to introduce a D25N mutation. MI PR, MF PR, and MG PR constructs were derived from PR. The initiator Met residue, when not excised upon expression in Escherichia coli, is indicated in the construct name, as in the case of MI PR, MF PR, and MG PR. All constructs were generated using the QuikChange mutagenesis protocol (Stratagene, La Jolla, CA) and verified by DNA sequencing and mass spectrometry. E. coli BL21(DE3) cells were grown in minimal media containing ¹⁵N ammonium chloride with or without ¹³C glucose as the sole nitrogen and carbon sources, respectively, at 37°C and induced for expression. Proteins were prepared using an established protocol as described previously (24). Specific cleavages giving rise to products due to the autoprocessing of active precursor proteins were assessed both by SDS-PAGE and mass spectrometry.
NMR Spectroscopy and Structure Determination-All ¹H-¹⁵N correlation spectra were recorded using ~0.5 mM protein in monomer (unless noted otherwise) in 20 mM phosphate buffer at pH 5.8. NMR experiments for structure determination of PR 1-95 were carried out using 0.4-0.5 mM protein in 20 mM phosphate buffer at pH 4.5 in 95% H2O/5% D2O and a sample volume of ~280 µl in a 5-mm Shigemi tube (Shigemi, Inc., Allison Park, PA). Spectra were acquired on DMX500 spectrometers with or without a cryoprobe (Bruker Instruments, Billerica, MA) at 20°C. Backbone and side chain resonance assignments, and heteronuclear separated proton NOESY spectra for structure determination, were obtained using standard triple-resonance three-dimensional NMR experiments (27). Backbone dihedral angle restraints were obtained from ³J(HN-Hα) coupling constants, and χ1 angle restraints were determined from ³J(N-Hβ) coupling constants and NOESY data (28). Residual ¹D(NH) dipolar couplings were measured in 6% gel medium (29,30). Gels were prepared from a stock solution of 36% w/v acrylamide and 0.92% w/v N,N′-methylenebisacrylamide, yielding an acrylamide/bisacrylamide ratio of 39:1. Gels (280 µl) were cast to a diameter of 5.4 mm, rinsed in water overnight, dehydrated to about one-sixth the original size over a period of 5-6 h at 37°C, soaked in protein solution (desired buffer and pH) for 16-20 h at room temperature, and gently pushed into a Wilmad open-ended NMR tube (4.24 ± 0.012 ID) as described (29,30). The bottom end of the tube was sealed with a susceptibility-matched plug and the top with a regular, susceptibility-matched Shigemi (Allison Park, PA) microcell plunger.
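The gel composition quoted above can be checked with simple arithmetic. The following sketch assumes that the "6%" of the gel medium refers to w/v acrylamide obtained by diluting the 36% stock, which the text does not state explicitly; the function names are illustrative.

```python
def crosslinker_ratio(acrylamide_pct, bis_pct):
    """Acrylamide-to-bisacrylamide weight ratio of a stock solution."""
    return acrylamide_pct / bis_pct


def stock_volume_ul(final_pct, stock_pct, gel_volume_ul):
    """Volume of stock needed for a gel of the given final concentration.

    Simple dilution, C1*V1 = C2*V2. Assumes the final percentage is
    defined on the same w/v acrylamide basis as the stock.
    """
    return gel_volume_ul * final_pct / stock_pct


if __name__ == "__main__":
    ratio = crosslinker_ratio(36.0, 0.92)
    volume = stock_volume_ul(6.0, 36.0, 280.0)
    print(f"acrylamide:bis ratio = {ratio:.1f}:1")
    print(f"stock needed for a 280 ul, 6% gel = {volume:.1f} ul")
```

With these inputs the stock composition corresponds to a ratio of about 39:1, matching the value given in the text.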
NMR data were processed and analyzed using the nmrPipe, nmrDraw, and PIPP software (31,32). Experimentally determined distance, dihedral angle, and residual dipolar coupling constraints (Table I) were applied in a simulated annealing protocol using Xplor-NIH with conformational database torsion angle potentials (33). Simulated annealing and minimization calculations were carried out in Cartesian coordinate space with a final Powell minimization as the last step. Structures were analyzed using PROCHECK-NMR (34), and structure figures were generated using Insight II (MSI) and GRASP (35). Accessible surface area was calculated using NACCESS (36).
Enzyme Kinetics-Kinetic parameters were measured using the substrate Lys-Ala-Arg-Val-Nle-(4-nitrophenylalanine)-Glu-Ala-Nle-NH2 (California Peptide Research, Napa, CA) (37) as described previously (7,8,19) in 50 mM sodium acetate buffer, pH 5, and 0.25 M NaCl at 25°C. In a typical assay, 4 µl of enzyme was added to 96 µl of buffer in a 100 µl spectrophotometer cell. The reaction was initiated by the addition of 10 µl of substrate in water and monitored by following the decrease in absorption at 310 nm (Δε = 1800). In all cases, data were collected at substrate concentrations (10-460 µM) above and below Km at a final enzyme concentration of 250 nM MG PR and MI PR and 80 nM PR. The kinetic parameters Km and kcat were obtained by fitting the Michaelis-Menten equation to initial rates. Assays to determine the Kd were performed under the same conditions without NaCl. Kd values were derived from plots of specific activity versus dimeric enzyme concentration as described previously (2,19) at a final substrate concentration of 390 µM. Enzyme concentrations were determined both spectrophotometrically (absorbance at 280 nm) and by Bio-Rad assay (Bio-Rad Laboratories, Hercules, CA).
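The kinetic analysis described above can be sketched in a few lines. This is a minimal illustration, not the fitting procedure used in this work: it substitutes a Hanes-Woolf linearization ([S]/v versus [S]) for a direct nonlinear fit of the Michaelis-Menten equation, it assumes that Δε is in units of M⁻¹ cm⁻¹ with a 1 cm path length, and all function names and the synthetic data are hypothetical.

```python
def rate_from_absorbance(slope_abs_per_s, delta_epsilon=1800.0, path_cm=1.0):
    """Convert an absorbance slope (dA/dt) into a rate in M/s.

    Beer-Lambert conversion. The units of delta_epsilon (M^-1 cm^-1)
    and the 1 cm path length are assumptions.
    """
    return abs(slope_abs_per_s) / (delta_epsilon * path_cm)


def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) by linear regression of [S]/v against [S].

    In this linearization the slope is 1/Vmax and the intercept is
    Km/Vmax. It stands in for a nonlinear least-squares fit.
    """
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, intercept / slope


def kcat(vmax, enzyme_conc):
    """Turnover number; enzyme_conc must use the caller's chosen basis."""
    return vmax / enzyme_conc


if __name__ == "__main__":
    # Synthetic, noise-free rates generated from assumed parameters.
    conc = [10e-6, 50e-6, 100e-6, 200e-6, 460e-6]  # M
    v = [michaelis_menten(s, 2e-7, 5e-5) for s in conc]  # M/s
    vmax_fit, km_fit = fit_hanes_woolf(conc, v)
    print(f"Vmax = {vmax_fit:.3e} M/s, Km = {km_fit:.3e} M")
    print(f"kcat = {kcat(vmax_fit, 80e-9):.2f} 1/s at 80 nM enzyme")
```

For real data with measurement noise, a nonlinear fit weights the points more evenly than any linearization, so the linear estimate is best treated as a starting guess.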

RESULTS AND DISCUSSION
Upon autocatalytic maturation at its N and C termini, the HIV-1 protease forms a stable homodimer exhibiting a dissociation constant in the subnanomolar range (2,19,20). Studies characterizing the HIV protease monomer were not feasible until we recently demonstrated that mutations of the interface residues, such as D29N, R87K, or deletion mutants of the terminal residues 1-4 or 96-99, destabilize the dimer (24,25). While it is apparent that deletion of the terminal residues precludes the formation of the terminal β-sheet and disrupts dimerization, these studies also revealed that subtle intermonomer contacts formed by the conserved Asp 29 and Arg 87 residues are essential to the dimerization of the mature protease. NMR and equilibrium sedimentation analyses show that these mutants exhibit a monomer fold and a range of dimer dissociation constants. We chose to determine the monomer structure of PR 1-95 because it is predominantly monomeric with no observable dimer formation up to a concentration of 1 mM, even in the presence of the high affinity inhibitor DMP323 (24).
Description of PR 1-95 Monomer Structure-The three-dimensional structure of PR 1-95 was determined using heteronuclear multidimensional NMR spectroscopy. Although PR 1-95 undergoes aggregation that limited the number of experiments performed as well as the acquisition time of individual experiments, all necessary data for a high resolution structure were obtained. Forty of 100 calculated structures converged without angle or NOE violations greater than 5° or 0.5 Å, respectively. PR 1-95 is a β-rich protein, composed of seven β-strands and one α-helix. A superposition of 10 conformers depicted in Fig. 2 shows that the structure is well defined except for disordered terminal and loop segments.
A backbone superposition of the average NMR structure with the monomer subunit of two different crystal structures of the free mature protease dimer (21,38) is shown in Fig. 3A. As is apparent, residues 10-90 of the PR 1-95 monomer exhibit a nearly identical fold to that of one subunit of the protease dimer. This similarity in structures is consistent with the backbone chemical shifts of the PR 1-95 monomer compared with those of the wild-type dimer (Fig. 4A). Characteristics that distinguish the PR 1-95 structure from the monomer subunit of the mature protease dimer are (I) disorder of (a) the N-terminal residues 1-10, (b) the flap residues 48-54, and (c) residues 91-95 at the C terminus of the α-helix, and (II) solvent-exposed active site residues, comprising mainly polar amino acids. Interesting aspects of these regions of the monomer structure are discussed below.
Terminal β-Sheet Interface-In the mature protease dimer, the terminal residues 1-4 and 96-99 of the two subunits form a well-ordered interfacial four-stranded anti-parallel β-sheet (16). The lack of secondary structure of the N-terminal residues in the PR 1-95 monomer is not surprising given that deletion of residues 96-99 precludes formation of the terminal interface β-sheet. The loss of the terminal β-sheet and chain flexibility observed for PR 1-95 (Figs. 2 and 4) also occurs in monomer constructs such as PR T26A and PR R87K that contain intact terminal sequences. Thus, specific interactions distant from the terminal region clearly influence the stability of the terminal β-sheet interface (24,25).
Flaps-Flap disorder observed in the PR 1-95 monomer structure is of particular interest due to the critical role of the flaps in protease function. Crystal structures show that the flaps form β-hairpin structures that range from semi-open conformations in the substrate-free form of the dimer to a closed conformation upon substrate binding (39). The flaps were predicted to be flexible in order to permit substrate binding, and our earlier NMR solution studies indicated that the flaps in the free protease exhibit greater flexibility than when interacting with inhibitors (40,41). Coordinates of the free protease dimer in solution have not been reported before, and thus the coordinates presented here for the PR 1-95 monomer provide the first information about a flap conformation of free protease in solution.
The flap in the PR 1-95 monomer exhibits a β-hairpin structure, similar to the flap of the dimer, but with significant disorder in residues 48-53, and seems to adopt an open conformation (Fig. 3A). Evidence for flap flexibility on a subnanosecond time scale is provided by a decrease of the heteronuclear NOE (Fig. 4B) and an increase in the transverse relaxation times of residues 49-53 (Fig. 4C). In addition, two sets of α-proton signals of Gly 52 were observed in the monomer flap, suggesting a slow conformational change of this region, presumably on a millisecond time scale. In contrast, a single set of α-proton signals for Gly 52 was detected under equivalent conditions in the free protease dimer. The flap region in the free protease dimer was also found to undergo ~100 µs conformational exchange at 20°C, pH 5.8 (42), suggesting a minor difference in flap dynamics between the monomer and the dimer. Although there may be differences in the time scale of the flap motion in monomer and dimer, both monomer and dimer flaps undergo slow conformational change in addition to fluctuations on the subnanosecond time scale (41).

FIG. 3. A, backbone superposition of the average NMR structure of PR 1-95 (blue) with one subunit of two free protease dimer crystal structures shown in green (21) and yellow (38). B, GRASP electrostatic surface potential of PR 1-95 (excludes residues 1-10). Note: the crystal structure shown in green has a flap conformation that is more open than the crystal structure shown in yellow.

FIG. 4. Comparison of PR 1-95 and PR chemical shifts and PR 1-95 relaxation parameters.
Active Site-Residues around the active site are also part of the interface that stabilizes the free protease dimer. In the free protease dimer, the active site Asp 25 residue is involved in a β-1 turn that is stabilized by a network of two intersubunit hydrogen bonds known as the "fireman's grip" (43,44). Disruption of this interaction via a T26A mutation was shown to destabilize the dimer (25,45). In the PR 1-95 monomer structure, the region encompassing the β-1 turn is somewhat disordered. Although NOE interactions typical of a β-1 turn are observed in the monomer, indicating that the β-1 turn remains, the decrease in transverse relaxation times suggests flexibility on the subnanosecond time scale in the active site region in the monomer (Fig. 4C). The accessible surface area of the side chains of residues 23, 24, 26, and 29 located in this β-turn region is 40% larger in the monomer than in the dimer. This increase in flexibility of the β-turn region is most likely due to the loss of the intermolecular hydrogen-bond network and a partial exposure of the turn to the solvent.
The active site β-turn region in the monomer contains a large number of exposed charged side chains. A negatively charged patch composed of Asp 25 , Asp 29 , and Asp 30 is adjacent to the positively charged Arg 87 side chain (Fig. 3B). Because of the relatively low number of experimental constraints, orientations of the side chains of the residues were not determined; however, the side chains are expected to be mobile based on their solvent accessibility. Consistent with this, the random coil Cδ chemical shift of Arg 87 indicates the loss of an intramonomer hydrogen bond between the Arg 87 side chain and Asp 29 . The loss of this intramonomer hydrogen bond was previously attributed to a dramatic decrease in the dimer stability observed for the mature protease bearing an R87K mutation (25).
Tertiary Fold of the Protease Precursor-Earlier kinetics and NMR investigations of the maturation reaction using the TFP-p6 pol -PR precursor (Fig. 1) clearly indicated that the cleavage at the N terminus of the protease is concomitant with the appearance of mature-like enzymatic activity and stable dimer formation (2,11). A systematic NMR structural study of the wild-type TFP-p6 pol -PR precursor (Fig. 1) bearing the native cleavage sites, TFP/p6 pol and p6 pol /PR, was not feasible due to its autocatalytic maturation to release the mature protease (2). Although maturation can be blocked by adding a large excess of inhibitor, this perturbs the monomer-dimer equilibrium of the free precursor and also contributes to undesirable effects such as the precipitation or aggregation of the protein at concentrations above 0.5 mM, thereby preventing detailed structural studies of the precursor. In our earlier studies, we had shown that the mature protease bearing a mutation of the active site residue (46), PR D25N , was highly suitable for long-term solution NMR studies of the mature protease dimer at a concentration of ~0.5 mM (25,47). In addition, the D25N mutation, unlike the T26A mutation (25), has only a modest effect on the dimer stability of the mature protease. Thus, in order to analyze the precursor protease by NMR in the absence of any inhibitor, an active-site mutation D25N was introduced in the TFP-p6 pol -PR precursor to abolish its maturation. This construct was compared with the mature protease bearing the same D25N mutation, PR D25N . Fig. 5, A and B, shows a comparison of HSQC spectra of PR D25N and TFP-P6 pol -PR D25N under identical conditions. Chemical shifts of most signals observed in a ¹H-¹⁵N correlation spectrum of PR D25N are very similar to those of the active protease dimer (41). Signals of residues in the dimer interface of PR D25N , such as Ile 3 , Ile 84 , Gln 92 , and Thr 96 , exhibit a significant shift (identified in striped boxes in Fig. 5A).
In the TFP-P6 pol -PR D25N spectrum, these peaks are absent, and additional intense resonances are observed in the random coil region (8-8.5 ppm for proton, Fig. 5B). These intense signals likely arise from residues of the TFP-P6 pol domain, consistent with results indicating that the isolated transframe region does not possess a stable secondary or tertiary structure (48). In addition, less intense but well dispersed signals (indicated in solid boxes in Fig. 5B) were observed in positions similar to signals in the spectra of the folded monomer PR 1-95 (Fig. 5C) and other mutants that form monomers (24,25). We therefore conclude that the flanking transframe polypeptide influences the monomer-dimer equilibrium of the protease domain, in accordance with the observation that TFP-P6 pol -PR D25N is predominantly a monomer whereas the mature protease, bearing the same D25N mutation, is a dimer (47).
We extended the examination of the tertiary fold of the protease precursor by constructing the precursor variant, TFP-P6 pol -PR 1-95 , lacking the same 4 C-terminal residues (96-99) as PR 1-95 . As expected, TFP-P6 pol -PR 1-95 does not undergo maturation, which further emphasizes the fact that intersubunit interaction between the C-terminal β-strands is critical to the dimerization of the protease precursor. Similar to TFP-P6 pol -PR D25N , the spectrum of TFP-P6 pol -PR 1-95 exhibited numerous intense peaks in the random coil region (8-8.5 ppm for proton, Fig. 5D). In addition, well-dispersed signals corresponding to those seen for PR 1-95 (Fig. 5C) were also observed. Although systematic sequential peak assignments for precursors could not be performed because of peak overlap, the high degree of similarity of the PR 1-95 and TFP-p6 pol -PR 1-95 spectra permitted easy assignment by comparison of well resolved signals, and several peak identities are indicated. For example, Ile 93 and Gly 94 signals of the monomer show a large shift in the dimer spectrum (PR D25N , Fig. 5A) due to dimer interface contacts. In addition, other peaks having chemical shifts characteristic of the monomer, Gly 49 , Gly 52 , Ala 67 , and Ala 68 , are also indicated in the TFP-P6 pol -PR 1-95 spectrum. Although minor structural differences may exist, the similarity between the spectra indicates that the structure of PR 1-95 , at least for the core region, mimics the protease precursor monomer.
It is apparent from the above results that the native transframe region hinders dimer formation. It is noteworthy that both the terminal regions of the transframe region, the TFP and the C-terminal residues of p6 pol , are competitive inhibitors of the mature PR. Louis et al. (10) have shown that the isolated TFP is uniquely a hydrophilic competitive inhibitor of the mature protease but that inhibition is dependent on the protonation of a group with a pKa of 3.8. Complementary studies by Paulus et al. (49) have shown that competitive inhibition of the mature protease at pH 5 by the purified transframe domain is dependent on the presence of the native C-terminal residues SFNF corresponding to the P4-P1 positions of the p6 pol /PR cleavage site sequence. Inhibition by the full-length TFP-p6 pol domain was about 45-fold better (IC50 = 13 µM) than seen for the synthetic peptide SFNF, and the TFP-p6 pol domain lacking the last four residues failed to inhibit. Comparison of the monomer spectra of the precursors TFP-p6 pol -PR D25N or TFP-p6 pol -PR 1-95 with that of PR 1-95 does not reveal significant differences indicative of an interaction between the TFP-p6 pol region and the protease monomer. We therefore believe that the dimeric precursor of the protease in which the N-terminal cleavage site sequence is bound to the active site is a transient form present in small amounts and thus undetectable in the HSQC spectrum. This interpretation is consistent with the previously proposed model of the protease precursor (2,7).
Local Interactions at the Termini of the Mature Protease Contribute to Dimer Stability-The protease is mainly monomeric when fused to relatively long sequences at its N terminus (2,7,11). Our earlier studies showed that the first-order rate constant of the autoprocessing of the model precursor, which consisted of the protease domain flanked at both ends by short native Gag-Pol sequences and an N-terminal maltose binding protein (MBP) domain (7), was similar to that of the native precursor p6 pol -PR (Fig. 1) (11). This observation indicates that the MBP domain (38 kDa), which is close in size to the Gag domain of Gag-Pol, does not perturb the autoprocessing reaction. In addition, both fusion proteins, the model precursor and p6 pol -PR, exhibit very low catalytic activities relative to the mature protease (2,7,11).
To assess whether dimerization is sensitive to local interactions involving a few residues of the flanking p6 pol sequence, we examined the effect of just the SFNF extension on protease stability and enzymatic activity. The construct created, SFNF PR D25N , bears the p6 pol /PR native cleavage site but has an inactivating active site mutation, D25N, that precludes processing. The HSQC spectrum of SFNF PR D25N shows that it is mostly a folded monomer at ~0.6 mM (Fig. 6A), with signals in positions similar to those of PR 1-95 . Peaks that characterize the terminal β-sheet interface of the dimer were not observed. We have shown previously that even though some mutations (e.g. R87K and D29N) significantly increase the dissociation of the free mature protease dimer such that a majority of the protein is monomeric at ~0.5 mM, in the presence of DMP323 nearly all of the protein forms a ternary complex in which the inhibitor is bound to the dimer (24,25). Comparison of the HSQC spectra of TFP-p6 pol -PR D25N and SFNF PR D25N in the presence of ~5-fold excess of DMP323 indicates that while nearly all the SFNF PR D25N is dimeric, a significant portion of the TFP-p6 pol -PR D25N precursor still exists as a monomer. This observation suggests that a longer sequence, namely the intact native TFP-p6 pol , hinders the dimerization of the protease to a larger extent than a 4-amino acid extension. This interpretation is consistent with our earlier results showing that the active precursor TFP-p6 pol -PR is about 160-fold less active than the mature protease assayed using substrate (11).
The observation that 4 amino acids flanking the protease destabilize the dimer motivated us to examine the influence of a shorter extension and varying side chains of the P1 residue on the monomer-dimer equilibrium. We therefore examined three constructs in which the protease is fused at the N terminus either to the native P1 position Phe residue or to non-native Gly or Ile residues. Since the initiator Met is retained in all these constructs, as determined by mass spectrometry, they are termed MF PR, MG PR, and MI PR accordingly. It was not feasible to record an HSQC spectrum for MF PR due to rapid processing at its N terminus (Phe-Pro cleavage site). MG PR and MI PR, which do not exhibit processing at the protein concentration used for recording the spectra, showed a mixture of monomer and dimer at ~0.5 mM, with Ile having a larger influence on dimer stability than Gly (Fig. 6B and Table II). As seen with SFNF PR D25N , MF PR, MG PR, and MI PR are also mainly dimeric in the presence of DMP323. A comparison of the Kd values for these 3 enzymes indicated that MI PR was the least stable of the proteins (see Table II). The kcat/Km values for MG PR and MI PR were about 2- and 13-fold lower, respectively, than that for PR under identical conditions (Table III). This again confirms that the native N-terminal TFP-p6 pol sequence has a greater influence on the monomer-dimer equilibrium and catalytic activity of the protease than the 2 or 4 residue N-terminal extensions.
The gradual increase in Kd seen on going from MG PR to MF PR and MI PR (Table II) suggests that side-chain size in a 2 residue N-terminal extension influences the monomer-dimer equilibrium. In contrast, a 19 amino acid extension of the reverse transcriptase sequence at the C terminus of the protease does not significantly affect either the kinetic parameters or the Kd (8). As noted from the crystal structure of the free protease dimer, Pro 1 is relatively close to the Ala 67 -His 69 loop as well as to Phe 99 . We speculate that additional residues at the N terminus of the protease may affect the local packing of side chains, thus lowering the dimer stability.

CONCLUSIONS
We describe here the first three-dimensional structure of the monomer of HIV-1 protease. Because the Kd of the mature protease is < 5 nM, elucidation of this elusive monomer structure has only been feasible by mutations that destabilize the dimer interface contacts. The structure of the monomer exhibits a nearly identical tertiary fold to that of a single subunit of the dimer except for terminal region residues 1-10 and 91-95, which are flexible as expected in the absence of a specific interface structure. Unique features of the monomer structure are (I) an exposed charged surface of the active site region and (II) an open flap conformation.
Prior to maturation by cleavage at its N terminus, the protease precursor is mainly monomeric. The overall monomer fold of the protease domain, as revealed using the TFP-p6 pol -PR D25N precursor, closely resembles that of the mature protease monomer. Even though the terminal regions of the TFP-p6 pol domain are competitive inhibitors of the mature protease, there is no indication of their interaction with the precursor TFP-p6 pol -PR monomer. The lack of NMR signals indicative of a dimeric form of the precursor suggests that interactions of the cleavage site sequences, TFP/p6 pol and p6 pol /PR, with the dimeric precursor are likely to be transient and only occur for a small fraction of the protein until a significant amount of the mature protease is formed. It is conceivable that the protease domain in the context of a Gag-Pol precursor also adopts a nascent monomer fold in vivo similar to that of the TFP-p6 pol -PR D25N precursor. Therefore, the structure of the monomer determined in the present work may aid future screening and structure-based design of compounds targeted specifically to the precursor. Inactivation of the protease by interfering with dimer formation of the mature protease or of the precursor prior to its maturation provides an alternative strategy to address the rapid emergence of drug resistance commonly observed with current active site inhibitors.
The structure and quaternary state of HIV-1 protease are exquisitely sensitive to subtle changes in its sequence (24,25). We demonstrate that addition of only 4 amino acids flanking the N terminus significantly reduces dimerization of the free protease. This observation suggests that local interactions and/or a packing effect of the terminal Pro 1 and Phe 99 residues are critical to the monomer-dimer equilibrium, in addition to the previously reported active site and terminal β-sheet interfaces. It is interesting to note that both N-terminal extensions and a deletion (e.g. of residues 1-4) induce dimer dissociation (increase in Kd), which may relate to the regulation of protease activity during the viral replication cycle. Whereas the N-terminal p6 pol domain, when fused to the N terminus of the protease, delays the onset of maturation of the protease, self-proteolytic cleavage at Leu 5 /Trp 6 within the protease (50) can be viewed as a final step toward the inactivation of the mature protease. Finally, although numerous similarities between the HIV-1 protease and the gastric enzyme pepsinogen have been pointed out (2,7,11), the lability of the protease dimer prior to its maturation clearly contrasts with the properties observed for monomeric pepsinogen, which exhibits a stable tertiary structure in its precursor form and has activity comparable to that of the processed mature pepsin.