Solution Structure of the Human Immunodeficiency Virus Type 1 p6 Protein*

The human immunodeficiency virus type 1 p6 protein represents a docking site for several cellular and viral binding factors and fulfills major roles in the formation of infectious viruses. To date, however, the structure of this 52-amino acid protein, by far the smallest lentiviral protein known, either in its mature form as free p6 or as the C-terminal part of the Pr55 Gag polyprotein has not been unraveled. We have explored the high resolution structure and folding of p6 by CD and NMR spectroscopy. Under membranous solution conditions, p6 can adopt a helix-flexible helix structure; a short helix-1 (amino acids 14–18) is connected to a pronounced helix-2 (amino acids 33–44) by a flexible hinge region. Thus, p6 can be subdivided into two distinct structural and functional domains; helix-2 perfectly defines the region that binds to the virus budding factor AIP-1/ALIX, indicating that this structure is required for interaction with the endosomal sorting complex required for transport. The PTAP motif at the N terminus, comprising the primary late assembly domain, which is crucial for interaction with another cellular budding factor, Tsg101, does not exhibit secondary structure. However, the adjacent helix-1 may play an indirect role in the specific complex formation between p6 and the binding groove in Tsg101. Moreover, binding studies by NMR demonstrate that helix-2, which also comprises the LXXLF motif required for incorporation of the human immunodeficiency virus type 1 accessory protein Vpr into budding virions, specifically interacts with the Vpr binding region, indicating that under the specific solution conditions used for structure analysis, p6 adopted a functional conformation.

The main structural components of retrovirus particles are synthesized as three polyproteins that produce either the virion interior (Gag), the viral enzymes (Pol), or the glycoproteins of the virion envelope (Env). The Gag polyprotein is necessary and sufficient for virus particle assembly and budding, although genomic RNA and envelope proteins are obligatory for production of infectious progeny virions. The processing of the HIV-1 Gag polyprotein Pr55 by the viral protease generates the matrix, capsid, nucleocapsid (NC), and p6 proteins. Matrix mediates the plasma membrane targeting of Gag and lines the inner shell of the mature virus particle, capsid forms the conical core shell encasing NC, and NC regulates packaging and condensation of the viral genome (1-6). The role of p6 during virus entry and its location in mature HIV-1 virus particles are not known, although it appears not to be associated with the virus core (7, 8). Several functions, however, have been ascribed to p6. It facilitates virus budding (9-11) and is required for the incorporation of the viral accessory protein Vpr into the virus particle (12-15). It has also been implicated in the incorporation of the viral Pol and Env proteins (16, 17) and in the control of particle size (18, 19). Recently, p6 was reported to be the major phosphoprotein of HIV-1 particles (20), and there is evidence that the host cell mitogen-activated protein kinase ERK-2 regulates viral assembly and release by phosphorylation of Thr-23 in p6 (21).
The late steps of the HIV-1 replication cycle involve the assembly of newly synthesized structural proteins at the plasma membrane into budding particles that are released as immature noninfectious viruses consisting predominantly of uncleaved polyproteins. Concurrent with virus release and in concert with protease activation, processing of Gag polyproteins and condensation of the inner core structure occur, resulting in the formation of mature infectious virus (2). The release of budding virions from host cells requires the fission of cell and virus membranes. This process is governed by late assembly domains (L-domains) of p6 (1, 6, 9, 22-26). Although the detailed molecular mechanism(s) by which the HIV-1 L-domain regulates virus budding remain elusive, it is now generally accepted that the L-domain functions as a docking site for the cellular budding apparatus that is normally involved in the endocytotic recycling of cell surface receptors (1).
Two highly conserved motifs can be discerned in p6; the N-terminal P(T/S)AP motif forms the primary L-domain that mediates binding of Pr55 to the tumor susceptibility gene product (Tsg101), an E2-type ubiquitin ligase-like protein (9, 10, 27-32). Mechanistically, the interaction between Tsg101 and p6 is mediated by the N-terminal ubiquitin binding domain of Tsg101, designated as ubiquitin E2 variant sequence (UEV). The UEV binds to the PTAP motif of p6 in a process that appears to be up-regulated when upstream Lys residues in p6 are monoubiquitinylated (28-30, 32, 33). L-domains similar to the P(T/S)AP motif of HIV-1, such as PPXY or YXXL, have been identified in many other retroviruses and enveloped viruses (reviewed in Refs. 22 and 27). Another region of p6, residues 32-46, comprising the LXXLF motif necessary for Vpr binding and a cryptic YPXL-type L-domain, mediates the binding to AIP-1/ALIX, a class E vacuolar protein sorting factor that also interacts with Tsg101. AIP-1/ALIX also binds to late acting components of the endosomal sorting complex required for transport (ESCRT) and is necessary for the formation of multivesicular bodies at endosomal membranes (26). Further, in more recent studies, VPS37B was identified as a new component of the ESCRT-I complex that binds to Tsg101. Together with Tsg101, it is incorporated into virions and, most astonishingly, is able to support budding of mutants of HIV-1 that are missing the PTAP L-domain (34). Since the process of multivesicular body formation can be envisioned as topologically similar to virus budding from the cell membrane, it is assumed that the complex formation between p6, Tsg101, and AIP-1/ALIX recruits components of the ESCRT system to the viral budding site (reviewed in Ref. 1). The current status of the molecular characterization of HIV-1 p6 is summarized in Fig. 1A.
Among known lentiviruses, the 52-amino acid HIV-1 p6 protein is by far the smallest protein, and its molecular structure has not yet been defined. Thus, we have explored the high resolution structure and folding of p6 under various solution conditions using a combination of CD and NMR spectroscopy. We found that p6 adopts a helix-flexible helix structure; a short helix-1 (residues 14-18) is connected to a pronounced helix-2 (residues 33-44) by a flexible hinge region. We also found that helix-2 of p6, which comprises the LXXLF binding motif for Vpr, specifically interacts with Vpr. This indicates that p6 adopted a functional conformation under the specific solution conditions used for our NMR structure analysis. An overall model of the molecular structure of p6 is developed and discussed in the context of known functional modules of p6.
CD Spectroscopy-CD spectra were recorded at room temperature in 0.5-mm cuvettes on a Jasco J-600 spectropolarimeter in a wavelength range from 260 to 180 nm, and the resulting curves were smoothed with a high frequency filter. Samples of full-length sp6-(1-52) and its related shorter fragments sp6-(1-21) and sp6-(23-52) were dissolved at a concentration of 0.2 mg/ml under various conditions (trifluoroethanol (TFE) concentration and pH). Phosphate-buffered solutions contained an appropriate mixture of 67 mM KH2PO4 + Na2HPO4 to give pH 7.2. Secondary structure content was quantified with the program VARSELEC (36).
1H NMR Spectroscopy-Samples of the peptides were dissolved in distilled water/TFE-d2 (1:1; v/v) to give a final volume of 0.7 ml. Spectra of sp6-(1-52) (9.7 mg dissolved in a mixture of 350 µl of H2O (super distilled) and 350 µl of TFE-d2) at pH 3 were recorded at 277, 290, 300, and 308 K, and similar spectra of sp6-(1-21) and sp6-(23-52) were recorded at 300 K. All spectra were acquired at 600.133 MHz on a Bruker Avance DMX 600 NMR spectrometer. The 1H spectra were internally referenced to the residual methylene signal of TFE at 3.95 ppm.
Two-dimensional phase-sensitive 1H COSY (correlation spectroscopy) spectra were recorded with 512 free induction decays in t1 and 2000 data points in t2, and 64 (sp6-(1-52)) or 40 (sp6-(1-21) and sp6-(23-52)) transients were collected for each t1 increment. Two-dimensional total correlation spectroscopy (TOCSY) spectra were recorded with mixing times of 110 ms; 800 free induction decays were recorded in t1 and 2048 data points in t2, and 32 (sp6-(1-52)) or 80 (sp6-(1-21) and sp6-(23-52)) transients were collected for each t1 increment. Two-dimensional nuclear Overhauser enhancement spectroscopy (NOESY) spectra were recorded with mixing times of 250 ms; 800 free induction decays were recorded in t1 and 2048 data points in t2, and 32 (sp6-(1-52)) or 80 (sp6-(1-21) and sp6-(23-52)) transients were collected for each t1 increment. For all two-dimensional experiments, t1 was either zero-filled to 2048 points, or linear prediction was applied to 1024 points followed by zero filling with a further 1024 points, so that both t1 and t2 had final sizes of 2048. All two-dimensional experiments were recorded without spinning and processed with standard Bruker software.
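The zero-filling step described above can be illustrated with a minimal sketch. It is not the Bruker processing routine used in this work; the function name `zero_fill` and the placeholder data are assumptions made purely for illustration.

```python
def zero_fill(fid, target_size):
    """Pad a list of time-domain points with zeros up to target_size points."""
    if len(fid) > target_size:
        raise ValueError("target_size is smaller than the recorded data")
    return list(fid) + [0.0] * (target_size - len(fid))


# 800 increments were recorded in t1 for the TOCSY and NOESY experiments.
# The values below are placeholders, not measured data.
t1_points = [1.0] * 800
t1_filled = zero_fill(t1_points, 2048)

print(len(t1_filled))  # final size of the t1 dimension
```

Zero filling only interpolates the spectrum digitally; it does not add information beyond what the recorded increments contain.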
Binding Studies of sp6-(1-52) and sVpr-(26-33)-After recording one- and two-dimensional NMR spectra of sp6-(1-52), the same 50% TFE NMR sample as above was transferred to a new vial containing an equimolar amount of the peptide sVpr-(26-33) (LKSEAVRH), comprising the domain of the HIV-1 NL4-3 Vpr that was shown previously to bind to p6. To prevent changes in the molarity of sp6-(1-52), the peptide sVpr-(26-33) was added in dry, lyophilized form to the peptide solution. After complete dissolution of sVpr-(26-33), the solution was transferred back into the NMR sample tube, and identical NMR experiments were performed under exactly the same conditions as for sp6-(1-52) at 300 K. In a further experiment, a second equimolar amount of sVpr-(26-33) was added. All NMR experiments with sp6-(1-52) alone and with sp6-(1-52) mixed with sVpr-(26-33) at ratios of 1:1 or 1:2 were performed under identical conditions. In these solutions, the p6 resonances were easily resolved from those of the sVpr-(26-33) peptide, which was also unambiguously assigned at both concentrations (see supplemental Fig. 2 and Table 6).
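As a rough guide to the quantities involved, the following sketch estimates the nominal sp6-(1-52) concentration in the NMR sample and the mass of sVpr-(26-33) that corresponds to 1 mol eq. The peptide mass, weighed amount, and sample volume are taken from the text; the standard average residue masses, the assumption of free N and C termini, and the neglect of counterions and residual water in the lyophilized peptides are assumptions, so the results are nominal values only.

```python
SP6_MASS_DA = 5807.9      # molecular mass of sp6-(1-52) from the mass spectrum
SP6_WEIGHED_MG = 9.7      # amount dissolved for NMR
SAMPLE_VOLUME_ML = 0.7    # final sample volume

# Standard average residue masses (Da); one water is added for free termini (assumption).
RESIDUE_MASS = {"L": 113.159, "K": 128.174, "S": 87.078, "E": 129.115,
                "A": 71.079, "V": 99.133, "R": 156.188, "H": 137.141}
WATER_MASS = 18.015


def peptide_mass(sequence):
    """Average molecular mass of an unmodified peptide with free termini."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def micromoles(mass_mg, molar_mass_da):
    """Convert a weighed mass (mg) into an amount of substance (micromoles)."""
    return mass_mg / molar_mass_da * 1000.0


sp6_umol = micromoles(SP6_WEIGHED_MG, SP6_MASS_DA)
sp6_mM = sp6_umol / SAMPLE_VOLUME_ML            # micromol per ml equals mM
vpr_mass = peptide_mass("LKSEAVRH")
vpr_mg_per_eq = sp6_umol * vpr_mass / 1000.0    # mg of sVpr-(26-33) per 1 mol eq

print(f"sp6-(1-52): {sp6_mM:.2f} mM nominal")
print(f"sVpr-(26-33): {vpr_mass:.1f} Da, {vpr_mg_per_eq:.2f} mg per mol eq")
```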
Structure Calculations-The volumes of the integrated cross-peaks from the NOESY spectra of full-length sp6-(1-52) and sp6-(23-52) at 300 K were determined using the AURELIA program (37). The distances were calibrated against Gln side chain amide protons. After corrections for pseudoatoms, these values were used directly as distance restraints in molecular dynamics calculations. The structures were calculated on a Silicon Graphics Octane workstation using the program CNS 1.0 (38). For the structure calculations, standard CNS parameters for data sets based on NMR were applied. A total of 320 and 281 distance restraints for sp6-(1-52) and sp6-(23-52), respectively, in 50% TFE were used to generate 104 conformations for all procedures. In each case, the 20 conformations that exhibited no restraint violations greater than 0.2 Å and had the lowest NOE and total energies were used for the final fitting analysis.
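The calibration of NOE cross-peak volumes against a reference proton pair is commonly based on the inverse sixth power dependence of the NOE on distance (isolated spin pair approximation). The sketch below shows that relationship only; it is not the AURELIA procedure itself, and the reference distance of 1.75 Å for the geminal Gln side chain amide protons as well as the example volumes are assumptions for illustration.

```python
def noe_distance(volume, ref_volume, ref_distance):
    """Estimate an interproton distance from a NOESY cross-peak volume.

    Uses r = r_ref * (V_ref / V) ** (1/6), i.e. the isolated spin pair
    approximation in which the cross-peak volume scales with r ** -6.
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("cross-peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


REF_DISTANCE = 1.75   # Å, assumed distance between geminal amide protons
REF_VOLUME = 64.0     # hypothetical volume of the reference cross-peak

# A cross-peak 64 times weaker than the reference corresponds to twice the distance.
weak_peak_distance = noe_distance(1.0, REF_VOLUME, REF_DISTANCE)
print(f"{weak_peak_distance:.2f} Å")
```

Because of the sixth-root dependence, even large errors in the measured volumes translate into comparatively small errors in the derived distances.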
The heterogeneity within a final set of 20 structures was visualized using the consecutive segment approach, in which the r.m.s. differences of the backbone atoms for short segments 2-5 residues in length are systematically compared pairwise for all selected final structures (39). Consequently, such a comparison provides an objective method for the recognition of stable structural elements in the ensemble of final structures and allows the regions to be defined for which alignments are made. The structure with the lowest energy was then determined using the programs LSQMAN and MOLEMAN2 (Uppsala Software Factory) (40). Finally, alignments, using the regions defined by the consecutive segment approach, were performed, comparing all other structures with the lowest energy structure, and these were visualized with the BRAGI program (41).
Generation of Anti-sp6-(1-52) Antibodies-Polyclonal antibodies directed against full-length sp6-(1-52) (p6 antibodies) were generated by immunization of rabbits with sp6-(1-52), coupled to keyhole limpet hemocyanin and administered in TiterMax adjuvant. The resultant antiserum reacts with sp6-(1-52) as well as with its viral counterpart, the p6 domain of HIV-1 Gag, either in its mature state as the free protein or as part of the Gag polyprotein Pr55 and processing intermediates thereof.

RESULTS
Structure Prediction of the HIV-1 p6-Several methods for empirical structure prediction, which have been performed on the p6 sequence derived from the isolate HIV-1 NL4-3, are summarized in TABLE ONE.
All of our structure predictions and also those reported in previous literature (43) indicate that the main α-helical region is located at the C terminus. Earlier work based on prediction programs by Chou and Fasman (44) and by Garnier et al. (45) gave contradictory results in that either a highly structured protein or only a short 6-residue α-helical region at the C terminus was predicted, respectively (43). More recent predictions of secondary structure of p6 (including ours) imply an α-helical region at the C terminus consisting of 6-8 residues, which always include residues Ala-39 to Leu-44 (TABLE ONE). In contrast, inconsistent results were calculated for the N terminus, although the two prediction methods nearest to our experimental NMR results to be reported below showed the presence of a short α-helix around residues Glu-12 to Ser-14 and a more extensive α-helix region that includes residues Leu-38 to Phe-45 (46-48).
An overview of the previously reported binding domains for cellular (Tsg101, AIP-1/ALIX, ERK-2) and viral (Vpr) proteins within the HIV-1 p6 domain and their relationship to the primary structure derived from the HIV-1 NL4-3 sequence, together with the predicted sites of posttranslational modifications, is shown in Fig. 1. Further, based on the primary sequence, p6 appears to be amphipathic in nature; the N terminus up to position 31 is hydrophilic, whereas a more hydrophobic region is located at the C terminus (16). Another structural characteristic is the relatively high content of polar residues as well as the unusually high number of Pro residues, namely 8, of which 6 are highly conserved among different HIV-1 isolates (EMBL Data Library, Heidelberg; see Ref. 43) (Fig. 1).
Synthesis and Purification of p6-For structural analysis of p6, the molecule was synthesized and purified to homogeneity. Solid-phase peptide synthesis (SPPS) of full-length sp6, designated sp6-(1-52), was performed with a sequence derived from the isolate HIV-1 NL4-3 (Fig. 1). In contrast to an SPPS procedure described previously for the synthesis of a p6 protein derived from a different isolate, HIV-1 BRU (43), we optimized the SPPS procedure for sp6-(1-52) with respect to the use of coupling agents, protection groups, cleavage reagents, and duration of coupling reactions. Our protocol reproducibly gave high yields (usually 15%) of purified sp6-(1-52) without encountering synthesis problems such as incomplete deprotection and coupling, inter- and intrachain reaction with the resin matrix, side chain reactions, or peptide aggregation. We also synthesized various fragments of p6 using the same SPPS protocol. After cleavage from the resin, the crude peptide was purified by reverse phase HPLC using a Vydac C18 protein column and a linear H2O-acetonitrile gradient. Illustrative data are shown in the supplemental materials for the full-length peptide sp6-(1-52) (crude and purified products (supplemental Fig. 1, A and B, respectively)), and the C-terminal fragment sp6-(23-52) (supplemental Fig. 1, C and D, respectively).
The identities of purified sp6-(1-52) and its related fragments were confirmed by molecular mass determination using positive ion electrospray ionization mass spectrometry. The experimental data for sp6-(1-52) showed a well defined multiply charged spectrum with 6-, 5-, and 4-fold positively charged ions (Fig. 2A) that was deconvoluted to give an intense envelope for the molecular ion cluster at a molecular mass of 5807.9 Da and the corresponding Na+ and K+ adducts at 5828.9 and 5850.8 Da, respectively (Fig. 2B). Similar results were also obtained by matrix-assisted laser desorption/ionization-mass spectrometry (data not shown). The cumulative data indicate that sp6-(1-52) was homogeneous and showed little evidence of by-products. Similar results were also obtained for the N- and C-terminal fragments sp6-(1-21) and sp6-(23-52).
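The relationship between the multiply charged ions and the deconvoluted molecular mass can be sketched as follows. The calculation assumes that each charge is carried by one added proton and uses the reported mass of 5807.9 Da; the m/z values it produces are calculated expectations, not peak positions read from Fig. 2A.

```python
PROTON_MASS = 1.00728  # Da


def mz(neutral_mass, charge):
    """Expected m/z of an [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value, charge):
    """Deconvolute a single m/z value back to the neutral mass."""
    return charge * (mz_value - PROTON_MASS)


SP6_MASS_DA = 5807.9
charge_series = {z: mz(SP6_MASS_DA, z) for z in (4, 5, 6)}

for z, value in sorted(charge_series.items()):
    print(f"{z}+ ion: m/z {value:.2f}")
```

Applying `neutral_mass_from_mz` to each member of the series returns the same neutral mass, which is the consistency check that underlies deconvolution of an electrospray spectrum.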
The molecular identity of sp6-(1-52) was further defined by SDS-PAGE (Fig. 3). Similar to other small amphipathic virus proteins, such as the HIV-1 accessory proteins Vpu (49) and Vpr (50), or the influenza A virus proteins M2 (51) and PB1-F2 (52), sp6-(1-52) was separated with reasonable resolution in a Tris/Tricine gel according to Schägger and von Jagow (42). In Fig. 3A, serial dilutions of sp6-(1-52) were separated in a gradient SDS-PAGE and detected by silver staining. The peptide migrated with an apparent molecular mass of ~6 kDa, consistent with the predicted molecular mass of 5.8 kDa. No additional smaller peptide species migrating below the major product were detected, either by silver staining of the gel (Fig. 3A) or by Western blot using p6 antibodies directed against the full-length sp6-(1-52) peptide (Fig. 3, B and C). Together, these data attest to the absence of degradation or chain termination products in the purified preparation of sp6-(1-52).
The peptide sp6-(1-52) was also used as an immunogen in rabbits to generate polyclonal anti-p6 antibodies (p6 antibodies). Using the p6 antibodies for immunoprecipitation, the purified peptide sp6-(1-52) was first recovered from solution and then detected by Western blot (Fig. 3B). As with direct separation of the peptide, the Western blot analysis of precipitated sp6-(1-52) did not show any evidence of low molecular weight degradation or chain termination products.
The p6 antibodies react with viral p6 derived from several different HIV-1 isolates (data not shown) and bind with comparable efficiency to both virus-derived p6 and synthetic sp6-(1-52). In Fig. 3D, the binding of the p6 antiserum to different processing products of the HIV-1 Pr55 Gag polyprotein is demonstrated. For this purpose, human CD4+ T-cells were infected with HIV-1 NL4-3, and at peak virus replication, cells were lysed, separated in SDS-PAGE, and probed with the p6 antiserum. In addition to mature p6, the Pr55 Gag polyprotein and the intermediate processing products p39 (capsid-p2-NC-p1-p6) and p20 (NC-SP1-p6) were detected (Fig. 3, D and E). A similar pattern of Gag proteins was also detected in lysates of purified viruses released from the infected cells into the cell culture supernatant. Relative to the levels of Pr55 Gag and its intermediate processing products, the content of mature p6 was significantly higher in the virus fraction than in the cell lysates. This confirms that most of the Gag processing occurs upon budding and release of viruses from the cell membrane and that the processed form of p6 for the most part exists only in the mature HIV-1 particle (reviewed in Ref. 5). Further, the Western blot data (Fig. 3D) demonstrate that the p6 antibodies, generated using sp6-(1-52) as immunogen, interact with the p6 domain of HIV-1, either as part of the Gag polyprotein or in its mature and fully processed form of free p6.
Characterization of sp6-(1-52), sp6-(1-21), and sp6-(23-52) by CD Spectroscopy-A first insight into the folding of sp6-(1-52) and fragments thereof was obtained by analysis of the peptides at ambient temperature under various solution conditions by CD spectroscopy. In particular, we were interested in studying the changes found in hydrophobic environments, since the p6 domain should be present in a cellular compartment near the hydrophobic membrane during assembly and budding of HIV-1. In our system, we have used TFE to simulate such conditions. TFE is an organic solvent that is known to favor intramolecular interactions that stabilize secondary structures, particularly α-helices in domains of a peptide that have a propensity for such secondary structure (53). In addition, TFE can alleviate problems occurring with intermolecular interactions in the higher concentration ranges required for NMR investigations, since it tends to disrupt quaternary structure and dissociates peptide aggregates (54).
The initial shallow negative CD curve of sp6-(1-52) in water with a minimum at 200 nm and a very small negative ellipticity value near 220 nm was characteristic of a disordered peptide conformation with very little evidence of secondary structure (Fig. 4A). The addition of TFE caused a pronounced shift in the initial minimum of the CD curve of sp6-(1-52) to 206 nm. Simultaneously, a second substantial negative ellipticity occurred around 220 nm, and a positive band occurred at 189 nm (Fig. 4A). These changes in the CD curve were altogether indicative of an increasing content of stable α-helical secondary structure in sp6-(1-52) upon the addition of TFE. The most pronounced changes were observed going from 0 to 50% TFE, and a further increase in TFE concentration from 50 to 70% did not cause any significant changes in the CD spectra. Deconvolution of CD spectra revealed that sp6-(1-52) exhibited ~29% α-helical structure in 50% TFE at pH 3. At neutral solution conditions (e.g. at pH 7.2 in phosphate buffer) and in the presence of 50% TFE, the protein exhibited similar CD spectra (Fig. 4A) and thus had a similar content of α-helical structure. Thus, CD data indicate that sp6-(1-52) adopts stable α-helical structure in the presence of a hydrophobic environment, and this structure appears to be preserved over the acidic to neutral pH range.
DECEMBER 30, 2005 • VOLUME 280 • NUMBER 52

Structure of HIV-1 p6 Protein
CD data similar to those of sp6-(1-52) were also obtained for the N- and C-terminal fragments, sp6-(1-21) and sp6-(23-52) (Fig. 4, B and C). Together, these results demonstrate that p6 consists of α-helical regions at the C terminus as well as at the N terminus. Considering the differing lengths of the fragments and the intensities of the curves, these qualitative data indicate that the more stable α-helical structured region is located in the C-terminal fragment, and this structure appears to exist independently of the N-terminal region. The propensity of sp6-(1-52) as well as the C-terminal and N-terminal peptides to form secondary structure is more or less independent of the pH but strongly depends on the structure-stabilizing effect of TFE, with the optimum conditions for folding of both helical domains in 50% TFE solution.
Identification of Structural Elements in sp6-(1-21), sp6-(23-52), and sp6-(1-52) by 1H NMR Spectroscopic Characterization-The assignment of the NMR spectra of the peptides was accomplished using a combination of homonuclear two-dimensional NMR techniques at different temperatures (290, 300, and 308 K). The spectra of sp6-(1-52) were particularly well resolved at 308 K, whereas those of sp6-(1-21) and sp6-(23-52) were optimal at 300 K. Signal overlap in the two-dimensional NOESY and two-dimensional TOCSY spectra of sp6-(1-52) was resolved with spectra at 308 and 300 K and by comparison with the spectra of sp6-(1-21) and sp6-(23-52). The spin systems were identified from two-dimensional 1H COSY and TOCSY spectra, starting from the backbone amide protons. Sequence-specific assignments were determined from the cross-peaks in two-dimensional 1H NOESY spectra based on short observable distances between HN, Hα, and Hβ of amino acid i and HN of amino acid i + 1. The spin systems of Gln-2, Ala-9, Ile-31, Ala-39, and Asp-48 were readily recognized and were used as starting points for establishing residue positions in the peptide sequence. The full assignments and chemical shift data are presented in the supplemental material.
It has been shown experimentally that α-proton chemical shift differences greater than 0.1 ppm relative to the random coil values are qualitative indicators of protein secondary structure. A minimum of four adjacent residues with an upfield shift is indicative of an α-helix, whereas β-sheets require a minimum of three residues with downfield shifts (55). The plots of α-proton chemical shift differences observed for sp6-(1-21) and sp6-(23-52) are very similar to the corresponding regions of sp6-(1-52) (Fig. 5). These plots indicate the existence of two helical regions from Ser-14 to Phe-17 and from Tyr-36 to Leu-44, respectively, when the inherent downfield effect of Pro-37 on the adjacent preceding residue (+0.28 ppm) is taken into account (35). The chemical shift differences also give some indication that the second structured region extends upstream to Ile-31 (Ile-31 to Lys-33 exhibit upfield shifts, and Glu-34 and Leu-35 exhibit only slight downfield shifts compared with random coil values). The position of the region with secondary structure in sp6-  was independently confirmed from the medium range NOE interactions (Fig. 6A), where strong NH(i)-NH(i+1), αH(i)-NH(i+3), and αH(i)-βH(i+3) NOEs are indicative of helical secondary structure. For the C-terminal fragment sp6-(23-52), both sets of data revealed the presence of a helical region that corresponds to that defined from the shift data. However, the medium range NOE interactions indicate a structured region slightly more extensive than that given by the chemical shift index approach, namely from Ile-31 to Ser-47.
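The qualitative rule stated above (at least four adjacent upfield-shifted residues for a helix, at least three adjacent downfield-shifted residues for a β-strand, with a 0.1 ppm threshold) can be written as a short sketch. It implements only this simplified rule, not the complete chemical shift index procedure of Ref. 55, and the example shift differences are hypothetical values rather than data from Fig. 5.

```python
def shift_index(delta, threshold=0.1):
    """Return -1 (upfield), +1 (downfield), or 0 for one shift difference in ppm."""
    if delta < -threshold:
        return -1
    if delta > threshold:
        return 1
    return 0


def find_runs(indices, value, min_length):
    """Return (start, end) positions of runs of `value` at least min_length long."""
    runs = []
    start = None
    for pos, idx in enumerate(indices + [None]):  # sentinel closes the last run
        if idx == value:
            if start is None:
                start = pos
        else:
            if start is not None and pos - start >= min_length:
                runs.append((start, pos - 1))
            start = None
    return runs


def assign_secondary_structure(deltas, threshold=0.1):
    """Classify stretches of residues from observed-minus-random-coil Hα shifts."""
    indices = [shift_index(d, threshold) for d in deltas]
    return {"helix": find_runs(indices, -1, 4),
            "strand": find_runs(indices, 1, 3)}


# Hypothetical shift differences (ppm) for 11 consecutive residues.
example = [0.02, -0.15, -0.20, -0.30, -0.12, 0.05, 0.20, 0.15, 0.30, 0.00, -0.20]
result = assign_secondary_structure(example)
print(result)
```

Positions in the output are zero-based list indices; mapping them to residue numbers requires adding the number of the first residue in the list. A correction such as the Pro-37 effect mentioned above would have to be applied to the input values before classification.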
The α-proton chemical shift difference plot for sp6-(1-52) at 300 K indicates the existence of two helical regions extending from Ser-14 to Phe-17 and from Tyr-36 to Leu-44. As observed for the C-terminal peptide sp6-(23-52), according to the chemical shift differences, there is some indication for secondary structure including the residues upstream from Ile-31. The corresponding plots of the α-proton chemical shifts for sp6-(1-52) at 290 and 308 K were very similar to those based on the spectra recorded at 300 K (Fig. 5C), suggesting that the determined structure of sp6-(1-52) at 300 K is present to the same extent at physiological temperature, 310 K (and physiological pH according to the CD spectra).
The positions of regions with secondary structure were independently confirmed from the medium range NOE interactions (Fig. 6). Using this approach, two distinct helical regions were identified that are slightly more extensive than the helices identified by the chemical shift index approach; helix-1 extends from Glu-12 to Thr-21, and helix-2 extends from Ile-31 to Asp-48. As with the shift index approach, the position of helix-2 was almost identical when the NOE interactions were analyzed for the full-length molecule sp6-(1-52) (Fig. 6A) or the C-terminal fragment sp6-(23-52) (Fig. 6B).
In subsequent NMR experiments conducted on sp6-(1-52) in 50% aqueous TFE, the chemical shifts of the NH protons did not change uniformly as a function of temperature. For instance, the NH shift values of Ser-3 and Glu-6 belonging to the flexible part of the protein changed by 0.13 and 0.12 ppm, respectively, when the temperature was increased from 290 to 308 K. However, the chemical shift values of Thr-21, Thr-22, Glu-34, and Leu-38 that are included in or close to the helical structures did not change significantly in this temperature range. In fact, the NH shift value of Leu-38 did not change at all in the temperature range from 277 to 308 K, indicating that this amino acid is not exposed to the solvent due to its involvement in a stable well defined secondary structure where, according to our structure calculations, Leu-38 is located in the middle of the second and most extensive α-helix of p6.
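The temperature dependence described above is often expressed as an amide proton temperature coefficient. The sketch below converts the reported shift changes into coefficients in ppb/K; only magnitudes are computed, since the direction of the changes is not stated in the text, and the function name is chosen for illustration.

```python
def temperature_coefficient(shift_change_ppm, t_low_k, t_high_k):
    """Magnitude of the NH temperature coefficient in ppb/K."""
    if t_high_k <= t_low_k:
        raise ValueError("t_high_k must be greater than t_low_k")
    return abs(shift_change_ppm) * 1000.0 / (t_high_k - t_low_k)


# Shift changes reported in the text for sp6-(1-52) in 50% aqueous TFE.
ser3 = temperature_coefficient(0.13, 290, 308)
glu6 = temperature_coefficient(0.12, 290, 308)
leu38 = temperature_coefficient(0.00, 277, 308)

print(f"Ser-3 {ser3:.1f}, Glu-6 {glu6:.1f}, Leu-38 {leu38:.1f} ppb/K")
```

Small coefficients are generally taken to indicate amide protons that are shielded from solvent exchange, for example by intramolecular hydrogen bonding within a helix, which is consistent with the interpretation given for Leu-38.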

In summary, the independent sets of qualitative NMR data for sp6-(23-52) and sp6-(1-52) show two regions with α-helical structures, with the shorter located at the N-terminal part of the protein and the more extensive one at the C-terminal part of the protein. The exact positions of these structural features were defined from a statistical analysis of the final molecular conformations calculated from the quantitative NOE data, using simulated annealing techniques, as elaborated below.

The 20 conformations for each peptide with the lowest NOE and total energies (TABLE TWO) were used for the final fitting analysis. Initially, the heterogeneity within these structures was assessed using the consecutive segment approach, in which the r.m.s. differences of the backbone atoms for short segments, 2-5 residues in length, were systematically compared pairwise (39). This analysis affords an objective method for the recognition of stable structural elements in the final 20 conformations of the two peptides. The best defined regions of the molecules were taken to be those showing r.m.s. deviations of the backbone atoms of less than 0.2 Å. Consequently, two structured regions located between residues Ser-14 and Gly-18 and between Lys-33 and Ser-43 were identified in sp6-(1-52). The respective residue backbone deviations are shown in Fig. 7A. Compatible with the C-terminal structured region identified in full-length sp6-(1-52), a structured region was also identified between residues Lys-33 and Leu-44 in the C-terminal fragment sp6-(23-52) (Fig. 7B). Compared with the full-length sp6-(1-52), the structured region of the fragment sp6-(23-52) appears to be slightly more extensive, since it also includes Leu-44. This minor difference is probably due to fewer signals and thus better resolution in the NOESY spectrum of the fragment, which allows detection of more cross-peaks that are important for defining the secondary structure.
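Once a per-residue profile of mean backbone r.m.s. differences is available, the selection of well defined regions with the 0.2 Å cutoff amounts to finding contiguous stretches of residues below that cutoff. The following sketch shows this selection step only; it does not perform the superposition of segments, and the profile values are hypothetical rather than read from Fig. 7.

```python
def stable_regions(rms_by_residue, cutoff=0.2, min_length=2):
    """Return (first, last) residue numbers of contiguous stretches below cutoff.

    rms_by_residue maps residue number -> mean backbone r.m.s. difference (Å).
    """
    regions = []
    current = []
    for res in sorted(rms_by_residue):
        below = rms_by_residue[res] < cutoff
        contiguous = not current or res == current[-1] + 1
        if below and contiguous:
            current.append(res)
        else:
            if len(current) >= min_length:
                regions.append((current[0], current[-1]))
            current = [res] if below else []
    if len(current) >= min_length:
        regions.append((current[0], current[-1]))
    return regions


# Hypothetical r.m.s. profile (Å) around the N-terminal helix.
profile = {10: 0.50, 11: 0.40, 12: 0.30, 13: 0.25, 14: 0.10, 15: 0.08,
           16: 0.05, 17: 0.09, 18: 0.15, 19: 0.30, 20: 0.60}
regions = stable_regions(profile)
print(regions)
```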
The two major structures with the lowest NOE and total energies were then determined for the selected fitting regions for each peptide (i.e. the residues comprising the α-helical regions defined by the r.m.s. difference analysis) (Fig. 7) using the programs LSQMAN and MOLEMAN2 (Uppsala Software Factory package (40)). These programs were used to superimpose and compare the 20 best final refined structures with the lowest NOE and total energies for each of the p6 peptides. All structures for each stable region were superimposed on the lowest energy central structures, and the resulting aligned conformations are shown in Fig. 8. The quantitative NOE data for full-length sp6-(1-52) allow the definition of two α-helical structured regions extending from Ser-14 to Gly-18 and from Lys-33 to Ser-43 (Fig. 8, A and B, respectively), connected by a more flexible region that allows orientational freedom of the two helices. Confirmation of the stability of the second longer helix is provided by the analysis of the C-terminal peptide, sp6-(23-52) (Fig. 8C). The central structure calculated for sp6-(1-52) (i.e. the one that shows the lowest r.m.s. deviation value of the final structures for the chosen fitting regions), in which the two helices at either side of a flexible region are clearly seen, is shown in Fig. 8D. However, it should be emphasized that this horseshoe conformation represents only one of several possible structures of sp6-(1-52) in solution, since no long range NOEs were observed that would specifically stabilize such a conformation. The 20 best structures have very similar magnitudes for their NOE and total energies and show several possible orientations of the helices with respect to each other. Clearly, in solution, sp6-(1-52) consists of an equilibrium mixture of various conformations.
FIGURE 7. A, mean r.m.s. differences for the backbone atoms of sp6-(1-52) in each residue, calculated using a consecutive segment method that averages the differences for segments 2-5 residues in length, plotted against the residue number for the 20 final structures in 50% TFE. B, the same for sp6-(23-52).
Evidence for Interaction between sp6-(1-52) and sVpr-(26-33) by NMR Spectroscopy-In order to test whether the structure of sp6-(1-52) detected under the specific solution conditions is biologically relevant, we analyzed the interaction of the peptide with its natural binding partner, the HIV-1 Vpr protein, using NMR spectroscopy, since this is an ideal way of studying weak protein-protein interactions (56). It is generally accepted that the p6 region of HIV-1 Gag binds to Vpr and in this way directs the incorporation of this accessory protein into budding virions (57, 58). Further, in vitro studies revealed that the minimal binding region of p6 required for the interaction with Vpr comprises the LXXLF motif (i.e. the region Leu-41 to Phe-45 of p6 covering the C-terminal end of helix α-2) (15, 59, 60). A key residue in the corresponding binding region of Vpr with respect to interaction with p6 is Ala-30; an alanine-to-leucine substitution at residue 30 of Vpr, which is known to inhibit Vpr packaging in vivo (61), prevented detectable p6 binding (15). To test whether sp6-(1-52) is capable of this specific interaction with Vpr under the solution conditions used for NMR experiments, we synthesized the peptide sVpr-(26-33), comprising the minimal sequence sufficient and required for interaction with p6 (15).
sVpr-(26-33) was added to sp6-(1-52), dissolved in 50% aqueous TFE-d2, in amounts of 1 and 2 mol eq, respectively. sVpr-(26-33) was administered to the sp6-(1-52) solution in such a manner that the concentration of sp6-(1-52) in the NMR sample remained constant. A full complement of NMR spectra was recorded that allowed the determination of the 1H shift values of the proton signals of the amino acids of sp6-(1-52) prior to and after each of the sVpr-(26-33) additions. The chemical shift changes are shown in Fig. 9 (representative spectra are shown in supplemental Figs. 2-5, and the full set of chemical shift data is given in supplemental Tables 1-5). The most significant chemical shift changes occur for those signals of the amino acids associated with helix α-2 (Ile-31-Asp-48), whereas those outside this structured region were little affected or completely unaffected by the addition of sVpr-(26-33). The changes became even more pronounced when 2 mol eq of sVpr-(26-33) were added to the sp6-(1-52) sample. In particular, the 1H chemical shifts of protons belonging to Ile-31, Lys-33, Arg-42, Leu-44, and Asp-48 of sp6-(1-52) were significantly affected (Fig. 9). Both positive and negative chemical shift changes are observed, with the changes in the NH protons being in general in the direction opposite to those of the α-protons and β-protons. In general, such data afford evidence for specific interactions and yield the location of the interface between the binding molecules (56). In the present case, the chemical shift changes are most probably caused in part by aromatic ring current effects due to the spatial proximity of His-33 on sVpr-(26-33) to the affected residues belonging to helix α-2 of sp6-(1-52).
These data confirm that the interaction between p6 and Vpr involves the LXXLF motif of the p6 domain (13-15, 58, 59). Furthermore, the 1H chemical shift differences indicate that in addition to the LXXLF sequence, a structural motif associated with helix α-2 of p6 should be involved in the binding to Vpr. Finally, these data strongly support our notion that the peptide sp6-(1-52) adopted a functional conformation under the solution conditions investigated and that the structure of p6 developed in this work is biologically relevant.
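The titration analysis described above amounts to taking, for each assigned proton, the difference between its chemical shift after and before the peptide addition and then asking which residues change by more than some cutoff. The following is only a minimal sketch of that bookkeeping, not the procedure used in this work: the shift values are hypothetical placeholders (the measured values are in the supplemental tables), and the 0.02 ppm cutoff and the function names are assumptions chosen for illustration.

```python
# Hypothetical 1H chemical shifts in ppm. These numbers are placeholders for
# illustration only and are NOT measurements from this study.
free = {"Gly-18": 8.30, "Ile-31": 8.10, "Arg-42": 7.95}
bound = {"Gly-18": 8.30, "Ile-31": 8.16, "Arg-42": 7.89}


def shift_changes(free_shifts, bound_shifts):
    """Return delta(bound) - delta(free) per residue, rounded to 0.001 ppm."""
    return {
        res: round(bound_shifts[res] - free_shifts[res], 3)
        for res in free_shifts
        if res in bound_shifts
    }


def affected(changes, threshold=0.02):
    """List residues whose absolute change meets an assumed cutoff (ppm),
    ordered by residue number."""
    hits = [res for res, delta in changes.items() if abs(delta) >= threshold]
    return sorted(hits, key=lambda res: int(res.split("-")[1]))


changes = shift_changes(free, bound)
print(affected(changes))
```

With these placeholder values, the two helix α-2 residues are reported and Gly-18 is not. Keeping the sign of each change, rather than only its magnitude, preserves the observation that NH protons and α/β-protons tend to shift in opposite directions.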

DISCUSSION
In this study, we have developed a first insight into the molecular structure of the HIV-1 p6 Gag protein. Although p6 fulfils a major function in the formation of infectious viruses and represents a docking site for several cellular and viral binding factors, the structure of this molecule, either in its mature form as free p6 or as the C-terminal part of Pr55, has not been unraveled until now. From previous studies of synthetic p6 in aqueous solution by CD and temperature-dependent one-dimensional 1H NMR spectroscopy, it was concluded that the molecule, although readily soluble in water, adopts only a random conformation without any preference for secondary structure (43). More recently, the solution structure of the UEV domain of Tsg101 was resolved in complex with the peptide binding site of HIV-1 p6, embodied as the 9-residue L-domain-containing peptide PEPTAPPEE. According to this study, the L-domain, which itself is believed to reside within an unstructured region of p6, connects to a cavernous pocket of the UEV domain of Tsg101 (62).
However, considering the facts that p6 (i) binds to a number of cellular and viral factors, (ii) is predicted to have some secondary structure, and (iii) is amphipathic in character, a fuller appraisal of the conformational possibilities of the molecule was warranted. In general, it is now evident that solution conditions (54) as well as the presence of binding partners (63) have a considerable influence on the conformational properties of relatively small linear proteins, and these need to be taken into account if meaningful structural-functional rationalizations are to be attempted on the p6 molecule. Consequently, we have explored the structural response of p6 to changes in the solution conditions.
The independent sets of qualitative NMR data of sp6-(1-52) show two regions with α-helical structure, the shorter located in the N-terminal part and the more extensive one in the C-terminal part of the protein. The α-proton chemical shift difference plot for sp6-(1-52) indicates the existence of two helical regions extending from Ser-14 to Phe-17 and from Tyr-36 to Leu-44. According to the chemical shift differences in both molecules, full-length sp6-(1-52) and the shorter peptide sp6-(23-52), there is some indication that the second structured region extends up to Ile-31. Further, the medium range NOE interactions independently confirm the existence of two distinct helical domains that are slightly more extensive than those indicated by the chemical shift index approach, namely (i) from Glu-12 to Thr-21 and (ii) from Ile-31 to Asp-48. The exact positions of these structural features were defined from a statistical analysis of the final molecular conformations calculated from the quantitative NOE data of sp6-(1-52) using simulated annealing techniques and by a similar analysis of the shorter peptide sp6-(23-52), which exhibits better spectral resolution than the full-length protein. Consequently, two structured regions in sp6-(1-52) were identified between residues Ser-14 and Gly-18 and between Lys-33 and Leu-44, respectively. The two α-helices in sp6-(1-52) are connected by a more flexible region, which allows orientational freedom of the two helices. These NMR results are compatible with the CD spectra, where 15 residues (29%) are predicted to be present in helical conformation according to the deconvoluted data.
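The compatibility between the NMR-defined helices and the CD estimate can be made explicit with a short residue count. The sketch below uses only the helix boundaries and the CD figure stated in the text; the variable and function names are illustrative, and treating both boundary residues as part of each helix is an assumption.

```python
P6_LENGTH = 52  # residues in full-length p6

# Helix boundaries from the quantitative NOE analysis (inclusive residue numbers).
helices = {
    "helix-1": (14, 18),  # Ser-14 to Gly-18
    "helix-2": (33, 44),  # Lys-33 to Leu-44
}


def n_residues(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


nmr_helical = sum(n_residues(*bounds) for bounds in helices.values())
cd_helical = 15  # residues predicted to be helical from the deconvoluted CD data

print(f"NMR: {nmr_helical} residues ({100 * nmr_helical / P6_LENGTH:.0f}%)")
print(f"CD:  {cd_helical} residues ({100 * cd_helical / P6_LENGTH:.0f}%)")
```

Under these assumptions the NMR-defined helices cover 17 residues (about 33% of the protein), against 15 residues (29%) from CD, a difference of two residues.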
In earlier studies, it was reported that p6 lacks secondary structure, a conclusion that was based on temperature-dependent one-dimensional 1H NMR spectra of p6 dissolved in water, showing that all NH protons are uniformly decreased in intensity and shift simultaneously (43). Our CD analysis of sp6-(1-52) confirmed this previous observation inasmuch as the protein exhibits very little secondary structure when dissolved in pure water. However, the situation changes significantly with the presence of increasing amounts of TFE used to introduce a more hydrophobic environment that is assumed to more closely simulate in vivo conditions where p6 is present near cell or virus membranes. In contrast to pure aqueous solution, full-length sp6-(1-52) exhibited ~29% α-helical structure at room temperature in 50% aqueous TFE at pH 3, according to our CD and NMR data. When the temperature was increased to 308 K or the pH was changed to either 5.6 or 7.6, the protein exhibited a similar content of α-helical structure, indicating that the secondary structure of sp6-(1-52) is preserved at physiological pH and temperature.

Several functional motifs are known for p6 (see Fig. 1). Analysis of mutants indicates that a conserved LXXLF sequence from Leu-41 to Phe-45 in p6 from HIV-1 NL4-3 is the major determinant in p6 required for incorporation of Vpr (59), whereas N-terminal residues up to position 31 are not involved in packaging of Vpr into budding viruses (13,59). The 14-kDa accessory protein Vpr contributes to the nuclear import of the viral preintegration complex, facilitating its passage across the nuclear pore, and induces G2 cell cycle arrest in proliferating human T-cells (reviewed in Refs. 64 and 65). Direct evidence that Vpr interacts specifically with the LXXLF motif of p6 was provided by studying complex formation of Vpr with p6 by surface plasmon resonance biosensor spectroscopy (BIAcore) (15). Since this region overlaps almost completely with the second, best defined and most extensive α-helical region now identified for sp6-(1-52), it is conceivable that this helical structure is essential for the interaction of p6 with Vpr. Further, this conclusion is now substantiated by our chemical shift data for the interaction of sVpr-(26-33) with sp6-(1-52). These titration experiments afford data complementary to those obtained previously in BIAcore studies (15). The strongest 1H chemical shift changes of sp6-(1-52) upon the addition of sVpr-(26-33) were observed for Ile-31 and Arg-42, in addition to significant changes for Asp-48, Leu-44, Asp-32, Leu-38, Leu-41, Ser-40, Leu-35, and Lys-33. Therefore, most of the amino acids belonging to the C-terminal α-helix of sp6-(1-52) are affected by the addition of sVpr-(26-33). The corresponding chemical shifts of the amino acids that do not belong to this α-helix are less affected or unaffected by the addition of sVpr-(26-33). These data provide compelling evidence that p6 interacts with Vpr through its C-terminal α-helical region extending from Ile-31 to Asp-48.
According to Jenkins et al. (15), a relatively short seven-residue peptide spanning residues 39-45 of p6 also formed a complex with Vpr, although the binding affinity was reduced up to 1.7-fold compared with that of full-length p6. However, our findings suggest that the interactions between Vpr and p6 involve the entire α-helical region of p6 (Ile-31-Asp-48) (i.e. more than the 5-amino acid LXXLF motif). Previous studies also demonstrated that substitution of Arg-42 or Ser-43 in p6 reduced the binding affinity to Vpr only modestly, whereas substitution of Leu-41, Leu-44, or Phe-45 led to a failure in binding to Vpr-(1-71) (59). Although some significant changes of the 1H chemical shifts of Leu-41 and Leu-44 occurred after the addition of sVpr-(26-33), our results do not indicate any direct interaction between Phe-45 and Vpr. Even so, Phe-45 might still be important to afford the correct orientation of the α-helix to permit interactions with Vpr. In summary, from our NMR data taken in the context of previous binding studies, it seems reasonable to suggest that, although Vpr and p6 form a specific complex requiring the LXXLF motif, the binding affinity and the ability for complex formation also depend on the secondary structure of helix-2. Moreover, our titration experiments of Vpr and p6 peptides also support our notion that under the solution conditions used for NMR experiments, sp6-(1-52) adopts a functional conformation.
Recent work provided intriguing insight into how virus budding exploits cellular machinery that is normally involved in vacuolar lysosomal protein sorting and multivesicular body biogenesis (66,67). In the case of HIV-1, the recruitment of these cellular factors to the virus assembly site is facilitated by the interaction between the L-domain PTAP motif of p6 and at least one important protein, the ubiquitin ligase-like protein Tsg101 (29,30,33).
In a structural analysis of the Tsg101 UEV domain in complex with a 9-amino acid p6 peptide containing a central PTAP motif, it was shown that the late domain assumes a bent linear structure that adapts to a narrow hydrophobic groove of the UEV protein (62). Our NMR data are compatible with such a structure inasmuch as the PTAP motif is not part of a region within p6 that exhibits secondary structure. Thus, the orientational freedom would favor direct insertion of the p6 N terminus into the groove of the UEV domain of Tsg101 without any significant structural changes for the rest of the molecule. Although the solved structure of the Tsg101 UEV-p6 complex gives no information about the rest of the p6 molecule (62), our data locate a relatively short but well defined α-helical region (Ser-14 to Gly-18) at the C-terminal end of the L-domain (Fig. 7) that may play an indirect role in stabilizing the specific complex formation between p6 and Tsg101.
It has been suggested that the Tsg101-mediated augmentation of virus budding requires not only the presence of the PTAP-binding motif but also the cooperative function of another Gag region in p6 (26). Indeed, a second region downstream of the N-terminal and primary PTAP-type L-domain was identified in HIV-1 p6, spanning positions Glu-34 to Gly-46, which is required for binding to the host factor AIP-1, also known as ALIX (see Ref. 26 and references therein). AIP-1/ALIX, a homolog of a yeast class E Vps protein, also interacts with Tsg101, and it is assumed that in concert with other factors of the ESCRT-III complex, like the CHMP4 proteins, it links HIV-1 p6 to the late stage endosomal sorting complex involved in virus budding (26). The supposed interacting domain of AIP-1/ALIX in p6 almost perfectly colocalizes with the second helix, indicating that this secondary structure is required for the specific interaction between p6 and AIP-1/ALIX during virus budding. Since this structure in p6 is also indispensable for virus incorporation of Vpr, it would be interesting to study the potential of mutual influence between Vpr and AIP-1/ALIX in regard to interaction with p6, since both binding factors appear to target the same structural domain.
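The degree to which the AIP-1/ALIX binding region and the LXXLF motif coincide with helix-2 can be illustrated with simple interval arithmetic on the residue numbers quoted in the text. This is a sketch for orientation only: it takes helix-2 as Lys-33 to Leu-44, treats all ranges as inclusive, and uses illustrative names that do not come from this work.

```python
# Inclusive residue ranges quoted in the text, expressed as Python ranges.
helix_2 = range(33, 45)    # Lys-33 to Leu-44
alix_site = range(34, 47)  # Glu-34 to Gly-46, required for AIP-1/ALIX binding
lxxlf = range(41, 46)      # Leu-41 to Phe-45, required for Vpr incorporation


def overlap(region_a, region_b):
    """Sorted residue numbers shared by two regions."""
    return sorted(set(region_a) & set(region_b))


alix_in_helix = overlap(helix_2, alix_site)
lxxlf_in_helix = overlap(helix_2, lxxlf)

print(f"ALIX site within helix-2:  {len(alix_in_helix)} of {len(alix_site)} residues")
print(f"LXXLF motif within helix-2: {len(lxxlf_in_helix)} of {len(lxxlf)} residues")
print(f"Helix-2 covered by ALIX site: {len(alix_in_helix)} of {len(helix_2)} residues")
```

On these boundaries, 11 of the 12 helix-2 residues fall inside the AIP-1/ALIX region and 4 of the 5 LXXLF residues fall inside helix-2, consistent with both binding factors targeting the same structural domain.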
Furthermore, as p6 becomes monoubiquitinylated at conserved Lys residues in positions 27 and 33 as well as sumoylated at position 27, it would be of interest to study the impact of covalent attachment of ubiquitin and/or small ubiquitin-like modifier 1 on the structure of p6, particularly the flexibility of the central region and the C-terminal helix-2 that starts adjacent to Lys-33. Since this modification might impair the folding of helix-2, it can be speculated that ubiquitinylation regulates not only the interaction between the N-terminal L-domain and Tsg101 (30) but also the interactions of Vpr and AIP-1/ALIX with the C-terminal region of p6.
In HIV-1 particles, p6 also exists as a highly phosphorylated protein that is modified at Ser, Thr, and Tyr residues (20). Subsequently, it was shown that p6 becomes phosphorylated at position Thr-23 by the cellular mitogen-activated protein kinase ERK-2, which is also incorporated into mature virions and appears to regulate L-domain function of p6 (21). Given that p6 represents the major phosphoprotein of HIV-1 (20), more phosphorylation sites must be present in p6. Using the software "NETPHOSPHO" (68), six additional phosphorylation sites were predicted (Fig. 1). The first predicted phosphorylation site, Thr-8, is part of the PTAP motif and might regulate L-domain function. The second phosphorylation site, Ser-14, is included in helix-1. Two additional phosphorylation sites were predicted in the flexible hinge region, namely, Thr-21 and Ser-25. Given that monoubiquitinylation can be regulated by phosphorylation, it is tempting to speculate that cellular kinase(s) might regulate ubiquitinylation and thus the structure of p6.
We conclude that p6 must be considered as a highly flexible protein that can exist in various conformational states, the structures of which depend on the solution conditions and, most likely, on the presence of specific binding partners, as well as on post-translational modifications. Consequently, the folding of the p6 domain will depend on the form in which the protein occurs, either as the C-terminal domain of the Gag polyprotein or, after maturation, as the self-contained p6 protein within the virus particle. Thus, the helix-flexible helix motif identified in this work for p6 should help to further unravel the molecular mechanism of this multifunctional virus protein.