Design, Expression, and Immunogenicity of a Soluble HIV Trimeric Envelope Fragment Adopting a Prefusion gp41 Configuration

The human immunodeficiency virus-1 (HIV-1) envelope glycoprotein (Env) is comprised of non-covalently associated gp120/gp41 subunits that form trimeric spikes on the virion surface. Upon binding to host cells, Env undergoes a series of structural transitions, leading to gp41 rearrangement necessary for fusion of viral and host membranes. Until now, the prefusion state of gp41 ectodomain (e-gp41) has eluded molecular and structural analysis, and thus assessment of the potential of such an e-gp41 conformer to elicit neutralizing antibodies has not been possible. Considering the importance of gp120 amino (C1) and carboxyl (C5) segments in the association with e-gp41, we hypothesize that these regions are sufficient to maintain e-gp41 in a prefusion state. Based on the available gp120 atomic structure, we designed several truncated gp140 variants by including the C1 and C5 regions of gp120 in a gp41 ectodomain fragment. After iterative cycles of protein design, expression and characterization, we obtained a variant truncated at Lys665 that stably folds as an elongated trimer under physiologic conditions. Several independent biochemical/biophysical analyses strongly suggest that this mini-Env adopts a prefusion e-gp41 configuration that is strikingly distinct from the postfusion trimer-of-hairpin structure. Interestingly, this prefusion mini-Env, lacking the fragment containing the 2F5/4E10 neutralizing monoclonal antibody binding sites, displays no detectable HIV-neutralizing epitopes when employed as an immunogen in rabbits. The result of this immunogenicity study has important implications for HIV-1 vaccine design efforts. Moreover, this engineered mini-Env protein should facilitate three-dimensional structural studies of the prefusion e-gp41 and serve to guide future attempts at pharmacologic and immunologic intervention of HIV-1.

[...] membrane fusion, both of which are mediated by the viral envelope glycoprotein (Env) (1). Env is synthesized as a fusion-inactive precursor (gp160) that is cleaved by a host cell convertase in the Golgi compartment to generate a mature trimer of gp120/gp41 subunits (2). gp120 first binds to CD4 and then to a chemokine receptor (CCR5 or CXCR4). This binding triggers a series of conformational changes in the gp120/gp41 complex that lead to insertion of the fusion peptide, located at the N terminus of the transmembrane-anchored gp41 subunit, into the target cell membrane, and ultimately to virus-cell fusion (3).
The structure of the ectodomain of gp41 (e-gp41) in its fusogenic state has been solved by both x-ray crystallography and NMR (for HIV-1 and SIV, respectively) and consists of a central parallel trimeric coiled-coil of N-terminal helices surrounded by C-terminal helices of gp41 in an antiparallel hairpin fashion, forming a six-helix bundle (4-7). Formation of the fusogenic state of e-gp41 provides the driving force for the apposition of the virus and cell membranes, thereby promoting membrane fusion. Several peptides that mimic the sequences of the N- and C-helices inhibit fusion in a dominant negative way by blocking the interaction between the C- and N-helices, respectively, thus preventing formation of the gp41 fusogenic state (8, 9). These observations suggest that native gp41 may exist as a "prefusion" structure and/or convert to a transient intermediate prior to formation of the fusogenic hairpin trimer structure. In the prefusion state, the C-region of e-gp41 must not yet be fully associated with the N-helices, such that both the N- and C-regions of gp41 are accessible to peptide inhibitors.
On the virion surface, e-gp41 is thought to exist in a metastable state through association with gp120. The sites of gp120-gp41 contact have been mapped by in vitro mutagenesis and deletion studies to conserved residues within the disulfide-bonded loop region between the N- and C-helices of gp41 as well as in the N-terminal (C1) and C-terminal (C5) regions of gp120 (10-12). Detailed structural data are currently available only for the CD4-bound and unliganded forms of the gp120 core and for the e-gp41 fusogenic six-helix bundle (4-7, 13, 14). Structural information on the native gp120/gp41 complex trimer on the virion surface prior to receptor binding should be extremely valuable in guiding attempts at pharmacologic and immunologic intervention. However, achieving this goal has been impeded in part by the extensive glycosylation, conformational heterogeneity, and lability of the gp120/gp41 complex. To circumvent these issues, we have now designed a truncated "topless" gp140 variant in which most of gp120 has been removed, while maintaining e-gp41 in a prefusion configuration.

MATERIALS AND METHODS
Design, Expression, and Purification of the Topless gp140 Constructs-All constructs were expressed in insect cells using the Bac-to-Bac expression system (Invitrogen). The ADA HIV-1 gp160 gene was used as the template for the constructs in Fig. 1B (15). For clarity, residues of ADA HIV-1 gp160 were numbered according to the HXBc2 gp160 sequence alignment. To facilitate secreted expression of the constructs, the expression vector pFastBac (Invitrogen) was modified to add a honeybee melittin secretion signal (HMSS) sequence at the 5′ terminus of the multiple cloning sites and a His6 tag at the 3′ terminus. Briefly, the HMSS DNA sequence was amplified by PCR using primers (5′-GCTTGAGATCTATGAAATTCTTAGTCAACGTTGCC-3′) and (5′-GTTCCAAGCTTTTAGTGATGGTGATGGTGATGTGAACCGAATTCGAGCTCGGATCCCCATCGATCCGC-3′) with pMelBac (Invitrogen) as a template. The PCR product was digested with BglII and HindIII and cloned into vector pFastBac at the corresponding sites. The expression vector so generated, containing an HMSS sequence followed by BamHI, EcoRI, His6 tag, and HindIII sites, was named pFBHMH6. All constructs shown in Fig. 1B were cloned into the BamHI and EcoRI sites of pFBHMH6 for secreted expression. In the first topless construct, TN, Asn94 in the C1 region and Lys487 in the C5 region of gp120 were connected by a short linker, TPGK. In all the topless constructs, the primary and secondary cleavage sites between gp120 and gp41 were mutated by PCR as shown in Fig. 1B. Val489 was also mutated to Ala. In construct STN, the C-terminal tryptophan-rich region was removed from TN, so that the protein is truncated after Lys665. In construct SH3-STN (SS), the short linker TPGK between C1 and C5 in construct STN was replaced by the SH3 domain (residues Glu8-Leu61) of the CD2BP1 protein (16). To avoid disulfide-linked protein aggregation during expression, the Cys residues in C1 (Cys54-Cys74) and gp41 (Cys598-Cys604) in TN, STN, and SS were mutated to Ala jointly or separately. For STN, the resulting mutants were named c1-STN (Cys54 and Cys74 to Ala), gp41-STN (Cys598 and Cys604 to Ala), and 4cSTN (all four Cys to Ala), respectively. The 24-residue fusion peptide (AVGTIGAMFLGFLGAAGSTMGAAS) in 4cSS was replaced by one of three linkers to give constructs 4cSSL24, 4cSSL16, and 4cSSL10, respectively. All mutations were generated by PCR-directed mutagenesis, and the correct sequences were verified by DNA sequencing.
The generation and amplification of recombinant baculovirus followed the manufacturer's protocol (Bac-to-Bac Baculovirus Expression System, Version C). The titer of the virus stock was measured using the BacPAK baculovirus rapid titer kit (BD Biosciences) following the manufacturer's manual. The virus was usually amplified to a titer of around 1 × 10^8 plaque-forming units before infection of Sf9 (Spodoptera frugiperda) or Hi5 (Trichoplusia ni) cells for expression. Expression of these constructs was confirmed by Western blotting using the anti-His6 mAb (BD Biosciences) and optimized with respect to virus multiplicity of infection and post-infection harvest time. For large scale expression, Hi5 cells at about 2 × 10^6 cells/ml in Ex-Cell 405 medium (JRH Biosciences) were spun down and resuspended in fresh medium before infection with virus at a multiplicity of infection of 10. Supernatants were harvested 48 h post-infection. Secreted proteins were purified from supernatants by immunoaffinity chromatography using a mAb 8c9 affinity column, in which mAb 8c9 was cross-linked at 5.0 mg/ml to γ-BindPlus Sepharose (Amersham Biosciences) with dimethylpimelimidate (Pierce). The supernatant was passed through the column at a flow rate of ~0.5 ml/min. After extensive washing with buffer containing 100 mM Tris, 100 mM NaCl, pH 8.0, the protein was eluted with 100 mM glycine (pH 2.5), followed by immediate neutralization with 3.0 M Tris, pH 8.8. The fractions containing the corresponding proteins were pooled, concentrated, and further purified by gel filtration chromatography on Superdex 200 (Amersham Biosciences). The peaks containing 4cSSL24 were collected and concentrated using an Amicon centrifugal filter device (Millipore).
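The infection step above implies a simple dose calculation: the volume of virus stock needed to reach a given multiplicity of infection. The following is a minimal sketch, assuming the quoted titer is expressed per milliliter of stock (the text gives only "plaque-forming units"); the function name and the 1-liter culture volume in the usage note are hypothetical.

```python
def virus_stock_volume_ml(moi, cells_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Volume of virus stock (ml) needed to infect a culture at a given MOI.

    MOI is plaque-forming units added per cell, so
    volume = MOI * total cells / titer.
    Assumes the titer is given in pfu per ml of stock.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    total_cells = cells_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml


# Hypothetical 1-liter culture at the density and MOI quoted in the text.
volume = virus_stock_volume_ml(moi=10, cells_per_ml=2e6,
                               culture_volume_ml=1000, titer_pfu_per_ml=1e8)
print(f"virus stock needed: {volume:.0f} ml")
```

Under these assumptions a 1-liter culture would need about 200 ml of stock, which suggests why cells were pelleted and resuspended in fresh medium before infection.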
Protein Chemistry-The concentration of the 4cSSL24 protein described below was calculated using the theoretical extinction coefficient of A280 = 2.1 at 1.0 mg/ml and confirmed with a BCA kit (Pierce). The purified protein was used for protein N-terminal sequencing on a 494 Procise sequencer (Applied Biosystems). The molecular weight of the purified glycosylated protein was measured by MALDI-TOF mass spectrometry (Applied Biosystems).
Chemical Cross-linking-4cSSL24 protein at a concentration of 0.1 mg/ml in PBS was cooled and then incubated with EGS on ice for 30 min or 2 h. The reactions were then quenched by adding Tris (pH 7.0) to a final concentration of 50 mM and incubated at room temperature for 30 min. The cross-linked products were analyzed on 4-12% Bis-Tris gradient SDS-PAGE (NuPAGE, Invitrogen) and then Western blotted using the anti-His6 mAb.
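The concentration estimate from A280 amounts to dividing the measured absorbance by the absorbance of a 1.0 mg/ml solution. A minimal sketch follows, assuming a 1-cm path length and an optional dilution factor; neither is stated in the text, and the function name is hypothetical.

```python
def concentration_mg_per_ml(a280, a280_at_1_mg_per_ml=2.1, path_cm=1.0, dilution=1.0):
    """Protein concentration (mg/ml) from absorbance at 280 nm.

    a280_at_1_mg_per_ml is the absorbance of a 1.0 mg/ml solution in a
    1-cm cell (2.1 for 4cSSL24, per the text). path_cm and dilution are
    assumptions added for illustration.
    """
    if a280 < 0:
        raise ValueError("absorbance cannot be negative")
    return a280 * dilution / (a280_at_1_mg_per_ml * path_cm)


# A sample diluted 5-fold that reads A280 = 0.42 corresponds to about 1.0 mg/ml.
print(round(concentration_mg_per_ml(0.42, dilution=5.0), 3))
```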
Dynamic Light Scattering (DLS) Analysis-About 1.0 mg/ml of 4cSSL24 (in 100 mM Tris, 100 mM NaCl, pH 8.0) was subjected to DLS analysis using a DynaPro apparatus (Protein Solutions Inc.) equipped with a temperature stabilizer. The experiments were conducted at 25°C, and the wavelength of the laser was set at 781.8 nm. About 50 observations were made to calculate the hydrodynamic radius (Rh) using the software DynaPro. The value of the frictional ratio (f/f0) was calculated from D^0_{20,w} by the relation in Equation 1,

$$ \frac{f}{f_0} = \frac{K_B T}{6 \pi \eta_{20,w} D^0_{20,w}} \left( \frac{4 \pi N_A}{3 M \bar{v}} \right)^{1/3} \qquad \text{(Eq. 1)} $$

where K_B = 1.379 × 10^-16 erg/K, η_{20,w} = 0.01 poise, T = 293.15 K, N_A = 6.022137 × 10^23/mol, v̄ = 0.73 cm^3/g, and M is the molecular mass.
Negative Staining Electron Microscopy-For staining with uranyl formate, samples at ~2 µg/ml were adsorbed to glow-discharged carbon grids for 30 s, followed by washing with 2 drops of deionized water and staining with 2 drops of freshly prepared 0.75% uranyl formate. Images were taken on negative film at a magnification of ×52,000 and a defocus of 1.5 µm using the low dose procedure with a Philips Tecnai12 electron microscope operated at 120 kV. The images on negative film were then digitized with a Zeiss SCAI scanner (pixel size of 0.4 nm at the specimen level).
Proteinase Digestion Assay-6-Helix and 4cSSL24 were dissolved in Tris/NaCl buffer (100 mM Tris, 100 mM NaCl, pH 8.0) at concentrations of 0.6 mg/ml and 0.8 mg/ml, respectively. Trypsin was then added to the solution at a weight ratio of 1:40 and incubated at room temperature. At different times, 20-µl aliquots of the solution were removed and quenched by adding 5 µl of 5× SDS sample loading buffer. The samples were then analyzed on a 10-20% Tris-Tricine gradient gel (Invitrogen) and stained with Coomassie Blue G-250.
Surface Plasmon Resonance Analysis-A BIAcore 3000 instrument (BIAcore Inc.) was employed for binding analyses at 25°C in HBS buffer (150 mM NaCl, 3.4 mM EDTA, 0.005% surfactant P-20, 10 mM HEPES, pH 7.4). The biotin-labeled DP178 or scrambled DP178 (scr-DP178) peptide was separately immobilized on the streptavidin-coated SA sensor chip at 1000 RU, following the manufacturer's procedure. Both peptides were biotin-labeled at their N terminus. Samples 4cSSL24 (0.5 mg/ml), 5H (0.25 mg/ml), and 6H (0.3 mg/ml) were sequentially run over the DP178-SA and a blank SA for 5 min at a flow rate of 25 µl/min. The sensor chip was regenerated with 10 mM HCl after each binding cycle. All the samples were also analyzed on an scr-DP178-bound SA chip under the same conditions. Each binding curve obtained from the DP178-bound chip was corrected by subtracting the corresponding curve obtained from the scr-DP178-bound chip.
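The reference correction described above is a point-by-point subtraction of the scr-DP178 sensorgram from the DP178 sensorgram. A minimal sketch is given below, assuming both curves were sampled at the same time points; the function name and the response values are hypothetical illustrations, not data from this study.

```python
def reference_subtract(specific_ru, reference_ru):
    """Subtract a reference sensorgram from a specific one, point by point.

    Both inputs are lists of response units (RU) recorded at identical
    time points; the result is the specific binding signal.
    """
    if len(specific_ru) != len(reference_ru):
        raise ValueError("sensorgrams must have the same number of points")
    return [s - r for s, r in zip(specific_ru, reference_ru)]


# Hypothetical responses on the DP178 and scr-DP178 surfaces.
dp178 = [10.0, 50.0, 80.0]
scrambled = [2.0, 5.0, 6.0]
print(reference_subtract(dp178, scrambled))
```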
Immunoprecipitation Assay-The expression, refolding, and purification of HXBc2 HIV-1 gp41 5-Helix (5H), 5-Helix(D4) (5HD4), and 6-Helix (6H) followed the published procedure (17). Proteins were further purified by gel filtration on Superdex 75 (Amersham Biosciences) to a purity of >95% on SDS-PAGE. The purified 4cSSL24 (500 µl, 0.5 mg/ml in PBS) was incubated with 40 µl of 8c9-coupled γ-BindPlus beads for 2 h at room temperature, and the beads were then spun down and washed with PBS buffer. To investigate the interaction between these proteins and 4cSSL24, 10 µl of these beads was next mixed with 5H (0.1 mg/ml), 6H (0.12 mg/ml), or 5HD4 protein (0.1 mg/ml) in binding buffer (100 mM Tris, 100 mM NaCl, 0.5% Triton X-100, pH 8.0) and incubated overnight at 4°C. The beads were then spun down, washed three times with binding buffer, and resuspended in 50 µl of 1× SDS-PAGE sample loading buffer. As a control, γ-BindPlus beads without 8c9 mAb were also incubated with the above 5H, 5HD4, and 6H proteins using similar binding conditions.
Polyclonal Antibody Generation-Two New Zealand rabbits were immunized subcutaneously with 100 µg of protein formulated with complete Freund's adjuvant for the primary injection, followed by several boosters with 100 µg of protein formulated with incomplete Freund's adjuvant every 2 weeks until the titer increased no further. Serum samples were collected 7 days after each immunization and stored at −70°C until use in ELISA and neutralization assays as described below.
MT-2 Assay for Neutralizing Antibodies-Neutralization assays with HIV-1 MN were performed in MT-2 cells by using neutral red to quantify the percentage of cells that survived virus-induced killing. Briefly, 500 50% tissue culture infectious doses (TCID50) of virus were incubated with multiple dilutions of serum samples in triplicate for 1 h at 37°C in 96-well flat-bottom culture plates. Cells (5 × 10^4) in 100 µl of growth medium were added, and the incubation continued until most but not all of the cells in virus control wells (cells plus virus but no serum sample) were involved in syncytium formation (usually 3-5 days). Cell viability was quantified by neutral red uptake as described previously (18). Neutralization titers were defined as the reciprocal serum dilution (before the addition of cells) at which 50% of cells were protected from virus-induced killing. A 50% reduction in cell killing corresponds to an approximate 90% reduction in p24 Gag antigen synthesis in this assay (19). Each set of assays included a positive control serum that had been assayed multiple times and had a known average titer.
Luciferase Reporter Gene Assay for Neutralizing Antibodies-Neutralization was measured as a reduction in luciferase reporter gene expression after multiple rounds of virus replication in 5.25.EGFP.Luc.M7 cells (20). This cell line is a genetically engineered clone of CEMx174 that expresses multiple entry receptors (CD4, CXCR4, and GPR15/Bob) and was transduced to express CCR5 (21). The cells also possess Tat-responsive reporter genes for luciferase (Luc) and green fluorescent protein (GFP). Cells were maintained in growth medium (RPMI 1640, 12% heat-inactivated fetal bovine serum, 50 µg gentamicin/ml) containing puromycin (0.5 µg/ml), G418 (300 µg/ml), and hygromycin (200 µg/ml) to preserve the CCR5 and reporter gene plasmids. For the neutralization assay, 5000 TCID50 of virus was incubated with multiple dilutions of test sample in triplicate for 1 h at 37°C in a total volume of 150 µl in 96-well flat-bottom culture plates. A 100-µl suspension of cells (5 × 10^5 cells/ml of growth medium containing 25 µg of DEAE-dextran/ml but lacking puromycin, G418, and hygromycin) was added to each well. One set of control wells received cells plus virus (virus control), whereas another set received cells only (background control). Plates were incubated until ~10% of cells in virus control wells were positive for GFP expression by fluorescence microscopy (~3 days). At this time, 100 µl of cell suspension was transferred to a 96-well white solid plate (Costar) for measurement of luminescence using Bright Glo substrate solution as described by the supplier (Promega). Neutralization titers are the dilution at which relative luminescence units were reduced by 50% compared with virus control wells after subtraction of background relative luminescence units. Cell-free stocks of the HIV-1 isolates used in this assay were generated in peripheral blood mononuclear cells.
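The titer definition above (50% reduction in relative luminescence units after background subtraction) can be sketched as follows. The text does not say how the 50% point is located between tested dilutions, so the log-linear interpolation here is an assumption, as are the function names and the example readings.

```python
import math


def percent_neutralization(sample_rlu, virus_rlu, background_rlu):
    """Percent reduction in luminescence relative to virus control wells,
    after subtracting the cells-only background from both readings."""
    max_signal = virus_rlu - background_rlu
    if max_signal <= 0:
        raise ValueError("virus control must exceed background")
    return 100.0 * (1.0 - (sample_rlu - background_rlu) / max_signal)


def titer_50(reciprocal_dilutions, sample_rlus, virus_rlu, background_rlu):
    """Reciprocal dilution giving 50% neutralization, or None if the
    dilution series never crosses 50%.

    Interpolates linearly in log10(dilution) between the two flanking
    dilutions (an assumed convention, not taken from the text).
    """
    points = sorted(zip(reciprocal_dilutions, sample_rlus))
    pct = [(d, percent_neutralization(r, virus_rlu, background_rlu))
           for d, r in points]
    for (d1, p1), (d2, p2) in zip(pct, pct[1:]):
        if p1 >= 50.0 > p2:
            frac = (p1 - 50.0) / (p1 - p2)
            log_d = math.log10(d1) + frac * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    return None


# Hypothetical readings: 75% neutralization at 1:10, 25% at 1:100.
print(titer_50([10, 100], [350, 850], virus_rlu=1100, background_rlu=100))
```

A serum whose readings never fall to half of the virus control returns `None`, which corresponds to reporting no detectable neutralization at the dilutions tested.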
ELISA-The titers of the rabbit anti-4cSSL24 serum were measured in a standard ELISA. 4cSSL24 protein (100 µl/well) was used at a concentration of 2 µg/ml to coat 96-well ImmulonB (ThermoLab System) microtiter plates. Antibody end-point binding titers were determined as the highest fold dilution of the serum assayed against 4cSSL24 giving an A405 reading ratio of experimental/prebleed control of ≥3.0.
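The end-point rule above can be written as a short search over the dilution series. This is a minimal sketch, assuming immune and prebleed sera were read at matching dilutions; the function name and the absorbance values are hypothetical.

```python
def endpoint_titer(reciprocal_dilutions, immune_a405, prebleed_a405, cutoff=3.0):
    """Highest reciprocal dilution at which the immune/prebleed A405
    ratio is at least `cutoff`; returns None if no dilution qualifies."""
    best = None
    for dilution, immune, prebleed in zip(reciprocal_dilutions,
                                          immune_a405, prebleed_a405):
        if prebleed > 0 and immune / prebleed >= cutoff:
            best = dilution if best is None else max(best, dilution)
    return best


# Hypothetical plate readings for a 10-fold dilution series.
dilutions = [100, 1000, 10000, 100000]
immune = [2.0, 1.5, 0.6, 0.15]
prebleed = [0.1, 0.1, 0.1, 0.1]
print(endpoint_titer(dilutions, immune, prebleed))
```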

RESULTS
Design and Stepwise Modification of Topless gp140 Variants-To design gp140 variants in which e-gp41 is maintained in a prefusion state, the C1 and C5 segments were retained in various gp140 constructs while the other gp120 components were eliminated. These deletions create a "topless" variant of Env. The crystal structure of the gp120 core-CD4-17b Fab complex from the HXBc2 strain of HIV-1 (22) (Fig. 1A, left panel) served as the template for the current topless ADA HIV gp140 design. In the core gp120 structure, 50 residues from the N terminus of C1 and 19 residues from the C terminus of C5 are absent, and the remaining termini of the C1 and C5 segments are close in space, with residue Asn94 in C1 and Lys487 in C5 being most proximal (~6 Å). Within the inner domain next to the two termini, a small subdomain, composed of β strands β1, β5, β6, β7, and β25 (Fig. 1A, top left panel), appears relatively separate from the rest of the structure and devoid of glycans. We suspect that this small subdomain may loosely contact the gp41 subunit, dissociating from it when gp120 sheds upon ligand binding. Therefore, for the first topless ADA gp140 variant design (as shown in Fig. 1B, variant TN), we connected Asn94 of C1 and Lys487 of C5 by a short peptide linker, TPGK, that has a propensity to form a β-turn structure. Additionally, Val489 in C5 was mutated to Ala to reduce potential aggregation caused by its exposed side chain following removal of the gp120 "top." The primary and secondary cleavage sites were mutated to prevent dissociation of the remaining gp120 fragments from e-gp41 (Fig. 1B). To facilitate detection and purification, a hexa-histidine tag (H6) was attached to the C terminus of all constructs via a short linker (EFGS).

FIG. 1. Schematic representation of the topless gp140 construct. A, structural model for construct design. The three-dimensional structure of the gp120 core is shown on the left panel as taken from the complex structure (PDB code: 1G9M) with CD4 and mAb 17b omitted for clarity. The structure of the gp120 core is shown in ribbon (upper left) or Cα trace (lower left) diagrams with the outer and inner domains depicted in red and green, respectively. The molecule is oriented with the associated gp41 at the bottom. The locations of the residues that link C1 and C5 are also indicated (Asn94 and Lys487). Two topless gp140 models, with TPGK- or SH3-linked C1 and C5 attached to e-gp41 in a putative prefusion state, are shown in the right panels (STN and STN-SH3, respectively). Those C1 and C5 segments not included in the crystal structure of the gp120 core are indicated as dashed lines. For STN and STN-SH3, only a single subunit is depicted for the trimers. B, linear sequence representation of gp140 constructs. The schematic structure of the ADA HIV-1 gp160 precursor is shown at the top. For clarity, the numbering of ADA gp160 follows that of HXBc2. Potential N-linked glycosylation sites are represented by small tree marks. Disulfide bonds in the TN topless gp140 construct are indicated at the top of the C1 and gp41 loop regions, respectively. The sequence of the region including the mutated primary (KR) and secondary (KRR) cleavage sites is shown with the mutated sequence at the bottom. The partial sequence of the C-terminal segment of e-gp41 (CTE) is given, with the linear epitopes for the two broadly neutralizing antibodies, 2F5 and 4E10, indicated. For constructs STN and others below, the e-gp41 does not include the CTE region, terminating at Lys665. c1-STN and gp41-STN are STN mutants with double-Cys substitutions in the C1 and gp41 regions, respectively. 4cSTN is the four-Cys mutant of STN. The sequences of the fusion peptide of gp41 and the corresponding linker in 4cSSL24 are shown.
The construct, named TN (Fig. 1B), was expressed using the baculovirus system in Hi5 insect cells. The secreted protein expression level was low, ~50 µg/liter, with most protein appearing as disulfide-linked aggregates on SDS-PAGE. Mutation of the Cys residues in C1 and/or e-gp41 could not eliminate the aggregation (data not shown). Hence, we suspect that the aggregation is caused by the tryptophan-rich segment within the gp41 C-terminal ectodomain membrane-proximal region (CTE), as was previously observed for Escherichia coli-produced e-gp41 segments. Therefore, the second series of topless constructs, STN, excludes the CTE (Fig. 1, A (middle panel) and B).
As shown in Fig. 2 (A and B), STN formed high molecular weight aggregates in the absence of reducing agent, indicating that the disulfide bonds of C1 and/or gp41 were not correctly formed. Consequently, we created three STN disulfide mutants, termed c1-STN, gp41-STN, and 4cSTN (Fig. 1B), in which the Cys residues in C1 and gp41 were separately or concurrently mutated to Ala. c1-STN and gp41-STN, like STN, formed disulfide-linked aggregates, whereas 4cSTN ran at an apparent molecular mass of 50 kDa under both reducing and nonreducing conditions. However, the purification efficiency of metal-chelating chromatography using nickel beads was too low to obtain sufficient amounts of protein for biochemical analysis. To overcome this problem, we replaced the short TPGK linker with the SH3 domain from CD2BP1, an adapter protein binding to the cytoplasmic tail of CD2 for which a specific mAb, 8C93d8 (8c9), exists (16). Given that the N- and C-terminal ends of folded SH3 domains approach to within ~6 Å, this replacement should juxtapose the C1 and C5 segments as would the short linker. Therefore, in the constructs STN-SH3 (SS) and the four-Cys mutant variant of SS (4cSS), the CD2BP1 SH3 domain (residues Glu8-Leu61) replaced the TPGK linker (Fig. 1, A (right panel) and B). The 4cSS protein (expression level ~2.0 mg/liter) was efficiently purified using 8c9 affinity chromatography. The oligomeric state of 4cSS was tested by gel filtration on a Superdex 200 column equilibrated with PBS. As shown in Fig. 2C, most of the 4cSS formed heterogeneous soluble aggregates with an apparent molecular mass of ~440 kDa, larger than expected for trimers. The aggregates were not solubilized by acidic buffer conditions (pH 3.0), in contrast to the fusogenic hairpin-like e-gp41, which is soluble at acidic rather than neutral pH (6, 7).
Because much of the gp120 core in the topless constructs has been removed (Fig. 1A), we reasoned that the hydrophobic fusion peptide might be exposed, promoting aggregation. To test this hypothesis, we replaced the fusion peptide with one of three hydrophilic segments: the 24-residue DSQEGASGDSGSGASGSQGTSGGS, the 16-residue DSQEGASGSQGTSGGS, or the 10-residue DSQEGASGGS. The resulting constructs are denoted 4cSSL24, 4cSSL16, and 4cSSL10, respectively. The affinity-purified 4cSSL24 was analyzed by gel filtration, revealing a symmetric peak with an elution volume corresponding to an apparent molecular mass of ~300 kDa (Fig. 2D). This result strongly supports the notion that the fusion peptide contributes to the higher-order oligomerization of the 4cSS construct variants. The molecular mass of the 4cSSL24 monomer on SDS-PAGE was ~50 kDa (Fig. 2D, inset) [...] sequence and average glycan size of ~1.0 kDa in the Hi5 cells (23). Because the 4cSSL16 and 4cSSL10 expression levels were no greater than that of 4cSSL24, they were not analyzed further.
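The design rationale above, replacing a hydrophobic fusion peptide with hydrophilic linkers, can be illustrated by comparing average hydropathy of the sequences given in the text. The sketch below uses the Kyte-Doolittle scale, which is an assumption of this example and not an analysis reported in this study; the function name `gravy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("sequence is empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Sequences as given in the text.
segments = {
    "fusion peptide": "AVGTIGAMFLGFLGAAGSTMGAAS",
    "L24 linker": "DSQEGASGDSGSGASGSQGTSGGS",
    "L16 linker": "DSQEGASGSQGTSGGS",
    "L10 linker": "DSQEGASGGS",
}
for name, seq in segments.items():
    print(f"{name:15s} length={len(seq):2d} GRAVY={gravy(seq):+.2f}")
```

On this scale the fusion peptide scores positive (hydrophobic) and all three linkers score negative (hydrophilic), in line with the reasoning that the substitution removes an exposed hydrophobic patch.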
4cSSL24 Exists as an Elongated Trimer at Neutral pH-To clarify whether 4cSSL24 was a trimer with an elongated shape or rather a higher order oligomer, we performed chemical cross-linking by adding EGS at varying concentrations (Fig. 3A). The reaction was stopped after 30 min or 2 h by adding Tris buffer, and samples were analyzed by 4-12% gradient SDS-PAGE. As expected, without cross-linker addition, the 4cSSL24 monomer migrates in SDS-PAGE with a molecular mass of ~50 kDa. In the presence of 0.1 mM EGS, 4cSSL24 migrates primarily as two additional bands on SDS-PAGE at ~100 and ~150 kDa, corresponding to the sizes of dimers and trimers, respectively. With increasing concentrations of EGS, more monomers are cross-linked into dimers and trimers. At a final EGS concentration of 2 mM for 30 min, the majority of the protein formed trimers, whereas only a small fraction remained as dimers and the monomer band virtually disappeared. After a 2-h EGS incubation period, most of the 4cSSL24 was cross-linked into trimers at each EGS concentration tested. These cross-linking data demonstrate that 4cSSL24 exists as a trimer at neutral pH.
To reconcile the different molecular masses (300 versus 150 kDa) suggested by the gel filtration and chemical cross-linking experiments, we further employed dynamic light scattering (DLS). The DLS results shown in Fig. 3B demonstrate that the size distribution of 4cSSL24 is quite homogeneous except for a very minor fraction of larger size. The hydrodynamic radius (Rh) of the major peak measured by DLS was 6.891 nm, corresponding to an apparent molecular mass of ~308 kDa, consistent with the gel filtration data. The low polydispersity of 4cSSL24 in solution suggests homogeneity. The frictional ratio, f/f0 = 2.08, indicates that 4cSSL24 adopts an elongated conformation, explaining why the apparent molecular mass assessed by gel filtration is larger than that of a trimer. To obtain qualitative structural information on 4cSSL24, we recorded images of 4cSSL24 by negative staining EM as shown in Fig. 3C. The protein molecules mostly assume a rod-like structure (~120 Å) with one end slightly larger than the other, possibly corresponding to the position of the SH3 domains. These images of 4cSSL24 are consistent with the results of DLS and gel filtration.
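The frictional ratio quoted above can be checked with a short calculation. By the Stokes-Einstein relation, f/f0 equals the measured hydrodynamic radius divided by the radius of an anhydrous sphere of the same mass. The sketch below assumes the trimer mass of 127 kDa and the partial specific volume of 0.73 cm^3/g used elsewhere in the paper, and it ignores hydration; the function names are hypothetical.

```python
import math

AVOGADRO = 6.022137e23  # per mol, value quoted in the methods


def sphere_radius_nm(mass_da, vbar_cm3_per_g=0.73):
    """Radius (nm) of an anhydrous sphere with the given molecular mass."""
    volume_cm3 = mass_da * vbar_cm3_per_g / AVOGADRO   # volume per molecule
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1e7                              # cm -> nm


def frictional_ratio(rh_nm, mass_da, vbar_cm3_per_g=0.73):
    """f/f0 = Rh / R0, where R0 is the equivalent-sphere radius."""
    return rh_nm / sphere_radius_nm(mass_da, vbar_cm3_per_g)


ratio = frictional_ratio(rh_nm=6.891, mass_da=127_000)
print(f"f/f0 = {ratio:.2f}")
```

With these inputs the sketch gives a ratio close to the reported 2.08, well above the value of 1 expected for a compact sphere, which is the basis for describing the trimer as elongated.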
Antigenic Properties of 4cSSL24-The structural integrity of 4cSSL24 was further assessed by ELISA using two mAbs against known epitopes (Fig. 4). The NC-1 mAb binds to the postfusion state of gp41, but not to the free C-peptide (i.e. corresponding to HR2 of gp41 as shown in Fig. 1B) (24). In contrast, antibody 98-6 binds both the six-helix postfusion bundle (see below) and the free C-peptide (25). As shown, 4cSSL24 and gp140 are recognized by the 98-6 antibody but not by NC-1, whereas the 6-Helix protein is recognized by both 98-6 and NC-1. The free C-peptide (DP178) is weakly reactive with 98-6 but unreactive with NC-1, consistent with a previous report (25). These results suggest that 4cSSL24 is configured such that its N- and C-peptide segments are not associated in a hairpin structure. Rather, the C-terminal half of the C-peptide of gp41 in 4cSSL24 is exposed and accessible to 98-6.
A Prefusion Conformation of the Trimeric Topless Protein-Several additional methods were used to assess whether 4cSSL24 exists in a conformation different from the six-helix bundle. First, we tested the sensitivity of the 4cSSL24 protein to protease digestion, reasoning that while both HIV and SIV e-gp41 form trimer-of-hairpin structures that contain a highly stable and proteinase-resistant core (26, 27), a prefusion structure might be less compact and hence trypsin-labile. In these experiments, 6-Helix protein was used in parallel with 4cSSL24 for comparison. The 6-Helix (6H) is a recombinant protein composed of three N- and C-peptides connected by short linkers and folds into a 6-helix bundle like those observed in the postfusion trimeric e-gp41 structures (4-6). As shown in Fig. 5A, when the 6-Helix protein was digested by trypsin and analyzed by SDS-PAGE, only two major bands (6 and 25 kDa) appeared, reflecting a cleavage site in the linker region between the N and C helices, consistent with previous results (26, 27). In contrast, protein 4cSSL24 was quickly and thoroughly digested by trypsin. The trypsin-digested samples of 4cSSL24 were also analyzed by ESI-MS (Fig. 5, B and C). Most of the peptide fragments detected by ESI-MS were released from the middle of the N- and C-domains of the 4cSSL24 protein (Fig. 5D), indicating that those basic residues within helices are enzyme-accessible, with a configuration distinct from a hairpin bundle.

FIG. 3. Trimeric state of 4cSSL24. A, chemical cross-linking with EGS. Purified 4cSSL24 protein was diluted in PBS, and EGS was added at the final indicated concentrations (millimolar). After incubation on ice for 30 min or 2 h, the reaction was stopped and samples were analyzed on a 4-12% gradient SDS-PAGE, then blotted with anti-His6 mAb. B, dynamic light scattering analysis of purified 4cSSL24. The volume-weighted size distribution of purified 4cSSL24 is shown as a histogram. Rh is the equivalent hydrodynamic radius; Dt is the translational diffusion coefficient; Pd is the degree of polydispersity (<15% being considered monodisperse); Diff coef. is the diffusion coefficient. The apparent molecular mass calculated from Dt is about 308 kDa. The frictional ratio f/f0 was calculated by supposing that 4cSSL24 exists as a trimer in solution with a molecular mass of 127 kDa. C, negative staining EM of 4cSSL24. Purified 4cSSL24 was negatively stained with uranyl formate, and images on negative film were digitized with a scanner at a pixel size of 0.4 nm. The bar represents 50 nm.

Next, a fusion inhibitor that targets the prefusion intermediate of gp41 was examined for binding to 4cSSL24. One inhibitor, termed 5-Helix (5H), is a recombinant protein that consists of three N-peptides and two C-peptides (Fig. 6A). The absent C-peptide in the 5-Helix creates a high affinity binding site for the C-terminal region of gp41, and hence 5-Helix can inhibit the fusogenic activity of HIV at nanomolar concentrations (17). The 6-Helix and a 5-Helix variant, denoted 5HD4, in which the C-peptide binding site is disrupted by mutation of four interface residues (Val549, Leu556, Gln563, and Val570) to Asp, were used as controls (Fig. 6A) (17). As shown in Fig. 6B, the 5-Helix bound to the 4cSSL24 protein in an immunoprecipitation assay using anti-SH3 mAb 8c9 (Fig. 6B, lane 2). In contrast, no detectable amount of 6-Helix, and only a minor fraction of input 5HD4, bound to 4cSSL24 protein (Fig. 6B, lanes 4 and 3, respectively). These data indicate that the C-peptide region of gp41 in 4cSSL24 is accessible for 5-Helix binding.

FIG. 5. Tryptic digestion of 4cSSL24 versus 6-Helix. A, tryptic digestion assay. The reaction was carried out at room temperature by adding trypsin to the protein solution of 6-Helix and 4cSSL24. After the indicated periods, aliquots of the reaction were removed and quenched by mixing with SDS-PAGE sample buffer. The samples were then analyzed on a 10-20% gradient Tris-Tricine SDS-PAGE. B, ESI-MS identification of the trypsin-digested 4cSSL24 fragments eluted from the C18 column. The peptide mixture was applied to a C18 reverse phase Zip Tip, and the adsorbed peptides were subsequently eluted with 50% methanol, 50% water, 0.1% formic acid. The eluted peptides were identified by a combination of molecular weights and fragmentation patterns using nanoESI-MS and nanoESI-MS/MS on a QStar Pulser quadrupole-TOF mass spectrometer (Sciex). The peaks of the detected 4cSSL24 peptides are indicated, and the corresponding fragments in 4cSSL24 are also given. C, ESI-MS identification of the trypsin-digested 4cSSL24 fragments eluted from the C4 column. The conditions used were similar to those of B except that the eluting buffer was 70% acetonitrile, 30% water, 0.1% formic acid. D, amino acid sequence of 4cSSL24 and the detected trypsin cleavage sites. The first five residues are from the secretion expression vector. The corresponding regions of the 4cSSL24 amino acid sequence are indicated, and the potential N-linked glycosylation sites are shown as small trees above the sequence. The trypsin cleavage sites detected by ESI-MS in B and C are indicated by arrows. The basic residues with a black dot on top were not detected as cleavage sites.
In a third experiment, the biotinylated peptide fusion inhibitor DP178 (8) was immobilized on an SA sensor chip, and the binding of 4cSSL24 in comparison with 5-Helix was measured by surface plasmon resonance using a BIAcore 3000. A scrambled DP178 peptide (scr-DP178) with the same amino acid composition was also immobilized on the SA chip as a nonspecific background control. The 5-Helix showed strong and fast binding, whereas 4cSSL24 showed specific yet relatively weak and slow binding to DP178 (Fig. 6C). This 4cSSL24 binding behavior may be due to steric constraints in accessing the trimeric N-peptide region of gp41 in 4cSSL24 that are not present in the structurally preconfigured 5-Helix. Taken together, these results confirm that the structural configuration of gp41 in 4cSSL24 is distinct from that of the 6-Helix fusogenic hairpin. Our results suggest that both the N- and C-peptide regions of the e-gp41 in 4cSSL24 are partially exposed, similar to the prefusion state of gp41 in the CD4-unligated gp140.
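The use of the scr-DP178 surface as a background control amounts to a point-by-point subtraction of the reference response from the DP178 response. A minimal sketch of that subtraction follows; the function name and all response values are hypothetical and are not data from this study.

```python
def reference_subtract(active, reference):
    """Subtract the reference-surface response from the active-surface response.

    Both inputs are response values (in resonance units) sampled at the
    same time points, so they must have the same length.
    """
    if len(active) != len(reference):
        raise ValueError("sensorgrams must share the same time points")
    return [a - r for a, r in zip(active, reference)]


# Hypothetical responses for illustration only.
dp178_surface = [0.0, 12.0, 20.5, 24.0]      # analyte over immobilized DP178
scr_dp178_surface = [0.0, 2.0, 2.5, 3.0]     # same analyte over scr-DP178
specific_binding = reference_subtract(dp178_surface, scr_dp178_surface)
```

The subtracted curve reflects sequence-specific binding only to the extent that the two surfaces carry comparable amounts of peptide, which this sketch assumes.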
Generation of Polyclonal Antisera against 4cSSL24 and Assessment of Neutralizing Activity-In principle, subunit vaccines that mimic the structure of gp41 in a prefusion state might be excellent immunogens against which to generate neutralizing antibodies (NAbs). For example, the partially exposed N- and C-peptide segments in 4cSSL24, as demonstrated above, might represent suitable targets. Those epitope targets are minimally accessible in the trimeric Env protein, so elicitation of such NAb specificities might be difficult using intact Env. To test this possibility, polyclonal antibodies to 4cSSL24 protein were raised in rabbits. In an ELISA format, the antibodies directed against 4cSSL24 had a reciprocal half-maximal titer of ~6 × 10^4. A panel of proteins, including GST-SH3, gp140, gp120, and 6-Helix, was then used to assess specificities within the antisera. A half-maximal titer of ~3 × 10^4 was directed against the SH3 domain, whereas a half-maximal titer of ~2 × 10^4 was directed against HIV gp140 as well as the 6-Helix protein (Fig. 7A). Note that a minor fraction of those antibodies was HIV gp120-reactive (half-maximal titer, ~3 × 10^3). Collectively, the ELISA shows that a significant fraction of the polyclonal antibodies in the rabbit sera generated with the 4cSSL24 immunogen was directed against the gp140 component, although there were also fractions with specificities against the fusion protein SH3 domain and the 6-Helix protein. Interestingly, however, the neutralization assay of the antisera using either MT-2 or M7-Luc cells did not show any obvious protection against viral infection with ADA, SF162, Bal, or JR-FL HIV primary isolates (Fig. 7B). Minimal neutralizing reactivity was observed in one of two rabbits postimmunization against the easily neutralized laboratory-adapted isolate MN.
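The reciprocal half-maximal titers quoted above are read from serum titration curves. The text does not state how the half-maximal point was computed, so the following is only a sketch of one common approach: linear interpolation in log10(dilution) between the two dilutions that bracket half of the maximal signal. The function name and the titration values are hypothetical.

```python
import math


def half_max_titer(reciprocal_dilutions, signals):
    """Estimate the reciprocal dilution at which the signal is half its maximum.

    Assumes dilutions are sorted in ascending order and that the signal
    decreases as the serum is diluted. Returns None if the curve never
    crosses the half-maximal level within the tested range.
    """
    half = max(signals) / 2.0
    points = list(zip(reciprocal_dilutions, signals))
    for (d_lo, s_lo), (d_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= half >= s_hi and s_lo != s_hi:
            # Fractional distance from the lower dilution to the crossing point.
            frac = (s_lo - half) / (s_lo - s_hi)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None


# Hypothetical titration (absorbance versus reciprocal serum dilution).
dilutions = [1e2, 1e3, 1e4, 1e5, 1e6]
absorbance = [2.0, 2.0, 1.5, 0.5, 0.0]
titer = half_max_titer(dilutions, absorbance)
```

The same interpolation could in principle be applied to a neutralization curve to estimate the dilution giving a 50% reduction, although a fitted sigmoidal curve is often preferred when enough dilution points are available.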
These data indicate that the current 4cSSL24 construct, despite displaying a prefusion e-gp41 conformation, is not able to elicit NAbs against primary HIV-1 isolates.

DISCUSSION

We report on the design and stepwise modifications of topless gp140 constructs in which the majority of gp120 has been removed while retaining C1 and C5 regions to preserve gp41 in a non-fusogenic state. The structure of the gp120 core-CD4-17b Fab complex (22) was the basis of the design, anticipating that C1 and C5 would be juxtaposed in the CD4-unligated gp120/gp41 complex as well. The recently solved monomeric SIV gp120 core structure in its CD4-unligated state reveals that this assumption was essentially correct (14). The furin-like cleavage site was mutated to prevent dissociation between the gp120 fragment and the gp41 subunit (15, 28, 29). Although the uncleaved gp140 may adopt a slightly different conformation relative to the native gp120/gp41 complex on the viral surface, covalent linkage of gp120 and gp41 in the uncleaved gp140 molecule should create only local conformational perturbations in such type I viral fusion proteins (30). In addition, replacing the 24-residue fusion peptide with a more hydrophilic segment of identical length should minimize any constraint resulting from covalent linkage.
FIG. 6. Prefusion conformation of 4cSSL24. A, schematic model of the designed gp41 protein 6-Helix (6H), 5-Helix (5H), and 5HD4. The schematic of 5H is adapted from Ref. 17. For the 6H, three N-peptide segments (green) and three C-peptide segments (light blue) are alternately linked (N-C-N-C-N-C) using short Gly/Ser peptide linkers with the exception of the last N-to-C linker, GGRGG, containing a trypsin digestion site. For the 5H, the last C-peptide is removed from 6H. For 5HD4, four residues (Val549, Leu556, Gln563, and Val570) in the last N-peptide of 5H were substituted with Asp, disrupting C-peptide binding. B, immunoprecipitation assay. The anti-SH3 mAb 8c9 coupled to γ-BindPlus beads was first saturated with 4cSSL24, and then an equal amount of beads was added to the protein solution containing 5H, 6H, or 5HD4, respectively. Postincubation, the beads (lanes 2-4) were spun down, washed, boiled with SDS, and analyzed on a 15% SDS-PAGE. The total input 5H, 6H, and 5HD4 proteins used in each immunoprecipitation assay are shown in lanes 5-7. Lane 1 shows purified 4cSSL24 as a control. C, surface plasmon resonance analysis of the interaction between 4cSSL24 and C-peptide. The biotin-labeled C-peptide, DP178, was immobilized on an SA chip coated with avidin. To exclude nonspecific background, a scrambled DP178 (scr-DP178) was also immobilized on an SA chip, and interaction with the same analytes was measured. The curves shown are those obtained after subtracting scr-DP178 binding.

Our antigenicity studies using mAbs against gp41 (Fig. 4) suggest that 4cSSL24 has a similar conformation to gp41 in the CD4-unligated ADA gp140. The trypsin-digestion pattern and NC-1 antibody binding results argue that 4cSSL24 is not in a postfusion state. The specific interaction of 4cSSL24 with 5-Helix indicates that the C-peptide region within the topless protein is exposed, consistent with results showing binding of 98-6 antibody to 4cSSL24. Binding between the C-peptide and 4cSSL24 implies that the N-peptide segment is partially accessible in this mini-gp140 protein. Together, these results indicate that 4cSSL24 adopts a prefusion gp41 conformation and further imply that the N- and C-peptide segments of gp41 are separate from each other. This hypothesis is consistent with the extended molecular shape of the trimerized 4cSSL24 observed by negative-staining EM. Previous EM analysis of the trimeric HIV and SIV gp140 revealed molecules with one stem and three "arms" (31, 32). The 4cSSL24 protein adopts a rod shape without projections. Those arms in the intact gp140 molecules likely represent the gp120 subunits of the trimer, which are absent from 4cSSL24, being replaced by the smaller CD2BP1 SH3 domain. Given that the EM analysis of 4cSSL24 shows the molecule to have a length of ~120 Å, whereas the NMR structure of the six-helix hairpin reveals a length of ~100 Å, it is unlikely that both HR1 (N-peptide) and HR2 (C-peptide) segments are in an unpaired, fully extended (or helical) conformation. Note that the NMR structure of the CD2BP1 SH3 domain reveals that its longest dimension is 35 Å.³ Although this mini-Env 4cSSL24 exists as a stable trimer at physiologic conditions, the e-gp41 portion in this trimer is labile, as shown by the rapid digestion by trypsin in Fig. 5A, indicating the presence of mobile structural elements. This observation is consistent with the notion that e-gp41 is in a metastable state on the virion surface through association with gp120 (3). In the trimeric Env spike on the native virion, the highly glycosylated gp120 exterior domain of Env may protect this labile e-gp41 from proteinase digestion.
The prefusion mini-Env obtained in this study further confirms that association with gp120 is crucial in stabilizing or fixing the prefusion configuration of e-gp41. Mutations in gp120 that decrease gp41-binding affinity have been mapped to a cluster of residues at the amino (residues 36-45) and carboxyl (residues 491-501) ends of gp120 as well as several sites within the C3 and C4 regions (33). Interestingly, we found here that segments of C1 and C5 (residues 32-94 and 487-571, respectively) are sufficient to maintain e-gp41 in a prefusion state, possibly indicating their direct interaction with e-gp41. In contrast, the C3 or C4 regions may affect the gp120/gp41 association indirectly.
During the HIV fusion process, the unliganded gp120 and prefusion gp41 components within the Env complex undergo a series of conformational changes following interaction with receptor (CD4) and co-receptor (CXCR4 or CCR5) on the host cell. The subsequent formation of the CD4-liganded gp120 and postfusion gp41 brings about fusion of the cellular and viral membranes. Structures of several components of this process have been solved, including the CD4 unliganded (prefusion) and attached (postfusion) gp120 core as well as the postfusion e-gp41 fragment (4-7, 13, 14). However, the three-dimensional structure of a prefusion gp41 is still unknown, mainly because, until now, maintaining this prefusion conformation has required association with the highly glycosylated gp120. Here we successfully created a mini-Env in which most of gp120 was removed while still maintaining gp41 in a prefusion state. This mini-Env contains gp120 C1 and C5 components, which are absent from both the liganded and unliganded gp120 core structures (13, 14). The molecular weight of this trimeric mini-Env is relatively small (127 versus 420 kDa for trimeric gp140), and it contains only five potential N-linked glycosylation sites. As a result, this mini-Env should be a promising candidate for structural studies of the prefusion Env and may serve to guide design of small molecules that interfere with the HIV fusion process.
The design of immunogens that elicit broadly neutralizing antibodies against HIV is far more difficult than previously anticipated (34). Although oligomeric gp140 is a quantitatively and qualitatively improved immunogen compared with monomeric gp120, adequate protection against diverse primary HIV isolates has not been achieved (35, 36). Extensive studies on HIV envelope protein have suggested that its glycan shield, amino acid sequence variability, multiple conformational states, and proteolytic lability impede the elicitation of antibodies against broadly neutralizing epitopes (34, 37). Natural antibodies frequently arise against strain-specific epitopes, permitting evolution of viral escape variants (38). Nonetheless, primary HIV-1 isolates from different genetic subtypes can be neutralized by certain broadly reactive human monoclonal antibodies such as b12, 2G12, 2F5, and 4E10 (39-43), offering hope that vaccines may induce comparable protective antibodies by exploiting epitope-targeted immunogen design.
FIG. 7. Generation of polyclonal antibody to 4cSSL24 and assessment of its neutralizing activity. A, ELISA of the rabbit anti-serum against 4cSSL24. A panel of proteins, including ADA gp140 (5.6 μg/ml), ADA gp120 (4.8 μg/ml), 6-Helix (1.2 μg/ml), and GST-SH3 (2.0 μg/ml), were directly coated onto 96-well ELISA plates. The generated anti-4cSSL24 serum from one rabbit (J28) was progressively diluted and incubated with the plated proteins. The titration curves of the anti-4cSSL24 against each individual protein are shown. The anti-serum from another rabbit (J27) gave similar results (data not shown). B, neutralization assay of the anti-4cSSL24 serum. The serum NAb levels were measured against HIV-1 MN and a panel of primary isolates representing a spectrum of neutralization sensitivities (ADA, SF162, Bal, and JR-FL). The assay for the MN strain was based on reductions in virus-induced killing in MT-2 cells, and neutralizing activities are expressed as the reciprocal serum dilution at which 50% of cells were protected from virus-induced killing as measured by neutral red uptake. For all other isolates, the assay was based on reductions in luciferase (Luc) reporter gene expression after multiple rounds of virus replication, and the neutralizing values are the reciprocal serum dilution at which relative light units were reduced by 50% relative to no sample. J27-PB and J28-PB are prebleed sera samples before immunization, and J27-B1 and J28-B1 are anti-4cSSL24 sera from rabbit J27 and J28, respectively.

We tested the immunogenicity of 4cSSL24 protein, expecting greater exposure of functionally relevant gp41 epitopes in 4cSSL24 than in native gp140, because most of the gp120 subunit was deleted. Nonetheless, no neutralizing activity was detected in the high titered polyclonal rabbit anti-4cSSL24 sera.
The slow binding of DP178 to 4cSSL24 compared with that of 5-Helix may imply that neutralizing targets are obstructed by C1 and C5 segments linked by the SH3 domain and/or glycans in the mini-protein. Four of five glycans in the 4cSSL24 protein reside in or close to the HR2 region. Most of the gp41 epitopes are covered by gp120 in the native gp120/gp41 complex on the virion surface (1), although our results showed that the C-terminal half of gp41 is exposed in ADA gp140 based on 98-6 antibody binding. Thus, steric constraints, together with the conformational dynamics of gp41 structural alterations, likely make important epitopes inaccessible, or only transiently exposed, during gp41-mediated fusion. One exception may be the recently identified e-gp41 caveolin-1 binding motif at the end of the loop between the helical segments and overlapping several residues of the HR2 (44).
The other gp41 ectodomain region accessible to NAbs at any stage prior to or during virion fusion involves the segment harboring the 2F5 and 4E10 epitopes. This most membrane-proximal region of gp41 is problematic for inclusion in soluble gp41 subunit vaccines due to its hydrophobic character. The 2F5 mAb preferentially recognizes a prefusion conformation of gp41 (45, 46). Recently, structural analysis of the 2F5 NAb Fab fragment in complex with its gp41 peptide segment suggested that the membrane-proximal region at the start of the 2F5 epitope is relatively flexible, perhaps assuming different configurations (47). In the prefusogenic state, this extended structure is presumably stabilized by interactions through its hidden face with other components of the HIV ectodomain. More C-terminal, from residues 670-683, a predominantly hydrophobic α-helix, perhaps lying parallel to the viral membrane and containing the epitope of the most broadly neutralizing antibody, 4E10 (48), would complete the ectodomain proximal to the transmembrane anchor segment. This structural context may explain why all attempts to induce 2F5- and 4E10-like NAbs through vaccination with peptides or simple linear epitopes have failed (49-51). Given the importance of the hydrophobic membrane milieu for configuration of the 2F5/4E10-binding segment, topless protein variants must include the membrane-proximal region of HIV (up to K683) and should be displayed on a lipid membrane or artificially constrained in future immunogenicity studies. This highly conserved gp140 ectodomain region is accessible to antibody. The key is to determine how to present the relevant epitope(s) in native configuration, without attendant misguiding structural cues on Env that represent components of viral chicanery.