Functional and Structural Characterization of Novel Type of Linker Connecting Capsid and Nucleocapsid Protein Domains in Murine Leukemia Virus*

The assembly of immature retroviral particles is initiated in the cytoplasm by the binding of the structural polyprotein precursor Gag with viral genomic RNA. The protein interactions necessary for assembly are mediated predominantly by the capsid (CA) and nucleocapsid (NC) domains, which have conserved structures. In contrast, the structural arrangement of the CA-NC connecting region differs between retroviral species. In HIV-1 and Rous sarcoma virus, this region forms a rod-like structure that separates the CA and NC domains, whereas in Mason-Pfizer monkey virus, this region is densely packed, thus holding the CA and NC domains in close proximity. Interestingly, the sequence connecting the CA and NC domains in gammaretroviruses, such as murine leukemia virus (MLV), is unique. The sequence is called a charged assembly helix (CAH) due to a high number of positively and negatively charged residues. Although both computational and deletion analyses suggested that the MLV CAH forms a helical conformation, no structural or biochemical data supporting this hypothesis have been published. Using an in vitro assembly assay, alanine scanning mutagenesis, and biophysical techniques (circular dichroism, NMR, microcalorimetry, and electrophoretic mobility shift assay), we have characterized the structure and function of the MLV CAH. We provide experimental evidence that the MLV CAH belongs to a group of charged, E(R/K)-rich, single α-helices. This is the first single α-helix motif identified in viral proteins.

The first step of assembly of immature retrovirus particles is oligomerization of the structural polyprotein precursor Gag into a hexagonal lattice. The immature particles then bud from the cells, and the Gag polyprotein is cleaved by the viral protease into the main structural proteins: matrix, capsid (CA), and nucleocapsid (NC). Subsequently, this proteolytic processing leads to structural transition and formation of the mature infectious virus.
The crucial interactions leading to the assembly of immature particles are mediated by the CA, NC, and CA-NC connecting region. The CA consists of two independently folded domains: the N-terminal domain (NTD-CA) containing seven helices, and the C-terminal domain (CTD-CA) containing four helices. In contrast to the CA, the NC is partially disordered and contains one or two zinc fingers critical for the selective packaging of the two copies of retroviral genomic RNA. During maturation, the CA-NC connecting region of various retroviruses is cleaved either at one site or at two sites, the latter yielding a so-called spacer peptide. The spacer peptides (SPs) separate the CA and NC in lentiviruses such as HIV-1 and in alpharetroviruses such as Rous sarcoma virus (RSV). Spacer peptides are short peptides (12-14 amino acids long) whose cleavage is critical for the proper transition from an immature to a mature particle (1,2). In contrast, the betaretrovirus Mason-Pfizer monkey virus (M-PMV) has only one cleavage site between the CA and NC. Nevertheless, the 12 and 13 amino acids of the CA C terminus and NC N terminus, respectively, were shown to play a critical role in the assembly of the immature particles (3,4). This part of the M-PMV Gag was therefore termed a spacer peptide-like domain (4). Similarly to M-PMV, gammaretroviruses have no spacer peptide between the CA and NC. However, unlike in other retroviral species, the C terminus of the gammaretroviral CA is composed predominantly of charged amino acids. Based on its composition, this 42-amino acid-long region was named a charged assembly helix (CAH) (5). The mutational analysis of several basic residues within this region showed that they play a crucial role in the Moloney murine leukemia virus Gag-Gag interactions (6).
The combination of cryo-electron microscopy and tomography was successfully used to determine the nanometer-resolution, and later, subnanometer-resolution structures of the immature particles of four retroviral species: HIV-1, RSV, M-PMV, and MLV (7-13). These structures show a hexameric arrangement of CA subunits that, despite their conserved tertiary structure, have different quaternary arrangements. Although the arrangement of the CTD-CA was similar in these retroviral particles, the arrangement of the NTD-CA in HIV-1 and M-PMV was completely different (8,9). Another striking difference between HIV-1, RSV, M-PMV, and MLV was found in the arrangement of the region connecting the CA and NC. In HIV-1 and RSV immature particles, the spacer peptides form a rod-like, six-helix bundle, presumably a coiled-coil. Also, in the in vitro assembled MLV particles the CA is connected to the NC-RNA complex by a rod-like structure (12). In contrast, in M-PMV, the CA-NC connecting region forms a "kinked rod-like" structure that brings the CA and NC into close proximity (4, 8-11).
The critical role of the CA-NC connecting region as a transient assembly domain was first suggested by Kräusslich et al. (2) for HIV-1 and was further confirmed for RSV, M-PMV, and MLV by experimental data from other laboratories (4-6, 14-17). The peptides derived from HIV-1 and RSV spacer peptides were shown to form amphipathic helices in vitro (1, 11, 14, 17-20). In contrast, a peptide derived from the M-PMV spacer peptide-like domain formed a random coil at all tested concentrations (4). The CAH of MLV was also predicted to form a helix (5); however, the actual structural arrangement of the MLV CAH linker, as well as its assembly-related role, is still unknown.
Recently, several laboratories have described a new helical motif called a charged single α-helix (21), E(R/K) α-helical motif (22), or a single α-helix (SAH) (23). SAHs are rich in arginine, lysine, and glutamate residues. SAHs are isolated α-helices that are stable without being a part of a globular protein or a coiled-coil, and whose stability is thought to be achieved by intrahelical electrostatic interactions between arginine and glutamate residues, or lysine and glutamate residues (24,25). SAHs are usually 30-200 residues long, and often serve as linkers bridging two protein domains. The SAH motif was first recognized in the calmodulin-binding protein caldesmon (26), and in the non-muscle myosins VI (27), VII (28), and X (21). SAHs are predicted to be present in a wide range of proteins (23, 29-31).
To characterize the MLV CAH, we prepared a series of mutations substituting alanine for selected basic and acidic amino acids of the CAH. The impact of the mutated CAH regions on the ability to assemble either in bacterial cells or in the in vitro system (12) was monitored by transmission electron microscopy (TEM). Additionally, using CD and NMR spectroscopy, we confirmed the helicity of the CAH and also determined the position of the helix. The role of the MLV CAH in protein-protein or in protein-nucleic acid interactions was analyzed by microcalorimetry and electrophoretic mobility shift assay. Our data show that the MLV spacer is an SAH motif. To our knowledge, the MLV CAH is the first reported SAH not only in retroviruses, but in viruses in general.

Results
The CAH Region Is Important for the in Vitro Assembly of the MLV CANC-To study the role of the CAH in the assembly of immature MLV particles, we initially compared the assembly efficiency of a deletion mutant lacking the entire CAH region with that of the MLV Δ10CANC protein, which was shown to form spherical particles in Escherichia coli as well as in vitro (12). Both Δ10CANC and Δ10CANC ΔCAH were expressed in E. coli (Fig. 1A), and the ability of the viral proteins to assemble was analyzed by TEM. Either the bacterial cells were embedded into epoxy resin and thin-sectioned (Fig. 1B), or the cells were partially lysed and negatively stained with sodium phosphotungstate (Fig. 1C). TEM analysis of the samples obtained by both approaches clearly showed that although Δ10CANC assembled into spherical particles, Δ10CANC ΔCAH only accumulated in large protein aggregates (Fig. 1, B and C). To analyze whether the aggregation is an inherent property of the mutant or whether it is affected by the bacterial cytosolic environment, we used our recently established in vitro assay for the assembly of purified MLV Δ10CANC (12). Both the mutant and wild type proteins were produced in E. coli, purified, and assembled in vitro according to the protocol developed earlier (12). TEM analysis of negatively stained material obtained in this study clearly showed that the deletion of the CAH abrogated the ability of Δ10CANC to assemble (Fig. 1D). Because the TEM analysis of the bacterial cells and of the in vitro assembled particles yielded similar results, the thin-sectioned bacterial cells were used in the following experiments.
Mapping the CAH Amino Acid Residues Contributing to the Assembly of the MLV Gag-The CAH region consists of 42 amino acids, of which 16 are positively charged and 16 are negatively charged. Previously, Cheslock et al. (5) prepared a series of CAH deletion mutants, ranging from 11 to 33 amino acids, within the MLV Gag. This deletion analysis revealed a periodicity between the number of deleted amino acids and the abrogation of particle assembly. Furthermore, the mutational analysis of two arginine-rich CAH regions (residues 228-231, RIRR, and residues 251-254, RDRR) showed that arginine to alanine mutations affected the Moloney murine leukemia virus production (6). To explain these results, we prepared 18 mutants carrying 12 single, 3 double, and 3 triple mutations (Fig. 2A). The mutations were introduced into the Δ10CANC expression vector, and the proteins were produced in E. coli (Fig. 2B). To analyze the impact of these mutations on the assembly, the bacterial cells expressing the mutant Δ10CANC proteins were thin-sectioned and analyzed by TEM (Fig. 3). Ten of the Δ10CANC single mutants assembled into spherical structures similar to those observed for the wild type (wt) Δ10CANC. Two mutants, R225A and R228A, formed visible electron-dense protein layers but were unable to assemble into spherical particles. The double and triple mutations of the negatively charged amino acids had no effect on the formation of spheres within the bacterial cells. Similarly, the replacement of two positively charged amino acids (R254A/R257A) did not abrogate the spherical particle formation. However, in three mutants, one with a double (R247A/R251A) and two with triple basic residue mutations (K236A/R239A/R240A and R225A/R228A/R230A), neither spherical particles nor any other organized protein layers were observed (Fig. 3).
CD Spectroscopy of a CAH-derived Peptide-Based on the computational analysis and the results of the aforementioned CAH deletion study, Cheslock et al. (5) suggested that the CAH region forms a helix, which they termed a charged assembly helix (CAH). To experimentally characterize the secondary structure of the CAH, we analyzed the conformation of a synthetic, 41-amino acid-long peptide derived from this region using CD spectroscopy (Fig. 4A). The CD spectra were measured in an aqueous environment at 25°C at six different concentrations: 0.4, 0.2, 0.08, 0.04, 0.02, and 0.01 mM. The CD spectra showed two negative minima at 207 and 222 nm of comparable intensity, characteristic of helical conformations (Fig. 4B). This was further confirmed by the numerical CD data analysis revealing that ~67% of the peptide adopts a helical conformation, whereas the rest forms β-turns or is disordered. The helical content of the peptide decreased when heated to 95°C (Fig. 4C). However, the unfolding of the peptide was noncooperative (32), which suggests that the peptide does not form a coiled-coil or have a core structure (Fig. 4C).
NMR Spectroscopy Analysis of the CAH-Although the CD spectroscopy analysis showed that the CAH forms a stable helix in physiological conditions, it cannot map the position of the helix. To address this question, we further studied the structure of the CAH by NMR spectroscopy. An isotopically labeled CAH in fusion with a histidine tag was produced and purified (Fig. 5, A and B), and the backbone chemical shifts of CAH amino acids with the exception of those in the histidine tag were assigned (Fig. 5E). The secondary structure of the CAH was determined from the assigned chemical shifts using TALOS+ (Fig. 5F). We found that the CAH forms a single, 35-amino acid-long helix spanning residues Pro222-His256 (Fig. 5, C and D). The rest of the structure is most probably disordered (Fig. 5F).
MLV Charged Assembly Helix Is a Single α-Helix Motif-The results shown above revealed the ability of the CAH to form a stable α-helix without any core structure. Searching the literature for helices with an amino acid composition similar to that of the CAH, we found a relatively less well known structural motif called a charged single α-helix or single α-helix (SAH) (21-23). Using a published computational approach (23,29) that is based on the determination of potential intrahelical electrostatic interactions between arginine and glutamate residues, or lysine and glutamate residues, in positions i, i+3, and/or i, i+4, we analyzed whether the MLV CAH might be an SAH (Fig. 6). Indeed, using the CSAH server and SCAN4CSAH algorithm (23,29), we confirmed that the MLV CAH can be classified as an SAH.
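The idea behind this kind of analysis, counting oppositely charged residue pairs spaced three or four positions apart along a helix, can be illustrated with a minimal sketch. This is not the SCAN4CSAH algorithm or the CSAH server; the function name `ion_pairs`, the choice to count pairs in either order, and the toy sequence are assumptions made for illustration only (the actual CAH sequence is not reproduced here).

```python
from typing import List, Tuple

BASIC = set("RK")  # arginine and lysine
ACIDIC = "E"       # glutamate, the acidic partner named in the text


def ion_pairs(seq: str, spacing: int) -> List[Tuple[int, int]]:
    """Return 0-based index pairs (i, i + spacing) that hold one Glu and
    one Arg/Lys, in either order (an assumption of this sketch)."""
    pairs = []
    for i in range(len(seq) - spacing):
        a, b = seq[i], seq[i + spacing]
        if (a == ACIDIC and b in BASIC) or (a in BASIC and b == ACIDIC):
            pairs.append((i, i + spacing))
    return pairs


# Hypothetical E/K-rich toy sequence, NOT the MLV CAH sequence.
toy = "EEEEKKKK" * 2
print("i, i+3 pairs:", len(ion_pairs(toy, 3)))
print("i, i+4 pairs:", len(ion_pairs(toy, 4)))
```

A dense network of such potential pairs along the whole segment is what makes a sequence a candidate SAH; a real classifier would additionally score the pairs statistically and check that they recur regularly over the whole segment.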
Although the monomeric character of several of the SAH peptides was verified by chemical cross-linking (23), it is plausible that the SAH motifs could be involved in transient protein-protein interactions. Thus, we further examined whether the MLV CAH region might contribute to Gag dimerization or to Gag-nucleic acid interactions.
Isothermal Titration Calorimetry-To examine a possible role of the CAH region in the dimerization of the CA domain during the assembly of immature particles, we used isothermal titration calorimetry (ITC). Five MLV CA-derived proteins (CA, CA ΔCAH, CTD-CA, CTD-CA ΔCAH, and CAH) were produced and purified (Fig. 7, A and B). Additionally, we used HIV-1 CA, prepared as described in Ref. 34, as a control. We performed ITC dilution experiments (Fig. 8), measuring the heat changes upon dissociation of the dimers as described under "Experimental Procedures." The determined dissociation constants shown in Table 1 indicated that the control HIV-1 CA readily dimerized with a Kdiss of approximately 40 μM, whereas the MLV CA dimerized approximately 10 times less readily, with a Kdiss of 380 μM. An even lower degree of dimerization was determined for the MLV CTD-CA, with a dissociation constant of 1,500 μM. Surprisingly, a deletion of the CAH region increased the dimerization of both the CA and CTD-CA 2- and 4-fold, respectively. The CAH alone associates only weakly, with a high dimer dissociation constant (Kdiss = 16 mM).
Nucleic Acid Binding Assay-As the high number of basic residues in the CAH suggests its potential role in RNA recruitment, we investigated whether the CAH region might contribute to nucleic acid binding. We incubated the MLV proteins (Δ10CANC, Δ10CANC ΔCAH, CA, CA ΔCAH, CTD-CA, CTD-CA ΔCAH, and CAH) with a 1-kb DNA ladder. Following incubation, aliquots of all reaction mixtures were treated with proteinase K. The samples of all tested proteins without the addition of the 1-kb DNA ladder (lane 1) as well as the samples of tested proteins incubated with the 1-kb DNA ladder either untreated (lane 2) or treated (lane 3) with proteinase K were analyzed using an EMSA (Fig. 9). No nucleic acids were detected in lane 1 where protein samples without the addition of the 1-kb DNA ladder were loaded, suggesting that the protein samples were not contaminated with nucleic acid (Fig. 9, lane 1). No shifts were observed in lane 3, indicating that any observed shifts are caused exclusively by the interaction of the protein with the added nucleic acid. Shifts were observed only for Δ10CANC and Δ10CANC ΔCAH, indicating that DNA is bound by the NC domain (lane 2). No DNA binding was observed either for the CA and CTD-CA proteins or for the CAH peptide itself. These data strongly suggest that the basic residues of the CAH do not contribute to nucleic acid binding.

Discussion
To explain the role of the MLV CA-NC connecting region, we studied and characterized the MLV CAH using mutagenesis and structural analysis methods. We prepared an MLV CANC mutant lacking the CAH as well as a series of 18 mutations of this region and studied their effect on the formation of the MLV virus-like particles in vitro. We found that similarly to other retroviruses, the MLV CAH is also important for assembly as the deletion of the entire CAH region completely abrogated the ability of Δ10CANC ΔCAH to form immature particles. We also identified two single, one double, and two triple mutations of basic residues that prevented the formation of spherical particles. Combining CD and NMR spectroscopy, we provided the experimental evidence that the MLV CAH forms a helix. Our data from microcalorimetry and EMSA showed that the MLV CAH did not promote either protein-protein or protein-nucleic acid interactions. Using SCAN4CSAH software, we identified the MLV CAH as an SAH.
We experimentally confirmed the predicted helical arrangement of the CAH region (5), but we also showed that the tertiary structure of CAH differs from those of the HIV-1 and RSV CA-NC spacer peptides. Although our data are shown for the isolated CAH-derived peptide, based on the cryo-electron tomography data (12) and unpublished data from J. Briggs, we believe that the CAH adopts the same conformation in immature MLV particles. Because the precise position of the CAHs in immature particles is unknown (12), we do not know how close the individual CAHs are to each other in the rod-like structure connecting the CA and NC. Therefore, it is possible that the interactions, attractive or repulsive, between CAHs are sparse. Based on our finding that MLV CAH belongs to the "SAH motif family," we also suppose that almost all charged residues of the helical part of the CAH (residues 222-256) are involved in stabilizing intrahelical ionic interactions (Fig. 6), and thus are not available for intermolecular ionic interactions. One exception is residue Arg225. Its mutation to alanine in Δ10CANC led to the formation of visible electron-dense protein layers (Fig. 3), suggesting that the protein had lost the ability to assemble into spherical particles. It is therefore possible that residue Arg225 may be the only charged residue of the CAH with a free charge available for interhelical interactions, stabilizing the adjacent CAHs within the hexameric network of the immature particle.
Considering that acidic and basic residues have the same importance for the stabilization of the SAH, it is surprising that mutation of the acidic residues had no impact on the assembly of Δ10CANC immature particles. However, the triple mutation E223A/E226A/E227A is at the N terminus of the CAH, where the predicted stabilizing ionic interactions are sparse and possibly of less importance (Fig. 6). The single and double mutations of acidic residues probably had too weak an effect on the destabilization of the CAH, which was also the case in the majority of the single mutations of basic residues. The mutations that prevented the immature assembly (R228A, R247A/R251A, K236A/R239A/R240A, and R225A/R228A/R230A) probably destabilized the CAH and made it less rigid than needed for the proper separation of the CA and NC-RNA complex. However, this explanation is based on our presumption that the CAH, in an SAH conformation, does not form a closely packed helical bundle like the HIV-1 and RSV spacer peptides do, but rather forms a rod-like structure with the helices farther apart.
The currently available data suggest that retroviruses need a rigid spacer separating the CA and NC domains in immature particles. Despite the possible similar roles of these spacers in the assembly of immature particles, the question remains about the role of the spacers during maturation, and their role in the mature virus. A common theme is rather unlikely due to their different arrangements, and more importantly, due to the fact that in some retroviruses (HIV-1, RSV), the spacers are cleaved out, but in other retroviruses (M-PMV, MLV), they remain as a C-terminal part of the CA protein in the mature core. The formation of a functional core is a complex process with precisely regulated molecular events. The creation of contact, resulting in genesis of the mature core, depends on the activity of the retroviral protease, which sequentially separates assembly domains that stabilize the immature Gag lattice. In HIV-1 and RSV, the spacer peptides SP1 or SP, respectively, are cleaved sequentially from both the CA and NC proteins during maturation. The cleavage of SP1 and SP from the CA is the last step in maturation, prior to the assembly of the mature core (35). It has been suggested that the SP portion of the intermediate CA-SP maturation product may direct the timing of maturation and promote nucleation of the mature core (36,37) [...] rearrangement of capsid protein from its immature to mature form (38). Such sequentially cleaved spacer peptides between the CA and NC domains, although common to both RSV and lentiviruses such as HIV-1, are however absent in M-PMV and MLV.
FIGURE 6. [...] Potential i, i+4 interactions are shown above the sequence, and i, i+3 interactions are shown below the sequence. Positively charged arginine and lysine residues are shown in blue, and negatively charged glutamate and aspartate residues are shown in red. The gray rectangle marks the experimentally confirmed helical part of the region.
Although there has been discussion about the relationship between the tertiary structure of the SP and shape of the core, it is clear that in HIV-1 and RSV, the spacer peptides may be functional only up to the final maturation step, but in M-PMV and MLV, they may also play a role in the mature core. An interesting question therefore involves the arrangement of this region in the mature CA. Although in the mature M-PMV CA, this region is most likely disordered, this is unlikely for the MLV CAH. Based on the finding that the MLV CAH is an SAH, we hypothesize that the CAH, as part of the mature CA, remains in a helical arrangement.
In summary, we found that the MLV CAH is a novel structural type of the CA-NC linker domain, called the single α-helix. This structural motif has not been described in retroviruses or in viruses in general. Applying this finding to the cryo-EM structure of MLV (12), we anticipate that in the immature lattice, the CAH forms a rod-like structure consisting of six separated helices connecting the hexameric CA lattice to the NC-RNA layer.

Experimental Procedures
Expression Vectors-For all DNA manipulations, standard subcloning techniques were used. All bacterial expression vectors were based on the parental MLV CANC pET22b and Δ10 CANC pET22b expression vectors (12). To prepare the expression vectors carrying MLV CA, CTD-CA, and CAH sequences, the PCR fragments encoding the sequence of interest were amplified using an MLV CANC pET22b as a template and three sets of specific primers, each set containing a different 5′-terminal primer: 5′ MLV CA Nde AAA AAA CAT ATG CCA CTC CGC ATG GGG, 5′ MLV CTD-CA Nde AAC ATA TGA GCC CCA CCA ATT TGG CCA AGG, 5′ MLV CAH Nde AAA CAT ATG AAG CGA GAA ACC CCG GAA G, and the same 3′-terminal primer: 3′ MLV CA stop Xho AAA CTC GAG TTA CAA GAG CTT GCT CAT CTC TC. The PCR fragments obtained were cleaved by NdeI and XhoI and ligated into the pET22b vector also cleaved by NdeI and XhoI. The His-tagged CAH pET22b construct was prepared by amplification of the CAH region using MLV CANC pET22b vector as a template and two specific primers: 5′-terminal 5′ CAH NcoI AAA CCA TGG CCA AGC GAG AAA CCC CGG AAG and 3′-terminal 3′ MLV CA stop XhoI (see the sequence above). The fragment was cleaved by NcoI and XhoI and ligated into the His-TEVpET22b vector. The His-TEVpET22b vector was prepared by ligation of an NdeI-XhoI fragment of pFastBac HT B vector (Invitrogen) into NdeI-XhoI-cleaved pET22b vector. The vector for MLV CANC ΔCAH expression was prepared by ligation of two fragments into an Xba-HindIII-cleaved pSit vector (39). [...] For each mutation, a set of two PCR fragments was obtained and cut by Nde-X and X-Xho, where X stands for the appropriate restriction endonuclease for the particular restriction site introduced by the targeted mutation. These sets of two fragments were then ligated into Nde-Xho-cleaved Δ10 CANC pET22b.
Protein Production-LB or M9 minimal medium containing ampicillin (100 μg/ml) was inoculated with E. coli BL21(DE3) RIL cells carrying an appropriate expression vector. The culture was incubated to an A590 of ~0.8, and expression was induced by 0.4 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). All procedures were carried out at 37°C. The cells were harvested 4 h after induction. The sample for NMR spectroscopy was produced in M9 minimal medium containing D-[U-99% 13C]glucose and [U-99% 15N]NH4Cl as the sole sources of carbon and nitrogen (U indicates uniform labeling; 99% 13C indicates that 99% of all C atoms in D-glucose are 13C; 99% 15N indicates that 99% of all N atoms in NH4Cl are 15N).
Protein Purification-All purification procedures were performed on ice or at 4°C, except for chromatography, which was carried out at 25°C. All chromatography columns used were manufactured by GE Healthcare Life Sciences. The purity of the proteins was analyzed by SDS-PAGE, and the identity of the proteins was confirmed by mass spectrometry and N-terminal sequencing. The concentrated proteins were stored at −70°C.
Purification of Δ10CANC and Δ10CANC ΔCAH-The harvested cells were resuspended in buffer A (50 mM Na2HPO4, pH 8.0, 1 M NaCl, 0.5% 2-mercaptoethanol, 0.2 mg/ml lysozyme, 0.15% (w/v) sodium deoxycholate, and one tablet/100 ml of Roche cOmplete protease inhibitor cocktail). The suspension was sonicated, and the insoluble fraction, containing the target proteins, was recovered by centrifugation. The pellet was resuspended in buffer B (50 mM Na2HPO4, pH 8.0, 2 M NaCl, 0.05% 2-mercaptoethanol, 1% Triton X-100, 20 μg/ml DNase I, 20 μg/ml RNase A, and one tablet/100 ml of Roche cOmplete protease inhibitor cocktail), sonicated, and centrifuged. Whereas Δ10CANC was in the soluble fraction, Δ10CANC ΔCAH was in the insoluble fraction. The soluble Δ10CANC was dialyzed against buffer C (50 mM Na2HPO4, pH 8.0, 500 mM NaCl, 0.05% 2-mercaptoethanol) and then loaded onto an immobilized metal affinity chromatography column (HiTrap IMAC FF) charged with Ni2+ and equilibrated with buffer C. The column was washed with buffer C, and Δ10CANC was eluted with 25 mM imidazole in buffer C. The fractions were dialyzed against buffer D (50 mM Na2HPO4, pH 8.0, 100 mM NaCl, 0.05% 2-mercaptoethanol) and then loaded onto a heparin chromatography column (HiTrap Heparin HP) equilibrated with buffer D. The target protein was eluted with a linear gradient of NaCl in buffer D. The protein was further purified by size exclusion chromatography (Superdex 75 10/300 GL; equilibrated with buffer E (50 mM Tris, pH 7.0, 500 mM NaCl, 0.05% 2-mercaptoethanol)). The appropriate fractions were pooled and concentrated. The insoluble Δ10CANC ΔCAH was solubilized with 4 M urea in buffer E, centrifuged, and purified by size exclusion chromatography (HiLoad 16/600 Superdex 75 pg; equilibrated with 4 M urea in buffer E). The urea was removed by dialysis against buffer E, and the sample was then concentrated.
Purification of His-tagged CAH for NMR Spectroscopy-The harvested cells were resuspended in buffer F (50 mM Na2HPO4, pH 8.0, 300 mM NaCl, 2.5 mM TCEP) with the addition of 0.2 mg/ml lysozyme, 0.15% (w/v) sodium deoxycholate, and one tablet/100 ml of Roche cOmplete protease inhibitor cocktail. The suspension was sonicated, and the insoluble fraction was removed by centrifugation. The cleared lysate was loaded onto an immobilized metal affinity chromatography column charged with Ni2+ (HiTrap IMAC FF, equilibrated with buffer F). The column was washed with 50 mM imidazole in buffer F, and His-TEV-CAH was eluted with 80 mM imidazole in buffer F. The appropriate fractions were pooled, dialyzed against buffer G (50 mM Na2HPO4, pH 7.0, 50 mM NaCl), concentrated to 5.8 mg/ml (~0.6 mM), and supplemented with 5% D2O.
FIGURE 9. EMSA. The indicated proteins were incubated with a 1-kb DNA ladder, and the sample aliquots were then treated with proteinase K. All the samples were analyzed by agarose gel electrophoresis. Lanes: L: 1-kb DNA ladder; 1: the tested proteins without the addition of the 1-kb DNA ladder; 2: tested proteins incubated with the 1-kb DNA ladder without proteinase K treatment; 3: tested proteins incubated with the 1-kb DNA ladder with proteinase K treatment.
Purification of CA, CA ΔCAH, CTD, CTD ΔCAH, and CAH-The harvested cells were resuspended in buffer H (50 mM Na2HPO4, pH 8.0, 150 mM NaCl, 2.5 mM TCEP) with the addition of 0.2 mg/ml lysozyme, 0.15% (w/v) sodium deoxycholate, 20 μg/ml DNase I, 20 μg/ml RNase A, and one tablet/100 ml of Roche cOmplete protease inhibitor cocktail. The suspension was sonicated, and the insoluble fraction was removed by centrifugation. The soluble fraction was precipitated with (NH4)2SO4 (0.5 g/ml of supernatant). The precipitate was recovered by centrifugation, resuspended in buffer H, and dialyzed against buffer SP (10 mM Na2HPO4, pH 6.0, 10 mM NaCl) in the case of CTD, CTD ΔCAH, and CAH; or against buffer Q (10 mM Na2HPO4, pH 8.0, 10 mM NaCl) in the case of CA and CA ΔCAH. The dialyzed sample was loaded onto an ion exchange chromatography column (HiTrap SP XL, equilibrated with buffer SP; or HiTrap Q XL, equilibrated with buffer Q), and the target proteins were eluted with a linear gradient of NaCl in buffer SP or Q. The proteins were further purified by size exclusion chromatography (HiLoad 16/600 Superdex 75 pg, equilibrated with buffer J (50 mM Tris, pH 7.0, 100 mM NaCl)). The selected fractions were pooled and concentrated.
In Vitro Assembly-An aliquot of 60 μg of a purified protein was mixed with 6 μg of DNA (oligonucleotide 40-mer) in a total volume of 100 μl of protein storage buffer (20 mM Tris, pH 8.0, 500 mM NaCl, 50 μM ZnCl2, 1 mM PMSF). The mixture was dialyzed overnight at 4°C against 50 mM Tris, pH 8.0, 100 mM NaCl, and 1 μM ZnCl2.
Electron Microscopy-The particles formed during the in vitro assembly reaction or bacterial cells lysed in a lysis buffer containing 50 mM Tris, pH 7.4, 250 mM NaCl, 1 mM EDTA, 1% octyl β-D-1-thioglucopyranoside, and lysozyme (1 mg/ml) were negatively stained with 2% sodium silicotungstate (pH 7.4) on carbon-coated grids and analyzed by transmission electron microscopy (JEOL JEM-1200EX) operated at 60 kV. For thin sectioning, the bacterial pellets were fixed in 2.5% glutaraldehyde, post-fixed in 1% osmium tetroxide, dehydrated in an ethanol series, and embedded in Agar 100 epoxy resin. Ultrathin sections (70 nm) were counterstained with saturated uranyl acetate and lead citrate.
Circular Dichroism Spectroscopy-CD spectra of the CAH-derived peptide were collected at 25°C using the Jasco-815 spectrometer equipped with the Peltier temperature control system PTC-423S/L. The spectra were measured in the spectral range of 180-280 nm with a scanning speed of 10 nm/min, response time of 8 s, 1-nm spectral bandwidth, and two spectra accumulations. Depending on the concentration of the sample (0.01-0.4 mM), the samples were placed in a quartz cell with 0.1- or 0.05-cm optical path length. After baseline correction, the final spectra were expressed as molar ellipticity (degrees × cm2 dmol−1) per residue (40-42). A numerical analysis of the measured spectra and the secondary structure assignment was performed using the online circular dichroism analysis program CDPro. The melting curve was measured at the wavelength of 222 nm, in a temperature interval of 5-95°C, and with a temperature gradient of 20°C/h. The concentration of the sample was 0.01 mM.
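The conversion of a raw CD signal into molar ellipticity per residue can be sketched as follows. This uses the standard textbook relation [θ] = θ(mdeg) / (10 · c · l · n), with c in mol/l, l in cm, and n the number of residues; it is an illustration, not the processing performed by the instrument software or CDPro. The function name and the raw signal of −20 mdeg are hypothetical. Some conventions divide by the number of peptide bonds (n − 1) instead of n.

```python
def mean_residue_ellipticity(theta_mdeg: float, conc_molar: float,
                             path_cm: float, n_residues: int) -> float:
    """Convert observed ellipticity (millidegrees) into molar ellipticity
    per residue (deg cm^2 dmol^-1) using theta / (10 * c * l * n)."""
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading: -20 mdeg for a 41-residue peptide at 0.04 mM
# in a 0.1-cm cell (concentration and path length are values used in the text).
value = mean_residue_ellipticity(-20.0, 0.04e-3, 0.1, 41)
print(f"{value:.0f} deg cm^2 dmol^-1 per residue")
```

Normalizing by concentration, path length, and residue count is what allows spectra recorded at the six peptide concentrations and in two different cells to be compared directly.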
NMR Spectroscopy-All NMR spectra were acquired at 25°C on a Bruker Avance III 600 spectrometer equipped with a triple-resonance (15N/13C/1H) cryoprobe. The 1H-detected 3D spectra were acquired in non-uniform sampling mode with 25% sparse sampling of indirect domains. 2,2-Dimethyl-2-silapentane-5-sulfonate (DSS) was used as a direct external chemical shift reference for 1H and as an indirect reference for 15N and 13C shifts (43). The spectra were processed with Topspin 3.2 (Bruker BioSpin) and analyzed with CcpNmr Analysis 2.3 (44). The following spectra were used to assign chemical shifts: 2D NH-HSQC, CACO, and CON; 3D HNCO, HN(CA)CO, HNCA, HNCACB, CBCA(CO)NH, HNN, HBHA(CBCACO)NH, and CACON. Secondary structure was determined from the chemical shifts using TALOS+, which assigns secondary structure with high accuracy (45).
Isothermal Titration Calorimetry-Microcalorimetry experiments were performed at 25°C using a VP-ITC system (MicroCal, Malvern Instruments Ltd.). The analyzed proteins were transferred to buffer J (50 mM Tris, pH 7.0, 100 mM NaCl), and their exact concentration was determined by HPLC amino acid analysis. Typically, 10-μl aliquots of protein solution were injected stepwise into a sample cell containing 1.43 ml of buffer J. Control experiments, in which buffer was injected instead of protein sample, were also performed. The initial protein concentrations in the injection syringe were the following: HIV-1 CA, 0.32 mM; CA, 0.41 mM; CA ΔCAH, 0.37 mM; CTD, 3.7 mM; CTD ΔCAH, 1.6 mM; and CAH, 8.0 mM (the maximal prepared concentration). Dimer dissociation constants were determined by MicroCal software implemented in Origin 7.0 (MicroCal, Malvern Instruments Ltd.), with an updated and corrected version (April 2010) of the dissociation analysis procedure.
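The reason a dilution experiment reports on dimerization can be illustrated with a simple monomer-dimer equilibrium. The sketch below assumes the definition Kdiss = [M]²/[D] with total protein C = [M] + 2[D] (in monomer units), which gives [M] = (Kdiss/4)(√(1 + 8C/Kdiss) − 1). It is not the MicroCal/Origin fitting procedure; the function names, the dissociation constant of 40 μM used for illustration, and the crude 10 μl into 1.43 ml dilution estimate are assumptions of this example.

```python
import math


def monomer_concentration(total_conc: float, k_diss: float) -> float:
    """Free monomer concentration for 2 M <-> D, with K_diss = [M]^2 / [D]
    and total protein C = [M] + 2[D] expressed in monomer units."""
    if total_conc < 0 or k_diss <= 0:
        raise ValueError("total_conc must be >= 0 and k_diss must be > 0")
    return k_diss / 4.0 * (math.sqrt(1.0 + 8.0 * total_conc / k_diss) - 1.0)


def dimer_fraction(total_conc: float, k_diss: float) -> float:
    """Fraction of protein chains that are part of a dimer."""
    if total_conc == 0:
        return 0.0
    return 1.0 - monomer_concentration(total_conc, k_diss) / total_conc


# Illustrative values: 0.32 mM in the syringe, assumed K_diss of 40 uM,
# and a crude estimate of the concentration after one 10-ul injection
# into 1.43 ml of buffer (displaced volume is ignored).
k_diss = 40e-6
syringe = 0.32e-3
cell = syringe * 10.0 / 1430.0
print(f"dimer fraction in syringe: {dimer_fraction(syringe, k_diss):.2f}")
print(f"dimer fraction after dilution: {dimer_fraction(cell, k_diss):.2f}")
```

Under these assumptions most of the protein is dimeric in the syringe and largely monomeric after dilution into the cell, and the heat of that dissociation is what each injection measures. A weaker dimer (larger Kdiss) is less populated in the syringe, which is why higher syringe concentrations were needed for the CTD and CAH constructs.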
Electrophoretic Mobility Shift Assay-The studied proteins (58 pmol) were mixed with 165 ng of a 1-kb DNA ladder (New England Biolabs) in a total volume of 10 μl of buffer J (50 mM Tris, pH 7.0, 100 mM NaCl). The samples were incubated for 45 min at 25°C. To verify that the change in the mobility of the nucleic acid was caused by a protein-nucleic acid interaction, duplicates of the samples were further treated with proteinase K (5 μg/reaction) for 45 min at 37°C. The samples were then mixed with 5 μl of a loading buffer (30% glycerol and 0.09% bromophenol blue in buffer J) and analyzed by agarose gel electrophoresis (1% gel, 7 V/cm). The gels were stained with GelRed (Biotium) and visualized by a Quantum ST4 fluorescence imaging system (Vilber Lourmat). This method was adopted from Fuzik et al. (33).