Structural and Functional Analysis of φ29 p16.7C Dimerization Mutants

Prokaryotic DNA replication is compartmentalized at the cellular membrane. The Bacillus subtilis phage φ29-encoded membrane protein p16.7 is one of the few proteins known to be involved in the organization of prokaryotic membrane-associated DNA replication. The functional DNA binding domain of p16.7 is constituted by its C-terminal half, p16.7C, which forms high affinity dimers in solution and which can form higher order oligomers. Recently, the solution and crystal structures of p16.7C and the crystal structure of the p16.7C-DNA complex have been solved. Here, we have studied the p16.7C dimerization process and the structural and functional roles of p16.7 residues Trp-116 and Asn-120 and its last nine C-terminal amino acids, which form an extended tail. The results obtained show that transition of folded dimers into unfolded monomers occurs without stable intermediates and that both Trp-116 and the C-terminal tail are important for dimerization and functionality of p16.7C. Residue Trp-116 is involved in formation of a novel aromatic cage dimerization motif, which we call “Pro cage.” Finally, whereas residue Asn-120 plays a minor role in p16.7C dimerization, we show that it is critical for both oligomerization and DNA binding, providing further evidence that DNA binding and oligomerization of p16.7C are coupled processes.

Compelling evidence has accumulated that DNA replication of eukaryotic and prokaryotic genomes occurs in so-called replication factories, which probably contain, besides replicative DNA polymerases, all other enzymes involved in DNA replication and related processes (1-3). Imaging techniques demonstrated that replication factories are located at rather static positions, implying that they are attached to subcellular structure(s) (reviewed in Ref. 4), which in bacteria is the cytosolic membrane (for reviews, see Refs. 5 and 6). Besides functioning as a scaffold, membrane association inherently compartmentalizes replicative complexes, allowing surface catalysis, which increases the efficiency of the DNA replication process (7,8). Little is known, however, about proteins involved in the organization of membrane-associated replication of genomes. An exception is the well studied Bacillus subtilis phage φ29 (for a review, see Ref. 9). The genome of φ29 consists of a linear double-stranded DNA (dsDNA) of 19,285 bp that contains a terminal protein (TP) covalently linked at each 5′-end. Initiation of φ29 DNA replication occurs via a so-called protein-primed mechanism (reviewed in Refs. 10 and 11). The first step involves recognition of the origins of replication, constituted by the TP-containing DNA ends, by a φ29-encoded DNA polymerase/TP heterodimer. After origin recognition, the DNA polymerase catalyzes the covalent attachment of the first nucleotide to the TP molecule present in the heterodimer. Then, after a transition step, these two proteins dissociate, and the DNA polymerase continues processive elongation, which is coupled to strand displacement, until replication of the nascent DNA strand is completed.
Membrane-associated φ29 DNA replication was first demonstrated by Ivarie and Pène (12), who also showed that this required early expressed φ29 proteins. The early expressed φ29 gene 16.7 is conserved in all φ29-related phages studied so far (9). Protein p16.7 (130 amino acids) is a membrane protein, and its N-terminal transmembrane domain (residues 1-20) is responsible for membrane localization (13). Functional studies showed that p16.7 is involved in the organization of membrane-associated φ29 DNA replication and that the efficiency of in vivo φ29 DNA replication is affected in its absence (9,13,14). Biochemical studies using a soluble variant lacking the N-terminal membrane anchor, p16.7A, revealed (i) that it has single-stranded DNA (ssDNA) and dsDNA binding activity, (ii) that it is a dimer in solution, and (iii) that it can form oligomers, especially upon DNA binding (13-15). The p16.7 region following the transmembrane domain (approximately residues 30-60) forms a low affinity coiled-coil (16). However, the main interactions responsible for dimerization are located in the C-terminal half of the protein (16). This domain, named p16.7C (residues 63-130), constitutes the functional domain of p16.7 (i.e. it has DNA binding activities and, in addition to dimers, is able to form oligomers) (16,17).
Recently, the solution and crystal structures of the p16.7C dimer have been solved (17), as well as the crystal structure of p16.7C in complex with dsDNA (18). Interestingly, one dsDNA binding subunit is constituted by three p16.7C dimers that are arranged in such a way that they form a deep dsDNA binding cavity. The structure of the p16.7C-DNA complex provided important insights into the DNA binding and oligomerization surfaces (18). On the other hand, the solution and crystal structures of the apo form gave a detailed view of the overall p16.7C folding (17). Thus, each p16.7C monomer contains three α-helices (corresponding to p16.7 residues 72-81 (H1), 88-95 (H2), and 103-121 (H3)) (see Fig. 1A), and the secondary and tertiary structure of each monomer is stabilized by a hydrophobic core resulting from the packing of the three helices. Two monomers form a symmetric p16.7C dimer that corresponds to a novel six-helical fold (see Fig. 1, B and C). The primary dimer interface is formed by the third α-helix and the following extended C-terminal region (p16.7 residues 122-128) of each monomer, which are oriented in an antiparallel fashion and pack against helices H1 and H3 of the other monomer. The intermolecular interactions present in the primary dimer interface involve multiple hydrophobic and polar contacts. Based on their position within the interface, these can be grouped as central and lateral intermolecular contacts. Whereas residues that locate in the central part of the dimeric interface form part of the C-terminal region of the third α-helices, the extended C-terminal regions following this third α-helix constitute the lateral interdimeric contacts.
The major aims of this work were (i) to validate the role of residues at the central and lateral regions of the primary dimer interface and to analyze their relative importance for dimerization and (ii) to study whether dimerization is required for functionality of the protein. The results obtained show that the lateral regions of the dimer interface as well as residue Trp-116 present in the central region of the dimer interface are crucial for proper dimerization and functionality of the protein. Trp-116 most probably forms an essential part of a novel dimerization motif, which we named the "proline cage." In addition, residue Asn-120, also present at the central dimer interface region, was studied. Although this residue was shown to be less important for dimerization, it appeared to be crucial for oligomerization and functionality of the protein.
DNA Techniques-All DNA manipulations were carried out according to Sambrook et al. (21). Restriction enzymes were used as indicated by the suppliers. [γ-³²P]ATP (3000 Ci/mmol) was obtained from Amersham Biosciences. Plasmid DNA was isolated using the Wizard Plus DNA purification kit (Promega, Madison, WI). DNA fragments were isolated from gels using the Qiaquick gel extraction kit (Qiagen Inc., Chatsworth, CA). Sequencing was done using the dideoxynucleotide chain termination method (22) with Sequenase polymerase (United States Biochemicals sequencing kit).
Site-directed Mutagenesis of Gene 16.7C-Site-directed mutants of gene 16.7C were obtained by PCR using the QuikChange™ site-directed mutagenesis kit (Stratagene) in combination with appropriate complementary oligonucleotides and plasmid pET-16.7C containing gene 16.7C (16) as starting template DNA. The following primer sets were used for generating mutants W116A, N120W, and 16.7CΔ9, respectively: primers W116A_U (5′-gag aca cag cgg aca tac gcg aaa ttg gag aat cag aaa-3′) and W116A_L (5′-ttt ctg att ctc caa ttt cgc gta tgt ccg ctg tgt ctc-3′); primers N120W_U (5′-cgg aca tac tgg aaa ttg gag tgg cag aaa aaa cta tat cgg ggg-3′) and N120W_L (5′-ccc ccg ata tag ttt ttt ctg cca ctc caa ttt cca gta tgt ccg-3′); and primers 16.7CΔ9_U (5′-gga aat tgg aga atc agt aat aac tat atc ggg ggt cat tg-3′) and 16.7CΔ9_L (5′-caa tga ccc ccg ata tag tta tta ctg att ctc caa ttt-3′). After PCR, performed according to the reaction conditions described by the supplier, the resulting DNA products were purified, digested with DpnI, and used to transform competent cells of E. coli BL21(DE3)pLysS. The presence of the desired mutations and the absence of other mutations were checked by sequencing plasmid DNA isolated from kanamycin-resistant transformants.
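Because this mutagenesis approach relies on complementary oligonucleotide pairs, a quick consistency check is to confirm that each lower primer is the reverse complement of its upper partner. The following is a minimal sketch of such a check for the W116A and N120W pairs listed above; the function and variable names are illustrative and not part of the original protocol.

```python
# Translation table mapping each lowercase DNA base to its complement.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (spaces are ignored)."""
    cleaned = seq.replace(" ", "").lower()
    return cleaned.translate(COMPLEMENT)[::-1]


def is_complementary_pair(upper: str, lower: str) -> bool:
    """True if `lower` is exactly the reverse complement of `upper`."""
    return reverse_complement(upper) == lower.replace(" ", "").lower()


# Primer sequences copied from the text (5' to 3').
PRIMER_PAIRS = {
    "W116A": (
        "gag aca cag cgg aca tac gcg aaa ttg gag aat cag aaa",
        "ttt ctg att ctc caa ttt cgc gta tgt ccg ctg tgt ctc",
    ),
    "N120W": (
        "cgg aca tac tgg aaa ttg gag tgg cag aaa aaa cta tat cgg ggg",
        "ccc ccg ata tag ttt ttt ctg cca ctc caa ttt cca gta tgt ccg",
    ),
}

if __name__ == "__main__":
    for name, (upper, lower) in PRIMER_PAIRS.items():
        print(name, is_complementary_pair(upper, lower))
```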
Protein Concentration-Protein monomer concentration was determined by UV spectrophotometry, measuring the absorbance at 280 nm. For p16.7C, pW116A, pN120W, and p16.7CΔ9, the extinction coefficients under native conditions were $\epsilon_N$ = 11,360, 5920, 16,800, and 9880 M⁻¹·cm⁻¹, respectively. The $\epsilon_N$ values were determined from the extinction coefficient under denaturing conditions ($\epsilon_D$) using the expression $\epsilon_N = \epsilon_D (A_N/A_D)$, where $A_N$ and $A_D$ are the absorbances of the native and the denatured protein, respectively (23).
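The two relations used here, the native extinction coefficient expression and the Beer-Lambert law, can be sketched as follows. This is an illustrative example only: the function names and the sample absorbance value are hypothetical, and a 1-cm path length is assumed by default.

```python
def native_extinction(eps_denatured: float, a_native: float, a_denatured: float) -> float:
    """eps_N = eps_D * (A_N / A_D), all measured at the same protein concentration."""
    return eps_denatured * (a_native / a_denatured)


def monomer_concentration_molar(a280: float, eps_native: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration (M) = A280 / (eps_N * path length)."""
    return a280 / (eps_native * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.2272 for p16.7C (eps_N = 11,360 M^-1 cm^-1).
    conc = monomer_concentration_molar(0.2272, 11360.0)
    print(f"{conc * 1e6:.1f} uM monomer")
```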
Cross-linking Assays-Cross-linking reactions using disuccinimidyl suberate (DSS) as cross-linking agent were performed as described before (13). After cross-linking, proteins were precipitated upon the addition of 1 volume of ice-cold 20% (w/v) trichloroacetic acid and, after resuspension, analyzed by PAGE in the presence of SDS. Proteins were visualized by Coomassie Blue staining.
Gel Mobility Shift Assays-Gel retardation assays were performed as described before (14).
DNase I Digestion Assays-Nuclease digestion assays were performed using end-labeled 297-bp DNA fragments corresponding to the right end of the φ29 genome. DNase I reactions contained, in 20 μl, besides the end-labeled dsDNA fragment and the indicated amount of protein, 25 mM Tris-HCl (pH 7.5) and 10 mM MgCl₂. Binding reactions were incubated for 10 min at 37°C before 0.05 units of DNase I (Promega) was added. Digestion was allowed to proceed for 2 min at 37°C, after which the reaction was stopped upon the addition of EDTA to a final concentration of 20 mM. A phenol/chloroform extraction step was performed before the DNA was precipitated with ethanol in the presence of 15 μg of linear polyacrylamide as carrier. Next, the resuspended DNA was analyzed in denaturing 6% polyacrylamide gels. Finally, gels were dried and subjected to autoradiography.
CD and Intrinsic Trp Fluorescence Spectroscopy-Far-UV CD spectra were recorded using a JASCO spectropolarimeter, model 600 (JASCO Europe SLR), equipped with a NESLAB RTE-100 water bath interfaced to a computer. Wavelength scans from 198 to 260 nm were performed at 25°C using 1.0- and 0.1-cm path length cells (Thermal Syndicate Ltd., Wallsend, Northumberland, UK). Samples were allowed to reach thermodynamic equilibrium. Each spectrum represents the mean of four scans obtained at a rate of 50 nm/min, a bandwidth of 1.0 nm, and a response time of 2 s. Samples were prepared in 50 mM Tris-HCl, pH 7.5, 250 mM NaCl, 0.2 mM dithiothreitol, at protein concentrations of 2 or 20 μM. Thermal denaturation experiments were carried out by increasing the temperature from 4 to 90°C at a scanning rate of 0.75°C/min and monitoring the ellipticity at 222 nm. The buffer, cuvettes, and protein concentrations used were as in the wavelength scans. The reversibility of the thermal transition for p16.7C was checked by comparing the spectrum of the native sample with that obtained after cooling the denatured sample.
Fluorescence experiments were recorded using an AMINCO-BOWMAN (Runcorn, Cheshire, UK) series-2 luminescence spectrometer equipped with a Selecta Ultraterm temperature control unit. Protein spectra were registered at 2 and 20 μM in 1-cm path length cells, from 310 to 450 nm using an excitation wavelength of 295 nm. Blanks without protein were subtracted from the spectra. The melting curves were obtained by measuring the intrinsic Trp fluorescence at 335 nm.
Analyses of Spectroscopic Data-Most of the secondary structural analyses were performed using DICHROWEB (available on the World Wide Web at www.cryst.bbk.ac.uk/cdweb), an interactive Web server (24,25) that permits secondary structure analyses via the software package CDPro (26). Protein CD spectra were analyzed for percentages of secondary structure using the CDPro programs CONTINLL (27), SELCON3 (28), and CDSSTR (29) with a wide range of protein spectral databases derived from soluble and membrane proteins (26,30). Thus, the SELCON3, CONTINLL, and CDSSTR programs were used for comparing variations in the amount of secondary structure observed in mutant proteins p16.7CΔ9, pW116A, and pN120W relative to that observed with protein p16.7C. As a means of comparing the goodness of fit of the various methods, the normalized root mean square deviation parameter (31) was calculated for all of the analyses. The normalized root mean square deviation (NRMSD) is defined by Equation 1,

$$\mathrm{NRMSD} = \left[ \frac{\sum (\theta_{exp} - \theta_{cal})^2}{\sum \theta_{exp}^2} \right]^{1/2} \qquad (1)$$

summed over all wavelengths, where $\theta_{exp}$ and $\theta_{cal}$ are, respectively, the experimental ellipticities and the ellipticities of the back-calculated spectra for the derived structure. NRMSD values of <0.1 mean that the back-calculated and experimental spectra are in close agreement (32).
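The NRMSD goodness-of-fit parameter can be computed directly from paired experimental and back-calculated ellipticities. The sketch below assumes both spectra are sampled at the same wavelengths; the function name and the example numbers are illustrative and are not data from this study.

```python
import math
from typing import Sequence


def nrmsd(theta_exp: Sequence[float], theta_cal: Sequence[float]) -> float:
    """Normalized root mean square deviation between two CD spectra.

    Both sequences must hold ellipticities at the same wavelengths, in the same order.
    """
    if len(theta_exp) != len(theta_cal):
        raise ValueError("spectra must have the same number of points")
    numerator = sum((e - c) ** 2 for e, c in zip(theta_exp, theta_cal))
    denominator = sum(e ** 2 for e in theta_exp)
    if denominator == 0:
        raise ValueError("experimental spectrum is identically zero")
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    # Made-up ellipticities for illustration only.
    experimental = [-10.0, -12.0, -11.0, -6.0]
    calculated = [-9.8, -12.3, -10.9, -6.2]
    value = nrmsd(experimental, calculated)
    print(f"NRMSD = {value:.3f}; close agreement: {value < 0.1}")
```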
Analytical Ultracentrifugation Assays-Sedimentation equilibrium was performed to determine the state of association of p16.7C and its derivatives as well as their DNA binding capacity. In the absence of DNA, the experiments, done over a broad range of protein concentrations (from 5 to 700 μM), were carried out at 20°C using different speeds (15,000 and 20,000 rpm) and wavelengths (230, 250, 280, and 290 nm) with short columns (80-100 μl) in an XL-A analytical ultracentrifuge (Beckman) equipped with a UV-visible optics detection system, using an An60Ti rotor and 12-mm double sector or six-hole Epon-charcoal centerpieces. All samples were in a buffer containing 50 mM Tris-HCl, pH 6.8, 200 mM NaCl, and 0.2 mM Tris(2-carboxyethyl)phosphine HCl. After the equilibrium scans, a high speed centrifugation run (40,000 rpm) was done to estimate the corresponding base-line offsets. Weight average buoyant molar masses of the proteins were determined by fitting data to a single species model using either a MATLAB program (kindly provided by Dr. Allen Minton, NIH) based on the conservation of signal algorithm (33) or the HeteroAnalysis program (34); both analyses gave essentially the same results. The reported errors of the best fit molar masses correspond to two S.D. values (95% confidence limits). The corresponding protein molar masses were determined from the experimental buoyant masses using 0.734 ml/g as the partial specific volume of p16.7C (calculated from the amino acid composition using the SEDNTERP program) (35).
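The conversion from buoyant mass to molar mass follows the standard relation bM = M(1 − v̄ρ), where v̄ is the partial specific volume and ρ the solvent density. The sketch below applies it with the 0.734 ml/g value quoted above; the solvent density of 1.0 g/ml is an assumed placeholder, since the buffer density is not stated here, and the function name is hypothetical.

```python
def molar_mass_from_buoyant(buoyant_mass: float,
                            partial_specific_volume: float = 0.734,
                            solvent_density: float = 1.0) -> float:
    """Convert a buoyant molar mass into a molar mass.

    Uses bM = M * (1 - vbar * rho). The default density (g/ml) is an assumption
    for illustration, not a value taken from the text.
    """
    buoyancy_factor = 1.0 - partial_specific_volume * solvent_density
    if buoyancy_factor <= 0:
        raise ValueError("buoyancy factor must be positive for a sedimenting solute")
    return buoyant_mass / buoyancy_factor


if __name__ == "__main__":
    # A buoyant mass of 5586 corresponds to about 21,000 Da under these assumptions.
    print(round(molar_mass_from_buoyant(5586.0)))
```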
Tracer sedimentation equilibrium (36,37) was used to determine the DNA binding capacity of p16.7C and its derivatives. In brief, a constant concentration (5 μM) of a 12-mer oligonucleotide labeled at its 5′-end with fluorescein isothiocyanate (5′-CCTGTGCACAGG-3′), alone or in the presence of increasing concentrations of p16.7C or one of its derivatives, was subjected to sedimentation equilibrium under the same conditions described above, except that the solute gradients were obtained at a wavelength (495 nm) at which only the fluorescently labeled DNA was detectable. From the analysis of these gradients (done as described above), the buoyant signal average molar masses (and hence the DNA binding capacity) of the different samples were determined. To estimate the complex binding stoichiometry in this case, the sedimentation equilibrium data of the mixtures were analyzed assuming the following linear approximation for the buoyant masses: $bM_{w,ij} = i(bM_{w,A}) + j(bM_{w,B})$, where $ij$ refers to the complex $A_iB_j$; $i$ and $j$ are the number of molecules of A (DNA) and B (protein p16.7C), respectively; and $bM_{w,A}$ and $bM_{w,B}$ are the buoyant molecular weights of pure A and pure B, respectively (38). To further characterize the DNA binding process (in terms of stoichiometry, affinity, and degree of cooperativity), the dependence of the degree of protein p16.7C-DNA complex formation upon protein concentration was described by the empirical Hill function: $B(L) = B_{max} (L/L_{50})^n / (1 + (L/L_{50})^n)$, where $B(L)$ is the amount of protein p16.7C bound to the DNA at a given free concentration of protein ($L$), $L_{50}$ is the free protein concentration at which binding reaches half of the maximal binding capacity $B_{max}$, and $n$ is a cooperativity parameter (39). Finally, the analysis of p16.7C binding to a longer 297-bp DNA fragment was carried out by sedimentation equilibrium, calculating the buoyant molar masses as described above. In this case, the differences in size between the protein and the DNA allow the design of low speed sedimentation equilibrium experiments to discriminate the gradients of DNA (free and complexed) and protein without labeling the DNA. Because the complexes formed were very large, a further analysis was performed by means of sedimentation velocity. The sedimentation velocity runs were carried out at 30,000 and 40,000 rpm and 20°C using the same experimental conditions and instrument as in the sedimentation equilibrium experiments. Sedimentation profiles were registered every 5 min at 260 nm. The sedimentation coefficient distributions were calculated by modeling of sedimentation velocity data using the c(s) method (40), as implemented in the SEDFIT program, from which the corresponding sedimentation coefficients (s values) were obtained. The reported errors of the s values correspond to two S.D. values (95% confidence limits).
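The two expressions used in the tracer analysis, the linear buoyant-mass approximation for a complex of i DNA and j protein molecules and the empirical Hill function, can be sketched as below. Function names and the numerical inputs are hypothetical and chosen only to illustrate the behavior of the formulas; they are not fitted values from this work.

```python
def complex_buoyant_mass(i: int, j: int, bm_dna: float, bm_protein: float) -> float:
    """Linear approximation: buoyant mass of a complex with i DNA and j protein molecules."""
    return i * bm_dna + j * bm_protein


def hill_binding(free_protein: float, b_max: float, l50: float, n: float) -> float:
    """Empirical Hill function B(L) = B_max (L/L50)^n / (1 + (L/L50)^n)."""
    ratio = (free_protein / l50) ** n
    return b_max * ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Illustrative parameters only: B_max = 6, L50 = 10 uM, n = 2.
    for free in (1.0, 10.0, 100.0):
        print(free, round(hill_binding(free, b_max=6.0, l50=10.0, n=2.0), 3))
```

At a free protein concentration equal to L50, the function returns exactly half of B_max regardless of n, which is a convenient sanity check when fitting.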

Rationale of Mutants Constructed-Resolution of the solution and crystal structures of the p16.7C dimer revealed that the primary dimer interface is formed by the third α-helix and its following extended C-terminal region of each monomer, which are oriented in an antiparallel fashion and pack against helices H1 and H3 of the other monomer (17) (see Fig. 1B). Within the central region of this dimeric interface, residues Trp-116 and Asn-120 (see Fig. 1, A and C) were likely to be involved in p16.7C dimerization. Thus, the side chains of residue Pro-87 of each monomer, which are 4 Å apart at the dimer interface, pack against the indole rings of the Trp-116 residue of the opposite monomer (see Fig. 1B). This particular arrangement in which the side chains of two proline residues are "locked" between the indole rings of two opposing tryptophan residues resembles that of a so-called Trp cage motif in which the indole rings of tryptophan residues are locked by side chains of prolines (41). Residue Asn-120 forms three interdimeric hydrogen bonds. One of these is between the side chains of Asn-120 of each monomer. The other two interdimeric hydrogen bonds are formed between the side chains of Arg-113 of either monomer with the backbone of Asn-120 of the other monomer.
FIGURE 1. Relevant features of protein p16.7 and its derivatives. A, protein sequence of p16.7, p16.7C, and p16.7C derivatives used in this study. Residue numbering is according to that of native p16.7. The structural domains of p16.7 are indicated at the top. TM, transmembrane. Residues constituting p16.7C helices H1, H2, and H3 (boxed) and those forming the C-terminal tail (C-term Tail) are indicated. The p16.7C residues subjected to mutagenesis are underlined and in boldface type. B, full-size and extended view of the Pro cage dimerization motif. C, two different views of the crystal structure of p16.7C corresponding to a 90° rotation around the x axis. The monomers are illustrated in yellow and blue, and the central and lateral regions of the main dimerization interface are indicated. The atomic coordinates and structure factors were obtained from the Protein Data Bank (code 1ZAE). The presentation was done using the PyMOL visualization system (available on the World Wide Web at pymol.sourceforge.net/). For clarity, Trp-116, Asn-120, and the last 9 amino acids (C-terminal tail) are colored red, green, and violet, respectively, in A, B, and C. In addition, Pro-87, which forms part of the Pro cage, is underlined, in boldface type, and colored orange in A and B.
The lateral regions of the dimeric interface are formed by the extended C-terminal region of each monomer (p16.7 residues 122-130); these regions are oriented antiparallel to each other and pack against helices H1 and H3 of the other monomer (see Fig. 1C). Although p16.7 residues 122-130 are not folded into an α-helical or β-sheet conformation, this region, except for the last two residues (positions 129 and 130), forms a stable and well defined extended structure in both the crystal and solution forms of the p16.7C dimer (17). Most probably, the multiple polar and hydrophobic intermolecular interactions of residues in this region are responsible for this stable extended structure.
To validate whether residues Trp-116 and Asn-120 and the extended C-terminal region are important for p16.7C dimerization and to analyze their relative contributions to dimerization, we constructed single substitution mutants for residues Trp-116 (pW116A) and Asn-120 (pN120W) and another mutant in which the last nine C-terminal residues of p16.7C are deleted (p16.7CΔ9). In pW116A, the indole ring interacting with the side chain of Pro-87 of the other monomer is absent, which is expected to fully disrupt formation of the aromatic cage. Mutation of residue Trp-116 was chosen instead of residue Pro-87, which a priori would be equally important for formation of the aromatic cage, because Pro-87 is located in the loop connecting helices H1 and H2, and mutation of this residue is predicted to introduce gross effects on the overall p16.7C structure. The change of Asn-120 to Trp in mutant pN120W would break all three hydrogen bonds. Finally, the deletion mutant p16.7CΔ9 will lack all intermolecular interactions mediated by the extended C-terminal tail. After introducing the desired mutations, the corresponding DNA fragments were cloned in a pET-28b expression plasmid, and the three mutant proteins were overexpressed and purified to homogeneity (see "Experimental Procedures").
CD and Fluorescence Spectroscopy: Thermally Induced Transition of p16.7C-Fig. 2A shows the far UV CD spectrum of p16.7C at 25°C. The recorded spectrum has two minima, one at 222 nm and the other at 208 nm, typical of a protein folded mainly in an α-helical structure. The helical content of p16.7C at 25°C as measured by CD spectroscopy is estimated to be 40%, indicating that ~36 residues of p16.7C adopt an α-helical structure, which corresponds well with the actual helical content of p16.7C obtained from crystallographic and NMR studies, demonstrating that 37 residues of p16.7C are in a helical structure (16,17).
Protein p16.7C forms high affinity dimers (16). We wished to determine whether dimerization is coupled to the folding of p16.7C or if the transition of folded dimers to unfolded monomers proceeds via two or more intermediate states. For this, we first monitored the thermally induced changes in the mean residue ellipticity of p16.7C at 222 nm. After thermal denaturation of the p16.7C protein to 90°C, the native signal was almost fully recovered upon cooling (Fig. 2B), showing that the transition is a virtually reversible process. The apparent midpoint transition temperatures ($T_m$) of the unfolding transition curves of p16.7C at 20 and 2 μM were ~61 and 48°C, respectively (Fig. 2C). The observation that the apparent $T_m$ increases with the p16.7C concentration is consistent with the formation of dimers and indicates that the p16.7C self-association process is partially responsible for stabilization of the dimeric form of protein p16.7C.
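An apparent midpoint temperature can be read from a melting curve by normalizing the signal between its native and unfolded plateaus and interpolating the temperature at which half of the change has occurred. The sketch below is one simple way to do this and is not necessarily the procedure used for Fig. 2; it assumes flat baselines taken from the first and last data points, and the synthetic curve is generated purely for demonstration.

```python
import math
from typing import List, Sequence, Tuple


def apparent_tm(temps: Sequence[float], signal: Sequence[float]) -> float:
    """Estimate the apparent Tm as the temperature where half of the signal change has occurred.

    Assumes the first and last points represent the native and unfolded plateaus.
    """
    native, unfolded = signal[0], signal[-1]
    span = unfolded - native
    if span == 0:
        raise ValueError("no signal change between first and last point")
    fraction = [(s - native) / span for s in signal]
    for k in range(1, len(fraction)):
        lo, hi = fraction[k - 1], fraction[k]
        if lo < 0.5 <= hi:
            # Linear interpolation between the two points bracketing the midpoint.
            t0, t1 = temps[k - 1], temps[k]
            return t0 + (0.5 - lo) * (t1 - t0) / (hi - lo)
    raise ValueError("no midpoint crossing found")


def synthetic_melt(tm: float, width: float = 2.0) -> Tuple[List[float], List[float]]:
    """Generate an idealized sigmoidal melting curve from 4 to 90 degrees (demo data only)."""
    temps = [float(t) for t in range(4, 91)]
    signal = [-20.0 + 15.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]
    return temps, signal


if __name__ == "__main__":
    for true_tm in (48.0, 61.0):
        temps, signal = synthetic_melt(true_tm)
        print(true_tm, round(apparent_tm(temps, signal), 2))
```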
The absence of measurable amounts of folded monomers was confirmed by intrinsic Trp fluorescence spectroscopy, monitoring the thermally induced variations in the environment of the single p16.7C tryptophan residue (Trp-116). The fluorescence emission spectrum of Trp presented a maximum at 335 nm, which is consistent with the burial of this residue in a nonpolar environment. At increasing temperatures, the emission intensity decreased and the maximum shifted to 350 nm, reflecting exposure of the Trp side chain to solvent as the protein unfolds (42). As shown in Fig. 2D, the transition curves of p16.7C obtained by CD and by intrinsic Trp fluorescence superpose perfectly. Consequently, the melting reaction must start with folded dimers and end with unfolded monomers. From the results presented in Fig. 2, it can be deduced that the folded monomer is not present at a concentration high enough to be detected at equilibrium, although this does not exclude the presence of transiently folded monomers as kinetic intermediates. Together, these results show that the transition of folded dimers into unfolded monomers occurs without thermodynamic intermediates, confirming that the process proceeds as a coupled folding-dimerization process.
Finally, the curves shown in Fig. 2, B-D, also show that the unfolding of p16.7C takes place within a narrow temperature range, demonstrating that the coupled folding-dimerization process of p16.7C is highly cooperative.
CD Spectroscopy and Thermally Induced Unfolding Transition of p16.7C Derivatives-CD spectroscopy is an appropriate method for determining variations in the helical content of proteins and was therefore used to analyze if the mutations introduced in p16.7C affect the secondary structure. Fig. 3A shows the far UV CD spectra of p16.7C and the mutant proteins p16.7CΔ9, pW116A, and pN120W at 20 μM and 25°C. In all cases, the spectra show minima at ~208 and 222 nm, indicating that they contain a significant amount of α-helical structure and, hence, that the introduced mutations do not grossly affect the secondary protein structure.
As described above, thermally induced transition of p16.7C corresponds to a highly cooperative coupled folding-dimerization process without stable intermediates. To study whether the p16.7C derivatives pW116A, pN120W, and p16.7CΔ9 display an altered thermal stability, their mean residue ellipticity at 20 μM was recorded at 222 nm as a function of temperature in parallel with that of p16.7C. The results of these analyses are plotted in Fig. 3B. Compared with p16.7C, all three mutants show a notable decrease in their midpoint transition temperatures, $T_m$. Thus, whereas the $T_m$ value at 20 μM of p16.7C is about 61°C, those of p16.7CΔ9, pN120W, and pW116A are 43, 41, and 35°C, respectively. Interestingly, although the midpoint transition temperature of pN120W is markedly decreased, the cooperativity of the unfolding process of this mutant protein is very similar to that of p16.7C. This situation is different for derivatives pW116A and p16.7CΔ9, in which the transition curve is very broad, showing that the cooperativity of the unfolding process of these two mutant proteins is affected. In other studies, it has been shown that a decrease in cooperativity combined with a diminished $T_m$ implies a decrease in the thermodynamic stability of a protein (see Ref. 43). Hence, residue Trp-116 and the C-terminal extended tail are important for the thermodynamic stability of protein p16.7C. In summary, whereas only the midpoint transition temperature is affected in mutant pN120W, both the midpoint transition temperature and the cooperativity of the unfolding process are affected in p16.7C derivatives pW116A and p16.7CΔ9.
Dimerization Is Strongly Affected in p16.7CΔ9 and pW116A and Moderately Affected in pN120W-To obtain a first qualitative indication of whether the p16.7C mutant proteins are affected in their dimerization ability, p16.7C and its derivatives were subjected to in vitro cross-linking analysis at a concentration of 10 μM using the bifunctional cross-linking agent DSS (see Fig. 4). As observed before (16), a band with a molecular weight corresponding to a dimer was obtained for p16.7C after DSS treatment and SDS-PAGE. Compared with p16.7C, a similar amount of dimers was observed for pN120W. However, clearly diminished levels of dimers were obtained for p16.7CΔ9 and pW116A after DSS treatment. These results are an indication that the C-terminal tail and Trp-116 play a more important role in dimerization than Asn-120. However, a note of caution should be made in the case of p16.7CΔ9, since the deletion in this mutant removes three lysine residues, the side chains of which are preferred substrates for cross-linking with DSS (44).

Characterization of Dimerization and Oligomerization Properties of p16.7C and Mutants by Analytical Ultracentrifugation-To gain further insight into the solution dimerization properties of p16.7C and the mutant proteins as well as their capacity to form oligomers, the proteins were subjected to sedimentation equilibrium analysis (Fig. 5). The open circles in Fig. 5 show the experimental gradient at sedimentation equilibrium (20,000 rpm) obtained for p16.7C at 5 μM, which is close to the lower limit of p16.7C concentration that could be assayed with this technique. The corresponding best fit gradient (solid line) yielded a weight average molar mass of 21,200 ± 800 Da that essentially corresponds to that of the protein dimer (21,000 Da). These results indicate that the $K_d$ value for the monomer-dimer equilibrium is much lower than micromolar. The corresponding best fit gradient of p16.7C obtained under the same conditions at 700 μM (solid line over filled circles) yielded a weight average molar mass of 36,500 ± 1100 Da, which is around 3.5 times the monomer value, showing therefore that p16.7C forms higher order oligomers at 700 μM. To estimate the concentration at which p16.7C dimers start to form oligomers, sedimentation equilibrium gradients were obtained at intermediate concentrations. The results of these experiments, summarized in the inset of Fig. 5 (solid circles), show that p16.7C oligomerization started at about 300 μM under these conditions. Next, the p16.7C mutant proteins were subjected to sedimentation equilibrium analyses under the same conditions (summarized in the inset of Fig. 5). These analyses showed that protein pN120W was also dimeric (molar mass 19,500 ± 800 Da) at the lowest concentration tested (5 μM) (Fig. 5, inset, open circles). However, in contrast with p16.7C, pN120W did not form higher order oligomers at elevated protein concentrations. The open inverted triangles in Fig. 5 show the experimental gradient at sedimentation equilibrium for pN120W at 650 μM. The corresponding best fit gradient (solid line) yielded a weight average molar mass of 18,900 ± 700 Da at this concentration. These results demonstrate that the ability of pN120W to form oligomers is affected.
In the case of mutant proteins pW116A and p16.7CΔ9, the sedimentation equilibrium studies show that their dimerization properties are severely affected. The estimated Kd values of pW116A and p16.7CΔ9 are ~40 and ~2 µM, respectively, which are several orders of magnitude higher than the Kd for p16.7C. In addition, no higher order oligomers were observed for either of these two mutant proteins within the concentration range studied (see open and closed inverted triangles in the inset of Fig. 5).
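The link between a Kd value and the weight average molar mass measured at a given loading concentration can be illustrated with a minimal sketch. It assumes a simple two-state monomer-dimer equilibrium with Kd defined as [M]^2/[D] in monomer units, ignores higher order oligomers and non-ideality, and takes the monomer mass as half of the 21,000 Da dimer mass quoted above. The function names and the 0.01 µM example value are hypothetical and are not taken from the analysis software or the fits used in this work.

```python
import math

# Assumption: monomer mass is half of the 21,000 Da dimer mass quoted in the text.
MONOMER_MASS_DA = 10_500.0


def monomer_conc(total_uM, kd_uM):
    """Free monomer concentration (uM) for 2 M <-> D, with Kd = [M]^2 / [D].

    Mass balance in monomer units: total = [M] + 2 [D] = [M] + 2 [M]^2 / Kd.
    The positive root of this quadratic is returned.
    """
    return kd_uM * (math.sqrt(1.0 + 8.0 * total_uM / kd_uM) - 1.0) / 4.0


def dimer_weight_fraction(total_uM, kd_uM):
    """Fraction of the total protein mass that is present as dimer."""
    return 1.0 - monomer_conc(total_uM, kd_uM) / total_uM


def weight_average_mass(total_uM, kd_uM, monomer_mass=MONOMER_MASS_DA):
    """Weight average molar mass (Da) of the monomer-dimer mixture."""
    # Mw = M1 * w_monomer + 2 * M1 * w_dimer = M1 * (1 + w_dimer)
    return monomer_mass * (1.0 + dimer_weight_fraction(total_uM, kd_uM))


if __name__ == "__main__":
    # 40 and 2 uM are the estimates quoted for pW116A and p16.7CΔ9;
    # 0.01 uM is a hypothetical stand-in for a "much lower than micromolar" Kd.
    for kd in (40.0, 2.0, 0.01):
        mw = weight_average_mass(5.0, kd)
        print(f"Kd = {kd:g} uM at 5 uM protein: Mw ~ {mw:.0f} Da")
```

Under these assumptions, a protein loaded at 5 µM only approaches the dimer mass when Kd is far below the loading concentration, which is the reasoning behind the statement that the Kd of p16.7C is much lower than micromolar.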
p16.7C Mutants Are Severely Affected in Their dsDNA Binding Capacities. Protein p16.7C was shown to have nonspecific ssDNA and dsDNA binding activity (16). Three different approaches were used to address the question of whether the DNA binding properties of the p16.7C mutants are affected. In the first approach, the DNA binding capacity of p16.7C and the mutant proteins was analyzed by gel retardation analyses. For this, the 297-bp right end fragment of the φ29 genome was end-labeled and incubated either directly (dsDNA) or after heat denaturation (ssDNA) with increasing amounts of protein, after which the samples were subjected to polyacrylamide gel electrophoresis under native conditions. The results obtained with dsDNA molecules are presented in Fig. 6. As observed before (16), part of the DNA molecules and all DNA molecules became retarded in the presence of 3.75 and 15 µM p16.7C, respectively (Fig. 6, A and B). However, DNA binding of all three p16.7C mutant proteins appeared to be highly affected (i.e. at the highest concentration tested (15 µM), no DNA molecules were retarded by pW116A and only trace amounts were retarded by pN120W and p16.7CΔ9). Similar results were obtained in gel retardation assays using ssDNA (not shown).
In the second approach, binding of p16.7C and the mutants to dsDNA was studied by DNase I digestion. An advantage of this approach over the gel retardation assays is that binding of the proteins to DNA is analyzed in solution. Thus, the nucleoprotein complexes formed are not subjected to gel electrophoresis, in which relatively weak DNA-protein interactions may become lost during migration in the gel matrix. Accordingly, a 50 µM concentration of p16.7C or each of the mutant proteins was incubated with end-labeled DNA molecules corresponding to the 297-bp right end fragment of the φ29 genome. Next, the nucleoprotein complexes were challenged by DNase I digestion, after which the DNA fragments were fractionated through denaturing polyacrylamide gels (see Fig. 6C). In the absence of protein, a characteristic DNase I digestion pattern is observed, reflecting the fluctuations of inherent susceptibilities of the dsDNA fragment for DNase I digestion. Similar to previous results (17), the entire dsDNA fragment was fully protected from DNase I digestion in the presence of 50 µM p16.7C, strongly indicating that the nucleoprotein complexes formed under these conditions consist of continuous arrays of protein covering the entire DNA fragment. However, DNase I digestion patterns highly similar to that observed in the absence of protein were observed in the presence of 50 µM p16.7CΔ9, pN120W, or pW116A. Similar results were obtained using ssDNA fragments in which nucleoprotein complexes were challenged with micrococcal nuclease (not shown). The results of this approach also indicate that the DNA binding capacity of all three p16.7C mutants is highly affected.
In theory, the nucleases may disrupt weak interactions of the p16.7C mutant proteins with DNA. To study binding of p16.7C and its derivatives to DNA directly, we performed tracer sedimentation equilibrium experiments at several protein-DNA ratios. The dsDNA binding patterns observed at increasing protein concentrations in retardation and footprinting assays indicate that p16.7C binds the rather long dsDNA fragments (297 bp) in a cooperative manner (13,16,17) (Fig. 6, A and B; see below). The crystal structure of the p16.7C-dsDNA complex showed that one tridimeric p16.7C unit binds 7-8 bp (18). To avoid cooperative binding of multiple tridimeric p16.7C units to the same DNA fragment, tracer sedimentation equilibrium experiments were performed with 12-bp-long, 5′-end fluorescently labeled fragments. The open circles in Fig. 7 show the experimental gradient at sedimentation equilibrium (10,000 rpm) of a 5 µM concentration of this labeled DNA fragment in the absence of protein, which yielded a best fit buoyant molar mass of 4500 ± 500 (solid line), compatible with the molar mass of the fragment. The sedimentation equilibrium gradients of mixtures of 5 µM DNA fragment and either 50 or 200 µM p16.7C yielded no change (4400 ± 400) or a small increase (5200 ± 800) in the signal average buoyant molar mass of the tracer, respectively (not shown). However, a steeper gradient was observed upon increasing the p16.7C concentration to 400 µM (closed circles in Fig. 7), yielding a buoyant molar mass of 15,500 ± 1000. This clearly demonstrates that p16.7C binds the 12-bp DNA at 400 µM. Interestingly, the concentration at which p16.7C is able to bind this DNA is similar to the concentration at which p16.7C showed incipient oligomerization (see above), strongly indicating that oligomerization and DNA binding are coupled processes.
The dependence of complex formation on protein concentration, analyzed with a Hill function, is best described by a process with positive cooperativity (Hill n > 1.8), in which the maximum binding capacity is around three protein dimers per DNA, and the protein concentration required for half-saturation is about 350 µM (Fig. 7, inset).
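As an illustration of the shape of such a binding curve, the following minimal sketch evaluates a standard Hill function with the parameters quoted above. It assumes the common form bound = Bmax · c^n / (K^n + c^n) and uses n = 1.8 as an illustrative lower bound, since only n > 1.8 is reported. The function name and default arguments are hypothetical, and the sketch does not reproduce the actual fit to the data in Fig. 7.

```python
def hill_binding(conc_uM, b_max=3.0, k_half_uM=350.0, n=1.8):
    """Protein dimers bound per DNA according to a Hill function.

    b_max     : maximum binding capacity (about three dimers per DNA in the text)
    k_half_uM : protein concentration at half-saturation (about 350 uM in the text)
    n         : Hill coefficient (assumed 1.8 here; the text reports n > 1.8)
    """
    c_n = conc_uM ** n
    return b_max * c_n / (k_half_uM ** n + c_n)


if __name__ == "__main__":
    # Protein concentrations used in the tracer sedimentation equilibrium assays.
    for conc in (50.0, 200.0, 400.0):
        print(f"{conc:5.0f} uM p16.7C -> {hill_binding(conc):.2f} dimers per DNA")
```

With n > 1, occupancy stays low at 50 µM and rises steeply between 200 and 400 µM, in qualitative agreement with the buoyant molar masses reported for the 12-bp tracer at these concentrations.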
Sedimentation equilibrium assays were also carried out with p16.7C mutants pN120W, pW116A, and p16.7CΔ9. Contrary to p16.7C, the sedimentation gradients of the 12-bp DNA mixed with any of these mutant proteins at 400 µM were almost identical to the one obtained for the 12-bp fragment alone. Thus, buoyant molar masses of 4200 ± 400, 4000 ± 600, and 3900 ± 500 were obtained for pN120W, pW116A, and p16.7CΔ9, respectively. Fig. 7 shows the results for pN120W and pW116A. Altogether, these results conclusively show that DNA binding of the p16.7C mutants is severely affected.
Fig. 6 legend (fragment, DNase I footprinting): The same end-labeled DNA fragment used in the gel mobility assays was used in the DNase I footprinting assays. The labeled probe was preincubated in the absence (negative control, −) or presence of high amounts (50 µM) of the indicated protein. After complex formation, samples were treated with DNase I as described under "Experimental Procedures," and the DNA products were fractionated through polyacrylamide gels under denaturing conditions. Note that the amount of DNA used in this assay was about 4-fold higher than that used in the gel mobility shift assays. Essentially the same results were obtained in duplicate experiments of the gel mobility and DNase I footprinting assays.
Finally, sedimentation velocity assays were performed with p16.7C in combination with the 297-bp DNA that was employed in the retardation and footprinting assays, and the results were confirmed by parallel sedimentation equilibrium assays. Thus, whereas p16.7C started to bind the 297-bp fragment (buoyant molar mass = 78,000, which is compatible with the size expected for this DNA) at 10 µM, nucleoprotein complexes of high molecular weight were formed at 30 µM (buoyant mass around 200,000, compatible with complexes having at least 20 protein dimers bound), and complexes of even higher molecular weights were formed at 60 µM (buoyant masses above 400,000, which can accommodate at least 50 protein dimers) (not shown). In summary, the sedimentation velocity and equilibrium assays show on the one hand that p16.7C binds the 297-bp fragment at much lower concentrations than those needed to observe binding to the 12-bp DNA (see above). On the other hand, the large nucleoprotein complexes formed with the 297-bp fragment are polydisperse in size. This precludes a comprehensive quantitative analysis of the binding process. Nevertheless, the fact that a modest increase in the protein concentration (within a range of less than 10-fold) caused a large increase in the mean size of the nucleoprotein complexes formed provides solid evidence that tridimeric p16.7C units also bind longer DNA fragments in an apparently cooperative manner.

DISCUSSION
Besides analyzing some basic features of the dimerization properties of p16.7C, which is considered the wild type protein in this work, we have studied the role in protein dimerization of Trp-116, Asn-120, and the last nine C-terminal residues, which constitute the C-terminal tail. In addition, the effects of these mutations on oligomerization and functionality of the protein (i.e. DNA binding activity) were assessed.
Analyses of the spectroscopic thermally induced changes showed that the transition of p16.7C corresponds to a reversible and cooperative coupled folding-dimerization process (i.e. that the reversible transition of folded dimers into unfolded monomers occurs cooperatively and without thermodynamic intermediates). The solution and crystal structures of p16.7C showed that the side chains of p16.7 residue Pro-87 of each monomer pack against the indole rings of the Trp-116 residue of the other monomer and suggested that this particular arrangement may be important for the high dimerization affinity of p16.7C. Substitution of residue Trp-116 by Ala would fully disrupt this arrangement. Analyses showed that dimerization, oligomerization, and functionality of mutant protein pW116A were severely affected. Thus, (i) the midpoint transition temperature of pW116A at 20 µM was decreased by 26°C, and the cooperativity of the transition process was virtually lost; (ii) in vitro cross-linking and ultracentrifugation studies demonstrated that the dimerization of this mutant was severely affected; and (iii) various approaches showed that pW116A was unable to bind DNA. Together, these results show that p16.7 residue Trp-116 plays a crucial role for dimerization and by extension (see below) for oligomerization and functionality of the protein.
The specific arrangement in which the Trp-116 residues keep the side chains of the Pro-87 residues in a locked state resembles that of the so-called Trp cage, in which the side chain of a Trp residue is locked into position by the side chains of proline residues. In both arrangements, the stability of the resulting environment is obtained by generating a hydrophobic patch formed by the side chains of aromatic residues. We therefore name this specific arrangement present in p16.7 a Pro cage. As far as we know, this is the first functional Pro cage described. The previously described Trp cage results from the cooperative formation of an intramolecular hydrophobic local environment that effectively stabilizes the folding of small monomeric peptides (41). In fact, this kind of motif is significantly more stable than any other miniprotein described. The arrangement of the p16.7 Pro cage, besides constituting a novel functional aromatic cage, differs fundamentally from the previously described Trp cage in that it plays a crucial role in stabilization of a folded dimer instead of stabilizing the folding of a monomer.
The importance of the Pro cage for dimerization of p16.7C may be further emphasized by the results obtained with mutant pN120W, which, like the Pro cage, makes intermolecular interactions at the central region of the primary dimer interface. Substitution of Asn-120 by a Trp residue would prevent the three interdimeric hydrogen bonds made by Asn-120 in the dimer. Nevertheless, this mutation affected the affinity of the dimers only moderately as demonstrated by the in vitro cross-linking and ultracentrifugation experiments. Thus, although residues Trp-116 and Asn-120 both locate to the central region of the dimer interface, mutation of Trp-116 affects dimerization far more than mutation of Asn-120.
Besides providing insights into dimerization, the results obtained with pN120W are also important in that they provide experimental evidence supporting the view that p16.7C oligomerization and DNA binding are coupled processes. The results obtained in this work show that pN120W, like p16.7C, forms stable dimers with a dissociation constant much lower than 1 µM at 25°C. Contrary to p16.7C, however, pN120W is unable to form oligomers at high protein concentration (Fig. 5) and is unable to bind DNA (Figs. 6 and 7). This strongly indicates that p16.7C oligomerization is required for DNA binding. The conclusion that p16.7C oligomerization and DNA binding are coupled processes is furthermore supported by the tracer sedimentation equilibrium assays using the 12-bp DNA probe. Thus, p16.7C oligomerization and DNA binding were observed at similar protein concentrations (400 µM).
The inability of pN120W to form oligomers may be explained as follows. Protein p16.7C dimers are characterized by a striking self-complementarity (17). NMR studies of the apo form of p16.7C and the crystal structure of the p16.7C-DNA complex showed that the interdimeric interface contains multiple contacts dispersed over the lateral complementary sides of the p16.7C dimers (17,18). However, p16.7C oligomerization is only observed at high concentration (>300 µM; see Fig. 5), demonstrating that oligomerization is a low affinity process. Substitution of Asn by the bulkier Trp residue may introduce (subtle) structural changes at the interdimeric interface, which are likely to have drastic effects on p16.7C oligomerization, although we cannot fully exclude the possibility that this mutation independently affects both the oligomerization and the DNA binding activities of the protein.
Except for the last two residues (positions 129 and 130), the C-terminal tail of each p16.7C monomer (residues 122-130) forms a stable and well defined extended structure that makes multiple intermolecular interactions with helices H1 and H3 of the other monomer (17). The importance of the C-terminal tail for p16.7C dimerization, oligomerization, and functionality was studied using the deletion variant p16.7CΔ9, which lacks the last 9 residues. In various aspects, the absence of the C-terminal tail caused effects similar to those observed with mutant pW116A. First, compared with p16.7C, a notable decrease was observed in the midpoint transition temperature and cooperativity; second, ultracentrifugation studies demonstrated that dimerization and oligomerization of p16.7CΔ9 were severely affected; and third, p16.7CΔ9 was unable to bind DNA. Together, these results show that the C-terminal tail of p16.7C is crucial for protein dimerization and oligomerization. Various lines of evidence show that DNA binding is coupled to protein oligomerization (see above), and it was therefore not surprising to find that p16.7CΔ9 was unable to bind DNA. However, in addition to the effects on oligomerization, the defects in DNA binding observed with p16.7CΔ9 are also likely to be a direct consequence of the deletion of the C-terminal tail, which includes two of the three p16.7C residues (Lys-123 and Arg-126) that make direct contacts with the dsDNA phosphate backbone (18).
Finally, it is worth mentioning that p16.7C dimers can form two different types of oligomers. First, the crystal structure of the p16.7C-dsDNA complex showed that a single DNA binding unit is formed by three p16.7C dimers that are arranged in such a way that they form a roughly half-circular positively charged DNA binding surface that interacts with the phosphate backbone of dsDNA (18). The sedimentation equilibrium gradients showed that p16.7C started to form oligomers at a concentration of about 300 µM (Fig. 5), and the tracer sedimentation equilibrium assays using the 12-bp DNA probe showed that p16.7C bound this short DNA probe at about 400 µM (Fig. 7). However, solution biophysical approaches like sedimentation velocity and sedimentation equilibrium assays showed that p16.7C binds longer DNA fragments at much lower concentrations and that binding to the longer DNA fragments occurs cooperatively. This cooperativity explains the higher DNA binding activity of p16.7C for longer DNA fragments. Second, in the crystal structure of the p16.7C-dsDNA complex, p16.7C dimers belonging to different tridimeric p16.7C units interact through a relatively large surface area (~900 Å²) located at the outer edges of the elongated p16.7C dimer (18), suggesting that this constitutes the intertridimeric p16.7C surface important for higher order multimerization.
In summary, by structural and functional analyses of site-directed and deletion mutants of the functional domain of p16.7, we (i) have demonstrated that residue Trp-116 and the extended C-terminal tail are crucial for the formation of high affinity p16.7C dimers and (ii) have provided evidence that p16.7C oligomerization and DNA binding are coupled processes (i.e. that functionality of p16.7C requires oligomerization). Another important contribution of this work is the identification of a novel dimerization motif that we call the Pro cage. This motif involves two Trp and two Pro residues that together form an intermolecular hydrophobic patch as a consequence of the encapsulation of the side chains of Pro residues of one monomer in a sheath of aromatic rings of Trp residues of the other monomer.