Biophysical Characterization of the DNA Binding Domain of gpNu1, a Viral DNA Packaging Protein*

Terminase enzymes are common to double-stranded DNA viruses. These enzymes “package” the viral genome into a pre-formed capsid. Terminase from bacteriophage λ is composed of gpA (72.4 kDa) and gpNu1 (20.4 kDa) subunits. We have described the expression and biochemical characterization of gpNu1ΔK100, a construct comprising the N-terminal 100 amino acids of gpNu1 (Yang, Q., de Beer, T., Woods, L., Meyer, J., Manning, M., Overduin, M., and Catalano, C. E. (1999) Biochemistry 38, 465–477). Here we present a biophysical characterization of this construct. Thermally induced loss of secondary and tertiary structures is fully reversible. Surprisingly, although loss of tertiary structure is cooperative, loss of secondary structure is non-cooperative. NMR and limited proteolysis data suggest that ≈30 amino acids of gpNu1ΔK100 are solvent-exposed and highly flexible. We therefore constructed gpNu1ΔE68, a protein consisting of the N-terminal 68 residues of gpNu1. gpNu1ΔE68 is a dimer with no evidence of dissociation or further aggregation. Thermally induced unfolding of gpNu1ΔE68 is reversible, with concomitant loss of both secondary and tertiary structure. The melting temperature increases with increasing protein concentration, suggesting that dimerization and folding are, at least in part, coupled. The data suggest that gpNu1ΔE68 represents the minimal DNA binding domain of gpNu1. We further suggest that the C-terminal ≈30 residues in gpNu1ΔK100 adopt a pseudo-stable α-helix that extends from the folded core of the protein. A model describing the role of this helix in the assembly of the packaging apparatus is discussed.

Replication of DNA proceeds through a rolling circle mechanism that gives rise to linear concatemers of the viral genome linked in a head-to-tail fashion (22, 23). Packaging of viral DNA requires the excision of an individual genome from the concatemer, and packaging of the 48.5-kb duplex within the capsid. Genome packaging by terminase has been described in detail (21, 24-27) and is summarized here. Packaging initiates with the assembly of the holoenzyme at a cos site in the concatemer. This site represents the junction between the left and right ends of individual genomes within the concatemer (Fig. 1A). Site-specific assembly at cos is mediated by cooperative gpNu1 binding to three repeated R-elements in the cosB subsite of cos. Assembly of gpNu1 at cosB promotes the assembly of a gpA dimer symmetrically disposed at cosN, yielding a stable prenicking complex. Site-specific nicking of the duplex at cosN, followed by an ATP-dependent separation of the nicked strands, yields complex I, the next stable intermediate. This nucleoprotein complex next binds an empty capsid, which triggers the transition to a mobile, ATP-driven translocation complex that inserts DNA into the capsid. Upon arrival at the next downstream cos site, terminase again nicks the duplex, and strand separation results in release of the DNA-filled capsid and re-generation of complex I.
The gpA subunit of terminase appears to possess all of the catalytic activities required for genome packaging, but the efficiency of each reaction is strongly stimulated by the smaller gpNu1 subunit (7, 8, 12-14, 16, 28). Moreover, gpNu1 is required for specific and high affinity gpA DNA binding interactions (29) and likely contributes to the exceptional stability of the pre-nicking complex and complex I (30,31).
Our laboratory is interested in the biochemical and biophysical mechanisms of DNA packaging by phage terminase. Central to the packaging process is the cooperative assembly of gpNu1 and gpA at cos (Fig. 1A). Defining this assembly process at a molecular level requires an understanding of the structural features governing physical interaction between the enzyme subunits and with DNA. Toward this end, we have sought to define the properties governing intrinsic and cooperative DNA binding by gpNu1. Unfortunately, the isolated gpNu1 subunit shows a strong tendency to aggregate upon concentration (9, 28, 32-34), a feature that has hampered structural and biophysical characterization of the protein.
We recently described the construction and characterization of two deletion mutants of gpNu1, gpNu1ΔP141 (35) and gpNu1ΔK100 (1), which are proteins truncated at Pro141 and Lys100 of full-length gpNu1, respectively. Studies of these constructs led to a model where the C-terminal 40 residues of the protein are required for interactions with the gpA subunit to form a catalytically competent holoenzyme complex (Fig. 1B). Residues 100-140 promote self-association interactions that mediate cooperative DNA binding. The N-terminal 100 residues of the protein represented by gpNu1ΔK100 contain the putative helix-turn-helix DNA binding motif postulated to play a direct role in DNA binding. Indeed, the construct is folded in solution and binds cos-containing DNA with reasonable specificity (1). Preliminary NMR experiments suggested that gpNu1ΔK100 would be amenable to structural studies (1).
Here we present a detailed biophysical characterization of gpNu1ΔK100. These studies suggest that the construct consists of a functional N-terminal domain that possesses a pseudo-stable C-terminal helix extending from the folded core of the protein. We further describe the construction and characterization of a shorter construct that clearly demonstrates that the unusual biophysical characteristics of gpNu1ΔK100 result from this extended C-terminal helix. The biological significance of these results is discussed.

EXPERIMENTAL PROCEDURES
Materials and Methods-Tryptone, yeast extract, and agar were purchased from DIFCO. Restriction enzymes were purchased from Promega. DEAE-Sepharose FF and SP-Sepharose FF chromatography resins were purchased from Amersham Pharmacia Biotech. Guanidinium hydrochloride was purchased from Mallinckrodt. All other materials were of the highest quality commercially available.
Bacterial cultures were grown in shaker flasks utilizing a New Brunswick Scientific series 25 incubator-shaker. All protein purifications utilized an Amersham Pharmacia Biotech fast-protein liquid chromatography system that consisted of two P500 pumps, a GP250-plus controller, a V7 injector, and a Uvicord SII variable-wavelength detector. UV-visible absorbance spectra were recorded on a Hewlett-Packard HP8452A spectrophotometer. Fluorescence spectra were recorded at room temperature on a PTI Quanta-Master spectrofluorometer. A protein concentration of 10 μg/ml in 10 mM potassium phosphate buffer, pH 7.4, was used, and a buffer blank was subtracted from the fluorescence spectrum. Circular dichroism (CD) spectra were recorded on an Aviv model 62DS circular dichroism spectropolarimeter equipped with a Brinkmann Lauda RM6 circulating water bath and a thermostated cell holder. Near-UV CD spectra utilized a protein concentration of 1 mg/ml in a 0.1-cm strain-free cuvette. Data were typically collected between 250 and 350 nm at 0.5-nm intervals using a bandwidth of 1.5 nm and a dwell time of 30 s. Far-UV CD spectra utilized a protein concentration of 100 μg/ml in a 0.1-cm strain-free cuvette. Data were typically collected from 180 to 260 nm at 0.5-nm intervals using a bandwidth of 1.5 nm and a dwell time of 30 s. The raw spectra were converted to molar ellipticity using
$$[\theta] = \frac{\theta_{obs} \cdot \mathrm{MRW}}{10 \cdot b \cdot c}$$
where $[\theta]$ is the molar ellipticity (degrees·cm²/dmol), $\theta_{obs}$ is the ellipticity recorded by the instrument (millidegrees), MRW is the mean residue weight (formula weight divided by the total number of residues in the protein), $b$ is the cell path length (cm), and $c$ is the protein concentration in mg/ml (36). Matrix-assisted laser desorption time-of-flight mass spectra were obtained from the University of Colorado Health Sciences Center Macromolecular Resource Center. Automated DNA sequence analysis was performed by the University of Colorado Cancer Center Macromolecular Resources Core facility.
Both strands of the duplex were examined to ensure the expected DNA sequence. Prediction of protein secondary structures based upon primary sequence data was performed by the method of Chou and Fasman (37), using the DNASIS program (Macintosh version 2.0). Calculation of protein secondary structures based upon the far-UV CD data was performed using the SELCON program.
Bacterial Strains, DNA Preparation, and Protein Purification-Escherichia coli BL21(DE3) cells were a generous gift of D. Kroll (University of Colorado Health Sciences Center, Denver, CO). All synthetic oligonucleotides used in this study were purchased from Life Technologies, Inc. and were used without further purification. Plasmids pSF1 and pAFP1, kindly provided by M. Feiss (University of Iowa, Iowa City, IA), were purified from the E. coli cell lines C600[pSF1] and JM107[pAFP1], respectively, using Qiagen DNA prep columns. All of our purified proteins were homogeneous as determined by SDS-PAGE and densitometric analysis using a Molecular Dynamics laser densitometer and the ImageQuaNT data analysis package. Unless otherwise indicated, protein concentrations were determined spectrally using millimolar extinction coefficients (1, 32).
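The conversion of raw CD readings to molar ellipticity described above for the CD spectra can be sketched as follows. This is a minimal illustration that assumes the standard mean-residue form, [θ] = θobs·MRW/(10·b·c), with b in cm and c in mg/ml. The function names and the example reading of −10 millidegrees are hypothetical; the monomer mass (7,803 Da), residue count (68), path length, and protein concentration are the values quoted in this paper.

```python
def mean_residue_weight(formula_weight_da: float, n_residues: int) -> float:
    """Formula weight divided by the total number of residues."""
    return formula_weight_da / n_residues


def molar_ellipticity(theta_obs_mdeg: float, mrw: float,
                      path_cm: float, conc_mg_per_ml: float) -> float:
    """Convert an observed ellipticity (millidegrees) to deg*cm^2/dmol.

    Assumes the standard relation [theta] = theta_obs * MRW / (10 * b * c).
    """
    return theta_obs_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)


# gpNu1(delta)E68 monomer: 7,803 Da and 68 residues (values from the paper).
mrw_e68 = mean_residue_weight(7803.0, 68)

# Hypothetical far-UV reading of -10 mdeg at 100 ug/ml (0.1 mg/ml), 0.1-cm cell.
theta = molar_ellipticity(-10.0, mrw_e68, path_cm=0.1, conc_mg_per_ml=0.1)
print(f"MRW = {mrw_e68:.2f} Da/residue; [theta] = {theta:.0f} deg cm^2/dmol")
```

Keeping the mean residue weight as a separate step makes it easy to reuse the same conversion for constructs of different length.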
Construction of pNu1ΔE68 -A truncated Nu1 gene was amplified by PCR using pSF1 as a DNA template. This plasmid contains the wild-type Nu1 gene cloned into a pBR322 background (38). Primers were designed such that EcoRI and BamHI restriction sequences were present at the 5′ and 3′ ends, respectively, of the PCR product. The primer sequences used to amplify pNu1ΔE68 were as follows. Forward primer: 5′-CCT CTC CCT TTC TCC GAA TTC ATG GAA GTC AAC AAA AAG C-3′; reverse primer: 5′-CTT CCT GGA TTC TTA TTC TTC AAC CTC CCG GCG-3′. The EcoRI and BamHI restriction sequences in the above primers are indicated in italics, whereas the f-Met (forward primer) and stop (reverse primer) codons are shown in boldface. Sequences complementary to the Nu1 gene are underlined. The stop codon present in the reverse PCR primer yields, upon amplification, a truncated Nu1 gene that expresses only the first 68 amino acids of the protein. PCR amplification, isolation of the PCR product, and construction of the overexpression plasmid (pNu1ΔE68) were performed as described previously (1, 28, 35).
Expression and Purification of gpNu1ΔE68 -Four liters of 2X-YT media containing 50 μg/ml ampicillin, 25 mM potassium phosphate, pH 7.5, and 5 mM glucose were inoculated with 40 ml of an overnight culture of BL21(DE3)[pNu1ΔE68] derived from an isolated colony. The cultures were maintained at 37°C until an optical density of 1.0 (A600 nm) was obtained, at which point isopropyl-1-thio-β-D-galactopyranoside (1.2 mM) was added. The cells were maintained at 37°C for an additional 3 h and then harvested by centrifugation. Unless otherwise indicated, all subsequent steps were performed at 0-4°C. The cell pellet was resuspended in ice-cold buffer A (20 mM Tris, pH 8.0, 2 mM EDTA, 7 mM 2-mercaptoethanol, and 10% glycerol) containing 100 mM NaCl, and the cells were disrupted by sonication. Insoluble cellular debris was removed by centrifugation (12,000 × g, 30 min), and solid ammonium sulfate was added to the clarified supernatant to 50% saturation. Insoluble protein was removed by centrifugation (12,000 × g, 30 min), and proteins were then precipitated with the addition of ammonium sulfate to 90% saturation followed by centrifugation. gpNu1ΔE68 was found in the 50-90% ammonium sulfate-precipitated fractions.
The ammonium sulfate pellet was taken into buffer A and, after dialysis against the same buffer, loaded onto a DEAE-Sepharose column (200 ml) also equilibrated with buffer A. The column was developed with a salt gradient, with gpNu1ΔE68 eluting at ≈300 mM NaCl. Column fractions were examined by SDS-PAGE, and the appropriate fractions were pooled, dialyzed against buffer A, and loaded onto an SP-Sepharose column equilibrated with the same buffer. The column was developed with a salt gradient, with gpNu1ΔE68 eluting at ≈280 mM NaCl. As before, column fractions were examined by SDS-PAGE, and the appropriate fractions were pooled, dialyzed against buffer A containing 20% glycerol, and stored at −80°C. As required, the proteins were concentrated and/or buffer exchanged using an Ultrafree-15 centrifugal filter device according to the manufacturer's instructions (Millipore).
Sedimentation Equilibrium Analysis-Experiments were carried out with a Beckman XL-A analytical ultracentrifuge equipped with a Ti-60 four-hole rotor with six-channel, 12-mm path-length centerpieces. Absorbance optics were used throughout. Three different protein concentrations were used with ratios of 10:3:1, with the highest protein concentration being 150 μM (≈1.2 mg/ml). Samples were dialyzed against the appropriate buffer and then diluted to the concentrations indicated in each experiment. Sample volumes were 100 μl, with the inert oil FC-43 used to displace samples from the base of the cells. Samples were allowed to equilibrate at 20,000, 30,000, and 40,000 rpm. Samples were judged to be at equilibrium by successive subtraction of scans. The density of each buffer solution was calculated based on the salt composition and equilibrium temperature. The partial specific volume of gpNu1ΔE68 was calculated by summing the partial specific volumes of the individual amino acids (39). Data chosen for analysis had an absorbance between 0.1 and 1.5 optical density units. Each data point was an average of four scans taken every 0.001 cm. Data were selected for analysis using the program REEDIT (generously provided by Dr. David Yphantis). Individual and simultaneous analyses of nine channels (three concentrations at three speeds) were carried out to resolve assembly stoichiometry. Data were analyzed using the appropriate functions by non-linear least-squares parameter estimation (40) to determine the best-fit model-dependent parameters that minimize the variance. The program NONLIN was used (Ref. 41; kindly donated by Dr. David Yphantis). Confidence intervals (67%) correspond to approximately one standard deviation. Non-ideality was not considered, because there was no evidence for non-ideal effects.
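Models incorporating different assembly stoichiometries were based upon the general equation,
$$Y(r) = \delta + \alpha \exp\left[\sigma\left(\frac{r^2}{2} - \frac{r_0^2}{2}\right)\right] + \alpha^N K_N \exp\left[N\sigma\left(\frac{r^2}{2} - \frac{r_0^2}{2}\right)\right]$$
where $Y(r)$ is the absorbance at radius $r$, $\delta$ the baseline offset, and $\alpha$ the monomer absorbance at reference radius $r_0$. $\sigma$ is the reduced molecular weight [$\sigma = M(1 - \bar{v}\rho)\omega^2/RT$], $N$ is the stoichiometry of the reaction, and $K_N$ is the association constant of the reaction $NM \rightleftharpoons M_N$.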
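A minimal sketch of the sedimentation equilibrium model may help make the roles of the parameters concrete. It assumes the conventional form of the equation used with NONLIN-style analysis, Y(r) = δ + α·exp[σ(r² − r0²)/2] + α^N·K_N·exp[Nσ(r² − r0²)/2], in cgs units. The function names, the partial specific volume (0.73 ml/g), and the solvent density (1.0 g/ml) are illustrative assumptions and are not values reported here.

```python
import math

R_CGS = 8.314e7  # gas constant in erg/(mol*K), matching cm and g/mol units


def reduced_molecular_weight(mass_g_per_mol: float, vbar_ml_per_g: float,
                             rho_g_per_ml: float, rpm: float,
                             temp_k: float) -> float:
    """sigma = M(1 - vbar*rho)*omega^2 / (R*T), in cm^-2."""
    omega = rpm * 2.0 * math.pi / 60.0  # angular velocity in rad/s
    buoyant_mass = mass_g_per_mol * (1.0 - vbar_ml_per_g * rho_g_per_ml)
    return buoyant_mass * omega ** 2 / (R_CGS * temp_k)


def absorbance(r_cm: float, r0_cm: float, delta: float, alpha: float,
               sigma: float, n: int = 1, k_n: float = 0.0) -> float:
    """Absorbance at radius r for a monomer plus an optional N-mer term."""
    x = (r_cm ** 2 - r0_cm ** 2) / 2.0
    y = delta + alpha * math.exp(sigma * x)
    if k_n > 0.0:
        # Self-association term for the reaction N M <-> M_N
        y += (alpha ** n) * k_n * math.exp(n * sigma * x)
    return y


# Single-species example: a 2 x 7,803 Da dimer at 20,000 rpm and 4 degrees C.
# vbar = 0.73 ml/g and rho = 1.0 g/ml are assumed, illustrative values.
sigma_dimer = reduced_molecular_weight(2 * 7803.0, 0.73, 1.0, 20000.0, 277.15)
profile = [absorbance(r, 6.9, 0.0, 0.2, sigma_dimer) for r in (6.9, 7.0, 7.1)]
print(sigma_dimer, profile)
```

In a single-species fit, K_N is fixed at zero and σ is the fitted parameter, from which an apparent molecular mass follows once the partial specific volume and solvent density are specified.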
Thermal Stability Studies-Thermally induced protein denaturation experiments were performed as described previously (32, 33, 35). Each data set represents the average of at least two independent experiments. The fraction of protein in the denatured state ($F_D$) was determined using
$$F_D = \frac{\theta_T - \theta_N}{\theta_D - \theta_N}$$
where $\theta_T$ is the ellipticity at temperature $T$, and $\theta_N$ and $\theta_D$ represent the ellipticity for the native and denatured protein, respectively. Baseline corrections were not performed, so that temperature-induced alterations in the pre-transition baseline slopes remain visible. The unfolding curves were analyzed using a complex sigmoidal curve function in which $(m_N T - b_N)$ and $(m_D T - b_D)$ describe the linear portions of the pre-transition and post-transition baselines, respectively, at temperature $T$, $m_T$ is the slope of the curve within the transition region, and $T_m$ is the melting temperature for the transition. All data sets were fit to the above equations by non-linear regression methods using the IGOR graphics/analysis package (WaveMetrics, Lake Oswego, OR).
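The two quantities used in the thermal stability analysis can be illustrated with a short sketch. The fraction denatured follows the definition in the text. The sigmoidal function is only an assumed, commonly used two-state form with sloping linear baselines; it is not necessarily the exact function used for the fits reported here, and the way m_T enters (as a width parameter in the exponent) is an assumption. All numerical values are hypothetical.

```python
import math


def fraction_denatured(theta_t: float, theta_n: float, theta_d: float) -> float:
    """F_D = (theta_T - theta_N) / (theta_D - theta_N)."""
    return (theta_t - theta_n) / (theta_d - theta_n)


def sigmoid_with_baselines(temp: float, m_n: float, b_n: float,
                           m_d: float, b_d: float,
                           m_t: float, t_m: float) -> float:
    """Assumed two-state melting curve with linear baselines.

    Native baseline:    m_n*T - b_n
    Denatured baseline: m_d*T - b_d
    m_t sets the width of the transition centred on t_m (assumption).
    """
    native = m_n * temp - b_n
    denatured = m_d * temp - b_d
    weight = 1.0 / (1.0 + math.exp((t_m - temp) / m_t))
    return native + (denatured - native) * weight


# Hypothetical parameters: signal rises from about -10 to about -2 near 60 deg C.
params = dict(m_n=0.02, b_n=10.0, m_d=0.01, b_d=2.0, m_t=2.5, t_m=60.0)
curve = [(t, sigmoid_with_baselines(t, **params)) for t in range(20, 91, 10)]
midpoint = sigmoid_with_baselines(60.0, **params)
print(curve, midpoint)
```

With this form, the signal at T_m lies exactly halfway between the two extrapolated baselines, which is one way to check a fitted T_m against the raw data.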

RESULTS
We have previously described the construction, expression, and biochemical characterization of gpNu1ΔK100, a deletion construct of the small terminase subunit (1). The construct is a dimer in the concentration range of 5 μM to 2 mM, with no evidence for dissociation or further aggregation. Preliminary NMR experiments suggested that the construct might be amenable to structural studies, and we therefore sought to further characterize the physical properties of this construct.
Thermally Induced Unfolding of gpNu1ΔK100 -Thermally induced unfolding of gpNu1ΔK100 secondary and tertiary structural elements is reversible, as indicated in Fig. 2, A and B, respectively. Moreover, the loss of tertiary structure (near-UV CD signal) is cooperative, consistent with a folded and stable construct (Fig. 2C).² Salt and protons stabilize protein tertiary structure, as evidenced by the significant increase in the Tm for the transition (Table I). Despite the observed cooperative loss of tertiary structure, thermally induced loss of secondary structural elements (far-UV CD signal) appears essentially non-cooperative (Fig. 2C). The steep pre-transition baseline observed in these data makes it difficult to accurately calculate the Tm for this transition. Nevertheless, it is clear that salt and protons similarly affect the Tm for the unfolding transition, whether monitored in the far-UV or near-UV region of the CD spectrum (Table I). Interestingly, salt and pH strongly affect the pre-transition baselines obtained in the far-UV CD melting curves, but not the near-UV CD melting curves (Table II).
² gpNu1ΔK100 possesses two tryptophan (Trp22 and Trp49) and three tyrosine (Tyr41, Tyr50, and Tyr84) residues. The thermal unfolding data provide no indication for multiple unfolding transitions, which would indicate local unfolding in the vicinity of these residues. We thus interpret the loss of the near-UV CD signal as reflecting global unfolding of the protein. It is feasible, however, that the melting curves reflect regional rather than global unfolding of the protein.
Limited Proteolysis of gpNu1ΔK100 -Evaluation of the line widths and chemical shifts in a ¹H-¹⁵N correlation spectrum of gpNu1ΔK100 suggested that ≈30 residues of the construct were solvent exposed and highly flexible.³ Primary sequence analysis predicts strong α-helical character in the region between residues ≈50 and ≈115 of gpNu1 (Fig. 1B).
We postulated that this putative helix might be partially disrupted in the gpNu1ΔK100 construct, leading to the unusual unfolding properties of the protein. If this were the case, limited proteolysis of gpNu1ΔK100 would be expected to degrade the exposed portion of the helix, yielding a fully folded domain suitable for structural characterization. This was indeed correct. Limited proteolysis of gpNu1ΔK100 with a number of proteases consistently yielded two predominant products (data not shown). Analysis of these products by SDS-PAGE and matrix-assisted laser desorption time-of-flight mass spectrometry yielded molecular masses of ≈7.5 kDa and ≈10 kDa, respectively.
³ T. de Beer and C. E. Catalano, unpublished.

FIG. 2. A, far-UV CD spectra of gpNu1ΔK100 before (E) and after (x) heating to 85°C for 15 min. B, near-UV CD spectra of gpNu1ΔK100 before (E) and after (x) heating to 85°C for 15 min. The spectra presented in A and B were recorded at 4°C. C, thermally induced unfolding of gpNu1ΔK100. Unfolding was monitored by far-UV (E) and near-UV (q) CD signals as described under "Experimental Procedures."

FIG. 3. A, far-UV CD spectra of gpNu1ΔE68 before (E) and after (x) heating to 85°C for 15 min. B, near-UV CD spectra of gpNu1ΔE68 before (E) and after (x) heating to 85°C for 15 min. The spectra presented in A and B were recorded at 4°C. C, thermally induced unfolding of gpNu1ΔE68. Unfolding was monitored by far-UV (E) and near-UV (q) CD signals as described under "Experimental Procedures."

Characterization of gpNu1ΔE68 -Proteolysis studies, NMR spectral analysis, and evaluation of secondary structure predictions for gpNu1ΔK100 suggested that the minimal folded DNA binding domain of gpNu1 is located within the N-terminal ≈70 amino acids of the protein. Based on these data, gpNu1ΔE68 was constructed. Expression and purification of this construct as described under "Experimental Procedures" yielded 10 mg of homogeneous protein per liter of cell culture. The UV absorbance spectrum of gpNu1ΔE68 is typical of a purified protein that is essentially devoid of contaminating nucleotide (A280:A260 = 1.95) (42). An extinction coefficient of ε280 = 13.9 mM⁻¹·cm⁻¹ was calculated for the protein using the method of Gill and von Hippel (Table III) (43, 44). The progressive decrease in ε280 going from full-length gpNu1, to gpNu1ΔK100, to gpNu1ΔE68 is consistent with deletion of one Tyr and four Phe in gpNu1ΔK100, and an additional Tyr in gpNu1ΔE68. The fluorescence spectrum of gpNu1ΔE68 exhibits a maximum at 335 nm using an excitation wavelength of 280 nm (Table III). This maximum red shifts to 350 nm and increases in intensity (1.6-fold) in the presence of 6 M guanidinium hydrochloride, consistent with denaturation of a folded protein.
Identical fluorescence changes are observed with full-length gpNu1 and the gpNu1ΔK100 construct (Table III), suggesting that the folded core of all the proteins is similar. Importantly, gpNu1ΔE68 binds cos-containing DNA with an affinity essentially identical to that of gpNu1ΔK100.⁴ The far-UV CD spectrum of gpNu1ΔE68 demonstrates that the protein possesses secondary structural elements (Fig. 3A). Deconvolution analysis of the spectrum is consistent with a protein containing 30% α-helical structure, with 22% of the residues in a β-sheet conformation. This represents a significant loss of α-helical character compared with that observed in the gpNu1ΔK100 construct (Table III). Strong signals are also observed in the near-UV CD spectrum of the construct (Fig. 3B), demonstrating that the protein also possesses significant tertiary structure. The near-UV CD spectrum of gpNu1ΔE68 is quite similar to that of gpNu1ΔK100, with the exception of decreased signal intensity at 276 and 282 nm (compare Figs. 2B and 3B). These bands likely represent the ¹L_b transitions of Tyr-84, which has been deleted in the gpNu1ΔE68 construct.
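The extinction coefficient quoted above can be checked against the aromatic composition of the constructs. The sketch below uses the model-compound coefficients of Gill and von Hippel for denatured protein at 280 nm (Trp 5,690, Tyr 1,280, and cystine 120 M⁻¹·cm⁻¹). The residue counts are taken from this paper (two Trp and three Tyr in gpNu1ΔK100, with Tyr-84 absent from gpNu1ΔE68). The function name is hypothetical, and the absence of any cystine contribution is an assumption.

```python
# Gill and von Hippel model-compound coefficients at 280 nm (M^-1 cm^-1)
EPS_TRP = 5690
EPS_TYR = 1280
EPS_CYSTINE = 120


def epsilon_280_mM(n_trp: int, n_tyr: int, n_cystine: int = 0) -> float:
    """Estimated extinction coefficient at 280 nm in mM^-1 cm^-1."""
    total = n_trp * EPS_TRP + n_tyr * EPS_TYR + n_cystine * EPS_CYSTINE
    return total / 1000.0


# Residue counts from the text; no cystine contribution is assumed.
eps_k100 = epsilon_280_mM(n_trp=2, n_tyr=3)  # Trp22, Trp49, Tyr41, Tyr50, Tyr84
eps_e68 = epsilon_280_mM(n_trp=2, n_tyr=2)   # Tyr84 is deleted in this construct
print(f"gpNu1(delta)K100: {eps_k100:.2f}; gpNu1(delta)E68: {eps_e68:.2f} mM^-1 cm^-1")
```

Under these assumptions the estimate for gpNu1ΔE68 agrees with the reported 13.9 mM⁻¹·cm⁻¹, and the difference between the two constructs corresponds to a single tyrosine.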
Stoichiometry of the Self-assembly Reaction-The assembly state of gpNu1ΔE68 was determined by sedimentation equilibrium methods. Least-squares analysis of a representative data set is presented in Fig. 4 and Table IV. The data consist of three different loading concentrations at three different rotor speeds as described under "Experimental Procedures." Each individual data set was fit to a single-species monomer model to resolve an average molecular mass. The apparent molecular mass is constant and twice that of the calculated monomer molecular mass of 7,803 Da, indicating that the construct is largely dimeric at all concentrations examined. Furthermore, there is no evidence of a concentration-dependent increase in molecular mass, indicating that gpNu1ΔE68 dimers are not in dynamic equilibrium with either monomers or higher order assemblies. A more rigorous analysis by simultaneous fitting of all nine data sets to a single-species model resolved an apparent molecular mass of 17,665 ± 749 Da, 2.26-fold greater than the calculated monomer molecular mass. The quality of the simultaneous analysis can be seen in Fig. 4, with all data being well described by the model. The square root of the variance for the simultaneous analysis (0.010 optical density units) is also comparable to that of any individual fit, indicative of no systematic deviations. Fitting of the data to more complex assembly schemes resulted either in no improvement of the fit or in failure to converge. Taken together, the data demonstrate that gpNu1ΔE68 is exclusively dimeric over the concentration range of 15 to 150 μM. Identical results were obtained in the absence or presence of 150 mM NaCl, and at temperatures between 4°C and 37°C.
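The stoichiometry argument above rests on a simple ratio, which the following sketch reproduces from the numbers given in the text (apparent mass 17,665 ± 749 Da; calculated monomer mass 7,803 Da). Treating the ± value as a symmetric interval around the best-fit mass is an assumption made only for illustration, and the variable names are hypothetical.

```python
MONOMER_MASS_DA = 7803.0      # calculated monomer mass (from the text)
APPARENT_MASS_DA = 17665.0    # simultaneous single-species fit (from the text)
UNCERTAINTY_DA = 749.0        # reported +/- value, treated here as symmetric

ratio = APPARENT_MASS_DA / MONOMER_MASS_DA
ratio_low = (APPARENT_MASS_DA - UNCERTAINTY_DA) / MONOMER_MASS_DA
ratio_high = (APPARENT_MASS_DA + UNCERTAINTY_DA) / MONOMER_MASS_DA
nearest_stoichiometry = round(ratio)

print(f"apparent/monomer = {ratio:.2f} (range {ratio_low:.2f} to {ratio_high:.2f}); "
      f"nearest integer stoichiometry = {nearest_stoichiometry}")
```

The ratio lies well above 1 and well below 3 across the whole interval, so a dimer is the nearest integer assembly state under these assumptions.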
⁴ T. de Beer and C. E. Catalano, manuscript in preparation.

FIG. 4. The symbols represent absorbance of gpNu1ΔE68 equilibrated at a rotor speed of 20,000 (q), 30,000 (OE), and 40,000 (ࡗ) rpm, respectively. For clarity, only every fifth data point is shown. The data were acquired at 4°C and analyzed as described under "Experimental Procedures." The solid lines represent the best fit to a single-species model obtained from simultaneous analysis of all nine data sets.

TABLE IV Sedimentation equilibrium analysis of gpNu1ΔE68 (data presented in Fig. 4)

Thermally Induced Unfolding of gpNu1ΔE68 -Thermally induced loss of gpNu1ΔE68 secondary and tertiary structure is fully reversible (Fig. 3, A and B). The loss of tertiary structure is cooperative (Fig. 3C) and possesses a Tm similar to that observed for the gpNu1ΔK100 construct (compare Tables I and V). Moreover, salt and protons similarly affect the thermal stability of both constructs. Thermally induced loss of gpNu1ΔE68 secondary structures is also cooperative, and the pre-transition baseline is comparatively insensitive to temperature (compare Figs. 2C and 3C). Moreover, the pre-transition baseline for gpNu1ΔE68 unfolding is unaffected by salt or pH (not shown). A significant increase in Tm is observed with increasing gpNu1ΔE68 concentration (Table VI). Essentially identical results are observed with gpNu1ΔK100.

DISCUSSION
GpNu1, the small terminase subunit, is responsible for site-specific assembly of a holoenzyme complex required for genome packaging. The protein is further responsible for the exceptional stability of multiple nucleoprotein intermediates along the packaging pathway. Insolubility of the isolated subunit has hampered biochemical analysis of the protein. We have previously described the construction and biochemical characterization of gpNu1ΔK100, a deletion mutant of gpNu1 that comprises the N-terminal 100 amino acids of the protein. This construct is fully soluble at concentrations up to 15 mg/ml (1), which allows biophysical and structural studies that are not possible with the full-length protein. Importantly, gpNu1ΔK100 contains the putative helix-turn-helix DNA binding motif found in full-length gpNu1 (see Fig. 1B) (45). We therefore initiated NMR structural studies on this construct to understand the structural basis for the stability of gpNu1·DNA complexes.
Fluorescence, CD, and solution NMR studies demonstrated significant secondary and tertiary structure in the construct (1). Thermally induced unfolding of gpNu1ΔK100 reveals an unusual situation where secondary structure is lost in a non-cooperative manner, whereas global folding of the protein (tertiary structure) is lost cooperatively. Based on a variety of data, we reasoned that gpNu1ΔK100 is composed of a stable and folded N-terminal domain that possesses a C-terminal helix extending from the folded core of the protein (Fig. 5). Specifically, the C-terminal 30 amino acids of gpNu1ΔK100 adopt a pseudo-stable helix that "unravels" in response to elevated temperatures, much like an isolated helical peptide. This non-cooperative loss of secondary structure is reflected in the steep pre-transition baseline observed in the far-UV CD melting curves. Ultimately, the folded N-terminal region of the protein denatures, yielding the cooperative unfolding curves observed in the near-UV CD unfolding transition.
Limited proteolysis experiments indeed demonstrate the presence of a protease-sensitive C terminus, and the shorter gpNu1ΔE68 mutant was constructed. Fluorescence, CD, and NMR data confirm that this construct is folded in solution. gpNu1ΔE68 retains a dimeric structure under all experimental conditions examined, indicating that the self-association domain of full-length gpNu1 (Lys100-Pro141) is distinct from the dimerization region of the protein. Our data do not allow the calculation of an equilibrium constant for protein dimerization but suggest an upper limit in the nanomolar range. Of interest is the observation that thermally induced unfolding of both constructs is concentration-dependent, suggesting that unfolding and dimerization are, at least in part, coupled.
Thermally induced loss of gpNu1ΔE68 secondary structure is cooperative, unlike that observed for gpNu1ΔK100. Moreover, gpNu1ΔE68 unfolding monitored in the near-UV CD mirrors that obtained in the far-UV CD, suggesting that loss of secondary structure is concomitant with global unfolding of the protein. This is consistent with the suggestion that the strongly sloping baselines in gpNu1ΔK100 unfolding curves result from loss of secondary structure as the pseudo-stable helix unwinds. Deletion of this region results in cooperative secondary structure loss as the entire protein unfolds.
Our data suggest that the tertiary fold of the N-terminal 68 residues in both gpNu1ΔE68 and gpNu1ΔK100 is essentially identical. First, guanidinium hydrochloride-induced denaturation yields identical fluorescence changes in both constructs. Second, the near-UV CD spectra confirm the loss of tyrosine 84 from gpNu1ΔE68 but otherwise indicate similar folded structures for the two constructs. Finally, thermally induced unfolding of tertiary structure yields comparable Tm values for both constructs, and the salt and pH effects are virtually identical. These data suggest that deletion of residues 69 to 100 from gpNu1ΔK100 does not appreciably affect the folding or stability of the N-terminal 68 residues of the protein. Consistent with this, the affinity of both constructs for cos-containing DNA is essentially identical.
The data are consistent with the model presented in Fig. 5 that describes gpNu1ΔE68 as a folded, stable, and functional DNA binding domain of gpNu1. The model further suggests that the C-terminal 32 amino acids of gpNu1ΔK100 adopt a pseudo-stable helix that extends from the folded core of the protein. This is consistent with primary sequence analysis that predicts strong α-helical character extending from residue ≈50 in the DNA binding domain to residue ≈115 within the self-association domain of full-length gpNu1. Truncation of the helix at Lys100 yields a disrupted helix that is marginally stable.
Of what functional significance is this helical region of gpNu1? The proposed structural organization of gpNu1 resembles that observed in the α subunit of E. coli RNA polymerase (46). This protein consists of two independently folded domains connected by a flexible sequence that possesses α-helical character. The N-terminal domain (NTD) forms a homodimer that is necessary and sufficient for core enzyme assembly and site-specific DNA binding. The C-terminal domain (CTD) is responsible for trans-activation by a number of transcription factors. The intervening flexible helix allows the CTDs to act as independent motional units capable of interacting with a variety of protein signals, whereas the NTD remains site-specifically bound at the promoter. A similar structural organization is observed in the integrase enzyme from HIV. The dimeric core domains are connected to the C-terminal DNA binding domains by a 26-residue α-helix (47). In this case, the interdomain helix likely plays a functional role that permits dynamic interaction of the CTDs during the integration of viral DNA.
We suggest a similar role for an intervening helix in gpNu1. The N-terminal domain forms a homodimer that binds specifically to cos-containing DNA. Cooperative DNA binding is driven by self-association interactions mediated by the hydrophobic domain of the protein. Additionally, the C-terminal gpA-interactive domain of gpNu1 must promote gpA assembly at cosN. A helical linker between these domains would allow sufficient flexibility such that each of these contacts may be formed appropriately, simultaneously assembling gpNu1 and gpA at cosB and cosN, respectively. Structural, kinetic, and biophysical studies currently underway in our laboratory are directed toward a mechanistic description of the role of this helix in the functioning of the protein and in virus assembly.