Purification and self-association equilibria of the lysis-lysogeny switch proteins of coliphage 186.

The CI repressor protein, responsible for maintenance of the lysogenic state, and the Apl protein, required for efficient prophage induction, are the two control proteins of the lysis-lysogeny transcriptional switch of coliphage 186. These proteins have been overexpressed, purified, and their self-association behavior examined by sedimentation equilibrium. Phage 186 CI dimers self-associate in solution through tetramers to octamers in a concerted process. The Apl protein of 186 is an unusual example of a helix-turn-helix protein which is monomeric in solution.

Bacteriophage λ has served as a model system for describing mechanisms of gene control in both prokaryotes and higher organisms. In particular, the means by which the CI and Cro proteins interact with their operator sites to control transcription, and so foster either lysogenic or lytic development, has been studied extensively by a range of genetic, biochemical, and physicochemical approaches. These studies have contributed enormously to our understanding of genetic control mechanisms (1). Bacteriophage 186, from the P2 family of phages, has a completely different nucleotide sequence to λ and has evolved a different set of mechanisms for controlling expression of its genome. Coliphage 186, like λ, is able to replicate its genome through one of two independent but interchangeable pathways. The lytic pathway results in lysis of the host cell and production of progeny phage, while the lysogenic pathway involves integration of the phage genome into the chromosome of the bacterial host where it is replicated along with the host chromosome in subsequent generations. The lysogenic state in 186 is an extremely stable one: the frequency of uninduced transition from lysogeny to lytic development approaches the mutation rate (2). Despite the stability of the lysogenic state, the correct environmental stimuli can induce 186 to rapidly and efficiently switch from lysogeny to lytic development, the process of prophage induction (2). We anticipate that investigation of the mechanisms by which this switching occurs will further our understanding of general genetic control strategies.
The lysis-lysogeny switch region of 186 (Fig. 1) involves two face to face promoters, p R and p L, whose transcripts overlap by 62 base pairs (5). The lysogenic state is maintained by the product of a single gene, cI, transcribed from the leftward lysogenic promoter, p L (3). CI represses transcription from the rightward lytic promoter, p R (5), as well as directly repressing transcription of the late control gene B from the p B promoter (8). In addition, there are two flanking sites whose function is unknown: one (FL), located within the cI gene, the other (FR), found at the 5′ end of the apl gene (7). The Apl protein of 186, produced from the first gene of the rightward early lytic transcript, has no apparent role in lytic development after infection but is required for efficient prophage induction (6). It functions both at the level of derepression and of prophage excision (6).¹ In this regard, Apl assumes the roles of both the Cro and Xis proteins of phage λ. Consistent with its dual functions as repressor and excisionase, Apl binds between the p R and p L promoters and at the attP site for integrative-excisive recombination (Fig. 1) (6).
We wish to investigate the molecular mechanisms by which these two proteins, CI and Apl, act to control gene expression. Any description in quantitative terms of a protein-DNA interaction must include consideration of each of the equilibria involved. Since DNA binding proteins rarely act as single structural units, but tend to exist as dimers or higher oligomers (either pre-existing or induced upon binding to DNA (9)), any such protein self-association must be taken into account. Hence, as a first step in understanding the molecular mechanisms by which (i) CI is able to repress transcription from p R and p B , and thereby efficiently maintain the lysogenic state, and (ii) Apl is able to function both as a repressor and as an excisionase in bringing about lytic development, we have overexpressed and purified both proteins. Further, we have examined by analytical ultracentrifugation the ability of CI and Apl to self-associate and found that in solution CI higher order self-assembly proceeds in a concerted manner from dimer through tetramer to octamer, while Apl remains monomeric over the concentration range examined.

MATERIALS AND METHODS
Radiolabeled nucleotides, acrylamide solutions, and oligonucleotide primers were purchased from Bresatec (Adelaide), while restriction enzymes were from New England Biolabs. All chemicals were of reagent grade or better.

Cloning and Expression
The cI gene from 186 was amplified by the polymerase chain reaction using primers designed to introduce an NdeI restriction site (CATATG) at the 5′ end of the gene (5′-GGTTTTATCCATATGAGAATA-3′) and a BamHI restriction site (GGATCC) at the 3′ end of the gene (5′-CACGGATCCAACCGCCAGCC-3′). This fragment was ligated into a pET3a vector backbone (10) to give pET3aCI, which was then transformed into Escherichia coli strain BL21 (DE3) pLysS (10), and the insert was sequenced to ensure no base pair changes had been introduced (constructed by I. Dodd). The Apl protein was also overexpressed using the pET system. The apl gene was ligated into pET3a to give pMRR1 (6) and transformed into BL21 (DE3) pLysS.
For expression of CI, BL21 (DE3) pLysS pET3aCI cells were grown at 37°C in 2-liter flasks containing Luria Broth (500 ml), 100 μg ml⁻¹ carbenicillin, and 30 μg ml⁻¹ chloramphenicol. When the culture had reached an optical density of 0.6-0.8 at 600 nm, isopropyl-1-thio-β-D-galactopyranoside was added to a final concentration of 0.4 mM, and growth was continued for an additional 3 h. Cells were collected by centrifugation, washed once with 50 mM Tris-HCl, 0.1 mM EDTA, 10% glycerol, 150 mM NaCl, pH 7.5 (TEG 150) buffer, and stored at −70°C in approximately 10 ml of the same buffer. Expression of Apl was the same as that described for CI, except that growth was at 30°C in order to maximize the fraction of soluble protein.

* This work was supported by the Australian Research Council. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed. Tel.: 61-8-303-4361; Fax: 61-8-303-4348; E-mail: jegan@biochem.adelaide.edu.au.

Protein Purification
CI-All steps in the purification of CI were performed at 4°C. Cells (from 2 × 500-ml cultures) were thawed, and phenylmethylsulfonyl fluoride was added immediately to a final concentration of 0.1 mM, in order to minimize proteolytic activity. The cell suspension, already partially lysed as a result of the activity of lysozyme produced from the pLysS plasmid, was sonicated on ice (3 × 15 s) in order to complete lysis and reduce solution viscosity. Cell debris was removed by centrifugation (30 min at 12,000 rpm, JA20 rotor of a Beckman J-21 centrifuge). Polyethyleneimine (PEI)² (5% solution, pH 7.6) was added dropwise to the cleared lysate to give a final concentration of 0.4%. Following gentle stirring for 5 min, the mixture was centrifuged (10 min, 300 × g) and the supernatant was discarded. CI was released from the PEI pellet by resuspension in TEG buffer containing 500 mM NaCl. The mixture was recentrifuged and the supernatant was retained. This high salt extract was brought to 60% (NH4)2SO4 saturation by the addition of solid (NH4)2SO4. Following stirring for 30 min, precipitated proteins were pelleted by centrifugation. The pellet was dissolved in TEG buffer containing 10 mM NaCl (TEG 10) and loaded on a 5-ml column of Affi-Gel Blue (Bio-Rad), pre-equilibrated with the same buffer. The column was washed with several column volumes of TEG 10 buffer, followed by several volumes of TEG buffer containing 100 mM NaCl (TEG 100). Bound proteins were eluted with a 100-500 mM NaCl gradient in TEG buffer, with CI eluting at approximately 250 mM NaCl. Fractions containing CI, as judged by SDS-PAGE, were pooled and dialyzed overnight against TEG 10 buffer. This pool, containing only a few minor contaminating proteins, was further purified by chromatography on an Econopac heparin column (Bio-Rad). The sample was applied, the column was washed extensively with TEG 10, and CI was eluted (at approximately 100 mM NaCl) with a 10-500 mM NaCl gradient. CI-containing fractions were pooled, dialyzed against TEG 150 buffer, and stored at −70°C.
Apl-For purification of Apl, cells were thawed and phenylmethylsulfonyl fluoride was added to 0.1 mM final concentration in order to minimize proteolytic activity. The cell suspension was sonicated (3 × 15 s) on ice and cell debris was removed by centrifugation. To the cleared lysate, polyethyleneimine (PEI) was added from a 5% stock solution (pH 7.6) to give a final concentration of 0.5%. Precipitated nucleic acids and unwanted proteins were removed by a brief centrifugation, and solid (NH4)2SO4 was added, with stirring, to give 60% (NH4)2SO4 saturation at 0°C. Precipitated proteins were pelleted by centrifugation and the pellet was dissolved in approximately 1 ml of 50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 0.1 mM EDTA containing 8 M urea. This sample was applied to a 1 × 120 cm column of Sephacryl S200 (Pharmacia), which had been pre-equilibrated with the same buffer. The column was run at 0.15 ml min⁻¹ at room temperature, and 3-ml fractions were collected. Apl-containing fractions (as judged by SDS-PAGE) were pooled, and the protein was refolded by dialysis against decreasing concentrations of urea. Final dialysis was against TEG 150 buffer. The final product was centrifuged to remove any aggregated material and stored at −70°C.
Protein concentrations were measured using the Bio-Rad protein assay or by absorbance at 280 nm. Extinction coefficients were calculated from the average extinction coefficients of tryptophan (5500 M⁻¹ cm⁻¹) and tyrosine (1200 M⁻¹ cm⁻¹), assuming additivity of absorbances (11). These values are 23,470 M⁻¹ cm⁻¹ for CI and 14,600 M⁻¹ cm⁻¹ for Apl.
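As an illustration of how the quoted extinction coefficients translate an absorbance reading into a molar concentration, the following minimal sketch applies the Beer-Lambert relation. The function name, the dictionary, and the default 1-cm pathlength are assumptions made for this example and are not taken from the original work.

```python
# Molar extinction coefficients at 280 nm quoted in the text (M^-1 cm^-1).
EPSILON_280 = {"CI": 23470.0, "Apl": 14600.0}


def molar_concentration(a280, protein, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l).

    a280     absorbance at 280 nm (dimensionless)
    protein  "CI" or "Apl" (keys of EPSILON_280)
    path_cm  optical pathlength in cm (1 cm assumed by default)
    Returns the concentration in mol/liter of protein monomer.
    """
    return a280 / (EPSILON_280[protein] * path_cm)


if __name__ == "__main__":
    # Hypothetical reading, chosen only to illustrate the arithmetic.
    print(molar_concentration(0.2347, "CI"))
```

The same relation, rearranged, gives the absorbance expected for a chosen loading concentration.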

Mobility Shift Assays
A 437-base pair DNA fragment containing the p R /p L region of 186 (MaeII-MaeIII fragment, see Fig. 1) was used in gel mobility shift assays. This fragment, inserted into pBluescript (Stratagene) to give pEC625 (6), was labeled with 32P by the polymerase chain reaction, using an end-labeled primer (USP), an unlabeled primer (RSP), and pEC625 DNA as the template. The labeled fragment was gel-purified before use. Samples containing CI or Apl were added to this [32P]DNA in binding buffer (50 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 150 mM NaCl, 20% glycerol, 50 ng μl⁻¹ sonicated salmon sperm DNA), and the reactions (10 μl) were incubated for 30 min on ice. The binding reactions, containing ~150 cpm, were loaded on nondenaturing 6% polyacrylamide gels (19:1 acrylamide:bisacrylamide) containing 20% glycerol. A separate lane of tracking dye was used. Gels were pre-electrophoresed at 4°C for at least 30 min prior to loading the samples. Electrophoresis was carried out for approximately 2 h, the gels were dried under vacuum, and the distribution of labeled DNA was recorded on a phosphor screen. The phosphor screen was analyzed with the Imagequant program on a Molecular Dynamics PhosphorImager.

Analytical Ultracentrifugation
All experiments were performed on a Beckman Optima XL-A analytical ultracentrifuge equipped with absorbance optics and a four-hole An60Ti rotor. Sedimentation equilibrium experiments were done at 5°C using double sector centerpieces. Data were collected at 230 or 280 nm with a spacing of 0.001 cm as the average of three absorbance measurements per radial position. Overspeeding was used to hasten the approach to equilibrium (12), and, after 24 h of centrifugation, scans were compared at 3-h intervals to ensure that equilibrium had been reached. All experiments were done in 50 mM Tris-HCl, 0.1 mM EDTA, 150 mM NaCl, 10% (v/v) glycerol, pH 7.5. Proteins were prepared for centrifugation by exhaustive dialysis against this buffer.
At sedimentation equilibrium, the total concentration, $c_T$, of a reversibly self-associating species at radial distance, $r$, can be expressed as

$$c_T(r) = \sum_i c_{i,r_0} \exp\left[A M_i \left(r^2 - r_0^2\right)\right] + e \qquad \text{(Eq. 1)}$$

where $c_{i,r_0}$ is the concentration of species $i$ at reference position $r_0$, $M_i$ is the molecular weight of species $i$, and $A$ is defined as

$$A = \frac{(1 - \bar{v}\rho)\,\omega^2}{2RT}$$

where $\bar{v}$ is the partial specific volume of the protein, $\rho$ is the solution density, $\omega$ is the angular velocity, $R$ is the gas constant, and $T$ is the Kelvin temperature. $e$ is a baseline offset term. For a monomer-i-mer association, the equilibrium constant $K$ is

$$K_{1,i} = \frac{c_i}{(c_1)^i} \qquad \text{(Eq. 2)}$$

Substitution into Equation 1 and rearrangement gives, for $i = 2$,

$$c_T(r) = c_{1,r_0} \exp\left[A M_1 \left(r^2 - r_0^2\right)\right] + \exp\left[\ln K_{1,2} + 2 \ln c_{1,r_0} + 2 A M_1 \left(r^2 - r_0^2\right)\right] + e \qquad \text{(Eq. 3)}$$

where equilibrium constants are fitted as ln K to constrain them to positive values. Additional terms (e.g. i = 4 for tetramer) can be added to Equation 3 for more complex association schemes. Hence, on the basis of data obtained from the ultracentrifuge (total concentration, on an absorbance scale, as a function of radial distance), equilibrium constants describing a given association scheme can be obtained by fitting the data to Equation 3. These fitted values of K are then converted from an absorbance scale to a molar scale, based on the degree of polymerization and the appropriate extinction coefficient, ε (corrected for the pathlength of the centerpiece).

² The abbreviations used are: PEI, polyethyleneimine; PAGE, polyacrylamide gel electrophoresis; bp, base pair(s).

FIG. 1. [...] Sequence numbering begins at the PstI site at 65.5%. Genes are shown as boxes (rightward genes above the line, leftward genes below), promoters as arrowheads, their transcripts as arrows, and terminators as stem-loops. p L is the lysogenic promoter, p R is the early lytic promoter, and p B is the promoter for the B gene, the product of which activates transcription of late genes. cII is the gene required for establishment of lysogeny, int is the integrase, and 69 is of unknown function. The phage attachment site for integration into the host chromosome, attP (Reed et al.¹), is shown. CI binding sites (7) are indicated by solid boxes, while Apl binding sites are shown as crosshatched boxes. The MaeII 2666 to MaeIII 2668 (switch) region is enlarged in order to present the relative arrangement of the Apl and CI binding sites. The −10 and −35 regions of the p R and p L promoters are indicated, as are the start sites for transcription (+1).
Data sets used in the analysis were truncated to include only absorbance values below 1.2, to ensure absorbance is linear with respect to protein concentration. Nonideality was not considered. Data analysis was done using a commercial graphics/curve-fitting program (Sigmaplot 5.1, Jandel Scientific, Corte Madera, CA) or by NONLIN (13). The partial specific volumes of CI and Apl were calculated using the amino acid partial specific volume values of Zamyatnin (14). These calculations gave v̄ = 0.725 ml g⁻¹ for CI and v̄ = 0.735 ml g⁻¹ for Apl. Buffer density at 5°C was measured in an Anton-Paar precision density meter to be 1.0378 g ml⁻¹.
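The model described in this section can be sketched in a few lines of Python. This is a minimal illustration of the standard ideal monomer-n-mer sedimentation equilibrium expression with ln K as the fitted parameter, not the Sigmaplot or NONLIN procedure used in the work. The function names, the use of cgs units (R in erg mol⁻¹ K⁻¹, radii in cm), and the example values in the `__main__` block other than v̄, the buffer density, the temperature, and the rotor speed are assumptions made for this example.

```python
import math

R_ERG = 8.314e7  # gas constant in erg mol^-1 K^-1 (cgs, so radii are in cm)


def buoyancy_term(vbar, rho, rpm, temp_k):
    """A = (1 - vbar*rho) * omega^2 / (2*R*T), in mol g^-1 cm^-2."""
    omega = rpm * 2.0 * math.pi / 60.0  # angular velocity in rad s^-1
    return (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_ERG * temp_k)


def total_concentration(r, r0, c1_r0, m1, ln_k, a_term, baseline, n=2):
    """Ideal monomer-n-mer model on a single concentration scale.

    c_T(r) = c1*exp(x) + exp(ln K + n*ln c1 + n*x) + e,
    where x = A*M1*(r^2 - r0^2) and c1 is the monomer concentration at r0.
    """
    x = a_term * m1 * (r ** 2 - r0 ** 2)
    monomer = c1_r0 * math.exp(x)
    n_mer = math.exp(ln_k + n * math.log(c1_r0) + n * x)
    return monomer + n_mer + baseline


if __name__ == "__main__":
    # vbar, density, and temperature are the values quoted for CI in the text;
    # c1, ln K, and the radii below are hypothetical illustration values.
    a = buoyancy_term(vbar=0.725, rho=1.0378, rpm=16000, temp_k=278.15)
    for r in (6.0, 6.1, 6.2):
        print(r, total_concentration(r, 6.0, 0.05, 42320.0, 2.0, a, 0.0))
```

In a real analysis these functions would be passed to a nonlinear least-squares routine, with ln K, the reference concentration, and the baseline as free parameters.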

RESULTS
Protein Purification-The cI and apl genes of the temperate coliphage 186 were cloned into the T7 expression system of Studier et al. (10), and their products were expressed to high levels in E. coli. Expression of the proteins had little effect on the growth of the host cells, and almost all of the protein was in soluble form.
SDS-PAGE of samples taken at various stages of the purification of CI is shown in Fig. 2A. In purifying CI, PEI precipitation results in the coprecipitation of CI with nucleic acids. CI is then efficiently separated from the nucleic acids by extracting the PEI pellet with a buffer of moderate salt concentration. Ammonium sulfate precipitation serves to separate CI from any remaining PEI and some of the protein contaminants. CI bound strongly to the Affi-Gel Blue column and was eluted as a broad peak to give a protein pool containing only a few contaminants. These were removed by chromatography on a heparin affinity column. An additional chromatography step on a Superdex 75 HR 10/30 fast protein liquid chromatography gel filtration column showed no evidence of contaminating species (not shown). Fig. 2B shows SDS-PAGE of samples taken during the purification of Apl. As was the case for CI purification, PEI precipitation was used to remove nucleic acids and some contaminating proteins. However, Apl did not coprecipitate with the nucleic acids but remained in the supernatant, behavior consistent with its predicted isoelectric point of 10. Ammonium sulfate precipitation was used to concentrate the protein and separate it from PEI for the gel filtration procedure. The small size of the Apl protein (9.6 kDa) allowed purification in a single step on Sephacryl S200, under denaturing conditions. With conservative pooling of Apl-containing fractions, Apl was obtained with >95% purity. Refolding was performed by dialysis against progressively lower concentrations of urea. Since Apl contains only a single cysteine (3), refolding was straightforward and very little precipitate was observed.
The UV spectra of purified CI and Apl are typical of those obtained for tryptophan-containing proteins (not shown). The A280/A260 ratios are 1.69 for CI and 1.84 for Apl, indicating the absence of significant quantities of contaminating nucleic acids.
Activity-Gel mobility shift assays were used to follow the purification of CI and Apl by quantitating the ability of the proteins to bind specifically to their operator sites. A 32P-labeled 437-bp DNA fragment containing the p R /p L switch region (5) was used as the target for binding. This region contains binding sites for both CI and Apl (Fig. 1). One unit of activity was defined as the dilution at which 50% of the labeled DNA is retarded, under the standard conditions of the assay. Tables I and II show the yields and activities obtained throughout the purification processes. Assays could not be performed on PEI-containing fractions as this led to precipitation of the labeled DNA fragment. For CI, a 13-fold purification is achieved, with a yield of 0.7 mg of protein per liter of culture, while Apl underwent a 9-fold purification for a yield of approximately 1 mg/liter of culture.
N-terminal Sequence-Five cycles of automated Edman degradation (Applied Biosystems 475A Protein Sequencer) were performed on the purified proteins. The N-terminal sequence of CI (Met-Arg-Ile-Asp-Ser) is that predicted from the DNA sequence (3). Calculation of the monomeric molecular weight of CI based on its gene sequence and retention of the initiator methionine gives Mr = 21,160. The first five amino acids of Apl [...] 9652. These results are consistent with the results of Flinta et al. (15), who found that in prokaryotic proteins, the N-terminal methionine is cleaved when the penultimate amino acid is small and uncharged.
Self-association-On SDS-PAGE (Fig. 2), CI migrates as a single molecular species with an apparent molecular weight of 24,000, in reasonable agreement with the molecular weight (21,160) based on its gene sequence. Apl migrates with an apparent molecular weight of 10,000, consistent with its predicted molecular weight of 9652.
In order to gain a qualitative estimate of the extent of any self-association, purified CI and Apl were subjected to gel filtration chromatography on a column of Sephacryl S200. By calibrating the column with a series of proteins of known molecular weight, the elution volume of the protein of interest can be used to infer an apparent molecular weight. It should be emphasized that for small zone experiments, the technique is dependent upon the assumption of spherical geometry and that no simple relationship exists between protein concentration and elution volume. At a loading concentration of 1 μM, CI eluted from the column with an apparent molecular weight of 60,000, approximately 3 times that of the monomeric species (Fig. 3). When CI was loaded on the column at a 10-fold higher concentration (10 μM), the apparent molecular weight increased to 158,000, approximately 7 times that of monomer. This concentration dependence of elution volume indicates that CI does indeed self-associate in solution. On the other hand, Apl eluted with an elution volume greater than that of the smallest molecular weight standard (cytochrome c, Mr = 12,400), indicating that it undergoes little self-association in solution.
The ability of 186 CI and Apl to self-associate in solution was investigated in a quantitative manner by sedimentation equilibrium. All experiments were performed at 5°C in TEG 150 buffer. For both proteins, different combinations of loading concentration and rotor speed were used. Fig. 4 shows the results of three of the sedimentation equilibrium experiments performed on CI with a loading concentration of 9 μM (in terms of total repressor subunits) at rotor speeds of 12,000, 16,000, and 24,000 rpm. Initially, the individual concentration distributions were fitted to the equation for a single species (i = 1) in order to obtain whole cell average molecular weights. These ranged from 91,000 (±1000) to 144,000 (±2300). Thus, the greatest whole cell average molecular weight is approximately 6.8 times that of the monomeric species. These results confirm that CI self-associates and that the self-association is to species larger than hexamer.
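For a single ideal species with no baseline offset, ln c is linear in r² with slope A·M, so a whole cell average molecular weight can be estimated from a straight-line fit. The sketch below illustrates that idea with an ordinary least-squares slope; it is a simplification, not the nonlinear fit with a baseline term used in the work, and the function names are assumptions made for this example.

```python
import math


def apparent_molecular_weight(radii_cm, conc, a_term):
    """Estimate M from the least-squares slope of ln(c) against r^2.

    For one ideal species, ln c(r) = ln c(r0) + A*M*(r^2 - r0^2),
    so M = slope / A. A zero baseline offset is assumed.
    """
    x = [r * r for r in radii_cm]
    y = [math.log(c) for c in conc]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return (sxy / sxx) / a_term


def association_degree(m_apparent, m_monomer=21160.0):
    """Average number of CI subunits per particle (monomer Mr = 21,160)."""
    return m_apparent / m_monomer


if __name__ == "__main__":
    # Ratios for the two extreme whole cell averages quoted in the text.
    print(association_degree(91000.0), association_degree(144000.0))
```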
The sedimentation data sets were then analyzed globally in terms of various assembly schemes. Insufficient data could be obtained at the low concentration end (even when distributions were recorded at 230 nm) to satisfactorily describe the monomer-dimer interaction and so, in subsequent analyses, the smallest species in the association scheme was set at that of a dimer (Mr = 42,320). When constraining M1 (Equation 3) to this value, the best fit to the data, as judged by the sum of squares of residuals (SSR), was to a dimer-tetramer-octamer equilibrium (ΔG°2,4 = −7.0 ± 0.1 kcal mol⁻¹, ΔG°2,8 = −21.3 ± 0.1 kcal mol⁻¹, SSR = 0.007). Consistent with this result, when the molecular weight of the smallest species (M1) was allowed to float in the calculation, essentially the same free energies of association were obtained and the resulting fitted value for M1 was, within experimental error, that of a CI dimer (43,390 ± 1340). Including additional species in the association scheme (dimer-tetramer-hexamer-octamer) did not improve the fit above that expected solely on the basis of fitting to an additional parameter. Short pathlength cells (2.5 mm) were used with a higher loading concentration of CI (20 μM) in order to better define the high end of the association scheme. However, inclusion of these data sets into the global fit did not justify inclusion of additional species larger than octamer. Fitting the data sets to a dimer-octamer equilibrium (i.e. no formation of tetramer, K2,4 = 0) resulted in a fit similar to that obtained for the dimer-tetramer-octamer scheme, reflecting the difficulty in defining the association constant for species which do not accumulate to a significant degree. Hence, the self-association of the 186 CI repressor protein under the conditions studied is best described by a dimer-tetramer-octamer equilibrium, the dimer-octamer transition being a concerted process (see "Discussion").
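The fitted free energies can be related to association constants through ΔG° = −RT ln K. The sketch below performs that conversion at the experimental temperature of 5°C. It assumes a 1 M standard state with concentrations expressed in moles of dimer per liter, which is an interpretation made for this example; the function names are likewise hypothetical.

```python
import math

R_KCAL = 1.98720e-3  # gas constant in kcal mol^-1 K^-1
T_KELVIN = 278.15    # 5 degrees C, the temperature of the experiments


def association_constant(delta_g_kcal, temp_k=T_KELVIN):
    """K = exp(-dG/RT) for a standard free energy in kcal/mol."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


def free_energy(k_assoc, temp_k=T_KELVIN):
    """dG = -RT ln K, returned in kcal/mol."""
    return -R_KCAL * temp_k * math.log(k_assoc)


# Fitted values quoted in the text for the dimer-tetramer-octamer scheme.
K_24 = association_constant(-7.0)   # 2 dimers <-> tetramer (units M^-1)
K_28 = association_constant(-21.3)  # 4 dimers <-> octamer (units M^-3)

if __name__ == "__main__":
    print(K_24, K_28)
```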
Apl self-association was also studied by sedimentation equilibrium. Four runs were performed, employing two different rotor speeds (16,000 rpm and 24,000 rpm) and two different loading concentrations (16 μM and 32 μM) (Fig. 5). Data sets from all four runs were fitted globally to various self-association schemes. The best fit to the data was that of a single monomeric species. The fitted value of molecular weight was 10,410 ± 60 (SSR = 0.041).

DISCUSSION
Two of the major control proteins from bacteriophage 186, CI and Apl, have been expressed, purified, and their self-association properties examined. CI reversibly self-associates in solution, and this self-assembly is best described by a monomer-dimer-tetramer-octamer equilibrium. Fig. 6 shows the distribution of CI species calculated from the free energies of association obtained from fitting the sedimentation equilibrium data to a dimer-tetramer-octamer association scheme. Even at the lowest concentration of CI used in the sedimentation equilibrium experiments, there was insufficient monomer present to characterize the monomer-dimer equilibrium. In order to permit inclusion of the calculated distribution of monomer and dimer in Fig. 6, a value of K1,2 has been estimated. This estimate, 1 × 10⁸ M⁻¹ (ΔG° = −10.2 kcal mol⁻¹), is based on the detection limit of the ultracentrifuge; that is, any value of K1,2 lower (weaker) than 1 × 10⁸ M⁻¹ would have been resolved in the fitting procedure. Based on this value of K1,2, CI exists primarily as a mixture of monomers (21 kDa) and dimers (42 kDa) between 10⁻¹⁰ and 10⁻⁷ M (in terms of monomer). Below 10⁻¹⁰ M, CI is essentially monomeric. Use of a 10-fold higher value of K1,2 results in the monomer-dimer curves shifting one log unit to the left, without significant change to the tetramer and octamer curves (not shown). Between 10⁻⁶ and 10⁻⁴ M, CI exists in solution as a mixture of dimer (42 kDa), tetramer (84 kDa), and octamer (168 kDa), with octamer being the predominant species at concentrations above 10 μM. The tetrameric species exists only as an intermediate during the assembly of dimers to octamers, never reaching more than 35% of the total. The distribution of tetramer is subject to some uncertainty given the difficulty in precisely defining K2,4. Within the concentration range examined, there was no evidence for formation of polymers higher than octamers.
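A species distribution of this kind can be approximated from the two fitted free energies alone. The sketch below solves the mass balance for free dimer by bisection and reports the fraction of total subunits in each form. It deliberately ignores the monomer-dimer step, so it is only meaningful above roughly 10⁻⁷ M, and it assumes equilibrium constants on a molar dimer scale; both are simplifications made for this example, as are the function and variable names.

```python
import math

RT = 1.98720e-3 * 278.15    # kcal/mol at 5 degrees C
K24 = math.exp(7.0 / RT)    # 2 dimers <-> tetramer, from dG = -7.0 kcal/mol
K28 = math.exp(21.3 / RT)   # 4 dimers <-> octamer, from dG = -21.3 kcal/mol


def species_fractions(total_subunit_molar):
    """Fraction of CI subunits present as dimer, tetramer, and octamer.

    Mass balance in subunit (monomer) units, with d = free dimer molarity:
        total = 2*d + 4*K24*d^2 + 8*K28*d^4
    The left side rises monotonically with d, so bisection is sufficient.
    """
    def excess(d):
        return 2 * d + 4 * K24 * d ** 2 + 8 * K28 * d ** 4 - total_subunit_molar

    lo, hi = 0.0, total_subunit_molar / 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    d = 0.5 * (lo + hi)
    return {
        "dimer": 2 * d / total_subunit_molar,
        "tetramer": 4 * K24 * d ** 2 / total_subunit_molar,
        "octamer": 8 * K28 * d ** 4 / total_subunit_molar,
    }


if __name__ == "__main__":
    for exponent in (-7, -6, -5, -4):
        print(10.0 ** exponent, species_fractions(10.0 ** exponent))
```

Under these assumptions the tetramer stays a minority species at every concentration and the octamer dominates at the high end, which is the qualitative behavior described for Fig. 6.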
That octamer formation is a concerted (favored) process can be seen by calculating the free energy per dimer required for formation of the higher species: for tetramer formation, ΔG° = −3.5 kcal mol⁻¹ per dimer, while for octamer formation, ΔG° = −5.3 kcal mol⁻¹ per dimer. Thus, the free energy per dimer for octamer formation is more negative than the free energy per dimer of tetramer formation, and assembly of dimers to octamer is the energetically favored process.
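The per-dimer comparison is simple arithmetic on the fitted free energies: each total is divided by the number of dimers in the complex. A short sketch, with a hypothetical helper name, makes the calculation explicit.

```python
def free_energy_per_dimer(delta_g_kcal, dimers_in_complex):
    """Divide the total association free energy by the number of dimers."""
    return delta_g_kcal / dimers_in_complex


# Fitted values from the dimer-tetramer-octamer scheme (kcal/mol).
TETRAMER_PER_DIMER = free_energy_per_dimer(-7.0, 2)   # tetramer = 2 dimers
OCTAMER_PER_DIMER = free_energy_per_dimer(-21.3, 4)   # octamer = 4 dimers

if __name__ == "__main__":
    print(TETRAMER_PER_DIMER, round(OCTAMER_PER_DIMER, 1))
```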
Like 186 CI, the λ CI repressor also associates from dimer through tetramer to octamer in a concerted process (16); the λ data were consistent with a further association to dodecamer. Not only do the λ and 186 repressors assemble in solution in the same manner, but comparison of the fitted free energy values for the various steps in the association process shows a remarkable similarity. For the dimer-tetramer equilibrium, ΔG° is −7.1 kcal mol⁻¹ for λ and −7.0 kcal mol⁻¹ for 186, while the free energies for the dimer-octamer equilibria are also quite similar: ΔG° = −23.0 kcal mol⁻¹ for λ CI and −21.3 kcal mol⁻¹ for 186 CI. Given this correspondence between the self-association characteristics of the two proteins, one might expect their structures to show some similarity. While the domain structure of the λ repressor has been studied extensively, little is known about the tertiary structure of 186 CI. Several lines of evidence suggest that λ CI consists of two domains (17). The N-terminal domain is responsible for DNA binding and the C-terminal domain contains the determinants for oligomerization and cooperativity. The N domain interacts with DNA via a helix-turn-helix motif, a motif common to many DNA-binding proteins. In the 186 CI repressor, a region in the N-terminal third of the amino acid sequence gives a weak match to a helix-turn-helix motif as judged by weight matrix analysis (18). We suspect that 186 CI contains a variant form of this DNA binding motif (7). Other than this, there is little similarity at the nucleotide or amino acid level between the λ and 186 repressors. The operator sites to which the two repressors bind do share some features. The repressors bind to three operators in both λ at O R (O R 1, O R 2, and O R 3) and 186 at p R (site II, site I, and site III), each operator being separated by approximately two turns of the helix (1,7). In 186, however, the central operator (site I) has a consensus sequence unrelated to that of the two adjacent operators, indicating that 186 CI may contain more than one region capable of binding DNA (7). DNase I footprint analysis of mutated 186 p R operator regions shows evidence of cooperativity (7).
What are the implications of these results for 186 CI binding to its operator sites? Protein-protein association in solution to provide multidentate ligands capable of binding multiple sites on DNA is a common mechanism for cooperative binding of regulatory proteins to DNA (16). Thus, in order to fully characterize a protein-DNA interaction, oligomerization of the protein must be considered. For example, as discussed by Senear et al. (16), linkage between protein self-assembly and DNA binding may produce free energy changes for oligomerization which will differ depending on whether binding to DNA favors or disfavors protein self-assembly. In the case of λ, Laue et al. (19) found that binding of O R 1 oligonucleotides to octameric CI repressor did not dissociate it to tetramers, and, therefore, any model for λ which proposes pairwise cooperativity between adjacent DNA-bound dimers (to give DNA-bound tetramers) must also consider the free energy required to destabilize the octamer. Similarly, in the case of 186, further studies of the protein-protein and protein-DNA interactions are required to delineate the contributions of cooperativity, linkage, and allostery to the molecular mechanism by which CI binds to its operator sites and stably maintains the lysogenic state.
In considering the ability of CI to maintain the lysogenic state, one must also be aware of the relative arrangement of the lytic and lysogenic promoters (Fig. 1). An inevitable consequence of the overlapping face to face arrangement of the p R and p L operators in 186 is that RNA polymerase, in transcribing CI from p L , must traverse p R in order to maintain the lysogenic state. In doing so it must presumably dislodge CI already bound at p R , providing the opportunity for loss of repression. A possible mechanism for preventing this loss of repression involves the flanking sites FL and FR providing a locally high concentration of CI, allowing rapid rebinding of CI to p R following passage of RNA polymerase. Such a mechanism would require oligomerization of CI such that it could bind simultaneously to p R and either FL or FR. These studies have demonstrated that this oligomerization can occur, at least in solution.
Turning now to Apl, this protein functions as a repressor during prophage induction and is involved in excision of the prophage from the bacterial host, roles performed by two proteins, Cro and Xis, in phage λ. Apl binds to a set of seven direct repeats at p R /p L and five direct repeats at the attachment site, attP (Fig. 1; Ref. 5). These sites have 10-11-base pair center to center spacing, indicating that Apl binds to the same face of the helix. Given the narrow range of concentration over which Apl fills the multiple operator sites within the p R /p L region (6), cooperative interactions must be involved. Apl, like the homologous Cox proteins from P2 and HP1 phage, has a predicted helix-turn-helix DNA binding motif (18,20). In general, helix-turn-helix proteins are dimers or tetramers in solution (for example, the majority of those listed in (21)) and it is these oligomers which interact with their operator sites, usually (but not always) inverted repeat sequences. In contrast, analytical ultracentrifugation of purified Apl shows that Apl remains monomeric in solution up to millimolar concentrations.
It could be argued that, since Apl has been denatured and refolded during the course of the purification, the majority of the protein is in an inactive, nonassociating form and that the activity observed in gel shift assays arises from a small fraction of active, associated protein. While we cannot completely rule this out, three lines of evidence argue against this possibility. Firstly, Apl purified to approximately 80% purity by ion exchange chromatography (without unfolding) eluted on a calibrated Sephacryl S200 column (Fig. 3) with the same elution volume as Apl purified by the unfolding/refolding procedure. This indicates that the "native" protein is the same size as the protein purified by unfolding/refolding. Secondly, this partially purified Apl fraction had approximately the same specific activity in gel shift assays as the pure refolded Apl. The yield and extent of purification of Apl (Table II) are consistent with retention of DNA binding activity throughout the purification procedure. Finally, while monomeric helix-turn-helix DNA binding proteins are unusual, there are examples in the literature. Thus, while λ Cro is a dimer in solution, Cro protein from phage 434 remains monomeric, even at the high concentrations employed for crystallization (22). Likewise, the biotin operon repressor (BirA) remains monomeric at concentrations 2-3 orders of magnitude higher than the concentration required for operator binding (23). Of particular interest are the Ner proteins of bacteriophage Mu and the closely related phage D108. Like apl, ner is functionally analogous to cro in that it is the first gene encoded by the early lytic operon and that its protein product binds between the lytic and lysogenic promoters to negatively regulate transcription (24). The overlapping, face-to-face arrangements of the lytic and lysogenic promoters of Mu and D108 are quite similar to that of 186. Like Apl, Ner binds symmetrically between the two promoters (25,26). Mu Ner binds, with a dissociation constant in the nanomolar range, as a tetramer on a 30-bp oligonucleotide (26), yet is monomeric in solution at a concentration of 25 μM (27). Similarly, D108 Ner forms dimers on DNA but is monomeric in solution up to 200 μM (28). Although the tertiary structures of the Ner proteins are not known, they are selected by the weight matrix method of Dodd and Egan (18) as being potential helix-turn-helix proteins, albeit with a relatively weak score.

FIG. 7. Thermodynamic cycle for a self-associating protein (A) binding to DNA. Shaded areas represent DNA binding sites. According to this scheme, the protein can either self-associate in solution (K 1), followed by binding of the oligomeric form (K 2), or the monomers can bind directly to DNA (K 3), followed by oligomerization on the DNA (K 4). In the case of Apl, the protein self-association constant, K 1, is 0 and so binding to DNA must occur via the K 3 K 4 pathway.
Given the monomeric nature of the Apl protein in solution and the fact that Apl is expected to bind cooperatively to its recognition sequences, it follows that this cooperativity can only be mediated on the DNA. This is illustrated in Fig. 7, which shows a simple thermodynamic cycle for a protein (A) binding to multiple sites on DNA. The protein-DNA complex can be formed in two ways. The protein can either self-associate in solution and then bind to DNA (K 1 K 2 pathway), or it can bind as monomers to the DNA (K 3 K 4 pathway), where the protein subunits may or may not interact to give rise to cooperativity. A combination of these pathways is also possible if both the monomeric and the self-associated forms of the protein have affinity for the DNA. The overall equilibrium will then reflect the relative affinities of the two forms of the protein for the DNA. For Apl, K 1, the protein self-association constant in solution, is 0. Apl binding can therefore only occur through the K 3 K 4 pathway: DNA binding followed by protein-protein association. The structural basis of cooperativity in Apl binding is unknown but, given the periodic enhancements of DNase I cleavage noted in footprint experiments (6), we speculate that it involves both protein-protein contacts and DNA bending.
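The cycle of Fig. 7 can be written out explicitly. The following is a sketch for the simplest case of two monomers and two adjacent sites. The species labels are our own assumptions, not notation from the figure: A is the monomer, A_2 the solution dimer, D the free DNA, ADA the DNA carrying two non-interacting monomers, and A_2D the DNA carrying the interacting pair. Each K is treated as a macroscopic association constant.

```latex
% Hypothetical species labels (assumptions, see lead-in):
%   A = monomer, A_2 = solution dimer, D = free DNA,
%   ADA = two monomers bound without contact, A_2D = interacting pair on DNA.
\begin{align}
K_1 &= \frac{[\mathrm{A_2}]}{[\mathrm{A}]^2}, &
K_2 &= \frac{[\mathrm{A_2D}]}{[\mathrm{A_2}][\mathrm{D}]}, \\
K_3 &= \frac{[\mathrm{ADA}]}{[\mathrm{A}]^2[\mathrm{D}]}, &
K_4 &= \frac{[\mathrm{A_2D}]}{[\mathrm{ADA}]}, \\
K_1 K_2 &= K_3 K_4 = \frac{[\mathrm{A_2D}]}{[\mathrm{A}]^2[\mathrm{D}]}.
\end{align}
```

Because both routes connect the same initial and final states, the products of constants along each route must be equal. In the limit K_1 → 0, the solution dimer concentration [A_2] = K_1[A]^2 vanishes, so the complex is populated only by the K_3 K_4 route. In this scheme any cooperativity appears in K_4, the association step that takes place on the DNA.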
The present results have provided a framework on which to base models describing the interaction of these control proteins with their operators in the 186 control region. We are currently undertaking further studies of both CI-DNA interactions and Apl-DNA interactions in order to dissect at the molecular and energetic levels the mechanisms by which these proteins control the lysis/lysogeny switch in bacteriophage 186.