The Helix-Turn-Helix Motif of the Coliphage 186 Immunity
Repressor Binds to Two Distinct Recognition Sequences*
Keith E. Shearwin, Ian B. Dodd, and J. Barry Egan
From the Department of Molecular Biosciences, University of
Adelaide, Adelaide, Australia 5005
Received for publication, August 13, 2001, and in revised form, October 26, 2001
ABSTRACT
The CI protein of coliphage 186 is responsible
for maintaining the stable lysogenic state. To do this CI must
recognize two distinct DNA sequences, termed A type sites and B type
sites. Here we investigate whether CI contains two separate DNA binding motifs or whether CI has one motif that recognizes two different operator sequences. Sequence alignment with 186-like repressors predicts an N-terminal helix-turn-helix (HTH) motif, albeit with poor
homology to a large master set of such motifs. The domain structure of
CI was investigated by linker insertion mutagenesis and limited
proteolysis. CI consists of an N-terminal domain, which weakly
dimerizes and binds both A and B type sequences, and a C-terminal
domain, which associates to octamers but is unable to bind DNA. A
fusion protein consisting of the 186 N-terminal domain and the phage λ
oligomerization domain binds A and B type sequences more
efficiently than the isolated 186 CI N-terminal domain, hence the 186 C-terminal domain likely mediates oligomerization and cooperativity.
Site-directed mutation of the putative 186 HTH motif eliminates binding
to both A and B type sites, supporting the idea that binding to the two
distinct DNA sequences is mediated by a variant HTH motif.
INTRODUCTION
DNA binding proteins are often modular in structure, with separate
domains responsible for binding and oligomerization. Such an
arrangement, even in a simple system such as the lysis-lysogeny switch
of coliphage λ, permits remarkable control of gene expression through
a series of thermodynamically linked protein-protein and protein-DNA
interactions. The λ genetic switch, one of the most intensively
studied systems in biology, has in many ways provided the basis for the
study of switch biology in higher organisms (1, 2). Coliphage 186, a
member of the P2 family of phage, shows essentially no similarity with
λ at the protein or DNA level, and so the two have presumably evolved
independently of each other. Nevertheless, the lysis-lysogeny switches
of each show superficial similarity, and it is expected that a
comparative analysis of the differences in detail between the two will
improve overall understanding of switch operation, hence the present
structure-function study of 186 CI repressor.
The immunity repressor, CI, of coliphage 186 is responsible for
maintenance of the stable lysogenic state and achieves this by binding
directly over and repressing two promoters: pR,
the promoter of the early lytic operon and pB,
the promoter for the late promoter activator gene B (Fig.
1). 186 contains a total of four binding
sites for CI, including two sites (FL and
FR) that flank the lytic promoter (6). These
flanking sites play a role in fine-tuning CI regulation of transcription
from pR and from the lysogenic promoter
pL.1
The flanking sites FL and
FR each consist of an inverted repeat, while the
CI binding site at the pB promoter contains a
pair of inverted repeats. These four inverted repeats share sequence
similarity and are separated in each case by a five-base pair A/T-rich
spacer. These CI binding sites have been designated A type sites (Fig. 1). The CI binding site at pR,
responsible for repressing the early lytic genes, consists of three
inverted repeats. There is a central A type site, which has a four-
rather than five-base pair spacing between conserved bases and is
designated an A' site. Situated on either side of the A' site are
inverted repeats, again with an A/T-rich spacer, but unrelated in
sequence to the A type sites (see Ref. 6, Fig. 1). These alternative
recognition elements have been termed B type sites. Hence, the CI
binding site at pR has the arrangement B-A'-B.
The recognition elements at pR all lie on the
same face of the helix and are strongly supported by DNase I footprint
data and by a library of 19 virulent (vir) mutations (6, 7).
Thus, CI is able to recognize two distinct DNA sequences.

Fig. 1.
Map of the control region of coliphage
186. The organization of the early control region of the 186 genome is shown, with the region from the PstI site
(sequence coordinate 20,315) to the end of the cII gene
(coordinate 23,943) enlarged to show details (3, 4, 5). Sequence
numbering starts at the left cos end of the 186 genome.
Genes are shown as gray boxes: B, activator of late
transcription; 69, unknown function; int, integrase;
cI, immunity repressor; apl, excisionase and
transcriptional control; cII, establishment of lysogeny.
Promoters are shown as arrowheads, their transcripts as
arrows, terminators as stem loops, and the phage
attachment site attP as a solid box. The CI
binding sites at pB, FL,
pR, and FR are indicated
as solid circles. The sequences of the inverted repeats from
each of these sites are shown in the lower part of the figure. The
diamonds indicate the center of symmetry of each inverted repeat.
FL and FR each consist of
one A type site, pB consists of two A type
sites, while pR has the arrangement B-A'-B. The
consensus sequences for A type and B type sites are shown, where w = A or T, y = C or T and r = A or G. The central w in the A
type consensus is optional, reflecting the alternate spacing of A and
A' type sites.
In the present work, we set out to determine whether there are two
distinct DNA binding regions within the CI protein or whether there is
just one motif that binds with relaxed specificity to the two different
types of binding sites. To this end we have investigated the domain
structure of CI by sequence analysis, linker insertion mutagenesis, and
limited proteolysis. We have examined the self-association and DNA
binding properties of the isolated domains and from the information so
obtained carried out mutagenesis on residues predicted to be critical
for DNA binding.
EXPERIMENTAL PROCEDURES
Bacteria--BL21 (λDE3) pLysS was used as the
host for expression of CI and its variants from the T7 promoter of the
pET3a vector (8). NK7049 (9) was used as the host for lacZ
reporter genes. NK7049 (λRS45 pR short lacZ) was used to assay transposon insertion mutants, NK7049
(λRS45 pR HincII/SnaBI lacZ) was used to assay the ability of CI and its variants
to repress pR, and NK7049 (λRS45 pB lacZ) was used to assay
pB activity.
Plasmids, Bacteriophage--pRAS1 was created by amplifying the
186 cI gene, including the native ribosome binding
site, by PCR using primers 108 and 55, digesting the PCR product
with EcoRI and BamHI, and ligating into
pBluescriptKS (Stratagene), which had been digested with the same enzymes.
pET3aHis6 is a derivative of pET3a, which has been
engineered to encode a C-terminal thrombin cleavage sequence (LVPRGS)
followed by a six-histidine affinity tag. pETCIHis6 was
created by amplifying by PCR the cI gene from pET3aCI (10)
with primers T7 and 129, digestion of the PCR product with
NdeI and SacII, and ligation into
pET3aHis6, which had been digested with the same enzymes. pETCI (1-82)His6 and pETCI (83)His6 were
constructed similarly to pETCIHis6, using primers designed
to introduce NdeI and SacII sites at the
appropriate positions.
pETCI(hybrid)His6 was constructed by amplifying by
PCR the region encoding amino acids 1-82 of 186 cI using a
3'-primer designed to introduce an SphI site. The region of
the λ cI gene encoding amino acids 93-236 of λ
repressor was amplified by PCR using a 5'-primer designed to introduce
an SphI site. The 186-cI PCR product was digested
with NdeI and SphI, the λ PCR product with
SphI and SacII, and these fragments inserted into
NdeI/SacII-digested pET3aHis6. pETCI(HTH−)His6 was generated by performing
QuikChange (Stratagene) site-directed mutagenesis on a
pETCIHis6 template using primers 219 and 220.
pMRR9 (11) is a derivative of the lacZ promoter assay
plasmid pRS415 (9) containing translation stop codons from pKO2 and the
pUC polycloning site. pMRR9 pR(short), used to
generate NK7049 (λRS45 pR short lacZ), contains the 22,980- to 23,190-fragment of 186 (−81
to +129 of pR) inserted into the XbaI
site of pMRR9, such that transcription from pR
reads into lacZ. pMRR9
pR(HincII/SnaBI), used to
generate NK7049 (λRS45 pR HincII/SnaBI lacZ), contains the 22,583- to
23,552-fragment of 186 (−477 to +492 of pR)
inserted into the SmaI site of pMRR9 such that transcription
from pR reads into lacZ. pMRR9
pB, used to generate NK7049 (λRS45 pB lacZ), contains the 20,408- to
20,647-fragment of 186 (−176 to +64 of pB)
inserted into the EcoRI and KpnI sites of pMRR9
such that transcription from pB reads into
lacZ. Any regions amplified by PCR were checked by sequencing.
λRS45 is a λ phage vector used to transfer transcriptional reporter
fusions made in pMRR9 into single copy. λRS45 and pMRR9 share
portions of the N terminus of both the β-lactamase gene and the
lacZ gene, thus allowing any promoter insert in pMRR9 to be
recombined into the phage (9). Lysogenization with this recombinant
phage gives a single copy chromosomal fusion.
Oligonucleotides--Sequences of oligonucleotides (shown 5' to 3') are as follows.
A32A top: CCCCCTCGAGATTCACTTAATGTGAATGTCGACCCCCTCGAGATTCACTTAATGTGAATGTCGACGGTA (CI binding sites are shown in bold).
A32A bottom biotin: Biotin-TACCGTCGACATTCACATTAAGTGAATCTCGAGGGGGTCGACATTCACATTAAGTGAATCTCGAGGGGG (CI binding sites are shown in bold).
B32B top: AGCTTTCATTTCGATAAAACCTATTGTCGACCCCCTCGAGCTTTGGCTAAACCCACGCAATCTAG (CI binding sites are shown in bold).
B32B bottom biotin: Biotin-CTAGATTGCGTGGGTTTAGCCAAAGCTCGAGGGGGTCGACAATAGGTTTTATCGAAATGAAAGCT (CI binding sites are shown in bold).
OL1-OL1 top: CCCTGTACATATCACCGCCAGTGGTATTTACTATATCACCGCCAGTGGTAATTAGCCGGCACCCC.
OL1-OL1 bottom biotin: Biotin-GGGGTGCCGGCTAATTACCACTGGCGGTGATATAGTAAATACCACTGGCGGTGATATGTACAGGG.
Primer 55: CACGGATCCAACCGCCAGCC (BamHI site shown in bold).
Primer 108: GGAATTCTGAATAGGTTTTATCG (EcoRI site shown in bold).
Primer 129: GTCCCCGCGGCACCAGGTTAACCTCGCTGTA (SacII site shown in bold).
Primer T7: AATACGACTCACTATAG.
Primer 219: CACTTCGATATCGCGCGCGAGTCATTGTCAAACAGG (non-native sequence in bold).
Primer 220: CCTGTTTGACAATGACTCGCGCGCGATATCGAAGTG (non-native sequence in bold).
Protein Purification--
Escherichia coli strain
BL21 pLysS containing the various pETCIHis6 constructs was
grown in LB broth (500 ml) containing 100 µg/ml carbenicillin, 30 µg/ml chloramphenicol at 37 °C to an
A600 of 0.55-0.7, induced with
isopropyl-1-thio-β-D-galactopyranoside to a final
concentration of 0.5 mM, and growth continued for a further
2-3 h. Cells were harvested by centrifugation, washed once with 50 mM Tris-HCl, 0.1 mM EDTA, 150 mM
NaCl, and 10% glycerol, pH 7.5 (TEG150), and the cell pellet stored at
−20 °C until use. For protein purification, the cell pellet was
resuspended in 20 ml of buffer A (20 mM sodium phosphate,
pH 7.2, 500 mM NaCl) and sonicated on ice, and cell debris
removed by centrifugation (12 000 × g, 30 min,
4 °C). PMSF2 (50 µM) was added to inhibit proteases. The supernatant was
loaded onto a freshly charged 5-ml Hi-Trap chelating column (Amersham Biosciences, Inc.) using a disposable syringe. The column was washed with 12 column volumes of buffer A, followed by 12 column volumes of buffer A containing 150 mM imidazole, and the
protein was eluted with 2 column volumes of buffer A containing 500 mM imidazole. Fractions of 200 µl were collected and
assayed for protein by absorbance at 280 nm. Protein-containing
fractions were pooled and dialyzed extensively against TEG150. Purity
of the dialyzed protein was examined by SDS-PAGE on 10% Tris-Tricine
gels and judged to be better than 95% in all cases. Protein
concentrations were determined spectrophotometrically, using molar
extinction coefficients of 23950 M−1 cm−1 for CIHis6 and
CI(HTH−)His6, 15470 M−1 cm−1 for CI
(1-82)His6, 8480 M−1 cm−1 for CI (83)His6, and 36440 M−1 cm−1 for
CI(hybrid)His6 (calculated using the SEDNTERP program).
Yields for the various proteins were between 10 and 30 mg per 500 ml of culture.
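The spectrophotometric concentration determination described above can be sketched with the Beer-Lambert law. The snippet below is only an illustration: the function name, the dictionary keys, the 1-cm path length, and the example absorbance reading are assumptions introduced here, while the extinction coefficients are those quoted in the text.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
# Dictionary keys are illustrative labels, not identifiers from the paper.
EXTINCTION_280 = {
    "CIHis6": 23950,
    "CI(1-82)His6": 15470,
    "CI(83)His6": 8480,
    "CI(hybrid)His6": 36440,
}


def molar_concentration(a280, protein, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l.

    path_cm defaults to 1 cm, which is an assumption; the text does not
    state the cuvette path length.
    """
    return a280 / (EXTINCTION_280[protein] * path_cm)


# Hypothetical absorbance reading, chosen only to show the arithmetic.
conc = molar_concentration(0.479, "CIHis6")
print(f"{conc * 1e6:.1f} µM")
```

Keeping the coefficients in one table makes it straightforward to apply the same calculation to each construct.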
His6-tagged CI was shown to be equivalent to the wild type
repressor by several criteria: (i) when expressed from a plasmid, both
forms of repressor gave immunity to infection by 186; (ii) in gel shift
assays, purified CIHis6 bound with a 5-fold higher affinity
than purified wild type protein, presumably due to the more rapid NiNTA
affinity purification procedure giving a higher active fraction of
protein; (iii) both CI and CIHis6 associated in solution in
an equilibrium between monomers, dimers, tetramers, and octamers; and
(iv) in vivo repression of pR and
pB lacZ reporters by CI and
CIHis6 was very similar (see Table I).
Linker Insertion Mutagenesis--
Linker insertion mutagenesis
was performed according to the vendor's instructions (New England
Biolabs), with minor modifications. In brief, pRAS1 was used in an
in vitro reaction (5 µl) containing target DNA (pRAS1, 20 ng), donor DNA (4.2-kb Transprimer pGPS5 carrying the kanamycin
resistance gene, 5 ng) and transposase protein. After a 1-h strand
transfer reaction at 37 °C, the enzyme was inactivated by heating to
75 °C for 10 min. Aliquots of the mixture were transformed by
electroporation into NK7049 (λRS45 pR short
lacZ) and plated on LB plates containing kanamycin (20 µg/ml), ampicillin (100 µg/ml), and X-gal (20 µg/ml) to select
for transformants carrying the transposon within pRAS1. Blue colonies
indicated the potential presence of a transposon within the CI gene,
since CI was rendered unable to repress the
pR lacZ reporter. These potential
insertion mutants were further screened by a PCR-based assay. Plasmid
DNA was isolated from those strains in which the transposon was
confirmed as being within the CI gene. The plasmid DNA was digested
with PmeI to remove the bulk of the transposon, the plasmids
religated and retransformed into strain NK7049 (λRS45
pR short lacZ). The position and
sequence of the resulting 15-bp insert were determined by DNA
sequencing. The effects of the mutations were assayed in the same
strain by measuring β-galactosidase activity.
Limited Proteolysis--
CIHis6 (1.1 mg/ml) in
TEG150 was digested at 37 °C at CI to protease molar ratios of 280 (subtilisin), 370 (papain), or 1100 (proteinase K). At appropriate time
points, 5-µl samples were diluted into an equal volume of 2× SDS
loading buffer containing 20 mM PMSF, immediately heated to
95 °C for 1 min, and analyzed by SDS-PAGE. For samples to be
analyzed by mass spectrometry, reactions (20 µl) were stopped by the
addition of PMSF and heating. Electrospray mass spectrometry was kindly
performed by Dr. C. Bagley, Institute of Medical and Veterinary
Science, Adelaide, Australia. Cleavage points were deduced from mass
spectrometry results using the PAWS program.
Chromosomal Single Copy lacZ Fusions--
Strain NK7049
transformed with the appropriate pMRR9 derivative was used as the host
for growth of the λRS45 phage vector. Phage stocks obtained were
plated on NK7049, and single recombinant plaques selected on the basis
of color in the presence of X-gal and purified once by streaking across
a lawn of NK7049. Independent blue lysogens from at least two
recombinant plaques were purified by restreaking. Single copy status of
these lysogens was confirmed by PCR (12). For assay of
pR or pB β-galactosidase activity, the appropriate CI expression plasmid
(pETCIHis6 or the parental pET3a plasmid) was transformed
into these lysogens, and liquid cultures started from single colonies.
Kinetic LacZ assays were done in 96-well microtitre plates by an
extensively modified Miller method (13). Fresh colonies on selective LB
plates were used to inoculate 200 µl of LB + antibiotic. Plates were
sealed and incubated at 37 °C for ~16 h without shaking. These
cultures were subcultured by diluting 2 µl into 98 µl of fresh
medium and incubated with rotation to an A600 of
0.2-1.2 (log phase). A600 was measured using a
Labsystems Multiskan Ascent plate reader with a 620-nm filter; the
A620 values were converted to
A600 (1-cm path length) values using an
empirically derived relationship and adjusted for light-scattering
non-linearity according to Ref. 14. Cells were chilled and then
permeabilized with polymyxin B (15) by adding 20 µl of culture + 30 µl of LB to 150 µl of lysis buffer in a microtitre plate. Lysis
buffer was TZ8 (100 mM Tris-HCl, pH 8.0, 1 mM
MgSO4, 10 mM KCl) + 2.7 µl/ml
2-mercaptoethanol and 50 µg/ml polymyxin B. The presence of
detergents and chelating agents did not improve the assay. A higher pH
value than used by Miller (13) improved display of
o-nitrophenol in the absence of Na2CO3 added to
stop the reaction. Assays were performed at 28 °C and were initiated
by addition of 40 µl of o-nitrophenyl-β-D-galactoside (4 mg/ml in TZ8). The A414 of the reaction was read every 2 min for 1 h, and enzyme activity determined as the slope of the line of best fit of A414
versus minutes (readings with A414 > 2.5 were ignored). Enzyme activity was found to be directly proportional to the A600 of the culture and the
volume of culture added to the assay (V in µl). LacZ units
were calculated as 200,000 × (A414/min)/(A600 × V) and were roughly equivalent to standard Miller units.
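The unit calculation described above (slope of A414 versus time, normalized to culture density and volume) can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function names and the synthetic readings are assumptions, while the constant 200,000, the A414 cutoff of 2.5, and the 2-min reading interval come from the text.

```python
def slope_of_best_fit(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def lacz_units(minutes, a414, a600, volume_ul, a414_max=2.5):
    """LacZ units = 200,000 x (A414/min) / (A600 x V).

    Readings with A414 above a414_max are dropped before fitting,
    following the cutoff stated in the text.
    """
    kept = [(t, a) for t, a in zip(minutes, a414) if a <= a414_max]
    times = [t for t, _ in kept]
    values = [a for _, a in kept]
    rate = slope_of_best_fit(times, values)  # A414 per minute
    return 200_000 * rate / (a600 * volume_ul)


# Synthetic, perfectly linear readings taken every 2 min for 1 h
# (hypothetical data, used only to show the arithmetic).
times = list(range(0, 62, 2))
readings = [0.05 + 0.01 * t for t in times]
units = lacz_units(times, readings, a600=0.5, volume_ul=20)
print(round(units, 1))
```

With a rate of 0.01 A414/min, an A600 of 0.5, and 20 µl of culture, the formula gives 200,000 × 0.01 / (0.5 × 20) = 200 units.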
Analytical Ultracentrifugation--
Sedimentation experiments
were performed in a Beckman XL-I analytical ultracentrifuge using
absorbance optics and a four-hole An60Ti rotor. Approximately 100 µl
of sample and 105 µl of reference solution were loaded in the sectors
of the epon centerpieces. Following 24 h of centrifugation, scans
were compared at 3-h intervals to ensure that equilibrium had been
reached. Data were collected at 280 nm at a spacing of 0.003 cm. The
buffer for all experiments was TEG150. Protein was prepared for
centrifugation by exhaustive dialysis against TEG150, and the
dialysate used as the reference solution for centrifugation. Buffer
density (ρ) was measured in an Anton-Paar precision density meter to
be 1.03953 g/ml at 5 °C and 1.03644 g/ml at 20 °C. The partial
specific volumes (v̄) were calculated (using the SEDNTERP
program) as 0.727 ml/g for CIHis6, 0.712 ml/g for CI
(1-82)His6, and 0.736 ml/g for CI
(83)His6.
Sedimentation data were analyzed using Sigmaplot 4.0 for Windows (SPSS
Inc., Chicago, IL) initially by fitting each data set (absorbance
versus radial distance) individually to Equation 1, the
basic sedimentation equilibrium equation, in order to estimate whole
cell molecular weights.
$$A_r = A_{r,0}\,\exp\left[\frac{M(1-\bar{v}\rho)\,\omega^{2}\,(r^{2}-r_{0}^{2})}{2RT}\right] + e \qquad \text{(Eq. 1)}$$
where Ar and Ar,0
are the absorbances at radial distances r and
r0, M is apparent molecular weight, v̄
is the partial specific volume, ρ is the solution
density, ω is the rotor speed in radians per second, T is
the temperature in degrees Kelvin, R is the gas constant and
e is a baseline error term. M,
Ar,0, and e were fitting parameters.
All data sets were then analyzed globally by fitting to an extended
version of Equation 1, modified to take into account association
between species.
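As a hedged illustration of the single-species model, the sketch below evaluates the standard sedimentation equilibrium expression, which is assumed here to be the form of Equation 1. The function name, the rotor speed, the radii, and the molecular weight in the example are hypothetical; only the partial specific volume and the buffer density are taken from the text. CGS units are used so that radii in cm and molecular weight in g/mol are consistent with the gas constant.

```python
import math

R_GAS = 8.314e7  # gas constant in erg mol^-1 K^-1 (CGS units)


def absorbance_at_radius(r, r0, a_r0, mol_weight, v_bar, rho, rpm,
                         temp_k, baseline=0.0):
    """Single-species sedimentation equilibrium model.

    A_r = A_r0 * exp[M (1 - v_bar*rho) omega^2 (r^2 - r0^2) / (2 R T)] + e
    with r and r0 in cm, M in g/mol, v_bar in ml/g, and rho in g/ml.
    """
    omega = rpm * 2.0 * math.pi / 60.0  # rotor speed in radians per second
    exponent = (mol_weight * (1.0 - v_bar * rho) * omega ** 2
                * (r ** 2 - r0 ** 2)) / (2.0 * R_GAS * temp_k)
    return a_r0 * math.exp(exponent) + baseline


# Hypothetical example: v_bar and rho are the values quoted for CIHis6 and
# TEG150 at 20 °C; the rotor speed, radii, and molecular weight are
# placeholders chosen only to exercise the function.
profile = [
    absorbance_at_radius(r, r0=6.9, a_r0=0.2, mol_weight=23000.0,
                         v_bar=0.727, rho=1.03644, rpm=15000,
                         temp_k=293.15)
    for r in (6.9, 7.0, 7.1)
]
print(profile)
```

In a fit, M, A_r0, and the baseline term would be the adjustable parameters, as stated in the text; the global analysis with association between species would need additional terms that are not shown here.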
Surface Plasmon Resonance (SPR)--
Surface plasmon resonance
experiments were conducted on a Biacore 2000 using a
streptavidin-coated chip (SA chip, BIAcore AB, Sweden). Biotinylated
DNA (PAGE-purified, SigmaGenosys, Sydney) was prepared by adding a
slight molar excess of the nonbiotinylated strand over the
biotinylated strand. The strands were annealed by heating to 90 °C
for 3 min, followed by slow cooling to room temperature. Between 68 and
135 response units (RU) of DNA was immobilized at a flow rate of 5 µl/min. Flow cell 1, containing no DNA, was used as a reference
channel, while flow cell 2, containing two tandem λ
OL operator sites (90 RU), was used as a
negative control. A32A DNA (68 RU) was immobilized in flow cell 3, and B32B DNA (135 RU) in flow cell 4.
The binding buffer for Biacore experiments was TEG150. Proteins,
diluted in TEG150, were pumped across all four flow cells at 20 µl/min and 25 °C, and responses recorded at 1 Hz.
RESULTS
Homologs of 186 CI--
The first step in investigating the
structure-function relationship of 186 CI was to search for homologs. A
number of proteins related to the 186 CI repressor were identified by
BLAST (16) (Fig. 2). The 186 CI amino
acid sequence was used initially to search the protein data base, and
four prophage proteins (repressors from phage φR67, Hemophilus
influenzae phages HP1 and S2, and Vibrio cholerae phage
K139) with homology to CI repressor were identified. The unfinished
microbial genomes data base was also searched with 186 CI as the input
sequence, and two additional proteins were found. The first was from
Klebsiella pneumoniae (WUSTL Genome Sequencing
Center) and, judging from other sequence similarities, appears to be
the CI homolog of a phage closely related to 186, present as a
prophage. The second was a putative prophage repressor from
Salmonella typhi (CT18 phage) (Sanger Center, Cambridge, UK). Several partial sequences related to 186 CI
were also evident in the unfinished genomes of other
Salmonella subspecies (typhimurium, paratyphi,
enteritidis) but have not been included here. A block
alignment of these 186-like proteins was then used to search the BLOCK
data base (24). This more powerful search method detected, in addition
to those proteins already identified, the CI protein from φ80, a
lambdoid phage, as being related to the 186-like repressors. Alignment
of the 186 CI repressor amino acid sequence with those of the seven
related repressor proteins reveals two blocks of homology, one of ~70 amino acids at the N terminus, and a second block of ~60 amino acids
at the C terminus (Fig. 2). The two blocks are separated by a low
homology region of 40-50 amino acids. This non-conserved region may
represent an unstructured linker joining two more highly structured
domains. In the case of the φ80 repressor, homology was less evident
at the C-terminal end. The C termini of the lambdoid repressors form
part of the RecA recognition site, with cleavage of the repressor
occurring within the central linker (25, 26). 186 CI is not
RecA-sensitive (7).

Fig. 2.
Alignment of proteins with homology to 186 CI. Alignment by ClustalW of proteins identified by BLAST as
having homology to the coliphage 186 CI repressor. The abbreviations
used are as follows: Kleb, presumptive phage protein from unfinished
sequence of Klebsiella pneumoniae (WUSTL genome sequencing
center)3; HP1, repressor
protein from phage HP1 of Hemophilus influenzae (17); S2,
repressor protein of phage S2 of Hemophilus influenzae (18);
CT18, repressor protein from a phage within Salmonella typhi
CT18 (Sanger center)4; EC67,
putative repressor from retronphage Ec67 (19); K139, repressor protein
of phage K139 of Vibrio cholerae (20); φ80, CI
repressor protein of coliphage φ80 (21). Where at least five of the
eight amino acids are identical or conserved, they are shown on a
black or gray background, respectively. Each
protein sequence was also examined for motifs. The Dodd and Egan weight
matrix method (22) and the GYM2 pattern recognition algorithm (23) both
detected potential HTH motifs. The S.D. scores obtained for each
protein by the Dodd and Egan method are given in parentheses after the
sequences. A score of 2.5 or greater indicates a likely HTH motif. Both
methods always identified the same sequence within each protein as the
most likely to contain an HTH motif, and these sequences, indicated by
the black line, are all aligned in the multiple sequence
alignment.
Each CI-like protein in Fig. 2 was examined for the presence of protein
motifs using a number of search methods. Potential helix-turn-helix
motifs were identified by both the Dodd and Egan (22) weight matrix
method and the GYM2 pattern recognition method (23) in some of the
proteins. The S.D. scores obtained for each protein by the Dodd and
Egan method are shown in parentheses following each sequence in Fig. 2.
An S.D. score above 2.5 is considered good evidence for a HTH; likely
HTH motifs were identified in some of the proteins, and both search
methods always identified the same location as the potential HTH region
(solid line in Fig. 2). Thus, although a number of the
proteins including CI itself have poor S.D. scores, the alignment of
amino acids within the N-terminal block of homology coincides with the
position of the predicted HTH motifs in each case. We take this as
evidence that 186 CI very likely contains an N-terminal
helix-turn-helix motif.
Domain Structure of CI--
We have used two techniques, linker
insertion mutagenesis and limited proteolysis, to investigate the
domain structure of the 186 CI repressor protein, with the aim of
determining whether one or both of the putative domains have the
potential to bind DNA.
In linker insertion mutagenesis (27, 28) a short stretch of amino acids
is inserted into the protein of interest. The effect of
the inserted amino acids on the
activity of the protein is dependent upon the location of the
insertion. For example, an insertion located on a surface loop of the
protein or in a relatively unstructured region would be expected to
have a minor effect on protein function. In contrast, an insertion
within a tightly folded or buried region is more likely to interfere
with protein function, whether by disrupting protein structure, protein folding, or through an effect on protein stability. To define the
regions of CI that are either tolerant or intolerant to insertions, we
have used the Genome Priming System-Linker Scanning system (New
England Biolabs, Beverly, MA). In this system, a modified Tn7-based
transposon is used in an in vitro reaction to make random (1.7 kb) insertions into the gene of interest. The majority of the
transposon sequence is then removed by restriction digest and
religation, leaving a 15-base pair insertion. In four of the six
possible reading frames, this insertion results in a five-amino acid
linker, while the other two reading frames generate stop codons. Thus,
two sets of CI mutants were generated; (i) a set of truncated proteins
and (ii) a set of proteins containing randomly located five-amino acid
insertions. The activities of these CI mutants were measured by their
ability to repress a single copy, chromosomally inserted pR lacZ
reporter, NK7049 (λRS45 pR short lacZ). In this system, unrepressed
pR lacZ gave 839 (± 77) units, while
wild type CI from pRAS1 repressed pR
lacZ to 0.7 (± 0.8) units.
Transposon insertions that resulted in truncated protein products
occurred at amino acids 36, 61, 74, 103, 106, 107, 110, 112, 123, 173, and 185. These truncated CI proteins invariably lost the ability to
repress the pR lacZ reporter. Even a
protein truncated at amino acid 185, resulting in just a seven-amino
acid C-terminal deletion, was inactive, indicating that these amino acids are required for CI function or stability.
Among the second set of mutants (Fig. 3),
insertions of five amino acids had quite different effects, depending
on the position of the insertion. β-Galactosidase units for the
inactive or partially active mutants are given to the right of Fig. 3;
active mutants were defined as those that gave less than two units of
β-galactosidase activity and are shown to the left of Fig. 3.
Insertions within the N-terminal region, with one exception, abolished
the ability of CI to repress pR. The exception
was the insertion at amino acid 5, which lies just outside the
conserved N-terminal region and which remained fully active. Western
blotting of cell extracts prepared from the inactive mutants indicated
that the inactivity of two of the mutants (insertions at amino acids 15 and 66) reflected a lack of CI in the soluble fraction (data not
shown). Together these data are consistent with the idea that
insertions within the conserved HTH-containing N-terminal region
disrupt folding and/or protein stability and are thus detrimental to
DNA binding. In contrast, mutants having insertions within the putative
linker region remained able to fully repress pR,
suggesting that the central region of CI is relatively unstructured or
forms part of a surface loop and is thus tolerant to insertions. Only
three insertions were obtained within the C-terminal region. The
insertion at amino acid 167 had no effect on CI activity, suggesting
that this amino acid may also be located on the surface of the protein. Insertions at amino acids 139 and 156 reduced but did not eliminate the
ability of CI to repress pR. One possibility is
that insertions in this region disrupt protein-protein association or
cooperativity, although we have not tested this explicitly.

Fig. 3.
Linker insertion mutagenesis. A series
of repressor mutants were generated in which 15 bp of DNA were randomly
inserted into the plasmid (pRAS1)-encoded cI gene. Depending
on the reading frame, this 15-bp insertion gave rise to either a
truncated protein or a protein with a five-amino acid insertion. The
activities of the various CI proteins were assayed in NK7049 (λRS45
pR short lacZ). Unrepressed
pR lacZ (parental pBluescriptKS only)
gave 839 (± 77) β-galactosidase units, while wild type CI (supplied
from pRAS1) repressed pR to 0.7 (± 0.8) units.
None of the 11 truncated proteins (see "Results") were able
to repress the single copy pR lacZ
reporter gene. The locations of the five amino acid insertions within
CI are shown. Insertions shown to the right of the figure reduced or
eliminated the ability of CI to repress the
pR lacZ reporter gene; the activities
of each of these mutants are shown as the mean of at least three
determinations (± 95% confidence limits). CI mutants containing the
insertions shown to the left of the figure retained the ability to
repress the reporter, giving less than two lacZ units. The two shaded
areas represent the blocks of homology described in Fig. 2.
Limited proteolysis was used to further probe the domain structure of
CI, the principle being that structured regions of the protein will be
more resistant to low levels of protease than an unstructured linker
region. Purified CIHis6, which we have shown to be
equivalent to wild type CI ("Experimental Procedures"), was
digested with low levels of protease (papain, proteinase K or
subtilisin), aliquots removed over the course of the digestion, and the
reactions quenched with PMSF. These samples were analyzed by
SDS-PAGE using 10% Tris-Tricine gels (29) (Fig.
4). At early time points, a stable
fragment of ~14 kDa was observed for each of the proteases employed.
A fragment representing the remainder of the protein (expected size
~8.5 kDa) was not observed. In the case of the subtilisin and
proteinase K digests, the 14-kDa CI fragment was further digested at
later time points to a stable 8- to 9-kDa fragment.

Fig. 4.
Limited proteolysis of
CIHis6. CIHis6 (1.1 mg/ml) was
digested with subtilisin (4 µg/ml), papain (3 µg/ml), or proteinase
K (1 µg/ml) in a 50-µl reaction. Samples (5 µl) were taken at the
time points indicated and quenched with PMSF. Samples were heated to
90 °C for 2 min and analyzed on a 10% Tris-Tricine SDS gel. The
full-length protein is indicated by the arrows. The sizes of
the molecular weight markers (M) are shown to the right of
the gels. To determine the location of the cleavage points, samples
from selected time points were subjected to electrospray mass
spectrometry, and the masses of the peptides used to infer the point of
cleavage. The results are summarized in the lower part of the figure.
The N-terminal and C-terminal regions of homology, described in Fig. 2,
are shaded, while the His6 affinity tag is in
black. Lines indicate the major fragments obtained for each
of the proteases.
The boundaries of these stable fragments were determined by analyzing
samples from each of the digests using electrospray mass spectrometry.
The results are summarized in the lower part of Fig. 4. All three
proteases cleaved within the presumably unstructured C-terminal
six-histidine affinity tag of CI. The 14-kDa fragment from the
subtilisin digest represents the C-terminal region of CI, cut primarily
at amino acid 79, along with some minor products digested within a few
amino acids either side of residue 79, consistent with the nonspecific
nature of this protease. The smaller subtilisin fragment(s) obtained at
later time points result from further digestion at both ends of the
larger fragment to give a minimal polypeptide consisting of residues
116-198. This result suggests that the C-terminal region is a stable,
folded domain. The absence of a stable N-terminal fragment suggests
that the N-terminal region is at least partly unstructured in the
absence of DNA and so is susceptible to proteolysis. However,
repetition of the proteolysis experiment in the presence of an
oligonucleotide containing the FR CI binding
site gave an identical pattern of cleavage, suggesting that either the
N-terminal region remains susceptible to proteolysis when bound to DNA
or that upon cleavage elsewhere within the bound protein, the
N-terminal fragment dissociates from the DNA and is then susceptible to
the protease.
Limited proteolysis with papain gave fragments cleaved primarily around
residues 79 and 80, again with some other minor products cleaved at
nearby residues. Digestion of CIHis6 with proteinase K also
gave cleavage at residues 77-79 at early time points, followed by
cleavage primarily at residue 110 at later time points (Fig. 4). This
protease-sensitive region (approximately amino acids 77-116) of CI is
consistent with the central non-conserved region shown in Fig. 2.
Properties of CI Domains--
To further examine their biochemical
properties in comparison with full-length repressor, residues 1-82
(N-terminal region) and residues 83-204 (linker plus C-terminal
region) were cloned, expressed, and purified using a C-terminal
six-histidine affinity tag. Both fragments were soluble and obtained in
milligram amounts.
Full-length wild type CI repressor associates in solution in an
equilibrium between monomers, dimers, tetramers, and octamers (10). The
His6 affinity-tagged CI also associates to octamers in
solution (data not shown), with dimers the predominant species at the
concentration range in which DNA binding first occurs. The oligomeric
state of the N-terminal and C-terminal fragments of CI was assessed by
sedimentation equilibrium (Fig. 5). For CI (1-82)His6, data were obtained at three different
loading concentrations (cell 1, 5 µM; cell 2, 16 µM, and cell 3, 32 µM) at a rotor speed of
24,000 rpm (Fig. 5a). Initially, the individual scans were analyzed in terms of Equation 1 to obtain whole cell molecular weights.
These ranged from 15,870 (± 420) for cell 1 to 18,660 (± 100) for
cell 3, values that approach twice that of the monomer molecular weight
(10,549). The data for all three cells were then fitted globally to a
number of association schemes, with the monomer molecular weight fixed
at 10,549. The best fit (as judged by the sum of squares of the
residuals) was to a monomer-dimer equilibrium, with an association
constant of 2.5 × 10⁵ M⁻¹ (ΔG = −6.9 kcal/mol). When the molecular weight of monomer
(M1) was included as an additional fitting
parameter, the association constant was unchanged, and a value of
10,770 (± 170) was obtained for M1. There was
no evidence for species beyond dimer. Thus, the N-terminal fragment is
able to form stable dimers in solution, albeit with an association
constant at least 10⁴-fold weaker than the full-length
protein (10). It seems reasonable to now refer to the CI
(1-82)His6 fragment as a domain, even though it is not
highly resistant to proteolysis.
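
As a rough consistency check on the monomer-dimer fit, the sketch below computes the weight-average molecular weight expected at each loading concentration from the fitted association constant and the fixed monomer molecular weight. It assumes an ideal two-species equilibrium evaluated at the loading concentration. A whole cell average from a sedimentation equilibrium scan is weighted over the redistributed gradient, so only approximate agreement should be expected. The function names are illustrative and are not part of the original analysis.

```python
import math

K_A = 2.5e5    # fitted monomer-dimer association constant, M^-1 (from the text)
M1 = 10549.0   # monomer molecular weight of CI (1-82)His6 (from the text)


def monomer_dimer(total_monomer_molar, k_a=K_A):
    """Return (free monomer, dimer) molar concentrations.

    Mass balance in monomer units: total = m + 2*d, with d = k_a * m**2,
    so the free monomer m is the positive root of 2*k_a*m**2 + m - total = 0.
    """
    m = (-1.0 + math.sqrt(1.0 + 8.0 * k_a * total_monomer_molar)) / (4.0 * k_a)
    d = k_a * m * m
    return m, d


def weight_average_mw(total_monomer_molar, k_a=K_A, m1=M1):
    """Weight-average molecular weight of the monomer/dimer mixture."""
    m, d = monomer_dimer(total_monomer_molar, k_a)
    w_monomer = m / total_monomer_molar        # weight fraction present as monomer
    w_dimer = 2.0 * d / total_monomer_molar    # weight fraction present as dimer
    return m1 * (w_monomer + 2.0 * w_dimer)


# Loading concentrations of cells 1-3 (5, 16 and 32 micromolar).
for c_um in (5, 16, 32):
    mw = weight_average_mw(c_um * 1e-6)
    print(f"{c_um:>2} uM loaded: weight-average MW ~ {mw:,.0f}")
```

Under these assumptions the computed values fall between roughly 16,000 and 19,000, in line with the whole cell molecular weights reported for cells 1 to 3.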

Fig. 5.
Self association of CI domains.
Solutions of CI (1-82)His6 (a) and CI
(83)His6 (b, c) were centrifuged
to sedimentation equilibrium, and the distributions of absorbance
versus radial distance recorded at 280 nm. For the sake of
clarity, only every third data point is shown. Following initial
analysis of each individual scan in terms of Equation 1, all scans for
each protein were fitted globally to a number of reaction schemes. The
best fits to the data are shown as solid lines. a, loading
concentrations of CI (1-82)His6 were 5 µM
(circles), 16 µM (squares), and 32 µM
(triangles), and the rotor speed was 24,000 rpm. The CI
(1-82)His6 data was fit best by a monomer-dimer
equilibrium, with an association constant of 2.5 × 10⁵ M⁻¹. Loading concentrations of
CI (83)His6 were 19 µM (circles), 50 µM (squares), and 77 µM (triangles). Two
rotor speeds were used, 12,000 rpm (b) and 16,000 rpm
(c). The CI (83)His6 data fit best to a
dimer-octamer association, with an association constant of 1.4 × 10¹⁵ M⁻³, shown as a solid line.
Shown below each plot are the residuals, the difference between the
experimental data and the line of best fit.
|
|
Data for CI (83)His6 were obtained at three loading
concentrations (cell 1, 19 µM; cell 2, 50 µM; cell 3, 77 µM) and two rotor speeds
(12,000 rpm (Fig. 5b) and 18,000 rpm (Fig. 5c)).
Analysis in terms of Equation 1 gave whole cell molecular weights in
the range 57,900 (± 1200) to 85,760 (± 1700), indicating association to a species at least 6.3-fold larger than the monomer
(M1 = 13,490). All six data sets were then
fitted globally to a number of models. The best fit (Fig. 5,
b and c) was to a dimer to octamer association with an association constant of 1.4 × 10¹⁵ M⁻³ (ΔG = −19.2 kcal/mol). It was not
possible to obtain data on the monomer-dimer association, which was
essentially complete over the concentration range accessible in the
sedimentation experiments. Nor was it possible to obtain information
about tetramer formation, since the tetramer to octamer transition is a
concerted (energetically favored) process and tetramer is not a
significantly populated species. However, the free energy of
association per dimer for the dimer to octamer transition is
−4.8 kcal/mol for CI (83)His6, compared with −5.3
kcal/mol for wild type CI (10). This calculation suggests that the
majority of the free energy for CI association is derived from
interactions between C-terminal domains.
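
The free energies quoted with the association constants follow from the standard relation ΔG = −RT ln K, and the per-dimer value follows from dividing the dimer to octamer free energy by the four dimers in an octamer. The sketch below reproduces that arithmetic. The temperature is an assumption (4 °C), since the run temperature is not stated in this section, and the function and variable names are illustrative only.

```python
import math

R_KCAL = 1.98720e-3   # gas constant in kcal mol^-1 K^-1


def delta_g(k_assoc, temp_kelvin):
    """Standard free energy of association, dG = -R*T*ln(K), in kcal/mol."""
    return -R_KCAL * temp_kelvin * math.log(k_assoc)


# ASSUMPTION: 4 degrees C; the experimental temperature is not given here.
TEMP_K = 277.15

dg_monomer_dimer = delta_g(2.5e5, TEMP_K)    # CI (1-82)His6, K in M^-1
dg_dimer_octamer = delta_g(1.4e15, TEMP_K)   # CI (83)His6, K in M^-3

# Four dimers assemble into one octamer, so the per-dimer share is dG / 4.
DIMERS_PER_OCTAMER = 4
dg_per_dimer = dg_dimer_octamer / DIMERS_PER_OCTAMER

print(f"monomer-dimer: {dg_monomer_dimer:.2f} kcal/mol")
print(f"dimer-octamer: {dg_dimer_octamer:.2f} kcal/mol")
print(f"per dimer:     {dg_per_dimer:.2f} kcal/mol")
```

With the assumed temperature the results lie within about 0.1 kcal/mol of the magnitudes quoted in the text (6.9, 19.2, and 4.8 kcal/mol); exact agreement depends on the temperature actually used.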
The functions of the domains were tested in two ways: (i) the ability
to repress a reporter gene under control of the early lytic promoters
pR or pB and (ii) binding
in vitro to A type and B type sites. The ability of these
protein fragments to repress a single copy chromosomally inserted
pR lacZ and
pB lacZ reporter in vivo
was tested (Table I). CI and its variants
were expressed from the T7 promoter of pET3a-based plasmids in a strain
lacking T7 polymerase. There was sufficient "leakage" of expression
in this system to give approximately the same level of CI expression as
that found in a 186 lysogen (data not shown). The strong
pR promoter was repressed ~230-fold, from 534 units in the absence of CI to 2.3 units in the presence of full-length
CIHis6. The pB promoter, which at
139 units is 4-fold weaker than the pR promoter, is also not as strongly repressed by CI, retaining 4.6 units of activity in the presence of full-length CIHis6, a 30-fold
repression. This weaker repression of pB
probably reflects the number and arrangement of the CI operators at the
respective promoters (Fig. 1). Neither CI (1-82)His6 nor CI
(83)His6 was able to repress pR in this system (Table I). Similarly, the CI
(1-82)His6 and CI (83)His6 domains had no
effect on repression of the pB lacZ reporter. Thus, at least at the concentrations of protein generated in
this assay system, repressive capacity of the CI fragments was lost.
This is consistent with the inability of the CI truncation mutants
(Fig. 3a) to repress pR, even when
expressed from the high copy number pRAS1 plasmid. It is possible,
however, that the isolated CI domains may be able to bind the CI
operators, but be unable to bring about repression of the promoter.
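
The fold-repression figures quoted above are simple ratios of reporter activity without and with repressor. The short sketch below recomputes them from the units given in the text; the function name is illustrative only.

```python
def fold_repression(unrepressed_units, repressed_units):
    """Ratio of reporter activity without repressor to activity with repressor."""
    return unrepressed_units / repressed_units


# beta-galactosidase units quoted in the text for full-length CIHis6.
p_r_fold = fold_repression(534, 2.3)   # pR reporter
p_b_fold = fold_repression(139, 4.6)   # pB reporter
promoter_strength_ratio = 534 / 139    # unrepressed pR relative to pB

print(f"pR repression: {p_r_fold:.0f}-fold")
print(f"pB repression: {p_b_fold:.0f}-fold")
print(f"pR is {promoter_strength_ratio:.1f}-fold stronger than pB")
```

The ratios come out at about 232, 30, and 3.8, matching the rounded values of ~230-fold, 30-fold, and 4-fold given in the text.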
Table I
Repression of pR and pB lacZ reporters by CI variants
β-Galactosidase units are given as the mean ± 95% confidence
limits; n = number of assays.
|
|
The ability of the N and C-terminal fragments of CI to bind DNA was
measured in vitro by SPR. This technique measures binding between macromolecules by detecting changes in refractive index at the
surface of a sensor chip, the response being proportional to the mass
of macromolecule bound. While a pair of A type CI recognition sites are
found at pB, B type sites only occur in combination with an A' site at pR. Since we
wished to differentiate between the ability of CI to bind to A type and
B type sequences, we employed synthetic oligonucleotides to generate
(i) a tandem pair of A sites separated by 32 base pairs, the natural
spacing found at pB, (ii) a tandem pair of B
sites separated by the same distance, and (iii) a tandem pair of
λ OL1 operators as a control for nonspecific
binding. The sequences of the oligonucleotides used are given in
"Experimental Procedures." The lower strand of each oligonucleotide
was biotinylated to allow attachment to a streptavidin-coated biosensor
chip. The results, corrected for bulk refractive index changes by
subtracting the response of a control (no DNA) flow cell, are shown in
Fig. 6. Comparison of full-length CI
binding to A type (Fig. 6a) and B type (Fig. 6b) sites showed that CI bound more strongly to A type sites, indicated by
a greater response at a given protein concentration. A titration performed over a 100-fold range of CI concentrations showed that CI
bound to A32A at least 10-fold more strongly than to B32B (data not
shown). During this titration, nonspecific binding of CI became apparent at concentrations above 1 µM, but only if the
DNA contained CI operators. It appears that nonspecific binding may be
seeded from specifically bound repressor, similar to the phasing seen with HK022 repressor (30). This phenomenon precluded analysis of SPR
data in terms of the equilibrium binding response, while the likelihood
of multivalency of the CI-DNA interaction prevented meaningful analysis
of the binding kinetics (31). Binding of the purified CI domains was
therefore examined only qualitatively by SPR. Purified CI
(1-82)His6 bound to both A and B type sequences, albeit
weakly, with binding to B type sequences only evident at a
concentration of 10 µM. CI (83)His6 gave
no response above background to either type of site, even at a
concentration of 10 µM (data not shown).
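
The bulk refractive index correction described above amounts to a point-by-point subtraction of the control (no DNA) flow cell response from the response of each DNA-bearing flow cell. A minimal sketch of that step follows. The response values are invented purely for illustration and are not data from this study, and the function name is hypothetical.

```python
def reference_subtract(sample_ru, reference_ru):
    """Subtract the reference flow cell response from the sample response.

    Both inputs are sequences of response units (RU) sampled at the same
    time points; the result is the bulk-corrected sensorgram.
    """
    if len(sample_ru) != len(reference_ru):
        raise ValueError("sensorgrams must share the same time points")
    return [s - r for s, r in zip(sample_ru, reference_ru)]


# HYPOTHETICAL responses (RU) at five time points during an injection.
dna_cell = [0.0, 55.0, 80.0, 95.0, 100.0]    # flow cell carrying operator DNA
blank_cell = [0.0, 40.0, 42.0, 41.0, 40.0]   # underivatized (no DNA) flow cell

corrected = reference_subtract(dna_cell, blank_cell)
print(corrected)
```

The same subtraction applied to the control DNA flow cell gives the baseline against which specific binding is judged.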

Fig. 6.
Binding of CI and CI domains to A type and B
type sites. Surface plasmon resonance (BIAcore) was used to
examine qualitatively the binding of full-length CIHis6, CI
(1-82)His6, and CI (83)His6 to tandem A
type sites (a) and B type sites (b). The
streptavidin-coated biosensor chip contains four flow cells: cell 1 was
an underivatized (no DNA) control surface, cell 2 contained 90 RU of
control DNA (tandem λ OL operators), cell
3 contained 68 RU of AA DNA, and cell 4 contained 135 RU of BB DNA.
Protein, diluted to the indicated concentrations in TEG150 buffer, was
flowed across all four surfaces at a rate of 20 µl/min. For both
a and b, the curves are as follows: curve (1)
CIHis6 at 1 µM, curve (2) CIHis6
at 100 nM, curve (3) CI(1-82)His6 at 10 µM, curve (4) CI(1-82)His6 at 1 µM. The dotted lines are the response of flow
cell 2, containing the control (λ OL) DNA, to
the highest protein concentration. The responses of the A type and B
type sites to CI(83-192)His6 at a concentration of 10 µM were indistinguishable from the
λ OL control. All results have been corrected for
bulk refractive index effects.
|
|
Hybrid Repressor--
The isolated N-terminal domain is capable of
only weak dimerization and hence has likely also lost the potential for
cooperative interactions between adjacently bound dimers, leading to a
lower overall affinity for its sites. This potential for higher
association and dimer-dimer cooperativity was replaced by creating a
hybrid protein consisting of the 186 CI N-terminal domain and the well characterized
λ CI repressor C-terminal domain (32) (Fig.
7a). We reasoned that if this
fusion protein could bind to both A and B type sites, then the residues
necessary for DNA binding to both types of sites must be present in the
N-terminal region. A chimeric repressor consisting of the 186 CI
N-terminal domain (residues 1-82 of 186 CI) and the λ CI C-terminal
oligomerization domain (λ residues 92-236) was cloned, expressed,
and purified, and its ability to bind A and B type sites tested
in vivo and in vitro. The chimeric repressor was
able to repress both pR and
pB lacZ reporters in vivo
(Table I), although not to the same extent as full-length
CIHis6.

Fig. 7.
Binding by a 186-λ hybrid
protein. A linear representation of the 186-λ hybrid protein is
shown in a. Surface plasmon resonance data for
CI(hybrid)His6 binding to A type sites (b) and B
type sites (c). The same sensor chip described in the legend
to Fig. 6 was used. CI(hybrid)His6, at concentrations of 1 µM (curve 1) and 100 nM (curve 2) was passed
across all four flow channels at a flow rate of 20 µl/min. The lower
dashed line is the response of the control (λ OL) DNA channel. Results have been corrected for
bulk refractive index effects.
|
|
In SPR experiments (Fig. 7, b and c) the hybrid
repressor bound to both A and B type sites, although with somewhat
lower affinity than the wild type 186 repressor. Control BIAcore
experiments showed that full-length λ
repressor had no affinity for
either A or B type 186 sequences. Taken together, these results
indicate that at least some of the binding determinants for both A and B type sequences are located in the 186 CI N-terminal (amino acids 1-82) region. The loss of some binding affinity of the hybrid compared
with the wild type repressor is presumably due to less than optimal
cooperativity between adjacently bound dimers, since operator to
operator spacings differ between λ and 186.
Mutagenesis of Helix-Turn-Helix--
To test whether the
determinants for DNA binding to A and B type sites are both located in
the same DNA binding motif of 186 CI, critical residues in the
predicted helix-turn-helix motif were mutated. Residues to be mutated
were chosen on the basis that they should change the sequence away from
the 186-like repressor consensus, but not disrupt the structure of the
protein (Fig. 8). Residues 12 and 13 of
the HTH motif are commonly involved in sequence-specific interaction
with the DNA (34). The serines at these positions in 186 CI (amino
acids 37 and 38 of CI) were mutated to arginine and glutamic acid,
respectively, amino acids occurring frequently at these positions in
other HTH motifs. These changes actually improve the match to the Dodd
and Egan (22) HTH master set consensus (S.D. score = 0.5 for wild
type, 1.4 for mutant). We expected that these changes would not disrupt the HTH motif but would alter its DNA binding specificity. This protein, CI(HTH⁻)His6, was purified in
milligram amounts and was shown by sedimentation equilibrium to
self-associate to octamers, similar to the wild type protein. This is
good evidence that the mutations do not have a large effect on protein
folding and are specifically affecting DNA recognition.

Fig. 8.
Mutation of the helix-turn-helix motif.
The proposed structure of the 186 CI HTH motif is shown (33, 34).
Shaded residues are those proposed to be involved in
sequence specific DNA recognition. The mutations made at serine 12 and
serine 13 of the motif (amino acids 37 and 38 of CI) are indicated. The
first and last residues of the helices are numbered, as is residue 9, located at the turn and most often a glycine or alanine residue. The
bulkier aspartate residue present in CI will most likely result in
restricted stereochemistry. A common, but not essential, helix-helix
interaction between residues 5 and 15, which helps to orient the
helices, is indicated by a dashed line.
|
|
The mutated protein was unable to repress either
pR or pB lacZ
reporters in vivo (Table I) and gave no response in BIAcore experiments to either A or B type sites (not shown), even when used at
a concentration of 10 µM. In addition, when wild type 186 phage was plated on a strain carrying
pETCI(HTH⁻)His6, the resulting plaques were less
turbid than those obtained by plating on a control (pET only) strain,
suggesting that CI(HTH⁻)His6 acts as a
dominant negative mutant in vivo to wild type phage-derived
repressor. This further supports the idea that the mutant is correctly
folded, is able to hetero-associate with wild type repressor subunits,
and is unable to bind CI operators. We conclude that serines 37 and 38 within the putative HTH motif of 186 CI are necessary for binding to
both A and B type sites.
 |
DISCUSSION |
Coliphage 186, like the intensively studied bacteriophage λ, has
evolved to enable it to follow two distinct but interchangeable developmental pathways. As 186 and λ
are almost unrelated at the DNA
and protein level, the focus of this laboratory has been to study the
genetic switch of 186, since it represents an independently evolved
solution to a common problem. One aspect of this study has been to
investigate the properties of the 186 lysogenic repressor (3, 4, 6,
10). We have shown previously (6) that the 186 CI repressor binds to
four sites within the early control region of the phage and that, among
these four sites, are two distinct types of inverted repeat operator
sequences, termed A type sites and B type sites. The operators are
arranged in the order AA-A-BA'B-A, where the A' operator has a four
rather than five base pair spacing between half-sites. Since 186 CI
needs to recognize two different types of sequences, the possibility existed that CI does this using two distinct DNA binding motifs. Such
an arrangement is found for example in members of the integrase family
of proteins, which recognize core type sites and arm type sites (35,
36). In another example, the recent crystal structure of the bacterial
repressor MarA, an araC family member, shows that it contains two HTH
motifs, which together bind an asymmetric, degenerate sequence (37).
Here we have investigated the structure-function relationship of CI and
show that there is one DNA binding motif that recognizes both types of
site, rather than two distinct DNA binding motifs. We have also shown
that CI consists of two domains, an N-terminal domain (nominally amino
acids 1-82), which contains a putative helix-turn-helix motif, forms
weak dimers in solution, and is responsible for sequence-specific DNA
binding, and a C-terminal domain which, together with the linker
region, forms octamers in solution and has no capacity for DNA binding.
In terms of domain structure, 186 CI is similar to the lambdoid
repressors, which also consist of an N-terminal DNA binding domain, and
a C-terminal domain, which mediates dimerization as well as cooperative
interactions between adjacently bound dimers (32, 38). Indeed this
arrangement of domains is common among many prokaryotic and eukaryotic
transcriptional regulators where protein association is linked to DNA
binding (39). There are several lines of evidence that 186 CI also
utilizes cooperative interactions. The C-terminal domain alone can
associate strongly to octamers. Like λ repressor (40), full-length
186 CI exists in solution in an equilibrium between monomers, dimers,
tetramers, and octamers, and both proteins have similar free energies
of association (10). CI binding sites are arranged such that they are
on the same face of the helix, spaced two or three turns of the helix
apart (6). Gel mobility shift experiments show only one retarded
species, whether there are one (A), two (AA) or three (BA'B) sites
present on the DNA (6). Mutations (vir mutants) in one or
two of the inverted repeats at pR diminish
overall binding affinity, yet the same retarded complex is observed
(6). We also have preliminary evidence that CI bound at
pR can interact with CI bound at the flanking
sites,1 similar to the looping observed between λ CI
bound at the OR and OL
operators (41). Taken together, these points suggest that cooperativity
between DNA bound dimers of CI may be important for the existence of a
stable lysogenic state. The recent crystal structure of the λ
C-terminal domain has provided a model for cooperative binding between
dimers bound to adjacent sites, as well as suggesting a mechanism for
tetramer-tetramer interactions (42). The availability of mutants unable
to associate was important in confirming the validity of the proposed
models. Sequence alignment of the 186 CI-like repressors shown in Fig.
1 with a set of lambdoid phage repressors was attempted; however, no
obvious homologies across families were found, with the exception of
the lambdoid φ80 phage repressor (Fig. 1), where homology was
primarily at the N-terminal region. One approach to isolating
C-terminal association mutants (whether monomer-monomer, dimer-dimer,
or tetramer-tetramer) of 186 CI would be to select mutants of
CI(HTH⁻) that no longer display a dominant negative phenotype.
CI repressor must recognize alternate spacing between the A type
half-sites, five base pairs at A sites, four base pairs at the central
A' site of pR. Members of the araC family are
able to recognize different half-site spacings by utilizing a flexible linker between the DNA binding domain and dimerization domain (43).
Three lines of evidence suggest the presence of a flexible linker
joining two domains in 186 CI: (i) in the alignment of 186-like phages
(Fig. 2), the two blocks of homology are separated by a region (amino
acids 74-135) containing little sequence homology, (ii) insertions of
five amino acids within this region (between amino acids 87-124) did
not affect the ability of CI to repress pR (Fig.
3), and (iii) limited proteolysis of 186 CI (Fig. 4) resulted in
cleavage around amino acid 80 and, at later time points, around amino
acid 116, while retaining a stable C-terminal fragment. Thus the
presence of a linker between 40 and 60 amino acids in length may allow
CI to recognize the variable (4 or 5 bp) half-site spacing found in A
type sites and also may be important in higher order association.
It is apparent that A and B type sequences are not recognized equally
well by 186 CI. Both full-length CI and the N-terminal domain bind to a
pair of A sites at least 10-fold more strongly than to a pair of B
sites separated by the same distance. Strong repression of the
pR promoter by CI is essential for maintaining the lysogenic state and indeed, in a lysogen, pR
is repressed at least 300-fold (4). Strong cooperativity between
adjacently bound dimers of CI is likely responsible for the overall
tight binding at pR, as the individual sites
have relatively poor affinity for repressor (6). This strong
cooperativity is also manifested, at least in vitro, in the
observation of spreading of CI binding along the DNA from specifically
bound sites (Ref. 6; this work). On the other hand, in order for the
switch from lysogeny to the lytic mode of development to be effective,
derepression of the lytic promoters by removal of repressor must be
rapid and efficient; interfering with cooperativity would be one means
to facilitate this. In the lambdoid phages this is achieved through the
RecA*-mediated self-cleavage of the repressor at conserved cleavage
points in the linker region (25). With the loss of dimer-dimer
cooperativity, the N-terminal domains then have insufficient affinity
for DNA to enable repression of the lytic promoters. In the case of
186, however, the need for RecA is indirect. Induction of a 186 prophage requires a phage-encoded gene, tum, under LexA
control, whose product has antirepressor activity (7, 44, 45). Although 186 CI N-terminal domains also bind DNA less strongly than does the
full-length protein (Fig. 5), Tum does not cleave the repressor but
acts at some other level (45). One way to pursue the mechanism of Tum
action would be to test whether the 186-λ
repressor hybrid constructed in this study is susceptible to Tum.
What are the characteristics of CI that allow it to bind two distinct
operator sequences? No structural data is available for 186 CI. There
are general rules for recognition of DNA by HTH proteins (33, 34). (i)
Residues 1, 11, 12, 13, 17, and 20 of the HTH motif contact the DNA;
for 186 CI, all of the corresponding residues except Ala-11 could form
hydrogen bonds with the DNA via hydroxyl or amino groups. (ii) There is
usually a small residue (Gly or Ala) at position 9 of the turn; 186 CI
has a bulkier aspartate residue, a substitution likely to restrict the
relative orientation of the helices. (iii) Positions 4 and 15 of the
HTH should be hydrophobic; 186 satisfies this requirement with leucine
at these positions. (iv) Position 5 should not be branched; for CI,
position 5 is an alanine. Thus, although 186 CI follows the general
rules for recognition of DNA by HTH proteins, with the exception of the
residue located at the turn, the HTH motif has a poor match to the Dodd
and Egan (22) HTH master set (Fig. 2). Perhaps it is these differences
that allow dual site recognition. Our definition of A type and B type
sites is strongly supported by mutation data (6), but it is possible
there are sequence characteristics in common between the two types of
sites that we have not recognized. There are only a limited number of
sites from which to construct a consensus sequence, compared with some
bacterial regulators, which have numerous recognition sites within the
genome. In vitro site-selection methods could be used to
determine which bases are important for recognition by CI.
Alternatively, structural data on the CI N-terminal domain bound to A
sites versus B sites would provide direct answers to these questions.
 |
ACKNOWLEDGEMENTS |
We thank Peter Brautigan for preliminary
proteolysis experiments, Lyle Carrington for assistance with analytical
ultracentrifugation, and members of the Egan laboratory for valuable
discussions and reagents. We acknowledge the Genome Sequencing Center,
Washington University, St. Louis, MO for communication of DNA sequence
data prior to publication and the Sanger Center at Cambridge and Del Pickard at Imperial College for use of CT18 sequencing data.
 |
FOOTNOTES |
*
This work was supported by the Australian Research Council. The costs of publication of this
article were defrayed in part by the payment of page charges. The article must therefore be hereby marked
"advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To whom correspondence should be addressed: Dept. of Molecular
Biosciences, Univ. of Adelaide, North Terrace, Adelaide, South Australia, Australia 5005. Tel.: 61-8-8303-5362; Fax:
61-8-8303-4348; E-mail: keith.shearwin@adelaide.edu.au.
Published, JBC Papers in Press, November 7, 2001, DOI 10.1074/jbc.M107740200
1
I. B. Dodd and J. B. Egan, manuscript in preparation.
3
Genome Sequencing Center, personal communication.
4
D. Pickard, personal communication.
 |
ABBREVIATIONS |
The abbreviations used are:
PMSF, phenylmethylsulfonyl fluoride;
X-gal, 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside;
SPR, surface plasmon resonance;
RU, response units;
HTH, helix-turn-helix.
 |
REFERENCES |
1. Ptashne, M. (1986) A Genetic Switch, Cell Press and Blackwell Scientific, Cambridge
2. Friedman, D. I., and Court, D. L. (2001) Curr. Opin. Microbiol. 4, 201-207
3. Kalionis, B., Dodd, I. B., and Egan, J. B. (1986) J. Mol. Biol. 191, 199-209
4. Dodd, I. B., Kalionis, B., and Egan, J. B. (1990) J. Mol. Biol. 214, 27-37
5. Dodd, I. B., Reed, M. R., and Egan, J. B. (1993) Mol. Microbiol. 10, 1139-1150
6. Dodd, I. B., and Egan, J. B. (1996) J. Biol. Chem. 271,
11532-1154 |