Crystal structure of the Mor protein of bacteriophage Mu, a member of the Mor/C family of transcription activators.

Transcription from the middle promoter, Pm, of bacteriophage Mu requires the phage-encoded activator protein Mor and bacterial RNA polymerase. Mor is a sequence-specific DNA-binding protein that mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. Here we present the first structure for a member of the Mor/C family of transcription activators, the crystal structure of Mor to 2.2-Å resolution. Each monomer of the Mor dimer is composed of two domains, the N-terminal dimerization domain and C-terminal DNA-binding domain, which are connected by a linker containing a beta strand. The N-terminal dimerization domain has an unusual mode of dimerization; helices alpha1 and alpha2 of both monomers are intertwined to form a four-helix bundle, generating a hydrophobic core that is further stabilized by antiparallel interactions between the two beta strands. Mutational analysis of key leucine residues in helix alpha1 demonstrated a role for this hydrophobic core in protein solubility and function. The C-terminal domain has a classical helix-turn-helix DNA-binding motif that is located at opposite ends of the elongated dimer. Since the distance between the two helix-turn-helix motifs is too great to allow binding to two adjacent major grooves of the 16-bp Mor-binding site, we propose that conformational changes in the protein and DNA will be required for Mor to interact with the DNA. The highly conserved glycines flanking the beta strand may act as pivot points, facilitating the conformational changes of Mor, and the DNA may be bent.

Transcription initiation is a major control point for gene expression in prokaryotes (1) and is regulated by a number of gene- and regulon-specific proteins. In many cases this regulation is mediated by protein-protein interactions between transcriptional activators or repressors and one or more subunits of the bacterial DNA-dependent RNA polymerase (2).
Bacteriophage Mu is a temperate phage that infects several species of enteric bacteria, including Escherichia coli K-12 (3,4). The lytic cycle is characterized by a regulatory cascade with three phases of gene expression: early, middle, and late (5) (Fig. 1A). The early promoter, Pe, has typical −10 and −35 sequences and is recognized directly by the bacterial RNA polymerase (16). The middle and late promoters have recognizable −10 hexamers but lack the −35 hexamer; transcription from these promoters requires the phage-encoded proteins Mor and C, respectively (17-19).
The middle operon regulator, Mor, is a sequence-specific DNA-binding protein composed of 129 amino acids, with a molecular mass of 14.7 kDa (19,20). It has an isoelectric point of 6.3, an acidic N terminus, and a basic C terminus. The C-terminal region is predicted to contain a helix-turn-helix (HTH) DNA-binding motif (19). A homology search with the Mor amino acid sequence identified the Mu late promoter activator C, one bacterial regulator RdgB, and 12 proteins from other Mu-like prophages as sequence homologues (9-15, 19, 21-24) (Fig. 1D). Because there is no amino acid sequence homology between Mor and previously studied transcription factors, Mor and C define a new family of transcription factors that we call the Mor/C family.
Mor binds to the middle promoter Pm as a homodimer, recognizing a 16-bp region from −36 to −51 with respect to the transcription start site at +1 (25). Mutational analysis of the middle promoter sequence (25) identified an imperfect dyad-symmetry element within the Mor-binding site (Fig. 1B). It also revealed a bias between the two halves of the Mor-binding site, showing a greater importance of the downstream half-site for Mor binding and promoter activity (25). Transcription activation of the middle promoter by Mor requires the C-terminal domains of both the α (α-CTD) and σ (σ-CTD) subunits of bacterial RNA polymerase (26). Binding of Mor and RNA polymerase to the promoter introduces a strand separation or distortion involving promoter positions −32 to −34 in addition to positions −12 to −1 (27). Based on these results, it has been proposed that the mechanism for Mor-dependent middle promoter activation involves Mor-mediated recruitment of RNA polymerase to the promoter and/or isomerization of the closed complex to an open complex through its interaction with the α-CTD and σ-CTD of RNA polymerase (27) (Fig. 1C).
In an effort to gain a better understanding of Mor function, we determined the crystal structure of Mor to a resolution of 2.2 Å. The structure reveals an unusual mode of dimerization along with a classical helix-turn-helix DNA-binding motif. Most interestingly, the HTH motifs of Mor are located too far apart in the structure to interact with two adjacent major grooves of DNA; thus, conformational changes in Mor may be needed for it to bind to DNA. Because Mor does not share sequence or structural similarities to other characterized proteins, the Mor structure could serve as a paradigm for the Mor/C family of proteins.
Fig. 1 (legend, in part). The curved arrows and plus signs indicate those promoters activated by Mor (the middle promoter, Pm) and C (the late promoters Plys, PI, PP, and Pmom). B, the middle promoter. The sequence of the middle promoter is annotated with the locations of the −10 hexamer (box), the transcription initiation site at +1 (bent arrow), the region protected from DNaseI by Mor binding (bar), and the imperfect dyad-symmetry element recognized by Mor (inverted arrows). Dots mark 10-bp intervals. C, diagrammatic representation of protein-protein and protein-DNA interactions in the middle promoter open complex.

EXPERIMENTAL PROCEDURES
Media, Chemicals, Enzymes, Columns, and Analytical Services-Routine cell growth and protein overexpression were done in LB medium (28), whereas cultures for β-galactosidase assays were grown in minimal medium with casamino acids (M9CA) (17). When needed, medium was supplemented with chloramphenicol (Cm) at 25 µg/ml and ampicillin (Ap) at 40 µg/ml. MacConkey lactose plates with 50 g/liter of MacConkey agar (Difco) were used for plate phenotyping. Chloramphenicol was purchased from Sigma, and ampicillin was obtained from U. S. Biochemical Corp. Isopropyl-β-D-thiogalactopyranoside (IPTG) and o-nitrophenyl-β-galactopyranoside were from American Bioorganics. Acrylamide, bisacrylamide, TEMED, SDS, ammonium persulfate, and Biospin columns were purchased from Bio-Rad. Chloroform and glycerol were obtained from Fisher, and thiomersal and K2PtCl4 were from Aldrich. Imidazole was purchased from Sigma, and guanidine hydrochloride was from Hampton Research. SeaKem and NuSieve agarose (both genome technology grade) were from FMC Bioproducts. Talon spin columns used for small scale protein purification were from Clontech, and the nickel-nitrilotriacetic acid-agarose resin for large scale preparations was from Qiagen. The Superdex-75 gel filtration column was from Amersham Biosciences. The restriction enzymes HindIII and PstI were from New England Biolabs. The Thermus aquaticus polymerase and T4 DNA ligase were from Roche Applied Science; T4 polynucleotide kinase was obtained from Promega Corp. Shrimp alkaline phosphatase was purchased from U. S. Biochemical Corp. Oligonucleotides were purchased from Integrated DNA Technologies.
Automated Protein Purification and Crystallization-Construction of the Mor expression plasmid, pIA69, has been described elsewhere (29). The plasmid contains the gene encoding an N-terminal histidine-tagged Mor protein (His-Mor; 17.1 kDa) under the control of a T7 promoter as well as a slightly modified Plac promoter we call PlacSYN (29,30).
The His-Mor protein was overexpressed in E. coli strain JM109 DE3 (mcrA Δpro-lac thi gyrA96 endA1 hsdR17 relA1 supE44 recA DE3/F′ lacIQ pro+; Promega Corp.). An overnight culture (200 ml) of a fresh transformant was transferred to 4 liters of LB medium containing 25 µg/ml chloramphenicol and grown at 37°C until the A600 reached 0.5-0.6. The cells were induced with 1 mM IPTG for 3 h and harvested by centrifugation. Cell pellets were resuspended in 20 mM Tris-HCl, pH 7.9, 200 mM NaCl, 10% glycerol, 1 mM 2-mercaptoethanol and lysed by sonication. The His-Mor protein was purified by nickel-affinity chromatography (Qiagen). Buffer exchange into the storage buffer (20 mM Tris-HCl, pH 7.9, 50 mM NaCl, 10% glycerol, 1 mM EDTA, and 1 mM dithiothreitol) and further purification were achieved by gel filtration chromatography in a Superdex-75 column (Amersham Biosciences). For crystallization trials, the purified protein was concentrated to 50 mg/ml by using an Amicon YM30 (30-kDa cut-off) membrane filter, which retained the 34.2-kDa His-Mor dimer. Mass spectrometry and N-terminal sequencing confirmed that the purified protein was histidine-tagged Mor (data not shown).
Crystallization was performed by the hanging drop diffusion method (31). Initial screening was carried out using Hampton Research screens, and the conditions were then refined. His-Mor crystals were obtained at either 4 or 18°C using 1.8-2.1 M NaCl as a precipitant in 0.1 M imidazole buffer (pH 7.0-7.2). Guanidine hydrochloride (125 mM) was used as an additive to improve the diffraction quality of the crystals. Because guanidine is a chaotrope, the secondary structure composition of the protein was analyzed by circular dichroism spectroscopy in the presence and absence of guanidine hydrochloride; no difference was observed between the two samples (data not shown). The crystals belong to space group P3₂21 with cell dimensions of a = 81.2 Å and c = 44.8 Å. There is one molecule in the asymmetric unit, with a solvent content of 52%.
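The reported solvent content can be cross-checked with a Matthews-coefficient estimate. The sketch below is illustrative and is not part of the original analysis. It assumes six asymmetric units per unit cell for space group P3₂21, the 17.1-kDa His-Mor monomer mass quoted above, and the conventional factor of 1.23 for a typical protein density; the function names are hypothetical.

```python
import math

PROTEIN_FACTOR = 1.23  # conventional constant (A^3/Da) for a typical protein density


def trigonal_cell_volume(a, c):
    """Unit cell volume in A^3 for a cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(a, c, asu_per_cell, mass_da, copies_per_asu=1):
    """Return (Matthews coefficient in A^3/Da, estimated solvent fraction)."""
    v_cell = trigonal_cell_volume(a, c)
    vm = v_cell / (asu_per_cell * copies_per_asu * mass_da)
    solvent_fraction = 1.0 - PROTEIN_FACTOR / vm
    return vm, solvent_fraction


# Cell dimensions and monomer mass as quoted in the text; six asymmetric
# units per cell is an assumption based on the space group symmetry.
vm, solvent = matthews(a=81.2, c=44.8, asu_per_cell=6, mass_da=17100.0)
print(f"Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent:.0f}%")
```

With these assumptions the estimate is near 2.5 Å³/Da and about 51% solvent, close to the 52% reported; small differences in the assumed mass or protein density would account for the gap.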
Data Collection-Crystals were frozen in liquid nitrogen using a cryoprotectant solution made of the mother liquor supplemented with glycerol to a final concentration of 40%, and diffraction data were collected in a nitrogen gas stream (100 K). The mercury-derivative crystals were obtained by soaking in mother liquor with 10 mM thiomersal for 3 days at 4°C. The two-wavelength data using mercury-derivative crystals and one-wavelength native data using a crystal grown in the absence of guanidine were collected at the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. The wavelengths for the data collection were determined from an x-ray fluorescence scan. We also prepared platinum derivatives by soaking the crystals with 10 mM K2PtCl4 for 3 days at 4°C, and a single-wavelength data set was collected at the X-12C beamline at the National Synchrotron Light Source at Brookhaven National Laboratory. Mercury-derivative and guanidine-free crystals diffracted to respective resolutions of 2.6 and 2.5 Å, whereas platinum-derivative crystals diffracted to 2.2 Å resolution. Mercury-derivative data were processed and scaled using DENZO/SCALEPACK (32), and the guanidine-free and the platinum-derivative data were processed with HKL2000 (32). The data collection statistics are shown in Table I.
Fig. 1 (legend, continued). Middle promoter DNA is represented by the horizontal line, with the dyad-symmetry element for Mor binding shown as inverted arrows, the spacer distortion from −34 to −32 by an open diamond, and strand separation in the open complex from −12 to −1 by the open oval. Shaded rectangles represent Mor and the subunits of RNA polymerase, with the small black rectangles indicating the Mor-polymerase interactions important for promoter activity. The location of the second α-CTD is not known, as indicated by the question mark. D, amino acid sequence alignment of Mor family members. The alignment shown was developed by merging alignments generated by ClustalW (6) and multialign (7) and making minor modifications based on pairwise BLAST (8) alignments and visual inspection. The secondary structure features derived from the Mor structure are indicated above the alignment (α helix as ovals and β strand as an arrow). Identical amino acids are shown in black boxes (with two exceptions allowed); chemically similar residues are shaded in gray (with three exceptions allowed). Dots indicate 10 amino acid intervals in Mor. Mor/C family members are designated Mor or C based on the Mu protein with which they share the greatest homology, as determined by BLAST score (data not shown). Names of previously identified Mu-like prophages are those used by Morgan et al. (9), Heidelberg et al. (10), and Casjens (11). Newly discovered Mu-like prophages in recently sequenced bacterial genomes are named according to the convention of Morgan et al. (9) and include DucMu1 in Haemophilus ducreyi, VioMu in Chromobacterium violaceum (12), and PhoMu in Photorhabdus luminescens (13). Genetic elements too short to be complete phages are indicated by asterisks and include CV1* from C. violaceum (12), Stm 7* from Salmonella typhimurium (14), and VV1* from Vibrio vulnificus. The sequence for FluMu C was generated by merging the sequences from two partially overlapping open reading frames (15) to make a single protein with C homology over its entire length. We suspect the separation into two open reading frames occurred by generation of a frameshift, either by mutation during evolution or by a sequencing error.
Structure Determination-The structure was solved by the multiwavelength anomalous dispersion method (33) using two-wavelength data sets of the mercury derivative; the peak wavelength was at 1.0051 Å, and the inflection wavelength was at 1.0086 Å. The single mercury site was located, and phase refinement was done by using the programs SOLVE and RESOLVE (34,35). Initial phasing and model building were done with mercury-derivative data, and platinum-derivative data were used for subsequent refinement. Model building was done using the program O (36), and the model was refined with XPLOR (37) and CNS (38) to an Rfree of 0.268 and Rwork of 0.252 for all data in the resolution range of 20-2.2 Å. The mercury was bound to Cys 61 of Mor, and platinum was bound to His 63 and Met 116 of Mor. The guanidine-free structure was refined to an Rfree of 0.311 and Rwork of 0.289 in the resolution range of 30-2.5 Å. Although these two heavy atoms were bound to different sites in the protein, conformational differences of the two derivative structures to the guanidine-free native structure were limited to the side chains of heavy atom-bound residues. This finding demonstrates that the two derivative structures have essentially the same conformation as the guanidine-free native structure. The highest resolution platinum-derivative structure was used for structural interpretation. The refinement statistics of the platinum-derivative structure are summarized in Table I.
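For readers who think in photon energies rather than wavelengths, the two mercury data-collection wavelengths quoted above can be converted with E = hc/λ. This is a minimal illustrative sketch, not part of the original methods; the constant is the standard value of hc in keV·Å, and the function name is hypothetical.

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * Angstrom


def wavelength_to_kev(wavelength_angstrom):
    """Convert an x-ray wavelength in Angstrom to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths as quoted in the text for the mercury derivative.
peak_kev = wavelength_to_kev(1.0051)
inflection_kev = wavelength_to_kev(1.0086)
separation_ev = 1000.0 * (peak_kev - inflection_kev)

print(f"peak: {peak_kev:.3f} keV, inflection: {inflection_kev:.3f} keV")
print(f"separation: {separation_ev:.0f} eV")
```

The peak wavelength is the shorter of the two, so it corresponds to the higher photon energy, as expected for data collected just above an absorption edge.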
Site-directed Mutagenesis of the Hydrophobic Core-The construction of the plasmids and bacterial strains used here has been described elsewhere (25,29). Plasmid pIA69, containing the mor gene with a histidine tag and silent restriction sites, was used as template for the PCR-based mutagenesis and cloning. Each mutagenic primer was designed to introduce multiple mutations at one position and was synthesized using an equimolar mixture of desired nucleotides at the targeted position. The two strands were mutagenized in separate PCRs and then used as templates for overlapping PCR. The resulting mutagenized cassette was cloned into pIA69 between the PstI and HindIII sites. The ligation mixture was transformed into strain MH13435 (mcrA Δpro-lac thi gyrA96 endA1 hsdR17 relA1 supE44 recA/F′ lacIQ1 ΔlacZY pro+/pIA14), with a Pm-lacZ fusion reporter plasmid, pIA14 (25), and plated on LB plates with Ap and Cm. The resulting colonies were screened on MacConkey lactose agar indicator plates with Ap, Cm, and different concentrations of IPTG (0, 10, 50, 100, and 300 µM). Candidate mutant plasmids were chosen based on the plate phenotypes, and the mutations were identified by automated DNA sequencing. Mutant plasmids were transformed into MH13355 (mcrA Δpro-lac thi gyrA96 endA1 hsdR17 relA1 supE44 recA DE3/F′ lacIQ1 ΔlacZY pro+); the DE3 encodes T7 RNA polymerase for protein overproduction and purification (39).
Crude Extract Preparation-Overnight cultures (2 ml) of MH13355 derivatives containing the mutant (and wild-type control) plasmids were transferred to 100 ml of LB containing Cm. The cells were grown at 37°C until the A600 reached 0.4-0.6 and then induced with 1 mM IPTG for 60 min. After the cells were harvested by centrifugation, the cell pellets were resuspended in 3 ml of buffer M containing 20 mM Tris-HCl, pH 7.9, 200 mM NaCl, 0.5 mM phenylmethylsulfonyl fluoride, 0.5 mM 2-mercaptoethanol, and lysed by sonication. The sonicated preparation was subjected to centrifugation at 10,000 rpm for 15 min. The supernatant was removed, and the pellet was resuspended in 3 ml of buffer M. Samples of the supernatant (24 µl) and pellet (24 µl) were mixed with 6 µl of 5× loading dye, boiled, and subjected to electrophoresis on a 12.5% SDS-polyacrylamide gel (40) stained with Coomassie Blue (41), and the protein concentration was determined by a Bradford assay (42).
In Vivo Transactivation Assay-Cells were grown overnight in 2 ml of M9CA medium containing Cm and Ap. A 50-µl sample of the overnight culture was inoculated into 10 ml of M9CA medium with the same antibiotics and grown at 37°C until the A600 reached 0.4-0.6. A 2-ml sample was removed to serve as an uninduced control, and the remaining culture was induced with 2 mM IPTG for 60 min. Based on the plate phenotype of individual mutants, dilutions of the cells were made using M9CA medium, and the cells were permeabilized by mixing with 10 µl of chloroform and 18.5 µl of 0.1% SDS in a total volume of 100 µl. After incubation for 20 min on ice, 0.5 ml of o-nitrophenyl-β-galactopyranoside (0.833 mg/ml) in buffer Z (43) was added, and the mixture was incubated at 28°C for 20 min. The reactions were stopped by adding 250 µl of 1 M Na2CO3, and spectrophotometer readings were taken at 420 nm for the reaction and 600 nm for the cell density. The β-galactosidase activities were calculated according to Miller's formula (43) and normalized relative to that of a wild-type culture assayed in parallel and set to 1000 Miller units.
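The activity calculation and normalization described above can be sketched as follows. This is an illustrative simplification, not the authors' analysis script: it uses the common form of Miller's formula without the A550 cell-debris correction (the text reports readings only at 420 and 600 nm), and all readings in the example are invented placeholders. The function names are hypothetical.

```python
def miller_units(a420, a600, minutes, culture_volume_ml):
    """Simplified Miller formula: 1000 * A420 / (time * volume * A600).

    The A550 scatter correction of the full formula is omitted here.
    """
    return 1000.0 * a420 / (minutes * culture_volume_ml * a600)


def normalize_to_wild_type(sample_units, wild_type_units, wild_type_value=1000.0):
    """Scale a sample so that the parallel wild-type assay equals wild_type_value."""
    return wild_type_value * sample_units / wild_type_units


# Invented example readings for a wild-type and a mutant culture,
# each assayed for 20 min with 0.1 ml of culture.
wild_type = miller_units(a420=0.80, a600=0.50, minutes=20, culture_volume_ml=0.1)
mutant = miller_units(a420=0.20, a600=0.50, minutes=20, culture_volume_ml=0.1)

print(normalize_to_wild_type(mutant, wild_type))
```

Normalizing each mutant to a wild-type culture assayed in parallel removes day-to-day variation in induction and assay conditions, which is why the relative value rather than the raw unit is compared across mutants.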

RESULTS
The crystal structure of His-Mor was solved by the multiwavelength anomalous dispersion method using mercury-derivative crystals, and the refinement was done with the high resolution platinum-derivative data set. The final structure was refined to 2.2 Å resolution with an Rwork value of 24.8% and Rfree value of 26.4%. The asymmetric unit has one molecule. As expected, His-Mor forms a dimer, and the two subunits are related to each other by a crystallographic 2-fold symmetry axis. For the purpose of this discussion, the amino acid residues will be numbered according to their positions in the native protein. Out of the total 129 amino acids in native Mor, 26 residues at the N terminus and 9 residues at the C terminus are not visible in the structure; the extra 20 residues in the N-terminal histidine tag are also not visible. Because the side chains of Arg 45 and Leu 46 are disordered and not visible in the structure, they are represented as alanines. Based on PROCHECK analysis (44,45), 94% of the residues are within the most favored region of the Ramachandran plot (46), and 6% are within the additionally allowed regions. None of the residues are in the disallowed regions.
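For reference, the two refinement statistics quoted above share one standard definition and differ only in the set of reflections summed over. The expression below is the conventional crystallographic R factor, given here as background rather than taken from the original text; the scale factor k and the set labels are notation introduced for this illustration.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - k\,|F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
% R_work: sum over the working set of reflections used in refinement.
% R_free: sum over a test set of reflections withheld from refinement.
```

Because the test set is never used to adjust the model, Rfree serves as a cross-validation check on the refinement.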
Overview of the Mor Monomer Structure-The structure of Mor reported here has a novel fold and shows several unusual features not found in other transcription factors (Fig. 2A). The Mor monomer is folded into two independent domains. The N-terminal domain is composed of two α helices, α1 and α2. These two helices run in opposite directions with an angle of 120° to each other. The C-terminal domain contains three helices, α3, α4, and α5, which are folded into the DNA-binding HTH motif. The N- and C-terminal domains are connected by a β strand linker. Three glycine residues (Gly 65, Gly 66, and Gly 67) are located in the loop N-terminal to the β strand, and one glycine (Gly 74) is present in the C-terminal loop, potentially creating two flexible junctions between the N- and C-terminal domains.
Dimerization of Mor-The structural elements for dimerization include helices α1 and α2 of the N-terminal domain and the β strand (β1) of the linker. These structural elements are arranged with respect to the 2-fold symmetry-related axis to form an intertwined four-helix bundle with a pair of antiparallel β strands capping one end of the bundle (Fig. 2, A and B). This mode of dimerization is unusual, although similar packing of helices into a four-helix bundle is found in endonuclease III (48) (root mean square deviation of 1.3 Å). In endonuclease III, however, the four helices are part of the 6-helix barrel domain and interact tightly with each other to stabilize intramolecular interactions, but they are not involved in oligomerization (48).
At the Mor dimer interface, many hydrophobic residues interact with their symmetry-related equivalents. Key residues involved in dimerization include several leucines of helix α1 and a mixture of isoleucines and leucines of helix α2. For example, the side chains of Leu 35 of helix α1 and Ile 60 of helix α2 point into the center of the four-helix bundle and interact with symmetry-related residues (Leu 35′ and Ile 60′) to form a layer of hydrophobic interactions in the dimer interface (Fig. 2B). Similarly, the side chains of Leu 39 of helix α1 and Ile 56 of helix α2 interact with their symmetry-related residues (Leu 39′ and Ile 56′) to form another layer of hydrophobic interactions. Four such layers of hydrophobic interactions form an extensive hydrophobic core, stabilizing the dimer (Fig. 2B). The antiparallel β strands on top of the four-helix bundle also provide hydrophobic residues, Val 69 and Ile 71, whose side chains point toward the center of the four-helix bundle. Interestingly, Gln 68 and Tyr 70 of the β strand do not participate in these interactions, and their side chains point away from the four-helix bundle (Fig. 2C).
Helix-Turn-Helix Motif-The three helices, α3, α4, and α5, in the Mor C terminus form a three-helix bundle with a classical helix-turn-helix DNA-binding motif (Fig. 2, D and E). In this classical motif, the first helix is thought to form the structural scaffold that anchors the second and third helices, which contain the conserved residues identified as being characteristic of an HTH motif (49). The third helix, often called the recognition helix, typically makes specific contacts with the bases in the major groove of DNA. In some cases, the second helix also makes base-specific contacts, but usually it is involved in either nonspecific base contacts or interactions with the DNA phosphodiester backbone (50,51). In Mor, helix α3 serves the scaffolding role; multiple hydrophobic residues in helix α3 interact with others in helices α4 and α5 to stabilize the conformation of the HTH structure. Mor helices α4 and α5 and the turn between them contain conserved residues characteristic of the DNA-binding HTH motif.
Structural comparisons between the Mor HTH domain and other proteins using the program DALI (52) identified a number of structural homologues. The best matches were found with the HTH motifs of TrpR (Z-score of 5.3 and r.m.s.d. of 1.5 Å) (53) and region 4.0 of the σ subunit of T. aquaticus RNA polymerase, which we will call T. aquaticus σ (Z-score of 4.9 and r.m.s.d. of 1.7 Å) (54). Detailed comparison of these structural homologues with the C-terminal domain of Mor identified the HTH motif of TrpR as the most similar HTH motif (r.m.s.d. of 1.5 Å). Both TrpR and T. aquaticus σ interact with DNA through the N-terminal half of the HTH recognition helix (53,55). This "ends-on" base recognition has been observed in other prokaryotic transcription regulators, including the Lac repressor (56,57), and in two additional Mor HTH homologues, NarL (r.m.s.d. 2.7 Å) (58) and the Tc3 transposase of Caenorhabditis elegans (r.m.s.d. 2.5 Å) (59).
These structural similarities lead us to propose that the HTH of Mor will also exhibit ends-on DNA binding, with helix α5 serving as the Mor recognition helix. Consistent with this hypothesis, mutagenesis of several amino acids in helix α5 of Mor led to defects in DNA binding and transcription activation (29). Recent mutational and modeling studies of the Mu C protein also identified the corresponding region as the DNA-binding motif of C (60,61).
Mutational Analysis of the Hydrophobic Core-Analysis of pre-existing mutants with helix α1 amino acid substitutions that are predicted to interfere with the inter-helical contacts of the four-helix bundle (Fig. 3A) demonstrated the importance of the hydrophobic core in solubility and function of Mor. The inner face of helix α1 is lined by four leucines, Leu 31, Leu 35, Leu 39, and Leu 43, whose side chains point into the center of the four-helix bundle (Fig. 3A). Substitutions of these residues with valine, which has a shorter side chain, would be expected to weaken the hydrophobic interactions, whereas hydrophilic substitutions would be expected to disrupt the hydrophobic core and decrease the solubility of the protein.
The ability of each mutant protein to carry out transcription activation of a Pm-lacZ fusion was determined by in vivo β-galactosidase assays. In addition, crude extracts from cells overexpressing the wild-type and mutant proteins were fractionated by centrifugation, and the supernatant and pellet fractions were analyzed by SDS-PAGE to determine the relative solubility of each protein (Fig. 3, B and C). All the mutant proteins were efficiently expressed and resistant to proteolytic degradation, but several had reduced solubility and were found predominantly in the pellet fraction. Mutant proteins containing hydrophilic substitutions, such as L35K, L35D, and L35Q, exhibited reduced solubility and could not be purified. In contrast, mutant proteins with hydrophobic substitutions, such as L39V, L39A, and L35V, remained soluble but exhibited moderate to severe decreases in transactivation ability (Fig. 3D).
The mutant proteins differed in their loss of function, depending upon the location of the substitution in the hydrophobic core. The greatest decreases in transcription activation and solubility were caused by substitutions at position Leu 35, located in the middle of the hydrophobic core. On the other hand, proteins with substitutions at position Leu 39 (L39V, L39A, and L39T) remained soluble and exhibited less severely reduced transcription activation than those at Leu 35 (Fig. 3D). Among the three substitutions at Leu 39, L39T showed the greatest defect. The side chain of Leu 32 extends away from the core but makes hydrophobic interactions with the core and with aliphatic residues in helix α3 of the HTH. Proteins with substitutions at this position, L32V, L32A, and L32T, were soluble and exhibited only a modest 2-4-fold reduction in transactivation (Fig. 3D). The polar residue Asn 36 makes hydrogen bonds with Ser 53 of helix α2 of the same monomer and Asn 88 of helix α3 of the second monomer. Proteins with hydrophobic substitutions at Asn 36 remained soluble but showed significantly reduced transcription activation.

DISCUSSION
The structure described here is the first structure determined for a member of the Mor/C family of transcription factors. As predicted from in vitro protein-protein crosslinking and the presence of a dyad-symmetry element within the Mor-binding site in Pm (25), the structure is that of a Mor dimer. It shows that Mor dimerizes by a novel intertwining of N-terminal helices of two monomers to form a four-helix bundle that is stabilized by an antiparallel interaction between two β strands that cap one end of the bundle. A helix-turn-helix DNA-binding motif is found in the C-terminal domain of each monomer, placing them at opposite ends of the dimer (Fig. 2A).
Primary Sequence Comparisons-At the time Mor was discovered, its only homologue was the Mu C protein (19). With the recent sequencing of bacterial genomes, there are now 14 closely related Mor homologues, the majority of which are found in Mu-like prophages in a number of different bacterial species (9-15, 19, 22-24) (Fig. 1D). Sequence alignment of these proteins revealed several conserved features that may play an important role in Mor structure and function. Conserved residues in the N-terminal half are predominantly hydrophobic and constitute the hydrophobic core of the dimerization domain. The properties of mutant proteins with single amino acid substitutions within this hydrophobic core are consistent with the structure and emphasize the importance of the hydrophobic core to protein solubility and Mor function (Fig. 3, B-D).
The most strongly conserved residues are located in the C-terminal half of the protein (Fig. 1D). In the HTH motif the most highly conserved residues are Phe 90, Gly 92, Asn 94, and Leu 98, located within and flanking the turn between helix α3 and helix α4. The Gly 92 residue may provide free backbone angles for the turn, which is stabilized by hydrophobic interactions between Phe 90, Leu 98, and Val 109 of helix α5 as well as hydrogen-bonding interactions of Asn 94 with Asn 91 and Glu 97. The many conserved hydrophobic residues in helices α3, α4, and α5 generate a tightly packed hydrophobic core within the HTH domain that stabilizes the spatial arrangement between the helices of the HTH motif. Conserved charged residues in this motif are solvent-exposed and likely to play a key role in interaction with promoter DNA as well as providing hydrophilic surfaces at the ends of the Mor dimer (Fig. 4A).
Within the β strand there are four highly conserved residues, the two invariant residues Tyr 70 and Pro 72 and the conserved residues Val 69 and Ile 71. Although the side chains of Val 69 and Ile 71 extend down and make interactions in the hydrophobic core, the side chains of Gln 68 and Tyr 70 extend away from the molecule. These Gln 68 and Tyr 70 side chains are located between the two HTH motifs and are not involved in any intra- or inter-molecular interactions. One intriguing possibility is that when Mor binds to DNA via the two HTH motifs, the Gln 68 and Tyr 70 side chains may interact with the minor groove spacer DNA located between the two Mor-binding half-sites. Immediately flanking the β strand are the highly conserved glycines, Gly 65, Gly 66, and Gly 74, and the invariant Pro 72, which we predict play an important role in the conformational changes in Mor needed for DNA binding (discussed below).
Fig. 3D (legend, in part). The values presented in the 2nd column are the average of at least two independent assays and are normalized relative to that of a wild-type control in the same experiment. The 3rd column indicates the solubility properties of the mutant proteins based on the results of gels like those shown in B and C. The dash indicates that no solubility assay could be done for extracts lacking Mor; the asterisk added for mutant L35K means that although the protein was soluble in crude extracts, it was reproducibly lost during purification.
The least conserved regions are the N-terminal and C-terminal portions of Mor/C family proteins. These are also the portions of Mor that are not visible in the structure. It is important to note, however, that these sequences are not random; they simply show a larger number of exceptions than allowed in Fig. 1D. Their potential importance to Mor function is underscored by the fact that deletion of the 26 N-terminal or 9 C-terminal amino acids not seen in the structure renders Mor nonfunctional (data not shown).
DNA Recognition by the Mor Recognition Helix-The structure of Mor shows that the C-terminal domain folds into an HTH motif in which helix α5 corresponds to the recognition helix for DNA interaction. The structural homologues of the Mor HTH, TrpR and T. aquaticus , bind to DNA using an ends-on base recognition mechanism (53-55). These proteins use the turn between the preceding helix (α4 of Mor) and the recognition helix, along with the N terminus of the recognition helix (α5 of Mor), to make most of the protein-DNA contacts. To gain insight into the residues potentially involved in DNA interaction, we superimposed the HTH motif of Mor onto the binary complex structure of TrpR with 20 bp of DNA (53). Despite their minimal amino acid sequence homology, the two HTH motifs superimposed well. Based on the structural homology, it seems likely that Mor residues Tyr102 in helix α4, Thr105 in the turn, and Phe106, Asn107, Thr108, Tyr110, and Lys111 in the N-terminal one-third of α5 may make contacts with promoter DNA. Mutations in some of these residues, such as Y110F, N107D, N107Y, K111I, K111T, and K111R, were found to reduce both DNA binding and transcription activation by Mor (29). Mutational analysis of residues in the HTH motif of Mu C protein also demonstrated the role of the HTH motif of C in DNA binding (60,61).
Conformational Plasticity of the Mor Dimer-Although the molecular details of the interactions of Mor with the middle promoter sequence are not yet known, the structure of the Mor dimer provides some clues as to how Mor may recognize the DNA. Analysis of mutants with single base substitutions in P m showed that Mor interacts primarily with bases in a 16-bp region from −36 to −51 of the middle promoter (25). Recent experiments suggest that there may be additional weaker contributions from the flanking bases as well. 2,3 If this region were straight B-form DNA, the maximum distance between the outer edges of two adjacent major grooves would be ~54 Å (16 bp at 3.4 Å/bp). In the Mor dimer, the distance between the two Phe106 residues, the first residue of helix α5 of the HTH motif, is 63 Å. Clearly, conformational changes in both Mor and DNA will be required to overcome this 9-Å difference (Fig. 4, B and C). One possible conformational change of Mor may require that the HTH domains move away from the dimerization domain and toward opposite sides of the DNA helix to make contacts in two adjacent major grooves (Fig. 4C). This movement requires breaking the interactions of the HTH domains with the dimerization domain, which may be facilitated by the energy provided by binding of the HTH domains to DNA. The highly conserved glycines, Gly65, Gly66, and Gly74, flanking the β strand may provide pivot points for conformational changes in Mor, and the isomerization of invariant Pro72 may provide an additional conformational change, all of which may be required for DNA binding. Consistent with this hypothesis, alanine substitution of Gly79 and threonine or leucine substitution of Pro77 in Mu C protein, which correspond to Gly74 and Pro72 of Mor, caused defects in DNA binding (60,61). Experiments to identify the role of these conserved residues in Mor are currently underway.
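The distance comparison above can be reproduced with a few lines of arithmetic. The following is only an illustrative sketch: the function and variable names are hypothetical, and the only inputs are the values quoted in the text (a 16-bp site spanning −36 to −51, a rise of 3.4 Å/bp for ideal straight B-form DNA, and the 63-Å Phe106-Phe106 separation measured in the Mor dimer).

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text.

RISE_PER_BP = 3.4  # axial rise of ideal B-form DNA, in Å per base pair


def straight_dna_span(n_bp: int, rise: float = RISE_PER_BP) -> float:
    """Axial length (Å) covered by n_bp base pairs of straight B-form DNA."""
    return n_bp * rise


# Mor-binding site: promoter positions -36 through -51, inclusive.
site_bp = 51 - 36 + 1

# Maximum span available on straight DNA versus the HTH separation in the dimer.
dna_span = straight_dna_span(site_bp)
hth_separation = 63.0  # Å, distance between the two Phe106 residues
mismatch = hth_separation - dna_span

print(f"site length: {site_bp} bp")
print(f"straight B-DNA span: {dna_span:.1f} Å")
print(f"HTH separation exceeds DNA span by {mismatch:.1f} Å")
```

Under these assumptions the unrounded gap is slightly smaller than the 9 Å quoted in the text, which follows from rounding the DNA span to 54 Å. The sketch treats the DNA as a rigid straight rod and so says nothing about how bending or protein rearrangement closes the gap.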
Early experiments assaying DNA bending by Mor indicated that Mor did not cause a dramatic bend but left open the possibility that it might generate a small bend (25). Recent experiments have confirmed the presence of such a bend with a bending angle of at least 40°. 2 A similar bend was observed when Mu C protein bound to the late promoter P mom (63). Although the bending could occur either toward the protein or away from the protein, we favor a model in which the DNA bends away from Mor, lengthening the distance between the two major grooves that Mor must contact (Fig. 4C). Bending toward Mor would further shorten the distance between the two major grooves of DNA, making it impossible for Mor to bind. However, it is not known whether binding of the HTH motifs of the Mor dimer bends the DNA or whether the DNA bends due to its dynamic nature and the HTH motifs of Mor then bind to the bent DNA.
When the HTH motifs of Mor bind to bent DNA, regardless of the cause of bending, the side chains of β-strand residues Gln68 and Tyr70 may be in position to interact with the minor groove spacer between the two adjacent major grooves. Precedent for such minor groove interactions does exist; for example, with Lac repressor the corresponding minor groove is contacted by hinge helices that contribute to the DNA bend (57). In Mu C protein, leucine substitution at Gln73, which corresponds to Gln68 of Mor, significantly reduced DNA binding (60), suggesting a possible involvement of Gln68 of Mor in polar interactions with DNA.
In conclusion, we have determined the crystal structure of Mor of bacteriophage Mu at 2.2 Å resolution. Based on the structure and mutational analysis of Mor and C as well as the middle promoter, we predicted the key residues that are likely to interact with DNA and possible conformational changes that would allow DNA binding by Mor. This structure should help us understand the function of Mor and the other members of the Mor/C family of transcription activators.