Specific DNA recognition by F Factor TraY involves beta-sheet residues.

The F Factor TraY protein is a sequence-specific DNA-binding protein required for efficient conjugal transfer. Genetic and biochemical studies indicate that TraY has two functional roles in conjugation. TraY binds to the PY promoter to up-regulate transcription of tra genes. TraY also binds to the plasmid origin of transfer (oriT), serving as an accessory protein in the nicking of F Factor in preparation for transfer. TraY is thought to belong to the ribbon-helix-helix family of transcription factors. These proteins contact DNA using residues of an antiparallel beta-sheet. We engineered and characterized six TraY mutants each having a single potential beta-sheet DNA contact residue replaced with Ala. Most TraY mutants had significantly reduced affinity for the TraY oriT binding site while possessing near wild-type stability and nonspecific DNA recognition. These results indicate that TraY beta-sheet residues participate in DNA recognition, and support inclusion of TraY in the ribbon-helix-helix family.

Bacterial conjugation is the process by which a conjugative plasmid directs transfer of a copy of itself from donor to recipient bacterium (for review, see Ref. 1). The TraY protein of Escherichia coli K12 Sex Factor F (F Factor or F plasmid) is required for efficient F Factor conjugal transfer (2). TraY, a sequence-specific DNA-binding protein, plays two distinct roles in F Factor conjugation. First, TraY binds at PY, the major transfer (tra) operon promoter (3), up-regulating transcription of the tra operon (4,5). Second, TraY participates in the "relaxosome," a complex of three proteins that assembles at the F plasmid origin of transfer (oriT). In addition to TraY, the relaxosome includes F-Factor-encoded TraI and the host-genome-encoded integration host factor. Through an undefined mechanism, TraY and integration host factor enhance the nucleolytic activity of TraI as it cleaves one DNA strand in preparation for transfer of the cut strand to the recipient bacterium (6,7).
Based on a shared pattern of mainly hydrophobic amino acids (Fig. 1A), Bowie and Sauer (8) assigned TraY to the ribbon-helix-helix family of transcription factors. This family includes three proteins of known three-dimensional structure: Arc (9) and Mnt (10) repressors of phage P22, and the Met repressor of E. coli (11). The family name derives from a structural motif that includes a β-strand followed by two α-helices (reviewed in Refs. 12 and 13). Two ribbon-helix-helix folds form a single protein domain, with the β-strands combining in a two-strand antiparallel β-sheet. In Arc, Mnt, and Met repressors, this domain is formed by pairing of monomers, each of which contains a single ribbon-helix-helix motif. TraY is a monomeric protein (3,14), but appears to contain a pair of tandem, nonidentical ribbon-helix-helix motifs. The three-dimensional structure of TraY has not yet been determined, and TraY shares no more than 20% sequence identity with any ribbon-helix-helix protein. However, like these proteins, TraY is composed of mixed α-helix and β-sheet secondary structure (14). Ribbon-helix-helix proteins form base-specific DNA contacts through residues of their β-sheets (12,13). If TraY is a member of this family, amino acid substitutions within the TraY β-sheet should alter DNA recognition. We therefore engineered a series of TraY mutants with Ala substitutions within the TraY β-sheet region, and characterized the DNA recognition and protein stability of these mutants.
EXPERIMENTAL PROCEDURES
Oligodeoxyribonucleotide Substrates-The 20-bp DNA fragment used to assess the sequence-specific DNA binding of TraY and TraY mutants is 5′-TAGTTTCTCTTACTCTCTTT-3′ and its complement. This sequence, bp 203-222 of the F Factor tra region (1), is within the TraY DNase I footprint (3), and will be referred to as the specific oriT binding site oligonucleotide. For radiolabeling, the strand shown was 5′-end-labeled using [γ-32P]ATP (Amersham) and T4 polynucleotide kinase under conditions suggested by Stratagene. Unincorporated [γ-32P]ATP and labeled oligonucleotide were separated using a Sephadex G-25 Quick Spin column (Roche Molecular Biochemicals). The labeled strand was annealed to its unlabeled complement by slow cooling after heating to 90°C.
The 22-bp DNA fragment used to assess nonspecific DNA recognition is 5′-AAAGCACCACACCCCACGCAAA-3′ and its complement. This sequence corresponds to bp 133-154 of the F Factor tra region (1), which includes the sequence bound and nicked by TraI (18). In addition, a 20-bp DNA fragment lacking 1 bp on each end was used in some assays. These are referred to as the 22- and 20-bp nonspecific binding site oligonucleotides, respectively.
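The oligonucleotide definitions above can be sanity-checked with a short script. The following is a minimal sketch, not part of the original procedure: the helper names `reverse_complement` and `span_length` are hypothetical, and the 20-bp nonspecific fragment is derived here simply by trimming 1 bp from each end of the 22-bp sequence, as the text describes.

```python
# Sketch: check oligonucleotide lengths against the stated tra-region
# coordinates and derive the complementary strands (written 5'->3').
# Helper names are illustrative, not from the original methods.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(first_bp: int, last_bp: int) -> int:
    """Number of base pairs in an inclusive coordinate range."""
    return last_bp - first_bp + 1


SPECIFIC_ORIT = "TAGTTTCTCTTACTCTCTTT"        # bp 203-222
NONSPECIFIC_22 = "AAAGCACCACACCCCACGCAAA"     # bp 133-154
NONSPECIFIC_20 = NONSPECIFIC_22[1:-1]         # 1 bp removed from each end

# The sequence lengths should agree with the coordinate ranges.
assert len(SPECIFIC_ORIT) == span_length(203, 222)
assert len(NONSPECIFIC_22) == span_length(133, 154)

print(reverse_complement(SPECIFIC_ORIT))
print(reverse_complement(NONSPECIFIC_22))
```

The length checks confirm that the quoted sequences match the stated 20- and 22-bp coordinate spans.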
Cloning of traY-The traY gene was PCR-amplified using genomic DNA of F′ E. coli strain JM109 as template. The amplification reaction and conditions were as described (14)  the sense strand) encoded NdeI and XhoI sites (underlined) to facilitate cloning. The PCR-amplified traY fragment was gel-purified, and recovered using the Wizard PCR Preps kit (Promega). The DNA fragment was digested with NdeI and XhoI, and ligated into NdeI/XhoI-digested pET-21a(+) or pET-23a(+). Competent E. coli TB1 cells were transformed with the ligation mixture. Restriction analysis of plasmid DNA from individual transformants was used to confirm the presence of the traY gene. Both DNA strands of the cloned traY genes were sequenced. E. coli BL21(DE3) cells were transformed with plasmid for protein expression.
Site-directed Mutagenesis of traY Gene-The traY gene cloned into pET-23a(+) was mutated using a PCR-based procedure. PCR primers encoded the desired amino acid substitutions. To facilitate screening, PCR primers also encoded a unique restriction site introduced through silent mutations.
The 50-μl reaction mixtures contained reaction buffer, 200 μM dNTPs, 250 ng of each primer, from 5 to 400 ng (optimized for each primer pair) of the plasmid template, and either 5 units of Taq2000 or 2.5 units of Pfu DNA polymerase. The reaction involved denaturation at 94°C for 30 s; 2 cycles of denaturation for 30 s at 94°C, annealing for 60 s at 45°C, and extension for 8 min at 68°C; and 16 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 60 s, and extension at 68°C for 8 min. PCR products were treated with DpnI to digest the parental DNA template. The mutated, amplified linear vector was then cut at the engineered restriction sites to generate cohesive ends, and the ends of the plasmid were ligated. E. coli TB1 cells were transformed with the mutated plasmid. Plasmid purified from transformants was screened by restriction enzyme digestion. Both strands of mutant traY genes were fully sequenced prior to protein expression and functional analysis. Plasmids were transformed into E. coli BL21(DE3)/pLysE cells for protein production.
Expression and Purification of Wild-type and Mutant TraY Proteins-For protein expression, cells were grown in LB broth with 50 μg/ml ampicillin and (for E. coli strain BL21(DE3)/pLysE) 34 μg/ml chloramphenicol. Cells were tested for protein expression and cell pellets prepared for protein purification as described (14). Cells harboring the wild-type traY gene cloned into pET-21a(+) were used for protein purification as they grew faster and expressed more protein than those with the gene in pET-23a(+). Mutant TraY proteins expressed well from pET-23a(+) constructs.
For protein purification, cell pellets were thawed at 4°C and resuspended in 50 ml of ice-cold buffer A (50 mM Tris-HCl (pH 7.6), 1 mM EDTA, 5 mM dithiothreitol) plus 100 mM NaCl. Phenylmethylsulfonyl fluoride was added to a final concentration of 200 μM. Cells were lysed by sonication, and the lysate was DNase I-treated and cleared by centrifugation as described (14). The supernatant was loaded onto a 5-ml HiTrap Q column (Amersham Pharmacia Biotech) equilibrated with buffer A plus 100 mM NaCl. A GradiFrac System (Amersham Pharmacia Biotech) was used to apply a gradient from 100 mM to 2 M NaCl in buffer A to the column at 5 ml/min over a 200-ml volume. Elution was monitored by A280. Most proteins eluted as a single large peak from 160 to 400 mM NaCl, while DNA eluted from 400 mM to 2 M NaCl. Fractions containing protein were pooled, loaded onto a 5-ml HiTrap Heparin column (Amersham Pharmacia Biotech), and eluted as described (14). Fractions containing TraY or TraY mutants were applied to and eluted from a 5-ml HiTrap Blue column (Amersham Pharmacia Biotech) as described (14). Fractions containing TraY were pooled and TraY was concentrated using a 1.5-ml butyl-Sepharose 4 Fast Flow (Amersham Pharmacia Biotech) column equilibrated with buffer B (20 mM sodium phosphate (pH 7.4), 1 mM EDTA, 5 mM dithiothreitol) plus 3 M NaCl. Prior to application to the column, the TraY sample was brought to approximately 2.6 M NaCl by adding an equal volume of 4 M NaCl. TraY was eluted by applying 2 ml of buffer B plus 100 mM NaCl, stopping the flow, and incubating for 30 min. Following the incubation, additional buffer B plus 100 mM NaCl was applied to the column. Fractions containing TraY or TraY mutants were combined and dialyzed against buffer B plus 100 mM NaCl, with or without replacement of 5 mM dithiothreitol by 5 mM β-mercaptoethanol. The concentration of TraY was estimated by absorbance at 280 nm (ε = 11,460 M⁻¹ cm⁻¹).
Protein yields ranged from 2 to 4 mg/500 ml of culture. Purified TraY and TraY mutants were >95% pure as judged by Coomassie-stained sodium dodecyl sulfate-polyacrylamide gels.
Measurements of Affinity for DNA-Affinities of TraY and TraY mutants for DNA were measured by an electrophoretic mobility shift assay (referred to as the direct binding assay, to contrast with the competition assay described below). Assay buffer contained 10 mM Tris-HCl (pH 7.6), 3 mM MgCl2, 0.1 mM EDTA, 100 mM NaCl, 100 μg/ml bovine serum albumin, 0.02% (v/v) IGEPAL CA-630 (Sigma), 1 mM dithiothreitol, 5% (v/v) glycerol, and 1 μg/ml sonicated, phenol-extracted calf thymus DNA (Amersham Pharmacia Biotech). Varying concentrations of TraY or TraY mutant were incubated for 90 min with 22 pM (final concentration) 32P-end-labeled specific oriT binding site oligonucleotide. The reaction volume was 50 μl. Bound and free DNA were separated by electrophoresis on a 0.5× TBE, 7% polyacrylamide gel. Following electrophoresis and gel drying, radioactive bands were quantified using an SF PhosphorImager (Molecular Dynamics). Data were converted to fraction of labeled DNA bound and plotted versus protein concentration. Multiple experiments were fit simultaneously with the equation: θ = 1/(1 + KD/[protein]), where θ is the fraction of DNA bound, using Kaleidagraph 3.0 (Synergy Software).
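The binding isotherm used for the direct binding assay can be illustrated with a short script. This is a minimal sketch under stated assumptions, not the Kaleidagraph procedure used in the study: the function names are hypothetical, the data points are synthetic and noise-free, and the one-parameter least-squares fit is done by a golden-section search on log10(KD) rather than by the software's own fitting routine. The isotherm treats the total protein concentration as the free concentration, which is reasonable when labeled DNA (22 pM) is far below KD.

```python
import math

# Sketch of the isotherm theta = 1 / (1 + KD / [protein]) and a simple
# one-parameter least-squares estimate of KD. All names and the synthetic
# data are illustrative assumptions.


def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of labeled DNA bound at a given protein concentration."""
    if protein_nM <= 0:
        return 0.0
    return 1.0 / (1.0 + kd_nM / protein_nM)


def sum_squared_error(log_kd, conc_nM, theta_obs):
    """Sum of squared residuals for a trial value of log10(KD)."""
    kd = 10.0 ** log_kd
    return sum((fraction_bound(c, kd) - y) ** 2
               for c, y in zip(conc_nM, theta_obs))


def fit_kd(conc_nM, theta_obs, log_lo=-3.0, log_hi=6.0, tol=1e-9):
    """Estimate KD (nM) by golden-section search over log10(KD)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = log_lo, log_hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sum_squared_error(c, conc_nM, theta_obs) < \
                sum_squared_error(d, conc_nM, theta_obs):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic titration generated with KD = 7 nM (the wild-type value
# reported in the text); real data would carry experimental noise.
concentrations = [0.5, 1, 2, 5, 10, 20, 50, 100]
observed = [fraction_bound(c, 7.0) for c in concentrations]
estimated_kd = fit_kd(concentrations, observed)
print(f"estimated KD = {estimated_kd:.2f} nM")
```

At a protein concentration equal to KD the model predicts half of the DNA bound, which is a quick visual check on any fitted curve.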
Affinities for both specific and nonspecific DNA binding sites were measured using a competition assay. The assay buffer is as described above. Concentrations of protein and labeled DNA used in the competition assay were based on the affinities of the proteins determined using the direct binding assay. For wild-type TraY and most mutants, the protein concentration used was one-tenth of the measured KD of that protein. The concentration of the radiolabeled specific oriT binding site used for TraY and most mutants was equal to the measured KD of that protein.
Equilibrium Unfolding by Guanidine Hydrochloride (GdnHCl) Denaturation-GdnHCl unfolding of TraY and TraY mutants was monitored by the change in circular dichroism ellipticity at 234 nm. Experiments were performed and analyzed as described (14) except the instrument used was a Jasco J-710 spectropolarimeter, protein concentrations of 2 or 2.5 μM were used, and the denaturant starting solution contained 6 M GdnHCl (Mallinckrodt) rather than 9.5 M urea. Data from multiple experiments were normalized and combined into a single fit.
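For readers unfamiliar with two-state denaturation analysis, the following sketch shows the standard linear extrapolation form of a two-state (N to D) unfolding curve. It is an illustration only: the analysis actually used is the one described in Ref. 14, which is not reproduced in this text, and the parameter values, temperature, and function names below are hypothetical.

```python
import math

# Sketch of a two-state unfolding curve using the linear extrapolation
# model: dG([D]) = dG_H2O - m * [D]. Parameter values are hypothetical
# and are NOT taken from Table I.

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def fraction_unfolded(denaturant_M, dg_h2o, m_value, temp_K=298.15):
    """Fraction of protein unfolded at a given denaturant concentration."""
    dg = dg_h2o - m_value * denaturant_M          # kcal/mol
    k_unfold = math.exp(-dg / (R_KCAL * temp_K))  # [D]/[N]
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_h2o, m_value):
    """Denaturant concentration at which half the protein is unfolded."""
    return dg_h2o / m_value


# Hypothetical parameters for illustration only.
DG_H2O = 5.0   # kcal/mol
M_VALUE = 2.0  # kcal mol^-1 M^-1

for conc in (0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 6.0):
    print(f"{conc:4.1f} M GdnHCl: "
          f"{fraction_unfolded(conc, DG_H2O, M_VALUE):.3f}")
```

In this framework, a destabilizing substitution shows up as a lower extrapolated unfolding free energy in water relative to wild type, which is the quantity compared between wild-type and mutant proteins.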

RESULTS
Expression and Purification of Wild-type and Mutant TraY Proteins-The gene encoding F Factor TraY was amplified by PCR and cloned into expression vectors pET-21a(+) and pET-23a(+) (see "Experimental Procedures"). The gene cloned into pET-23a(+) served as the template for PCR-based mutagenesis of TraY. Six mutants, each having a single Ala substituted for a residue within the TraY β-sheet, were generated. Sites of amino acid substitutions were selected based upon the sequence alignment of TraY with other members of the ribbon-helix-helix family (Fig. 1A). The sites of amino acid substitutions within the ribbon-helix-helix fold are depicted in Fig. 1B. In the figure, the residues in the x-ray crystal structure of MetJ, the E. coli methionine operon repressor, that correspond to the sites of engineered substitutions in TraY are highlighted. Wild-type and mutant TraY proteins were expressed and purified. The wild-type and variants demonstrated similar chromatographic characteristics, simplifying purification.
Equilibrium Chemical Denaturation-Denaturation of TraY and TraY mutants by GdnHCl was monitored by change in circular dichroism ellipticity at 234 nm. As noted previously for urea denaturation of TraY (14), GdnHCl denaturation of TraY is fit well by a two-state reaction model (N → D) with no significantly populated intermediates. Denaturation of the TraY mutants also fit well with a two-state reaction model. Table I
Sequence-specific DNA Recognition-The affinities of TraY and TraY mutants for a specifically bound oriT DNA sequence were measured by electrophoretic mobility shift assay. The sequence of the 20-bp double-stranded oligonucleotide used in the assay is taken from the DNase I footprint of TraY (3). As shown in Fig. 2, the measured stoichiometry of TraY binding to this oligonucleotide is 1:1. The dissociation constants for TraY and TraY mutants as measured using the direct binding assay are listed in Table II, and representative curves are shown in Fig. 3. Wild-type TraY has a KD of 7 nM, while the mutants show reductions in affinity ranging from 2-fold to over 100-fold.
The IC50 values for TraY and TraY mutants for the specific binding site were also determined using a competition assay. The competition assays were performed in addition to the direct binding assays for two reasons. First, the results of the competition assay are less influenced by dissociation during the electrophoretic separation of bound and free DNA than the results of the direct binding assay, and can therefore potentially yield more accurate numbers for proteins with fast off-rates. Second, affinities for both specific and nonspecific sites (see below) may be readily measured using the competition assay, while accurately measuring nonspecific binding using a direct binding assay is difficult. Measuring affinities for both specific and nonspecific sites by the same assay facilitates direct comparison of these values. The results from competition assays are listed in Table III, and generally agree with the results obtained from the direct binding assay. Representative curves are shown in Fig. 3.
Nonspecific DNA Recognition-To ascertain whether the observed differences in KD represent loss of specific DNA recognition, the affinities of TraY and the mutants for nonspecific DNA were measured using a competition assay. In this assay, binding of the radiolabeled, specific oriT TraY site was competed away with increasing concentrations of a 20- or 22-bp oligonucleotide corresponding to a region of oriT outside of any identified TraY binding site. The IC50 values of the proteins for nonspecific DNA are listed in Table III, and representative curves are shown in Fig. 3. The values for all mutants are within about 2-fold of the value for wild-type TraY. Values obtained with the 22-bp nonspecific site (Table III) or the 20-bp site (not shown) did not differ significantly.
For TraY and some of the mutants, the IC50 values for the specific binding site represent microscopic equilibrium constants while those for the nonspecific site represent macroscopic equilibrium constants. Assuming that TraY recognizes a 10-bp sequence, the 22-bp nonspecific binding site oligonucleotide contains 13 overlapping binding sites. Converting the value for the nonspecific sequence into a microscopic equilibrium value, TraY recognizes the specific oriT site with approximately 500-fold higher affinity than the nonspecific site (IC50 of 17 versus 7800 nM). For some of the mutants, the IC50 values for specific and nonspecific binding sites differ by only 2- or 3-fold. As the difference between the specific and nonspecific IC50 values decreases, nonspecific protein-DNA interactions presumably contribute more to the overall affinity, and the distinction between macroscopic and microscopic equilibrium values is blurred. Therefore, we report the nonspecific binding as macroscopic IC50 values rather than attempt to convert them into microscopic equilibrium values.
DISCUSSION
If TraY is a member of the ribbon-helix-helix family, the protein should contact DNA through residues of its β-sheet. To test whether the amino acids of the TraY β-sheet are involved in DNA recognition, we engineered a series of TraY mutants having single Ala substitutions within this region. Based on results from mutagenesis studies of Arc repressor, another member of the ribbon-helix-helix family of transcription factors, these TraY mutants should exhibit substantially reduced affinities for the TraY DNA-binding site while retaining near wild-type protein stability (19,20). In ribbon-helix-helix proteins, the side chains of these β-sheet DNA contact residues are largely solvent exposed and do not contribute significantly to the stability of the protein.
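The site-counting argument used for the nonspecific oligonucleotide (13 overlapping 10-bp sites on a 22-bp fragment, and the roughly 500-fold preference for the specific site) can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function name is hypothetical, the conversion assumes equivalent, independent overlapping sites, and the macroscopic value computed at the end is back-calculated from the numbers quoted in the text rather than read from Table III.

```python
# Sketch: count overlapping binding sites and compare specific versus
# nonspecific IC50 values quoted in the text (17 nM and 7800 nM).
# The function name is illustrative.


def overlapping_sites(oligo_bp: int, site_bp: int) -> int:
    """Number of distinct positions a site_bp-long site can occupy."""
    if site_bp > oligo_bp:
        return 0
    return oligo_bp - site_bp + 1


SPECIFIC_IC50_NM = 17.0
NONSPECIFIC_MICROSCOPIC_IC50_NM = 7800.0

n_sites = overlapping_sites(22, 10)
fold_preference = NONSPECIFIC_MICROSCOPIC_IC50_NM / SPECIFIC_IC50_NM

# Assumption: microscopic value = macroscopic value x number of sites,
# so the implied macroscopic IC50 is the microscopic value / n_sites.
implied_macroscopic_nM = NONSPECIFIC_MICROSCOPIC_IC50_NM / n_sites

print(f"overlapping sites: {n_sites}")
print(f"specific vs nonspecific: {fold_preference:.0f}-fold")
print(f"implied macroscopic IC50: {implied_macroscopic_nM:.0f} nM")
```

The computed ratio of about 460 is consistent with the "approximately 500-fold" figure given in the Results.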
Our results are consistent with inclusion of TraY in the ribbon-helix-helix family of transcription factors. As was seen with Arc repressor (19,20), most of the Ala substitutions have little effect on protein stability (Table I), while their effects on sequence-specific DNA recognition are often considerable (Tables II and III). There is no apparent correlation between protein stability and DNA affinity. For example, mutants K15A and T71A exhibit similar significant reductions in affinity (Tables II and III), yet the former has wild-type stability while the latter has reduced stability (ΔΔGu(H2O) = −0.8 kcal/mol; Table I). The reduced affinities for DNA are therefore not attributable to destabilization of the protein by the mutations.
These results suggest that TraY utilizes β-sheet residues for specific DNA recognition. While TraY and all TraY mutants have similar affinities for nonspecific DNA, most mutants exhibited significantly reduced affinities for the specific oriT TraY binding site. We also examined the binding of TraY and some of the mutants to a second specific TraY binding site at the PY promoter (results not shown). The proteins examined demonstrated similar affinities, relative to wild type, for both DNA binding sites. TraY, therefore, binds the two sites in a similar fashion. These experiments do not reveal the precise roles of each of these β-sheet residues in DNA recognition, and further experimentation is underway to better define their contributions. Given the effects of the mutations on specific but not nonspecific DNA binding, it is likely that some β-sheet residues participate in base-specific contacts. Most of the Ala substitutions, however, were for positively charged Lys or Arg residues. These residues are capable of forming energetically favorable, but nonspecific, electrostatic interactions with the phosphate moieties of the DNA backbone. If, however, these amino acid side chains contribute to binding in a nonspecific fashion (for example, with the DNA backbone), the results suggest that they contribute preferentially within the context of sequence-specific recognition. Contribution of presumably nonspecific backbone contacts to DNA binding specificity has been observed previously (21-23).
Some β-sheet amino acid side chains may also contribute to specific DNA recognition indirectly, in addition to forming direct contacts with DNA. An extensive hydrogen bond network involving the side chains of DNA-contact residues is apparent in the crystal structure of the Arc repressor tetramer-operator complex (9,19). These hydrogen bonds presumably orient the side chains for optimal DNA contact, and thereby contribute to DNA recognition. If an analogous series of interactions occurs between TraY contact residues, substitution of one of the involved amino acids could have a substantial effect on specific DNA recognition, even if that side chain makes only minor energetic contributions to DNA binding through side chain-DNA contacts. Loss of cooperative interactions between contact residues could explain why three different Ala substitutions reduce affinity for the oriT sequence to within 4-fold of the affinity for a nonspecific sequence (Table III).