Binding and Catalytic Contributions to Site Recognition by Flp Recombinase*

Flp catalyzes site-specific recombination in a highly sequence-specific manner despite making few direct contacts to the bases within its binding site. Sequence discrimination could take place in the binding and/or the catalytic steps. In this study, we independently measure the binding affinity and initial cleavage rate of Flp recombinase with ∼20 designed alternate target DNA sequences. Our results show that Flp specificity is largely, although not entirely, imparted at the binding step and is the result of a combination of direct and indirect readout. The Flp binding site includes an A/T-rich region that displays a characteristically narrow minor groove. We find that many A → T changes are tolerated at the binding step, whereas C or G substitutions tend to decrease binding affinity. The effects of the latter can be alleviated by replacing guanine with inosine, which removes the N2 amino group that protrudes into the minor groove. Some A → T changes reduce binding affinity, due to clashing with nearby residues, reinforcing that specificity requires avoiding negative contacts as well as creating positive ones. A tracts, which can lead to unusually rigid DNA structure, are tolerated during the binding step when placed within the region where the minor groove is already narrow. However, most A tracts slow catalysis more than C or G substitutions. Understanding what kind of sequence variation is tolerated in the binding and catalytic steps helps us understand how the target DNA is recognized by Flp and will be useful in guiding the design of Flp variants with altered specificities.

Specificity in protein-DNA interactions is required for many cellular processes including transcription, DNA repair, and recombination. Proteins that modify DNA must distinguish their substrates from an overwhelming number of competing DNA sequences. In many cases, DNA-binding proteins recognize a specific DNA sequence even when there are few direct, unique contacts between the DNA bases and the amino acid side chains. Whereas it was once thought that unique hydrogen bond donor and acceptor patterns in the major groove would be the most important feature in DNA-protein recognition, further studies have revealed that other sequence-specific features of DNA often play a critical role. A large number of DNA-binding proteins have been shown to use indirect readout in their recognition of the appropriate DNA sequence, including the 434 repressor, TATA-binding protein, restriction enzymes including EcoRI, and the DNA bending proteins IHF and HU (1)(2)(3)(4)(5)(6).
In this study we focus on understanding substrate recognition in a site-specific DNA recombinase. Flp is a tyrosine recombinase from Saccharomyces cerevisiae that specifically recognizes a pair of 13-bp inverted repeats and catalyzes recombination of the intervening sequence. The recombination reaction involves two rounds of DNA cleavage, strand exchange and religation (7). Flp is widely used as a biotechnology tool to knock out genes, especially in Drosophila. Efforts to design recombinases with altered specificity for biotechnology applications are ongoing (8,9), and would be aided by further understanding of DNA recombinase recognition. There is no simple set of rules for engineering protein-DNA interfaces. A few examples of successful engineering of DNA-binding proteins exist, most of which relied on selection methods rather than rational design. Zinc finger designs have become sophisticated with the use of phage display and DNA-shuffling (10), Cre recombinase has been evolved to have different specificity (11,12), and DNA polymerase has been converted to an RNA polymerase (13). The modular design of serine recombinases, where DNA binding and catalysis are physically separable, has made fusion proteins with transcription factor DNA binding domains replacing the DNA binding domains possible (8,14). In the case of Flp, where DNA binding and catalysis are concentrated in one region of the protein, engineering has focused on a combination of selection and rational design (9).
Flp recognizes and cleaves DNA specifically at the Flp recognition target (FRT) site, despite the presence of only a few direct side chain to base contacts between Flp and the DNA (Fig. 1). This suggests that much of Flp-DNA recognition is driven by indirect readout, where sequence-dependent structure, flexibility, and other properties of the DNA create specificity, rather than unique contacts between amino acid side chains and DNA bases. The need for a narrow minor groove may explain the specificity for A:T base pairs in the Flp binding site near the scissile phosphate. A/T-rich regions typically have a narrow minor groove, because they are able to achieve a greater degree of propeller twist between adjacent base pairs than C:G pairs, where the N2 amino group of the guanine clashes with the base in the n+1 position on the opposite strand (15,16). A:T base pairs are common among protein-DNA complexes that rely on indirect readout to recognize preferred sites, such as HU and the nucleosome, presumably because they have a recognizably narrow minor groove (6).

* This work was supported by National Institutes of Health Grant GM058827.
Prior to the experiments described here, binding and catalysis had not been studied independently in Flp. Studies of the effects of binding site mutations have measured the more ambiguous efficiency of full recombination assays that involve two rounds of cleavage, strand exchange, and ligation, any of which could be affected by sequence changes. In these full recombination assays, the WT base is preferred over the other three possibilities in most cases, as the Cox lab showed in their initial study (17). The direct contacts made between Flp side chains and the DNA target are clearly important, especially the contact between Arg-281 and a nearby guanine (see Fig. 1). In our study we have focused on the bases near the scissile phosphate (positions 1-6 in Fig. 2) which largely do not make direct contacts with Flp, to understand whether and how they contribute to recognition of the cognate sequence of Flp.
Several classic enzyme studies have shown that there is a tight link between binding and catalysis; i.e. altering the substrate will affect both K_m and k_cat (18-20). However, there are several examples of DNA-binding proteins that at least partially separate recognition in the binding and catalytic steps. For example, EcoRI can bind many DNA sequences that are similar to the cognate sequence with high affinity, but the protein distorts only the correct sequence into a form that it can cleave (4). Other DNA-binding proteins have modular designs that physically separate the DNA-binding and catalytic roles to some degree. For example, Mu transposase discriminates among most substrates at the binding rather than the catalytic step, presumably because sequence-specific DNA binding and catalysis are carried out by different domains (20). In Flp, it is not clear if sequence discrimination takes place during the initial binding step, in the cleavage step, or a combination of the two.
In our study, we address two questions. Is the identity of the bases near the scissile phosphate (which Flp makes almost no direct contacts with) important for Flp recognition of its cognate sequence? If so, is sequence discrimination taking place during the binding or catalytic steps, or both?
To gain insight into how Flp recognizes its DNA substrate, we designed the FRT DNA mutants shown in Table 1 for both binding and cleavage studies. Our analysis of the structure of the DNA in two Flp-DNA crystal structures revealed that the minor groove is unusually narrow in the A/T-rich region near the scissile phosphate (Fig. 3), and that there is an unusually large drop in the roll angle between the two consecutive A:T base pairs near the scissile phosphate.
Our FRT alternate sequences focus on the bases in the vicinity of the scissile phosphate. To probe whether the narrow minor groove of A/T-rich sequences is being recognized by Flp, we made A → T transversions, which should maintain roughly the same groove width, and substituted C:G pairs, which we expected to widen the minor groove. With several sequences we explore how A tracts of four or more sequential As on one strand, which are typically quite rigid, affect binding and catalysis, and whether inosine:cytosine pairs lacking the N2 amino group of guanine are less disruptive than G:C pairs. We also measured the effects of sequence changes on the binding and cleavage steps separately to reveal when sequence discrimination takes place. Overall we find that sequence variation leads to larger changes in binding affinity than in cleavage rate constants. A combination of direct and indirect readout is used in the binding and the catalytic steps, which appear to have different specificity requirements. A → T changes, including those that create A tracts, are better tolerated during the binding step. However, C or G substitutions hinder binding more than catalysis. This suggests that a narrow minor groove is important for recognition in the binding step, whereas flexibility to form a specific transition state conformation is important in the catalytic step.
EXPERIMENTAL PROCEDURES

Binding Competition Assays-PAGE-purified FRT substrates were ordered from IDT. Aliquots of Flpe were diluted in [...]. Competition curves were plotted with f, the fraction bound in the presence of inhibitor divided by the fraction bound in the absence of inhibitor, on the y axis and c, the concentration of cold competing DNA, on the x axis. We used the fraction bound in the presence of 1 nM cold competitor DNA as the denominator in the ratio above, because the fraction bound in the absence of competitor DNA was artificially low, possibly because of low Flp solubility with low DNA or salt concentration. The IC50 was determined by fitting the competition curves in Kaleidagraph to an equation in which IC50 is the concentration at which f = 0.5. Each IC50 was fit to the average and standard deviation of at least three independent experiments, with the errors weighted in the fits. The equilibrium dissociation constant for the competitor, K_i, cannot be determined from the IC50 using the simple Cheng-Prusoff equation because more than 50% of the DNA is bound to Flp in the absence of competitor DNA. K_i was instead solved for using an equation that takes the initial fraction bound into account (23,24); in it, [L*]tot represents the total amount of radiolabeled DNA present, [L*R] represents Flp-bound radiolabeled DNA, and ([L*R]/[L*])_0 represents the fraction bound without competitor. The WT FRT K_d was solved for iteratively until K_d and K_i converged. Experiments were repeated a minimum of three times; error bars in Fig. 4 represent the standard deviation, and errors in Table 1 represent the standard error from the fit.

[Figure caption (partial): ... (Table 1). Vertical arrow marks the cleavage site, and several Flp side chains are shown next to the bases they are in close proximity to (see Fig. 1).]

[Figure caption (partial): Note that the minor groove becomes narrow in the A/T-rich region near the scissile phosphate. The dashed line represents the average minor groove width for canonical B DNA. B, roll angles between base pairs. Both plots were made using 3DNA version 1.5 (30).]
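As a minimal sketch of the weighted IC50 fit described in this section, the Python below assumes a simple one-site competition model, f(c) = 1/(1 + c/IC50), which satisfies f = 0.5 at c = IC50. The exact equation used for the published fits is not reproduced in this text, so the model form, the function names, and the synthetic data are all assumptions for illustration. The fit is a golden-section search on the weighted chi-square, standing in for the Kaleidagraph routine, and it presumes the chi-square has a single minimum over the search range.

```python
import math

def competition_fraction(c, ic50):
    """Assumed one-site model: relative fraction bound at competitor concentration c."""
    return 1.0 / (1.0 + c / ic50)

def chi_square(ic50, conc, f_obs, sigma):
    """Weighted sum of squared residuals, with weights 1/sigma^2."""
    return sum(((f - competition_fraction(c, ic50)) / s) ** 2
               for c, f, s in zip(conc, f_obs, sigma))

def fit_ic50(conc, f_obs, sigma, lo=1e-2, hi=1e5, tol=1e-9):
    """Golden-section search over log10(IC50) that minimizes the weighted chi-square."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(lo), math.log10(hi)
    while b - a > tol:
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        if chi_square(10 ** x1, conc, f_obs, sigma) < chi_square(10 ** x2, conc, f_obs, sigma):
            b = x2  # minimum lies in [a, x2]
        else:
            a = x1  # minimum lies in [x1, b]
    return 10 ** ((a + b) / 2.0)

# Hypothetical, noise-free data generated with IC50 = 20 nM (not measured values).
conc = [1, 3, 10, 30, 100, 300, 1000]   # nM cold competitor
true_ic50 = 20.0
f_obs = [competition_fraction(c, true_ic50) for c in conc]
sigma = [0.05] * len(conc)              # hypothetical standard deviations
ic50_fit = fit_ic50(conc, f_obs, sigma)
print(ic50_fit)
```

With real, noisy data the recovered IC50 depends on the weights, which is why the standard deviations of the replicate experiments enter the fit.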
Substrates with weak binding affinity were poorly determined in the range of concentrations we used for the competition assays, leading to greater errors in Table 1.
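To illustrate why the simple Cheng-Prusoff relation, K_i = IC50/(1 + [L*]tot/K_d), breaks down when a large fraction of the labeled DNA is bound, the sketch below solves the exact mass balance for one protein binding a labeled DNA and an unlabeled competitor, then compares the Cheng-Prusoff estimate with the true K_i. This is not the corrected equation of refs. 23 and 24; the concentrations, the 1:1 binding model, and all function names are hypothetical choices for illustration.

```python
def free_protein(r_tot, l_tot, kd, i_tot, ki, iters=200):
    """Bisection for free protein R satisfying
    R + L_tot*R/(Kd + R) + I_tot*R/(Ki + R) = R_tot."""
    lo, hi = 0.0, r_tot
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        total = mid + l_tot * mid / (kd + mid) + i_tot * mid / (ki + mid)
        if total > r_tot:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

def fraction_labeled_bound(r_tot, l_tot, kd, i_tot, ki):
    """Fraction of labeled DNA bound at equilibrium."""
    r = free_protein(r_tot, l_tot, kd, i_tot, ki)
    return r / (kd + r)

def simulated_ic50(r_tot, l_tot, kd, ki, i_max=1e6, iters=200):
    """Competitor concentration that halves the fraction of labeled DNA bound."""
    target = 0.5 * fraction_labeled_bound(r_tot, l_tot, kd, 0.0, ki)
    lo, hi = 0.0, i_max
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if fraction_labeled_bound(r_tot, l_tot, kd, mid, ki) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def cheng_prusoff_ki(ic50, l_tot, kd):
    """Simple Cheng-Prusoff estimate, using total labeled DNA as the ligand concentration."""
    return ic50 / (1.0 + l_tot / kd)

# Hypothetical values in nM: the competitor is identical to the labeled DNA (Ki = Kd = 5).
kd = ki_true = 5.0
l_tot = 1.0
for r_tot in (0.01, 20.0):  # trace protein versus protein in excess
    ic50 = simulated_ic50(r_tot, l_tot, kd, ki_true)
    print(r_tot, ic50, cheng_prusoff_ki(ic50, l_tot, kd))
```

In this toy model the estimate stays close to the true K_i when protein is present in trace amounts, and it overestimates K_i several-fold when protein is in excess and most of the labeled DNA starts out bound.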
Catalysis Assays-Using the FRT substrates found to have K_i values less than ~100 nM, catalysis assays were performed in 5% glycerol, 0.1 mg/ml bovine serum albumin, 0.1 g/ml salmon sperm DNA, 2 mM EDTA, 12.5 mM TAPS, pH 8, 1 mM fresh dithiothreitol, 100 mM NaCl, 50 mM (NH4)2SO4 with <2 nM 32P-FRT DNA and 100 nM WT Flpe. After establishing how long it took each substrate to reach 10% completion of the cleavage reaction, time points ranging from 10 s to 1 min were chosen for each substrate so that we could measure the initial rate. Reactions were stopped in 2× SDS buffer and run on 16% Tris-Tricine mini gels with SDS buffer at 150 V for 40 min. Gels were exposed, scanned, and quantitated as described above, and the averaged time points from at least three experiments were fit to a straight line going through the origin with Kaleidagraph, using the errors to weight the overall fit. Error bars on v_0 in Fig. 5 represent the standard error from the fit. To determine the cleavage rate constant, k_cleavage, we divided v_0 by the fraction bound, calculated from the K_i and the Flp and DNA concentrations.
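The initial-velocity fit described here, a weighted straight line constrained through the origin, has a closed-form solution. The sketch below is one way to compute it in Python; the function name, the time points, and the fraction-cleaved values are hypothetical, and the reported standard error assumes the supplied standard deviations are correct, which may differ from how Kaleidagraph scales its fit errors.

```python
import math

def fit_line_through_origin(times, fractions, sigmas):
    """Weighted least-squares slope for y = v0 * t with no intercept.
    Weights are 1/sigma^2. Returns (v0, standard error of v0)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    sxx = sum(w * t * t for w, t in zip(weights, times))
    sxy = sum(w * t * y for w, t, y in zip(weights, times, fractions))
    return sxy / sxx, math.sqrt(1.0 / sxx)

# Hypothetical, noise-free early time course (fraction cleaved stays below 10%).
times = [10, 20, 30, 45, 60]               # s
fractions = [1.5e-3 * t for t in times]    # synthetic data, v0 = 1.5e-3 per s
sigmas = [0.005] * len(times)              # hypothetical standard deviations
v0, v0_se = fit_line_through_origin(times, fractions, sigmas)
print(v0, v0_se)
```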
The errors on k_cleavage were determined by numerically propagating the errors on v_0 and on the fraction bound and combining them in quadrature.
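The conversion from v_0 to k_cleavage and its error propagation can be sketched as follows. Because the fraction-bound equation is not reproduced in this text, the code assumes the simple form [Flp]/([Flp] + K_i), which is reasonable only when Flp is in large excess over the labeled DNA (100 nM Flp versus <2 nM DNA here). The error on the fraction bound is propagated numerically from the error on K_i by a central finite difference, and the relative errors are then added in quadrature. The v_0 value and its error are hypothetical; the K_i is the WT value reported in Results.

```python
import math

def fraction_bound(flp_nM, ki_nM):
    """Assumed fraction of DNA bound when Flp is in large excess over DNA."""
    return flp_nM / (flp_nM + ki_nM)

def cleavage_rate_constant(v0, v0_err, flp_nM, ki_nM, ki_err_nM):
    """Return (k_cleavage, error), with relative errors combined in quadrature."""
    fb = fraction_bound(flp_nM, ki_nM)
    # Numerical propagation of the K_i error onto the fraction bound.
    fb_err = abs(fraction_bound(flp_nM, ki_nM + ki_err_nM)
                 - fraction_bound(flp_nM, ki_nM - ki_err_nM)) / 2.0
    k = v0 / fb
    rel_err = math.sqrt((v0_err / v0) ** 2 + (fb_err / fb) ** 2)
    return k, k * rel_err

# WT K_i = 4.7 +/- 1.3 nM (from the text); v0 and its error are hypothetical.
k_cleavage, k_err = cleavage_rate_constant(2.7e-3, 0.1e-3, 100.0, 4.7, 1.3)
print(k_cleavage, k_err)
```

For tightly binding substrates the fraction bound is close to 1 and contributes little error; for weak binders the uncertainty in K_i dominates the error on k_cleavage.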

RESULTS
The equilibrium inhibition constants (K_i) we derived from competition binding assays are shown in Table 1; these equilibrium inhibition constants can be treated as equilibrium dissociation constants (K_d). To rule out any possible covalent species in our competition binding assays, we used a Y343F mutant, which lacks the catalytic tyrosine. In all of these experiments we used a variant of Flp, Flpe, containing four mutations that were evolved to give Flp greater stability at 37°C (12,25). The DNA substrates we used each contain a single Flp binding site, which is one-half of the inverted repeat found in the FRT site (Fig. 4). Using competition assays with increasing amounts of unlabeled competitor DNA and a single concentration of Flpe YF eliminated many of the problems we had with Flpe solubility and behavior. Competition assays were repeated 3-6 times, and a weighted fit of the averages with their errors was carried out using an equation for the 50% inhibition concentration (IC50) and converted to K_i values as described under "Experimental Procedures." We also measured the catalytic efficiency of a subset of FRT variants with WT Flpe and a DNA suicide substrate containing the FRT variation of interest (Table 2). The DNA substrates used in both binding and catalysis studies have only two bases after the scissile phosphate; these are lost to solution following cleavage during the catalysis experiments, leaving a trapped covalent bond between Flp and the DNA. Early cleavage time points were taken, and the average fraction of DNA cleaved from 3 to 5 experiments was fit to a line, weighted by the errors, to obtain the initial velocity, v_0. The v_0 was converted to a cleavage rate constant by dividing by the fraction bound based on the concentration of Flp, DNA, and the K_i for that substrate (Fig. 5). The K_i for the WT FRT is 4.7 ± 1.3 nM, and the cleavage rate constant, k_cleavage, in these conditions is (2.8 ± 0.06) × 10^-3 s^-1.

Binding
Exploring the Scissile Phosphate Position-The change from C to T (FRT WT and FRT1ta) is well tolerated, leading to a very small enhancement in affinity (Table 1 and Fig. 4B). Additionally, comparing the ΔΔG of FRT14ta and FRT4ta, which differ only by the presence of a T in the scissile phosphate position, shows almost no change. However, both G and A (FRT1gc and FRT1at) in the scissile phosphate position weaken binding significantly. The crystal structure shows that there is a hydrogen bond between Lys-82 and the N7 of the guanine across from the scissile phosphate position (Fig. 1). This H-bond would be preserved by a C:G → T:A change, as the A supplies a similar H-bond acceptor in this position. However, C:G → A:T and C:G → G:C changes would disrupt this H-bond (Fig. 1). Interestingly, a K82M mutant can recombine both the WT FRT sequence and the C:G → G:C transversion in FRT1gc in an in vivo assay (26).
We also probed whether the scissile phosphate itself affects binding, by comparing WT sequence substrates that end with a hydroxyl and one that contains the scissile phosphate (FRT WToh and FRT WTP, Table 1). The lack of a scissile phosphate does deter binding somewhat (K_i ~ 34 nM), and the presence of the scissile phosphate returns binding to levels that are slightly better than FRT WT with a two-base pair overhang.
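Differences in binding affinity such as these can be put on a free-energy scale with ΔΔG = RT ln(K_i,variant / K_i,WT). The short Python sketch below applies this to the K_i values quoted above (about 34 nM for FRT WToh versus 4.7 nM for WT FRT). The temperature of 298.15 K is an assumption, since the assay temperature is not stated in this section, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_kcal(ki_variant_nM, ki_wt_nM, temp_K=298.15):
    """Binding free energy change relative to WT; positive means weaker binding.
    temp_K is an assumed temperature, not a value taken from the text."""
    return R_KCAL * temp_K * math.log(ki_variant_nM / ki_wt_nM)

# K_i values quoted in the text: FRT WToh ~34 nM, WT FRT 4.7 nM.
ddg_oh = ddg_kcal(34.0, 4.7)
print(ddg_oh)
```

Under these assumptions the roughly 7-fold loss in affinity for the substrate lacking the scissile phosphate corresponds to a penalty of roughly 1.2 kcal/mol.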
Comparing A Tracts with the Sequential Changes of Which They Are Comprised-A tracts are stretches of 4-6 As or Ts, which give a rigid structure with a narrow minor groove that can result in DNA bending at their edges (27). Because A tracts do not contain pyrimidine-purine (Y-R) base steps such as T-A, which have been found to be more flexible than other dinucleotide steps, they may also be less amenable to induced fit in DNA-protein interactions. The pure stretches of As or Ts in the FRT substrates shown in Fig. 4C (as opposed to A/T mixtures) reveal that the position of the A tract within the FRT site has an enormous effect on binding. An A tract (FRT46ta) that still allows for a C in the scissile phosphate position but extends back further toward position 7, where the minor groove is wider in the Flp-DNA crystal structures (Fig. 3), drastically weakens binding, whereas the A tract positioned closer to the scissile phosphate (FRT14ta) has a much smaller effect on binding. The narrowing of the minor groove in an A tract increases in the 5′-3′ direction for a stretch of As, meaning that the FRT46ta sequence may become extremely narrow where the WT sequence widens a bit with a C. Individually making the changes that add up to the A tract that destroys binding does not reveal an additive effect: the individual changes in FRT4ta and FRT6ta do not seriously affect binding, suggesting that the rigid nature of the full A tract in FRT46ta may be responsible for the decrease in binding activity. Other proteins with affinity for A/T-rich segments also bind rigid A tracts poorly. For example, whereas the TATA-binding protein functions well and can co-crystallize with many A → T transversions in its target sequence, it cannot function or co-crystallize with a rigid A tract (2).
A → T Transversions-A → T changes preserve the minor groove hydrogen bond donor/acceptor pattern, and also the increased propeller twist that leads to a narrow minor groove. If these are the recognizable properties of A/T-rich regions, we should not expect A → T transversions to significantly affect binding. This is true for two of the four A → T transversions in the TATT sequence (FRT4ta and FRT2at), which have reasonable binding affinity. In fact, FRT2at has slightly improved binding affinity. However, the other two (FRT3at and FRT5at) have strongly reduced affinity, for position 3 even worse than a CG substitution (Table 1). Examination of the Flp-DNA structure reveals that there are two side chains located quite close to the negatively affected A → T transversions. In the case of FRT3at, modeling an A → T transversion in this position reveals that the Ala-55 of Flp would be positioned very close to the methyl on the swapped T, leading to a steric clash (Fig. 1B).

[Figure caption (partial): Uridine lacks the methyl group that may cause a steric clash with Ala-55 in the FRT3at substrate; a uridine-containing substrate (FRT3au) had much higher affinity for Flp. Inosine lacks the amine group of guanosine that normally prevents propeller twisting and narrowing of the minor groove. Note that the minor groove (lower) face of an I:C base pair also mimics that of an A:T base pair.]
To test the potential clash between Ala-55 and the methyl on the swapped T in FRT3at, we substituted uridine for thymine (FRT3au). We found that removing this methyl largely restored binding; FRT3au binds much more tightly than FRT3at. The binding defect of FRT5at is harder to explain. Met-58, although poorly ordered, is potentially within range to have some favorable hydrophobic interaction with the methyl of T5 (22). To test the potential interaction between Met-58 and the methyl on T5, we substituted uridine for thymine in this position (FRT5uA). The binding affinity for FRT5uA is similar to WT (K_i ~ 12 nM versus 4.7 nM for WT), suggesting that the interaction between Met-58 and the methyl on T is not significant. It is thus unclear why the binding of FRT5at is impaired, although it may be that placing the hydrophilic N7 of adenine in proximity to Met-58 is less favorable than removing the methyl group of the WT thymine. These clashes are an important reminder that specificity comes not only from positive direct contacts between an enzyme and its substrate, but also from unfavorable contacts that prevent the binding of the incorrect substrate. This likely explains why FRT235at4ta, with all As and Ts swapped, binds extremely poorly. Similarly, the poorly binding A tract FRT1235at includes the deleterious changes in FRT1at, FRT3at, and FRT5at.
Separate from our questions about Flp recognition, we noticed some differences in how FRT2at behaves in our binding and catalysis assays as compared with the full recombination assays used by Cox and co-workers (17). FRT2at is an alternative WT sequence found in one of the three Flp protomer binding sites that comprise a WT FRT site in the 2μ plasmid of S. cerevisiae. (Note that there are only 3 known "WT" Flp binding sites, and thus we cannot do the sort of large scale comparisons possible for transcription factors such as TATA-binding protein (2).) The Cox laboratory (17) found that this alternative WT sequence hindered full recombination activity only when the change shown in FRT2at was present in both half-sites, on both sides of the spacer. Interestingly, we find that FRT2at has even stronger binding affinity than the WT FRT substrate. This implies that the defect in full recombination assays lies in other steps, such as the conformational changes involved in strand exchange and Holliday junction isomerization.
Substituting C:G Pairs-Replacing an A:T pair with a C:G pair, which presumably widens the minor groove through the addition of the N2 amino group of guanine, consistently weakens binding of FRT variants in positions 2-5 (Fig. 4E). To confirm that it is specifically the N2 amino group of guanine causing the decrease in binding, we chose two positions to examine with inosine:cytosine pairs, which are identical to G:C pairs but lack the N2 amino group of guanine (Fig. 6). These changes preserve the WT pattern of pyrimidine-purine steps as well. This probes whether the narrow minor groove is indeed the sequence-specific structural characteristic that Flp recognizes. We found that the I:C pairs do not disrupt binding nearly as much as the G:C pairs (FRT5ci and FRT3ci in Table 1), suggesting that the N2 amino group of guanine is responsible for a great deal of the decrease in affinity.

Catalysis
We chose 13 of the FRT substrates to compare the efficiency of the first cleavage step with the binding step (Table 2). Using radiolabeled DNA suicide substrates and WT Flpe, we took early time points to measure covalent cleavage product formation. It is technically difficult to measure catalysis with FRT substrates that have a low binding affinity, because it is difficult to achieve saturated binding conditions that are comparable with WT within a concentration range where Flp is soluble (<200 nM). Interestingly, the cleavage rate constants appear to be more seriously reduced with the A tracts that can bind well (FRT14ta and FRT4ta) compared with other FRT substrates with poor binding due to C:G pair substitutions (FRT5cg, FRT4cg, and FRT3cg, see Fig. 5). Compared with WT, FRT14ta has a ~3-fold decrease in binding affinity, but a ~10-fold decrease in cleavage rate constant. FRT3cg has a ~30-fold decrease in binding but just a ~3-fold decrease in cleavage rate constant. Examining the structure of FRT DNA before and after cleavage shows that the cleaved DNA kinks at the site of the scissile phosphate (Fig. 7). This suggests that the scissile phosphate may need to be in a specific conformation that a rigid A tract would interfere with, inhibiting cleavage more than poor binding from a C/G substitution does.
Most of the differences in cleavage rate constants for the different FRT substrates are not significant on a logarithmic scale. This implies that many of our DNA sequence changes have not affected the Achilles heel of the reaction because they do not critically hinder cleavage. However, there are some changes that do significantly affect catalysis: FRT1gc, with a C → G change right at the scissile phosphate, essentially prevents cleavage from taking place. When combined with the binding data, the catalysis data reveal that whether discrimination takes place primarily in the binding or catalytic step depends on the position of the sequence change. Maintaining a narrow minor groove with a well placed A tract allows good binding, but hinders catalysis, likely due to the rigid nature of A tracts affecting the movement of the substrate in or before the transition state. Interestingly, the substitution of C:G pairs to widen the minor groove interferes with binding but not as much with catalysis. However, the C:G → G:C change in FRT1gc is devastating at both the binding and the catalytic steps.

DISCUSSION
We measured the binding equilibrium and the apparent rate of the first chemical step in the recombination reaction, so that we can independently compare how the changes in DNA sequence affect the binding and catalysis. Our results show that Flp specificity is largely imparted at the binding step. Specificity can be thought of as the ability to discriminate between substrates in the ground state, or the competition of substrates for an enzyme. Because many DNA-binding proteins do not have a catalytic function beyond binding to a particular DNA sequence, and others have unusually divorced specificity in their binding and catalytic steps, we could have found that sequence changes had no impact on binding and only affected catalysis or vice versa.
There are a few intriguing instances where Flp substrates that bind well are cleaved poorly, suggesting that the sequence requirements for optimal binding and catalysis are not identical. Some changes in the DNA sequence may affect the ability of the DNA-protein complex to reach the transition state, explaining why a rigid A tract such as FRT14ta that can bind well is a poor cleavage substrate.
Our results also show that Flp specificity is the result of a combination of direct and indirect readout, but that the details of the indirect readout are more subtle than initially thought. For example, the single C:G → T:A and A:T → T:A changes in FRT4ta and FRT6ta are only mildly deleterious, but when they are combined in FRT46ta, the effect is far more than additive. FRT46ta contains a bona fide "A tract," whose special structure may be incompatible with that of the Flp-DNA complex. However, FRT14ta, where the A tract is moved only one position to the right, binds reasonably well. This can be rationalized by the crystal structure, where the minor groove becomes increasingly wide (and thus A tract-incompatible) toward position C7. Disrupting the A/T-rich region near the scissile phosphate through the substitution of C:G pairs worsens binding, whereas A/T transversions and replacements with inosine:cytosine pairs have much smaller impacts on binding. This suggests that Flp does indeed prefer sequences that can easily adopt a narrow minor groove in this region. Some A/T transversions hindered binding, which can be attributed to clashes with nearby Flp side chains. While focusing on how specificity is created by positive interactions, it can be easy to forget that specificity also comes from avoiding negative interactions so that the incorrect substrate does not bind. In addition, whereas binding is disrupted by the substitution of C:G pairs in the A/T-rich region of the FRT site, catalysis seems to be much less hindered. Even the A tract FRT14ta, which binds similarly to WT FRT, is outperformed in the catalysis assays by FRT4cg, in which the substitution of a C:G pair does disrupt binding. Flp sequence discrimination in this region near the scissile phosphate is thus imparted at both the binding and catalytic steps, depending on the change that is made.