Identification and Characterization of the N-Ethylmaleimide-sensitive Site in λ-Integrase

Integrase (Int) of bacteriophage λ is a heterobivalent DNA-binding protein and a type I topoisomerase. Upon modification with N-ethylmaleimide (NEM), a sulfhydryl-directed reagent, Int loses its capacity to bind “arm-type” DNA sequences and, consequently, to carry out recombination; however, its ability to bind “core-type” sequences and its topoisomerase activity are unaffected. In this report, the NEM-sensitive site was identified by modifying Int with [14C]NEM. Following cleavage by formic acid, which cleaves Asp-Pro bonds, and fractionation on a Fractogel HW-50 (F) sizing column, the fragment containing the primary site of [14C]NEM incorporation was subjected to amino acid sequencing. The results indicate that the primary site of [14C]NEM incorporation is in the peptide spanning amino acid residues 1-28, which contains a cysteine at position 25. To confirm that Cys-25 is the target of NEM reactivity, site-directed mutagenesis was used to change this cysteine to alanine or serine. The mutant protein is not chemically modified by NEM and shows no loss of activity after NEM treatment. The fact that C25A and C25S both retain full recombination activity indicates that the SH group of Cys-25 does not provide any critical contacts, either with arm-type DNA or with other parts of the Int protein, to form the arm-type recognition pocket. The loss of arm-type DNA binding and the concomitant loss of recombination function as a result of NEM modification must therefore be due to the presence of the maleimide moiety and not to the loss of a critical cysteine contact.

prises 240 base pairs and contains many protein binding sites for Int and accessory proteins (IHF, Xis, and Fis). These DNA-bending proteins cooperate with Int to form the recombinogenic "intasomes" (3). attB contains only two Int binding sites of the core-type, while attL and attR are hybrid sites of intermediate complexity between attP and attB.
There are two classes of binding sites for Int. Core-type sites are positioned as inverted repeats that encompass each of the sites of DNA strand nicking, which are staggered by 7 base pairs (the overlap region) on each DNA helix (4,5). Arm-type Int binding sites have a different consensus recognition sequence, are located some distance away from the region of strand exchange, and bind Int with a higher affinity than the core-type sites (6).
When Int was first purified, it was shown to be a type I topoisomerase that could nick and religate DNA in the absence of any high energy cofactors (7,8). While this activity fit nicely with the requirements for a recombinase that cleaved and reassorted DNA helices, its function was clearly more complex than that of a simple topoisomerase. One of the first suggestions of this complexity came from the effects of pretreating Int with N-ethylmaleimide (NEM). While NEM treatment abolished recombination activity, it did not impair the topoisomerase activity of Int (8). Additionally, it was observed that NEM abolished the formation of non-filterable heparin-resistant complexes that Int formed with attP DNA (7). These observations led Kikuchi and Nash (8) to speculate that Int had two domains. Nuclease protection experiments showed that NEM inactivated one of two distinct DNA binding specificities, namely the one responsible for heparin-resistant binding at the high affinity arm-type sites (6). Binding at the low affinity heparin-sensitive core-type sites, where strand cleavage takes place, was (like the topoisomerase activity) resistant to NEM modification.
Although the analysis of partial proteolytic fragments has been used to localize the two DNA binding specificities to an amino-terminal and a carboxyl-terminal domain, respectively, the specific activity of the proteolytic fragments was significantly reduced relative to intact Int (9). Additionally, an extensive mutational mapping analysis of the functional domains of Int has revealed some mutants that affect binding to arm-type sites but map in the carboxyl-terminal domain (10). In this report, we identify the unique cysteine residue that is modified by NEM and show by site-directed mutagenesis that loss of recombination function is due to the presence of the maleimide moiety and not due to loss of a critical cysteine contact.

MATERIALS AND METHODS
Preparation of Int and IHF-Int proteins (wild type and mutant) were produced from an expression plasmid under the control of a T7 promoter in E. coli BL21. Int proteins were purified to near homogeneity by a modification of the method of Kikuchi and Nash (7). IHF was produced and purified by the method of Nash et al. (11). Protein concentrations were determined by the dye binding method (12).
Site-directed Mutagenesis-The desired mutations were made by oligonucleotide-directed mutagenesis using the polymerase chain reaction. The mutations were introduced on a 32-base primer that also overlapped the unique PpuMI site in the Int gene. The other primer overlapped a unique XbaI site downstream of the T7 promoter. The resulting product was purified with the Qiagen Inc. (Chatsworth, CA) polymerase chain reaction purification kit using the protocol and conditions of the manufacturer. The purified polymerase chain reaction product was then cleaved with the restriction enzymes XbaI and PpuMI, and the 150-base pair product was purified on a 1.2% agarose gel and excised from the gel using the Qiagen gel extraction kit. The purified XbaI-PpuMI fragment was then introduced into the Int expression plasmid at its XbaI-PpuMI sites. The mutations were confirmed by sequencing the entire XbaI-PpuMI interval. For expression of the mutant proteins, the expression plasmid containing the mutation was introduced by transformation into the host strain E. coli BL21.
Gels were stained with 0.25% Coomassie Brilliant Blue R-250 in 50% methanol and 10% acetic acid, followed by destaining in 5% methanol containing 7.5% acetic acid. 14C radioactivity was located by fluorography using sodium salicylate as the fluor (15). Exposed film was analyzed by densitometry using an LKB Ultrascan XL laser densitometer and an IBM-AT personal computer.

Titration of the Int with N-[ethyl-2-3H]Ethylmaleimide ([3H]NEM)-
Int protein was adjusted to a protein concentration of 7.4 µM in MEG 0.5 BPT (50 mM MOPS (pH 7.4), 1 mM EDTA, 10% glycerol, 500 mM sodium chloride, 1 mM 2-mercaptoethanol, 25 µg/ml phenylmethylsulfonyl fluoride, and 0.001% Triton X-100). Aliquots of [3H]NEM (specific radioactivity of 50 Ci/mol) dissolved in 25 µl of water were added to 475 µl of Int (3.5 nmol), giving a final volume of 500 µl. The mixture was incubated at 23°C for 15 min, followed by the addition of an equimolar amount of dithiothreitol prior to loading onto a hydroxylapatite column (50-µl bed volume) preequilibrated in MEG 0.5 BPT. Unbound radioactivity was removed by washing the column with 5 × 1 ml washes of MEG 0.5 BPT. Bound protein was eluted with 0.2 ml of MEG 0.5 BPT containing 40 mM sodium phosphate. An aliquot was used for measuring the radioactivity by liquid scintillation counting.
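As a consistency check on the quantities in this titration, the following minimal sketch recomputes the Int concentration from the stated amount (3.5 nmol) and volume (475 µl), and shows how scintillation counts could be converted to a mole ratio of NEM per Int using the stated specific radioactivity. The function names and the example count value are illustrative assumptions and are not data from this study; counting efficiency is assumed to be already corrected for.

```python
# Sketch only: unit conversions for the [3H]NEM titration.
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in 1 Ci


def molar_concentration_uM(amount_nmol, volume_ul):
    """Concentration in µM from an amount in nmol and a volume in µl."""
    return amount_nmol / volume_ul * 1000.0


def nem_per_int(dpm, specific_activity_ci_per_mol, protein_nmol):
    """Moles of labeled NEM per mole of protein in a counted sample.

    dpm is assumed to be corrected for counting efficiency.
    """
    nem_mol = dpm / DPM_PER_CURIE / specific_activity_ci_per_mol
    return nem_mol * 1e9 / protein_nmol


# 3.5 nmol of Int in 475 µl, as stated in the text
print(round(molar_concentration_uM(3.5, 475), 1))

# Hypothetical sample: 111,000 dpm recovered with 1.0 nmol of protein
print(nem_per_int(111000, 50, 1.0))
```

The first value agrees with the 7.4 µM protein concentration given above. The second call is purely illustrative: at 50 Ci/mol, 1 nmol of bound NEM corresponds to about 1.1 × 10^5 dpm.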
Recombination Assays Using 32P-Labeled attB-pWR101 (linearized with EcoRI), 5′-labeled with [γ-32P]ATP (17), was used as one of the substrates (attB) at a 2-fold excess over attP. Int protein and its mutants, C25A and C25S, were used at equal protein concentrations (80 pmol). Int and IHF were present at a molar ratio of 2:1. Other conditions were similar to those used for the non-radioactive assay for recombination. At different time intervals, the reaction was quenched by the addition of SDS, and the product was separated from the substrate on a 1.2% agarose gel. The amount of product (recombinant) was quantitated by scanning on a phosphorimager.
Amino-terminal Peptide Sequencing-Sequencing of the peptides was carried out by the Sequence Facility at the University of California, Davis on a Beckman model 890 M liquid phase sequencer. Amino acids were identified as their phenylthiohydantoin derivatives using two different reverse-phase high performance liquid chromatographic systems.

RESULTS AND DISCUSSION
The sulfhydryl group of cysteine is in general the most reactive functional group in a protein. N-Ethylmaleimide reacts with sulfhydryl groups in proteins with considerable specificity under conditions such as those used in the present study (18,19). The reaction is very rapid and involves the addition of the SH group to the olefinic double bond of NEM to form a thioether (Fig. 1) (20). There are instances where NEM has been shown to react with functional groups other than the SH group, namely, the α-amino group of peptides, the imidazole moiety of histidine (20,21), and the ε-amino group of lysine (22). However, all of these non-SH side chain reactions are much slower than the reaction of NEM with the sulfhydryl moiety of cysteine. Under conditions where the reaction with thiol groups is complete within 20 s, the non-SH reactions have half-times of about 2 h (23). It should also be emphasized that these non-SH reactions occur efficiently only at very high concentrations of NEM (100 mM), well above those used in SH group modifications (1-5 mM) (22).
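To put these relative rates in perspective, the sketch below assumes simple first-order kinetics (an assumption made here for illustration; it is not stated in the cited studies) and estimates how far a side reaction with a half-time of 2 h would proceed during a given exposure. The function and constant names are hypothetical.

```python
# Sketch only: extent of a slow side reaction, assuming first-order kinetics.

def fraction_reacted(elapsed_s, half_time_s):
    """Fraction of sites reacted after elapsed_s for a first-order process."""
    return 1.0 - 0.5 ** (elapsed_s / half_time_s)


HALF_TIME_NON_SH_S = 2 * 3600  # "about 2 h" half-time quoted for non-SH groups

# Extent of the non-SH reaction in the 20 s needed to complete thiol modification
print(fraction_reacted(20, HALF_TIME_NON_SH_S))

# Extent after a 15-min exposure at the same (quoted) half-time
print(fraction_reacted(15 * 60, HALF_TIME_NON_SH_S))
```

Under this assumption, roughly 0.2% of a non-SH group would react in 20 s and roughly 8% in 15 min. These figures apply only to the half-time quoted from the literature and were not measured for Int.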
There are four Cys residues in the bacteriophage λ-Int protein. One Cys residue is present in the amino-terminal domain, which recognizes and binds to "arm-type" DNA, while the other three are located in the carboxyl-terminal domain, which binds to "core-type" DNA and also harbors the "nicking-ligating" topoisomerase activity (8). As observed previously (8), modification of the Int protein with 2.5 mM N-ethylmaleimide for 15 min results in 95% inactivation of recombination activity and no reduction in the topoisomerase activity (Fig. 2). The stoichiometry of incorporation of [3H]NEM into the Int protein showed that at the concentration of NEM required for 95% inactivation of recombination activity, 1 eq of [3H]NEM was incorporated per equivalent of the Int protein (Fig. 2), suggesting the modification of a single Cys residue.
To identify the target Cys residue, the Int protein was modified with [14C]NEM. The [14C]NEM-Int adduct was fragmented by treatment with 70% formic acid, which cleaves the protein at Asp-Pro bonds (24). The relatively low abundance of aspartic acid-proline linkages and the ease with which the reaction can be carried out make it an attractive method for generating large fragments for direct sequence analysis or for subsequent proteolytic digestion. There are two Asp-Pro linkages in the Int protein; one is near the amino terminus at positions 28-29 of the primary amino acid sequence, while the other is at positions 303-304 (Fig. 3). Thus, complete formic acid cleavage of the Int protein should result in three fragments of approximately 27, 5, and 3 kDa.
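The expected cleavage products can be enumerated from the two Asp-Pro positions. The following is a minimal sketch; the function names are hypothetical, and the total length of 356 residues for λ Int is an assumption taken from outside this text. It lists the residue ranges of a complete digest and the additional peptides expected when only one of the two bonds is cut.

```python
# Sketch only: residue ranges of formic acid (Asp-Pro) cleavage products.
from itertools import combinations


def cleavage_fragments(length, asp_positions):
    """Residue ranges (1-based, inclusive) after cutting each Asp-Pro bond.

    Each entry in asp_positions is the Asp residue number; the peptide
    bond between that Asp and the following Pro is the one cleaved.
    """
    cuts = sorted(asp_positions)
    starts = [1] + [c + 1 for c in cuts]
    ends = cuts + [length]
    return list(zip(starts, ends))


def partial_digest_fragments(length, asp_positions):
    """Extra ranges seen when only some of the bonds are cut."""
    complete = set(cleavage_fragments(length, asp_positions))
    partial = set()
    for k in range(1, len(asp_positions)):
        for subset in combinations(asp_positions, k):
            partial.update(cleavage_fragments(length, list(subset)))
    return sorted(partial - complete)


INT_LENGTH = 356  # assumed length of λ Int; not stated in this text

print(cleavage_fragments(INT_LENGTH, [28, 303]))
# [(1, 28), (29, 303), (304, 356)]
print(partial_digest_fragments(INT_LENGTH, [28, 303]))
# [(1, 303), (29, 356)]
```

The three complete-digest ranges correspond to the small amino-terminal peptide, the large central peptide, and the carboxyl-terminal peptide; the two partial-digest ranges are the larger peptides expected from cleavage at only one site.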
The formic acid digestion products were separated by size exclusion chromatography on a Fractogel HW-50 (F) column, and individual fractions were visualized by Coomassie Blue staining for total protein or by fluorography for detection of 14C radioactivity (Fig. 4). The first peak, of highest molecular weight material, is expected to contain the 27-kDa peptide extending from position 29 to 303, two large (partial digest) peptides generated by formic acid cleavage at only one of the two Asp-Pro sites, some large peptides due to formic acid cleavage at bonds of intermediate acid lability, and uncleaved Int. This heterogeneous collection of peptides is seen near the top of the Coomassie-stained gel (fractions 17-21, middle panel). The most prominent band is the 27-kDa peptide; above it a trail of larger peptides extends to the undigested Int that has been trapped in the well. The radioactivity in these fractions did not migrate as a discrete band. Except for the undigested material trapped in the well, the radioactivity in this region is present as a dispersed low level smear (fractions 18-20, bottom panel). The plot of UV absorbance shows that, appended to the first peak, there is a shoulder of a lower molecular weight peptide that should correspond to the 5-kDa peptide extending from position 304 to the carboxyl terminus (fractions 22-23, middle panel). Comparison of the Coomassie-stained gel and the autoradiograph shows that there is no radioactivity associated with this peptide. The next peak, fractions 25-28 (middle panel), contains the 3-kDa peptide that extends from the amino terminus to position 28. This is the only peak that contains a specific band of radioactivity (bottom panel). The last peak in the UV plot contains heterogeneous degradation products of <1 kDa and has no radioactivity associated with it.
The specific activity of the fractions containing the 3-kDa peptide (fractions 26-28, bottom panel) is more than 6 times higher than that of any other fraction from the column. Additionally, the fraction of radioactivity recovered in the 3-kDa peak from the column (37%) is essentially the same as the fraction of radioactivity recovered in the analogous band (39%) from the crude unfractionated hydrolysate (lane L, bottom panel). These observations, coupled with the overall high yield of radioactivity following the column fractionation (88% recovery), reinforce the conclusion that the primary site of [14C]NEM modification is in the smallest formic acid peptide. We confirmed that the 3-kDa peptide does indeed correspond to the amino-terminal fragment (amino acid residues 1-28) by subjecting it to 10 cycles of amino-terminal amino acid sequencing (Fig. 3). As stated above, this fragment has a unique Cys residue at position 25.
To confirm that Cys-25 is the target of NEM reactivity and to rule out the unlikely possibility that the NEM reactivity was targeted to a non-SH group (see above), we utilized site-directed mutagenesis to change this residue to a serine (C25S) or an alanine (C25A). The mutations were introduced by standard oligonucleotide-directed mutagenesis techniques, confirmed by sequencing, and introduced into appropriate plasmid vectors for protein production (under "Materials and Methods"). Wild-type Int and the mutant proteins were purified to at least 85% purity as judged by Coomassie blue staining of a polyacrylamide gel. That Cys-25 in the wild-type Int protein is the target of NEM reactivity was clearly evident from the observation that the mutant protein C25S is completely resistant to NEM modification (Fig. 5).

[Fig. 4 legend, truncated: Bacteriophage λ-Int protein was modified with [14C]NEM as described and subjected to treatment with 70% formic acid as described under "Materials and Methods." The resultant partial acid hydrolysate was fractionated on a Fractogel HW-50 (F) sizing column in 12% acetic acid. The top panel shows the elution profiles of total protein, absorbance at 280 ...]
The mutants also made it possible to determine whether the functional sensitivity of Int to NEM modification reflected a critical role for the SH group of Cys-25. Fig. 6 shows that both C25S and C25A are indistinguishable from wild-type Int in their ability to catalyze recombination in vitro. We conclude that the SH group of Cys-25 does not provide any critical contacts, either with arm-type DNA or with other parts of the Int protein, to form the arm-type recognition pocket. The loss of arm-type DNA binding and the concomitant loss of recombination function as a result of NEM modification must be due to the presence of the maleimide moiety and not due to loss of a critical cysteine contact.
The initial observation of NEM sensitivity was one of the original bases for suggesting a two-domain structure for Int (8). Subsequent partial proteolysis provided more substantial evidence, but there are caveats associated with the results of partial proteolysis (9). The results reported here with the intact protein map the arm-binding defect of NEM-Int to the amino-terminal domain. This further strengthens the argument for, and identification of, two domains with distinct and non-overlapping interactions with DNA. One of these is an NEM-sensitive amino-terminal domain that binds with high affinity to arm-type DNA recognition sequences. The other is an NEM-insensitive carboxyl-terminal domain that binds with low affinity to core-type DNA recognition sequences and specifies the nicking and ligating activities of Int. Although the two domains appear to be relatively independent, mutational analyses indicate that there may be some interaction between them (10).
The disruption of arm binding by the maleimide moiety could be due to simple steric hindrance, i.e. the presence of a bulky group, or the moiety could cause a conformational change in the vicinal arm-type DNA binding pocket. The change in conformation, if any, is probably confined to a specific and localized region, since binding to core-type DNA and topoisomerase activity are both unaffected (6,8). As more structural information about Int is acquired, modeling the structural basis for the maleimide inhibition should be helpful in evaluating potential environments for Cys-25 and in working out the structural details of how this unusual heterobivalent DNA-binding protein interacts with DNA to form a higher order recombinogenic complex.