Identification of Lys-403 in the PI-SceI Homing Endonuclease as Part of a Symmetric Catalytic Center*

Superposition of the PI-SceI and I-CreI homing endonuclease three-dimensional x-ray structures indicates general similarity between the I-CreI homodimer and the PI-SceI endonuclease domain. Saddle-shaped structures are present in each protein that are proposed to bind DNA. At the putative endonucleolytic active sites, the superposition reveals that two lysine (Lys-301 and Lys-403 in PI-SceI and Lys-98 and Lys-98′ in I-CreI) and two aspartic acid residues (Asp-218 and Asp-326 in PI-SceI and Asp-20 and Asp-20′ in I-CreI) are related by 2-fold symmetry. The critical role of Lys-301, Asp-218, and Asp-326 in the PI-SceI reaction pathway was reported previously. Here, we demonstrate the significance of the active-site symmetry by showing that alanine substitution at Lys-403 reduces cleavage activity by greater than 50-fold but has little effect on the DNA binding activity of the mutant enzyme. Substitution of Lys-403 with arginine, which maintains the positive charge, has only a modest effect on activity. Interestingly, even though the Lys-301 and Lys-403 residues display pseudosymmetry, PI-SceI mutant proteins with substitutions at these positions have different behaviors. The presence of similar basic and acidic residues in many LAGLIDADG homing endonucleases suggests that these enzymes use a common reaction mechanism to cleave double-stranded DNA.

Homing endonucleases are a group of enzymes that mediate DNA rearrangement processes (for reviews see Refs. 1-3). The genes that encode these proteins are frequently associated with inteins and Group I introns, and homing endonucleases initiate the mobility of these elements to loci that lack them by creating double strand breaks. Members of the LAGLIDADG subfamily of homing endonucleases, including the yeast PI-SceI protein, typically contain two LAGLIDADG motifs (also termed EN1 and EN3 (4) or Blocks C and E (5, 6)) separated by approximately 110 residues, but some LAGLIDADG enzymes encoded by Group I introns, such as I-CreI, are significantly smaller and contain a single motif.
The crystal structure of the PI-SceI intein was recently determined, the first for a homing endonuclease and a protein generated by protein splicing (7). The structure shows that the endonucleolytic and protein splicing active sites are situated in separate domains. Residues within the two LAGLIDADG motifs in the endonuclease domain form two parallel α-helices that are tightly packed. Two conserved acidic residues (Asp-218 and Asp-326) are located at the C termini of these two helices and play a critical role in the catalytic mechanism of the enzyme because substitution of these amino acids with alanine eliminates cleavage activity but permits substrate binding (8). Furthermore, mutagenesis experiments indicate that a conserved lysine (Lys-301), which is 6 Å distant from Asp-218 and occurs in another conserved motif (Block D), is also critical for catalytic activity (7, 9). We remarked previously that the spatial arrangement of the Asp-218, Asp-326, and Lys-301 residues resembles that of residues at the active sites of several restriction enzymes and suggested that the different endonucleases employ similar reaction mechanisms (7).
PI-SceI forms two protein-DNA complexes with its asymmetric 31-base pair recognition sequence in gel mobility shift experiments, one in which contacts are made between the protein splicing domain and a region of the substrate (region II) that is situated adjacent to the cleavage site, and a second that includes these contacts as well as additional interactions between the endonuclease domain and the cleavage site itself (region I) (9-11). By contrast, I-CreI interacts with a shorter, pseudosymmetric 19-24-base pair sequence to yield a single complex (12).
Here, we report that a structural comparison of the PI-SceI and I-CreI proteins (7, 13) reveals a lysine residue at the PI-SceI active site, Lys-403, that is related by local 2-fold symmetry to Lys-301. Analysis of mutant PI-SceI proteins with substitutions at Lys-403 demonstrates the importance of this residue in the reaction pathway. This previously unrecognized symmetry relationship in PI-SceI implies that two equivalent pairs of residues (Asp-218 with Lys-301 and Asp-326 with Lys-403) form either one or two active sites. Superposition of I-CreI and PI-SceI indicates that each pair of residues overlaps with a similar pair in each of the symmetry-related monomers of I-CreI and suggests that the LAGLIDADG homing endonucleases share a common active-site architecture and catalytic mechanism.

EXPERIMENTAL PROCEDURES
Materials-All oligonucleotides were synthesized by Genosys Biotechnologies, Inc. TALON metal affinity resin was obtained from CLONTECH, and SP-Sepharose was obtained from Amersham Pharmacia Biotech.
Mutagenesis of the PI-SceI Gene-Plasmid pET PI-SceI C-His contains a PI-SceI gene that encodes a 479-amino acid PI-SceI derivative with a polyhistidine C-terminal extension (9). The K403A and K403R substitutions were introduced into the PI-SceI gene using mutagenic oligonucleotide primers in a two-step overlapping polymerase chain reaction amplification protocol (14). All mutations and inserted sequences were confirmed by dideoxy sequencing.
* This work was supported by National Institutes of Health Grant GM50815 (to F. S. G.) and National Institutes of Health Training Grant GM08280 (NIGMS) to the Houston Area Molecular Biophysics Program (to X. D.) and the Howard Hughes Medical Institute (F. A. Q.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This paper is dedicated to the memory of Dr. Josephine G. Gimble.
§ To whom correspondence should be addressed: Center for Macromolecular Design, Institute of Biosciences and Technology, 2121 W.
Expression of PI-SceI Protein-Wild-type PI-SceI and the PI-SceI mutant derivatives were purified as described previously (9) by Co²⁺ metal affinity and SP-Sepharose ion-exchange chromatography to greater than 95% as judged by SDS-polyacrylamide gel electrophoresis. Protein concentrations were determined using an extinction coefficient of 5.03 × 10⁴ /M/cm (9).
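The concentration determination from the extinction coefficient can be illustrated with the Beer-Lambert law. The following is a minimal sketch rather than the authors' procedure: the absorbance reading, the 1-cm path length, and the function name are hypothetical, and the measurement wavelength is not stated in the text.

```python
# Extinction coefficient quoted in the text (ref. 9), in /M/cm.
EXTINCTION_COEFF = 5.03e4


def protein_concentration_molar(absorbance, path_length_cm=1.0,
                                epsilon=EXTINCTION_COEFF):
    """Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l)."""
    if epsilon <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_length_cm)


# Hypothetical absorbance reading in a 1-cm cuvette (illustrative only).
example_absorbance = 0.503
conc = protein_concentration_molar(example_absorbance)
print(f"{conc * 1e6:.1f} micromolar")
```

With these assumed inputs the stock would be about 10 micromolar, which would then be diluted to the nanomolar concentrations used in the assays.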
Assay of PI-SceI DNA Binding-A 219-base pair duplex DNA probe that contains a single PI-SceI recognition site was prepared for native gel mobility shift assays by polymerase chain reaction and end-labeled with [³²P]ATP (9). Purified wild-type PI-SceI and the K301A, K301R, K403A, and K403R variant proteins (0.7 nM) were used in gel mobility shift experiments to measure DNA binding as described previously (9). Protein-DNA complexes were separated from unbound DNA on a 7% native polyacrylamide gel and were visualized by autoradiographic exposure of the dried gel to film.
PI-SceI-mediated DNA Cleavage-Rates of DNA cleavage were measured under single-cycle conditions where enzyme was in excess over substrate. A linear plasmid substrate (pBS-PISce36, 7 nM) containing a single PI-SceI recognition site was incubated as described previously with either purified wild-type, K403A, or K403R proteins (100 nM) for various lengths of time at 37°C (9). The amounts of undigested linear plasmid DNA and of the two cleavage products were determined by scanning densitometry (Molecular Dynamics), and cleavage rates were determined by curve-fitting of the data using KaleidaGraph (Synergy Software).
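One way to extract a rate from a single-cycle time course is sketched below. It assumes that, with enzyme in excess, the fraction of uncut substrate decays as a single exponential, exp(-k·t), and it uses a log-linear least-squares fit rather than the nonlinear curve-fitting performed in KaleidaGraph. The time points, the rate constant used to generate them, and the function name are hypothetical.

```python
import math


def fit_first_order_rate(times_min, fraction_uncut):
    """Estimate k (per minute) assuming fraction_uncut = exp(-k * t).

    Uses a least-squares line through the origin for ln(fraction) versus t.
    Points with t <= 0 or fractions outside (0, 1] are skipped.
    """
    pairs = [(t, f) for t, f in zip(times_min, fraction_uncut)
             if t > 0 and 0 < f <= 1]
    if not pairs:
        raise ValueError("no usable data points")
    numerator = sum(t * math.log(f) for t, f in pairs)
    denominator = sum(t * t for t, _ in pairs)
    return -numerator / denominator


# Hypothetical densitometry data: noise-free points generated with k = 0.12/min.
times = [1, 2, 5, 10, 20]
k_true = 0.12
fractions = [math.exp(-k_true * t) for t in times]

k_fit = fit_first_order_rate(times, fractions)
print(f"fitted k = {k_fit:.3f} per min")
```

With real gel data the log transform weights late, low-signal time points heavily, so a direct nonlinear fit of the untransformed fractions may be preferable.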

RESULTS AND DISCUSSION
The yeast PI-SceI and the Chlamydomonas I-CreI homing endonucleases, in common with all characterized LAGLIDADG endonucleases, require Mg²⁺ co-factor for DNA cleavage activity and cut DNA to yield a 4-base pair extension with 5′-phosphate and 3′-hydroxyl groups (1), thus raising the possibility that these enzymes utilize a common reaction mechanism. An important functional difference between the proteins is that PI-SceI, but not I-CreI, catalyzes protein splicing and contains motifs located at the N- and C-terminal regions that are specific to splicing proteins (5, 6). Consequently, I-CreI is a much smaller protein (163 residues) than PI-SceI (454 residues). A second reason for the large size difference between the proteins is that the PI-SceI endonuclease domain is likely to have evolved by duplication and fusion of a gene similar to the one that encodes I-CreI. This idea is supported by the observation of a tandemly repeated sequence and protein footprinting pattern in homing endonucleases (15). PI-SceI contains two LAGLIDADG motifs and binds DNA as a monomer (11), whereas I-CreI has only a single motif and exists in solution as a homodimer (12).
Structural Comparison of the I-CreI Homodimer and the PI-SceI Endonuclease Domain-To elucidate common and distinctive features of homing endonucleases that contribute to substrate recognition and cleavage, we first compared the PI-SceI and I-CreI structures. The PI-SceI structure reveals that the endonuclease domain (domain II) is folded from two similar α/β substructures (identified as N- and C-subdomains) that are related by an approximate or pseudo 2-fold symmetry (7). I-CreI is structurally similar to only the endonuclease domain of PI-SceI, as is evident from superimposing the Cα atoms of the endonuclease domain onto those of the I-CreI homodimer (Fig. 1, a-c). The two PI-SceI subdomains and the I-CreI homodimer form a saddle-shaped structure with the underside lined by the two β sheets and the convex upper surface topped by α helices. Models of DNA docked to both proteins indicate involvement of the saddle in DNA binding (7, 13). The length of the saddle in PI-SceI is about 40 Å, whereas that in I-CreI is about 70 Å. This size difference arises from a long extended structure in the I-CreI homodimer, composed of the β1 and β2 strands and a connecting loop, that forms parts of the two diagonally opposite sides of the homodimer saddle; the analogous structures in the N-subdomain (a short loop and β14) and C-subdomain (β19 and β20) of PI-SceI are significantly shorter (Fig. 1, a-d).
The length of the groove formed by the underside of the I-CreI saddle is sufficient to bind a roughly 20-base pair pseudosymmetric DNA homing sequence (13). In contrast, the groove length in PI-SceI is sufficient to cover only about 12 base pairs of DNA, far shorter than the required 31 base pairs. Fewer binding contacts may be required by the PI-SceI endonuclease domain because additional contacts are made by the protein splicing domain (9, 16). We speculate that PI-SceI lost the extended saddle architecture present in I-CreI once it became associated with a protein splicing domain that acquired DNA binding activity.
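The correspondence between saddle length and the number of base pairs covered can be checked with simple arithmetic. The sketch below assumes straight, canonical B-form DNA with a rise of 3.4 Å per base pair; this value and the function name are assumptions introduced here, whereas the 40 Å and 70 Å lengths are those quoted in the text.

```python
# Assumed helical rise of canonical B-form DNA, in angstroms per base pair.
RISE_PER_BP_ANGSTROM = 3.4


def base_pairs_spanned(length_angstrom, rise=RISE_PER_BP_ANGSTROM):
    """Number of base pairs of straight B-DNA covered by a given length."""
    if rise <= 0:
        raise ValueError("rise per base pair must be positive")
    return length_angstrom / rise


# Saddle lengths quoted in the text.
saddle_lengths = {"PI-SceI": 40.0, "I-CreI": 70.0}
for protein, length in saddle_lengths.items():
    print(f"{protein}: about {base_pairs_spanned(length):.0f} bp")
```

Under this assumption the 40-Å saddle spans roughly 12 base pairs and the 70-Å saddle roughly 21, in line with the estimates given above; DNA bending on binding would change these figures.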
The topology of the secondary structure elements of the two PI-SceI subdomains is similar to that of the I-CreI homodimer except for the presence of two additional helices (α7 and α8) at the C-terminal end of I-CreI (Fig. 1d). Of the 200 residues that are topologically similar between the PI-SceI endonuclease domain and the I-CreI homodimer, 102 Cα atoms overlap with a root mean square deviation of 2 Å and a sequence similarity of 27%. Despite the topological similarity, the extent of overlap of the I-CreI homodimer with the two subdomains differs considerably; of the 102 overlapped residues, 67 lie in the C-subdomain and 35 in the N-subdomain. Interestingly, the number of residues superimposed on the C-subdomain is similar to that in the overlap between the two PI-SceI subdomains (63 residues with a root mean square deviation of 1.7 Å). This supports the idea that the endonuclease domain of PI-SceI evolved by tandem duplication of a common ancestor of the I-CreI monomer and of one PI-SceI subdomain. Once the gene fusion occurred in PI-SceI, each subdomain could evolve independently to optimize substrate specificity. The structural comparison suggests that the N-subdomain diverged further from the evolutionary ancestor than the C-subdomain because it is far less similar to the I-CreI subunit. The ability of the two subdomains to evolve at different rates, which is not possible in a homodimeric protein, may have permitted PI-SceI to recognize an asymmetric substrate.
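The root mean square figures quoted above measure the average displacement between matched Cα atoms after superposition. A minimal sketch of that final calculation follows; it assumes the two coordinate sets are already superimposed and paired one-to-one (the optimal fitting step is not shown), and the coordinates and function name are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (angstroms); each pair is displaced by 2 angstroms.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 2.0, 0.0), (3.8, 0.0, 2.0), (7.6, 2.0, 0.0)]

value = rmsd(set_a, set_b)
print(f"RMSD = {value:.2f} angstroms")
```

Because the measure is an average over the chosen atom pairs, the value depends on which residues are included, which is why residues outside the overlapped secondary structure elements can deviate by more than the overall figure.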
Symmetry Relationships in the I-CreI and PI-SceI Active Sites-The overlap between the two structures extends to the conserved Asp residues at the C-terminal end of the LAGLIDADG repeats that comprise part of the active site. In our alignment, Asp-218 and Asp-326 of PI-SceI superimpose with Asp-20 and Asp-20′ in the two monomers of the I-CreI structure with a root mean square deviation of less than 1 Å between both pairs of Cα atoms (Fig. 1, a-c). Substitution of any of these acidic residues with asparagine eliminates cleavage activity (8, 17). Taken together, these observations strongly suggest that these Asp residues play the same functional role in the endonuclease activity of both proteins, which may be to chelate the essential Mg²⁺ ion co-factor(s). However, the coordination of the cation is unknown in both structures.
A new finding that results from the structural comparison is that Lys-98 and Lys-98′ of the two I-CreI monomers superimpose on Lys-301 of the PI-SceI N-subdomain and the pseudosymmetrically related Lys-403 in the C-subdomain (Fig. 1). The symmetry relationship of Lys-301 and Lys-403 has not been previously reported. The closest distance between the pair of lysines in PI-SceI is about 15 Å, which is very similar to the 14-Å distance that separates the identical pair in the I-CreI homodimer. The Lys residues are located outside the overlapped secondary structural elements used to perform the structural comparison and, thus, deviate in their Cα position (2.4 Å between Lys-301 and Lys-98 and 3.5 Å between Lys-403 and Lys-98′). However, all their side chains point from the same direction to the well overlapped active-site Asp residues (Fig. 1).
Site-directed Mutagenesis of Lys-403-To elucidate the role of Lys-403 in the PI-SceI reaction pathway, variant PI-SceI proteins containing alanine and arginine substitutions were expressed, and purified proteins were assayed for DNA cleavage and binding activities. As the purification properties of the K403A and K403R proteins were indistinguishable from those of wild-type PI-SceI, gross changes in conformation of either mutant protein are unlikely. In kinetic experiments performed in the presence of Mg²⁺, the K403A protein is more than 50-fold less active than wild-type PI-SceI, whereas the K403R protein is about 5-fold less active (Table I). Amino acid substitutions at Lys-301 and Lys-403 do not have identical effects: the K301A variant is inactive, whereas the K403A protein is partially active (Ref. 9 and Table I). However, for both the Lys-301 and Lys-403 mutant proteins, an increase in the level of cleavage activity is observed when magnesium is replaced with manganese in the cleavage buffer, a change that is known to increase the enzymatic activity of wild-type PI-SceI (Table I). It is also evident from our data that the K301R and K403R variants are more active than the respective K301A and K403A proteins, suggesting that the positive charge at these positions is critical for activity.
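The fold reductions cited above are ratios of the wild-type cleavage rate to the mutant rate. The sketch below shows that arithmetic only; the rate values are hypothetical placeholders chosen to fall in the ranges described, not the measurements in Table I, and the function name is an assumption.

```python
def fold_reduction(rate_wild_type, rate_mutant):
    """Ratio of wild-type to mutant rate; undefined for an inactive mutant."""
    if rate_wild_type <= 0 or rate_mutant <= 0:
        raise ValueError("rates must be positive to compute a fold reduction")
    return rate_wild_type / rate_mutant


# Hypothetical single-cycle rates in per-minute units (illustrative only).
hypothetical_rates = {"wild-type": 0.50, "K403A": 0.008, "K403R": 0.10}

reductions = {
    name: fold_reduction(hypothetical_rates["wild-type"], rate)
    for name, rate in hypothetical_rates.items()
    if name != "wild-type"
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}-fold less active than wild type")
```

An inactive variant such as K301A has no measurable rate, so only a lower bound on its fold reduction can be stated, set by the detection limit of the assay.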
Gel mobility shift experiments indicate that the large decrease in cleavage activity of the K403A protein is not due to reduced DNA binding. Fig. 2 shows that the upper and lower protein-DNA complexes that are generated by wild-type PI-SceI are also apparent for the K403A and K403R proteins, and the measured level of binding is not markedly different from that for wild-type PI-SceI (M. Crist and F. S. Gimble, unpublished results). The ratio of upper to lower complex is slightly higher for the Lys-403 mutant proteins relative to wild-type PI-SceI. The large reduction in catalytic activity of the Lys-403 variants coupled with their near wild-type binding behavior is consistent with a role for Lys-403 in catalysis. In contrast, no conclusions about the role of Lys-301 in catalysis can be drawn because nonconservative substitutions at Lys-301 eliminate substrate binding to region I. Fig. 2 shows that, as reported previously, no upper complex is produced by the K301A protein, and the K301R variant yields an upper complex that migrates faster than that of wild-type PI-SceI (9). The two lysines (Lys-98 and Lys-98′) in I-CreI that superimpose on Lys-301 and Lys-403 occur in two identical monomer subunits and necessarily have identical functions. Consistent with the finding that Lys-301 and Lys-403 are critical for DNA cleavage activity, substitution of Lys-98 and Lys-98′ of I-CreI with glutamine inactivates the protein in a genetic system in vivo (17). However, no biochemical studies have been performed, and the basis of the phenotype is unknown (17). Moreover, because I-CreI is a homodimer, both lysines are necessarily altered in the mutant, and analysis of proteins with single substitutions at these positions will require engineering differentially tagged monomers as has been done for EcoRV (18).
It cannot be concluded that Lys-301 and Lys-403 of PI-SceI have identical functions given that proteins with mutations at these positions behave differently. In PI-SceI, where the two monomer subunits are fused, the two lysines occur in different local environments and may have acquired somewhat different functions during evolution. If indeed Lys-301 and Lys-403 have identical functions, the different binding behaviors of the K301A and K403A proteins could reflect differing abilities of the protein to accommodate the substituted amino acids at each position. Lys-403 is located in the middle of the large loop connecting α9 of domain II with β23 of domain I, whereas Lys-301 is located at the end of β18, which is part of the two-stranded short β sheet connecting the two subdomains of domain II. Because a β strand is less flexible than the loop, the arginine side chain in the K301R mutant may be unable to adopt the position required for full catalytic function, unlike that in the K403R mutant.
A critical role in the homing endonuclease reaction pathway for two symmetrically related positively charged residues would be reflected by their conservation among the LAGLIDADG protein family. Indeed, conserved lysine residues analogous to Lys-301 have been reported in Block D of inteins (5, 6), and an alignment based on a Hidden Markov model indicates that conserved lysines and arginines occur at or near this position in virtually all LAGLIDADG proteins (19). By contrast, none of the alignments specifically identified Lys-403 of PI-SceI as a conserved residue. The Hidden Markov model alignment did identify a conserved lysine in some of the homing endonucleases only a few residues removed from Lys-403, and we presume that this residue plays an analogous role in these enzymes (19). To illustrate the relative positions of the symmetrically related acidic and basic residues at the putative active site, an alignment of PI-SceI, the related yeast Ho (F-SceII) endonuclease, and I-CreI is shown in Fig. 3. Conservation of Lys-403 may not have been detected in sequence alignments because this residue occurs in a loop whose sequence may have diverged significantly.
Architecture of the Endonucleolytic Active Site-Does PI-SceI use two active sites to effect double strand cleavage or a single site that acts sequentially? Our finding of two symmetrically related lysines is consistent with a two-active-site model. In this scenario, the two active sites would be composed of Lys-301 and Asp-218 and of Lys-403 and Asp-326, respectively. The two aspartic acids either act together to bind a single metal ion that is shared by both active sites or bind two metal ions, one for each active site, in conjunction with other acidic residues, the polypeptide backbone carbonyl oxygens, or phosphate oxygens. According to this model, each I-CreI monomer subunit contains a single active site, as has been suggested previously (13, 17). If PI-SceI contains two independent active sites, it might be expected that they could be uncoupled by mutation to produce a nicking activity. However, no nicking activity is apparent for the K301A and K403A mutant proteins.² Furthermore, no nicking occurs when either Asp-218 or Asp-326 is substituted with alanine (8), although this result might be expected if these residues function to bind metal ion(s) required by both sites. These observations suggest that either the two putative sites are tightly coupled and cannot be separated (11) or there is only a single active site in PI-SceI that cuts both strands.
The architecture of the PI-SceI and I-CreI active sites can be compared with that of the two active sites in homodimeric restriction endonucleases, each of which contains at least two acidic residues that chelate a metal ion and a lysine whose function is unclear (20). Substitution of the acidic residues in restriction endonucleases that bind a catalytic metal ion eliminates activity (20), and similar results are observed when Asp-218 and Asp-326 substitutions are made in PI-SceI (8). Whether the conserved lysine residues in both families of enzymes are functionally analogous is less certain. Lys-403 of PI-SceI may be involved in catalysis because mutations at this position reduce cleavage activity but do not affect DNA binding. However, the residue is unlikely to play an essential role in the catalytic reaction mechanism, such as generating the activated water molecule, because a larger reduction in activity would have been expected for the mutant enzymes, but it may still function in other capacities, such as helping to neutralize accumulated negative charge on the presumed pentavalent transition state or binding to the cleavage products. Similarly, an alanine substitution at the conserved Lys-92 of EcoRV reduces cleavage activity (21). A key difference between EcoRV and PI-SceI is that arginine substitution at Lys-403 of PI-SceI yields a protein with near wild-type activity, but a K92R mutant of EcoRV is inactive (22). Thus, additional mechanistic information for both the restriction enzymes and the homing endonucleases will be required to determine whether the lysine residues are functionally analogous.
² D. Hu and F. S. Gimble, unpublished results.
FIG. 3. Alignment of amino acid sequences from selected regions of PI-SceI, Ho, and I-CreI endonucleases. The two acidic residues, Asp-218 and Asp-326, and two basic residues, Lys-301 and Lys-403, that are presumed to comprise the PI-SceI active site(s) and the analogous residues in Ho and I-CreI are indicated by reverse lettering. Sequences from I-CreI have been manually aligned with blocks C, D, and E identified in the PI-SceI and Ho proteins (6). Identification of Lys-98 in I-CreI and Lys-417 in Ho as analogues of Lys-403 in PI-SceI was accomplished by structural and sequence alignments, respectively. The position in the protein of the last amino acid in each block is indicated to the right of the block.
Besides the conserved acidic and basic residues discussed above, I-CreI residues Gln-47, Arg-51, and Arg-70 and their symmetric partners have also been implicated in the catalytic pathway by mutagenesis experiments (17). Similar studies of the related homodimeric endonuclease I-CeuI identified Gln-93 and Gln-93′, which are homologues of Gln-47 and Gln-47′ in I-CreI, as critical amino acids (23). In PI-SceI, Asp-229 occupies a similar position in an analogous β structure as Gln-47, but the side chains of the two residues do not overlap because the carboxyl group of Asp-229, unlike the Gln-47 amide, points away from the active site (9). We suggested previously that Asp-229 may be involved in catalysis because a D229A mutant protein is reduced >500-fold in cleavage activity but is only slightly reduced in substrate binding (9). Moreover, there is evidence that an acidic residue is conserved at or near this position (19). It is possible that Asp-229 functions to coordinate a metal ion in conjunction with Asp-218 as part of one active site, but if this is true, rotation of the Asp-229 side chain and additional conformational changes would be required when the protein binds to the substrate. PI-SceI residue Thr-341 is related to Asp-229 by pseudosymmetry and overlaps Gln-47′ in I-CreI. A T341A mutant protein does not form an upper complex in gel mobility shift experiments, and this reduced binding may account for the ~11-fold reduction in catalytic activity of the protein relative to wild-type PI-SceI.³ However, it is unlikely that Thr-341 is involved in binding the metal co-factor in a second active site because a larger decrease in activity would be expected for the mutant protein. The side chains of PI-SceI residues Arg-231 and His-343 are close to, but do not overlap, those of Arg-51 and Arg-51′ in I-CreI. R231A and H343A mutant proteins are 14- and 5-fold reduced in activity, respectively, and the H343A protein displays a modest reduction in binding (9).
Whether these residues are functional analogues of I-CreI residues Arg-51 and Arg-51′ is unclear. Finally, Arg-70 in one I-CreI monomer is located in a position similar to Asp-254 or Arg-255 of PI-SceI, and Arg-70′ in the other monomer closely overlaps with Glu-366 in PI-SceI. However, these PI-SceI residues are unlikely to be Arg-70 analogues because alanine substitutions at these positions do not adversely affect PI-SceI activity. Taken together, these data indicate that in some cases PI-SceI and I-CreI residues that occur at analogous positions may serve analogous functions in the cleavage reaction. A structure of a homing endonuclease bound to its substrate will help support this conclusion.