Structural and Mutational Studies of a Hyperthermophilic Intein from DNA Polymerase II of Pyrococcus abyssi

Background: The PolII intein from the hyperthermophilic Pyrococcus abyssi only splices at very high temperature. Results: NMR structure, dynamics, and mutagenesis of Pab PolII intein have been characterized. Conclusion: The Pab PolII intein has unique structural and dynamic features that may contribute to its higher temperature for optimal activity. Significance: Pab PolII intein is an ideal candidate for protein engineering.

Protein splicing is a precise self-catalyzed process in which an intein excises itself from a precursor with the concomitant ligation of the flanking polypeptides (exteins). Protein splicing proceeds through a four-step reaction but the catalytic mechanism is not fully understood at the atomic level. We report the solution NMR structures of the hyperthermophilic Pyrococcus abyssi PolII intein, which has a noncanonical C-terminal glutamine instead of an asparagine. The NMR structures were determined to a backbone root mean square deviation of 0.46 Å and a heavy atom root mean square deviation of 0.93 Å. The Pab PolII intein has a common HINT (hedgehog intein) fold but contains an extra β-hairpin that is unique in the structures of thermophilic inteins. The NMR structures also show that the Pab PolII intein has a long and disordered loop in place of an endonuclease domain. The N-terminal Cys-1 amide is hydrogen bonded to the Thr-90 hydroxyl in the conserved block B TXXH motif and the Cys-1 thiol forms a hydrogen bond with the block F Ser-166. Mutating Thr-90 to Ala dramatically slows N-terminal cleavage, supporting its pivotal role in promoting the N-S acyl shift. Mutagenesis also showed that Thr-90 and His-93 are synergistic in catalyzing the N-S acyl shift. The block F Ser-166 plays an important role in coordinating the steps of protein splicing. NMR spin relaxation indicates that the Pab PolII intein is significantly more rigid than mesophilic inteins, which may contribute to the higher optimal temperature for protein splicing.


Protein splicing is a self-catalyzed post-translational process in which an in-frame protein fusion, called an intein, is excised from the precursor protein with the concomitant ligation of the two flanking exteins, the N- and C-exteins (Fig. 1A) (1). More than 600 inteins have been found in all three domains of life: bacteria, archaea, and eukarya (2). Applications of protein splicing include protein engineering, labeling, purification, and control of protein function (3-7). Inteins can also serve as a novel drug target in bacteria that rely on protein splicing for their survival, such as Mycobacterium tuberculosis (8).
Protein splicing is a strictly intramolecular reaction, requiring no cofactors or energy input. The precursor protein acts as both the enzyme and the substrate of the reaction. Therefore, the precursor for protein splicing is the equivalent of a traditional enzyme-substrate complex and can provide key structural information for understanding the mechanism of protein splicing. However, because protein splicing is spontaneous, native precursors are unstable. Consequently, structural studies of the protein splicing mechanism have relied on spliced inteins or intein precursors with mutations at active site residues. The Pyrococcus abyssi DNA polymerase II intein, abbreviated as the Pab PolII intein, is found in a hyperthermophilic organism that lives near deep sea thermal vents, with an optimal growth temperature of 96°C (9). This suggests that one could isolate a stable precursor with the native intein sequence for structural studies.
There are four steps of protein splicing in canonical inteins (Fig. 1A): step 1, N-X acyl shift (X = S or O); step 2, transesterification and the formation of a branched intermediate; step 3, asparagine cyclization coupled with C-terminal cleavage; and step 4, X-N acyl shift and succinimide hydrolysis (10,11). These steps are catalyzed by active site residues in conserved blocks in intein sequences. Block A contains a conserved cysteine or serine at position 1 that serves as a nucleophile for the N-X acyl shift, the first step of protein splicing. Block B has the TXXH motif important for the N-X acyl shift (12-14). Block F includes a conserved aspartate (Fig. 1A, step 3) that has been proposed to play a pivotal role in coordinating splicing and a conserved histidine that modulates C-terminal cleavage (15-20). In block G, there is the penultimate histidine (relative to the intein C terminus) at position 6 and a C-terminal asparagine at position 7, both critical for C-terminal cleavage. The first residue of the C-extein, usually a cysteine or serine and termed C+1 or S+1, respectively, serves as the nucleophile for transesterification. Variations of the four-step scheme have been discovered among a number of inteins (21-23). The C-terminal residue of inteins is typically an asparagine, whose cyclization leads to the cleavage of the intein from the C-extein. The Pab PolII intein (23) is only the second intein demonstrated to splice with a C-terminal glutamine, after the Chilo iridescent virus (CIV) RNR intein (21,22). In contrast to the CIV RNR intein, which has reduced splicing activity upon the mutation of C-terminal glutamine to asparagine, the same mutation enhances protein splicing by 3-fold in the Pab PolII intein (23).
In this paper, we carried out structural and mutagenesis studies of the Pab PolII intein using solution NMR and in vitro splicing assays. For the structural studies, we employed a wild type intein without any extein residues. The intein contains 185 residues and is categorized as a mini-intein as it lacks a homing endonuclease domain. Our investigation showed that the usual insertion site for the endonuclease domain is replaced by an extended loop, which may be an ideal site for protein engineering. The structure also shows that the threonine side chain in the block B TXXH motif forms a hydrogen bond with the Cys-1 amide nitrogen. Block F Ser-166, the equivalent of the conserved block F aspartate in inteins, is in close contact with the Cys-1 side chain. Our mutagenesis of the Thr-90 in the block B TXXH motif shows that it is crucial for promoting the N-S acyl shift. Mutagenesis of block F Ser-166 reveals that it has a coordination role in protein splicing. Finally, we used NMR spin relaxation to show that the Pab PolII intein has many unique properties in protein dynamics.

EXPERIMENTAL PROCEDURES
Protein Overexpression, Purification, and NMR Sample Preparation-The Pab PolII intein gene cloned into pETM-44 vector ppC1Q185 expresses a fusion protein with an N-terminal (His)6 tag and maltose-binding protein (MBP). There is a single proline between the (His)6 tag and MBP and a linker sequence TPGSLEVLKQGPM between MBP and the intein. Isotopically labeled ([U-15N], [U-13C; U-15N], and [~70%-2H; U-13C; U-15N]) proteins were obtained by transforming the plasmid into Escherichia coli strain BL21(DE3) and overexpressing the fusion protein in M9 medium. The M9 cultures were incubated at 37°C until A600 reached 0.3-0.4 and were induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside at 20°C for an additional 16 h. Cell lysate was purified by nickel-nitrilotriacetic acid affinity chromatography to obtain the fusion protein. The Pab PolII intein was cleaved from the fusion protein with 100 mM dithiothreitol (DTT) at 60°C for 6 h. Affinity chromatography was utilized again to trap the (His)6-tagged MBP and uncleaved fusion protein, whereas the Pab PolII intein was not retained by the nickel column. Flow-through fractions were pooled and exchanged into buffer containing 20 mM sodium phosphate, 0.5 mM EDTA, and 0.05 mM sodium azide in 90% H2O, 10% D2O or 99.9% D2O at pH 6.5. The final concentrations of the NMR samples were between 0.5 and 2.0 mM.
NMR Spectroscopy-Both NMR structural and dynamics studies were carried out at 47°C, where the Pab PolII intein can mediate protein splicing efficiently (23). All spectra were acquired on a Bruker Avance II 800 MHz or Bruker Avance II 600 MHz (1H) spectrometer, each equipped with a cryogenic probe. Spectra were processed with nmrPipe software (24) and analyzed using Sparky (T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco, CA). Resonance assignment was carried out using the following experiments: two-dimensional 15N,1H-heteronuclear single quantum coherence, three-dimensional HNCACO, three-dimensional HNCO, three-dimensional HNCACB, three-dimensional HN(CO)CACB, three-dimensional 15[...] three-dimensional H(CC)(CO)NH-TOCSY, three-dimensional HC(C)H-TOCSY (τm = 15 ms), three-dimensional 15N-NOESY (τm = 100 ms), and three-dimensional 13C-NOESY (τm = 105 ms). The 1H chemical shifts were referenced relative to 4,4-dimethyl-4-silapentane-1-sulfonic acid and the 15N and 13C chemical shifts were referenced indirectly using frequency ratios between 15N, 13C, and 1H (15N/1H = 0.101329118, 13C/1H = 0.251449530) (25). The chemical shifts have been deposited in the BioMagResBank under accession number 17418. For residual dipolar coupling measurements with IPAP (26), the Pab PolII intein was aligned in 7.5% polyacrylamide gels with a stretch ratio (dO/dN) of 1.29 using the apparatus described by Chou et al. (27).

15N Relaxation Rates and Analysis-All relaxation experiments were carried out at 47°C on a Bruker Avance II 600 MHz spectrometer equipped with a triple-resonance cryogenic probe. Relaxation properties were characterized by 15N R1, R2, and heteronuclear steady-state NOE experiments. These relaxation parameters are sensitive to motions occurring on time scales faster than protein tumbling, on the order of picoseconds to nanoseconds.
R1, R2, and NOE experiments were performed using the pulse sequence described by Farrow et al. (28). NMR spectra were acquired with 2048 (t2) × 256 (t1) complex data points, spectral widths of 7560 Hz in 1H and 2798 Hz in 15N, and 16 scans. The recycle delay was 3.0 s. R1 relaxation delays of 10, 100, 200, 300, 400 (×2), 500, 600, 700, 800, and 900 ms were used. R2 relaxation delays of 2, 16 (×2), 30, 44, 58, 72, 86 (×2), and 100 ms were used. {1H}-15N steady-state heteronuclear NOEs were obtained by interleaving the experiments with and without proton saturation at each t1 point. The recycle delay was 7.5 s and proton saturation was achieved by applying a 120° proton pulse at a 5-ms delay. For each R1 and R2 experiment, the spectrum with the shortest relaxation delay (highest intensities) was peak picked with Sparky. Fitting for relaxation rates and error estimates were accomplished using the program CURVEFIT (A. G. Palmer, Columbia University, New York). The heteronuclear NOE values were obtained from the ratio of the peak heights in the 1H-saturated and unsaturated spectra, and the measurement was carried out in triplicate. The uncertainties in the NOEs were set to twice the standard deviation from the three trials (29). For the Pab PolII intein, we found 104 residues with peaks sufficiently well resolved in the 15N,1H-heteronuclear single quantum coherence spectrum to warrant quantitative relaxation analysis on a per-residue basis. 15N NMR relaxation data were analyzed using a mean amide nitrogen-hydrogen bond length rNH of 1.02 Å and a chemical shift anisotropy Δσ = σ∥ − σ⊥ of −172 ppm for the backbone 15N nucleus. The amplitudes and time scales of the internal motions of the protein were determined from the relaxation data according to the model-free formalism pioneered by Lipari and Szabo (30,31) and extended by Clore et al. (32), using the program Modelfree (version 4.15, A. G. Palmer, Columbia University) in combination with Fast Modelfree (33). The generalized order parameters (S²) obtained by Modelfree describe the amplitude of the internal motion of individual amide bonds on the picosecond to nanosecond time scale.
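The rate extraction described above can be illustrated with a minimal sketch. It is not the CURVEFIT procedure: it assumes a mono-exponential decay I(t) = I0·exp(−Rt) and substitutes a simple log-linear least-squares fit for CURVEFIT's fitting and error analysis. The function names, the synthetic peak heights, the hypothetical rate of 12 s^-1, and the use of the sample standard deviation for the NOE uncertainty are all assumptions made for illustration; only the R2 delays are taken from the text.

```python
import math
import statistics


def fit_decay_rate(delays_s, intensities):
    """Fit ln(I) = ln(I0) - R*t by ordinary least squares; return (R, I0)."""
    logs = [math.log(value) for value in intensities]
    mean_t = statistics.fmean(delays_s)
    mean_y = statistics.fmean(logs)
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_t)


def noe_with_uncertainty(saturated, unsaturated):
    """Return the mean NOE ratio and twice the sample standard deviation."""
    ratios = [s / u for s, u in zip(saturated, unsaturated)]
    return statistics.fmean(ratios), 2.0 * statistics.stdev(ratios)


# R2 delays listed in the text (duplicated points included), in seconds.
r2_delays = [0.002, 0.016, 0.016, 0.030, 0.044, 0.058, 0.072, 0.086, 0.086, 0.100]
# Synthetic, noise-free peak heights for a hypothetical residue with R2 = 12 s^-1.
peak_heights = [1000.0 * math.exp(-12.0 * t) for t in r2_delays]
r2, i0 = fit_decay_rate(r2_delays, peak_heights)

# Hypothetical peak heights from three repeats of the NOE experiment.
noe, noe_err = noe_with_uncertainty([80.0, 82.0, 84.0], [100.0, 100.0, 100.0])
print(f"R2 = {r2:.2f} s^-1, NOE = {noe:.2f} +/- {noe_err:.2f}")
```

With real, noisy data a weighted nonlinear fit is generally preferable, because taking logarithms distorts the noise at long delays where the signal is weak.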
NMR Structure Determination-Peak lists were generated from a three-dimensional 15N-NOESY spectrum in 90% H2O and two three-dimensional 13C-NOESY spectra in 100% D2O and 90% H2O, respectively, all recorded at 800 MHz 1H frequency. The peak lists and the chemical shifts from the resonance assignment (34) were used as input for CYANA3.0 (35). Dihedral angle restraints were derived from TALOS+ (36, 37) using chemical shifts of 15N, 13C′, 13Cα, 13Cβ, Hα, and HN. The final set of unambiguous NOE assignments contained 3555 meaningful distance restraints, corresponding to ~19 restraints/residue on average. The structures from CYANA with the lowest target function values obtained from cycle 7 were subjected to refinement with residual dipolar couplings in explicit water in Xplor-NIH (38). The quality of the final structures was assessed with PSVS (39). The atomic coordinates of the bundle of 20 conformers (accession number 2LCJ) have been deposited in the Brookhaven Protein Data Bank. Model 1 represents the conformer that is closest to the mean coordinates.
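As an illustration of how a conformer "closest to the mean coordinates" can be picked from a bundle, the sketch below averages the coordinates over all models and ranks each model by its root mean square deviation from that average. It assumes the conformers have already been superimposed and share the same atom order; the function names and the three-model, two-atom toy bundle are hypothetical and are not taken from the deposited structure.

```python
import math


def mean_coordinates(ensemble):
    """Average each atom's (x, y, z) over all models in the bundle."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[atom][axis] for model in ensemble) / n_models
              for axis in range(3))
        for atom in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered atom lists."""
    squared = sum(
        (a[axis] - b[axis]) ** 2
        for a, b in zip(coords_a, coords_b)
        for axis in range(3)
    )
    return math.sqrt(squared / len(coords_a))


def closest_to_mean(ensemble):
    """Return the index of the model nearest the mean, plus all deviations."""
    mean = mean_coordinates(ensemble)
    deviations = [rmsd(model, mean) for model in ensemble]
    best = min(range(len(ensemble)), key=deviations.__getitem__)
    return best, deviations


# Hypothetical, pre-superimposed toy bundle: 3 models x 2 atoms (in Å).
toy_bundle = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.3, 0.0), (1.0, 0.3, 0.0)],
    [(0.0, -0.1, 0.0), (1.0, -0.1, 0.0)],
]
best_index, deviations = closest_to_mean(toy_bundle)
print(best_index, [round(d, 3) for d in deviations])
```

The same deviations, averaged over the bundle, give the kind of ensemble root mean square deviation quoted for the backbone and heavy atoms, provided the atom selection and superposition match those used for the reported values.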
Precursor Purification, Mutagenesis, and Splicing Assays-Plasmid pMIH expresses the protein MIH, previously described in Ref. 23, an in-frame fusion of E. coli MBP, the seven C-terminal residues of the native Pab PolII N-extein, the 185 residues of the intein, the five N-terminal residues of the native C-extein, and a His tag. To study the effect of mutations on the first step of splicing, we introduced mutations of C+1A and Q185A to prevent steps two and three of splicing, respectively. To study the third step of splicing, we replaced the N-extein (M) with a short seven-residue polypeptide (N) and made a mutation of C1A. We overexpressed the proteins in E. coli BL21(DE3) and purified them using HisLink Protein Purification Resin (Promega, Madison, WI). The purified proteins were exchanged into buffer A (100 mM phosphate buffer, pH 7.0, with 500 mM NaCl) using 3 kDa MWCO centrifugal filters (Millipore, Billerica, MA). For the N-terminal cleavage assay, the proteins were incubated at 60°C for the times noted in Fig. 5, B and C, in a 16- or 20-μl reaction mixture of buffer A with 2.0 μg of purified protein and supplemented with 5.0 mM EDTA, 2.0 mM tris(2-carboxyethyl)phosphine, and 100 mM DTT. For splicing or C-terminal cleavage assays, the same reaction conditions were employed, but DTT was omitted. Reactions were stopped by the addition of SDS Blue Loading Buffer (New England Biolabs, Ipswich, MA) and analyzed by SDS-PAGE using 4-20% gradient Tris glycine gels (Lonza, Rockland, ME). For Western blot analysis, gels were blotted onto PVDF. The membranes were blocked using 1% BSA in buffer W (PBS and 0.1% Tween 20) and incubated with a 1:5000 dilution of HisDetector Nickel-AP in Detector Block Solution (KPL

RESULTS AND DISCUSSION
NMR Structure of the Pab PolII Intein-The solution NMR structure of the Pab PolII intein is based on distance constraints, H-bond constraints, and local and long-range angular constraints, derived from NOESY, hydrogen-deuterium exchange, chemical shift analysis using TALOS+, and residual dipolar coupling measurements, respectively (Table 1). On average, 19.5 constraints were obtained for each residue. In contrast to most inteins, the Pab PolII intein does not contain an endonuclease domain, but rather an extended loop of 26 residues (121-146) in the equivalent position in sequence (Fig. 3, highlighted in yellow). This Pab PolII-specific loop has few observed long-range NOEs and shows 15N relaxation rates characteristic of disorder and flexibility (see below). In the ensemble of 20 conformers from the Xplor-NIH calculation (38), the backbone and heavy atom root mean square deviations are 0.46 ± 0.10 and 0.93 ± 0.12 Å, respectively. These values exclude the extended loop and are for residues 1-120 and 147-185 (Table 1; Fig. 3). All the residues fall into the allowed regions in the Ramachandran plot (40), with 82% in the most favored region (Table 1). Numerous structural quality factors show that the Pab PolII intein structure is of better quality than the average NMR structure in the PDB (Table 1) (39).
A Unique β-Hairpin in Thermophilic Inteins (HTH)-A structure-based sequence alignment of Pab PolII with homologous proteins is shown in Fig. 3 in the order of descending DALI score from top to bottom (42). These inteins and the hedgehog protein have DALI Z-scores higher than 9.5, indicating structural homology despite their low sequence identity. The closest homologs to the Pab PolII intein are two other archaeal inteins, the Thermococcus kodakaraensis Pol-2 intein (Tko Pol-2; Z-score 18.5; PDB codes 2CW7 and 2CW8) (43) [...] Fig. 4, A-D, respectively. In addition to the HINT domain, the Tko Pol-2 intein contains the endonuclease domain, domains III and IV (Fig. 4A).

Structure-based sequence alignment also shows that the four thermophilic archaeal inteins, Pab PolII, Tko Pol-2, Pfu R1R1-1, and Mja KlbA, have insertions of ~18 residues between positions 29 and 46, relative to other inteins. These insertions form a β-hairpin, composed of the β4 and β5 strands that are connected by a short loop. This β-hairpin is colored blue in Fig. 4. Among inteins with known three-dimensional structures, such a β-hairpin is present only in thermophilic inteins, such as the Tko Pol-2 intein (Fig. 4A) (43), Mja KlbA intein (Fig. 4B), and Pfu R1R1-1 intein (44) (not shown). This β-hairpin is missing in the structures of mesophilic inteins, such as the Mtu RecA intein (18) (Fig. 4D). We therefore propose to name it the HTH. We speculate that the HTH could enhance the stability of thermophilic inteins by extending the β-sheet by two strands and may contribute to the higher optimal splicing temperature of thermophilic inteins. Interestingly, the mesophilic Mxe GyrA intein (12) contains a short segment that corresponds to β5 in Pab PolII but is missing the equivalent of β4 (Fig. 4C).
Comparison of the Pab PolII intein with three other archaeal inteins shows that they all contain an α-helix in positions corresponding to residues 21-27 in the Pab PolII intein. This helix is longer in the Pab PolII intein than in the HINT domain of other inteins, such as Ssp DnaB (16) [...]-185 (Fig. 5A). Both side chains of Thr-90 and His-93 are close to Cys-1. The Thr-90 hydroxyl is well positioned to serve as a hydrogen bond donor to the Cys-1 amide nitrogen (Fig. 5A). Ser-166 in the Pab PolII intein occupies the position equivalent to that of the conserved block F aspartate, which plays a coordinating role in protein splicing (18,20). The Ser-166 side chain forms a hydrogen bond with the Cys-1 thiol, with a distance of 3.8 Å between its Oγ and the sulfur atom of Cys-1. This hydrogen bond is similar to the one observed between the block F aspartate and Cys-1 in the Mtu RecA intein (20).

FIGURE 5. Intein active site and mutational studies of conserved block B and block F residues. A, Cys-1 forms a hydrogen bond with both the block B Thr-90 and block F Ser-166; B, influence of the block B TXXH motif on DTT-dependent N-terminal cleavage at 60°C as assayed by SDS-PAGE. A Q185A/C+1A double mutant was used to eliminate C-terminal cleavage and splicing; C-F, influence of the block F serine on N-terminal cleavage (C), C-terminal cleavage (D and E), and protein splicing (F), all measured at 60°C. Panels C-E were assayed by SDS-PAGE. In panel F, the results of protein splicing were assayed by both SDS-PAGE (left) and Western blot (right), using HisDetector Nickel-AP reagent to detect the C-terminal His tag. MIH, MBP-Intein-His tag. MBP serves as the N-extein and is released from the fusion protein upon N-terminal cleavage. MBP and His tag are joined together upon protein splicing to form MH. NIH, short N-extein-Intein-His tag, which is used for C-terminal cleavage to increase the relative Mr difference upon C-terminal cleavage.
In the NMR structure of the Pab PolII intein, the C-terminal Gln-185 is not well defined. Gln-185 was not observed in the 15N,1H-heteronuclear single quantum coherence spectrum, so few structural constraints were available for it. It is likely that Gln-185 experiences microsecond to millisecond time scale motion, which results in the broadening and disappearance of its NMR signal.
We have constructed a series of mutants of block B TXXH and block F Ser to test their role in the catalytic mechanism of the Pab PolII intein. To detect splicing or N-terminal cleavage, we used an MIH construct, with M, I, and H representing the N-extein (MBP plus seven native extein residues), intein (Pab PolII intein), and C-extein (five native extein residues plus a His tag), respectively. To study the isolated C-terminal cleavage, we used an NIH construct where the MBP was deleted from the expression context and the N-extein consists of just seven residues.
To study the role of the TXXH motif (residues 90-93) in the N-S acyl shift, we constructed a Q185A/C+1A double mutant so that the precursor can only undergo N-terminal cleavage to yield M and IH. We incubated the protein in a neutral buffer at 60°C with a saturating concentration of DTT such that N-terminal cleavage approximates the rate and extent of thioester formation.
The effects of T89A, T90A, D92A, H93A, and a T90A/H93A double mutant on N-terminal cleavage are shown in Fig. 5B. With the wild type TXXH motif (in the Q185A/C+1A double mutant background), there are visible bands of both M and IH after 1 h of incubation with DTT, with a concomitant decrease in precursor MIH intensity, indicating the occurrence of the N-S acyl shift and DTT-mediated N-terminal cleavage. The band intensity of the cleavage products continues to increase with time, whereas the intensity of the precursor continues to decrease. By 16 h, there is little precursor remaining. Mutating the nonconserved Thr-89 and Asp-92 to Ala slightly increases the rate of N-terminal cleavage. Mutation of either Thr-90 or His-93 to Ala dramatically slows the rate of N-terminal cleavage; substantial precursor remains even after 16 h. The T90A/H93A double mutant abolishes N-terminal cleavage, with no visible band of either M or IH after 16 h. Thus, Thr-90 is as important as His-93 in catalyzing the N-S acyl shift in the Pab PolII intein. The biochemical role of Thr-90 is supported by the structure. The catalytic contribution of Thr-90 may be due to a hydrogen bond between the Thr-90 hydroxyl oxygen atom and the Cys-1 amide nitrogen, which are separated by 2.9 Å. This hydrogen bond may stabilize the negatively charged oxythiazolidine intermediate. The Cys-1 carbonyl is within 2.9 Å of the Thr-90 hydroxyl and 3.2 Å of the Thr-90 amide nitrogen; these hydrogen bonds may serve to properly orient Cys-1 in the active site to facilitate the nucleophilic attack of the Cys-1 thiol. Alternatively, Thr-90 may adopt different conformations in the unspliced precursor due to the presence of exteins. Thr-90 may form a hydrogen bond with the −1 carboxyl, which has been observed in the Mja KlbA (45) and Mxe GyrA inteins (12). This hydrogen bond could stabilize the negatively charged carboxyl oxygen of the oxythiazolidine anion.
The imidazole ε-nitrogen of His-93 is within 3.5 Å of the Cys-1 thiol in three of the 20 NMR conformers, so the Cys-1 thiol may be deprotonated by the block B histidine to initiate the N-S acyl shift. In eight of the 20 NMR conformers, the imidazole ε-nitrogen of His-93 is within 3.5 Å of the Cys-1 amide nitrogen, so the block B histidine may protonate the leaving group (Cys-1 amide) during the N-S acyl shift. These structural observations are consistent with the proposed dual catalytic role for the block B histidine in the N-S acyl shift (52).
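Counts such as "three of the 20 conformers" come from measuring one interatomic distance in every model of the bundle and comparing it with a cutoff. A minimal sketch of that bookkeeping follows; the function name and the three coordinate pairs are hypothetical placeholders, not values from the deposited structure, and the inclusive treatment of the 3.5 Å cutoff is an assumption.

```python
import math


def count_within_cutoff(atom_pairs, cutoff=3.5):
    """Count models whose two atoms lie within `cutoff` Å of each other."""
    return sum(1 for atom_a, atom_b in atom_pairs
               if math.dist(atom_a, atom_b) <= cutoff)


# One (atom_a, atom_b) coordinate pair per conformer, in Å (toy values).
toy_pairs = [
    ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)),  # 3.0 Å apart
    ((0.0, 0.0, 0.0), (0.0, 4.0, 0.0)),  # 4.0 Å apart
    ((1.0, 1.0, 1.0), (1.0, 1.0, 4.4)),  # 3.4 Å apart
]
n_close = count_within_cutoff(toy_pairs)
print(f"{n_close} of {len(toy_pairs)} conformers within 3.5 Å")
```

In practice the pairs would be extracted per model from the coordinate file, for example the His-93 ε-nitrogen and the Cys-1 sulfur.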
We also studied the influence of mutations at Ser-166 on both N- and C-terminal cleavage. The equivalent residue of Ser-166 in other inteins, such as Asp-422 in the Mtu RecA intein, plays a coordination role in protein splicing (18-20, 45). In Fig. 5C, we observe that the rate of isolated N-terminal cleavage is reduced with mutation of Ser-166 to Gly, Asp, or Ala, but remains unchanged with mutation to Thr. This suggests that the hydrogen bond between the Ser-166 hydroxyl and the Cys-1 thiol, which are separated by 3.8 Å (Fig. 5A), promotes the first step of splicing. Ser-166 also plays a role in regulating C-terminal cleavage. In Fig. 5D, we note that mutation of Ser-166 to Gly greatly accelerates the rate of C-terminal cleavage due to Asn cyclization, and mutation of Ser-166 to Asp slows the rate dramatically. Similar results are observed with the native C-terminal Gln in Fig. 5E. These results are analogous to the Asp-422 to Gly mutation ("the C-terminal cleavage mutation") in the Mtu RecA intein, which promotes C-terminal cleavage at the expense of splicing (17).
The effects on protein splicing of Ser-166 mutants within the context of a Gln-185 to Asn mutation are shown in Fig. 5F. The identities of the major bands were also confirmed by MALDI-TOF mass spectrometry and Western blot. Mutating Ser-166 to the more frequently observed Asp results in some splicing, but mostly N-terminal cleavage uncoupled from splicing. Mutating Ser-166 to Gly abolishes protein splicing and yields exclusively C-terminal cleavage product, as also observed in the D422G cleavage mutant of the Mtu RecA intein (17). Mutation of Ser-166 to Ala or Thr promotes mostly uncoupled N- and C-terminal cleavage, along with some splicing. The block F Ser is therefore vital to coupling the reactions at the N- and C-terminal splice junctions of the intein.

15N Spin Relaxation-Both conformational changes and protein dynamics are important in intein catalysis (15,41). We measured the 15N R1, R2, and heteronuclear NOE and performed model-free analysis using Fast Modelfree (33) to obtain the order parameters.
As shown in Fig. 6, most residues exhibit high order parameters, indicating that the majority of the protein, including the N and C termini, is well ordered. The residues of the active site, such as Thr-90, His-93, and Ser-166, show no observable differences in relaxation from the rest of the protein. Residues [...] picosecond to nanosecond motions, consistent with the lack of long-range NOE cross-peaks and increased disorder in the NMR structure (Fig. 2A). There is also increased mobility in the loop connecting strands β12a and β12b, with residues 159 and 161 showing decreased order parameters. The average order parameter in the HTH (0.92 ± 0.02) is similar to that of the rest of the protein (0.93 ± 0.02, excluding residues from the extended loop 121-146). One plausible explanation is that the HTH stabilizes the entire Pab PolII intein, rather than the hairpin alone.
The Pab PolII and Mtu RecA inteins (53) have a very similar pattern of order parameters for structurally conserved residues, with average order parameters of 0.93 (excluding residues from the extended loop 121-146) and 0.89, respectively. Both inteins are more rigid than typical proteins, where the average order parameter is ~0.85 for structured regions (α-helices and β-strands) (54). The Pab PolII intein is even more rigid than the Mtu RecA intein. Structure-based comparison of order parameters, which compares paired residues in the Pab PolII intein and the Mtu RecA intein using the structure-based alignment (Fig. 3), shows that the Pab PolII intein has ~4% higher order parameters on average. In addition, relaxation experiments were carried out at 47°C for the Pab PolII intein, whereas they were measured at a lower temperature of 25°C for the Mtu RecA intein. Because lowering the temperature increases rigidity, the Pab PolII intein is expected to be much more rigid than the Mtu RecA intein when both are compared at 25°C. In contrast, similar order parameters were found in E. coli and Thermus thermophilus ribonuclease H (55). This enhanced rigidity may prevent the Pab PolII intein from sampling active conformations important for protein splicing at low temperatures.
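A paired, structure-based comparison of this kind can be sketched as follows. The exact averaging used for the ~4% figure is not stated here, so the sketch assumes it is the mean of per-pair relative differences; the function name and the three order-parameter pairs are hypothetical and chosen only to give a difference of similar size.

```python
def mean_percent_difference(paired_s2):
    """Average of (a - b) / b over aligned residue pairs, in percent."""
    differences = [(s2_a - s2_b) / s2_b * 100.0 for s2_a, s2_b in paired_s2]
    return sum(differences) / len(differences)


# Hypothetical (first intein S², second intein S²) values for residues
# paired through a structure-based alignment.
aligned_pairs = [(0.94, 0.90), (0.92, 0.88), (0.93, 0.90)]
percent_higher = mean_percent_difference(aligned_pairs)
print(f"first intein is {percent_higher:.1f}% higher on average")
```

Pairing residues through the alignment, rather than comparing global averages, keeps flexible loops that exist in only one protein from biasing the result.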
Both the Pab PolII and Mtu RecA inteins show rigid termini, which seems to be a common feature among inteins (53). In contrast, the termini in other proteins generally have order parameters ~20-30% lower than the rest of the protein (56). This unusual rigidity was also observed for terminal residues of the intein domain in Mja KlbA (45) and Npu DnaE (57), but not for terminal residues in the N- or C-exteins.
The extended loop in the Pab PolII intein, characterized by disorder and mobility, could provide a site for protein engineering, such as creating an artificially split intein or controlling protein splicing and function with small molecules. Artificially split inteins derived from the Pab PolII intein might require an elevated temperature for optimal splicing, which could extend the temperature range for applications of intein trans-splicing. Inteins have also been engineered to respond to small molecules to control protein function (58). A ligand binding domain could be inserted into the disordered loop of the Pab PolII intein so that the intein can be engineered, through directed evolution, to respond to pH (59), light (60), temperature (61), or small molecules (62).