Interaction Studies and Alanine Scanning Analysis of a Semi-synthetic Split Intein Reveal Thiazoline Ring Formation from an Intermediate of the Protein Splicing Reaction*

We recently reported an artificially split intein based on the Ssp DnaB mini-intein that consists of a synthetic N-terminal intein fragment (IntN) and a recombinant C-terminal part (IntC), which are 11 and 143 amino acids in length, respectively. This intein holds great promise for the preparation of semi-synthetic proteins by protein trans-splicing. In this work we synthesized a set of IntN peptide variants to investigate their structure-function relationship with regard to fragment association and promotion of protein trans-splicing. A further truncation of the IntN sequence below 11 amino acids resulted in loss of activity, whereas C-terminal extensions were tolerated. Alanine scanning analysis identified three essential hydrophobic residues, whereas substitutions at other positions were tolerated. We developed assays to monitor association of IntN with an IntC mutant blocked in protein splicing by native PAGE and fluorescence anisotropy. The kinetic parameters of intein complex formation were Kd = 1.1 μM, kon = 16.8 M–1 s–1, and koff = 1.8 × 10–5 s–1 for the native IntN11 sequence. Intriguingly, a G(–1)A substitution, previously known to significantly impair protein splicing, was revealed to result in thiazoline ring formation involving the catalytic Cys-1, likely by aberrant dehydration of an oxythiazolidine intermediate. This finding provides experimental evidence for the postulated intermediate during the initial N/S acyl shift and underlines the delicate spatial and temporal alignment required in the intein active site to prevent side reactions of the protein-splicing pathway.

remarkable post-translational protein backbone rearrangement is catalyzed by the intein in a spontaneous, autocatalytic reaction without the need for any additional factors or energy sources. Protein splicing is initiated by the attack of the nucleophilic side chain of a Cys or Ser residue at the first position of the intein on its upstream peptide bond to generate a thioester or oxoester intermediate, respectively. In the second step of the pathway, the Ex N acyl group is transferred by a trans(thio)esterification onto the side chain of a Cys, Ser, or Thr at the first position of the Ex C . The side chain amide group of an Asn residue at the ultimate position of the intein then attacks its own carbonyl group to form a succinimide and thereby cleaves the peptide bond between the intein and the Ex C . In the final reaction, the thio- or oxoester between the liberated Ex N -Ex C quickly rearranges to the thermodynamically more stable peptide bond. This canonical mechanism of protein splicing as well as deviations from it found in certain inteins have been extensively reviewed (1-4). Inteins are useful as self-cleavable tags for protein purification strategies (5). They have also found many applications in the fields of protein chemistry and protein semi-synthesis, as they provide a route to obtain recombinant protein thioesters and proteins with an N-terminal cysteine residue, both important for chemical ligation reactions like native chemical ligation (NCL) and expressed protein ligation (6-8). These methods take advantage of the protein ester intermediates generated during protein splicing, which can be cleaved by thiolysis or hydrolysis, and of the finding that inteins can in principle be inserted into a heterologous protein context, i.e. they are quite tolerant toward foreign extein sequences (5-7, 9). Other, more recently emerging avenues for protein engineering rely on protein trans-splicing.
This reaction is performed by split inteins, which are found in a few native examples (10,11) or can be created artificially from a regular intein (12-19). Protein trans-splicing enables the post-translational linkage of proteins or polypeptides originating from two separate molecules. The position in a loop region that is amenable to splitting the intein domain was indicated by the insertion of a homing endonuclease domain at this position in many native maxi-inteins. Typically, the corresponding N- and C-terminal intein fragments (Int N and Int C ) are about 100-150 and 36-50 amino acids (aa) in size. We have shown recently that an Int N fragment of 11 aa and an Int C fragment of 143 aa of the Ssp DnaB mini-intein reconstitute spontaneously to the active intein in vitro (15). This finding has enabled the preparation of an Ex N -Int N polypeptide by chemical peptide synthesis. Protein trans-splicing with a recombinantly produced Int C -Ex C fusion protein can then be used to covalently link a synthetic part, e.g. a spectroscopic probe, post-translational modification, or unnatural amino acid, to the N-terminal end of a protein of interest (see Fig. 1) (15). This approach is, thus, a chemo-enzymatic alternative to chemical ligation strategies to prepare semi-synthetic proteins. In addition, it bears the potential for cellular applications due to the specificity mediated by the intein fragment.
In this work we have further analyzed the interaction between the synthetic Int N and the corresponding Int C fragment of the Ssp DnaB intein. We determined sequence requirements within the Int N part by minimization of the peptide length and by alanine scanning analysis. Using fluorescence anisotropy measurements, the dissociation constants between the Int N and Int C fragments were calculated. These results lay the foundation for a further exploration of this intein, for example for the preparation of semisynthetic proteins or for detailed investigations of the protein splicing mechanism. Finally and surprisingly, using one of the peptide variants and a mutant of Int C that is blocked for protein splicing, we have observed the conversion of the catalytic Cys-1 residue and the upstream carbonyl group to a thiazoline ring, which was likely formed by water elimination of an oxythiazolidine anion. The initial N/S acyl shift is believed to proceed through this intermediate (2,20). Our results represent the first experimental evidence for the occurrence of the postulated 5-membered cyclic intermediate. The aberrant side-reaction to the thiazoline underlines once more the intricate spatial and temporal fine-tuning in the intein active site required to accomplish the correct protein splicing reaction.

MATERIALS AND METHODS
General Techniques and Materials-Unless otherwise specified, standard protocols were used. Kanamycin was used at a concentration of 50 μg/ml. Synthetic oligonucleotides were obtained from Qiagen-Operon. All plasmids were verified by DNA sequencing. Reagents were purchased from Acros, Sigma-Aldrich, Novabiochem, or Roth, and restriction enzymes and markers were obtained from Fermentas. All reactions and assays were performed at least in duplicate.
Construction of Expression Plasmids-For a detailed description of construction of expression plasmid pCL20 for protein 1, see Ludwig et al. (15). Plasmids pMP06 encoding construct 2 (N154A, S+1A) as well as pPJ03 encoding construct 3 (N154A, S+1A, H73A) were obtained by site-directed mutagenesis of pCL20 following the QuikChange protocol from Stratagene.
Expression and Purification of Proteins-For protein expression, Escherichia coli BL21 (DE3) was transformed with the respective plasmids. Cells were grown in LB medium supplemented with the appropriate antibiotic at 37°C to absorbance (600 nm) = 0.6-0.8. Before the addition of isopropyl 1-thio-β-D-galactopyranoside (0.4 mM) the temperature was lowered to 30°C. After 3-5 h the cells were harvested by centrifugation, resuspended in nickel-nitrilotriacetic acid buffer A (50 mM Tris, 100 mM NaCl, pH 8.0), and lysed by two passages through a high pressure homogenizer (Avestin EmulsiFlex C5). All proteins carried a C-terminal His6 tag and remained in the soluble fraction after cell lysis and centrifugation at 30,000 × g. Affinity chromatography was performed using nickel-nitrilotriacetic acid (Ni2+-NTA) superflow material (Qiagen), and proteins were eluted with Ni2+-NTA buffer B (50 mM Tris, 100 mM NaCl, 250 mM imidazole, pH 8.0). For subsequent anion exchange (AE) chromatography, the buffer was changed to AE-buffer A (20 mM Tris, 2 mM dithiothreitol (DTT), pH 8.0) by dialysis. Protein solutions were applied onto a HiTrap Q anion-exchange column (1 ml bed volume, Amersham Biosciences) and eluted with AE-buffer B (20 mM Tris, 2 mM DTT, 500 mM NaCl, pH 8.0). Pooled fractions were dialyzed against assay buffer (50 mM Tris, 300 mM NaCl, 2 mM DTT, 1 mM EDTA, pH 7.0) additionally containing 20% glycerol (to increase protein stability at 30°C for fluorescence anisotropy measurements), shock-frozen, and stored at −80°C. Protein concentrations were determined using the calculated molar extinction coefficients at 280 nm.
Analysis of the Splicing Reaction by SDS-PAGE-Splicing reactions were performed essentially as previously described (15). In brief, peptide and protein were mixed at the indicated concentrations in assay buffer and incubated at 25°C. At various time points aliquots of the reaction mixture were removed, quenched by the addition of 4× SDS loading buffer (containing 8% SDS and 20% β-mercaptoethanol), and boiled before being applied onto an SDS gel. Gels were stained with Coomassie Brilliant Blue, and relative intensities of protein bands were determined densitometrically using the program Scion Image. For UV illumination, the SDS gels were photographed under UV light before staining. For the kinetic investigation of pep3 with 1 (42 μM), pseudo first-order reaction conditions were established with a 5-, 7.5-, and 10-fold excess of peptide. The second-order rate constants ksplicing and kC-cleavage were determined from a plot of k′ (first-order rate constant) versus protein concentration by linear regression (slope of the fit).
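The regression step described above can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function name `least_squares_line`, the rate constant `k_assumed`, and the generated k′ values are all hypothetical, and only the 42 μM protein concentration and the 5-, 7.5-, and 10-fold excesses are taken from the text.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Concentrations (in M) on the x axis: 5-, 7.5-, and 10-fold of 42 uM.
base_conc = 42e-6
concentrations = [5 * base_conc, 7.5 * base_conc, 10 * base_conc]

# Hypothetical pseudo first-order rate constants (s^-1), generated from an
# assumed second-order constant purely to show that the slope recovers it.
k_assumed = 1.0  # M^-1 s^-1, illustrative value only
k_prime_values = [k_assumed * c for c in concentrations]

# The slope of k' versus concentration is the second-order rate constant.
k_second_order, intercept = least_squares_line(concentrations, k_prime_values)
print(f"slope = {k_second_order:.3f} M^-1 s^-1")
```

With real data the intercept is a useful sanity check: a value far from zero suggests that the pseudo first-order assumption does not hold across the chosen concentration range.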
Analysis of the Complex Formation by Native and SDS-PAGE-These assays with constructs 2 or 3 were carried out similarly to the protein splicing assays. However, for analysis on native PAGE gels, the aliquots of the reaction mixtures were quenched with sample buffer free of SDS (5% final concentration of β-mercaptoethanol). Protein samples were not boiled before loading on the gel.
MALDI-TOF MS Analysis-All peptide synthesis products as well as splicing reaction mixtures were analyzed by MALDI-TOF MS measurements using an Autoflex 2 (Bruker Daltonics) in the positive ion detection mode. Calibration of peptides was performed using the Peptide Standard I (Bruker Daltonics), whereas proteins were calibrated with Protein Calibration Standard I (Bruker Daltonics).
Fluorescence Anisotropy Spectroscopy-Fluorescence anisotropy was measured with a SafireII spectrofluorometer (Tecan) using 384-well plates (OptiPlate-384 F, flat-bottom, black; PerkinElmer Life Sciences) and a final reaction volume of 50 μl per well. Measurements were taken every 60 s at 30°C applying an excitation wavelength of 470 nm and an emission wavelength of 542 nm. To prevent solvent evaporation, all reaction wells were sealed with a self-adhesive transparent plastic foil (Biacore) directly after mixing the reaction components and before starting the measurement. Control experiments using non-binding pep6 as well as non-binding control proteins (thioredoxin (Trx) and bovine serum albumin) confirmed that differences in total fluorescence intensity of the free and bound peptide were negligible (data not shown). The background fluorescence anisotropy of a blank sample (1 μM peptide in assay buffer) was subtracted from final fluorescence anisotropy values. From a set of experiments measured for each peptide with various protein concentrations, the Kd was calculated by plotting the maximal fluorescence anisotropy value against the protein concentration. These data could be fitted using Equation 1 (21), where A = fluorescence anisotropy, A0 = anisotropy of the free peptide, Amax = maximal saturated anisotropy, [P]0 = protein concentration at time point 0 s, and [Pep]0 = peptide concentration at time point 0 s (= 1 μM).
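As an illustration of how such a fit can work, the sketch below implements the standard quadratic 1:1 binding isotherm with ligand depletion, A = A0 + (Amax − A0) × (([P]0 + [Pep]0 + Kd) − sqrt(([P]0 + [Pep]0 + Kd)^2 − 4 [P]0 [Pep]0)) / (2 [Pep]0). That this is the exact form of Equation 1 is an assumption; it is simply the usual expression built from the symbols defined above. The anisotropy values `a0` and `a_max`, the protein concentrations, and the grid-search helper `fit_kd` are hypothetical.

```python
import math

def anisotropy(p0, pep0, kd, a0, a_max):
    """Predicted anisotropy for 1:1 binding (all concentrations in the same unit)."""
    s = p0 + pep0 + kd
    # Fraction of peptide bound, from the quadratic solution for the complex.
    bound_fraction = (s - math.sqrt(s * s - 4.0 * p0 * pep0)) / (2.0 * pep0)
    return a0 + (a_max - a0) * bound_fraction

def fit_kd(p0_values, observed, pep0, a0, a_max, kd_grid):
    """Return the Kd on kd_grid that minimizes the sum of squared residuals."""
    def sse(kd):
        return sum((anisotropy(p, pep0, kd, a0, a_max) - obs) ** 2
                   for p, obs in zip(p0_values, observed))
    return min(kd_grid, key=sse)

# Hypothetical titration in uM: 1 uM peptide, protein from 0.05 to 30 uM.
pep0 = 1.0
a0, a_max = 0.02, 0.12                      # illustrative anisotropy limits
p0_values = [0.05, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0]
# Synthetic "measurements" generated with Kd = 1.1 uM (value from the text).
observed = [anisotropy(p, pep0, 1.1, a0, a_max) for p in p0_values]

kd_grid = [round(0.1 * i, 1) for i in range(1, 301)]   # 0.1 to 30.0 uM
kd_fit = fit_kd(p0_values, observed, pep0, a0, a_max, kd_grid)
print(f"fitted Kd = {kd_fit} uM")
```

A grid search is used here only to keep the sketch dependency-free; a nonlinear least-squares routine would normally be preferred and would also allow A0 and Amax to be fitted.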
The k′on values were determined by fitting the kinetic data obtained as described above from those assays with a sufficient excess of 3 (5 μM and more) to a first-order rate equation. From plotting k′on versus the protein concentration, the second-order rate constant kon (slope) was determined by linear regression.
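A first-order fit of an anisotropy time course can be sketched as below, assuming a single-exponential approach to the plateau, A(t) = A_end − (A_end − A_start) exp(−k′ t), which is linearized and fitted through the origin. The function name, the anisotropy limits, and the synthetic trace are hypothetical. The 60 s sampling interval and the 5 μM protein concentration come from the text, and the generating k′ is just kon × 5 μM with the reported kon.

```python
import math

def first_order_rate(times, signal, a_start, a_end):
    """Estimate k' from a single-exponential rise by a log-linear fit through the origin."""
    # ln((a_end - A(t)) / (a_end - a_start)) = -k' * t
    ys = [math.log((a_end - a) / (a_end - a_start)) for a in signal]
    return -sum(t * y for t, y in zip(times, ys)) / sum(t * t for t in times)

# Synthetic trace sampled every 60 s for one hour (illustrative only).
a_start, a_end = 0.02, 0.12
k_prime_true = 16.8 * 5e-6          # s^-1, i.e. kon * [protein] at 5 uM
times = [60.0 * i for i in range(1, 61)]
signal = [a_end - (a_end - a_start) * math.exp(-k_prime_true * t) for t in times]

k_prime_fit = first_order_rate(times, signal, a_start, a_end)
print(f"k' = {k_prime_fit:.2e} s^-1")
```

The log-linear form is sensitive to noise near the plateau, so with measured data a direct nonlinear fit of the exponential is usually more robust.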

RESULTS
Minimizing the Synthetic Int N Fragment of the Semi-synthetic Intein-Our semi-synthetic split intein consists of a recombinantly expressed Int C and a synthetically prepared Int N fragment, which are derived from the Ssp DnaB mini-intein (15). The Int C fragment in protein construct 1 corresponds to aa 12-154 of the DnaB mini-intein sequence. C-terminally fused Trx serves as a model Ex C sequence (see Fig. 1). The synthetic peptide comprises an Ex N sequence of 5 aa plus a 5,6-carboxyfluorescein moiety (Fl) as a fluorophore to allow for facile detection, followed by Int N comprising aa 1-11. In our first report of this system (15), two extra residues (Ser-Arg) were added to the C terminus of this peptide (pep1; see Table 1 for all peptides used in this study) to ensure comparability with an initially investigated recombinant Int N construct, in which they were present for cloning reasons. The serine in fact corresponds to aa 12 in the native sequence of the DnaB intein, whereas the arginine is unrelated.
In a first series of experiments we aimed to determine the true minimal peptide sequence of the Int N fragment. To this end, peptides pep2 to pep6 (see Table 1) were synthesized. Each peptide was then incubated with construct 1 at equimolar concentrations (50 μM each), and reaction progress was monitored by SDS-PAGE. Protein trans-splicing led to the semi-synthetic product 5 (Fl-(5 aa)-Trx, see Fig. 1). In addition, thioredoxin (10) itself is generated as a byproduct through C-terminal cleavage. Fig. 2, lane 4, shows that pep3, lacking both extra C-terminal residues, exhibited the same splicing activity as the longer variants pep1 (lane 2) and pep2 (lane 3). Protein splicing yields of up to 70% with about 20% of C-terminal cleavage were obtained. Thus, extra residues appended to the Int N sequence were not important for activity but could be tolerated without loss of activity. This also held true for various other aa at these positions (data not shown). A further C-terminal truncation of the peptide sequence in pep4 (Int N10 ) (Fig. 2, lane 5) resulted in a complete loss of protein trans-splicing activity; however, C-terminal cleavage was still observed. Finally, the shorter Int N9 as well as the Int N5 fragments in pep5 (lane 6) and pep6 (lane 7) were completely inactive. Attempts to restore activity in these cases with a correspondingly extended recombinant Int C -(6-154) counterpart were not successful (data not shown). In conclusion, an Int N fragment of 11 aa, as in pep3, which is referred to in the following as the reference peptide, represents the minimal sequence requirement for an intein active in protein trans-splicing.
A kinetic investigation of the reaction between pep3 and construct 1 was performed based on the densitometric analysis of protein band intensities using SDS-PAGE (see Fig. S1). By using a 10-fold excess of pep3 over construct 1, we assumed pseudo first-order conditions and calculated rate constants of k′splicing = 4.4 × 10−4 s−1 and k′C-cleavage = 0.9 × 10−4 s−1. With peptide excesses of 5-, 7.5-, and 10-fold (data not shown), we observed a linear concentration dependence of the rate constants. From these data the second-order rate constants ksplicing = 0.
FIGURE 2. ... (7), and C-terminal cleavage product (10) are indicated. Lane 1, negative control without peptide; lanes 2-7, reaction mixtures with pep1 to pep6, respectively. See Fig. 1 for calculated molecular weights of protein components (the asterisk denotes contaminant protein bands).
FIGURE 3. ... (5), liberated Int C fragment (7), and C-terminal cleavage product (10) are indicated. Lane 1, negative control without peptide; lanes 2-13, reaction mixtures with pep3 and pep7 to pep17, as indicated. See Fig. 1 for calculated molecular weights of protein components (the asterisk denotes contaminant protein bands).
Functional Analysis of Alanine Scanning Variants of the Int N11 Fragment-To evaluate the impact of each residue within the Int N11 segment on splicing activity, we performed an alanine scanning analysis. Additionally, the first aa of the Ex N sequence, Gly(−1), which is the natively occurring neighboring residue, was replaced with alanine. Thus, peptides pep7 to pep17 (Table 1) were synthesized and assayed as described above. The analysis of the reaction mixtures by SDS-PAGE is shown in Fig. 3. As expected, replacement of the catalytic Cys-1 with alanine in pep7 (lane 3) led to a complete loss of splicing activity, whereas C-terminal cleavage was not affected. The remaining peptides could roughly be categorized into four groups according to their activity profile. Group A comprises the reference peptide pep3 (lane 2) as well as all peptides with sequence variations that only slightly affected the splicing reaction. This group included all peptides with serine to alanine substitutions, i.e. pep9, pep12, and pep15. Group B represents all peptides with no detectable intein activity, neither protein splicing nor C-terminal cleavage. Notably, in all these cases a large hydrophobic amino acid was replaced (pep8, pep14, and pep16). Inspection of the crystal structure of the DnaB intein reveals that these three residues are part of the hydrophobic core of the protein (22). Group C comprises peptides that showed a considerable reduction in splicing activity and an increased C-terminal cleavage activity. In these cases, Gly-4, Asp-5, or Leu-7 were changed to alanine (pep10, pep11, and pep13). Finally, pep17 containing the G(−1)A substitution represents group D, because it exhibited a different behavior with a significant, but not complete, reduction of splicing activity and a partial reduction in the formation of the C-terminal cleavage product. This observation is in agreement with a previous report in which the same G(−1)A mutation almost completely blocked splicing in the context of the Ssp DnaB mini-intein (23).
Analysis of Intein Complex Formation by Native and SDS-PAGE-To monitor complex formation between the Int N and Int C fragments and to discern it from the subsequent splicing reaction, two mutant Int C -Trx proteins were constructed that are blocked in protein splicing but not in fragment association. Construct 2 has two mutations of essential catalytic residues (N154A and S+1A), which block the transesterification of the Ex N as well as the peptide bond-breaking step at the C-terminal splice junction. However, this construct is still active in triggering the initial N/S acyl shift to form the Ex N thioester (see Fig. 1). The second mutant, construct 3, therefore included an additional mutation (H73A), which also blocks the N/S acyl shift at the N-terminal splice junction (22, 24-26). We discovered a high stability of the noncovalent complexes between Int N and the Int C mutant constructs and, therefore, used native PAGE gels to investigate intein fragment association. As shown in lane 13 of Fig. 4A (panel I), construct 2 in the absence of an Int N peptide migrates as a discrete band. Upon preincubation with pep3 (Fig. 4A, lane 12), the protein exhibited higher electrophoretic mobility, suggesting that the formed Int N -Int C complex is more compact than the Int C protein alone. A similar complex formation indicated by a band shift was observed for most other peptides, except for those of group B (pep8, pep14, and pep16) and pep7 (C1A). Pep11 (D5A) induced only a minor shift, likely because it is the only peptide with a different charge. A striking observation was made when the native gel was illuminated under UV light to visualize the intein complexes by virtue of the fluorophore attached to the peptides (Fig. 4A, panel II). In all cases, except for pep17 (G(−1)A), the fluorescence signal was almost lost.
Analysis of the reaction mixtures by MALDI-TOF MS revealed in these cases the product Fl-Ex N (8), presumably resulting from hydrolysis of the Fl-Ex N -Int N thioester (data not shown). Although this is a known side-reaction of the protein-splicing pathway, the degree of susceptibility of the thioester to cleavage was remarkable. Omission of all potential nucleophiles (e.g. free thiols, glycerol) for this reaction had no effect. We, therefore, conclude that a water molecule must directly attack this bond, possibly from a privileged position that became vacant by the S(+1)A mutation. Interestingly, the resistance to cleavage in the case of the formed complex (4) with pep17 (G(−1)A) pointed to a different mechanism for this peptide (see below).
The triple mutant 3 served to suppress the N/S acyl shift and thereby to prevent thioester cleavage. Consistent with this idea, upon incubation with the peptides, fluorescent complexes (4) could be observed in a similar assay using native PAGE (see Fig. 4B, panels I and II). Group B peptides showed no or only traces of complex formation. The complex with pep11 (D5A) again exhibited a different migration behavior, likely due to the different charge. Pep17 (G(−1)A, lane 13) did not form a stable complex under these conditions for reasons we do not yet understand (we could measure complex formation in solution; see below).
All intein fragment complexes described here for mutant proteins 2 and 3 could also be observed using denaturing SDS-PAGE gels as long as the samples were not boiled in the SDS sample buffer (data not shown). In summary, these results point to a significant stability of the Int N -Int C complex and show that group B peptides are inactive in protein splicing because intein fragment association is disrupted. They also point to a different mechanism caused by the G(−1)A substitution in pep17.
Quantitative Analysis of the Intein Complex Formation by Fluorescence Anisotropy-For a quantitative analysis of the intein fragment association, we developed a new assay based on fluorescence anisotropy. The fluorescently labeled peptides were each incubated at a fixed concentration (1 μM) with varying concentrations of the triple mutant construct 3 (0.05-30 μM; in the case of group B peptides up to 75 μM). Binding of the low molecular weight peptide to the protein resulted in an increase in fluorescence anisotropy, which was monitored over a period of 22 h (see Fig. 5A). The data were fitted as shown in Fig. 5B using Equation 1 (described under "Materials and Methods") to determine the dissociation constants Kd. A Kd = 1.1 ± 0.2 μM was determined for the reference peptide pep3. Fig. 5C illustrates the dissociation constants for the peptides from the alanine scan analysis (see Table 1 for other peptides), which were either similar to that of the reference peptide or increased 12- to 27-fold for the group B peptides, in excellent agreement with the behavior observed for complex formation using the native PAGE gel shift assay shown in Fig. 4B.
Because the above results argued for a hydrophobic interaction between Int N and Int C , we also specifically investigated the possibility of a hydrophilic contribution to the binding. For this purpose, the binding assays were repeated at a low salt concentration (20 mM instead of 300 mM NaCl) for pep3 and for pep11 (D5A). These conditions should result in a decrease of Kd in the case of a hydrophilic interaction. Pep11 was included in this experiment because it is the only peptide with a substitution of a charged amino acid, and its slightly increased Kd value of 3.7 μM at 300 mM NaCl might point to the importance of this charge for binding (Table 1). However, for both peptides the Kd values obtained under low salt conditions were even slightly higher than those at the high salt concentration (see Fig. S2). Together, these findings rule out a significant contribution of an ionic interaction and argue for hydrophobic interactions as the driving force for complex formation.
For the determination of the rate constants kon and koff from the anisotropy data shown in Fig. 5A, we used an excess of protein 3 (at least 5-fold) over the peptides to avoid higher-order kinetic effects and applied a first-order rate equation to obtain pseudo first-order k′on values. A linear fit of these k′on values plotted against protein concentration yielded the second-order kon constants (see Fig. S3). The koff values were calculated using koff = Kd × kon (see Table 1). For example, in the case of the reference peptide pep3, we determined a kon = 16.8 ± 1.0 M−1 s−1 and calculated a koff = 1.8 × 10−5 s−1. In summary, this artificially split intein system is characterized by a low kon value, whereas an even lower koff value gives rise to a low Kd = 1.1 μM.
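The koff calculation for pep3 can be reproduced directly from the two reported values. The only step that needs care is the unit conversion of Kd from μM to M before multiplying by kon. The variable names are illustrative.

```python
# Values reported in the text for the reference peptide pep3.
kd = 1.1e-6        # dissociation constant in M (1.1 uM)
k_on = 16.8        # association rate constant in M^-1 s^-1

# koff = Kd * kon; the molar units cancel, leaving s^-1.
k_off = kd * k_on
print(f"k_off = {k_off:.1e} s^-1")
```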
Identification of a Trapped Intermediate of the Protein Splicing Reaction-The substitution of the first extein residue at the N-terminal splice junction in pep17 (G(−1)A) led in the above-described assays to unique observations compared with all other peptides. With this peptide, only trace amounts of protein splice product were observed upon incubation with construct 1 (Fig. 3, lane 13), although the binding constants for intein fragment association were very similar to those of the reference peptide (Table 1). Obviously, formation of the thioester in the initial N/S acyl shift was impaired, because pep17 was the only peptide that formed a fluorescent complex with the splice mutant 2 (Fig. 4A, panel II, lane 11). With all other peptides, the Ex N containing the fluorophore was cleaved off by hydrolysis of the thioester. We conceived two possible scenarios that could explain these findings: either the thioester hydrolysis in the complex of pep17 and protein 2 was suppressed because no attack of the cysteine side chain thiol on the peptide bond took place, or the resulting thioester was for some reason (e.g. shielding by the Ala side chain) less prone to hydrolysis. Attempts to cleave the potential thioester intermediate with high concentrations of free thiol failed, arguing against the latter hypothesis. A MALDI-MS analysis of the regular reaction mixture showed that the peptide was not cleaved into N-extein and Int N11 sequences, as expected. Surprisingly, however, an almost quantitative loss of 18 Da in the peptide pep17 was revealed ([M+H]+ calc = 1979.9 Da; [M+H]+ obs = 1962.0 Da, see Fig. 6B). Analysis of the reaction by reversed phase HPLC also showed that the signal of pep17 was shifted to slightly longer retention times after incubation with 2 (see Fig. S4). To test whether the chemical modification involved the catalytic Cys-1 residue, we incubated the peptide before and after the reaction with N-ethyl maleimide.
Only the peptide before the incubation with 2 reacted with this thiol-specific reagent (see Fig. S4). This finding indicated that the free thiol group of the Cys-1 side chain was modified. Based on these results, we postulated that an oxythiazolidine anion intermediate is formed after the nucleophilic attack of the cysteine thiol onto the carbonyl carbon atom of the scissile bond and that this intermediate is not converted into the thioester but instead into the thiazoline by water elimination (see Fig. 7). The additional methyl group introduced by the G(−1)A substitution might block progression of the pathway toward thioester formation by steric hindrance or distortion of the active site. Final proof for the suspected thiazoline ring formation involving Ala(−1) and Cys-1 was obtained by ESI-MS analysis of the reaction product of mutant 2 and pep17 as well as of unreacted pep17 itself for comparison. MS/MS and MS3 fragmentation patterns yielded the series of b-type fragment ions by which the loss of 18 Da was unequivocally localized to the Ala-Cys dipeptidyl unit (see Fig. S5). Finally, we collected evidence that the thiazoline ring formation also takes place in a fully recombinantly prepared intein (see Fig. S6). For this purpose we introduced the G(−1)A, N154S, and S(+1)A mutations in another variant of the Ssp split DnaB intein. This variant is split at the endonuclease insertion site and was previously reported by our group (12). In agreement with the observations made for the semi-synthetic intein split at position 11, the G(−1)A mutation impaired the splicing activity of this intein. We were not successful in retrieving unequivocal data from MS measurements to prove thiazoline ring formation, using either a tryptic digest or the intact proteins.
However, in the case of the G(−1)A mutation, no N-terminal thioester cleavage was observed, and the catalytic Cys-1 could not be chemically modified with 5-(iodoacetamido)fluorescein. In contrast, the protein with the native Gly at position −1 yielded a high degree of thioester hydrolysis, and the Cys-1 could be chemically labeled (see Fig. S6).
Taken together, these results show that both the semi-synthetic intein, which is the focus of this study, and a fully recombinant version of this intein are blocked by a G(−1)A substitution, which leads, in combination with mutation of the catalytic residues at the C-terminal splice junction, to formation of a thiazoline ring involving the catalytic Cys-1. This finding strongly supports the occurrence of the oxythiazolidine anion intermediate in the N/S acyl shift, which was previously postulated but has never been directly detected.
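The assignment of the mass shift to water elimination can be checked with simple arithmetic on the reported [M+H]+ values for pep17. In the sketch below, the monoisotopic mass of water is computed from standard atomic masses; the 0.5 Da tolerance is an arbitrary, illustrative choice and not a stated instrument specification.

```python
# Reported [M+H]+ values for pep17 (in Da), taken from the text.
mass_calculated = 1979.9   # unmodified peptide
mass_observed = 1962.0     # after incubation with construct 2

# Monoisotopic mass of H2O from standard atomic masses (2 x H + O).
WATER_MONOISOTOPIC = 2 * 1.007825 + 15.994915

delta = mass_calculated - mass_observed
deviation = abs(delta - WATER_MONOISOTOPIC)

# Illustrative tolerance only; real MALDI-TOF accuracy depends on calibration.
consistent_with_water_loss = deviation < 0.5
print(f"mass loss = {delta:.1f} Da, deviation from H2O = {deviation:.2f} Da")
```

A loss of about 18 Da alone does not identify where the water is eliminated, which is why the fragmentation data described above are needed to localize it to the Ala-Cys unit.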

DISCUSSION
Semi-synthetic protein trans-splicing is emerging as a new methodology with potential applications for protein semi-synthesis (9, 15, 27-30) and for mechanistic studies of the protein-splicing pathway. Specific advantages over other approaches like chemical ligation strategies are the inherent affinity and specific recognition of the intein fragments as well as the circumvention of the otherwise required chemical functional groups like a thioester, an azide, or a phosphine. These features simplify the synthesis of the synthetic extein-intein fragments, which could be carried out in a regular peptide facility using standard Fmoc solid phase peptide chemistry. They also offer the potential for protein semi-synthesis in complex mixtures like a cell extract (28) or even a live cell (9,27). The price one has to pay for this is the requirement to append the intein fragments to the actual peptide or protein sequences to be joined, i.e. longer peptides have to be synthesized. In this regard, we have recently introduced a new semi-synthetic intein in which the Int N fragment is only 11 aa long, thus minimizing the length of the synthetic peptide (15). With this split intein system, synthetic peptides can be joined to the N terminus of a recombinant target protein. Here we have reported the first detailed biochemical and biophysical analysis of this system. Our data concerning the minimal length of the Int N peptide, identification of the essential residues within the sequence, and determination of the kinetic parameters lay the foundation for a further exploitation of this approach toward different applications, including mechanistic studies of this intein. In the course of this study and using mutants blocked in protein splicing, we also identified a thiazoline ring formation that directly points to the previously postulated oxythiazolidine anion intermediate during the N/S acyl shift.
We have investigated the interaction between Int N and Int C with a reference peptide (pep3) comprising the wild-type Int N11 sequence and with peptides containing systematic alanine substitutions throughout this fragment. Three amino acids (Ile-2, Ile-8, and Leu-10) were essential for the reconstitution of an active intein. This finding can be explained by the participation of these residues in the formation of the hydrophobic core of the protein. We could also show that the interaction between the intein fragments is driven by hydrophobic forces. A recent study suggests that a suitable distribution of hydrophobic core amino acids between the two protein fragments is crucial for split protein reassembly (31). Furthermore, we identified several positions that did not affect protein trans-splicing when changed to alanine, including the three serine residues in the Int N fragment. Additionally, extra amino acids at the C terminus of Int N did not affect intein activity. This information will be useful in future studies and applications, e.g. for the attachment of auxiliary groups or the engineered improvement of splicing activity. Nagamune and co-workers (27) linked a trimethoprim moiety to the C terminus of Int N to facilitate high affinity binding to dihydrofolate reductase, which was fused to the N terminus of Int C . This arrangement enabled modification of a cell surface protein at Int N peptide concentrations of 2 μM.
FIGURE 7. Proposed mechanism of thiazoline ring formation. The initial reaction step in the protein splicing mechanism of inteins is an N/S or N/O acyl shift, which was supposed to proceed through an oxythiazolidine or oxyoxazolidine anion intermediate, respectively. Here, the Ssp DnaB intein, which employs a Cys at position 1, was investigated. Upon incubation of pep17 (G(–1)A substitution) with construct 2, the formation of a thiazoline ring was observed. This reaction pathway could branch from the oxythiazolidine anion intermediate by O-protonation instead of N-protonation and subsequent water elimination. Note that a small number of inteins employs an alanine at position 1. In these cases the scissile peptide bond at the N-terminal splice junction is believed to be directly attacked by the side chain of the +1 residue.

When the interaction between the Int N and Int C sequences is studied, one needs to bear in mind that they are artificially created protein fragments, which are likely to be largely unfolded. Therefore, the association between these two fragments must not be regarded as a protein-ligand interaction but rather as a reconstitution of a protein fold. Even though this reconstitution occurs spontaneously under native conditions, which is very advantageous for applications in protein semi-synthesis, one would expect that after initial contact of the two pieces, major structural changes must occur, possibly through more than one intermediate. This assumption of a complex reaction pathway is underlined by the observation that ksplicing is concentration-dependent even though the initial second-order complex formation step is 17-fold faster (ksplicing = 0.8 M⁻¹ s⁻¹; kon = 16.8 M⁻¹ s⁻¹). Also the very low kon value for intein fragment association likely reflects the required major structural changes. Further work will be necessary to better understand the folding processes and potential intermediates of active complex formation. The even lower koff = 1.8 × 10⁻⁵ s⁻¹ seems to explain our findings of a complex stable under PAGE conditions and gives rise to a decent dissociation constant Kd = 1.1 μM.
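As a quick consistency check on the constants quoted above, the dissociation constant of a one-step binding model is the ratio koff/kon. The following Python sketch assumes such a simple two-state model (the text itself argues that the real pathway is more complex), and its function names are hypothetical. It recomputes Kd from the reported rate constants and also gives the half-life of the complex implied by koff.

```python
import math

# Rate constants quoted in the text for the native Int N11 sequence.
K_ON = 16.8      # association rate constant, M^-1 s^-1
K_OFF = 1.8e-5   # dissociation rate constant, s^-1


def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return Kd in M for a one-step binding model (Kd = koff / kon)."""
    return k_off / k_on


def complex_half_life(k_off: float) -> float:
    """Return the half-life in s of a complex that decays with first-order rate koff."""
    return math.log(2) / k_off


kd_molar = dissociation_constant(K_ON, K_OFF)
half_life_s = complex_half_life(K_OFF)

print(f"Kd = {kd_molar * 1e6:.2f} uM")
print(f"complex half-life = {half_life_s / 3600:.1f} h")
```

Under these assumptions the ratio comes out at about 1.07 μM, in line with the rounded value of 1.1 μM, and the implied half-life of roughly 10.7 h fits a complex that survives electrophoresis.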
These findings can be compared with a kinetic investigation of the natively split Ssp DnaE intein, for which Muir and Shi (32) described a biphasic behavior, consistent with a very fast association rate in the range of 10⁷ M⁻¹ s⁻¹ and a second unimolecular process with a rate constant of 0.7 s⁻¹. Interestingly, the association of the Ssp DnaE Int N and Int C polypeptides strongly depends on the ionic strength of the buffer, indicating that ionic interactions largely account for the fast complex formation. In contrast to the intein fragments described here, the Ssp DnaE Int N and Int C fragments have undergone natural evolution for efficient protein trans-splicing.
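To give a feeling for the gap between the two association rate constants, one can estimate how long half-maximal binding would take under pseudo-first-order conditions. The Python sketch below is only an illustration: the 10 μM concentration of the fragment in excess is a hypothetical value not taken from either study, dissociation is neglected, the Ssp DnaE constant is an order-of-magnitude figure, and the function name is made up for this example.

```python
import math

# Association rate constants quoted in the text (M^-1 s^-1).
K_ON_DNAB_SPLIT = 16.8   # artificially split Ssp DnaB intein (this work)
K_ON_DNAE = 1e7          # natively split Ssp DnaE intein, order of magnitude only

# Hypothetical concentration of the fragment in excess; not a value from the study.
EXCESS_CONC_M = 10e-6


def association_half_time(k_on: float, excess_conc: float) -> float:
    """Half-time in s of binding under pseudo-first-order conditions.

    Assumes one fragment is in large excess and neglects dissociation.
    """
    return math.log(2) / (k_on * excess_conc)


t_dnab = association_half_time(K_ON_DNAB_SPLIT, EXCESS_CONC_M)
t_dnae = association_half_time(K_ON_DNAE, EXCESS_CONC_M)

print(f"Ssp DnaB split intein: {t_dnab / 60:.0f} min")
print(f"Ssp DnaE split intein: {t_dnae * 1e3:.1f} ms")
print(f"ratio of kon values: {K_ON_DNAE / K_ON_DNAB_SPLIT:.1e}")
```

With these assumed inputs the artificially split fragments need on the order of an hour to reach half-maximal association, whereas the naturally evolved pair would do so within milliseconds, a difference of nearly six orders of magnitude in kon.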
Another aspect of our study that should be highlighted is the short length of the Int N piece. Minimized to 11 aa, it represents to the best of our knowledge the smallest fragment of a split protein active in protein fragment complementation. Only a few other cases of split proteins are reported in the literature that exhibit spontaneous association of the complementary fragments and that are of comparable size. Examples include green fluorescent protein (33), β-galactosidase (34), chymotrypsin inhibitor 2 (35), and ribonuclease S (36). The smallest fragments in these cases are found for the C-terminal 15 aa of green fluorescent protein and the N-terminal 20 aa of ribonuclease A. The latter fragment is also known as S-peptide and is obtained by a partial digest of ribonuclease A with subtilisin. A Kd = 42 nM was reported for the S-peptide and the 104-aa S-protein (C-terminal fragment of ribonuclease A) by fluorescence anisotropy (37). Because of its length, the short 11-aa Int N has a restricted ability for specific contact formation and interaction surface. In the light of these considerations, the spontaneous association with the Int C fragment at concentrations in the low micromolar range and with yields of 50-90% is quite remarkable.
Our identification of thiazoline ring formation in the Int N fragment provides the first experimental support that the initial N/S acyl shift in the protein-splicing pathway proceeds through an oxythiazolidine anion intermediate. The thiazoline ring was observed when mutated intein fragments partially blocked for the downstream steps in protein splicing were incubated (Int N, G(–1)A; Int C, N154A and S(+1)A). Apparently, the Ala(–1) side chain introduces a steric obstacle or spatial distortion of the active site such that the oxythiazolidine anion intermediate cannot be resolved to the thioester but instead undergoes aberrant water elimination, ultimately giving rise to the thiazoline. Interestingly, a reaction similar to this latter pathway is employed in the thiazoline and oxazoline (often oxidized to thiazole and oxazole) formation from Xaa-Cys and Xaa-Ser units in ribosomally synthesized antimicrobial peptides like microcin B17 (38). Thus, our findings once more emphasize the delicate coordination in time and space required in the intein active site to accomplish formation of the correct splice product. Not only are the correct timing of the individual steps of the protein-splicing pathway and the suppression of nucleophilic cleavage reactions at the (thio-)ester intermediates crucial, but the aberrant dehydration of the oxythiazolidine (or oxyoxazolidine in the case of Ser) must also be prevented by optimized alignment of the reaction intermediates.