Protein trans-Splicing and Cyclization by a Naturally Split Intein from the dnaE Gene of Synechocystis Species PCC6803*

A naturally occurring split intein from the dnaE gene of Synechocystis sp. PCC6803 (Ssp DnaE intein) has been shown to mediate efficient in vivo and in vitro trans-splicing in a foreign protein context. A cis-splicing Ssp DnaE intein construct displayed splicing activity similar to the trans-splicing form, which suggests that the N- and C-terminal intein fragments have a high affinity interaction. An in vitro trans-splicing system was developed that used a bacterially expressed N-terminal fragment of the Ssp DnaE intein and either a bacterially expressed or chemically synthesized intein C-terminal fragment. Unlike artificially split inteins, the Ssp DnaE intein fragments could be reconstituted in vitro under native conditions to mediate splicing as well as peptide bond cleavage. This property allowed the development of an on-column trans-splicing system that permitted the facile separation of reactants and products. Furthermore, the trans-splicing activity of the Ssp DnaE intein was successfully applied to the cyclization of proteins in vivo. Also, the isolation of the unspliced precursor on chitin resin allowed the cyclization reaction to proceed in vitro. The Ssp DnaE intein thus represents a potentially important protein for in vivo and in vitro protein manipulation.

Protein splicing elements, termed inteins (1), catalyze their own excision from a primary translation product with the concomitant ligation of the flanking protein sequences (reviewed in Refs. 2-4). Inteins catalyze three highly coordinated reactions at the N- and C-terminal splice junctions (5, 6): 1) an acyl rearrangement at the N-terminal cysteine or serine; 2) a transesterification reaction between the two termini to form a branched ester or thioester intermediate; and 3) peptide bond cleavage coupled to cyclization of the intein C-terminal asparagine to free the intein. Inteins have been engineered to be versatile tools in protein purification (7-13), protein ligation (9, 10, 12, 14-18), and in the formation of cyclic proteins and peptides (11, 19, 20). However, the ligation and cyclization approaches were limited by the need to generate an N-terminal cysteine and/or C-terminal thioester intermediate in vitro.
In addition to inteins engineered to trans-splice (21-24), a naturally occurring split intein was recently identified in the dnaE gene encoding the catalytic subunit of DNA polymerase III of Synechocystis sp. PCC6803 (25). The N-terminal half of DnaE, followed by a 123-amino acid intein sequence, and the C-terminal half, preceded by a 36-amino acid intein sequence, are encoded by two open reading frames located more than 745 kilobases apart in the genome. When co-expressed in Escherichia coli, the two DnaE-intein fragments exhibited protein trans-splicing (25). In this report we have further investigated the cis- and trans-splicing activities of the Ssp DnaE intein in a foreign protein context. Furthermore, novel methods were developed that allow the on-column ligation of protein fragments as well as the in vivo and in vitro cyclization of proteins by intramolecular trans-splicing of the Ssp DnaE intein fragments.
The DNA sequence encoding the C-terminal 36 amino acid residues and the first 3 C-extein residues (5′-ATGGTTAAAGTTATCGGTCGTAGATCTCTGGGCGTGCAGCGCATCTTTGATATCGGTCTGCCGCAGGACCATAACTTTCTGCTAGCCAACGGCGCTATCGCTGCTAACTGCTTTAACAAATCC-3′) was inserted into pMEB2 to create pMEB3, which expresses a fusion protein (MEB) composed of MBP, the full-length Ssp DnaE intein (residues 1-159) with 5 native extein residues at its N terminus and 3 native residues at its C terminus, and the CBD. A translation termination codon was introduced into pMEB2 following the codon for Lys123 of the Ssp DnaE intein by insertion of a linker formed by annealing oligonucleotides 5′-AAATAAGGAGGTTAATAAAAGGAAGAGCCATGGCGCGCCTTAATTAAA-3′ and 5′-CCGGTTTAATTAAGGCGCGCCATGGCTCTTCCTTTTATTAACCTCCTTA-3′. The resulting plasmid, pMEB4, expresses a fusion protein composed of MBP and the N-terminal 123 residues of the Ssp DnaE intein (DnaE(N)). pKEB1 contains the kanamycin resistance gene and the p15a origin of replication from pACYC177 (28). It also expresses a fusion protein composed of the 36 C-terminal amino acids of the Ssp DnaE intein (DnaE(C)) followed by 3 native extein residues and the CBD. pBEL11 expresses a CBD-DnaE(C)-T4 DNA ligase fusion protein in the pBSL-C155 vector (10).
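As a quick sanity check on the insert sequence quoted above, the reading frame can be translated in silico. The following is a minimal sketch, assuming the standard genetic code and translation from the first base; the names `CODON_TABLE`, `DNAE_C_INSERT`, and `translate` are illustrative and are not part of the original methods.

```python
# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Insert sequence quoted in the text (5' to 3'); it is split into pieces
# here only for readability and joined into one string.
DNAE_C_INSERT = "".join([
    "ATGGTTAAAGTTATCGGTCGT",
    "AGATCTCTGGGCGTGCAGCGCATCTTTGATATCGGTCTGCCGCA",
    "GGACCATAACTTTCTGCTAGCCAACGGCGCTATCGCTGCTAACTG",
    "CTTTAACAAATCC",
])


def translate(dna):
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein = translate(DNAE_C_INSERT)
print(len(DNAE_C_INSERT), "nt ->", len(protein), "codons")
print("first 36 residues (intein C-terminal fragment):", protein[:36])
print("following residues:", protein[36:])
```

Under these assumptions, residue 36 of the translation is Asn and the residues that immediately follow begin Cys-Phe-Asn, which agrees with the C-terminal asparagine and the CFN extein residues described in the text.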
pMEB8 was generated by transferring the 0.6-kilobase XhoI to PstI fragment of pMEB3 into pMYB5 (NEB). Mutation of the extein residues in pMEB8 was performed by linker substitutions using the XhoI and KpnI sites flanking the N-terminal splice junction or the NheI and AgeI [...]

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Present address: Dept. of Biochemistry, University of Zurich, CH-8057 Zurich, Switzerland.
Ssp DnaE cis- and trans-Splicing in Vivo-E. coli ER2566 cells (7) bearing pMEB8 or two compatible plasmids, pMEB4 and pKEB1, were grown in LB medium containing the appropriate antibiotic selection at 37°C to an A600 of 0.5. Protein expression was induced by the addition of 0.3 mM isopropyl β-D-thiogalactopyranoside at either 15°C for 16 h or at 37°C for 2 h. Crude cell extracts were visualized by electrophoresis on a 12% Tris/glycine gel (Novex, San Diego, CA) followed by staining with Coomassie Brilliant Blue.
Protein Purification-ER2566 cells containing pMEB2 or pBEL11 were grown at 37°C to an A600 of 0.5. Following isopropyl β-D-thiogalactopyranoside (0.5 mM) induced protein expression overnight at 15°C, cells were harvested by centrifugation at 3,000 × g for 30 min. The MBP-DnaE(N)-CBD (ME(N)B) fusion protein was purified by amylose as described previously (9). The cell pellet was resuspended in Buffer A (20 mM Tris-HCl, pH 7.0, containing 500 mM NaCl) and lysed by sonication. After centrifugation at 23,000 × g for 30 min the supernatant was applied to a 15-ml amylose resin (NEB) equilibrated in Buffer A. The resin was washed with 10-15 column volumes of Buffer A. The fusion protein was eluted with Buffer B (20 mM Tris-HCl, pH 7.0, containing 500 mM NaCl and 10 mM maltose). Protein concentrations were determined using the Bio-Rad protein assay (Bio-Rad).
In Vitro trans-Splicing and/or Cleavage Assay-trans-Splicing and/or trans-cleavage studies of the Ssp DnaE intein were conducted in vitro using the purified ME(N)B and two 40-amino acid peptides, synthesized as described previously (9), consisting of the C-terminal 36 amino acids of the Ssp DnaE intein, with either an Asn (Splice-pep) or an Ala at residue 36 (Cleav-pep), and the next four naturally occurring amino acids (CFNK). The splicing peptide had a biotinylated lysine (K*) as the C-terminal residue. The reaction consisted of adding either the splicing or cleavage peptide (500 µM final concentration) to ME(N)B (1 mg/ml) in reaction buffer (100 mM Tris-HCl, pH 7.0, containing 500 mM NaCl) followed by incubation overnight at room temperature. The on-column trans-splicing used the CBD-DnaE(C)-T4 DNA ligase protein absorbed onto a chitin resin in which unbound protein was washed off with 20 column volumes of Buffer A. The ME(N)B fusion protein (9 µM), either free in solution or prebound to chitin beads, was then added to the chitin-bound CBD-DnaE(C)-T4 DNA ligase (3 µM). The reactions were then incubated for 16 h at the appropriate temperature in Buffer A and monitored by SDS-PAGE.

FIG. 1. Ssp DnaE intein cis- and trans-splicing constructs. The cis-splicing constructs, pMEB8, pMEB8-N2, pMEB8-C1, -C2, and -C3, all use MBP and the CBD as the N- and C-exteins, respectively. The differences are in the extein residues adjacent to the intein and are represented by their single letter code for ease of comparison. The constructs used in the two-plasmid, trans-splicing system were pMEB4 and pKEB1, which contain the N- and C-terminal Ssp DnaE intein fragments, DnaE(N) and DnaE(C), respectively. The intramolecular trans-splicing construct, pMEB21, placed DnaE(C) and DnaE(N) at the N and C terminus of MBP, respectively.

[...] and DnaE(C), respectively, aligns the two splice junctions for the fusion of the N- and C-extein sequences.
The splicing reaction presumably occurs via the same splicing pathway as the cis-splicing pathway proposed previously (5, 6). Cleavage at the N-terminal splice junction can occur by hydrolysis or nucleophilic attack of the thioester bond formed at the C terminus of the N-extein. B, intramolecular trans-splicing. A target protein is sandwiched between the intein C-terminal segment (36 amino acids) and the intein N-terminal segment (123 amino acids). Splicing joins the N terminus of the target protein to its own C terminus through a peptide bond. The presence of a CBD fused to the C terminus of the intein N-terminal segment facilitated purification of the precursor protein and the subsequent in vitro cyclization reaction on chitin resin.
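The in vitro assay conditions given under "In Vitro trans-Splicing and/or Cleavage Assay" mix mass and molar units (1 mg/ml ME(N)B against a 500 µM peptide, and 9 µM ME(N)B against 3 µM chitin-bound ligase fusion). The sketch below shows how such figures can be put on a common molar footing. The molecular mass used for ME(N)B is a hypothetical placeholder, since the text does not state one, and the function names are illustrative only.

```python
def mg_per_ml_to_micromolar(conc_mg_per_ml, mw_da):
    """Convert a mass concentration (mg/ml) to micromolar, given MW in Da (g/mol)."""
    # mg/ml is numerically g/l; g/l divided by g/mol gives mol/l; 1e6 scales to uM.
    return conc_mg_per_ml / mw_da * 1e6


def molar_excess(reagent_um, target_um):
    """Return how many moles of reagent are present per mole of target."""
    return reagent_um / target_um


# ASSUMPTION: placeholder molecular mass for the ME(N)B fusion protein.
# This value is NOT taken from the paper; substitute the real mass.
ASSUMED_MENB_MW_DA = 60_000.0

# Peptide assay: 1 mg/ml ME(N)B with 500 uM peptide (values from the text).
menb_um = mg_per_ml_to_micromolar(1.0, ASSUMED_MENB_MW_DA)
peptide_excess = molar_excess(500.0, menb_um)
print(f"ME(N)B at 1 mg/ml is about {menb_um:.1f} uM (assumed mass)")
print(f"peptide molar excess over ME(N)B is about {peptide_excess:.0f}-fold")

# On-column assay: 9 uM ME(N)B added to 3 uM chitin-bound ligase fusion.
on_column_ratio = molar_excess(9.0, 3.0)
print(f"ME(N)B to CBD-DnaE(C)-T4 DNA ligase molar ratio: {on_column_ratio:.0f}:1")
```

Only the on-column ratio follows directly from the stated concentrations; the peptide excess scales with whatever molecular mass is supplied for ME(N)B.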

Protein Cyclization and Analysis-ER2566 cells bearing pMEB21 were grown, induced, harvested, and lysed as described under "Protein Purification." The clarified supernatant from cells induced at 15°C was applied to an amylose resin (10-ml bed volume) whereas the clarified supernatant from cells induced at 37°C was applied to a chitin resin (15-ml bed volume). Unbound proteins were washed from the resin with 20 column volumes of Buffer A. Proteins were eluted from the amylose column with Buffer B. The intramolecular trans-splicing reaction proceeded in vitro when the chitin column was incubated for 20 h at room temperature. Reaction products were eluted from the resin with Buffer A. The cyclic MBP was analyzed by treatment with FXa (1:100, FXa:protein mass to mass ratio) overnight at 4°C to generate linearized MBP. The proteolyzed proteins were subjected to amino acid sequencing using a Procise 494 protein sequencer (PE Applied Biosystems, Foster City, CA).

In Vivo Splicing Activity of the Ssp DnaE Intein-The cis-splicing activity of the Ssp DnaE intein was investigated by expression of MEB in E. coli cells bearing pMEB8 (Fig. 1). The presence of the MBP-CBD (MB) band following induction of protein expression demonstrated that the Ssp DnaE intein can splice in cis with only 5 native N-terminal and 3 native C-terminal extein residues (Fig. 2A, lane 2). The identity of splicing products was confirmed by Western blot analysis using anti-MBP and anti-CBD antibodies and binding to chitin and amylose resins (data not shown). In addition to the spliced product, there was significant cleavage of the peptide bond at the N terminus of the Ssp DnaE intein.
The role of the extein amino acid residues was investigated by mutation of the distal extein residues in the cis-splicing construct (Fig. 2B). Splicing products were detected in mutants with either 2 proximal N-extein residues or 3 proximal C-extein residues (Figs. 1 and 2B, lanes 1 and 4). However, reduction of the C-extein sequence to 1 or 2 native amino acid residues inhibited splicing (Fig. 2B, lanes 2 and 3).
Interestingly, the same pattern of splicing and cleavage as seen with the cis-splicing construct was observed for in vivo trans-splicing (Fig. 2C, lane 2). Furthermore, both the pMEB4 and pKEB1 plasmids were necessary to induce processing of the precursor protein (Fig. 2C, lane 5). There was a slight accumulation of Ssp DnaE intein precursor protein when protein expression was induced at 37°C, and this was processed after further growth overnight at 15°C (Fig. 2, A and C).
In Vitro trans-Splicing with the Ssp DnaE Intein-The in vitro trans-splicing (Fig. 3) and/or trans-cleavage activity of the Ssp DnaE intein was studied using the bacterially expressed ME(N)B precursor and two peptides, Splice-pep and Cleav-pep, that mimic the C-terminal Ssp DnaE intein fragment (see "Experimental Procedures"). Both the Splice-pep and the Cleav-pep could activate ME(N)B, resulting in bands corresponding to the expected spliced and/or cleavage product (Fig. 4A). Furthermore, the ME(N)B precursor was stable in the absence of either peptide (Fig. 4A, lane 1). The cleavage and splicing products, MBP and MBP-CFNK*, respectively, are indistinguishable by SDS-PAGE. However, a Western blot using anti-biotin antibody indicated that splicing was occurring, albeit the extent of reaction could not be determined (data not shown). Efficient in vitro trans-splicing occurred between two bacterially expressed proteins, MBP-DnaE(N)-CBD and CBD-DnaE(C)-T4 DNA ligase, yielding spliced product, MBP-T4 DNA ligase (ML), at 4 and 16°C but significantly less at 37°C (Fig. 4B). Interestingly, little difference in splicing efficiency was observed when either chitin-bound or free ME(N)B was [...]

[...] (Fig. 5A, lane 2) and Western blot analysis (data not shown) of crude cell extract of cells expressing pMEB21 showed that there was precursor protein, DnaE(C)-MBP-DnaE(N)-CBD, linear MBP, circular MBP, DnaE(N)-CBD, and DnaE(C)-MBP. The putative linear and cyclic MBP species as well as higher molecular weight species (Fig. 5A, lane 4) were found to bind to amylose and elute with maltose.
The maltose-eluted proteins were subjected to FXa proteolysis and amino acid sequencing. The upper portion of the 43-kDa band yielded NH2-GTLEKFAEYXFNISTGM-COOH, which matched the sequence for the cyclic MBP that was linearized with FXa. Sequencing the lower part of the 43-kDa band gave NH2-XFNISTGM-COOH, which matched the N terminus of the linear MBP, which had not undergone cyclization. NH2-XVKIGRRSLGV-COOH was obtained from the 45-kDa band and correlates with the expected sequence from the DnaE(C)-MBP product. The X designates a sequencing cycle in which no amino acid could be assigned with confidence.
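The relationship between the two reads from the 43-kDa band can be illustrated by a simple alignment in which X is treated as a wildcard. This is a minimal sketch, not the analysis performed in the study; the helper names are illustrative, and the only inputs are the two reads reported above.

```python
def residues_match(a, b, wildcard="X"):
    """Compare two equal-length reads; a wildcard on either side matches anything."""
    return len(a) == len(b) and all(
        x == y or wildcard in (x, y) for x, y in zip(a, b)
    )


def find_offsets(long_read, short_read):
    """Return every start position at which short_read aligns within long_read."""
    n = len(short_read)
    return [
        i
        for i in range(len(long_read) - n + 1)
        if residues_match(long_read[i:i + n], short_read)
    ]


# Reads reported in the text for the upper and lower parts of the 43-kDa band.
CYCLIC_READ = "GTLEKFAEYXFNISTGM"  # FXa-linearized cyclic MBP
LINEAR_READ = "XFNISTGM"           # N terminus of linear MBP

offsets = find_offsets(CYCLIC_READ, LINEAR_READ)
upstream = CYCLIC_READ[:offsets[0]] if offsets else ""
print("alignment offsets:", offsets)
print("residues read before the linear N terminus:", upstream)
```

The linear read aligns at a single position, at the end of the longer read. Residues preceding that position are what distinguish a read that continues through the start of MBP from a read that begins there, which is the distinction the text draws between the cyclic and linear species.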
Precursor consisting of DnaE(C)-MBP-DnaE(N)-CBD could be obtained by inducing protein expression for 2 h at 37°C (Fig. 5B, lane 2). The precursor was immobilized on a chitin resin through the CBD, and intramolecular trans-splicing proceeded overnight at 23°C. Fractions from the chitin resin contained cyclic and linear MBP species (Fig. 5B, lane 4). FXa treatment of the isolated proteins (Fig. 5B, lane 5) followed by amino acid sequencing confirmed the presence of both the linear and circular forms.

DISCUSSION

The present study demonstrated that the Ssp DnaE intein was capable of splicing in a non-native protein context and determined that the intein fragments can self-associate with no more than 5 native extein residues. This implies that the interaction of the two intein halves in the natural condition is at least partly, or perhaps entirely, dominated by the association of the intein fragments and not by the extein residues. However, the presence of more extein residues may increase the efficiency of the splicing reaction and be vital to the effectiveness of this intein in the DnaE protein. Interestingly, the trans- and cis-splicing activities of the Ssp DnaE intein were almost identical; this indicates that the N- and C-terminal domains of this intein have a high affinity interaction.
Inteins represent a unique opportunity to perform protein manipulations in vivo as well as in vitro. The present work has demonstrated the facile production of circular and possibly multimeric proteins in E. coli and opens up new avenues to produce stable, bioactive proteins and peptides in living cells. Also, a recent paper, published following completion of this manuscript, describes the in vivo cyclization of proteins using the Ssp DnaE intein, which its authors term the in vivo split intein-mediated circular ligation of peptides and proteins (SICLOPPS) (29). We propose that the term SICLOPPS be used in the future to describe the in vivo cyclization reaction. trans-Splicing allows the study of cyclic peptides and proteins in a cellular environment. Furthermore, the possibility of assembling multimeric proteins in vivo is a novel and potentially useful technology.
The Ssp DnaE intein may be the intein of choice for in vitro trans-splicing experiments as it has been demonstrated to undergo the trans-splicing reaction without the need for a denaturation/renaturation step, as was necessary with other inteins (21, 23, 24). Also, the addition of the CBD to either the N- or C-terminal intein fragment had no detectable inhibitory effect on splicing. These unique properties allowed fusion proteins of both the intein N- and C-terminal fragments to be immobilized on a chitin resin for the subsequent ligation of two proteins under native conditions (Fig. 4B). This represents a significant advantage over other intein-based protein fusion techniques because the spliced products are easily purified away from the column-bound reactants. Furthermore, association of the two halves of the Ssp DnaE intein permits trans-splicing to occur at relatively low concentrations of reactants.
The present work demonstrated the use of inteins in both the in vivo and the in vitro manipulation of proteins. In particular, the Ssp DnaE intein represents an interesting and useful protein that trans-splices effectively in vivo with only minimal extein residue sequence and does not require cumbersome denaturation/renaturation steps for use in vitro. Future work should refine procedures to build and modify proteins in a cellular environment in much the same way as it is now possible to perform in vitro.