Protein Splicing of the Saccharomyces cerevisiae VMA Intein without the Endonuclease Motifs

The protein splicing element (intein) of the vacuolar ATPase subunit (VMA) of Saccharomyces cerevisiae catalyzes both protein splicing and site-specific DNA cleavage. It has been demonstrated that the conserved splice junction residues are directly involved in protein splicing and that the central dodecapeptide motifs are required for DNA cleavage. To examine whether the splicing activity of the intein can be structurally separated from the endonuclease motifs, we made large in-frame deletions in the central region of the intein. We demonstrate for the first time that protein splicing can proceed efficiently after the removal of the central region of the intein, including the endonuclease motifs. Our results suggest that the N- and C-terminal regions of the Sce VMA intein may form a separate domain that is not only catalytically sufficient for protein splicing but also structurally independent from the endonuclease domain.

Protein splicing is a post-translational processing event in which an internal segment, the intein, from a protein precursor catalyzes its own excision and concomitantly ligates the flanking regions, the exteins, to form a mature protein (1). It has been shown that inteins plus the first residue of the C-terminal flanking region (C-extein) contain sufficient structural and catalytic elements to direct splicing in the context of foreign proteins (2-6). In many cases, inteins belong to a family of site-specific endonucleases that cleave DNA in alleles lacking the inteins at a location called the homing site (7, 8). Sequence analysis of inteins from diverse organisms revealed seven conserved motifs (motifs A-G) (9). Motifs A and G contain a set of highly conserved residues at the two splice junctions. In vitro studies of protein splicing of the inteins from the thermostable DNA polymerase of Pyrococcus sp. GB-D and the 69-kDa vacuolar ATPase subunit of Saccharomyces cerevisiae (Sce VMA intein) have shown that these conserved splice junction residues play defined roles in the protein splicing pathway (4, 6, 10-13). Motifs C and E in the central region are referred to as the dodecapeptide motifs and are a characteristic feature of homing endonucleases (14, 15). However, it has been shown that protein splicing is independent of endonuclease function because an archaeal intein with a mutation that abolishes the endonuclease activity can still splice efficiently (16). The Sce VMA intein contains all seven conserved motifs (see Fig. 1A) (9). The Sce VMA intein functions as an endonuclease that cleaves the yeast genome at a single location and initiates a gene conversion process that results in the transfer of the intein gene to other yeast strains (8). The dodecapeptide motifs C and E are directly involved in the DNA recognition and cleavage reaction (17).
The catalytic properties of the Sce VMA intein generated from the natural protein splicing process are indistinguishable from those of the recombinant form, indicating that the endonuclease function of the intein is independent of the protein splicing process (18). Because protein splicing of the Sce VMA intein involves the splice junction residues (in motifs A and G), which are separated from the endonuclease motifs (C and E) in the primary sequence by more than 100 amino acids (see Fig. 1A), questions have been raised concerning whether all 454 Sce VMA intein residues are required for protein splicing and whether the N- and C-terminal regions of the intein may contain sufficient structural and catalytic elements to catalyze protein splicing.
In this paper, we report the construction and characterization of large in-frame deletions within the Sce VMA intein that remove the central region of the intein, including the dodecapeptide motifs. The deletion mutants were studied in a chimeric three-part fusion system in which the Sce VMA intein (Y) or its deletion mutants (ΔY) were fused in-frame between the Escherichia coli maltose-binding protein as the N-extein (M) and the Bacillus circulans chitin-binding domain (B) as the C-extein. The resulting fusion constructs (pMYB for the wild-type full-length intein and pΔMYB for the intein deletion mutants) were expressed in E. coli, and protein splicing was examined both in the crude cell extract and in the amylose-purified proteins. We demonstrate that protein splicing proceeds efficiently in the chimeric protein fusion context when 184 residues in the central region of the intein, including the endonuclease motifs, are replaced with flexible peptide linkers.

EXPERIMENTAL PROCEDURES
The procedures for cell culture, protein expression, and purification were the same as those described previously (6) except that the E. coli strain ER2426 (− F′ proA+B+ lacIq Δ(lacZ)M15 zzf::miniTn10 (KanR)/fhuA2 supE44 e14− rfbD1? relA1? endA1 spoT1? thi-1 Δ(mcrC-mrr)114::IS10, Elisabeth Raleigh, New England Biolabs) was used. The crude cell extracts and amylose-purified proteins were analyzed by SDS-PAGE, followed by Coomassie Blue staining and Western blot analysis. SDS-PAGE was performed in 12% Tris-Glycine gels (Novex, San Diego, CA). For Western blot analyses, the SDS-PAGE gels were blotted onto nitrocellulose membranes and analyzed by probing with polyclonal antibodies against the maltose-binding protein (New England Biolabs) or the Sce VMA intein (gift of Dr. F. S. Gimble) as described by Perler et al. (7). All enzymes were from New England Biolabs.
To construct pMYB to demonstrate protein splicing of the wild-type intein, pMYB129 (19) was digested with BamHI and AgeI and ligated with the complementary oligomers 5′-GATCCCAGGTTGTTGTACACAACTGTGGTGGCCTGA-3′ and 5′-CCGGTCAGGCCACCACAGTTGTGTACAACAACCTGG-3′ to yield pMYB, in which the Asn-454 → Ala mutation in the C-terminal splice junction of pMYB129 was reverted to the wild-type asparagine residue.
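The pairing of the two oligomers, and the single-stranded ends they leave for ligation into the BamHI- and AgeI-digested vector, can be checked with a short script. The following is a minimal sketch rather than part of the original procedures: the function names are hypothetical, the hyphens inside the printed sequences are assumed to be line-break hyphens, and BamHI and AgeI are assumed to leave 5′ GATC and 5′ CCGG overhangs, respectively.

```python
# Sketch: check that two oligomers anneal into a duplex with 4-nt 5' overhangs.
# Function names are illustrative and do not come from the original procedures.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_overhangs(top: str, bottom: str, overhang: int = 4):
    """Return (top 5' overhang, bottom 5' overhang) if the two oligomers pair
    perfectly outside their 5' overhangs; otherwise return None."""
    if len(top) != len(bottom):
        return None
    # The paired region of each strand is everything after its 5' overhang.
    if reverse_complement(top[overhang:]) != bottom[overhang:]:
        return None
    return top[:overhang], bottom[:overhang]


# Oligomer sequences as printed, with the assumed line-break hyphens removed.
top = "GATCCCAGGTTGTTGTACACAACTGTGGTGGCCTGA"
bottom = "CCGGTCAGGCCACCACAGTTGTGTACAACAACCTGG"

print(annealed_overhangs(top, bottom))  # ('GATC', 'CCGG')
```

Under these assumptions the two 36-mers form a 32-bp duplex whose ends match the BamHI and AgeI sites, which is consistent with the directional ligation described above.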
To test the ability of the splicing product MB fusion proteins to bind chitin, 1 ml of amylose-purified MB (0.5 mg/ml) was mixed with 0.5 ml of chitin resin (New England Biolabs). After 10 min of incubation at 4°C, the chitin resin was pelleted by centrifugation and washed three times with column buffer containing 20 mM HEPES (pH 8.0), 0.5 M NaCl. The bound protein was eluted with 2% SDS and analyzed by SDS-PAGE as above.

RESULTS
Large in-frame deletions within the Sce VMA intein were created by a PCR method using two primers (primers 1 and 2, Fig. 1B) annealing at the deletion sites and pLitYP containing the Sce VMA intein sequence (6) as the template (Fig. 1B, step (1)). Both primers contained an NheI site that allowed self-ligation of the PCR products to generate plasmid pΔLitYP, in which the intein sequence between the two primers was deleted (Fig. 1B, step (2)). This deletion intein sequence was then transferred to a pMYB fusion construct, replacing the full-length intein sequence, to yield pΔMYB (Fig. 1B, step (3)). Because primer 2 also contained a second restriction site, AatII, a linker sequence encoding a short flexible peptide of multiple Asn and Gly residues (NG linker) or Ser and Gly residues (SG linker) was inserted between the NheI and AatII sites, resulting in pΔMYB(NG) or pΔMYB(SG) (Fig. 1B, step (4)). All fusion constructs and the positions of deletion are illustrated in Fig. 1A (for details, see "Experimental Procedures").
Protein splicing of the wild-type Sce VMA intein in the MYB fusion generates the ligated exteins, MB, and the excised intein, Y. Due to the presence of non-native sequences between the intein (Y) and the exteins (M and B) in the MYB fusion protein (6, 19), the ligated exteins, MB, are expected to have a molecular mass of about 51 kDa (approximately 4 kDa larger than the sum of the molecular masses of native MBP (M, 42 kDa) and CBD (B, 5 kDa)). The major component in the amylose-purified protein sample had a molecular mass corresponding to that of MB (Fig. 2A, lane 4), whereas the excised intein Y (50 kDa) remained in the flow-through (Fig. 2A, lane 3). Because of the small difference in molecular mass between MB (51 kDa) and Y (50 kDa), they were not clearly separated on SDS-PAGE of the crude cell extract (Fig. 2A, lane 2). The identities of MB and Y were further verified by Western blot analysis using antibodies specific for the maltose-binding protein (anti-MBP) and the Sce VMA intein (anti-Sce) and by the ability of MB to bind chitin (detection of Y by anti-Sce in the crude extract is shown in Fig. 2A, lane 5; other data are not shown). Some minor components were also observed in the amylose-purified protein (Fig. 2A, lane 4). Based on their molecular masses and Western blot analysis, these components may be the cleavage products of the MYB fusion proteins at a single splice junction (i.e. MY and M). The above data indicate that the wild-type Sce VMA intein splices efficiently in the MYB fusion.

Fig. 1. The arrows and the numbers below the intein box indicate the amino acid residue positions of the deletions. The XhoI-BamHI fragment (indicated above the intein box) contains the Sce VMA intein sequence except the C-terminal seven residues (6). A putative intein from P. purpurea DnaB helicase is also shown to indicate that it has homologous motifs A, B, F, and G but not C, D, and E. (2), ΔMYB1, a chimeric fusion protein containing an intein deletion mutant inserted in frame between the maltose-binding protein (M) (black box) and the chitin-binding domain (B) (striped box). The intein deletion was made between residues 204 and 387 of the wild-type Sce VMA intein. (3), ΔMYB1(NG), same as (2) except that a peptide linker (NG linker) was inserted at the deletion site. (4), ΔMYB1(SG), same as (2) except that a peptide linker (SG linker) was inserted at the deletion site. (5), ΔMYB2, same as (2) except that the deletion was between residues 114 and 387. (6), ΔMYB2(NG), same as (5) except that a peptide linker (NG linker) was inserted at the deletion site. B, a scheme showing the PCR-based method for generating a deletion within the Sce VMA intein gene. Using primer 1 and primer 2 and pLitYP as template, a PCR reaction generated a fragment that contained all of the pLitYP sequence except the region of the Sce VMA intein between the two primers. See "Experimental Procedures" for details.
A deletion between residues 204 and 387 of the Sce VMA intein removed the dodecapeptide motifs (C and E) and motif D to yield pΔMYB1 (Fig. 1B). Due to the incorporation of restriction sites in primers 1 and 2, six residues, Ala-Ser-Gly-Gly-Asp-Val, were inserted between residues 204 and 387 of the deletion intein. Expression of pΔMYB1 in E. coli resulted in completely unspliced precursors (ΔMYB1), as shown by SDS-PAGE of the crude cell extract and amylose-purified proteins (Fig. 2B, left panel, lanes 1 and 2, respectively). No splicing products were observed when the purified ΔMYB1 precursors were incubated under splicing conditions (i.e. 20 mM HEPES (pH 8.0), 0.5 M NaCl) at 4°C (Fig. 2B, left panel, lane 3) or 23°C (data not shown). Similar results were obtained when deletions were made between residues 114 and 387 (pΔMYB2) (data not shown).
A flexible 19-residue peptide linker containing Asn-Gly repeats (NG linker) or a 14-residue peptide linker containing Ser-Gly repeats (SG linker) was inserted into the deletion site of pΔMYB1 to yield pΔMYB1(NG) or pΔMYB1(SG). After transformation of the plasmid into E. coli, the expressed proteins were purified on an amylose column. As shown by SDS-PAGE, the major component of the purified proteins was a 51-kDa protein, the size expected for the ligated exteins, MB (Fig. 2B, middle and right panels, lane 2), and the excised intein mutant ΔY was detected in the crude extract with the expected molecular mass of ~33 kDa (Fig. 2B, middle and right panels, lane 1), indicating that efficient splicing had occurred in vivo. The identities of MB and ΔY were further verified by Western blot analysis. The expected MB reacted with anti-MBP but not anti-Sce (data not shown), and it bound chitin, as indicated by 2% SDS elution from the chitin resin (Fig. 2B, middle panel, lane 3). The anti-Sce antibodies reacted specifically with the excised intein mutant ΔY in the crude cell extract (Fig. 2B, middle panel, lane 4, and right panel, lane 3). Three minor components with high molecular masses were detected in the amylose-purified proteins and the 2% SDS eluates; they reacted with both anti-MBP and anti-Sce VMA intein sera (Fig. 2B, middle panel, lanes 3 and 4, and right panel, lane 3), suggesting that they were probably incomplete splicing products, i.e. the precursor (84 kDa), the branched intermediate, and the C-terminal splice junction cleavage product MΔY (77 kDa). Other minor bands in the amylose-purified proteins, corresponding to the sizes of ΔY (33 kDa) and MBP (42 kDa), were also co-purified (Fig. 2B, middle and right panels, lane 2), indicating that some splicing and single splice junction cleavage of the precursor molecules occurred during purification and storage.
The NG linker was also inserted into the deletion site of pΔMYB2 to yield pΔMYB2(NG). Only a 74-kDa protein, corresponding to the unspliced precursor ΔMYB2(NG), was purified from the amylose affinity column (Fig. 2C, lane 2). Incubation of the purified ΔMYB2(NG) under splicing conditions (i.e. 20 mM HEPES (pH 8.0), 0.5 M NaCl) at 4 or 23°C showed no splicing (data not shown).
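The molecular masses quoted in this section can be cross-checked with simple bookkeeping. The sketch below is illustrative only and rests on assumptions that are not stated in the text: deletion endpoints are counted inclusively (which reproduces the 184 residues quoted for the 204-387 deletion), the residues contributed by the restriction sites are ignored, and every residue is given the average mass of the full-length intein (50 kDa over 454 residues). The function names are hypothetical.

```python
# Rough mass bookkeeping for the deletion constructs, using figures from the text.
# Assumptions (not from the original): inclusive deletion endpoints, restriction-site
# residues ignored, and a uniform per-residue mass derived from the full-length intein.

INTEIN_RESIDUES = 454   # full-length Sce VMA intein
INTEIN_KDA = 50.0       # reported mass of the excised intein Y
MB_KDA = 51.0           # reported mass of the ligated exteins MB
KDA_PER_RESIDUE = INTEIN_KDA / INTEIN_RESIDUES


def deleted_residues(first: int, last: int) -> int:
    """Number of residues removed when first..last are deleted inclusively."""
    return last - first + 1


def intein_mutant_kda(first: int, last: int, linker_len: int = 0) -> float:
    """Estimated mass (kDa) of the intein after deletion and linker insertion."""
    remaining = INTEIN_RESIDUES - deleted_residues(first, last) + linker_len
    return remaining * KDA_PER_RESIDUE


def precursor_kda(first: int, last: int, linker_len: int = 0) -> float:
    """Estimated mass (kDa) of the unspliced precursor: exteins plus mutant intein."""
    return MB_KDA + intein_mutant_kda(first, last, linker_len)


# (first deleted residue, last deleted residue, linker length) per construct
constructs = {
    "dMYB1(NG)": (204, 387, 19),
    "dMYB1(SG)": (204, 387, 14),
    "dMYB2(NG)": (114, 387, 19),
}

for name, (first, last, linker) in constructs.items():
    print(name,
          deleted_residues(first, last),
          round(intein_mutant_kda(first, last, linker), 1),
          round(precursor_kda(first, last, linker), 1))
```

With these assumptions the estimates fall within about 2 kDa of the reported values (~33 kDa for ΔY, 84 kDa for the ΔMYB1 linker precursors, and 74 kDa for ΔMYB2(NG)), which is as close as a uniform per-residue mass can be expected to come to gel-based sizes.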

DISCUSSION
For efficient splicing to occur, the protein precursor has to fold properly to bring the two splice junctions into close proximity and precisely align all the reacting groups. Previous studies have not determined whether the entire intein sequence, including the functionally unrelated endonuclease motifs, is required for the proper folding of the precursor that leads to efficient splicing. It has been shown that a small deletion in an intein from the Vent DNA polymerase of Thermococcus litoralis resulted in an unspliced precursor (16), and a large deletion in the central region of an intein from Mycobacterium tuberculosis recA protein blocked splicing (2). In the case of the Sce VMA intein, although a seven-residue deletion in the middle of the intein (between motifs C and D) did not affect splicing, large in-frame deletions have been shown to block splicing (3). In the present study, large deletions in the intein central region also blocked splicing (in pΔMYB1 and pΔMYB2). However, efficient splicing was observed when flexible peptide linkers were inserted at the deletion sites (in pΔMYB1(NG) and pΔMYB1(SG)) (Fig. 2B). It appears that deletions in the central region of the Sce VMA intein disrupt the proper folding of the remaining intein structure, thereby blocking splicing, even though sufficient catalytic and structural elements for splicing remain. Introduction of a linker at the deletion site appears to give flexibility to the local protein structure, thereby allowing the remaining intein to achieve the proper conformation for efficient splicing.
Protein splicing of the Sce VMA intein directly involves three essential splice junction residues, Cys-1, Asn-454, and Cys-455 (6). Mutations of intein residues close to the splice junctions have been shown to disrupt protein splicing (3, 16, 19, 20). A close examination of the chemical mechanism of protein splicing suggests that there may be other catalytic residues that assist the nucleophilic attack by the splice junction cysteines or asparagine. A conserved histidine residue in motif B was suggested to assist in the N- or C-terminal splicing reactions (9). Similarly, the data from this study suggest that all catalytic residues are located in the splice junction motifs (A and G) and the proximal motifs (B and F), whereas the central motifs (C, D, and E) are not essential. Both ΔMYB1(NG) and ΔMYB1(SG) retained motifs A, B, F, and G and allowed efficient splicing. A motif H, spanning residues 340-359 in the Sce VMA intein, has recently been identified (21); it is located within the deleted region of ΔMYB1 and is presumably not required for splicing. It appears that the motifs that are involved in splicing may not be required for the endonuclease activity of the intein. Mutations of the conserved residues in motifs A, B, F, and G have no effect on the endonuclease activity. It is yet to be determined whether an intein without the splice junction motifs A, B, F, and G can retain the endonuclease function. When a larger deletion was made, as in ΔMYB2(NG), protein splicing was blocked even though the same peptide linker was inserted at the deletion site (Fig. 2C). It is possible that the region deleted in ΔMYB2(NG) contains certain elements essential for splicing or that the deletion disrupts the proper alignment of the catalytic residues.
The intein deletion mutant ΔY in ΔMYB1(NG) or ΔMYB1(SG) also exhibited some differences in splicing efficiency from the full-length intein, possibly due to certain structural alterations in ΔY caused by the deletion. For instance, compared with the MYB fusion, the ΔMYB1(NG) and ΔMYB1(SG) fusion proteins spliced less efficiently, as indicated by accumulation of a small amount of the precursors in the amylose-purified proteins (Fig. 2B, middle and right panels, lane 2). It is possible that the position of the deletion or the choice of peptide linker was not optimal for the remaining intein structure to achieve the wild-type splicing efficiency. Further variation in the deletion sites and/or linker sequences may optimize the splicing efficiency.
Despite the differences between ΔY and the wild-type intein, the data from this study suggest that ΔY retains not only most of the wild-type splicing activity but also the overall wild-type structure and that deleting the central region of the Sce VMA intein, including the endonuclease motifs, may not affect the overall structure of the remaining intein. This raises the possibility that in the tertiary structure of the Sce VMA intein, the N- and C-terminal regions may closely interact to form an independent "splicing domain," whereas the central region of the intein may form an endonuclease domain. However, whether the endonuclease function of the Sce VMA intein resides only in the central region of the intein, thereby forming a separate domain, remains to be determined. A 150-amino acid open reading frame region in the chloroplast DnaB helicase protein of the red alga Porphyra purpurea and a 198-amino acid open reading frame in the gyrase A protein of Mycobacterium xenopi have been recently identified as putative inteins by sequence alignment (9, 21). Both Ppu dnaB (Fig. 1A) and Mxe gyrA inteins contain motifs A, B, F, and G but completely lack the dodecapeptide motifs (C and E) and motif D. Although these inteins have not been demonstrated to splice in vivo or in vitro, it is possible that naturally occurring inteins with only a splicing domain may exist in some proteins and that splicing occurs without the endonuclease motifs.
In conclusion, we have provided strong evidence that the N- and C-terminal regions of the Sce VMA intein, including motifs A, B, F, and G, contain sufficient structural and catalytic elements for splicing, whereas the central region of the intein, including the dodecapeptide motifs C and E and motif D, is not essential for protein splicing. Our data should further our understanding of the mechanism of protein splicing and help to elucidate the three-dimensional structure of inteins. This work also represents an important step toward "engineering" a minimal protein splicing element. Although a rational design of such a minimal splicing element awaits the solving of the intein crystal structure, our study provides a new approach to intein-based protein engineering and a variety of applications in molecular biology.