Multistep Processing of an Insertion Sequence in an Essential Subunit of the Chloroplast ClpP Complex

In Chlamydomonas reinhardtii, the clpP1 chloroplast gene encoding one of the catalytic subunits of the ClpP protease complex contains a large in-frame insertion sequence (IS1). Based on the Escherichia coli ClpP structure, IS1 is predicted to protrude at the apical surface of the complex, likely influencing the interaction of the catalytic core with ClpC/HSP100 chaperones. Immunoblotting with an anti-ClpP1 antibody detected two immunoreactive forms of ClpP1: ClpP1H (59 kDa) and ClpP1L (21 kDa). It has been proposed that IS1 is a new type of protein intron (different from inteins). By studying transformants harboring mutations at the predicted borders of IS1 and tags at the C terminus of ClpP1 (tandem affinity purification tag, His tag, Strep·Tag) or within the IS1 sequence (3-hemagglutinin tag), we show that IS1 is not a protein intron and that ClpP1L results from endoproteolytic cleavage inside IS1. Processing sites have been identified in the middle of IS1 and near its C terminus. The sites can be mutated without abolishing processing.

Clp proteases are self-compartmentalized serine proteases present in most eubacteria and, as a consequence of endosymbiotic events, in the mitochondrion and chloroplast of eukaryotes. In Escherichia coli, the organism in which they have been best characterized, Clp proteases associate a homo-oligomeric peptidase (ClpP) and a chaperone (ClpA or ClpX) that belongs to the Clp/HSP100 family, itself part of the large group of AAA+ ATPases (1-4). ClpP is composed of 14 identical subunits arranged in two heptameric rings related by central symmetry. They form a barrel-like structure with the 14 active sites facing an inner proteolytic chamber (5). ClpP alone is able to degrade only small peptides (6), and the recognition and unfolding of protein substrates are carried out by the Clp/HSP100 chaperone. The chaperone docks on the apical surfaces of ClpP and uses ATP hydrolysis to unfold and feed substrates through the ClpP axial pore into the proteolytic chamber (7-10).
In chloroplasts, ClpP is present as a hetero-oligomer associating up to eight different types of subunit. This is the result of a gene diversification process that has begun in cyanobacteria and continues in the chloroplast of land plants. Not only has the number of clpP genes grown, but clpR genes have appeared that carry mutations in at least one residue of the catalytic triad and are thus presumed catalytically inactive. In the green alga Chlamydomonas reinhardtii, three clpP genes (clpP1, CLPP4, and CLPP5) and five clpR genes (CLPR1-CLPR4 and CLPR6) code for the subunits of the chloroplast ClpP complex (11). An additional CLPP2 gene codes for the homo-oligomeric mitochondrial ClpP.
ClpP1 is the only subunit that is encoded in the chloroplast and probably the best conserved. In C. reinhardtii, clpP1 contains a large insertion sequence (IS1) translated in-frame with the conserved N- and C-terminal regions. This results in a protein about twice as large (~59 kDa) as in other organisms. Chlamydomonas ClpP1 can be divided into two sequence domains, SD1 and SD2 (the latter containing the catalytic residues), corresponding to the conserved sequence, and one insertion sequence, IS1 (12). In C. reinhardtii, antisera raised against the entire open reading frame (ORF) recognize two products of clpP1 in Western blots: ClpP1H (59 kDa) and ClpP1L (21 kDa) (13). As the clpP1 mRNA does not undergo splicing (12), it has been proposed that IS1 could be a protein intron. Protein introns such as inteins (14) are defined as in-frame intervening sequences that disrupt a host gene and are post-translationally excised by a self-catalytic mechanism. In the case of clpP1, ClpP1H would be the precursor protein and ClpP1L the spliced form. However, IS1 lacks the sequence motifs characteristic of inteins. In addition, both ClpP1L and ClpP1H are stable, and both associate in the 540-kDa ClpP complex (11). Thus, if IS1 were a protein intron, it would be an unusual type. In the related species Chlamydomonas eugametos, clpP1 contains, in addition to IS1, another insertion sequence (IS2) displaying most of the sequence features of inteins. Indeed, IS2 can be induced to self-splice in E. coli by changing a single residue (15).
In this study, we show that IS1 is not a protein intron and that ClpP1L is the product of a complex proteolytic maturation of ClpP1H. We have found similar insertion sequences in the clpP1 genes of other green algae from the group Chlorophyceae. Green algae accumulate such insertion sequences in many of their chloroplast genes, probably as a result of a high frequency of genome rearrangements.
ClpP1 Mutagenesis-A genomic fragment corresponding to nucleotides 16972-20083 of the chloroplast genome (GenBank accession number BK000554) and encompassing the C-terminal end of the petB coding region and the trnL, clpP1, trnW, and trnS genes was PCR-amplified using Chlamydomonas total DNA as template and oligonucleotide primers SBamHI and REVA (see supplemental "Experimental Procedures" for oligonucleotide sequences and details on site-directed mutagenesis). After digestion with BamHI and KpnI, it was cloned into the pBluescript KS(-) vector digested by the same enzymes to create plasmid pClpP1. To create pLG3S, the 1.9-kilobase pair aadA cassette conferring resistance to spectinomycin and streptomycin (17) was inserted in the unique EcoRV site so that aadA and clpP1 read in the same direction. Site-directed mutagenesis was carried out with PCR or megaprimer PCR (18). We verified correct introduction of all mutations by partial sequencing of the plasmids.
Chloroplast Transformation-Chloroplast transformation by tungsten particle bombardment (19) was performed with a helium gun built in the laboratory by D. Béal. Cells were grown in liquid Tris acetate/phosphate medium to a density of 2 × 10⁶ cells ml⁻¹. 10⁸ cells were plated on Tris acetate/phosphate medium supplemented with 100 µg ml⁻¹ spectinomycin and bombarded with tungsten particles (diameter, ~1 µm) coated with the appropriate DNA. Colonies were picked after 2 weeks of growth and subcloned on Tris acetate/phosphate plates containing increasing concentrations of spectinomycin (from 100 to 500 µg ml⁻¹) until they reached homoplasmy (20). The presence of transforming DNA was assessed by PCR with specific primers and restriction analysis of the PCR products, whereas that of WT DNA was assessed by PCR using primers Seq4 and WA2. Colonies recovered from plates were resuspended in 20 µl of sterile water, of which 10 µl were used for Chelex 100 DNA extraction (21), whereas the remaining 10 µl were used for the next subcloning round.
Biochemical Analysis-SDS-PAGE was performed on 12-18% acrylamide gels containing 8 M urea (22); cell samples were loaded on an equal chlorophyll basis. For purified complex, protein concentration was determined using a BCA assay kit (Sigma). Proteins were electroblotted onto nitrocellulose membranes (Hybond-ECL, GE Healthcare) (23). Immunodetection was performed by ECL (GE Healthcare). The ClpP1 antiserum was raised against the entire ORF of C. reinhardtii clpP1 (13). The IS1 antiserum was raised by Neosystem (Strasbourg, France) against peptide NNESGRSLYRKQTER, corresponding to residues 325-339 localized in the C-terminal part of IS1. For Edman sequencing of ClpP1 fragments, proteins were transferred onto Westran polyvinylidene difluoride (Whatman) and stained with Coomassie Blue R-250. Edman sequencing was performed at the Plateau Technique d'Analyze et de Microséquençage des Protéines (Institut Pasteur).

ClpP1 IS1 Is Not a Protein Intron-To test the hypothesis that IS1 is a protein intron, we introduced at the C terminus of ClpP1 a His10 tag, a tandem affinity purification (TAP) tag (Fig. 1A), or a Strep·Tag (see Fig. 4). These tags consist of 10 consecutive histidines; a combination of a calmodulin-binding domain, a tobacco etch virus protease cleavage site, and a protein A domain (24); and a 9-amino acid peptide with high affinity for Strep·Tactin, respectively. In all cases, we observed the expected change in the electrophoretic migration of ClpP1H (59 kDa in the WT), confirming that this immunoreactive band contains the C terminus and is produced by translation of the complete clpP1 ORF. In contrast, none of the tagged strains showed a change in the migration of ClpP1L (~21 kDa in all strains), indicating that ClpP1L does not contain the C terminus of ClpP1H, as would be expected if it were produced by IS1 splicing.
Protein introns like inteins rely on their N- and C-terminal flanking residues (usually Cys but sometimes Ser) for excision (25). To further rule out that an intein-like splicing of IS1 produced ClpP1L, we mutated its putative flanking residues (Ser60 and Ser346) to Ala. In the two single mutants produced in the His-tagged context (S60A-His and S346A-His), we observed no modification of the ratio between the ClpP1H and ClpP1L bands compared with the WT (Fig. 1B). This result rules out that IS1 is a protein intron, as we would have expected at least a change in splicing efficiency by mutating its bordering residues. Interestingly, the only difference was the slightly faster migration of ClpP1L caused by mutating Ser60 to Ala (Fig. 1B). This was also observed in a double mutant (S60A/S346A) generated in a nontagged background (data not shown). This suggests that ClpP1L contains this residue and is hence derived from the N-terminal part of the protein.
ClpP1 Is Cleaved within IS1 to Generate N- and C-terminal Fragments-In immunoblot experiments, the TAP-tagged strain showed a very strong labeling of the tagged ClpP1H band (Fig. 1A, ●, 79 kDa versus 59 kDa in the WT) because the protein A domain by itself can bind antibodies nonspecifically in Western blots. Interestingly, two other strongly labeled bands were visible at 43 and 41 kDa (Fig. 1A, ■ and ▲), which we reasoned should also contain the C terminus of ClpP1. We calculated that these two bands comprised the C-terminal 23 and 21 kDa of the ClpP1 protein, respectively, plus the 20-kDa TAP tag. In the His-tagged strain, we also observed two weakly immunoreactive bands that could correspond to such C-terminal fragments, with 1.5 kDa added from the His tag. They were not stronger than other background bands, but were missing in the WT. Instead, the WT showed a faint band at ~23 kDa (Fig. 1, ■), which was absent in the tagged strains. These observations led us to propose that ClpP1 is processed to yield (i) an N-terminal fragment, hitherto called ClpP1L and which we now rename ClpP1N, and (ii) two C-terminal fragments, which we call ClpP1C (Fig. 1, ■) and ClpP1C′ (Fig. 1, ▲), whose migration is retarded in strains carrying C-terminal tags (Fig. 2). In the WT strain, ClpP1C′ co-migrated with ClpP1N and was therefore not seen because of the much stronger immunoreactivity of ClpP1N. It was only revealed when the presence of an epitope tag affected its mobility (Fig. 1) or that of ClpP1N (see ClpP1HA85 in Fig. 3A below).
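The band-size reasoning in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that apparent masses on SDS-PAGE are simply additive; the function and variable names are illustrative and do not come from the study.

```python
# Apparent masses (kDa) quoted in the text; additivity is an assumption.
TAP_TAG_KDA = 20.0
HIS_TAG_KDA = 1.5
FULL_LENGTH_KDA = 59.0
C_FRAGMENTS_KDA = {"ClpP1C": 23.0, "ClpP1C'": 21.0}


def tagged_mass(untagged_kda: float, tag_kda: float) -> float:
    """Predicted apparent mass of a polypeptide carrying a C-terminal tag."""
    return untagged_kda + tag_kda


def predicted_c_terminal_bands(tag_kda: float) -> dict:
    """Predicted sizes of the two C-terminal fragments for a given tag."""
    return {name: tagged_mass(mass, tag_kda)
            for name, mass in C_FRAGMENTS_KDA.items()}


if __name__ == "__main__":
    print("TAP-tagged precursor:", tagged_mass(FULL_LENGTH_KDA, TAP_TAG_KDA))
    print("TAP-tagged fragments:", predicted_c_terminal_bands(TAP_TAG_KDA))
    print("His-tagged fragments:", predicted_c_terminal_bands(HIS_TAG_KDA))
```

Under this assumption, the TAP tag accounts for both the 79-kDa precursor band and the 43- and 41-kDa bands, whereas an N-terminal fragment lacking the C terminus would not shift at all.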
Further evidence for this model was obtained by Western blotting with an antibody directed against a peptide epitope (N325NESGRSLYRKQTER339) corresponding to a region near the C terminus of IS1 (Fig. 1C). Although not very specific, this antibody recognized in Western blots all the bands that we propose represent C-terminal fragments of ClpP1 (Fig. 1C, ■ and ▲). This result also allowed us to conclude that the N termini of these C-terminal fragments lie upstream of the peptide epitope, i.e. within IS1. We consistently observed two closely spaced C-terminal fragments regardless of the tag used, suggesting that cleavage can occur at either of two positions in the C-terminal portion of IS1. We call these cleavage sites C and C′ (Fig. 2).
To prove that the smaller fragment (ClpP1N) corresponds to the N-terminal part of ClpP1, we introduced a 3-hemagglutinin (HA) tag at various positions inside the protein. In strain ClpP1HA85, the tag was inserted at position 85 (Fig. 2), i.e. in the N-terminal part of IS1. In this strain, not only the ClpP1H but also the ClpP1N bands were shifted upwards by ~4.3 kDa, the size of the tag. Both bands were recognized by the anti-HA antibody (Fig. 3A). In contrast, in strain ClpP1HA176, in which the tag was introduced at position 176, approximately in the middle of IS1 (Fig. 2), the migration of ClpP1N was unchanged. In this strain, the anti-HA antibody labeled ClpP1H, but not ClpP1N or the ClpP1C and ClpP1C′ C-terminal bands (Fig. 3B). This indicates that the central portion of IS1, including position 176, is lost during processing. Processing must therefore include another cleavage event at a site between positions 85 and 176, which we call cleavage site N. This is in line with the observation that the sum of the apparent molecular masses of ClpP1N and ClpP1C (21 and 23 kDa, respectively, in the WT strain) is less than the mass of the ClpP1H precursor (59 kDa).
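The mass balance invoked at the end of this paragraph can be made explicit. The sketch below is only illustrative: it treats apparent molecular masses as additive, which is an approximation for SDS-PAGE, and the names used are hypothetical.

```python
# Apparent masses (kDa) in the WT strain, as quoted in the text.
PRECURSOR_KDA = 59.0   # ClpP1H
N_FRAGMENT_KDA = 21.0  # ClpP1N
C_FRAGMENT_KDA = 23.0  # ClpP1C
HA_TAG_KDA = 4.3       # 3-hemagglutinin tag


def unaccounted_mass(precursor_kda: float, fragment_masses_kda: list) -> float:
    """Mass of the precursor not recovered in the listed fragments."""
    return precursor_kda - sum(fragment_masses_kda)


if __name__ == "__main__":
    lost = unaccounted_mass(PRECURSOR_KDA, [N_FRAGMENT_KDA, C_FRAGMENT_KDA])
    print(f"Mass not recovered in ClpP1N + ClpP1C: {lost:.1f} kDa")
    # A tag placed inside a retained fragment shifts that fragment by the tag mass.
    print(f"ClpP1N carrying the HA tag: {N_FRAGMENT_KDA + HA_TAG_KDA:.1f} kDa")
```

The difference between the precursor and the two fragments is the portion of IS1 that must be removed by cleavage at site N and at site C (or C′).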
Identification and Mutagenesis of the Cleavage Sites in IS1-To identify the processing sites more precisely, we analyzed the ClpP1-derived bands by N-terminal sequencing (Edman degradation). We used a partially purified preparation of the ClpP complex obtained from the Strep·Tag-tagged strain by affinity chromatography. This preparation contains all the ClpP and ClpR subunits and the four products of ClpP1, easily identified by immunoblotting (Fig. 4). As expected, the sequence obtained from the ClpP1N band (XIGV…) matched that expected from the N terminus of the ClpP1 ORF (MPIGV) after removal of the N-terminal Met. The same sequence was obtained from the two bands forming the ClpP1H doublet. For ClpP1C and ClpP1C′, two different N-terminal sequences were obtained (NYLD and YRK, respectively), identifying the positions of cleavage sites C and C′, respectively (Figs. 2 and 4). They are both found near the C terminus of IS1, and site C′ is located within the peptide used to generate the IS1-specific antibody (NNESGRSL↓YRKQTER). Interestingly, electrospray ionization-tandem mass spectrometry experiments on the purified complex (data not shown) also identified a semitryptic peptide (NYLDQGALNNESGR) whose N terminus coincided with site C.
What could be the role of these processing events? To address this question, we mutated residues around sites C and C′, changing the sequences AY↓NYL and RSL↓YR to AAAAL and RAAAR, respectively. In the resulting strains (ClpP1mC and ClpP1mC′), as well as in the double mutant carrying both mutations (ClpP1mCmC′), the ClpP1C and ClpP1C′ bands were still present (Fig. 5A). The mutation of site C slightly decreased the level of ClpP1C, with no effect on ClpP1C′, whereas the mutation of site C′ slightly increased the level of ClpP1C and decreased that of ClpP1C′ (for quantification, see supplemental Fig. S1). Interestingly, a new immunoreactive band appeared in these mutants (Fig. 5A, asterisk) whose size (39 kDa) corresponds approximately to the C-terminal product of a single cleavage at site N, i.e. with neither site C nor C′ processing. When both sites were mutated, the intensity of this band increased, whereas that of ClpP1C and ClpP1C′ decreased slightly.
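The way N-terminal sequences locate the cleavage sites can be illustrated by mapping them onto the numbered epitope peptide. This is a sketch under stated assumptions: it uses only the sequences quoted in the text, takes the epitope numbering (residues 325-339) as the reference, and assumes that the semitryptic peptide overlaps the epitope through its shared NNESGR stretch. The helper names are hypothetical.

```python
EPITOPE = "NNESGRSLYRKQTER"     # peptide used to raise the IS1 antiserum
EPITOPE_START = 325             # residue number of its first amino acid
SEMITRYPTIC = "NYLDQGALNNESGR"  # peptide whose N terminus marks site C


def first_residue_from_epitope(edman_prefix: str) -> int:
    """Residue number at which an N-terminal sequence starts inside the epitope."""
    offset = EPITOPE.index(edman_prefix)  # raises ValueError if not found
    return EPITOPE_START + offset


def first_residue_from_overlap(peptide: str, overlap_len: int = 6) -> int:
    """Residue number of the first amino acid of a peptide that ends with
    the first `overlap_len` residues of the epitope (assumed contiguous)."""
    overlap = EPITOPE[:overlap_len]
    if not peptide.endswith(overlap):
        raise ValueError("peptide does not overlap the start of the epitope")
    return EPITOPE_START - (len(peptide) - overlap_len)


if __name__ == "__main__":
    # ClpP1C' starts with YRK, which lies inside the epitope.
    print("First residue of ClpP1C':", first_residue_from_epitope("YRK"))
    # ClpP1C starts with NYLD, the beginning of the semitryptic peptide.
    print("First residue of ClpP1C:", first_residue_from_overlap(SEMITRYPTIC))
```

The residue numbers returned depend entirely on the overlap assumption and on the epitope numbering; they are meant to show the bookkeeping, not to replace the assignments in Fig. 2.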
The position of site N cleavage that generates the C terminus of ClpP1N was established indirectly. Based on the size of ClpP1N (21 kDa) and on the fact that its migration was unaffected in strain ClpP1HA176, cleavage had to occur not too far upstream of position 176. Indeed, a semitryptic peptide (DFSPNQDKDSAN, ending at position 170) was detected in the course of the electrospray ionization-tandem mass spectrometry analysis of the complex, which we thought could mark the C terminus of ClpP1N. To corroborate this assignment, we mutated this region in Strep·Tag-tagged ClpP1, replacing residues 168-178 with the dipeptide TG. In this strain (ClpP1ΔN-Strep), the apparent molecular mass of ClpP1N was slightly higher than that in the WT and in the Strep·Tag-tagged strain (Fig. 5B). This suggested that cleavage can still occur, but at a position slightly downstream. This region of the protein therefore appears prone to cleavage, be it at site N or in the vicinity. To attempt to abolish cleavage completely, we introduced the HA tag just downstream of the site N mutation (mutant ClpP1ΔNHA176), hoping to change the local conformation of the protein. Indeed, ClpP1N disappeared (Fig. 3B). However, we observed the appearance of a multiplicity of faint bands of intermediate molecular mass reacting with the anti-HA antibody. Accumulation of ClpP1H was reduced, as was that of the ClpP complex, as evidenced by the commensurate reduction in ClpR2 amount (data not shown). The mutant protein therefore appeared to be unstable and subject to cleavage at multiple positions. Still, the fact that the mutant could be easily obtained indicates that some functional ClpP complex accumulates in this strain. We concluded that cleavage at or near site N is not strictly necessary for ClpP assembly and function. Other cleavage events can occur when it is prevented, but the protein produced appears less stable. Because of the large reduction in ClpP1 amount in this strain, we were unable to determine whether ClpP1C and ClpP1C′ were still produced in the absence of site N cleavage.
As ClpP1 is itself a peptidase, we speculated that it could be responsible for its own maturation, and we attempted to mutate its catalytic Ser387. We were able to introduce the S387A mutation into the clpP1 gene, linked to the aadA cassette conferring antibiotic resistance. Despite all our efforts, the transformants remained heteroplasmic, i.e. retained the WT copy of the clpP1 gene on a fraction of their plastid chromosomal copies. After up to 12 rounds of subcloning on high concentrations of antibiotics, we achieved homoplasmy for the aadA resistance marker in some strains, but only to find that the S387A mutation had been lost, probably because of recombination with the WT clpP1 copies. This is diagnostic of a lethal mutation (12), and we concluded that not only ClpP1 but also its catalytic activity is essential for cell viability.
IS1 Is Essential in Chlamydomonas-We also investigated whether IS1 is dispensable in Chlamydomonas. Using restriction sites present in the plasmid carrying the mutations at both ends of IS1 (S60A/S346A), we removed the intervening sequence, generating a ClpP1 subunit of canonical length. Because this changed two residues at the junction between SD1 and SD2 (MEDD↓AKKV), we also generated a plasmid restoring the WT sequence (MEDR↓SKKV), corresponding to a clean excision of IS1. In both cases, we encountered the same problem as with the mutation of the catalytic Ser mentioned above: the mutation could be introduced, by virtue of its linkage to the resistance cassette, but it could not be brought to homoplasmy and was eventually lost by homologous recombination with the WT clpP1 copy. We concluded that IS1 plays an essential role in the function or biogenesis of the ClpP complex.
IS1 in Chlorophycean Algae-IS1 has been found in the ClpP1 protein not only of C. reinhardtii and C. eugametos (12) but also of the related multicellular alga Volvox carteri (11). These species belong to the family Chlamydomonadales, one of the five families of the class Chlorophyceae. We found an IS1 sequence in all of the Chlorophyceae clpP1 genes sequenced to date (26-29), including Chlamydomonas moewusii (another Chlamydomonadale); Scenedesmus obliquus (from the sister group Sphaeropleales); and the more distantly related Oedogonium cardiacum (an Oedogoniale), Floydiella terrestris (a Chaetopeltidale), and Stigeoclonium helveticum (a Chaetophorale). In contrast, no IS1 was found in ClpP1 in the other classes of chlorophyte algae: Ulvophyceae (Pseudendoclonium akinetum and Oltmannsiellopsis viridis), Trebouxiophyceae (Leptosira terrestris and Chlorella vulgaris), and Prasinophyceae (Nephroselmis olivacea, Ostreococcus tauri, and Ostreococcus lucimarinus). It was also not found in green algae belonging to the streptophyte lineage or in the early diverging Mesostigma viride of uncertain classification. This suggests that IS1 was acquired just before the radiation of the Chlorophyceae. The alignment of these IS1 sequences (Fig. 6A) is consistent with the phylogenetic tree that has been derived by Turmel and co-workers (27) for Chlorophyceae. Another unusual characteristic of Chlorophycean ClpP1 is the presence of a long C-terminal extension compared with higher plants. For example, the sequence of ClpP1 in F. terrestris extends 36 amino acids beyond the longest higher plant sequence (Lycopersicon esculentum).
Compared with the rest of the ClpP1 protein, sequence variability was high in IS1. Even within the small group formed by Chlamydomonadales and Sphaeropleales, similarity was limited to a few blocks separated by variable regions with many insertions/deletions (Fig. 6A). The sequence around cleavage site C (NYLD) was conserved in Chlamydomonadales and in S. obliquus, but that surrounding sites N and C′ was not. Still, we found evidence for ClpP1 processing in S. obliquus, in which our antibodies recognized two bands upon Western blotting (Fig. 6B, asterisk) that might correspond to ClpP1H and ClpP1N. Our antibodies showed no convincing cross-reaction with the other algae tested (C. moewusii, Gloeotilopsis paucicellularis, Uronema acuminatum, Chlorococum ellipsoïdum, Chlorella mirabilis, Leptosira obovata, Pseudendoclonium basiliense, Coccomyxa pringsheimii, and Coccomyxa rayssiae) (data not shown).

DISCUSSION
Because ClpP proteins are highly conserved through evolution, the presence in Chlamydomonas ClpP1 of an insertion sequence unrelated to any known protein has prompted speculations about its role, fate, and origin (11,12). It was proposed that IS1 is a new type of protein intron, distinct from inteins. Here, we have ruled out this hypothesis by showing that IS1 is not post-translationally spliced out, but is instead cleaved in at least three places to generate N- and C-terminal fragments that remain associated within the complex. ClpP1N ends approximately after the first third of IS1, whereas the two alternative C-terminal fragments start near the IS1 end. The N-terminal fragment of ClpP1 (ClpP1N), referred to previously as ClpP1L (11,13), appears responsible for most of the immunoreactivity, explaining why the C-terminal fragments had hitherto escaped our attention. But they can be clearly detected, either when they are marked with a tag, or in a purified ClpP complex, or with an IS1-specific antibody. Note that we consistently observed two C-terminal fragments, which we call ClpP1C and ClpP1C′. They are produced by cleavage in the C-terminal part of IS1 at either of the two sites C and C′. It is not known whether ClpP1C and ClpP1C′ subunits coexist within one complex or whether the position of the cleavage varies from one complex to the other. The intervening sequence between sites N and C/C′ is most likely lost from the complex and rapidly degraded because the anti-HA antibody recognized only ClpP1H in strain ClpP1HA176 (Fig. 3B). When sites C/C′ were mutated, a 39-kDa fragment could be detected (Fig. 5A) that probably corresponds to the C-terminal fragment produced by site N cleavage.
These cleavages are more likely to occur on assembled than on unassembled ClpP1 polypeptides. The small fragments may not be able to fold by themselves, especially ClpP1N, which of the ClpP structural elements retains only two α-helices and one β-strand. Interestingly, only a fraction of the ClpP1 subunits in the complex undergoes processing, as evidenced by the perfect co-migration of ClpP1H and ClpP1N in native gels (11). This, together with the large apparent size of the complex (540 kDa, compared with 350 kDa in higher plants), indicates the presence of more than one ClpP1 subunit per complex. Because IS1 processing entails the removal of ~15 kDa of protein between N- and C-terminal cleavage sites, the sharpness of the band suggests that the complex comprises a fixed number of unprocessed and processed subunits. Whether a given subunit is processed or not is most likely dictated by the nature of the neighboring subunits. The complex also comprises seven nuclear encoded subunits, some of which are proteolytically active (ClpP4 and ClpP5), whereas others are not (ClpR1-ClpR4 and ClpR6).
Can we similarly explain why some ClpP1 subunits are cleaved at site C and others at site C′? A mutation altering the sequence around site C slightly increases the abundance of ClpP1C′ while slightly reducing that of ClpP1C and vice versa. It is tempting to speculate that cleavage occurs at either of these sites, with a probability depending in part on the surrounding amino acid sequence. Because site N cleavage does not seem to be affected in site C/C′ mutants (the ClpP1H/ClpP1N ratio did not change), we propose that it occurs first. Cleavage at site C or C′ may be endoproteolytic or involve processive trimming of the newly formed N terminus. We note that E. coli ClpP is able to cleave its own 14-residue propeptide (2) and that the distance between sites C and C′ in ClpP1 (15 amino acids) is approximately twice that in the peptides released by ClpP-mediated degradation (6,30). Thus, we considered the possibility that ClpP performs its own processing. After being processively degraded over a certain length, the polypeptide chain would be stretched between the proteolytic chamber and the already folded SD2 domain, preventing further cleavage. Whether the last cleavage occurs at site C or C′ could be influenced by the nature of the residues at this position and by the propensity of the newly generated fragment to remain in the chamber or pop back out. Our efforts to test a possible role of ClpP1 in its own processing proved unsuccessful, as a mutation of the catalytic residue Ser387 remained heteroplasmic. We were surprised to find that not only the ClpP1 protein but also its activity is essential for Chlamydomonas survival, despite the presence of other catalytically active subunits in the complex.
Moreover, we found that a deletion of IS1 cannot be brought to homoplasmy, even though the shortened protein is expected to fold normally. IS1 has thus become essential in Chlamydomonas, either for the function of the complex or for its biogenesis. In contrast, IS1 cleavage can be severely affected without compromising viability. Mutations at sites C and C′ affected the ratio of the two C-terminal fragments and revealed the putative intermediate generated by site N cleavage. Tampering with the site N sequence led either to a shift in the cleavage position (ClpP1ΔN-Strep) (Fig. 5B) or to a diffuse processing pattern when it was accompanied by the insertion of the HA epitope (ClpP1ΔNHA176) (Fig. 3B). Note that some form of cleavage was always observed, and we cannot rule out that activation of the complex requires one form or another of IS1 removal.
IS1 is predicted to protrude at the apical face of the heptameric ring, where the docking of the chaperone is known to occur. In Arabidopsis, small "ClpS" subunits (ClpT is probably a more appropriate nomenclature) may also crowd that surface (31). Assembly of the chloroplast Clp protease appears therefore more complicated than in E. coli. This could be related to the diversification of ClpR in photosynthetic organisms. In cyanobacteria, a heptamer made of active ClpP3 has been proposed to interact with another one made of inactive ClpR to form a ClpP3-ClpR complex, distinct from that involving ClpP1 and ClpP2 (32). Cyanobacterial ClpP3 has given rise to chloroplast ClpP1 and ClpR2, whereas ClpR has diversified to produce ClpR1, ClpR3, and ClpR4. In Chlorophyceae, ClpR6 is derived from ClpP6 (11). However, some reorganization of the subunits must have occurred along the way, as in Arabidopsis, one heptameric ring containing ClpP1 and ClpR1-ClpR4 appears to be associated with another one made of ClpP3-ClpP6 (33). Whatever the organization of the Chlamydomonas complex, it is reasonable to assume that the presence of IS1 on at least one of the apical surfaces will impose strong constraints on the interaction with the chaperone and that IS1 processing represents a manner of activation of the ClpP complex.
The above discussion underscores how little we understand of the function and origin of IS1. Like many biological processes (mRNA editing, inteins, etc.), it seems to provide no competitive advantage, yet it has been conserved through the evolution of Chlorophyceae and has even become essential in Chlamydomonas. We must therefore take up a different point of view and ask how IS1 came about. More precisely, because ClpP1 is essential in all photosynthetic organisms (12,34,35), how did the system survive the introduction of a large additional domain disrupting a crucial interaction? And what changes occurred in the Clp system that now make IS1 essential?
First, we note that the interaction with the chaperone can be maintained if all ClpP1 subunits are on the same side of the complex (a hypothesis supported by the data in Synechocystis and Arabidopsis). The peptidase would simply have an active and an inactive surface with respect to chaperone docking and substrate entry. In this view, the function of IS1 processing is not to allow docking of the chaperone. The role of IS1 and its processing may instead be in regulating assembly of the complex.
The IS1 sequence shows no similarity to other proteins in the databases. This suggests that it results from the insertion within clpP1 of a DNA fragment that had no specific coding function or lost it in the process. Apparently, clpP1 is not the sole chloroplast gene of green algae that had to cope with such an adventitious insertion. Chloroplast genome sequences from a variety of Chlorophyta (26-29, 36-38) show that, whereas the most ancestral group (Prasinophyceae, exemplified by Mesostigma) has retained a streptophyte-like organization, the more evolved Trebouxiophyceae, Ulvophyceae, and Chlorophyceae (the "UTC" group) are highly rearranged, with general scrambling of gene order and many gene losses. This correlates with a steady increase in the number and extent of short dispersed repeats (39). Simultaneously, a general increase in ORF length can be observed (35), probably caused by the introduction of "junk" DNA within permissive regions of the genes. Examples of such insertion sequences are presented in supplemental Figs. S2-S5 for CemA, RpoB, RpoC1, and RpoC2. Some of these insertions appear long enough to fold as an independent domain within the host protein, just as IS1 does in ClpP1. However, except for the maintenance of ORF continuity, most show a low degree of sequence conservation, even compared with ClpP1 IS1. This suggests that they have not acquired specific functions. We speculate that many will turn out to be cleaved post-translationally.
Insertion sequences can also evolve into gene splitting. For example, RpoB2 contains an insertion sequence in Ulvophyceae and in Chlorella, which independently gave rise to gene splitting in Chlorophyceae and in Leptosira. The gene-splitting sites in C. reinhardtii RpoC1 and C. moewusii RpoC2 also coincide with the sites of large insertion sequences in other Chlorophyta.
In conclusion, the IS1 sequence found in the ClpP1 protein of C. reinhardtii and other Chlorophyceae is not a protein intron. It probably derives from a DNA insertion into the clpP1 gene of an ancestral Chlorophyceae. In a fraction of the ClpP1 subunits, it is proteolytically processed, probably after the assembly of the complex. During the course of its evolution, IS1 has acquired new functions that today make it an essential domain of an essential protein.