The Three Mitochondrial Encoded CcmF Proteins Form a Complex That Interacts with CCMH and c-Type Apocytochromes in Arabidopsis*

Three reading frames called ccmFN1, ccmFN2, and ccmFC are found in the mitochondrial genome of Arabidopsis. These sequences are similar to regions of the bacterial gene ccmF, which is involved in cytochrome c maturation. ccmF genes are always absent from animal and fungal genomes but are found in the mitochondrial genomes of land plants and of several evolutionarily distant eukaryotes. In Arabidopsis, ccmFN2 is not a pseudogene despite the absence of a classical initiation codon. The 3 ccmF genes of Arabidopsis are expressed at the protein level. Their products are integral proteins of the mitochondrial inner membrane with, in total, 11 to 13 predicted transmembrane helices. The conserved WWD domain of CcmFN2 is localized in the intermembrane space. The 3 CcmF proteins are all detected in a high molecular mass complex of 500 kDa by Blue Native PAGE. Direct interaction between CcmFN2 and both CcmFN1 and CcmFC is shown with the yeast two-hybrid split ubiquitin system, but no interaction is observed between CcmFN1 and CcmFC. Similarly, interaction is detected between CcmFN2 and apocytochrome c and also apocytochrome c1. Finally, CcmFN1 and CcmFN2 both interact with CCMH, which was previously shown to interact with cytochrome c as well. This strengthens the hypothesis that CcmF and CCMH form a complex that performs the assembly of heme with c-type apocytochromes in plant mitochondria.

Mitochondrial gene contents are highly conserved. The encoded genes include components of essential mitochondrial functions, i.e. subunits of respiratory complexes and components of the mitochondrial translation apparatus. A major difference in gene content between plant and animal mitochondria resides in the occurrence in plants of ribosomal protein genes (2) and of ccm genes (for cytochrome c maturation) (4,5). These genes are similar to bacterial sequences that genetic studies found to be involved in the biogenesis of c-type cytochromes (6). In Escherichia coli, 8 ccm genes (ccmA-H) are encoded at 1 locus in a single operon (7). In plants, by contrast, ccm genes are encoded by both the nucleus and mitochondria. In Arabidopsis, the CCMA, CCME, and CCMH genes are encoded in the nucleus (8,9), whereas ccmB, ccmC, ccmFN1, ccmFN2, and ccmFC are encoded in the mitochondrial genome, at 5 different loci (2). The latter 3 sequences are similar to different domains of the bacterial ccmF. ccmFN2 is not an "open reading frame" in the strict sense because it lacks a classical ATG initiation codon. However, transcription has been established for all the ccm sequences encoded in Arabidopsis mitochondria (10,11). Furthermore, ccm transcripts have the highest RNA editing rate within the Arabidopsis mitochondrial transcriptome, with 26 editing sites/kb on average (10). Notable features are that RNA editing shortens the reading frame of ccmFC by introducing a termination codon (12) and that RNA editing is necessary for the conservation of the WWD motif of CcmFN (4,5).
Cytochromes of c-type are ubiquitous electron transporters. They are defined by the covalent attachment of heme C prosthetic groups to apocytochromes via two thioether bonds between vinyl groups on the heme and conserved cysteines on the protein. In mitochondria, these proteins are essential components of the respiratory chain involved in the generation of cellular energy. Two forms of these key proteins coexist: a soluble protein, cytochrome c, in the mitochondrial intermembrane space, and a membrane protein, cytochrome c1, integrated into respiratory complex III (the cytochrome c reductase) in the mitochondrial inner membrane.
The term "cytochrome c maturation" refers to the processes leading to the covalent attachment of heme to the apocytochromes. Different pathways have been described for this maturation (13). In yeast and animal mitochondria, a maturation process called "system III" has evolved. It involves one or two cytochrome c heme lyases (CCHL proteins) (14). "System I" is found in α-proteobacteria, in most γ-proteobacteria, in some β-proteobacteria, in deinococci and archaea (6,15), and in mitochondria of land plants and of some protists and algae (16,17). The proteins used by cytochrome c maturation system I are termed Ccm proteins. Among them, in plants, CCMA and CcmB form an ABC transporter (8), and CCME is a heme chaperone that was proposed to bind heme covalently in the intermembrane space, where the covalent attachment of heme to apocytochromes takes place (18). Finally, CCMH interacts with apocytochrome c and can reduce the cysteines of apocytochrome c that must be reduced for the ligation with heme to occur (9). In bacteria, CcmABCDE are involved in the delivery of heme to its site of assembly (19-21). Co-immunoprecipitation experiments show direct associations between CcmC and CcmE and between CcmF and both CcmE and CcmH (21,22). Interestingly, a tryptophan-rich domain called the WWD domain is found in both CcmC and CcmF (CcmC and CcmFN2 in Arabidopsis). This motif has been proposed to serve as a hydrophobic platform for heme binding (19), but other studies in bacteria suggest that this motif is needed for the interaction between CcmC and CcmE (22,23). In both bacteria and plant mitochondria, CcmF is predicted to be the final player of cytochrome c maturation. It is believed to be the site where heme and c-type apocytochromes are bound. However, no interaction between CcmF and c-type cytochromes had been established. Moreover, no interaction had been detected between cytochrome c1 and any Ccm protein.
Here, we describe the submitochondrial localization of the 3 Arabidopsis CcmF proteins. We find them in a 500-kDa complex and show that they can interact directly. Furthermore, CcmFN1 and CcmFN2 can bind CCMH, and CcmFN2 is able to interact with both apocytochrome c and apocytochrome c1. These results, together with previous work, allow us to propose a mechanistic model for the maturation of c-type cytochromes.

EXPERIMENTAL PROCEDURES
Phylogenetic Analysis-CcmF protein sequences were aligned with the Muscle version 3.52 program (24). Poorly aligned and overly divergent positions were removed from the alignment using the Gblocks 0.91b program (25). A phylogenetic analysis was performed using PhyML (26) with 100 bootstrap replicates. The unrooted tree (Fig. 1A) was drawn with Treedyn (27).
Cell Fractionation and Purification of Mitochondria-Arabidopsis thaliana var. Landsberg erecta suspension cultures were maintained in a Gamborg G0210 basal medium containing 1 mg/liter 2,4-dichlorophenoxyacetic acid and 2% (w/v) sucrose (pH 5.8). Cell cultures of 100 ml were maintained in 250-ml conical flasks in the dark at 22°C on a rotary shaker at 150 rpm. Every 7 days, 10 ml of the culture was subcultured into 100 ml of fresh medium. Five-day-old cultures were used for the preparation of subcellular fractions as described previously (18). For mitochondrial extraction, cells were harvested by filtration through 100-µm nylon mesh, and mitochondria were extracted and purified by Percoll gradient centrifugation as described previously (28). Mitochondria were fractionated into mitoplast, membrane, and soluble fractions as described previously (18). The membrane fraction was subjected to alkaline treatment (0.1 M Na2CO3, pH 11.5, for 30 min at 4°C) to extract peripheral proteins.
Blue Native PAGE-Mitochondrial membrane complexes were resolved by Blue Native PAGE in the first dimension followed by SDS-PAGE in the second dimension as described previously (29). 400 µg of mitochondrial membrane proteins were resuspended in ACA750 buffer containing 750 mM amino dicaproic acid, 50 mM bis-Tris, and 0.5 mM Na2EDTA (pH 7.0). Protein complexes were solubilized with digitonin, 5/1 detergent/protein (w/w), for 30 min on ice and centrifuged at 100,000 × g for 15 min at 4°C, and 5% (v/v) Serva blue solution (750 mM ACA750 solution, 5% (w/v) Serva Blue G250) was added to the supernatant. For the first dimension, mitochondrial complexes were separated on 5-13% acrylamide (in 0.5 M amino dicaproic acid, 50 mM bis-Tris, pH 7.0, buffer) gradient gels, with 50 mM bis-Tris (pH 7.0) anode buffer and 50 mM Tricine, 15 mM bis-Tris, 0.02% (v/v) Serva Blue G-250 (pH 7.0) cathode buffer. Electrophoresis was carried out overnight at 5 mA. Gel lanes were cut out and denatured for 1 h at room temperature in 50 mM Tris-HCl (pH 6.8), 1% (w/v) SDS, and 1% (v/v) β-mercaptoethanol. For the second dimension, subunits of the various complexes were separated by SDS-PAGE according to the method of Schägger and von Jagow (30).
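The solubilization step above fixes the amount of digitonin through a detergent/protein mass ratio. The following is a minimal sketch of that arithmetic, assuming the 5/1 ratio is applied directly to the protein mass as stated (w/w); the function name is hypothetical and is not part of the original protocol.

```python
def detergent_mass_ug(protein_ug: float, ratio_w_w: float = 5.0) -> float:
    """Return the detergent mass (in µg) for a protein mass (in µg) and a w/w ratio."""
    if protein_ug < 0 or ratio_w_w < 0:
        raise ValueError("masses and ratios must be non-negative")
    return protein_ug * ratio_w_w


# 400 µg of mitochondrial membrane protein at 5/1 digitonin/protein (w/w)
digitonin_ug = detergent_mass_ug(400)
print(f"{digitonin_ug / 1000:.1f} mg digitonin")
```

With the quantities given in the text, this corresponds to 2 mg of digitonin for 400 µg of membrane protein.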
Protein Interactions by the Split Ubiquitin System-cDNA fragments representing the full-length Arabidopsis CcmFN1, CcmFN2, CcmFC, apocytochrome c (At1g22840), apocytochrome c1 (At5g40810), and CCMH (9) were amplified by PCR with oligonucleotides containing B1 and B2 recombination sites necessary for the entry of the cDNA fragments into vectors by in vivo recombination in yeast (sequences available upon request from the authors). PCR products were recombined with linearized pMet-X-Cub-PLV, pX-NubG, and pNubG-X vectors also containing B1 and B2 sites (35) to obtain "XCub, XNub, and NubX" constructs. XCub constructs express a fusion between the protein of interest, the C-terminal domain of ubiquitin, and the chimeric transcription factor PLV consisting of protein A, LexA, and VP16, under control of the methionine-repressible pMET25 promoter. XNub and NubX constructs express N-terminal and C-terminal fusions of the protein of interest with the N-terminal domain of ubiquitin (containing a point mutation that abolishes its ability to associate spontaneously with Cub) and the 3HA epitope, under the control of the pADH promoter. XCub constructs were transformed by heat shock into the haploid yeast strain AP4 (MATa), and XNub and NubX constructs into the haploid yeast strain AP5 (MATα), according to standard methods (36). Transformation in yeast was verified by growth on -Leu and -Trp media (-LW).
Protein interaction was monitored by the expression of the reporter genes ADE2, HIS3, and lacZ. The expression of ADE2 and HIS3 was visualized by the growth on -Ade-His media (-AH). The expression of lacZ was followed by measuring at OD420 the accumulation of the product metabolized by β-galactosidase with 2.2 mM 2-nitrophenyl β-D-galactopyranoside (Sigma) as substrate.
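The lacZ readout above is an absorbance at 420 nm that is later compared with background levels. A minimal sketch of such a comparison follows, assuming replicate OD420 readings are simply averaged and divided by the mean background; the function name and the readings are hypothetical illustrations, not data or procedures from this study.

```python
from statistics import mean


def fold_over_background(sample_od420, background_od420):
    """Mean OD420 of the sample replicates divided by the mean background OD420."""
    background = mean(background_od420)
    if background <= 0:
        raise ValueError("background OD420 must be positive")
    return mean(sample_od420) / background


# Hypothetical replicate readings, for illustration only
background_reads = [0.05, 0.05, 0.05]
interaction_reads = [0.5, 0.5, 0.5]
print(round(fold_over_background(interaction_reads, background_reads), 1))
```

A ratio close to 1 would indicate no reporter activation above background, whereas an interaction is read as a ratio well above 1.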

ccmF in Mitochondrial Genomes-In plants, ccm genes were inherited from the α-proteobacterial ancestor of mitochondria. Presently, some lineages have retained Ccm proteins; their genes are encoded in both mitochondria and the nucleus. Still, ccmF genes are only found in mitochondria. A high number of complete mitochondrial genomes covering the entire diversity of eukaryotic lineages have become available (37). We have searched mitochondrial genomes deposited in GOBASE (rel. 21, 2008) (gobase.bcm.umontreal.ca/) to try to understand the evolutionary history of ccmF genes among eukaryotes. ccmF, like all ccm genes, has been lost from all holozoa (including animals) and from fungi. In viridiplantae, ccmF genes are absent from all 10 available mitochondrial genomes of chlorophytes (including Mesostigma). In the land plants fully sequenced to date, including the bryophyte Marchantia polymorpha and the moss Physcomitrella patens (38,39), ccmF genes are always found. ccmF is also found in a wide array of other lineages: in charophytes, rhodophytes, alveolates, Discicristata, and jakobids (17) (Fig. 1A). For all these lineages, ccmF is found in some species but not in all of them. Altogether, the presence of ccmF genes in evolutionarily distant lineages, in some but not all mitochondrial genomes of the respective lineages, suggests that ccmF genes were lost from mitochondrial genomes at several time points during the evolution of eukaryotes. In particular, in viridiplantae, the available sequence data suggest that they were lost at least twice: at the separation between streptophytes and chlorophytes and during the evolution of charophytes.
The Orthologue of Bacterial ccmF Is Encoded by Multiple Genes in Land Plants-In land plants, beyond the 16 fully sequenced mitochondrial genomes, ccmF genes are also found in all the incomplete genomic sequences available, in mitochondrial genomes of evolutionarily distant organisms such as Ginkgo or Amborella. Thus ccmF appears to be strictly conserved in land plants. However, in this lineage, contrary to bacteria, ccmF is not encoded by a single gene. ccmF has been split into multiple genes, each orthologous to a different domain of the bacterial ccmF (Fig. 1B). This separation differs between species. In most plants, ccmF is split into 2 genes, e.g. in wheat (4,12). In M. polymorpha, the part encoding the C-terminal end of ccmF is further split into 2 genes (40). In Brassicaceae such as Arabidopsis, the part encoding the N-terminal end of ccmF is encoded by 2 genes (41,42). Thus, the 3 Arabidopsis ccmF genes are named ccmFN1, ccmFN2, and ccmFC. ccmFN2, which encodes the highly conserved WWD domain, does not start with an ATG initiation codon (2). The 3 Arabidopsis ccmF sequences have an estimated 1027 codons in total, as compared with the 647 of ccmF in E. coli. This difference is explained by the presence in plants of large insertions between the regions of high sequence conservation between prokaryotes and eukaryotes (Fig. 1B).
The Three Arabidopsis ccmF Genes Are Translated-Gene expression has been documented for ccmF genes for various plant mitochondrial genomes. In wheat, the translation of CcmFN and CcmFC has also been reported (4,12). In Arabidopsis, the absence of the ATG initiation codon in ccmFN2 suggested that ccmFN2 could be a pseudogene. However, transcription and RNA editing had been observed for the 3 Arabidopsis ccmF genes (10). Still, in plant mitochondria, many sequences and reading frames are transcribed but not translated (44,45). Similarly, RNA editing has been observed for pseudogene transcripts (46). A recent study in which Arabidopsis mitochondrial transcript ends were mapped by circular reverse transcriptase-PCR shows that no ATG codon has been brought 5′ of ccmFN2 by trans-splicing (47) or by RNA editing (10). Thus we had to test whether the 3 ccmF genes were indeed translated and whether ccmFN2 was a pseudogene. Therefore, we raised antibodies against peptides representing the 3 CcmF proteins. The antibodies were used to probe Arabidopsis cell fractions. Single bands were detected in the mitochondrial fractions only, at 42, 30, and 60 kDa for CcmFN1, CcmFN2, and CcmFC antibodies, respectively (supplementary Fig. S1A). These sizes correspond to the calculated sizes of the proteins, 382, 203, and 442 amino acids long, respectively. For the CcmFN2 reading frame, we calculated the predicted size starting from a GUG codon (see below). Thus, this confirms that the 3 CcmF proteins, including CcmFN2, are indeed translated. The quality of cell fractionation was assessed with antibodies directed against cytosolic, chloroplastic, and mitochondrial proteins.
The exact position of the CcmFN2 translation start, however, remains uncertain. A precise answer would be brought by the direct N-terminal sequencing of CcmFN2 purified from Arabidopsis mitochondria.
The antibodies were also used with mitochondrial extracts from other plants. The CcmFN2 antibodies detect bands of 53 and 58 kDa for wheat mitochondria. The 58-kDa protein is only detected in a soluble protein fraction, thus suggesting that it is not CcmFN, whereas the band of 53 kDa corresponds to the calculated size of wheat CcmFN and is mostly found in a membrane fraction. For CcmFC, as expected, a band of 60 kDa is detected in both wheat and Arabidopsis mitochondrial extracts. These results confirm at the protein level the differential organization of ccmF genes between wheat and Arabidopsis (supplementary Fig. S1B).
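The protein lengths quoted above for Arabidopsis can be checked against the total codon count given for the 3 reading frames and against the length of E. coli ccmF. The snippet below is a minimal sketch of that arithmetic; it uses only numbers stated in the text, and the CcmFN2 length assumes the GUG start discussed above. Variable names are illustrative.

```python
# Lengths (amino acids) reported for the 3 Arabidopsis CcmF proteins
arabidopsis_ccmf_aa = {"CcmFN1": 382, "CcmFN2": 203, "CcmFC": 442}

# Codon counts quoted in the text
ARABIDOPSIS_TOTAL_CODONS = 1027
E_COLI_CCMF_CODONS = 647

total_aa = sum(arabidopsis_ccmf_aa.values())
extra_vs_e_coli = total_aa - E_COLI_CCMF_CODONS

print(total_aa)          # sum of the 3 reading frames
print(extra_vs_e_coli)   # additional codons relative to E. coli ccmF
```

The sum of the 3 lengths equals the 1027 codons cited, which leaves 380 codons in excess of the E. coli gene, in line with the large plant-specific insertions mentioned in the text.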
Submitochondrial Localization and Topology of CcmF Proteins-The submitochondrial localization, the biochemical properties, and the topologies of the 3 Arabidopsis CcmF proteins were investigated to establish whether or not they correspond to the predicted function of CcmF. Mitochondria were fractionated into soluble and membrane fractions. The antibodies detected signals in the membrane fractions only for CcmFN1, CcmFN2, and CcmFC (Fig. 2A). Membrane proteins were further fractionated into peripheral and intrinsic membrane proteins by alkaline treatment. The 3 CcmF proteins were all found to be intrinsic membrane proteins. In bacteria, the topology of Rhodobacter CcmF was determined by fusing PhoA and LacZ to the predicted soluble domains of CcmF and by measuring the corresponding enzymatic activities (19,48). We used these data as well as the ConPred II transmembrane topology prediction program (49) to build topology models for the 3 Arabidopsis CcmF proteins (Fig. 2B). CcmFN1 could have 3 to 5 transmembrane helices, and CcmFN2 2 or 3 transmembrane helices. CcmFC, however, has a clear prediction of 6 helices. We investigated experimentally the topology of the CcmF proteins by preparing mitoplasts (mitochondrial matrix and inner membrane) and digesting them with trypsin. The 3 CcmF proteins were detected in the mitoplast fraction, indicating that they are localized in the mitochondrial inner membrane. After trypsin digestion, the 42-kDa band of CcmFN1 was reduced to 35 kDa, the 30-kDa band of CcmFN2 was unchanged, and the 60-kDa signal of CcmFC was nearly undetectable (Fig. 2A). For CcmFN1, all models place trypsin cleavage sites of the C-terminal domain in the intermembrane space, where they are accessible. Their cleavage results, as observed, in a shorter 35-kDa protein that is still detected, which indicates that the peptide used as an epitope is indeed oriented toward the matrix.
Similarly, for CcmFC, accessible cleavage sites are located in the third intermembrane space loop. Their cleavage generates small fragments, as short as 4.8 kDa, that contain the antibody epitope and are not detectable here. Accordingly, the 60-kDa signal disappeared for CcmFC. For CcmFN2, trypsin cleavage sites present at the N-terminal end of the protein are not accessible to the protease in either of the two models. Trypsin sites are also present in the C-terminal domain, which includes the CcmFN2 epitope. With the 3-transmembrane-domain model, these sites would be accessible on the outside of mitoplasts, whereas with the 2-transmembrane-domain model, the sites are localized in the matrix.
Here, the inner membrane appears to have protected the C-terminal domain from digestion, so the 2-helix model is favored. The tryptophan-rich WWD domain is present in a large loop in the intermembrane space (Fig. 2B). Altogether, the results designate the 3 CcmF proteins as intrinsic proteins of Arabidopsis mitochondrial inner membranes, with the conserved WWD domain oriented toward the intermembrane space. These features are in agreement with the predicted function of CcmF.
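The protease protection reasoning above combines two pieces of information: where trypsin sites lie in the sequence and which regions a topology model exposes on the outside of mitoplasts. The following is a minimal sketch of that logic, assuming the common rule of thumb that trypsin cleaves after lysine or arginine unless the next residue is proline. The function names, the toy sequence, and the exposed region are hypothetical and do not correspond to the actual CcmF sequences or models.

```python
def trypsin_sites(seq: str) -> list[int]:
    """1-based positions after which trypsin is expected to cleave.

    Rule of thumb (assumption): cleavage after K or R, except before P.
    The last residue is ignored because nothing follows it.
    """
    seq = seq.upper()
    return [i + 1 for i in range(len(seq) - 1)
            if seq[i] in "KR" and seq[i + 1] != "P"]


def accessible_sites(seq: str, exposed_regions: list[tuple[int, int]]) -> list[int]:
    """Keep the sites located in regions (1-based, inclusive) facing the protease."""
    return [pos for pos in trypsin_sites(seq)
            if any(start <= pos <= end for start, end in exposed_regions)]


# Toy sequence and toy topology model, for illustration only
toy_seq = "MAKLLRPGGKWWDAR"
toy_exposed = [(8, 15)]  # residues assumed to face the intermembrane space

print(trypsin_sites(toy_seq))
print(accessible_sites(toy_seq, toy_exposed))
```

Under a given model, a protein with no accessible site is predicted to remain intact after digestion of mitoplasts, which is the kind of prediction compared with the observed bands for each CcmF protein.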
The Three CcmF Proteins Are Detected in a Complex of 500 kDa-Because the 3 Arabidopsis CcmF proteins are similar to different domains of a single bacterial protein, we investigated whether in plant mitochondria these 3 proteins could be assembled into a single complex that would be the functional orthologue of bacterial CcmF. For this, we first used one- and two-dimensional Blue Native gels. We prepared mitochondrial membranes and solubilized their complexes with digitonin. The complexes were separated according to their sizes on one-dimensional Blue Native PAGE. For the second dimension, their respective subunits were resolved on SDS-PAGE. One- and two-dimensional gels were transferred onto membranes and reacted with the CcmF antibodies. For one-dimensional experiments, each antibody detected a signal corresponding to a 500-kDa complex (Fig. 3A). Two-dimensional experiments were then performed to verify that the 500-kDa signals truly corresponded to CcmF proteins. For CcmFN1, a signal at 42 kDa corresponding to complexes ranging from 470 to 620 kDa was detected; the strongest signal was for a 500-kDa complex. For CcmFN2, a double signal was observed at 30 kDa, corresponding to complexes of 500 and 600 kDa. Finally, for CcmFC, a unique signal at 60 kDa, corresponding to a complex of 500 kDa, was observed (Fig. 3B). Thus, the 3 CcmF proteins are indeed all detected in complexes of 500 kDa. This does not prove a direct interaction between the 3 Arabidopsis CcmF proteins but strongly suggests that they are all part of a cytochrome c maturation complex, together with other proteins, in particular CCMH, which was also found in a 500-kDa complex (9).
The Three CcmF Proteins Can Be in Direct Interaction-The direct interaction of the 3 Arabidopsis CcmF proteins was investigated by "split ubiquitin," a genetic system derived from yeast two-hybrid and designed for membrane-bound hydrophobic proteins (35,50). Briefly, a protein of interest is fused to the C-terminal domain of ubiquitin followed by the PLV transcription factor (XCub constructs). A second protein of interest is fused at its N or C terminus to a mutated version of the N-terminal domain of ubiquitin (XNub and NubX constructs). If an interaction takes place between the 2 proteins of interest, a functional ubiquitin is reconstituted and ubiquitin-specific proteases release the transcription factor, which migrates to the yeast nucleus and activates the expression of reporter genes. We cloned full-length cDNA fragments corresponding to fully edited transcripts of CcmFN1, CcmFN2, and CcmFC into split ubiquitin vectors. We first cotransformed the XCub constructs with the respective Nub empty vectors and plated them on medium lacking leucine, tryptophan, adenine, and histidine (-LWAH). This was done to verify that XCub fusion proteins were bound to membranes and could not freely diffuse to the yeast nucleus to activate the expression of the reporter genes ADE2 and HIS3 without protein interaction. FN1Cub, FN2Cub, and FCCub alone were, as expected, unable to activate the reporter genes (Fig. 4). Then, we cotransformed the XCub constructs with all the XNub and NubX constructs and plated them on -LWAH. For FN1Cub, we observed growth with FN2 constructs; for FN2Cub, growth was observed with both FN1 and FC constructs; and for FCCub, with FN2 constructs (Fig. 4A). This shows that CcmFN2 can interact with CcmFN1 and CcmFC, whereas CcmFN1 cannot interact with CcmFC. These interactions were also monitored by the activation of lacZ, a third reporter gene. Background levels were estimated with wild-type yeast cells (Fig. 4B).
The signals resulting from the interaction of CcmFN2 with CcmFN1 and with CcmFC were found to be on average 10 times higher than background levels, whereas with double transformants from CcmFN1 and CcmFC constructs, β-galactosidase activity never rose significantly above background levels (Fig. 4B). The results show that CcmFN2 can interact with both CcmFN1 and CcmFC and thus suggest that the 3 mitochondrial-encoded Arabidopsis CcmF proteins are assembled in vivo into a protein complex that reconstitutes a bacterial-like CcmF.
CcmFN2 Interacts with c-Type Apocytochromes-Because the assembly of heme with c-type cytochromes was predicted to take place at the level of CcmF (13,51), we analyzed whether one or all of the plant mitochondrial CcmF proteins were able to bind c-type apocytochromes, i.e. apocytochrome c and/or apocytochrome c1. As described above, we cloned apocytochrome c and apocytochrome c1 into split ubiquitin vectors. We verified that the apocytochrome c1 Cub construct alone did not activate the expression of reporter genes. The soluble apocytochrome c Cub fusion protein was, as expected, not attached to membranes and could alone activate the expression of reporter genes (Fig. 5A). Therefore, interaction with apocytochrome c could only be tested with Nub constructs. We cotransformed the apocytochrome c1 Cub construct with all the CcmF Nub and NubF constructs, and the CcmF Cub constructs with the apocytochrome c and apocytochrome c1 XNub and NubX constructs. For the apocytochrome c1 Cub construct, we observed growth with the FN2 constructs only; for FN2Cub, growth was observed with both apocytochrome c and apocytochrome c1 constructs. For the FN1Cub and FCCub constructs, no growth was observed with any of the apocytochrome c and apocytochrome c1 constructs (Fig. 5A). The activation of lacZ gave results similar to those of ADE2 and HIS3 (Fig. 5B). The results show that CcmFN2 can interact with both c-type apocytochromes and suggest that in vivo the assembled CcmF indeed interacts with c-type apocytochromes at the level of CcmFN2.
Two CcmF Proteins Interact with CCMH-Because apocytochrome c had been found to interact with CCMH as well in a previous study (9), the interaction of CcmF proteins with CCMH was also investigated. First, split ubiquitin assays showed that CCMH Cub fusion proteins could interact with both CcmFN1 and CcmFN2 Nub fusions, but not with the CcmFC fusion (Fig. 6A). Then, to better characterize the regions responsible for these interactions, the four largest domains of CcmF proteins, localized in the intermembrane space according to our topology models (Fig. 2), were cloned in yeast two-hybrid vectors. Interactions were tested against the D1 domain of CCMH, localized as well in the intermembrane space (9). The activation of reporter genes was observed when the CCMH-D1 AD construct was co-transformed with CcmFN1 domain 5 and with CcmFN2 domain 2 constructs but not with CcmFN1 domain 3 and CcmFC binding domain 6 constructs (Fig. 6B). This suggests that CCMH can interact with CcmF at the level of both CcmFN1 and CcmFN2 and that these interactions are, at least in part, mediated by two domains localized in the intermembrane space in CcmFN1 and CcmFN2.
Overall, these results together with previous work (9) suggest that the attachment of heme to c-type apocytochromes is done in cooperation with CCMH at the level of CcmF.
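The pairwise interactions reported in this section can be summarized as a small undirected graph, which makes the central position of CcmFN2 explicit. The sketch below is only a bookkeeping aid: it lists the interactions stated in the text, includes the CCMH-apocytochrome c pair from previous work (9), and uses illustrative names for the helper function and variables.

```python
from collections import defaultdict

# Positive interactions stated in the text; the last pair is from previous work (9)
interactions = [
    ("CcmFN2", "CcmFN1"),
    ("CcmFN2", "CcmFC"),
    ("CcmFN2", "apocytochrome c"),
    ("CcmFN2", "apocytochrome c1"),
    ("CcmFN1", "CCMH"),
    ("CcmFN2", "CCMH"),
    ("CCMH", "apocytochrome c"),
]


def partners(pairs):
    """Build an undirected adjacency map from a list of interacting pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


graph = partners(interactions)
hub = max(graph, key=lambda protein: len(graph[protein]))

for protein in sorted(graph, key=lambda p: -len(graph[p])):
    print(protein, len(graph[protein]), sorted(graph[protein]))
```

In this summary CcmFN2 has the most partners, and CcmFN1 and CcmFC are connected only through CcmFN2 (and, for CcmFN1, through CCMH), consistent with the absence of a detected CcmFN1-CcmFC interaction.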

DISCUSSION
The analysis of the many fully sequenced mitochondrial genomes that have become available (37,52) suggests that ccmF genes have been lost from mitochondrial genomes at several time points during the evolution of eukaryotes. During this evolution, CCHL proteins that define eukaryotic system III have appeared. They are found in animals and fungi (14) but also in viridiplantae, i.e. in Chlamydomonas (53). This could mean that CCHL appeared early during the evolution of eukaryotes, or that CCHL evolved later and was acquired by species such as Chlamydomonas by horizontal gene transfer. Beyond viridiplantae, ccmF is found in other eukaryotic lineages. In some alveolates such as Paramecium and Tetrahymena, ccmF is found in mitochondrial genomes, whereas all the other ccm genes have been lost. Little information is available on the nuclear sequences of these organisms. Thus, it is impossible to predict whether these species still use system I proteins for cytochrome c maturation, with the other ccm genes having been transferred to the nucleus, or use system III, in which case the remaining ccmF would likely be nonfunctional in these organisms. New clues on the evolution of cytochrome c maturation pathways in eukaryotes will be brought by the identification of the ancestor of CCHL.
In non-land plant eukaryotes where ccmF is found in mitochondrial genomes, ccmF is present as a single gene, as in bacteria. However, in land plants, ccmF is split into 2 or 3 genes depending on the species. This division of the plant ccmF must have taken place through the recombination of mitochondrial genomes. Recombination is frequent in higher plant mitochondrial DNA (54,55). It has been shown to create numerous aberrant nonfunctional reading frames (56,57). However, we find that ccmFN2, even without an AUG initiation codon, is not a pseudogene. In plant mitochondria, sequence analyses suggest that translation is most of the time, but not always, initiated with an AUG codon. In Arabidopsis, GGG, AAU, and GUG are possible additional translation initiation codons (2). Similarly, GUG in Oenothera and ACG in radish are potential translation initiation sites (58,59). Unlike chloroplast mRNAs, plant mitochondrial mRNAs do not have Shine-Dalgarno-like sequences (60). Thus, the precise mechanism of translation initiation is still unknown in plant mitochondria.
Our results suggest that after translation the 3 Arabidopsis CcmF proteins are integrated into a 500-kDa complex. In wheat, CcmFC was detected in a 700-kDa complex solubilized with DDM (12), rather than with digitonin as used here. Besides the difference in detergent, this size difference could also mean that the CcmF complex has evolved differently in distinct species. The precise nature and protein content of the complex are unknown. The combination of CcmFN1, CcmFN2, and CcmFC makes a 130-kDa protein complex; thus it appears that the CcmF proteins are either present as multimers or associated with other proteins, in particular additional Ccm proteins. Indeed, CCMH was also detected in a complex of 500 kDa (9). In contrast, CCMA was found in a 480-kDa complex (8), thus showing that CCM proteins are found in at least 2 distinct complexes. They could correspond to a heme lyase and a heme delivery complex. Moreover, Ccm proteins could be associated with other assembly factors or with respiratory complexes during their assembly.
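The size argument above rests on simple arithmetic: the apparent masses of the 3 CcmF proteins account for only part of the 500-kDa complex. The snippet below is a rough sketch of that comparison using the band sizes reported in the Results; masses estimated from SDS-PAGE and Blue Native PAGE are approximate, so the figures are indicative only and do not establish a stoichiometry. Variable names are illustrative.

```python
# Apparent masses (kDa) of the bands detected for each CcmF protein
apparent_kda = {"CcmFN1": 42, "CcmFN2": 30, "CcmFC": 60}
COMPLEX_KDA = 500  # apparent size of the complex on Blue Native PAGE

one_copy_each = sum(apparent_kda.values())       # one copy of each protein
unaccounted = COMPLEX_KDA - one_copy_each        # mass not explained by one copy each
ratio = COMPLEX_KDA / one_copy_each              # complex size relative to one copy each

print(one_copy_each, unaccounted, round(ratio, 1))
```

One copy of each protein gives about 130 kDa, so most of the apparent mass must come from additional copies of CcmF proteins, from other partners such as CCMH, or from both.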
Here, we observe that CcmFN2 can be in direct interaction with both CcmFN1 and CcmFC. The conserved transmembrane domains of CcmF proteins could be responsible for these interactions. Alternatively, the interactions could take place through the additional domains present in plant proteins and absent from bacteria. We find as well that CcmF is able to directly interact with both c-type apocytochromes at the level of CcmFN2 and that CCMH can interact with both CcmFN1 and CcmFN2. Thus CcmFN2 can interact with five different proteins. Some interactions, those with the other CcmF proteins and CCMH, are likely to be stable, and others, those with c-type apocytochromes, are likely to be transient. The specificity of system I toward its numerous apocytochrome substrates appears to be limited to the two cysteines and the histidine of the heme binding motif CXXCH (61). Because apocytochromes and Ccm proteins are thought to interact in a transient way, the identification of a complex between them is not easy. In bacteria, no interaction between apocytochrome and any Ccm protein could be detected, especially with the CcmF and CcmH proteins, which are proposed to form a heme lyase complex (22). However, the fact that paralogs of these proteins (NrfE, NrfF, and NrfG) are needed for heme ligation on the noncanonical CXXCK site of the cytochrome c552 nitrite reductase (NrfA) supports their potential role in holding the apocytochrome substrate (63). The best candidates for the recognition of the apocytochrome heme binding site are NrfG, the C-terminal domain of E. coli CcmH, and Rhodobacter CcmI; these proteins all contain tetratricopeptide domains known to be involved in protein-protein interactions (62,64). The recent structure analysis of NrfG, modeling studies, and glutathione S-transferase pull-down assays suggest that its tetratricopeptide domain mediates the interaction with NrfA (43), and it is conceivable that the Ccm paralog of NrfG might play a similar role in the Ccm heme lyase complex.
The situation is quite different in plant mitochondrial system I. Plant genomes encode no ortholog of CcmI, although it is possible that it was not identified because of low sequence similarity. It is also possible that its function has been taken over by another protein. We have previously shown that CCMH can interact with apocytochrome c.
Here, we show that CcmFN2 can interact with both c-type apocytochromes and with CCMH. Thus we propose that CCMH and CcmFN2 have the apocytochrome binding function in plant mitochondria.
We can derive a mechanistic model for the final steps of cytochrome c maturation in plant mitochondria from the data presented here and from previous results obtained with the different system I model organisms (reviewed in Ref. 17). The 3 CcmF proteins form a complex together with CCMH. Apocytochromes could be docked on CcmFN2, where they are reduced by CCMH. For heme delivery, CCME interacts with the CcmC WWD domain to bind heme. After that, CCME carrying the heme could bind CcmF through the second WWD domain on CcmFN2, thus close to apocytochromes. Then, being in close proximity and presented in the right conformation, apocytochromes and heme would be covalently bound. Accordingly, in this model, CcmF is the converging point of two pathways, heme delivery and apocytochrome c reduction. This model will have to be confirmed experimentally, for example through precise investigation of interactions between CCME and CcmFN2 and by testing the effect of mutations in the WWD domain on protein interactions.