Examination of the Dimerization States of the Single-stranded RNA Recognition Protein Pentatricopeptide Repeat 10 (PPR10)*

Background: PPR (pentatricopeptide repeat) proteins in the dimeric state can specifically recognize single-stranded RNA.
Results: Different amino-terminal lengths and critical point mutations in the PPR protein lead to distinct dimerization states.
Conclusion: PPR proteins are regulated to adopt different dimerization states.
Significance: These biochemical studies advance our understanding of PPR protein regulation and can facilitate biotechnological applications of PPR proteins.
Pentatricopeptide repeat (PPR) proteins, particularly abundant in plastids and mitochondria of angiosperms, include a large number of sequence-specific RNA binding proteins that are involved in diverse aspects of organelle RNA metabolism. PPR proteins contain multiple tandem repeats, and each repeat can specifically recognize an RNA base through residues 2, 5, and 35 in a modular fashion. The crystal structure of PPR10 from maize chloroplast is dimeric both in the absence and presence of the 18-nucleotide psaJ RNA element. However, previous biochemical analysis suggested a shift of PPR10 to the monomeric state upon RNA binding. In this report, we show that the amino-terminal segments of PPR10 determine the dimerization state of PPR10. A single amino acid alteration of cysteine to serine within repeat 10 of PPR10 further drives dimerization of PPR10. The biochemical elucidation of the determinants for PPR10 dimerization may provide an important foundation for understanding the working mechanisms of PPR proteins underlying their diverse physiological functions.

Pentatricopeptide repeat (PPR) proteins, prevalent across eukaryotic kingdoms and especially abundant in terrestrial flowering plants, function as sequence-specific single-stranded RNA binding proteins (1-6). PPR proteins mainly exist in chloroplasts and mitochondria, where they are involved in diverse aspects of organelle RNA metabolism, including RNA editing, maturation, stability, and translation (3,4,7). Mutants of PPR proteins with important organelle functions frequently fail to survive through the embryonic stage, leading to high lethality (3). Various plant physiological studies on PPR proteins have revealed their crucial roles in ubiquitous plant development processes such as fertility, embryogenesis, circadian response, and organ separation (3,7,8). For example, certain Rf (restorer-of-fertility) genes encode PPR proteins that overcome cytoplasmic male sterility (9-12). In human mitochondria, the PPR protein leucine-rich pentatricopeptide repeat motif-containing protein (LRPPRC) is closely associated with the French-Canadian type of Leigh syndrome and adenocarcinoma (13,14).
PPR10 from maize chloroplast has been extensively exploited as a prototype for deciphering the RNA recognition code (18,21,22). PPR10, containing 19 repeats, targets two native RNA elements from the atpI-atpH and psaJ-rpl33 intergenic regions, referred to as "atpH" and "psaJ", respectively (21,22). PPR10 is proposed to act as a site-specific barrier that protects target RNA from nuclease degradation and to enhance mRNA translational activation by remodeling the ribosomal binding site (21,22). We recently determined the crystal structures of PPR10 in the RNA-free and psaJ RNA-bound forms (15). The crystal structure provides the first glimpse of sequence-specific recognition of single-stranded RNA by PPR proteins. Interestingly, whereas analytical ultracentrifugation analysis suggested that atpH-bound PPR10 is a monomer (18), two molecules of PPR10 intertwine into an anti-parallel homodimer both in the absence and presence of the psaJ RNA element in the crystal structures (15).
PPR proteins may be monomeric or dimeric. For example, PPR4, PPR5, and THA8 from maize and THA8L from Arabidopsis exist as monomers (23-26), whereas the PPR protein HCF152 from Arabidopsis was identified as a homodimer (27). These observations raise the possibility that different dimerization states of PPR proteins may correspond to their diverse functions. Is it possible that PPR10 can be intrinsically regulated to adopt different dimerization states, each suited to a particular physiological role? If so, what are the determining factors for the dimerization states of PPR10? In this report, we sought to answer these questions through extensive biochemical analysis.

EXPERIMENTAL PROCEDURES
Protein Preparation-The codon-optimized cDNA of full-length PPR10 (gene ID 100302579) from Zea mays was subcloned into the pET15b vector (Novagen). Overexpression of PPR10 protein was induced in Escherichia coli BL21(DE3) with 0.2 mM isopropyl-β-D-thiogalactoside at an optical density of 1.2 at 600 nm. After growing for 16 h at 16°C, the cells were collected and homogenized in a buffer containing 25 mM Tris-HCl, pH 8.0, and 150 mM NaCl. After sonication and centrifugation, the supernatant was applied to Ni2+ affinity resin (nickel-nitrilotriacetic acid, Qiagen) and further fractionated by heparin affinity chromatography (HiPrep Heparin FF 16/10, GE Healthcare) and gel filtration chromatography (Superdex-200 10/30, GE Healthcare). The PPR10 mutants were generated using two-step PCR and subcloned, overexpressed, and purified in the same way as the wild-type protein.
FIGURE 1. Sequence alignment of the repeats in PPR10. A, a sequence logo generated using repeat sequences of PPR10 reveals relatively conserved amino acid positions within the repeats. The amino acids in the repeats are colored based on the physicochemical properties of the side chains: small (Gly (G) and Ala (A)) in black, nucleophilic (Ser (S), Thr (T), and Cys (C)) in blue, hydrophobic (Val (V), Leu (L), Ile (I), Met (M), and Pro (P)) in green, aromatic (Phe (F), Tyr (Y), and Trp (W)) in red, acidic (Asp (D) and Glu (E)) in light pink, amide (Asn (N) and Gln (Q)) in dark pink, and basic (His (H), Lys (K), and Arg (R)) in orange (33,44). B, the principal portion of PPR10 repeats (residues 107-771) is summarized in the table of 19 repeat alignment. The molecular determining residues for RNA recognition at the 2nd, 5th, and 35th positions are shaded light gray.
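The color scheme in the Figure 1A legend amounts to a lookup from residue to side-chain class. A minimal Python sketch of that lookup follows; the names `CLASS_COLORS`, `RESIDUE_LOOKUP`, and `residue_class` are illustrative and do not come from the original work, and glutamate is entered under its standard one-letter code E.

```python
# Side-chain classes and colors as listed in the Figure 1A legend.
# All identifiers here are illustrative, not taken from the original study.
CLASS_COLORS = {
    "small": ("GA", "black"),
    "nucleophilic": ("STC", "blue"),
    "hydrophobic": ("VLIMP", "green"),
    "aromatic": ("FYW", "red"),
    "acidic": ("DE", "light pink"),
    "amide": ("NQ", "dark pink"),
    "basic": ("HKR", "orange"),
}

# Invert to a per-residue lookup: one-letter code -> (class, color).
RESIDUE_LOOKUP = {
    aa: (name, color)
    for name, (letters, color) in CLASS_COLORS.items()
    for aa in letters
}


def residue_class(aa):
    """Return (class, color) for a one-letter amino acid code."""
    return RESIDUE_LOOKUP[aa.upper()]


print(residue_class("C"))
print(residue_class("f"))
```

Checking that the lookup holds exactly 20 entries is a quick way to confirm that every standard amino acid is assigned to one class and none is assigned twice.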
Data Collection, Structure Determination, and Refinement-The diffraction data were collected at Shanghai Synchrotron Radiation Facility, integrated, and scaled with the HKL2000 package (28). Further data processing was carried out using programs from the CCP4 suite (29). To determine the complex structure of the triple cysteine mutant PPR10 (residues 69-786, C256S/C430S/C449S), the previously resolved quadruple cysteine mutant PPR10 (residues 69-786, C256S/C279S/C430S/C449S) (Protein Data Bank code 4M59) was selected as the molecular replacement model. Molecular replacement was performed with the program PHASER, and manual model rebuilding and refinement were iteratively performed with COOT and Phenix (30,31). Native data collection and refinement statistics are summarized in Table 1.
Size Exclusion Chromatography (SEC)-The SEC analyses were performed with an SD200 column (Superdex-200 10/30, GE Healthcare) in a buffer containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 2 mM DTT. To examine the dimerization state of PPR10 and its mutants, 500 μl of protein at 10 μM was applied each time onto the Superdex-200 column. To examine the interactions between protein and RNA, 500 μl of the indicated protein at 10 μM was incubated with target RNA oligonucleotides at a protein:RNA molar ratio of ~1:1.5 at 4°C for ~40 min before injection.
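As a worked check of the loading conditions, reading the protein sample as 500 μl at 10 μM and the protein:RNA molar ratio as roughly 1:1.5, the amounts per injection can be computed as below. This is only an arithmetic sketch; the helper name `nmol` is illustrative.

```python
def nmol(conc_uM, vol_ul):
    """Amount in nmol for a concentration in µM and a volume in µl.

    µM x µl gives pmol, so divide by 1000 to obtain nmol.
    """
    return conc_uM * vol_ul / 1000.0


protein_nmol = nmol(10.0, 500.0)   # protein loaded per SEC run
rna_nmol = protein_nmol * 1.5      # RNA at a ~1:1.5 protein:RNA molar ratio

print(f"protein: {protein_nmol} nmol, RNA: {rna_nmol} nmol")
```

Under these readings each run carries 5 nmol of protein and about 7.5 nmol of RNA, so the RNA is in modest molar excess over the protein.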
For EMSA, PPR10 with different amino-terminal boundaries and other variants carrying the indicated point mutations were incubated with ~40 pM 32P-labeled RNA probe in final binding reactions containing 40 mM Tris-HCl, pH 7.5, 100 mM NaCl, 4 mM DTT, 0.1 mg/ml BSA, 5 μg/ml heparin, and 10% glycerol at room temperature (22°C) for 20 min. Reactions were then resolved on 6% native acrylamide gels (37.5:1 acrylamide:bisacrylamide) in 0.5× Tris glycine buffer under an electric field of 15 V/cm for 40 min. Vacuum-dried gels were visualized on a phosphor screen (Amersham Biosciences) with a Typhoon Trio Imager (Amersham Biosciences).
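The text does not describe how dissociation constants were derived from the gel shifts. One common approach, sketched here as an assumption rather than as the authors' procedure, is to fit the bound fraction to a single-site isotherm, fraction bound = [P] / (Kd + [P]), which is reasonable when the labeled probe concentration is well below Kd. The function names, the search bounds, and the titration series below are all hypothetical; the "observed" values are generated from the model itself so that the fit can be checked.

```python
import math


def fraction_bound(protein_nM, kd_nM):
    """Single-site binding isotherm, valid when probe << Kd (assumption)."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, bound, lo=1e-3, hi=1e4, iters=200):
    """Least-squares Kd (nM) by golden-section search over log10(Kd)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((fraction_bound(p, kd) - b) ** 2
                   for p, b in zip(protein_nM, bound))

    a, b = math.log10(lo), math.log10(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0          # golden ratio conjugate
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if sse(c) < sse(d):
            b = d                              # minimum lies in [a, d]
        else:
            a = c                              # minimum lies in [c, b]
        c, d = b - g * (b - a), a + g * (b - a)
    return 10.0 ** ((a + b) / 2.0)


# Hypothetical titration (nM), simulated from the model with Kd = 1.0 nM.
conc = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 50.0]
obs = [fraction_bound(p, 1.0) for p in conc]

kd_fit = fit_kd(conc, obs)
print(f"fitted Kd = {kd_fit:.3f} nM")
```

Searching in log10(Kd) keeps the fit well behaved across several orders of magnitude. With real densitometry data the residuals would not vanish, and a replicate-based error estimate would be needed.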

N-terminal Boundaries Determine Dimerization State of PPR10 in Complex with RNA-In our previous structural study, we undertook a systematic protein engineering effort on PPR10 for crystallization. Because oxidation of surface cysteine residues can compromise protein homogeneity and hamper uniform crystal packing, we generated cysteine-to-serine mutants for each of the 18 cysteines in the PPR10 repeat region and combined them. After trials on a myriad of truncations, one construct of PPR10, residues 69-786 (C256S/C279S/C430S/C449S), which behaved excellently in complex with the psaJ RNA element, gave rise to crystals, and the dimer formation of two PPR10 molecules in complex with RNA was finally observed in the refined structure (Fig. 2A) (15). The 19 PPR repeats (residues 107-771) constitute the principal portion of PPR10, preceded by three α-helices forming the N-terminal domain (NTD, residues 74-106) (Fig. 2B). The amino-terminal fragment of unknown function preceding the NTD is predicted to be a flexible region containing the chloroplast transit peptide and was removed during the structural study to facilitate crystallization. Together with previously reported data (15,18,21,22), these accumulated results describing PPR10 dimerization states aroused our interest and raised two provocative, yet unanswered questions in PPR biology. Is it possible that the amino-terminal length of PPR10 determines protein dimerization? Or is it possible that an amino acid mutation leads to the dimeric state of PPR10? To answer these questions, we performed extensive biochemical investigations.
Simultaneously, we tested the binding capacity of the amino-terminal length variants of PPR10 for its native target RNA elements by EMSA. PPR10-37 binds to the 18-nucleotide psaJ and 17-nucleotide atpH elements with dissociation constant (Kd) values of ~0.8 nM and ~1.0 nM, respectively, indicative of strong interaction with the RNA elements (Fig. 3, A-C). For PPR10-48 and PPR10-69, in sharp contrast, the Kd values were more than 10-fold higher than those of PPR10-37, implying much weaker interactions (PPR10-48, ~83 nM with psaJ, ~20 nM with atpH; PPR10-69, ~34 nM with psaJ, ~12 nM with atpH) (Fig. 3, A-C). Taken together, these results argue for a crucial role of the amino-terminal fragments of PPR10 both in regulating the dimerization state and in setting the binding affinity for target RNA.
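The "over 10-fold" comparison can be made explicit by dividing each reported Kd by the corresponding value for the construct starting at residue 37. The sketch below uses only the approximate Kd values quoted in the text; the table layout and the function name `fold_vs_reference` are illustrative.

```python
# Approximate Kd values (nM) as quoted in the text for each construct.
KD_NM = {
    "PPR10-37": {"psaJ": 0.8, "atpH": 1.0},
    "PPR10-48": {"psaJ": 83.0, "atpH": 20.0},
    "PPR10-69": {"psaJ": 34.0, "atpH": 12.0},
}


def fold_vs_reference(kd_table, reference="PPR10-37"):
    """Fold increase in Kd of each construct relative to the reference."""
    ref = kd_table[reference]
    return {
        construct: {rna: kd / ref[rna] for rna, kd in values.items()}
        for construct, values in kd_table.items()
        if construct != reference
    }


folds = fold_vs_reference(KD_NM)
for construct, by_rna in folds.items():
    for rna, fold in by_rna.items():
        print(f"{construct} with {rna}: {fold:.1f}-fold higher Kd")
```

On these numbers the increase ranges from about 12-fold (PPR10-69 with atpH) to about 100-fold (PPR10-48 with psaJ), so the loss of affinity is more pronounced for psaJ than for atpH.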
A Single Amino Acid Alteration Determines Dimerization State of PPR10-RNA Complex-The previous structural study prompted us to ask whether the four cysteine-to-serine mutations played a role in determining the dimerization state of PPR10 bound to target RNAs (Fig. 4A). Accordingly, we conducted SEC analyses for wild type PPR10 (residues 37-786) and a series of PPR10 mutants (residues 37-786): (i) C256S single mutant; (ii) C279S single mutant; (iii) C430S single mutant; (iv) C449S single mutant; and (v) C256S/C279S/C430S/C449S quadruple mutant as a control.
In the absence of RNA, the wild type protein and all mutants consistently appeared as dimers on the chromatograms (Fig. 4B). In the presence of atpH RNA, the wild type protein and mutants uniformly eluted at monomer positions. Nevertheless, notable discrepancies were observed in the presence of psaJ RNA. The distinct elution volumes of 15.0 ml and 13.5 ml for the wild type protein and the quadruple mutant with psaJ RNA indicated monomer and dimer formation, respectively (Fig. 4B), consistent with the previous sedimentation equilibrium analytical ultracentrifugation (SE-AUC) results (15). Among the single cysteine mutants with psaJ RNA, C256S, C279S, and C449S all eluted at positions similar to that of the wild type protein, indicating that cysteine-to-serine mutations at these three sites exert little impact on protein dimerization. Notably, however, the peak position of the single mutant C430S with psaJ RNA shifted to a position similar to that of the quadruple mutant (Fig. 4B). This underscores the notion that Cys-430 is an important determinant of dimeric state formation and that the C430S mutation can cause a monomer-to-dimer transition of PPR10 in the presence of the psaJ RNA element.
In the previously reported crystal structure, hydrogen bonds were observed between Ser-430 (side chain hydroxyl group) in repeat 10 and Gly-396 in repeat 9 (Fig. 4C) (15), whereas Cys-430 (side chain thiol group) may not form polar contacts efficiently because of the lower electronegativity of the sulfur atom in the thiol group. This intramolecular interaction with the adjacent repeat suggests an intricate role of Ser-430 in determining dimerization.
Concomitantly, we set out to test the RNA binding activity of the wild type protein and the series of mutants. In the presence of psaJ RNA, the four single mutants C256S, C279S, C430S, and C449S exhibited profiles similar to that of the wild type protein in EMSA experiments. The Kd values of ~0.8-3.0 nM for the C256S, C279S, C430S, and C449S single mutants resembled the value of ~0.8 nM for the wild type protein (Fig. 5, A-C). Notably, for the quadruple mutant, the Kd value of ~13 nM estimated from gel shift results is >10-fold higher than that of the wild type protein (Fig. 5, A-C). When binding to atpH RNA, the wild type protein, the four single mutants, and the quadruple mutant all exhibited similar EMSA profiles and numerically comparable Kd values ranging from ~0.5 to 1.5 nM (Fig. 5, D-F). The results illustrate that cysteine-to-serine mutations in the indicated PPR10 single mutants or the quadruple mutant have little influence on the binding affinity for the atpH RNA element. Taken together, these results indicate that no single cysteine mutation exerts an obvious impact on protein binding affinity for RNAs, whereas the combination of multiple cysteine mutations collectively reduces protein binding affinity for the psaJ RNA element but has no impact on the binding capacity for the atpH RNA element.
Amino Acid Combination of "Phe-Ser-Ser (FSS)" and "Phe-Ser-Cys (FSC)" in PPR10 Can Both Target Adenine in RNA-In each PPR repeat, residues 2, 5, and 35 can specifically target a certain RNA nucleotide as a recognition code (18,19). Repeat 5 in PPR10 can recognize adenine in the RNA strand, as elucidated through biochemical and computational studies (6,15). In the prior study, the quadruple cysteine-to-serine mutations C256S/C279S/C430S/C449S (residues 69-786) were present in the recently determined crystal structure of PPR10 in complex with the psaJ RNA element (15). This structure revealed, in repeat 5 of PPR10, Phe-246 in the 2nd position, Ser-249 in the 5th position, and Ser-279 in the 35th position, constituting the FSS combination for discrimination of the adenine RNA nucleotide (Fig. 6A). The results validated the previously proposed combination of FSS targeting adenine (6,19).
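The mapping between code positions and residue numbers in repeat 5 can be checked with a few lines of Python. The sketch assumes that position 1 of repeat 5 is residue 245, which is inferred here from Phe-246 occupying position 2 and is not stated in the text; the names `CODE_POSITIONS`, `code_residue_numbers`, `ADENINE_CODES`, and `targets_adenine` are illustrative.

```python
# Positions within a PPR repeat that carry the recognition code.
CODE_POSITIONS = (2, 5, 35)


def code_residue_numbers(repeat_start):
    """Residue numbers at positions 2, 5, and 35 of a repeat whose
    position 1 is residue `repeat_start`."""
    return tuple(repeat_start + pos - 1 for pos in CODE_POSITIONS)


# Repeat 5 of PPR10: start residue 245 is inferred from Phe-246 at position 2.
print(code_residue_numbers(245))

# Three-residue combinations named in this section as targeting adenine.
ADENINE_CODES = {"FSS", "FSC"}


def targets_adenine(code):
    """True if the residue 2/5/35 combination is one of those listed above."""
    return code.upper() in ADENINE_CODES


print(targets_adenine("FSS"), targets_adenine("FSC"))
```

The computed residue numbers 246, 249, and 279 match the residues named for repeat 5, which confirms that the 2/5/35 numbering is internally consistent. The set of adenine-targeting combinations is limited to the two discussed in this section and is not a complete recognition code.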
Given the limited impact of the C279S single mutation on protein dimerization and RNA binding capacity (Figs. 4B and 5), we proceeded to explore whether the FSC combination in the context of PPR10 repeat 5 can recognize adenine, by determining the structure of the PPR10 triple mutant C256S/C430S/C449S (residues 69-786) with the RNA element. In this case, the 35th position of repeat 5 is a cysteine instead of a serine (Fig. 6A). Although the triple mutant was biochemically less stable and prone to precipitate in solution, we finally determined the crystal structure of the triple mutant with psaJ RNA at 2.8 Å resolution (Table 1). Superimposition of the structures of the quadruple mutant C256S/C279S/C430S/C449S and the triple mutant C256S/C430S/C449S revealed an RMSD of only 0.041 Å over 1368 aligned Cα atoms, indicative of an almost identical spatial arrangement (Fig. 6B). A structural comparison between Ser-279 in the quadruple mutant and Cys-279 in the triple mutant showed only minor differences in the recognition of RNA bases (Fig. 6C). Scrutiny of the microenvironment of Cys-279 shows that, through van der Waals interactions, the side chain thiol group can contribute to stabilization of the target adenine base conformation in the RNA strand (Fig. 6D). Based on the observation and comparison of the dimeric crystal structures, repeat 5 of PPR10 can specifically target adenine in the RNA element. With a phenylalanine at the 2nd position and a serine at the 5th position as a major recognition determinant, the 35th position in the repeat is suggested to be a cysteine, a serine, or an amino acid with a small or nucleophilic side chain.
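For reference, the RMSD quoted for the superimposed structures is the root mean square of the distances between paired Cα atoms. The sketch below computes this quantity for coordinates that are assumed to be already superimposed; it performs no alignment itself, and the toy coordinates are hypothetical values chosen for illustration, not taken from the PPR10 structures.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    The two sets are assumed to be already superimposed and paired
    atom by atom; no fitting is done here.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example (hypothetical coordinates in Å): each atom is displaced by 0.1 Å.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.0, 0.0, 0.1), (1.0, 0.0, -0.1), (0.0, 1.0, 0.1)]

value = rmsd(model_a, model_b)
print(f"RMSD = {value:.3f} Å")
```

Because the deviations are squared before averaging, a few poorly matching atoms raise the RMSD disproportionately, so a very low value over many aligned atoms indicates a close match throughout the chain.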

DISCUSSION
PPR proteins, initially discovered in Arabidopsis thaliana chloroplasts and mitochondria, constitute a large protein family. Owing to their sequence-specific RNA-binding activity, PPR proteins perform a range of essential functions in organelle gene expression in plants, including RNA editing, splicing, cleavage, and translational activation (3,5,6,33). It is worth mentioning that PPR proteins with distinct RNA binding properties undertake correspondingly diverse roles (6). In RNA stabilization and translational activation, PPR tracts invariably exert tight binding affinity and complete sequestration on the non-coding RNA element. In contrast, for PPR protein-mediated RNA editing or other related functions, the low-affinity binding activity of PPR proteins allows timely dissociation from RNA for subsequent translation (6). In our recent study, using the maize chloroplast protein PPR10 as a model, we determined the crystal structure of a PPR protein in complex with its native RNA target psaJ and validated the residues involved in the RNA recognition code by structure-guided mutagenesis studies (15). Intrigued by the intertwined homodimer configuration of PPR10 in the crystal with RNA and by other reported PPR10 dimerization results, we sought to understand the important determining factors for dimeric state formation. In this study, we attempted to unveil the molecular mechanisms underpinning the determination of the dimerization state.
Biochemical results presented here define a critical role of the length of the amino-terminal fragment not only in determining the PPR10 dimerization state but also in RNA binding capacity. Although the mature portion of PPR10 has been proposed to begin at approximately residues 30-40 (21), disparate predictions of the chloroplast transit peptide cleavage site, at residues ~50-90 preceding the PPR repeat region (residues 107-771), are retrieved from software such as ChloroP Server, PrediSi Software, SignalP Server, and TargetP Server (34-37). The identification of the mature form of PPR10 in plants still awaits further in vivo appraisal. In plant chloroplasts, a large variety of proteases, including stromal processing peptidases, may alternatively cleave a PPR protein, leaving amino termini of variable lengths as divergent mature functional forms. It was previously reported that the amino-terminal portions in a series of tetratricopeptide repeat (TPR) proteins are closely related to protein stability, dimerization, and the capacity to interact with other protein partners (38-40). Similarly, a strong emphasis can reasonably be placed on the amino termini of PPR proteins.
Interestingly, the subtle alteration of a single amino acid (C430S) can significantly impact the dimerization state of PPR10. A single amino acid missense mutation, as the minimal phylogenetic change, can result in distinct conformation, dimerization, and even functional diversity. In fact, a series of possible post-translational modifications in plants, including phosphorylation, sumoylation (41), and nitrosylation (42), leads to distinct mature protein forms. Thus, varying the length of the amino-terminal fragment or altering key residues in PPR proteins, which corresponds to distinct dimerization states and RNA binding affinities, implies not only an intrinsic regulatory mechanism for pathways involving PPR proteins in plants but also a tantalizing clue for potential PPR protein engineering in biotechnological applications.
Within the PPR protein family, different members may adopt certain dimerization states to undertake their respective functional roles, as exemplified by the fact that THA8, THA8L, PPR4, and PPR5 are monomers, whereas HCF152 exists as a homodimer (23-27). It is therefore reasonable to speculate that a plant cell is capable of utilizing different PPR proteins with distinctive dimerization states to trigger the corresponding downstream signal responses.
In sum, we have identified the amino-terminal boundaries as determinants of the dimerization state of PPR10 in this study. In addition, we found that a single cysteine-to-serine alteration in repeat 10 of PPR10 favors the shift toward dimerization of the protein. Furthermore, structural comparison shows that in a PPR repeat, in cooperation with other key residues, either cysteine or serine at the 35th position can recognize the adenine RNA nucleotide in the overall dimer conformation.