Identification of Two Pentatricopeptide Repeat Genes Required for RNA Editing and Zinc Binding by C-terminal Cytidine Deaminase-like Domains

Background: Many pentatricopeptide repeat (PPR) proteins are RNA site specificity factors and include C-terminal DYW deaminase domains. Results: ELI1 and DOT4 are required for editing single sites. The DYW deaminase domain binds two zinc atoms. Conclusion: The C terminus of PLS-type PPR proteins shares molecular characteristics with cytidine deaminase. Significance: This study provides the first evidence that DYW deaminase domains bind zinc. Many transcripts expressed from plant organelle genomes are modified by C-to-U RNA editing. Nuclear encoded pentatricopeptide repeat (PPR) proteins are required as RNA binding specificity determinants in the RNA editing mechanism. Bioinformatic analysis has shown that most of the Arabidopsis PPR proteins necessary for RNA editing events include a C-terminal portion that shares structural characteristics with a superfamily of deaminases. The DYW deaminase domain includes a highly conserved zinc binding motif that shares characteristics with cytidine deaminases. The Arabidopsis PPR genes, ELI1 and DOT4, both have DYW deaminase domains and are required for single RNA editing events in chloroplasts. The ELI1 DYW deaminase domain was expressed as a recombinant protein in Escherichia coli and was shown to bind two zinc atoms per polypeptide. Thus, the DYW deaminase domain binds a zinc metal ion, as expected for a cytidine deaminase, and is potentially the catalytic component of an editing complex. Genetic complementation experiments demonstrate that large portions of the DYW deaminase domain of ELI1 may be eliminated, but the truncated genes retain the ability to restore editing site conversion in a mutant plant. These results suggest that the catalytic activity can be supplied in trans by uncharacterized protein(s) of the editosome.

Pentatricopeptide repeat (PPR) genes have been shown to be required for RNA editing (4,11-20) and form a large family of protein-coding genes in higher plants, with over 400 members in Arabidopsis (21). The PPR genes can be divided into P and PLS subfamilies (22). The P subfamily is composed of tandem arrays of a degenerate 35-amino acid motif (P), whereas the PLS subfamily is characterized by distinct variants of the 35-amino acid repeats, with P type as well as long (L) and short (S) variants of the repeats (21). Specific amino acid residues within the PPRs have been shown to specify the base recognized in the cis-element (23,24). In addition, the PLS subfamily includes characteristic C-terminal domains known as the E, E+, and DYW domains. The DYW domain is known to have zinc binding motifs (HXE and CXXC) (25), and analogous zinc-coordinating residues are conserved in nucleotide deaminases (26-29). Phylogenetic analyses have shown that the distribution of PPR proteins with the DYW domain coincides with the distribution of C-to-U editing in land plants (25), in the protist, Naegleria gruberi (30,31), and possibly also in diverse lower eukaryotes such as Acanthamoeba, Physarum, Nitella, and rotifers (32). A bioinformatics analysis of the deaminase superfamily has shown that the C-terminal region exhibits structural characteristics of a deaminase domain and places the DYW deaminase domain in the deaminase superfamily (27). The deaminase fold is represented by sequences composed of part of the E, the entire E+, and most of the DYW domain, and has been identified as the DYW family of nucleic acid deaminases (Pfam: Pf14432). The entire region is referred to as the "DYW deaminase domain" in this work.
Several PPR genes are known to be responsible for editing various sites, but lack a portion of the DYW deaminase domain (4,12,13,16,33-36). In addition, portions of the DYW deaminase domain can be eliminated through truncation, and transgenic plants expressing the truncated variants remain fully capable of editing site conversion (14,15). These genetic complementation experiments suggest that the PPR proteins act as editing site recognition factors (14,15). Thus, although the entire DYW deaminase domain is under strong selection (37), most of the DYW deaminase domain is dispensable. In at least one case, editing has been shown to require a PPR that provides site specificity, and a separate protein has been shown to supply a portion of the DYW deaminase domain (38). Both PPR proteins CRR4 and DYW1 are required for editing ndhD C2 (38). CRR4 lacks an intact DYW domain and is apparently required as a site recognition factor, whereas DYW1 has an intact DYW domain and may contribute the catalytic activity. In addition, RIP/MORF proteins have been shown to interact with PPR proteins required for editing (3,5) and may be involved in forming editing complexes with PPR proteins.
In this study, we demonstrate that two PPR genes, ELI1 and DOT4/FLV, are each required for editing single chloroplast editing sites. ELI1 has a full-length DYW domain, and mass spectrometry (MS) combined with inductively coupled plasma optical emission spectroscopy (ICP-OES) demonstrates that the DYW deaminase domain binds two zinc atoms per subunit. A full-length recombinant DYW1 protein has been analyzed and also binds two zinc atoms per polypeptide. This is the first demonstration of zinc binding by a plant editing factor, and it provides additional support for a catalytic function in C deamination. In addition, truncated variants of ELI1 that lack most of the structural features of the deaminase fold are capable of restoring editing in a mutant plant. A critical 15-amino acid region must be retained for editing site conversion. Based on these results, we propose a bipartite model where one PPR protein is required for site recognition, and a second PPR or other protein provides the catalytic activity for editing site conversion.

EXPERIMENTAL PROCEDURES
PPR Ortholog Identification-A reciprocal BLAST approach was used to identify putative orthologs in a comparative genomics survey (17). A similar approach was used to identify the function of ELI1; the amino acid sequence for each Arabidopsis thaliana PPR gene was queried in the TBLASTN program to the nonredundant database at Phytozome 7.0 or the Brassica database (BRAD) to identify orthologs. To distinguish orthologs from nonorthologous PPR-containing genes, the amino acid sequences of initial hits were queried by TBLASTN back to the genome of A. thaliana using a nonredundant nucleotide database at GenBank. Gene sequences were aligned with ClustalW with the MEGA5 software (39). Cladograms were constructed using MEGA5 (39) for a neighbor-joining tree with the default parameters except that gaps and missing data were analyzed with pairwise deletion.
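The reciprocal BLAST screen used for PPR ortholog identification reduces to a simple rule: a candidate counts as a putative ortholog only when the best hit of the return search is the original query. The following is a minimal sketch of that rule only, assuming the best TBLASTN hit for each query has already been extracted into a dictionary. The function name and all gene identifiers are hypothetical placeholders, not data from this study.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Return {query: subject} pairs in which each gene is the other's best hit.

    forward_best: best hit in the target genome for each query gene.
    reverse_best: best hit back in the query genome for each target gene.
    """
    pairs = {}
    for query, subject in forward_best.items():
        # Keep the pair only when the return search recovers the original query.
        if reverse_best.get(subject) == query:
            pairs[query] = subject
    return pairs


# Hypothetical best-hit tables (placeholder identifiers, not real loci).
forward_best = {"AtPPR_a": "BrGene_x", "AtPPR_b": "BrGene_y"}
reverse_best = {"BrGene_x": "AtPPR_a", "BrGene_y": "AtPPR_c"}

putative_orthologs = reciprocal_best_hits(forward_best, reverse_best)
no_ortholog = sorted(set(forward_best) - set(putative_orthologs))
print(putative_orthologs)
print(no_ortholog)
```

In this toy input, AtPPR_b has no reciprocal partner, which is the kind of result that would flag a gene as lacking an apparent ortholog in the second species.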
RNA Editing Analysis-Bulk sequencing was performed at the University of California, Berkeley DNA Sequencing Facility. Quantitation of editing site conversion was performed using the raw trace file. The peak height for C and T traces at the edited position was used to measure editing site conversion as the percentage of T. Percentage of editing was reported to the nearest 10%.
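The quantitation described above can be illustrated with a short calculation. This is a sketch under the assumption that "percentage of T" means the T peak height divided by the sum of the C and T peak heights at the edited position; the peak heights shown are invented for illustration and the function names are hypothetical.

```python
def editing_percent(c_peak, t_peak):
    """Editing site conversion as the percentage of T signal at one position."""
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return 100.0 * t_peak / total


def to_nearest_10(percent):
    """Report a percentage to the nearest 10%, rounding halves upward."""
    return int(percent / 10 + 0.5) * 10


# Invented peak heights from a hypothetical trace file.
raw = editing_percent(c_peak=120, t_peak=880)
reported = to_nearest_10(raw)
print(raw, reported)
```

Rounding to the nearest 10% reflects the limited precision of peak heights in bulk sequencing traces.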
The transformation vectors were electroporated into competent Agrobacterium strain ASE (41). The bacteria were grown to an absorbance of 0.8 at 600 nm and resuspended in 5% sucrose with 0.05% Silwet V77. Flowering plants were dipped into the Agrobacterium culture and allowed to recover for 1 day. Seeds were sterilized and selected on agar plates containing 1/2 Murashige and Skoog salts with 100 mg/liter gentamicin.
DYW Deaminase Domain Expression and Zinc Analysis-Sequences encoding the DYW deaminase domain from ELI1 (residues Lys-490 to Trp-632) and DYW1 (Ala-36 to Trp-239) were cloned into the BamHI and SalI restriction sites of pET28a. Amplicons for DYW1Δzn1 encoded residues Ala-36 to Lys-166 of DYW1 and were also cloned into pET28a. Plasmids were transformed into Escherichia coli strain Rosetta 2 (DE3) pLysS from Novozymes (Bagsvaerd, Denmark). Strains were grown at 37°C to an absorbance of 0.5 at 600 nm and then cooled to 18°C, induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside, and maintained at 18°C with shaking for 4 h. Bacterial pellets were resuspended in 50 mM Tris, pH 8.0, 250 mM NaCl and sonicated in six 20-s bursts. Crude lysates were purified using nickel-nitrilotriacetic acid resin from Thermo Fisher Scientific. Purified protein was dialyzed against 50 mM Tris, pH 8.0, 250 mM NaCl. For the analysis of ELI1-DYW, zinc was provided by an additional dialysis against 50 mM Tris, pH 8.0, 250 mM NaCl, 50 μM ZnCl2. The N-terminal His-tagged portion was removed by digestion with thrombin. One unit of thrombin (BD Biosciences) was added per 1 mg of protein and incubated overnight at 4°C. Protein extracts were treated with 1 mM PMSF and dialyzed against 20 mM ammonium acetate. For ICP analysis, purified protein was submitted to the Center for Applied Isotope Studies at the University of Georgia (Athens, GA). Highly purified bovine carbonic anhydrase (catalogue number C2624) was purchased from Sigma-Aldrich.
The mass spectrometry analysis of the proteins was performed with a Waters QTOF2 instrument. The instrument uses time-of-flight analysis with an electrospray ion source, and ionization conditions were manipulated to occur under native or denatured conditions. Under native ionization conditions, the protein samples were dialyzed against 20 mM ammonium acetate and ionized in 20 mM ammonium acetate with an electrospray ion voltage of 3.6 kV, a cone voltage of 40 V, and a desolvation temperature of 120°C. For the denatured ionization condition, the protein sample was denatured with 50% acetonitrile in the presence of 0.1% formic acid. The denatured protein samples were ionized under the same conditions as native samples.

ELI1 Is a DYW Class PPR Gene Required for Editing ndhB C830-A comparative genomics approach was used to identify the gene responsible for editing ndhB C830. A. thaliana has 34 known editing sites in the chloroplast, and B. rapa var. Chiifu 401-42 shares 29 of these sites (Table 1). Three Arabidopsis editing sites (psbE C214, ndhB C836, and rps14 C80) are absent in B. rapa as a result of a genomically encoded T at those positions (Table 1). Two Arabidopsis editing sites are represented as a genomic C in B. rapa, but are not converted to a U: ndhB C830 and a site in the intron of rps12 (A. thaliana plastid genome nucleotide position 69553) (Table 1). Thus, a total of five editing sites including ndhB C830 are present in A. thaliana but are absent in B. rapa.
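The site counts in this comparison can be checked with a few lines of bookkeeping. The sketch below uses only the numbers and site names given in the text; the variable names are illustrative.

```python
# Known chloroplast editing sites in A. thaliana, as stated in the text.
arabidopsis_site_count = 34

# Sites lost in B. rapa because the genome already encodes a T.
genomic_t_in_b_rapa = ["psbE C214", "ndhB C836", "rps14 C80"]

# Sites with a genomic C in B. rapa that is not converted to U.
unedited_c_in_b_rapa = ["ndhB C830", "rps12 intron (nt 69553)"]

absent_in_b_rapa = genomic_t_in_b_rapa + unedited_c_in_b_rapa
shared_site_count = arabidopsis_site_count - len(absent_in_b_rapa)

print(len(absent_in_b_rapa), shared_site_count)
```

The five absent sites and 29 shared sites agree with the totals reported for Table 1.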
We identified 13 PPR genes that could be responsible for differences in editing between these species based on the presence of a PPR gene in Arabidopsis and the apparent absence of a PPR ortholog in B. rapa (Table 2). Arabidopsis T-DNA insertion lines are available with disruptions in or near 10 Arabidopsis PPR genes that have no apparent ortholog in B. rapa, or for which the ortholog is predicted to be severely truncated (Table 2). T-DNA insertion lines for these 10 candidate PPR genes were screened for editing at the five chloroplast editing sites absent in B. rapa: ndhB C830, ndhB C836, At 69553, rps14 C80, and psbE C214. Two independent T-DNA lines for a DYW type PPR gene (AT4G37380) were identified that were incapable of converting ndhB C830, and thus are called editing lacking insertional mutant 1 (eli1-1 and eli1-2) (Fig. 1). No other defects in chloroplast RNA editing were detected for the 34 known sites in eli1-1 or eli1-2 mutant plants (data not shown). Editing activity was fully restored in eli1-1 mutant plants that express the full-length ELI1 gene (Fig. 1B). The ndhB C830 editing site changes a serine codon to a leucine codon; however, mutant plants do not display a dramatic phenotype under growth chamber conditions (Fig. 1). Chloroplast number and morphology are normal based on confocal imaging (M. R. Hanson, personal communication). A truncated ortholog for ELI1 is present in the B. rapa Chiifu-401 genome (Table 2), and these plants are incapable of editing ndhB C830 (Fig. 1). B. rapa Chiifu-401 grows vigorously with no observable growth phenotype under laboratory conditions (data not shown). ELI1 orthologs were sequenced from 16 Brassica species with the same genomic background as the sequenced B. rapa (the A genome). Truncation of the ELI1 gene and loss of editing ndhB C830 are only present in the B. rapa strain Chiifu-401 (data not shown). B. rapa variety Rubicon F1 has an uninterrupted ELI1 ortholog, and ndhB C830 was found to be fully edited (Fig. 1).
DOT4/FLV Is a DYW Class PPR Gene That Is Required for Editing rpoC1 C488-A gene named defectively organized tributaries 4 (DOT4) had been characterized with T-DNA knockout plants in a DYW class PPR gene, AT4G18750, and has a striking phenotype of serrated white to pale green leaves, and the formation of radially symmetric, needle-shaped leaves (42). There are extensive phenotypic similarities between dot4-2 and flavodentata (flv) (43). The mutant flv has been described as an editing mutant (44). The similarities between the unusual phenotypes suggest that the same gene is disrupted in both mutants.
Mutant plants dot4-2 (SALK_139995) and flv were screened for editing at all 34 known chloroplast sites. The rpoC1 C488 editing site was found to remain unedited in both mutant lines (Fig. 2), whereas all of the other known editing sites in Arabidopsis chloroplasts were edited (data not shown). In addition, the DOT4 gene was recovered from mutant flv plants, and has a point mutation at Gly-554 (GGA → TGA) that creates a stop codon in repeat 11 (data not shown). Therefore, disruption of the DOT4 gene leads to an inability of plants to edit rpoC1 transcripts and results in the unusual phenotype. Although the mutant has a pronounced leaf phenotype, no gross abnormalities in chloroplast number or morphology are present in green tissues of the mutant compared with wild type plants (M. R. Hanson, personal communication).
The DYW Deaminase Domain Binds Two Zinc Atoms-All of the known PPR genes that are required for RNA editing include an intact or truncated DYW deaminase domain. DYW1 is a small PPR protein with a modified repeat region and an intact C-terminal DYW deaminase domain, and was recently reported to be required for editing ndhD C2 (38). Because the DYW deaminase domains from ELI1 and DYW1 could potentially serve a catalytic function in the editing mechanism, and zinc is a critical cofactor for most nucleotide deaminases, zinc binding by the DYW deaminase domain was investigated. The DYW deaminase domains from ELI1 and DYW1 were expressed as His6-tagged polypeptides, purified by nickel affinity chromatography, treated with thrombin, and dialyzed to remove the N terminus containing the His6 tag (Fig. 3A). For ELI1-DYW, the purified protein preparation ran as a doublet of two polypeptides of about 17 and 18 kDa on SDS-PAGE (Fig. 3B), whereas DYW1 ran as a single band (data not shown). To identify the metal ions present in the recombinant DYW deaminase domain, the polypeptides were subjected to metal analysis by ICP-OES. Molar zinc ratios were 1.7 and 1.6 zinc atoms per recombinant DYW deaminase polypeptide for ELI1-DYW and DYW1, respectively (Fig. 3C). No additional metals were detected above trace amounts (data not shown). A highly purified commercial preparation of bovine erythrocyte carbonic anhydrase was included as a control and was shown to contain 0.9 zinc atoms per monomer. Each molecule of carbonic anhydrase is expected to bind 1 zinc atom.
Native MS can determine the molecular mass of polypeptides under conditions that retain prosthetic groups (45,46), and covalently bound ligands can be identified based on the difference between native and denaturing MS spectra. Recombinant ELI1-DYW protein was analyzed under native MS conditions, and two polypeptides with average molecular masses of 18,204 and 16,686 Da were identified after application of the deconvolution algorithm (Fig. 3D). Denaturing MS revealed polypeptide masses for ELI1-DYW of 18,080 and 16,560 Da; the mass differences between the native and denaturing analyses are 124 and 126 Da (Fig. 3D). Zinc has an atomic mass of 65.4; however, cysteines coordinate zinc directly to the sulfur atom with loss of a proton (47). The expected mass difference for a polypeptide with two zinc atoms coordinated by four cysteine residues would be about 127 Da. Thus, the observed mass difference is consistent with 2 bound zinc atoms. The native MS spectra of ELI1-DYW did not detect the presence of polypeptides with no zinc or a single zinc atom (Fig. 3D).

DECEMBER 20, 2013 • VOLUME 288 • NUMBER 51
JOURNAL OF BIOLOGICAL CHEMISTRY 36523

Recombinant DYW1 was also analyzed under native conditions. The most prevalent peak corresponds to a polypeptide with a mass of 24,744 Da, and a smaller peak at 24,614 Da was also observed. Under denaturing conditions, a single major form of the polypeptide was detected with a mass of 24,618 Da. The mass difference for the major form observed in the native MS is 126 Da, and is consistent with two zinc atoms per DYW1 polypeptide. The smaller native peak at 24,614 Da appears to represent DYW1 polypeptides with no bound zinc atoms. Polypeptides with a mass corresponding to a single zinc atom are not observed in the MS spectra.
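The arithmetic behind the native versus denaturing comparison can be reproduced directly from the reported masses. The sketch below uses the zinc atomic mass given in the text (65.4) and assumes a mass of 1.008 Da for each proton lost on cysteine coordination, a value not stated in the text; the labels for the two ELI1-DYW polypeptides are descriptive only.

```python
ZN_ATOMIC_MASS = 65.4   # Da, as stated in the text
PROTON_MASS = 1.008     # Da, assumed mass of each proton lost from a cysteine thiol


def expected_zinc_shift(n_zinc, n_cys_ligands):
    """Expected native-minus-denatured mass difference for bound zinc."""
    return n_zinc * ZN_ATOMIC_MASS - n_cys_ligands * PROTON_MASS


# (native mass, denatured mass) in Da, taken from the reported spectra.
observed = {
    "ELI1-DYW, larger polypeptide": (18204, 18080),
    "ELI1-DYW, smaller polypeptide": (16686, 16560),
    "DYW1, major form": (24744, 24618),
}

shifts = {name: native - denatured for name, (native, denatured) in observed.items()}
expected = expected_zinc_shift(n_zinc=2, n_cys_ligands=4)

for name, shift in shifts.items():
    print(f"{name}: observed {shift} Da, expected {expected:.1f} Da")
```

All three observed differences fall within about 3 Da of the value expected for two zinc atoms, and far from the roughly 63 Da expected for a single zinc atom coordinated by two cysteines.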
The region of DYW1 that contains the putative zinc-coordinating residues is in the C terminus (Fig. 3A). A truncated DYW1 protein, DYW1Δzn1, was prepared that places a stop codon just upstream of the first zinc binding motif (HSE) and eliminates the entire zinc binding region. Native and denaturing MS spectra of the DYW1Δzn1 polypeptide showed no significant mass difference (data not shown), indicating that zinc binding by DYW1 requires the C-terminal region of the DYW deaminase domain.
Truncation of the ELI1 Gene Identifies the 15-amino Acid PG Box That Is Critical for Editing-Truncated forms of CRR22, CRR28, and OTP82 that eliminate portions of the DYW deaminase domain are capable of restoring RNA editing in respective PPR knock-out mutants (14,15). To further examine the role of the ELI1 DYW deaminase domain, C-terminal truncation variants of ELI1 were expressed in eli1-1 plants (Fig. 4A). Five ELI1 variants were generated with truncations in the DYW deaminase domain (Fig. 4A). ELI1 transgenes with truncations C-terminal to the PG box (trc 1-4) were capable of fully restoring editing at C830; however, truncation immediately before the PG box (trc5) resulted in a dramatic decrease in editing site conversion (Fig. 4B). None of these truncations affected editing at ndhB C746 (Fig. 4B) or distal sites in ndhB transcripts (data not shown). Thus, the function of the ELI1 gene in a phenotype complementation assay requires that the gene encode the PG box, but sequences encoding amino acids beyond the PG box are not required for conversion of ndhB C830.
Overexpression of ELI1 Reduces Editing at ndhB C836 through Competitive Binding with OTP82-Several transgenic plants expressing ELI1 variants convert the adjacent editing site, ndhB C836, to a lower extent than wild type plants (Fig. 5A). The ndhB C836 editing site is only 5 nucleotides away from the ELI1 target and requires a PPR protein, OTP82, for conversion (15). The ELI1-trc4-B plant expresses a truncated variant of ELI1 with the PG box and exhibits the most dramatic decrease in conversion of ndhB C836 (Fig. 5A). The amino acid residues at positions 6 and 1′ in a repeat can be used to infer the nucleotides recognized by each individual PPR (23), and application of the PPR code indicates that ELI1 and OTP82 share overlapping cis-elements (Fig. 5B). Semiquantitative RT-PCR demonstrates that the expression of the ELI1 transgene in the plant with the greatest suppression in ndhB C836 editing (trc4-B) is substantially greater than that of other transgenics or the native ELI1 gene (Fig. 5C). OTP82 has two editing site targets, and editing of the second target, ndhG C50, was not significantly different between plant trc4-B and wild type plants (data not shown). These results suggest that overexpression of ELI1 suppressed editing of a nearby editing site by binding to overlapping nucleotides in the cis-elements.
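The way a PPR code is applied can be sketched as a lookup from the residue pair at positions 6 and 1′ of each repeat to a preferred nucleotide. The table below is a simplified subset of the commonly reported pairings and is an assumption here, since the text does not list the code itself; the repeat residues are hypothetical and do not correspond to ELI1 or OTP82.

```python
# Simplified, assumed subset of the PPR code: (position 6, position 1') -> nucleotide.
PPR_CODE = {
    ("T", "N"): "A",
    ("T", "D"): "G",
    ("N", "D"): "U",
    ("N", "S"): "C",
    ("N", "N"): "C/U",
}


def predict_cis_element(repeat_residues):
    """Map each repeat's (6, 1') residue pair to a nucleotide, or '?' if unknown."""
    return [PPR_CODE.get(pair, "?") for pair in repeat_residues]


# Hypothetical residue pairs for consecutive repeats of an invented PPR protein.
example_repeats = [("T", "D"), ("N", "D"), ("N", "S"), ("T", "N"), ("V", "K")]
prediction = predict_cis_element(example_repeats)
print(prediction)
```

Aligning two such predictions against the same transcript is one way to see whether two PPR proteins would be expected to contact overlapping nucleotides.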
The PG Box of CRR21 Is Essential for Editing ndhD C383-To determine whether retention of the PG box is a general feature of other chloroplast PPR genes that function in editing, we identified T-DNA insertions or natural mutations that truncate PPR genes and examined editing function. The T-DNA insertion in crr21-3 is immediately upstream of the PG box and truncates CRR21 with loss of the PG box (Fig. 6A). The only known editing site target of CRR21 is ndhD C383, and crr21-3 plants fail to edit C383 (Fig. 6B). The CRR21 ortholog in B. verna, BvCRR21, contains a truncation in the 17th PPR repeat (Fig. 6A), and the ndhD C383 editing site is lost in that species through substitution to a genomic T (37). N. officinale is a close relative of B. verna, but edits ndhD C383 and contains a CRR21 ortholog with an intact coding sequence through the end of the E+ domain (37). Transgenic Arabidopsis crr21-3 plants expressing NoCRR21 are capable of editing ndhD C383; however, plants expressing BvCRR21 fail to edit ndhD C383 (Fig. 6C). These results provide an additional example of gene truncation 5′ to the PG box, resulting in loss of editing capability.
The PG Box Is Highly Conserved in PPR Proteins Required for Chloroplast RNA Editing-With the addition of ELI1 and DOT4, 17 PPR proteins have been identified to be necessary for editing at least one chloroplast RNA editing site. The amino acid sequence of the PG box exhibits a high degree of sequence identity and conservation in chloroplast PPR proteins that are required for editing (Fig. 7A). The motif is not as highly conserved in mitochondrial proteins required for editing (Fig. 7B). In addition to the high coincidence among Arabidopsis PPR proteins, the PG box is highly conserved in ELI1 orthologs (Fig. 7C). The PG box corresponds to the first β-strand (S1) in the deaminase fold (27) and is predicted to be in close association with the HAE and CXXC motifs required for zinc binding at the catalytic site (Fig. 7D).
The DYW Domain Experiences Truncation in Evolution, and OTP80 in the Brassicaceae Is Derived from a Gene with a Full-length DYW-PLS-type PPR proteins have been divided into subtypes based on the presence of the E, E+, or DYW domain at the C terminus of the polypeptide (21). Four PPR proteins (CRR4, CRR21, CLB19, and OTP80) with truncated C-terminal domains are currently known to be required for RNA editing in chloroplasts (4,12,13,16), and numerous mitochondrial editing factors (MEF3, MEF18, MEF19, MEF20, MEF21, OTP87, OTP71, and OTP72) are members of the E-class (33-36). To examine whether DYW class PPR genes have experienced truncation in evolution, the architecture of OTP80 genes was examined in plants with sequenced genomes (Fig. 8A). Among these taxa, truncated orthologs for OTP80 are only observed in the order Brassicales, and they include all sequenced examples from the Brassicaceae (Fig. 8A). Additional C-terminal sequences for OTP80 genes were obtained by 3′ rapid amplification of cDNA ends and demonstrated that Reseda lutea (Resedaceae), a member of the order Brassicales, has a full-length DYW (Fig. 8B). Other members of the Brassicales such as Capparis (Capparaceae) as well as Cleome, Brassica, and Arabidopsis (Brassicaceae) contain truncated orthologs. In all cases, truncation occurred within the E+ domain, and the truncated orthologs retain the PG box. These data suggest that a single truncation event occurred in a branch of the Brassicales that led to the genera Cleome, Capparis, Brassica, and Arabidopsis (Fig. 8B). The editing target of OTP80, rpl23 C89, is present in these species and was edited in plants with full-length or truncated OTP80 genes (Fig. 8B). Thus, the DYW deaminase domain is susceptible to truncation in evolution, and the truncated orthologs retain the capacity to edit a target site.

DISCUSSION
The identity of the catalytic component of the C-to-U editing apparatus has been elusive. The higher plant DYW domain was recognized to bear similarities to deaminases based on the presence and proximity of canonical zinc binding motifs (25), and a recent informatics analysis based on structural motifs places the DYW deaminase domain into a deaminase superfamily (27). Most members of the deaminase superfamily have a single zinc atom at the active site that is coordinated in a tetrahedral configuration by a histidine or cysteine residue ((H/C)XE) and two cysteine residues (CXXC). The fourth ligand of the zinc atom is the substrate water molecule, which is deprotonated by the glutamate residue to facilitate nucleophilic attack in the deamination reaction (26). The detection of zinc in the DYW deaminase domain supports a catalytic function in C-to-U editing.
The presence of a second zinc atom associated with the DYW deaminase is an intriguing result. Most of the deaminase superfamily members bind a single zinc atom (48-50); however, the DYW deaminase domain was predicted to have a second zinc binding site at the C terminus (27). The observation of two zinc atoms confirms the prediction of a second zinc binding site; however, the role of the second zinc atom is unknown. The detection of only DYW1 polypeptides with either two or zero zinc atoms suggests that both zinc atoms are required for protein stability and may bind cooperatively. The region of the DYW domain (HHFTDGSCSCGDFW) is highly conserved in PPR proteins, and the first H residue and the CSC are invariant in Arabidopsis and Naegleria DYW domains. Many zinc binding motifs include cysteine and histidine residues separated by 2 or more residues (51,52); however, other configurations are well known. The SWIM domain is a zinc finger-like domain that is commonly found in plants in MuDR transposases and in FAR1, a transcription factor (53). The zinc finger-like domain in these proteins employs zinc binding residues separated by a single amino acid (CXC and CXH), and CSC is frequently observed. Thus, the zinc ring finger-like structures in the SWIM domain have a similar signature as the putative second site in the DYW deaminase domain. Because transposases and transcription factors are both likely to directly interact with nucleic acids through these ring finger-like SWIM domains, the highly conserved HX6CSC motif in the DYW deaminase domain might provide an additional site for substrate RNA interactions.

FIGURE 7. Amino acid residues in the PG Box are highly conserved in chloroplast PPR proteins required for editing. A, the coincidence of amino acid residues in the PG box of 17 PPR proteins that are required for editing chloroplast transcripts is represented by a WebLogo plot. B, a WebLogo plot shows the coincidence of amino acid residues in the PG box of 12 PPR proteins that are required for editing mitochondrial transcripts: REME1, SLO1, MEF1, MEF8, MEF9, MEF18, MEF14, MEF19, MEF20, MEF21, MEF22, and OTP87. C, conservation of amino acid residues is displayed by a WebLogo plot for 25 (27). The location of a large amino acid extension specific to DYW deaminase and thus absent in the model is indicated by a yellow triangle. The minimal portion of the DYW deaminase capable of complementing the eli1-1 mutant is represented by the orange polypeptide chain. The location of the zinc ion based on the deaminase fold is shown by the gray circle.

FIGURE 8. A truncation in OTP80 has occurred in one branch of Brassicales. A, an unrooted neighbor-joining phylogenetic tree indicates the relationship between OTP80 orthologs found in several sequenced dicots. The phylogenetic tree is based on an alignment of codons from Arabidopsis codon Ser-80 to Lys-580. An asterisk indicates species that have predicted stop codons that truncate the OTP80 protein. B, at left is a neighbor-joining phylogenetic tree that was constructed with rbcL sequences from A. thaliana, B. rapa, Cleome hassleriana, Capparis spinosa, and R. lutea and that has the same topology as a phylogenetic survey using rbcL and 18S sequences (56). The editing status of rpl23 C89 is shown for A. thaliana, B. rapa, Cleome gynandra, C. spinosa, and R. lutea. The C-terminal domains of these OTP80 orthologs are shown on the right.

The DYW deaminase domain has been shown to be dispensable for PPR gene function as assayed in genetic complementation assays (Figs. 4 and 6) (14,15). Many chloroplast editing factors have been reported that lack an intact DYW deaminase domain (4,12,13,16); however, all of these PPR proteins (CRR4, CRR21, CLB19, and OTP80) include the entire PG box plus an additional 11-41 amino acid residues. These results have led investigators to refer to PPRs as site recognition factors (14,54). In this work, we pinpoint a highly conserved 15-amino acid region, the PG box, as a critical feature that is required for editing site conversion. Although the function of the PG box is unknown, it could be required to form a complex with other editing components. The amino acid sequence of the PG box exhibits heterogeneity between editing factors, especially between PPR proteins that function in chloroplasts and in mitochondria (Fig. 7). The distinguishing characteristics of the PG box may explain in part why the E domains from different editing PPR proteins are not functionally equivalent (33). PPR proteins with different PG boxes may interact with different binding partners in the formation of editing complexes.
Considerable heterogeneity exists in the C-terminal region of the PLS-type PPR genes (21). The evolutionary relationship of variants in the DYW deaminase has been an enigma given the potential enzymatic role of the C terminus. The molecular evolution of four PPRs required for chloroplast editing indicated that all regions within the DYW deaminase domain exhibit strong negative selection (37). In this work, we demonstrate that the DYW deaminase domain of OTP80 is unstable in evolution and has undergone truncation during the evolution of the Brassicales (Fig. 8). These results suggest that the PLS-type PPRs that are required for RNA editing may have dual functions that are revealed both by molecular manipulations and in evolution. The PLS repeat portion of the PPR is required for editing site recognition, and this portion of the gene is retained when the editing site is present; however, when editing sites are lost, the PPR gene may be lost or converted to a pseudogene (37). Thus, portions of the DYW domain are dispensable in recombinant molecular analyses, and the dispensable nature of DYW deaminase domain is also detected in evolution. Apparently the RNA specificity function of the PPR protein provided by the PLS repeat domain is critical, but the DYW domain has a distinct and separable function that may be lost, and apparently provided in trans by another protein.
Thus, there appears to be a conundrum with respect to the structure of the DYW domain and its potential catalytic involvement in the editing reaction. The DYW deaminase domain has key features that suggest that it has a catalytic role in cytidine deamination; however, large portions of the DYW deaminase domain are dispensable. These observations support a model in which a PPR protein serves an RNA recognition function and the editing deaminase activity can be provided in trans (14,16,55). A recent study demonstrated that two genes are responsible for editing ndhD C2 (38). CRR4 is a PLS-type PPR protein with a truncated DYW deaminase domain and provides editing site recognition, and DYW1 contains an intact DYW deaminase domain and could serve a catalytic function (38). These results provide an example of two genes acting in trans to perform editing site recognition and editing site conversion. That study, taken together with the results presented here, suggests that PPR proteins may be capable of working in pairs, with one PPR protein providing the site specificity and a second PPR protein providing the catalytic activity for the deamination reaction. Although there are relatively few examples of multiple PPR genes necessary for the same editing site (13,17,19), it is possible that there is functional redundancy of the PPR protein that provides the deaminase activity so that any one of several PPRs could serve a catalytic function. The RIP/MORF protein family members each have broad effects on various subgroups of editing sites (3,5), and multiple PPR proteins could potentially bind to each RIP/MORF to provide close proximity of PPR proteins for site specificity and catalysis.