Structures of Phytophthora RXLR Effector Proteins

Phytopathogens deliver effector proteins inside host plant cells to promote infection. These proteins can also be sensed by the plant immune system, leading to restriction of pathogen growth. Effector genes can display signatures of positive selection and rapid evolution, presumably a consequence of their co-evolutionary arms race with plants. The molecular mechanisms underlying how effectors evolve to gain new virulence functions and/or evade the plant immune system are poorly understood. Here, we report the crystal structures of the effector domains from two oomycete RXLR proteins, Phytophthora capsici AVR3a11 and Phytophthora infestans PexRD2. Despite sharing <20% sequence identity in their effector domains, they display a conserved core α-helical fold. Bioinformatic analyses suggest that the core fold occurs in ∼44% of annotated Phytophthora RXLR effectors, both as a single domain and in tandem repeats of up to 11 units. Functionally important and polymorphic residues map to the surface of the structures, and PexRD2, but not AVR3a11, oligomerizes in planta. We conclude that the core α-helical fold enables functional adaptation of these fast evolving effectors through (i) insertion/deletions in loop regions between α-helices, (ii) extensions to the N and C termini, (iii) amino acid replacements in surface residues, (iv) tandem domain duplications, and (v) oligomerization. We hypothesize that the molecular stability provided by this core fold, combined with considerable potential for plasticity, underlies the evolution of effectors that maintain their virulence activities while evading recognition by the plant immune system.

Effector genes tend to reside in dynamic regions of the pathogen genomes, frequently exhibit high levels of sequence polymorphisms, and display signatures of rapid evolution, including high rates of copy number variation, presence/absence polymorphisms, and positive selection. The primary adaptive function of effectors is to promote virulence, mainly by suppressing plant immunity, and some effectors are required for full virulence (6-9). However, plants have evolved mechanisms to recognize effector proteins, resulting in effector-triggered immunity that restricts pathogen growth. This places selective pressure on the effector to evade host recognition while maintaining its virulence activity (10). How effectors evolve to enable parasites to adapt to their hosts is not well understood. Specifically, the underlying molecular mechanisms by which effectors evolve to gain new virulence functions, adapt to their host targets, and/or evade the plant innate immune system are unclear.
Phytophthora is a genus of the oomycetes (water molds). This group of eukaryotic microorganisms includes plant pathogens that are responsible for some of the most devastating diseases of plants and have serious impacts on global crop production, on forestry, and in ornamental settings (3,4). Phytophthora infestans, the causative agent of potato and tomato late blight, triggered the Irish potato famine in the 1840s. Phytophthora sojae causes soybean root rot, and Phytophthora capsici causes pepper and cucurbit blight. These destructive parasites continue to be a major problem for modern agriculture, with significant economic and environmental effects realized through crop losses and the methods used to control these diseases. In recent years, the sudden oak death pathogen Phytophthora ramorum has emerged as a significant threat to both private and commercial forests.
The genome sequences and effector repertoires of several economically important Phytophthora species are available, including P. infestans, P. capsici, and P. sojae (3,5) (see the P. capsici Genome Project Web site). These parasites secrete modular effectors that are translocated into host cells (12,13). One class of Phytophthora effectors is the RXLR family that also occurs in downy mildew oomycetes, such as Hyaloperonospora arabidopsidis (2). RXLR effectors are defined by a secretion signal peptide followed by a conserved N-terminal domain defined by the RXLR (Arg-Xaa-Leu-Arg) consensus sequence. The RXLR domain is required for translocation inside plant cells (13) but is dispensable for the biochemical activity of the effectors when expressed directly inside host cells (14). Indeed, the effector activity appears to reside in the C-terminal regions of RXLR effectors (the "effector domain") (14).
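The motif definition above (Arg-Xaa-Leu-Arg) can be illustrated with a minimal Python sketch. It assumes a plain regular expression is an adequate stand-in for the motif; real effector prediction also considers the signal peptide and the position of the motif, which are not modeled here. The function names and the toy sequence are hypothetical.

```python
import re

# Assumption: "Xaa" is any one-letter amino acid code, so the motif is R-x-L-R.
# A lookahead is used so that overlapping occurrences are all reported.
_RXLR_LOOKAHEAD = re.compile(r"(?=R[A-Z]LR)")


def find_rxlr(sequence: str) -> list[int]:
    """Return the 1-based start position of every RXLR motif in the sequence."""
    return [m.start() + 1 for m in _RXLR_LOOKAHEAD.finditer(sequence.upper())]


def split_at_rxlr(sequence: str):
    """Split a sequence after its first RXLR motif.

    Returns (N-terminal region including the motif, C-terminal region),
    or None when no motif is present.
    """
    positions = find_rxlr(sequence)
    if not positions:
        return None
    end = positions[0] - 1 + 4  # 0-based index just past the 4-residue motif
    return sequence[:end], sequence[end:]


if __name__ == "__main__":
    toy = "MKLSAVRSLRTTEEWYQ"  # hypothetical sequence, not a real effector
    print(find_rxlr(toy))
    print(split_at_rxlr(toy))
```

The second element returned by `split_at_rxlr` corresponds only loosely to the "effector domain" discussed in the text, since the sketch ignores any linker between the motif and the folded domain.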
The effector domains of RXLR proteins display extensive sequence diversity and typically share little sequence similarity with proteins with known activities, making functional annotation and prediction of tertiary folds from sequence difficult. Despite this diversity, there are recognizable sequence relationships within the C-terminal regions of Phytophthora RXLR effectors, and some repeating sequence motifs termed "W", "Y", and "L" have been described (15). These analyses indicate that many, but probably not all, RXLR effector genes share a common ancestor (3,15). How RXLR effectors encode adaptability while retaining their viability at the molecular level is unknown. Further, the virulence activities of individual RXLR effectors are only just beginning to emerge, and little is known about the biochemical mechanisms that underlie these activities (6).
Several RXLR effectors, including P. infestans AVR3a, AVR4, AVRblb1, and AVRblb2 and H. arabidopsidis ATR1 and ATR13, were first identified based on their avirulence activity, i.e. their ability to activate effector-triggered immunity in plants that carry a corresponding resistance (R) protein (14) (supplemental Fig. 1). AVR3a is a member of a large subfamily of RXLR effectors with homologues in P. sojae (3) and P. capsici (see the P. capsici Genome Project Web site). In P. infestans, AVR3a is found in either of two major allelic forms, AVR3aKI or AVR3aEM, with K/E defining the amino acid residue at position 80 (Lys or Glu) and I/M the residue at position 103 (Ile or Met) (16). These different forms trigger different responses in planta. AVR3aKI, but not AVR3aEM, activates effector-triggered immunity in the presence of the potato resistance protein R3a (16). In plants that do not carry R3a, both AVR3a forms bind and stabilize the E3 ubiquitin ligase CMPG1, leading to suppression of INF1-elicited plant immunity and programmed cell death (PCD) (17,18). Further, the identity of the residue at the C terminus of AVR3a is critical for INF1 suppression activity but not for R3a recognition (17). Gain-of-function mutagenesis in AVR3aEM identified a "hot spot" between residues Ser123 and Lys130 that enhances recognition by R3a and strengthens PCD suppression in response to INF1 (17).
Other oomycete RXLR effectors have been validated based on their ability to modulate defense responses in plants (supplemental Fig. 1). One such effector, P. infestans PexRD2, was identified in a functional screen to promote cell death activity in several susceptible and resistant plants (19,20). The promotion of cell death induced by PexRD2 could reflect its virulence activity (19), but the exact mechanism by which this effector contributes to pathogenesis remains unclear.
The purpose of this study was to investigate the three-dimensional structures of RXLR effectors from Phytophthora. Our aim was to decipher the nature of the relationships between these effectors at the molecular level and determine how these might provide information on their remarkable plasticity and accelerated evolution. Here we report the crystal structures of the effector domains from two oomycete RXLR proteins: (i) P. capsici AVR3a11, a homologue of P. infestans AVR3a (6, 16-18) and AVR1b from P. sojae (21,22) and (ii) P. infestans PexRD2. These two effectors do not share significant sequence identity in their effector domains but, remarkably, display a similar α-helical fold in their structure. Structure-informed bioinformatic analyses revealed that this fold defines a domain observed in about 44% of annotated Phytophthora RXLR effectors. We discovered that many functionally important residues and those that are polymorphic map to the surface of the proteins. Our findings led us to propose a model for the structural basis of plastic evolution in RXLR effectors. We hypothesize that the core α-helical fold (termed the "WY-domain") can adapt through (i) insertion/deletions in loop regions between α-helices, (ii) extensions to the N and C termini, (iii) amino acid replacements in surface residues, (iv) tandem domain duplications, and (v) oligomerization. We propose that the core fold provides both a degree of molecular stability and plasticity that enables development/maintenance of effector virulence activities while allowing evasion of recognition by the plant innate immune system during rapid "arms race" co-evolution.

EXPERIMENTAL PROCEDURES
Construct Design-We used RONN (23) to guide construct design of AVR3a11 and PexRD2 for protein expression. RONN predicted that the N-terminal regions of the effectors, up to and including the RXLR signal sequence, were likely to be disordered.
Protein Production-Constructs of AVR3a11 and PexRD2 were designed to limit inclusion of any predicted disordered regions, and synthetic genes were ordered (optimized for expression in Escherichia coli) and then subcloned into expression vectors (for details, see supplemental material). A second AVR3a11 construct (Thr70-Val132) was produced by PCR and cloned as detailed in the supplemental material. All constructs were verified by sequencing. Details of protein expression, which followed well established procedures, are given in the supplemental material. Proteins were purified by Ni2+-immobilized metal affinity chromatography and gel filtration chromatography (see supplemental material). Protein molecular masses were verified using mass spectrometry (see supplemental material).
Crystallization and Data Collection-Initial crystals of AVR3a11 (the Thr70-Val132 construct at 19 mg/ml) and PexRD2 (12 mg/ml) were grown at 20°C in sitting drop plates set up using a robotic crystallization system and then optimized in 24-well plate hanging drop experiments. Details of precipitant and cryoprotectant solutions are given in the supplemental material. Data used to solve the structure of AVR3a11 were collected using a Rigaku RU-H3RHB generator/Mar345 detector, with high resolution data collected on beamline I04 of the Diamond Light Source, UK. All data for PexRD2 were collected on I02 (Diamond Light Source, UK). Data collection statistics are given in Table 1.
Data Processing and Structure Solution, Refinement/Rebuilding, and Validation-X-ray diffraction data were processed with iMOSFLM (24) and scaled with SCALA (25), as implemented in the CCP4 suite (26). For phasing, AVR3a11 crystals were immersed in cryoprotectant (see supplemental material), supplemented with 100 mM potassium iodide for 30 s before freezing. The AutoSol wizard of PHENIX (27) was used to solve the structure of AVR3a11 (to 1.9 Å resolution), with a combined SAD/SIRAS approach using the anomalous signal from iodide. PHENIX was also used to produce an initial model, which was then refined against the high resolution data. PexRD2 was co-crystallized with ammonium bromide, and anomalous scattering from bound bromide ions was used to solve the structure. Bromide sites were initially identified with SHELXC/D (28), using a SAD approach, and data were collected at a wavelength of 0.90 Å. These positions were then used by PHASER-EP (29), as implemented in the CCP4 suite, to calculate initial phases. These phases were modified with PARROT, and an initial model was built using BUCCANEER (30). This model was then refined against the high resolution data. Final models were produced through iterative rounds of refinement using REFMAC5 (26) and rebuilding with COOT (31). Structure validation used COOT and MOLPROBITY (32). Protein structure figures were prepared with CCP4mg (33). Refinement and validation statistics are given in Table 1. The coordinates and structure factor amplitudes have been submitted to the Protein Data Bank with accession codes 3ZR8 (AVR3a11) and 3ZRG (PexRD2).
Analytical Ultracentrifugation-Sedimentation equilibrium experiments were performed using a Beckman XLA-I analytical ultracentrifuge equipped with an An-50 Ti 8 place rotor and absorbance/interference optics. For AVR3a11, samples at 0.3 and 0.45 mg/ml were monitored using absorbance optics, and equilibrium scans were obtained at speeds of 38,000 rpm. Interference patterns were collected at equilibrium for PexRD2 with samples at 3 and 6 mg/ml and at a speed of 37,000 rpm. Data were fitted to either a single species (AVR3a11) or dimer/tetramer (PexRD2) models using ULTRASCAN (34). For PexRD2, the dimeric form predominates in solution, and there was no evidence for a monomer.
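As a rough illustration of how a fitted mass can be compared with the monomer mass to suggest an oligomeric state, the following Python sketch uses the AVR3a11 values reported under "Results and Discussion" (7350 Da fitted, 7474 Da theoretical). The function names are hypothetical, and this simple ratio is an assumption made for illustration; it is not the model fitting performed by ULTRASCAN.

```python
def oligomeric_state(fitted_mass_da: float, monomer_mass_da: float) -> int:
    """Nearest whole number of monomers consistent with a fitted mass."""
    if fitted_mass_da <= 0 or monomer_mass_da <= 0:
        raise ValueError("masses must be positive")
    return max(1, round(fitted_mass_da / monomer_mass_da))


def percent_deviation(fitted_mass_da: float, expected_mass_da: float) -> float:
    """Absolute deviation of the fitted mass from the expected mass, in percent."""
    return 100.0 * abs(fitted_mass_da - expected_mass_da) / expected_mass_da


if __name__ == "__main__":
    # AVR3a11 values quoted in the text.
    fitted, theoretical = 7350.0, 7474.0
    print(oligomeric_state(fitted, theoretical))
    print(percent_deviation(fitted, theoretical))
```

With the quoted AVR3a11 masses the ratio is close to one, in line with the monomeric state described in the text.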
Bioinformatic Analysis-MEME searches allowed for motif lengths up to 100 amino acids (for additional search parameters see the supplemental material). We constructed HMM models of the four motifs that span the WYL sequences and used these to search the effector database. From this we assembled an updated HMM model describing the WY-domain (49 amino acids). For inclusion of sequences in this updated model, we chose a cut-off of e < 0.12 from the initial search, which corresponds to the position of P. infestans AVR3a in the list (for sequences used, see supplemental Table 1). This new HMM model was used to re-search the RXLR effector and non-RXLR proteome databases from Phytophthora and H. arabidopsidis and the non-RXLR secretome of Phytophthora. Hits with an HMM score greater than 0.0 were considered putative WY-domain-containing proteins. This cut-off was chosen based on the distribution of WY-domain sequences and our estimate of the false positive discovery rate in the non-RXLR secretome (see Fig. 4c and supplemental material).
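The score cut-off described above can be sketched in Python as a simple filter. This is a minimal sketch under stated assumptions: the `Hit` record, the protein identifiers, and the scores are hypothetical, and the code does not parse real HMMER output. Only the rule itself (keep hits scoring above 0.0) is taken from the text.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """One HMM match; a protein may have several (e.g. tandem domains)."""
    protein_id: str
    score: float


SCORE_CUTOFF = 0.0  # from the text: hits scoring greater than 0.0 are kept


def putative_wy_proteins(hits: Iterable[Hit], cutoff: float = SCORE_CUTOFF) -> list[str]:
    """Sorted IDs of proteins with at least one hit scoring above the cut-off."""
    return sorted({h.protein_id for h in hits if h.score > cutoff})


def fraction_with_domain(hits: Iterable[Hit], n_proteins: int,
                         cutoff: float = SCORE_CUTOFF) -> float:
    """Fraction of a searched protein set that has a hit above the cut-off."""
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return len(putative_wy_proteins(hits, cutoff)) / n_proteins


if __name__ == "__main__":
    # Hypothetical scores for four hypothetical proteins.
    hits = [Hit("effA", 12.5), Hit("effA", 3.1), Hit("effB", -4.0),
            Hit("effC", 0.0), Hit("effD", 0.7)]
    print(putative_wy_proteins(hits))
    print(fraction_with_domain(hits, n_proteins=4))
```

Counting proteins rather than hits keeps an effector with several tandem WY-domains from being counted more than once.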
Sequence and Structural Alignments-Sequence alignments were prepared using ClustalW2 (35). Structure-based sequence alignments were produced by hand. Structural alignments between AVR3a11/PexRD2 were prepared with SSM (36), producing a root mean square deviation based on Cα positions, as reported.
In Planta Co-immunoprecipitations-To determine whether AVR3a11, AVR3aKI, AVR3aEM, and PexRD2 exist as oligomers in planta, co-immunoprecipitation experiments were performed using tagged effectors expressed in Nicotiana benthamiana leaves via agroinfiltration. Two different tagged constructs were made for each effector: (i) FLAG epitope-tagged constructs were cloned into pJL-TRBO (37), and (ii) GFP-epitope effector constructs were generated in pK7WGF2 (38). Details of the cloning are given in the supplemental material. Co-immunoprecipitations were performed with anti-FLAG M2 affinity gel (Sigma) using total protein extracts harvested from leaves 3 days postinoculation. Interacting proteins were detected by Western blot, probing with anti-GFP primary antibodies (Invitrogen), anti-rabbit secondary antibodies conjugated to horseradish peroxidase (Sigma), and a chemiluminescent substrate (Thermo Scientific). Additional details are given in the supplemental material.

RESULTS AND DISCUSSION
To address structure/function relationships in oomycete RXLR effectors, we established an E. coli-based expression screen aimed at producing proteins that were amenable to structure determination. This screen identified the P. capsici effector AVR3a11 (a homologue of AVR3a from P. infestans) and the P. infestans effector PexRD2 as highly expressed and soluble. The purified proteins were crystallized, and their structures were determined to 0.9 and 1.75 Å, respectively (Table 1).

The Structure of AVR3a11 Is a Monomeric Four-helix Bundle-The structure of AVR3a11 comprises Thr70-Val132 (63 residues) of the native sequence. This construct lacks the N-terminal signal sequence and RXLR translocation motif, which were predicted to be disordered (Figs. 1, a and b, and supplemental Fig. 2). An earlier construct comprising residues Gly63-Val132 was recalcitrant to crystallization but amenable to NMR analysis. Preliminary structure calculations using NMR data suggested that residues Gly63-Lys69 were not well structured (see supplemental material). The effector domain of AVR3a11 adopts a four-helix bundle fold, stabilized by a hydrophobic core, with the helices connected by loop regions (Fig. 1c). The crystal structure suggests that the effector domain of AVR3a11 is a monomer. This oligomeric state was also observed in solution, as shown by the retention volume of the protein on a size exclusion column during purification (data not shown) and analytical ultracentrifugation (7350 ± 0.03 Da, Fig. 2a; theoretical mass = 7474 Da). Further, AVR3a11 was shown to be monomeric in planta using co-immunoprecipitation (Fig. 2c and supplemental Fig. 3). Structure-based database searches (39) using the refined AVR3a11 model revealed low significance hits to other four-helix bundles, including KaiA from Anabaena sp. PCC7120 (Protein Data Bank code 1R5Q). KaiA is a cyanobacterial circadian clock regulator (40), and although the overall fold of AVR3a11 is loosely related to KaiA, the differences in structure and surface properties do not suggest a role for the AVR3a family as circadian rhythm regulators.
OCTOBER 14, 2011 • VOLUME 286 • NUMBER 41

JOURNAL OF BIOLOGICAL CHEMISTRY 35837
AVR3a11 belongs to a family of Phytophthora effectors that includes P. infestans AVR3a and AVR1b from P. sojae (supplemental Fig. 4a). Pairwise alignments based on the effector domains revealed that AVR3a11 shares 41% sequence identity with AVR3a and 46% sequence identity with AVR1b (spanning residues 70-132, AVR3a11 numbering). The structure of AVR3a11 is therefore representative of this family and provides an opportunity to interpret RXLR effector domain function with respect to structure.
Residues Important for the Function of P. infestans AVR3a Map to the Surface of AVR3a11-Several functionally important residues of the AVR3a11 homologue AVR3a have previously been identified (17). Amino acids 80 and 103 of AVR3a map to Glu71 and Gln94 in AVR3a11 (Fig. 3a and supplemental Fig. 5) and are positioned at the start of the first α-helix (α1) and midway along α2, respectively. Although the Cα atoms of these residues are 16.7 Å apart, the residues locate to the same face of the four-helix bundle, suggesting that this region forms an important surface both for recognition by R3a and PCD suppression activity. The C terminus of the AVR3a family is somewhat divergent (17,41) and contains indels of up to 3 residues. In the structure of AVR3a11, the final three residues emerge from α4, and their conformation is not likely to be constrained in solution (Figs. 1c and 3a). The C terminus of AVR3a, which has a role in PCD suppression activity but not recognition by R3a, is on the opposite face of the four-helix bundle from the K/E and I/M positions. Further, the region comprising residues 123-130 of AVR3a (112-117 in AVR3a11) forms a prominent loop between α3 and α4 ("loop-3"; Figs. 1c and 3a). The sequence of this loop is variable within the AVR3a family (and includes an indel of up to 4 residues). The fact that a single point mutation in this region in AVR3aEM (S123C (17)) confers AVR3aKI levels of R3a recognition and PCD suppression reveals that this loop is an important determinant of protein activity.
The Structure of AVR3a11 Is Built from Widely Conserved Sequence Motifs-The C-terminal domains of RXLR effectors have been reported to carry conserved sequence motifs termed W, Y, and L (3, 15). The AVR3a family contains a single W- and Y-motif (Figs. 1b and 4 and supplemental Fig. 6), with an additional "K" motif to the N terminus (21). The structure of AVR3a11 revealed that these motifs map to discrete secondary structure units that together assemble the four-helix bundle: residues of the K-motif comprise α1 and loop-1; the W-motif comprises α2, loop-2, and α3; and the Y-motif comprises the C terminus of loop-3 and α4 (Fig. 1b). Side chains from residues of each helix contribute to the hydrophobic core of the bundle. In AVR3a11, the key residue of the W-motif is Trp96, which is located at the C-terminal end of α2 and is excluded from bulk solvent. This residue makes favorable edge-to-face stacking interactions with Tyr125 (42), the most highly conserved residue of the Y-motif, which is positioned toward the end of α4. Tyr125 also forms a hydrogen bond with the carbonyl oxygen of Ala101, positioned on loop-2, which may be important for stabilizing the conformation of this loop. Tyr122 in AVR3a11 is positioned 3 residues before the conserved Tyr in the Y-motif (this position is frequently a bulky hydrophobic amino acid; see below). It is not immediately apparent from the structure why such high conservation at this position is maintained. Additional conserved residues of the W-motif (Leu106 and Leu110 in AVR3a11) anchor α3 to the bundle by forming interactions within the hydrophobic core. Mutation of the Trp96 and Leu110 equivalents in AVR3a led to reduced protein stability in planta (17); similar mutations in hydrophobic residues of AVR1b also led to loss of function (21), most likely due to protein destabilization. The W- and Y-motifs together appear to form a critical folding unit, the WY-domain, that, with the addition of another helix (to the N terminus), can form a four-helix bundle.
[Fig. 2 legend] Fits of the analytical ultracentrifugation data (including residuals) for (a) AVR3a11 (0.3 mg/ml (circles) and 0.45 mg/ml (triangles)) and (b) PexRD2 (3 mg/ml (triangles) and 6 mg/ml (circles)) to the models described confirmed that monomeric and dimeric species are the prevalent forms of these proteins in solution. c, co-immunoprecipitation (co-IP) shows that PexRD2 self-associates in planta; no oligomerization is apparent for AVR3a11, AVR3aKI, or AVR3aEM. RuBisCO is included as a loading control (Coomassie-stained SDS-polyacrylamide gel).
PexRD2 Is Also a WY-domain Protein but with Adaptations-The structure of the effector domain of PexRD2 comprises Ala57-Ala120 (64 residues), starting immediately after the RXLR sequence through to the penultimate residue in the construct (Val121 could not be modeled in the electron density). The structure is composed of five α-helices and crystallizes as a dimer with intimate hydrophobic contacts at the interface (buried surface area = 1830 Å2, 20% of the surface area of each monomer; PISA CSS score = 0.89 (43)) (Fig. 1d). Structure-based database searches (39) did not reveal any significant hits to PexRD2. Given the lack of sequence identity (<20%, supplemental Fig. 4b), we were surprised to discover that comparison of the structures of a PexRD2 monomer with AVR3a11 using SSM (36) aligned 37 residues comprising α2, loop-2, α3, and α4 with a root mean square deviation = 1.0 Å (Fig. 1e). This region comprises the helices of the WY-domain described for AVR3a11. However, PexRD2 contains no equivalent of α1 from the AVR3a11 four-helix bundle, and a significant insertion (16 residues) is present between α3 and α4 (loop-3). The structurally aligned residues share only 13.5% sequence identity (Fig. 1b and supplemental Fig. 4b).
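The two comparison metrics quoted above, root mean square deviation over Cα positions and percentage identity over aligned residues, can be sketched in Python as follows. This is a minimal illustration, not the SSM algorithm: it assumes the two coordinate sets are already superposed and paired residue by residue, and all coordinates and sequences shown are hypothetical. As a side note on the arithmetic, 13.5% identity over 37 aligned residues is what 5 identical positions would give (5/37).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD between paired, already superposed coordinates (same units in and out)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if gap not in (a, b)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Hypothetical coordinates: every atom displaced by 1 unit along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(a, b))
    # Hypothetical 37-column alignment with 5 identical positions.
    print(round(percent_identity("A" * 5 + "C" * 32, "A" * 5 + "D" * 32), 1))
```

A full structural alignment would also have to find the residue pairing and the optimal superposition, which are deliberately left out of this sketch.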
PexRD2 Is a Dimer in Solution and Self-associates in Planta-To confirm that the PexRD2 dimer was not a crystallographic artifact, we used analytical ultracentrifugation to show that this oligomeric state predominates in solution (Fig. 2b). The best fit of the data was obtained using a dimer/tetramer model with the molecular masses of the species fixed at their theoretical values. The retention volume of PexRD2 on a size exclusion column during purification also supports the dimeric state (data not shown). Further, we performed in planta co-immunoprecipitation experiments with epitope-tagged effectors. These showed that PexRD2, but not AVR3a11 (see above) or AVR3a, self-associates when transiently expressed in N. benthamiana leaves via agroinfiltration (Fig. 2c and supplemental Fig. 3). Given the evidence from the crystal structure and analytical ultracentrifugation, it is reasonable to assume that PexRD2 dimerizes in planta.
Polymorphic Residues in PexRD2 Are Surface-presented-PexRD2 from all strains of P. infestans sequenced to date displays 100% sequence identity in the effector domain (44). However, a PexRD2 homologue in the closely related species P. mirabilis PIC99114 contains polymorphisms at five positions (Fig. 3b and supplemental Fig. 4c) (44). Comparison of these two sequences revealed that positive selection has acted on this gene because the ratio of non-synonymous to synonymous substitutions is >1 (dN/dS = 1.27). The PexRD2 structure revealed that these 5 polymorphic residues are presented on the protein surface and are not co-localized (i.e. do not define a particular surface region) (Fig. 3b). We conclude that polymorphisms in some surface-exposed residues may have contributed to the evolution of PexRD2 without disrupting the WY-domain fold.
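To make the dN/dS criterion concrete, the following Python sketch computes the ratio from counts of differences and sites. It assumes the Jukes-Cantor distance correction applied to the proportions of non-synonymous and synonymous differences; the text does not state which estimation method was used for PexRD2, and the counts in the example are hypothetical rather than the PexRD2 data.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance from a proportion of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """dN/dS ratio (omega) from difference and site counts for one sequence pair."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n / d_s


def selection_regime(omega: float) -> str:
    """Conventional reading of omega: >1 positive, <1 purifying, =1 neutral."""
    if omega > 1.0:
        return "positive"
    if omega < 1.0:
        return "purifying"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    omega = dn_ds(nonsyn_diffs=6, nonsyn_sites=140, syn_diffs=2, syn_sites=60)
    print(omega, selection_regime(omega))
    print(selection_regime(1.27))  # the ratio reported in the text
```

A ratio above 1 from a single pairwise comparison is an indication rather than a formal test, which is consistent with the cautious wording of the conclusion above.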
The WY-domain Is a Unit Conserved in Many Phytophthora and H. arabidopsidis RXLR Effectors-The unexpected structural similarity we discovered between AVR3a11 and PexRD2 prompted us to re-evaluate the context of the repeated WYL motifs previously reported in Phytophthora RXLR effectors (3,15) and extend this analysis to H. arabidopsidis. We used MEME (45) to search for conserved motifs in a data set comprising the P. infestans, P. ramorum, and P. sojae RXLR effector repertoires (1207 proteins, C-terminal effector domain only, downstream of the RXLR). The highest scoring motifs are dominated by overlapping regions incorporating W-, Y-, and L-motifs that can be readily aligned to reveal repeating WYL sequences (supplemental Fig. 6). One newly identified 49-amino acid MEME motif spanned both the W- and Y-motifs and covers the α2/α3/α4 WY-domain seen in the structures of AVR3a11 and PexRD2 (Fig. 4a). Interestingly, the alignment of overlapping MEME motifs revealed that the conserved Leu of the L-motif is positioned 8 residues to the N terminus of the conserved Trp of the W-motif.
To explore the extent to which the WY-domain is found in RXLR effectors, we constructed HMM models of four MEME motifs (Fig. 4b, supplemental Fig. 7, and supplemental Table 1) and rescreened the Phytophthora and H. arabidopsidis RXLR effector sequences with HMMER (46); we also screened the non-RXLR proteome of Phytophthora and H. arabidopsidis and the non-RXLR secretome of Phytophthora with the WY-domain HMM. This analysis revealed that WY-domains are highly enriched in RXLR effectors compared with the non-RXLR proteome (Fig. 4c). Using the score cut-off described under "Experimental Procedures," we found that 527 of 1207 (44%) Phytophthora and 35 of 134 (26%) H. arabidopsidis RXLR effectors contain WY-domain-like sequences (Table 2 and supplemental Table 2). In contrast, in the non-RXLR proteome, only 0.3 and 0.6% (Phytophthora and H. arabidopsidis, respectively) contain WY-domain-like sequences above the same cut-off (Table 2 and supplemental Table 2). We therefore conclude that our false positive rate is ≤0.6% and that 562 of 1341 (42%) oomycete RXLR effectors probably contain a core fold similar to that found in AVR3a11 and PexRD2. WY-domains can also be found in tandem repeats, and we identified one candidate effector, PITG_23035, with 11 WY-domains. Therefore, many Phytophthora RXLR effectors comprise either single or tandem arrays of WY-domains, joined by variable "linker" regions.
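The proportions quoted above can be checked with a few lines of Python. The counts are taken directly from the text; the dictionary layout and function name are merely illustrative.

```python
# Counts from the text: (effectors with WY-domain-like sequences, effectors searched).
COUNTS = {
    "Phytophthora": (527, 1207),
    "H. arabidopsidis": (35, 134),
}


def percent(hits: int, total: int) -> float:
    """Hits as a percentage of the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


if __name__ == "__main__":
    for group, (hits, total) in COUNTS.items():
        print(f"{group}: {hits}/{total} = {percent(hits, total):.1f}%")
    # Combined oomycete figure: sum the hits and the totals separately.
    all_hits = sum(h for h, _ in COUNTS.values())
    all_total = sum(t for _, t in COUNTS.values())
    print(f"combined: {all_hits}/{all_total} = {percent(all_hits, all_total):.1f}%")
```

Note that the combined figure is computed from the pooled counts, not by averaging the two percentages, which is why it lies close to the Phytophthora value.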
A Core Protein Structure for Phytophthora RXLR Effectors That Supports Rapid Adaptive Diversification-Here we present the crystal structures of the effector domains from Phytophthora RXLR proteins AVR3a11 and PexRD2. We describe a structural unit, the WY-domain, that bioinformatic analysis suggests is conserved in ~44% of Phytophthora RXLR effectors and ~26% of H. arabidopsidis effectors. This domain forms a core α-helical fold that tolerates considerable plasticity at both the N and C termini and within the loops between the helices. It also supports amino acid substitutions in surface residues and oligomerization as further mechanisms for modification.
Understanding the mechanisms that underlie protein adaptation under evolutionary pressure is a key question in molecular evolution. Adaptive evolution can drive the development of novel protein functions or the tuning of existing ones (e.g. new substrate specificity in an enzyme active site or perturbation of molecular recognition surfaces). Some proteins can also be under selective pressure to lose particular activities (e.g. to evade immune system recognition) but still need to maintain structural integrity to avoid total loss of function. The evolution of toxicity in Kallikrein-1-like venom proteins involves the acquisition of small insertions in loop regions adjacent to the catalytic cleft (within a conserved core fold), followed by accelerated sequence diversification that increases catalytic efficiency (47). Many small protein domains (~20-60 amino acids) exist as tandem repeats to deliver a structural template that can adapt to provide novel functions. Examples include leucine-rich repeats (48) and the blades of β-propellers. Each of these structural modules encodes functional diversity through variation in the number of repeats, mutations in individual surface residues, and larger changes to loop regions. Further, TIM-barrel enzymes are built from a minimal ~20-amino acid α/β unit, usually found in eight repeats, with specificity for the types of reaction catalyzed, and their substrates, encoded by the β/α loops (49). The degree of sequence conservation across the family of known β-propellers and TIM-barrels is below detectable levels. It is generally accepted that all of these proteins have evolved from ancestral precursors through diversification.
Plant pathogen effectors frequently display extreme signatures of positive selection, presumably as a consequence of the co-evolutionary arms race with resistant hosts (3,11,44,50,51). Our findings provide a molecular framework that explains how RXLR effector proteins tolerate sequence hypervariability. Although alternative hypotheses are possible (e.g. the WY-domain is the result of convergence to a fold suited to secretion and/or translocation from Phytophthora), we favor the model that the WY-domain is a conserved structural unit found in a large family of oomycete RXLR effectors that share a common evolutionary origin.
Examples from this study that support this model include the 16-amino acid insertion in loop-3 observed in PexRD2 compared with AVR3a11, which shows that the WY-fold can incorporate significant insertions without losing integrity (Fig. 1e). The presence of α1 in AVR3a11 reveals a modification at a terminus, and the structures of AVR3a11 and PexRD2 suggest that additional residues at the C terminus of the WY-fold could easily be accommodated. Known functional and polymorphic residues within the AVR3a and PexRD2 families are positioned on the surface of the proteins where they can influence interaction with host molecules without disrupting the fold. Further, the structure of PexRD2 reveals the WY-domain fold can support oligomerization (which may or may not involve the fold adaptations/elaborations) to deliver new effector structures. Oligomerization in PexRD2 also raises the intriguing possibility that RXLR effectors may permit further functional diversification by forming heterodimers. Future experiments will determine the degree to which homo- and heterodimerization impact the activity of PexRD2.
AVR3a11 and PexRD2 are proteins with only a single WY-domain. However, our computational analyses revealed that Phytophthora RXLR effectors can comprise up to 11 WY-domain repeats, greatly enhancing the potential for encoding new structures and oligomeric states. Future studies will address how these repeating WY-domains are organized into higher order structures and how this impacts effector function. We will also use the structures of AVR3a11 and PexRD2 as templates for mutagenesis to understand how these specific effectors modulate immunity inside plant cells (supplemental Fig. 1).
In conclusion, we have defined a large family of oomycete RXLR effectors that contain at least one WY-domain and are structurally related. However, this does not preclude the existence of Phytophthora RXLR effectors that contain alternative folds and are phylogenetically unrelated to WY-domain effectors (Fig. 4c). In fact, given the diversity of RXLR sequences, it seems likely that other folds do exist, and these have yet to be structurally characterized.