Structure and Identification of a Pterin Dehydratase-like Protein as a Ribulose-bisphosphate Carboxylase/Oxygenase (RuBisCO) Assembly Factor in the α-Carboxysome*

Background: A gene encoding a distant homolog of pterin-4α-carbinolamine dehydratase (PCD) is present in all α-type carboxysome operons. Results: This conserved PCD-like protein promotes the assembly of native RuBisCO when overexpressed together with GroELS. Conclusion: The protein, named here acRAF for α-carboxysome RuBisCO assembly factor, is a newly identified molecular chaperone. Significance: The activity of this chaperone toward RuBisCO has potential agricultural application. Carboxysomes are proteinaceous bacterial microcompartments that increase the efficiency of the rate-limiting step in carbon fixation by sequestering reaction substrates. Typically, α-carboxysomes are genetically encoded as a single operon expressing the structural proteins and the encapsulated enzymes of the microcompartment. In addition, depending on phylogeny, as many as 13 other genes are found to co-occur near or within α-carboxysome operons. One of these genes codes for a protein with distant homology to pterin-4α-carbinolamine dehydratase (PCD) enzymes. It is present in all α-carboxysome containing bacteria and has homologs in algae and higher plants. Canonical PCDs play an important role in amino acid hydroxylation, a reaction not associated with carbon fixation. We determined the crystal structure of an α-carboxysome PCD-like protein from the chemoautotrophic bacterium Thiomonas intermedia K12, at 1.3-Å resolution. The protein retains a three-dimensional fold similar to canonical PCDs, although the prominent active site cleft present in PCD enzymes is disrupted in the α-carboxysome PCD-like protein. Using a cell-based complementation assay, we tested the PCD-like proteins from T. intermedia and two additional bacteria, and found no evidence for PCD enzymatic activity. 
However, we discovered that heterologous co-expression of the PCD-like protein from Halothiobacillus neapolitanus with RuBisCO and GroELS in Escherichia coli increased the amount of soluble, assembled RuBisCO recovered from cell lysates compared with co-expression of RuBisCO with GroELS alone. We conclude that this conserved PCD-like protein, renamed here α-carboxysome RuBisCO assembly factor (or acRAF), is a novel RuBisCO chaperone integral to α-carboxysome function.

Ribulose-bisphosphate carboxylase/oxygenase (RuBisCO)⁴ catalyzes the rate-limiting reaction of carbon fixation: the covalent attachment of carbon dioxide to the sugar ribulose 1,5-bisphosphate (1). Despite its central role in metabolism, RuBisCO is a notoriously inefficient enzyme. Its turnover rate varies between species but is in the range of seconds (2,3). The enzyme also shows relatively poor discrimination between its substrate, carbon dioxide, and molecular oxygen (4,5). Higher plants have evolved various mechanisms to overcome these inherent catalytic inefficiencies, including strategies that rely on compartmentalization of the steps of carbon fixation across multiple cell types. Many bacteria, on the other hand, rely on a giant proteinaceous structure known as the carboxysome to sequester the carbon fixation reactions of the cell. By encapsulating RuBisCO and carbonic anhydrase together, concentrated CO₂ can be provided in the vicinity of RuBisCO (6-10). In this way, the carboxysome serves as an essential carbon fixing organelle in many photosynthetic and some chemoautotrophic bacteria.
The major protein constituents of the carboxysome have been well characterized. Bacterial microcompartment (BMC) shell proteins assemble into hexagonal lattices that form the walls of the carboxysome shell (8,9). Pentameric bacterial microcompartment vertex proteins enable full closure of these icosahedral structures by occupying the spaces at each of the 12 vertices (9,11-13). A variety of paralogous BMC proteins are typically encoded in carboxysome operons; they are believed to play distinct roles in the shell (9). Carboxysomes are phylogenetically classified as either α- or β-type (14,15). The two divergent carboxysome lineages are distinguished by differences in RuBisCO phylogeny (RuBisCO type IA and type IB are found in α- and β-carboxysomes, respectively), gene composition, and operon organization. α-Carboxysome genes generally appear within a single operon, whereas β-carboxysome genes typically appear in multiple dispersed gene clusters. Functional differences resulting from this divergence, if any, are not currently understood.
Numerous proteins are implicated in the complex process of RuBisCO folding and activation in diverse organisms. In some instances known RuBisCO chaperones are encoded near α-carboxysome operons or β-carboxysome RuBisCO genes. RbcX, a small (~15 kDa) dimeric protein encoded between the RuBisCO large and small subunit genes of β-carboxysomes, has been shown to facilitate hexadecameric assembly of RuBisCO in the β-carboxysome system (16-18). RbcX homologs are not present in α-carboxysome containing organisms, suggesting that either the subunit assembly function performed by RbcX is unnecessary for carboxysome type IA RuBisCO or that a distinct, unknown mechanism promotes hexadecameric assembly of α-carboxysome RuBisCO. CbbX, a RuBisCO activase, modulates RuBisCO activity by dislodging inhibitory ribulose 1,5-bisphosphate molecules from the active site (19). Other ATP-dependent chaperones, CbbO and CbbQ, regulate and enhance RuBisCO catalysis by altering RuBisCO conformational states in a variety of carbon-fixing prokaryotes (20). CbbO, CbbQ, and CbbX homologs are sometimes present in α-carboxysome operons (Fig. 1). GroELS (also referred to as chaperonin) is a ubiquitous ATP-dependent chaperone complex essential for the proper folding of RuBisCO large subunits (21,22). RuBisCO assembly factor 1 was recently identified as a novel RuBisCO chaperone in Zea mays (23). RuBisCO assembly factor 1 is conserved in green plant lineages, but is not present in carbon-fixing prokaryotes. To date, plant RuBisCO has not been successfully assembled in vitro or in a heterologous Escherichia coli host, hinting at the probable existence of yet unidentified folding or assembly factors.
Recent bioinformatics studies have identified several genes of unknown function that consistently occur near α-carboxysome BMC genes (24,25). This conserved genetic association strongly suggests that the encoded proteins are involved in the catalytic function, regulation, or structure of the α-carboxysome. One of these proteins is remotely related in sequence (often with BLAST E values ≥ 0.05) to pterin-4α-carbinolamine dehydratase (PCD) enzymes, and is represented in all α-carboxysome operons we examined. It appears to be a defining member of this carboxysome subtype. Fig. 1 illustrates the organization of a number of diverse α-carboxysome operons and highlights the presence of this remote homolog of PCD.
PCD enzymes are ubiquitous throughout the three kingdoms of life, where they act in concert with aromatic amino acid hydroxylases (AAHs) in amino acid metabolism pathways (26,27). AAHs use tetrahydrobiopterin (also known as BH4 and THB) cofactors and molecular oxygen to catalyze the addition of a hydroxyl group onto aromatic amino acids. A specific example of an AAH is bacterial phenylalanine hydroxylase, which adds a hydroxyl group to phenylalanine to produce tyrosine (28). In this reaction, the tetrahydrobiopterin cofactor is simultaneously hydroxylated on the 4α carbon, generating pterin-4α-carbinolamine. Two enzymes are required to regenerate tetrahydrobiopterin from pterin-4α-carbinolamine: dihydropterin reductase and PCD (27). Functional PCDs exist as homodimers in prokaryotes. In animals, DCoH (dimerization cofactor of HNF-1 or hepatocyte nuclear factor-1) is a tetrameric bifunctional enzyme that serves both as a PCD and as a transcriptional activator of HNF-1 (29). Despite this dual role in animals, DCoHs are similar in sequence and tertiary structure to bacterial PCDs (30). In prokaryotes lacking AAHs, "orphan" PCDs are sometimes present and are suspected to participate in other unknown reactions or cellular processes (27). Catalytically inactive PCD-like proteins of unknown function are present both in chloroplasts (27) and, as shown here, in many bacteria, where they occur in close association with α-carboxysome operons. No structures have been reported to date for these PCD-like proteins. Fig. 2A summarizes current information on PCDs and PCD-like proteins.
Here we report our work investigating the structure and function of the conserved PCD-like family of proteins found in α-carboxysome containing bacteria. We show that the protein is an enzymatically inactive relic of PCD, and that its true cellular function relates to the folding or assembly of hexadecameric RuBisCO. Based on our results, we rename this protein the α-carboxysome RuBisCO assembly factor, or acRAF.

EXPERIMENTAL PROCEDURES
Cloning, Expression, and Purification of Thiomonas intermedia acRAF-The construct used for crystallization of codon-optimized acRAF (with amino acids "MHHHHHH" replacing the first 5 amino acids of the protein coding sequence) was ordered from Biomatik Corp. Gene Synthesis Co. Codon optimization was performed by DNAworks. The resulting gene sequence was cloned into the pET22b(+) vector between the 5′ NdeI and 3′ EcoRI sites. acRAF was overexpressed in E. coli BL21(DE3). One-liter flasks were inoculated with 15 ml of overnight culture and grown under ampicillin selection in LB media at 37°C, shaking at 240 rpm, until cells reached an absorbance of 0.8 at 600 nm (A₆₀₀ = 0.8). Isopropyl 1-thio-β-D-galactopyranoside was added to a final concentration of 1.0 mM. After 3-5 h, cells were spun down and frozen at −20°C until further use. Frozen cell pellets collected from 2 liters of culture were resuspended in 50 ml of buffer (50 mM Tris-HCl, pH 7.6, 300 mM NaCl, Roche Complete EDTA-free protease inhibitor mixture) and lysed by sonication. Cell debris was pelleted by centrifugation at 16,500 × g in an SS-34 rotor for 30 min at 4°C. The resulting supernatant was filtered through a 0.22-μm Whatman filter. Filtered supernatant was then loaded onto a HisTrap 5-ml column by syringe, washed with 30 ml of wash buffer (50 mM Tris-HCl, pH 7.6, 300 mM NaCl, 50 mM imidazole), and eluted in one step with elution buffer (50 mM Tris-HCl, pH 7.6, 300 mM NaCl, 400 mM imidazole). Purified acRAF was dialyzed into 10 mM Tris-HCl, pH 7.6, 50 mM NaCl, 5% glycerol. acRAF was concentrated to 11 mg/ml, as calculated from absorbance at 280 nm using the theoretical extinction coefficient.
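The concentration estimate from absorbance at 280 nm follows the Beer-Lambert law. The sketch below is only an illustration of that arithmetic: the extinction coefficient, molecular mass, dilution factor, and reading used in the example are hypothetical placeholders, since the text does not report the values used for acRAF.

```python
def concentration_mg_per_ml(a280, ext_coeff_m1_cm1, mol_mass_da,
                            path_cm=1.0, dilution=1.0):
    """Estimate protein concentration from absorbance at 280 nm.

    Beer-Lambert: A = epsilon * c * l, so c (mol/l) = A / (epsilon * l).
    Multiplying the molar concentration by the molecular mass (g/mol)
    gives g/l, which is numerically equal to mg/ml.
    """
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0 or mol_mass_da <= 0:
        raise ValueError("extinction coefficient, path length, and mass must be positive")
    # Correct the reading for any dilution made before measurement.
    molar = (a280 * dilution) / (ext_coeff_m1_cm1 * path_cm)
    return molar * mol_mass_da


if __name__ == "__main__":
    # Hypothetical numbers, chosen only so the arithmetic is easy to follow:
    # epsilon = 10,000 M^-1 cm^-1, mass = 10,000 Da, 20-fold dilution.
    print(concentration_mg_per_ml(0.55, 10_000.0, 10_000.0, dilution=20.0))
```

In practice the theoretical extinction coefficient would be computed from the tryptophan, tyrosine, and cystine content of the expressed construct, including the N-terminal tag.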
acRAF Crystallization-Hanging-drop crystallization screens were performed using the nanoliter liquid handling Mosquito from TTP LabTech at the High Throughput Macromolecular Crystallization Facility at UCLA. Optimization of needle-like crystals obtained in JCSG+ Suite crystallization screen condition A8 was pursued in 24-well (500-μl well) trays. Crystals were collected from 2-μl hanging drops of 1:1 ratio protein:well solution after 16 days of slow but continuous growth at room temperature. The optimized well solution contained 17.5% PEG 3500 and 800 mM ammonium formate. A cryoprotectant was prepared containing 17.5% PEG 3500, 800 mM ammonium formate, and 25% ethylene glycol, with or without 200 mM NH₄I for subsequent phasing by anomalous scattering. Crystals were soaked in cryoprotectant for 1-2 min before freezing and storage in liquid nitrogen.
Phasing and Refinement-Diffraction data extending to 1.3-Å resolution were collected at the Argonne National Laboratory, Advanced Photon Source (APS), beamline 24-ID-C (Table 1). Phases were obtained from single wavelength data using isomorphous replacement with anomalous scattering from iodide. Five iodide-bound sites were identified by anomalous scattering, and all five sites were used for phasing. Iodide sites were located and phases were calculated using the program HKL2MAP (31). The program Buccaneer (32) was used to automatically trace the main chain. Subsequent rounds of model building were performed with COOT (33) and the structure was refined using BUSTER (34) to a final Rwork and Rfree of 0.203 and 0.204, respectively. 98% of the backbone dihedral angles are within the favored regions of a Ramachandran diagram. Coordinates and structure factors have been deposited with the PDB ID code 4LOW.
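For readers less familiar with the refinement statistics quoted above, the crystallographic R-factor compares observed and calculated structure factor amplitudes; Rwork is computed over the reflections used in refinement and Rfree over a held-out test set. The following is a minimal sketch of that definition only. It assumes the amplitudes are already on a common scale and is not a description of how BUSTER computes or scales these values; the function names and the toy numbers are illustrative.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over a set of reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean free-set flag and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


if __name__ == "__main__":
    # Toy amplitudes (hypothetical); the last reflection is held out as the free set.
    obs = [100.0, 50.0, 50.0, 40.0]
    calc = [90.0, 55.0, 50.0, 44.0]
    print(r_work_and_free(obs, calc, [False, False, False, True]))
```

Because the free set is excluded from refinement, a small gap between the two values is usually read as a sign that the model has not been overfitted to the working data.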
PCD Complementation Assay-All PCD constructs were cloned between the 5′ NdeI and 3′ KpnI restriction enzyme sites in a pBSKII(−) vector. The PCD construct from Pseudomonas aeruginosa (NP_249562) in the vector pBSKII(−) and P. aeruginosa phenylalanine hydroxylase (NP_249563) in vector pACYC177 were gifts from Dr. Andrew Hanson. A cDNA template of the Arabidopsis thaliana chloroplast acRAF gene (NP_199924) was provided by Dr. Sabeeha Merchant. Codon-optimized T. intermedia K12 acRAF (YP_003641868) DNA was purchased through Biomatik Corp. The codon-optimized T. intermedia K12 PCD (YP_003642661) gene was synthesized by assembly PCR from oligonucleotides designed by DNAworks and purchased from IDT. Codon-optimized acRAF genes from Halothiobacillus neapolitanus (YP_003262803) and Prochlorococcus marinus (NP_874951) were purchased as IDT gBlocks. All PCDs and acRAFs were transferred into clonal descendants of Hanson's original P. aeruginosa PCD pBSKII(−) construct using Gibson assembly cloning methods, thereby replacing the original insert. Codon-optimized T. intermedia K12 phenylalanine hydroxylase (YP_003642727) (purchased from Biomatik) was transferred into PCR-linearized pACYC177 using Gibson assembly (35). The starting ATG of the pACYC177 ampicillin resistance open reading frame became the beginning ATG of T. intermedia K12 phenylalanine hydroxylase, and the insert largely replaced the ampicillin resistance gene. A pBSKII(−) empty vector control and pACYC177 empty vector controls were selected from failed cloning experiments. All constructs were sequenced by Laragen, Co. Sequences are available upon request.
Cloning of RuBisCO and acRAF-Several experiments were performed to determine the effects of acRAF and/or GroELS co-expression on the solubility and assembly of RuBisCO subunits. Ten different E. coli BL21(DE3) strains were created by triple or double electroporation into Lucigen Corp. "E. cloni EXPRESS BL21(DE3) Electrocompetent cells." CbbLS, CbbL, and CbbS were PCR-amplified from H. neapolitanus genomic DNA and cloned into the pRSF-duet vector between the 5′ NcoI and 3′ XhoI restriction enzyme sites using Gibson assembly. acRAF, also PCR-amplified from H. neapolitanus genomic DNA, was cloned into pET22b(+) between the 5′ NdeI and 3′ EcoRI restriction enzyme sites. Plasmid pBB541, a GroELS-expressing vector with a p15a origin of replication, was purchased from Addgene. Empty pRSF and pET22b(+) plasmids were used as negative controls where appropriate. Exact sequences are available upon request.
Heterologous RuBisCO and acRAF Co-expression Assay-Transformed BL21(DE3) cells were grown overnight in LB broth supplemented with the following antibiotics as appropriate: 15 μM kanamycin, 50 μM ampicillin, 12.5 μM streptomycin. 2 ml of overnight cultures were added to 250-ml flasks containing 100 ml of LB. Cells were grown at 37°C until the A₆₀₀ reached 0.8. Cells were induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside, shaking at 240 rpm, for 3 h at 37°C. Cells were spun down and resuspended in the following buffer: 50 mM Tris-HCl, pH 7.6, 300 mM NaCl, 10 mM MgCl₂, 10 mM NaHCO₃, 10 mM DTT. Buffer volume was adjusted so that all cell suspensions reached an A₆₀₀ of 54 (based on serial dilution). Cells were lysed by sonication. Insoluble cell debris was pelleted in 1.6-ml Eppendorf tubes by spinning at 20,000 × g in a tabletop microcentrifuge. Supernatant was then loaded onto Bio-Rad MiniProtean TGX gels (catalog number 456-9036) and analyzed by SDS-PAGE (data not shown) and native gels (Fig. 6).
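Normalizing every suspension to the same A₆₀₀ amounts to a simple conservation calculation: the product of optical density and volume is held constant between the harvested culture and the resuspended pellet. The sketch below illustrates that arithmetic under two stated assumptions: that A₆₀₀ scales linearly with cell density once readings are taken on diluted samples, and that the volume of the pellet itself is negligible. The function names and example readings are hypothetical.

```python
def a600_from_dilution(reading, dilution_factor):
    """Back-calculate the A600 of a dense sample from a reading on a diluted aliquot."""
    if reading < 0 or dilution_factor <= 0:
        raise ValueError("reading must be non-negative and dilution factor positive")
    return reading * dilution_factor


def resuspension_volume_ml(culture_a600, culture_ml, target_a600=54.0):
    """Buffer volume that brings the harvested cells to the target A600.

    Conservation of cells: culture_a600 * culture_ml = target_a600 * buffer_ml.
    """
    if culture_a600 <= 0 or culture_ml <= 0 or target_a600 <= 0:
        raise ValueError("all inputs must be positive")
    return culture_a600 * culture_ml / target_a600


if __name__ == "__main__":
    # Hypothetical example: a 100-ml culture measured at A600 = 0.27 after a
    # 10-fold dilution, i.e. an undiluted A600 of about 2.7.
    culture_od = a600_from_dilution(0.27, 10)
    print(resuspension_volume_ml(culture_od, 100.0))
```

Equalizing the suspensions this way means that band intensities on the subsequent gels can be compared across strains without a separate loading correction.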
Associated Gene Analysis-α-Carboxysome operons were detected by searching the non-redundant protein sequence database for sequences homologous to the α-carboxysome protein CsoS2, using the program BLAST. Twenty-one α-carboxysome operons were selected with the aim of maximizing diversity of both genomic backgrounds and α-carboxysome components. Operons from the following organisms were analyzed by hand: To develop a better picture of the various uncharacterized proteins that appear to be associated with α-carboxysomes, we have used the comparative genomics program SEED to systematize the known α-carboxysome operons and their associated genes. Those results are publicly accessible (SEED curator: nwheatley; subset name: α-carboxysome).
Data Deposition-Atomic coordinates and diffraction data for acRAF have been deposited with PDB code 4LOW.

RESULTS
Structure of T. intermedia acRAF-We determined the structure of T. intermedia acRAF (GI: 296134626, YP_003641868) to 1.3-Å resolution by x-ray crystallography (Fig. 2, B and C). An initial acRAF construct with a C-terminal His₆ tag did not crystallize despite many attempts. Secondary structure prediction programs indicated that the first six amino acids of native acRAF were unstructured. We therefore moved the polyhistidine tag to the N terminus. We chose to replace the presumptively disordered N terminus with the polyhistidine tag rather than extend the disordered region, and this construct crystallized. The structure was determined by anomalous scattering methods (Table 1). The asymmetric unit of the P4₁ crystal contained two protein subunits, or one dimer. A final refined model was obtained in which a total of 148 of the 164 residues of the combined subunits of the acRAF dimer could be reliably modeled (Fig. 2B). The electron density for residues 66-70, corresponding to the region of catalytic activity in PCD enzymes, was not interpretable, indicating multiple conformations and/or high structural disorder in this region of acRAF. This segment was therefore not included in the final deposited structure, although the loop was included in a plausible conformation for the purposes of structure visualization in some cases. In the crystal structure, the histidine residues that were inserted at the N terminus emerge from an ordered β strand, implying that the deleted N-terminal residues of acRAF might have contributed to a β sheet, contrary to our predictions. Further analysis revealed that these ordered histidine residues are involved in a chelation-dependent crystal contact with a divalent cation, thereby creating a fortuitous interaction important for crystal formation.
Comparisons between acRAF and true PCD enzymes show significant structural similarities despite low sequence identity. Generally, about 75% of a typical acRAF sequence can be aligned to PCD. Within that region of similarity, 25-35% of residues are identically conserved. Structural superposition of acRAF onto Toxoplasma gondii PCD (PDB code 2V6T) resulted in a backbone root mean square deviation of only 1.5 Å (Fig. 3). The main difference between the PCD and acRAF structures is the N-terminal helix. The PCD enzyme from T. gondii contains an 11-residue N-terminal helix that is lacking in acRAF. The tertiary structures of PCD and acRAF are similar to the ferredoxin superfamily, but with a minor circular permutation near the chain termini. We note here the intriguing observation that a number of other carboxysome-related proteins also show a similar fold. This includes the main BMC shell proteins, the N-terminal domain of the RuBisCO large subunit, and a PII nitrogen regulatory-like protein encoded in several proteobacteria α-carboxysome operons (36) (Fig. 4). The oligomeric state of acRAF in the crystal structure is a dimer, which like true PCDs is composed of two subunits arranged as a symmetric dimer. The dimeric interface is formed by a β strand (residues 50-55) and an α helix (residues 28-43). The β strand pairs with the cognate strand from the other subunit to form an anti-parallel β sheet. A presumptive catalytic motif for PCD enzymatic activity can be inferred by analyzing the arrangement and sequence conservation in representative PCDs known to be active compared with those proven earlier to be inactive. The PCD catalytic motif (27)
Despite structural divergence in the active site, we deemed it necessary to experimentally test acRAFs for PCD activity for two reasons. First, the catalytic mechanism of PCD is not well understood, making it difficult to rule out enzymatic activity of acRAF from the structure alone. Second, previous studies demonstrating a lack of catalytic activity in PCD-like proteins from plant chloroplasts (also referred to as type 2 PCDs) left open questions about the reason for this inactivity. The plant PCD-like proteins previously studied show a lack of conservation in the active site region, but they also bear a 40-60-amino acid extension at their N termini (not including the predicted chloroplast targeting sequence), which could, in principle, block or regulate the active site. No such extension is present in the acRAF proteins from α-carboxysome operons.
Sequence alignments among 73 carboxysome-associated acRAFs were performed using the program PRALINE (37) to investigate potentially conserved motifs specific to this family.
Pro47 and Phe52 are the only fully conserved residues; both likely fulfill structural roles required for overall acRAF stability. Several other solvent accessible residues, however, also show substantial conservation (Fig. 2B). Interestingly, these conserved residues localize to a contiguous surface region of acRAF, distant from the vestigial active site, implicating this surface region as a putative functional interface.
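The idea of scanning a multiple sequence alignment for fully conserved positions can be illustrated with a short sketch. This is not PRALINE's scoring scheme; it simply reports, for each alignment column, the most common residue and the fraction of sequences that carry it, treating gaps as non-matching. The three short sequences are made up for the example, and the column numbers it returns are alignment positions rather than residue numbers in any particular protein.

```python
from collections import Counter


def column_conservation(alignment):
    """Return (most common residue, fraction of sequences carrying it) per column.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    Gaps count against conservation because the fraction is taken over all rows.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        if not counts:
            scores.append(("-", 0.0))  # column made up entirely of gaps
            continue
        residue, n = counts.most_common(1)[0]
        scores.append((residue, n / len(column)))
    return scores


def fully_conserved(alignment):
    """List (1-based column index, residue) for columns identical in every sequence."""
    return [(i + 1, res)
            for i, (res, frac) in enumerate(column_conservation(alignment))
            if frac == 1.0]


if __name__ == "__main__":
    toy = ["MPKFA", "LPRFA", "MPKFG"]  # hypothetical aligned sequences
    print(fully_conserved(toy))  # [(2, 'P'), (4, 'F')]
```

Lowering the threshold from 1.0 to, say, 0.8 would pick out the kind of substantially but not fully conserved surface positions discussed above.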
Absence of PCD Enzyme Activity-To test acRAF for PCD enzyme activity, we employed a cell-based complementation assay that requires PCD activity for cell growth (18). The complementation assay was performed in E. coli strain JP2255, a tyrosine and phenylalanine auxotroph that can replicate on media lacking tyrosine if supplemented with both a functional phenylalanine hydroxylase and a functional PCD. E. coli naturally lacks AAHs and PCDs but does have genes encoding dihydropterin reductases, which are required (in combination with dehydration by PCD) to regenerate the pterin cofactor. We used E. coli strain JP2255 to confirm the catalytic inactivity of three different acRAFs: those from T. intermedia (the same organism as our crystal structure), H. neapolitanus C2, and P. marinus CCMP1375. As a negative control, we tested the type 2 PCD from A. thaliana and its N-terminal truncation mutant. Two positive controls were used: a well characterized PCD from P. aeruginosa PAO1 and a predicted canonical PCD from T. intermedia, a paralog to the acRAF protein structurally characterized here. T. intermedia also contains a presumptive phenylalanine hydroxylase. We used this phenylalanine hydroxylase in our complementation assay along with another well characterized phenylalanine hydroxylase from P. aeruginosa PAO1. In combination with either of the phenylalanine hydroxylases, none of the acRAFs we tested restored cell growth on media lacking tyrosine. In contrast, both the PCD from P. aeruginosa and the non-carboxysomal PCD from T. intermedia supported cell growth when co-transformed with plasmids encoding either of the phenylalanine hydroxylases. These results are consistent with the suggestion from structural and sequence analysis that acRAFs are not pterin dehydratases. These results also show that T. intermedia encodes enzymes capable of phenylalanine hydroxylation independent of acRAF, further supporting the hypothesis that acRAF fulfills a novel, non-enzymatic role in the function and/or structure of α-carboxysomes.
The absence of enzyme activity prompted us to search for an alternative function for this apparent enzyme relic. Clues regarding acRAF function arose in comparing gene content between α- and β-carboxysomes. The two types of carboxysomes are functionally similar in their abilities to concentrate and fix CO₂. They are morphologically similar, and their shells are assembled mainly from homologous proteins that follow similar architectural principles. Yet those similarities belie key differences in genomic organization and protein composition. The presence of mutually exclusive genes in the two carboxysome systems suggests that some functions might be satisfied by functionally analogous but evolutionarily distinct proteins. The use of highly divergent, or even unrelated, carbonic anhydrases in the two systems is a case in point (38). We wondered if acRAF might play a role in RuBisCO folding and assembly similar to that played by RbcX in the β-carboxysome.
Heterologous RuBisCO Assembly Assay-RbcX is a RuBisCO chaperone understood to function in β-carboxysomes by stabilizing RuBisCO L₂ dimers before they assemble into L₈S₈ hexadecamers (16,18). Noting that α-carboxysome-containing organisms do not contain RbcX proteins but always encode acRAF, we tested the effects of co-expressing H. neapolitanus acRAF with H. neapolitanus RuBisCO. In a first set of experiments, we found that co-expressing acRAF with RuBisCO in E. coli did not improve the yield of soluble native RuBisCO. However, RuBisCO folding and assembly is a complex process: GroELS is important in subunit folding, but in β-carboxysome-containing cyanobacteria RbcX is required for assembly of the native (hexadecameric) complex. Indeed, in a second set of experiments we found that when acRAF was expressed together with GroELS, the yield of natively assembled RuBisCO, as judged by native PAGE, increased dramatically compared with when GroELS was expressed with RuBisCO but without acRAF (Fig. 6). This result identifies the PCD-like family of proteins as a new RuBisCO chaperone.

DISCUSSION
Previous studies have noted the occurrence of genes for PCD-like proteins, proteins related to the pterin dehydratase enzyme family, in α-carboxysome operons from both cyanobacteria (25) and chemoautotrophs (24). A systematic examination suggests that these proteins (now renamed acRAFs) are universally conserved in α-carboxysome operons, but absent from genomes that encode β-carboxysomes, making acRAF a defining gene of α-carboxysome operons. By determining the three-dimensional structure of acRAF from T. intermedia we showed that, despite overall structural similarity to bacterial PCDs, the active site cleft of acRAF is drastically altered. Conserved residues believed to be catalytically important in PCDs are not replaced by potentially catalytic residues in the acRAF structure. An in vivo enzyme complementation assay in E. coli confirmed that carboxysome-related acRAFs lack PCD enzymatic activity. After confirming its catalytic inactivity, we were able to establish a role for this protein in RuBisCO assembly.

FIGURE 5. acRAFs do not complement PCD activity in vivo. An E. coli tyrosine auxotroph, strain JP2255, was doubly transformed with two compatible plasmids, one plasmid containing the phenylalanine hydroxylase from T. intermedia and the second plasmid containing a putative or experimentally verified PCD. Cells were streaked in wedges on minimal media agar plates supplemented with or without tyrosine (top and bottom rows, respectively). Putative PCDs are numbered as follows: 1, P. aeruginosa PCD (NP_249562). 2, type-2 PCD from A. thaliana missing an N-terminal 46-residue chloroplast-targeting sequence (NP_199924). 3, type-2 PCD from A. thaliana missing 101 residues of the N terminus. 4, T. intermedia acRAF with a 5-residue N-terminal truncation (YP_003641868). 5, H. neapolitanus acRAF (YP_003262803). 6, T. intermedia PCD (YP_003642661). 7, P. marinus acRAF (NP_874951). The prime symbol, as in 1′, indicates an empty plasmid transformed in place of a putative PCD. None of the PCDs or acRAFs were able to support tyrosine-independent growth without co-transformation with a phenylalanine hydroxylase (data not shown).

FIGURE 6. acRAF increases the quantity of natively assembled RuBisCO when heterologously co-expressed with GroELS in E. coli. A, the RuBisCO CbbLS genes were co-overexpressed with GroELS in E. coli BL21(DE3) either with (+) or without (−) acRAF. Soluble, whole cell lysates were analyzed by native non-denaturing PAGE. Independent co-expression experiments were performed in triplicate. The amount of natively assembled RuBisCO is substantially increased by the presence of acRAF (when GroELS is also overexpressed). The band corresponding to assembled CbbLS was deduced by comparisons to control gels run on lysates from cells expressing or lacking specific proteins (panel C). B, CbbLS was overexpressed in E. coli BL21(DE3) either with (+) or without (−) acRAF, but without GroELS overexpression. Soluble, whole cell lysates were analyzed by native non-denaturing PAGE. Independent co-expression experiments were performed in triplicate. No effect of acRAF on RuBisCO production is seen. C, CbbLS and acRAF bands were identified from soluble fractions of lysates of E. coli BL21(DE3) cells, which were transformed with different combinations of compatible plasmids. Strain 1, CbbLS, acRAF, and GroELS. Strain 2, CbbL, acRAF, GroELS. Strain 3, CbbS, acRAF, GroELS. Strain 4, CbbLS, empty pET22b, GroELS. Strain 5, empty pRSF, acRAF, GroELS. RuBisCO (hexadecameric L₈S₈) and acRAF band identities were deduced by comparisons of strain-specific band patterns. The acRAF from H. neapolitanus migrated as two species, in contrast to acRAF from T. intermedia, which migrates as one species (not shown). Note that only strains 1 and 4 are expressing both large and small RuBisCO subunits, consistent with the assignment of native RuBisCO to the band present exclusively in those lanes. Blue, red, and magenta arrows indicate bands that correspond to GroELS, CbbLS, and an unidentified E. coli protein, respectively.
When heterologously co-expressed in combination with GroELS, acRAF enhances the production of assembled RuBisCO. The mechanism by which acRAF interacts with and supports RuBisCO expression, stability, and/or assembly is not yet understood. All that can be asserted based on these first experiments is that acRAF likely acts at the subunit assembly stage (in a fashion analogous to RbcX), because its effect is only manifested in the presence of GroELS, which acts generally to promote subunit folding. Further mechanistic details await additional studies.
The evolution of enzymes for alternate functions is well known, as are examples of pseudoenzymes or "dead enzymes" wherein the loss of catalytic activity evolves in concert with the acquisition of regulatory or structural functions. Lens crystallins are classic examples of pseudoenzymes (39). Genomic and biochemical studies have focused attention on the adaptation of enzymes for other purposes as a widespread evolutionary phenomenon (40, 41) (reviewed in Refs. 42 and 43).
Our findings here regarding a cellular function for α-carboxysomal acRAFs align with previous studies of Type 2 PCDs from plant chloroplasts. Proteins from that group have been shown to lack catalytic activity (27), and genetic analysis has demonstrated that a homolog of acRAF in plants is required for RuBisCO assembly in vivo.⁵ Now that a function has been identified, detailed mechanistic studies on acRAF and its homologs in plants can be undertaken.