The C-terminal peptide of Aquifex aeolicus riboflavin synthase directs encapsulation of native and foreign guests by a cage-forming lumazine synthase

Encapsulation of specific enzymes in self-assembling protein cages is a hallmark of bacterial compartments that function as counterparts to eukaryotic organelles. The cage-forming enzyme lumazine synthase (LS) from Bacillus subtilis (BsLS), for example, encapsulates riboflavin synthase (BsRS), enabling channeling of lumazine from the site of its generation to the site of its conversion to vitamin B2. Elucidating the molecular mechanisms underlying the assembly of these supramolecular complexes could help inform new approaches for metabolic engineering, nanotechnology, and drug delivery. To that end, we investigated a thermostable LS from Aquifex aeolicus (AaLS) and found that it also forms cage complexes with the cognate riboflavin synthase (AaRS) when both proteins are co-produced in the cytosol of Escherichia coli. A 12-amino acid-long peptide at the C terminus of AaRS serves as a specific localization sequence responsible for targeting the guest to the protein compartment. Sequence comparisons suggested that analogous peptide segments likely direct RS complexation by LS cages in other bacterial species. Covalent fusion of this peptide tag to heterologous guest molecules led to their internalization into AaLS assemblies both in vivo and in vitro, providing a firm foundation for creating tailored biomimetic nanocompartments for medical and biotechnological applications.

Compartmentalization of sequentially active enzymes in hollow proteinaceous cages is an ingenious strategy utilized by bacteria to optimize metabolic pathways (1)(2)(3). The polyhedral shell structures ensure high local catalyst concentrations, while regulating substrate uptake and product release, sequestering toxic/volatile intermediates, and minimizing undesired side reactions. Elucidation of the molecular mechanisms underlying the assembly of these supramolecular complexes is an active area of research, fueled by prospective applications in metabolic engineering, nanotechnology, and drug delivery.
Targeting of enzymes to protein microcompartments is often mediated by specific localization sequences. For example, ferritin-like proteins (Flp) and dye-decolorizing peroxidases (Dyp) are internalized by cages called encapsulins via conserved C-terminal peptides that associate with specific lumenal sites on the protein shell (4). Similarly, the component enzymes of propanediol utilization (Pdu) and ethanolamine utilization (Eut) compartments possess characteristic N-terminal sorting sequences that direct their uptake (5)(6)(7). Genetic fusion of such targeting peptides to heterologous cargo has been successfully used to internalize foreign molecules in several protein compartments (8-11).
Lumazine synthase (LS) is an enzyme that catalyzes the penultimate step in riboflavin biosynthesis. Although LSs from fungi, archaea, and some eubacteria exist as either pentamers or dimers of pentamers (12)(13)(14)(15)(16)(17), the enzymes from Bacillus subtilis (BsLS) and several other organisms, including one plant, form T (triangulation number) = 1 icosahedral assemblies consisting of 60 identical monomers (12, 18-21). Interestingly, the B. subtilis enzyme encapsulates the next enzyme in the vitamin B2 biosynthetic pathway, a homotrimeric riboflavin synthase (RS) (22)(23)(24). The resulting supramolecular assembly is unusual in that both the compartment shell and the guest are catalytically active (25), and co-localization of the two enzymes leads to an overall improved rate of riboflavin formation at low substrate concentrations (26). This type of substrate channeling may be quite general, because sequence analyses suggest that cage-forming LSs are widespread in nature (27). Nevertheless, aside from the BsRS·BsLS complex, LS encapsulation of RSs or other enzymes involved in riboflavin biosynthesis has not been demonstrated experimentally for any other species, nor has the mechanism of guest uptake been elucidated.
Here we report on the ability of the cage-forming LS produced by the hyperthermophilic bacterium Aquifex aeolicus (AaLS) to compartmentalize riboflavin synthase and foreign guest proteins. Because the enzyme is readily produced in recombinant form, exhibits extraordinary thermostability (Tm ~120°C) (19), and tolerates both chemical and genetic modification, AaLS mutants have been widely utilized for biomimetic applications. These include biomineralization (28), drug delivery (29,30), bioimaging (31), nucleic acid storage (32), enzyme catalysis (33), templated synthesis of polymers (34), and mimicry of bacterial microcompartments like the carboxysome (35). Identification of a specific peptide localization sequence that enables encapsulation of guest molecules by native AaLS cages lays the groundwork for investigating the nature of RS·LS complexation, as well as for further expanding the range of biotechnological applications of this system.

AaLS complexation of AaRS
Based on homology to Schizosaccharomyces pombe RS (Protein Data Bank (PDB) 1KZL) (36,37), AaRS likely assembles as a homotrimer, with each subunit comprising two β-barrel catalytic domains (residues 1-179), a coiled-coil segment (residues 180-195), and a structurally disordered C-terminal extension (residues 196-207) (Fig. 1a). In analogy to the homotrimeric Dyp protein, which utilizes a C-terminal peptide to bind to a hydrophobic pocket near the 3-fold symmetry axes on the lumenal surface of encapsulin cages (4), we hypothesized that AaRS might bind to the lumen of AaLS assemblies via its trimeric coiled-coil and appended C-terminal peptides (Fig. 1b). To test this notion, we designed truncated AaRS variants that lack the C-terminal extension (AaRS  ), the coiled-coil and C-terminal peptide (AaRS 1-179), or just the coiled-coil segment (AaRS Δ180-195) (Fig. 1c). Additionally, we mutated a tryptophan at position 207 to alanine (AaRS W207A) to test whether this C-terminal aromatic amino acid residue might be a specific recognition element that binds to a hydrophobic patch on the interior of the AaLS cage (38).
Complexation of the AaRS variants by AaLS was investigated using pulldown methodology. Wild-type AaRS and the four variants were individually co-expressed with His6-tagged AaLS in Escherichia coli. The His6-tagged LS from Saccharomyces cerevisiae (ScLS), which forms pentamers but no cages (13), was also co-produced with AaRS as a control. Inclusion complexes with the AaLS cages were purified by nickel affinity chromatography, and all untagged AaRS proteins not physically associated with AaLS were removed by extensive washing. The isolated complexes were then analyzed by SDS-PAGE (Fig. 1d and supplemental Fig. 1). The results demonstrated that wild-type AaRS co-localizes with AaLS. The isolated particles were morphologically identical to empty cages as judged by native agarose gel electrophoresis (AGE) and transmission electron microscopy (TEM) imaging (Fig. 1e and supplemental Fig. 2). Formation of the AaRS·AaLS complex appears to be specific, because AaRS did not co-purify with ScLS. Substantial decreases in association efficiency observed for the truncated AaRS 1-179 and AaRS   variants further underscore the importance of the C-terminal region of AaRS for encapsulation. Interestingly, AaRS Δ180-195, which only lacks the coiled-coil segment, associated with AaLS to an ~3-fold higher extent than full-length AaRS, demonstrating that the coiled-coil region is not required for guest loading. Finally, the C-terminal tryptophan of AaRS is not essential for host-guest recognition either, judging from the observation that AaRS W207A is taken up by AaLS only 30% less efficiently than wild-type AaRS.

Encapsulation of foreign guests
The results of the pulldown assay convincingly establish that AaRS associates with AaLS in vivo and also pinpoint the 12 amino acids at the C terminus of AaRS as the requisite localization sequence. To explore the utility of this peptide for targeting foreign proteins to the lumen of AaLS cages, we fused it to green fluorescent protein (GFP), with and without the upstream coiled-coil segment, to give the constructs GFP-AaRS 180-207 and GFP-AaRS 196-207 (Fig. 2a). Untagged GFP and GFP fused to a scrambled version of the C-terminal peptide (GFP-AaRS 196-207scr) served as controls. The GFP proteins were co-expressed with His6-tagged AaLS in E. coli, and the cage complexes were isolated by nickel affinity chromatography. Analysis of the samples by native AGE showed that the complexes migrated as intact AaLS cages, but only the nanocompartments co-expressed with GFP-AaRS 180-207 and GFP-AaRS 196-207 exhibited appreciable fluorescence (Fig. 2b). The presence of the tagged guests was confirmed by SDS-PAGE (supplemental Fig. 3). Judging from the fluorescence intensity, the coiled-coil segment does not enhance encapsulation efficiency; the C-terminal AaRS dodecapeptide suffices for cargo loading.
Figure 1 (partial legend). … (19). The clefts constituting the enzyme catalytic site are highlighted in red. c, genes encoding AaRS and engineered constructs. The structurally disordered C-terminal peptide is designated as C-term. d, intensity ratio of riboflavin synthase/lumazine synthase determined by SDS-PAGE analysis. *** and * signify p < 0.001 and 0.1, respectively; n.d., not detected. Error bars indicate standard deviations from triplicate experiments. e, TEM images of empty AaLS assemblies (left) and complexes with AaRS (right). Scale bar = 100 nm.
Cargo loading was quantified based on the 280/488 nm absorbance ratio (33). On average, AaLS cages co-expressed with GFP-AaRS 180-207 and GFP-AaRS 196-207 contained 0.40 ± 0.06 and 0.47 ± 0.06 guest molecules, respectively. These values are in good agreement with estimates based on guest fluorescence (supplemental Fig. 4). In contrast, <0.01 GFP molecules per cage were detected for the control samples with untagged GFP or GFP-AaRS 196-207scr (Fig. 2c). TEM images confirmed that filled and empty AaLS cages have identical morphology (Fig. 2d). Assuming a packing density of 0.7, a GFP volume of ~40 nm³, and a lumenal volume of ~270 nm³, the T = 1 AaLS cage would be expected to accommodate a maximum of 4 GFP molecules. However, loading efficiency likely depends on multiple factors, including the nature of the guest, the respective cytosolic concentrations of the host and guest proteins, the affinity of the interaction between the tag and its lumenal binding site, and the relative rates of guest binding versus cage assembly. For example, based on the intensity of Coomassie Blue staining, AaLS appears to load approximately six times more AaRS than tagged GFP (supplemental Figs. 1 and 3), which would be consistent with the one trimer-per-cage stoichiometry reported for the BsRS·BsLS complex (39). The homotrimeric quaternary state of the native guest would be expected to enhance its affinity for the interior surface of the capsid via a chelate effect.
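The capacity estimate and the stoichiometry comparison in the preceding paragraph can be checked with simple arithmetic. The Python sketch below uses only the numbers quoted in the text; the function and variable names are illustrative, and it assumes that the roughly six-fold Coomassie-based difference can be read as a molar ratio of AaRS monomers to GFP, which the staining data support only approximately.

```python
import math


def max_guests(lumen_nm3, guest_nm3, packing_density):
    """Upper bound on guests per cage: usable lumen volume over guest volume, rounded down."""
    return math.floor(packing_density * lumen_nm3 / guest_nm3)


# Values quoted in the text: ~270 nm^3 lumen, ~40 nm^3 per GFP, packing density 0.7
gfp_capacity = max_guests(lumen_nm3=270.0, guest_nm3=40.0, packing_density=0.7)

# Mean measured GFP loading per cage for the two tagged constructs
gfp_per_cage = {"GFP-AaRS 180-207": 0.40, "GFP-AaRS 196-207": 0.47}

# Assumption: "six times more AaRS than tagged GFP" is treated as a molar ratio
AARS_FOLD_OVER_GFP = 6.0
SUBUNITS_PER_TRIMER = 3

aars_trimers_per_cage = {
    name: AARS_FOLD_OVER_GFP * loading / SUBUNITS_PER_TRIMER
    for name, loading in gfp_per_cage.items()
}

if __name__ == "__main__":
    print("Maximum GFP per cage:", gfp_capacity)
    for name, trimers in aars_trimers_per_cage.items():
        print(f"AaRS trimers per cage (relative to {name}): {trimers:.2f}")
```

Under these assumptions the estimate comes out at roughly 0.8 to 0.94 AaRS trimers per cage, which is in line with the one trimer-per-cage stoichiometry cited for the B. subtilis complex.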

Peptide localization tags
To gain insight into the properties of the AaRS sorting sequence, we synthesized the C-terminal dodecapeptide (AaRS 196-207) as well as its scrambled counterpart (AaRS 196-207scr), and characterized both in vitro (Fig. 3a).
Although the RS C-terminal extension is disordered in the crystal structure (36), the sequence of the 12 C-terminal residues of AaRS suggests that the peptide might adopt an amphiphilic helical structure that could interact with hydrophobic patches on the AaLS shell via its apolar face (supplemental Fig. 5). Nevertheless, CD spectra of both AaRS 196-207 and AaRS 196-207scr in aqueous buffer exhibited only a single minimum centered at ~200 nm, indicating random coil structures (Fig. 3, left). The addition of trifluoroethanol induced some helix formation, judging from the appearance of characteristic signals at 208 and 222 nm, but the helix content was low for both peptides (11 and 6% for AaRS 196-207 and AaRS 196-207scr, respectively). The native sequence exhibits somewhat more helicity, reflecting a greater propensity to form salt bridges between i and i + 3 or i + 4 residues (supplemental Fig. 5).
Because helix formation is promoted by longer sequences, we also synthesized and characterized constructs containing the coiled-coil domain of AaRS (supplemental Fig. 6). On its own, the coiled-coil segment (AaRS 180-195) exhibits low helical content in aqueous buffer, even at high concentrations and low temperatures (supplemental Fig. 6b, left). Appending the C-terminal extension (AaRS 180-207) resulted in poor solubility at room temperature. When solubilized at 40°C, however, this peptide displayed significantly higher helical content than the coiled-coil alone (AaRS 180-195) or the coiled-coil plus the scrambled peptide (AaRS 180-207scr) (supplemental Fig. 6b, right). Although the coiled-coil domain is not required for guest encapsulation, it appears to promote helix formation in the C-terminal region to some extent. Interaction with binding sites on AaLS could conceivably induce higher helicity as well.

In vitro loading
Given the remarkable stability and closed-shell structures of wild-type AaLS cages, cargo loading in living cells likely occurs during capsid assembly. To examine the feasibility of loading the nanocompartments with guest molecules post-assembly, mixing experiments were carried out in vitro using fluorescently labeled peptides and empty cages. FITC was linked to the N terminus of the sorting sequence and its scrambled counterpart via an aminocaproic acid spacer (Fig. 3a), and the peptides Fl-AaRS 196-207 and Fl-AaRS 196-207scr were mixed with AaLS. However, the peptides did not migrate with the AaLS particles on native agarose gels under any conditions screened, including high pH, high ionic strength, or upon sonication. Failure to load AaLS in vitro implies that the tagged proteins bind to sites on the interior rather than the exterior surface of the cages.
To weaken intersubunit interactions between shell components, we titrated the AaLS samples with the denaturant guanidine hydrochloride (GdnHCl). Analysis of the samples by dynamic light scattering (DLS) showed that the cages (partially) disassemble into pentameric subunits between 3 and 5 M GdnHCl and further dissociate into monomers at higher denaturant concentrations (Fig. 3c and supplemental Fig. 7a). Although AaLS cages do not refold if completely denatured (25), this partial disassembly process seems to be reversible upon removal of GdnHCl as shown by TEM analysis (Fig. 3d). When AaLS cages were mixed with Fl-AaRS 196-207 in 3 M GdnHCl overnight and then analyzed by gel electrophoresis or isolated by gel filtration, substantial association of the fluorophore with the cage fraction was observed (Fig. 3e and supplemental Fig. 8a). This was not the case for the scrambled Fl-AaRS 196-207scr construct, indicating a specific interaction between the cage and the targeting peptide even under these partially denaturing conditions. While encapsulation occurs between 2 and 4 M GdnHCl (supplemental Fig. 8a), the best loading efficiencies were observed with 3 M GdnHCl, conditions that favor largely intact cage structures, judging by DLS and CD measurements (Fig. 3c and supplemental Figs. 7a and 9).
In preparative experiments with Fl-AaRS 196-207, more than 90% of the input AaLS protein was recovered as fully assembled cage complexes after treatment with 3 M GdnHCl, judging from absorbance spectra, native AGE, and TEM images (Fig. 3, d-f). Cargo loading was estimated to be approximately one labeled peptide per cage based on absorbance and fluorescence measurements (Fig. 3g and supplemental Fig. 11). The spectra were recorded under denaturing conditions (6 M GdnHCl) to minimize environmental effects on the spectroscopic properties of FITC (supplemental Figs. 10 and 11). For comparison, AaLS was found to bind the scrambled Fl-AaRS 196-207scr peptide ~7-fold less efficiently, underscoring the importance of the targeting sequence for guest compartmentalization. The weak association of the scrambled peptide can presumably be ascribed to nonspecific interactions with the capsid surface, possibly mediated by the appended fluorophore.
In contrast to chemical denaturation, heating AaLS up to 92°C does not cause disassembly (Fig. 3c and supplemental Fig. 7b) (19). Nevertheless, we found that heating mixtures of the cage and cargo at 80°C promoted complex formation (Fig. 3e and supplemental Fig. 8b), but the native and scrambled peptides were loaded to similar extents, indicating nonspecific binding. As for the experiments with Fl-AaRS 196-207 and GdnHCl, the isolated cages contained approximately one guest molecule, but the red-shifted emission of the FITC label suggests that different binding sites in the AaLS assemblies are targeted in these complexes (supplemental Fig. 11). In the case of treatment with GdnHCl, the targeting peptide likely binds specifically to the native binding site with the fluorophore solvated by water, whereas heating the cages probably exposes alternative region(s) that interact nonspecifically with the peptide and/or FITC fluorophore.

Discussion
Our results show that AaLS, like its B. subtilis counterpart (22), forms specific inclusion complexes with AaRS when both proteins are co-expressed in a heterologous host. Formation of this complex suggests that molecular recognition of AaRS by AaLS is inherent to the proteins and not mediated by a third agent. In fact, the 12 unstructured C-terminal residues of AaRS were found to serve as a "molecular address" that targets the guest to the lumenal space of the cage.
RSs from other bacteria, including BsRS (40), possess C-terminal segments that exhibit sequence patterns (IXXXFL) that are similar or identical to that of the AaRS targeting tag (41), suggesting that these organisms likely share a common strategy for enzyme compartmentalization (supplemental Fig. 12). This strategy can be extended to foreign proteins, such as GFP; appending the targeting tag enables their efficient encapsulation both in vivo and in vitro. Nevertheless, RSs from many other organisms for which the corresponding LS is known or predicted to adopt a cage structure lack a homologous targeting peptide (27), indicating either a different encapsulation strategy or no complexation at all, as reported for E. coli LS (18).
Although the detailed mechanism for the formation of the AaRS·AaLS complex is not yet fully understood, bulky cargo proteins appear to be encapsulated by nascent, incomplete shells and cannot be released upon completion of the cage. As a consequence, encapsulation in vivo presumably accompanies cage assembly, whereas cargo loading in vitro requires (partial) disassembly of the shell by transient exposure to denaturants. We found that optimal loading was observed in 3 M GdnHCl, conditions that weaken interactions between capsomers but do not completely disrupt cage structure.
Because cage loading occurs at high ionic strength, host-guest binding is likely mediated by hydrophobic interactions. Many other bacterial microcompartments reportedly utilize conserved hydrophobic residues in helical motifs of shell proteins to bind guests (6,42). The lack of such elements in AaLS suggests a different binding mode. Clefts near the 3-fold symmetry axes of the cage represent potential recognition sites, which would rationalize the efficient in vivo loading of trimeric RS. Similar clefts have been found to bind the C-terminal peptides of guests in encapsulin complexes (4). Nevertheless, in the case of both encapsulins and AaLS, a monomeric recognition peptide suffices for guest encapsulation (11,43).
Beyond its biological role as a microcompartment for riboflavin biosynthesis, AaLS has proven to be a versatile platform for the development of artificial encapsulation systems. Its extreme thermostability is particularly useful in this context, enabling extensive modification without disrupting cage assembly. We engineered AaLS variants possessing negatively supercharged interiors, for instance, that spontaneously encapsulate positively charged cargo molecules both in vivo and in vitro (44,45). In contrast to the closed-shell wild-type structure, these cages adopt unusual expanded architectures with large keyhole-shaped pores in the shell (46), which facilitate rapid and quantitative loading of a wide variety of guests (33)(34)(35)(47)(48)(49). Utilization of a specific C-terminal peptide to target cargo to AaLS complements encapsulation strategies based on engineered electrostatic interactions, providing a potentially more selective means of compartmentalizing guests within living cells.
The specific encapsulation of guest molecules in stable and monodisperse AaLS cages makes them a useful alternative to viral capsids as generalizable molecular containers. The relatively small lumenal volume of native cages, ~270 nm³ versus 4200 nm³ and 15,600 nm³ for the negatively supercharged AaLS variants (46), could prove useful for studying the effects of protein confinement on folding and function. By providing a protective shell for molecular cargo, these nanostructures represent attractive starting points for diverse biotechnological applications, including the development of customized nanoreactors and delivery or display vehicles.

Cloning
An AaRS gene, codon-optimized for E. coli, was synthesized by ATG Biosynthesis (Merzhausen, Germany). A plasmid encoding the gene for superfolder GFP was a generous gift from Prof. Andreas Plückthun (University of Zurich, Switzerland). Plasmids for the expression of AaRS variants and tagged GFP derivatives were derived from the pACYC-Ptet-HIV-R10 vector (45). pMG-AaLS and pMG-ScLS were used for AaLS and ScLS expression, respectively (50). E. coli strain XL1-Blue was used as the host for all cloning procedures. Sequences of all plasmids were confirmed by DNA Sanger sequencing performed by Microsynth AG (Balgach, Switzerland). The primer, plasmid, and protein sequences used in this study are summarized in supplemental Tables 1-3.

Pulldown assays
AaLS cages were co-expressed with guest proteins in E. coli strain BL21-gold (DE3) using the T7 promoter/lac operator and tetracycline promoter/tetracycline operator, respectively. After culturing the cells to an A600 of ~0.7, protein production was induced by the addition of 0.1 mM isopropyl β-D-1-thiogalactopyranoside for AaLS and 1 μg/ml tetracycline for guest proteins. The cages, which possess a C-terminal His6 tag, were purified by nickel affinity chromatography. The proteins were eluted with 500 mM imidazole, and the buffer was changed to 10 mM sodium phosphate buffer (pH 7.4) containing 150 mM NaCl and 1 mM EDTA by ultrafiltration (Amicon Ultra-4, 30,000 molecular weight cutoff; Merck Millipore). Purified AaLS cages were analyzed by SDS-PAGE, native AGE, absorbance spectrometry, fluorescence spectrometry (for GFP guest), and TEM. AaLS concentration was determined by absorbance at 280 nm (ε280 = 13,980 M⁻¹ cm⁻¹) (33). Total protein concentration of the RS·LS mixture was determined by Bradford assay using AaLS as a standard. The pulldown assays were performed in triplicate, and standard deviations were quantified.
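The concentration determination described above follows the Beer-Lambert law, c = A / (ε · l). The short Python sketch below illustrates the conversion; the absorbance reading and the 1 cm path length are hypothetical example values, the names are illustrative, and the extinction coefficient is assumed to refer to the AaLS monomer, so that the cage concentration follows by dividing by the 60 subunits of the T = 1 assembly.

```python
def molar_concentration(absorbance, epsilon_per_M_cm, path_cm=1.0):
    """Beer-Lambert law: concentration in mol/L from absorbance, epsilon, and path length."""
    if epsilon_per_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_per_M_cm * path_cm)


EPS_280_AALS = 13_980    # M^-1 cm^-1, value quoted in the text (assumed per monomer)
SUBUNITS_PER_CAGE = 60   # T = 1 icosahedral assembly

a280 = 0.280             # hypothetical absorbance reading, 1 cm cuvette
monomer_molar = molar_concentration(a280, EPS_280_AALS)
cage_molar = monomer_molar / SUBUNITS_PER_CAGE

if __name__ == "__main__":
    print(f"AaLS monomer: {monomer_molar * 1e6:.2f} uM")
    print(f"AaLS cages:   {cage_molar * 1e6:.3f} uM")
```

With this example reading, the monomer concentration is about 20 μM, corresponding to about 0.33 μM of assembled cages.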

Peptide synthesis
All peptides were prepared by standard Fmoc (N-(9-fluorenyl)methoxycarbonyl) solid-phase peptide synthesis and purified by reverse-phase HPLC. The identity of the purified sam The concentration of N-acetylated peptides was determined by UV absorption at 280 nm following Pace et al. (51), whereas 495 nm (ε495 = 68,000 M⁻¹ cm⁻¹) was used for the FITC-labeled peptides (52).

In vitro loading of synthetic peptides into AaLS cages
For screening, empty AaLS cages (20 μM, concentration with respect to monomer) purified by nickel affinity chromatography were mixed with FITC-labeled peptide (10 μM) in 10 mM sodium phosphate buffer (pH 7.4) containing 150 mM NaCl, 1 mM EDTA, and appropriate additives (e.g. GdnHCl) in a total volume of 20 μl. After overnight incubation at room temperature, the solution was supplemented with 4 μl of 70% glycerol containing bromphenol blue and xylene cyanol, and then analyzed by native AGE. Excess salts, including GdnHCl, and any unbound peptides were removed before native AGE analysis using an Illustra MicroSpin G-25 column (GE Healthcare). For preparative scale experiments, 20 μM AaLS cages were incubated overnight with 20 μM peptide in a total volume of 500 μl. After desalting on a PD MiniTrap G-25 column (GE Healthcare), the sample was diluted 40-fold and reconcentrated three times using an Amicon Ultra-4 filter with a 30,000 molecular weight cutoff (Merck Millipore) to completely remove unbound peptides. A portion of the sample was diluted with 3 volumes of 50 mM sodium phosphate (pH 8.0) containing 8 M GdnHCl and used for quantification by absorbance and fluorescence spectroscopy. The remainder was analyzed by native AGE, fluorescence spectroscopy, and TEM.
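Two quantities implicit in this protocol can be worked out directly: the final denaturant concentration after dilution with 3 volumes of 8 M GdnHCl, and the input ratio of peptide to assembled cages. The Python sketch below is illustrative only. It reads the protein and peptide concentrations as micromolar, assumes the desalted sample contains no residual GdnHCl, assumes the preparative AaLS concentration is also given with respect to monomer, and uses hypothetical function names.

```python
SUBUNITS_PER_CAGE = 60  # T = 1 AaLS assembly


def diluted_concentration(stock_conc, stock_volumes, sample_volumes=1.0, sample_conc=0.0):
    """Concentration after mixing a sample with a stock solution (volumes in equal units)."""
    total = stock_volumes + sample_volumes
    return (stock_conc * stock_volumes + sample_conc * sample_volumes) / total


def peptides_per_cage(peptide_uM, monomer_uM):
    """Input peptide molecules per assembled 60-mer cage."""
    cages_uM = monomer_uM / SUBUNITS_PER_CAGE
    return peptide_uM / cages_uM


# 1 volume of desalted sample + 3 volumes of 8 M GdnHCl
gdn_final_M = diluted_concentration(stock_conc=8.0, stock_volumes=3.0)

# Screening: 10 uM peptide with 20 uM AaLS monomer; preparative: 20 uM with 20 uM
screening_ratio = peptides_per_cage(peptide_uM=10.0, monomer_uM=20.0)
preparative_ratio = peptides_per_cage(peptide_uM=20.0, monomer_uM=20.0)

if __name__ == "__main__":
    print(f"Final GdnHCl: {gdn_final_M:.1f} M")
    print(f"Peptides per cage (screening):   {screening_ratio:.0f}")
    print(f"Peptides per cage (preparative): {preparative_ratio:.0f}")
```

The dilution gives 6 M GdnHCl, matching the denaturing conditions stated for the spectroscopic measurements. Under the stated assumptions the peptide is supplied in large excess, about 30 and 60 molecules per cage in the screening and preparative experiments, respectively, compared with the approximately one peptide per cage that is retained.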
All other experimental details, including molecular cloning, peptide synthesis, and characterization of AaLS cages, are provided as supplemental Experimental procedures.
Author contributions-Y. A. and R. Z. designed, performed, and analyzed the experiments. D. H. conceived the project. All authors reviewed the results and wrote and approved the final version of the manuscript.