Elucidating Essential Role of Conserved Carboxysomal Protein CcmN Reveals Common Feature of Bacterial Microcompartment Assembly*

Background: CcmN is a conserved carboxysomal protein of unknown function in β-cyanobacteria. Results: The N-terminal domain of CcmN binds the encapsulated protein CcmM; the C-terminal peptide of CcmN interacts with the major carboxysomal shell protein. Conclusion: CcmN is an essential carboxysomal component; its C-terminal peptide is essential for carboxysome biogenesis. Significance: Diverse bacterial microcompartment-encapsulated proteins contain a similar peptide, suggesting that it is a common feature of bacterial microcompartment biogenesis.
Bacterial microcompartments are organelles composed of a protein shell that surrounds functionally related proteins. Bioinformatic analysis of sequenced genomes indicates that homologs to shell protein genes are widespread among bacteria and suggests that the shell proteins are capable of encapsulating diverse enzymes. The carboxysome is a bacterial microcompartment that enhances CO2 fixation in cyanobacteria and some chemoautotrophs by sequestering ribulose-1,5-bisphosphate carboxylase/oxygenase and carbonic anhydrase in the microcompartment shell. Here, we report the in vitro and in vivo characterization of CcmN, a protein of previously unknown function that is absolutely conserved in β-carboxysomal gene clusters. We show that CcmN localizes to the carboxysome and is essential for carboxysome biogenesis. CcmN has two functionally distinct regions separated by a poorly conserved linker. The N-terminal portion of the protein is important for interaction with CcmM and, by extension, ribulose-1,5-bisphosphate carboxylase/oxygenase and the carbonic anhydrase CcaA, whereas the C-terminal peptide is essential for interaction with the carboxysome shell. Deletion of the peptide abolishes carboxysome formation, indicating that its interaction with the shell is an essential step in microcompartment formation. Peptides with similar length and sequence properties to those in CcmN can be bioinformatically detected in a large number of diverse proteins proposed to be encapsulated in functionally distinct microcompartments, suggesting that this peptide and its interaction with its cognate shell proteins are common features of microcompartment assembly.

Bacterial microcompartments (BMCs) are organelles consisting of functionally related proteins surrounded by a protein shell (1-4). Models from crystallographic studies have indicated that BMC shells are composed of proteins that form (pseudo) hexamers that tile into layers to form the facets and pentameric proteins that form the vertices of an apparently icosahedral shell (5). Genes encoding shell proteins are found in >20% of sequenced bacterial genomes, where they appear to encapsulate a variety of enzymes, many of which are oxygen-sensitive and/or catalyze reactions that produce toxic or volatile intermediates (2,4,6). Because of their structural and functional properties, BMCs have considerable potential for applications in bioengineering and synthetic biology.
Carboxysomes were the first BMCs to be discovered (7) and characterized (8) by transmission electron microscopy (TEM) of cyanobacteria (Fig. 1A, inset). They contain ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) (9) and carbonic anhydrase. The semipermeable protein shell of the carboxysome allows the diffusion of bicarbonate into the interior, where it is converted to CO2, a substrate for Rubisco, by carbonic anhydrase. The shell also functions as a selective barrier to retain CO2 in the interior (10). It also presumably excludes O2, an alternative substrate for Rubisco, thereby enhancing CO2 fixation. Given that all cyanobacteria contain carboxysomes and that cyanobacteria are considered the most abundant photosynthetic organisms on Earth (11), a substantial portion of global CO2 fixation takes place in carboxysomes.
There are two types of carboxysomes, distinguished by the form of Rubisco they encapsulate and by their gene organization. α-Carboxysomes contain Form IA Rubisco and are found in picocyanobacteria (also known as α-cyanobacteria) and some chemoautotrophs; β-carboxysomes contain Form IB Rubisco and are found in all other cyanobacteria (β-cyanobacteria). α- and β-carboxysomes also differ in the type of carbonic anhydrase (CsoSCA or CcmM (UniProt Q03513)/CcaA, respectively) that is encapsulated. Both types of carboxysomal gene clusters contain an additional gene encoding a protein of unknown function: CsoS2 in α-carboxysomes and CcmN (UniProt P46204) in β-carboxysomes. CsoS2 and CcmN do not share any sequence homology.
* This work was supported by the Director, Office of Science, of the United
Among cyanobacterial carboxysomes, the β-carboxysome is the most extensively characterized. β-Carboxysomes are coded for by one of the few gene clusters conserved in β-cyanobacteria (12), with additional carboxysomal genes dispersed throughout the genome (Fig. 2A). Crystal structures of the shell proteins have led to a proposed model for the carboxysome shell (5,13). In addition, the crystal structures of the β-carboxysomal Rubisco (14) and the γ-carbonic anhydrase domain of CcmM (15) are known, but structures of complexes of the encapsulated proteins have not been determined. Some of the protein-protein interactions within the β-carboxysome have been deduced: CcmM has been shown to interact with the carboxysomal carbonic anhydrase (CcaA), Rubisco, and CcmN via yeast two-hybrid analysis (16); and CcmN was also identified as an interaction partner with CcmM through in vitro pulldown assays using recombinant protein (17).
Despite being conserved in genetic location and sequence, little is known about the function of CcmN. Here, we report the in vivo and in vitro characterization of CcmN, demonstrating that it localizes to the carboxysome and is essential for carboxysome biogenesis and function. We show that CcmN consists of two functional regions. The conserved N terminus forms a stable complex with CcmM that interacts with Rubisco. A short conserved C-terminal peptide interacts with the essential carboxysomal shell protein CcmK2 (UniProt Q03511) and is essential for carboxysome biogenesis. Shell proteins are the defining feature of all BMCs; we show that peptides similar to the one found in CcmN are present in a large number of proteins known or presumed to be encapsulated in diverse BMCs. Among these is a peptide found in the N terminus of an enzyme of the pdu (1,2-propanediol utilization) microcompartment of Salmonella typhimurium LT2; it was sufficient to target a fluorescent protein to the pdu microcompartment, but the nature of the protein-protein interactions underlying this targeting was not established (18). The peptides we identified are found at the N or C terminus or between domains of many BMC-encapsulated proteins, suggesting that they are a widespread structural feature of protein interaction with BMC shells.

EXPERIMENTAL PROCEDURES
Bacterial Strain Construction, Media, and Growth Conditions-Synechococcus elongatus PCC 7942 (Syn7942) was purchased from the Pasteur Culture Collection and cultured in BG-11 liquid or BG-11 agar (19) at 30°C under a constant illumination of ~100 micro-einsteins/m²/s in ambient CO2 or, for high CO2-requiring (HCR) mutants, in 3% CO2. Syn7942 transformants/mutants were generated via homologous recombination as described previously (20,21). All fluorescent constructs were under the control of the native allophycocyanin promoter (papcA). Constructs and primers are listed in supplemental Tables S1 and S2, respectively. Fully segregated transformants were verified by PCR.
TEM of Syn7942-Cyanobacterial cells were fixed using 2% EM-grade glutaraldehyde and 0.2 M sodium cacodylate (pH 7.2). Fixed cells were embedded in Spurr's resin and sectioned according to standard methods (22). Thin sections were stained with 2% methanolic uranyl acetate and Reynolds lead citrate and imaged on a JEOL 1200-EX microscope at 80 kV.
Fluorescence Microscopy-Log-phase cells with YFP or cyan fluorescent protein (CFP) fluorophore were spotted on a thin BG-11 agar pad for imaging with a Zeiss Axioplan 2 microscope using a 100× oil immersion objective. Images were visualized and analyzed with NIH ImageJ; the JACoP plug-in was used for co-localization analyses (24).
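As an illustration of the kind of co-localization statistic reported from such images, the following minimal Python sketch computes thresholded Manders-type coefficients for two channels. It is not the JACoP implementation: the function name `thresholded_manders`, the toy 3 x 3 intensity grids, and the threshold values are all hypothetical, and the definition assumed here is the fraction of above-threshold signal in one channel that coincides with above-threshold signal in the other.

```python
def thresholded_manders(ch1, ch2, thr1, thr2):
    """Return (tM1, tM2) for two equally sized 2-D intensity grids (lists of rows).

    Assumed definition: tM1 is the summed intensity of channel-1 pixels that are
    above thr1 AND whose channel-2 partner is above thr2, divided by the summed
    intensity of all channel-1 pixels above thr1 (tM2 is the mirror image).
    """
    # Pair up corresponding pixels of the two channels.
    pairs = [(a, b) for row1, row2 in zip(ch1, ch2) for a, b in zip(row1, row2)]
    total1 = sum(a for a, _ in pairs if a > thr1)
    total2 = sum(b for _, b in pairs if b > thr2)
    coloc1 = sum(a for a, b in pairs if a > thr1 and b > thr2)
    coloc2 = sum(b for a, b in pairs if a > thr1 and b > thr2)
    tm1 = coloc1 / total1 if total1 else float("nan")
    tm2 = coloc2 / total2 if total2 else float("nan")
    return tm1, tm2


# Hypothetical toy images (not data from this study).
cfp = [[10, 0, 0], [0, 8, 0], [0, 0, 6]]
yfp = [[9, 0, 0], [0, 0, 0], [0, 0, 7]]
tm_cfp, tm_yfp = thresholded_manders(cfp, yfp, thr1=1, thr2=1)
print(f"tM(CFP) = {tm_cfp:.3f}, tM(YFP) = {tm_yfp:.3f}")
```

In this toy case two of the three CFP spots coincide with YFP signal, so the CFP coefficient is below 1, whereas every YFP spot has a CFP partner. Real analyses depend strongly on how the thresholds are chosen.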
In Vivo Immunoprecipitation of YFP-tagged Proteins-7.5 µg of goat anti-GFP polyclonal antibody (Abcam) in 200 µl of PBS and 0.02% Tween 20 (pH 7.4) was added to 50 µl of protein G-conjugated Dynabeads (Invitrogen) and incubated at 4°C with rotation. Log-phase Syn7942 (wild-type and mutant) expressing RbcL-CFP, YFP-CcmN, or YFP-CcmNΔ18 resuspended in 1.5 ml of Nonidet P-40 buffer (20 mM Tris-HCl (pH 8), 137 mM NaCl, 10% glycerol, 1% Nonidet P-40, and 2 mM EDTA) with protease inhibitors (Roche Applied Science) was sonicated on ice at 10-s intervals for a total of 90 s. 1 ml of cell lysate was applied to the anti-GFP antibody-conjugated beads and incubated overnight at 4°C; subsequent steps were carried out according to the Dynabeads manual.
In Vitro Pulldown Analysis-Escherichia coli BL21(DE3) cells were cotransformed with GST-CcmN or GST-CcmNΔ18 with CcmK2 (supplemental Table S1) and plated on LB agar plates containing antibiotics. 1 ml of an overnight culture of transformant was used to inoculate 49 ml of LB medium with antibiotics. The cultures were grown at 37°C to an A600 of 0.4, at which point the temperature was lowered to 20°C. At A600 = 0.8, the cultures were induced with 0.4 mM isopropyl β-D-thiogalactopyranoside for 3 h. The cells collected were resuspended in 1.5 ml of Nonidet P-40 buffer with protease inhibitors added and disrupted by sonication on ice at 10-s intervals for a total of 90 s. The supernatant was applied to 50 µl of equilibrated glutathione-Sepharose and incubated at 4°C with rotation overnight. Washing and elution were carried out following the immunoprecipitation protocol using Bio-Rad Micro Bio-Spin chromatography columns to effectively remove the lysate and washes from the resin.
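The inoculation and induction volumes above follow simple dilution arithmetic (C1·V1 = C2·V2). The short Python sketch below works through it; the 1 M stock concentration of isopropyl β-D-thiogalactopyranoside is an assumption made only for illustration (the stock used is not stated), and the small volume added by the inducer itself is neglected.

```python
def stock_volume_ml(final_conc_mM, final_volume_ml, stock_conc_mM):
    """Volume of stock (ml) needed so that C1 * V1 = C2 * V2."""
    return final_conc_mM * final_volume_ml / stock_conc_mM


inoculum_ml = 1.0
medium_ml = 49.0
culture_ml = inoculum_ml + medium_ml          # 50 ml total culture
dilution_factor = culture_ml / inoculum_ml    # fold dilution of the overnight culture

# ASSUMPTION: a 1 M (1000 mM) IPTG stock; not specified in the text.
iptg_stock_mM = 1000.0
iptg_ul = stock_volume_ml(0.4, culture_ml, iptg_stock_mM) * 1000.0  # ml -> µl

print(f"Overnight culture diluted {dilution_factor:.0f}-fold")
print(f"IPTG stock to add: about {iptg_ul:.0f} µl")
```

Under these assumptions the overnight culture is diluted 50-fold, and roughly 20 µl of the hypothetical 1 M stock brings 50 ml of culture to 0.4 mM.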
Western Blotting-Gels were transferred to a PVDF membrane at 70 V for ~1-2 h at 4°C. Primary rabbit anti-CcmK2 antiserum was diluted 1:2500, and secondary HRP-conjugated goat anti-rabbit IgG (Santa Cruz Biotechnology) was diluted 1:20,000 in TBS with 0.01% Triton X-100. The blot was developed using SuperSignal West Pico substrate (Pierce) and imaged using a Bio-Rad ChemiDoc XRS+ system.
Sequence Analysis-CcmN amino acid sequences from cyanobacteria were obtained from public and private databases in Integrated Microbial Genomes. Alignments were made using MUSCLE (25) and viewed with Jalview (26). Jpred (27) was used for secondary structure predictions. A hidden Markov model (HMM) and HMM logos were created from the alignment of all CcmN orthologs using HMMER Version 3.0 and LogoMat-P (28), respectively. A visual search for BMC gene clusters was done by searching Integrated Microbial Genomes for Pfam00936 domains in bacterial genomes and viewing each hit and its gene neighborhood. Putative encapsulated metabolic pathways were inferred based on the annotation of the enzymes in the BMC gene cluster and pathways checked using the KEGG database. Helical wheel projections were created using a web-based server from the Zidovetski lab at University of California, Riverside.
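The gene-neighborhood inspection described above was done visually. Purely as an illustration of the underlying idea, the following Python sketch shows how shell-protein hits and their flanking genes could be grouped programmatically. Everything in it is hypothetical: the function `bmc_loci`, the window size, the gene identifiers, and the toy annotation table are invented for the example, and "PF00936" is used here as the accession-style label for the Pfam00936 domain.

```python
SHELL_PFAM = "PF00936"  # BMC shell domain (Pfam00936 in the text)


def bmc_loci(genes, window=3):
    """Group shell-protein hits with their neighbors into candidate loci.

    genes  -- list of (gene_id, set_of_pfam_ids) in chromosomal order
    window -- number of flanking genes kept on each side of a hit (assumed value)
    Returns a list of loci, each a list of gene ids; overlapping or adjacent
    neighborhoods are merged into one locus.
    """
    spans = []  # [start, end) index ranges into `genes`
    for i, (_, pfams) in enumerate(genes):
        if SHELL_PFAM not in pfams:
            continue
        start = max(0, i - window)
        end = min(len(genes), i + window + 1)
        if spans and start <= spans[-1][1]:
            spans[-1][1] = max(spans[-1][1], end)  # merge with previous span
        else:
            spans.append([start, end])
    return [[gene_id for gene_id, _ in genes[s:e]] for s, e in spans]


# Hypothetical toy contig: two adjacent shell genes flanked by other genes.
toy_contig = [
    ("g1", set()), ("g2", set()), ("g3", {"PF00171"}),
    ("g4", {"PF00936"}), ("g5", {"PF00936"}), ("g6", {"PF03319"}),
    ("g7", set()), ("g8", set()), ("g9", set()), ("g10", set()),
]
loci = bmc_loci(toy_contig, window=2)
print(loci)
```

The annotations of the non-shell genes in each returned locus would then be examined by hand, as in the procedure above, to infer the putative encapsulated pathway.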
Construction of CcmN Homology Model-SWISS-MODEL (29) was used in automated mode using the CcmN protein sequence from Syn7942 as the input and the CcmM structure (Protein Data Bank code 3KWD, Chain A) from Thermosynechococcus elongatus as the template.

RESULTS
CcmN Localizes to the Carboxysome-Although the gene for CcmN is conserved in all β-cyanobacterial carboxysomal gene clusters, its localization to the carboxysome had not been demonstrated. To confirm CcmN localization to the carboxysome, we cotransformed wild-type Syn7942 with YFP fused to CcmN and CFP fused to the large subunit of Rubisco (RbcL) (UniProt Q31NB3) (30). When each fluorophore was imaged individually, fluorescent puncta could be seen in the cells. When these images were overlaid, the fluorescent spots aligned, with thresholded Manders' coefficients of 0.422 and 0.407 for the CFP and YFP channels, respectively (Fig. 3, A-C). These values are similar to co-localization statistics of the shell protein CcmK4 fused to YFP with an RbcL-CFP fusion (supplemental Fig. S1).
CcmN Is Essential for Carboxysome Formation-To determine whether CcmN is an essential carboxysomal protein, a knock-out mutant (7942ΔN) was created in Syn7942. The mutant exhibited an HCR phenotype (supplemental Fig. S2): it was unable to grow at ambient CO2 levels but had WT-like growth rates at 3% CO2, indicative of impaired carboxysome function. Reintroduction of CcmN restored the ability of the mutant to grow at ambient CO2 (supplemental Fig. S2).
TEM revealed that the 7942ΔN strain did not form carboxysomes but instead contained large, mostly polar aggregates of protein (Fig. 1A). Introduction of YFP-CcmN into the 7942ΔN mutant resulted in fully formed carboxysomes (Fig. 1B). We confirmed the results of the TEM studies with fluorescence microscopy. Expression of YFP-CcmN in the 7942ΔN mutant led to an even distribution of fluorescent puncta (Fig. 3D), indicating restoration of proper carboxysome formation and localization. In contrast, expression of RbcL-CFP in the 7942ΔN mutant did not restore proper carboxysome formation. RbcL-CFP resulted in single fluorescent spots localized at the poles (Fig. 3E), which are consistent with the polar masses observed by TEM. RbcL-CFP expressed in WT cells resulted in evenly distributed fluorescent puncta in each cell (Fig. 3B). Expression of RbcL-CFP in 7942ΔN and WT cells did not alter the background strains' HCR and WT growth phenotypes, respectively (supplemental Fig. S2).
CcmN Interacts with CcmM through Its N-terminal Region-An HMM logo of all CcmN orthologs (Fig. 2B) revealed that CcmN can be subdivided into two conserved regions separated by a variable linker. The first is a conserved N-terminal region of ~120 amino acids that contains six bacterial hexapeptide repeat domains (Pfam00132). Hexapeptide repeat domains are also found in the N-terminal domain of CcmM, facilitating construction of a model for the N-terminal region of CcmN (Fig. 2C). The N-terminal domain of CcmN is followed by a variable-length linker region enriched in Pro and Ser residues and then by the second conserved region, a C-terminal peptide of 18 conserved amino acids (Fig. 2B). In some CcmN orthologs, the peptide is followed by another poorly conserved region of up to 24 residues.
To correlate function with the different regions of the CcmN primary structure, we coexpressed cleavable GST-CcmN, GST-CcmNΔ18 (lacking the conserved 18 amino acids on the C terminus), and GST-CcmNΔ39 (lacking both the variable region and conserved peptide) fusions (Fig. 2B) with CcmM in E. coli and were able to co-purify stable CcmM-CcmN complexes after on-column cleavage of the GST tag (Fig. 4 and supplemental Fig. S3). These data indicate that CcmN interacts with CcmM through the N-terminal region of CcmN.
Conserved C-terminal Peptide of CcmN Is Crucial for Proper Carboxysome Assembly-To determine the function of the C-terminal peptide of CcmN, we constructed a mutant lacking 18 amino acids at the C terminus of CcmN in Syn7942 (7942NΔ18). The 7942NΔ18 mutant exhibited an HCR phenotype (supplemental Fig. S2). TEM of the 7942NΔ18 mutant revealed that the cells lacked carboxysomes; instead, they contained polar masses of protein larger than and without the distinct shape of carboxysomes (Fig. 5A). When we transformed this strain with YFP-CcmNΔ18, fluorescent puncta were present at the poles, consistent with the TEM images (Fig. 5B). Wild-type growth at ambient CO2 levels and proper carboxysome formation were restored by reintroduction of the full-length ccmN gene in the 7942NΔ18 strain (Fig. 5, C and D). Collectively, these data indicate that the C-terminal peptide of CcmN is essential for carboxysome formation.
Conserved C-terminal Region of CcmN Binds Major Shell Component CcmK2 in Vivo and in Vitro-To identify protein interaction partners specific to the N-terminal region and C-terminal peptide of CcmN, the WT and 7942NΔ18 mutant strains were transformed with YFP-CcmN, YFP-CcmNΔ18, or YFP-CcmNΔ39, and their lysates were screened for interacting proteins using anti-GFP antibody to precipitate the YFP-tagged proteins. CcmM, RbcL, and RbcS (UniProt Q31NB2) were precipitated by both full-length and truncated CcmN fusion proteins (Fig. 6A), as identified by Western blotting, molecular weight on SDS-PAGE, and/or mass spectrometry. Subsequent Western blotting with anti-CcmK2 antibody showed that a shell protein was present only in lanes that contained full-length CcmN (Fig. 6A), indicating that the C-terminal peptide interacted with shell proteins. The amount of CcmK2 pulled down by full-length CcmN appeared to be greater in the WT experiment than in the mutant background.
To verify that the peptide interacts with the essential carboxysomal shell protein CcmK2, GST-CcmN and GST-CcmNΔ18 were each coexpressed with CcmK2 in E. coli. Only full-length CcmN was able to pull down CcmK2 in vitro (Fig. 6B).
Bioinformatic Identification of Conserved Peptides in Encapsulated Proteins of Diverse BMCs-CcmK2 contains the BMC domain (Pfam00936), the signature domain found in the most abundant protein of BMC shells. Gene clusters that have the potential to encode proteins that form BMCs can be identified bioinformatically by searching for conserved groups of genes that contain orthologs to Pfam00936 and the putative vertex proteins (Pfam03319) (2). The function of a BMC that is identified bioinformatically may be inferred from the annotations of other genes in the cluster. We classified BMC gene clusters from currently available genome sequence data into 10 different functional categories (supplemental Table S3). Given the homology among shell proteins and the demonstration that the CcmN peptide interacts with the CcmK2 shell protein, we searched for homologs to the peptide in other gene products in BMC gene clusters. In all 10 types, one or more genes encoded proteins that contained a region similar to the CcmN peptide; these were found at either the N or C terminus or between domains of a presumably encapsulated protein. All were short (13-22 amino acids), and a portion of the peptide was predicted to be α-helical with high confidence in most (Fig. 7A and supplemental Table S3). Notably, the α-helix had a conserved face formed by four hydrophobic amino acids flanked by two polar/charged amino acids (Fig. 7B). As in CcmN, a poorly conserved region rich in Pro and Ser separated the peptides from the functional domains of the proteins.
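The notion of a hydrophobic face on a helix can be made concrete with a helical wheel calculation, in which successive residues of an ideal α-helix are placed 100° apart (360°/3.6 residues per turn). The Python sketch below is illustrative only and is not the web server used for the projections in this work. The peptide `LKELIRKLSEE` is a hypothetical sequence designed for the example (not one of the peptides identified here), the hydrophobic residue set is an assumed simplification, and the "clustering" score (length of the mean direction vector of the hydrophobic residues) is a measure swapped in for illustration.

```python
import math

HYDROPHOBIC = set("AVLIMFW")      # ASSUMPTION: a simple hydrophobic residue set
DEGREES_PER_RESIDUE = 100.0       # 360 degrees / 3.6 residues per turn


def wheel_angles(peptide):
    """Angular position (degrees) of each residue on an ideal helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(peptide))]


def hydrophobic_face(peptide):
    """Return (clustering, direction) for the hydrophobic residues.

    clustering -- 0..1; values near 1 mean the hydrophobic residues point
                  the same way around the helix axis (a single face)
    direction  -- mean angle of that face in degrees, or None if there are
                  no hydrophobic residues
    """
    angles = [a for a, aa in zip(wheel_angles(peptide), peptide.upper())
              if aa in HYDROPHOBIC]
    if not angles:
        return 0.0, None
    x = sum(math.cos(math.radians(a)) for a in angles)
    y = sum(math.sin(math.radians(a)) for a in angles)
    clustering = math.hypot(x, y) / len(angles)
    direction = math.degrees(math.atan2(y, x)) % 360.0
    return clustering, direction


# Hypothetical peptide with hydrophobic residues at positions i, i+3, i+4, i+7.
clustering, direction = hydrophobic_face("LKELIRKLSEE")
print(f"clustering = {clustering:.2f}, face centered near {direction:.0f} degrees")
```

Hydrophobic residues spaced at i, i+3, i+4, and i+7 fall within a narrow arc of the wheel and give a clustering score close to 1, whereas hydrophobic residues on opposite sides of the helix largely cancel out.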

DISCUSSION
We have shown that a previously uncharacterized protein, CcmN, is indispensable for β-carboxysome biogenesis. Our data show that CcmN interacts directly with the carboxysome-encapsulated protein CcmM and the carboxysomal shell protein CcmK2 through two distinct regions. The N-terminal domain of CcmN interacts with CcmM (Fig. 4 and supplemental Fig. S3), which binds to other encapsulated carboxysomal proteins, Rubisco and the carboxysomal carbonic anhydrase CcaA (16,17). In the pulldown experiments (Fig. 6A), bands attributed to RbcL, RbcS, and CcmM were evident, whereas an expected band for CcaA at 30 kDa was not observed.
A yeast two-hybrid study reported an interaction between the N-terminal domain of CcmM and CcmN (17). A comparison of their (proposed) structures (Fig. 2C) readily suggests hypotheses for how CcmN and the N-terminal domain of CcmM interact within the carboxysome. Both proteins contain hexapeptide repeats in their N-terminal domains, which are known to adopt a left-handed β-helical fold (31). The N-terminal domain of CcmM, a member of the γ-carbonic anhydrase family, crystallized as a trimer (15), similar to other hexapeptide repeat-containing proteins (32). A structural homology model of CcmN can be constructed using the structure of the N-terminal domain of CcmM (Fig. 2C) (15). Given their structural similarity, CcmN can be envisioned as interacting with the CcmM trimer or as replacing CcmM monomers to form a heterotrimer. Also of note is the typically observed lack of carbonic anhydrase activity in purified CcmM (15); this could be due to the necessity of the interaction with the CcmN N-terminal domain to achieve activity.
The conserved C-terminal peptide of CcmN is essential for carboxysome formation; deletion of the peptide results in an HCR phenotype (supplemental Fig. S2), absence of carboxysomes, and, in most cells, a single, shapeless, polar protein aggregate (Fig. 5, A and B). The importance of the CcmN peptide for carboxysome assembly is also supported by an early study reporting the characterization of a spontaneous mutant of Syn7942; a point mutation of the conserved glycine (G146D) (Fig. 2B) found in the peptide region caused an HCR phenotype and resulted in carboxysomes with poorly defined edges when visualized by TEM (33). We attempted to show that the CcmN C-terminal peptide (both with and without the CcmN linker region) could target YFP to the carboxysome, but invariably, the constructs did not express well, resulting in formation of fluorescent puncta in only a small percentage of cells.
Based on our interaction data (Fig. 6) from both Syn7942 and E. coli, the C-terminal peptide region in CcmN interacts with the essential carboxysomal shell protein CcmK2. The decrease in detectable CcmK2 in the mutant background pulldown assay compared with the WT (Fig. 6A) could be due to lower expression of the carboxysomal gene cluster caused by the mutagenesis or an abundance of the truncated CcmN complex interfering with the proper assembly of carboxysomal components.
All BMC shell proteins have recognizable sequence homology, and many proteins identified experimentally or bioinformatically as encapsulated in BMCs contain peptides with sequence properties similar to those of the peptide in CcmN (Fig. 7A). Among 10 types of BMCs (supplemental Table S3), one or more of the putatively encapsulated proteins in each contains a peptide similar to that in CcmN. The peptides range from 13 to 22 residues in length with a predicted propensity for α-helix formation, most with high confidence. The predicted helix has a conserved hydrophobic face formed by four residues, whereas the rest of the helix is mainly polar or charged (Fig. 7B). The peptide is always separated from the rest of the protein by a poorly conserved region of variable length that is predicted to be unstructured. The presence of this structurally disordered region could be necessary to reduce any steric hindrance between the region of the protein that interacts with the shell and the functional domain of the protein, which presumably interacts with other encapsulated proteins.
Fig. 7 legend (partial): A, conservation by property of the peptides (Table S4). Blue, hydrophobic; green, polar uncharged; red, charged; orange, Pro and Gly. The consensus helical prediction is underscored with a yellow bar. B, helical wheel representation of the consensus predicted helix. The corresponding amino acid residues from the peptides of CcmN (inner), the N terminus of PduP of S. typhimurium LT2 (*, center), and the interdomain of the B12-independent diol dehydratase of R. palustris BisB18 (†, outer) are mapped on the helix.
Notably, one of the peptides containing these features is the N terminus of PduP, a propionaldehyde dehydrogenase. The PduP peptide was shown to target GFP to the pdu BMC (18) after it was noted to be present in sequences of BMC-encapsulated propionaldehyde dehydrogenases and absent in sequences of cytosolic forms of the enzyme. Deletion of PduP still resulted in BMCs that lacked only the dehydrogenase function, suggesting that, in contrast to the CcmN peptide, the PduP peptide is not essential for pdu BMC formation. Unlike the carboxysome, in which only one component protein contains an identifiable peptide, the pdu BMC gene cluster encodes two additional proteins containing similar peptides that may compensate for the loss of the PduP peptide.
In addition, peptides identified by sequence features similar to those of the CcmN peptide are also found at the C terminus (in an aldolase of Planctomyces limnophilus) or between domains (in a B12-independent diol dehydratase of Rhodopseudomonas palustris BisB18) of BMC-encapsulated proteins. Interestingly, the proteins containing the peptide domain tend to be associated with enzymes for the first or last step of the hypothesized encapsulated series of reactions, consistent with positioning these enzymes adjacent to the BMC shell, which is assumed to play a key role in substrate and product flux.
The presence of this peptide in many (putatively) encapsulated proteins suggests that it interacts with conserved sequence features in all shell proteins. The most conserved amino acids among BMC shell proteins are found at the edges of hexamers (23), where they are presumably important for interaction with adjacent hexamers to form the facets of the shell. Although speculative, this leads to a model for BMC assembly in which proteins that contain the peptide interact with the shell proteins as they assemble. In the carboxysome, for example, as CcmK2 hexamers form facets, binding sites for the CcmN peptide may be created at the 2- or 3-fold axes between adjacent hexamers. Other proteins to be encapsulated, such as CcmM, can then be recruited to the assembling shell via interaction with the N-terminal domain of CcmN. Alternatively, shell assembly may commence after recruitment of shell proteins to peptide motifs displayed by assemblies of interior components.
CcmN plays an essential role in the formation and organization of a bacterial organelle that plays a key role in global CO2 fixation. The occurrence of gene products containing similar peptides in diverse BMC gene clusters suggests that such shell protein-peptide interactions are a common feature of BMC organization. This feature potentially offers a versatile approach to adhering molecules to BMC shell proteins for applications such as assembling enzyme-scaffold complexes or designed enzymatic nanoreactors.