Crystal Structure of β-Barrel Assembly Machinery BamCD Protein Complex*

Background: The Escherichia coli β-barrel assembly machinery complex (BamABCDE) facilitates outer membrane protein assembly. Results: The unstructured N terminus of BamC binds and blocks the proposed substrate-binding pocket on BamD. Conclusion: The unstructured N terminus of BamC is essential for BamCD interaction. Significance: The first Bam lipoprotein complex structure reveals how BamC and BamD interact. The β-barrel assembly machinery (BAM) complex of Escherichia coli is a multiprotein machine that catalyzes the essential process of assembling outer membrane proteins. The BAM complex consists of five proteins: one membrane protein, BamA, and four lipoproteins, BamB, BamC, BamD, and BamE. Here, we report the first crystal structure of a Bam lipoprotein complex: the essential lipoprotein BamD in complex with the N-terminal half of BamC (BamCUN (Asp28–Ala217), a 73-residue-long unstructured region followed by the N-terminal domain). The BamCD complex is stabilized predominantly by various hydrogen bonds and salt bridges formed between BamD and the N-terminal unstructured region of BamC. Sequence and molecular surface analyses revealed that many of the conserved residues in both proteins are found at the BamC-BamD interface. A series of truncation mutagenesis and analytical gel filtration chromatography experiments confirmed that the unstructured region of BamC is essential for stabilizing the BamCD complex structure. The unstructured N terminus of BamC interacts with the proposed substrate-binding pocket of BamD, suggesting that this region of BamC may play a regulatory role in outer membrane protein biogenesis.

In Escherichia coli, β-barrel outer membrane proteins (OMPs) are synthesized in the cytosol and follow a pathway that directs them through the inner membrane and periplasm to the outer membrane, where they are finally folded and inserted (1,2). This last step is catalyzed by the β-barrel assembly machinery (BAM) complex, which consists of one OMP, known as BamA, along with four lipoproteins, BamB, BamC, BamD, and BamE (see Fig. 1a) (3). Homologous systems can be found in all Gram-negative species, as well as in the mitochondria and chloroplasts of eukaryotes (4,5). Although only BamA and BamD are essential for cell viability, understanding the structure and function of all components will provide greater insight into how outer membrane biogenesis occurs and could reveal the BAM complex as a new drug target (4-6).
BamA consists of a β-barrel domain embedded in the outer membrane, as well as an N-terminal periplasmic domain composed of five polypeptide transport-associated (POTRA) motifs (7-9). BamB-E are lipoproteins anchored to the periplasmic face of the outer membrane via an N-terminally attached lipid (3). Several structures of the POTRA motifs of BamA and of the lipoproteins have been solved recently (10-26). With structures of individual BAM proteins available to supplement functional studies, one can start to piece together how the BAM complex forms.
Previous research has shown that the POTRA motifs of BamA interact with the BamB-E lipoproteins, and it is believed that the POTRA motifs are also the docking site for the unfolded OMP substrates (3,10). Although BamB and BamD independently interact with the POTRA motifs directly, BamC and BamE require BamD to co-purify with BamA (6,10). Direct interaction between BamC and BamD has also been shown, and the mutagenesis data are consistent with the C terminus of BamD (residues 227-245) being necessary for the association (6). However, the region of BamC involved in the interaction with BamD was not determined.
To gain insight into the nature of the BamC-BamD interaction, a series of BamC truncation mutations were made and screened for interaction with BamD. The resulting in vitro interaction data, along with the crystal structure, reveal that the long, unstructured N terminus of BamC is required to stabilize the BamCD complex by forming a lasso-like structure that binds to BamD.

Cloning-The [...] using the primers listed in supplemental Table 1. All forward and reverse primers contained the restriction sites NdeI and XhoI, respectively (except for BamC NC, which had NdeI and HindIII). The PCR products of full-length BamC, full-length BamD, and BamC NC were ligated into vector pET28a (Novagen), and the resulting constructs had cleavable N-terminal hexahistidine affinity tags. The PCR products of BamC N, BamC C, and BamC UN were ligated into pET24a (Novagen), resulting in C-terminal hexahistidine tags. Subsequent DNA sequencing (Macrogen) confirmed that the BamC and BamD inserts matched the sequences reported in the Swiss-Prot Database (P0A903 and P0AC02, respectively).
Protein Overexpression-Each expression plasmid coding for BamC, BamD, BamC NC, BamC N, BamC C, or BamC UN was transformed into E. coli BL21(DE3) cells and used to inoculate (1:100 back-dilution) 2 liters of LB medium containing kanamycin (50 μg/ml). Cultures were grown at 37°C until A600 nm = 0.6 and were then induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside for 3 h.
Purification of BamCD Complex-Cells overexpressing BamC and BamD were separately harvested by centrifugation and subsequently combined prior to lysis. The combined cell pellet was lysed using an Avestin EmulsiFlex-3C cell homogenizer in buffer A (20 mM Tris-HCl (pH 8.0) and 100 mM NaCl). The resulting lysate was clarified by centrifugation at 45,000 × g for 30 min at 4°C, and the overexpressed proteins were initially purified by Ni2+ affinity chromatography. BamC and BamD were co-eluted from the nickel-nitrilotriacetic acid-agarose column (Qiagen) with a step gradient method (100-500 mM imidazole in buffer A in 100 mM increments). The fractions containing both proteins were pooled and concentrated to ~10 mg/ml using an Amicon ultracentrifugal filter device (Millipore). The concentrated BamCD sample was then further purified by size exclusion chromatography (Sephacryl S-100 HiPrep 26/60 column) in buffer A on an ÄKTAprime system (GE Healthcare).
Protein-Protein Interaction Studies-To test whether different truncation mutant forms of BamC can form a BamCD heterodimer, each truncation mutant was co-lysed with BamD and purified by Ni2+ affinity and gel filtration chromatography as described above. The oligomeric state of the purified sample was further confirmed by gel filtration chromatography (Superdex 200 column, GE Healthcare) in-line with a multiangle light-scattering system (Wyatt Technologies Inc.). A 100-μl aliquot of purified sample (5 mg/ml) was injected and resolved at a flow rate of 0.5 ml/min in buffer A. The molecular masses of proteins in each sample were determined by a multiangle light-scattering DAWN EOS instrument with a 684 nm laser (Wyatt Technologies Inc.) coupled to a refractive index instrument (Optilab rEX, Wyatt Technologies Inc.). The molar mass of the protein was calculated from the observed light-scattering intensity and differential refractive index using ASTRA v5.1 software (Wyatt Technologies Inc.) based on the Zimm fit method using a refractive index increment of dn/dc = 0.185 ml/g.
Crystallization-BamCD crystals were grown by the sitting drop vapor diffusion method. n-Dodecyl β-maltoside was added to the protein sample to a final concentration of 0.03% (v/v) prior to setting up crystallization plates. The crystallization drops were prepared by mixing 1 μl of protein (30 mg/ml) suspended in buffer A with 1 μl of reservoir solution and then equilibrating the drop against 1 ml of reservoir solution. The BamCD construct yielded crystals in the space group I121 with unit cell dimensions of 73.8, 133.4, and 145.0 Å. The optimal crystallization reservoir condition was 0.2 M K2HPO4 and 20% PEG-3350. Crystallization was performed at room temperature (~22°C). The cryosolution contained 0.2 M K2HPO4, 20% PEG-3350, and 30% glycerol. Crystals were washed in the cryosolution before being flash-cooled in liquid nitrogen.
Data Collection-Diffraction data were collected on the BamCD crystals at beamline X25 at the National Synchrotron Light Source, using an ADSC Q315 CCD x-ray detector. The crystal-to-detector distance was 375 mm. A total of 360 images were collected with 1° oscillations, and each image was exposed for 1 s. The diffraction data were processed with the programs iMosflm (27), POINTLESS (28), and SCALA (28). See supplemental Table 2 for data collection statistics.
Structure Determination and Refinement-Although the full-length BamCD complex was used for crystallization, BamC was cleaved to a smaller fragment corresponding to BamC UN (Asp28-Ala217) in the crystallization drop during incubation. BamC has been previously shown to be susceptible to degradation in the linker region connecting the N- and C-terminal domains (29). Phases were obtained by molecular replacement using the program PHASER 2.1 (30). Previously solved E. coli BamC N (Protein Data Bank code 2YH6) and BamD (code 2YHC) structures were used as search models (11). The N-terminal unstructured region of BamC was manually fit into the difference electron density map using the program COOT (31). The structure was refined using restrained refinement in REFMAC5 (32), and further manual adjustments to the atomic coordinates, especially in regions of BamD, were performed with the program COOT. The final model was obtained by running restrained refinement in REFMAC5 with TLS restraints obtained from the TLS motion determination server (33). The refinement statistics are shown in supplemental Table 2.
Structural Analysis-Secondary structural analysis was performed with the program DSSP (34). The program COOT was used to overlap coordinates for structural comparison. The stereochemistry of the structure was analyzed with the program PROCHECK (35). The on-line servers PISA (36) and PROTORP (37) were used to identify protein-protein interaction interfaces.

RESULTS AND DISCUSSION
The results from size exclusion chromatography and multiangle light-scattering analysis are consistent with both BamC and BamD existing in a monomeric state in solution when purified separately (supplemental Fig. 1). However, when cells overexpressing each protein were combined prior to lysis, a large population of BamC and BamD was observed to co-elute from a size exclusion column as a BamCD heterodimeric complex (Fig. 1). The molar mass of the complex was verified by multiangle light-scattering analysis (60.1 ± 1.8 kDa) (supplemental Fig. 2), which is consistent with the sum of the calculated molecular masses (64.4 kDa) of the BamC (36.4 kDa) and BamD (28.0 kDa) constructs used in this study.
To determine which region of BamC is important for forming the BamCD complex, a series of BamC truncations were created and examined for their ability to interact with BamD. BamC can be divided into three domains: the unstructured region at the far N terminus of the protein (BamC U), the N-terminal domain (BamC N), and the C-terminal domain (BamC C) (13, 21-23). The truncated forms of BamC created for this study are as follows: 1) BamC N (residues 99-217), 2) BamC C (residues 220-344), 3) BamC NC (both the N- and C-terminal domains, residues 94-344), and 4) BamC UN (the N-terminal unstructured region followed by the N-terminal domain, residues 26-217). Size exclusion chromatography analysis showed that any truncated form of BamC missing the unstructured N terminus (i.e. BamC N, BamC C, and BamC NC) was unable to form the BamCD complex (Fig. 1b and supplemental Fig. 1). On the other hand, BamC UN co-purified with BamD throughout the entire purification process despite missing the C-terminal domain (Fig. 1, c and d). Multiangle light-scattering analysis confirmed that BamC UN and BamD formed a complex with a molecular mass of 55.6 ± 1.7 kDa, which is comparable with the sum (49.9 kDa) of the molecular masses of BamC UN (21.9 kDa) and BamD (28.0 kDa) (supplemental Fig. 2). Together, these results show that the unstructured N terminus of BamC is required for the formation of the BamCD complex.
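The mass comparisons reported for the two complexes can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch and not part of the original analysis: the dictionary name and the helper `compare_to_heterodimer` are hypothetical, and the only inputs are the calculated and light-scattering masses quoted in the text.

```python
# Calculated construct masses in kDa, as quoted in the text.
CALCULATED_KDA = {"BamC": 36.4, "BamC_UN": 21.9, "BamD": 28.0}


def compare_to_heterodimer(partners, observed_kda):
    """Compare an observed mass with the summed mass of a 1:1 complex.

    `partners` is a pair of keys into CALCULATED_KDA (hypothetical naming).
    Returns the expected mass, the deviation (observed - expected) and the
    deviation as a percentage of the expected mass, each rounded to 0.1.
    """
    expected = round(sum(CALCULATED_KDA[name] for name in partners), 1)
    deviation = round(observed_kda - expected, 1)
    percent = round(100.0 * deviation / expected, 1)
    return {"expected": expected, "deviation": deviation, "percent": percent}


if __name__ == "__main__":
    # Observed light-scattering masses (kDa) quoted in the text.
    for partners, observed in [(("BamC", "BamD"), 60.1),
                               (("BamC_UN", "BamD"), 55.6)]:
        print(partners, compare_to_heterodimer(partners, observed))
```

The sketch only restates the quoted numbers (expected sums of 64.4 and 49.9 kDa); it does not replace the chromatographic evidence for co-purification.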
Although the structures of the individual BamC and BamD monomers have been solved previously, that of the BamCD complex has not. In fact, no Bam lipoprotein complex structures have yet been solved. BamC forms two globular domains (N- and C-terminal domains) both with the "helix-grip" fold (38) that are connected by a flexible linker (11,23). On the other hand, the BamD structure consists of 10 α-helices that form five tetratricopeptide repeats (TPRs) (Fig. 2a) (11,21). In this study, we crystallized BamD in a heterodimeric complex with BamC UN, and its structure was solved and refined to 2.9 Å resolution (supplemental Table 2). The most outstanding structural feature of the BamCD complex is the 73-residue-long unstructured N terminus of BamC that has not been observed in previously reported structures (11,23,39); it folds into an elongated U-shaped loop structure that interacts extensively with BamD by fitting into a trail of crevices that run along the longitudinal axis of BamD (Fig. 2b and supplemental Fig. 3). The globular N-terminal domain of BamC that follows this unstructured region lies adjacent to the N-terminal half of the BamD molecule. Although a previous study has predicted that BamC interacts with the C-terminal end of BamD (Met227-Thr245) (6), our structure shows that the N-terminal region of BamD is also an important site of interaction, as BamC binds along the entire length of the BamD molecule.
To determine whether the BamC and BamD molecules undergo conformational changes upon binding to and forming the stable BamCD complex, the structures of both proteins were individually compared with the previously solved structures of the monomeric form of each protein. Although the N-terminal domain structure of BamC superimposes very closely with the previously reported crystal structure (root mean square deviation of 0.54 Å) and the NMR structure (root mean square deviation of 1.24 Å) (supplemental Fig. 4), BamD in the BamCD complex shows a significantly different conformation compared with the monomeric structure (root mean square deviation of 2.3 Å). In the BamCD complex, the positions of α-helices in TPR motifs 3 (α6), 4 (α7 and α8), and 5 (α9 and α10) of BamD are shifted to better accommodate the binding of BamC (Fig. 2c). The four C-terminal α-helices of TPR motifs 4 and 5, in particular, show the greatest change in conformation.
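A root mean square deviation summarizes in a single number how far paired atoms lie from each other once two structures have been superposed. The Python sketch below illustrates only that final calculation. It assumes the coordinates are already superposed and paired atom by atom (the superposition in this study was performed in COOT); the function name `rmsd` and the toy coordinates are illustrative assumptions, not the authors' code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed atoms.

    Each argument is a sequence of (x, y, z) tuples in the same order and
    the same units (for example Å). No superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: two atoms, each displaced by 2 units along z.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(reference, shifted))
```

Because every pair contributes its squared distance, a few strongly displaced helices can raise the overall value even when the rest of the molecule superposes closely.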
The interaction between BamC and BamD is created predominantly by the direct contact between the N-terminal unstructured region of BamC and all five TPR motifs of BamD (Fig. 2b). This interaction between the two proteins is mediated by numerous hydrogen bonds, salt bridges, and van der Waals forces and has an average interface area of 2249.4 Å2 (Fig. 3a and supplemental Table 3). Many conserved residues are found concentrated at the interaction interfaces on both proteins (Fig. 3, b and c), suggesting that the interaction between these two proteins has important biological and functional implications. For BamC, a multiple sequence alignment (supplemental Fig. 5) shows that its unstructured N terminus is the most well conserved region of the protein, which is not surprising considering its essential role in stabilizing the BamCD complex structure. For BamD, about half of the conserved residues are found at the BamCD interaction interface, but the other half are found clustered on the opposite side of the protein and are solvent-exposed in this heterodimeric complex (Fig. 3c and supplemental Figs. 6 and 7).
FIGURE 3. BamC-BamD interface. a, ribbon diagram of BamC (red) with ribbon and semitransparent surface diagrams of BamD (gray). The interfacing residues are colored yellow (BamC) and purple (BamD). b, conserved residues mapped onto BamD with absolutely conserved residues in maroon and highly variable residues in cyan. An outline of BamC is drawn to show its position relative to BamD. c, conserved residues mapped onto BamC and colored as in b. An outline of BamD is also shown.
Although this structural information has revealed how BamC and BamD interact with each other, the insights gained in this study raise many questions regarding the function and structure of the BAM complex. For instance, a pocket present in the N-terminal region of BamD (formed by TPR motifs 1 and 2) has been predicted to recognize and bind to the C-terminal targeting sequence of unfolded OMP substrates (11,21). In fact, this binding pocket of BamD has been previously shown to resemble closely that of PEX5 (Protein Data Bank code 3CVP) (21,40), a peroxisomal targeting signal receptor (root mean square deviation of 1.7 Å) (Fig. 4a). In the BamCD complex structure, however, this proposed binding pocket is occupied by part of the unstructured region of BamC that has no sequence similarity to the C-terminal targeting sequences of OMP substrates or to PTS1, which is a C-terminal peroxisomal targeting sequence recognized by PEX5 (Fig. 4b). If the function of BamD is indeed recognition of OMP substrates via their C-terminal targeting sequences, then perhaps one of the roles of BamC (more specifically, of its unstructured N terminus) is regulatory: it may block or expose the targeting sequence-binding site of BamD depending on need.
Many aspects of the BAM complex structure and function remain to be elucidated. How does BamD interact with the POTRA motifs of BamA? It is possible that the conserved region of BamD that is not binding to BamC may serve as a binding surface for POTRA5 of BamA. Where is the C-terminal domain of BamC positioned relative to the rest of the BamC and BamD molecules? Does the C-terminal domain of BamC also interact with BamD, or does it have a separate binding partner? Further biochemical and structural investigation of different interactions formed between the BAM proteins will be required to answer these questions. Continued progress in the structural and functional analysis of the BAM complex and its individual components will not only enhance our understanding of how β-barrel proteins assemble and insert into the outer membrane but may also ultimately contribute to the development of novel antibiotics.
FIGURE 4. Proposed C-terminal targeting sequence-binding pocket of BamD. a, superimposed ribbon diagrams of BamD (gray) and the C-terminal domain of PEX5 (blue), a peroxisomal targeting sequence receptor. The PTS1 peptide, the peroxisomal targeting sequence that is recognized and bound by PEX5, is shown in red. b, in the BamCD complex, part of the unstructured region of BamC (red) blocks the proposed C-terminal targeting sequence-binding pocket of BamD (gray). PTS1 (black outline) is shown for reference.