Crystal Structure of the Major Periplasmic Domain of the Bacterial Membrane Protein Assembly Facilitator YidC

The essential bacterial membrane protein YidC facilitates insertion and assembly of proteins destined for integration into the inner membrane. It has homologues in both mitochondria and chloroplasts. Here we report the crystal structure of the Escherichia coli YidC major periplasmic domain (YidC EC P1) at 2.5 Å resolution. This domain is present in YidC from Gram-negative bacteria and is more than half the size of the full-length protein. The structure reveals that YidC EC P1 is made up of a large twisted β-sandwich protein fold with a C-terminal α-helix that packs against one face of the β-sandwich. Our structure and sequence analysis reveal that the C-terminal α-helix and the β-sheet that it lies against are the most conserved regions of the domain. The region corresponding to the C-terminal α-helix was previously shown to be important for the protein insertase function of YidC and is conserved in other YidC-like proteins. The structure reveals that a region of YidC that was previously shown to be involved in binding to SecF maps to one edge of the β-sandwich. Electrostatic analysis of the molecular surface for this region of YidC reveals a predominantly charged surface and suggests that the SecF-YidC interaction may be electrostatic in nature. Interestingly, YidC EC P1 has significant structural similarity to galactose mutarotase from Lactococcus lactis, suggesting that this domain may have another function besides its role in membrane protein assembly.

Proteins endowed with a Sec-dependent N-terminal signal peptide are exported across, or into, the inner membrane via the Sec system. The Sec system consists of the proteins SecY, SecE, and SecG, which form a heterotrimeric protein-conducting channel in the inner membrane (2); SecA, a cytosolic ATPase motor protein that unfolds (3) and pushes polypeptide substrates through the SecYEG channel (4); and the proteins SecD, SecF, and YajC, which form a heterotrimeric complex that interacts with SecYEG (5). The SecDFYajC complex has been proposed to (i) promote the release of substrate proteins from the SecYEG translocase following translocation (6) and/or (ii) enhance protein translocation by regulating SecA membrane cycling (7,8). Proteins intended for integration into the inner membrane engage the essential protein YidC, which directly contacts transmembrane segments (9) and facilitates insertion (10), folding (11), and assembly (12,13) of proteins into the inner membrane. Depending on the nature of the substrate, YidC can function in a Sec-dependent (SecYEG-YidC) (14,15) or Sec-independent ("YidC only") manner (16). It is thought that for Sec-dependent substrates, large hydrophilic domains are first exported across the membrane into the periplasm via the SecYEG channel, followed by movement of the transmembrane regions from the channel into the lipid bilayer; the latter step may be facilitated by YidC (9). How YidC promotes membrane protein insertion in a Sec-independent manner is unknown. YidC has been shown to co-purify with components of the Sec translocase (15), and a direct interaction between YidC and the SecDFYajC complex has been demonstrated, specifically with SecD and SecF (17,18). YidC also plays a role in the biogenesis of lipoproteins (19), but its role in this process is not clear.
Structurally, Escherichia coli YidC is a 548-amino acid polypeptide with a molecular mass of 61,526 Da and a predicted isoelectric point of 7.7. Saaf et al. (20) have experimentally mapped the topology of YidC and shown that it consists of 6 transmembrane regions (TM) with a large ~35-kDa periplasmic domain (residues 24-342) located between transmembrane regions 1 and 2 (Fig. 1A). Deletion analyses have revealed that YidC insertase function is located mainly in the C-terminal five transmembrane regions (18,21). Remarkably, up to 90% of the 35-kDa YidC periplasmic domain (residues 25-323) can be deleted without affecting inner membrane protein biogenesis or cell viability (18,21). YidC is conserved in all three domains of life and is homologous to the well characterized proteins Oxa1 and ALB3 that are found in the inner membrane of mitochondria and the thylakoid membrane of chloroplasts, respectively (22). Consistent with functional mapping studies of insertase activity, amino acid sequence alignments reveal that the C-terminal ~200 residues of YidC, corresponding to transmembrane regions 2-5, are conserved in prokaryotic and eukaryotic versions of the protein (22). Further, Oxa1 has been shown to complement YidC insertase activity when expressed in E. coli (23), underscoring the functional significance of this region in catalyzing protein insertion. By comparison, the function of the YidC periplasmic domain remains largely unknown; nevertheless, the conservation of this domain in Gram-negative bacteria suggests that it performs a significant, but as yet unrecognized, role in the cell.
To gain further insight into the structure and function of YidC, we sought to determine the structure of the YidC periplasmic domain. We report here the crystal structure of residues 57-346 of E. coli YidC to 2.5 Å resolution.

EXPERIMENTAL PROCEDURES
Cloning and Mutagenesis-A 942-base pair DNA fragment, coding for residues 26-340 of E. coli YidC, was amplified from E. coli K-12 genomic DNA using the forward primer 5′-ATGCAAGCATATGGATAAAAACCCGCAACCTCAGG and the reverse primer 5′-ATGCCTACTCGAGGCTGCCGCGCGGCACCAGCAGCGGCTGAGAGATGAACC, which contain the restriction sites NdeI and XhoI, respectively. The resulting PCR product was ligated into vector pET20b (Novagen). The YidC EC (26-340)His construct includes an N-terminal methionine and a C-terminal thrombin/hexahistidine affinity tag bearing the sequence LVPRGSLEHHHHHH. DNA sequencing (Macrogen) confirmed that the YidC insert matched the sequence reported in the Swiss-Prot data base (P25714). To facilitate crystallization, several residues within this construct were targeted for mutagenesis using the QuikChange method (Stratagene). The primer pair 5′-GTACTCCACGCCTGACGCGGCGTATGCGGCATACGCGTTCGATACCATTGCCG and 5′-CGGCAATGGTATCGAACGCGTATGCCGCATACGCCGCGTCAGGCGTGGAGTAC was used to construct a version of YidC EC (26-340)His that bears the mutations E228A, K229A, E231A, K232A, and K234A. This construct (referred to as pYidC EC P1) yielded crystals suitable for structure determination. The protein encoded by pYidC EC P1 has 330 residues, a molecular mass of 35,702 Da, and a theoretical pI of 5.3.
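The primer sequences given above lend themselves to a simple consistency check. The following Python sketch is an illustration only and was not part of the original protocol. It assumes that the hyphens printed inside the primers are line-break artifacts, that the NdeI and XhoI recognition sequences are CATATG and CTCGAG, and that the mutagenic forward primer is read in the frame starting at its second base (a frame chosen here because it yields a run of alanine codons consistent with the five listed substitutions).

```python
from itertools import product

# Primers as printed in the text, hyphens removed (assumed to be typesetting artifacts).
CLONING_FWD = "ATGCAAGCATATGGATAAAAACCCGCAACCTCAGG"
CLONING_REV = "ATGCCTACTCGAGGCTGCCGCGCGGCACCAGCAGCGGCTGAGAGATGAACC"
MUT_FWD = "GTACTCCACGCCTGACGCGGCGTATGCGGCATACGCGTTCGATACCATTGCCG"
MUT_REV = "CGGCAATGGTATCGAACGCGTATGCCGCATACGCCGCGTCAGGCGTGGAGTAC"

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of seq, starting at the given 0-based offset."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# QuikChange primers are expected to be exact reverse complements of each other.
print("mutagenic pair complementary:", reverse_complement(MUT_FWD) == MUT_REV)

# The cloning primers should carry the stated restriction sites.
print("NdeI site in forward primer:", "CATATG" in CLONING_FWD)
print("XhoI site in reverse primer:", "CTCGAG" in CLONING_REV)

# Peptide encoded by the mutagenic forward primer in the assumed frame.
print("encoded peptide:", translate(MUT_FWD, frame=1))
```

Under these assumptions the encoded peptide contains the segment D-A-A-Y-A-A-Y-A-F, that is, five alanines at the spacing expected for substitutions at positions 228, 229, 231, 232, and 234.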
Protein Expression and Purification-The expression plasmid pYidC EC P1 was transformed into E. coli expression strain BL21(DE3) and used to inoculate (1:100 back dilution) 3 liters of Luria Bertani medium containing ampicillin (100 μg/ml). Cultures were grown at 37°C to an A600 of 0.6 and induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside for 3 h. Cells were harvested by centrifugation and lysed using an Avestin Emulsiflex-3C cell homogenizer. The lysate was clarified by centrifugation (30,000 × g) for 30 min at 4°C. The supernatant was applied to a 5-ml nickel-nitrilotriacetic acid column (Qiagen) that had been equilibrated with 20 mM Tris-HCl, pH 8.0, 100 mM NaCl (buffer A). The column was washed with 30 ml of buffer A containing 20 mM imidazole and eluted with a step gradient (100-500 mM imidazole in buffer A at 100-mM increments) in 5-ml volumes. The majority of the protein eluted from the column in fractions containing 100, 200, and 300 mM imidazole, which were pooled and concentrated using an Amicon ultra centrifugal filter device (Millipore). Concentrated protein was then applied to a Sephacryl S-100 HiPrep 26/60 size-exclusion chromatography column on an ÄKTA Prime system (GE Healthcare) running at 1 ml/min in buffer A. Fractions containing pure YidC EC P1 were pooled, concentrated to 32 mg/ml, and stored at −80°C. Analytical size-exclusion chromatography in line with multiangle light scattering analysis is consistent with YidC EC P1 being a monodisperse monomer in solution (data not shown).
Se-Met-incorporated YidC EC P1 was prepared by growing an overnight culture of BL21(DE3) transformed with pYidC EC P1 in M9 minimal medium supplemented with 100 μg/ml ampicillin. 30 ml of overnight culture was used to inoculate 3 × 1 liter of M9 minimal medium (100 μg/ml ampicillin) that was grown at 37°C to an A600 of 0.6. Each 1-liter culture was then directly supplemented with a mixture of the following amino acids: 100 mg each of lysine, phenylalanine, and threonine; 50 mg each of isoleucine, leucine, and valine; and 60 mg of selenomethionine. After 15 min, protein expression was induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside (final concentration) for 3 h at 37°C. The purification procedure for Se-Met-incorporated YidC EC P1 was the same as that used for the native protein.
Crystallization-The crystals used for single wavelength anomalous diffraction data collection were grown by the hanging drop vapor diffusion method. The crystallization drops were prepared by mixing 1 μl of protein (32 mg/ml) with 1 μl of reservoir solution and then equilibrating the drop against 1 ml of reservoir solution. The YidC EC P1 construct yielded crystals in the space group I4 1 [...] polyethylene glycol 3350, and 20% glycerol. Crystals were incubated in cryo-solution for ~5 min before being flash-cooled in liquid nitrogen.
FIGURE 1. [...] (20); however, its exact orientation with respect to the membrane has not been determined. B, a ribbon diagram of YidC EC P1. The structure is colored gradually from N terminus (blue) to the C terminus (red). The strands are numbered 1-18 and the helices labeled α1-α3. C, a divergent stereo image of a Cα trace of YidC EC P1. Every tenth residue is marked with a sphere and labeled. D, a protein topology diagram for YidC EC P1. Strands are shown as arrows, helices as boxes, and loops as lines. β-sheet 1 is shown in blue, β-sheet 2 in red, and helix 1 in yellow.
Data Collection-Diffraction data were collected on selenomethionine-incorporated crystals at beamline 8.2.2 of the Advanced Light Source, Lawrence Berkeley Laboratory, University of California at Berkeley using a Quantum 315 ADSC area detector. The crystal-to-detector distance was 320 mm. Data were collected with 1° oscillations, and each image was exposed for 3 s. The diffraction data were processed with the program HKL2000 (24). See Table 1 for data collection statistics.
Structure Determination and Refinement-The YidC EC P1 structure was solved by single wavelength anomalous dispersion using a data set collected at the peak wavelength (0.9794 Å), the program SHELX (25) within ccp4i (26), and Autosol within PHENIX version 1.3 (27). SHELXC found eight of the possible ten selenium sites. The program Autobuild within PHENIX version 1.3 (27) automatically constructed ~90% of the polypeptide chain and performed density modification. The rest of the model was built using the program Coot (28). The structure was refined using the program Refmac5 (29) and the program CNS (30). The final models were obtained by restrained refinement in Refmac5 with Translation/Libration/Screw (TLS) restraints obtained from the TLS motion determination server (31). The data collection, phasing, and refinement statistics are summarized in Table 1.
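To relate the peak wavelength quoted above to the selenium absorption edge, the wavelength can be converted to a photon energy with E = hc/λ. The short Python sketch below is an illustration only; the function name is hypothetical, and the constant is the standard value of hc expressed in eV·Å.

```python
# Planck constant times the speed of light, expressed in eV * angstrom.
HC_EV_ANGSTROM = 12398.419843


def wavelength_to_energy_ev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in angstroms to a photon energy in eV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_ANGSTROM / wavelength_angstrom


# Peak wavelength reported for the single wavelength anomalous dispersion data set.
peak_energy_ev = wavelength_to_energy_ev(0.9794)
print(f"peak energy: {peak_energy_ev:.1f} eV")
```

The result, roughly 12.66 keV, lies at the selenium K absorption edge, which is where the anomalous signal from selenomethionine is expected to be strongest.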
Structural Analysis-Secondary structural analysis was performed with the programs DSSP (32), HERA (33), and Promotif (34). The programs SUPERIMPOSE (35) and SUPERPOSE (36) were used to superimpose coordinates for structural comparison. The program CONTACT within the program suite CCP4 (26) was used to measure the hydrogen bonding and van der Waals contacts. The program CASTp (37) was used to analyze the molecular surface and search for potential substrate binding sites. The program SURFACE RACER 1.2 (38) was used to measure the solvent-accessible surface of the protein and individual atoms within the protein. A probe radius of 1.4 Å was used in the calculations. The Protein-Protein Interaction Server (39,40) was used to analyze the interactions between the molecules in the asymmetric unit. The stereochemistry of the structure was analyzed with the program PROCHECK (41). The DALI server was used to find proteins with similar protein folds (42). [...] (43). The alignment figure was prepared using the programs ClustalW (44) and ESPript (45).

RESULTS
Structure of YidC EC P1-We have produced a C-terminal His6-tagged soluble construct of the major periplasmic domain of E. coli YidC (YidC EC P1) that spans residues Asp26-Leu340 (Fig. 1A). High resolution size-exclusion chromatography analysis and multiangle light scattering analysis of YidC EC P1 reveal that the protein is very soluble in the absence of detergents and is monomeric in nature (data not shown). YidC EC P1 was crystallized, and the structure was solved by single wavelength anomalous diffraction and refined to 2.5 Å resolution. There are two molecules in the asymmetric unit, and the refined structure includes residues 57-340. In addition, there is electron density for 3 residues at the C terminus that correspond to the affinity tag used to purify the protein. There is no visible electron density for a presumably mobile loop that spans residues 207-216. Additionally, no electron density is observed for the N-terminal residues 26-56. To facilitate crystallization and improve the diffraction quality of the crystals, the following mutations were introduced into YidC EC P1: E228A, K229A, E231A, K232A, and K234A. These lysine and glutamate residues were targeted for mutation to alanine in an attempt to reduce the degree of conformational entropy associated with longer side chains that may impede crystallization (46,47). The mutant YidC EC P1 protein behaves identically to the wild-type YidC EC P1 with regard to its solubility, chromatographic behavior, and light-scattering properties. In addition, no difference was seen between the mutant and wild-type YidC EC P1 proteins when analyzed by CD spectroscopy (data not shown).
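The residue ranges quoted above can be tallied to see how much of the construct is actually modeled. The following Python sketch is simple bookkeeping under stated assumptions: all ranges are treated as inclusive, the three ordered tag residues are ignored, and the tag sequence is the one given under "Experimental Procedures." The helper name is hypothetical.

```python
def inclusive_count(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


construct = inclusive_count(26, 340)         # YidC residues in the construct
observed_span = inclusive_count(57, 340)     # span seen in the electron density
disordered_loop = inclusive_count(207, 216)  # loop with no visible density
missing_n_term = inclusive_count(26, 56)     # disordered N-terminal residues

# Residues of YidC actually modeled in each molecule.
modeled = observed_span - disordered_loop

# Expressed protein: initiator Met + YidC residues + thrombin/His6 tag.
tag = "LVPRGSLEHHHHHH"
expressed_length = 1 + construct + len(tag)

print(f"construct residues: {construct}")
print(f"modeled residues:   {modeled} ({modeled / construct:.0%} of the construct)")
print(f"expressed length:   {expressed_length}")
```

With these ranges, 274 of the 315 YidC residues in the construct are modeled, and the expressed length of 330 residues agrees with the value stated for the construct.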
YidC EC P1 has the approximate dimensions of 38 × 60 × 45 Å with a significant groove formed along the face of the twisted β-sheet 1. The groove formed along the opposing β-sheet (sheet 2) of the sandwich is partially occupied by α-helix 3. Examination of surface electrostatics indicates that YidC EC P1 does not appear to have any major hydrophobic surface that could accommodate interactions with the acyl chains of the membrane lipids but could possibly interact with the lipid head groups. This is consistent with the solubility of YidC EC P1 in the absence of detergents.
Protein-Protein Interactions Observed in the YidC EC P1 Crystals-Superposition of the two molecules in the asymmetric unit shows that the only significant structural difference between molecule A and molecule B is a shift in the orientation of the C-terminal α-helix 3 (Fig. 2). The difference in the orientation of helix 3 is likely due to crystal-packing interactions. As mentioned earlier, to facilitate crystallization, five mutations were introduced into a region of YidC EC P1 to replace a cluster of lysine and glutamate residues. The structure shows that these residues are located on or near β-strand 12. Molecule A makes a significant number of crystal contacts between its C terminus and the residues that were mutated in a symmetry-related molecule A. This type of interaction is not observed in molecule B, giving a possible explanation for the differences seen in the orientation of the C-terminal α-helix 3.
Conserved Regions of YidC EC P1-Amino acid sequence alignment of eight YidC variants from various Gram-negative bacterial species reveals a number of conserved residues located throughout YidC EC P1 (Fig. 3). Most notably, the region at the extreme C terminus of the construct corresponding to α-helix 3 is well conserved. PFAM (48) analysis reveals that the residues 61-350 of YidC define a conserved domain PFAM-B_1222 that is remarkably consistent with the region of YidC EC P1 observed in the electron density (residues 57-340). Sixty-one YidC variants were extracted from the domain PFAM-B_1222, aligned using ClustalW (44), and analyzed using the program CONSURF (49) that maps conserved residues onto a three-dimensional structure. As shown in Fig. 4, a significant number of conserved residues map to α-helix 3 and to β-strands 11, 12, 14, 15, and 18, which cluster on the face of β-sheet 2 and pack against α-helix 3 (Fig. 4A). Closer inspection reveals that many of the conserved residues are involved in interactions between α-helix 3 and β-sheet 2 (Fig. 4B), suggesting that the interaction may be biologically significant.
FIGURE 4. [...] A, the regions that are most conserved are rendered in purple, the least conserved in blue. Significant conservation is seen in α-helix 3 and the end of β-sheet 2, which α-helix 3 packs against. B, a close-up view of the most conserved region, as seen from the end of α-helix 3. The side chains are shown as sticks, and the most conserved residues in this region are labeled.
[...] (18, 21).
FIGURE 6. [...] A, the electrostatic molecular surface of YidC EC P1. B, the same view as A but rendered with a semitransparent surface revealing a ribbon diagram of YidC EC P1 and the side chains for charged residues at the edge of the β-sandwich shown as sticks and labeled. The residues mutated to alanine for purposes of improved crystal quality are labeled in red. These residues occur in a cluster on or near β-strand 12.
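The idea of scoring conservation per alignment column can be illustrated with a short Python sketch. This is not the CONSURF algorithm, which uses a phylogeny-aware rate estimate; it is a much simpler stand-in based on Shannon entropy, and the four-sequence alignment below is invented toy data, not YidC sequence.

```python
from collections import Counter
from math import log2


def column_conservation(alignment: list[str]) -> list[float]:
    """Score each alignment column as 1 - normalized Shannon entropy.

    A score of 1.0 means the column is invariant; lower scores mean more
    variable columns. Gap characters ('-') are ignored, and a column made
    up only of gaps scores 0.0.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * log2(n / total) for n in Counter(residues).values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment used only to exercise the function.
toy_alignment = ["ACDE", "ACDF", "ACEG", "A-EH"]
toy_scores = column_conservation(toy_alignment)
for position, score in enumerate(toy_scores, start=1):
    print(f"column {position}: {score:.2f}")
```

Scores of this kind could be mapped onto residue positions in a structure in the same spirit as the coloring described for Fig. 4, although the ranking may differ from that produced by a phylogeny-aware method.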
Mapping Functional Regions-Previous studies of YidC have mapped two functional regions to the YidC periplasmic domain. First, deletion analysis of YidC has revealed that residues 323-346 of the first periplasmic domain are essential for cell viability and insertase activity (18,21). This region corresponds to the conserved α-helix 3 at the C terminus of YidC EC P1 (Fig. 5). Second, Xie et al. (18) have shown that residues 215-265 of E. coli YidC are sufficient for binding to SecF. As depicted in Fig. 5, this region maps to β-strands 11-15 and α-helix 1 at the edge of the β-sandwich. The approximate molecular surface area for the proposed SecF binding region that includes β-strands 12 and 13 and α-helix 1 is 450 Å². The surface of this region of the structure includes the residues that were mutated for purposes of improving the crystal quality. If these surface residues are modeled back to their wild-type residues, it can be seen that there is a significant negatively charged patch of molecular surface adjacent to a positively charged patch corresponding to the region that was found to be important for SecF binding (Fig. 6). This suggests that the interaction between YidC and SecF may be electrostatic in nature. It is worth noting that the conserved regions of the structure (Fig. 5) correspond well with the regions known to be functionally significant (Fig. 6).
Search for Structural Homologues-Interestingly, despite very low sequence identity, YidC EC P1 shows a significant degree of structural similarity with galactose mutarotase from Lactococcus lactis (50). The root mean square deviation for superposition of YidC EC P1 on galactose mutarotase is 3.3 Å for 202 equivalent Cα atoms, with 8% sequence identity for those residues compared (Fig. 7). As noted by Thoden and Holden (50), other proteins with related structures include copper amine oxidase (51), hyaluronate lyase (52), chondroitinase (53), β-galactosidase (54), and maltose phosphorylase (55). With the exception of copper amine oxidase, most of the related proteins contain sugar binding sites. The sugar binding pockets for these proteins do not seem to be conserved in YidC EC P1. Additionally, OpgG, which is located in the E. coli periplasm and is required for the biosynthesis of osmoregulated glucans (56), shares structural similarity with YidC EC P1.
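For readers unfamiliar with the root mean square deviation quoted above, the quantity is the square root of the mean squared distance between paired atoms. The Python sketch below illustrates the definition only; it assumes the two coordinate sets have already been superimposed and paired (the superposition step performed by the structural alignment programs is not reproduced), and the coordinates are invented toy values.

```python
from math import sqrt

Coordinate = tuple[float, float, float]


def rmsd(coords_a: list[Coordinate], coords_b: list[Coordinate]) -> float:
    """Root mean square deviation between two paired, pre-superimposed sets."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates: every atom pair is displaced by 3 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(f"RMSD = {rmsd(set_a, set_b):.1f} Å")
```

In the comparison reported here, the sum would run over the 202 equivalent Cα atom pairs identified by the structural alignment.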

DISCUSSION
In this study we present the first structure of the major periplasmic domain of the protein YidC of E. coli. The domain consists of a large stable β-sandwich with a short α-helix at the midpoint of the fold and edge of the sandwich and two α-helices at the C terminus that lie on the curved face of β-sheet 2.
The structure reveals that residues 323-346, which have previously been shown to be essential for cell viability and insertase activity (18), correspond to α-helix 3. Remarkably, Jiang et al. (21) have performed alanine scanning mutagenesis experiments on residues 324-342 of the YidC periplasmic domain and shown that there is no single residue side chain (beyond Cβ) within this region that is essential for cell viability. Consistent with this result, modeling using alanine side chains within residues 324-342 of the structure of YidC EC P1 suggests that the structure of α-helix 3 would not be significantly altered. Thus, it is reasonable to assume that the contribution of α-helix 3 of the YidC periplasmic domain to insertase activity depends on its secondary structure rather than on individual side chain interactions. Interestingly, secondary structural analysis of the evolutionarily related proteins Oxa1 of mitochondria and ALB3 of chloroplasts predicts an α-helix located N-terminal to the first transmembrane domain. It is tempting to speculate that this structural element represents a conserved feature related to insertase function.
FIGURE 7. YidC EC P1 shows structural similarity to a group of sugar-binding proteins. YidC EC P1 (white) is shown superimposed on the structure of galactose mutarotase (red) (Protein Data Bank code 1L7J) (50). This image is shown in divergent stereo. The side chains for the residues involved in sugar binding in galactose mutarotase are rendered as sticks.
Xie et al. (18) have shown that residues 215-265 of the YidC periplasmic domain fused to maltose-binding protein are sufficient to interact with SecF of the SecDFYajC heterotrimer. The structure presented here shows that this region (215-265) consists of β-strands 11, 12, 14, and 15, which contribute to β-sheet 2, and β-strand 13, which contributes to β-sheet 1. This region also includes the short α-helix 1 (residues 235-239) (Fig. 5). This represents the edge of the β-sandwich structure and corresponds to a negatively charged surface just adjacent to a positively charged surface, suggesting that the interactions between SecF and YidC may be predominantly electrostatic (Fig. 6).
An interesting question is how the YidC periplasmic domain is oriented with respect to the membrane. Because the structure does not reveal a hydrophobic surface and the construct does not require detergent for solubility, it is reasonable to assume the domain is probably loosely tethered to the membrane. This notion is consistent with the observation that residues 26-55 do not appear in the electron density and are thus assumed to be flexible. Furthermore, PsiPred analysis (57) does not predict secondary structure for residues 28-59. It is possible that this region forms a flexible "tether" or "linker" (Fig. 1A).
Previously it has been shown that full-length YidC purifies as a mixture of monomers and dimers in the presence of detergent (58). High resolution size-exclusion chromatography analysis in tandem with multiangle light scattering analysis indicates that YidC EC P1 behaves as a monomer in solution in the absence of detergent, with and without the mutations introduced for purposes of crystallization. Furthermore, although two molecules are present within the asymmetric unit, these molecules do not appear to interact in a manner that would suggest the presence of a strong dimer, except for the fact that monomers in the chosen asymmetric unit (the one with the most buried surface between the monomers) interact via α-helix 3, the helix proposed to have insertase function (Fig. 2B). It is possible that YidC oligomerization is mediated by interactions between the transmembrane segments, as has been proposed for the interaction between SecF and SecD (59). It is also possible that the periplasmic domain studied here forms dimers or even higher order oligomers when in close proximity to a membrane.
The function of the major periplasmic domain of YidC remains unclear. In vivo assays of YidC-mediated protein insertase activity have shown that up to 90% of the periplasmic domain (residues 25-323) can be deleted without inhibiting function (21). Further, although it has been shown in vitro that the YidC periplasmic domain interacts with SecF, this interaction is not required for insertion of Sec-dependent or Sec-independent substrates (18). Taken together, these observations suggest that the YidC periplasmic domain, which is conserved in all Gram-negative bacteria, may have a function unrelated to protein insertion into the inner membrane. The structure presented here reveals that the YidC periplasmic domain shares significant structural similarity with proteins that are involved in binding sugars, such as galactose mutarotase (Fig. 7). Although the YidC periplasmic domain does not appear to bear a sugar binding motif similar to that of galactose mutarotase, it is tempting to speculate that it could still interact with sugars present in the periplasm. Alternatively, it is conceivable that the YidC periplasmic domain could interact with other periplasmic proteins, such as chaperones that facilitate protein folding or secretion following translocation across, or into, the inner membrane. The availability of a three-dimensional structure of the YidC periplasmic domain, as well as stable, soluble protein, will facilitate experiments designed to address these ideas.