Membrane-bound Conformation of M13 Major Coat Protein

M13 major coat protein, a 50-amino-acid-long protein, was incorporated into DOPC/DOPG (80/20 molar ratio) unilamellar vesicles. Over 60% of all amino acid residues were replaced with cysteine residues, and the single cysteine mutants were labeled with the fluorescent label I-AEDANS. The coat protein has a single tryptophan residue that is used as a donor in fluorescence (or Förster) resonance energy transfer (FRET) experiments, using AEDANS-labeled cysteines as acceptors. Based on FRET-derived constraints, a straight α-helix is proposed as the membrane-bound conformation of the coat protein. Different models were tested to represent the molecular conformations of the donor and acceptor moieties. The best model was used to make a quantitative comparison of the FRET data to the structures of M13 coat protein and related coat proteins in the Protein Data Bank. This shows that the membrane-bound conformation of the coat protein is similar to the structure of the coat protein in the bacteriophage that was obtained from x-ray diffraction. Coat protein embedded in stacked, oriented bilayers and in micelles turns out to be strongly affected by the environmental stress of these membrane-mimicking environments. Our findings emphasize the need to study membrane proteins in a suitable environment, such as in fully hydrated unilamellar vesicles. Although larger proteins than M13 major coat protein may be able to handle environmental stress in a different way, any membrane protein with water-exposed parts in the C or N termini and hydrophilic loop regions should be treated with care.

One of the most challenging problems in structural biology of the 21st century is the unraveling of the structure and function of membrane proteins. For the water-soluble proteins, x-ray diffraction and high field solution NMR spectroscopy are the most suitable tools for structure determination, but for membrane proteins that need to be embedded in an amphipathic environment, there is not yet a well defined strategy for obtaining a protein structure (1). Among the spectroscopic methods, site-directed spectroscopic approaches are becoming increasingly important as alternative tools for structure determination of membrane proteins. These techniques are based on site-directed mutagenesis in combination with specific labeling and provide detailed information about the local environment of the labeled sites and membrane embedment (2-5).
Herein is described a further enhancement of the site-directed approach using site-directed labeling to determine distances within a membrane protein from fluorescence (or Förster) resonance energy transfer (FRET). FRET is based on a dipolar interaction in which energy is transferred from one chromophore, the donor, to another chromophore, the acceptor (6). In the present work, we applied FRET to determine the membrane-bound structure of the bacteriophage M13 major coat protein. M13 is a small protein, composed of 50 amino acid residues. The protein can adopt a stable conformation in the phage particle stabilized by protein-protein and protein-DNA interactions but can also be stably integrated in the Escherichia coli host cytoplasmic membrane during membrane-bound assembly and disassembly (7). The protein has been the subject of several biophysical studies and is therefore a suitable model system to test new spectroscopic approaches for membrane proteins (5, 7-9).
M13 major coat protein contains a natural tryptophan at position 26, which is used as a donor. The chromophore AEDANS is a suitable acceptor that can be specifically attached to cysteine residues that can, in principle, be introduced anywhere in the protein using site-directed mutagenesis. With a sufficient number of mutants, the three-dimensional structure of the protein can be inferred, based on the combined FRET-derived distance constraints.
In the case of the M13 major coat protein, >60% of the amino acid residues were replaced by a cysteine and used in the FRET analysis. Different structural models for the membrane-bound M13 coat protein and related proteins have been proposed in the literature. Some structures are "L-shaped" (10) or "U-shaped" (11), whereas it was also proposed that the conformation of membrane-bound coat protein is in fact "I-shaped," similar to the conformation in the phage particle (12). On the basis of our FRET data, we showed that the conformation of coat protein in DOPC/DOPG vesicles is best described as I-shaped, in good agreement with the phage assembly model that was proposed earlier (13).
Our FRET results are quantitatively compared with the existing literature on the structure determination of the M13 coat protein in detergent micelles and stacked, oriented bilayers by high resolution (11) and solid state NMR spectroscopy (14), respectively. In both micelles and in stacked, oriented bilayers, the structure of the coat protein turns out to be strongly affected by the stress of the membrane-mimicking environment.
The advantage of the fluorescence approach applied in this study is that the experiments are carried out in unilamellar vesicles providing information about the structure of the coat protein in a natural nonstressed state. As such, this paper serves as a caution to experimentalists who study membrane proteins in systems other than vesicles, as the conformation can depend strongly on the environment.
Fluorescence Measurements-Absorption spectra were recorded on a Varian Cary 5E UV-Vis-NIR spectrophotometer. Fluorescence spectra were recorded on a Fluorolog 3.22 spectrofluorimeter manufactured by Jobin Yvon-Spex. Emission was detected at the wavelength of maximum fluorescence. The band pass for excitation was 2.0 nm, the band pass for emission was 5.0 nm. Spectra were corrected for wavelength-dependent output of the lamp. The size of the vesicles was 41 nm, as determined by dynamic light scattering. All spectra were corrected for contributions of light scattering by subtracting the excitation spectrum of vesicles containing wild type protein in approximately the same concentration as the mutant proteins.
Förster Radius-The Förster radius was calculated (6) from the following.
$$R_0 = 9.78 \times 10^{3} \left(\kappa^{2} n^{-4} Q_D J(\lambda)\right)^{1/6} \quad \text{(Eq. 1)}$$
In this equation, $\kappa^{2}$ is the orientation factor, $n$ is the refractive index of the medium, $Q_D$ is the quantum yield of the donor, and $J(\lambda)$ is the overlap integral. The quantum yield of wild type protein was determined with a comparative method, using quinine sulfate in 0.05 M H2SO4 as a reference (17). The overlap integral was determined from the fluorescence emission spectrum of the wild type protein and from the absorption spectrum of the AEDANS-labeled Y21A/Y24A/W26A/G23C mutant (6), using the computer program IGOR Pro 3.13 (Wavemetrics, Lake Oswego, OR). Judging from the excitation spectra (see Fig. 1), the absorption spectrum did not depend on the position of the AEDANS label.
To calculate the Förster radius, a value $\kappa^{2}$ = 2/3 was used. This approximation is assumed to be generally valid for proteins (6) and in particular for the Trp-AEDANS donor-acceptor pair (18). In our case, this assumption is further supported by the fact that the Trp donor (19,20) and AEDANS acceptor (21) have a high degree of motion. In addition, all our measurements were carried out at room temperature in a mobile lipid environment well above the gel to liquid crystalline transition temperature of the lipids used ($T_{m,\mathrm{DOPC}}$ = −20°C and $T_{m,\mathrm{DOPG}}$ = −18°C). This allows us to use $\kappa^{2}$ = 2/3 for the dynamic averaging of the relative donor-acceptor positions in the M13 coat protein system. The refractive index of the bilayer (1.5) was taken from the literature (22). The quantum yield ($Q_D$) and overlap integral ($J(\lambda)$) were calculated to be 0.3 and 6.0 × 10⁻¹⁵ M⁻¹ cm³, respectively. The resulting Förster radius of 24 ± 1 Å is in good agreement with the value found previously (23,24).
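As a quick numerical check of Equation 1, the following minimal Python sketch plugs in the values quoted above (orientation factor 2/3, refractive index 1.5, quantum yield 0.3, overlap integral 6.0 × 10⁻¹⁵ M⁻¹ cm³). The function name `forster_radius` is a hypothetical helper introduced here for illustration; it is not part of the authors' software.

```python
def forster_radius(kappa2, n, q_d, j_overlap):
    """Förster radius in Å from Eq. 1.

    kappa2    : orientation factor (dimensionless)
    n         : refractive index of the medium
    q_d       : donor quantum yield
    j_overlap : overlap integral in M^-1 cm^3
    """
    return 9.78e3 * (kappa2 * n ** -4 * q_d * j_overlap) ** (1.0 / 6.0)


# Values quoted in the text for the Trp-AEDANS pair in the bilayer.
r0 = forster_radius(kappa2=2.0 / 3.0, n=1.5, q_d=0.3, j_overlap=6.0e-15)
print(f"R0 = {r0:.1f} Å")
```

The result falls within the reported 24 ± 1 Å. Because of the sixth root, the radius is only weakly sensitive to uncertainty in any single input.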
Energy Transfer Efficiency-In general, energy transfer efficiency depends on the distance $r$ between the donor and acceptor (6),
$$E = \frac{R_0^{6}}{R_0^{6} + r^{6}} \quad \text{(Eq. 2)}$$
where $R_0$ is the Förster radius as defined by Equation 1.
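The distance dependence can be made concrete with a short Python sketch, assuming the standard Förster relation E = R0⁶/(R0⁶ + r⁶) and the Förster radius of 24 Å reported above. The function names are hypothetical and chosen for illustration only.

```python
R0 = 24.0  # Förster radius of the Trp-AEDANS pair in Å, as reported in the text


def transfer_efficiency(r, r0=R0):
    """Energy transfer efficiency for a donor-acceptor distance r (Å)."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)


def distance_from_efficiency(e, r0=R0):
    """Invert the relation: distance (Å) for an efficiency 0 < e < 1."""
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


for r in (12.0, 24.0, 36.0):
    print(f"r = {r:4.1f} Å  ->  E = {transfer_efficiency(r):.3f}")
```

The steep sixth-power dependence means that the efficiency is most informative for distances within roughly 0.5 to 1.5 times the Förster radius; outside that window it saturates near 1 or 0.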
Two typical excitation spectra of an AEDANS-labeled mutant close to tryptophan (Fig. 1, red line) and of an AEDANS-labeled mutant far away from tryptophan (Fig. 1, blue line) are shown. In the absence of energy transfer, the shape of the excitation spectrum should resemble the shape of the absorption spectrum. At an excitation wavelength of ~290 nm, the shape of the excitation spectrum deviates from the AEDANS absorption spectrum. The shape of this particular peak resembles that of the tryptophan absorption peak, which has a maximum at ~280 nm and a small characteristic shoulder at ~290 nm. Tryptophan itself does not emit at 490 nm, so the presence of a tryptophan peak in the AEDANS excitation spectrum indicates energy transfer from tryptophan to AEDANS.
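Assuming that AEDANS is excited exclusively through direct excitation or tryptophan to AEDANS energy transfer, and that there is no intermolecular energy transfer, the fluorescence intensity depends on the wavelength of excitation through the following.
$$F^{\lambda} = \gamma \left(\varepsilon_{\mathrm{AEDANS}}^{\lambda} + E\,\varepsilon_{\mathrm{tryptophan}}^{\lambda}\right) \quad \text{(Eq. 3)}$$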
Here, $F^{\lambda}$ is the fluorescence intensity, $\varepsilon_{\mathrm{AEDANS}}^{\lambda}$ and $\varepsilon_{\mathrm{tryptophan}}^{\lambda}$ are the molar extinction coefficients of AEDANS and tryptophan at wavelength $\lambda$, $E$ is the energy transfer efficiency, and $\gamma$ is a constant that depends on the apparatus, energy acceptor, and concentration (25).
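To illustrate how an efficiency could be extracted from an excitation spectrum, the sketch below assumes that the intensity is proportional to the AEDANS extinction coefficient plus the efficiency times the tryptophan extinction coefficient, as described above. It further assumes that the proportionality constant is eliminated by normalizing to a reference wavelength at which only AEDANS absorbs; that normalization step, the function name, and all numbers in the demonstration are illustrative assumptions, not values or procedures taken from the measurements.

```python
def efficiency_from_excitation(f_exc, f_ref, eps_a_exc, eps_a_ref, eps_w_exc):
    """Estimate the transfer efficiency E from two excitation intensities.

    f_exc     : intensity at a wavelength where both chromophores absorb
    f_ref     : intensity at a reference wavelength where only AEDANS absorbs
    eps_a_exc : AEDANS extinction coefficient at the first wavelength
    eps_a_ref : AEDANS extinction coefficient at the reference wavelength
    eps_w_exc : tryptophan extinction coefficient at the first wavelength
    """
    gamma = f_ref / eps_a_ref          # apparatus/concentration constant
    return (f_exc / gamma - eps_a_exc) / eps_w_exc


# Synthetic demonstration with hypothetical numbers (not measured data):
# generate intensities for a known efficiency and recover it.
true_e, gamma = 0.4, 2.0e-4
eps_a_exc, eps_a_ref, eps_w_exc = 1000.0, 6000.0, 5000.0
f_exc = gamma * (eps_a_exc + true_e * eps_w_exc)
f_ref = gamma * eps_a_ref
recovered = efficiency_from_excitation(f_exc, f_ref, eps_a_exc, eps_a_ref, eps_w_exc)
print(f"recovered E = {recovered:.3f}")
```

Because the efficiency is obtained from a difference divided by the tryptophan extinction coefficient, uncertainties in the extinction coefficients propagate directly into the result.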
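At 290 nm, AEDANS excitation can occur through both energy transfer and direct excitation, so Equation 3 becomes the following.
$$F^{290} = \gamma \left(\varepsilon_{\mathrm{AEDANS}}^{290} + E\,\varepsilon_{\mathrm{tryptophan}}^{290}\right)$$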
The tyrosine absorption is small but not negligible at 290 nm (27). Through tyrosine to tryptophan energy transfer, additional excitation of tryptophan occurs. M13 major coat protein has two tyrosine residues, one at position 21 and one at position 24. The distance between the geometric center of the tyrosines and the center of tryptophan in an ideal helix is 10 and 11 Å, respectively. Both distances are close to the Förster radius of 10 Å (28). Therefore, it will be assumed that the tyrosine to tryptophan energy transfer efficiency equals 0.5. For mutants containing both native tyrosines, this implies a correction of $\varepsilon_{\mathrm{tryptophan}}^{290}$ to 5000 M⁻¹ cm⁻¹. For the Y21C and Y24C mutants, having only one tyrosine, $\varepsilon_{\mathrm{tryptophan}}^{290}$ was corrected to 4900 M⁻¹ cm⁻¹.
FIGURE 1. Excitation spectra of AEDANS-labeled V31C (red) and T46C (blue) mutants. The peak at ~280 nm is indicative of tryptophan to AEDANS energy transfer, and the peaks at ~260 nm and ~340 nm arise from direct excitation of AEDANS. The emission is detected at the AEDANS emission maximum at 490 nm. AU, arbitrary units.
NOVEMBER 18, 2005 • VOLUME 280 • NUMBER 46
Molecular Modeling-GROMACS 3.1.1 (29) was compiled and installed under Linux Debian on an AMD Athlon MP 2200+ dual node work station. MOLMOL (30) and Swiss PdbViewer (31) were run under Windows 2000. Scripts to run the GROMACS program gmxcheck and scripts to calculate efficiencies were programmed in Perl (32) and can be obtained from the authors on request.

Structure Validation through FRET-derived Constraints
The model of a straight ideal α-helix was built from the primary structure with the computer program Swiss PdbViewer, giving a perfect right-handed α-helix with a pitch of 5.41 Å and a rise/residue of 1.50 Å (33). AEDANS-labeled cysteine was built in MOLMOL. When possible, the bond lengths and bond angles were taken from the crystal structure of 1,5-IAENS (34). The dihedral angles in the naphthalene group and in the sulfonic acid group were taken from the crystal structure as well. Missing bond lengths and bond angles were taken from the AMBER force field (35).
To account for the flexibility, an atomic model of the AEDANS label was introduced that explicitly takes into account all possible rotamers. For each mutant, an atomic model of AEDANS-labeled cysteine was inserted in the structure using the computer program MOLMOL, preserving the backbone dihedral angles of the original residue. Side chain dihedral angles have a strong preference for three conformations: p, t, and m (36). For simplicity, it is assumed that these conformations correspond to discrete values of −60, 60, and 180°. The side chain dihedral angles $\chi_1$, $\chi_2$, $\chi_3$, $\chi_4$, $\chi_6$, $\chi_7$, and $\chi_8$ (see Fig. 2) were allowed to adopt all three values, generating a large set of possible rotamers.
For each rotamer, a separate structure file was written. Two dihedral angles, $\chi_5$ and $\chi_9$, involve a nitrogen atom that is either part of an amide bond ($\chi_5$) or is bonded to the aromatic ring system ($\chi_9$). These dihedral angles were fixed to 180°, being the value found in the crystal structure of 1,5-IAENS (34), because the flexibility of these bonds is limited due to their double bond character.
For the tryptophan side chain, the dihedral angles $\chi_1$ and $\chi_2$ were fixed to 180 and 90°, respectively, representing the most abundant rotamer of tryptophan in α-helices (36). A distribution of conformations of tryptophan was not taken into account, because the spatial translation of the tryptophan amino acid residue upon rotation about the $\chi_1$ and $\chi_2$ angles is relatively small.
All resulting structures were screened for unfavorable AEDANS conformations, i.e. with the label folded onto the backbone, onto the tryptophan, or onto itself. To this end, the number of atom pairs with van der Waals overlap was determined. To save computer time, all hydrogen atoms were removed first. Heavy atoms of side chains other than tryptophan were removed as well, because Swiss PdbViewer placed them in a rather arbitrary way. Of the remaining atoms, an atom pair with an interatomic distance ≤0.8× the sum of the combined van der Waals radii was regarded as an overlapping atom pair. A maximum of two overlapping atom pairs/structure was allowed, otherwise the structure was discarded. Typically, ~80% of the structures were discarded based on van der Waals overlap. Assuming that all allowed conformations are equally probable, the energy transfer contribution for AEDANS conformation $i$ ($E_i$) is given by Equation 2. The calculated energy transfer taking into account all conformations is as follows,
$$E_{\mathrm{calc}} = \frac{1}{N} \sum_{i=1}^{N} E_i$$
where $N$ is the number of allowed conformations for a specific mutant. The corresponding distance in Equation 2, $r_i$, is calculated from the middle of the central bond of the AEDANS naphthalene group to the middle of the central bond of the tryptophan indole group.
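The rotamer enumeration described above can be sketched in a few lines of Python. The sketch assumes only what the text states: seven free side chain dihedral angles (1-4 and 6-8) that each take one of three discrete values, and two angles (5 and 9) fixed at 180°. The names `FREE_ANGLES`, `FIXED_ANGLES`, and `enumerate_rotamers` are hypothetical; the authors' own scripts were written in Perl and are not reproduced here.

```python
from itertools import product

ROTAMER_VALUES = (-60.0, 60.0, 180.0)   # discrete dihedral values in degrees
FREE_ANGLES = (1, 2, 3, 4, 6, 7, 8)     # indices of the freely sampled dihedrals
FIXED_ANGLES = {5: 180.0, 9: 180.0}     # dihedrals held at the crystal value


def enumerate_rotamers():
    """Yield one {dihedral index: angle} dictionary per AEDANS rotamer."""
    for combo in product(ROTAMER_VALUES, repeat=len(FREE_ANGLES)):
        chi = dict(zip(FREE_ANGLES, combo))
        chi.update(FIXED_ANGLES)
        yield dict(sorted(chi.items()))


rotamers = list(enumerate_rotamers())
print(len(rotamers))  # 3**7 = 2187 candidate rotamers per mutant
```

Each dictionary would correspond to one structure file before any screening for steric clashes.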
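The screening and averaging steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the exclusion of covalently bonded atom pairs via `excluded_pairs` is an assumption made here (bonded atoms would otherwise always count as overlapping), and the Förster radius default of 24 Å is the value reported in the text.

```python
import math


def count_overlaps(coords, radii, excluded_pairs=frozenset(), scale=0.8):
    """Count atom pairs closer than scale * (sum of van der Waals radii).

    excluded_pairs holds (i, j) index pairs with i < j that are skipped,
    e.g. covalently bonded atoms (an assumption of this sketch).
    """
    overlaps = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if (i, j) in excluded_pairs:
                continue
            if math.dist(coords[i], coords[j]) <= scale * (radii[i] + radii[j]):
                overlaps += 1
    return overlaps


def is_allowed(coords, radii, excluded_pairs=frozenset(), max_overlaps=2):
    """A structure is kept when it has at most max_overlaps overlapping pairs."""
    return count_overlaps(coords, radii, excluded_pairs) <= max_overlaps


def mean_efficiency(distances, r0=24.0):
    """Average the per-conformation efficiencies over all allowed rotamers."""
    effs = [r0 ** 6 / (r0 ** 6 + r ** 6) for r in distances]
    return sum(effs) / len(effs)


# Averaging efficiencies is not the same as using the average distance:
print(mean_efficiency([14.0, 34.0]), mean_efficiency([24.0]))
```

The last line shows why the conformational distribution matters: two conformations at 14 and 34 Å give a different mean efficiency than a single conformation at their mean distance of 24 Å.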

RESULTS
The experimental energy transfer efficiencies are depicted in Fig. 3. Because of the high lipid to protein ratios used, we assumed that the energy transfer efficiencies were only due to intramolecular energy transfer. To justify this assumption, we performed some calculations based on a random distribution of wild type and mutant protein in the membrane. In this case, the contribution due to intermolecular energy transfer was small for lipid to protein ratio 400 (9% for mutant A9C, 13% for mutant G38C) and negligible (<1%) for lipid to protein ratio 1500. We therefore assumed that the contribution of intermolecular energy transfer could be ignored, although a small contribution could still be present in the case of lipid to protein ratio 400. The experimental extinction coefficients in Equation 6 are the main source for errors in the energy transfer efficiencies in Fig. 3. Therefore, error bars were calculated based on an uncertainty of ±200 in the extinction coefficients.
The energy transfer efficiencies approach 1.0 for AEDANS-labeled mutants near the tryptophan at position 26, which is located in the center of the protein, and are almost zero for mutants in both termini. On top of this overall trend, the efficiencies show oscillations that are indicative of a helical conformation of the protein.
FIGURE 2. Overview of the side chain dihedral angles $\chi_1$-$\chi_9$ of AEDANS-labeled cysteine. The dihedral angles $\chi_1$, $\chi_2$, $\chi_3$, $\chi_4$, $\chi_6$, $\chi_7$, and $\chi_8$ were allowed to adopt p, t, and m conformations. The dihedral angles $\chi_5$ and $\chi_9$ were fixed to a t conformation. Bond lengths and bonding angles were taken from the crystal structure of 1,5-IAENS (34) and the AMBER force field (35).
The efficiency is related to the distance between the donor and acceptor via Equation 2. The donor-acceptor distance is equal to the Förster radius when the efficiency is 0.5. As can be seen in Fig. 3, this is the case for labels around positions 10 and 42. This implies that the distance from these residues to tryptophan at position 26 is ~24 Å, being the Förster radius of the tryptophan-AEDANS donor-acceptor pair. Both positions are 16 amino acid residues away from the tryptophan at position 26. A perfect α-helix has a rise of 1.5 Å/residue (33), giving a translation of 24 Å for 16 residues. This calculation shows that the efficiencies can be explained qualitatively on the basis of an α-helical structure for the protein.
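The helix-rise argument can be checked with a short Python sketch. It uses the rise (1.50 Å/residue) and pitch (5.41 Å) quoted for the ideal helix and the Förster radius of 24 Å. The Cα radius of 2.3 Å about the helix axis is a typical textbook value assumed here for illustration and is not taken from the text; the function names are likewise hypothetical.

```python
import math

RISE = 1.50       # Å per residue along the helix axis (from the text)
PITCH = 5.41      # Å per helical turn (from the text)
CA_RADIUS = 2.3   # Å, assumed distance of C-alpha atoms from the helix axis
R0 = 24.0         # Å, Förster radius of the Trp-AEDANS pair (from the text)
TRP = 26          # position of the tryptophan donor


def axial_distance(residue):
    """Translation along the helix axis between a residue and Trp-26."""
    return RISE * abs(residue - TRP)


def ca_distance(residue):
    """C-alpha to C-alpha distance to Trp-26 in an ideal straight helix."""
    dn = residue - TRP
    dtheta = 2.0 * math.pi * dn * RISE / PITCH   # rotation about the axis
    chord = 2.0 * CA_RADIUS * abs(math.sin(dtheta / 2.0))
    return math.hypot(chord, RISE * dn)


def efficiency(r):
    """Transfer efficiency for a donor-acceptor distance r in Å."""
    return R0 ** 6 / (R0 ** 6 + r ** 6)


for res in (10, 42):
    print(res, axial_distance(res), efficiency(axial_distance(res)))
```

Residues 10 and 42 both lie 16 residues from position 26, so the axial translation equals the Förster radius and the predicted efficiency is 0.5. The radial term in `ca_distance` adds a small, periodic correction, which is one way to rationalize oscillations superimposed on the overall trend.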
Model Construction-Based on this finding, a molecular model of the protein in an α-helical conformation is constructed, and the theoretical efficiencies are calculated using four increasingly sophisticated models (A-D) to represent the donor and acceptor (Fig. 4). In model A, the sizes of the donor and the acceptor are not taken into account. The trend of the experimental data is well reflected by this model (Fig. 3, black line), but the calculated values are in general too high. Moreover, the oscillations that are clearly visible in the experimental data are hardly present in the calculated curve. To account for this, the sizes of the donor and acceptor are taken into account explicitly in model B. In this model, these sizes were estimated based on molecular models, assuming that they are in a fully extended conformation (Fig. 4, model B). In this case, the efficiencies become too low and the oscillations become too strong, indicating that the size of the donor and/or acceptor is overestimated (Fig. 3, green line). To quantify the performance of the models, a model quality parameter Q is introduced.
Here, $j$ runs over the mutants, $M$ is the total number of mutants, and $E_{\mathrm{experiment}}$ and $E_{\mathrm{calc}}$ are the experimental and the calculated efficiencies. As can be seen in Fig. 4, model B does not perform as well as model A.
Previously, the effective sizes of tryptophan and AEDANS-labeled cysteine were estimated to be 6.5 and 8.0 Å, respectively (2). The effective size of the donor is in close agreement with the size of a fully extended tryptophan, which was estimated here to be 6.6 Å. The size of AEDANS-labeled cysteine (8.0 Å), however, is much smaller than the value of 16.0 Å estimated from the molecular model. This indicates that the acceptor is not in a fully extended conformation. When these sizes are used to calculate the theoretical efficiencies, they are in good agreement with the experimental ones (Fig. 3, blue line), and the value of Q decreases again (Fig. 4, model C).
Apparently, the AEDANS label is not in a fully extended conformation. Therefore, we proposed model D in which AEDANS is represented as a distribution of conformations, giving rise to a smaller effective size of the acceptor. It can be seen (Fig. 3, red line) that model D gives excellent theoretical efficiencies in both the N- and C-terminal parts of the protein, leading to a further reduction of the value of Q (Fig. 4, model D). This implies that the protein is well described by an α-helix when the conformational space of the acceptor is taken into account.
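The exact expression for Q is not reproduced in this text. Purely as an illustration of how such a parameter could be computed, the sketch below assumes a root-mean-square deviation between experimental and calculated efficiencies over the M mutants. This functional form, the function name `model_quality`, and the example numbers are assumptions and may differ from the definition used for Figs. 4 and 5.

```python
import math


def model_quality(e_experiment, e_calc):
    """Assumed quality parameter: RMS deviation over all mutants.

    Lower values indicate better agreement between model and experiment.
    """
    if len(e_experiment) != len(e_calc):
        raise ValueError("need one calculated efficiency per mutant")
    m = len(e_experiment)
    squared = sum((exp - calc) ** 2 for exp, calc in zip(e_experiment, e_calc))
    return math.sqrt(squared / m)


# Hypothetical efficiencies for three mutants (not measured data).
print(model_quality([0.9, 0.5, 0.1], [0.8, 0.6, 0.1]))
```

Whatever the precise form, a parameter of this kind lets structures of different origin be ranked against the same set of experimental efficiencies.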
However, various conformations of membrane-bound coat protein have been proposed previously based on x-ray diffraction, solution NMR, and solid state NMR. Therefore, the same approach as described for model D was used to calculate the theoretical efficiencies for the known structures in the Protein Data Bank. The resulting quality parameters are depicted in Fig. 5. TABLE ONE gives an overview of the structures of the coat protein that were used in the comparison to the energy transfer data using model D.

DISCUSSION
Micellar Stress-The first fifty structures in Fig. 5, blue and purple bars, were calculated from solution NMR in detergent micelles (see Fig. 6A) (11). For the majority of these structures, the calculated energy transfer efficiencies of mutants in the C-terminal part of the protein are in good agreement with our energy transfer data. The high variation in Q values arises mainly from the orientation of the N-terminal helix, which is poorly defined with respect to the C-terminal helix. Highly curved micelles are known to induce stress on single α-helices passing through a micelle. For instance, micelles were shown to cause considerable curvature in a 22-amino-acid-long micelle-bound peptide (37). Chou et al. (37) anticipated that for longer α-helices, the environmental stress would be even larger. We propose that micelle-bound coat protein, which is 50 amino acids long, reduces this stress through partial unwinding of the helix around residues 20-24, allowing the protein to adapt to the highly curved micelle surface. Because of this structural disorder, fewer NOEs are observed in this region, hampering the determination of an unambiguous protein structure (11).
FIGURE 4. Different models to calculate the distance between the donor and acceptor, r, and their model quality parameters, Q. Model A, the distance between the Cα atoms is taken as r. Model B, the size of the donor and acceptor in a fully extended conformation is estimated based on a molecular model. Model C, the donor and acceptor size are taken from the literature (2). Model D, the flexibility of the acceptor is taken into account, and a separate energy transfer efficiency is calculated for each conformation of the acceptor. Subsequently, an average energy transfer efficiency is taken over all possible conformations.
Dehydrated Bilayer Stress-Remarkably, the structure found with solid state NMR in stacked, oriented bilayers (14) has a particularly high Q value (see Fig. 5, yellow bar). This structure consists of two helices, one running from residue 7 to 20, the other from 21 to 44, and is best described as L-shaped (see Fig. 6B). The helices are connected via a short turn. In addition, the latter helix has a distinct kink near residue 39. In our energy transfer data, there is no indication of such a turn in the N terminus nor of a kink in the transmembrane helix. We ascribed the difference in conformation of the protein in vesicles and in oriented bilayers to insufficient hydration of the lipid headgroups in oriented bilayers. Stacked, oriented bilayers are prepared by drying of a protein-lipid vesicle suspension on a glass plate and subsequent rehydration of the sample at high relative humidity. When the relative humidity is 100%, the water spacing between the stacks becomes ~18.0 Å (38), and the lipid leaflets are fully hydrated. However, when the relative humidity is slightly lower, the water spacing between the lipid leaflets decreases rapidly (to 3.6 Å at 97% relative humidity). The stacked, oriented bilayers that were used for solid state NMR in the case of the coat protein were rehydrated at an even lower relative humidity (93% (39)), suggesting a water layer <3.6 Å. Such a small water layer can impose a severe environmental stress on the N-terminal part of the protein, thereby changing its conformation. This stress is absent in unilamellar vesicles in which the lipid leaflets are fully hydrated, and the polar residues in the N-terminal part of the protein can interact with water molecules, allowing an I-shaped conformation as found in our work (Fig. 6D).
Phage-bound Coat Protein-The structure of phage-bound coat protein (12) has a relatively low Q value (Fig. 5, black bar). This structure is a continuous, slightly curved α-helix (Fig. 6C). Throughout the entire structure, the calculated energy transfer efficiencies are in good agreement with the experimental ones, showing that the conformation in vesicles is not very different from the phage-bound conformation. This is consistent with the model of Marvin (13), who anticipated that the membrane-bound conformation of the coat protein could not be very different from the phage-bound conformation. This view is further supported by the fact that the Q value of a straight α-helix based on FRET-derived distance constraints is also low.
It should be noted, however, that the number of FRET-derived constraints in the N-terminal part of the protein may be insufficient to distinguish between some extended structures found in micelles and a continuous α-helix. Furthermore, some dynamics and curvature of the structure might still be present, permitting charged residues in the N-terminal part of the protein to interact with the membrane-water interface, allowing a slight deviation from an ideal α-helix. Nevertheless, the α-helix model depicted in Fig. 6D is in good agreement with recent findings of our group describing the structure and membrane assembly of the coat protein in phospholipid membranes (2,4).

CONCLUSIONS
Based on the data presented here, we conclude that M13 major coat protein bound to vesicles is well described by a continuous α-helix, resembling the I-shaped structure found in the bacteriophage. It should be noted that some extended structures of the coat protein in SDS micelles are also in good agreement with the energy transfer data presented here and that the number of FRET-derived distance constraints in the N-terminal domain is insufficient to distinguish between extended structures and a continuous α-helix. The presence of L-shaped and U-shaped structures in vesicles can be excluded based on our FRET-derived distance constraints.
In this work, we have shown that FRET offers a useful alternative for structure determination of membrane proteins, provided a sufficient number of single cysteine mutants can be obtained. In addition, the fluorescence technique is sensitive and thus not very demanding in terms of protein quantities needed.
Our findings emphasize the need to study membrane proteins in a suitable environment, such as fully hydrated vesicles. Although larger proteins than M13 major coat protein may be able to handle environmental stress in a different way, any membrane protein with water-exposed parts in the C or N termini and hydrophilic loop regions should be treated with care.

FIGURE 6. A, ... (11). The backbone atoms of residues 26-40 were fitted to each other, and the putative transmembrane domain was placed to span the hydrophobic region of a bilayer, which is ~30 Å and is depicted in yellow. B, the coat protein in stacked, oriented bilayers with the N-terminal helix oriented perpendicular to the membrane normal (14). For reference, the tryptophan residue at position 26 is represented as a blue sphere, and the lysine residues at positions 40, 43, and 44 are represented as green spheres. In this model, the charged lysine residues, which have relatively long side chains, are able to interact with the membrane-water interface, even at position 40. C, the coat protein is represented as a slightly curved, continuous helix at an angle of ~39° to the membrane normal (13). The protein is oriented in such a way that the lysine residues are located at the membrane-water interface. D, protein model emerging from the FRET-derived constraints. The charged lysines in the C terminus are placed at the interface (40), and the protein adopts a tilt angle of 23° (2).

TABLE ONE. Overview of the structures of the coat protein that were used in the comparison to the energy transfer data using model D. All structures were taken from the Protein Data Bank. All 50 high resolution structures that were calculated with high field NMR were taken into account (11). For the structure of the coat protein in oriented bilayers, only the coordinates for residues 7-44 were published (14), so residues 1-6 and 45-50 were left out of consideration for the calculation of the quality factor Q.
PDB entry    Number of structures
1    Intact phage    X-ray diffraction    Fd (a)    12
(a) M13 coat protein has an asparagine residue at position 12; Fd coat protein has an aspartic acid residue. The rest of the protein is identical.