Structural Characterization and Oligomerization of PB1-F2, a Proapoptotic Influenza A Virus Protein

Recently, a novel 87-amino acid influenza A virus protein with proapoptotic properties, PB1-F2, has been reported that originates from an alternative reading frame in the PB1 polymerase gene and is encoded in most known human influenza A virus isolates. Here we characterize the molecular structure of a biologically active synthetic version of the protein (sPB1-F2). Western blot analysis, chemical cross-linking, and NMR spectroscopy afforded direct evidence of the inherent tendency of sPB1-F2 to undergo oligomerization mediated by two distinct domains located in the N and C termini, respectively. CD and 1H NMR spectroscopic analyses indicate that the stability of structured regions in the molecule clearly depends upon the hydrophobicity of the solvent. In aqueous solutions, the behavior of sPB1-F2 is typical of a largely random coil peptide that, however, adopts α-helical structure upon the addition of membrane mimetics. 1H NMR analysis of three overlapping peptides afforded, for the first time, direct experimental evidence of the presence of a C-terminal region with strong α-helical propensity comprising amino acid residues Ile55-Lys85 connected via an essentially random coil structure to a much weaker helix-like region, located in the N terminus between residues Trp9 and Lys20. The C-terminal helix is not a true amphipathic helix and is more compact than previously predicted. It corresponds to a positively charged region previously shown to include the mitochondrial targeting sequence of PB1-F2. The consequences of the strong oligomerization and helical propensities of the molecule are discussed and used to formulate a hypothetical model of its interaction with the mitochondrial membrane.

Influenza A virus (IAV) is one of the most common pathogens threatening humans and animals, with the potential to cause disastrous pandemics. In the last century, it was the origin of at least three pandemics, the most serious outbreak being the "Spanish flu" (1918-1919) that claimed 20-50 million lives worldwide (for a review, see Ref. 1). Apart from various mammals, IAV also infects avian hosts, and aquatic birds in particular have been shown to be the primary reservoir. Sporadically, some of these avian strains acquire the capability to infect humans or other mammals, either as a whole or, more likely, upon genetic reassortment with prevailing human IAV strains. This process, termed antigenic (viral) shift, appears to occur via the pig as an intermediate host and "mixing vessel" and can lead to new IAV subtypes of mixed surface antigens (2,3).
The genome of IAV, a representative of the orthomyxoviruses, consists of eight separate linear segments of negative sense RNA and was thought to encode 10 gene products. Only very recently, while screening for major histocompatibility complex class I epitopes derived from out-of-frame viral polypeptides, an 11th IAV gene product, named PB1-F2, was incidentally discovered (4). Like the two proteins M1 (matrix protein) and M2 (ion channel) encoded on the M gene segment and the two nonstructural proteins NS1 and NS2 encoded on the NS gene segment, respectively, PB1-F2 was found to be expressed as a second protein from the PB1 (RNA polymerase basic protein 1) gene segment. In contrast to NS2 and M2, which are derived from spliced mRNAs, PB1-F2 is the only influenza A virus protein that originates from an alternative (+1) open reading frame of the PB1 gene. PB1-F2 was characterized as an 87-amino acid residue protein and originally discovered for IAV strain A/Puerto Rico/8/34(H1N1), also termed IAV PR8. Whereas the pb1-f2 open reading frame was identified in the majority of analyzed IAV subtypes, it is not present in the influenza B virus genus (4).
A major part of IAV replication occurs in the nucleus, where viral RNA is produced through the concerted action of three polymerase subunits (polymerase acidic protein, PB1, and PB2), together with the nucleoprotein. At a late stage of infection, M1 and NS2 proteins enter the nucleus, where, among other functions, they induce the shutdown of viral RNA synthesis and promote the export of newly assembled cores through the cytosol to the plasma membrane, where progeny virions bud from the membrane (reviewed in Ref. 5). IAV infection generally activates several host cell antiviral mechanisms that are in part counteracted by accessory IAV proteins, like the NS1 protein that is involved in the inhibition of type I interferon response (6,7). PB1-F2 appears to represent another tool by which IAV regulates the host's immune response to virus infection (4). It is assumed that PB1-F2 removes host immune cells responding to IAV infection either by functioning as an endogenously expressed apoptosis stimulator in infected cells or in an exogenous form when the protein is released from infected cells or from disintegrated virus particles similar to the function of the proapoptotic human immunodeficiency virus-1 protein Vpr (8,9).
With the goal of understanding the molecular mechanism involved in the biological function of the regulatory IAV protein PB1-F2, we describe here the first structural characterization of the molecule derived from the IAV PR8 isolate. Although the molecule investigated exhibits a high degree of flexibility in a pure aqueous environment, PB1-F2 adopts extended α-helical structures in the presence of organic solvents that mimic the membrane environment. According to high resolution NMR data, PB1-F2 consists of two independent structural domains: two closely neighboring short helices at the N terminus and an extended C-terminal helix. The two helical domains are connected by a flexible and unstructured hinge region. Furthermore, we observed that the PB1-F2 molecule has a strong intrinsic propensity to form oligomeric structures, a characteristic that supports the recent observation that the molecule can form membrane pores in planar lipid bilayers (10). The major oligomerization domain is located in the C-terminal helix, although both the N- and C-terminal domains exhibit separate oligomerization capacity.
CD Spectroscopy-CD spectra of the protein and related peptides were recorded at room temperature and a concentration of 0.2 mg/ml in 0.5-mm cuvettes on a Jasco J-810 spectropolarimeter in a wavelength range from 260 to 180 nm at various TFE concentrations as described previously (8). The resulting curves were smoothed using a high frequency filter, and secondary structure elements were quantified by deconvoluting the measured ellipticity using the DICROPROT 2000 program.
1H NMR Spectroscopy-One- and two-dimensional 1H NMR spectra were recorded at various temperatures between 293 and 323 K on a Bruker Avance DMX 600 NMR spectrometer using a triple resonance probe head with a gradient unit. sPB1-F2 was dissolved at a concentration of 1 mM (10 mg/ml) without pH adjustment in 90% H2O, 10% D2O, and in 50 and 90% aqueous TFE-d2 to give final volumes of 600 μl. Spectra of the fragments, PB-(1-40), PB-(30-70), and PB-(50-87), were recorded on 2 mM solutions in 50% aqueous TFE-d2 at 300 K.
The unambiguous amino acid spin systems, sequential assignments, and final nuclear Overhauser enhancements (NOEs) of the three fragments were established using a standard procedure (12) that has been used by us previously (13). The complete signal assignments and 1H chemical shifts of PB-(1-40), PB-(30-70), and PB-(50-87) have been deposited in the Biological Magnetic Resonance Data Bank under accession numbers 7289, 7290, and 7258, respectively.
The volumes of the integrated cross-peaks from the NOESY spectrum of PB-(50-87), recorded with a mixing time of 250 ms, were determined and converted to interproton distances by calibration against the side chain Gln or Asn amide protons (0.19 nm) using the AURELIA 2.7.9 program (14). Structures were then generated using the standard protocol embodied in the CNSsolve 1.0 software package starting from an extended peptide backbone as described previously (13). The 20 structures with the lowest energy terms were chosen for the final analysis. Structure fitting criteria were objectively derived using a consecutive segment approach described by us previously (15). Final structures were displayed and manipulated on a Silicon Graphics OCTANE workstation using the program BRAGI (16). Structure superposition was performed with the same program, and r.m.s. deviations for the regions of interest were calculated using LSQMAN (Uppsala Software Factory). The final structure of PB-(50-87) has been deposited in the Protein Data Bank under code 2HN8 and RCSB ID RCSB038537.
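The conversion of cross-peak volumes into distances can be illustrated with a minimal sketch. It assumes the usual isolated spin pair approximation, in which the NOE volume scales with the inverse sixth power of the interproton distance; the text does not state which relation AURELIA applies internally, so this is an illustration rather than the procedure actually used. The function name noe_distance and the example volumes are hypothetical; only the 0.19 nm reference distance comes from the text.

```python
def noe_distance(volume, ref_volume, ref_distance_nm=0.19):
    """Estimate an interproton distance (nm) from a NOESY cross-peak volume.

    Assumes the isolated spin pair approximation, V ~ r**-6, so that
    r = r_ref * (V_ref / V) ** (1/6). The reference is the cross-peak of a
    proton pair at a known distance (0.19 nm for the Gln/Asn side chain
    amide protons, as stated in the text).
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("cross-peak volumes must be positive")
    return ref_distance_nm * (ref_volume / volume) ** (1.0 / 6.0)


# Hypothetical volumes in arbitrary units, for illustration only.
reference_volume = 64.0
weak_peak_volume = 1.0
estimated_nm = noe_distance(weak_peak_volume, reference_volume)
```

Because of the sixth-root dependence, a 64-fold weaker cross-peak corresponds to only a doubling of the estimated distance, which is why such distances are normally used as upper-bound constraints rather than exact values.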

RESULTS
sPB1-F2 Has a Strongly Environmentally Dependent Secondary Structure-As a prerequisite for structural and functional studies on the small IAV regulatory protein PB1-F2, we have established solid phase peptide synthesis protocols for the production of milligram amounts of full-length synthetic (s) peptide, sPB1-F2, and several fragments thereof on a routine basis in highly pure and biologically active form (4,10,11). The sequence of the sPB1-F2 molecule used in the present investigation corresponds to the strain IAV PR8 (4,11). Although relatively small quantities of PB1-F2 have been produced by recombinant methods as a GST fusion protein (17), large scale production of recombinant PB1-F2 sufficient for structural studies is difficult both because of its inherent tendency for aggregation and interactions with hydrophobic components of the producer cell as well as its cytotoxicity causing altogether low expression levels. Here we have used our synthetic material to investigate the oligomerization and structure of PB1-F2. Based on secondary structure prediction, the molecule was subdivided into three similarly sized domains, and three overlapping fragments were synthesized, comprising the 40 N-terminal residues (PB-(1-40)), the central residues (PB-(30-70)), and the 38 C-terminal residues (PB-(50-87)) (11). Although no clear structure was calculated for the N terminus, strong helical regions were predicted for the central and particularly the C-terminal domain (11,18), as shown in Fig. 1.

Self-association of sPB1-F2 Is Regulated by Disulfide Bond Formation and by Two Distinct Oligomerization Domains-The first evidence for self-association of sPB1-F2 originates from our previous studies demonstrating oligomers of sPB1-F2, or the viral counterpart expressed in IAV PR8-infected cells, that were detected without any chemical fixation even under the denaturing conditions of SDS-PAGE (11). Further, it was shown that sPB1-F2 induces pore formation in planar lipid bilayers, a characteristic of membrane-interacting proteins that tend to form oligomeric structures (10). Direct evidence for the existence of oligomeric structures of sPB1-F2 is now provided by chemical cross-linking. In a first set of experiments, the peptide was exposed to various chemical cross-linkers, which clearly resulted in the fixation of a ladder of high molecular weight complexes with pronounced signals in the molecular weight range of dimers to pentamers. In Fig. 2A, the results are shown for exposure of sPB1-F2 to increasing concentrations, ranging from 10⁻³- to 10³-fold molar excess, of the cross-linking reagents bis(sulfosuccinimidyl)suberate, ethylene glycol disuccinate di(N-succinimidyl)ester, glutaraldehyde, and DSS. The most effective stabilization of oligomers was achieved with DSS and bis(sulfosuccinimidyl)suberate, where trimers were stabilized starting at a 10⁻¹-fold molar excess of the cross-linking reagent, whereas dimers were observed for all cross-linkers used, even at their lowest concentration.
Next, we investigated the oligomerization of individual fragments of sPB1-F2. According to secondary structure prediction, the N-terminal ~40 residues are mainly random coil, whereas the C terminus has a high propensity for α-helix formation that should support protein interactions (Fig. 1). Indeed, we found that the C-terminal fragment PB-(50-87) displayed a higher capacity to form oligomeric structures than the N-terminal fragment PB-(1-40). First, titration experiments similar to that shown for full-length sPB1-F2 (Fig. 2A) revealed oligomeric forms of PB-(1-40) starting at a 10²-fold molar excess of DSS, whereas the oligomerization of PB-(50-87) was observed already at a 10-fold molar ratio (Fig. 2B). The formation of dimers (marked with an asterisk) was much more pronounced, even at the lowest concentration of DSS, for PB-(50-87) compared with PB-(1-40) (Fig. 2B). Further, at a 1-fold molar ratio of DSS, oligomers of PB-(50-87) were detected already at 100 ng of the peptide, whereas 400 ng of PB-(1-40) were required per cross-linking reaction in order to stabilize a similar pattern of oligomers (Fig. 2C).
Similar cross-linking analyses were also conducted with the middle fragment, PB-(30-70). Although this fragment contains only one cysteine residue at position 42 and has the inherent capacity to form disulfide-linked dimers, it does not exhibit significant oligomerization capacity compared with the N- and C-terminal fragments (data not shown). Thus, the self-association of sPB1-F2 is mediated by both the N- and C-terminal domains, with the C-terminal domain showing a higher propensity for oligomerization than the N-terminal domain.
In an additional set of cross-linking experiments, we investigated the interaction of individual domains of sPB1-F2 with the full-length molecule. Increasing concentrations of PB-(1-40) (Fig. 2D) or PB-(50-87) (Fig. 2E) were mixed with 100 ng of sPB1-F2 and subjected to the cross-linking reaction with an equimolar ratio of DSS with respect to sPB1-F2. Most strikingly, both the N-terminal and the C-terminal fragments caused significant changes in the pattern of the sPB1-F2 oligomers. In addition to the monomeric, dimeric, and trimeric forms of sPB1-F2 migrating at ~11, ~23, and 34 kDa (Fig. 2, D and E, marked with an asterisk), hetero-oligomeric adducts of the full-length sPB1-F2 and the N- and C-terminal fragments were detected (Fig. 2, D and E, marked with arrows). The intensities of these hetero-oligomers clearly increased with higher concentration of the fragments PB-(1-40) or PB-(50-87) added to the cross-linking reactions (Fig. 2, D and E). In the case of the C-terminal fragment, the maximum formation of hetero-oligomers, consisting of sPB1-F2 and PB-(50-87), already occurred at 200 ng of PB-(50-87) (Fig. 2E), whereas ~5-fold more of the N-terminal fragment PB-(1-40) was required to achieve equally intense hetero-oligomerization (Fig. 2D), further supporting the notion that the strongest oligomerization domain is located within the C-terminal region of sPB1-F2.
To locate more specifically the residues underlying this phenomenon, we used the statistical mechanics algorithm TANGO, which identifies aggregation-prone regions of peptides and denatured proteins using a set of balanced physico-chemical parameters (19,20). According to the TANGO algorithm, a score of ≤0.02% indicates no aggregation, 0.02-5.0% indicates moderate aggregation, and ≥5.0% indicates high aggregation propensity. The application of this program predicted two regions of five residues populating the oligomerization state to more than 5% per residue, 54-58 (1.34-9.83% per residue) and 68-72 (93.01-97.25% per residue), under variable conditions. A further oligomerization domain was predicted with much lower scores (1.38-1.36% per residue) for the N-terminal region extending from residue 9 to 13. Thus, there is a good correlation between the predicted regions and the trends in the experimental cross-linking data, with specific 5-residue regions being predicted to be responsible for the differences in the oligomerization behavior.
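The score thresholds quoted above can be written out as a small classifier. This is only a sketch of the binning rule as stated in the text, not part of the TANGO program itself; the function name classify_tango_score is hypothetical, and the scores passed to it are the per-residue extremes quoted for the three predicted regions.

```python
def classify_tango_score(score_percent):
    """Bin a per-residue TANGO score (in percent) using the thresholds in the text.

    <= 0.02 %          -> no aggregation
    0.02 % to < 5.0 %  -> moderate aggregation
    >= 5.0 %           -> high aggregation propensity
    """
    if score_percent < 0:
        raise ValueError("score must be non-negative")
    if score_percent <= 0.02:
        return "none"
    if score_percent < 5.0:
        return "moderate"
    return "high"


# Per-residue score ranges quoted in the text for the predicted regions.
regions = {
    "9-13": (1.36, 1.38),
    "54-58": (1.34, 9.83),
    "68-72": (93.01, 97.25),
}
classes = {
    name: (classify_tango_score(low), classify_tango_score(high))
    for name, (low, high) in regions.items()
}
```

Applied to the quoted ranges, the rule places region 68-72 entirely in the high class and region 9-13 entirely in the moderate class, whereas region 54-58 spans both, reaching the high class only at its upper end.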
Further, to analyze the potential of disulfide bridge formation by the single cysteine residue in position 42 of sPB1-F2, the peptide was studied in the presence of 1 mM dithiothreitol and subjected to the cross-linking reaction. As demonstrated in Fig. 2F, the same pattern of oligomers was observed, although the formation of trimers and tetramers was slightly decreased under reducing conditions.
In a further set of confirmatory cross-linking studies, we followed the oligomerization of sPB1-F2 directly by staining the peptide in SDS-PAGE (Fig. 2G) as opposed to using antibody reactions. In contrast to the detection of oligomers of sPB1-F2 by Western blot (Fig. 2A), approximately a 100-fold higher concentration of the peptide was necessary for each cross-linking reaction in order to allow detection by Coomassie Blue staining. Nevertheless, a similar pattern of oligomers was detected, using various molar ratios of the cross-linking reagent DSS, when compared with those found above using the more sensitive Western blot analysis, suggesting that most of the possible oligomeric structures were reactive with the antibodies used in the above experiments (Fig. 2A). Clearly, at a 20-fold molar excess of DSS, high order oligomers of sPB1-F2 were stabilized, most of which were barely separated in the higher molecular weight range of the SDS-PAGE. In gels lacking β-mercaptoethanol, sPB1-F2 forms stable dimers even in the absence of cross-linkers (Fig. 2G). Also, the extent of stabilization of dimers by cross-linking was more evident when the cross-linking reaction was resolved in the absence of reducing reagent, indicating that at least dimeric forms of sPB1-F2 were stabilized by disulfide bonds. In summary, our comprehensive cross-linking data provide compelling evidence that sPB1-F2 is a molecule with an unusually high propensity for oligomerization that in addition is capable of forming a disulfide-linked dimer. It has two distinct oligomerizing domains that can be attributed to specific regions in the molecule, with the most efficient one being located in the C-terminal half. Both oligomerizing domains can form independent homo-oligomers when shorter peptide fragments of sPB1-F2 are present and have the ability to form hetero-oligomers with the full-length molecule sPB1-F2.
Secondary Structure in sPB1-F2 Is Essentially Localized in the C-terminal Domain and Is Dependent on Solution Conditions-Although no structure was calculated for the N terminus, helical regions were predicted for both the central and the C-terminal domains (11,18) (Fig. 1). To analyze the impact of the solvent conditions on the folding of the molecule, a considerable number of one- and two-dimensional 1H NMR spectra of 1 mM solutions of sPB1-F2 were recorded in both pure aqueous and aqueous TFE-d2-containing solutions at temperatures varying between 293 and 323 K (Fig. 3). TFE-d2 was chosen, since it not only functions as a membrane mimetic but also usually suppresses intermolecular interactions that support oligomerization and hence affords better resolved NMR spectra (21). Initially, one-dimensional 1H NMR spectra of the aromatic and NH region were recorded at 323 K for 0, 50, and 90% aqueous TFE-d2 solutions of sPB1-F2 (Fig. 3). When sPB1-F2 was analyzed in water alone (Fig. 3A), broad poorly resolved signals were observed, with a limited dispersion of the backbone NH signals between 8.6 and 7.2 ppm that are, altogether, characteristic of an oligomeric and random coil peptide conformation. The indole 1NH signals of the aromatic side chains of the tryptophan residues appear as unresolved signals centered at 10.0 ppm, and similarly broad signals are observed for the carbon-bound side chain protons of the aromatic residues tryptophan, phenylalanine, tyrosine, and histidine between 7.7 and 7.2 ppm. The addition of 50% TFE-d2 (Fig. 3B), however, improves the resolution of all signals and affords an increased dispersion of the signals in the region 8.7 to 6.2 ppm, providing evidence of a significant decrease in self-association. In the presence of 50% TFE-d2, the best resolved signals in this region are those of the aromatic side chains and four distinct signals characteristic of the indole 1NH group of tryptophan residues between 9.8 and 9.5 ppm.
A further increase in the TFE-d2 content up to 90% (Fig. 3C) again affects the dispersion of the signals but also appears to lead to poorer resolution, particularly apparent for the signals at the low field edge of the envelope. The fact that the intensities, widths, and positions of the four tryptophan signals are strongly dependent on the solution conditions was also confirmed for the same set of solutions at 300 K (data not shown). Thus, the evidence of these NMR experiments (Fig. 3) is consistent with the assumption that sPB1-F2 exhibits a high tendency for self-association.
In order to obtain experimental evidence for the nature of the structure present in sPB1-F2, we investigated the folding of the molecule and its fragments by CD spectroscopy under different solution conditions. When sPB1-F2 was initially analyzed in pure water, the spectrum showed little evidence of stable secondary structure formation, and deconvolution resulted in only 7% helical content (Fig. 4, A and E). However, upon the addition of 20% TFE, and even more pronounced at 50% TFE, there was a substantial change in the shape of the CD curves, showing negative ellipticities at 221 and 207 nm and a strong positive band at 192 nm (Fig. 4A), indicating the presence of significant amounts of helical structure upon deconvolution (Fig. 4E). Thus, consistent with the one-dimensional NMR experiments (Fig. 3), the CD measurements demonstrate that the solution conditions can profoundly affect the folding of sPB1-F2; the peptide is almost completely in the random coil conformation in a pure aqueous environment, whereas adding a membrane mimetic, such as TFE or phospholipids (see below and Fig. 1 of the supplemental data), strongly stabilizes secondary structure that is mainly α-helical in character. TFE is known to favor intramolecular interactions and therefore stabilizes helix formation only in those parts of a protein that have the inherent propensity to adopt helical structures (21).
A similar set of spectra was recorded for the three overlapping peptide fragments (Fig. 4, B-D), and quantitative estimates of the corresponding α-helical contents were determined in TFE solutions (Fig. 4E). Similar qualitative changes in the CD curves are observed for PB-(50-87) in phospholipid solutions (supplemental Fig. 1), indicating that helical secondary structure is also stabilized under these conditions that mimic the membrane hydrophobic environment.
The C-terminal fragment, PB-(50-87), shows the largest helical content, which changes little upon going from 20 to 50% TFE. The central fragment, PB-(30-70), shows a much smaller helical content that is more sensitive to the amount of TFE present, whereas the N-terminal fragment, PB-(1-40), has only a minor helical content even at the highest TFE level. Thus, the CD data indicate that the major helical structure is concentrated in the C-terminal region of the molecule, which is consistent with the previous calculations where the major secondary structure was predicted for the C-terminal domain of PB1-F2 (11,18). However, unlike the predicted structures, the CD data indicate that a minor helical region might be present in the very N terminus and that, furthermore, the secondary structure of PB1-F2 is extremely sensitive to the environment and will only be stabilized under suitable membrane-mimicking solution conditions.

Identification of Structural Elements in PB-(1-40), PB-(30-70), and PB-(50-87) by 1H NMR Spectroscopic Characterization-It is unlikely that structural details of full-length sPB1-F2, 87 amino acids in length, can be enumerated at the atomic level using homonuclear 1H NMR techniques at the field strengths currently available. Clearly, signal overlap, evident in Fig. 3A or from preliminary two-dimensional 1H TOCSY spectra conducted on sPB1-F2 recently (11), will prevent unambiguous signal assignments. Such a study of the full-length molecule would require heteronuclear NMR approaches. However, recombinant PB1-F2 with uniform 13C/15N labeling has not been produced as yet, but the availability of the three moderately sized overlapping fragments PB-(1-40), PB-(30-70), and PB-(50-87) allowed us to probe structural details using two-dimensional 1H NMR techniques.
According to the preliminary one-dimensional NMR and CD analyses of the full-length molecule, the best resolved NMR spectra were obtained in 50% TFE-d2, which also corresponds to conditions showing the most structured and least oligomerized state of the peptide. Indeed, dynamic light scattering data indicate that all of the molecules investigated were monomeric under these conditions (data not shown). Detailed analyses of the two-dimensional 1H TOCSY and NOESY NMR spectra of three overlapping fragments of sPB1-F2 in 50% TFE-d2 at 300 K and pH ~3 afforded complete assignments of all amino acid spin systems in each of the peptides PB-(1-40), PB-(30-70), and PB-(50-87) investigated. Qualitative information about the nature and position of secondary structure for such molecules in aqueous solution is readily deducible from the α-proton chemical shifts, since upfield shifts of these occurring in four adjacent residues relative to the random coil values (12) are indicative of local helical structure, whereas the downfield shift of three adjacent residues is indicative of β-sheets (22). In the present case, the 1H chemical shift differences experimentally obtained for 50% TFE-d2 solutions of each of the fragments are shown in Fig. 5, A-C, whereas values for the full-length molecule (Fig. 5D) were derived by combining the shifts of the individual fragments with averaging in the overlapping regions Leu30-Gly40 and Val50-Val70, respectively. The data clearly imply that sPB1-F2 has a long stretch of continuous helical secondary structure located in the C-terminal section of the molecule between residues Lys53 and Ser84. In contrast, the N terminus appears to have two short, weak helical regions (Trp9-Thr13, Ile16-Lys20), each of approximately five residues in length, that were not predicted empirically (11,18). Furthermore, there is no evidence of any secondary structure in the region between residues Arg37 and Gln48 that was previously predicted to have a high propensity for helix formation (11,18) and that is experimentally part of the unstructured central section Arg21 to Met51 of the molecule (Fig. 5E).
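The two steps described above, averaging the fragment shift differences where fragments overlap and then scanning for runs of adjacent residues shifted in the same direction, can be sketched as follows. The run lengths (four residues upfield for helix, three downfield for β-sheet) follow the text; the 0.1 ppm cutoff is an assumption borrowed from the common chemical shift index convention, since the text gives no threshold. The function names and all shift values are hypothetical toy data, not measurements from this study.

```python
def merge_fragment_shifts(fragments):
    """Combine per-residue shift differences (ppm) from overlapping fragments.

    Each fragment is a dict mapping residue number -> (observed - random coil).
    Residues present in more than one fragment are averaged.
    """
    collected = {}
    for fragment in fragments:
        for residue, diff in fragment.items():
            collected.setdefault(residue, []).append(diff)
    return {res: sum(vals) / len(vals) for res, vals in sorted(collected.items())}


def find_runs(diffs, sign, min_len, threshold=0.1):
    """Return (first, last) residue ranges with consecutive shifted residues.

    sign=-1 selects upfield (negative) differences, sign=+1 downfield ones.
    threshold (ppm) is an assumed cutoff, not a value taken from the text.
    """
    runs, current = [], []
    for residue in sorted(diffs):
        shifted = sign * diffs[residue] >= threshold
        contiguous = not current or residue == current[-1] + 1
        if shifted and contiguous:
            current.append(residue)
        else:
            if len(current) >= min_len:
                runs.append((current[0], current[-1]))
            current = [residue] if shifted else []
    if len(current) >= min_len:
        runs.append((current[0], current[-1]))
    return runs


# Toy shift differences for two overlapping fragments (residue 3 is shared).
fragment_a = {1: -0.2, 2: -0.3, 3: -0.2}
fragment_b = {3: -0.4, 4: -0.25, 5: 0.0, 6: 0.2, 7: 0.3, 8: 0.15}
merged = merge_fragment_shifts([fragment_a, fragment_b])
helix_runs = find_runs(merged, sign=-1, min_len=4)  # upfield, four adjacent
sheet_runs = find_runs(merged, sign=+1, min_len=3)  # downfield, three adjacent
```

With these toy values, residues 1-4 are flagged as helix-like and residues 6-8 as sheet-like. A scan of this kind is qualitative only, which is consistent with the text treating the shift analysis as a guide to where secondary structure lies rather than as a structure determination.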
Further noticeable features in Fig. 5 are the unusual low field shifts of the α-protons of residues Thr27, Met51, and Asn66, all of which precede proline residues in the sequence. We have noted this phenomenon previously in similar NMR experiments during the study of cis/trans peptidyl-prolyl isomerism in proline residues located in the N-terminal region of the human immunodeficiency virus-1 accessory protein Vpr (13). In these analyses, it was observed that proline substitution in the trans-conformation causes an inherent downfield shift of +0.28 ± 0.1 ppm for the α-proton of the residue immediately preceding the proline and smaller shifts of +0.08 ± 0.03 ppm for the two residues next toward the N terminus (13). Furthermore, this phenomenon occurs independently of the type of secondary structure in the vicinity of proline residues (13). Such effects clearly rationalize the shifts observed here for residues Thr27 and Met51 and imply that Asn66 should exhibit a more negative shift difference. Taking this into account, the C-terminal helix appears to be almost continuous without any pronounced break.
Solution Structure of PB-(50-87)-Finally, we have determined the high resolution structure of the long C-terminal helix using the NOE data of PB-(50-87) (Fig. 6A). Quantitative NOE data derived from spectra recorded for PB-(50-87) in 50% TFE-d2 were used as distance constraints in molecular dynamics/energy minimization calculations using a standard protocol (23).
A total of 754 distance constraints from 373 intraresidue, 192 sequential, and 189 medium range NOEs, which are evenly distributed throughout the molecule (Fig. 6B), were used to generate 100 conformations. The 20 conformations with the lowest NOE (E_NOE = 237.0 ± 5.4 kJ/mol) and total energies (E_total = 739.6 ± 6.3 kJ/mol) and showing no constraint violations greater than 0.2 Å were used for the final fitting analysis (see supplemental Table 1). As we have shown previously, the identity and heterogeneity within a final set of molecular conformations can be visualized using the consecutive segment approach. This method provides a relative measure of how well the backbone atom positions for each amino acid are defined in all of the final structures and thereby allows the determination of an appropriate fitting region (15). Consequently, such a comparison provides an objective method for the recognition of stable structural elements in the ensemble of final structures, since the lower the mean root mean square deviations are, the more similar are the conformations in the final structures. Such an analysis (Fig. 6C) showed that the best defined region of the molecule, with the lowest root mean square deviation (<0.2 Å), is located between and includes residues Ile55-Lys85, which corresponds to a well defined α-helix, as shown by superposition of the finally refined structures (Fig. 7).
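The idea behind scanning an ensemble segment by segment can be illustrated with a simplified sketch. It is not the published consecutive segment method (15), whose details are not given here: instead of superimposing backbone atoms, it swaps in a distance-matrix r.m.s.d., which compares internal distances and therefore needs no superposition. One point per residue, the window length, the function names, and the toy coordinates are all assumptions made for illustration.

```python
import math
from itertools import combinations


def distance_rmsd(coords_a, coords_b):
    """Distance-matrix r.m.s.d. between two equally long lists of 3D points."""
    index_pairs = list(combinations(range(len(coords_a)), 2))
    squared = 0.0
    for i, j in index_pairs:
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        squared += (d_a - d_b) ** 2
    return math.sqrt(squared / len(index_pairs))


def segment_profile(ensemble, window):
    """Mean pairwise distance r.m.s.d. for every consecutive window of residues.

    ensemble: list of conformers, each a list of (x, y, z) tuples, one per residue.
    window:   number of consecutive residues per segment (must be at least 2).
    """
    n_residues = len(ensemble[0])
    profile = []
    for start in range(n_residues - window + 1):
        segments = [conf[start:start + window] for conf in ensemble]
        values = [distance_rmsd(a, b) for a, b in combinations(segments, 2)]
        profile.append(sum(values) / len(values))
    return profile


# Toy ensemble: the first four positions are identical in all conformers,
# the last two differ, mimicking a defined region followed by a frayed end.
rigid = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
ensemble = [
    rigid + [(6.0, 0.0, 0.0), (7.5, 0.0, 0.0)],
    rigid + [(4.5, 1.5, 0.0), (4.5, 3.0, 0.0)],
    rigid + [(4.5, 0.0, 1.5), (6.0, 0.0, 1.5)],
]
profile = segment_profile(ensemble, window=3)
```

Windows that fall entirely within the shared region give a value of zero, and windows that reach into the variable end give larger values, which is the kind of contrast used to choose a fitting region.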
In summary, the CD data for both the intact molecule and its three overlapping fragments combined with the qualitative and quantitative NMR data obtained for the three fragments indicate that sPB1-F2 requires the presence of a hydrophobic environment to adopt structured domains. Under such conditions, there is strong circumstantial evidence that the molecule consists of a two-domain structure (Fig. 5E) with a relatively stable 31-32-residue helical structure at the C terminus (Fig. 7) connected by an unstructured central region to the N terminus, which contains very little structure apart from a section with two weak and neighboring helical regions, each 5 residues in length.

DISCUSSION
Originally, the proapoptotic PB1-F2 protein was serendipitously discovered as an 87-amino acid protein encoded by a cryptic open reading frame in the PB1 gene of the IAV PR8 isolate (4). Due to the very recent discovery of the protein, the evolution and function of the PB1-F2 protein are not fully understood yet, and several aspects are still under debate. Although the protein has originally been described to induce apoptosis, it has now been shown that PB1-F2 more likely acts as an apoptosis promoter in concert with other apoptosis-inducing agents (4,17). The finding that PB1-F2 is under positive selection pressure among highly pathogenic IAVs of the H5N1 lineage has been questioned recently (24,25). However, in vivo data in mice infected with virus mutants lacking PB1-F2 indicate that the protein may play a critical role in IAV-induced apoptosis (26). The ongoing discussion about PB1-F2 function highlights the need to understand the structural behavior of this novel protein. In consideration of the potential role of PB1-F2 in IAV pathogenesis, particularly its apoptosis-promoting activities on mitochondria (4,17,18,27), we sought to investigate the molecular characteristics of this small regulatory protein.
The first insight into the structure-function correlation of PB1-F2 domains stems from recent mutagenesis studies using GFP-PB1-F2 fusion proteins that mapped the inner mitochondrial membrane localization signal to a putative amphipathic and positively charged helix located near the C terminus of PB1-F2 (18). However, no structural investigation on PB1-F2 has been reported so far.
Without direct experimental evidence, we and other groups have applied a number of different algorithms to predict the secondary structure in PB1-F2 (11,18). These predictions for the IAV PR8 isolate reveal a 9-20-residue-long C-terminal helix that in all algorithms is concluded at residue 83, a shorter central helix in the region of residues 37-48 with a maximum length of 12 residues, and a third helix of approximately the same size as the latter centered on residue 58 (summarized in Fig. 1). Clearly, all of these programs predict the molecule to be divided essentially into two approximately equally sized domains, corresponding to an N-terminal domain that shows little secondary structure and a C-terminal domain that should consist of pronounced α-helical structure. Such predictions are limited, however, since any environmental dependence is not taken into account.
As a small membrane-interacting proapoptotic protein, recombinant PB1-F2 would be complicated to produce in the large quantities required for spectroscopic analyses. For molecular analysis, sPB1-F2 was completely synthesized as a functional entity that exhibits various biological phenomena that were also observed for its virally expressed counterpart (11). Following microinjection, it localizes to mitochondria, where it induces morphological alterations and causes cell death (4). It also induces transmembrane conductance of planar lipid bilayers and exhibits a behavior similar to other proapoptotic proteins (10). From this, it is legitimate to assume that the molecular characteristics of the synthetic peptide sPB1-F2 are comparable with those of its viral counterpart.
The high occurrence of cationic amino acids together with α-helical structure within the C-terminal region suggests a model of the entire molecule that is partially amphipathic in character. Such molecules typically form helical structures only in a hydrophobic environment that is encountered biologically in lipid membranes or when bound to the hydrophobic patches of interacting proteins and can be simulated experimentally with solutions of organic solvents or lipids (21). Indeed, the experimental CD and NMR spectroscopic data clearly demonstrate the structural variability of PB1-F2 and its ready adaptability to solution conditions. In its most structured form in the model hydrophobic environment of 50% TFE, the molecule shows a C-terminal α-helix ~32 residues in length bound to an essentially unstructured N terminus apart from a region with relatively weak helical propensity (residues 9-20). The experimentally observed C-terminal helix is shorter than the combined three empirically predicted helices and is terminated at its N terminus by proline 52. These helical structures gradually disappear upon increasing the hydrophilicity of the solution until in pure aqueous solution the molecule exhibits a random coil conformation.
The helical wheel representation of the experimentally determined α-helical region of the C-terminal fragment of PB1-F2 has a bias toward an amphipathic distribution of residues, as surmised previously from sequence predictions (18), and two differently charged surfaces as shown in Fig. 8. Furthermore, there is a relatively high number of positively charged side chains and tryptophan residues located within the C-terminal helical region; within this 38-residue domain, a total of 10 positive charges are found, with a clustering of six in the region 72-85. The three tryptophan residues at positions 58, 61, and 80 are positioned on one surface of the helix, with the majority of the positively charged residues on the opposite surface. In a cellular membrane, such a distribution of residues favors an in-plane orientation of the helix through electrostatic interaction of the positively charged surface of the helix with the negatively charged surface of the membrane (Fig. 8). Although the helix is potentially long enough to span a lipid bilayer in trans, the charge distribution would disfavor such a transmembrane orientation. Hence, under these conditions, PB1-F2 would adopt a C-terminal α-helix that anchors the molecule in the plane of the membrane. However, our helical wheel representation reveals that the helix is not truly amphipathic in nature but has a unique, biased distribution of hydrophilic and hydrophobic residues, which finally results in two distinct surfaces (Fig. 8B).
In such an in-plane orientation, the six positively charged residues are oriented toward the membrane with the three tryptophan side chains and one phenylalanine (residue 83) exposed as two pairs of aromatic residues (Trp58-Trp61, Trp80-Phe83) on adjacent turns toward each end of the helix. Presumably, any intermolecular interaction involving this region of the membrane-associated PB1-F2 must occur through interaction of the exposed hydrophobic surface. Residues predicted to be responsible for oligomerization in the C-terminal domain (Fig. 8) are distributed over the surface of the helix but only partially buried through interaction with the membrane. However, even in this anchored arrangement, the N-terminal domain of PB1-F2 would still be unstructured, apart from a potential short helical section, and thus freely available for further inter- and intramolecular interactions.
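As an illustration of the geometry behind this arrangement, the following minimal sketch computes helical wheel angles for the four aromatic residues named above, relative to Trp58. It assumes an ideal α-helix with 3.6 residues per turn (100° of rotation per residue); the real helix need not be ideal, and the function names are hypothetical and not part of the analysis performed in this study.

```python
# Assumption: ideal alpha-helix, 3.6 residues per turn = 100 degrees per residue.
DEGREES_PER_RESIDUE = 100.0


def wheel_angle(position, reference):
    """Helical wheel angle of a residue relative to a reference residue.

    Returned in degrees, folded into the interval (-180, 180].
    """
    angle = ((position - reference) * DEGREES_PER_RESIDUE) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


def smallest_arc(angles):
    """Width in degrees of the smallest arc on the wheel containing all angles."""
    points = sorted(a % 360.0 for a in angles)
    if len(points) < 2:
        return 0.0
    # The smallest enclosing arc is the full circle minus the largest empty gap.
    gaps = [
        (points[(i + 1) % len(points)] - points[i]) % 360.0
        for i in range(len(points))
    ]
    return 360.0 - max(gaps)


# Residue positions taken from the text; Trp58 is used as the 0-degree reference.
aromatic = {"Trp58": 58, "Trp61": 61, "Trp80": 80, "Phe83": 83}
angles = {name: wheel_angle(pos, 58) for name, pos in aromatic.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:+.0f} degrees")
print(f"arc spanned: {smallest_arc(list(angles.values())):.0f} degrees")
```

Under this idealized assumption, the four aromatic side chains fall within an arc of about 100°, which is consistent with their placement on a single face of the helix.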
Cross-linking and NMR experiments revealed a strong tendency of sPB1-F2 to self-associate in pure aqueous solution without any sign of precipitation even at a millimolar concentration suitable for NMR spectroscopy. In contrast, increasing the hydrophobicity of the solvent stabilized considerable amounts of α-helical structure, with the molecule undergoing a transition from an oligomeric unstructured state in water to a less oligomeric, more structured molecule. Clearly, since the highly flexible molecule is prone to protein-protein interactions, we conducted a thorough oligomerization study of the full-length molecule and fragments thereof. The data suggest that the molecule has two independent oligomerization domains: an N-terminal and a stronger C-terminal region. Both are separated by a central section incorporating a cysteine residue at position 42 that allows the formation of disulfide-linked dimers. Cross-linking data indicate that both the N- and the C-terminal fragments can interact with the full-length molecule, suggesting that the two distinct oligomerization domains are freely accessible and are able to interact with one another. Although Cys42 probably contributes to the dimerization of PB1-F2, the N- and the C-terminal domains can interact independently of each other and must be considered the major force that drives the inherent self-association of the molecule. Complementary data from the program TANGO identified three sets of 5 residues that are predicted to contribute qualitatively and quantitatively to the oligomerization phenomenon. The strongest of these is the ILVFL motif (residues 68-72) centered in the C-terminal helix (Fig. 8).
Until the discovery of PB1-F2, IAV-induced apoptosis was thought to be regulated only by extrinsic pathways (e.g. Fas ligand-mediated apoptosis) (28-30). Currently, PB1-F2 is the only influenza A virus factor that can be directly linked to mitochondrial localization (27). Conductance and ion permeability studies on planar lipid bilayer membranes have indicated that sPB1-F2 shares similar membrane-destabilizing profiles with other proapoptotic proteins, such as the Bcl-2 family of proteins, which result in mitochondrial membrane instability and subsequently apoptosis (10). Our data unequivocally establish the unique oligomerization properties of PB1-F2 and indicate that these are due to particular regions in the molecule. Consequently, it seems probable that the formation of variably sized pores that have been shown previously to induce membrane instability (10) is a direct result of oligomerization of PB1-F2. A prerequisite for this to occur is location of the molecule at the membrane surface, which we propose takes place as a consequence of the cationic nature of the C-terminal helix. Formation of helical secondary structure in the vicinity of the membrane affords a unique arrangement of charges to give one surface that is positively charged and favorable for interaction with the negatively charged membrane. Interestingly, the other surface of the molecule contains more hydrophobic residues and four of five high aggregation propensity residues. This would allow interaction with other PB1-F2 molecules and eventually lead to the formation of pores in membranes of mitochondria and other cellular compartments.