The Structure of the Haemophilus influenzae HMW1 Pro-piece Reveals a Structural Domain Essential for Bacterial Two-partner Secretion

In pathogenic Gram-negative bacteria, many virulence factors are secreted via the two-partner secretion pathway, which consists of an exoprotein called TpsA and a cognate outer membrane translocator called TpsB. The HMW1 and HMW2 adhesins are major virulence factors in nontypeable Haemophilus influenzae and are prototype two-partner secretion pathway exoproteins. A key step in the delivery of HMW1 and HMW2 to the bacterial surface involves targeting to the HMW1B and HMW2B outer membrane translocators by an N-terminal region called the secretion domain. Here we present the crystal structure at 1.92 Å of the HMW1 pro-piece (HMW1-PP), a region that contains the HMW1 secretion domain and is cleaved and released during HMW1 secretion. Structural analysis of HMW1-PP revealed a right-handed β-helix fold containing 12 complete parallel coils and one large extra-helical domain. Comparison of HMW1-PP and the Bordetella pertussis FHA secretion domain (Fha30) reveals limited amino acid homology but shared structural features, suggesting that diverse TpsA proteins have a common structural domain required for targeting to cognate TpsB proteins. Further comparison of the HMW1-PP and Fha30 structures may provide insights into the high specificity of TpsA-TpsB interactions.

protein (referred to as a TpsA protein) and a cognate outer membrane translocator (referred to as a TpsB protein) (1-3). TpsA proteins are synthesized as preproteins 100-500 kDa in size that are processed in the course of secretion across the bacterial inner and outer membranes, yielding functional TpsA proteins (3,4). Despite limited overall sequence conservation among the TpsA members, functional studies have established that TpsA proteins contain common features, including an atypical N-terminal signal peptide and an adjacent region of about 250 residues that forms the so-called secretion domain (1-4). Although our knowledge of TpsB proteins remains relatively limited, recent studies have established that TpsB proteins have a modular structure with a C-terminal pore-forming domain (5,6).
The H. influenzae HMW1 and HMW2 proteins are high molecular weight, non-pilus adhesins that were originally identified as major targets of the human serum antibody response during acute otitis media (7). These proteins are present in ~80% of nontypeable H. influenzae strains and mediate adherence to a variety of epithelial cell types (4,8). HMW1 and HMW2 are encoded by separate chromosomal loci, with each locus consisting of three genes, designated hmwA, hmwB, and hmwC (8-10). The hmwA genes encode the surface-exposed adhesins (HMW1 and HMW2), and the hmwB and hmwC genes encode accessory proteins required for proper processing and secretion of the adhesins (5,11-14). In H. influenzae strain 12, the HMW1 and HMW2 proteins are synthesized as preproproteins and undergo two discrete cleavage events during the process of maturation and surface localization, the first releasing the N-terminal signal peptide corresponding to residues 1-68 and the second releasing the pro-piece corresponding to residues 69-441 (see Fig. 1). In H. influenzae strain 12, mature HMW1 corresponds to residues 442-1536 in the HMW1 preproprotein, and mature HMW2 corresponds to residues 442-1477 in the HMW2 preproprotein. In both HMW1 and HMW2, adhesive activity resides in a ~360-amino acid region at the N terminus of the mature species, and anchoring to the bacterial surface is mediated by a 20-amino acid region at the very C terminus of the protein (13-15).
The HMW1 pro-piece (HMW1-PP) shares limited homology at the N-terminal end with the secretion domain in other TpsA proteins and is critical for secretion (15-18). Recent work demonstrated that a chimeric protein containing HMW1-PP and a segment of the passenger domain of the H. influenzae Hia adhesin (Hia residues 50-779) was secreted into the supernatant in an HMW1B-dependent manner (12). In contrast, this chimera was not secreted when co-expressed with the Hia translocator domain, demonstrating specificity between HMW1-PP and the HMW1B translocator (12). Far Western analysis established that HMW1-PP interacts directly with HMW1B, revealing in part a mechanism for how the TpsA secretion domain facilitates secretion (12). Following interaction with HMW1B, HMW1-PP is cleaved and released from the organism, leaving the mature adhesin on the bacterial surface (see Fig. 1).
The functionally characterized TpsA/TpsB pairs in the TPS family include FHA/FhaC of B. pertussis, ShlA/ShlB of S. marcescens, and HMW1/HMW1B of H. influenzae (5,6,15-19). In all of these examples, proper secretion requires that the secretion domain of TpsA and the periplasmic domain of TpsB recognize each other in the periplasm. Interestingly, previous studies either excluded HMW1 from multiple sequence alignments of TpsA secretion domains or provided a poor sequence alignment (2,3,20), reflecting the limited sequence homology between the HMW1 N-terminal region and the secretion domains of other TpsA proteins. Of note, the sequence identity between HMW1-PP and Fha30 (FHA amino acids 72-368, resolved by x-ray crystallography) (20) is only 21%. Further comparison of HMW1 and FHA reveals that HMW1-PP is cleaved from the proprotein and that mature HMW1 is anchored to the bacterial surface, whereas Fha30 remains a part of the functional FHA protein, and mature FHA is efficiently released extracellularly. Together, these observations raise the question of whether there are structural elements that are common to the TPS pathway.
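The residue boundaries quoted above for the strain 12 HMW1 preproprotein can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original work; the dictionary keys and function names are illustrative labels chosen here, and only the residue ranges come from the text.

```python
# Residue ranges quoted in the text for the strain 12 HMW1 preproprotein.
# The keys are illustrative labels, not names used by the authors.
HMW1_SEGMENTS = {
    "signal_peptide": (1, 68),
    "pro_piece": (69, 441),
    "mature_adhesin": (442, 1536),
}


def segment_length(start, end):
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def is_contiguous(segments):
    """True if the ranges tile the sequence with no gaps or overlaps."""
    ordered = sorted(segments.values())
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    for name, (start, end) in HMW1_SEGMENTS.items():
        print(f"{name}: residues {start}-{end}, {segment_length(start, end)} aa")
    print("contiguous:", is_contiguous(HMW1_SEGMENTS))
```

The same check could be applied to HMW2 by replacing the last range with 442-1477.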
In an effort to advance our understanding of the structural basis of TPS, we set out to solve the crystal structure of HMW1-PP. In this report we describe the crystal structure of HMW1-PP at 1.92 Å. Despite the sequence and functional diversity among members of the TPS family, analysis of the structure of HMW1-PP suggests that TpsA proteins have a common structural domain required for targeting to the cognate TpsB protein.

EXPERIMENTAL PROCEDURES
Cloning, Expression, and Purification of the HMW1 Pro-piece-The native HMW1 pro-piece (HMW1-PP, residues 69-441) was expressed as a secreted protein by generating a DH5α derivative that contains pHMWB::HMWC (CamR) and pHMW1(1-441)::HAT (AmpR). The plasmid pHMW1(1-441)::HAT was created by ligating a DNA fragment containing 340 bp of sequence upstream of the hmw1A start codon and the coding sequence for HMW1 amino acids 1-441 into HindIII-digested pHAT10 (Clontech). The plasmid pHMWB::HMWC was generated by ligating a 4.8-kb NruI fragment containing hmw1B and hmw1C from pHMW1-15 (11) into NruI-digested pACYC184. In this expression system, HMW1-PP is secreted into the cell culture supernatant. The bacteria were incubated with shaking at 37°C in 2 liters of LB medium supplemented with 34 µg/ml chloramphenicol and 100 µg/ml ampicillin until the culture reached an absorbance at 600 nm of ~0.8. Isopropyl-β-D-thiogalactoside was added to achieve a final concentration of 0.2 mM, and the culture was incubated for 4 more hours. The bacteria were pelleted by centrifugation at 6,000 × g for 20 min, and the supernatant was collected and filtered through a 0.22-µm membrane (Corning). HMW1-PP was enriched by precipitation with ammonium sulfate at 65% saturation. Following centrifugation at 15,000 × g for 1 h, the protein pellet was resuspended in 25 ml of 20 mM Tris, pH 7.4, 0.5 M NaCl and dialyzed against buffer A (Tris-HCl, pH 7.4, 250 mM NaCl). Subsequently, the protein sample was applied to a nickel-nitrilotriacetic acid superflow (Qiagen) column equilibrated in buffer A and then eluted with a linear gradient of 0-0.5 M imidazole in buffer A. The pooled fractions containing HMW1-PP were dialyzed against buffer B (20 mM Bis-Tris, pH 7.0, 50 mM NaCl) and applied to an anionic exchange column (HitrapQ, 5 ml; GE Healthcare) equilibrated in buffer B. HMW1-PP, which has a theoretical pI of 5.21, bound to the column under these conditions and was then eluted with a linear gradient of 0.05-1 M NaCl in buffer B. The purified fractions were finally subjected to size exclusion chromatography using HiPrep Sephacryl 16/60 S200 equilibrated with buffer C (20 mM MES, pH 5.5, 100 mM NaCl, 5% glycerol).
For the purpose of selenium multiwavelength anomalous dispersion phasing, we generated a construct expressing a GST::HMW1-PP fusion protein. The plasmid pGEX::HMW1(69-441) was generated by amplifying a 1.1-kb DNA fragment encoding HMW1 amino acids 69-441 with an EcoRI site at the 5′ end and a SalI site at the 3′ end and then ligating this fragment into EcoRI-SalI-digested pGEX-6P-1 (GE Healthcare). The construct was transformed into Escherichia coli strain DL41, a Met auxotroph strain, and the SeMet-labeled protein was produced as described previously (31). After cleavage of the GST moiety from GST::HMW1-PP using PreScission protease (GE Healthcare), SeMet-labeled HMW1-PP was further purified using an anionic exchange column (Hitrap Q) and/or a gel filtration column. For both columns, SeMet-labeled HMW1-PP showed the same elution profile as the native secreted protein. This preparation yielded about 0.8 mg of SeMet-labeled HMW1-PP/liter of culture.
Crystallization and Data Collection-Crystals of the native protein (secreted form) were grown in a 1:1 mixture of protein (6 mg/ml in buffer of 20 mM MES, pH 5.5, 100 mM NaCl, 5% glycerol) and reservoir solution containing 22-24% polyethylene glycol 6K and 0.1 M HEPES (pH 6.8-7.2) using the hanging drop vapor diffusion method at 17°C. Native crystals (rectangular or triangular thin plates) grew to typical dimensions of 200 µm × 150 µm × 40 µm within 2 weeks and diffracted to 2.4 Å when using a synchrotron beam source (Advanced Photon Source, 19BM). The SeMet-labeled protein (cytoplasmic form) often yielded needle crystals under the same conditions (polyethylene glycol 6K) used for the native crystals, and growing crystals of SeMet-labeled HMW1-PP was much more difficult. For data collection, small SeMet-derivatized crystals with dimensions of 70 µm × 50 µm × 40 µm were used after quick cryoprotection with reservoir solution containing 25% glycerol and direct freezing in liquid N2. From single crystals, two single-wavelength anomalous dispersion data sets were collected at 100 K (Advanced Photon Source, 19BM) at a wavelength of 0.9789 Å. One crystal diffracted to 1.92 Å with high crystal mosaicity, and another crystal diffracted to 2.4 Å with low crystal mosaicity. Although the low resolution data set with low mosaicity was used for single-wavelength anomalous dispersion phasing, the high resolution data set was used for structural refinement. The data sets were indexed and integrated using HKL2000 and scaled with SCALEPACK (32). Both native and SeMet-derivatized crystals belonged to space group I4 with unit cell dimensions a = b = 121.4 Å, c = 50.58 Å and one molecule/asymmetric unit.
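As a plausibility check on the statement of one molecule per asymmetric unit, the Matthews coefficient can be estimated from the reported cell. The sketch below is illustrative only: the cell dimensions and space group come from the text, whereas the molecular mass is an assumption (about 110 Da per residue for the 372 residues 70-441), and the function names are hypothetical. The solvent fraction uses the common approximation 1 − 1.23/Vm.

```python
# Unit cell reported in the text (tetragonal, space group I4, all angles 90 degrees).
A = B = 121.4  # angstroms
C = 50.58      # angstroms

# I4 has 8 asymmetric units per cell (4-fold axis combined with body centering).
ASU_PER_CELL = 8

# ASSUMPTION: mass estimated as ~110 Da per residue for residues 70-441;
# the true mass of the crystallized construct is not given in the text.
N_RESIDUES = 372
MASS_DA = N_RESIDUES * 110.0


def cell_volume(a, b, c):
    """Volume of an orthogonal cell in cubic angstroms."""
    return a * b * c


def matthews_coefficient(volume, asu_per_cell, molecules_per_asu, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return volume / (asu_per_cell * molecules_per_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = cell_volume(A, B, C)
    vm = matthews_coefficient(volume, ASU_PER_CELL, 1, MASS_DA)
    print(f"cell volume: {volume:.0f} A^3")
    print(f"Vm: {vm:.2f} A^3/Da, solvent fraction: {solvent_fraction(vm):.2f}")
```

Under these assumptions, one molecule per asymmetric unit gives a Vm within the range typically seen for protein crystals, whereas two molecules would give a value that is too low to be likely.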
Structure Determination and Refinement-The HMW1-PP structure was solved by the single-wavelength anomalous dispersion method. Selenium atom search, initial phasing, and density modification were performed using autoSHARP (33). The initial model was built manually using COOT (34) and XTALVIEW (35), with about 60% of the secondary structural elements placed as polyalanine and 10% placed with side chains. After model building, initial refinement was carried out using CNS 1.1 (36) with rigid body, simulated annealing, and restrained individual B factor refinement. At this step the R factor was over 43% (Rfree = 48%), and several further steps of refinement were then performed using REFMAC5 (37), each followed by manual adjustment of the model. Positions with electron density greater than 4σ in Fo − Fc maps were assigned as water molecules if the locations were reasonable for hydrogen bonding. A B factor cut-off of 40.0 Å² was applied to water molecules, and any water molecules refining to higher values were removed from the model. The final model has an R factor of 17.4% and a free R factor of 21.7%. Structural analysis of the final model using the Protein Data Bank validation suite indicated that none of the residues is in the disallowed region of the Ramachandran plot, and almost all the residues are in the most favored regions. A summary of the data collection and refinement statistics is given in Table 1. The coordinates and structure factors for HMW1-PP have been deposited in the Protein Data Bank (code 2ODL).
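For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R factor is the summed absolute difference between observed and calculated structure factor amplitudes divided by the summed observed amplitudes; Rfree is the same quantity computed over a test set of reflections excluded from refinement. The following is a minimal sketch of that definition, with a hypothetical function name and toy amplitudes that are not data from this study.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes for illustration only (not measured data).
    working_obs, working_calc = [10.0, 20.0, 30.0], [9.0, 22.0, 29.0]
    print(f"R = {r_factor(working_obs, working_calc):.3f}")
```

Applying the same function to the held-out reflections gives Rfree, which is why Rfree is expected to be somewhat higher than R for a model that is not overfitted.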
Secretion Assay-To assess the portion of HMW1-PP required for secretion, we generated plasmids encoding HAT-tagged HMW1(1-361) (pHMW(1-361)::HAT) and HMW1(1-269) (pHMW(1-269)::HAT), using pHMW1(1-441)::HAT as a control. These plasmids were created by ligating a DNA fragment containing 340 bp of sequence upstream of the hmw1A start codon and the appropriate coding sequence into HindIII-digested pHAT10 (Clontech) and were then transformed into E. coli DH5α harboring pACYC-HMW1B (12). The resulting strains were resuspended from plates into LB broth to an absorbance at 600 nm of 0.3, incubated at room temperature for 45 min, and then subjected to centrifugation at 6,000 × g at 4°C for 10 min. The cell pellet was resuspended in 10 mM HEPES, pH 7.4, and sonicated to clarity. Culture supernatants were precipitated by adding trichloroacetic acid to a final concentration of 10% (v/v), incubating for 10 min at 4°C, and then centrifuging at 15,600 × g at 4°C for 10 min. The trichloroacetic acid-precipitated proteins were resuspended in 0.2 M Tris, pH 9.0. Cell sonicates and trichloroacetic acid precipitates were resolved on SDS-PAGE gels and examined by Western analysis using an antiserum against the HAT epitope.

RESULTS
Structure Determination-In earlier work, we found that the HMW1 pro-piece (HMW1-PP, corresponding to residues 69-441) mediates interaction with the HMW1B outer membrane translocator and is essential for HMW1 secretion (12,15). During the process of translocation of HMW1 across the outer membrane, HMW1-PP is cleaved from the HMW1 proprotein, generating mature HMW1 (Fig. 1). To study the mechanism of HMW1 secretion, we focused initially on obtaining HMW1-PP as a secreted native protein. As a first step, we constructed a DH5α derivative harboring pHMW1(1-441)::HAT (encoding HMW1(1-441) fused to the HAT epitope) and pHMW1B-HMW1C (encoding HMW1B and HMW1C). In this strain, HMW1(1-441) is directed to the inner membrane via the signal sequence, processed to release amino acids 1-68, translocated through HMW1B, and secreted extracellularly. Using this strain, we were able to purify large amounts of HMW1-PP from the culture supernatant. This secreted native form of HMW1-PP yielded trigonal or tetragonal plate crystals. Molecular replacement using Fha30 did not give a clear structure solution, and a search for heavy metal derivatives was unsuccessful. As an alternative approach, we generated a construct encoding a GST::HMW1-PP fusion protein and then expressed this protein in E. coli DL41 in SeMet labeling medium. Following recovery of the GST::HMW1-PP fusion protein from the bacterial cytoplasm, we cleaved the GST moiety and purified SeMet-labeled HMW1-PP. The SeMet-labeled form of HMW1-PP showed the same chromatographic properties as the native secreted form of HMW1-PP, indicating that the protein expressed in the cytoplasm was folded, comparable with the native secreted form. Both cytoplasmic HMW1-PP and native secreted HMW1-PP crystallized in the tetragonal space group I4 with one molecule per asymmetric unit and diffracted to high resolution. The structure of HMW1-PP was solved using single-wavelength anomalous dispersion phasing, and the structure of the native form was identical to the SeMet-derivatized form when solved by molecular replacement. The final model has an R/Rfree of 17.4/21.7% (Table 1). This model comprises all 371 amino acids of HMW1-PP except for residue 69 and includes 182 water molecules.
FIGURE 1. Domain organization of HMW1 and model for TPS. The HMW1 signal sequence (residues 1-68) is shown in yellow, the HMW1 pro-piece (residues 69-441) is shown in magenta, and the mature HMW1 adhesin (442-1536) is shown in blue (a large cell surface structure) and in orange (a small C-terminal anchor). For clarity, only the secretion process across the outer membrane is presented. After Sec-dependent export of the HMW1 preproprotein across the inner membrane, the HMW1 secretion domain interacts with HMW1B, likely in the periplasm, facilitating translocation of HMW1 across the outer membrane. Sometime during this translocation process, HMW1-PP is cleaved and released to the extracellular space (S. Grass and J. W. St. Geme III, unpublished data). Ultimately, the binding domain of mature HMW1 is exposed on the tip of the adhesin, away from the bacterial cell surface, in a position that favors access to host cells.
Structure of the HMW1 Pro-piece-The structure of HMW1-PP at a resolution of 1.92 Å represents a complete view of residues 70-441, including the functional secretion domain. HMW1-PP is a monomer folded into a large right-handed coil termed a parallel β-helix, with dimensions of ~70 Å × 40 Å × 30 Å (Fig. 2, A and B). The β-helix is formed by three parallel β-sheets, referred to as PB1, PB2, and PB3 (Fig. 2C), according to the naming of the β-sheets in the pectate lyase structure, the first β-helix fold determined (21). The entire HMW1-PP sequence contains 12 complete parallel coils in the helix (Fig. 2, A and C). At the N-terminal side, the first three β-strands are antiparallel to the core right-handed β-helix. The first two parallel coils (β4-β5-β6 and β7-β8-β9) and the topping antiparallel rung β1-β2-β3 are tilted from the core β-helix axis and form a slightly tapered β-helix. Interestingly, the highly conserved NPNG motif (denoted in green in Fig. 2A) forms a type I β-turn and bridges this twisted N-terminal part and the rest of the helix. Following the NPNG turn, the coils become more regular and form an overall cylinder with a triangle-shaped cross-section.
Most parallel β-helical structures appear to have a conserved amphipathic α-helix at the N terminus, which caps and shields the hydrophobic core of the β-helix at the N-terminal side (22). In HMW1-PP the N terminus is uncovered and is stabilized by the first three antiparallel strands β1, β2, and β3. Another notable feature of the HMW1-PP structure is the internal organization of the amino acids that form the core of the parallel β-helix. The interior of the β-helix consists of mostly aliphatic hydrophobic residues but also contains two other important types of side chain stacks, namely an aromatic cluster and a hydrogen-bonded asparagine ladder. Four aromatic residues (Trp102, Phe105, Phe115, and Phe167 in β5, β6, β7, and β14, respectively) are clustered within the first four coils, thereby reinforcing the hydrophobic core of the N-terminal side. The PB1 interior of coils 4-12 is remarkable in that every β-strand in this region is strictly composed of three residues, forming a very regular and long Ile ladder, with the exception of Leu286 at β28 (Table 2). The PB2 sheet is composed of β-strands with 4-6 residues/strand and represents a very irregular face. The side chains of the PB2 interior are variable, including Leu/Val/Ile/Thr/Ala/Trp/Phe. The PB3 sheet of coils 4-12 is very regular and is composed of β-strands with four residues/strand. One side chain stack of PB3 is characterized by an Asn ladder and a contiguous Ile/Leu ladder (Table 2). Of note, four Asn residues in PB3 (Asn199, Asn220, Asn262, and Asn282 in β16, β19, β24, and β27, respectively) are stacked in an elegant manner to produce an extensive network of hydrogen bonds. In fact, the position of the side chains is optimal for hydrogen bonding and packing. In each position of the PB stacks, the nature of the hydrophobic residues is very similar, thereby resulting in an overall cylinder-shaped β-helix. On the exterior of the helix, stacks contain largely polar residues (Fig. 3), except for one stack of 13 residues at PB1 that includes seven hydrophobic residues (Ile/Leu/Val) (Table 2).
The T1, T2, and T3 connecting loops and turns between β-helical strands correspond to loop regions between PB1 and PB2, between PB2 and PB3, and between PB3 and PB1, respectively (Fig. 2C). The T3 turns are the shortest and most regular loops and contain three to five residues (predominantly three residues). The T1 and T2 turns are more variable in length and adopt several secondary structural elements, forming three protruding extra-helical domains. The large extra-helical domain in T2 is formed by α1, β15, β22, and β23 and is bent toward PB2, stabilized by forming hydrophobic cores with residues from β8, β11, β14, β18, and β21 of PB2. In contrast, the small extra-helical domain in T1, composed of β32, β36, β40, and α2, is oriented externally and is exposed to solvent. As a result of the arrangement of the T1, T2, and T3 turns, the PB1 and PB3 faces are completely accessible to solvent, and the PB2 face is partially shielded by the large extra-helical domain. Of note, as shown in Fig. 3, the partial shielding of the PB2 face creates a major hydrophobic pocket decorated by basic amino acid residues, namely Lys295, Lys304, and Lys325.
Comparison of HMW1-PP and Fha30-HMW1-PP and Fha30 share only 21% sequence identity, presumably explaining why the attempt to predict the structure of HMW1-PP based on the structure of Fha30 was unsuccessful and why the structure of HMW1-PP could not be solved by molecular replacement using Fha30 as the search model. Despite the differences in primary amino acid sequence, both HMW1-PP and Fha30 adopt a right-handed parallel β-helix fold. However, because of differences in the spatial arrangement of secondary structure elements and loop structures in these proteins, the structure-based sequence alignment reveals unexpectedly large gaps between the two sequences (Fig. 4A). Superposition of the corresponding residues of the two structures resulted in a root mean square deviation of 2.4 Å for 291 Cα atoms and a Z score of 24. The most significant structural differences are noted at the N terminus and in the lateral extra-helical moiety (Fig. 4, B and C). Interestingly, in both HMW1-PP and Fha30, the first three β-strands at the N terminus (β1, β2, and β3) cover the first coil of each β-helix. However, although the first three strands β1, β2, and β3 of HMW1-PP are antiparallel to the core β-helix and form extensive hydrogen bonds with β6 of PB1, β5 of PB2, and β4 of PB3, respectively, only the first two strands β1 and β2 of Fha30 are antiparallel to the core β-helix (Fig. 4A). Hence, the N terminus of HMW1-PP forms an uncovered helical structure, whereas the N terminus of Fha30 has a capped structure (Fig. 4C). The extra-helical β-hairpin formed by β7 and β8 in Fha30 has no counterpart in HMW1-PP. Interestingly, the major extra-helical motif in HMW1-PP and Fha30 (composed of α1/β15/β22/β23 in HMW1-PP and β16/β17/β24/β25 in Fha30) is formed at the same spatial position in each structure, despite the different nature and arrangement of the secondary structure elements. As shown in Fig. 4A, the sequence conservation is highly concentrated on the β-strands forming the core helix. Yet, given the low sequence identity, the side chain stacks of the interior as well as the exterior of the coils are quite different, resulting in very different surface properties for the two structures and thereby conferring specificity for the interaction with the cognate TpsB protein.
Structural Requirement for HMW1-PP Secretion-The structure of HMW1-PP is a single domain structure and contains the secretion domain, a functional region that is proposed to correspond to the N-terminal ~250 residues of TpsA proteins but is poorly defined (20,23). In an effort to define the region of HMW1-PP required for secretion, we generated constructs encoding HMW1(1-441), HMW1(1-361), and HMW1(1-269) and then expressed these proteins in the presence of HMW1B. As shown in Fig. 5, HMW1(1-441), HMW1(1-361), and HMW1(1-269) were all detectable in whole-cell sonicates, but only HMW1(1-441) and HMW1(1-361) were associated with secretion into the culture supernatant (proteins corresponding to HMW1(69-441) and HMW1(69-361) were detectable in culture supernatants, reflecting cleavage of the signal peptide). (The smaller bands in lanes 2 and 3 in Fig. 5B presumably represent breakdown products of HMW1(69-441) and HMW1(69-361), respectively.) These results demonstrate that the functional HMW1 secretion domain corresponds to a portion of HMW1-PP that is larger than HMW1(69-269) and perhaps as large as HMW1(69-361).
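The two summary numbers used in the HMW1-PP/Fha30 comparison above, percent sequence identity and root mean square deviation over paired Cα atoms, can be illustrated with a short sketch. This is not the procedure used in the study (which relied on structure-based alignment and superposition software); the function names are hypothetical, the identity denominator (alignment columns where both sequences have a residue) is one of several conventions, the RMSD function assumes the coordinates have already been superposed, and the inputs shown are toy values.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    paired = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not paired:
        raise ValueError("no aligned residue pairs")
    matches = sum(1 for x, y in paired if x == y)
    return 100.0 * matches / len(paired)


def rmsd(coords_a, coords_b):
    """RMSD between paired 3D points that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy alignment and coordinates for illustration only.
    print(percent_identity("AC-DE", "ACQDF"))
    print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]))
```

Because large gaps are excluded from both calculations, two proteins can show a low identity and still superpose well over the paired core residues, which is the situation described for HMW1-PP and Fha30.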

DISCUSSION
The TPS pathway is among the most widespread mechanisms for protein secretion in Gram-negative bacteria and is characterized by specific recognition between a TpsA exoprotein and a TpsB outer membrane translocator. The nontypeable H. influenzae HMW1 adhesin is a prototype virulence factor secreted via the TPS pathway. In this study, we have elucidated the structure of HMW1-PP, a region that interacts with the HMW1B outer membrane translocator and is essential for HMW1 secretion. The HMW1-PP structure adopts a monomeric β-helix with 12 complete parallel helix turns and one large extra-helical domain. Importantly, the HMW1-PP structure shares striking similarity with Fha30, demonstrating a conserved fold critical for secretion of TpsA proteins and allowing a structure-based sequence alignment of HMW1-PP and Fha30. Of note, earlier studies failed to report a meaningful sequence alignment of HMW1 and FHA, reflecting the fact that these two prototype TpsA proteins possess low sequence similarity.
Previous studies have examined the NPNG and NPNL motifs that are located within the ~100-residue region adjacent to the signal peptide in TpsA proteins (15-18). Although all TpsA proteins contain the NPNG motif, only a subset contain the NPNL motif, including FHA but not HMW1. The structure-based sequence alignment of HMW1-PP and Fha30 demonstrates that the NPNL motif resides in a 19-residue insertion that is present in FHA and lacking in HMW1. This insertion contains the β7 and β8 strands and forms an extra-helical β-hairpin in Fha30, serving a function that may be important for FHA but is evidently not essential for all TpsA proteins. In contrast, the NPNG motif appears to be a critical structural element for all TpsA proteins. Indeed, in HMW1 the NPNG motif contributes to a type I β-turn and is largely buried in the hydrophobic core, suggesting that this β-turn is important for folding of TpsA proteins. In previous work, we found that mutation of the NPNG motif in HMW1 to IAIG resulted in a loss of processing and secretion (15). Similarly, examination of this motif in other TpsA proteins has demonstrated a crucial role in secretion (16-18). Based on the structures of HMW1-PP and Fha30, we generated a multiple sequence alignment of representative TpsA proteins, revealing two distinct subsets (Fig. 6). It is noteworthy that the variable regions between HMW1-PP and Fha30 are conserved within each subset, suggesting that all members of a given subset adopt the same structure.
Since the report of Erwinia chrysanthemi pectate lyase C (PelC) as the first example of a β-helix structure, β-helical folds have been identified in a number of proteins associated with infection (21). Representatives include pectate lyase-related proteins, viral adhesin proteins, and bacterial virulence factors such as Bordetella spp. p69 pertactin (20,24-27). Recently, β-helical folds have also been identified in an antibiotic resistance protein that mimics DNA and in a pollen allergen protein (28,29). A search for structural homologs of HMW1-PP using the program DALI (30) identified a number of proteins with very high scores. The HMW1-PP fold has a high level of structural similarity (Z score of >10; root mean square deviation value of 2.4-3.6 Å) to 13 other β-helix structures that are separated into two different functional classes of proteins, namely adhesin molecules (e.g. FHA, tailspike viral adhesion protein, and p69 pertactin) and glycoside hydrolases, including PelC homologs (20,21,24-27). Of note, all of these molecules are secreted to the cell surface or released to the extracellular space. Interestingly, among these β-helix structures, HMW1-PP and Fha30 are unique in that the N terminus comprises antiparallel β-strands instead of α-helical capping. As we await more TpsA structures, we speculate that the N-terminal antiparallel coil may be a signature for the TpsA β-helix fold, serving to stabilize the N-terminal side.
terminus of mature HMW1 is the site of the HMW1-binding domain, it is possible that cleavage of the HMW1-PP fragment enhances exposure of the binding domain and thereby facilitates interaction with host cells. In considering the fact that HMW1-PP adopts the same fold as some glycoside hydrolases, it is notable that HMW1 is glycosylated, raising the possibility that HMW1-PP may be capable of cleaving carbohydrates, resulting in an increase in the structural polymorphism of HMW1.
In conclusion, the structure of HMW1-PP provides the first high resolution picture of the H. influenzae HMW1 TPS pathway. Despite limited overall sequence similarity between HMW1-PP and other TpsA proteins, our structure-based alignment demonstrates remarkable sequence conservation in the secondary structural elements throughout the β-helix core structure. Thus, the HMW1-PP structure highlights a fundamental concept, namely that the essential structural and functional elements of proteins are preserved during evolution, sometimes independent of specific amino acid sequences. The structure presented here provides the foundation for rational mutagenesis studies to investigate the determinants of the highly specific interaction between HMW1-PP and HMW1B and between TpsA and TpsB proteins in general. Moreover, based on our multiple sequence alignment, homology models of the secretion domains in other HMW1-like TpsA proteins will be useful in designing new experiments to examine their functional roles.