Ultrahigh Resolution and Full-length Pilin Structures with Insights for Filament Assembly, Pathogenic Functions, and Vaccine Potential

Background: The Type IV pili are bacterial pilin polymers with multiple functions in pathogenesis. Results: We report here pilin structures for full-length FimA from Dichelobacter nodosus and PilE from Francisella tularensis. Conclusion: Type IVa pilins have conserved curved α-helical N termini and D-regions in the absence of C-terminal cysteines. Significance: Conserved regions form the framework for hypervariable and functional regions and represent vaccine targets. Pilin proteins assemble into Type IV pili (T4P), surface-displayed bacterial filaments with virulence functions including motility, attachment, transformation, immune escape, and colony formation. However, challenges in crystallizing full-length fiber-forming and membrane protein pilins leave unanswered questions regarding pilin structures, assembly, functions, and vaccine potential. Here we report pilin structures of full-length DnFimA from the sheep pathogen Dichelobacter nodosus and FtPilE from the human pathogen Francisella tularensis at 2.3 and 1 Å resolution, respectively. The DnFimA structure reveals an extended kinked N-terminal α-helix, an unusual centrally located disulfide, conserved subdomains, and assembled epitopes informing serogroup vaccines. An interaction between the conserved Glu-5 carboxyl oxygen and the N-terminal amine of an adjacent subunit in the crystallographic dimer is consistent with the hypothesis of a salt bridge between these groups driving T4P assembly. The FtPilE structure identifies an authentic Type IV pilin and provides a framework for understanding the role of T4P in F. tularensis virulence. Combined results define a unified pilin architecture, specialized subdomain roles in pilus assembly and function, and potential therapeutic targets.

of their roles in pathogenesis and as promising vaccines and therapeutic targets.
T4P are flexible filaments, several μm in length but less than 100 Å wide, displayed on many Gram-negative bacterial surfaces as well as some Gram-positive Clostridial species (9,10). N. gonorrhoeae GC pili are perhaps the best characterized of the T4P in terms of their structure, mechanical properties, and biological functions. GC pili self-associate to hold bacteria in protective microcolonies at the infection site, and they adhere specifically to host receptors (11). They are among a subclass of T4P that are retractile, a feature that facilitates their functions in twitching motility, natural transformation, and intimate adhesion, which precedes invasion of host cells (12,13). GC pili have high tensile strength and are resistant to proteases and denaturants (12,14). The structure of the intact GC pilus filament was determined by fitting the full-length GC pilin structure into a 12.5 Å resolution electron cryomicroscopy reconstruction (6). In GC pili, the conserved N-terminal α-helices are arranged in a helical array in the core of the filament with the globular domains lining the pilus surface. The conserved Glu-5 is positioned to form a salt bridge with the positively charged N-terminal amine in the otherwise hydrophobic core of the filament. We proposed that this interaction may be a driving force for pilus assembly (6). A hypervariable β-hairpin located between two conserved cysteines protrudes from the surface, allowing the pili to continually alter their antigenicity and evade the immune system (15-17).
The T4P of the sheep pathogen Dichelobacter (formerly Bacteroides) nodosus are of great interest because they are highly immunogenic and have proven efficacy as vaccines for ovine footrot, a debilitating disease with major economic impact (18-20). Like GC pili, D. nodosus T4P, or "fimbriae," are retractile (21,22). These pili mediate twitching motility, DNA uptake, and cell adhesion and act as secretion organelles, transporting proteases across the bacterial outer membrane (21-23). Only two T4P, D. nodosus and Vibrio cholerae toxin coregulated pili (22,23), have been shown to act in secretion, but strong similarities between the T4P system and the bacterial Type II secretion (T2S) system imply that other T4P may share this capability. D. nodosus T4P are immunogenic and protective when injected into sheep in a purified form, but protection is serogroup-specific due to antigenic variability in the FimA pilin (24). The FimA proteins from each serogroup are divided into two classes; the class I FimA pilins have a ~40-45-amino acid segment between two atypical centrally located cysteines, whereas the class II FimAs have only a 9-residue segment as well as a C-terminal cysteine pair conserved in most Type IV pilins (25) (supplemental Fig. S1). Three hypervariable regions are distributed throughout the FimA sequence: two discrete segments between the central cysteine pair and one in the C-terminal segment, which includes the C-terminal cysteines of the class II FimA proteins. A detailed characterization of FimA pilin in the context of the pilus filament would provide insights into pilus functions in D. nodosus pathogenesis and facilitate design of effective and broadly neutralizing footrot vaccines.
The identification of Type IV pilin genes in the deadly human pathogen F. tularensis (7) reinforces the importance and timeliness of characterizing pilin structures, which contribute to our understanding of their roles in virulence and potential as vaccines. F. tularensis causes the frequently fatal disease tularemia (26). This organism is widespread in wildlife, and transmission to humans occurs via small mammals, such as rabbits, and insect vectors such as ticks, mosquitoes, and deer flies (27). F. tularensis is classified as a United States government Class A agent and a potential biological weapon because of its extreme infectivity, ease of dissemination via aerosols, and capacity to cause illness and death (26,28-31). Infection can occur via inhalation, ingestion, and exposure to the skin or eyes. The mechanism by which F. tularensis colonizes multiple tissue types is enigmatic but may relate to its multiple pilin genes because other T4P act as cell-specific adhesins (3).
F. tularensis subspecies fall into two classes: the highly virulent Type A strains (subsp. tularensis) and the less virulent Type B strains (subsp. holarctica). An attenuated F. holarctica live vaccine strain and the mouse pathogen F. novicida are often used in laboratory studies. The highly virulent F. tularensis Schu S4 genome contains six open reading frames encoding PilA, PilE (PilE2), PilV (PilE3), PilE4, PilE5, and PilE6 (7,28,32), which bear Type IV pilin hallmarks consisting of a conserved hydrophobic N terminus with a glutamate at position 5 (except PilE6, which has Gln-5) and a pair of C-terminal cysteines. Genes encoding putative pilus assembly proteins have also been identified in the F. tularensis genome. The genes for PilE, PilV, and PilE4 are disrupted in less virulent Type B strains (32), implying that these pilins may contribute to the extreme virulence of Type A strains. Alterations in PilA, PilE5, and PilE6 attenuate infectivity for the Type B strains but have reduced impact on the highly virulent Schu S4 strain (32-35). This implies redundancy in the Type A strains, with PilE and/or PilV compensating for loss of PilA.
Although some putative F. tularensis pilins are associated with virulence, it is not clear whether they form T4P. T4P-like filaments were observed on the surfaces of Schu S4 and live vaccine strain, but disruption of only one of the putative pilin genes, pilE4, affected pilus expression (33,36). However, heterologous expression of PilA in an N. gonorrhoeae strain lacking its endogenous pilin gene restored pilus expression and DNA uptake, suggesting that PilA forms functional pili (37). In F. novicida, PilA is involved in secretion of the PepO protease, but this does not explain its role in the human F. tularensis pathogens, which do not produce PepO (36,38). Thus, some of these "pilin" genes may actually encode T2S pseudopilins. Further characterization of these gene products is required to establish them as bona fide Type IV pilins and to help define their roles in F. tularensis pathogenesis.
Here we present two new Type IV pilin crystal structures: the full-length FimA pilin from D. nodosus (DnFimA) and an ultrahigh resolution F. tularensis PilE (FtPilE) structure. Both pilins belong to the Type IVa pilin subclass, members of which are present on a broad range of bacterial pathogens, including N. gonorrhoeae and P. aeruginosa. Type IVa pilins differ from the Type IVb pilins, which are found almost exclusively on enteric pathogens, in their length and amino acid sequences as well as their protein fold (10). These structures provide important insights regarding T4P assembly and function, including the nature and role of the N-terminal segment in filament assembly, the distribution of hypervariable regions in DnFimA, and the authenticity of the F. tularensis Type IV pilins.

EXPERIMENTAL PROCEDURES
Expression and Purification of DnFimA-Full-length DnFimA was isolated from P. aeruginosa strain PAK/2Pfs (39) carrying plasmid pJSM202, which expresses full-length DnFimA (a gift from John Mattick, University of Queensland) (40). DnFimA was purified as described (4). Briefly, cells were grown overnight on tryptic soy broth plates, harvested by scraping, and resuspended in 150 mM CHES, pH 9.5. Pili were sheared from the cells by vortexing for three 1-min intervals. Cellular debris were removed from the pilus solution by centrifugation, and pili were precipitated by exhaustive dialysis in phosphate-buffered saline, pH 7.4. The pili were dissociated into DnFimA subunits by extensive filtration using an Amicon stirred cell concentrator (Millipore) with a 30,000 molecular weight cut-off membrane in 100 mM Tris-HCl, pH 8.0, 20 mM NaCl, 1 mM dithiothreitol (DTT), 1.5% n-octyl-β-D-glucopyranoside (w/v).
Structure Determination of DnFimA-DnFimA was concentrated to 15 mg/ml and crystallized in 60% saturated ammonium sulfate, 5% saturated NaCl (w/v), 100 mM sodium citrate, pH 6.6, and 50 mM NaPO4, pH 9.2. Crystals were flash-frozen in liquid nitrogen, and diffraction data were collected at SSRL beamline 11-1 and processed with HKL2000 (Table 1). The DnFimA structure was solved using molecular replacement with the program Phaser (41) using the N. gonorrhoeae GC pilin conserved core (i.e. PDB entry 2HI2 lacking the αβ-loop and D-region) as a search model. The structure was refined using PHENIX (42) and manually fit using COOT (43). The two pilin molecules in the asymmetric unit are virtually identical; the only difference is that residues 59-67 are missing in chain B. Therefore, chain A is shown in most figures. Refinement statistics are shown in Table 1.
Expression and Purification of FtPilE-F. tularensis SCHU S4 genomic DNA was obtained from Jeannine Peterson at the Centers for Disease Control and Prevention. The gene fragment encoding residues 28-148 of FtPilE (FTT0889c) was cloned into the pET-28b expression vector (Novagen), which encodes an N-terminal hexahistidine tag. Cells with overexpressed FtPilE were harvested and resuspended in 50 mM HEPES, pH 7, 500 mM NaCl and then lysed by sonication. FtPilE was purified using a nickel-nitrilotriacetic acid column (GE Healthcare). FtPilE was eluted in 50 mM HEPES, 250 mM NaCl, 300 mM imidazole; dialyzed into 50 mM Tris-Cl, pH 7.5, 50 mM NaCl, 1 mM EDTA, 1 mM β-mercaptoethanol; and loaded on a size exclusion column. Fractions containing FtPilE were concentrated to 15 mg/ml.
Structure Determination of FtPilE-FtPilE was crystallized in 38% PEG 2000 monomethyl ether, 5% saturated LiSO4, 25 mM BIS-Tris propane, pH 7, 20 mM xylitol, and 10 mM DTT. Crystals were flash-frozen in liquid nitrogen, and a 1 Å resolution diffraction data set was collected at SSRL beamline 9-1 and processed with HKL2000 (Table 1). For phase determination, selenomethionine-substituted FtPilE (SeMet-PilE) was produced. SeMet-PilE crystallized in the same space group as FtPilE, and a 1.9 Å data set was collected (Table 1). We located six of the eight seleniums in the asymmetric unit using SOLVE (44), which allowed initial phases to be determined. RESOLVE (45) was used to build a model, which was used to solve the 1.0 Å structure by molecular replacement with AMoRe (46). After manual fitting with COOT (43) and refinement with CNS (47), ACORN was used to refine the phases with dynamic density modification (48). These phases were then used in ARP/wARP (with the 1.0 Å data), which automatically fit over 90% of the model, including side chains. Fitting was completed manually. Due to the ultrahigh resolution, hydrogen atoms and anisotropic B-factor parameters were included in the refinement. In addition, alternate conformations, which are visible for more than 25% of the protein, were included, and the occupancy was refined in PHENIX (42). Refinement statistics are shown in Table 1.
Filament Models-The DnFimA structure was superimposed onto a single GC pilin subunit, optimizing the fit between the N-terminal α-helices, using the model building program in COOT (43). DnFimA and GC pilin share 47% identity over the first 50 amino acids. The filament was generated by applying the symmetry operators for the GC pilus filament (PDB entry 2HIL): an axial rise per subunit of 10.5 Å and an azimuthal rotation of 100.8° using PDBSET in COOT. The F. tularensis pilus model was built using the same method; however, the x-ray coordinates for the N-terminal 27 residues were obtained from the GC pilin structure (PDB entry 2HI2). FtPilE and GC pilin share 67% sequence identity over the first 28 amino acids and 41% identity over the first 50.
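The helical symmetry operation described above can be illustrated with a minimal Python sketch. It is not the PDBSET procedure used for the models; it only shows the geometry of applying a 10.5 Å axial rise and a 100.8° azimuthal rotation per subunit. The choice of the z axis as the filament axis, the sign of the rotation (handedness), and the function names helical_copy and build_filament are assumptions made for illustration.

```python
import math

RISE_A = 10.5       # axial rise per subunit in angstroms (value from the text)
TWIST_DEG = 100.8   # azimuthal rotation per subunit in degrees (value from the text)

def helical_copy(coords, n, rise=RISE_A, twist_deg=TWIST_DEG):
    """Return the coordinates of the n-th subunit.

    Assumes the filament axis is the z axis and that a positive twist is a
    counterclockwise rotation viewed down +z (both are illustrative choices).
    """
    theta = math.radians(n * twist_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z + n * rise) for x, y, z in coords]

def build_filament(coords, n_subunits):
    """Generate n_subunits copies of one subunit by repeated helical symmetry."""
    return [helical_copy(coords, n) for n in range(n_subunits)]

# Derived helical parameters implied by the two values above.
SUBUNITS_PER_TURN = 360.0 / TWIST_DEG   # about 3.57 subunits per turn
PITCH_A = SUBUNITS_PER_TURN * RISE_A    # 37.5 angstroms per turn
```

With these parameters, one full turn of the 1-start helix spans about 3.6 subunits and 37.5 Å, which is why a subunit and its neighbors in the 3- and 4-start helices lie at closely staggered axial levels.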

RESULTS
Full-length DnFimA Structure Determination-To examine the relationships between pilin structure and function for D. nodosus fimbriae, we sought to determine the crystal structure of full-length DnFimA. Pure protein was obtained by expressing DnFimA from D. nodosus VCS1001 (serogroup A, class I) heterologously in P. aeruginosa PAK/2Pfs (40). Pilin subunits were dissociated from sheared surface pili using high pH and n-octyl-β-D-glucopyranoside to solubilize the pilin subunits. Previous full-length pilin crystallizations required polyethylene glycol and organic alcohols that can potentially influence the structure of the N terminus (4-6). Here we obtained the full-length DnFimA structure in the presence of n-octyl-β-D-glucopyranoside and salts but no organic alcohols, providing distinct conditions to examine the protruding N-terminal helix and subunit interactions.
Phase information was obtained by molecular replacement using the N. gonorrhoeae pilin conserved core as a search model. DnFimA crystallized with two molecules in the asymmetric unit (Table 1). The two DnFimA molecules in this "crystallographic dimer" are oriented head-to-head and tail-to-tail about a 2-fold symmetry axis that runs almost parallel to the long axis of the proteins, with a buried interface of 570 Å² (Fig. 1). The C-terminal globular domains face away from each other, and the N-terminal α-helices cross at residues 19-21, the site of a kink induced by Pro-22.
DnFimA Fold, Architecture, and Interactions-The exposed N-terminal segments of α1 of each subunit in the crystallographic dimer run almost parallel to each other, held together by van der Waals interactions between the hydrophobic side chains and by hydrogen bonds between the Glu-5 side chain oxygens of each subunit and the Phe-1 nitrogen of its dimer partner. This interaction appears to distort the α-helix at Phe-1 and Thr-2, causing the N terminus and the Glu-5 side chains to "reach" toward each other. The globular domains are not part of the dimer interface because the closest atoms are almost 5 Å apart (Leu-142 Cδ1-Leu-142 Cδ1), and several like charges lie opposite each other, which may actually help to separate these domains (Fig. 1). The two molecules are similar, with an r.m.s. deviation of 0.35 Å between Cα atoms. Despite the different crystallization conditions, the overall DnFimA architecture and structure resemble the GC and PAK pilin structures, with a high similarity between the N-terminal helices and the central β-sheets and differences in the adjacent regions (Fig. 2A). For all three pilins, the mostly hydrophobic N-terminal half of α1, α1N, protrudes from the globular domain, and the amphipathic C-terminal half, α1C, has its hydrophobic face embedded in an antiparallel β-sheet within the C-terminal globular domain. The β-sheet is tilted at a ~45° angle to α1C and partially wraps around α1C, where the interface is composed of mostly hydrophobic side chains.
In each of the full-length Type IV pilin structures, α1 has a gently curved S-shape, induced by a proline at position 22 and a glycine or proline at position 42, which are conserved in other Type IVa pilins (Fig. 2, B-D). The angle of the two kinks varies among the pilins, with DnFimA having the sharpest bends. This curvature is important for packing of α1 into the pilus filament, as seen in the electron microscopy (EM) reconstruction of the N. gonorrhoeae GC pilus (6), and probably contributes to filament flexibility. The DnFimA globular domain has the α-β roll fold characteristic of the Type IVa pilins (Fig. 3A). The DnFimA αβ-loop, which lies between α1 and the β-sheet, contains a β-hairpin (residues 58-68) that protrudes from the globular head domain and is only well resolved in one molecule in the dimer (chain A). Residues 59-67 are not resolved in chain B, implying flexibility in this region. Following the β-hairpin, the polypeptide chain extends downward, antiparallel to α1, and forms the first strand of the β-sheet (Fig. 3A). Because this strand is absent in the conserved four-stranded β-sheets in other Type IVa pilins, it is considered to be part of the αβ-loop and is referred to as strand β-1. The DnFimA αβ-loop is most similar to that of P. aeruginosa K122-4 pilin both in the backbone structure and the presence of an unusual disulfide bond, which in both proteins links the beginning of the αβ-loop with the beginning of the second β-strand of the β-sheet (Fig. 3A). In DnFimA, the disulfide bond is between Cys-56 and Cys-97. K122-4 pilin has a shorter and less protruding β-hairpin in a similar position in its αβ-loop (residues 63-69). Following the αβ-loop is the conserved β-sheet, with large loops between strands β1 and β2 (residues 81-96) and β2 and β3 (residues 104-117). In contrast, these loops are much shorter in K122-4 pilin, with the β3-β4 loop being longer.
The most C-terminal segment following the β-sheet is similar in both proteins, but K122-4 pilin has an additional 8 residues that extend away from the globular domain in both the crystal and NMR structures (50,51).
Conserved D-region Structure in the Absence of the Conserved Disulfide Bond-The structural similarity between the C-terminal regions of DnFimA and K122-4 pilins is surprising, given that DnFimA lacks the stabilizing disulfide bond seen for all other Type IV pilins sequenced to date. In both DnFimA and K122-4 pilin, the C-terminal segment lies at the periphery of the globular domain (Fig. 3A). The polypeptide chain exits the β-sheet and forms an almost circular loop that wraps back to interact with the end of the terminal β-strand. In K122-4 pilin, this loop is stabilized by a disulfide bond between Cys-129 near the end of β4 and Cys-142 near the C terminus of the pilin and is termed the disulfide region or D-region. Interestingly, DnFimA (and other class I FimA pilins) lacks these C-terminal cysteines and instead stabilizes this conserved conformation with non-covalent interactions: hydrogen bonds between the Tyr-133 nitrogen and the Val-149 oxygen and between Lys-132 N and the Lys-150 oxygen; and van der Waals interactions between the Tyr-133 phenyl ring and the Lys-150 aliphatic side chain segment (Fig. 3B). Because these stabilizing residues are in positions analogous to the K122-4 pilin C-terminal cysteines, and they align well with the C-terminal cysteines present in the class II DnFimA pilins (supplemental Fig. S1), we define the segment between residues 133 and 150 of DnFimA as the D-region, although the conserved cysteines are absent. This D-region conformation appears to be an unanticipated conserved feature in all Type IVa pilins, with variation occurring mainly in the loop immediately following β4 (Fig. 3C).
GC pilin has the most significant variation, with a hypervariable β-hairpin insertion that forms a β-sandwich with the main β-sheet, giving GC pilin a unique and variable surface stereochemistry. Despite this distinctive feature, GC pilin is in fact more similar than K122-4 pilin to DnFimA in both sequence (39% identity) and overall structure. Both DnFimA and GC pilin have long β1-β2 and β2-β3 loops and a long D-region loop that protrudes from the front of the globular domain (Fig. 3, C and D).
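Percent identity values such as those quoted for the pilin pairs depend on how the aligned columns are counted. The following Python sketch shows one common convention, counting identical residues over columns in which neither sequence has a gap, optionally restricted to the first N columns. The convention, the function name percent_identity, and the toy strings in the usage note are assumptions for illustration; they are not the alignment or the method used to obtain the values reported here.

```python
def percent_identity(aln_a, aln_b, first_n=None, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are excluded from both the
    numerator and the denominator (one of several conventions in use).
    If first_n is given, only the first first_n alignment columns are used.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    columns = list(zip(aln_a, aln_b))
    if first_n is not None:
        columns = columns[:first_n]
    compared = [(a, b) for a, b in columns if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)
```

For example, the toy strings "ACDEFGHIKL" and "ACDEFGHIKV" differ at one of ten positions, so the function returns 90.0 for them; restricting the comparison with first_n=5 returns 100.0.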
DnFimA Variable Regions, Assembly Model, and Vaccine Implications-DnFimA, like GC pilin, displays antigenic variation (25). We mapped the three discrete hypervariable regions of DnFimA onto its crystal structure (Fig. 4, A and B). Region 1, residues 57-65, corresponds to the protruding β-hairpin in the αβ-loop. Region 2 (residues 84-97) corresponds to the β1-β2 loop, and region 3 (residues 134-145) comprises most of the D-region. These regions are separated in the DnFimA sequence but form an almost contiguous surface on the folded protein, with regions 1 and 2 connected by the central disulfide bond.
To examine D. nodosus fimbria antigenic variation in the context of the assembled pilus filament, as it would be presented to the sheep immune system, we fitted the DnFimA structure into the N. gonorrhoeae GC pilus EM reconstruction (6). DnFimA fit well into the GC pilus structure due to its overall similarity with GC pilin (Fig. 4, C and D). The few steric clashes, occurring between the N terminus of each side chain and two neighboring N-terminal α-helices, would be eliminated if the first few α1 residues were partially extended as in GC pilin rather than reaching upward to interact with Glu-5 (Fig. 1). An extended N terminus would also allow Phe-1 to form a salt bridge with Glu-5 of a neighboring subunit in the filament because these residues would lie at the same axial level due to the staggered arrangement of subunits in the filament. A second clash, occurring between side chains near the kink in α1N (Leu-16 and Phe-19) and backbone atoms in the globular domain of the neighboring subunit in the 3-start helix, could be eliminated by minor φ/ψ angle adjustments in the flexible α1N, facilitated by a glycine at position 14. Finally, the protruding β-hairpin in the αβ-loop and the D-region of a neighboring subunit overlap in the filament model (Fig. 4, D and E). The αβ-loop structures vary among the Type IV pilins, and the DnFimA β-hairpin is only ordered in one of the two molecules in our crystal structure. Thus, the β-hairpin, which appears somewhat flexible, probably undergoes a conformational change to accommodate the structurally conserved D-region of the neighboring subunit. Notably, the β-hairpin is shorter by 1 residue in many of the DnFimA variants, and the C-terminal loop is shorter by 3 residues in the class II DnFimA variants, which have a C-terminal cysteine (supplemental Fig. S1), so this close contact may be less significant for pili from other D. nodosus serogroups.
The hypervariable regions lie on and protrude from the exposed face of DnFimA in the pilus model, as expected for immunodominant sites.
FtPilE Structure Determination-The F. tularensis Type IV pilins are not as well characterized as D. nodosus fimbriae, and their ability to form pilus filaments has not been firmly established. To better understand F. tularensis T4P structure, assembly, and functions, we expressed, purified, and crystallized PilE from the highly virulent F. tularensis Schu S4 strain as a His-tagged N-terminally truncated protein (FtPilE). FtPilE functions have not been determined, but its presence in the highly virulent A strains and its absence in the less virulent B strains imply a role in F. tularensis pathogenesis. FtPilE crystallized in the P2₁2₁2₁ space group and diffracted to 1.0 Å resolution. To solve the high resolution FtPilE structure, we first solved a 1.9 Å resolution structure of SeMet-PilE by single wavelength anomalous dispersion. This lower resolution structure was then used as a model to solve the 1.0 Å structure of FtPilE by molecular replacement.
FtPilE Fold, Subdomains, and Interactions-FtPilE has the canonical Type IVa pilin fold, with α1C embedded in a four-stranded anti-parallel β-sheet (Fig. 5A). The αβ-loop, residues 53-68, lies on one edge of the β-sheet, and the disulfide-bonded D-region, between Cys-114 and Cys-132, lies on the other edge. The ultrahigh resolution data allowed us to distinguish dual backbone conformations in most loops, indicating limited flexibility in these regions. As with DnFimA, FtPilE crystallized with two molecules in the asymmetric unit, but in this case the two molecules are oriented in a twisted head-to-tail arrangement and interact mostly via hydrogen bonding between backbone atoms, burying 584 Å² in the interaction interface (Fig. 5B). The crystallographic dimer is stabilized by interactions between 1) the extended αβ-loops via canonical antiparallel β-sheet hydrogen bonds, although these segments are buckled and lack β-strand φ/ψ angles, and 2) the C-terminal halves of each β1, which give the appearance of a continuous β-sheet between the two molecules, although these strands are too far apart to form backbone hydrogen bonds typical of β-sheets. The two molecules in the dimer are very similar, with an overall r.m.s. deviation of 0.8 Å (Cα atoms).
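The r.m.s. deviations quoted for pairs of molecules measure the average displacement between equivalent Cα atoms. As a minimal Python sketch, assuming the two coordinate sets have already been optimally superimposed and list equivalent atoms in the same order, the value can be computed as follows. The function name ca_rmsd is hypothetical, and the superposition step itself (for example a least-squares fit) is not shown.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and that the i-th entry
    of each list refers to the same (equivalent) C-alpha atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total_sq / len(coords_a))
```

Because the deviations are squared before averaging, a few strongly displaced loop residues can dominate the value, which is one reason structurally similar pilins with different loop conformations can still show a large overall r.m.s. deviation.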

FtPilE Structural Variability and C-terminal Extended Region-Like the other Type IVa pilins, FtPilE α1C possesses a central kink, in this case initiated by Gly-41. This kink lies at the same position as the kink in the other Type IV pilins, but it is even more pronounced than for DnFimA, with the second half of α1C oriented at a ~45° angle to the first (Figs. 2 and 5). The presence of the conserved Pro-22 implies that the α1 of FtPilE shares the S-shaped curve seen in the full-length pilin structures.
The αβ-loop of FtPilE is relatively simple compared with other Type IV pilins. From α1, the polypeptide chain reverses direction, folding into a 3₁₀ helix followed by an extended segment that runs antiparallel to and interacts with both α1C and the first strand of the β-sheet, β1, via a core of hydrophobic side chains. The polypeptide chain again reverses direction in a Type II turn to enter the first strand of the β-sheet. The β1-β2 and β3-β4 loops at the top of the globular domain are short, 3 and 4 residues, respectively, whereas the β2-β3 loop at the bottom of this domain is 16 residues long and contains a single helical turn. The φ/ψ angle differences in the β3-β4 turn between the two molecules in the crystallographic dimer result in slightly different loop conformations, which appear to be induced by crystal packing, suggesting flexibility (Fig. 5A).
The D-region that follows the β-sheet is a meandering irregular 20-residue segment that folds back on itself and is attached to the end of β4 via the Cys-114-Cys-132 disulfide bond. Beyond Cys-132, the chain is extended to the terminal Ala-138, and part of this segment has β-strand conformation. This strand interacts with the beginning of the D-region loop via backbone hydrogen bonds and hydrophobic side chains and lies across the bottom of the β-sheet at the front of the globular domain. The FtPilE sequence is dissimilar to GC pilin beyond α1, but their structures are nonetheless quite similar (r.m.s. deviation of 4.5 Å for Cα atoms; Fig. 5C). Their αβ-loops are irregular but similar, and the lengths and conformations of their β-sheet loops are also comparable. FtPilE lacks the hypervariable β-hairpin seen in the D-region of GC pilin, but its β-strand replaces some of this bulk. As a result, both FtPilE and GC pilin possess a ridge at the bottom of their globular domains composed of the large β2-β3 loop, the D-region, and the C-terminal segment.
F. tularensis T4P Model-To model F. tularensis T4P, we fit the FtPilE structure into the GC pilus reconstruction in place of the GC pilin globular domain (Fig. 6A). FtPilE fits exceptionally well into the filament model with only minor side chain clashes. Because the α1N of GC pilin was used in the model, the N-terminal residue clashes seen in the D. nodosus pilus model do not occur in the F. tularensis pilus model. The β2-β3 loop and C-terminal loop together form a protruding ridge on the filament surface. In addition, the structurally variable β3-β4 loop lies at the interface between three subunits and is a prominent feature on the pilus surface. Flexibility in this loop may facilitate subunit packing in the filament and may represent both a functional region and an immunogenic one.

DISCUSSION
The crystal structures presented here provide insights into pilin structures and implied T4P assembly, function, and immunogenicity for two important bacterial pathogens with implications for all T4P (Fig. 7). The full-length DnFimA structure shows the S-shaped α1 curvature seen for the GC and PAK pilin structures despite its distinct crystallization conditions and space group, confirming that this structure is natural rather than induced by PEG, organic alcohols, or crystal packing. In the N. gonorrhoeae GC pilus EM reconstruction, the N-terminal α-helices associate in a helical array in the filament core (4,6), and the helical curvature appears to facilitate subunit packing.
Although individual helices are not resolved in the GC pilus EM reconstruction, the known helical symmetry places the conserved Glu-5 of each pilin subunit at the same level in the filament as the positively charged N terminus of its neighboring subunit, allowing these charged residues to neutralize each other in the otherwise hydrophobic core of the filament. Notably, the DnFimA structure shows a direct interaction between the N-terminal amines and the Glu-5 carboxyl group despite the high pH of the crystallization buffer, where the N-terminal amine group (N1) is probably deprotonated. That this charge pair interaction is seen in the crystallographic dimer in nonphysiological conditions suggests a specific attraction and is consistent with our hypothesis that an analogous N1-Glu-5 interaction acts in filament assembly (6). Notably, the subunit shift associated with having this charge pair interaction in the pilus is consistent both with EM structures and with the defined helix displacement by the membrane-associated and lipid-stimulated ATPase family that acts in T4P assembly (52,53). Prior to their incorporation into pilus filaments, pilin subunits are anchored in the inner membrane via α1N. We propose that their charged Glu-5, and possibly N1, are exposed to the acyl phase of the lipid bilayer (54). During T4P assembly, Glu-5 of an incoming subunit would lie at the same level in the inner membrane as N1 of the terminal subunit in the growing pilus filament due to the staggered arrangement of the pilin subunits in the pilus filament and the requirement to extrude the pilus incrementally upon each subunit addition. This transient juxtaposition of these two complementary charges would stabilize the newly docked subunit long enough for it to be extruded out of the membrane a short distance.
Interestingly, the DnFimA crystallographic dimer shows an interaction proclivity for these two groups even when the subunits lie at the same level, as they would in their membrane-anchored state. However, we propose that the arrangement observed in the crystallographic dimer, which is stabilized by hydrophobic interactions between the α1N segments plus the N1-Glu-5 hydrogen bonds, is unlikely to occur between two membrane-solubilized pilins, because their α1Ns are already in a hydrophobic environment. Furthermore, charge repulsion would prevent the globular domains from associating in this orientation. In contrast, the relative orientation of the globular domains in the pilus filament favors stereochemical complementarity (6). Thus, the DnFimA crystallographic dimer helps to explain why pilin subunits only interact in the membrane at discrete coordinated steps during pilus assembly.
FIGURE 6. F. tularensis pilus model and sequence/structure comparison of Francisella Type IV pilins. A, the F. tularensis T4P model based on the GC pilus structure (6). The C-terminal β-strand, β5, which is absent in PilA, is circled. The model is colored as in Fig. 5A. B, sequence alignment of FtPilE, FtPilA, and F. novicida PilA. Identical and homologous residues are indicated by * and :, respectively. The FtPilE sequence is colored as in A, with the flexible β3-β4 loop colored blue. The secondary structure of FtPilE is shown above its sequence. FtPilE and FtPilA are 41% identical, and the two PilA proteins are 85% identical. The segment that differs most between the PilA proteins is boxed and corresponds to the αβ-loop of FtPilE.
The hypervariable regions of DnFimA are located on the protruding β-hairpin in the αβ-loop, the β1-β2 loop, and the C-terminal D-region. These regions are separated in sequence but form a contiguous surface on the folded protein, with the αβ-loop and the β1-β2 loop connected by the central disulfide bond (Fig. 7A). The hypervariable regions, which protrude from the exposed face of DnFimA in the pilus model, dominate pilus immunogenicity, leading to serogroup-specific protection in the fimbria-based vaccines (55,56). The DnFimA structure and filament model explain why DnFimA peptides and reduced and/or denatured fimbrial preparations are not useful for eliciting antibodies that recognize the assembled pilus (57)(58)(59). Nevertheless, peptide-based vaccines that encompass all three hypervariable regions from all serogroups may be broadly protective. The β-sheet and the large exposed β2-β3 loop are conserved among the D. nodosus serogroups and may provide a better target for peptide-based vaccines (Fig. 7). Notably, putative glycosylation sites located within and around the variable regions of FimA (60), which could complicate vaccine design, appear to be, for the most part, buried by subunit interactions in the pilus model.
We hypothesize that the unusual central disulfide bond in DnFimA robustly couples the N-terminal helix to pilus surface conformations. This conformational connection may provide pilus stability necessary for the specialized D. nodosus fimbria functions in twitching motility, natural transformation, and secretion (22). T4P from several species can perform these functions, but D. nodosus T4P may be the only pili that can perform all three functions. N. gonorrhoeae GC pili are involved in DNA uptake and twitching motility, but T4P-mediated secretion has not been demonstrated for these organisms. Conversely, the V. cholerae toxin coregulated pili are required for secretion of the soluble colonization factor TcpF (61), but twitching motility and natural transformation have not been observed. Pilus-mediated secretion is thought to occur via the same mechanism as that of the T2S system, using pilus extension to extrude substrates through the outer membrane secretin (62)(63)(64). The central disulfide may provide pilus stability necessary to withstand both pushing forces required to extrude substrate and pulling forces needed to take up DNA. This disulfide connection of the αβ-loop and β1-β2 loop of DnFimA may also allow conformational flexibility in the protruding β-hairpin of the αβ-loop during pilus assembly without destabilizing the subunit fold and interactions. Such flexibility is relevant to vaccine design because flexible protein regions are more readily recognized by anti-peptide antibodies (65) and may generally impact protein-antibody interactions (66).
Pilin and T4P architectures provide impressive mechanical strength (6,12,14) combined with flexibility to facilitate diverse pilus functions. The new DnFimA and FtPilE structures broaden our understanding of pilins and their roles as pilus building blocks. Both pilins possess sequence and structural variability in their αβ-loop as observed for other Type IV pilins, but we also note variability and flexibility in the β-strand loops. Moreover, with these new structures, a conserved D-region conformation has emerged. Early comparisons on fewer pilins highlighted differences in the D-region, in particular because of the unique hypervariable β-hairpin insert present in GC pilin (5,10). However, with multiple Type IVa pilin structures now available from N. gonorrhoeae, several P. aeruginosa variants, and these new D. nodosus and F. tularensis additions, we find that variability in this region is limited to the beginning of the D-region, immediately following β4, and to the most C-terminal segment of the pilin, beyond the second cysteine. The core of the D-region is in fact conserved in the Type IVa pilins. Strikingly, this D-region loop structure exists even in DnFimA, maintained by non-covalent interactions rather than the conserved disulfide bond. We propose that the D-region, like α1 and the β-sheet, acts as a stabilizing framework on which to build variable regions that define the pilus surface, its stereochemistry, and its interactions with environmental components, such as host receptors, mucosal membranes, and other pili (Fig. 7).
The D-region, whether stabilized by disulfides or by functionally equivalent non-covalent interactions, distinguishes Type IVa pilins from the major pseudopilins of the T2S system. Pseudopilins presumably polymerize to form molecular pistons to transport substrates from the periplasm to the extracellular space. Pseudopilins share N-terminal sequence homology with the Type IV pilins, and their globular domain α-β-roll folds are similar, with both protein classes having an αβ-loop and an anti-parallel β-sheet (67)(68)(69)(70)(71). However, the αβ-loop in the major pseudopilins is conserved in structure and has sequence motifs not present in the Type IV pilins (69). Moreover, the pseudopilins have shorter β-strands and lack β4, the C-terminal cysteines, and the D-region. Their C-terminal loop, which follows β3, varies in length, sequence, and structure among the pseudopilins, but this region is distinctly different from the C terminus of the Type IVa pilins. Most likely this reflects the distinct functional capabilities of these two filament types; whereas the T2S pseudopili function within the periplasm exclusively in secretion, the T4P are surface-exposed and have broader functions. Residues critical to T4P functions in immune escape, adhesion, and autoagglutination map to the D-region (61,72,73). Thus, the D-regions of the Type IV pilins support variable residues and flanking structures that define these functions. DnFimA, like the T2S pseudopilins, lacks a C-terminal disulfide but nonetheless bears all of the hallmarks of a Type IVa pilin. It has been suggested that some of the F. tularensis Type IV pilin genes may actually encode pseudopilins (33). The FtPilE structure, with its long β-strands, β4, and disulfide-bonded D-region and lack of conservation with the pseudopilin αβ-loops, validates this protein as a bona fide Type IVa pilin and not a pseudopilin. Its structural similarity to GC pilin and its ability to fit the GC pilus assembly suggest that it is also capable of forming a functional pilus (Fig. 6).
The pilE gene is only intact in the highly virulent F. tularensis type A strains, suggesting a possible role for this pilin in pathogenicity. Because pilA is intact in both A and B F. tularensis strains, but its disruption has a much greater impact on infectivity in the less virulent B strains, FtPilE may provide functional redundancy for PilA in the A strains. However, a comparison of the F. tularensis PilE and PilA sequences, which share 41% identity (Fig. 6B), shows that variable residues are located on the surface-exposed αβ-loop and on the protruding ridge formed by the β2-β3 loop and the D-region of FtPilE. Also, the surface-exposed C-terminal β-strand, β5 (circled in Fig. 6A), is absent in PilA. Thus, pili formed from these two pilins would have very different surfaces defining distinct functions. F. tularensis PilA is 85% identical to F. novicida PilA, with sequence differences occurring mainly on the exposed αβ-loop (Figs. 6B and 7). These distinct αβ-loops may define unique interaction partners, such as human versus mouse cell receptors. The FtPilE structure and pilus model thus provide a framework on which to genetically probe functional regions of these proteins.
The GC pilus EM reconstruction provides a powerful tool for modeling other Type IVa pili. Remarkably, the curved α-helices pack well as rigid bodies into the EM density for both GC pilin and DnFimA. We have attempted filament models for other T4P using idealized N-terminal α-helices, which are straight, but these resulted in major steric clashes. This reinforces the idea that the S-shape of α1 facilitates its packing in the filament core. The substantial sequence homology in α1N between GC pilin and other Type IVa pilins allows us to build filament models using N-terminally truncated pilin structures such as FtPilE as well as the full-length pilins. Thus, these objective computational models support a conserved architecture for all Type IVa pili. Notably, the Type IVb pilin, TcpA, from V. cholerae (PDB code 1OQV), did not pack as well into the GC pilus model as did the Type IVa pilins DnFimA and FtPilE. TcpA and other Type IVb pilins have substantially larger globular domains than the Type IVa pilins, and the helical symmetry of the V. cholerae TCP filament is slightly different from that of GC pili. The ability to visualize the pilins in the context of the pilus filament is enormously valuable for understanding pilus surfaces and their interactions with other molecules. In fact, within the assembled pilus, the three variable regions come together to flank the deepest groove in the pilus surface that exposes the N-terminal helix above its kink for both DnFimA and FtPilE (Fig. 7). The exposure of the flexible kink site, combined with the assembly of protruding variable regions to flank it, as characterized here, suggests that this groove may be an Achilles heel for T4P that must be exposed for function and that therefore merits attention for therapeutic intervention.
In general, the combined DnFimA and FtPilE structures show that a conserved Type IV pilin architecture persists in the absence of significant sequence homology. Our DnFimA structure and pilus assembly model reveal the distribution of hypervariable regions, explain limitations of previous vaccine formulations, and suggest new epitope targets for footrot vaccines. Our ultrahigh resolution FtPilE structure validates this protein as a true Type IV pilin and not a pseudopilin and provides a framework for modeling the other Type IV pilins and pili for this species. With these new pilin structures, a conserved D-region structure emerges in the absence of sequence homology and without a covalent disulfide bond. The combined DnFimA and FtPilE structures and pilus models suggest these pilins preserve features consistent with a conserved assembly architecture. Some of the conserved pilus features, such as grooves between subunits that expose the N-terminal α-helices, as well as variable exposed antigenic regions, may be relevant for vaccine and drug discovery.