Crystal Structure of the Minor Pilin CofB, the Initiator of CFA/III Pilus Assembly in Enterotoxigenic Escherichia coli

Background: The enterotoxigenic E. coli (ETEC) Type IVb pilus systems each possess a single minor pilin. Results: We show that these minor pilins are required for pilus assembly and report the x-ray crystal structure of CofB from the CFA/III pilus. Conclusion: ETEC minor pilins initiate pilus assembly. Significance: The CofB structure has implications for understanding assembly in more complex Type IV pilus systems. Type IV pili are extracellular polymers of the major pilin subunit. These subunits are held together in the pilus filament by hydrophobic interactions among their N-terminal α-helices, which also anchor the pilin subunits in the inner membrane prior to pilus assembly. Type IV pilus assembly involves a conserved group of proteins that span the envelope of Gram-negative bacteria. Among these is a set of minor pilins, so named because they share their hydrophobic N-terminal polymerization/membrane anchor segment with the major pilins but are much less abundant. Minor pilins influence pilus assembly and retraction, but their precise functions are not well defined. The Type IV pilus systems of enterotoxigenic Escherichia coli and Vibrio cholerae are among the simplest of Type IV pilus systems and possess only a single minor pilin. Here we show that the enterotoxigenic E. coli minor pilins CofB and LngB are required for assembly of their respective Type IV pili, CFA/III and Longus. Low levels of the minor pilins are optimal for pilus assembly, and CofB can be detected in the pilus fraction. We solved the 2.0 Å crystal structure of N-terminally truncated CofB, revealing a pilin-like protein with an extended C-terminal region composed of two discrete domains connected by flexible linkers. The C-terminal region is required for CofB to initiate pilus assembly. We propose a model for CofB-initiated pilus assembly with implications for understanding filament growth in more complex Type IV pilus systems as well as the related Type II secretion system.

Bacterial Type IV pili (T4P) are long thin filaments of the major pilin subunit. This small protein has an extended 53-amino acid α-helix, α1, of which the C-terminal half, α1C, is embedded in the globular C-terminal domain of the pilin, and the N-terminal half, α1N, forms a hydrophobic stalk that both anchors the subunit in the inner membrane prior to pilus assembly and holds the subunits together in the assembled pilus filament (1)(2)(3). Within the hydrophobic α1N, there is an invariant acidic residue, Glu5. The globular C-terminal domain has a central 4- or 5-stranded antiparallel β-sheet and a pair of disulfide-bonded cysteines. Vibrio cholerae and enterotoxigenic Escherichia coli (ETEC) produce Type IV pilins of the IVb (T4b) class, which are distinguished from those of the Type IVa (T4a) class by having longer signal peptides (25-30 amino acids), longer mature proteins (>200 amino acids), and a variable hydrophobic amino acid at their mature N-terminal position (4). T4b pilins also have longer D-regions, which lie between the disulfide-bonded cysteines in the C-terminal region of the globular domain. In contrast, the T4a pilins of Neisseria gonorrhoeae, Neisseria meningitidis, and Pseudomonas aeruginosa have a 6-8-amino acid signal peptide, a ~150-amino acid mature pilin, and an N-terminal phenylalanine. The structures of the T4a and T4b pilins differ primarily in the αβ-loop that connects α1 with the central β-sheet, in the D-region, and in the connectivity of the β-sheet itself. The T4b pilus machinery is much simpler, requiring fewer than a dozen proteins all encoded on the same gene cluster, whereas the T4a assembly machinery utilizes 40 or more proteins encoded on genes distributed throughout the genome (5).
Despite these differences, all Type IV pilins share the canonical ladle-shaped pilin structure and helical arrangement within the pilus filament in which the N-terminal α-helices associate to form a hydrophobic core, and subunits are related by an axial rise of 8-10 Å and an azimuthal rotation of ~100° (6,7). Importantly, the conserved Glu5, which is critical for efficient pilus assembly (7)(8)(9)(10)(11), is positioned in this hydrophobic core to neutralize the positively charged N-terminal amino group of the neighboring pilin subunit (6,7).
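The helical relationship described above (a fixed axial rise and azimuthal rotation per subunit) can be illustrated with a minimal Python sketch. The rise of 9 Å is simply the midpoint of the 8-10 Å range quoted in the text, and the 10 Å radius is a hypothetical placeholder chosen only so that coordinates can be generated; neither value is taken from a specific pilus model.

```python
import math

def helix_positions(n_subunits, rise=9.0, rotation_deg=100.0, radius=10.0):
    """Place n subunits on a helix; each is shifted by `rise` (Angstrom)
    along z and rotated by `rotation_deg` about z relative to the previous
    one. `radius` is an illustrative assumption, not a measured value."""
    positions = []
    for i in range(n_subunits):
        angle = math.radians(i * rotation_deg)
        positions.append((radius * math.cos(angle),
                          radius * math.sin(angle),
                          i * rise))
    return positions

def subunits_per_turn(rotation_deg=100.0):
    """Number of subunits needed to complete one 360-degree turn."""
    return 360.0 / rotation_deg

def pitch(rise=9.0, rotation_deg=100.0):
    """Axial distance covered by one full helical turn (Angstrom)."""
    return rise * subunits_per_turn(rotation_deg)
```

With these assumed parameters, a rotation of 100° corresponds to 3.6 subunits per turn, and the pitch scales linearly with whichever rise in the 8-10 Å range is used.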
T4P assembly occurs in the inner membrane where the pilin subunits are anchored via their hydrophobic N-terminal α-helix, α1N. Subunits are thought to add to the growing pilus at its base, with the filament growing through the periplasm and across the outer membrane via the secretin channel. Pilus assembly requires a core assembly machinery composed of the major pilin subunit, a prepilin peptidase that removes the signal peptide and adds a methyl group to the N-terminal amine of residue 1 (12)(13)(14), a cytoplasmic assembly ATPase that powers the addition of each subunit to the growing pilus (15)(16)(17), an inner membrane core protein (sometimes called the platform protein) of unknown function (18-20), and an outer membrane secretin channel (21)(22)(23)(24)(25). This core assembly machinery is also conserved in the bacterial Type II secretion (T2S) system, which polymerizes "pseudopilin" subunits into a periplasmic "pseudopilus" that extrudes protein substrates across the outer membrane without itself forming an extracellular filament (26,27). The genes encoding the T2S machinery, like those of the T4b pilus systems, are located on a single operon (28).
Most Type IVa pili, as well as the bundle-forming pili of the Type IVb class, also utilize a second "retraction" ATPase that catalyzes filament disassembly. Retraction is necessary for key T4P functions, such as twitching motility, DNA uptake, phage transduction, and bacterial dissemination (29-35). No such retraction ATPase has been identified for V. cholerae and ETEC Type IVb pili, and these pili have not been shown to mediate twitching motility or DNA uptake. The T2S systems also lack a retraction ATPase. T2S is thought to occur via a piston-like movement of the pseudopilus (36-38), a mechanism that is apparently independent of a retraction ATPase. The T4P and T2S systems are functionally related as several T4P systems have secretory functions. The V. cholerae toxin coregulated pilus (TCP) apparatus secretes a protein, TcpF, which is required for colonization of the infant mouse (39,40), and CFA/III secretes CofJ (41).
In addition to the major pilin, which is the structural unit for the Type IV pilus filament, all Type IV pilus systems possess one or more minor pilins, which share the N-terminal α-helix with the major pilins but are expressed at much lower levels. The V. cholerae TCP and ETEC CFA/III and Longus pilus systems of the T4b class each possess a single minor pilin encoded on the pilus operon immediately following the major pilin gene. In contrast, the bundle-forming pilus T4b system has three minor pilins that are encoded at the 3′ end of the bfp operon. This arrangement is also seen for the T2S systems, which have four minor pilins (26), and for the Type IV pilus systems of some Gram-positive Clostridia species (3). The more complex T4a pilus systems also have multiple minor pilins that are typically encoded on their own gene clusters. Most minor (pseudo)pilins are similar in size to their respective major (pseudo)pilins and possess a position 5 glutamate. However, within each T4a pilus system, one of the minor pilins has a hydrophobic residue instead of glutamate at position 5, such as P. aeruginosa PilX and N. meningitidis PilK. This is also observed in the T2S system, but the minor pseudopilins lacking Glu5 are typically much larger than their respective minor and major pseudopilins and are classified as GspK family members (42). These include ETEC GspK, Klebsiella oxytoca PulK, and P. aeruginosa XcpX. The sole minor pilins of the V. cholerae and ETEC T4b pilus systems, TcpB and CofB, respectively, have Glu5 but, like the T2S GspK proteins, are substantially larger than their respective major pilins.
Minor (pseudo)pilins are involved in pilus assembly and functions, but their precise mechanisms have been challenging to identify due to their multiplicity and functional redundancy and, in the case of the T4a pili, the presence of retraction ATPases. The minor pilins of N. gonorrhoeae and P. aeruginosa are required for wild type levels of T4P assembly, but assembly can proceed at reduced levels in minor pilin mutants that also lack the retraction ATPase (43)(44)(45)(46). In addition to their role in pilus assembly, minor pilins act in adhering to and signaling of host cells (47)(48)(49), autoaggregation (50), and swarming motility (51).
Several minor (pseudo)pilin structures have been solved, all lacking their N-terminal α1N segment (52)(53)(54)(55)(56)(57)(58). Most of these resemble the structures of their respective major (pseudo)pilins, suggesting that they can incorporate into the pilus, but some possess additional features. The N. meningitidis T4a minor pilin PilX is similar in structure to the N. gonorrhoeae major pilin, PilE, but with a 2-turn α-helix instead of a β-hairpin in the D-region (54). This feature is expected to be exposed on the pilus surface, and immunogold transmission EM demonstrated a low level of PilX incorporation into N. meningitidis T4a pili. PilX is involved in pilus-mediated autoaggregation and adhesion to host cells, and these functions require the 2-turn α-helix (50,54). The P. aeruginosa minor pilins PilV, PilW, and PilX along with a non-pilin protein, PilY, are proposed to form a priming complex that is connected via minor pilins PilE and FimU to the major pilin subunits in the pilus shaft (44,46). PilE and FimU have pilin-like structures, but FimU has a second β-sheet within the αβ-loop (46).
A ternary crystal structure of the ETEC T2S minor pseudopilins GspI, GspJ, and GspK provides critical insight into the role of these proteins in pseudopilus assembly (55). Whereas GspI is similar in structure to the major pseudopilin GspG, GspJ is larger, with two β-sheets in the globular domain, similar to P. aeruginosa FimU. GspK is the largest of the three, having a discrete pilin domain with an N-terminal α-helix and a β-sheet, but with a large α-helical domain inserted between strands β1 and β2. These GspI-GspJ-GspK minor pilins are staggered with respect to one another within the ternary complex (55), similar to the helical arrangement of the major pilins in the N. gonorrhoeae Type IVa pilus (6). This GspI-GspJ-GspK complex is predicted to cap the pseudopilus, with GspK located at the tip. Indeed, the globular domain of GspK is likely too large to fit anywhere but at the tip of the pilus and may serve as a steric block to prevent the pseudopilus from growing across the outer membrane secretin. Consistent with this tip localization, the minor pseudopilins of the K. oxytoca Pul T2S system are required for efficient pseudopilus assembly (59). The Type IVa and T2S minor (pseudo)pilins are interchangeable to some degree, because E. coli K12 minor pseudopilins initiate K. oxytoca pseudopilus assembly but not secretion (60), and P. aeruginosa minor pseudopilins restore low levels of T4a pilus assembly in a minor pilin deletion strain when the retraction ATPase is absent (46).
The complexity of the T2S and T4a pilus systems, with multiple minor (pseudo)pilins and, in the case of the T4a pili, a retraction ATPase, makes it challenging to decipher the molecular mechanism by which the minor pilins influence pilus assembly and functions. The V. cholerae and ETEC T4b pilus systems represent comparatively simple systems with only a single minor pilin and no retraction ATPase. We report here a high resolution x-ray crystal structure of N-terminally truncated CofB, the sole minor pilin from the ETEC CFA/III pilus system, and show that CofB mediates CFA/III pilus assembly.
Deletion of Minor Pilin Genes-The cofB gene was disrupted in pcof using the λ-RED recombinase system (62) used previously to generate the pcofΔcofA construct (41). The central portion of the cofB gene (residues 45-462) was targeted for deletion using the CofB-RED-For/-Rev primers to PCR-amplify the kan cassette in plasmid pKD4 using Q5 DNA polymerase. The amplified kan cassette is flanked by a short stretch of nucleotides that correspond to the sequence flanking the targeted deletion site. PCR products were purified using the QIAquick PCR purification kit (Qiagen). E. coli DH5α-pcof cells were transformed with pKD46 (ApR) encoding the λ-RED recombinase. Electrocompetent DH5α-pcof-pKD46 cells were prepared and transformed with the kan cassette PCR product. Cells with successful integration of the kan cassette were selected by plating on LB-Cm-Km and incubated at 30°C and further screened for sensitivity to ampicillin corresponding to the loss of pKD46. ApS, KmR, and CmR colonies were screened by PCR to confirm insertion of the kan cassette, and pcof-cofB::Km was purified by plasmid miniprep (Qiagen). This plasmid was transformed into competent E. coli SW105 cells expressing the FLP recombinase, which recognizes and cleaves the FRT sequence flanking the kan cassette. Expression of the FLP recombinase was induced with arabinose, and colonies were screened on agar plates with and without Km for loss of the kan cassette. KmS clones were screened for deletion of the cofB gene fragment by PCR using primers CofB-F-KpnI/-R-XbaI for the 5′ and 3′ ends of cofB and confirmed by DNA sequencing. A positive pcofΔcofB plasmid was amplified in E. coli DH5α, purified, and transformed into electrocompetent E. coli MC4100 cells.
lngA and lngB genes were disrupted in ETEC E9034A using the λ-RED recombinase system (62,63). The CmR-encoding cat cassette was PCR-amplified from pKD3 with primer sets LngA-RED-For/-Rev and LngB-RED-For/-Rev to replace the lngA gene fragment (encoding residues 57-155) and the lngB gene fragment (encoding residues 80-381), respectively. The cassettes were transformed into electrocompetent ETEC E9034A-pKD46 cells expressing λ-RED recombinase, and clones were screened for resistance to Cm and sensitivity to Ap. Positive clones were made electrocompetent and transformed with pCP20, which encodes the genes for the FLP recombinase. After excision of the cat cassette by FLP recombinase, positive clones were screened for sensitivity to Cm and PCR-amplified with LngA-For-KpnI/-Rev-HindIII or LngB-For-KpnI/-Rev-HindIII to confirm gene disruption. The gene deletions were further confirmed by DNA sequencing.
Assessing CofB Expression and CFA/III Assembly in ETEC and MC4100-pcof Strains-ETEC and E. coli MC4100-pcof cells were grown overnight under CFA/III-inducing conditions on CFA agar plates. Cells were overlaid with 5 ml of phosphate-buffered saline (PBS; 10 mM Na2HPO4, 2 mM KH2PO4, 137 mM NaCl, 2.7 mM KCl) with gentle agitation on a rocking unit for 15 min. Cells were gently washed off the plates, and A600 was measured. Cell suspensions were normalized to an A600 of 0.1; this mixture constitutes the whole cell culture (WCC) fraction used to assess total protein levels. Cells were removed by centrifugation at 3000 × g for 10 min, and the supernatant was further filtered through a 0.22-μm syringe drive filter. This filtered supernatant (Sup) fraction was used to assess CofA in the shed CFA/III pili. Samples were mixed with Laemmli sample buffer (60 mM Tris, pH 6.8, 5% 2-mercaptoethanol, 2% SDS, 10% glycerol, 0.02% bromphenol blue) and boiled for 10 min prior to being loaded onto 15% SDS-polyacrylamide gels. Proteins were transferred onto polyvinylidene difluoride (PVDF) membrane (Bio-Rad) for immunoblotting and detected by rabbit polyclonal antisera raised against N-terminally truncated CofA (64) or CofB (N-terminal peptide 61-74 and C-terminal peptide 403-419; Pacific Immunology). Goat anti-rabbit secondary antibodies conjugated to horseradish peroxidase (Jackson ImmunoResearch) were used to detect primary antibody. Immunoblots were visualized by enhanced chemiluminescence (ECL) with the SuperSignal West Pico chemiluminescent substrate (Thermo Scientific) for CofA or the SuperSignal West Femto chemiluminescent substrate (Thermo Scientific) for CofB. A Fujifilm LAS4000 imager (FujiFilm) was used to capture images of the immunoblots.
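The normalization of cell suspensions to a common A600 amounts to a simple dilution calculation. The Python sketch below shows one way to compute it, assuming absorbance scales linearly with cell density over the measured range; the function name and the 1-ml final volume are hypothetical choices for illustration, not details from the protocol.

```python
def dilution_volumes(measured_a600, target_a600=0.1, final_volume_ml=1.0):
    """Return (suspension_ml, diluent_ml) needed to bring a cell suspension
    with absorbance `measured_a600` to `target_a600` in `final_volume_ml`.
    Assumes A600 is proportional to cell density (C1*V1 = C2*V2)."""
    if measured_a600 < target_a600:
        raise ValueError("suspension is already more dilute than the target")
    suspension_ml = final_volume_ml * target_a600 / measured_a600
    return suspension_ml, final_volume_ml - suspension_ml
```

For example, a suspension measured at an A600 of 0.5 would be diluted 1:5, that is, 0.2 ml of cells made up to 1 ml with PBS.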
Assessing LngB Expression and Longus Assembly in ETEC E9034A-ETEC E9034A cells were grown overnight on LB agar plates at 37°C. Single colonies were inoculated in 2 ml of Terrific Broth (1.2% tryptone, 2.4% yeast extract, 1.7 mM KH2PO4, 7.2 mM K2HPO4, pH 7.2) and grown for 4 h in an upright position on a shaking incubator at 250 rpm and 37°C. Cells were inoculated 1:100 (v/v) in 2 ml of Terrific Broth and grown for a further 2 h. The WCC fraction was used to assess total protein levels. Cells were removed by centrifugation at 3000 × g for 10 min, and the Sup was used to assess LngA within the shed Longus pili. Samples were mixed with Laemmli sample buffer and boiled for 10 min prior to being loaded onto 15% SDS-polyacrylamide gels. Proteins were transferred onto PVDF membrane (Bio-Rad) for immunoblotting. Due to their close sequence similarity, LngA and LngB could be detected using the anti-CofA and -CofB (peptide 61-74) antibodies, respectively. Blots were developed as described for CofA and CofB.
Cloning, Expression, and Purification of N-terminally Truncated CofB-The gene fragment encoding CofB residues 25-518 was PCR-amplified from ETEC 31-10 genomic DNA using primers Ec-cofB-fpcr and Ec-cofB-rpcr and cloned into expression vector pET-15b (Novagen) at the NdeI and BamHI sites to provide an N-terminal His6 tag for purification by nickel-Sepharose column chromatography (GE Healthcare). CofB was expressed in E. coli Origami(DE3) cells (Novagen). Cells were grown to an A600 of 0.1-0.2 in LB-Ap broth at 37°C. CofB expression was induced with 0.2 mM isopropyl-β-D-1-thiogalactopyranoside, and cells were grown for a further 20 h at 14°C. Cells were pelleted by centrifugation for 30 min at 5000 × g at 4°C and subjected to two cycles of flash-freezing in liquid nitrogen followed by thawing in a water bath to partially lyse the cells. Cells were resuspended in lysis buffer containing 50 mM NaH2PO4/Na2HPO4 (pH 7.4), 100 mM NaCl, and EDTA-free protease inhibitor mixture (Roche Applied Science). The suspension was incubated at room temperature with lysozyme for 1 h with gentle stirring, and then cells were lysed by sonication. Cell debris was removed by centrifugation for 60 min at 40,000 × g at 4°C. The supernatant was filtered using a 0.4-μm filter and then loaded onto a HisTrap column (GE Healthcare) pre-equilibrated with buffer A (50 mM NaH2PO4/Na2HPO4, pH 7.4, 30 mM imidazole, pH 7.4, 500 mM NaCl). The column was washed with buffer A, and CofB was eluted with buffer B (20 mM Tris-HCl, pH 7.4, 100 mM NaCl, 250 mM imidazole). Fractions containing CofB were pooled and dialyzed against buffer C (20 mM Tris-HCl, pH 7.4, 100 mM NaCl, 1 mM EDTA, 1 mM EGTA). The His6 tag was removed by thrombin cleavage, and CofB was further purified on a Sephacryl S-100 HR size exclusion column (GE Healthcare) pre-equilibrated with buffer C.
Fractions were shown by SDS-PAGE to be greater than 95% pure, and peak fractions were combined and concentrated to 15 mg/ml using an Amicon stirred cell concentrator with a 10,000-Da molecular mass cut-off filter (Millipore). Protein was flash-frozen in liquid nitrogen and stored at −80°C.
Crystallization of CofB-Initial CofB crystallization conditions were obtained from the high throughput screening laboratory at the Hauptman-Woodward Medical Research Institute (66). Both native and SeMet CofB crystals were grown by hanging drop vapor diffusion at 20°C. Native CofB crystals were grown in the presence of 100 mM MES, pH 5.6, and 1.6 M ammonium sulfate. SeMet CofB crystals were obtained in 100 mM HEPES, pH 7.5, 700 mM NaH2PO4, 800 mM KH2PO4. All crystals were cryocooled in mother liquor containing 30% glycerol and stored in liquid nitrogen for x-ray diffraction data collection at the Stanford Synchrotron Radiation Lightsource (SSRL).
Collection and Processing of CofB X-ray Diffraction Data-A diffraction data set for a native CofB crystal was collected on Beamline 14-1 at 100 K. The crystal belongs to the rhombohedral system, as determined by Web-Ice (67). The raw data set was processed using XDS (94). The space group was determined by POINTLESS in the CCP4 suite (68), and the data set was scaled to 2 Å resolution by AIMLESS (68). SeMet CofB crystals were tested on SSRL Beamline 7-1, and two-wavelength multiple anomalous diffraction (MAD) data were collected after deciding on wavelengths, inflection point, and high energy remote, based on the x-ray fluorescence scan output. The MAD data set was processed by iMOSFLM (69) and scaled by SCALA (70) in the CCP4 suite (68).
Structure Determination of CofB-The SeMet-CofB structure was solved by MAD phasing using SOLVE/RESOLVE (71,72). SOLVE located seven selenium atoms and determined the initial phases of the structure. Phases were improved by density modification procedures by RESOLVE. This 3 Å SeMet-CofB structure was used as a model to solve the 2 Å native CofB structure by molecular replacement. The initial model was built by ARP/wARP (73), examined in COOT (74), and further refined using REFMAC5 (75). COOT was used to locate water oxygens, glycerol, and sulfate ions from the difference map as well as the composite annealed omit map, calculated by CNS (76,77). Final TLS and restrained refinement of the structure with water oxygens and ligands brought Rcryst and Rfree values to 20.5 and 22.9%, respectively. The final structure validation was performed using COOT (74) and MolProbity (78).
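For readers unfamiliar with the R values quoted above, the following Python sketch shows the conventional crystallographic R factor, R = Σ| |Fobs| − k|Fcalc| | / Σ|Fobs|. It is a generic illustration of the statistic, not a reproduction of the REFMAC5 calculation; the function name, the default scale factor k = 1, and the toy amplitudes in the usage note are assumptions. Rcryst applies this formula to the reflections used in refinement, and Rfree applies it to a held-out test set.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R factor over paired structure-factor amplitudes.
    `scale` is the factor k placing F_calc on the scale of F_obs;
    k = 1.0 is an assumption made for this illustration."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator
```

As a toy example, observed amplitudes of 10, 20, and 30 against calculated amplitudes of 9, 22, and 30 give R = 3/60 = 0.05, that is, 5%.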

Results
The ETEC Minor Pilin CofB Is Required for CFA/III Assembly-The ETEC minor pilin CofB is encoded on the cof operon immediately downstream of the gene encoding the major pilin, CofA, the structural subunit for the CFA/III Type IVb pilus (64,79,80). CofB is a 518-amino acid protein with a predicted 5-amino acid signal peptide (Fig. 1, A and B). The CofB signal peptide is much shorter than the 30-amino acid signal peptide of CofA, which is probably processed by the prepilin peptidase CofP encoded on the cof operon. The N-terminal 26-amino acid region of the mature CofB protein shares sequence similarity with CofA (61%), including the conserved Glu5 (Fig. 1A). This region corresponds to α1N, the inner membrane anchor and polymerization domain of the major pilins. Beyond the N terminus, CofB has no sequence homology with CofA. CofB is more than twice the length of the 208-amino acid CofA. CofB is, however, identical in length and highly similar in sequence to LngB, the minor pilin for the ETEC Longus pilus, with 78% sequence identity between the two proteins, including 8 cysteines (Fig. 1B). Both proteins have a ~30-residue tandem repeat (Fig. 1C).
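Percent identity figures such as those quoted above are computed from a pairwise alignment. The Python sketch below shows one common convention, counting identical residues over aligned, non-gap columns. The alignment program and the denominator used in this study are not stated, so this is only an illustration, and the short sequences in the usage note are hypothetical strings rather than CofB or LngB fragments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.
    Other conventions divide by alignment length or by the shorter
    sequence length, which gives different values."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For the toy alignment "MKT-LL" against "MKSALL", five columns are compared and four match, giving 80% identity.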
The role of CofB in CFA/III assembly was examined using a heterologous E. coli CFA/III expression system, which allows manipulation of genes within the cof operon. In ETEC strain 31-10, the cof operon is located on a 55-kb virulence plasmid (81) that is unstable and not amenable to genetic manipulation (41). The cof operon was cloned into plasmid pACYC184, producing pcof, which was transformed into several E. coli expression strains (DH5α, HB101, and MC4100 (41)). These strains express CFA/III and secrete the soluble protein CofJ, also encoded on the cof operon, in a pilus-dependent manner. To test the role of CofB in CFA/III pilus assembly, the cofB gene in pcof was deleted using the λ-RED recombinase method (62), and pcofΔcofB was transformed into E. coli MC4100. Cells were grown on CFA agar plates (61) and then washed off of the plates with PBS, and their numbers were normalized. This mixture, referred to as the WCC fraction, was analyzed by SDS-PAGE and immunoblotting using an anti-CofA antibody (64) to determine total CofA levels relative to the "wild type" MC4100-pcof strain. Additional controls included this strain lacking the major pilin, CofA (MC4100-pcofΔcofA), wild type ETEC 31-10 (82), and ETEC 31-10P, which lacks the 55-kb virulence plasmid (81). To determine CFA/III pilus assembly levels, cells were removed from the WCC by centrifugation and filtration, and the amount of CofA in the Sup fraction was analyzed by SDS-PAGE and immunoblotting. Total CofA levels in the WCC fraction of MC4100-pcofΔcofB were comparable with those of the positive controls: ETEC strain 31-10 and MC4100-pcof (Fig. 2A). As expected, no CofA was present in negative control strains ETEC 31-10P and MC4100-pcofΔcofA. However, whereas CofA levels in the Sup fraction, representing CFA/III pili, were comparable for ETEC 31-10 and MC4100-pcof, no CofA was present in this fraction for MC4100-pcofΔcofB, suggesting that CofB is required for pilus assembly.
FIGURE 1. CofB amino acid sequence and alignment with CofA and LngB. A, alignment of the signal peptides (shaded gray) and N-terminal 30 residues of the minor pilin CofB and the major pilin CofA from the ETEC CFA/III pilus system. B, sequence alignment of CofB (GenBank™ accession number BAB62898) with the minor pilin LngB (ABU50041) from the ETEC Longus pilus system. Residues 25-518 were expressed recombinantly for crystallization, as indicated by the arrow at residue 25. The secondary structure elements are indicated above the CofB sequence, based on its crystal structure, shown in Fig. 4. The pilin domain is shown with a blue background, the β-repeat domain is orange, and the β-sandwich domain is green. The disulfide bond connectivity is indicated. C, alignment of the CofB tandem repeats 1 and 2, which form the β-repeat domain.
To confirm that the loss of CFA/III in MC4100-pcofΔcofB is due to disruption of the cofB gene and not a downstream effect on the cof operon, cofB was cloned into expression vector pJMA10.1. The resulting plasmid, pcofB, was transformed into MC4100-pcofΔcofB. Pilus assembly was rescued to approximately wild type levels when no rhamnose was added, presumably due to low level CofB expression from a leaky promoter (Fig. 2A). However, induction with rhamnose, even at very low levels (0.001%), resulted in reduced levels of CofA in both the WCC and Sup fractions, indicating that pilus assembly is optimal with very low levels of CofB expression. This observation cannot be explained by competition between CofA and CofB for the signal peptidase because we saw no accumulation of unprocessed CofA under these conditions. Instead, the reduced CofA levels observed upon overexpression of CofB suggest a feedback mechanism that limits the total amount of pilin in the inner membrane.
To verify CofB expression and test its cellular localization, WCC and Sup fractions were blotted with antibody against an N-terminal CofB peptide (residues 61-74). CofB was not detected in ETEC 31-10 or MC4100-pcof WCC using the SuperSignal West Pico chemiluminescent substrate (Thermo Scientific) used for detection of the major pilin, CofA, but a faint band at ~57 kDa, corresponding to CofB, was observed for ETEC 31-10 and MC4100-pcof WCC when the more sensitive Femto substrate was used, consistent with CofB being expressed at very low levels in the wild type strains (Fig. 2B, top). As expected, the CofB band was absent in MC4100-pcofΔcofB WCC, but CofB was detected in uninduced MC4100-pcofΔcofB+pcofB WCC, confirming that it is produced under these conditions and at a level that is sufficient for pilus assembly (Fig. 2A). No CofB was detected in the Sup fraction using the N-terminal antibody against CofB peptide 61-74 (Fig. 2B, middle), but an antibody against a C-terminal peptide (residues 403-419) detected CofB in the Sup fraction of MC4100-pcof and MC4100-pcofΔcofB+pcofB samples using the Femto substrate kit with a long (60-s) exposure time (Fig. 2B, bottom), suggesting that CofB is incorporated into the pili but at very low levels. Significant proteolysis of CofB was observed in the WCC when the minor pilin was expressed at high levels (Fig. 2B, top). The stable proteolytic fragments are N-terminal because they are detected by antibodies specific for the N-terminal peptide. The most abundant fragment has a mass of ~29 kDa, which corresponds to a CofB fragment spanning residues 1 to ~260. This fragment is also present in ETEC 31-10 WCC.
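The correspondence between gel band masses and residue counts in the paragraph above can be checked with a back-of-the-envelope calculation. The Python sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue. That average is a general approximation and an assumption here, not a value from this study; an exact mass would require the actual sequence.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an approximation

def approx_mass_kda(n_residues):
    """Rough protein mass in kDa estimated from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0

def approx_residues(mass_kda):
    """Rough residue count estimated from a mass in kDa."""
    return round(mass_kda * 1000.0 / AVERAGE_RESIDUE_MASS_DA)
```

Under this approximation a 518-residue chain comes to about 57 kDa, and a 29-kDa fragment to about 264 residues, in line with the full-length band and the fragment spanning residues 1 to ~260.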
The ETEC Minor Pilin LngB Is Required for Pilus Assembly-To examine the role of the ETEC minor pilins in a native T4b pilus system, we turned to ETEC strain E9034A, which produces the Longus T4b pilus (83,84). We deleted the minor pilin gene lngB from the E9034A chromosome using the λ-RED recombinase system (62). As with the MC4100-pcofΔcofB strain, pilus assembly was abrogated in ETEC E9034A-ΔlngB, comparable with that of the ΔlngA mutant, and assembly was rescued when LngB was expressed ectopically (Fig. 3A). LngB expression from plngB in ETEC E9034A-ΔlngB does not appear to be as high as CofB expression from pcofB in MC4100-pcofΔcofB and requires a higher level of induction (0.001-0.1% rhamnose) to restore wild type Longus levels. Consistent with lower LngB levels, no inhibitory effect on LngA expression/stability or Longus assembly was observed when LngB was overexpressed, in contrast to the CFA/III system.
We were unable to detect LngB in the pilus (Sup) fraction even when LngB was overexpressed and the ultrasensitive Femto chemiluminescent substrate was used to develop the blot (Fig. 3B). Given that our results with the CFA/III pilus system showed that pilus assembly was most efficient when only low levels of CofB were produced, we tested whether increasing the level of the major pilin, LngA, might disrupt pilus assembly by altering the major/minor pilin stoichiometry. However, we found that overexpressing LngA in ETEC E9034A-ΔlngA had no effect on pilus assembly, probably because the excess LngA was not processed by the prepilin peptidase (Fig. 3A). These results parallel those shown for the MC4100 heterologous CFA/III expression system and confirm that the ETEC minor pilins are necessary for T4b pilus assembly.
Crystal Structure of ETEC CofB-To understand how CofB might initiate pilus assembly, we solved its x-ray crystal structure. We expressed a recombinant form of CofB (CofB(25-518)) lacking its hydrophobic N-terminal 24 residues corresponding to the protruding half of α1, α1N, of the major pilins (4,6,85,86). Both native and SeMet-CofB proteins were expressed and purified, and crystals were grown. A 3 Å resolution x-ray crystal structure was solved for SeMet-CofB by MAD methods, and this structure was used as a model for molecular replacement to solve a 2 Å structure of the native N-terminally truncated CofB. Data collection and refinement statistics are shown in Table 2. Crystallization of CofB has also been reported elsewhere (87). In that study, CofB (residues 29-518) was expressed using a different plasmid (pTT240), purified using an additional anion exchange step, and crystallized under different conditions (sodium formate, sodium acetate), but the space group, unit cell dimensions, and resolution are essentially the same as those reported here.
TABLE 2 footnotes. Values in parentheses represent the highest resolution shell. Rfree is the cross-validation R factor for 5% of the reflections against which the model was not refined.
CofB has an unusual structure for a minor pilin; the N-terminal half of the protein, residues 25-259, is a pilin domain, and the C-terminal half, residues 260-518, is extended with two discrete domains: a central β-repeat domain (so-named because it has two small β-sheets, each corresponding to one of the tandem repeat sequences (Fig. 1C)) and a C-terminal elongated β-sandwich domain (Fig. 4). Linker segments connect the pilin domain to the first β-repeat and the second β-repeat to the β-sandwich domain. The C-terminal region extends from the "inside" face of the pilin domain that in the major pilins faces into the core of the assembled pilus. The pilin domain has the canonical T4b pilin fold seen in the major pilins, CofA (64) and TcpA (88,89), with an N-terminal α-helical backbone, α1C, and a 5-stranded anti-parallel β-sheet, in which strand β5 forms the central β-strand. This β-sheet lies against α1C and two shorter α-helices, α3 and α4, all of which are located on the inside face of the pilin domain (64). α3 is situated between strands β2 and β3, and α4 is between β4 and β5.
The CofB pilin domain, which terminates at β5, is connected to the β-repeat domain via linker 1, a 10-amino acid segment that extends from the inside face of the pilin domain adjacent to the C-terminal end of α1C (Fig. 4, A and B). The β-repeat domain has two three-stranded anti-parallel β-sheets whose planes lie at ~90° to each other and are connected by a tight lysine-rich 9-residue loop. Each β-sheet has a disulfide bond connecting the end of its first strand and the beginning of the second strand (Cys273-Cys284, Cys310-Cys321). A 13-residue loop connects the terminal β-strand of the second β-repeat with the first β-strand of a β-hairpin in the C-terminal β-sandwich domain, but the linker between these domains, linker 2, is defined by only 2 amino acids, Leu351 and Ala352, because all other residues in this loop are tightly associated with their respective domains. The short β-hairpin forms the proximal end of the β-sandwich domain, which has a twisted 6-stranded anti-parallel β-sandwich. The strands in the β-sandwich are labeled sequentially in Fig. 4, A and B, with βA-βG-βC forming the inner β-sheet, which is proximal to the β-repeats, and strand βF/B-βE-βD forming the outer or distal β-sheet. Large intersheet loops between strands βA and βB and strands βC and βD are splayed on either side at the distal end of the β-sandwich. Another loop between strands βB and βC has a disulfide bond to the end of βG near the C terminus of the protein (Cys414-Cys512). The pilin domain, β-repeat, β-sandwich domains, and linkers together measure 130 Å in length, whereas the pilin domain itself is 60 Å along its long axis parallel to α1C. Apart from the linkers, no interactions connect the three domains (Fig. 4C), implying conformational flexibility.
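The domain boundaries described above can be collected into a small residue map, which makes it easy to check segment lengths and to confirm that each reported disulfide bond lies within a single domain. The following Python sketch is illustrative only. The names `SEGMENTS`, `DISULFIDES`, `segment_of`, and `segment_lengths` are hypothetical, and the exact limits of linker 1 (residues 260-269) and of the β-repeat domain (residues 270-350) are assumptions inferred from the stated 10-residue linker length and the position of linker 2 (Leu351, Ala352); they are not given explicitly in the text.

```python
# Residue ranges (inclusive) for the crystallized construct CofB(25-518).
# The pilin domain (25-259) and linker 2 (351-352) are stated in the text;
# the limits of linker 1 and the beta-repeat domain are ASSUMED.
SEGMENTS = [
    ("pilin domain", 25, 259),
    ("linker 1", 260, 269),            # assumed: 10 residues after the pilin domain
    ("beta-repeat domain", 270, 350),  # assumed limits
    ("linker 2", 351, 352),            # Leu351, Ala352
    ("beta-sandwich domain", 353, 518),
]

# Disulfide bonds reported for the C-terminal region.
DISULFIDES = [(273, 284), (310, 321), (414, 512)]


def segment_of(residue):
    """Return the name of the segment containing a residue number."""
    for name, start, end in SEGMENTS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside CofB(25-518)")


def segment_lengths():
    """Return a dict mapping each segment name to its residue count."""
    return {name: end - start + 1 for name, start, end in SEGMENTS}


if __name__ == "__main__":
    for name, length in segment_lengths().items():
        print(f"{name}: {length} residues")
    for a, b in DISULFIDES:
        print(f"Cys{a}-Cys{b}: {segment_of(a)} / {segment_of(b)}")
```

Under these assumptions, the pilin domain accounts for 235 of the 494 residues in the construct, in line with its description as the N-terminal half of the protein, and each of the three disulfide bonds pairs cysteines within the same domain.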
Comparison of CofB with CofA and GspK-The pilin domain of CofB shares the same core structure as the major pilin, CofA, with the α1C backbone, antiparallel β-sheet with the central β5 strand, and helices α3 and α4 (root mean square deviation for CofA and CofB(25-259) backbone atoms, 4.4 Å) (Fig. 5, A and B). However, in CofB, the αβ-loop that connects α1C to the β-sheet is longer (93 residues) and bulkier than in CofA. From α1C of CofB, a β-hairpin extends away from the top of the pilin domain, followed by a meandering loop that crosses the top front of this domain and then winds back to form three sequential single-turn helices, 1, 2, and α2. A disulfide bond between Cys118 and Cys136 stabilizes this helical cluster. The αβ-loop of CofA has only 50 residues and lacks the β-hairpin and meandering loop that crosses over the CofB pilin domain, but it has three short helices in approximately the same orientation as those of CofB. No disulfide bond stabilizes the CofA αβ-loop. Instead, CofA has a disulfide bond closer to its C terminus between α3 and the α4-β5 loop (Cys132-Cys196), which is more typical of the major pilins. Apart from the αβ-loop, the overall size and volume of the two pilin domains are comparable.
CofB is unusual for a minor pilin in its large size and extended multidomain structure. Most minor pilins are similar in size and structure to their corresponding major pilins. CofB shares characteristics with the ETEC minor pseudopilin GspK, which, like CofB, is approximately twice the size of its respective major pseudopilin GspG and has a canonical pilin domain plus a second non-pilin domain (Fig. 5) (55). However, in GspK, the non-pilin domain is an α-helical region, the "α-domain," inserted between strands β1 and β2 of the pilin domain β-sheet. The α-domain is covalently attached to the pilin domain at both its N and C termini but also has extensive noncovalent interactions, making it an integral part of the pilin domain (Fig. 5C). This rigid single domain structure contrasts with the extended flexible three-domain arrangement of CofB.
Docking of CofB into CFA/III Pilus Filament Model-The similarity of the pilin domain to CofA implies that CofB might incorporate into the growing pilus filament. However, insertion of CofB would be blocked by its C-terminal region, which is located on the inner face of the pilin domain, as shown in the CFA/III model in Fig. 6A (64). CofB does, however, fit well onto the tip of the filament (Fig. 6B), where there is room for the C-terminal region. Such a tip location is observed for GspK in the ternary crystal structure of the ETEC minor pseudopilins GspI, GspJ, and GspK (55) and is consistent with the role of CofB as an initiator of pilus assembly.
The C-terminal Region Is Required for Initiation of Pilus Assembly-Because CofB has such discrete and well defined domains, we tested the requirement for the extended C-terminal region in initiating pilus assembly. We generated a CofB variant truncated at residue 259, comprising the pilin domain only. This site was chosen because residues 260 and beyond do not interact with the pilin domain in our CofB crystal structure. Furthermore, a stable proteolytic fragment corresponding to the pilin domain is produced when CofB is overexpressed (Fig. 2B, top). CofB259 is stably expressed but, unlike full-length CofB, is unable to rescue pilus assembly in the MC4100-pcofΔcofB strain regardless of its expression level (Fig. 2A).
These results demonstrate the requirement for the CofB C-terminal region for initiating pilus assembly.

Discussion
The ETEC minor pilins CofB and LngB, from the CFA/III and Longus T4b pilus systems, respectively, are highly similar in sequence and size. Both proteins have short leader sequences that are more similar to those of the T4a pilins and the minor pseudopilins than to their respective major pilins, suggesting that they are not processed by the prepilin peptidases encoded on their pilus operons. Both minor pilins have 8 cysteines and two tandem ~30-amino acid sequence repeats. We show here that both CofB and LngB are required for pilus assembly and that very low expression levels are sufficient.
The CofB structure is like no other major or minor pilin structure published to date. Although it contains a canonical pilin domain, the extended flexible nature of the C-terminal region suggests an ability to adapt. The CofB pilin domain shares the overall fold of CofA, including the non-sequential arrangement of the β-strands within the central β-sheet of the globular domain. The additional bulk from the β-hairpin in the αβ-loop of CofB may affect its ability to pack into a growing pilus in place of CofA, even in the absence of the C-terminal region.
The structure of CofB, its role in initiating pilus assembly, and its similarity to the GspK minor pseudopilin together imply that it is located at the tip of the pilus. Our immunoblots show that CofB is present in the CFA/III fraction at very low levels, consistent with it being a tip-associated pilin rather than one that is distributed throughout the filament. CFA/III pili are several μm in length. Based on the transmission EM image reconstruction of the closely related V. cholerae TCP, which has an axial rise per subunit of 8.4 Å (7), a 5-μm-long pilus is composed of ~6000 CofA subunits. We have thus far been unable to demonstrate CofB localization at the pilus tip using immunogold transmission EM with either of our anti-CofB antibodies. This may be because these antibodies, raised against peptides, are not capable of binding to the folded protein in the context of the pilus filament.
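The subunit estimate above follows from dividing the pilus length by the axial rise per subunit. A minimal Python sketch of that arithmetic is given here. The function name `subunits_per_pilus` is hypothetical, and the calculation assumes that CFA/III shares the 8.4 Å rise reported for TCP and that one subunit is added per rise along the filament axis.

```python
ANGSTROM_PER_MICROMETER = 10_000  # 1 micrometer = 10^4 angstroms


def subunits_per_pilus(length_um, rise_angstrom=8.4):
    """Estimate the number of pilin subunits in a filament.

    length_um: filament length in micrometers.
    rise_angstrom: axial rise per subunit in angstroms (8.4 for TCP).
    """
    return length_um * ANGSTROM_PER_MICROMETER / rise_angstrom


if __name__ == "__main__":
    n = subunits_per_pilus(5.0)
    print(f"{n:.0f} subunits in a 5-micrometer pilus")
```

For a 5-μm pilus this gives roughly 6000 subunits, so a single tip-located CofB would be outnumbered by CofA by more than three orders of magnitude, in keeping with the very low CofB levels detected in the pilus fraction.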
We propose that the CofB minor pilin is the first pilin subunit in a new pilus filament and that it recruits the first CofA major pilin via interactions between its C-terminal region and the CofA globular domain (Fig. 6C). This model is supported by our results showing that CFA/III pili are not made with the CofB259 construct, which lacks the C-terminal region. The placement of the C-terminal region as shown in the CofB crystal structure, together with the bulky αβ-loop, would prohibit its insertion into a growing pilus filament (Fig. 6A), but CofB fits well at the tip of our CFA/III pilus model (Fig. 6B). When modeled as a rigid body, the C-terminal region of CofB spans the tip of the pilus and protrudes on the other side of the filament, where it could potentially clash with the secretin channel and associated periplasmic pilus assembly proteins (the secretin complex) as the pilus grows across the outer membrane. However, the linkers connecting the C-terminal domains may provide sufficient conformational flexibility to allow these domains to tuck into the end of the pilus instead of protruding from it, thus allowing the tip to pass through the secretin channel. This apparent flexibility distinguishes CofB from the bulky, rigid GspK minor pseudopilin, which may simply act as a mechanical blocker to prevent passage of the pseudopilus through the secretin, relegating this filament to the periplasm.
Figure 6 legend (partial): [...] (7), with the N-terminal 28 residues of CofA modeled from the full-length P. aeruginosa PilA crystal structure (4). The β-repeat and β-sandwich domains would clash with adjacent CofA subunits in the pilus. B, CofB superimposed onto the first CofA subunit of the CFA/III pilus, which results in no steric clashes. C, model for CofB-mediated initiation of CFA/III assembly. CofB may recruit the first major pilin in the pilus and/or may signal opening of the outer membrane (OM) secretin channel. CofB folding would allow it to pass through the secretin channel, to be displayed on the tip of the pilus. OM, outer membrane; IM, inner membrane.
The CFA/III pilus functions like a T2S system in extruding CofJ through the secretin channel. Cryo-EM data provide evidence that T2S substrates can home to the vestibule of the secretin channel, ready to be pushed through the channel by the pseudopilus (90). Thus, the minor pseudopilins located at the pseudopilus tip, as well as CofB and LngB at the tip of the ETEC T4P, may also be involved in signaling the secretin channel to open.
The ETEC and V. cholerae T4b pili are relatively simple systems compared with the T4a pili and even compared with the T2S systems. Our results demonstrate that a single minor pilin is capable of initiating pilus assembly for the ETEC T4P systems, with its pilin domain serving a structural role to anchor the C-terminal region at the tip of the pilus, where it functions to recruit major pilins and/or signal the secretin channel to open. Such tasks may require multiple minor pilins in the more complex T2S and T4a pilus systems. In the case of the T2S system, the pseudopili must rapidly assemble and disassemble, producing a piston-like motion to extrude substrate across the outer membrane. In the case of T4a pili (and EPEC T4b pili), pilus assembly must counter pilus disassembly facilitated by the retraction ATPase. Such complex pilus dynamics may require more sophisticated control that cannot be accomplished by a single minor pilin. The T2S and T4a pilus systems may have evolved from the primitive ETEC and V. cholerae T4b pilus systems to specialize in protein secretion or perform more complex pilus functions like twitching motility and host cell signaling.
Our CofB structure represents the first structure of a minor pilin from a T4b pilus system and of any minor pilin that can initiate pilus assembly single-handedly. This structure will inform experiments to further explore its mechanism for initiating pilus assembly in ETEC and the related V. cholerae TCP system and will provide insights into minor pilin functions for the more complex T2S and pilus systems. The ETEC and V. cholerae minor pilins may also provide new targets for antimicrobial agents designed to inhibit T4P assembly.
Author Contributions-S. K. cloned, expressed, purified, and crystallized CofB and solved its crystal structure. D. N. conceived and performed experiments shown in Fig. 2 and contributed to those of Fig. 3, which were performed by T. H. and G. Y. L. C. coordinated the study and analyzed data along with S. K. and D. N. S. K., D. N., and L. C. wrote the paper with editorial contributions from T. H. and G. Y. All authors approved the final version of the paper.