The K5 Lyase KflA Combines a Viral Tail Spike Structure with a Bacterial Polysaccharide Lyase Mechanism

K5 lyase A (KflA) is a tail spike protein (TSP) encoded by a K5A coliphage, which cleaves K5 capsular polysaccharide, a glycosaminoglycan with the repeat unit [-4)-βGlcA-(1,4)-αGlcNAc(1-], displayed on the surface of Escherichia coli K5 strains. The crystal structure of KflA reveals a trimeric arrangement, with each monomer containing a right-handed, single-stranded parallel β-helix domain. Stable trimer formation by the intertwining of strands in the C-terminal domain, followed by proteolytic maturation, is likely to be catalyzed by an autochaperone as described for K1F endosialidase. The structure of KflA represents the first bacteriophage tail spike protein combining polysaccharide lyase activity with a single-stranded parallel β-helix fold. We propose a catalytic site and mechanism representing convergence with the syn-β-elimination site of heparinase II from Pedobacter heparinus.

The KflA protein is a K5 lyase, an enzyme found as a tail spike protein of coliphages K5A (1, 2) and K1-5 (3), where it catalyzes depolymerization of K5 capsular polysaccharide. K5 lyase enables the phage to recognize and remove the protective K5 capsule around host bacteria, thereby exposing phage receptors in the outer membrane. The K5 capsular polysaccharide is a virulence factor of Escherichia coli K5 isolates, which are responsible for extraintestinal infections (4,5). The polysaccharide is identical to N-acetyl-heparosan (heparan), a structure present in nonmodified regions of heparan sulfate, and KflA has been utilized previously to study the domain structure of heparan sulfate (6). A bacterial K5 lyase, ElmA, from E. coli K5 strain SEBR 3282 also has been described (7).
The first reported structure of a bacteriophage-derived glycosaminoglycan lyase was that of the hyaluronate lyase HylP1, a tail fiber protein of the inducible prophage SF370.1 of Streptococcus pyogenes strain SF370 (18). HylP1 contains a catalytic domain with triple-stranded β-helix topology (TSβH), a fold previously observed in noncatalytic domains of other viral tail proteins, including bacteriophage T4 short tail fiber gp12 (19) and needle protein gp5 (20,21). Short stretches of this fold also are found in the C-terminal domains of coliphage K1F endosialidase (22) and P22 TSP (23), where it acts to stabilize trimeric proteins. We report that this structure also is found within KflA.
Viral tail spike proteins with predominantly single-stranded parallel β-helix architecture include Salmonella phage P22 TSP (23,24), Shigella phage Sf6 TSP (25), and E. coli phage HK620 TSP (26). These are all glycoside hydrolases, facilitating host recognition and infection through binding and degradation of host O-antigen of lipopolysaccharide. This β-helix fold also is widely found in pectic lyases and some alginate lyases as well as in a small number of glycoside hydrolase subfamilies, including polygalacturonases, rhamnogalacturonases, and dextranases. Groupings of these enzymes based on structure and activity are described in the CAZy database (27).
In this paper, we report the structure of KflA, which represents the first viral tail spike protein with polysaccharide lyase activity and a catalytic single-stranded β-helix domain, a combination frequently observed in bacterial polysaccharide lyases. Also, the proposed catalytic site resembles the syn-β-elimination site of heparinase II, despite the fact that these enzymes have a very different topology.
Site-directed Mutagenesis-Mutations of kflA within pLYA100 were generated by the QuikChange site-directed mutagenesis kit (Stratagene). Each mutant of kflA used in this paper was fully sequenced to preclude mutations introduced by PCR. Errors were found in the published sequence of kflA; the corrected sequence revealed that KflA is much more similar to K5 lyase from coliphage K1-5 (Swiss-Prot accession no. Q9AZ47) than previously determined (99% similarity, 98% identity, length of 632 amino acids). Based upon this sequence, the molecular mass of native KflA is predicted to be 66.9 kDa.
Mutant variants of His-KflA were screened by lyase assay of cell lysate supernatant from 50-ml cultures of E. coli DH5α harboring the corresponding plasmid, induced as for seleno-KflA. Cells were lysed in BugBuster reagent (Novagen). Cultures with low lyase activity were scaled up to 500 ml, and His-KflA variants were purified by nickel-affinity chromatography.
Additional Analytical Methods-SDS-PAGE was performed according to the method of Laemmli (29), and unboiled SDS-PAGE samples were heated (37°C for 30 min) prior to loading (supplemental Fig. S1, a and b). His-KflA content of purified samples was calculated from measurement of A280 (2-µl sample, in triplicate, by Nanodrop) using estimates of extinction coefficient and molecular mass calculated using ProtParam (30). Calculation of pKa values of ionizable groups using the solved structure was performed using the H++ server at Virginia Tech (31,32,33).
Lyase Activity Assay-The spectrophotometric assay described previously (28) was performed using a modified reaction buffer (25 mM Tris acetate, pH 8.5, 50 mM NaCl) at 37°C, recording A232 (supplemental Fig. S2, summarized in Table 1). Initial screening assays were performed by incubating cell lysate supernatants (20 µl) in reaction buffer containing 250 µg K5 for 1 h at 37°C before recording A232.
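Both absorbance-based calculations in these methods rest on the Beer-Lambert law, A = ε·c·l. The following Python sketch illustrates only the arithmetic. The function names are hypothetical, the A280 extinction coefficient would have to come from a tool such as ProtParam, and the default A232 extinction coefficient of 5500 M^-1 cm^-1 for the unsaturated uronate product is an assumed value that is not stated in this paper.

```python
def protein_conc_mg_per_ml(a280, ext_coeff_per_m_cm, mass_da, path_cm=1.0):
    """Beer-Lambert estimate of protein concentration from A280.

    c (mol/l) = A / (epsilon * l); multiplying by the molecular mass (g/mol)
    gives g/l, which is numerically equal to mg/ml.
    """
    if ext_coeff_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff_per_m_cm * path_cm) * mass_da


def specific_rate_umol_min_mg(delta_a232_per_min, reaction_ml, enzyme_mg,
                              ext_coeff_232=5500.0, path_cm=1.0):
    """Convert an initial slope of A232 into umol product per min per mg enzyme.

    ext_coeff_232 is an ASSUMED extinction coefficient (M^-1 cm^-1) for the
    4,5-unsaturated uronate product; substitute the value used in your assay.
    """
    if enzyme_mg <= 0 or ext_coeff_232 <= 0 or path_cm <= 0:
        raise ValueError("enzyme mass, extinction coefficient and path must be positive")
    molar_per_min = delta_a232_per_min / (ext_coeff_232 * path_cm)  # mol/l/min
    umol_per_min = molar_per_min * (reaction_ml / 1000.0) * 1e6     # umol/min in the cuvette
    return umol_per_min / enzyme_mg


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements from the paper.
    print(protein_conc_mg_per_ml(a280=1.0, ext_coeff_per_m_cm=50000.0, mass_da=50000.0))
    print(specific_rate_umol_min_mg(delta_a232_per_min=0.055, reaction_ml=1.0, enzyme_mg=0.001))
```

Keeping the extinction coefficients as explicit parameters makes it obvious which inputs are assumptions and lets the same functions be reused with a different path length.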
Crystallization-Crystals were grown from hanging drops; drops were formed by mixing 1 µl of 5.2 mg/ml His-KflA in 20 mM Tris-HCl, pH 8.0, 50 mM NaCl, 5% (v/v) glycerol with an equal volume of well solution containing 0.2 M KBr and 10% polyethylene glycol 3350. Crystallization was carried out at 20°C. Needle-like crystals, with approximate dimensions of 70 µm × 70 µm × 300 µm, formed within 7 days. For data collection, crystals were soaked in a cryoprotectant solution of 20 mM Tris-HCl, pH 8.0, 0.4 M KBr, and 30% polyethylene glycol 400 for 10 min, followed by flash freezing in liquid nitrogen. Seleno-KflA failed to form crystals under the same conditions as His-KflA but could be crystallized by macroseeding smaller crystals derived from His-KflA into hanging drops of seleno-KflA, which had been set up under the same conditions as for His-KflA and left for 4 days. Seeded crystals grew within 2-3 days and were subjected to cryoprotection in the same manner as His-KflA. Data were processed using XDS (34) for seleno-KflA and MOSFLM (35) for His-KflA. Details of space groups, cell dimensions, and data collection statistics are given in Table 2.
Structure Solution and Refinement-Experimental phases were derived from single-wavelength anomalous diffraction data collected from a single seleno-KflA crystal. Density calculations and analysis of the self-rotation function indicated a trimer in the asymmetric unit. The positions of 21 of a possible 24 selenium atoms were located using SOLVE (36). These phases were input into PHENIX (37) and subjected to density modification and automated model building, which resulted in construction of most of the final model. The final stages of model building were conducted manually, using Coot (38), combined with refinement using REFMAC (39). The final model was complete for all residues from Pro7 to Thr504 on all three chains, and additionally contained 12 bromide ions. The geometry of the model was examined using PROCHECK (40): all main chain and side chain stereochemical parameters were found to be within the limits expected for a refined structure at 1.6-Å resolution. Atomic coordinates and structure factors were deposited in the Protein Data Bank with accession code 2X3H.

Table 1. Activity and molecular mass estimates of purified His-KflA and mutated variants
Vmax is the initial rate of product formation (means ± S.E., µmol min^-1 mg^-1 purified enzyme) at 37°C. Nondetectable activity is denoted by ND (detection limit shown in parentheses). Percentage activity relative to His-KflA is shown in parentheses (means ± S.E.). Initial substrate concentration was 500 µg K5 per 1 ml of reaction volume, and 1 µg of purified protein was used in each reaction, except for E206A and Lys208 mutant assays, where 5 µg purified protein was used. Molecular masses were estimated by light-scattering measurements.
a The molecular mass was estimated via analytical ultracentrifugation (means ± S.E.).

RESULTS
His-KflA migrated on SDS-PAGE with an estimated mass of 57 kDa, and nonboiled samples migrated with an apparent mass of 123 kDa (supplemental Fig. S1a). Molecular mass estimated by gel permeation on a calibrated column was 173 kDa (data not shown). Light-scattering analysis and analytical ultracentrifugation of His-KflA gave estimated masses of 168 and 166 kDa, respectively (Table 1). These results suggest His-KflA forms oligomers as reported for K1 endosialidases, which exist in solution as SDS-resistant trimers (23,41).
Structure-As anticipated from biophysical analysis, the crystal structure of KflA revealed a trimeric complex. The overall structure of mature His-KflA comprises a trimer of three identical monomers intertwined at the carboxyl-terminal domain, with a central 3-fold non-crystallographic axis running along the main axis of the macromolecule (Fig. 1, a and b). The structure can be divided into three domains: a small α-helical domain at the N terminus, a central single-stranded, right-handed β-helix domain, and a β-prism/triple-stranded β-helix domain at the C terminus (Fig. 1a).
The β-helix contains 12 complete "winds," each containing three β-sheets termed PB1, PB2, and PB3, using the nomenclature of Yoder and Jurnak (42), separated by short turns (T1, T2, and T3), some of which have loop insertions. Within each monomer, PB1 sheets face away from the central trimer axis and form a solvent-exposed, concave cleft running along the axis of the β-helix. Three inter-monomeric clefts also exist along the trimeric axis (Fig. 1, a and b).
The lumen of the KflA β-helix contains a stack of 12 hydrophobic residues along PB2 and a ladder of seven stacked Asn residues bordering T1 and PB2. KflA also contains a stack of four Ser residues within T1; this is a feature found in many β-helical proteins that is proposed to stabilize the tight turns between PB1 and PB2 sheets (42). Several intermonomeric contacts are formed along the cleft between β-helices of neighboring chains, including salt bridges exemplified by the pairings Asp290-Arg357 and Asp130-Lys193.
The C-terminal domain of each mature KflA monomer consists of a pair of anti-parallel β-sheets, a pair of parallel β-sheets, and a single β-sheet (forming one turn of a triple-stranded β-helix) followed by five anti-parallel β-sheets (forming one face of a three-sided β-prism), ending in a spiraling coil. In this region, each monomer chain makes multiple contacts with each neighboring chain, effectively holding the trimer together through a 720° rotation (supplemental Fig. S3a). The extensive contacts between monomer chains in the trimeric structure may explain the high thermostability previously observed for KflA, with an unfolding transition point at 65°C and peak enzyme activity at 44°C (43).
Active Site Identification-Based on the role of histidine and tyrosine side chains in the catalytic mechanisms proposed for other glycosaminoglycan lyases (10, 44-48), the residues His226, Tyr229, His282, and Tyr253 were selected as targets for site-directed mutagenesis. These residues were individually exchanged for Ala, and the resulting His-KflA mutants were assayed for lyase activity. The initial screening assay showed high activity levels (data not shown), comparable with nonmutated His-KflA, except in the case of Y229A. To examine the role of Tyr229 further, the Y229F variant was generated, and both Tyr229 mutants were purified by nickel affinity and assayed for lyase activity. The Y229F mutation reduced activity to ~25%, whereas the Y229A variant had <2% activity compared with His-KflA (Table 1).
Calculation of pKa values for ionizable residues in their local environments within the structure of KflA highlighted a large shift in the predicted pKa of Glu206 from ≤4.5 to ≥7.4. This side chain may therefore be readily protonated under physiological conditions. Glu206 is at the bottom of a narrow cleft formed by Phe202 and Lys208 (Fig. 1c), located centrally within the intramonomeric groove formed by PB1. This is a typical active site location in other β-helix enzymes. The pKa analysis also provides an explanation of the reduced activity of the Y229F mutant, as Tyr229 is predicted to be a major contributor to the raised pKa of Glu206 through electrostatic interaction. The lower lyase activity of the Y229A variant compared with the Y229F variant may reflect involvement of the aromatic ring of Tyr229 in substrate binding, possibly through a stacking interaction with pyranosyl sugar rings in the substrate. No lyase activity could be detected in variants of His-KflA containing mutations at Glu206 (E206A) or nearby Lys208 (K208A, K208R, and K208M) (Table 1). His-KflA and all purified variants behaved similarly during the expression and purification steps and had molecular masses consistent with trimer formation when analyzed by light scattering (Table 1). These trimers were also all SDS-resistant and temperature-labile (supplemental Fig. S1, a and b). A crystal structure of E206A confirmed that this mutant adopted the same structure as His-KflA (data not shown). Lyase assay curves for purified His-KflA and variants are presented in supplemental Fig. S2. On the basis of these data, we propose that Glu206 and Lys208 are the catalytic residues in the β-elimination reaction catalyzed by KflA.
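The consequence of the predicted pKa shift for Glu206 can be illustrated with the Henderson-Hasselbalch relation, under which the protonated fraction of an acid is 1 / (1 + 10^(pH - pKa)). The Python sketch below is an illustration only. It treats the reported bounds (4.5 and 7.4) as if they were exact pKa values, which is an assumption, and the function name is hypothetical.

```python
def fraction_protonated(pka, ph):
    """Fraction of an acid in its protonated (HA) form at a given pH.

    Derived from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # pKa values are the bounds quoted in the text, used here as point values
    # (an assumption); pH 8.5 is the reaction buffer pH given in the methods.
    for label, pka in (("unshifted bound", 4.5), ("shifted bound", 7.4)):
        for ph in (7.4, 8.5):
            print(f"{label}: pKa {pka}, pH {ph} -> {fraction_protonated(pka, ph):.4f} protonated")
```

With a pKa equal to the pH, exactly half of the side chains are protonated, whereas a carboxylate with a pKa near 4.5 is almost entirely deprotonated at neutral pH. This is why a shift of this size matters for a mechanism that requires a protonated Glu206.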

DISCUSSION
The determination of the structure of KflA has revealed a number of interesting features. First, it forms a trimeric structure with each chain containing a single-stranded β-helix domain with a lyase catalytic site. Unlike K1F endosialidase, KflA is not a processive enzyme, and the trimeric state of KflA may ensure host attachment is maintained through constant partial occupancy of the substrate binding sites present in the trimer. KflA exhibits side chain stacking interactions within the β-helix, which is a common feature of many β-helices (42,49,50). The placement of the Asn ladder in KflA at the end of T1 is unusual; previously described β-helix proteins commonly have an internal Asn ladder within the short turn, T2 (50).
A striking similarity was noted between the fold of the C-terminal β-prism/TSβH domain from KflA and the equivalent domain in K1F endosialidase (22). The KflA β-prism extends further than the equivalent fold in K1F endosialidase, with the result that the pairing between chains is altered, but both proteins retain essentially the same strand pattern (supplemental Fig. S3a). The structural similarity covers the last 98 amino acids of the mature proteins and is particularly noticeable throughout the last 31 amino acids, which have a root mean square deviation value of 0.94 Å using TM-align (51) in STRAP software (52). The two mature proteins show limited sequence homology in this region, toward their C-terminal residues (41,53,54). The folding of this domain in K1F endosialidase, forming two regions of TSβH separated by a region of β-prism, has been proposed to account for the high thermostability of the trimeric structure (22). The TSβH found in the C-terminal domain of P22 TSP is described as a "clamp" holding the trimer together (55), contributing to the high thermostability of P22 TSP. It is likely that the intertwining of the three monomer strands in the C-terminal domain of KflA contributes greatly to the high thermostability observed previously for KflA, with an unfolding transition point at 65°C and peak enzyme activity at 44°C (43).
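For readers unfamiliar with the root mean square deviation quoted above, the following Python sketch shows the basic calculation over paired atom positions. It assumes the two coordinate sets are already superimposed and matched one to one. It does not reproduce the alignment and superposition performed by TM-align, and the coordinates in the usage example are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are ALREADY superimposed and that atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Invented coordinates (in Å), not taken from any deposited structure.
    a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(rmsd(a, b))
```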
Post-translational Modification-Proteolytic maturation has been reported for K1 endosialidases (41) and ElmA (7). Sequence alignments show KflA contains a conserved posttranslational cleavage site at Ser505 (41,53,54). The migration of His-KflA on SDS-PAGE at an apparent molecular mass of 57 kDa is consistent with cleavage at Ser505, removing a 14.6-kDa carboxyl-terminal fragment to yield a 56.3-kDa mature protein.
The predicted molecular mass of 168.9 kDa for mature, trimeric His-KflA is in good agreement with our analysis by gel permeation chromatography, light scattering, and analytical ultracentrifugation techniques. The domain removed during maturation has been described as an autochaperone assisting in folding and assembly of trimeric tail spike proteins, with the autolytic cleavage event contributing to stability by making unraveling of trimers energetically unfavorable (54). Recently, the structure and catalytic mechanism of the autochaperone of K1F endosialidase have been described (53). We have aligned the C-terminal region of KflA with the K1F autochaperone (supplemental Fig. S3b), further demonstrating the structural similarity between the C-terminal domains of K1F endosialidase and KflA.
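The mass bookkeeping behind this comparison can be checked with a few lines of Python. The sketch below uses only values quoted in the text (a 56.3-kDa mature monomer, a 14.6-kDa cleaved fragment, and measured masses of 173, 168, and 166 kDa); the function and variable names are hypothetical.

```python
MATURE_MONOMER_KDA = 56.3     # mature His-KflA after cleavage at Ser505
CLEAVED_FRAGMENT_KDA = 14.6   # C-terminal fragment removed on maturation

# Solution measurements reported in RESULTS (kDa).
MEASURED_KDA = {
    "gel permeation": 173.0,
    "light scattering": 168.0,
    "analytical ultracentrifugation": 166.0,
}


def predicted_trimer_kda(monomer_kda):
    """Mass of a homotrimer built from the mature monomer."""
    return 3 * monomer_kda


def percent_deviation(measured_kda, predicted_kda):
    """Signed deviation of a measurement from the prediction, in percent."""
    return 100.0 * (measured_kda - predicted_kda) / predicted_kda


def apparent_subunits(measured_kda, monomer_kda):
    """Number of monomers implied by a measured mass."""
    return measured_kda / monomer_kda


if __name__ == "__main__":
    trimer = predicted_trimer_kda(MATURE_MONOMER_KDA)
    print(f"uncleaved monomer: {MATURE_MONOMER_KDA + CLEAVED_FRAGMENT_KDA:.1f} kDa")
    print(f"predicted trimer:  {trimer:.1f} kDa")
    for method, mass in MEASURED_KDA.items():
        print(f"{method}: {percent_deviation(mass, trimer):+.1f}% "
              f"({apparent_subunits(mass, MATURE_MONOMER_KDA):.2f} subunits)")
```

All three measurements fall within a few percent of the predicted trimer mass and correspond to roughly three subunits, which is the sense in which the text calls the agreement good.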
The autochaperone domain contains a highly basic region located on the C-terminal α-helix, which is well conserved within K5 lyases and K1 endosialidases (41) and which closely resembles a Cardin-Weintraub motif (43,56,57). This motif is implicated in binding of glycosaminoglycans and was previously proposed to contribute to substrate recognition and binding by KflA (43). The observation of this motif in K1 endosialidase may reflect a more general interaction between a patch of basic side chains and an unbranched, polyanionic polysaccharide such as polySia (polysialic/neuraminic acid or K1 capsular polysaccharide). Removal of the autochaperone through proteolytic maturation raises questions regarding the significance and role of this motif; it is conceivable that the cleaved fragment remains phage-associated under normal conditions of phage assembly to enhance host recognition. However, evidence from cryoelectron microscopy (58) indicates that the autochaperone domain does not remain associated with mature KflA or K1E endosialidase.
The morphology of K1-5 and K1E was recently studied by cryoelectron microscopy (58). A comparison of the density maps enabled identification of the electron density contribution by K5 lyase, as this enzyme is absent from K1E. The authors showed the P22 TSP structure fitted the density map of K5 lyase. Six symmetrically related copies of the KflA trimer were fitted into the 6-fold averaged electron density map for the K1-5 tail using the fitting function within the UCSF Chimera package (Fig. 2, a and b) (66). This procedure used a simulated map generated from the KflA structures at 24-Å resolution, corresponding to the estimated resolution of the density map for the K1-5 tail (58), and gave a correlation coefficient of 0.77. Interaction between neighboring tail spikes occurs through the β-barrel domain of K1 endosialidase (Fig. 2a). The partners in this interaction are attached to different molecules of adapter protein (gp37), thereby creating an unbroken ring of interacting TSPs around the base of the bacteriophage, which may contribute to stability and correct orientation of tail spikes (Fig. 2b).
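A correlation coefficient between a simulated map and an experimental map can be understood as a normalized overlap of density values sampled at the same grid points. The text does not state which definition the fitting function used, so the Python sketch below is an assumption: it implements a normalized dot product, with an optional mean-subtracted variant, over flat lists of density values. The function name and the sample values are hypothetical.

```python
import math


def map_correlation(simulated, target, about_mean=False):
    """Normalized overlap between two density maps sampled on the same grid.

    Returns sum(u*v) / (|u| * |v|). With about_mean=True the mean of each map
    is subtracted first, giving a Pearson-style correlation.
    """
    if len(simulated) != len(target) or not simulated:
        raise ValueError("maps must be non-empty and sampled on the same grid")
    u, v = list(simulated), list(target)
    if about_mean:
        mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
        u = [x - mean_u for x in u]
        v = [x - mean_v for x in v]
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("correlation is undefined for an all-zero map")
    return sum(a * b for a, b in zip(u, v)) / norm


if __name__ == "__main__":
    # Invented density values for illustration only.
    sim = [0.0, 0.2, 0.9, 0.4, 0.1]
    exp = [0.1, 0.3, 0.7, 0.5, 0.0]
    print(map_correlation(sim, exp), map_correlation(sim, exp, about_mean=True))
```

A value of 1 means the two maps are proportional at every grid point. At 24-Å resolution only the overall shape of the trimer contributes, so a coefficient such as the reported 0.77 describes agreement of the molecular envelope rather than of atomic detail.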
Catalytic Mechanism-All of the glycosyl moieties in K5 polysaccharide have been shown by NMR to be in the 4C1 chair conformation (59). The β-elimination at GlcA requires abstraction of the axial proton at C5 and breaking of the equatorial glycosidic bond attached to C4. This mechanism is a syn-β-elimination (both leaving groups on the same face of the incipient unsaturated bond) (60). Hyaluronan lyases, chondroitin AC lyases, and heparinase III also catalyze syn-β-elimination reactions, but none of these have a catalytic single-stranded β-helix. Despite being a glycosaminoglycan lyase and having the same overall fold as KflA, chondroitinase B catalyzes an anti-β-elimination at IdoA moieties of dermatan sulfate; this requires a different spatial arrangement of catalytic side chains to that required for syn-β-elimination. Heparinase I and pectic lyases also catalyze anti-β-elimination reactions. Heparinase II has two overlapping catalytic side chain groupings, which can catalyze anti- or syn-β-elimination at IdoA or GlcA components, respectively, in suitable substrates (13). Comparison of KflA residues Glu206 and Lys208, implicated by the mutational analysis to be involved in the catalytic mechanism, with the proposed catalytic residues of other polysaccharide lyases revealed similarity to the active site proposed for heparinase II in syn-β-elimination at GlcA substituents of heparan sulfate. This is essentially the same reaction as that catalyzed by KflA, differing only in the degree of modification of neighboring glycosyl units tolerated by the two enzymes. Despite the overall structure of the catalytic domain of heparinase II (α/α-toroidal) being very different to that of KflA, the two active sites show similar spatial arrangement when superimposed (Fig. 3). Convergence has been noted before between Pel10A, an (α/α)3-barrel pectate lyase, and Pel1C, a parallel β-helix pectate lyase (60), which catalyze anti-β-elimination. The mechanism proposed for heparinase II (13) involves hydrogen-bond formation between the carboxylic group of GlcA and that of Glu205. Tyr257 is proposed to act as a "base" to abstract the proton at C5 of GlcA. Tyr257 also is proposed to act as an "acid," donating a proton to the proximal glycosidic oxygen, breaking the linkage between C1 of GlcNAc and C4 of GlcA. This dual action of catalytic Tyr residues also has been suggested for other syn-β-eliminases (8).
Figure 2. [...] (58). a, detail of the contact between KflA and K1F endosialidase trimers; the endosialidase β-barrel domain is indicated. b, arrangement of KflA and K1F endosialidase trimers within the 6-fold symmetric structure of the K1-5 tail spike, superimposed on the semitransparent cryo-electron microscopy density map. KflA and K1F endosialidase were fitted into the density map (accession no. 1335 from the Macromolecular Structure database at the European Bioinformatics Institute), and the figure was generated using the UCSF Chimera package (66).
The mechanism proposed here for KflA involves hydrogen bond formation between the protonated carboxylic group of Glu206 and the carboxylate of GlcA. This dissipates the negative charge on this group, allowing proton abstraction at C5 of GlcA by Lys208. Lys208 also is proposed to donate a proton to the oxygen of the glycosidic linkage terminating at C4 of GlcA. The order in which reactions at Lys208 occur (and at Tyr257 of heparinase II) has not been determined.
The products of this reaction on a K5 polysaccharide molecule are two shorter chains, one with GlcNAc at the new reducing end and another with Δ-4,5-unsaturated GlcA (4-deoxy-α-L-threo-hex-4-enopyranosyluronic acid) at the nonreducing end. During initial stages of the in vitro assay, reaction products are noninhibitory, as they are cleavable substrates for KflA.
Other mechanisms proposed for lyases to neutralize the negative charge on substrate hexuronate residues include charge dissipation through hydrogen bond formation to Asn side chains as found in hyaluronan lyase (45) and chondroitin AC lyase (8), interaction with Arg side chains as found in pectin lyases A and B (61,62), or interaction with coordinated Ca2+ ions as found in pectate lyase C (63) and chondroitinase B (17). KflA does not require metal ions for function (28), and no such cofactor was observed in the crystal structure, leading us to conclude the catalytic role of Glu206 is not the coordination of a metal cofactor. Rather, we propose Glu206 is involved in direct hydrogen bond formation with the carboxylic group of GlcA substrate components in a role analogous to that of Glu205 in the syn-β-elimination catalytic site of heparinase II.
Dual Host Specificity-This phenomenon has been reported previously in the case of K1-5, which can infect and replicate in E. coli K1 and K5 strains (3). Coliphage K1-5 has both K1 endosialidase and K5 lyase tail spikes, each attached to the virion through interaction of the N-terminal domain with separate attachment sites on the gp37 adapter protein (Fig. 2, a and b) (58). Coliphage K5A has been observed previously to infect E. coli K95 strains and degrade K95 polysaccharide (64,65), frequently leading to mistyping of K95 strains as K5. Located 3′ to kflA in the genome of K5A is open reading frame 523, which likely encodes a K95-specific TSP (3). We have confirmed plaque formation in lawns of E. coli K95 by K5A (data not shown). Therefore, K5A may more accurately be named K5-95. Dual specificity has obvious implications for phage evolution and expansion of host range, as has been discussed previously (3).
In conclusion, the structure of K5 lyase A reported in this paper represents the first viral tail spike protein to combine a single-stranded β-helix fold with a β-eliminase catalytic mechanism, an enzyme activity common among bacterial glycosaminoglycan-degrading enzymes. Also, the proposed catalytic mechanism involves an unusual role of Glu206 in direct hydrogen bonding to the carboxylate group within the substrate, mimicking the catalytic role of Glu205 in heparinase II.