Structure of the Bacteriophage φKZ Lytic Transglycosylase gp144*

Lytic transglycosylases are enzymes that act on the peptidoglycan of bacterial cell walls. They cleave the glycosidic linkage between N-acetylmuramoyl and N-acetylglucosaminyl residues with the concomitant formation of a 1,6-anhydromuramoyl product. The x-ray structure of the lytic transglycosylase gp144 from the Pseudomonas bacteriophage φKZ has been determined to 2.5-Å resolution. This protein is probably employed by the bacteriophage in the late stage of the virus reproduction cycle to destroy the bacterial cell wall to release the phage progeny. φKZ gp144 is a 260-residue α-helical protein composed of a 70-residue N-terminal cell wall-binding domain and a C-terminal catalytic domain. The fold of the N-terminal domain is similar to the peptidoglycan-binding domain from Streptomyces albus G D-Ala-D-Ala carboxypeptidase and to the N-terminal prodomain of human metalloproteinases that act on extracellular matrices. The C-terminal catalytic domain of gp144 has a structural similarity to the catalytic domain of the transglycosylase Slt70 from Escherichia coli and to lysozymes. The gp144 catalytic domain has an elongated groove that can bind at least five sugar residues at sites A-E. As in other lysozymes, the peptidoglycan cleavage (catalyzed by Glu115 in gp144) occurs between sugar-binding subsites D and E. The x-ray structure of the φKZ transglycosylase complexed with the chitotetraose (N-acetylglucosamine)4 has been determined to 2.6-Å resolution. The N-acetylglucosamine residues of the chitotetraose bind in sites A-D.

Pseudomonas aeruginosa is a facultative human pathogen causing prolonged and frequently lethal infections, especially in patients with a compromised immune system (1,2). The treatment of Pseudomonas infections is difficult because of its resistance to many antibiotics. Thus, the use of bacteriophages and their lytic enzymes for anti-Pseudomonas therapy might be a promising line of investigation.
Bacteriophage φKZ is a giant virus that uses P. aeruginosa as a host. The structure of the φKZ virion was determined using cryo-electron microscopy (3,4). The φKZ virion consists of an icosahedral capsid with a diameter of ~1450 Å and an ~2000-Å long contractile tail, terminating in a baseplate (5,6). Its double-stranded genomic DNA consists of 280,334 bp and has 306 open reading frames (6). Only a few gene sequences can be recognized as being homologous to genes in other organisms and, in particular, in other tailed phages, indicating that φKZ represents a distant branch of the Myoviridae family.
The φKZ genome encodes two peptidoglycan-degrading enzymes, gp144 (product of gene number 144) and gp181. The gp144 enzyme can cleave peptidoglycans that belong to the "A1γ" peptidoglycan chemotype or family (7) in Gram-negative bacteria, including P. aeruginosa (8,9). The direct cross-linking that occurs in A1γ peptidoglycans and the full acetylation of N-glucosamine are required for successful hydrolysis by gp144 (9). The peptidoglycan cleavage activity of gp144 is 143 times greater than the activity of hen egg white lysozyme (9,10). The maximum activity of gp144 is reached at pH 6.2 (9).
gp144 and gp181 consist of 260 and 2237 amino acids, corresponding to calculated molecular masses of 28.8 and 245.8 kDa, respectively. There is 52% sequence identity between the 190-residue C-terminal domain of gp144 and an internal domain of gp181 (Fig. 1). Furthermore, these domains have sequence similarity to the catalytic domain of family 1 lytic transglycosylases (9,11). Mass spectrometry analysis of products of peptidoglycan degradation by gp144 confirmed that this protein is a lytic transglycosylase (8). The structure of these lytic transglycosylases is homologous to that of lysozymes (12). Lytic transglycosylases are enzymes that can degrade bacterial peptidoglycan cell walls by cleaving β(1,4)-glycosidic bonds between N-acetylmuramic acid and N-acetylglucosamine and forming a new glycosidic bond between O-6 and C-1 in the muramic acid residue (see Fig. 2) (13,14). In contrast, lysozymes catalyze the hydrolysis of the same substrate by using a water molecule from the solvent. In lytic transglycosylases, there is only one acidic residue, typically a glutamic acid, in the vicinity of the substrate cleavage site (12), whereas hen egg white lysozyme possesses a catalytic dyad of Glu and Asp (15). According to the proposed reaction mechanism of lytic transglycosylases (see Fig. 2) (13,14), the catalytic glutamate donates a proton to the glycosidic oxygen of the linkage between the N-acetylmuramic acid (NAM) residue at subsite D and the N-acetylglucosamine (NAG) residue at subsite E. In the next step of the reaction, the glutamate residue abstracts the C-6 hydroxyl group proton in the muramoyl oxazolinium ion intermediate, leading to the formation of 1,6-anhydromuramic acid (Fig. 2).
Based on sequence analyses (11), gp144 has a peptidoglycan-binding domain in its N-terminal region (Fig. 1) (9), suggesting that gp144 evolved as a gene fusion of a cell wall-binding domain and a transglycosylase catalytic domain. Experiments with the N-terminal domain fused to green fluorescent protein confirmed that this domain can bind bacterial cell walls (9). The N- and C-terminal domains probably have independent peptidoglycan-binding functions.
Like φKZ, the bacteriophage T4 has two peptidoglycan-degrading enzymes that are employed by the virus at different stages of its life cycle (16-18). The larger of these two enzymes is a part of the T4 cell-puncturing device and is used by the phage to locally digest the cell wall during the infection process. The other, smaller enzyme (called the T4 lysozyme or endolysin) causes cell lysis and release of the phage progeny in conjunction with another virally encoded protein called "holin." Holin is a protein that creates lesions in the cytoplasmic membrane through which the endolysin passes to gain access to the peptidoglycan cell wall. However, exhaustive searches did not identify a typical holin sequence in the φKZ genome (8). By analogy to T4, the smaller of the two peptidoglycan-degrading enzymes in φKZ, gp144, is probably the endolysin, whereas the larger gp181 is probably located in the center of the phage baseplate and forms the cell-puncturing device. The presence of gp181 (but not gp144) in the φKZ tail was confirmed by a mass spectrometry study (9), although immunoblot analysis suggested that gp144 might also be attached to the phage tail (11).
Circular dichroism has shown that gp144 has mostly α-helical structure (8), although there is a large decrease in the α-helical content when the protein interacts with dimyristoylphosphatidylglycerol, one of the major components of bacterial plasma membranes. This indicates that gp144 can interact with the host membrane, suggesting that holin might not be necessary to allow gp144 to reach the peptidoglycan. Endolysins with multiple domains or "modules" occur in bacteriophages that infect Gram-positive bacteria (19), but unlike gp144, these proteins have their peptidoglycan-binding domains in the C-terminal region. Endolysins of bacteriophages that infect Gram-negative cells are single-module proteins. However, gp144 from φKZ and gp188 from bacteriophage EL are exceptions to this rule (9).
Here we report a 2.5-Å resolution crystal structure of gp144 and its complex with chitotetraose, a tetramer of NAG.

MATERIALS AND METHODS
Protein Expression, Purification, and Crystallization-The DNA encoding φKZ gp144 was cloned into the pQE-30 vector (Qiagen) using BamHI and HindIII restriction sites (11). The plasmid vector produced the gp144 protein with an N-terminal six-histidine affinity purification tag and a Gly-Ser-Ile linker between the tag and the first methionine of gp144. The pQE-30 vector has a T5 promoter under the control of two lac operators and an ampicillin resistance gene.
A selenomethionine (SeMet) derivative of gp144 was expressed in Escherichia coli to facilitate the crystal structure determination. The B834 Met auxotroph strain of E. coli was transformed with two plasmids: the pQE-30 vector and the pREP4-lacI plasmid containing the lac repressor and kanamycin resistance genes. The cells were first grown in 2YT medium at 37°C to A600 = 0.3 and then centrifuged and transferred to modified M9 medium containing SeMet and all other amino acids at a concentration of 40 μg/ml, Kao and Michayluk vitamin solution (Sigma), 0.4% (w/v) glucose, 50 μg/ml kanamycin, and 100 μg/ml ampicillin. The cells were grown to A600 = 0.6. gp144 expression was induced with 1 mM isopropyl β-D-thiogalactopyranoside, and the cells were incubated at 37°C for another 4 h.
The cells were harvested by centrifugation and resuspended in buffer containing 10 mM Tris-HCl, pH 8.0, 200 mM NaCl, and Complete EDTA-free protease inhibitor mixture (Roche Applied Science). The cells were disrupted by sonication, and the insoluble cell debris was removed by centrifugation. The supernatant was applied to a HisTrap HP column (GE Healthcare). The protein was eluted from the column with 200 mM imidazole. The purified protein was dialyzed against 10 mM Tris-HCl, pH 8.0. The purified gp144 was a mixture of monomers, dimers, and trimers (11). The oligomerization was probably caused by the formation of intermolecular disulfide bonds involving some of the three cysteines in the N-terminal domain.
Dithiothreitol was added to the protein solution to obtain a concentration of 5 mM. The addition of dithiothreitol caused the dissociation of the dimers and trimers, resulting in monomeric gp144.
Crystallization and Data Collection-The initial crystallization screening was performed using the sitting-drop vapor diffusion method at 20°C with Hampton Crystal Screens 1 and 2 (20). Drops containing 1 μl of protein at 10 mg/ml were mixed with 1 μl of reservoir solution and equilibrated against 100 μl of reservoir solution. The protein crystallized in 2 M ammonium sulfate, 2% (v/v) polyethylene glycol 400, and 0.1 M NaHEPES, pH 7.5. Crystals appeared in ~1 day. The presence of dithiothreitol in the protein solution was important for crystallization. The optimized crystallization conditions were 1.2 M ammonium sulfate, 1.2% (v/v) polyethylene glycol 400, and 60 mM NaHEPES, pH 7.5. Subsequently, crystals were grown in hanging drops formed by mixing 2 μl of protein solution with 2 μl of reservoir solution and equilibrated against 1 ml of reservoir solution. The crystals were hexagonal prisms with a maximum dimension of ~0.4 mm. X-ray data sets were collected from flash-frozen crystals at 100 K. Before freezing, the crystals were washed for a few seconds in a cryoprotectant solution that was obtained from the crystal growth solution by adding glycerol to obtain a 15% (v/v) concentration.
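The relation between the initial hit and the optimized condition can be checked with simple arithmetic. The short Python sketch below is illustrative only; the dictionary keys and the function name are hypothetical, and the concentrations are the ones quoted in the text. It shows that each component of the optimized condition corresponds to the same fraction of the initial screen condition.

```python
# Concentrations quoted in the text for the initial screening hit.
initial = {
    "ammonium sulfate (M)": 2.0,
    "PEG 400 (% v/v)": 2.0,
    "NaHEPES pH 7.5 (M)": 0.1,
}

# Concentrations quoted in the text for the optimized condition (60 mM = 0.06 M).
optimized = {
    "ammonium sulfate (M)": 1.2,
    "PEG 400 (% v/v)": 1.2,
    "NaHEPES pH 7.5 (M)": 0.06,
}


def dilution_factors(start, end):
    """Return end/start for every component (hypothetical helper name)."""
    return {name: end[name] / start[name] for name in start}


factors = dilution_factors(initial, optimized)
for name, factor in factors.items():
    print(f"{name}: optimized/initial = {factor:.2f}")
```

Under these numbers every ratio comes out the same, which is consistent with the optimized condition being a uniform dilution of the original screen solution rather than an independent variation of each component.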
Multiwavelength anomalous dispersion diffraction data were collected to 2.5-Å resolution using the SeMet protein crystals at the General Medicine and Cancer Institutes Collaborative Access Team beamline 23-ID-B of the Advanced Photon Source (Argonne National Laboratory, Argonne, IL). The crystals of the SeMet protein were soaked for ~36 h in solution containing 4 mg/ml chitotetraose, 2 M ammonium sulfate, 2% (v/v) polyethylene glycol 400, and 0.1 M NaHEPES, pH 7.5, to determine the structure of the protein complexed with the chitotetraose (NAG)4. The diffraction data from the crystals of the complex were collected to a resolution of 2.6 Å with Rigaku R-AXIS-IV home diffraction equipment. The data sets were processed with the program HKL2000 (21). The program TRUNCATE (22,23) was used to calculate intensity statistics and structure factors. Crystal parameters and diffraction data statistics are summarized in Table 1.
MARCH 14, 2008 • VOLUME 283 • NUMBER 11 • JOURNAL OF BIOLOGICAL CHEMISTRY 7245

Structure Solution, Model Building, and Refinement-The hexagonal crystals had one gp144 molecule/asymmetric unit. The structure was solved using the SeMet multiwavelength anomalous dispersion data set (Table 1). Determination of the eight selenium positions and subsequent phasing were performed with the program SOLVE (24). The phases were improved by density modification using the program RESOLVE (25). An atomic model was built into the electron density using the programs MIFit (26) and PyMOL (27). The model was subjected to several rounds of crystallographic refinement using the programs CNS (28) and PHENIX (29,30) with manual rebuilding using the program MIFit (26). To determine the structure of the protein complexed with the chitotetraose (NAG)4, the refined structure of the native protein was placed in the new unit cell and refined as a rigid body using the program MolRep (31). A difference Fourier (Fo − Fc) map showed the positions and orientations of NAG residues 1-3 of the (NAG)4 tetramer, but only weak density for residue 4, indicating that the fourth residue was partially disordered. All four NAG residues were added to the atomic model, and the model was refined with the CNS (28) and PHENIX (29,30) programs. The quality of the refined structures (Table 2) was examined with the program PROCHECK (32). A homology model of the transglycosylase domain of φKZ gp181 was built using the program MODELLER (33). Figures were prepared using the programs MolScript (34), Raster3D (35), and PyMOL (27).
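For reference, a difference Fourier synthesis of the kind used to locate the bound sugar is conventionally written as shown below. The notation is an assumption made here for illustration and is not taken from the paper: V is the unit cell volume, h runs over the measured reflections, |Fo| and |Fc| are the observed and calculated structure factor amplitudes, and φc are the phases calculated from the protein model alone.

```latex
\Delta\rho(\mathbf{r}) \;=\; \frac{1}{V}\sum_{\mathbf{h}}
  \bigl(\,|F_o(\mathbf{h})| - |F_c(\mathbf{h})|\,\bigr)\,
  e^{\,i\varphi_c(\mathbf{h})}\,
  e^{-2\pi i\,\mathbf{h}\cdot\mathbf{r}}
```

Positive peaks in such a map mark scattering matter that is present in the crystal but absent from the model, which is how ligand residues omitted from the protein-only model become visible.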
Structure of the N-terminal Domain-The N-terminal domain consists of a three-helix bundle (helices 1-3). Part of the N-terminal His tag region is visible in the electron density map (Fig. 3A). The four visible His tag residues plus two residues of the linker connecting the His tag to gp144 form an α-helix and are in contact with α-helix 3. The Ser and Ile residues in the linker are in contact with a symmetry-related molecule, suggesting that the His tag with its linker might be important for crystallization. There are three unpaired cysteine residues, two of which are accessible from the solvent and could form intermolecular disulfide bonds, accounting for the gp144 oligomers in purified protein solution.
As anticipated, the structure of the gp144 N-terminal domain is similar to the N-terminal putative peptidoglycan-binding domain of the D-Ala-D-Ala-cleaving carboxypeptidase from Streptomyces albus G (Protein Data Bank code 1lbu) (36,37). The domains of these two proteins have 35% sequence identity (Fig. 1), with a root mean square difference of 1.2 Å for 60 equivalenced Cα atoms (Fig. 4A). Domains with similar amino acid sequences have been identified in many bacterial wall lytic enzymes and cell surface-associated proteins (37,38). Like φKZ gp144 and D-Ala-D-Ala carboxypeptidase, these multimodular proteins have, in addition to their catalytic domain, another domain at either their N or C terminus. The function of these domains is presumably for binding to peptidoglycan substrates. Indeed, the N-terminal domain of gp144 can bind the cell wall peptidoglycans of many Gram-negative bacteria (9).
The N-terminal domain of D-Ala-D-Ala carboxypeptidase has repeats in its amino acid sequence that were hypothesized to be involved in peptidoglycan binding (38). Asp31-Thr38 and Asp54-Thr61 in the N-terminal domain of gp144 were found to have similar sequences to the repeating sequences in D-Ala-D-Ala carboxypeptidase and are situated near the molecular surface. The root mean square difference between equivalent backbone atoms in these two regions of gp144 is only 0.47 Å. On the molecular surface of gp144, these regions are involved in formation of small grooves and are therefore possible sites for peptidoglycan binding (Fig. 3B).
Domains with a structure similar to the N-terminal region of gp144 but with no identifiable sequence similarity occur in human matrix metalloproteinases (39,40). Superposition of the structure of the matrix metalloproteinase-2 prodomain (39) onto the N-terminal domain of gp144 gave a root mean square deviation of 1.7 Å for 36 equivalenced Cα atoms (Fig. 4B). The matrix metalloproteinases degrade extracellular matrix proteins and are thought to play a role in tumor invasion and arthritis (41,42).
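The root mean square differences quoted in this section summarize how far apart equivalenced atoms lie once two structures have been superimposed. The minimal Python sketch below shows only that final averaging step, assuming the two coordinate sets are already superimposed and listed in equivalent order; it does not perform the superposition itself and is not the procedure used by the authors. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance between paired (x, y, z) points.

    Assumes both lists are already superimposed and that element i of
    coords_a is the structural equivalent of element i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in Å (not from the gp144 structure): every pair is 1 Å apart.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(set_a, set_b))
```

Because the value depends on which atoms are counted as equivalent, a root mean square difference is only meaningful together with the number of equivalenced atoms, which is why the text reports both.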
Structure of the Catalytic Domain-The C-terminal catalytic domain of gp144 (residues 70-260) has been defined to include the proline-rich N-terminal region consisting of Pro71-Ala83 (Fig. 3A). The region Thr91-Thr213 of the gp144 C-terminal domain has significant structural similarity to the catalytic domains of family 1 lytic transglycosylases (12,14), represented by Slt70 from E. coli (43,44). The root mean square deviation between 96 equivalenced Cα atoms of gp144 and Slt70 is 1.5 Å (Fig. 5A). The transglycosylase family has structural similarity to lysozymes and especially to goose lysozyme (45), which has a larger C-terminal region compared with hen egg white lysozyme (15). A larger C-terminal region also occurs in T4 lysozyme, where it is involved in binding of the peptide component of peptidoglycan. Lytic transglycosylase family 1 has four consensus sequence motifs (I-IV), with the catalytic Glu residue being a part of motif I (14). φKZ gp144 has some limited sequence similarity to the conserved motifs I-III (Fig. 6), but not to motif IV. These motifs are involved in forming the well characterized substrate-binding sites A-F (15,46).
The substrate-binding sites of Slt70 and lysozymes can bind six sugar residues (43,45,46). Substrate cleavage occurs between NAM bound in subsite D and NAG bound in subsite E. The structure of gp144 complexed with the chitotetraose showed the NAG residues bound in subsites A-D (Figs. 5 and 7), although because of disorder, the orientation of the residue in subsite D is not clearly defined (see "Materials and Methods"). There is no significant difference between the structures of gp144 in the free and complexed forms. As in the original structure of hen egg white lysozyme (15), the natural substrate for gp144 has NAM residues at sites B and D, whereas the chitotetraose molecule has NAG at these sites. The peptide covalently bound to the NAM residue at site D can be involved in interactions with gp144. Thus, the natural substrate is probably bound more tightly to gp144 than is chitotetraose. The lack of complementarity of NAG to subsite D probably explains the disorder of the NAG residue at this subsite. However, no such effect seems to occur at subsite B. The orientation of the NAG residue at site B in the gp144-chitotetraose complex suggests that an associated peptide bound to an equivalent NAM residue would be unlikely to interact with gp144. Interactions of NAG and NAM sugar residues at site B with gp144 would be similar, indicating why the NAG residue at site B is well ordered in the gp144-chitotetraose complex. The catalytic residue of gp144 (Glu115) is located between sugar-binding subsites D and E, with no other acidic residues in the proximity of these subsites, in agreement with the proposed reaction mechanism (Fig. 2).
The structure of Slt70 was determined in complex with the 1,6-anhydromuropeptide (NAG-anhydro-NAM-L-Ala-D-Glu-meso-diaminopimelic acid) (44). The NAG residue was found in subsite E, forming hydrogen bonds with three hydrophilic residues (Gln477, Glu478, and Gln496). Two of these residues (Glu115 and Gln134) are conserved in φKZ gp144. The NAM residue of the anhydromuropeptide that would have been expected to bind to subsite F does not interact with the Slt70 molecule. Instead, the peptide covalently bound to this NAM residue binds to Slt70, although none of these residues are conserved in φKZ gp144. Apparently, subsite E (but not subsite F) is well conserved between gp144 and the typical lytic transglycosylase Slt70.
The structure of Thr213-Lys243 of gp144 has a fold different from that of Slt70 (Fig. 5A) or any other lysozyme structure. These residues, together with Phe201-Arg209, form a groove on the molecular surface of gp144. The groove is close to subsite D, a NAM-binding site. The structure of (NAG)4 complexed with gp144 suggests that an associated peptide covalently bound to an equivalent NAM residue would be in a suitable position to bind to the surface groove of gp144 (Fig. 3B).
The high sequence similarity between gp144 and the transglycosylase domain of φKZ gp181 (Fig. 1) made it possible to build a homology model of this domain. The model suggested that the structures of the gp181 and gp144 transglycosylase domains and, in particular, their substrate-binding sites are very similar.
gp181 is most probably involved in formation of the φKZ cell-puncturing device, and its function can be compared with that of gp5 in bacteriophage T4 (17). T4 gp5 has a lysozyme domain followed by a 186-residue C-terminal β-helix domain, which forms a membrane-puncturing needle. The transglycosylase domain in gp181 is also followed by an ~180-residue C-terminal domain, which might be involved in the formation of a needle-like structure visible in the cryo-electron microscopy reconstruction of the φKZ tail (4). The secondary structure prediction suggests the presence of long helices in the gp181 C-terminal domain. Thus, the φKZ membrane-puncturing needle might have a coiled-coil structure as in bacteriophage P22 (47).
The structure of the φKZ endolysin (gp144) reported here might be useful for the development of new antibacterial reagents for preventing and treating P. aeruginosa infections. Such an antibiotic would function by destroying the peptidoglycan cell wall in Gram-negative bacteria. However, to do so, the outer membrane must first be disrupted to permit access of gp144 molecules to the cell wall. The phage accomplishes the disruption by using a needle-shaped molecule as a cell-puncturing device (17). Clinically, it might be possible to apply a solution of gp144 in conjunction with a substance that permeabilizes the outer bacterial membrane.