Characterization and Three-dimensional Structures of Two Distinct Bacterial Xyloglucanases from Families GH5 and GH12*

The plant cell wall is a complex material in which the cellulose microfibrils are embedded within a mesh of other polysaccharides, some of which are loosely termed “hemicellulose.” One such hemicellulose is xyloglucan, which displays a β-1,4-linked d-glucose backbone substituted with xylose, galactose, and occasionally fucose moieties. Both xyloglucan and the enzymes responsible for its modification and degradation are finding increasing prominence, reflecting both the drive for enzymatic biomass conversion, their role in detergent applications, and the utility of modified xyloglucans for cellulose fiber modification. Here we present the enzymatic characterization and three-dimensional structures in ligand-free and xyloglucan-oligosaccharide complexed forms of two distinct xyloglucanases from glycoside hydrolase families GH5 and GH12. The enzymes, Paenibacillus pabuli XG5 and Bacillus licheniformis XG12, both display open active center grooves grafted upon their respective (β/α)8 and β-jelly roll folds, in which the side chain decorations of xyloglucan may be accommodated. For the β-jelly roll enzyme topology of GH12, binding of xylosyl and pendant galactosyl moieties is tolerated, but the enzyme is similarly competent in the degradation of unbranched glucans. In the case of the (β/α)8 GH5 enzyme, kinetically productive interactions are made with both xylose and galactose substituents, as reflected in both a high specific activity on xyloglucan and the kinetics of a series of aryl glycosides. The differential strategies for the accommodation of the side chains of xyloglucan presumably facilitate the action of these microbial hydrolases in milieus where diverse and differently substituted substrates may be encountered.

Xyloglucans comprise a family of plant polysaccharides united by a common β(1→4) glucan backbone regularly decorated at C-6 with α-linked xylopyranosyl residues (1). Numerous studies have indicated that most xyloglucans are based upon the Glc4 oligosaccharide repeats XXXG or XXGG, where G and X denote unsubstituted D-Glcp and α-D-Xylp(1→6)-D-Glcp units, respectively (2). More rarely, some species have been observed to produce xyloglucans with both Glc4 and Glc5 backbone repeats, such as XXXXG (3) and XXGGG (4). The xylose branches may in turn be substituted with combinations of galactopyranose, fucopyranose, arabinofuranose, and O-acetyl residues, depending upon the plant species and tissue localization (reviewed in Refs. 2, 5, and 6). A concise, linear notation based on single-letter abbreviations of commonly observed microstructures is widely used to simplify the description of xyloglucans and xylogluco-oligosaccharides (7). In plants, xyloglucans function both as seed storage carbohydrates (8) and as essential modulators of the mechanical properties of the primary cell wall (see Refs. 9-14 and references therein). In the latter context, xyloglucans are intimately associated with cellulose through surface adsorption and direct entrapment within the paracrystalline structure (15). Primary cell wall xyloglucans are widely distributed among land plants (16), thus suggesting that the specific cellulose-xyloglucan interaction may have conferred a particular structural advantage in the colonization of drier habitats (17,18).
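The single-letter notation can be made concrete with a short sketch that expands an oligosaccharide code into its monosaccharide composition. The G and X entries follow the definitions given above; the L (galactosylated X) and F (fucosylated L) entries are assumed from the standard nomenclature and are not defined in this section. The function names are hypothetical.

```python
from collections import Counter

# Sugar content of each backbone unit in the single-letter notation.
# G and X follow the definitions in the text; L and F are ASSUMED from the
# standard nomenclature and are not defined in this section.
UNIT_COMPOSITION = {
    "G": {"Glc": 1},
    "X": {"Glc": 1, "Xyl": 1},
    "L": {"Glc": 1, "Xyl": 1, "Gal": 1},
    "F": {"Glc": 1, "Xyl": 1, "Gal": 1, "Fuc": 1},
}


def composition(oligo: str) -> dict:
    """Return the monosaccharide counts for a code such as 'XLLG'."""
    total = Counter()
    for letter in oligo:
        if letter not in UNIT_COMPOSITION:
            raise ValueError(f"unknown unit code: {letter!r}")
        total.update(UNIT_COMPOSITION[letter])  # adds the per-unit counts
    return dict(total)


def backbone_length(oligo: str) -> int:
    """Number of backbone glucose residues (one per letter)."""
    return composition(oligo).get("Glc", 0)
```

Under these assumptions, `composition("XLLG")` gives four glucose, three xylose, and two galactose residues, and `backbone_length("XXXXG")` gives 5.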
There is strong and evolving interest in xyloglucans and also in the enzymes responsible for their modification and degradation. Such interest stems not only from the role of these polysaccharides and their catalysts in plant cell wall morphogenesis (9,10,13), but also from biotechnological applications as diverse as fruit juice clarification (19,20), textile processing (21,22), cellulose surface modification (23-27), pharmaceutical delivery (28-30), production of food thickening agents (30,31), as well as the production of xylogluco-oligosaccharides for cell wall analysis (4,5,32), plant growth modulation (33,34), surfactant synthesis (35), and enzyme kinetic studies (36). Furthermore, the goal of biofuel production from plant biomass, which strives to substantially reduce fossil fuel usage, has caused a great resurgence of interest in plant cell wall degrading enzymes (37). However, plant biomass remains extremely difficult to exploit, primarily because its components are extremely resistant to degradation; plant cell wall polysaccharides are often present as insoluble, cross-linked structures. Furthermore, the chemistry of the glycosidic bond itself makes its hydrolysis one of the most challenging reactions in nature, with Wolfenden showing that, in the absence of biological catalysts for its degradation, cellulose has a half-life in excess of 4 million years (38).
The biocatalysts responsible for the hydrolysis of the backbone of xyloglucan are xyloglucan endo-β-1,4-glucanases or "xyloglucanases" (EC 3.2.1.151). This enzyme commission number reflects many different enzyme sequences, structures, and hydrolytic mechanisms with either inversion or retention of the configuration of the anomeric carbon. In the sequence-based CAZy (carbohydrate active enzymes) classification (39) (recently reviewed in Ref. 40), enzymes defined as xyloglucanases are found in retaining families GH5, GH12, and GH16 and inverting families GH44 and GH74. To date, xyloglucanase structures have only been reported for the GH74 family enzymes (41,42), although the coordinates for a Clostridial xyloglucanase Cel44A are deposited but currently unavailable (Protein Data Bank code 2D8G). The structure of the poplar xyloglucan endotransglycosylase (EC 2.4.1.207) from family GH16, which strongly favors transglycosylation of xyloglucan over hydrolysis, has also been determined (43).
Here we report the characterization, both on polymeric substrates and defined aryl xyloglucan oligosaccharides (Fig. 1), of two structurally distinct xyloglucan hydrolases from families GH5 and GH12: P. pabuli XG5 (hereafter PpXG5) and B. licheniformis XG12 (hereafter BlXG12). The single crystal x-ray structures of both enzymes have been determined, at resolutions from 1.95 to 1.40 Å, in both an unliganded form and in complex with xyloglucan oligosaccharides based upon a cellotetraose backbone. These three-dimensional complexes, interpreted in light of the kinetics of the two enzyme classes on xyloglucan-derived substrates, provide a unique insight into the different ways xyloglucan side chains are accommodated and/or harnessed for catalysis.

Cloning and Expression of PpXG5
The 40-kDa xyloglucanase produced by the P. pabuli strain (DSM 13330) was cloned by standard methods. Briefly, purified genomic P. pabuli DNA was partially digested by Sau3A and cloned into an Escherichia coli-based lambdaZAPexpress vector (Stratagene, La Jolla, CA). Ligated DNA was packaged in phages using the GigaPackIII gold kit (Stratagene). Eventually, plaque-forming phages were screened on agar plates containing AZCL-xyloglucan (Megazyme International Ireland Ltd.), and positive clones were seen by the formation of blue halos. The gene encoding PpXG5 was DNA-sequenced, and PCR primers were designed for amplification of the gene from P. pabuli genomic DNA. The PpXG5 gene was cloned into a B. subtilis expression vector and expressed as a secreted form from the amyl promoter. PpXG5 was recovered from the broth through a combination of chemical and physical separation steps. The supernatant was applied to a pre-equilibrated S-Sepharose column at pH 5 in 20 mM sodium acetate buffer. PpXG5 was eluted with a gradient of 1 M NaCl in 20 mM sodium acetate, and the appropriate fractions were pooled. PpXG5 was further purified by gel filtration on an S200 column in 0.1 M sodium acetate buffer, pH 6.

Cloning and Expression of BlXG12
The 26-kDa xyloglucanase produced by the B. licheniformis strain (ATCC14580) was cloned from B. licheniformis genomic DNA, and the gene was expressed in B. subtilis by essentially the same protocol as described above for PpXG5. Protein purification also followed a similar strategy. Enzymatic variants of BlXG12 were constructed using the megapriming method and purified as for the wild type enzyme.
pH Rate Profiles-The pH rate profiles of PpXG5 and BlXG12 were determined in triplicate using the method described by Nelson and Somogyi with tamarind xyloglucan (1 g/liter) as the substrate. The following 50 mM buffer systems were used: sodium acetate, pH 4.0-5.5, and sodium phosphate, pH 5.75-8.
G5000HHR and G3000HHR (both 7.8 × 300 mm), connected in series, and an evaporative light-scattering detector (PL-ELS 1000; Polymer Laboratories). HPLC grade Me2SO was used as the eluent at a flow rate of 1.0 ml/min, and the column temperature was maintained at 60°C (27).
Limit Digest Analysis-Extended enzymatic hydrolysis of xyloglucan (0.5 g/liter) in 50 mM NaOAc buffer, pH 5.5, was performed at 37°C overnight with PpXG5 (280 µg/liter) or BlXG12 (58,000 µg/liter). Samples were analyzed with a Dionex ICS-3000 high performance anion exchange chromatography system with pulsed amperometric detection (HPAEC-PAD) and a Dionex PA-100 column using a gradient modified from that previously described (41). Conditions were as follows: solvent A, 1.0 M NaOH; solvent B, 1.0 M NaOAc; solvent C, ultrapure water; flow rate, 0.8 ml/min. The gradient program was as follows: 0-3 min, 100 mM NaOH, 40 mM NaOAc; 3-18 min, linear gradient from 40 to 300 mM NaOAc; 18-19 min, gradient up to 500 mM NaOH and 500 mM NaOAc and then initial conditions for 4 min.
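As an illustration of how the eluent concentrations in the gradient program relate to the three solvent reservoirs, the following minimal sketch converts target NaOH and NaOAc concentrations into pump percentages. It assumes simple volumetric mixing of the 1.0 M stocks with water; the function name and its parameters are hypothetical and are not taken from the instrument software.

```python
def pump_percentages(naoh_mM, naoac_mM,
                     stock_naoh_mM=1000.0, stock_naoac_mM=1000.0):
    """Return (%A, %B, %C) for solvent A (NaOH), B (NaOAc), and C (water).

    Assumes ideal volumetric mixing of the stock solutions (an assumption,
    not a statement about how the instrument was actually programmed).
    """
    pct_a = 100.0 * naoh_mM / stock_naoh_mM
    pct_b = 100.0 * naoac_mM / stock_naoac_mM
    pct_c = 100.0 - pct_a - pct_b
    if pct_c < -1e-9:
        raise ValueError("requested concentrations exceed the stock solutions")
    return pct_a, pct_b, max(pct_c, 0.0)


# Key points of the gradient program quoted above
start = pump_percentages(100, 40)     # 0-3 min
ramp_end = pump_percentages(100, 300) # end of the 3-18 min linear ramp
wash = pump_percentages(500, 500)     # 18-19 min
```

Under these assumptions the starting conditions correspond to 10% A, 4% B, and 86% C, and the 500 mM/500 mM step uses no water at all.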
The enzymatic hydrolyses of GGGG-CNP, XXXG-CNP, and XLLG-CNP were followed by continuous assays measuring the release of 2-chloro-4-nitrophenolate at 405 nm (measured ε 9724 M⁻¹ cm⁻¹, 5 mM NaOAc buffer, pH 5.5) using a Cary 300 Bio UV-visible spectrophotometer (Varian). A total assay volume of 100 µl was used in 1-cm path length quartz cells equilibrated and maintained at 30 ± 0.1°C in a Peltier-controlled cell block. Initial rates were determined from the slope of the linear region of the reaction time course corresponding to <10% conversion. Assays of PpXG5 and BlXG12 employed total enzyme (protein) concentrations of 1.4 µg/ml (0.035 µM) and 46.8 µg/ml (1.79 µM), respectively. Kinetic constants were obtained from plots of v0/[E]t versus [S] by nonlinear curve fitting using Microcal™ Origin version 6.0. In the absence of an active site titrant, [E]t was assumed to be equivalent to the total protein concentration (i.e. 100% active protein).
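The arithmetic behind these assays can be sketched as follows, assuming the quoted enzyme concentrations are in µg/ml and µM and that the Beer-Lambert law holds over the linear region. The helper names are hypothetical, and the nominal 40-kDa molar mass used in the usage note is an approximation taken from the cloning section.

```python
EPSILON_CNP = 9724.0   # M^-1 cm^-1 at 405 nm, value quoted in the text
PATH_LENGTH_CM = 1.0   # cuvette path length quoted in the text


def molar_conc_uM(mass_conc_ug_per_ml, molar_mass_da):
    """Convert a protein concentration from ug/ml to uM."""
    # ug/ml equals mg/liter, i.e. 1e-3 g/liter; dividing by g/mol gives M
    return mass_conc_ug_per_ml * 1e-3 / molar_mass_da * 1e6


def turnover_rate(dA_per_s, enzyme_uM,
                  epsilon=EPSILON_CNP, path_cm=PATH_LENGTH_CM):
    """Return v0/[E]t (s^-1) from the initial slope dA405/dt (s^-1)."""
    v0_molar_per_s = dA_per_s / (epsilon * path_cm)  # Beer-Lambert law
    return v0_molar_per_s / (enzyme_uM * 1e-6)


def michaelis_menten(s, kcat, km):
    """Standard Michaelis-Menten expression for v0/[E]t."""
    return kcat * s / (km + s)
```

With a nominal molar mass of 40,000 Da, `molar_conc_uM(1.4, 40_000)` reproduces the 0.035 µM quoted for PpXG5. Fitting `michaelis_menten` to the normalized rates would require a nonlinear least-squares routine, which is left out of this sketch.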
visible spectrophotometer (Varian) and subtraction of background values from XXXG-pMP.

Crystallization, Data Collection, and Structure Solution of PpXG5
PpXG5, at ~15 mg/ml, was crystallized from 20-25% (w/v) polyethylene glycol 8000, 0.2 M CaCl2, and 0.1 M Tris-HCl, pH 8.5. Single crystals were cryoprotected with mother liquor with the addition of 25% (v/v) ethylene glycol. Data were collected to 1.40 Å on ESRF beamline ID14-3 and processed with DENZO/SCALEPACK (46). The structure was solved by molecular replacement using the Clostridium cellulolyticum Cel5A (CelCCA) as the search model (Protein Data Bank code 1EDG) (47) with the CCP4 (48) version of the program AMORE (49), using the default parameters. An initial model was built automatically with the CCP4 installation of ARP-wARP, and the structure was refined using REFMAC (50) with manual corrections using QUANTA (Accelrys, San Diego, CA) and COOT (51). Data and structure statistics are given in Table 1.
Data were collected at the ESRF on beamline ID14-1 to 1.95 Å resolution from a single crystal cooled to 100 K. Data were integrated and processed with DENZO and scaled and merged with SCALEPACK in HKL2000 (46). All subsequent computing was done using the CCP4 suite of programs (48). The structure of the PpXG5 complex was solved by molecular replacement with the native structure of PpXG5 as the search model, using the CCP4 version of AMORE (49) with data between 15 and 3 Å and an outer radius of Patterson integration of 25 Å. The structure was subsequently refined using REFMAC (50) interspersed with manual corrections and the addition of waters in COOT (51).

Crystallization, Data Collection, and Structure Solution of BlXG12
Initial crystals of the native form of BlXG12 were obtained in a 50:50 merohedrally twinned crystal form (apparent space group P4n, a = b = 124 Å, c = 50 Å; data to 2.2 Å, Rmerge = 10% at edge). Subsequently, an E155Q variant of BlXG12 was used to obtain a nontwinned crystal form. BlXG12(E155Q) was crystallized from 10% (w/v) polyethylene glycol 4000, 0.05 M MgCl2, and 0.1 M BisTris, pH 6.5. Native data were collected in the home laboratory (YSBL, York, UK) using CuKα radiation and a MARResearch Image Plate system. Data were processed with DENZO/SCALEPACK (46), and the structure was solved with the Streptomyces lividans Cel12 (52, 53) as the search model. ARP-wARP was unable to build this model automatically, so partial manual rebuilding and refinement with REFMAC (50) were required to produce a starting model within the convergence radius of ARP-wARP. Following completion of a partial model, refinement continued with REFMAC with manual corrections using QUANTA (Accelrys) and COOT (51).

Structure of a Ligand Complex of BlXG12 (E155A)
BlXG12 E155A, in 25 mM acetate buffer and in the presence of ~10 mM mixed xyloglucan oligosaccharides, was crystallized from 1.6 M ammonium sulfate, 10% dioxane, and 0.1 M 2-(N-morpholino)ethanesulfonic acid, pH 6.5. The mother liquor with the addition of 23% glycerol was used to cryoprotect the crystal prior to flash-freezing in liquid nitrogen.
Data were collected at the ESRF on beamline ID14-1 from a single crystal cooled to 100 K to 1.40 Å resolution. The structure was solved using AMORE (49) with the unliganded BlXG12 E155Q mutant as the search model, and was refined as described previously for the PpXG5 complex.

RESULTS
To discover novel xyloglucan-degrading enzymes, fragmented genomic DNA from both B. licheniformis and P. pabuli was cloned into E. coli and bacteriophage expression vectors, and the subsequent libraries were screened for the expression of xyloglucan active enzymes by plating onto AZCL-xyloglucan-agarose. Using this strategy, BlXG12 and PpXG5 were discovered with initial data on dyed xyloglucans, demonstrating that these enzymes could be classified as "xyloglucanases." Despite family GH5 having over 900 members, the PpXG5 enzyme is almost unique, with only two sequences from Paenibacillus sp. KM21 and Bacillus sp. BP-23 having >45% sequence identity. The former has been shown to be an obligate xyloglucanase (54), whereas the latter finds use in straw processing applications (55) but has not, to our knowledge, been tested on xyloglucan substrates. Family GH12 has ~110 members. BlXG12 shows high (>50%) sequence identity with only one other enzyme, that from Pectobacterium carotovorum (Erwinia carotovora) with ~67% identity, which has been described as an endoglucanase, but again activity on xyloglucan was not reported (56,57). The next most similar sequences are those from Streptomyces avermitilis (41% identity) and EG12 from Rhodothermus marinus (30% identity), whose three-dimensional structure has previously been reported (58).
Kinetic Analysis of PpXG5-Relative hydrolysis rates indicate that PpXG5 is an exclusive xyloglucanase, with no detectable activity on a range of other glucan, xylan, mannan, or pectic polysaccharides (Table 2), as observed recently for a close homolog (54). The dependence of the enzymatic hydrolysis rate of xyloglucan on pH was classically bell-shaped (data not shown), with apparent kinetic pKa values of two ionizable groups of 4.5 ± 0.2 and 8.0 ± 0.2. Size exclusion chromatography and HPAEC-PAD demonstrated that PpXG5 hydrolyzes tamarind xyloglucan endolytically to produce a mixture of the Glc4-based oligosaccharides XXXG, XLXG, XXLG, and XLLG (Fig. 2).
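A bell-shaped pH dependence governed by two ionizable groups is commonly described by a two-pKa model, sketched below. The text does not state which equation was fitted, so this form and the function names are assumptions used only to illustrate how the two apparent pKa values shape the profile.

```python
def bell_shaped_rate(ph, pka1, pka2, v_max=1.0):
    """Rate for an enzyme active only when group 1 is deprotonated
    and group 2 is protonated (two-pKa model; an assumed form)."""
    return v_max / (1.0 + 10.0 ** (pka1 - ph) + 10.0 ** (ph - pka2))


def ph_optimum(pka1, pka2):
    """pH at which the two-pKa model reaches its maximum rate."""
    return (pka1 + pka2) / 2.0


# Apparent kinetic pKa values quoted for PpXG5 in the text
optimum = ph_optimum(4.5, 8.0)
relative_rate_at_optimum = bell_shaped_rate(optimum, 4.5, 8.0)
```

In this model the rate falls to roughly half of its plateau value at each pKa, and the optimum lies midway between them (pH 6.25 for the values quoted above).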
PpXG5 was also active on synthetic CNP (59) β-glycosides of GGGG, XXXG, and XLLG, and the pMP β-glycoside of XXXG (Fig. 1B). In the case of GGGG-CNP, XXXG-CNP, and XXXG-pMP, PpXG5 exhibited classical saturation kinetics, and the data (Fig. 3) were readily fit by the standard Michaelis-Menten equation. The rate of the PpXG5-catalyzed hydrolysis of XLLG-CNP, however, showed a more complex dependence on substrate concentration (Fig. 4), which indicated that the glycosyl-enzyme intermediate, Egly, was capable of binding a second molecule of substrate at high [S], giving rise to substrate inhibition. This was fit appropriately to yield the substrate inhibition constant Kis, in addition to kcat and Km (Table 3). The degree of substrate inhibition is quite low for XLLG-CNP and is only manifested at substrate concentrations well above the apparent Km value (Fig. 4, Table 3). GH5 enzymes are retaining, with catalysis occurring via the formation and subsequent breakdown of a covalent glycosyl-enzyme intermediate. With a good leaving group, such as 2-chloro-4-nitrophenol     substrate to form a dead end complex; no evidence for transglycosylation leading to formation of XLLGXLLG-CNP or higher oligomers was observed by HPAEC-PAD. The values of the macroscopic kinetic constants obtained for the action of PpXG5 on XXXG-CNP and XLLG-CNP are summarized in Table 3. Although both substrates have Km values in the micromolar range, the Km value for XLLG-CNP is 2-fold lower. The ratio of kcat/Km indicates that PpXG5 is more selective for this substrate than XXXG-CNP by a factor of ~3. The presence of one or both additional galactose residues thus enhances catalysis by the enzyme, lowering the activation barrier to the first chemical step by 3.0 kJ/mol. Assuming that hydrolysis of the glycosyl-enzyme is rate-determining, the ratio of kcat values indicates that galactosylation also increases the rate of this step, albeit by a modest 1.4-fold.
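The qualitative behavior described for XLLG-CNP can be illustrated with the textbook substrate inhibition rate law, in which binding of a second substrate molecule adds an [S]²/Kis term to the denominator. The text does not give the exact equation that was fitted, so the form below is an assumption, and the parameter values in the usage lines are hypothetical placeholders rather than the values in Table 3.

```python
import math


def rate_substrate_inhibition(s, kcat, km, kis):
    """v0/[E]t for the textbook substrate inhibition model (assumed form)."""
    return kcat * s / (km + s + s * s / kis)


def substrate_at_max_rate(km, kis):
    """Substrate concentration giving the maximum rate in this model."""
    return math.sqrt(km * kis)


# Hypothetical parameters (same concentration units for s, km, and kis)
kcat, km, kis = 1.0, 50.0, 5000.0
s_opt = substrate_at_max_rate(km, kis)
v_opt = rate_substrate_inhibition(s_opt, kcat, km, kis)
```

When Kis is much larger than Km, as in this hypothetical case, the curve follows Michaelis-Menten behavior at low [S] and declines only well above Km, which matches the weak inhibition described above.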
Notably, PpXG5 exhibited a comparatively low specificity constant for GGGG-CNP, which can be considered an XXXG-CNP homolog with all branching residues removed (Table 3); the kcat/Km value for this substrate was 85-fold lower than for XXXG-CNP and 280-fold lower than for XLLG-CNP. The specificity of PpXG5 is discussed below in light of the three-dimensional structure of the enzyme.
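The 85-fold and 280-fold figures imply a kcat/Km ratio of about 3.3 between XLLG-CNP and XXXG-CNP. The sketch below converts such a ratio into a difference in activation free energy, assuming the usual relation ΔΔG‡ = RT ln(ratio) and the assay temperature of 30°C; the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def ddg_kj_per_mol(specificity_ratio, temp_celsius=30.0):
    """Difference in activation free energy (kJ/mol) implied by a
    ratio of kcat/Km values, assuming ddG = RT ln(ratio)."""
    temp_kelvin = temp_celsius + 273.15
    return GAS_CONSTANT * temp_kelvin * math.log(specificity_ratio) / 1000.0


# Ratio of specificity constants, XLLG-CNP versus XXXG-CNP
ratio = 280.0 / 85.0
barrier_difference = ddg_kj_per_mol(ratio)
```

Under these assumptions the ratio corresponds to approximately 3.0 kJ/mol, in line with the barrier lowering attributed to the galactose residues.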
Three-dimensional Structure of PpXG5-The structure of PpXG5 was solved in a native form with data to 1.40 Å resolution. There are two molecules in the asymmetric unit, which are essentially identical. The chain can be traced continuously from residue 33 to 395 in the electron density. It should be noted that the residue numbering corresponds to the protein including a signal peptide, but this was cleaved during the gene expression (as judged by mass spectrometry, which gave an m/z consistent with a mass of ~40,620 Da). PpXG5 displays a (β/α)8 barrel fold, as is typical of family 5 and clan GH-A enzymes. An open groove runs across the surface of the whole protein, which constitutes the substrate binding subsites (Fig. 5A). Structural similarity searches using the SSM server (62) reveal, not surprisingly, close similarities to other GH-A clan family GH5 enzymes, notably a cellulase from C. cellulolyticum (CelCCA) (47), which has 33% identity and a P-score of 21.8 (corresponding to an r.m.s. deviation of 1.28 Å for 316 aligned Cα positions).
Catalysis by family 5 enzymes occurs with retention of anomeric configuration, which involves a double displacement mechanism and goes via a covalent glycosyl-enzyme intermediate (63). Two carboxylate-containing residues are involved, one that acts as a nucleophile to attack at the anomeric center and another that acts as an acid/base residue to protonate the glycosidic oxygen during the first step of the mechanism and deprotonate a water molecule during the second step. These catalytic residues have been identified as Glu182 (acid/base) and Glu323 (nucleophile) in PpXG5 by analogy with other GH5 enzymes.
Ligand Complex of PpXG5-To determine if crystallization could be used to screen for ligand specificity, PpXG5 was crystallized in the presence of mixed xyloglucan oligosaccharides (2:1:3:3 mixture of XXXG/XLXG/XXLG/XLLG) to obtain structural information on the interactions made with the enzyme. Data on the substrate complex were collected to 1.95 Å resolution. The crystals grew in a different space group to the native crystals, and there is one molecule in the asymmetric unit. The chain can be traced continuously from residue 37 to 395. The native structure and the complex superimpose well, with an r.m.s. deviation of 0.5 Å for the Cα atoms.
There is well defined electron density for a number of sugar rings in the active site of PpXG5, which corresponds to a molecule of XXLG bound in the minus subsites (Fig. 5B). There are four β-1,4-glucose moieties in subsites −1, −2, −3, and −4, two α-1,6 xylose residues branched from the glucose moieties in subsites −2 and −3 (the xylose residue that must be present in the −4 subsite is too disordered to be observed in the electron density), and a β-1,2-linked galactoside linked to the xylose in the −2 subsite. No electron density can be observed for a similarly linked galactose residue in the −3 subsite (which would give XLLG, a molecule also present in the mixture of oligosaccharides co-crystallized with PpXG5).
The hydroxyl group at C-1 of the glucose residue in the −1 subsite hydrogen-bonds with both the acid/base (Glu182) and nucleophile (Glu323) residues; the C-2 hydroxyl interacts with Glu323, Asn181, and His131, and the C-3 hydroxyl also hydrogen-bonds with His131. Asn363 interacts with both of the hydroxyl groups at C-2 and C-3 of the glucose moiety in the −2 subsite, and the C-3 hydroxyl also hydrogen-bonds with Asn50. The xylose residue in the −2 subsite hydrogen-bonds with Ser137. The sugars in the −3 and −4 subsites make no hydrogen bond interactions with the enzyme but only with solvent molecules. There are a number of hydrophobic interactions between aromatic residues and the faces of the sugars, including Trp361 (with glucose in the −1 subsite), Tyr135 (with xylose in the −2 subsite), Trp65 (with glucose in the −3 subsite), and His365 (with xylose in the −3 subsite). Interactions are shown in Fig. 6.
Kinetic Analysis of BlXG12-Of the polysaccharide substrates tested (Table 2), BlXG12 demonstrated highest activity toward xyloglucan, but also showed significant activity toward carboxymethyl cellulose, konjac glucomannan, and barley β-glucan (Table 3). The pH dependence of the hydrolysis of xyloglucan by BlXG12 was bell-shaped (data not shown), with apparent kinetic pKa values of 4.1 ± 0.1 and 7.8 ± 0.1. Size exclusion chromatography and HPAEC-PAD indicated that the hydrolysis of tamarind xyloglucan by BlXG12 occurred in an endolytic fashion to produce a limit digest composed of a mixture of XXXG, XLXG, XXLG, and XLLG (Fig. 7).
BlXG12 was active on the chromogenic substrates GGGG-CNP, XXXG-CNP, XLLG-CNP, and XXXG-pMP (Fig. 8, Table 3). Based upon kcat/Km values, the specificity of this enzyme was inversely related to the degree of substrate branching. BlXG12 hydrolyzed GGGG-CNP to liberate the aglycon with a similar Km value and a ~5-fold greater kcat value compared with XXXG-CNP. XXXG-CNP and XLLG-CNP exhibited similar kcat values, whereas the Km value for the galactosylated substrate was 3.5-fold higher (Table 3). Interestingly, the enzyme was prone to substrate inhibition by XXXG-CNP (Fig. 8A) but not XLLG-CNP (Fig. 8B). Both the substrate inhibition and specificity trends are opposite to those observed for PpXG5. Indeed, the presence of three additional xylosyl units on the Glc4 backbone imposes a 4.5 kJ/mol penalty on the first catalytic step, whereas two additional galactosyl units increase the ΔΔG value by a further 3.3 kJ/mol, indicating that branching of the glucan chain retards catalysis by BlXG12. Furthermore, the insensitivity of the Km value of XXXG aryl glycosides to the pKa of the aglycon may indicate that formation of the glycosyl-enzyme intermediate is rate-limiting for these substrates.
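As a rough consistency check, the quoted free energy penalties can be converted back into fold changes in kcat/Km. The sketch assumes the relation fold = exp(ΔΔG/RT) at the assay temperature of 30°C; the function name is hypothetical, and the resulting numbers are estimates rather than values from Table 3.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def fold_change_from_ddg(ddg_kj_per_mol, temp_celsius=30.0):
    """Fold change in kcat/Km implied by a free energy penalty,
    assuming fold = exp(ddG / RT)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kj_per_mol * 1000.0 / (GAS_CONSTANT * temp_kelvin))


xylosyl_penalty = fold_change_from_ddg(4.5)       # GGGG-CNP versus XXXG-CNP
galactosyl_penalty = fold_change_from_ddg(3.3)    # XXXG-CNP versus XLLG-CNP
combined_penalty = fold_change_from_ddg(4.5 + 3.3)
```

Under these assumptions, 4.5 kJ/mol corresponds to a roughly 6-fold difference and 3.3 kJ/mol to a roughly 3.7-fold difference, which is broadly consistent with the ~5-fold kcat and 3.5-fold Km differences described above.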
Three-dimensional Structure of BlXG12-The structure of the BlXG12 nucleophile mutant (E155Q) was solved with data to 1.78 Å resolution. There are two molecules in the asymmetric unit, which are essentially identical. The chain can be traced continuously from residue 31 to 261 in the electron density. The residue numbering corresponds to the protein including a signal peptide, which was cleaved during the gene expression (as judged by mass spectrometry, which gave an m/z value consistent with a mass of ~25,994 Da). BlXG12 displays a β-jelly roll fold, as shown by other family 12 (and clan GH-C) enzymes. A cleft runs across the surface of the protein, which constitutes the substrate binding subsites; this cleft appears to be deeper than observed with PpXG5 (Fig. 9A). Structural similarity searches using the SSM server (62) confirm similarities to GH12 family enzymes, with the closest apparent match being the Humicola grisea Cel12A (HgGH12) (64) (which has 23% identity and a P-score of 6.9, corresponding to an r.m.s. deviation of 1.52 Å for 209 matched Cα positions).
Family 12 enzymes, like family 5 enzymes, catalyze with retention of anomeric configuration in a two-step mechanism. BlXG12 and related enzymes do, however, possess an acid/base residue that protonates syn to the pyranoside O-5-C-1 bond, in contrast to family 5 enzymes, which are anti-protonators (65). The important catalytic residues in BlXG12 are Glu243 (acid/base residue) and Glu155 (nucleophile residue, which has been mutated during the structural studies described here).
JUNE 29, 2007 • VOLUME 282 • NUMBER 26

Ligand Complex of BlXG12-BlXG12 E155A was crystallized in the presence of the same xyloglucan oligosaccharide mixture as described for PpXG5, and data were collected to 1.40 Å resolution. Once again, the substrate complex crystallized in a different space group to the unliganded (E155Q) structure, and there is one molecule in the asymmetric unit. The chain can be traced continuously from residue 29 to 261. The unliganded and complex structures superimpose with an r.m.s. deviation of 0.6 Å for the Cα atoms.
The electron density for the substrate complex of BlXG12 clearly shows a number of sugar rings in both the positive and negative subsites, corresponding to the observation of two molecules of XXXG (Fig. 9B). There are four β-1,4-glucose moieties in subsites −1, −2, −3, and −4 and two α-1,6-xylose residues branched from the glucose moieties in subsites −2 and −3 (the xylose residue that must be attached to the glucose in the −4 subsite is too disordered to be observed in the electron density). No electron density can be observed for β-1,2-linked galactose residues on the xylose residues in either the −2 or −3 subsites. Similarly, there are two β-1,4-glucose moieties in subsites +1 and +2 and two α-1,6-xylose residues, one branched from each of them. There is disordered electron density in the +3 subsite corresponding to the third glucose residue, but this cannot be built with confidence, and neither the likely xylose moiety in the +3 subsite nor the glucose residue in the +4 subsite is observed.
The hydroxyl group at C-1 of the glucose residue in the −1 subsite is observed to mutarotate; both the α- and β-anomers interact with Met157, and the β-anomer also interacts with Asp137 (interactions are shown in Fig. 10). The C-2 hydroxyl group hydrogen-bonds with Trp197, the C-6 hydroxyl group hydrogen-bonds with Trp53 and Glu243 (the acid/base residue), and the endocyclic oxygen also interacts with Glu243. The hydroxyl group at C-2 of the glucose moiety in the −2 subsite interacts with Asn51, and the C-3 hydroxyl interacts with His97; the xylose residue in the −2 subsite only makes interactions with solvent molecules. Neither of the glucose moieties in the −3 and −4 subsites makes any hydrogen bond interactions with protein residues. The C-4 hydroxyl group of the xylose residue in the −3 subsite, however, hydrogen-bonds to both Ser36 and the main chain nitrogen of Val52. Hydrophobic interactions are made between Trp139 and the glucose moiety in −1, Trp53 and the glucose moiety in −2, and Trp98 and the glucose moiety in −3.
The sugars bound in the plus subsites of BlXG12 make relatively few interactions with the protein. The hydroxyl groups at C-3 and C-4 of the glucose moiety in the +1 subsite both interact with the acid/base residue (Glu243), and the hydroxyl group at C-2 hydrogen-bonds with the main chain carbonyl group of Gly166. The xylose moiety in the +2 subsite hydrogen-bonds with the main chain nitrogen atom of Gly166 and stacks with Phe245. Neither the xylose residue in the +1 subsite nor the glucose moiety in the +2 subsite makes any interactions with the enzyme.

DISCUSSION
What constitutes a xyloglucanase? Formally one might define a xyloglucanase as an enzyme with a catalytic preference for xyloglucan substrates, as opposed to other glucans. In practice, such a distinction is difficult, since long unsubstituted
PpXG5 and BlXG12 exemplify the spectrum of enzyme activities that one might appropriately term xyloglucanases. BlXG12 is more active on tamarind xyloglucan than the other polysaccharides tested, yet it is only slightly better on this substrate than on the best artificial substrate, low viscosity carboxymethyl cellulose. On the panel of aryl oligosaccharides examined, BlXG12 clearly prefers a naked glucan chain in the glycon (negative) subsites, and catalysis is impaired in the presence of xyloside and further galactoside substituents. Despite this kinetic preference, BlXG12 does not legislate against galactose substituents, since limit digest analysis gives the full spectrum of possible xyloglucan oligosaccharides (based upon a Glc4 backbone) such that XXLG, XLXG, and XLLG must be accommodated in both negative and positive subsites of the enzyme.
In contrast to the accommodation of side chain sugars by BlXG12 and the partial preference of the enzyme for tamarind xyloglucan, PpXG5 is a significantly "better" and more specific xyloglucanase under the conditions used. Indeed, PpXG5 is active only on the substituted polymer and not on any of the other polysaccharides tested, and this activity is extremely high compared with BlXG12. Furthermore, PpXG5 favors xyloglucan oligosaccharides in its negative subsites such that, for good leaving groups, binding and formation of the covalent intermediate are extremely rapid, and deglycosylation is most likely rate-limiting. Furthermore, the galactoside moieties are harnessed productively by the enzyme, as reflected in the 4-fold better catalytic efficiencies on XLLG-CNP versus XXXG-CNP.
This span of different xyloglucan specificities, from toleration and partial harnessing by BlXG12 through to absolute specificity and harnessing of extended substituents by PpXG5, is well reflected in their respective three-dimensional structures. For studies of both enzymes in complex with ligand, we intentionally screened a mixture of oligosaccharides in order to sift the most favored xyloglucan-derived oligosaccharide from the 2:1:3:3 mixture of XXXG/XLXG/XXLG/XLLG. In the case of BlXG12, this yielded a −4XXXG−1 +1XX+2 complex (XXXG spanning subsites −4 to −1 and XX in subsites +1 and +2), reflecting binding of a minor component of the mixture, but entirely consistent with the kinetic preference for the XXXG-aryl substrate over the galactosylated substrates. Strong positive subsite binding may also be reflected in the substrate inhibition observed for this enzyme (as is the case with, for example, Humicola insolens Cel7B (67)). In the case of PpXG5, the same experiment yielded XXLG bound in the −4 to −1 subsites, consistent with both the importance of these negative subsites to formation of the covalent intermediate and the kinetic preference for galactosyl moieties.
A comparison of PpXG5 and BlXG12 with other members of their respective families gives an insight into what allows them to act on xyloglucan-derived substrates or indeed prevents other enzymes from having this capacity. Inspection of the active site overlap of the closest structural homolog to PpXG5, CelCCA from C. cellulolyticum (47) (Fig. 11A), as well as primary sequence alignments of family 5 members, reveals that Trp361 and Trp65, which stack with glucose residues in the −1 and −3 subsites, respectively, are conserved among GH5 members. However, Ser137, which interacts with the C-3 hydroxyl group of the xylose residue in the −2 subsite, and Tyr135, which stacks with the same xylose residue, are found on a loop that is positioned differently in CelCCA and other GH5 members, such as Cel5A (a cellulase from Bacillus agaradhaerens (66)). Primary sequence alignments of about 20 family GH5 open reading frames show that PpXG5 and the close homologs from Bacillus sp. BP-23 (55) and Paenibacillus sp. KM21 (54) have either a serine or threonine residue at this position, which is in a conserved region with the motif GDG(F/Y)(H/N)(S/T)(I/V) that is not apparent in any of the other family 5 sequences. Likewise, His365, which stacks with the xylose residue in the −3 subsite, appears to be in a conserved YWDNG(H/F) motif when compared with the Bacillus sp. BP-23 and Paenibacillus sp. KM21 sequences, but this motif is missing from other GH5 sequences. The superposition of the PpXG5 structure with CelCCA and Cel5A shows that although the protein backbone for each structure is in a similar position in the region of His365, CelCCA and Cel5A have nonaromatic residues in this position that are not in an orientation to interact with the xylose. As well as PpXG5 possessing residues that promote interactions with xyloglucan-derived substrates, other GH5 members possess residues that are likely to prevent binding of them.
For example, His123 (CelCCA) or Tyr66 and Leu103 (Cel5A) would clash with the xylose residue in the −2 subsite, Phe42 (CelCCA) or Ser69 (Cel5A) would block the galactose residue in the −2 subsite, and Lys267 (Cel5A) would prevent a xylose binding in the −3 subsite.
Rationalization of the BlXG12 xyloglucanase activity compared with other GH12 members is, however, more difficult. BlXG12 makes no interactions with the xylose residue in the −2 subsite, and superposition with its closest structural homolog, HgGH12 from H. grisea (Protein Data Bank entry 1UU6 (64)) (Fig. 11B), demonstrates that a xylose would not clash with any active site residues. However, Tyr9 of HgGH12 would clash with the xylose in the −3 subsite; in this equivalent position in BlXG12, there is a serine that makes productive hydrogen bond interactions with the xylose. In the positive subsites, the overlap with HgGH12 shows that there is nothing to prevent binding of a xylose residue in the +1 subsite, but Tyr132 and Arg97 would block binding in the +2 subsite; in the equivalent positions, BlXG12 possesses a threonine and glycine, respectively. The lack of information about whether other GH12 members possess xyloglucanase activity makes it difficult to draw conclusions about whether the type of residues at these positions can be used to predict the activity of an individual enzyme.
PpXG5 and BlXG12, together with the recently studied CtXG74 xyloglucanase (41), highlight the diversity of microbial xyloglucan-active hydrolases available in nature. Given the massive importance of biofuels and the potential applications of xyloglucan oligosaccharides, the challenge remains to determine how best to harness this spectrum of activities for optimal applied usage.