Biochemical and Structural Insights into Xylan Utilization by the Thermophilic Bacterium Caldanaerobius polysaccharolyticus*

Background: Caldanaerobius polysaccharolyticus is a thermophile with a hemicellulose utilization gene cluster. Results: The cluster is induced by xylan. The ligand-binding cleft of XBP1 is optimized for binding xylotriose. Conclusion: This gene cluster encodes all of the proteins required to degrade xylan, transport the fragments, and metabolize them via the pentose-phosphate pathway. Significance: This gene cluster could be designed as a cassette to impart a capacity for utilizing hemicellulose.
Hemicellulose is the next most abundant plant cell wall component after cellulose. The abundance of hemicelluloses such as xylan suggests that their hydrolysis and conversion to biofuels can improve the economics of bioenergy production. In an effort to understand xylan hydrolysis at high temperatures, we sequenced the genome of the thermophilic bacterium Caldanaerobius polysaccharolyticus. Analysis of the partial genome sequence revealed a gene cluster that contained both hydrolytic enzymes and enzymes key to the pentose-phosphate pathway. The hydrolytic enzymes in the gene cluster were demonstrated to convert products from a large endoxylanase (Xyn10A) predicted to anchor to the surface of the bacterium. We further use structural and calorimetric studies to demonstrate that the end products of Xyn10A hydrolysis of xylan are recognized and bound by XBP1, a putative solute-binding protein, likely for transport into the cell. The XBP1 protein showed preference for xylo-oligosaccharides as follows: xylotriose > xylobiose > xylotetraose. To elucidate the structural basis for the oligosaccharide preference, we solved the co-crystal structure of XBP1 complexed with xylotriose to 1.8-Å resolution. Analysis of the biochemical data in the context of the co-crystal structure reveals the molecular underpinnings of oligosaccharide length specificity.

Hemicellulose, one of the main components of the plant cell wall, is one of the most abundant polysaccharides in nature.
The efficient degradation of the polymer has gained increasing interest due to the capacity to convert its monomeric sugars to bioenergy products such as ethanol (1). Xylan, the most common hemicellulose, is a heterogeneous polysaccharide composed mostly of linear chains of xylose with side chain substitutions. The backbone of xylan is composed of β-1,4-linked D-xylopyranosyl units and may be decorated with 4-O-methyl-D-glucuronyl, L-arabinofuranosyl, and acetyl substituents (2). The complete degradation of xylan requires the synergistic activity of several hemicellulolytic enzymes, such as β-1,4-endoxylanase, β-xylosidase, α-glucuronidase, α-L-arabinosidase, and acetylxylan esterase (1). To facilitate a concerted action of these enzymes for hemicellulose degradation, several microorganisms have evolved gene clusters encoding the different hemicellulolytic enzymes (1,3-6). The transport mechanism for xylan degradation products has been fairly well described in bacteria such as Streptomyces lividans and Geobacillus stearothermophilus (7,8). However, our knowledge in this area of sugar metabolism by bacteria is still limited.
With a number of advantages over mesophilic enzymes, thermostable enzymes are especially thought to improve hydrolytic performance and overall economy of the process of biofuel production from the plant cell wall. Thermostable enzymes have thus been gaining increasing attention in the field of biofuels (18,19).
In this study, Xyl3A and Agu67A appeared to be components of a pentose sugar metabolism cluster and were cloned and characterized from C. polysaccharolyticus. Because the endoxylanase, Xyn10A, is not linked to this gene cluster, we hypothesized that the gene products from the cluster serve to capture nutrients (xylo-oligosaccharides) generated by Xyn10A for further hydrolysis to directly feed them into the pentose-phosphate pathway. Here, we express the recombinant form of each protein and demonstrate their contributions to xylan metabolism by C. polysaccharolyticus. We also identify a membrane-integral ATP-dependent sugar complex that likely transports the end products of xylan degradation into the cell. To determine the chain length preference for this transporter, we carried out biochemical analysis of the solute-binding component (XBP1) of the complex, and we solved the co-crystal structure of this polypeptide in complex with xylotriose. It is anticipated that the clustering of the genes involved in xylan utilization in this thermophilic bacterium will also offer an opportunity to transfer the phenotype to other organisms with tractable genetic systems for further engineering and improvement of xylan utilization.

EXPERIMENTAL PROCEDURES
Materials-C. polysaccharolyticus (ATCC strain number BAA-17), originally named Thermoanaerobacterium polysaccharolyticum, was isolated from a waste pile of a canning factory in Illinois (20,21). The pET-46b EK/LIC cloning kit and Perfect Protein Marker™ were purchased from Novagen (San Diego). PicoMaxx high fidelity PCR system, Pfu DNA polymerase, Escherichia coli JM109, and BL21-CodonPlus™ (DE3) RIL competent cells were obtained from Stratagene (La Jolla, CA). Restriction enzyme DpnI and 1-kb DNA ladder were purchased from New England Biolabs (Ipswich, MA). The DNeasy Blood and Tissue kit and the QIAprep spin miniprep kit were obtained from Qiagen, Inc. (Valencia, CA). The Talon metal affinity resin was from Clontech. Amicon Ultra-15 centrifugal filter units with 30,000- and 50,000-Da molecular mass cutoffs were purchased from Millipore (Billerica, MA). Isopropyl β-D-thiogalactopyranoside, antibiotics, agarose, and sodium citrate were obtained from Fisher.
Cloning, Expression, and Purification of Xyn10A, Xyl3A, Agu67A, and XBP1-C. polysaccharolyticus was cultured in trypticase/yeast extract/glucose (TYG) medium to mid-log phase, and genomic DNA was extracted from pelleted cells using the Qiagen DNeasy blood and tissue kit with an integrated RNase treatment step. The partial genome sequence of C. polysaccharolyticus was generated by the W. M. Keck Center for Comparative and Functional Genomics, University of Illinois, and uploaded onto the Rapid Annotation using Subsystem Technology (RAST) server (24) to generate auto-annotated genomic sequence data. The C. polysaccharolyticus gene cluster and the xyn10A gene have been deposited in GenBank™ under accession numbers JX087428 and JX271581, respectively.
All genes were amplified by using C. polysaccharolyticus genomic DNA as the template and a pair of primers targeting the desired gene. The genes xyn10A (ORF2504) and xyl3A (ORF0541) were amplified using the primer pairs xyn10A-F/xyn10A-R and xyl3A-F/xyl3A-R (supplemental Table S1) using the PicoMaxx high fidelity PCR kit. A putative signal peptidase cleavage site was predicted between amino acids 30 and 31 for Xyn10A using the SignalP server version 4.0 (25). Thus, to ensure that the protein accumulates within the E. coli cells, the forward primer was designed to amplify xyn10A beginning with the codon immediately downstream of the peptidase cleavage site. The putative α-glucuronidase encoding gene agu67A (ORF0540) was cloned by initially amplifying a larger DNA fragment agu67A-A with the primers GluFor and GluRev (supplemental Table S1). The coding sequence of agu67A was then amplified with agu67A-A as template and agu67A-F and agu67A-R (supplemental Table S1) as primers. All PCR amplifications of agu67A were carried out with Pfu DNA polymerase from Agilent (Santa Clara, CA). The gene xbp1, encoding a putative solute-binding protein, was amplified by PCR with Pfu DNA polymerase and the primer pair XBP1-F/XBP1-R (supplemental Table S1). To facilitate ligation of the PCR products into the gene expression vector (pET46b), each forward primer (with an -F designation) was engineered to incorporate a 5′-GACGACGACAAGA extension, and the reverse primers (with an -R designation) were designed to include a 5′-GAGGAGAAGCCCGGT extension.
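The primer design described above can be sketched in a few lines of Python. This is only an illustration: the two extension sequences are those quoted in the text, whereas the function name `add_lic_extension` and the gene-specific sequence in the usage line are hypothetical (the actual primers are listed in supplemental Table S1, which is not reproduced here).

```python
# Ek/LIC extensions quoted in the text, written 5'->3'.
FORWARD_EXTENSION = "GACGACGACAAGA"
REVERSE_EXTENSION = "GAGGAGAAGCCCGGT"


def add_lic_extension(gene_specific: str, direction: str) -> str:
    """Prepend the LIC extension to a gene-specific primer (5'->3').

    direction is "F" for forward primers and "R" for reverse primers.
    """
    seq = gene_specific.upper().replace(" ", "")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    if direction == "F":
        return FORWARD_EXTENSION + seq
    if direction == "R":
        return REVERSE_EXTENSION + seq
    raise ValueError("direction must be 'F' or 'R'")


# Hypothetical gene-specific sequence, for illustration only.
print(add_lic_extension("atgaaaggtctg", "F"))
```

Keeping the extensions as constants makes it easy to check that every -F and -R primer in a set carries the correct overhang before ordering.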
The resultant amplicons were then digested with the exonuclease activity of T4 DNA polymerase, subcloned into pET46 Ek/LIC vector using the Ek/LIC cloning kit (Novagen), and introduced into competent E. coli JM109 cells by electroporation (Gene Pulser Xcell™ from Bio-Rad). All recombinant plasmids (pET46-xyl3A, pET46-agu67A, and pET46-xbp1) were then extracted using QIAprep spin miniprep kit, and the nucleotides were sequenced (W. M. Keck Center for Comparative and Functional Genomics, University of Illinois) to confirm the integrity of the coding sequence. The recombinant plasmids were transformed individually into E. coli BL-21 CodonPlus (DE3) RIL by heat shock and grown overnight at 37°C on Lysogeny Broth (LB) agar plates supplemented with ampicillin (100 μg/ml) and chloramphenicol (50 μg/ml). A single colony from each plate was picked and pre-cultured at 37°C for 8 h in LB liquid medium (10 ml) supplemented with ampicillin (100 μg/ml) and chloramphenicol (50 μg/ml). The pre-cultures were then inoculated into fresh LB (1 liter) supplemented with the two antibiotics and cultured at 37°C with vigorous shaking (225 rpm) to an absorbance of 0.3 at 600 nm (A600 nm). To induce gene expression, isopropyl β-D-thiogalactopyranoside was added to the culture at a final concentration of 0.1 mM, and the cells were cultured for an additional 16 h at 16°C. The cells were harvested by centrifugation (4000 × g, 4°C, 15 min) and resuspended in lysis buffer (30 ml, 50 mM Tris-HCl, 300 mM NaCl, pH 7.0). To release the recombinant proteins, the cell suspension was lysed by two sequential passages through an EmulsiFlex C-3 cell homogenizer (Avestin, Ottawa, Canada). The cell debris was removed by centrifugation at 20,000 × g for 20 min at 4°C. To decrease the amount of heat-labile E. coli proteins, the supernatant was heated at 65°C for 30 min and centrifuged at 20,000 × g for 15 min at 4°C to pellet the denatured proteins.
Because each gene was cloned in-frame with a polyhistidine tag encoded by the pET46 Ek/LIC vector, the resulting N-terminal polyhistidine (His6)-tagged proteins were loaded onto an immobilized metal ion affinity resin (Talon resin, Novagen) that had been pre-equilibrated with the binding buffer (50 mM Tris-HCl, 300 mM NaCl, pH 7.5). The protein/resin mixture was incubated for 1 h at 4°C. After washing unbound proteins with 50 column volumes of binding buffer, the proteins that bound to the column were each eluted with 10 column volumes of elution buffer (50 mM Tris-HCl, 300 mM NaCl, 250 mM imidazole, pH 7.5). The purity of the eluted proteins was examined by SDS-PAGE as described by Laemmli (26). After staining with Coomassie Brilliant Blue G-250, the gel was destained with acetic acid/methanol (1:1, v/v).
For Xyn10A, the purified protein fractions were pooled, concentrated, and exchanged into anion exchange binding buffer (50 mM Tris-HCl, pH 7.0) using an Amicon Ultra-15 centrifugal filter unit (50,000 molecular mass cutoff) with three successive concentration and dilution cycles. The concentrated protein was then loaded onto a 5-ml HiTrap Q HP anion exchange column fitted to an AKTA Express chromatography system (GE Healthcare) and eluted with a linear gradient of an elution buffer (50 mM Tris-HCl, 1 M NaCl, pH 7.0). The absorbance at 280 nm was continuously monitored, and eluted proteins were collected in 0.5-ml fractions and analyzed by SDS-PAGE. The purified fractions were pooled, concentrated, and exchanged into citrate buffer (50 mM sodium citrate, 150 mM NaCl, pH 5.5). The protein was then loaded onto a Superdex™ 200 HiLoad™ 16/60 size exclusion column and eluted with citrate buffer at a flow rate of 1.2 ml/min. Five hundred microliter fractions were collected and analyzed by SDS-PAGE, and the purified fractions were pooled, concentrated, and exchanged into storage buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5) using an Amicon Ultra-15 centrifugal filter unit (50,000 molecular mass cutoff).
Xyl3A, Agu67A, and XBP1 were purified as described for Xyn10A except that a 30,000 MWCO Amicon tube was used for XBP1, and the anion exchange purification step was omitted for the three proteins because two steps were sufficient to obtain pure protein.
Quaternary Structure Determination by Size Exclusion Chromatography-The quaternary structures of Xyl3A and Agu67A were analyzed by size exclusion chromatography using a Superdex 200 10/300 GL size exclusion column. One hundred microliters of Xyl3A (3 mg/ml), Agu67A (3 mg/ml), or a gel filtration standard mixture was loaded onto the column preequilibrated with a buffer composed of 50 mM sodium citrate, 150 mM NaCl, pH 5.5. The proteins were eluted in the same buffer at a flow rate of 0.5 ml/min. A calibration curve of molecular mass versus retention time was constructed with the gel filtration standards, and the apparent molecular masses of the two proteins were calculated by comparison of experimental retention times with the calibration curve.
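The calibration step can be illustrated with a short Python sketch. It assumes the common practice of fitting log10(molecular mass) linearly against retention time; the text states only that a calibration curve of molecular mass versus retention time was constructed, so the semi-log form, the function names, and the standard values in the usage lines are all assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mass_kda(retention_min, standards):
    """Interpolate an apparent mass from (retention_min, mass_kda) standards.

    Assumes log10(mass) is linear in retention time (semi-log calibration).
    """
    times = [t for t, _ in standards]
    log_masses = [math.log10(m) for _, m in standards]
    slope, intercept = fit_line(times, log_masses)
    return 10 ** (slope * retention_min + intercept)


# Hypothetical standards (retention time in min, mass in kDa).
standards = [(20.0, 1000.0), (25.0, 100.0), (30.0, 10.0)]
print(round(apparent_mass_kda(27.5, standards), 1))
```

Dividing the apparent mass from such a curve by the monomer mass from SDS-PAGE gives the oligomeric state estimate used later in the paper.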
Hydrolysis of para-Nitrophenyl (pNP)-linked Sugars-The hydrolytic activities of the putative α-glucuronidase (Agu67A) and β-xylosidase (Xyl3A) were screened against a panel of pNP-linked substrates by a colorimetric assay using a thermostated Cary 300 UV-visible spectrophotometer (Varian Inc., Palo Alto, CA). The 15 pNP-linked substrates were as follows: pNP-α-L-arabinopyranoside; pNP-α-L-arabinofuranoside; pNP-β-D-fucopyranoside; pNP-α-L-fucopyranoside; pNP-α-D-galactopyranoside; pNP-β-D-galactopyranoside; pNP-α-D-glucopyranoside; pNP-β-D-glucopyranoside; pNP-β-D-maltopyranoside; pNP-α-D-maltopyranoside; pNP-α-D-mannopyranoside; pNP-β-D-mannopyranoside; pNP-α-L-rhamnopyranoside; pNP-β-D-xylopyranoside; and pNP-β-D-cellobioside. In the reaction, the pNP-linked substrates (1.0 mM) were incubated with Xyl3A or Agu67A (100 nM) in a citrate buffer (50 mM, pH 5.5) at 65°C for 30 min, and the rate of pNP release in the reactions was monitored continuously through the absorbance at 400 nm. The extinction coefficient for pNP was determined experimentally by constructing a standard curve of different pNP concentrations at a pH of 5.5 and a temperature of 65°C and by using the Beer-Lambert Law (A = εcl, where c is the concentration, ε is the molar extinction coefficient of pNP at 400 nm, and l is the path length of the cuvette). The extinction coefficient of pNP (1636 M⁻¹ cm⁻¹) obtained at pH 5.5 was used to calculate initial velocities. One unit of β-xylosidase activity was defined as the amount of enzyme that releases 1 μmol of pNP from the substrates per min.
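As a worked illustration of the Beer-Lambert conversion, the following Python sketch turns an absorbance change into a pNP release rate and a specific activity. The extinction coefficient is the value reported in the text; the 1-cm path length default, the function names, and the example numbers are assumptions for illustration and are not taken from the paper.

```python
EPSILON_PNP = 1636.0  # M^-1 cm^-1 at pH 5.5 and 65 °C (value from the text)


def pnp_molar(absorbance, path_cm=1.0, epsilon=EPSILON_PNP):
    """Beer-Lambert law, A = epsilon * c * l, solved for c (mol/liter)."""
    return absorbance / (epsilon * path_cm)


def specific_activity(delta_a_per_min, volume_ml, enzyme_mg, path_cm=1.0):
    """Units per mg, where 1 unit releases 1 µmol of pNP per min.

    The 1-cm path length default is an assumption, not stated in the text.
    """
    rate_molar_per_min = pnp_molar(delta_a_per_min, path_cm)
    # mol/liter/min -> µmol/min in the actual reaction volume
    umol_per_min = rate_molar_per_min * 1e6 * (volume_ml / 1000.0)
    return umol_per_min / enzyme_mg


# Hypothetical reading: 0.1636 absorbance units/min in 1 ml with 0.5 mg enzyme.
print(specific_activity(0.1636, volume_ml=1.0, enzyme_mg=0.5))
```

Because the extinction coefficient of pNP is strongly pH-dependent, the value measured under the assay conditions should be used rather than a literature value obtained at alkaline pH.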
Determination of α-Glucuronidase Activity-The α-glucuronidase activity was quantified using a colorimetric assay for uronic acids (27). For aldouronic acids, the substrate (6 mg/ml) was incubated with 0.5 μM Agu67A in 100 μl of citrate buffer (50 mM, pH 5.5) at 65°C for 60 min. The reaction was terminated by adding 150 μl of copper reagent (1.97 M Na2SO4, 0.68 M NaCl, 0.2 M sodium acetate, 20.8 mM CuSO4, pH adjusted to 4.8). The mixture was then boiled for 10 min and chilled on ice, and 150 μl of arsenomolybdate reagent (19.2 mM Na2HAsO4·7H2O, 40.5 mM (NH4)6Mo7O24·4H2O, 788 mM H2SO4) was added for color development. The absorbance of the mixture was measured at 600 nm, and a standard curve was constructed using known concentrations of α-glucuronic acid (Sigma).
For polysaccharide substrates, 0.5 μM of the enzyme was incubated with BWX (1%, w/v) or MGX (1%, w/v) (both from Sigma) in citrate buffer (50 mM sodium citrate, pH 5.5) at 65°C for 60 min. The reactions were terminated by heating at 100°C for 10 min. The reaction mixtures were centrifuged at 17,000 × g for 10 min, and the 4-O-methyl-α-glucuronic acid in the supernatant was determined using the colorimetric assay described above. One unit of α-glucuronidase activity was defined as the amount of enzyme that catalyzes the release of 1 μmol of α-glucuronic acid equivalents per min.
Hydrolysis of Oligosaccharides by Xyl3A-To analyze the hydrolytic activity of Xyl3A with oligosaccharides as substrates, xylo-oligosaccharides (xylobiose, X2; xylotriose, X3; xylotetraose, X4; xylopentaose, X5; and xylohexaose, X6) and cello-oligosaccharides (cellobiose, G2; cellotriose, G3; cellotetraose, G4; and cellopentaose, G5) were used. Xyl3A (0.5 μM, final concentration) was incubated with each oligosaccharide (10 mg/ml, final concentration) in citrate buffer (50 mM sodium citrate, pH 5.5) in a final reaction volume of 10 μl at 65°C for 15 h. The control reaction was performed under the same conditions except that heat-denatured Xyl3A was added as the enzyme. At the end of the reaction, a 20-μl volume of ethanol was added to the hydrolysate, and the mixture was evaporated in a Savant DNA120 SpeedVac concentrator (Savant; Ramsey, MN). The dried product was resuspended in 2.5 μl of double-distilled H2O, and 0.5 μl of each sample was spotted on Silica Gel 60 F254 TLC plates (Merck). Monomeric xylose (X1, 3.0 μg) and xylo-oligosaccharides (X2-X5) (2.5 μg each) were used as standards in the TLC analysis for xylo-oligosaccharides, and glucose (G1, 3.0 μg) and cello-oligosaccharides (G2-G5) (2.5 μg each) were spotted as standards for cello-oligosaccharide analyses. After drying the plates, the products of the reactions were resolved by one ascent for 4 h with 1-butanol/acetic acid/H2O (10:5:1, v/v/v) as a mobile phase (28). For visualization, the dried TLC plates were sprayed with a mixture of methanolic orcinol (0.05%, w/v) and sulfuric acid (5%, v/v) and then heated at 75°C for 10 min (29) for color development.
Determination of Kinetic Parameters for Xyl3A with pNP-linked Sugars as Substrates-Kinetic parameters of Xyl3A were determined at optimum conditions with pNP-β-D-xylopyranoside (pNPX), pNP-β-D-cellobioside, and pNP-α-D-glucopyranoside (pNPGlu) as substrates. In the reactions, Xyl3A (100 nM, final concentration) was incubated with the pNP-linked substrates at 65°C in citrate buffer (50 mM sodium citrate, pH 5.5). pNPX and pNPGlu were used at concentrations ranging from 0.08 to 10 mM, and the concentration of pNP-β-D-cellobioside ranged from 0.08 to 5 mM. After the substrates were equilibrated to 65°C in the thermostated Cary 300 UV-visible spectrophotometer, the reactions were initiated by addition of Xyl3A, and the rate of pNP production was evaluated by monitoring the absorbance at 400 nm. The initial velocities of the reactions were calculated with an extinction coefficient of 1636 M⁻¹ cm⁻¹ for pNP at pH 5.5. By using GraphPad software (GraphPad version 5.01, San Diego), the initial velocities were plotted against the substrate concentrations, and the Michaelis-Menten constant (Km) and the maximum velocity (Vmax) were estimated with a nonlinear curve fit. The kcat was calculated as the quotient of the resulting Vmax and the concentration of enzyme used in the reaction.
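The fitting step can be illustrated with a dependency-free Python sketch. It is not the GraphPad procedure used by the authors: it swaps in a simple least-squares strategy (closed-form Vmax for each trial Km, a coarse scan over log Km, then golden-section refinement), and the function names and the synthetic data in the usage lines are hypothetical.

```python
import math


def best_vmax(km, s, v):
    """Least-squares Vmax for a fixed Km (the model is linear in Vmax)."""
    f = [si / (km + si) for si in s]
    return sum(fi * vi for fi, vi in zip(f, v)) / sum(fi * fi for fi in f)


def sse(km, s, v):
    """Sum of squared residuals at the best Vmax for this Km."""
    vmax = best_vmax(km, s, v)
    return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e3, grid=200, iters=80):
    """Return (Km, Vmax) minimizing the squared error of v = Vmax*S/(Km+S)."""
    lo, hi = math.log(km_lo), math.log(km_hi)
    logs = [lo + i * (hi - lo) / (grid - 1) for i in range(grid)]
    # Coarse scan over log(Km) to bracket the minimum.
    best = min(range(grid), key=lambda i: sse(math.exp(logs[i]), s, v))
    a, b = logs[max(best - 1, 0)], logs[min(best + 1, grid - 1)]
    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(math.exp(c), s, v) < sse(math.exp(d), s, v):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return km, best_vmax(km, s, v)


def kcat(vmax, enzyme_conc):
    """kcat = Vmax / [E]; both must share the same concentration unit."""
    return vmax / enzyme_conc


# Synthetic, noise-free data (mM and µM/s) generated with Km = 1.2, Vmax = 3.0.
s = [0.08, 0.16, 0.31, 0.63, 1.25, 2.5, 5.0, 10.0]
v = [3.0 * si / (1.2 + si) for si in s]
km, vmax = fit_michaelis_menten(s, v)
print(round(km, 3), round(vmax, 3), round(kcat(vmax, 0.1), 1))
```

Fitting the untransformed data in this way avoids the error distortion of linearized plots such as Lineweaver-Burk, which is the usual reason for preferring a nonlinear fit.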
Determination of Catalytic Efficiencies for Xyl3A with Xylo-oligosaccharides and Cello-oligosaccharides-The catalytic constants of Xyl3A for xylo-oligosaccharides (X2-X6) and cello-oligosaccharides (G2-G6) were determined as described previously (30). Briefly, oligosaccharides (30 μM each) were hydrolyzed with Xyl3A in citrate buffer (pH 5.5, 50 mM) at 65°C in a final volume of 500 μl, and the reactions were terminated at 10 min by boiling for another 10 min. The final concentration of Xyl3A was 50 nM for xylo-oligosaccharides (X2-X6) and 500 nM for cello-oligosaccharides (G2-G6). The relationship between hydrolysis rate and oligosaccharide substrate concentration (0, 30, and 60 μM) was linear; therefore, the substrate concentration of 30 μM should be well below Km. A linear relationship was also observed between substrate depletion and hydrolysis time (0, 10, and 20 min), so the hydrolytic reactions were terminated at 10 min. The substrate concentrations at the beginning ([S0]) and termination ([St], 10 min) of the reaction were determined by HPAEC and used for calculation of the catalytic constants as described earlier (30).
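The calculation is not spelled out here beyond the citation to (30). A common way to obtain kcat/Km from substrate depletion when [S] is well below Km is the integrated first-order rate law, kcat/Km = ln([S0]/[St]) / ([E]·t); the Python sketch below assumes that form, and the function name and example values are hypothetical.

```python
import math


def catalytic_efficiency(s0, st, enzyme_molar, time_s):
    """kcat/Km in M^-1 s^-1 from substrate depletion, assuming [S] << Km.

    s0 and st are the substrate concentrations at time 0 and time t in any
    shared unit (only their ratio matters). This integrated first-order form
    is an assumption about the method cited in the text.
    """
    if not 0 < st <= s0:
        raise ValueError("require 0 < st <= s0")
    return math.log(s0 / st) / (enzyme_molar * time_s)


# Hypothetical example: 30 µM substrate falls to 15 µM in 10 min with 50 nM enzyme.
efficiency = catalytic_efficiency(30.0, 15.0, 50e-9, 600.0)
print(round(efficiency / 1000.0, 1), "mM^-1 s^-1")
```

This approach needs only two concentration measurements per substrate, which is why confirming linearity with substrate concentration (that is, [S] well below Km) matters before applying it.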
Substrate Binding Assay Using Isothermal Titration Calorimetry-The substrate binding activity of XBP1, which constitutes a component of the putative sugar transport system in the gene cluster, was measured with different ligands at 25°C using a VP-ITC microcalorimeter (Microcal Inc., Northampton, MA). The ligands (xylose, xylobiose, xylotriose, xylotetraose, glucose, cellobiose, cellotriose, cellotetraose, and aldouronic acids) were dissolved in citrate buffer (50 mM sodium citrate, pH 5.5) to a concentration of 0.5 mM (oligosaccharides) or 0.24 mg/ml (aldouronic acids). The protein XBP1 was diluted to a final concentration of 50 μM using the same buffer. The ligand was injected into the reaction cell containing XBP1 in 28 successive 10-μl aliquots of 20-s duration at 300-s intervals. Nonlinear regression with a single site model (MicroCal Origin) was applied for data analysis, and thermodynamic parameters were calculated using the Gibbs free energy equation (ΔG = ΔH − TΔS) and the relationship ΔG = −RT ln Ka.
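The two relationships quoted above can be combined in a few lines of Python to derive ΔG and ΔS from a fitted association constant and enthalpy. The function name and the example values of Ka and ΔH are hypothetical; only the equations and the 25°C temperature come from the text.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_thermodynamics(ka_per_molar, delta_h, temperature_k=298.15):
    """Return (delta_g, delta_s) from Ka (M^-1) and delta_h (J/mol).

    Uses delta_g = -R*T*ln(Ka) and delta_g = delta_h - T*delta_s.
    The default temperature corresponds to 25 °C.
    """
    delta_g = -R * temperature_k * math.log(ka_per_molar)
    delta_s = (delta_h - delta_g) / temperature_k
    return delta_g, delta_s


# Hypothetical fit result: Ka = 1e6 M^-1, delta_h = -50 kJ/mol.
dg, ds = binding_thermodynamics(1e6, -50_000.0)
print(round(dg / 1000.0, 1), "kJ/mol;", round(ds, 1), "J/mol/K")
```

In this example the binding is enthalpy-driven with an unfavorable entropy term, a pattern that is easy to read off once ΔH and −TΔS are placed side by side.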
Structure Determination of the XBP1-Xylotriose Complex-XBP1 was purified as described above but with an additional size exclusion chromatographic step (using a Superdex™ 200 HiLoad™ 16/60 size exclusion column) in a final buffer composed of 20 mM HEPES, pH 7.5, 100 mM KCl. The affinity tag was not removed prior to crystallization. Initial crystallization conditions were obtained by the sparse matrix sampling method using commercial screens. Crystals of the XBP1-xylotriose complex were grown using the hanging drop vapor diffusion method. Briefly, 1 μl of protein at 13.5 mg/ml concentration was incubated with 5 mM xylotriose for 2 h on ice, and the complex was mixed with 1 μl of precipitant solution (30% polyethylene glycol 1500, 100 mM KCl, and 20 mM HEPES, pH 7.5) and equilibrated over a well containing the precipitant solution at 9°C. Crystals grew within 3 days and were briefly soaked in precipitant solution supplemented with 10% ethylene glycol prior to flash-cooling in liquid nitrogen. Selenomethionine-labeled XBP1 was grown as described above, and crystals of SeMet XBP1 were grown under similar conditions.
Flash-cooled crystals of native XBP1 in complex with xylotriose diffracted x-rays to 2.1 Å resolution at an insertion device synchrotron beam line (LS-CAT Sector 21 ID-F, Advanced Photon Source, Argonne, IL). Crystals of selenomethionine-labeled XBP1-xylotriose complex diffracted to a slightly higher resolution and subsequently were used for all structural analyses. These crystals occupy space group P2₁2₁2 with unit cell parameters a = 59.3 Å, b = 150.8 Å, and c = 150.9 Å, with three molecules in the crystallographic asymmetric unit. Although two of the unit cell constants are suspiciously close and indicative of a higher symmetry setting, the data could not be integrated or scaled in any tetragonal space group.
An 8-fold redundant data set was collected to 1.8-Å resolution with an overall Rmerge = 9.4 and I/σ(I) = 8 in the highest resolution shell. All data were indexed and scaled using the HKL2000 package (31). Crystallographic phases were determined by single wavelength anomalous scattering. The heavy atom substructure was determined using HySS (32), and refinement of heavy atom parameters using Phaser, as implemented in the PHENIX software package (33), yielded a figure of merit of 0.447. Solvent flattening and noncrystallographic symmetry averaging yielded experimental maps of exceptional quality, allowing nearly the entire polypeptide chain to be automatically traced using either ARP/wARP (34) or Buccaneer (35). Further manual fitting using XtalView (36) was interspersed with rounds of refinement using REFMAC5 (37). Cross-validation, using 5% of the data for the calculation of the free R factor (38), was utilized throughout the model building process to monitor building bias. Clear density for the oligosaccharide could be observed in the initial experimental maps only for two of the three molecules in the crystallographic asymmetric unit. The ligand was manually built into the two chains of the model only after the free R factor dropped below 30%. The stereochemistry of the models was routinely monitored throughout the course of refinement using PROCHECK (39). Relevant data collection and refinement parameters are provided in Table 1. The refined coordinates have been deposited in the Protein Data Bank under code 4G68.
Growth of C. polysaccharolyticus and Analysis of Gene Expression-C. polysaccharolyticus was grown in a defined medium with either glucose or BWX as the sole carbon source. The defined medium is the same as described in our earlier report (supplemental Table S2) (30). The cells cultured with glucose and BWX as substrate were harvested at A600 nm of 0.2 and 0.1, respectively, for RNA extraction. For quantitative RT-PCR (Q-RT-PCR), the harvested cells were mixed with 2 volumes of RNAprotect bacteria reagent (Qiagen), collected by centrifugation at 5000 × g for 10 min, and stored at −80°C until RNA extraction. In the subsequent steps, the cells were pretreated with lysozyme, and the total RNA was extracted with the RNeasy mini kit (Qiagen) according to the manufacturer's protocol. The RNA was eluted with nuclease-free water and then digested with RNase-free DNase. Reverse transcription and quantitative PCR were performed as described previously (40). DNA gyrase subunit A (gyrA) was used as the reference gene. The primers used for the experiments are listed in supplemental Table S1.

Identification of Xyn10A, a Multimodular Endoxylanase-To identify genes involved in xylan degradation, we searched the genome of C. polysaccharolyticus for putative endoxylanase genes using a BlastP search. This search revealed xyn10A, a gene predicted to encode a multimodular endoxylanase. The modular architecture for the predicted protein includes a GH10 endoxylanase module flanked on the N terminus by two family 22 CBMs (CBM 22) and on the C terminus by two family 9 CBMs (CBM 9). In addition, there are three surface layer homology (SLH) modules at the C terminus of the protein (Fig. 1A). A SignalP search revealed a predicted signal peptide with a peptidase cleavage site between amino acid residues 30 and 31. The protein was cloned and expressed as a recombinant protein in E. coli and purified to near-homogeneity. The molecular mass of the purified protein as judged by SDS-PAGE (170 kDa, Fig. 1B) was in agreement with the predicted value based on the amino acid composition (160.1 kDa). Following overnight incubation of the purified protein with BWX, an increase in reducing ends was detected (Fig. 1C), indicating that the recombinant protein possesses endoxylanase activity. Next, Xyn10A was incubated with BWX, and aliquots were taken at 0, 5, 10, 30, and 60 min and analyzed by thin layer chromatography. By 60 min, a mixture of products that included xylotriose and xylotetraose appeared on the TLC plate (Fig. 1D). Therefore, xyn10A encodes an endoxylanase, and the major products of hydrolysis on BWX are short xylo-oligosaccharides.
a Highest resolution shell is shown in parentheses. b FOM means figure of merit = |∫P(α) exp(iα) dα|. c R-factor = Σ(|Fobs| − k|Fcalc|)/Σ|Fobs|, and R-free is the R value for a test set of reflections consisting of a random 5% of the diffraction data not used in refinement.
Identification of a Gene Cluster Targeted toward Xylo-oligosaccharides-Xyn10A is an extracellular endoxylanase that produces mainly xylotriose and xylotetraose from BWX. To utilize the products of hydrolysis of Xyn10A, C. polysaccharolyticus must transport these oligosaccharides into the cell, cleave them into monosaccharides, and metabolize them, most likely through the pentose-phosphate pathway. In studying the genome, we identified an 18-kb region of the genome containing a cluster of genes with predicted roles in transport, hydrolysis, and metabolism of xylo-oligosaccharides (Fig. 2). This cluster includes genes with predicted involvement in oligosaccharide transport (ABC transporter, ORF0548-0550), hydrolysis of branched oligosaccharides (α-glucuronidase, ORF0540, and β-xylosidase, ORF0541), removal of acetyl groups from oligosaccharides (ORF0551), metabolism of xylose through the pentose-phosphate pathway (ORF0542-0545), and a two-component system that could be involved in regulation of these genes at the transcriptional level (ORF0546-0547).
Therefore, this gene cluster includes genes predicted to encode the entire repertoire of proteins required for the transport and metabolism of hydrolytic products of Xyn10A. To test whether genes in this cluster function in the degradation and utilization of xylan, several critical components in the cluster were studied further as described below.
xyl3A Encodes an Enzyme with Both β-1,4-Xylosidic and β-1,4-Glucosidic Activities-ORF0541 was predicted to encode a β-xylosidase consisting of three domains as follows: an N-terminal GH 3 domain followed by a C-terminal GH 3 domain, and an FN3-like domain (supplemental Fig. 1). The gene was cloned and expressed in E. coli, and the recombinant N-terminal hexahistidine-tagged protein was purified to near-homogeneity. SDS-PAGE analysis of the purified Xyl3A revealed a single band with a molecular mass of about 86 kDa, which was in agreement with the predicted molecular mass of the hexahistidine fusion protein based on amino acid sequence (88 kDa) (Fig. 3A). The quaternary structure of Xyl3A was further determined through size exclusion chromatography. As shown in Fig. 3B, Xyl3A eluted in a single peak, and the apparent molecular mass was calculated as about 157 kDa. The molecular mass determined by size exclusion chromatography was nearly twice that estimated by SDS-PAGE, suggesting that Xyl3A exists as a dimer in solution.
The hydrolysis of 15 different pNP-linked sugar substrates by Xyl3A showed that it was most active on pNP-β-D-xylopyranoside (pNPX, specific activity 210 milliunits/mg). Therefore, the temperature and pH optima of Xyl3A were determined with pNPX in a temperature range of 40-75°C and a pH range of 4.0-6.5. These results revealed a temperature optimum of 65°C and a pH optimum of 5.5 (data not shown). The catalytic properties of Xyl3A, based on these parameters, indicated that the catalytic efficiency for Xyl3A is higher with pNPGlu than pNPX (Table 2). However, these are artificial substrates and may not accurately reflect the substrate specificity for natural substrates.
To further test the natural substrate specificity of Xyl3A, the protein (0.5 μM) was incubated with xylo-oligosaccharides (X2-X6) and cello-oligosaccharides (G2-G6), and the products were resolved by thin layer chromatography. Following overnight incubation of Xyl3A with xylo-oligosaccharides, all of the substrates were converted to xylose (Fig. 3D). However, when Xyl3A was incubated with cello-oligosaccharides at a concentration of 0.5 μM, only a small amount of hydrolysis was detected (data not shown), indicating that the activity of Xyl3A with cello-oligosaccharides is much lower than that with xylo-oligosaccharides. When the concentration of Xyl3A was increased to 2.0 μM, all of the cello-oligosaccharides tested were converted to glucose (Fig. 3E). These experiments confirmed the results of assays with pNP substrates, which indicated that Xyl3A exhibits both β-xylosidase and β-glucosidase activities; however, the activity with xylo-oligosaccharides is higher than with cello-oligosaccharides. Thus, despite the two activities portrayed, Xyl3A is likely a β-xylosidase in vivo.
OCTOBER 12, 2012 • VOLUME 287 • NUMBER 42

Xylan Utilization by Caldanaerobius polysaccharolyticus
To analyze the catalytic activity for Xyl3A with oligosaccharides more quantitatively, an HPAEC-PAD assay was employed as described under "Experimental Procedures." These experiments showed that the catalytic efficiency (kcat/Km) values for Xyl3A were much lower for cello-oligosaccharides compared with xylo-oligosaccharides (Table 2). The catalytic efficiencies were 5 orders of magnitude lower for cellobiose and cellotriose relative to xylobiose and xylotriose and 3 orders of magnitude lower for cellotetraose, cellopentaose, and cellohexaose relative to the corresponding xylo-oligosaccharides (Table 2). These results clearly show that the dominant catalytic activity for Xyl3A is β-xylosidase activity.
Subsequent studies with polysaccharides, including BWX, oat spelt xylan, MGX, mannan, locust bean gum, guar gum, and glucomannan, revealed detectable activity against BWX, oat spelt xylan, and MGX, with the predominant product being xylose (data not shown). Xyl3A also showed modest activity on CMC with a monosaccharide, likely representing carboxymethylated glucose, identified in the hydrolysate by HPAEC-PAD. However, no activity was detected with mannan, locust bean gum, guar gum, or glucomannan as substrates (data not shown). These results further supported the assignment of Xyl3A as a β-xylosidase.
Xyl3A was incubated with xylo-oligosaccharides (X₂-X₆) at 65°C for 15 h. E, hydrolytic activity of Xyl3A against β-1,4-linked cello-oligosaccharides. Xyl3A was incubated with cello-oligosaccharides (G₂-G₆). The method was as described for the xylo-oligosaccharides except for the enzyme, which was added at four times the molar concentration used in D. The end products of hydrolysis were resolved by thin layer chromatography. B and C, experiments were performed in triplicate, and data are reported as means ± S.D.

TABLE 2
Catalytic efficiencies (kcat/Km) for Xyl3A with pNP-linked sugars, xylo-oligosaccharides, and cello-oligosaccharides
a The experiments were performed in triplicate, and data are reported as means ± S.D.
b The catalytic efficiencies (kcat/Km) are reported as mM⁻¹ s⁻¹.
c The catalytic efficiencies (kcat/Km) for pNP-β-D-xylopyranoside, pNP-β-D-glucopyranoside, and pNP-cellobioside were determined as described by Dodd et al. (40).
d The catalytic efficiencies (kcat/Km) for xylo-oligosaccharides and cello-oligosaccharides were determined as described by Han et al. (30).

Agu67A Encodes an α-Glucuronidase. 4-O-Methylglucuronyl groups are common side chains found attached to the backbone xylopyranosyl groups in xylans. Thus, to completely convert xylan to monosaccharides for metabolism, debranching enzymes such as α-glucuronidases are required. ORF0540 was predicted to encode a GH 67 α-glucuronidase and was targeted for further analysis. The domain architecture includes three conserved domains, including an N-terminal GH 67 domain, a GH 67 middle domain, and a C-terminal GH 67 domain (supplemental Fig. 1). Agu67A was expressed as a hexahistidine fusion protein in E. coli and purified to homogeneity. SDS-PAGE analysis of the purified Agu67A revealed a single band with a molecular mass of about 79 kDa, which was in agreement with the predicted molecular mass of the hexahistidine fusion protein based on amino acid sequence (81 kDa) (Fig. 4A). The quaternary structure of Agu67A was further determined through size exclusion chromatography. As shown in Fig. 4B, Agu67A eluted in a single peak, and the apparent molecular mass was calculated as 158 kDa. The molecular mass in size exclusion chromatography was two times that shown by SDS-PAGE, suggesting that Agu67A exists as a dimer in solution, which is in agreement with other studies on bacterial α-glucuronidases (41). The hydrolytic activity of Agu67A was screened with pNP-linked substrates, aldouronic acids, and polysaccharides. Agu67A showed the highest activity against aldouronic acids; the specific activity reached 154 IU/mg (Fig. 4C). Lower hydrolytic activity was also detected with MGX and BWX as substrates with specific activities of 10.8 and 11.6 IU/mg, respectively. The observable activity with MGX and BWX most likely is due to the hydrolysis of small amounts of aldobiouronic and aldotriouronic acids present in the polysaccharide mixtures, because HPLC assays of the hydrolysate revealed the appearance of xylose and xylobiose following incubation with Agu67A (data not shown). No activity was observed with any pNP-linked sugars.
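The dimer assignment for Agu67A rests on a simple mass ratio. As a minimal sketch (the function name and the round-to-nearest-integer rule are illustrative assumptions, not part of the original analysis), the apparent mass from size exclusion chromatography can be divided by the monomer mass:

```python
def subunit_count(apparent_kda: float, monomer_kda: float) -> int:
    """Estimate the number of subunits from a native/monomer mass ratio."""
    if apparent_kda <= 0 or monomer_kda <= 0:
        raise ValueError("masses must be positive")
    return round(apparent_kda / monomer_kda)


# Values reported in the text for Agu67A (kDa).
SEC_APPARENT_KDA = 158.0  # size exclusion chromatography
SDS_PAGE_KDA = 79.0       # SDS-PAGE estimate of the monomer
PREDICTED_KDA = 81.0      # predicted from the amino acid sequence

ratio_sds = SEC_APPARENT_KDA / SDS_PAGE_KDA
ratio_pred = SEC_APPARENT_KDA / PREDICTED_KDA

print(f"ratio vs SDS-PAGE mass:  {ratio_sds:.2f}")
print(f"ratio vs predicted mass: {ratio_pred:.2f}")
print("estimated subunits:", subunit_count(SEC_APPARENT_KDA, SDS_PAGE_KDA))
```

Both the SDS-PAGE mass and the sequence-predicted mass give a ratio close to 2, so the conclusion does not depend on which monomer estimate is used.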
The optimum temperature and pH for Agu67A were determined with aldouronic acids as substrates. The enzyme was active in a temperature range of 30-85°C, and the optimum temperature was 60°C (data not shown). Analysis of the activity in buffers of pH values ranging from 3.5 to 6.5 revealed a pH optimum of 5.5 (data not shown).
To further evaluate the activity of Agu67A with aldouronic acids, the enzyme was incubated with an aldouronic acid mixture, and the products of hydrolysis were analyzed by HPAEC-PAD. After incubation of Agu67A with aldouronic acids, peaks appeared corresponding to xylobiose, xylotriose, xylotetraose, and xylopentaose, and the peak representing xylose increased in the enzyme-treated mixture as compared with the control reaction (Fig. 4D). This observation indicated that Agu67A cleaved 4-O-methylglucuronic acid from aldobiouronic, aldotriouronic, aldotetrauronic, and aldopentauronic acids. Furthermore, the concentration of glucuronic acid equivalents increased from 36 mM in the control reaction to 370 mM in the enzyme-treated reaction (Fig. 4E). These results clearly showed that Agu67A produces xylo-oligosaccharides and glucuronic acid equivalents from a mixture of 4-O-methylglucuronyl-substituted xylo-oligosaccharides.
OCTOBER 12, 2012 • VOLUME 287 • NUMBER 42
Synergism of Xyn10A, Xyl3A, and Agu67A in the Hydrolysis of BWX. Given our finding that Xyn10A releases oligosaccharides from BWX, we hypothesized that Agu67A and Xyl3A may facilitate the conversion of oligosaccharides to monosaccharides. To evaluate this hypothesis, we incubated the three enzymes separately and in combination with BWX and analyzed the products of hydrolysis. The three enzymes functioned synergistically to release xylose from BWX (supplemental Fig. 2). These results support our hypothesis that Xyn10A produces branched xylo-oligosaccharides that are subsequently transported into the cell and degraded into monosaccharides by Agu67A and Xyl3A for fermentation by C. polysaccharolyticus.
ORF0548 Encodes a Substrate-binding Component of an ABC Transporter That Is Specific for Xylo-oligosaccharides. Xyn10A is an endoxylanase that produces branched xylo-oligosaccharides. The coding sequence includes a putative signal peptide and three SLH repeats, which indicates that it is secreted outside of the cell and anchored onto the cell wall. The two enzymes that catalyze the hydrolysis of these branched oligosaccharides into monosaccharides do not possess signal peptides and are therefore likely to reside within the cytoplasm of the cell. Therefore, a mechanism must exist whereby branched oligosaccharides are transported across the plasma membrane and into the cell. Within the gene cluster identified in C. polysaccharolyticus, a putative ATP-binding cassette (ABC) importer was found that contains three genes predicted to encode a solute-binding protein (XBP1) (ORF0548), a permease protein (ORF0549), and an ATPase (ORF0550) (Fig. 2).
To evaluate whether the ABC transporter is specific for xylo-oligosaccharides, the substrate-binding protein was cloned and expressed in E. coli, and binding activity with various substrates was tested by ITC. When XBP1 was titrated with xylose, no binding was detected (data not shown). However, XBP1 bound tightly to xylobiose, xylotriose, and xylotetraose (Fig. 5) with the highest affinity for xylotriose (Kd = 11.6 × 10⁻⁹ M), followed by xylobiose (Kd = 83.3 × 10⁻⁹ M) and then xylotetraose (Kd = 204 × 10⁻⁹ M) (Table 3). These experiments revealed that XBP1 binds with very high affinity to xylo-oligosaccharides. Moreover, XBP1 discriminates between xylo-oligosaccharides by length with the optimal binding constant observed for the trisaccharide. In the ITC with glucose, cellobiose, cellotriose, and cellotetraose, no significant binding was detected, indicating that XBP1 binds specifically to xylo-oligosaccharides.
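To put the three dissociation constants on a common scale, they can be converted to standard binding free energies with ΔG° = RT ln(Kd / 1 M). The sketch below is illustrative only: the temperature of 298.15 K is an assumption (the ITC temperature is not stated in this section), and the function and variable names are hypothetical.

```python
import math

R_J_PER_MOL_K = 8.314462618  # gas constant, J mol^-1 K^-1
ASSUMED_T_K = 298.15         # assumption: 25 degrees C, not stated in the text

# Dissociation constants reported in the text (M).
KD_M = {
    "xylobiose": 83.3e-9,
    "xylotriose": 11.6e-9,
    "xylotetraose": 204e-9,
}


def delta_g_kj_per_mol(kd_molar: float, temp_k: float = ASSUMED_T_K) -> float:
    """Standard binding free energy, dG = RT ln(Kd / 1 M), in kJ/mol."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_J_PER_MOL_K * temp_k * math.log(kd_molar) / 1000.0


ranked = sorted(KD_M, key=KD_M.get)  # tightest binder (lowest Kd) first
best = ranked[0]

for ligand in ranked:
    relative_kd = KD_M[ligand] / KD_M[best]
    print(
        f"{ligand}: Kd = {KD_M[ligand]:.3g} M, "
        f"dG = {delta_g_kj_per_mol(KD_M[ligand]):.1f} kJ/mol, "
        f"Kd relative to {best} = {relative_kd:.1f}"
    )
```

Under this assumption, the preference order xylotriose > xylobiose > xylotetraose corresponds to free energy differences of only a few kJ/mol, which is consistent with discrimination arising from a small number of contacts in the binding cleft.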
To test whether XBP1 could bind to xylo-oligosaccharides substituted with 4-O-methylglucuronyl groups, we titrated the protein with a mixture of aldouronic acids. The ITC experiment clearly showed that XBP1 bound to components within the aldouronic acid mixture; however, because of the heterogeneity of this mixture, we were unable to determine which component (or components) was bound or to calculate the affinity constants or stoichiometry. Nevertheless, these data, although weaker, do show that XBP1 can bind aldouronic acids, indicating that 4-O-methylglucuronyl substitutions do not disrupt binding and suggesting that these branched oligosaccharides can be imported by this ABC transporter.
Molecular Basis for Xylo-oligosaccharide Recognition by XBP1. The three-dimensional structure of the XBP1-xylotriose complex was solved to 1.8-Å resolution by single wavelength anomalous diffraction methods on data collected from crystals of a selenomethionine-labeled protein. The overall structure shows a bilobal fold that is common to other solute-binding components of ABC-type transporters (Fig. 6A). The first domain consists of residues Ile-46 through Asp-158 and residues Gly-325 through Val-377, and the second domain includes Gln-159 to Gly-324 and Asp-378 to the C terminus.
A structure-based DALI search against the Protein Data Bank reveals the closest homologs to be the β-D-galactopyranose-specific solute receptor AcbH from Actinoplanes (PDB code 3OO6; Z-score = 44.4; r.m.s.d. of 2.2 Å over 378 aligned Cα atoms) (42), the glucose/galactose-specific solute-binding protein from Thermus thermophilus (PDB code 2B3B; Z-score = 41.4; r.m.s.d. of 2.4 Å over 373 aligned Cα atoms) (43), and the trehalose/maltose-specific binding protein from Thermococcus litoralis (PDB code 1EU8; Z-score = 39.3; r.m.s.d. of 2.5 Å over 375 aligned Cα atoms) (44), among others. The homology of XBP1 to each of these proteins exists only at the structural level, as identity between their primary sequences is below 18%. Additionally, in contrast to XBP1, the structures of each of these proteins have only been determined in complex with monosaccharides.
In the co-crystal structure, the xylotriose ligand is bound in a (roughly) 14-Å cleft that is formed between the two domains (Fig. 6B). An inspection of the binding pocket provides a molecular rationale for the specificity of oligosaccharide chain length. One end of the pocket is capped by a number of aromatic and aliphatic side chains, including Trp-109, Met-211, Trp-212, Ile-329, and Ala-398. The third xylose residue of the trisaccharide is engaged at this site and stacks directly above Trp-208. The second xylose residue is sandwiched between Leu-53 and Phe-287, and the first xylose is packed against Phe-54 and a loop encompassing Thr-55 through Lys-60. This binding pocket is contoured optimally for a trisaccharide substrate; a disaccharide can be accommodated but would not engage in all van der Waals packing interactions, and binding of tetra- or longer oligosaccharides would require movement of the loop encompassing Thr-55 through Lys-60. Hence, the structural data are consistent with the calorimetric analysis that illustrates the preference for a trisaccharide ligand.
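Because the cleft lies between the two domains, the contact residues named above can be assigned to a domain from the residue ranges given for the bilobal fold. The following is a minimal sketch of that bookkeeping; the dictionary layout and function name are hypothetical, the C-terminal residue number is not given in the text and is therefore left open-ended, and the assignment reflects sequence ranges only, not a structural analysis.

```python
# Residue ranges taken from the description of the two domains.
# None marks the open-ended segment that runs to the C terminus.
DOMAIN_SEGMENTS = {
    "domain 1": [(46, 158), (325, 377)],
    "domain 2": [(159, 324), (378, None)],
}


def domain_of(residue: int):
    """Return the name of the domain containing `residue`, or None."""
    for name, segments in DOMAIN_SEGMENTS.items():
        for start, end in segments:
            if residue >= start and (end is None or residue <= end):
                return name
    return None


# Side chains named in the description of the binding cleft.
CLEFT_RESIDUES = {
    "Leu-53": 53, "Phe-54": 54, "Trp-109": 109, "Trp-208": 208,
    "Met-211": 211, "Trp-212": 212, "Phe-287": 287, "Ile-329": 329,
    "Ala-398": 398,
}

for label, number in CLEFT_RESIDUES.items():
    print(f"{label}: {domain_of(number)}")
```

Under these ranges, the listed residues fall in both domains, in keeping with a ligand site formed at the domain interface.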
The specificity for xylo-oligosaccharides could also be understood in the context of the structural data. In addition to hydrophobic contacts, an extensive set of hydrogen bond interactions stabilizes the interaction between the protein and each of the sugars of the trisaccharide ligand. Polar residues are situated along the length of the ligand-binding cleft where they interact with the hydroxyl groups of the xylotriose. At the site of the first xylose residue (C-1₁ anomeric carbon), which harbors the reducing end of the xylotrisaccharide, O-3₁ is within hydrogen bonding distance to Arg-267, and O-1₁ is within hydrogen bonding distance to the backbone nitrogen of Gly-56. At site 2, both Gln-160 and the indole nitrogen of Trp-109 engage O-3₂, and Arg-267 interacts with the hemiacetal oxygen O-5₂. At site
The inability of XBP1 to bind to glucose or cellotriose is a result of steric occlusion of the C-5 hydroxymethyl group within the tight confines of the ligand-binding pocket. Modeling studies, based on our co-crystal structure, allow for a prediction of how branched xylo-oligosaccharides can be accommodated within the XBP1 sugar-binding pocket. Based on steric considerations, the branch would have to reside at the O-2₂ position (the subscript defines the xylose site). A branch at O-2₁ would be occluded by steric overlaps with Asn-86 and Lys-90, and a branch at O-2₃ would clash with the protein main chain. Branching at O-2₂ can be accommodated without steric hindrances, and branches at this position are further stabilized by stacking interactions with the side chain of Trp-109 and hydrogen bonding with the side chain of Gln-107.
Expression of Xylan Utilization Genes by C. polysaccharolyticus during Growth on BWX. To evaluate whether these genes are induced at the transcriptional level by C. polysaccharolyticus in response to xylan, the bacterium was cultured in a defined medium with either BWX or glucose as the carbon source to early log phase, then RNA was extracted, and expression of the genes in the xylan utilization cluster as well as xyn10A was evaluated by Q-RT-PCR.
These results indicate that these genes are involved in xylan utilization by C. polysaccharolyticus, and furthermore, the bacterium has evolved a transcriptional program that permits the recognition of xylan fragments and subsequently induces the expression of genes involved in its degradation and fermentation.

DISCUSSION
Xyn10A is a multimodular endoxylanase composed of a GH 10 endoxylanase module flanked on the N terminus by a tandem repeat of CBM 22 and on the C terminus by another tandem repeat of CBM 9, followed by three SLH modules (Fig. 1A). This modular organization is conserved among six other bacteria, two of the genus Thermoanaerobacter and four of the genus Thermoanaerobacterium (supplemental Fig. 3). Beyond these organisms, there are hundreds of homologous proteins (defined as possessing at least one copy of the CBM 22, GH 10, and CBM 9 in the same orientation within a single polypeptide) in the GenBank™ database derived from other bacteria (supplemental Fig. 3). Interestingly, all of the organisms that possess a homolog of Xyn10A are thermophilic or hyperthermophilic bacteria, suggesting that this modular organization imparts an advantage to degrading xylan at elevated temperatures. Studies with XynA from Thermoanaerobacterium saccharolyticum (TsXynA), XynA from Thermotoga maritima (TmXynA), and XynC from Paenibacillus barcinonensis (PbXynC) found the N-terminal CBM 22 to be critical for imparting thermostability and thermophilicity to the respective enzymes (50-52). In addition to their thermostabilizing properties, the CBM 22 modules also possess carbohydrate binding activity. A polypeptide composed of both N-terminal CBM 22s of TmXynA bound both soluble xylan and mixed-linkage β-1,3/β-1,4-glucan but did not bind to crystalline cellulose (52). Furthermore, a polypeptide composed only of the second CBM 22 of TmXynA possessed similar binding characteristics to the polypeptide containing both modules (52).
The C-terminal CBM 9 of TmXynA and the noncellulosomal protein XynX from Clostridium thermocellum (CtXynX) possess cellulose binding activities and allow the respective enzymes to bind crystalline cellulose (53,54). Despite binding to crystalline cellulose, neither of these enzymes has the capacity to degrade this polysaccharide. Although the xylan-binding CBM 22 of these proteins likely aids in juxtaposing substrate and catalyst, the role of the cellulose-binding CBM 9 is less clear. However, in intact plant cell walls, xylans are found in close proximity to cellulose fibers; therefore, one possible function of the CBM 9 could be to aid in the separation and degradation of insoluble xylan fragments closely associated with cellulose fibrils. Although the precise function of CBM 9 in xylan degradation by these organisms remains unclear, the high level of conservation of this modular architecture in these proteins across diverse bacteria clearly suggests that these modules are integral to xylan degradation.
Immunogold labeling and electron microscopy revealed that TmXynA is tethered to the outer membrane (toga) of T. maritima by a hydrophobic stretch of amino acids within the N-terminal signal peptide (55). Although there is a signal peptide within CpXyn10A, there is no significant homology between the signal peptides for the two xylanases nor is there a predicted signal peptidase II cleavage site that might facilitate transfer to a lipid moiety. Rather, CpXyn10A is most likely tethered to the surface of the bacterium via the three SLH repeats at the C terminus of the protein. SLH modules recognize and bind to pyruvylated cell wall polysaccharides (56). The enzyme that mediates this modification is encoded by the csaB gene (56), and a homolog of this gene is present within the genome of C. polysaccharolyticus (data not shown), indicating that this mechanism is active in this organism.
The xylo-oligosaccharides are then converted to xylose by the intracellularly located Xyl3A. In the xylose metabolism pathway, xylose isomerase converts xylose to xylulose, which is then converted to xylulose-5-P by xylulose kinase. Transketolase then catalyzes the rearrangement of xylulose-5-P and ribose-5-P to sedoheptulose-7-P and glyceraldehyde-3-P, and the transaldolase converts the two products to erythrose-4-P and fructose-6-P.

Xyl3A is a bifunctional β-xylosidase/β-glucosidase with higher activity with pNPG relative to pNPX; however, the catalytic efficiency for xylo-oligosaccharides was several orders of magnitude higher than for cello-oligosaccharides, indicating that cleaving debranched xylan fragments is the most likely activity of this protein. In addition to possessing the N-terminal (α/β)₈ and C-terminal β-sandwich domains characteristic of GH 3 enzymes (57), Xyl3A also has a C-terminal fibronectin repeat 3-like (FnIII-like) domain (supplemental Fig. S1). FnIII domains are thought to have originated in animals and transferred to bacteria (58), where they exist almost exclusively in association with glycoside hydrolase enzymes (59). The recent crystal structure of a three-domain GH 3 β-glucosidase from Thermotoga neapolitana (TnBgl3B) showed that the domain did not make contacts with the active site and adopts a FnIII fold that is distinct from those found in animals as well as those found in other bacterial glycoside hydrolases (60). Domain III of TnBgl3B was annotated as an FnIII-like domain, and according to the Pfam database, 3290 of the total 3379 FnIII-like domains are associated with a GH 3 domain. Furthermore, 2949 of the total 6490 GH3 N domains are associated with FnIII-like domains, and the domain organization seen for Xyl3A is the most prevalent for GH 3 proteins in the Pfam database (data not shown). Despite the high abundance of proteins with this domain architecture, the role of this domain in GH3 enzymes is still unknown.
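The two Pfam counts quoted above describe an asymmetric association, which is easier to see as percentages. The snippet below is a small illustrative calculation using only the numbers in the text; the constant and function names are hypothetical.

```python
# Counts quoted in the text from the Pfam database.
FNIII_LIKE_TOTAL = 3379
FNIII_LIKE_WITH_GH3 = 3290
GH3_N_TOTAL = 6490
GH3_N_WITH_FNIII_LIKE = 2949


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, with basic sanity checks."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("expected 0 <= part <= whole and whole > 0")
    return 100.0 * part / whole


fniii_with_gh3 = percent(FNIII_LIKE_WITH_GH3, FNIII_LIKE_TOTAL)
gh3_with_fniii = percent(GH3_N_WITH_FNIII_LIKE, GH3_N_TOTAL)

print(f"FnIII-like domains found with a GH 3 domain: {fniii_with_gh3:.1f}%")
print(f"GH3 N domains found with an FnIII-like domain: {gh3_with_fniii:.1f}%")
```

Nearly all FnIII-like domains occur alongside a GH 3 domain, whereas fewer than half of the GH3 N domains carry an FnIII-like domain.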
XBP1 is the solute-binding component of an ABC transporter and specifically binds to xylo-oligosaccharides with a preference for xylotriose. Although four sequenced Thermoanaerobacterium spp. have uncharacterized homologs of XBP1 ranging in amino acid identity from 74 to 87%, the most closely related protein with demonstrated activity is XynE from G. stearothermophilus (52% identity). Similar to XBP1, GsXynE binds preferentially to xylotriose, although it prefers xylotetraose over xylobiose, which is in contrast to XBP1 (8). The gene encoding GsXynE is located within a 39.7-kb gene cluster containing hemicellulose utilization genes (6), and its expression is regulated by a nearby two-component system (8). This arrangement is similar to that for XBP1 in that a putative two-component system lies just upstream of the ABC transporter genes, and it is likely that this system mediates the transcriptional response of these genes in the presence of xylan.
Crystallographic studies of solute-binding proteins illustrate that specificity is mediated by interactions that are largely local to the binding pocket. A comparison of the co-crystal structure of the XBP1-xylotriose complex with other solute-binding proteins illustrates that the binding site for the polysaccharide ligand in XBP1 is optimized for xylotriose. For example, in the glucose-binding protein from T. thermophilus (PDB code 2B3B), the binding site is optimized for mono- and disaccharides, and longer oligosaccharides are occluded by the protrusion of a single residue (His-348) at site 3 and a depression of the loop equivalent to Thr-55 through Lys-60 at site 1. The structure of XBP1 illustrates a binding pocket with a contour that is optimized for trisaccharides, which contains polar residues that are suited for interactions with xylose residues at each of the three sites. Our structural and biochemical data further refine the idea that for solute-binding proteins ligand specificity is achieved within the confines of a highly conserved scaffold through modest changes at the binding site.
The gene cluster identified in C. polysaccharolyticus appears to be targeted toward the utilization of xylans and specifically 4-O-methylglucuronoxylans, as suggested by the presence of an α-glucuronidase gene within the cluster. The biological relevance of this gene cluster to xylan utilization by C. polysaccharolyticus is demonstrated by Q-PCR experiments that reveal many of these genes to be induced during growth on BWX relative to glucose. The genes located within this cluster encode the entire repertoire of proteins required to do the following: (a) transport xylan fragments across the cell membrane; (b) cleave branched oligosaccharides to monosaccharides; (c) metabolize xylose through the pentose-phosphate pathway; and (d) coordinate expression of xylanolytic genes in response to environmental availability of this substrate (Fig. 8).
The use of thermophilic organisms capable of fermenting both glucose and xylose in the production of biofuels is highly desirable in this emerging industry. In some instances, attempts are made to engineer an organism that can already use glucose to also metabolize xylan or xylose. The process often involves assembly of individual genes on a cassette to transform into the desired organism. The presence of the genes encoding the hydrolytic enzymes and the key enzymes of the pentose-phosphate pathway in a cluster makes it easier to transfer a co-evolved molecular machinery to confer the xylanolytic phenotype.