Multidomain, Surface Layer-associated Glycoside Hydrolases Contribute to Plant Polysaccharide Degradation by Caldicellulosiruptor Species*

The genome of the extremely thermophilic bacterium Caldicellulosiruptor kronotskyensis encodes 19 surface layer (S-layer) homology (SLH) domain-containing proteins, the most in any Caldicellulosiruptor species genome sequenced to date. These SLH proteins include five glycoside hydrolases (GHs) and one polysaccharide lyase, the genes for which were transcribed at high levels during growth on plant biomass. The largest GH identified so far in this genus, Calkro_0111 (2,435 amino acids), is completely unique to C. kronotskyensis and contains SLH domains. Calkro_0111 was produced recombinantly in Escherichia coli as two pieces, containing the GH16 and GH55 domains, respectively, as well as putative binding and spacer domains. These displayed endo- and exoglucanase activity on the β-1,3-1,6-glucan laminarin. A series of additional truncation mutants of Calkro_0111 revealed the essential architectural features required for catalytic function. Calkro_0402, another of the SLH domain GHs in C. kronotskyensis, when produced in E. coli, was active on a variety of xylans and β-glucans. Unlike Calkro_0111, Calkro_0402 is highly conserved in the genus Caldicellulosiruptor and among other biomass-degrading Firmicutes but missing from Caldicellulosiruptor bescii. As such, the gene encoding Calkro_0402 was inserted into the C. bescii genome, creating a mutant strain with its S-layer extensively decorated with Calkro_0402. This strain consequently degraded xylans more extensively than wild-type C. bescii. The results here provide new insights into the architecture and role of SLH domain GHs and demonstrate that hemicellulose degradation can be enhanced through non-native SLH domain GHs engineered into the genomes of Caldicellulosiruptor species.

The genus Caldicellulosiruptor is composed of extremely thermophilic, Gram-positive, anaerobic bacteria that are able to attach to and degrade the variety of polysaccharides found in lignocellulosic biomass. Currently, the genomes of 12 members of the genus have been sequenced, representing species isolated from globally distributed terrestrial hot springs (53-57). Caldicellulosiruptor species produce many extracellular CAZymes, including some that have SLH domains (58). The enzyme inventory produced by individual species, and thus the capacity to degrade plant biomass polysaccharides, varies significantly across the genus (59,60). C. kronotskyensis has the largest inventory of CAZymes of any Caldicellulosiruptor species, with 31 CAZymes compared with 20 CAZymes in C. bescii, which is to date the most studied member of the genus (61).
During growth on complex carbohydrates, Caldicellulosiruptor species physically associate with the substrate. This attachment is mediated in part by non-catalytic cellulose-binding proteins, called tāpirins, that were recently identified and characterized (62). In addition, proteins anchored to the cell surface within the S-layer via SLH domains are also believed to play a role in the attachment of Caldicellulosiruptor species to insoluble plant biomass substrates (21,63). Two such SLH domain proteins from C. saccharolyticus, Csac_0678 and Csac_2722, were shown to bind to insoluble substrates and could be identified in the S-layer protein cell fraction, implying a role in cell-substrate attachment (21). However, more information is needed to understand the role of SLH domain proteins in plant biomass deconstruction, especially how their function relates to other CAZymes produced for this purpose. In this study, the localization, biochemical characteristics, and physiological role of two SLH domain enzymes from C. kronotskyensis, xylanase Calkro_0402 and laminarinase Calkro_0111, were examined from this perspective.

Experimental Procedures
Bacterial Strains, Plasmids, and Reagents-Cloning and expression of recombinant proteins used various E. coli strains: NovaBlue GigaSingles™ (EMD Millipore), Rosetta™ 2(DE3) Singles™ (EMD Millipore), NEB 10-beta electrocompetent E. coli (New England Biolabs), and Arctic Express (DE3)RIL E. coli (Agilent Technologies). Axenic strains of C. bescii and C. kronotskyensis were obtained from the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures. C. bescii strain JWCB018 (64) and non-replicating vector pDCW121 (65) were obtained from J. Westpheling (University of Georgia, Athens, Georgia). Genes of interest were amplified by polymerase chain reaction (PCR) and cloned using the pET46 Ek/LIC vector kit (EMD Millipore) for protein expression. All other vectors were constructed using one-step isothermal assembly of overlapping dsDNA, as described previously (66). Plasmids were isolated using Qiaprep miniprep kits (Qiagen), and plasmid sequences were confirmed by sequencing (Genewiz). Oligonucleotide primer sequences used are listed in Table 1. Carbohydrates and biomasses used included the following: laminarin from Laminaria digitata (Sigma L9634), xylan from birchwood (Sigma X0502), xylan from oat spelts (Sigma X0627), and dilute acid-pretreated Populus trichocarpa × Populus deltoides (National Renewable Energy Laboratory, Golden, CO).
Caldicellulosiruptor strains were grown anaerobically at 70°C in modified DSM640 medium (DSMZ), containing 0.9 g/liter NH4Cl, 0.9 g/liter NaCl, 0.4 g/liter MgCl2·6H2O, 0.75 g/liter KH2PO4, 1.5 g/liter K2HPO4, 1 g/liter yeast extract, 1 ml/liter trace element solution SL-10, 5 g/liter cellobiose, 0.5 mg/liter resazurin, and 0.75 g/liter L-cysteine-HCl·H2O. Caldicellulosiruptor genomic DNA for cloning and vector construction was extracted, as described previously (68). For genetic manipulations of C. bescii, strains were grown anaerobically at 70°C in low osmolality defined (LOD) or low osmolality complex (LOC) medium (69). C. bescii was plated embedded in LOD medium with 1.5% agar and grown at 70°C in anaerobic tanks with N2 head space. For solubilization experiments and immunofluorescence microscopy, Caldicellulosiruptor strains were grown on a modified defined version of DSMZ671 medium (671d medium), described previously (70), with 5 g/liter of the specific biomass substrate as the carbon source. For growth of all uracil auxotroph mutants, growth medium was supplemented with 40 mM uracil.
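The per-liter recipe for the modified DSM640 medium can be scaled to any batch volume by simple proportion. The following is a minimal sketch of that arithmetic; the dictionary layout and the function name `scale_recipe` are illustrative assumptions and are not part of the published protocol.

```python
# Component amounts per liter of modified DSM640 medium, as listed in the text.
# The dictionary layout is an illustrative assumption, not part of the protocol.
DSM640_PER_LITER = {
    "NH4Cl (g)": 0.9,
    "NaCl (g)": 0.9,
    "MgCl2-6H2O (g)": 0.4,
    "KH2PO4 (g)": 0.75,
    "K2HPO4 (g)": 1.5,
    "yeast extract (g)": 1.0,
    "trace element solution SL-10 (ml)": 1.0,
    "cellobiose (g)": 5.0,
    "resazurin (mg)": 0.5,
    "L-cysteine-HCl-H2O (g)": 0.75,
}


def scale_recipe(per_liter, volume_ml):
    """Return the amount of each component needed for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    factor = volume_ml / 1000.0  # fraction of one liter
    return {name: amount * factor for name, amount in per_liter.items()}


if __name__ == "__main__":
    # Example: amounts for a 500-ml batch.
    for name, amount in scale_recipe(DSM640_PER_LITER, 500).items():
        print(f"{name}: {amount:g}")
```

For a 500-ml batch, for instance, this gives 2.5 g of cellobiose and 0.45 g of NH4Cl.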
Expression of Recombinant Proteins in E. coli-All Calkro_0111 truncation mutants (TMs) and Calkro_0402 TM1 were cloned into the pET46 Ek/LIC vector maintained in NovaBlue E. coli. Sequence-confirmed plasmids were transformed into Rosetta 2(DE3) E. coli for protein expression. Proteins were expressed in modified ZYM-5052 autoinduction medium at 37°C and 250 rpm. Cells were harvested 18-24 h after induction by centrifugation at 6,000 × g for 10 min. For the expression of Calkro_0402, the gene was cloned into the pET46 Ek/LIC vector without the predicted signal peptide, as determined by the SignalP 4.1 server (71), and maintained in NEB 10-beta E. coli. The sequence-confirmed plasmid was transformed into Arctic Express (DE3)RIL E. coli for protein expression. Expression was performed in Terrific Broth medium by growing cells at 30°C and 250 rpm until A600 reached 1.0 followed by induction at 13°C with 1 mM isopropyl-β-D-thiogalactopyranoside. Cells were harvested 16-20 h later by centrifugation at 6,000 × g for 10 min. All cell pellets were frozen at −20°C prior to purification.
Purification of Recombinant Protein and Truncation Mutants-Cell pellets were resuspended in 5 ml of 20 mM sodium phosphate, pH 7.4, 0.5 M NaCl, 20 mM imidazole, 0.1% (v/v) Nonidet P-40, and 1 mg/ml lysozyme per g of wet cell pellet. Cells were lysed in a French press at 16,000 p.s.i. Cell lysates were heat-treated at 65°C for 20 min (except for the full-length Calkro_0402 protein product, which was not heat-treated) to remove heat-labile proteins from E. coli. The cell lysate was then centrifuged at 25,000 × g for 20-60 min, and the supernatant was passed through a 0.22-μm filter to prepare cell-free extract. Proteins were purified using 1- or 5-ml HisTrap HP nickel-Sepharose (GE Healthcare) immobilized metal affinity chromatography columns, operated according to the manufacturer's instructions. For full-length Calkro_0402 and Calkro_0111 TM8, size exclusion chromatography was used after immobilized metal affinity chromatography on a HiLoad 26/600 Superdex 200 pg column (GE Healthcare) in 50 mM sodium phosphate, 150 mM NaCl, pH 7.2, buffer. All chromatography steps were performed on a Biologic DuoFlow FPLC (Bio-Rad). Purity of the recombinant proteins was evaluated by SDS-PAGE using NuPAGE Novex 4-12% BisTris protein gels (Life Technologies, Inc.) or 4-15% Mini-PROTEAN TGX gels (Bio-Rad) with a Benchmark protein ladder (Life Technologies) and staining by GelCode Blue Safe Protein Stain (Thermo Scientific). The concentration of purified proteins was determined using the bicinchoninic acid assay (Pierce).
Optimal pH and Temperature Determination-The pH optimum for each recombinant enzyme was determined at 70°C in buffers at pH 2.5-10 (50 mM citrate buffer, pH 2.5 and 3.5; 50 mM sodium acetate buffer, pH 4.5, 5, 5.5, and 6; 50 mM sodium phosphate, pH 6.5, 7, and 8; 50 mM sodium bicarbonate, pH 9.2 and 10). The temperature optimum was determined at the optimal pH between 30 and 100°C. Enzyme reactions and no enzyme/substrate controls were prepared in triplicate with 100 μl of 1% substrate and incubated in PCR multiwell plates in a thermocycler (Eppendorf). Activity was assessed by measuring reducing sugar released using a 3,5-dinitrosalicylic acid (DNS) assay (modified from Refs. 72 and 73 and adapted to a 96-well microplate format (74,75)). The DNS reagent used contained 1.6% (w/v) sodium hydroxide, 30% (w/v) sodium potassium tartrate, and 0.1% (w/v) 3,5-dinitrosalicylic acid. Briefly, 30 μl of sample was mixed with 60 μl of DNS reagent and incubated in a 96-well PCR plate in a thermocycler (Eppendorf) using the following program: 95°C for 5 min, 48°C for 1 min, hold at 20°C. Then the DNS reaction was diluted with distilled water, and absorbance was measured at 540 nm. Glucose or xylose was used as a standard for glucooligosaccharide or xylooligosaccharide release.
Analysis of Oligosaccharide Products from Enzymatic Hydrolysis-To analyze oligosaccharides released by the various enzymes, enzyme was added to 500 μl of 1% substrate in buffer at the enzyme's optimum pH and incubated between 0 and 240 min at 70°C at 800 rpm in a mixing thermocycler (Eppendorf). Reactions without substrate and/or enzyme were also assessed. After incubation, tubes were spun at 16,000 × g for 5 min at 4°C, and the supernatant was applied to 0.5 ml of 10,000 molecular weight cut-off protein concentrators (Pierce). The protein concentrator eluent was diluted 10-fold with distilled water and analyzed by high performance liquid chromatography (HPLC) (Alliance e2695 separations module, Waters) with refractive index (model 2414, Waters) and photodiode array (model 2998, Waters) detectors. Columns were either the KS-801 (xylooligosaccharides) or KS-802 (glucooligosaccharides) (Shodex) operated with water as the mobile phase (0.6 ml/min and 80°C). Laminarioligosaccharides (L2-L6) (Seikagaku Corp.), individual xylooligosaccharides (X2-X6) (Megazyme), glucose (Sigma), and xylose (Acros Organics) were used as standards.
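The DNS assay described above relies on a glucose or xylose standard curve to convert absorbance at 540 nm into reducing sugar concentration. A minimal sketch of that conversion follows, assuming a linear response over the range of the standards. The standard-curve values in the example are hypothetical, and the function names are illustrative; the text does not specify how the curve was fitted.

```python
def fit_standard_curve(concentrations, absorbances):
    """Ordinary least-squares fit of A540 = slope * concentration + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    if sxx == 0:
        raise ValueError("standard concentrations must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def net_reducing_sugar(a540_reaction, a540_control, slope):
    """Reducing sugar released by the enzyme, in the units of the standards.

    Subtracting the no-enzyme control cancels the curve intercept, so only
    the slope is needed for the net release.
    """
    return (a540_reaction - a540_control) / slope


if __name__ == "__main__":
    # Hypothetical glucose standards (mg/ml) and A540 readings, for illustration.
    standards = [0.0, 0.5, 1.0, 2.0]
    readings = [0.05, 0.30, 0.55, 1.05]
    slope, intercept = fit_standard_curve(standards, readings)
    print(net_reducing_sugar(0.80, 0.10, slope))
```

Because the reactions and controls are run in triplicate, the same conversion would be applied to each replicate before averaging.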
Confocal and Epifluorescence Microscopy-All centrifugation steps for the preparation of labeled cells were carried out at 6,000 × g for 10 min. Cells harvested from a 50-ml culture in 671d medium were washed one time with sterile 1× phosphate-buffered saline (PBS), and cells were fixed in 1× PBS containing 4% formaldehyde (methanol-free; Fisher) for 30 min at room temperature with gentle shaking, followed by washing three times with 50 ml of PBS. Cells were resuspended in 3 ml of antibody-blocking solution containing 5% (v/v) normal goat serum (Immunoreagents), 1% (w/v) bovine serum albumin (protease- and nuclease-free; Fisher), and 0.05% (v/v) Tween 20 (Fisher) in PBS and incubated on a Boeckle rocker for 60 min at room temperature. After this blocking step, cells were harvested by centrifugation and resuspended in antibody-blocking solution containing 100 μg/ml primary antibody. Polyclonal antibodies were raised against Calkro_0111 TM7 or Calkro_0402 TM1 in chickens and supplied as total IgY with preimmune total IgY controls by GeneTel Laboratories (Madison, WI). Cells were incubated with the primary antibody for 18 h at 4°C on a Boeckle rocker. After this incubation, cells were washed three times with 1 ml of PBS containing 1% (w/v) bovine serum albumin and 0.05% (v/v) Tween 20 and resuspended in antibody-blocking solution containing a 1:400 dilution of goat anti-chicken DyLight488-conjugated secondary antibody (Immunoreagents) and incubated on a Boeckle rocker for 60 min at room temperature. Cells were harvested and washed three times with PBS, resuspended in 100 μl of 3.6 μM DAPI in PBS, and incubated at 4°C for 18 h. Following this incubation, cells were washed three times, resuspended with PBS, and vacuum-filtered onto 0.22-μm polycarbonate hydrophilic isopore membrane filters (EMD Millipore) and mounted in 15 μl of SlowFade Diamond anti-fade mountant (Life Technologies). For Calkro_0111 and Calkro_0402 localization in C. kronotskyensis, slides were imaged on a Zeiss LSM 710 confocal work station using a ×63 oil immersion plan-apochromat (numerical aperture 1.4) objective (North Carolina State University Cellular and Molecular Imaging Facility, Raleigh, NC). Image processing was performed using Zeiss Zen Blue software.
Epifluorescence imaging and cell counting were performed using a Nikon Eclipse 50i microscope with a Plan Fluor ×100 (numerical aperture 1.3) oil immersion objective and Nikon DS-Fi1 camera. For routine cell counting, cells were stained with acridine orange, as described previously (76). For immunofluorescence images of C. bescii genetic mutant strains, cells were labeled and slides were prepared as described above for confocal imaging. Images were captured as a z-stack of images through the sample with 0.33-μm steps controlled by the ES10ZE Focus controller (Prior Scientific) and processed into a single focused image using the Nikon Elements software extended depth of field processing methods. Transmission electron microscopy was performed at the Laboratory for Advanced Electron and Light Optical Methods (College of Veterinary Science, North Carolina State University).
Genetic Manipulation of C. bescii-To prepare competent cells, a 500-ml culture of C. bescii strain JWCB018 was grown to an optical density at 680 nm of 0.06-0.07 in LOD medium supplemented with 1× 19-amino acid solution (77). The culture was cooled to room temperature, and cells were harvested by centrifugation. All centrifugation steps were carried out at 6,000 × g for 10 min. The cell pellet was washed three times with 10% sucrose and resuspended with 10% sucrose to a total volume of 100-120 μl. Fifty μl of competent cells were mixed with 1 μg of pJMC009 plasmid DNA at room temperature and electroporated using 1-mm gap cuvettes (USA Scientific) and a Gene Pulser II system with a Pulse Controller PLUS module (Bio-Rad) operated at 2.25 kV, 600 ohms, and 25 microfarads. Immediately following electroporation, cells were transferred to 10 ml of prewarmed LOC medium and incubated at 70°C for 1 h. After this incubation, the culture was cooled to room temperature, and cells were harvested by centrifugation, transferred to 100 ml of prewarmed LOD medium lacking uracil, and incubated at 70°C for 2-4 days to select for first crossover transformants. This culture was passaged and then plated on LOD medium lacking uracil to select individual colonies. Integration of the pJMC009 vector was confirmed by PCR, and a successful first crossover mutant strain was plated on LOD medium supplemented with 40 mM uracil and 4 mM 5-fluoroorotic acid to select for second crossover mutants. Isolated colonies were screened by PCR, and successful second crossovers were identified. A successful second crossover was plated on LOD medium supplemented with uracil, and individual colonies were isolated. This plating and isolation of colonies was repeated two times to obtain the final strain and ensure its purity. All PCR screening steps were performed using genomic DNA isolated using the ZymoBead™ genomic DNA kit (Zymo Research).
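The electroporation settings given above (2.25 kV, 600 ohms, 25 microfarads, 1-mm gap cuvette) imply a nominal field strength and pulse time constant. The sketch below works through that arithmetic under the simplifying assumption that the pulse controller resistance dominates the circuit, so that the time constant is approximately R × C; the time constant actually delivered depends on the sample resistance and is not reported in the text. Function names are illustrative.

```python
def field_strength_kv_per_cm(voltage_kv, gap_mm):
    """Nominal electric field across the cuvette gap, in kV/cm."""
    if gap_mm <= 0:
        raise ValueError("gap_mm must be positive")
    return voltage_kv / (gap_mm / 10.0)  # convert mm to cm


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant in milliseconds.

    ohm * microfarad gives microseconds, so divide by 1000 for milliseconds.
    Assumes the set resistance dominates over the sample resistance.
    """
    return resistance_ohm * capacitance_uf / 1000.0


if __name__ == "__main__":
    print(field_strength_kv_per_cm(2.25, 1.0))   # about 22.5 kV/cm
    print(nominal_time_constant_ms(600, 25))     # 15.0 ms
```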
Hydrolysis of Hemicellulose Substrates-Washed biomass substrates (oat spelt xylan, birchwood xylan) were prepared by washing 1 g of substrate/100 ml of distilled water overnight at 70°C to remove soluble sugars, centrifuging at 6,000 × g for 10 min, and drying at 70°C. Caldicellulosiruptor strains were passaged on modified 671d medium with respective biomass substrates 3-4 times. Solubilization cultures were prepared in triplicate as 50-ml 671d medium cultures with 5 g/liter substrate, inoculated to 1 × 10^6 cells/ml, and grown at 70°C for 45.5 h. Residual substrate was harvested by centrifugation at 6,000 × g for 10 min and drying at 70°C until constant mass. The extent of solubilization was determined from the difference in mass between the biomass used to prepare each culture and the residual remaining after harvest. For time course analysis of oat spelt xylan solubilization, cultures were prepared as above except that cultures were sampled by removing 2 ml of culture and centrifuging the samples at 6,000 × g for 10 min. The supernatant was stored at −20°C prior to HPLC analysis.
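The mass-difference calculation described above can be sketched as follows. Expressing the result as a percentage of the initial mass, and summarizing triplicates as mean and sample standard deviation, are assumptions made for illustration; the residual masses in the example are hypothetical. The initial mass of 0.25 g follows from the stated 50-ml culture at 5 g/liter substrate.

```python
from statistics import mean, stdev


def percent_solubilized(initial_mass_g, residual_mass_g):
    """Percentage of the initial substrate mass lost from the insoluble fraction."""
    if initial_mass_g <= 0:
        raise ValueError("initial_mass_g must be positive")
    if residual_mass_g < 0:
        raise ValueError("residual_mass_g must not be negative")
    return 100.0 * (initial_mass_g - residual_mass_g) / initial_mass_g


def summarize_replicates(initial_masses_g, residual_masses_g):
    """Mean and sample standard deviation of solubilization across replicates."""
    values = [percent_solubilized(i, r)
              for i, r in zip(initial_masses_g, residual_masses_g)]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # 50 ml at 5 g/liter gives 0.25 g of substrate per culture.
    initial = [0.25, 0.25, 0.25]
    residual = [0.10, 0.11, 0.09]  # hypothetical dried residual masses (g)
    print(summarize_replicates(initial, residual))
```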
Analysis of Solubilization in Culture Supernatants-Supernatant samples were analyzed for xylose content, as described previously (70). Briefly, the samples were brought to 4% (w/w) sulfuric acid and autoclaved for 1 h on the liquid cycle. Samples were cooled to room temperature and spun at 18,000 × g for 5 min, and supernatant was analyzed by HPLC as described above except using an Aminex-87H (300 × 7.8 mm; Bio-Rad) column with a mobile phase of 5 mM sulfuric acid at 0.6 ml/min and 60°C.
C. kronotskyensis and C. bescii RNA Microarray-Transcriptomic data, which were acquired previously (70) and deposited in the NCBI Gene Expression Omnibus database (accession number GSE68810), were reanalyzed with respect to the SLH domain-containing proteins from C. bescii and C. kronotskyensis.

SLH Domain Proteins in Caldicellulosiruptor Species-The genomes of the 12 sequenced Caldicellulosiruptor species collectively encode 34 different groups of SLH domain-containing proteins (supplemental Table S1). About half of the SLH domain-containing protein groups in these genomes have no additional identifiable domains other than SLH domains, highlighting the limited understanding of this group of proteins. Caldicellulosiruptor species produce between 10 (C. saccharolyticus and Caldicellulosiruptor acetigenus) and 19 (C. kronotskyensis) SLH domain proteins.
Eight of the 34 SLH domain-containing protein groups contain identifiable catalytic GH or PL CAZyme domains (Fig. 1).
The other six groups of catalytic SLH domain proteins (Fig. 1) have not been characterized from Caldicellulosiruptor species. Calkro_0111 (GH16/GH55) is the largest of all of these at 2,435 amino acids and is also the largest CAZyme of any kind identified so far in the genus Caldicellulosiruptor. Calkro_0121 is a second truncated paralog of Calkro_0111, which has 60% amino acid identity to Calkro_0111 and identical domain architecture but is lacking the final CBM32 domain. The arrangement of the domains in Calkro_0111 and Calkro_0121 is entirely unique to C. kronotskyensis, and no homologs of the full protein (Calkro_0111) can be identified in any other sequenced microorganism. Calkro_0072 (GH16) has homologs in seven Caldicellulosiruptor species and is very similar to the β-1,3-glucanase Lic16A characterized from C. thermocellum (38). C. hydrothermalis produces two catalytic SLH domain proteins that are unique to this species, putative xylanase Calhy_1629 (GH43) and putative α-glucanase Calhy_2383 (GH87). Calkro_0550 (PL11) from C. kronotskyensis has one additional homolog in C. owensensis and is a putative rhamnogalacturonan lyase. C. owensensis also produces one other putative pectate lyase SLH domain protein, Calow_2109 (PL9), which is unique to this species.
These catalytic SLH domain proteins encompass a range of enzymatic activities localized in the S-layer, presumably used to break down plant polysaccharide components that are in proximity of the cell surface. C. kronotskyensis produces six CAZyme SLH domain proteins, the most of any species; C. bescii only produces one, a homolog of Csac_0678, which is conserved in all Caldicellulosiruptor species. These SLH-localized CAZymes accompany a host of other CAZyme domains found in extracellular free enzymes produced by Caldicellulosiruptor sp. for plant biomass degradation (58).
C. kronotskyensis Produces an S-layer, and SLH Enzymes Calkro_0111 and Calkro_0402 Are Localized on the C. kronotskyensis Cell Surface-Transmission electron microscopy of C. kronotskyensis growing on dilute acid-treated poplar (P. trichocarpa × P. deltoides) shows the cell membrane, peptidoglycan layer, and S-layer of C. kronotskyensis cells (Fig. 2A), with the cells appearing to attach to granules of the substrate mediated by appendages at the cell surface. To determine the localization of the SLH CAZymes Calkro_0402 and Calkro_0111, polyclonal antibodies were raised against portions of these proteins and used for immunofluorescence microscopy in C. kronotskyensis. Fig. 2, panels B and C and panels D and E, shows confocal microscopy of C. kronotskyensis labeled with anti-Calkro_0111 and anti-Calkro_0402 antibodies, respectively. Controls labeled with preimmune total IgY antibodies show minimal labeling (data not shown). In each case, these enzymes are localized on the cell surface within the S-layer, as would be predicted by the presence of SLH domains. Because of the high similarity between Calkro_0111 and Calkro_0121, the antibodies are probably labeling both of these proteins in Fig. 2 (B and C).
Laminarinase Calkro_0111 Displays Endo- and Exoglucanase Activity in Separate Catalytic Domains-Calkro_0111 contains 15 predicted domains, including the two catalytic units: GH16 and GH55 (Fig. 3A). Because the full enzyme could not be produced recombinantly in E. coli, the enzyme was split into several TMs for characterization. Two of these TMs were used to characterize the enzymatic activity: TM1, containing the GH16 domain and covering the first half of Calkro_0111, and TM8, containing the GH55 domain and covering the second half of Calkro_0111. For both TM1 and TM8, the optimal pH was 5, whereas the optimal temperature was 75°C (Fig. 3B). Analysis of the oligosaccharides released from laminarin by these two halves of Calkro_0111 showed that the GH55 domain of TM8 generates primarily glucose and a small amount of laminaribiose from laminarin, whereas the GH16 domain of TM1 released laminooligosaccharides of size L3 (laminaritriose) and greater, which slowly accumulated with time (Fig. 3C). Taken together, these findings suggest that the GH16 domain is an endoglucanase, whereas the GH55 domain is an exoglucanase. Considering the full-length enzyme, coordination between the GH16 and GH55 domains is expected, such that the GH16 cuts new chain ends for the GH55 to digest, thereby liberating glucose. This synergism is not uncommon in freely secreted Caldicellulosiruptor enzymes with two catalytic domains (76,79,80), but synergism had yet to be demonstrated in any CAZyme SLH domain protein. This is primarily because very few SLH domain proteins have two catalytic CAZyme domains, further highlighting the uniqueness of Calkro_0111.
To further explore the domain arrangement and the contribution of the non-catalytic domains to activity, a range of TMs (Fig. 3A) were produced in E. coli and analyzed. Activity of these TMs at 70°C on laminarin was evaluated at pH 5, the enzyme optimum pH, and pH 7, the growth optimum pH for Caldicellulosiruptor (Fig. 3D). TM3 and TM4 had the highest activity at both pH levels tested, whereas TM1 was slightly less active. TM2, which is truncated between the GH16 and the first actin cross-linking-like (ACL) domain, was significantly less active. TM5, which contains the GH16 domain alone, and TM6 and TM7, which contain the ACL and fibronectin type-III (FN3) domains and not the GH16, all displayed little to no activity (Fig. 3D). TM4, which contains the GH16 and first ACL domain, represents the smallest portion of the GH16 side of the enzyme that remains fully active under these conditions. This suggests that the ACL domain plays an important role in GH16 function. The ACL domain could stabilize the GH16 domain, a role that has been shown before for accessory domains in other GH multidomain enzymes. For example, two FN3 domains (also previously called X1 modules) of CbhA (C. thermocellum cellobiohydrolase A) increased the thermostability of the adjacent GH9 domain (81). For the Calkro_0111 GH55 domain, removal of the ACL and FN3 domains from TM8 to TM9 and then removal of the two CBM32 domains from TM9 to TM10 resulted in successive reductions in activity (Fig. 3D). This suggests that both the ACL-FN3-FN3 grouping and two CBM32 domains play an important role in the activity of the Calkro_0111 GH55.
Biochemical Characterization of Calkro_0402 and Relatedness to Other CBM22/GH10/CBM9 Enzymes-Calkro_0402 belongs to a large group of homologous CBM22-, GH10-, and CBM9-containing proteins, from a variety of bacteria, primarily Firmicutes. An unrooted neighbor-joining phylogenetic tree constructed from the full amino acid sequences of top blastp hits to Calkro_0402 and similar xylanases that have been previously characterized shows the relatedness of xylanase enzymes from this group (Fig. 4A).
Calkro_0402 contains three CBM22, one catalytic GH10, two CBM9 domains, and one cadherin-like domain, in addition to C-terminal SLH domain repeats (Fig. 4B). Full-length Calkro_0402 was produced recombinantly in E. coli (Fig. 4C), and the optimal pH and temperature for activity on birchwood xylan were found to be pH 5.5 and 80°C, respectively (Fig. 4D). Calkro_0402 releases primarily xylotriose and xylobiose from birchwood xylan but also a small amount of xylose (Fig. 4E). This type of activity is consistent with other xylanase homologs that have been characterized previously (28,31). Activity of Calkro_0402 was also detected on oat spelt xylan, wheat arabinoxylan, pachyman 1,3-β-glucan, barley β-glucan, and lichenan in addition to birchwood xylan, all at pH 5.5 and 70°C (data not shown).
Transcriptomic Analysis of SLH Domain Proteins in C. bescii and C. kronotskyensis-Transcriptomic analysis was performed on C. kronotskyensis and C. bescii, the Caldicellulosiruptor species with the most and least catalytic CAZyme SLH domain proteins, respectively, when these species were grown on Avicel (crystalline cellulose) and on switchgrass, a more heterogeneous substrate. All CAZyme SLH domain proteins from both species, except Calkro_0111 and Calkro_0121, were up-regulated on switchgrass. This transcriptional response to switchgrass suggests that these genes respond to components of switchgrass, like hemicellulose polysaccharides, not found in crystalline cellulose Avicel. The transcriptional level of these genes on switchgrass is similar to that of the free enzymes in the glucan degradation locus, known to be a genomic feature of cellulolytic Caldicellulosiruptor sp. (59,61,70). Calkro_0333 and Athe_2303 are very highly transcribed under both conditions. These proteins are homologs of Csac_2451, which was previously identified as the main SLP (59,60). A number of other non-catalytic SLH domain proteins are transcribed at moderate to high levels, but because domains other than SLH domains are not predicted, the role of these proteins is unknown.

Manipulation of the C. bescii S-layer by the Insertion (Knockin) of the Gene Encoding Calkro_0402-Because Calkro_0402 is very highly transcribed in C. kronotskyensis (Fig. 5) and utilized by a variety of xylan-degrading bacteria (Fig. 4A), Calkro_0402 was "knocked in" to genetically tractable C. bescii to examine its potential contribution to plant biomass degradation in vivo. C. bescii is a good genetic background for testing SLH proteins in vivo, because it produces only 12 SLH domain proteins in total and does not produce a homolog of Calkro_0402. Calkro_0402 was inserted into C. bescii uracil auxotroph strain JWCB018 using pyrF complementation and 5-fluoroorotic acid counterselection. The knock-in construct contained Calkro_0402, including the native signal peptide and the native predicted terminator sequence from C. kronotskyensis, under the control of the slp promoter from the main SLP (Athe_2303) in C. bescii (Fig. 6, A and B). As shown in Fig. 5, the main SLP (Athe_2303) is very highly and constitutively transcribed. Thus, its promoter should direct very high levels of transcription of Calkro_0402 in the knock-in construct. The knock-in was targeted at the ΔCbeI (Athe_2438) deletion locus in C. bescii strain JWCB018, because genetic manipulation in this area of the genome was successful previously (64). The PCR amplicon for the knock-in C. bescii strain RKCB103 compared with strain JWCB018 is shown in Fig. 6C.
Using this Calkro_0402 knock-in C. bescii strain RKCB103, immunofluorescence microscopy was performed to investigate the expression of Calkro_0402 within the S-layer of C. bescii. For comparison, immunofluorescence microscopy was also performed on parent C. bescii strain JWCB018 and on C. kronotskyensis (Fig. 6, D-F). C. bescii strain JWCB018, which does not have Calkro_0402, had minimal labeling with the anti-Calkro_0402 antibodies (Fig. 6D), whereas C. bescii strain RKCB103 was extensively labeled on the cell surface (Fig. 6F). Labeling of the native expression of Calkro_0402 in C. kronotskyensis is also shown (Fig. 6E). Based on transcription levels of Calkro_0402 and Athe_2303, the main SLP (Fig. 5), the Calkro_0402 transcript in C. bescii strain RKCB103, driven by the slp promoter, should be >10-fold higher than the levels observed in wild-type C. kronotskyensis. This clearly relates to an increase in Calkro_0402 protein expression in C. bescii strain RKCB103 compared with native expression in C. kronotskyensis, as seen via the immunofluorescent labeling. Also of particular note is that the native signal peptide from C. kronotskyensis on Calkro_0402 appeared to function in C. bescii to route the protein for secretion. The C. bescii-produced Calkro_0402 also localized to the surface of the C. bescii strain RKCB103 cells and in Fig. 6F appears to localize primarily at the poles of the cell. The organization of SLH proteins on the cell surface is generally thought to be coordinated in bacteria that produce multiple SLH proteins, but the mechanisms by which this occurs have not been determined (82). This is the first example of manipulation of the Caldicellulosiruptor S-layer through genetic modification.
FIGURE 5. Transcriptional response of genes encoding SLH domain proteins from C. bescii and C. kronotskyensis. Shown is the log squared mean (LSM) transcriptomic level of the 19 SLH domain proteins from C. kronotskyensis and 12 SLH domain proteins from C. bescii when each species is grown on crystalline cellulose (Avicel) and switchgrass (SWG). A log squared mean value of 0 represents average transcript abundance (black). Genes transcribed at levels higher than average have positive log squared mean values (red), whereas genes transcribed at levels lower than average have negative log squared mean values (green). Differential transcription is shown as Avicel minus switchgrass with negative values (orange) up-regulated on switchgrass relative to Avicel. Positive values (blue) are up-regulated on Avicel relative to switchgrass. Analysis is based on whole-genome oligonucleotide microarray experiments deposited in the NCBI Gene Expression Omnibus database with accession number GSE68810 (70). Big, bacterial immunoglobulin-like (CL0159); Tglut, transglutaminase-like superfamily (pfam01841); MG2, macroglobulin 2 (pfam01835); vWFA, von Willebrand factor type A (pfam00092); SH3, bacterial Src homology 3 (pfam08239); RHSrep, RHS repeat (pfam05593); RHScore, RHS-associated core domain (TIGR03696). *, Calkro_0121 is truncated after the first CBM32.
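The sign convention stated in the Fig. 5 legend (differential transcription reported as Avicel minus switchgrass, so that negative values mean higher transcription on switchgrass) can be sketched as follows. This is only an illustration of the convention; the function name and the example LSM values are hypothetical and are not data from this study.

```python
from typing import Tuple


def differential_transcription(lsm_avicel: float, lsm_switchgrass: float) -> Tuple[float, str]:
    """Return (Avicel minus switchgrass, interpretation) per the Fig. 5 legend."""
    diff = lsm_avicel - lsm_switchgrass
    if diff < 0:
        label = "higher on switchgrass"   # shown in orange in Fig. 5
    elif diff > 0:
        label = "higher on Avicel"        # shown in blue in Fig. 5
    else:
        label = "no difference"
    return diff, label


# Hypothetical LSM values, for illustration only (not measured values).
print(differential_transcription(lsm_avicel=0.5, lsm_switchgrass=2.0))
```

Under this convention, a gene such as a switchgrass-responsive SLH xylanase would carry a negative difference.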
C. bescii Strain RKCB103 Xylan Solubilization-To understand the role of Calkro_0402 in vivo, the ability of C. bescii strains RKCB103 and JWCB018 to solubilize plant biomass in culture was examined. Fig. 7A shows the solubilization of oat spelt and birchwood xylans. Strain RKCB103 is able to solubilize significantly more washed oat spelt xylan than strain JWCB018; the xylose equivalents from the soluble xylooligosaccharides measured in the supernatant are more than double for strain RKCB103 compared with strain JWCB018 (Fig. 7B). For both the washed and unwashed birchwood xylan substrates, strain RKCB103 performed slightly better and released more xylose equivalents to the supernatant than strain JWCB018. Other biomass substrates tested (dilute acid-pretreated P. trichocarpa × P. deltoides, dilute acid-pretreated switchgrass, and unpretreated Panicum virgatum switchgrass) showed no significant difference in solubilization and also showed <30 μg/ml xylose equivalents in the culture supernatant (data not shown). This suggests that on these lignocellulosic substrates, C. bescii strain JWCB018 produced enough xylanase activity to remove xylan at the rate at which it was exposed from the complex polysaccharide matrix in the substrate, and cellulose solubilization or occlusion of the polysaccharides by lignins was probably limiting the overall solubilization. However, on the substrates with high xylan content, strain RKCB103 with the addition of Calkro_0402 outperformed strain JWCB018.
Immunofluorescence microscopy (Fig. 6F) shows that Calkro_0402 is localized on the surface of strain RKCB103 cells. While the enzyme is tethered to the cell surface, the GH and CBM domains of Calkro_0402 interact with the xylan substrate. This suggests that Calkro_0402 plays a role in mediating cell attachment to xylan substrates. Cultures of strain JWCB018 and strain RKCB103 (Fig. 7, C and D, respectively) grown on washed birchwood xylan support this role for Calkro_0402. These cultures contained approximately the same number of cells as determined by cell counts (data not shown), but the supernatant in Fig. 7D for strain RKCB103 with Calkro_0402 is relatively clear, because the cells appear to be attached to the insoluble xylan at the bottom of the culture bottle. The strain JWCB018 cultures (Fig. 7C) were more turbid, suggesting that the cells did not as readily attach to the biomass substrate.
FIGURE 6. Construction of Calkro_0402 knock-in C. bescii strain RKCB103. A, Calkro_0402 knock-in vector pJMC009 homologous recombination with the chromosome of parent C. bescii strain JWCB018. pJMC009 was constructed using the pDCW121 (65) vector backbone, which contains the pSC101 origin, repA, and apramycin resistance marker for maintenance in E. coli as well as C. bescii wild type pyrF under the control of the Athe_2105 promoter for selection of uracil prototrophy and 5-fluoroorotic acid counterselection. Cross-over regions 1 kb in length 5′ and 3′ to the ΔCbeI locus (ΔAthe_2438) in strain JWCB018 were used to target Calkro_0402 for insertion via homologous recombination to that locus. The Calkro_0402 insertion construct included Calkro_0402 and an additional 81 bp downstream of the gene to include its predicted transcriptional terminator element amplified from C. kronotskyensis genomic DNA. High, constitutive expression of Calkro_0402 was driven by the 200-bp promoter element upstream of the C. bescii main SLP gene (Athe_2303). B, the Calkro_0402 construct is inserted in the chromosome of RKCB103 after first crossover vector integration and second crossover 5-fluoroorotic acid counterselection to remove the vector backbone containing pyrF. C, using primers outside of the 5′ and 3′ crossover regions (Table 1, ΔAthe_2438 locus F and R), the ΔCbeI locus was amplified and sequenced to verify the Calkro_0402 knock-in. Strain JWCB018 has the expected 2.9-kb product, whereas strain RKCB103 has the expected 8.2-kb product with the 5.3-kb insertion of the SLP promoter and Calkro_0402 construct. A New England Biolabs 1-kb ladder was run alongside these PCR products. D-F, anti-Calkro_0402 immunofluorescence microscopy of C. bescii strain JWCB018, C. kronotskyensis, and C. bescii strain RKCB103 grown on birchwood xylan. D, C. bescii strain JWCB018, the genetic parent strain, which does not contain Calkro_0402, shows minimal labeling with the anti-Calkro_0402 antibodies. C. kronotskyensis (E) and Calkro_0402 knock-in C. bescii strain RKCB103 (F) are both extensively labeled with antibody. E and F, Calkro_0402 is shown localized on the cell surface in both C. kronotskyensis and C. bescii strain RKCB103 and appears primarily at the poles of the cells. All cells are labeled with chicken anti-Calkro_0402 total IgY primary antibody and DyLight 488-conjugated goat anti-chicken secondary antibody (green) with DAPI as a counterstain (blue). Scale bars, 2 μm.
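The PCR verification described in the Fig. 6C legend rests on simple arithmetic: the 2.9-kb parent product plus the 5.3-kb insertion gives the expected 8.2-kb knock-in product. A minimal sketch of that check is given below. The sizes are the rounded values from the legend, whereas the function names and the 300-bp gel-reading tolerance are assumptions introduced only for illustration.

```python
# Rounded product sizes taken from the Fig. 6C legend, in base pairs.
PARENT_PRODUCT_BP = 2_900   # strain JWCB018, no insertion
INSERTION_BP = 5_300        # SLP promoter plus Calkro_0402 construct


def expected_knockin_product_bp(parent_bp: int = PARENT_PRODUCT_BP,
                                insertion_bp: int = INSERTION_BP) -> int:
    """Expected amplicon size across the locus after the insertion."""
    return parent_bp + insertion_bp


def call_genotype(observed_bp: int, tolerance_bp: int = 300) -> str:
    """Classify an observed band size; the tolerance is a hypothetical choice."""
    if abs(observed_bp - PARENT_PRODUCT_BP) <= tolerance_bp:
        return "parent (no insertion)"
    if abs(observed_bp - expected_knockin_product_bp()) <= tolerance_bp:
        return "knock-in"
    return "unexpected product size"


print(expected_knockin_product_bp())  # 8200
```

A band size alone cannot confirm the insert sequence, which is why the locus was also sequenced.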
To understand the role of Calkro_0402 in improved xylan solubilization, a time course experiment was performed on washed oat spelt xylan. Fig. 8A shows the planktonic cell density for strains JWCB018 and RKCB103 over the duration of the culture. The cell densities of the two strains track each other closely. When monitoring xylose release, roughly 9-12 h postinoculation, the xylose hydrolyzed from xylooligosaccharides by strain RKCB103 was measurable above the baseline abiotic control, whereas strain JWCB018 took between 16 and 22 h postinoculation (Fig. 8B). It is important to note that the strains were consuming xylose as they degraded the xylan substrate, and these values represent the xylose from excess xylooligosaccharides released by the enzymatic action of the cells that has not been consumed for growth. As with the closed bottle total solubilization experiment, strain RKCB103 has about double the xylose equivalents in the supernatant samples as strain JWCB018 at 45.5 h, the point at which the cultures were harvested.
Calkro_0111 from C. kronotskyensis is the largest glycoside hydrolase from all of the biomass degradation enzymes in the Caldicellulosiruptor genus, free or cell-associated, and is entirely unique to C. kronotskyensis. The GH16 domain of Calkro_0111 is an endoglucanase, whereas the GH55 is an exoglucanase (Fig. 3). Thus, the two GH domains probably work synergistically, as has been shown for other Caldicellulosiruptor enzymes without SLH domains but with two catalytic domains (76,79,80). Various truncation mutants of Calkro_0111 (Fig. 3D) were examined, revealing that the non-catalytic ACL domain adjacent to the GH16 domain is necessary for full activity. Furthermore, for the GH55 domain of Calkro_0111, both the ACL-FN3-FN3 domain grouping and two CBM32 domains improved the activity of the GH55 domain.
(n = 3). C. bescii strain JWCB018 cultures on washed birchwood xylan appear very turbid (C), and C. bescii strain RKCB103 cultures appear much less turbid (D) while having the same cell density/ml of culture. C. bescii strain RKCB103 appears to be more readily attached to the insoluble washed birchwood xylan substrate, suggesting that the expression of Calkro_0402 in this strain is playing a role in tethering the cells to the xylan substrate.
The very large multidomain structure of Calkro_0111 is characteristic of many Caldicellulosiruptor polysaccharide-degrading enzymes, where this architecture is thought to have arisen from domain shuffling (83,84). Specifically, for Calkro_0111, the N terminus GH16 portion is similar to the N terminus of Calkro_0072 (45% amino acid identity), and the C terminus GH55 portion is similar to Calkro_0113 (58% amino acid identity). These protein segments may have combined by domain shuffling to form Calkro_0111. However, the ACL-FN3-FN3-ACL-FN3-FN3 motif in the middle of Calkro_0111 and in homolog Calkro_0121 is completely unique to these two genes. These ACL domains belong to the same protein superfamily (CL22458 RICIN superfamily) as fascins, which play a role in bundling actin filaments for cell adhesion and migration in Drosophila and vertebrate species (85). Plants and brown algae, such as Laminaria sp., produce actin and actin bundling proteins that are distinct from fascins for a variety of cellular roles (86,87). The ACL domains of Calkro_0111 may play a role in attachment of this enzyme to actin found in algal or plant cells, similar to the role of a CBM for carbohydrate binding, to help localize the enzyme to laminarin-containing substrates.
Many GH16- and GH55-containing enzymes are active on β-1,3-glucans. Calkro_0111 activity was detected on a model substrate, laminarin (β-1,3-1,6-glucan), albeit at modest levels such that this is not likely to be its natural substrate. In fact, β-1,3-glucans, similar to laminarin, are produced in plant reproductive and wound tissue (callose) (88,89), exopolysaccharide from bacteria (curdlan) (90), algae such as L. digitata (laminarin), and lichen such as Cetraria islandica (lichenan). In addition to this unique substrate preference of Calkro_0111, its genomic neighborhood is also unique, containing other extracellular biomass-degrading enzymes that have no homologs within other species in the genus. These include GH16/GH55 Calkro_0121, GH55 Calkro_0113, and GH81 Calkro_0114. This locus also encodes two ABC transporters, one of which is homologous to a putative glucooligosaccharide transporter in C. saccharolyticus (91). This entire genomic region is transcribed at relatively low levels when C. kronotskyensis is grown on Avicel or switchgrass. Taken together, this genomic locus, including Calkro_0111, probably degrades unique β-1,3-glucans from lichens, algae, or other microorganisms that may be more abundant in the natural environment of Kamchatka, Russia, from which C. kronotskyensis was isolated.
The large size of the gene encoding Calkro_0111 demonstrates that this unique multidomain arrangement is beneficial to C. kronotskyensis for it to be maintained and even duplicated as Calkro_0121. More broadly in the genus Caldicellulosiruptor, the unique multidomain architecture of enzymes, both with and without SLH domains, has probably evolved to appropriately space the CAZyme domains for increased synergy and activity. Specifically, for SLH domain proteins, domain spacing must also accommodate the tethering of the SLH domains at one end of the protein to the cell surface while it is attaching to and degrading the plant biomass substrate with the other. The large size of these catalytic SLH proteins may reflect the length needed to swing out away from the cell to reach biomass substrates to which they can attach. Although domain spacing and arrangement appear to be critical factors in the activity of Calkro_0111, no pattern for domain arrangement through all of the catalytic Caldicellulosiruptor SLH proteins is readily apparent, based on amino acid sequence. Lacking structural data, which are difficult to obtain for these large multidomain proteins, the orientation of each domain relative to another cannot be determined precisely. It does seem likely, however, that each protein has evolved its domain spacing to enable its unique combination of binding and catalytic domains to function properly while being tethered to the cell.
Whereas Calkro_0111 and Calkro_0121 are transcribed at low levels when grown on plant biomass substrates, all of the other SLH GH and PL enzymes from C. kronotskyensis and C. bescii are up-regulated when these species are grown on switchgrass compared with Avicel (Fig. 5), including the xylanase Calkro_0402. This suggests an important role in plant biomass degradation. Calkro_0402 belongs to a large group of homologous CBM22/GH10/CBM9 xylanases (Fig. 4A). While Calkro_0402 releases primarily xylobiose and xylotriose from birchwood xylan (Fig. 4E), similar to previously characterized homologs (28,31), the optimal temperature of 80°C for Calkro_0402 (Fig. 4D) makes it one of the most thermophilic versions of this family of xylanase enzymes characterized to date. As is predicted by the presence of SLH domains, Calkro_0402 is localized to the cell surface of C. kronotskyensis (Fig. 2, D and E), which is a common feature of many homologs of Calkro_0402, including some shown in Fig. 4A associated with the cell by means other than SLH domains. The Calkro_0402 homolog from Thermotoga maritima (and presumably homologs in other Thermotoga sp.) is cell-associated by an N-terminal hydrophobic peptide anchor and not SLH domains (92). Furthermore, the Clostridium clariflavum xylanase lacks SLH domains but has a dockerin domain on its C terminus, which would allow this enzyme to associate via protein-protein interactions into a cellulosome multienzyme complex that is also typically anchored to the cell surface via SLH domains (61). The fact that these homologs without SLH domains can remain associated with the cell implies that there is particular utility for having this type of CBM22/GH10/CBM9 xylanase associated with the cell.
Using genetic manipulation of C. bescii to knock in Calkro_0402, the role of this SLH protein in vivo for substrate attachment (Fig. 7, C and D) and degradation (Figs. 7 (A and B) and 8 (A and B)) could be examined. Wild-type C. bescii has previously been shown to attach to xylan substrates (63). Our observations suggest that modification of the C. bescii S-layer to contain SLH domain xylanase Calkro_0402 probably improves this attachment (Fig. 7, C and D). Truncation mutants of Xyn10B from Clostridium stercorarium and Xyn10A from Clostridium josui showed that the CBM9 and CBM22 domains mediate the attachment of these enzymes to xylan and other substrates (42,43). While these two xylanases are phylogenetically distant from Calkro_0402 (Fig. 4A), the role of these CBM22 and CBM9 domains is probably very similar in Calkro_0402 in the functional role of degrading xylan substrates. The attachment of the Calkro_0402 knock-in strain RKCB103 is also probably a significant factor in its improved solubilization of xylans (Fig. 7A). Both C. bescii and C. kronotskyensis produce at least five other extracellular xylanases targeted for the secretome with GH10, GH11, or GH43 domains. If the cell is attached to xylan by Calkro_0402, these enzymes are secreted from the cell proximate to their substrate, probably making them more effective in the degradation process as well.
Very few of the diverse catalytic CAZyme-containing SLH domain proteins have been characterized, but their role in biomass deconstruction clearly merits closer examination. There are more than 20,000 SLH domain-containing proteins predicted in over 1,800 microbial genomes. These unusual proteins on the surface of bacterial cells contain a variety of functional protein domains, localized at the interface between the cell and its environment. Although a full understanding of bacterial S-layer proteins and their varied and important roles at the cell surface is not yet available, the results reported here provide new insights into the role of SLH GHs as agents of plant polysaccharide degradation.