Divergent Modes of Glycan Recognition by a New Family of Carbohydrate-binding Modules

The genomes of myonecrotic Clostridium perfringens isolates contain genes encoding a large and fascinating array of highly modular glycoside hydrolase enzymes. Although the catalytic activities of many of these enzymes are somewhat predictable based on their amino acid sequences, the functions of their abundant ancillary modules are not and remain poorly studied. Here, we present the structural and functional analysis of a new family of ancillary carbohydrate-binding modules (CBMs), CBM51, which was previously annotated in databases as the novel putative CBM domain. The high-resolution crystal structures of two CBM51 members, GH95CBM51 and GH98CBM51, from a putative family 95 α-fucosidase and from a family 98 blood group A/B antigen-specific endo-β-galactosidase, respectively, showed them to have highly similar β-sandwich folds. However, GH95CBM51 was shown by glycan microarray screening, isothermal titration calorimetry, and x-ray crystallography to bind galactose residues, whereas the same analyses of GH98CBM51 revealed specificity for the blood group A/B antigens through non-conserved interactions. Overall, this work identifies a new family of CBMs with many members having apparent specificity for eukaryotic glycans, in keeping with the glycan-rich environment C. perfringens would experience in its host. However, a wider bioinformatic analysis of this CBM family also indicated a large number of members in non-pathogenic environmental bacteria, suggesting a role in the recognition of environmental glycans.

Carbohydrates have critical functions in numerous biological events, including, for example, the movement and interactions of cells and proteins in animals, the recycling of plant cell wall carbohydrates, and the interactions between hosts and disease-causing organisms. Central to the role of carbohydrates in biological processes are protein-carbohydrate interactions. Non-catalytic carbohydrate-binding proteins (e.g. lectins, antibodies, and transport proteins) and catalytic carbohydrate-active enzymes are finely tuned to recognize particular carbohydrate structural motifs. The information content of glycans is realized through the specificity of non-catalytic carbohydrate-binding proteins (like lectins and antibodies), whereas carbohydrate-active enzymes change the information content and often unlock the energy contained within these molecules.
Carbohydrate-binding modules (CBMs) are a comparatively new class of non-catalytic carbohydrate-recognizing polypeptides that are generally defined by their presence as ancillary modules in larger, multimodular carbohydrate-active enzymes such as glycoside hydrolases, glycosyltransferases, and polysaccharide lyases (1). In the context of these enzymes, the role of CBMs is to specifically bind the carbohydrate substrate and hold the enzyme in proximity to the substrate, allowing catalysis to proceed more efficiently (2). The number of CBM families, which are defined on the basis of amino acid sequence similarity, has grown to the current number of 50 (www.cazy.org). Although the majority of these families are known to have members that recognize primarily plant cell wall polysaccharides, a growing number of families appear to have specificity for animal glycans and appear within the modular structures of carbohydrate-active bacterial virulence factors (3-6). In these cases, the CBMs appear to have a role in recognizing the information content of cellular glycans, which then allows the enzymatic activity of the virulence factor to be appropriately directed. One of the current challenges in this area, which is complicated by the great diversity of eukaryotic glycans, is to determine the specificity, strength, and molecular determinants of these CBM-glycan interactions. In turn, this provides key information about the cellular targets of the entire carbohydrate-active virulence factor, thus providing greater insight into the host-pathogen interaction.
Clostridium perfringens is a ubiquitous Gram-positive bacterium that is capable of causing an array of diseases such as gastroenteritis and gas gangrene in humans and animals. The genus Clostridium (and the species perfringens in particular) is notable for its prolific production of toxins that contribute to its virulence (7,8). Among its minor toxins are a battery of carbohydrate-active enzymes of the class glycoside hydrolase, which are thought to aid in major toxin delivery, tissue destruction, and nutrient harvesting while in the host. The two myonecrotic strains of C. perfringens for which the genome sequences are available are remarkable for their content of highly modular glycoside hydrolases (9,10). Each strain contains in excess of 50 open reading frames encoding putative glycoside hydrolases, many of which are extracellular, are rarely smaller than 1000 amino acids in length, and frequently contain, in addition to the catalytic module, three or more definable ancillary modules (3). These ancillary modules include CBMs (3,4), putative protein-protein interaction domains (11), and modules of unknown function. One such unknown module, which occurs quite frequently in the C. perfringens glycoside hydrolases as well as in proteins from a variety of other organisms, is the novel putative CBM (NPCBM) domain. NPCBM domains show no amino acid sequence identity to known carbohydrate-binding proteins but were hypothesized by Rigden (12) to be CBMs based on their context (i.e. in carbohydrate-active enzymes) and their proposed β-sandwich fold, which is common to CBMs. To test the hypothesized carbohydrate-binding function of these modules, we performed structural and functional studies on the NPCBMs from two different multimodular clostridial enzymes, one a hypothesized family 95 α-L-fucosidase (CPF_2129 from C. perfringens strain ATCC 13124) and the other a confirmed family 98 blood group A/B antigen-specific endo-β-D-galactosidase (EabC from C. perfringens strain ATCC 10543) (Fig. 1) (13). The two targeted modules share only ~30% amino acid sequence identity. The results confirm the identification of NPCBMs as a new CBM family, now classified as CBM51, and define a new module that can mediate the carbohydrate-based interaction of the enzymes with host tissues. Furthermore, an unexpected degree of diversity in specificity and binding site architecture is evident in this new CBM family, providing some distinction from other CBM families.

EXPERIMENTAL PROCEDURES
Cloning, Gene Expression, and Protein Production and Purification-The gene fragments encoding GH95CBM51 and GH98CBM51 were PCR-amplified from the genomic DNA of C. perfringens strains ATCC 13124 and ATCC 10543, respectively, using the following sets of primers: GH95CBMF (5′-CAT ATG GCT AGC GAA AAG GTT GCA GTT G-3′) and GH95CBMR (5′-GAA TTC CTC GAG TTA TGT TAA CTT AGC G-3′) for GH95CBM51 and GH98CBMF (5′-CAT ATG GCT AGC GAA GTT TAT GCT TTG GAA GAA AGC G-3′) and GH98CBMR (5′-GAA TTC CTC GAG TTA ATT CAC AAA ATC ACC CTT AGC TGT C-3′) for GH98CBM51. The amplified products were digested with NheI and XhoI restriction endonucleases and ligated into similarly digested pET-28a using standard cloning procedures. The resulting plasmids, pGH95CBM and pGH98CBM, encode the desired CBM fused to an N-terminal His6 tag by a thrombin protease cleavage site.
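The primer design described above can be sanity-checked with a short script. The sketch below is illustrative only and is not part of the original methods: it takes the four primer sequences exactly as printed and reports which of the two cloning sites each one carries. The recognition sequences used (GCTAGC for NheI and CTCGAG for XhoI) are the standard ones; the function name `sites_in` is hypothetical.

```python
# Primer sequences copied from the text; the spaces only mark triplets.
PRIMERS = {
    "GH95CBMF": "CAT ATG GCT AGC GAA AAG GTT GCA GTT G",
    "GH95CBMR": "GAA TTC CTC GAG TTA TGT TAA CTT AGC G",
    "GH98CBMF": "CAT ATG GCT AGC GAA GTT TAT GCT TTG GAA GAA AGC G",
    "GH98CBMR": "GAA TTC CTC GAG TTA ATT CAC AAA ATC ACC CTT AGC TGT C",
}

# Recognition sequences. Both are palindromic, so the strand on which
# the primer is written does not affect the search.
SITES = {"NheI": "GCTAGC", "XhoI": "CTCGAG"}


def sites_in(primer: str) -> list[str]:
    """Return the names of the restriction sites found in a primer."""
    seq = primer.replace(" ", "").upper()
    return [name for name, site in SITES.items() if site in seq]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, sites_in(primer))
```

Under these assumptions, each forward primer carries the NheI site and each reverse primer carries the XhoI site, which matches the NheI/XhoI digestion used for ligation into pET-28a.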
Polypeptides were produced in 4-liter cultures of Escherichia coli strain BL21(DE3) containing pGH95CBM and pGH98CBM using the methods described previously (3). GH95CBM51 and GH98CBM51 were purified by immobilized metal affinity chromatography from cell-free extracts following previously described procedures (3). Purified polypeptides were concentrated and exchanged into 20 mM Tris-HCl (pH 8.0) in a stirred ultrafiltration unit (Amicon, Beverly, MA) using a 5-kDa cutoff membrane (Filtron Corp., Northborough, MA). Purity as assessed by SDS-PAGE was >95%.
Selenomethionine-labeled GH95CBM51 was produced using the E. coli B834(DE3) methionine auxotroph. E. coli colonies taken from an LB-agar plate were used to inoculate 3 liters of SelenoMet Medium Base (Molecular Dimensions Ltd.) supplemented with SelenoMet Nutrient Mix (Molecular Dimensions Ltd.) and L-selenomethionine (40 mg/liter). These cultures were grown, induced, and harvested, and the polypeptide was purified exactly as described for the unlabeled protein.
Determination of Protein Concentration-The concentration of purified protein was determined by UV absorbance (280 nm) using calculated molar extinction coefficients of 17,420 and 24,410 M⁻¹ cm⁻¹ for GH95CBM51 and GH98CBM51, respectively.
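As an illustration of this determination, the Beer-Lambert law (A = ε·c·l) gives the molar concentration directly from the absorbance at 280 nm. The following is a minimal sketch, not the authors' procedure: the extinction coefficients are those quoted above, whereas the function name, the 1-cm path length default, and the example absorbance are assumptions.

```python
# Calculated molar extinction coefficients at 280 nm, as quoted in the text
# (units: M^-1 cm^-1).
EXTINCTION = {"GH95CBM51": 17420.0, "GH98CBM51": 24410.0}


def molar_concentration(a280: float, protein: str, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/liter."""
    if a280 < 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0 and path length > 0")
    return a280 / (EXTINCTION[protein] * path_cm)


if __name__ == "__main__":
    # Hypothetical reading of A280 = 0.5 in a 1-cm cuvette.
    conc = molar_concentration(0.5, "GH98CBM51")
    print(f"{conc * 1e6:.1f} micromolar")
```

In practice, a concentrated sample would be diluted into the linear range of the spectrophotometer and the result multiplied by the dilution factor.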
Binding Studies-Proteins for glycan array screening were labeled with fluorescein isothiocyanate (Invitrogen) according to the manufacturer's directions. Labeled protein was desalted and separated from free fluorescein isothiocyanate by gel filtration chromatography using Sephadex G-25 (GE Healthcare). Fluorescein isothiocyanate-labeled GH95CBM51 and GH98CBM51 were used to probe the printed glycan arrays following the standard procedure of Core H of the Consortium for Functional Glycomics (www.functionalglycomics.org/).
Isothermal titration calorimetry was performed as described previously using a VP-ITC system (MicroCal, Northampton, MA) (3). Protein samples were dialyzed extensively against 50 mM Tris-HCl (pH 7.5) and 1 mM CaCl2 and then concentrated in a stirred ultrafiltration cell as described above. Sugar solutions were prepared by mass in buffer saved from the ultrafiltration step. Both protein and sugar solutions were filtered and degassed immediately prior to use. Protein concentrations were determined by UV absorbance as described above. Although the concentrations of acceptor (i.e. GH95CBM51 or GH98CBM51) were quite high (between 200 and 500 µM), the low affinities of the interactions resulted in C values <2 (14). Thus, based on the 1:1 binding observed in the crystal structures, the data were fit with a single binding site model using stoichiometries fixed at 1. The association constants (Ka) were determined by this fitting process.
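To make the C value argument concrete, the sketch below uses the commonly cited definition c = n·Ka·[M], where [M] is the acceptor concentration in the calorimeter cell. It is a hedged illustration rather than the authors' calculation: the function names are hypothetical, the acceptor concentrations are assumed to lie in the micromolar range, and the Ka used in the example is a placeholder, not a measured value from this work.

```python
def c_value(ka_per_molar: float, acceptor_molar: float, n_sites: float = 1.0) -> float:
    """Dimensionless ITC c value: c = n * Ka * [M]."""
    return n_sites * ka_per_molar * acceptor_molar


def max_ka_for_c(c_limit: float, acceptor_molar: float, n_sites: float = 1.0) -> float:
    """Largest Ka (M^-1) that keeps the c value at or below c_limit."""
    return c_limit / (n_sites * acceptor_molar)


if __name__ == "__main__":
    # Placeholder Ka of 2e3 M^-1 at an assumed 500 micromolar acceptor.
    print(c_value(2e3, 500e-6))
    # Upper bounds on Ka implied by c < 2 at 200 and 500 micromolar.
    print(max_ka_for_c(2.0, 200e-6), max_ka_for_c(2.0, 500e-6))
```

At such low c values the titration curve is nearly featureless, so stoichiometry and Ka cannot be fit independently; this is why fixing n at 1 from the crystallographic evidence is a reasonable choice.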
Crystallization, Data Collection, and Structure Solution-GH95CBM51 and GH98CBM51, previously exchanged into 20 mM Tris (pH 8.0), were treated overnight at room temperature with thrombin to remove the His6 tag. The polypeptides were separated from the cleaved His6 tag by size exclusion chromatography using a Sephacryl S-200 column (GE Healthcare). Pure fractions were concentrated in a 10-ml stirred ultrafiltration device using a 5-kDa cutoff membrane. All crystallizations were performed by the hanging drop vapor diffusion method at 18°C. Diffraction data for all crystals except the selenomethionine-substituted GH95CBM51 crystals were collected on our home source: a Rigaku R-AXIS 4++ area detector coupled to an MM-002 x-ray generator with Osmic Blue optics and an Oxford 700 series Cryostream. All data were processed using Crystal Clear/d*trek (15). Data collection and processing statistics are given in Table 1.
Crystals of native and selenomethionine-labeled GH95CBM51 (both at 15 mg/ml) were grown in 0.05 M calcium chloride, 0.1 M sodium acetate (pH 4.6), and 20% polyethylene glycol 3350. An optimized selenium single anomalous dispersion diffraction data set for selenomethionine-labeled GH95CBM51 was collected on beamline X6A at the National Synchrotron Light Source (Brookhaven National Laboratories). SHELXC/D was used to determine the substructure of three selenium atoms, followed by refinement and phasing with SHARP (16). Solvent flattening with DM resulted in easily interpretable electron density maps (17). Automatic model building with ARP/wARP yielded a virtually complete model that was finished by manual model building using COOT and refinement with REFMAC (18-20). This selenium-substituted model was used as a starting point for the higher resolution native structure. GH95CBM51 at 45 mg/ml was co-crystallized with 50 mM methyl-β-D-galactose in 0.2 M magnesium acetate, 0.1 M sodium acetate (pH 4.8), and 16% polyethylene glycol 3350. This structure was solved by molecular replacement using MOLREP (21) and the native GH95CBM51 model as a template.
Crystals of GH98CBM51 at 15 mg/ml were grown in 0.2 M magnesium acetate, 0.1 M HEPES (pH 7.5), and 20% polyethylene glycol 3350. This structure was solved by molecular replacement using the coordinates of GH95CBM51 as a search model and MOLREP to find the one molecule in the asymmetric unit. After manual correction of this model, ARP/wARP was able to build a model that required minimum alteration using COOT and refinement with REFMAC. A crystal form of GH98CBM51 obtained in 0.2 M sodium carbonate, 0.1 M HEPES (pH 7.5), and 20% polyethylene glycol 3350 was soaked with the blood group A and B antigen trisaccharides (Dextra Laboratories) at 5 mM in mother liquor to obtain complexes with these sugars. This process resulted in the alteration of the unit cell dimensions and space group from a = 36.42, b = 49.54, and c = 88.45 Å (P2₁2₁2₁; structure not reported, as it did not yield any new information) to a = 69.68, b = 98.75, and c = 49.24 Å (P2₁2₁2) but nevertheless yielded excellent complexes. The new asymmetric unit of the soaked crystals contained two molecules of GH98CBM51, each bound to carbohydrate, which were located with the molecular replacement program MOLREP. Building and refinement were carried out as described above.
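A rough consistency check on the change in crystal form can be made from the quoted cell dimensions alone. The sketch below is an illustration and not part of the authors' analysis: for an orthorhombic cell the volume is simply a·b·c, and both space groups involved (P2₁2₁2₁ and P2₁2₁2) have four general positions, so the asymmetric unit volume is one quarter of the cell volume. The function and constant names are hypothetical.

```python
def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Unit cell volume in cubic angstroms (all cell angles are 90 degrees)."""
    return a * b * c


# Both P2(1)2(1)2(1) and P2(1)2(1)2 have four general positions per cell.
GENERAL_POSITIONS = 4

# Cell dimensions in angstroms, as quoted in the text.
v_unsoaked = orthorhombic_volume(36.42, 49.54, 88.45)
v_soaked = orthorhombic_volume(69.68, 98.75, 49.24)

asu_unsoaked = v_unsoaked / GENERAL_POSITIONS
asu_soaked = v_soaked / GENERAL_POSITIONS
ratio = asu_soaked / asu_unsoaked

if __name__ == "__main__":
    print(f"unsoaked ASU: {asu_unsoaked:.0f} A^3")
    print(f"soaked ASU:   {asu_soaked:.0f} A^3")
    print(f"ratio:        {ratio:.2f}")
```

The asymmetric unit roughly doubles in volume on soaking. That is in line with the two molecules reported for the soaked crystals if the unsoaked form held a single molecule, which is an assumption here because the text does not state the content of the unsoaked asymmetric unit.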
Water molecules were added to all models using the REFMAC implementation of ARP/wARP (18) and inspected visually prior to deposition. In all data sets, 5% of the observations were flagged as "free" (22) and used to monitor refinement procedures. All final model statistics are given in Table 1. Structure images were prepared with PyMOL.
Bioinformatic Analysis-Putative CBM51 domains were detected by position-specific iterative BLAST searches (23) using both GH95CBM51 and GH98CBM51 as queries. Polypeptides containing similar amino acid sequences were dissected for modularity by InterProScan (24) and classified when possible based on the predicted enzymatic specificity for an appended catalytic module. In some cases, boundary positions were fine-tuned by direct sequence alignments with ClustalW (25). For enzymes that contained tandem copies of CBM51, each module was treated independently. Sequence entries were cut off at >20% identity over a mini-

RESULTS AND DISCUSSION
A New Family of Carbohydrate-binding Modules-To facilitate the characterization of NPCBMs in isolation, the gene fragments encoding the modules were cloned and overexpressed in E. coli, and the polypeptides were purified by immobilized metal affinity chromatography. The purified polypeptides, here called GH95CBM51 (from C. perfringens strain ATCC 13124 GH95) and GH98CBM51 (from C. perfringens strain ATCC 10543 GH98), were screened for glycan binding by Core H of the Consortium for Functional Glycomics using glycan microarrays. Both proteins displayed the ability to recognize glycans (Fig. 2). GH98CBM51 was clearly the more specific module, showing binding only to glycans bearing the blood group A or B antigen trisaccharide determinants (Fig. 2B). This specificity is in keeping with the blood group antigen specificity of GH98 (EabC) (13). In contrast, GH95CBM51 appeared to interact with a large number of the glycans presented in this array. The common determinant of the best hits was terminal galacto-configured sugars, either D-galactose or D-N-acetylgalactosamine (GalNAc) (Fig. 2A). The exceptions contained galactose following terminal α-2,6-linked sialic acid or α-1,2-linked fucose. GH95 is a predicted α-fucosidase; thus, the binding specificity of its CBM, GH95CBM51, is somewhat at odds with the predicted catalytic specificity. The biological significance of this is currently unknown.

Glycan symbol key (from figure legend): Yellow symbols indicate galacto-configured monosaccharides; blue symbols indicate gluco-configured monosaccharides; and green symbols indicate manno-configured monosaccharides. Circles represent hexose monosaccharides, and squares indicate their 2-acetamido derivatives. Purple diamonds indicate sialic acid, and red triangles represent L-fucose.

MAY 2, 2008 • VOLUME 283 • NUMBER 18 • JOURNAL OF BIOLOGICAL CHEMISTRY 12607
CBM51 Displays a Common Carbohydrate-binding Fold-We used x-ray crystallography to address the molecular determinants of specificity in these family 51 CBMs (NPCBMs). The x-ray crystal structures of GH95CBM51 and GH98CBM51 were solved by optimized selenium single anomalous dispersion and molecular replacement, respectively. Both modules adopt a β-sandwich fold comprising a five-stranded antiparallel β-sheet opposing a four-stranded antiparallel β-sheet (Fig. 3, A and B). The β-sandwich fold is common to numerous families of carbohydrate-binding proteins, including a number of CBM and lectin families (1). The atomic models of GH95CBM51 and GH98CBM51 are clearly similar and overlap with a root mean square deviation of 1.2 Å over 140 matched residues (measured by the secondary structure matching algorithm (27) as implemented in COOT). The primary difference in the structures of these two polypeptides is a small insertion in GH98CBM51, which results in the addition of a small α-helix near the binding site (Fig. 3B). Relative to GH95CBM51, this addition in GH98CBM51 appears to play a role in recontouring the binding site to accommodate the larger blood group A/B carbohydrates.
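For readers unfamiliar with the metric, the root mean square deviation quoted above is the square root of the mean squared distance between matched atom pairs after superposition. The sketch below computes only that final quantity and is not the secondary structure matching algorithm used in COOT: it assumes the two coordinate lists are already superposed and matched one-to-one (e.g. Cα atoms of the matched residues), and the function name is hypothetical.

```python
import math

Coord = tuple[float, float, float]


def rmsd(coords_a: list[Coord], coords_b: list[Coord]) -> float:
    """Root mean square deviation (same units as the input coordinates)
    between two pre-superposed, one-to-one matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: two atoms, each displaced by 1 angstrom along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(a, b))
```

Because the value depends on which residues are matched and how the superposition is optimized, figures from different programs are comparable only approximately.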
Both polypeptides bind a metal atom (Fig. 3, A and B) that was judged to be Ca²⁺ based on the strictly oxygen-mediated coordination, coordination bond lengths that range from 2.25 to 2.40 Å, a B-factor when modeled as Ca²⁺ that is consistent with the neighboring atoms, and a small peak of anomalous signal that overlaps with the position of this atom (data not shown). The position of this atom is conserved between GH95CBM51 and GH98CBM51 (within 0.6 Å in a structural overlap) but is not conserved in other CBM families. In the case of these family 51 CBMs, the metal-binding site is at the interface of the two β-sheets (the edge of the sandwich) and in the loops that join the termini of the β-strands, near the carbohydrate-binding site (Fig. 3, A and B). In contrast, the majority of other CBM families that bind metal atoms, which notably are all also β-sandwich proteins, do so at a site that is also at the edge of the sandwich but on a side such that the binding site is on the edge of the strands rather than at the ends (Fig. 3C).
GH95CBM51 Has a Simple Galactose-binding Site-A ligand-bound form of GH95CBM51 was obtained by co-crystallization of the protein with excess methyl-β-D-galactose. The electron density for this sugar was very clear, allowing unambiguous modeling of the ligand (Fig. 4A).
The binding site of GH95CBM51 is quite shallow (Fig. 4A) and provides surprisingly few direct interactions with the sugar (Fig. 4B). The ε-nitrogens of His955 and His1041 make hydrogen bonds with the sugar O-4 and O-3, respectively (Fig. 4B). The galactose O-4 also makes a hydrogen bond with the backbone nitrogen of Ser1039 (Fig. 4B). The phenol group of Tyr922 makes a classical protein-carbohydrate hydrophobic interaction with the apolar plane formed by C-3, C-4, C-5, and C-6 on the B-face of the galactose. This mode of interaction, which requires an equatorial O-2, an equatorial O-3, and an axial O-4, is consistent with the observed specificity of GH95CBM51 for terminal galacto-configured sugars. However, the galactose O-2 and O-6 are solvent-exposed, suggesting that modifications may be tolerated at these positions. Indeed, the array screening results indicated that this polypeptide can interact with terminal GalNAc (i.e. can tolerate an acetamido modification at C-2) and suggested an ability to accommodate a fucosyl residue α-1,2-linked to galactose. Likewise, the exposure of the O-6 group may also explain the apparent binding of GH95CBM51 to a glycan terminating in sialic acid α-2,6-linked to galactose.
The galactose-binding site of GH95CBM51 shows a reasonable degree of architectural similarity to the binding sites of the galactose-specific family 32 CBMs, the best characterized of which is CpCBM32, the CBM32 from C. perfringens GH84C (NagJ) (3). Although GH95CBM51 and CpCBM32 show no significant amino acid sequence similarity, they share the same fold (root mean square deviation of ~3.3 Å) and binding site location (Fig. 4C). Many of the protein-carbohydrate interactions are also conserved. Tyr922 of GH95CBM51 is structurally conserved with Trp661 of CpCBM32, which plays a role in binding similar to Tyr922. His955 is conserved with His658, which, like His955, hydrogen bonds with O-4 of the galactose. In CpCBM32, the N of Arg690 provides the same hydrogen bond to O-3 of the galactose as the Nε of His1041 in GH95CBM51. Thus, the majority of the GH95CBM51-galactose interactions are structurally well conserved with the family 32 CBMs, perhaps reflecting a simple but prototypical mode of galactose recognition by CBMs.
Structural Basis of Blood Group Antigen Recognition-Ligand-bound forms of GH98CBM51 with blood group A and B carbohydrates were obtained by crystal soaking experiments. In both cases, the electron density for the carbohydrate ligands was quite clear, facilitating accurate modeling of these sugars (Fig. 4, D and E). Only in the case of the 2-acetamido group of the GalNAc in the blood group A antigen trisaccharide was there any ambiguity, as there was no clear electron density for C-8 of this residue. This likely reflected a certain degree of disorder in this portion of the molecule, as this observation was made for the A antigen trisaccharide ligands in the binding sites of both molecules of GH98CBM51 in the asymmetric unit.
The ligand specificity of GH98CBM51 is substantially more restricted in comparison with GH95CBM51, which is revealed by the architecture of its binding site. It is comparatively deep (Fig. 4, D and E), and specific interactions are made between the protein and all three sugars of the blood group carbohydrates (Fig. 4F). The constellation of interactions between GH98CBM51 and these sugars is identical for both blood groups A and B. The plane formed by C-3, C-4, C-5, and C No direct interactions are evident involving the 2-acetamido group of the blood group A determinant, GalNAc, or involving O-2 of the blood group B determinant, galactose, explaining the general ability of this polypeptide to bind both sugars. However, GH98CBM51 did display a 2-3-fold preference for the blood group A antigen, for which there is no obvious structural explanation (i.e. no differences in direct interactions are evident). Regardless, it appears unlikely that this small preference is biologically relevant.
GH98CBM51 is the only CBM known to be specific for the blood group A/B antigens. Although we did recently determine the structure of a family 47 CBM from Streptococcus pneumoniae in complex with the gluco analog of the blood group A antigen tetrasaccharide, this sugar was not the preferred ligand for this protein (5). In general, structural information regarding the interaction of other carbohydrate-binding proteins with their natural blood group A/B antigens is quite scarce. Only the structures of two lectins and a viral capsid protein have been determined in complex with A and/or B antigen-reactive tri- or tetrasaccharides (28-30). Interestingly, these structures reveal that there is little conservation in the modes of their carbohydrate recognition. The norovirus capsid protein maintains specific interactions between the protein and the terminal Gal/GalNAc that defines the antigen and the fucose residue (29). It makes no interactions with the central galactose residue and does not utilize the classical aromatic amino acid side chain-sugar ring hydrophobic interaction. Winged bean basic agglutinin I maintains the majority of its interactions, including an aromatic side chain hydrophobic interaction, with the terminal Gal/GalNAc of the antigen but also has some polar interactions with the central galactose residue (30). The fucose residue apparently does not participate in the interaction. Similarly, the fungal galectin CGL2 makes specific interactions with the blood group-determining Gal/GalNAc and the central galactose residue, with which a tryptophan side chain makes an apolar interaction (28). The fucose residue is not involved in blood group recognition by CGL2. GH98CBM51 is unique, as it makes specific interactions with all of the residues in the blood group A/B antigen trisaccharide (the blood group-determining Gal/GalNAc, galactose, and fucose) (Fig. 4, D-F). The glycan microarray screening did not reveal binding to any fragments of this ligand, indicating that this triumvirate of residues is required for binding. Furthermore, a model of GH98CBM51 with the blood group A antigen tetrasaccharide suggests that the GlcNAc residue of this sugar may even make additional interactions with this polypeptide (data not shown). Although the affinity of GH98CBM51 for the blood group A/B antigens is low, it appears that GH98CBM51 may be one of the most selective carbohydrate-binding proteins for the complete fucosylated forms of these sugars.
Diversity in the CBM51 Family-A structural overlap of GH95CBM51 and GH98CBM51 reveals how the binding sites of these two related CBMs are adapted to recognize their different respective ligands. The GalNAc of the blood group A antigen in the GH98CBM51 binding site approximately overlaps with the galactose in the GH95CBM51 binding site (Fig. 5A). Hydrogen bonding to O-3 and O-4 of the sugar is maintained by residues that are roughly structurally conserved, although the nature of the side chains differs substantially: two histidines in GH95CBM51 versus one aspartic acid and one threonine in GH98CBM51. The tyrosine in GH95CBM51 is structurally replaced by an aspartic acid in GH98CBM51, where instead of making hydrophobic interactions with the GalNAc, it is involved in polar interactions with the fucosyl moiety of the blood group antigen. The remaining four binding residues in GH98CBM51 are unique to this module and appear distinctively positioned to accommodate the blood group antigens.
The observed lack of conserved functional residues between these CBMs prompted us to undertake a more comprehensive bioinformatic analysis of the CBM51 family. The CBM51 family currently comprises ~60 entries from ~46 different proteins from ~26 different organisms (different strains of a given species are included in this count). These appear to cluster into six different subfamilies (a-f) (Fig. 5B,

FIGURE 5. Comparison of GH95CBM51, GH98CBM51, and other CBM51 members. A, overlay of the GH95CBM51 (yellow) and GH98CBM51 (magenta) carbohydrate-binding sites. Relevant side chains involved in binding and ligands are shown in stick representation, and the metal atoms are shown as spheres. B, phylogenetic analysis of CBM51. The inset shows the complete analysis and indicates the subfamilies. Subfamilies CBM51a and CBM51b (circled in the inset) are expanded with detailed entries. The green star denotes the GH98CBM51 entry, and the blue star denotes the GH95CBM51 entry. C, alignment of subfamilies CBM51a (indicated by the green vertical line) and CBM51b (indicated by the blue vertical line). The entry numbering corresponds to that in B. GH98CBM51 and GH95CBM51 are indicated by stars as in B. The secondary structures for GH98CBM51 and GH95CBM51 are shown above and below the alignment, respectively. Yellow arrows denote β-strands, and the red cylinder represents an α-helix. Residues involved in ligand binding by GH98CBM51 and GH95CBM51 are indicated above and below the alignment, respectively, by arrowheads.