J. Biol. Chem., Vol. 283, Issue 18, 12415-12425, May 2, 2008
Molecular Basis for the Selectivity and Specificity of Ligand Recognition by the Family 16 Carbohydrate-binding Modules from Thermoanaerobacterium polysaccharolyticum ManA
Received for publication, August 6, 2007, and in revised form, November 13, 2007.
Enzymes that hydrolyze complex polysaccharides into simple sugars are modular in architecture and consist of single or multiple catalytic domains fused to targeting modules called carbohydrate-binding modules (CBMs). CBMs bind to their ligands with high affinity and increase the efficiency of the catalytic components by targeting the enzymes to their substrates. Here we utilized a multidisciplinary approach to characterize each of the two family 16 carbohydrate-binding domain components of the highly active mannanase from the thermophile Thermoanaerobacterium polysaccharolyticum. These represent the first crystal structures of family 16 CBMs. Calorimetric analysis showed that although these CBMs demonstrate high specificity toward β-1,4-linked sugars, they can engage both cello- and mannopolysaccharides. To elucidate the molecular basis for this specificity and selectivity, we have determined high resolution crystal structures of each of the two CBMs, as well as of binary complexes of CBM16-1 bound to either mannopentaose or cellopentaose. These results provide detailed molecular insights into ligand recognition and yield a framework for rational engineering experiments designed to expand the natural repertoire of these targeting modules.
The turnover of photosynthetically fixed carbon through the action of microbial glycoside hydrolases has been estimated to be of the order of 10¹¹ tons annually (1). Consequently, glycoside hydrolases play a key role in the global carbon cycle, and their properties, if well harnessed, will enhance the release of substrates (monomeric sugars) critical to the use of cellulosic materials in the biofuel industry. For most glycoside hydrolases, such as bacterial cellulases and xylanases, the polypeptides are organized in a modular arrangement that usually consists of a catalytic domain and an associated carbohydrate-binding module (CBM). These modules may be joined through linkers that are rich in proline, serine, and threonine (2). The evolutionary rationale that led to these complex molecular architectures is currently unclear (3).
It has been suggested that CBMs attain multivalency through multiplicity (4). Therefore, it could be hypothesized that multiple CBMs act synergistically to bind to their target ligand, leading to an increased accessibility of the catalytic domain to the target polysaccharide. This view is supported by studies on a recombinant double CBM, constructed by fusing family 1 CBMs of two Trichoderma reesei cellobiohydrolases via a linker peptide (5). A 5–10-fold increase in the affinity for crystalline cellulose was observed for the double CBM compared with the individual modules, and similar results were observed for two CBMs of Cellulomonas fimi xylanase 11A (3). Thus, in some arrangements, the affinity for substrates increases with multiplicity of CBMs, and this appears to occur frequently in hyperthermophiles and thermophiles (6). Contrary to these findings, a report on the product of a manA gene of Caldicellulosiruptor strain RT8B.4 with two N-terminal CBMs showed no relationship between binding affinity and multiplicity of CBMs (7). Similar conclusions were also obtained from studies based on a Thermotoga neapolitana xylanase flanked by two different CBMs (7). An alternative hypothesis that may explain the multiplicity of CBMs in glycoside hydrolases is that it increases the diversity of their target polysaccharides or substrates, because polysaccharides found in plant cell walls are heterogeneous in nature. The Paenibacillus sp. strain W-61 xylanase 5 seems to support this hypothesis. This polypeptide contains a C-terminal family 22 CBM and an N-terminal family 9 CBM (8). The family 22 CBM binds to xylan, whereas the family 9 CBM binds to cellulose (4).
Carbohydrate-binding modules have been identified in over 100 β-1,4-glucanases (9, 10), and aside from increasing the relative concentration of their respective enzymes on the surface of the substrate, they may also modify substrates into amorphous forms and thus make available to the catalytic domain a more easily degradable substrate (11–13). This is perhaps the case for the two CBMs found in the highly active mannanase (ManA) of Thermoanaerobacterium polysaccharolyticum, because their removal compromises enzyme activity (14). The widespread association of CBMs with glycoside hydrolases suggests that enzyme adsorption plays a critical role in the degradation of polysaccharides. Insoluble plant cell wall polysaccharides such as cellulose hold considerable promise as future substrates for energy production. Understanding the mechanism of CBM-mediated adsorption to β-glucan may therefore be critical to our capacity to fully utilize energy crops as substrates for industrial processes, especially for the biofuels industry. In an effort to understand the influence of the T. polysaccharolyticum carbohydrate-binding modules on the enzymatic activity of the parental protein, we have used bioinformatics to delineate its two linker-less CBMs and carried out biochemical and structural studies on the resultant polypeptides. This represents the first comprehensive structure-thermodynamic analysis of family 16 carbohydrate-binding modules. Here we show that the first and second carbohydrate-binding modules from T. polysaccharolyticum (TpolCBM16-1 and TpolCBM16-2, respectively) display high affinity for β1,4-glucose and β1,4-mannose polymers and provide a quantitative assessment of ligand binding using isothermal titration calorimetry.
We also present four different high resolution x-ray crystallographic structures as follows: 1) TpolCBM16-1, determined to a resolution of 1.5 Å; 2) TpolCBM16-2, determined to 1.9 Å resolution; 3) the co-crystal structure of TpolCBM16-1 bound to cellopentaose, determined to 1.2 Å resolution; and 4) the co-crystal structure of TpolCBM16-1 bound to mannopentaose, determined to 2.2 Å resolution. These structures demonstrate that each module contains a single cleft, with two aromatic residues located at the base and along one face and two stretches of polar residues on either side of this cleft, which constitutes the ligand-binding site. The molecular basis for both plasticity in ligand selectivity and specificity by these modules is discussed.
Protein Expression and Purification—The genes encoding the first and second carbohydrate-binding module from T. polysaccharolyticum were cloned by PCR. To amplify the gene segment encoding TpolCBM16-1, a forward primer with the sequence 5'-CATATGgtaaacatggtgagcaaccggg, with an engineered NdeI restriction site (shown in capitals) preceding the start codon, and a reverse primer 5'-CTCGAGctaaacttccaccagatatacatcatcca, with an engineered XhoI site (shown in capitals) after a termination codon, were used with the manA gene as the template. The ManA TpolCBM16-2 was also amplified by PCR with forward primer (5'-CATATGtccaatttaatagtgaacggaacagc) and reverse primer (5'-CTCGAGctatacttcaacgagcgtgatattatcg). Note that these two primers also contained engineered NdeI and XhoI sites in the forward and reverse primers, respectively. Each TpolCBM16 encoding sequence was cloned into a TA cloning vector, pGEM-T (Promega), and sequenced to confirm the correctness of the coding sequence. Each gene was then removed and inserted into a modified pET-28 plasmid (Novagen) bearing an N-terminal polyhistidine tag and a thrombin digestion site preceding the inserted gene. The modification in the pET28 vector was the replacement of the gene encoding kanamycin resistance with that encoding ampicillin resistance. The integrity of all constructs was confirmed by sequencing. Escherichia coli expression strain BL21(DE3) was transformed with the appropriate expression vectors, and single colonies of transformed E. coli were used to inoculate 5 ml of LB medium supplemented with ampicillin (100 µg/ml). Five hours following inoculation, the small scale culture was added to 1 liter of LB medium containing ampicillin (100 µg/ml) for growth at 37 °C. When the A600 nm of the culture reached 0.3, protein expression was induced with 0.1 mM isopropyl β-D-thiogalactopyranoside, and the cells were further grown for 6 h. 
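The primer design described above can be checked with a short script. The following is a minimal sketch in Python, not part of the original protocol: it confirms that each forward primer begins with the NdeI site and that, on the sense strand, each reverse primer places a stop codon immediately before the XhoI site. The helper names (`reverse_complement`, `coding_end`) are illustrative, and the primer sequences are copied from the text as printed.

```python
# Standard genetic code, indexed by codon positions in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NDEI = "CATATG"  # NdeI recognition site (contains the ATG start codon)
XHOI = "CTCGAG"  # XhoI recognition site (palindromic)

# Primer sequences as printed in the text, written 5' to 3'.
PRIMERS = {
    "TpolCBM16-1": ("CATATGgtaaacatggtgagcaaccggg",
                    "CTCGAGctaaacttccaccagatatacatcatcca"),
    "TpolCBM16-2": ("CATATGtccaatttaatagtgaacggaacagc",
                    "CTCGAGctatacttcaacgagcgtgatattatcg"),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence in upper case."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def coding_end(reverse_primer, n_codons=4):
    """Return (peptide, stop_codon) encoded just upstream of the XhoI site.

    The reverse primer anneals to the sense strand, so its reverse
    complement gives the 3' end of the coding sequence followed by XhoI.
    """
    sense = reverse_complement(reverse_primer)
    if not sense.endswith(XHOI):
        raise ValueError("reverse primer does not start with an XhoI site")
    coding = sense[:-len(XHOI)]
    stop = coding[-3:]
    tail = coding[-3 * (n_codons + 1):-3]
    peptide = "".join(CODON_TABLE[tail[i:i + 3]] for i in range(0, len(tail), 3))
    return peptide, stop


if __name__ == "__main__":
    for name, (forward, reverse) in PRIMERS.items():
        has_ndei = forward.upper().startswith(NDEI)
        peptide, stop = coding_end(reverse)
        print(name, "NdeI at 5' end:", has_ndei,
              "| C-terminal residues:", peptide, "| stop codon:", stop)
```

For both reverse primers the last four codons before the stop translate to LVEV, which agrees with the conserved C-terminal sequence reported for each module in the domain delineation.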
Bacterial cells were pelleted by centrifugation (4000 × g for 1 h) and resuspended in 100 mM KCl, 20 mM Tris-HCl, pH 8.3, 10% glycerol, and a mixture of protease inhibitors. Resuspended cells were disrupted by multiple passes through an Avestin C5 Emulsiflex French press cell, and insoluble aggregates and cellular debris were removed by centrifugation (15,000 × g for 1 h). Recombinant proteins were purified from the clarified supernatant by virtue of the N-terminal polyhistidine tag using a Talon resin column (Clontech) charged with cobalt chloride. Following elution from the cobalt affinity resin, the cleavable polyhistidine tag was removed using thrombin (1 unit/mg protein; GE Healthcare). The protein was further purified by anion exchange (5 ml HiTrap Q; GE Healthcare) and size exclusion chromatography (Superdex 75 16/60; GE Healthcare) prior to crystallization. Site-specific variants were purified in essentially the same manner. Selenomethionine-incorporated TpolCBM16-1 was produced by the method of Van Duyne et al. (15) and purified in the manner described above except that 5 mM β-mercaptoethanol was added to all of the buffers. The protein samples were estimated to be greater than 95% pure as judged by SDS-PAGE.
Affinity Gel Electrophoretic Mobility Assays—Assays to monitor the binding of TpolCBM16-1 and TpolCBM16-2 to soluble substrates were performed as described by Tomme et al. (16) in the absence of SDS. 2-Mercaptoethanol was excluded from the loading buffer, and the proteins were not heated at 95 °C prior to loading onto the gel. Complexes were resolved on a nondenaturing separating 12% polyacrylamide gel containing 0.1% of one of the following soluble polysaccharide substrates: carboxymethyl cellulose, locust bean gum (galactomannan polysaccharide), or xylan. In all experiments the same proteins were run simultaneously in a control gel without incorporated soluble substrates. The electrophoresis was carried out at 100 V at 4 °C for 3 h. Proteins were visualized by staining with Coomassie Brilliant Blue R-250.
Isothermal Titration Calorimetry—Measurements were made at 25 °C using a VP-ITC calorimeter (MicroCal, Inc., Northampton, MA) following the manufacturer's recommended procedures. All samples were extensively dialyzed against 50 mM potassium phosphate buffer, pH 7.0, and all ligands were dissolved in the same buffer. The protein sample (100–200 µM), in a 1.4399-ml reaction cell, was injected with 28 successive 10-µl aliquots of ligand (1–1.5 mM) at 300-s intervals. Based on the results from our crystallographic experiments, the number of binding sites was presumed to be one. Data were fitted by nonlinear regression using a single site model (MicroCal Origin), and thermodynamic parameters were calculated using the Gibbs free energy equation (ΔG = ΔH − TΔS).
Crystallization and X-ray Data Collection—Initial crystallization conditions were established by the sparse-matrix sampling methods using commercial and homemade screens. Refinement of promising conditions yielded large crystals suitable for diffraction analysis. Crystals of TpolCBM16-1, TpolCBM16-2, and the TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose complexes were all grown using the hanging drop vapor diffusion method. For crystallization of native and SeMet-labeled TpolCBM16-1, 2 µl of protein sample (20 mg/ml, in 100 mM KCl, 10 mM HEPES, pH 7.5) was added to 2 µl of precipitant (3 M sodium formate, pH 7, 15% ethylene glycol) and equilibrated over a well containing the precipitant solution at 20 °C. Crystals grew within a week and reached a maximum size of 0.3 × 0.4 × 0.6 mm. As the crystallization precipitant proved to be suitable for cryo-protection, single crystals were harvested straight from the crystallization drop and flash-cooled in liquid nitrogen prior to data collection.
Co-crystals of the TpolCBM16-1-cellopentaose complex were grown by adding 2 µl of protein sample (20 mg/ml, in 100 mM KCl, 10 mM HEPES, pH 7.5, plus 5 mM cellopentaose) to 2 µl of precipitant (30% polyethylene glycol 3350, 100 mM Tris-HCl, pH 8.3, 200 mM MgCl2) and equilibrating over a well containing the precipitant solution at 20 °C. Crystals grew within 3 days and reached a maximum size of 0.4 × 0.4 × 0.4 mm. Prior to flash-cooling, crystals were briefly soaked in the precipitant solution supplemented with 20% of either methylpentanediol or glycerol. Co-crystals of the TpolCBM16-1-mannopentaose complex were grown by adding 2 µl of protein sample (20 mg/ml, in 100 mM KCl, 10 mM HEPES, pH 7.5, plus 5 mM mannopentaose) to 2 µl of precipitant (1.6 M ammonium sulfate, 100 mM MES, pH 6.5, 10% dioxane) and equilibrating the mixture over a well containing the precipitant solution at 20 °C. Crystals grew within 7 days and reached a maximum size of 0.2 × 0.2 × 0.4 mm. Prior to flash-cooling, crystals were briefly soaked in the precipitant solution supplemented with 25% glycerol.
Initial screening with TpolCBM16-2 established sodium formate as a viable precipitant for crystallization. These crystals grew as thin needles and could not be improved upon despite extensive screening. Sequence comparison showed that TpolCBM16-2 contained a cysteine (Cys74) at a position occupied by Phe76 in TpolCBM16-1. Reasoning that Cys74 may be fairly reactive, we incubated TpolCBM16-2 with 1 mM mercury(II) acetate, to cap this residue, prior to crystallization trials. Large single crystals of TpolCBM16-2 were grown by adding 2 µl of protein sample (20 mg/ml, in 100 mM KCl, 10 mM HEPES, pH 7.5, plus 1 mM mercury(II) acetate) to 2 µl of precipitant (3.5 M sodium formate, pH 7.0) and equilibrating over a well containing the precipitant solution at 20 °C. Prior to flash-cooling, crystals were briefly soaked in the precipitant solution supplemented with 20% ethylene glycol.
Flash-cooled crystals of unliganded TpolCBM16-1 diffracted x-rays to a minimum Bragg spacing of 1.3 Å, using an insertion device x-ray beam line utilizing an ADSC Q4 CCD detector (IMCA-CAT-Sector 17ID, Advanced Photon Source, Argonne, IL). These crystals occupied space group P212121 with unit cell parameters a = 57.4 Å, b = 77.8 Å, and c = 77.9 Å and contained two molecules in the crystallographic asymmetric unit. A 6-fold redundant data set was collected to a limiting resolution of 1.4 Å (overall Rmerge = 4.60, I/
Crystals of the TpolCBM16-1-cellopentaose complex diffract x-rays beyond a Bragg spacing of 1 Å at an insertion device synchrotron beam line (SER-CAT Sector 22BM, Advanced Photon Source, Argonne, IL). Crystals occupy space group P1 with unit cell parameters a = 32.5 Å, b = 48.8 Å, c = 49.4 Å,
A 6-fold redundant data set was collected from crystals of TpolCBM16-1-mannopentaose complex (overall Rmerge = 3.9, I/
Flash-cooled crystals of unliganded TpolCBM16-2 occupy space group C2221 with unit cell parameters a = 90.4 Å, b = 90.5 Å, c = 139.4 Å, with four molecules in the crystallographic asymmetric unit. The a and b unit cell edges are coincidentally close, and the crystals demonstrate tetragonal pseudo-symmetry consistent with space group P4222, with two molecules in the crystallographic asymmetric unit. However, scaling of the data in the higher symmetry space group resulted in a slight increase in merging statistics, and subsequent crystallographic refinement in the tetragonal setting resulted in a free R factor that stalled at 40%. Hence, the data were scaled in the orthorhombic setting, and structure determination and refinement were carried out in space group C2221. A 6-fold redundant data set was collected to a limiting resolution of 1.9 Å (overall Rmerge = 4.3, I/
Phasing and Structure Determination—Although crystals of selenomethionine-substituted TpolCBM16-1 were grown under similar conditions, SeMet-labeled TpolCBM16-1 occupied a different space group (C2221, with one molecule in the crystallographic asymmetric unit), with unit cell parameters a = 56.4 Å, b = 96.8 Å, c = 73.6 Å. A 6-fold redundant data set was collected at a wavelength near the selenium absorption edge, to a limiting resolution of 1.7 Å (overall Rmerge = 5.5, I/
The structure of native TpolCBM16-1 was determined by molecular replacement using the SeMet TpolCBM16-1 structure as a search probe (24). Following rigid body refinement of the initial molecular replacement solution, the atomic model was subjected to automatic rebuilding using ARP/wARP (25), resulting in a near complete trace of both molecules in the crystallographic asymmetric unit. The remainder of the model was fitted using XtalView (22) and further improved by rounds of refinement with REFMAC5 (23). The co-crystal structures of TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose were determined by molecular replacement using the final refined coordinates of native TpolCBM16-1 as a search probe. Clear density for the ligands could be observed in both co-crystal structures prior to crystallographic refinement. Cycles of manual rebuilding followed by crystallographic refinement were carried out for each of the co-crystal structures. The respective ligands were manually built into the difference Fourier maps after the free R factors dropped below 30%. Cross-validation, using 5–7% of the data for the calculation of the free R factor (26, 27), was utilized throughout the model building process to monitor building bias. The stereochemistry of the models was routinely monitored throughout the course of refinement using PROCHECK (28).
The structure of native TpolCBM16-2 was determined by molecular replacement (23) using the refined coordinates of the SeMet TpolCBM16-1 structure as a search probe. Multiple rounds of manual model building were interspersed with refinement using REFMAC5 (23) to complete structure refinement. Cross-validation used 5% of the data in the calculation of the free R factor (26, 27). Despite extensive manual rebuilding and refinement, residues in two loop regions, bridging Tyr82 through Thr90 and Thr111 through Phe116, respectively, remain ill-defined and have not been modeled in the structure. Consequently, the free R factor of 30% for this structure is slightly higher than would be expected, given the resolution limit of 1.9 Å. The stereochemistry of the model was routinely monitored throughout the course of refinement using PROCHECK (28). Crystal parameters, data collection parameters, and refinement statistics for each of the structures are summarized in Table 1. The refined coordinates have been deposited in the PDB with identification numbers 2ZEW (TpolCBM16-1), 2ZEX (TpolCBM16-1-cellopentaose), 2ZEY (TpolCBM16-1-mannopentaose), and 2ZEZ (TpolCBM16-2).
Computational Procedures—All calculations were performed with MOE 2006.08 (Chemical Computing Group, Inc.), using either the MMFF94x or OPLS-AA force fields. Starting coordinates for the energy minimization of the TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose complexes (this study) and the cellopentaose-CBM complex from Cellulomonas fimi (PDB 1gu3), solved by Davies and co-workers (29), were taken from the coordinates of the crystal structures, in which crystal waters were included. All atoms (solvent, CBM, and ligand) were energy-minimized until an energy gradient of 10⁻⁴ kcal/mol/Å² was achieved. The interaction energy of the ligand with the CBM and explicit solvent was then calculated using the MMFF94x force field.
Delineation of the CBM Domain Boundaries—In a previous report (14), our delineation of TpolCBM16-1 and TpolCBM16-2 suggested that the two modules were separated by 23 amino acids; hence each repeat started with a highly conserved N terminus characterized by the seven amino acids SAVPEAA and ended with a highly conserved sequence LVEV (Fig. 1C). We cloned and expressed each module based on this demarcation. However, neither of the products bound to Sigmacell type 50, although a construct that included both putative TpolCBM16s bound to this substrate. A visual inspection of the 23 amino acids occurring between the putative TpolCBM16-1 and putative TpolCBM16-2 showed that the sequence shares conserved residues with the 23 amino acids preceding the putative TpolCBM16-1. Therefore, we made two new constructs that included the 23 amino acids upstream of the previously demarcated N termini of the putative TpolCBM16s. Each of the constructs reported here, which extend the N-terminal region of each TpolCBM16, demonstrates tight binding to Sigmacell type 50. Thus, although our original C-terminal delineation of the duplicated TpolCBM16 of ManA was correct, the N-terminal demarcation was erroneous.
Thermodynamic Analysis of Ligand Specificity—The binding specificity of the purified TpolCBM16-1 and TpolCBM16-2 was determined by native electrophoretic mobility assays using various polysaccharide substrates. These studies did not show retardation of either of these modules in the presence of xylans or β1,3-linked glucose polymers. In contrast, tight binding was observed for both TpolCBM16-1 and TpolCBM16-2 with β1,4-linked glucose polymers such as cellulose. The affinity of these polypeptides for cellulose was such that once the individual module was bound to cellulose, the complex could only be dissociated by denaturing the protein. The binding characteristics of each of these carbohydrate-binding modules were further quantified by isothermal titration calorimetric analysis using oligosaccharides of defined chain length. The binding isotherms for the interaction of TpolCBM16-1 or TpolCBM16-2 with each oligosaccharide were fitted by nonlinear regression with a simple bimolecular interaction model (Table 2). Calorimetric analysis using the long chain polymers hydroxyethylcellulose and glucomannan demonstrates that TpolCBM16-1 interacts with glucomannan with a nearly 9-fold greater affinity than with hydroxyethylcellulose. Glucomannan is an abundant water-soluble polysaccharide that consists of mixed β-1,4-linked polymers containing glucose and mannose. Consistent with our thermodynamic data, glucomannan is likely the natural substrate for TpolCBM16-1 and TpolCBM16-2.
For the interactions with synthetic short chain oligosaccharides, the binding stoichiometries are consistent with a 1:1 interaction. The measured binding affinities increase as a function of chain length with an optimum consistent with a binding site that can accommodate a polymer of five sugar molecules. As is typical of protein-carbohydrate interactions, the thermodynamics of the interaction of TpolCBM16-1 and TpolCBM16-2 with β1,4-linked polymers are enthalpically driven, with an unfavorable entropic contribution.
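The relationship between the fitted association constant, the measured enthalpy, and the entropic term can be illustrated with a short calculation. This is a minimal sketch in Python, assuming the standard relations ΔG = −RT ln Ka and ΔG = ΔH − TΔS at 25 °C. The numerical inputs are hypothetical placeholders chosen only to show an enthalpically driven interaction; they are not values from Table 2.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(ka, temp_k=298.15):
    """Binding free energy (kcal/mol) from an association constant Ka (M^-1)."""
    return -R_KCAL * temp_k * math.log(ka)


def minus_t_delta_s(ka, delta_h, temp_k=298.15):
    """Entropic term -T*dS (kcal/mol), rearranged from dG = dH - T*dS."""
    return delta_g(ka, temp_k) - delta_h


def ddg_from_fold(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) equivalent to a fold change in Ka."""
    return R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    ka_example = 1.0e5   # hypothetical Ka in M^-1, for illustration only
    dh_example = -10.0   # hypothetical dH in kcal/mol, for illustration only
    print("dG    =", round(delta_g(ka_example), 2), "kcal/mol")
    print("-T*dS =", round(minus_t_delta_s(ka_example, dh_example), 2), "kcal/mol")
    # A 9-fold difference in affinity expressed as a free energy difference.
    print("ddG for a 9-fold change =", round(ddg_from_fold(9.0), 2), "kcal/mol")
```

With these placeholder inputs the entropic term comes out positive, that is, unfavorable, while the enthalpy more than compensates, which is the pattern described for the interaction with β1,4-linked polymers. On the same scale, a 9-fold difference in affinity corresponds to roughly 1.3 kcal/mol at 25 °C.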
Overall Structures of TpolCBM16-1 and TpolCBM16-2—Crystals of TpolCBM16-1, grown using sodium formate as a precipitant, occupied space group P212121, with two molecules in the crystallographic asymmetric unit, and diffracted synchrotron x-radiation to a Bragg limit of 1.5 Å. To obtain crystallographic phases, crystals of TpolCBM16-1 were grown from selenomethionine-labeled protein. However, crystals of SeMet-labeled TpolCBM16-1 occupied a different space group (C2221, with one molecule in the crystallographic asymmetric unit), and the structure of SeMet TpolCBM16-1 was solved to a resolution of 1.7 Å by single wavelength anomalous diffraction methods. The structure of native TpolCBM16-1 was solved to a resolution of 1.4 Å by molecular replacement using the refined coordinates of the structure of SeMet-labeled TpolCBM16-1. The two structures provide three independent views of the molecule and allow for detailed analysis of the structure, independent of crystal packing artifacts. The structure of TpolCBM16-2 was solved to a resolution of 2.1 Å by molecular replacement utilizing the final refined coordinates of TpolCBM16-1.
Both TpolCBM16-1 and TpolCBM16-2 fold into a compact domain typical of a β-sandwich fold consisting of two layers of five antiparallel β-sheets. Given the structural similarity between TpolCBM16-1 and TpolCBM16-2, for brevity, only the topology of TpolCBM16-1 will be described in detail. The bottom layer of the β-sandwich is composed of sheets β2 (Met24–Val27), β3 (Gly37–Ile40), β10 (Leu134–Glu143), β5 (Thr59–Phe68), and β8 (Thr104–Thr111). The upper layer of the sandwich is composed of sheets β1 (Gln18–Asp19), β4 (Ala46–Asp51), β9 (Gln121–Lys126), β6 (Phe76–His83), and β7 (Tyr91–Phe98). A single structural metal ion, presumed to be calcium by analogy to the structures of other carbohydrate-binding modules, is flanked by the N-terminal loop residues, the loop region between sheets β2 and β3, and the C-terminal region of β4. The ion has a coordination geometry characteristic of calcium and is engaged by the O-
The core architecture of TpolCBM16-1 and TpolCBM16-2 is similar to that observed in the three-dimensional structures of other families of carbohydrate-binding modules. A DALI search against other structures deposited in the Protein Data Bank shows the closest structural homologs to be the β-1,3 xylan-binding domain from CBM22 (30) (PDB code 1DYO) (r.m.s.d. of 2.9 Å over 138 aligned residues; Z-score = 14.8), the β-1,3-binding laminarinase of CBM4-2 (29) (PDB code 1GUI) (r.m.s.d. of 2.1 Å over 123 aligned residues), and the sugar-binding domain of the F-box ubiquitin ligase (PDB code 1UMH) (r.m.s.d. of 3.1 Å over 135 aligned residues) (31). As in the structures of TpolCBM16-1 and TpolCBM16-2 presented here, a calcium ion is located at an equivalent position in the above noted structures of CBM22 and CBM4, where it contributes to the stability of the polypeptide (4, 29, 30).
Within the β-sandwich fold of TpolCBM16-1, the top layer of the sandwich is curved so as to create a cleft that runs along the resultant concave face, perpendicular to the direction of the β-sheets (Fig. 2A). This cleft spans about 25 Å across the face of the β-sandwich and constitutes the ligand-binding site (Fig. 2B). Two solvent-exposed tryptophan residues (Trp20 and Trp125) lie at the base and along one face of this cleft, respectively. A number of polar residues surround the periphery of this cleft, including Gln21, Asp77, and Asn97 on one side and Gln81, Gln93, and Gln121 on the other side, where they are poised to hydrogen bond with the sugar ligand. Given the sequence identity of 62% over 144 amino acids between TpolCBM16-1 and TpolCBM16-2, it is not surprising that the two structures are nearly identical. The main chain atoms from the final refined coordinates of the two structures can be superimposed with an r.m.s.d. of 0.8 Å over 120 atoms. Although the two polypeptides share the same overall fold and contain similar structural features, such as the single structural calcium ion, significant differences exist at the ligand-binding site. A superposition of the two structures reveals that the largest deviations exist in the loop region encompassing residues 18–28, which includes the solvent-exposed Trp20. In addition, several of the polar residues that surround the ligand-binding site of TpolCBM16-1 are not conserved in TpolCBM16-2, including Gln21 (Tyr21 in TpolCBM16-2) and Asn97 (Arg97 in TpolCBM16-2). These notable differences may account for the measured 2-fold lower affinity of TpolCBM16-2 for cellopentaose, relative to that for TpolCBM16-1 (see Table 2).
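The r.m.s.d. figure quoted for the superposition is a root mean square of distances between paired atoms. The following is a minimal Python sketch of that final step only. It assumes the two coordinate sets have already been optimally superimposed and paired atom by atom (for example by a least-squares fit in a structure analysis package); the function name and the toy coordinates are illustrative and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. angstroms)
    between two equally long lists of (x, y, z) tuples.

    The coordinates are assumed to be superimposed already; no fitting
    is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 1 unit along z gives an r.m.s.d. of 1.
    model_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model_2 = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(rmsd(model_1, model_2))
```

Because the measure averages over all paired atoms, a low overall value can coexist with large local deviations, as in a single divergent loop.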
Molecular Basis for Ligand Specificity and Selectivity by CBMs—To determine the molecular basis for ligand specificity, we determined the crystal structure of TpolCBM16-1 bound to cellopentaose and to mannopentaose. The co-crystal structures of the TpolCBM16-1-oligosaccharide complexes were determined by molecular replacement using the refined coordinates of unliganded TpolCBM16-1 as a search model. Following rigid body refinement of the molecular replacement solutions, clear and continuous electron density, corresponding to each of the five β-1,4-linked sugars, could be observed in initial electron density maps (Fig. 3, A and C). Following further refinement and addition of solvent molecules, the pentasaccharide sugars were built into difference Fourier maps, and the binary complexes were refined without any constraints. In the 1.2 Å resolution co-crystal structure of the TpolCBM16-1-cellopentaose complex, there are two identical copies of the complex in the asymmetric unit, and the structural descriptions are apt for either molecule. Cellopentaose interacts with TpolCBM16-1 in an extended conformation in a 25 Å cleft located perpendicular to the face of the β-sandwich (Fig. 2). As in co-crystal structures of other carbohydrate-binding modules, two solvent-exposed aromatic residues (Trp20 and Trp125), which lie at the base and along one face of the ligand-binding cleft, engage in hydrophobic interactions with the three terminal pyranosides at the reducing end of the oligosaccharide (Fig. 3B). The tryptophan residues are separated by roughly 8.5 Å, and their indole side chains are co-planar with the D-configured sugars to provide extensive hydrophobic contact. Mutational analyses show that replacement of either of these tryptophan residues compromises ligand recognition and demonstrate the importance of the hydrophobic platform created by these two residues in ligand binding.
In addition to these hydrophobic contacts, an extensive set of hydrogen bond interactions stabilizes the interaction between the protein and the remaining two sugars at the nonreducing end of the pentasaccharide ligand (Fig. 3B). Sets of polar residues flank either side of the ligand-binding cleft, where they interact with the hydroxyl groups of the cellopentaose. Asn97, Asp77, and Gln21, located along one face of the cleft, make contact with the glucose residues at sites 1, 2, and 4, respectively. Gln93, Gln81, and Gln121, located along the opposite face of the cleft, engage the sugar residues at sites 3 and 4. Additional contacts are observed between the C-6 hydroxymethyl group of glucose molecules at the reducing end of the oligosaccharide and residue Gln21 and the backbone carbonyl of Trp20. The interaction between TpolCBM16-1 and cellopentaose buries a total of 499.6 Å² of surface area.
In the 2.2 Å resolution co-crystal structure of TpolCBM16-1 bound to mannopentaose, the pentasaccharide is likewise buried in an extended conformation within the 25 Å cleft located perpendicular to the face of the β-sandwich. As in the TpolCBM16-1-cellopentaose complex, hydrophobic residues Trp20 and Trp125 interact with the three pyranosides at the reducing end (Fig. 3D). Hydrogen bond interactions between the sugars at sites 1 and 4 and protein residues Asn97 and Gln21 lie along one face of the ligand-binding cleft. Along the opposite face of the cleft, residues Gln93, Gln81, and Gln121 interact with the pyranosides at sites 3 and 4. The mannosides at sites 1 and 5 are twisted away from the ligand-binding cleft to accommodate the axial location of the C-2 hydroxyls of the mannosides (Fig. 3D). As a result, TpolCBM16-1 forms fewer hydrogen bond interactions with the pyranosides of mannopentaose, relative to cellopentaose, and buries slightly less total surface area (430 Å²) upon complex formation.
Comparison of the two co-crystal structures provides insights into the molecular basis for promiscuous ligand binding by TpolCBM16-1. Of particular interest is how the protein scaffold can accommodate the C-2 hydroxyl in an equatorial configuration in the cellopentaose complex, in contrast to how the axial C-2 hydroxyl is accommodated in the mannopentaose complex. The C-2 hydroxyls of the carbohydrates at sites 2 and 5 in the pentasaccharide complexes do not directly interact with the protein. However, the C-2 hydroxyls of the carbohydrates at sites 1, 3, and 4 are characterized by distinct hydrogen bonding interactions with the protein. One of the major specificity determinants in this scaffold is Gln81, and this residue is able to interact with the C-2 hydroxyl at site 3 in either the axial or equatorial configuration. The axial configuration of the C-2 hydroxyl at site 1 of the mannopentaose complex results in the loss of a hydrogen bond to the N-
In Silico Analysis of Ligand Specificity—To examine the nature of the TpolCBM16 specificity and affinity for ligands, the crystallographic complexes with cellopentaose and mannopentaose were compared with energy-minimized TpolCBM16-polysaccharide complexes. Calculation of interaction energies (i.e. the sum of nonbonding energies) between oligosaccharide and TpolCBM16-1 showed a clear correlation with experimental ΔH values determined from isothermal titration calorimetry (Fig. 4). When structural and thermodynamic data from a similar CBM-ligand complex (CBM-cellopentaose complex from C. fimi, PDB code 1gu3) (29) are included with the TpolCBM16-1 data from this work, a strong correlation is seen between CBM-ligand interaction energy and experimental ΔH (Fig. 4), with a slope of 6 ± 1 and r = 0.96. These calculations were all carried out using the MMFF94x force field. The energy calculations were also repeated using a different force field, OPLS-AA, which yielded an almost identical trend in interaction energies versus experimental ΔH (inset of Fig. 4).
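The slope and correlation coefficient quoted above are the usual outputs of an ordinary least-squares fit. The following Python sketch shows how such values could be computed from paired calorimetric enthalpies and calculated interaction energies. The function name `linear_fit`, the choice of enthalpy as the x axis, and the numbers in the example are assumptions made for illustration; the numbers are placeholders and are not data from this study.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope, intercept, and Pearson r for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length and at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical placeholder values (kcal/mol), NOT taken from the paper.
    delta_h = [-8.0, -10.0, -12.0, -15.0]   # experimental enthalpy (x axis, assumed)
    e_inter = [-50.0, -61.0, -75.0, -90.0]  # calculated interaction energy (y axis, assumed)
    slope, intercept, r = linear_fit(delta_h, e_inter)
    print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.3f}")
```

With only a handful of complexes, a fit of this kind has few degrees of freedom, so the uncertainty on the slope deserves as much attention as the value of r.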
There are meaningful differences in the strength of interaction in the TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose complexes, which are also suggested by computational analysis. However, the small experimental differences between the
When one parses the source of the difference in the interaction energies between the TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose complexes, one sees a significant difference in the net van der Waals contacts: the TpolCBM16-1-cellopentaose complex has a positive (i.e. unfavorable) net van der Waals energy of +6 kcal/mol, whereas the TpolCBM16-1-mannopentaose complex has a negative net van der Waals energy of -14 kcal/mol. The TpolCBM16-1-cellotetraose complex has a small decrease in the unfavorable van der Waals energy. However, the majority of the loss in interaction energy in the TpolCBM16-1-cellotetraose complex, relative to the TpolCBM16-1-cellopentaose complex, is because of decreased electrostatic interactions. These same trends are also observed when performing energy calculations with the OPLS-AA force field.
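The split of an interaction energy into van der Waals and electrostatic parts can be illustrated with a generic pairwise sum. The sketch below uses a 12-6 Lennard-Jones term with geometric combining rules (the functional form used by OPLS-AA) and an unscreened Coulomb term. It is not the MMFF94x calculation reported here, which uses a different van der Waals form, and it applies no cutoff, dielectric model, or minimization. The atom dictionaries, parameter values, and function names are hypothetical.

```python
import math

# Converts q_i * q_j / r (charges in e, r in Å) to kcal/mol
COULOMB_KCAL = 332.0637


def pair_energies(atom_a, atom_b):
    """Return (vdw, elec) in kcal/mol for a single atom pair.

    Each atom is a dict with keys "xyz" (Å), "charge" (e),
    "sigma" (Å), and "epsilon" (kcal/mol).
    """
    r = math.dist(atom_a["xyz"], atom_b["xyz"])
    # Geometric combining rules for the pair parameters
    sigma = math.sqrt(atom_a["sigma"] * atom_b["sigma"])
    epsilon = math.sqrt(atom_a["epsilon"] * atom_b["epsilon"])
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)   # 12-6 Lennard-Jones
    elec = COULOMB_KCAL * atom_a["charge"] * atom_b["charge"] / r
    return vdw, elec


def interaction_energy(protein_atoms, ligand_atoms):
    """Sum nonbonded terms over every protein-ligand atom pair."""
    vdw_total = 0.0
    elec_total = 0.0
    for p in protein_atoms:
        for lig in ligand_atoms:
            vdw, elec = pair_energies(p, lig)
            vdw_total += vdw
            elec_total += elec
    return {"vdw": vdw_total, "elec": elec_total, "total": vdw_total + elec_total}


if __name__ == "__main__":
    # Hypothetical two-atom "protein" and one-atom "ligand", for illustration only.
    protein = [
        {"xyz": (0.0, 0.0, 0.0), "charge": -0.4, "sigma": 3.0, "epsilon": 0.15},
        {"xyz": (1.5, 0.0, 0.0), "charge": 0.4, "sigma": 3.4, "epsilon": 0.10},
    ]
    ligand = [{"xyz": (0.0, 3.2, 0.0), "charge": 0.3, "sigma": 3.1, "epsilon": 0.12}]
    print(interaction_energy(protein, ligand))
```

In this functional form, a positive net van der Waals term arises when enough atom pairs sit closer than their combined sigma that the repulsive r^-12 contributions outweigh the attractive r^-6 contributions.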
Molecular Basis for Differential Affinity between TpolCBM16-1 and TpolCBM16-2—The results of our calorimetric analysis demonstrate that although TpolCBM16-1 and TpolCBM16-2 can both bind cellulose polymers, the affinity of TpolCBM16-2 for the polysaccharide is nearly 4-fold less than that of TpolCBM16-1. A comparison of the crystal structure of TpolCBM16-2 with the co-crystal structure of the TpolCBM16-1-cellopentaose complex provides a molecular rationale for this difference. A superposition of the
In this study we have characterized each of the two carbohydrate-binding modules from T. polysaccharolyticum ManA using a combination of biochemical, thermodynamic, computational, and structural biological approaches. We demonstrate that each of the TpolCBM16s can bind both β-1,4-glucose and mannose polymers and quantify the binding affinity for such ligands using isothermal titration calorimetry. Our thermodynamic analysis suggests that the natural substrate for CBM16s is likely glucomannan, a mixed β-1,4-linked polymer containing both glucose and mannose that comprises nearly half the dry weight of roots and softwoods. Hence, this scaffold has been adapted for plasticity in substrate recognition. In addition, we present four crystal structures, including those of ligand-free TpolCBM16-1 and TpolCBM16-2 and of the TpolCBM16-1-cellopentaose and TpolCBM16-1-mannopentaose complexes, which provide a molecular rationale for understanding the ligand promiscuity and specificity of the TpolCBM16s.
Despite the binding promiscuity toward both mannan and glucan polymers, the TpolCBM16s are not able to interact with other oligosaccharides, including chitin, oligoxylans, and β-1,3-glucan species. Our co-crystal structures provide the chemical bases for understanding this binding specificity. Chitin is an unbranched polymer of N-acetyl-D-glucosamine that differs from cellulose by the replacement of the C-2 hydroxyl with an acetamido group. The lack of interaction between the TpolCBM16s and chitin may be explained by the steric clashes that would result between the acetamido groups and the protein scaffold at the sugar residues at positions 3 and 4 of the polysaccharide. Although the chemical structures of xylan and cellulose are similar (xylans lack the C-6 hydroxymethyl group), their tertiary structures are quite distinct. Cellulose and mannan form flat polymers with a 180° rotation between consecutive sugars, whereas xylan forms helical structures distinguished by a 120° rotation between consecutive units (32). Hence, TpolCBM16s cannot engage the helical orientation of xylan polysaccharides within a ligand-binding cleft that has evolved to accommodate planar polymers such as those of cellulose and mannan. Similarly, β-1,3-glucan polymers are shown to form helical structures in solution. Modeling studies of laminarins demonstrate an energetically favorable helical configuration with
The flexibility in ligand binding demonstrated by the TpolCBMs is reminiscent of the plasticity observed for the family 29 carbohydrate-binding modules from the Piromyces equi noncatalytic protein NCP1 (34). The two modules, CBM29-1 and CBM29-2, bind to a range of β-1,4-linked polysaccharides, including cellulose, mannan, xylan, and glucomannan. Structural studies of CBM29-2 bound to cellohexaose and to mannohexaose reveal very few direct hydrogen bonds, particularly between the protein and the C-2 hydroxyl of the ligand (35, 36).
Thus, β-1,4-linked polysaccharides with different configurations of the C-2 hydroxyl can be accommodated at the binding site of CBM29s because these hydroxyl groups do not significantly contribute to binding. Our co-crystal structures of TpolCBM16-1 bound to polysaccharides illustrate an alternative mechanism for promiscuous ligand binding, namely compensatory changes in the hydrogen bonding between the C-2 hydroxyl and the protein to accommodate differences in configuration of the carbohydrate. Thus, whereas both CBM29 and TpolCBM16 demonstrate promiscuity in ligand specificity, each utilizes a distinct strategy for achieving this plasticity.
The ubiquitous nature of carbohydrate-binding modules that share an overall common fold suggests that this scaffold is effectively optimized for polysaccharide binding. Subtle differences in the ligand-binding site have resulted in the evolution of a number of different polypeptides capable of engaging a wide diversity of polysaccharides utilizing common core elements. Many such carbohydrate-binding modules are tethered to a catalytic domain, and this modular architecture serves to enhance efficacy by targeting the catalytic components toward their substrate through interactions mediated by the carbohydrate-binding domain. The modular nature of these enzymes suggests that the combinatorial assembly of different catalytic and carbohydrate-binding domains can result in the production of novel activities. Our work provides a foundation for understanding the ligand specificity of the TpolCBM16s, and these insights may be utilized in altering specificity through minor changes in the protein scaffold. This rational redesign approach can be utilized to expand the natural repertoire of enzymes that can degrade complex polysaccharides into simple sugars, and such an approach will have a broad impact in the biofuels industry.
The atomic coordinates and structure factors (codes 2ZEW, 2ZEX, 2ZEY, and 2ZEZ) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
* This work was supported in part by American Cancer Society Grant 04-37 (to S. K. N.) and in part by the Agricultural Experimental Station Grant ILLU 538-364 (to I. K. O. C.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 To whom correspondence should be addressed: Dept. of Biochemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801. Tel.: 217-333-0641; Fax: 217-244-5858; E-mail: snair{at}uiuc.edu.
2 The abbreviations used are: CBM, carbohydrate-binding module; MES, 4-morpholineethanesulfonic acid; r.m.s.d., root mean square deviation; PDB, Protein Data Bank; SeMet, selenomethionine.
We thank Lisa Keefe and the staff at IMCA-CAT (17-ID at Argonne National Laboratories, Advanced Photon Source) and John Chrzas and staff at SER-CAT (22-BM at Argonne National Laboratories, Advanced Photon Source) for facilitating data collection. We also thank Joseph Brunzelle at LS-CAT (23-ID at Argonne National Laboratories, Advanced Photon Source) for assistance with data collection.