First Structural Insights into α-L-Arabinofuranosidases from the Two GH62 Glycoside Hydrolase Subfamilies*

Background: α-L-Arabinofuranosidases hydrolyze arabinofuranosyl side chains from xylans. Results: The first crystal structures of two fungal α-L-arabinofuranosidases representing two distinct subfamilies of the glycoside hydrolase GH62 family are presented. Their examination unveils specificity determinants. Conclusion: The structures of complexes with arabinose and cellotriose provide preliminary insight into substrate recognition and catalysis. Significance: This work provides the first structural description of members of the GH62 family.

α-L-Arabinofuranosidases are glycoside hydrolases that specifically hydrolyze non-reducing residues from arabinose-containing polysaccharides. In the case of arabinoxylans, which are the main components of hemicellulose, they are part of microbial xylanolytic systems and are necessary for complete breakdown of arabinoxylans. Glycoside hydrolase family 62 (GH62) is currently a small family of α-L-arabinofuranosidases that contains only bacterial and fungal members. Little is known about the GH62 mechanism of action, because only a few members have been biochemically characterized and no three-dimensional structure is available. Here, we present the first crystal structures of two fungal GH62 α-L-arabinofuranosidases from the basidiomycete Ustilago maydis (UmAbf62A) and ascomycete Podospora anserina (PaAbf62A). Both enzymes are able to efficiently remove the α-L-arabinosyl substituents from arabinoxylan. The overall three-dimensional structure of UmAbf62A and PaAbf62A reveals a five-bladed β-propeller fold that confirms their predicted classification into clan GH-F together with GH43 α-L-arabinofuranosidases. Crystallographic structures of the complexes with arabinose and cellotriose reveal the important role of subsites +1 and +2 for sugar binding. 
Intriguingly, we observed that PaAbf62A was inhibited by cello-oligosaccharides and displayed binding affinity to cellulose although no activity was observed on a range of cellulosic substrates. Bioinformatic analyses showed that UmAbf62A and PaAbf62A belong to two distinct subfamilies within the GH62 family. The results presented here provide a framework to better investigate the structure-function relationships within the GH62 family.

The increasing push toward biorefining of plant biomass is providing new impetus to the study of enzyme-mediated deconstruction of plant cell walls. However, the rather complex, refractory nature of plant cell walls renders economically viable deconstruction difficult to achieve, especially when using low specificity chemical catalysis. Therefore, a lot of emphasis is being put on the development of biotechnological tools, such as glycoside hydrolases and microorganisms displaying the ability to hydrolyze plant cell wall components.
The plant cell wall contains a large number of complex polysaccharides exemplified by arabinoxylans, which are extensively decorated at the O2 and/or O3 positions with α-L-arabinofuranosyl units (1). Therefore, the complete deconstruction of plant cell wall polysaccharides requires numerous enzymatic activities, including a complex arsenal of so-called accessory enzymes that hydrolyze arabinoxylans and produce monomeric pentose sugars (2, 3). Among the hemicellulases that depolymerize arabinoxylans, α-L-arabinofuranosidases (EC 3.2.1.55) are exo-acting enzymes that remove L-arabinofuranosyl side chain moieties from the main chain xylan, thus enhancing the action of other enzymes, such as β-D-xylosidases and β-D-xylanases (4). α-L-Arabinofuranosidases are mainly grouped in glycoside hydrolase (GH) families GH43, GH51, and GH62 of the CAZy classification (5). Among these families, the GH62 family is the smallest with 121 members (as of 12th December 2013) and has so far received little attention. GH62 α-L-arabinofuranosidases are secreted by microorganisms and proteomic studies have identified them among the most abundant CAZymes in several fungal secretomes (6-10). To date, all of the biochemically characterized GH62 are arabinoxylan α-L-arabinofuranohydrolases, inasmuch as they specifically cleave either α-1,2- or α-1,3-L-arabinofuranosidic linkages in arabinoxylans (11). In the case of GH62 α-L-arabinofuranosidases from Penicillium chrysogenum and Penicillium funiculosum, ¹H NMR and hydrolytic fingerprinting revealed that these enzymes cleave α-1,2- or α-1,3-bonds that specifically link arabinofuranosyl moieties to singly substituted D-xylosyl residues in arabinoxylan (12-14).

* This work was supported in part by grants from the region Midi-Pyrénées (to B. S.) and the National Institute of Agricultural Research (INRA) CEPIA research department.
This article contains supplemental Fig. S1.
The atomic coordinates and structure factors (codes 4N4B, 4N2Z, 4N1I, and 4N2R) have been deposited in the Protein Data Bank (http://wwpdb.org/).
1 Both authors contributed equally to this work.
2 Supported by Région Midi-Pyrénées and European funds (FEDER).
3 To whom correspondence may be addressed:
Although the catalytic mechanism of the GH62 family has not yet been elucidated, GH62 members have been predicted by the CAZy team to belong to clan GH-F (15), which is also the clan assignation of GH43 enzymes. Accordingly, it is thought that GH62 will probably operate using a single displacement or inverting mechanism, rather like enzymes in the GH43 family (16). Similarly, using sequence homologies that exist between GH43 and GH62, the identity of the catalytic residues has been predicted (16). However, in the absence of structural information and further biochemical characterization, it is impossible to be more affirmative at this stage.
In this study, we undertook the structural characterization of two GH62 α-L-arabinofuranosidases from the fungi Podospora anserina (PaAbf62A) and Ustilago maydis (UmAbf62A). PaAbf62A was mined from the genome of the ascomycete P. anserina, a fungus that displays an impressive set of hemicellulose-acting enzymes (17,18). UmAbf62A was identified in the secretome of the phytopathogen U. maydis, a secretome that can beneficially complete that of Trichoderma reesei CL847 for the saccharification of wheat straw (7). Here, we present the crystal structures of PaAbf62A and UmAbf62A in their apo form and in complex with arabinose and cellotriose (C3). Enzyme characterization of PaAbf62A and UmAbf62A revealed differences in specificities, and bioinformatics analysis of the GH62 family revealed that they belong to different subfamilies.

EXPERIMENTAL PROCEDURES
Recombinant Production of PaAbf62A and UmAbf62A-Recombinant PaAbf62A was produced as previously described (18). The U. maydis strain used in this study was obtained from the FGSC (Fungal Genetics Stock Center, Kansas City, KS) fungal collection. The strain was verified by internal transcribed spacer sequencing and archived under the number BRFM 1093 at the "Centre International de Resources Microbiennes," which is dedicated to filamentous fungi of biotechnological interest (CIRMCF), at the National Institute of Agricultural Research (INRA), Marseille, France. The U. maydis BRFM 1093 strain was grown on MA2 (malt extract at 2% w/v) medium. Genomic DNA was purified from the mycelium using the NucleoSpin Plant II kit (Macherey-Nagel GmbH and Co., Düren, Germany). The NW_101116 sequence encoding the UM04309 protein (GenBank™ EAK85571) was amplified by PCR from genomic DNA using Pwo Super yield DNA polymerase (GE Healthcare) and the following primers: EcoRI, forward 5′-GAATTCAACCCCGAAACTGAACG-3′ and XbaI, reverse 5′-TCTAGACCCAGACCGTACAAGGTG-3′. The amplified fragment was subcloned into the pCRII-TOPO vector (Invitrogen) and sequenced to check the integrity of the sequence. The full-length ORF corresponding to UmAbf62A was inserted at the corresponding sites (EcoRI and XbaI) into the pPICZαA vector in-frame with both the yeast α-secretion factor and C-terminal (His)6 tag encoding sequences. The resulting recombinant expression plasmid was transformed into Pichia pastoris X33 and the best-producing transformant was selected for recombinant UmAbf62A production and purification as previously described (18).
Enzymatic Characterization of PaAbf62A and UmAbf62A-Arabinofuranosidase activity was determined by measuring the release of 4-nitrophenol from p-nitrophenyl-α-L-arabinofuranoside (pNP-Ara; 10 mM; Sigma) in 50 mM sodium phosphate buffer (pH 5.0) at 37°C in a reaction volume of 100 µl. The reaction was stopped by adding 900 µl of 1 M sodium carbonate (pH 9.0). The release of 4-nitrophenol was quantified at 410 nm using a molar extinction coefficient for 4-nitrophenol of 18,300 M⁻¹ cm⁻¹. One unit of enzyme activity was defined as the amount of protein that released 1 µmol of arabinose per min. The apparent optimal pH was estimated using pNP-Ara in the pH range 2.6 to 7.5. The apparent optimal temperature was estimated in the 40 to 70°C range under the conditions described above. For determination of Michaelis-Menten constants, the initial velocities of the enzymes were measured in 50 mM sodium phosphate buffer (pH 5.0) at 37°C, with pNP-Ara concentrations ranging from 0.1 to 20 mM. The kinetic parameters were estimated by using weighted nonlinear least squares regression analysis with Grafit software (Erithacus Software, Horley, United Kingdom).
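As an illustration of the unit definition above, the conversion of an end-point absorbance reading into enzyme units can be sketched as follows. This is a minimal sketch, not the authors' procedure: it assumes a 1-cm path length, assumes that the absorbance is read on the full 1-ml stopped mixture (100 µl reaction plus 900 µl sodium carbonate), and assumes that one 4-nitrophenol is released per arabinose. The function names and the example reading are hypothetical.

```python
# Sketch: convert an A410 end-point reading into enzyme units (µmol min^-1).
# Assumptions (not stated in the text): 1-cm path length, absorbance read on
# the 1-ml stopped mixture, 1:1 stoichiometry of 4-nitrophenol and arabinose.

EPSILON_PNP = 18_300.0    # M^-1 cm^-1, extinction coefficient quoted in the text
FINAL_VOLUME_L = 1.0e-3   # 100 µl reaction + 900 µl of 1 M sodium carbonate


def pnp_released_umol(a410, path_cm=1.0, volume_l=FINAL_VOLUME_L):
    """Return µmol of 4-nitrophenol in the stopped assay (Beer-Lambert law)."""
    conc_molar = a410 / (EPSILON_PNP * path_cm)   # mol per liter
    return conc_molar * volume_l * 1.0e6          # mol -> µmol


def activity_units(a410, minutes, path_cm=1.0):
    """Return enzyme units, i.e. µmol of product released per minute."""
    return pnp_released_umol(a410, path_cm) / minutes


if __name__ == "__main__":
    # Hypothetical reading: A410 = 0.366 after a 10-min incubation.
    print(f"{activity_units(0.366, 10.0):.4f} U in the assay")
```

Dividing the result by the mass of enzyme in the assay would give a specific activity; a blank-corrected absorbance should be used in practice.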
The hydrolysis of 0.1% (w/v) of low-viscosity wheat arabinoxylan, sugar beet arabinan, debranched arabinan, and larch arabinogalactan (Megazyme, Wicklow, Ireland) at 37°C, in 50 mM sodium phosphate buffer (pH 5.0), was monitored by analyzing the amount of free arabinose, using high performance anion exchange chromatography coupled with amperometric detection (ICS 3000; Dionex, Sunnyvale, CA) equipped with a CarboPac PA-1 analytical column (250 × 4 mm). Enzymatic reactions were stopped by the addition of 50 mM NaOH before injection (10 µl) onto the high performance anion exchange chromatography system. Elution (1 ml min⁻¹) was carried out in 18 mM NaOH. A calibration curve was plotted using arabinose as the standard (Sigma) and used to calculate the amount of product released. All assays were carried out in triplicate. Results are expressed in µmol of arabinose released per min per mg of enzyme.
Cellulose Binding Studies-Enzyme affinity for soluble 2-HEC (Sigma) was investigated using affinity gel electrophoresis. Native polyacrylamide gels, prepared by dissolving acrylamide (10% w/v) in 25 mM Tris, 250 mM glycine buffer (pH 8.3), were polymerized in the absence and presence of ligand (0.001 to 0.1% (w/v) 2-HEC). For the analysis, ligand-containing and ligand-free native gels were run simultaneously in the same tank. Five µg of target proteins and BSA (as a negative control) were loaded onto the gels and subjected to electrophoresis at 10 mA/gel for 2 h at room temperature in running buffer containing 15 g liter⁻¹ of Tris-HCl and 72 g liter⁻¹ of glycine (pH 8.3). A Kd value was determined by plotting the values 1/(R − r) versus 1/[L], where R represents the ratio between GH62 and BSA migration in the absence of 2-HEC, r, the same ratio in the presence of 2-HEC, and [L], the 2-HEC concentration expressed in mg ml⁻¹. −1/Kd corresponds to the y value when x = 0 (19).
Inhibition Studies-Inhibition was assayed by measuring the arabinofuranosidase activity of the enzymes using pNP-Ara as substrate in the presence of cello-oligosaccharides. Cello-oligosaccharides were added to the reaction mixture at concentrations ranging from 0 to 300 mM (for cellobiose) and from 0 to 8 mM (for C3, C4, C5, and C6) at pH 5.0 and 37°C. The extent of inhibition was expressed as the relative enzyme activity (ratio of activity with inhibitor versus activity without inhibitor). Following the same procedure, the inhibition constant (Ki) was determined using C6 at 6 different concentrations ranging from 0.5 to 8 mM. Data were fitted to the Michaelis-Menten equation to generate an estimate of Ki, using the Grafit data analysis software.
Glycosylation Site Prediction-Putative N-glycosylation sites were predicted from the amino acid sequence using NetNGlyc and verified by mass spectrometry.
N-Deglycosylation-Endo-Hf and amylose resin high flow were obtained from New England Biolabs (Beverly, MA). 0.5 milliunits of Endo-Hf was used to deglycosylate 1 mg of PaAbf62A at 30°C for 24 h in G5 NEBuffer. Endo-Hf was eliminated by adding amylose resin. Incubation was performed for 2 h at 4°C with gentle shaking in MBP buffer (20 mM Tris-HCl, pH 7.4, 200 mM NaCl, and 1 mM EDTA). Deglycosylated PaAbf62A was recovered in the supernatant after centrifugation.
Mass Spectrometry Analyses-Protein samples were reduced in Laemmli buffer (final composition 25 mM DTT, 2% SDS, 10% glycerol, 40 mM Tris, pH 6.8), 5 min at 95°C, and alkylated with iodoacetamide. 5 µg of protein was subjected to 12% SDS-PAGE (Mini-Protean, Bio-Rad). Protein in the gel was digested by incubating the piece of gel with two enzymes: 25 µl of 20 ng µl⁻¹ of modified sequencing grade trypsin (Promega) and 25 µl of 20 ng µl⁻¹ of Endoproteinase Asp-N Sequencing grade (Roche Applied Science), in 50 mM ammonium bicarbonate, overnight at 37°C. These peptides were analyzed by nanoLC-MS/MS using an Ultimate3000 system (Dionex, Amsterdam, The Netherlands) coupled to an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) as previously described (20), except that peptides were eluted using a 5 to 50% gradient of solvent B during 60 min at a flow rate of 300 nl min⁻¹. The Mascot Daemon software (version 2.2.0, Matrix Science) was used to perform database searches against all entries in a homemade database containing the PaAbf62A protein sequence. Carbamidomethylation of cysteines, oxidation of methionines, acetylation on N-terminal protein, and HexNAc of asparagine were set as variable modifications for Mascot searches, without enzyme cleavage site. The mass tolerances in MS and MS/MS were set to 10 ppm and 0.8 Da, respectively, and the instrument setting was specified as "ESI-Trap."

Size Exclusion Chromatography-200 µl of 2 mg ml⁻¹ of proteins in 50 mM sodium acetate buffer (pH 5.0), 50 mM NaCl were loaded onto a Superdex 75 column (GE Healthcare, Life Sciences) and eluted with the same buffer at a flow rate of 0.7 ml min⁻¹. Calibration of the column was performed using the Gel Filtration LMW Calibration Kit (GE Healthcare, Life Sciences).
Crystallization-Purified proteins were concentrated using polyethersulfone Vivaspin concentrators with a cut-off of 10 kDa (Vivascience, Sartorius, Goettingen, Germany). The final concentration was determined by measuring the absorbance at 280 nm using a Nanodrop instrument (Wilmington, DE) and theoretical extinction coefficients of 99,350 and 67,840 M⁻¹ cm⁻¹ for PaAbf62A and UmAbf62A, respectively. All crystallization experiments were carried out at 12°C. Initial crystallization conditions were screened using the commercially available JCSG I, II, III, and IV and PEG kits (Qiagen, Courtaboeuf, France) and a robotic sitting-drop method, with drops formed by mixing equal volumes (0.2 µl) of protein (12, 18, 25, or 38 mg ml⁻¹ in 50 mM sodium acetate, pH 5.0) and precipitant solution, which were equilibrated against 80 µl of precipitant solution. Conditions were further optimized manually using the hanging-drop vapor-diffusion method in 24-well plates. The best crystals of PaAbf62A were obtained with a 1:1 (v/v) ratio of protein (18 mg ml⁻¹ in 50 mM sodium acetate (pH 5.0) and 50 mM NaCl) to precipitant solution (25% (w/v) PEG 4000, 0.2 M calcium chloride, and 0.1 M Tris pH 8.5). Crystals appeared within a week and grew to an approximate size of 0.25 × 0.2 × 0.05 mm³.
Data Collection and Phasing-X-ray experiments were carried out at 100 K. Prior to flash cooling, optimized native crystals of UmAbf62A and PaAbf62A were soaked for a few seconds in the reservoir solution supplemented with 10% (v/v) glycerol to avoid ice formation. Diffraction data were collected in-house (Xcalibur Nova, Agilent Technologies) for PaAbf62A in complex with cellotriose, on beamlines ID23-eh1 (PaAbf62A native structure) and ID14-eh1 (UmAbf62A-L-arabinofuranose complex) at the European Synchrotron Radiation Facility (Grenoble, France), and on beamline PROXIMA 1 at SOLEIL (Gif sur Yvette, France) for the native UmAbf62A structure. The diffraction data were processed, scaled, and merged using the XDS package (21).
The native structure of UmAbf62A was solved by the sulfur-SAD phasing method using data collected in-house at 1.7-Å resolution from a single crystal (Table 1). A highly redundant data set was collected at the Cu-Kα wavelength covering 1425° of total rotation with a 1° oscillation width per frame. A native dataset was collected to 1.0 Å on beamline PROXIMA 1 at SOLEIL.
The anomalous signal of sulfur atoms from 6 methionine residues and 2 cysteine residues was evaluated and the eight sulfur atom positions and occupancies were determined and refined with SHELX C/D (22,23). The initial phase calculation and phase improvement were performed using SHELX E (23). The quality of the phases was sufficient (figure of merit = 0.60 for all reflections) to obtain an interpretable map at 1.7-Å resolution after solvent flattening. An initial model containing about 90% of the whole protein sequence was automatically generated using the ARP/wARP web service (24). The missing parts were then manually built in σA-weighted electron density maps using COOT (28), followed by refinement steps using Refmac5 (25-27). The final R and Rfree were 0.13 and 0.15, respectively.
The crystal structure of PaAbf62A was solved by molecular replacement using UmAbf62A as the model. All protein-ligand structures were directly refined using Refmac5 and the apoenzyme structures as starting models. The coordinates and structure factors were deposited in the Protein Data Bank at RCSB under the PDB identifiers 4N1I, 4N2R, 4N4B, and 4N2Z for UmAbf62A, UmAbf62A in complex with arabinose, PaAbf62A, and PaAbf62A in complex with cellotriose, respectively.
Soaking Experiments-C3 was obtained from Megazyme, arabinose from Sigma, and arabino-xylo-oligosaccharides (XA3XX) were prepared as described previously (29). A PaAbf62A-cellotriose complex was obtained by soaking a PaAbf62A crystal for 35 min in the reservoir solution supplemented with C3 at saturating concentration. To form a UmAbf62A·XA3XX complex, a crystal of UmAbf62A was soaked for 2 h in solution containing 20% (w/v) PEG 3350, 0.2 M sodium phosphate, 0.1 M sodium acetate (pH 3.0), 3 mM XA3XX, and 50 mM arabinose as cryoprotectant. After soaking, crystals were picked up using a nylon loop and flash-cooled in a stream of nitrogen gas.
Isothermal Titration Calorimetry-Isothermal titration calorimetry (ITC) experiments were conducted at 25°C in Tris-HCl buffer (pH 6.0) on a Microcal ITC200 instrument (GE Healthcare). To ensure minimal buffer mismatch, protein was dialyzed against 20 mM Tris-HCl buffer (pH 6.0) and the ligands were solubilized in the same buffer. Experiments consisted of a series of 20 × 2-µl or 40 × 1-µl injections of ligand (1.6-5 mM) into the protein-containing (PaAbf62A 114 µM, UmAbf62A 488 µM) thermostatic cell (initial delay of 60 s, duration of 4 s, and spacing of 100 s). ITC experiments were systematically performed in triplicate. The corrected binding isotherms were fitted for a single binding-site model using non-linear least squares analysis to obtain values of equilibrium binding constant (Ka), stoichiometry, and enthalpy changes (ΔH) associated with ligand binding. Other thermodynamic parameters were derived from the equation: −RT ln Ka = ΔG = ΔH − TΔS.
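The derivation of the remaining thermodynamic parameters from the fitted Ka and ΔH can be sketched as below. This is only an illustration of the relation −RT ln Ka = ΔG = ΔH − TΔS at 25°C; the function name and the numerical inputs in the example are hypothetical and are not values from this study.

```python
import math

# Sketch: derive dG and dS from a fitted association constant and enthalpy,
# using -RT ln Ka = dG = dH - T dS. Inputs in the example are hypothetical.

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def itc_thermodynamics(ka_per_molar, delta_h_j_per_mol, temp_k=298.15):
    """Return (dG in J mol^-1, dS in J mol^-1 K^-1) from Ka (M^-1) and dH."""
    delta_g = -R_GAS * temp_k * math.log(ka_per_molar)
    delta_s = (delta_h_j_per_mol - delta_g) / temp_k  # rearranged dG = dH - T dS
    return delta_g, delta_s


if __name__ == "__main__":
    # Hypothetical fit result: Ka = 1.0e4 M^-1, dH = -30 kJ mol^-1, T = 25 °C.
    dg, ds = itc_thermodynamics(1.0e4, -30_000.0)
    print(f"dG = {dg / 1000:.1f} kJ/mol, -T dS = {-298.15 * ds / 1000:.1f} kJ/mol")
```

A dissociation constant, where one is reported instead, is simply the reciprocal of Ka and can be converted before calling the function.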
Phylogenetic Clustering-A total of 106 sequences corresponding to individual GH62 modules were extracted from the CAZy database (5). They were aligned using the Muscle version 3.7 software (30), operating with default parameters. A matrix of maximum likelihood distances of the resulting aligned sequences was determined using Jones, Taylor, and Thornton (JTT) distances (31) and subsequently subjected to hierarchical clustering using the method of Ward (32).

RESULTS
Phylogenetic Analysis of GH62 Family-To better characterize and compare the two fungal arabinofuranosidases presented in this work, a phylogenetic analysis of the GH62 family was carried out using publicly available bacterial and fungal sequences. Of the 106 sequences extracted from the CAZy database (12th September 2013), only 17 enzymes (including PaAbf62A and UmAbf62A) have been characterized (EC 3.2.1.55). The bioinformatics analysis revealed two distinct subfamilies that did not correspond to taxonomical segregation (Fig. 1). PaAbf62A and UmAbf62A, which share 38% identity, belong to two distinct subfamilies termed GH62-1 and GH62-2, respectively. Subfamily GH62-1 includes 30 members that are clustered into two distinct subgroups, the larger one having three biochemically characterized members, whereas none of the five members of the smaller subgroup has been characterized at present. On the other hand, subfamily GH62-2 is more homogeneous and contains 76 members, with over a dozen characterized members, which appear to display shorter loop structures than their GH62-1 counterparts.
Comparative Activity-After cloning and heterologous expression in P. pastoris, the specificity of UmAbf62A was investigated and compared with that of PaAbf62A (18). The capacity of UmAbf62A and PaAbf62A to hydrolyze pNP-Ara, as well as arabinose-containing polysaccharides, was evaluated. Although it is generally claimed that GH62 enzymes are not able to hydrolyze pNP-Ara (33), enzyme activity was detected using pNP-Ara as substrate for both enzymes and the Michaelis-Menten parameters were determined at their pH and temperature optima (pH 5.0, 37°C). Both PaAbf62A and UmAbf62A displayed relatively weak recognition of pNP-Ara with Km values of 6.1 and 7.5 mM, respectively. In terms of catalytic efficiency, PaAbf62A was ∼12-fold more active than UmAbf62A against pNP-Ara (Table 2). The investigation of polysaccharide hydrolysis by PaAbf62A and UmAbf62A using wheat arabinoxylan, sugar beet arabinan, and debranched or linear arabinan revealed that both enzymes released arabinose as the sole product, irrespective of the substrate. Moreover, on wheat arabinoxylan, both enzymes displayed similar catalytic rates (174 and 173 min⁻¹ for PaAbf62A and UmAbf62A, respectively), but UmAbf62A showed an ∼2-fold higher catalytic rate on sugar beet arabinan (460 min⁻¹ compared with 192 min⁻¹ for PaAbf62A). Neither PaAbf62A nor UmAbf62A displayed activity on debranched arabinan (data not shown), indicating that the hydrolytic action of these enzymes was exclusively directed toward the α-1,2 and/or the α-1,3 bonds that link side chain L-arabinofuranosyl units to the main chains of the polymers under study. Finally, determination of kcat for the different reactions clearly revealed that both enzymes displayed much higher catalytic rates on natural polysaccharide substrates than on the artificial substrate pNP-Ara.

[Figure legend fragment; beginning missing] (43). The reliability of the tree was analyzed by bootstrap analysis of 100 resamplings of the data set. The phylogenetic tree was prepared using Dendroscope. un, uncultured microorganism.

JOURNAL OF BIOLOGICAL CHEMISTRY 5265

Three-dimensional Structures-Crystallization trials using recombinant UmAbf62A yielded orthorhombic crystals that diffracted up to 1.7 Å using an in-house x-ray diffractometer. A sulfur-SAD phasing experiment provided the means to solve the phase of the crystallographic structure of the apoenzyme.
Initial crystallization experiments using PaAbf62A all resulted in crystal twinning, which was attributed to the presence of protein glycosylation. The presence of N-glycosylation sites in the protein sequence was predicted using the NetNGlyc 1.0 server, and indeed MS/MS analysis on N-deglycosylated PaAbf62A confirmed this and unambiguously identified the Asn56 residue of the peptide 49-DPTIILIN(HexNAc)GTHHVFASTAQ-67 to bear the residual HexNAc sugar moiety. Similarly, MS/MS analysis of peptide 83-DAPN(HexNAc)ATFYYLDQAPLGTGYR-102 localized the HexNAc moiety on the Asn86 residue. In addition, the N-terminal peptide DVSIVQLSNQPPS (which is not observed in the x-ray structure) was found to bear a hexose moiety but the amino acid bearing the modification could not be determined. PaAbf62A was thus enzymatically deglycosylated, a procedure that ultimately provided the means to obtain monoclinic crystals that diffracted up to 1.44 Å. The three-dimensional structure was solved using the apo-structure of UmAbf62A as the template for molecular replacement. The statistics of data collection and processing for both proteins are summarized in Table 1.
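The residue numbering of the two modified asparagines can be cross-checked against the canonical N-glycosylation sequon, as in the sketch below. This is a simple N-X-S/T pattern scan (X being any residue except proline), not the NetNGlyc neural-network predictor used in this work; the function name is hypothetical, and the peptide strings are those quoted above with the HexNAc annotation removed.

```python
import re

# Sketch: locate N-X-[S/T] sequons (X != Pro) in the two peptides quoted above
# and convert their offsets into sequence numbering. This is a plain pattern
# scan, not the NetNGlyc predictor. A lookahead keeps overlapping sequons.

SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide, first_residue_number):
    """Return sequence positions of Asn residues that start an N-X-S/T sequon."""
    return [first_residue_number + m.start() for m in SEQUON.finditer(peptide)]


# Peptide sequence -> number of its first residue, as given in the text.
PEPTIDES = {
    "DPTIILINGTHHVFASTAQ": 49,
    "DAPNATFYYLDQAPLGTGYR": 83,
}

if __name__ == "__main__":
    for sequence, start in PEPTIDES.items():
        print(sequence, sequon_positions(sequence, start))
```

Under these assumptions the scan returns positions 56 and 86, in agreement with the Asn56 and Asn86 assignments obtained by MS/MS.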
The refined structure of UmAbf62A was solved at a resolution of 1.0 Å and comprised 312 of 336 residues of the recombinant protein. Final crystallographic Rwork and Rfree values were 0.13 and 0.15, respectively, and more than 98% of the residues were in the allowed regions of the Ramachandran plot. One molecule was present in the asymmetric unit. The overall structure is composed of a single domain displaying a five-bladed β-propeller fold (Fig. 2), which is consistent with previous predictions and the fact that GH62 belongs to clan GH-F of the glycoside hydrolase classification. Each of the five radially oriented blades comprises four antiparallel β-strands with the first blade forming a so-called "molecular Velcro" (34,35). In this blade, the first three inner strands are contributed by the C-terminal end of the polypeptide chain and the fourth, outer strand by the N terminus (blade I: residues 18-20, 301-308, 281-287, and 260-264) (Fig. 2). One α-helix is observed in the C-terminal region, which is formed by extra amino acids resulting from the cloning of the protein. Additionally, a disulfide between Cys10 and Cys278 constrains the solvent-exposed N-terminal part of the polypeptide chain in the vicinity of the loop connecting strands 18 (residues 260-264) and 19 (residues 281-287). This loop obstructs the bottom of the tunnel delineated by the five blades of the β-propeller.
The structure of PaAbf62A was solved at a resolution of 1.44 Å with one molecule per asymmetric unit. It comprises 319 amino acid residues of 335 for the mature polypeptide chain with final crystallographic Rwork and Rfree of 0.14 and 0.16, respectively. More than 99% of the residues are in allowed regions of the Ramachandran plot. The apo-structures of PaAbf62A and UmAbf62A are very similar, with a root mean square deviation of 1.4 Å based on the superposition of 288 Cα atoms (Fig. 2). The main differences are observed in loop regions and are due to insertions in the primary structure of PaAbf62A, which is longer than that of UmAbf62A, as evidenced by sequence alignment of the GH62 family (supplemental Fig. S1). The biggest loop extension concerns the loop that connects strand β9 from blade 3 to strand β10 from blade 4. Moreover, PaAbf62A contains two disulfide bridges, formed between Cys170 and Cys175, and between Cys303 and Cys337, respectively, whereas UmAbf62A only contains one, formed between Cys10 and Cys278. The spatial location of this single disulfide bridge in UmAbf62A is similar to that of Cys10-Cys278 in PaAbf62A.
The Active Site-Based on homology with other members of clan GH-F whose catalytic residues have been identified (36,16), it is highly likely that the catalytic residues of UmAbf62A/PaAbf62A are Glu195/Glu216 (general acid) and Asp36/Asp49 (general base), respectively, with Asp143/Asp165 acting as the pKa modulator. Importantly, the distance measured between the putative general acid and general base amino acids is 8.0 and 8.1 Å for PaAbf62A and UmAbf62A, respectively, which is consistent with an inverting mechanism (37). It is noteworthy that the inverting mechanism has been established for the A. nidulans GH62 member.⁷ These amino acids are all conserved within the GH62 family, and form part of a pocket, the bottom of which is lined by a histidine residue, His261/His285, in UmAbf62A/PaAbf62A, respectively. This histidine residue is part of a SHG motif, which is also conserved within the GH62 family (supplemental Fig. S1), and is stabilized by interactions with a calcium ion that, in both enzymes, is located in the center of the five-bladed β-propeller and coordinates His261/His285 in UmAbf62A/PaAbf62A, respectively (Fig. 3). Nevertheless, the coordination of the calcium ion in UmAbf62A and PaAbf62A structures is not identical because in the former the calcium is hepta-coordinated with a pentagonal bipyramidal ligation involving His261-Nε2, Gln198-Nε2, and five water molecules, whereas in PaAbf62A, the calcium ion is hexa-coordinated with a square bipyramidal ligation involving His285-Nε2 and five water molecules (Fig. 7). It is noteworthy that the structures of some GH43 members (e.g. the structure of BT2895 from Bacteroides thetaiotaomicron, Protein Data Bank code 3KST) also contain a calcium ion that displays the same coordination as that observed in PaAbf62A. Moreover, the chelation of calcium by EDTA or Chelex resin had no apparent effect on enzyme activity (data not shown). 
Coupled to the fact that the calcium ion has no direct interaction with the active site, this observation is consistent with a structural rather than functional role.
Interaction with Arabinose-To trap UmAbf62A complexed with an L-arabinofuranosyl moiety, crystals were soaked with XA3XX (29) at pH 3.0 to inactivate the enzyme (38) and L-arabinose was added to the soaking solution as a cryoprotectant. Under these experimental conditions, we hoped to observe the pentasaccharide, but in fact the electron density was too poor to reliably discern the xylosyl moieties. Consequently, only L-arabinofuranose could be built into the electron density maps (Fig. 4). The arabinofuranose is housed in the pocket described above, which constitutes the −1 subsite. The C1 hydroxyl group was observed in both α- and β-anomeric configurations. Thus, the C1-O1 covalent bond of the α-L-arabinofuranose could be considered as the scissile bond position of the substrate (α-L-arabinofuranosyl moiety), or as the anomeric oxygen of the product (β-L-arabinofuranose). In the β-configuration, O1 replaces a water molecule (Wat894) present in the apo-structure (Fig. 4). This water molecule is hydrogen bonded to the Oδ1 of the catalytic general base, Asp36, and is likely to play the role of the solvent nucleophile used by inverting enzymes. The arabinose ring is stacked against Tyr58 and hydrogen bonded to Tyr296. The O2 of the arabinose is within hydrogen bonding distance of Oδ2 of the pKa modulator Asp143, and Nδ1 of His261. Moreover, O4 of the arabinose appears to be hydrogen bonded to Lys35 (Fig. 4). Unlike UmAbf62A, no complex of PaAbf62A with arabinose could be obtained, although a complex with C3 was obtained. The details and implications of this are discussed hereafter.

Structural Characterization of GH62 α-L-Arabinofuranosidases
FEBRUARY 21, 2014 • VOLUME 289 • NUMBER 8
Interaction of PaAbf62A with Cellulose Derivatives-Hydrolysis tests clearly demonstrated that PaAbf62A was unable to hydrolyze cellulose-based substrates, such as 2-HEC, carboxymethylcellulose, Avicel, filter paper, and cello-oligosaccharides (data not shown). However, this enzyme was able to bind to Avicel with extremely weak affinity (data not shown) and to 2-HEC, with a dissociation constant (Kd) for 2-HEC of 0.77 mg ml⁻¹ (Fig. 5). To evaluate the affinity of PaAbf62A for cello-oligosaccharides, ITC was performed using C3 and C6. When using C6 as the ligand, a Kd value in the range of 80 µM was determined compared with 12 µM for xylohexaose (X6) (Fig. 6). However, no binding to C3 was detected. Likewise, in similar assays, UmAbf62A failed to display any binding to either of the two cello-oligosaccharides or to xylohexaose (Fig. 6). In addition, binding to 2-HEC could not be determined because UmAbf62A did not penetrate the native gels. This result emphasizes the unexpectedness of PaAbf62A behavior toward C6 and 2-HEC, especially considering that the enzyme does not possess a carbohydrate binding module. Taken together, these results imply that cello-oligosaccharide/2-HEC binding by PaAbf62A might be due either to direct binding in the active site cleft, or to binding at secondary sites. Therefore, to further investigate this binding phenomenon, the inhibition of PaAbf62A-mediated hydrolysis of pNP-Ara by cello-oligosaccharides was studied. This revealed that, unlike UmAbf62A, the activity of PaAbf62A was inhibited by cello-oligosaccharides or cellulose derivatives and that the strength of this inhibition (expressed as an IC50) was correlated to the chain length of the cello-oligosaccharides (i.e. inhibition strength C3 < C4 < C5 < C6, with IC50 values of 20, 8, 2.2, and 0.8 mM, respectively). C6 acted as an inhibitor of PaAbf62A with a Ki of 3.4 mM (Table 2), which is roughly 40-fold higher than the Kd value determined using ITC for C6. 
The explanation for this unexpected result is not apparent, but clearly it is highly unlikely that the non-reducing terminal glucosyl moiety of C6 actually binds in the place of the L-arabinosyl moiety. Therefore, although the model used to calculate the K i was that of competitive inhibition, the actual inhibition mode was probably more complex, with some binding configurations failing to (fully) inhibit the hydrolysis of pNP-Ara.
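The comparison between the binding and inhibition constants quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it reads the ITC dissociation constants as micromolar (80 μM for C6 and 12 μM for X6, an assumption about the unit prefix), and the variable and function names are hypothetical choices made here rather than part of the original analysis.

```python
# Constants quoted in the text, converted to molar units.
# Assumption: the ITC Kd values are micromolar.
KD_C6_ITC = 80e-6   # Kd of PaAbf62A for C6 (ITC)
KD_X6_ITC = 12e-6   # Kd of PaAbf62A for X6 (ITC)
KI_C6 = 3.4e-3      # Ki of C6 for pNP-Ara hydrolysis (competitive model)

# IC50 values in mM, keyed by the degree of polymerization of the
# cello-oligosaccharide (C3 to C6).
IC50_MM = {3: 20.0, 4: 8.0, 5: 2.2, 6: 0.8}


def fold_ratio(a: float, b: float) -> float:
    """Return a / b, i.e. how many times larger a is than b."""
    return a / b


def ic50_step_ratios(ic50_by_dp: dict) -> dict:
    """For each chain length n, return IC50(n - 1) / IC50(n)."""
    dps = sorted(ic50_by_dp)
    return {n: ic50_by_dp[m] / ic50_by_dp[n] for m, n in zip(dps, dps[1:])}


ki_over_kd = fold_ratio(KI_C6, KD_C6_ITC)
c6_over_x6 = fold_ratio(KD_C6_ITC, KD_X6_ITC)
steps = ic50_step_ratios(IC50_MM)

print(f"Ki(C6) / Kd(C6) = {ki_over_kd:.1f}")
print(f"Kd(C6) / Kd(X6) = {c6_over_x6:.1f}")
for n, ratio in steps.items():
    print(f"IC50(C{n - 1}) / IC50(C{n}) = {ratio:.2f}")
```

Under these assumptions the Ki for C6 is about 40 times the ITC-derived Kd, and each additional glucosyl unit lowers the IC50 by a factor of roughly 2.5 to 3.6, which is the chain-length dependence described in the text.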
Positive Subsite Interactions in PaAbf62A-In light of the previous results, it is somewhat surprising that C3 is the only ligand that was found to form a crystallographic complex with PaAbf62A. In this complex, the trisaccharide is located in front of the catalytic pocket, which houses a Tris molecule, in the central region of the roughly straight surface channel, confirming the identity of this topological feature as the substrate binding cleft, which can accommodate polymer backbones. According to the protein-ligand complex described in this work, the PaAbf62A substrate binding cleft comprises at least four subsites, with the scissile bond being located between subsites −1 and +1, the latter being occupied by the central glucosyl moiety of cellotriose (39). Likewise, the adjacent subsites +2NR and +2R are occupied by the non-reducing and reducing glucosyl moieties, respectively (Fig. 7). Further subsites might exist at either the reducing (i.e. +3R) or the non-reducing side (i.e. +3NR); these are not evidenced by our data, although they are implied by the fact that C6 displays the highest inhibitory potential of the cello-oligosaccharides tested. If subsites +3R and +3NR do exist, they would be highly solvent exposed.
The glucosyl residue at the +2R subsite stacks with the Tyr 162 residue, which is present in half of the GH62-1 subfamily, and is stabilized by two strong hydrogen bonds between its O6 and the hydroxyl of Tyr 101 and Nε2 of Glu 121, respectively. Moreover, the O2 of this glucosyl moiety interacts with Nδ2 of Asn 321. Interestingly, Tyr 101 is only present in the GH62-1 subfamily, where it is well conserved. Significantly, in UmAbf62A the equivalent of Tyr 101 is Trp 104, with a tryptophan being strictly conserved at this position in the subfamily GH62-2. Finally, it is noteworthy that in the few cases where Tyr 101 is absent from subfamily GH62-1, it is replaced by a tryptophan residue, like Trp 104 in UmAbf62A (Fig. 6 and supplemental Fig. S1).
FIGURE 4. UmAbf62A in complex with arabinose. A, structure of arabinose in the active site pocket of UmAbf62A. Sugar is displayed as stick and protein as a schematic. Catalytic amino acids are named using the one-letter code of the amino acid followed by the position in the sequence. The electron density has been contoured to 1σ. B, schematic diagram showing hydrogen-bonding, water-bridged, and hydrophobic interactions between UmAbf62A and arabinose. Amino acid residues of UmAbf62A that have hydrophobic interactions with arabinose are shown as spiked spheres (with distances of less than 3.5 Å). Direct and water-bridged hydrogen-bonding interactions are indicated by dashed lines. This diagram and others were obtained using the LIGPLOT program (44).
At the +1 subsite, Tyr 320 displays a stacking interaction with the glucose, whereas the O2 atom of the sugar points toward the catalytic pocket, making a hydrogen bond with the Tris molecule. The Oε1 atom of Glu 216, the predicted catalytic general acid, also interacts with the O2 of the glucose at the +1 subsite. This is consistent with the fact that the O2 of the sugar moiety in subsite +1 (i.e. a xylosyl or arabinosyl group if the substrate is an arabinoxylan) is engaged in the scissile glycosidic bond that links to the arabinosyl moiety bound in subsite −1. Moreover, the O3 of the glucose bound in the +1 subsite displays strong interactions with the Oε2 of Glu 216 and Nη2 of Arg 241, respectively, suggesting that the enzyme might hydrolyze both α-1,2 and α-1,3 linkages.
In addition to its interactions with the glucosyl moiety in subsite +1, the Nη2 of Arg 241 also makes a bidentate hydrogen bond to O5 and O6 of the glucosyl moiety in subsite +2NR, suggesting that this residue might play an important role. This observation is consistent with the fact that Arg 241 is highly conserved in GH62. Overall, the glucosyl moiety present in +2NR interacts strongly with Arg 241 through its O6, and with Arg 211 via its O2 and O3. Unlike Arg 241, Arg 211 is not conserved in the GH62 family and the corresponding residue in UmAbf62A is Glu 190. Finally, a water molecule (Wat 819) is located at the same position as the water molecule Wat 894 in UmAbf62A, implying that it could take the role of the solvent nucleophile in PaAbf62A.
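The residue contacts described for the positive subsites can be collected into a small lookup table, which makes it easy to see which residues bridge neighboring subsites. The following Python sketch is only an illustrative summary: the dictionary layout, the function name `residues_spanning_subsites`, and the compact residue labels (e.g. `Arg241`) are choices made here, not part of the original analysis. Only residue-level contacts are recorded, and the Tris molecule is omitted because it is a ligand rather than a protein residue.

```python
# Residue contacts per subsite in the PaAbf62A-C3 complex, transcribed from
# the description in the text (residue names only, no atom-level detail).
SUBSITE_CONTACTS = {
    "+2R": ["Tyr162", "Tyr101", "Glu121", "Asn321"],
    "+1": ["Tyr320", "Glu216", "Arg241"],
    "+2NR": ["Arg241", "Arg211"],
}


def residues_spanning_subsites(contacts):
    """Return {residue: [subsites]} for residues contacting more than one subsite."""
    by_residue = {}
    for subsite, residues in contacts.items():
        for residue in residues:
            by_residue.setdefault(residue, []).append(subsite)
    return {res: subs for res, subs in by_residue.items() if len(subs) > 1}


shared = residues_spanning_subsites(SUBSITE_CONTACTS)
print(shared)  # {'Arg241': ['+1', '+2NR']}
```

In this summary, Arg 241 is the only residue that contacts two subsites (+1 and +2NR), in line with the suggestion that it plays an important role.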

DISCUSSION
Structural Characterization of GH62 α-L-Arabinofuranosidases-This study has revealed the first structures of two GH62 family members in their apo forms, and in complex with either arabinose or C3. The data presented demonstrate that GH62 arabinofuranosidases share common structural features with GH43 arabinofuranosidases, because both families are characterized by a five-bladed β-propeller fold. The catalytic apparatus is localized in a pocket that can accommodate the arabinose sugar and constitutes the −1 subsite. The three predicted catalytic residues interact either with the arabinose in the −1 subsite in the complexed form or with a water molecule in the apo form. This water molecule is at the same position as the anomeric hydroxyl group in the complex with arabinose. Furthermore, the water molecule is absent when arabinose is in an inverted conformation compared with the substrate, suggesting not only that water molecules Wat 894 and Wat 819 in UmAbf62A and PaAbf62A, respectively, could play the role of the catalytic water molecule, but also that hydrolysis would occur with an inverting mechanism. Taken together, these data strongly imply that the GH62 family belongs to the same clan as the GH43 family (i.e. clan GH-F), which is consistent with the predictions made by Pons et al. (16). Furthermore, it is interesting to note that His 161/His 285 in UmAbf62A/PaAbf62A, respectively, which coordinates the calcium ion and binds the arabinose in the −1 subsite, is also conserved in the GH43 arabinofuranosidases and arabinanases from Bacillus subtilis (PDB code 3C7E) and Cellvibrio japonicus (PDB code 3QED), respectively. Consistent with this observation, we suggest that this histidine plays an important functional role in arabinose hydrolysis within the clan GH-F.
In this work, tentative identification of the positive subsites involved in PaAbf62A xylan binding has been performed using the PaAbf62A-C3 complex. We believe that this extrapolation is valid, because as well as binding X6, PaAbf62A also binds C6. Moreover, taking into account the fact that the conformations of β-glucan and xylan main chains are similar, it is noteworthy that PaAbf62A is more specific for xylans than arabinans, the latter displaying a quite different main chain conformation. Nevertheless, caution is required because examination of the complex also highlighted specific glucan-binding features, such as the stabilizing hydrogen bonds that are formed with O6 of the primary alcohol of glucosyl moieties that occupy the +2R and +2NR subsites (Fig. 7). Obviously, in the case of arabinoxylans these interactions do not occur, because the 6-hydroxyl group of glucopyranose is absent in xylopyranose. This might explain why no xylose could be attributed using XA3XX (data not shown), whatever the soaking condition tested, and it suggests that xylan binding involves other interactions that account for the specific backbone conformation of xylo-oligosaccharides and xylans.
Sugar is displayed as stick and protein as a schematic. Catalytic amino acids are named using the one-letter code of the amino acid followed by the position in the sequence. The electron density is displayed in dark blue and has been contoured to 1σ. B, schematic diagram showing hydrogen-bonding, water-bridged, and hydrophobic interactions between PaAbf62A and cellotriose. Amino acid residues of PaAbf62A that have hydrophobic interactions with cellotriose are shown as spiked spheres (with distances of less than 3.5 Å). Direct and water-bridged hydrogen-bonding interactions are indicated by dashed lines. C, superimposition of the complex structures of PaAbf62A and UmAbf62A; sugars are colored in magenta, UmAbf62A is colored in gold, and PaAbf62A is colored in cyan.