Crystal structure of barley 1,3-1,4-β-glucanase at 2.0-Å resolution and comparison with Bacillus 1,3-1,4-β-glucanase.

Both plants and bacteria produce enzymes capable of degrading the mixed-linked β-glucan of the endosperm cell walls of cereal grains. The enzymes share specificity for the β-1,4 glycosyl bonds of O-3-substituted glucose units in linear polysaccharides and have a similar cleavage mechanism, but they are unrelated in sequence and tertiary structure. The three-dimensional structure of the 1,3-1,4-β-glucanase isoenzyme EII from barley was determined from monoclinic crystals at a resolution of 2.0 Å. The protein is folded into a βα₈ barrel structure, as has been shown previously (Varghese, J. N., Garrett, T. P. J., Colman, P. M., Chen, L., Hoj, P. B., and Fincher, G. B. (1994) Proc. Natl. Acad. Sci. U.S.A. 91, 2785-2789) by diffraction analysis of tetragonal crystals at lower resolution. It contains one N-glycosylation site, which is described in detail with the sugar moieties attached to residue Asn190. The geometry and hydration of the barley 1,3-1,4-β-glucanase are analyzed; a model β-glucan fragment is placed into the binding site by molecular dynamics simulation, and the β-glucan binding grooves of the plant and bacterial enzymes are compared. Their active sites are shown to have a small number of common features in generally dissimilar geometries that serve to explain both the identical substrate specificity and the observed differences in inhibitor binding.

There is neither sequence nor structural homology between the plant and the bacterial enzymes, and following the universally adopted nomenclature (6), they belong to families 17 and 16 of glycosyl hydrolases, respectively. Despite this difference, they cleave the same substrate at the same site and are inhibited by the same covalently binding inhibitors, 3,4-epoxyalkyl-β-D-cellobiosides (12). However, the barley 1,3-1,4-β-glucanase binds the epoxide preferentially with a propyl linker, whereas the B. subtilis enzyme (SUB) prefers a butyl linker (13). Because both endohydrolases follow the same stereochemical pathway in glycosyl bond cleavage, with retention of the β-configuration at the anomeric carbon (14,15), the differences in inhibitor binding are surprising at first sight and warrant a structural explanation. A cleavage mechanism with overall retention of configuration requires the presence in the catalytic site of a general acid separated by about 5.5 Å from a nucleophilic residue (6). These roles have been assigned to Glu288 and Glu232 of EII, respectively (3).
Here we present the structure analysis of monoclinic crystals of the barley 1,3-1,4-β-glucanase (isoenzyme EII) at 2.0-Å resolution. We discuss structural differences between three molecules of EII in two space groups and compare the active center structures of plant and Bacillus 1,3-1,4-β-glucanases. Structural reasons for inhibitor binding preferences are considered, and possible conformations and positions of a hexameric β-glucan fragment are investigated by molecular dynamics calculations.

Diffraction Data and Structure Analysis by Molecular Replacement-The purification, crystallization, and x-ray diffraction data of barley 1,3-1,4-β-glucanase, isoenzyme EII, have been described (16). The diffraction data set was collected from one crystal at room temperature with a 180-mm MarResearch imaging plate system on an Enraf Nonius FR571 rotating anode x-ray generator (45 kV, 90 mA). The images were evaluated with MOSFLM, AGROVATA, and ROTAVATA from the CCP4 suite (20). Relevant crystallographic parameters in the monoclinic space group P2₁ are summarized in Table I. Space group and unit cell dimensions are consistent with the presence of two protein molecules per asymmetric unit. The 40,727 unique reflections correspond to 99.3% of the observations expected for 13.65 to 2.0 Å resolution, and the completeness in the outermost resolution shell from 2.09 to 2.0 Å is 96.6%. The crystal and x-ray diffraction data for EII in the tetragonal space group P4₃2₁2 (Ref. 3; PDB entry 1ghr) are added to Table I for comparison. Since the structure of EII is known for the space group P4₃2₁2, the phases of the structure factors could be determined by molecular replacement using XPLOR (17). Two molecules in the asymmetric unit (Mol1 and Mol2) were found by rotation and translation search. The search model 1ghr (3) gave two single peaks 10σ and 9σ above the average in the Patterson correlation function. After translation search (15.7σ and 14.8σ), rigid body, and packing refinement against data between 13.65 and 3 Å, the R value was 31.4%.
* This work was supported by the Deutsche Forschungsgemeinschaft through He 1318/9-3 and the Fonds der Chemischen Industrie. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Structure Refinement-For refinement, 5% of the reflections were set aside as a test set for the calculation of Rfree (18). Rfree was used to monitor convergence throughout. After 100 steps of minimization with the standard repel nonbonded energy function followed by 50 steps of Powell Lennard-Jones minimization (Rwork = 28.3%, Rfree = 32.4%), a simulated annealing run following the standard slow-cooling protocol (19), with structure amplitudes F > 2σ(F) between 13.65 and 2.0 Å and restraints for non-crystallographic symmetry (NCS), converged with Rwork = 27.2% and Rfree = 31.2% (no NCS: Rwork = 26.3%, Rfree = 31.1%). In the following steps, NCS restraints were used for all but those residues making crystallographic or non-crystallographic intermolecular contacts. Subsequent B-value refinement resulted in Rwork = 25.2% and Rfree = 28.5%. Electron density maps were calculated with XPLOR and CCP4 (20) programs and displayed in O (21). Further refinement was performed with XPLOR/O, combining positional refinement with overall and atomic B-value refinement. After several cycles of positional and B-value refinement and manual revision of the model via electron density and difference density maps, water molecules were added until most significant peaks (>4σ) disappeared from the difference electron density map. With 418 water molecules in the asymmetric unit and reduced NCS restraints for both molecules, the final Rwork was 17.1% and Rfree 21.2%.
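The bookkeeping behind Rwork and Rfree can be illustrated with a minimal Python sketch. It is not the XPLOR implementation: the single overall scale factor, the random selection of the test set, and the function names `r_factor` and `split_test_set` are assumptions made for illustration only.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional R value: sum|Fo - k*Fc| / sum(Fo).

    k is a single overall scale factor (a simplifying assumption;
    refinement programs use more elaborate scaling).
    """
    scale = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

def split_test_set(n_reflections, fraction=0.05, seed=0):
    """Randomly set aside a fraction of reflection indices as a test set.

    Returns (work_indices, test_indices). R computed on the work set is
    Rwork; R computed on the untouched test set is Rfree.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = round(n_reflections * fraction)
    test = set(indices[:n_test])
    work = [i for i in indices if i not in test]
    return sorted(work), sorted(test)
```

Because the test reflections never enter the minimization target, a gap between the two values that keeps widening would point to overfitting, which is why Rfree serves as the convergence monitor.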
To start the molecular dynamics simulation, a FORTRAN program written for this purpose, but not specific for the glucanase molecule, first filled a P1 cell with dimensions 66.379 × 60.241 × 52.643 Å with TIP3 waters provided by the XPLOR package (24). After this procedure, molecule Mol1 of EII was oriented with its inertia equivalent ellipsoid axes parallel to the P1 cell axes. The overlapping water molecules were removed so that Mol1 finally was embedded in 5,069 non-overlapping water molecules. The water box is large enough to cover the substrate by a water shell of at least 13-Å thickness. The charged residues of the protein were made net neutral to mimic the effects of solvation and counterions. The nonbonded interaction cutoff was set to 8.5 Å. The molecular dynamics calculation, done with XPLOR and preceded by 120 steps of standard repel nonbonded energy minimization and 80 steps of Powell Lennard-Jones minimization in vacuum, started with the fixed enzyme-substrate complex to relax the water molecules around the macromolecule. The solvent molecules were relaxed further during 120 steps of energy minimization followed by molecular dynamics over 5 ps at 100 K, 5 ps at 200 K, and 20 ps at 300 K with 0.2-ps reassignment using the CHARMM force field. The energy was then minimized for the solute (100 steps Powell), followed by several steps of molecular dynamics for the solute (5 ps at 100 K, 5 ps at 200 K, 5 ps at 300 K). Finally, molecular dynamics were run for 80 ps at 300 K (with temperature coupling) for the whole system, followed by 80 steps of Powell Lennard-Jones minimization. Over the entire simulation the EII molecule remained fixed, but those side chains in the binding cleft containing atoms in a 4-Å surface shell, the hexaglucan, and the waters were allowed to move. Alternative protocols without water relaxation at 100, 200, and 300 K and without temperature coupling resulted in similar sugar conformations and positions.
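The staged dynamics protocol is easier to check when written down as data. The following Python sketch only tabulates the stages quoted above and sums their durations; the tuple layout and the helper name `total_time_ps` are hypothetical and are not part of XPLOR.

```python
# Each stage: (moving part, temperature in K, duration in ps), as quoted in the text.
STAGES = [
    ("solvent", 100, 5.0),
    ("solvent", 200, 5.0),
    ("solvent", 300, 20.0),
    ("solute", 100, 5.0),
    ("solute", 200, 5.0),
    ("solute", 300, 5.0),
    ("whole system", 300, 80.0),
]

def total_time_ps(stages, part=None):
    """Sum the simulated time, optionally restricted to one moving part."""
    return sum(t for name, _, t in stages if part is None or name == part)

if __name__ == "__main__":
    for part in ("solvent", "solute", "whole system"):
        print(f"{part}: {total_time_ps(STAGES, part)} ps")
    print(f"total: {total_time_ps(STAGES)} ps")
```

Under this reading, 30 ps of solvent relaxation and 15 ps of solute warm-up precede the 80-ps production run, for 125 ps of dynamics in total.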

RESULTS AND DISCUSSION
Refinement Results-After combining working and test data sets, final positional and B-value refinement yielded an R-value of 17.0%. The R-value was 16.5% for all observations with a low resolution cutoff at 6 Å as used in the previous structure analysis (3). Two residues of N-acetyl-D-glucosamine (NAG) per molecule of the barley 1,3-1,4-β-glucanase EII were located at the glycosylation site, Asn190. The side chains of Asp183 and Ile304 in Mol1 and Gln41 in Mol2 were refined in two alternative positions with 2/3 and 1/3 occupancy, and two acetate molecules were identified in nonrelated positions. In the Ramachandran diagram, more than 91% of the non-glycine and non-proline residues are within the most favored regions, and all other residues are within additional allowed regions. Two cis-peptides preceding Pro137 and Ala276, identical to those in 1ghr, are present in both Mol1 and Mol2. The average correlation of the calculated Fc map with the 2Fo − Fc electron density map is better than 0.9 for both molecules according to an analysis with O (21). The correlation coefficient drops below 0.7 for the highly flexible side chains of Arg100, Arg197, and Arg261, which have B values of about 50 Å² and are not sterically restricted by hydrogen bonds or van der Waals contacts in either molecule.
Stereochemical parameters of the main and side chains are strongly restrained by standard weights and are better than or inside of the bandwidth defined in PROCHECK (25) for structures with comparable resolution. The final atomic coordinate set representing the monoclinic form of EII was scrutinized with WHATCHECK (26), PROCHECK, and several programs of the CCP4 suite. The results are summarized in Tables I and II. The experimental data and the refined atomic coordinates of both molecules in the asymmetric unit were submitted to the Brookhaven Protein Data Bank (27), entry code 1aq0.
Overall Structure and Crystal Packing-The overall structure and the active site of EII have been described by Varghese et al. (3) and are very similar in the monoclinic crystal form. The overall shape of the 1,3-1,4-β-glucanase can be approximated by its inertia equivalent ellipsoid with half-axes of 29.1 × 25.2 × 16.4 Å. The global folding pattern belongs to the βα₈ barrel type, although β-strand number 8 of the barrel is truncated to just two residues (Fig. 1). The active site is located in an open cleft at the bottom of the barrel defined by the C-terminal ends of the parallel intra-barrel β-strands. It is about 36 Å long and 8 to 9 Å deep, allowing the binding of oligosaccharide substrates. The two molecules of EII in the asymmetric unit make numerous intermolecular contacts with distances between 2. [...] shows the residues in the cleft to be structurally conserved to a higher degree than those in the rest of the molecules (Table III).
Structural Heterogeneity, Subdomains, and Hydration-The most prominent difference between the main chains of the three molecules is found for amino acid residues 190-200 of Mol2 compared with the other two (Fig. 2). In this region, the structure of Mol2 is influenced by the NCS interactions of Gly199 and Ala200 in a β-turn of Mol1 with Val196 in Mol2, and by numerous crystallographic contacts to Mol2 itself. The shift of the Cα backbone preceding the small β-sheet by a maximum of 4.8 Å at Thr194 indicates a certain flexibility of this region. Statistics measuring the degree of similarity of molecules (30) show clearly that Mol1 is more similar to 1ghr than is Mol2 (Table III). Therefore, all considerations of general aspects of the structures will refer to Mol1.
By using the program PUU (31), two structural domains were detected with a boundary crossing the βα₈ barrel perpendicular to the binding cleft. Residues Ile1 to Ser126 and Tyr273 to Phe306 constitute domain I; residues Val127 to Thr272 belong to the glycosylated domain II (right side in Fig. 3). Only two strands link the two domains, at Ser126/Val127 and Thr272/Tyr273. This assignment of structural domains differs from that proposed earlier (3) and is supported by a least-squares superposition of domains I of Mol1 and 1ghr yielding a rotation by about 1° between their domains II. The rotation axis traverses the center of mass of domain II and is inclined by about 50° against the cleft axis marked by a modeled hexaglucan substrate (see below) in Fig. 3. Interestingly, the two β-strands crossing the inter-domain boundary are not regularly structured. Strand β8 is truncated to just two residues in Mol1 and Mol2 and is next to the cis-peptide preceding Ala276. The long strand β5 is interrupted by a 3₁₀ turn at Gln129 to Ile131 where it crosses from domain I into domain II, also reflecting some structural irregularity. Functional implications of this domain structure will be discussed below.
A hexameric β-glucan model oriented such that the reducing end points to the right is shown together with the catalytic site residues Glu232 and Glu288, Tyr33, and the glycosylation site with NAG 1 1 and NAG 2 1. Presumably, the catalytic event is coupled with a movement of the left domain against the glycosylated right domain, thereby distorting the scissile bond of the β-glucan positioned directly at the domain boundary. Drawn with SETOR (40).
Water molecules in 130 pairs related by NCS were identified, allowing for a maximal separation of 1.5 Å. These water-binding sites, present in both Mol1 and Mol2, are expected to have biological relevance. Of these 130 sites, 37 coincide within 1.5 Å with one of the 48 water sites present in 1ghr, indicating that their positions are independent of crystal contacts and preparation conditions. The water distribution around EII is quite asymmetric, since the molecular surface near the C and N termini shows significantly higher hydration than other parts of the protein molecule. This is also true for the waters detected in the molecule 1ghr. Interestingly, only 12 waters out of the 130 are found within the substrate-binding cleft of EII. The lack of complete cleft hydration may be due to the presence of exposed hydrophobic residues (see below).
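The counting of conserved water sites can be sketched in a few lines of Python. The sketch assumes that both coordinate sets have already been superimposed into a common frame (for example by applying the NCS operator); the function name `count_conserved_sites` is hypothetical, and only the 1.5-Å cutoff is taken from the text.

```python
import math

def count_conserved_sites(sites_a, sites_b, cutoff=1.5):
    """Count sites in sites_a with at least one site of sites_b within cutoff (in Å).

    Both lists hold (x, y, z) tuples expressed in the same coordinate frame.
    """
    count = 0
    for a in sites_a:
        if any(math.dist(a, b) <= cutoff for b in sites_b):
            count += 1
    return count

if __name__ == "__main__":
    mol1_waters = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0), (10.0, 0.0, 0.0)]
    mol2_waters = [(0.5, 0.5, 0.5), (10.0, 1.6, 0.0)]
    print(count_conserved_sites(mol1_waters, mol2_waters))
```

The all-against-all comparison is quadratic in the number of sites, which is unproblematic for a few hundred waters per molecule.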
The Glycosylation Site-Barley 1,3-1,4-β-glucanase, isoenzyme EII, has only one site for potential N-glycosylation, the sequence Asn190-Ala191-Ser192. The attached carbohydrate moieties were analyzed by Harthill and Thomsen (4). They identified five different branched N-glycans composed of different sugars and present in relative amounts between 15 and 30%. All have a common core sequence starting with two β-1,4-linked NAG molecules. These two residues are represented by clear electron density (Fig. 4). They were identified in difference density maps at a late stage of refinement, because they were absent in 1ghr. The sugars in the two NCS-related protein molecules are not equally well defined. The average temperature factors are 30.0 and 60.7 Å² for NAG 1 1 and NAG 2 1 of Mol1, and 49.1 and 63.6 Å² for the corresponding residues in Mol2. Whereas NAG 1 1 and NAG 2 1 are well defined, a third sugar residue, α-L-fucose, attached to O-3 of NAG 1 2 is seen at very low electron density (0.3 e Å⁻³) but was not modeled. Both glycosylation sites are sterically accessible to different degrees.
The Active Site: Structure and Hydration-The substrate binding cleft of barley 1,3-1,4-β-glucanase was geometrically defined using SURFNET (32), and residue accessibilities were characterized by NACCESS (33). The largest cleft found by SURFNET (1 Å < sphere radius < 4 Å; cutoff distance between atoms and mask region 4 Å) has a volume of 2031 Å³ (Fig. 5, top) and comprises the catalytic site and the potential substrate binding site as described recently (3,23). The global dimensions are 7.5 Å in width, 8 to 9 Å in depth around the catalytic center, and 36 Å in length. All accessible atoms within a distance ≤4 Å from the cleft surface were selected, and their polarity was estimated by NACCESS. The resulting patterns are given in Table IV. The "left" side of the cleft (in the orientation of Fig. 5) is mainly decorated with apolar residues, whereas the "right" side, especially in the vicinity of the catalytic site, is covered with polar residues. The residues Tyr170, Glu232, Tyr33 [...] which possibly interact with the substrate (3) as partially proven by chemical probing (22).
Water molecules are mainly located at the polar right side of the cleft floor (Fig. 5), where several residues form hydrogen bonds to five water molecules. Only the three water molecules at the lowest end of the cleft are also present in the 1ghr model (3). Three water molecules form H bonds to the Ala174 backbone and the Tyr172 and Glu280 side chains at the left side. Thus, the mostly polar right half of the cleft is covered by a water layer. A comparison with the water structure in the active site of the Bacillus 1,3-1,4-β-glucanase will follow below.
Molecular Dynamics Simulation of Substrate Binding-A model of a hexaglucan substrate bound to the active site of EII was constructed to assist our understanding of the enzymology of the barley 1,3-1,4-β-glucanase. The β-glucan before and after the simulated docking procedure is shown in Fig. 6 (bottom). In the start configuration, C-1 of glucose residue Glc4 points to the nucleophile Glu232 of EII; Glc4 is situated halfway between Glu232 and the general acid Glu288; Glc3 O-6 forms hydrogen bonds to Glu280 and Lys283 (as the inhibitor in 1ghs); Glc3 O-2 makes an H bond to Asn92 (as in 1ghs), and the face of Glc3 is in hydrophobic contact with Tyr33 (as in 1ghs). The water molecules localized in the cleft were removed completely, because no fixed waters could be detected between inhibitor and 1,3-1,4-β-glucanase in the crystal structure of an enzyme-inhibitor complex (34).
The molecular dynamics simulation of substrate binding was equilibrated after 70 ps. For further discussion we use the middle structure of the inhibitor as determined by LSQMAN (35). Sugar residues Glc2, Glc3, Glc4, and Glc5 are situated within the binding cleft; their root mean square distances from the middle structure are 0.38, 0.31, 0.72, and 0.66 Å, respectively. The first residue, Glc1, is outside of the binding cleft as defined by SURFNET (Fig. 5). The root mean square distances of Glc1 and Glc6 are about 0.66 and 0.68 Å, respectively. Because the position of the substrate is strongly influenced by [...] Fig. 3. The smallest root mean square Δ value for Glc3 coincides with the longest retention of any glucose residue in one position during the simulation. This is in agreement with the results of the subsite mapping of barley 1,3-β-glucanase isoenzyme GII (36), where the binding affinity at the second subsite toward the non-reducing end relative to the cutting point is at the maximum. In the Bacillus as well as in the barley 1,3-1,4-β-glucanases the first and sixth residues of the β-glucan model substrates are not tightly bound to the ends of the clefts.
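How a "middle structure" and per-residue root mean square distances might be obtained from trajectory frames can be sketched as follows. The text does not state how LSQMAN defines the middle structure; the sketch assumes it is the frame with the smallest summed RMSD to all other frames (a medoid), and it applies no superposition because the enzyme was held fixed during the simulation. All function names are hypothetical.

```python
import math

def rmsd(frame_a, frame_b):
    """Root mean square distance between two frames with identical atom order."""
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(frame_a, frame_b))
    return math.sqrt(sq / len(frame_a))

def middle_frame(frames):
    """Index of the frame with the smallest summed RMSD to all frames (assumed definition)."""
    totals = [sum(rmsd(f, g) for g in frames) for f in frames]
    return totals.index(min(totals))

def rms_distance_from(frames, ref_index):
    """Root mean square, over all frames, of the RMSD to the reference frame."""
    ref = frames[ref_index]
    sq = sum(rmsd(f, ref) ** 2 for f in frames)
    return math.sqrt(sq / len(frames))

if __name__ == "__main__":
    # Three one-atom frames; the central one is the middle structure.
    frames = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
    mid = middle_frame(frames)
    print(mid, round(rms_distance_from(frames, mid), 3))
```

Passing only the atoms of one glucose residue per frame gives a per-residue value comparable in kind to the figures quoted for Glc1 to Glc6.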
All glucose units of the modeled substrate are in the preferred ⁴C₁ chair pucker, and the torsion angles are in the low energy regions between 28 and 54° for Φ and −52 to +14° for Ψ in β-1,3 glycosyl bonds (37), and around Φ = 40°, Ψ = −20° for β-1,4-bonds (38). Only the torsion angle Φ = 72.5° at the scissile β-1,4-bond is outside the minimum, indicating strain which may be relieved by bond cleavage. A deviation from the low energy conformation of similar magnitude but with opposite sign was observed in the modeled β-glucan substrate of the Bacillus 1,3-1,4-β-glucanase (11).
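Glycosidic torsion angles such as Φ and Ψ are dihedral angles defined by four consecutive atoms. A minimal Python sketch of the standard dihedral calculation is given below; which four atoms define Φ and Ψ depends on the convention used in the cited references and is not specified here, so the atom choice is left to the caller, and the function name `dihedral_deg` is hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

if __name__ == "__main__":
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

A value computed this way for the scissile bond could then be compared with the quoted minimum near Φ = 40° to quantify the strain.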
In rationalizing substrate binding and conversion, the domain structure and possible structural rearrangements must be taken into account. The general acid, Glu288, and the substrate fix point Tyr33 are both located in domain I, whereas the nucleophile Glu232 at the bottom of the cleft belongs to the glycosylated domain II. The substrate residues Glc1 to Glc4 are attached to domain I, the scissile bond O-4 of the modeled hexamer is positioned at the inter-domain boundary, and residues Glc5 and Glc6 are attached to domain II (Fig. 3). A slight rotation of the domains relative to each other may support a rearrangement of the catalytic residues and/or the substrate position, thus providing a fine tuning of the catalytic event. A domain rearrangement of this kind is observed between Mol1 and Mol2 of EII (see above).
Comparison of Active Site Structures and Modeled Substrate Binding by Barley and Bacillus 1,3-1,4-β-Glucanases-Active site directed inhibition of 1,3-1,4-β-glucanases from B. subtilis and barley (13) by covalent modification with epoxyalkyl cellobiosides reveals subtle differences in their active site geometries. Both enzymes, with unrelated primary, secondary, and tertiary structures (Fig. 6), have identical substrate specificity, but the optimum aglycon length of the cellobiosides for maximum inactivation is C-4 (SUB) and C-3 (EII), respectively, and SUB is much more efficiently inhibited by the (S)-epoxybutyl-β-cellobioside than by the R-isomer, whereas the reverse is found for EII. From this, it was concluded that the differences in inhibitor binding might be related to a different cleavage mechanism (13). It could be shown recently, however, that β-glucan hydrolysis follows the same stereochemical course in the bacterial and plant enzymes (14,15). In addition, the covalent complex of the barley 1,3-β-glucanase with 2,3-epoxypropyl cellobioside shows binding of the S-isomer to the nucleophile of this close homolog of EII.
A common motif in all inhibitor binding studies (14,34) and substrate binding models (Ref. 11; this work) is the hydrophobic "stacking" interaction of the sugar residue at the p2 subsite with a tyrosine side chain (Tyr33 in EII and Tyr94 in the hybrid Bacillus enzyme H(A16-M)). The distances of the plane midpoints are between 4.3 and 4.7 Å in all complexes. In H(A16-M) (2ayh), as well as in the barley glucanases (Mol1 of EII, 1ghr, 1ghs), the distances between oxygens Oε-1 and Oε-2 of the catalytic nucleophile and the corresponding tyrosine midpoint are identical (about 4.5 and 6.5 Å). The distances between the active Oε atoms and the tyrosine midpoint are changed to 5.7 Å by the inhibitor binding to H(A16-M) and to 5.4 Å by substrate binding modeled for EII (Mol1). Taking this distance of about 5.5 Å into account, the additional length of 1.3 Å of the butyl linker in comparison to the propyl linker requires a bulging out of the linker. The molecular environment of the butyl linker of the inhibitor bound to H(A16-M) consists of residues Trp192, Val88, Trp184, and Phe30 at one side, and of Tyr123, Trp103, and Phe92 at the other. The butyl linker interacts by hydrophobic contacts with the plane of Phe92 and covers the hydrophobic surface. Such a large hydrophobic area does not exist in the case of EII. The only hydrophobic residues in the vicinity of the linker are Phe275 and Trp291, and therefore, preserving the strong hydrophobic interaction between sugar residue Glc3 and Tyr33, the 1.3 Å shorter propyl linker of the epoxyalkyl cellobioside inhibitor is preferentially bound.
By analogy to the cleft determination and characterization described above for EII, the substrate binding site of H(A16-M) was investigated. Here, the form of the cleft is quite different from that of EII (Fig. 5, bottom), and the volume is about 1,640 Å³. The width is comparable with the cleft width of 7-8 Å of EII near the catalytic site, but the groove is considerably deeper (about 12.5 Å) and shorter (about 29 Å). In both cases the clefts are somewhat branched out at the molecular surface (Fig. 5). All residues within the active site (Asp107, Glu105, Glu109, Trp103, Glu119, and Trp192) or interacting with the cellobioside inhibitor (Tyr24, Asn26, Glu63, and His99) as described (34) and additional potential substrate binding partners (Arg65, Tyr94, Ser90, Glu131, and Asn121) have accessible surface fractions. These residues are summarized in Table IV for comparison with EII. No obvious similarity between both patterns is found. Whereas the branched cleft in EII has a nearly isometric cross-section in the substrate binding region, the cleft of H(A16-M) is narrow and deep. The hydration pattern of the H(A16-M) groove is quite different from the water structure in the EII cleft (Fig. 5).
The six glucose residues cover the whole length of the cleft in H(A16-M), but eight can be accommodated in EII (36). In any case, only the innermost four appear tightly bound within the reaction center. In Fig. 6 (middle), the surfaces, modeled substrates, catalytic residues, and the hydrophobic binding centers at subsite p2 (Tyr33 of EII and Tyr94 of H(A16-M)) are shown. The basic differences between both families of 1,3-1,4-β-glucanases are the relative positioning of the general acid, the nucleophile, and these tyrosine residues. Given the same orientation of the substrate with the reducing end of Glc6 facing upward in Fig. 6 (middle), the general acid, Glu288, of EII is at the left side of the hexameric β-glucan, but the corresponding Glu109 is at the cleft bottom for H(A16-M). To follow the same stereochemical pathway, a global rotation by about 90° is necessary between both substrates.
In conclusion, the active sites of the plant and Bacillus 1,3-1,4-β-glucanases are surprisingly different in view of their close functional similarity. The enzymes show nicely how the same catalytic activity can evolve on completely different protein folds and in dissimilar local geometries.
(H(A16-M)) stacking against Glc3 of the substrate at subsite p2 are also shown. Surface colors indicate electrostatic potential (blue, positive, and red, negative) as calculated with DELPHI (41). Note the different rotational setting of the substrate chain in the active site cleft and the different orientation of the protein side chains. Bottom, superposition of β-glucan model substrates for barley and hybrid Bacillus 1,3-1,4-β-glucanases. The molecules are superimposed with the glucose moiety Glc3 interacting with a tyrosine in subsite p2. The substrate strand fitting into the active site of H(A16-M) (Ref. 11, green) defines the start conformation used in molecular dynamics modeling of the substrate bound to EII (red).