The Xenograft Antigen Bound to Griffonia simplicifolia Lectin 1-B4: X-RAY CRYSTAL STRUCTURE OF THE COMPLEX AND MOLECULAR DYNAMICS CHARACTERIZATION OF THE BINDING SITE

The shortage of organs for transplantation into human patients continues to be a driving force behind research into the use of tissues from non-human donors, particularly pigs. The primary barrier to such xenotransplantation is the reaction between natural antibodies present in humans and Old World monkeys and the Galα(1-3)Gal epitope (xenograft antigen, xenoantigen) found on the cell surfaces of the donor organ. This hyperacute immune response leads ultimately to graft rejection. Because of its high specificity for the xenograft antigen, isolectin 1-B4 from Griffonia simplicifolia (GS-1-B4) has been used as an immunodiagnostic reagent. Furthermore, haptens that inhibit natural antibodies also inhibit GS-1-B4 from binding to the xenoantigen. Here we report the first x-ray crystal structure of the xenograft antigen bound to a protein (GS-1-B4). The three-dimensional structure was determined from orthorhombic crystals at a resolution of 2.3 Å. To probe the influence of binding on ligand properties, we also report the results of molecular dynamics (MD) simulations on this complex as well as on the free ligand. The MD simulations were performed with the AMBER force field for proteins augmented with the GLYCAM parameters for glycosides and glycoproteins. The simulations were performed for up to 10 ns in the presence of explicit solvent. Through comparison with MD simulations performed for the free ligand, it has been determined that GS-1-B4 recognizes the lowest energy conformation of the ligand.

The major barrier to xenotransplantation (1) is a hyperacute immune response (2), in which Galα(1-3)Gal (xenograft antigen) present on the surface of non-primate tissues triggers rejection by human transplant recipients (3–5). The ubiquitous presence of anti-Galα(1-3)Gal antibodies in humans, Old World monkeys, and apes is paralleled by the absence of Galα(1-3)Gal on the cell surfaces of those species (6). The natural antibodies attack the surface endothelial cells, leading to complement activation and organ death. Several approaches to this problem have been considered (1), including inhibition of the anti-Galα(1-3)Gal antibodies, induction of tolerance to the xenoantigen (7), and transgenic alteration of the Galα(1-3)Gal epitope present on the cell surfaces of the donor species (8, 9). To date, only transient suppression of the anti-Galα(1-3)Gal immune response has been achieved (1). Overcoming xenograft rejection has become increasingly important because of the huge demand for organ transplants; a study in 1998 estimated that the demand had increased by 100% over the period 1990 through 1998 (1).
Griffonia simplicifolia lectin-1 (GS-1) is a carbohydrate-binding glycoprotein that is isolated from the seeds of this African leguminous shrub. Like many other legume lectins, GS-1 relies on the presence of divalent metal cations for its carbohydrate-binding activity (10). GS-1 is a mixture of five tetrameric isolectins that vary in their content of A and B subunits (11). The A subunit was found to bind strongly to both GalNAcα and Galα residues, while favoring GalNAcα (12). Competitive binding studies have shown that the GS-1 isolectin composed of four B subunits (GS-1-B4) has a high affinity for the Galα(1-3)Gal sequence (12, 13). Therefore, GS-1-B4 has found application as an immunodiagnostic reagent in studies of the xenograft antigen. Furthermore, inhibitors of the interaction between anti-Galα(1-3)Gal antibodies and the xenograft antigen also inhibit carbohydrate binding to GS-1-B4 (14–16). Notably, the binding of Galα(1-3)Gal to GS-1-B4 appears to be determined primarily by the presence of the terminal α-galactosyl residue; other linkages may be tolerated, as long as they contain a terminal Galα residue. Thus, GS-1-B4 is not a perfect model for the natural antibodies; however, it does provide an opportunity to gain detailed insight into the mechanism of recognition of the xenoantigen. To determine the mechanism for the observed specificities, as well as to obtain the first structure of the xenoantigen bound to a protein, we have determined the x-ray crystal structure of the GS-1-B4·Galα(1-3)Gal complex.
A number of structural studies have been reported for the xenograft antigen and related oligosaccharides (12, 17–21). These studies used both computational and NMR spectroscopic methods to determine the solution conformation. Earlier computational studies employed adiabatic energy mapping to predict low energy conformations for the Galα(1-3)Gal linkage. More recently, both gas-phase Monte Carlo and molecular dynamics (MD) simulations have been employed to examine ligand flexibility (19, 21). To determine the extent to which water mediates the conformational properties of the ligand, the present study employed MD simulations of the free ligand in explicit water at atmospheric pressure and room temperature. These long 10-ns simulations are extremely computer-intensive; however, they are able to predict with accuracy the influence of solvation and binding interactions on the conformational and dynamic properties of carbohydrates (22). To examine the influence of protein binding on ligand dynamics, as well as to obtain a complete spatial and temporal picture of the interaction, 2- to 5-ns MD simulations of the bound complex were also performed with explicit water. These simulations provide additional insight into the structural significance of the bound waters that are seen to mediate the carbohydrate-protein interaction in the x-ray structure.

EXPERIMENTAL PROCEDURES
Crystallization, Diffraction Data Collection, and Structure Solution by Molecular Replacement-The crystallization, x-ray diffraction data collection, and molecular replacement for the GS-1-B4 complex with Galα(1-3)Galβ-OMe are described in detail elsewhere (23).
Structure Refinement-Data collected from two crystals from different crystallization drops were used in the structural refinement. The combined data set was obtained by merging individual integrated reflection files using SCALEPACK of the HKL software suite (24). Of 22,043 observed (23,055 theoretical) reflections between 20- and 2.2-Å resolution, 1500 were set aside as test observations (25). The CNS suite of programs (26), with a maximum likelihood target function (27), was used throughout the entire process of refinement. After two rounds of independent rigid-body refinement of the two instances of the search model polypeptide chain (RCSB ID: 1GSL) (28), the NCS transformation matrix between the two molecules was determined. The resultant operators were used in the application of NCS constraints in the initial stages of refinement. Real-space density fitting was performed using O (29). After the inclusion of two metal ions (30) and carbohydrate chains, NCS constraints were removed and replaced by gradually decreasing restraints. Prior to PDB submission (as RCSB ID: 1HQL) (31), the model quality was assessed using PROCHECK (32).
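The reflection counts quoted above can be turned into the usual summary percentages. The following is a minimal sketch, assuming that the 1500 test observations were drawn from the 22,043 observed reflections; the variable names are illustrative and do not come from the refinement software.

```python
# Reflection counts quoted in the text for the 20- to 2.2-Å resolution range.
observed = 22_043      # reflections actually measured
theoretical = 23_055   # reflections possible in this resolution range
test_set = 1_500       # reflections set aside as test observations


def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Completeness: fraction of the theoretically possible reflections observed.
completeness = percent(observed, theoretical)

# Assumption: the test set is a subset of the observed reflections.
test_fraction = percent(test_set, observed)
working_set = observed - test_set

print(f"completeness  : {completeness:.1f}%")
print(f"test fraction : {test_fraction:.1f}%")
print(f"working set   : {working_set} reflections")
```

Under that assumption, the data are roughly 96% complete and about 7% of the observed reflections are reserved for cross-validation.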
Molecular Dynamics-The SANDER (33) module of AMBER 5.0 (34) was used in conjunction with the PARM98 parameter set for proteins and the GLYCAM (35) parameter set for glycosides and glycoproteins. A single subunit of the GS-1-B4 x-ray crystal structure 1HQL was protonated with INSIGHTII (36), and a 25-Å droplet containing 1389 TIP3P waters (37) was placed around O3 of the non-reducing end of the disaccharide (Galα, residue 243). Initially, the solvent positions were optimized by energy minimization, with 9000 steps of steepest descent followed by 1000 steps of conjugate gradient. This was followed by a period of simulated annealing, during which the solvent was heated to 300 K over 20 ps, held at 300 K for 60 ps, and then cooled to 5 K over an additional 20 ps. The simulated annealing was followed by energy minimization of the entire system. During the production MD simulation, all atoms of the protein within 15 Å of the binding site (defined as the carbohydrate recognition domain (CRD)), all waters, and the ligand were allowed complete motional freedom. All other atoms were held frozen unless otherwise stated. The system was then heated from 5 to 300 K over 40 ps and maintained at 300 K for 2 ns through weak coupling to an external bath with a coupling constant of 0.25 ps⁻¹. An additional simulation was performed for 5 ns with the water involved in the binding site (Wat56) restrained in the crystallographic position. All simulations involving the free disaccharide were performed under periodic boundary conditions at constant pressure, following protocols for energy minimization and simulated annealing similar to those used for the droplet simulations. The final production run was performed for 10 ns.
All MD simulations employed an integration time step of 2 fs, a dielectric constant of unity, scaling of 1-4 electrostatic and van der Waals interactions by the standard values of 1/1.2 and 1/2.0, restraint of all hydrogen-containing bonds through the SHAKE algorithm (33), and a cutoff of 8 Å for all non-bonded interactions. Analysis of the trajectories was performed using the CARNAL module of AMBER 5.0.
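The solvent annealing schedule and the step counts implied by the 2-fs time step can be sketched as follows. This is an illustration only, not AMBER input: the function names are hypothetical, the temperature ramps are assumed to be linear, and the starting temperature of 5 K is an assumption, because the text states only the 300 K plateau and the final 5 K.

```python
def annealing_target_temperature(
    t_ps: float,
    t_start: float = 5.0,    # assumed starting temperature (not stated in the text)
    t_high: float = 300.0,   # plateau temperature from the text
    t_end: float = 5.0,      # final temperature from the text
    heat_ps: float = 20.0,   # heating period from the text
    hold_ps: float = 60.0,   # plateau period from the text
    cool_ps: float = 20.0,   # cooling period from the text
) -> float:
    """Target temperature (K) at time t_ps, assuming linear ramps."""
    if t_ps < 0:
        raise ValueError("time must be non-negative")
    if t_ps <= heat_ps:
        return t_start + (t_high - t_start) * t_ps / heat_ps
    if t_ps <= heat_ps + hold_ps:
        return t_high
    if t_ps <= heat_ps + hold_ps + cool_ps:
        fraction = (t_ps - heat_ps - hold_ps) / cool_ps
        return t_high + (t_end - t_high) * fraction
    return t_end


def md_steps(duration_ps: float, timestep_fs: float = 2.0) -> int:
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / timestep_fs)


if __name__ == "__main__":
    for t in (0, 10, 20, 50, 80, 90, 100):
        print(f"{t:4d} ps -> {annealing_target_temperature(t):6.1f} K")
    print("annealing steps     :", md_steps(100))      # 20 + 60 + 20 ps
    print("2-ns complex steps  :", md_steps(2_000))
    print("10-ns free-ligand   :", md_steps(10_000))
```

With a 2-fs step, the 10-ns free-ligand run corresponds to five million integration steps, which gives a sense of why these simulations are described as computer-intensive.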

RESULTS AND DISCUSSION
X-ray Data Reduction and Structure Refinement-The structure of the complex was solved by molecular replacement using diffraction data from 8 to 4 Å, from only one crystal (Crystal I) (23), resulting in the placement of two instances of the search model polypeptide in the asymmetric unit. However, the subsequent refinement proved challenging. Apart from a 25-residue N-terminal sequence (38), no further sequence information was initially available for GS-1-B4. Although the related lectin GS-4 has 12 of its 27 N-terminal amino acid residues in common with GS-1-B4, a loop in this region of the chain contains 3 additional residues (28). Initial inspection of annealed omit maps (39) using data from 30 to 2.65 Å revealed substantial discrepancies between search model coordinates and electron density. The quality of these maps did not permit significant model improvements because of difficulties in real-space fitting of misplaced regions. This could be attributed to the uncertainty surrounding the number of residues and the types of side chains to be fitted in areas with low correlation between electron density and model coordinates.
A second set of diffraction data, this time extending beyond 2.2-Å resolution, was collected using a crystal (Crystal II) grown in a separate experiment under the conditions described for Crystal I (23). This data set was less complete than that for Crystal I. Although Table I highlights significant discrepancies between the data from both crystals, the merging procedure (see "Experimental Procedures") resulted in a data set that proved to be of sufficient quality for successful refinement by alternating slow-cool simulated annealing and real-space model rebuilding. Initially, refinement was confined to one of the polypeptide chains in the asymmetric unit, and coordinates for the second chain were generated by strict application of NCS operators. At this stage, difference density clearly indicated the position of two metal ions. In contrast to the commonly observed combination of a Ca2+ and a transition metal cation (30), two calcium cations were employed initially in the refinement. This decision was based on a study of GS-1 metal dependence (10) and our failure to detect significant amounts of Mn2+ in a metal analysis of a GS-1-B4 solution. At this point, the quality of the electron density permitted the addition of the carbohydrate ligand and residues of the N-glycan on residue Asn27. NCS constraints were replaced by restraints when the crystallographic residual had been improved to 25.5% (Rfree = 26.6%). Lastly, the addition of crystallographic water molecules and the substitution of Mn2+ for one Ca2+ in each subunit, both based on difference map density, produced a preliminary model. In the absence of the complete GS-1-B4 amino acid sequence, the model was based on the published N-terminal sequence (38), unpublished data derived from sequencing of fragments from CNBr digests, and sequences of homologous Griffonia simplicifolia lectins. This model was updated when the complete sequence became available (40).
Description of the Biologically Active Tetramer-The asymmetric unit consists of two single-chain subunits, A (not to be confused with the A-type subunit of GS-1) and B. Subunits A and B are related by a non-crystallographic 2-fold axis oriented roughly perpendicular to the 6-stranded "back" β sheet, common in legume lectin monomers (30). However, unlike the case in the "canonical" dimer found in concanavalin A (41), the two subunits do not arrange to form a large 12-stranded sheet. Rather, the strands composed of residues 4–11, 239–231, and 69–76 appear to align as extensions of strands 69–76, 239–231, and 4–11 in the other subunit, respectively. Aromatic residues such as Trp13 and Phe78 and non-polar side chains of Ala30 and Leu231 exhibit the closest contacts with the peptide chain in the other subunit. Application of the crystallographic symmetry-based transformation (1 − x, −y, z) to the atomic coordinates of subunits A and B generated subunits A* and B*, respectively. Interestingly, the mode of association observed between A and A*, as well as between B and B*, resembles that in the GS-4 dimer (28), with a nearly perpendicular alignment of the strands in the β sheets at the interface (Figs. 1 and 2).
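The symmetry transformation (1 − x, −y, z) acts on fractional coordinates. A minimal sketch of how it maps an atom of subunit A onto its partner in A* is given below; the function name and the example coordinates are hypothetical and are not taken from the 1HQL entry.

```python
from typing import Tuple

Fractional = Tuple[float, float, float]


def apply_symmetry(frac: Fractional) -> Fractional:
    """Apply the operator (1 - x, -y, z) to fractional coordinates."""
    x, y, z = frac
    return (1.0 - x, -y, z)


if __name__ == "__main__":
    # Hypothetical fractional coordinates of one atom in subunit A.
    atom_a = (0.25, 0.10, 0.70)
    atom_a_star = apply_symmetry(atom_a)
    print("A  :", atom_a)
    print("A* :", atom_a_star)
    # The operator is a 2-fold rotation, so applying it twice
    # returns the starting position.
    print("A**:", apply_symmetry(atom_a_star))
```

Because the operator leaves z unchanged and inverts x and y about the point (1/2, 0), it describes a 2-fold axis parallel to the c axis, so each pair A/A* and B/B* is related by a crystallographic 2-fold rotation.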
Metal Binding Site-Based on a published biochemical study of the metal dependence of GS-1 (10), Ca2+ was the only divalent … (23). Consequently, both metal sites were treated as being occupied by Ca2+ ions in the initial stages of refinement. Significantly shorter bond distances to surrounding protein side-chain atoms and some remaining Fo − Fc electron density at one of the metal centers led to the adoption of the Mn2+/Ca2+ configuration commonly observed in legume lectins. It must be assumed that residual Mn2+ remained bound to the protein sample even during isolation. Although it was not detected by atomic spectroscopy, enough Mn2+ apparently remained for the formation of GS-1-B4 crystals containing Mn2+. The final model contains a metal binding site that closely resembles the structure described for GS-4 (Fig. 1) (28). Similar to observations from GS-4, the second carboxylate oxygen of Asp130 is oriented such that it may act as a seventh ligand for one of the Ca2+ ions (28, 42).
N-Glycosylation-Native GS-1-B4 is a glycoprotein (10), and electron density suggests glycosylation at residue Asn27. It is observed on both subunits, but the density on subunit B shows superior continuity when compared with that on A. Electron density permitted the modeling of the core GlcNAcβ(1-4)GlcNAc sequence; however, even in the case of the B subunit, density for the carbohydrate residues is weak and does not cover all atoms.
The Xenograft Antigen in the Carbohydrate Binding Site of GS-1-B4-The terminal α-galactosyl residue (Galα) of the xenograft antigen is represented by well-contoured electron density in the CRDs of both the A and B subunits. Interactions of the side chain of Asp88 with hydroxyl groups HO-3 and HO-4, as well as of the side chain of Asn134 and the amide nitrogen of Glu106 with HO-3 (Fig. 1), are paralleled by similar interactions involving residues Asp89, Asn135, and Gly107 in the complex between GS-4 and the Lewis b human blood group determinant (28). Additional contacts can be found between the backbone amides of residues Asn222 and Asn223 and HO-6. Residues Gly105 and Glu106 distinguish GS-1-B4 from a variety of other legume lectins, in which a Gly-Gly sequence is highly conserved in this region; thus, the interaction between the side chain of Glu106 and hydroxyl groups HO-2 and HO-3 of the Galα residue is noteworthy.
Significant density for the β-methyl galactosyl residue (Galβ) is seen only in the B subunit. This residue is situated well above the protein surface. Notably, in the ligand bound to subunit B, the Galβ residue is found in close proximity to the loop region extending from residues 61 through 69 of a molecule of subunit A, which is generated by a crystallographic symmetry operation. This presumably restricts the mobility of the carbohydrate ligand and therefore improves its contribution to diffraction (Figs. 3 and 4 and Table II).
Free Ligand-Throughout the 10-ns MD simulation in water, the glycosidic torsion angles in Galα(1-3)Galβ-OMe showed only brief, relatively localized transitions from the equilibrium conformation. The Ψ angle showed increased flexibility relative to the Φ angle, which is consistent with other α-linkages (20) and with earlier predictions that the Galα(1-3)Gal linkage is relatively flexible (see Fig. 3). The major conformation present is shown in Table III and as conformation B in Fig. 3. This conformation was predicted to be the lowest in energy and has been found experimentally to be the most populated in solution in related oligosaccharides (18, 20, 21, 41). Two additional minor conformations were found and are referred to as A and D (nomenclature consistent with a previous conformational energy map calculated for this linkage) (20). An additional higher energy theoretical conformation (C) (20) was not populated during our simulation. Overall, the average Φ and Ψ angles determined by the MD simulation remained close to those of the ligand in the x-ray crystal structure of the complex. Therefore, it may be concluded that GS-1-B4 recognizes the lowest energy conformation of Galα(1-3)Galβ-OMe, in which Φ adopts a conformation expected on the basis of the exo-anomeric effect (44) (Fig. 5 and Table III).
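The glycosidic angles Φ and Ψ are torsion angles, each defined by four atoms. The following minimal sketch computes a signed torsion angle from four Cartesian positions; it is not the CARNAL implementation. The atom quartets that define Φ and Ψ are not specified in this section, so the caller is assumed to supply them, and the coordinates in the example are hypothetical.

```python
import math
from typing import Sequence

Vec = Sequence[float]


def _sub(a: Vec, b: Vec) -> tuple:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> tuple:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_deg(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal to the plane p0-p1-p2
    n2 = _cross(b2, b3)   # normal to the plane p1-p2-p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates chosen to give simple reference angles.
    print(torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis
    print(torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # gauche
    print(torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans
```

Applied to every saved frame of a trajectory, a function of this kind would yield the Φ and Ψ time series from which averages and transitions such as those described above can be extracted.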
Bound Galα(1-3)Galβ-OMe Conformational Analysis-The average Φ and Ψ angles from the 2-ns MD simulation were in good agreement with the x-ray data. The analysis was halted at 2 ns, because Wat56 diffused out of the CRD at just over 2 ns. According to the x-ray data from subunit B, this water participates in a bridge between the ligand and the protein and may be of importance in stabilizing the protein·ligand complex. For comparison, a 5-ns MD simulation was performed, in which Wat56 was restrained in the x-ray position. The longer simulation revealed Φ and Ψ angles that were more rigid than those observed in the 2-ns MD simulation, suggesting that an indirect result of restraining the water was to attenuate the mobility of neighboring residues.
In both the 2- and 5-ns MD simulations, a rotation around the C5-C6 bond of the Galα residue occurred. The transition occurred after ~100 ps in the longer run and 500 ps in the shorter run. This transition resulted from the formation of a new interaction between O6 of the Galα residue and O3 of the Galβ residue at the expense of interactions between O6 and Asn residues 222 and 223. A weak interaction involving the Nδ2 atom of Asn223 and O6 was maintained throughout the simulation, in contrast to interactions involving the backbone amide atoms of Asn222 and Asn223, which were broken during the transition. The ability of Nδ2 to maintain contact was most likely facilitated by the flexibility of the side chain, in comparison to the more rigid backbone. This result indicates more flexibility in the ligand than might be expected on the basis of epitope mapping studies, which have shown that substituents at the O6 position in Galα decrease the affinity (12).
Hydrogen Bonding Analysis-Because the MD simulations include hydrogen atoms, it is possible to include them in an analysis of hydrogen bond properties, such as donor-acceptor assignments and hydrogen bond occupancies. In the calculation of occupancies, hydrogen bonding interactions were assumed to be present if the participating heavy atoms were ≤4 Å apart and the angle formed between the heavy atoms and the donating hydrogen was ≤60°, as defined in the CARNAL module of AMBER 5.0. The corresponding standard deviations for the inter-atomic positions were calculated only when the requirements for hydrogen bond occupancy were fulfilled. Therefore, the strongest hydrogen bonds typically have the highest occupancies, the smallest standard deviations, and the shortest heavy atom separations. The dependence on hydrogen position results in an analysis that is more sensitive than that based on the x-ray data, which relies solely on the heavy atom separation, with a separation of ≤3.2 Å being characterized as moderately strong and a separation from 3.2 to 4.0 Å being indicative of a weak hydrogen bond (45). The MD data provide considerable additional insight into the dynamic or fluxional nature of these interactions (Fig. 6 and Table IV).
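The occupancy criterion described above can be sketched as follows. This is not the CARNAL code. In particular, the angle is interpreted here as the angle at the donor heavy atom between the donor-hydrogen bond and the donor-acceptor vector, which is an assumption about the definition; the function names and the example frames are hypothetical.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple

Vec = Sequence[float]
Frame = Tuple[Vec, Vec, Vec]  # (donor heavy atom, hydrogen, acceptor heavy atom)


def hbond_present(donor: Vec, hydrogen: Vec, acceptor: Vec,
                  max_dist: float = 4.0,
                  max_angle_deg: float = 60.0) -> Tuple[bool, float]:
    """Return (criterion fulfilled, heavy-atom separation in Å)."""
    d_a = [a - d for a, d in zip(acceptor, donor)]
    d_h = [h - d for h, d in zip(hydrogen, donor)]
    dist = math.sqrt(sum(c * c for c in d_a))
    h_len = math.sqrt(sum(c * c for c in d_h))
    if dist == 0.0 or h_len == 0.0:
        return False, dist
    cos_angle = sum(u * v for u, v in zip(d_a, d_h)) / (dist * h_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    # Assumed definition: angle H-D...A measured at the donor heavy atom.
    angle = math.degrees(math.acos(cos_angle))
    return (dist <= max_dist and angle <= max_angle_deg), dist


def occupancy_stats(frames: Iterable[Frame]
                    ) -> Tuple[float, Optional[float], Optional[float]]:
    """Occupancy plus mean and standard deviation of the heavy-atom distance.

    As in the text, distance statistics use only the frames in which the
    hydrogen bond criterion is fulfilled.
    """
    total = 0
    dists = []
    for donor, hydrogen, acceptor in frames:
        total += 1
        present, dist = hbond_present(donor, hydrogen, acceptor)
        if present:
            dists.append(dist)
    if total == 0:
        raise ValueError("no frames supplied")
    occupancy = len(dists) / total
    if not dists:
        return occupancy, None, None
    mean = sum(dists) / len(dists)
    sd = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return occupancy, mean, sd


if __name__ == "__main__":
    # Four hypothetical frames: two bonded, one too far, one badly angled.
    frames = [
        ((0, 0, 0), (1, 0, 0), (2.8, 0, 0)),
        ((0, 0, 0), (1, 0, 0), (3.0, 0, 0)),
        ((0, 0, 0), (1, 0, 0), (5.0, 0, 0)),
        ((0, 0, 0), (0, 1, 0), (3.0, 0, 0)),
    ]
    print(occupancy_stats(frames))
```

Restricting the distance statistics to occupied frames is what links high occupancy, short separation, and small standard deviation in the tables: a bond that is seldom formed contributes few frames, and those frames need not be tightly clustered.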
In the crystal structure, the distance between Glu106 Oε1 and the oxygen atom of hydroxyl group HO-2 in Galα is exceptionally short, with a heavy atom separation of 2.6 Å. In both MD simulations of the complex, this interaction lengthened to a value of ~3.4 Å. Similarly, the interactions between HO-3 and HO-4 of Galα and the carboxylate group of Asp88 display extremely close contacts in the x-ray structure (2.6 Å) but lengthened to a more common value of ~2.9 Å in the MD simulations. The extent to which this illustrates the limitations of the x-ray data, versus a genuine difference between solution and crystalline environments, is unclear and may be resolved only with collection of a high resolution data set (Fig. 7 and Table V).
The only difference between the two simulations of the complex was the treatment of the bound water. As a result, interactions involving Wat56, which are shown in Table V, are quite different in each simulation of the complex. During the 2-ns simulation, Wat56 populates two positions, denoted E and W (referring to coordination to Glu106 or Trp132), which are illustrated in Fig. 4. Although Wat56 and the carboxylate of Glu106 are tightly coordinated in configuration E, the interaction has a low overall occupancy because the W configuration is present for the majority of the time. In the W configuration, two new interactions with Wat56 form, involving HO-2 of Galβ and Nε1 of Trp132. Therefore, these small occupancies illustrate a dynamic, but not necessarily weak, interaction. This suggests that the water is mobile in the binding site, consistent with the absence of electron density for Wat56 in subunit A. Therefore, the positional restraints employed in the 5-ns MD simulation may yield misleading results in the statistical analysis of properties dependent on this water molecule. This raises a significant question regarding the role played by this water in the binding mechanism (Figs. 8 and 9).
Lectin Specificity-The binding site consists of a deep cavity, which accommodates only the first residue of the disaccharide (see Fig. 7). Modeling indicated that epimerization of C4 in the terminal Galα residue (Galα → Glcα) would result in the loss of a strong interaction between hydroxyl group HO-4 and Asp88 Oδ2. Similarly, epimerization of C2 in the Galα residue would result in the loss of two interactions, namely Galα O2·Glu106 Oε1 and Galα O2·Wat56. Each of these observations is consistent with experimental data that show this lectin to have the highest binding affinity for oligosaccharides characterized by terminal Galα residues (12). Furthermore, modeling indicated that alteration of the α(1-3) linkage to a β(1-3) linkage would be sterically unfavorable because of close contacts formed between the reducing end of the disaccharide and Trp47. This is also consistent with experimental data that revealed GS-1-B4 to have a much stronger affinity for Galα conjugated to human serum albumin than for the corresponding Galβ conjugate (12, 46).