Crystal Structure of Yeast Allantoicase Reveals a Repeated Jelly Roll Motif*

Allantoicase (EC 3.5.3.4) catalyzes the conversion of allantoate into ureidoglycolate and urea, one of the final steps in the degradation of purines to urea. Although this pathway has been known for a long time, the mechanism of most of the enzymes involved in it is unknown. In this paper we describe the three-dimensional crystal structure of the yeast allantoicase determined at a resolution of 2.6 Å by single anomalous diffraction. This constitutes the first structure for an enzyme of this pathway. The structure reveals a repeated jelly roll β-sheet motif, also present in proteins of unrelated biochemical function. Allantoicase has a hexameric arrangement in the crystal (dimer of trimers). Analysis of the protein sequence against the structural data reveals the presence of two totally conserved surface patches, one on each jelly roll motif. The hexameric packing concentrates these patches into conserved pockets

In most organisms the primary purine degradation pathways converge on the production of xanthine, which is subsequently converted into uric acid. This product is excreted in most primates, birds, and insects. In microorganisms, amphibians, and fish, uric acid is converted into urea and glyoxylate via allantoin and allantoic acid. Allantoicases (allantoate amidinohydrolase, EC 3.5.3.4) catalyze the hydrolytic conversion of allantoic acid into ureidoglycolate and urea (reaction scheme is illustrated in Fig. 1). This pathway allows the use of purines as secondary nitrogen sources in nitrogen-limiting conditions. The yeast allantoicase gene DAL2 codes for a 345-amino acid protein (1-3). Expression of the allantoin pathway enzymes is (i) induced by allophanate, (ii) sensitive to nitrogen catabolite repression, and (iii) responsive to mutation of the DAL80 and DAL81 loci, which have been shown to regulate the allantoin degradation system (4,5).
Allantoicase is present in a wide variety of bacteria, fungi (6,7), and plants, as well as in a few animals, such as fish and amphibians (8,9). Although allantoicase activity has not been detected in higher vertebrates, allantoicase transcripts were identified in mice and humans (10-12). Xenopus and mouse allantoicase were functionally expressed in Escherichia coli (13). The allantoicase amino acid sequence consists of two analogous repeats (each comprising one-half of the protein), catalogued in the Pfam domain database as allantoicase repeats (PB003817). The mouse and Xenopus sequences contain some extensions compared with the microbial enzymes.
Almost nothing is known about the reaction mechanism or structure of allantoicases, and mechanistic information on any hydrolase acting on linear amidines is scarce (14). We here present the 2.6-Å crystal structure of the allantoicase from Saccharomyces cerevisiae, which is the first structure for this enzyme family. It shows a bimodal organization of the monomer that respects the internal sequence repeat. Both modules have an identical β-jelly roll fold, also present in a number of functionally unrelated proteins. Allantoicase associates into hexamers. The contact region between subunits creates a totally conserved pocket that very likely corresponds to the active site.

MATERIALS AND METHODS
Cloning, Expression, and Purification-The YIR029w (Dal2) gene was cloned from S. cerevisiae S288C DNA by PCR in a modified pET9 vector (Stratagene) between NdeI and NotI sites. Six histidine codons were added at the 3′-end of the gene. Small scale expression and solubility tests at 37 or 25°C showed that the protein was expressed in inclusion bodies. Only about 10% of the protein was obtained in the soluble fraction of E. coli after induction at 15°C. However, the co-expression of the E. coli chaperones GrpE, DnaK, DnaJ, GroEL, and GroES significantly increased the solubility of the target protein (~80% increase) (15). A 750-ml culture in 2× YT medium (5 g of NaCl, 16 g of Bacto™ tryptone, 10 g of yeast extract; BIO 101, Inc.) of the Gold (DE3) strain (Novagen) co-transformed with the pET construct and the chaperone plasmid (pGKJE3) was grown at 37°C up to an A600 nm of 1. Expression of the five chaperones was induced 15 min before that of the target protein by addition of 2 mM arabinose. The target was produced at 25°C for 4 h after addition of 0.3 mM isopropyl-1-thio-β-D-galactopyranoside. Cells were harvested by centrifugation; resuspended in 40 ml of 20 mM Tris-HCl, pH 7.5, 200 mM NaCl, 5 mM β-mercaptoethanol; and stored at −20°C. The cells were thawed, and the cell lysis was completed by sonication at 4°C. The soluble fraction containing the protein was recovered by a 30-min centrifugation at 13,000 × g and 4°C and was applied to a nickel-nitrilotriacetic acid column (Qiagen Inc.). At this step the protein was 80% pure but remained contaminated by chaperones. The second step of purification consisted of a gel filtration on a Superdex 75 column (Amersham Biosciences) equilibrated in Tris-HCl, pH 7.5, 200 mM NaCl, 10 mM β-mercaptoethanol. The pure protein was analyzed by analytical size exclusion chromatography on a calibrated Superdex 200 column to measure its oligomeric state.
The labeling of the protein with Se-Met¹ was conducted as described previously (16,17). The purity and integrity of the native tagged protein, as well as the incorporation of Se-Met in place of the Met residues after labeling, were checked by mass spectrometry. The proteins were concentrated to 8 mg/ml using a Vivaspin concentrator (Vivascience).

* This work was supported by grants from the Ministère de la Recherche et de la Technologie (Programme Génopoles) and the Association pour la Recherche contre le Cancer (to M. G.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¹ The abbreviation used is: Se-Met, selenomethionine.

Crystallization and Resolution of the Structure-Crystallization trials were performed at 18°C. Se-Met-labeled protein crystals were grown from a mixture in a 1:1.5 ratio of 8 mg/ml protein solution and a reservoir solution of 7% PEG8000, 12% PEG400, and 100 mM HEPES, pH 7.5. The protein crystallized in space group P6₃ with cell dimensions a = b = 104.98 Å and c = 118.15 Å, containing two polypeptide chains/asymmetric unit and corresponding to a solvent content of 49%.
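The solvent content quoted for the crystals can be cross-checked from the cell dimensions with a Matthews coefficient estimate. The following is a minimal sketch under stated assumptions, not the calculation used by the authors: the mean residue mass of 110 Da is a generic approximation, the chain length of 351 residues is the 345-residue protein plus the six-histidine tag, and the factor 1.23 is the conventional constant of the Matthews formula. Space group P6₃ has six general positions, so two chains per asymmetric unit give twelve chains per cell.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (Å^3) of a hexagonal cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(volume, chains_per_cell, chain_mass_da):
    """Return (V_M in Å^3/Da, estimated solvent fraction)."""
    v_m = volume / (chains_per_cell * chain_mass_da)
    solvent_fraction = 1.0 - 1.23 / v_m  # conventional Matthews estimate
    return v_m, solvent_fraction


A_AXIS, C_AXIS = 104.98, 118.15   # cell dimensions from the text, in Å
SYMOPS_P63 = 6                    # general positions in space group P6_3
CHAINS_PER_ASU = 2                # from the text
RESIDUES = 345 + 6                # protein plus His tag
MEAN_RESIDUE_MASS = 110.0         # Da; assumption, not a measured mass

v_m, solvent = matthews(
    hexagonal_cell_volume(A_AXIS, C_AXIS),
    SYMOPS_P63 * CHAINS_PER_ASU,
    RESIDUES * MEAN_RESIDUE_MASS,
)
print(f"V_M = {v_m:.2f} Å^3/Da, solvent fraction = {solvent:.2f}")
```

With these approximate inputs the estimate comes out close to the 49% reported above; a mass measured by mass spectrometry would refine it slightly.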
Single wavelength anomalous diffraction data to 2.6-Å resolution were collected from a crystal cryocooled at 100 K in the reservoir solution containing 20% PEG400 at the European Synchrotron Radiation Facility (ESRF) ID14 EH4 beamline. The data were processed using the HKL (18) and CCP4 (19) program suites. The selenium sites were found by direct methods using Patterson-based seeding with the program SHELXD (20). The phasing was done with SOLVE (21), and the resulting experimental map was improved by solvent modification using RESOLVE (21,22). The initial model was built in the experimental map at 2.6-Å resolution using the molecular graphics program TURBO-FRODO (afmb.cnrs-mrs.fr/TURBO_FRODO/). The refinement was performed with Refmac (23) and followed by manual rebuilding using TURBO-FRODO. The statistics on data collection, phasing, and refinement are provided in Table I.

RESULTS AND DISCUSSION
Allantoicase Structure-The structure of yeast allantoicase was determined at a resolution of 2.6 Å by single wavelength anomalous diffraction data obtained from Se-Met-substituted protein crystals. The crystals contain two molecules in the asymmetric unit, related by a local 2-fold axis. The allantoicase monomer is cylindrically shaped with dimensions 29 × 35 × 65 Å. The monomer consists of the association of two compact globular modules, each corresponding to a repeated motif with 30% sequence identity (residues 10-182 and 197-343, hereafter called modules A and B). A topology diagram of the two modules of one monomer is represented in Fig. 2A. The superposition of both modules shows that they are almost identical (Fig. 2B) with a root mean square deviation of 1.6 Å for 120 Cα positions. The fold of the allantoicase module can be defined as a β-sandwich jelly roll motif in which all the strands are connected by loops (Fig. 2C). The β-sandwich is formed from the face-to-face packing of two antiparallel β-sheets containing five and three strands, respectively, in the order β1β2β7β4β5 and β8β3β6. The two sheets enclose an extensive hydrophobic interior that is closed off by the numerous connections between the strands. The N-terminal module has an extra strand (β0a) that extends the shorter sheet and two short β-stretches (β1c and β2c) that form an extra small sheet. The connection between β0a and β1a also contains an extra helix (α0). A few regions are disordered in the structure; no density was observed between residues 188 and 194 (linker connection between the two modules) and between residues 285 and 296 (forming the loop connecting β4b and β5b). The missing regions are the same in both subunits of the non-crystallographic dimer. Mass spectrometry showed that the crystallized protein was intact; the absent regions in the model are therefore due to conformational mobility or heterogeneity.
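For reference, the root mean square deviation quoted for the module superposition is the square root of the mean squared distance between matched Cα positions once the two modules have been superposed. The sketch below illustrates only that final step; it assumes the coordinates are already superposed and paired, and the coordinates shown are hypothetical values, not taken from the structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD over matched, already superposed positions (same units as input)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (Å) for three matched Cα pairs.
module_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
module_b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
print(rmsd(module_a, module_b))  # 2.0
```

In practice the optimal superposition itself (for example by a least-squares fit) has to be computed first, and the value depends on which residue pairs are included, which is why the number of Cα positions is reported alongside the deviation.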
The two modules of the allantoicase monomer are asymmetrically packed against each other (Fig. 2C) in a perpendicular orientation. The contact region involves loops β0-β1 from the N-terminal module (A) and loop β2c-β1b and α1b from the C-terminal module (B). The center of the contact region is hydrophobic (Ile-17, Val-23, Leu-30, Val-201, Val-206, Leu-208, and Leu-223). The hydrophobic nature of these residues is maintained in the microbial allantoicases, but some are mutated into polar residues in the eukaryotic sequences. Some polar contacts are found at the periphery of this patch, but few involve direct hydrogen bonds.
The two subunits of the non-crystallographic dimer are related by a 2-fold axis and are associated through head-to-tail packing, module A of one subunit interacting with module B from the other (Fig. 3A). Dimer formation buries in total 2600 Å², representing 9% of the accessible surface area. The dimer interface is stabilized by eight hydrogen bonds and by extensive salt bridges. The majority of the residues involved in these contacts are conserved. An important part of the packing surface is contributed by long surface loops (residues 119-134 and 221-231). The first loop contains long insertions in the sequences of higher organisms.
Structural Analogues-A search for related structural motifs using a single allantoicase module revealed structural analogies with other β-sandwich modules (Macromolecular Structure Database (MSD) server, www.ebi.ac.uk/msd/Services.html). The best scores were obtained with the N-terminal domain of the XRCC1 single strand break repair enzyme (24) (Z-score, 5.51; root mean square deviation, 1.99 Å for 118 Cα positions) and the carbohydrate-binding modules from family 29 (25) (Z-score, 3.26; root mean square deviation, 2.9 Å for 113 Cα positions aligned). A number of other proteins contain this motif, for instance, the C2 domain of coagulation factors V and VIII (26), the Doc1 subunit of the anaphase-promoting complex (27), and the galactose domain of a bacterial sialidase (28). All these proteins belong to the same SCOP galactose-binding domain-like superfamily (29). Although none of these proteins have a functional relationship to allantoicase, they are all involved in biological interactions with other partners (DNA, carbohydrates, membrane lipids, etc.). Structural alignment of all these modules reveals that they all bind their targets using roughly the same surface of the jelly roll. None of the structurally characterized domains of this superfamily have enzymatic activity, although many of them are associated with another independent catalytic domain. Allantoicase therefore presents the first jelly roll motif within this large family endowed with catalytic activity. Although dimerization of jelly roll motifs is common (30), the bimodal allantoicase presents a unique mode of association of these structural motifs, which might be linked to the quaternary structure of the protein (see below).
Identification of the Active Site-Nothing is known about the mechanism of allantoicases, and residues involved in catalysis or substrate binding have not been identified. Sequence analysis, as illustrated in Fig. 4, shows that both allantoicase modules possess four highly conserved sequence stretches (boxed I, II, III, and IV with a and b in each module). Fig. 2, A and B, shows that these conserved regions have the same location in both jelly roll modules and are structurally almost superposable. Mapping of these conserved motifs on the molecular surface shows that regions I-IV in both allantoicase modules encompass well defined pockets at opposite poles of the allantoicase monomer (Fig. 2, C and D).
The presence of two conserved and structurally similar but spatially distinct pockets seemed rather unusual, but inspection of the crystal packing showed that the region from module A (respectively, module B) packs against the region from module B (respectively, module A) of the neighbor molecule, creating a small but very tight and totally conserved contact (illustrated in Fig. 3, B and C). Allantoicase from Chlamydomonas reinhardtii is present as a hexamer in solution (dimer of trimers) with the trimers bonded by disulfide bridges (31). In the case of yeast allantoicase, combination of the 3-fold crystal axis with the local 2-fold axis creates a hexamer (dimer of trimers; shown in Fig. 3B), but no cysteines are present at the dimer or trimer interfaces. The trimer contact buries 594 Å²/monomer and involves in total seven hydrogen bonds. Direct contacts are from residues in regions II and III (detailed in Fig. 3C). Two stacking arginines make a fireman's grip across the interface ... bridges involve totally conserved residues and protrude deeply inside the interface. This leads us to believe that the yeast enzyme is also present as a hexamer. From gel filtration elution profiles we observed that it may form either dimers or trimers in solution. For solubility reasons, these experiments were carried out at rather high salt concentrations. Since salt bridges seem to be a major stabilizing factor of the trimer interface, high salt conditions may dissociate the protein in solution.

FIG. 2. A, topology diagram for the jelly roll motifs corresponding to the two allantoicase repeats, module A and B (diagram generated by Tops (35)). Secondary structure elements are labeled according to Fig. 4. Positions of conserved regions are boxed I, II, III, and IV. B, superposition of the allantoicase modules A and B, colored bronze and silver, respectively (same orientation as module A in C). Regions I, II, III, and IV are colored red and orange for modules A and B, respectively. All structure figures are generated by Pymol (pymol.sourceforge.net/). C, ribbon presentations of the allantoicase enzyme. The color of the secondary structure elements is identical to those of the above topology diagram. The two allantoicase modules are approximately related by a 90° rotation. Cter, C-terminal; Nter, N-terminal. D, projection of totally conserved (identified by ConSurf (36)) residues (in red) on the allantoicase surface (same orientations as C). The conserved residues cluster in two separated pockets coinciding with regions Ia, IIa, IIIa, and IVa and regions Ib, IIb, IIIb, and IVb, respectively.
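The way a 3-fold axis and a 2-fold axis combine into a hexamer (dimer of trimers) can be illustrated with rotation matrices. The sketch below is an idealized model, not derived from the deposited coordinates: it assumes that the 2-fold axis is exactly perpendicular to the 3-fold axis and intersects it (point group 32), places the 3-fold along z and the 2-fold along x, and uses a hypothetical subunit position.

```python
import math


def rot_z(angle_deg):
    """Rotation by angle_deg about z (taken here as the 3-fold axis)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


# 180-degree rotation about x: the assumed 2-fold axis, perpendicular to z.
TWO_FOLD_X = ((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def apply(m, p):
    """Apply a 3x3 rotation matrix to a point."""
    return tuple(sum(m[i][k] * p[k] for k in range(3)) for i in range(3))


def d3_operations():
    """The six rotations obtained by combining the 3-fold and the 2-fold."""
    ops = []
    for k in range(3):
        r3 = rot_z(120 * k)
        ops.append(r3)                      # subunit in the first trimer
        ops.append(matmul(r3, TWO_FOLD_X))  # its partner in the second trimer
    return ops


def hexamer_positions(point):
    """Positions of the six symmetry-related copies of one point."""
    return [apply(op, point) for op in d3_operations()]


# Hypothetical subunit center (Å), chosen off both axes.
positions = hexamer_positions((20.0, 5.0, 15.0))
for x, y, z in positions:
    print(f"{x:8.2f} {y:8.2f} {z:8.2f}")
```

Three copies keep the original z coordinate and three have it inverted, which corresponds to two trimers stacked along the 3-fold axis and related by the 2-fold axis.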
The trimer interface creates a pocket, lined with totally conserved residues, providing an ideal candidate for the active site. The groove contains at one side Glu-72, Asn-108, and ... C. reinhardtii allantoicase completely kills enzyme activity, indicating that a histidine could be involved in catalysis (32). His-214 is well positioned to play such a role in the yeast enzyme. The carbon-nitrogen bond hydrolyzed by allantoicase is not very reactive, and even with the present structure in hand, how this may be achieved remains an intriguing question. Fumarylacetoacetate hydrolase catalyzes the hydrolysis of a similarly unreactive carbon-carbon bond by combining a Glu-His-water catalytic triad with an activating Ca²⁺ ion (33). The proposed active site pocket in yeast allantoicase has glutamate and histidine side chains that are well positioned to cooperate in a hydrolysis reaction; however, we do not have any evidence for a metal ion site. No effect of metal ions on the activity of the C. reinhardtii enzyme was measured, except a 3-fold increase in the presence of manganese (34). This small effect precludes an essential role for manganese in catalysis. Crystallization trials with substrates/products are underway to further define the active site and reaction mechanism of this interesting enzyme.
Allantoicase has been reported to be bifunctional (32). Both enzyme activities were kinetically characterized for the allantoicase from C. reinhardtii (31,34). The K m values for the allantoate and ureidoglycolate substrates are similar, but the turnover of the latter is 10-fold lower. Ureidoglycolate lyases are specialized enzymes that catalyze the conversion of ureidoglycolate into urea and glyoxylate, but no individual gene associated with this activity has been reported in yeast. The mechanism and partial sequence of the ureidoglycolate lyase of Burkholderia cepacia were recently characterized, but this did not reveal any yeast orthologues (14). Apart from the pocket described above, no second conserved surface patch is present in yeast allantoicase, and both reactions probably take place at the same active site.

CONCLUSION
The crystal structure of yeast allantoicase reveals that it is composed of a repeated jelly roll motif. The repeat of such a motif within the same protein is unique. Although this fold was encountered in a number of other proteins, allantoicase is the first documented example where it carries catalytic activity. Combination of local and crystal symmetry creates a hexamer that probably represents the active species of the enzyme. A totally conserved pocket at the trimeric subunit interface constitutes a good candidate for the active site. The present structure forms an excellent starting point to study the mechanism of this poorly documented enzyme family.