Molecular Modeling and Site-directed Mutagenesis Define the Catalytic Motif in Human γ-Glutamyl Hydrolase

Human γ-glutamyl hydrolase (hGH) is a central enzyme in folyl- and antifolylpoly-γ-glutamate metabolism, which functions by catalyzing the cleavage of the γ-glutamyl chain of substrates. We previously reported that Cys-110 is essential for activity. Using the sequence of hGH as a query, alignment searches of protein data bases were made using the SSearch and TPROBE programs. Significant similarity was found between hGH and the glutamine amidotransferase type I domain of Escherichia coli carbamoyl phosphate synthetase. The resulting hypothesis is that the catalytic fold of hGH is similar to the fold of this domain in carbamoyl phosphate synthetase. This model predicts that Cys-110 of hGH is the active site nucleophile and forms a catalytic triad with residues His-220 and Glu-222. The hGH mutants C110A, H220A, and E222A were prepared. Consistent with the model, mutants C110A and H220A were inactive. However, the Vmax of the E222A hGH mutant was reduced only 6-fold relative to the wild-type enzyme. The model also predicted that His-171 in hGH may be involved in substrate binding. The H171N hGH mutant was found to have a 250-fold reduced Vmax. These studies to determine the catalytic mechanism begin to define the three-dimensional interactions of hGH with poly-γ-glutamate substrates.

γ-Glutamyl hydrolase (GH) (EC 3.4.19.9), which hydrolyzes the γ-glutamyl conjugates of folic acid and antifolates, is a key enzyme in the metabolism of folic acid and the pharmacology of antifolates. Folate is required as a cofactor by several enzymes in the de novo biosynthesis of DNA precursors and several amino acids. Antifolates (for example, methotrexate (MTX)) have been the traditional treatment for many cancers. When the cell takes up folates or antifolates, they are poly-γ-glutamylated by the enzyme folylpolyglutamate synthetase (EC 6.3.2.17) (1). These polyglutamates are retained intracellularly and are generally better substrates or inhibitors of the target enzymes (2-4). GH alters these properties by catalyzing the removal of the polyglutamate chain (1, 5, 6).
Although GH is known to cleave folylpolyglutamates, ultimately yielding folylmonoglutamate and glutamate, its role in the intracellular folate-dependent pathways remains unclear. Similarly, the role of GH in folate deficiencies is ambiguous. Increased GH activity (7, 8) or decreased folylpolyglutamate synthetase activity (9-12) can produce resistance to MTX in vitro. Recently, the ratio of folylpolyglutamate synthetase and GH activities has been demonstrated to be an indicator of MTX polyglutamylation in acute lymphocytic leukemia (13), and inherent resistance to MTX in acute myelogenous leukemia (14) may be due to increased GH activity. However, the role of GH in these outcomes has not been clarified. Therefore, definition of the active site residues and mechanism of human GH (hGH) will be necessary steps in developing specific inhibitors to assess its cellular function.
Despite the important role of GH in folate function and antifolate therapeutics, no structural studies of the enzyme have been carried out. The catalytic mechanism of GH is not known. It has been established that the rat and human GH proteins are inhibited by iodoacetic acid (1,15), suggesting that at least one cysteine is important for activity. In a previous study (16) using site-directed mutagenesis, we altered the cDNA for hGH to encode four different proteins, each with one of four cysteine residues changed to alanine. Three of the mutant proteins had activities similar to wild-type hGH and were inhibited by iodoacetic acid, whereas the C110A mutant protein had no activity. Cys-110 is conserved among the human, rat, and mouse GH amino acid sequences. These results indicate that Cys-110 is essential for enzyme activity and suggest that GH is a cysteine peptidase. The present study extends this earlier work by using comparative molecular modeling to predict the catalytic fold and presents the first molecular model for hGH. The catalytic fold of hGH is predicted to be similar to that of the glutamine amidotransferase type I (GATase) domain of the small subunit of carbamoyl-phosphate synthetase (CPS, EC 6.3.5.5) from Escherichia coli (eCPS). In the model, Cys-110 is proposed as the active site nucleophile, and His-220 and Glu-222 are predicted to be the other amino acids in the putative catalytic triad. The model also predicts that His-171 is involved in substrate binding but not catalysis. Site-directed mutagenesis of these conserved amino acids followed by characterization of the purified mutant proteins was used to test the predictions of the model. Determination of the active site fold is the first step in defining the structure of hGH in order to develop specific inhibitors that can be used to understand the role of this enzyme in folate homeostasis and antifolate therapy.

Generation of the Model for the Catalytic Fold of Human γ-Glutamyl Hydrolase-A two-step process was used to generate the sequence alignment model. First, using the sequence of hGH as the query, a search of Swiss-Prot plus TrEMBL data bases was made using the program SSearch (17), which uses the Smith-Waterman algorithm to identify statistically significant sequence similarities. The set of protein sequences obtained from this search was then used as a starting set for analysis using the program TPROBE (18). TPROBE performs a Gibbs alignment of this set, and it generates motif models based on this alignment. These alignment motifs are next used to search the nonredundant data bases to extract additional sequences that fit the alignment model. TPROBE then purges sequences that are closely related so that each pair of sequences in the set has a BLAST maximal segment pair score less than a cutoff score. For our analysis, the cutoff score was set at 150. TPROBE then performs a Gibbs alignment on this set and generates a new alignment model. This process is repeated until the number of recruited sequences does not increase significantly. As described under "Results," this process identified seven motifs of the GATase family of proteins. The motifs of the GATase domains of proteins of known structure were then aligned with the corresponding motifs in hGH. The SEGMOD module of GeneMine (Molecular Applications Group, Palo Alto, CA) was then used to generate a molecular model by homology between hGH and the most similar sequence with known structure. This procedure uses a data base search strategy and energy minimization algorithm (19, 20). The coordinates of the known structure are used as a basis to derive predicted coordinates of the target protein. The coordinates of the parent are combined with coordinates of target homologous segments from the structural data base to construct 10 initial models. Then an average model based on these 10 is derived. Energy minimization is employed to reduce steric overlap and produce the final predicted structure.
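The search and purge steps described above can be illustrated with a minimal sketch. It is not the SSearch or TPROBE implementation: the scoring values (match, mismatch, linear gap penalty) are placeholder assumptions, real searches use substitution matrices and affine gaps, and the hypothetical `purge` helper uses the local alignment score only as a stand-in for the BLAST maximal segment pair score.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty).

    The scoring parameters are illustrative assumptions, not those of SSearch.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def purge(sequences, cutoff):
    """Greedily keep sequences so that every kept pair scores below cutoff."""
    kept = []
    for seq in sequences:
        if all(smith_waterman_score(seq, other) < cutoff for other in kept):
            kept.append(seq)
    return kept


if __name__ == "__main__":
    toy_set = ["ACGTACGT", "ACGTACGT", "TTTT"]  # toy sequences, not real data
    print(purge(toy_set, cutoff=10))
```

The greedy order-dependent purge is only one simple way to guarantee that all remaining pairs fall below the cutoff; the source does not say which strategy TPROBE uses.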
Expression and Purification of Wild-type and Mutant hGH Proteins-The cDNA for the mature forms of wild-type and C110A hGH were subcloned from pET24a (16) into the NdeI and BamHI sites of the pET28b vector (Novagen, Madison, WI) using standard molecular biology protocols (21). This added an N-terminal His tag to the proteins. The final expressed protein had the N-terminal sequence Met-Gly-Ser-Ser-His-His-His-His-His-His-Ser-Ser-Gly-Leu-Val-Pro-Arg-Gly-Ser-Met before Arg-1 of the previously described sequence of the mature enzyme (15). Site-directed mutagenesis was performed with the QuikChange kit (Stratagene, La Jolla, CA) according to the manufacturer's protocol using the expression vector for the wild-type enzyme as the template, except that 100 ng of each oligonucleotide was used, and the annealing temperature was reduced to 53°C. Oligonucleotides used for mutagenesis of wild-type hGH to H220A, E222A, and H171N were as follows: GHH220A+, 5′-TGT CCA GTG GGC TCC AGA GAA AGC-3′; GHH220A−, 5′-GCT TTC TCT GGA GCC CAC TGG ACA-3′; GHE222A+, 5′-GTG GCA TCC AGC GAA AGC ACC TT-3′; GHE222A−, 5′-AAG GTG CTT TCG CTG GAT GCC AC-3′; GHH171N+, 5′-CTG CCA ATT TCA ATA AGT GGA GCC TCT CCG-3′; GHH171N−, 5′-CGG AGA GGC TCC ACT TAT TGA AAT TGG CAG-3′. The nucleotides altered during site-directed mutagenesis are underlined.
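As a quick consistency check on the oligonucleotides listed above, the following sketch tests whether each "−" primer is the reverse complement of its "+" partner, as would be expected for a complementary mutagenic primer pair. The sequences are copied from the text; the dictionary layout and function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer pairs as printed in the text (plus strand, minus strand), 5' to 3'.
PRIMERS = {
    "H220A": ("TGT CCA GTG GGC TCC AGA GAA AGC",
              "GCT TTC TCT GGA GCC CAC TGG ACA"),
    "E222A": ("GTG GCA TCC AGC GAA AGC ACC TT",
              "AAG GTG CTT TCG CTG GAT GCC AC"),
    "H171N": ("CTG CCA ATT TCA ATA AGT GGA GCC TCT CCG",
              "CGG AGA GGC TCC ACT TAT TGA AAT TGG CAG"),
}


def pairs_are_complementary(primers):
    """Map each mutant name to True if its minus primer matches the plus strand."""
    result = {}
    for name, (plus, minus) in primers.items():
        plus, minus = plus.replace(" ", ""), minus.replace(" ", "")
        result[name] = reverse_complement(plus) == minus
    return result


if __name__ == "__main__":
    for name, ok in pairs_are_complementary(PRIMERS).items():
        print(name, "complementary" if ok else "MISMATCH")
```

A check of this kind is a cheap way to catch transcription errors in primer tables before ordering oligonucleotides.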
After mutagenesis, plasmids were purified using the QIAfilter plasmid midi kit (Qiagen, Valencia, CA) following the manufacturer's protocol, and the entire hGH open reading frame was sequenced on an ABI 377 sequencer (PerkinElmer Life Sciences).
Wild-type enzyme or mutants of hGH were expressed in an E. coli expression system (Novagen) as described previously (16). Wild-type or mutant hGH proteins were purified to homogeneity by a two-step procedure. The clarified sonicate from 1 liter of culture was purified by nickel chelate chromatography on a HisBind column (6.3 × 1.0 cm) at either room temperature or 4°C following the manufacturer's protocol (Novagen). Eluate containing hGH protein was dialyzed into 0.05 M sodium acetate buffer, pH 5.5, containing 0.05 M 2-mercaptoethanol, 1 M NaCl, and 1 mM EDTA. Aliquots (2 ml) were further purified by gel filtration chromatography at 4°C on a Sephacryl S-200 column (2.5 × 95 cm) (Amersham Pharmacia Biotech) in the same buffer with a flow rate of 20 ml/h. Fractions (2.5 ml) containing hGH protein were pooled, and the protein was concentrated using an Amicon (Beverly, MA) stirred cell with a YM-10 membrane. Protein concentrations were determined as described previously (16).
Enzyme Assays-Enzyme activity was measured using the substrate 4-NH2-10-CH3PteGlu2 (MTXG2, Schircks Laboratories, Jona, Switzerland), and kinetic constants were determined, both as described previously (16). The results presented are an average of at least two protein preparations and three sets of assays per protein preparation.
Secondary Structure Determination by Circular Dichroism-To determine the secondary structure of wild-type and mutant hGH proteins, measurements were made on a Jasco 720 spectropolarimeter (Japan Spectroscopic Ltd, Tokyo, Japan). Three scans were made at 20 nm/min between 260 and 200 nm in a 0.05-, 0.02-, or 0.01-cm cell at 25°C. After subtraction of the buffer spectra, the data were converted to molar ellipticity units. The Selcon program (22) was used to calculate the α-helix, β-sheet, turn, and random structure content.

RESULTS
Construction of the Model-By using the SSearch and TPROBE programs, a statistically significant similarity was found between the GATase type I group of proteins and hGH. TPROBE generated an alignment model with seven motifs. A total of 391 sequences were identified as members of this superfamily. The Sequence Logo alignment model (23) illustrated in Fig. 1 uses 47 of these sequences because TPROBE purges sequences that are too closely related. The numbering used corresponds to that for mature hGH (15). In particular, the well characterized GATase catalytic residues cysteine, histidine, and a putative third residue, glutamate, are strongly conserved in the sequence alignment model. Among the set of sequences that belong to the identified superfamily, the small subunit of eCPS (24), GMP synthetase from E. coli (25), and anthranilate synthase from Sulfolobus solfataricus (26) have known high resolution x-ray structures with Protein Data Bank accession numbers 1jdb, 1gpm, and 1qdl, respectively. A molecular model of hGH is presented in Fig. 2. The image was generated with the program MolScript (27) and rendered by the program Raster3D (28). The model was generated by first aligning amino acid sequences in hGH with the corresponding sequence in eCPS and then using eCPS as a template structure for the SEGMOD module of GeneMine. The other structures were aligned as well, but the modeling software selected eCPS, which is the most similar sequence to hGH, with 17% similarity. The superimposition of the α-carbon atoms of the hGH model with 1jdb chain C, 1qdl chain B, and 1gpm chain A had Cα root mean square deviation values of 1.02, 4.76, and 8.37 Å, respectively. The regions that can be predicted with confidence are those that in the alignment model have a maximum a posteriori probability (MAP) score that is positive or close to zero. In hGH, these regions are residues 95-124, 199-223, 59-82, 21-48, and 157-181, corresponding to 254-283, 332-356, 226-249, 190-217, and 298-322 in 1jdb chain C. Each individual motif is colored in Fig. 2 in order of descending statistical significance as follows: orange 95-124, blue 199-223, gold 59-82, salmon 21-48, sea green 157-181. Red regions represent those sequences outside of the five significant motifs. There are four regions in hGH that are sequence insertions with respect to eCPS. The sequence insertion in hGH of residues 142-159 has corresponding but slightly smaller insertion regions in 1gpm chain A and 1qdl chain B compared with 1jdb chain C. The other three insertions are unique to hGH, since they do not occur in the other structures. These insertions are shown in red in Fig. 2.
As a test of the proposed structure, a model structure for the subunit of GMP synthetase, 1gpm chain A, was calculated using the same procedure. The percent similarity of the alignment between 1gpm chain A and 1jdb chain C is 19%. Since this value is close to the 17% similarity between hGH and 1jdb chain C, the model structure for 1gpm chain A can be used as a positive control to validate the molecular modeling of hGH. The region 7-207 of 1gpm chain A was modeled using 1jdb chain C as a template structure. The region starts with the first motif identified by TPROBE and ends with the last motif. The region includes the two non-significant motifs. A structural superimposition of the five significant motifs between the 1gpm chain A model and the known 1gpm chain A structure (25) had a Cα root mean square deviation of 3.86 Å, and the overall superimposition had a Cα root mean square deviation of 4.99 Å. The overall fold of the model was similar to the known structure. As expected, the model of the loop, 1gpm chain A 117-131, corresponding to the insertion in 1gpm, is not accurate, whereas the five significant motifs (1gpm chain A 71-100, 160-184, 44-67, 9-36, and 129-153) overlap quite well.
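For readers unfamiliar with the measure, the Cα root mean square deviation quoted above can be computed as in the following sketch. It assumes the two coordinate sets have already been optimally superimposed and paired residue by residue; the superposition step itself (for example, a least-squares fit) is not shown, and the coordinates in the usage example are made-up placeholders rather than values from any structure.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (same units as the input, e.g. angstroms) between paired C-alpha atoms.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples that
    are assumed to be already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]      # placeholder coordinates
    reference = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]  # same atoms shifted 1 unit in z
    print(ca_rmsd(model, reference))
```

Because the deviations are squared before averaging, a few badly modeled loop residues can dominate the overall value, which is consistent with the motif-only superimposition giving a lower figure than the overall one.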
Predictions of the Model-The model predicts that Cys-110 in hGH (yellow in Fig. 2) is analogous to Cys-269 in eCPS. eCPS contains two domains and catalyzes the synthesis of carbamoyl phosphate from bicarbonate, glutamine, and two molecules of MgATP. The smaller subunit is the site of glutamine hydrolysis to produce ammonia for delivery to the larger subunit. eCPS and other members of the GATase I family of enzymes contain a Cys-His-Glu catalytic triad (25). Cys-269 in eCPS functions as the active site nucleophile in the small subunit, attacking the γ-carbonyl of glutamine to form a glutamyl thioester intermediate (29). His-220 (pale blue in Fig. 2) and Glu-222 in hGH (bisque in Fig. 2) are predicted to be the other two amino acids in the proposed catalytic triad corresponding to His-353 and Glu-355 in eCPS. The alignment model also predicts that His-171 in hGH (pale blue in Fig. 2) is analogous to His-312 in eCPS.
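The residue correspondences above follow directly from the motif ranges reported under "Construction of the Model." The sketch below maps an hGH residue number to its eCPS (1jdb chain C) counterpart, assuming each motif is an ungapped block so that positions correspond one to one; the function and table names are illustrative.

```python
# (hGH start, hGH end) -> (1jdb chain C start, end) for the five significant motifs.
MOTIFS = [
    ((95, 124), (254, 283)),
    ((199, 223), (332, 356)),
    ((59, 82), (226, 249)),
    ((21, 48), (190, 217)),
    ((157, 181), (298, 322)),
]


def hgh_to_ecps(residue):
    """Return the eCPS residue aligned to an hGH residue, or None if the
    residue lies outside the five significant motifs."""
    for (h_start, h_end), (e_start, _e_end) in MOTIFS:
        if h_start <= residue <= h_end:
            # Ungapped-block assumption: same offset from the motif start.
            return e_start + (residue - h_start)
    return None


if __name__ == "__main__":
    for name, number in [("Cys", 110), ("His", 220), ("Glu", 222), ("His", 171)]:
        print(f"hGH {name}-{number} -> eCPS residue {hgh_to_ecps(number)}")
```

Running the mapping on Cys-110, His-220, Glu-222, and His-171 reproduces the eCPS residues named in the text (269, 353, 355, and 312).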
Site-directed Mutagenesis of hGH-The addition of an N-terminal His tag to hGH enabled the purification to homogeneity of large quantities of wild-type and mutant proteins from the E. coli expression system (Fig. 3). The wild-type enzyme and mutant proteins reacted with a polyclonal antibody raised against hGH expressed in insect cells (Fig. 3) (15). Analysis of the secondary structures of wild-type and mutant hGH proteins by circular dichroism (Table I) indicated that they were essentially the same, demonstrating that the site-directed mutagenesis did not significantly alter the protein folding.
Enzyme Kinetics of Wild-type and Mutant hGH Proteins-The purified wild-type and mutant hGH proteins were assayed for activity with MTXG2 (Table II). The kinetic constants obtained for the wild-type hGH protein with an N-terminal His tag were similar to previous results with a non-His-tag hGH protein (16), suggesting that the addition of the N-terminal His tag had no effect on hGH activity. As predicted by the molecular modeling and previous results in a non-His-tag expression system (16), mutation of Cys-110 to alanine in the His-tag construct produced inactive enzyme. Mutation of His-220 to alanine also produced an inactive enzyme, confirming the critical role of this amino acid in catalysis. Mutation of Glu-222 to alanine produced active enzyme, with a Km similar to that of the wild-type hGH and a 6-fold reduced Vmax (Table II). Mutating His-171 to asparagine produced an enzyme with a 250-fold reduction in Vmax (Table II) but an unchanged Km.
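To make the kinetic consequences concrete, the following sketch assumes simple Michaelis-Menten behavior, v = Vmax[S]/(Km + [S]). Under that assumption, a mutant with an unchanged Km and an n-fold lower Vmax is n-fold slower at every substrate concentration. The Vmax and Km values used here are arbitrary placeholders, not the measured constants of Table II; only the fold changes (6 and 250) come from the text.

```python
def mm_rate(substrate, vmax, km):
    """Michaelis-Menten initial rate for a given substrate concentration."""
    return vmax * substrate / (km + substrate)


def relative_rate(substrate, fold_reduction, vmax=1.0, km=1.0):
    """Mutant rate divided by wild-type rate when only Vmax changes.

    vmax and km are hypothetical wild-type values in arbitrary units.
    """
    wild_type = mm_rate(substrate, vmax, km)
    mutant = mm_rate(substrate, vmax / fold_reduction, km)
    return mutant / wild_type


if __name__ == "__main__":
    for label, fold in [("E222A", 6), ("H171N", 250)]:
        for s in (0.1, 1.0, 10.0):  # substrate in the same arbitrary units as km
            print(label, s, relative_rate(s, fold))
```

If Km also changed, the ratio would depend on substrate concentration, which is why the text reports Vmax and Km separately.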

DISCUSSION
The proposed molecular model predicted the catalytic triad based on homology with the GATase domain of eCPS. In agreement with our previous study of the four hGH cysteine residues in which Cys-110 was identified as essential for activity (16), expression of the C110A hGH mutant in an E. coli expression system using a construct with a His tag attached to the N terminus produced inactive enzyme. Amino acid Cys-110 was confirmed as necessary for catalytic activity and predicted by homology to Cys-269 in eCPS to be the active site nucleophile that attacks the γ-carbonyl of the susceptible Glu-Glu bond in poly-γ-glutamate substrates to form a thioester intermediate. A similar mutation in eCPS produced protein that bound but did not hydrolyze glutamine (30).
The production of an inactive hGH mutant by mutating His-220 to alanine is consistent with the predicted role in the catalytic mechanism. The equivalent residue in eCPS (His-353) is suggested to have several roles: to activate the cysteine nucleophile, to protonate the leaving group, and to activate the water molecule that attacks the thioester intermediate (29, 31-34). Mutating His-353 to asparagine in eCPS produced an enzyme with greatly reduced activity, which still bound glutamine (31). The same mutation in a mammalian CPS yielded a protein with a 164-fold lower kcat, a 4-fold higher Km, and an unmeasurably low rate of thioester formation (32). The mutation of the equivalent histidine in the GATase domain of anthranilate synthase decreased both activity and sensitivity to 6-diazo-5-oxo-L-norleucine, suggesting that this histidine residue plays a role in the formation of the thioester intermediate (35).
Table I (legend). Circular dichroism spectra were generated between 260 and 200 nm. The data were converted to molar ellipticity units and used in the program Selcon (22) to determine the percentage of α-helix, β-sheet, turn, and random structure. Results are the mean ± S.D. of 3-6 spectra.
The results presented here indicate that Glu-222 is not catalytically essential in hGH. When this residue was mutated to alanine there was a 6-fold reduction in Vmax, indicating that the E222A hGH mutant retains more activity than the equivalent mutant of eCPS. The thermal stability for the hGH E222A mutant protein was also decreased, and a lower yield of purified E222A was observed (results not shown). These observations are consistent with results obtained by site-directed mutagenesis of the putative third member of the catalytic triad of papain (36). Mutation of this asparagine to glutamine and alanine reduced the kcat/Km 3.4- and 150-fold, respectively. It was also found that these mutants had increased aggregation, proteolytic susceptibility, and decreased thermal stability, suggesting that this residue was also structurally important.
The role of Glu-355 as a member of a catalytic triad in eCPS has recently been the subject of some debate. Glu-355 is conserved in all known GATase type I sequences and was proposed to be a member of the catalytic triad based on analogy to other hydrolyzing enzymes. The location of this conserved glutamate within hydrogen-bonding distance of His-353 in the crystal structures of eCPS (24) and the equivalent histidine in E. coli GMP synthetase (25) added strength to this hypothesis. Only recently has site-directed mutagenesis been performed to test this hypothesis. When this residue was mutated to glycine in mammalian CPS, an enzyme was generated that had an unaltered Km but 47-fold lower kcat (32) and poor thioester formation. An E355A mutation in eCPS (33) produced an enzyme with a 47-fold higher Km in glutamine-dependent reactions but an unaltered Vmax. In p-aminobenzoate synthetase (37), a 35-fold increased Km, 4-fold decreased Vmax, and reduced thioester formation were observed after mutation of the equivalent glutamate residue. The retention of some activity when the active site glutamate was mutated to alanine argues against Glu-355 being absolutely essential for catalysis, but poor thioester formation suggests that it may have an indirect role in catalysis. The absolute conservation of this glutamate residue in GATase domains and also in GH proteins suggests a critical role for this residue. In hGH, Glu-222 may play more of a role in structural integrity than in catalysis.
The model predicted that His-171 in hGH would function like His-312 in eCPS. His-312 in eCPS has been suggested to be involved in substrate binding but not catalysis (31). Mutating His-171 in hGH to asparagine reduced enzyme activity. Asparagine was chosen as it is sterically conservative and may maintain some hydrogen-bonding patterns but cannot perform the acid/base reactions characteristic of the imidazole ring. However, unlike the equivalent mutation in eCPS, Vmax and not Km was altered in hGH. This absence of a change in Km suggests that His-171 in hGH functions differently from His-312 in eCPS. The large reduction in Vmax for the H171N mutation implies some unknown role in the catalytic mechanism of hGH. The crystal structure of eCPS indicates that the imidazole ring of His-312 points away from the catalytic triad and is approximately 5 Å from the catalytic cysteine (24). His-312 was proposed to be involved in glutamine binding rather than catalysis based on biochemical analysis of an H312N mutant eCPS protein. This mutant had a 47-fold increase in the Km for the glutaminase reaction but an unaltered Vmax (31). However, in the crystal structure of the H353N eCPS mutant, where the thioester intermediate has been trapped (29), there are no apparent electrostatic interactions between His-312 and the substrate. Therefore, the role of this residue in substrate binding remains unknown.
The hypothesis that the two enzymes use a similar catalytic mechanism is consistent with the similarity of the biochemical reactions catalyzed by hGH and the GATase domain of eCPS. It has previously been shown that the products of the hGH-catalyzed reaction on folylpolyglutamates are glutamate or di-γ-glutamate (15). The amide bond hydrolyzed in the polyglutamate chain links the α-amino group of one glutamate to the γ-carbonyl of the preceding glutamate. The GATase domain of eCPS removes the γ-NH2 from glutamine to produce glutamate and ammonia (24-26, 29-34). The reaction proceeds by attack of the active site cysteine on the γ-carbonyl of glutamine to form a thioester intermediate and ammonia (24, 29, 33, 34). It can be envisioned that if the catalytic mechanism of eCPS is applied to hGH, the active site cysteine would attack the γ-carbonyl of the poly-γ-glutamate to form the thioester, and the γ-linked glutamate would be released. hGH has been shown to be inhibited by glutamine analogues such as azaserine and 6-diazo-5-oxo-L-norleucine (38), also suggesting that the active site could be similar to the glutamine binding GATase domain of eCPS.
These results indicate that hGH is a member of the cysteine peptidase family of enzymes with Cys-110 as the active site nucleophile and His-220 as the proton donor. The catalytic fold appears to be similar to that of the GATase type I domain, with His-171 having an essential but as yet undetermined role in enzyme activity.
Ollis et al. (39) identify the α/β hydrolase fold common to several hydrolytic enzymes with different catalytic function and phylogenetic origin. The x-ray structure of GMP synthetase (25) indicates that the GATase type I domain has a fold different from the α/β hydrolase fold. However, some of the elements of the α/β hydrolase fold are retained. For example, the "nucleophile elbow" is present. In this structural feature, the catalytic cysteine is the last amino acid of a β-strand and the first of an α-helix, forcing it into an unusual orientation (39). To maintain this structure, the amino acids at positions +2 and −2 relative to the catalytic cysteine must be small (39). In hGH residues 108 and 112 are both glycines. Therefore, hGH may also contain a nucleophile elbow. X-ray crystallography studies are under way to test this hypothesis and to determine the role of His-171. The results presented here define the active site fold and catalytic residues of hGH and should facilitate the design of high affinity inhibitors specific for GH to test whether inhibition of GH activity would lead to increased retention of folyl- and antifolylpolyglutamates within the cell.