Required allosteric effector site for N-acetylglutamate on carbamoyl-phosphate synthetase I.

Carbamoyl-phosphate synthetase I (CPSase I) catalyzes the entry and rate-limiting step in the urea cycle, the pathway by which mammals detoxify ammonia. One facet of CPSase I regulation is a requirement for N-acetylglutamate (AGA), which induces an active enzyme conformation and does not participate directly in the chemical reaction. We have utilized labeling with carbodiimide-activated [14C]AGA to identify peptides 120-127, 234-237, 625-630, and 1351-1356 as potentially being near the binding site for AGA. Identification of peptide 1351-1356 confirms the previous demonstration (Rodriquez-Aparicio, L. B., Guadalajara, A. M., and Rubio, V. (1989) Biochemistry 28, 3070-3074) that the C-terminal region is involved in binding AGA. Identification of peptides 120-127 and 234-237 constitutes the first evidence that the N-terminal region of the synthetase is involved in ligand binding. Since peptides 631-638 and 1327-1348 have been identified near the ATP site of CPSase I (Potter, M. D., and Powers-Lee, S. G. (1992) J. Biol. Chem. 267, 2023-2031), the present finding of involvement of peptides 625-630 and 1351-1356 at an “allosteric” activator site was unexpected. The idea that portions of the AGA effector site might be derived from an ancestral glutamine substrate site via a gene duplication and diversification event was considered.

Formation of carbamoyl phosphate by carbamoyl-phosphate synthetase (CPSase) (see Equation 1 and Table I) constitutes the first step of two biosynthetic pathways, one leading to arginine and/or urea and the other to pyrimidine nucleotides (1, 2).
$$2\,\mathrm{ATP} + \mathrm{HCO_3^-}/\mathrm{CO_2} + \mathrm{NH_3}\ (\text{glutamine}) \longrightarrow 2\,\mathrm{ADP} + \mathrm{P_i} + \text{carbamoyl-phosphate}\ (+\ \text{glutamate}) \qquad \text{(Eq. 1)}$$
Escherichia coli and other enteric bacteria have a single CPSase that supplies carbamoyl phosphate to both the arginine and pyrimidine pathways and that is allosterically regulated by intermediates of both pathways, with UMP acting as an allosteric inhibitor and IMP and ornithine (an intermediate in the arginine biosynthetic pathway) acting as allosteric activators. Animals and fungi have two CPSases, with one specific for pyrimidine biosynthesis, which is allosterically regulated by UTP but not by ornithine, and with the other specific for arginine biosynthesis and/or the urea cycle. In spite of extensive screening of intermediates of the arginine and pyrimidine pathways and of related amino acids and amino acid analogs (3, 4), there have been no allosteric regulators identified for the arginine-specific CPSases of Saccharomyces cerevisiae. However, N-acetylglutamate (AGA) is a required allosteric activator for CPSases involved in urea biosynthesis for ammonia detoxification, including those from rat liver and human liver, and for the elasmobranch CPSases involved in urea biosynthesis for osmoregulation.
The requirement for AGA apparently serves as a major regulatory feature of mammalian ammonia detoxification (5), and the CPSases have been divided into three groups on the basis of their requirement for AGA and on the basis of the nitrogen donor utilized (Table I). CPSase I, represented in the present research by rat liver CPSase I, requires AGA and uses ammonia, but not glutamine, to form carbamoyl phosphate. CPSase II is not affected by the presence of AGA and uses glutamine preferentially but can also utilize ammonia; the arginine-specific and pyrimidine-specific CPSases, as well as the single E. coli CPSase serving both pathways, are members of the CPSase II group. CPSase III, from urea osmotic elasmobranches as well as some invertebrates, requires AGA and uses glutamine preferentially, but can also utilize ammonia. Mechanistic studies of the glutamine-utilizing CPSases have established that they act as glutamine amidotransferases to cleave ammonia from glutamine and then utilize this nascent ammonia analogously to the alternative substrate, exogenous ammonia, at the carbamoyl phosphate synthesis site (6). The glutamine amidotransferase (GAT) moiety of CPSase may reside on a small subunit that is associated with a large carbamoyl-phosphate synthetase subunit, as in the case of the E. coli enzyme (6), or it may reside on a domain that is fused to the synthetase, as in the case of the mammalian pyrimidine-specific CPSase II (7) and elasmobranch CPSase III (8).
Sequencing studies (e.g. Refs. 7-11) have shown that CPSases I, II, and III all have extensive sequence identity/similarity and that they all incorporate a GAT region that has a sequence homologous to those of the other six members of the G-type amidotransferase family and that appears to have evolved from an ancestral glutamine-utilizing gene (12). CPSase I, which cannot utilize glutamine as a substrate and has no detectable glutaminase activity, also contains the GAT homology region. Presumably residue(s) essential for glutaminase activity have been replaced in the CPSase I family, e.g. replacement of an essential cysteine by serine in rat liver CPSase I (13). Analysis of the CPSase sequences (7-11) has also shown the presence of two ATP binding regions (domains B and C) that contain the consensus primary sequence motifs identified for ATP-utilizing enzymes and that appear to have resulted from an ancestral gene duplication. Functional studies of E. coli CPSase II (14-17) and rat liver CPSase I (18-23) have established that the two molecules of ATP bind at distinct sites and react in discrete steps in the reaction pathway, with one molecule of ATP (ATP B, which binds to domain B) utilized for activation of bicarbonate and the other molecule of ATP (ATP C, which binds to domain C) utilized for phosphorylation of carbamate. In all CPSases studied, the GAT domain can act independently as a glutaminase (e.g. Refs. 24, 25), and the remainder of the protein can function independently for ammonia-dependent carbamoyl phosphate synthesis; however, there is a strong functional linkage between occupancy of the glutamine site on the GAT domain and occupancy of the ATP B site on the synthetase portion of CPSase, with greatly increased activity at each site resulting from occupancy of the other site (6, 24, 25).
The known allosteric effectors for CPSases all affect the binding of ATP but do so in two quite distinct manners. IMP, UMP, UTP, and ornithine are all known to affect the interaction of CPSases with both ATP B and ATP C (6), and binding of these allosteric effectors has been shown to involve the C-terminal region of CPSases II (26-29). In contrast to the CPSase II effectors, AGA greatly facilitates the binding of ATP B to CPSase I but has no detectable effect on the binding of ATP C (22, 30). In the single previous study of AGA site localization, reaction of rat liver CPSase I with the photoaffinity label N-chloroacetyl[14C]glutamate was found to result in labeling of only the C-terminal region of CPSase (31).
We now report additional studies that were aimed at identifying specific peptides that are near the AGA binding site. We have utilized an alternative and complementary approach to labeling the AGA site, carbodiimide-mediated covalent attachment of AGA to lysyl residues located near the AGA site. This latter approach should allow identification of peptides near the carboxyl groups of AGA, whereas use of photoactivated N-chloroacetylglutamate should allow identification of peptides near the acetyl group of AGA. In addition, since the photochemical reactions of alkyl halides preferentially involve aromatic groups (32, 33), aromatic amino acid side chains near the AGA binding site would have been detectable with photoactivated N-chloroacetylglutamate (31), whereas the present studies are targeted at lysyl residues. Furthermore, more recent studies utilizing an extensive panel of AGA analogs (34) have shown that AGA is bound with high affinity to CPSase I only in the presence of ATP and that the topography of the single AGA binding site differs significantly for the high and low affinity forms. Therefore, we have included ATP in the present studies of AGA site localization, whereas the photolabeling of rat liver CPSase I by N-chloroacetylglutamate was studied in the absence of ATP (31).
We have identified four lysyl-containing peptides of rat liver CPSase I that potentially occur near the binding site for AGA. These peptides confirm the involvement of the C-terminal region that was suggested in the previous study of AGA site localization (31) and also indicate that residues from additional portions of the protein are potentially near the AGA site. Surprisingly, two of the labeled peptides (625-630 and 1351-1356) are contiguous in the primary sequence with peptides previously localized to the ATP B site (631-638 and 1327-1348; Ref. 22). Thus, rather than binding to a site removed from the active site, the required allosteric effector AGA apparently binds to a site that is very near the active site where it facilitates ATP B binding. The other two peptides covalently linked to AGA (120 -127, 234 -237) are in the extreme N-terminal portion of CPSase I and provide the first suggestion that this region may be involved in binding of any ligands. Based on these peptide localizations and on sequence analysis of the extreme N-terminal region of CPSase, we propose that part of the AGA effector site is derived from an ancestral glutamine substrate site via a gene duplication and diversification event and present a model for CPSase ligand interactions that incorporates this new AGA site localization.
Enzymatic Activity and Protein Determination-The enzymatic activity of CPSase I was assayed by the coupled pyruvate kinase/lactate dehydrogenase system, as described previously (22). All assay data points were the average of at least three determinations. CPSase I concentrations were determined by the Bradford method (35) or from the absorbance at 280 nm. It has been found that the amount of CPSase I that yields an absorbance of 0.96 at 280 nm yields a value of 1 mg/ml in the Bradford assay (36). The enzyme was placed in buffers other than the storage buffer by the rapid centrifugal gel filtration column procedure of Penefsky (37).
Table I, footnote a: The small subunit of these two CPSases binds and cleaves glutamine, whereas the large subunit accepts the ammonia moiety cleaved from glutamine, binds all of the remaining substrates and effectors, and carries out all of the other catalytic events. The glutamine amidotransferase moiety is fused to the synthetase moiety in the single-subunit CPSases. CPSase I, which cannot utilize glutamine as a substrate and has no detectable glutaminase activity, also contains the glutamine amidotransferase homology region, but residues essential for glutaminase activity have been replaced.
Labeling of CPSase I with EDC-activated AGA-The reaction mixture used for EDC-activated AGA labeling contained 75 mM MOPS, pH 7.2, 10 mM KCl, 10 mM EDC, 0.1 mM AGA (unless specified otherwise), 10 mM ATP, 18 mM MgCl2, and 3 mg/ml (18 µM) CPSase I. The AGA was preactivated with the EDC for 10 min at 25°C prior to its addition to the labeling mixture. After the indicated time of reaction at 37°C, the CPSase I was removed from the labeling mixture components by centrifugal desalting (37) into 100 mM Tris/HCl, pH 8.1.
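The concentrations quoted for the labeling mixture can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original methods: it assumes a subunit mass of 160 kDa (the size of the intact species observed on SDS-PAGE in this work) and uses the calibration stated above (an absorbance of 0.96 at 280 nm corresponds to 1 mg/ml). All function names are illustrative.

```python
# Assumed subunit mass of mature CPSase I (kDa); taken from the 160-kDa
# intact species seen on SDS-PAGE, not from a sequence-derived mass.
CPSASE_MASS_KDA = 160.0

def a280_to_mg_per_ml(a280: float) -> float:
    """Convert A280 to mg/ml using the stated calibration (0.96 A280 per 1 mg/ml)."""
    return a280 / 0.96

def mg_per_ml_to_uM(mg_per_ml: float, mass_kda: float = CPSASE_MASS_KDA) -> float:
    """mg/ml equals g/L; dividing by g/mol gives mol/L; scale to micromolar."""
    return mg_per_ml / (mass_kda * 1000.0) * 1e6

def molar_excess(reagent_mM: float, protein_uM: float) -> float:
    """Molar ratio of a reagent (given in mM) to the protein (given in uM)."""
    return reagent_mM * 1000.0 / protein_uM

cpsase_uM = mg_per_ml_to_uM(3.0)            # 3 mg/ml enzyme in the labeling mix
aga_excess = molar_excess(0.1, cpsase_uM)   # 0.1 mM AGA
edc_excess = molar_excess(10.0, cpsase_uM)  # 10 mM EDC

print(f"CPSase I: {cpsase_uM:.2f} uM")
print(f"AGA:enzyme molar ratio: {aga_excess:.1f}")
print(f"EDC:enzyme molar ratio: {edc_excess:.0f}")
```

Under the 160-kDa assumption, 3 mg/ml corresponds to roughly 18-19 µM enzyme, in line with the 18 µM stated in the text, so AGA is present at only about a 5-fold molar excess over enzyme while EDC is in several hundred-fold excess.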
Limited Proteolysis, Electrophoresis, and Densitometry-After a 15-min treatment with EDC-activated AGA, CPSase I was subjected to limited proteolysis (38) for 30 min in 0.1 M Tris, pH 8.1, with either elastase (1:59) or trypsin (1:220). The resulting large fragments were separated by SDS-PAGE (39) on 8% gels. The amount of protein was determined by scanning the wet gels, stained with Coomassie Brilliant Blue R-250, with a Hoefer GS 300 scanning densitometer and integrating the peaks with the Hoefer GS 360 system. For autoradiography, gels were dried and exposed to Amersham Hyperfilm-βmax at room temperature. The developed film was subjected to scanning densitometry and integration as described above.
Exhaustive Proteolysis and Peptide Analysis-After a 15-min treatment with EDC-activated AGA, CPSase I was reduced and alkylated with iodoacetic acid as described by L'Italien (40). Exhaustive trypsin proteolysis was carried out in 0.05 M Hepes, pH 7.5, 0.05% SDS, 0.01 M CaCl2 at 20°C for 96 h with trypsin (1:20) added every 24 h; end-over-end shaking was required to keep the labeled protein suspended. Although an 18-h digestion was sufficient for unlabeled CPSase I, the AGA-modified enzyme required a much longer incubation for complete proteolysis. The solution was microcentrifuged and the supernatant loaded on a Vydac C18, 5-µm column (0.45 × 25 cm). Peptides were eluted with the following gradient of solvent A (0.11% trifluoroacetic acid) and solvent B (95% acetonitrile, 0.1% trifluoroacetic acid): 0% B for 10 min, 0-35% B for 140 min, and 35-75% B for 15 min. Radiolabeled peptides were detected with a Beckman Radioisotope Detector connected in series with a Waters 441 Absorbance Detector (214 nm optics) and were collected directly from the column eluate. The amino acid sequences of these radioactive peptides were determined at the Tufts University Analytical Core Facility by automated Edman degradation. Amino acid composition analysis was also carried out at the Tufts University Analytical Core Facility.
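The gradient program above is piecewise linear, so a solvent composition can be converted to a nominal program time and vice versa. The sketch below is illustrative only: it models the programmed gradient and ignores column dead volume and system delay, so the times are nominal rather than observed retention times. The function names are hypothetical.

```python
# Gradient program from the text: (duration in min, start %B, end %B)
GRADIENT = [
    (10.0, 0.0, 0.0),    # isocratic hold at 0% B
    (140.0, 0.0, 35.0),  # 0-35% B
    (15.0, 35.0, 75.0),  # 35-75% B
]

def percent_b_at(t_min: float) -> float:
    """Programmed %B at a given time (min) after the start of the run."""
    elapsed = 0.0
    for duration, start, end in GRADIENT:
        if t_min <= elapsed + duration:
            frac = (t_min - elapsed) / duration
            return start + frac * (end - start)
        elapsed += duration
    return GRADIENT[-1][2]  # after the program ends, hold the final value

def time_at_percent_b(pct: float) -> float:
    """Nominal program time (min) at which a given %B is reached on a ramp."""
    elapsed = 0.0
    for duration, start, end in GRADIENT:
        if end > start and start <= pct <= end:
            return elapsed + (pct - start) / (end - start) * duration
        elapsed += duration
    raise ValueError(f"{pct}% B is outside the programmed gradient")

# Peak 3 of intact CPSase I is reported to elute at 14.75% B.
print(f"14.75% B is programmed at {time_at_percent_b(14.75):.1f} min")
print(f"At 80 min the program delivers {percent_b_at(80.0):.1f}% B")
```

Because the 0-35% B ramp is shallow (0.25% B per min), peaks reported a fraction of a percent apart in %B are still separated by a minute or more of program time.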
Localization of AGA-labeled Peptides to CPSase I Domains-Limited proteolysis, reduction, and alkylation were carried out as described above. The CPSase I fragments were separated by SDS-PAGE (39) on 8% gels, stained for 10 min at room temperature with 0.03% Coomassie Brilliant Blue R-250 in acetic acid/methanol/water (1:3:6, v/v/v), and destained for 50 min at 4°C in the same solvent system. The fragment bands were excised from the gel, sliced into 1-mm pieces, and electroeluted into Centricon 30 microconcentrators with an Amicon Centrilutor. Individual electroeluted samples of the fragments were concentrated in the original Centricon 30 microconcentrators, resuspended in 0.05% SDS in 50 mM Hepes, pH 7.5, and concentrated again. These samples were pooled to create single samples of each fragment with approximate final volumes of 0.5 ml. These samples were subjected to exhaustive trypsin proteolysis and the resulting peptides separated by reverse-phase HPLC as described above.

Labeling of CPSase I with EDC-activated [14C]AGA-We have utilized the "zero-length" cross-linker EDC (41) to covalently attach AGA to amino acid residues in or presumably near its binding site on CPSase I. The rationale for this labeling approach was as follows. (a) Positively charged lysyl side chains might well be components of the binding site for the negatively charged AGA. (b) EDC should activate either or both of the carboxyl groups of AGA and facilitate amide bond formation with any available ε-amino group of lysyl residues near the AGA binding site. Specific recognition of EDC-activated glutamate derivatives at binding sites for the glutamyl moiety has been previously observed in studies with the folate-methotrexate transporter from L1210 cells (42) and from Leishmania donovani (43), dimethylglycine dehydrogenase (44), and 5,10-methenyltetrahydrofolate synthetase (45). Presumably the nonmodified portion of these EDC-glutamate adducts targets the adduct to the enzyme binding site, and lysyl groups near the binding site attack the EDC-activated carboxyl group to release the EDC moiety and allow a more complete fit of the glutamate derivative to its binding site; given the bulkiness and positive charge of the EDC moiety, it would not be expected to actually gain complete access to the binding site.
Incubation of CPSase I with EDC-activated [14C]AGA (Fig. 1) led to a time-dependent incorporation of [14C]AGA. No covalent label incorporation was observed in the absence of EDC. When EDC was added last to the reaction components, after free [14C]AGA had bound to CPSase I, 30-50% less incorporation was observed, presumably because of competing reactions of EDC with surface-exposed glutamyl, aspartyl, and tyrosyl groups of CPSase I; however, the covalently linked [14C]AGA showed a similar pattern of domain localization (see below) for the two types of labeling protocol, suggesting that the EDC-preactivated [14C]AGA was targeted to the same site(s) as free [14C]AGA. The standard 10 mM EDC treatment yielded no intermolecular protein-protein cross-linking detectable by SDS-PAGE, although substantial intermolecular cross-linking occurred with 100 mM EDC. Specificity of labeling was indicated by a lack of significant labeling in proteins not expected to interact with [14C]AGA. When treated under the standard conditions for labeling with EDC-preactivated [14C]AGA (15 min, with 3 mg/ml protein in each case) and subjected to centrifugal desalting (37) to remove free [14C]AGA, pyruvate kinase retained 0.52% of the cpm associated with CPSase I, lactate dehydrogenase retained 1.09%, and apomyoglobin retained 0.51%. Further indication of specificity of labeling was derived from the finding that inclusion of equimolar glutamate (0.1 mM each of glutamate and [14C]AGA) in the labeling mixture had no significant effect on the amount of [14C]AGA covalently incorporated into CPSase I (94% of the amount incorporated in the absence of glutamate).
If the EDC-activated carboxyl group(s) of AGA were simply reacting with the most available lysyl groups of CPSase I, then EDC-activated glutamate should also be reactive and compete with the activated [14C]AGA; however, the observed failure of EDC-activated glutamate to compete with EDC-activated [14C]AGA would be consistent with labeling of the AGA binding site on CPSase I, which is known not to accept glutamate (34).
We had hoped that covalent attachment of AGA would yield a permanently activated CPSase I, indicating that the covalent attachment was to the correct site and induced the correct CPSase conformational change. Instead, treatment of CPSase I with EDC-activated AGA was accompanied by a loss of enzymatic activity (Fig. 2). However, the rate of inactivation was essentially unaffected by omission of AGA and increased by omission of ATP, strongly suggesting that inactivation resulted at least partly from modification by excess free EDC of residue(s) critical for ATP binding. Similar carbodiimide inactivation, against which ATP protects, has been previously observed for cAMP-dependent protein kinase (46,47). It should also be noted that there were no activity determinations reported for either the reversibly bound or covalently bound analog when photoactivated chloroacetylglutamate was utilized for AGA site labeling (31), precluding any comparison with the present activity findings.
Effect of Covalently Bound AGA on CPSase I Conformation-The loss of enzymatic activity apparently associated with coincidental modification of the ATP site precluded using enzymatic activity as a probe of whether covalent AGA incorporation yielded an allosterically active CPSase I conformation. However, limited proteolysis analysis of AGA-labeled CPSase I indicated that this enzyme conformation was quite similar to that shown previously (38,48,49) to be induced by binding of free AGA (Fig. 3). Previous studies (38,48,49) have established that limited proteolytic digestion of CPSase I with either trypsin or elastase yields defined polypeptide fragments that are thought to represent independently folded domains (Fig. 3). The presence of AGA increases the overall rate of limited proteolysis and also increases the ratio of 140:120 kDa species observed in elastase digestion. The presence of ATP, or of ATP plus AGA, decreases the proteolysis rate and also decreases the ratio of 140:120 kDa species. In the present studies (Fig. 3), covalent labeling of CPSase I with AGA resulted in the same effects on limited proteolysis that were observed in the presence of an excess of free AGA, increased loss of the intact 160-kDa species and increased ratio of 140:120 kDa species, as observed qualitatively in Fig. 3 and confirmed by densitometry (data not shown). Also, the addition of ATP to AGA-labeled CPSase I led to both a decrease in rate of proteolysis and a decrease in 140:120-kDa species, indicating that labeled enzyme retained its ability to bind ATP and thus that the label had not been incorrectly incorporated at a binding site for ATP.
Identification of [14C]AGA-labeled Peptides-In order to localize the site(s) of modification with EDC-activated AGA, [14C]AGA-labeled CPSase I was subjected to exhaustive trypsin proteolysis and reverse-phase HPLC analysis of the resulting peptides (Fig. 4). Five [14C]AGA-labeled peptide peaks were identified and subjected to N-terminal sequence analysis (Table II). When less limited labeling conditions were utilized (0.5 mM AGA, 30 min), the same five peaks were obtained with a very similar distribution of label among peaks; however, additional smaller peaks were also obtained, presumably due to nonspecific labeling since their number and elution positions were variable in multiple labeling experiments (data not shown). The initial unnumbered peak of the 14C trace (Fig. 4) did not represent [14C]AGA-labeled peptide(s) since it was not present when the labeled CPSase I was more completely separated from the labeling mix components by SDS-PAGE and electroelution from the gel before exhaustive proteolysis and HPLC analysis. Peaks 1 and 2 appeared to be derived from AGA linkage at the N terminus of the protein. They were refractory to analysis by Edman degradation but contained the amino acid residues expected for the tryptic peptides derived from the alternative N termini of CPSase I (peptide 39-42, LSVK, and peptide 40-42, SVK). CPSase I is synthesized with a leader sequence that appears to be cleaved at two adjacent sites as the CPSase I precursor is transported into the mitochondrion, resulting in two alternative N termini for the mature form of rat liver CPSase I (11, 38). Before treatment with EDC-activated AGA, automated Edman degradation of intact CPSase I resulted in approximately equal yields of two N-terminal sequences, with one sequence starting at Leu-39 and the other at Ser-40.
After treatment with EDC-activated AGA (0.5 mM AGA, 30-min incubation), much less of the intact CPSase I underwent Edman degradation, strongly suggesting that both N termini were blocked. Further evidence for this identification of peaks 1 and 2 is that they were localized entirely to domain A (Fig. 5; complete discussion of Fig. 5 below). Further studies will be necessary to determine whether AGA labeling of the N terminus might reflect a specific structural and/or functional role in the regulatory interaction of AGA with CPSase I or simply, as we are presently assuming, the exposure on the enzyme surface of the chemically reactive N-terminal amino group.
Four peptides were found to be present in [14C]AGA-labeled HPLC peaks 3-5 (Table II): peak 3 contained two peptides, 120-127 and 234-237; peak 4 contained only peptide 1351-1356; and peak 5 contained two peptides, 1351-1356 and 625-630. The presence of two peptides each in peaks 3 and 5 was ascertained by observing two amino acids in each sequencing cycle and assigning the two sequences to unique sites within the known complete amino acid sequence (11). Several lines of evidence strongly suggested that all four of these peptides [...]
[Figure legend fragment; beginning missing] ... (11). Limited proteolysis with elastase or trypsin under nondenaturing conditions yields discrete polypeptides that appear to result from cleavage at independently folded domains (38, 48, 49). The indicated fragment sizes were determined by SDS-PAGE. The structural domains of the enzyme, derived from the proteolytic fragment data, are indicated by domains A-D, with domain A comprised of residues 39-417, domain B of residues 418-787, domain C of residues 788-1328, and domain D of residues 1329-1500. The N-terminal boundary for domain A is residue 39 or 40 because CPSase I is synthesized with a leader sequence (residues 1-38/39) that is cleaved at one of two alternative adjacent sites as the 165-kDa CPSase I precursor is transported into the mitochondrion (11). Since domain D is cleaved to small peptides even under the limited proteolysis conditions, the interface between domains C and D is not precisely defined. The C/D interface appears to include residue 1328, where V8 protease cleaves, although neither elastase nor trypsin cleaves this interface before residue 1356 (48).
[Figure legend fragment; beginning missing] ... 4-6). In the control reaction (lanes 1-3), all treatment was identical except that the [14C]AGA and EDC were omitted. The reactions were terminated by desalting (37) into 100 mM Hepes, pH 7.5, and limited proteolysis with elastase was carried out with no addition (lanes 1, 4), 10 mM MgATP (lanes 2, 5), or 5 mM AGA (lanes 3, 6).
[...]tively with AGA since exclusive labeling of this peptide is also obtained (peak 4), whereas exclusive labeling of peptide 625-630 is not observed. However, an alternative model would be that peptide 1351-1356 can interact with either the α-carboxyl or the γ-carboxyl of AGA while peptide 625-630 interacts with a single carboxyl of AGA. For this alternative model to be correct, the resulting covalently linked products of peptide 1351-1356 must elute in different peaks, and AGA-labeled peptide 625-630 must coincidentally be labeled to the same extent as one of the two labeled forms of peptide 1351-1356 and must elute at the same position as this peptide. Although peptides 120-127 and 234-237 co-eluted in all labeling and separation protocols applied to intact CPSase I tryptic peptides, [14C]AGA labeling at these two peptides appeared to occur independently since the relative amounts observed in various fractions ranged from the approximately 5:1 ratio of Table II to a nearly 1:1 ratio. The total amount of radioactivity observed in peak 3 decreased as the mass recovery of peptide 120-127 decreased, as would be expected if peptide 120-127 were a site of [14C]AGA incorporation. Peptide 234-237 was also identified as a site of [14C]AGA incorporation when the tryptic peptides from only domains A plus B were separated (87-kDa fragment, Fig. 5; complete discussion of Fig. 5 below). In this separation, peptide 234-237 could be detected in the absence of peptide 120-127 in a fraction at the leading edge of [14C]AGA-labeled peak 3.
Table II, footnote a: Trace amounts of peptide 1445-1452 were also present in peak 5 from the HPLC run shown in this table; however, this peptide was eliminated as a site of AGA labeling since it has no lysyl residue and since it was absent from peak 5 in replicate labeling and separation experiments.
Table II, footnote b: Indicates an amino acid residue derived from two peptides.
[Figure legend fragment (apparently Fig. 5); beginning missing] ... (1:59) or trypsin (1:220), reduced, and alkylated. The resulting fragments were separated by SDS-PAGE, recovered by electroelution, and subjected to exhaustive tryptic proteolysis. The resulting labeled peptides were separated by reverse-phase HPLC; for the 37-kDa fragment, the 0-35% B gradient was 120 min rather than the usual 140 min. For clarity, only the radioactivity trace of each separation is presented. * indicates that a portion of domain D remains attached to domain C after limited proteolysis with elastase or trypsin, as shown in Fig. 3. Elution positions for the peaks varied somewhat depending on the total peptide composition of the sample and were as follows: peak 1 eluted at 5.5-6% B with intact CPSase I, 5.25-5.5% B for the 37-kDa fragment, and 5% B for the 87-kDa fragment; peak 2 eluted at 6.25% B with intact CPSase I and at 6% B with the 37- or 87-kDa fragments; peak 3 eluted at 14.75% B for intact CPSase I or the 37-kDa fragment and at 14% B for the 87-kDa fragment; peak 4 eluted at 15% B for intact CPSase I, at 16.5% B for the 62-kDa fragment, and at 17% B for the 108-kDa fragment; and peak 5 eluted at 15.25% B for intact CPSase I and at 17% B for the 108-kDa fragment.
As would be expected for [14C]AGA-modified residues, the sequencer yields of lysyl residues 127, 237, 630, and 1356 were all much lower than those of the other residues within the respective peptides (Table II). Presumably, the finding of small amounts of unmodified lysine in the labeled peptides reflected the partial breakdown of the [14C]AGA-linked lysine under sequencing conditions, as has been suggested for other protein-ligand systems involving ε-lysyl-γ-glutamyl cross-links (50, 51). The finding of small amounts of radioactivity in the early sequencing cycles was also consistent with a partial breakdown of the [14C]AGA-linked peptides. Linkage of [14C]AGA to lysines 127, 237, 630, and 1356 would be expected to block the action of trypsin. Assuming that each of the labeled peptides extended beyond the [14C]AGA-modified lysyl residue to the next lysyl or arginyl residue in the sequence, the actual peptides separated by reverse-phase HPLC would be 120-146, 234-238, 625-638, and 1351-1360. However, the [14C]AGA-modified lysyl residues also appeared to block the Edman degradation reaction. Peptide sequences could not be reliably obtained beyond the modified lysyl residues, and the bulk of radioactivity was recovered on the sequencer filter. Such refractoriness to Edman cleavage might be due to cyclization when the N terminus of the [14C]AGA-linked lysine becomes exposed under the acidic conditions required for sequencing. N-terminal glutamines are known to cyclize to pyrrolidonecarboxylyl (pyroglutamyl) residues under acidic conditions, but generally this cyclization reaction is not rapid enough to totally block Edman sequencing (52). If analogous cyclization is the cause of the presently observed total block to Edman sequencing, the rate of cyclization must be greater for [14C]AGA-linked lysine than for glutamine. Amino acid composition analysis was consistent with the presumed peptide extensions but could not allow definitive sequence identification.
Automated Edman degradation also revealed low levels of the extension amino acids, as would be expected for the small fraction of peptide where the [14C]AGA linkage was broken to yield unmodified lysine; however, these recoveries were too low to allow definitive sequence identification. Therefore, the labeled peptides are indicated throughout as terminating in the last residue identified by sequencing, the modified lysyl residue, since the extensions cannot be clearly defined. It should also be noted that peptide 1351-1360 is not expected to result from trypsin treatment since amino acid 1350 is a methionine; presumably, this cleavage results from traces of chymotrypsin present in the trypsin or from autolysis of trypsin during the extended incubations necessary for exhaustive proteolysis of the [14C]AGA-labeled CPSase I. However, no other specificity anomalies have been observed (22, 38) in CPSase peptides generated with the trypsin (treated with tosylamide-2-phenylethyl chloromethyl ketone) utilized in the present study.
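The reasoning about peptide extensions (a modified lysine blocks trypsin, so the labeled peptide runs on to the next lysyl or arginyl residue) can be illustrated with a small in-silico digest. This is a sketch under stated assumptions: the sequence below is a hypothetical toy sequence, not the CPSase I sequence, and the cleavage rule is simplified to "cleave after every unblocked K or R" (the reduced cleavage before proline is ignored).

```python
def tryptic_peptides(seq, start=1, blocked=()):
    """Return (first, last, peptide) tuples for cleavage C-terminal to K or R.

    Residues are numbered from `start`; any position listed in `blocked`
    (for example a lysine carrying a covalent adduct) is not cleaved.
    """
    peptides = []
    begin = 0
    for i, aa in enumerate(seq):
        pos = start + i
        if aa in "KR" and pos not in blocked:
            peptides.append((start + begin, pos, seq[begin:i + 1]))
            begin = i + 1
    if begin < len(seq):  # C-terminal remainder without a trailing K/R
        peptides.append((start + begin, start + len(seq) - 1, seq[begin:]))
    return peptides

toy = "GAEDKTLNVKSFGR"  # hypothetical sequence for illustration only

unmodified = tryptic_peptides(toy)
modified = tryptic_peptides(toy, blocked={5})  # adduct on Lys-5 of the toy sequence

print(unmodified)
print(modified)
```

With Lys-5 blocked, the first two peptides of the unmodified digest merge into a single longer peptide, which is the same logic by which labeled peptide 625-630 would be recovered as 625-638.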
Analysis-The relationship between all four labeled peptides and the functional AGA binding site cannot be clearly defined from the present data. Occurrence of peptide 1351-1356 near the AGA binding site is consistent with the previous finding that the AGA analog N-chloroacetylglutamate labeled within 20 kDa of the C terminus of CPSase I (31). It is also noteworthy that peptide 1351-1356 incorporated more [14C]AGA label than the other three peptides. The failure to detect any other labeled regions in the previous study might have resulted from the different preferential reactivity of the photoactivated N-chloroacetylglutamate (most reactive with aromatic groups (32, 33)) or from the different localization of reactive groups (acetyl group versus carboxyl groups). Since recent studies have shown that AGA is bound with high affinity to CPSase I only in the presence of ATP and that the topography of the single AGA binding site differs significantly for the high and low affinity forms (34), the different labeling patterns might also result from occupancy of the high affinity AGA site in the present study (where ATP was included) and occupancy of the low affinity AGA site in the previous study (Ref. 31, where ATP was not included). Although the present data place peptides 1351-1356 and 625-630 at a single site of AGA interaction, with some heterogeneity of interaction since only peptide 1351-1356 is covalently attached to AGA on some enzyme molecules whereas both peptides are covalently attached to a single AGA on other molecules, further studies will be required to determine whether labeling of peptides 120-127 and 234-237 also results from flexible occupancy of this single AGA site. Although the labeled peptides might play critical functional roles, it is also quite feasible that they are simply near the AGA binding site; these determinations will also require further studies.
Model for CPSase Interaction with AGA and ATP-It is possible to construct a plausible model (Fig. 6) for CPSase I in which all four labeled peptides (120-127 and 234-237 from domain A, 625-630 from domain B, and 1351-1356 from domain D near the domain C interface) are near a single binding site for AGA. In this model there is extensive interaction among domains A-D to form proximal binding sites for ATP B, AGA, and ATP C. Domains B and C each contain the consensus primary sequence motifs identified for ATP-utilizing enzymes (11,21) and have been shown to contain the adenine subsites of the two ATP binding sites (21-23). Thus, domain B can be considered as the core locus for the ATP B site and domain C as the core locus for the ATP C site. In addition, domain D appears to form a flexible loop to complement the portions of the ATP sites contained in globular domains B and C; it is probable that amino acid residues from domain D participate in catalytic processing of the ATP molecules and/or cause specific changes in the conformation of substrate binding residues of domains B and C, although they might also participate directly in ATP binding. This participation of domain D in the two ATP sites is based on previous studies with the ATP analog 5′-p-fluorosulfonylbenzoyladenosine (22), which established that peptides 631-638 (domain B) and 1327-1348 (domain D near the domain C interface) are near the γ-phosphate portion of the ATP B site, and that peptides 1310-1317 (domain C near the domain D interface) and 1445-1454 (domain D) are near the γ-phosphate subsite of the ATP C site. It has also been established previously that domain A is very near the adenine subsite of at least one of the ATP sites (21,23); studies with the E. coli CPSase II (54) have suggested that this interaction is with both of the ATP sites.
Given this modeling of domains and of ATP sites, together with the localization of the [14C]AGA-labeled peptides, one feasible manner in which AGA could function as an essential allosteric activator is to bind to its cognate site on domain D (with components from domains A and B either very near or possibly contributing to the actual binding site for AGA) and to thereby pull this flexible domain into the correct conformation for binding ATP B; it is possible that conformational changes occur within domains A and B upon binding of AGA that also facilitate the subsequent binding of ATP B. It should be noted that Rubio and Cervera (55) have suggested a very similar model for rat liver CPSase I, where domain A interacts with both domains B and C, domain D interacts with both domains B and C, and interaction with allosteric ligands is proposed to change the structural relationship of domain D with other domains; however, in that model there is no interaction between domains A and D, and domain D is modeled as removed from the active site regions of domains B and C.
Evolutionary Model for Derivation of the AGA Allosteric Effector Site from an Ancestral Glutamine Substrate Site-The primary involvement of domain D in binding AGA is consistent with its involvement in binding other allosteric activators in other CPSases (26-29). However, there was no obvious rationale for the localization of [14C]AGA-labeled peptides very near the ATP B binding site, rather than at a site removed from the active site as would be expected for a traditional allosteric activator, and within the N-terminal region of CPSase I, which had not previously been implicated in any ligand interactions. In seeking a rationale for these localizations, we expanded upon previously proposed evolutionary schemes. The duplication of an ancestral ATP-utilizing gene to form domains B and C, the core binding units for ATP B and ATP C, has been previously proposed (9-11). The evolution of the CPSase GAT domain, as well as the GAT domains for the other six G-type amidotransferases, from an ancestral glutamine-utilizing gene has also been previously proposed (12). The unique feature that we are now suggesting is that the ancestral glutamine-utilizing gene underwent duplication to yield subdomains A-1 and A-2 (together forming domain A) of an ancestral CPSase and that these subdomains diversified differentially to yield the present-day CPSases. The other six members of the type G amidotransferase family have a GAT domain that corresponds in size and sequence alignment to subdomain A-2 (12). The evolutionary origin of subdomain A-1 of CPSase has been previously unassigned, and no ligand binding involvement has been previously demonstrated for this region.
Table III. [...] A2 of various CPSases. Alignment of each pair (via IALIGN) yielded a similarity score which was compared with the mean of similarity scores (30 each for A1/A1 and A2/A2 and 200 each for A1/A2) obtained after randomizing those two sequences; the alignment scores below are in units of standard deviations above or below the mean of the randomized comparisons. CPA1, S. cerevisiae arginine-specific CPSase II (60); CPS III, Squalus acanthias liver urea-specific CPSase III (8); CAD, Syrian hamster pyrimidine-specific CPSase II (7,56); frog CPS I, Rana catesbeiana liver urea-specific CPSase I (61); rat CPS I, rat liver urea-specific CPSase I (11); URA2, S. cerevisiae pyrimidine-specific CPSase II (62); ECOLI, E. coli CPSase II (63).

We propose the following diversification scheme for subdomains A-1 and A-2. (a) An ancestral glutamine-utilizing gene was duplicated to yield two adjacent GAT regions in the domain A analog of an ancestral CPSase. (b) Subdomain A-1 underwent diversification to yield CPSase II with only one fully functional GAT region as subdomain A-2. Studies with E. coli (54) and Syrian hamster (57) CPSase II have established this subdomain A-1 as an interaction subdomain that is required for complex formation of domain A with domains B-C-D. The studies on mammalian CPSase II (57) further established that subdomain A-1 attenuates the intrinsically high activity of the GAT subdomain (A-2) in the isolated domain A (containing both A-1 and A-2) but allows increased glutaminase activity to occur when domain A is fused to the synthetase domains B-D and the ATP B active site of the synthetase is occupied. These findings strongly suggest that subdomain A-1 mediates the functional linkage observed between the glutaminase activity of subdomain A-2 and the synthetase activity at the ATP B site within domain B.
(c) Alternative diversification of subdomain A-1 occurred to yield CPSase III where subdomain A-1 is involved in binding AGA while subdomain A-2 retains GAT activity and presumably also retains the interaction role for subdomain A-1. (d) Diversification of both subdomains A-1 and A-2 occurred to yield CPSase I with subdomain A-1 involved in binding AGA and subdomain A-2 no longer retaining GAT activity. The utilization of AGA as a required allosteric activator for CPSase I, with linkage between occupancy of the AGA site and the ATP B site (22,30), is analogous to the strong functional linkage between occupancy of the glutamine site on the GAT moiety and occupancy of the ATP B site on the synthetase moiety in the CPSase II family (6,25). However, in the CPSase I case, the activator AGA interacts directly with the interaction subdomain A-1 and can directly mediate a functional response at the ATP B site, whereas in the CPSase II case, glutamine binds to subdomain A-2, and this occupancy must be communicated to the ATP B site via the interaction subdomain A-1. Possibly, this direct regulation at subdomain A-1 by AGA facilitates the much tighter binding of ammonia by CPSase I (Km of 38 μM) (58) than by E. coli CPSase (Km of 5 mM) (13) to allow a large flux of ammonia at physiological concentrations through the urea cycle but effectively eliminate a requirement for high levels of ammonia that would be toxic to the brain and to other organs. Although this discussion has thus far focused on potential involvement of subdomain A-1 in binding AGA, it appears, both from previous studies (31) and our own, that interactions with domain D were also involved in the evolution of the AGA binding site, yielding an AGA site for the present-day CPSase I that includes components from both domain D and subdomain A-1 and that is very near the ATP B site components from domains B and D.
Since gene duplication and diversification is the major route by which genomes increase in size and complexity (e.g. Ref. 59), this would be a reasonable evolutionary origin for subdomain A-1. To determine whether subdomains A-1 and A-2 of present-day CPSases retain traces of sequence identity that would be consistent with their proposed evolution via an ancient duplication of an ancestral glutamine-utilizing gene and subsequent diversification, we obtained alignment scores for subdomains A-1 and A-2 from various CPSases (Table III). For these alignments, a previous definition of the GAT domain (7) was utilized as the definition of subdomain A-2, and the remainder of domain A was designated as subdomain A-1. Each of the pairwise alignments optimized amino acid identity and similarity (64). To determine the statistical significance of each alignment, the resulting similarity score (including gap penalties) was compared with the mean of 30-200 scores obtained by randomizing the sequences. The alignment scores given in Table III are in units of standard deviations above or below the mean of the randomized comparisons. An alignment score of 3.0, which has a probability of less than 1 in 1000 of arising by chance, may be considered as the minimum score for establishing common ancestry in binary comparisons (64).
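The scoring procedure described above (an observed similarity score expressed in standard deviations above or below the mean of scores from randomized sequences) can be illustrated with a minimal Python sketch. This is not the IALIGN program used in the study. As stated assumptions, the sketch substitutes a simple Needleman-Wunsch global alignment with identity-only scoring and a linear gap penalty, and the function names, scoring parameters, and default of 30 randomizations are hypothetical choices made for illustration.

```python
import random
import statistics


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (assumed stand-in for IALIGN).

    Uses identity scoring and a linear gap penalty; only the score is kept.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffled(seq, rng):
    """Return a random permutation of seq, preserving residue composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def alignment_z_score(a, b, n_random=30, seed=0):
    """Observed score in standard deviations from the randomized mean."""
    rng = random.Random(seed)
    observed = global_alignment_score(a, b)
    # Background distribution: both sequences are shuffled for each comparison.
    background = [
        global_alignment_score(shuffled(a, rng), shuffled(b, rng))
        for _ in range(n_random)
    ]
    sd = statistics.stdev(background)
    if sd == 0:
        raise ValueError("randomized scores have zero variance")
    return (observed - statistics.mean(background)) / sd
```

Under these assumptions, a call such as `alignment_z_score(seq_a1, seq_a2, n_random=200)` returns a value on the same scale as the alignment scores discussed in the text, where 3.0 is taken as the minimum for inferring common ancestry. The numerical values would differ from those of Table III because the scoring scheme here is simplified.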
Comparison of the alignments between various subdomains A-1 and between various subdomains A-2 (Table III) provides support for differential diversification of these two subdomains and for their proposed functional roles. (a) There are very strong similarity scores for the alignments of rat liver CPSase I subdomain A-1 with both frog liver CPSase I and CPSase III (the only three CPSases that utilize AGA), whereas only the strong relationship between rat liver CPSase I and frog liver CPSase I (the only two CPSases that cannot function as glutamine amidotransferases) is observed in the alignment of subdomains A-2. (b) The relationship between subdomains A-1 from E. coli CPSase II and from the other CPSases is quite weak, whereas the relationships among the corresponding subdomains A-2 are not as weak. Although most of the alignment scores for A-1/A-2 pairs (Table III) did not differ significantly from the alignment scores that resulted from randomized arrangements of the component amino acids, the alignments between S. cerevisiae CPA1 subdomain A-2 and subdomains A-1 of shark liver CPSase III and of frog liver CPSase I provided some evidence of trace identity remaining between these subdomains. This suggestion that the CPA1 subdomain A-2 might have diversified less than other subdomains A-2 is also consistent with recent sequence analysis of CPSases (65), based on the entire sequence, which suggested that the yeast CPA1/CPA2 may best resemble the structure of the ancestral CPSase.
Since similarities in three-dimensional structure are often maintained longer after evolutionary divergence than are similarities in primary sequence, knowledge of the folded structures of subdomains A-1 and A-2 might provide evidence for a common ancestry. There has recently been a preliminary report (66) of the solved crystal structure for the glutamine amidotransferase domain of GMP synthetase (another type G amidotransferase). When a detailed structure becomes available for GMP synthetase, it will be of great interest to carry out homology modeling of the CPSase subdomains A-1 and A-2. However, given the low level of sequence identity remaining between subdomain A-1 and the amidotransferases and the uncertainty of that alignment, homology modeling might well prove unsuccessful. Thus, determination of the subdomain A-1 structure might not be possible until a solved x-ray structure for one of the CPSases is available.