The COOH Terminus of Arylamine N-Acetyltransferase from Salmonella typhimurium Controls Enzymic Activity*

Arylamine N-acetyltransferases (NATs) are a homologous family of enzymes, which acetylate arylamines, arylhydroxylamines, and arylhydrazines by acetyl transfer from acetyl-coenzyme A (Ac-CoA) and are found in many organisms. NAT was first identified as the enzyme responsible for the inactivation of the anti-tubercular drug isoniazid in humans. The three-dimensional structure of NAT from Salmonella typhimurium has been resolved and shown to have three distinct domains and an active site catalytic triad composed of “Cys69-His107-Asp122,” which is typical of hydrolytic enzymes such as the cysteine proteases. The crystal unit cell consists of a dimer of tetramers, with the C terminus of individual monomers juxtaposed. To investigate the function of the first two domains of full-length NAT from S. typhimurium and to investigate the role of the C terminus of NAT, truncation mutants were made with either the C-terminal undecapeptide or the entire third domain (85 amino acids) missing. Unlike the full-length NAT protein (281 amino acids), the truncation mutants of NAT from S. typhimurium are toxic when overexpressed intracellularly in Escherichia coli. Full-length NAT hydrolyzes Ac-CoA but only in the presence of an arylamine substrate. Both truncation mutants, however, hydrolyze Ac-CoA even in the absence of arylamine substrate, illustrating that the C-terminal undecapeptide controls hydrolysis of Ac-CoA by NAT from S. typhimurium.

Arylamine N-acetyltransferases (NATs, EC 2.3.1.5) vary in size from 30 to 34 kDa and form a distinct protein family (1). These enzymes are found in a range of species from prokaryotes to humans and catalyze the acetylation of many arylamine and arylhydrazine compounds by transferring an acetyl group from acetyl-coenzyme A (Ac-CoA) to the terminal amino group of the substrate. This activity plays an important role in the metabolism of drugs and xenobiotics (2-4). It has also been proposed that a possible endogenous role for NATs in humans is the acetylation of a folate catabolite, p-aminobenzoyl glutamate (5,6). As yet no endogenous substrate is known for NATs in prokaryotes, although NAT activity has been identified in prokaryotes, including several Mycobacteria (7-11). While the members of the NAT family may have a common function, it is also possible that individual NATs carry out different roles in the various organisms in which they have been found. The existence of the NAT homologue rifamycin amide synthase (12,13), which is involved in the cyclization of an arylamine precursor to form the antibiotic rifamycin in Amycolatopsis mediterranei, also indicates that NATs may have differing roles in different organisms.
NAT from Salmonella typhimurium was first identified from strains of the bacterium used in the Ames test for carcinogens (14), and it was subsequently shown to be involved in the activation of hydroxyarylamine carcinogens by O-acetylation (2,15,16). Previously, kinetic studies on NAT from pigeon livers showed the necessity of a sulfhydryl group for activity (17-19). These studies also led to the first suggestion of a "ping-pong" mechanism involving a thio-acetyl intermediate.
Recently, the structure of NAT from S. typhimurium (ST-281), which consists of 281 amino acid residues, was determined (20). The structure revealed a unique fold, which has three distinct domains (Fig. 1). The first two domains (Fig. 1a) showed similarity to the structure of proteins in the "papain" family of cysteine proteases such as staphopain (21). The structure of NAT illustrated the presence of a "Cys69-His107-Asp122" catalytic triad, also found in the cysteine proteases (22). The three amino acid residues of the catalytic triad are found in the first two domains of NAT from S. typhimurium (Fig. 1a) and are conserved in all members of the NAT family (20). This structural similarity of NAT to the cysteine proteases suggested that NATs and the cysteine proteases might have evolved from a common ancestor. The structure of the unit cell of crystalline NAT from S. typhimurium is a pair of tetramers with each tetramer consisting of a pair of dimers. The C termini (~residues 270 and beyond) of the monomers in the individual dimers (Fig. 1a) are not visible in the solved structure of NAT but must be in very close proximity from the juxtaposition of the monomers in the unit cell (Fig. 1b) (23).
We have investigated the function of the first two domains of full-length NAT from S. typhimurium, which corresponds to 196 amino acid residues, and the role of the C terminus of NAT, by making truncation mutants with either the C-terminal undecapeptide (AAFDTHPEAGK) or the entire third domain missing (Fig. 1). Here we report that these truncation mutants of NAT from S. typhimurium are toxic when overexpressed intracellularly in Escherichia coli. We also demonstrate that only the truncation mutants hydrolyze Ac-CoA in the absence of substrate and illustrate that the C-terminal undecapeptide is essential in modulating hydrolysis of Ac-CoA by NAT from S. typhimurium.

EXPERIMENTAL PROCEDURES
Cloning-Fragments of the gene encoding full-length S. typhimurium NAT were cloned following amplification of the regions from the open reading frame of NAT in genomic DNA from S. typhimurium (Strain L2, from Jay Hinton, IMM, Oxford). The forward primer 5'-AGTCACTCATATGACCTCTTTTTTACAT-3' and reverse primers 5'-ATTTTGTAAGCGGCCGCTCACATTACCGCGGCCAGCTC-3' and 5'-ATTTTGTAAGCGGCCGCTCATTGCGGCCAGTGAGCCGA-3' were used to amplify 589- and 811-bp fragments, respectively, which encode the first 196 amino acids (st-196) and 270 amino acids (st-270) of full-length S. typhimurium NAT (st-281). The 589- and 811-bp fragments were ligated into the vector pGEMT (Promega) and transformed into E. coli strain JM109 (Promega). The inserts were then subcloned into the expression vector pET28b(+) (Novagen Inc., Madison, WI), immediately after the region encoding the hexa-histidine tag, using the NdeI (CATATG) and NotI (GCGGCCGC) restriction sites carried by the primers at the 5' and 3' ends, respectively. The plasmids, now with open reading frames corresponding to nst-196 and nst-270, were then transformed into E. coli strain BL21(DE3)pLysS (Promega). The nst-196 and nst-270 fragments and the open reading frame of S. typhimurium NAT, inclusive of an N-terminal hexa-histidine tag (nst-281), were further subcloned from pET28b(+) into the vector pBAD/gIII (24) (Invitrogen Inc., Groningen, Holland) using the NcoI and XhoI sites, followed by transformation of the pBAD/gIII vectors into E. coli strain TOP10 (Invitrogen) for propagation and expression. The sequences of the open reading frames of the final constructs were confirmed.
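As a quick sanity check on the primer design, the following minimal sketch locates the NdeI and NotI recognition sequences within the three primers. The recognition sequences are the standard ones for these enzymes; the dictionary keys and the function name are illustrative assumptions and are not part of the original methods.

```python
# Illustrative check only (not part of the published protocol):
# locate restriction enzyme recognition sites in the cloning primers.

# Standard recognition sequences for the two enzymes.
RESTRICTION_SITES = {"NdeI": "CATATG", "NotI": "GCGGCCGC"}

# Primer sequences as given in the text, written 5' to 3'.
# The labels are hypothetical names chosen for this sketch.
PRIMERS = {
    "forward": "AGTCACTCATATGACCTCTTTTTTACAT",
    "reverse_st196": "ATTTTGTAAGCGGCCGCTCACATTACCGCGGCCAGCTC",
    "reverse_st270": "ATTTTGTAAGCGGCCGCTCATTGCGGCCAGTGAGCCGA",
}


def find_sites(sequence):
    """Return {enzyme: 0-based index of first match} for sites present in sequence."""
    hits = {}
    for enzyme, site in RESTRICTION_SITES.items():
        index = sequence.upper().find(site)
        if index != -1:
            hits[enzyme] = index
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions the forward primer carries only the NdeI site and both reverse primers carry only the NotI site, which is consistent with directional cloning of the inserts.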
Protein concentrations of soluble fractions were determined by measuring the absorbance at 280 nm or by using the Bradford assay (28). NAT protein in insoluble fractions was quantitated following Western blotting by comparison with known amounts of soluble pure NST-281.
Protein samples were denatured, reduced, and alkylated for 12% SDS-PAGE (29). Gels were stained with Coomassie Blue (30) or probed for NAT by Western blotting using a rabbit polyclonal antiserum against pure ST-281 (11).
Enzymic Activities-Enzymic activities were determined (17,18,27) using soluble fractions of freshly prepared bacterial cell lysates, unless otherwise stated. All Michaelis constants, Km, and maximum rates, Vmax, were calculated using nonlinear optimization (max iterations = 200) (31-33) from activities measured in duplicate at a minimum of five substrate concentrations.
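The kind of nonlinear optimization described above can be sketched as follows. This is a minimal illustration only: it assumes an unweighted least-squares fit of the Michaelis-Menten equation by a Gauss-Newton iteration with step halving, capped at 200 iterations. The actual algorithm and software of references 31-33 are not specified here, and all function names and the synthetic data are hypothetical.

```python
# Sketch of a nonlinear least-squares fit of v = Vmax * S / (Km + S).
# Assumption: Gauss-Newton with step halving; not necessarily the cited method.


def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_michaelis_menten(conc, rates, max_iter=200, tol=1e-12):
    """Return (Vmax, Km) minimizing the sum of squared residuals."""

    def sse(vm, k):
        return sum((v - michaelis_menten(s, vm, k)) ** 2
                   for s, v in zip(conc, rates))

    # Crude starting values: highest observed rate and a mid-range concentration.
    vmax = max(rates)
    km = sorted(conc)[len(conc) // 2]
    current = sse(vmax, km)

    for _ in range(max_iter):
        # Accumulate the normal equations (J^T J) delta = J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(conc, rates):
            d_vmax = s / (km + s)                # partial derivative wrt Vmax
            d_km = -vmax * s / (km + s) ** 2     # partial derivative wrt Km
            resid = v - vmax * d_vmax
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * resid
            b2 += d_km * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (b1 * a22 - a12 * b2) / det
        step_km = (a11 * b2 - a12 * b1) / det

        # Halve the step until Km stays positive and the fit does not worsen.
        scale = 1.0
        while scale > 1e-8:
            new_vmax = vmax + scale * step_vmax
            new_km = km + scale * step_km
            if new_km > 0 and sse(new_vmax, new_km) <= current:
                break
            scale /= 2.0
        else:
            break  # no acceptable step found; keep the current estimates

        vmax, km = new_vmax, new_km
        new = sse(vmax, km)
        converged = current - new < tol
        current = new
        if converged:
            break
    return vmax, km


if __name__ == "__main__":
    # Synthetic, noise-free rates at six substrate concentrations (arbitrary units).
    substrate = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
    observed = [michaelis_menten(s, 10.0, 0.2) for s in substrate]
    print(fit_michaelis_menten(substrate, observed))
```

Fitting the untransformed rate equation directly avoids the error distortion introduced by linearized plots, which is the usual motivation for a nonlinear approach.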
Three different enzymic activities were determined. (A) Arylamine N-acetyl transfer activity was measured using acetyl-CoA and 4-aminoveratrole (4AV), p-anisidine (ANS), or isoniazid (INH). Reactions, containing up to 2 mM acetyl donor or acceptor, were carried out in a total volume of 200 µl (20 mM Tris-HCl, 1 mM EDTA, and 1 mM dithiothreitol, pH 7.5) at 37°C and stopped by addition of an equal volume of 20% (w/v) trichloroacetic acid (4°C). The loss of arylamine substrate was determined spectrophotometrically (27,34) or by high performance liquid chromatography (35). (B) The enzymatic hydrolysis of Ac-CoA to CoA was determined using dithiobis(2-nitrobenzoic acid). Reactions were carried out as in A except without dithiothreitol. The quantity of CoA generated was determined by measuring the absorbance, at 410 nm, of the CoA-dithiobis(2-nitrobenzoic acid) conjugate as previously described (17). (C) Hydrolysis of p-nitrophenyl acetate (PNPA) was determined by the rate of formation of p-nitrophenol at 410 nm. The enzyme solution (final volume 1 ml) contained between 50 pg and 1 µg of pure protein in 20 mM Tris-HCl and 1 mM EDTA at pH 7.5 with the addition of 2 mM INH as required. PNPA (final concentration up to 4 mM) was then added to the enzyme solution (pre-heated to 37°C) to start the reaction. The final concentration of acetonitrile (used to dissolve the PNPA stock) was less than 0.1% (v/v) and had no effect on enzymic activity as determined using methods A and B.
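Methods B and C both convert an absorbance reading at 410 nm into a product concentration. A minimal sketch of that conversion using the Beer-Lambert law is given below. The molar absorption coefficient is a required argument because the value used in this work is not stated here; the function name, the 1-cm default path length, and the numbers in the usage example are assumptions for illustration only.

```python
# Sketch: absorbance to molar concentration via the Beer-Lambert law, A = epsilon * c * l.
# The coefficient must be supplied by the user; no value is taken from the paper.


def concentration_from_absorbance(absorbance, epsilon_per_molar_cm, path_length_cm=1.0):
    """Return the molar concentration of the absorbing species."""
    if epsilon_per_molar_cm <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon_per_molar_cm * path_length_cm)


if __name__ == "__main__":
    # Hypothetical reading and hypothetical coefficient, for illustration only.
    molar = concentration_from_absorbance(0.1415, 14150.0)
    print(f"{molar * 1e6:.2f} micromolar")
```

Dividing the concentration change by the reaction time and the amount of protein then gives a specific activity that can be compared across the three recombinant proteins.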
Simulated "in Silico" Docking of Substrates-All non-polar hydrogen and terminal oxygen atoms were attached and Gasteiger charges were assigned to the three-dimensional protein structure of NAT from S. typhimurium (PDB accession number 1E2T) using the program SYBYL 6.5 (36). Three-dimensional structures of substrates, in MOL2 format, were created using SYBYL 6.5 and rotatable bonds and fixed rings were assigned using the program AUTOTORS (37).
The substrate structures were then annealed to the protein structure in silico using the AUTODOCK suite of programs (38). The lowest energy docking solutions were then viewed and analyzed using SPDBV (39).

RESULTS
Expression of nst-281, nst-270, and nst-196 in the Cytosol of E. coli Using pET28b(+)-NST-270 and NST-196, which correspond to NAT from S. typhimurium containing an N-terminal hexa-histidine tag with either the C-terminal 11 amino acids or entire third domain missing, respectively (Fig. 1), were produced as recombinant proteins in E. coli. NST-270 and NST-196 were found in inclusion bodies although the level of recombinant protein was very low (<100 µg/liter of culture). In contrast, NST-281, which corresponds to the full-length NAT from S. typhimurium containing an N-terminal hexa-histidine tag, is soluble and recovered in high yield (~2 mg/liter of culture). When higher concentrations of IPTG (up to 1 mM) were used, the level of expression of NST-270 and NST-196 was greater, but the amount of soluble protein produced was still too low to be detected by Western blotting. No improvement in solubility of NST-270 and NST-196 was obtained when cultures were grown slowly (27°C), a procedure which had previously proved successful in generating soluble recombinant proteins (10).
Toxicity of Proteins in the Cytosol of E. coli-To determine whether the low level of expression of the truncated proteins was due to toxicity, plasmid stability tests were carried out (25). Colonies of BL21(DE3)pLysS containing the pET28b(+) vector with inserts encoding NST-270 or NST-196 were grown on LB agar plates with or without kanamycin/IPTG combinations to determine the number of colonies retaining the pET28b(+) plasmid with the different inserts (26,40,41). The ratio of the number of colonies growing on plates containing IPTG to plates containing both IPTG and kanamycin was used to determine the percentage of colonies that had lost their plasmid. Of the colonies that contained plasmids with the inserts encoding NST-270 and NST-196, 36 and 43%, respectively, lost their plasmid in the presence of either 1 or 0.1 mM IPTG. For comparison, the same experiment with NST-281 showed that in the presence of 1 or 0.1 mM IPTG, only 10 or 4% of colonies, respectively, had lost the plasmid with the nst-281 insert.
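One way to turn the colony counts described above into a percentage of plasmid loss is sketched below. It assumes that equal dilutions of the same culture were plated on both types of plate and that only plasmid-bearing cells form colonies in the presence of kanamycin; the exact calculation used in reference 25 is not given here, and the function name and counts are hypothetical.

```python
# Sketch: percentage of cells that have lost the plasmid, from paired colony counts.
# Assumption: colonies on IPTG + kanamycin plates represent plasmid-bearing cells only.


def percent_plasmid_loss(colonies_iptg_only, colonies_iptg_kanamycin):
    """Return the percentage of colony-forming cells lacking the plasmid."""
    if colonies_iptg_only <= 0:
        raise ValueError("need at least one colony on the IPTG-only plate")
    if colonies_iptg_kanamycin < 0:
        raise ValueError("colony counts cannot be negative")
    lost = colonies_iptg_only - colonies_iptg_kanamycin
    return 100.0 * lost / colonies_iptg_only


if __name__ == "__main__":
    # Hypothetical counts chosen to give a loss similar to that reported for NST-270.
    print(percent_plasmid_loss(100, 64))
```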
Expression of nst-281, nst-270, and nst-196 Using pBAD/gIII-Expression in the pBAD/gIII system leads to transport of recombinant protein, prior to folding, from the cytoplasm to the periplasm by means of the gIII signal peptide, which is cleaved once the protein reaches the periplasm (24,42). However, for NST-270, ~50% of recombinant protein was found in the cytoplasm with the gIII signal still attached, as determined on the basis of molecular weight using Western blotting (Fig. 2b). Minor contamination from protein in the pellet could not be avoided when separating cytoplasmic and periplasmic fractions.
Expression of nst-270 and nst-196 using this system resulted in much higher levels of expression than using the pET system (Fig. 2a). The quantity of soluble truncated proteins recovered was, however, much less than for NST-281 (1 mg/liter culture) when expressed using the pBAD/gIII system under similar conditions (Fig. 2b). All three recombinant proteins could nevertheless be affinity purified from the soluble fraction of cell lysates. Typically the N-terminal hexa-histidine-tagged proteins were eluted with 50 mM imidazole, yielding pure protein. Unlike NST-281, the NST-196 and NST-270 proteins lost a large proportion of enzymic activity during purification and were observed to precipitate rapidly in high salt (300 mM), which is required for purification. The enzymic activity of NST-270 and NST-196 in lysates was unstable and could not be detected after 3 days at 4°C. Therefore lysates were used for activity assays within hours of preparation. These observations were in contrast to those for NST-281, which was stable, both when stored as lysate and during purification following expression using the pBAD/gIII system. During the purification of NST-270, the higher molecular weight protein corresponding to NST-270 with the uncleaved gIII signal peptide (~32,000) (Fig. 2b) eluted at a different concentration of imidazole (250 mM) from NST-270. This protein (NST-270 with the gIII signal sequence attached) was enzymatically inactive.
Enzymic Analysis of Recombinant NAT Proteins-All proteins were assessed for N-acetyltransferase, Ac-CoA hydrolysis, and PNPA hydrolysis activities as shown in Tables I and II. PNPA resembles an acetylated arylamine (Fig. 3).
Arylamine N-acetyltransferase activity was observed only for the NST-281 and NST-270 proteins. However, the Km values for both proteins, with Ac-CoA, were too low (less than 20 µM) to be determined accurately using the high performance liquid chromatography method employed (35). Enzymatic cleavage of Ac-CoA was observed with all three recombinant proteins in the presence of 2 mM INH. When INH was not present, only NST-270 and NST-196 hydrolyzed Ac-CoA as shown in Table II. The hydrolysis of the acetyl donor PNPA also only occurs with NST-281 in the presence of INH. However, there are distinct differences in the pattern of hydrolysis of both Ac-CoA and PNPA (Table II) by the full-length and truncated NATs.
FIG. 1. The three-dimensional crystal structure of NAT from S. typhimurium. a, cartoon topology of the structure of a monomer of NAT from S. typhimurium (ST-281), in three distinct domains. Standard nomenclature is used for the structural annotation and the amino acid sequence is given below the structure in single letter format. The protein sequence of the first two domains of ST-281 is colored black with the sequence of the third domain being colored green followed by the final C-terminal undecapeptide, which is colored magenta. Residues in the sequence with triangles above them indicate amino acid residues of the "catalytic triad." The PDB accession number of ST-281 is 1E2T and the structural topology was created using "PDBsum" (50). b, a three-dimensional representation of the structure of a dimer of NAT from S. typhimurium (ST-281). The unit cell consists of a dimer of tetramers, with each tetramer being composed of a pair of dimers. Each secondary structure element is colored successively from the N terminus to the C terminus for each individual monomer (blue through to green for monomer B and then green through to red for monomer A).
The Michaelis constant obtained for NST-281 expressed using the pBAD/gIII system was similar to that obtained for NST-281 using the pET system in previous studies (27), indicating that expression in the pBAD/gIII system has not adversely affected the kinetic values.
Simulated Docking of Substrates-The structures of 4AV, ANS, INH, PNPA, and the acetylated forms of INH (Ac-INH) and ANS (Ac-ANS) (Fig. 3a) were docked into the structure of NAT from S. typhimurium and the lowest energy binding sites were determined in each case. The sites described are an order of magnitude lower in energy than the next best solutions.
The docking of PNPA revealed a preferential binding pocket (site α) for the benzene ring. The aromatic ring of PNPA appeared to interact with Phe125, which is conserved in all prokaryotic NATs, by a "π-ring stacking" interaction with a docking energy of -21.1 kJ/mol (-5.05 kcal/mol). The binding site α was almost identical in position to that for 4-bromoacetanilide (an irreversible inhibitor covalently bound to the Cys69 residue) as determined from the crystal structure of NAT from S. typhimurium with the inhibitor bound (20) (Fig. 3b).
The second series of docking studies were with substrates 4AV, ANS, and INH. These compounds bind similarly to PNPA in site α. Additionally, these substrates, but not PNPA, were also bound at a second site (β) (Fig. 3b). The interaction of the substrates at the second site appeared to be with residues Ile36, Pro37, and Phe38. These residues have previously been suggested to contribute to Ac-CoA binding (11). The docking energy for 4AV, ANS, and INH was lower at site β than at site α (Fig. 3), indicating that 4AV, ANS, and INH bind at site β in preference to site α. It would be highly unlikely for acetyl transfer to occur at site β because of the 9.1-Å distance between the terminal nitrogen of a substrate bound at site β and the sulfhydryl of Cys69 (Fig. 3b). However, any substrate subsequently bound in the higher energy site α would have its terminal nitrogen (or oxygen for PNPA) close enough to the active site Cys69 (3.0 Å) to allow transfer of an acetyl group to and from the sulfhydryl group. No significant binding sites were identified for the acetylated substrates, Ac-INH and Ac-4AV.

DISCUSSION
When the truncated versions of NST-281 (NST-270 and NST-196) are generated as recombinant proteins using the pET28b(+) system there is very little protein produced compared with the production of NST-281. The pET28b(+) plasmids encoding nst-270 and nst-196 are approximately four times more unstable in E. coli than the plasmid encoding nst-281, suggesting that the NST-270 and NST-196 proteins are much more toxic than NST-281. This serves to explain why so little recombinant NST-270 and NST-196 is produced (26). The minimal toxicity of the NST-281 protein can be attributed to the large quantity of soluble recombinant protein being produced disrupting usual cell function (41). The toxicity of the truncation mutants is likely to be due to an enzymic activity, rather than to accumulation of protein, as very little NST-270 or NST-196 protein was produced. An analogous situation has been observed with zymogens (such as trypsinogen), which are only toxic when expressed in bacteria if the inactivating peptide at the N terminus is missing (43).
When the nst-196, nst-270, and nst-281 proteins were expressed under conditions in which the recombinant proteins were excreted into the periplasmic space (24,42,44), much higher levels of expression for soluble NST-270 and NST-196 proteins (Fig. 2) were observed relative to expression using the pET system. The quantity of soluble NST-270 and NST-196 proteins produced in the periplasmic space was low relative to the production of soluble NST-281 (Fig. 2, c and d). It thus appears that the folded structure of NST-281 is more stable than the structures of NST-270 and NST-196 and suggests that the C-terminal region is essential for the stability of the complete NST-281 fold. However, the enzymic activity of the NST-270 and NST-196 proteins indicated clearly that low levels of correctly folded proteins were present. Soluble, but inactive, NST-270 (with the gIII signal still attached) was found in the cytosol and is unlikely to be folded correctly since one of the features of the gIII transport system is that folding of recombinant proteins is only completed after transport to the periplasm (24,42).
The acetyltransferase and hydrolytic activities of the truncated versions of S. typhimurium NAT were different from those of the full-length protein. The NST-281 and NST-270 proteins catalyzed the acetyl transfer from Ac-CoA to 4AV, ANS, or INH whereas NST-196, which completely lacks the third domain, showed no such activity. The lack of acetyltransferase activity for NST-196 confirms that the third domain of NST-281 is essential for the acetylation step (19). This has been shown previously for human NAT1 (45) and suggests a common domain structure for both NST-281 and human NAT1. The similarity of the Km values of NST-270 for the acetylation of the arylamines 4AV and ANS (200 µM) indicates that the relative affinity for these two arylamine substrates has been abolished in the shorter form of the protein. The Km values, with NST-270, for both substrates (4AV and ANS) were lower than those for NST-281, suggesting that the apparent affinity of NST-270 for the arylamine substrates is greater than the apparent affinity of NST-281 for the same substrates. In contrast, the Km value of NST-270 for Ac-CoA is over 8-fold greater than that of NST-281 (Tables I and II). The apparent affinity for Ac-CoA is much reduced on removal of the C terminus of NST-281.
This suggests that the C-terminal undecapeptide plays an important role in substrate and Ac-CoA binding, possibly by occluding access to the active site of the opposing NST-281 monomer if a dimer exists in solution (Fig. 1). The dependence of the Km values, for the hydrolysis of Ac-CoA by NST-270 and NST-196, on the presence of INH (Table I) indicates that binding of Ac-CoA also requires the presence of INH.
FIG. 3. [...] (ST-281). a, the structures of the arylamine substrates, ANS and 4AV, are shown along with the structure of the arylhydrazine substrate INH and acetyl donor PNPA. The acetylated product acetyl-anisidine (Ac-ANS) is also shown. The docking energies at the α and β sites, as shown below the respective structures, are given in kJ/mol. b, representation of the docking solutions of ANS into binding site β and PNPA into binding site α of the structure ST-281 as determined by simulated docking using the AUTODOCK suite of programs (38). The protein structure of ST-281 is shown in ribbon form with the residues of the catalytic triad also indicated.
Only NST-270 and NST-196 hydrolyze Ac-CoA when there is no INH present, which serves to explain the higher toxicity of these recombinant proteins in the cytosol of E. coli. Intracellular concentrations of Ac-CoA have been measured to be of the order of 1 mM in E. coli (46,47) and at this concentration of Ac-CoA it would be expected that the truncated proteins, NST-270 and NST-196, would compete effectively with other endogenous Ac-CoA-dependent enzymes, which are present in the cytosol of E. coli. The recombinant NST-281 protein, in contrast, would not compete in the absence of arylamine substrate. The control of hydrolysis of Ac-CoA is therefore likely to be an important function of the C terminus of NST-281, although it is still unknown how the hydrolysis of Ac-CoA, in the absence of arylamine substrate, is prevented in NST-281.
The simulated docking of 4AV, ANS, and INH to NAT from S. typhimurium revealed two binding sites (α and β) for the aromatic substrates (Fig. 3b). In contrast, docking of PNPA showed it was bound only at one site corresponding to the substrate-binding site α. The lack of good binding of the acetylated substrates (Ac-INH, Ac-4AV) also suggests that, once acetylated, the substrates do not remain bound to the protein.
Site β is in close proximity (~5 Å) to the putative "P-loop," which has been considered as a possible binding region for the phosphate groups of Ac-CoA in NAT from S. typhimurium (20), suggesting that site β, rather than site α, may be involved in Ac-CoA binding.
The mode of interaction between NST-281 and Ac-CoA is unclear, although a very strong "ring stacking" interaction between an aromatic moiety and the adenosine ring of Ac-CoA has previously been demonstrated in solution (48,49). It is thus possible that the arylamine substrate initially binds in site β to form a new Ac-CoA-binding site in which the aromatic ring of the substrate can "ring stack" with the adenosine ring in Ac-CoA. This would serve to explain the lack of hydrolysis in the absence of INH for NST-281 and would also explain the lower Km values for the hydrolysis of Ac-CoA by NST-270 and NST-196 in the presence of INH, although it is clear that the C-terminal undecapeptide must also play a role in regulation of the binding of Ac-CoA.
It is therefore proposed that the NAT protein must initially bind arylamine substrate (at site β) to initiate a conformational change, which allows the subsequent binding of Ac-CoA and the acetylation of the active site cysteine. Subsequent acetyl transfer can then occur from the acetylated-protein intermediate to the substrate (now bound at site α).
In conclusion, the C terminus of NAT from S. typhimurium regulates the binding of Ac-CoA in the absence of substrate such that there is control over hydrolysis of Ac-CoA. It is also possible that the first two domains of NAT from S. typhimurium and the corresponding domains of the cysteine proteases have evolved from a common ancestor with the third domain of NAT from S. typhimurium acting to regulate the hydrolytic function of the NAT protein.