Structural and Biochemical Characterization of an Active Arylamine N-Acetyltransferase Possessing a Non-canonical Cys-His-Glu Catalytic Triad

Background: Catalytic activity of prokaryotic and eukaryotic arylamine N-acetyltransferases (NATs) relies on a strictly conserved catalytic Cys-His-Asp triad.
Results: Structural and biochemical studies identified a functional NAT with a Cys-His-Glu catalytic triad.
Conclusion: The catalytic triad of these acetyltransferases is more plastic than previously believed.
Significance: (BACCR)NAT3 represents the first known functional acetyltransferase with a non-canonical Cys-His-Glu catalytic triad.

Arylamine N-acetyltransferases (NATs), a class of xenobiotic-metabolizing enzymes, catalyze the acetylation of aromatic amine compounds through a strictly conserved Cys-His-Asp catalytic triad. Each residue is essential for catalysis in both prokaryotic and eukaryotic NATs. Indeed, in (HUMAN)NAT2 variants, mutation of the Asp residue to Asn, Gln, or Glu dramatically impairs enzyme activity. However, a putative atypical NAT harboring a catalytic triad Glu residue was recently identified in Bacillus cereus ((BACCR)NAT3) but has not yet been characterized. We report here the crystal structure and functional characterization of this atypical NAT. The overall fold of (BACCR)NAT3 and the geometry of its Cys-His-Glu catalytic triad are similar to those present in functional NATs. Importantly, the enzyme was found to be active and to acetylate prototypic arylamine NAT substrates. In contrast to (HUMAN)NAT2, the presence of a Glu or Asp in the triad of (BACCR)NAT3 did not significantly affect enzyme structure or function. Computational analysis identified differences in residue packing and steric constraints in the active site of (BACCR)NAT3 that allow it to accommodate a Cys-His-Glu triad. These findings overturn the conventional view, demonstrating that the catalytic triad of this family of acetyltransferases is plastic. Moreover, they highlight the need for further study of the evolutionary history of NATs and the functional significance of the predominant Cys-His-Asp triad in both prokaryotic and eukaryotic forms.

Site-directed mutagenesis experiments have shown that the NAT catalytic reaction unequivocally requires each of the Cys, His, and Asp catalytic residues (8-10). Accordingly, all known functional prokaryotic and eukaryotic NATs harbor a Cys-His-Asp catalytic triad (11,12). Natural variations in the catalytic triad residues are poorly documented among the NAT family. So far, only one natural modification of a catalytic triad residue has been reported and characterized. A single nucleotide polymorphism (364G→A) in a (HUMAN)NAT2 isozyme (NAT2*12D) leads to an inactive NAT2 variant in which the catalytic triad Asp residue is replaced by Asn (10). Surprisingly, mutation of the catalytic Asp to the chemically analogous Glu in (HUMAN)NAT2 produced a variant with very low enzyme activity (2% of the wild-type activity). This Asp-to-Glu variant is also expressed at very low levels, suggesting that the substitution alters the structure of the enzyme (10).
Analyses of functional prokaryotic and eukaryotic NATs and those with triad variants prompted the hypothesis that a strict Cys-His-Asp catalytic triad is required for catalysis by this family of acetyltransferases (10,11). Interestingly, however, among the three nat genes present in Bacillus cereus, one encodes for a putative NAT isoenzyme, arylamine N-acetyltransferase 3 ((BACCR)NAT3), with an atypical Cys-His-Glu catalytic triad (9). The catalytic activity and structure of this atypical NAT have not been studied presumably because it was assumed to be inactive.
Here, x-ray crystallography, site-directed mutagenesis, biophysical, and computational approaches were combined to characterize this atypical NAT protein. We show that, despite the presence of a Glu residue in its catalytic triad, this atypical NAT is functional and able to acetylate prototypic NAT substrates. In contrast to (HUMAN)NAT2, (BACCR)NAT3 can support either a Glu or Asp residue in its catalytic triad with no significant structural or catalytic impact. Our data show that, contrary to current knowledge, the plasticity of the active site of this family of acetyltransferases also extends to their catalytic triad residues. More broadly, it raises the question of the evolutionary history of NAT enzymes and in particular of the functional significance of the predominance of the Cys-His-Asp triad in both prokaryotic and eukaryotic isozymes.
Expression and Purification of Recombinant (BACCR)NAT3 WT and E123D Enzymes-E. coli BL21 strains containing the (BACCR)NAT3 WT or E123D constructs were grown at 37°C until absorbance at 600 nm reached 0.7. Overexpression of NAT protein was induced by adding 500 µM isopropyl β-D-1-thiogalactopyranoside and incubating for 5 h at 25°C. Bacteria were then resuspended in phosphate-buffered saline (PBS), pH 7.4, 10 mg of lysozyme, 0.1% Triton X-100, and 1× protease inhibitor mixture prior to sonication at 4°C (3 min; 8-s pulse on, 30-s pulse off; Branson Digital Sonicator model 450). Bacterial debris was pelleted (30 min, 12,000 × g, 4°C), and supernatants were incubated with Chromatrix nickel-nitrilotriacetic acid beads (Jena Bioscience) for a minimum of 2 h at 4°C. Columns were washed extensively and successively with PBS containing 0, 20, and 30 mM imidazole before elution of His6-tagged NAT enzymes with PBS containing 300 mM imidazole. Finally, proteins were reduced with 10 mM dithiothreitol (DTT) and dialyzed overnight against 25 mM Tris-HCl, pH 7.5. Proteins were digested with thrombin and further purified by ion exchange chromatography (Mono Q column, ÄKTA system, GE Healthcare) as described previously (13). Purification yield was estimated by SDS-PAGE, and protein concentration was determined by absorbance at 280 nm.
Crystal Structure Determination-Crystallization and data collection of the (BACCR)NAT3 wild-type protein were reported previously (13). Briefly, crystals were obtained with the hanging drop-vapor diffusion method with a 1:1 mixture of 30 mg·ml⁻¹ (BACCR)NAT3 protein with 1.6 M sodium citrate, pH 6.5, 0.28 M NDSB-221 as additive (Hampton Research) at 18°C. The best crystals were flash cooled in liquid nitrogen with a 1:1 volume ratio of Paratone and paraffin oil as cryoprotectant before diffraction data collection at the Swiss Light Source synchrotron (Beamline PX3, Paul Scherrer Institute, Switzerland). The crystallographic parameters and data statistics were reported in Kubiak et al. (13). Briefly, data collection was achieved at 0.984 Å with a 1° oscillation per image on a charge-coupled device mar225 detector. XDS was used for data integration (14). The (BACCR)NAT3 structure was solved by molecular replacement using the program Phaser (15) implemented in the CCP4 program suite (16). A model of (BACCR)NAT3 was built using CHAINSAW (17) with the (BACAN)NAT1 structure (Protein Data Bank code 3LNB) as the search model. The protein structure was obtained by iterative manual rebuilding using Coot (18) and refined by Twin Lattice Symmetry and restrained refinement using REFMAC5 (19). After addition of water molecules, a final refinement step was achieved using BUSTER 2.0 software (20). The stereochemical quality of the model was followed during the refinement process using the program MolProbity (21). Water molecules were added with a 3.0 σ cutoff and manual verification before final validation was performed with PROCHECK (22). The final refinement statistics and model parameters are shown in Table 1. All structural representations were generated with PyMOL (23). Atomic coordinates and structure factors of the (BACCR)NAT3 protein have been deposited in the Protein Data Bank under accession code 4DMO.
Determination of Apparent Kinetic Parameters of (BACCR)NAT3 and the E123D Variant Enzymes-Steady-state kinetic experiments were conducted using the 5,5′-dithiobis-(nitrobenzoic acid) assay (24) to determine the apparent kinetic parameters Km(app) and kcat of the (BACCR)NAT3 and E123D variant enzymes toward 4-aminosalicylic acid, 2-aminofluorene, and isoniazid. Initial velocities (Vi) were determined in the presence of arylamine substrate at various concentrations, 400 µM AcCoA, and 0.5 µg·ml⁻¹ enzyme in 25 mM Tris-HCl, pH 7.5. Alternatively, various concentrations of acetyl donors were used with 500 µM 4-aminosalicylic acid to determine the apparent kinetic parameters toward AcCoA. The PNPA assay described by Cleland and Hengge (25) was used to obtain kinetic parameters toward the acetyl donor PNPA. Control experiments were conducted in the absence of arylamine substrate. Vi values were plotted against substrate concentration and fitted to the Michaelis-Menten equation using Kaleidagraph 3.5 (Synergy Software). Results of triplicate experiments are shown including standard deviations.
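The fitting step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the paper reports a non-linear fit in Kaleidagraph, whereas this sketch swaps in a Hanes-Woolf linearization ([S]/v against [S]) so that it runs with the Python standard library alone. The function names and the synthetic data are hypothetical and are not taken from Table 2.

```python
def fit_michaelis_menten(substrate, velocity):
    """Estimate (Km, Vmax) from the Hanes-Woolf line [S]/v = [S]/Vmax + Km/Vmax."""
    xs = [float(s) for s in substrate]
    ys = [s / v for s, v in zip(xs, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of [S]/v versus [S].
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope        # slope = 1 / Vmax
    km = intercept * vmax     # intercept = Km / Vmax
    return km, vmax


def kcat_from_vmax(vmax_molar_per_s, enzyme_g_per_l, molar_mass_g_per_mol):
    """Convert Vmax (M/s) to kcat (1/s) using the molar enzyme concentration."""
    enzyme_molar = enzyme_g_per_l / molar_mass_g_per_mol
    return vmax_molar_per_s / enzyme_molar


# Synthetic, noise-free data (hypothetical values, arbitrary units).
TRUE_KM, TRUE_VMAX = 100.0, 2.0
conc = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in conc]
km_est, vmax_est = fit_michaelis_menten(conc, rates)
```

With real, noisy data a non-linear least-squares fit of the untransformed Michaelis-Menten equation is generally preferable, because linearization distorts the error structure. The helper `kcat_from_vmax` only shows the unit conversion: 0.5 µg·ml⁻¹ of a 33,465-Da enzyme corresponds to roughly 15 nM.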
Circular Dichroism (CD) Spectra and Chemical Unfolding Followed by CD-Recombinant enzyme samples were extensively dialyzed against 10 mM sodium phosphate, pH 7.5 buffer before recording CD spectra in an Aviv215 spectropolarimeter. Far-UV CD spectra (180-260 nm) were acquired with 505 and 463 µg·ml⁻¹ final concentrations of (BACCR)NAT3 and E123D variant, respectively, through a cylindrical cell with a 0.02-cm path length. The ellipticity signal was recorded every 0.5 nm (1-nm bandwidth) with an integration time of 1 s per step. The average spectrum of four successive scans was normalized to the protein concentration, and ellipticity was converted to differential molar extinction coefficient Δε per residue. Secondary structure quantitative composition was deduced from the decomposition of the spectra using the CONTIN analysis included in the CDPro package with the 29-protein basis of Johnson and co-workers (26). Near-UV CD spectra (250-350 nm) were obtained from an average of five successive scans through a 1-cm path length rectangular cell with 2.016 and 1.852 mg·ml⁻¹ final concentration of (BACCR)NAT3 and E123D variant, respectively. Spectra were normalized to the protein concentration as Δε per chain.
Native and 5 M GdmCl totally unfolded proteins, both at 0.1 mg·ml⁻¹, were prepared in 20 mM Tris-HCl, pH 7.8. Various amounts of native and unfolded proteins were mixed in a 1-ml final volume to obtain GdmCl concentrations ranging from 0 to 5 M in 0.1 M steps. The samples were held for 24 h at 25°C before recording ellipticity at 222 nm using a 0.5-cm path length rectangular cell (10-s integration time, 2-nm bandwidth). The ellipticity at 222 nm was expressed as a function of GdmCl concentration and then fitted to the following two-state transition model equation.
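A standard form of the two-state linear extrapolation model is sketched here, chosen to be consistent with the symbol definitions in the text. The base-line intercepts θ_N and θ_U, the gas constant R, the temperature T, and the sign convention ΔG = ΔG_H2O − mC are assumptions of this sketch; the exact expression used for the fit may differ.

```latex
\theta(C) =
  \frac{(\theta_N + aC) + (\theta_U + bC)\,
        \exp\!\left(-\dfrac{\Delta G_{\mathrm{H_2O}} - mC}{RT}\right)}
       {1 + \exp\!\left(-\dfrac{\Delta G_{\mathrm{H_2O}} - mC}{RT}\right)}
```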
where C is the concentration of GdmCl (M), ΔG(H2O) is the unfolding free energy in the absence of denaturant, and m is the dependence of ΔG on denaturant concentration. a and b are the slopes of the ellipticity base lines (assumed to be linear) of the native and unfolded states, respectively. Finally, the ellipticity signal was normalized to the total amplitude difference between the fully unfolded protein and native protein (27).
pH-dependent Inactivation by Iodoacetamide-Recombinant enzyme was mixed with either 50 mM MES, pH 6.0-6.5; 50 mM Tris-HCl, pH 7-9; or 50 mM CAPS, pH 9.5-10. The inactivation reaction was started at 25°C by adding 1 mM iodoacetamide (IAA) to a solution containing 15 µM enzyme. The percentage of residual activity was measured at different incubation times by adding 5-µl aliquots to a solution containing 25 mM Tris-HCl, pH 7.5, 2 mM PNPA, and 500 µM 4-aminosalicylic acid as described above. The percentage of residual activity fitted a single exponential curve, allowing us to determine the first-order inactivation constant kinact, which was corrected by the kinact determined in the absence of IAA. ln(kinact) was plotted as a function of pH. The catalytic cysteine pKa value was determined using the Henderson-Hasselbalch equation (28).
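A common form of the Henderson-Hasselbalch relation for a single titrating group is sketched here, written with the k_max and k_min limits named in the text. Treating k_inact as proportional to the deprotonated (thiolate) fraction is an assumption of this sketch, and the fitted expression may have been applied to ln(k_inact) instead.

```latex
k_{\mathrm{inact}}(\mathrm{pH}) =
  k_{\min} + \frac{k_{\max} - k_{\min}}{1 + 10^{\,\mathrm{p}K_a - \mathrm{pH}}}
```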
where kmax is the highest and kmin is the lowest measured kinact value, respectively. Standard deviations were calculated from three independent experiments.
Phylogenetic Analysis of NAT Sequences-Amino acid sequences were retrieved with BLASTp on the non-redundant sequence databases using the (BACCR)NAT3 sequence as a query. The 250 protein sequences with the highest scores were aligned. Of these, one strain representative of each species with different NAT isoform sequences was selected. Sequences with lower scores but presenting a putative Cys-His-Glu catalytic triad were added to this selection. All sequences were aligned using Muscle with the (BACCR)NAT1, (BACCR)NAT2, and (BACCR)NAT3 sequences and sequences of previously characterized NATs. The phylogenetic tree was constructed using a distance method implemented in MEGA (29) through a neighbor-joining analysis computed using the Dayhoff model. A total of 500 bootstrap replications were performed to determine statistical support for clades.
Expression of (BACCR)NAT3 and an E123D (BACCR)NAT3 Variant-The only characterized NAT enzyme harboring a putative Cys-His-Glu catalytic triad is (BACAN)NAT3, a short NAT homologue devoid of NAT activity (5,30). Previous evidence suggests that NAT sequences harboring a Cys-His-Glu catalytic triad are devoid of NAT activity (9,10,30). To characterize (BACCR)NAT3, we cloned and expressed the protein in E. coli and purified it to homogeneity. In parallel, a variant of (BACCR)NAT3 in which the catalytic Glu was replaced by an Asp residue (E123D variant) was also cloned, expressed, and purified. Both proteins were soluble and had identical purification yields (5 mg/liter of culture). SDS-PAGE revealed a single band of ~34 kDa consistent with the predicted (ProtParam) molecular mass of 33,465 and 33,450 Da for the wild-type and mutant proteins, respectively (Fig. 2A). Altogether, these data suggest that, in contrast to the (HUMAN)NAT2 isozyme, the presence of a Glu or Asp residue in the catalytic triad of (BACCR)NAT3 does not impact protein stability and does not lead to aggregation during expression (10).
Further analysis of (BACCR)NAT3 and its E123D variant by circular dichroism revealed that their secondary and tertiary structures were very similar to one another (Fig. 2, B and C). The unfolding profiles of both proteins showed an unusual transient dramatic ellipticity decrease centered on 0.6 -0.7 M GdmCl (Fig. 2D). At higher denaturant concentration, a regain in ellipticity was observed, and at GdmCl concentrations above This atypical phenomenon could be related to the "denaturation trough" initially reported by Goldberg and co-workers (32) and then by other authors (33)(34)(35). Although the (HUMAN)NAT2 D122N protein has been suggested to retain its global folding over molecular dynamics experiments, mutagenesis studies showed that the (HUMAN)NAT2 D122E variant was unstable and prematurely degraded (10,36). In contrast, our data demonstrate that the presence of either an Asp or a Glu residue in the catalytic triad appears not to affect the folding properties of (BACCR)NAT3.
A Cys-His-Glu Catalytic Triad Is Consistent with NAT Activity in (BACCR)NAT3 Wild Type: a Kinetic Comparison with the E123D Mutant-As both the (BACCR)NAT3 and (BACCR)NAT3 E123D proteins appeared properly folded, we investigated their potential to N-acetylate three model aromatic amine substrates (4-aminosalicylic acid, 2-aminofluorene, and isoniazid). (BACCR)NAT3 displayed significant N-acetylation activity toward these three NAT substrates (Table 2). Restoring the Asp residue at position 123 slightly decreased the turnover of the enzyme (kcat) by 1.3-1.7 times and the affinity constant (Km(app)) by 2.1-6.8 times. (BACCR)NAT3 wild-type catalytic efficiency (kcat/Km(app)) was 1.6-4.4 times lower than that of the E123D mutant, but both enzymes were in the range of mean catalytic efficiency values observed for other bacterial NATs (30,37). Similarly, the Km(app) value of (BACCR)NAT3 for the acetyl donor PNPA was 2 times higher than that of the mutant, and the catalytic efficiency was 2.5 times higher. However, only small differences were observed between the two enzymes in kinetic parameters for the physiological acetyl donor AcCoA (Table 2).
Consistent with previous reports for other NATs, PNPA was a less efficient acetyl donor than AcCoA for both NATs we studied (37). AcCoA has been shown to possess numerous stabilizing interactions all over the active site cleft in contrast to aromatic amines that interact deeply in the active site to reach the acetyl-cysteine intermediate for acetylation (38-41). Thus, the differences in kinetic parameters observed for aromatic amine substrates and the aromatic amine structural analogue PNPA suggest that the shape of the active site proximal to the catalytic triad might be affected; a similar mechanism has been proposed for the (HUMAN)NAT2 D122N mutant (36). Glu and Asp residues share very similar chemical properties and have similar side chain pKa values of 4.15 and 3.71, respectively. The third catalytic residue of the triad has been suggested to act as a base assisting the deprotonation of the two other catalytic residues; a change in the catalytic mechanism would therefore be expected to modify the reactivity of the catalytic cysteine (42,43). Hence, we conducted a pH-dependent inactivation experiment by IAA to determine the reactivity/pKa of the catalytic Cys 69 in both (BACCR)NAT3 and the E123D variant. The pKa values of the catalytic cysteine of the wild-type and E123D mutant enzymes were found to be similar (8.26 ± 0.11 and 7.80 ± 0.14, respectively) (Fig. 3). This suggests that the presence of a catalytic Asp or Glu residue has no significant impact on the reactivity of the catalytic cysteine of (BACCR)NAT3. Moreover, the IAA/pH inactivation profile is more similar to that of Mycobacterium tuberculosis NAT (whose catalytic cysteine pKa lies above 10) than to that of hamster NAT2 for which the catalytic cysteine pKa is 5.23.
Therefore, the catalytic mechanism of (BACCR)NAT3 is unlikely to rely on a thiolate/imidazolium ion pair as shown for hamster NAT2 (42) but rather on a general base catalysis mechanism in which the catalytic Cys 69 is deprotonated in the presence of substrate at physiological pH as proposed for M. tuberculosis NAT (Fig. 4) (43).
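The link between these pKa values and the protonation state of the catalytic cysteine can be made concrete with a short calculation. The sketch below assumes a single, independently titrating thiol described by the Henderson-Hasselbalch relation; the function name and the choice of pH 7.5 (the assay buffer pH) are illustrative assumptions.

```python
def thiolate_fraction(pka, ph):
    """Fraction of a single titrating thiol present as thiolate at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSAY_PH = 7.5  # pH of the Tris-HCl assay buffer; used here for illustration only

# pKa values reported in the text for the catalytic cysteine.
fractions = {
    "(BACCR)NAT3 wild type": thiolate_fraction(8.26, ASSAY_PH),
    "(BACCR)NAT3 E123D": thiolate_fraction(7.80, ASSAY_PH),
    "hamster NAT2": thiolate_fraction(5.23, ASSAY_PH),
}

for name, fraction in fractions.items():
    print(f"{name}: {100.0 * fraction:.0f}% thiolate")
```

Under these assumptions the cysteine of both (BACCR)NAT3 forms is predominantly protonated at the assay pH, whereas a cysteine with a pKa of 5.23 is almost fully deprotonated, which is the contrast the mechanistic argument relies on.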
Mutation of individual catalytic residues led to a dramatic loss of activity in NATs (10,38). For the first time, we demonstrate the existence of a non-canonical but functional Cys-His-Glu catalytic triad in an NAT. Our data suggest that this non-canonical triad has no significant impact on activity or substrate-specific catalytic mechanisms when compared with other characterized prokaryotic NATs. This is an important finding for documenting the catalytic diversity of NAT-related enzymes. Indeed, only two catalytic triad variants of the papain-like cysteine peptidases cathepsins L have been reported and are associated with normal activity levels but greatly altered substrate specificity (44). The catalytic triads of transglutaminases and deubiquitinating enzymes are strictly conserved (45). Nevertheless, two esterases (serine proteases) have been reported to have normal activity levels and substrate specificity despite an Asp-to-Glu mutation in the Cys-His-Asp catalytic triad (46). Alternatively, normal activity levels have been reported in particular inteins with a modified catalytic residue but with an accompanying change in the catalytic mechanism (47,48).
FIGURE 4. The pKa values of the catalytic cysteine residue suggest that it exists mostly as a thiol at physiological pH (<8), which would be inconsistent with the nucleophilic attack on the AcCoA molecule (first reaction step). Thus, as suggested by Sikora et al. (43) for the M. tuberculosis NAT enzyme, we propose that the Glu/Asp 123 and His 108 dyad acts as a general base that facilitates the formation of a thiolate. In the absence of AcCoA in the active site, the thiol proton is shared between Cys 69 and His 108 (dashed line). The AcCoA binding in the active site could displace this equilibrium and help in the deprotonation of the thiol, leading to the nucleophilic attack of the thiolate on the AcCoA. The following catalysis steps are likely to involve the canonical tetrahedral intermediate states (in brackets) that are known to occur in the Ping Pong Bi Bi mechanism of NAT enzymes. The active site is represented by a gray dashed curve, the His 108 and Glu/Asp 123 residues are buried, and the Cys 69 is solvent-accessible.

Crystal Structure of (BACCR)NAT3 Wild-type Enzyme: Overall Structure and Catalytic Triad Geometry-Although mutagenesis and computational studies have been conducted on catalytic triad variants of NAT enzymes, no structural determination studies have been reported. Thus, we have solved the crystal structure of (BACCR)NAT3 at 2.14 Å. The Matthews coefficient value of 2.16 Å³·Da⁻¹ and solvent content of 43.03% are consistent with the presence of two molecules in the asymmetric unit after molecular replacement. According to PROCHECK, 97% of the residues are in favored regions of the Ramachandran plot, and 3% are in additional allowed regions, and the R-factor and Rfree values are 20.4 and 25.5%, respectively (Table 1). Three molecules of NDSB-221 additive have been identified in the asymmetric unit with no biological significance. No electron density was observed for the two residues Gly −1 and Ser 0 that remained after the His6 tag thrombin cleavage.
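The relation between the Matthews coefficient and the solvent content quoted above can be checked with a short calculation. This sketch uses the conventional approximation that the solvent fraction equals 1 − 1.23/V_M; the function names are hypothetical, and the unit cell volume and symmetry are not given in this text, so the first function is shown only to make the definition explicit.

```python
def matthews_coefficient(cell_volume_a3, n_symmetry_ops, n_molecules_asu, mass_da):
    """V_M in cubic angstroms per dalton: cell volume divided by total protein mass in the cell."""
    return cell_volume_a3 / (n_symmetry_ops * n_molecules_asu * mass_da)


def solvent_fraction(v_m):
    """Approximate crystal solvent fraction from V_M (conventional 1.23 factor)."""
    return 1.0 - 1.23 / v_m


# Matthews coefficient reported in the text.
reported_vm = 2.16
estimated_solvent = solvent_fraction(reported_vm)
print(f"solvent content: {100.0 * estimated_solvent:.1f}%")
```

The estimate of about 43% agrees with the reported solvent content; the small difference from 43.03% is within what rounding of V_M to two decimals would produce.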
Clear electron density was observed for all residues except for side chains of surface-distributed residues 6, 15-18, 23, 54, 88, 230, 241, and 249. The enzyme has a typical three-domain fold consisting of an α-helical bundle (1-85), a β-barrel (86-192), and a C-terminal α/β lid (193-263) (Fig. 5A) (11). Interestingly, no electron density was observed for residues Asn 168 -Ala 177 , which align with the "mammalian insertion" present in human NATs and inferred in (BACAN)NAT1 (Fig. 1) (40, 41). This suggests the presence of a similar highly flexible loop in (BACCR)NAT3 and reinforces the idea that some Bacillus NAT isoforms may have undergone a particular evolutionary history (e.g. horizontal gene transfer from a eukaryotic host, giving rise to eukaryotic features in these bacterial NATs). The crystal structure of (BACCR)NAT3 shows unambiguous and strong electron densities for residues Cys 69 , His 108 , and more importantly the non-canonical Glu 123 , shaping a catalytic triad (Fig. 5B). Furthermore, this catalytic triad is highly superimposable with the canonical Cys-His-Asp triad of other NATs as shown by r.m.s.d. values between 0.627 and 0.869 Å. These values are ~4-fold lower compared with r.m.s.d. values between the catalytic triads of canonical NATs likely due to the presence of a larger Glu residue in (BACCR)NAT3 rather than to a different triad conformation as all catalytic triads are similarly oriented (Fig. 5C). The side chains of Cys 69 , His 108 , and Glu 123 are similarly oriented compared with the Asp and Cys residues of canonical NATs, and the Cys 69 -His 108 and His 108 -Glu 123 distances in (BACCR)NAT3 are also comparable with those found in other NATs. Altogether, these observations depict clear structural bases for the catalytic activity measured for (BACCR)NAT3 and demonstrate that the geometry of the non-canonical Cys-His-Glu catalytic triad fulfills the structural and chemical requirements of NAT catalysis in the (BACCR)NAT3 enzyme.

FIGURE 5. Three-dimensional structure of the (BACCR)NAT3 wild-type enzyme. A, schematic and surface representation of the two (BACCR)NAT3 molecules in the asymmetric unit, solved at 2.14 Å. Each enzyme chain has a typical three-domain fold including an α-helical bundle (red), a β-barrel (yellow), and a C-terminal α/β lid (blue). The structure confirms that residues Cys 69 , His 108 , and Glu 123 form a catalytic triad (green ball and stick). No electron density was attributed to the region spanning from Lys 167 to Asp 178 in both molecules, indicating the presence of a flexible loop corresponding to the mammalian-like insertion found in human NAT enzymes. B, 2Fo − Fc density map at 1.5 σ for (BACCR)NAT3 catalytic residues Cys 69 , His 108 , and Glu 123 . Hydrogen bonds are shown as black dashes, and distances (molecule A in the asymmetric unit) are expressed in Å. Distances in molecule B are shown in parentheses. C, superimposition of (BACCR)NAT3 (red) with (SALTY)NAT1 (yellow), (PSEAE)NAT1 (blue), (MYCMR)NAT1 (orange), and (RHILO)NAT1 (green) catalytic triads with r.m.s.d. values of 0.869, 0.672, 0.825, and 0.812 Å, respectively. It reveals a conserved geometry for the non-canonical Cys-His-Glu catalytic triad in (BACCR)NAT3 compared with other canonical NAT catalytic triads. The (BACCR)NAT3 backbone is shown in orange.
Protein Packing and Catalytic Residue Environment Likely Explain the Accommodation of a Glu in the Catalytic Triad of (BACCR)NAT3-As previously reported by Zang et al. (10), the loss of activity in (HUMAN)NAT2 enzyme likely results from disruption of the protein structure integrity. The structures of (BACCR)NAT3 and (HUMAN)NAT2 are highly superimposable with r.m.s.d. values of 1.483 and 0.539 Å for the whole structure and the catalytic triad, respectively. Both enzymes have a similar catalytic triad geometry, although His 108 appears slightly remote in (BACCR)NAT3 compared with its position in (HUMAN)NAT2 (Fig. 6A). In (BACCR)NAT3, the catalytic Glu 123 residue is stabilized by a complex network of hydrogen bonds established with residues Asn 73 , His 108 , Gly 125 , and Tyr 184 and unexpectedly with two water molecules that are further stabilized through three hydrogen bonds to residues Leu 121 , Leu 132 , and Ala 133 (Fig. 6B, upper panel). In (HUMAN)NAT2, the catalytic Asp 122 interacts with the same residues (except for Ser 125 ), but the main chain is stabilized by a single hydrogen bond with Gln 130 rather than with water molecules (Fig. 6B, lower panel). The topology of (BACCR)NAT3 is similar to that of (HUMAN)NAT2. Despite 27 additional residues in the (HUMAN)NAT2 enzyme compared with (BACCR)NAT3, the solvent-accessible surface of both proteins is very similar with values of 13,552 and 12,485 Å², respectively (Fig. 6C). In addition, (BACCR)NAT3 has nine unresolved/absent surface-exposed residues in its inferred mammalian-like loop. Fifteen of the additional residues in (HUMAN)NAT2 compose a C-terminal extension that is stabilized by a 2.8-Å hydrogen bond between Thr 289 and Ser 127 (Fig. 7A) (40). In contrast, the C-terminal amino acid of (BACCR)NAT3 is 19 Å away from the corresponding Ser 128 position. In both enzymes, this position is located on the turn of the loop carrying the third catalytic residue.
Ten of the additional residues are located in the "mammalian-like insertion," which is stabilized by several interactions with residues on domain III and hence is much more rigid than the unresolved flexible loop in (BACCR)NAT3 (Fig. 6B). This suggests a tighter packing of the (HUMAN)NAT2 enzyme that could prevent structural rearrangements of the enzyme and thus preclude the accommodation of the larger Glu residue in the catalytic triad. Additionally, several differences are found in the vicinity (<5 Å) of the third catalytic residue. In particular, the large residues Gln 130 and Met 131 in (HUMAN)NAT2 are replaced by the smaller Leu 130 and Pro 131 residues, respectively, in (BACCR)NAT3 (Fig. 7A). These amino acid differences lead to a larger space available around the Glu residue in (BACCR)NAT3 (167 Å³) compared with the Asp residue in (HUMAN)NAT2 (101 Å³). It is likely that the interaction between the Asp 122 main chain and Gln 130 in (HUMAN)NAT2 allows fewer structural rearrangements of the two residues compared with the conformations available for the Glu 123 main chain and the water molecules in (BACCR)NAT3. Thus, considering the similar chemistry of the Asp and Glu residues and the results of the kinetic study of (BACCR)NAT3 enzymes, we propose that the accommodation of a Glu as a catalytic residue in NAT enzymes relies on a combination of low steric constraints, favorable protein packing, and amino acid environment.
Energy calculations and energy minimization models of the (BACCR)NAT3 E123D and (HUMAN)NAT2 D122E mutants using FoldX 3.0 and Popmusic 2.1 support this model (49,50). The ΔΔG energy values obtained using the FoldX and Popmusic algorithms are +2.69 and 0.31 kcal/mol upon E123D mutation in (BACCR)NAT3 and +9.05 and +2.51 kcal/mol upon D122E mutation in (HUMAN)NAT2, respectively. This clearly indicates that the presence of a glutamate is energetically favorable in (BACCR)NAT3 in contrast to its destabilizing effect in (HUMAN)NAT2. Energy minimization models suggest that such an increase in ΔG is concomitant with a steric clash between the His 107 and Glu 122 catalytic residues in (HUMAN)NAT2 (Fig. 7B). Although the model seems robust, the absence of protein backbone rearrangements upon the mutation leads to a His 108 -Asp 123 distance of 4.7 Å, which is incompatible with the catalytic activity observed experimentally in the (BACCR)NAT3 E123D mutant enzyme (Fig. 7B).
The geometry of the His 108 , Phe 125 , and Tyr 184 residues suggests the existence of a strong network of π interactions (Fig. 6B). In hamster NAT2, residue Tyr 190 has been suggested to be important for maintaining the catalytic triad conformation by interacting with the catalytic Asp 122 and His 107 residues through hydrogen bonding and π interactions, respectively (51). The presence of a Ser instead of a Phe residue at position 125 in (HUMAN)NAT2 is likely to weaken this network. One can only speculate that local structural rearrangements upon insertion of a larger Glu residue may trigger the disruption of this network. Coupled with a lower degree of freedom of the Asp 122 -containing loop (Ser 127 -Thr 289 interaction) and a greater compaction, this could nevertheless decrease the ability of the protein to "find" an energetically favorable conformation, leading to the disruption of the protein structure integrity and its premature degradation. Molecular dynamics simulations have only been performed on the (HUMAN)NAT2 D122N mutant, but these suggest that the enzyme possesses fewer hydrogen bonds, which could decrease its overall stability (36).
… (53)) for (BACCR)NAT3 (green pocket) and (HUMAN)NAT2 (red pocket), respectively. This difference is likely due to the presence of the Pro 131 residue in (BACCR)NAT3 instead of the Met 131 residue in (HUMAN)NAT2. Both enzymes are represented in the same orientation, and a vertical 180° rotation view of the cavity in (HUMAN)NAT2 is shown in the right panel. The surface of Asn 72 has been subtracted in (HUMAN)NAT2 for increased visibility of the cavity, but it was included for the cavity volume calculation. The hydrogen bond in (HUMAN)NAT2 between the C-terminal residue Thr 289 and Ser 127 is represented in dashes (distances in Å); the corresponding Ser 128 does not interact with the C-terminal region in (BACCR)NAT3. B, FoldX energy minimization models of the E123D and D122E mutants for (BACCR)NAT3 and (HUMAN)NAT2, respectively. The difference in free energy variation (ΔΔG) generated upon mutation was calculated by FoldX. The wild-type and mutant catalytic triads are represented in sticks, and the catalytic His are represented as van der Waals spheres (1-Å radius; white). The Asp or Glu residue surface is also represented in van der Waals spheres (1-Å radius) for (BACCR)NAT3 (orange) and (HUMAN)NAT2 (cyan) models. These models show that a steric clash occurs between His and Glu upon D122E mutation in (HUMAN)NAT2.
FIGURE 8. Distribution of the non-canonical Cys-His-Glu catalytic triad throughout species. NAT homologue sequences from Bacillus species cluster differently from canonical bacterial NATs and are closely related to lower eukaryotes as reported previously (5). The Bacillus sequences cluster in three main clades corresponding to (BACCR)NAT1, (BACCR)NAT2, and (BACCR)NAT3 orthologous sequences. All NAT homologues present in the (BACCR)NAT3-containing clade have a Glu residue instead of the Asp at catalytic position 123 (in green). Three sequences from lower eukaryotes and four bacterial sequences also possess a putative Glu as the third catalytic residue (in red). Characterized NAT sequences are indicated according to the NAT nomenclature; other sequences are indicated as genus species strain followed by the UniProt identifier in brackets.
Phylogenetic Analysis of Glu Residue Distribution in the Catalytic Triads of NAT Enzymes-To date, only the (BACCR)NAT3 and (BACAN)NAT3 orthologous enzymes and the (HUMAN)NAT2 D122N variant have been shown to naturally harbor a mutation in a catalytic residue. A closer investigation of the diversity of the NAT catalytic triad was carried out by running a BLAST search of the non-redundant protein database with (BACCR)NAT3 as the query. The 250 highest scoring sequences retrieved all correspond to NAT homologues belonging to Bacillus species (supplemental Fig. S1). These sequences cluster into three main clades according to their sequence identity with the B. cereus NAT1, NAT2, and NAT3 enzymes (Fig. 8). Interestingly, every strain possessing several NAT sequences systematically harbors one with a Glu residue instead of an Asp at the catalytic position 123. More strikingly, seven NAT homologue sequences in the low score range also carry a putative Cys-His-Glu catalytic triad, including three sequences from Dictyostelium species and four from other bacteria (Fig. 8). Glenn et al. (5) showed that NAT sequences from B. anthracis and B. cereus nat genes were closely related to NATs from slime molds, suggesting the possible involvement of horizontal nat gene transfer events. Thus, the presence of a non-canonical NAT enzyme in the majority of Bacillus strains might stem from early horizontal gene transfer events from a eukaryotic host. Still, other unrelated bacteria harbor such atypical NAT homologues. Considering the environmental adaptation role proposed for prokaryotic NATs, such findings suggest that other non-canonical NAT enzymes may be functional and that such enzymes might have particular properties, e.g. substrate specificity, conferring specific features to these organisms. The characterization of the different B. cereus NAT isoforms and other non-canonical enzymes will help in answering this question.
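The survey of the third catalytic residue across homologues can be illustrated with a minimal sketch. It assumes that the retrieved sequences have already been aligned to the query, maps an ungapped residue position of the reference (for example, position 123 of (BACCR)NAT3) to its alignment column, and tallies the residues found in that column. The function names, sequence identifiers, and toy sequences are hypothetical and are not taken from the study, which does not describe its scripts.

```python
from collections import Counter
from typing import Dict


def column_for_position(aligned_ref: str, position: int) -> int:
    """Return the 0-based alignment column holding the given 1-based
    ungapped residue position of the reference sequence."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":  # gaps do not count as residues
            seen += 1
            if seen == position:
                return col
    raise ValueError("position lies beyond the end of the reference sequence")


def tally_residue(alignment: Dict[str, str], ref_id: str, position: int) -> Counter:
    """Count the residues (or gaps) that every aligned sequence carries
    in the column matching the reference residue position."""
    col = column_for_position(alignment[ref_id], position)
    return Counter(seq[col] for seq in alignment.values())


# Toy alignment (hypothetical sequences, all of equal aligned length).
toy_alignment = {
    "ref": "MK-CHE",   # ungapped residue 5 is E
    "seqA": "MKACHD",  # canonical Asp in the same column
    "seqB": "MK-CHE",  # Glu, as in the reference
    "seqC": "M--CH-",  # gap in the same column
}

counts = tally_residue(toy_alignment, "ref", 5)
for residue, n in sorted(counts.items()):
    print(residue, n)
```

With a real alignment, the same tally taken at the column of the catalytic position would separate homologues carrying Asp from those carrying Glu; gapped entries would need manual inspection.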
Concluding Remarks-Although the activity of the prokaryotic and eukaryotic NAT enzymes was known to rely on the presence of a strictly conserved Cys-His-Asp catalytic triad, we report here the characterization of a fully functional NAT enzyme possessing a non-canonical Cys-His-Glu catalytic triad. The crystal structure of this NAT enzyme from B. cereus ((BACCR)NAT3) suggests that the accommodation of a Glu residue might be achieved only in certain NAT enzymes that possess favorable packing and steric constraints around the catalytic position rather than depending on the chemical properties of the third catalytic residue, although it remains unclear which regions or particular residues are involved. The identification of both prokaryotic and eukaryotic NAT sequences carrying a putative Cys-His-Glu triad adds clues in this direction. However, it raises the question of the evolutionary history of the catalytic triad of NAT enzymes and of the significance of the predominance of Cys-His-Asp catalytic triads over the non-canonical Cys-His-Glu catalytic triad, which can be catalytically active as shown in this study. A similar question has been raised for the conservation of an Asp residue instead of the equivalent Glu residue in the catalytic triad of acetylcholinesterase (52). Finally, our work reinforces recent studies demonstrating greater structural and functional diversity than expected in the NAT family of acetyltransferases. The systematic functional and structural characterization of NATs harboring atypical features, in particular the homologues carrying the non-canonical Cys-His-Glu catalytic triad, will be necessary to better understand the basis of this emerging diversity.