Crystal Structure of N-Acetylornithine Transcarbamylase from Xanthomonas campestris

We have identified in Xanthomonas campestris a novel N-acetylornithine transcarbamylase that replaces ornithine transcarbamylase in the canonic arginine biosynthetic pathway of several Eubacteria. The crystal structures of the protein in the presence and absence of the reaction product, N-acetylcitrulline, were determined. This new family of transcarbamylases lacks the DxxSMG motif that is characteristic of all ornithine transcarbamylases (OTCases) and contains a novel proline-rich loop that forms part of the active site. The specificity for N-acetylornithine is conferred by hydrogen bonding with residues in the proline-rich loop via water molecules and by hydrophobic interactions with residues from the adjacent 80's, 120's, and proline-rich loops. This novel protein structure provides a starting point for rational design of specific analogs that may be useful in combating human and plant pathogens that utilize acetylornithine transcarbamylase rather than ornithine transcarbamylase.

Xanthomonas is a genus that includes several plant pathogens that attack a variety of economically important crops such as citrus fruits, grapes, rice, beans, and cotton. Phylogenetic analyses place Xanthomonas at the base of the γ-subdivision of proteobacteria, within the same clade as Xylella fastidiosa. Recently, the genomic sequences of Xanthomonas campestris pv. campestris and Xanthomonas axonopodis pv. citri were reported (1). Essentially all the putative arginine biosynthesis genes in X. campestris and X. axonopodis can be identified by homology to known arg genes from various organisms; however, the sequences of their putative argF genes differ significantly from the canonical argF genes that encode ornithine transcarbamylase (OTCase) enzymes (see www.genome.jp/dbget-bin/get_pathway?org_name=xcc&mapno=00220: urea cycle and metabolism of amino groups). These novel argF genes are very similar to a gene that we previously identified in Bacteroides fragilis and that is essential for arginine biosynthesis (2). The unusual argF gene products do not contain the DxxSMG ornithine-binding motif that is completely conserved in other argF/OTCase genes (3). Moreover, their purified protein products display no catalytic activity when incubated with carbamyl phosphate and ornithine. argF genes of this new class are also present in X. fastidiosa, X. axonopodis, Bacteroides thetaiotaomicron, Cytophaga hutchinsonii, Tannerella forsythensis, Prevotella ruminicola, and other Eubacteria, suggesting that this gene is not rare. The three-dimensional structure of the argF gene product from B. fragilis indicates that it has a distinctive ornithine-binding site, implying that it probably belongs to a novel class of transcarbamylases (2). We have now shown that several enzymes from this class catalyze the carbamylation of N-acetyl-L-ornithine instead of L-ornithine.
Here, we describe the liganded and unliganded three-dimensional structures of this novel argF (designated henceforth as argF′) product from X. campestris at 1.9 and 2.2 Å resolution, respectively, and provide structural evidence that it is a novel N-acetylornithine transcarbamylase (AOTCase). Since the encoded argF′ protein is essential for arginine biosynthesis in several organisms that are major pathogens, but is not present in other bacteria, animals, or humans, it is a specific growth inhibition target that could be used to develop drugs for agricultural and medical applications.

EXPERIMENTAL PROCEDURES
Cloning, Protein Expression, and Purification-The argF′ gene was amplified by PCR from X. campestris genomic DNA (ATCC 33913) using the primers 5′-GACATATGTCACTGAAGCACTTCTTGAACACC-3′ and 5′-GCGGATCCTCACGGGCGGCTCTGACCCAC-3′, which introduce NdeI and BamHI sites at the initiator ATG and downstream of the stop codon, respectively. The amplification products were cloned into a Topo vector using the Zero-blunt Topo cloning kit (Invitrogen). The plasmid with the correct insert was identified, isolated, and digested with both NdeI and BamHI. The fragments were ligated into the pET28a vector (Novagen) using T4 DNA ligase (New England Biolabs) and transformed into Escherichia coli DH5α cells (Invitrogen) for expression. Cultures were grown at 298 K to an A600 of 0.4-0.6 and were then induced with 0.2 mM isopropyl β-D-thiogalactoside and incubated overnight. The cells were harvested by centrifugation and suspended in 40 ml of nickel affinity lysis buffer (300 mM NaCl, 50 mM NaH2PO4, pH 7.4, 10% glycerol, 10 mM β-mercaptoethanol). The protein was purified using an AKTA FPLC system, first with a HisTrap nickel affinity column (Amersham Biosciences) and then with a HiTrap DEAE column (Amersham Biosciences). Following purification, the protein was dialyzed into a buffer containing 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 1 mM EDTA, and 5 mM β-mercaptoethanol. Protein purity was verified by SDS-PAGE (12%) followed by Coomassie staining; a single band of the expected molecular mass was observed. Protein concentration was determined by the Bradford method using bovine serum albumin as a standard (4). N-Acetyl-L-citrulline was synthesized by reacting L-citrulline with equimolar amounts of acetic anhydride in water followed by pH adjustment to 6.0.
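The placement of the restriction sites in the two primers quoted above can be checked with a short script. The following is a minimal sketch: the primer sequences are those given in the text, the recognition sequences (CATATG for NdeI, GGATCC for BamHI) are the standard ones, and the helper names are illustrative assumptions rather than part of the original protocol.

```python
# Primer sequences as quoted in the text (written 5' to 3').
FORWARD = "GACATATGTCACTGAAGCACTTCTTGAACACC"
REVERSE = "GCGGATCCTCACGGGCGGCTCTGACCCAC"

# Standard recognition sequences of the two restriction enzymes.
NDEI = "CATATG"
BAMHI = "GGATCC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


# 0-based positions of each site within its primer (-1 would mean "absent").
ndei_at = FORWARD.find(NDEI)
bamhi_at = REVERSE.find(BAMHI)

# The last three bases of the NdeI site (CAT|ATG) form the initiator codon.
start_codon = FORWARD[ndei_at + 3:ndei_at + 6]

# The reverse primer anneals to the opposite strand, so the three bases that
# follow the BamHI site are the reverse complement of the stop codon.
stop_codon = reverse_complement(REVERSE[bamhi_at + 6:bamhi_at + 9])

print(ndei_at, start_codon, bamhi_at, stop_codon)
```

On the coding strand the BamHI site therefore lies immediately downstream of the stop codon, in agreement with the description of the primers.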
Diffraction data for the unliganded protein were collected to 2.8 Å resolution on an R-axis IV image-plate diffractometer mounted on a Rigaku RU-200 rotating-anode generator with a Cu target (λ = 1.54178 Å). The data were indexed initially in the I-centered cubic space group I23 or I2₁3 with unit cell parameters a = b = c = 129.2 Å. Data sets to 2.2 and 1.9 Å resolution for unliganded and liganded crystals, respectively, were subsequently collected at the CHESS synchrotron source. All data were processed using the HKL2000 package (5).
Structural Solution and Refinement-Initially, molecular replacement was carried out on the in-house unliganded data set with the program AmoRe (6) in the CCP4 crystallographic suite (7). The space group was identified as I2₁3. The value of the packing density (8) is 2.23 Å³ Da⁻¹ for a monomer molecular mass of 40 kDa, which is consistent with the asymmetric unit being a monomer with a solvent content of 44.5%.
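The packing density argument can be reproduced approximately from the unit cell alone. The sketch below assumes 24 asymmetric units per cell for space group I2₁3 (12 operations of point group 23, doubled by the I-centering) and the commonly used approximation that the solvent fraction is 1 − 1.23/Vm; the function names are illustrative. Because the rounded 40 kDa mass is used, the output lands close to, but not exactly on, the values quoted in the text.

```python
def matthews_coefficient(cell_edge: float, n_asym_units: int, mass_da: float) -> float:
    """Packing density Vm in cubic angstroms per dalton for a cubic cell."""
    cell_volume = cell_edge ** 3
    return cell_volume / (n_asym_units * mass_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction, assuming a protein density term of 1.23."""
    return 1.0 - 1.23 / vm


# Values taken from the text: a = 129.2 angstroms, one 40 kDa monomer per
# asymmetric unit. The count of 24 asymmetric units is an assumption stated
# in the lead-in above.
vm = matthews_coefficient(129.2, 24, 40000.0)
solvent = solvent_fraction(vm)

print(f"Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent:.1f}%")
```

Placing a full trimer in the asymmetric unit would divide Vm by three, which is physically implausible for a protein crystal; this is why a single monomer per asymmetric unit is the consistent choice.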
Attempts to use monomeric search models based on E. coli OTCase (Protein Data Bank (PDB) code 1DUV), the B. fragilis argF′ protein (PDB code 1JS1), and other OTCase structures were unsuccessful. However, a reasonable solution was easily identified with trimeric models. The best solution with E. coli OTCase (9, 10) as the search model had a correlation coefficient of 51.6 and an R factor of 49.6% after the translation search. The same molecular replacement solution was also found with the B. fragilis argF′ trimer as the search model (2).
After the molecular replacement solution was obtained, two monomers were removed and the resulting model with a single monomer was subjected to rigid body refinement using the data set collected at CHESS and CNS version 1.0 (11). The progress of the refinement was monitored by the free R factor for 10% of the data (12). The initial model was built with program O (13) into electron density calculated with unliganded E. coli OTCase as the search model, after solvent flattening with density modification in the CCP4 suite (7). After rigid body refinement, the model was subjected to 600 steps of torsional angle molecular dynamics with a starting temperature of 2500 K. At this point, the electron density for the 80's and 240's loops, N-acetyl-L-citrulline, a sulfate ion at the active site (see supplemental Fig. 1), and a second sulfate ion on the 3-fold axis were visible in the liganded structure. The model for the monomer was then rebuilt to include these features. Subsequent energy minimization and B-factor refinement resulted in R = 26.9% and Rfree = 29.1% for the liganded structure. Further refinement with all data to 1.9 Å resulted in R = 26.3% and Rfree = 28.4%.
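For reference, the conventional and free R factors used to monitor the refinement can be sketched as follows. This is a minimal illustration that assumes the observed and calculated amplitudes are already on a common scale; the function names, the random selection of the 10% test set, and the toy amplitudes are illustrative assumptions and do not reproduce the CNS implementation.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the supplied reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.10, seed=0):
    """Randomly reserve a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    free = set(indices[:n_free])
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Toy amplitudes (hypothetical numbers, for illustration only).
example_r = r_factor([10.0, 20.0], [9.0, 22.0])

# Index split for a hypothetical data set of 100 reflections.
work_idx, free_idx = split_free_set(100)
```

R is evaluated over the working reflections that drive refinement, whereas Rfree applies the same formula to the reserved reflections only, which is what makes it useful as a check against overfitting.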
Water molecules were added using the CNS protocol for water-pick when the electron density was >3.5σ in Fo − Fc maps and there was at least one hydrogen bonding contact at less than 3.5 Å. The final model of the liganded structure consisted of 333 amino acid residues, two sulfate ions, one N-acetyl-L-citrulline, and 145 water molecules. The final residuals for this model were R = 22.7% and Rfree = 25.4%. The final residuals for the unliganded structure with 71 water molecules in the model were R = 22.6% and Rfree = 28.0%.
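The two acceptance criteria for water molecules can be expressed as a simple filter. The sketch below is not the CNS water-pick protocol itself; it assumes that difference-map peaks are supplied as (x, y, z, height) tuples with the height in σ units, that candidate hydrogen-bonding partners are given as a list of polar-atom coordinates, and that all names and example coordinates are hypothetical.

```python
import math


def pick_waters(peaks, polar_atoms, sigma_cutoff=3.5, max_contact=3.5):
    """Keep peaks above the density cutoff with at least one close polar atom.

    peaks       -- iterable of (x, y, z, height_in_sigma)
    polar_atoms -- iterable of (x, y, z) for potential hydrogen-bond partners
    """
    accepted = []
    for x, y, z, height in peaks:
        if height <= sigma_cutoff:
            continue  # density too weak
        position = (x, y, z)
        if any(math.dist(position, atom) < max_contact for atom in polar_atoms):
            accepted.append(position)
    return accepted


# Hypothetical example: one polar atom at the origin and three candidate peaks.
polar_atoms = [(0.0, 0.0, 0.0)]
peaks = [
    (2.8, 0.0, 0.0, 4.2),  # strong peak within contact distance: kept
    (2.8, 0.0, 0.0, 3.0),  # within distance but below the density cutoff
    (6.0, 0.0, 0.0, 5.0),  # strong peak but no contact within 3.5 angstroms
]
waters = pick_waters(peaks, polar_atoms)
```

A production protocol would normally also reject peaks that sit too close to existing atoms; that lower bound is not stated in the text and is therefore omitted here.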

RESULTS AND DISCUSSION
Structural Model-The refinement statistics for both liganded and unliganded structures are given in Table I

gion with the exception of Glu144, Thr145, and Leu295, which are in energetically unfavorable conformations in both structures. These three residues are near the active site and are constrained by a hydrogen-bonded network. In all other known transcarbamylase structures, leucine residues corresponding to Leu295 are also in energetically unfavorable conformations, and the peptide bond between this leucine and Pro296 is cis, suggesting that these structural features are essential to function. In the unliganded structural model, the electron density for residues 74-77 is too weak to be modeled, indicating that this loop is flexible in the absence of substrates. In general, the unliganded and liganded structures are very similar, with an r.m.s. difference of 0.501 Å for 328 equivalent Cα atoms. The differences are smallest in the hinge region that consists of the C termini of helices H1 and H5 and the N terminus of helix H12, which are held together by an extensive hydrogen bond network. The largest differences between the liganded and unliganded structures reside in the 80's, 120's, and 240's loops that are remote from the hinge region and closer to the active site.
The conformation of bound N-acetyl-L-citrulline is well defined, but the electron density does not distinguish between the carbonyl oxygen and methyl carbon of the acetyl group. However, it is most likely that the acetyl group is part of a hydrogen bonded network that includes Wat57 and Wat101 (see Fig. 1 for the active site structure). The sulfate ion located at the 3-fold axis has strong (2.78 Å) hydrogen bonds to three symmetry-related His66 residues, as is the case in Pyrococcus furiosus OTCase (15).
Since the unliganded and liganded structures are similar, and the liganded structure is at a higher resolution, the subsequent results and discussion are based on the liganded structure.
Monomeric Structure-As in other members of the transcarbamylase superfamily, the monomer has two domains, linked by two interdomain helices, H5 and H12. Each domain has a central parallel β-sheet surrounded by α-helices and loops conforming to α/β folding topology (Fig. 2A). The active site is located within the cleft between the two domains and is shared by two adjacent monomers.
The r.m.s. difference between the B. fragilis argF′ protein structure (PDB code 1JS1) and the X. campestris structure calculated with the automatic matching algorithm in O (16) is 1.4 Å for 280 equivalent Cα atomic positions. Coincidentally, the r.m.s. difference is also 1.4 Å for 280 equivalent Cα atomic positions between human OTCase (PDB code 1C9Y) and the X. campestris structure, while the relationship to E. coli ATCase (PDB code 1EKX) is more distant, with an r.m.s. difference of 1.7 Å for 262 equivalent Cα atomic positions.
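The r.m.s. differences quoted above are root-mean-square distances over matched Cα positions. A minimal sketch of that final step is given below; it assumes the two structures have already been superimposed and that the equivalent atoms have already been paired (the matching and superposition performed by O are not reproduced), and the function name and toy coordinates are illustrative.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square distance between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates): every atom is shifted by 1.4
# angstroms along z, so the r.m.s. difference is 1.4 angstroms.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.4), (3.8, 0.0, 1.4)]
example_rms = rms_difference(set_a, set_b)
```

Because the value depends on which atoms are counted as equivalent, an r.m.s. difference is only meaningful together with the number of matched positions, as reported in the text.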
The B. fragilis argF′ protein structure (2) and X. campestris AOTCase have extended 80's and 120's loops, relative to OTCase and ATCase, and a proline-rich loop that is not present in any other transcarbamylase (2). However, the loop that links H7 and β-strand B8 in the second binding domain of the B. fragilis argF′ protein is 11 amino acids shorter than in other OTCases and X. campestris AOTCase (see sequence alignment provided in supplemental Fig. 2). The conformation of the 120's loop in X. campestris AOTCase is similar to that of B. fragilis argF′ protein (Fig. 3), and this loop interacts strongly with the 240's loop via hydrophobic interactions. As a result of these interactions, the 240's loop is well defined (supplemental Fig. 3), and the relative positions of the two domains are constrained.
The conformation of the 80's loop in liganded X. campestris AOTCase is significantly different from that of the unliganded X. campestris and B. fragilis argF′ proteins (Fig. 3). Residues 74-77, which are disordered in the unliganded structure, become ordered, and the NE atom of Trp77 (Trp72 in B. fragilis) is displaced ~12 Å relative to its position in the unliganded B. fragilis structure. (It cannot be seen in the unliganded X. campestris structure.) This enables the NE atom to hydrogen bond with the O2 atom of the sulfate ion at the active site.
Trimeric Structure-The catalytic unit is a trimer with exact 3-fold symmetry and three active sites shared by adjacent monomers (Fig. 2B). The interfaces between monomers involve the conserved SMRTR motif and the 80's loop of the carbamyl phosphate (CP)-binding domain and the loop linking B10 and H11 in the N-acetylornithine-binding domain. (The specific interface interactions are provided in supplemental Table 1.) In contrast, the interactions between monomers of OTCases and ATCases primarily involve amino acid residues from the CP domain. However, one ion pair interaction, between Arg51 and Glu94, is conserved across all known structures in the transcarbamylase superfamily.
In contrast, the binding site for the second substrate in AOTCase (acetylornithine) is very different from that in OTCase (ornithine). The carboxyl group of the reaction product, N-acetyl-L-citrulline, is rotated ~110° around the CA-CB bond relative to the carboxyl group of L-ornithine bound to human OTCase. Two additional residues, Glu144 and Lys252, are involved in binding N-acetyl-L-citrulline in AOTCase (Fig. 1). Glu144 OE1 and Lys252 NZ form hydrogen bonds with oxygens O (2.67 Å) and OXT (2.79 Å), respectively, of the carboxyl group (Fig. 1). Three water molecules, Wat6, Wat28, and Wat57, at the active site help to anchor N-acetyl-L-citrulline by a network of hydrogen bonds. In addition to this extensive hydrogen-bonding network, hydrophobic interactions of Trp77 (from the adjacent 80's loop) and Leu184 (from the proline-rich loop) with the methyl C atom of the acetyl group also appear significant.
Several of the above interactions involve residues within the proline-rich loop, which is characteristic of the new AOTCase family. It appears that the function of the prolines is to maintain the conformation of the loop to allow these interactions, enabling AOTCases to "distinguish" between ornithine and N-acetylornithine.
Importantly, the product-liganded AOTCase structure provides an ending state for the forward reaction and a starting state for the reverse reaction. No such structure is available for any OTCase. The distance between the O1 atom of the bound sulfate and the CZ atom of N-acetyl-L-citrulline is only 2.73 Å. This would allow the O1 atom to attack the CZ atom in the reverse reaction if phosphate binds in the same manner. Even though a sulfate ion was also found at the active site in the unliganded protein, its location differs from that in the liganded structure: it lies about 2.34 Å closer to the position occupied by the ureido group. Apparently, when N-acetylcitrulline is released, the phosphate moves toward the ureido group, freeing the site for CP binding in the subsequent reaction.
The identification and the biochemical and structural characterization of AOTCase provide important information on a potential target for specific inhibition of bacterial pathogens that possess this unique step in a novel arginine biosynthetic pathway. Such inhibitors, when developed, would not affect the more common OTCase present in most lower and all higher organisms, including humans, and would potentially provide a specific non-toxic method for controlling certain agricultural and human pathogens.