Structure of Protein Geranylgeranyltransferase-I from the Human Pathogen Candida albicans Complexed with a Lipid Substrate*

Protein geranylgeranyltransferase-I (GGTase-I) catalyzes the transfer of a 20-carbon isoprenoid lipid to the sulfur of a cysteine residue located near the C terminus of numerous cellular proteins, including members of the Rho superfamily of small GTPases and other essential signal transduction proteins. In humans, GGTase-I and the homologous protein farnesyltransferase (FTase) are targets of anticancer therapeutics because of the role small GTPases play in oncogenesis. Protein prenyltransferases are also essential for many fungal and protozoan pathogens that infect humans, and have therefore become important targets for treating infectious diseases. Candida albicans, a causative agent of systemic fungal infections in immunocompromised individuals, is one pathogen for which protein prenylation is essential for survival. Here we present the crystal structure of GGTase-I from C. albicans (CaGGTase-I) in complex with its cognate lipid substrate, geranylgeranylpyrophosphate. This structure provides a high-resolution picture of a non-mammalian protein prenyltransferase. There are significant variations between species in critical areas of the active site, including the isoprenoid-binding pocket, as well as the putative product exit groove. These differences indicate the regions where specific protein prenyltransferase inhibitors with antifungal activity can be designed.

focus of cancer chemotherapeutic research for over a decade. Two FTase inhibitors, Lonafarnib (Schering) and Tipifarnib (Johnson & Johnson) have advanced to late-stage clinical trials (10) for the treatment of cancer.
In lower eukaryotes, prenylation of essential signal transduction proteins is also required for function. In pathogenic microorganisms such as C. albicans, disrupting the prenylation of such essential cellular proteins is a potential route to new antifungal medications (5-8, 23). The RAM2 gene in C. albicans encodes the common α-subunit of FTase and GGTase-I; knock-out mutations of this gene are lethal (5). As in mammalian cells, knock-out of a single β-subunit is not lethal, because the remaining CaaX prenyltransferase can cross-prenylate non-cognate substrates (7). Even so, at least one series of selective CaGGTase-I inhibitors has shown significant antifungal activity, suggesting that impairment of one of the CaaX prenyltransferases may be sufficient for effective treatment (23). In C. albicans, Rho family substrates of GGTase-I regulate cell wall biogenesis, while Ras family substrates of CaFTase have been strongly implicated in virulence by regulating the transition from yeast to hyphae (5). Selective inhibitors of C. albicans CaaX prenyltransferases would therefore provide tools for understanding morphological changes in this yeast as well as potential antifungal treatments.
Here we present the crystal structure of CaGGTase-I at 1.8-Å resolution in complex with its cognate lipid substrate, geranylgeranylpyrophosphate (GGPP). This structure shows the architecture of a non-mammalian protein prenyltransferase and provides high-resolution structural insights into the evolution of the protein prenyltransferase mechanism in lower eukaryotes and mammals. Although the yeast enzyme shares a similar overall architecture with its mammalian ortholog, close inspection of the active site reveals areas of significant divergence in critical regions. In particular, the mechanisms of isoprenoid selection differ in the two enzymes, and the putative product exit groove also varies significantly. We postulate that the unique structural features of CaGGTase-I will provide opportunities to design selective inhibitors for the development of new antifungal therapeutics.

EXPERIMENTAL PROCEDURES
Cloning and Protein Expression-C. albicans genomic DNA (strain SC5314) was obtained from the American Type Culture Collection. The RAM2 gene encoding the α-subunit was amplified from the genomic DNA using standard PCR methods and Platinum Pfx High Fidelity polymerase (Invitrogen). The forward primer sequence (SalI restriction site underlined) was 5′-CTGACGCCATGGATGACAGACTCCAAATATGAC-3′ and the reverse primer sequence (NotI site underlined) was 5′-CATTATGCGGCCGCTTACACCGATGTGAG-3′. The insert was digested with restriction enzymes SalI and NotI (New England Biolabs) for subcloning into the expression vector.
FIGURE 1. A, protein prenyltransferase reaction scheme. Alkylation of the cysteine γ sulfur by the isoprenoid (C1 position) produces prenylated CaaX tetrapeptide product; pyrophosphate is the leaving group. B, protein prenyltransferase reaction cycle, adapted from Refs. 13 and 14. Enzyme binds the lipid substrate (red) first (complex 1), followed by CaaX tetrapeptide substrate (blue) binding to generate a ternary substrate complex 2. Prenylated product complex 3 is formed, and pyrophosphate is released. A fresh lipid substrate molecule (red) then binds to initiate a new turn of the cycle and to complete the current one by displacing the prenylated product into a product exit groove (green) to generate 4. Displaced product is released from the active site, allowing binding of a new CaaX substrate and continuing the cycle.
Amplification of the CDC43 gene encoding the β-subunit was challenging because of its particularly AT-rich sequence. The insert was amplified in two steps. First, primers were designed to amplify from regions flanking the gene, ~250 bases upstream and downstream of the coding sequence. PCR was performed using Phusion High Fidelity DNA polymerase (New England Biolabs) according to the manufacturer's suggested protocol, with the forward primer 5′-TCAAACCGGCTTCTTCAAGT-3′ and the reverse primer 5′-TGTTGATTGTGTGTGTGGGA-3′. The resulting fragment of ~1.7 kb was used as the template for a second round of PCR amplification using Taq DNA polymerase (Invitrogen) according to the manufacturer's protocol with the following primers: forward, 5′-GCGCTGCATATGAACCAACTGCTGATTAACAAACATGAGAAATTTTT-3′ (NdeI restriction site underlined); reverse, 5′-GCGCTGCTCGAGTTAATACTTTATTTTTTCTTTAAAAAATTGATACGATTCTTTTGTAATTG-3′ (XhoI site underlined). The resulting PCR product was cloned into the PCR2.1 TOPO vector using the TOPO TA cloning kit from Invitrogen. Plasmids isolated from colonies positive for insert incorporation were digested using restriction enzymes XhoI and NdeI (New England Biolabs) to liberate the CDC43 insert.
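The restriction sites designed into the second-round CDC43 primers can be checked with a few lines of code. The sketch below is illustrative only and is not part of the published procedure: it uses the primer sequences quoted above with the line-break hyphens removed, the standard NdeI (CATATG) and XhoI (CTCGAG) recognition sequences, and helper names (`find_site`, `gc_fraction`) that are hypothetical.

```python
# Standard recognition sequences for the two enzymes used to excise CDC43.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Second-round CDC43 primers as quoted in the text (5'->3'),
# with extraction line-break hyphens removed.
PRIMERS = {
    "CDC43_forward": "GCGCTGCATATGAACCAACTGCTGATTAACAAACATGAGAAATTTTT",
    "CDC43_reverse": (
        "GCGCTGCTCGAGTTAATACTTTATTTTTTCTTTAAAAAATTGATACGATTCTTTTGTAATTG"
    ),
}


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer.upper().find(site.upper())


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    print("NdeI offset in forward primer:",
          find_site(PRIMERS["CDC43_forward"], SITES["NdeI"]))
    print("XhoI offset in reverse primer:",
          find_site(PRIMERS["CDC43_reverse"], SITES["XhoI"]))
    for name, seq in PRIMERS.items():
        print(name, "GC fraction:", round(gc_fraction(seq), 2))
```

In both primers the recognition site follows a six-base 5′ overhang, and the low GC fraction of the gene-specific portions reflects the AT-rich sequence noted above.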
The pCDFDuet-I vector (Novagen) was chosen to co-express both subunits of the enzyme in Escherichia coli (E. coli); the vector contains two multiple cloning sites (MCS) under the control of separate isopropyl-1-thio-β-D-galactopyranoside-inducible T7 promoters for robust co-expression of two gene products. The digested RAM2 gene encoding the α-subunit was subcloned into the SalI and NotI restriction sites in MCSI of the pCDFDuet-I vector. The digested CDC43 gene encoding the β-subunit was subcloned into the NdeI and XhoI restriction sites of MCSII of the pCDFDuet-I vector to achieve the final expression construct. The Duke University Medical Center DNA Analysis Facility performed the sequence analysis to confirm error-free construction of the expression vector.
The final construct was transformed into C41 (DE3) E. coli (AVIDIS, S.A.) for expression. A single colony was picked from the plate and grown overnight in a 50-ml LB culture supplemented with 50 mg/ml streptomycin. The 50-ml LB culture was used to inoculate 2 liters of LB medium supplemented with streptomycin, and the culture was grown until the A₆₀₀ reached 0.8, at which point it was induced with a final concentration of 1 mM isopropyl-1-thio-β-D-galactopyranoside for 4 h at 37°C. The culture was supplemented with 300 μM Zn(SO₄)₂ at induction. Cells were harvested at 6000 × g, and the cell paste could be flash-frozen in liquid nitrogen and stored at −80°C for several months.
Protein Purification-The cell paste was resuspended in a 10-fold volume of Buffer A (20 mM Tris, pH 7.7, 5 mM dithiothreitol, 5 μM ZnCl₂) supplemented with SigmaFast general use protease inhibitor tablets (Sigma). Cells were lysed using a pressurized homogenizer (Microfluidics Corp.), and the resulting crude lysate was clarified by centrifugation at 45,000 × g for 30 min. The lysate was first applied to a DEAE Sepharose column and fractionated using a gradient from Buffer A + 150 mM NaCl to Buffer A + 300 mM NaCl over 8 column volumes. The fractions containing CaGGTase-I, as determined by SDS-PAGE, were pooled and brought to a final concentration of 0.8 M (NH₄)₂SO₄ by 2-fold dilution with Buffer A + 1.6 M (NH₄)₂SO₄. As with the purification of mammalian GGTase-I, a 2-fold molar excess of GGPP was also added to the pooled fractions at this point, and the mixture was stirred at 4°C for 20 min before application to the column. The GGPP putatively displaces nonspecifically bound lipids in the active site of the molecule and results in a narrower elution peak from the phenyl-Sepharose column. After incubation with GGPP, the protein was applied to a phenyl-Sepharose column pre-equilibrated in Buffer A + 0.8 M (NH₄)₂SO₄ and fractionated using a gradient from Buffer A + 0.8 M (NH₄)₂SO₄ to Buffer A over 8 column volumes. The fractions containing CaGGTase-I (SDS-PAGE) were pooled, adjusted to 10 mS/cm conductivity by dilution in Buffer A (Thermo Scientific conductivity meter), and applied to a Q-Sepharose column pre-equilibrated in Buffer A + 150 mM NaCl. The protein was fractionated using a gradient from Buffer A + 150 mM NaCl to Buffer A + 350 mM NaCl over 8 column volumes. The fractions containing CaGGTase-I (SDS-PAGE) were concentrated to 0.5 ml total volume using a centrifugal concentrator (50-kDa MWCO, Amicon) and applied to a 120-ml Superdex 16/10 gel filtration column equilibrated in Buffer A.
The final fractions containing CaGGTase-I (SDS-PAGE) were concentrated again in a centrifugal concentrator (50-kDa MWCO, Amicon) to 15 mg/ml, flash-frozen in liquid nitrogen, and stored at −80°C. Typical yield was ~3-5 mg of purified CaGGTase-I per liter of E. coli culture.
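The ammonium sulfate adjustment and the expected final volume follow from simple mixing arithmetic. The sketch below is a hedged illustration, not part of the published protocol: it assumes the pooled DEAE fractions contain negligible (NH₄)₂SO₄, assumes a single 2-liter culture, and uses hypothetical helper names.

```python
import math


def mix_concentration(parts):
    """Concentration after mixing parts given as (concentration, volume) pairs.

    Volumes may be in any consistent unit; the result has the same
    concentration unit as the inputs.
    """
    total_volume = sum(volume for _, volume in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(conc * volume for conc, volume in parts) / total_volume


def final_volume_ml(yield_mg_per_liter, culture_liters, conc_mg_per_ml):
    """Volume of concentrated protein expected for a given yield."""
    return yield_mg_per_liter * culture_liters / conc_mg_per_ml


if __name__ == "__main__":
    # Equal volumes of pooled fractions (assumed 0 M) and Buffer A + 1.6 M.
    final_molar = mix_concentration([(0.0, 1.0), (1.6, 1.0)])
    print(f"(NH4)2SO4 after 2-fold dilution: {final_molar:.2f} M")
    assert math.isclose(final_molar, 0.8)

    # 3-5 mg per liter from an assumed 2-liter culture, concentrated to 15 mg/ml.
    for y in (3.0, 5.0):
        print(f"{y} mg/l -> {final_volume_ml(y, 2.0, 15.0):.2f} ml at 15 mg/ml")
```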
Crystallization and Data Collection-GGPP was added to an aliquot of protein at a 0.5:1 GGPP/protein ratio for 30 min prior to setting up the crystallization drop. CaGGTase-I crystals were grown in hanging drop format in which 1 μl of protein solution was mixed with 1 μl of well solution consisting of 25% PEG 1500 and 1× PCB buffer, pH 7.0 (100 ml of 10× PCB contains 3.84 g of sodium propionate, 4.28 g of sodium cacodylate, and 11.29 g of Bis-Tris propane). Crystals appeared within 2-3 days and grew as long thin rods, with typical dimensions 400 μm × 50 μm × 50 μm. Prior to data collection, crystals were transferred to a stabilizing solution containing 1× PCB buffer, pH 7.0, and 30% PEG 1500, followed by cryoprotection in stabilizing solution plus 10% ethylene glycol. Crystals were flash-frozen in liquid nitrogen. Diffraction data were collected at 100 K at the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. Crystals diffract beyond 1.7-Å resolution; a complete data set was collected to 1.8-Å resolution (Table 1). CaGGTase-I crystallized in space group C2 (a = 132.3 Å, b = 66.05 Å, c = 82.8 Å, α = γ = 90.0°, β = 100.0°) with one molecule in the asymmetric unit. HKL2000 was used for data reduction and scaling.
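The statement of one molecule per asymmetric unit can be sanity-checked with a Matthews coefficient estimate. The following is a minimal sketch rather than the authors' calculation: it assumes the 82-kDa heterodimer mass reported in the Results, four asymmetric units per C2 unit cell, and the conventional constant 1.23 for converting the Matthews coefficient to an approximate solvent fraction. Function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (cubic angstroms) for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, asu_per_cell, mass_da, copies_per_asu=1):
    """Matthews coefficient Vm in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * copies_per_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (conventional 1.23 constant)."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = monoclinic_cell_volume(132.3, 66.05, 82.8, 100.0)
    # C2 has four general positions; 82,000 Da is the heterodimer mass.
    vm = matthews_coefficient(volume, asu_per_cell=4, mass_da=82_000)
    print(f"cell volume: {volume:,.0f} A^3")
    print(f"Vm: {vm:.2f} A^3/Da")
    print(f"approximate solvent fraction: {solvent_fraction(vm):.2f}")
```

Under these assumptions Vm comes out a little above 2 Å³/Da, within the range commonly seen for protein crystals, which is consistent with a single heterodimer in the asymmetric unit.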
Structure Solution and Refinement-The CaGGTase-I structure was determined by molecular replacement using PHASER (24). A homology model derived from the rat GGTase-I structure (PDB code 1N4P, chains A and B (14)) was constructed using MODELLER (25) and subsequently used as the search model. A cycle of simulated annealing was then performed on the solution in CNS (26), which gave an R factor of 41.4%. ARP/wARP (27) was then used to retrace the model, properly fitting regions not successfully fit into the density by the simulated annealing refinement; iterative automated chain tracing with ARP/wARP and refinement in REFMAC5 (28)

RESULTS
Overall Structure of C. albicans GGTase-I-CaGGTase-I is an 82-kDa heterodimer consisting of a 37-kDa α-subunit (306 residues) and a 45-kDa β-subunit (390 residues). Fig. 2A shows the overall structural features of the enzyme. The sequence identities with respect to the mammalian ortholog are 28 and 25% for the α- and β-subunits, respectively. Despite this low overall sequence identity, the structure of CaGGTase-I is quite similar in overall architecture to mammalian GGTase-I (1.58 Å r.m.s.d. calculated over all aligned α-carbon atoms, Fig. 2B).
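For readers unfamiliar with the r.m.s.d. figure quoted above, the quantity is the root of the mean squared distance between paired α-carbon positions after superposition. The sketch below only illustrates the formula: it assumes the two coordinate sets are already superimposed and paired by the alignment, the coordinates are invented toy values, and the function name is hypothetical. It is not the procedure used to obtain the 1.58 Å value.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z).

    The lists must be the same length and already superimposed; no
    fitting or alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy alpha-carbon positions (angstroms), not taken from any structure.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    shifted = [(x + 1.0, y, z) for x, y, z in reference]
    print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} A")
```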
The CaGGTase-I α-subunit is smaller than the mammalian protein (306 versus 377 amino acids). CaGGTase-I lacks an N-terminal domain rich in proline and glutamine residues that is present in the mammalian enzyme. This domain is disordered in all mammalian FTase and GGTase-I structures determined to date.
Like the mammalian enzyme, the CaGGTase-I α-subunit consists of α-helices. There are 16 helices altogether, one short helix more than in the mammalian enzyme, arranged in antiparallel pairs to form a crescent that envelops part of the β-subunit.
We observe variation in helix length in the α-subunit. Helices 5α and 8α each extend one helical turn longer than the corresponding mammalian helices, and helix 12α is approximately one helical turn shorter. Five amino acids are inserted between the equivalents of 12α and 13α in the mammalian enzyme and form an additional short helix in CaGGTase-I (13α).
The β-subunit is slightly larger than that of mammalian GGTase-I (390 residues versus 377 for human GGTase-I). Variation in loop length between the helices in the β-subunit accounts for the additional residues. The β-subunit is predominantly α-helical, forming an α-α barrel with a central, largely hydrophobic cavity, which contains most of the residues that bind substrate and coordinate the catalytic Zn²⁺ ion (Fig. 1). The CaGGTase-I β-subunit also has two short antiparallel β-sheet regions (residues 67β-76β and 141β-154β) remote from the active site, which do not appear in the mammalian enzyme.
Two loops (residues 176β-188β, 261β-272β) and several residues at the termini (1α-2α, 305α-306α; 1β) are not visible in the final electron density maps. The disordered loops correspond to two areas of allelic and strain variation in the β-subunit: the region 176-188 contains a polyasparagine tract that can vary between 6 and 17 residues, depending on strain and allele. The variant described here contains six asparagines. The region 261-272 is rich in asparagine, aspartate, and glycine. The number of repeats of these residues also varies across strains and alleles (8); the amino acid sequence of the variant reported here is KDGNGDNGNGDN.
Isoprenoid Substrate Binding and Selection-Like mammalian GGTase-I (14), CaGGTase-I binds its isoprenoid substrate in a hydrophobic groove on one side of the active site cavity. The diphosphate moiety forms hydrogen bonds to lysine, arginine, and histidine residues, most of which are conserved across species. Although the overall conformation of the GGPP lipid substrate is similar to that in mammalian GGTase-I, the binding mode of the fourth isoprene in CaGGTase-I differs significantly (Fig. 3A). In mammalian GGTase-I, this fourth isoprene is directed toward the CaaX substrate-binding pocket adjacent to the isoprenoid pocket; in the yeast GGTase-I, phenylalanine 99β is bulkier than threonine 127β at the equivalent position in rat GGTase-I, directing the fourth isoprene away from the CaaX-binding site toward the edge of the active site cavity.
This conformational difference suggests that the C. albicans prenyltransferases have an alternative mechanism for isoprenoid substrate selection. In mammalian FTase and GGTase-I a single residue (W102β in FTase and T49β in GGTase-I) dominates selection of isoprenoid length (14, 32): the tryptophan in FTase presents a steric block to any isoprenoid lipid longer than the 15-carbon FPP, while the smaller threonine in GGTase-I permits binding of the additional five-carbon fourth GGPP isoprene. Mutagenesis of W102β to threonine converts FTase to a geranylgeranyltransferase (32).
In CaGGTase-I, isoprenoid selection appears to be governed by two residues (L98β and L352β), for which there are no equivalents in mammalian GGTase-I. Inspection of a sequence alignment of the CaGGTase-I β-subunit with the CaFTase β-subunit reveals variation at the positions equivalent to these leucine residues. CaFTase is predicted to have two tyrosines, 502β and 259β, at these positions. A homology model suggests that these residues impinge on the binding site for the fourth isoprene, excluding GGPP from the binding site and selecting for the shorter FPP (Fig. 3B).
Zinc and Magnesium Dependence of the Prenylation Reaction-Like all protein prenyltransferases studied to date, CaGGTase-I is a Zn²⁺-dependent metalloenzyme (17, 33-37), with the Zn²⁺ activating the cysteine thiolate of the CaaX substrate for attack on the C1 carbon of the isoprenoid substrate. The Zn²⁺ coordination sphere is conserved and consists of an aspartic acid (D294β), cysteine (C296β), and histidine (H349β), forming a distorted pentacoordinate geometry with two ligands contributed by D294β at 2.23 and 2.38 Å, a ligand from C296β at 2.00 Å, and a ligand from the Nε of H349β at 2.29 Å. Consistent with the structures of mammalian FTase and GGTase-I (14, 38), a water molecule occupies the position of a fifth ligand. In the mammalian enzymes, the water is displaced by the γ sulfur of the cysteine residue of the CaaX substrate upon peptide binding.
Unlike the mammalian GGTase-I, CaGGTase-I is dependent on millimolar levels of Mg²⁺ for its maximum reaction rate (8). This Mg²⁺ dependence is also observed in both mammalian and Saccharomyces cerevisiae FTases, as well as S. cerevisiae GGTase-I (17, 39-41). A Mg²⁺ ion is hypothesized to stabilize the diphosphate leaving group in the chemical step of the reaction (Fig. 4A) (13, 42). Modeling of a Mg²⁺ in the crystal structure of mammalian FTase indicates that aspartate D352β is positioned to coordinate this ion next to the diphosphate leaving group in the modeled transition state (Fig. 4A) (13). Mutagenesis studies further support that this residue binds Mg²⁺ (40). The structure of mammalian GGTase-I reveals that the terminal amine of a lysine side chain at the equivalent position could effectively substitute for Mg²⁺ at this position (Fig. 4A) (14, 39).
The CaGGTase-I structure reveals that the equivalent position to D352β or K311β in the mammalian enzymes is arginine 339β (Fig. 4B). At neutral pH, the positive charge of arginine is delocalized over the guanidinium group. In the mammalian GGTase-I, the orientation of K311β is restricted by a tryptophan, W312β, effectively directing the positive charge toward the diphosphate (14) (Fig. 4B). In CaGGTase-I, the residue adjacent to R339β is an aspartate, D340β (Fig. 4B). We propose a 2-fold effect: first, with a smaller neighbor (aspartate), the arginine can explore multiple conformations, which is supported by the observation that its guanidinium group is poorly ordered in the electron density; and second, the negatively charged D340β diminishes the positive charge density in this region, requiring higher Mg²⁺ levels.
The Mg²⁺-dependent S. cerevisiae GGTase-I has a lysine at the position equivalent to mammalian K311β and CaGGTase-I R339β, but the mammalian tryptophan is replaced by asparagine N332β (isosteric with the CaGGTase-I D340β), suggesting that the lysine conformation is less restricted, similar to CaGGTase-I R339β.
CaaX Substrate-binding Pocket-In the mammalian FTase and GGTase-I, the CaaX protein substrate binds in an extended conformation (Fig. 1B) with the cysteine coordinating the catalytic Zn²⁺ ion and the C terminus anchored at the bottom of the pocket by a glutamine residue (Q167α) (11, 13, 14, 43-45). In addition, the CaaX substrate makes significant van der Waals contact with the lipid substrate (Fig. 1B) (11, 13, 14, 44, 45). We expect that the extended conformation will be recapitulated in CaGGTase-I, with the Zn²⁺ and Q104α (equivalent to Q167α in mammalian GGTase-I) acting as anchor points.
The CaGGTase-I β-subunit barrel forms a large central cavity, which binds the two substrates similarly to mammalian GGTase-I (Figs. 1B and 2). Despite the similar overall architecture, CaGGTase-I exhibits significant variation in the identities of residues comprising the CaaX X-residue-binding site compared with the mammalian GGTase-I.
Most of the residues within a 5-Å radius of the X-residue are non-conservative substitutions compared with the mammalian enzyme (14) (Fig. 5). This arrangement suggests that either there is degeneracy in the recognition of the CaaX tetrapeptide (6, 8), or that the peptide substrates adopt a different binding mode, particularly with respect to the C-terminal X-residue. The structural adaptations of the binding site to accommodate the GGPP fourth isoprene are also likely to affect binding of the X-residue, because the two substrate-binding pockets are adjacent to each other and share several residues within van der Waals distance of both substrates (Fig. 1B).
Product Exit Groove-The reaction path determined for mammalian GGTase-I and FTase reveals the presence of a displaced prenylated product intermediate that precedes product release (Fig. 1B) (13, 14). The isoprenoid portion of this intermediate is bound in a solvent-exposed product exit groove located adjacent to the CaaX substrate-binding site (Fig. 1B). Product release for the mammalian protein prenyltransferases is the slowest step (300-fold slower relative to the catalytic step) in the reaction (19, 21). The structurally determined pre-release intermediate product complex is consistent with this kinetic scheme.
Aside from mammalian prenyltransferases, only S. cerevisiae FTase (46) has been characterized by pre-steady state kinetic analysis. The kinetically defined reaction path of S. cerevisiae FTase (46) is similar to that of the mammalian enzymes, with the notable exception that the product release step is no longer the clearly dominant slow step (3-fold slower relative to the catalytic step). This suggests that the product is not stably bound in this enzyme. Inspection of a sequence alignment of the mammalian and S. cerevisiae FTases reveals significant variation in the residues lining the exit groove, suggesting that the latter lacks this groove, which would account for the large difference in product release rate constants.
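The difference between a 300-fold and a 3-fold slower release step can be made concrete with a simplified two-step model. The sketch below is an illustration under stated assumptions, not an analysis from the cited kinetic studies: it treats turnover as two sequential irreversible steps (chemistry, then product release), so that the time per turnover is the sum of the reciprocal rate constants. Function names and the example rate constants are hypothetical.

```python
import math


def kcat_two_step(k_chem, k_release):
    """Overall turnover rate for two sequential irreversible steps."""
    return k_chem * k_release / (k_chem + k_release)


def release_time_fraction(fold_slower):
    """Fraction of each turnover spent waiting on product release.

    fold_slower is k_chem / k_release, i.e. how many times slower
    release is than the chemical step.
    """
    return fold_slower / (fold_slower + 1.0)


if __name__ == "__main__":
    for label, fold in (("mammalian (300-fold)", 300.0),
                        ("S. cerevisiae FTase (3-fold)", 3.0)):
        frac = release_time_fraction(fold)
        print(f"{label}: {100 * frac:.1f}% of turnover time in release")

    # Arbitrary example: chemistry at 3 per second, release at 1 per second.
    assert math.isclose(kcat_two_step(3.0, 1.0), 0.75)
```

In this simplified picture, a 300-fold slower release step accounts for nearly all of the turnover time, whereas a 3-fold slower step leaves a quarter of the time attributable to chemistry, which is why release is no longer clearly dominant.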
Only the steady state kinetic parameters for CaGGTase-I have been reported, and its overall turnover rate is not significantly different from that reported for mammalian GGTase-I (0.076 s⁻¹ for CaGGTase-I versus 0.051 s⁻¹) (6, 39). The rate constants for product release are not available for CaGGTase-I. The crystal structure reveals significant variation in the exit groove compared with the mammalian enzyme (Fig. 6). In particular, residues 17-24 in CaGGTase-I, which make up one side of the putative exit groove, are positioned on average 4 Å closer to the other wall of the exit groove than is seen in the mammalian enzymes (13, 14). This arrangement narrows the groove to 6.4 Å at its narrowest point, thereby presenting a steric block to a displaced isoprenylated product modeled in a position similar to that observed in mammalian GGTase-I. If a displaced isoprenylated product were part of the CaGGTase-I reaction path, it would therefore have to bind shallowly in the exit groove. This arrangement is expected to change the rate constants for product release relative to the mammalian counterpart.

DISCUSSION
The CaGGTase-I structure reveals a high-resolution picture of a non-mammalian protein prenyltransferase. The Zn²⁺ coordination sphere, essential for all protein prenyltransferases, is completely conserved. Despite relatively low sequence identity, there is a remarkable degree of structural conservation in regions of the protein apparently non-essential for activity.
By contrast, there is variation in parts of the structure involved in the molecular recognition of both the lipid and CaaX substrates. Despite these differences in the CaaX protein substrate-binding pocket, CaGGTase-I exhibits nearly identical substrate preferences to the mammalian enzyme to the extent that different sequences have been tested (6,8).
Furthermore, the CaGGTase-I structure shows that the recognition and release of isoprenylated products vary across species. In particular, the product exit groove that is observed in the mammalian enzyme and confers an unusual interplay between substrate binding and product release is lacking in the CaGGTase-I enzyme. This exit groove may not be required for monoprenylation but may instead be a necessary feature to confer the processive diprenylation observed in Rab GGTases (13, 14).
We propose that the structural differences observed between mammalian GGTase-I and CaGGTase-I will be sufficient to devise a structure-based design strategy to develop C. albicans-selective GGTase-I inhibitors. The variation in the CaaX-binding pocket, particularly the X-residue recognition residues, as well as the uniquely shaped exit groove, provides opportunities to define ligands that will be selective for the C. albicans enzyme.