Crystal Structure of Human Squalene Synthase

Squalene synthase catalyzes the biosynthesis of squalene, a key cholesterol precursor, through a reductive dimerization of two farnesyl diphosphate (FPP) molecules. The reaction is unique when compared with those of other FPP-utilizing enzymes and proceeds in two distinct steps, both of which involve the formation of carbocationic reaction intermediates. Because FPP is located at the final branch point in the isoprenoid biosynthesis pathway, its conversion to squalene through the action of squalene synthase represents the first committed step in the formation of cholesterol, making the enzyme an attractive target for therapeutic intervention. We have determined, for the first time, the crystal structures of recombinant human squalene synthase (SQS) complexed with several different inhibitors. The structure shows that SQS is folded as a single domain, with a large channel in the middle of one face. The active sites of the two half-reactions catalyzed by the enzyme are located in the central channel, which is lined on both sides by conserved aspartate and arginine residues known from mutagenesis experiments to be involved in FPP binding. One end of this channel is exposed to solvent, whereas the other end leads to a completely enclosed pocket surrounded by conserved hydrophobic residues. These observations, along with mutagenesis data identifying residues that affect substrate binding and activity, suggest that two molecules of FPP bind at one end of the channel, where the active center of the first half-reaction is located, and that the stable reaction intermediate then moves into the deep pocket, where it is sequestered from solvent and the second half-reaction occurs. Five α helices surrounding the active center are structurally homologous to the active core in the three other isoprenoid biosynthetic enzymes whose crystal structures are known, even though there is no detectable sequence homology.

The isoprenoid biosynthetic pathway yields a structurally diverse family of low molecular weight molecules with a variety of physiological functions (1-3). In humans, the pathway produces such critical end-products as cholesterol, bile acids, dolichol, ubiquinone, steroid hormones, and prenylated proteins. Chain length-selective prenyltransferases catalyze the successive head-to-tail addition of the 5-carbon (C5-) isopentenyl diphosphate to the growing isoprene chain to form a series of linear C10-, C15-, C20-, and C25-isoprenoid diphosphates. Cyclic terpenes are formed from these linear isoprenoid diphosphates through the actions of numerous class I terpene cyclases (3). The longer 30-carbon linear terpene, squalene, of cholesterol biosynthesis and the 40-carbon linear terpene, phytoene, of carotenoid biosynthesis are formed by head-to-head condensations, respectively, of two 15-carbon and two 20-carbon isoprenoid diphosphates, through the actions of the prenyltransferases squalene synthase and phytoene synthase (4), and are subsequently converted into their respective cyclic terpenes through the action of specific class II terpene cyclases (3). Both the isoprenoid chain elongation reactions, catalyzed by prenyltransferases, and the isoprenoid cyclization reactions, catalyzed by the terpene cyclases, proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond (2,3).
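The carbon bookkeeping in this paragraph can be made explicit with a small sketch. The function names below are illustrative only and do not come from any library; the code simply counts carbons for head-to-tail elongation with C5 units and for head-to-head coupling of two identical isoprenoid diphosphates.

```python
# Illustrative carbon counting for isoprenoid assembly (all names are hypothetical).
C5_UNIT = 5  # carbons contributed by one isopentenyl diphosphate unit


def head_to_tail_chain(n_units: int) -> int:
    """Carbons in a linear isoprenoid diphosphate built from n C5 units."""
    if n_units < 1:
        raise ValueError("need at least one C5 unit")
    return n_units * C5_UNIT


def head_to_head_product(substrate_carbons: int) -> int:
    """Carbons in the product of coupling two identical isoprenoid diphosphates."""
    return 2 * substrate_carbons


linear_series = [head_to_tail_chain(n) for n in range(2, 6)]  # C10 through C25
squalene = head_to_head_product(head_to_tail_chain(3))  # two C15 (FPP) units
phytoene = head_to_head_product(head_to_tail_chain(4))  # two C20 units

print(linear_series, squalene, phytoene)
```

The two head-to-head products correspond to the 30-carbon and 40-carbon terpenes named in the text.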
Squalene synthase (SQS; EC 2.5.1.21) is a 47-kDa membrane-associated enzyme that catalyzes the reductive dimerization of two molecules of farnesyl diphosphate (FPP) in a two-step reaction to form squalene (Fig. 1). The uniqueness of the head-to-head coupling of two FPP molecules to form squalene via a stable cyclopropylcarbinyl diphosphate intermediate has elicited much mechanistic speculation over the years. The reaction proceeds in two distinct steps, both of which involve the formation of carbocationic reaction intermediates (4,5). In the first half-reaction, two molecules of FPP react to form the stable cyclopropylcarbinyl diphosphate intermediate, presqualene diphosphate (PSQPP), with concomitant release of a proton and a molecule of inorganic diphosphate. In the second half-reaction, thought to occur at a second catalytic site within the enzyme active center (6-8), PSQPP undergoes heterolysis, isomerization, and reduction with NADPH to form squalene. It has been postulated that the SQS catalytic machinery consists of two nonidentical FPP binding sites (9,10), one that binds FPP in a conformation that facilitates its cleavage to yield an allylic carbocation, and one that binds FPP in an orientation that facilitates carbocation insertion into its C2-C3 double bond (4,11). It has also been suggested that the two FPP molecules bind sequentially with the donor FPP binding first (9,12) and that translocation of PSQPP from the first to the second reaction center occurs without its release from the enzyme (12,13). Several residues thought to be involved in the SQS catalytic machinery have been identified through site-specific mutagenesis (14).
SQS catalyzes the first committed step in cholesterol formation in mammals and is an attractive site of therapeutic intervention (15-18). We undertook the structure determination of this key enzyme to aid in the design of cholesterol synthesis inhibitors of potential therapeutic importance.

EXPERIMENTAL PROCEDURES
Protein Purification and Crystallization-The protocols for expression, purification, and crystallization of the recombinant truncated form of SQS have been described previously (19). Briefly, recombinant, doubly truncated human SQS was overexpressed in Escherichia coli and purified by ion-exchange chromatography. Crystals of the protein-inhibitor complex were obtained by the hanging drop vapor diffusion method. Initially, small, single but irregular crystals were grown from drops containing protein at 1-2 mg ml⁻¹, inhibitor at a 1:1 molar ratio, and half-strength well solution, equilibrated over wells containing 25-30% polyethylene glycol 4000, 0.1 M sodium citrate, pH 5.6, and 0.2 M ammonium acetate. Large (up to 0.2 × 0.4 × 0.8 mm) crystals were then grown by microseeding crushed crystals into drops of 9-11% polyethylene glycol 4000, 0.1 M Bis-Tris, pH 5.8, 0.2 M ammonium acetate, a 1-5-fold molar excess of inhibitor, and protein at 3-5 mg ml⁻¹. The same protocol, but without the addition of inhibitor, yielded no crystals.
Data Collection and Heavy Atom Screening-Crystals were flash frozen at 100 K in a gaseous nitrogen stream following transfer to a cryostabilization solution consisting of well solution made with 27% polyethylene glycol 4000 and 5% (v/v) 2-methyl-2,4-pentanediol. Diffraction data (Table I) were collected on a RAXIS IIc imaging plate detector using CuKα x-rays from a Rigaku RU200 generator operated at 50 kV, 100 mA. Data sets were processed with DENZO and SCALEPACK (21). Crystals were soaked in ~40 different heavy atom compounds prior to collection of partial data sets. Generally, soaks of 1 mM heavy atom concentration were carried out for 24 h, with subsequent soak times and concentrations modified according to the results. An iodine derivative was obtained by co-crystallizing SQS with an iodinated analogue of CP-320473, in which an iodine was placed at the 5-position of the naphthyl ring.
Structure Solution and Refinement-The structure was solved by the method of multiple isomorphous replacement with anomalous scattering. Heavy atom sites were located from difference Patterson maps analyzed with the program PATSOL (22). Initial phases, calculated using the program MLPHARE as implemented in the CCP4 software suite (23), gave an overall figure of merit to 2.4 Å of 0.50. Threefold noncrystallographic symmetry averaging with density modification by histogram matching as implemented in DM (24) gave maps of sufficient quality to allow a nearly complete model of SQS to be built (Fig. 2c). Subsequent refinement and rebuilding were done against 2.15 Å data (Native II) using X-PLOR (25) and O (26). Water molecules were placed according to strict distance criteria and only if they refined to a temperature factor lower than 60 Å². The final model has excellent stereochemistry with no residues adopting unfavorable backbone conformations. Electron density for residues 315-327 and for the NH₂-terminal seven residues was not seen, so they were not included in the final model. Structures of complexes with other inhibitors were determined by difference Fourier methods. The missing residues (315-327) were subsequently modeled into unambiguous difference density in the SQS-CP-458003 complex structure.
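The refinement statistics reported with Table I follow the usual definitions: the R factor is the sum of |F_obs − F_calc| divided by the sum of |F_obs|, and R free is the same quantity computed on a randomly chosen 8% of reflections omitted from refinement. A minimal sketch of that calculation follows. It is not the X-PLOR implementation; the function names, the random selection scheme, and the toy amplitudes are assumptions for illustration, and no scale factor between observed and calculated amplitudes is applied.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(|F_obs - F_calc|) / sum(|F_obs|) over paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.08, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set."""
    rng = random.Random(seed)
    n_free = max(1, round(fraction * n_reflections))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Toy amplitudes, made up for illustration (not data from this study).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0] * 5
f_calc = [90.0, 85.0, 60.0, 35.0, 25.0] * 5

work, free = split_free_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R work = {r_work:.3f}, R free = {r_free:.3f} ({len(free)} test reflections)")
```

Because the test reflections never enter refinement, R free gives a check on overfitting that the working R factor cannot.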

RESULTS
Overall Structure-The structure of SQS is entirely α-helical, with the axes of all the helices somewhat aligned and arranged in three layers (Fig. 2, a and b). The first layer is formed by helices A, B, and K; the second layer contains helices E, C, J, and L; and the third layer is formed by D, F, G, H, I, and M. The protein is folded as a single domain, with a large channel running through the center, surrounded by helices C, F, G, H, and J.
Evolutionary Fold Conservation-Despite the lack of any sequence homology, the SQS core structure is similar to that of the other three class I isoprenoid biosynthetic enzymes whose crystal structures are known (Fig. 3). Of these, one is avian farnesyl-diphosphate synthase (27) (FPS), which catalyzes the synthesis of farnesyl diphosphate (the substrate of SQS) from dimethylallyl diphosphate and isopentenyl diphosphate, and the other two are cyclases that use farnesyl diphosphate as a substrate to catalyze the synthesis of cyclic sesquiterpenes: pentalenene synthase (28) (PLS) from Streptomyces and 5-epi-aristolochene synthase (29) (EAS) (3,28). The conserved feature in all the structures is an α-helical core surrounding a central active site cavity, of which one end is predominantly hydrophobic, and the other end is more hydrophilic and contains a signature "aspartate-rich sequence" (30).
Table I footnotes: ... for the intensity (I) of i observations of reflection h. Phasing power $= \langle F \rangle / E$, where $\langle F \rangle$ is the root mean square heavy atom structure factor and $E$ is the residual lack of closure error. $R_{\mathrm{factor}} = \sum |F_{\mathrm{obs}} - F_{\mathrm{calc}}| / \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are observed and calculated structure factors, respectively. $R_{\mathrm{free}}$ = R factor calculated using 8% of the reflection data chosen randomly and omitted from the start of refinement. rmsd, root mean square deviations from ideal geometry.
Inhibitor Binding Site-In squalene synthase, the central channel is solvent exposed, except at one end, which is covered by a "flap" formed by residues 50-54, which connect helices A and B (Fig. 2b). The side chain of Phe⁵⁴ forms one wall of a large hydrophobic cavity under the flap, into which the inhibitors bind (Figs. 4a and 6). This hydrophobic pocket leads to the opening of the solvent-exposed part of the channel, which runs through the middle of the protein. The other end of the channel has two conserved aspartate-containing sequence motifs (⁸⁰DTLED⁸⁴ and ²¹⁹DYLED²²³), which are located on opposite walls of the channel (Fig. 5). The residues forming the flap and those lining the central channel are from the most highly conserved regions of amino acid sequence in the squalene synthases and also show high similarity to the plant and bacterial phytoene synthases (31) (Fig. 4). As might be expected for functionally related enzymes, the conserved regions identified in the sequence comparison may therefore represent portions of the active center.
FIG. 2. ... are red, and chlorine are green. b, ribbon diagram (34,35).
FIG. 3. SQS is structurally homologous to other class I isoprenoid biosynthetic enzymes. Each protein structure was individually superimposed on SQS, and all four are shown in exactly the same orientation. The percentage of identical residues in the superimposed parts is less than 16%. Five helices surrounding the active center in each enzyme are shown as magenta cylinders, with the rest of the protein backbone in yellow. Highly conserved aspartates, which are expected to be involved in binding the diphosphate moiety of the substrates via Mg²⁺ ions (2,11,36), are indicated in ball-and-stick representation. Of the five helices forming the active core, a kink in the helix forming the base of the active center appears to be a conserved feature in all the structures, which may play a role in stabilizing cationic intermediates (details under "Discussion"). This figure was drawn using MOLSCRIPT (34) and RASTER3D (35).
FIG. 4. ... (37)) correspond to residues conserved in the squalene and phytoene synthases, colored similarly in Fig. 4b. The loop between helices A and B (residues 50-54) forms a flap over the inhibitor binding site and is excluded from the surface calculation in this figure, shown instead as a tube. Phe⁵⁴, which forms one wall of the hydrophobic pocket in which the inhibitors bind, is shown in stick form. Residues 116-119 are also excluded from the surface calculation to better show the conserved residues from helix C and helix F, which form one wall of the central cleft. b, alignment of squalene and phytoene synthase sequences. The first three sequences are squalene synthases from human (P36268), rat (Q02769), and Saccharomyces cerevisiae (P29704). The next three are phytoene synthases from tomato (P37273), Erwinia uredovora (P22872), and Rhodobacter capsulatus (P17056). The numbers in parentheses are the accession numbers of these sequences in the SWISS-PROT data base. Like SQS, phytoene synthases also catalyze a head-to-head condensation of two isoprenyl diphosphates, in this case geranylgeranyl diphosphate, to form phytoene, a 40-carbon terpene (4,31). Two regions in the sequence that are highly conserved in the squalene synthases and not at all in the phytoene synthases are colored magenta and pale green. Five other colored regions identify residues that are highly conserved in both enzymes. The arrows mark the amino and carboxyl termini of the recombinant construct of human SQS used for crystallization.
The structures of three different inhibitor complexes are described. The two more potent compounds, CP-320473 and CP-424677 (IC₅₀ values of 56 and 32 nM, respectively, measured as described previously (15)), fill the cavity under the flap with bulky hydrophobic groups (naphthyl and chlorophenyl groups in CP-320473 and biphenyl and dichlorophenyl groups in CP-424677) (Fig. 6, a and c). The third compound, CP-458003, which is an analog of CP-320473 lacking the naphthyl group, is significantly less potent (IC₅₀ = 30 μM) and fills only a part of the cavity under the flap (Fig. 6b). The residues forming the flap and the side chain of Tyr⁷³ show significant conformational differences in the three inhibitor complexes (Fig. 6). It is possible that common inhibitor interactions within this binding site stabilize an otherwise flexible region, because we were unable to grow crystals of the apo-enzyme, even with further screening.
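To put "significantly less potent" on a numerical footing, the IC₅₀ values quoted above can be compared directly. The sketch below assumes the value for CP-458003 is 30 micromolar, so that all three values can be expressed in nanomolar; the dictionary and function names are hypothetical.

```python
# IC50 values as quoted in the text, converted to nM.
# Assumption: the CP-458003 value is in micromolar.
NM_PER_UM = 1000.0

ic50_nm = {
    "CP-320473": 56.0,
    "CP-424677": 32.0,
    "CP-458003": 30.0 * NM_PER_UM,
}


def fold_weaker(compound, reference):
    """How many times higher the IC50 of `compound` is than that of `reference`."""
    return ic50_nm[compound] / ic50_nm[reference]


for ref in ("CP-320473", "CP-424677"):
    ratio = fold_weaker("CP-458003", ref)
    print(f"CP-458003 vs {ref}: {ratio:.0f}-fold higher IC50")
```

Under that unit assumption, removing the naphthyl group costs between two and three orders of magnitude in potency, consistent with the partial filling of the cavity seen in the structure.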
First Half-reaction-In the first half-reaction (Fig. 1), the condensation of two molecules of FPP forms a cyclopropyl-containing intermediate, presqualene diphosphate, through a unique 1′-2-3 prenyl transferase reaction (4,11). Class I isoprenoid biosynthetic enzymes contain a DDXXD sequence motif that binds the diphosphate moiety of the substrates via Mg²⁺ ions, facilitating phosphate release (2,11,30). Structural superposition of SQS on FPS shows that the two conserved DDXXD sequence motifs in FPS (Asp¹¹⁷-Asp¹²¹ and Asp²⁵⁷-Asp²⁶¹) overlap with two conserved aspartate-rich sequences, ⁸⁰DTLED⁸⁴ and ²¹⁹DYLED²²³, on helices C and H, respectively, in SQS (Fig. 3). These two motifs are on opposite sides of the central cleft with the carboxylate groups on the aspartates and glutamates all pointing into the cleft (Fig. 5). EAS and pentalenene synthase also use FPP as a substrate and have one conserved DDXXD motif, which overlaps spatially with the ⁸⁰DTLED⁸⁴ sequence in SQS (Fig. 3). In all three enzymes, the aspartate side chains are involved in binding multiple Mg²⁺ ions that stabilize binding of diphosphate groups in the substrates. Based on this superposition, it is highly likely that the sequences ⁸⁰DTLED⁸⁴ and ²¹⁹DYLED²²³ in SQS bind the diphosphates of two substrate FPP molecules via bridging Mg²⁺ ions, even though they do not exactly match the consensus DDXXD motif. The isoprene tails of both FPP molecules would extend toward the hydrophobic "upper" end of the channel (Fig. 4). Further corroboration of the importance of the aspartates in the ²¹⁹DYLED²²³ sequence is provided by site-directed mutagenesis experiments using rat SQS (14). Any change to Asp²¹⁹ or Asp²²³, including substitution with longer acidic residues (D219E, D223E), neutral residues (D219N, D223N), or positively charged residues, resulted in a total loss of activity (14). Based on the crystal structure, we would predict a similar result with the aspartates in the ⁸⁰DTLED⁸⁴ sequence.
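The statement that the two SQS sequences are aspartate-rich but do not exactly match the DDXXD consensus can be checked mechanically. The sketch below is illustrative only: the regular expression encodes DDXXD with X as any residue, and the helper `acidic_positions` is a hypothetical name introduced here to list where Asp or Glu occurs within each five-residue motif.

```python
import re

# Consensus aspartate-rich motif, with X standing for any residue.
DDXXD = re.compile(r"DD..D")


def acidic_positions(motif):
    """1-based positions of Asp (D) or Glu (E) within a motif string."""
    return [i + 1 for i, aa in enumerate(motif) if aa in "DE"]


# The two SQS motifs named in the text, keyed by residue range.
sqs_motifs = {"80-84": "DTLED", "219-223": "DYLED"}

for span, motif in sqs_motifs.items():
    matches = DDXXD.fullmatch(motif) is not None
    print(f"{span} {motif}: matches DDXXD = {matches}, "
          f"acidic positions = {acidic_positions(motif)}")

# For comparison, the consensus itself has acidic residues at positions 1, 2, and 5.
print("DDXXD acidic positions =", acidic_positions("DDXXD"))
```

Both SQS motifs keep three acidic residues in a five-residue window with Asp at the first and last positions; the internal acidic residue sits at position 4 (Glu) rather than position 2 (Asp).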
The structure also suggests how the prenyl donor and prenyl acceptor FPPs bind. A large hydrophobic cavity capable of accommodating the 30-carbon intermediate PSQPP is present on the same side of the cleft as the ⁸⁰DTLED⁸⁴ motif, which suggests that the prenyl acceptor binds on this side of the cleft. This cavity is filled by the naphthyl, chlorophenyl, and isobutyl groups of the inhibitor CP-320473 (Fig. 6a). X-ray structures obtained by soaking allylic substrates into crystals of FPS (32) revealed that the growing isoprenyl diphosphate was bound to Asp¹¹⁷-Asp¹²¹, which corresponds to Asp⁸⁰-Asp⁸⁴ in SQS (Fig. 4), further suggesting Asp⁸⁰-Asp⁸⁴ as the binding site for the phosphate groups of the prenyl acceptor. Mutational analysis of rat SQS also showed that mutations of Tyr¹⁷¹, including Y171F, Y171W, and Y171S, resulted in complete loss of activity (14). Tyr¹⁷¹ is on helix G2, which forms the "base" of the phosphate binding pocket (Fig. 5). The phenolic hydroxyl points into the central cleft and is 3.0 Å from the carboxylate oxygen of Glu⁸³, suggesting a catalytic role for this tyrosine, which would explain the mutagenesis data. Also likely to play a key role in catalysis of the first half-reaction are the conserved arginines Arg²¹⁸ and Arg²²⁸, located adjacent to the second aspartate-rich domain. The side chains of these residues also extend into the cleft, near the side chains of the conserved aspartate residues (Fig. 5), suggesting that they may stabilize the diphosphate leaving group by forming electrostatic or hydrogen-bonding interactions.
Second Half-reaction-The product of the first half-reaction is the stable intermediate presqualene diphosphate, which is rearranged and reduced by NADPH to form squalene in a second reaction. Several mechanisms for the conversion of PSQPP to squalene, based on known rearrangements of cyclopropylcarbinyl cations, have been proposed (4,11). To prevent these highly reactive carbocations from being prematurely quenched by solvent, the catalytic mechanism of SQS must include effective shielding from water. Inspection of the binding site for the inhibitor CP-320473 suggests that the cavity that accommodates the carbocationic intermediates is likely to be defined by two pockets that hold the naphthyl and chlorophenyl groups of CP-320473. These pockets form the upper end of the central cleft in SQS and are formed by the amino acids constituting part of the A-B loop (⁵⁰TSRSF⁵⁴), which we term the flap (Fig. 2b), and parts of helices C, G1, G2, and J. The specific residues that line these pockets (Phe²⁸⁸, Cys²⁸⁹, Pro²⁹², Val¹⁷⁹, Leu¹⁸³, Tyr⁷³, Phe⁵⁴, and Leu²¹¹) are predominantly hydrophobic and completely conserved in all known SQS sequences. Partial reaction analysis of site-directed mutants of rat SQS (14) showed that Phe²⁸⁸ was critical for the second half-reaction. Even conservative mutations such as F288W and F288L showed a complete loss of second reaction activity while retaining first reaction activity. The structure shows that the Phe²⁸⁸ side chain forms one wall of the hydrophobic cavity and could stabilize one of the carbocationic intermediates in the second half-reaction. Cation-π interactions for the stabilization of carbocation intermediates have been proposed for several terpenoid polyene cyclases (28,29,33).
DISCUSSION
In this work, by determining the three-dimensional structure of squalene synthase complexed with various inhibitors, we have identified some of the key residues involved in catalysis and provided a structural framework for building a detailed mechanistic model of the two-step reaction catalyzed by this enzyme. Comparison with structures of other isoprenoid biosynthetic enzymes reveals a common folding architecture at the catalytic core and a conserved aspartate-rich sequence motif at the active site designed to bind prenyl phosphates via bridging Mg²⁺ ions. Another common feature of all class I isoprenoid biosynthetic enzymes is the formation of a relatively stable allylic carbocation species by the release of the phosphate group. Another conserved structural feature suggests a common mechanism by which the primary carbocation could be stabilized in all the enzymes of this family. Superposition of SQS on FPS, EAS, and pentalenene synthase reveals a kink in the helix (helix G in SQS) forming the base of the catalytic cleft in each of the enzymes (residues Val¹⁷⁵-Ala¹⁷⁶ in SQS, Lys²¹⁴-Thr²¹⁵ in FPS, Thr⁴⁰¹-Thr⁴⁰² in EAS, and Ile¹⁷⁷-Gly¹⁷⁸ in pentalenene synthase), suggesting a common functional role for this feature (Fig. 3). One consequence of such a break in a helix is that backbone amide nitrogens and carbonyl oxygens are available for hydrogen bonding to other ligands, as they are not involved in making hydrogen bonds within the helix.
FIG. 5. Proposed active site for the first half-reaction. A close-up of the solvent-exposed, lower end of the central cleft. Conserved aspartates and arginines and Tyr¹⁷¹ (discussed under "Results") are shown in stick representation. The inhibitor, CP-320473, is also shown in stick representation, and it makes no interactions with the residues at this site, being buried in the hydrophobic upper half of the channel (Fig. 6).
The kink in helix G forms a shallow depression at the mouth of the hydrophobic pockets that are formed under the flap in squalene synthase. The backbone carbonyl atoms of residues 175 and 176 point into the central cavity, forming H-bonds with bound water molecules. In the proposed mechanism for the reaction catalyzed by EAS (29), the allylic carbocationic intermediate is positioned near the main chain carbonyl oxygens of residues 401 and 402, at the bend in helix G. A similar bend at an analogous position suggests that the same role may be played by the backbone carbonyl oxygens of residues 175 and 176 in SQS. The side chain of Arg⁷⁷, which is completely conserved in all the squalene and plant phytoene synthase sequences (31) (Fig. 4), points into the same pocket from the opposite direction, suggesting that it may play a role in directing the released phosphate group away from the carbocation.
There is no evidence of a characteristic nucleotide-binding motif in the structure, which would help define the NADPH-binding site, but it is tempting to consider the J-K loop (³¹⁴VKIRK³¹⁸) and part of helix K as making up part of the nucleotide binding site. This domain is conserved in the squalene synthases but not in the plant and bacterial phytoene synthases, which do not require a nucleotide co-factor (31) (Fig. 3). It is also the most flexible part of the structure; indeed, residues 315-327 are so disordered that they were not seen in the electron density maps calculated for the CP-320473 structure. The co-crystal structure with the inhibitor CP-458003 showed that these residues form a loop and a helix, which runs over the top of the central catalytic cleft. It is conceivable that this part of the structure is inherently flexible and is stabilized by NADPH binding.
Native SQS is membrane-bound, but the recombinant construct used for crystallization was truncated at both NH₂ and COOH termini to generate soluble, active protein (19). Our structure shows that the NH₂- and COOH-terminal ends of the truncated protein are on the same face of the protein (the top as viewed in Fig. 2, a and b), which suggests that the membrane is closer to the "top" end of the protein. This would be consistent with the general directionality of the reaction, in which the hydrophilic substrates (FPP) bind to the "lower" end of the central cleft, and the lipophilic product, squalene, leaves from the upper end, closer to the membrane.
The structure described here provides evidence that SQS has the same chain fold and overall topology as other class I prenyltransferases and terpene cyclases, suggesting an evolutionary relationship, despite the lack of sequence conservation. However, in addition to these structural similarities, the structure described here also demonstrates key differences between the active center regions of SQS and other class I prenyltransferases and terpene cyclases, consistent with the unique reaction catalyzed by SQS. These observations form the basis for further evaluation of the dynamics of the unique reaction catalyzed by human SQS, using site-directed mutagenesis together with modified substrates and substrate analogs, to allow more detailed mechanistic models to be inferred. Such studies are currently underway. In addition, information obtained from these and future explorations into the structural requirements for SQS-mediated conversion of FPP to squalene will open up the possibility of modifying the substrate specificity of SQS, through site-directed mutagenesis of the catalytic cleft, to create novel terpene products. Finally, the structural information presented here makes it possible to better understand how the diverse, structurally distinct SQS inhibitors (15-18) interact with the enzyme active center to interfere with catalysis.