The Crystal Structure of Gankyrin, an Oncoprotein Found in Complexes with Cyclin-dependent Kinase 4, a 19 S Proteasomal ATPase Regulator, and the Tumor Suppressors Rb and p53*

Gankyrin is a 25-kDa hepatocellular carcinoma-associated protein that mediates protein-protein interactions in cell cycle control and protein degradation. It has been reported to form complexes with cyclin-dependent kinase 4, retinoblastoma protein, the S6b ATPase subunit of the 19 S regulator of the 26 S proteasome, and Mdm2, an E3 ubiquitin ligase involved in p53 degradation. It is the first protein described to bind both to the 26 S proteasome and to proteins in other complexes containing cyclin-dependent kinase(s) and p53 ubiquitylating activities, thus providing a mechanism for delivering cell cycle regulating machinery and ubiquitylated substrates to the proteasome for degradation. Gankyrin contains a 33-residue motif known as the ankyrin repeat that occurs five and a half to six times in the sequence. As a step toward understanding gankyrin interactions with its protein partners we have determined its three-dimensional crystal structure to 2.0-Å resolution. It reveals

Gankyrin is overexpressed in human hepatocellular carcinomas, its name deriving from this observation (gann means cancer in Japanese) and the presence of five and a half to six ankyrin repeats in its sequence (1,2). The gankyrin gene is one of the first genes to be overexpressed in a rodent model of hepatocarcinogenesis (3). Its overproduction leads to transformation of cultured NIH-3T3 cells and induces tumor formation in nude mice. Gankyrin overexpression correlates with hyperphosphorylation and degradation of the retinoblastoma protein, Rb, a tumor suppressor. Gankyrin has been shown to bind to the cyclin D-dependent kinase, cyclin-dependent kinase 4 (CDK4), a cell cycle regulator, one of whose substrates is Rb (4,5). Phosphorylation of Rb is an important step in enabling progression through the cell cycle, and thus, the activities of the relevant kinases are carefully controlled. Binding of gankyrin to CDK4¹ prevents the latter from binding to p16INK4 (5). p16INK4 is a member of the INK4 family of proteins, which inhibit CDKs 4 and 6 and, thus, themselves have tumor suppressor functions. As a p16INK4 antagonist, gankyrin functions as a promoter of cell cycle progression.
Gankyrin has also been shown in yeast two-hybrid (Y2H) screens and biochemically to form a complex with the human S6b ATPase subunit of the 19 S regulatory complex of the 26 S proteasome. S6b is one of six non-redundant ATPases, all belonging to the AAA (ATPases associated with diverse cellular activities) superfamily (6). Among ATP-dependent proteases, these hexameric ATPase complexes are associated with substrate unfolding and translocation into an enclosed chamber where proteolysis occurs. In the case of the 26 S proteasome, degradation takes place in the 20 S core. The only other protein shown to interact with gankyrin in a Y2H screen and biochemically is the melanoma antigen MAGE-A4. This interaction suppresses the oncogenic activity of gankyrin (7). Recently, it has been shown that gankyrin is a protein cofactor that controls the p53 ubiquitylation activity of Mdm2 in a p300-independent manner.² Gankyrin is the first protein described to bind both to the 26 S proteasome and to proteins in other complexes containing cyclin-dependent kinase(s) and p53 ubiquitylating activities. It therefore provides a mechanism to bring cell cycle regulating machinery and ubiquitylated substrates directly to the proteasome for substrate degradation. This, coupled with the increasing evidence for the role of the ubiquitin proteasome pathway in DNA repair (9) and transcription (10), indicates the need to solve the structure of gankyrin to understand its functions in distinct complexes involved in cell cycle control and apoptosis. As a step toward understanding the interactions of gankyrin with its partner proteins and its role in cellular transformation, we have determined the crystal structure of gankyrin at 2.0-Å resolution.

EXPERIMENTAL PROCEDURES
Structure Determination-Gankyrin was expressed as a thrombin-cleavable His6-tagged form from a pET28a plasmid derivative and purified by nickel chelation chromatography. Thrombin digestion produces a polypeptide with 3 N-terminal residues, GSH, attached to the 226 residues of gankyrin. Native gel analysis revealed a single band or multiple bands according to whether dithiothreitol (5 mM) was present, prompting us to treat the protein with iodoacetic acid to carboxymethylate exposed cysteines. After passage of iodoacetic acid-modified gankyrin through a Superdex 75 16/60 column the protein appeared homogeneous, and electrospray ionization mass spectrometry suggested that two of the five cysteines had been carboxymethylated. The resulting protein was crystallized from polyethylene glycol 5000 monomethyl ether (11). A 2.0-Å resolution data set was collected from these crystals on beamline ID29 at the European Synchrotron Radiation Facility, Grenoble (Table I).
The structure was solved by molecular replacement in the program AMoRe (12) using as the search model a consensus ankyrin repeat protein, E3.5 (PDB entry 1mj0), derived from a combinatorial library with a theoretical complexity of 3.8 × 10²³ molecules, each consisting of a fixed ankyrin repeat framework but otherwise randomized sequence (13). E3.5 and gankyrin have 31.4% sequence identity. The molecular replacement solution was very clear, as evidenced by the correlation coefficient of 41.6% and the R factor of 49.7% for the correct solution, which stood out clearly from the values of 31.4% and 56.0%, respectively, for the next best solution.
The initial electron density maps were of excellent quality, allowing 75% of the model to be built interactively in QUANTA (Accelrys, San Diego, CA). The model was completed by one round of automatic model building in ARP/wARP (14) and manual water building in QUANTA. The REFMAC (15)-refined model contains residues 4-226 of gankyrin and 58 water molecules (Table I) and is deposited in the PDB with the accession code 1U0H.
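The two statistics quoted for the molecular replacement solution can be illustrated with a minimal sketch. It assumes a conventional R factor, R = Σ||Fo| − k|Fc|| / Σ|Fo| with a single linear scale factor k, and a Pearson correlation coefficient computed on structure factor amplitudes. The exact definitions used by AMoRe (scaling, resolution limits, amplitudes versus intensities) may differ, and the function names and toy amplitudes below are hypothetical.

```python
import math


def r_factor(f_obs, f_calc):
    """R = sum(|Fo - k*Fc|) / sum(Fo), with k chosen so the amplitude sums match.

    Assumption: one overall linear scale factor; real programs may scale differently.
    """
    scale = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def correlation_coefficient(f_obs, f_calc):
    """Pearson correlation between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum((fo - mean_o) * (fc - mean_c) for fo, fc in zip(f_obs, f_calc))
    var_o = sum((fo - mean_o) ** 2 for fo in f_obs)
    var_c = sum((fc - mean_c) ** 2 for fc in f_calc)
    return cov / math.sqrt(var_o * var_c)


# Toy amplitudes (hypothetical, not gankyrin data).
observed = [10.0, 20.0, 30.0]
calculated = [12.0, 18.0, 30.0]
print(f"R = {r_factor(observed, calculated):.3f}")
print(f"CC = {correlation_coefficient(observed, calculated):.3f}")
```

A correct placement of the search model should raise the correlation coefficient and lower the R factor relative to incorrect placements, which is the contrast reported between the top solution and the next best one.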

RESULTS AND DISCUSSION
Overall Fold-The crystal structure of gankyrin was solved by molecular replacement using the coordinates of a consensus ankyrin repeat protein, E3.5, as a search model (13). It is interesting that the structure of an artificial protein with no function provided the clearest solution to the structure of an oncoprotein that functions at the heart of the biochemistry of the cell. The refined structure consists of residues 4-226. There is no electron density associated with residues 1-3 (and 3 residues derived from the tag), and it is assumed these residues are disordered. Gankyrin contains 14 α-helices arranged in seven pairs and separated from one another by six β-hairpins as shown in Fig. 1, A-D.
The repetitive character of the structure is clearly apparent. The five clear-cut ankyrin repeat sequences ANK1-ANK5 (Fig. 1E) each adopt the classical ankyrin repeat fold, justifying the naming of the protein. Each consists of two anti-parallel α-helices and a perpendicularly oriented short β-hairpin. Pairs of adjacent ankyrin repeat elements line up side-by-side so that their α-helices form a 4-helix bundle and their β-hairpins extend into an anti-parallel β-sheet.
The flanking sequences at the N and C termini also adopt ankyrin repeat structures. Residues 202-226 form a β-hairpin followed by a helix-loop-helix, although the first of the two helices is somewhat abbreviated (Fig. 1, C and D). These elements of structure continue the regular packing arrangement and constitute ANK6 even though the sequence deviates significantly from the ankyrin repeat consensus. Residues 4-36 clearly add a seventh structural repeat element, which we will call ANK0. They fold to form a helix-loop-helix followed by a 3₁₀ helix-like turn leading into ANK1. Thus, at the structural level, the ankyrin repeat pervades the molecule in its entirety. The peripheral ANK elements use only one of their lateral faces to pack with their ANK neighbors; the opposite face is exposed and, therefore, able to tolerate sequence variation.
Overall the structure has a shape resembling a breaking wave or cupped hand. In the latter context, there is a noticeable curvature across the palm so that the surface created by the β-hairpins and the first of the ANK α-helices is concave, whereas that formed by the second of the ANK helices is convex. This type of curvature has been observed in other ankyrin repeat structures.
The Ankyrin Repeats of Gankyrin-The ankyrin repeats in gankyrin are aligned against a consensus ANK sequence and the secondary structure assignments in Fig. 1E. Asp1 of the consensus is conserved in four of the ankyrin repeats of gankyrin; it is replaced by Asn in two of the others. Asp1/Asn1 sits below the β-turn with its carboxylate/amide C=O forming hydrogen bonds with the main chain amide N-H of residues 2 and 4. The other carboxylate oxygen/amide NH2 forms one or more hydrogen bonds to side chains of various types (Ser, Arg, Trp, Tyr, Asn) at position 10 of its own ANK repeat, ANK(n), or at position 5 of the preceding ANK motif, ANK(n-1), or, in one case, to a water molecule. The Thr6-Pro7-Leu8-His9-Leu10-Ala11 sequence is fully conserved in ANK5 and recognizable in all but ANK0. This motif spans the beginning of the first α-helix (α1) and is largely buried. The side chain of Thr6/Ser6 forms a hydrogen bond with Nδ of His9, whose Nε-H forms a hydrogen bond to a main chain carbonyl of ANK(n+1). The Pro7/Ala7 residue at the start of α1 contributes to the hydrophobic packing interaction with α2 of ANK(n) as well as with α1 of ANK(n-1). The side chains of the helical residues Leu8/Met8 and, to a lesser extent, Ala11 are at the heart of the local four-helix bundle formed by ANK(n) and ANK(n+1). The position 10 side chain is exposed on the surface of the molecule, and this residue is highly variable among the ankyrin repeats of gankyrin, appearing as the consensus Leu only in ANK5; elsewhere it is Trp, Ile, Tyr, Arg, and Val. Gly15 appears in the short loop connecting the two ANK helices, where it is preferred presumably because of its flexibility. Residues 19 and 20 at the start of α2 are Val in the consensus ankyrin repeat, although in gankyrin the residues at these positions are often other hydrophobic residues and occasionally polar residues.
The side chains project into the protein core so that the residue 19 side chain interacts with the residue 20 side chain of ANK(n-1) and the residue 20 side chain interacts with that of residue 19 in ANK(n+1), so that as a set these residues form a hydrophobic continuum across the molecule. The Glu19 of ANK5 is oriented with its carboxylate exposed on the bottom face of the molecule so that its aliphatic portion contributes to the apolar core. Otherwise the requirement for apolar residues is relaxed only at Glu19 and Lys20 of the first and the last ANK repeats, respectively, which do not participate in packing with ANK neighbors and instead are exposed on the flanks of the molecule.
Residues 22, 23, and 24 are each Leu in the ankyrin repeat consensus, and in gankyrin there is absolute conservation of hydrophobic character in the internal ANKs 1-5. The central Leu/Val side chains are at the protein core, whereas the adjacent hydrophobic residues are laterally oriented and form interactions with the α2 helices of adjacent ANK repeats. Again, there is no requirement for hydrophobic residues at Glu22 and Ser23 of ANK0 and Glu24 of ANK6 as the packing constraints disappear in the absence of a second flanking ANK element.
The Gly27-Ala28-Asp29-Val30-Asn31-Ala32 segment connects α2 to the β-hairpin, the polypeptide following a remarkably similar course in all of the ankyrin repeats of gankyrin (Fig. 1, B-D). The Gly27 residue is present in four of the ANKs and replaced in the two other instances by lysines whose side chains project from the back face of the molecule. The Ala28 side chain points toward the center of the molecule to contribute to the hydrophobic core; a Val substitution in ANK1 is easily accommodated, whereas the Ser side chain at this position in ANK0 is exposed on the side of the molecule. The Asp29 consensus residue does not appear in gankyrin at all, although polar residues often take its place at this exposed position on the top surface of the molecule. The main chain twists at this point so that the side chains of ANK residues 28-30 generally lie across the top face of the molecule, and these positions show only moderate obedience to the consensus. ANK0 contains an extra residue in this region.
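The position-by-position comparison with the consensus described in this section can be sketched in a few lines. The table below contains only the consensus residues named in the text (positions 1, 6-11, 15, 19, 20, 22-24, and 27-32 of the 33-residue repeat); all other positions are left unspecified. The function names and the toy repeat are hypothetical, and the toy sequence is not a gankyrin sequence.

```python
# Consensus residues at the ANK positions named in the text (1-based numbering).
CONSENSUS = {
    1: "D", 6: "T", 7: "P", 8: "L", 9: "H", 10: "L", 11: "A",
    15: "G", 19: "V", 20: "V", 22: "L", 23: "L", 24: "L",
    27: "G", 28: "A", 29: "D", 30: "V", 31: "N", 32: "A",
}
REPEAT_LENGTH = 33


def consensus_matches(repeat):
    """Return the consensus positions at which a 33-residue repeat matches."""
    if len(repeat) != REPEAT_LENGTH:
        raise ValueError("expected a gap-free 33-residue repeat")
    return [pos for pos, aa in sorted(CONSENSUS.items()) if repeat[pos - 1] == aa]


def toy_repeat(substitutions=None):
    """Build a toy repeat: consensus where defined, 'X' elsewhere (hypothetical)."""
    residues = ["X"] * REPEAT_LENGTH
    for pos, aa in CONSENSUS.items():
        residues[pos - 1] = aa
    for pos, aa in (substitutions or {}).items():
        residues[pos - 1] = aa
    return "".join(residues)


# A toy repeat with Lys at position 29, where the text notes the consensus Asp is absent.
variant = toy_repeat({29: "K"})
matched = consensus_matches(variant)
print(f"{len(matched)} of {len(CONSENSUS)} listed consensus positions match")
```

Because the sketch assumes a gap-free 33-residue repeat, a repeat with an insertion, such as ANK0 with its extra residue, would first need to be aligned to the consensus.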
FIG. 1. Structure of gankyrin. A, stereo view of the 2Fo − Fc electron density maps contoured at the 1σ level and displayed on residues 74-78 of ANK2 and 107-111 of ANK3. B-D, orthogonal ribbon representations of gankyrin. The polypeptide chain is color-ramped from its N terminus in blue to the C terminus in red. E, alignment of the five full ankyrin repeat sequences (ANK1-ANK5) together with sequences at the N (ANK0) and C (ANK6) termini of the molecule, which in the structure adopt the ankyrin repeat fold. The schematic below the alignment indicates the span of the α-helical and β-strand segments of the structure. A consensus ankyrin repeat sequence (16) is shown below the structure. Residues that match the consensus are in blue. F, van der Waals surface representation of gankyrin, with residues colored according to side chain polarity: white, apolar; green, neutral polar; red, acidic; blue, basic. The view is of the concave surface of the molecule in the same orientation as in D.

Implications for Interactions with Partner Proteins-It is not possible on the basis of the crystal structure of gankyrin to assert how this protein recognizes its protein partners. Analysis of the structures of ankyrin repeat protein complexes shows some variety in the extent and mode of interactions that mediate partner recognition (Fig. 2; Ref. 16).
In the p53·53BP2 complex the ankyrin repeat domain of 53BP2 has only a minor role in binding to p53, with the β-hairpin of the last of the four ankyrin repeats, ANK4, contacting p53 (Fig. 2A). The principal protein-protein interactions here are mediated by the downstream SH3 domain of 53BP2 (17). In GABPα, a transcription factor of the Ets domain family, interactions with the accessory factor GABPβ are mediated by the residues on the tips of four consecutive β-hairpin fingers and by residues on the exposed face of the α-helices in the ankyrin groove; in both cases these represent regions where sequence is not conserved in the ANK motif (Fig. 2B; Ref. 18). More extensive regions of the ankyrin repeat element are employed in p16INK4a binding to CDK6 (Fig. 2C; Ref. 19). As for GABPα·GABPβ, the helical face of the ankyrin groove for three of the repeats together with the β-hairpins clamp the N-terminal lobe of CDK6. In this case the connecting loops between the helical pairs in each ankyrin repeat element extend the contact surface with the target protein. The most extensive interface so far described for an ankyrin repeat protein is in the complex between IκBα and NF-κB (Fig. 2D; Refs. 20 and 21). Here all six ankyrin repeat elements of IκBα make a variety of contacts to the nuclear localization signal, dimerization interface, and DNA binding domain of NF-κB. These involve extensive surfaces of the ankyrin groove including α1 and the β-hairpins of the majority of IκBα ANKs, with additional contacts involving the exposed lateral face of ANK1.
The ankyrin groove is, therefore, the most likely surface to mediate interactions with gankyrin partner proteins. As already indicated this surface is composed of residues from the α1 helices and the β-hairpins. It is decorated with the side chains of residues that lie largely outside the consensus motifs, in particular those at ANK positions 2, 3, 5, 10, 13, and 14 (Fig. 1, E and F). Examination of a calculated electrostatic surface for gankyrin reveals a single remarkable feature, a patch of negative electrostatic potential on and beneath the β-hairpin rim of ANKs 5 and 6 (Fig. 3). This surface may interact with a complementary surface with positive potential on gankyrin target protein(s).
GST pull-down, immunoprecipitation, and yeast two-hybrid assays have been performed on a series of deletion mutants of gankyrin to define regions of the structure required for interactions with partner proteins (4, 7).² These experiments have shown that full-length gankyrin is required for interactions with Rb, Mdm2, S6b, and MAGE-A4. This may be because the interaction surface on gankyrin spans the full length of the molecule. Alternatively, deletion of the N-terminal extension or of one or more ankyrin repeats may expose otherwise buried non-polar surfaces, decreasing the stability of the folded molecule. Arguing against this is the observation that truncated gankyrin proteins containing only the first three and four ANKs retain CDK4 binding activity comparable with that of the intact protein (5). Interestingly, these CDK4-binding deletion mutants lack the LXCXE element discussed below.
It has been proposed that residues Leu178-X-Cys180-X-Glu182 of gankyrin constitute a motif used to mediate interactions with the retinoblastoma protein, Rb (1,5). The crystal structure of the pocket domain of Rb bound to a synthetic peptide (DLYCYEQLN) derived from the E7 protein of human papilloma virus (Fig. 4A) emphasizes the importance of the LXCXE motif in binding (23).
In this complex the peptide binds in an extended β-strand type conformation such that the alternate Leu, Cys, and Glu side chains point into the protein and make specific contacts to it (Fig. 4A). Leu and Cys bind in hydrophobic pockets of complementary shape, whereas the Glu side chain forms hydrogen bonds to main chain amide groups of residues at the terminus of an α-helix; the strength of these interactions is augmented by the positive helix dipole (23).
A peptide spanning the LXCXE motif from HPV E7 inhibits ... Similarly, a peptide encompassing residues 176-185 of gankyrin disrupted Rb-gankyrin interactions in pull-down assays (5). Three peptides in which the Leu, Cys, and Glu residues were individually swapped for Ala residues failed to block Rb-gankyrin binding. In gankyrin, the Leu178-Ala-Cys180-Asp-Glu182 sequence is embedded in helix 1 of ANK5 (Fig. 4B). Although the side chain of Glu182 is exposed, those of Leu178 and Cys180 are partially and fully buried, respectively. For the LXCXE motif to bind to Rb in the same manner as the HPV E7 peptide, a major conformational change would be required, necessitating significant unfolding of gankyrin. This seems implausible. Moreover, as discussed above, there is no precedent for such a mode of binding by ankyrin-repeat proteins to their partners. It seems, therefore, that the presence of an LXCXE motif in the gankyrin sequence is coincidental and that it is not used to bind to Rb in the same manner as that of the HPV E7 peptide. It is very likely nevertheless that the LXCXE peptide-binding site and the gankyrin-binding site on Rb overlap. It is also likely that the side chains of Leu178 and Glu182 contribute to the surface of the groove in gankyrin that is used in partner binding (Fig. 4B); moreover they are adjacent to the prominent negatively charged surface (Fig. 3). Consistent with their involvement in binding Rb, Glu182 to Ala and Leu178 to Ala mutations in gankyrin abolish and diminish Rb binding, respectively (1).
The Importance of Gankyrin Structure for Understanding Cell Division-Gankyrin competes with the multiple ankyrin-repeat pINKs (5) to modulate cyclin-dependent kinase activity. Emphasizing the versatility of the ankyrin repeat scaffold as a molecular recognition surface, it appears that despite their similar folds (Figs. 1 and 2C) and common target, p16INK4 and gankyrin have overlapping rather than identical binding sites on CDK4. As a result, unlike p16INK4, gankyrin does not inhibit the kinase activity of CDK4. Gankyrin also appears to have separate functions in regulating the activities of other protein complexes involved in cell cycle control and apoptosis. Gankyrin directly interacts with Mdm2 to cause the polyubiquitylation and degradation of p53 in a p300-independent manner.² The cupped hand structure of gankyrin may be sufficiently promiscuous to be involved in the regulation of ubiquitylation by other ubiquitin ligases and, therefore, control other activities in DNA repair, transcription, and cell cycle control. The gankyrin orthologue in yeast, Nas6p, which was identified as an Rpt3-interacting protein (24), has also been crystallized (25). Its structure, reported in the accompanying paper, is very similar to that of gankyrin (root mean square deviation = 1.7 Å for 224 common Cα atoms).
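The root mean square deviation quoted for the gankyrin and Nas6p comparison can be illustrated with a minimal sketch. It assumes that the two sets of Cα coordinates are already paired one-to-one and optimally superposed; the superposition step itself (for example, a least-squares fit) is not shown. The function name and the coordinates are hypothetical and are not taken from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed coordinates.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates (hypothetical): every atom displaced by 1 A along z.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"RMSD = {rmsd(set_a, set_b):.2f} A")
```

For the comparison in the text, the two lists would each hold the 224 common Cα positions after superposition.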
Gankyrin-protein complexes appear to have a central role in gene expression and apoptosis. Therefore, solving the structure of gankyrin is important not only for understanding the regulation of Rb and p53 functions but also to begin to consider strategies for the design of compounds that block gankyrin activities and provide a basis for drug development for therapeutic intervention in hepatocellular carcinoma. Currently, there are no effective treatments for the growing global problem of liver cancer.
It is clear from this and other recent studies that the presence of ankyrin repeats in a protein sequence allows confident prediction of fold, although little can be deduced regarding partner recognition and function. The next step in understanding molecular recognition in gankyrin is to determine the structure of this oncoprotein in complex with partner proteins or, failing this, fragments or even peptides derived from these proteins, which carry the gankyrin recognition determinants.

FIG. 4. In A, all the Rb protein atoms are colored pink, with the HPV E7 peptide atoms colored according to element. The Leu, Cys, and Glu side chains of the E7 peptide are buried by their interaction with Rb. In B, gankyrin atoms are colored pink except for those of Leu178, Cys180, and Glu182, which are colored by element. The Cys180 side chain is buried in the protein core.