Molecular Motions of Human Cyclin-dependent Kinase 2*

We present a comprehensive description of the dynamic behavior of CDK2 in complex with cyclin A, arrived at by analysis of a total of 0.25 μs of solvated molecular dynamics trajectories and 42 deposited CDK2 structures, and refined using other protein simulation algorithms. The CDK2-cyclin A dimer is a dynamic complex of 6 subdomains. Thermal motions are dominated by a relative twisting of the two monomers. The predominant motion within CDK2 is a “breathing” of the N-terminal and C-terminal lobes. The N-terminal lobe of cyclin A is tightly linked to the “PSTAIRE” helix of CDK2 to provide a rigid nucleus to the complex. By contrast, the “CDK-insert” region (residues 219-251) sometimes becomes highly mobile, a behavior that is observed in crystallographic analyses of CDK2 structures and that may relate to its role in recognizing diverse binding partners. We find that the three arginines that anchor phosphothreonine 160 of fully active CDK2 do not contribute equally to structural stabilization. This observation is supported by a survey of protein kinase sequences. We have also explored the physical basis of the role of the phosphate moiety in signaling by artificially modifying the charge of phosphothreonine 160 in molecular dynamics simulations. We find that phosphothreonine binding involves an active process of attraction in which both the receptor site (the arginine triad), and the phosphothreonine have a higher charge than is required to maintain an active conformation once formed. We have deposited our dynamics data to aid protein kinase inhibitor design.

The cyclin-dependent kinases (CDKs) 1 play a prominent role in controlling the cell cycle (reviewed in Ref. 1). Their phosphorylating activity is strongly enhanced when the protein is associated with one of the cyclin family of molecules, concentrations of which vary with the stage in the cell cycle (reviewed in Ref. 2). The CDKs and cyclins are the basis of a chemical oscillator that governs cell cycle progression. CDKs also re-spond to products from pathways that report aspects of cellular status, such as genomic integrity, and integrate these signals to modify oscillator progression, avoiding lethal mistakes such as entering mitosis before DNA replication is complete (reviewed in Refs. 1 and 2).
Monomeric CDK2 has negligible activity because the residues that should form its catalytic and substrate-recognition sites are disordered (3). Full activation requires both binding by cyclin A or cyclin E and phosphorylation of residue Thr 160 on a loop (the "T-loop") by a CDK-activating kinase (CAK) activity (4 -6). The structure of cyclin A (7) does not change significantly, but many of the CDK2 residues are realigned toward their active configuration (8). In this state the CDK2 has 0.4% of its full activity: 100% activity is achieved when Thr 160 is phosphorylated (9). X-ray structural data shows that phosphorylated Thr 160 turns in to contact three arginine residues (Arg 50 , Arg 126 , Arg 150 ), twisting the T-loop and stabilizing the conformation of residues linked to the catalytic apparatus of the kinase (6). Fig. 1 shows the experimentally determined structures of phosphorylated and unphosphorylated CDK2 in complex with cyclin A, emphasizing the arginines and the phosphorylated threonine.
The models of the underlying structure/function relationship of CDK activity described above have been deduced from inspection of crystallographically determined structures that represent snapshots of molecular conformation. While highly informative (e.g. Refs. 10,11), it has sometimes been found that further insights into the structure/function relationship can be gained though analysis of simulated dynamics (12). For example, previous studies into the dynamic properties of cAPK using short timescale (600 ps) simulations (13), have demonstrated that interlobal dynamics depend on a composite of conformational changes in residues of the "hinge region," rather than on a conformational switching of a single amino acid. Throughout these motions, it was found that the large and small lobes acted as rigid bodies. The mechanism of intramolecular communication within a molecule of the Src protein kinase was dissected in (14), revealing how allosteric signals are able to propagate across several nanometers within a protein. The relevance of such simulations to inhibitor design was addressed in Ref. 15, which describes 1-ns molecular dynamics (MD) simulations of monomeric CDK2, both free and in complex with inhibitors. A broad division into N-and C-terminal domains was identified, which varied according to bound inhibitors. More recently Bartova et al. (16) described a simulation of CDK2 with cyclin A and ATP in several states for 3 ns. A stabilization of the PSTAIRE helix secondary structure was noted on binding with cyclin A. Mobility of the glycine-rich loop was increased by phosphorylation of Thr 160 . Distortions of ATP configuration were predicted as probable consequences of phosphorylation of Thr 14 and or Tyr 15 .
The simulation of proteins requires the generation of ensembles of structures in which each frame contains an accessible configuration. One of the best established methods for generating an ensemble from a known static structure is MD, reviewed in Ref. 17. The ensemble generated by this technique is like a movie, in that frames have a well defined temporal relationship with each other.
A shortcoming of the MD approach, however, is that available computing power permits only short simulations to be run for whole proteins, typically of the order of tens of nanoseconds. Such a short simulation may insufficiently explore the configuration space available to the protein. An alternative approach to MD is to generate an ensemble of structures randomly, but with constraints to ensure that only "reasonable" structures are included in the ensemble, e.g. those with atoms positioned such that no covalent bonds are broken compared with the starting configuration and which favor the maintenance of salt bridges and other non-covalent bonds. This approach is implemented in the program Concoord (18,19). Concoord calculates a list of distance constraints for different classes of bond, derived from analysis of a starting protein structure. From a random starting structure, atoms are moved until "condensation" is detected, when clusters of atoms satisfy distance constraints and other criteria. This is repeated to generate an ensemble of structures.
Results from the two methods, MD and Concoord have complementary properties. Concoord structures are derived from constraints that circumscribe a region of conformational space that can be accessed. This space is thoroughly explored in a Concoord ensemble as a result of the highly randomizing protocol by which each member of the ensemble is generated. MD structures are in principle free to access all physically obtainable conformations. The conformational space that is explored is, however, limited by the incremental search protocol that is used to obtain individual members of the ensemble, and is further restricted by the Brownian and viscous effects of the solvent environment. Thus we suggest that ensembles derived from Concoord are most useful in identifying the breathing motions that a protein is able to perform around an equilibrium position, and that might be extrapolated to predict the character of conformational changes. By contrast, MD-derived ensembles are most useful in predicting both the detailed behavior of a protein, and the trajectory that a structure is likely to experience under specific conditions. A third approach for ensemble generation is to use compilations of experimentally determined structures. The structure of CDK2 has been determined both in isolation and in combination with many other molecules, such as natural inhibitors, drugs, and activating proteins. These forty or so structures constitute an ensemble and can be subjected to analysis.
We have extended and refined the prevailing models of the behavior of CDK-cyclin complexes by application of these techniques including (i) a set of long time scale (15 ns) molecular dynamics simulations, (ii) Concoord ensemble generation, and (iii) a survey of available CDK2 structures. Our aims were to characterize the molecular motions of CDK2 at both a global and local level and to determine the hierarchy of tertiary structures that exist in a CDK-cyclin complex. We also aimed to gain a clearer understanding of the interactions that maintain the different regulatory states of the molecule from the detailed protein movements observed in MD simulations. As well as extending our understanding of the interrelationships of CDK structure, dynamics, and function, we have identified the regions of conformational space in which CDK-cyclin complexes prefer to lie. This result will allow us to target appropriate conformations in computational inhibitor design, an approach FIG. 1. A, stereo view of CDK2-cyclin A (PDB 1JST) highlighting key features referred to in the text. CDK2 is predominantly colored gray, with an orange PSTAIRE helix, a pink G-loop, and a cyan T-loop. Cyclin A is represented in transparent green. Highlighted in ball and stick representation is phosphothreonine 160 docked with the triad of arginine residues Arg 50 , Arg 126 , and Arg 150 . Arg 50 carbon atoms are shown in orange, Arg 126 are gray, and Arg 150 are cyan. Aspartate 127 from the catalytic site is shown to the left of the arginines. ATP and a magnesium ion are shown in the CDK2 interdomain cleft in ball and stick representation. B, close up view of the phosphothreonine and arginine triad in active conformation. C, unphosphorylated form of CDK2 in complex with cyclin A (PDB 1FIN) adopts a conformation in which the threonine 160 is turned away from the arginine triad, and the T-loop is reconfigured. that may be useful in the development of novel anticancer therapies (20).

Generating Ensembles: Molecular Dynamics
In this study, three separate sets of MD simulations were undertaken, all using the GROMACS suite of software (21). Using dual processor 2.4GHz Linux cluster nodes (one node per process) we typically achieved a simulation rate for a protein in a 10-nm solvated octahedral box of 1 ns in 4.5 days. The three studies differed in their starting protein configuration, but shared a simulation protocol.
The GROMOS 43a1 force field was used throughout. The molecule to be simulated was solvated with water in an octahedral box with diameter 10 nm using random placement of water. Counterions (Cl Ϫ or Na ϩ ) were added to neutralize the overall system charge. An initial energy minimization run was carried out followed by a position-restrained simulation of 100 ps to 1 ns to remove clashes/high strains. The production run was then initiated as follows: a time step of 2 fs was used; center of mass movement was removed for each time step; electrostatic interactions were implemented using 4th order particle mesh Ewald (PME) calculations (22) with a coulomb radius of 1 nm; VDW bond treatment was cutoff with radius 1 nm; the system was thermodynamically coupled to a 300 K bath and pressure coupling was at 1 atmosphere (23). Velocities were randomly generated at start-up with a Maxwellian distribution corresponding to a temperature of 300 K. Bond constraints were applied using LINCS (24) with order 4. The SPC water model was used (25).

MD Study 1: Dynamics of CDK2-Cyclin A for Thr 160 with and without Phosphate
In this report we use a two letter code to identify a CDK2-cyclin A structure in a given state of phosphorylation and activity as in Table I. According to this nomenclature therefore, AP (for active form, phosphorylated) corresponds to the x-ray structure for Thr 160 phosphorylated (and therefore active) CDK2-cyclin A with the phosphate group intact. AU (for active structure, manually edited so as to be unphosphorylated) is the same structure with the phosphate group removed in silico. Likewise IU (for inactive form, unphosphorylated) corresponds to the x-ray structure for unphosphorylated (and therefore inactive) CDK2cyclin A with no addition of phosphate group, and IP (for inactive structure, manually edited so as to be phosphorylated) is the same structure with a phosphate group added in silico to Thr 160 before simulation. The four starting configurations were prepared as follows.
IU-One dimer of unphosphorylated CDK2 bound to cyclin A along with its associated ATP and water of crystallization was taken from PDB file 1FIN. A magnesium 2ϩ atom was added to the structure by analogy with its known position in other similar structures.
IP-IU was taken as a starting point and a phosphate group with charge Ϫ2e was added to residue Thr 160 of CDK2. The coordinates of the remaining side chain atoms for Thr 160 were not altered. The phosphate group was added in a low energy conformation, pointing away from the center of the CDK2 protein.
AP-The x-ray structure determined for phosphorylated CDK2, bound to cyclin A (PDB accession ID: 1JST) was reduced to a single dimer with associated ATP and water of crystallization, as for IU. In this case a metal ion, manganese, was already in place. For the simulation the manganese was replaced with a magnesium 2 ϩ ion in the same location.
AU-The resulting structure from AP was taken and the phosphate group substituted by a single hydrogen.
All four of these starting structures were simulated according to the MD protocol described above. IU and AP (i.e. the unperturbed crystallographically determined structures) acted as controls and were run once for 15 ns each. IP and AU were each simulated three or four times for 10 -15 ns each.
We have considered and rejected the possibility of performing "um-brella sampling" (26) to quantify the energy landscape of the transition. The flexibility of the T-loop means that there are very many possible paths between the states, so that choosing a single reaction coordinate is not appropriate and a rigorous exploration of the many dimensional energy landscape of the T-loop between AP and IU configurations is not judged practical in a system of this size at today's computing speeds.

MD Study 2: Effect of Varying Phospho-Thr 160 Charge on CDK2-Cyclin A Dynamics
As will be discussed below, a comparison of the AP and AU trajectories from MD Study 1 showed that when the Thr 160 residue was left phosphorylated the T-loop remained in its active conformation. However, after the phosphate group had been removed, the T-loop began to reconfigure. This difference might be accounted for by a number of characteristics of the phosphate group: (a) that it is charged so that Coulomb attraction keeps it associated with the anchoring arginines, (b) that its bulk may simply obstruct unfolding, or (c) that it is held by van der Waals or hydrogen bonds. In order to determine the contribution of (a) to the effect, 9 simulations were run according to Protocol 1. Each was run for a duration of 5 ns with a different charge on the phosphate group, ranging in half-charge steps from Ϫ3e to ϩe.

MD Study 3: Simulated Mutagenesis Study of CDK2-Cyclin A Dimer
As discussed above, CDK2 residues Arg 50 , Arg 126 , Arg 150 are implicated in the maintenance of the activated configuration by anchoring the T-loop via the phosphorylated Thr 160 residue. Results from MD Study 1 suggested that all three did not contribute equally to the maintenance; therefore three starting structures were prepared each based on AP with a different anchoring arginine mutated to methionine. Although not a perfect match, methionine was judged to be the most similar electrically neutral residue to arginine in terms of shape and volume.
For each of these mutants, and for the wild type, a set of 5 simulations, each of 5 ns duration was made (i.e. 20 ϫ 5 ns simulations in total). The motivation for this duplication was to identify whether the protocol reproducibly predicted differences in the dynamic behavior of the mutants. Because repeating the same simulations five times with identical starting conditions would lead to five identical simulations, we used a different value for the "seed" used in assigning random velocities to atoms at the beginning of each run (see MD protocol above). This led to mutually independent evolution of each member of the set, allowing us to discriminate aspects of the evolution that depend more robustly upon the model from those that depend on this aspect of the boundary conditions.

Generating Ensembles: Concoord
The following protocol was used to generate ensembles using Concoord. For the generation of the distance constraints the DIST program was run as follows: secondary structure recognition was performed using DSSP (27); alternative non-bonded contacts were allowed; cutoff radius for non-bonded interacting pairs was left at the 0.4-nm default value, as were all other analysis options.
For the structure generation the DISCO program was run in default configuration. In particular the number of structures and iterations per structure were both left at the 500 default and chirality was checked "on the fly." Concoord was run four times, once for each of the structures prepared in MD Study 1, i.e. CDK2-cyclin A IU, IP, AP, AU.
The main limitation of the approach is that the number of frames in the ensemble is small compared with the two described above. Additionally there will be several selection effects at work. For example, each x-ray structure is the weighted mean of the population of structures that is in fact adopted, averaged over the volume of the crystal, and the duration of the diffraction experiment. Therefore although a range of configurations will be found in the ensemble these are likely only to be a subset of those that can actually occur. A set of structures of CDK2 bound to various ligands was selected from the Protein Data Bank. The three-dimensional structures were aligned by minimizing the C␣ r.m.s.d. The alignment of the structures depends on which C␣ atoms are included in the set for which the mutual r.m.s.d. is to be minimized. Choosing a localized subset of the atoms (e.g. the relatively stable ␣-helices in the C-terminal domain) could prejudice the analysis. Therefore we sought to include as many atoms as possible in the alignment. This was the set of C␣ atoms that are common in all reference structures used. This set contained 256 of the possible 298 residues, the remaining 42 residues falling in regions that were not resolved by crystallography in one or more of the structures used.

Extracting Motions
One of the challenges of dynamics analysis is extracting significant large scale motions from the data. In this report we make use of two main analyses, covariance analysis and principal component analysis (PCA).
Covariance Analysis-The pairwise covariance for atom coordinates in an ensemble, C ij is given by Equation 1, where i, j are coordinate indices, r i , r j are the values of coordinate i, j, t is time. ͗ (quantity) ͘ t represents the time-averaged value of a quantity. A matrix in which each pair of atoms has an associated Per Atom Normalized Covariance (PANC) can be calculated using Equation 2, where CЈ ij is the PANC of atoms i and j and x i , y i , z i , x j , y j , and z j are the Cartesian coordinates of atoms i and j. CЈ ij takes a value from Ϫ1 to ϩ1 so that a value of 1 means that at all times atoms i and j move together, a value of Ϫ1 means that at all times they move in opposite directions and a value of 0 means that the atomic movements are uncorrelated. As described in Refs. 14 and 29 we may aid interpretation of the resulting matrix by selecting those pairs of atoms with a correlation coefficient for their motions above a given threshold and drawing a line between them on a three-dimensional representation of the molecule. In contrast to Ref. 14 we do not remove local links as they are helpful in defining "subdomains" in the molecule.
Principal Component Analysis-PCA is a means of reducing the dimensionality of a data set. When applied to protein dynamics (30) it provides a description of the motion of all the atoms in a protein in terms of a small number of dynamic modes. Whereas covariance analysis highlights regions that move together, the eigenvectors and eigenvalues of PCA emphasize direction and amplitude of dominant modes of protein motion.
The required covariance matrix and eigenvectors for this analysis were obtained by applying the g_covar program of the GROMACS package, with -debug setting to ensure that a full per-coordinate covariance matrix was dumped. Only the C␣ atoms were included in the analysis as the diagonalization of the matrix was found to exceed the memory capacity of the computer for all atom analysis. The supporting mathematics are outlined in Refs. 30 and 31. Key to interpreting these eigenvectors is a helpful visualization of them. We have based our visualizations on the porcupine plots of (30). In that study, a stick was drawn for all residues corresponding to the movement implied for that residue for a given eigenvector. For example to visualize eigenvector 1 a stick is drawn for each residue starting from the C␣ and projecting in the direction of the component of the first eigenvector that corresponds to that residue. We have found that cones are easier for the eye to interpret than sticks and a script was written to allow the molecular graphics program VMD (32) to automatically plot these cones onto the protein, given the eigenvectors of the PCA decomposition. 2 We have described elsewhere (29) a web-service 3 in which we have made these analyses and visualizations available to other researchers.

Limitations of the Computational Approach
We wish to highlight some of the limitations of the computational approach that should be kept in mind when considering the points made in this study. In particular the following limitations should be considered.
The timescales that MD can currently access are relatively short and it is possible that broader patterns will emerge when the system is probed on a longer time scale. This has been one of the main motivations for using Concoord to predict global motions. This limitation probably explains why we have not been able to show reproducible and complete reconfiguration of CDK2 when its inactive structure is phosphorylated in silico.
Apparent subtleties of the structure, dynamics, and interactions of the system being simulated may depend upon details of the empirical force field employed. By necessity computational approaches use empirical force fields that neglect, for example, effects that can only be described in a quantum mechanical treatment. Although some modeldependent behavior can be identified by conducting multiple simulations with different boundary conditions as described here, shortcomings of the force field will not be addressed by this approach.
As discussed above, Concoord is likely to produce a more coarse approximation to the family of accessible conformations than MD. In particular, the relative contributions of different aspects of bonding are encoded as constraints, rather than as restraints, although the latter would more closely resemble the continuous character of true potential functions. As such, Concoord is known to overestimate the contribution of non-covalent bonds compared with MD. Nevertheless the output of Concoord has been validated by several means, and in this study a remarkable degree of similarity between the key motions from the Concoord approach and an assembly of experimentally determined structures is found.
Additionally Concoord contains no explicit temperature consideration, potentially affecting the relative rates of occupancy of the conformations. There is however an implicit temperature built into the constraints in that the values of the minimum and maximum bond lengths that are used correspond to a bond at a specific temperature.
Despite these limitations we have been able to demonstrate support for a number of our results from experimental data (e.g. residue conservation in known kinase sequences and the correspondence between the predicted and observed preferred modes of conformational variability).

Concerted Molecular Motions
Dynamic Modes- Fig. 2A shows, in porcupine representation, the most significant motion of the CDK2-cyclin A complex predicted by Concoord for the AP state: a twist about an axis that runs from the CDK2 to cyclin A, pivoting approximately about residue P155. However this gross twisting motion of the CDK2-cyclin A dimer masks a flexing motion of the CDK2. We can isolate the dynamics of the CDK2 within the complex by considering only CDK2 atoms in the PCA analysis of the dimer.
By this analysis, the most significant motion of CDK2 within a CDK2-cyclin A complex is a "breathing" mode ( Fig. 2B) with the two domains of the protein moving in opposition around the "hinge" region (residues 78 -86). This region forms its own interactions with the edge of the ATP molecule and connects the N-and C-terminal lobes, each of which independently contributes a surface to the ATP binding site. Therefore, as the protein breathes, the active site opens and closes. Very similar results were obtained in analysis of the Concoord AU, IP, and IU ensembles.
Porcupine representations of the different eigenvectors derived from the MD simulations demonstrated less evidence of long range concerted motions. Moreover, porcupine analysis of intermediate subtrajectories (3-15 ns, 6 -15 ns, 9 -15 ns, 3-12 ns) showed little mutual similarity. Taken together, these results suggest that the modes of motion generated by analysis of Concoord ensembles might emerge on a time scale that is longer than that of our simulation. Interestingly, however, in an analysis of the full 15-ns trajectory of AU, a motion resembling eigenvector 1 of the Concoord ensemble appeared as eigenvector 3 (Fig. 2C). Fig. 2D shows the dominant mode from a similar analysis of the ensemble of experimentally derived CDK2 structures taken from the Protein Data Bank. Strikingly, the observed structural variability of CDK2 is similar in character to the molecular flexibility and motions predicted by Concoord and MD. Fig. 2E shows the normalized relative contributions of the predicted modes to the overall motion of the molecule for each method, Concoord, MD and experimental. It is seen that for the experimental result, by far the biggest contributor is this breathing motion. Concoord predictions are also dominated by breathing, but to a lesser extent. The characteristic motions predicted in MD simulations demonstrate a slower drop off in contribution of higher order modes, which may in turn reflect the questions of timescale discussed above.
The second most significant predicted internal motion of CDK2 in the context of a CDK2-cyclin complex corresponds to a twisting mode around a "vertical" axis, i.e. about a line running roughly through the centers of the two CDK2 domains (Fig. 2F). The third ranked motion is dominated by intradomain flexing and twisting (not shown), and the fourth is an elongation of the protein along its major axis, accompanied by a thinning on its two minor axes (not shown). Fig. 2E indicates that the majority of the C␣ movement is accounted for by these four eigenvectors.

Structural Domains and Subdomains within the CDK2-Cyclin A Dimer
Rigid domains and subdomains of the CDK2-cyclin A complex can be identified by their tendency to undergo concerted motion, as revealed by the covariance web plot. This is shown in Fig. 3A for the 15-ns AU molecular dynamics simulation. Throughout most of the protein, at a correlation threshold of 0.7, only connections within secondary structure appear. The notable exception is at the interface between cyclin A and the PSTAIRE helix of CDK2. Here, there appears to be a rigid group of cyclin residues that move in close concert with the PSTAIRE helix. This observation was reproduced for all simulations of CDK2 that we have analyzed in this way, including the wild types AP and IU although the rigidity of the Cyclin N-terminal is particularly pronounced in the instance shown. From this observation, it appears that a rigid "core" is formed in the CDK2-cyclin A complex, comprised of part of the Nterminal cyclin-box fold of cyclin A and the PSTAIRE helix of CDK2. The rigidity of this core may underlie the profound effect that cyclin A binding has on the orientation of the PSTAIRE helix, and hence on the activity of CDK2.
There is a high degree of similarity between the covariance web that emerges from analysis of the entire AU MD trajectory, and the webs that are generated when subsections of that trajectory are analyzed. This suggests that authentic features of short range correlated motion may have been detected. This behavior contrasts with the PCA analysis of long range concerted motions referred to above, which indicated that whole molecule motions cannot be discerned in short (ϳ10 ns) MD simulations of this system. This contrast is consistent with the expectation that the internal motions of small rigid domains will have a shorter characteristic time than that between larger, more loosely associated entities.
The Concoord simulation has identified a total of 6 domains that tend to move with high internal correlation, i.e. act as structural units (Fig. 3B). These are 1) the CDK2 N-terminal domain; 2 and 3) two CDK2 C-terminal subdomains; 4) the cyclin A C-terminal domain; 5) the cyclin A N-terminal domain, which is closely associated with 6) a region composed of the CDK2 PSTAIRE helix and the first half of the T-loop.
Each algorithm for ensemble generation has different benefits: Concoord is likely to identify domains whose integrity is apparent only on a long timescale, whereas MD is better able to rank the relative rigidity of domains. Thus we conclude that the mosaic structure of the complex is represented by the subdomains of Fig. 3B, with the interfacial domain (domain 6)

FIG. 2. Principal dynamic modes of the CDK2-cyclin A dimer.
A, porcupine plot of principal motion of the dimer calculated using Concoord. The view is from CDK2 along the CDK2-cyclin A axis. The motion is primarily an interdomain twist. B, same motion recast to show only motions internal to the CDK2. The motion is seen to be a relative flexing of the N-and C-terminal domains causing the active site to open and close. C, similar motion is sometimes observed in a corresponding analysis of a subset of the trajectories generated by molecular dynamics, although in this case it appears as the 3rd ranked eigenvector. D, when an ensemble of structures taken from the PDB of CDK2 bound to a range of ligands is subjected to the same analysis the result obtained is similar to that of the Concoord ensemble: the dominant mode is shown here. E, relative contributions of different modes (eigenvectors) to the overall motion is shown for the three analyses: Concoord (short dashed line), MD (long dashed line), and experimental ensemble (solid line). The data are renormalized so that the eigenvalues for each set add up to unity. F, second most significant motion for CDK2 is a relative twist of the N-and C-terminal domains. This view is looking "down" from the N-terminal domain, and was generated from the Concoord ensemble. of particular significance, as indicated by its appearance in Fig. 3A.

Specific Interactions
In addition to characterizing the inherent global motions of CDK2 bound to cyclin A, we have used MD simulation to explore the particular interactions that are responsible for the adoption and maintenance of the unphosphorylated (0.4% active) and phosphorylated (fully active) CDK2-cyclin A conformations.

Stabilization of the Arg 50 -Arg 126 -Arg 150 -Phospho-Thr 160 Site
Role of Thr 160 Phosphate-MD simulations starting with the "wild-type" configurations AP and IU were generated to serve as controls for the subsequent simulations of modified proteins. It was found that in both cases the known crystallographic structures were preserved throughout the simulation, as judged both by inspection and by several observations. First, the distance between Thr 160 and any atom of the arginine triad is approximately constant throughout simulations of AP and IU (Fig. 4). This serves as an approximate metric of the extent of transformation between active and inactive conformations.
Second, DSSP secondary structure assignments of AP and IU remain largely constant (data included as Supplementary Material). Third, the r.m.s.d. between the structures at time ϭ t and time ϭ 0 quickly leveled out at 0.3 nm for both AP and IU indicating that there was no gross unfolding of the protein over the timescale considered (data not shown). Notwithstanding this general structural stability, multiple simulations showed a capacity for CDK2 residues 217-250 to adopt a highly mobile state, the significance of which is discussed below.
Having shown that the controls gave the expected results we examined the trajectories from the structures that had been modified compared with the crystal structure, i.e. IP and AU. Simulation of the IP configuration led to variable results, summarized in Fig. 5A. In these figures each vertical box constitutes a summary of a chart like Fig. 4 collapsed in the time axis. The minimum, lower quartile, median, upper quartile, and maximum separations for the simulation are shown in each box. For two repeats of the simulation (IP runs "2" and "3") there was no obvious tendency for the protein to migrate toward the active configuration. On a third (IP run "1") however clear progress was made toward docking the Thr 160 phosphate into the arginine triad, shown more explicitly in Fig. 5

, B-E.
In contrast, the removal of the phosphate from the active state (forming the AU structure), consistently reduces the ability of the arginine triad to anchor the T-loop. Fig. 6A shows a summary of this measure for the control (AP) and the four 10-ns simulations of the AU structure. To allow direct comparison between the AP and AU trajectories only the first 10 ns of the AP trajectory was included. It is observed that the removal of the phosphate groups allows wider excursions of all three anchoring arginines, and visual examination of the trajectory showed that this is accompanied by a rotation of the Thr 160 away from its docked position in all cases. Arg 50 and Arg 150 lose their association with Thr 160 , leaving only Arg 126 to maintain the active configuration. In contrast to the AP configuration in which the original side chain interaction was maintained throughout, in three of the four AU simulations (AU runs "2," "3," and "4") Arg 126 was able only to maintain contact with the backbone of the Thr 160 during the 10 ns of simulation, and with a markedly looser association than for AP. Indeed for one of the AU simulations (AU run "1") we were able to observe the dissociation of this final link allowing a fuller reconfiguration of the T-loop. Figs. 6, B-D illustrates this graphically and 6E provides quantification, showing how the minimum distance between the Thr 160 and any of the anchoring arginines varied

FIG. 3. Correlated motions in the CDK2-cyclin A dimer.
A, cyclin A binds to the CDK2 PSTAIRE helix to make a rather rigid structure. In this case the AU form is shown, which emphasizes the rigidity of the cyclin N terminus more strongly than in other simulations, but the tendency for the PSTAIRE helix to dominate the CDK2cylin A interaction is reproduced in all other simulations analyzed. Pairs of atoms moving with correlation Ͼ 0.7 in a MD simulation are joined with a red line. The backbone is colored to emphasize assigned secondary structure as follows: ␣-helix is purple, 3-10 helix is mauve, ␤-sheet is yellow, bridge-␤ is tan, coil is white, and turn is cyan. B, similar analysis using a Concoord-derived ensemble highlights other rigid regions in the molecule, enumerated in main text. Concoord is better able than MD to discern larger regions whose correlated nature emerges over a long time scale. over the full 15 ns of the two simulations, AP and AU run 1. It is observed that from the start AU makes excursions with larger amplitude than AP. After ϳ3.5 ns a still larger excursion is made. This represents a partial unfolding of the T-loop with the protein locally evolving toward the experimentally determined unphosphorylated structure, IU. Subsequently the Thr 160 occasionally collides with the arginines but during the 15 ns of the simulation there was no further extended association of Thr 160 with the anchoring arginines. For AP, in contrast, the arginines and Thr 160 remain associated for the whole simulation. Thus the active conformation is destabilized by removal of the phosphate group from Thr 160 . FIG. 5. A, summary of multiple simulations of IP configurations with IU control. For each residue, and for each run the maximum, upper quartile, median, lower quartile, and minimum separation of atom centers between any atom of the residue and Thr 160 are plotted. B and C show starting and finishing configurations of the 15-ns IP run 1 simulation. Color scheme is as in Fig. 1. Significant progress is made toward docking of the phospho-Thr 160 (on the right of each figure) into the anchoring arginine. The fully docked, activated form toward which the protein was presumably evolving (1JST) is shown in D. This behavior is quantified in E. The graph shows the minimum separation between the arginine triad and either any atom (gray trace) of phospho-Thr 160 , or the phosphate oxygens (black trace) of phospho-Thr 160 . We emphasize that this clear change in configuration was not reproduced in the other two IP simulations.
We note that even in the more progressive AU and IP simulations that the structures do not progress fully to the reconfigured form. Explanations for this could be: (a) that the simulations are not long enough to observe the full transition, (b) that there is an energy barrier that prevents further progress, or (c) that the force-field chosen has a minimum in this configuration that is not reflected in reality. To investigate this further would require a quantitative approach such as Um- FIG. 6. A, summary of multiple simulations of AU configurations with AP control. For each residue, and for each run the maximum, upper quartile, median, lower quartile, and minimum separation of atom centers, between any atom of the residue and Thr 160 are plotted. There is a marked reduction in association between the arginines and Thr 160 for all the AU plots compared with that for AP. The absence of phosphate on Thr 160 allows a partial unfolding of the CDK2 T-loop from its active conformation. B and C show the CDK2 conformation for one of 4 simulations of AU (AU run 1). These images recorded at 0 and 3900 ps, respectively. Color scheme is as in Fig. 1. D shows the same region of 1FIN, the structure toward which AU seems to be evolving. This is quantified in E. The separation between Thr 160 and anchoring arginines was measured over a 15-ns trajectory. The lighter line is for the AP simulation (i.e. with Thr 160 phosphorylated), and the heavy line is for the AU run 1 simulation (i.e. Thr 160 unphosphorylated). For AP, association with anchoring arginines is maintained for the full duration of the simulation. For AU Thr 160 makes early excursions from the arginines before making a more permanent departure at about 3.5 ns. It is noted that subsequent associations between Thr 160 and the arginines are very transitory in AU.
brella Sampling, which could provide Potentials of Mean Force for the transition. This is beyond the scope of this study for reasons described under "Experimental Procedures." We have extended the MD approach to explore what property of the Thr 160 phosphate group (e.g. shape, charge, or hydrogen bonding potential) is most directly responsible for maintaining the active conformation in a series of experiments that are only possible through simulation. Initially, we have explored this by performing simulations in which the charge of Thr 160 phosphate was set to non-physical values in the range Ϫ3 to ϩ1 at the start of the simulation (MD Study 2). Fig. 7 summarizes these 9 simulations. For each simulated charge the maximum, upper quartile, median, lower quartile, and minimum values of the minimum triad-Thr 160 atomic center separation during the period 2-5 ns are shown. The first two nanoseconds were excluded as they included relaxation from the starting configuration. In line with intuition, we observe that the positively charged pseudophosphate spends its time away from the arginines, presumably because it is repelled by their like charges. Also we see the trend that the larger the negative charge on the phosphate the fewer and smaller excursions are made by Thr 160 , and also that the baseline separation is smaller as the increased Coulomb attraction pulls them more tightly together. An exception is for a charge of Ϫ2.5e, which behaves as expected for the first nanosecond (data not shown), but which subsequently departs somewhat. We interpret this anomaly to be the result of a thermal fluctuation and anticipate that given time the phosphorylated Thr 160 would adopt a mean separation in line with neighbors in the graph. We conclude that charge makes a substantial contribution to maintenance of triad-Thr 160 association, but that the full Ϫ2.0e charge is not required for this maintenance: the separation values for Ϫ2.0e and Ϫ1.0e are similar. As a cross check we have compared the evolution of this AP charge ϭ 0 simulation with that of the AU simulations described above. We note that whereas in the AU simulations the Thr 160 turned away from the arginine triad, in the charge 0 AP simulation the side chain of Thr 160 remains docked in place for the whole simulation. This suggests that the charge alone does not account for the full reduction in mobility of the T-loop when a phosphate is present, but that other forces, presumably including hydrogen bonds, van der Waals, and steric hindrance also contribute to the maintenance of structure. These forces are overcome however with even a small amount of positive charge on the phosphate, as indicated by increased separation for the charge ϭ ϩ0.5 simulation.
Differential Roles for the Cluster of Arginines That Bind Phospho-Thr 160 -As discussed above, examination of the x-ray structure of phosphorylated CDK2 shows that the negatively charged phosphate group on phospho-Thr 160 is docked into a triad of positive arginines: Arg 50 , Arg 126 , and Arg 150 . The natural interpretation is that there is a Coulomb attraction between these entities, which anchors the phospho-Thr 160 in place and maintains the functional phosphorylated configuration.
We are not aware of any suggestion that any of the arginines are more important than the others in this respect. On examination of the simulation of the AP configuration of CDK2cyclin A, however, it was observed that Arg 50 rapidly drifted away from the interaction center, as shown in Fig. 8, A and B (0 and 1500 ps, respectively). It is observed that the Arg 50 side chain, attached to the PSTAIRE helix, has drifted away from the rest of the cluster. We observe that the dissociation occurs as early as 40 ps in the trajectory. This observation was also reproduced in other simulations of CDK2 (data not shown). This led to the hypothesis that Arg 126 and Arg 150 play dominant roles in the anchoring of phospho-Thr 160 , with Arg 50 somewhat less important. This raises the question of why the x-ray structures for CDK2 always show the Arg 50 , Arg 126 , Arg 150 triad in close association with the phosphate group. One possibility is that there is indeed an attraction between the groups and at low temperatures (such as those used for x-ray diffraction) this is enough to cause the Arg 50 to associate with the rest of the cluster in the lowest energy configuration. At higher temperatures, such as the 300 K simulated here, the attraction is insufficient to overcome thermal disruption and the Arg 50 drifts away. Note that once Arg 50 has drifted away the association of Arg 126 and Arg 150 with phospho-Thr 160 is neutral: only dipole and higher order effects are available to attract the Arg 50 back into place. Although it is unlikely, it is not impossible that this prediction of low association of Arg 50 with phospho-Thr 160 is an artifact of the force field used, as described in limitations, above. We therefore sought empirical data that could support or counter this theoretical prediction. We noted that if Arg 50 , or equivalent residues, also play a smaller role in stabilizing an active T-loop conformation in other protein kinases, then it might be expected that basic residues at this position are likely to be less well conserved throughout the protein kinase family than at the sequence positions equivalent to Arg 126 and Arg 150 . This hypothesis was tested by counting how frequently arginine and lysine residues were found at these three residue positions within the kinase family, using the Human Kinome web resource (33). Nearly 500 members of the kinase family were included in the survey and results are shown in Fig. 8C. The number of kinases that conserve arginine or lysine in positions equivalent to CDK2 Arg 126 or Arg 150 is 399 and 266, respectively. For Arg 50 the equivalent number is only 143. Clearly the basic nature of residues 126 and 150 are much better conserved than that of residue 50. The results from this conservation analysis suggested MD study 3, described above, in which simulations were run with the three anchoring arginines individually mutated to methionine in silico, 5 simulations per mutation. Typical results for single simulations, one for each mutant are presented in Fig. 9. The traces in each graph are the separation of each FIG. 7. Separation of phospho-Thr 160 from the anchoring arginine triad for a range of simulated phosphothreonine charges, based upon AP over the period 2-5 ns. Distances are measured between atomic centers so that touching atoms are shown on this graph with a separation of ϳ0.2 nm. As in previous figures, for each simulation the maximum, upper quartile, median, lower quartile, and minimum separations are shown. It is seen that charge contributes to the stability of the protein but that the full Ϫ2e phosphate charge is not required. member of the arginine triad from the phospho-Thr 160 residue; specifically it is the distance between the centers of the two nearest atoms in each pair of residues. Blue is Arg 50 , red is Arg 126 , and black is Arg 150 .
The first graph (A) is for the R50M mutated molecule. Predictably the electrically neutral Met 50 residue shows no tendency to associate with the phospho-Thr 160 . Nevertheless, the Arg 126 and Arg 150 residues maintain close association with phospho-Thr 160 during the 5 ns of simulation. Arg 150 makes more excursions than Arg 126 .
A different picture is seen in for R126M (Fig. 9B). As expected the neutral Met 126 residue shows no tendency to associate with Thr 160 . Arg 50 rapidly diverges from phospho-Thr 160 . Arg 150 maintains association for a nanosecond, but then makes an excursion from Thr 160 . This leaves the phospho-Thr 160 unanchored: it diffuses away and when Arg 150 returns from its excursion it can no longer reach the phospho-Thr 160 tight anchoring has been lost and the T-loop begins to depart from its anchoring arginines. The third graph (Fig. 9C) shows that when Arg 150 has been mutated, Arg 50 still departs from phospho-Thr 160 . Nevertheless during the 5 ns of the simulation Arg 126 is able to maintain contact with phospho-Thr 160 .
These simulations were run five times each to identify whether these observations were repeatable. The results are summarized in Fig. 9D. For each structure (wild, R50M, R126M, R150M) the value of the mean minimum separation of atomic centers between the phospho-Thr 160 and any of the arginine triad for the period 1-5 ns are shown for each of the five simulations. There is a striking resemblance between the association data in Fig. 9D and the conservation data in Fig.  8C: mutation of the best-conserved residues leads to the most disruption.
These mutagenesis and dephosphorylation results are consistent with a single interpretation. The presence of the phosphate group on Thr 160 of CDK2-cyclin A is necessary to main-tain the active configuration of the enzyme. It establishes links with a triad of arginines, Arg 50 , Arg 126 , and Arg 150 in the N terminus of the protein. Of these three Arg 126 is the most important in maintaining the configuration. Arg 150 also makes a contribution, primarily through redundancy: if random fluctuations lead to Arg 126 being displaced, Arg 150 maintains contact with phospho-Thr 160 until Arg 126 diffuses back. Arg 50 plays little part in the anchoring function. We predict that in vitro CDK2 kinase activity should be smallest for R126M, greater for R150M and greatest for R50M. R50M may have an activity similar to the wild type since Arg 126 and Arg 150 seem able to maintain contact with phospho-Thr 160 without assistance from Arg 50 . Fig. 10A shows a comparison of the crystallographically determined B-factors for 1JST (the parent structure for simulation AP) with the B-factors calculated from the AP MD trajectory, omitting the first 2 ns as relaxation time. Many features are common to both plots and for much of the structure the values are well matched, although the simulation predicts the identity of mobile regions better than it predicts the degree of their mobility. Several differences between the 1JST B-factor values and those for AP are noted. First, the AP-simulated values are significantly lower than those of the x-ray data for two regions around residues 11 and 35 in the CDK2. This implies that the simulation has only explored a subset of the configurations that are available to the protein, and that were presumably adopted in the 1JST crystal. This is consistent with the hypothesis that the two peaks in the B-factor plot of 1JST arise primarily from static disorder, occupying 2 or more states that are separated by a kinetic barrier that is not overcome in the course of the simulation. This result is consistent with dual conformations observed for the main chain of this glycine-rich loop in crystallographic structures of CDK2-cyclin A (MEMN, data not shown).

B-factors
FIG. 8. The three anchoring arginines do not contribute equally to the maintenance of the active conformation. Two frames are shown from the AP simulation. A, at 0 ps; B, at 1500 ps. Color scheme is as in Fig. 1. The view is from behind the PSTAIRE loop, with respect to the orientation of Fig. 1. It is seen that although phospho-Thr 160 remains in contact with other members of the arginine triad, Arg 50 has lost association with phospho-Thr 160 by 1500 ps, suggesting that it is not heavily involved in the maintenance of CDK2 activity. C, this conclusion is supported by a study of the conservation of basic residues at sites analogous to CDK2 Arg 50 , Arg 126 , and Arg 150 in the protein kinase family. It is seen that Arg 50 is not as well conserved as the other two.
For the peak at residue 11, this is consistent with the proposed biological function of this loop: it opens and closes to allow entrance and exit of ATP and ADP respectively. The second loop at around residue 35 is often difficult to model in x-ray structures of both monomeric CDK2 and CDK2 in complex with cyclin A, and may play a role at the edge of the interface between the PSTAIRE helix and the CDK-proximal domain of cyclin A.
The most dramatic departure between the two simulations is around residue 235, where simulation implies significantly higher mobility than is indicated by the crystallographic Bfactors. The high peak in the simulation arises because there is a reconfiguration of the protein in this region. Strikingly, although many deposited structures of CDK2 present low Bfactors for this region, a subset (e.g. chain C of 1OGU) indicate that this region is indeed able to switch into a high-mobility state that has prevented reliable model building. Consistent with the observation that a mobile state is observed in only a subset of CDK2 structures, the associated reconfiguration is not observed in some of our other simulations of CDK2 and was not noted in the free CDK2 simulations of Ref. 15.
When we compare the shapes of AP and AU B-factor plots we see very close global agreement. Fig. 10B shows the simulated AP B-factors subtracted from the simulated AU B-factors for FIG. 9. A, B, and C show results of three MD simulations of AP in which different arginines from the triad Arg 50 , Arg 126 , Arg 150 are mutated to methionine respectively. In each case a trace is the minimum separation of atom centers between any of Arg 50 , Arg 126 , or Arg 150 and the phospho-Thr 160 . In A, it is seen that the R50M mutation does not affect the integrity of the triad-Thr 160 complex, whereas B shows that R126M completely disrupts it within 1 ns. Bottom, when Arg 150 is mutated, Arg 126 is able to maintain association with phospho-Thr 160 during the 5 ns of this simulation. To ensure that these results were reproducible, each mutant was simulated 5 times: the results are shown in D. In this summary chart the mean minimum separation between atomic centers of Thr 160 and any residue of the mutated arginine triad during the simulation period 2-5 ns are plotted for each mutant and for the wild type. Strikingly the plot shows a strong agreement with the independent residue conservation data for the arginine triad shown in Fig. 8C: mutation of the most strongly conserved arginine (126) leads to the greatest disruption. the CDK2 residue. The general trend is that B AU -B AP is positive within CDK2 implying that the phosphate group serves to stabilize the global CDK2 structure.

A Possible Mechanism to Prevent Non-productive Hydrolysis of ATP
In these simulations ATP often adopts conformations that are not optimal for phosphotransfer, as is also found in most crystal structures of CDK2. The exception is the structure of phospho-Thr 160 CDK2 in complex with ATP and a peptide substrate (34). In this case, ATP was found to be correctly configured for nucleophilic attack. Therefore the simulations and experimental structures suggest that ATP only adopts its active conformation when a proper substrate is bound at the CDK2 peptide binding site. This would function to prevent wasteful hydrolysis of ATP by water before the approach of the peptide. We do not know whether it is the identity of the peptide that gives rise to this ATP reconfiguration, or just its presence. If the former, then this would be a mechanism for increasing the specificity of the CDK2 activity to its cognate ligand. DISCUSSION At equilibrium the motions of a CDK2-cyclin A dimer are dominated by intermonomer twisting and flexing motions.
Within CDK2, dynamics are dominated by a bilobal flexing between the N-and C-terminal domains causing an opening and closing of the active site. Other contributing modes of motion are relative twist between the N-and C-terminal lobes and twisting and stretching within the N-and C-terminal domains.
Strikingly, we find that the experimentally determined configurations adopted by CDK2 are consistent with the family of breathing motions predicted by Concoord essential dynamics. As well as supporting the validity of our analysis, this indicates that the character of conformational changes of CDK2, including those associated with changes in activation state, may be guided by the inherent dynamic properties of CDK2 identified in this study.
Cyclin A moves in concert with the PSTAIRE helix. This observation underscores the importance of interactions between the N-terminal cyclin box fold of cyclins and the PSTAIRE helix of CDKs, the most highly conserved element of the cyclin-binding surface of CDKs. The significance of these interactions is indicated by the capability of single domain cyclins, such as the p25 fragment of the CDK5 activator p25, to bind to and activate their cognate CDK (35).
Our results show that the charge on the phosphate accounts for much, but not all of the tendency of the phospho-Thr 160 not to unfold from its active conformation. However, the full 2 Ϫ charge is not required for this effect over the timescales considered: little difference is observed in simulations where the phosphate is given a single negative charge. As expected, removal of the phosphate from a simulation of the phosphorylated x-ray structure leads to a significant loosening of the association between the arginine triad (Arg 50 , Arg 126 , and Arg 150 ) and Thr 160 . Indeed, in one simulation, a rapid partial unfolding of the T-loop was observed.
In the complementary experiment, in silico addition of phosphate to the unphosphorylated Thr 160 of CDK2 did not reproducibly lead to adoption of the active conformation (although one partial transition was observed). This may indicate that diffusion of the phosphorylated residue to its active conformation either is a slow process compared with the period simulated, or arises from factors not included in this simulation. The lifetime and structure of this putative state, phosphorylated but not yet in an active conformation, is unlikely to be biologically significant.
The arginines Arg 50 , Arg 126 , and Arg 150 in the triad that anchors phospho-Thr 160 are not of equal importance in binding to phospho-Thr 160 . Simulated mutagenesis shows that Arg 126 is vital for maintenance of triad-phosphate contact beyond a nanosecond or so. Arg 150 acts as a fail-safe for Arg 126 : should thermal fluctuations temporarily disrupt the contact between Arg 126 and phospho-Thr 160 . Arg 150 can hold phospho-Thr 160 in place until Arg 126 returns from its excursion. Arg 50 seems to make little contribution to the stabilization, and its main role may well relate to the hydrogen bonds that it makes to cyclin A, rather than to the interactions it forms with phospho-Thr 160 . Alternatively, we speculate that Arg 50 may serve as an additional hook to improve the chance of capture of a newly phosphorylated Thr 160 should it diffuse into range, and to hold it in position until Arg 126 or Arg 150 can take over a more permanent interaction. The simulations required to explore this possibility would require extensive sampling of the reconfiguration pathway, and are beyond the scope of this report. The ranking of importance of the arginines in maintaining the active conformation of the T-loop is consistent with their conservation rates within the kinase family: a basic residue is conserved at location 126 with 2.7 times the frequency of location 50. At lower temperatures there will be fewer fluctuations of the arginines and we therefore speculate that the x-ray structure with all three arginines docked to phospho-Thr 160 represents a low energy state with high occupancy at the experimental 100 K, but with possibly low occupancy at room temperature.
Finally, our simulations provide reassuring consistency with the experimentally observed distribution of mobile regions within the CDK2 molecule. Notably, the region between residues 219 and 251 is observed in some (but not all) simulations to diffuse away from interactions with the rest of the molecule and to adopt a highly mobile conformation. This result provides an explanation for the observation that in some crystals of CDK2 in complex with cyclin A, this region lacks interpretable electron density, consistent with possessing a high degree of thermal motion and/or static disorder. The significance of this transition remains unclear, but it is interesting to note that this region of the protein is both highly characteristic of CDK proteins compared with other protein kinases, and is responsible for at least two examples of protein:protein interaction, namely with the CDK subunit CKS1, and with the postulated Thr 160 phosphatase KAP. The potential to exist in highly ordered and highly mobile states might play a role in both modulating the kinetics and thermodynamics of the interactions in which it plays a role, and in allowing specific interaction with multiple structurally dissimilar but functionally important partners.