The crystal structure of the protein kinase HIPK2 reveals a unique architecture of its CMGC-insert region

The homeodomain-interacting protein kinase (HIPK) family is comprised of four nuclear protein kinases, HIPK1–4. HIPK proteins phosphorylate a diverse range of transcription factors involved in cell proliferation, differentiation, and apoptosis. HIPK2, thus far the best-characterized member of this largely understudied family of protein kinases, plays a role in the activation of p53 in response to DNA damage. Despite this tumor-suppressor function, HIPK2 is also found overexpressed in several cancers, and its hyperactivation causes chronic fibrosis. There are currently no structures of HIPK2 or of any other HIPK kinase. Here, we report the crystal structure of HIPK2's kinase domain bound to CX-4945, a casein kinase 2α (CK2α) inhibitor currently in clinical trials against several cancers. The structure, determined at 2.2 Å resolution, revealed that CX-4945 engages the HIPK2 active site in a hybrid binding mode between that seen in structures of CK2α and Pim1 kinases. The HIPK2 kinase domain crystallized in the active conformation, which was stabilized by phosphorylation of the activation loop. We noted that the overall kinase domain fold of HIPK2 closely resembles that of evolutionarily related dual-specificity tyrosine-regulated kinases (DYRKs). Most significant structural differences between HIPK2 and DYRKs included an absence of the regulatory N-terminal domain and a unique conformation of the CMGC-insert region and of a newly defined insert segment in the αC–β4 loop. This first crystal structure of HIPK2 paves the way for characterizing the understudied members of the HIPK family and for developing HIPK2-directed therapies for managing cancer and fibrosis.

The homeodomain-interacting protein kinases (HIPKs) are a family of nuclear dual-specificity serine/threonine and tyrosine kinases that belong to the CMGC superfamily of protein kinases. Within the CMGC family on the kinome tree, HIPKs reside on a branch together with dual-specificity tyrosine-regulated kinases (DYRKs), CDC-like kinases (CLKs), serine/arginine-rich protein kinases (SRPKs), and the pre-mRNA processing factor 4 kinase (PRPF4B) (Fig. 1A). This group is separate from the well-studied CMGC kinases, such as cyclin-dependent kinases (CDKs), cyclin-dependent kinase-like kinases (CDKLs), mitogen-activated protein kinases (MAPKs), and glycogen synthase kinases (GSKs) (1). HIPKs act as transcriptional regulators and corepressors for homeodomain transcription factors (2) and regulate a diverse range of cellular functions, including cytokinesis and apoptosis, ultimately playing roles in adipogenesis, tumor suppression, angiogenesis, and inflammation (3–5). Based on similarity of the kinase domain sequence, HIPK kinases are most closely related to DYRK kinases, which are involved in cell survival, differentiation, and neurogenesis (6–8). Like DYRKs, HIPK kinases are also involved in the pathology of certain neurodegenerative diseases. Dysregulation of DYRK1A plays a significant role in Down syndrome and Alzheimer's disease through a gene dosage-dependent mechanism (9–11). HIPK2 activity positively correlates with neurodegeneration in ALS (12). Loss of function of HIPK2 has been linked to the development of Alzheimer's disease (13), whereas loss of HIPK3 was implicated as beneficial in Huntington's disease (14).
Three of the four human HIPK protein kinases (HIPK1, HIPK2, and HIPK3) share a common domain architecture. HIPK1-3 consist of a highly conserved N-terminal kinase domain and several C-terminal regulatory domains (Fig. 1B). Directly C-terminal to the kinase domain is the homeoprotein-interacting domain (HID) (2), followed by the speckle retention sequence (SRS) (15) in HIPK2 or the proline, glutamate, serine, and threonine (PEST)-rich domain in HIPK1 and HIPK3 (16). In HIPK1-3, the HID and SRS/PEST domains precede the autoinhibitory domain (AID) (17) and the serine, glutamine, and alanine (SQA) repeat region at the C terminus (18). HIPK4 stands out as a unique family member because it consists of only a kinase domain. The HIPK4 kinase domain is also the most divergent and shares only ~50% sequence identity with the kinase domains of HIPK1-3, which have ~90% sequence identity among themselves. Genetic studies point to some functional differences and redundancies between the HIPK kinases, in particular HIPK1 and HIPK2. Individual loss of Hipk1 or Hipk2 does not manifest as an abnormal phenotype in mice (19, 20). However, dual loss of Hipk1 and Hipk2 results in embryonic lethality due to developmental defects (20). HIPK3 and HIPK4 carry more distinct functions. Loss of Hipk3 results in decreased insulin secretion and impaired glucose tolerance, potentially driving type 2 diabetes (21). Hipk3−/− mice also display lower levels of mutant huntingtin protein, making HIPK3 a potential therapeutic target in Huntington's disease (14). Hipk4 knockdown studies identified HIPK4 as a key regulator of human skin epithelial cell differentiation (22).
The archetypal member of the HIPK family, HIPK2, was discovered as an interactor and positive regulator of the repressor activities of NK homeodomain-containing transcription factors (2). Under normal conditions, HIPK2 has a short half-life as a result of constitutive ubiquitination by the SIAH-1 ubiquitin ligase, resulting in HIPK2 proteasomal degradation (23). Upon DNA damage, HIPK2 is stabilized and catalytically activated by caspase-mediated cleavage that removes the C-terminal AID and SQA domains, releasing kinase inhibition (17). HIPK2 activation results in phosphorylation and degradation of the p53 repressor MDM2 (24–26), phosphorylation of the apoptotic regulator CtBP (24), and modulation of p73 activity (25). Collectively, these events contribute to p53-dependent DNA lesion repair and, if the damage is irreparable, activation of cellular apoptosis. Due to its role in activating the DNA damage response, HIPK2 is a canonical tumor suppressor. Hence, in addition to its role in neurodegenerative diseases, HIPK2 has also been implicated in the pathology of several cancers (3).
Inhibition of HIPK2 kinase activity through therapeutic intervention seems counterproductive in light of its characteristics as a tumor suppressor. However, elevated HIPK2 levels have also been shown to be cytoprotective and are detected in cervical cancers, glioblastoma, pilocytic astrocytomas, and tonsillar squamous cell carcinoma (26–29). The cytoprotective mechanism of HIPK2 signaling relies in part on a positive feedback between HIPK2 and the oxidative stress response transcription factor, NRF2, which promotes cancer cell survival (30). These findings suggest that pharmacological inhibition of HIPK2 kinase activity might be of therapeutic benefit in a subset of cancers.
In addition to its role in cancer, HIPK2 has also been implicated as an important regulator of renal, hepatic, and pulmonary fibrosis (31–33). In an HIV transgenic mouse model (Tg26) of renal fibrosis, HIPK2 protein levels are elevated in the kidneys, and HIPK2 knockout attenuates the development of fibrosis (31). Upon activation, HIPK2 triggers several pro-fibrotic, pro-inflammatory, and pro-apoptotic pathways, including TGF-β/Smad3, Wnt/Notch, NF-κB, and p53, which are all known to increase expression of pro-fibrosis markers (34–37). Consequently, HIPK2 activation promotes cellular epithelial-to-mesenchymal transition (EMT) (32, 38). Expression of kinase-dead HIPK2 or its knockdown in human renal tubular epithelial cells significantly attenuates HIV-induced apoptosis and down-regulates EMT markers in Tg26 mice (31). Likewise, inhibition of HIPK2 by a small-molecule inhibitor ameliorates fibrosis in the Tg26 mice by suppression of the TGF-β/Smad3 pathway (39). Together, these results suggest that targeting HIPK2 kinase activity could be of significant benefit in treatment of chronic fibrosis (40–42).
EDITORS' PICK: Structure of HIPK2 bound to CX-4945
The activation mechanism of CMGC kinases involves phosphorylation of a conserved tyrosine residue in the activation loop, which in MAPK kinases is encoded within the TXY motif (43). DYRK kinases contain a YXY motif in which the second tyrosine is phosphorylated (44–46). HIPK1, HIPK2, and HIPK3 have an SXY motif, and the related member HIPK4 contains an EPY motif. The phosphorylation of tyrosine 361 in the SXY motif was shown to activate HIPK2 (47). In contrast to other dual-specificity kinases, mutation of the activation loop tyrosine (Y361F) does not completely inactivate HIPK2, but it significantly reduces its catalytic efficiency and expands substrate recognition to encompass noncanonical tyrosine residues within HIPK2 (48). HIPK2 autophosphorylates its activation loop via a unique mechanism, first described in DYRK kinases, in which tyrosine 361 is co-translationally phosphorylated (48–51). In addition to autophosphorylation, full activation of HIPK2 has been described to involve acetylation and sumoylation of the HIPK2 protein (41, 52). HIPK2 is acetylated on multiple lysines throughout the protein, including the kinase domain (53), and sumoylated on Lys 25, located N-terminally to the kinase domain (54).
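The motif assignments described above can be summarized in a short sketch. This is only an illustration, not part of the study: the function name `classify_motif` and the example tripeptides passed to it are assumptions introduced here, and the mapping simply restates the four motifs named in the text (X denotes any residue).

```python
import re

# Activation-loop motifs named in the text; "X" stands for any residue.
# The group labels are shorthand chosen for this illustration.
MOTIFS = {
    "TXY": "MAPK",
    "YXY": "DYRK",
    "SXY": "HIPK1-3",
    "EPY": "HIPK4",
}


def classify_motif(tripeptide: str) -> str:
    """Return the kinase group whose activation-loop motif matches a tripeptide."""
    tri = tripeptide.upper()
    if len(tri) != 3:
        raise ValueError("expected exactly three residues")
    for motif, group in MOTIFS.items():
        # Turn the motif into a regular expression: X matches any letter.
        pattern = motif.replace("X", "[A-Z]")
        if re.fullmatch(pattern, tri):
            return group
    return "unclassified"


# Illustrative inputs only; "SAY" is a hypothetical tripeptide.
for example in ("TEY", "YQY", "SAY", "EPY", "GGG"):
    print(example, "->", classify_motif(example))
```

In every motif the final tyrosine is the residue whose phosphorylation is discussed in the text; the sketch does not model phosphorylation itself.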
Despite the promising therapeutic potential of targeting the HIPKs, there are currently no crystal structures of kinases in this family. To aid in the design of specific therapeutics for HIPK2, we determined the first crystal structure of a HIPK kinase, HIPK2, in complex with CX-4945, previously characterized as a casein kinase 2 subunit α (CK2α) inhibitor (55). CX-4945 (silmitasertib) is currently in phase I/II clinical trials for cholangiocarcinoma (56) and multiple myeloma (57). We describe the unique features of the HIPK2 kinase domain revealed by our crystal structure and discuss how they could be explored for design of HIPK2-targeted therapeutics.

Overall structure of the HIPK2 kinase domain
To obtain a crystal structure of the HIPK2 kinase domain, we generated a construct corresponding to the kinase domain of human HIPK2 (residues 177-547), expressed it in Escherichia coli, and purified the protein to homogeneity. Crystallization trials were conducted in the absence or presence of potential active site ligands, including nucleotides such as ATP, ADP, and AMP-PNP. We tested a panel of inhibitors known to act on HIPK2 and a broad range of CMGC family kinases (58–60). Under the conditions tested, HIPK2 crystallized in the presence of the ATP-competitive CK2α inhibitor CX-4945, which was previously shown to inhibit HIPK2, HIPK3, and a number of other kinases (62, 76) and was previously crystallized in complex with CK2α, the proviral integration site for Moloney murine leukemia virus-1 kinase (Pim1), and CLK2-4 kinases (63–65).
We determined the HIPK2 kinase domain structure at 2.2 Å resolution in the P6₂ space group with one molecule in the asymmetric unit (Table 1). In the structure, HIPK2 adopts the classical two-lobe kinase domain fold. The CX-4945 inhibitor binds to the ATP-pocket of the kinase in the type I active kinase conformation (DFG-in, helix C-in; Fig. 2A). The N-lobe of the HIPK2 kinase domain is mostly well-resolved and comprises a five-stranded β-sheet and a fully ordered helix C (αC). The electron density for three loops within the N-lobe is absent. These loop regions encompass residues 208-210 located in the tip of the P-loop, residues 231-236 located in the β3-αC linker region, and residues 269-272 that connect the β4 and β5 strands. The HIPK2 C-lobe is fully resolved, including an insert region characteristic of the members of the CMGC family (Fig. 2A). Our structure reveals that the insert is built by four short helices with an anti-parallel β-strand and extends across the base of the C-lobe, occluding helix G. This region of the structure is discussed in more detail below.
The HIPK2 kinase adopts an active conformation in the structure, with helix C swung toward the active site, permitting hydrogen bonds between the catalytic lysine (Lys 228; PKA residue Lys 72), the carboxylate group of the CX-4945 molecule, and Glu 243 (PKA residue Glu 91) in helix C (Fig. 2B). The N-terminal portion of the activation loop adopts an extended conformation, but the loop is only partially ordered in the structure, with no significant electron density for residues 354-359. The DFG motif aspartate (Asp 346; PKA residue Asp 184) points toward the active site, occupying the DFG-in conformation. In the structure, Asp 346 and Asp 324 play a central role in stabilization of an active conformation; the Asp 346 amino group forms a hydrogen bond with the CX-4945 carboxylate, and the Asp 346 carbonyl forms electrostatic interactions with His 322 (PKA residue Tyr 164) within the HXD motif. The side chains of Asp 346 and Asp 324 interact with the side chain of Asn 329 and the backbone amino group of Gly 348 (Fig. 2B and Fig. S1). Asp 324 forms two additional ionic pairings with His 322 and Ser 364. The activation loop tyrosine, Tyr 361, is phosphorylated in the structure, and the phosphate group forms hydrogen bonds with the guanidine side-chain groups of Arg 365 and Arg 368, further stabilizing the active conformation of the activation loop (Fig. 2B).
Arg 368 is located at the base of the P+1 substrate-binding pocket and corresponds to the "CMGC arginine," a characteristic feature present in the sequence of the CMGC family of kinases (66). As in most active CMGC kinases (67), the HIPK2 CMGC arginine side chain hydrogen-bonds to the main-chain oxygen of another residue in the activation loop and stabilizes a [...] 2D). This network of interactions connects the CMGC arginine to the CMGC-insert, orienting the insert toward the activation loop (Fig. 2D). The active state of the HIPK2 kinase in our structure is also evident in the conformation of the hydrophobic spines (Fig. S1). The catalytic C-spine is a noncontiguous hydrophobic zone that is connected by interactions with the nucleotide adenine ring (69–71). In human HIPK2, the C-spine is composed of Val 213 and Ala 226 in the N-lobe and Leu 284, Ile 330, Met 331, Leu 332, Val 390, and Leu 394 in the C-lobe. In our structure, the CX-4945 naphthyridine group completes the C-spine by interacting with Ala 226 and Met 331. Val 213 also stabilizes the bound inhibitor through favorable contacts with the chloroanilino moiety (Fig. S1). The regulatory R-spine in HIPK2 is also in the active assembled conformation. The bottom of the R-spine is stabilized in the C-lobe by Asp 383, which interacts with the backbone of His 322 in the HAD motif. The side chain of His 322 contacts the phenyl ring of Phe 347 in the DFG motif, stabilizing the kinase in the active state. Phe 347 in turn interacts with Leu 247 from the C-helix, and the top of the R-spine is completed through hydrophobic contacts with Ala 263 in the N-lobe (Fig. S1).
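As a quick arithmetic check on the extent of disorder in the N-lobe, the residue ranges quoted above can be tallied as follows. This is a minimal sketch: the dictionary labels and variable names are assumptions made here, and the construct boundaries (177-547) describe the crystallized protein, not necessarily the residues that were finally modeled.

```python
# Inclusive residue ranges without electron density in the N-lobe,
# as listed in the text. Labels are shorthand chosen for this sketch.
DISORDERED_N_LOBE = {
    "P-loop tip": (208, 210),
    "beta3-alphaC linker": (231, 236),
    "beta4-beta5 connection": (269, 272),
}

# Crystallized construct (residues 177-547); the modeled range may be shorter.
CONSTRUCT = (177, 547)


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


missing = {name: span_length(*rng) for name, rng in DISORDERED_N_LOBE.items()}
total_missing = sum(missing.values())
construct_length = span_length(*CONSTRUCT)

print(missing)           # per-loop counts of unmodeled residues
print(total_missing)     # residues missing across the three N-lobe loops
print(construct_length)  # residues in the crystallized construct
```

Under these assumptions the three loops account for 13 of the 371 residues in the construct.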

Binding of CX-4945 to the ATP site of HIPK2
The kinases within the CMGC family are attractive therapeutic targets. Ongoing efforts aim to synthesize selective inhibitors against various members, including HIPK2 (60, 72–74). Whereas our attempts to crystallize HIPK2 with the specific inhibitor TBID (60) or with ATP analogs were unsuccessful, we obtained diffracting crystals with CX-4945 (55, 62), an inhibitor rationally designed to bind selectively to the active-site ATP-pocket of CK2α, a Ser/Thr kinase that positively regulates cellular proliferation (75). The CX-4945 compound is well-resolved in the electron density map and makes several interactions with the residues in the ATP-pocket of HIPK2 (Fig. 3A). The carboxylate group of the inhibitor forms a hydrogen bond network with the catalytic lysine (Lys 228), the amino group of Asp 346 within the DFG motif, and a water molecule. Secondary stabilizing hydrogen bonds are made between the water, the amino nitrogen of Phe 347, and the carboxylate side chain of Glu 243, which also hydrogen-bonds to Lys 228. In HIPK2, the ATP-pocket is formed by Leu 205, Val 213, Ala 226, Val 261, Phe 277, Met 279, Leu 280, Met 331, and Ile 345. The naphthyridine ring of CX-4945 is tightly bound by the R-spine residues [...]
Previous crystal structures have determined the mode of CX-4945 binding to CK2α (63, 76), Pim1 (64), and CLK2-4 (65). The core hydrophobic active-site residues are well-conserved between HIPK2, CK2α, and the CLKs, whereas the sequence of the Pim1 hinge region is different at several positions, which make contact with CX-4945 (Fig. 3, B-E). In CK2α and the CLKs (only CLK2 is shown for clarity in Fig. 3), the hinge region (CK2α (His 115-Thr 119) and CLK2 (Leu 245-Leu 248)) interacts with CX-4945 in the ATP-binding pocket through electrostatic interactions. Pim1 contains a proline insertion in the hinge region, which reduces connections between the inhibitor and kinase backbone. In HIPK2, the hinge region extends toward the active site and contacts the inhibitor via Met 279 and Leu 280, which are buried in the back of the active site. The naphthyridine group of CX-4945 forms a hydrogen bond with the backbone nitrogen of Leu 280 (Fig. 3A). This interaction is conserved in CK2α and the CLKs (Fig. 3, C and D) but is absent in Pim1, in which this position is occupied by a proline (Pro 123) (Fig. 3E). CK2α makes a unique interaction with the inhibitor via His 160, which is positioned toward the ATP-pocket, likely accounting for the higher affinity of CK2α/CX-4945 binding (Fig. 3C). CLKs, Pim1, and HIPK2 contain a glutamate at this position (Glu 328 in HIPK2), which is oriented away from the active site. Whereas the apex of the P-loop in HIPK2 is disordered and does not participate in the contact with the inhibitor, in the structures of CK2α, CLKs, and Pim1, the P-loop apex makes direct interactions with the inhibitor. This unique feature of the P-loop in HIPK2 presents an opportunity for design of more potent and selective HIPK2 inhibitors.
Figure 3. [...] Inhibitor is shown with the 2Fo − Fc map contoured to 1.5 σ. Hydrogen bonds are shown as black dashes, and a water molecule is shown as a red sphere. B, for clarity, identifiers of all residues in structure images shown in C-F were reduced to single letters. A table summarizing the correspondence of the letters to individual residues in each structure is shown here. C-F, zoomed-in views of the active sites of the indicated CMGC kinases crystallized in complex with the CX-4945 inhibitor. Inhibitor binding is shown face-on and upon 70° rotation. C, HIPK2 bound to CX-4945; D, CK2α bound to CX-4945 (PDB code 3NGA); E, CLK1 bound to CX-4945 (PDB code 6FYV); F, Pim1 bound to CX-4945 (PDB code 5O11).

Comparison of the HIPK2 kinase domain with CMGC family members
The HIPK2 kinase domain structure shares a notable degree of similarity with solved structures of the CMGC kinases but also includes several unique features. The DYRK family, the closest homologs of the HIPK kinases, includes DYRK1A, DYRK1B, DYRK2, DYRK3, and DYRK4. Structures of DYRK1A, DYRK2, and DYRK3 have been solved (77, 78). Alignment of these kinase structures with the HIPK2 kinase domain shows high conservation of the global architecture of the kinase domain, and the structures superimpose with a root mean square deviation of 0.91 Å for DYRK1A, 0.82 Å for DYRK2, and 0.82 Å for DYRK3 over 218 Cα atoms (Fig. 4A). The residues in the active site of these kinases, specifically the interactions mediated by the activation loop, are highly conserved (Fig. 4B). There are only minor hydrophobic substitutions in the ATP-pocket in HIPK2 when compared with DYRK kinases (Fig. S2A). The tyrosine phosphorylation site within the activation loop is conserved among CMGC family members, including MAPK, GSK, RCK, CDKL, DYRK, and HIPK members. Interactions analogous to the ones observed between phosphorylated Tyr 361, Arg 365, and Arg 368 in HIPK2 are present in the structures of DYRK1A, DYRK2, DYRK3, ERK2, and GSK3β (77–79) (Fig. 4B and Fig. S2, B-D). An exception is seen in the apo structure of the closely related kinase, PRPF4B, where the phosphorylated tyrosine is rotated away from the stabilizing arginine residues (Fig. S2D).
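For readers unfamiliar with the metric, the root mean square deviation quoted above is computed over paired atoms (here, matched Cα atoms) after the two structures have been superimposed. The sketch below shows only the final distance calculation on toy coordinates; it assumes the coordinates are already optimally superimposed and paired, and it does not reproduce the alignment software or atom selection used in the study. The function name `rmsd` and the example coordinates are assumptions for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes atom i in coords_a is paired with atom i in coords_b and that
    the structures were superimposed beforehand (superposition is omitted).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates in angstroms (hypothetical, not taken from any structure):
# every atom in the second set is displaced by 1 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```

Because each value is an average over 218 paired atoms, a sub-angstrom RMSD indicates agreement of the core fold and can coexist with large local differences in regions excluded from the pairing, such as the inserts discussed below.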
A distinguishing feature of the HIPK2-containing branch of the CMGC family is the lack of the conserved HRD motif, which is replaced in DYRKs by HCD, in SRPKs by HTD, and in HIPKs by HAD (Fig. 2C). The role of the HRD arginine is to stabilize the active conformation of the kinase via interaction with the phosphorylation site in the activation loop (70). Typically, activation of kinases that lack the HRD arginine does not depend on activation loop phosphorylation. In CMGC kinases that contain an HRD motif, like MAPKs, the HRD arginine interacts with the first phosphorylation site in the TXY motif (79). The corresponding residue in the HIPK2 kinase is Ser 359 (within the SXY sequence), which in our structure is disordered, making it impossible to conclude what its putative interactions with the kinase domain are. Previous studies showed that mutation of Ser 359 does not affect catalytic activity of HIPK2 (5), implicating phosphorylation of the activation loop tyrosine (Tyr 361) as the most important determinant of kinase activity (48, 51). As seen in our structure, and previously that of DYRKs, pTyr 361 is not in direct contact with the HXD motif (Figs. 2B and 4B). Hence, lack of conservation of the HRD arginine is likely a consequence of a unique activation mechanism that evolved in the HIPK2-containing branch of the CMGC family.
Many CMGC kinases feature intramolecular interactions that the N-lobe and the C-lobe domains make with adjacent domains. Our structure reveals that HIPK2 forms interactions that are distinct from those characterized in the most closely related DYRK kinases. The N-lobe of DYRK kinases interacts directly with the N-terminally located DYRK homology (DH) box, which in DYRK2 is preceded by the N-terminal autophosphorylation accessory (NAPA) domain (80). The DH box of the DH box/NAPA module packs against the groove in the N-lobe, making extensive interactions with all five β strands and the N terminus of helix C (Fig. 4A). Whereas HIPK kinases have a divergent N terminus that lacks the DH box and the NAPA domain, in our structure, the N-terminal extension of HIPK2 packs in the same groove in the N-lobe. Although the electron density in this region was poor, preventing us from modeling the side chains, we were able to build the backbone atoms confidently. Hence, HIPK2 appears to engage its N-terminal region in the same pocket as DYRKs while forming a different set of interactions (81, 82).
As in other CMGC kinases, the C-lobe of HIPK2 forms intramolecular interactions with the CMGC-insert region. Comparison of the CMGC-insert among the CMGC kinases shows significant sequence and structural variation (Fig. 4A and Fig. S3) (1). HIPK kinases have long CMGC-insert regions; in HIPK1, HIPK2, and HIPK3 the insert region encompasses 78 residues (residues 418-493 in HIPK2). HIPK4 has the longest CMGC-insert of any CMGC family member (84 residues). Outside of the HIPK family, the HIPK2 CMGC-insert has the highest similarity to the DYRK family kinases both in length and sequence (Fig. S3). Despite these similarities, our crystal structure reveals significant differences in the conformation of the insert region between HIPK2 and DYRK kinases.

Unique features of the HIPK2 CMGC-insert region
As in other CMGC kinases, the HIPK2 CMGC-insert is located at the base of the kinase domain C-lobe and makes direct contacts with the G and H helices of the kinase domain (Fig. 4C). In HIPK2 and the DYRK kinases (77), the N-terminal portion of the CMGC-insert forms two short helices (αL and αL′) before diverging near the G helix. The succeeding region in DYRK1A forms a β-hairpin, whereas in DYRK2 and DYRK3 it extends as a short loop (Fig. S3). This region in HIPK2 adopts a β-hairpin that is significantly longer than the one in DYRK1A (Fig. 4D). The subsequent region of the CMGC-insert in HIPK2 adopts a conformation not seen in other CMGC kinases (Fig. S3), by turning toward helix G and forming a short two-turn helix, which we have called "helix M". In DYRK2 and DYRK3, this region of the insert first turns in the opposite direction, where it forms a long β-hairpin, and only then comes back toward helix G to form another β-hairpin located to the left of where helix M is located in HIPK2 (Fig. 4D). After helix M, the HIPK2 CMGC-insert region takes a unique path again and runs in close proximity to the activation loop before it adopts a short helical structure, which we call "helix N." The helix in this position is a universal feature of the insert regions in all CMGC kinases whose structures have thus far been solved (Fig. S3). In all of these structures, helix N connects C-terminally via a short loop with helix H in the core kinase domain. In HIPK2, this loop portion of the CMGC-insert is also unique because it is significantly longer than in other kinases and, instead of forming a loop, adopts a helical structure, becoming an extension of helix H. This results in a considerably longer helix H in HIPK2 compared with other CMGC family members (Fig. 4D). The unique features of the HIPK2 CMGC region discussed above seem integral to its structure and are largely unaffected by the packing of symmetry-related molecules in the crystal lattice.
Figure 4. [...] 5Y86). B, comparison of active site between HIPK2 and DYRK2; numbering in parentheses corresponds to DYRK2. C, zoomed-in view of the secondary structural elements in the CMGC-insert visualized in the CX-4945-bound HIPK2 kinase structure and its trajectory on the kinase C-lobe. D, comparison between the CMGC-insert regions in HIPK2 and DYRK2 kinases made by overlay of the CX-4945-bound HIPK2 kinase structure and the DYRK2 structure (PDB code 3K2L). E, phosphorylation of CMGC-insert serine residues observed in the structures of the CX-4945-bound HIPK2 kinase, DYRK2 (PDB code 3K2L), and DYRK3 (PDB code 5Y86).
Compared with other CMGC kinases, with the exception of DYRK2 and DYRK3, the long insert region of HIPK2 engages more extensively with the kinase C-lobe (Fig. S3). Its unique feature, helix M, is located directly next to the helix G that lines the so-called P+3 pocket (83). Together with helix L′, helix M partially blocks the P+3 pocket in HIPK2 (Fig. S4A). Whereas the functional role of this interaction for HIPK2 is currently unknown, the P+3 pocket in other CMGC kinases is utilized as a docking site for signaling partners that often serve as substrate-presenting scaffolds. The axin scaffold, which is critical for efficient phosphorylation of β-catenin by GSK3β, engages the P+3 site, and when HIPK2 and GSK3β kinases are overlaid, a helix within axin directly overlaps with helix L′ in the HIPK2 CMGC-insert region (84) (Fig. S4A). Remarkably, in both kinases, the interaction involves a phenylalanine residue whose side chain packs into the hydrophobic pocket within the C-lobe. In HIPK2, the intramolecular interaction is provided by Phe 434 localized in helix L′, whereas in GSK3β, the phenylalanine is provided intermolecularly by the axin scaffold (Phe 388) (Fig. S4B). The adaptor protein Cks, which engages CDK substrates primed by phosphorylation and enables their subsequent processive multiphosphorylation by CDKs (85–87), also engages with CDK kinases in this region. This interaction bears a resemblance to the helix L′/helix M cap made over helix G by the CMGC-insert in HIPK2 (Fig. S4A).

HIPK2 has a unique serine phosphorylation site within the CMGC-insert
The CMGC-insert region in HIPK2 contains a phosphorylation site, pSer 441, located in the apex of the β-hairpin, which is well-resolved in our crystal structure and is a known autophosphorylation site (48). This phosphorylation site is unique to HIPK2 and is not present in other HIPKs. In the structure, the phosphorylated Ser 441 hydrogen-bonds with the adjacent Arg 437 and with Ser 426 located on helix L (Fig. 4E). Analysis of molecular dynamics trajectories for phosphorylated HIPK2 indicates that whereas the interaction between pSer 441 and Ser 426 was not stable over the course of the simulations, the hydrogen bond between pSer 441 and Arg 437 remained intact (Fig. S5A). We hypothesized that this hydrogen bond is essential for the unique conformation of the CMGC-insert region observed in our HIPK2 kinase structure. However, the unbiased all-atom molecular dynamics simulations of the HIPK2 kinase domain structure in the presence or absence of Ser 441 phosphorylation showed that the root mean square fluctuation of all residues within the HIPK2 kinase domain was largely unaffected by the removal of Ser 441 phosphorylation (Fig. S5B). This analysis suggests that Ser 441 phosphorylation instead plays a different role in HIPK2. Moreover, the secondary structure prediction of the CMGC-insert sequences in HIPK1 and HIPK4 predicts similar secondary structure propensity to HIPK2 (Fig. S5B), suggesting that the CMGC-insert regions may adopt a similar structure in these kinases despite the absence of a serine phosphorylation site.
DYRK2 and DYRK3 are also phosphorylated within the CMGC-insert, but at different locations. DYRK2 has two phosphorylation sites (pSer 369 and pSer 385), and DYRK3 has one site (pSer 445). pSer 385 in DYRK2 and pSer 445 in DYRK3 are located within the loop regions that occupy a position similar to that of the β-hairpin in HIPK2, but because the loops are shorter, the phosphorylation sites in DYRKs are engaged in different interactions. In fact, the phosphoserine residues in DYRK kinases occupy the same position as Arg 437 occupies in HIPK2, and they form hydrogen bonds with nearby arginine residues (Arg 390 in DYRK2 and Arg 450 in DYRK3). pSer 369, located on the αL helix, is unique to DYRK2 and does not form significant interactions with neighboring residues (Fig. 4E). HIPK2 has an alanine in this position (Ala 421); interestingly, a neighboring tyrosine (Tyr 423) occupies the position that in DYRK2 is taken by the phosphorylated Ser 369 side chain (Fig. S3B). The differences in the CMGC-insert regions exhibited by HIPK2 compared with DYRKs make this a potential specific site for targeting HIPK2.
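The root mean square fluctuation (RMSF) used in this comparison measures, for each atom, how far its position deviates from its time-averaged position over a trajectory. The following is a minimal sketch of that definition on toy data. It assumes that all frames were aligned to a common reference beforehand and that one representative atom per residue is supplied; it is not the analysis pipeline used in the study, and the function name `rmsf` and the toy frames are assumptions.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom, in the same
    order in every frame. Frames are assumed to be pre-aligned.
    """
    if not frames or not frames[0]:
        raise ValueError("need at least one frame with at least one atom")
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Toy trajectory (hypothetical): atom 0 moves along x, atom 1 is static.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(toy_frames))
```

Comparing such per-residue profiles between simulations with and without the phosphate group is one way to express the statement that fluctuations were largely unaffected.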

Statistical analysis of evolutionary constraints identifies another unique insert in the αC–β4 loop as distinctive of HIPK kinases
To expand our understanding of the structural features that are characteristic of the HIPK family, we performed a statistical comparison of the evolutionary constraints acting on HIPK kinase domain sequences in relation to other CMGC kinases. These constraints generally correspond to residues that are highly conserved in HIPK sequences (foreground HIPK alignment in Fig. S6) and are biochemically distinct in other CMGC kinases (background CMGC alignment in Fig. S6) (66). Our analysis reveals that whereas HIPK-specific residues are broadly dispersed along the kinase sequence, they tend to spatially cluster in the regions of the kinase domain that mediate interactions with HIPK-specific inserts or flanking sequence segments, such as the CMGC-insert or the N-terminal extension, whose unique conformations are visualized in our HIPK2 kinase structure. For example, stabilization of the unique CMGC-insert is achieved through multiple interactions across the kinase C-lobe. Specifically, three residues, Trp 398, Arg 411, and Tyr 412, in HIPK2 are specific to the HIPK family, and in our structure they mediate interactions between the CMGC-insert and the kinase core (Fig. 5, A and B).
In addition, we identified a short insert segment within the loop connecting the αC helix and β4 strand, defined here as the αC–β4 loop insert, as a novel distinctive feature of HIPKs (Fig. 5, C and D). The αC–β4 loop insert adopts a unique conformation in the HIPK2 structure and is stabilized through HIPK-specific residues in the kinase core (Fig. 5D). In particular, Tyr 258 in the αC–β4 loop insert packs against a lysine in helix E (Lys 314), which is also identified as HIPK-specific in our analysis. Lys 314 additionally forms a hydrogen bond with Glu 253 to stabilize the loop (Fig. 5D). Other notable evolutionarily constrained residues in HIPK kinases include Met 331 in the ATP-binding pocket (Fig. S6), which may contribute to the specificity of inhibitor binding (Fig. 3B).

Discussion
The atomic structure of the HIPK2 kinase bound to an inhibitor provides the first structural insights into the HIPK family of kinases and opens the door for structure-function studies of this important class of cellular regulators. The HIPK2 kinase crystallized in an active conformation stabilized by the bound inhibitor and by phosphorylation of a conserved tyrosine in the activation loop. Our structure revealed a network of interactions that is highly conserved between HIPK2 and its closest CMGC relatives, the DYRK kinases. In both HIPK2 and DYRK kinases, phosphorylation of the activation loop appears to occur intramolecularly during translation before the kinase is fully folded. In addition to stabilizing an active state, this modification is important for switching substrate specificity in these kinases to serine and threonine residues (51,52). This unusual mechanism likely explains why the HRD arginine, which typically stabilizes a phosphorylation site within the activation loop (70), is not conserved in HIPKs and DYRKs. In HIPK2 and DYRK kinases, the phosphorylated tyrosine instead forms alternative interactions, engaging a conserved glutamine residue at the base of the substrate-binding pocket. In HIPK2, as in its close CMGC relatives, the glutamine participates in stabilization of the CMGC arginine residue and its interaction with a phosphorylated tyrosine in the activation loop. Although these interactions result in a strained backbone of the activation loop, they represent another unique feature necessary for the active state of CMGC kinases (67). Thus, the HIPK2 kinase domain recapitulates many key structural elements that are characteristic of this subclass of CMGC kinases.

Figure 5. A, sequence alignment of a region within the C-lobe in HIPK sequences from diverse phyla, as revealed by our analysis of evolutionary constraints acting on HIPK sequences (Fig. S6). Three sets of related sequences are shown in the hierarchical alignment: (i) a foreground set of 1498 HIPK sequences that share a co-conserved pattern, as defined by the Bayesian pattern-partitioning procedure, (ii) a background set of 14,296 CMGC sequences, and (iii) a display set of HIPK homologs from diverse phyla. Only the display sequences are explicitly shown in the alignment. The foreground (HIPK) and background (CMGC) alignments are shown as residue frequencies below the display alignment. Residue frequencies are indicated in integer tenths where, for example, a 9 indicates 90-100% occurrence of the corresponding residue, at the corresponding position, in weighted foreground or background sequences. Patterns identified as unique to HIPK sequences are indicated by black dots above the alignment and highlighted in the display alignment. Evolutionary constraints on the pattern residues are displayed as red histograms above the alignment, where the height of the histogram indicates the strength of HIPK-specific constraints at the corresponding position. B, evolutionarily constrained residues in the HIPK2 kinase C-lobe that form specific interactions with the CMGC-insert (shown in surface mode) are shown as sticks. C, distinguishing residues in the αC-β4 loop in HIPK kinases define a unique insert region. Insert sequences are indicated in lowercase letters, and their frequency is not scored due to their absence in all non-HIPK kinase sequences. D, the residues stabilizing the unique conformation of the αC-β4 loop insert in HIPK2 are shown as sticks. E, comparison of the αC-β4 loop structure in the HIPK2 kinase with DYRK1A (PDB code 3ANQ) and GSK3β (PDB code 1GNG).
Whereas interactions in the active site of HIPK2 closely resemble those described in its closest CMGC relatives, our structure, which reveals for the first time the conformation of the HIPK2 CMGC-insert, defines structural features that set it apart from other members of the CMGC kinase family. These include a short helix not seen in other CMGC kinases, which we call helix M, as well as a notable extension of helix H in the core of the kinase domain by the C-terminal region of the CMGC-insert. As a result, HIPK2 has an unusually long helix H, which distinguishes it from other CMGC kinases. These new structural elements within the CMGC-insert of HIPK2 give it a unique conformation and a mode of interaction with the kinase C-lobe that does not resemble any other CMGC-insert/kinase interaction characterized thus far. Our statistical analysis of the evolutionary constraints acting on HIPK sequences shows that the residues located at the interface between the CMGC-insert region and the kinase domain are conserved and distinctive of the HIPK family, prompting us to speculate that the CMGC-insert region might adopt a similar structure in all members of the HIPK family.
The CMGC-insert constitutes the most divergent region across all CMGC kinases, both in length and in sequence (66). This region equips CMGC kinases with unique functions by serving as a binding platform for signaling partners (66). These include the Cks adaptor protein for CDKs (85-87) and an axin scaffold for GSK3β, which is critical for efficient phosphorylation of β-catenin (84), as well as an inhibitor of that interaction, FRAT (88). In ERK2, point mutations in the CMGC-insert modulate the ability of ERK2 to bind MEK1 (89). Although no functional roles of the CMGC-insert region in HIPK2 have thus far been characterized, we describe a curious resemblance of the CMGC-insert-mediated interactions in HIPK2 to binding of substrate-presenting scaffolds in CDKs and GSK3β. Hence, one could speculate that the CMGC-insert similarly plays an important role in modulating HIPK2-dependent substrate specificity and perhaps also processivity of catalysis, in a manner analogous to Cks-mediated regulation of CDKs. In our statistical analysis of sequence constraints across evolution, we also identified a new insert region within the kinase domain specific for the HIPK family, called the αC-β4 loop insert. Our HIPK2 structure shows that this region adopts a conformation not seen in other CMGC kinases. It is therefore possible that the αC-β4 loop insert plays a specific role in HIPK-dependent signaling, perhaps by serving as an interaction site for intramolecular regulatory domains or intermolecular binding partners and/or by stabilizing the conformation of the adjacent helix C.
In addition to phosphorylation on Tyr361 in the activation loop, in our crystal structure the CX-4945-bound HIPK2 kinase is phosphorylated on Ser441 in the CMGC-insert region. Both of these sites were previously characterized as autophosphorylation sites in HIPK2 (48,51). The significance of Ser441 phosphorylation is unclear, and our studies of HIPK2 dynamics show that it is not involved in stabilization of the unique conformation of the CMGC-insert. Phosphorylation of the HIPK2 kinase domain has been shown to regulate its subcellular localization in addition to activation, possibly via regulation of HIPK2 oligomerization (48). The exact oligomeric state of HIPK2 in cells is unknown; however, in vitro studies suggest that phosphorylation promotes the transition from a HIPK2 dimer to a monomer (90). In support of this model, in our structure, the phosphorylated HIPK2 is a monomer and does not form extensive interactions with other molecules in the crystal lattice. In contrast to the co-translational phosphorylation on Tyr361, which is likely a constitutive modification, phosphorylation of Ser441 might serve as a regulatory switch, controlling events like HIPK2 oligomerization. Hence, the CMGC-insert could possibly play a role in HIPK2 oligomerization.
HIPK2 is emerging as an exciting therapeutic target, and efforts to identify modulators of HIPK2 activity have recently intensified. Small-molecule inhibitors that target either the hydrophobic ATP-binding pocket or an unknown allosteric site in HIPK2 have been reported (39,60). The inhibitor CX-4945 was designed to target CK2α; however, it also inhibits HIPK2 and HIPK3 (85 and 93% inhibition, respectively (76)), with an IC50 of 1 nM for CK2α and 45 nM for HIPK3 (62). Therefore, allowances need to be made while using CX-4945 as a scaffold for the design of specific HIPK2 inhibitors. By revealing the architecture of the active site, our structure provides a platform for guided design of such molecules. As demonstrated by recent findings of a CK2α inhibitor with minimal off-target effects (92), specificity can be achieved for these closely related kinases. Importantly, sites other than the nucleotide-binding pocket could be explored for targeting of the CMGC kinases. The unique interaction between the CMGC-insert region and the P+3 site in our structure points to a potential benefit of designing molecules that bind to the P+3 pocket and disrupt this interaction. P+3 pocket-targeting molecules have been described for a number of kinases in recent years (93,94), underscoring the validity of this approach. The HIPK2 structure described here will greatly aid in conceptualizing and optimizing future inhibitors with the overarching goal of finding effective treatments for diseases in which HIPK2 does not function properly. Most importantly, our HIPK2 structure serves as a starting point for characterizing the understudied members of the HIPK family.

Expression and purification of recombinant HIPK2 kinase domain
The kinase domain of HIPK2 (residues 178-547, as per UniProt accession code Q9H2X6) was cloned into pET28a vector and expressed in Rosetta DE3 pLysS cells (MilliporeSigma). A 25-ml lysogeny broth (LB) overnight culture, with 50 mg/liter kanamycin and 33 mg/liter chloramphenicol, grown at 37°C and 220 rpm for 20 h, was used to inoculate 1 liter of 2YT medium with antibiotics. When cell density reached A600 = 0.8, the flasks were cold-shocked on ice and transferred to a cold room for 1 h. Expression was induced by the addition of 0.7 mM isopropyl 1-thio-β-D-galactopyranoside to 1 liter of culture, which was then incubated at 18°C and 190 rpm for 19 h. Cells were harvested by centrifugation at 5,000 rpm for 40 min, flash-frozen in liquid nitrogen, and stored at −80°C.
The cell pellet was resuspended in lysis buffer (25 mM Tris, pH 8.5, 500 mM NaCl, 0.02% Triton X-100, 5 mM MgCl2, 2 mg of DNase I, 3 mM β-mercaptoethanol, 10 mM imidazole, and one cOmplete EDTA-free mini protease inhibitor mixture tablet) on ice and sonicated at 4°C, with a microtip, at 35% amplitude with 1-s on/4-s off pulses for 6 min using a sonic dismembrator model 500 (Fisher). The lysate was clarified by ultracentrifugation at 30,000 rpm for 30 min in a Ti45 rotor (Optima L-90K ultracentrifuge, Beckman Coulter). The supernatant was loaded onto a 1-ml HisTrap FF column (GE Healthcare) equilibrated in Ni-wash buffer (25 mM Tris, pH 8.5, 500 mM NaCl, 10 mM imidazole, and 3 mM β-mercaptoethanol) and eluted with a linear gradient of 60 ml of Ni-elution buffer (25 mM Tris, pH 8.5, 500 mM NaCl, 250 mM imidazole, and 3 mM β-mercaptoethanol). The protein eluted from 35 to 100 mM imidazole. Fractions shown by SDS-PAGE to contain recombinant HIPK2 kinase domain were diluted to <1 mg/ml in dialysis buffer (25 mM Tris, pH 8.5, and 5 mM β-mercaptoethanol), and the His tag was cleaved with tobacco etch virus protease in dialysis tubing against 2 liters of dialysis buffer overnight at 4°C. Uncleaved protein and protease were removed with nickel-nitrilotriacetic acid resin. The flow-through was applied to a RESOURCE Q column equilibrated in Q-wash buffer (25 mM Tris, pH 8.5, 20 mM NaCl, and 3 mM β-mercaptoethanol) and eluted with a linear gradient of 40 ml of Q-elution buffer (25 mM Tris, pH 8.5, 1 M NaCl, and 3 mM β-mercaptoethanol). The HIPK2 kinase domain eluted as a broad peak from 230 to 400 mM NaCl, indicating that the kinase domain contained multiple phosphorylated species. The protein-containing fractions were applied to a Superdex 200 column in final buffer (25 mM Tris, pH 8.5, 100 mM NaCl, and 0.5 mM TCEP), from which the protein eluted as a single monomeric peak. The purified HIPK2 kinase domain was concentrated to ~20 mg/ml, flash-frozen in liquid nitrogen, and stored at −80°C.
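As a rough aid for reading the elution ranges above, the following sketch converts a concentration into the point along a linear gradient at which it is reached. It assumes that each gradient runs from the wash-buffer concentration to the elution-buffer concentration over the stated volume and that column and tubing dead volumes can be ignored; the function names are illustrative and are not part of the published protocol.

```python
def gradient_concentration(volume_ml, start_mM, end_mM, gradient_ml):
    """Concentration (mM) delivered after `volume_ml` of a linear gradient."""
    if not 0 <= volume_ml <= gradient_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mM + (end_mM - start_mM) * volume_ml / gradient_ml


def volume_at_concentration(conc_mM, start_mM, end_mM, gradient_ml):
    """Gradient volume (ml) at which the concentration reaches `conc_mM`."""
    low, high = sorted((start_mM, end_mM))
    if start_mM == end_mM or not low <= conc_mM <= high:
        raise ValueError("concentration is not reached within the gradient")
    return gradient_ml * (conc_mM - start_mM) / (end_mM - start_mM)


# HisTrap step: 10 -> 250 mM imidazole over 60 ml (assumed gradient endpoints).
ni_window = [volume_at_concentration(c, 10, 250, 60) for c in (35, 100)]
# RESOURCE Q step: 20 mM -> 1 M NaCl over 40 ml (assumed gradient endpoints).
q_window = [volume_at_concentration(c, 20, 1000, 40) for c in (230, 400)]

print("Ni elution window (ml):", ni_window)
print("Q elution window (ml):", q_window)
```

Under these assumptions, the reported 35-100 mM imidazole range corresponds to roughly the first third of the nickel gradient, which may help when choosing fraction sizes.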

Crystallization, data collection, and structural determination of recombinant HIPK2 kinase domain
Recombinant HIPK2 kinase domain was diluted to 6 mg/ml, and the inhibitor CX-4945 was added to a final molar concentration of 1.25×. Initial sparse matrix crystallization screening was performed at 20°C, by hanging-drop vapor diffusion, with 100-nl + 100-nl drops of protein/inhibitor against reservoir solution, on a Mosquito crystallization robot (TTP Labtech) using commercial screens (Qiagen). The vast majority of conditions precipitated the kinase-inhibitor complex; however, one condition produced small crystals that were subjected to optimization. Crystals were grown in 24-well plates using hanging-drop vapor diffusion at 20°C with 1 μl of protein-inhibitor solution against 1 μl of reservoir solution (final condition: 20% PEG 3350 and 0.2 M KSCN). Initial crystals formed within 3 days, and large hexagonal pyramids grew to ~100 μm in diameter by 1 week. The crystals were taken to ALS beamline 8.3.1 (Advanced Light Source), cryo-protected with 20% glycerol, and flash-cooled in liquid nitrogen. A single crystal diffracted to 2.2 Å on the Pilatus 6M detector (Dectris). The images were analyzed and integrated with HKL2000 (95) in space group P6₂ with unit cell a = b = 130.2 Å, c = 52.3 Å, α = β = 90°, γ = 120°, and the intensities were scaled and merged in Aimless (CCP4) (96). The structure was solved by molecular replacement using Phaser (97) with a poly-Ala model of DYRK1A (PDB code 3ANQ). The structure was manually built in Coot (98) and refined with Phenix (99) with subsequent rounds of model building, refinement, and structural analysis until Rwork and Rfree stabilized at 0.20 and 0.24, respectively. Electron density for CX-4945 could be clearly seen in the hydrophobic ATP-binding pocket. The crystal structure of the HIPK2 kinase domain contains three Ramachandran outliers, all located in the N terminus of the kinase, where local packing constraints force the residues into suboptimal geometry. Great care was taken during model building to place these residues while remaining true to the calculated electron density map. The final structural coordinates and electron density maps were deposited with the Protein Data Bank (PDB code 6P5S). Structural visualization and comparison with related kinases were performed in PyMOL (Schrödinger, LLC, New York).
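The reported cell dimensions can be turned into a cell volume and a rough packing estimate. The sketch below uses the general triclinic volume formula and the standard Matthews relation; the molecular weight (about 110 Da per residue for the 370-residue construct), the single copy per asymmetric unit, and Z = 6 (the number of general positions in P6₂, the reading of the space group assumed here) are illustrative assumptions rather than values stated in the text.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic angstroms) of a general triclinic unit cell."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(cell_volume, mol_weight_da, z, copies_per_asu=1):
    """Matthews coefficient V_M (cubic angstroms per dalton)."""
    return cell_volume / (z * copies_per_asu * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M, using the usual 1.23 constant."""
    return 1 - 1.23 / v_m


# Cell parameters as reported for the HIPK2/CX-4945 crystal.
volume = unit_cell_volume(130.2, 130.2, 52.3, 90, 90, 120)

# Assumptions, not values from the paper: ~110 Da per residue for the
# 370-residue construct, and one molecule per asymmetric unit.
approx_mw = 370 * 110
v_m = matthews_coefficient(volume, approx_mw, z=6)  # z = 6 assumed for P6_2

print(f"cell volume: {volume:.0f} A^3")
print(f"V_M: {v_m:.2f} A^3/Da, solvent fraction: {solvent_fraction(v_m):.0%}")
```

Passing a different `copies_per_asu` or molecular weight shows how sensitive the packing estimate is to those assumptions.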

Molecular dynamics
All-atom unbiased MD simulations were performed using GROMACS 2016.4 (100). Structures were parameterized using the CHARMM36 (101) force field, solvated with TIP3P water, and neutralized using sodium and chloride ions. The system was contained in a dodecahedral box extending at least 1 nm beyond the protein on all sides, with periodic boundary conditions. Long-range interactions were calculated using particle mesh Ewald. Neighbor lists were maintained by the Verlet cutoff scheme (102). The system underwent steepest-descent energy minimization until the maximum force was <100 kJ/mol. The canonical ensemble (103) was used to warm the system from 0 to 310 K in 100 ps. The isothermal-isobaric ensemble (104) (1 bar, 310 K) was then applied for 100 ps. Positional restraints were applied during equilibration. The unbiased MD simulation used 2-fs time steps.
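Two pieces of arithmetic behind this setup can be sketched briefly: the number of integration steps in each 100-ps equilibration stage, and the volume saved by a rhombic dodecahedral box relative to a cubic box with the same periodic image distance. The sketch assumes that the 2-fs step also applies during equilibration (the text states it only for the production run), and the 7-nm solute extent is a hypothetical value used purely for the comparison.

```python
import math


def n_steps(duration_ps, timestep_fs):
    """Number of integration steps needed to cover `duration_ps`."""
    return round(duration_ps * 1000 / timestep_fs)


def box_volume_nm3(solute_extent_nm, padding_nm=1.0, shape="dodecahedron"):
    """Volume of a periodic box whose image distance is extent + 2 * padding."""
    d = solute_extent_nm + 2 * padding_nm
    if shape == "cubic":
        return d**3
    if shape == "dodecahedron":
        # Rhombic dodecahedron with the same image distance: (sqrt(2) / 2) * d^3.
        return math.sqrt(2) / 2 * d**3
    raise ValueError(f"unknown box shape: {shape}")


# Each 100-ps equilibration stage, assuming the 2-fs step is used there too.
equilibration_steps = n_steps(100, 2)

# Hypothetical 7-nm solute extent, used only to compare the two box shapes.
saving = 1 - box_volume_nm3(7.0) / box_volume_nm3(7.0, shape="cubic")

print(equilibration_steps, "steps per 100-ps stage")
print(f"{saving:.0%} less volume than a cubic box")
```

The volume saving does not depend on the solute size, which is why the dodecahedral shape reduces the number of water molecules for any roughly globular protein.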

Identification and quantification of HIPK-specific evolutionary constraints
Evolutionary constraints imposed on HIPK sequences were identified using a Bayesian approach described previously (61,91). In brief, curated multiple-sequence alignment profiles of various CMGC kinase families (66) were used to detect and align 15,280 CMGC sequences from the NCBI-nr database. The aligned sequences were used as input for Bayesian partitioning with pattern selection, which identified a correlated residue pattern that most distinguished HIPK sequences from other CMGC sequences. The HIPK-specific patterns are shown in the form of a hierarchical multiple-sequence alignment in Fig. S6.
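The idea of contrasting a foreground alignment with a background alignment can be illustrated with a deliberately simplified sketch. It is not the Bayesian partitioning with pattern selection procedure used here: it only tabulates, for each column, the foreground consensus residue, its frequency in both sets encoded as integer tenths (as in the alignment display, where 9 indicates 90-100%), and a plain frequency difference. The sequences are made-up toy data, sequence weighting is omitted, and all function names are hypothetical.

```python
from collections import Counter


def column_counts(sequences, position):
    """Count residues at one alignment column, skipping gap characters."""
    return Counter(seq[position] for seq in sequences if seq[position] != "-")


def integer_tenths(count, total):
    """Encode a frequency as an integer tenth (9 covers 90-100%)."""
    return min(10 * count // total, 9) if total else 0


def contrast_scores(foreground, background):
    """Per column: (consensus residue, foreground tenth, background tenth, contrast)."""
    results = []
    for pos in range(len(foreground[0])):
        fg = column_counts(foreground, pos)
        bg = column_counts(background, pos)
        fg_total, bg_total = sum(fg.values()), sum(bg.values())
        if fg_total == 0:  # column is all gaps in the foreground
            results.append(("-", 0, 0, 0.0))
            continue
        residue, fg_count = fg.most_common(1)[0]
        fg_freq = fg_count / fg_total
        bg_freq = bg[residue] / bg_total if bg_total else 0.0
        results.append((residue,
                        integer_tenths(fg_count, fg_total),
                        integer_tenths(bg[residue], bg_total),
                        fg_freq - bg_freq))
    return results


# Toy, made-up aligned sequences (not taken from Fig. S6).
toy_foreground = ["WRY", "WRY", "WRY", "WKY"]
toy_background = ["FRA", "LRS", "WRT", "FRG"]

for row in contrast_scores(toy_foreground, toy_background):
    print(row)
```

Columns with a high contrast are conserved in the foreground but rare in the background, which is the qualitative signature of a family-specific residue; the published method additionally models correlated patterns across positions, which this sketch does not attempt.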