A structural view of PA2G4 isoforms with opposing functions in cancer



Edited by Alex Toker
The role of proliferation-associated protein 2G4 (PA2G4), alternatively known as ErbB3-binding protein 1 (EBP1), in cancer has become apparent over the past 20 years. PA2G4 expression levels are correlated with prognosis in a range of human cancers, including neuroblastoma, cervical, brain, breast, prostate, pancreatic, hepatocellular, and other tumors. There are two PA2G4 isoforms, PA2G4-p42 and PA2G4-p48, and although both isoforms of PA2G4 regulate cellular growth and differentiation, these isoforms often have opposing roles depending on the context. Therefore, PA2G4 can function either as a contextual tumor suppressor or as an oncogene, depending on the tissue being studied. However, it is unclear how distinct structural features of the two PA2G4 isoforms translate into different functional outcomes. In this review, we examine published structures to identify important structural and functional components of PA2G4 and consider how they may explain its crucial role in the malignant phenotype. We will highlight the lysine-rich regions, protein-protein interaction sites, and post-translational modifications of the two PA2G4 isoforms and relate these to the functional cellular role of PA2G4. These data will enable a better understanding of the function and structure relationship of the two PA2G4 isoforms and highlight the care that will need to be undertaken for those who wish to conduct isoform-specific structure-based drug design campaigns.

PA2G4 discovery and structure
The human PA2G4 gene was first isolated, characterized, cloned, and mapped by Lamartine et al. (1). They reported a 394-amino acid protein that showed strong similarity to the recently isolated murine p38-2G4 protein (2). Because p38-2G4 was identified as a cell cycle-regulated DNA-binding protein, it was deduced that the human counterpart was likely to have the same function. Three years later, a screen biased toward identifying proteins that bind ErbB3 in the absence of tyrosine phosphorylation described two alternate isoforms of human PA2G4, although they were not fully characterized (3). We now understand that the human PA2G4 gene contains three in-frame ATG initiation codons, which can be differentially spliced to produce either a short PA2G4-p42 (340 amino acids (aa)) or long PA2G4-p48 (394 aa) isoform (4) (Fig. 1). Although both isoforms of PA2G4 regulate cellular growth and differentiation, these isoforms often have opposing roles, depending on the context. The long PA2G4-p48 isoform is transcribed by the first ATG codon and has been shown to promote tumorigenesis. It is the predominant form of PA2G4 found in mammalian cells and is located in both the nucleus and cytoplasm (4). Conversely, the short PA2G4-p42 acts as a tumor suppressor. The PA2G4-p42 isoform is transcribed by the third ATG codon of the PA2G4 gene, is expressed at relatively low levels in mammalian cells, and has been localized only in the cytoplasm (4). Many published biological studies do not differentiate between these isoforms. Where possible throughout this review article, we have referred to them via their isoform-naming convention (i.e. PA2G4-p48 or PA2G4-p42). Where there is ambiguity over which isoform the study was conducted on, or for features that are conserved across both isoforms, we have not added the isoform-naming convention (i.e. PA2G4). Furthermore, unless stated, amino acid numbering is based upon the larger PA2G4-p48 numbering.
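Because residue numbers in this review follow PA2G4-p48 unless stated, it can help to convert them to PA2G4-p42 numbering when comparing the isoforms. The following is a minimal sketch, assuming that PA2G4-p42 is simply PA2G4-p48 without its first 54 residues and is otherwise identical in sequence; the function and constant names are hypothetical and introduced only for illustration.

```python
from typing import Optional

P48_LENGTH = 394           # length of the long isoform, as stated in the text
N_TERMINAL_EXTENSION = 54  # residues present only in PA2G4-p48, as stated in the text
P42_LENGTH = P48_LENGTH - N_TERMINAL_EXTENSION  # 340 residues


def p48_to_p42(position: int) -> Optional[int]:
    """Convert a PA2G4-p48 residue number to PA2G4-p42 numbering.

    Returns None when the residue lies in the N-terminal extension and is
    therefore absent from PA2G4-p42 (assumption: pure N-terminal truncation).
    """
    if not 1 <= position <= P48_LENGTH:
        raise ValueError(f"position {position} is outside 1..{P48_LENGTH}")
    if position <= N_TERMINAL_EXTENSION:
        return None
    return position - N_TERMINAL_EXTENSION


if __name__ == "__main__":
    for residue in (22, 44, 360):
        print(residue, "->", p48_to_p42(residue))
```

Under this assumption, residues such as Lys 22 and Ser 44 have no counterpart in PA2G4-p42, whereas Ser 360 maps to position 306 of the shorter isoform.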
To date, three protein structures of PA2G4 have been published (Table 1). The first was a 2.5 Å resolution X-ray crystal structure of murine PA2G4-p48 (PDB entry 2V6C) (5). This structure was derived from an N- and C-terminally truncated PA2G4-p48 construct (aa Glu 8-Ser 360) and has continuous electron density observed between residues 8 and 360 (5). The second structure was a higher-resolution (1.6 Å) X-ray crystal structure of the human PA2G4-p48 (PDB entry 2Q8K) (6,7). This was obtained with electron density observed between residues 7 and 362 and revealed a highly conserved structural fold to the murine PA2G4, with a root mean square deviation (RMSD), a measure of structural similarity, across the α-carbons of 0.44 Å. Notably, in this structure, a loop region (aa 283-292) does not fit the electron density well and is missing two amino acids (284 and 285). In 2019, a small 20-amino acid fragment of PA2G4 in complex with HNF4α was also solved (PDB entry 6CHT), although only 7 amino acids (AELKALL) were resolved, and we will not discuss this structure further (9). Finally, in 2020 a cryo-EM structure of the human PA2G4-p48, ranging between 3.3 and 8 Å resolution, was elucidated in complex with the human 80S ribosome (PDB entry 6SXO) (8). Interestingly, this is the only structure that has the most C-terminal 33 amino acids resolved. Due to these additional amino acids, unless stated, this structure was utilized in all figures. The cryo-EM structure also showed good alignment to both the murine and human X-ray structures with an RMSD across the α-carbons of 0.86 Å for both.
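The RMSD values quoted above compare matched α-carbon positions between two structures. As a minimal sketch of the calculation, assuming the two structures have already been superposed and that the coordinate lists contain the same residues in the same order, the RMSD is the square root of the mean squared distance between paired atoms. The function name and the toy coordinates are illustrative only and are not taken from the PDB entries.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """RMSD in the units of the input (e.g. angstroms) over paired atoms.

    Assumes both coordinate sets are already superposed and list the same
    alpha-carbons in the same order; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not real PA2G4 atoms.
    model_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    model_b = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(ca_rmsd(model_a, model_b))
```

In practice, structural biology packages perform the superposition step first (for example with a least-squares fit) before reporting an RMSD; that step is deliberately omitted from this sketch.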
All of the PA2G4-p48 structures revealed a central PA2G4/Ebp1 fold that is highly conserved between mice and humans, and they have provided insights into the protein's functionality. These structures revealed that PA2G4 is structurally related to the family of Type II methionine aminopeptidases (Type II MAPs), due to its overall "pita bread" conformation with two α-helices enclosing two antiparallel β-sheets (Fig. 2) (6,10). MAPs are metalloproteases that cleave the first methionine off a growing polypeptide chain (11). Notably, PA2G4-p48 only has 25% sequence similarity to Type II MAPs, and the differences in the sequences highlight the differences in function. Importantly, although both have a conserved concave binding pocket, residues Asp 262, His 339, and Glu 459 in hMAP2 facilitate the catalytic function of the Type II MAPs. In PA2G4, these residues are replaced by Asn 120, Asp 195, and Lys 320 (Fig. 2). These substitutions alter the charge environment of the binding pocket, particularly the negative-to-positive charge swap from Glu 459 to Lys 320 in PA2G4. Furthermore, unlike hMAP2, none of the solved PA2G4 crystal structures show any evidence that zinc ions or methionine could be present in this binding pocket. Lastly, PA2G4 contains additional C-terminal extensions (aa 337-394), helical (aa 205), and loop (aa 63-72 and 128-134) insertions, which differ from Type II MAPs (5). These structural differences may explain the inability of PA2G4 to perform hydrolytic substrate cleavage on a common MAP substrate. Thus, although PA2G4 adopts a hMAP2-like structure, it clearly has a unique functionality.

PA2G4 interactions with DNA and role in RNA processing and translation
The ability of PA2G4 to bind nucleic acids is well-documented (12-16). The PA2G4 protein has a predicted amphipathic helical domain (aa 204-246), suggesting that it may directly interact with DNA and/or proteins. PA2G4 has been shown to interact indirectly with E2F1 at E2F1 promoter elements by binding to a complex of nuclear proteins, including retinoblastoma (Rb) protein (17) and Sin3A (18,19). The activity of Rb and PA2G4 on the E2F1 promoter is regulated by the ErbB-3 ligand heregulin (20). The C-terminal domain of PA2G4 is an important DNA-binding domain. Xia et al. (17) showed that this domain (residues 300-372) was necessary for Rb binding and that the binding of PA2G4 to Rb is important in transcriptional repression in Rb1 cells, as PA2G4 mutants lacking the Rb-binding domain could no longer repress transcription of cyclin E. PA2G4 has also been shown to bind to HDAC2 both in vitro and in vivo, with the ability of PA2G4 to recruit histone deacetylase (HDAC) related to its ability to repress transcription. HDAC inhibitors such as trichostatin A can significantly reduce PA2G4-mediated repression (21). PA2G4 strongly binds protein kinase B (Akt) and suppresses apoptosis. Phosphorylated nuclear but not cytoplasmic Akt interacts with PA2G4 and enhances its antiapoptotic action independent of Akt kinase activity (12). Kim et al. (22) determined that PA2G4 also binds to the p53 E3 ligase HDM2, enhancing HDM2-p53 association and thereby promoting p53 polyubiquitination and subsequent degradation to decrease p53 activity. Furthermore, Kim et al. (23) proposed that PA2G4 modulates p53 levels by stabilization of the HDM2 protein and by facilitating HDM2 interaction with Akt. PA2G4 is involved in transcriptional regulation and also seems to participate in the control of protein translation (14,15).
Analysis of the crystal structure of PA2G4 revealed two putative RNA-binding sites: the helix α5 (aa 205-213), which contains positively charged lysine residues often found in RNA interaction sites (6), and an extended positively charged surface patch, also with a lysine-rich sequence (364 RKTQKKKKKK 373), situated within the C-terminal extension (aa 338-394) (5, 6). Interestingly, it was originally suggested via sequence analysis that PA2G4 contained a σ70-like domain (14) and a dsRNA-binding domain (24) and that these were the RNA-binding sites of PA2G4. Although deletion of these putative domains abrogated RNA binding, the crystal structures have since shown that neither of these domains is present in PA2G4, leading to the conclusion that the loss of function arose because deletion of the residues was likely to have severely disrupted protein folding (5). Furthermore, mutations within the lysine-rich C-terminal extension (aa 364-373) partially inhibit nucleolar localization (14). As the nucleolus is the main site of ribosome biosynthesis, this suggests that the RNA-binding activity of PA2G4 contributes to this function.
PA2G4 protein binds to a number of RNAs, including mRNA (13), rRNA, and ribonucleoprotein complexes (15,24). PA2G4 seems to bind more strongly to ribosomal 5S RNA when compared with a mixture of tRNAs or the S-domain RNA of the signal recognition particle, suggesting a direct interaction of PA2G4 with mature ribosomes via the 5S RNA (6). As a member of the ribonucleoprotein complex, which is the association of RNA and protein, PA2G4 contains a dsRNA-binding domain that mediates its interaction with dsRNA (14,24). Deletion of this domain diminishes its localization to the nucleolus and reduces the formation of ribonucleoprotein complexes (24). In the cytoplasm, PA2G4 associates with mature ribosomes and acts as a cellular inhibitor of eukaryotic initiation factor 2α (eIF2α) phosphorylation by binding the dsRNA-activated kinase, PKR, which renders it incapable of phosphorylating eIF2α (24).
PA2G4 may also play a role in the processing of rRNA precursors and intermediates. It associates with nucleophosmin (B23) in the nucleolus, and ablation of either B23 or PA2G4 reduces ribosome biosynthesis by inhibiting the processing of the 32S intermediate for subsequent maturation of the 28S rRNA (25). This suggests that PA2G4 may act as an endonuclease that cleaves the 32S pre-rRNA to generate the mature 28S rRNA. Fig. 3 summarizes the molecular interactions of PA2G4 in the cytoplasm and nucleus and its key role in cellular processes.

The function of structural regions of PA2G4
PA2G4 contains numerous structural regions (i.e. lysine-rich regions, an LXXLL motif, and numerous phosphorylation sites). The functions of these regions of PA2G4 have been elucidated predominantly via mutagenesis studies. We have summarized these studies and mapped them onto the PA2G4-p48 structure (Fig. 4).
Figure 2. Both structures are rainbow-colored from the N terminus (blue) to the C terminus (red). Highlighted are the antiparallel β-sheets and the amino acid differences (PA2G4 Asn 120, Asp 196, and Lys 320 and hMAP-II Asp 262, His 339, and Glu 459) that are hypothesized to make PA2G4 unable to act like an hMAP-II and catalyze the release of Met via its zinc ions (Zn).
The N-terminal region is only present in the PA2G4-p48 isoform. This region has been suggested to be involved in nucleolar localization, as N-terminal deletion mutants of the protein are detected in the cytoplasm and excluded from the nucleus (14). Mutagenesis studies reveal that residues Lys 20-Lys 22 on PA2G4 are essential for nuclear localization, with K20A/K22A mutants being unable to localize to the nucleus (14). Nuclear localization is a passive mechanism, and generally, the protein signal for nuclear localization is extended basic sequences (26). However, studies have shown that the essential component is a strong positive charge over a specific interaction surface (27). Lys 20-Lys 22 are located on a solvent-exposed region of the PA2G4 protein. Interestingly, this position is adjacent to the second lysine-rich region, the highly basic loop 1 (63 IFKKEKEMK 71). Due to the structural proximity of loop 1 and Lys 20-Lys 22, alongside their importance in nuclear localization, it could be speculated that PA2G4 contains a bipartite nucleolar localization signal (6). Nuclear export and nucleolar localization and/or retention may also be regulated by the C terminus. Specifically, alanine mutation of either Arg 364 or Lys 365 resulted in only partial nucleolar localization (14). Lys 22 in the N-terminal region has also recently been found to be important for an interaction with the minor groove of RNA (8).
The C-terminal region of PA2G4 is highly positive, with six continuous lysine residues. Aside from its aforementioned role in nucleolar localization, it is widely thought to be associated with PA2G4's role as an internal ribosome entry site (IRES) trans-acting factor (ITAF) (5). Initially, PA2G4 was identified as IRES-trans-acting factor (ITAF45) (1,2), and its ITAF activity has been implicated in the pathogenesis of the neurovirulent Theiler's murine encephalomyelitis virus (28). ITAFs are RNA-binding proteins that have an additional role in the initiation of cap-independent translation (29,30), which is the first major phase of protein biosynthesis (31).
Figure 3. Key molecular interactions and cellular processes of PA2G4. PA2G4 binds to ErbB-3 in the cytoplasm; however, upon stimulation with heregulin (HRG), PA2G4 dissociates from ErbB-3 and translocates to the nucleus. PA2G4 is phosphorylated by protein kinase C and p21-activated kinase (PAK1). Phosphorylation of PA2G4 at serine 360 is needed for proper nuclear localization and for binding to ErbB-3 and nuclear Akt. PA2G4 binds to many proteins, including Sin3A and Rb, to enable transcription of E2F1, a cell cycle regulator. PA2G4 regulates cell survival by interacting with Akt and MDM2 to form a nucleoprotein complex to deregulate p53. PA2G4 also plays a role in the processing of rRNA precursors and intermediates by associating with nucleophosmin in the nucleolus and acting as an endonuclease that cleaves the 32S pre-rRNA to generate the mature 28S rRNA. In the cytoplasm, PA2G4 associates with mature ribosomes and acts as a cellular inhibitor of eIF2α (eukaryotic initiation factor 2α) phosphorylation by binding the dsRNA-activated kinase, PKR, which renders it incapable of phosphorylating eIF2α, and is involved in protein translation.
Little is currently known about the role of cellular IRES function; however, ITAFs are believed to be recruited from their cellular functions to assist the folding and/or stabilization of the active form of IRES (29, 30, 32-34). This in turn changes the binding affinity for the IRES to the translation apparatus, promoting the initiation of RNA translation (35). This is one of the two highly regulated translation initiation methods in eukaryotic mRNAs, the other being the cap-dependent initiation mechanism, which occurs in many mRNAs (36). Removal of the C-terminal region (construct containing only aa 1-360) led to PA2G4 being unable to bind to the IRES (5), suggesting an essential role of the C terminus. This was recently elucidated in the cryo-EM structure, where the C terminus of PA2G4 formed an α helix that filled the major groove of the RNA (8). Thus, the authors postulate that the C-terminal region of PA2G4 acts like the well-known arginine-rich motifs that are present in viral transactivation and signal recognition (8,37,38). Notably, this C terminus is a feature that distinguishes PA2G4 from the Type II MAPs and may relate to its specific functionality.
LXXLL motif and PA2G4-p42/PA2G4-p48-isoform conserved protein-protein interaction sites
The C-terminal region has been shown to be an important protein-protein interaction binding domain, particularly in the repression of transcription. It contains an LXXLL motif, commonly known as a nuclear receptor interaction domain (39). Similar to nuclear receptors, the PA2G4 LXXLL motif (354 LKALL 358, Fig. 5) is helical and forms the charge-clamp interaction that neutralizes the helix dipole (40). Notably, for PA2G4, this is likely to be a weaker interaction than the traditional nuclear receptor charge-clamp interaction, due to a slightly different helical architecture and the positively charged residue being positioned upstream of the LXXLL motif. This may lead to a greater diversity of binding partners. Monie et al. (5) suggest that dissociation of the 354 LKALL 358 helix from the surface of PA2G4 may allow for another LXXLL motif to bind to the pocket. However, they note that a binding partner for this site has not been identified to date. Interestingly, the recent cryo-EM complex displayed this motif exposed from the rest of the 80S ribosome, enhancing the suggestion that this motif could bind proteins and affect the ITAF role of PA2G4 (8). To date, proteins shown to bind to the region include Rb, Akt, Sin3A, androgen receptor, and histone deacetylase 2 (HDAC2), which functions to repress transcription in cancer cell lines (12,16,20,41,42).
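The LXXLL pattern discussed above (leucine, any two residues, two leucines) can be located in a sequence with a simple regular expression. The sketch below is illustrative only: the function name is hypothetical, and the example uses the seven-residue fragment AELKALL reported for PDB entry 6CHT, with a start offset of 352 inferred from the 354 LKALL 358 numbering rather than taken from a sequence database.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping motifs are also reported.
LXXLL_PATTERN = re.compile(r"(?=(L..LL))")


def find_lxxll(sequence: str, first_residue_number: int = 1) -> List[Tuple[int, str]]:
    """Return (residue number of the first Leu, matched motif) for each LXXLL hit.

    first_residue_number is the number of sequence[0]; here it is an
    assumption supplied by the caller, not derived from the sequence.
    """
    hits = []
    for match in LXXLL_PATTERN.finditer(sequence.upper()):
        hits.append((match.start() + first_residue_number, match.group(1)))
    return hits


if __name__ == "__main__":
    # Fragment resolved in PDB entry 6CHT; the offset 352 is inferred from the text.
    print(find_lxxll("AELKALL", first_residue_number=352))
```

A sequence match of this kind only flags candidate motifs; whether a given LXXLL stretch is helical and accessible still has to be judged from the structure.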
Our studies suggested that an alternate region of PA2G4-p48 might be a potential protein-protein interaction site, specifically interacting with the oncoprotein MYCN (43). To investigate this, we created a series of GST-MYCN deletions and analyzed their ability to bind a FLAG-tagged PA2G4-p48 (43). We identified a 7-residue sequence (aa 248-254), DHKALST, able to bind to PA2G4-p48 in a dose-dependent manner. Our computational modeling and mutagenesis studies confirmed that Ser 47, Arg 271, and Arg 272 were essential for a protein-protein interaction with MYCN, with alanine mutations resulting in reduced binding (via surface plasmon resonance) and a more stable MYCN (43) (Fig. 5). Interestingly, the PA2G4-p48 groove that we propose MYCN binds to is between two helices, α2 and α6, and positively charged loop 1 described above (Ile 63-Lys 72). These regions were all shown to interact with RNA in the human 80S ribosome (Fig. 5). Specifically, loop 1 (aa Ile 63-Lys 72) interacts with rRNA via residues Ile 63, Phe 64, Lys 65, and Lys 66, whereas from helix α6, Gln 254, Tyr 255, Leu 257, Lys 258, Arg 263, Phe 266, Ser 267, and Arg 271 were all shown to have direct and specific (mostly π-π and π-cation stacking) interactions with the His 59 loop of the rRNA (8). This lends further evidence that this may be an alternate, yet to be fully explored, protein-protein interface for PA2G4-p48.
Last, the cryo-EM structure showed that helix α7 (Asp 287-Leu 301) also has the potential to interact with other proteins, specifically in this structure, 60S ribosomal protein L23a (Fig. 5) (8). To our knowledge, a role for this helix has not been described previously, and its interactions with ribosomal protein L23a are predominantly hydrophobic.
Figure 5. ..., which leads to stabilization of MYCN and cellular growth. The α6 helix (purple) along with the basic loop (cyan) interacts with the rRNA (orange), as shown in the cryo-EM structure. This structure also revealed that the α7 helix (red cartoon) may also be another potential protein-protein binding site, as it interacts with the 60S ribosomal protein L23a (shown as red circle), which is part of the larger 80S ribosome. Interestingly, the LXXLL motif (yellow) is completely accessible in the cryo-EM structure, further suggesting that this is a potential interaction site. This site is a major protein-protein interaction site and interacts with numerous proteins, including Rb, Akt, Sin3A, AR, and HDAC2. This stops the protein from then interacting with its downstream cell growth promotor, leading to tumor suppression.

Post-translational modification sites
Post-translational modifications, including phosphorylation, ubiquitination, sumoylation, glycosylation, nitrosylation, methylation, acetylation, lipidation, and proteolysis, are commonly employed in the human proteome to regulate cellular activity. These modifications may lead to some of the functional differences between the PA2G4-p48 and PA2G4-p42 isoforms (described in detail in the next section of this review). Because phosphorylation is the most common type of post-translational modification, we used the NetPhos 3.1 Server to predict Ser, Thr, and Tyr phosphorylation sites based upon the sequence of the PA2G4-p48 and PA2G4-p42 isoforms and subsequently mapped them to their respective structures (Fig. 6). The sequence analysis showed that the PA2G4-p48 isoform contained 42 potential phosphorylation sites, whereas the PA2G4-p42 isoform contained only 33 potential sites. Notably, for the PA2G4-p48 isoform, seven of these are buried, and four are beyond the region that has been crystallized to date, whereas for the PA2G4-p42 isoform, all of these have the potential to be surface-exposed and thus available for phosphorylation. However, the lack of experimentally determined PA2G4-p42 structures means this is yet to be confirmed.
Two of these sites, present in both isoforms, are adjacent to each other, Ser 360 and Ser 361, and numerous studies have shown that phosphorylation of these residues is essential for nucleolar localization (4) and/or interactions with the receptor tyrosine-protein kinase erbB-3 (ERBB3), the E3 ubiquitin-protein ligase Bre1 (HUWE1), and Akt (12,44,45). Phosphorylation of proteins can modulate the nature or strength of protein-protein interactions by a variety of methods, including conformationally changing the protein structure and/or changing the binding energy of a complex (reviewed in Ref. 46). Considering that this phosphorylation site is adjacent to the LXXLL motif described above, it would be reasonable to deduce that this is a major protein-protein interaction site for numerous proteins. The structures solved thus far suggest that Ser 360 is slightly more exposed and available for phosphorylation than Ser 361, and we have chosen to present the results of the studies thus far based upon the serine the authors of those studies have pointed to. Notably, being adjacent to each other, there may be some ambiguity if only one serine or both are phosphorylated.
The role of Ser 361 in nucleolar localization was elucidated by the use of the PA2G4 mutants S361A and S361D (4). The S361A mutant, which cannot be phosphorylated at this position, remained in the nucleus regardless of growth factor stimulation. In contrast, the PA2G4 S361D mutant, a mutant mimicking protein kinase C (PKC)-mediated phosphorylation, resulted in no PA2G4 detected in the nucleolus, therefore suggesting that phosphorylation of this residue is essential for nuclear localization. Notably, however, this residue is present in both isoforms; thus, it is unclear why only the larger PA2G4-p48 isoform is found in the nucleus.
The crystal structures indicate that Ser 360 is more solvent-exposed than Ser 361 and thus more likely to be phosphorylated. There are other studies showing the role of Ser 360 phosphorylation, particularly its functional role in protein-protein interactions. Studies have shown that PKC-δ-triggered phosphorylation on Ser 360 is responsible for the association of PA2G4-p48 with nuclear Akt and prevention of apoptosis (12). By using mutants similar to the aforementioned PA2G4 Ser 361 mutants, Ahn et al. (12) showed that PA2G4-p48 S360A, which cannot be phosphorylated, barely binds Akt, whereas PA2G4-p48 S360D, which mimics phosphorylation, binds strongly to nuclear Akt. Using the same S360D mutant, Liu et al. (45) noticed that the PA2G4-p42 isoform was strongly polyubiquitinated when compared with WT PA2G4-p42. Even more interesting, they found that the PA2G4-p48 isoform was unable to undergo the same polyubiquitination. After conducting pulldown assays and co-immunoprecipitation, they concluded that the ubiquitination was due to the binding of the E3 ubiquitin-protein ligase, Bre1. Notably, both the PA2G4-p48 and PA2G4-p42 isoforms were able to interact with Bre1; however, only its interaction with the phosphorylated Ser 360 PA2G4-p42 isoform led to polyubiquitination and subsequent degradation. The authors concluded that this difference in function may be due to the location of the isoform (i.e. PA2G4-p42 is located predominantly in the cytoplasm, whereas PA2G4-p48 is located in both the cytoplasm and nucleus). Because Bre1 is a large protein (975 aa) and the protein folding of PA2G4 means that the N terminus of PA2G4-p48 is only approximately 20 Å away from this ubiquitination site, could it also be possible that the N terminus of PA2G4-p48 plays a role in inhibiting Bre1's ability to cause ubiquitination? Ser 363 and Thr 366 are located at the C terminus of PA2G4 and have been shown to be phosphorylated by PKC (44). Phosphorylation was shown to be enhanced by the presence of heregulin, but the functional effect of this phosphorylation was not clearly elucidated. With at least four known phosphorylation sites, it is clear that this C-terminal region is of significant importance in controlling the function of PA2G4 either via its ability to interact with other proteins or its ability to impact nuclear localization (due to its close proximity to the localization signal described above).
Figure 6. Potential phosphorylation sites. Potential phosphorylation sites as calculated from the NetPhos 3.1 server (yellow spheres) are mapped onto PA2G4-p42 (left) and PA2G4-p48 (right). Highlighted are those confirmed as phosphorylation sites and the N and C terminus of both structures. Notably, many of the predicted phosphorylation sites on the p48 isoform are actually buried behind the N terminus helix. This helix is missing in the p42 isoform; therefore, all potential sites could be post-translationally modified, leading to alternate functional effects.
Three serine residues in the N terminus, Ser 34, Ser 40, and Ser 44, have also been confirmed as PA2G4-p48 phosphorylation sites (47,48). Importantly, these three sites are unique to the PA2G4-p48 isoform and are thought to be partly responsible for the oncogenic role of the PA2G4-p48 isoform (discussed in more detail under "The tumor-promoting role of PA2G4-p48"). Ser 34 was shown to be phosphorylated by cyclin-dependent kinase 2 (CDK2), which led to tumor growth (49). Importantly, this growth could be ablated via a PA2G4-p48 S37A mutation (48). It is likely that this phosphorylation is increasing the affinity for another protein, although which protein(s) this is or how this leads to an increase in tumor growth remains to be determined. PA2G4-p48 Ser 40 and Ser 44 were explored by Wang et al. (47), who demonstrated that the PA2G4-p48 sequence 40 SSGVS 44 resembles the putative glycogen synthase kinase 3β (GSK3β) phosphorylation sequence (Ser/Thr-X-X-X-Ser/Thr). An important feature of this sequence is that GSK3β usually requires priming phosphorylation on Ser or Thr at position +4 in order to phosphorylate (50, 51); therefore, they hypothesized that one Ser would be a priming site, which in turn allows the other to be phosphorylated. Subsequent coimmunoprecipitation studies with WT PA2G4-p48 and a S44A mutant suggested that Ser 44 was the priming phosphorylation site for GSK3β recruitment, with subsequent Ser 40 phosphorylation by GSK3β (47). Importantly, phosphorylation of either Ser 40 or Ser 44 is essential for the interaction between PA2G4-p48 and F-box/WD repeat-containing protein-7 (FBXW7) and subsequent degradation of PA2G4-p48 via the recruitment of ubiquitinating proteins (47). Therefore, these phosphorylation sites are important regulatory sites for PA2G4-p48.
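The Ser/Thr-X-X-X-Ser/Thr pattern described above can be expressed as a short sequence scan that pairs each candidate target residue with the priming residue four positions downstream. This is a minimal sketch under stated assumptions: the function name is hypothetical, the input is the five-residue stretch SSGVS quoted in the text, and the start number of 40 is supplied by hand. A sequence match does not by itself show that a site is phosphorylated.

```python
import re
from typing import List, Tuple

# S/T, any three residues, then S/T; lookahead keeps overlapping candidates.
GSK3_PATTERN = re.compile(r"(?=([ST]...[ST]))")


def find_gsk3_candidates(sequence: str, first_residue_number: int = 1) -> List[Tuple[int, int]]:
    """Return (target residue number, priming residue number) pairs.

    The target is the N-terminal Ser/Thr of the motif and the priming site
    is the Ser/Thr four residues later, following the consensus in the text.
    """
    pairs = []
    for match in GSK3_PATTERN.finditer(sequence.upper()):
        target = match.start() + first_residue_number
        pairs.append((target, target + 4))
    return pairs


if __name__ == "__main__":
    # Stretch 40-44 of PA2G4-p48 as quoted in the text.
    print(find_gsk3_candidates("SSGVS", first_residue_number=40))
```

For the quoted stretch, the only candidate pairs Ser 40 as the target with Ser 44 as the priming site, which matches the order of events proposed by Wang et al. (47).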
The validity for the other phosphorylation sites suggested by NetPhos remains to be experimentally evaluated; however, it would be intriguing to see whether the buried PA2G4-p48 phosphorylation sites, exposed in the PA2G4-p42 isoform, are important for the different functions of the isoforms.
Functional and structural differences between the PA2G4-p42 and PA2G4-p48 isoforms
It has been established that the two PA2G4 isoforms have opposing roles in cancer, and the underlying structural reason for this has been the subject of numerous studies in recent years (reviewed in Ref. 52). The longer p48 isoform of PA2G4 localizes to the cytoplasm and nucleus, suppressing apoptosis and promoting cellular growth. In contrast, the shorter PA2G4-p42 isoform of PA2G4, which lacks the N-terminal 54 amino acids present in PA2G4-p48, is confined predominantly to the cytoplasm, where it acts to promote differentiation and suppress proliferation (4). Monie et al. (5) suggested that the PA2G4-p42 isoform was likely to lack one and a half helices from the N terminus of PA2G4-p48. Our modeling studies confirm that PA2G4-p42 is missing the entire helix α1 and half of α2 (Fig. 7A). In agreement with Monie et al. (5), we also show that truncation of this N terminus is likely to expose a large hydrophobic surface (Fig. 7B), which, when exposed to solvent, would have a destabilizing effect and may explain the significantly lower levels of PA2G4-p42 expression (4). We extended this analysis and generated electrostatic potential surfaces using PyMOL v2.3.2 (Schrödinger, LLC). This confirmed the hydrophobic patch on the PA2G4-p42 isoform (Fig. 7B). Furthermore, due to the removal of the N terminus, we found different electrostatic potential profiles across this face of the protein, with the PA2G4-p42 isoform being more positive and the PA2G4-p48 isoform more negative (Fig. 7B). This may account for alternate protein-binding patterns in cells.
The differing N termini of the isoforms have also been suggested to be responsible for different interactions with protein binding partners. It has been shown that PA2G4-p42 but not PA2G4-p48 is able to bind to ErbB3 (4). Mechanistically, this is thought to be due to the N-terminal 54-residue extension of PA2G4-p48, which blocks/masks the ErbB3-binding motif that is exposed on the PA2G4-p42 isoform, preventing this interaction from occurring (4). This is consistent with our modeling studies and those of others, which show a large hydrophobic surface on the PA2G4-p42 isoform, postulated to be the binding pocket of ErbB3. Conversely, the larger N-terminal domain of PA2G4-p48 has been shown to be essential for other protein interactions. Specifically, the HDM2 protein has been shown to bind tightly to the PA2G4-p48 isoform, but not to an N-terminal domain mutant (aa 48-394) that mirrors the PA2G4-p42 isoform (22). As discussed above, this truncation also removes Lys20 and Lys22, which are shown to be essential for nucleolar localization (14). This minor structural difference is an intriguing site to target for small-molecule inhibition. Drugs that interact only with the PA2G4-p48 isoform would have the potential to inhibit the oncogenic role of PA2G4-p48, while having little effect on the tumor-suppressive role of PA2G4-p42. In our study, we used structure-based drug design methods to target a binding site bordered by helix α2 (Ser44-Lys62) (43). We hypothesized that the known PA2G4 inhibitor WS6 (53) bound to the same groove on PA2G4 that MYCN binds to, a groove bordered by helix α2 (43). Further biological assessment of WS6 confirmed that it had an inhibitory effect on PA2G4-p48's oncogenic role in malignant neuroblastoma cells in vitro and in vivo (43). Because PA2G4-p42 lacks half of helix α2, we now hypothesize that this compound would not be able to bind to the shorter PA2G4-p42 isoform effectively and therefore should have no effect on this isoform's function. However, further experiments are needed to support our hypothesis.
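The numbering relationship between the two isoforms can be made concrete with a short sketch. It assumes, as stated in the text, that PA2G4-p42 corresponds to PA2G4-p48 without its first 54 residues, so that a p48 position n maps to p42 position n - 54. The function names and the site table are hypothetical and introduced only for illustration; the residue positions are those quoted in this review (p48 numbering).

```python
# Illustrative sketch only: residue renumbering between the PA2G4 isoforms,
# assuming p42 is p48 minus its N-terminal 54 residues (an assumption taken
# from the text, not a validated sequence alignment).

P48_LENGTH = 394                              # long isoform, amino acids
N_TERMINAL_OFFSET = 54                        # residues absent from p42
P42_LENGTH = P48_LENGTH - N_TERMINAL_OFFSET   # 340 amino acids


def p48_to_p42(residue):
    """Return the p42 position of a p48 residue, or None if p42 lacks it."""
    if not 1 <= residue <= P48_LENGTH:
        raise ValueError(f"residue {residue} is outside 1..{P48_LENGTH}")
    if residue <= N_TERMINAL_OFFSET:
        return None
    return residue - N_TERMINAL_OFFSET


def retained_fraction(start, end):
    """Fraction of the p48 segment start..end (inclusive) present in p42."""
    kept = sum(1 for r in range(start, end + 1) if p48_to_p42(r) is not None)
    return kept / (end - start + 1)


# Sites mentioned in this review, in p48 numbering (hypothetical lookup table).
SITES = {"Lys20": 20, "Lys22": 22, "Ser34": 34,
         "Ser40": 40, "Ser44": 44, "Ser360": 360}

if __name__ == "__main__":
    for name, position in SITES.items():
        mapped = p48_to_p42(position)
        status = "absent from p42" if mapped is None else f"p42 position {mapped}"
        print(f"{name}: {status}")
    # Helix alpha2 is quoted as Ser44-Lys62 in p48 numbering.
    print(f"helix alpha2 retained in p42: {retained_fraction(44, 62):.2f}")
```

Under this assumption, the N-terminal lysines and serines discussed above fall within the missing segment, whereas C-terminal sites such as Ser360 are retained at a shifted position. This is why an explicit numbering convention matters when comparing isoform-specific studies.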
Another interesting drug discovery approach would be to selectively repress PA2G4-p48, while also enhancing PA2G4-p42 via a protein partner. Insights into this approach may be obtained from one of the most interesting cases of a PA2G4-binding protein, FBXW7. FBXW7 has been characterized as a general tumor suppressor as it functions as a substrate recognition subunit of the SCF (SKP1/CUL1/F-box protein) E3 ubiquitin ligase complex (54), which in turn regulates a network of proteins with central roles in cell division, growth, and differentiation (55,56). FBXW7 has been shown to bind to both the PA2G4-p42 and PA2G4-p48 isoforms but to elicit different responses in tumor development (47). Using mutagenesis and immunoprecipitation, the authors showed that only the WD40 domain of FBXW7 interacts with the N-terminal tail of PA2G4-p48. This interaction specifically requires the Ser40 and Ser44 residues discussed above to be phosphorylated by GSK3β and leads to ubiquitination and proteasomal degradation of the p48 isoform (47). Interestingly, this interaction forms a negative regulation loop in which PA2G4-p48 binds to FBXW7. This binding masks FBXW7's nuclear localization signal, thereby altering its cellular location and subsequent ubiquitination functions on other oncoproteins. However, although it is missing the N terminus, the PA2G4-p42 isoform was still able to bind to FBXW7 (47).
Through extensive mutagenesis and immunoprecipitation, the authors showed a differing interaction profile between FBXW7 and the PA2G4-p42 isoform, with the shorter isoform interacting with both the F-box and the WD40 domains of FBXW7. Because F-box domains act to recruit ubiquitin-conjugating enzymes, they postulated that the PA2G4-p42 isoform acts as a scaffolding protein for FBXW7, stabilizing and enhancing its tumor-suppressing role (47). Although mirroring the endogenous interactions of FBXW7 with both PA2G4 isoforms via therapeutics would be difficult, it is intriguing to speculate about other proteins that may interact with both isoforms but elicit alternate roles. The alternate roles of the isoforms in tumor development are further discussed below.

Tumor-suppressive role of PA2G4-p42
The shorter PA2G4-p42 isoform is generally found at much lower levels in the cell (4). Although there is some recent evidence that the longer isoform may also be ubiquitinated (47), early studies suggested that the shorter isoform was the only isoform of PA2G4 able to be ubiquitinated (45). Regardless, there is strong evidence that the PA2G4-p42 isoform alone is ubiquitinated by the E3 ubiquitin-protein ligase, Bre1. This E3 binds to the phosphorylated PA2G4-p42 isoform, specifically requiring phosphorylation of residue Ser360, promoting its polyubiquitination and degradation (45). This process is thought to be enhanced in cancer cells, which in turn reduces any tumor-suppressive actions of the PA2G4-p42 isoform. A series of studies have been conducted to elucidate the cellular pathways underlying the tumor suppressor function of PA2G4-p42. Specifically, these studies have shown that PA2G4 binds to proteins including androgen receptor (AR), phosphatidylinositol 3-kinase (PI3K) (49), ErbB3 receptor (57), and Rb (17), preventing them from binding to their specific cell growth regulators. The PA2G4-p42 isoform has been shown to be tumor-suppressive in prostate, colon, breast, non-small-cell lung carcinoma, and salivary adenoid cystic carcinoma cells and/or models (16, 20, 41, 42, 48, 58). For prostate cancer, overexpression of PA2G4-p42 in LNCaP androgen-sensitive human prostate adenocarcinoma cells produced a decrease in tumorigenic growth both in vitro and in xenograft models (16). Similarly, siRNA knockdown of PA2G4-p42-Ebp1 in LNCaP cells increased AR activation (42). Overexpression of the PA2G4-p42 isoform has been shown to significantly down-regulate six genes in the AR signaling pathway: AR, prostate-specific antigen (PSA; kallikrein 3), kallikrein 2, POV-1, TMPRSS2, and prostate-derived factor (16). As described above, the C-terminal region of PA2G4 has been shown to directly bind to the AR (41) and Sin3A (16), a known co-repressor of AR. Interestingly, although this C terminus is conserved between the PA2G4-p42 and PA2G4-p48 isoforms, only the p42 isoform has been shown to bind to the AR. Therefore, it would not be unreasonable to postulate that the PA2G4-p42 isoform acts as part of a larger complex involving AR, Sin3A, and HDAC, all of which bind to and repress PSA gene transcription (16). The significant tumor-suppressive role of the PA2G4-p42 isoform in prostate cancer may result not only from PA2G4-p42 binding to and suppressing the AR, but also from its down-regulation of PI3K activity (49). PA2G4-p42 interacts with the p85 subunit of PI3K, encouraging the recruitment of the chaperone-E3 ligase complex HSP70/CHIP, which in turn leads to ubiquitination and degradation of PI3K (49). Given the central role of the PI3K growth-signaling pathway in numerous cancers (59), the ability of the PA2G4-p42 isoform to suppress this pathway and subsequently reduce Akt activation may explain the general tumor-suppressive role of the PA2G4-p42 isoform.
The tumor-suppressive pathways for the PA2G4-p42 isoform in breast cancer have also been well-studied. PA2G4 was first discovered as an ErbB3 receptor-binding protein due to its ability to interact with the cytoplasmic tail of ErbB3 (3). ErbB3 belongs to the ErbB family of receptor tyrosine kinases, which modulate intracellular signaling cascades and are implicated in breast cancer oncogenesis. When this receptor is stimulated with the ErbB3/4 ligand heregulin in AU565 breast cancer cells, the PA2G4-p42 isoform dissociates from the ErbB3 receptor and is translocated from the cytosol to the nucleus, where it undergoes phosphorylation (3,4,44,57). PA2G4 also reduces the protein levels of another ErbB family member, ErbB2, also known as HER2 (19,60). In another example of its multifunctional nature, PA2G4 both increases ubiquitination of ErbB2 and suppresses intracellular ErbB2 mRNA promoters, either directly or via a complex of proteins (60). Interestingly, this study showed no significant difference in the expression levels of either PA2G4 isoform, and cells transfected with both PA2G4 isoforms show an increase in tamoxifen sensitivity (60). Does this suggest that the ErbB family has a similar profile to FBXW7 described above and interacts with the alternate isoforms of PA2G4 differently? More biochemical experiments will be needed to explore this; however, the multiple interactions with the ErbB family support the postulated role of PA2G4-p42 as a tumor suppressor.
An additional tumor-suppressive role for PA2G4-p42 in breast cancer is via its interaction with the Rb protein, which acts as a tumor suppressor and is often down-regulated in solid tumors, including breast cancer (61). The C-terminal residues 300-372 of the PA2G4-p42 isoform bind directly to the Rb protein (17). The interaction between this PA2G4-p42 motif and Rb leads to the inhibition of the E2F1 transcription factor and a decrease in cyclin E levels, two proteins whose activity promotes oncogenesis (41,62). This was shown experimentally by Zhang et al. (41), who demonstrated that PA2G4-p42 can inhibit the transcriptional activity in PA2G4-p42 isoform-transfected MCF-7 cells. In addition, PA2G4-p42 inhibits the transcription of other E2F1-regulated promoters, such as those of c-MYC and cyclin D1. Notably, ectopic expression of PA2G4-p42 in AU565 cells inhibits proliferation and induces cellular differentiation, in a similar manner to treatment with the ErbB3/4 ligand, heregulin (described above) (57). The inhibition by PA2G4 of the E2F-regulated promoters is enhanced by heregulin treatment (20). This suggests a synergistic role between PA2G4 and heregulin with at least two alternate pathways in breast cancer, where PA2G4-p42 plays a role as an intracellular signaling protein with the ability to interact with transcriptional modulators. These experiments in the breast cancer setting have also elucidated that PA2G4-p42 is unlikely to repress transcription alone. Rather, its transcriptional repression activity seems to be partially dependent on recruitment of HDAC activity, specifically HDAC2, via the C-terminal 293-372 aa motif of PA2G4-p42 (41). As in the prostate cancer example above, these residues are conserved between the PA2G4-p48 and PA2G4-p42 isoforms; therefore, the exposed hydrophobic face and differential phosphorylation profiles are likely to be the differentiating features. Regardless, there is strong evidence that the expression levels of PA2G4 could potentially be used to guide clinical treatment decisions.
Although the tumor suppressor role of the p42 isoform has been established in breast and prostate cancer, a recent study found that PA2G4-p42 also acts as an oncoprotein, and both PA2G4 isoforms increase the malignant phenotype of neuroblastoma cells in vitro (43). Further studies are therefore required to explore the mechanism underlying the apparent contextual role of PA2G4-p42 in cancer.

The tumor-promoting role of PA2G4-p48
The long PA2G4-p48 isoform has numerous roles in cancer, including inhibiting the tumor-suppressing activity of p53 (22), binding to mRNA and promoting up-regulation of antiapoptotic proteins (13), directly binding to and stabilizing the MYCN oncoprotein in a positive forward feedback expression loop (43), and stabilizing proteins by down-regulating their degradation via the ubiquitination pathway (47). The PA2G4-p48 isoform has been shown to directly interact with Bcl-2 protein mRNA, promoting the expression of this antiapoptotic protein (13). The phosphorylated PA2G4-p48 isoform interacts with nuclear Akt and prevents apoptotic cell death (12). Furthermore, knockdown of PA2G4-p48 increases the levels of endogenous p53 in glioblastoma cells. The PA2G4-p48 isoform also directly binds HDM2, which promotes HDM2-dependent p53 ubiquitination (22). Combining this with the evidence that the PA2G4-p48 isoform is highly expressed in more aggressive cancers (22,63) suggests that the reduction in p53 levels may be regulated by the levels of PA2G4-p48 as a mechanism of oncogenesis.
Knockdown of PA2G4-p48 in muscle stem cells and loss-of-expression studies in mice and chick embryos indicate a role in regulating proliferation and differentiation (64). Consistent with a role in proliferation, PA2G4-p48 knockout mice are 30% smaller than WT littermates and show cellular hallmarks associated with growth retardation (65). Cellular analysis from these mice demonstrated a decrease in mouse embryonic fibroblast growth. Biochemical analysis revealed a decrease in insulin-like growth factor 1 (IGF1) and a paradoxical increase in MAPK signaling transduction pathway activity, which the authors deduced to be related to differing cellular backgrounds. These findings were supported by an observed increase in expression of Son of sevenless homolog 1 (SOS), an upstream activator of the Ras-MAPK signaling cascade, and suggest a complex role of PA2G4-p48 in the MAPK signaling pathway. The MAPK signaling pathway has a long-established overall role in sustaining cell survival, proliferation, and differentiation in eukaryotic cells, and as a consequence, it is a promising therapeutic target in multiple cancers (66). Therefore, it is not surprising that high expression of PA2G4-p48 has also been implicated in numerous cancers.
Our studies showed that in neuroblastoma cells, siRNA knockdown of PA2G4-p48 reduced MYCN protein levels (43). This decrease was normalized in the presence of the proteasomal inhibitor MG132, suggesting that PA2G4-p48 protects MYCN from proteasome-mediated degradation. Furthermore, loss of PA2G4-p48 appears to sensitize neuroblastoma cells to 13-cis-retinoic acid treatment, suggesting that PA2G4-p48 may contribute to the therapy resistance observed in MYCN-amplified neuroblastoma cells (43). For these reasons, PA2G4-p48 is an attractive target for therapeutic inhibition of the MYCN signaling pathway in neuroblastoma and other MYCN-amplified cancers. Our studies revealed that PA2G4 was able to bind directly to MYCN, leading to an oncogenic forward feedback loop of high expression levels of PA2G4-p48 and MYCN proteins (43).
PA2G4-p48 has also been implicated in the tumorigenesis of oral squamous cell carcinoma (OSCC). As demonstrated by Mei et al. (67), PA2G4-p48 expression in OSCC cells is linked with that of podoplanin, a mucin-type protein. Podoplanin is both a prognostic and diagnostic marker in various cancers (68). PA2G4-p48 has been shown to bind to a DNA sequence in the podoplanin gene promoter, suggesting a role of PA2G4-p48 in podoplanin transcriptional regulation. In line with these findings, down-regulation of PA2G4-p48 results in a reduction in malignant phenotypes, including invasion and anchorage-independent growth (67). In contrast to these findings, ectopic expression of full-length PA2G4 cDNA in OSCC cell lines results in decreased growth and proliferation (69). Similar findings have been demonstrated in salivary adenoid cystic carcinoma (58). It remains unclear in these studies which isoform of PA2G4 is modulating cellular function. Thus, the role of PA2G4 in oral cancers requires further investigation.
Finally, the role of PA2G4-p48 in acute myeloid leukemia is an emerging area of research. A recent study has demonstrated that peripheral blood samples from patients with acute myeloid leukemia show increased PA2G4-p48 expression, with an associated increase in the proliferating cell nuclear antigen oncoprotein. In these cell lines, PA2G4-p48 has been shown to bind to RNA polymerase I to increase the synthesis of rRNA (70).

Conclusions
The proliferation-associated protein 2G4 (PA2G4) has two distinct isoforms, which function as a contextual tumor suppressor or oncoprotein, depending on the tissue type being studied. Although this review has brought together the structural and functional studies regarding PA2G4, much is still to be elucidated. There is only one significant structural difference between the two isoforms (i.e. p42 is missing the most N-terminal 54 amino acids); however, they have opposing functional roles in the context of cancer. Some of this can be explained by the exposure of a hydrophobic surface pocket in the PA2G4-p42 isoform, which is likely to provide a protein-protein interaction site that is unique to this shorter isoform. This exposed surface also contains alternate potential phosphorylation sites, although to our knowledge none of these have been experimentally verified. Another reason for the alternate functional role may be the location of the protein, with the PA2G4-p48 isoform containing an N-terminal nucleolar localization signal. However, it is more likely that the additional phosphorylation sites in this N-terminal region (namely Ser34, Ser40, and Ser44), which are known to bind to specific proteins, play a larger role in its oncogenic function.
As PA2G4 has a distinct role in cancer progression, it may serve as a novel therapeutic target. However, because the two isoforms have opposing functional roles, care needs to be taken to enable targeting of the oncogenic isoform only. In this review, we have highlighted the multiple protein-protein interaction sites that may be targeted by small-molecule therapeutics (73). In designing these inhibitors, it will be important to focus on sites essential for the oncogenic role of PA2G4, such as its interaction site with MYCN, which we postulate also binds to WS6, a PA2G4 inhibitor (43). An alternate approach might be to focus on protein partners, such as FBXW7, which interacts with both isoforms but elicits alternate functional effects in the cell (47). Notably, both of the aforementioned studies were published in the last 3 years; thus, it is likely that we have yet to discover the full extent of protein partners.
There are still some confounding results, especially relating to the ability of PA2G4-p42 to act as an onco-suppressor by interacting with other proteins via its C-terminal region, a region conserved in the larger PA2G4-p48 isoform. Potentially, this points to competitive binding between the isoforms, although additional biochemical studies that link the structure and function of the alternate PA2G4 isoforms are needed. Although not discussed in this review, there is also the potential to use genetically engineered mouse models to express the different PA2G4 isoforms and assess their biological impact in normal and cancer cells. Using PA2G4-p48 knockout mice, one could investigate whether depletion of PA2G4-p48 can delay or prevent tumor growth when crossed with murine cancer models (e.g. TH-MYCN mice to study neuroblastoma).
With this information in mind, it is clear that there is still much to be elucidated regarding the subtle structural changes that lead to the differing functional effect of the PA2G4 isoforms. The recent cryo-EM structure highlights the benefit of obtaining structures of complexes where these subtle binding differences may be seen. Because both PA2G4 isoforms have therapeutic potential across the field of oncology, any future PA2G4 structures, particularly those in complex with binding partners, will be an important step in understanding the biology of this protein.