The ABCs of the atypical Fam20 secretory pathway kinases

The study of extracellular phosphorylation was initiated in the late 19th century when the secreted milk protein, casein, and egg-yolk protein, phosvitin, were shown to be phosphorylated. However, it took more than a century to identify Fam20C, which phosphorylates both casein and phosvitin under physiological conditions. This kinase, along with its family members Fam20A and Fam20B, defined a new family whose amino acid sequences are highly atypical compared with the 540 canonical kinases comprising the kinome. Fam20B is a glycan kinase that phosphorylates xylose residues and triggers proteoglycan biosynthesis, a role conserved from sponges to humans. The protein kinase, Fam20C, conserved from nematodes to humans, phosphorylates well over 100 substrates in the secretory pathway with overall functions postulated to encompass endoplasmic reticulum homeostasis, nutrition, cardiac function, coagulation, and biomineralization. The preferred phosphorylation motif of Fam20C is SxE/pS, and structural studies revealed that related member Fam20A allosterically activates Fam20C by forming a heterodimeric/tetrameric complex. Fam20A, a pseudokinase, is observed only in vertebrates. Loss-of-function genetic alterations in the Fam20 family lead to human diseases such as amelogenesis imperfecta, nephrocalcinosis, lethal and nonlethal forms of Raine syndrome with major skeletal defects, and altered phosphate homeostasis. Together, these three members of the Fam20 family modulate a diverse network of secretory pathway components playing crucial roles in health and disease. The overarching theme of this review is to highlight the progress that has been made in the emerging field of extracellular phosphorylation and the key roles secretory pathway kinases play in an ever-expanding number of cellular processes.

The study of protein phosphorylation began as early as 1883 to 1900, when phosphorus was detected in milk casein (1) and egg-yolk phosvitin (2), respectively, thus making them the two earliest known phosphoproteins. Intriguingly, both these phosphoproteins are secreted from cells. Casein is secreted in milk (3) while phosvitin, a cleaved form of vitellogenin, is synthesized in the liver and secreted into the oviduct (4,5). Since these initial discoveries, casein and phosvitin have been used as common artificial substrates in the study of numerous kinases (6)(7)(8). In fact, the first evidence for the existence of protein kinases was provided by the pioneering study of George Burnett and Eugene Kennedy, in which they used rat mitochondrial extract to provide ATP and casein as the substrate to demonstrate the covalent addition of phosphate to casein in vitro (6). Since that time, many investigators have added to the number and complexity of kinases, leading to the compilation of the kinome in 2002 (9). This list of the human kinome included 540 individual members and represented kinases that could phosphorylate proteins as well as other biological molecules such as lipids and carbohydrates, primarily within the cytosol and nucleus of the cell. But what about the kinases that phosphorylate resident proteins in the secretory pathway or proteins destined for secretion? This question was partially answered when the physiological secretory pathway kinase phosphorylating casein, family with sequence similarity 20C (Fam20C), was discovered in 2012 (10,11). This same kinase was found to phosphorylate phosvitin in 2018 and thereby is accountable for the phosphorylation of the first identified secreted phosphoproteins (5).
The first clue for recognizing the secretory pathway kinases came from the identification of the Drosophila protein, four-jointed (Fj), as a secretory pathway kinase that phosphorylated the extracellular domains of atypical cadherins (12). Using Fj as a BLAST query revealed a small family of related proteins that included Fam20A, B, and C (11). Since little was known about these proteins, they were designated "Fams" based on shared but limited sequence similarity. They all harbor a signal peptide that would direct them into the secretory pathway, but due to a lack of sequence similarity with canonical kinases, none of these atypical kinases were represented in the human kinome. The other domain these proteins share, which is also the region of highest sequence similarity, is the C-terminal Fam20 domain. Unexpectedly, the conserved Fam20 domain in each of these proteins has a very different function. Fam20C is the Golgi casein kinase responsible for phosphorylating secreted proteins on SxE/pS motifs (11). Fam20A is a pseudokinase that interacts with Fam20C and increases its activity (13), and Fam20B is a xylose kinase involved in proteoglycan biosynthesis (14,15).
Over the past few decades, multiple proteins in the extracellular and secretory space have been found to be phosphorylated. Many of these phosphoproteins are secreted into milk, serum, plasma, and cerebrospinal fluid (reviewed in (16)) and have defined roles in diverse cellular processes including signaling, coagulation, migration, extracellular matrix formation, proteolysis, and biomineralization. The majority of these secreted proteins exhibit a phospho-motif of SxE/pS, but to date, we have limited knowledge of the function of the majority of these extracellular phosphorylation events (reviewed in (16)). Interestingly, out of the 540 kinases in the human kinome, only two kinases have been found localized in the secretory pathway: protein O-mannosyl kinase (POMK/SGK196) (17,18) and the tyrosine kinase, vertebrate lonesome kinase (VLK/SGK493) (19), neither of which phosphorylates SxE/pS motifs. Because the identity of the kinase(s) responsible for the majority of extracellular phosphorylation events remained elusive, the study of extracellular phosphorylation has lagged behind that of intracellular phosphorylation. It is increasingly clear that extracellular phosphorylation events play just as important roles in cellular regulation as their intracellular counterparts.
To date, there are 13 known secretory pathway kinases (or kinase-like proteins), and we know very little about some of them. In a handful of cases, we do not know their substrate specificity or even if they are active kinases. This review focuses on Fam20A, B, C, the small subfamily of secretory pathway kinases for which we have made significant progress. In particular, we will address their cellular functions, reported substrates, structure/function relationships, and importance in human disease.

VLK family and POMK
VLK and POMK are two secreted kinases that can be found at the root of the kinome tree. Therefore, their amino acid sequences were well enough conserved with the canonical kinases for them to be classified as kinases. POMK is an O-mannose kinase important for dystroglycan receptor function and matriglycan elongation (18,20). VLK is the first secreted tyrosine kinase identified, and it phosphorylates a broad range of secreted and ER-resident substrates (19). A PSI-BLAST search using VLK as a query produces another small family of potential secreted kinases that includes Fam69A, Fam69B, Fam69C, DIA1, and DIA1R. Very little is known about these proteins (21)(22)(23).

Fj family of atypical kinases
As alluded to in the introduction, the study of extracellular kinases was spearheaded by Ken Irvine's laboratory when they published the first example of a secreted kinase, the fly protein Fj, which they went on to show phosphorylated unusual cadherin domains (12). The murine equivalent of Fj, four-jointed box 1 (FJX1), is involved in forming appropriate dendrite arbor morphology in the hippocampus (24), and recently, human FJX1 has been shown to increase the invasive potential of nasopharyngeal cancer cells (25,26). In addition to FJX1 and Fam20A, B, and C, this small family contains two additional members, Fam198A and Fam198B. To date, neither Fam198A nor Fam198B has been ascribed kinase activity, and very little is known about their cellular functions (27,28).

Fam20B, the secreted xylose kinase
Vertebrates exhibit three members of the Fam20 family of proteins (Fam20A, B, and C), whereas early invertebrates such as hydra and sponge have a single homolog of Fam20 whose activity resembles the human Fam20B-like protein (Fig. 1) (29). Within the Fam20 family of secretory kinases, Fam20B was identified as a xylosylkinase that phosphorylates xylose residues within the conserved tetrasaccharide linkages of proteoglycans (15). Interestingly, the xylose phosphorylation on the proteoglycan tetrasaccharide linkage was first identified in hydra (30), and further biochemical investigation revealed that hydra Fam20 and sponge Fam20 lacked protein kinase activity but exhibited robust xylosylkinase activity (29). In fact, Fam20B is thought to be the first ancestral template protein for the Fam20 family of kinases, and the function of xylose phosphorylation is conserved throughout the animal kingdom from sponges to humans (29). This evolutionary relationship is apparent in available structures. The ATP-binding sites of Fam20B and Fam20C are highly conserved (Fig. 2, A and B). However, Fam20B has a unique saccharide binding site not present in Fam20C or Fam20A (Fig. 2, A and C) (29). Fam20C homologs are characterized by an occluded substrate binding pocket that cannot accommodate a bulky saccharide substrate due to steric clashes. This occlusion results from slight structural rearrangements arising from distal residue substitutions that position a flexible loop within the binding pocket (Fig. 2D) (29). The Fam20B-mediated xylose phosphorylation robustly stimulates galactosyltransferase II (GalT-II) activity, leading to further addition of galactose to the tetrasaccharide linkages and accelerated proteoglycan chain extension (Fig. 5) (14). Furthermore, EXTL2 (Exostosin-Like Glycosyltransferase 2) polymerase utilizes the xylose phosphorylation to transfer a GlcNAc residue to the tetrasaccharide linkage region, leading to termination of proteoglycan chain elongation (31).
Intriguingly, depletion of Fam20B leads to immature proteoglycan formation, a phenotype quite reminiscent of Ehlers-Danlos syndrome, a rare inherited condition that affects connective tissue owing to GalT-II mutations (14). Thus, Fam20B plays an evolutionarily conserved quality-control role for proteoglycan biosynthesis and is arguably the ancestral Fam20.
Whole-body genetic depletion of Fam20B in mice was embryonic lethal at E13.5, with the embryos exhibiting severe developmental defects and significant organ hypoplasia (32). These observations were consistent with studies in zebrafish wherein loss-of-function mutants of Fam20B led to aberrant cartilage matrix organization and early stages of chondrocyte hypertrophy leading to skeletal defects (33). These initial in vivo observations were further echoed when tissue-specific depletion of Fam20B in mice led to the development of supernumerary teeth (34,35), chondrosarcoma with major postnatal ossification defects (36), and severe craniofacial defects (37). Thus, the overarching role of Fam20B in proteoglycan biosynthesis likely contributes to the skeletal and developmental defects observed upon Fam20B depletion in tissue-specific in vivo models. In humans, two lethal compound heterozygous variants in Fam20B have been identified in a girl who died soon after birth (Fig. 2E) (38). The genetic alterations reported were T59Afs and N347Mfs, and the patient exhibited severe organ hypoplasia, skeletal defects, and respiratory failure (38). The amino-terminal T59A frameshift leads to hypomorphic gene function and effective loss of one allele of Fam20B. The carboxy-terminal alteration, the N347M frameshift, disrupts more than 15% of the protein sequence and results in the loss of C389, which forms a disulfide bond with C332 and likely contributes to the global stability of the protein. The N347M frameshift, therefore, results in a destabilized Fam20B and also represents a functionally inactive variant. Intriguingly, osteoarthritis and osteochondropathy patients with decreased proteoglycans and chondrocyte numbers exhibited marked reduction of Fam20B, GalT-II, and EXTL2 protein levels in knee cartilage biopsy samples (39). This suggests that Fam20B could be a predictive marker for specific bone diseases.

Fam20C, the secreted Golgi casein kinase
As stated previously, the story of milk casein as a phosphoprotein started in the late 19th century when Olof Hammarsten reported the presence of phosphorus in casein (1). Fifty years later, Fritz Lipmann identified that the phosphorus was covalently bound to casein as phosphoseryl groups (40). Eventually, the sequences surrounding those phosphoseryl groups in casein were identified as SxE/pS, which prompted the idea that the SxE/pS sequence was the preferential motif for enzymes phosphorylating casein (41,42) within the secretory pathway (43). In subsequent years, two cytoplasmic kinases were shown to robustly phosphorylate casein in vitro and because of this ability were designated casein kinase 1 and 2 (44). This was despite the fact that they would never come into contact with casein because they were localized to the cytoplasm and nucleus while casein, a secreted protein, resided in the secretory pathway and extracellularly. The bona fide "Golgi casein kinase" activity was initially observed in lactating mammary glands (41,43,45) and partially purified from milk (46). Lorenzo Pinna and colleagues extensively characterized the activity of the partially purified protein from Golgi fractions and further reported that the kinase was highly resistant to the majority of the well-established kinase inhibitors, including staurosporine (3,47-50). In 2012, this elusive activity was identified molecularly when Fam20C was experimentally recognized to be the Golgi casein kinase capable of phosphorylating casein in vivo (11). Although atypical, crystallography studies on the nematode ortholog of Fam20C (51) revealed that the kinase exhibited the canonical N- and C-lobed kinase structure with a well-defined ATP-binding active-site pocket (Fig. 3A).
The breadth of Fam20C's activity was alluded to when phosphoproteomic studies of human plasma, serum, and cerebrospinal fluid demonstrated that more than two-thirds of secreted phosphorylated proteins were phosphorylated on SxE/pS motifs (52)(53)(54). In fact, phosphoproteomic analysis of secreted neuropeptides in the nervous and endocrine system revealed that the predominant phosphomotif was SxE (55). This was solidified by studies in which Fam20C was ablated in several tissue culture cell lines and the culture media was analyzed for secreted phosphoproteins (56). Cumulatively, this work resulted in affirming that Fam20C is the kinase responsible for phosphorylating the majority of secreted proteins and broadened Fam20C's substrate preference to include phosphorylation sites other than SxE/pS sites (56). For instance, a recent study reported that specific threonine residues on the neuroendocrine chaperone 7B2 were phosphorylated by Fam20C (57). Surprisingly, there were nonoverlapping substrates between the secreted phosphoproteome from the different cell lines indicating that individual cell populations have different milieux of secreted proteins.
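To make the SxE/pS motif concrete, the following minimal sketch scans a protein sequence for candidate sites: a serine at position 0, any residue at +1, and either glutamate or serine at +2. The function name `find_sxe_sites` and the toy peptide are hypothetical and are not taken from the studies cited here. Sequence alone cannot tell whether a serine at +2 is actually phosphorylated, so S-x-S hits are candidates only, and, as noted above, Fam20C also phosphorylates sites that do not match this motif.

```python
import re

# Candidate consensus: Ser at position 0, any residue at +1, and
# Glu (E) or Ser (S, which could be a phosphoserine) at +2.
# A lookahead is used so that overlapping matches are all reported.
SXE_PATTERN = re.compile(r"(?=(S.[ES]))")


def find_sxe_sites(sequence):
    """Return (1-based position of the Ser, matched tripeptide) pairs."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SXE_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide for illustration only, not a real protein.
    toy_peptide = "MKSAEGLSSSEEK"
    for position, tripeptide in find_sxe_sites(toy_peptide):
        print(position, tripeptide)
```

A scan like this only nominates sites; whether a given serine is a genuine Fam20C substrate still has to be established experimentally, for example by the knockout phosphoproteomics described above.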
Figure 2. Structure of FAM20B, the glycan kinase. A, structure of Hydra magnipapillata FAM20B (hmFAM20B, PDB ID:5xoo, chain A, white) with bound adenosine (ADN) and Galβ1-4Xylβ1 substrate. N and C lobes indicated approximately. B, FAM20B ATP-binding site (PDB ID:5xoo, chain A, white, ADN:adenosine) is highly conserved with C. elegans FAM20C ATP-binding site (PDB ID:4kqb, chain A, goldenrod, ADP, adenosine diphosphate). Similar residues labeled (FAM20B:black, FAM20C:orange). C, FAM20B saccharide binding site containing Galβ1-4Xylβ1 substrate (gray) (PDB ID:5xoo, chain A). D, superimposed FAM20B (PDB ID:5xoo, chain A, white) with C. elegans FAM20C (PDB ID:4kqb, chain A) at saccharide binding site. Arrow indicates flexible loop occluding saccharide binding. E, gene diagram depicting disease mutations. fs, frame shift.

Figure 3. Structure of Fam20C, the secreted protein kinase. A, structure of C. elegans FAM20C (ceFAM20C, PDB ID:4kqb, chain A, goldenrod). N and C lobes indicated approximately. ATP-binding site diagram of important residues. Parenthetical residues represent structurally equivalent residues in Homo sapiens FAM20C. B, heterotetramer of Danio rerio FAM20C (drFAM20C, goldenrod) and Homo sapiens FAM20A (hFAM20A, cyan) (PDB ID:5yh2; chains A-D). Heterodimer interface and heterotetramer interfaces indicated. C, heterodimer of Homo sapiens Fam20C (hFAM20C, goldenrod1) and Homo sapiens FAM20A (hFAM20A, cyan) (PDB ID:5yh3, chains A and C). Residues important to the heterodimer interface indicated. N and C lobes indicated approximately. D, gene diagram depicting disease mutations (fs, frame shift; X, STOP/termination). E, cartoon depiction of kinase indicating positions of mutated residues when resolved (mutations as red spheres, PDB ID:5yh3, chain C). Residue labels color coded to indicate mutation type (red: missense mutation, orange: frameshift, and pink: STOP/termination). N and C lobes indicated approximately.

Fam20C has a strong cofactor preference for Mn2+ and Co2+ ions over the canonical Mg2+ ion for its kinase activity (47), although the physiological levels of Mg2+ in cells (around 1 mM) are 10^4-fold higher than those of Mn2+ (about 100 nM) (44). Lorenzo Pinna and colleagues argued that under physiological circumstances, specific signaling components may play a role in promoting Fam20C to utilize Mg2+ over Mn2+ in the secretory pathway (44). The group reported that sphingosine and sphingosine-1-phosphate significantly improved the ability of Fam20C to utilize Mg2+ as a cofactor (50,58). Indeed, sphingosine addition led to an eightfold higher activity of Fam20C in vitro with a threefold increase in Vmax and an accompanying threefold decrease in Km (50,58). However, ceramide, the precursor of sphingosine, had no effect on Fam20C activity, thus suggesting sphingosine as a specific activator of Fam20C (50,58). Interestingly, the activity of Fam20C is dynamically controlled by its binding partner Fam20A (Fig. 3, B and C). Fam20A and Fam20C together form a heterodimeric complex (Fig. 3C), which dramatically promotes the activity of Fam20C to phosphorylate its substrates (13,29). Two heterodimers can further associate to form a heterotetrameric complex (Fig. 3B), but it remains an open question as to which form exists in vivo. This uncommon allosteric mode of pseudokinase-mediated activation of Fam20C is further explained below in the Fam20A section. Finally, functional annotations of Fam20C substrates suggest that Fam20C will play important roles in many physiological processes and disease states.
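As a rough check on the figures quoted above (Mg2+ near 1 mM versus Mn2+ near 100 nM; a threefold higher Vmax and threefold lower Km alongside an eightfold higher activity), the sketch below assumes simple Michaelis-Menten kinetics, which the text does not state explicitly. The function names and the normalized baseline values (Vmax = 1, Km = 1) are illustrative assumptions, not parameters from the cited work.

```python
def mm_rate(vmax, km, substrate):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


def fold_activation(substrate, vmax=1.0, km=1.0, vmax_fold=3.0, km_fold=3.0):
    """Ratio of activated to basal rate at one substrate concentration.

    Activation is modeled as Vmax multiplied by vmax_fold and Km divided
    by km_fold (assumed values taken from the threefold changes in the text).
    """
    basal = mm_rate(vmax, km, substrate)
    activated = mm_rate(vmax * vmax_fold, km / km_fold, substrate)
    return activated / basal


# Cofactor concentrations from the text, both expressed in nM.
MG_NM = 1_000_000  # about 1 mM
MN_NM = 100        # about 100 nM
mg_to_mn_ratio = MG_NM // MN_NM  # 10**4, matching the stated fold difference

if __name__ == "__main__":
    print("Mg/Mn ratio:", mg_to_mn_ratio)
    # Substrate is given in units of the basal Km.
    for s in (0.01, 1 / 15, 1.0, 100.0):
        print(f"[S] = {s:.3f} Km -> {fold_activation(s):.2f}-fold")
```

Under these assumptions the activation approaches ninefold when substrate is far below Km (the Vmax/Km limit) and threefold at saturation, so an eightfold increase would correspond to a substrate concentration well below the basal Km. Whether the reported assays were run under such conditions is not stated here.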

FAM20C substrates in nutrition and mineralization
The gene encoding casein resides on chromosome 4 surrounded by other genes encoding proteins that contain multiple SxE motifs. Casein accounts for approximately 80% of the total protein in bovine milk where it interacts with calcium phosphate forming colloidal structures called casein micelles, thereby providing nutrients including calcium and phosphate for growth of bones and teeth to mammalian infants (59). The consequences of casein phosphorylation have been intensively studied with regard to cheese manufacturing where it is suggested to affect milk technological properties by stabilizing calcium phosphate nanoclusters and promoting micellar growth (59)(60)(61).
In addition, chromosome 4 harbors another gene cluster encoding the small integrin binding ligand-N-linked glycoproteins (SIBLINGs). These genes are known to regulate bone and tooth development and encode osteopontin, dentin matrix protein-1 (DMP1), matrix extracellular phosphoglycoprotein, bone sialoprotein, and dentin sialophosphoprotein, all of which are involved in binding calcium and all of which are Fam20C substrates (11). In fact, Fam20C phosphorylates DMP1 in osteoblasts and young osteoclasts, which leads to the secretion of phospho-DMP1 into the pericanalicular matrix of mineralized bone (62). Fam20C is further thought to indirectly promote DMP1 transcription (63). In addition, Fam20C phosphorylates multiple sites on osteopontin and promotes its secretion (64) but inhibits its binding to αvβ3 integrin (65). These negatively charged phosphorylated substrates allude to Fam20C's involvement in Ca2+ regulation in many varied and diverse processes including nutrition and the formation of mineralized tissues. Indeed, a large body of literature, focusing on conditional tissue-specific knockout mice and cell models, reports the roles of Fam20C in promoting biomineralization including the growth and development of osteoblasts, osteoclasts, bone, dentin, and enamel (Fig. 5) (66)(67)(68)(69)(70)(71)(72)(73)(74)(75)(76)(77)(78).

Fam20C substrates promoting secretion and ER homeostasis
Phosphoproteomic analysis of pancreatic β-islet cells from obese type 2 diabetic (T2D) mice revealed 39 potential phosphosites conforming to the SxE motif (79). The study reported that Fam20C levels went up in the cells of T2D mice, thereby promoting secretion of immature proinsulin under hyperglycaemic conditions (79). Upon restoring euglycaemia, the levels of Fam20C and 11 corresponding SxE phosphosites were brought back to basal level (79). This study suggests that Fam20C might play an important role in the control of insulin secretion from the β-islet cells of the pancreas. In fact, recent studies suggest Fam20C plays a pivotal role in ER homeostasis, which promotes proper secretion, including phosphorylation of proteins sequestered within the secretory pathway (Fig. 5). Recent work reports that Fam20C phosphorylation of ER oxidoreductin 1α (Ero1α) on Ser145 (SxE site) is important for regulating ER redox homeostasis and oxidative protein folding (80). This Ero1α phosphorylation is induced following secretion-demanding conditions such as lactation. Interestingly, this posttranslational event occurs in the Golgi apparatus, and Ero1α is then transported retrogradely to the ER in a process mediated by ERp44 (80). Furthermore, Fam20C maintains ER proteostasis and protects against ER stress-induced cell death (81). Protein disulfide isomerase (PDI) is a highly abundant ER-resident enzyme playing critical roles as both a thiol-disulfide oxidoreductase and a molecular chaperone, which prevents protein misfolding in the ER (82,83). Fam20C phosphorylates PDI on Ser357 upon ER stress and promotes the activity of PDI to maintain ER proteostasis (81). Indeed, loss of Ser357 (Ser359 in mouse) leads to acute liver damage in mice challenged with proteotoxic stress (81). Interestingly, recent studies show that Fam20C phosphorylation is required for the secretion of certain proteins. For example, Fam20C phosphorylates calcium binding protein 45 kDa (Cab45), a Golgi protein, regulating the sorting and secretion of proteins (84). This phosphorylation regulates Cab45 oligomerization independent of its Ca2+ binding ability and facilitates translocation of Cab45 into trans Golgi network-derived vesicles, thus accelerating vesicle budding (84). Furthermore, the Cab45 phosphorylation enhances secretion of its client proteins, including lysozyme C (84). Similarly, Fam20C phosphorylation has been shown to be important for the secretion of osteopontin (64).

Fam20C substrates in blood
Phosphoproteomic analyses of plasma and serum revealed that the majority of phosphorylated sites identified adhered to the SxE/pS motif (52,54), thus triggering the hypothesis that the majority of the extracellular plasma/serum phosphoproteins could be Fam20C substrates. Multiple proteins with well-established roles in blood coagulation, phosphate homeostasis, and complement pathways have been identified in phosphoproteomic studies by comparing the phosphoproteome of wild-type cells with cells lacking Fam20C (56,85) (Fig. 5). The major vertebrate clotting factor fibrinogen (alpha and gamma chains) was identified as a potential substrate of Fam20C in these phosphoproteomic screens (56). Phosphorus was found in fibrinogen as early as 1962, and the amino acid sequence revealed the sites to be SxE (86). During tissue and vascular injury, fibrinogen is cleaved by thrombin to fibrin peptides, which form a fibrin-based blood clot and stop bleeding (87). It has been reported that phosphorylated fibrinogen binds better to thrombin, thus releasing more fibrin peptides and promoting faster coagulation (88,89). Fam20C has been found to directly phosphorylate fibrinogen alpha and gamma chains in vitro (56), and further work is needed to define the physiological roles of the phosphorylation events. On a similar note, Fam20C phosphorylates the A2 domain of von Willebrand factor (vWF) on two SxE sites, pSer1517 and pSer1613 (90). These modifications promote platelet adhesion to sites of vascular injury and aid coagulation (90). Among the other serum/plasma proteins identified as Fam20C substrates are collagen and the complement components C3 and C4 (56), of which collagen and C3 have been reported to be phosphorylated previously (91,92). Further work is needed to establish the role of Fam20C and phosphorylation of its key substrates in the blood coagulation pathway.
Another well-characterized substrate of Fam20C in serum is fibroblast growth factor-23 (FGF23), a bone-derived hormone that regulates serum phosphate levels (85,93). Mice with Fam20C deletion exhibit an increase in bioactive serum FGF23 leading to the development of hypophosphatemic rickets and skeletal defects (32,76), which can be partially reversed by feeding the mice a high-phosphate-containing diet (94). In fact, within the Golgi, Fam20C phosphorylates FGF23 on Ser180 (SxE site), which inhibits its O-glycosylation and subsequently promotes proteolysis and inactivation of the hormone (85). Intriguingly, proteolysis-resistant missense alterations adjacent to Ser180 (R176Q, R179W, and R179Q) activate FGF23 leading to hypophosphatemic rickets (95). Furthermore, knockdown of Fam20C in cells promotes FGF23 mRNA expression (63), and elevated levels of serum FGF23 contribute to cardiovascular complications and increased mortality in patients with chronic kidney disease (96).

Fam20C substrates in heart
Besides FGF23, which contributes directly to cardiovascular problems in patients, various other substrates of Fam20C have been implicated in heart disease (Fig. 5). PCSK9 (proprotein convertase subtilisin-kexin 9) patient genetic variations altering SxE sites correlate with LDL-cholesterol dysregulation, a risk factor for heart disease (97). Importantly, Fam20C-mediated phosphorylation of PCSK9 improves PCSK9 secretion and enhances the degradation of the low-density lipoprotein receptor (LDLR) in endosomes/lysosomes (97). On a similar note, PCSK7 is phosphorylated by Fam20C on Ser505 (SxE site), leading to higher triglyceride uptake into adipocytes (98). Interestingly, exome sequencing revealed a low-frequency coding variant of PCSK7, R504H, that correlated with 30% lower plasma triglyceride levels in individuals harboring this change (98). Further biochemical analyses revealed that the R504H substitution enhanced phosphorylation of the adjacent S505, possibly promoting higher triglyceride uptake (98).
Cardiac function, that is, contraction and relaxation, is brought about by a complex interplay of multiple proteins and posttranslational modifications playing essential roles in regulating intracellular calcium (Ca2+) handling (99). The sarcoplasmic reticulum (SR) of cardiac muscle is the Ca2+ storage organelle, and Ca2+ is shuttled between the SR and cytosol via various SR-resident receptors during contractions and relaxations of the heart (100). Fam20C resides in the SR of cardiac muscle and phosphorylates multiple major Ca2+ handling machinery proteins including histidine-rich Ca-binding protein (HRC), Stim1, calsequestrin 2, sarcalumenin, triadin, calumenin, and calreticulin (101,102). These proteins play essential roles mediating SR Ca2+ storage, uptake, and release (102,103). For example, Fam20C-mediated phosphorylation of calsequestrin 2, the major Ca2+ binding protein in the SR, dramatically alters the ability of calsequestrin 2 to oligomerize, which is critical to its function (102). Stim1, the luminal ER/SR Ca2+ sensor responsible for store-operated Ca2+ entry in a variety of cell types, is also dramatically regulated by Fam20C phosphorylation, providing the most compelling evidence of Fam20C-mediated Ca2+ regulation. In addition, a recently discovered Stim1-S88G substitution (within an SxE site) was found in a patient with heart disease, and the substitution, which precludes Fam20C phosphorylation, was shown to alter Ca2+ signaling (102,104).
Interestingly, cardiomyocyte-specific Fam20C knockout mice (cKO) exhibited signs of heart failure upon aging or induced pressure overload by transverse aortic constriction (102). At 9 months of age, cKO mice exhibited a significant increase in left ventricle chamber size with distinct features of heart fibrosis and dilated cardiomyopathy (102). The heart failure phenotype in cKO mice is thought to be brought about by dramatic SR Ca2+ handling defects since isolated cardiomyocytes from aged cKO mice exhibited severe Ca2+ cycling defects and delayed relaxation (102).
Dilated cardiomyopathy (DCM) is an underlying heart defect and is associated with sudden death in over 50% of the cases (105). Aged cKO mice exhibit clear signs of DCM, and although multiple substrates have been reported for Fam20C in SR, HRC has been widely implicated in DCM (103). HRC is an essential Ca2+ handling protein, and its depletion leads to enhanced cardiomyocyte aftercontractions upon stress (106). Failing human hearts exhibit lower protein levels of HRC, and multiple genetic variants of HRC have been reported in human DCM cases (103). Fam20C-mediated phosphorylation of HRC is thought to control Ca2+ leak and enhance SR Ca2+ transport, thereby maintaining ambient signaling (101). The site of phosphorylation on human HRC is S96, which is a canonical SxE phosphorylation site (101). Remarkably, S96A is a common human genetic variant of HRC, and patients with the homozygous Ala/Ala variant exhibit a fourfold increased risk of lethal ventricular arrhythmias in idiopathic DCM compared with normal Ser/Ser patients and a twofold increased risk when compared with heterozygous individuals (103). Furthermore, preliminary genetic analysis indicates that roughly 60% of participants had at least one copy of S96A, suggesting that this condition has extremely broad implications for heart disease (103). The intriguing dosage-dependent manner of DCM lethality in the nonphosphorylatable S96A genetic variant of HRC suggests that pS96 HRC phosphorylation by Fam20C is likely an important molecular event in cardioprotection.
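The dosage dependence described above can be spelled out with a little arithmetic. The sketch below uses only the ratios quoted in the text (fourfold for Ala/Ala versus Ser/Ser, twofold for Ala/Ala versus heterozygotes, and roughly 60% of participants carrying at least one S96A copy). It additionally assumes that both risk ratios refer to the same outcome in the same cohort and that genotypes follow Hardy-Weinberg proportions; neither assumption is stated in the source, and the variable and function names are illustrative.

```python
import math

# Relative risks quoted in the text for lethal ventricular arrhythmias.
RR_ALA_ALA_VS_SER_SER = 4.0
RR_ALA_ALA_VS_SER_ALA = 2.0

# Implied risk of heterozygotes relative to Ser/Ser (assumes both ratios
# describe the same outcome measured in the same cohort).
rr_ser_ala_vs_ser_ser = RR_ALA_ALA_VS_SER_SER / RR_ALA_ALA_VS_SER_ALA


def allele_freq_from_carrier_fraction(carrier_fraction):
    """Allele frequency q such that 1 - (1 - q)**2 equals the carrier fraction.

    Assumes Hardy-Weinberg proportions, which the source does not claim.
    """
    return 1.0 - math.sqrt(1.0 - carrier_fraction)


ala_allele_freq = allele_freq_from_carrier_fraction(0.60)
ala_ala_fraction = ala_allele_freq ** 2

if __name__ == "__main__":
    print(f"Ser/Ala vs Ser/Ser risk ratio: {rr_ser_ala_vs_ser_ser:.1f}")
    print(f"Implied Ala allele frequency: {ala_allele_freq:.3f}")
    print(f"Implied Ala/Ala fraction: {ala_ala_fraction:.3f}")
```

Read this way, each Ala allele would roughly double the risk, which is what a dosage-dependent effect of losing the S96 phosphorylation site would predict. The implied allele and genotype frequencies are back-of-the-envelope estimates under the stated assumptions, not reported data.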

Fam20C genetic alterations in disease
Biallelic loss-of-function genetic alterations in the Fam20C gene lead to the development of an autosomal recessive disorder called Raine syndrome (OMIM #259775) (Fig. 3D) (107-109). In 1985, two infant sisters with neonatal lethality were reported to exhibit a unique, autosomal recessive case of congenital sclerosing osteomalacia with cerebral calcification (110). It was not until 2016 that their archival DNA was sequenced to reveal a Fam20C genetic alteration in a key conserved region (111). These patients may arguably have been the first documented cases of Raine syndrome harboring genetic alterations in Fam20C. The name "Raine syndrome" was coined in 1989 when Raine and colleagues comprehensively reported this lethal osteosclerotic bone dysplasia (112), while links with Fam20C alterations were established by Simpson and colleagues in 2007 (107). The cases presented often exhibit neonatal lethality with extreme skeletal deformities, ectopic calcification, and organ hypoplasia (107). Some nonlethal cases have also been reported, with patients exhibiting hypophosphatemia and altered facial and skeletal features (108). Over 40 cases of Raine syndrome have been reported worldwide, and DNA sequencing revealed that all these patients carried various alterations in the Fam20C gene, which are likely the driving cause of disease (107)(108)(109). About 25 unique alterations have been reported for Fam20C in disease, which affect stability, secretion, activity, and integrity of Fam20C protein (Fig. 3D) (11,51). Intriguingly, a direct correlation has been observed between Fam20C activity and disease lethality, wherein complete deletion leads to neonatal lethality, whereas residual activity is sufficient to keep the individual alive beyond birth to preteen and even teenage years. Two teenagers with hypophosphatemia and rickets exhibited a compound heterozygous Fam20C genetic alteration where one copy of the Fam20C gene contained a T268M substitution (113). Fam20C T268M purified in vitro preserved only 10% of wild-type kinase activity (50). Interestingly, the FDA-approved multiple sclerosis drug and sphingosine analog, fingolimod, potently activated Fam20C in vitro (50). Fingolimod also led to higher activity of Fam20C T268M in vitro (50). This suggests that fingolimod may be useful in partially alleviating the loss of Fam20C activity in nonlethal Raine syndrome cases. Furthermore, a patient with a similar conservative amino acid replacement, Ser to Thr (S410T), exhibited very mild symptoms (114). In fact, a canine model of nonlethal Raine syndrome has been reported exhibiting a minimally disruptive Ala to Val substitution in the Fam20C kinase domain (115). Most alterations reported alter the protein sequence of Fam20C in key conserved regions, whereas large chromosomal rearrangements (107) and splice-site alterations also result in Fam20C deletions and disease manifestations (116,117).

Boxes indicate the pseudokinase active site and ATP-binding site. N and C lobes indicated approximately. B, superimposition of Homo sapiens FAM20A (hFAM20A, PDB ID:5yh3, chain A, cyan) and C. elegans FAM20C (ceFAM20C, PDB ID:4kqb, chain A, goldenrod) active sites. Manganese coordinating residues indicated. Q258 abolishes manganese and ATP-binding. ceFAM20C ATP-binding displayed for reference. C, superimposition of Homo sapiens FAM20A (hFAM20A, PDB ID:5yh3, chain A, cyan) and C. elegans FAM20C (ceFAM20C, PDB ID:4kqb, chain A, goldenrod) bound ATP/adenosine diphosphate (ADP). hFAM20A binds ATP in an inverted fashion. D, gene diagram depicting disease mutations (del, deletion; fs, frame shift; X, STOP/termination). E, cartoon depiction of kinase indicating positions of mutated residues when resolved (mutations as red spheres, PDB ID:5yh3, chain C). Residue labels color coded to indicate mutation type (red: missense mutation, orange: frameshift, pink: STOP/termination, and yellow: deletion). N and C lobes indicated approximately.
The reported Fam20C disease alterations in humans, with the exception of splice-site mutations, are listed in Table 1 and Figure 3, D and E, with corresponding information on inheritance, lethality, and effect on Fam20C protein/kinase activity.

Fam20A, the secreted pseudokinase
Unlike Fam20C, which is ubiquitously present in all tissues, Fam20A is preferentially expressed in lactating mammary glands and in enamel and dental matrices (13,32). Fam20A forms a functional heterotetrameric complex with Fam20C (Fig. 5) and allosterically increases Fam20C activity toward its substrates via heterodimerization (Fig. 3, B and C) (13,29). Interestingly, formation of the heterodimer is sufficient to allosterically increase Fam20C activity both in vitro and in cells, and the unique contributions of the heterotetramer are still unknown (29). Fam20A is a paralog of Fam20C and is the first secreted pseudokinase identified (Fig. 4, A and B) (13). Pseudokinases are proteins that share sequence homology with kinases but either lack kinase activity owing to mutations in normally conserved amino acids that catalyze phosphoryl transfer (118) or use the kinase fold to transfer molecules other than phosphate (119,120). A conserved Gln residue in Fam20A replaces a Mn2+-coordinating Glu residue of Fam20C that is essential for catalysis (13). In fact, mutagenesis studies revealed that replacing the Gln with a Glu in Fam20A triggered hydrolysis of ATP and restored kinase activity (13). In addition to lacking a residue essential for catalysis, Fam20A binds ATP (Fig. 4, A and C) in a unique conformation (121). Structural studies revealed that the ribose moiety of the ATP is "upside down," and the entire nucleotide is inverted, with the phosphate groups pointing in the opposite direction (121). Hence, the γ-phosphate is positioned away from the active site and cannot be utilized for transfer. Several hydrophobic residues and hydrogen bonds in the pseudokinase pocket bind the adenine of ATP (Fig. 4, B and C), while the otherwise-hydrolyzable γ-phosphate is surrounded and stabilized by extensive salt bridges and hydrogen bonds (121). Furthermore, the "inverted" ATP binding to Fam20A seems to prefer the absence of metal ions, as biochemical studies indicated that the dissociation constant of Fam20A for ATP is 50-fold higher (that is, binding is weaker) in the presence of Mn2+ (121). Intriguingly, ion-independent ATP binding of Fam20A remarkably promoted the formation and structural homogeneity of the heterotetrameric Fam20A-Fam20C complex (121). Although cation-independent ATP binding has been reported previously in other pseudokinases (118,122,123), the inverted binding of ATP and heterotetramer formation in the secretory pathway make Fam20A a unique pseudokinase. Interestingly, subtle structural differences from Fam20C equip Fam20A for its kinase-independent function (121). Fam20A has a unique and highly conserved insertion in the Gly-rich loop, which triggers the formation of two unique disulfide bonds (human Fam20A: Cys209-Cys319 and Cys211-Cys323) (121), and truncation of this insertion due to aberrant RNA splicing caused the tooth enamel defect amelogenesis imperfecta in a patient (124).

Figure 5. Roles of Fam20 secretory pathway kinases. The overarching roles of the Fam20 kinases identified to date are in nutrition, biomineralization, blood, cardiac function, proteoglycan biosynthesis, allosteric kinase activation, and endoplasmic reticulum proteostasis. Fam20 paralogs are localized in the secretory pathway and phosphorylate multiple substrates playing essential roles in animal physiology.
Variations in the gene encoding Fam20A result in amelogenesis imperfecta (AI), nephrocalcinosis (NC), and ectopic calcification (EC) (125). Similar phenotypes were observed upon whole-body and tissue-specific genetic depletion of Fam20A in mice, which exhibited clear AI and dental defects (32,126). An exhaustive list of Fam20A patient variations with corresponding clinical information was reported by Nitayavardhana and colleagues in 2020 (127). To date, about 40 different disease-causing genetic alterations have been reported in Fam20A in 70 patients from 50 independent families (Fig. 4D) (127). The patients exhibited nonlethal dental symptoms including hypoplastic enamel, gingival hyperplasia, and unerupted permanent teeth (127). The majority of the alterations were frameshifts with increased chances of hypomorphism, truncation, deletion, complete loss of function, or major structural effects with possible dissociation from the Fam20A-Fam20C complex. The alterations are listed in Table 2 and Figure 4, D and E.

The roles of the Fam20 kinases in disease transcend our current knowledge, as is evident from preliminary studies pointing to potential roles of Fam20C in diseases beyond biomineralization and cardiac function (36,128-130). Developing inhibitors or activators of Fam20B or Fam20C is worthwhile at this point, given their usefulness as research tools. To date, only one inhibitor, FL-1607, has been developed for Fam20C, and no proper in vitro target engagement or biochemical binding/inhibitory assays have been shown for this compound (71). In vivo targeting of Fam20 kinases is expected to elicit major side effects owing to the diverse substrates essential for organism function (11,56). The following section provides the evolutionary perspective of the Fam20 family from early invertebrates to mammals.

Fam20 and animal evolution
Fam20 orthologues are observed across the animal kingdom from sponges to mammals, and early invertebrates have a single copy of the Fam20 gene (Fig. 1) (29). The sponge Amphimedon queenslandica, a member of what is considered the oldest animal phylum (131), has a single copy of the Fam20 gene, which has Fam20B-like glycan kinase activity and produces phosphorylated xylose residues on tetrasaccharide linkers (29). Cnidarians such as Hydra magnipapillata also have a single Fam20B-like protein (29), which robustly phosphorylates xylose residues and is thought to contribute to CS proteoglycan chain extension, a function conserved through to mammals (30). An interesting exception is the nematode Caenorhabditis elegans (C. elegans), which is, to date, the only organism known to lack Fam20B-like kinase activity (51). Even though proteoglycan biosynthesis in C. elegans is remarkably conserved when compared with that in humans, only unphosphorylated xylose is detected in the tetrasaccharide linker of C. elegans CS proteoglycans (132), highlighting the absence of Fam20B activity (51). Instead, Fam20 in C. elegans (known as FAMK-1) is a protein kinase with the same SxE substrate preference as mammalian Fam20C (51,133). A study of FAMK-1 in C. elegans to uncover its ancestral roles revealed that it is involved in many physiological processes contributing to fertility, embryogenesis, and development (133). During embryogenesis, FAMK-1 prevents multinucleation, a defect that can be overcome by elevating the temperature or lowering cortical stiffness (133). In adults, FAMK-1 expression in the spermatheca, a tissue that undergoes repeated mechanical strain controlled by calcium fluxes, is important for fertility (133). In the context of the organism, it is clear that Fam20C activity is required in the late secretory pathway or outside the cell for function (133). The advent of two members in the Fam20 family is first observed in arthropods (29).
Drosophila melanogaster has one copy each of Fam20B and Fam20C (29). In fact, Fam20C phosphorylates in vitro the Drosophila egg yolk proteins (5), which are the closest functional analogs of vitellogenin and phosvitin (134). As stated previously, phosvitin, one of the most heavily phosphorylated proteins known, is a Fam20C substrate (5). Phosvitin is cleaved from vitellogenin, the major egg yolk protein found in all egg-laying animals (4), and largely consists of long stretches of serine residues that are phosphorylated by Fam20C despite the absence of glutamate residues (5). Phosphorylation of vitellogenin and/or its phosvitin domains occurs in bird, fish, worm, and insect yolk proteins (5), making this a widespread and evolutionarily conserved modification. Notably, the functional consequences of these phosphorylation events have yet to be determined. Fam20C also plays important roles in the honeybee (Apis sp.), where it phosphorylates royal jelly proteins (135). An in-depth phosphoproteomics study of royal jelly proteins determined that they are phosphorylated mainly on SxE sites, likely by a Fam20C-like protein in the hypopharyngeal and mandibular glands of nurse bees, from which royal jelly is secreted (135). Royal jelly is an indispensable dietary component of the queen bee and possesses antibacterial, anticancer, antihypertensive, and antioxidative effects that coincidentally benefit human health (135-137). Significantly, the antimicrobial activities of royal jelly are influenced by phosphorylation in complex ways (135).
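The SxE/pS preference described above lends itself to a simple sequence scan. The following is a minimal sketch, not a method taken from the cited studies: it reports serines with Glu at the +2 position (S-x-E) and, separately, serines with another serine at +2 (S-x-S), which would fit the motif only if that +2 serine is itself phosphorylated, something sequence alone cannot establish. The function name `candidate_sites` and the toy peptide are hypothetical, and a motif match is only a candidate site, not evidence of phosphorylation.

```python
import re

# Serine at position 0, any residue at +1, Glu at +2 (S-x-E).
# A lookahead keeps matches zero-width past the serine, so overlapping
# candidates are all reported.
SXE = re.compile(r"S(?=.E)")

# Serine with a serine at +2. This fits the "pS" half of SxE/pS only if the
# +2 serine is phosphorylated, which the sequence itself cannot tell us.
SXS = re.compile(r"S(?=.S)")


def candidate_sites(seq):
    """Return 1-based positions of candidate serines for each motif class."""
    seq = seq.upper()
    return {
        "SxE": [m.start() + 1 for m in SXE.finditer(seq)],
        "SxS": [m.start() + 1 for m in SXS.finditer(seq)],
    }


# Hypothetical toy peptide (not a real protein sequence).
toy = "MKSAEDSSSSEA"
sites = candidate_sites(toy)
print(sites)  # {'SxE': [3, 9], 'SxS': [7, 8]}
```

In the toy peptide, the serine run mimics the serine-rich stretches of phosvitin: only the serine two residues upstream of the Glu matches S-x-E directly, while the earlier serines in the run are S-x-S candidates that depend on prior phosphorylation downstream.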
While the role of Fam20C in biomineralization in vertebrates is well documented, the study of Fam20's role in invertebrate biomineralization is in its infancy. A recent study characterized Fam20 cDNA from the pearl oyster, Pinctada fucata, and determined that it was expressed in the mantle edge, where it is positioned to play a role in shell formation (138). Furthermore, its expression increases at the stage of development when the shell is first forming, and knockdown of Fam20 in vivo by RNA interference resulted in the formation of abnormal calcium carbonate crystals during shell formation (138). It is intriguing that Fam20C could be involved in calcium carbonate as well as calcium phosphate biomineralization processes; nevertheless, it remains to be shown that P. fucata Fam20 displays Fam20C kinase activity on relevant substrates. Echinoderms such as the sea urchin Strongylocentrotus purpuratus exhibit a duplication of Fam20C, with one copy of Fam20B and two copies of Fam20C (29). Protochordates such as Branchiostoma and Saccoglossus exhibit both Fam20B and Fam20C; however, tunicates such as Ciona intestinalis and Oikopleura dioica seem to have a single Fam20 gene exhibiting Fam20B-like functions (29). The reason for this is unclear; however, incomplete genome sequencing could be a contributing factor for this "absence" (29).
Fam20A is first observed in fish (139). In fact, fish express three copies of Fam20C and one copy each of Fam20B and Fam20A (29). Vertebrates have all three members of this subfamily, Fam20A, B, and C, while invertebrates and protochordates do not possess a Fam20A orthologue. The emergence of Fam20A may be attributable to the need for enhanced Fam20C activity, which presumably would promote the biomineralization necessary for the formation of bones and tooth enamel. It is a mystery why divergent animal species have maintained different Fam20 protein activities, but as noted previously, these phylogenetic analyses indicate that the Fam20B glycan kinase is likely the ancestral kinase (29). Fam20A may have been derived from Fam20C, having lost its kinase activity but gained the function of activating Fam20C as a pseudokinase partner in vertebrates (29).
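The distribution of Fam20 activities described in this section can be summarized in a small lookup structure. The sketch below simply encodes, for the representative lineages named in the text, which paralog classes (or paralog-like activities) are reported; copy numbers are omitted, and the names `FAM20_BY_LINEAGE` and `lineages_with` are illustrative only. The single genes in sponge, Hydra, and tunicates are recorded as "B" because the text describes them as Fam20B-like, and C. elegans FAMK-1 as "C" because of its Fam20C-like substrate preference.

```python
# Paralog classes reported in this section, ordered roughly from
# early-branching animals to vertebrates. "B" and "C" denote Fam20B-like and
# Fam20C-like activity where only a single Fam20 gene is described.
FAM20_BY_LINEAGE = [
    ("sponge (Amphimedon queenslandica)", {"B"}),
    ("cnidarian (Hydra magnipapillata)", {"B"}),
    ("nematode (Caenorhabditis elegans)", {"C"}),
    ("arthropod (Drosophila melanogaster)", {"B", "C"}),
    ("echinoderm (Strongylocentrotus purpuratus)", {"B", "C"}),
    ("tunicate (Ciona intestinalis)", {"B"}),
    ("fish", {"A", "B", "C"}),
]


def lineages_with(paralog):
    """List the lineages (in table order) in which the given class is reported."""
    return [name for name, present in FAM20_BY_LINEAGE if paralog in present]


print(lineages_with("A"))  # ['fish']
```

Querying for "A" returns only the vertebrate entry, which mirrors the statement that the pseudokinase Fam20A first appears in fish.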

Concluding remarks
Since 1883, secreted proteins have been known to be phosphorylated. The identification of Fam20C in 2012 displaced the intracellular CKs as genuine casein kinases and opened up a new field wherein over 100 substrates across the phosphoproteome were linked to a single secreted atypical kinase (11,56). With a preferred motif of SxE/pS, Fam20C can account for approximately two-thirds of the secreted phosphoproteome. However, a sizeable fraction of secretory phosphoproteins exhibits pThr, non-SxE pSer, and pTyr phosphorylation events, which may not be attributable to Fam20C. The field of secretory pathway and extracellular phosphorylation is poised to expand rapidly with the continued characterization of the kinases that function in these environments. The activities and functions of most secretory pathway kinases have yet to be elucidated. It is unknown whether the other FJX and VLK family proteins are kinases and, if so, whether their substrates are proteins, lipids, or metabolites. On the other hand, significant progress has been made with the subfamily of secreted kinases composed of the active kinases Fam20B and Fam20C and the pseudokinase Fam20A (Fig. 5). With established links to human skeletal diseases, the initial roles of the Fam20 family were thought to be focused on biomineralization; however, identification of SxE/pS motifs in over two-thirds of the secreted phosphoproteome, including plasma, serum, cerebrospinal fluid, neuropeptides, and extracellular matrix components, points to diverse functions of Fam20C. Indeed, our work with the heart-specific Fam20C knockout mouse revealed the quintessential role of Fam20C in maintaining cardiac health (101,102). Furthermore, studies of Fam20B and Fam20C in invertebrate organisms suggest roles in glycan function, mollusk shell formation, insect egg development, beehive nutrition, and fertility of nematodes.
Other groups have also reported diverse substrates for Fam20C playing essential roles in endoplasmic reticulum homeostasis, coagulation, nutrition, and hormonal regulation. Thus, an organ-specific focus on Fam20C should reveal further systemic functions of Fam20C in modulating a diverse set of substrates. Indeed, activators of Fam20C may benefit nonlethal Raine patients as well as protect against heart disease and other potential systemic health issues. Thus, the roles of the Fam20 family extend far beyond biomineralization, and greater focus should be put on identifying these multiple roles in diverse systems. We believe that the Fam20 family is just the tip of the iceberg, since multiple secretory pathway kinases remain enigmatic. In fact, recent studies have revealed that the kinome possibly expands far beyond the 540 kinases, with predicted kinases and pseudokinases exhibiting diverse functions beyond phosphate transfer (119,120). Identification of the Fam20 family is a testament to the fact that atypical kinases exhibit catalytic residues, structural features, and cellular localizations outside of conventional knowledge. Hence, we have just scratched the surface of the physiological significance of extracellular phosphorylation, and many exciting prospects await the field in the near future.