A Systematic Study of Site-specific GalNAc-type O-Glycosylation Modulating Proprotein Convertase Processing*

Background: GalNAc-type O-glycosylation is emerging as a co-regulator of proprotein convertase processing of proteins.
Results: O-Glycosylation within at least ±3 residues of the RXXR substrate motif for furin affected processing.
Conclusion: Site-specific O-glycosylation by 20 polypeptide GalNAc transferases has wide co-regulatory functions in proprotein processing.
Significance: This is the first systematic study that paves the way for wider co-regulatory functions of O-glycosylation in protein processing.

Site-specific GalNAc-type O-glycosylation is emerging as an important co-regulator of proprotein convertase (PC) processing of proteins. PC processing is crucial in regulating many fundamental biological pathways, and O-glycans in or immediately adjacent to processing sites may affect recognition and function of PCs. Thus, we previously demonstrated that deficiency in site-specific O-glycosylation in a PC site of the fibroblast growth factor, FGF23, resulted in marked reduction in secretion of active unprocessed FGF23, which causes familial tumoral calcinosis and hyperostosis hyperphosphatemia. GalNAc-type O-glycosylation is found on serine and threonine amino acids, and up to 20 distinct polypeptide GalNAc transferases catalyze the first addition of GalNAc to proteins, making this the most complex and differentially regulated step in protein glycosylation. There is no reliable prediction model for O-glycosylation, especially of isolated sites, but serine and to a lesser extent threonine residues are frequently found adjacent to PC processing sites. In the present study we used in vitro enzyme assays and ex vivo cell models to systematically address the boundaries of the region within which site-specific O-glycosylation affects PC processing. The results demonstrate that O-glycans within at least ±3 residues of the RXXR furin cleavage site may affect PC processing, suggesting that site-specific O-glycosylation is a major co-regulator of PC processing.

Proprotein convertase (PC) processing is one of the major post-translational modifications and a fundamental step in protein maturation, where limited targeted proteolysis activates or inactivates many different proteins such as hormones, growth factors, cytokines, proteases, and receptors. Nine subtilisin-like proprotein convertases (PCSKs; PC1/3, PC2, furin, PC4, PC5/6, PACE4, PC7, SKI-1/S1P, and PCSK9) are known to regulate this process in a cell-type and protein-specific manner. The PCSKs are synthesized as inactive zymogens and undergo proteolytic processing to generate a heterodimer of an inhibitory prosegment with the remaining molecule. As they travel through the secretory pathway they sort to subcellular compartments where the optimal pH and calcium concentration induce a secondary autocatalytic in trans cleavage that liberates the inhibitory prosegment and allows limited proteolysis of proprotein substrates. Furin and PC5/6B localize to the trans-Golgi network and cycle to the cell surface in recycling endosomes, whereas PC5/6A and PACE4 are either bound to the cell surface via heparan sulfate proteoglycans or are secreted in the extracellular matrix. In contrast, PC1/3 and PC2 reside in secretory granules where they process polypeptide hormones and proteins within the regulated secretory pathway (1,2). Seven of the nine PCs cleave proprotein substrates C-terminal of a polybasic consensus cleavage motif ((K/R)-(X)n-(K/R), where n = 0, 2, 4, or 6). Furin has been the most extensively studied PC, and its stringent specificity for the RXXR motif relies on a large negatively charged catalytic pocket as evidenced from crystallization studies (3).
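The polybasic consensus motif above lends itself to a simple sequence scan. The following is a minimal sketch only, not a method used in this study; the function name `polybasic_sites` and the toy sequence are illustrative assumptions. It reports each (K/R)-(X)n-(K/R) occurrence with n = 0, 2, 4, or 6 together with the index of the residue immediately C-terminal to the motif (P1′), where cleavage would be expected.

```python
def polybasic_sites(seq, spacers=(0, 2, 4, 6)):
    """Find (K/R)-(X)n-(K/R) motifs in a one-letter protein sequence.

    Returns a sorted list of (p1_prime_index, n) tuples, where p1_prime_index
    is the 0-based index of the residue just C-terminal to the motif.
    Hypothetical helper for illustration; it does not model PC specificity.
    """
    basic = set("KR")
    hits = []
    for n in spacers:
        # i is the N-terminal basic residue, j the C-terminal one (P1).
        for i in range(len(seq) - n - 1):
            j = i + n + 1
            if seq[i] in basic and seq[j] in basic:
                hits.append((j + 1, n))
    return sorted(hits)


# Toy peptide (assumed for illustration) with a single RXXR motif.
sites = polybasic_sites("GGRAPRTT")
```

An RXXR site is simply the n = 2 case with Arg at both ends; the scan above is deliberately more permissive because it follows the general consensus.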
In general it is believed that a substantial fraction of the ~3,500 secreted mammalian proteins undergo PC-mediated maturation (1), and several co-regulatory mechanisms have been described including pH dependence (α4 integrin) (4), membrane topology (collagen XXIII) (5), and endogenous protein or peptide inhibitors (pro-TGF-β) (6,7). More recently, site-specific O-glycosylation in or immediately adjacent to PC-cleavage sites is emerging as another important co-regulator of PC-mediated processing. Originally, tissue-specific processing of pro-opiomelanocortin was suggested to be related to O-glycosylation at Thr45 (8-10). More recently, we demonstrated that deficient O-glycosylation of the threonine residue in the RHTR179 PC-processing site of FGF23 is the cause of the rare human disease, familial tumoral calcinosis (11). Moreover, O-glycosylation immediately C-terminal to the processing site in ANGPTL3 (RAPR224-TT) was found to regulate processing and was possibly related to regulation of serum lipids (12). Other studies have also indicated that O-glycans in close proximity to PC-processing sites may affect cleavage (13,14). These studies all point to a regulatory role of site-specific O-glycosylation by the large family of GalNAc transferases, which provides a co-regulatory system with a high degree of differential regulation through the many isoforms.
Here, we have systematically addressed the question of the spatial range of the influence of O-glycans on furin processing. We used a comprehensive set of in vitro peptide assays and an ex vivo cell model where O-glycosylation can be manipulated to demonstrate that O-glycans within at least ±3 residues (P7-P3′) of the RXXR PC-processing site can inhibit cleavage. Bioinformatics analysis of secreted proteins further demonstrated a prevalence of Ser and Thr residues in or immediately adjacent to many putative PC-processing sites, suggesting a much wider regulatory role of O-glycosylation in PC processing.

EXPERIMENTAL PROCEDURES
Chemoenzymatic Synthesis of Glycopeptide Substrates for in Vitro Analysis-A series of 20-mer synthetic peptides (Schafer-N, Sigma) was designed with a central furin RXXR sequon processing site and a "walk" of potential O-glycosylation sites (Table 1, Fig. 1). Recombinant glycosyltransferases were expressed as soluble secreted truncated proteins in insect cells and purified as described previously (15). Selection of suitable GalNAc transferases for site-specific O-glycosylation of peptides was performed in product development assays in 25 µl of 25 mM cacodylic acid sodium, pH 7.4, 10 mM MnCl2, 0.25% Triton X-100, 1.5 mM UDP-GalNAc (Sigma), 10 µg of acceptor peptides, and 0.5 µg of purified enzyme at 37 °C using MALDI-TOF to monitor products. Glycopeptides for further analysis were produced in reactions at a 100-200-µg scale and purified by HPLC using a Kinetex™ 2.6-µm C18 100-Å LC column (100 × 2.1 mm).
Characterization of O-Glycosylation Sites by Electron Transfer Dissociation-MS2-Products of O-glycosylation reactions were characterized by electrospray ionization-linear ion trap-Fourier transform mass spectrometry (ESI-LIT-FT-MS) in an LTQ-Orbitrap XL hybrid spectrometer (Thermo Scientific) equipped for electron transfer dissociation for peptide sequence analysis by MS/MS (MS2) with retention of glycan site-specific fragments. Samples were dissolved in methanol/water (1:1) containing 1% formic acid and introduced by direct infusion via a TriVersa NanoMate ESI-Chip interface (Advion BioSystems) at a flow rate of 100 nl/min and 1.4 kV spray voltage. Mass spectra were acquired in positive ion FT mode using parameters similar to previous studies (12), except at a nominal resolving power of either 30,000 or 60,000. Electron transfer dissociation-MS2 spectra were analyzed by comparison with theoretical c and z· fragment m/z values calculated for all positional combinations of one HexNAc residue distributed on all potential S and T glycosylation sites in the sequence. Calculations were performed using the web-based Protein Prospector MS-Product software routine.
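The "positional combinations" referred to above can be enumerated mechanically. The sketch below is an illustration under stated assumptions, not the Protein Prospector routine: the function name `hexnac_placements` is hypothetical, and it only lists candidate site assignments (1-based Ser/Thr positions) without computing any fragment masses.

```python
from itertools import combinations


def hexnac_placements(peptide, n_hexnac=1):
    """List every way to place n_hexnac HexNAc residues on Ser/Thr sites.

    Each placement is a tuple of 1-based residue positions. Hypothetical
    helper for illustration; fragment m/z values are not calculated here.
    """
    sites = [i + 1 for i, aa in enumerate(peptide) if aa in "ST"]
    return list(combinations(sites, n_hexnac))


# Fragment of the pro-BNP processing region quoted in this paper (TLRAPR-SP).
candidates = hexnac_placements("TLRAPRSP", n_hexnac=1)
```

Each candidate placement would then be scored against the observed c and z· ions to assign the occupied site.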
EYFP Reporter Expression Constructs-A previously described reporter construct (12) was modified to include an additional His tag and an extended MUC1 tandem repeat sequence. The modified reporter construct was designed with a central 20-amino acid PC processing target sequence flanked by short spacer sequences, with His tags/EYFP at the N-terminal end and a 3.5 MUC1-TR/His tag at the C-terminal end (Fig. 2A). A series of constructs with targeting sequences identical to those used in the in vitro peptide assays (Fig. 1) was developed by in-frame insertion of double-stranded oligonucleotides, encoding the respective peptide sequences, into the SacII and Bpu1102I restriction sites situated within the respective N- or C-terminal spacer sequences of the reporter construct (Fig. 2A).
Expression of Reporter Constructs in CHO ldlD Cells-CHO ldlD cells with defective UDP-Gal/UDP-GalNAc C4-epimerase (16) were grown in Ham's F-12/Dulbecco's modified Eagle's medium (1:1, v/v) supplemented with 3% fetal bovine serum and 1% glutamine. Cells were transfected with 2 µg of pcDNA3.1 reporter constructs using the Mirus TransIT-CHO Transfection Kit according to the manufacturer's instructions. Transient expression was analyzed 48 h post-transfection, and stable transfectants were selected in 0.4 mg/ml of Zeocin (Invitrogen) by monitoring expression of EYFP. Further analysis by immunocytology using anti-EYFP (Living Colors, Clontech), anti-HIS (Santa Cruz Biotechnology), or anti-MUC1 mAbs was performed. A secondary screening strategy monitoring the secreted reporter construct by a capture ELISA was developed. Briefly, 96-well microtiter plates were coated with 3 µg/ml of anti-MUC1 5E10 (15) overnight at 4 °C and incubated in blocking buffer, pH 7.4 (PO4, Na/K, 1% Triton, 1% BSA), for 2 h at room temperature. Following the wash, wells were incubated with 100 µl of cell culture supernatant for 4 h at room temperature and subsequently for 16 h at 4 °C. Binding of reporter constructs was detected by incubating wells with 1 µg/ml of biotinylated HMFG2 for 2.5 h at room temperature followed by streptavidin-HRP for 1 h at room temperature, and visualized by incubating with TMB+ (Dako). The reaction was stopped by adding 0.5 M H2SO4, and the plate was read at 450 nm. To evaluate the effects of O-glycosylation, stable CHO ldlD transfectant cells were grown in medium supplemented with 0.1 mM Gal (G), 1 mM GalNAc (Gn), or a combination (G/Gn).
In Vitro Furin Cleavage Assay of Peptides and Glycopeptides-A modification of an in vitro cleavage assay with a human furin protease (Sigma) was used to analyze furin cleavage of RXXR-Thr-GalNAc walk peptides, pro-BNP, ST6GalNAc-I, inhibin α, integrin α, and Semaphorin 3B peptides and glycopeptides (11). The assay was performed in 100 mM Hepes, pH 7.5, 1 mM CaCl2, 0.5% Triton X-100, 2 mM DTT using 10 µg of peptide or glycopeptide substrate and 1 unit of enzyme in a total volume of 50 µl. The mixture was incubated at 37 °C, and product development was evaluated after 0.5, 5, and 20 h by MALDI-TOF.
Ex Vivo PC Processing in CHO ldlD Cells-CHO ldlD cells stably expressing PC target sequence reporter constructs were grown to 50% confluence in 6-well dishes, and the medium was replaced with fresh medium containing 50, 5, 1, or 0 µM of the cell-permeable furin inhibitor Dec-RVKR-CMK (EMD Chemicals) in dimethyl sulfoxide. After 24 and 48 h of growth, cell culture supernatant was sampled and assayed by NuPAGE and Western blotting. Briefly, cell culture medium (10 µl) was separated on NuPAGE Novex Bis-Tris 4-12% gels (Invitrogen) and blotted onto nitrocellulose membrane for 60 min, followed by incubation in 5% BSA in Tris-buffered saline (TBS). Following a wash in TBS plus 0.05% Tween 20, the membranes were incubated in primary antibody (mouse anti-HIS, rabbit anti-EYFP, mouse anti-Tn-MUC1 mAb 5E5, or anti-MUC1 5E10) overnight at 4 °C (15). Blots were developed with 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium substrate after incubation with secondary alkaline phosphatase-conjugated antibodies for 1 h at room temperature. Assessment of processing was performed by scanning dried membranes in a flatbed scanner (highest resolution 800 dpi) using ImageJ 1.45i software.
Bioinformatic Analysis-We used the human proteome from the International Protein Index (version 3.65) (17), downloaded via the European Bioinformatics Institute homepage with 86,379 sequences, and first identified a predicted secretome using version 3.0 of the SignalP prediction algorithm (18). Further selection was performed by combining the NN (neural network) and HMM (hidden Markov model) versions of SignalP. Subsequently, we applied the ProP prediction algorithm (19) to identify putative RXXR PC processing sites combined with the presence of Ser/Thr amino acids within P6-P2′ positions of the RXXR motif.
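The final filtering step above (an RXXR motif with Ser/Thr somewhere within P6-P2′) can be sketched as a plain pattern scan. This is a hedged illustration only: it does not reproduce SignalP or ProP, which are trained predictors, and the names `position_label` and `rxxr_sites_with_ser_thr` are hypothetical. Positions follow the usual convention in which the Arg immediately N-terminal to the scissile bond is P1 and the next residue is P1′.

```python
import re

# Lookahead so that overlapping RXXR motifs are all reported.
RXXR = re.compile(r"(?=(R..R))")


def position_label(offset):
    """Map an offset from P1 to a label: 0 -> P1, -1 -> P2, +1 -> P1'."""
    return f"P{1 - offset}" if offset <= 0 else f"P{offset}'"


def rxxr_sites_with_ser_thr(seq, first_offset=-5, last_offset=2):
    """Return RXXR sites that have Ser/Thr within the given window.

    The default window spans P6 (offset -5) to P2' (offset +2). Each result
    is (1-based position of the P1 Arg, [(label, residue), ...]).
    """
    sites = []
    for match in RXXR.finditer(seq):
        p1 = match.start() + 3  # 0-based index of the P1 Arg
        hits = []
        for offset in range(first_offset, last_offset + 1):
            idx = p1 + offset
            if 0 <= idx < len(seq) and seq[idx] in "ST":
                hits.append((position_label(offset), seq[idx]))
        if hits:
            sites.append((p1 + 1, hits))
    return sites


# Fragment of the pro-BNP processing region quoted in this paper.
result = rxxr_sites_with_ser_thr("TLRAPRSP")
```

Narrowing the window (for example to P5-P1′) only requires changing `first_offset` and `last_offset`.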

RESULTS
In Vitro Analysis of Furin Cleavage of a Library of GalNAc-Glycopeptides-To systematically explore the spatial range within which site-specific O-glycosylation can affect PC processing, we designed a series of ideal peptides with a central RXXR furin processing motif and a walk of O-glycosylation sites (Thr residues) from P8 to P4′ (Table 1). Because there is little knowledge of conformational requirements for efficient substrates for either furin processing or GalNAc-T O-glycosylation, we designed an arbitrary set of related peptide sequences with a high degree of sequence and amino acid similarity and with use of small, uncharged residues (Ala, Gly, Pro, and Thr, except for the necessary Arg residues). The use of similar sequence design was expected to result in similar rates and efficiencies of furin digestion. To ensure efficient enzymatic O-glycosylation, Pro residues were included. We also avoided dibasic motifs in or around the RXXR motif to enhance selectivity for furin, as other PCs have a preference for dibasic motifs. As predicted, we were able to use three GalNAc-T isoforms to produce GalNAc glycopeptides with a single GalNAc residue attached at the appropriate Thr site in all peptide designs.
Two peptides (P3 and P1′) were poor acceptors for in vitro O-glycosylation, and only ~50% incorporation of GalNAc was achieved even after prolonged assays (16 h); however, we could still use these for analysis of processing. A time course analysis of in vitro furin cleavage of each of the peptides revealed that all peptides were substrates for in vitro furin cleavage as predicted, and some variation in velocities of the reactions was observed, with P2′ being the poorest substrate (not shown). Comparative analysis of furin cleavage of peptides and GalNAc glycopeptides in a time course monitored by MALDI-TOF revealed that O-glycosylation within the region from P8 to P3′ adversely affected cleavage (Table 1). Position P8 at the most N-terminal end tested partially affected cleavage, whereas the most C-terminal position P4′ tested had no effect on cleavage (Table 1 and Fig. 1). O-Glycans at positions P7 (Fig. 1B), P5 (Fig. 1C), and P2′ (Fig. 1E) (as well as P6, P3, and P1′, not shown) completely blocked cleavage, although perhaps surprisingly an O-glycan at P2 (Fig. 1D) only partially inhibited cleavage. The latter is, however, in agreement with our previous studies of PC processing of FGF23, where GalNAc at P2 in the RHTR179↓ processing site only partially inhibited cleavage in vitro (11). Importantly, an extended O-glycan structure with α2,6-linked sialic acid provided complete block of processing. In our in vitro analysis, we limited studies to the simple GalNAc O-glycan because our ex vivo cell model provides information on the effect of extended O-glycans as described below. The results are summarized in Table 1.

Table 1. Summary of in vitro and ex vivo O-glycosylation and PC furin cleavage
a 20-mer peptides containing an RXXR minimal furin cleavage motif and a walk of threonine from position P8 to P4′ (grey, underlined).
b In vitro glycosylation of target sequences with GalNAc-T1, -T2, and -T3. x; full incorporation of 1 site. (x); partial incorporation of 1 site.
c Peptide sequence is cleaved in vitro by furin.
d In vitro glycosylation provides protection against in vitro proteolytic cleavage by furin. Y = yes, N = no, and P = partial. NA (not analyzed).
e Ex vivo analysis of PC processing in the CHO ldlD cell model. Y = yes, N = no, and P = partial. NA (not analyzed).

Fig. 1 legend (fragment): ... (16 h) and reactions were monitored by MALDI-TOF. All nonglycosylated peptides except the P2′ peptide were completely processed by furin, and the P2′ peptide was partially processed. GalNAc glycopeptides with GalNAc residues in positions P7, P5, and P2′ were completely protected against cleavage. GalNAc in positions P2 and P8 provided partial protection, whereas position P4′ did not provide protection. Downward arrow includes mass of the expected cleavage product. Peptides P5 and P2 were synthesized with an N-terminal aminocaproic acid (C6H11NO) linker with a mass of 114. Calculated m/z are all for monoisotopic MH+, and calculated full-length peptide MH+ are shown in parentheses.

NOVEMBER 18, 2011 • VOLUME 286 • NUMBER 46

Site-specific O-Glycosylation Modulates PC Processing
Ex Vivo Analysis of PC Processing in the CHO ldlD Cell Model-We next proceeded with analysis of PC processing of a selection of the peptide sequences tested by the in vitro assay (including the most N- and C-terminal sites and sites most proximal to the cleavage site) in the CHO ldlD cell system, where capacity for O-glycosylation can be modulated by addition of exogenous Gal and GalNAc sugars (20). We used a modification of a previously described chimeric reporter construct that facilitates rapid selection of stable clones as well as screening of processing and analysis of the influence of O-glycosylation (Fig. 2A) (12). We initially analyzed glycosylation and processing of the reporter constructs by transient expression, but found that both glycosylation and processing of secreted products were quite incomplete, so we used stable clones for all studies reported (Fig. 2B).
Processing efficiency of constructs was assessed by mobility shifts using the ratio of cleaved (N-terminal) and uncleaved fragments produced by cells grown without (37/50-kDa bands) and with the capacity for O-glycosylation (37-38/62-kDa bands). As shown in Table 1 and Fig. 2B, the in vitro furin cleavage assay results correlated well with ex vivo processing of the corresponding reporter constructs in CHO cells. All reporter constructs were cleaved completely or partially without O-glycosylation. With O-glycosylation, processing of constructs P5, P1′, and P2′ was virtually abrogated, whereas processing of the P8 and P2 constructs was partially abrogated, in agreement with the in vitro assay result. In contrast, O-glycosylation of construct P4′ did not block processing. There were no substantial differences in processing of constructs glycosylated with simple GalNAcα (cells grown in GalNAc alone) or glycosylated with sialylated core 1 (NeuAcα2,3Galβ1-3[±NeuAcα2,6]-GalNAcα) O-glycans (cells grown in Gal/GalNAc) (Fig. 2B).
The reporter system allows assessment of O-glycosylation by large shifts in mobility (pro-apoprotein ~50 kDa and pro-glycoprotein ~62 kDa) due to the C-terminal MUC1 tandem repeat sequence with multiple O-glycosylation sites. O-Glycosylation of the PC processing target region can also be identified by mobility shifts of the cleaved N-terminal fragment (N-terminal apoprotein ~37 kDa and glycoprotein 38 kDa) when glycosylation is at P1-P8, whereas determination of glycosylation in positions P1′-P4′ requires further analysis. The O-glycosylation capacity of CHO ldlD cells, i.e. the repertoire of GalNAc-Ts expressed, is not known. However, a recent study reported expression analysis of GalNAc-Ts in the parent CHO-K1 cell line (21), which is likely to be similar to CHO ldlD. Only GalNAc-T2, -T7, -T11, and -T19 were expressed in K1. In agreement with this, we previously showed that glycosylation sites specific for the GalNAc-T3/T6 subfamily, including, e.g. the FGF23 PC-processing site, are not glycosylated by the endogenous GalNAc-Ts expressed in both CHO-K1 and ldlD cells (11,12). Although we could not directly confirm glycosylation of the P1′-P4′ constructs by mobility shifts, the finding that constructs P1′-P3′ exhibited complete or partial blocking of processing is in accord with efficient glycosylation of the target sequences.
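The band-ratio readout described in this section can be expressed as a small calculation. The following is a minimal sketch, assuming densitometry values (for example, from ImageJ) for the cleaved N-terminal band and the uncleaved band; the function names and the "relative protection" summary are illustrative assumptions rather than the exact quantity reported in this study.

```python
def fraction_processed(cleaved, uncleaved):
    """Fraction of reporter that was cleaved, from two band intensities."""
    total = cleaved + uncleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return cleaved / total


def relative_protection(frac_without_glycan, frac_with_glycan):
    """Share of processing lost when O-glycosylation is enabled.

    1.0 means processing was fully blocked, 0.0 means no effect.
    Hypothetical summary statistic, assumed here for illustration.
    """
    if frac_without_glycan <= 0:
        raise ValueError("reference construct shows no processing")
    return 1.0 - frac_with_glycan / frac_without_glycan


# Made-up intensities: cells grown without sugars vs. with Gal/GalNAc.
no_glycan = fraction_processed(cleaved=30.0, uncleaved=10.0)
with_glycan = fraction_processed(cleaved=5.0, uncleaved=20.0)
protection = relative_protection(no_glycan, with_glycan)
```

Normalizing to the unglycosylated condition is one way to compare constructs whose baseline cleavage differs.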

PC Processing of Reporter Constructs Is Performed by a Furin-like PC-Most PCs prefer cleavage at dibasic sequences, whereas furin exhibits rather strict specificity for RX(R/K)R but accepts RXXR. We therefore, as mentioned earlier, designed peptides and reporter constructs with the cleavage sequence RXXR, with X = Pro, Ala, and Thr only, to ensure the absence of a dibasic motif, thus excluding activity of other PCs that may be expressed in CHO cells. To further confirm that furin or furin-like activities were responsible for the observed processing, we tested the effect of addition of the cell-permeable furin-like protease inhibitor Dec-RVKR-CMK on processing in CHO ldlD cells. Cleavage of all tested reporter walk constructs (P8, P2, P2′, and P4′) was inhibited in a dose-dependent manner, with almost complete inhibition at 5 µM (Fig. 3). These results indicate that the PC target sequences are processed by furin-like convertases in CHO ldlD. The convertases PACE4 and PC5/6, with similar substrate specificities as furin, are also inhibited by Dec-RVKR-CMK, but PACE4 is not believed to be expressed in CHO-K1 (22, 23) and presumably not in the subclone CHO ldlD.
Proteome-wide Analysis of Potential O-Glycosylation Around Potential RXXR Processing Sites-To probe the potential for co-regulation of PC processing by site-specific O-glycosylation, we surveyed potential O-glycosylation sites in RXXR furin-like processing sites by searching the human proteins with signal sequences for occurrences of Ser/Thr residues within the P8-P5′ RXXR sequons (Fig. 4A). A high prevalence of Ser residues at P7-P5, P3, and P1′-P2′ was found, as well as less prevalent Pro at P6 and P2, and even less prevalent Thr at P6-P5, P3, and P1′-P2′. Although there is no good prediction algorithm for single O-glycosylation sites, these findings suggest that O-glycosylation around PC processing sites may be rather common. The analysis allowing Ser/Thr at P6-P2′ revealed 710 proteins with 1,264 RXXR sequons having Ser/Thr within P6-P2′, and further narrowing down the allowed positions for Ser/Thr residues reduced the number as shown in Fig. 4B. Analysis of the list of identified proteins reveals a wide range of all major protein classes known to undergo PC processing (Fig. 4C).
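The position-specific prevalence underlying a sequence logo such as the one described above can be tallied directly from aligned RXXR occurrences. The sketch below is illustrative only; the function names and the two toy sequences are assumptions, and a real analysis would run over the predicted secretome rather than hand-typed peptides.

```python
import re
from collections import Counter

# Lookahead so that overlapping RXXR motifs are all counted.
RXXR = re.compile(r"(?=(R..R))")


def position_label(offset):
    """Map an offset from P1 to a label: 0 -> P1, -1 -> P2, +1 -> P1'."""
    return f"P{1 - offset}" if offset <= 0 else f"P{offset}'"


def residue_counts(sequences, first_offset=-7, last_offset=5):
    """Count residues at each position from P8 to P5' around RXXR motifs.

    Returns a dict mapping position labels to Counter objects. Positions
    falling outside a sequence are skipped rather than padded.
    """
    counts = {position_label(o): Counter()
              for o in range(first_offset, last_offset + 1)}
    for seq in sequences:
        for match in RXXR.finditer(seq):
            p1 = match.start() + 3  # 0-based index of the P1 Arg
            for offset in range(first_offset, last_offset + 1):
                idx = p1 + offset
                if 0 <= idx < len(seq):
                    counts[position_label(offset)][seq[idx]] += 1
    return counts


# Two toy peptides (assumed for illustration), each with one RXXR motif.
counts = residue_counts(["TLRAPRSP", "GGRAPRTT"])
```

Dividing each Counter by the number of motifs gives the per-position frequencies that a logo plot visualizes.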
In Vitro Analysis of a Select Set of Identified Proteins-From the bioinformatics analysis we selected 11 functionally diverse proteins for in vitro analysis of O-glycosylation and furin processing (Table 2). FGF23 and ANGPTL3 identified previously (11,12) were included as controls. Synthetic peptides were tested for in vitro O-glycosylation with the three most abundant GalNAc-T isoforms, and all but one (hepatocyte growth factor receptor) could be O-glycosylated by one or more of these isoforms, indicating that the sequences may be O-glycosylated in vivo. Only four of the 11 proteins are known to be O-glycoproteins, and in most cases the sites of O-glycans are unknown. In vitro furin cleavage assays demonstrated that 9 of the 11 sequences could be cleaved, whereas 8 of the 11 proteins are known to undergo PC processing. We observed, as expected, substantial differences in velocity and efficiency of cleavage of the different peptide sequences, which are exemplified by the time course analysis of the semaphorin 3B, integrin αE, and pro-BNP peptides (Fig. 5 and supplemental Fig. S1). Thus, the integrin αE sequence was completely cleaved at the first 3-h time point, where several other peptides were cleaved less than 50%.
We next tested in vitro cleavage of the same sequences with GalNAc incorporated by the most active GalNAc-T isoforms, and demonstrated that in all but one case (Endothelin-1) where O-glycosylation and furin cleavage were possible, O-glycosylation completely or partially protected against in vitro furin cleavage as predicted (Table 2). Fig. 5A shows a time course furin digestion of fully O-glycosylated semaphorin 3B peptide, where the GalNAc glycopeptide is completely protected from digestion. Fig. 5B shows an example of an integrin αE (integrin αE/CD103) peptide (mixture of 2 and 3 GalNAc residues), where the GalNAc glycopeptide with 3 GalNAc residues (Thr56, Thr60, Thr63) is completely protected from digestion, whereas the glycopeptide with 2 sites (Thr56, Thr63) and the unglycosylated peptide are cleaved. It is surprising that the glycoform with two GalNAc residues at positions P7 and P1′ was readily cleaved by furin. We have not identified a particular reason why the integrin αE peptide behaves differently from the other examples tested as well as the model peptides. The peptide was one of the best furin substrates tested, and the example may serve to illustrate that local conformation, in addition to sequence and O-glycosylation, also influences cleavage efficiency. An example of partial protection (pro-BNP) is shown in Fig. 6A, and additional examples (ST6GalNAc-I and Inhibin α) of complete protection are shown in Fig. 6, B and C.

DISCUSSION
Site-specific GalNAc-type O-glycosylation is emerging as an important co-regulator of PC processing of proteins, but our knowledge of and ability to predict the occurrence of this event is highly limited. In this study we first used a model system to delineate the sequence range within which site-specific O-glycosylation may affect PC processing in vitro and ex vivo in cells, and demonstrate that O-glycans at least ±3 residues N- and C-terminal of the RXXR furin recognition motif may affect processing. This allowed proteome-wide analysis of sequences with the potential for PC processing and adjacent O-glycosylation, and the search yielded hundreds of candidate proteins even though it was limited to the RXXR motif. It is estimated that ~3,500 proteins can potentially undergo PC processing (1,7), and our results suggest that a major part of these may be co-regulated by O-glycosylation. The select group of proteins identified and analyzed in further detail supports this, where our predictions of PC processing, O-glycosylation, and modulation of processing by furin were largely correct.
The finding that site-specific O-glycosylation may regulate hundreds of PC processing events places the large GalNAc-T gene family in a central position in a wide area of biological pathways. Elucidating specific functions of each of the GalNAc-T isoforms in health and disease is not a straightforward task due to lack of reliable prediction methods. A number of studies including genome-wide association studies link specific GalNAc-T isoforms with important biological functions and diseases; however, in most cases the molecular basis for the role of GalNAc-Ts in these remained obscure (24). The first and only specific mechanism uncovered so far was indeed dysregulated PC processing of the phosphaturic factor FGF23 caused by deficiency in the GalNAc-T3 isoform, which exclusively regulates O-glycosylation in the RHTR179↓ processing sequon of FGF23 (11). Perhaps surprisingly, the rare disease familial tumoral calcinosis may be caused by deleterious mutations in either the GALNT3 or FGF23 genes, which suggests that despite the broad function of the GalNAc-T3 enzyme as shown in Tables 1 and 2, this enzyme has limited essential nonredundant functions in O-glycosylation apart from the single Thr178 site in FGF23. We envision that other disease-causing defects in the GalNAc-T repertoire as predicted by genome-wide association studies will involve dysregulated PC-processing events, as we recently proposed for GalNAc-T2 (12).
Our current knowledge of O-glycosylation of especially less abundant proteins and proteins with single O-glycosylation sites is quite limited, and sequence-based prediction models generally fail to predict isolated O-glycosylation sites in proteins (25). However, considerable data indicate that in vitro GalNAc-T enzyme analysis of short peptide substrates reasonably reflects glycosylation occurring in cells with the appropriate GalNAc-Ts (26). We have recently developed a novel strategy to identify O-glycosylation sites using zinc finger nuclease-targeted glycoengineered human cell lines that are limited in O-glycosylation capacity to simple GalNAc (and NeuAcα2,6GalNAc) O-glycan structures. The simplified O-glycoproteome of such SimpleCells allows for sensitive lectin affinity chromatography and mass spectrometry sequencing and identification of O-glycosites. With this strategy we have identified >400 new O-glycosites, demonstrating that GalNAc O-glycosylation is far more prevalent than previously thought (27). Furthermore, we have tested >12 peptides derived from this search in in vitro O-glycosylation assays and found excellent correlation between in vitro enzyme assays and in vivo O-glycosylation (27). We hope that further characterization of the O-glycoproteome by this strategy will advance knowledge of the co-regulation of PC processing events by site-specific O-glycosylation.

Fig. 4 legend: A, a sequence logo plot was generated from all known secreted mammalian proteins containing the RXXR motif. The upper panel shows the distribution of amino acids in positions P8 to P4′ in relationship to RXXR, and the lower panel shows an enlargement of the lower area to bring out the top 5 amino acids in the above mentioned positions. B, the bioinformatics search for RXXR revealed 1264 potential furin processing sites in 710 proteins where a Ser or Thr is found within position P6 to P2′. In P5 to P1′ we found 896 sites in 573 proteins. If a search was made for the optimal furin cleavage site RX(R/K)R, Ser or Thr in P6 to P2′ were seen in 641 sites in 384 proteins, and the corresponding numbers for P5 to P1′ were 396 and 285, respectively. If the search was narrowed down to contain Ser or Thr in P2 or P3, 252 sites in 229 proteins were displayed. C, a schematic illustration of the many protein classes represented in our search result that can be affected by up to 20 different GalNAc-Ts and the 7 proprotein convertases.
In this study we further analyzed a set of protein examples identified by our proteome-wide search (Table 2). A known processing site in pro-BNP (TLRAPR76↓SP) was susceptible to furin cleavage as well as O-glycosylation in our in vitro assays (Fig. 6A). Pro-BNP is produced by cardiomyocytes in the atrium and ventricle of the heart, and is proteolytically processed at Arg76 to yield NT-pro-BNP (N-terminal fragment of BNP precursor). GalNAc-T3 glycosylated Thr71 (P6) and Ser77 (P1′), providing partial protection from furin cleavage in 16-h assays. This is in agreement with studies demonstrating that pro-BNP is O-glycosylated, and that Thr71 may be important for processing (13). The same authors also found evidence that pro-BNP is differentially glycosylated and processed in two different cell lines (HEK293 and CHO). Interestingly, we found that within the BNP processing site T71LRAPR↓S77, GalNAc-T3 initiates glycosylation at Thr71 and proceeds with glycosylation of Ser77 using its lectin domain. Thus, a GalNAc-T3 lectin mutant only glycosylated Thr71. This is similar to how GalNAc-T3 glycosylates Thr171 of FGF23 (TPIPRRHTR179↓) before glycosylating Thr178 in a lectin-dependent manner (not shown). Glycosylation with a GalNAc-T3 lectin mutant revealed that GalNAc alone at P6 yielded protection in a 5-h cleavage assay (supplemental Fig. S1). Pro-BNP is up-regulated during pathophysiological stress, and processed BNP is a prognostic biomarker in heart failure. It is possible that dysregulation of GalNAc-Ts plays a role in the altered processing of BNP in disease.
Recently, it was shown that activin stimulates maturation of inhibin α and β through induction of furin or a furin-like PC, thus acting in a positive feedback loop regulating FSH production (28,29). We found that glycosylation of Thr234 (P2′) in the processing site (RARR232↓ST) of inhibin α interferes with in vitro furin cleavage (Fig. 6B). Furthermore, glycosylation was specifically performed by the GalNAc-T3 isoform, which, like inhibin α, is highly expressed in the testis (30,31).
Fibroblast growth factor 7/keratinocyte growth factor (FGF7/KGF) is secreted from mammalian cells as two biologically active forms, KGFα (amino acids 1-163) and KGFβ (amino acids 23-163), due to PC processing in the RHTR23↓ sequon. The longer form is O-glycosylated at Thr22 (P2) and N-glycosylated at Asn14 (32). We found that O-glycosylation of Thr22 blocked processing, suggesting that site-specific O-glycosylation in this case may co-regulate the ratio of short/long forms of KGF secreted (Table 2). We previously found that GalNAc O-glycosylation in the same RHTR179↓ motif in FGF23 only partially affected processing, whereas elongated glycan structures were required for complete blocking of furin cleavage (11). This discrepancy may be due to the finding that the FGF7 peptide generally was a poorer substrate for in vitro furin cleavage compared with the FGF23 peptide.

Semaphorin 3B (Sema3B) (RNRR731↓TH) was identified in our search, and we found that GalNAc-T3 glycosylated Thr732 (P1′), which inhibited in vitro processing by furin (Fig. 5A). Sema3B was first identified together with semaphorin 3F as a tumor suppressor of small-cell lung carcinoma (33,34), and later Sema3B was found to function as an endothelial cell repellent (35). Sema3B is inactivated by furin cleavage at multiple processing sites. The major furin site generates 20- and 80-kDa fragments, and Varshavsky and colleagues (35) show that processing at the most C-terminal site, which generates ~2- and 78-kDa fragments, is sufficient to disrupt its function as an inhibitor of angiogenesis. A role of O-glycosylation in this process is intriguing.

Table 2 footnotes:
c Peptide sequence is cleaved in vitro by furin, yes (Y) or no (N).
d In vitro glycosylation provides protection against in vitro proteolytic cleavage by furin, yes (Y), no (N) or not applicable (NA).
e Target protein is known to be O-glycosylated, reference.
f Target protein is known to undergo SPC processing, reference.
g Integrin αE is reported to be processed in RQRR178, but two other potential cleavage sites exist.
h Two GalNAc glycoforms with 2 and 3 mol of GalNAc incorporated (third GalNAc on Thr60) were analyzed in a mixture, and only the glycoform with 3 mol was protected from cleavage.
Integrins are transmembrane cell surface α/β heterodimeric receptors of which two, platelet glycoprotein IIb (36) and integrin β1 (37), are known to be O-glycosylated, and several α-integrin subunits undergo PC processing by furin in the extracellular domain, yielding disulfide-linked heavy and light chains (38). Integrin αE (CD103) heterodimerizes with integrin β7 (αEβ7) and constitutes the E-cadherin binding integrin, also known as mucosal lymphocyte-1 antigen, primarily expressed in intestinal intraepithelial lymphocytes. Integrin αE is reported to be processed in RQRR178↓ (39), but contains two additional predicted SPC sites. One of these (RTKR62↓TP) is located in the N-terminal region, and this was susceptible to furin cleavage as well as O-glycosylation in our in vitro assay. GalNAc-T3 incorporated 2 and 3 GalNAc residues at Thr56 (P7), Thr60 (P3), and Thr63 (P1′), where Thr60 in the RTKR62 motif was only partially glycosylated (Table 2). Surprisingly, GalNAc glycosylation in the P7 and P1′ positions did not protect against furin cleavage, whereas the additional glycosylation in P3 yielded complete protection (Fig. 5B). GalNAc glycosylation in positions P7 and P1′ blocked processing of our model peptide sequences (Table 1), as did glycosylation of P1′ in the semaphorin 3B sequence (Fig. 5A). There is no obvious explanation for this discrepancy, but local conformation and the efficiency of the substrate sequence may play roles, and these are not considered in this study.

Fig. 5B legend: the integrin αE peptide was completely cleaved in the RTKR62↓ motif after 3 h, yielding an N-terminal fragment of 1454 Da (left panel). The corresponding glycopeptide with two GalNAc residues at P7 and P1′ was also cleaved after 3 h, whereas the glycopeptide with 3 GalNAc residues at P7, P3, and P1′ was not cleaved after a 16-h incubation (right panel). Reactions were sampled at 0, 3, and 16 h.
Finally, we identified a putative processing site in the stem region of the type 1 transmembrane Golgi-located sialyltransferase ST6GalNAc-I (PTRARR72↓TT). Many Golgi glycosyltransferases are known to be shed as catalytically active enzymes by cleavage in the stem region, but little is known about the mechanism, and to our knowledge only two examples have been characterized. Shedding of a soluble active form of ST6Gal-I has been shown to be performed by BACE-1 (40), and PC processing has been shown to regulate secretion of the β3GlcNAc transferase Lunatic fringe (41). We found that Thr74 (P2′) was O-glycosylated in vitro and that this efficiently protected against furin cleavage (Fig. 6C). ST6GalNAc-I is a key regulator of the predominant cancer glycoform STn (NeuAcα2,6GalNAcα1-O-Ser/Thr) (42), and its overexpression in cells overrides the normal elongation process of O-glycosylation (37,43). ST6GalNAc-I has not been reported to be O-glycosylated, but we have recently identified O-glycosylation sites in the stem regions of several Golgi glycosyltransferases with our SimpleCell strategy, including in ST6GalNAc-I, although not at the Thr74 site (27). This finding may help advance our understanding of the regulation of the residence time of glycosyltransferases in cells.
The dystroglycan and collagen XVIIIa1 peptides were both readily glycosylated in vitro but not cleaved by furin. Processing of both substrates has been described, and it is plausible that PCs other than furin are responsible for potential processing of these sites (44,45). Bone morphogenetic protein 7 and endothelin-1 were both glycosylated and cleaved by furin in vitro, but glycosylation did not yield protection against processing. The endothelin-1 peptide RLRR49↓ was not protected by O-glycosylation with a GalNAc residue at P8 (or P10, as the specific site in the N-terminal fragment was not confirmed).
In summary, our study predicts that site-specific O-glycosylation adjacent to PC processing sites is a major player in the regulation of protein processing. We have only surveyed the simple RXXR motif, but further analysis of the simpler dibasic motifs targeted by most PCs is expected to provide even further support for this prediction. It has long been known that O-glycans provide protection from proteolytic digestion, with the most illuminating example being the LDL receptor (16,20), but the understanding that O-glycosylation at specific sites plays regulatory roles counteracting fine-tuned PC processing events that activate or inactivate bioactive proteins has only emerged recently (11,12). The strategy and results of the survey conducted in this study provide a foundation for further studies into the co-regulatory role of site-specific O-glycosylation.
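To make the RXXR survey concrete, the following Python sketch scans a peptide for RXXR motifs and lists the Ser and Thr residues in or near each one. It is an illustration only and not the proteome-wide search used in this study. The function name `ser_thr_near_rxxr` is hypothetical, and so is the window definition: here "within ±3 residues" is taken to mean anywhere inside the motif or up to three residues on either side of it, which is one possible reading. Cleavage is assumed to occur after the final Arg of the motif.

```python
import re


def ser_thr_near_rxxr(seq: str, window: int = 3):
    """Return [(motif, [(label, residue), ...]), ...] for each RXXR in `seq`.

    Assumptions (not from the study): cleavage follows the last Arg of the
    motif, and candidate sites are Ser/Thr inside the motif or within
    `window` residues of either end of it.
    """
    hits = []
    # The lookahead lets overlapping motifs (e.g. RRXRR) be reported separately.
    for m in re.finditer(r"(?=(R..R))", seq):
        start = m.start()
        p1 = start + 3                       # index of the P1 arginine
        lo = max(0, start - window)
        hi = min(len(seq) - 1, p1 + window)
        sites = []
        for j in range(lo, hi + 1):
            if seq[j] in "ST":
                offset = p1 - j
                label = f"P{offset + 1}" if offset >= 0 else f"P{-offset}'"
                sites.append((label, seq[j]))
        hits.append((m.group(1), sites))
    return hits


# Peptide fragments quoted in this section.
print(ser_thr_near_rxxr("RARRST"))     # inhibin alpha site
print(ser_thr_near_rxxr("PTRARRTT"))   # ST6GalNAc-I site
```

For the inhibin α fragment this reports Ser at P1' and Thr at P2', the latter being the Thr234 position discussed above. A real survey would also need to handle the dibasic motifs and flanking sequence context that this sketch ignores.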