Role of N-Linked Oligosaccharide Flexibility in Mannose Phosphorylation of Lysosomal Enzyme Cathepsin L

Mannose phosphorylation of N-linked oligosaccharides by UDP-GlcNAc:lysosomal enzyme N-acetylglucosamine-1-phosphotransferase is a key step in the targeting of lysosomal enzymes in mammalian cells and tissues. The selectivity of this process is determined by lysine-based phosphorylation signals shared by lysosomal enzymes of diverse structure and function. By introducing new glycosylation sites at several locations on the surface of mouse procathepsin L and modeling oligosaccharide conformations for sites that are phosphorylated, it was shown that the inherent flexibility of N-linked oligosaccharides can account for the specificity of the transferase for oligosaccharides at different locations on the protein. By using this approach, the physical relationship between the lysine-based signal and the site of phosphorylation of mannose residues was determined. The analysis also revealed the existence of additional independent lysine-based phosphorylation signals on procathepsin L, which account for the low level of phosphorylation observed when the primary Lys-54/Lys-99 signal is ablated. Mutagenesis of residues that surround Lys-54 and Lys-99 and demonstration of mannose phosphorylation of a glycosylated derivative of green fluorescent protein provide strong evidence that the cathepsin L phosphorylation signal is a simple structure composed of as few as two well placed lysine residues.

The mammalian lysosomal protein targeting system recognizes and modifies lysosomal hydrolases and growth factors from a wide range of protein families with high specificity. This selectivity is due to the activity of UDP-GlcNAc:lysosomal enzyme N-acetylglucosamine-1-phosphotransferase (GlcNAc-1-phosphotransferase), which phosphorylates N-linked oligosaccharides of these proteins by the addition of GlcNAc-1-P (1-5). This modification begins after lysosomal proteins are exported from the endoplasmic reticulum and is followed by the removal of the terminal GlcNAc moieties from the adducts. In the Golgi apparatus the phosphorylated proteins are bound to mannose 6-phosphate receptors, which mediate the delivery of the proteins to lysosomes.
GlcNAc-1-phosphotransferase has been purified as a 540-kDa complex composed of disulfide-linked homodimers of α and β subunits and two identical, noncovalently associated γ subunits (6). The α subunit was shown to have nucleotide sugar binding activity, and on the basis of previous genetic data, it was known that the catalytic and protein recognition activities are likely to be located on separate subunits (7). This has since been verified by analysis of the transferase in cells from patients with mucolipidosis IIIC or variant pseudo-Hurler polydystrophy. GlcNAc-1-phosphotransferase from these patients is defective in the γ subunit, which prevents phosphorylation of lysosomal enzymes, yet transferase activity on synthetic substrates is retained (8).
Although the molecular basis for recognition of lysosomal hydrolases by the transferase has been studied quite extensively, a complete understanding of the nature of the recognition has remained elusive. Early studies (9-12) demonstrated that the recognition involves protein determinants that are conformation-dependent and that the protein determinants in human cathepsin D extend over a large portion of the surface of the protein and involve many residues including lysine. We have shown that lysine residues alone can account for most if not all of the energy of interaction between the transferase and mouse cathepsin L and that lysine residues are the major determinants for mannose phosphorylation of a wide range of lysosomal proteins (13). Subsequently, we identified two lysine residues, Lys-54 and Lys-99, as the ones involved in cathepsin L phosphorylation (14). On the basis of these results, a relatively simple model involving lysine residues was proposed as a general phosphorylation signal for lysosomal proteins. Similar involvement of lysine residues (15,16) has since been demonstrated for other proteins, and it is now widely accepted that lysine residues are the primary determinants for mannose phosphorylation of a wide range of proteins by the transferase.
Our recent studies have focused on determination of the molecular dimensions of the phosphorylation signal and its relationship to the oligosaccharides that are phosphorylated. With this information, it will be possible to identify substrates for the transferase from information in structural databases and to engineer phosphorylation signals on macromolecules for therapeutic purposes. In our previous study (17), we compared lysine-based signals that were identified for cathepsin L and cathepsin D, and we found that critical lysine residues were separated by a similar distance (~34 Å) in the two proteins. In this study, we further define the cathepsin L signal by determining its relationship to the site of oligosaccharide phosphorylation. We also examine the involvement of residues in the vicinity of critical lysine residues and provide additional evidence for the simplicity and generality of the signal.

EXPERIMENTAL PROCEDURES
Enzymes, Antibodies, cDNAs, and Other Reagents-Antibodies to mouse cathepsin L and green fluorescent protein (GFP) were raised in rabbits as described previously (18). LipofectAMINE PLUS was purchased from Invitrogen. EcoScint H was acquired from National Diagnostics. The QuikChange Mutagenesis Kit and Pfu Turbo were obtained from Stratagene. DpnI and N-glycanase (peptide N-glycosidase F) were purchased from New England Biolabs. Other restriction enzymes were acquired from Invitrogen. Chemical reagents were purchased from Sigma.
Cells and Growth Conditions-COS-1 cells were purchased from the American Type Culture Collection and were cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum, 100 units/ml penicillin, and 100 g/l streptomycin and maintained at 37°C in a humidified atmosphere of 5% CO2.
Subcloning of Mouse Cathepsin L-A fragment containing the full coding sequence of mouse procathepsin L (19) was subcloned into the pED4neo vector (20) for expression in COS-1 cells. The pED4neo vector was modified by the insertion of a synthetic linker made from the complementary oligonucleotides 5'-ATT TTC TCG AGA CCG GTG CGG CCG CGA ATT CGT CGA CTC TAG AA-3' and 5'-AAT TTT CTA GAG TCG ACG AAT TCG CGG CCG CAC CGG TCT CGA GA-3' into the EcoRI cloning site of the vector. The linker contains XhoI, AgeI, NotI, EcoRI, SalI, and XbaI sites, in 5'-3' order, and eliminates the EcoRI site into which it was cloned. The modified vector is hereafter referred to as pED4neoSL. Mouse cathepsin L cDNA was ligated into the XhoI and XbaI cloning sites to give an expression vector, pED4neoSL-CL, suitable for transient or stable expression of mouse cathepsin L in eukaryotic cells.
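The stated order of restriction sites in the linker can be checked directly from the two oligonucleotide sequences. The following is a minimal sketch, not part of the original procedure; the recognition sequences are the standard ones for these enzymes, and the helper names (`reverse_complement`, `site_positions`) are introduced here for illustration only.

```python
# Linker oligonucleotides as printed in the text, with spaces removed.
TOP = "ATTTTCTCGAGACCGGTGCGGCCGCGAATTCGTCGACTCTAGAA"
BOTTOM = "AATTTTCTAGAGTCGACGAATTCGCGGCCGCACCGGTCTCGAGA"

# Recognition sequences, listed in the 5'-3' order stated in the text.
SITES = [
    ("XhoI", "CTCGAG"),
    ("AgeI", "ACCGGT"),
    ("NotI", "GCGGCCGC"),
    ("EcoRI", "GAATTC"),
    ("SalI", "GTCGAC"),
    ("XbaI", "TCTAGA"),
]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def site_positions(seq):
    """Return (enzyme, 0-based index of first occurrence) for each site."""
    return [(name, seq.find(site)) for name, site in SITES]


positions = site_positions(TOP)
offsets = [pos for _, pos in positions]

# All six sites are present and appear in the stated order.
in_order = all(pos >= 0 for pos in offsets) and offsets == sorted(offsets)

# The 40-nt core of the top strand pairs with the bottom strand; the first
# four nucleotides of each oligonucleotide are left as 5' overhangs.
core_pairs = TOP[4:] == reverse_complement(BOTTOM)[:40]

for name, pos in positions:
    print(f"{name}: position {pos}")
print("sites in stated order:", in_order)
print("core duplex complementary:", core_pairs)
```

The check uses only the first occurrence of each site, which is sufficient here because each recognition sequence occurs once in the top strand.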
Mutagenesis of Cathepsin L cDNA-All mutagenesis steps were carried out using a modification of the QuikChange Site-directed Mutagenesis kit (Stratagene). This method was chosen because of its high mutation efficiency and because it requires no specialized vectors. All oligonucleotide primers were designed using MutantMaker, a Visual C++ application developed by one of the authors (J. B. Warner), which expedites the task of primer selection. The program reads in protein text files, generates mutagenic primers based upon the user's parameters, and prints an oligonucleotide order form, thereby minimizing human error. The following parameters were used in designing the primers: GC content ≥ 40%, Tm ≥ 78°C, and termination by G or C on both ends of the oligonucleotide. Oligonucleotide primers were synthesized and purified by SDS-PAGE. PCR was carried out using the following conditions: 1 cycle of 95°C for 30 s, followed by 19 cycles of 95°C for 30 s, 60°C for 1 min, and 72°C for 20 min. PCR products were ethanol-precipitated, resuspended in deionized water, and transformed into XL1-blue supercompetent cells (Stratagene). All mutations were verified by sequencing.
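The three primer criteria can be expressed as a short check. This is a hedged sketch and not the MutantMaker code, which is not described further in the text. The melting temperature estimate Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch is the one commonly quoted for QuikChange-style mutagenic primers; whether MutantMaker used this formula is an assumption, and the mismatch percentage defaults to zero because the template sequence is not given. The example primer is the forward glycosylation-site primer listed under "Generation of a Secreted, Glycosylated Form of GFP."

```python
def clean(primer):
    """Strip spaces and normalize case."""
    return primer.replace(" ", "").upper()


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    seq = clean(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimated_tm(primer, mismatch_percent=0.0):
    """Assumed estimate: Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch."""
    seq = clean(primer)
    return 81.5 + 0.41 * gc_percent(seq) - 675.0 / len(seq) - mismatch_percent


def meets_criteria(primer, mismatch_percent=0.0, min_gc=40.0, min_tm=78.0):
    """Apply the three design parameters stated in the text."""
    seq = clean(primer)
    ends_in_gc = seq[0] in "GC" and seq[-1] in "GC"
    return (
        gc_percent(seq) >= min_gc
        and estimated_tm(seq, mismatch_percent) >= min_tm
        and ends_in_gc
    )


# Forward glyco-GFP primer from the text; mismatch percentage is unknown
# here and left at the default of zero (an assumption).
forward = "GTG AAG TTC GAG AAC GGC ACC CTG GTG AAC CGC"
print(round(gc_percent(forward), 1), round(estimated_tm(forward), 1))
print(meets_criteria(forward))
```

Because each introduced mismatch lowers the estimate, a primer that passes at zero mismatch may still fail once the true mismatch percentage is supplied.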
Protein Expression and Quantitation of Phosphorylation-COS cells were transiently transfected with pED4neoSL-CL using the LipofectAMINE PLUS reagent (Invitrogen) as described by the vendor. The transfected cells were cultured for 48 h prior to labeling. The cells were labeled for 6 h with 0.1 mCi/ml [35S]methionine in DMEM containing 10% dialyzed fetal calf serum, 0.9 mg/liter cold methionine, and 10 mM NH4Cl, which causes the secretion of newly synthesized lysosomal enzymes. Immunoprecipitation of cathepsin L from the media and SDS-PAGE of labeled proteins were carried out as described previously (21).
The methods used for quantitation of phosphorylation were similar to those used previously (14,17). Transfected cells were double-labeled with 0.1 mCi/ml [3H]leucine and 0.4 mCi/ml [32P]phosphate in DMEM containing 10% dialyzed fetal calf serum and 10 mM NH4Cl. Unlabeled leucine was reduced to 2% of the normal concentration to increase incorporation of radioactive leucine into protein. Cathepsin L was immunoprecipitated from the media and subjected to SDS-PAGE as described previously except that unlabeled procathepsin L (4 μg) was added as a carrier to each immunoprecipitate prior to SDS-PAGE. The gels were stained with Coomassie Blue, and cathepsin L-containing bands were excised from the gel and extracted by overnight incubation in 1 ml of 1 N NaOH at room temperature. After neutralization with 1 ml of 1 N HCl, the amount of 32P and 3H in each band was determined by scintillation counting in EcoScint H (National Diagnostics).
Phosphorylation of glyco-GFP was similarly quantitated, except that GFP (4 μg) was added as a carrier to each immunoprecipitate prior to SDS-PAGE. The difference in leucine usage between mouse procathepsin L (19 residues) and glyco-GFP (20 residues) was taken into account when comparing mannose phosphorylation of the two proteins.
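The leucine correction can be written out explicitly. The sketch below assumes that 3H counts are proportional to the number of molecules times the number of leucine residues per molecule, so that 32P per molecule is proportional to (32P counts × leucine residues) / 3H counts. Background subtraction and channel spillover corrections are not described in the text and are omitted. The function names and the counts in the example are hypothetical placeholders, not data from this study.

```python
def phosphate_per_molecule(cpm_32p, cpm_3h, n_leucine):
    """Relative 32P incorporation per protein molecule.

    3H counts scale with (molecules x leucine residues), so dividing the
    3H counts by n_leucine gives a quantity proportional to molecules.
    """
    if cpm_3h <= 0 or n_leucine <= 0:
        raise ValueError("3H counts and leucine count must be positive")
    return cpm_32p * n_leucine / cpm_3h


def relative_phosphorylation(sample, reference):
    """Phosphorylation of sample as a percentage of the reference protein.

    Each argument is a tuple (cpm_32p, cpm_3h, n_leucine).
    """
    return 100.0 * phosphate_per_molecule(*sample) / phosphate_per_molecule(*reference)


# Hypothetical counts for illustration only; leucine counts are from the text.
wild_type = (2000.0, 10000.0, 19)   # procathepsin L, 19 leucine residues
glyco_gfp = (400.0, 10000.0, 20)    # glyco-GFP, 20 leucine residues
print(relative_phosphorylation(glyco_gfp, wild_type))
```

When both proteins have the same number of leucine residues, the correction cancels and the result reduces to the ratio of the two 32P/3H values.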
Modeling of Procathepsin L Oligosaccharides-A previously modeled structure of mouse procathepsin L was used in these studies (17). The Quanta molecular modeling program (Accelrys, Inc.) was used to determine energetically acceptable conformations of a Man8GlcNAc2 N-linked oligosaccharide attached at various locations on the cathepsin L structure. The Man8GlcNAc2 structure was used since in vivo the terminal mannose in the middle branch of the Man9GlcNAc2 oligosaccharide is clipped prior to phosphorylation of lysosomal proteins by GlcNAc-1-phosphotransferase (22). A set of acceptable conformations was generated for each oligosaccharide by randomly selecting torsion angles for all rotatable bonds in the Man8GlcNAc2 structure (Fig. 1), minimizing the potential energy of the resulting structures, and selecting those conformations that possess an acceptable total potential energy value. A maximum potential energy cut-off of 1000 kcal/mol was used; this value was found to be characteristic of Man8GlcNAc2 conformations lacking clashes and close contacts. Fifty cycles of conjugate gradient minimization were used, which brought energy values of acceptable conformations close to convergence in most cases. Conformations were generated in the context of the protein, and torsion angles for rotatable bonds in the oligosaccharide and in the asparagine side chain were randomly set. Generally, less than 1% of the conformations that were generated met the energy criteria. At least 2000 acceptable conformations were obtained for each oligosaccharide. When the acceptable oligosaccharide conformations were overlaid, they formed a spheroid-like object, resembling a cloud. Therefore, such overlays of oligosaccharide conformations are referred to as oligosaccharide clouds.
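The sampling scheme described above amounts to rejection sampling over torsion angles with an energy cut-off. The following is a schematic sketch only: the modeling in this study was performed in Quanta with a molecular mechanics force field, whereas `toy_energy` below is a placeholder function with no physical meaning, and `sample_acceptable`, its parameters, and the optional `minimize_fn` hook are names introduced here for illustration.

```python
import math
import random


def toy_energy(torsions):
    """Placeholder energy (arbitrary units); not a force field."""
    return sum(10.0 * (1.0 + math.cos(math.radians(t))) for t in torsions)


def sample_acceptable(energy_fn, n_torsions, cutoff, n_wanted,
                      max_trials, minimize_fn=None, seed=0):
    """Draw random torsion sets and keep those at or below the cut-off.

    Returns (accepted, trials), where accepted is a list of
    (torsions, energy) pairs and trials is the number of sets drawn.
    """
    rng = random.Random(seed)
    accepted = []
    trials = 0
    while len(accepted) < n_wanted and trials < max_trials:
        trials += 1
        # Randomize every rotatable torsion, in degrees.
        torsions = [rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]
        # Optional relaxation step (e.g. a fixed number of minimizer cycles).
        if minimize_fn is not None:
            torsions = minimize_fn(torsions)
        energy = energy_fn(torsions)
        if energy <= cutoff:
            accepted.append((torsions, energy))
    return accepted, trials


accepted, trials = sample_acceptable(
    toy_energy, n_torsions=3, cutoff=15.0, n_wanted=50, max_trials=100000
)
print(len(accepted), "accepted after", trials, "trials")
```

The ratio of accepted conformations to trials corresponds to the acceptance rate reported in the text (generally less than 1% for the real system); the toy function accepts far more often and is intended only to show the control flow.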
In order to assess the overlap between adjacent conformational clouds, a specialized C++ application was developed. This application determines where selected residues in separate conformational clouds can overlap in three-dimensional space. For each Man8GlcNAc2 oligosaccharide there are five mannose residues with a C-6-OH moiety, all of which are susceptible to phosphorylation in vivo (22). Minima (Xmin, Ymin, and Zmin) and maxima (Xmax, Ymax, and Zmax) are determined for the coordinates of the O-6 atom of every phosphorylatable mannose in each conformational cloud (Fig. 6). The rectangular box generated from these coordinates gives a simple but effective definition of the boundaries of each conformational cloud. Boundaries are generated for all clouds from phosphorylated mannoses. The program then calculates the largest minimum and smallest maximum for each coordinate to determine the overlapping region. If the free O-6 atom of a mannose residue is contained in this overlapping region, its coordinates are retained.
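The overlap calculation described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the original application: each cloud is taken to be a list of (x, y, z) coordinates of O-6 atoms, and the function names are introduced here.

```python
def bounding_box(points):
    """Axis-aligned box of a cloud: ((xmin, ymin, zmin), (xmax, ymax, zmax))."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def overlap_box(clouds):
    """Intersection of all cloud boxes, or None if they do not all overlap.

    The lower corner is the largest minimum along each axis and the
    upper corner is the smallest maximum.
    """
    boxes = [bounding_box(cloud) for cloud in clouds]
    lower = tuple(max(box[0][axis] for box in boxes) for axis in range(3))
    upper = tuple(min(box[1][axis] for box in boxes) for axis in range(3))
    if any(lo > hi for lo, hi in zip(lower, upper)):
        return None
    return lower, upper


def points_in_overlap(clouds):
    """Retain every O-6 coordinate that lies inside the shared region."""
    box = overlap_box(clouds)
    if box is None:
        return []
    lower, upper = box
    return [
        point
        for cloud in clouds
        for point in cloud
        if all(lower[axis] <= point[axis] <= upper[axis] for axis in range(3))
    ]


# Synthetic coordinates for illustration only.
cloud_a = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (1.0, 1.0, 1.0)]
cloud_b = [(1.0, 1.0, 1.0), (3.0, 3.0, 3.0), (1.5, 1.5, 1.5)]
print(overlap_box([cloud_a, cloud_b]))
print(points_in_overlap([cloud_a, cloud_b]))
```

Because the boundaries are axis-aligned boxes, the shared region is an upper bound on the true overlap of the clouds; it depends on the orientation of the coordinate frame.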
Generation of a Secreted, Glycosylated Form of GFP-The cDNA of proliferin-2, a secreted mannose 6-phosphate-containing glycoprotein, was purchased from American Type Culture Collection and subcloned into the pED4neo expression vector. The signal sequence from proliferin was amplified from pED4neo-proliferin using the following PCR primers: 5'-CCG GAA TTC CGC ATG CTC CCT TCT TTG ATT CAA-3' (forward primer) and 5'-TCC CCG CGG TGC ACA CAT GGG AAA TGA GGC-3' (reverse primer). The PCR product was purified, cut with EcoRI and SstII, and ligated into the EcoRI and SstII cloning sites of the pEGFP-N1 vector (Clontech). The modified GFP sequence was then subcloned into the pED4neoSL vector using the XhoI and NotI cloning sites. A glycosylation site was added to residues 156-158 on the secretory GFP by site-directed mutagenesis, using the following primers: 5'-GTG AAG TTC GAG AAC GGC ACC CTG GTG AAC CGC-3' (forward primer) and 5'-GCG GTT CAC CAG GGT GCC GTT CTC GAA CTT CAC-3' (reverse primer). The location of the glycosylation site positions the oligosaccharide on the rim of the β-barrel structure.
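That the mutagenic primers encode an N-glycosylation sequon (Asn-X-Ser/Thr, where X is not Pro) can be confirmed by translating the forward primer. The sketch below assumes the reading frame implied by the triplet grouping printed in the text and uses the standard genetic code; the function names are introduced here for illustration.

```python
import re

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

FORWARD = "GTG AAG TTC GAG AAC GGC ACC CTG GTG AAC CGC"
REVERSE = "GCG GTT CAC CAG GGT GCC GTT CTC GAA CTT CAC"


def clean(dna):
    """Strip spaces and normalize case."""
    return dna.replace(" ", "").upper()


def translate(dna):
    """Translate complete codons in frame 1 (frame is an assumption)."""
    seq = clean(dna)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(clean(dna)))


def sequon_positions(protein):
    """0-based starts of N-X-S/T sequons (X not P), overlaps included."""
    return [m.start() for m in re.finditer(r"(?=N[^P][ST])", protein)]


peptide = translate(FORWARD)
print(peptide, sequon_positions(peptide))
print(reverse_complement(FORWARD) == clean(REVERSE))
```

The same check can be applied to any candidate mutagenic primer before ordering, provided the primer is supplied in the correct reading frame.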

RESULTS
Topography of Mannose Phosphorylation of Cathepsin L-To examine the requirement for oligosaccharide placement, glycosylation sites were placed at selected locations on the surface of cathepsin L by site-directed mutagenesis, and glycosylation and mannose phosphorylation at each site were determined after transient expression in COS-1 cells. Thirty-three glycosylation site mutations spread throughout the protein were created and tested. To maximize the number of properly folded and glycosylated proteins, surface residues in regions lacking α-helical structure were chosen for placement of the sites.
Wild-type procathepsin L contains two potential glycosylation sites, only one (Asn-221) of which is utilized in the properly folded protein (19). Because the cryptic site at Asn-268 can be glycosylated in some modified cathepsin L proteins (19,23), both the utilized site (Asn-221) and the non-utilized site (Asn-268) were mutated to glutamine, creating a construct encoding a nonglycosylated cathepsin L protein. When expressed, the protein was secreted poorly as expected (Fig. 2) since glycosylation is needed for proper folding of cathepsin L in the COS-1 expression system (21). This construct was used for construction of the altered glycosylation site proteins listed in Table I. Results on the synthesis, secretion, and glycosylation of selected altered glycosylation site proteins are shown in Fig. 2, and data for all of the constructs are summarized in Table I. Synthesis and secretion of the proteins, as determined by biosynthetic labeling in the presence of NH4Cl, was used as a means of assessing whether or not altered cathepsin proteins are folded properly. NH4Cl inhibits mannose 6-phosphate receptor function and causes the quantitative secretion of newly synthesized mannose-phosphorylated proteins. We have examined previously (13,14,17,19,21) more than 100 altered cathepsin L and cathepsin D proteins using the COS cell expression system. In all cases examined, normally folded proteins, as determined by level of catalytic activity (13), susceptibility to proteolytic digestion (14,17,21), or susceptibility to heat denaturation (13), are secreted under these conditions, whereas abnormally folded proteins are retained within the cells and/or degraded. Of the 33 altered glycosylation site proteins created for this study, 19 were glycosylated and efficiently secreted (Fig. 2). These were retained for further analysis. Endoglycosidase H (endo H) treatment was used to examine the state of glycosylation of the proteins (Table I). Oligosaccharides on proteins that undergo mannose phosphorylation would be expected to remain sensitive to the treatment, whereas those that do not would be expected to be further processed to forms that are resistant to the treatment. Of the 19 retained proteins, 5 were sensitive to endo H treatment, 7 were partially sensitive, and 7 were resistant to the treatment.
Biosynthetic labeling with [32P]phosphate and [3H]leucine was used to determine which of the altered glycosylation site proteins were susceptible to mannose phosphorylation (Fig. 3). Five of the constructs (Asn-105, Asn-158, Asn-171, Asn-217, and Asn-229) displayed a high level of phosphorylation approaching the phosphorylation level of wild-type procathepsin L. All other constructs showed minimal (<25%) phosphorylation compared with the wild-type protein. Phosphorylation of the 5 highly phosphorylated constructs was inhibited by mutation of Lys-54 and Lys-99 to alanine (Fig. 4), indicating that all 5 constructs utilize the previously identified mannose phosphorylation signal (13,14).
The topographical locations of engineered glycosylation sites on the surface of the protein are shown in Fig. 5. The locations of the glycosylation sites of the highly phosphorylated constructs were found to be clustered within the vicinity of Asn-221, the wild-type glycosylation site. Most of the engineered sites in the vicinity of Asn-221 were highly phosphorylated. Some sites in this region were phosphorylated weakly (Asn-104, Asn-199, and Asn-208) or not at all (Asn-156 and Asn-108). The low level of phosphorylation at these sites is attributed to site-dependent effects that reduce accessibility of the oligosaccharides to the transferase. Such effects would include protein-oligosaccharide interactions that limit oligosaccharide flexibility and site-dependent differences in oligosaccharide processing that inhibit or prevent phosphorylation.
Flexibility of N-Linked Oligosaccharide Accounts for Topography of Mannose Phosphorylation of Cathepsin L-Previous studies (22,24) have shown that mannose-phosphorylated proteins display a surprising heterogeneity with regard to sites of phosphorylation on the oligosaccharide and the number and location of the phosphorylated oligosaccharides on the protein.
To account for this apparent lack of specificity, we have proposed that selectivity of the reaction for specific mannose 6-phosphate residues on protein-linked oligosaccharides is dictated in large part by the ability of those residues to migrate in three-dimensional space to the catalytic site of GlcNAc-1-phosphotransferase when the protein is bound to the transferase through its lysine-based phosphorylation signal (17). This hypothesis is supported by NMR studies, which have indicated that N-linked oligosaccharides are flexible in solution and that, although preferred conformations may exist, such conformations are short lived (25-28).
To test this hypothesis, three-dimensional oligosaccharide clouds composed of compilations of randomly generated, energetically acceptable Man8GlcNAc2 oligosaccharide conformations were modeled at phosphorylated glycosylation sites of procathepsin L as described under "Experimental Procedures." The oligosaccharide conformations that compose each cloud represent sterically unhindered oligosaccharide conformations that were chosen based on an energy value that would exclude clashes and close contact of atoms within the oligosaccharide and between the oligosaccharide and the protein. Thus, the volume enclosed by each cloud represents the space available to one or more phosphorylatable mannose residues of the oligosaccharide attached at that location on the protein. Clouds for each glycosylation site are shown in Fig. 6. In these representations, only the positions of O-6 atoms of phosphorylatable mannose residues are shown (see Fig. 1). An overlay of all 6 clouds and a graphical representation of the overlap between clouds are also shown in Fig. 6. The existence of an overlap region shared by all 6 clouds substantiates the hypothesis described above by showing that all 6 oligosaccharides are capable of positioning phosphorylatable mannose residues in the same region in space. This region would correspond to the position of the catalytic site of the transferase when it is bound to the protein.
FIG. 1. The torsion angles that were randomized to generate oligosaccharide structures are marked with a circular arrow. The locations of the mannoses that can be phosphorylated are marked with an asterisk (22). One terminal mannose residue was excluded from this study since removal of this residue normally occurs prior to mannose phosphorylation and is needed to allow phosphorylation of the underlying mannose (22).
Role of Lysine Residues in Residual Phosphorylation of Cathepsin L-The highly phosphorylated glycosylation sites including the wild-type site (Fig. 4), as well as some of the weakly phosphorylated sites (data not shown), displayed significant levels of phosphorylation when lysine residues that compose the previously described phosphorylation signal (Lys-54 and Lys-99) were mutated to alanine. Whereas this phosphorylation represents a minor component of the phosphorylation of the wild-type protein, it does represent a substantial portion of the phosphorylation observed for two of the engineered sites (Asn-105 and Asn-229). To determine whether or not lysine residues are responsible for the residual phosphorylation of the wild-type protein, site-directed mutagenesis was carried out on a procathepsin L construct containing the wild-type glycosylation site and alanine mutations at Lys-54 and Lys-99. The results shown in Fig. 7 indicate that several lysine residues, including Lys-157, Lys-233, and Lys-237, contribute to this residual phosphorylation. Phosphorylation of a construct containing alanine mutations at Lys-54, Lys-99, Lys-116, Lys-157, Lys-233, and Lys-237 was >90% inhibited, indicating that these lysine residues can account for virtually all of the mannose phosphorylation of procathepsin L. The residual phosphorylation appears to result from weak lysine-based phosphorylation signals that act independently of the primary Lys-54/Lys-99 signal. The level of residual phosphorylation differs among the engineered glycosylation sites, providing additional evidence that this component of the phosphorylation results from independent phosphorylation signals.
Role of Lysine Microenvironment in Mannose Phosphorylation of Cathepsin L-Surface residues within 8 Å of Lys-54 and Lys-99 on wild-type cathepsin L were mutated to alanine in order to determine the importance of the environment surrounding these lysine residues (Table II). No significant change in mannose phosphorylation was observed in single alanine mutations; however, the possibility remained that these residues had a weak interaction with the transferase and worked in concert to provide a stronger contact. In order to address this question, multiple mutations were created in single constructs so that three residues surrounding Lys-54 (construct M56A/R57A/M58A) and five residues surrounding Lys-99 (construct H97A/Q98A/H100A/K101A/K102A) were mutated. In all cases there was no significant change in mannose phosphorylation of the cathepsin L protein, indicating that these residues play little if any role in the mannose phosphorylation of cathepsin L.
Glycosylation and Mannose Phosphorylation of GFP-To convert GFP into a suitable substrate for the transferase, an endoplasmic reticulum signal sequence and a glycosylation site were added as described under "Experimental Procedures." This GFP construct, called glyco-GFP, was completely glycosylated and secreted in COS-1 cells (Fig. 8A). When COS-1 cells expressing the construct were labeled with [32P]phosphate, glyco-GFP was found to be phosphorylated (Fig. 8B). Quantitation of mannose phosphorylation of glyco-GFP was performed using the [32P]phosphate and [3H]leucine double-labeling protocol as described under "Experimental Procedures." The level of phosphorylation was 16.8% of that observed for wild-type cathepsin L but more than 8-fold greater than background phosphorylation for this system (<2%) as determined previously using glycopepsinogen (17). To determine whether the phosphorylation was associated with the oligosaccharide, the labeled protein was treated with specific endoglycosidases (Fig. 8B). N-Glycanase, which cleaves all N-linked oligosaccharides, completely digested glyco-GFP and removed all associated phosphorylation. Endo H, which reacts with only high mannose oligosaccharides, cleaved only a small fraction of the glycosylated protein but removed phosphorylation completely, indicating that the phosphorylation was located entirely on high mannose oligosaccharides. Although actual residues that compose the glyco-GFP phosphorylation signal were not elucidated, the high concentration of surface lysine residues on this protein (20 in 239 amino acids) provides several pairs of suitably positioned lysine residues that could serve as GlcNAc-1-phosphotransferase recognition sites. The phosphorylation of a synthetic glycoprotein that could not have evolved a complex phosphorylation signal provides evidence that the signal is a relatively simple structure that may exist on a wide range of proteins regardless of their need to be targeted to lysosomes.
DISCUSSION
Finding a common structural motif for recognition of protein substrates by GlcNAc-1-phosphotransferase has been a longstanding goal in the field of lysosomal protein trafficking. Because the motif is three-dimensional in nature and is found on proteins of diverse origin and structure, it has been difficult to identify. Several studies have been carried out to determine which amino acid residues are necessary for mannose phosphorylation of individual proteins. Here, we provide evidence that the inherent flexibility of N-linked oligosaccharides enables these structures to be placed in a large although limited area on a lysosomal protein and still have access to the catalytic domain of the transferase. By placing oligosaccharides throughout the surface of cathepsin L, it was possible to identify a region in three-dimensional space where mannose residues are likely to be phosphorylated and the spatial relationship of this site to lysine residues Lys-54 and Lys-99 of the phosphorylation signal in the protein. Knowing the placement of the oligosaccharide in reference to the signal is necessary for identifying or constructing a signal that not only allows interaction of the transferase with its substrate protein but also provides access to its N-linked oligosaccharides and phosphorylation of one or more mannose residues. Findings of this study concerning the relationship of critical components of the cathepsin L phosphorylation signal to the N-linked oligosaccharide of the protein are summarized in Fig. 9. The distance of ~34 Å between α-carbon atoms of critical lysine residues appears to be a general feature of the phosphorylation signal. This distance was demonstrated in our earlier study (17) for cathepsin D as well as cathepsin L (14) and is consistent with results obtained for two other proteins, DNase I (15, 29) and aspartylglucosaminidase (16). In the case of aspartylglucosaminidase, mutation of two lysine residues, Lys-183 in the α subunit and Lys-214 in the β subunit, was found to inhibit mannose phosphorylation by 96%. Because aspartylglucosaminidase is expressed as an α2β2 heterotetramer with a phosphorylated oligosaccharide in each subunit, these residues could be used in one or more of a variety of configurations for phosphorylation. Given this caveat, Lys-183 and Lys-214 are separated by 32.71 Å, consistent with distances of 33.75 and 33.63 Å for cathepsin L and cathepsin D, respectively (14,17). Bovine DNase I has two oligosaccharides and requires four lysine residues, Lys-27, Lys-50, Lys-74, and Lys-124, for efficient mannose phosphorylation. Lys-27 and Lys-74 have substantial effects on phosphorylation but do not appear to belong to the same phosphorylation signal since they affect different oligosaccharides. The two other residues, Lys-50 and Lys-124, are located 34.31 Å apart, again consistent with the 34 Å inter-lysine distance. The site of mannose phosphorylation for procathepsin L, which was identified by determining the overlap of oligosaccharide clouds of highly phosphorylated glycosylation sites, is also shown in Fig. 9. The center of this region was calculated to be 26.83 Å from the closest critical lysine, Lys-99. The relationship of the site of mannose phosphorylation to the phosphorylation signal represents an intrinsic property of GlcNAc-1-phosphotransferase and should apply to other transferase substrates.
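The geometric quantities discussed above (inter-lysine α-carbon distances and the distance from a lysine to the center of the overlap region) are simple to compute from coordinates. The sketch below is illustrative only: the coordinates are synthetic, the 2 Å tolerance around 34 Å is an assumption chosen so that the four reported distances fall inside the window, and the function names are introduced here.

```python
import math
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in the input units."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Arithmetic mean position of a set of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def lysine_pairs(ca_coords, target=34.0, tolerance=2.0):
    """Find lysine pairs whose alpha-carbon separation is near the target.

    ca_coords maps a residue label to its alpha-carbon coordinates.
    The tolerance is an assumed window, not a value from the study.
    """
    hits = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(sorted(ca_coords.items()), 2):
        d = distance(xyz_a, xyz_b)
        if abs(d - target) <= tolerance:
            hits.append((name_a, name_b, d))
    return hits


# Inter-lysine distances reported in the text (Å).
reported = [32.71, 33.75, 33.63, 34.31]
print(all(abs(d - 34.0) <= 2.0 for d in reported))

# Synthetic coordinates for illustration only.
example = {"Lys-A": (0.0, 0.0, 0.0), "Lys-B": (34.0, 0.0, 0.0), "Lys-C": (10.0, 0.0, 0.0)}
print(lysine_pairs(example))
print(centroid([(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)]))
```

Applied to α-carbon coordinates from a structure file, a scan of this kind would list candidate lysine pairs; whether any such pair functions as a signal would still depend on the position of the oligosaccharide relative to the pair.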
The results of this study suggest that the cathepsin L phosphorylation signal is polarized with one lysine residue (Lys-99) proximal to the site of mannose phosphorylation and the other lysine residue (Lys-54) distal. If the binding properties of these residues were equivalent, two regions of mannose phosphorylation corresponding to two transferase-binding orientations would be expected. Mutation of residues in the vicinity of critical lysine residues did not reveal involvement of other residues that would distinguish interaction of the two lysine residues with the transferase. However, it is possible that residues other than those tested could serve such a role. It is also possible that, because of the overall shape of procathepsin L, binding of the transferase to the cathepsin is sterically restricted to a single orientation or that binding in one orientation is nonproductive.
The existence of additional minor or cryptic lysine-based phosphorylation signals on cathepsin L is consistent with data gathered on other mannose-phosphorylated proteins. As the authors of the chimeric studies of cathepsin D noted, their results can be interpreted in terms of two independent phosphorylation signals rather than a single extended signal (9). The existence of two or more signals on cathepsin D explains why mutation of individual lysine residues affects phosphorylation of the two cathepsin D oligosaccharides differently and why localized phosphorylation of engineered glycosylation sites was not observed for this protein (12,15).
A previously unresolved issue concerning the nature of the phosphorylation signal is its level of complexity. Is the signal a complex, highly evolved structure that interacts with the transferase over an extended surface (29) or is it a relatively simple structure composed of a few well placed residues (17)? Two findings presented in this study address this issue. First, mutation of residues surrounding critical lysine residues to alanine had little if any effect on phosphorylation of cathepsin L. This supports our previous finding that lysine residues account for most if not all of the energy of interaction between the transferase and the cathepsin L phosphorylation signal (13,14). These findings suggest that phosphorylation signals may actually be quite simple, requiring as few as two well placed lysine residues as a minimal structure (14,17). This does not mean that additional protein factors are not important in selected cases. Arginine has been shown to partially substitute for lysine in some contexts (15-17). Other residues in the vicinity of critical lysine residues may affect the accessibility or properties of the lysine residues. Effects such as this may explain the apparent involvement of tyrosine residues in phosphorylation of aspartylglucosaminidase and DNase I (16,29). In theory, any residue in contact with the transferase when it is bound to the protein could affect the rate and efficiency of phosphorylation. The issue of complexity is also addressed by the finding that glyco-GFP is susceptible to mannose phosphorylation. Given the origin, subcellular location, and structure of GFP, there is no logical explanation for how a complex mannose phosphorylation signal could have evolved on this protein. On the other hand, phosphorylation of this protein is completely compatible with a simple phosphorylation signal composed of a few well placed lysine residues.
FIG. 9. Murine procathepsin L is displayed along three different axes to show the relationship between the critical lysine residues of the protein and the mannose phosphorylation site (see Fig. 6H). The coordinate system is defined such that the Z plane is the side of the protein that exposes the residues Lys-54 and Lys-99 (B), and the line between the residues delimits the x axis. The distance between Lys-99 and the midpoint of the phosphorylation cloud was calculated as 26.83 Å. Note that structures further from the viewer, such as the phosphorylation cloud in the top panel, appear darker to indicate depth.
Finally, a simple phosphorylation signal is compatible with what is known about the evolution of the mannose 6-phosphate recognition system. Utilization of the mannose 6-phosphate recognition system for lysosomal targeting was a relatively late event in evolution probably occurring sometime during early vertebrate evolution. Evolution of the system would have required a genetic mechanism for generating mannose phosphorylation signals on a structurally diverse set of lysosomal hydrolases. Generation of a complex structure on a diverse set of proteins would have required extensive remodeling of the proteins. If such alterations had occurred, one would expect to observe manifestations of these changes when comparing hydrolase sequences from species that utilize the mannose 6-phosphate recognition system and those that do not. Such manifestations have not been observed. A simple phosphorylation signal, such as the one described, would allow generation of phosphorylation signals on proteins of diverse structure through one or two point mutations and would be fully compatible with evolution of the system.