Processing of lysosomal beta-galactosidase. The C-terminal precursor fragment is an essential domain of the mature enzyme.

Lysosomal beta-D-galactosidase (beta-gal), the enzyme deficient in the autosomal recessive disorders G(M1) gangliosidosis and Morquio B, is synthesized as an 85-kDa precursor that is C-terminally processed into a 64-66-kDa mature form. The released approximately 20-kDa proteolytic fragment was thought to be degraded. We now present evidence that it remains associated with the 64-kDa chain after partial proteolysis of the precursor. This polypeptide was found to copurify with beta-gal and protective protein/cathepsin A from mouse liver and Madin-Darby bovine kidney cells and was immunoprecipitated from human fibroblasts but not from fibroblasts of a G(M1) gangliosidosis and a galactosialidosis patient. Uptake of wild-type protective protein/cathepsin A by galactosialidosis fibroblasts resulted in a significant increase of mature and active beta-gal and its C-terminal fragment. Expression in COS-1 cells of mutant cDNAs encoding either the N-terminal or the C-terminal domain of beta-gal resulted in the synthesis of correctly sized polypeptides without catalytic activity. Only when co-expressed do the two subunits associate and become catalytically active. Our results suggest that the C terminus of beta-gal is an essential domain of the catalytically active enzyme and provide evidence that lysosomal beta-galactosidase is a two-subunit molecule. These data may give new significance to mutations in G(M1) gangliosidosis patients found in the C-terminal part of the molecule.

Human lysosomal β-D-galactosidase (β-gal) is an exoglycosidase that removes β-ketosidically linked galactose residues from glycoproteins, sphingolipids, and keratan sulfate (reviewed in Ref. 1). The metabolic storage disorders G(M1) gangliosidosis and Morquio B are caused by structural deficiencies in the β-gal gene (1), while in galactosialidosis (GS) reduction of β-gal activity is secondary to a deficiency in the lysosomal carboxypeptidase protective protein/cathepsin A (PPCA) (2,3). Clinically distinct forms of G(M1) gangliosidosis have been described, varying in age of onset and severity. Patients develop a multisystemic disease with a very severe central nervous system involvement. In contrast, Morquio B patients show no neurologic deterioration or hepatosplenomegaly but do present with skeletal abnormalities later in life. Oligosaccharides carrying terminal β-linked galactose residues derived from either glycoproteins or keratan sulfate are the major storage materials in visceral organs, while G(M1) gangliosides and, to a lesser extent, G(A1) accumulate in the brains of G(M1) gangliosidosis patients (1). Specifically, the central nervous system pathology is mimicked in β-gal knock-out mice (4).
Early immunoprecipitation studies showed that β-gal is synthesized as an N-glycosylated 85-kDa precursor, which is converted into a 66-kDa intermediate and a 64-kDa mature form (1,12). In normal human fibroblasts, the intermediate form accumulated when these cells were cultured in the presence of leupeptin, indicating that proteolytic processing could be involved in the conversion of the 66-kDa form to the 64-kDa form (12). GS fibroblast treatment with leupeptin was sufficient to protect β-gal against degradation, again resulting in the accumulation of the 66-kDa form (12).
The human β-gal cDNA encodes a 76-kDa protein that carries seven potential N-glycosylation sites (13-15). Comparison of the chemically determined N-terminal amino acid sequence of the 64-kDa form with the cDNA deduced sequence revealed that only 28 residues (including the signal peptide) are N-terminally removed during maturation of the precursor molecule. This suggested that the major proteolytic processing of the β-gal precursor takes place at its C terminus (14). However, the exact location of the cleavage site at the C terminus of the precursor molecule was unknown. Also, little attention has been given to the significance of the C-terminal domain in the biosynthesis of the enzyme and its fate after maturation. In this report, we show that the C-terminal peptide after processing of the precursor molecule remains associated with the N-terminal fragment and is not degraded.

EXPERIMENTAL PROCEDURES
Cell Culture-Human skin fibroblasts from a normal individual and patients with early infantile GS (16) or G(M1) gangliosidosis (14) were deposited in the European Cell Bank (Rotterdam, The Netherlands) (Dr. W. J. Kleijer). Fibroblasts, Madin-Darby bovine kidney cells, and COS-1 cells (17) were maintained in Dulbecco's modified Eagle's medium supplemented with antibiotics and 10 or 5% (COS-1) fetal bovine serum.
Enzyme and Protein Assays-β-Gal and cathepsin A activities were measured with 4-methylumbelliferyl-β-D-galactopyranoside and the N-blocked dipeptide carbobenzoxyphenylalanylalanine as substrates, respectively, according to Refs. 3 and 18. Total protein concentrations were quantitated with bicinchoninic acid (19) following the manufacturer's guidelines (Pierce).
Enzyme Purification-Mouse livers (600 g, wet weight) were extracted from freshly sacrificed C57/Bl6 and Bl6/CBA mice and used for preparation of a lysosomal-mitochondrial extract as described by Scheibe et al. (20). Alternatively, cultured Madin-Darby bovine kidney cells were used (10^10 cells). The extract was concentrated by ammonium sulfate precipitation (21), and pelleted proteins were resuspended in 20 mM EDTA, 10 mM tartaric acid, 5 mM β-mercaptoethanol, pH 5.0 (buffer A) and desalted on a BioGel P6DG column (2.5 × 50 cm; equilibrated in buffer A (1 ml/min)). Following concentration in a stirred cell concentrator (Amicon), proteins were separated on a Sephadex G-200 column (5 × 45 cm; equilibrated in buffer A (1 ml/min)); fractions eluting right after the void volume in which both β-gal and cathepsin A activities peaked were pooled and concentrated in Centriprep C-30 concentrators (Amicon). This concentrate was passed through a Sephacryl S-400 column (1.5 × 165 cm; equilibrated in buffer A (0.4 ml/min)). The fractions with highest β-gal activity were pooled and loaded on a p-aminophenyl β-D-thiogalactopyranoside-agarose column (Sigma) (1 × 8 cm, equilibrated in buffer A containing 100 mM NaCl (buffer B) at 0.4 ml/min). This column was washed with buffer B and eluted with buffer B containing 500 mM D-galactonic acid γ-lactone (Sigma), and fractions highest in β-gal activity were pooled. Before SDS-polyacrylamide gel electrophoresis, D-galactonic acid γ-lactone was removed by dialysis against buffer B.
Western Blotting and Protein Sequence Analysis-For analytical purposes, Western blots were prepared from 12.5% SDS-polyacrylamide gels and probed as described (22). Blots were developed with colorimetric substrates (22). For preparative Western blotting, SDS-polyacrylamide gels were prepared with piperazine diacrylamide (Bio-Rad) as cross-linker. Following electrophoresis, proteins were transferred to Problott polyvinylidene difluoride membrane (Applied Biosystems) according to the manufacturer's guidelines. Blots were briefly stained with Coomassie Brilliant Blue, and individual bands were excised and processed for N-terminal amino acid sequence analysis by automated Edman degradation using an Applied Biosystems 470 sequenator. Initial yield was ~35 pmol.
cDNA Mutagenesis-The β-gal cDNA (14) was truncated after nucleotide 1679 (corresponding to codon 543) with two stop codons by adapting the polymerase chain reaction-based in vitro mutagenesis strategy of Higuchi et al. (25) using sense primer A (5′-GTGGATGGGATCCCCCAGGG-3′, corresponding to β-gal cDNA nucleotides 1389-1407, including the BamHI site at nucleotide 1397) and antisense primer B (5′-CCCCAAGCTTCATCATGAGTTGTGGGCCCAGGCTT-3′, corresponding to β-gal cDNA nucleotides 1679-1660, including an adapter sequence carrying a HindIII site and two stop codons). Using the full-length β-gal cDNA as template and primers A and B, a fragment corresponding to β-gal nucleotides 1389-1679 was amplified, carrying two translation termination signals and a HindIII site at its 3′-end. After BamHI/HindIII digestion, this fragment was linked to the BamHI fragment of normal β-gal cDNA ending at nucleotide 1397, to replace all of the β-gal cDNA located 3′ of the BamHI site at nucleotide 1397. At its 3′-end the polymerase chain reaction product was ligated to a HindIII site in the 3′-flanking vector sequence.
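The primer design described above can be checked with a short script. This is only an illustrative sketch: the two sequences are copied from the text with the line-break hyphen removed, the helper names are hypothetical, and the assumption that the first 20 nucleotides of the sense-strand view of primer B are the annealing part follows from the stated coordinates (nucleotides 1660-1679).

```python
# Primer sequences as given in the text (line-break hyphens removed).
PRIMER_A = "GTGGATGGGATCCCCCAGGG"                 # sense primer A
PRIMER_B = "CCCCAAGCTTCATCATGAGTTGTGGGCCCAGGCTT"  # antisense primer B

BAMHI_SITE = "GGATCC"
HINDIII_SITE = "AAGCTT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Read primer B on the sense strand, i.e. as it appears at the 3' end
# of the amplified fragment.
sense_view_of_b = reverse_complement(PRIMER_B)

# Assumption from the stated coordinates: the first 20 nt match cDNA
# nucleotides 1660-1679; everything after that is the adapter.
annealing_part = sense_view_of_b[:20]
adapter = sense_view_of_b[20:]

# The first two codons of the adapter should be the two stop codons.
adapter_codons = [adapter[0:3], adapter[3:6]]

print("BamHI site in primer A:", BAMHI_SITE in PRIMER_A)
print("adapter (sense strand):", adapter)
print("first two adapter codons:", adapter_codons)
print("HindIII site in adapter:", HINDIII_SITE in adapter)
```

Read this way, the adapter begins with two consecutive TGA codons directly after the annealing region, and the HindIII site follows them, which agrees with the description of the truncated construct.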
To create a cDNA that codes for a β-gal deletion mutant in which the β-gal signal sequence (ending with Tyr28) is linked to Ser544 (coding for β-gal signal peptide in frame with the C-terminal domain of β-gal), nucleotide 134 of the β-gal cDNA was linked to nucleotide 1680. Primers were synthesized that span the fusion site: sense primer A (5′-CGCAATGCCACCTCCAACTACACGCTCCC-3′, corresponding to nucleotides 123-134 and 1680-1696) and antisense primer B (5′-CGTGTAGTTGGAGGTGGCATTGCGCAAGC-3′, corresponding to nucleotides 1691-1680 and 118-134). Using the β-gal cDNA (cloned into pBluescript) as template in separate polymerase chain reactions, primer B was combined with a sense primer that anneals at the 5′-flanking vector sequence, and primer A was combined with an antisense primer that anneals at the 3′-flanking vector sequence. Thus, in the first reaction (primer B), a fragment was generated, of which the terminal 24 nucleotides had the same sequence as the first 24 nucleotides of the product of the second reaction (primer A). These two fragments were annealed at their homologous termini and used as template for a third polymerase chain reaction using the primers that anneal at 3′- and 5′-flanking vector sequences. In this reaction, a fragment was amplified that corresponds to the β-gal cDNA lacking nucleotides 135-1679.
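The 24-nucleotide overlap between the two first-round products can be reproduced from the primer sequences alone. The sketch below is illustrative only: the sequences are copied from the text with line-break hyphens removed, and the function names are hypothetical. The sense strand of the first product ends with the reverse complement of primer B, and the sense strand of the second product begins with primer A, so the shared region is the longest suffix of the former that is also a prefix of the latter.

```python
# Fusion-site primers as given in the text (line-break hyphens removed).
PRIMER_A = "CGCAATGCCACCTCCAACTACACGCTCCC"  # sense primer A
PRIMER_B = "CGTGTAGTTGGAGGTGGCATTGCGCAAGC"  # antisense primer B


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlap_length(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for n in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream.endswith(downstream[:n]):
            return n
    return 0


# 3' end of the first-round product (sense strand) and 5' start of the second.
upstream_end = reverse_complement(PRIMER_B)
shared = overlap_length(upstream_end, PRIMER_A)

print("shared nucleotides:", shared)
print("overlap sequence:  ", PRIMER_A[:shared])
```

With these two primers the computed overlap is 24 nucleotides, matching the figure stated in the text; this shared stretch is what allows the two fragments to anneal and prime each other in the third reaction.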
Transfections-cDNAs were transfected into subconfluent COS-1 cells using calcium phosphate precipitation as described (29) or the Superfect reagent, according to the manufacturer's instructions (Qiagen). Forty-eight hours post-transfection, cells were metabolically labeled with L-[4,5-3H]leucine (50 μCi/ml of culture medium) for 16 h and processed for immunoprecipitation; alternatively, for analysis of enzyme activity, cells were harvested by trypsinization 72 h post-transfection.
Uptake Experiments-Early infantile GS fibroblasts (16) were grown to confluency in 35- and 85-mm dishes. PPCA precursors were prepared and added to the culture medium of these cells, as described earlier (23). After 4 days of uptake, fibroblasts in 35-mm dishes were harvested by trypsinization for analysis of enzyme activities. Cells in 85-mm dishes were cultured for 3 days in the presence of exogenous PPCA precursors; culture was then continued in the presence of both L-[4,5-3H]leucine (50 μCi/ml) and exogenous PPCA precursors for 18-20 h before processing cells for immunoprecipitation (23).

RESULTS
Copurification of the β-Gal C-terminal Fragment with the β-gal 64-kDa Chain-During purification of the multienzyme complex containing β-gal, PPCA, and neuraminidase, a lysosomal-mitochondrial fraction was prepared from mouse liver and fractionated by gel filtration on a Sephadex G-200 column (Fig. 1A). All three enzyme activities co-eluted right after the void volume (peak I), while oligomeric forms of β-gal and PPCA eluted independently (i.e. PPCA dimers (~104 kDa) and β-gal tetramers (~256 kDa) and dimers (~128 kDa) (Fig. 1A)). Surprisingly, in samples displaying β-gal activity (peaks I and II), not only the 64-kDa mature form, but also two bands of apparent molecular mass of ~22-24 kDa were clearly recognized by anti-β-gal precursor antibodies (Fig. 1B). Furthermore, these low molecular weight proteins copurified with the multienzyme complex (peak I), after purification on a β-gal affinity column (p-aminophenyl β-D-thiogalactopyranoside-agarose) (Fig. 1C). It is noteworthy that most of the β-gal eluted from the p-aminophenyl β-D-thiogalactopyranoside-agarose column was present in fraction 4, followed by a minor peak in fraction 5, which coincided with the pool of cathepsin A and neuraminidase activities (Fig. 1C). This elution pattern may depend on the fact that β-gal in complex with the other two enzymes (fraction 5) has a higher affinity for the substrate than its free form. Apparently, the association of the β-gal C-terminal fragment with the mature 64-kDa domain, although noncovalent, is stable enough to withstand harsh washes with 1 M NaCl solutions during the purification procedure.
To verify the identity of the 22/24-kDa form (Fig. 1B), N-terminal amino acid sequencing was performed. The obtained sequence coincided with that of mouse β-gal residues Ser546-Phe558 (Fig. 1D, underlined), corresponding to Ser543-Phe555 in the human protein (Fig. 1D). Thus, the 22/24-kDa band represents a C-terminal fragment of the β-gal precursor, generated during its proteolytic processing in lysosomes. To prove that this mechanism of β-gal maturation into a two-subunit enzyme is conserved within other mammalian species, we purified the bovine multienzyme complex from cultured Madin-Darby bovine kidney cells (data not shown). Also in bovine cells, the C-terminal β-gal fragment co-purified with the multienzyme complex. The chemically determined N-terminal amino acid sequence was 69% identical to the mouse sequence (Fig. 1D, underlined). Although the first amino acid of the bovine fragment did not coincide with that of its murine counterpart, both fragments start with a serine residue (Fig. 1D), suggesting that the protease cleavage site could be between two serines. The high degree of similarity between the bovine, mouse, and human C-terminal β-gal sequences points to a fully conserved mechanism of proteolytic processing of the enzyme (Fig. 1D).
Immunoprecipitation of C-terminal Fragment from Human Fibroblasts-To verify if the 22/24-kDa fragment is also retained after normal processing of the human precursor, radiolabeled human fibroblasts were subjected to immunoprecipitation with anti-β-gal antibodies that recognize both precursor and mature forms of the protein (α-BV85 and α-n64, respectively). We also constructed a cDNA (B24) that separately encodes the C-terminal domain (Ser544-Val677) and was tagged N-terminally with the β-gal signal sequence to ensure translocation of the polypeptide to the endoplasmic reticulum (see "Experimental Procedures"). This mutagenized cDNA was transfected in COS-1 cells to compare the size of its encoded fragment with that of the C-terminal fragment of β-gal in human fibroblasts. After SDS-polyacrylamide gel electrophoresis, only the immunoprecipitated 85-kDa β-gal precursor and the 64-kDa mature form were readily detected in normal fibroblasts (Fig. 2, lane 1). However, deglycosylation with N-glycosidase F allowed the detection of an 18-kDa band (Fig. 2, lane 2) that comigrated with the overexpressed and deglycosylated B24 protein (Fig. 2, lane 6). This result suggests that both the C-terminal domain of human β-gal and the truncated B24 protein are glycosylated at two sites. Apparently, the presence of sugar chains on the C-terminal fragment alters its electrophoretic mobility and renders its detection difficult. Notably, all molecular mass forms of β-gal, including the 18-kDa band, were absent from β-gal mRNA-deficient G(M1) gangliosidosis fibroblasts (Fig. 2, lanes 3 and 4). Thus, the C-terminal fragment is retained in the mature form of β-gal in human fibroblasts. The fact that the ~60- and 18-kDa deglycosylated bands are not of equal intensity may be due to the 6-fold difference in number of leucine residues between these fragments.
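As a rough plausibility check on the band sizes quoted above, polypeptide masses can be estimated from residue counts. The sketch below is not part of the original analysis: it assumes a rule-of-thumb average of 110 Da per amino acid residue, takes residue boundaries from the text (28 residues removed N-terminally, cleavage between Ser543 and Ser544, Val677 as the last residue), and ignores glycosylation and the further trimming of the large chain. All names in the code are hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the text.
AVERAGE_RESIDUE_MASS_DA = 110


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


def estimated_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa, without any glycans."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


# Inclusive residue ranges inferred from the text.
segments = {
    "full-length translation product (1-677)": (1, 677),
    "N-terminal chain without first 28 residues (29-543)": (29, 543),
    "C-terminal fragment (544-677)": (544, 677),
}

estimates = {}
for name, (first, last) in segments.items():
    n = residue_count(first, last)
    estimates[name] = (n, estimated_mass_kda(n))
    print(f"{name}: {n} residues, roughly {estimates[name][1]:.1f} kDa")
```

Under these assumptions the estimates come out at roughly 74, 57, and 15 kDa, the same order as the 76-kDa translation product and the ~60- and 18-kDa deglycosylated bands reported in the text. Apparent masses on SDS-polyacrylamide gels commonly deviate from sequence-based estimates, so only the approximate agreement is meaningful.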
To further ascertain if the C-terminal peptide is indeed a component of mature and active β-gal, we cultured PPCA mRNA-negative GS fibroblasts in the presence of exogenous PPCA precursor. It is well established that endocytosis of PPCA precursor by these cells rescues mature β-gal from rapid degradation and restores its enzymatic activity to nearly normal levels (11,16,30). We used wild-type PPCA (Fig. 3, WT) as well as two PPCA mutants in uptake experiments; PPCA-F412V (FV), originally identified in a mild late infantile case of GS, is only partially transported to lysosomes, where it is unstable (23,32), and PPCA-S150A (SA) is an engineered active-site mutant that has no cathepsin A activity but, like wild-type PPCA, restores β-gal and neuraminidase activities in PPCA-deficient cells (16,30). GS cells were treated with these unlabeled PPCA precursors for 3 days and radiolabeled with [3H]leucine for an additional day. The cells were then processed for immunoprecipitation with α-BV85 and α-n64 antibodies. Fig. 3 shows that, while the level of the 85-kDa β-gal precursor was comparable in all cells, restoration of β-gal activity was achieved after uptake of wild-type PPCA or PPCA-S150A but not after uptake of PPCA-F412V. The increase in activity was paralleled by increased amounts of both the 64- and 28/24-kDa forms. Similar results were obtained using an unrelated strain of PPCA mRNA-deficient GS fibroblasts (data not shown). These observations confirm unequivocally that the C-terminal fragment is an integral part of the mature β-gal.
Role of the C-terminal Fragment in β-Galactosidase Secretion and Catalytic Activity-To investigate whether the N-terminal domain of the enzyme would be functional without the C-terminal fragment, we constructed a cDNA that encodes only the N-terminal polypeptide (Met1-Ser543; B64). Transfection of COS-1 cells with either B64 or B24 cDNA constructs did not result in an increase of β-gal activity (Fig. 4A, B24 and B64). However, co-expression of the two cDNAs generated a significant increase in β-gal activity over the COS-1 endogenous values (Fig. 4A, B24/B64), implying that the interaction between the N-terminal and C-terminal domains is required for the catalytic activity of β-gal.
FIG. 3. Uptake of exogenous PPCA by galactosialidosis fibroblasts. PPCA-mRNA negative galactosialidosis fibroblasts were cultured either in normal medium (−) or in the presence of COS-1 overexpressed and secreted exogenous wild-type PPCA (WT), mutant PPCA-F412V (FV), and mutant PPCA-S150A (SA). The cells were metabolically labeled and processed for immunoprecipitation (see Fig. 2) and separated by SDS-polyacrylamide gel electrophoresis. Alternatively, cells were harvested and assayed for β-galactosidase, cathepsin A, and neuraminidase activities (lower three panels).
C-terminal peptide (α-24) (Fig. 4B, lanes 5 and 6, respectively). Since the latter antibody did not recognize the ~66-kDa fragment by itself, the co-precipitation data suggest that the two β-gal domains form a two-chain molecule. The association between the two subunits must be strong enough to endure the stringent washing conditions (0.5 M NaCl, pH 8.6) after immunoprecipitation. Furthermore, the 66-kDa fragment was secreted only when co-expressed with the C-terminal portion (Fig. 4B, lanes 11 and 12). In contrast, the C-terminal peptide was detected in the medium of overexpressing cells, regardless of the presence or absence of co-expressed B64 (Fig. 4B, lanes 10 and 12). Altogether, these results point to an essential role of the C-terminal fragment in the proper folding of the N-terminal part of β-gal and in the intracellular transport of this domain.

DISCUSSION
The 85-kDa precursor of human lysosomal β-gal is proteolytically processed at its C terminus into a 66/64-kDa mature form (12). The fate of the ~24/28-kDa fragment that is generated in this process was until now overlooked. Here we present evidence that this polypeptide is retained after maturation of the precursor and is part of the catalytically active form of β-gal. At present, its specific role in the lysosomal function of mature β-gal is not fully understood. It is well established that correct folding of enzyme precursors is a prerequisite for their intracellular transport and lysosomal localization (reviewed in Refs. 33-36). Therefore, we can speculate that β-gal depends on its C-terminal domain for acquisition of the proper tertiary structure, both in the precursor and mature state. In addition, the polypeptide may be involved in protein-protein interaction both in the homo-oligomeric or heteromultimeric forms of the enzyme.
The significance of the β-gal C-terminal domain is further supported by the identification of several amino acid substitutions in different G(M1) gangliosidosis patients that localize in this region: Y591C, Y591N (37), R590H, K578R, and E632G (38). The latter three were shown to be disease-causing mutations, since they abolished enzyme activity when the mutant proteins were expressed in COS-1 cells. Therefore, they may affect either the folding of the precursor and its exit from the endoplasmic reticulum or the lysosomal stability/activity of a correctly routed enzyme. It will still be interesting to analyze the effects of these mutations at the structural level. In addition, the C terminus of mammalian β-gal contains a small domain (residues 577-592) that is conserved in nine β-galactosidases and related proteins from bacterial, fungal, nematode, and plant species (39-41). Interestingly, all of the mutations mentioned above except E632G lie in this area. Specifically, Lys578 and Arg590 are identical in all organisms except in Aspergillus niger, while Tyr591 is maintained in four of these species and is adjacent to the fully conserved Trp592.
The fact that the C-terminal fragment isolated from murine liver starts at Ser546 does not exclude the possibility that this peptide may arise from a larger proteolytic product that is subsequently trimmed at its N terminus. However, leupeptin treatment of cultured human fibroblasts results in accumulation of a 66-kDa intermediate (12). Therefore, it is likely that the human β-gal precursor is initially cleaved at or close to the N-terminal side of Ser543-Ser544, followed by further processing of the large domain at Arg530 by a trypsin-like protease, as predicted earlier (15). It is noteworthy that the precursor, obtained from overexpressing insect cells, can be readily converted to a 64-kDa form by mild trypsin digestion (42).
Few examples of C-terminal processing of lysosomal enzymes have been described. Lysosomal α-glucosidase undergoes both N- and C-terminal processing; a ~20-kDa peptide is removed from the C terminus, but it is not known whether it remains associated with the rest of the mature molecule (reviewed in Ref. 43). In contrast, the 18 amino acid residues that are removed from the C terminus of human lysosomal β-glucuronidase are required for its binding to the microsomal esterase egasyn, which is a prerequisite for a fraction of the enzyme precursor to be retained in the ER (51). In addition, when synthesized without its C terminus, the enzyme is catalytically impaired, hypophosphorylated, and increasingly secreted (44). Thus, the C-terminal domain of the β-glucuronidase precursor appears to be essential for its folding process and compartmentalization.
In contrast to the examples discussed above, many more proteins are known to undergo N-terminal processing (see Refs. 45-48 for reviews). The N-terminal pro-regions of various prokaryotic and eukaryotic proteases and polypeptide hormones function as intramolecular chaperones; they are known to be required for proper protein folding (and consequently intracellular transport) by reducing kinetic barriers between unfolded or partially folded forms and mature molecules. They are also powerful inhibitors of the proteolytic enzymes and do not necessarily become superfluous after processing; e.g. the extracellular form of cathepsin B remains in a noncovalent complex with its propeptide, and this conformation accounts for its stability in this environment (49,50).
In view of the fact that β-gal forms a multienzyme complex with neuraminidase and PPCA, it will be relevant in the future to determine the role played by the C-terminal fragment in complex assembly.