Low Density Lipoprotein Receptor Class A Repeats Are O-Glycosylated in Linker Regions

Background: Evidence of O-glycosylation of the LDL receptor (LDLR) ligand-binding domain has been reported.
Results: Analysis of recombinant LDLR showed conserved O-glycans in linker regions between LDLR class A repeats.
Conclusion: The ligand-binding domain of LDLR is O-glycosylated with extended O-glycans, and O-glycans are also found in related receptors.
Significance: O-Glycosylation may play a role in LDLR binding to ApoB and intracellular trafficking.

The low density lipoprotein receptor (LDLR) is crucial for cholesterol homeostasis, and deficiency in LDLR function causes hypercholesterolemia. LDLR is a type I transmembrane protein that requires O-glycosylation for stable expression at the cell surface. It has previously been suggested that LDLR O-glycosylation is found N-terminal to the juxtamembrane region. Recently we identified O-glycosylation sites in the linker regions between the characteristic LDLR class A repeats in several LDLR-related receptors using the “SimpleCell” O-glycoproteome shotgun strategy. Herein, we have systematically characterized O-glycosylation sites on recombinant LDLR shed from HEK293 SimpleCells and CHO wild-type cells. We find that the short linker regions between LDLR class A repeats contain an evolutionarily conserved O-glycosylation site at position −1 of the first cysteine residue of most repeats, which in wild-type CHO cells is glycosylated with the typical sialylated core 1 structure. The glycosites in linker regions of LDLR class A repeats are conserved in LDLR from man to Xenopus and found in other homologous receptors. O-Glycosylation is controlled by a large family of polypeptide GalNAc transferases. Probing which isoform(s) contributed to glycosylation of the linker regions of the LDLR class A repeats by in vitro enzyme assays suggested a major role of GalNAc-T11. This was supported by expression of LDLR in HEK293 cells, where knock-out of the GalNAc-T11 isoform resulted in the loss of glycosylation of three of four linker regions.


The low density lipoprotein receptor (LDLR) is a membrane-bound cell surface receptor crucial for the homeostasis of plasma cholesterol, and deleterious mutations in LDLR underlie familial hypercholesterolemia. LDLR has served as a model for receptor endocytosis (1), and is the founding member of a superfamily that includes VLDL receptor (VLDLR), LDLR-related protein 1 (LRP1), LDLR-related protein 2 (LRP2 or megalin), and LDLR-related protein 8 (LRP8 or ApoER2) and more distantly related receptors such as sortilin-related receptor, L(DLR class) A repeats containing (SORLA) (2). Several members of the LDLR family have in common their ability to bind apolipoprotein E found in chylomicrons and VLDL particles, whereas LDLR is the main receptor binding apolipoprotein B-100 found in intermediate density lipoprotein and LDL particles. In addition, LRP1 and VLDLR are able to bind lipoprotein lipase (3,4) and hepatic lipase (5,6). These receptors have several structural motifs in common, including the ligand-binding LDLR class A repeats and LDLR class B repeats consisting of a β-propeller structure surrounded by three EGF-like repeats, two N-terminal and one C-terminal to the β-propeller (2). The LDLR class A repeat is a structure consisting of 40 amino acids constrained by three internal disulfide bridges.
The juxtamembrane region of LDLR, VLDLR, and LRP8 contains a mucin-like sequence that has been shown to carry GalNAc-type O-glycans, although the actual sites of attachment have not been clarified (11-14). O-Glycosylation in the stem region of LDLR is important for cell surface expression and stability of this receptor as originally demonstrated with the CHO ldlD cell line deficient in the Glc/GlcNAc C4-epimerase, where the ectodomain of LDLR is shed from the surface unless cells are grown in Gal and GalNAc to enable O-glycosylation (12,13). A study by Davis et al. (15) suggested that apart from the stem region, the N-terminal segment of LDLR also appeared to contain O-glycans. In this study they used an LDLR construct without the stem region containing the putative proteolytic release site(s) as well as the juxtamembrane O-glycosylation sites. Their results demonstrated that when expressed in CHO cells the surface expression and half-life of the LDLR, as well as its binding to LDL and internalization, were not affected. Similar findings have been reported for VLDLR, although in this case enhanced shedding was observed for the VLDLR lacking the stem region (11). Curiously, studies on monensin-resistant CHO cell lines have revealed that one particular clone (Mon-R31) appeared to produce LDLR with O-glycans in the stem region, but devoid of O-glycans in the N-terminal ligand-binding domain (16,17). The Mon-R31 cells showed strongly reduced LDL binding and uptake (17,18), suggesting that the putative N-terminal O-glycans may be important for receptor function.
We recently developed a proteome-wide O-glycoproteome discovery strategy using stable gene editing to simplify O-glycan structures, the so-called SimpleCell (SC) strategy (19-21), where the COSMC gene that encodes a private chaperone for the Core1 synthase, C1GalT1, is knocked out, resulting in loss of O-glycan elongation (22). Using this shotgun strategy, we identified a number of O-glycosites in the linker regions between LDLR class A repeats of LDLR-related receptors, and more specifically found one ambiguously identified O-glycosite in the linker region between two LDLR class A repeats (repeat 2-3) of the LDLR. Herein, we report a detailed characterization of the O-glycosylation of recombinant LDLR expressed in CHO and HEK293 cells. We demonstrate that the linker regions between LDLR class A repeats contain a highly conserved O-glycan in the sequence motif XX-C6XXXTC1-XX, where C6 is the sixth cysteine of one repeat and C1 the first cysteine of the next. This O-glycosylation sequence motif is found in many but not all LDLR class A repeat linker regions of other LDLR-related receptors. We further provide evidence that the polypeptide GalNAc transferase isoform GalNAc-T11 is the major enzyme responsible for glycosylation of the linker regions of the human LDLR. The biological significance of these O-glycans is not yet clear, but our results now provide a firm basis for functional studies.

EXPERIMENTAL PROCEDURES
Cell Lines-HEK293 SC with COSMC knock-out (19), and HEK293 SC/-GALNT11 with both COSMC and GALNT11 knock-out were grown in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum and 1% glutamine. CHO GS (Sigma) were grown in EX CELL CD CHO Fusion medium (Sigma) supplemented with 2% glutamine.
Recombinant Expression of LDLR-Cells were transfected with 3 µg of plasmid encoding human LDLR with C-terminal V5-tag and bicistronic EGFP (pIRES2-hLDLR-V5-EGPF) (23) using an Amaxa Nucleofector according to the manufacturer's protocol (Lonza), and transient expression was analyzed within 3 days. HEK293 SC stable transfectants were selected in 0.4 mg/ml of G418 (Invitrogen) monitored by EGFP expression, followed by dilutional cloning. CHO WT transfectants were seeded into 96-well plates at 0.7 cells/well in 80% EX CELL CHO Cloning Medium (Sigma) and 20% EX CELL CD CHO Fusion medium (Sigma) with 2% glutamine and 0.32 mg/ml of G418. Further analysis by immunocytology and Western blotting on total cell lysates using anti-V5 (sc-81594, Santa Cruz Biotechnology) was performed. For production and harvesting of shed LDLR, semiconfluent HEK293 SC cells were grown in T175 flasks with 40 ml of media for 3 days and CHO WT for 10 days before harvest.
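The seeding density of 0.7 cells/well used for cloning can be related to the expected fraction of single-cell wells. The following is a minimal sketch, assuming (this is not stated in the protocol) that the number of cells delivered to each well follows a Poisson distribution with mean 0.7; the function names are illustrative only.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well, assuming a Poisson count."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def seeding_outcomes(mean_cells_per_well: float) -> dict:
    """Expected fractions of empty, single-cell, and multi-cell wells."""
    p_empty = poisson_pmf(0, mean_cells_per_well)
    p_single = poisson_pmf(1, mean_cells_per_well)
    p_multiple = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # Fraction of wells containing cells that started from one cell.
        "single_given_occupied": p_single / (1.0 - p_empty),
    }


# 0.7 cells/well is the density given in the text; Poisson is an assumption.
outcomes = seeding_outcomes(0.7)
for name, probability in outcomes.items():
    print(f"{name}: {probability:.3f}")
```

Under this assumption roughly half of the wells receive no cell, and about 69% of the occupied wells start from a single cell, which is why clones from such plates are usually still checked for clonality.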
Purification of Shed LDLR-A column with 1 ml of packed VVA (for SimpleCells) or PNA (for wild-type)-agarose (Vector Laboratories) per 0.5 liter of media was used to enrich for glycoproteins contained in media from LDLR-overexpressing cells. For VVA purification, media (dialyzed twice for 8 h against 10 mM Tris, pH 7.4, 150 mM NaCl at 4°C) were mixed 1:1 in 2× Lac A buffer (40 mM Tris, pH 7.4, 300 mM NaCl, 2 M urea, 0.2 mM CaCl2/MgCl2/MnCl2/ZnCl2), applied to the VVA column and after extensive washing (40-50 column volumes (CV) 1× Lac A) eluted with 0.2 M GalNAc in Lac A (4 × 1 ml). For PNA purification, glycoproteins in the media were first desialylated at 37°C for 3 h (0.1 units of neuraminidase/ml media, N3001, Sigma) and the media was dialyzed (twice for 8 h against 1 mM Hepes, pH 7.5, 150 mM NaCl at 4°C), mixed 1:1 in 2× Lac B buffer (20 mM Hepes, pH 7.5, 300 mM NaCl, 2 mM CaCl2/MgCl2/MnCl2/ZnCl2), applied to the PNA column and after extensive washing (40-50 CV 1× Lac B) eluted with 0.5 M galactose in Lac B (4 × 1 ml). Elutions were analyzed by Western blotting using an antibody detecting the extracellular domain of LDLR (epitope within the β-propeller domain, ab30532, Abcam) and biotinylated VVA (B-1235, Vector Laboratories) or PNA (B-1075, Vector Laboratories). Ion exchange chromatography was performed using an anion-exchange MonoQ column (5/50 GL, GE Healthcare) mounted on an Äkta FPLC interfaced by UNICORN 4.12 control software for further purification of the lectin-chromatography elutions. All 4 lectin chromatography elutions from each experiment were pooled and centrifuged at 14,000 × g, 4°C, 10 min before loading on to a MonoQ column pre-equilibrated in buffer A (25 mM BisTris, pH 6.5, 10 mM NaCl) using the sample pump with a flow of 1 ml/min. After 15 CV of wash with buffer A, elution with a linear salt gradient from 10 mM to 1 M NaCl in 25 mM BisTris, pH 6.5, was performed. LDLR-positive fractions (as determined by Western blot analysis) were pooled and further purified by reverse-phase HPLC using a Jupiter C4 column (5 µm, 300 Å, column 250 × 4.6 mm) (Phenomenex) with manual peak collection (the LDLR-positive peak was detected by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) or Western blot analysis). Purified proteins were semi-quantified by SDS-PAGE Coomassie staining.
SDS-PAGE Western Blotting and Deglycosylation-Samples were run on NuPAGE Novex BisTris 4-12% gels and blotted onto nitrocellulose membrane for 90 min (0.45 µm, Bio-Rad), blocked in 5% skimmed milk in TBS-T or (for VVA-B blots) in 1% polyvinylpyrrolidone (40 kDa, P0930 Sigma), washed 3 times in TBS-T prior to overnight incubation at 4°C with primary antibody in TBS-T with 1% BSA (or lectin in 1% polyvinylpyrrolidone). Blots were developed with ECL (32106, Pierce ThermoScientific) after incubation with HRP-conjugated secondary antibodies (Dako) for 1 h at room temperature. For deglycosylation studies 1 µg of purified LDLR was incubated with 0.5 unit of PNGase F (Roche Applied Science), 1 µg of α-GalNAcase (24), in combination or alone in 10 mM NaPO4, 50 mM NaCl, 0.25 mg/ml of BSA, pH 6.8, at room temperature for 4 h and overnight. Samples were analyzed by Western blotting.
Mass Spectrometric Analysis of Shed LDLR-Approximately 5-7 µg of shed LDLR was reduced (5 mM Tris(2-carboxyethyl)phosphine in 50 mM ammonium bicarbonate for 45 min at 60°C) and alkylated (10 mM iodoacetamide in 50 mM ammonium bicarbonate at room temperature for 45 min, then 5 mM DTT added to terminate alkylation); after removal of N-glycans by overnight PNGase F digestion the protein was subjected to proteolysis (chymotrypsin followed by either endoproteinase GluC or endoproteinase AspN, all from Roche). For N-glycan analysis, reduction, alkylation, and PNGase F digest were done in buffers containing H₂¹⁸O (Cambridge Isotope Laboratories). The products were desalted using a standard C18 Zip-Tip procedure and analyzed by rpLC-MS using a system comprised of an EASY-nLC II interfaced via nanoSpray Flex ion source to an LTQ-Orbitrap XL ETD mass spectrometer (ThermoFisher Scientific); in some cases a system comprised of an EASY-nLC-1000 interfaced to an Orbitrap Fusion was employed; LC solvent gradient and MS acquisition parameters were similar to those previously published (20,21). MS1, HCD-MS2, and ETD-MS2 spectra were all acquired in the Orbitrap sector; nominal resolution settings for MS1 were 30,000 (XL) or 60,000 (Fusion); for HCD-MS2, 15,000; for ETD-MS2, 15,000 (XL) or 30,000 (Fusion). MS data were searched against the human LDLR protein sequence (P01130) using Proteome Discoverer 1.4 software (ThermoFisher Scientific), assisted by manual inspection of spectra as described elsewhere (19-21). Carbamidomethylation was entered as static modification; dynamic modifications used were deamidation (Asn, Gln, Arg), oxidation (Met), and O-glycosylation (HexNAc for HEK293 SC samples; HexHexNAc or HexNAc for CHO WT samples). Mass tolerances were 15 ppm for precursors, 0.05 Da for fragments. To properly match precursors to the mainly deglycosylated fragments produced in HCD-MS2 mode, HCD spectra were preprocessed using an algorithm that subtracts the exact masses of mono- and/or disaccharide residues from the precursor masses (HexNAc for HEK293 SC samples; HexHexNAc or HexNAc for CHO WT samples); separate .mgf files were generated from 1, 2, 3, and 4× subtractions, and these were submitted along with .raw files to processing by Proteome Discoverer, as described previously (19-21). To facilitate processing, the subtracted spectral files are arbitrarily assigned the MS2 mode designation "CID", and the actual numbers of residues subtracted are appended to each file name. Identifications made from these files are thus listed as CID in the Method columns of supplemental Table S1.
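The precursor-mass subtraction described above can be illustrated with a short sketch. This is not the preprocessing code used in the study: the function names are hypothetical, and the residue masses are standard monoisotopic values (HexNAc 203.07937 Da, Hex 162.05282 Da) rather than values taken from the text. The sketch assumes the charge state is kept unchanged when the glycan mass is removed.

```python
PROTON = 1.007276    # mass of a proton, Da
HEXNAC = 203.07937   # monoisotopic residue mass of HexNAc, Da
HEX = 162.05282      # monoisotopic residue mass of Hex, Da
HEXHEXNAC = HEX + HEXNAC


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of a precursor from its m/z and charge."""
    return (mz - PROTON) * charge


def subtracted_precursors(mz: float, charge: int, glycan_mass: float,
                          max_units: int = 4) -> dict:
    """Precursor m/z after removing 1..max_units glycan residues.

    Mirrors the 1, 2, 3, and 4x subtractions in the text; the charge
    state is assumed to stay the same (an assumption of this sketch).
    """
    mass = neutral_mass(mz, charge)
    return {n: (mass - n * glycan_mass) / charge + PROTON
            for n in range(1, max_units + 1)}


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float,
                     tol_ppm: float = 15.0) -> bool:
    """True if the observed mass matches within tol_ppm (15 ppm in the text)."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Hypothetical example: a 1500.0000 Da peptide carrying two HexNAc, charge 2+.
example_mz = (1500.0 + 2 * HEXNAC) / 2 + PROTON
for n, mz in subtracted_precursors(example_mz, 2, HEXNAC).items():
    print(n, round(mz, 4), within_tolerance(neutral_mass(mz, 2), 1500.0))
```

In this toy case only the 2× subtraction brings the precursor back to the naked peptide mass, which is the situation in which a standard search can match the deglycosylated HCD fragments.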
In Vitro GalNAc-T Enzyme Analysis-A series of 20-mer synthetic peptides (NeoBioSci) were designed covering the linker regions between all the LDLR class A repeats of LDLR and selected linker regions from VLDLR, LRP1, LRP8, and SORLA (Table 1). Recombinant glycosyltransferases were expressed as soluble secreted truncated proteins in insect cells and purified as described previously (25). In vitro glycosylation of peptides was performed in product development assays in 25 µl of 25 mM cacodylic acid sodium, pH 7.4, 10 mM MnCl2, 0.25% Triton X-100, 4 mM UDP-GalNAc (Sigma), 10 µg of acceptor peptides, and 0.1 µg of purified enzyme. All reactions were incubated at 37°C and product development was evaluated by MALDI-TOF-MS in a time course as previously described (26) except that a Bruker Autoflex MALDI-TOF instrument with accompanying Compass 1.4 FlexSeries software was used for product evaluation.
An expression construct encoding the first four LDLR class A repeats (amino acids 23-189) of LDLR (EPB41) was synthesized and ligated into pET-28a kanamycin vector (Novagen) by Genewiz. Transformation by electroporation was done in T7 Express LysY competent Escherichia coli cells (New England Biolabs) and expressed protein was nickel-purified using nickel-nitrilotriacetic acid-agarose (Invitrogen) and reverse-phase HPLC using a Jupiter C4 column (5 µm, 300 Å, column 250 × 4.6 mm). In vitro glycosylation of EPB41 dissolved in water was performed and product development was evaluated as described above with few modifications: 20 µg of pure reporter protein was incubated with 0.2 µg of purified enzyme in 25 mM MES, pH 6.0, 10 mM MnCl2, 10 mM CaCl2, 0.3 M NaCl, 0.25% Triton X-100, and 4 mM UDP-GalNAc in 50-µl product development assays.

RESULTS
Expression of LDLR in CHO and HEK293-We previously identified a single O-glycopeptide in the LDLR covering the short linker region between LDLR class A repeats 2-3 in secreted LDLR using the SimpleCell approach (19,20). The glycopeptide was found in the secretomes of IMR32, HaCaT, and MCF7 SCs and in total cell lysates from Capan1 and MDA231 SCs. The glycopeptide covered a Thr positioned −1 to the first Cys residue in the third LDLR class A repeat, although the actual O-glycosite could not be unambiguously defined. We also identified glycopeptides from the same region in several LDLR-related receptors, and in most cases the glycosites were identified as the −1 Thr residue in the linker sequence motif between two LDLR class A repeats (XX-C6XXXTC1-XX) (19,20).
The SimpleCell O-glycoproteomic strategy is a shotgun approach that may provide limited coverage for individual O-glycoproteins, so to further characterize O-glycosylation of LDLR, we stably expressed a full coding human LDLR construct in CHO WT and HEK293 SC cell lines. Shed ectodomain of LDLR was essentially quantitatively isolated from HEK293 SCs using VVA lectin chromatography as previously described (27) and further purified by ion exchange and RP-HPLC with a yield of ~40 µg from 500 ml of growth media. We also isolated shed LDLR from WT CHO cells using PNA lectin chromatography followed by ion exchange and RP-HPLC, and obtained a yield of ~80 µg from 35 ml of growth media.
Characterization of Shed LDLR-Preliminary analysis of glycosylation was performed with isolated LDLR derived from HEK293 SCs using α-GalNAcase and PNGase F digestion and SDS-PAGE Western blotting (Fig. 1A). Purified shed LDLR migrated at ~97 kDa. Digestion with α-GalNAcase enhanced migration by 1-2 kDa suggesting a loss of 5-10 O-GalNAc residues, whereas PNGase F enhanced migration by 3-5 kDa suggesting the presence of 1-2 N-glycans. Digestion with both enzymes enhanced migration by 5-7 kDa. Efficiency of deglycosylation with α-GalNAcase was evaluated by VVA-biotin blotting, which demonstrated near complete loss of reactivity after overnight incubation. No change in band thickness was observed upon α-GalNAcase deglycosylation, consistent with a lack of variability in numbers or positions of O-glycosites in the shed LDLR prior to digestion.
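The estimate of 5-10 O-GalNAc residues from a 1-2 kDa shift follows from simple arithmetic, sketched below. The residue mass of about 203.08 Da for GalNAc is a standard value and not taken from the text, and the calculation assumes that the SDS-PAGE mobility shift equals the mass removed, which is only approximately true for glycoproteins.

```python
GALNAC_RESIDUE_DA = 203.08  # approximate mass of one GalNAc residue, Da


def residues_from_shift(shift_kda: float) -> float:
    """Number of GalNAc residues whose combined mass equals the gel shift."""
    return shift_kda * 1000.0 / GALNAC_RESIDUE_DA


# The 1-2 kDa range is the shift reported for alpha-GalNAcase digestion.
low = residues_from_shift(1.0)
high = residues_from_shift(2.0)
print(f"{low:.1f} to {high:.1f} residues, i.e. about {round(low)}-{round(high)}")
```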
O-Glycosites on the purified shed LDLR from HEK293 SCs and WT CHO cells were determined by LC-MS analysis of a protease digest following PNGase F removal of N-glycans from the protein (supplemental Table S1). On LDLR from HEK293 SCs three sites were unambiguously identified (from ETD-MS2 spectra) at positions Thr67, Thr147, and Thr235 located at position −1 to the first Cys residue in linker regions between repeats 1-2, 3-4, and 5-6 within the sequence motif XX-C6XXXTC1-XX (Fig. 1B). The characterization process is illustrated in Fig. 2 for one peptide that includes the O-glycosite at Thr67; the peptide sequence is clearly identified from a virtually complete set of bₙ and yₙ fragments in an HCD-MS2 for which the mass of one HexNAc has been subtracted from the precursor ("CID-MS2" as described under "Experimental Procedures") to facilitate automated assignment (Fig. 2A, see also supplemental Fig. S1). These assignments are defined as "ambiguous" when more than one potential O-glycan acceptor site (i.e. Thr or Ser) is present in the glycopeptide. An ETD-MS2 spectrum from a virtually identical precursor (unsubtracted) provides sufficient coverage by cₙ and zₙ fragments to unambiguously identify the O-glycosite at Thr67 of the same peptide (Fig. 2B, supplemental Fig. S2). The linker region between repeats 2-3, in which we previously ambiguously assigned a glycosite to Thr108 (19,20) in the XX-C6XXXTC1-XX sequence, was well covered by glycopeptides, but the specific site could still not be unambiguously determined. Fig. 1B schematically illustrates the coverage achieved with LDLR derived from HEK293 SCs where all peptides covering the linker regions between LDLR class A repeats of the ligand-binding domain are shown to the left, including defined and partially defined glycosites as well as nonglycosylated (naked) peptides (identified from unsubtracted HCD-MS2 spectra as in, e.g. Fig. 2C). For the HEK293 SCs sample, a minimum of 10 peptides covering each linker region were characterized, and importantly we identified a mixture of peptides and glycopeptides covering each linker, suggesting that glycosylation at each site is not complete (supplemental Figs. S1-S54). In the juxtamembrane region of the LDLR a total of 13 glycosites were identified on LDLR from HEK293 SCs (Fig. 1B). Glycosites previously determined from SimpleCell glycoproteomes corresponded closely except for two peptide stretches at the most C-terminal region (19-21).
We also analyzed LDLR expressed in WT CHO cells and identified the core 1 (Gal-GalNAc) structure in all four linker regions where we found glycosites in SC-produced LDLR (supplemental Figs. S55-S86), as well as four sites in the juxtamembrane region (Fig. 1B). However, in most cases the glycosites could not be unambiguously defined except for Thr235 (supplemental Figs. S59-S73). A minimum of two peptides covering each linker region were determined and the 18 most N-terminal amino acids of the juxtamembrane region were covered by peptides with up to four Gal-GalNAc modifications. In a few cases, the same peptides were identified with both GalNAc and Gal-GalNAc modifications (supplemental Table S1). Interestingly, whereas we identified unglycosylated peptides in the linker regions of LDLR derived from HEK293 SCs as well as WT CHO cells, we did not find unglycosylated peptides in the juxtamembrane region, suggesting that O-glycosylation of the linker regions is less efficient than that of the stem region or that it may be regulated.
Our data also point to the regulated LDLR-shedding site in HEK293 cells. Thus, peptide coverage in the juxtamembrane domain suggests that C-terminal shedding of LDLR in HEK293 SCs must occur within a 13-amino acid stretch immediately before the predicted transmembrane domain (Fig. 1B).
LC-MS analysis of N-glycan occupancy was also carried out on the shed LDLR from HEK293 SC and WT CHO. Human LDLR has five potential N-glycosylation consensus sites (⁹⁷NGSD, ¹⁵⁶NSST, ²⁷²NVTL, ⁵¹⁵NGSK, and ⁶⁵⁷NLTQ), of which the first three are located in the LDLR class A repeat region with two sites (Asn97 and Asn156) in class A repeat folds and one site (Asn272) in the linker region between repeats 6 and 7 (Fig. 1B). PNGase F digestion of the shed LDLR only enhanced mobility by 3-5 kDa (Fig. 1A), suggesting the presence of just one or two N-glycans, which is in agreement with previous reports suggesting that LDLR from human fibroblasts carries 1-2 N-glycans and 9-18 O-glycans (28). To clarify this, LC-MS analysis of digests of shed LDLR from HEK293 SC and CHO WT carried out in heavy water was performed (Fig. 1B, supplemental Figs. S102-S106). This provided evidence for N-glycosylation at Asn156, Asn272, and Asn657, but not at the two other sites. Evidence for glycosylation at the Asn156 site was found with LDLR isolated from HEK293 SC, at the Asn272 site with LDLR from CHO WT, and at the Asn657 site with LDLR from both HEK293 SC and CHO WT. N-Glycosylation at Asn657 is in agreement with previous reports (29-31). We did not obtain sequence coverage for the Asn156 site with LDLR isolated from CHO WT or for the Asn272 site with LDLR from HEK293 SC. The data suggest that up to three N-glycans can be found on LDLR, but we only identified two N-glycosylation sites in LDLR isolated from either cell source tested. The Asn97 and Asn272 sites are evolutionarily conserved in frog, mouse, hamster, and man (Fig. 3), whereas the Asn156 site is only conserved between mouse and man and the Asn657 site only between hamster and man. The Asn515 site is found in man, but not in mouse, hamster, or frog.
O-Glycosites in Linker Regions of LDLR Class A Repeats Are Conserved-Fig. 3 demonstrates that the XX-C6XXXTC1-XX sequence motif is well conserved down to Xenopus laevis. The characteristic extended linker sequence between LDLR class A repeats 4-5 has several Ser/Thr residues, but the positions of these are not conserved, and we did not identify any predicted glycosylation sites in the extended linkers from LDLR or the other related short receptors VLDLR and LRP8 (Fig. 4). The LDLR itself is the defining receptor in the LDLR family. The O-glycoproteomes from 12 human cancer cell lines (19-21) identified a number of glycosites on LDLR-related receptors (Fig. 4). Further surveying these receptors for the sequence motif XX-C6XXXTC1-XX provided additional potential glycosites, demonstrating that all LDLR family members are predicted to have O-glycans in their ligand-binding domains. Including the sites identified on LDLR itself, we identified a total of 21 O-glycosites in the sequence motif. A total of 41 sequence motifs exist in the LDLR family of proteins. In contrast to the short LDLR-related receptors, we identified nine O-glycosites in extended linker regions from other receptors (Fig. 4), but given the lack of sequence conservation in these proteins we are unable to predict the distribution of O-glycosylation in this region.
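Surveying a receptor sequence for the linker motif can be sketched as a simple pattern search. The sketch below reads the motif literally as Cys, three arbitrary residues, Thr, Cys (the sixth cysteine of one repeat, a four-residue linker ending in Thr, and the first cysteine of the next repeat). This literal reading, the function name, and the toy sequence are assumptions for illustration; the sequence is not a real LDLR sequence, and a match is only a candidate site, not evidence of glycosylation.

```python
import re

# Cys, any three residues, Thr, Cys. The lookahead makes matches
# zero-width so that overlapping candidates are all reported.
LINKER_MOTIF = re.compile(r"(?=C[A-Z]{3}TC)")


def candidate_linker_thr(sequence: str) -> list:
    """1-based positions of Thr at -1 of the second Cys in each motif hit."""
    # m.start() is the 0-based index of the first Cys; the Thr sits
    # four residues later, so its 1-based position is start + 5.
    return [m.start() + 5 for m in LINKER_MOTIF.finditer(sequence.upper())]


# Toy sequence (hypothetical, for illustration only).
toy = "MKCAAATCGGSDE"
print(candidate_linker_thr(toy))
```

A real survey would additionally have to confirm that the two cysteines belong to adjacent class A repeats, for example from domain annotations, since the pattern alone can also match within a repeat.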
In Vitro Enzyme Analysis to Probe GalNAc-T Isoform Glycosylation of Linker Regions-Initiation of mucin-type O-glycosylation is catalyzed by up to 20 isoforms of GalNAc-Ts (32,33). We tested 10 GalNAc-T isoforms by in vitro enzyme assays using time course MALDI-TOF-MS analysis with a panel of 20-mer synthetic peptides derived from LDLR and related receptors (Fig. 5, Table 1). Whereas all recombinant purified GalNAc-Ts were active with control substrates, we found that most peptides covering the linker regions between LDLR class A repeats did not serve as acceptor substrates. A few exceptions included the peptide covering the linker region between LDLR class A repeats 5-6 in LDLR, which was efficiently glycosylated by several GalNAc-T isoforms (Fig. 5), with the GalNAc-T11 isoform being the most efficient. Similarly, one linker region in VLDLR (repeat 3-4) and two in LRP1 (first cluster repeat 2-3 and second cluster repeat 8-9) were substrates for several isoforms including GalNAc-T11 (Table 1). GalNAc-T16 was particularly active with the LRP1-derived substrates (Table 1). One common feature of all linker region peptides serving as substrates for GalNAc-T11 was the presence of Pro at +3, although three peptides with a +3 Pro residue did not serve as substrates.
In vitro O-glycosylation assays in the past have been shown to correlate fairly well with in vivo functions of GalNAc-Ts (32), but most of the substrates tested were derived from more unstructured regions of proteins. Given that the acceptor sites in the short linker regions must be influenced by the adjacent LDLR class A repeats, we tested an E. coli expressed reporter protein containing the first four LDLR class A repeats of LDLR as a substrate for four GalNAc-T isoforms (Fig. 6). Interestingly, whereas all GalNAc-Ts were active with control peptides, only GalNAc-T11 appeared to incorporate GalNAc residues into the reporter protein as detectable by VVA lectin blotting (Fig. 6).
To further explore the role of GalNAc-T11 in glycosylation of LDLR, we developed HEK293 SCs with knock-out of GALNT11 using zinc finger nuclease targeting. We have previously shown that HEK293 cells express GalNAc-T1, -T2, -T3, -T11, -T12, and -T16 as evaluated by immunocytochemistry using a panel of monoclonal antibodies (19). Analysis of expressed recombinant shed LDLR from HEK293 SCs lacking GalNAc-T11 showed loss of O-glycosylation in three of the four identified glycosites (Thr67, Thr108, and Thr147) in LDLR class A repeat linker regions of LDLR (Fig. 7; for spectral evidence, see Fig. 2D, supplemental Figs. S87-S101). The only glycosite still identified in these cells was Thr235 (supplemental Figs. S95 and S96), which correlated well with in vitro analyses showing that only the peptide covering this site was efficiently used by several GalNAc-T isoforms (Fig. 5). We also identified glycosites in the juxtamembrane region, although only the most N-terminal part of this region was covered in the analysis. The juxtamembrane region of LDLR contains clustered O-glycosites similar to those found in mucins and these sequences are generally substrates for multiple GalNAc-T isoforms.

DISCUSSION
O-Glycosylation in the juxtamembrane region of LDLR was originally demonstrated to be important for cell surface stability of the receptor, and this has served as a paradigm for the protective role of O-glycosylation in stem regions of cell membrane proteins (12,13). However, early work indicating the presence of additional O-glycosylation in the N-terminal ligand-binding domain of LDLR and a functional role of these has largely been ignored (15). Herein, we performed a thorough analysis of the ectodomain of recombinant shed LDLR and demonstrate that most of the linker regions between the LDLR class A repeats in the ligand-binding domain contain a Thr O-glycosite in the XX-C6XXXTC1-XX sequence motif. These O-glycosites are evolutionarily conserved in LDLR, and they are also found or predicted in many linker regions of six other LDLR-related receptors. We demonstrate that these O-glycans are elongated to sialylated core 1 structures in CHO wild-type cells and thus potentially to larger structures in other cell types with more complex O-glycosylation patterns. Finally, we provide evidence that the GalNAc-T11 isoform plays a major role in glycosylation of most of the linker regions in LDLR.
Identification of O-glycoproteins and the specific sites of attachment of O-glycans on a proteome-wide level has long been hampered by technical difficulties mainly associated with lack of universal enzymes to release glycans and label attachment, in combination with the heterogeneity and lability of O-glycan structures (20). Recently, several novel strategies relying on various methods for enrichment of O-glycopeptides in digests have been introduced (20,34,35), and these are providing insights into novel O-glycoproteins and in particular O-glycosites in regions of proteins not found before. We have used the SimpleCell strategy that relies on genetically engineered cell lines producing truncated homogeneous O-glycans (19-21). This strategy enables simple lectin enrichment of the truncated GalNAc O-glycopeptides and high throughput mass spectrometric detection of glycosites, but information about the structures found at these sites in normal cells is lost. Furthermore, the mass spectrometry strategy is in principle a "shotgun" method with limited coverage of individual O-glycoproteins and without information on stoichiometry of O-glycan sites, and it therefore mainly serves as a discovery platform for O-glycoproteins to be selected for further in-depth analysis, as performed in the present study with LDLR. As illustrated in Fig. 1, our shotgun SimpleCell approach identified many glycosites in the juxtamembrane region of LDLR, and these overlapped very well with those identified here on the recombinantly expressed shed LDLR ectodomain. In contrast, the SimpleCell shotgun approach did not cover the LDLR class A repeat domain very well. By careful selection of protease digestion strategy and direct analysis of recombinantly expressed purified LDLR, we were able to identify O-glycosites in all LDLR class A repeat linker regions with a conserved sequence motif as well as additional glycosites in the juxtamembrane region.
Early studies suggested that a mutant LDLR construct without the juxtamembrane region retained some O-glycosylation when expressed in wild-type CHO cells (15) and this construct would exclude all the identified O-glycosites in the stem region (Fig. 1). It is likely that the identified glycosites in the linker regions represent this O-glycosylation in the N-terminal part of the mutant construct. Interestingly, it was demonstrated that the mutant LDLR construct, in contrast to the wild-type one, was not O-glycosylated when expressed in the CHO MonR31 mutant cell line selected for monensin resistance (16). In these studies, it was suggested that one-third of the O-glycans on LDLR were present in the N-terminal region (16). Compared with the present data, this suggests that the average LDLR molecule has 3-4 O-glycans in the ligand-binding domain and 6-8 O-glycans in the stem region. Unfortunately, to our knowledge this cell line is no longer available for studies, but using this cell line it was demonstrated that loss of O-glycosylation in the N-terminal region resulted in reduced LDLR ligand binding and uptake (17,18). Interestingly, the O-glycan occupancy in the linker region appeared to be incomplete compared with the juxtamembrane region, which given the apparent conserved nature of these glycans may suggest that they are regulated.
FIGURE 6. In vitro GalNAc-T glycosylation of an E. coli-expressed reporter encoding four class A repeats (1-4) of LDLR. The purified reporter construct was incubated with recombinant GalNAc-Ts in glycosylation reactions overnight followed by SDS-PAGE Western blot analysis to probe incorporation of GalNAc by reactivity with VVA. Control staining with anti-HIS was included to evaluate protein load and potential degradation. Ctrl, reaction without GalNAc-T.
O-Glycosylation of the juxtamembrane region is clearly required for stability of the receptor at the surface (12,13), and it is likely that O-glycosylation of the linker regions plays a role in the function of LDLR, although at present we do not know how. It should be noted that our results do not allow interpretation as to whether individual LDLR molecules lack O-glycosylation in all LDLR class A repeats, or whether there is partial O-glycosylation in different linker regions of all LDLR molecules. The lack of change in band thickness in the α-GalNAcase digestion experiment (Fig. 1A) is consistent with the latter option.
The analysis of N-glycosylation of LDLR suggested the presence of up to three N-glycans (Asn¹⁵⁶, Asn²⁷², and Asn⁶⁵⁷), and interestingly one site (Asn²⁷²) is highly conserved and placed in the linker region between class A repeats 6-7, where we did not identify O-glycosylation (Fig. 3). Thus, the only linker region without glycosylation appears to be the extended linker between repeats 4-5. The Asn²⁷² site (as well as the Asn¹⁵⁶ site) was previously found to be glycosylated in the crystallization studies of the LDLR extracellular domain expressed in insect cells (36). Other studies have reported that the Asn⁶⁵⁷ site carried an N-glycan (29-31), and when compared with the crystal structure, glycosylation at any of these three sites could affect the proposed interaction between the β-propeller domain and LDLR class A repeats 2-5 at endosomal pH (36).
Initiation of GalNAc-type O-glycosylation is catalyzed by up to 20 GalNAc transferase isoforms (32). Previously, in vitro glycosylation of 20-mer peptides has been the classical approach to analyze GalNAc-T isoform substrate specificity, but peptides covering the linker regions between LDLR class A repeats of LDLR and several other receptors in the family did not serve as efficient substrates for the GalNAc-Ts tested here (Fig. 5, Table 1). The seven (of 20) peptides that did serve as substrates for one or more GalNAc-Ts all contained Pro residues in close vicinity of the acceptor site, in agreement with past experience (37-39). We also recently demonstrated that only ~50% of peptides designed to cover known human O-glycosites serve as efficient substrates for a large number of recombinant GalNAc-Ts in vitro (19). Thus, to further explore how the linker regions are glycosylated, we used an E. coli-expressed reporter protein encoding the first four LDLR class A repeats of LDLR as substrate for in vitro glycosylation, assuming that this construct would better present structural constraints imposed by the repeats. We experienced problems keeping the construct in solution in buffers and pH compatible with activity of the GalNAc-Ts, but under the conditions used we found that only GalNAc-T11 appeared to utilize this substrate (Fig. 6). This was verified by expression of LDLR in zinc finger nuclease-engineered HEK293 cells without GALNT11, where GalNAc-T11 seemed to be essential for the glycosylation of most of the LDLR class A repeat linker regions (Fig. 7). To our knowledge such examples where folding of protein substrates is required for demonstration of activity of GalNAc-Ts in in vitro assays have not previously been reported, but this is well established for enzymes initiating O-Fuc and O-Glc glycosylation (40).
This finding may add to the difficulties in defining clear consensus acceptor sequence motifs for the GalNAc-T isoforms (37-39), as well as universal sequence predictors for O-glycosylation (19,41).
Our findings may also shed light on the CHO MonR31 mutant cell line and its apparent selective loss of capacity for O-glycosylation of the N-terminal region of LDLR (16). Total loss of capacity for O-glycosylation was originally generated by loss of the UDP-Glc/GlcNAc C4-epimerase in the CHO ldlD mutant (12,13), but selective loss of O-glycosylation of some but not all glycosites in one protein like LDLR may only occur by loss of individual GalNAc-T isoforms. Given the results with our mutant HEK293 cells without GalNAc-T11 it is likely that the MonR31 mutant cell line was deficient in GalNAc-T11, which has been shown to be expressed in wild-type CHO cells (42). Unfortunately, it is no longer possible to test this hypothesis, but the original finding that the MonR31 mutant was selected for resistance to monensin is intriguing.
Deciphering the isoform-specific functions of GalNAc-Ts has long been hampered by the functional redundancy of the up to 20 human GalNAc-Ts (32). Knock-out animal models show subtle phenotypes and deciphering molecular mechanisms is not straightforward (43-45). Gene association studies have pointed to a number of potentially important functions of specific GalNAc-T isoforms (32,46), and deficiency in one GalNAc-T gene, GALNT3, causes familial tumoral calcinosis associated with hyperphosphatemia (47). The underlying molecular mechanism was demonstrated to be loss of site-specific O-glycosylation in an inactivating proprotein processing site in FGF23 (48). Perhaps a related mechanism may explain why the GALNT2 gene has been associated with HDL cholesterol and triglyceride levels in a number of genome-wide association studies (49). Moreover, the GALNT11 gene was suggested to play a role in the severe congenital heart disease heterotaxy, where a patient with mono-allelic deficiency was identified (50), and recent studies suggest that the GalNAc-T11 isoform regulates Notch receptor signaling through site-specific O-glycosylation (51). The finding here that GalNAc-T11 also appears to play a pivotal role in site-specific O-glycosylation of the LDLR class A repeat linker regions of LDLR suggests that this isoform has multiple distinct functions. It is important to note that many GalNAc-T isoforms can be grouped into close homologs often with highly similar kinetic properties, but that GalNAc-T11 appears to be unique and groups in one subfamily (If), which only appears to include one other gene tentatively designated GALNT20 (32). This gene is, however, likely a pseudogene, and enzymatic activity of the encoded protein, which lacks the C-terminal lectin domain found on all other GalNAc-Ts, has not been demonstrated.
The orthologous gene in Drosophila, l(2)35Aa, encodes the dGalNAc-T1 (pgant35A) enzyme with essentially identical enzymatic properties to the human GalNAc-T11 enzyme (26). Drosophila has 11 GalNAc-T genes and l(2)35Aa is essential for fly development (26,52,53), although specific mechanism(s) have not been deciphered. Importantly, l(2)35Aa mutants do not show Notch phenotypes and it has been shown that dGalNAc-T1 (pgant35A) is required for proper epithelial tube formation in Drosophila (54). Drosophila expresses a distinct group of LDL receptors, lipophorin 1 and 2, which structurally most closely resemble LDLR, VLDLR, and LRP8, and several splice forms with either seven or eight LDLR class A repeats exist (55,56). Lipophorin 1 (isoform D) and lipophorin 2 (isoform F) both have seven class A repeats, whereof two linker regions (repeats 1-2 and 5-6) and one linker region (repeats 2-3), respectively, have the sequence motif XX-C₆XXXTC₁-XX, suggesting conservation of potential O-glycosylation. We have not tested whether potential O-glycosylation of these linkers is mediated by dGalNAc-T1/GalNAc-T11 or other isoforms. Although we show that GalNAc-T11 is important for O-glycosylation of human LDLR in HEK293 cells, it is possible that this will not be the case in other cell types with different GalNAc-T repertoires. The availability of precise gene editing technologies such as zinc finger nucleases to produce isogenic cell systems with defined differences in the GalNAc-T repertoire opens the field for molecular dissection of non-redundant functions of individual isoforms and specific biological functions as we recently demonstrated (27). The established isogenic HEK293 cell pairs with and without GALNT11 thus provide an excellent system to study all non-redundant functions of this isoform in future studies.
As discussed, the role of the O-glycans in the ligand-binding domains of the LDLR family is still unclear. It is known that LDLR and its related receptors are shuttled through the secretory pathway bound to RAP until reaching the Golgi apparatus, where dissociation occurs (8). With the Golgi being the site for initiation of GalNAc-type O-glycosylation, it is possible that O-glycosylation plays a role in the interaction with RAP. RAP binding to LDLR and its related receptors occurs via the LDLR class A repeats overlapping several of the O-glycan sequence motifs described here (9,10). Curiously, RAP itself is also O-glycosylated at several sites (19), two of which are in the RAP-D3 domain, the major domain participating in LDLR binding (57). LDLR is furthermore regulated by proprotein convertase subtilisin/kexin type 9 (PCSK9), which binds the EGF A domain of the LDLR (58). Addition of PCSK9 to cultured hepatocytes promotes degradation of WT LDLR and LDLR lacking up to four class A repeats, the EGF B domain, or the stem region with the clustered O-glycans. In contrast, LDLR lacking five or more class A repeats, the entire ligand-binding domain, or the β-propeller domain failed to be degraded, although they bound and internalized PCSK9 (59).
It is well established that glycoproteins can play a crucial role in viral entry during infection (60), and it has been demonstrated that the LDLR is the cellular receptor for vesicular stomatitis virus (61) and hepatitis C virus (62). In particular, it was shown that the soluble ligand-binding domain of LDLR can competitively inhibit vesicular stomatitis virus and hepatitis C virus infection (61,62). To our knowledge, the potential role of glycosylation has not been addressed, but the cell systems described in this study should now enable such studies.
In summary, we demonstrate that the linker regions of LDLR class A repeats of LDLR and related receptors contain highly conserved O-glycans that appear to be at least partly controlled by the GalNAc-T11 isoform. Past studies suggest that these O-glycans are important for the function of LDLR and we have now developed isogenic cell systems enabling more detailed dissection of the functional role and molecular mechanism.