Proteomics-based screening of the endothelial heparan sulfate interactome reveals that C-type lectin 14a (CLEC14A) is a heparin-binding protein

Animal cells express heparan sulfate proteoglycans that perform many important cellular functions by way of heparan sulfate–protein interactions. The identification of membrane heparan sulfate–binding proteins is challenging because of their low abundance and the need for extensive enrichment. Here, we report a proteomics workflow for the identification and characterization of membrane-anchored and extracellular proteins that bind heparan sulfate. The technique is based on limited proteolysis of live cells in the absence of denaturation and fixation, heparin-affinity chromatography, and high-resolution LC-MS/MS, and we designate it LPHAMS. Application of LPHAMS to U937 monocytic and primary murine and human endothelial cells identified 55 plasma membrane, extracellular matrix, and soluble secreted proteins, including many previously unidentified heparin-binding proteins. The method also facilitated the mapping of the heparin-binding domains, making it possible to predict the location of the heparin-binding site. To validate the discovery feature of LPHAMS, we characterized one of the newly-discovered heparin-binding proteins, C-type lectin 14a (CLEC14A), a member of the C-type lectin family that modulates angiogenesis. We found that the C-type lectin domain of CLEC14A binds one-to-one to heparin with nanomolar affinity, and using molecular modeling and mutagenesis, we mapped its heparin-binding site. CLEC14A physically interacted with other glycosaminoglycans, including endothelial heparan sulfate and chondroitin sulfate E, but not with neutral or sialylated oligosaccharides. The LPHAMS technique should be applicable to other cells and glycans and provides a way to expand the repertoire of glycan-binding proteins for further study.

altered angiogenesis in the diaphragm, due to diminished Slit-Robo signaling, and reduced tumor angiogenesis, due to dysregulated vascular endothelial cell growth factor (VEGF) and fibroblast growth factor (FGF) signaling, and altered chemokine and selectin-mediated responses to acute inflammation (6-9).
The size of the endothelial "HS-interactome," i.e. the repertoire of heparan sulfate-binding proteins (HSBPs) on the surface and surrounding extracellular matrix of endothelial cells, is unknown in part due to technical challenges in working with membrane proteins and protein complexes. Previous attempts to elucidate the interactome using live cells and tissue extracts often enriched intracellular proteins, such as DNA- and RNA-binding proteins, which bind ligands that have charge characteristics similar to heparan sulfate. To circumvent these problems, preparative steps involving purification of plasma membranes have been applied to cells and tissues (10). Cell-surface biotinylation strategies coupled with streptavidin enrichment prior to affinity chromatography also have proven useful in studies of cultured cells (10-12). Often, the proteins identified were nuclear or cytoplasmic components. Heparan sulfate-binding membrane proteins were difficult to identify because of their low abundance and the need for detergents or prior enrichment techniques.
In this report, we describe a new simple proteomics strategy to identify plasma membrane and extracellular HSBPs that also permits the simultaneous determination of the binding domains that interact with heparin/heparan sulfate. The workflow combines Limited Proteolysis in the absence of denaturation, Heparin-Affinity chromatography, and high-resolution LC-MS/MS proteomics (LPHAMS). Application of LPHAMS to endothelial cells led to the identification of known HSBPs, including membrane receptors, secreted proteins, and extracellular matrix proteins, along with a set of previously unknown HSBPs. As a validation of LPHAMS, we characterized the heparin-binding properties of CLEC14A, a previously undocumented membrane HSBP involved in angiogenesis. This report lays the groundwork for screening and identifying not only HSBPs and their binding site, but other classes of glycan-binding proteins.

Analysis of heparin-binding proteins on the surface of endothelial cells
Limited proteolysis is a powerful tool to map conformational features of proteins. By using suboptimal conditions for proteolysis (limiting enzyme, reduced temperature, and omission of reducing agents and denaturants), limited cleavage occurs at exposed hinges or loops resulting in the liberation of intact protein domains (13,14). When applied to cells, limited proteolysis can be used to isolate and purify ectodomains of cell-surface transmembrane proteins, their subdomains, and subdomains of extracellular matrix and associated secreted proteins (15). We hypothesized that analysis of these liberated domains by chromatography on heparin-affinity resin would identify potential HSBPs and enrich for HS-binding domains, potentially indicating sites of contact between the protein and the ligand (Fig. 1A).
To establish the feasibility of the approach, we treated confluent monolayers of human umbilical vein endothelial cells (HUVEC) with varying concentrations of proteinase K and chymotrypsin (see under "Experimental procedures"). The extent of proteolysis was monitored by SDS-PAGE and silver staining of released proteolytic fragments. Conditions were empirically adjusted to shift the pattern of bands on the gel from the pattern obtained for samples treated with buffer, but not to the extent that all of the material migrated as low molecular weight peptides. To enrich for HSBPs, we subjected the samples to heparin-affinity chromatography. Heparin is structurally related to HS, although it is more highly sulfated, enriched in iduronic acid, and more highly-negatively charged. Its commercial availability makes it an inexpensive surrogate for HS. Samples obtained after proteolytic digestion or mock digestions with PBS were bound to heparin-Sepharose, and weakly-bound proteins were washed out with low ionic strength buffer (0.3 M NaCl in 20 mM HEPES (pH 7.2)). Strongly-bound proteins were eluted with buffer containing 1 M NaCl. The eluted material was then analyzed by LC-MS/MS. As described above, the overall method was designated LPHAMS.
To assess the feasibility and dynamic range of LPHAMS, HUVEC were first treated with proteinase K or chymotrypsin for 10 min at room temperature (see under "Experimental procedures"). Proteolytic fragments were collected and analyzed by the LPHAMS workflow. Proteomic characterization of the material displaying high affinity to heparin yielded numerous candidate HSBPs at 1% false discovery rate (FDR). All identified proteins were filtered based on the presence of at least two MS/MS counts for each identified peptide. Protein identifications were also filtered based on the presence of signal peptides (membrane and secreted proteins) and subcellular localization deduced via database searches and manual curation of the literature. Using these criteria, a total of 32 cell-surface or extracellular proteins were confidently identified in five independent experiments using proteinase K or chymotrypsin (Table S1). They included known HSBPs, such as thrombospondin 1 (THBS1) (16-18), hedgehog-interacting protein (HHIP) (19), and vascular endothelial growth factor receptor 1 (VEGFR1) (20). Previously unknown HSBPs were also detected, including C-type lectin domain family 14 member A (CLEC14A), tyrosine receptor phosphatase β (PTPRβ), lysyl oxidase-like protein 2 (LOXL2), transmembrane protein 132 (TMEM132A), growth/differentiation factor 15 (GDF15), adhesion G-protein-coupled receptor L2 (ADGRL2), and killer cell immunoglobulin-like receptor 3DL2 (KIR3DL2).
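The filtering criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PeptideHit` record, its field names, and the `surface_or_secreted` annotation set are hypothetical stand-ins for the search-engine output and the curated localization data, and the code adopts one reading of the criteria (a protein is kept if at least one of its peptides passes the 1% FDR and two-MS/MS-count thresholds).

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    # Hypothetical record for one identified peptide.
    protein: str       # protein identifier
    sequence: str      # peptide sequence
    msms_count: int    # number of MS/MS spectra supporting the peptide
    q_value: float     # FDR estimate for the identification


def filter_candidates(hits, surface_or_secreted, max_fdr=0.01, min_msms=2):
    """Return a sorted list of proteins that pass all three filters.

    A protein is kept when it has at least one peptide with
    q_value <= max_fdr and msms_count >= min_msms, and when it is
    annotated as cell-surface or extracellular (assumed to be
    supplied by the caller from database and literature curation).
    """
    kept = set()
    for hit in hits:
        if hit.q_value > max_fdr:
            continue  # fails the false discovery rate threshold
        if hit.msms_count < min_msms:
            continue  # too few supporting MS/MS spectra
        if hit.protein not in surface_or_secreted:
            continue  # intracellular or unannotated protein
        kept.add(hit.protein)
    return sorted(kept)


example_hits = [
    PeptideHit("THBS1", "AAA", 3, 0.005),
    PeptideHit("HIST1", "BBB", 5, 0.001),    # not surface/secreted
    PeptideHit("CLEC14A", "CCC", 1, 0.002),  # only one MS/MS count
    PeptideHit("HHIP", "DDD", 2, 0.02),      # above the FDR threshold
]
annotated = {"THBS1", "CLEC14A", "HHIP"}
candidates = filter_candidates(example_hits, annotated)
```

The peptide sequences and scores in `example_hits` are made up to exercise each filter and do not come from Table S1.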
We performed MS analysis at each step in the purification scheme and aligned identified peptides from known heparin-binding proteins to their respective protein sequence. Connective tissue growth factor (CTGF) is a 349-amino acid protein composed of four domains: an insulin-like growth factor-binding protein; a von Willebrand factor type C domain (VWFC); a thrombospondin type 1 repeat (TSP1); and a cystine-knot domain (CTCK) (Fig. 1B) (21). Peptides identified in the input and flow-through fractions originated from the VWFC domain, whereas peptides identified in the high-salt eluate were aligned to the heparin-binding CTCK domain (Fig. 1B). In the case of HHIP, peptides from the high-salt eluate aligned to the frizzled-like domain and two epidermal growth factor (EGF)-like domains (Fig. 1C), which is consistent with the observation that heparin binding occurs in the N-terminal frizzled-like domain (19). For THBS1, identified peptides from the input and flow-through aligned outside of the heparin-binding domain whereas peptides identified in the 0.3 M wash and high-salt eluate localized to the heparin-binding N-terminal thrombospondin domain (Fig. 1D) (17, 18). These findings show that the partial proteolysis released subdomains of accessible proteins and that fractionation of the released material by heparin-affinity chromatography enriched for subdomains that map to the heparin-binding site.
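The peptide-to-domain alignment described above can be illustrated with a minimal Python sketch. It assumes exact substring matching of each peptide against the primary sequence and a caller-supplied table of domain boundaries; the function names, the toy sequence, and the domain coordinates are hypothetical and are not taken from the study.

```python
def locate_peptide(protein_seq, peptide):
    """Return the 1-based inclusive (start, end) of the first exact
    occurrence of the peptide in the protein, or None if absent."""
    start = protein_seq.find(peptide)
    if start < 0:
        return None
    return (start + 1, start + len(peptide))


def domains_hit(span, domains):
    """List the domains whose residue range overlaps the peptide span.

    domains maps a domain name to a 1-based inclusive (start, end).
    """
    pep_start, pep_end = span
    return [
        name
        for name, (dom_start, dom_end) in domains.items()
        if pep_start <= dom_end and pep_end >= dom_start
    ]


# Toy protein with two made-up domains.
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
toy_domains = {"N-domain": (1, 10), "C-domain": (11, 22)}

span = locate_peptide(toy_sequence, "QISFVK")
hit = domains_hit(span, toy_domains)
```

Repeating this for the peptides recovered in each chromatography fraction (input, flow-through, wash, eluate) would give the kind of per-domain peptide map shown in Fig. 1, B-D.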
To examine the dynamic range of LPHAMS, we plotted the protein counts as a function of their normalized label-free intensities (Fig. 1, E and F). Interestingly, known HSBPs such as THBS1, HHIP, and CTGF displayed high intensities, whereas other low-abundant HSBPs, such as SDF1 and ADAMTS4, occurred at lower intensities. Many of the newly-identified, putative HSBPs also occurred at low intensities, which may explain why previous methods failed to detect them. Compared with chymotrypsin treatment, proteinase K treatment of HUVEC resulted in higher protein recovery based on protein intensities and consequently in more protein identifications (Fig. 1F).

Figure 1. Living cells were subjected to limited proteolysis to liberate exposed domains. These domains were passed through a heparin-Sepharose column to enrich for heparin-binding domains and were subsequently analyzed by MS. B-D, alignments of peptides identified by MS to protein sequence (x axis). The colored bars represent peptides recovered in each step of heparin chromatography as indicated in the key. Domain structures for these three proteins are aligned above the peptide map, and the shaded region enclosed by broken lines indicates the known heparin binding site. E and F, frequency distribution of identified proteins (gray) in samples treated either with proteinase K or chymotrypsin and eluted from the heparin column with 1 M NaCl. Blue bars depict established heparin-binding proteins, and red bars indicate putative heparin-binding proteins.
We next applied LPHAMS to mouse lung microvascular endothelial cells (MLEC), mouse brain microvascular endothelial cells, and U937 histiocytic lymphoma cells, which yielded 9, 12, and 20 HSBPs, respectively (Table 1 and Table S1). Of these proteins, unique HSBPs such as interleukin-1 receptor type 1 (IL1R1), transmembrane channel-like protein 5 (TMC5), and amyloid precursor like protein 2 (APLP2) were identified. Several proteins were shared between cell lines, whereas others were unique as might be expected given the diversity of protein expression across cell types (Table 1). In total, 55 HSBPs were identified, including 30 HSBPs not previously known to bind to heparin. About half of the proteins identified by LPHAMS were secreted soluble proteins or extracellular matrix proteins. Presumably, many of the soluble proteins were present in the extracellular matrix or bound to the cell surface, given that the cells were only gently rinsed with PBS prior to limited proteolysis. Twelve type I, one type II, two polytopic, and one glycosylphosphatidylinositol-anchored membrane proteins were identified in this way (Table 1). Thus, LPHAMS has the capacity to identify a broad range of membrane-associated and extracellular proteins, and the technique can be applied to different cell types.

LPHAMS facilitates mapping of heparin-binding domains in HSBPs
As indicated above, LPHAMS can enrich for protein subdomains that bind to heparin. Alignment of the peptides identified in the mass spectra to primary protein sequences in the UniProt database often mapped to specific subdomains in the known and putative HSBPs (Fig. 2A). For example, peptides derived from thrombospondin-1 (THBS1) were confined to the N-terminal laminin G (LamG)-like domain where the heparin-binding site was previously mapped by heparin-affinity chromatography of proteolytic fragments, molecular docking, and X-ray crystallography (Fig. 2A) (16-18). Several of the proteins (e.g. VEGFR1 (20), HHIP (19), and stromal cell-derived factor 1 (SDF1, CXCL12) (22,23)) showed partial alignment of the recovered peptides with the putative heparin-binding domains. A disintegrin and metalloproteinase with thrombospondin repeats 4 (ADAMTS4) was identified in the screen as well, consistent with the observation that the protein can interact with heparan sulfate (24,25). The location of the heparin-binding site in ADAMTS4 has not been established, but we found enrichment of peptides derived from a zinc-dependent metalloprotease domain, a thrombospondin type 1 repeat, and an ADAMTS spacer region, suggesting its location. In general, recovered peptides in secreted HSBPs aligned well with domains previously shown to bind heparin, for example in CTGF (21,26), hepatic-derived growth factor (27,28), perlecan (HSPG2) (29), and laminin α4 (LAMA4) (30,31). In some reported HSBPs, we did not recover peptides corresponding to the location of the heparin-binding site, for example in the N-terminal domain of FN1, endostatin (the heparin-binding domain in collagen XVIII; COL18A1) (32), and annexin A2 (ANXA2) (33). However, peptides mapping to a heparin-binding site in the C-terminal domain of FN1 were detected (34).
We next inspected peptides derived from previously unidentified HSBPs (Fig. 2B) and examined their position in available crystal structures or in generated molecular models based on related structures to search for patches of positively-charged amino acids fitting the constraints described for heparin-binding domains (Fig. 3, A-H) (1,35,36). For reference, the crystal structures of THBS1 and SDF1 are shown with the electropositive surface previously documented to bind heparin (Fig. 3, A and B) (17,22). Examination of the structures of THBS1 and SDF1, which have been co-crystallized with heparin, showed that peptide sequences retrieved by LPHAMS aligned with the heparin-binding site (17,18,22). For putative heparin-binding proteins, such as growth differentiation factor 15 (GDF15), LOXL2, and Netrin 4 (NTN4), we took advantage of previously existing crystal structures to search for regions of positive charge (37-39). The crystal structure of GDF15 is a dimer containing a large electropositive region (41 Å) spanning 108 amino acids of each monomer (Fig. 3C) (38). LOXL2 had peptides spanning the second to fourth scavenger receptor cysteine-rich (SRCR) domain and the lysyl oxidase-like domain (Fig. 2B). Inspection of the crystal structure (PDB 5ZE3) (39) revealed a large electropositive patch (45 × 22 Å) spanning the dimer interface of the SRCR4 domain, once again highly consistent with a putative heparin-binding site (Fig. 3D). NTN4 is an extracellular protein, consisting of laminin EGF-like domains, that functions in axon guidance (37). Inspection of the crystal structure revealed several linear stretches of positive charge spanning ~25, 28, and 40 Å (Fig. 3E). We modeled the putative heparin-binding sites in the R-type lectin domain of PTPRβ and the C-type lectin domains of CLEC14A using Robetta (Fig. 3, F and G). PTPRβ consists of 1997 amino acids and a single ricin-like fold followed by 17 fibronectin repeats.
All of the peptides recovered by LPHAMS mapped specifically to the N-terminal ricin domain (representing 5.1% of the total protein) (Fig. 2B). An area of positive charge spanning 20 × 20 Å was present in the model (Fig. 3F). In CLEC14A, recovered peptides predominantly localized to the C-type lectin and EGF domains (Fig. 2B). Interestingly, modeling of CLEC14A suggested a patch of positive charge stretching 10 × 30 Å, indicative of a putative heparin-binding site (Fig. 3G). Another C-type lectin 14 family member, CD93, also yielded peptides in its lectin domain that aligned with a putative heparin-binding site (Figs. 2B and 3H).
Some of the identified proteins lacked three-dimensional structures or were not of sufficient homology to previously crystallized proteins to allow molecular modeling. These proteins include transmembrane protein 132A (T132A) and the α1 and α2 chains of type V collagen (COL5A1 and COL5A2) (Fig. 2B). Interestingly, the recovered peptide sequences in type V collagen corresponded to the thrombospondin 1 TSP1/LamG domains and collagen EMF1a (COLF1) protein domains. TSP1 and LamG are protein modules known to interact with heparin (10), whereas COLF1 has not been previously associated with heparin binding.

CLEC14A binds heparan sulfate
To validate LPHAMS as a discovery tool for new HSBPs, we analyzed the glycosaminoglycan-binding properties of CLEC14A, a member of the C-type lectin family 14 (40). CLEC14A plays a role in physiological and pathological angiogenesis, but its identification as a heparin-binding protein and the structure and function of the heparin-binding domain have not been characterized (41,42). CLEC14A is a type I transmembrane protein containing a C-type lectin domain (CTLD), an EGF-module, and an endomucin domain rich in serine and threonine residues (Fig. S1). A 21-amino acid transmembrane peptide connects the ectodomain to a 71-amino acid cytoplasmic tail. C-type lectins in general bind calcium, and many bind glycans through a carbohydrate recognition domain in the CTLD. CLEC14A belongs to a subgroup of C-type lectins that includes CD93, THBS1, and CD248 (endosialin) (43).
CLEC14A contains a C-type lectin fold related to the fold present in E/L/P-selectins, which bind to glycans containing sialyl Lewis X (46). We tested whether CLEC14A could bind to sialyl Lewis X and other classes of acidic and neutral glycans through the Consortium for Functional Glycomics Protein-Glycan Interaction Core using a glycan array covering 609 different glycan structures unrelated to GAGs. CLEC14A-325 did not bind significantly to any of the glycans (Fig. S4). In contrast, surface plasmon resonance (SPR) confirmed that CLEC14A-325 bound to immobilized heparin in a dose-dependent manner (Fig. 4A). Binding of CLEC14A-290, lacking the endomucin domain, was reduced compared with CLEC14A-325 (Fig. 4, B and C). As shown below, CLEC14A-325 behaves as a trimer by size-exclusion chromatography, whereas CLEC14A-290 migrates as a monomer, most likely explaining the difference in binding of the two recombinant proteins to immobilized heparin (Fig. 4, A and B). To assess binding of different classes of GAGs, we immobilized recombinant CLEC14A-325 onto an SPR chip and tested different GAGs as the analyte (Fig. 4D and Fig. S5). Under these conditions, heparin exhibited binding characteristics similar to those obtained when heparin was immobilized on the chip. Chondroitin sulfate E, dermatan sulfate, and HS derived from Chinese hamster ovary cells (rHS01) also bound (Fig. 4D), whereas no binding was detected with hyaluronan (data not shown). Although the basic assumptions of single-site binding are violated due to the heterogeneous nature and valency of heparin and glycosaminoglycans, we calculated apparent binding constants using maximum response units (RU) (Fig. 4D). These data are summarized in Table 2.
Binding of CLEC14A to endothelial HS was tested using 35S-labeled HS isolated from HUVEC. Samples were mixed with recombinant CLEC14A-325, and the solution was rapidly filtered through a nitrocellulose membrane. Free, uncomplexed GAGs do not bind to nitrocellulose, as demonstrated by the lack of counts bound to the filter when [35S]HS was incubated with BSA (Fig. 5) (47). In contrast, [35S]HS bound to CLEC14A-325 in a dose-dependent manner (3 μg of protein bound 10 ± 4% of the input [35S]HS counts) (Fig. 5). For comparison, 0.4 μg of FGF2, which binds to HS with high affinity (48-50), sequestered 35 ± 6% of input counts (Fig. 5).

Heparin oligosaccharides bind to a single binding site in CLEC14A
Size-exclusion chromatography using globular protein standards showed that CLEC14A-290 ran as an ~31-kDa monomer consistent with its predicted molecular mass of 31.6 kDa (Fig. 6A). In contrast, CLEC14A-325 migrated with an effective mass of ~85 kDa, but the predicted molecular mass was only 35.6 kDa, suggesting that CLEC14A-325 behaved as a dimer or trimer (Fig. 6B). When the experiment was repeated using multiangle light-scattering (MALS) to more accurately estimate molecular mass, CLEC14A-290 and CLEC14A-325 eluted with molecular masses of 35.4 ± 2.4 and 87.1 ± 18.1 kDa, respectively. Many C-type lectins behave as trimers, which increases their avidity for multivalent ligands (51). One function of the endomucin domain in CLEC14A may be to facilitate oligomerization, thus increasing its affinity for GAGs.
As indicated above, modeling studies predicted a 22.3 × 29.2-Å electropositive surface embedded in the CTLD of CLEC14A, which could theoretically accommodate a dp10-12 heparin oligosaccharide (Fig. 3G). Incubation of CLEC14A-290 with dp10 heparin oligosaccharides shifted the elution pattern of the protein, increasing the apparent mass to ~38 kDa (Fig. 6A). Incubation with heparin yielded a large complex of average relative mass of 195 kDa, suggesting a stoichiometry of 5:1 CLEC14A-290/heparin. The data also support the idea that CLEC14A accommodates oligosaccharides of ~dp10 because heparin consists of a variety of chains with an average molecular mass of ~14 kDa (~dp50). Incubation of CLEC14A-325 oligomer with dp10 heparin oligosaccharides shifted its mass from 90 to 114 kDa; the difference of 24 kDa is close to the predicted value if CLEC14A-325 behaves like a trimer and the complex binds three dp10 oligosaccharides (Fig. 6B). Together, these findings suggest that CLEC14A monomers contain a single binding site for heparin and that these sites act independently in the trimer.
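The arithmetic behind the 5:1 estimate can be checked with a short Python sketch. It assumes that the complex contains a single heparin chain and takes the MALS mass of CLEC14A-290 (35.4 kDa) as the monomer mass; the text does not state which monomer mass was used, so this choice and the function name are assumptions.

```python
def proteins_per_chain(complex_kda, chain_kda, monomer_kda):
    """Estimate how many protein monomers sit on one polysaccharide
    chain, assuming the complex contains exactly one chain."""
    return (complex_kda - chain_kda) / monomer_kda


# Values quoted in the text: 195-kDa complex, ~14-kDa heparin chain.
ratio_mals = proteins_per_chain(195.0, 14.0, 35.4)       # MALS monomer mass
ratio_predicted = proteins_per_chain(195.0, 14.0, 31.6)  # sequence-predicted mass

# An average heparin chain of ~dp50 divided into ~dp10 footprints.
footprints_per_chain = 50 // 10
```

With the MALS mass the ratio comes to about 5.1, in line with five dp10-sized footprints on a ~dp50 chain; with the sequence-predicted mass of 31.6 kDa it is closer to 5.7, so the figure is best read as approximate.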
Binding of glycans to proteins can stabilize them against denaturation. To test the impact of heparin on CLEC14A stability, we examined the response of CLEC14A-325 to thermal denaturation using differential scanning fluorimetry (DSF). In this technique, denaturation is measured by binding of a hydrophobic dye to hydrophobic residues exposed by denaturation. An increase in melting temperature induced by ligand-protein binding reflects enhanced protein stability. CLEC14A-325 showed a typical biphasic melting curve, melting at 55°C based on the first derivative spectrum. The addition of heparin increased thermostability by up to 9.8°C (Fig. 7A). For reference, we examined the impact of heparin on stability of FGF2, a classical HSBP. The addition of heparin increased the stability of FGF2 by 33.7 ± 0.1°C (Fig. 7A). Individual oligosaccharides dp6-18 added at a 9:1 molar ratio also increased the stability of CLEC14A-325 against thermal denaturation (Fig. 7B), with the effect reaching saturation with a dp12 oligosaccharide. Based on NMR structures for heparin oligosaccharides, a dp12 oligosaccharide has an extended length of ~38-48 Å (52). This length corresponds well with the size of the modeled site shown in Fig. 3G. Chemically desulfating heparin at N- or C6 positions of glucosamine units or the C2 position of uronic acids reduced its ability to stabilize CLEC14A, indicating that binding involves contacts with multiple sulfate groups (Fig. 7C).
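Reading a melting temperature from the first derivative of a DSF trace can be sketched in Python as follows. This is a simplified illustration and not the analysis used in the study (the figure legend states that the data were analyzed in Prism): it assumes a single unfolding transition, takes the melting temperature as the point of steepest fluorescence increase, and uses simulated sigmoidal curves with hypothetical midpoints of 55 and 65°C.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature where dF/dT is largest, using central
    differences on interior points of the melting curve."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def simulated_curve(temps, tm, width=2.0):
    """Simulated single-transition unfolding curve (sigmoid)."""
    return [1.0 / (1.0 + math.exp((tm - t) / width)) for t in temps]


temps = list(range(40, 81))            # 40 to 80 degrees C in 1-degree steps
apo = simulated_curve(temps, 55)       # protein alone (simulated)
bound = simulated_curve(temps, 65)     # protein plus ligand (simulated)

tm_apo = melting_temperature(temps, apo)
tm_bound = melting_temperature(temps, bound)
thermal_shift = tm_bound - tm_apo
```

A real biphasic trace, such as the one reported for CLEC14A-325, would need each transition to be located separately, for example by restricting the search to a temperature window around each phase.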

Computational mapping of the heparin-binding site of CLEC14A
Because of the low sequence homology of the lectin domain of CLEC14A with proteins available in the Protein Data Bank (53), we employed different homology modeling methods to predict 3D structural models of CLEC14A. Superimposition of the models showed that despite using different templates, the fold of the protein core is well-preserved, but the conformation of the various loops differs significantly (Fig. S6A). Molecular dynamics (MD) simulations of selected CLEC14A models in explicit solvent on the microsecond timescale confirmed that the fold of the protein core remained stable, but the loops were significantly flexible (Fig. S6B). This makes a reliable prediction of the binding site of heparin fragments using docking methods difficult. Therefore, a novel MD-based unbiased protocol was used to map the heparin-binding site of CLEC14A. Each predicted 3D model of CLEC14A was simulated together with a heparin fragment in explicit solvent on the microsecond timescale. In the starting assemblies, the heparin fragment was always positioned remotely, as shown in Fig. S6A. In this way any interaction with specific amino acids during the simulation was not biased by the starting orientation. The number of heavy atom contacts (<4.0 Å) between the heparin fragment and the individual amino acids was analyzed from the MD trajectories (accumulated 11.6 μs) for each frame in the corresponding model (Fig. 8, A-C). It was found that binding events occurred typically in less than 50 ns, but rearrangements in the exact binding pattern occurred even after 500 ns. From the interaction plots, it was obvious that Arg-161 reproducibly played a key role in most of the simulations, but other amino acids also contributed significantly. The statistics and a binding model derived from the simulations with the YASARA model are shown in Fig. 8.
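The per-residue contact analysis can be illustrated for a single trajectory frame with a minimal Python sketch. It assumes that heavy-atom coordinates have already been extracted from the trajectory as plain (x, y, z) tuples in Å; the function name and the toy coordinates are hypothetical, and in practice the counts would be accumulated over all frames.

```python
import math


def count_contacts(ligand_atoms, residue_atoms, cutoff=4.0):
    """Count ligand-residue heavy-atom pairs closer than the cutoff.

    ligand_atoms:  list of (x, y, z) coordinates for the heparin fragment
    residue_atoms: dict mapping a residue label to its list of (x, y, z)
    Returns a dict mapping each residue label to its number of contacts.
    """
    contacts = {}
    for residue, atoms in residue_atoms.items():
        n = 0
        for atom in atoms:
            for lig in ligand_atoms:
                if math.dist(atom, lig) < cutoff:
                    n += 1
        contacts[residue] = n
    return contacts


# Toy frame: one ligand atom at the origin and two residues.
ligand = [(0.0, 0.0, 0.0)]
residues = {
    "Arg-161": [(3.0, 0.0, 0.0), (0.0, 5.0, 0.0)],  # one atom within 4 A
    "Lys-158": [(10.0, 0.0, 0.0)],                  # too far away
}
frame_contacts = count_contacts(ligand, residues)
```

Summing such per-frame dictionaries along each trajectory gives the kind of residue-by-residue interaction profile from which a dominant residue like Arg-161 would stand out.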

Genetic mapping of the heparin-binding site of CLEC14A
Arg-161 lies in a large positively-charged surface patch in the CTLD along with arginine and lysine residues Arg-141, Lys-158, and Arg-165 (Fig. 9A). To determine whether these residues are involved in binding, they were converted one-by-one to alanine residues, and recombinant protein was produced in HEK293F cells. Chromatography of the recombinant proteins on heparin-Sepharose showed that R141A, K158A, and R165A mutations had little effect on the salt concentration required for elution of the mutated proteins compared with the WT protein (488-528 mM NaCl for mutants R141A, K158A, and R165A versus 515 mM for the WT) (Fig. 9B). However, the R161A variant eluted at a much lower concentration of NaCl (373 mM) indicating impairment in its heparin-binding capacity (Fig. 9C). We validated this finding using an ELISA in which heparin was immobilized on a plate. WT CLEC14A-325 bound to immobilized heparin with an apparent affinity of ~25 nM. In contrast, the R161A variant essentially lost its capacity to bind immobilized heparin under these conditions (Fig. 9D). Interestingly, the R161A mutant was more thermally stable compared with the WT protein, indicating that the decreased binding to heparin was not caused by unfolding of the mutant protein. As expected, heparin stabilized the WT protein to thermal denaturation, but it had a much reduced effect in the mutant (Fig. 9, E and F).

Fig. 4D. The values for heparan sulfate include an additional data point obtained at 28 μg/ml. All data were analyzed using TraceDrawer version 1.6.1 (Ridgeview Instruments AB, Vänge, Sweden).

Discussion
In this report we describe the development and application of LPHAMS, a proteomic workflow integrating limited proteolysis, heparin-affinity chromatography, and high-resolution LC-MS/MS. Application of LPHAMS to human and murine endothelial cells led to the identification of a large number of HSBPs, and in many examples, the method revealed the subdomains that facilitate binding to HS. We identified the endothelial transmembrane protein CLEC14A as a novel glycosaminoglycan-binding protein. CLEC14A, a C-type lectin, most likely exists as a trimer and does not bind typical N- and O-linked glycans but instead binds to glycosaminoglycans. In practice, LPHAMS is simple, does not require pre-fractionation methods or detergents, and can be applied to a variety of cell types. Types I and II, polytopic membrane proteins, as well as extracellular matrix and secreted proteins were discovered using this method.
Many investigators have used affinity chromatography coupled with MS to identify heparin-binding proteins, but typically, the source material consisted of a body fluid such as blood, serum, or cerebrospinal fluid (16, 26, 48, 49, 54-56). This approach led to the purification of soluble growth factors, plasma proteins of the coagulation cascade and complement systems, and RNA- and DNA-binding proteins, but few membrane proteins were identified. To enrich for membrane proteins, a technique was devised that involved isolation of a plasma membrane fraction, for example from liver (10,57). Solubilization of the membranes and affinity purification of the material over heparin-Sepharose led to the identification of 148 HSBPs, including 79 membrane proteins. Although effective, this strategy involves homogenization of tissues, purification of plasma membranes, and detergents for solubilization of otherwise insoluble membrane proteins. Another approach employed cell-surface biotinylation of cultured cells, enrichment by streptavidin-affinity chromatography, and fractionation by heparin-Sepharose chromatography (10-12). Many HSBPs were discovered in this way, but few membrane proteins were enriched possibly because of their low abundance or limited access to the tagging reagents.
LPHAMS takes advantage of suboptimal proteolysis to selectively cleave proteins at exposed protein hinges or loops, and the method can be easily modified depending on the ultimate targets. When applied to cells, specific enrichment for protein ectodomains of cell-surface proteins occurred, leading to the identification of previously unidentified heparin-binding proteins. One limitation of this method is the requirement that target proteins have accessible protease cleavage sites, but the use of a broad-spectrum serine protease (proteinase K) reduces this potential problem. Conversely, the use of a broad-spectrum protease could easily lead to the under-representation of certain proteins due to undesirable proteolysis. To mitigate this limitation, adjustments can be made to the duration of proteolytic treatment, concentration of the protease, the type of protease, and temperature. Other GAGs or glycans could be used as the affinity matrix as well.
LPHAMS also has the advantage of providing information about putative glycosaminoglycan-binding sites in the heparin-binding proteins. These predictions were often consistent with prior mapping studies in which the site of interaction had been deduced by crystallography, NMR, or modeling. Several of the binding sites in the HSBPs identified in this study mapped to larger domains that presumably depend on folding and approximation of subdomains to generate the positively-charged surfaces with affinity for heparin. Of the 55 identified proteins, 26 had at least one so-called Cardin-Weintraub sequence (Table 1): -XBBXBX- or -XBBBXXBX-, where X is a hydrophilic amino acid and B is a basic amino acid (35). However, these Cardin-Weintraub sequences were not necessarily predictive, because many mapped outside of the established heparin-binding site or to domains not exposed at the cell surface. When examining secondary sequences for turn-based heparin-binding motifs (TXXBXXTBXXXTBB, where T denotes a turning amino acid (36)), only four proteins were predicted to contain a putative heparin-binding site.
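A scan for the two Cardin-Weintraub patterns can be sketched in Python with regular expressions. The sketch makes two simplifying assumptions that are not taken from the text: B is restricted to lysine and arginine by default (histidine can be added through the `basic` parameter), and X is allowed to be any residue. The function name and the test sequences are hypothetical.

```python
import re


def find_cardin_weintraub(sequence, basic="KR"):
    """Return (motif name, 1-based start, matched segment) for every
    XBBXBX or XBBBXXBX occurrence, including overlapping ones."""
    b = f"[{basic}]"
    patterns = {
        "XBBXBX": f".{b}{{2}}.{b}.",
        "XBBBXXBX": f".{b}{{3}}..{b}.",
    }
    hits = []
    for name, body in patterns.items():
        # A zero-width lookahead lets overlapping matches be reported.
        for match in re.finditer(f"(?=({body}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


short_motif = find_cardin_weintraub("GARRGKAL")
long_motif = find_cardin_weintraub("AKRKGGRA")
no_motif = find_cardin_weintraub("GGGGGGGG")
```

As noted above, a sequence match of this kind is only a first filter, because a motif can lie outside the established heparin-binding site or in a domain that is not exposed at the cell surface.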

Like any proteomic technique, LPHAMS may not detect all candidate proteins or subdomains fulfilling the criterion of binding heparin. For example, peptides from the heparin-binding domain in COL18A1 and ANXA2 were not recovered possibly because of proteolytic fragmentation or weak affinity for heparin. Although nearly all known heparin-protein interactions are driven by electrostatics, proteins that bind heparin via polar interactions and/or nonionic interactions may have been missed (60). Conceivably, some of the proteins identified by LPHAMS might not actually bind to heparin, but instead may form a complex with a bona fide heparin-binding protein, which then led to its co-purification. However, these interactions are of interest as well because they define possible complexes and biological functions.
In this study, we show for the first time that CLEC14A binds to GAGs and not to carbohydrates that typically associate with C-type lectins. CLEC14A is most likely a trimeric protein, typical of many C-type lectins. Binding to heparin oligosaccharides appeared to occur independently in each monomer. Interestingly, the heparin-binding site in CLEC14A does not map to the site typically associated with the carbohydrate recognition domain of this lectin subfamily (43), and binding does not depend on calcium. These observations suggest that the ability to bind heparin and other GAGs evolved independently of the C-type lectin fold. CLEC14A is an endothelial-specific gene that is upregulated during tumor angiogenesis and regulates endothelial cell migration and adhesion in vitro and angiogenesis in vivo (41, 42, 61-67). The C-type lectin domain has been shown to engage other matrix proteins, such as multimerin 2 (MMRN2) and heat-shock protein 70-1A (61, 62). Antibodies blocking these interactions or targeting the C-type lectin domain have been shown to decrease cell migration and tumor angiogenesis in a manner dependent on MMRN2 (42, 62) or VEGFA (68). Whether these antibodies block the GAG-binding site is not known.
CLEC14A bound to specific glycosaminoglycans with fast on-rates and undetectable off-rates. Heparin and chondroitin sulfate E have high charge density and bound more avidly than heparan sulfate, chondroitin sulfate A, and dermatan sulfate, which have lower charge density (Fig. 4). There was no detectable binding to hyaluronan. CLEC14A also bound to heparan sulfate from HUVEC, but less so than FGF2 (Fig. 5). These findings suggest that CLEC14A prefers highly charged polysaccharides, consistent with dramatically reduced binding to desulfated forms of heparin (Fig. 7). Although it is possible that CLEC14A prefers a specific arrangement of sulfated residues in the ligand, we think it is most likely that the overall charge determines the affinity of the interaction, which would suggest that the GAG-binding site is somewhat promiscuous. The combination of unbiased molecular dynamics simulations of homology models and alanine mutagenesis indicates that binding of heparin to CLEC14A is highly dependent on Arg-161 (Figs. 8 and 9). Although we are confident in the location of the CLEC14A-heparin-binding pocket, we point out that the binding modes are approximate because it is not feasible to reach full convergence in the binding simulations. Attempts to crystallize the complex are currently underway.
Figure 7. Heparin stabilizes CLEC14A to thermal denaturation. CLEC14A-325 or FGF2 was thermally denatured, and the melting point (Tm) was determined by differential scanning fluorimetry (DSF). A, CLEC14A-325 (6 μM) was incubated with various concentrations of heparin to achieve the indicated molar ratios. FGF2 (24 μM) was incubated with heparin (9:1 molar ratio). B, heparin oligosaccharides (dp6-18) were incubated with CLEC14A-325 at a 9:1 molar ratio and analyzed by DSF. C, CLEC14A-325 was incubated with chemically de-sulfated forms of heparin. Each condition was performed in triplicate, and the data were analyzed by Prism (version 5.0).
In summary, we describe a facile method for discovery of heparin and other GAG-binding proteins using limited proteolysis. The technique can be readily adapted to other cell types by altering proteolytic conditions and in theory can be used to identify proteins that interact with other carbohydrate ligands by variation of the affinity matrix. The technique also aids in mapping the ligand-binding site. Finally, the characterization of CLEC14A as a heparin-binding protein validates the technology and suggests further studies of the function of CLEC14A-GAG complexes in vivo.

Limited proteolysis proteomics screening
Confluent HUVEC, grown in 100-mm diameter dishes in EGM-2 medium (Lonza), were washed twice with 5 ml of M199 medium (Life Technologies, Inc.). After testing different conditions, we typically treated cells with proteinase K (250 ng/ml) or chymotrypsin (1 μg/ml) in M199 medium for 10 min at room temperature. The supernatant was collected, centrifuged at 400 × g to remove cellular debris, and then placed on ice. The samples were applied to a 1-ml HiTrap heparin-Sepharose column (GE Healthcare) equilibrated in 150 mM NaCl in 25 mM HEPES buffer (pH 7.1). Columns were washed with 0.3 M NaCl in 25 mM HEPES buffer to remove low-affinity binding proteins and step-eluted with 1 M NaCl in 25 mM HEPES buffer. An in-solution digestion was performed on proteins in these fractions with mass spectrometry grade trypsin gold (Promega) at 37°C. Peptides were desalted using C18 Tips (Pierce) and dried using a speed-vacuum centrifuge. Murine brain (mBMEC) and lung (mLEC) microvascular endothelial cells were isolated as described (6) and were cultured on gelatin (Sigma) in Dulbecco's modified Eagle's medium (DMEM; Lonza) containing 20% (v/v) fetal bovine serum (Atlanta Biologicals), heparin (100 μg/ml), and endothelial cell growth supplement (50 μg/ml; VWR International). Confluent mBMEC and mLEC were washed with serum-free DMEM and then treated with proteinase K (250 ng/ml) for 15 min. U937 cells were cultured in Roswell Park Memorial Institute 1640 (RPMI) medium containing 10% FBS (v/v). The cells were centrifuged, washed three times with PBS, and then digested with proteinase K (500 ng/ml) or trypsin (1 μg/ml) in PBS for 45 min with rotation. The supernatants from digestions of mBMEC, mLEC, and U937 cells underwent the same workflow for heparin purification, trypsinization, and MS as HUVEC.

Liquid chromatography-mass spectrometry
Tryptic peptides were analyzed by ultra-HPLC (UPLC) coupled with tandem MS (LC-MS/MS) using nano-spray ionization. The experiments were performed using a TripleTOF 5600 hybrid mass spectrometer (ABSCIEX) interfaced with nanoscale reversed-phase UPLC (Waters, nanoACQUITY) using a 20-cm × 75-μm, inner diameter, glass capillary packed with 2.5-μm C18 (130 Å) CSH™ beads (Waters). Peptides were eluted from the C18 column using a linear gradient (5-80%) of acetonitrile at a flow rate of 250 μl/min for 1 h. Buffer A is 98% water, 2% acetonitrile, 0.1% formic acid, and 0.005% TFA, and buffer B is 100% acetonitrile, 0.1% formic acid, and 0.005% TFA. MS/MS data were acquired in a data-dependent manner. MS1 data were acquired for 250 ms at m/z of 400-1250 Da, and the MS/MS data were acquired from m/z of 50-2000 Da. MS1-TOF acquisition time was 250 ms, followed by 50 MS2 events with a 48-ms acquisition time for each event. The threshold to trigger the MS2 event was set to 150 counts when the ion had the charge state of +2, +3, and +4. The ion-exclusion time was set to 4 s.
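The acquisition settings above imply a duty-cycle length that can be worked out directly. The short sketch below adds one MS1 scan to the MS2 events and ignores any inter-scan overhead, which is an assumption. The function name is hypothetical. With the stated values (250 ms, 50 events, 48 ms each), the cycle comes to 2.65 s, so the 4-s ion-exclusion window spans roughly one and a half cycles.

```python
def dda_cycle_time_s(ms1_ms, n_ms2, ms2_ms):
    """Length of one data-dependent cycle in seconds.

    One MS1 scan plus n_ms2 MS2 events; inter-scan overhead is ignored.
    """
    return (ms1_ms + n_ms2 * ms2_ms) / 1000.0


cycle = dda_cycle_time_s(ms1_ms=250, n_ms2=50, ms2_ms=48)
exclusion_s = 4.0
print(f"cycle = {cycle:.2f} s; exclusion covers {exclusion_s / cycle:.2f} cycles")
```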
In a few cases, the samples were also run on a Proxeon EASY nanoLC system (Thermo Fisher Scientific) coupled to a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). Dried peptides were reconstituted with 2% acetonitrile, 0.1% formic acid, and quantified by modified BCA peptide assay (Thermo Fisher Scientific). Equal peptide amounts derived from each sample were injected and analyzed by LC-MS/MS on this system. Peptides were separated using an analytical C18 Acclaim PepMap column (0.075 × 500 mm, 2 μm; Thermo Fisher Scientific) equilibrated with buffer A (0.1% formic acid in water) and eluted in a 93-min linear gradient of 2-28% solvent B (100% acetonitrile) at a flow rate of 300 nl/min. The mass spectrometer was operated in positive data-dependent acquisition mode. MS1 spectra were measured with a resolution of 70,000, an automated gain control (AGC) target of 1e6, and a mass range from 350 to 1700 m/z. Up to 12 MS2 spectra per duty cycle were triggered, fragmented by higher-energy collisional dissociation, and acquired with a resolution of 17,500, an AGC target of 5e4, an isolation window of 1.6 m/z, and a normalized collision energy of 25. Dynamic exclusion was enabled with a duration of 20 s.

Protein and peptide identification and analysis
Bioinformatics and statistical analysis of proteomics results were conducted in the Perseus statistical suite (version 1.6.5.0) (69) and the MaxQuant computational platform. Raw data were searched by the Andromeda search engine (70) against the mouse UniProt FASTA database (17,009 entries, downloaded June 2, 2017), human UniProt database (20,129 entries, downloaded June 3, 2017), and against a common contaminant database. Search parameters were set as follows: enzyme, trypsin with up to two potentially missed cleavages; fixed modifications, carbamidomethyl on cysteines; variable modifications, oxidation of methionine and acetylation of protein N terminus; minimum peptide length, 7 amino acids. The FDR for both peptide and protein identifications was set to 1% and was calculated by searching the MS/MS data against a reversed decoy database. Allowed mass deviation for precursor ions was set to 5 ppm and for peptide fragments was set to 20 ppm. Peptides having at least 95% confidence with multiple modified and cleaved states of the same underlying peptide sequence were considered distinct. Multiple spectra of the same peptide due to replicate acquisition or different charge states were counted once. Inclusion criteria required detection of ≥2 MS/MS counts in a given scan and/or detection in multiple samples. Several of the proteins from the U937 runs had low MS/MS counts, but three of these were established HSBPs or were detected in endothelial cells. The actual spectral counts and intensities for all the protein identifications are provided in Tables S1 and S3. For peptide alignments, the number of distinct peptides identified from all limited proteolysis MS experiments was mapped to their respective UniProt protein sequence by using the multiple sequence alignment tool Clustal Omega (71,72).
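To make the target-decoy idea behind the 1% FDR setting concrete, the following simplified sketch estimates FDR as the ratio of decoy to target matches above a score threshold. It also shows how a mass deviation in ppm is computed. It is not the MaxQuant/Andromeda implementation, which is more elaborate. The function names, the score scale, and the example values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Find the most permissive score cutoff that keeps estimated FDR <= max_fdr.

    psms: iterable of (score, is_decoy) pairs; a higher score is a better match.
    FDR at a cutoff is estimated as decoy hits / target hits at or above it.
    Returns None if no cutoff qualifies. Tied scores are not treated specially.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical matches: 150 target hits followed by lower-scoring decoys.
example = [(s, False) for s in range(250, 100, -1)]
example += [(100, True), (99, True), (98, False)]
print(score_cutoff_at_fdr(example), round(ppm_error(1000.005, 1000.0), 3))
```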

CLEC14A N-glycan site mapping and glycopeptide analysis
For studies involving N-linked glycans, CLEC14A was treated with PNGase F (New England Biolabs) for 16 h at 37°C under nondenaturing conditions. Details are provided in the supporting information.

Heparin ELISA
Porcine mucosal heparin (SPL Scientific Protein Laboratories) was immobilized (50 μl at 1 mg/ml) on a 96-well Carbobind plate (Corning) in 0.1 M sodium acetate buffer (pH 5.5) for 2 h at room temperature. Wells were washed with PBST, blocked with 1% BSA (Sigma) for 2 h at 37°C, and washed again with PBST. The wells were incubated with the indicated concentrations of recombinant CLEC14A at room temperature for 1 h. Bound protein was quantitated using THE™ His tag antibody (GenScript) and anti-mouse horseradish peroxidase (Cell Signaling). The Kd value was calculated by fitting the binding data to a single-site binding model (Prism 8).
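For readers unfamiliar with the single-site model, the sketch below fits signal = Bmax × [protein] / (Kd + [protein]) by least squares. It is an illustrative stand-in and not the Prism 8 routine. For each trial Kd the best Bmax has a closed form, so only Kd is searched, first on a coarse logarithmic grid and then by golden-section refinement. The function name and the synthetic, noise-free data are hypothetical.

```python
import math


def fit_one_site(conc, signal):
    """Least-squares fit of signal = Bmax * conc / (Kd + conc). Returns (Kd, Bmax).

    Requires at least one positive concentration.
    """
    def bmax_and_sse(log_kd):
        kd = 10.0 ** log_kd
        frac = [x / (kd + x) for x in conc]
        # Closed-form optimum for Bmax at this Kd (linear least squares).
        bmax = sum(y * f for y, f in zip(signal, frac)) / sum(f * f for f in frac)
        sse = sum((y - bmax * f) ** 2 for y, f in zip(signal, frac))
        return bmax, sse

    positive = [x for x in conc if x > 0]
    lo = math.log10(min(positive)) - 2
    hi = math.log10(max(positive)) + 2
    step = (hi - lo) / 200
    # A coarse grid locates the basin; golden-section search refines within it.
    best = min((lo + i * step for i in range(201)), key=lambda g: bmax_and_sse(g)[1])
    a, b = best - step, best + step
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if bmax_and_sse(c)[1] < bmax_and_sse(d)[1]:
            b = d
        else:
            a = c
    log_kd = (a + b) / 2
    return 10.0 ** log_kd, bmax_and_sse(log_kd)[0]


# Synthetic titration (arbitrary units) generated with Kd = 50 and Bmax = 2.
conc = [1, 5, 10, 50, 100, 500, 1000]
signal = [2.0 * x / (50.0 + x) for x in conc]
kd, bmax = fit_one_site(conc, signal)
print(f"Kd = {kd:.2f}, Bmax = {bmax:.3f}")
```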

Endothelial heparan sulfate purification
HUVEC were grown on gelatin in EGM-2 medium until confluent. The cells were radiolabeled with 20 μCi of 35SO4 (9.25-37.0 GBq/mmol; PerkinElmer Life Sciences) in 5 ml of F12 medium (Gibco) supplemented with 10% FBS depleted of glycosaminoglycans by chromatography over DEAE-Sepharose. The culture medium was collected after 24 h, and the cell layer was treated with trypsin for 10 min at 37°C. The trypsin solution was collected and centrifuged to remove cell debris. Proteoglycans secreted into the growth medium and those released from the cell surface were pooled and digested with 0.4 mg/ml Pronase (Sigma) overnight at 37°C. Samples were diluted with 2 volumes of wash buffer and purified by anion-exchange chromatography. Briefly, columns were prepared by washing 1 ml of 50% slurry of DEAE-Sepharose beads (GE Healthcare) with 50 mM sodium acetate, 0.2 M NaCl, 0.1% Triton X-100 (pH 6.0). After applying the sample, the columns were rinsed with wash buffer, and proteoglycans and glycosaminoglycans were eluted with 2.5 ml of 50 mM sodium acetate buffer containing 2 M NaCl (pH 6.0). Samples were then desalted on a PD-10 column (GE Healthcare) equilibrated in 10% ethanol. Samples were lyophilized and resuspended in 50 mM Tris buffer containing 50 mM NaCl and 25 mM MgCl2 (pH 8.0). To remove DNA and chondroitin sulfate, samples were treated with 20 kilounits/ml DNase I (Sigma) and 20 milliunits of chondroitinase ABC (Amsbio) for 3 h at 37°C. To liberate the [35S]heparan sulfate chains from residual peptides, the samples were β-eliminated with 0.4 M NaOH overnight at 4°C. The [35S]heparan sulfate was then re-purified by anion-exchange chromatography and desalted.

Nitrocellulose filter binding assay
Recombinant CLEC14A, BSA, or FGF2 (PeproTech) was incubated with 10,000 counts of [35S]heparan sulfate for 30 min at room temperature. Samples were added to prewashed nitrocellulose membranes on a vacuum apparatus and rapidly filtered. The filters were added to 5 ml of Ultima Gold XR (PerkinElmer Life Sciences) scintillation fluid and counted by liquid scintillation.

Surface plasmon resonance
A Nicoya OpenSPR was used to generate binding curves for CLEC14A interaction with heparin (SPL Laboratories), porcine intestinal mucosal dermatan sulfate (Celsus Laboratories), chondroitin sulfate E (Sigma), Chinese hamster ovary cell heparan sulfate (rHS01, TEGA Therapeutics, Inc.), and umbilical cord hyaluronan (Sigma). Protein was immobilized on a Nicoya carboxyl sensor using an amine coupling kit (Nicoya). Carboxyl sensors were functionalized using 0.2 ml of a 1:1 mix of N-hydroxysuccinimide (0.1 M) and 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (0.4 M) before coupling to recombinant CLEC14A under flow conditions. Ethanolamine was used to block remaining active sites on the chip. In other experiments, biotinylated heparin (Sigma) was immobilized onto a Nicoya Streptavidin Sensor chip. All surfaces were washed with SPR buffer (20 mM HEPES, 150 mM NaCl, 5 mM CaCl2, 17 mM NaN3, 5 mM MgCl2, 0.1% BSA, and 0.05% Tween 20 (pH 7.2)) and regenerated with 20 mM HEPES buffer (pH 7.2) containing 3 M NaCl. Ligands were allowed to associate with the chip at a flow rate of 20 μl/min in SPR buffer for 4 min and allowed to dissociate for 5 min. Regeneration buffer was used before each injection of ligand to clean the chip surface.

Analytical heparin-Sepharose chromatography
CLEC14A was applied to a 1-ml HiTrap heparin-Sepharose column (GE Healthcare) in PBS. Protein was eluted with a gradient of NaCl from 150 mM to 1 M.

Differential scanning fluorimetry
CLEC14A (6 μM) or FGF2 (24 μM) was incubated with 5X SYPRO Orange Protein Gel Stain (Thermo Fisher Scientific) in PBS. CLEC14A thermal denaturation was monitored on a CFX96 real-time PCR system (Bio-Rad) using a temperature gradient from 25 to 98°C (1°C/min). Heparin and heparin derivatives (Iduron) were added to the solution at a final concentration of 48 μM. Melting temperatures were calculated using first derivatives of the data assuming a Gaussian distribution (Prism 8).
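The first-derivative approach to the melting temperature can be sketched as follows. This is an illustration under stated assumptions and not the Prism 8 analysis. It assumes evenly spaced temperature readings and a single unfolding transition. The peak of dF/dT is refined with a three-point Gaussian interpolation, which fits a parabola to the logarithm of the derivative around its maximum. The function name, the 1°C reading interval, and the synthetic melting curve are hypothetical.

```python
import math


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the peak of dF/dT, refined by three-point Gaussian interpolation.

    Assumes evenly spaced temperatures and a single unfolding transition.
    """
    h = temps[1] - temps[0]
    # Central-difference first derivative at the interior points.
    deriv = [(fluorescence[i + 1] - fluorescence[i - 1]) / (2 * h)
             for i in range(1, len(temps) - 1)]
    k = max(range(len(deriv)), key=deriv.__getitem__)
    t_peak = temps[k + 1]  # deriv[k] belongs to temps[k + 1]
    if 0 < k < len(deriv) - 1 and min(deriv[k - 1], deriv[k], deriv[k + 1]) > 0:
        la, lb, lc = (math.log(deriv[k + j]) for j in (-1, 0, 1))
        denom = la - 2 * lb + lc
        if denom < 0:  # concave in log space, as expected at a Gaussian-like peak
            t_peak += 0.5 * h * (la - lc) / denom
    return t_peak


# Synthetic sigmoidal melt with its midpoint at 55.3 degrees C.
temps = list(range(25, 99))
fluor = [1.0 / (1.0 + math.exp(-(t - 55.3) / 2.0)) for t in temps]
print(f"Tm ~ {melting_temperature(temps, fluor):.1f} C")
```

A shift in Tm between protein alone and protein plus ligand, computed in this way for each replicate, corresponds to the stabilization reported in Fig. 7.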