Disulfide Bond Assignments of Secreted Frizzled-related Protein-1 Provide Insights about Frizzled Homology and Netrin Modules

Secreted Frizzled-related protein-1 (sFRP-1), a soluble protein that binds to Wnts and modulates Wnt signaling, contains an N-terminal domain homologous to the putative Wnt-binding site of Frizzled (Fz domain) and a C-terminal heparin-binding domain with weak homology to netrin. Both domains are cysteine-rich, having 10 and 6 cysteines in the Fz and heparin-binding domains, respectively. In this study, the disulfide linkages of recombinant sFRP-1 were determined. Numbering sFRP-1 cysteines sequentially from the N terminus, the five disulfide linkages in the Fz domain are 1–5, 2–4, 3–8, 6–10, and 7–9, consistent with the disulfide pattern determined for homologous domains of several other proteins. The disulfide linkages of the heparin-binding domain are 11–14, 12–15, and 13–16. This latter set of assignments provides experimental verification of one of the disulfide patterns proposed for netrin (NTR) modules and thereby supports the prediction that the C-terminal heparin-binding domain of sFRP-1 is an NTR-type domain. Interestingly, two subsets of sFRPs appear to have alternate disulfide linkage patterns compared with sFRP-1, one of which involves the loss of a disulfide due to deletion of a single cysteine from the NTR module, whereas the remaining cysteine may pair with a new cysteine introduced in the Fz domain of the protein. Analysis of glycosylation sites showed that sFRP-1 contains a relatively large carbohydrate moiety on Asn172 (∼2.8 kDa), whereas Asn262, the second potential N-linked glycosylation site, is not modified. No O-linked carbohydrate groups were detected. There was evidence of heterogeneous proteolytic processing at both the N and C termini of the recombinant protein. The predominant N terminus was Ser31, although minor amounts of the protein with Asp41 and Phe50 as the N termini were observed. The major C-terminal processing event was removal of the terminal amino acid (Lys313) with only a trace amount of unprocessed protein detected.

Wnt signaling has been implicated in the specification of cell fate, polarity and proliferation, tissue patterning, and the onset of neoplasia (reviewed in Refs. 1 and 2). Signaling is initiated by the secreted Wnt proteins, which react with proteins on the cell surface to form a receptor complex consisting of a seven-pass transmembrane molecule of the Frizzled (Fz) family (3) and either LRP5 or LRP6/Arrow (4-6), members of the low density lipoprotein receptor-related family (7,8). In the absence of Wnt receptor activation, the modular protein Axin provides a scaffold for the binding of glycogen synthase kinase 3β (GSK-3β), adenomatous polyposis coli (APC) protein, and β-catenin (9-13). This facilitates the phosphorylation of β-catenin by GSK-3β and subsequent rapid degradation of β-catenin by a ubiquitin-dependent process (14,15). In response to Wnt binding, the Axin-GSK-3β-APC-β-catenin complex is disrupted by a process that involves the cytoplasmic proteins Dishevelled and Frat (16-20), dephosphorylation of Axin (22,23), and recruitment of Axin to LRP5 associated with Axin destabilization (24). As a result, the phosphorylation and degradation of cytosolic β-catenin are inhibited, leading to its interaction with DNA-binding proteins of the T-cell factor/lymphoid enhancer-binding factor family and accumulation in the nucleus where these complexes activate expression of target genes (25-30). Mutations in APC, β-catenin, and Axin that increase the steady state level of soluble β-catenin create conditions tantamount to a constitutively active canonical Wnt pathway and have been observed in many human cancers (reviewed in Ref. 2).
The Wnt-binding site in Fz proteins consists of ∼120 amino acid residues and has been designated the Fz cysteine-rich domain (CRD) because it contains 10 cysteines that are present in all members of the Fz family (3,31). Several other proteins possessing a Fz CRD have been identified, including tyrosine kinases (32,33), carboxypeptidase Z (34), and an isoform of collagen XVIII (35). In addition, a set of secreted Fz-related proteins (sFRPs) has been described that are ∼300 amino acids in length and contain an N-terminal Fz CRD that is typically ∼30-50% identical to the CRDs of Fzs (36-46). These proteins bind Wnts and regulate their activity in a variety of assays. Although the Wnt binding of sFRPs is generally believed to be mediated by the Fz CRD, interaction between Wingless (Drosophila ortholog of mammalian Wnt1) and an sFRP-1 mutant lacking the CRD implies that other mechanisms of direct or indirect interaction also exist (47).
The C-terminal heparin-binding portion of sFRPs bears weak homology with netrins (36,37), proteins involved in axonal guidance (48). Originally, this potential relationship was based on the presence of clusters of positively charged residues and a few other conserved amino acids distributed over a span of ∼50 amino acids in FrzB/sFRP-3 (36). More recently, Bányai and Patthy (49) identified a netrin (NTR) module in the C-terminal domains of netrins, sFRPs, type I procollagen C-proteinase enhancer proteins (PCOLCEs), complement proteins C3, C4, and C5, and in the N-terminal domains of tissue inhibitors of metalloproteinases (TIMPs). This homology was based on related patterns of six conserved cysteines, several conserved segments of hydrophobic residues, and a correlation between predicted and known secondary structure in some of the proteins having the domain. However, experimentally determined disulfide bond assignments for the cysteine residues were only available for TIMPs and complement protein C3, the latter being a variant in the group that contains only four of the conserved cysteines. Thus, the validity of the proposed NTR module would be reinforced if the disulfide structure of another protein containing the putative domain conformed to the predicted scheme.
In this study, we characterized the post-translational processing of sFRP-1. The linkages of the eight disulfide bonds and the site of N-linked glycosylation in sFRP-1 were determined using MALDI-MS and N-terminal sequencing of purified peptides. The data show that sFRP-1 has two distinct domains with 10 and 6 cysteines in the N- and C-terminal domains, respectively. The N-terminal domain has a pattern of disulfide linkages identical to that of the Fz CRD recently defined in rat tyrosine kinase Ror-1, mouse sFRP-3, and mouse Fz8 (50,51). The assignment of disulfides in the C-terminal domain experimentally validates the primary disulfide pattern predicted for NTR modules (49). In addition, these results provide the first complete experimental assignment of disulfide linkages in a recombinant sFRP containing both a CRD and an NTR domain in tandem. An interesting aspect of this assignment is that two other subsets of sFRPs are likely to have different disulfide linkages compared with sFRP-1, suggesting that shuffling of several disulfide bonds may have occurred during evolution of this protein family.

EXPERIMENTAL PROCEDURES
Materials-Trypsin (sequencing grade) was purchased from Promega (Madison, WI). Subtilisin was obtained from Roche Molecular Biochemicals. Tris-(2-carboxyethyl)-phosphine (TCEP) was obtained from Pierce. Cyanogen bromide (CNBr) was obtained from Aldrich. Reagents for PAGE were obtained from Bio-Rad. All other reagents were either high performance liquid chromatography (HPLC) grade or the highest quality analytical reagent grades available.
CNBr Fragmentation-CNBr was used for initial fragmentation of sFRP-1 for disulfide assignments because the intact unreduced protein was unusually resistant to cleavage by all proteases tested. Acetic acid (5% final concentration) was added to purified sFRP-1 (1.81 mg/2 ml), and the protein was desalted on an Econo-Pac 10DG desalting column (Bio-Rad) using 5% acetic acid to elute the protein. Fractions containing protein were pooled, lyophilized, and reconstituted in 88% formic acid followed by the addition of a 100-fold molar excess of CNBr over the Met content. After overnight incubation in the dark at room temperature under argon, the sample was lyophilized twice and redissolved in 500 µl of 7 M urea, 50 mM NaH2PO4, and 50 mM glycine, pH 6.5. The CNBr fragments were separated by HPLC gel filtration using two TSK columns, G3000SWXL and G2000SWXL, connected in series with a 10 mM sodium phosphate, 150 mM NaCl, 7 M urea, pH 6.5, buffer at a flow rate of 0.6 ml/min. Fractions were analyzed by SDS-PAGE and mass spectrometry.
Reversed Phase HPLC Separation of Peptides-Peptides were separated by reversed phase (RP) HPLC on a ZORBAX 300SB-C18 column (2.1 × 150 mm, Hewlett-Packard) using a System Gold HPLC (Beckman Instruments, Fullerton, CA) at a flow rate of 0.2 ml/min. A linear gradient was applied using solvent A (0.1% trifluoroacetic acid in water) and solvent B (0.085% trifluoroacetic acid in 95% acetonitrile). Where required, subtilisin digests were reduced prior to RP-HPLC by adding an equal volume of 20 mM TCEP in 200 mM ammonium bicarbonate, pH 8.0. The mixture was incubated at 37°C for 1 h, and 1.7% trifluoroacetic acid (final concentration) was then added prior to injection onto the HPLC column.
Partial Reduction with TCEP and Alkylation-TCEP partial reduction of peptide complexes containing multiple disulfides was performed as described previously (52). The purified peptide complex (160 pmol/50 µl in 0.1% trifluoroacetic acid) was mixed with an equal volume of 20 mM TCEP in 50 mM citrate, pH 3.2, and incubated for 3 min at 22°C. Alkylation of peptides was performed by adding the TCEP-reduced peptide solution into an equal volume of 1 M iodoacetamide in 200 mM HEPES, 2 mM EDTA, pH 8.0, followed by incubation at 37°C for 30 min. The reaction was stopped by adding 1.3% trifluoroacetic acid (final concentration).
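The two equal-volume mixing steps in this procedure halve every solute concentration each time. The sketch below tracks the nominal concentrations through both steps; it assumes ideal (additive) volumes and ignores the final trifluoroacetic acid addition, and the function and variable names are illustrative only.

```python
def after_equal_volume_mix(conc):
    """Concentration of a solute after its solution is mixed 1:1 with another."""
    return conc / 2.0

# Starting solutions as stated in the procedure.
peptide_uM = 160.0 / 50.0      # 160 pmol in 50 µl; pmol/µl is numerically µM
tcep_mM = 20.0
iodoacetamide_mM = 1000.0      # 1 M

# Step 1: peptide complex + equal volume of TCEP in citrate, pH 3.2.
peptide_reduction_uM = after_equal_volume_mix(peptide_uM)
tcep_reduction_mM = after_equal_volume_mix(tcep_mM)

# Step 2: reduced mixture + equal volume of iodoacetamide in HEPES, pH 8.0.
peptide_alkylation_uM = after_equal_volume_mix(peptide_reduction_uM)
tcep_alkylation_mM = after_equal_volume_mix(tcep_reduction_mM)
iodoacetamide_alkylation_mM = after_equal_volume_mix(iodoacetamide_mM)

print(f"reduction:  peptide {peptide_reduction_uM:.1f} µM, TCEP {tcep_reduction_mM:.0f} mM")
print(f"alkylation: peptide {peptide_alkylation_uM:.1f} µM, TCEP {tcep_alkylation_mM:.0f} mM, "
      f"iodoacetamide {iodoacetamide_alkylation_mM:.0f} mM")
```

Under these assumptions the alkylating reagent is present in roughly 100-fold molar excess over TCEP during the alkylation step.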
N-terminal Sequence Analysis-Automated Edman sequencing was performed using an Applied Biosystems model 494 protein sequencer as described previously (53).
Mass Spectrometry-Molecular mass analysis was performed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-MS) using a Voyager DE-PRO mass spectrometer (PerSeptive Biosystems, Framingham, MA) with an accelerating voltage of 20 kV. Data were acquired either in linear or reflector mode using either external or internal calibration with protein A (44,614 Da), ubiquitin (8567.49 Da), insulin β chain (3496.96 Da), and bradykinin (1061.24 Da). When necessary, samples were desalted using C18 ZipTips (Millipore, Bedford, MA) followed by elution with a small volume of 50% acetonitrile, 0.1% trifluoroacetic acid. The intact protein and large peptides were mixed 1:1 with a saturated solution of 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid, Sigma) in 33% acetonitrile and 0.1% trifluoroacetic acid for MALDI-MS analysis. Peptides <5 kDa from CNBr fragmentation and subtilisin digestion were applied to MS sample plates precoated with a saturated solution of nitrocellulose and α-cyano-4-hydroxycinnamic acid (1:4 w/w) in 2-propanol and acetone (1:1 v/v) as described previously (54). Where required, CNBr fragments and RP-HPLC samples (20 µl) were reduced in 10 volumes of 2 mM TCEP, 20 mM ammonium bicarbonate, pH 8.0, at 37°C for 1 h, followed by desalting on ZipTips to remove the TCEP and ammonium bicarbonate.
Site-directed Mutagenesis at Asn172 and Asn262-Single amino acid substitutions were introduced with sFRP-1/pcDNA3.1 (47) as template and the QuikChange XL Site-directed Mutagenesis Kit (Stratagene) following the manufacturer's instructions. N172Q and N262Q were generated, respectively, with the following primer pairs: GCCATGACGCCGCCCCAAGCCACCGAAGCCTCC (forward)/GGAGGCTTCGGTGGCTTGGGGCGGCGTCATGGC (reverse); and CCCTGCCACCAGCTGGACCAACTCAGCCACCACTTCCTC (forward)/GAGGAAGTGGTGGCTGAGTTGGTCCAGCTGGTGGCAGGG (reverse). The underlined letters indicate the location of mutations introduced to modify the sequence. DNA constructs were sequenced to confirm the presence of the intended substitutions and ensure the absence of random mutations at any other sites. Recombinant expression and protein purification were performed as described previously for wild-type sFRP-1 (47).
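Mutagenic primer pairs of this kind are expected to be exact reverse complements of one another. The following minimal sketch checks that property for the two pairs listed above (with the line-break hyphens removed) and locates the glutamine codon CAA in each forward primer. The assumption that each forward primer begins in the coding reading frame is ours and is not stated in the text; the helper names are illustrative.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def in_frame_codons(seq):
    """Split a sequence into codons starting at its first base (assumed frame)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

# Primer sequences as given in the text.
PRIMERS = {
    "N172Q": ("GCCATGACGCCGCCCCAAGCCACCGAAGCCTCC",
              "GGAGGCTTCGGTGGCTTGGGGCGGCGTCATGGC"),
    "N262Q": ("CCCTGCCACCAGCTGGACCAACTCAGCCACCACTTCCTC",
              "GAGGAAGTGGTGGCTGAGTTGGTCCAGCTGGTGGCAGGG"),
}

is_complementary = {}
gln_codon_index = {}
for name, (forward, reverse) in PRIMERS.items():
    is_complementary[name] = reverse_complement(forward) == reverse
    # CAA encodes Gln in the standard genetic code.
    gln_codon_index[name] = in_frame_codons(forward).index("CAA")
    print(name, len(forward), "nt; complementary:", is_complementary[name],
          "; CAA at codon", gln_codon_index[name] + 1)
```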

RESULTS
Protein Purification and Characterization of sFRP-1-Recombinant sFRP-1 was purified from MDCK cell culture supernatant by heparin-Sepharose affinity chromatography. The purified protein migrated on SDS-PAGE with an apparent mass of ∼35 kDa (Fig. 1A). MALDI-MS analysis of purified sFRP-1 showed a single broad peak with an average mass (MH+) of 35,452 Da. N-terminal sequence analysis of sFRP-1 indicated that the majority of the polypeptide chains began with Ser31, whereas ∼10% and ∼7% of the sample began with Asp41 and Phe50, respectively (Fig. 2). MALDI-MS analyses of tryptic peptides revealed the predominant C terminus of sFRP-1 to be Phe312, although trace amounts of the protein with C termini at Phe308, Gln309, Ser310, Val311, and Lys313 were observed (data not shown). Because the calculated amino acid sequence mass of sFRP-1(Ser31-Phe312) is 32,394 Da, the difference between observed and calculated mass suggested that the molecule was glycosylated, with the major species carrying ∼3000 Da of carbohydrate. The broad MS peak shape was consistent with heterogeneity of the putative carbohydrate moiety and the heterogeneous proteolytic processing of the N and C termini described above.
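The carbohydrate estimate above is a simple mass difference. As a rough sketch, assuming the observed value is a singly protonated ion (MH+) and the calculated value is the neutral polypeptide mass (the text does not say which convention was used for the calculated mass), the arithmetic is:

```python
PROTON_MASS = 1.00728  # Da, mass of a proton

observed_mh = 35452.0            # MALDI-MS average mass (MH+) of intact sFRP-1
calculated_polypeptide = 32394.0 # calculated sequence mass, Ser31-Phe312

# Remove the ionizing proton, then subtract the polypeptide mass.
extra_mass = observed_mh - PROTON_MASS - calculated_polypeptide
print(f"Mass not accounted for by the polypeptide: {extra_mass:.0f} Da")
```

The result is close to the ∼3000 Da quoted in the text; the peak is broad, so the last few daltons carry no significance.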
CNBr Fragmentation-Initial fragmentation of sFRP-1 utilized CNBr to cleave peptides on the C-terminal side of methionines, which resulted in conversion of these residues to a mixture of homoserine and homoserine lactone. Masses corresponding to both methionine derivatives were observed for most peptides. For simplicity, only masses corresponding to the predominant homoserine lactone form are reported (residue mass = 83.04 Da), and these residues are indicated as Metxxx when peptide sequences are described. The CNBr fragments were separated by HPLC gel filtration, and major peaks/pools were designated C1 to C6 as shown in Fig. 3A. The protein bands observed in fractions C1, C2, and C3 on nonreducing gels shifted to lower molecular weight positions on reducing gels, which indicated these fractions contained disulfide linkages (Fig. 3, B and C). MALDI-MS of C1 showed a single broad peak with an average mass of 30,428.7 Da prior to reduction, whereas masses corresponding to Ser31-Met75, Ala87-Met143, and a weak signal for Lys301-Phe312 were observed after reduction (Table I). Comparisons of SDS gel bands and masses of C1, C2, and C3 showed that Met168 had not been cleaved in C1, resulting in isolation of a single large unreduced complex containing all disulfide-linked peptides. This incompletely fragmented CNBr peptide was not directly identified in the MS analysis, apparently due to a combination of its large size and the glycosylated moiety on this fragment that interfered with ionization of the peptide after reduction (see below). The C2 peptide complex contained the six cysteines from the heparin-binding (NTR) domain in three polypeptide chains as follows: glycosylated Thr169-Met210, Lys211-Met270, and Lys301-Phe312, which confirmed the major C terminus of the protein was Phe312. The C3 peptide complex contained the 10 cysteines from the Fz CRD in three polypeptide chains: Ser31-Met75, Ala87-Met143, and Leu154-Met168. The C4 to C6 peptide fractions did not contain any cysteine residues and were determined to be Gly271-Met297, Gln144-Met153, and Val76-Met86, respectively. Peaks C1 to C3 were further analyzed as described below to determine the disulfide linkages of sFRP-1.
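A quick bookkeeping check of the CNBr fragments named above shows how much of the mature chain (Ser31 to Phe312) they account for. The sketch below uses only the residue ranges reported in this section; the helper name is illustrative.

```python
# (first residue, last residue) of each CNBr fragment reported in pools C2-C6.
FRAGMENTS = [
    (169, 210), (211, 270), (301, 312),  # C2: heparin-binding (NTR) domain
    (31, 75), (87, 143), (154, 168),     # C3: Fz CRD
    (271, 297),                          # C4
    (144, 153),                          # C5
    (76, 86),                            # C6
]

def uncovered_ranges(fragments, first, last):
    """Return residue ranges within first..last not covered by any fragment."""
    covered = set()
    for start, end in fragments:
        covered.update(range(start, end + 1))
    gaps, gap_start = [], None
    for position in range(first, last + 1):
        if position not in covered:
            if gap_start is None:
                gap_start = position
        elif gap_start is not None:
            gaps.append((gap_start, position - 1))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, last))
    return gaps

gaps = uncovered_ranges(FRAGMENTS, first=31, last=312)
print("Residues not assigned to a reported fragment:", gaps)
```

The listed fragments are contiguous from Ser31 to Met297; only residues 298-300 are not assigned to any reported pool, and the text does not comment on them.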
FIG. 3. CNBr fragmentation of sFRP-1. A, chromatographic separation of sFRP-1 CNBr fragments (1.8 mg) on two TSK columns, G3000SWXL and G2000SWXL, as described under "Experimental Procedures." Pools that were further analyzed are indicated as C1-C6. MALDI-MS analyses of these fractions are summarized in Table I. B and C, CNBr-fragmented sFRP-1 was separated on 15% Tris-Tricine gels under nonreducing (B) and reducing (C) conditions followed by staining with Coomassie Blue: lane 1, heparin affinity purified sFRP-1; lane D, CNBr digest; lanes C1-C3, the indicated pools from A. C1-C3 bands on the nonreducing gel shifted to lower molecular weight positions after reduction, indicative of peptides containing disulfide linkages. The C1 fraction contained an incomplete CNBr fragment with all eight disulfide linkages. C2 and C3 were determined to be the C-terminal heparin-binding (NTR) domain and the N-terminal CRD, respectively. These three pools were used for further digestion with subtilisin.

Analysis of the C-terminal Heparin-binding Domain-The total mass of the unreduced C2 peptide complex indicated a disulfide-linked complex containing Thr169-Met210, Lys211-Met270, and Lys301-Phe312, plus an additional mass of 2812 Da that proved to be due to glycosylation. Because C2 contained three disulfide bonds, further cleavage with subtilisin (enzyme:substrate = 1:3 (w/w)) was used. Representative RP-HPLC chromatograms of the C2 subtilisin digest before and after reduction are shown in Fig. 4. All peak fractions in the nonreduced chromatogram and selected peaks in the reduced chromatogram were analyzed by MALDI-MS. Peptides that could not be unambiguously identified by mass analysis were subjected to Edman sequencing. Three major peaks observed in the nonreduced digest (C2-S1 to C2-S3, upper panel of Fig. 4) were observed to be disulfide-linked complexes. In addition, several new peaks appeared in the reduced subtilisin digest chromatogram that corresponded to cysteine-containing peptides released from disulfide linkages after reduction. The C2-S1 complex [...] (Table II). The C2-S2 complex had four cysteines in two polypeptide chains, Gly181-Lys193 and Leu249-Leu263, with heterogeneous cleavage at the C termini of Thr182, Lys250, Asn251, and Gly252, whereas the C2-S3 complex was the same as C2-S2 with heterogeneous cleavage only at Thr182. Because further attempts to cleave between adjacent cysteines in C2-S2 and C2-S3 were not successful, the C2-S3 complex was subjected to partial reduction using TCEP followed immediately by alkylation with iodoacetamide and subsequent separation by RP-HPLC. The results from MALDI-MS and Edman sequence analyses of the partially reduced and alkylated C2-S3 are summarized in Table III [...]

[...] 168, linked by five disulfide bonds. The C3 peptide complex was subjected to subtilisin digestion (E:S = 1:3 (w/w)) and RP-HPLC to further separate disulfide-linked complexes (Fig. 5). Several peaks observed in the nonreduced digest (C3-S1 to C3-S7) were not observed in the reduced digest, indicating the presence of disulfide-linked complexes. The C3-S1 to C3-S3 complexes had a total of two cysteines in two polypeptide chains, Cys104-Gln109 and Ala134-Met143, with heterogeneous cleavage at the C termini of Ala134 and Ser138, giving a direct disulfide assignment of Cys104-Cys139 (Table II). The C3-S4 and C3-S5 complexes consisted of two polypeptide chains, Arg65-Asn69 and Val110-Cys113, with heterogeneous cleavage at the C terminus of Leu66. Because these complexes contained only two cysteines, another direct disulfide assignment of Cys67-Cys113 was obtained. The C3-S6 and C3-S7 complexes had six cysteines in three polypeptide chains, Thr52-Leu64, Phe116-Glu133, and Leu154-Ala167, with heterogeneous cleavage at the C terminus of Val119.
Because the yield for C3-S6 and C3-S7 complexes was too low for either further protease cleavage experiments or partial reduction and alkylation analysis, the C1 complex from the CNBr fragmentation was digested with subtilisin using a 1:9 (w/w) enzyme-to-substrate ratio, followed by RP-HPLC separations optimized to isolate the peptide complexes corresponding to C3-S6 and C3-S7 (described above) from this more complex starting sample. The purified peptide complex containing Thr52-Leu64, Phe116-Glu133, and Leu154-Ala167 from RP-HPLC of the first C1 subtilisin digestion (C1-S) was redigested with subtilisin (E:S = 1:3 (w/w)) to further fragment this complex. The C1-S-S1 complex had two cysteines in two polypeptide chains, Arg129-Glu133 and Leu154-Asp157, giving a direct disulfide assignment of Cys132-Cys156 (Table II). The C1-S-S1 complex showed a 43-Da mass increase compared with the expected sequence mass, and the N terminus was not available for Edman sequencing. These results suggest that the N-terminal amino group was carbamoylated. Apparently the extended incubation of this peptide in urea-containing buffers through multiple sequential protease digestions resulted in this artifactual modification. The C1-S-S2 peptide complex had four cysteines in the disulfide-linked peptide chains Thr52-Leu64, Cys120-Cys128, and Lys158-Cys165. Because additional proteolysis of C1-S-S2 was not successful, partial reduction with TCEP and alkylation were used to complete disulfide bond assignments of this domain. The results from MALDI-MS and Edman sequence analyses of C1-S-S2 partial reduction and alkylation are summarized in Table III. Peptide C1-S-S2-R1 was the completely reduced and alkylated peptide Thr52-Leu64. C1-S-S2-R2 was composed of peptide Cys120-Cys128 with an alkylated cysteine and the peptide Lys158-Cys165. Edman sequencing of C1-S-S2-R2 showed that Cys120 was alkylated, indicating that this peptide complex was linked by the disulfide bond Cys128-Cys165 and that the remaining disulfide bond linkage was Cys57-Cys120. Therefore, the complete disulfide bond assignments of the sFRP-1 N-terminal Fz CRD were determined to be Cys57-Cys120, Cys67-Cys113, Cys104-Cys139, Cys128-Cys165, and Cys132-Cys156.
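The logic of the partial reduction step can be made explicit by enumeration. The four cysteines of C1-S-S2 (Cys57, Cys120, Cys128, Cys165) can pair in three ways; the single linkage that survived in C1-S-S2-R2 (Cys128-Cys165, deduced from Cys120 being alkylated) singles out one of them. The sketch below is illustrative only and its function name is our own.

```python
def pairings(cysteines):
    """Yield every way to pair up an even-length list of cysteines."""
    if not cysteines:
        yield frozenset()
        return
    first, rest = cysteines[0], cysteines[1:]
    for partner in rest:
        remaining = [c for c in rest if c != partner]
        for sub_pairing in pairings(remaining):
            yield sub_pairing | {frozenset((first, partner))}

# Cysteines present in the C1-S-S2 complex.
CYSTEINES = [57, 120, 128, 165]

# Linkage still intact in C1-S-S2-R2: Cys120 was alkylated (free), so the
# bond joining peptide Cys120-Cys128 to peptide Lys158-Cys165 is Cys128-Cys165.
surviving_bond = frozenset((128, 165))

all_pairings = list(pairings(CYSTEINES))
consistent = [p for p in all_pairings if surviving_bond in p]

for pairing in consistent:
    print(sorted(tuple(sorted(bond)) for bond in pairing))
```

Only the pairing Cys57-Cys120 plus Cys128-Cys165 remains, in agreement with the assignment given in the text.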
FIG. 4. [...] ZORBAX 300SB-C18 column as described under "Experimental Procedures" using the following gradient: 2% solvent B for 5 min; 2-32% solvent B over 75 min; and 32-60% solvent B over 35 min. Lower panel, separation of 5 µg of C2 subtilisin digest after reduction with TCEP using the same gradient. Major peaks that disappeared following reduction are indicated by C2-S1 to C2-S3. MALDI-MS analyses of these fractions are summarized in Table II.

N-Linked Glycosylation Site of sFRP-1-The location and approximate size of the N-linked glycosylation site of sFRP-1 were determined by N-terminal Edman sequencing and MALDI-MS analyses of CNBr and subtilisin-digested peptide complexes and reduced peptides. As mentioned above, the CNBr C2 peptide complex consisted of three peptides, Lys211-Met270, Lys301-Phe312, and Thr169-Met210, with a mass 2812 Da higher than the expected amino acid sequence mass (Table I) [...] expected mass, indicating that Asn262 was not modified. However, a mass for Thr169-Met210, which includes the potential N-linked glycosylation site at Asn172, was not observed. Instead, a weak and broad 7266-Da mass was observed, which suggested that Thr169-Met210 contained an ∼2812-Da carbohydrate moiety on Asn172. Glycosylation at Asn172 was confirmed by Edman sequencing of peptides from the reduced C2 complex and C2-S3-R4 complex. The expected yield of Asn was observed at residue 262, indicating no apparent modification at this site. In contrast, no Asn was observed at residue 172, indicating that this Asn was completely modified. No evidence of O-linked glycosylation was observed in MALDI-MS analysis of CNBr and subtilisin fragments.
In addition to the above analyses, site-directed mutagenesis was performed at both possible N-linked glycosylation sites, Asn172 and Asn262, individually and simultaneously. Purified recombinant proteins containing either one or both of these substitutions were analyzed by SDS-PAGE, and their mobilities were compared with that of wild-type sFRP-1 (Fig. 6). Derivatives containing the Gln172 substitution migrated faster than native sFRP-1, whereas the Gln262 modification did not alter the mobility of the proteins. These findings were consistent with the conclusions from MALDI-MS and Edman sequence analyses that N-linked glycosylation was present at Asn172 but not at Asn262.

DISCUSSION
The disulfide bonds in recombinant sFRP-1 have been determined by a combination of MALDI-MS, peptide mapping, and N-terminal sequencing. All peptide cleavage steps were carried out below pH 6.5 to prevent disulfide scrambling. The disulfide-bonding linkages in sFRP-1 are summarized in Fig. 2. CNBr treatment was chosen for the initial fragmentation in this study because attempts to cleave sFRP-1 with various proteases were not successful. Assignments of disulfide linkages in the C2-S3 ({Gly181-Lys193}-{Leu249-Leu263}) and C1-S-S2 ({Thr52-Leu64}-{Cys120-Cys128}-{Lys158-Cys165}) peptide complexes were not straightforward because each complex contained four cysteines (Table II). Because both peptide complexes were resistant to further proteolysis under all conditions evaluated, partial reduction with TCEP followed by alkylation was then used to determine disulfide bond assignments. Edman sequencing of the partially reduced and alkylated peptide complexes (C2-S3-R4 and C1-S-S2-R2) allowed the unambiguous assignment of these disulfide linkages, and no disulfide scrambling was observed.
c Calculated mass of each peptide includes the mass of carboxyamidomethylation (number of carboxyamidomethylated residues × 58 Da). d Assignments were confirmed by N-terminal sequencing. e C2-S3-R4 has heterogeneity at the N terminus of (181GT)183T-K193 that was not resolved by the RP-HPLC gradient used here. Mass heterogeneity caused by the heterogeneous cleavage is indicated in parentheses.

FIG. 5. Chromatographic identification of disulfide-linked peptide complexes from C3 after subtilisin digestion. Upper panel, chromatographic separation of a C3 subtilisin digest (3.2 µg) on a ZORBAX 300SB-C18 column using the gradient described in Fig. 4. Lower panel, chromatographic separation of 1.1 µg of C3 subtilisin digest after reduction with TCEP using the same gradient. Major peaks, which disappeared upon reduction, are indicated by C3-S1 to C3-S7. MALDI-MS analyses of these fractions are summarized in Table II.

FIG. 6. Determination of N-linked glycosylation site of sFRP-1 by site-directed mutagenesis. Recombinant native and mutant sFRP-1 proteins containing glutamine substitutions in either one or both of the potential N-linked glycosylation sites (N172Q and N262Q) were purified, and their apparent sizes were compared following SDS-PAGE. Proteins having greater mass are indicated with an arrow and ones having less mass with an arrowhead. Differences in mass are attributed to the presence or absence of carbohydrate.

The N-terminal portion of sFRP-1 has been predicted to be homologous to the putative Wnt-binding site of Frizzleds (38, 39). The disulfide linkages and cysteine spacings of human sFRP-1 determined experimentally in the present study are compared with putative homologous domains of other proteins in Fig. 7. As shown, the sFRP-1 N-terminal Fz CRD module has a disulfide linkage pattern of 1-5, 2-4, 3-8, 6-10, and 7-9, consistent with the disulfide-bonding pattern of the Fz module recently determined in rat Ror1 receptor tyrosine kinase, mouse sFRP-3, and mouse Fz8 (50,51). The cysteine spacings of these domains are highly conserved throughout the homologs and orthologs, with the greatest variation occurring between C8 and C9 (spacing ranges from 12 to 27 residues) and intermediate variability between C2 and C3 (36-41 residues) and C9 and C10 (8-13 residues). However, it is quite interesting that Sizzled, Sizzled2, and Crescent, a subset of sFRPs that currently have been described only in Xenopus and chicken, contain an 11th cysteine residue (C*) in their CRDs, which is located between the conserved C8 and C9 residues (see below for further discussion and Fig. 7, upper panel). Diversity in these regions may contribute to distinct specificities for Wnt binding that presumably are characteristic of different Fz family members.
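The ordinal pattern quoted for the Fz CRD (1-5, 2-4, 3-8, 6-10, 7-9) follows directly from the residue-level assignments reported under "Results" once the cysteines are ranked by position. A minimal sketch of that conversion, with an illustrative function name:

```python
# Disulfide bonds of the sFRP-1 Fz CRD, by residue number, as reported in Results.
FZ_CRD_BONDS = [(57, 120), (67, 113), (104, 139), (128, 165), (132, 156)]

def ordinal_pattern(bonds):
    """Renumber bonded cysteines 1..n by sequence position and return the pairs."""
    residues = sorted({residue for bond in bonds for residue in bond})
    rank = {residue: index for index, residue in enumerate(residues, start=1)}
    return sorted((rank[a], rank[b]) for a, b in bonds)

pattern = ordinal_pattern(FZ_CRD_BONDS)
print(", ".join(f"{a}-{b}" for a, b in pattern))
```

Expressing bonds by rank rather than residue number is what allows the pattern to be compared across homologs whose cysteine spacings differ.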
The disulfide linkages of the C-terminal domain of sFRP-1 determined experimentally in the present study are 1-4, 2-5, and 3-6. These assignments experimentally verify a primary disulfide linkage/cysteine spacing pattern (pattern A in Fig. 7) that was previously predicted in a model of NTR modules (49). As illustrated in Fig. 7, the sFRP-1 disulfide linkage matches that determined for human TIMPs (pattern B), although the location of C5 in the aligned sequences is quite different. Indeed, both cysteine spacings and disulfide linkages appear to be quite variable within putative NTR domains. We propose that NTR modules could be categorized into five groups or subfamilies based upon the divergent cysteine spacings and experimentally determined or predicted disulfide linkages (Fig. 7).
With the results for sFRP-1 described herein, assignments have now been rigorously determined for three of these five groups. The disulfide-bonding pattern of the sFRP-1 NTR domain most closely matches that of hNet2l, the hPCOLCEs, and hWFIKKN (pattern A). Although relatively little is known about the functional significance of NTR domains, it is noteworthy that naturally occurring truncated fragments of hPCOLCE1 that begin slightly upstream of the NTR domain have been reported to have protease inhibitory activity (55). A similarly sized fragment of hPCOLCE2 also has been observed in cell culture fluid (56). The association of NTR domains with protease inhibitory activity was first described by Bányai and Patthy (49) when they recognized that the protease-binding, N-terminal domain of TIMPs is an NTR module. Thus, this additional link of protease inhibitory activity with proteins having an NTR module reinforces their speculation that sFRPs, or perhaps fragments of sFRPs, might have such activity as well.
Alignments of cysteines in the C-terminal domain of other sFRPs reveal distinct patterns that might have substantial functional and evolutionary implications. When full-length protein sequences are compared, sFRP-1, -2, and -5 are quite similar to each other, whereas sFRP-3 and -4 are more distantly related overall with the greatest divergence in their C-terminal domain sequences (data not shown). In addition, whereas the cysteine spacings of sFRP-1, -2, and -5 NTR domains are quite similar (Fig. 7, pattern A), the C-terminal regions of sFRP-3 and -4 show a distinct cysteine pattern. Consistent with these differences, the C-terminal domains of sFRP-3 and -4 were not included in the group of proteins originally identified as having NTR domains (49). However, when we searched the non-redundant protein database at NCBI using the C-terminal region of human sFRP-3 (residues 170-325) with BLAST, apparent significant homology with human and mouse netrin 4 was observed (E value = 5 × 10⁻¹⁰). Subsequent pairwise sequence alignment of the C-terminal region of sFRP-3 with residues 497-628 of human netrin 4 showed 34% identity over 117 residues encompassing most of the NTR domain of netrin 4. This homology strongly suggests an evolutionary relationship of these two modules despite the fact that the cysteine spacing of the netrin 4 NTR domain fits pattern A, whereas the cysteine spacing of the sFRP-3 C-terminal domain is quite different. We therefore propose that sFRP-3 and -4 contain NTR domains with a different cysteine spacing and disulfide linkage pattern (pattern D, Fig. 7). The unique sets of traits for the sFRP-3 and -4 NTR domains are as follows: both have a cysteine, C0, that is eight residues upstream of C1; the location of C3 relative to C2 and C4 is shifted considerably downstream; and C6 has been lost. The conservation of the closely spaced cysteines, C1-C2 and C4-C5, and comparison with pattern A suggest that the novel C0 might form a disulfide bond with the otherwise unpaired C3 (Fig. 7). Furthermore, sFRP-4 contains two additional cysteine residues downstream of C5a that might form a disulfide bridge with each other. If NTR modules have functional significance for sFRPs, we surmise that the differences observed in the cysteine spacing and inferred disulfide bonding patterns would result in contrasting activities among the various family members.
Sizzled, Sizzled-2, and Crescent represent another subset of sFRPs with a potentially unique disulfide linkage pattern that affects both their Fz CRD and NTR domains. Specifically, they have 11 cysteines in their N-terminal Fz CRDs and only 5 cysteines in their C-terminal domains (Fig. 7, NTR pattern E). As noted above, they have an additional cysteine between C8 and C9 in the Fz CRD, whereas C5 has been lost from the C-terminal domain (compare pattern E versus A). Inspection of the recently determined mouse sFRP-3 Fz CRD crystal structure (51), together with alignment of the Sizzled and Crescent CRD sequences to the mouse sFRP-3 sequence, strongly suggests that the additional Sizzled/Crescent CRD cysteine is located on the surface of the CRD (Fig. 8). Cysteines exposed on surfaces of extracellular proteins usually form disulfide bonds due to the oxidizing extracellular environment. Hence, it is tempting to hypothesize that the NTR domain C2, which presumably would be unpaired due to the loss of C5, might form an interdomain disulfide bond with the additional, unpaired 11th cysteine located between C8 and C9 in the Fz CRD (Fig. 8). Of course this interesting model is highly speculative, but it also is readily testable. One alternative to the hypothesized interdomain disulfide might be interchain disulfide links to yield covalent homodimers. The crystal structures of mouse sFRP-3 and mouse Fz8 showed that Fz CRDs form non-covalent dimers under certain conditions. However, an intermolecular disulfide between the two 11th CRD cysteines in a dimer is not likely because the dimer interface in the crystal structure is on the opposite side of the molecule.
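The counting argument behind this hypothesis can be sketched as follows. The sFRP-1 pairs are the linkages reported in this study (cysteines numbered sequentially from the N terminus); the helper names `unpaired` and `min_unpaired_within_domain` are hypothetical, and the Sizzled/Crescent calculation uses only the cysteine counts given above, not experimentally determined linkages.

```python
# Disulfide pairs of human sFRP-1 reported in this study
SFRP1_PAIRS = [(1, 5), (2, 4), (3, 8), (6, 10), (7, 9),  # Fz CRD
               (11, 14), (12, 15), (13, 16)]             # NTR domain


def unpaired(n_cysteines, pairs):
    """Return the cysteines (numbered 1..n) that appear in no disulfide pair."""
    paired = {cys for pair in pairs for cys in pair}
    return sorted(set(range(1, n_cysteines + 1)) - paired)


def min_unpaired_within_domain(n_cysteines):
    """Minimum number of free cysteines if all bonds stay inside one domain."""
    return n_cysteines % 2


# All 16 cysteines of sFRP-1 are accounted for by the 8 assigned disulfides.
print(unpaired(16, SFRP1_PAIRS))

# Sizzled/Crescent: 11 cysteines in the Fz CRD and 5 in the C-terminal domain.
# An odd count in each domain leaves at least one free cysteine per domain,
# which a single interdomain disulfide could in principle pair.
print(min_unpaired_within_domain(11), min_unpaired_within_domain(5))
```

This parity check only shows that an interdomain bond is compatible with the cysteine counts; it does not distinguish that model from the interchain alternative or from free thiols.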
Previous reports indicated that sFRP biosynthesis was associated with partial proteolysis (36,47). For instance, when epitope tags were placed at the C terminus of Xenopus Frzb-1/sFRP-3, the expected tagged proteins were detected in the intracellular compartment but not in the conditioned medium, implying proteolytic cleavage near the C terminus (36). In the present study, the predominant proteolytic event at the C terminus is removal of the terminal residue, Lys313. In addition, the N terminus was heterogeneously processed with the majority of purified sFRP-1 starting at Ser31 as well as minor amounts of protein starting with Asp41 or Phe50, as noted earlier (47). As described above, the unreduced protein was highly resistant to protease digestion with the exception of these small segments at both termini. A particularly protease-resistant core was identified as the disulfide-linked {Phe50-Lys193}-{Asn251-Arg272} complex produced by extended trypsin digestion using high enzyme ratios in the presence of buffers containing 3 M urea (data not shown).
FIG. 7. Cysteine spacing and disulfide bonding patterns of cysteine-rich motifs related to sFRP-1. Top panel (Fz CRD), experimentally determined disulfide structures (solid lines connecting cysteines) and cysteine spacings of human (h) sFRP-1 (this study) and rat Ror-1 (rRor-1) (50) are separately compared with closely related homologs described in a recent phylogenic study (32). Proteins represented include the following: human sFRP-3, -4, and -5 and mouse sFRP-1, -2, and -3 (sFRPs human and mouse); human Frizzled 3 and 5 (hFzd3,5); human muscle-specific kinase (hMuSK); human carboxypeptidase Z (hCPZ); human collagen XVIII isoform (hCollagen). The Fz CRD for the more divergent Xenopus Sizzled and Sizzled2 (Szl and Szl2) and chicken Crescent (Crescent) proteins that have an 11th cysteine are shown in a third group. Bottom panel (NTR Domain), five patterns of disulfide linkages based on experimental data (solid lines) and/or cysteine spacing are shown. Pattern A includes the disulfide assignments for human sFRP-1 determined in the present study; other proteins with similar cysteine spacings and presumably the same disulfide structure include the following: human sFRP-5; mouse sFRP-1,2; human netrin-2 like protein (hNet2l); human procollagen C-proteinase enhancer proteins-1 and -2 (hPCOLCE1,2); human WAP, Fs, Ig, Ku, and NTR protein (hWFIKKN) (49,56,57). Pattern B, experimentally determined disulfide structure for human tissue inhibitor of metalloproteinases 1 and 2 (hTIMP1,2) and predicted for human TIMP 3 and 4 (hTIMP3,4) (58, 59). Pattern C, experimentally determined disulfide structure for complement C3 (Complement C3) (21) and predicted for complement C4 and C5 (Complement C4,5). Pattern D, predicted disulfide structure (dotted lines connecting cysteines) for human sFRP-3 and -4 based on comparison of cysteine spacings with that of sFRP-1,2,5 as well as sequence alignment to human netrin 4 (see text). Pattern E, predicted disulfide structure (dotted lines) for Xenopus Sizzled1 and -2 (Szl and Szl2) and chicken Crescent (Crescent) based on comparison of cysteine spacings with sFRP-1,2,5. Putative unpaired cysteine is boxed. Cysteines in nonstandard locations (based on previous alignments (49)) are designated with C* or C#.
sFRP-1 has two potential N-linked glycosylation sites on Asn172 and Asn262 (Fig. 2). MALDI-MS and Edman sequence analyses of the C2 and C2-S3-R4 peptide complexes showed that Asn172 is completely glycosylated with a carbohydrate moiety of about 2812 Da, and Asn262 is not modified. Results from site-directed mutagenesis of both possible N-linked glycosylation sites were consistent with N-linked glycosylation at Asn172, and no evidence of glycosylation was observed at Asn262. The mass of the carbohydrate on Asn172 is ∼200 Da less than the ∼3000-Da difference between the calculated mass of the sFRP-1 sequence and the single peak observed in MALDI-MS of intact protein. This minor discrepancy is probably due to errors in the mass measurements caused by both heterogeneity of the carbohydrate moiety and the N terminus. However, there is a slight possibility an additional post-translational modification of the protein exists that eluded detection in the present study.
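The size of this discrepancy follows directly from the two values quoted above, as the short sketch below shows. The function name `unexplained_mass_da` is hypothetical, and the 3000-Da figure is the approximate value stated in the text rather than a precise measurement.

```python
def unexplained_mass_da(observed_minus_calculated_da, glycan_mass_da):
    """Mass difference (Da) not accounted for by the Asn172 carbohydrate."""
    return observed_minus_calculated_da - glycan_mass_da


# Approximate values quoted in the text
intact_mass_excess_da = 3000.0  # observed intact mass minus calculated sequence mass
asn172_glycan_da = 2812.0       # carbohydrate on Asn172 from peptide MALDI-MS

print(unexplained_mass_da(intact_mass_excess_da, asn172_glycan_da))  # 188.0
```

The residual of roughly 190 Da corresponds to the ∼200-Da discrepancy discussed above and is small relative to the uncertainty expected for a heterogeneous glycoprotein with ragged termini.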
In conclusion, we have determined the disulfide linkages and glycosylation sites in human sFRP-1. The disulfide-bonding pattern of the N-terminal Fz CRD matches that recently reported for several other members of the Fz/sFRP family, whereas the pattern in the C-terminal domain reinforces the credibility of the NTR module as a structural entity. This assignment of disulfide linkages in a complete sFRP protein containing both a CRD domain and an NTR module should serve as the basis for exploring disulfide bond shuffling in the sFRP family. The variations in cysteine patterns within subsets of sFRPs suggest an unusual fluidity of disulfide bonds, which are typically strictly conserved over very wide evolutionary distances. Finally, the systematic analysis of the sFRP-1 post-translational modifications provides a sound basis for further structural and functional studies of this protein.
FIG. 8. Schematic models of human sFRP-1 and Xenopus Sizzled proteins. A, sFRP-1, disulfide linkages and glycosylation site determined in the present study. The locations of cysteines numbered from the N terminus of both the Fz CRD and NTR domains are indicated with bold numbers. The N-glycosylation site of sFRP-1 is symbolically indicated with a series of connected hexagons. B, space-filling model of crystal structure of mouse sFRP-3 CRD. Crystallographic data were retrieved from the Protein Data Bank (code 1IJX) and visualized by using WebLab ViewerLite (51). The location of the amino acid residue that aligns with the additional (11th) cysteine between C8 and C9 in the Sizzled Fz CRD is highlighted in black and indicated with an arrow. The exposed location of this residue is consistent with the possibility of a disulfide linkage between C* in the Fz CRD and C2 in the C-terminal domain of Sizzleds (see Fig. 7). C, Xenopus Sizzled, a hypothetical model for a new interdomain disulfide in Sizzled proteins. C* indicates the additional cysteine located between C8 and C9 in the Sizzled/Crescent CRD. Cysteines are labeled as in A. Lengths of solid lines between cysteines that represent the amino acid backbone in each protein model are shown approximately to scale to illustrate relative size of loops and domains.