Localization of Disulfide Bonds in the Cystine Knot Domain of Human von Willebrand Factor*

von Willebrand factor (VWF) is a multimeric glycoprotein that is required for normal hemostasis. After translocation into the endoplasmic reticulum, proVWF subunits dimerize through disulfide bonds between their C-terminal cystine knot-like (CK) domains. CK domains are characterized by six conserved cysteines. Disulfide bonds between cysteines 2 and 5 and between cysteines 3 and 6 define a ring that is penetrated by a disulfide bond between cysteines 1 and 4. Dimerization often is mediated by additional cysteines that differ among CK domain subfamilies. When expressed in a baculovirus system, recombinant VWF CK domains (residues 1957–2050) were secreted as dimers that were converted to monomers by selective reduction and alkylation of three unconserved cysteine residues: Cys2008, Cys2010, and Cys2048. By partial reduction and alkylation, chemical and proteolytic digestion, mass spectrometry, and amino acid sequencing, the remaining intrachain disulfide bonds were characterized: Cys1961–Cys2011 (1-4), Cys1987–Cys2041 (2-5), Cys1991–Cys2043 (3-6), and Cys1976–Cys2025. The mutation C2008A or C2010A prevented dimerization, whereas the mutation C2048A did not. Symmetry considerations and molecular modeling based on the structure of transforming growth factor-β suggest that one or three of residues Cys2008, Cys2010, and Cys2048 in each subunit mediate the covalent dimerization of proVWF.

von Willebrand factor (VWF) is a blood glycoprotein that is required for platelet adhesion to subendothelium, and it binds to and stabilizes blood clotting factor VIII in the circulation (1). VWF is synthesized by endothelial cells (2) and megakaryocytes (3). The sequence of VWF is remarkable for the clustering of cysteine residues in N-terminal and C-terminal regions. Cysteine is the most abundant amino acid in the protein and accounts for 8.3% of the total (4). All cysteine residues in the secreted protein appear to be paired in disulfide bonds (5), some of which comprise intersubunit bonds. After translocation into the endoplasmic reticulum, proVWF subunits dimerize through disulfide bonds near their C termini. These tail-to-tail dimers are transported to the Golgi and there form additional head-to-head disulfide bonds near the N terminus of the subunits, yielding multimers that may exceed 20 million Da in size. The formation of both C-terminal and N-terminal intermolecular disulfides is critical because only the largest VWF multimers can effectively mediate platelet adhesion (1).
The covalent structure of the VWF dimerization domains and the locations of intersubunit disulfide bonds have not been fully characterized. A C-terminal proteolytic fragment of VWF is dimeric (5), and these 151 residues were sufficient to form dimers when expressed in COS cells (6). Thus, some of the 15 cysteines within the last 151 amino acid residues of the VWF subunit form intersubunit bonds (5). The C-terminal 90 residues of VWF (residues 1961-2050, numbered from the N-terminal Ser of the mature VWF subunit) comprise a domain that is homologous to the cystine knot (CK) family of proteins. Many CK proteins are dimeric, although the mode of dimerization varies (7). The mutations C2008Y (8), C2010R (9), and C2043R (10) have been found in patients with variants of von Willebrand's disease type 2A who cannot assemble large VWF multimers. Also, recombinant VWF with the mutation C1991W (11) or C2010R (9) cannot dimerize.
The VWF CK domain contains 11 cysteine residues and is particularly similar to the CK domains of several epithelial mucins and Norrie disease protein (norrin) that form oligomers through C-terminal CK domains (12-14). Studies of these related CK domains are consistent with a conserved role for certain cysteine residues in dimerization. Amino acid residue Cys95 of norrin corresponds to Cys2010 of VWF, and the mutation C95A prevented the oligomerization of recombinant norrin (15). Residues Cys13244 and Cys13246 of porcine submaxillary mucin (PSM) correspond to Cys2008 and Cys2010 of VWF, and the mutation C13244A or C13246A reduced but did not prevent the dimerization of a recombinant PSM CK domain (16).
These results indirectly implicate several cysteines of the VWF CK domain in tail-to-tail VWF dimerization. However, the specific cysteines responsible have not been identified. To address this problem, we have investigated the structure of recombinant VWF CK dimers by partial reduction and alkylation, chemical and proteolytic digestion, mass spectrometry, and amino acid sequencing.
Expression and Purification of Recombinant Proteins-Each baculovirus expression vector was cotransfected into Sf9 cells (PharMingen) with linearized BaculoGold® DNA (PharMingen), and high titer recombinant baculovirus was prepared by repeated infection. High Five™ cells (Invitrogen, Carlsbad, CA) were infected and grown in Express Five serum-free medium plus 18 mM glutamine for 96 h. Media containing VWFCK or VWFCKM were dialyzed against TB (20 mM Tris-HCl, pH 7.9, 20 mM NaCl, 0.02% sodium azide) (19) and applied to Ni-NTA-agarose (Qiagen Inc, Santa Clarita, CA) using a Pharmacia FPLC system (Amersham Pharmacia Biotech). After washing with 10 mM imidazole in TB, protein was eluted with a linear imidazole gradient (10-200 mM) in TB over 20 min at 4 ml min⁻¹. Column fractions containing VWFCK or VWFCKM were pooled and loaded directly onto a C8 column (10 × 250 mm, 5 μm; Vydac/The Separations Group Inc., Hesperia, CA), using a Hewlett-Packard (Palo Alto, CA) 1100 HPLC system. Buffer A was 0.1% (v/v) trifluoroacetic acid (Pierce) in water, and buffer B was 0.1% (v/v) trifluoroacetic acid in HPLC grade acetonitrile (Aldrich). Protein was eluted with a linear gradient of 15-30% buffer B over 15 min at 4 ml min⁻¹.
Deglycosylation of VWFCK-A GlycoFree deglycosylation kit was obtained from Oxford GlycoSystems (Abington, United Kingdom). VWFCK was deglycosylated with trifluoromethane sulfonic acid according to the manufacturer's instructions.
Partial Reduction with TCEP and Alkylation with NEM-VWFCK or VWFCKM was partially reduced and alkylated as described previously (20). In brief, VWFCK (200 μg) in 40 μl of 0.75 M sodium acetate, pH 4.6, 6 M guanidine HCl, and 8 μl of 0.1 M TCEP was incubated at 45°C for 20 min under N2. VWFCKM (200 μg) in 30 μl of 0.75 M sodium acetate, pH 4.6, 6 M guanidine HCl, and 4 μl of 0.1 M TCEP was incubated at 45°C for 20 min under N2. Then 3 μl (VWFCK) or 4 μl (VWFCKM) of 1 M NEM in dimethyl sulfoxide (ICN Biomedicals Inc., Costa Mesa, CA) was added and incubated at 37°C for 60 min under N2. After alkylation, samples were subjected to RP-HPLC using a C4 column (4.6 × 250 mm, 5 μm, Vydac) at 1 ml min⁻¹ in 0.1% aqueous trifluoroacetic acid and eluted with a gradient of acetonitrile.
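The reductant concentrations implied by these volumes can be estimated with a short calculation. This is a minimal sketch, assuming that the stated protein volume (40 or 30 μl) does not already include the TCEP and that volumes are additive; the function name `final_molar` is illustrative only.

```python
def final_molar(stock_molar, stock_ul, other_ul):
    """Final concentration (M) after adding stock_ul of a stock solution
    to other_ul of sample, assuming volumes are simply additive."""
    return stock_molar * stock_ul / (stock_ul + other_ul)

# VWFCK: 8 ul of 0.1 M TCEP added to 40 ul of buffered protein (assumed split)
tcep_vwfck = final_molar(0.1, 8, 40)
# VWFCKM: 4 ul of 0.1 M TCEP added to 30 ul of buffered protein (assumed split)
tcep_vwfckm = final_molar(0.1, 4, 30)

print(f"VWFCK:  {tcep_vwfck * 1e3:.1f} mM TCEP")   # about 16.7 mM
print(f"VWFCKM: {tcep_vwfckm * 1e3:.1f} mM TCEP")  # about 11.8 mM
```

Under these assumptions the two reactions differ modestly in reductant concentration, which may be worth keeping in mind when comparing the peak patterns of the two proteins.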
Complete Reduction with DTT and Alkylation with 4VP-The remaining disulfide bonds in partially reduced and alkylated VWF CK domains or purified fragments thereof were reduced with 50 mM DTT in 0.2 M Tris-HCl, pH 8.5, 6 M guanidine HCl, and alkylated with excess 4VP (21). The modified products were purified by RP-HPLC using a C18 column (2.1 × 150 mm; Vydac/The Separations Group Inc.) and an acetonitrile gradient.
CNBr and Lysyl Endopeptidase Digestion of VWFCK and VWFCKM-Samples (200 μg) were reacted with 1.8 mg of CNBr in 70% formic acid for 16 h at 25°C. The reaction was dried, dissolved in 0.2 M Tris-HCl, pH 7.9, and 4 M urea, and digested for 20 h at 37°C with 4 μg of lysyl endopeptidase (22). Samples were subjected to RP-HPLC using a C18 column as above.
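The cleavage logic of this sequential digestion (CNBr cleaves on the C-terminal side of methionine, lysyl endopeptidase on the C-terminal side of lysine) can be sketched in silico. The sequence below is a hypothetical toy peptide, not the VWF sequence, and the helper `cleave_after` is an illustrative name; the sketch ignores disulfide bonds, which in the real experiment hold cysteine-linked fragments together, and it ignores the conversion of methionine to homoserine or homoserine lactone.

```python
def cleave_after(peptides, residue):
    """Split every peptide on the C-terminal side of each `residue`."""
    fragments = []
    for peptide in peptides:
        start = 0
        for i, aa in enumerate(peptide):
            if aa == residue:
                fragments.append(peptide[start:i + 1])
                start = i + 1
        if start < len(peptide):
            fragments.append(peptide[start:])
    return fragments

toy = "AEKCGMSCKTMPC"                  # hypothetical sequence for illustration
cnbr = cleave_after([toy], "M")        # CNBr step: cut after Met
both = cleave_after(cnbr, "K")         # lysyl endopeptidase step: cut after Lys

print(cnbr)  # ['AEKCGM', 'SCKTM', 'PC']
print(both)  # ['AEK', 'CGM', 'SCK', 'TM', 'PC']
```

Two fragments that co-elute in one HPLC peak without prior reduction, each containing a single cysteine, are the signature used in the text to infer a disulfide bond between those cysteines.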
CNBr and Thermolysin Digestion of Partially Reduced and Alkylated VWFCKM-Samples were digested for 16 h with CNBr in 70% formic acid at a CNBr:Met molar ratio of 300:1. Products were subjected to RP-HPLC using a C18 column (2.1 × 150 mm, 5 μm; Vydac) at 0. 25  System. PTH products derived from S-ethylsuccinimidocysteine (ES-Cys) were detected as a doublet between the positions of PTH-proline and PTH-methionine. The PTH derivative of S-pyridylethylcysteine (PE-Cys) was detected as a sharp peak between the positions of PTH-tyrosine and PTH-proline.
Mass Spectrometry-Masses were determined by matrix-assisted laser desorption/ionization time of flight mass spectrometry with PerSeptive Voyager (PerSeptive Biosystems, Framingham, MA), Bruker Proflex, or Bruker BiFlex III (Bruker Daltonics, Inc., Manning Park, Billerica, MA) spectrometers, and a matrix of α-cyano-4-hydroxycinnamic acid (Hewlett Packard) or 2,5-dihydroxybenzoic acid (Aldrich). Peptide masses were predicted with Sherpa 3.1.1 (Department of Biochemistry, University of Washington, Seattle, WA).
Expression and Pulse Labeling of Proteins in COS-7 Cells-Plasmids encoding VWFCK and variants were transiently transfected into COS-7 cells (24). Pulse labeling and immunoprecipitations were performed as described previously (25) with some modifications. Cells were incubated for 1 h at 37°C in methionine-free Dulbecco's modified Eagle's medium (Life Technologies, Inc.) containing 10% dialyzed fetal bovine serum (Life Technologies, Inc.) and labeled for 30 min in 3 ml of methionine-free Dulbecco's modified Eagle's medium supplemented with 100 mCi of TranS-label™ (ICN Biomedicals). Chase was initiated by removing the labeling medium and adding complete medium containing 10 mM unlabeled methionine. At various times, samples (1 ml) of culture medium and cell lysates were precleared by shaking for 4 h at 4°C with 20 μl of protein A-Sepharose CL-4B fast flow (Amersham Pharmacia Biotech) and 10 μl of normal rabbit serum (Dako, Glostrup, Denmark). After preclearing, 1 μl of rabbit anti-human VWF polyclonal antibody A082 (Dako) was added and incubated for 16 h at 4°C. Immune complexes were precipitated by shaking for 30 min with 20 μl of protein A-Sepharose. The immunoprecipitates were washed, separated by SDS-PAGE on 15% polyacrylamide gels, and analyzed with a STORM 840 system (Molecular Dynamics, Sunnyvale, CA).
Molecular Modeling-Sequences of TGF-β2 and the VWF CK domain were aligned by the Clustal method with PAM250 residue weight table using the program MEGALIGN (DNASTAR, Madison, WI). The TGF-β2 dimer structure (Protein Data Bank entry code 2tgi) (26) was used as the reference structure to assign coordinates for the backbone atoms of core regions, comprising the cystine knot and associated β-strands, using the program MODELER (Molecular Simulations, San Diego, CA). Starting structures for the remaining loop regions were selected from a library of reference structures, side chains throughout were replaced with those of VWF, additional disulfide bonds were constructed, and the segment Asp2000-Asp2006 was superimposed on helix α5 of TGF-β2. Steric overlaps were eliminated by molecular dynamics simulation and energy minimization with the program DISCOVER (Molecular Simulations). A model with an alternative intersubunit disulfide bond arrangement was built similarly, repositioning the C termini and translating one subunit to achieve proper geometry for Cys2008-Cys2010 and Cys2048-Cys2048 bonds. Ribbon drawings were prepared with the program MOLSCRIPT (27).

Characterization of Recombinant VWF CK Domains-Previous studies indicated that the C-terminal 151 amino acid residues of VWF contained cysteine residues that mediate dimer formation in the endoplasmic reticulum (5,6). The last ≈90 amino acid residues of VWF are homologous to CK motifs that are present in many proteins and often mediate their dimerization (7, 28). Therefore, proteins containing the CK domain of VWF were designed for structural characterization (Fig. 1A). Construct VWFCK consists of the VWF signal peptide and four amino acids of the propeptide (residues 1-26 of preproVWF), six histidines, an enteropeptidase cleavage sequence (NYKDDDDK), and the C-terminal 94 amino acids of VWF (residues 1957-2050 of mature VWF). Construct VWFCKM is similar except that Ile1984 has been mutated to methionine to facilitate analysis by CNBr cleavage. This substitution was chosen because methionine at a position homologous to VWF residue 1984 is conserved in the CK domains of several epithelial mucins (28,29).
VWFCK and VWFCKM were expressed in a baculovirus system and purified by affinity chromatography on Ni-NTA-agarose followed by RP-HPLC. The final step resolved one major peak and one minor peak for VWFCK (Fig. 1B). The dominant peak was collected and analyzed by SDS-PAGE, and it exhibited a single band that migrated at 34 kDa (Fig. 1C). Under reducing conditions, the band shifted to 20 kDa, indicating that VWFCK is composed of two identical subunits linked through disulfide bonds. Similar results were obtained for VWFCKM. Thus, the C-terminal CK domain of VWF functions as a dimerization motif.
The N-terminal amino acid sequence of purified VWFCK or VWFCKM was AEETHHHHHHNYKDDDDK, demonstrating that the expected Cys-Ala bond was cleaved to release the VWF signal peptide. Results from mass spectrometry indicated that the molecular mass of intact dimeric VWFCK was 27713.2 Da (Table I), whereas the predicted mass was 25729.2 Da. The difference of 1984 Da is due to glycosylation at Asn2027 because deglycosylation with TMSF reduced the mass of VWFCK to 25929.4 Da, which is close to the predicted value for the protein. After complete reduction and alkylation with NEM, the molecular mass of the VWFCK subunit was 15269.9 Da, which implies the modification of ≈11.2 cysteines/subunit (1 Da per reduced cysteine + 125.1 Da per alkylation) compared with the theoretical value of 11 cysteines. Analysis of VWFCKM gave similar results; the mass of dimeric VWFCKM was 27771.9 Da (Table I), and the mass of reduced and alkylated monomeric VWFCKM was 15294.8 Da.
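The mass bookkeeping in this paragraph can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the masses quoted in the text; the function and constant names are illustrative, and the unmodified monomer mass is taken as half the dimer mass, as in the legend to Table I.

```python
REDUCTION_DA = 1.0    # mass gained per cysteine on reduction (from the text)
NEM_DA = 125.1        # mass gained per cysteine on alkylation with NEM
PER_CYS_DA = REDUCTION_DA + NEM_DA   # 126.1 Da per modified cysteine

def cysteines_modified(modified_monomer_da, unmodified_dimer_da):
    """Estimate the number of reduced and NEM-alkylated cysteines per subunit."""
    unmodified_monomer_da = unmodified_dimer_da / 2
    return (modified_monomer_da - unmodified_monomer_da) / PER_CYS_DA

# Fully reduced and alkylated VWFCK subunit versus intact dimeric VWFCK
n_cys = cysteines_modified(15269.9, 27713.2)
# Observed minus predicted dimer mass, attributed to glycosylation
glycan_da = 27713.2 - 25729.2

print(f"{n_cys:.1f} cysteines per subunit")    # 11.2
print(f"{glycan_da:.0f} Da excess over predicted mass")  # 1984
```

The non-integral value (11.2 rather than 11) is within what one might expect from the mass accuracy of the measurement on a glycosylated protein of this size.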
TABLE I
The products of partial reduction with TCEP and alkylation with NEM were separated by RP-HPLC (Fig. 2), and peak fractions were analyzed by mass spectrometry. The increment in mass due to alkylation is 1 Da (for reduction) + 125.1 Da (for alkylation by NEM) = 126.1 Da. The mass of an unmodified monomer is taken to be the mass of the unmodified dimer × 0.5, which equals 13856.6 Da for VWFCK.
Preparation of Monomeric VWF CK Domains by Partial Reduction and Alkylation-Potential intersubunit disulfide bonds were localized by selective reduction with TCEP to convert dimeric VWF CK domains into monomers, followed by alkylation with NEM to mark the positions of the reduced cysteines. TCEP is active at pH 4.6, and this low pH suppresses the base-catalyzed rearrangement of disulfide bonds that may occur during reduction (30,31). NEM can be used effectively to alkylate thiols at pH 4.6 (20), and the combined use of TCEP with NEM at this pH avoided the exposure of proteins to alkaline conditions that could alter their disulfide structure. The partially reduced products obtained with VWFCK showed three major peaks upon analysis by RP-HPLC (Fig. 2A). The number of disulfide bonds reduced and alkylated in each peak was determined by mass spectrometry (Table I). The mass of NEM-VWFCK-1 was similar to that of the intact dimer and indicates that VWFCK contains no free cysteines. The mass of NEM-VWFCK-3 was consistent with complete reduction and alkylation of all 11 cysteines per subunit, producing a monomer. The mass of partially reduced NEM-VWFCK-2 differed from that of the unreduced subunit by 391 Da, which corresponds to the modification of three cysteines by NEM. Alternatively, the same conclusion is reached by comparing the mass of NEM-VWFCK-2 with the mass of reduced and alkylated VWFCK (15269.9 Da) in which 11 cysteines are modified; the difference (1022.3 Da) corresponds to alkylation of 8.1 fewer cysteines.
Because NEM-VWFCK-2 is monomeric, this result indicates that no more than three cysteines of VWFCK participate in intersubunit disulfide bonds.
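The two estimates for NEM-VWFCK-2 can be written out as a worked calculation. The numbers are those quoted in the text; only the symbol names are introduced here for illustration.

```latex
\begin{align*}
\Delta m_{\mathrm{Cys}} &= 1\ \mathrm{Da} + 125.1\ \mathrm{Da} = 126.1\ \mathrm{Da} \\
n_{\mathrm{alkylated}} &= \frac{391\ \mathrm{Da}}{126.1\ \mathrm{Da}} \approx 3.1 \\
n_{\mathrm{not\ alkylated}} &= \frac{1022.3\ \mathrm{Da}}{126.1\ \mathrm{Da}} \approx 8.1 \\
n_{\mathrm{alkylated}} + n_{\mathrm{not\ alkylated}} &\approx 11.2
\end{align*}
```

The two independent comparisons therefore account for all of the cysteines in the subunit, within the same small excess (about 11.2 versus 11) seen for the fully alkylated protein.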
Similar results were obtained for VWFCKM. Separation of partially reduced and alkylated products by RP-HPLC gave three major peaks and several minor peaks (Fig. 2B) like those observed for VWFCK ( Fig. 2A). Analysis by mass spectrometry (Table I) indicated that NEM-VWFCK-M1 had a mass (27771.9 Da) similar to that of unmodified dimeric VWFCKM (27786.3 Da), and therefore contained no free cysteines. The mass of NEM-VWFCK-M6 (15292.3 Da) was similar to that of fully reduced and alkylated VWFCKM that was prepared independently (15294.8 Da), demonstrating that this last peak contains fully reduced and alkylated monomer. The mass of monomeric NEM-VWFCK-M3 (14281.4 Da) was consistent with the alkylation of three cysteines. The masses of the remaining peaks also correspond to integral numbers of reduced and alkylated cysteines per molecule (Table I). The results for VWFCK and VWFCKM agree and indicate that three or fewer cysteines form intersubunit disulfide bonds.
Location of Potential Intersubunit Disulfide Bonds-NEM-VWFCK-2 was completely reduced and alkylated with 4VP, cleaved with CNBr, and peptides were purified by RP-HPLC (Fig. 3). The amino acid sequences obtained from peaks 1-5 are shown in Table II (Table III). The identification of ES-Cys at only three positions suggests that the dimerization of VWF CK domains is mediated by intersubunit disulfide bonds between one or more of the residues Cys2008, Cys2010, and Cys2048.
The alkylation status of Cys1976 was not determined in this experiment because it was too far from the N terminus of the corresponding CNBr fragment. Therefore, the proposed assignments were confirmed by characterizing the intrasubunit disulfide bonds in VWFCK and VWFCKM.
Intrasubunit Disulfide Bond Cys1976-Cys2025 "a-c"-Dimeric VWFCK or VWFCKM was sequentially digested with CNBr and lysyl endopeptidase. The products were separated by RP-HPLC and sequenced. Results for VWFCKM are shown in Fig. 4. Peak CL1 gave the sequences AEETHHHHHHNYK and SEVEVD, corresponding to the N-terminal (His)6 tag sequence and VWF residues 1978-1983. Peak CL2 gave the sequences of two peptides (Table IV) ... after lysine and methionine (Fig. 1A). Residue Asn2027 is glycosylated (32) and was not detected. The peptides in fraction CL2 contain only two cysteine residues, and neither was detected, indicating the presence of a disulfide bond between Cys1976 and Cys2025. Because Cys2025 is not alkylated in monomeric NEM-VWFCK-M3, this bond is an intrasubunit disulfide bond. Similar results were obtained starting with dimeric VWFCK (data not shown).
FIG. 3. ... (Table II). B, a similar analysis of NEM-VWFCK-M3 was performed. Peaks MC1 through MC6 were analyzed by amino acid sequencing (Table III).

TABLE II
Sequence results for CNBr-cleaved and 4VP-modified NEM-VWFCK-2 monomer
Monomeric NEM-VWFCK-2 was cleaved with CNBr, separated by RP-HPLC (Fig. 3A), and peak fractions were analyzed by amino acid sequencing. Residues designated Ce were detected as S-ethylsuccinimidocysteine (ES-Cys). Residues designated Cp were detected as S-pyridylethylcysteine (PE-Cys). Residues in brackets were expected from the known sequence but not observed. Residue numbers in superscript indicate position in the mature VWF subunit.
TABLE III
... (Fig. 3B), and peak fractions were analyzed by amino acid sequencing. Residues are designated as described in the legend to Table II. Peak MC2 gave the same sequence data as peak MC1. Peaks MC4, MC5, and MC6 gave the same sequence data as peak MC3.
TABLE IV
... (Fig. 4) was analyzed by amino acid sequencing without prior reduction. Residues are designated as described in the legend for Table II.
... or 14794.6 Da (Table I), consistent with the presence of one or two remaining intrasubunit disulfide bonds, respectively. To locate these cysteines, fraction M4 was digested with CNBr, reduced completely with DTT, and alkylated with 4VP. The digest was separated by RP-HPLC (Fig. 6), and peptides containing the cysteines of interest were characterized (Table VI). Fraction CV1 gave a single amino acid sequence beginning with His1985 that contained exclusively PE-Cys1987 and PE-Cys1991. Two masses were observed that are consistent with the proposed structure of the peptide; one corresponds to the mass predicted for the peptide terminating in homoserine, and the second corresponds to the formation of homoserine lactone with the loss of water (18 Da). Fraction CV2 gave two sequences. The major sequence corresponded to the C-terminal CNBr peptide with PE-Cys2041, PE-Cys2043, and ES-Cys2048. The minor sequence was the same as the sequence of fraction CV1, except that both ES-Cys and PE-Cys were detected at position 1991.
Fraction CV2 therefore is a mixture of three peptides, two of which share the same N terminus and terminate in homoserine or homoserine lactone. The results of mass spectrometry support this conclusion, giving four peaks that are consistent with five masses predicted for these peptides (Table VI). Fraction CV3 gave a single sequence corresponding to the C-terminal CNBr peptide with PE-Cys2041, ES-Cys2043, and ES-Cys2048. CV3 therefore is the same as the major sequence in fraction CV2, except for the derivatization of Cys2043. Later column fractions contained mixtures of the three remaining CNBr fragments and were not analyzed further.

-E-E-T-H-H-H-H-H-H-N-Y-K-D-D-D-D-K-E1957-E-P-[E]-Cp1961-N-D1963
Q2020-V-A-L-H-Cp2025-T-[N]-G-S-V-V-Y-H-E-V-L-N-A2038
Y1997-S-I-D-I-N-D-V-Q-D-Q-Ce-S-Ce-Cp-S-P-T-R-T-E-P
These results allow the assignment of the two remaining intrasubunit disulfide bonds. All four cysteine residues in question were identified. Cys1987 and Cys2041 were detected exclusively as PE-Cys, whereas Cys1991 and Cys2043 were detected as both PE-Cys and ES-Cys. Monomeric NEM-VWFCK-M4 contains proteins with either one or two intrasubunit disulfide bonds (Table I), and the data of Table VI suggest that fraction M4 consists mainly of two specific proteins. Both proteins share a Cys1987-Cys2041 "2-5" intrasubunit disulfide bond, explaining the detection of only PE-Cys at these positions. In addition, one protein has a Cys1991-Cys2043 "3-6" intrasubunit bond, accounting for the presence of both PE-Cys and ES-Cys at residues Cys1991 and Cys2043.
Site-directed Mutagenesis of Cys2008, Cys2010, and Cys2048-The role of cysteine residues at positions 2008, 2010, and 2048 in VWFCK dimerization was investigated by mutation to alanine and expression in COS-7 cells (Fig. 7). After pulse labeling, dimeric wild-type VWFCK was detected as an intracellular 34-kDa protein that was secreted into the medium within 3 h of chase. The single substitutions C2008A and C2010A, and the double mutation C2008A/C2048A, increased the electrophoretic mobility of intracellular and secreted VWFCK proteins to a position (≈28 kDa) between that of dimeric VWFCK (34 kDa) and reduced VWFCK (≈20 kDa) (Fig. 1). The subunit composition of these intermediate species was assessed by gel electrophoresis (Fig. 8) with VWFCK derivatives of known mass (Table I). Dimeric NEM-VWFCK-1 comigrated with radiolabeled VWFCK, indicating that proteins at ≈34 kDa are dimeric. As expected, the fully reduced and alkylated, monomeric NEM-VWFCK-3 comigrated with reduced radiolabeled VWFCK. Monomeric NEM-VWFCK-2, in which three cysteines are reduced and alkylated, had an intermediate mobility similar to that of proteins with the mutations C2008A, C2010A, and C2008A/C2048A. The similar anomalous electrophoretic mobility of these four proteins probably is explained by decreased SDS binding compared with fully reduced VWFCK, and suggests that mutations C2008A, C2010A, or C2008A/C2048A cause the synthesis of monomers that retain some intrasubunit disulfide bonds. The mutation C2048A was compatible with the secretion of a dimeric 34-kDa species, although an intracellular band at ≈28 kDa was detected that is consistent with the folding and intracellular retention of some monomeric protein (Fig. 7). The mutations C2048A and C2008A/C2048A also caused the appearance intracellularly of a ≈22-kDa species that may be monomeric but sufficiently unfolded to bind SDS and comigrate with fully reduced VWFCK (Fig. 7). Alternatively, these faster migrating species may reflect proteolytic degradation. The triple mutation C2008A/C2010A/C2048A was associated with a transient 22-kDa intracellular species, and no secreted protein was detected. These results indicate that at least Cys2008 and Cys2010 are necessary for the synthesis and secretion of dimeric VWFCK.
DISCUSSION
For normal hemostatic function, VWF must be assembled into dimers and, subsequently, into multimers. Cysteine residues that mediate the C-terminal dimerization of VWF are within the last 151 amino acid residues of the subunit (5, 6), but their location has not been reported. The identification of a CK-like domain at the C terminus of norrin, VWF, and related epithelial mucins (28) was an important advance that focused attention on a substantially smaller segment of these proteins. CK domains comprise approximately 90 amino acids and include six cysteines arranged in a knot-like topology. Three amino acid residues including a central glycine usually separate the second and third cysteines: Cys-X-Gly-X-Cys. A single residue separates the fifth and sixth cysteines. Disulfide bonds between the second and fifth cysteines, and between the third and sixth cysteines, form a macrocyclic ring that is penetrated by a disulfide bond between the first and fourth cysteines. Many proteins with CK domains form homodimers or heterodimers, suggesting that CK domains function as dimerization motifs in several different contexts.
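The spacing rules just described (Cys-X-Gly-X-Cys between knot cysteines 2 and 3, and Cys-X-Cys between knot cysteines 5 and 6) lend themselves to a simple pattern search. The sketch below is illustrative only: the sequence is a hypothetical toy string, not a real CK domain, and `find_ck_spacing_motifs` is a name introduced here.

```python
import re

def find_ck_spacing_motifs(seq):
    """Return 0-based start positions of C-X-G-X-C and C-X-C matches.

    Lookahead patterns are used so that overlapping matches are all reported.
    """
    cxgxc = [m.start() for m in re.finditer(r"(?=C.G.C)", seq)]
    cxc = [m.start() for m in re.finditer(r"(?=C.C)", seq)]
    return cxgxc, cxc

toy = "ASCRTEKCSGACLLKDSPTCVCT"   # hypothetical sequence for illustration
cxgxc_hits, cxc_hits = find_ck_spacing_motifs(toy)

print(cxgxc_hits)  # [7]
print(cxc_hits)    # [19]
```

A spacing match alone is not diagnostic. In VWF itself, Cys2008 and Cys2010 are also separated by a single residue although they lie outside the six knot cysteines, so candidate matches have to be checked against the full six-cysteine pattern and, ultimately, against chemical evidence of the kind presented here.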
Studies of PSM demonstrated the potential of CK domains to mediate the oligomerization of epithelial mucins (16). PSM exhibits extensive homology to VWF and is assembled into oligomers by a similar mechanism; C-terminal domains dimerize in the endoplasmic reticulum, and N-terminal domains promote the formation of disulfide-linked oligomers in the Golgi apparatus. When the C-terminal CK domain of PSM was expressed in transfected COS cells, it was secreted as a disulfide-linked dimer (16). The VWF CK domain functions similarly. Upon expression in a baculovirus system (Fig. 1) or in transfected COS-7 cells (Fig. 7), the isolated VWF CK domain was secreted as a disulfide-linked dimer with no unpaired cysteine residues (Table I).

TABLE VI
Monomeric NEM-VWFCK-M4 was digested with CNBr, reduced with DTT, and alkylated with 4VP. Products were separated by RP-HPLC (Fig. 6), and peak fractions were analyzed. Residues predicted to be present from the known amino acid sequence of VWF, but not detected, are enclosed in brackets. Lowercase "m" indicates homoserine or homoserine lactone.

FIG. 8. Purified derivatives of VWFCK with specific numbers of alkylated cysteine residues were isolated by RP-HPLC (Fig. 3), and their mass was determined (Table I). NEM-VWFCK-1 is intact dimeric VWFCK with no cysteines alkylated. NEM-VWFCK-2 is monomeric with three cysteines alkylated. NEM-VWFCK-3 is monomeric with all 11 cysteines alkylated. COS-7 cells were transfected with plasmids encoding wild-type VWFCK or a variant with the mutation C2010A and radiolabeled with [35S]methionine. After 6 h of chase, samples of conditioned medium were immunoprecipitated with polyclonal rabbit anti-human VWF antibody. Purified proteins and immunoprecipitates were separated by SDS-electrophoresis on 15% polyacrylamide gels under non-reducing conditions (NR) or after reduction (R) with 2.5% 2-mercaptoethanol. The gel was stained with Coomassie Blue, dried, and subjected to autoradiography. The left half of the figure shows the mobility of purified proteins stained with Coomassie Blue. The right half of the figure shows the autoradiographic pattern for radiolabeled VWFCK and VWFCK(C2010A). The positions of standard proteins with the indicated masses in kDa are shown without reduction (left side) and with reduction by 2.5% 2-mercaptoethanol (right side).
These results focused attention on the C-terminal CK domain of VWF, and the location of intersubunit disulfide bonds was narrowed further by chemical modification. Reduction and alkylation of Cys2008, Cys2010, and Cys2048 converted dimeric VWFCK into a monomer and excluded all but these three residues from participation in covalent dimerization (Tables II and III). The three cysteines that are alkylated in VWF CK monomers are interesting because they are not broadly conserved among CK domains, and one of them is characteristic of a specific CK family.
CK domains fall into at least four structural groups that differ in their mode of dimerization, and each group is characterized by distinct extra cysteine residues outside of the six that define the cystine knot (7). Among these groups, the VWF CK domain appears to be most similar in sequence to members of the TGF-β family, which have one additional cysteine that forms an intersubunit disulfide bond. This extra cysteine is present in VWF, and also in norrin (12), PSM (13), and MUC5AC (14) (a representative human epithelial mucin) (Fig. 9), suggesting that all of these proteins may have at least one conserved intersubunit disulfide bond. The position of this cysteine residue is marked with an asterisk in Fig. 9. These proteins also have four additional cysteines that are not found in the TGF-β family, labeled a, b, c, and d in Fig. 9. Therefore, norrin, certain epithelial mucins, and VWF may constitute a subfamily of CK domains with a common structural basis for dimerization and additional intrasubunit or intersubunit disulfide bonds.
The disposition of these extra cysteine residues was addressed previously by computer modeling of norrin based on the crystal structure of TGF-β2 (28). The one extra cysteine conserved with TGF-β was proposed to form an intersubunit disulfide bond, and the four additional cysteines were paired in two intrasubunit disulfide bonds, linking Cys(a)-Cys(c) and Cys(b)-Cys(d). The observed covalent structure of VWF CK domains is consistent with this model. The core CK motif is present, as shown by the detection of conserved disulfide bonds Cys1961-Cys2011 "1-4", Cys1987-Cys2041 "2-5", and Cys1991-Cys2043 "3-6". Furthermore, the proposed Cys1976-Cys2025 "a-c" intrasubunit disulfide bond was identified.
Interestingly, the structure of recombinant norrin may deviate significantly from the proposed model (28) because the protein is multimeric rather than dimeric. When expressed with a C-terminal (His)6 tag, norrin was secreted as high molecular weight multimers, and the substitution C95A caused the secretion of norrin dimers (15). This result is consistent with the occurrence of the predicted Cys95-Cys95 intersubunit disulfide bond, but also indicates that other intersubunit disulfide bonds contribute to multimer formation. Thus, norrin CK domains may exhibit even greater variety in subunit contacts than is represented by the four structural classes characterized so far (7).
With the support of chemical evidence for the connectivity of all but three cysteine residues, we constructed a model for the dimeric VWF CK domain (Fig. 10A). Sequence alignments (Fig. 9) (29) indicate that the CK domains of VWF, epithelial mucins, and (to a lesser extent) norrin differ from TGF-β2 homologs, mainly by having a markedly shorter segment corresponding to the α5 and 3₁₀ helices of TGF-β2. This segment was modeled as a short α-helix for the VWF CK domain. The VWF CK model has the same disulfide connectivity as proposed for norrin (Fig. 10A) and includes a Cys2010-Cys2010 intersubunit disulfide bond analogous to that found in TGF-β2.
This pattern is compatible with the results of our structural studies, which demonstrate the presence of a CK motif and limit the intersubunit disulfide bond candidates to Cys2008, Cys2010, and Cys2048. To avoid the occurrence of unpaired cysteine residues, which are not observed in VWF or recombinant VWFCK, the dimerization of the VWF CK domain must involve either one or three intersubunit disulfide bonds. However, the formation of two intersubunit Cys2008-Cys2048 bonds would require the dimer interface to deviate radically from that of TGF-β, and for this reason the model has a single intersubunit disulfide bond.
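The "one or three" constraint follows from simple counting and can be checked by brute force. The sketch below enumerates every way of pairing the six candidate cysteines (three per subunit) with none left unpaired and tallies how many bonds cross between subunits; the subunit labels "A" and "B" and the helper name are illustrative, and no geometric feasibility is considered.

```python
from collections import Counter

def perfect_matchings(items):
    """Yield every way of pairing up all items (lists of 2-tuples)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

# (subunit, residue) for the three candidate cysteines in each subunit
cysteines = [(s, r) for s in "AB" for r in (2008, 2010, 2048)]

# Count intersubunit bonds (partners on different subunits) in each pairing
inter_counts = Counter(
    sum(1 for a, b in matching if a[0] != b[0])
    for matching in perfect_matchings(cysteines)
)
print(sorted(inter_counts.items()))  # [(1, 9), (3, 6)]
```

Of the 15 possible pairings, 9 have exactly one intersubunit bond and 6 have three; none has zero or two, because an odd number of candidate cysteines per subunit cannot be paired entirely within the subunit.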
FIG. 9. Alignment of selected CK domains. The CK domain sequences of human TGF-β2, Norrie disease protein (norrin) (12), PSM (13), human mucin MUC5AC (14), and VWF were aligned as described under "Experimental Procedures." Cysteines are indicated in boldface. An asterisk marks the cysteine that forms the intersubunit disulfide bond of TGF-β2. Amino acid residues identical to those of TGF-β2 are boxed, and residues identical to those of VWF are shaded in yellow. The secondary structure elements from a TGF-β2 crystal structure (34) are shown above the alignment; arrows indicate β-strands, and cylinders indicate helices.
VWF CK dimers with three intersubunit disulfide bonds can be constructed easily if larger structural changes are accepted, particularly if the pairing of Cys2010 is altered. Because all cysteines are oxidized, such a dimer must include one pairing between the same cysteine residues in both subunits. There are four such patterns. One contains Cys2008–Cys2008, Cys2010–Cys2010, and Cys2048–Cys2048 bonds. The others correspond to the three structurally distinct combinations of Cys x–Cys x, Cys y–Cys z, where Cys x is Cys2008, Cys2010, or Cys2048. For example, the shift of one subunit by 5 Å enables the formation of two Cys2008–Cys2010 bonds (Fig. 10B). The (Cys2008-X-Cys2010)2 arrangement is similar to the antiparallel Cys-X-Cys/Cys-X-Cys cystine framework found in several influenza virus neuraminidases and in porcine leukocyte protegrin-1 (reviewed in Ref. 33). Completion of the structure requires the formation of a Cys2048–Cys2048 bond, which is accommodated easily by movement of the C termini (Fig. 10B). This alternative model employs all three of the cysteine residues that are implicated by reduction/alkylation in the dimerization of VWFCK.
The location of these cysteines on one face of the subunit effectively constrains the dimer interface to the same concave surface that is used in TGF-β, and the model preserves a similar arrangement.
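The count of four three-bond patterns can be reproduced with a short enumeration. This is a sketch under one stated assumption: the dimer is twofold symmetric, so the bonding pattern must be unchanged when the two subunits are exchanged. The function name and the tuple representation are illustrative and do not come from the original work.

```python
from itertools import permutations

# Candidate cysteines for intersubunit bonds, by residue number.
RESIDUES = (2008, 2010, 2048)


def symmetric_three_bond_patterns():
    """List three-intersubunit-bond patterns that survive subunit exchange.

    Each pattern is a tuple of (residue in subunit A, residue in subunit B).
    ASSUMPTION: twofold symmetry, i.e. if Cys x of A bonds Cys y of B,
    then Cys y of A must bond Cys x of B.
    """
    patterns = []
    for partners in permutations(RESIDUES):
        pairing = dict(zip(RESIDUES, partners))
        # Exchanging subunits inverts the mapping; keep self-inverse ones.
        if all(pairing[pairing[r]] == r for r in RESIDUES):
            patterns.append(tuple(zip(RESIDUES, partners)))
    return patterns


patterns = symmetric_three_bond_patterns()
for pattern in patterns:
    print(", ".join(f"Cys{a}-Cys{b}" for a, b in pattern))
```

Of the six possible one-to-one pairings between subunits, the two cyclic arrangements (for example Cys2008 to Cys2010, Cys2010 to Cys2048, Cys2048 to Cys2008) are not preserved by subunit exchange and are dropped under this assumption. The four that remain each contain at least one bond between identical residues, and they include the arrangement with two Cys2008–Cys2010 bonds plus a Cys2048–Cys2048 bond.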
The various possible models for CK dimer structure might be distinguished by mutagenesis of cysteine residues, but this approach has been only partially successful. For the VWF CK domain, mutation of Cys2048 permitted the secretion of dimers and suggested the presence of one or two intersubunit disulfide bonds involving Cys2008, Cys2010, or both residues. However, no mutation affecting either Cys2008 or Cys2010 allowed the secretion of dimers, and the data therefore do not exclude a Cys2008–Cys2010 intersubunit bond. Similar mutagenesis studies of the PSM CK domain are consistent with more than one intersubunit disulfide bond but do not identify their locations (16). In this case, single alanine substitutions for Cys13244 (Cys b), Cys13246 (Cys*), or Cys13283 (Cys d) caused the secretion mainly of PSM CK dimers, indicating that no one of these cysteines is necessary for intersubunit disulfide bond formation. Therefore, the available structural and mutagenesis data suggest that CK domains like those of VWF and PSM, which possess 11 conserved cysteines, may constitute a structural family of CK domains with a novel pattern of three intersubunit disulfide bonds. Although norrin has the same 11 conserved cysteines, it may represent a special case in which the same cysteine residues (Cys b, Cys*, and Cys d) form disulfide bonds among three rather than two monomers, resulting in the assembly of multimers rather than dimers.
Further study is needed to test this hypothesis. For dimeric proteins like the VWF CK domain, analytical methods that rely on fragmentation generally cannot distinguish intersubunit from intrasubunit disulfide bonds. Because we have not been able to isolate a monomeric VWF CK species with fewer than three reduced and alkylated cysteines, complete characterization of the disulfide bonds involving Cys2008, Cys2010, and Cys2048 probably will require a different methodology, such as x-ray crystallography.