Mass Spectrometric Evidence That Proteolytic Processing of Rainbow Trout Egg Vitelline Envelope Proteins Takes Place on the Egg*

The rainbow trout egg vitelline envelope (VE) is constructed of three proteins, called VEα, VEβ, and VEγ, that are synthesized and secreted by the liver and transported in the bloodstream to the ovary, the site of VE assembly around eggs. All three proteins possess an N-terminal signal peptide, a zona pellucida domain, a consensus furin-like cleavage site (CFLCS) close to the C terminus, and a short propeptide downstream of the CFLCS. Proteolytic processing at the CFLCS results in loss of the short C-terminal propeptide from precursor proteins and enables incorporation of mature proteins into the VE. Here mass spectrometry (matrix-assisted laser desorption ionization time-of-flight-mass spectrometry and liquid chromatography-mass spectrometry with a Micromass quadrupole TOF hybrid mass spectrometer and a QSTAR Pulsar i mass spectrometer) was employed with VE proteins isolated from rainbow trout eggs in a peptidomics-based approach to determine the following: 1) the C-terminal amino acid of mature, proteolytically processed VE proteins; 2) the cellular site of proteolytic processing at the CFLCS of VE precursor proteins; and 3) the relationship between proteolytic processing and limited covalent cross-linking of VE proteins. Peptides derived from the C-terminal region were found for all three VE proteins isolated from eggs, indicating that processing at the CFLCS occurs after the arrival of VE precursor proteins at the egg. Consistent with this conclusion, peptides containing an intact CFLCS were also found for all three VE proteins isolated from eggs. Furthermore, peptides derived from the C-terminal propeptides of VE protein heterodimers VEα-VEγ and VEβ-VEγ were found, suggesting that a small amount of VE protein can be covalently cross-linked on eggs prior to proteolytic processing at the CFLCS. Collectively, these results provide important evidence about the process of VE formation in rainbow trout and other non-cyprinoid fish and allow comparisons to be made with the process of zona pellucida formation in mammals.

All vertebrate eggs are surrounded by an extracellular coat, referred to as a chorion, vitelline envelope (VE), or zona pellucida (ZP) (1). The coats surrounding fish and mammalian eggs, often designated as VE and ZP, respectively, are located between the oolemma and surrounding follicle cells and frequently exhibit distinguishable layers. These egg coats have been shown to play important roles during oogenesis, fertilization, and early embryogenesis.
Proteins that constitute the VE and ZP are highly conserved and related to each other (2). For example, a feature common to VE and ZP proteins is the presence of a so-called ZP domain (3-5), a sequence of ~260 amino acids that is responsible for polymerization of the proteins into long fibrils or filaments (6). Rainbow trout (Oncorhynchus mykiss) VEs consist of at least three proteins, called VEα (~58 kDa), VEβ (~52 kDa), and VEγ (~47 kDa), that possess an N-terminal signal peptide (SP), a proline- and glutamine-rich (PQ) region, a ZP domain, a consensus furin-like cleavage site (CFLCS) close to the C terminus of the polypeptides (7, 8), and a short propeptide downstream of the CFLCS. VEα and VEβ also have a trefoil (P) domain, just upstream of the ZP domain, that has six cysteine residues present as three intramolecular disulfides (Cys1-Cys4, Cys2-Cys5, and Cys3-Cys6) (9). The ZP domain of each VE protein has eight conserved Cys residues present as four intramolecular disulfides (7). VEα and VEβ exhibit high homology with each other and are related to mammalian egg coat proteins ZP1 and ZP2, whereas VEγ is homologous with ZP3 (8).
Mouse ZP proteins (ZP1-3) are synthesized in the ovary by growing oocytes (5, 10). ZP proteins from other mammals also are synthesized exclusively in the ovary by oocytes and/or follicle cells (2). Similarly, VE proteins in amphibians (11, 12) and cyprinoid fish (e.g. zebrafish, goldfish, and carp) (13-16) are synthesized in the ovary. On the other hand, VE proteins in rainbow trout (17) and a large number of other noncyprinoid fish (e.g. winter flounder, sea bream, cod, and medaka) (18-22) are synthesized in the liver under hormonal (estradiol-17β) control and then transported in the bloodstream as proproteins to the ovary where they form the inner layer of the egg VE. In most instances it would appear that polymerization of fish VE proteins takes place in the ovary (8, 17, 21, 23-27).
The site of proteolytic processing of trout VE precursor proteins that occurs during VE formation around eggs is of considerable interest.
Here we used mass spectrometry (MS) measurements of individual VE proteins isolated from rainbow trout eggs to assess whether or not the CFLCS and C-terminal propeptide were retained following synthesis and secretion of the proteins by the liver and transport to the ovary and eggs. Because only VE proteins that had recently arrived in the vicinity of eggs would be expected to retain a CFLCS and C-terminal propeptide, i.e. a small fraction of total VE protein associated with eggs, MS provides the level of sensitivity necessary to make such an assessment. The evidence obtained strongly suggests that proteolytic cleavage at the CFLCS of VE precursor proteins, which results in removal of the short C-terminal propeptide, takes place on the egg. Occasionally this occurs after formation of covalently linked heterodimers (VEα-VEγ and VEβ-VEγ) but before incorporation of VE proteins into the VE. This would be consistent with the suggestion that soluble VE precursor proteins are converted into insoluble VE proteins at the site of envelope assembly in the ovary, not in the liver or bloodstream (8, 17, 21, 23-27). Furthermore, the evidence is consistent with a recently proposed mechanism for ZP assembly in mammals that involves short hydrophobic patches near the C terminus (external hydrophobic patch (EHP)) and within the ZP domain (internal hydrophobic patch (IHP)) of ZP precursor proteins (28).

* This work was supported in part by National Institutes of Health Grant HD-35105 (to

Isolation of VE Proteins
The procedure described by Brivio et al. (29) was used to isolate VEs from rainbow trout (O. mykiss) eggs. Briefly, eggs were obtained from females by abdominal squeezing in a bloodless procedure that avoided contamination of eggs with proteins from blood. Eggs were frozen immediately by immersion in an ethanol and dry ice mixture. To prepare proteolytic digests of VE proteins, VEs were subjected to SDS-PAGE, and gel pieces containing individual VE proteins were excised and then digested with various proteases. Analyses of VE protein digests by MS were performed using several VE preparations.

Gel Electrophoresis
VE proteins were separated on 10% gels by SDS-PAGE (30) under reducing and nonreducing conditions (Fig. 1A). For nonreducing conditions, samples were dissolved in the absence of DTT. Gels were then stained with Coomassie Brilliant Blue. In most cases, a small portion of unstained gel just above the Coomassie-stained band (i.e. higher molecular weight) was used for MS experiments. As a result, under reducing conditions, there was a small amount of cross-contamination of VE proteins (e.g. VEβ peptides in VEγ samples and VEα peptides in VEβ samples), but this did not interfere with interpretation of the MS analyses. For molecular weight determination by MS, individual VE proteins were obtained by preparative SDS-PAGE on 12% gels under nonreducing conditions, and the position of each protein was determined by transient zinc staining (see below). Gel bands containing individual VE proteins were excised, electroeluted (Biotrap, Schleicher & Schuell) in 25 mM Tris, 192 mM glycine, and 0.025% (w/v) SDS, pH 8.8, for 12-16 h at room temperature, and concentrated to 0.5-1 mg/ml. To remove SDS, individual proteins were dialyzed twice against 6 M urea (16 h each dialysis), followed by dialysis against distilled water six times (12-16 h each dialysis).

Enzymatic Digestion of VE Proteins
Digestion of gel pieces containing individual VE proteins with proteases was carried out in the manner of Hellman et al. (31) with certain modifications (32). Gel pieces containing 0.5-2 μg of protein were incubated with 60% (v/v) ACN for 20 min, dried completely in a SpeedVac evaporator, and rehydrated for 10 min with digestion buffer (25 mM ammonium bicarbonate, pH 8.0). This procedure was repeated three times. After drying, gel pieces were again rehydrated in digestion buffer containing 10 mM DTT and incubated for 1 h at 56°C. Reduced Cys residues in VE proteins were blocked by replacing the DTT solution with 100 mM iodoacetamide in 25 mM ammonium bicarbonate, pH 8.0, for 45 min at room temperature with occasional vortexing. In some experiments, the reduced Cys residues were not blocked by iodoacetamide. Gel pieces were dehydrated, dried, and rehydrated twice. Dried gel pieces were then digested for 16-18 h at 37°C in digestion buffer containing 15 ng/μl trypsin (cleavage at the C terminus of Arg and Lys residues), chymotrypsin (cleavage at the C terminus of Phe, Tyr, Trp, Met, and Leu residues), endoproteinase AspN (cleavage at the N terminus of Asp and Glu residues), or endopeptidase GluC(V8) (cleavage at the C terminus of Glu and Asp residues). For tryptic and chymotryptic digests, 5 mM calcium chloride was added to the digestion buffer. Following digestion, peptides were extracted twice from gel pieces by addition of 300 μl of 60% ACN, 5% FA in 25 mM ammonium bicarbonate, pH 8.0, and shaking for 60-90 min at room temperature. Solutions containing peptides of VE proteins were pooled, dried, and used for MALDI-TOF-MS in either linear or reflective modes and MS or MS/MS by LC-MS(QTOF and QSTAR).
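The cleavage specificities listed above can be sketched as a simple in silico digestion. The following is a minimal illustration only, not the software used in this work: the names `RULES` and `digest` are hypothetical, digestion is assumed to be complete (no missed cleavages), and no exceptions to the stated rules (e.g. neighboring Pro residues) are modeled. Peptides observed experimentally often contain missed cleavage sites, so this sketch gives the limit digest rather than the observed peptide set.

```python
# Cleavage rules as stated in the text: (target residues, side of the residue that is cut).
# "C" = bond after the residue is cleaved; "N" = bond before the residue is cleaved.
RULES = {
    "trypsin": ("RK", "C"),
    "chymotrypsin": ("FYWML", "C"),
    "AspN": ("DE", "N"),
    "GluC": ("ED", "C"),
}


def digest(sequence, protease, first_residue=1):
    """Return (start, end, peptide) tuples for a complete digest.

    first_residue is the residue number of sequence[0] in the full protein,
    so that fragments can be reported with protein numbering.
    """
    residues, side = RULES[protease]
    cuts = [0]
    for i, aa in enumerate(sequence):
        if aa in residues:
            pos = i + 1 if side == "C" else i
            # Ignore cuts at the very ends, which would give empty peptides.
            if 0 < pos < len(sequence):
                cuts.append(pos)
    cuts.append(len(sequence))
    return [
        (first_residue + a, first_residue + b - 1, sequence[a:b])
        for a, b in zip(cuts, cuts[1:])
    ]


if __name__ == "__main__":
    # Fragment of VEβ quoted in the text (residues 487-503).
    for start, end, pep in digest("CYRKRRDIPAAVQKTAR", "trypsin", 487):
        print(start, end, pep)
```

Applied to the VEβ fragment 487-503, a complete tryptic digest would cut at every Arg and Lys within the CFLCS, which illustrates why peptides retaining an intact CFLCS necessarily contain missed cleavage sites.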

Matrix-assisted Laser Desorption Ionization-Time-of-Flight-Mass Spectrometry (MALDI-TOF-MS)
MALDI-TOF-MS analyses of VE peptides were carried out on a TOF Voyager-DE RP mass spectrometer (PerSeptive Biosystems, Foster City, CA) in a linear mode and on a Reflex III TOF system (Bruker Daltonics, Leipzig, Germany) in a reflective mode, both equipped with a nitrogen laser (337 nm).
Samples were dissolved in 50% ACN (linear mode) and co-crystallized with α-cyano-4-hydroxycinnamic acid matrix on a stainless steel target. In a reflective mode, the samples were prepared by a modified thin layer method (33). The dry samples were dissolved with 2% trifluoroacetic acid, and a mixture of α-cyano-4-hydroxycinnamic acid and nitrocellulose was used as matrix. After drying, samples were analyzed either undiluted or at suitable dilutions. After ionization, peptide data were collected in the mass/charge (m/z) 800-8000 (linear mode) and 700-3200 (reflective mode) range, and time-to-mass conversion was achieved by using external or internal calibration with bradykinin (m/z 1061.2), angiotensin (m/z 1297.5), insulin (m/z 5734.5), and tryptic fragments. Peaks detected by MALDI-TOF-MS in a linear mode corresponded to average m/z peptides. The peaks detected by MALDI-TOF-MS in a reflective mode corresponded to monoisotopic m/z peptides. Here we refer to the MALDI-TOF-MS peaks (m/z) as average/monoisotopic, singly charged (protonated) peptides. For MALDI-TOF-MS measurements in a linear mode, m/z differences (from the calculated value) up to 1 Da were observed, whereas calibration differences for measurements in a reflective mode were generally up to 100 ppm. Interpretation of mass spectra was performed using Mascot (www.matrixscience.com), FindMod, FindPep, PeptideMass (www.expasy.ch), as well as manually.
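The error limits quoted for each instrument can be applied as a simple acceptance check on an observed versus calculated m/z. The sketch below is illustrative only; the names `TOLERANCES`, `mass_error`, and `within_tolerance` are hypothetical, and the limits are the "typical" values stated in the text (1 Da for linear mode MALDI, 100 ppm for reflective mode, 200 ppm for LC-MS(QTOF), 150 ppm for LC-MS(QSTAR)), treated here as hard cutoffs purely for illustration.

```python
# Typical error limits stated in the text, treated as cutoffs for illustration.
TOLERANCES = {
    "MALDI-linear": ("Da", 1.0),
    "MALDI-reflective": ("ppm", 100.0),
    "LC-MS(QTOF)": ("ppm", 200.0),
    "LC-MS(QSTAR)": ("ppm", 150.0),
}


def mass_error(observed, calculated):
    """Return the m/z error as (absolute difference, difference in ppm)."""
    delta = observed - calculated
    return delta, delta / calculated * 1e6


def within_tolerance(observed, calculated, mode):
    """True if the observed m/z agrees with the calculated m/z for this mode."""
    unit, limit = TOLERANCES[mode]
    delta, delta_ppm = mass_error(observed, calculated)
    value = delta if unit == "Da" else delta_ppm
    return abs(value) <= limit


if __name__ == "__main__":
    # m/z pairs quoted in the Results for linear mode MALDI and LC-MS(QTOF).
    print(within_tolerance(1568.80, 1567.87, "MALDI-linear"))
    print(within_tolerance(3209.37, 3210.65, "MALDI-linear"))
    print(within_tolerance(697.08, 697.05, "LC-MS(QTOF)"))
```

With these cutoffs, the linear mode peak at m/z 1568.80 (calculated 1567.87) is accepted, whereas the peak at m/z 3209.37 (calculated 3210.65) differs by more than 1 Da and is rejected, in line with how those peaks are treated in the Results.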

Electrospray Ionization (ESI) Quadrupole (Q) TOF-MS
Nanoflow HPLC-MS (LC-MS(QTOF))-Analyses were carried out with a Micromass-QTOF hybrid mass spectrometer (Waters Corp., Milford, MA) with a nanoelectrospray source. A fused silica tip mounting adaptor, fitted with a 75-μm (inner diameter) fused silica tip (New Objective), was connected through a 50-μm (inner diameter) fused silica tubing to the liquid chromatography (LC) detector outlet. An LC Packings system (Dionex Corp., Sunnyvale, CA), equipped with an Ultimate micro pump and solvent organizer and a Switchos loading pump and Famos autosampler, was used for LC-MS. Separation was carried out on a 75-μm × 15-cm column (LC Packings C18 PepMap; 5-μl injection volume) at a flow rate of 200 nl/min, using a gradient of 2-80% ACN in 0.1% FA. The mass spectrometer was operated in the data-dependent mode and automatically switched between MS and MS/MS. The peaks detected by LC-MS(QTOF) corresponded to peptides with a monoisotopic m/z. Calibration differences for both MS and MS/MS measurements were typically up to 200 ppm. Processed files were subjected to a Mascot search (www.matrixscience.com), and the results were further supplemented by manual searches. The MS/MS spectra were further analyzed using MS-Product, MS-Fit, and MS-Tag (prospector.ucsf.edu), as well as manually.
For determination of molecular weights of intact VE proteins, samples containing individual polypeptides (0.5-1 mg/ml) were concentrated ~10 times to 5-10 mg/ml in 5 μl of water and were reduced by adding 5 ml of 50 mM DTT in 0.1 M Tris-HCl, pH 8.5, and heating at 60°C for 1 h. Reduced proteins were injected onto a Dionex C18 micron precolumn (0.3 × 5 mm), desalted at 200 nl/min with 10% ACN, 0.1% FA, eluted with 95% ACN, 0.1% FA, and analyzed on a Micromass QTOF hybrid mass spectrometer with a nanoelectrospray source connected to the column outlet. Operating conditions were 1.8-kV capillary voltage, 32-V cone voltage. Raw spectra were deconvoluted using the maximum entropy (MaxEnt) based software supplied with the MassLynx software.
Nanoflow HPLC-MS (LC-MS(QSTAR))-Samples were measured on a QSTAR Pulsar i mass spectrometer (Applera Deutschland GmbH, Darmstadt, Germany) equipped with a nanoelectrospray source (Proxeon Biosystems, Odense, Denmark). The MS was coupled to an Ultimate micro pump (Dionex, Idstein, Germany) with a FAMOS autosampler. Samples were dissolved in 2% trifluoroacetic acid (trifluoroacetic acid, S.P. 99.9%, SDS, Peypin, France; H2O, Lichrosolv, Merck). HPLC column tips (fused silica) with a 75-μm inner diameter (New Objective, Woburn, MA) were packed with Reprosil-Pur 120 ODS-3 to a length of 11-12 cm. A gradient of 0.5% acetic acid (ACS Reagent, Sigma) in water plus 0.5% acetic acid in water containing 80% ACN (HPLC gradient grade, SDS), with increasing proportions of ACN, was used for peptide separation. The mass spectrometer was operated in the data-dependent mode and switched automatically between MS and MS/MS. Differences between measured and calculated m/z values for these measurements were typically up to 150 ppm. Processed files were subjected to an in-house licensed Mascot search (www.matrixscience.com), and the results were further supplemented by searches using MS-Product, MS-Fit, and MS-Tag (prospector.ucsf.edu), as well as manually.

Assignment of C-terminal Peptides
To ensure that VE precursors would be present, a small amount of unstained gel just above the Coomassie-stained band (i.e. higher molecular weight) was used for MS experiments. In general, digested VE protein samples were divided in two and analyzed in parallel by MALDI-TOF-MS and by either LC-MS(QTOF) or LC-MS(QSTAR). The samples were first analyzed to identify the VE proteins by peptide mass fingerprinting using Mascot search (www.matrixscience.com). The data that resulted from MALDI-TOF-MS were then compared with those from LC-MS(QTOF or QSTAR). When a peak that corresponded to a C-terminal peptide was detected by MALDI-TOF-MS, a search for the same peptide was carried out using either LC-MS(QTOF) or LC-MS(QSTAR). In addition, when a peak that corresponded to a C-terminal propeptide was detected by LC-MS(QSTAR) and no suitable MS/MS was obtained, a second run for detection of the same peak (using the same sample) was performed.
The C-terminal amino acid present at the CFLCS of mature VE proteins was determined primarily with AspN and chymotrypsin digests because there is no cleavage site for these enzymes at the CFLCS. Because the C-terminal peptides contained many Lys, Arg, Asp, and Glu residues, trypsin and AspN were used as the principal proteases for determination of the C-terminal propeptides, which increased the probability of detecting C-terminal peptides.
In MALDI-TOF-MS and tandem MS experiments, possible modifications of the peptides (e.g. cyclization of Gln or Glu residues to pyroglutamate or oxidation of Met to methionine sulfoxide) were examined. In MS/MS experiments, the collision-induced dissociation (CID) spectra corresponding to a C-terminal peptide were analyzed using MS-Product, MS-Fit, and MS-Tag (prospector.ucsf.edu), as well as manually. Because a large number of amino acids in the C-terminal propeptides could lose water (e.g. Ser, Thr, Asp, and Glu) and ammonia (e.g. Arg, Lys, Asn, and Gln), single and multiple neutral loss of water and ammonia were taken into account. Neutral loss of CO (Mr 28 units) was also considered. Because of the presence of many Pro and Asp residues in the C-terminal region of VE proteins, fragments that resulted in CID from the cleavage of the peptide bond of these amino acids, and the internal fragments that resulted from subsequent fragmentation of these fragments (MS/MS/MS), were taken into account. Because the C-terminal peptides contained many positively charged amino acids (e.g. Arg, His, Lys, and Asn), in MS and MS/MS experiments we searched for multiply [(2+), (3+), (4+)] charged ions, in addition to the doubly charged ions classical for tryptic digests.
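The search for multiply charged ions and modified peptides rests on a simple calculation: the monoisotopic neutral mass of the peptide, adjusted for any modification, plus one proton per charge, divided by the charge. The sketch below illustrates that arithmetic. It is not the software used here (Mascot and the ProteinProspector tools were used); the names `MONO`, `MODS`, `peptide_mz`, and `charge_series` are hypothetical, and the residue and modification masses are standard monoisotopic reference values rather than values taken from this paper.

```python
# Standard monoisotopic residue masses in Da (reference values, not from the paper).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276

# Mass shifts for the modifications considered in the text.
MODS = {
    "carbamidomethyl": 57.02146,   # Cys blocked with iodoacetamide (CM-C)
    "oxidation": 15.99491,         # Met to methionine sulfoxide
    "pyroglu_from_Q": -17.02655,   # N-terminal Gln cyclized (loss of NH3)
    "pyroglu_from_E": -18.01056,   # N-terminal Glu cyclized (loss of H2O)
}


def peptide_mz(sequence, charge, mods=()):
    """Monoisotopic m/z of a protonated peptide carrying the given charge."""
    neutral = sum(MONO[aa] for aa in sequence) + WATER
    neutral += sum(MODS[name] for name in mods)
    return (neutral + charge * PROTON) / charge


def charge_series(sequence, charges=(1, 2, 3, 4), mods=()):
    """m/z values, rounded to 2 decimals, for each charge state to be searched."""
    return {z: round(peptide_mz(sequence, z, mods), 2) for z in charges}


if __name__ == "__main__":
    # VEα propeptide fragment 546-563 quoted in the Results.
    print(charge_series("DVVVSSQKVIMIDPRFYA"))
```

For a given peptide, `charge_series` lists the m/z values at which the (1+) to (4+) ions would be expected, which is the set of values that has to be checked when basic residues make higher charge states likely.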

Electrophoretic and Mass Spectrometric Analyses of Rainbow Trout VE Proteins-Under reducing conditions (+DTT) on SDS-PAGE, rainbow trout VEs contain three major proteins, called VEα, VEβ, and VEγ, with masses of ~58, ~52, and ~47 kDa, respectively (Fig. 1A, lane a). In addition, a fourth band with a mass of ~120 kDa is present and contains VEα-VEγ and VEβ-VEγ heterodimers, as reported previously (7). No other proteins (e.g. proteins from blood) were detected either on Coomassie-stained gels or by MS experiments. Under nonreducing conditions (-DTT) on SDS-PAGE, the band containing VEα-VEγ and VEβ-VEγ heterodimers separates into two bands; the top band contains VEα-VEγ heterodimers and the bottom band contains VEβ-VEγ heterodimers (Fig. 1A, lanes b and d). In addition to the VEα, VEβ, and VEγ monomers observed under reducing (Fig. 1A, lane a) and nonreducing (Fig. 1A, lanes b and d) conditions, additional protein bands were occasionally observed with molecular weights close to that of VEα (Fig. 1A, lane b), VEγ (Fig. 1A, lane d), and VEβ (data not shown). MALDI-TOF-MS analysis of trypsin-digested gel bands marked as VEα (Fig. 1A, lane b) identified both bands as VEα (data not shown), suggesting the presence of VE isoproteins.
Each VE propolypeptide contains an N-terminal SP, a PQ region, a P domain (absent in VEγ), a ZP domain, a CFLCS, and a C-terminal propeptide downstream of the CFLCS (Fig. 1B). The amino acid sequence of VE proteins containing the last Cys residue upstream of the CFLCS (VEα-Cys528, VEβ-Cys487, and VEγ-Cys406), the CFLCS, and the C-terminal propeptide are shown in Fig. 1B. Sometime after secretion by the liver, the C-terminal propeptide is removed by a furin-like proprotein convertase that cleaves at the CFLCS. The predicted molecular masses of secreted VE proteins lacking the SP are 61, 56, and 47 kDa for VEα, VEβ, and VEγ, respectively (not accounting for glycosylation of VEγ). Based on amino acid sequences after removal of C-terminal propeptides, mature VE proteins should have molecular masses of ~58 kDa (VEα), ~52 kDa (VEβ), and ~44 kDa (VEγ). In addition, VEγ has ~3 kDa of covalently linked N-linked oligosaccharide. By comparing theoretical molecular weights of VE proteins with and without the C-terminal propeptide and the molecular weights of VE proteins run under reducing conditions (Fig. 1A, lane a), it appears that VE proteins are proteolytically processed and do not contain the C-terminal propeptide. However, because molecular weights determined by SDS-PAGE may not be accurate, the molecular weights of individual polypeptides also were measured by LC-MS(QTOF) to confirm that VE proteins are processed at the C terminus. The molecular masses determined were 51,996 Da (Fig. 1C), 51,948 Da, and 51,980 Da (data not shown) for VEβ and 47,600 Da for VEγ (Fig. 1D), in good agreement with results of SDS-PAGE analyses.
The SignalP prediction program (www.cbs.dtu.dk/services/SignalP/) predicts that the N-terminal amino acids of VE proteins are Gln23, Gln21, and Gln23 for VEα, VEβ, and VEγ, respectively. Based on the SignalP prediction program, previous analyses (7), SDS-PAGE (Fig. 1A, lane a), and LC-MS(QTOF) analyses (Fig. 1, C and D), it appears that mature VEβ terminates within the CFLCS, probably with Arg489 or Lys490, and that VEγ also terminates within the CFLCS. In addition, the N-linked sugar chain of VEγ has an apparent mass of ~3 kDa, in agreement with previous reports (7). It should be noted that we were unable to determine by LC-MS(QTOF) analyses the mass of VE proteins that contain the C-terminal propeptide, probably because of the presence of very low amounts of precursors compared with proteolytically processed VE proteins. Also, when samples containing one (Fig. 1A, lane a) or two (Fig. 1A, lane b) VEα bands were analyzed, we were unable to assign their molecular weight; this was probably because of either a low signal intensity or breakdown of the proteins during analysis.
Introduction to Mass Spectrometry Analyses of Rainbow Trout VE Proteins-It is to be expected that the bulk of VE protein associated with rainbow trout eggs is mature, processed, and assembled, with only a relatively small fraction present as immature, unprocessed VE proprotein. In what follows, MS was used to identify the C-terminal amino acid of VE proteins present on rainbow trout eggs and to determine whether the C-terminal propeptide, downstream of the CFLCS, as well as an intact CFLCS, are present on at least a portion of the VE proteins surrounding eggs. To evaluate the mass spectra that follow (Figs. 2-7), the position of the peptide(s) within the VE proteins is diagrammed schematically in Fig. 1E, with the position of peptide(s) indicated by the arrow at the top of the diagram. For example, in the left panel of Fig. 1E, the peptide is part of the mature protein and ends within the CFLCS. In the middle panel of Fig. 1E, the peptide is part of the unprocessed proprotein and is part of the C-terminal propeptide. In the right panel of Fig. 1E, the peptide is part of the unprocessed proprotein and contains part of the mature protein (upstream of CFLCS), the intact CFLCS, and part of the C-terminal propeptide.
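The three peptide classes described above (mature C terminus ending within the CFLCS, propeptide only, and a peptide spanning an intact CFLCS) can be expressed as a rule on residue numbers. The following is a minimal sketch under stated assumptions: the function name `classify_peptide` and the category labels are hypothetical, and the CFLCS boundaries must be supplied by the user. The example uses the first CFLCS of VEα, given in the text as residues 530-533 (RQRR).

```python
def classify_peptide(start, end, cflcs_start, cflcs_end):
    """Classify a peptide by its position relative to the CFLCS.

    All arguments are residue numbers in the full-length proprotein.
    """
    if start <= cflcs_start and end > cflcs_end:
        # Upstream sequence, the whole cleavage site, and propeptide residues.
        return "intact CFLCS"
    if end > cflcs_end:
        # Extends downstream of the cleavage site, so the propeptide is present.
        return "propeptide"
    if cflcs_start <= end <= cflcs_end:
        # Ends inside the cleavage site, as expected for a processed protein.
        return "mature C terminus"
    return "upstream"


if __name__ == "__main__":
    # VEα peptides quoted in the Results; CFLCS taken as residues 530-533.
    for start, end in [(526, 534), (546, 563), (531, 553)]:
        print(start, end, classify_peptide(start, end, 530, 533))
```

Note that a peptide ending within the CFLCS is only evidence of a mature C terminus when the protease used does not itself cleave there, which is why AspN and chymotrypsin digests are relied on for that assignment.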
Determination of the C-terminal Amino Acids of Mature VE Proteins-Rainbow trout VE precursor proteins are synthesized by the liver and contain an N-terminal SP that directs them to the endoplasmic reticulum (ER). They also contain a C-terminal propeptide that is removed by a proprotein convertase (i.e. a furin-like protease) that acts at a CFLCS having the consensus sequence RXXR. Both the SP and propeptide are missing from the bulk of VE proteins that are assembled into an extracellular coat surrounding eggs (Fig. 1, C and D).
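Locating candidate cleavage sites with the consensus sequence RXXR is a straightforward pattern search. The sketch below is illustrative only; the function name `find_rxxr` is hypothetical, and the search reports every Arg-X-X-Arg occurrence (including overlapping ones) without judging whether a site is actually used by a convertase.

```python
import re


def find_rxxr(sequence, first_residue=1):
    """Return (residue number, motif) for every RXXR occurrence.

    A lookahead is used so that overlapping motifs are all reported.
    first_residue is the residue number of sequence[0] in the full protein.
    """
    return [
        (first_residue + m.start(), m.group(1))
        for m in re.finditer(r"(?=(R..R))", sequence)
    ]


if __name__ == "__main__":
    # Fragments quoted in the text: VEβ 487-503 and VEα 526-534.
    print(find_rxxr("CYRKRRDIPAAVQKTAR", 487))
    print(find_rxxr("QMCNRQRRD", 526))
```

For the VEα fragment the search returns the site at Arg530 (RQRR), matching the first CFLCS sequence given in the text; for the VEβ fragment it returns a single RKRR motif beginning at Arg489.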
The C terminus of mature VEβ was determined by LC-MS(QSTAR) analyses of AspN digests of VEβ. A doubly charged (2+) peak with m/z 685.95 (calculated 685.80) that corresponds to a peptide with the sequence 481 DSCEPRCYRK 490 (Cys modified to carbamidomethyl (CM)-C) with Lys490 as terminal amino acid was detected (Fig. 2A, left). Although no fragmentation of this peak was obtained, a peptide containing the same C-terminal amino acid was also detected by MALDI-TOF-MS (linear mode) analyses of AspN digests, where a singly charged peak with m/z 3271.69 (calculated 3271.72) that corresponds to a peptide with the sequence 464 ETVFIHCNTAVCLPSLGDSCEPRCYRK 490 (Cys modified to CM-C) was found (TABLE ONE). Two other C-terminal amino acids were found for VEβ. LC-MS(QSTAR) analyses of AspN digests of VEβ revealed a doubly charged peak with m/z 573.86 (calculated 573.80) (Fig. 2A, right) that corresponds to a peptide with the sequence 484 EPRCYRKR 491 (Cys modified to CM-C and Glu modified to pyroglutamate), with Arg491 as a C-terminal amino acid. However, this peak did not produce a fragmentation spectrum and was not confirmed by other analyses. Previously, we determined by MALDI-TOF-MS analyses of AspN digests that the C-terminal amino acid of mature VEβ is Arg489 (7). In this context, it should be mentioned that in another MALDI-TOF-MS analysis of an AspN digest of VEβ, a peak with m/z 1243.44 (calculated 1243.35; TABLE ONE) was found that corresponds to a peptide with the sequence 481 DSCEPRCYR 489 (Cys modified to CM-C). Therefore, the C-terminal residue of mature VEβ could be Arg489, Lys490, or Arg491, all of which lie within the CFLCS.
Previously, we determined by MALDI-TOF-MS analyses of chymotryptic digests that the C-terminal amino acid of mature VEγ is Lys410 (7). Other C-terminal amino acids in mature VEγ were detected by MALDI-TOF-MS (linear mode) analyses of chymotryptic and AspN digests ( (Fig. 2C), with Arg530 as the C-terminal amino acid. On the other hand, these peaks did not produce interpretable fragmentation spectra. These data suggest that the C-terminal amino acid for VEα could be Arg530 and is part of the first CFLCS sequence 530 RQRR 533. It should be mentioned that no cleavage sites at the second CFLCS sequence 541 KKTK 544 were found. Collectively, these data indicate that the C terminus of VE proteins lies within the CFLCS consensus sequence RXXR (i.e. RXRR for VEα and VEβ and RKXR for VEγ).
Determination of the C-terminal Propeptide of VEβ-MALDI-TOF-MS (linear mode) analyses of tryptic digests of VEβ were used to determine whether a C-terminal propeptide remained on VEβ purified from eggs (Fig. 3A). A peak was found with m/z 1568.80 (calculated 1567.87) that corresponds to a peptide from the C-terminal propeptide with the sequence 492 RDIPAAVQKTARIK 505. This peptide also appeared in other experiments where a peak with m/z 1568.52 (calculated 1567.87) was found (TABLE ONE). In the same MALDI-TOF-MS spectrum shown in Fig. 3A, a peak that corresponds to a peptide from the C-terminal propeptide of VEβ was found with m/z 2059.37 (calculated 2059.28) that corresponds to a peptide with the sequence 506 SNLVSSGELILTDPRELTN 524. The two peaks detected by MALDI-TOF-MS (linear mode) that correspond to these two peptides offer complete coverage of the C-terminal propeptide in VEβ. The sequence of the first peptide (492 RDIPAAVQKTARIK 505) starts in the CFLCS, and the second peptide, which is contiguous with the first, continues in the C-terminal propeptide to the end of the proprotein (Asn524). The presence of the C-terminal propeptide in VEβ was also confirmed by other MALDI-TOF-MS analyses using various enzymatic digests (TABLE ONE).
By using LC-MS(QSTAR) another peptide was detected from the C-terminal propeptide of VEβ (Fig. 3B). A doubly charged peak with m/z 805.43 (calculated 805.47) that corresponds to a peptide with the sequence 490 KRRDIPAAVQKTAR 503 was found (Fig. 3B, inset). The CID spectrum contained y1, y2, y3, y5, and y8 ions (in intact form or with neutral loss of H2O or NH3), b6, and a12 ions, as well as many internal fragments that confirm the identity of the peptide. Other peptides from the C-terminal propeptide of VEβ are shown in TABLE ONE. Therefore, some VEβ from eggs contains peptides from the C-terminal propeptide, downstream of the CFLCS.

Summary of MS analyses of VE proteins
All peaks are monoisotopic, except MALDI peaks (linear mode), which are average; C*, CM-C; *E, pyroglutamate; M = methionine-sulfoxide; MALDI*, measurements in reflective mode.
Determination of the C-terminal Propeptide of VEα-To determine whether a C-terminal propeptide is present in other VE proteins, VEα and VEγ were analyzed. MALDI-TOF-MS (linear mode) was used to analyze tryptic digests of egg VEα (Fig. 4A). A peak was found with m/z 2544.38 (calculated 2544.87) that corresponds to a peptide from the C-terminal propeptide with the sequence 531 QRRDLSAQGQKKTKGDVVVSSQK 553. This peptide was also observed in other experiments as a peak with m/z 2544.40 (calculated 2544.87) (TABLE ONE). In the same MALDI-TOF-MS (linear mode) spectrum (Fig. 4A), two additional peaks were found that correspond to a peptide from the C-terminal propeptide of VEα. These peaks with m/z 2929.80 (calculated 2929.40) and 2945.65 (calculated 2945.40) correspond to a peptide with the sequence 534 DLSAQGQKKTKGDVVVSSQKVIMIDPR 560, with its Met residue in either an unoxidized or oxidized form. The presence of the C-terminal propeptide was also detected in other MALDI-TOF-MS experiments with various enzyme digests (TABLE ONE). Further evidence for the presence of the C-terminal propeptide was provided by tandem MS of LC-MS(QSTAR) analyses of AspN digests of VEα. A quadruply charged peak with m/z 517.53 (calculated 517.52) was found that corresponds to peptide 546 DVVVSSQKVIMIDPRFYA 563 (Fig. 4B, inset). The CID fragmentation spectrum (Fig. 4B) produced a series of peaks containing doubly charged b ions (b8, b9, b12, b13, and b16) and singly, doubly, or triply charged y ions (y5, y6, y7, y10, y12, y13, y15, and y16) with neutral loss of H2O, NH3, or both that confirms the identity of this peptide. Other C-terminal peptides from VEα are shown in TABLE ONE. Therefore, some VEα from eggs possesses the C-terminal propeptide downstream of the CFLCS.
Determination of the C-terminal Propeptide of VEγ-Evidence for the presence of a C-terminal propeptide in VEγ was obtained by LC-MS(QTOF and QSTAR) analyses of tryptic and AspN digests ( 850.71 (calculated 850.79) were found (Fig. 5A). These ions correspond to the same peptide from the C-terminal propeptide of VEγ with the sequence 417 HQKLVNIWEGDVQLGPIFISEK 438. Although the fragmentation spectrum of the doubly charged ion (m/z 1275.56) (Fig. 5B) did not produce many fragments, the presence of y20, y21, a6, and a20 (with single or multiple neutral loss of NH3 or NH3 (TABLE ONE), as well as by LC-MS(QSTAR) analyses (Fig. 5, C and D). The doubly charged peak of m/z 741.92 (calculated 741.90) (Fig. 5C, inset) and the triply charged peak with m/z 494.93 (calculated 494.94) (Fig. 5D, inset) correspond to the same peptide with the sequence 413 DTTKHQKLVNIW 424. The fragmentation spectrum of the doubly charged peak (m/z 741.92) produced mostly singly and doubly charged b and a ions (b9, b10, b11, a4, and a11) with neutral loss of H2O or NH3 and singly charged internal fragments (Fig. 5C). The CID spectrum of the triply charged peak (m/z 494.93) also produced mostly singly and doubly charged b and a ions (b2, b3, b10, b11, and a11) with neutral loss of water or ammonia, as well as singly charged internal fragments (Fig. 5D), which confirmed the identity of the peptide. It is of interest that the CID of these two parent ions (m/z 741.92 and 494.93) produced almost no y ions and have as common peaks the doubly charged ions b10, b10-H2O, b11, and a11. Collectively, these data suggest that some VEγ from eggs possesses a C-terminal propeptide downstream of the CFLCS.

Determination of the C-terminal Propeptides of VEα-VEγ and VEβ-VEγ Heterodimers-A small amount of VE protein in the egg VE apparently is cross-linked by a transglutaminase (21) into heterodimers (VEα-VEγ and VEβ-VEγ) (7). MALDI-TOF-MS (linear mode) analyses of tryptic digests of VEα-VEγ and VEβ-VEγ heterodimers were also carried out (Fig. 6). A singly charged (1+) peak with m/z 1568.72 (calculated 1567.87) corresponds to a peptide with the sequence 492 RDIPAAVQKTARIK 505 from the C-terminal propeptide of VEβ in VEβ-VEγ heterodimers (Fig. 6A). Peaks corresponding to this particular peptide were also observed in MALDI-TOF-MS (linear mode) analyses of tryptic digests of VEβ (Fig. 3A), as well as in tryptic digests of VEβ analyzed by LC-MS(QSTAR) (Fig. 3B). These findings suggest that some VEβ from VEβ-VEγ heterodimers from eggs contains the C-terminal propeptide. A singly charged (1+) peak with m/z 2456.10 (calculated 2456.84) from MALDI-TOF-MS (linear mode) analyses (Fig. 6A) corresponds to a peptide with the sequence 420 LVNIWEGDVQLGPIFISEKVAQ 441 from the C-terminal propeptide of VEγ in VEβ-VEγ heterodimers. This peptide was also identified by LC-MS(QTOF) of tryptic digests of VEγ, where a triply charged (3+) ion with m/z 819.37 (calculated 819.11) corresponds to the same peptide from the C-terminal propeptide of VEγ (TABLE ONE). Another singly charged (1+) peak with m/z 3210.40 (calculated 3210.65) (Fig. 6A) corresponds to a C-terminal peptide from VEγ with the sequence 411 GRDTTKHQKLVNIWEGDVQLGPIFISEK 438. This particular peptide was also detected in another MALDI-TOF-MS (linear mode) analysis using VEβ-VEγ heterodimers, in which a peak with m/z 3210.71 (calculated 3210.65) was found. These results suggest that the C-terminal propeptide of VEγ is present in some VEβ-VEγ heterodimers from eggs.

NOVEMBER 11, 2005 • VOLUME 280 • NUMBER 45

MALDI-TOF-MS (linear mode) analyses of tryptic digests of VEα-VEγ heterodimers revealed a singly charged (1+) peak with m/z 1360.40 (calculated 1360.55) that corresponds to a peptide with the sequence 533 RDLSAQGQKKTK 544 from the C-terminal propeptide of VEα (Fig. 6B). Another peak observed in the same spectrum (data not shown; TABLE ONE) with m/z 3209.37 (calculated 3210.65) corresponds to a peptide from the C-terminal propeptide of VEγ with the sequence 411 GRDTTKHQKLVNIWEGDVQLGPIFISEK 438; however, this peak was not considered because it exceeded the accepted error limit for the measurements. In this context, it should be mentioned that this particular peak is probably the same one detected in MALDI-TOF-MS analyses with VEβ-VEγ heterodimers (Fig. 6A). In any case, these data suggest that some VEα-VEγ heterodimers from eggs contain the C-terminal propeptides of both VEα and VEγ monomers. Collectively, these data suggest that heterodimerization may take place prior to removal of the C-terminal propeptides.
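The comparison of observed and calculated m/z values can be made explicit by expressing each deviation in Da and in parts per million. The sketch below uses only the values quoted above for the heterodimer digests; the function name `mass_error` is hypothetical, and because the accepted error limit is not stated in this section, no threshold is applied.

```python
def mass_error(observed, calculated):
    """Return the deviation of an observed m/z as (delta in Da, delta in ppm)."""
    delta = observed - calculated
    return delta, 1e6 * delta / calculated


# (label, observed m/z, calculated m/z) for singly charged peaks quoted in the text.
peaks = [
    ("VEbeta-VEgamma, residues 411-438", 3210.40, 3210.65),
    ("VEbeta-VEgamma, residues 411-438 (second analysis)", 3210.71, 3210.65),
    ("VEalpha-VEgamma, residues 411-438", 3209.37, 3210.65),
    ("VEalpha-VEgamma, residues 533-544", 1360.40, 1360.55),
]

for label, observed, calculated in peaks:
    delta, ppm = mass_error(observed, calculated)
    print(f"{label}: {delta:+.2f} Da ({ppm:+.0f} ppm)")
```

Listed this way, the 3209.37 peak deviates by more than 1 Da from the calculated value, several times the deviation of the corresponding peaks in the VEβ-VEγ spectra, which is consistent with the decision to exclude it.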

Cleavage of the VE Precursors and Removal of C-terminal Propeptides Takes Place on the Egg-VE precursor proteins that contain a C-terminal propeptide are synthesized by the liver, secreted into the bloodstream, and transported to the ovary. Although unlikely, it is possible that the propeptide could have been cleaved from the protein prior to reaching the ovary and remained attached by noncovalent interactions. Here we used MS analyses of VE proteins isolated from eggs to look for peptides that contain a portion of mature VE proteins (i.e. upstream of the CFLCS), the CFLCS, and a portion of the C-terminal propeptide (i.e. downstream of the CFLCS). Such a peptide was identified by LC-MS(QTOF) analyses of tryptic digests of VEβ (Fig. 7A). A triply charged (3+) peak with m/z 697.08 (calculated 697.05) corresponds to a peptide with the sequence 487 CYRKRRDIPAAVQKTAR 503 (Cys modified to CM-C) (Fig. 7A, inset). The peptide contains the last conserved Cys residue from the ZP domain, the CFLCS, and a portion of the C-terminal propeptide. CID produced a fragmentation spectrum in which some y ions (  Peptides that contain a portion of mature VE proteins, the CFLCS, and a portion of the C-terminal propeptide were also detected in VEα and VEγ. In LC-MS(QTOF) analyses of tryptic digests of VEγ, a triply charged (3+) peak with m/z 1009.77 (calculated 1009.76) was identified (TABLE ONE) that corresponds to a peptide with the sequence 391 EAGGNDGVCGCCDSTCSNRKGRDTTKHQK 419 (unmodified Cys). This peptide contains part of the mature protein, the CFLCS, and part of the C-terminal propeptide. However, we were unable to interpret its CID fragmentation spectrum. On the other hand, MALDI-TOF-MS analyses (linear mode) of GluC(V8) digests of VEα revealed a singly charged peak with m/z 1223.15 (calculated 1223.37) (TABLE ONE) that corresponds to a peptide with the sequence 526 QMCNRQRRD 534 (unmodified Cys; oxidized Met). This peptide contains the last conserved Cys residue of the ZP domain, the CFLCS, and the amino acid Asp that is part of the C-terminal propeptide.
Another peptide that contains a portion of mature VE proteins, the CFLCS, and a portion of the C-terminal propeptide was detected by LC-MS(QSTAR) analysis of AspN digests of VEα (Fig. 7B). A quadruply charged peak with m/z 630.65 (calculated 630.57) (Fig. 7B, inset) that corresponds to a peptide with the sequence 525 EQMCNRQRRDLSAQGQKKTKG 545 (Cys modified to CM-C) was found. The CID spectrum (Fig. 7B) contains peaks that correspond to singly or multiply charged b ions (b15, b17, b18, b19, and b20), a ions (a2 and a6), and y ions (y3, y7, y8, y14, y15, y17, y18, and y19) (with neutral loss of H2O, NH3, or both), confirming the identity of the peptide. Collectively, these results suggest that removal of the C-terminal propeptide from VE precursor proteins takes place on the egg.

DISCUSSION
VE precursor proteins are secreted from hepatocytes into the bloodstream by female rainbow trout and by estrogen-induced males (17,21,34,35). VE proteins are present in the blood as precursor proproteins that contain the C-terminal propeptide (8,17,21,23-27). After transport to the ovary through the bloodstream, these precursors are not internalized by eggs (17)   secreted as precursor proteins, with a short C-terminal propeptide downstream of the CFLCS, that undergo various post-translational modifications. Synthesis of the proteins takes place in the liver (34), although VEγ messenger RNA, but not protein, has been detected in the ovary (8). The initial step in processing of VE precursor proteins is removal of the N-terminal SP that directs them to the ER. After formation of intramolecular disulfides and glycosylation of VEγ, VE precursor proteins are secreted by the liver and transported in the bloodstream as proproteins to the ovary. The site of proteolytic processing at the CFLCS and assembly of VE precursor proteins following secretion by the liver is not clear. Consequently, we employed a peptidomics-based approach to determine the site of processing at the CFLCS of VE proteins and the relationship between their processing and assembly. In general terms, peptidomics refers to alternative strategies for comprehensive analysis of peptide mixtures that are fractionated by liquid chromatography, analyzed by MS/MS, and identified by data base searches (36,37). To clarify aspects of processing and assembly of the VE precursor proteins, we analyzed a variety of proteolytic digests (i.e. tryptic, GluC(V8), AspN, and chymotryptic digests) of VEα, VEβ, VEγ, and heterodimers of VE proteins by MS (i.e. MALDI-TOF-MS in both linear and reflector modes, LC-MS(QTOF), and LC-MS(QSTAR)) (38). Collectively, the MS evidence strongly suggests that rainbow trout VE proteins undergo proteolytic processing at the CFLCS some time after their arrival at eggs in the ovary.
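To illustrate how a proteolytic digest relates to the peptides reported here, the following sketch performs an in silico tryptic digest that allows missed cleavages. It is an illustration under stated assumptions rather than the search procedure used in this study: the 31-residue stretch (residues 411-441 of VEγ) is assembled from the overlapping peptide sequences quoted in Results, the rule "cleave C-terminal to Lys or Arg unless followed by Pro" is a common simplification, and the function name `tryptic_peptides` is hypothetical.

```python
import re


def tryptic_peptides(seq, start=1, max_missed=3):
    """Map (first, last) residue numbers to tryptic peptides of `seq`.

    `start` is the residue number of seq[0]; up to `max_missed` internal
    cleavage sites may remain uncut in any one peptide.
    """
    # Cleave C-terminal to K or R, except before P (simplified trypsin rule).
    cuts = [m.end() for m in re.finditer(r"[KR](?!P)", seq)]
    bounds = sorted(set([0] + cuts + [len(seq)]))
    peptides = {}
    for i in range(len(bounds) - 1):
        last_j = min(i + 2 + max_missed, len(bounds))
        for j in range(i + 1, last_j):
            a, b = bounds[i], bounds[j]
            peptides[(start + a, start + b - 1)] = seq[a:b]
    return peptides


# Residues 411-441 of VEgamma, assembled from peptides quoted in the text.
region = "GRDTTKHQKLVNIWEGDVQLGPIFISEKVAQ"
digest = tryptic_peptides(region, start=411)

for span in [(417, 438), (420, 441), (411, 438)]:
    print(span, digest[span])
```

All three propeptide-derived tryptic peptides reported for VEγ appear in this list only when missed cleavages are permitted, which is why partial digestion products need to be considered when interpreting such spectra.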
VE precursor proteins, like many other ZP domain-containing proteins, are cleaved at their CFLCS by a proprotein convertase or furin-like protease (39,40). To assemble into higher order structures such as fibrils or filaments, the C-terminal propeptide downstream of the CFLCS must be removed (23,28). The C-terminal amino acid of mature (i.e. processed) ZP domain proteins lies within a CFLCS that has the consensus sequence RXX(R/K) (Arg1-X1-X2-(Arg2/Lys)). Although most studies have found that the C terminus of ZP domain proteins lies within the CFLCS, the C-terminal amino acid can be Arg1 (7) or before Arg1 (7,41,42), X1 (7,43-45), X2 (23), or Arg2/Lys (46). In addition, the C-terminal amino acid of mature porcine ZPC (ZP3) is Ser332 in the sequence SRK, upstream of the conserved CFLCS but downstream of the ZP domain, suggesting that proteolytic processing may also take place at cleavage sites within dibasic motifs (42). In uromodulin the C-terminal amino acid of the mature polypeptide is Phe548 with FS as the proteolytic cleavage site (47). Most interestingly, this cleavage site lies within the ZP domain between conserved Cys6 and Cys7 residues and is not a CFLCS or a dibasic motif, but it does resemble the substrate of a chymotrypsin-like protease. In a previous report (7), it was found that cleavage of the C-terminal propeptide of fish VE proteins takes place at a proprotein convertase cleavage site having the general consensus sequence (K/R)Xn(K/R), with n = 0, 2, 4, or 6, as suggested by others (48). Differences in reported results are apparently due to trimming by a carboxypeptidase following cleavage at the CFLCS, as suggested for bird (41,45), amphibian (41), fish (7), and mammalian (42) egg coat proteins. However, if proprotein convertases act at the more general consensus RXnR (n = 0), more than one amino acid may be the cleavage site for the enzyme(s). For example, for VEβ that has the cleavage site sequence RKRR (Arg1-X1-X2-Arg2), the proprotein convertases could cleave after X1, X2, or Arg2.
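The consensus sequences discussed above can be written as regular expressions and applied to a peptide that spans the cleavage site. The sketch below is illustrative only: it scans the VEβ peptide 487 CYRKRRDIPAAVQKTAR 503 reported in Results, the function names `find_cflcs` and `find_pc_sites` are hypothetical, and a motif match indicates only that the sequence fits the consensus, not that cleavage occurs there.

```python
import re

# RXX(R/K); the lookahead makes matches overlapping so no site is skipped.
CFLCS = re.compile(r"(?=(R..[RK]))")


def find_cflcs(peptide, first_residue):
    """Return (residue number of Arg1, motif) for each RXX(R/K) match."""
    return [(first_residue + m.start(), m.group(1))
            for m in CFLCS.finditer(peptide)]


def find_pc_sites(peptide, first_residue, spacers=(0, 2, 4, 6)):
    """Return (residue number, n, motif) for each (K/R)Xn(K/R) match."""
    hits = []
    for n in spacers:
        pattern = re.compile(r"(?=([KR].{%d}[KR]))" % n)
        hits += [(first_residue + m.start(), n, m.group(1))
                 for m in pattern.finditer(peptide)]
    return sorted(hits)


ve_beta = "CYRKRRDIPAAVQKTAR"  # residues 487-503 of VEbeta
print(find_cflcs(ve_beta, 487))
for hit in find_pc_sites(ve_beta, 487):
    print(hit)
```

The strict RXX(R/K) pattern finds a single site in this peptide (RKRR, beginning at Arg489), whereas the broader (K/R)Xn(K/R) pattern also matches the overlapping dibasic pairs within RKRR. This illustrates how a more general consensus admits more than one candidate cleavage position within the same four residues.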
MS and data base searches (36,49-51) allow identification only of peptides that are present in the data base and cannot identify peptides derived from post-translationally modified proteins (52). Consequently, AspN-derived peptides that correspond to the C terminus of mature VE proteins are not identifiable in any data base because there are no AspN cleavage sites in the CFLCS. However, by using manual searches of MS and MS/MS data, we found that the major C-terminal amino acid lies within the CFLCS and may be any one of the four CFLCS amino acids (Arg1-X1-X2-Arg2/Lys), confirming a previous report (7). Our results support those of other studies in which most mature ZP domain proteins terminate within the CFLCS (7,23,41-46). For example, proteolytic processing of quail ZPC (ZP3) at the CFLCS is mandatory for its secretion and unprocessed ZPC proprotein accumulates in the ER (53). Similar results have been reported for mouse (54) and human (55) ZP3. On the other hand, proteolytic processing of ZP domain proteins may take place at furin-like cleavage sites other than the CFLCS (42,56). Furthermore, there is a report that mutation of the CFLCS does not affect secretion or assembly of mouse ZP3 (56), and mutation of the CFLCS of recombinant human ZP3 diminishes but does not abolish its secretion (55). Although there are additional furin-like cleavage sites downstream of the CFLCS in fish VE proteins (Fig. 1), we found no evidence for any additional C-terminal amino acids other than those from the CFLCS.
It has been demonstrated that a small amount of VE protein can be incorporated into covalently cross-linked heterodimers (7), but it is not clear where and when dimerization takes place. In most cases, secreted VE precursor proteins in the bloodstream are monomers (23,34,35,57); however, there is some evidence that heterodimers may be present as well (25,34,35,57). In the gilthead sea bream, sea bass, and Atlantic halibut VE monomers are mainly transported in the bloodstream as HMWPs (25,27,57). It was suggested that when HMWPs arrive at the ovary they are cleaved to lower molecular weight VE proteins in preparation for assembly. In the bloodstream of rainbow trout and other fish there are no VE heterodimers (17,21,23,34,35), and the monomeric VE precursors contain a C-terminal propeptide (8,17,21,23-27), whereas most assembled VE proteins do not contain a propeptide (8,17,21,23-27) (Fig. 1). Because the C-terminal propeptide is present in heterodimers, it is likely that heterodimerization of rainbow trout VE proteins can take place on ovarian eggs prior to C-terminal processing at the CFLCS. However, it should be noted that covalently cross-linked heterodimers, or HMWPs, usually represent a relatively small fraction of total VE protein. Fish VE proteins heterodimerize through their N-terminal PQ region in a reaction catalyzed by a transglutaminase (i.e. formation of an amide bond between the γ-carbonyl of Gln residues and the ε-amino group of Lys residues with release of NH3) (58). This is consistent with our finding that VE heterodimers are completely stable under conditions that would dissociate noncovalently linked monomers or monomers linked by intermolecular disulfides (i.e. in the presence of SDS and DTT). Mammalian ZP proteins lack an N-terminal PQ region, and covalently linked heterodimers or higher order oligomers between ZP1, 2, and/or 3 have not been detected in either the egg or embryo ZP (59).
There are significant differences between VE assembly in some fish and ZP assembly in mammals. For example, ZP protein precursors are synthesized in the ovary by growing oocytes and/or follicle cells and have a C-terminal transmembrane (TM) domain that anchors the proteins in egg plasma membrane prior to assembly (28). On the other hand, trout VE protein precursors are synthesized in the liver, are transported to the ovary, and do not have a TM domain (28). Despite the differences, nascent mammalian ZP proteins and fish VE proteins are deposited on the inside margin of the extracellular coats (18,60). Most interestingly, when ZP proteins are truncated just upstream of the TM domain, the proteins lacking a TM domain are secreted but are neither cleaved at the CFLCS nor incorporated into the ZP (6,28). The evidence suggests that the TM domain is not involved in specific interactions, but ensures proper localization and/or topological orientation of nascent proteins so that proteolytic processing and assembly can take place. Unlike ZP precursor proteins, VE protein precursors that lack a TM domain undergo both cleavage at the CFLCS and assembly into the VE once they reach the ovary (23). Because, like ZP proteins, VE protein precursors possess both an EHP and IHP (28), presumably these elements prevent premature polymerization of VE proteins into fibrils/filaments in the bloodstream. As a result of proteolytic processing at the CFLCS in the ovary, the EHP is lost with the propeptide, the IHP in the ZP domain is exposed, and mature VE proteins are able to polymerize around eggs.
A number of interesting questions about VE assembly remain. Perhaps principal among these is the question of how some fish VE protein precursors are targeted specifically to the ovary from the bloodstream. In an analogous situation in fish, birds, and amphibians, the yolk precursor protein, vitellogenin, is synthesized and secreted by the liver and transported in the bloodstream to the ovary (61). There it is taken up into growing oocytes in a receptor-mediated fashion by micropinocytosis. In the chicken, vitellogenin receptors also import very low density lipoprotein, riboflavin-binding protein, and α2-macroglobulin into growing oocytes (62). Whether or not a similar receptor-mediated mechanism applies to uptake of fish VE protein precursors remains to be determined. In a similar vein, it will be of interest to determine the source of the furin-like enzyme that cleaves VE precursor proteins at the CFLCS in the ovary. It is possible that the enzyme is associated with the oocyte plasma membrane that is close to the innermost layer of the VE into which nascent, processed VE proteins are incorporated. The enzyme could be the receptor for VE proproteins, which would explain why rainbow trout VE proproteins lack a TM domain. These and other issues will be the focus of future studies of VE formation in fish.