Localization of Carbohydrate Attachment Sites and Disulfide Bridges in Limulus α2-Macroglobulin: EVIDENCE FOR TWO FORMS DIFFERING PRIMARILY IN THEIR BAIT REGION SEQUENCES


The primary structure determination of the dimeric invertebrate α2-macroglobulin (α2M) from Limulus polyphemus has been completed by determining its sites of glycosylation and disulfide bridge pattern. Of seven potential sites for N-linked glycosylation, six (Asn275, Asn307, Asn866, Asn896, Asn1089, and Asn1145) carry common glucosamine-based carbohydrate groups, whereas one (Asn80) carries a carbohydrate chain containing both glucosamine and galactosamine. Nine disulfide bridges, which are homologous to bridges in human α2M, have been identified (Cys228-Cys269, Cys456-Cys580, Cys612-Cys799, Cys657-Cys707, Cys849-Cys876, Cys874-Cys910, Cys946-Cys1328, Cys1104-Cys1155, and Cys1362-Cys1475). In addition to these bridges, Limulus α2M contains three unique bridges that connect Cys361 and Cys382, Cys1370 and Cys1374, respectively, and Cys719 in one subunit with the same residue in the other subunit of the dimer. The latter bridge forms the only interchain disulfide bridge in Limulus α2M. The location of this bridge within the bait region is discussed and compared with other α-macroglobulins. Several peptides identified in the course of determining the disulfide bridge pattern provided evidence for the existence of two forms of Limulus α2M. The two forms have a high degree of sequence identity, but they differ extensively in large parts of their bait regions, suggesting that they have different inhibitory spectra. The two forms (Limulus α2M-1 and -2) are most likely present in an ~2:1 ratio in the hemolymph of each animal, and they can be partially separated on a Mono Q column at pH 7.4 by applying a shallow gradient of NaCl.
α2-Macroglobulin (α2M) from the American horseshoe crab, Limulus polyphemus, is a member of the class of proteinase-binding α-macroglobulins (αMs) present in the blood of vertebrates and invertebrates (1). αMs are glycoproteins containing ~1450 residues, and they circulate as 180-kDa monomers, 360-kDa disulfide-bridged dimers, or 720-kDa tetramers noncovalently assembled from two disulfide-bridged dimers (2, 3). Proteinase binding is initiated by one or more cleavages in an ~30-60-residue stretch near the middle of the subunit (the bait region) (4, 5). This elicits a conformational change that leads to entrapment of the proteinase (5). The bound proteinase is poorly accessible to high molecular weight substrates and inhibitors, and through rapid clearance of the αM-proteinase complex the αMs play a role in controlling the level of proteolytic activity in the blood and tissues (2). In many vertebrate species, e.g. man, rat, mouse, and pig, two or three related αMs have been found (3). Because their sequences differ greatly in their bait regions, each αM probably controls a particular set of proteinases, although information on this is fragmentary (2, 6).
Limulus α2M is the most extensively studied invertebrate αM. It is a 360-kDa dimer (7, 8), and its proteinase-binding characteristics (1, 9), shape (7, 8), amino acid sequence deduced from its cDNA sequence (10), and carbohydrate composition (10) have been determined. In the set of peptides generated to initiate cDNA cloning (11, 12), it was observed that residues in several positions were at variance with those predicted from the cDNA (10), suggesting that the Limulus α2M used, which was purified from pooled hemolymph, was a mixture of two or more forms. Like most other known αMs, Limulus α2M contains internal thiol esters (13). When activated during proteinase complex formation, the αM thiol esters rapidly react with nucleophilic groups on the attacking proteinase and other available nucleophiles (14). This process results in efficient cross-linking of the proteinase to αMs (15). However, in Limulus α2M covalent proteinase binding is insignificant as the bound proteinase can be released by denaturation (7, 16). In the case of trypsin an unusual self-cross-linking reaction within the Limulus α2M dimer contributes to the tight binding of trypsin (17).
Human α2M is the only αM for which complete information on the arrangement of its intrachain and interchain disulfide bridges and positions of Asn-based carbohydrate groups is available (18, 19). Apart from the Cys residue being part of the thiol ester site, the human α2M subunit contains 24 Cys residues, of which 22 engage in 11 intrachain bridges and 2 engage in interchain disulfide bridges, thereby aligning the two subunits of the α2M dimer in an antiparallel fashion. For vertebrate αMs of known sequence most positions of disulfide bridges can readily be predicted from the data on human α2M.
In addition to the Cys residue engaging in thiol ester formation, the Limulus α2M subunit contains 23 Cys residues, of which 18 would be expected to form nine disulfide bridges equivalent to those found in human α2M. However, for Limulus α2M the pattern of disulfide bridges is ambiguous with regard to five positions (10). The Cys residues that have no counterpart in human α2M are located at positions 360, 381, 719, 1370, and 1434. Because no free −SH groups can be detected in native Limulus α2M, they are all likely to be paired, and importantly, the two subunits of the Limulus α2M dimer must be connected by an odd number of interchain bridges, in contrast to the human dimer.
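The parity argument in the preceding paragraph can be checked with a short calculation. The following Python sketch is only an illustration: the function name is hypothetical, and it assumes that every intrachain bridge uses two Cys residues of one subunit whereas every interchain bridge uses one Cys residue from each of the two identical subunits.

```python
def bridge_options(cys_per_subunit):
    """Enumerate (intrachain, interchain) bridge counts per subunit that are
    compatible with all Cys residues of a homodimer subunit being paired.

    Illustrative assumption: an intrachain bridge consumes two Cys residues
    of the subunit; an interchain bridge consumes one Cys residue per subunit.
    """
    options = []
    for intrachain in range(cys_per_subunit // 2 + 1):
        interchain = cys_per_subunit - 2 * intrachain
        options.append((intrachain, interchain))
    return options


# 23 bridge-forming Cys residues per Limulus subunit, 24 per human subunit
# (the thiol ester Cys is excluded in both cases, as in the text).
limulus = bridge_options(23)
human = bridge_options(24)

# Every solution for 23 Cys has an odd number of interchain bridges,
# every solution for 24 Cys an even number.
print(all(inter % 2 == 1 for _, inter in limulus))
print(all(inter % 2 == 0 for _, inter in human))
```

Under these assumptions the human arrangement of 11 intrachain and 2 interchain bridges is one of the even solutions, and any fully paired Limulus subunit necessarily falls among the odd ones.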
From the carbohydrate composition given earlier (10) Limulus α2M contains both glucosamine and galactosamine (19.4 and 2.9 residues/mol subunit, respectively). As galactosamine is not present in human α2M and probably other mammalian αMs, it was also of interest to locate the carbohydrate groups in Limulus α2M. Limulus α2M contains seven candidate Asn residues for attachment of glucosamine-based glycan groups (Asn80, Asn275, Asn307, Asn866, Asn896, Asn1089, and Asn1145 (10)). In contrast, carbohydrate groups containing galactosamine are frequently bound to Ser and Thr residues which, however, cannot readily be predicted.
Here we report the determination of the complete disulfide bridge pattern of Limulus α2M consisting of 11 intrachain bridges and one interchain bridge. Curiously, the single interchain bridge engages a Cys residue located in the bait region. We also report the localization of six Asn residues carrying glucosamine-based carbohydrate groups, and one Asn residue carrying a carbohydrate group containing both glucosamine and galactosamine. We further provide evidence from sequencing of a number of peptides for a second Limulus α2M, the sequence of which differs from that reported earlier (10) particularly in its bait region. The two forms (Limulus α2M-1 and -2) are most likely present in each animal and can be partially separated by ion exchange chromatography on Mono Q at pH 7.4.
Analytical Procedures-Amino acid analysis was performed by cation exchange using established procedures (21). Automated peptide sequencing and PTH-derivative analysis were carried out as reported (22). SDS-PAGE was performed in 10-20 and 20% slab gels using the standard Tris glycine system. Mass spectra were acquired with a Bruker BIFLEX matrix-assisted laser desorption/ionization time-of-flight instrument equipped with a 1-m flight tube, a reflector, a 337-nm nitrogen laser, and a 1-GHz digitizer. Thin film matrix surfaces were prepared using the fast evaporation technique from α-cyano-4-hydroxycinnamic acid dissolved in acetone/water (99:1) to 30 g/l. A 0.5-μl volume of analyte (0.1-10 pmol/μl) was deposited on the matrix surface and allowed to dry onto the crystals. Spectra were obtained by averaging 20-50 single-shot spectra and calibrated externally by using the calibration constants of angiotensin II and adrenocorticotropic hormone, fragment 18-39. Theoretical protonated masses (MH+) were calculated using the GPMAW program (Lighthouse Data, Odense, Denmark).
Main Digest-Approximately 30 mg of Limulus α2M was used as starting material for the main digest. SDS-PAGE revealed that almost complete bait region cleavage had occurred during preparation and/or storage, and the material was therefore treated at pH 8.5 with 20 mM IAA to block the thiol group appearing upon thiol ester cleavage. After removal of excess reagent by gel chromatography on Sephadex G-25 in 10% formic acid, the material was freeze-dried, redissolved in 70% formic acid, and treated with 50 mg of CNBr for 20 h. After freeze-drying, the degraded Limulus α2M was redissolved in 300 μl of formic acid, and 10 volumes of water were added. Aliquots of 5 M NaOH were added to raise the pH of the solution, and at pH 4-5 the solution turned turbid. Precipitation appeared to be complete at pH 7-8.
In analytical experiments trypsin or thermolysin was added at pH 4, and digestion was attempted after raising the pH to 7-8. HPLC experiments showed extensive degradation only with thermolysin. Therefore, to a solution of ~18 mg of CNBr-degraded Limulus α2M, 1:50 (w/w) thermolysin was added at pH 4, and the pH was subsequently raised to 7 by addition of Tris. After incubation for 90 min at 55 °C with stirring, the suspension had nearly cleared. To separate the larger carbohydrate-containing and disulfide peptides from the small peptides, the digest was fractionated on a Superdex peptide column using 0.1% trifluoroacetic acid, 25% acetonitrile as eluent. The column effluent was monitored by measuring the absorbance at 280 nm and by determining the amount of cysteic acid and amino sugars in each fraction after performic acid oxidation and amino acid analysis (not shown). Two pools, one containing all carbohydrate peptides and the larger disulfide peptides and one containing the small disulfide peptides, were made.
Half of each pool was separated by RP-HPLC on an 8 × 250-mm column packed with Nucleosil C18 using elution with gradients of acetonitrile in 0.1% trifluoroacetic acid (Fig. 1, A and B). Cys- and carbohydrate-containing peptides were located as above by performic acid oxidation and amino acid analysis of aliquots from each fraction combined in pairs.
Pools of interest were subjected to ion exchange chromatography on a 4 × 250-mm LC-SCX column using gradient elution with NaCl in 5 mM phosphoric acid, 25% acetonitrile (23). Usually 4-10 components appeared (not shown), and Cys- and carbohydrate-containing peptides were located as above. In a number of cases ion exchange pools required a further RP-HPLC step on a 4 × 125-mm column packed with Hypersil C18 before unambiguous identification of the content could be made by amino acid and sequence analysis.

FIG. 1. RP-HPLC separation of the two Superdex peptide pools from the main digest. A, one-half of the pool containing the large disulfide-bridged peptides and all carbohydrate-containing peptides was eluted from an 8 × 250-mm Nucleosil C18 column with linear gradients formed from 0.1% trifluoroacetic acid (solvent A) and 90% acetonitrile containing 0.08% trifluoroacetic acid (solvent B) (dashed line). The column was operated at 50 °C at a flow rate of 2 ml/min, and 0.67-ml fractions were collected. The separation was monitored by recording the absorbance at 225 nm (solid line) and by determining the amount of half-cystine, glucosamine (GlcN), and galactosamine (GalN) in fractions combined in pairs (vertical bars). B, one-half of the pool containing small disulfide-bridged peptides was separated on the Nucleosil C18 column under the same conditions as in A. The peptides were detected at 225 nm (solid line), and the content of half-cystine in two successive fractions is indicated by vertical bars. The elution conditions and the fraction size were the same as in A.
Ancillary Digest-Evidence for two disulfide bridges and the localization of carbohydrate at two potential positions were lacking from the main digest. A preparation of Limulus α2M (containing intact subunits) was treated with methylamine and IAA and dialyzed into 10 mM phosphoric acid. From inspection of the sequences around the positions where peptide evidence was missing, trypsin was chosen to possibly generate a set of fairly large peptides that might relatively easily be separated from the bulk of smaller peptides. When aliquots of 1 M Tris were added to raise the pH, the solution turned turbid at pH 4-5 as seen before, followed by extensive precipitation above pH 7. However, SDS-PAGE consistently showed the generation of a set of species between ~15 and 28 kDa when 1:50 (w/w) trypsin was added at pH 4 and digestion allowed to proceed at 37 °C for 24 h at pH 6.5 (not shown). Upon electroblotting to polyvinylidene difluoride and sequence analysis, the relevant peptides were found to be present in the 15-28-kDa species of the digest.
On a preparative scale 4-mg portions of Limulus α2M digested with trypsin as above were subjected to gel chromatography on Superdex 75. In one experiment the separation was done using 10 mM sodium acetate, 100 mM NaCl, pH 4.5 (Fig. 2A). A major part of the larger species of the unfractionated digest was lost by adsorption to the column matrix and could only be recovered by subsequent elution with 6 M guanidinium chloride. Nevertheless, the soluble material in pools 1 and 2, upon subsequent digestion with S. aureus proteinase and RP-HPLC separation (not shown), provided peptides giving evidence for the position of the two remaining carbohydrate groups.
The material eluted with guanidinium chloride as described above was desalted and subjected to SDS-PAGE. Upon electroblotting and sequence analysis a major 26-28-kDa doublet species containing two N termini was found to cover the stretches where the missing two disulfide bridges must be located. However, in subsequent manipulations the material was lost. An additional preparative tryptic digest as above was then subjected to gel chromatography on Superdex 75 using 50% formic acid as solvent (Fig. 2B). The relevant material eluted in the void volume (horizontal bar in Fig. 2B), together with other fragments, and was digested with 1:20 (w/w) S. aureus proteinase at pH 7 for 2 h. RP-HPLC separation showed fair digestion (Fig. 3A). Upon subsequent digestion of one particular fragment set (horizontal bar in Fig. 3A) with chymotrypsin and RP-HPLC separation (Fig. 3B), unambiguous assignment of the two disulfide bridges present could be made from peptides present in the section of the chromatogram labeled with a horizontal bar.
In most cases the evidence for assigning the disulfide bridges was based on the amino acid composition of performic acid-oxidized, relatively short pure peptides and sequence analysis of intact peptides. No peptides contained an internal disulfide bridge, and two sequences in near equimolar yield were seen, occasionally on a background of several minor irrelevant components. In cases where Cys2 was released after less than 5-8 cycles of Edman degradation, bis-PTH-Cys2 was normally seen in the RP-HPLC analysis of the PTH-derivatives, eluting as a low yield peak near PTH-Tyr (24); after more than 8 cycles no signal was observed. When performed, mass spectrometry (MS) confirmed the assignment based on sequence analysis. However, in two cases MS provided the full evidence for assignment.
The evidence for locating carbohydrate groups to particular positions was based on the presence of glucosamine (and in one case also galactosamine) in hydrolysates coupled with the lack of a PTH-derivative when encountering Asn residues located in the sequence Asn-Xaa-Ser/Thr (Xaa not Pro).
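The sequence criterion Asn-Xaa-Ser/Thr (Xaa not Pro) is easy to apply mechanically. The following Python sketch is an illustration only (the function name and the numbering argument are hypothetical, not part of the procedures used here); it scans a one-letter sequence for candidate sites and is applied to the peptide LYANGSYSSPSSNDFFFE, which begins at residue 77.

```python
import re

# Asn-Xaa-Ser/Thr with Xaa not Pro. The zero-width lookahead reports each
# candidate Asn even if two sequons overlap.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence, first_residue_number=1):
    """Return the residue numbers of Asn residues in N-glycosylation sequons."""
    return [m.start() + first_residue_number for m in SEQUON.finditer(sequence)]


peptide = "LYANGSYSSPSSNDFFFE"  # starts at residue 77 of the subunit
print(find_sequons(peptide, first_residue_number=77))  # [80]
```

Only the first Asn of this peptide (Asn80, in Asn-Gly-Ser) satisfies the criterion; the second Asn is followed by Asp-Phe and is therefore not a candidate site. A match identifies a potential site only, so the compositional and sequencing evidence described above is still required.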
Preparation and Digestion of Partially Reduced Methylamine-reacted Limulus α2M-Prior to this experiment Limulus α2M was treated with methylamine and IAA. By using 8 mM mercaptoethanesulfonic acid (MESA) at pH 8.0 for 20 min, the interchain disulfide bridge(s) in the Limulus α2M dimer could be reduced to an extent of >90% as evaluated

FIG. 3. RP-HPLC separation of digests aimed at determining the remaining disulfide bridges. A, an S. aureus proteinase digest of the pool from the Superdex 75 separation carried out in 50% formic acid (Fig. 2B) was separated on a 4 × 250-mm Nucleosil C18 column. The column was equilibrated with 5% solvent B (90% acetonitrile, 0.08% trifluoroacetic acid) and 95% solvent A (0.1% trifluoroacetic acid) and eluted with a gradient formed by solvent A and solvent B (dashed line). The separation was performed at 50 °C at a flow rate of 1 ml/min, and fractions of 0.5 ml were collected. Peptides were detected at 215 nm (solid line), and the amount of half-cystine in fractions having an absorbance >0.05 was determined (vertical bars). By MS and sequence analyses of half-cystine-containing fractions it was found that the fractions indicated by a horizontal bar contained a disulfide-bridged cluster involving Cys657, Cys707, and Cys719. B, the fractions shown by the horizontal bar in A were digested with chymotrypsin and separated on a 2 × 250-mm Nucleosil C18 column. The column was operated at 50 °C at a flow rate of 0.2 ml/min and eluted with a gradient formed from the same solvents as in A (dashed line). The separation was monitored at 215 nm (solid line), and 0.1-ml fractions were collected.

The labeled protein was redissolved in 2 ml of 6 M guanidinium chloride; the pH was adjusted to 9.0 by addition of Tris, and dithioerythritol was added to 10 mM to fully reduce Limulus α2M. After reduction for 30 min IAA was added to 30 mM, and after 30 min of reaction the reduced and carboxamidomethylated protein was recovered by gel chromatography on Sephadex G-25 using 20 mM Tris-HCl, 100 mM NaCl, pH 8.0 as eluent. The preparation was digested with 1:50 (w/w) trypsin for 3 h at 37 °C, and after acidification with trifluoroacetic acid the digest was separated by RP-HPLC on an 8 × 250-mm column packed with Nucleosil C18 using gradient elution with acetonitrile in 0.1% trifluoroacetic acid (Fig. 4). The fractions containing radioactivity were then located by scintillation counting, and the major peptides were further purified by RP-HPLC on a column of Hypersil C18 or by cation exchange chromatography as above (not shown).
Determination of the Amino Acid Sequence of the Bait Region of Limulus α2M-2-One mg of Limulus α2M was treated with ~2 mg of CNBr in 70% formic acid for 20 h at room temperature. After drying, the degraded material was redissolved in 500 μl of 6 M guanidinium chloride, 50 mM Tris-HCl, pH 9.0, and reduced for 30 min with 10 mM dithiothreitol. The peptide solution was then acidified with trifluoroacetic acid and loaded on a 4.6 × 250-mm Vydac C4 column equilibrated with 4.5% 2-propanol in 0.1% trifluoroacetic acid. The column was eluted with a gradient of 2-propanol in 0.1% trifluoroacetic acid (Fig. 5), and the bait region peptides of ~13 and 18 kDa corresponding to each form of Limulus α2M were identified by SDS-PAGE and N-terminal sequencing of samples blotted onto polyvinylidene difluoride membranes. From these experiments a 32-residue stretch containing the bait region of Limulus α2M-2 was determined.
Isolation of the Two Forms of Limulus α2M from Single Animal Hemolymph-Hemolymph (80- and 100-ml samples) was separately drawn from two animals and processed as described earlier (20). Five mg of material depleted in hemocyanin and pentraxin was subjected to gel chromatography on a Superose 6 HR 10/30 column equilibrated and eluted with 50 mM Tris-HCl, pH 7.4. The Limulus α2M-containing fractions near the void volume of the column were pooled and loaded on a Mono Q HR 5/5 column equilibrated with the above Tris buffer and eluted with a shallow gradient of NaCl. Two partially separated peaks appeared at [NaCl] = 190 and 220 mM, representing Limulus α2M-2 and -1, respectively (Fig. 6).
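The relation between the NaCl concentrations quoted above and the elution time can be sketched by linear interpolation between the gradient breakpoints listed in the legend to Fig. 6 (0 mM at 10 min, 300 mM at 67 min, 500 mM at 80 min, and 700 mM at 85 min). The Python sketch below is an illustration only: the function names are hypothetical, and it assumes the programmed gradient reaches the detector without delay, i.e. it ignores column and tubing dead volume.

```python
# (time in min, NaCl in mM) breakpoints taken from the legend to Fig. 6.
GRADIENT = [(10.0, 0.0), (67.0, 300.0), (80.0, 500.0), (85.0, 700.0)]


def nacl_at(minute, breakpoints=GRADIENT):
    """Programmed NaCl concentration (mM) at a given time (min)."""
    if minute <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, c0), (t1, c1) in zip(breakpoints, breakpoints[1:]):
        if minute <= t1:
            return c0 + (c1 - c0) * (minute - t0) / (t1 - t0)
    return breakpoints[-1][1]


def minute_for(conc, breakpoints=GRADIENT):
    """Time (min) at which the programmed gradient reaches a concentration (mM)."""
    for (t0, c0), (t1, c1) in zip(breakpoints, breakpoints[1:]):
        if c0 <= conc <= c1:
            return t0 + (t1 - t0) * (conc - c0) / (c1 - c0)
    raise ValueError("concentration outside the programmed gradient")


for form, conc in (("alpha2M-2", 190.0), ("alpha2M-1", 220.0)):
    print(form, round(minute_for(conc), 1), "min")
```

Under these assumptions the two peaks lie on the first, shallow segment of the gradient (about 5.3 mM NaCl per min) and are programmed to elute roughly 6 min apart, which is consistent with the only partial separation of the two forms.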

Assignment of Disulfide Bridges and Location of Carbohydrate Groups-When determining the sequence of Limulus α2M by a combination of peptide sequencing and cDNA cloning (10), we found a number of positions in which residues deviating from those determined from the cDNA sequence were present. As detailed below, additional partial peptide sequence information has been obtained that does not conform with the sequence deduced from the cDNA sequence. In fact, the preparation of Limulus α2M used for analysis is a mixture of two closely related forms, α2M-1 representing the published sequence (10) and α2M-2 representing the set of deviating partial sequences.
In order to localize the glycosylation sites and disulfide bridges in Limulus α2M, a digest of Limulus α2M termed the main digest was made. First Limulus α2M, which had been treated with IAA, was degraded with CNBr with the aim of generating a limited number of relatively large peptide clusters which could then be further enzymatically digested. However, in contrast to human α2M (25), most of the material appeared in one large cluster. Subsequent digestion of the material with trypsin was unsuccessful due to limited solubility at pH 7-8, but treatment with thermolysin brought most of the material into solution. Upon gel chromatography on a Superdex peptide column, the effluent was divided into two pools, one consisting of the large peptides including those containing carbohydrate (pool A) and one consisting of the small peptides (pool B) (see "Experimental Procedures"). The material in each pool was subjected to RP-HPLC (Fig. 1, A and B). In a few cases the material in the fractions obtained after RP-HPLC was of sufficient purity to allow assignment of the disulfide bridges and to locate the sites of carbohydrate attachment by compositional and sequence analysis. In general, the RP-HPLC pools made on the basis of their content of half-cystine and amino sugars were further separated by cation exchange chromatography (not shown).

FIG. 4. RP-HPLC elution profile of tryptic peptides generated from 14C-carboxamidomethylated partially reduced methylamine-treated Limulus α2M. The peptides were separated on an 8 × 250-mm Nucleosil C18 column at 50 °C at a flow rate of 2 ml/min using gradient elution with linear gradients (dashed line) formed from 0.1% trifluoroacetic acid (solvent A) and 90% acetonitrile containing 0.08% trifluoroacetic acid (solvent B). The peptides were detected at 215 nm (solid line), and the amount of radioactivity in two successive fractions each having a size of 0.67 ml is indicated by vertical bars.

FIG. 5. RP-HPLC separation of reduced CNBr fragments of Limulus α2M on a Vydac C4 column. The column was eluted at 50 °C with a stepwise linear gradient (dashed line) formed from 0.1% trifluoroacetic acid (solvent A) and 90% 2-propanol, 0.1% trifluoroacetic acid (solvent B). The flow rate was 0.75 ml/min and the fraction size was 0.5 ml. The elution was monitored by recording the absorbance at 280 nm (solid line), and bait region-containing fragments were identified by SDS-PAGE and sequence analysis (bar).

FIG. 6. Separation of Limulus α2M-1 and Limulus α2M-2 from hemolymph of a single animal. A pool obtained from gel chromatography on Superose 6 HR 10/30 was subjected to ion exchange chromatography on Mono Q HR 5/5. The column was equilibrated with 50 mM Tris-HCl, pH 7.4, and eluted at a flow rate of 0.80 ml/min with a gradient of NaCl in the same buffer. Gradient breakpoints: 0 mM NaCl at 10 min, 300 mM NaCl at 67 min, 500 mM NaCl at 80 min, 700 mM NaCl at 85 min (dashed line). The separation was monitored by recording the absorbance at 280 nm. Limulus α2M-2 eluted in fractions ~47-52 and Limulus α2M-1 eluted in fractions ~50-58. The identification of the forms was made by separately degrading material from fractions 49 + 50 and 55 + 56 with CNBr followed by sequence analysis of the characteristic bait region fragments of 15 (Limulus α2M-2) and 18 kDa (Limulus α2M-1).
Because no peptide material could be isolated from the main digest in sufficient purity to assign two particular disulfide bridges, and because evidence for localizing two carbohydrate groups was ambiguous, an ancillary digest was investigated. By using carefully controlled conditions, tryptic digestion of Limulus α2M produced material which upon separation (Fig. 2, A and B) and further digestion provided the relevant peptides to complete the carbohydrate localization and bridge assignment.
The evidence for the location of carbohydrate groups to Asn residues is summarized in Table I. As seen in Fig. 1A glucosamine was present in many fractions in the elution profile, whereas galactosamine was only present in fractions 163-180. The major component in these fractions was a peptide with the N-terminal sequence LYANGS (77-82) apparently containing both glucosamine and galactosamine. In subjecting the material in these fractions to lectin affinity chromatography on concanavalin A and Jacalin-Sepharose to possibly separate the glucosamine- and galactosamine-containing peptides, the material was lost. However, from subdigestion of material from the ancillary digest (pool 2 in Fig. 2A) the peptide LYANGSYSSPSSNDFFFE containing both amino sugars was isolated. Upon chymotryptic digestion of this peptide, it was established by amino acid analysis that the peptide LYANGSY contained both amino sugars, whereas the peptide SSPSSNDF contained none. Sequence analysis of the former peptide revealed that no PTH-Asn was present in cycle 4 and that PTH-Ser was present in normal amounts in step 6. Hence, it can be concluded that Asn80 carries a glycan containing both glucosamine and galactosamine.
It was further established from sequence analysis of fractions obtained from the RP-HPLC separation of pool A from the main digest (Fig. 1A) that Asn275, Asn307, Asn896, Asn1089, and Asn1145 all carry glucosamine-based carbohydrate groups. The composition of the material found in pool A26-A28 indicated that a short peptide containing Asn866 was present. However, no evidence from sequence analysis for carbohydrate on Asn866 could be obtained, possibly due to cyclization of Gln865 during treatment with CNBr in formic acid. Upon subdigestion of the material from pool 1 in Fig. 2A, originating from the ancillary digest, a disulfide-bridged peptide cluster composed of residues 833-853, 864-880, and 889-921 was obtained. Sequence analysis of this peptide cluster provided evidence for carbohydrate on Asn866 as well as confirmatory evidence for carbohydrate on Asn896.
The sequences of peptides containing the glycosylated Asn residues shown in Table I all represent α2M-1, and no evidence for peptides containing Asn residues in different sequences was obtained. This indicates that α2M-2 has the same pattern of glycosylated Asn residues as α2M-1.
In Table II the evidence for the assignment of all disulfide bridges in Limulus α2M is summarized. The bridges Cys228-Cys269, Cys361-Cys382, Cys456-Cys580, Cys612-Cys799, Cys849-Cys876, Cys874-Cys910, Cys946-Cys1328, Cys1104-Cys1155, Cys1362-Cys1475, and Cys1370-Cys1434 were all identified from the main digest. Most of the peptides containing these disulfide bridges were recovered as several cleavage variants, some of which are listed in Table II. With the exception of the bridges considered below, the sequences of the peptide mates originated from α2M-1. In the case of the bridges Cys1104-Cys1155 and Cys1362-Cys1475, the mates containing Cys1155 and Cys1475 did not originate from α2M-1, but rather from α2M-2, and the stretches containing Cys849 and Cys1434 were recovered as variants originating from both α2M-1 and α2M-2.
Despite extensive search no peptides containing Cys657, Cys707, and Cys719 could be identified from the main digest. This could be due to precipitation of this material during the preparation of the digest. An ~28-kDa tryptic fragment, which contains the above-mentioned Cys residues, was generated in the ancillary digest. Upon electroblotting and sequencing two N termini, YXEDYK (starting at residue 656) and QTEGEHEG (starting at residue 664), were determined in equimolar yield. The fragment must contain ~250 residues, and hence Cys657 in the peptide YCEDYK must be disulfide-bound to Cys707 or Cys719 present in a large fragment containing the bait region. Furthermore, Cys707 or Cys719 must engage in bridge formation with the corresponding residue in another large fragment, i.e. the ~28-kDa fragment set is a heterotetramer.

a The peptides were obtained by digestion with S. aureus proteinase of the large fragments present in pools 1 and 2 in the separation shown in Fig. 2A.
The fragment set was subdigested with S. aureus proteinase after partial separation from several other fragments on a Superdex 75 column that was equilibrated and eluted in 50% formic acid (Fig. 2B). The S. aureus proteinase digest was fractionated by RP-HPLC (Fig. 3A), and a smaller fragment was obtained that contained the three Cys residues. This fragment was further digested with chymotrypsin, and the digest was again fractionated by RP-HPLC (Fig. 3B). By MS analysis of fractions from this RP-HPLC separation, it was demonstrated that Cys657 is connected to Cys707 (expected peptide set TRPCKPSGF bound to YCEDYK (α2M-1), Table II) and that Cys719 is engaged in forming an interchain bridge within the Limulus α2M dimer. In this case no MS data were obtained from the peptide expected from α2M-1 (dimer of EDGGRPCPQYDVAF, starting at residue 713), but rather from α2M-2, as the dimeric peptide set EDGGRPCPQFDE (starting at residue 713) was identified (Table II). That sequence was known from peptides isolated from partial reduction experiments described below.
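How a theoretical protonated mass (MH+) for a disulfide-bound peptide pair is obtained can be sketched as follows. This Python sketch is an illustration and not the GPMAW calculation used in this work: it assumes standard monoisotopic residue masses, unmodified peptides with free termini, and the loss of two hydrogen atoms per disulfide bridge; whether monoisotopic or average masses were used for the assignments is not stated here, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # added once per peptide for the free termini
HYDROGEN = 1.00783   # one hydrogen atom lost per Cys on bridge formation
PROTON = 1.00728     # charge carrier of the singly protonated ion


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO[residue] for residue in sequence) + WATER


def disulfide_pair_mh(peptide_a, peptide_b):
    """MH+ of two peptides joined by a single disulfide bridge."""
    return peptide_mass(peptide_a) + peptide_mass(peptide_b) - 2 * HYDROGEN + PROTON


# Peptide sets named in the text: the Cys657-Cys707 pair and the
# Cys719-Cys719 dimer identified from the second form.
print(round(disulfide_pair_mh("TRPCKPSGF", "YCEDYK"), 2))
print(round(disulfide_pair_mh("EDGGRPCPQFDE", "EDGGRPCPQFDE"), 2))
```

Comparing such values with the observed masses only supports an assignment if the cleavage sites and any modifications of the peptides are known, which is why sequence information was used alongside MS.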
Partial Reduction of Methylamine-treated Limulus α2M-Due to the particular location of the 24 half-cystine residues in human α2M, whose pairing to 12 bridges was relatively easily established (18), the identification of two of these as being engaged in interchain bridges was made by partial reduction using the highly solvated MESA and radiolabeling (19). As judged from SDS-PAGE (not shown), the subunits of methylamine-reacted Limulus α2M were likewise essentially fully separated using 8 mM MESA for 30 min at room temperature. Following alkylation with radiolabeled IAA and removal of excess reagents, the preparation was fully reduced and alkylated with unlabeled IAA. The material was then digested with trypsin and separated by RP-HPLC (Fig. 4). The fractions containing the major part of the radioactivity were further purified, and two sets of peptides were characterized from these fractions as summarized in Table III.
Two major radiolabeled peptides present in pools 166-168 and 187-188 (Fig. 4), respectively, originated from unexpected cleavage by trypsin at Arg-Pro (705-706). Upon sequence analysis they were found to contain the same N-terminal sequence PCKPSGFEDGGRPCPQ. The radiolabel was found solely in position Cys719, hence demonstrating that the interchain disulfide bridge containing this Cys residue is indeed solvent-exposed. However, downstream of Cys719 the sequences diverged, with that of the peptide from pool 166-168 being identical to the sequence expected from the bait region of α2M-1 and that of pool 187-188 being different (α2M-2, Table III). This shows that a fairly long stretch of the bait regions of α2M-1 and α2M-2 differs markedly in sequence.
The cleavage of partially reduced Limulus α2M by trypsin was incomplete, as evidenced by the distribution of a major part of the label in several late-eluting fractions. The peptide in pool 204-205, originating from Limulus α2M-2, apparently was that of pool 187-188 with a long C-terminal extension which, however, could not be identified due to poor sequencing yields. Two major labeled peptides present in pools 303-305 and 312-315, respectively, had N termini reflecting cleavage at Arg-Gln (663-664). Upon subdigestion with chymotrypsin, several peptides, which could not be adequately purified, indicated that they each contained the bait region and hence represented N-terminally extended versions of the bait region peptides. The specific activity of all peptides containing the bait region was ~40,000 cpm/nmol.
From two other pools, peptides with an ~10-fold lower specific activity than the bait region peptides described above were isolated. These peptides were mates of the C-terminal disulfide bridge Cys1362-Cys1475, pointing to solvent exposure of that bridge, which is located in the part of Limulus α2M presumably equivalent to the receptor-binding domain of human α2M (26, 27). In addition, two versions of the mate containing Cys1362, originating from α2M-1 and α2M-2, respectively, were found (Table III).
The pools indicated with prefix A and B are from the separations shown in Fig. 1A and Fig. 1B, respectively. Pools named with the prefix Ct are from the separation shown in Fig. 3B. For masses below 1200 Da monoisotopic masses were calculated, and above 1200 Da average masses are shown. The underlined amino acid residues deviate from the amino acid sequence deduced from cDNA cloning.
Determination of the Complete Bait Region Sequence of Limulus α2M-2. In order to determine the entire bait region sequence of Limulus α2M-2 and to be able to distinguish between the two forms of Limulus α2M, we took advantage of the fact that the partial bait region sequence of α2M-2 contains a Met residue at position 728 not present in α2M-1 (Table III). By assuming that α2M-2 like α2M-1 has a Met residue at position 864, CNBr degradation would generate ~15- and 18-kDa fragments, respectively. After incubation with CNBr the fragments generated were reduced and separated on a Vydac C4 column (Fig. 5). Both the 18-kDa fragment containing the entire bait region of α2M-1 and the 15-kDa fragment containing the C-terminal part of the bait region of α2M-2 were found to elute in fractions 40 and 41. By Edman degradation of samples electroblotted onto polyvinylidene difluoride, the remaining part of the sequence of the bait region of α2M-2 was obtained from the 15-kDa fragment. The complete sequence of the bait region of α2M-2 is aligned with the bait region sequences of Limulus α2M-1 (10) and tick αM (28) in Fig. 7.
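The ~15-kDa size expected for the α2M-2 fragment follows from simple arithmetic on the two Met positions. The sketch below is illustrative only: it assumes an average residue mass of 110 Da (a common rule of thumb, not a value from this work), ignores any attached carbohydrate, and uses hypothetical function names.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not from the paper


def cnbr_fragment_length(prev_met: int, next_met: int) -> int:
    """Number of residues in the fragment released between two Met positions.

    CNBr cleaves on the C-terminal side of Met, so the fragment runs from
    residue prev_met + 1 through residue next_met inclusive.
    """
    if next_met <= prev_met:
        raise ValueError("next_met must lie downstream of prev_met")
    return next_met - prev_met


def approx_mass_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Met728 (present only in alpha2M-2) and the assumed Met864.
length = cnbr_fragment_length(728, 864)
print(length, round(approx_mass_kda(length), 1))  # 136 residues, about 15.0 kDa
```

With Met728 and Met864 as boundaries the fragment spans residues 729-864, i.e. 136 residues, which under this assumption comes to roughly 15 kDa, in line with the fragment size stated in the text.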
Isolation and Characterization of the Two Forms of Limulus α2M from Single Animal Hemolymph. Having established that Limulus α2M prepared from pooled hemolymph contains two related forms, we investigated whether each form was present in individual animals. α2M was prepared from two animals using the procedure of Ref. 20. In the final ion exchange step of each experiment (Fig. 6), two partially separated peaks appeared. Each peak represented pure and active Limulus α2M as judged from the presence of the 180-kDa subunit and the heat fragments of ~55 and 125 kDa upon reducing SDS-PAGE. Samples were removed from the flanking parts of the partially separated proteins, treated with CNBr, and separated by SDS-PAGE. Electroblotting and sequence analysis as above identified the material eluting at [NaCl] = 190 mM as α2M-2 (containing the 15-kDa bait region fragment), and the material eluting at [NaCl] = 220 mM as α2M-1 (containing the 18-kDa bait region fragment). α2M-2 had the same N-terminal sequence as α2M-1 (15 residues determined). Based on the heights of the two peaks in Fig. 6, α2M-1 and α2M-2 were present in an approximate 2:1 molar ratio in both animals investigated.

DISCUSSION
In this work we have completed the primary structure determination of Limulus α2M by determining its disulfide bridge pattern and localizing its sites of carbohydrate attachment. Furthermore, evidence for the occurrence of two forms of Limulus α2M (α2M-1 and α2M-2) has been obtained.
It was found that all seven potential N-linked glycosylation sites are occupied. As shown in the schematic comparison of the localization of disulfide bridges and glycosylation sites in Limulus and human α2M in Fig. 8, the only N-linked glycosylation site that is conserved in human α2M is Asn896. Even though Limulus α2M contains galactosamine (10), we could not identify any Ser or Thr residues containing carbohydrate. Instead we found that Asn80 carries a glycan containing both glucosamine and galactosamine. The presence of galactosamine-containing N-linked carbohydrate chains has been reported previously in several other proteins including pituitary glycoprotein hormones (29), human urinary kallidinogenase (30), human tissue factor pathway inhibitor (31), bovine component PP3 (32), snake venom batroxobin (33), and hemocyanin from the freshwater snail Lymnaea stagnalis (34), but not in other αMs.
Regarding the disulfide bridge pattern of Limulus α2M, we have confirmed the existence of nine disulfide bridges that are located in a similar position in human α2M as shown in Fig. 8. The remaining five Cys residues in Limulus α2M, which are not part of the thiol ester site, were shown to be engaged in three disulfide bridges. One of these bridges (Cys360-Cys381) is located in the N-terminal part of the protein. Another bridge connecting Cys1370 and Cys1434 is located in the region corresponding to the receptor-binding domain of the mammalian αMs. The location of a bridge at this position is compatible with the three-dimensional structure of the receptor-binding domain of bovine α2M, because the side chains of the equivalent residues in bovine α2M are located in close proximity to each other on β-strands 2 and 7, respectively (35). Furthermore, a disulfide bridge is located in a similar position in human C3 (36). The third bridge, which is unique to Limulus α2M, is an interchain disulfide bridge that connects Cys719 in one subunit with the same residue in the other subunit of the dimer. In a previous alignment of the bait region of Limulus α2M with the bait regions of other αMs, this Cys residue was placed at the N-terminal border of the bait region (10). However, this location is not in accordance with our determination of the disulfide bridge pattern, because we found that the bridge that defines the N-terminal boundary of the bait region is Cys657-Cys707 (Fig. 7). This implies that the bait region is 12 residues longer than previously thought (10) and that the above-mentioned interchain disulfide bridge is located in the bait region.
To investigate whether Limulus α2M contains interchain disulfide bridges other than the one mentioned above, Limulus α2M was partially reduced with MESA, and the freed thiol groups were radiolabeled with IAA. Because MESA is highly solvated, it preferentially reduces solvent-exposed disulfide bridges. As expected, it was found that the interchain bridge linking the two bait regions was easily reduced. However, the bridge Cys1362-Cys1475 was also reduced by MESA. Because this bridge is conserved in human α2M, is located in the region corresponding to the receptor-binding domain of the mammalian αMs, and, furthermore, is reduced only to an extent of ~10% compared with the bait region interchain bridge, we conclude that the bridge linking the bait regions is the only interchain disulfide bridge in Limulus α2M.
The location of an interchain disulfide bridge within the bait region of Limulus α2M shows that parts of the two bait regions of the Limulus α2M dimer are located in close proximity to each other. This observation is in line with studies on recombinant bait region variants of human α2M. By deleting parts of the C-terminal end of the bait region of human α2M, it was shown that the bait regions are involved in forming the interface between its non-covalently associated dimers (37). This was further examined by mutating single residues in the bait region to Cys residues. These cysteine-containing variants formed disulfide-linked tetramers, demonstrating that at least two bait regions are located close to each other at the interface between the non-covalently associated dimers of human α2M (38).
The two interchain bridges in human α2M are located in a completely different part of the primary structure than the single interchain bridge of Limulus α2M, namely in the N-terminal part. This difference in location of the interchain bridges in tetrameric human and dimeric Limulus α2M seems to be reflected in the functionality of the disulfide-linked dimers. Evidence suggests that the disulfide-linked dimers of

FIG. 7. Alignment of bait region sequences from invertebrate αMs. The bait region is defined as the peptide stretch located between residues 666-706 in human α2M (6). In Limulus α2M-1 these residues correspond to residues 707-757. The underlined peptide stretches differ between Limulus α2M-1 and -2.

FIG. 2. Gel chromatography of tryptic peptides from the ancillary digest on a Superdex 75 (10/30) column. The column was equilibrated and eluted at a flow rate of 0.5 ml/min with 10 mM sodium acetate, 100 mM NaCl, pH 4.5 (A), and 50% formic acid (B). The separations were monitored by recording the absorbance at 275 (A) or 280 nm (B), and fragments of interest were identified by SDS-PAGE, amino acid analysis, and sequence analysis. The pools indicated by horizontal bars in A and B were used to determine remaining glycosylation sites and disulfide bridges, respectively.
FIG. 3. RP-HPLC separation of digests aimed at determining the remaining disulfide bridges. A, an S. aureus proteinase digest of the pool from the Superdex 75 separation carried out in 50% formic acid (Fig. 2B) was separated on a 4 × 250-mm Nucleosil C18 column. The column was equilibrated with 5% solvent B (90% acetonitrile, 0.08% trifluoroacetic acid) and 95% solvent A (0.1% trifluoroacetic acid) and eluted with a gradient formed by solvent A and solvent B (dashed line). The separation was performed at 50 °C at a flow rate of 1 ml/min, and fractions of 0.5 ml were collected. Peptides were detected at 215 nm (solid line), and the amount of half-cystine in fractions having an absorbance >0.05 was determined (vertical bars). By MS and sequence analyses of half-cystine-containing fractions it was found that the fractions indicated by a horizontal bar contained a disulfide-bridged cluster involving Cys657, Cys707, and Cys719. B, the fractions shown by the horizontal bar in A were digested with chymotrypsin and separated on a 2 × 250-mm Nucleosil C18 column. The column was operated at 50 °C at a flow rate of 0.2 ml/min and eluted with a gradient formed from the same solvents as in A (dashed line). The separation was monitored at 215 nm (solid line), and 0.1-ml fractions were collected. Fractions 26-35 (bar) were analyzed by MS.

FIG. 8. Schematic comparison of the disulfide bridge pattern and sites of glycosylation between Limulus and human α2M. The location of the 11 intrachain bridges and the single interchain bridge in Limulus α2M is shown as well as the location of the 11 intrachain and 2 interchain bridges in human α2M. The positions of carbohydrate groups are shown by filled diamonds. The bait regions are indicated by horizontal bars and the thiol esters with an asterisk. The C-terminal receptor-binding domain of human α2M starts at the site shown with an arrow.

TABLE I
Summary of evidence for location of carbohydrate groups to Asn residues
The pools named with prefix A originate from the separation shown in Fig. 1A.

TABLE II
Summary of evidence for assignment of disulfide bridges in Limulus α2M

TABLE III
Major radiolabeled tryptic peptides isolated after partial reduction of methylamine-reacted Limulus α2M with MESA
The underlined residues differ from the sequence deduced from cDNA cloning. The Cys residue shown in boldface was radiolabeled.