Determination of the disulfide bonds within a B domain variant surface glycoprotein from Trypanosoma congolense.

The disulfide bonds within a variant surface glycoprotein from Trypanosoma congolense have been determined. Protein metabolically labeled with L-[35S]cysteine was digested with trypsin, and the radiolabeled peptides were separated by reversed-phase high performance liquid chromatography. Putative cystine-containing peptides were subdigested with other proteases and, after further purification, analyzed by amino acid sequencing and mass spectrometry. All eight cysteine residues of the protein, located within the N-terminal domain, are covalently linked. The four disulfide bonds are between cysteines 16/236, 171/193, 195/206, and 286/298. This is the first determination of disulfide bonds within a variant surface glycoprotein belonging to the B-type. As all eight cysteines of BENat 1.3 variant surface glycoprotein are positionally conserved, the cystine pattern of this protein can be regarded as a prototype of disulfide bonding within B-type variant surface glycoproteins. Although the cysteine residues of B-type variant surface glycoproteins are located at completely different positions in the protein chain compared with A-type variant surface glycoproteins, the positions of the disulfide bonds can easily be integrated into the A-type tertiary structure. This result implies that, despite their enormous amino acid sequence variability, variant surface glycoproteins, regardless of their subtype, can fold into a similar tertiary structure.

Bloodstream forms of African trypanosomes possess a large repertoire of surface antigen (VSG) genes (1). Only one glycoprotein is expressed at a time, and it forms the cell-surface glycoprotein coat (2,3). Due to rapid switching of the expression of VSG genes coding for antigenically different proteins, the parasites continuously change their antigenicity (antigenic variation) and are thus able to evade the immune response of the host (for recent reviews see Refs. 4 and 5). In the past, approximately 30 amino acid sequences of different VSGs from several trypanosome species have been determined. Although highly variable in their amino acid sequences, these proteins contain conserved structural elements, i.e. a two-domain structure with a 350-residue N-terminal domain and a 50-150-residue C-terminal domain (6). All domains, except the C-terminal domain of Trypanosoma congolense, have positionally conserved cysteine residues. According to the distribution of the cysteine residues, three N-terminal domain types (A, B, and C) and four C-terminal domain types (types 1-4) were defined (7). The majority of the sequenced Trypanosoma brucei VSGs have type A N-terminal domains; all of the T. congolense VSGs have N-terminal domains of type B. The type A domain has four positionally conserved cysteine residues, and the type B domain has eight (7,8). It has been shown in three VSGs that the pattern of disulfide linkages in the A-type domain is conserved (9,11). As revealed by x-ray crystallography, one of the disulfide bridges is responsible for interconnecting short peptide loops regarded to be important for the antigenicity of the molecules. The second disulfide bond interconnects two parallel α-helices forming the VSG stem region (10,11). Until now, nothing was known about the bonding of the cysteine residues in B-type VSGs.
A prediction of the bonding pattern on the basis of A-type disulfide bonding is not possible because the positions of the cysteine residues in the B-type peptide chain differ so greatly from those in the A-type domain. One objective of this work was to investigate whether or not the location of cystines within a B-type VSG is compatible with the A-type tertiary structure.
We have recently sequenced several B-type VSGs (8) and selected one of them, BENat 1.3, to determine its disulfide bonding pattern. BENat 1.3 VSG is 407 amino acids long and contains eight cysteine residues in the conserved positions of the eight-cysteine motif typical for B-type VSGs (7,8).
Trypanosomes-T. congolense clone BENat 1.3 was used for this study. BENat 1.3 is a cell clone derived from the third peak of parasitemia of a New Zealand rabbit chronically infected with BENat 1, the origin of which is described (12). The parasites were grown in mice and rats and harvested according to standard procedures (13).
In Vitro Labeling-Trypanosomes were washed three times with NaCl/Pi/glucose (44 mM NaCl, 60 mM sodium phosphate, 55 mM glucose, pH 8.0) and resuspended in cysteine-free Dulbecco's modified Eagle's medium at 7 × 10⁷ cells/ml. After a preincubation period (30 min, 37°C)
The nucleotide sequence reported in this paper has been submitted to the GenBank/EBI Data Bank with accession number X79401.
Isolation of Variant Surface Glycoprotein-Washed trypanosomes were subjected to dioxane lysis according to Ref. 14. VSG was purified from the supernatant by affinity chromatography using concanavalin A-Sepharose. Lectin-bound glycoprotein was electrophoretically desorbed and isoelectrically focused as described elsewhere (14). The glycoprotein was further purified by size exclusion chromatography with Bio-Gel P30 (Bio-Rad, München, Germany) and subsequently lyophilized.
Radiocarboxymethylation-For carboxymethylation of free thiol groups, 10 µg of unlabeled glycoprotein were dissolved (1 mg/ml) in helium-purged 0.1 M Tris/HCl, pH 8.5, containing 6 M guanidine hydrochloride. 3 µl of 5 mM [14C]iodoacetic acid (1.4 × 10⁵ Bq/nmol protein) were added, and the tube was sealed and incubated in the dark at 37°C with shaking. The sample was tested for incorporation of radioactivity by liquid scintillation counting after trichloroacetic acid precipitation and by fluorography after SDS-polyacrylamide gel electrophoresis.
Enzymatic Digests-For tryptic digestion the protein was dissolved (1 mg/ml) in helium-purged 0.1 M Tris/HCl buffer, pH 8.5, containing 10% (v/v) ACN and 3-10 µg of trypsin/100 µg of protein. Incubation was carried out at 25°C for 45 h, with the addition of fresh trypsin every 15 h.
Chymotryptic digestion of tryptic peptides was carried out in helium-purged 0.1 M Tris/HCl buffer, pH 7.8, with 10 mM CaCl2 and 5% (v/v) ACN. 15 µg of peptide were dissolved in digestion buffer (final concentration 1 mg/ml) and incubated with 1 µg of enzyme for 30 h at 25°C; fresh enzyme was added after 15 h.
Digestion of tryptic peptides with staphylococcal serine protease was carried out in helium-purged 25 mM sodium phosphate buffer, pH 7.8. 15 µg of peptide were dissolved in digestion buffer (final concentration 1 mg/ml) and incubated with 5 µg of enzyme for 67 h at 25°C.
Peptide Sequencing-Peptides, 20-50 pmol, were sequenced by automated N-terminal Edman degradation on an Applied Biosystems (Foster City, CA) pulsed liquid-phase sequencer, model 477A, under standard conditions. Phenylthiohydantoin derivatives of amino acids were identified by an on-line analyzer, model 120A (Applied Biosystems), with a repetitive yield of 92-95%.
Mass Spectrometry-Molecular masses of peptides were determined by MALDI-TOF-MS on a Finnigan MAT model Vision 2000 mass spectrometer (Finnigan MAT, Bremen, Germany). 1 µl of peptide solution (about 1 pmol) was mixed with 1 µl of matrix solution (10 g/l 2,5-dihydroxybenzoic acid in 0.1% trifluoroacetic acid, 30% (v/v) ACN) and allowed to air-dry. Ions were generated by short laser pulses (N2 laser, 337 nm), and positive ions were accelerated and detected in the reflector mode. Spectra were calibrated using angiotensin as an external standard.

RESULTS
Non-reduced BENat 1.3 VSG was denatured in urea and radiocarboxymethylated with [14C]iodoacetic acid. Liquid scintillation counting after trichloroacetic acid precipitation and fluorography after SDS-polyacrylamide gel electrophoresis showed no incorporation of radioactivity (data not shown). Thus all eight cysteine residues were found to be disulfide-bonded. The strategy employed to localize the disulfide bridges is based on N-terminal sequence analysis and MALDI-TOF-MS of cystine-containing peptides after proteolytic fragmentation of the glycoprotein.
Analysis of Tryptic Cystine Peptides-After tryptic digestion of L-[35S]cysteine-labeled VSG, the peptides were separated by rHPLC using the following five-step ACN (in 0.1% trifluoroacetic acid) gradient: 5% ACN for 5 min, 5-24% ACN from 5 to 15 min, 24% ACN for 5 min, 24-29% ACN from 20 to 80 min, and 29-40% ACN from 80 to 100 min. Five radioactive peaks were eluted at 16 min (24% ACN), 22.2 min (24.2% ACN), 27 min (24.6% ACN), 50.8 min (26.6% ACN), and 58.7 min (27.2% ACN). N-terminal sequences of cystine-containing peptides found in the five radioactive fractions are given in Table I. Peak 1 was found to contain equimolar amounts of the fragments Ile14-Arg17 and Asp234-Arg239. In one of the radioactive fractions obtained from an alternate purification of the tryptic digest in ammonium acetate (data not shown), these fragments were also found by sequence analysis to coelute, which proves that they are covalently bound through a disulfide bridge between cysteine residues 16 and 236.
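The ACN percentages quoted for the five radioactive peaks follow from the gradient program by linear interpolation. The following is a minimal sketch of that arithmetic, assuming an ideal gradient with no dwell volume or system delay; the names GRADIENT and acn_percent are hypothetical and introduced only for illustration.

```python
# Breakpoints of the five-step gradient as (time in min, % ACN),
# transcribed from the gradient described in the text.
GRADIENT = [(0, 5), (5, 5), (15, 24), (20, 24), (80, 29), (100, 40)]


def acn_percent(t_min):
    """Return the nominal % ACN at time t_min by linear interpolation.

    Assumption: the composition at the column equals the programmed
    composition (no dwell volume or delay is modeled).
    """
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time lies outside the gradient program")
    # Walk over consecutive breakpoint pairs and interpolate in the
    # first segment that contains t_min.
    for (t0, c0), (t1, c1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return c0 + (c1 - c0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    # Elution times of the five radioactive peaks reported in the text.
    for t in (16, 22.2, 27, 50.8, 58.7):
        print(f"{t:5.1f} min -> {acn_percent(t):.1f}% ACN")
```

Rounded to one decimal place, the interpolated values agree with the percentages reported for peaks 1 to 5.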
Peptides in peaks 2 and 3 were found to have the same N-terminal sequence, beginning with position 271, and were presumed to be resolved due to partially incomplete tryptic cleavage. According to the cDNA-derived sequence, they contain the cysteine residues at positions 286 and 298. Isolated peak 2 peptide was further cleaved with staphylococcal serine protease under conditions that allow cleavage after aspartic acid as well as glutamic acid (phosphate buffer, pH 7.8). Two fragments, 2.1 and 2.2, were resolved by rHPLC eluting at 31.9 and 32.9% ACN, respectively. By MALDI-TOF-MS two major mass peaks were detected in each of the two peaks: 3744.2 and 3503.8 in peak 2.1 and 3430.1 and 3189.2 in peak 2.2. These masses were incompatible with masses calculated for peptides expected from the cDNA-derived sequence. Therefore, the peptide in peak 2.1 was completely sequenced by Edman degradation. Two deviations from the cDNA-derived sequence were found as follows: lysine was identified instead of the predicted asparagine at position 290, and at position 293 asparagine was detected instead of the predicted serine. Taking into account these two deviations and the fact that the peptides were only partially cleaved C-terminal to Asp287 and Asp304, the mass  (Table I) showed that they both contain the cysteine residues 171, 193, 195, and 206 but that the peptide in peak 4 was incompletely cleaved. A further deviation from the sequence derived from the cDNA clone was found in the third fragment of the peptide in peak 5, at position 212, where valine was identified instead of the predicted isoleucine. The peptide was repurified by rHPLC and analyzed by MALDI-TOF-MS. The detected mass was 5084 Da, whereas the mass calculated for the disulfide-bonded triple peptide, based on the cDNA sequence and taking into account the substitution of valine for isoleucine, was 5099 Da. This discrepancy of 15 Da implies a further amino acid divergence.
The peptide was digested with chymotrypsin, and the resulting fragments were separated by rHPLC. Two completely digested subpeptides, 5.1 (eluting at 26% ACN) and 5.2 (eluting at 28.4% ACN), could be identified by mass spectrometry and N-terminal sequence analysis. As shown in Table III, subpeptide 5.1 contains amino acids 168-172 and 193-194, which proves that Cys171 is linked to Cys193. The mass detected for this subpeptide, 1040.2 Da, is compatible with the mass calculated from the cDNA-derived sequence. Subpeptide 5.2 contains amino acids 195-197 and 205-217, proving that Cys195 is linked to Cys206. The experimentally determined mass for this subpeptide is 1652 Da, which is 16 Da less than the calculated mass based on the cDNA sequence. Since all amino acids in this fragment except Glu216 and Trp217 were confirmed by Edman degradation (Tables I and III Thus, all four disulfide bonds of BENat 1.3 VSG from T. congolense are identified. A schematic representation of the disulfide-bonding pattern is shown in Fig. 1.
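Mass discrepancies of this size can be compared with the shifts expected from single residue substitutions. The sketch below is an illustration rather than the procedure used in this work: it takes standard monoisotopic residue masses (an assumption; the instrument may have reported average masses, which changes the differences only slightly) and computes the shift for the three exchanges identified by sequencing and for a Glu to Asp exchange. The names RESIDUE_MASS and substitution_shift are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus water) for the
# residues involved in the exchanges discussed in the text.
RESIDUE_MASS = {
    "S": 87.03203,   # Ser
    "V": 99.06841,   # Val
    "I": 113.08406,  # Ile
    "N": 114.04293,  # Asn
    "D": 115.02694,  # Asp
    "K": 128.09496,  # Lys
    "E": 129.04259,  # Glu
}


def substitution_shift(predicted, observed):
    """Mass change in Da when residue `predicted` is replaced by `observed`."""
    return RESIDUE_MASS[observed] - RESIDUE_MASS[predicted]


if __name__ == "__main__":
    # (position, predicted residue, observed or candidate residue)
    exchanges = [(212, "I", "V"), (290, "N", "K"), (293, "S", "N"), (216, "E", "D")]
    for pos, old, new in exchanges:
        print(f"{old}{pos}{new}: {substitution_shift(old, new):+.2f} Da")
```

Under these assumptions, the Ile to Val and Glu to Asp exchanges each lower the mass by about 14 Da, Asn to Lys raises it by about 14 Da, and Ser to Asn raises it by about 27 Da.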

DISCUSSION
The eight cysteine residues of BENat 1.3-VSG form four disulfide bonds, i.e. Cys16/Cys236, Cys171/Cys193, Cys195/Cys206, and Cys286/Cys298. All cysteine residues are positionally conserved (8). Therefore, it can be assumed that the disulfide bonding pattern of BENat 1.3-VSG is the prototype of disulfide bonds in VSG B domains. This, however, remains to be proven experimentally. Contrary to T. brucei, VSGs from T. congolense have no cysteines in their C-terminal domains (7,8). Therefore, the typical cysteine-based classification of C-terminal domains is not possible in these VSGs.
Interestingly, the B domain disulfide-bond pattern is compatible with the gross A domain tertiary structure. The final disulfide bond in B domains connects cysteines 286 and 298. In A domains this region corresponds to sequence segments characterized by short α-helices interconnected by several peptide loops near the foot of the VSG stem region. As seen in Fig. 2, no structural change has to occur in this region to form a bond between Cys286 and Cys298.
Taken together, the disulfide-bonding pattern of B domains can be easily integrated into the A domain tertiary structure. The major difference probably resides within the surface-oriented loop regions. It is exactly this region that also contains a substantial part of the tertiary structure differences between the two A domain VSGs ILTat 1.24 and MITat 1.2 (11). The tip of the VSG molecules might be used to produce a variety of structures, differing not only between members of the same domain type but also between those of different domain types. The variable surface structures might reflect different surface properties of bloodstream trypanosomes. It is known that, contrary to T. brucei, bloodstream T. congolense can be agglutinated by lectins (15,16) and are able to bind to erythrocytes (17) and endothelial cells (18-20). The latter effect is restricted to the flagellar region and is certainly not produced by VSGs (19). The surface structures of VSGs, however, might shape the cell surface in such a way that adhesion molecules within the surface coat can interact specifically with the corresponding cells while antibody molecules are denied access to these common and invariable surface antigens. The long known phenomenon of autoagglutination of bloodstream forms of T. congolense might be produced by the B domain-specific surface structures as well.
The primary amino acid sequences next to the cysteines 193/195 are rather strongly conserved between variants (8). Even if this part of the amino acid chain is located near the cell surface, it does not form antigenic determinants, which would be invariable and would counteract antigenic variation. The conserved amino acids are probably not exposed but are necessary to form contacts with neighboring surface loops in order to maintain a specific variant-independent surface structure. Anti-VSG sera produced by immunizing animals with isolated VSG fail to react with living trypanosomes (21). By epitope mapping with polyclonal anti-VSG sera, we know that VSG molecules generally are highly immunogenic, and many sequence-type antigenic determinants are distributed along the chain (data not shown). However, none of these determinants is accessible at the cell surface. It seems that cell-surface antigenic determinants are formed by short amino acid stretches of different but closely neighboring VSG molecules of the surface coat. Disrupting the coat (by isolation of VSG) destroys these determinants.
This would explain both the failure of anti-VSG antibodies raised by immunization with isolated VSG to react with living cells and the fact that surface-reacting antibodies can only be obtained by immunizing animals with living or X-irradiated trypanosomes (21).
In T. brucei, non-VSG cell-surface proteins have been identified that show cysteine patterns similar to A domain VSGs (22), i.e. ISG65 and ISG75 (invariant surface glycoprotein 65 and 75) (23), ESAG6 and ESAG7 (expression-site-associated gene 6 and 7 products) (24), and human serum resistance associated protein (25). A structural similarity to VSGs is considered to be a prerequisite for becoming embedded in the surface coat (22). This suggests that the tertiary structure of VSGs (the VSG fold) is not limited to VSGs and is likely also present in invariant trypanosomal surface proteins. To date, no information is available on invariant cell-surface proteins of T. congolense (surface coat with B domain VSGs). Because the surface properties of the two trypanosome species differ (see above), the surface coat of T. congolense might contain invariant surface proteins other than those of T. brucei. To shield them from immunological attack by host antibodies, slight structural modifications of the outermost VSG parts might therefore be necessary.
During the course of this investigation, we became aware of discrepancies between the cDNA sequence data of BENat 1.3 VSG published earlier (8) and the actual amino acid sequences. At least four differences were observed. More differences might exist because only parts of the protein were sequenced. The cDNA-predicted isoleucine at position 212 was replaced by a valine, the asparagine at 290 by a lysine, and the serine at 293 by an asparagine. The fourth discrepancy of 16 Da occurs in tryptic peptide 5 but was not resolved by sequencing (see "Results"). The three identified exchanges can be explained by point mutations. Thus, the isoleucine codon AUC could be converted to the valine codon GUC, the asparagine codon AAU to the lysine codon AAG, and the serine codon AGC to the asparagine codon AAC. Similarly, the fourth discrepancy could be explained by a conversion of Glu216 (codon GAA) to Asp (codon GAU or GAC) leading to a mass reduction of 14 Da. The autoradiographs of the sequencing gels of the original BENat 1.3 cDNA were still available, and a re-evaluation confirmed the earlier published data. Also available was the original cDNA, which was resequenced, and again, no deviations from the published sequence could be found (data not shown). Thus, the published BENat 1.3 cDNA sequence was clearly correct. It is known that trypanosomes have a high potential to induce point mutations. This has been demonstrated previously at the 3′ ends of VSG genes (26). It has, to a lesser extent, also been found in other regions of the VSG genes, and it has recently even been detected in a foreign gene introduced into T. brucei (27). The mutation rate of T. brucei has been calculated to be 12 times higher than in Saccharomyces cerevisiae and Neurospora crassa, and it has been suggested that this phenomenon might be related to the requirement for diversity among the VSG genes on which the trypanosome depends for survival (27).
It can therefore be assumed that a trypanosome clone derived from a single cell will not remain homogeneous but will be continually changed by point mutations in its gene(s). The diverging sequence data of the cloned BENat 1.3 cDNA and the present protein can be explained by assuming that originally one (a minor one) of several existing VSG cDNAs was sequenced, whereas the amino acid sequences of the current VSG reveal the most prominent species. Interestingly, the observed amino acid exchanges occur in regions of the molecule that, at least with respect to the A domain structure, are not exposed at the cell surface. Therefore, the antigenicity of the trypanosomes did not change. If point mutations occurred in a region of the VSG gene coding for surface-exposed parts of the protein, a change in the antigenicity of the VSG (and of the trypanosome) would be possible. Such an effect has been observed in Trypanosoma equiperdum (28). Antigenicity changes by point mutations might occur more frequently than currently believed and could well contribute significantly to the antigenic variability potential of African trypanosomes.
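The point-mutation explanation for the amino acid exchanges can be checked mechanically: each proposed codon pair should differ at exactly one nucleotide and translate to the expected residues. The following is a minimal sketch, assuming the standard genetic code applies to these nuclear genes; the codon pairs are those named in the Discussion, and the helper names (CODON_TABLE, translate, n_differences) are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# at each position (UUU, UUC, UUA, UUG, UCU, ...).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon):
    """Return the one-letter amino acid (or '*' for stop) for an RNA codon."""
    return CODON_TABLE[codon]


def n_differences(codon_a, codon_b):
    """Count the nucleotide positions at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))


if __name__ == "__main__":
    # Codon pairs proposed in the text: (cDNA codon, mutated codon).
    proposed = [("AUC", "GUC"), ("AAU", "AAG"), ("AGC", "AAC"),
                ("GAA", "GAU"), ("GAA", "GAC")]
    for old, new in proposed:
        print(f"{old} ({translate(old)}) -> {new} ({translate(new)}): "
              f"{n_differences(old, new)} nucleotide change(s)")
```

Each of the proposed pairs differs at a single position, which is consistent with the interpretation of the exchanges as point mutations.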