J. Biol. Chem., Vol. 280, Issue 8, 7228-7235, February 25, 2005





From the Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, United Kingdom and the ¶Department of Biochemistry, Trinity College, Dublin 2, Ireland
Received for publication, September 20, 2004, and in revised form, November 22, 2004.
| ABSTRACT |
|---|
α-helix followed by a single turn of 3₁₀-helix and connected by a short loop to a small anti-parallel β-sheet and then a longer α-helix at the C terminus. This compact domain is flanked by two unstructured regions. The structured part of the domain contains 42 residues, and the core comprises 2 disulfide bonds and 2 hydrophobic residues. These cysteines and hydrophobic residues are conserved in other VSGs, and we have modeled the structures of two further VSG C-terminal domains using the structure of MITat1.2. The models suggest that the overall structure of the core is conserved in the different VSGs but that the C-terminal α-helix is of variable length and depends on the presence of charged residues. The results provided evidence for a conserved tertiary structure for all the type 2 VSG C-terminal domains, indicated that VSG dimers form through interactions between N-terminal domains, and showed that the selection pressure for sequence variation within a conserved tertiary structure acts on the whole of the VSG molecule.

| INTRODUCTION |
|---|
In Trypanosoma brucei and the closely related species T. evansi and T. equiperdum, each VSG monomer has a molecular mass of 45–55 kDa and is made up of two or three domains (2, 3). The VSG N-terminal domain has 350–400 residues, which are followed by one or two C-terminal domains of 30–70 residues each. Less is known about T. congolense and T. vivax VSGs; they are somewhat smaller, 45–50 kDa, and have a single domain that is similar to the T. brucei VSG N-terminal domain (4). VSGs are dimeric (5, 6) and are attached to the plasma membrane via a covalent linkage from the C-terminal carboxyl group to a glycosylphosphatidylinositol anchor (7).
VSGs that are successively expressed by an infecting population are antigenically distinct as a result of extreme sequence variation (3, 8). The structures of the N-terminal domains from two different VSGs (MITat1.2 and ILTat1.24) have been solved and are remarkably similar despite little sequence identity (9, 10). It is believed that the conservation of structure is necessary for the protective function of the VSG, although this has not been proven. The N-terminal domain of VSG MITat1.2 (9) is elongated and is composed of seven α-helices, which comprise about 47% of the structure. Two major α-helices, each of ~70 Å, form a coiled coil that acts as a scaffold for the remaining smaller helices and loops. The structure of VSG ILTat1.24 (10) is homologous to that of MITat1.2. Although the structural identity is 60%, the sequences are only 16% identical over the same residues. This striking conservation of structure is likely to be present in other VSGs (4, 10–12) as well as some other cell surface proteins (13, 14).
The function of the VSG C-terminal domain remains obscure; it is possible that it elongates the VSG, resulting in a thicker surface coat, but this function is clearly not essential as neither T. congolense nor T. vivax VSGs contains an obvious equivalent of this domain. There is no obvious similarity between the VSG C-terminal domain sequence and any other protein, with the possible exception of an extracellular domain present in hexose transporters from trypanosomes and related species (13). The C-terminal domains from T. brucei-like VSGs are related to one another but can be split into four types based on sequence homology (3). In the structure of VSG MITat1.2, the N-terminal domain ends with an α-helix (helix S), which is followed by a single type 2 C-terminal domain in the native VSG. Type 2 C-terminal domains are characterized by 4 conserved cysteine residues. A similar arrangement of cysteines is present in type 1 C-terminal domains, where they are disulfide-bonded in a 1 to 3 and 2 to 4 pattern (15). When different type 2 C-terminal domains are compared, there is little sequence conservation other than the cysteines, a conserved hydrophobic (often aromatic) residue after the third cysteine, and an N-linked glycosylation site very close to the C terminus (Fig. 1).
| EXPERIMENTAL PROCEDURES |
|---|
Analytical Ultracentrifugation. Equilibrium sedimentation analysis was performed in a Beckman XL-I ultracentrifuge. Three two-sector cells were filled with 100 µl of 29, 30, or 37 µM M1.2C in 50 mM sodium phosphate, pH 6.0, and 150 mM NaCl; 110 µl of the same buffer was used on the reference side of each cell. The cells were placed in an An50-Ti rotor. For each sample, the equilibrium concentration gradients were measured at angular velocities of 25,000, 33,000, and 40,000 rpm. Concentrations were measured as the absorbance at 280 nm (and using the Rayleigh interference optics). The measured data were analyzed by global fitting using the program WinNonlin (19). The nine data sets were fitted simultaneously, with the molecular mass as a global parameter and with two local parameters for each data set (the concentration at the reference position and the baseline offset). The solvent density used was 1.01099 g/ml at 5 °C, and the protein partial specific volume at 5 °C was 0.6998 ml/g. Both values were calculated using Sednterp (T. Laue, University of New Hampshire). An extinction coefficient for the protein of 9098 M⁻¹ cm⁻¹ was obtained using amino acid analysis.
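As an illustration of the kind of model that underlies such a fit, the following minimal Python sketch evaluates the standard equilibrium distribution for a single ideal species, c(r) = c0 exp[M(1 − v̄ρ)ω²(r² − r0²)/(2RT)] + baseline, using the solvent density and partial specific volume quoted above. It is not the WinNonlin procedure itself; the function names, the reference radius, and the radii in the usage lines are assumptions chosen purely for illustration.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def buoyancy_factor(vbar_ml_per_g, rho_g_per_ml):
    """Return the dimensionless buoyancy term (1 - vbar * rho)."""
    return 1.0 - vbar_ml_per_g * rho_g_per_ml


def angular_velocity(rpm):
    """Convert rotor speed in rpm to angular velocity in rad/s."""
    return rpm * 2.0 * math.pi / 60.0


def equilibrium_signal(r_cm, r0_cm, c0, mass_g_per_mol, rpm,
                       temp_k=278.15, vbar=0.6998, rho=1.01099,
                       baseline=0.0):
    """Signal at radius r for one ideal species at sedimentation equilibrium.

    c0 is the signal at the reference radius r0; baseline is a constant
    offset. The defaults are the 5 degC values quoted in the text.
    """
    mass_kg_per_mol = mass_g_per_mol / 1000.0
    omega = angular_velocity(rpm)
    r_m, r0_m = r_cm / 100.0, r0_cm / 100.0
    exponent = (mass_kg_per_mol * buoyancy_factor(vbar, rho) * omega ** 2
                * (r_m ** 2 - r0_m ** 2) / (2.0 * R_GAS * temp_k))
    return c0 * math.exp(exponent) + baseline


# Hypothetical radii (cm) and reference signal, for illustration only.
for speed in (25000, 33000, 40000):
    print(speed, equilibrium_signal(6.1, 6.0, 0.3, 8795.0, speed))
```

In a global fit of this kind, the molecular mass would be shared across all data sets, while `c0` and `baseline` would be fitted separately for each one.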
NMR Spectroscopy. NMR experiments were performed using [¹⁵N]- and [¹³C,¹⁵N]-labeled samples of 0.7–1.0 mM M1.2C in 50 mM sodium phosphate, pH 6.5, 150 mM sodium chloride, 10% D₂O. Spectra were recorded at 25 °C on Bruker DRX 500 and DRX 600 spectrometers. The experiments recorded were ¹⁵N HSQC, ¹⁵N-separated NOESY (150 ms mixing time), ¹⁵N-separated TOCSY (67.9 ms DIPSI mixing), HNCA, HN(CO)CA, HNCACB, CBCA(CO)NH, HCCH-TOCSY, HBHA(CBCACO)NH, (H)CC(CO)NH-TOCSY (14.3 ms mixing time), ¹³C HSQC, and ¹³C-separated NOESY (150 ms mixing time). TALOS (20) was used to estimate backbone torsion angles based on the CA, CO, CB, N, and HA chemical shifts. CLEANEX (21) experiments were recorded to analyze solvent exchange rates and deduce hydrogen bonding information. Backbone dynamics were assessed using a heteronuclear NOE experiment (22). Spectra were processed using the program AZARA (W. Boucher, University of Cambridge) and analyzed with ANSIG (23).
Structure Calculations. Structure calculations were performed using ARIA (24) interfaced to CNS 1.0 (25). The structures were calculated with 90 ps of high temperature dynamics and 30 ps of cooling from 10,000 to 1,000 K in torsion angle space. This was followed by 24 ps of cooling from 1,000 to 50 K using dynamics in Cartesian space. Figs. 3, 5, and 6 were produced using Molscript (26) and Raster3D (27).
Docking. The whole VSG structure was docked using the program 3D-Dock (30). The N-terminal domain dimer from MITat1.2 (Protein Data Bank code 1VSG) was docked with the core structure of the C-terminal domain (residues 375–416) on the basis of surface complementarity, and the resulting structures were then ranked on the basis of an empirical pair potential matrix (30).
| RESULTS |
|---|
Two-dimensional ¹H TOCSY and NOESY experiments were recorded on unlabeled material that had been expressed in E. coli AD494(DE3). Some of the resonances in these spectra were accompanied by a second, weaker peak of about one-tenth the intensity. We concluded that these shadow cross-peaks were due to an alternative disulfide-bonded form of M1.2C because: (i) the electrospray ion trap mass spectrometry data indicated a homogeneous polypeptide of the expected mass, (ii) no free sulfhydryl groups were detected, and (iii) M1.2C appeared homogeneous by reducing SDS-PAGE but migrated as two close but distinct bands on non-reducing SDS-PAGE at a similar relative intensity to the two sets of resonances observed in the NMR spectra (data not shown). In subsequent experiments, the ratio of the two forms varied with the E. coli host strain, growth medium, and temperature of expression. The isotopically labeled samples were produced in E. coli BL21trxB(DE3), and it was necessary to purify the major form using reverse-phase HPLC. After HPLC purification, the intensity ratio of the resonances in a ¹⁵N HSQC spectrum was 11:1.
M1.2C Is a Monomer in Solution. VSGs are dimeric, and the N-terminal domain contains a large dimerization interface that buries ~8,000 Å² (9). Equilibrium sedimentation analysis was used to determine whether the C-terminal domain is also a dimer (Fig. 2). The best fit of the model to the data gave a molecular mass of 8795 g/mol. This agrees closely with the mass of 8730.4 g/mol calculated from the amino acid composition, showing that M1.2C is a monomer in solution under the conditions used. The highest concentration used was 37 µM, so the Kd for any dimer formation would be >0.5 mM.
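To make the relationship between a dissociation constant and the apparent mass concrete, the following Python sketch solves a simple monomer-dimer equilibrium (Kd = [M]²/[D], with total monomer concentration [M] + 2[D]) and reports the mass fraction present as dimer and the resulting weight-average mass. It is a minimal illustration under that assumed two-state model and does not reproduce how the Kd bound above was derived; the function names and the Kd values scanned in the usage lines are hypothetical.

```python
import math


def free_monomer(total_molar, kd_molar):
    """Free monomer concentration m solving m + 2*m**2/Kd = total."""
    x = 8.0 * total_molar / kd_molar
    return kd_molar * (math.sqrt(1.0 + x) - 1.0) / 4.0


def dimer_mass_fraction(total_molar, kd_molar):
    """Fraction of all polypeptide chains that are present in dimers."""
    return 1.0 - free_monomer(total_molar, kd_molar) / total_molar


def weight_average_mass(monomer_mass, total_molar, kd_molar):
    """Weight-average molar mass of the monomer/dimer mixture."""
    f_dimer = dimer_mass_fraction(total_molar, kd_molar)
    return monomer_mass * ((1.0 - f_dimer) + 2.0 * f_dimer)


CALCULATED_MASS = 8730.4   # g/mol, from the amino acid composition
FITTED_MASS = 8795.0       # g/mol, from the global fit
TOTAL = 37e-6              # M, highest loading concentration used

# Relative difference between fitted and calculated monomer mass.
print((FITTED_MASS - CALCULATED_MASS) / CALCULATED_MASS)

# Hypothetical Kd values (M), scanned only to show the trend.
for kd in (1e-4, 5e-4, 5e-3):
    print(kd, weight_average_mass(CALCULATED_MASS, TOTAL, kd))
```

The fitted mass exceeds the calculated monomer mass by less than 1%, whereas a full dimer would give twice the monomer mass; the scan shows how the expected weight-average mass falls back toward the monomer value as the assumed Kd increases.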
The NMR data defined a structured core with unstructured residues at both the N and C termini. The structured part of M1.2C contained 42 residues from Ala-375 to Asp-416 (Fig. 3 and Table I). Each residue in this core region has a positive value for the heteronuclear NOE (Fig. 4). Outside the structured core, no long range distance restraints were observed for residues from Glu-359 to Ala-374 or from Glu-417 to Ser-433. The heteronuclear NOE values for these residues were less than 0.1 (Fig. 4), indicating that they are disordered in solution.
of Cys-381 and Cys-393. This translates to a Cβ–Cβ interatomic distance of 3.497 Å in the structure closest to the mean, a value close to the canonical distance for two cysteines linked by a disulfide bond (31). No NOEs were observed between Cys-389 and Cys-404; their absence may be due to conformational exchange that leads to line broadening, as the backbone amide resonance of Cys-404 could not be observed in any NMR spectra. Interestingly, however, Cys-389 NH is observed and its heteronuclear NOE is typical of a structured residue. As there are no free thiols in the monomeric M1.2C and the relative molecular mass corresponds to the expected value for the fully disulfide-bonded form, Cys-389 and Cys-404 must also be forming a disulfide bond. Thus, M1.2C has a disulfide bonding pattern of 1 to 3 and 2 to 4, the same as that present in VSG type 1 C-terminal domains (15).
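The "1 to 3 and 2 to 4" notation can be made explicit with a short Python sketch that ranks the cysteines by sequence position and rewrites each disulfide as a pair of ranks. The helper name `connectivity_pattern` is an assumption introduced here for illustration; the residue numbers are those given in the text.

```python
def connectivity_pattern(disulfides):
    """Express disulfide pairs by the sequence rank of each cysteine.

    disulfides is a list of (residue_number, residue_number) pairs.
    Cysteines are ranked 1, 2, 3, ... in order of sequence position.
    """
    cysteines = sorted({res for pair in disulfides for res in pair})
    rank = {res: index + 1 for index, res in enumerate(cysteines)}
    return sorted(tuple(sorted((rank[a], rank[b]))) for a, b in disulfides)


# Disulfide bonds of M1.2C as assigned in the text.
M12C_DISULFIDES = [(381, 393), (389, 404)]

print(connectivity_pattern(M12C_DISULFIDES))  # [(1, 3), (2, 4)]
```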
The secondary structure of M1.2C (Fig. 2) begins with a short α-helix between residues Glu-378 and Lys-383. This is followed by a single turn of 3₁₀-helix between residues Gln-386 and Glu-388, which is connected by a short loop to the first strand of the β-sheet from Cys-393 to His-396. The two short strands of the β-sheet result in a typical network of interstrand NOEs and hydrogen bonds. The second strand of the β-sheet runs from Lys-403 to Leu-406 and leads to a longer α-helix stretching from Lys-408 to Asp-416. As well as the usual intrahelix NOE pattern, this terminal α-helix has a limited number of long range NOEs to the rest of the core region, in agreement with the lower heteronuclear NOE values for this region (Fig. 4).
The loop, Asn-397 to Lys-402, connecting the two β-strands is not well defined (at the top of the structure in Fig. 3), with the heteronuclear NOE values ranging between 0.3 and 0.6, lower than for most of the structured core (Fig. 4). The line widths of the resonances corresponding to residues within this loop were also larger than the average line width in the core, implying that there is motion on the millisecond time scale within this region.
The heteronuclear NOE values for the C-terminal α-helix (Lys-408 to Asp-416) are positive but decrease toward the terminus, suggesting that there is an increase in the amplitude of helix motion. The majority of residues in the α-helix are charged (Fig. 1), and it appears to be stabilized by a number of salt bridges (Fig. 5) both within the helix and between the helix and the rest of the domain. In the middle of the helix, Lys-412 makes a salt bridge with Asp-416 and Lys-413 is sandwiched between Glu-409 and Glu-417. The spacing between these residues conforms to the i, i+4 pattern that has been observed to stabilize isolated α-helices (32). Two sets of ionic interactions between the C-terminal α-helix and other residues in the structured core pin the N terminus of the helix to the rest of the core: Lys-394 in the first β-strand interacts with Asp-407 and Glu-410, whereas Glu-378 in the first α-helix interacts with Lys-408 (Fig. 5).
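The i, i+4 spacing of the intrahelical salt bridges can be checked directly from the residue numbers. The Python sketch below does this for the three pairs named above; the helper names are assumptions introduced for illustration, and the charge table covers only the residue types that appear in these pairs.

```python
# Intrahelical salt bridges in the C-terminal helix, as named in the text.
SALT_BRIDGES = [
    ("Lys-412", "Asp-416"),
    ("Lys-413", "Glu-409"),
    ("Lys-413", "Glu-417"),
]

# Formal side-chain charge near neutral pH for the residue types used here.
CHARGE = {"Lys": +1, "Asp": -1, "Glu": -1}


def parse(label):
    """Split a label such as 'Lys-412' into ('Lys', 412)."""
    name, number = label.split("-")
    return name, int(number)


def spacing(pair):
    """Sequence separation between the two residues of a pair."""
    return abs(parse(pair[0])[1] - parse(pair[1])[1])


def is_opposite_charge(pair):
    """True when the two side chains carry charges of opposite sign."""
    return CHARGE[parse(pair[0])[0]] * CHARGE[parse(pair[1])[0]] < 0


for pair in SALT_BRIDGES:
    print(pair, spacing(pair), is_opposite_charge(pair))
```

All three pairs are separated by four residues and combine a basic with an acidic side chain, consistent with the pattern described above.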
The 4 cysteines in M1.2C lie buried in the core of the structure (Fig. 5). The remainder of the core is composed of 2 hydrophobic residues that flank the disulfide bonds, Trp-395 and Leu-406. These residues, as well as the cysteines, are conserved in many VSG type 2 C-terminal domains (Fig. 1); in other type 2 C-terminal domains hydrophobic residues are usually found at an equivalent location in the sequence. Most of the remainder of the domain is made up of small polar and charged residues, with the exception of Pro-392, which also makes hydrophobic contacts with Ala-411 and Val-414 in the C-terminal helix. None of these 3 residues is conserved in type 2 C-terminal domains.
The importance of the cysteines for the structural integrity of M1.2C was tested by making mutants in which the cysteines were changed to alanines. Three mutants were made: (i) C381A and C393A, (ii) C389A and C404A, and (iii) all four cysteines mutated to alanine. The proteins were expressed and purified, and preliminary ¹H NMR spectra were collected. The spectra of all three mutants showed considerably less dispersion of resonances (data not shown), indicating that none of them retains the structure of wild-type M1.2C.
Comparison with Other Structures. The structure of M1.2C does not bear direct homology to any known structure in the Protein Data Bank, according to a search performed using the Dali server (33). Comparison with other small, disulfide-rich structures suggests that the C-terminal domain is similar to Knottins and Trefoil fold proteins. The Knottins are a diverse group of proteins with a common structural motif comprising a β-hairpin with a cysteine on each side that forms disulfide bonds with cysteines outside the hairpin (34). In Knottins, however, the cysteine residues in the hairpin are usually directly opposite each other, whereas in M1.2C the equivalent cysteines (Cys-393 and Cys-404) are staggered (Figs. 3 and 5). A similar staggered arrangement otherwise only occurs in the Trefoil fold (for example Protein Data Bank entries 1e9t and 1ps2). Although the Trefoil fold contains three disulfide bonds, it appears to be the nearest, but remote, structural neighbor of M1.2C.
Modeling of the Structures of Other VSG Type 2 C-terminal Domains. The structure of M1.2C was used to model the VSG type 2 C-terminal domains from VSGs MITat1.1 and ILTat1.21 (Fig. 6). These sequences were selected as examples of C-terminal domains that are increasingly divergent from the sequence of MITat1.2 (Fig. 1). The C-terminal domain of MITat1.1 has a higher degree of identity to M1.2C than ILTat1.21, which is among the most divergent from M1.2C. The divergence is most apparent in the region of ILTat1.21 corresponding to the second α-helix in M1.2C; only 2 of the 9 residues are charged in ILTat1.21 compared with the 6 in M1.2C. It is not clear whether this region of ILTat1.21 will form an α-helix as is found in M1.2C. Both of the domains modeled have the 2 hydrophobic residues that stabilize each side of the core, performing a function equivalent to Trp-395 and Leu-406 in MITat1.2 (Figs. 1 and 5): Phe-407 and Leu-416 in MITat1.1, and Tyr-397 and Pro-406 in ILTat1.21. The models were of good stereochemical quality, as determined by PROCHECK (data not shown).
In the model of MITat1.1, the C-terminal helix is likely to be stabilized in a similar way to that of M1.2C (Fig. 6). In several places along the helix, the charges are reversed with respect to those in M1.2C, and in these cases the charge of the interacting residue is also reversed. For example, in MITat1.2, Lys-412 and Asp-416 are close together on the surface of the helix, whereas in MITat1.1 these residues are replaced by Glu-422 and Lys-426, respectively. The model predicts that the C-terminal helix in MITat1.1 will be stabilized by interactions between Lys-423 and Asp-419/Glu-427, between Lys-417 and Asp-419, and between Glu-384 and Lys-418 (Fig. 6b). There are also two hydrophobic side chains within this helix in MITat1.1, Val-420 and Leu-424, which are on the same side of the helix. These residues are stabilized by a hydrophobic interaction with Leu-400 (Fig. 6b) within the loop before the first β-strand, which is longer in MITat1.1 than MITat1.2, so that Leu-400 corresponds to a gap in the alignment (Fig. 1). These considerations suggest that MITat1.1 can form the same structured core as MITat1.2. The C-terminal helix can be stabilized by ion pairs in a similar way, with extra stabilization coming from hydrophobic interactions.
In contrast, in ILTat1.21 the C-terminal helix has almost no charged residues, although those that are present can make stabilizing interactions (Fig. 6c). Thus, Lys-408 and Glu-411 could interact, as could Glu-396 and Lys-407. The remainder of this helix is made up of small, uncharged residues. The charged residues that are present are at the N terminus of the helix, so it is likely that ILTat1.21 will form a shorter helix here, of a single turn. The interaction between Lys-408 and Glu-411 would favor 3₁₀-helix formation as this would bring these side chains into close proximity.
Model of a Complete VSG Polypeptide. The program 3D-Dock (30) was used to produce a model of the complete VSG polypeptide (Fig. 7). This model was selected by filtering the output for (i) the surface complementarity and an empirical likelihood that certain residues interact, (ii) the position of the C-terminal domain at the membrane-proximal end of the N-terminal domain, (iii) a structure that was consistent with the envelope determined for the C-terminal domain of VSG ILTat1.24 (17), and (iv) a structure that had the C terminus pointing down toward the location of the plasma membrane and the glycosylphosphatidylinositol anchor. The model shown was the highest ranked that met these criteria.
| DISCUSSION |
|---|
Here, we have presented the first structure of a VSG C-terminal domain. The structure is novel and has at its core a pair of disulfide bonds. In the absence of either pair of cysteines, the domain does not fold correctly, and thus both disulfide bonds are required to form the structure. The requirement for both disulfide bonds is consistent with the absolute conservation of 4 cysteines in all other VSG type 2 C-terminal domains. The isolated domain is a monomer in solution, so the native VSG dimerizes through interactions between the N-terminal domains of the monomers. However, it is not possible to rule out interactions between the C-terminal domains at the exceptionally high concentrations of VSG present on the trypanosome surface.
The VSG C-terminal domain comprises a compact core of 42 residues, Ala-375 to Asp-416, flanked by two unstructured regions. To the C-terminal side of this core, the unstructured residues, Glu-417 to Ser-433 (Fig. 1), are polar and in native MITat1.2 contain an N-linked glycan on Asn-429 and the glycosylphosphatidylinositol anchor oligosaccharide on the carboxyl group of Ser-433. It is possible that the absence of the carbohydrate in the recombinant polypeptide used for the structural studies resulted in destabilization of the structure in this region. However, it is more likely that in the native VSG this part of the protein is flexible, as the N-linked oligosaccharide can be either an oligomannose containing 7–9 mannose residues or a branched poly-N-acetyllactosamine (35). The diversity in the oligosaccharide suggests that this region of the protein has to be flexible to accommodate a range of sugar moieties. To the N-terminal side of the core, the unstructured residues, Glu-359 to Ala-374, include the protease-sensitive "hinge" between the two domains (2, 15). The construct used for the structural studies contained a 4-residue overlap, Glu-359 to Thr-362, with the structured VSG N-terminal domain (Ala-1 to Thr-362) (9). The objective of the overlap was to at least define the limits of the unstructured hinge and possibly to overlap the two structures to produce a model of the structure of a whole VSG. The maximum limits of the unstructured hinge are the 15 residues Gln-363 to Thr-377, as Thr-362 is structured in the N-terminal domain and the secondary structure of the C-terminal domain begins at Glu-378.
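The residue counts quoted in this paragraph follow from inclusive residue ranges, as the short Python sketch below confirms. The region labels and the helper `span_length` are assumptions introduced for illustration; the boundaries are those stated in the text.

```python
def span_length(first, last):
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


# Region boundaries (MITat1.2 numbering) as given in the text.
REGIONS = {
    "overlap with N-terminal domain": (359, 362),
    "maximum unstructured hinge": (363, 377),
    "structured core": (375, 416),
    "unstructured C-terminal residues": (417, 433),
}

for label, (first, last) in REGIONS.items():
    print(label, span_length(first, last))
```

The hinge and core ranges overlap at residues 375 to 377 because the hinge limit is set by the first element of secondary structure (Glu-378), whereas the structured core is defined from Ala-375.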
The second finding in this study was the demonstration that the structure of M1.2C can be used to produce models of other VSG type 2 C-terminal domains. The different type 2 C-terminal domains have the 4 conserved cysteines and 2 conserved hydrophobic residues that make up the core of the domain; otherwise they have little conservation of sequence except around the N-linked glycosylation site close to the C terminus (Fig. 1). The conservation of tertiary structure mirrors that previously found in the N-terminal domain and is presumably necessary for the function of the VSG C-terminal domain.
It has been proposed that the function of the VSG C-terminal domain is to increase the ability of the VSG monolayer to protect underlying proteins from host immunoglobulins, possibly by simply increasing the thickness of the monolayer (36). However, the C-terminal domain is a compact structure and the N and C termini of the domain are relatively close to each other. Thus, the domain is not particularly extended as might be expected of a domain with such a structural role in increasing the thickness of the VSG monolayer. The model of the VSG (Fig. 7) suggests that the VSG is not particularly extended by the inclusion of the C-terminal domain but does suggest that the C-terminal domain contributes to a densely packed polypeptide layer proximal to the plasma membrane.
The VSG coat has evolved to ensure the long term survival of an infecting population by functioning both in antigenic variation and in the formation of a protective macromolecular diffusion barrier. The results presented herein suggest that the C-terminal domain contributes to the diffusion barrier by creating a region densely packed with polypeptide adjacent to the plasma membrane.
| FOOTNOTES |
|---|
* This work was supported by the Wellcome Trust. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Recipient of a Nehru Fellowship from the Cambridge Commonwealth Trust.
|| To whom correspondence may be addressed. Tel.: 44-1223-766018; Fax: 44-1223-766002; E-mail: h.r.mott{at}bioc.cam.ac.uk.
** To whom correspondence may be addressed. Tel.: 44-1223-333683; Fax: 44-1223-766002; E-mail: mc115{at}cam.ac.uk.
1 The abbreviations used are: VSG, variant surface glycoprotein; TOCSY, total correlation spectroscopy; NOE, nuclear Overhauser effect; NOESY, NOE spectroscopy; HSQC, heteronuclear single quantum correlation; HPLC, high performance liquid chromatography; MOPS, 4-morpholinepropanesulfonic acid.