Structural Characterization of the Bradyzoite Surface Antigen (BSR4) from Toxoplasma gondii, a Unique Addition to the Surface Antigen Glycoprotein 1-related Superfamily

Toxoplasma gondii is an obligate intracellular protozoan parasite that infects nearly one-third of the human population. The success of T. gondii is based on its complex life cycle; a lytic tachyzoite form disseminates infection, whereas an encysted bradyzoite form establishes a latent, chronic infection. Persistence and transmissibility are central to the survival of the parasite and are, in part, mediated by a family of antigenically distinct surface antigen glycoprotein (SAG)-related sequences (SRS) adhesins that play a dual role in host cell attachment and host immune evasion. More than 160 members of the SRS family have been identified, with only the tachyzoite-expressed SAG1 structurally characterized. Here we report the first structural description of the bradyzoite adhesin BSR4 using x-ray crystallography and small angle x-ray scattering. The 1.90-Å crystal structure of BSR4 reveals an architecture composed of tandem β sandwich domains organized in a head to tail fashion, with the N-terminal domain responsible for dimer formation. A restructured topology in BSR4 results in a ligand-binding site that is significantly reorganized in both structure and chemistry relative to SAG1, consistent with BSR4 binding a distinct physiological ligand. The small angle x-ray scattering solution structure of BSR4 highlights a potentially important structural role for the interdomain polymorphic linker, which imparts significant flexibility that may promote structural adaptation during ligand binding. This study reveals an unexpected level of structural diversity within the SRS superfamily and provides important insight into the role of these virulence factors.

The protozoan parasite Toxoplasma gondii, the causative agent of toxoplasmosis, is a serious global pathogen that infects nearly one-third of the human population (1-3). T. gondii infections can be lethal to a developing fetus and to immunocompromised, cancer, AIDS, and organ transplant patients. Clinical features range from asymptomatic infection to lymphadenopathy, ileitis, encephalitis, and/or blinding ocular infections in both children and adults, with the severity of symptoms tending to increase with age (1, 4-9). During infection, T. gondii cycles between the rapidly growing, lytic tachyzoite stage and the slow growing, cyst-forming bradyzoite stage. Upon challenge by the immune system, the tachyzoites, which are responsible for rapid dissemination of the parasite, differentiate into encysted bradyzoites that promote chronic infection. The molecular switches that regulate interconversion of the two parasitic stages in the host remain largely undefined. Despite its complex life cycle, T. gondii exists as a single species and has been referred to as one of the most, if not the most, successful protozoan parasites of animals on this planet (3).
The success of T. gondii is largely due to its ability to infect a broad range of host cells (10), which is, in part, mediated by a family of developmentally expressed, antigenically distinct surface antigen glycoprotein (SAG)-related sequences (SRS) adhesins. The SRS proteins play a dual role in parasite attachment and in regulating host immunity to establish chronic infection (11). With the release of the T. gondii genome, more than 160 unique sequences belonging to the SRS superfamily have been identified (12). The common features used to define SRS paralogs include 12 conserved cysteine residues, four conserved proline residues, a conserved tryptophan residue, and at least 20% sequence identity (13-16). In addition, each SRS adhesin possesses an N-terminal signal sequence, and the mature protein is tethered to the parasite outer surface membrane by a glycosylphosphatidylinositol anchor (12, 16, 17).
Despite the growing number of genetic and cellular studies describing the importance of the SRS adhesins as virulence factors in T. gondii, the detailed molecular mechanisms by which this family of proteins mediates its biological function are unknown. To date, the highly immunogenic surface adhesin SAG1 is the only SRS antigen for which a crystal structure has been determined (18, 19). In the original, seminal study, Garcia and co-workers (19) established that SAG1 assembled into a dimer with the monomer adopting a novel SRS fold. Furthermore, the dimeric structure displayed a topologically defined, contiguous, positively charged groove formed at the interface of the D1 domains. This structural feature was predicted to mediate the interaction between the parasite and host cell extracellular matrix components such as the sulfated proteoglycan heparin (19). Although this has not been confirmed biochemically for SAG1, elution studies showed that the SRS adhesin SAG3 was selectively retained on a heparin column, suggesting a direct interaction (20). More recently, SAG1 was co-crystallized with a monoclonal antibody that binds a major epitope on SAG1 (18). This latter study shows that the SAG1-antibody complex is a monomer in solution and that the mapped immunogenic epitope does not overlap with the putative ligand-binding groove (19).
The structural paradigm established for SAG1 by these studies (18, 19) has generated many important questions regarding the role of the SRS adhesins as virulence factors in T. gondii. How conserved is the SRS fold among the 160+ family members? What are the identities of the SRS ligands and what is the basis for their interactions? Are the structural features of the tachyzoite-expressed SAG1 conserved in the SRS proteins expressed on the bradyzoite form of the parasite? This latter question is particularly important because the tachyzoite form is known to engage the host in a different manner than the highly infectious bradyzoite form, a characteristic that is intimately involved in regulation of host immunity, establishment of chronic infection, and the transmissibility of the parasite to the naïve host (17, 21). To determine whether SRS antigens from different infective stages of the parasite displayed structural diversity and, if so, how it might affect ligand binding, we structurally characterized the bradyzoite surface antigen BSR4 using both x-ray crystallography and small angle x-ray scattering. The results presented here show an unexpected level of structural diversity within the SRS superfamily, with BSR4 adapted to bind a distinct physiological ligand from SAG1.

EXPERIMENTAL PROCEDURES
Expression, Purification, and Crystallization-BSR4 from the type II clonal lineage of T. gondii was recombinantly produced in insect cells, purified, and crystallized as described previously (22). Briefly, insect cell medium containing secreted BSR4 was concentrated by tangential flow, and the recombinant protein was purified to homogeneity using nickel-nitrilotriacetic acid and Superdex 75 size exclusion chromatography. The fractions were analyzed by SDS-PAGE, pooled based on purity, and concentrated to 10 mg ml⁻¹ in HEPES-buffered saline. The crystals of BSR4 were grown in 18% polyethylene glycol 8000, 100 mM sodium cacodylate pH 6.5, and 100 mM zinc acetate and grew to a maximum size of 0.5 × 0.4 × 0.2 mm after 7 days at 293 K.
Data Collection, Processing, and Structure Solution-A single BSR4 crystal was looped into cryoprotectant consisting of mother liquor supplemented with a mixture of 10% glycerol and 10% ethylene glycol for 10 s and flash cooled directly in the cryostream (100 K). Diffraction data were collected on a Rigaku R-AXIS IV++ area detector coupled to an MM-002 x-ray generator with Osmic "blue" optics and an Oxford Cryostream 700. Diffraction data to 1.90 Å were processed using CrystalClear/d*TREK (23). The data collection statistics are presented in Table 1.
All of the refinement steps were carried out using the CCP4 suite of programs (24). Initial phases were obtained by molecular replacement using MOLREP (25) with the individual domains of SAG1 (Protein Data Bank code 1kzq) pruned with CHAINSAW (26) as search models. The individually docked D1 and D2 domains were used as the starting point for ARP/wARP (27), which built and registered the sequence of ~70% of the backbone. The remaining structure was built manually with solvent atoms selected using COOT (28). All of the solvent atoms were inspected manually before deposition. The overall structure of BSR4 was refined with REFMAC (25) to an Rcryst of 23.8% and an Rfree of 26.8%. Stereochemical analysis of the refined BSR4 structure was performed with PROCHECK and SFCHECK in CCP4 (24). The Ramachandran plot shows excellent stereochemistry for the refined structure, with more than 92% of the residues in the favored conformations and no residues modeled in disallowed orientations. Overall, 5% of the reflections were set aside for calculation of Rfree.
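As an illustration of the two agreement statistics quoted above, the following minimal sketch computes a conventional crystallographic R factor, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, and sets aside a 5% test set of reflections. It is not the REFMAC implementation: the function names, the random seed, and the omission of a scale factor between observed and calculated amplitudes are simplifying assumptions made here.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(||Fobs| - |Fcalc||) / sum(|Fobs|).

    Assumes the calculated amplitudes are already on the same scale
    as the observed ones (no scale factor is fitted here).
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly reserve a fraction of reflection indices for R-free.

    Returns (work_indices, free_indices), both sorted. The seed is a
    hypothetical choice so that the split is reproducible.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free
```

Rcryst would be evaluated with `r_factor` over the working reflections and Rfree over the reserved ones; because the reserved reflections are never used in refinement, Rfree is normally somewhat higher, as in the 23.8% and 26.8% reported here.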
Small Angle X-ray Scattering (SAXS)-Synchrotron x-ray scattering data from solutions of BSR4 were collected at the X33 beamline of the EMBL (DESY, Hamburg, Germany) (29) using a Pilatus 500 K instrument. A 4.2 mg ml⁻¹ solution of bovine serum albumin was measured as a reference during calibration. The scattering patterns were measured with an exposure time of 2 min at 288 K at a wavelength of 1.5 Å. The sample-to-detector distance was set at 2.4 m, resulting in scattering vectors (q = 4π sin θ/λ, where 2θ is the scattering angle and λ is the wavelength) ranging from 0.06 to 0.5 Å⁻¹. Three concentrations of BSR4 at 7.63, 5.61, and 4.05 mg ml⁻¹ were measured to test for consistency and eliminate concentration-dependent effects. Background scattering using the buffer solution was measured after each protein sample and then subtracted from the protein scattering patterns after normalization and detector response correction. The values of radii of gyration (Rg) were derived from the Guinier approximation (30): I(q) = I(0) exp(−q²Rg²/3), where I(q) is the scattered intensity, and I(0) is the forward scattered intensity. The radius of gyration and I(0) are inferred, respectively, from the slope and the intercept of the linear fit of ln[I(q)] versus q² in the q range q·Rg < 1.12. The distance distribution function P(r) was calculated on the merged curve by the Fourier inversion of the scattering intensity I(q) using GNOM (31) and GIFT (32). The low resolution shape of BSR4 was determined ab initio from the scattering curve using the program GASBOR (33).
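The Guinier analysis described above can be sketched in a few lines. The standard relations are q = 4π sin θ/λ (with 2θ the scattering angle) and ln I(q) = ln I(0) − (Rg²/3)q², so a straight-line fit of ln I(q) against q² gives Rg from the slope and I(0) from the intercept. The example below is only an illustration under stated assumptions: it uses an unweighted least-squares fit, hypothetical function names, and a simple iterative trimming of the data to the q·Rg < 1.12 range quoted in the text; it is not the procedure implemented in the programs cited.

```python
import math


def scattering_vector(two_theta_rad, wavelength):
    """q = 4*pi*sin(theta)/lambda, where two_theta_rad is the full scattering angle."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength


def guinier_fit(q, intensity):
    """Unweighted linear fit of ln I(q) versus q^2; returns (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def guinier_fit_limited(q, intensity, limit=1.12, max_iter=20):
    """Refit repeatedly, keeping only points with q*Rg < limit."""
    rg, i0 = guinier_fit(q, intensity)
    for _ in range(max_iter):
        keep = [k for k, qk in enumerate(q) if qk * rg < limit]
        if len(keep) < 3:
            raise ValueError("too few points inside the Guinier range")
        new_rg, new_i0 = guinier_fit([q[k] for k in keep],
                                     [intensity[k] for k in keep])
        converged = abs(new_rg - rg) < 1e-6
        rg, i0 = new_rg, new_i0
        if converged:
            break
    return rg, i0


# Synthetic, noise-free data for a hypothetical particle (Rg = 25 A, I0 = 100).
q_values = [0.005 * k for k in range(1, 13)]
i_values = [100.0 * math.exp(-(qv * 25.0) ** 2 / 3.0) for qv in q_values]
rg_fit, i0_fit = guinier_fit_limited(q_values, i_values)
```

With real data the points would normally be weighted by their experimental errors, and the lowest-angle points inspected for aggregation or interparticle effects, which is why several concentrations are compared.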
Bioinformatics-The neighbor joining tree was constructed using CLUSTALW (version 1.83) (34) by the method of Saitou and Nei (35). The buried surface area of the BSR4 dimer was calculated using the Protein-Protein Interaction analysis server.

RESULTS AND DISCUSSION
Phylogeny of BSR4-T. gondii is characterized by three infective forms: tachyzoites, bradyzoites, and sporozoites. Tachyzoites are responsible for disseminating acute infection in naïve hosts, whereas bradyzoites represent the transmissible stage of the asexual cycle and sporozoites the transmissible stage of the sexual cycle. Phylogenetic analyses indicate that genes expressed in a stage-specific manner cluster together as closest paralogs (Fig. 1). The prototypical bradyzoite surface antigen, BSR4, is encoded in a cluster of SRS genes on chromosome IV that also comprises the closely related paralogs SRS6 and SRS9. Sequence alignment shows that BSR4 shares only 26.5% sequence identity with the tachyzoite-expressed SAG1. This observation is consistent with the prediction that stage-specific structural features likely play an important role in the biology of T. gondii infection, dissemination, and pathogenesis. As a result, BSR4 and SAG1 are likely to recognize different physiological ligands.
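For readers unfamiliar with how a figure such as 26.5% identity is obtained, the sketch below computes percent identity from a pair of pre-aligned sequences. It assumes one common convention (identical residues divided by the number of alignment columns in which neither sequence has a gap); the convention used for the value quoted here is not stated in the text, and the example strings are hypothetical, not BSR4 or SAG1 sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: five ungapped columns, four of them identical.
toy_identity = percent_identity("MKT-LV", "MRTALV")
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give somewhat different numbers for the same alignment.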
Overall Structure-The insect cell-produced BSR4 crystallized with one monomer in the asymmetric unit of the P4₁2₁2 unit cell, with the dimer generated by the crystallographic 2-fold symmetry. The BSR4 structure was solved by molecular replacement using the N- and C-terminal domains of SAG1 as independent search models. The overall BSR4 structure was refined to a resolution of 1.90 Å, with the final model starting at Ser5 and extending through the fifth histidine of the His6 tag (His314). Overall, BSR4 adopts a dumbbell-shaped structure with the N-terminal (D1) and C-terminal (D2) domains organized in an extended head to tail fashion connected by a short, five-residue linker (Fig. 2A). In this organization, the distal points of the D1 and D2 domains are positioned ~80 Å apart. The core β strand structure of BSR4 is well ordered, with only four interstrand surface loops remaining unmodeled (Phe34→Ser41, Arg124→Lys128, Ala159→Ser163, and Lys294→Thr310). The final data collection and refinement statistics are presented in Table 1.
The β sandwich folds of the D1 and D2 domains of BSR4 are composed of antiparallel and parallel strands stabilized in each domain by three conserved disulfide bonds (Fig. 2B). The D1 domain extends from Arg6 through Glu162 and adopts a flat, extended structure that measures ~42 × 11 Å. The general organization of the β strands in the D1 domain of BSR4 is of a discontinuous four (a, b, d, and e) on three (c, g, and f) β sandwich (Fig. 2B, left panel). The upper face of the D1 β sandwich is formed by two central parallel strands (b and d) organized in an upward fashion bordered by two downward facing strands (a and e) to give an overall down-up-up-down topology. The three strands forming the lower face of the β sandwich are also a mixture of parallel and antiparallel strands, with strands c and g directed downward and strand f directed upward. Three disulfide bonds provide stability to the core structure, with strands b and g tethered by Cys16 and Cys155 and strands d and f connected by Cys46 and Cys138. Strands d and e are connected by a predominant loop structure that extends from Ile53 through Glu106. Despite the position of this region outside of the core β sandwich, clear electron density exists for each residue, with significant rigidity contributed by the third disulfide bond formed between Cys66 and Cys77. The D2 domain adopts a more compact, globular structure than D1, measuring 33 × 11 Å, and extends from Ser166 through Lys308 (Fig. 2B, right panel). Unlike the D1 domain, the organization of the strands in the D2 domain is of a five on three β sandwich. The secondary structure elements are, however, tethered together in a similar fashion to that of the D1 domain by three conserved disulfide bonds (Cys176→Cys301, Cys202→Cys275, and Cys218→Cys228). Although no appreciable sequence identity is observed between the N- and C-terminal domains of BSR4, the obvious structural similarities indicate that the two domains evolved from a common ancestral gene.
Expanding the SRS Fold-A bioinformatic analysis of the BSR4 structure using a Dali (36) search revealed that the four/five on three β sandwich fold of BSR4 was most similar to the T. gondii tachyzoite-expressed SAG1 (18, 19), with Z scores of 17.8 and 20.5 for the N- and C-terminal domains, respectively. By comparison, the next most similar structure was the cellular adhesin protein NCAM (37), with a substantially reduced Z score of 4.8. The significant BSR4-SAG1 Z score is consistent with the original conclusion by Garcia and co-workers (19) that the SRS architecture constitutes a novel fold. A detailed comparison of BSR4 and SAG1 (19) in the context of their dimeric forms reveals a divergence in topology of the D1 domain, consistent with the lower Z score from the Dali search. In BSR4, strands a and b are organized in an antiparallel fashion followed by strand c on the lower face of the β sandwich (Fig. 2C, D1 domain, left panel). In the dimeric structure of SAG1, however, strands a and b are positioned on opposite faces of the β sandwich, whereas strand c is parallel to strand a (Fig. 2C, D1 domain, right panel). Interestingly, in the monomeric SAG1 structure, the absence of the dimer partner results in disruption of the a strand such that it folds back on itself (18). The overall topology of the five on three β sandwich D2 domain is conserved between BSR4 and SAG1. The strand designations in BSR4, however, have been modified to reflect that the first two strands are antiparallel and therefore denoted as a and b rather than a and a′ as defined previously for SAG1 (19).
BSR4 Dimer-Based on gel filtration and small angle x-ray studies, BSR4 is a monomer in solution up to the highest concentration tested of 15 mg ml⁻¹. Both monomeric and dimeric forms of SAG1 are observed in solution and are represented in the two different crystal forms (18, 19). He et al. (19) proposed that the dimeric form of SAG1 is critical to establish a topological binding site for potential ligands such as host cell surface sulfated proteoglycans. Alternatively, the crystal structure of SAG1 in complex with an Fab fragment adopted a monomeric form and demonstrated that SAG1 dimerization is not favored in a co-crystal complex, implicating host immunity as a potential regulator of SAG1 ligand interaction (18). Analysis of the molecular packing of BSR4 in the tetragonal unit cell clearly shows a symmetry-related molecule oriented to form a dimeric BSR4 structure (Fig. 3B). However, only ~600 Å² of surface area from the BSR4 D1 domains is buried upon dimer formation, which is half of the 1201 Å² buried between the D1 domains of SAG1 (19). Nonetheless, the physiological relevance of the homodimer is supported by the observation that proteins adopting biologically relevant multimers on cell surfaces often show no appreciable multimerization in solution (38, 39) because of the loss of localization effects of the cell membrane.
Overall, the BSR4 dimer interface is formed by 10 backbone hydrogen bonds, with six contributed by the upper portion of the β sandwich (strand a) and four from the lower portion (strand c). The contiguous strands in BSR4 show unambiguous electron density, including the connectivity between the a and b strands that forms the basis of the structurally divergent topology relative to SAG1 (Fig. 3A). The extended sheet strongly suggests that the dimer interface defined by the D1 domains is physiologically relevant. Interestingly, the alignment of the intermolecular strands from the D1 domain of BSR4 that participate in dimerization is considerably more extensive than in SAG1 (Fig. 3B). This results from the individual strands in BSR4 being longer than in SAG1. The disposition of the D2 domains in BSR4 is such that they are directed away from each other and do not contribute to formation of the dimer (Fig. 3C, left panel). This organization is in contrast to SAG1, where the D2 domains contribute ~440 Å² of the 2030 Å² total of buried surface area, which is likely the reason for partial formation of SAG1 dimers in solution (Fig. 3C, right panel).

FIGURE 2 (legend fragment). ... β sandwich folds adopted by the D1 and D2 domains. The β strands that form the upper leaves of the D1 and D2 domains are shown in blue and magenta, respectively. The yellow strands depict the lower leaves of the D1 and D2 domains. Structure figures were generated with PyMol (43). C, left- and right-hand panels represent the topology diagrams generated with TopDraw (44) for BSR4 and SAG1, respectively. Note the difference in topology between the D1 domains of BSR4 and SAG1.

TABLE 1 Data collection and refinement statistics
The values in parentheses are for the highest resolution shell.

Flexibility of the BSR4 Linker-The structure of BSR4 reveals that the D1 and D2 domains are oriented in a head to tail fashion connected by a structurally demarcated five-residue linker. The incorporation of a linker region indicates the potential for intramolecular flexibility that may play a role in multimerization and ligand recognition in the SRS superfamily. In the crystal structure of BSR4, the linker region is extended such that the D1 and D2 domains do not interact. To more thoroughly characterize the structural relationship between the D1 and D2 domains, we determined the structure of BSR4 in solution using SAXS. The calculated radii of gyration and Dmax values for three concentrations of BSR4 are reported in Table 2. For each concentration, several ab initio GASBOR (33) calculations were performed and subsequently compared with the program DAMAVER (40) to compute the normalized spatial discrepancy value for each shape. In all of the experiments, similar forms were calculated, with normalized spatial discrepancy values of 0.92 ± 0.05 consistent with a high level of experimental accuracy (41). The individual domains of BSR4 were manually docked into the overall ab initio shape determined by SAXS (Fig. 4, bottom right panel) alongside the surface calculated based on the crystal structure (Fig. 4, bottom left panel). The values for the crystal structures calculated with CRYSOL (42) (Table 2) suggest that the disposition of the D1 and D2 domains in the crystal structure differs from that observed in solution. All of the shapes produced with GASBOR (33) indicate a kink between the domains that results in an interdomain angle of ~45°. It is noteworthy that when the kinked form of BSR4 is mapped onto the dimer structure in Fig. 3C, the orientation of the D1 domains is unchanged. The D2 domains, however, are much closer to each other but do not form the extensive interface observed in SAG1.
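One simple way to quantify a kink between two domains is the angle between their long axes. The sketch below is a hedged illustration only: the text does not state how the ~45° interdomain angle was measured, so the convention used here (0° for a fully extended molecule, with each axis running from a distal point to a hinge placed at the linker) and all coordinates are assumptions chosen for the example.

```python
import math


def kink_angle_deg(d1_end, hinge, d2_end):
    """Angle in degrees between the D1 axis (distal end to hinge) and the
    D2 axis (hinge to distal end). Returns 0 for a straight, extended molecule."""
    u = [h - a for a, h in zip(d1_end, hinge)]
    v = [b - h for h, b in zip(hinge, d2_end)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates in angstroms: two 40 A arms meeting at the linker.
hinge_point = (40.0, 0.0, 0.0)
extended = kink_angle_deg((0.0, 0.0, 0.0), hinge_point, (80.0, 0.0, 0.0))
bent_end = (40.0 + 40.0 * math.cos(math.radians(45.0)),
            40.0 * math.sin(math.radians(45.0)),
            0.0)
kinked = kink_angle_deg((0.0, 0.0, 0.0), hinge_point, bent_end)
```

In practice the domain axes would be taken from the docked domain coordinates, for example from domain centroids or principal axes, rather than from hand-picked points.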
A comparison of the five-residue linker sequence between seven different SRS adhesins from bradyzoites and tachyzoites reveals two general subfamilies. The first family is defined by bulky, polar residues separated by alanine and serine residues (BSR4 (EARAS), SAG1 (QARAS), SRS1 (YAKSA), SRS9 (KARAS), and SRS6 (KARPS)), whereas the second family incorporates multiple proline residues (SAG3 (EPTPP) and SRS2 (EPRDP)). The linker appears to provide a structural element, evolved during gene duplication, that governs intramolecular associations and tertiary topology. It is noteworthy that the structure of the linker region does not show stage specificity within the SRS superfamily. Instead, BSR4 and SAG1, which share only 26.5% sequence identity, display a conserved tether. The structural implications of the different tethers are unknown, but we predict that the linker plays a role in facilitating structural adaptation during ligand binding. It may also serve as a structural filter to mediate heterodimerization within the SRS superfamily. This would enable the parasite to dramatically increase the repertoire of its surface antigens without increasing the size of its genome.
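The grouping of the seven linker sequences can be reproduced with a very small script. The sequences are those listed above; the classification rule (two or more prolines marks the second family) and the group labels are assumptions introduced here for illustration, chosen because SRS6 (KARPS) carries a single proline yet is placed in the first family.

```python
# Five-residue linker sequences as listed in the text.
LINKERS = {
    "BSR4": "EARAS",
    "SAG1": "QARAS",
    "SRS1": "YAKSA",
    "SRS9": "KARAS",
    "SRS6": "KARPS",
    "SAG3": "EPTPP",
    "SRS2": "EPRDP",
}


def classify_linker(sequence, min_prolines=2):
    """Label a linker by proline content (threshold is an assumed rule)."""
    if sequence.count("P") >= min_prolines:
        return "proline-rich"
    return "Ala/Ser-spaced"


def group_linkers(linkers, min_prolines=2):
    """Group protein names by linker class, preserving input order."""
    groups = {}
    for name, sequence in linkers.items():
        label = classify_linker(sequence, min_prolines)
        groups.setdefault(label, []).append(name)
    return groups


linker_groups = group_linkers(LINKERS)
```

Under this rule the bradyzoite BSR4 and the tachyzoite SAG1 fall in the same group, which mirrors the observation that the linker class does not track stage specificity.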
Implications for Ligand Binding-The contiguous β sheet formed upon BSR4 dimerization is consistent with a topologically defined ligand-binding site. To further characterize the chemistry of this surface, the electrostatic potential was calculated, showing that the ligand-binding surface of BSR4 is composed of discrete basic and acidic regions (Fig. 5A). Arg6 and Lys117 combine to create a horseshoe-type structure of positive charge that partially encircles a pocket of negative charge formed by Asp17 and Glu45. Lys15 is also present on the surface but adopts a flattened structure such that the methylene side chain is exposed, and the amino group is directed away from the central core of the dimer interface. Sequence alignment of all known BSR4 alleles illustrates that the charge of each of these surface residues is conserved, with only the BSR4 allele from the VAND strain showing a conservative substitution of Asp for Glu at position 45. Rotation of the BSR4 dimer reveals that the topologically formed surface cleft forms an extended, shallow groove (Fig. 5B, left panel). This structural feature differs significantly from that observed in SAG1, which shows a deep groove (Fig. 5B, right panel). The relatively short SAG1 D1 strands form a comparatively offset β sandwich, which results in a compact, twisted structure that relies on interstrand loops to generate the sides of the deep groove. In BSR4, the β strands are nearly twice as long as those in SAG1, resulting in the extended shallow groove. Furthermore, the groove in SAG1 is lined only with basic residues, consistent with its predicted role to bind a sulfated proteoglycan (16). The structural and chemical differences of BSR4 relative to SAG1 support the hypothesis that the two SRS family members bind distinct physiological ligands, consistent with their unique roles in T. gondii pathogenesis.
Conclusion-The high resolution structure of BSR4 reported here represents the first structural description of an SRS protein from the highly infectious, cyst-forming bradyzoite cell stage of T. gondii and only the second structure of an SRS superfamily member. Molecular packing of BSR4 in the crystal lattice suggests formation of a dimer in which the secondary structure elements of the D1 domains align to form a contiguous β sheet, resulting in a topologically defined groove. The presence of this groove-like structure in BSR4 supports a ligand binding role for the SRS proteins, as originally suggested in the SAG1 study (19). The chemical and structural diversity of the BSR4 and SAG1 grooves, however, suggests coordination of distinct ligands. Additional diversity in the SRS superfamily is likely derived from a polymorphic linker tethering the D1 and D2 domains that may promote structural adaptation during ligand binding. Based on these observations, screening studies to identify host cell ligands can now be initiated with greater confidence.