Galactose Recognition by the Apicomplexan Parasite Toxoplasma gondii*

Background: TgMIC4 is an important microneme effector protein from Toxoplasma gondii. Results: The structure of TgMIC4 together with carbohydrate microarray analyses reveals a broad specificity for galactose-terminating sequences. Conclusion: Lectin activity within the fifth apple domain of TgMIC4 is reminiscent of the mammalian galectin family. Significance: TgMIC4 may contribute to parasite dissemination within the host or down-regulation of the immune response.
Toxoplasma gondii is the model parasite of the phylum Apicomplexa, which contains numerous obligate intracellular parasites of medical and veterinary importance, including Eimeria, Sarcocystis, Cryptosporidium, Cyclospora, and Plasmodium species. Members of this phylum actively enter host cells by a multistep process with the help of microneme protein (MIC) complexes that play important roles in motility, host cell attachment, moving junction formation, and invasion. The T. gondii (Tg)MIC1-4-6 complex is the most extensively investigated microneme complex and contributes to host cell recognition and attachment via the action of TgMIC1, a sialic acid-binding adhesin. Here, we report the structure of TgMIC4 and reveal its carbohydrate-binding specificity for a variety of galactose-containing carbohydrate ligands. The lectin is composed of six apple domains, of which the fifth displays a potent galactose-binding activity and is cleaved from the complex during parasite invasion. We propose that galactose recognition by TgMIC4 may compromise host protection from galectin-mediated activation of the host immune system.

Toxoplasma gondii is an obligate intracellular parasite of the phylum Apicomplexa, and the causative agent of toxoplasmosis. It exhibits the ability to invade almost any nucleated cell, and is prevalent among human populations, with a worldwide infection rate of up to 50%. Infection in humans primarily occurs following consumption of undercooked infected meat or contact with feces from infected domestic cats. In healthy adults the infection is generally either asymptomatic or results in a mild, flu-like illness, which marks the beginning of a lifelong chronic infection (1). In immunocompromised individuals infection can lead to acute disease, in which T. gondii awakens from a semidormant state causing blindness (2) or potentially fatal encephalitis (3,4). Infection in pregnant women can result in a range of fetal birth defects or death (5). Due to its remarkably high infection rate, T. gondii constitutes the third most common cause of food-related death in both the United States (6) and France (7) after Salmonella and Listeria.
T. gondii and other apicomplexan parasites, including Plasmodium species, rely on an active, phylum-specific host cell invasion process to establish infection. Host cell entry consists of several sequential steps initiated by the release of proteins from secretory organelles named micronemes and rhoptries (8). Some of the microneme protein complexes contribute to host cell attachment and provide a link between host cell receptors and the parasite actomyosin motor, and hence provide the motive force necessary for host cell invasion (9,10). Among the microneme protein complexes operating in invasion, the TgMIC1-4-6 complex is important for the efficiency of this process and has been shown to contribute to the virulence of the parasite in mice (11,12). The structure of the N-terminal region of TgMIC1 revealed a pair of novel domains termed the microneme adhesive repeat region (MARR) (13), in complex with a wide range of sialylated glycans (14). TgMIC1 not only interacts with a range of sialylated glycans on host cell receptors, but also recruits TgMIC4, which is anticipated to exert adhesive function during invasion (15). Previous studies suggested that the first two apple (A) domains of TgMIC4 bind to the N terminus of TgMIC1 (16,17), whereas the C-terminal fragment (including the sixth apple domain) exhibited cell binding activity of unknown specificity (supplemental Fig. S1) (16). Although lectin activity has not yet been reported for TgMIC4, speculation has arisen from two observations: an earlier report showing that a TgMIC1-4 subcomplex could be recovered from a lactose-affinity column (18), and our previous studies revealing that TgMIC1 is specific for sialylated oligosaccharides only (13).
Here, we combine atomic resolution studies with data from carbohydrate microarrays to reveal the basis of the interaction between TgMIC4 and a variety of galactose (Gal)-terminating oligosaccharides, and further define the interaction between TgMIC4 and TgMIC1. This reveals new features regarding both parasite-receptor interactions and the stoichiometry of the TgMIC1-4-6 complex.
Solution Structure Determination of TgMIC4-A12 and TgMIC4-A5-Backbone and side chain assignments were completed using our in-house, semiautomated assignment algorithms and standard triple resonance assignment methodology (19). Hα and Hβ assignments were obtained using HBHA(CBCACO)NH. The side chain assignments were completed using HCCH-total correlation spectroscopy (TOCSY) and (H)CC(CO)NH TOCSY. Three-dimensional ¹H-¹⁵N/¹³C NOESY-HSQC (mixing time 100 ms at 500 and 800 MHz) experiments provided the distance restraints used in the final structure calculation. The ARIA protocol (20) was used for completion of the NOE assignment and structure calculation, including water refinement of final structures using a thin layer of explicit solvent (21). Dihedral angle restraints derived from TALOS were also implemented (22). The structural statistics are presented in the supplemental data.
A structural model for the TgMIC4-A5·lacto-N-biose complex was calculated using the molecular docking program HADDOCK (23). Selected residues that exhibited a significant shift perturbation, namely Lys-428, Asn-460, Tyr-467, Lys-469, Tyr-476, and Tyr-478, were defined as "active" for docking. Additionally, intermolecular NOEs between TgMIC4-A5 and lacto-N-biose were measured via a three-dimensional ¹³C-filtered ¹³C-HSQC-NOESY experiment. TgMIC4-A5 aromatic nuclei were re-assigned in the presence of lacto-N-biose by titrating the protein with the ligand and observing chemical shift perturbations in the ¹H-¹³C-HSQC spectrum (selective for aromatic nuclei). The presence of a 5-fold excess of lacto-N-biose enabled NOE-bearing nuclei to be assigned using a sample of free disaccharide. Lacto-N-biose ¹H chemical shift assignment was carried out using data from ¹H-¹H TOCSY, ¹H-¹H COSY, ¹H-¹H NOESY, and ¹³C-HSQC NMR spectra. Intermolecular NOEs were implemented as distance restraints alongside chemical shift perturbation-derived ambiguous interaction restraints to drive structure calculation using HADDOCK version 2.1. The docking process utilized an ensemble of 10 low-energy conformers from the TgMIC4-A5 structure determination, and a lacto-N-biose structure (and parameter and topology files) created using the GlyCaNS server.
T. gondii Culture-T. gondii tachyzoites were grown in confluent human foreskin fibroblasts or Vero cells maintained in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% fetal calf serum, 2 mM glutamine, and 25 μg/ml of gentamicin.
Immunofluorescence Assay and Confocal Microscopy-Parasite-infected human foreskin fibroblasts were fixed with 4% paraformaldehyde or 4% paraformaldehyde, 0.05% glutaraldehyde in PBS, depending on the antigen to be labeled, and processed as previously described (24). Confocal images were collected with a Leica laser scanning confocal microscope (TCS-NT DM/IRB and SP2) using a 100× Plan-Apo objective with NA 1.4.
Plaque Assay-Host cells were infected with parasites for 7 days before fixation with paraformaldehyde/glutaraldehyde followed by Giemsa staining.
Invasion Assay-Invasion assays were performed as described previously using the RH-2YFP strain as an internal standard (26). Briefly, confluent human foreskin fibroblasts were heavily infected with a mixture of the strain of interest and RH-2YFP parasites and washed after 1 h. Parasites were incubated for 36 h. Extracellular parasites were then collected and the ratio of non-YFP to YFP parasites was determined. At the same time these parasites were transferred onto new host cells and allowed to invade for 1 h at 37°C before washing. Incubation then continued for 24 h and cells were fixed. Parasites were stained with α-GAP45 and the ratio between non-YFP and YFP parasite vacuoles was calculated. The efficiency of invasion was determined by counting vacuoles in 20 fields (around 400 vacuoles) for each condition and for two independent experiments.
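The ratio-based readout described above can be illustrated with a minimal sketch. It assumes that invasion efficiency is expressed as the non-YFP/YFP vacuole ratio divided by the non-YFP/YFP ratio of the input parasites; this normalization, the function name, and the counts are illustrative assumptions and are not taken from the original protocol.

```python
def relative_invasion(input_test, input_yfp, vac_test, vac_yfp):
    """Invasion of the test strain relative to the RH-2YFP internal standard.

    Assumption (not stated in the source): the vacuole ratio is normalized
    by the ratio of the two strains in the input parasite mixture.
    """
    counts = (input_test, input_yfp, vac_test, vac_yfp)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive")
    input_ratio = input_test / input_yfp    # non-YFP : YFP parasites added
    vacuole_ratio = vac_test / vac_yfp      # non-YFP : YFP vacuoles formed
    return vacuole_ratio / input_ratio


# Hypothetical counts: about 400 vacuoles scored over 20 fields.
efficiency = relative_invasion(input_test=220, input_yfp=200,
                               vac_test=150, vac_yfp=250)
print(f"relative invasion efficiency: {efficiency:.2f}")
```

A value of 1 would mean that the strain of interest invades as efficiently as the internal standard; values below 1 indicate an invasion defect.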
Carbohydrate Microarray Analyses-The microarray analyses were performed using the neoglycolipid-based microarray system that contains sequence-defined lipid-linked oligosaccharide probes: glycolipids and neoglycolipids (27,28). The repertoire of 400 probes is described in supplemental Table S2. Among these are 248 neutral sequences, 55 acidic but nonsialylated, and 97 sialylated probes with differing sialic acid linkage, glycan backbone, chain length, and sequence. The lipid-linked glycan probes were printed onto 16-pad nitrocellulose-coated glass slides, in duplicate at two levels, 2 and 5 fmol/spot, as described (28). The screening microarray binding analyses were performed at ambient temperature. His-tagged TgMIC4-A12, TgMIC4-A5, and TgMIC1-MARR were assayed essentially as described (12,13). In brief, the arrayed slides were blocked for 1 h with 1% (w/v) bovine serum albumin (Sigma) in Pierce Casein Blocker solution (casein/BSA). The His-tagged proteins were precomplexed with mouse monoclonal anti-polyhistidine (Sigma) and biotinylated goat anti-mouse IgG antibodies (Sigma) in a ratio of 1:2.5:2.5 (by weight) and overlaid onto the arrays. TgMIC4-A12 and TgMIC1-MARR were tested at 40 μg/ml and TgMIC4-A5 was tested at 20 μg/ml. Binding was detected using Alexa Fluor 647-conjugated streptavidin (Molecular Probes). Microarray data analysis and presentation were carried out using dedicated software. The binding to oligosaccharide probes was dose-related, and results at 5 fmol/spot are shown.
To closely compare the binding preferences of TgMIC4-A5, a focused microarray "dose-response" format was used, which included glycolipids asialo-GM1, GM1, SM1a, and SB1a (29). For this, the probes were quantified at the same time and printed in duplicate at four levels: 0.3, 1, 2, and 5 fmol/spot.
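As a small worked example of the 1:2.5:2.5 (by weight) precomplexing ratio, the sketch below derives the antibody concentrations implied by a given protein concentration. It assumes that the quoted 40 and 20 μg/ml refer to the His-tagged protein itself; the function and key names are hypothetical.

```python
def precomplex_concentrations(protein_ug_per_ml, ratio=(1.0, 2.5, 2.5)):
    """Scale a protein : anti-His : biotinylated anti-mouse IgG weight ratio.

    Assumes the stated test concentration refers to the His-tagged protein.
    """
    protein_part, anti_his_part, anti_mouse_part = ratio
    scale = protein_ug_per_ml / protein_part
    return {
        "protein": protein_ug_per_ml,
        "anti_his": anti_his_part * scale,
        "anti_mouse_biotin": anti_mouse_part * scale,
    }


for name, conc in [("TgMIC4-A12", 40.0), ("TgMIC1-MARR", 40.0), ("TgMIC4-A5", 20.0)]:
    print(name, precomplex_concentrations(conc))
```

Under this assumption, a protein tested at 40 μg/ml would be overlaid together with 100 μg/ml of each antibody.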
Blue Native PAGE and SDS-PAGE of Native TgMIC1-4 Complex-Native TgMIC1-4 complex from tachyzoites of the virulent RH strain of T. gondii was purified as described previously (18). BN-PAGE experiments were performed with the Native PAGE Novex BisTris Gel System (Invitrogen) according to the manufacturer's instructions, using a 4-16% gradient gel. In addition to natively purified complex (Lac+), protein samples were denatured either in 8 M urea or 6 M guanidinium chloride for 4 h at room temperature. Samples were mixed with Native PAGE loading buffer (Invitrogen), and 5 μg of protein was loaded into the gel. Electrophoresis was performed in an ice bath, at 150 V until completion of dye migration. The gel was stained using the Colloidal Blue Staining Kit (Invitrogen). A high molecular weight calibration kit (GE Healthcare) was used to indicate the protein size.
Natively purified complex was analyzed via SDS-PAGE using a 4-12% BisTris polyacrylamide gel (Invitrogen) in NuPAGE MOPS-SDS buffer, pH 7.7, according to the manufacturer's instructions. In addition, a 220 kDa band from blue native PAGE was excised from the native gel and the proteins were eluted by incubating the gel slice with SDS-sample buffer overnight. This was incubated at 90°C for 5 min and the supernatants were loaded into the gel. SDS-PAGE was performed at 170 V until completion of sample migration. Protein bands were visualized via silver staining using a Silver Staining Plus kit (Bio-Rad).

RESULTS AND DISCUSSION
The Overall Structure of Apple Domain Pair from TgMIC4-Most previous structural studies of apple domains of various proteins have focused on individual apple domains (30-32). Sequence analysis reveals that TgMIC4 comprises six apple domains that occur in intimate pairs, with only 3 residues separating the first and second (A12), third and fourth (A34), and fifth and sixth (A56) domains (15). A phylogenetic analysis of the individual apple domains from TgMIC4 showed a clear divergence between the odd and even domains, indicating that they may naturally form pairs (33) (supplemental Fig. S2).
To investigate the arrangement of apple domains in TgMIC4, we determined the solution structure of the first two apple domains, namely A1 and A2 comprising residues 58-231 (TgMIC4-A12), using heteronuclear multidimensional nuclear magnetic resonance (NMR) spectroscopy. The structure reveals an intimately associated pair of apple domains (Fig. 1 and supplemental Fig. S3), each consisting of a sheet formed of 4-5 antiparallel β-strands that cradles an α-helix. On the other face of this β-sheet lies an additional smaller sheet that is formed from two short β-strands, and in the second domain is extended with an additional 4-residue helix. Two disulfide bridges connect the helix to the central strands of the sheet (C2:C5 and C3:C4), with a further disulfide bridge connecting the N terminus to the C terminus (C1:C6). These structural features correspond to previously determined apple domain structures (supplemental Fig. S4). The ensemble of the 10 lowest energy structures has been deposited in the Protein Data Bank under accession number 4A5V (Fig. 1A and supplemental Table S1). The interface between the domains is mediated mainly by a collection of hydrophobic residues, most notably a patch of four alanine residues found on the outer face of the helix, Phe-213 and Met-158 in A2, and Leu-124 and Pro-122 from A1. A large number of NOEs can be assigned between the two domains across the interface (supplemental Fig. S3).
Analysis of all apple domain structures determined to date reveals only one in which the arrangement of tandem pairs has been established: the crystal structure of the four apple domains found in coagulation factor XI (35). Although the apple domains found in coagulation factor XI have a relatively low sequence similarity to TgMIC4-A12, a comparison of the arrangement of apple domains in this protein with TgMIC4-A12 reveals a very similar structural interface (Fig. 2). Interestingly, the relative orientation of the domains is reversed. In coagulation factor XI, helix 1 of the odd numbered domains is buried at the interface, whereas in TgMIC4-A12, helix 1 of the even numbered domains is found at the interface.
The similarity in the arrangement of the apple domain pairs to that found in coagulation factor XI suggests that the full-length TgMIC4 protein may adopt a similar disc-like structure (Fig. 2). The significantly longer linker between the fourth and fifth apple domains in TgMIC4, and the fact that this region is cleaved by a parasite-encoded protease, TgSUB1, at the surface of the parasite releasing TgMIC4-A56 (36), suggest a model in which the coagulation factor XI arrangement is maintained between the first two pairs, with the third pair (A56) more loosely associated (Fig. 2).
TgMIC4 Interacts with the β-Finger of TgMIC1 via Second Apple Domain-Previous studies on mutant parasites indicate that TgMIC4-A12 interacts with the N-terminal region of TgMIC1 within the TgMIC1-4-6 complex (16). To investigate this interaction we performed an NMR titration using recombinantly expressed TgMIC1-MARR and TgMIC4-A12. No interaction could be observed by NMR, a result that was confirmed by isothermal titration calorimetry and analytical gel filtration. The disparity between in vivo and in vitro results led us to reanalyze the crystal structure of TgMIC1-MARR (13). There are two additional cysteine residues present in the β-finger motif at the C-terminal end of MAR2, which introduce a rearrangement of the disulfide bond pattern in MAR2 compared with that seen in MAR1. The expected pairing between C4 and C6 is broken and two new disulfide bonds are made with the cysteine residues from the β-finger. This results in the β-finger motif being pinned against the surface of the MAR2 domain and we hypothesize that this may block a potential interaction between recombinant TgMIC1 and TgMIC4 (supplemental Fig. S5). In vivo, the normal protein folding environment within the ER of the parasite and subsequent quality control checks would enforce an alternative bonding pattern in MAR2, similar to the one observed for MAR1, and allow correct assembly of the complex.
Figure legend fragment: C, the interface between the domains seems to consist predominantly of buried hydrophobic residues. A patch of four alanine residues is found on the outer face of the helix from apple 2. D, a charge-charge interaction between His-13 and Glu-108 at the A12 interface.
To investigate the possibility that this disulfide rearrangement in recombinant TgMIC1-MARR is responsible for the lack of binding in vitro, a peptide corresponding to the TgMIC1 β-finger (residues 237-256) was synthesized and an intramolecular bond was formed between the two β-finger cysteine side chains. An NMR titration experiment was performed by recording the ¹H-¹⁵N-HSQC spectra of ¹⁵N-labeled TgMIC4-A12 in the presence of increasing amounts of peptide. A number of chemical shift perturbations are seen in this spectrum (Fig. 3A). The residues undergoing chemical shift perturbation are localized exclusively to A2 (Fig. 3B), suggesting that an interaction exists between the second apple domain of TgMIC4 and the β-finger region of TgMIC1. Further studies will be required to elucidate the precise binding mode of the peptide.
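Mapping perturbed residues from a ¹H-¹⁵N-HSQC titration is commonly done with a combined, weighted chemical shift difference per residue. The sketch below illustrates one such scheme. The ¹⁵N weighting factor (0.14), the cutoff of mean plus one standard deviation, and the shift values are all assumptions chosen for illustration; the source does not state how perturbations were scored.

```python
import math
from statistics import mean, pstdev


def combined_csp(delta_h, delta_n, alpha=0.14):
    """Combined 1H/15N shift change in ppm; alpha down-weights 15N (assumed value)."""
    return math.sqrt(delta_h ** 2 + (alpha * delta_n) ** 2)


def perturbed_residues(shifts, alpha=0.14, n_sd=1.0):
    """Return residues whose combined CSP exceeds mean + n_sd * SD (assumed cutoff)."""
    csp = {res: combined_csp(dh, dn, alpha) for res, (dh, dn) in shifts.items()}
    cutoff = mean(csp.values()) + n_sd * pstdev(csp.values())
    hits = sorted(res for res, value in csp.items() if value > cutoff)
    return hits, cutoff


# Hypothetical (delta 1H, delta 15N) values in ppm, keyed by residue number.
example_shifts = {
    150: (0.010, 0.05),
    151: (0.020, 0.10),
    160: (0.120, 0.80),
    170: (0.010, 0.02),
    180: (0.015, 0.05),
}
hits, cutoff = perturbed_residues(example_shifts)
print(f"cutoff = {cutoff:.3f} ppm; perturbed residues: {hits}")
```

Plotting the flagged residues onto the structure is then what reveals whether they cluster on a single domain, as reported here for A2.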
The "β-Finger" Region of TgMIC1 Is Required for Correct Trafficking of TgMIC4 in T. gondii-It has been shown previously that an interaction exists between TgMIC1 and TgMIC4 that facilitates the correct targeting of TgMIC4 to the micronemes (16,37). In these studies complementation of the mic4ko strain with either TgMIC4 or TgMIC4-A12 results in their correct sorting to the micronemes, suggesting that the first pair of apple domains is sufficient to mediate interaction with TgMIC1 within the parasite. To determine the significance of the potential interaction between TgMIC4-A2 and the β-finger motif from TgMIC1, the transport of the components of the complex was analyzed in a mic1 knock-out strain (mic1ko) complemented with a construct expressing TgMIC1-Δβfinger (lacking residues 216 to 237). The expression of TgMIC1-Δβfinger was assessed by Western blot (Fig. 4A). When mic1ko is complemented with full-length TgMIC1, TgMIC6 and TgMIC4 are successfully targeted to the micronemes (16) (Fig. 4B). In contrast, when mic1ko parasites are complemented with TgMIC1-Δβfinger, the TgMIC1 mutant protein and TgMIC6 are correctly transported to the micronemes, whereas TgMIC4 remains mislocalized (Fig. 4B). As observed in mic1ko, TgMIC4 is blocked in earlier compartments of the secretory pathway, suggesting that the β-finger forms a necessary part of the TgMIC4-MIC1 interface and is crucial for the sorting of TgMIC4. These in vivo results are in excellent agreement with the structural data.
We next compared these strains in cell invasion assays (Fig. 4C). In the mic1ko strain, invasion is significantly reduced compared with wild-type, confirming the role for MAR domains in invasion (12,13). Efficient invasion can be restored when this strain is complemented with TgMIC1wt, and complementation with TgMIC1-Δβfinger also significantly improves invasion efficiency. This suggests that the folding and activity of MAR domains is retained in the absence of the β-finger, ruling out the possibility that the loss of TgMIC4 recruitment is merely due to a loss of TgMIC1 structural integrity. Additionally, the lower invasion recovery observed with TgMIC1-Δβfinger compared with TgMIC1wt may be explained by the loss of cell adhesion via TgMIC4.
Intermolecular Covalent and Noncovalent Interactions Contribute to TgMIC1-4 Complex-To provide further details regarding the interaction between TgMIC1 and TgMIC4, we purified a native complex from T. gondii tachyzoites (virulent RH strain) as described previously (18) and analyzed it by both SDS-PAGE (Fig. 5A) and blue native gel electrophoresis (Fig. 5B). This complex runs at ~660 kDa on a native gel. Samples incubated in either 8 M urea or 6 M guanidinium chloride dissociate into two bands of ~440 and 220 kDa. Further SDS-PAGE analysis of the lower 220 kDa band indicated that it consists predominantly of TgMIC1. In each case, the stability of these subcomplexes in 8 M urea and 6 M guanidinium chloride suggests that the constituents interact via covalent bonds, most likely intermolecular disulfide bridges. The 220 kDa TgMIC1 band corresponds closely in mass to a trimeric TgMIC1 arrangement. An early molecular checkpoint in the assembly of TgMIC1-4-6 is the interaction between the EGF domains of TgMIC6 and the C-terminal galectin-like domain of TgMIC1. TgMIC6 possesses three EGF domains in which the TgMIC1-binding interface is conserved and therefore could recruit up to three molecules of TgMIC1 (supplemental Fig. S6) (26,37). This would suggest that although the first TgMIC6-EGF domain is eventually cleaved during trafficking of the complex through the secretory pathway, the TgMIC1 trimer would remain intact until secreted onto the parasite surface.
Figure legend fragment: ...; red) to the micronemes is disrupted (not rescued) in the mutant parasite carrying TgMIC1Δβ-finger and appears distributed through the early secretory pathways as observed in mic1ko. Lower panel, in contrast to TgMIC4, the trafficking of TgMIC6 is restored to the micronemes by the TgMIC1Δβ-finger. C, cell invasion assays comparing the T. gondii mutant strains demonstrate that the expression of TgMIC1Δβ-finger restores invasion in mic1ko to the wild-type level. The considerable overexpression of TgMIC1wt led to a significantly increased efficiency of invasion. Error bars indicate S.D. D, the plaque assay recapitulates several lytic cycles of the parasites. The mic1ko parasites expressing either TgMIC1wt or TgMIC1Δβ-finger form plaques of comparable size, whereas mic1ko form smaller plaques.
A disulfide-bonded 440-kDa TgMIC4/TgMIC1 species corresponds in molecular mass to approximately six molecules, with a possible arrangement being three of TgMIC4 joining a trimeric platform of TgMIC1. Although the arrangement of intermolecular disulfide bonds remains undefined, it is worth noting the proximity of a seventh, unpaired cysteine (Cys-263; known hereafter as C3′) from TgMIC4-A3 to the conserved disulfide linkages within the internal α-helix/β-hairpin loop and to A2 in our TgMIC4 model (Fig. 2). Recombinant production of TgMIC4-A3 yields a polydisperse sample and NMR analysis identifies three folded species in approximately equal quantities (supplemental Fig. S7, A and B). Peptide fingerprinting via MALDI (matrix-assisted laser desorption/ionization) mass spectrometry (MS) under nonreducing conditions indicates that one species contains the expected pattern of disulfide linkages (C1:C6, C2:C5, and C3:C4 with C3′ free; supplemental Fig. S7C). The two additional species contain mismatched and free C3, C3′, and C4. Due to the high homology with A1 and A5 (in which alanine replaces the free cysteine residue), it was reasoned that mutation of C3′ should prevent disulfide scrambling, resulting in a monodisperse sample. However, combined NMR and MALDI-MS data reveal that recombinant TgMIC4-A3 C263A adopts two stable conformations, in which C3/C4 are disulfide-linked and free, respectively (supplemental Fig. S7C). The dynamic nature of this region together with the labile nature of the disulfide pattern would provide a mechanism for covalent association with TgMIC1. Surface-associated disulfide isomerases may also contribute to the necessary shuffling of disulfide bonds. Interestingly, the homologue of MIC4 in the closely related species Neospora caninum does not contain this additional cysteine residue, and NcMIC1 (the homologue of TgMIC1) does not co-purify from parasite lysates (38,39). In summary, we propose a model (Fig. 5C) in which the TgMIC1-4 complex forms an array on the parasite surface composed of trimeric TgMIC1 and heterohexameric TgMIC1-4 subcomplexes, anchored via TgMIC6.
Figure legend fragment: ... 3 and 4, respectively). C, a proposed model of TgMIC1-4 assembly. The molecular masses of the native subcomplexes are consistent with the existence of TgMIC1-4 hetero-hexamer and TgMIC1 trimer species.
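The mass bookkeeping behind this model can be checked with a short sketch. It uses only the approximate native-gel masses quoted above (about 660, 440, and 220 kDa) and the proposed subunit counts (three and six); native-gel masses are rough estimates, so the per-subunit figures are indicative only, and the variable names are hypothetical.

```python
# Approximate apparent masses from blue native PAGE, in kDa (values from the text).
NATIVE_COMPLEX = 660
UPPER_BAND = 440   # proposed TgMIC1-4 hetero-hexamer
LOWER_BAND = 220   # proposed TgMIC1 trimer


def mass_per_subunit(band_kda, n_subunits):
    """Average apparent mass per polypeptide for an assumed stoichiometry."""
    return band_kda / n_subunits


trimer_subunit = mass_per_subunit(LOWER_BAND, 3)
hexamer_subunit = mass_per_subunit(UPPER_BAND, 6)

print(f"sum of subcomplex bands: {UPPER_BAND + LOWER_BAND} kDa "
      f"(native complex ~{NATIVE_COMPLEX} kDa)")
print(f"implied mass per subunit: trimer {trimer_subunit:.1f} kDa, "
      f"hexamer {hexamer_subunit:.1f} kDa")
```

Both stoichiometries imply a similar average apparent mass per chain, and the two bands add up to the native complex, which is what one would expect if the trimer and hexamer are its two building blocks.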
The Fifth Apple Domain of TgMIC4 Is a Lectin with Specificity for Galactose-terminating Oligosaccharides-To assess the carbohydrate-binding properties of TgMIC4, we carried out carbohydrate microarray analyses using the recombinant proteins TgMIC4-A5, TgMIC4-A56, and TgMIC4-A12. The microarrays encompassed a panel of 400 lipid-linked oligosaccharide probes representing diverse mammalian glycan sequences and their analogs, as well as sequences derived from fungal and bacterial polysaccharides. These are arranged based on negative charge (neutral and acidic), sialyl linkages, and backbone sequences (Fig. 6A and supplemental Table S2). TgMIC4-A12 showed no significant binding to any of the probes in the microarray (data not shown), whereas TgMIC4-A56 and TgMIC4-A5 bound, with similar profiles, to a diverse range of oligosaccharide probes terminating in β-galactose (Gal) (results for TgMIC4-A5 shown in Fig. 6 and supplemental Table S2). The probes bound include a wide range of neutral sequences and several acidic sequences that are sialylated and sulfated at inner residues. This is clearly distinct from the binding specificity of TgMIC1-MARR, which bound exclusively to sialic acid-terminating sequences (Fig. 6B and supplemental Fig. S8) (13).
The reciprocity in the binding signals of TgMIC4-A5 and TgMIC1-MARR to N-glycans and gangliosides is clearly shown in the matrix presentation (Fig. 6B). Whereas the Gal-terminating N-glycan NA2F was bound by TgMIC4-A5, the disialylated analog A2F(2-3) was bound only by TgMIC1-MARR. Unlike TgMIC4-A5, TgMIC1-MARR did not bind to GM1-related probes, but it bound strongly to the closely related members of the ganglioside family, e.g. GM2 and GT1b, which were not recognized by TgMIC4-A5.
A closer comparison of TgMIC4-A5 binding to GM1-related sequences was performed by microarray analyses in dose-response format using asialo-GM1, SM1a, SB1a, and GM1 glycolipids (Fig. 6C). Here also, TgMIC4-A5 gave no binding signals with SB1a, indicating that an unmodified terminal Gal is important for binding, whereas a negative charge at position 3 of the internal Gal residue contributes positively to the binding strength.
Atomic Resolution Insight into TgMIC4-A5 Oligosaccharide Ligand Interactions-The solution structure of the fifth apple domain of TgMIC4, comprising residues 410-491 (TgMIC4-A5), was determined using NMR spectroscopy, revealing the expected canonical apple domain fold (supplemental Fig. S9 and Table S3). The ensemble of the 10 lowest-energy structures has been deposited in the Protein Data Bank under accession number 2LL3. To investigate the binding mode of TgMIC4-A5 in more detail, NMR titration experiments were performed with Gal, lactose (Galβ1-4Glc), LacNAc (Galβ1-4GlcNAc), lacto-N-biose (Galβ1-3GlcNAc), and GM1-penta (Galβ1-3GalNAcβ1-4(Neu5Acα2-3)Galβ1-4Glc). Each ligand induced a significant number of chemical shift perturbations in the TgMIC4-A5 ¹H-¹⁵N-HSQC spectrum, indicative of an interaction with each of the ligands (Fig. 7, A and E). The pattern of shift perturbations was similar for each ligand, indicating a conserved binding pocket, the core of which lies at the junction between the two-stranded and four-stranded β-sheets. Where possible (i.e. fast exchange), shift perturbations were used to estimate the dissociation constant (Kd) for the interaction (Fig. 7C). Galactose binds with a Kd of ~2.6 × 10⁻⁴ M, whereas lactose and LacNAc each bind with Kd values of ~1.6 × 10⁻⁴ M. The β1,3-linked analog (lacto-N-biose) binds more tightly, in the intermediate-exchange regime, with a Kd of ~1.1 × 10⁻⁴ M determined using isothermal titration calorimetry (supplemental Fig. S10). Of all the ligands tested, the ganglioside oligosaccharide GM1-penta (lacking the ceramide tail) was found to bind most tightly, in the slow-exchange regime (i.e. Kd ~10⁻⁵ M) (supplemental Table S4). This is in overall agreement with the results observed in the microarray analyses.
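For ligands in fast exchange, a dissociation constant is typically extracted by fitting the observed shift change to a 1:1 binding isotherm. The sketch below shows one way this could be done, assuming a single-site model and using a simple grid search in place of the fitting software actually used (which the text does not name). The protein concentration, ligand series, and maximum shift are hypothetical; the synthetic data are generated with the galactose Kd quoted above purely to show that the fit recovers it.

```python
import math


def fraction_bound(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding (all concentrations in M)."""
    b = p_tot + l_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, ligand_concs, shifts, kd_min=1e-6, kd_max=1e-1, n_grid=2001):
    """Grid search over Kd (log-spaced); the maximum shift is solved linearly."""
    best = None
    for i in range(n_grid):
        kd = kd_min * (kd_max / kd_min) ** (i / (n_grid - 1))
        f = [fraction_bound(p_tot, l, kd) for l in ligand_concs]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        d_max = sum(x * y for x, y in zip(f, shifts)) / denom
        sse = sum((d_max * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]


# Hypothetical titration: 0.1 mM protein, ligand from 0 to 3.2 mM.
P_TOT = 1e-4
LIGAND = [0.0, 5e-5, 1e-4, 2e-4, 4e-4, 8e-4, 1.6e-3, 3.2e-3]
TRUE_KD, TRUE_DMAX = 2.6e-4, 0.20  # Kd from the text; 0.20 ppm is an assumed value
observed = [TRUE_DMAX * fraction_bound(P_TOT, l, TRUE_KD) for l in LIGAND]

kd_fit, dmax_fit = fit_kd(P_TOT, LIGAND, observed)
print(f"fitted Kd = {kd_fit:.2e} M, fitted max shift = {dmax_fit:.3f} ppm")
```

With real data, several well-resolved peaks would usually be fitted together, and the approach breaks down outside fast exchange, which is consistent with the use of calorimetry for lacto-N-biose.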
To further characterize the mechanism of galactose recognition by TgMIC4-A5, a structural model of a TgMIC4-A5·lacto-N-biose complex was calculated using HADDOCK, a computer program for data-driven molecular docking (23) (Fig. 7, structure statistics in supplemental Table S6). An ensemble of 10 low-energy structures has been deposited in the Protein Data Bank under accession number 2LL4. Based on the NMR titration data, ambiguous interaction restraints were implemented for residues Lys-428, Asn-460, Tyr-467, Lys-469, Tyr-476, and Tyr-478. These data were complemented by the detection and assignment of seven intermolecular NOEs (nuclear Overhauser enhancements) between TgMIC4-A5 and lacto-N-biose, measured in a ¹³C-filtered ¹³C-HSQC-NOESY (nuclear Overhauser enhancement spectroscopy) spectrum (Fig. 7A and supplemental Table S5). For TgMIC4-A5, intermolecular NOEs were restricted entirely to aromatic nuclei, whose chemical shifts in the bound state were assigned via an NMR titration using a ¹H-¹³C-HSQC (aromatic-selective) experiment (supplemental Fig. S11A). For lacto-N-biose, the presence of a 5-fold excess resulted in detection of effectively free-state chemical shifts, as verified via comparison of conventional and ¹³C-edited ¹H-¹H TOCSY spectra of, respectively, free and bound disaccharide (supplemental Fig. S11B). ¹H chemical shift assignment of the free-state molecule was therefore carried out (supplemental Fig. S11, C and D), enabling the intermolecular NOE assignment to be completed. Assignments were implemented as distance restraints for molecular docking of TgMIC4-A5 and lacto-N-biose.
Due to the low molecular weight of the protein and intermediate-exchange regime of the interaction, it was not possible to obtain data regarding the bound-state conformation of lacto-N-biose using established transferred NOE methods. Coupled with the availability of only a small number of unambiguous distance restraints from intermolecular NOEs, this precluded the calculation of a full experimental structure using a program such as ARIA, and instead HADDOCK was utilized in a similar manner to previous studies (41-43).
The structural model of TgMIC4-A5/lacto-N-biose suggests that the protein adopts a similar mechanism of galactose recognition to the Galectin family proteins (44), despite TgMIC4-A5 being structurally distinct (supplemental Fig.  S12A). The galactose ring stacks against the aromatic ring of Tyr-467 (in an equivalent position to the conserved tryptophan residue of Galectins) and forms hydrogen bonds with the side chains of Asn-460, Lys-469, and Tyr-476. The side chains of Lys-428 and Tyr-478 form steep walls at either end of the pocket and may provide additional ligand contacts. To confirm our structural description for galactose recognition we created binding site mutants (namely K428A, K469M, and Y478L), checked their foldedness, and reassessed their carbohydrate binding by NMR (supplemental Fig. S13). Affinity for galactose remained unaffected by the K428A mutation, which is consistent with the interaction with the backbone atoms in this region. Diminished binding of Gal was observed for TgMIC4-A5 Y478L (a K d of ϳ1.7 mM was determined via NMR titration data) suggesting that this residue provides important contacts, consistent with the observance of intermolecular NOEs to its aromatic ring. The interaction was completely abolished in TgMIC4-A5 K469M , suggesting that this residue forms a key hydrogen bond.
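The effect of the Y478L mutation can be put on an energy scale by converting the two galactose dissociation constants into a difference in binding free energy. The sketch below uses the standard relation between free energy and Kd. The temperature of 298.15 K is an assumption, because the titration temperature is not given in the text, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd; temperature is assumed."""
    return R_KCAL * temp_k * math.log(kd_molar)


def ddg_mutant(kd_mutant, kd_wild_type, temp_k=298.15):
    """Free energy penalty of a mutation (positive means weaker binding)."""
    return R_KCAL * temp_k * math.log(kd_mutant / kd_wild_type)


KD_WT_GAL = 2.6e-4     # M, wild-type TgMIC4-A5 with galactose (from the text)
KD_Y478L_GAL = 1.7e-3  # M, Y478L mutant with galactose (from the text)

fold = KD_Y478L_GAL / KD_WT_GAL
penalty = ddg_mutant(KD_Y478L_GAL, KD_WT_GAL)
print(f"Y478L weakens galactose binding about {fold:.1f}-fold "
      f"(about {penalty:.1f} kcal/mol at the assumed temperature)")
```

Under these assumptions the mutation costs on the order of 1 kcal/mol, a modest penalty consistent with Tyr-478 contributing contacts without being essential, in contrast to K469M, which abolishes binding.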
The only other structural insight that is available for an apple domain-carbohydrate complex is from the crystal structure of a hepatocyte growth factor-NK1·heparin complex (45). Whereas general features of the binding site are shared, such as hydrogen bonding to lysine residues, heparin binds to a face of the apple domain that is completely different from that of TgMIC4-A5 bound by Gal-terminating sequences (supplemental Fig. S12). This study therefore identifies a new mode of oligosaccharide recognition by apple domains.
These data prompt us to revisit two other microneme proteins for which lactose binding has been suggested. The Sarcocystis muris lectin, SML-2 (46), displays the same arrangement of Gal-binding residues and would be predicted to use the same mode of recognition as TgMIC4 (supplemental Fig. S2A). The microneme proteins EtMIC4 and EtMIC5 of Eimeria tenella form a high molecular weight complex that is also pulled down by lactose chromatography and binds host cells (47). EtMIC5 contains 11 apple domains and by comparison with TgMIC4 we deduce that the seventh apple domain may be a lectin similar to TgMIC4-A5 (supplemental Fig. S2A). Gal/GalNAc-specific lectins have also been identified in other protozoan parasites; examples include surface proteins from Cryptosporidium spp. (48), Entamoeba histolytica (49), and Trypanosoma cruzi (50). It is possible that the mode of galactose recognition characterized here is conserved in these more distantly related organisms.
FIGURE 6 (legend fragment). ... Table S2. C, microarray dose-response analyses of the binding of TgMIC4-A5 to asialo-GM1, GM1, SM1a, and SB1a printed at 0.3, 1, 2, and 5 fmol/spot.
TgMIC1 and TgMIC4 Cannot Simultaneously Bind GM1-As previously reported, TgMIC1 binds to a range of sialylated glycans with a preference for the α2-3 sialic acid linkage. It has been suggested that recognition of sialylated sequences, such as those found on gangliosides, may be important for the tropism of the parasite to the brain with formation of cysts in the intermediate hosts (13). The binding studies carried out in this work have revealed that TgMIC4-A5 has galactose-binding activity and binds strongly to the oligosaccharide moiety of ganglioside GM1. GM1 possesses both terminal galactose and side chain sialic acid moieties, and is often targeted by microbial pathogens; for example, it is recognized by the cholera toxin from Vibrio cholerae (51) and the major capsid protein VP1 from simian virus 40 (52,53). Given that GM1 contains α2-3-linked sialic acid on the inner Gal residue, which is bound by TgMIC1, it was reasoned that TgMIC1 should also be capable of binding to GM1-penta, although the affinity is likely to be weak as there was no significant binding to GM1 by TgMIC1-MARR in the solid phase microarray analyses (Fig. 6B). The capability of TgMIC1-MARR to interact with GM1-penta in solution was indeed demonstrated via NMR chemical shift perturbation analysis (supplemental Fig. S14). To test the ability of TgMIC1-MARR and TgMIC4-A5 to bind to GM1 simultaneously, we performed a sequential NMR titration experiment. GM1 was first titrated into ¹⁵N-labeled TgMIC1-MARR and binding was monitored by specific peak perturbations in ¹H-¹⁵N-HSQC spectra. After saturation, ¹³C-¹⁵N-labeled TgMIC4-A5 was then titrated into the complex and the interaction of both microneme proteins was monitored by ¹H-¹⁵N-HSQC (for TgMIC1-MARR) and ¹H-¹⁵N two-dimensional HNCO (for TgMIC4-A5) spectra (supplemental Fig. S14). The data show that TgMIC4 efficiently displaces TgMIC1 at an equimolar ratio of TgMIC4 to glycan, suggesting that the affinity of TgMIC4-A5 for GM1 is higher than that of TgMIC1-MARR. Furthermore, we can deduce that the interaction of TgMIC4-A5 with GM1 occludes the sialic acid branch, thereby preventing binding by TgMIC1-MARR. Interestingly, microarray data reveal stronger binding of TgMIC4-A5 to GM1 and SM1a (sulfated analog of GM1) than to asialo-GM1 (Fig. 6C), suggesting a contribution from the acidic moiety. Examination of the surface electrostatics reveals several regions of significant positive charge adjacent to the galactose-binding pocket that would likely stabilize an interaction with the negative charge of sialic acid (supplemental Fig. S15).
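The displacement argument can be made concrete with a simple equilibrium sketch in which two proteins compete for one glycan and binding is mutually exclusive, as inferred for TgMIC1-MARR and TgMIC4-A5 on GM1. All concentrations and dissociation constants below are hypothetical placeholders, since affinities for GM1 are not reported in this section; the code only shows that adding a tighter binder at an equimolar ratio to glycan lowers the occupancy of the weaker one.

```python
def free_ligand(l_tot, proteins):
    """Solve the mass balance for free ligand by bisection.

    proteins: list of (total_concentration, kd) pairs, all in mol/L.
    Each protein binds the ligand 1:1 and binding is mutually exclusive.
    """
    def excess(l_free):
        bound = sum(p * l_free / (kd + l_free) for p, kd in proteins)
        return l_free + bound - l_tot

    lo, hi = 0.0, l_tot  # excess(lo) <= 0 <= excess(hi), and excess is monotonic
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def occupancy(l_tot, proteins):
    """Fraction of each protein that is ligand-bound at equilibrium."""
    l_free = free_ligand(l_tot, proteins)
    return [l_free / (kd + l_free) for _, kd in proteins]


# Hypothetical values: a weak binder (kd 1 mM) alone with 1 mM glycan,
# then with a tighter binder (kd 10 uM) added equimolar to the glycan.
weak = (1e-4, 1e-3)
tight = (1e-3, 1e-5)
before = occupancy(1e-3, [weak])
after = occupancy(1e-3, [weak, tight])
print(before[0] > after[0])
```

In this toy model the drop in occupancy of the weak binder comes entirely from depletion of free glycan, which is the situation expected when the two binding sites on the glycan overlap.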
Glycan Recognition by TgMIC1-4-6 Complex and Its Biological Relevance-Our observation that TgMIC4 is capable of displacing TgMIC1 from GM1 suggests that even though these adhesins are present within the same adhesive complex they are likely to exploit different carbohydrate ligands on the host cell surface. If the distinct sialyl and galactose-binding preferences of TgMIC1 and TgMIC4 are purely adhesive then this dual recognition may allow the parasite to exploit both sialic acid-dependent and -independent invasion mechanisms. Modulation of the sialic acid dependence has been observed in Plasmodium falciparum (54), and the proteolytic trimming of TgMIC4 on the parasite surface and subsequent loss of galactose binding could provide the necessary switch (36). Alternatively, the different binding specificities of TgMIC1 and TgMIC4 may have a special implication in the preferential tissue/cell tropisms of the parasite in the brain, where different ganglioside molecules are abundant and their oligosaccharide moieties exposed on the cell surface. It is possible that on those cells that express high affinity ganglioside ligands for both TgMIC1 and TgMIC4 there would be an amplification of the binding strength.
FIGURE 7. The solution structure of a TgMIC4-A5·lacto-N-biose complex. A, superimposed section of the ¹H-¹⁵N-HSQC spectrum of TgMIC4-A5 before (black) and after (green) the addition of 5 molar eq of lacto-N-biose. A large number of significant chemical shift perturbations are observed. B, data strips from the ¹³C-filtered NOESY-HSQC spectrum of the TgMIC4-A5·lacto-N-biose complex. Seven intermolecular NOEs were detected and assigned (as annotated). C, Kd determination for the TgMIC4-A5/galactose interaction using NMR data. Normalized and combined ¹HN/¹⁵NH chemical shift changes for five TgMIC4-A5 residues were plotted as a function of galactose concentration. Curves were fit using a single-site binding model via least-squares linear refinement, yielding Kd values (as described in Ref. 61). These were averaged, yielding a Kd of ~2.6 × 10⁻⁴ M. D, an ensemble of 10 low-energy structures, depicted as backbone heavy atom traces. E, a schematic representation of an example structure. The galactose ring of lacto-N-biose stacks against the aromatic ring of Tyr-467, analogous to the mechanism of galactose recognition by galectins.
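The single-site fit mentioned in the legend to Fig. 7C can be sketched as follows. This is a minimal illustration and not the procedure of Ref. 61: it assumes a 1:1 model with ligand depletion, a hypothetical protein concentration, and synthetic noise-free data generated at the reported Kd of ~2.6 × 10⁻⁴ M. The function names and the search strategy (a coarse scan followed by golden-section refinement) are choices made for this example only.

```python
import math


def fraction_bound(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding, allowing for ligand depletion."""
    s = p_tot + l_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, ligand, shifts, log_lo=-6.0, log_hi=-1.0, n_grid=201):
    """Least-squares fit of Kd and the saturating shift change.

    The model is linear in the saturating shift, so for each trial Kd the
    best value has a closed form; Kd is located by a coarse scan over
    log10(Kd) followed by golden-section refinement around the best point.
    """
    def sse_at(log_kd):
        kd = 10.0 ** log_kd
        f = [fraction_bound(p_tot, l, kd) for l in ligand]
        dmax = sum(y * fi for y, fi in zip(shifts, f)) / sum(fi * fi for fi in f)
        sse = sum((y - dmax * fi) ** 2 for y, fi in zip(shifts, f))
        return sse, kd, dmax

    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [log_lo + i * step for i in range(n_grid)]
    best = min(grid, key=lambda x: sse_at(x)[0])
    a, b = max(log_lo, best - step), min(log_hi, best + step)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse_at(c)[0] < sse_at(d)[0]:
            b = d
        else:
            a = c
    _, kd, dmax = sse_at(0.5 * (a + b))
    return kd, dmax


# Synthetic titration: hypothetical 0.1 mM protein, galactose from 0 to 5 mM.
protein = 1e-4
galactose = [0.0, 5e-5, 1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3]
observed = [0.2 * fraction_bound(protein, l, 2.6e-4) for l in galactose]
kd_fit, dmax_fit = fit_kd(protein, galactose, observed)
print(f"Kd = {kd_fit:.2e} M")
```

In practice each residue would be fit separately and the resulting Kd values averaged, as the legend describes; with real, noisy data a dedicated nonlinear least-squares routine would normally replace the simple search used here.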
The Gal-specific lectin activity may also fulfill a role that is independent of cell adhesion and TgMIC1. Interestingly, a novel role has been suggested for TgMIC4 or a TgMIC4-like protein in the oocyst stage (55), where it is released into the parasitophorous vacuole, the space in which the parasite replicates inside the host cell (56). As revealed by our microarray and structural studies, the carbohydrate binding profile of TgMIC4 resembles that of the galectins, a family of eukaryotic lectins with roles in regulating cell adhesion, receptor activation, intracellular signaling, apoptosis, and immune system function. Furthermore, a growing body of evidence suggests that parasites can actively usurp galectin activity to help propagate an infection as well as keep the immune system in check (57). Galectin function can also contribute to any stage of an infection by altering the magnitude and quality of the immune response. Specifically, galectin activity controls the balance between anti-apoptotic and pro-apoptotic signals, activation of immune cells, and cytokine secretion. It is conceivable that proteolytic maturation of TgMIC4 provides a mechanism to liberate a soluble galectin-like lectin, which could subsequently contribute independently to parasite dissemination or down-regulation of the host immune response. This would be reminiscent of the Gal/GalNAc-binding surface protein from the intestinal parasite E. histolytica, which is essential for adhesion to target cells, cytotoxicity, and the inhibition of human complement (49). Recently, it has been shown that engagement of glycosylphosphatidylinositol-anchored proteins present on the surface of the T. gondii tachyzoite by galectins may serve to activate immunity (58). It is also worth noting that mice vaccinated with the NcMIC4 antigen were more susceptible to neosporosis (59).
Exhaustive analysis of apicomplexan genomes reveals several other MIC4-like proteins secreted by organelles involved in invasion and one secreted into the parasitophorous vacuole postinvasion, suggesting that galactose recognition might be a ubiquitous strategy by which the parasites control the host response (60).
Concluding Remarks-This work complements previous models of the TgMIC1-4-6 complex, providing new insight into the location of the interaction between the second apple domain of TgMIC4 and the β-finger of TgMIC1, and a possible stoichiometry of the macromolecular complex. Glycan binding has been localized to the fifth apple domain of TgMIC4, and its specificity for galactose-terminating oligosaccharides has been established. These findings are summarized in a schematic model of the TgMIC1-4 subcomplex (supplemental Fig. S16). The similarity of carbohydrate recognition by the TgMIC4-A5 lectin to that of galectins, and its biological significance, are subjects for future functional studies.