The Opportunistic Pathogen Toxoplasma gondii Deploys a Diverse Legion of Invasion and Survival Proteins

Host cell invasion is an essential step during infection by Toxoplasma gondii, an intracellular protozoan that causes the severe opportunistic disease toxoplasmosis in humans. Recent evidence strongly suggests that proteins discharged from Toxoplasma apical secretory organelles (micronemes, dense granules, and rhoptries) play key roles in host cell invasion and survival during infection. However, to date, only a limited number of secretory proteins have been discovered, and the full spectrum of effector molecules involved in parasite invasion and survival remains unknown. To address these issues, we analyzed a large cohort of freely released Toxoplasma secretory proteins by using two complementary methodologies, two-dimensional electrophoresis/mass spectrometry and liquid chromatography/electrospray ionization-tandem mass spectrometry (MudPIT, shotgun proteomics). Visualization of Toxoplasma secretory products by two-dimensional electrophoresis revealed ∼100 spots, most of which were successfully identified by protein microsequencing or matrix-assisted laser desorption ionization-mass spectrometry analysis. Many proteins were present in multiple species, suggesting that they are subjected to substantial post-translational modification. Shotgun proteomic analysis of the secretory fraction revealed several additional products, including novel putative adhesive proteins, proteases, and hypothetical secretory proteins similar to products expressed by other related parasites including Plasmodium, the etiologic agent of malaria. A subset of novel proteins was re-expressed as fusions to yellow fluorescent protein, and this initial screen revealed shared and distinct localizations within secretory compartments of T. gondii tachyzoites. These findings provide a uniquely broad view of Toxoplasma secretory proteins that participate in parasite survival and pathogenesis during infection.

The protozoan Toxoplasma gondii is a human pathogen that causes severe opportunistic disease (toxoplasmosis) in congenitally infected babies and immunocompromised individuals (e.g. AIDS) (1,2). Although foodborne transmission via ingestion of infected meat products contributes to the steady rise in age-dependent seroprevalence, waterborne transmission of the highly infectious, feline-derived oocyst stage has led to recent outbreaks (3,4). The efficiency of waterborne transmission and the availability of drug-resistant strains have raised awareness of toxoplasmosis as a threat to public health from natural outbreaks or the potential malicious contamination of public water sources. Because effective control of outbreak situations will be enhanced by having multiple options for the diagnosis and treatment, the identification and exploitation of novel parasite targets is of acute importance.
T. gondii belongs to the phylum Apicomplexa, which also includes several other notable pathogens such as Plasmodium (the agents of malaria), Cryptosporidium (the cause of cryptosporidiosis), and Eimeria (the cause of coccidiosis). For several aspects of apicomplexan biology, Toxoplasma is emerging as an important model organism (5) because it exhibits many of the features and capabilities that define the phylum, yet it is more amenable to experimental manipulation than many of its kin.
Toxoplasma is an obligate intracellular parasite and must invade a vertebrate host cell for survival and replication. Invasion initiates the lytic cycle leading to the cell and tissue destruction that is a hallmark feature of Toxoplasma pathology. Toxoplasma invasion is a rapid (<30 s), dynamic, and complex process that relies on the secretion of numerous proteins from specialized secretory organelles, including micronemes, rhoptries, and dense granules (6). Previous studies revealed that the sequential secretion of these proteins is critical for parasite invasion and establishment of infection (7,8). Micronemal proteins (MICs) are released first upon the parasite apical attachment to a host cell and function in host cell attachment and penetration (reviewed in Ref. 9). Next, the contents of rhoptries are discharged and are thought to be critical for biogenesis of the parasitophorous vacuole (PV) that envelops the parasite during invasion and its interaction with host cellular organelles (10,11). Finally, dense granule proteins (DGs) are exocytosed both during and after invasion and are thought to function in intracellular survival and replication (12-14). Recent studies have also suggested that the parasite releases products capable of manipulating the host immune response (15,16). Also, secretory products and surface antigens (SAGs) have received scrutiny as promising diagnostic markers (17,18). Despite this progress, only a limited number of secretory proteins have been discovered to date. This point is underscored by the recent bioinformatic identification of >800 genes encoding proteins with a putative secretory signal peptide (www.toxodb.org), whereas fewer than 30 Toxoplasma secretory proteins have been described. Although many of the putative secretory proteins may be retained in internal compartments such as the ER/Golgi or apicoplast (a plastid-like organelle), a significant fraction is probably exported to external sites for interaction with the host.
The recent coupling of high resolution protein separation techniques (two-dimensional electrophoresis or multichromatography) with high throughput identification strategies has permitted the wide scale analysis of protein identity, expression, modification, localization, and interactions within whole cells or subcellular fractions (19,20). With these tools, it is possible to analyze a multitude of cellular proteins from a more global perspective. The recent completion of parasite genome projects (21-23) has fueled proteomic studies of Plasmodium (24,25), Eimeria tenella (26), Trypanosoma cruzi (27), and Leishmania major (28). The Toxoplasma genome is nearing completion, and two studies have described preliminary mapping and partial identification of the tachyzoite (a rapidly dividing form of the parasite) proteins, either by two-dimensional electrophoresis immunoblotting (29) or by two-dimensional electrophoresis coupled with MALDI-MS (30). Although these studies provided a valuable initial view of the protein complement for the parasite, a relatively small subset of the Toxoplasma proteome was examined.
Here we use complementary approaches to examine a cohort of excreted/secreted antigens (ESA) freely released by Toxoplasma tachyzoites, the life stage responsible for the tissue pathology observed in toxoplasmosis. Our findings reveal that Toxoplasma mobilizes a rich assortment of putative effector molecules that may interact with the host during infection. These include adhesins that presumably contribute to attachment and penetration, proteases that may facilitate invasion or migration into host tissues, metabolic enzymes potentially involved in nutrient acquisition, and a series of hypothetical proteins of unknown function. These findings significantly widen the viewing aperture of the molecular players contributing to Toxoplasma survival and pathogenesis during infection.

EXPERIMENTAL PROCEDURES
Reagents and Chemicals-Porcine trypsin (sequencing grade) was from Promega (Madison, WI). MS calibration mixture II (peptide mass standard kit) was from Applied Biosystems (Framingham, MA). One-dimensional electrophoresis and two-dimensional electrophoresis reagents and immobilized IEF strips were from Bio-Rad or Amersham Biosciences. The MALDI matrix, α-cyano-4-hydroxycinnamic acid, was purchased from Sigma. All organic solvents were HPLC grade. All other reagents and chemicals were obtained from either Fisher or Sigma and were of the highest purity available.
Cell Culture and Large Scale ESA Preparation-Parasite culture and ESA preparation were performed according to Ref. 31. Briefly, T. gondii strain 2F was propagated in human foreskin fibroblast cells. Freshly egressed tachyzoites were harvested by passage twice through a 20-gauge needle followed by filtration through a 3-μm pore size membrane to remove host cell debris. Parasites were washed twice by centrifugation in D0 medium (Dulbecco's modified Eagle's medium, 2 mM glutamine, 10 mM HEPES). Large scale preparation of ESA proteins was performed by incubating ~4 × 10⁹ filter-purified tachyzoites in 15 ml of D0 plus 1% (v/v) ethanol at 37°C for 20 min followed by cooling on ice for 5 min. Parasites were removed by centrifugation (1000 × g, 10 min, 4°C). The supernatant, which was supplemented with a mixture of proteinase inhibitors, was concentrated to ~400 μl using C-20 concentrators according to the manufacturer's instructions (Millipore; Billerica, MA).
Two-dimensional Gel Electrophoresis-ESA proteins were separated in the first dimension on 11- or 18-cm immobilized dry strips (pH 3-10, pH 4-7) using an isoelectric focusing system (Amersham Biosciences or Bio-Rad). 120 μg of ESA protein was mixed with rehydration buffer containing 8 M urea, 2% CHAPS, 0.5% carrier ampholytes buffer, and 65 mM dithiothreitol. Rehydration and isoelectric focusing were performed according to the manufacturer's instructions. Following isoelectric focusing, proteins were reduced and alkylated by successive 15-min treatments with equilibration buffer containing 2% dithiothreitol followed by 2.5% iodoacetamide. Proteins were then resolved in the second dimension on an SDS-polyacrylamide gel (12.5% homogeneous gel for the Amersham Biosciences system, 10-20% gradient gel for Bio-Rad). Resolved proteins were either transferred directly to polyvinylidene difluoride membrane for N-terminal sequencing or stained with colloidal Coomassie Blue stain (32) for MALDI-MS analysis.
N-terminal Sequencing-Proteins on a two-dimensional electrophoresis gel were transferred to Immobilon P SQ membranes (Millipore) and stained with Coomassie Brilliant Blue R-250. Bands were excised and subjected to Edman degradation on a PerkinElmer Life Sciences model 477A gas-phase protein sequencer.
In-gel Digestion, Peptide Extraction, MALDI-MS Analysis, and Data Base Searching-Protein spots from two-dimensional electrophoresis gels were excised, in-gel digested with trypsin (12.5 ng μl⁻¹ in 50 mM NH4HCO3), and solvent-extracted as described previously (33). After extraction, peptide mixtures were dried down and re-dissolved in 2 μl of 50% acetonitrile and 0.3% trifluoroacetic acid, mixed with saturated α-cyano-4-hydroxycinnamic acid (matrix) solution, and deposited on the MALDI-MS target plate according to a general two-layer method (34). The MS analysis of digested peptides was performed with a Voyager DE-STR MALDI-time-of-flight mass spectrometer (PerSeptive Biosystems, Framingham, MA). The spectra were acquired in the reflection-positive mode with delayed extraction, and external peptides or proteins adjacent to the sample spots were used for MS calibration. Internal calibration was refined using selected trypsin autolysis ions. Peptide monoisotopic masses were used to search against the Toxoplasma NCBI data base via the MS-Fit algorithm (prospector.ucsf.edu) or to search Toxoplasma protein/EST/genome data bases via an in-house installed Mascot program (35). In both cases, a mass tolerance of ±50 ppm was allowed in the data base searches. Generally, a minimum of four peptide matches, sequence coverage of at least 15%, and a match to a T. gondii entry were required for a true positive match when using the MS-Fit algorithm, whereas a score higher than the Mascot significance score was regarded as a positive match in Mascot data base searches.
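The MS-Fit acceptance criteria stated above (±50 ppm mass tolerance, at least four matched peptides, at least 15% sequence coverage, and a T. gondii entry) can be illustrated with a minimal Python sketch. This is not the MS-Fit or Mascot implementation; the function names and the simple one-to-many matching are assumptions made only for illustration.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count theoretical peptide masses matched by at least one observed
    peak within the ppm tolerance (hypothetical helper, not MS-Fit code)."""
    return sum(
        any(abs(ppm_error(obs, theo)) <= tol_ppm for obs in observed)
        for theo in theoretical
    )


def accept_ms_fit_hit(n_matched, coverage_percent, is_t_gondii_entry,
                      min_peptides=4, min_coverage=15.0):
    """Apply the three criteria quoted in the text for a positive PMF match."""
    return bool(
        n_matched >= min_peptides
        and coverage_percent >= min_coverage
        and is_t_gondii_entry
    )
```

For example, a peak observed at 1000.04 Da against a theoretical mass of 1000.00 Da has an error of about 40 ppm and would count as a match, whereas 1500.2 Da against 1500.0 Da (about 133 ppm) would not.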
MudPIT Analysis and Protein Identification-Lyophilized ESA proteins were reconstituted in 1 ml of 50 mM ammonium bicarbonate solution. The proteins were reduced by adding 10 μl of dithiothreitol (1 M) and incubating at 37°C for 2 h. Cysteines were alkylated by adding 10 μl of iodoacetic acid (1 M in 1 N NaOH) and incubating in the dark for 30 min at room temperature. The sample was spun through a Microcon YM-3 concentrator until dry and resuspended in 1 μl of 50 mM ammonium bicarbonate buffer for trypsin digestion.
The digested ESA peptide mixtures were buffer-exchanged into 5% acetonitrile, 0.1% acetic acid prior to analysis. The MudPIT experiments were performed according to the manufacturer's instructions on a ProteomeX work station (Thermo-Finnigan, San Jose, CA), which is composed of an autosampler, two HPLC pumps, a 10-port column-switching valve, and a Deca-XP ion trap mass spectrometer with a micro-ESI interface. For one-dimensional MudPIT, HPLC separation of peptide mixtures was conducted through reverse phase chromatography only, whereas for two-dimensional MudPIT, the HPLC separation was performed through ion exchange chromatography in the first dimension and then reverse phase chromatography in the second dimension. Specifically, in the case of two-dimensional MudPIT, peptides were bound to a strong cation exchanger (BioBasic SCX, 0.32 mm × 10 cm, Thermo Hypersil) and eluted stepwise with five NH4Cl concentrations (10, 50, 100, 200, and 500 mM, respectively). Peptides eluted from each salt step were further separated on a reverse phase capillary column (BioBasic C18, 300 Å, 5 μm silica, 180 μm × 10 cm) using an acetonitrile gradient (0-60% solvent B in solvent A, A = 0.1% formic acid; B = 100% acetonitrile containing 0.1% formic acid). The mass/charge (m/z) ratios of eluted peptides and fragmented ions were analyzed by an LCQ Deca XP ion trap mass spectrometer. Following each full scan mass spectrum, two MS/MS spectra of the top two most intense peaks were acquired. The dynamic exclusion feature was enabled to obtain MS/MS spectra of co-eluted peptides.
Protein identification was performed by searching the T. gondii protein subset of NCBI "nr," the amino acid TgTwinScan gene predictions, and the clustered EST data bases using the TurboSequest algorithm (36) in the Bioworks 3.1 software package (Thermo Finnigan). The identified peptides were further evaluated using charge-state versus cross-correlation number (Xcorr). The criteria for positive identification of peptides were Xcorr >1.5 for singly charged ions, Xcorr >2.0 for doubly charged ions, and Xcorr >2.5 for triply charged ions (37). Low scoring peptide matches (Xcorr <1.9/2.2/3.1 for +1/+2/+3, respectively) were verified by manual inspection of MS/MS spectra. Additionally, only proteins with at least two peptides of distinct sequence identified were accepted.
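The charge-dependent Xcorr filter described above can be written as a short Python sketch. The thresholds are those given in the text; the function names, the tuple layout of a hit, and the way manual confirmation is passed in are illustrative assumptions and do not reflect the Bioworks or TurboSequest interfaces.

```python
from collections import defaultdict

# Thresholds quoted in the text, keyed by peptide charge state.
ACCEPT_ABOVE = {1: 1.5, 2: 2.0, 3: 2.5}   # minimum Xcorr for a positive match
MANUAL_BELOW = {1: 1.9, 2: 2.2, 3: 3.1}   # matches below this need inspection


def classify_peptide(charge: int, xcorr: float) -> str:
    """Return 'reject', 'manual_check', or 'accept' for one peptide match."""
    if charge not in ACCEPT_ABOVE or xcorr <= ACCEPT_ABOVE[charge]:
        return "reject"
    if xcorr < MANUAL_BELOW[charge]:
        return "manual_check"
    return "accept"


def accepted_proteins(hits, manually_confirmed=frozenset()):
    """hits: iterable of (protein_id, peptide_sequence, charge, xcorr).

    A protein is kept only if at least two peptides of distinct sequence
    pass the filter; low-scoring peptides count only when their sequence is
    listed in manually_confirmed (a stand-in for manual spectrum inspection).
    """
    peptides = defaultdict(set)
    for protein, sequence, charge, xcorr in hits:
        status = classify_peptide(charge, xcorr)
        if status == "accept" or (
            status == "manual_check" and sequence in manually_confirmed
        ):
            peptides[protein].add(sequence)
    return {protein for protein, seqs in peptides.items() if len(seqs) >= 2}
```

Under these rules a doubly charged peptide with Xcorr 2.1 passes the positive threshold but falls in the band that requires manual inspection, so it contributes to a protein identification only after confirmation.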
In Silico Analysis of Identified Proteins-All proteins identified by any of the three proteomic approaches were subjected to multiple in silico analyses. Predicted localization of proteins was assigned primarily through previously published localization and subsequently by several algorithms as follows: signal peptide (SignalP 3.0) (38), apicoplast targeting (PATS) (39), mitochondria targeting (PlasMit) (40), and nuclear localization sequence (PredictNLS) (41). Proteins with predicted signal peptides were also analyzed for the presence of transmembrane domains (TMHMM) (42) and GPI anchor sites (129.194.185.165/dgpi/index_en.html). Protein function was assigned based on published results. Novel proteins were assigned putative function if strong homology over long stretches of the protein (based on BLASTP with default settings against NCBI nonredundant data base) was found to a protein of known function.
FIGURE 1. As described under "Experimental Procedures," purified tachyzoites were treated with 1% ethanol to induce secretion, and parasite products were collected from culture supernatants (ESA). ESA proteins were concentrated in a 10-kDa retentate and used for each of three analysis schemes. A, ESA proteins were resolved on pH 3-10 two-dimensional electrophoresis gels, and spots in the alkaline region were excised for N-terminal sequencing (after transfer to polyvinylidene difluoride membranes) or for in-gel trypsin digestion and analysis by MALDI-MS. The acquired PMF data were used to search against T. gondii data bases using ProteinProspector or Mascot. B, ESA proteins were separated on pH 4-7 two-dimensional electrophoresis gels and analyzed by MALDI/PMF as above. C, for MudPIT analysis, ESA protein mixtures were digested by trypsin, and the peptide mixtures were separated by two-dimensional chromatography using a strong cation exchange (SCX) column in tandem with a reverse phase (RP) column. Eluted peptides were analyzed by ESI MS-MS. The tandem mass spectra generated were correlated to theoretical mass spectra generated from T. gondii EST or predicted gene sequences using TurboSEQUEST. Finally, intracellular localization of a subset of the novel putative secretory proteins was assessed by expression of YFP fusion proteins in Toxoplasma tachyzoites.
Ortholog Identification-To identify orthologs to ESA proteins, we followed the approach used to generate the TIGR orthologous gene alignments (TOGA) (43). ESA amino acid sequences were BLASTed against available apicomplexan TIGR gene indices (44) using WUBLAST tblastn, and for sequence matches with E values of <10⁻⁵, the percent conservation of the best high scoring pair was recorded. Putative orthology was assigned if the reciprocal BLAST (blastx) using identical stringency against the Toxoplasma TgTwinScan amino acid gene predictions identified the original query as the top hit.
As an alternative approach, we identified the protozoan OrthoMCL (45) ortholog clusters containing identified ESA proteins and retransformed the relative similarity scores among proteins into the original BLAST E values.
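The reciprocal BLAST criterion described above amounts to a reciprocal-best-hit test. The following Python sketch shows the logic on pre-parsed hit lists; it does not run BLAST, the data layout and function names are hypothetical, and it simplifies the procedure by testing only the single best forward hit for each query.

```python
def best_hit(hits, max_evalue=1e-5):
    """hits: list of (subject_id, evalue). Return the subject with the lowest
    E value below the cutoff, or None if no hit passes."""
    passing = [hit for hit in hits if hit[1] < max_evalue]
    if not passing:
        return None
    return min(passing, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(forward, reverse, max_evalue=1e-5):
    """forward: {toxoplasma_id: [(other_id, evalue), ...]} from the forward
    search; reverse: {other_id: [(toxoplasma_id, evalue), ...]} from the
    reciprocal search at the same stringency. Returns putative ortholog pairs."""
    orthologs = {}
    for query, hits in forward.items():
        subject = best_hit(hits, max_evalue)
        if subject is None:
            continue
        # Orthology is called only if the reciprocal search returns the
        # original query as its top hit.
        if best_hit(reverse.get(subject, []), max_evalue) == query:
            orthologs[query] = subject
    return orthologs
```

A query whose best forward hit points back to a different Toxoplasma sequence in the reciprocal search is not assigned an ortholog, which is what keeps paralogs from being counted.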

RESULTS
Toxoplasma secretory products have been implicated in parasite entry, intracellular survival, and interaction with the host (reviewed in Ref. 47). Regardless of whether they are initially secreted in a soluble or membrane-associated form, many of these products ultimately accumulate in ESA as a result of proteolytic shedding from the parasite surface (48-50).
ESA proteins were recovered from the medium after stimulating parasite secretion with 1% ethanol treatment (51). To maximize ESA product discovery, we employed several strategies that are outlined in Fig. 1. The first approach was to separate proteins by two-dimensional electrophoresis coupled with identification either by N-terminal sequencing (Fig. 1A), which was primarily used for alkaline proteins, or by MALDI-MS and peptide mass fingerprinting (PMF) (Fig. 1B). The second approach was to use multidimensional liquid chromatography to separate tryptic peptides in line with ESI-MS/MS to acquire tandem MS data (Fig. 1C). ESA proteins were identified by matching the acquired N-terminal sequence, MALDI/PMF, or ESI-MS/MS data with sequences in the T. gondii subset of the NCBI nonredundant data base, T. gondii EST, or gene prediction (TgTwinScan) data bases using several search algorithms including BLAST, ProteinProspector (51, 52), Mascot (35), and SEQUEST (36). A subset of the novel putative secretory proteins identified in these screens was then expressed in a YFP fusion construct to test for localization within the parasite secretory pathway.
To view a wide profile of extracellular products, ESA proteins were initially resolved on broad pH range IEF strips (pH 3-10). Because only a minority of species occupied the alkaline region of the gel, these proteins were identified by a combination of N-terminal sequencing (Fig. 2A) and MALDI/PMF (Fig. 2B). Note that different electrophoresis systems were used in Fig. 2, A-C. Twelve spots were subjected to N-terminal microsequencing and matched to their cognate genes by BLAST (TABLE ONE). Seven of the 12 spots were SAG proteins, a family of glycosylphosphatidylinositol (GPI)-anchored proteins that abundantly occupy the parasite surface. SAG1 and SAG2 each migrated as three distinct species. Each of the SAG1 species possessed the same N-terminal sequence, suggesting that the charge heterogeneity is not because of differential cleavage of the N terminus. On the other hand, N-terminal sequences of the SAG2 species were slightly offset in a manner that would alter both the size and charge of the protein, potentially resulting in the distinct migration of these species. Their presence in the ESA suggests that SAG proteins are shed from the parasite surface, possibly contributing to their well known hyper-immunogenicity and pro-inflammatory properties (53). One novel protein, termed p40, was identified by N-terminal sequencing. This product is discussed in greater detail below.
As shown in Fig. 2B, several additional proteins in the alkaline region were identified by MALDI/PMF including GRA2, a dense granule protein necessary for maintaining the intravacuolar network of membranes within the parasitophorous vacuole (13). Also present in the alkaline region was ROP9 (p36), a protein of unknown function (54) and the only rhoptry-derived protein identified in this study. A parallel study of the rhoptry proteome by Bradley et al. (120) provides a comprehensive analysis of this unique subset of secretory products. Because the migration of ESA proteins was strongly skewed toward the acidic region of the broad pH range gels, we used high resolution narrow pH range IEF strips (pH 4-7) to further separate the protein mixture (Fig. 2C). These narrow range gels revealed ~100 distinct spots, of which 55 were successfully matched to 35 distinct proteins by MALDI/PMF analysis (TABLE TWO). Identified proteins were classified into several main categories, including MIC proteins (MIC1, -2, -4, -5, -6, -8, -10, and -11, M2AP, AMA1, and SUB1), dense granule proteins (GRA1, -2, -5, and -7 and T. gondii protease inhibitor-1), surface antigens (SAG1, -2, and -3 and SRS1), and other secretory proteins (PDI and Cyp18). Also, two-dimensional electrophoresis/MS revealed the extensive proteolytic modifications that proteins such as M2AP, MIC2, MIC11, MIC4, and SUB1 undergo. In addition to proteins identified previously, several novel hypothetical proteins were discovered as a result of searching against the Toxoplasma data bases using PMF data (TABLE TWO). In general, this gel-based approach provided the advantage of visualizing the complexity and heterogeneity of ESA products but was limited by relatively low sensitivity because many of the low abundance species were not successfully identified. As a complementary approach to two-dimensional electrophoresis/MS, the chromatography-based proteomic method MudPIT was used to analyze ESA proteins. 
ESA proteins were trypsin-digested, subjected to two-dimensional LC-MS/MS analysis, and identified by searching the un-interpreted product ion spectra against T. gondii clustered ESTs, the predicted genes (TgTwinScan), or the T. gondii subset of the NCBI nonredundant data base using the TurboSequest algorithm (36). A total of 62 proteins were identified (TABLE THREE). Strikingly, this approach resulted in the identification of additional proteins that were not detected on two-dimensional electrophoresis gels such as GRA3, NTP1, NTP2, NTP3, SRS2, and others. Moreover, several novel proteins emerged, including most notably a secreted metalloprotease of the insulinase family (TgTwinScan_4000), one or more of a family of Apple/PAN domain-containing proteins (TgTwinScan_2357, -2358, -2359, and -2361) similar to E. tenella MIC5 and T. gondii MIC4, and three proteins (TgTwinScan_0203, -3857, and -6350) with significant homology to hypothetical proteins of the human malaria parasite, Plasmodium falciparum. Also detected were several additional proteins (TgTwinScan_1114, -1327, -2489, and -3416) without homology to proteins or domains in the public data bases. These proteins represent novel Toxoplasma-specific products that may contribute to the unique biological features of this parasite.
As shown in Fig. 3, although MudPIT yielded the highest number of proteins identified, the combination of approaches promoted maximum coverage and provided validation for those products identified by more than one strategy.
To assess localization and potential function, ESA protein sequences were analyzed by a series of algorithms designed to detect targeting signals and orthologous proteins. The absence of targeting signals in 30% of the ESA proteins (Fig. 4) suggests that some cytosolic proteins are released into the ESA as a result of inadvertent parasite lysis. As expected, the majority (58%) of ESA proteins contained a putative signal sequence or a signal anchor sequence, and many of these have been shown previously to localize either to the surface of the parasite or to secretory organelles. As an initial screen to determine the intracellular locations of novel secretory proteins, we selected five genes (TwinScan_0203, -1327, -2359, -2489, and -2661) predicted to lack a transmembrane segment or GPI anchor and cloned them in-frame into an expression plasmid encoding a C-terminal YFP tag. After transient transfection into tachyzoites, expression and localization were assessed by fluorescence microscopy of formaldehyde-fixed parasites (Fig. 5). Despite multiple attempts, no expression of TwinScan_2489-YFP was observed. Together with the observation that transfection of this construct yielded morphologically defective parasites not seen with the other constructs, we conclude that expression of TwinScan_2489-YFP was toxic or otherwise poorly tolerated. TwinScan_0203-YFP localized to tubular structures resembling the parasite mitochondrion, a finding that was confirmed by co-localization with Mitotracker Red. Although this protein was predicted to contain a secretory signal sequence, it also possesses a putative mitochondrial targeting sequence that appears to be the dominant sorting element, at least in the context of this re-expression system. TwinScan_1327-YFP showed a punctate pattern distributed throughout the parasite, which showed only limited co-localization with the dense granule markers GRA2, GRA4 (data not shown), and GRA7 in extracellular parasites. 
In intracellular parasites TwinScan_1327-YFP accumulated in the vacuole at the convergence of distal ends of the parasites, where it partially co-localized with the dense granule markers secreted into the vacuole. Precisely how this protein reaches the parasitophorous vacuole remains to be determined. TwinScan_2359 (encoding the only member of the novel Apple/PAN family represented by tachyzoite ESTs) and TwinScan_2661 (encoding P40) showed similar localization in the apical region. This distribution was apical to the Golgi (GRASP-RFP; data not shown) and showed partial overlap with rhoptries (ROP2) and with micronemes (AMA1). This staining pattern may reflect inefficient trafficking to one of these apical organelles, with much of the protein accumulating in an intermediate compartment along the pathway. Imperfect targeting has previously been seen after overexpression of tagged secretory proteins in T. gondii (50). Collectively, these data confirm that several of the novel ESA products are expressed in the secretory system where they traffic to shared and distinct exocytic organelles within the parasite. Signal-containing proteins were assigned to one of seven categories according to function based on previous studies or on their similarity to proteins of known function. Most strikingly, proteins of unknown function constitute the largest category (54%), a reflection of the incomplete understanding of the roles fulfilled by Toxoplasma secretory products. The next largest category (24%) consists of proteins involved in adhesion/invasion, followed by proteins associated with immune evasion (7%), proteolysis (7%), protein folding (4%), intracellular survival (2%), and glycolysis (2%).
OCTOBER 7, 2005 • VOLUME 280 • NUMBER 40
Toxoplasma Invasion and Survival
Because apicomplexan parasites share many core biological features, including an obligate intracellular lifestyle, we were interested to what degree the Toxoplasma-secreted ESA proteins were conserved across the phylum. Emulating the approach used to generate the TIGR orthologous gene alignments (TOGA), we assigned reciprocal BLAST hits against the TIGR gene indices for seven apicomplexan parasites as putative orthologs, identifying orthologs to 33 ESA proteins in one or more Apicomplexa (Fig. 6). As expected, the very closely related coccidian Neospora caninum showed the highest degree of conservation both in number of orthologs and their sequence homology. In general, after normalization with respect to the number of unique sequences in each TIGR gene index (lowest, Cryptosporidium parvum 910; highest, Plasmodium yoelii 9812), the number of orthologs identified decreased according to their phylogenetic distance from T. gondii (55). Because representation in EST collections like the TIGR gene indices is skewed by stage-specificity and abundance of transcripts, the false negative rate of this analysis is expected to be high, making a failure to detect orthologs in the other Apicomplexa less informative. To compensate for this and to expand our analysis to additional protozoa, we consulted a collection of orthologous clusters generated using OrthoMCL developed by David Roos' laboratory at the University of Pennsylvania. By using OrthoMCL, which utilizes a wider array of sequences, we identified dozens of additional orthologs, particularly in more distantly related organisms. By using both of these approaches, we successfully identified orthologs in one or more apicomplexan parasites for 80% (35/44) of the predicted secretory ESA proteins, indicating that a number of ESA proteins are conserved within the phylum.
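The normalization and the reported coverage figure in the preceding paragraph involve only simple arithmetic, sketched here in Python. The gene index sizes (910 and 9812) and the 35/44 fraction come from the text; the ortholog count used in the usage line is a hypothetical placeholder, not a result from this study, and scaling to orthologs per 1,000 index sequences is an assumed convention.

```python
def orthologs_per_thousand(n_orthologs: int, index_size: int) -> float:
    """Ortholog count scaled to the number of unique sequences in a gene
    index (assumed normalization: orthologs per 1,000 index sequences)."""
    if index_size <= 0:
        raise ValueError("index_size must be positive")
    return 1000.0 * n_orthologs / index_size


def percent(part: int, whole: int) -> float:
    """Percentage of 'whole' represented by 'part'."""
    return 100.0 * part / whole


# Index sizes quoted in the text (unique sequences per TIGR gene index).
INDEX_SIZES = {"Cryptosporidium parvum": 910, "Plasmodium yoelii": 9812}

# Hypothetical placeholder count, used only to show the calculation.
example = orthologs_per_thousand(10, INDEX_SIZES["Cryptosporidium parvum"])

# Coverage of predicted secretory ESA proteins with an identified ortholog.
coverage = percent(35, 44)  # 79.5%, reported as 80% in the text
```

Scaling by index size matters here because the smallest and largest indices differ by more than tenfold, so raw ortholog counts alone would favor the species with the most sequenced transcripts.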

DISCUSSION
In this study we substantially widen the perspective of Toxoplasma proteins that form the interface of the parasite with its host during infection. The efficiency with which Toxoplasma initiates infection, avoids clearance by the innate immune system of the host, and persists indefinitely in host tissues is determined in large part by the surface of the parasite and secretory products. Although the precise function of most ESA products remains poorly understood, the current expanded view of these components is an important prelude to understanding mechanisms underlying Toxoplasma-host interactions contributing to infection and disease.
A number of cytosolic proteins were identified in the ESA. Because it was necessary to prepare ESA samples in defined, serum-free medium, some inadvertent parasite lysis was expected. Although most of these proteins are likely authentic cytosolic proteins, it is also possible that a subset may be incorrectly assigned due to inaccuracies of the gene assembly programs used to analyze the Toxoplasma genome, particularly the difficulty of correctly predicting the initial exon containing the signal sequence.
As expected, the largest subgroup of secretory proteins identified in the ESA is derived from the micronemes. In fact, the only known tachyzoite MIC protein not detected in this study was MIC3. This high coverage rate suggests that many of the novel ESA proteins may also be derived from MICs, a notion supported by the partial localization of recombinant TgTwinScan_2359 and TgTwinScan_2661 within these organelles. MICs are thought to participate in parasite adhesion and invasion of host cells (reviewed in Ref. 56). Consistent with their putative role in host cell attachment, many of the MIC proteins possess recognizable adhesive domains found in vertebrate proteins such as thrombospondin, integrins, and epidermal growth factor (9). Moreover, several studies have documented the cell binding activities of adhesive MICs, including MIC1, MIC2, MIC4, and MIC3 (49,57-59). However, several other MIC proteins (MIC5, MIC10, M2AP, SUB1, and AMA1) do not contain obvious adhesive sequences, suggesting these proteins have alternative or accessory functions (60-64). Another feature of Toxoplasma MICs is that they often assemble into protein complexes and work in concert. For example, MIC6 partners with MIC1 and MIC4 (65), MIC8 escorts MIC3 (50), and MIC2 accompanies M2AP (64,66,67). Among these, the MIC2-M2AP complex appears to be the most abundant in the ESA based on the Coomassie-stained two-dimensional electrophoresis profile (Fig. 2), a finding that is consistent with recent evidence that this adhesion complex plays a central role in parasite adhesion and invasion (59,66,68).
FIGURE 4. Predicted localization of identified ESA proteins. 58% of identified proteins were predicted to enter the secretory pathway, 8% to target to the nucleus, and 3% to target to the mitochondrion, and for 30% no targeting motifs were found, thus predicting cytoplasmic localization. Among the secretory proteins, 15%, 1%, and 12% have been demonstrated to reside in the micronemes, rhoptries, and dense granules, respectively. The ferredoxin oxidoreductase has been shown to reside in the apicoplast, and PDI is an ER resident protein. 21% have predicted secretory signal sequences, but no other targeting information was found. Localization was assigned based on published reports and available targeting prediction algorithms (Signal sequence, SignalP; apicoplast, PATS; mitochondrion, PlasMit; nucleus, PredictNLS).
In the present study, 10 DG proteins (GRA1, GRA2, GRA3, GRA5, GRA7, and GRA8, T. gondii protease inhibitor-1, and NTPase I, II, and III) were recovered from the ESA fraction (TABLES TWO and THREE). DG proteins are secreted at the highest rate soon after the parasite is fully within the vacuole (68,70). In most cases, the precise functions of DG proteins remain unclear, partly because they lack significant homology with proteins of known function. Despite this, several observations strongly suggest that DGs participate in the modification of the PV and its interaction with the host cytosol. For example, almost all of the known DG proteins are membrane-associated proteins, which is in accordance with recent protein genetic evidence that GRA2 and GRA6 play important roles in the biogenesis of intravacuolar membranes (13). GRA7 was suggested to be involved in immune protection, and a decrease in GRA7 synthesis paralleled the loss of parasite virulence (14). The T. gondii serine protease inhibitor-1 (71) was recently confirmed as a DG protein, presumably playing an important role in protecting against proteolytic enzymes and modulating the immune response during the acute phase of infection (72). NTPase isoforms are apyrases, which were demonstrated to function in intracellular replication of the parasite based on antisense interruption of expression (73). Also, the full scale activation of NTPase with dithiols may contribute to parasite exit from the host cell (74). It is likely that several of the novel secretory products identified herein are also derived from the DGs. Among these may be proteins of low antigenicity that would have been invisible to the antibody-based screens previously used to identify and characterize DG products.

FIGURE 5. Preliminary localization of novel putative secretory products expressed as YFP fusion proteins. TgTwinScan_0203-YFP (green) co-localizes with Mitotracker Red dye in the single mitochondrion of extracellular tachyzoites. TgTwinScan_1327-YFP (green) shows punctate localization, which overlaps to a small degree with dense granule markers (GRA4 or GRA7, red) in extracellular tachyzoites. In intracellular parasites it accumulates in the vacuole at the convergence of the parasites' distal ends, potentially the residual body. TgTwinScan_2359-YFP and TgTwinScan_2661-YFP (both in green) localize to the apical end of extracellular tachyzoites with partial overlap with the rhoptries (ROP2, red) and micronemes (AMA1, red). All samples were stained for DNA with 4′,6-diamidino-2-phenylindole (blue). Micrographs were captured under ×1000 magnification, with the exception of TgTwinScan_0203-YFP (×600). Scale bar, 5 μm.
It was somewhat unexpected to find significant amounts of SAGs in the ESA, given that these proteins are normally tethered to the parasite surface via a GPI anchor (75). It is currently unclear how these SAGs are released into the ESA, but several explanations are possible. We speculate that the SAGs are soluble in the ESA because a small fraction of these proteins failed to acquire a GPI anchor, were liberated by a parasite phospholipase, or were proteolytically released from the surface. Alternatively, SAGs may be released by surface membrane blebbing, as has been observed for Toxoplasma (76) and other pathogenic protozoa (77). As for the function of SAGs in invasion, a SAG3-null mutant (78) showed an approximately 2-fold reduction in host cell invasion and partially attenuated infectivity in mice. This result was attributed to a reduced capacity to attach to the host cell. In addition, SAGs are among the main targets of humoral and cell-mediated immune responses during initial infection (79-81).
The ER is the site of nascent secretory protein folding, which is facilitated by a collection of enzymes and molecular chaperones, including PDI, heat shock protein (HSP70), and secretory cyclophilin (Cyp). Although T. gondii PDI has not been definitively localized to the parasite ER, it displays a C-terminal sequence (GEEL) resembling an ER retention/retrieval element (K/HDEL) (82). Most interestingly, T. gondii PDI was recently shown to be a major target of mucosal IgA antibodies (82); thus its excretion in a soluble form could contribute to its elevated immunogenicity. Cyp18 is a secretory cyclophilin that was initially identified as one of two major immunophilins in T. gondii (83). Cyp18 is mainly located in the ER, but a subset of intracellular parasites shows significant Cyp18 staining in the PV.⁴ In addition to its role as a foldase, Cyp18 stimulates dendritic cells to secrete interleukin-12, as recently demonstrated by Sher and co-workers (16), thereby contributing to the robust pro-inflammatory response often seen during Toxoplasma infection.

FIGURE 6. Phylogenetic conservation of secretory ESA proteins among apicomplexan parasites. Only proteins with at least one putative ortholog are listed. Reciprocal BLAST hits against the TIGR gene indices of six other apicomplexan parasites with an E value of <10⁻⁵ were assigned as putative orthologs, with percent positive (identical or chemically similar) conservation in the best high scoring pair shown and indicated in color code. The corresponding OrthoMCL ortholog clusters were used as comparison and to expand ortholog identification to additional protozoa. For the OrthoMCL clusters, the negative logarithm of the BLAST expect value is color-coded to indicate the extent of conservation. Additionally, the number of genes in each species (i.e. paralogs) belonging to the orthologous gene cluster is shown. Putative orthologs identified by one but not the other method are indicated by ×. Predicted localization is also specified: microneme (Mic), rhoptry (Rho), dense granule (DG), surface (Sur), apicoplast (Api), other secretory (OSec), unknown secretory (USec), and mitochondrial (Mito). Species abbreviations: N. caninum (Nc); Sarcocystis neurona (Sn); E. tenella (Et); C. parvum (Cp); Theileria annulata (Ta); Plasmodium berghei (Pb); P. yoelii (Py); P. falciparum (Pf); Plasmodium chabaudi (Pc); Plasmodium knowlesi (Pk); Tetrahymena tetraurelia (Tt); and Cyanidioschyzon merolae (Cm).
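The reciprocal BLAST criterion described in the Figure 6 legend can be illustrated with a minimal sketch. It assumes that "reciprocal" means reciprocal best hits, which the legend does not state explicitly, and that the BLAST results have already been parsed into (query, subject, E value) tuples. The function names and all sequence identifiers below are hypothetical and serve only as an illustration; this is not the pipeline used in the study.

```python
from typing import Dict, List, Tuple

# A hit is (query_id, subject_id, e_value). All identifiers are hypothetical.
Hit = Tuple[str, str, float]

E_VALUE_CUTOFF = 1e-5  # threshold quoted in the Figure 6 legend


def best_hits(hits: List[Hit], cutoff: float = E_VALUE_CUTOFF) -> Dict[str, str]:
    """Map each query to its lowest-E-value subject, ignoring hits at or above the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, e_value in hits:
        if e_value >= cutoff:
            continue  # not significant enough to be considered
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: List[Hit], reverse: List[Hit]) -> List[Tuple[str, str]]:
    """Return (query, subject) pairs that are each other's best hit in both directions."""
    fwd = best_hits(forward)   # e.g. Toxoplasma proteins searched against another species
    rev = best_hits(reverse)   # the other species searched back against Toxoplasma
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Made-up example data (assumption: not taken from the paper).
forward = [
    ("Tg_A", "Pf_1", 1e-40),
    ("Tg_A", "Pf_2", 1e-8),
    ("Tg_B", "Pf_2", 1e-3),   # fails the E-value cutoff
    ("Tg_C", "Pf_3", 1e-12),
]
reverse = [
    ("Pf_1", "Tg_A", 1e-38),
    ("Pf_2", "Tg_A", 1e-7),
    ("Pf_3", "Tg_D", 1e-20),  # Pf_3 prefers Tg_D, so Tg_C/Pf_3 is not reciprocal
    ("Pf_3", "Tg_C", 1e-10),
]

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)  # [('Tg_A', 'Pf_1')]
```

In this toy data set only Tg_A and Pf_1 are retained as a putative ortholog pair. Requiring agreement in both search directions is what distinguishes this criterion from a one-way similarity search, which would also have reported Tg_C and Pf_3.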
Several novel secretory proteins were identified in our analysis of ESA fractions. The sequences of some of these proteins are highly conserved in the Apicomplexa, implying that they fulfill parallel roles in the biology of these parasites. For example, we identified a novel insulinase-like metalloprotease (TwinScan_4000), which may contribute to tissue damage or migration/dissemination during infection. Also, MudPIT revealed a family of proteins containing adhesive cysteine-rich Apple/PAN domains that are highly analogous to those of E. tenella MIC5 (84). Although genes encoding this family are tandemly linked in the genome, analysis of EST sequences suggests that they are stage-regulated because only one member (TwinScan_2359) is expressed in tachyzoites. A YFP fusion to this tachyzoite-specific product localizes to an apical compartment partially overlapping with micronemes and rhoptries. Additional studies will be required to determine its precise localization and whether this protein is mobilized to the parasite surface during invasion, where it could contribute to host receptor binding via one or more of its Apple/PAN domains.
Apicomplexan-specific proteins that are conserved across the phylum likely play fundamentally important roles in the biology of these intracellular parasites. Our phylogenetic analysis of ESA products revealed extensive conservation of AMA1 and MIC2 (TRAP family), which is consistent with recent studies (67,68,85,86) strongly implicating these proteins in cell invasion. ROP9 is also well conserved, implying that this protein may play a central, albeit still undefined, role in the Apicomplexa. Among the novel ESA products that are particularly well conserved is TgTwinScan_3857, which encodes a protein containing a membrane attack complex (MAC)/perforin domain. Although similar proteins are expressed by Plasmodium and have been reported recently to be required for migration through cells and tissue in the mosquito and mammalian hosts (87,88), the precise molecular mechanisms by which they fulfill this role remain to be determined.
In summary, the current study illuminates a diverse legion of secretory proteins that contribute to parasite survival and pathogenesis during infection. Novel parasite-specific proteins identified in this screen significantly expand the number of potential targets for therapeutic intervention or diagnosis. We anticipate that ongoing in-depth functional analysis of the novel proteins identified here will further unravel the invasive and pathogenic mechanisms used by Toxoplasma during infection.