Histone Acetylase GCN5 Enters the Nucleus via Importin-α in Protozoan Parasite Toxoplasma gondii

The histone acetyltransferase GCN5 acetylates nucleosomal histones to alter gene expression. How GCN5 gains entry into the nucleus of the cell has not been determined. We have mapped a six-amino acid motif (RKRVKR) that serves as a necessary and sufficient nuclear localization signal (NLS) for GCN5 in the protozoan pathogen Toxoplasma gondii (TgGCN5). Virtually nothing is known about nucleocytoplasmic transport in these parasites (phylum Apicomplexa), and this study marks the first demonstrated NLS delineated for members of the phylum. The TgGCN5 NLS has predictive value because it successfully identifies other nuclear proteins in three different apicomplexan genomic databases. Given the basic composition of the T. gondii NLS, we hypothesized that TgGCN5 physically interacts with importin-α, the main transport receptor in the importin/karyopherin nuclear import pathway. We cloned the importin-α gene from T. gondii (TgIMPα), which encodes a protein of 545 amino acids that possesses an importin-β-binding domain and armadillo/β-catenin-like repeats. In vitro co-immunoprecipitation experiments confirm that TgIMPα directly interacts with TgGCN5, but this interaction is abolished if the TgGCN5 NLS is deleted. Taken together, these data argue that TgGCN5 gains access to the parasite nucleus by interacting with TgIMPα. Bioinformatics analysis of the T. gondii genome reveals that other components of the importin pathway are present in the organism. This study demonstrates the utility of T. gondii as a model for the study of nucleocytoplasmic trafficking in early eukaryotic cells.

Eukaryotic DNA is highly compacted in chromatin, which is largely comprised of histone proteins. Once thought to play only a scaffolding role, histones are now recognized as a critical component of gene regulation and epigenetic inheritance. Many chromatin remodeling complexes contain histone-modifying enzymes that alter the nucleosomal structure and modulate the expressed genome (1,2). GCN5, originally described as a transcriptional adapter in yeast (3), was later discovered to be a histone acetyltransferase (HAT) (4). HATs covalently modify chromatin by acetylating specific lysine residues within the N-terminal tails of histones, thereby attenuating the nucleosome-DNA interaction. GCN5 and the related HAT, P/CAF (p300/CBP-associating factor), are critical for growth and development in mice (5,6) and plants (7). Although much study has been invested in the gene regulatory functions of GCN5s, little attention has been directed at resolving how the HAT gets into the cell nucleus. One study suggests that P/CAF requires importin-α for nuclear localization, but the precise mechanism of the interaction was not determined (8).
Importin-α proteins serve as adaptors that bind both the nuclear-destined cargo protein and importin-β. Importin-α recognizes a nuclear localization signal (NLS) on the cargo, which is typically a mono- or bipartite cluster of basic amino acids (9). Importin-β is capable of binding nucleoporins, thus escorting the importin-α-importin-β-cargo complex to the nuclear pore. RanGDP and nuclear transport factor 2 are involved in translocation of the ternary complex through the nuclear pore, whereupon RCC1 exchanges GDP with GTP to dissociate the cargo (10,11). RanGTP recycles importin-β, and importin-α is exported back to the cytoplasm by a distinct importin-β homologue, cellular apoptosis susceptibility protein (12). Exceptions exist to this generalized pathway, such as the ability of importin-β to translocate cargo without importin-α (13) and the presence of transportin, a distant relative of importin-β that recognizes nonclassical (M9) NLSs (14).
The GCN5 HAT identified in the protozoal parasite Toxoplasma gondii (phylum Apicomplexa) revealed unexpected features, most notably an N-terminal "extension" that is unusually long (820 amino acids) for an early eukaryote (15). Early eukaryotes including yeast and the fellow alveolate protist Tetrahymena possess a short N-terminal domain of <100 residues (4). The N-terminal extensions in plant GCN5s range from 150 to 250 amino acids and have little homology to each other or other HATs (16). Metazoan GCN5 family members contain N-terminal extensions of ~500 residues, and they are fairly similar in composition (17,18). TgGCN5 contains the second longest N-terminal extension of all GCN5 HATs described to date, the longest (1126 amino acids) having recently been noted in another apicomplexan, Plasmodium falciparum (19), the causative agent of severe malaria. Interestingly, the lengthy N-terminal extensions present in these apicomplexan GCN5s bear no similarity to each other or any known protein and are devoid of known protein motifs. We hypothesized that at least one role for the N-terminal extension may be nuclear localization. Consistent with this hypothesis is the recent observation that maize GCN5 needs its N-terminal extension to access the nucleus; however, the precise NLS was not mapped (16). Understanding GCN5 and how it enters the parasite nucleus is significant for a number of reasons. T. gondii can cause congenital birth defects and is an important opportunistic pathogen in AIDS and immunosuppressed patients (20-22). Blocking transcription factors from entering the parasite nucleus will subvert parasite differentiation and other processes essential for its survival. Novel elements found in parasite transcription factors and/or the nuclear import pathway may be exploited in the design of more selective therapeutic agents. Additionally, the study of these pathways in protozoa provides a unique perspective on how these systems evolved in early eukaryotic cells.
Here we report the use of T. gondii GCN5 (TgGCN5) as a tool to investigate the understudied area of nuclear trafficking in protozoan parasites. We have mapped the first NLS to be identified and validated for any apicomplexan protein and in the process defined a function of the long N-terminal extension of TgGCN5. We demonstrate the predictive value of this NLS by identifying other nuclear proteins from the T. gondii genome as well as two additional apicomplexan databases. We have also characterized a T. gondii importin-α orthologue (TgIMPα) and demonstrated that this nuclear receptor interacts with TgGCN5 via the NLS that we have elucidated. Knowledge that GCN5 family HATs are translocated to the nucleus by the importin pathway may be useful in developing therapies that prevent this factor from modulating transcription in eukaryotic cells.

EXPERIMENTAL PROCEDURES
Parasite Culture and Methods-T. gondii tachyzoites were cultivated in human foreskin fibroblast cells and transfected by electroporation as previously described (23). The RHΔHXGPRT clone and pminiHXGPRT vector were obtained through the AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, National Institutes of Health, from Dr. David Roos (24-26). RHΔHXGPRT parasites were grown in culture medium supplemented with 320 μg/ml 6-thioxanthine (Sigma). Mycophenolic acid at 25 μg/ml plus xanthine at 50 μg/ml (Sigma) were used to select for parasites transfected with pminiHXGPRT-derived vectors.
Plasmids-All of the constructs for immunolocalization were based on a T. gondii expression vector that was built into pminiHXGPRT. The T. gondii TUB1 promoter was amplified from ptubP30-GFP/sagCAT (27) using primers 1 and 2 (all of the primer sequences are listed in Table I). These primers incorporate a BamHI site on the 5′ end of the PCR product and a polylinker region at the 3′ end containing sites for BglII, NdeI, EcoRV, and AvrII. The 3′-untranslated region of T. gondii dihydrofolate reductase was amplified using primers 3 and 4. The resulting PCR product contained AvrII and NotI sites that were digested for a three-piece ligation with the TUB1 amplicon above (cut with BamHI and AvrII) and the pminiHXGPRT vector (cut with BamHI and NotI). We have designated this expression construct as ptubX FLAG ::HX, the X representing where the gene of interest is to be installed (Fig. 1). It is important to note that although a C-terminal FLAG tag option is available in this vector, it was used only for the βgal construct. An N-terminal FLAG tag was fused to TgGCN5 sequences to be tested for localization as outlined below.
Primers 5-10 are the sense primers used in conjunction with the antisense primer 11 to create the various FLAG-tagged forms of TgGCN5 or βgal. All of the inserts were ligated into ptubX FLAG ::HX at the NdeI and AvrII sites except the βgal constructs, which used BglII and AvrII. Prior to electroporation, ~50 μg of plasmid DNA was linearized using NotI.
For the production of in vitro translated protein, pGADT7 and pGBKT7 vectors were used (Clontech). Full-length TgGCN5 and TgGCN5 lacking the first 99 amino acids (Δ99TgGCN5) were ligated into the pGBKT7 vector using the NdeI and XmaI sites. Sense primers 12 and 13 were used with antisense primer 14 to amplify inserts encoding full-length TgGCN5 and Δ99TgGCN5, respectively. ΔNLS-TgGCN5 was generated by a ligation of three DNA fragments. (i) The first piece encoded TgGCN5 amino acids 1-93 and was amplified using primers 12 and 15, the latter of which contains an NheI site. (ii) The second fragment encoded amino acids 100-1169 and was generated with primer 16 (contains an NheI site) and primer 14 (contains an XmaI site). (iii) The third piece was pGBKT7 digested with NdeI and XmaI. The resulting vector encodes a form of TgGCN5 in which RKRVKR is replaced by AS (encoded by the NheI site). TgIMPα was ligated into pGADT7 in frame with the N-terminal hemagglutinin tag using NdeI and XmaI sites, and the insert was generated by PCR using primers 17 and 18.
All of the PCRs were conducted with the proofreading DNA polymerase Pfu Ultra™ (Stratagene). DNA sequencing was performed at the Indiana University Biochemistry Biotechnology Facility (Indiana University School of Medicine, Indianapolis, IN).
Molecular Methods-T. gondii mRNA was isolated from ~2 × 10^8 tachyzoites using Poly(A)Pure™ (Ambion). Tachyzoites were allowed to completely lyse the host cell monolayer and were passed through a 3.0-μm filter to exclude residual host cell debris. mRNA was treated with DNase prior to Northern blotting, which was performed using standard methods (28). A cDNA-derived probe representing the final 2.3 kb of TgGCN5 was labeled with [32P]dATP (PerkinElmer Life Sciences) using random primers and a Prime-A-Gene® kit (Promega). Reverse transcription-PCRs were carried out using the SuperScript II™ One-Step reverse transcription-PCR system (Invitrogen). The rapid amplification of cDNA ends was performed using an Invitrogen GeneRacer™ kit. Sequencing alignments and phylogeny trees (neighbor joining method) were generated using AlignX, a component of Vector NTI Advance 9.0 (Informax). Protein motif searches were performed using the Pfam data base (version 14.0; pfam.wustl.edu/) and/or PROSITE data base (version 18.35; us.expasy.org/prosite/).
Immunofluorescence Assays-12-Well plates containing confluent human foreskin fibroblast cells grown on glass coverslips were infected with parasites. The cultures were fixed 18-24 h post-infection with ice-cold methanol at -20°C for 10 min and blocked with phosphate-buffered saline containing 3% Fraction-V bovine serum albumin.
TgGCN5 Antisera-A PCR-derived fragment encoding a portion of the C-terminal end of TgGCN5 (amino acids 976-1169) was ligated into pET19b (Novagen) to create an in-frame fusion to the N-terminal polyhistidine tag. Following transfection into BL21(DE3) Escherichia coli, recombinant protein was induced by adding 1.0 mM isopropyl β-D-thiogalactopyranoside for 3 h at 37°C and purified by virtue of a nickel-nitrilotriacetic acid-agarose column (Qiagen) under native conditions. The purified protein was used as antigen to produce polyclonal antisera in rabbit (Pocono Rabbit Farms & Laboratories, Canadensis, PA). The resulting antiserum recognizes <0.1 μg of recombinant antigen at a dilution of 1:10000, whereas prebleed serum is nonreactive (data not shown).
Co-immunoprecipitation Experiments-In vitro translated proteins representing full-length TgGCN5, Δ99TgGCN5, ΔNLS-TgGCN5, and hemagglutinin-tagged TgIMPα were generated using the TNT® coupled reticulocyte lysate system (Promega) in the presence of Redivue™ L-[35S]methionine (Amersham Biosciences) and RNase Out (Invitrogen). 50-μl reaction mixtures were assembled and incubated as described by the manufacturer. Co-immunoprecipitations were performed using 20 μl of a TgGCN5-derived construct combined with 20 μl of TgIMPα and mixed at 25°C for 1 h. 1 μl of TgGCN5 polyclonal antisera or 10 μl of polyclonal anti-hemagglutinin (Santa Cruz) was added and mixed for 1 h at 25°C. 10 μl of EZ-View protein A-agarose slurry (Sigma) was applied, and the reaction mixture was allowed to incubate at 25°C for 1 h. Agarose was washed five times with 200 μl of Buffer 1 and twice with 300 μl of Buffer 2 from the MatchMaker co-immunoprecipitation kit (Clontech). Bound proteins were eluted in NuPAGE loading dye containing 5.0% β-mercaptoethanol and resolved on NuPAGE Bis-Tris SDS gels (Invitrogen). The gels were fixed in an aqueous solution of 5% isopropanol and 5% glacial acetic acid overnight and rinsed in running distilled water for 1 h. The gels were incubated in Autofluor (National Diagnostics) for 2 h, dried, and processed for autoradiography.

RESULTS
Full-length TgGCN5 Localizes to Parasite Nucleus-TgGCN5 was initially described as a 474-amino acid protein, as deduced from a 2.3-kb cDNA clone isolated from a library screen (29). Our parallel cloning of TgGCN5 suggested that the open reading frame is much larger, encoding a protein of 1169 amino acids (15). To verify the message length and determine whether alternatively spliced variants of TgGCN5 are produced, we performed Northern analysis. The probe used was a cDNA-derived sequence corresponding to the 2.3-kb portion previously reported, which is shared by both predicted transcripts. The result clarifies that only one transcript of ~5.6 kb exists in T. gondii (Fig. 2). The size of the transcript is consistent with the longer version of the gene and predicts a 5′-untranslated region of 1.3 kb. To resolve the 5′ end of the TgGCN5 message, 5′ rapid amplification of cDNA ends was performed. Two amplicons were produced, indicating transcriptional start sites at -1101 and -1316 nucleotides from our proposed translational start. The 215-nucleotide difference between the two putative transcriptional starts is not resolvable on the Northern.
A recombinant version of the previously reported 474-amino acid version of TgGCN5 (corresponding to amino acids 697-1169 of full-length TgGCN5) fused to a c-Myc epitope accumulated in the parasite cytoplasm, an unexpected localization for a GCN5 family member HAT (29). To test whether the additional N-terminal sequence alters localization to the parasite nucleus, full-length TgGCN5 and a truncated version lacking the N-terminal extension (amino acids 1-697) tagged with a FLAG epitope were expressed in the parasites ( FLAG TgGCN5 and FLAG ΔNT TgGCN5, respectively). Immunofluorescence assays using anti-FLAG demonstrate that the N-terminal extension is indeed required for nuclear localization of TgGCN5 (Fig. 3). In addition to nuclear DNA, 4′,6′-diamino-2-phenylindole also stains the 35-kb extrachromosomal element housed in the apicoplast. We did not observe any form of TgGCN5 trafficking to the apicoplast organelle.
Mapping the Nuclear Localization Signal (NLS) of TgGCN5-Protein motif searches of the TgGCN5 sequence revealed a putative bipartite NLS at amino acid positions 488-505 (15). However, removal of this 18-amino acid motif from full-length TgGCN5 fused to a C-terminal green fluorescent protein tag did not subvert nuclear localization. Therefore, a series of mapping constructs was designed to identify the region responsible for targeting TgGCN5 to the parasite nucleus. After several truncation studies (data not shown), the region involved in nuclear localization was narrowed to be between amino acids 58 and 120. Visual inspection of this region revealed a hexapeptide enriched with basic residues beginning at residue 94. NLSs often contain basic amino acids and average 6-8 residues in length (30). As shown in Fig. 4, FLAG-tagged TgGCN5 lacking the first 99 residues ( FLAG Δ99TgGCN5) does not enter the parasite nucleus. However, when only the first 93 residues are removed ( FLAG Δ93TgGCN5), the resulting protein is no longer restricted from the parasite nucleus.
This demonstrates that the hexapeptide RKRVKR (amino acids 94-99) is required for the nuclear localization of TgGCN5. To verify whether the hexapeptide is a complete and sufficient NLS, RKRVKR was fused to the N terminus of E. coli βgal for expression in T. gondii. The FLAG epitope tag was moved to the C-terminal end of the βgal fusion proteins to ensure that the FLAG-NLS chimeric sequence was not responsible for nuclear localization. βgal FLAG is excluded from the parasite nucleus unless fused to the TgGCN5 NLS (Fig. 5). The homogeneous staining of the parasites expressing NLS-βgal is likely due to saturation of the nuclear trafficking pathway or a strong nuclear export sequence present in βgal. Taken together, these studies show that RKRVKR is a necessary and sufficient NLS.
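The deletion-mapping logic above can be sketched in a few lines of Python. This is only an illustration of the reasoning, assuming each N-terminal truncation is scored simply as nuclear or not; the function names are hypothetical, and only the residue coordinates (1169 residues, Δ93 nuclear, Δ99 excluded) come from the text.

```python
# Coordinates taken from the text: the full-length protein has 1169 residues.
# Function names and the data layout are illustrative assumptions.
FULL_LENGTH = 1169


def retained_after_n_truncation(n_removed):
    """Return (first, last) residue kept after deleting the first n_removed residues."""
    if not 0 <= n_removed < FULL_LENGTH:
        raise ValueError("n_removed must be between 0 and FULL_LENGTH - 1")
    return (n_removed + 1, FULL_LENGTH)


def map_required_window(observations):
    """Infer the residue window that must contain the signal.

    observations maps n_removed -> True (nuclear) or False (excluded).
    The window starts just after the largest truncation that is still
    nuclear and ends at the last residue removed by the smallest
    truncation that loses nuclear localization.
    """
    nuclear = [n for n, ok in observations.items() if ok]
    excluded = [n for n, ok in observations.items() if not ok]
    if not nuclear or not excluded:
        raise ValueError("need at least one nuclear and one excluded construct")
    return (max(nuclear) + 1, min(excluded))


# Full length (0 removed) and delta-93 are nuclear; delta-99 is excluded.
window = map_required_window({0: True, 93: True, 99: False})
print(window)
```

With the three constructs described in the text, the inferred window is residues 94-99, which matches the RKRVKR hexapeptide. The sketch only covers necessity; sufficiency was tested separately with the βgal fusion.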
Bioinformatics Screen for Proteins Harboring the TgGCN5 NLS-RKRVKR is a rare form of a monopartite NLS conforming to the (R/K)(R/K)(R/K)X(R/K)(R/K) type described in Boulikas (30), where (R/K) denotes a lysine or arginine. With the recent completion of multiple apicomplexan genome sequencing projects, knowledge of NLS motifs would prove valuable in characterizing parasite genes of unknown function. Bioinformatics searches using the TgGCN5 NLS against predicted proteins in ToxoDB.org (Release 3.0) revealed no exact matches, save TgGCN5 itself. However, when RKRXKR is used, 25 potential hits emerged using the TgTigrScan set of predicted proteins (Table II). For example, TgTigrScan_3884 is undoubtedly a DNA polymerase and contains a similar motif, RKRTKR, which may contribute to its nuclear compartmentalization. TgTigrScan_2877, 5941, 6704, and 7743 and several more contain predicted zinc fingers that are indigenous to many nuclear transcription factors. TgTigrScan_4179 contains RKRKKR and is a probable nuclear ribonucleoprotein, harboring an RNA recognition motif. Although these predictions are encouraging, they remain to be confirmed in vivo. It is also worth mentioning that NLSs may be more a matter of charge than an orderly configuration of precise amino acid residues (31). Thus, permutations of the TgGCN5 NLS (e.g. (R/K)(R/K)X(R/K)(R/K)(R/K)) are likely to be found in other nuclear proteins in apicomplexa.
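A motif screen of this kind can be approximated with regular expressions. The following is a minimal sketch, not the procedure actually used against ToxoDB: the protein identifiers and sequences are invented toy data, and the function name is hypothetical. Only the motif definitions (RKRXKR and the relaxed form with three basic residues, any residue, then two basic residues) follow the text.

```python
import re

# Strict motif from the text: RKR, any residue (X), then KR.
STRICT = r"RKR.KR"
# Relaxed motif: three basic residues, any residue, two basic residues.
RELAXED = r"[RK]{3}.[RK]{2}"


def find_motifs(proteins, pattern):
    """Return {protein_id: [(1-based start, hexapeptide), ...]} for proteins with hits.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    scanner = re.compile(f"(?=({pattern}))")
    hits = {}
    for protein_id, sequence in proteins.items():
        found = [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence)]
        if found:
            hits[protein_id] = found
    return hits


# Toy sequences (hypothetical, not real database entries).
toy_proteins = {
    "toy_a": "MAAARKRTKRLLL",
    "toy_b": "MSSKKKLRKGG",
    "toy_c": "MAGGSLLE",
}

print(find_motifs(toy_proteins, STRICT))
print(find_motifs(toy_proteins, RELAXED))
```

As the text cautions, a sequence match alone does not establish nuclear localization, so any hit list produced this way is a set of candidates for experimental follow-up.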
T. gondii Possesses an Importin-α Homologue-Nuclear proteins harboring a NLS rich in basic residues typically interact with members of the importin (also known as karyopherin) family of proteins, particularly importin-α (30,32). The T. gondii database (Release 2.1) was screened for sequences encoding possible importin-α homologues using the BLAST algorithm and text queries. The contig Tgg_7395 (corresponding to Tgg_994532 in Release 3.0) exhibited a high degree of similarity to importin-α proteins from other species. Primers designed to the genomic sequence data were employed to obtain the full-length cDNA for T. gondii importin-α (TgIMPα, AY267540). Based on the results of the cDNA sequence, we have deduced that the ~9.5-kb TgIMPα genomic locus is comprised of 11 exons and 10 introns.
TgIMPα possesses an open reading frame of 1638 nucleotides with a consensus start ATG (33) that is preceded by an in-frame stop codon 156 nucleotides upstream. The deduced 545-amino acid sequence has a calculated molecular mass of ~60 kDa and shows unequivocal similarity to orthologues present in other eukaryotic organisms, especially the fellow apicomplexan parasite P. falciparum (34). Apicomplexan IMPα proteins harbor the well conserved importin-β-binding domain and armadillo/β-catenin-like repeats (35) (Fig. 6A). Additionally, they contain the leucine-rich nuclear export signal within the final armadillo/β-catenin-like repeat that is required to recycle importin-α back to the cytoplasm (36). An autoinhibitory motif (KRR) is present in the importin-β-binding domain of metazoan importin-α proteins, but the corresponding sequence in TgIMPα is more akin to the one in plants (KKR). Phylogenetic analysis of apicomplexan importin-α proteins (Fig. 6B) also illustrates that they are more similar to those found in plants, a characteristic that has been noted before for other apicomplexan proteins (37).
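The relationship between the open reading frame length, the protein length, and the approximate mass can be checked with simple arithmetic. The sketch below assumes that the 1638-nucleotide figure includes the stop codon and uses an average residue mass of 110 Da, a common rule of thumb rather than a value from this study; the function names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an assumption for a rough estimate, not a value from the paper.
AVG_RESIDUE_MASS_DA = 110.0


def orf_to_protein_length(orf_nucleotides):
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nucleotides // 3 - 1  # the final codon is the stop codon


def approx_mass_kda(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Rough molecular mass in kDa from residue count."""
    return n_residues * avg_residue_mass / 1000.0


residues = orf_to_protein_length(1638)
mass = approx_mass_kda(residues)
print(residues, round(mass, 2))
```

Under these assumptions, 1638 nucleotides correspond to 545 residues and roughly 60 kDa, in line with the values stated in the text. A mass calculated from the actual sequence would differ slightly because residue masses vary.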
TgGCN5 Interacts with TgIMPα via the RKRVKR NLS-Co-immunoprecipitation studies were performed to assess whether TgGCN5 physically interacts with TgIMPα and, furthermore, whether such an interaction occurred by virtue of the NLS elucidated by the immunolocalization studies. TgIMPα and various forms of TgGCN5 were transcribed and translated in vitro in the presence of radiolabeled methionine. After mixing them together, the proteins were co-immunoprecipitated using antisera generated against the C terminus of TgGCN5 and resolved on SDS-PAGE. As shown in Fig. 7, full-length TgGCN5 interacts with TgIMPα. However, a truncated version of TgGCN5 starting just downstream of the RKRVKR NLS (Δ99TgGCN5) is no longer capable of binding TgIMPα, consistent with the in vivo localization data. To ensure that only the NLS hexapeptide is required to mediate the interaction with TgIMPα, a version of full-length TgGCN5 was generated in which RKRVKR was replaced by an alanine and serine (ΔNLS-TgGCN5). ΔNLS-TgGCN5 fails to interact with TgIMPα (Fig. 7), confirming that TgIMPα cannot bind any region of TgGCN5 outside of the NLS RKRVKR.
FIG. 3. The N-terminal extension of TgGCN5 is required for nuclear localization. Recombinant protein in transgenic parasites expressing either full-length FLAG TgGCN5 or TgGCN5 lacking the N-terminal extension ( FLAG ΔNT TgGCN5) was detected by immunofluorescence assay using anti-FLAG antibodies followed by staining with anti-rabbit Alexa 488 (green). 4′,6′-Diamino-2-phenylindole (DAPI) was used as a nuclear stain (red). hN, host cell nucleus; TgN, parasite nucleus.
FIG. 4. The hexapeptide RKRVKR (amino acids 94-99) is necessary for nuclear localization of TgGCN5. Parasites were transfected with FLAG-tagged forms of TgGCN5 lacking the first 93 ( FLAG Δ93TgGCN5) or 99 ( FLAG Δ99TgGCN5) amino acid residues. Immunofluorescence assay was carried out as described in the legend to Fig. 3. DAPI, 4′,6′-diamino-2-phenylindole.
FIG. 5. The TgGCN5 NLS is sufficient to translocate a heterologous protein to the parasite nucleus. Parasites transfected with E. coli βgal fused to FLAG without or with the TgGCN5 NLS (βgal FLAG and NLS-βgal FLAG , respectively) were examined by immunofluorescence assay as described in the legend to Fig. 3. DAPI, 4′,6′-diamino-2-phenylindole.

DISCUSSION
Two understudied research topics in the clinically and economically relevant group of pathogens in phylum Apicomplexa are nucleocytoplasmic trafficking and the regulation of gene expression. We have attempted to expand our knowledge base in both of these related areas by studying how the important HAT, TgGCN5, is directed to the parasite nucleus to carry out its gene regulatory functions.
Functions of the Unique N-terminal Extension of TgGCN5-The strikingly divergent N-terminal extensions present in both TgGCN5 and PfGCN5 may be indicative of parasite-specific functions. This study has discovered that at least one function of the N-terminal extension is to localize the HAT to the nucleus via the importin pathway. Additional possible functions of the TgGCN5 N-terminal extension may include the participation in protein-protein interactions or the recognition of substrate. In vivo, GCN5 operates as the enzymatic component of large multiprotein complexes such as SAGA and ADA (1). One or more proteins may bind to the N terminus of TgGCN5 to nucleate an analogous histone acetylase complex in the parasites. The divergent nature of the apicomplexan N-terminal sequences implies that they interact with novel, parasite-specific proteins, perhaps different sets of proteins under varying conditions or life cycle stages.
The N-terminal extension of TgGCN5 may also influence enzymatic activity and substrate recognition. Recombinant mammalian GCN5 devoid of the N terminus is capable of acetylating free histones in vitro but requires the N-terminal sequence to be able to recognize and acetylate nucleosomes (18). The N-terminal extension is dispensable for GCN5-mediated acetylation of free histone H3 in both P. falciparum (19) and T. gondii, but further study is required to assess the role of this domain, if any, in recognizing nucleosomal substrates.
The Predictive Value of the TgGCN5 NLS-The completion of three apicomplexan genomes in recent years has revealed many predicted open reading frames of no known function. Defining motifs such as an NLS will greatly assist in annotation efforts and functional genomics. The TgGCN5 NLS sequence exhibited no exact matches to predicted genes, but minor variations produced encouraging results when screened against the database. Substituting the only nonbasic residue (valine) with any amino acid identified a DNA polymerase, a probable ribonucleoprotein, and numerous parasite-specific proteins containing zinc fingers, which are likely transcription factors. Table II also displays results from similar searches of the Plasmodium database with permutations of the TgGCN5 NLS. For example, searching the P. falciparum annotated protein data set with RKRXKR revealed 14 potential hits, including a predicted nuclear protein containing three PHD fingers (PFC0425w; RKRNKR). Searches with the less stringent criteria (RK)(RK)(RK)X(RK)(RK) revealed 1500 and 1180 potential hits in the annotated protein data sets from P. falciparum and Plasmodium yoelii, respectively. Cursory analysis of some of these candidate nuclear proteins revealed a putative histone deacetylase (PY07179; KRKNKK), nuclear protein tRNA-pseudouridine synthase (PF10_0175; RKKRKK), and the putative DNA polymerase subunit (PF10_0362; KKKMKK). Searches of the Cryptosporidium expressed sequence tag database for instances of (RK)(RK)(RK)X(RK)(RK) were also successful in identifying nuclear proteins; among the nine hits were histone H2B (CpEST_AA224688; RKRRKR) and a putative nucleolar protein (CpEST_AA253596; KKKLRK). It should be cautioned, however, that some hits in each database contained the NLS but are not likely to be nuclear based on their homology assignment.
Searches for the TgGCN5 NLS on other GCN5 proteins have been less successful. Stringent matches are not evident on higher eukaryotic GCN5s or the highly related P/CAF HATs, but P/CAF is known to interact with importin-α (8). Similarly, strict matches to the TgGCN5 NLS are also not apparent on fellow apicomplexan GCN5s from P. falciparum (19) or Cryptosporidium parvum (EAK89017). However, runs of 4-5 basic residues exist in the N termini of these GCN5 HATs that are plausible candidates for importin-α binding. Further experimental evidence must be obtained to conclude whether other GCN5s are translocated to the nucleus via the importin pathway.
These observations are consistent with the idea that the importin-mediated nuclear transport pathway is ancient in origin and well conserved among eukaryotes (32). However, it is notable that the apicomplexan versions are typically more plantlike, which may have relevant functional implications. For example, plant importin-α proteins are able to translocate their cargo into the nucleus without being bound by importin-β (41). Further study is required to test whether this is true in apicomplexan nuclear transport.
Recent studies have also indicated that components of the importin pathway have roles beyond nuclear trafficking. Importins and the Ran GTPase cycle are required for spindle formation and embryonic mitosis in Caenorhabditis elegans (42,43). Importin-α can associate with Xenopus laevis egg membranes and is involved in the formation of the nuclear envelope (44). There is also evidence that importins function to prevent the aggregation of cytosolic proteins harboring basic-rich domains by binding to, and hence shielding, that exposed region (45). It would be of interest to determine whether similar functions exist for the apicomplexan importin pathway members.
Apicomplexans can be considered "minimal" eukaryotes, because their evolutionary trajectory branches before the divergence of fungi, plants, and animals (46). The possession of a streamlined nucleocytoplasmic transport system with only one importin-␣ makes this early eukaryote attractive to study for dissecting the fundamentals of this pathway. The continued study of both parasite nuclear trafficking and chromatin remodeling in T. gondii promises to reveal insight into how these important biological processes evolved and may reveal differences that can be exploited therapeutically.