A Novel Pax-like Protein Involved in Transcriptional Activation of Cyst Wall Protein Genes in Giardia lamblia*

Giardia lamblia differentiates into infectious cysts to survive outside of the host. It is of interest to identify factors involved in up-regulation of cyst wall proteins (CWPs) during this differentiation. Pax proteins are important regulators of development and cell differentiation in Drosophila and vertebrates. No member of this gene family has been reported to date in yeast, plants, or protozoan parasites. We have identified a pax-like gene (pax1) encoding a putative paired domain in the G. lamblia genome. Epitope-tagged Pax1 localized to nuclei during both vegetative growth and encystation. Recombinant Pax1 specifically bound to the AT-rich initiator elements of the encystation-induced cwp1 to -3 and myb2 genes. Interestingly, overexpression of Pax1 increased cwp1 to -3 and myb2 gene expression and cyst formation. Deletion of the C-terminal paired domain or mutation of the basic amino acids of the paired domain resulted in a decrease of the transactivation function of Pax1. Our results indicate that the Pax family has been conserved during evolution, and Pax1 could up-regulate the key encystation-induced genes to regulate differentiation of the protozoan eukaryote, G. lamblia.

G. lamblia is of evolutionary interest because it has been proposed as an early branching eukaryotic lineage (16,17). Within its genome, very simplified components have been identified for many cellular processes, including DNA synthesis, transcription, and RNA processing, suggesting that the missing components are too divergent to be recognized or that they are nonessential (18). Many aspects of giardial gene transcription are unusual. For example, G. lamblia lacks eight of the 12 general transcription initiation factors (18-20), and it does not have some components of multisubunit mediators that bridge transcriptional activators or repressors to basal RNA polymerase II initiation machinery (21). Known giardial transcription factors, including TATA-binding protein, appear to have diverged at a higher rate than those of crown group eukaryotes (19). Giardial RNA polymerase II has no regular heptad repeats in the C-terminal domain (20). In addition, unusually short 5′-flanking regions (<65 bp) are sufficient for the expression of many giardial protein-coding genes (7,8,10,22-25). Within the short promoter regions, no consensus TATA boxes or other cis-acting elements characteristic of higher eukaryotic promoters have been observed (7,8,10,22-25). Instead, AT-rich sequences have been found around the transcription start sites of many genes (7,8,10,22-25). They are functionally similar to the initiator (Inr) element in late branching eukaryotes because they are essential for promoter activity and play a predominant role in determining the positions of the transcription start sites (22-24).
Little is known of the molecular mechanisms governing transcriptional regulation of the cyst wall biosynthetic pathway. During encystation, genes encoding three CWPs are coordinately induced (7-9). Few transcription factors regulating cwp gene expression have been characterized to date (26-28). A Myb family transcription factor (Myb2) that is up-regulated during encystation may be involved in coordinating up-regulation of the cwp1 to -3 genes (26,29). Two GARP (named from the maize GOLDEN2, Arabidopsis response regulator proteins and the Chlamydomonas Psr1 protein) family transcription factors may be involved in transcriptional regulation of the encystation-induced cwp1 gene and constitutive ran gene (27). An AT-rich interaction domain (ARID) 3 family transcription factor can bind to specific AT-rich Inr sequences of three cwp genes and function as an important transactivator in the regulation of the cwp1 gene (28). WRKY can bind to specific sequences in the cwp1, cwp2, and myb2 promoters and up-regulate expression of these genes (13).
Many Pax proteins have similar sequence specificities for DNA binding (45). Pax proteins can also recognize more than one DNA sequence; that is, they have diverse DNA binding sequences (39). Either the paired domain or the homeodomain possesses independent DNA binding activity. The two domains may cooperate to bind DNA, and the binding sequences of Pax proteins include two juxtaposed or overlapping sites, one recognized by the homeodomain and the other by the paired domain (46). The RED subdomain, which is more variable than the PAI subdomain, may contribute to the different sequence specificities of Pax proteins (39).
Because Pax proteins play critical roles in development and cell differentiation of many eukaryotes, we asked whether Giardia has Pax family proteins and whether they influence gene expression during Giardia differentiation into dormant cysts. We searched the Giardia genome data base for genes encoding Pax-like domains. We identified two giardial Pax homologs and determined the function of Pax1 in Giardia. We found that Pax1 bound to specific AT-rich initiator sequences and functioned as a transactivator of the cwp1 to -3 and myb2 genes to regulate G. lamblia differentiation into dormant cysts.
Isolation and Analysis of the pax1 Gene-The G. lamblia genome data base (available on the Giardia Genomics Resources web site) (18,49) was searched with the amino acid sequences of the paired domain of Drosophila Pox meso (GenBank™ accession number NM_001043222) using the BLAST program (50). This search detected two putative Pax homologues (open reading frames 32686 (Pax1) and 16640 (Pax2) in the G. lamblia genome data base). The Pax1 coding region with 300 nt of 5′-flanking regions was cloned, and the nucleotide sequence was determined. The pax1 gene sequence in the data base was correct. To isolate the cDNA of the pax1 gene, we performed RT-PCR with pax1-specific primers using total RNA from G. lamblia. For RT-PCR, 5 µg of DNase-treated total RNA from vegetative and 24 h encysting cells was mixed with oligo(dT)12-18, random hexamers, and SuperScript II RNase H− reverse transcriptase (Invitrogen). Synthesized cDNA was used as a template in subsequent PCR with primers Pax1F (CACCATGTCCGAGTATGATGAGCA) and Pax1R (ATACACATCAACGTCCATCT). Genomic and RT-PCR products were cloned into pGEM-T easy vector (Promega) and sequenced (Applied Biosystems).
Plasmid Construction-All constructs were verified by DNA sequencing with a BigDye Terminator 3.1 DNA sequencing kit and an Applied Biosystems 3100 DNA analyzer (Applied Biosystems). Plasmid 5′Δ5N-Pac was a gift from Dr. Steven Singer and Dr. Theodore Nash (51). The pax1 gene and its 300-bp 5′-flanking region were amplified with oligonucleotides Pax1AF (GGGCGCCTAGGATAAGTGTCTTTCGGCAGGC) and Pax1AUXR (GGGCCTCTAGACTAGATGTATCGATACGTATCATACACATCAACGTCCATCT), digested with AvrII/XbaI, and cloned into NheI/XbaI-digested pPW1N (52). The resulting plasmid, pPPax1, contained the pax1 gene controlled by its own promoter with an AU1 tag fused at its C terminus. For constructing pPPax1m1, a PCR with oligonucleotides Pax1AF and Pax1m1R (GTGAATAAGcgaCTCaccCGTTTCCTGGGGTATGATTAAATACTCgctgcaGTTGCTAGCTGAaccCCCcatTTGAGGGAT; mutated nucleotides are shown in lowercase type) generated a 1.1-kb product. Another PCR with primers Pax1m1F (ATCCCTCAAatgGGGggtTCAGCTAGCAACtgcagcGAGTATTTAATCATACCCCAGGAAACGggtGAGtcgCTTATTCAC) and Pax1AUXR generated a 0.4-kb PCR product. A second run of PCR with the above two products and primers Pax1AF and Pax1AUXR generated a 1.5-kb PCR product that was digested with AvrII/XbaI and cloned into NheI/XbaI-digested pPW1N (13). The resulting plasmid, pPPax1m1, contains a pax1 gene with a mutation of the coding region of a stretch of basic amino acids between residues 240 and 260, which is located inside the paired domain. For constructing pPPax1m2, a PCR with oligonucleotides Pax1AF and Pax1m2R (GGAAACAACcacggtGGAgtttgcCCCcatCGGCCCCACCTCAAGCTGACCACAATCccaGAAacaaccAACGATGTT) generated a 1.3-kb product. Another PCR with primers Pax1m2F (AACATCGTTggttgtTTCtggGATTGTGGTCAGCTTGAGGTGGGGCCGatgGGGgcaaacTCCaccgtgGTTGTTTCC) and Pax1AUXR generated a 0.2-kb PCR product. A second run of PCR with the above two products and primers Pax1AF and Pax1AUXR generated a 1.5-kb PCR product that was digested with AvrII/XbaI and cloned into NheI/XbaI-digested pPW1N (13).
The resulting plasmid, pPPax1m2, contains a pax1 gene with a mutation of the coding region of a stretch of basic amino acids between residues 291 and 310, which is located inside the paired domain. For constructing pPPax1m3, a PCR product generated with oligonucleotides Pax1AF and Pax1m3AUXR (GGGCCTCTAGACTAGATGTATCGATACGTATCATCTTCAGGGCCAAGCTC) was digested with AvrII and XbaI and cloned into NheI/XbaI-digested pPW1N (13). The resulting plasmid, pPPax1m3, contains a pax1 gene lacking the C-terminal region (residues 233-368) containing the paired domain.

Transfection and Western Blot Analysis-Cells transfected with the pP series plasmid containing the pac gene were selected and maintained with 54 µg/ml puromycin. Western blots were probed with anti-V5-HRP (Invitrogen) or anti-AU1 monoclonal antibody (Covance, Princeton, NJ; 1:5000 in blocking buffer) or anti-CWP1 antibody (1:10,000 in blocking buffer) (29) and detected with peroxidase-conjugated goat anti-mouse IgG (Pierce; 1:5000 in blocking buffer) or peroxidase-conjugated goat anti-rabbit IgG (Pierce; 1:5000) and enhanced chemiluminescence (GE Healthcare).
Expression and Purification of Recombinant Pax1 Protein-The genomic pax1 gene was amplified using oligonucleotides Pax1F and Pax1R. The product was cloned into the expression vector pET101/D-TOPO (Invitrogen) in frame with the C-terminal His and V5 tag to generate plasmid pPax1. To make the pPax1m1 (or pPax1m2) expression vector, the pax1 gene was amplified using two primer pairs, Pax1F and Pax1m1R (GTGAATAAGCGACTCACCCGTTTCCTGGGGTATGATTAAATACTCGCTGCAGTTGCTAGCTGAACCCCCCATTTGAGGGAT) (or Pax1m2R; GGAAACAACCACGGTGGAGTTTGCCCCCATCGGCCCCACCTCAAGCTGACCACAATCCCAGAAACAACCAACGATGTT) and Pax1m1F (ATCCCTCAAATGGGGGGTTCAGCTAGCAACTGCAGCGAGTATTTAATCATACCCCAGGAAACGGGTGAGTCGCTTATTCAC) (or Pax1m2F; AACATCGTTGGTTGTTTCTGGGATTGTGGTCAGCTTGAGGTGGGGCCGATGGGGGCAAACTCCACCGTGGTTGTTTCC) and Pax1R. The two PCR products were purified and used as templates for a second PCR. The second PCR also included primers Pax1F and Pax1R, and the product was cloned into the expression vector to generate plasmid pPax1m1 (or pPax1m2). To make the pPax1m3 expression vector, the pax1 gene was amplified using two primers, Pax1F and Pax1m3R (ATCTTCAGGGCCAAGCTC). The product was cloned into the expression vector to generate plasmid pPax1m3. The pPax1, pPax1m1, pPax1m2, or pPax1m3 plasmid was freshly transformed into Escherichia coli BL21 Star™ (DE3) (Invitrogen). An overnight preculture was used to start a 250-ml culture. E. coli cells were grown to an A600 of 0.5 and then induced with 1 mM isopropyl-β-D-thiogalactopyranoside (Promega) for 4 h. Bacteria were harvested by centrifugation and sonicated in 10 ml of buffer A (50 mM sodium phosphate, pH 8.0, 300 mM NaCl) containing 10 mM imidazole and protease inhibitor mixture (Sigma). The samples were centrifuged, and the supernatant was mixed with 1 ml of 50% slurry of nickel-nitrilotriacetic acid Superflow (Qiagen). The resin was washed with buffer A containing 20 mM imidazole and eluted with buffer A containing 250 mM imidazole.
Fractions containing Pax1, Pax1m1, Pax1m2, or Pax1m3 were pooled; dialyzed in 25 mM HEPES, pH 7.9, 20 mM KCl, and 15% glycerol; and stored at −70°C. Protein purity and concentration were estimated by Coomassie Blue and silver staining compared with serum albumin. Pax1, Pax1m1, Pax1m2, or Pax1m3 was purified to apparent homogeneity (>95%).
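The two-step PCR used for the m1 and m2 mutants can only join the two first-round products if the forward and reverse mutagenic primers are exact reverse complements of each other. The following minimal Python sketch checks this for the Pax1m1 pair as printed above; the helper name `reverse_complement` is illustrative and not part of the original methods.

```python
# Sanity check for a mutagenic primer pair used in two-step PCR.
# Sequences are the Pax1m1F / Pax1m1R primers as printed in the text
# (uppercase form); the helper name is an illustrative assumption.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PAX1M1F = (
    "ATCCCTCAAATGGGGGGTTCAGCTAGCAACTGCAGC"
    "GAGTATTTAATCATACCCCAGGAAACGGGTGAGTCGCTTATTCAC"
)
PAX1M1R = (
    "GTGAATAAGCGACTCACCCGTTTCCTGGGGTATGATTAAATACTC"
    "GCTGCAGTTGCTAGCTGAACCCCCCATTTGAGGGAT"
)

# True when the two primers can anneal to each other over their full length.
pair_is_complementary = reverse_complement(PAX1M1F) == PAX1M1R
print(len(PAX1M1F), len(PAX1M1R), pair_is_complementary)
```

The same check can be applied to the Pax1m2F/Pax1m2R pair by substituting the sequences.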
Generation of Anti-Pax1 Antibody-Purified Pax1 protein was used to generate rabbit polyclonal antibodies through a commercial vendor (Angene, Taipei, Taiwan).
Electrophoretic Mobility Shift Assay-Double-stranded oligonucleotides specified throughout were 5′-end-labeled as described (23). Binding reaction mixtures contained the components described (28). Labeled probe (0.02 pmol) was incubated for 15 min at room temperature with 5 ng of purified Pax1, Pax1m1, Pax1m2, or Pax1m3 protein in a 20-µl volume supplemented with 0.5 µg of poly(dI-dC) (Sigma). Competition reactions contained a 200-fold molar excess of cold oligonucleotides. In an antibody supershift assay, 0.8 µg of an anti-V5-horseradish peroxidase antibody (Bethyl Laboratories) was added to the binding reaction mixture. The mixture was separated on a 6% acrylamide gel by electrophoresis.
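As a rough guide to the stoichiometry of these binding reactions, the quantities above can be converted to molar amounts. The sketch below is an approximation only: it assumes the predicted mass of untagged Pax1 (~40.48 kDa) for the recombinant protein, although the V5/His tag would add a few kilodaltons, and the function names are illustrative.

```python
# Approximate molar bookkeeping for one 20-µl EMSA binding reaction.
# Assumption: untagged Pax1 mass (~40.48 kDa) stands in for the tagged
# recombinant protein, so the protein amount is slightly overestimated.

PROBE_PMOL = 0.02       # labeled probe per reaction
COMPETITOR_FOLD = 200   # molar excess of cold oligonucleotide
PROTEIN_NG = 5.0        # purified Pax1 per reaction
PAX1_KDA = 40.48        # predicted mass of untagged Pax1
VOLUME_UL = 20.0        # reaction volume


def ng_to_pmol(mass_ng: float, mass_kda: float) -> float:
    """Convert ng of protein to pmol (1 kDa corresponds to 1 ng/pmol)."""
    return mass_ng / mass_kda


def pmol_to_nanomolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert pmol in a given µl volume to nM (pmol/µl equals µM)."""
    return amount_pmol / volume_ul * 1000.0


competitor_pmol = PROBE_PMOL * COMPETITOR_FOLD
protein_pmol = ng_to_pmol(PROTEIN_NG, PAX1_KDA)
probe_nM = pmol_to_nanomolar(PROBE_PMOL, VOLUME_UL)
protein_nM = pmol_to_nanomolar(protein_pmol, VOLUME_UL)

print(f"cold competitor: {competitor_pmol:.1f} pmol")
print(f"Pax1: {protein_pmol:.3f} pmol ({protein_nM:.1f} nM)")
print(f"probe: {probe_nM:.1f} nM; protein/probe = {protein_pmol / PROBE_PMOL:.1f}")
```

Under these assumptions the protein is in a severalfold molar excess over the labeled probe, and the cold competitor is in a larger excess over both.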
Microarray Analysis-RNA was quantified by A260 nm with an ND-1000 spectrophotometer (Nanodrop Technology), and its quality was assessed with a Bioanalyzer 2100 (Agilent Technologies) and an RNA 6000 Nano LabChip kit. RNA from the pPPax1 cell line was labeled with Cy5, and RNA from the 5′Δ5N-Pac cell line was labeled with Cy3. 0.5 µg of total RNA was amplified with a Low RNA Input Fluor Linear Amp kit (Agilent Technologies) and labeled with Cy3 or Cy5 (CyDye, PerkinElmer Life Sciences) during the in vitro transcription process. 0.825 µg of Cy-labeled cRNA was fragmented to an average size of about 50-100 nucleotides by incubation with fragmentation buffer at 60°C for 30 min. Correspondingly fragmented labeled cRNA was then pooled and hybridized to a G. lamblia oligonucleotide microarray (Agilent Technologies) at 60°C for 17 h. After washing and drying by nitrogen gun blowing, microarrays were scanned with an Agilent microarray scanner (Agilent Technologies) at 535 nm for Cy3 and 625 nm for Cy5. Scanned images were analyzed by Feature Extraction version 9.1 software (Agilent Technologies), and image analysis and normalization software was used to quantify signal and background intensity for each feature; data were substantially normalized by the rank consistency filtering LOWESS method.
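The idea behind two-color normalization can be illustrated with a much simpler stand-in than the rank consistency filtering LOWESS method used here: compute a log2(Cy5/Cy3) ratio per feature and subtract the array-wide median so that a global dye bias is removed. The sketch below uses hypothetical intensities and global median centering, which is an assumed simplification and not the procedure applied by the Feature Extraction software.

```python
# Simplified two-color normalization: log2 ratios with median centering.
# This is a stand-in for intensity-dependent LOWESS normalization and
# uses hypothetical background-subtracted intensities.
import math
from statistics import median


def log2_ratios(cy5, cy3):
    """Per-feature log2(Cy5/Cy3) for paired intensity lists."""
    return [math.log2(r / g) for r, g in zip(cy5, cy3)]


def median_center(values):
    """Subtract the median so the typical feature has a log ratio of 0."""
    m = median(values)
    return [v - m for v in values]


# Hypothetical intensities: Cy5 = overexpressing line, Cy3 = control line.
cy5 = [800.0, 1600.0, 400.0, 3200.0, 1000.0]
cy3 = [400.0, 800.0, 200.0, 800.0, 500.0]

raw = log2_ratios(cy5, cy3)
centered = median_center(raw)
fold_changes = [2.0 ** v for v in centered]
print(raw, centered, fold_changes)
```

In this toy data set every feature shows an apparent 2-fold Cy5 excess before centering; after centering, only the fourth feature remains elevated.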

RESULTS
Identification and Expression of pax1 Gene-To identify genes encoding novel Pax proteins from G. lamblia, we performed BLAST searches (50) against the G. lamblia genome data base (available on the World Wide Web) (18,49) using the amino acid sequences of the paired domain of Drosophila Pox meso (GenBank™ accession number NM_001043222) as a query sequence. Amino acid sequences with similarity to the paired domain were found in two proteins that we named Pax1 and Pax2 (open reading frames 32626 and 16640, respectively, in the G. lamblia genome data base). We first focused on understanding the role of Pax1 in Giardia. Comparison of genomic and cDNA sequences showed that the pax1 gene contained no introns. The deduced giardial Pax1 protein contains 368 amino acids with a predicted molecular mass of ~40.48 kDa and a pI of ~6.68. It has a putative paired domain, as predicted by Pfam (available on the World Wide Web) (53) (Fig. 1A). In contrast to the human Pax family and Drosophila Prd (30), the paired domains of which are N-terminal, the giardial Pax1 paired domain is near the C terminus (residues 233-363) (Fig. 1A). Unlike some human Pax family members, which contain a homeodomain or octapeptide, giardial Pax1 does not have these motifs as predicted by Pfam (available on the World Wide Web) (Fig. 1A).
The structure of the paired domain includes two subdomains, PAI and RED, each of which possesses a helix-turn-helix structure (H1-H3 for PAI and H4-H6 for RED) (Fig. 1B) (38). Structural studies of the paired domain of human Pax6 show that the H3 and H6 of each helix-turn-helix motif recognize distinct half-sites and contact the DNA major grooves of these sites (Fig. 1B, arrows) (54). A β hairpin and a linker that connects two subdomains contact the DNA minor groove (Fig. 1B, arrowheads) (54). Few of the key contact residues identified by structural studies of human Pax6 or Drosophila Prd are conserved in Pax1 (Fig. 1B). The similarity between the giardial Pax1 and human Pax family is limited to their paired domains (data not shown).
Encystation-induced Expression of the pax1 Gene-RT-PCR and quantitative real-time PCR analysis of total RNA showed that the pax1 transcript was present in vegetative cells and increased by ~1.4-fold in 24-h encysting cells (Fig. 2A). As controls, we found that the mRNA levels of the cwp1 and ran genes increased and decreased significantly during encystation, respectively (Fig. 2A). The products of the cwp1 and ran genes are a component of the cyst wall and the Ras-related nuclear protein, respectively (7,56). To determine the expression of Pax1 protein, we generated an antibody specific to the full-length Pax1. Western blot analysis confirmed that this antibody recognized Pax1 at a size of ~40 kDa (Fig. 2B). Pax1 was expressed in vegetative cells, and its levels increased slightly during encystation (Fig. 2B).
Localization of the Pax1 Protein-To determine the role of Pax1 protein, we prepared the construct pPPax1, in which the pax1 gene is controlled by its own promoter and contains an AU1 epitope tag at its C terminus (Fig. 2C), and stably transfected it into Giardia. The AU1-tagged Pax1 was detected in the nuclei during vegetative growth and encystation (Fig. 2D), indicating that Pax1 is a nuclear protein in Giardia. As a negative control, there was no staining for anti-AU1 antibody detection in the 5′Δ5N-Pac cell line, which expressed only the puromycin selection marker (Fig. 2, C and D). We further identified the portion of Pax1 that is sufficient to direct the protein to the nuclei. A typical nuclear localization signal was predicted in residues 291-310 of Pax1 using the PSORT program (available on the World Wide Web) (57) (Fig. 1B, solid line). A region rich in basic amino acid residues may be a putative nuclear localization signal (residues 240-260) (Fig. 1B, dotted line), although it was not predicted as a nuclear localization signal by the PSORT program. These two putative nuclear localization signals are located inside the paired domain (Fig. 1B). Mutation of the basic amino acids between residues 240 and 260 (pPPax1m1) (Figs. 1B and 3A) did not affect nuclear localization in either vegetative or encysting cells (Fig. 3, B-G). Mutation of the basic amino acids between residues 291 and 310 (pPPax1m2) (Figs. 1B and 3A) resulted in loss of nuclear localization in both vegetative and encysting cells (Fig. 3, H-M), suggesting that these basic residues may play an important role in the exclusively nuclear localization. The staining was limited to some small vesicles in the cytosol (Fig. 3, B-G). Deletion of the C-terminal region containing the paired domain and the C-terminal 5 amino acids (residues 233-368, pPPax1m3; Figs. 1A and 3A) resulted in a significant decrease of nuclear localization (Fig. 3, N-S). The staining was evenly distributed in both the nuclei and cytosol of both vegetative and encysting cells (Fig. 3, N-S).

(77). GenBank™ accession numbers for human Pax1 to -9 and Drosophila Prd are NM_006192, NM_000278, NM_181458, NM_006193, NM_016734, NM_000280, NM_001135254, NM_003466, NM_006194, and NM_164990, respectively. Letters in black boxes, letters in gray boxes, and hyphens indicate identical amino acids, conserved amino acids, and gaps in the respective proteins, respectively. Gray boxes indicate the α-helices in the paired domain of human Pax6 (54). The arrows indicate the key residues contacting the major groove in human Pax6 or Drosophila Prd (54,58). The arrowheads indicate the residues that make contact with the minor groove/phosphodiester backbone in human Pax6 or Drosophila Prd (54,58). A region (residues 291-310) with a typical nuclear localization signal predicted by the PSORT program (available on the World Wide Web) (57) is underlined by a solid line. A region (residues 240-260) rich in basic amino acid residues is underlined by a dotted line.

OCTOBER 15, 2010 • VOLUME 285 • NUMBER 42

Regulation of cwp Genes by Pax1 in G. lamblia
Identification of the Pax1 Binding Sites-The nuclear localization of Pax1 suggested that it might also function as a transcription factor in G. lamblia. To test its DNA binding activity, we expressed Pax1 with a C-terminal V5 tag in E. coli and purified it to >95% homogeneity, as assessed in a silver-stained gel (data not shown). An anti-V5-horseradish peroxidase antibody specifically recognized the recombinant V5-tagged Pax1 in Western blots (Fig. 4A).
To determine whether purified Pax1 binds DNA, we performed electrophoretic mobility shift assays with double-stranded DNA sequences from the 5′-flanking region of an encystation-induced gene, cwp1. Incubation of a labeled double-stranded DNA probe, cwp1−45/−1, with Pax1 resulted in the formation of shifted bands (Fig. 4B, lane 2). cwp1−45/−1 is the region from −45 to −1 bp relative to the translation start site of the cwp1 gene. Pax1 did not bind to either single strand of the cwp1−45/−1 probe (data not shown). The binding specificity was confirmed by competition and supershift assays (Fig. 4B, lanes 3-5). Pax1 bound to cwp1−45/−1 could be supershifted by an anti-V5-horseradish peroxidase antibody (Fig. 4B, lane 3). The formation of the shifted cwp1−45/−1 bands was almost totally competed by a 200-fold molar excess of unlabeled cwp1−45/−1 but not by the same excess of a nonspecific competitor, cwp1−45/−1m7 (Fig. 4B, lanes 4 and 5). Scanning mutagenesis of the cwp1−45/−1 probe showed that substitutions within the AATAAA sequence significantly decreased the DNA-protein interaction (cwp1−45/−1m7; Fig. 5, lane 8), but mutations of the other regions caused a minor decrease in binding (Fig. 5, lanes 2-7, 9, and 10). Substitutions of the three Ts within the AAATAAAATAT region also caused a minor decrease in binding (Fig. 5, lane 11).
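The logic of the scanning mutagenesis, locating the AATAAA element in a probe and confirming that a substitution removes it, can be sketched in a few lines of Python. The 20-nt probe below is hypothetical and is not the actual cwp1−45/−1 sequence; the function names and the substituted bases are likewise illustrative assumptions.

```python
# Locate a candidate binding motif in a probe and confirm that a
# substitution mutant no longer contains it.
# The probe sequence is hypothetical, not the real cwp1 -45/-1 region.

MOTIF = "AATAAA"


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of all (overlapping) motif matches."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in the sequence."""
    return sum(base in "AT" for base in seq) / len(seq)


probe = "GCGCAATAAAATATGCGCGC"            # hypothetical AT-rich probe
hits = find_motif(probe, MOTIF)

# Replace the first motif occurrence with GC-rich bases (illustrative).
site = hits[0]
mutant = probe[:site] + "CCGCCC" + probe[site + len(MOTIF):]

print(hits, at_fraction(probe))
print(find_motif(mutant, MOTIF), at_fraction(mutant))
```

A real analysis would run the same scan over each mutant probe shown in Fig. 5 and compare the result with the observed binding.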
Pax1 was also shown to bind to cwp1−90/−46, and within this region it bound to the 3′-region (cwp1−68/−46) and

The filled black box indicates the coding sequence of the AU1 epitope tag. D, nuclear localization of Pax1. The pPPax1 stable transfectants were cultured in growth (Veg, left panels) or encystation medium for 24 h (Enc, right panels) and then subjected to immunofluorescence analysis using anti-AU1 antibody for detection. The product of pPPax1 localizes to the nuclei in both vegetative and encysting trophozoites (upper panels). The middle panels show the DAPI staining of cell nuclei. The bottom panels are the merged images of the DAPI staining and images of Pax1-AU1. As a negative control, there was no staining for anti-AU1 antibody detection in the 5′Δ5N-Pac cell line.

Studies suggest that Drosophila Prd and human Pax6 can bind to both DNA major (using the H3 and H6; Fig. 1B, arrows) and minor grooves (using the β hairpin and linker, which connects PAI and RED subdomains; Fig. 1B, arrowheads) (54,59). To investigate how Pax1 binds DNA, we used distamycin A, which binds to the minor groove of AT-rich DNA sequences, as a competitive inhibitor of Pax1 binding (60). As shown in Fig. 7B, the binding of Pax1 to DNA decreased with increasing concentrations of distamycin A. However, the binding was not completely inhibited at concentrations of ~5 mM, suggesting that Pax1 may bind to both DNA major and minor grooves.
The paired domains of the human Pax proteins are known to be important for DNA binding (30). We found that a Pax1 mutant with a deletion of the C-terminal region containing the paired domain (residues 233-368, Pax1m3) showed reduced nuclear localization and increased cytosolic localization (Fig. 3, A and N-S). To understand whether the C-terminal paired domain is also important for DNA binding, the Pax1m3 mutant was expressed in E. coli and purified. We found that the purified Pax1m3 did not bind to the cwp1−45/−1, cwp1−90/−46, ran−51/−20, cwp2−30/+8, and cwp3−30/+10 probes (Fig. 8, B-F), indicating that the C-terminal paired domain is important for DNA binding. Two putative nuclear localization signals have been found inside of the paired domain, of which the C-terminal signal (residues 291-310) but not the N-terminal signal (residues 240-260) is important for nuclear localization (Fig. 3). We also asked whether the regions we tested for nuclear localization are important for DNA binding. Mutation of the basic amino acids between residues 240 and 260 (Pax1m1) or between residues 291 and 310 (Pax1m2) (Fig. 3A) resulted in a complete loss of binding activity to the cwp1−45/−1, cwp1−90/−46, ran−51/−20, cwp2−30/+8, and cwp3−30/+10 probes (Fig. 8, B-F). Similar or higher levels of Pax1m1 to -3 were added to the binding reaction mixture (Fig. 8A).
Overexpression of Pax1 Induced the Expression of cwp1 to -3 and myb2 Genes-To study the role of Pax1 in G. lamblia, we expressed pax1 by its own promoter (pPPax1; Fig. 2C) and observed its gene expression. A ~40-kDa protein was detected (Fig. 9A), which matches the predicted molecular mass of Pax1 (~40.5 kDa) with the AU1 tag (~0.8 kDa). Similar to the expression pattern of the endogenous Pax1 protein, the levels of the Pax1-AU1 protein increased slightly during encystation (data not shown; also see Fig. 2B). We found that Pax1 overexpression resulted in a significant increase of the CWP1 protein levels during vegetative growth (Fig. 9A). We also found that the cyst number in the Pax1-overexpressing cell line (pPPax1) increased by ~1.4-fold (p < 0.05) (data not shown) relative to the control cell line, which expresses only the puromycin selection marker (5′Δ5N-Pac) (Fig. 2C), indicating that the overexpressed Pax1 can increase cyst formation. RT-PCR and quantitative real-time PCR analysis showed that the mRNA levels of the endogenous pax1 plus vector-expressed pax1 in the Pax1-overexpressing cell line increased by ~2-fold (p < 0.05) (Fig. 9, B and C) relative to the control cell line, which expressed only the puromycin selection marker (5′Δ5N-Pac) (Fig. 2C) (51). The mRNA levels of the endogenous cwp1, cwp2, cwp3, and myb2 genes in the Pax1-overexpressing cell line increased by ~1.7-2.5-fold (p < 0.05) relative to the 5′Δ5N-Pac control cell line (Fig. 9, B and C). The egfcp1 mRNA levels decreased significantly in the Pax1-overexpressing cell line (Fig. 9, B and C). Similar mRNA levels of the ran, ribosomal protein L7 (open reading frame 19436), and 18 S ribosomal RNA genes were detected (Fig. 9, B and C). Similar results were obtained during encystation (data not shown). The results suggest that the overexpressed Pax1 can transactivate the cwp1, cwp2, cwp3, and myb2 genes.

1). B, detection of Pax1 binding sites. Electrophoretic mobility shift assays were performed using purified Pax1 and the 32P-end-labeled oligonucleotide probe cwp1−45/−1 (−45 to −1 relative to the translation start site of the cwp1 gene). Components in the binding reaction mixtures are indicated above the lanes. The arrowhead indicates the shifted complex. The Pax1 binding specificity for the cwp1−45/−1 probe was confirmed by competition and supershift assays. Some reaction mixtures contained a 200-fold molar excess of cold oligonucleotides cwp1−45/−1 or cwp1−45/−1m7 or 0.8 µg of anti-V5-horseradish peroxidase antibody, as indicated above the lanes.

FIGURE 5. Mutation analysis of the cwp1−45/−1 probe sequence containing the putative Pax1 binding site. Electrophoretic mobility shift assays were performed using purified Pax1 and various 32P-end-labeled cwp1−45/−1 mutant probes as described. Base changes in the mutants are shown in underlined lowercase type. Components in the binding reaction mixtures are indicated above the lanes. The arrowhead indicates the shifted complex. The transcription start sites of the cwp1 gene determined from 24-h encysting cells are indicated by asterisks (8). The AT-rich Inr element spanning the transcription start site is underlined. +, +/−, and −, moderate binding, weak binding, and no binding, respectively. +++ and ++, strong binding.
To further understand the function of giardial Pax1, we observed the effect of overexpression of the Pax1m1, Pax1m2, and Pax1m3 mutants that cannot bind DNA (Fig. 8). Pax1m1 can enter nuclei, but Pax1m2 cannot enter nuclei. Pax1m3 localized to both nuclei and cytoplasm (Fig. 3). We found that the levels of Pax1m1, Pax1m2, or Pax1m3 protein increased significantly compared with that of wild type Pax1 during vegetative growth (Fig. 9A). We further analyzed whether the transcript levels of the Pax1m1, Pax1m2, or Pax1m3 were changed. As shown by RT-PCR and quantitative real-time PCR analysis, the levels of AU1-tagged pax1m1 to -3 mRNA increased by ~11.4-, ~17-, or ~30-fold (p < 0.05) compared with that of wild type AU1-tagged pax1 during vegetative growth (Fig. 9, B and C). This suggests a negative autoregulation of the pax1 gene. We did not detect any AU1-tagged pax1 transcripts in the 5′Δ5N-Pac control cell line (Fig. 9C). In addition, the levels of the CWP1 protein and of the cwp1, cwp2, cwp3, and myb2 mRNA decreased significantly in the Pax1m1-, Pax1m2-, or Pax1m3-overexpressing cell line relative to the wild-type Pax1-overexpressing cell line (Fig. 9, B and C). Similar results were obtained during encystation (data not shown). The results suggest a decrease of transactivation activity of Pax1m1, Pax1m2, and Pax1m3.
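The fold changes reported here are ratios of transcript levels, normalized to 18 S ribosomal RNA, in a test cell line relative to the control cell line. One common way to compute such a ratio from real-time PCR threshold cycles is the comparative Ct calculation sketched below; the source does not state which quantification method was used, so the method, the assumption of 100% amplification efficiency, and all Ct values are illustrative only.

```python
# Comparative Ct (2^-ddCt) fold-change sketch for real-time PCR data.
# Assumptions: 100% amplification efficiency and hypothetical Ct values;
# the actual quantification method is not specified in the text.


def fold_change(ct_target_test: float, ct_ref_test: float,
                ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Return target expression in the test line relative to control."""
    delta_test = ct_target_test - ct_ref_test   # normalize to reference gene
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


# Hypothetical example: target reaches threshold one cycle earlier in the
# overexpressing line while the 18 S reference is unchanged.
example = fold_change(ct_target_test=24.0, ct_ref_test=10.0,
                      ct_target_ctrl=25.0, ct_ref_ctrl=10.0)
print(example)
```

With these hypothetical inputs the target is one cycle earlier in the test line at equal reference levels, which corresponds to a 2-fold increase.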
Oligonucleotide microarray assays confirmed the up-regulation of the cwp1, cwp2, cwp3, and myb2 gene expression in the Pax1-overexpressing cell line to ~1.57- to ~3.0-fold of the levels in the control cell line (Fig. 10A). The egfcp1 gene was down-regulated by Pax1 overexpression (Fig. 10A). Similar mRNA levels of the ran and ribosomal protein L7 genes were detected (Fig. 10A).
Recruitment of Pax1 to the pax1, cwp1 to -3, and myb2 Promoters-We further used ChIP assays to study the association of Pax1 with specific promoters in the Pax1-overexpressing cell line. We found that Pax1 was associated with its own promoter and the cwp1, cwp2, cwp3, myb2, egfcp1, ran, and ribosomal protein L7 promoters during vegetative growth or during encystation (Fig. 10B) (data not shown). However, Pax1 was not associated with the 18 S ribosomal RNA gene promoter, which has no Pax1 binding site (Fig. 10B).

DISCUSSION
The pax gene family encodes a group of transcription factors that perform a variety of cellular functions in the animal lineage, including helminths, insects, marine animals, fish, amphibians, birds, and mammals, but it has not been identified to date in yeast, fungi, plants, or protozoa (30,32-37). In this study, a Pax-like transcription factor, although divergent in sequence, has been identified in G. lamblia. This suggests that the Pax family may have evolved before divergence of G. lamblia from the main eukaryotic line of descent but may have been lost from lineages leading to the yeast, fungi, and plants. To date, giardial Pax1 is the first Pax transcription factor identified in early diverging protozoan parasites. We did BLAST searches of the genome databases of Trichomonas vaginalis, Entamoeba histolytica, Plasmodium falciparum, and Trypanosoma brucei and only identified matches for uncharacterized proteins. Further analyses will reveal whether Pax family proteins are shared with these eukaryotic lineages.

FIGURE 6. Detection of Pax1 binding sites in multiple promoters. Electrophoretic mobility shift assays were performed using purified Pax1 and various 32P-end-labeled oligonucleotide probes as described. Components in the binding reaction mixtures are indicated above the lanes. The arrowhead indicates the shifted complex. The transcription start sites of the cwp2 and cwp3 genes determined from 24-h encysting cells are indicated by asterisks (7,9). The transcription start sites of the ran gene determined from vegetative cells are indicated by asterisks (23). The AT-rich Inr elements spanning the transcription start sites are underlined. The translation start sites of the cwp2 and cwp3 genes are framed. ribopL7 and 18S represent ribosomal protein L7 and 18 S ribosomal RNA. +, +/−, and −, moderate binding, weak binding, and no binding, respectively. +++ and ++, strong binding.

FIGURE 7. Analysis of Pax1 binding ability. A, Pax1 may bind to AT-rich sequence. Electrophoretic mobility shift assays were performed using purified Pax1 and various 32P-end-labeled oligonucleotide probes as described. Components in the binding reaction mixtures are indicated above the lanes. The arrowhead indicates the shifted complex. +, +/−, and −, moderate binding, weak binding, and no binding, respectively. +++ and ++, strong binding. B, effect of distamycin A on the binding of Pax1 to DNA. 32P-end-labeled cwp1−45/−1 probe was incubated with Pax1 in the absence (lane 1) or presence of distamycin A (lanes 3-6). Distamycin A was dissolved in Me2SO. Adding Me2SO to the reaction mix did not decrease the Pax1 binding activity (lane 2).

to -3 protein with a V5 tag at its C terminus was purified by affinity chromatography and then detected by anti-V5-horseradish peroxidase antibody in Western blots. B-F, loss of DNA binding ability of Pax1m1 to -3. Electrophoretic mobility shift assays were performed using purified Pax1 and Pax1m1 to -3 and specific probes, including cwp1−45/−1, cwp1−90/−46, ran−51/−20, cwp2−30/+8, and cwp3−30/+10. The arrowhead indicates the shifted complex.

, and pPPax1m1 to -3 stable transfectants were cultured in growth medium and then subjected to SDS-PAGE and Western blot. The blot was probed by anti-AU1 and anti-CWP1 antibody. Coomassie-stained total protein loading control is shown below. B, quantitative real-time PCR analysis of gene expression in the Pax1- and Pax1m1 to -3-overexpressing cell lines. Real-time PCR was performed using primers specific for pax1, cwp1, cwp2, cwp3, myb2, egfcp1, ran, ribosomal protein L7, and 18 S ribosomal RNA. Similar mRNA levels of the ran, ribosomal protein L7 (data not shown), and 18 S ribosomal RNA genes for these samples were detected. Transcript levels were normalized to 18 S ribosomal RNA levels. -Fold changes in mRNA expression are shown as the ratio of transcript levels in the pPPax1 or pPPax1m1 to -3 cell line relative to the 5′Δ5N-Pac cell line. Results are expressed as the means ± S.E. (error bars) of at least three separate experiments. C, RT-PCR analysis of gene expression in the Pax1- and Pax1m1 to -3-overexpressing cell lines. The 5′Δ5N-Pac, pPPax1, and pPPax1m1 to -3 stable transfectants were cultured in growth medium and then subjected to RT-PCR analysis. PCR was performed using primers specific for pax1-au1, pax1, cwp1, cwp2, cwp3, myb2, egfcp1, ran, ribosomal protein L7, and 18 S ribosomal RNA genes.

Regulation of cwp Genes by Pax1 in G. lamblia
Previously, we have identified four other families of transcription factors in G. lamblia that are up-regulated during encystation and may be involved in encystation, including Myb, ARID, GARP, and WRKY (13, 27-29). ARID and Myb family transcription factors are more widespread. They have been found in insects, helminths, mammals, yeast, fungi, and plants (61-64). However, GARP and WRKY family transcription factors have been found only in plants and not in yeast, fungi, or animals (65,66). It is interesting that G. lamblia possesses transcription factors identified in both plants and animals (Myb and ARID) as well as transcription factors identified only in plants (GARP and WRKY) or only in animals (Pax).
The giardial promoters defined to date, including the encystation-specific cwp promoters, are notably short and contain AT-rich Inr elements (7-9, 22-24, 26, 67). Deletion and mutation analyses of the ran, α2-tubulin, cwp1, and cwp2 promoters have provided evidence for involvement of AT-rich Inr elements in basal promoter activity (22,23,28,67). We have identified several transcription factor genes whose expression is up-regulated during encystation, including Myb2, GARP-like proteins, ARID1, WRKY, and Pax1 (13, 26-28). Among them, ARID1 and Pax1 can bind AT-rich Inr elements (28). Our previous mutation analysis provides further evidence of the function of the AT-rich Inr elements. Mutation of the AT-rich Inr element in the cwp1 promoter might impair the binding of ARID1, Pax1, or other Inr-binding proteins to the Inr, leading to a downstream shift in transcription start site selection during vegetative growth (28). During encystation, transcription start site selection of the cwp1 promoter was shifted upstream, irrespective of mutations in the AT-rich Inr element. Therefore, the binding of some encystation-specific transcription factors to proximal upstream regions might be important for the selection of the upstream transcription start sites. Interestingly, the proximal upstream regions with Myb2, GARP-like proteins, or WRKY binding sites are just next to the AT-rich Inr elements (13,26,27,67), making it more likely that there is an interaction between regulatory and general transcription factors.
Pax proteins in higher eukaryotes are involved in differentiation and function as transcriptional activators or repressors (41-43). Our results show that Pax1 localizes mainly to the nuclei in both vegetative and encysting Giardia trophozoites. The increased levels of the Pax1 protein during encystation indicate that it may also play a role in differentiation. The findings that Pax1 binding sites/Inr in the cwp1 to -3 and myb2 promoters are positive cis-acting elements and that overexpressed Pax1 can induce the cwp1 to -3 and myb2 promoter activity indicate that Pax1 is involved, at least in part, in promoting cwp1 to -3 and myb2 gene expression during encystation. Our results showed that constitutively overexpressed Pax1 increased the levels of the cwp1 to -3 and myb2 mRNA. The levels of the CWP1 protein and cyst formation also increased in the Pax1-overexpressing cell line. ChIP assays confirmed the association of Pax1 with its own promoter and the cwp1 to -3 and myb2 promoters. In addition, deletion of the C-terminal paired domain or mutation of the basic amino acids of the paired domain resulted in a decrease of the transactivation function of Pax1 upon the expression of the cwp1 to -3 and myb2 genes. We also found an important role of the basic residues of the paired domain in the exclusively nuclear localization. The results suggest that Pax1 may play an important role in induction of encystation.
Many important transcription factors involved in developmental regulation and in stress response have an autoregulation mechanism, including mammalian c-Myb and plant WRKY (68,69). Myb2 or WRKY has been found to be positively or negatively autoregulated to maintain its own protein levels, and this is related to the presence of its binding sites in its own promoter region (13,26,29). It has been shown that mammalian Pax proteins may be negatively or positively autoregulated by activating or inhibiting the activity of their own promoters (70). We found that deletion of the C-terminal paired domain or mutation of the basic amino acids of the paired domain resulted in a decrease of nuclear localization. The mRNA levels of these Pax1 mutants increased significantly compared with that of wild type Pax1, suggesting a negative autoregulation of the pax1 gene.
The non-transfected WB cells were cultured in growth medium for 24 h and then subjected to ChIP assays. Anti-Pax1 antibody was used to assess binding of Pax1 to endogenous gene promoters. Preimmune serum was used as a negative control. Immunoprecipitated chromatin was analyzed by PCR using primers that amplify the 5′-flanking region of specific genes. At least three independent experiments were performed. Representative results are shown. Immunoprecipitated products of Pax1 yielded more PCR products of pax1, cwp1, cwp2, cwp3, myb2, egfcp1, ran, and ribosomal protein L7 promoters, indicating that Pax1 was bound to these promoters. The 18 S ribosomal RNA gene promoter was used as a negative control for our ChIP analysis.

The paired domain of giardial Pax1 has few of the conserved key contact residues, but it has a predicted helix-turn-helix structure similar to that of the human Pax family members. Mammalian Pax6 may bind to the G1 element, which contains an AT-rich sequence, to activate glucagon gene expression (71). Our results indicate that the AT-rich initiator sequence may be important for giardial Pax1 binding. Further studies also indicate that giardial Pax1 can bind to a poly(A) sequence with a T, TT, TTT, or TC insertion but not to a poly(G) sequence. The results suggest that giardial Pax1 may recognize variable AT-rich Inr sequences in different gene promoters.
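As a rough illustration of what "AT-rich" means for a probe, the sketch below computes AT content over sliding windows. It is only a crude proxy and not a predictor of Pax1 binding. The window size, the threshold, and the two probe strings are hypothetical choices made for this example and are not taken from the experiments.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 10, threshold: float = 0.8):
    """Return (start, subsequence, AT fraction) for each window whose
    AT fraction meets the threshold. Window and threshold are arbitrary
    illustrative defaults, not values from the study."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        sub = seq[start:start + window]
        frac = at_fraction(sub)
        if frac >= threshold:
            hits.append((start, sub, frac))
    return hits


# Hypothetical probes: poly(A) with a TT insertion, and poly(G).
poly_a_tt = "AAAAAAAATTAAAAAAAA"
poly_g = "GGGGGGGGGGGGGGGGGG"

print(len(at_rich_windows(poly_a_tt)))  # every window is AT-rich
print(len(at_rich_windows(poly_g)))     # no window is AT-rich
```

A scan of this kind only flags candidate regions; whether Pax1 binds a given region still has to be tested experimentally, for example by mobility shift assay.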
Mammalian Pax proteins may have a complete homeodomain (Pax3/7 and Pax4/6) or an incomplete homeodomain with only one helix (Pax2/5/8) (30,40), but Pax1/9 has no homeodomain. The paired domain of Pax1, -2, -3, and -6 has been reported to bind to the GTTCC sequence (72). Many Pax proteins, including Pax6 and Pax8, can recognize the similar sequence (G/T)T(T/C)(C/A)(C/T)(G/C)(G/C) (30). The binding sequences of Pax proteins with both domains include two juxtaposed or overlapping sites, an ATTA motif recognized by the homeodomain and a GTTCC motif recognized by the paired domain (46). The similar binding sequences of different Pax proteins suggest that different Pax proteins can recognize the same target genes. Interestingly, it has been reported that Pax proteins may have a high DNA-binding flexibility, and they may bind to other sequences unrelated to GTTCC (73,74). It is possible that Pax proteins may recognize different target sequences and regulate many different target genes through variable combinations of the PAI and RED subdomains or the homeodomain (73). The variable sequence recognition ability may help Pax proteins to interact with other transcription factors (73). In this study, we found that giardial Pax1 can bind to AT-rich sequence with a high flexibility, suggesting that Pax1 may recognize AT-rich Inr elements of many different gene promoters and that it may interact with different transcription factors.
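The degenerate consensus quoted above can be read as a pattern with alternative bases at some positions. As a minimal sketch, the code below converts the slash notation into a regular expression and scans the forward strand of a sequence. The helper names and the test sequence are hypothetical, and a match indicates only sequence similarity to the consensus, not demonstrated binding.

```python
import re

# Consensus for Pax6/Pax8-type paired domains as written in the text.
PAX_CONSENSUS = "(G/T)T(T/C)(C/A)(C/T)(G/C)(G/C)"


def consensus_to_regex(consensus: str) -> str:
    """Turn '(G/T)T(T/C)...' slash notation into a regex such as '[GT]T[TC]...'."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def find_sites(seq: str, pattern: str):
    """Return (start, matched_sequence) for every match on the forward
    strand, including overlapping matches (found via a lookahead)."""
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq.upper())]


pattern = consensus_to_regex(PAX_CONSENSUS)
print(pattern)  # [GT]T[TC][CA][CT][GC][GC]

# Hypothetical sequence containing GTTCC followed by GG.
print(find_sites("AAGTTCCGGAA", pattern))  # [(2, 'GTTCCGG')]
```

The GTTCC core satisfies the first five positions of the consensus, whereas a pure poly(A) stretch gives no match, which is in line with the point that giardial Pax1 binding to AT-rich sequence differs from this canonical motif.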
Two stretches of basic amino acids are present inside the paired domain, of which the C-terminal one was predicted as a nuclear localization signal by the PSORT program (available on the World Wide Web) (57). We found that mutation of the C-terminal stretch (residues 291-310, Pax1m2) resulted in loss of nuclear localization, suggesting that the C-terminal basic amino acids of the paired domain may play some role in the exclusively nuclear localization. Similarly, a nuclear localization signal has been identified inside the paired domain of human Pax6 (44). In addition, we found a loss of DNA binding activity and a decrease of transactivation ability of Pax1m2 on the expression of the cwp1 to -3 and myb2 genes. Because Pax1m2 was expressed at higher levels, its inactivity may be due to its inability to enter nuclei or to bind DNA. We found that mutation of the N-terminal basic amino acids of the paired domain (residues 240-260, Pax1m1) did not affect nuclear localization but resulted in a loss of DNA binding activity and a decrease of transactivation ability upon the expression of the cwp1 to -3 and myb2 genes. Because Pax1m1 was expressed at higher levels, its inactivity may be due to its inability to bind DNA. We also found that deletion of the paired domain and C-terminal 5 amino acids (residues 233-368, Pax1m3) resulted in a partial loss of nuclear localization, a loss of DNA binding activity, and a decrease of transactivation ability upon the expression of the cwp1 to -3 and myb2 genes. Because Pax1m3 was expressed at higher levels, its inactivity may be due to its lower ability to enter nuclei or its inability to bind DNA. Our results suggest that the paired domain may be important for nuclear localization, DNA binding, and in vivo function. It is also possible that these specific regions of the paired domain may be positive regulatory regions for activation of transcription.
Our results showed that constitutively expressed Pax1 increased the expression of cwp1 and cwp2 genes by ~1.6-2.5-fold in vegetative trophozoites. However, the cwp1 promoter could be increased by ~47-fold during encystation (27). In addition, the pax1 gene expression could only be increased by ~1.4-fold during encystation (Fig. 2A). Although Pax1 can also function as a transactivator, it may still need to cooperate with some other transcription factors that are induced during encystation to transactivate these cyst wall protein genes.
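The fold changes discussed here are ratios of transcript levels normalized to 18 S ribosomal RNA, relative to the control cell line. The text does not state the exact calculation, so the sketch below assumes the common comparative threshold cycle (2^-ΔΔCt) approach with 100% amplification efficiency. The function name and all Ct values are hypothetical and chosen only to make the arithmetic visible.

```python
def relative_fold_change(ct_target_sample: float, ct_ref_sample: float,
                         ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target transcript in a sample versus a control,
    each normalized to a reference transcript, using 2^-(ddCt).
    Assumes 100% PCR efficiency for both target and reference."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize sample
    delta_control = ct_target_control - ct_ref_control   # normalize control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold one cycle earlier
# in the overexpressing line, with identical reference Ct values.
fold = relative_fold_change(ct_target_sample=24.0, ct_ref_sample=10.0,
                            ct_target_control=25.0, ct_ref_control=10.0)
print(fold)  # 2.0

# Gap between the ~47-fold induction of cwp1 during encystation and the
# upper ~2.5-fold increase obtained with Pax1 overexpression alone.
print(47 / 2.5)  # 18.8
```

On this reading, overexpressed Pax1 accounts for only a small part of the induction seen during encystation, which is consistent with the suggestion that other encystation-induced factors are required.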
We found that Pax1 cannot bind to the 18 S ribosomal promoter, which does not contain the AT-rich Inr element. Pax1 can bind AT-rich Inr elements of both the constitutive ran gene and the encystation-induced cwp1, cwp2, cwp3, and myb2 genes, suggesting that Pax1 may be involved in transcriptional regulation of many different genes. It has been shown that the AT-rich Inr in the ran promoter is an important positive cis-acting element (23). However, in this study, overexpressed Pax1 did not transactivate the ran promoter. We also found that Pax1 can bind to the AT-rich Inr elements of the egfcp1 and ribosomal protein L7 promoters. However, overexpression of Pax1 resulted in no change of the ribosomal protein L7 gene expression and a decrease of the egfcp1 gene expression. Thus, the presence of the Pax1 binding sites/Inr in many gene promoters does not by itself predict activation by Pax1. In late branching eukaryotes, Pax proteins regulate specific target genes by interacting with other classes of DNA-binding proteins that occupy directly adjacent binding sites within the target promoter region. For example, Pax5 can cooperate with Ets, a transcription factor containing a helix-turn-helix DNA-binding domain, to activate the mb-1 gene in a pre-B cell (75). The paired domain of Pax3 interacts with the HMG domain of SOX10 to activate Mitf and Ret promoters (76). Pax6 could interact with c-Maf, a bZIP transcription factor, to activate the glucagon gene expression (71). Therefore, it is possible that giardial Pax1 functions as an activator via association with some encystation-specific cofactors in the promoter context of encystation-induced genes.
Our study provides evidence for the involvement of Pax1 in DNA binding and transactivation of the cwp1 to -3 and myb2 genes in the early diverging protozoan G. lamblia. Pax proteins have been found to function during development and cell differentiation. We have also found an important role of giardial Pax1 in inducing formation of the cyst, which is a differentiation stage. Our studies provide new insights into the evolution of eukaryotic DNA-binding domains and transcriptional mechanisms during differentiation.