Identification and characterization of a ran gene promoter in the protozoan pathogen Giardia lamblia.

The promoter elements that regulate transcription initiation in Giardia lamblia are poorly understood. In this report, the promoter of the Giardia ran gene was studied using a luciferase expression plasmid pRANluc+ to monitor transcription efficiency. An AT-rich sequence spanning -51/-20 relative to the translation start site of the ran gene was identified and was found to be required for efficient luciferase expression by deletion and mutation mapping of pRANluc+. The -51/-20 sequence was also sufficient for promoter activity as revealed from studies on a 32-base pair synthetic promoter derived from this region. Deletion mapping of the synthetic promoter revealed two minimal promoter elements, -51/-42 and -30/-20, sufficient for 6- and 30-fold luciferase expression above background, respectively. The transcription start sites on luc+ messenger RNA were determined by the position of the synthetic promoter in the luciferase expression plasmids as shown by primer extension experiments. Results from electrophoretic mobility shift assays revealed multiple DNA-protein complexes upon binding of nuclear proteins with either DNA strand but not the double-stranded DNA derived from the ran promoter. Our results delineate the first promoter sequence of the Giardia gene (ran), which provides an excellent model for future studies on transcription regulation in this protozoan parasite.

As a very common intestinal parasite of humans and one of the earliest diverging eukaryotic cells, Giardia lamblia is an important pathogen for biological studies (1,2). The parasite adapts to drastic changes in host and outside environments by differentiating between actively dividing trophozoites and dormant cysts in response to extracellular stimuli (3,4). Trophozoites emerge from cysts in the upper intestine and proliferate to cause gastrointestinal disease. Giardiasis usually manifests as self-limiting diarrhea; however, it can also lead to malabsorption and even death in children. Trophozoites differentiate into cysts in the lower intestine and are excreted to the environment.
G. lamblia exhibits both prokaryotic and eukaryotic features (1,2). Its genome is small and compact (~1.2 × 10⁷ bp) (1,2), which makes it an interesting system in which to study how an organism regulates gene expression efficiently. Although intron sequences have not been reported in Giardia genes to date (5), a number of genes that code for putative components of the spliceosomal machinery have been found. The short 5′-untranslated sequence (as short as 1 nt) and the absence of a 5′-cap structure on bulk Giardia messenger RNA raise the possibility that G. lamblia uses mechanisms for translation initiation distinct from the common strategies of other eukaryotes (1,7,8). Recent developments in gene transfer technologies and ongoing genome sequencing projects involving this organism will provide additional opportunities to study the molecular and cellular mechanisms operating in this protozoan parasite (9-13).
Very little is known about the mechanism and regulation of transcription initiation in G. lamblia. Studies on gene expression during encystation and excystation of G. lamblia in vitro have shown that trophozoites, as well as cysts, produce new transcripts in response to extracellular stimuli (14-18), indicating that regulation of transcription initiation must play a pivotal role during encysting and excysting stages. DNA transfection studies have shown that a 0.8-kilobase pair DNA sequence flanking the 5′-end of the glutamate dehydrogenase gene (gdh) in G. lamblia is necessary and sufficient for promoter activity (11). However, DNA homology searches in this and other Giardia protein-coding genes so far examined have not revealed consensus core promoter elements used in either prokaryotes or higher eukaryotes. Instead, a number of AT-rich sequences close to or spanning the transcription initiation sites were identified by sequence alignments and suggested to be Giardia promoter elements (19-21). These postulations remain to be confirmed experimentally.
Our previous efforts in the development of an episomal stable DNA transfection system for G. lamblia utilized the promoter-containing sequence within -590 and +27 of a ras-related nuclear protein gene (ran) (13). In the present study, we further dissected the ran promoter by using DNA transfection systems to monitor transcription efficiency and start-site selection of the reporter genes in vivo. We also used electrophoretic mobility shift assay to identify potential transcription factor(s) in vitro. To the best of our knowledge, this is the first promoter sequence to be demonstrated experimentally in G. lamblia. This information will have comparative value when other promoter sequences are identified and will also provide a framework for future studies regarding the transcription regulation of G. lamblia.

EXPERIMENTAL PROCEDURES
G. lamblia Culture-G. lamblia WB axenic culture was maintained in a modified TYI-S-33 medium as described by Keister (22).
DNA Transfection and Luciferase Assay-Plasmids were delivered into G. lamblia by electroporation as described previously (13). For transient transfection, 30 μg of plasmid DNA was electroporated into 1.5 × 10⁷ G. lamblia cells. For stable transfection, 100 μg of plasmid was used. Stable luciferase expression cell lines were established by selection of transfectants with 600 μg ml⁻¹ of geneticin as described previously (13). The luciferase activity in G. lamblia was assayed as described previously (13).
Plasmid Constructions-The sequences of the modified plasmids described below were verified by automated DNA sequencing performed on both strands of the DNA (ABI). Detailed descriptions of the plasmid construction below are available upon request. A series of pRANluc+ mutant constructs with deletions from the 5′-sequence flanking luc+ (see Fig. 2) were made by the Erase-a-base system according to the supplier's instructions (Promega). Targeted mutagenesis by polymerase chain reaction was used to create mutations within the -51/-1 region of pRANluc+, and the mutated sequences in these plasmids are shown in Fig. 3. Defined dsDNA fragments derived from the -51/-20 region were inserted into pSPluc+ or subsequently derived vectors, and the insertions are shown in Figs. 4 and 5. The luciferase expression cassettes from pAT-20f, pAT-66, pAT-20f-op, and p(-41/-20) (see Figs. 4 and 5) were excised by KpnI/XhoI and subcloned into KpnI/XhoI-digested pRANneo, which resulted in pNAT-20f, pNAT-66, pNAT-20f-op, and pN(-41/-20), respectively (Fig. 6, A and C).
Nuclear Extract Preparations-Nuclear extracts were prepared as described previously but with modifications (23). Briefly, G. lamblia WB cells were harvested from late logarithmic phase culture and washed once with phosphate-buffered saline. The cells (3 × 10⁹) were resuspended in 10 volumes of lysis buffer consisting of 10 mM HEPES (pH 7.9), 0.32 M sucrose, 0.1 mM EDTA, 0.1% Triton X-100, 10 mM β-mercaptoethanol, 20 μM leupeptin, and 2 mM MgCl₂. The lysed cells were homogenized by a Potter-Elvehjem tissue grinder (B. Braun) at the speed of 1,500 rpm for 15 min. The nuclear sediment was obtained by centrifugation at 16,000 × g and resuspended in one volume of extraction buffer consisting of 25 mM HEPES (pH 7.9), 400 mM KCl, 0.1 mM EDTA, 25% glycerol, 10 mM β-mercaptoethanol, 20 μM leupeptin, and 5 mM MgCl₂ and incubated on ice for 30 min with low speed stirring. Soluble nuclear extract was recovered by centrifugation at 16,000 × g. The nuclear extract was dialyzed in nuclear buffer consisting of 25 mM HEPES (pH 7.9), 40 mM KCl, 0.1 mM EDTA, and 15% glycerol. The protein concentration in the nuclear extract was determined by a BCA protein quantitation kit as described by the supplier (Pierce).
Gel Retardation Assays-Oligonucleotides were labeled at the 5′-end with [γ-³²P]ATP (Amersham Pharmacia Biotech) by T4 polynucleotide kinase as described by the supplier (Promega). Binding reactions were performed in a 20-μl volume containing 0.02 pmol of the probe and 1-10 μg of nuclear extract in binding buffer consisting of 50 mM Tris-Cl (pH 7.4), 50 mM KCl, 1 mM EDTA, 1 mM dithiothreitol, and 0.1 mg ml⁻¹ bovine serum albumin and incubated for 15 min at room temperature. Competition assays were performed by including a 200-fold molar excess of cold DNA oligonucleotides in the reaction mixtures unless otherwise specified. Glycerol was added to the reaction mixture to a 10% final concentration, and the mixture was separated in a 6% acrylamide gel by electrophoresis.
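The probe and competitor quantities implied by these binding conditions follow from simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical and are not part of the original protocol, and the only inputs taken from the text are the 0.02 pmol of probe, the 20-μl reaction volume, and the 200- and 1,000-fold molar excesses.

```python
def probe_concentration_nm(probe_pmol, volume_ul):
    """Probe concentration in nM (pmol per microliter equals micromolar)."""
    return probe_pmol / volume_ul * 1000.0


def competitor_pmol(probe_pmol, fold_excess):
    """Amount of unlabeled competitor needed for a given molar excess."""
    return probe_pmol * fold_excess


# Values stated in the text: 0.02 pmol probe in a 20-microliter reaction.
probe_nm = probe_concentration_nm(0.02, 20.0)   # about 1 nM probe
cold_200x = competitor_pmol(0.02, 200)          # about 4 pmol competitor
cold_1000x = competitor_pmol(0.02, 1000)        # about 20 pmol competitor

print(f"probe: {probe_nm:.2f} nM")
print(f"200-fold competitor: {cold_200x:.1f} pmol")
print(f"1,000-fold competitor: {cold_1000x:.1f} pmol")
```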

RESULTS
Transient DNA Transfection Assays-A luciferase expression plasmid pRANluc+ (Fig. 1A) was used to test ran promoter activity in subsequent experiments. The time course of luciferase expression in G. lamblia cells transfected with pRANluc+ was examined in several independent experiments with reproducible results. Different batches of plasmid DNA were also tested for transcription efficiency with only small variations in luminometer readings (less than ±10%). As shown in Fig. 1B, the luciferase activity in cells transfected with pRANluc+ was detectable (28-fold above background) at 2.5 h post-transfection and increased linearly to 880-fold above background at 20 h post-transfection. Luciferase activity began to level off at subsequent time points and reached a peak value (~1,000-fold above background) at 33 h post-transfection. Luciferase expression remained above 80% of the peak value until 56 h post-transfection. In contrast, cells transfected with pSPluc+, in which the expression of the reporter gene is driven by a bacteriophage SP6 promoter, only exhibited background activity at all time points tested (0.1% of the peak value). Cells transfected with pGL3-control (Promega), in which luciferase gene expression is driven by a eukaryotic SV40 promoter and enhancer, exhibited 0.5% activity at peak value relative to that of pRANluc+ at different time points tested.
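The "fold above background" values and the linear phase of the time course can be summarized numerically. The following Python sketch is an illustration under stated assumptions: the helper names are hypothetical, and the average slope is derived here from the two reported time points (28-fold at 2.5 h and 880-fold at 20 h); it is not a value reported in the study.

```python
def fold_above_background(sample_rlu, background_rlu):
    """Express a luminometer reading as a multiple of the background reading."""
    return sample_rlu / background_rlu


def average_rate(t0_h, fold0, t1_h, fold1):
    """Average increase in fold-activity per hour between two time points."""
    return (fold1 - fold0) / (t1_h - t0_h)


# Reported points on the linear phase: 28-fold at 2.5 h, 880-fold at 20 h.
rate_per_h = average_rate(2.5, 28.0, 20.0, 880.0)

# A background-level sample is 1-fold, i.e. 0.1% of a ~1,000-fold peak.
background_percent_of_peak = 1.0 / 1000.0 * 100.0

print(f"average increase: {rate_per_h:.1f} fold per hour")
print(f"background as percent of peak: {background_percent_of_peak:.1f}%")
```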
Deletion and Mutation Mapping of the ran Promoter-In subsequent experiments, luciferase activity in the transfected cells was assayed at 30 h post-transfection. The 5′-boundary of the ran promoter was mapped by testing the transcription efficiency of a series of pRANluc+ 5′-deletion mutants (Fig. 2). Deletion from -595 to -267, -233, -131, -109, -93, -62, or -51 relative to the translation start site had little effect on luciferase expression. The luciferase activity was reduced to 40% of the original level with the deletion to -44. The luciferase activity decreased dramatically to 0.6% of the original level with the deletion to -19. Luciferase expression was completely abolished with the deletion to -2 (0.1%). On the other hand, cells transfected with pRAN3′Δ, which lacks the 3′-flanking sequence, exhibited ~23% of the original activity. These results reveal that the sequence spanning -51/-2 of the ran gene is required for ran promoter activity.
The -51/-1 sequence of pRANluc+ was examined by scanning mutagenesis (Fig. 3). Luciferase activity decreased to 9, 8, and 6% of the original level with clustered mutations within -51/-42, -41/-31, and -30/-20, respectively. By contrast, the activity only decreased to 55 and 44% with clustered mutations within -19/-11 and -10/-1, respectively. In conjunction with the deletion mapping described above, these results suggest that -51/-20 is more important than -19/-1 for ran promoter activity. We then focused on -51/-20 for further mutational analysis. Luciferase activity only decreased to 27% of the original level with mutations confined to -25/-20. More extensive mutations spanning -51/-31 and -40/-20 further decreased luciferase activity to 2 and 0.3% of the original level, respectively. Mutations of the entire -51/-20 sequence led to only 0.2% of the original activity. These results show that the sequence spanning -51/-20 is necessary for optimal luciferase expression, and the sequence is composed of multiple overlapping suboptimal components.
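Residual activities reported as a percentage of the original level can be restated as fold reductions, which makes the mutants easier to compare. The Python sketch below is illustrative: the function name and dictionary layout are assumptions, and the percentages are those quoted above for the -51/-20 mutational analysis.

```python
def fold_reduction(percent_of_original):
    """Convert residual activity (percent of original) into a fold reduction."""
    return 100.0 / percent_of_original


# Residual activities quoted in the text (percent of pRANluc+ activity).
residual_percent = {
    "-25/-20": 27.0,
    "-51/-31": 2.0,
    "-40/-20": 0.3,
    "-51/-20": 0.2,
}

reductions = {region: fold_reduction(p) for region, p in residual_percent.items()}

for region, fold in reductions.items():
    print(f"mutations in {region}: about {fold:.1f}-fold reduction")
```

On this scale, the 27% residual activity of the -25/-20 mutant corresponds to a reduction of roughly 4-fold, whereas mutation of the entire -51/-20 sequence corresponds to roughly 500-fold.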
Synthetic Promoter-A 32-bp synthetic DNA derived from -51/-20 of the ran gene was placed close to the luc+ gene in pSPluc+ at different positions and orientations (Fig. 4). Since these plasmids lack the 3′-untranslated region of the ran gene, we used pRANluc+3′Δ as the reference plasmid in the following experiments. pAT-20f and pAT-66 exhibited 78 and 75% promoter activity, respectively, relative to that of pRANluc+3′Δ. The activity of pAT-AT was 2-fold that of pRANluc+3′Δ. On the other hand, pAT-20r only exhibited 8% promoter activity relative to that of pRANluc+3′Δ, and pAT-d only exhibited activity equivalent to pSPluc+. Although the -51/-20 sequence could drive gene expression in either orientation, it was 10-fold weaker in the reverse orientation. Moreover, within the limit tested, the distance between the synthetic promoter and the translation start site exerts little influence on transcription efficiency.
Sequences that confer suboptimal promoter activities were mapped by deletion of the -51/-20 sequence in pAT-20f from either the 5′- or 3′-end (Fig. 5). Luciferase activity decreased to 17, 15, and 7% relative to that of pAT-20f in 5′-deletion from -51 down to -41, -34, and -30, respectively. No luciferase activity was observed with further deletion to -25, indicating that -30/-20 is a minimal sequence sufficient for luciferase expression at a level 30-fold above background. The activity only decreased to 83% relative to pAT-20f with insertion of a 19-bp sequence that separates the 3′-minimal sequence from its 5′ -51/-31 sequence. By contrast, the activity was reduced to ~1% relative to that of pAT-20f with 3′-deletion from -20 up to -31 or -42, indicating that -51/-42 is also sufficient for luciferase expression at a level 6-fold above background. The internal sequences such as -41/-31 and -30/-26 only exhibited background activity close to that of pSPluc+. In conjunction with data obtained from mutagenesis studies (Fig. 3), these results imply that the sequences spanning -51/-42 and -30/-20 are two minimal promoter sequences in G. lamblia, each of which acts in concert with the adjacent AT-rich sequence to mediate higher levels of promoter activity.
Selection of Transcription Start Sites-Stable luciferase expression cell lines were established from drug selection of G. lamblia WB cells transfected with a series of pNAT plasmids. In these pNAT plasmids, the selective marker neo is driven by an episomal ran promoter (13), and luc+ is driven by various synthetic promoters (Fig. 6A). Consistent with transient assays, stable cell lines harboring pNAT-66, pNAT-20f, and pNAT-20f-op exhibited similar luciferase activities (~10⁸ RLU/10⁷ cells), whereas cells harboring pN(-41/-20) exhibited much reduced luciferase activity (~3 × 10⁶ RLU/10⁷ cells). The start sites of ran messenger RNA from G. lamblia WB cells were mapped by primer extension using primer smg8 as described previously (24). Consistent with earlier results, three extension products were mapped to -29, -4, and -2 of ran messenger RNA relative to the ran translation start site (Fig. 6B, lane 1). Identical results were obtained by primer smg9 (data not shown). Primer extension on neo messenger RNA from pRANneo was performed using primer neo1, which reveals a single major extension product mapped to -29 (Fig. 6B, lane 2). Identical results were obtained by primer neo2 (data not shown).
The primer extension experiments on luc+ messenger RNAs were also performed by two different primers, luc1 and luc2, with consistent results, and only those performed by luc1 are shown (Fig. 6B, lanes 3-6). Since all luc+ messenger RNAs studied here lack the ran coding sequence, the start sites described below are relative to the luc+ translation start site. Results from primer extension experiments are summarized in Fig. 6C. Primer extension on luc+ messenger RNA from pNAT-20f revealed two faint bands at -30 and -12 and two stronger bands at -22 and +1 (Fig. 6B, lane 4). A faint band at -76 and four downstream stronger bands at -68, -52, -51, and -48 were observed on luc+ messenger RNA from pNAT-66 (Fig. 6B, lane 3), showing that the two distal start sites on the luc+ messenger RNA moved upstream by exactly 46 nt when the AT-rich promoter was moved from -20 to -66. The positions and numbers of the other proximal start sites were variable within a 13-nt window (between -12 and +1 of luc+ messenger RNA from pNAT-20f). On luc+ messenger RNA from pNAT-20f-op, a major site at -22 and a minor site at -21 were selected (Fig. 6B, lane 5). No primer extension product was seen on luc+ messenger RNA from pN(-41/-20), showing that reduced gene expression by this construct occurs at the transcription level (Fig. 6B, lane 6). It is clear that the distal start site (-29 of ran) is used to some extent in all messages except that from pNAT-20f-op, and a new start site 8 nt downstream is also used in the luc+ transcripts. Both of the consensus sites reside within the 32-bp AT-rich region. These results suggest that the position of the 32-bp AT-rich region in the promoter site is the primary determinant for start site selection.
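The start-site shift described above can be checked with a short calculation. This Python sketch is illustrative only: the function name is hypothetical, and the coordinates are the reported positions relative to the luc+ translation start site.

```python
def shift_start_sites(sites, old_promoter_pos, new_promoter_pos):
    """Predict start sites after moving the promoter, assuming the start
    sites keep a fixed distance from the promoter."""
    offset = new_promoter_pos - old_promoter_pos
    return [site + offset for site in sites]


# Distal start sites observed on luc+ mRNA from pNAT-20f (promoter at -20).
distal_sites_20f = [-30, -22]

# Predicted positions when the promoter is moved to -66 (as in pNAT-66).
predicted_sites_66 = shift_start_sites(distal_sites_20f, -20, -66)

# Spacing between the distal site and the new downstream site.
spacing_nt = distal_sites_20f[1] - distal_sites_20f[0]

print(predicted_sites_66)  # [-76, -68]
print(spacing_nt)          # 8
```

The predicted positions coincide with the -76 and -68 bands reported for pNAT-66, and the 8-nt spacing between the two distal sites is preserved.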
Promoter-binding Proteins-Potential nuclear factors that interact with the ran promoter were examined by electrophoretic mobility shift assay. The sequence and binding activity of each oligonucleotide used in this assay are listed in Table I. The DNA-protein complexes described below were eliminated by the addition of proteinase K (50 μg ml⁻¹) but were not affected by the addition of RNase (100 μg ml⁻¹) in the reaction (data not shown). Ten μg of nuclear extract was used to bind ³²P-labeled dsDNA probe -51/-20ds in each binding reaction, and several batches of nuclear extract were tested for DNA binding. No DNA-protein complex was detected under the conditions tested (Fig. 7A, lane 2). By contrast, a broad band was detected when a non-template strand DNA probe, -51/-20, was incubated with 1 μg of nuclear extract (Fig. 7B, lane 2), indicating the formation of DNA-protein complexes in this reaction. The DNA-protein complexes were completely eliminated by cold -51/-20 (Fig. 7B, lane 3) but were not affected by a nonspecific competitor NS (Fig. 7B, lane 4). Due to sequence redundancy in -51/-20, we also used its subunits as competitors and found that the complex formation was almost eliminated by -30/-20 (Fig. 7B, lane 7). The DNA-protein complexes could not be completely eliminated even with a 1,000-fold molar excess of -30/-20 (data not shown). The complex formation was not affected by the competitor sequence -51/-42 (Fig. 7B, lane 5), -41/-31 (Fig. 7B, lane 6), -45/-38, or -35/-26 (data not shown). These observations suggest that binding of certain protein factor(s) to the ran promoter is specifically directed toward the 3′-end of the non-template strand DNA.
Two broad bands were detected when template strand DNA probe -20/-51 was used in the binding reaction (Fig. 7C, lane 2). The complex formation was completely inhibited by competitor DNA -20/-51 (Fig. 7C, lane 3) but was not affected by NS (Fig. 7C, lane 4), indicating that the template strand of the ran promoter is also the target of certain nuclear proteins. The DNA-protein complexes were almost removed when -42/-51, -31/-41, -20/-30 (Fig. 7C, lanes 5-7), or an octameric T-tract oligonucleotide T8 (data not shown) was used in the competition assay. However, the DNA-protein complexes could not be completely eliminated even with a 1,000-fold molar excess of -42/-51, -31/-41, -20/-30, or T8 (data not shown). These results indicate that -20/-51 contains multiple protein binding sites that may contribute to higher affinity for certain nuclear proteins, and T-tract sequences in -20/-51 may participate in protein binding. The multiple DNA-protein complexes formed in the binding reaction were probably due to oligomerization of two or more DNA-binding proteins on this template.
We further tested whether the non-template strand DNA -30/-20 is sufficient for DNA-protein complex formation. A broad band was detected in binding of nuclear extract to the -30/-20 probe (Fig. 7D, lane 2). The complex formation was specific, since it was not affected by NS (Fig. 7D, lane 4) but was completely inhibited by -30/-20 (Fig. 7D, lane 3). Interestingly, the complex formation was also completely inhibited by the template strand DNA -42/-51, -31/-41, and -20/-30 (Fig. 7D, lanes 5-7), indicating that these DNA fragments bind to the same protein factor(s), and T-tract sequences in these DNA fragments may also participate in protein binding.

DISCUSSION
In this report, luciferase activity in transfected G. lamblia trophozoites was used to monitor transcription efficiency of modified ran promoter contexts. The levels of luc+ transcripts from transient transfections were below the detection limit of the primer extension method (data not shown); however, the levels of luciferase expression correlated well with the degrees of mutations or deletions within the 32-bp AT-rich ran promoter sequence (Figs. 3 and 5). The luciferase activities in stable expressing cell lines also correlated well with the results of primer extension experiments (Fig. 6), demonstrating that alteration in luciferase activity indeed reflects the transcription efficiency of various ran promoter contexts.
A typical eukaryotic promoter site that is composed of one or more core promoter element(s) proximal to the transcription start site and additional DNA regulatory elements in the distal region usually spans a few hundred bp (25,26). A typical promoter site in Escherichia coli that is composed of two core promoter elements and adjacent upstream activator binding sites usually spans ~70 bp (25,27). By contrast, the ran promoter in G. lamblia apparently comprises a single AT-rich domain spanning the -51/-20 sequence of the ran gene as supported by several lines of evidence. First, progressive deletions from -590 down to -51 of the ran gene neither enhanced nor suppressed luciferase expression (Fig. 2), implying the absence of any promoter element other than -51/-20 in the 5′-untranslated region of the ran gene. Second, luciferase expression was eliminated after the entire -51/-20 sequence was deleted (Fig. 2) or mutated (Fig. 3). Smaller mutations or deletions within the -51/-20 sequence only led to reduced gene expression. The downstream sequence -19/-1, which was needed for optimal luciferase expression, was insufficient to confer significant promoter activity. Third, a 32-bp synthetic promoter spanning the -51/-20 region was necessary and sufficient to direct the expression of a downstream reporter gene (Fig. 4). The presence of an intragenic core promoter element in the ran promoter is unlikely, since the protein coding sequence of the ran gene is not required for luciferase expression in the synthetic promoter system (Figs. 4-6). It is also important to note that the 32-bp AT-rich sequence conferred weaker promoter activity when positioned in reverse orientation (Fig. 4). Preliminary data reveal that the ran promoter has the potential to regulate transcription of an upstream gene positioned in reverse orientation relative to the ran gene. These observations indicate that the ran promoter in G. lamblia exhibits an unusually simple organization distinct from other common eukaryotic and prokaryotic promoters. Determination of whether such a promoter organization is common in G. lamblia awaits identification of more promoters in this organism.
FIG. 5. Synthetic dsDNA fragments derived from the -51/-20 sequence of the ran gene (thin line) were placed in forward orientation at -20 in front of the luc+ gene in pSPluc+. Nine plasmids, p(-41/-20), p(-34/-20), p(-30/-20), p(-25/-20), p(-30/-26), p(-41/-31), p(-51/-42), and p(-51/-31), each representing a 5′- or 3′-deletion mutant of pAT-20f, were used. An extra mutant plasmid, pAT-20f-op, in which a bacterial tetracycline operator sequence (dotted line) was inserted between -31 and -30 of pAT-20f, was also included. The luciferase activities in transfected cells were assayed at 30 h post-transfection. The results are the average ± S.E. of duplicate samples from three separate experiments and are shown as percentage of activity of pAT-20f (1202 RLU/10⁷ cells taken to be 100%).
The sequence of the ran promoter is extremely AT-rich (29 out of 32 bp) and contains five short A-tracts (3-4 A residues in a run) separated by one or two T residues and a short T-tract (three T residues in a run) flanked by two A residues at the 3′-end. Results from scanning mutagenesis and deletion mapping studies show that the multiplicity of the A-tract sequence in a modified ran promoter correlates with transcription efficiency (Figs. 3 and 5). Also, mutations of the T-tract sequence in pRANluc+ resulted in a 4-fold reduction in luciferase expression (Fig. 3). AT-rich sequences containing two A-tracts (such as -51/-41; TAAAA TAAAT) or an A-tract plus a T-tract (such as -30/-20; TAAAA CTTTAA) constitute a minimal promoter in G. lamblia (Fig. 5). These observations suggest that the A-tract and T-tract sequences are two important elements of the ran promoter. Although the exact sequence requirements for these elements have not been determined by point mutations, it is interesting to note the presence of A-tract- or T-tract-like sequences spanning the transcription start sites of many genes in G. lamblia (Fig. 8). In conjunction with a T-tract sequence-specific ssDNA-binding protein discussed below, these observations suggest that the AT-rich sequence spanning the transcription start sites described above is most likely one of the core promoter elements in G. lamblia.
TABLE I (fragment). TTTTTTTT: +. a The plus and minus signs indicate the ability of an ssDNA fragment in 200-fold molar excess to compete with the ³²P-labeled ssDNA probe (-51/-20*, -20/-51*, or -30/-20*) for DNA-protein complex formation in electrophoretic mobility shift assays. The consensus motifs in protein-binding DNA sequences are underlined.
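The A-tract and T-tract composition of the two minimal elements can be tallied programmatically. The Python sketch below is a minimal illustration, assuming a tract is a run of at least three identical residues (the text describes A-tracts of 3-4 residues and a T-tract of three); the function names are hypothetical, and only the two fragment sequences quoted above are used, since the full 32-bp sequence is not reproduced here.

```python
import re


def at_fraction(seq):
    """Fraction of A or T residues in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def find_tracts(seq, base, min_len=3):
    """Return (start_index, tract) for each run of `base` of at least min_len."""
    pattern = f"{base}{{{min_len},}}"  # for example "A{3,}"
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq.upper())]


# Fragment sequences as quoted in the text (spaces removed).
five_prime_element = "TAAAATAAAT"    # quoted for the 5' minimal element
three_prime_element = "TAAAACTTTAA"  # quoted for -30/-20

a_tracts_5p = find_tracts(five_prime_element, "A")
a_tracts_3p = find_tracts(three_prime_element, "A")
t_tracts_3p = find_tracts(three_prime_element, "T")

print(a_tracts_5p)  # [(1, 'AAAA'), (6, 'AAA')]
print(a_tracts_3p)  # [(1, 'AAAA')]
print(t_tracts_3p)  # [(6, 'TTT')]
print(f"whole promoter AT content: {29 / 32:.1%}")
```

Under this definition, the 5′ element contains two A-tracts and the 3′ element contains one A-tract plus one T-tract, matching the description of the two minimal promoters.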
The interactions among core promoter elements and associated transcription factors are primary determinants of start site selection in transcription initiation (28,29). In G. lamblia, we found that the transcription start sites selected were also determined by the 32-bp AT-rich sequence (Fig. 6B). Although some of these sites deviated from authentic ran messenger RNA start sites, they varied within a rather small range (Fig. 6C). Multiple start sites were consistently observed from primer extensions performed on ran messenger RNA. By contrast, only the distal start site was detected on neo messenger RNA (Fig. 6B, lanes 1 and 2). Interestingly, the consensus site resides within the AT-rich sequence of the ran promoter. This discrepancy may reflect differences in chromosomal (ran) and episomal (neo) environments in regulating gene expression or may be due to the involvement of certain DNA sequences downstream of the transcription start site (30). In the present study, the distal start site was still present on various luc+ messenger RNAs but with a much reduced intensity. Instead, a new preferred start site 8 nt downstream of the distal start site appeared regardless of the position of the synthetic promoter. These observations suggest that the AT-rich sequence is the primary determinant for start site selection. It is important to note that the two consensus start sites described above are both located within the 3′-minimal promoter element. As shown in Figs. 5 and 6, the minimal promoter element can be separated from its 5′-sequence in the ran promoter without seriously hampering the transcription efficiency or start site selection. This sequence may be a functional analogue of eukaryotic initiators (29).
A key regulatory event in eukaryotic transcription initiation is the recognition of distinct dsDNA promoter sites by specific multimeric protein complexes (28,29). In DNA-protein binding assays, we detected DNA-protein complexes on each of the DNA strands but not on the dsDNA fragment derived from the ran promoter (Fig. 7). Binding of a putative transcription factor to the dsDNA may have been unstable under our tested conditions, or such a factor was unstable in our nuclear extract preparations. Due to the lack of a known dsDNA-binding protein in G. lamblia, we could not include a positive control in our experiments. In another pathogenic protozoan, Trypanosoma brucei, an ssDNA-binding protein that interacts with single-stranded rather than double-stranded promoter elements was also observed (31,32). Interactions of DNA-binding proteins with single-stranded promoter elements that influence transcription efficiency have been observed in other eukaryotic promoters (33-37). Whether the ssDNA-binding protein(s) identified here also plays a role in regulating transcription initiation in G. lamblia is not clear from current studies. Nevertheless, it is interesting to note that a consensus T-tract sequence was present in all ssDNAs that exhibited protein-binding activity (Table I), indicating that the T-tract sequence may participate in protein binding on single-stranded ran promoter elements. Our results from DNA-protein competition assays also show that residual DNA-protein complexes were still present on long ssDNA fragments (-51/-20 or -20/-51) even with a 200-1,000-fold molar excess of the short DNA fragments or T8 (Fig. 7, B and C; also data not shown), indicating that the long DNA fragments exhibit higher affinities for binding nuclear proteins than short DNA fragments or that the long fragments may provide extra protein-binding sites for additional nuclear factors. These characteristics may be responsible for enhanced transcription efficiency beyond that which can be achieved by minimal promoters. Since transcription efficiency of pm(-25/-20), which lacks the putative protein binding site on the non-template strand DNA (Fig. 3), was only reduced by 4-fold, it is tempting to speculate that binding of the putative transcription factor to the template strand DNA may be related to the recruitment of a putative transcription complex to the site of transcription initiation, and an additional protein binding site on the non-template strand DNA may stabilize this complex. These possibilities remain to be investigated in the future.
FIG. 7. DNA-protein interactions revealed by electrophoretic mobility shift assay. The oligonucleotides dsDNA -51/-20ds (A), non-template strand DNA -51/-20 (B), template strand DNA -20/-51 (C), and a subunit non-template strand DNA -30/-20 (D) were γ-³²P-labeled (lane 1) and incubated with 10 μg (A) or 1 μg (B-D) of nuclear extract in a reaction mixture (lane 2) for 20 min at room temperature. The reaction products were separated in 6% acrylamide gels by electrophoresis. The formation of DNA-protein complexes was competed with a 200-fold molar excess of various cold DNA oligonucleotides (lanes 3-7) as listed above each lane. The residual DNA-protein complexes in the competition assays are indicated by arrowheads. NS represents a non-template strand DNA sequence derived from -71/-52 of the ran gene.
In summary, our studies on the ran promoter provide a model system for future investigation on the transcription machinery in G. lamblia, one of the earliest diverging eukaryotic single cells.