
The Novel Core Promoter Element GAAC in the hgl5 Gene of Entamoeba histolytica Is Able to Direct a Transcription Start Site Independent of TATA or Initiator Regions*

  • Upinder Singh
    Correspondence
    National Foundation for Infectious Diseases Fellow. To whom correspondence should be addressed: Rm. 2115, MR 4 Bldg., 300 Park Place, University of Virginia Health Sciences Center, Charlottesville, VA 22908. Tel.:/Fax: 804-924-0075;
    Affiliations
    Department of Medicine, University of Virginia, Charlottesville, Virginia 22908
  • Joshua B. Rogers
    Affiliations
    Department of Medicine, University of Virginia, Charlottesville, Virginia 22908
  • Author Footnotes
    * This work was supported by National Institutes of Health Grants R01-AI 37941 (to W. A. P.) and K08-AI 01453 (to U. S.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
      Entamoeba histolytica, an enteric protozoan, is the third leading parasitic cause of death worldwide. Investigation of the transcriptional machinery of this eukaryotic pathogen has revealed an unusual core promoter structure that consists of nonconsensus TATA and initiator regions and a novel third conserved core promoter sequence, the GAAC element. Mutation of this region in the hgl5 promoter decreases reporter gene expression and alters the transcription start site. Using positional analysis of this element, we have now demonstrated that it is able to direct a new transcription start site, 2–7 bases downstream of itself, independent of TATA and Inr regions. The GAAC region was also shown to control the rate of transcription via nuclear run on analysis, and an amebic nuclear protein was demonstrated to specifically interact with this sequence. This is the first description in the eukaryotic literature of a third conserved core promoter element, distinct from TATA or initiator regions, that is able to direct a transcription start site. We have formulated two models for the role of the GAAC region: (i) the GAAC-binding protein is a part of the TFIID complex and (ii) the GAAC-binding protein functions to “tether” TATA-binding protein to the TATA box.
      Entamoeba histolytica is a single cell eukaryote that causes invasive amebic colitis and liver abscess. Infection with this organism is an important contributor to morbidity and mortality in developing countries, and worldwide it is the third leading parasitic cause of death. During its life cycle E. histolytica undergoes developmental changes such as transformation from the cyst to trophozoite and adaptation from an anaerobic to aerobic environment upon invasion. How E. histolytica regulates these events is not understood, although regulation of transcription is likely to be an important mechanism of this control. Recently two papers have described transcriptional control of a drug resistance gene in E. histolytica (
      • Perez D.G.
      • Gomez C.
      • Lopez-Bayghen E.
      • Tannich E.
      • Orozco E.
      ,
      • Gomez C.
      • Perez D.G.
      • Lopez-Bayghen E.
      • Orozco E.
      ), thus demonstrating a relationship between pathogenesis and regulation of transcription.
      At a molecular level, little is known about the control of gene expression in this organism. As an early diverging member of the eukaryotic tree, E. histolytica has many unusual characteristics with regard to gene organization. These include a genome that is AT-rich (67% within coding regions and 78% overall) (
      • Gelderman A.H.
      • Keister D.B.
      • Bartgis I.L.
      • Diamond L.S.
      ,
      • Tannich E.
      • Horstmann R.D.
      ) and compact (1.5 × 10⁷ bp
      The abbreviations used are: bp, base pair; TBP, TATA-binding protein; PCR, polymerase chain reaction; EMSA, electrophoretic mobility shift assay(s); Inr, initiator; luc, luciferase; GBP, GAAC-binding protein; TF, transcription factor.
      ) (
      • Dvorak J.A.
      • Kobayashi S.
      • Alling D.W.
      • Hallahan C.W.
      ) and an RNA polymerase II that is resistant to α-amanitin (
      • Lioutas C.
      • Tannich E.
      ). A putative E. histolytica TBP has been reported (GenBank™ accession number Z48307) that has significant sequence divergence from the TBP of Drosophila melanogaster, Caenorhabditis elegans, and Plasmodium falciparum (
      • McAndrew M.B.
      • Read M.
      • Sims P.F.G.
      • Hyde J.E.
      ). In addition, we have recently found that yeast TBP does not bind to the TATA box of the E. histolytica hgl5 gene.
      U. Singh, J. Rogers, and W. A. Petri, Jr., unpublished results.
      It has been shown that amebic promoter sequences do not function in a mammalian system and that viral promoters (cytomegalovirus, human immunodeficiency virus long terminal repeat, and simian virus 40) and promoters from other systems (Dictyostelium) are nonfunctional in amebic trophozoites (
      • Purdy J.E.
      • Mann B.J.
      • Pho L.T.
      • Petri Jr., W.A.
      ,
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Thus, it would appear that species-specific transcription factors may be utilized in amebic gene expression.
      The core promoter region in metazoans is the target of a variety of regulatory proteins that work in concert to direct the complex mechanisms of transcriptional control. Transcription of mRNA relies on the assembly of RNA polymerase II and a variety of other transcription factors (TFIID, TFIIA, TFIIB, TFIIE, TFIIF, and TFIIH) into a stable and functional preinitiation complex (
      • Roeder R.
      ). The preinitiation complex may assemble in a sequential manner on the DNA sequence of the core promoter, or a “holoenzyme” complex may form, which then binds specifically to the core promoter region (
      • Pugh B.F.
      ). Both the TATA box and Inr regions appear to direct the formation of the preinitiation complex, control the site of transcription initiation, and regulate activation by upstream activator proteins (
      • Nikolov D.B.
      • Burley S.K.
      ). Recently, variations to this classic core promoter architecture have been described. A third conserved core promoter element has been identified in a subset of TATA-less promoters of Drosophila (
      • Burke T.W.
      • Kadonaga J.T.
      ), in the human angiotensinogen gene (
      • Yanai K.
      • Nibu Y.
      • Murakami K.
      • Fukamizu A.
      ), and the core promoter of the hepatocyte growth factor (
      • Jiang J.G.
      • Zarnegar R.
      ). A consensus element recognized by TFIIB and immediately adjacent to and upstream of TATA has also been described (
      • Lagrange T.
      • Kapanidis A.N.
      • Tang H.
      • Reinberg D.
      • Ebright R.H.
      ). Thus, there appears to be an emerging body of literature on an alternative core promoter architectural structure.
      We have shown recently that the E. histolytica core promoter contains three elements that control the site of transcription initiation (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). The amebic TATA (GTATTTAAA(G/C)) and Inr (AAAAATTCA) elements appear to function in a classic manner despite their sequence divergence from metazoans (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). A unique third core promoter element GAAC (GAACT) was identified in 31/37 protein-encoding E. histolytica genes (
      • Purdy J.E.
      • Pho L.T.
      • Mann B.J.
      • Petri Jr., W.A.
      ). It has a variable location between the TATA and Inr sequences, a characteristic not defined previously for core promoter transcription elements in metazoans. Mutation of this region in the hgl5 gene promoter had a greater effect on gene expression and selection of the site of transcription initiation than mutation of either the TATA or Inr regions (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). These characteristics of the GAAC region would seem to usurp the dominant role of the TATA element in transcriptional control in E. histolytica.
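      The three consensus sequences quoted above lend themselves to a simple pattern search. The following Python sketch is illustrative only: the function name `find_core_elements`, the use of exact consensus matches (with (G/C) written as a character class), and the toy sequence are assumptions made for demonstration, not the procedure used in the cited promoter surveys.

```python
import re

# Consensus sequences quoted in the text; (G/C) becomes the class [GC].
# Exact matching is a simplifying assumption: real promoters can deviate.
CORE_ELEMENTS = {
    "TATA": "GTATTTAAA[GC]",
    "GAAC": "GAACT",
    "Inr": "AAAAATTCA",
}


def find_core_elements(sequence):
    """Return {element name: [0-based start positions]} for consensus matches."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in CORE_ELEMENTS.items():
        # Wrapping the pattern in a lookahead also reports overlapping matches.
        lookahead = re.compile("(?=" + pattern + ")")
        hits[name] = [m.start() for m in lookahead.finditer(seq)]
    return hits


# Hypothetical toy promoter (NOT the hgl5 sequence): TATA, then GAAC, then Inr.
toy_promoter = "CCGTATTTAAAGCCCCGAACTCCCCAAAAATTCACC"
print(find_core_elements(toy_promoter))
```

      Reporting positions rather than a yes/no answer makes it easy to check whether a GAAC match falls between the TATA and Inr matches, which is the arrangement described above.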
      Our goal was to determine how the GAAC element in the hgl5 gene of E. histolytica regulates the rate and site of transcription initiation. This was undertaken by performing positional analysis of the GAAC region. We determined that it affected the rate of transcription, did not function as an enhancer element, and was able to direct a site of transcription initiation independent of a TATA or Inr element. An amebic nuclear protein was demonstrated to specifically interact with this DNA sequence as shown by EMSA. Based on the data, we have formulated a model for the role of the GAAC region in transcriptional control in E. histolytica.

      MATERIALS AND METHODS

       Cultivation of E. histolytica and Stable Transfection

      E. histolytica strain HM-1:IMSS trophozoites were cultured in TYI-S-33 medium containing penicillin (100 units/ml) (Life Technologies, Inc.) and streptomycin (100 μg/ml) (Life Technologies, Inc.) (
      • Diamond L.S.
      • Harlow D.R.
      • Cunnick C.C.
      ). Stable transfection was performed using two promoter vectors (5′ hgl5-luciferase-hgl 3′ and 5′ actin-neor-actin 3′) described previously (
      • Vines R.R.
      • Purdy J.E.
      • Ragland B.D.
      • Samuelson J.
      • Mann B.J.
      • Petri Jr., W.A.
      ,
      • Ramakrishnan G.
      • Vines R.R.
      • Mann B.J.
      • Petri Jr., W.A.
      ). The trophozoites were maintained at 24 μg/ml G418, and all Northern blot and primer extension analyses were performed on RNA isolated from stably transfected parasites maintained at this drug concentration.

       Plasmid Construction

      Positional analysis of the GAAC region was undertaken on the plasmids pTP.4i and pTP.GAAC-CLA (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Construction of the pTP.4i plasmid was described earlier and consists of a 272-bp hgl5 upstream region fused to a luciferase reporter gene and the 5′ actin region fused to the neomycin drug selection gene (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ,
      • Purdy J.E.
      • Pho L.T.
      • Mann B.J.
      • Petri Jr., W.A.
      ,
      • Ramakrishnan G.
      • Vines R.R.
      • Mann B.J.
      • Petri Jr., W.A.
      ). In the plasmid pTP.GAAC-CLA the 5′ core promoter region of the hgl5 gene has the GAAC region mutated to a ClaI site (GAACT to CGATT). To introduce a GAAC region in the upstream location, we used a two-round PCR technique (
      • Purdy J.E.
      • Pho L.T.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Previous analysis of the 5′-noncoding region of the hgl5 promoter had revealed that the region 20 bases upstream of the TATA sequence could be replaced by an AT-rich sequence without affecting gene expression (
      • Purdy J.E.
      • Pho L.T.
      • Mann B.J.
      • Petri Jr., W.A.
      ). This region was replaced in both pTP.4i and pTP.GAAC-CLA plasmids using a primer that introduced an intact GAAC region 22 bases upstream of the beginning of the TATA region. The primer used to generate this was 5′-AAAAGAAGGAAAGGAATGAACTAGTAATAATAGGAAAGG-3′, which introduced an SpeI site (underlined) downstream of the GAAC region (bold) that was used for diagnostic digests of possible clones. The first round of PCR used the above primer with the primer 5′-CTTTCTTTATGTTTTTGGCG-3′, which hybridized with the coding region of luciferase (bases 1727–1746 of pGEM-luc, Promega). This resulted in a 150-bp fragment that was then used as a primer for the second round of PCR using the primer 5′-CTACTGAAGCTTAGTAAAGAATAGTATTGA-3′ (containing a HindIII restriction site shown underlined for cloning) that hybridized at the 5′-end of the pTP.4i plasmid. Using the same approach, the primer 5′-AAAAGAAGGAAAGGTGATCAAGTAAAATAATAGGAAAGG-3′ was used to clone an intact but inverse orientation GAAC (bold) with a BclI restriction site (underlined) into the coding strand at a region 22 bases upstream of the TATA region. Similarly a primer was designed to generate an intact GAAC region 10 bases downstream of the initiator region (5′-GACAAAGATATGAAAAATGAACTATGGATCCAAATG-3′). This resulted in a change in the amino acid sequence of the hgl5 fusion region between the start codons of the hgl5 and luciferase genes from that of wild type, but did not generate any nonsense or stop codons. The wild type amino acid sequence is (Lys-Leu-Leu-Leu-Trp-Ile-Glu) and was changed to (Lys-Asn-Glu-Leu-Trp-Ile-Glu). Each primer had at least 14 bases of sequence homology upstream and downstream of the mutations to allow for hybridization with the backbone. These colonies were screened by restriction enzyme analysis where appropriate or by sequence analysis. All constructs used in our experiments were sequenced in their entirety to rule out PCR-induced mutations.
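      Because the bold and underlined markings of the primers do not survive in plain text, the positions of the GAAC motif and the diagnostic restriction sites can be recovered by a string search. The sketch below is a minimal illustration: the primer sequences are copied from the paragraph above, the recognition sequences are the standard ones for each enzyme, and the names `PRIMERS`, `find_all`, and `annotate` are hypothetical. Reading "inverse orientation" as the motif reversed on the same strand (TCAAG) is an interpretation based on the primer sequence, and BamHI is included only because its site occurs in the downstream primer; the text does not say whether it was used diagnostically.

```python
# Primer sequences copied from the text (written 5' to 3').
PRIMERS = {
    "upstream_GAAC": "AAAAGAAGGAAAGGAATGAACTAGTAATAATAGGAAAGG",
    "upstream_inverse_GAAC": "AAAAGAAGGAAAGGTGATCAAGTAAAATAATAGGAAAGG",
    "downstream_GAAC": "GACAAAGATATGAAAAATGAACTATGGATCCAAATG",
    "five_prime_HindIII": "CTACTGAAGCTTAGTAAAGAATAGTATTGA",
}

# Standard recognition sequences of the restriction enzymes.
SITES = {"SpeI": "ACTAGT", "BclI": "TGATCA", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

GAAC = "GAACT"
GAAC_INVERSE = GAAC[::-1]  # "TCAAG": reversed on the same strand (assumption)


def find_all(sequence, motif):
    """Return every 0-based start index of motif in sequence, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def annotate(sequence):
    """Map each motif or site name to its start positions, omitting absent ones."""
    targets = dict(SITES, GAAC=GAAC, GAAC_inverse=GAAC_INVERSE)
    found = {name: find_all(sequence, motif) for name, motif in targets.items()}
    return {name: pos for name, pos in found.items() if pos}


for name, seq in PRIMERS.items():
    print(name, len(seq), annotate(seq))
```

      In this search the SpeI site of the first primer starts at index 19 and the GAACT motif at index 17, so the site begins within the last three bases of the motif. The same check can be rerun on any redesigned primer before ordering it.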

       Northern Blot Analysis

      Northern blot analysis was done as described previously (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ) with the following modifications: 13 μg of RNA was utilized for each construct, and the blot was probed with radiolabeled denatured DNA probes. The DNA probes consisted of the coding regions of luciferase and neomycin, which were extracted by digestion of the pTP-Luc plasmid (
      • Ramakrishnan G.
      • Vines R.R.
      • Mann B.J.
      • Petri Jr., W.A.
      ) and pTCV1 plasmid (
      • Vines R.R.
      • Purdy J.E.
      • Ragland B.D.
      • Samuelson J.
      • Mann B.J.
      • Petri Jr., W.A.
      ) by BamHI and SalI. These probes were labeled with random primers, the Klenow fragment of DNA polymerase I, and [α-32P]dCTP.

       Luciferase Assay

      The procedure was done as described previously (
      • Ramakrishnan G.
      • Vines R.R.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Briefly, stably transfected trophozoites, maintained in TYI-S-33 medium supplemented with 24 μg/ml G418, were chilled on ice and the number of cells counted and harvested. These were then centrifuged at 200 × g for 5 min, washed once in phosphate-buffered saline (pH 7.5), and lysed in 100 μl of lysis buffer with the addition of protease inhibitors E64-C and leupeptin. Lysates were assayed following serial dilution to 10⁻³ or 10⁻⁴. Samples were prepared in triplicate and assayed at room temperature with a Turner luminometer model TD-20E (Promega). Luciferase activity per cell was calculated as a measure of reporter gene expression.
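      The per-cell calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' worksheet: the function names, the simple correction (mean reading divided by the dilution and by the cell count), and all numbers are hypothetical.

```python
from statistics import mean


def luciferase_per_cell(readings, dilution, cells_assayed):
    """Mean replicate reading, corrected for dilution, per cell.

    readings: replicate luminometer readings (arbitrary light units)
    dilution: dilution of the lysate that was read, e.g. 1e-3
    cells_assayed: number of cells represented by the undiluted lysate
    """
    if not readings or dilution <= 0 or cells_assayed <= 0:
        raise ValueError("need readings, a positive dilution, and a positive cell count")
    return mean(readings) / dilution / cells_assayed


def percent_of_wild_type(construct_value, wild_type_value):
    """Express a construct's per-cell activity as a percentage of wild type."""
    return 100.0 * construct_value / wild_type_value


# Hypothetical triplicate readings, for illustration only.
wild_type = luciferase_per_cell([200.0, 210.0, 190.0], dilution=1e-3, cells_assayed=1e6)
mutant = luciferase_per_cell([80.0, 90.0, 70.0], dilution=1e-3, cells_assayed=1e6)
print(wild_type, mutant, percent_of_wild_type(mutant, wild_type))
```

      Normalizing to the cell count before comparing constructs keeps differences in culture density from being mistaken for differences in promoter activity.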

       Nuclear Run On Analysis

      Nuclei were harvested from 5 × 10⁷ logarithmically growing trophozoites stably transfected with the plasmid of interest and stored at −70 °C (
      • Bruchhaus I.
      • Leippe M.
      • Lioutas C.
      • Tannich E.
      ). These were thawed on ice, and the nuclear run on was performed as described (
      • Greenberg M.E.
      • Bender T.P.
      ). RNA extraction was performed using the guanidinium isothiocyanate method (RNagen kit, Promega). Approximately 1.2 pmol of DNA probes (luciferase, neomycin, and 272-bp promoter of the hgl5 gene) were purified, denatured, and dot-blotted onto Zeta-Probe GT genomic blotting membrane (Bio-Rad). The membrane was incubated with the prehybridization solution at 65 °C for 20 min, denatured RNA probe was added, and the mixture was incubated overnight at 65 °C. The membrane was washed at 65 °C according to the manufacturer's instructions and exposed on a PhosphorImager (Molecular Dynamics).

       Primer Extension Analysis

      Primer extension analysis was performed as described previously using polyadenylated mRNA from stably transfected amebae (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Primer extension was performed using the Superscript II RNase H reverse transcriptase system (Life Technologies, Inc.), and the products were run next to the appropriate sequencing ladder on a 6% polyacrylamide gel. Sequencing was performed using the circumvent thermal cycle sequencing system (New England Biolabs) using the α-35S-labeled dATP incorporation method. To rule out contaminating or nonspecific extension products, all mRNA samples were treated in DNase buffer (50 mM Tris-HCl (pH 7.4), 1 mM EDTA (pH 8.0), 10 mM MgCl₂, 1 mM 1,4-dithiothreitol) with 10 units of RNase-free DNase at 37 °C for 60 min followed by overnight ethanol precipitation prior to primer extension experiments.

       Nuclear Extract Preparation and Electrophoretic Mobility Shift Assay

      Nuclear extracts were prepared by the methods described previously (
      • Gilchrist C.A.
      • Mann B.J.
      • Petri Jr., W.A.
      ). The double-stranded oligonucleotide used for the electrophoretic mobility shift assays was 5′-AAGACAATGAACTAGAATG-3′ with an intact GAAC region (bold) but with a mutated and truncated Inr region (italics). The oligonucleotide used for the gel shift assays did not contain a TATA region. An E. histolytica hgl5 promoter sequence (5′-AATTCTGTTATATGATCATTTGGTTTGTAATTACAGCTGG-3′) and an oligonucleotide (5′-AAGACCTACGATAAGAATG-3′) with a mutated GAAC region (bold) were used as double-stranded competitors for the gel shift assay. The probe was purified by a polyacrylamide gel extraction procedure (
      • Chory J.
      • Baldwin Jr., A.S.
      ). Modifications to this method included the incubation of the polyacrylamide gel section containing the probe at 37 °C overnight in elution buffer (TE (10 mM Tris, 1 mM EDTA, pH 8.0) with 100 mM NaCl) prior to centrifugation for 30 min at 10,000 × g. The supernatant was then saved and the pellet washed with an additional 400 μl of elution buffer and re-centrifuged. The two supernatants were combined, the pellet discarded, and the filtered probe was ethanol-precipitated at −70 °C. One pmol of this purified probe was labeled with [α-32P]dATP using the Klenow fragment of DNA polymerase I and purified from unincorporated nucleotide by a NucTrap column (Stratagene, La Jolla, CA).
      The protein-DNA interaction occurred in band shift buffer (10 mM Tris-HCl (pH 7.9), 50 mM NaCl, 1 mM EDTA, 0.05% non-fat milk powder (Carnation), 3% glycerol, 0.05 mg of bromphenol blue). To this reaction mixture 0.5 μg of salmon sperm DNA, 25 fmol of DNA probe, and 1.2 μg of nuclear extract were added (
      • Miglarese M.R.
      • Richardson A.F.
      • Aziz N.
      • Bender T.P.
      ). The reaction was allowed to incubate at room temperature (20 °C) for 1 h prior to the electrophoresis of the reaction mix on a nondenaturing polyacrylamide gel for 2–3 h (
      • Buratowski S.
      • Chodosh L.A.
      ). The gel was then fixed, dried, and quantitated by PhosphorImager analysis.

      RESULTS

       Positional Analysis of the GAAC Element as Assessed by Reporter Gene Expression

      Previously we had shown that mutation of the GAAC region in the core promoter of the hgl5 gene of E. histolytica resulted in decreased reporter gene expression and mRNA levels (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Since the GAAC element has been shown to have a variable location between the TATA and Inr regions, we hypothesized that it may function in a position-independent manner to control the rate of gene expression. To test this hypothesis, we performed positional analysis of the GAAC region in the hgl5 gene promoter linked to a luciferase reporter gene. The GAAC region was placed upstream of the TATA region by 22 bases. Alternatively GAAC was placed downstream of the Inr by 10 bases or one helical turn (Fig. 1). These plasmids were electroporated into trophozoites, and stable transfectants were selected with G418.
      Figure 1. Reporter gene constructs for analysis of GAAC function. The schematic depicts the various constructs: wild type core promoter, mutated core promoter GAAC element (A), wild type core promoter with an upstream wild type GAAC (B), mutated core promoter with upstream GAAC element (C), mutated core promoter with upstream inverse orientation GAAC (D), wild type core promoter with downstream GAAC (E), and mutated core promoter with downstream GAAC (F).
      Northern blots were performed with total RNA harvested from the trophozoites (Fig. 2). Luciferase assays were performed on cells stably transfected with various constructs (Fig. 2). Mutation of the GAAC region in the core promoter resulted in a decrease in luciferase message to 38% of wild type, as reported previously (Fig. 2 A) (
      • Singh U.
      • Rogers J.B.
      • Mann B.J.
      • Petri Jr., W.A.
      ). Insertion of a wild type GAAC sequence upstream of a mutated core promoter, in either the native or inverse orientation, did not reconstitute luciferase message to wild type levels (Fig. 2, C and D) and in fact resulted in luciferase enzyme activity that was 17 and 7.4% of wild type. Insertion of a wild type GAAC sequence 22 bases upstream of a wild type core promoter also resulted in a marked diminution of reporter gene message levels and luciferase expression was 43% of wild type (Fig. 2 B). Insertion of a wild type GAAC sequence downstream of a wild type core promoter resulted in moderate enhancement of message accumulation and luciferase levels were 284% of wild type (Fig. 2 E). Insertion of a GAAC region downstream of a mutated core promoter resulted in decreased luciferase message and enzyme levels were 44.5% of wild type (Fig. 2 F). Thus, the GAAC element was not able to regulate the level of gene expression in a position-independent manner. In addition, it controls the transcription start site; therefore this sequence does not meet the classical definition of an enhancer element.
      Figure 2. Northern blot analysis of reporter gene (luciferase) and control (neomycin) total RNA transcribed from the E. histolytica hgl5 promoter constructs containing positional variants of the GAAC element. Total RNA (13 μg) was hybridized with oligonucleotides containing the neomycin (0.8 kilobase) and luciferase (luc) (1.6 kilobase) coding regions. Luciferase enzyme activity relative to the wild type activity is shown below the Northern blot for each construct. The constructs represented are shown in the schematic in Fig. 1. WT, wild type; MUT, mutant.

       Mutation of the GAAC Region Resulted in a Decreased Rate of Transcription as Determined by Nuclear Run On Analysis

      mRNA in E. histolytica usually has a short 5′-untranslated region, and transcripts with long 5′-untranslated regions in this organism may be unstable. Thus Northern blot and primer extension data on mRNA may not accurately reflect the rate of transcription initiation from a particular site. Nuclear run on assays were done on selected constructs to determine the role of this region in regulating the rate of transcription from a particular site. Fig. 3 shows the data from nuclear run on assays done on nuclei harvested from trophozoites stably transfected with the wild type construct, a core promoter with a mutated GAAC region (Fig. 3, lane A), and core promoter with a mutated GAAC region but an upstream wild type GAAC (Fig. 3, lane C). Newly transcribed RNA obtained from the run on assays was hybridized with probes for neomycin, luciferase, and the 5′-noncoding region of the hgl5 gene. Neomycin was utilized as a control message, and all exposures were developed to obtain equivalent neomycin signal. In the wild type construct, the luciferase message was evident; however, there was no measurable signal at the 5′-noncoding region of the hgl5 gene. In the construct with a core promoter with a mutated GAAC region, the luciferase message was not detected. In a construct with a core promoter with a mutated GAAC region and a GAAC region inserted in the upstream region (Fig. 3 C), RNA from both the luciferase and 5′-untranslated regions was detected. The magnitude of these signals relative to the neomycin signal was roughly equivalent. This distribution of transcript abundance was reflected in the intensity of bands from the two transcription start sites seen in the primer extension analysis of this construct (Fig. 4 C), indicating that the RNA generated from the various start sites was stable and that primer extension analysis is an accurate method for quantitating the abundance of message from a particular transcription initiation site. These data confirmed that the GAAC element controlled gene expression by regulating the rate of transcription.
      Figure 3. Nuclear run on analysis for selected constructs (refer to Fig. 1). [32P]UTP-labeled RNA was generated from nuclei and hybridized with DNA probes consisting of the 5′-noncoding region of the hgl5 gene, the coding region of the luciferase gene, and the coding region of the neomycin gene. All nuclei were harvested from stably transfected trophozoites selected at 24 μg/ml G418. Run on analyses were independently conducted for constructs A and C. Images were generated by use of the PhosphorImager (Molecular Dynamics 425) in conjunction with the Adobe Photoshop 3 software program.
      Figure 4. Primer extension analysis of mRNA transcribed from hgl5 promoter constructs with mutated GAAC (A) and upstream and downstream positional variants of the GAAC region (B, C, D, E, and F). The lettering matches the constructs shown in Fig. 1. The extension products are located to the right of the sequencing ladders in the (+)mRNA lane. The location of each element (Inr, GAAC, TATA) is labeled on the DNA sequence. The main primer extension product is marked with a large arrow, and minor extension products are marked with small arrows. Primer extension was performed using 20 μg of poly(A)+ mRNA in A, 80 μg in B, 40 μg in C, 60 μg in D, 5 μg in E, and 20 μg in F. All samples were DNase-treated before primer extension experiments. Quantitative results of the primer extension products were generated using PhosphorImager (Molecular Dynamics 425) and Image Quant 1.1 (Macintosh).

       The GAAC Sequence Can Function in a Position-independent Manner to Direct a Site of Transcription Initiation

      The Northern blot data showed that addition of a wild type GAAC region upstream of a core promoter with a mutated GAAC element did not reconstitute reporter gene expression to wild type levels. Previous data have shown that mutation of the GAAC region results in a major transcription start site at the Inr and multiple minor (4% compared with the Inr start site) transcription start sites up to −90 (Fig. 4 A). We therefore wished to determine the effect of changes in position of wild type GAAC on the transcription start site.
      Insertion of a wild type GAAC region upstream of a wild type or mutated core promoter resulted in the appearance of new transcription start sites (Fig. 4, B and C). In the wild type core promoter construct (Fig. 4 B), transcription initiated 2–7 bases downstream of the new GAAC element and at the Inr region. The new start site downstream of GAAC was quantitated to be 19% of the Inr start site in that construct. When a GAAC region was placed upstream of a mutated core promoter GAAC region (Fig. 4 C) a new transcription start site appeared 3–4 bases downstream of this element, which was relatively equal in intensity (80%) to the wild type start site in that construct. The placement of an inverse GAAC region upstream of a mutated core promoter resulted in a new transcription start site 5 bases downstream of the GAAC region, which was 56% of the wild type start site in that construct (Fig. 4 D). Analysis of downstream GAAC regions showed that placement of a wild type GAAC region downstream of a wild type or mutated core promoter resulted in the generation of a new start site 2–3 bases downstream of the GAAC sequence (Fig. 4, E and F). Similar to earlier results, in the context of a mutated core promoter (Fig. 4 F), the new start sites generated by the GAAC region appeared to be of equal intensity to that of wild type (113% compared with the Inr site), whereas in the context of a wild type core promoter (Fig. 4 E), the wild type start site at the Inr was dominant (new start site was 49% of the wild type site). Thus, these data indicated that the GAAC region was capable of controlling a transcription start site independent of TATA and Inr. In eukaryotes this function had previously been assigned exclusively to the TATA and Inr regions of the core promoter.
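      The percentages quoted above compare the band of a new start site with the band of the reference start site in the same lane. A minimal sketch of that comparison follows. The function names and the optional background subtraction are assumptions of this illustration (the text does not describe how the Image Quant values were processed); the percentages and the construct assignments are taken from the paragraph above and from Fig. 1.

```python
def relative_start_site_signal(new_site, reference_site, background=0.0):
    """Express a new start-site band as a percentage of the reference band.

    Arguments are band volumes in arbitrary units; subtracting a common
    background is an assumption of this sketch.
    """
    reference_net = reference_site - background
    if reference_net <= 0:
        raise ValueError("reference band must exceed background")
    return 100.0 * (new_site - background) / reference_net


# Percentages reported in the text: new GAAC-directed start site relative to
# the Inr (wild type) start site in the same construct.
REPORTED_PERCENT = {"B": 19, "C": 80, "D": 56, "E": 49, "F": 113}
# Whether the core promoter GAAC is mutated in each construct (Fig. 1).
CORE_GAAC_MUTATED = {"B": False, "C": True, "D": True, "E": False, "F": True}


def mean_by_core_status(percentages, mutated):
    """Average the reported percentages for wild type and mutated cores."""
    groups = {True: [], False: []}
    for construct, value in percentages.items():
        groups[mutated[construct]].append(value)
    return {status: sum(vals) / len(vals) for status, vals in groups.items() if vals}


print(relative_start_site_signal(190.0, 1000.0))
print(mean_by_core_status(REPORTED_PERCENT, CORE_GAAC_MUTATED))
```

      Averaging across constructs is only a descriptive summary, but it makes the pattern in the text explicit: the new start sites are relatively stronger when the core promoter GAAC is mutated than when it is wild type.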

       An E. histolytica Nuclear Protein Binds to the GAAC Region of the hgl5 Gene in a Sequence-specific Manner

      To determine whether the GAAC sequence functioned by interacting with a sequence-specific amebic nuclear protein, we utilized EMSA analysis. Amebic nuclear extracts were incubated with a double-stranded oligonucleotide containing the GAAC sequence to identify DNA-protein interactions. EMSA was performed with a probe that contained no TATA region, a mutated and truncated Inr region, and a wild type GAAC region from the hgl5 gene. This oligonucleotide was constructed to prevent DNA-protein interactions with the other core promoter elements, the TATA and Inr.
      Incubation of the probe with crude amebic nuclear extract revealed two bands; the lane with probe alone and no amebic nuclear protein had no bands (Fig. 5). Competition experiments were done to determine the specificity of the DNA-protein interaction. The lower band (small arrow) represents a nonspecific DNA-protein interaction as its intensity is not altered by self, unrelated, or mutant competition. Since the oligonucleotide for the EMSA analysis did not contain functional TATA or Inr regions, any specific DNA-protein interaction seen in the EMSA can be ascribed to the GAAC region. Competition assays with unlabeled (cold) self probe at 2 × and 4 × revealed that a band (Fig. 5, large arrow) was competed by the cold competitor. An unlabeled, unrelated amebic promoter sequence and an oligonucleotide with a mutated GAAC region did not compete this specific band to the same degree. This demonstrated specificity of the DNA-nuclear protein interaction for the oligonucleotide with the GAAC sequence and indicated that the GAAC region is specifically recognized by an amebic nuclear protein.
      Figure 5. Electrophoretic mobility shift analysis of the GAAC element. An oligonucleotide with the hgl5 core promoter region containing a mutated and truncated Inr region and a wild type GAAC sequence was analyzed by EMSA. Klenow-radiolabeled hgl5-oligonucleotide was incubated with nuclear extract. Competition experiments were done with increasing concentrations of unlabeled self, unrelated hgl5 oligonucleotide, and oligonucleotide with a mutated GAAC region. The specific band is marked with a large arrow, and a nonspecific band is labeled with a small arrow. The image was generated by use of the PhosphorImager (Molecular Dynamics 425) in conjunction with the Adobe Photoshop 3 software program.

      DISCUSSION

      The major conclusion from this study is that the third core promoter element GAAC (GAACT) in the hgl5 gene of E. histolytica (i) independently directs a new site of transcription initiation, (ii) controls the rate of transcription initiation, and (iii) interacts in a sequence-specific manner with an amebic nuclear protein (GAAC binding protein(s), GBP(s)). The role of the GAAC region in the hgl5 gene was determined by reporter gene assays, Northern blot analysis, and nuclear run on assays, all of which indicated that the GAAC region controls the rate of transcription.
      The GAAC element of the hgl5 gene in E. histolytica is able to control a transcription start site independent of TATA and Inr regions. We demonstrated that positional manipulation of the GAAC region, separated from the TATA and Inr core promoter, resulted in new transcription start sites 2–7 bases downstream of itself. This result occurred consistently with upstream and downstream positioning of the GAAC element and regardless of whether the core promoter region contained a wild type or mutated GAAC sequence. In the context of a wild type core promoter (i.e. wild type TATA, GAAC, and Inr regions) the dominant start site was always in the Inr region regardless of whether a GAAC region was inserted in an upstream or downstream location. However, when the core promoter contained a mutated GAAC sequence, the insertion of a wild type GAAC region in the upstream or downstream location resulted in new transcription start sites that were of equal intensity to that of the wild type Inr site. Thus, in the wild type promoter the interactions among the three regions and the proteins that bind to them are dominant in controlling the transcription initiation site. However, in the context of a mutated GAAC region, this dominance is lost, and new transcription initiation sites (directed by the GAAC region) occur. The GAAC sequence was also able to direct a new site of transcription initiation in an inverse orientation, most likely through the creation of a cryptic GAAC site. The identification of a third core promoter element that controls the site of transcription initiation is unprecedented in eukaryotes.
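      The positional relationship described above (new start sites 2–7 bases downstream of the GAAC element) can be illustrated with a short computational sketch. This is not part of the study's methods; it is a minimal example that assumes the element is matched as the literal string GAACT, that "downstream" is counted from the first base after the element, and that only the given strand is scanned. The function name, its parameters, and the test sequence are hypothetical.

```python
import re

# Element as written in the text: GAAC (GAACT). Exact-match search is an assumption.
GAAC_MOTIF = "GAACT"


def predicted_start_windows(seq, motif=GAAC_MOTIF, min_offset=2, max_offset=7):
    """Return (motif_start, window_start, window_end) for each motif occurrence.

    Coordinates are 0-based and window_end is exclusive. The window spans the
    bases min_offset..max_offset downstream of the motif, counting the first
    base after the motif as +1 (an assumption), clipped to the sequence end.
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead reports overlapping occurrences as well.
    for match in re.finditer(f"(?={re.escape(motif)})", seq):
        motif_end = match.start() + len(motif)  # index just past the motif
        window_start = min(motif_end + min_offset - 1, len(seq))
        window_end = min(motif_end + max_offset, len(seq))
        hits.append((match.start(), window_start, window_end))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence for illustration only; not the hgl5 promoter.
    example = "TTTTGAACTAAAAAAAAAA"
    for motif_start, win_start, win_end in predicted_start_windows(example):
        print(motif_start, win_start, win_end, example[win_start:win_end])
```

      Such a scan only marks where a GAAC-directed start site would be expected under these assumptions; the start sites themselves were determined experimentally.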
      We hypothesized that the GAAC sequence functioned to control the site of transcription initiation via a DNA-protein interaction. The results of the EMSA analysis revealed that an amebic nuclear protein(s), GBP, interacted specifically with the GAAC region. Competition assays with an unrelated hgl5 promoter sequence and an oligonucleotide with a mutated GAAC region pointed to the specificity of this interaction. In the analysis of the EMSA results, it is important to realize that the GAAC-GBP interaction apparently was not dependent on DNA-TBP or DNA-Inr binding protein interactions, since the EMSA probe did not contain functional TATA or Inr regions. The implication is, therefore, that although TBP may require GBP for DNA binding, the GBP does not require TBP or Inr binding protein(s) for accurate and specific DNA binding. Once the amebic GBP and TBP have been isolated, the specific DNA-protein and protein-protein interactions can be characterized in greater detail.
      The requirement for a gene or a family of genes to have three core promoter elements is unclear: why would transcription of protein encoding genes in E. histolytica be dependent on a third regulatory region? Perhaps the E. histolytica TFIID complex has multiple DNA binding regions composed of TBP, GBP, and Inr binding proteins. A variety of pre-assembled TFIID complexes could exist, containing some or all of the core promoter-binding proteins. These different TFIID complexes could differentially regulate a variety of core promoters containing all three or only one or two of these regulatory regions.
      A second model is based on the fact that TBP in vitro is able to bind to multiple AT-rich sequences (Zenzie-Gregory et al., 1993). It has been shown previously that the specificity of TBP binding to the TATA region is conferred in large part by its proximity to other regulatory regions (Taylor and Kingston, 1990). In an AT-rich organism such as E. histolytica a mechanism may have developed in which a factor such as GBP localizes TBP to the promoter. Thus a model can be hypothesized in which transcription in E. histolytica hgl5 genes may be dependent on protein-protein interactions in which GBP functions to “tether” or localize TBP/TFIID to the core promoter.
      Precedent for both these models can be found in the metazoan literature. It has been shown that TFIID can bind multiple regions of the core promoter, including the TATA and Inr regions (Kaufmann and Smale, 1994), and binding of TFIID to another core promoter regulatory region, the downstream promoter element, has also been demonstrated (Burke and Kadonaga, 1996). In support of the second model is the existence of factors such as TFIIA, which stabilize the binding of TFIID to the core promoter TATA box (Imbalzano et al., 1994). Both models provide for a basal complex that has multiple sites to regulate transcription in response to cellular and environmental stimuli.
      In conclusion, we have described, for the first time in the eukaryotic transcription literature, a third core promoter region, GAAC, which is independently able to direct a site of transcription initiation. This sequence from the hgl5 gene of E. histolytica has been shown to interact in a sequence-specific manner with an amebic nuclear protein (GBP). The presence of this regulatory core promoter region raises intriguing questions regarding transcriptional control in this primitive protozoan parasite. It is important to consider the possible presence of similar, yet-to-be-identified regulatory proteins in other eukaryotes. The isolation of the GAAC-binding protein and characterization of its role in transcriptional control are the next steps in elucidating the role of GBP in the transcriptional machinery of E. histolytica.

      ACKNOWLEDGEMENTS

      We thank Barbara Mann, David Auble, and William A. Petri, Jr. for excellent discussions and scientific input.

      REFERENCES

        • Perez D.G.
        • Gomez C.
        • Lopez-Bayghen E.
        • Tannich E.
        • Orozco E.
        J. Biol. Chem. 1998; 273: 7285-7292
        • Gomez C.
        • Perez D.G.
        • Lopez-Bayghen E.
        • Orozco E.
        J. Biol. Chem. 1998; 273: 7277-7284
        • Gelderman A.H.
        • Keister D.B.
        • Bartgis I.L.
        • Diamond L.S.
        J. Parasitol. 1971; 57: 906-911
        • Tannich E.
        • Horstmann R.D.
        J. Mol. Evol. 1992; 34: 272-273
        • Dvorak J.A.
        • Kobayashi S.
        • Alling D.W.
        • Hallahan C.W.
        J. Eukaryot. Microbiol. 1995; 42: 610-616
        • Lioutas C.
        • Tannich E.
        Mol. Biochem. Parasitol. 1995; 73: 259-261
        • McAndrew M.B.
        • Read M.
        • Sims P.F.G.
        • Hyde J.E.
        Gene (Amst.). 1993; 124: 165-171
        • Purdy J.E.
        • Mann B.J.
        • Pho L.T.
        • Petri Jr., W.A.
        Proc. Natl. Acad. Sci. U. S. A. 1994; 91: 7099-7103
        • Singh U.
        • Rogers J.B.
        • Mann B.J.
        • Petri Jr., W.A.
        Proc. Natl. Acad. Sci. U. S. A. 1997; 94: 8812-8817
        • Roeder R.
        Trends Biochem. Sci. 1996; 21: 327-335
        • Pugh B.F.
        Curr. Opin. Cell Biol. 1996; 8: 303-311
        • Nikolov D.B.
        • Burley S.K.
        Proc. Natl. Acad. Sci. U. S. A. 1997; 94: 15-22
        • Burke T.W.
        • Kadonaga J.T.
        Genes Dev. 1996; 10: 711-724
        • Yanai K.
        • Nibu Y.
        • Murakami K.
        • Fukamizu A.
        J. Biol. Chem. 1996; 271: 15981-15986
        • Jiang J.G.
        • Zarnegar R.
        Mol. Cell. Biol. 1997; 17: 5758-5770
        • Lagrange T.
        • Kapanidis A.N.
        • Tang H.
        • Reinberg D.
        • Ebright R.H.
        Genes Dev. 1998; 12: 34-44
        • Purdy J.E.
        • Pho L.T.
        • Mann B.J.
        • Petri Jr., W.A.
        Mol. Biochem. Parasitol. 1996; 78: 91-103
        • Diamond L.S.
        • Harlow D.R.
        • Cunnick C.C.
        Trans. R. Soc. Trop. Med. Hyg. 1978; 72: 431-432
        • Vines R.R.
        • Purdy J.E.
        • Ragland B.D.
        • Samuelson J.
        • Mann B.J.
        • Petri Jr., W.A.
        Mol. Biochem. Parasitol. 1995; 71: 265-267
        • Ramakrishnan G.
        • Vines R.R.
        • Mann B.J.
        • Petri Jr., W.A.
        Mol. Biochem. Parasitol. 1997; 84: 93-100
        • Bruchhaus I.
        • Leippe M.
        • Lioutas C.
        • Tannich E.
        DNA Cell Biol. 1993; 12: 925-933
        • Greenberg M.E.
        • Bender T.P.
        Ausubel F.M. Brent R. Kingston R.E. Moore D.D. Seidman J.G. Smith J.A. Struhl K. Current Protocols in Molecular Biology. John Wiley & Sons, Inc., New York 1997: 4.10.1-4.10.11
        • Gilchrist C.A.
        • Mann B.J.
        • Petri Jr., W.A.
        Infect. Immun. 1998; 66: 2383-2386
        • Chory J.
        • Baldwin Jr., A.S.
        Ausubel F.M. Brent R. Kingston R.E. Moore D.D. Seidman J.G. Smith J.A. Struhl K. Current Protocols in Molecular Biology. John Wiley & Sons, Inc., New York 1997: 2.7.1-2.7.8
        • Miglarese M.R.
        • Richardson A.F.
        • Aziz N.
        • Bender T.P.
        J. Biol. Chem. 1996; 271: 22697-22705
        • Buratowski S.
        • Chodosh L.A.
        Ausubel F.M. Brent R. Kingston R.E. Moore D.D. Seidman J.G. Smith J.A. Struhl K. Current Protocols in Molecular Biology. John Wiley & Sons, Inc., New York 1997: 12.2.1-12.2.11
        • Zenzie-Gregory B.
        • Khachi A.
        • Garraway I.P.
        • Smale S.T.
        Mol. Cell. Biol. 1993; 13: 3841-3849
        • Taylor I.C.A.
        • Kingston R.E.
        Mol. Cell. Biol. 1990; 10: 165-175
        • Kaufmann J.
        • Smale S.T.
        Genes Dev. 1994; 8: 821-829
        • Imbalzano A.
        • Zaret K.S.
        • Kingston R.E.
        J. Biol. Chem. 1994; 269: 8280-8286