Characterization of a polypyrimidine/polypurine tract in the promoter of the gene for chicken malic enzyme.

Starvation inhibits and refeeding stimulates transcription of the malic enzyme gene in chick liver. DNA between −320 and +72 base pairs (bp) is DNase I-hypersensitive in hepatic nuclei from fed but not starved chicks (Ma, X. J., and Goodridge, A. G. (1992) Nucleic Acids Res. 20, 4997-5002). A polypyrimidine/polypurine (PPY/PPU) tract lies within the DNase I-hypersensitive region. In hepatocytes transiently transfected with plasmids containing triiodothyronine response elements and a minimal promoter from the malic enzyme gene linked to the chloramphenicol acetyltransferase gene, deletion of the PPY/PPU tract inhibited chloramphenicol acetyltransferase activity by about 90% with or without triiodothyronine. Fine mapping of S1 nuclease-sensitive sites suggests that the PPY/PPU tract can assume different isoforms of non-B-DNA, some of which may be triplex structures. The PPY/PPU tract contains specific binding sites for single- and double-stranded DNA binding proteins and, with 8 bp 3′ of the tract, can function as a promoter. A (CT)7 repeat binds single-stranded DNA-binding protein and is essential for promoter activity. Two C-rich elements bind single-stranded DNA-binding proteins and may mediate inhibition of promoter function. The single- and double-stranded DNA-binding proteins that interact with the PPY/PPU tract may regulate transcription of the malic enzyme gene.

Transcription of eukaryotic genes is regulated by the interactions of specific DNA sequence elements, their cognate transcription factors, and the general transcription machinery. Genes that encode proteins have two types of DNA elements (1-3). One type comprises DNA sequences that are specific to certain genes. Binding of transcription factors to these elements and the subsequent interaction of those factors with other parts of the transcription machinery regulate gene expression tissue-specifically and in response to exogenous agents such as hormones. The other type comprises promoter elements common to most genes; they provide a site for assembly of the transcription initiation complex, identify the start site for transcription, and endow the gene with a basal rate of expression. The most common type of promoter in genes transcribed by RNA polymerase II contains TATA-box elements. In promoters with TATA-boxes, binding of the general transcription factor TFIID to the TATA-box represents the first step in transcription initiation. There is, however, a growing number of promoters that lack TATA-box elements (4-7). These promoters are usually GC-rich, have multiple transcription start sites, and use initiator sequences to identify where the initiation complex should bind and initiate transcription. How transcription is initiated and regulated on promoters that lack TATA-boxes is not clear.
Regulation of promoter activity likely involves changes in conformation of the trans-acting proteins when they bind to their specific sequence elements and/or when a ligand, such as a hormone or another protein, binds to a DNA-bound factor. Binding of a trans-acting protein to its specific sequence elements also may cause the DNA to undergo changes in conformation (8). Protein-induced changes in the secondary structure of promoter DNA may regulate transcription. For example, binding of protein to DNA may stimulate or stabilize the formation of a functionally significant non-B-DNA structure. Alternatively, productive binding of a regulatory protein to DNA may require prior formation of a particular non-B-DNA structure. One of the earliest reports to suggest a role for non-B-DNA in gene regulation showed that in actively transcribed chicken globin genes, DNA in the chromatin of the promoter region was sensitive to S1 nuclease treatment (9). Active transcription was correlated with the formation of an unusual DNA structure with a single-stranded region.
One of the non-B-DNA structures that has drawn a great deal of attention in the last 10 years is the triple helix (10,11). DNA triplexes are formed by polypyrimidine/polypurine (PPY/PPU; Footnote 1) sequences. In this structure, a DNA strand (donor strand) from one half of the sequence folds into the major groove of the other half-duplex, forming Hoogsteen base pairs and leaving the other strand in a single-stranded state. A triplex can be either H-DNA if the pyrimidine strand is the donor, or H*-DNA if the purine strand is the donor. A number of different triplex isoforms may form along a long PPY/PPU sequence (12). Formation of specific isoforms, or stabilization of those structures once formed, may contribute to the regulation of transcription (13).
There are several reasons to believe that PPY/PPU tracts may have important functional roles. First, PPY/PPU sequences are statistically overrepresented in eukaryotic genomes (14-16). Second, triplex-forming PPY/PPU sequences have been discovered in the promoters of several genes that lack TATA-boxes; the protooncogenes hEGFR, c-ets-2, and c-Ki-ras are examples (6,17,18). Deletion of the PPY/PPU sequences from these promoters decreases their promoter activities in transient transfection assays. Third, when the triplex-forming sequence of the c-myc gene is connected to a heterologous promoter, it enhances expression of a linked reporter gene. In a series of mutant PPY/PPU sequences, the degree of transcription enhancement correlated with the ability of the sequence to adopt a triplex structure (19). Finally, naturally occurring point mutations in the triplex-forming sequence of the promoter of the human γ-globin gene cause abnormalities in regulation of transcription of the genes of the hemoglobin locus (20,21). Point mutations that decrease the potential of the sequence to form a triplex cause persistent expression of γ-globin in adults. Despite the evidence cited above, the involvement of these unusual DNA structures in regulation of gene expression remains controversial. Furthermore, the mechanisms by which triplexes and related structures might regulate function remain unclear.
Malic enzyme (L-malate:NADP+ oxidoreductase (decarboxylating), EC 1.1.1.40) catalyzes the conversion of malate to pyruvate and CO2. At the same time, it generates NADPH, much of which is used for synthesis of fatty acids. Transcription of the malic enzyme gene in liver is regulated by nutritional state and endocrine status in vivo and by insulin, glucagon, triiodothyronine (T3), glucocorticoids, and unesterified fatty acids in hepatocytes in culture (22). The transition from a low transcription rate in the starved state to a 40-fold higher transcription rate in the refed state is accompanied by a profound change in chromatin structure in the region from −320 to +72 bp; this region is hypersensitive to DNase I and certain restriction enzymes in livers from fed chicks and resistant in livers from starved chicks. The maximum increase in DNase I hypersensitivity occurs within 6 h, the same time required for the transcription rate to reach its highest level. Sensitivity to DNase I and transcription rate decrease rapidly and simultaneously when food is removed from fed chicks (23,24).
The 5′-flanking region (5.8 kb) of the chicken malic enzyme gene has been cloned, and the nucleotide sequence of 4.4 kb has been determined (Footnote 2). The promoter of the malic enzyme gene is GC-rich, does not contain a TATA-box, and has multiple transcription start sites. Like many promoters that lack a TATA-box (11), the promoter of the malic enzyme gene contains a PPY/PPU tract; it is located between −134 and −86 bp. This PPY/PPU tract is S1 nuclease-sensitive in the supercoiled state (25) and is a potential candidate for involvement in the nutritional regulation of chromatin structure and transcription rate. Here, we report an analysis of the structure and function of this PPY/PPU tract. To our surprise, the PPY/PPU tract plus an 8-bp 3′ extension can function as an independent promoter. To our knowledge, this is the first report that a PPY/PPU tract with the potential to form triplex structures can be utilized to direct transcription initiation in a promoter lacking a TATA-box.
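As an illustration of what "a PPY/PPU tract" means at the sequence level, the following minimal sketch scans one strand for long uninterrupted runs of pyrimidines and for (CT)n repeats. It is not part of the original analysis: the function names, the length thresholds, and the example sequence are all hypothetical, and the example is not the malic enzyme promoter sequence.

```python
import re

def find_ppy_tracts(seq, min_len=12):
    """Return (start, end, tract) for each run of at least min_len
    consecutive pyrimidines (C or T) on the given strand.
    Coordinates are 0-based, end-exclusive. min_len is an arbitrary
    illustrative threshold, not a value taken from the paper."""
    pattern = re.compile(r"[CT]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

def find_ct_repeats(seq, min_units=4):
    """Return (start, end, repeat) for each run of at least min_units
    tandem CT dinucleotides, e.g. a (CT)7 repeat."""
    pattern = re.compile(r"(?:CT){%d,}" % min_units)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

# Hypothetical test sequence: a C-rich stretch followed by a (CT)7 repeat,
# flanked by mixed sequence.
example = "GATTACAG" + "CCCTCCCC" + "CT" * 7 + "GAGATGCA"

tracts = find_ppy_tracts(example)
repeats = find_ct_repeats(example)
print(tracts)
print(repeats)
```

A homopyrimidine run on one strand implies a homopurine run on the complementary strand, so scanning a single strand for C/T runs (and, separately, for A/G runs) is enough to locate candidate tracts.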

EXPERIMENTAL PROCEDURES
Construction of Reporter Plasmids-Routine subcloning was performed by standard methods (26,27). Our initial reporter plasmid, pME[−5800/+31]CAT, was made by inserting 5.8 kb of 5′-flanking DNA plus 31 bp of 5′-untranslated sequence of the chicken malic enzyme gene upstream of the bacterial reporter gene, chloramphenicol acetyltransferase (CAT), in the promoterless construct, pKS-CAT (Footnote 2). pBH147CAT (Fig. 1) was made by inserting a BstXI/HindIII fragment from pME[−5800/+31]CAT upstream of a minimal promoter for malic enzyme linked to CAT (pME[−147/+31]CAT). pBH147ΔPPYCAT is a mutated version in which the sequence between −134 and −90 bp (all but 4 bp of the PPY/PPU tract) was deleted using a polymerase chain reaction-based strategy (28).
To construct pME[−3903/−3703]TKCAT, a 201-bp malic enzyme fragment from −3903 to −3703 bp was amplified by polymerase chain reaction and inserted into the reporter plasmid pBLCAT2 (29) between the SphI and BamHI sites. The promoterless construct, pME[−3903/−3703]ΔTKCAT, was made by removing the thymidine kinase (TK) promoter from pME[−3903/−3703]TKCAT by digestion with BamHI and BglII, followed by blunt-end ligation. DNA fragments containing various parts of the PPY/PPU tract were made by annealing two synthetic oligonucleotides and inserting the double-stranded DNA into pME[−3903/−3703]ΔTKCAT in place of the TK promoter. pME([−3903/−3703]ME−147/+31)CAT was constructed by replacing the TK promoter of pME[−3903/−3703]TKCAT with malic enzyme promoter sequence between −147 and +31 bp. The sequence of each reporter plasmid was confirmed by sequencing using the dideoxy chain termination method. DNA fragments used in constructing the reporter plasmids and in other experiments are named by designating the 5′ and 3′ ends of each fragment relative to the major transcription start site of the endogenous malic enzyme gene. For deletion constructs, a "Δ" precedes the deleted part of the DNA.
Transient Transfections-This procedure was carried out essentially as described (30). Briefly, hepatocytes were prepared from 19-day-old embryos of white Leghorn chickens, plated at a high density (22 μl of packed cells/35-mm plate), and incubated in Waymouth medium 705/1 supplemented with streptomycin (100 μg/ml), penicillin G (60 μg/ml), corticosterone (1 μM), and insulin (Eli Lilly, Indianapolis, IN) (50 nM). At about 20 h of incubation, the medium was changed, and the cells were transfected with supercoiled reporter plasmids using LipofectAce. In addition to the components named above, each plate contained 40 μg of LipofectAce, 1.3 μg of pBH147CAT or an equimolar amount of another reporter plasmid, 0.5 μg of pCMVβ-GAL, and sufficient pBluescriptKS(+) to make the total amount of transfected DNA 4.5 μg/plate. On day 2 (day 0 = day cells were prepared), the transfection medium was removed, and hepatocytes were treated with or without T3 (1.6 μM) in the medium described above. After a 48-h incubation, hepatocytes in duplicate plates were harvested into one tube, suspended in 200 μl of 0.1 M Tris-HCl, pH 7.8, 1 mM EDTA, 1 mM DTT, 10 μg/ml trypsin inhibitor, and 0.174 mg/ml phenylmethylsulfonyl fluoride, and lysed by three cycles of freezing-thawing.
Freshly prepared cell extracts were used to measure protein and β-galactosidase activity (26,31). Before measuring CAT activity, extracts were heated to 60°C for 30 min, and precipitated protein was removed by centrifugation. Samples of heat-stable extract containing the equivalent of 5-50 μg of unheated soluble protein were incubated for 15 h at 37°C in 0.1 M Tris-HCl, pH 7.8, 1 mM EDTA, 12 μM [14C]chloramphenicol, 2 mM acetyl-CoA (150 μl total volume). The products of the reaction were extracted with ethyl acetate and separated by thin layer chromatography. Radioactivity in substrate and products was visualized by autoradiography and measured by liquid scintillation spectrometry or measured by direct autoradiography using an Instant-Imager (Packard Instrument Co., Meriden, CT). The CAT activity was expressed initially as the percentage of substrate converted to acetylated product per microgram of unheated soluble protein and then normalized as described in the figure legends.
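The normalization steps can be sketched as follows: percent conversion per microgram of protein, correction for transfection efficiency by dividing by β-galactosidase activity of the same extract, and expression relative to a reference condition set to 100 (as described in the legend to Fig. 1). This is a minimal sketch, not the authors' actual calculation; the function names and all numerical inputs are hypothetical.

```python
def percent_conversion(acetylated_cpm, unacetylated_cpm):
    """Percentage of chloramphenicol radioactivity found in acetylated product."""
    total = acetylated_cpm + unacetylated_cpm
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return 100.0 * acetylated_cpm / total

def corrected_cat(percent_conv, protein_ug, bgal_per_ug):
    """CAT activity per ug of protein, divided by beta-galactosidase
    activity (A420 units per ug of protein) of the same extract."""
    return (percent_conv / protein_ug) / bgal_per_ug

def relative_activities(corrected, reference_key):
    """Scale all corrected activities so that the reference condition equals 100."""
    ref = corrected[reference_key]
    return {name: 100.0 * value / ref for name, value in corrected.items()}

# Hypothetical values for illustration only.
conv = percent_conversion(acetylated_cpm=250.0, unacetylated_cpm=750.0)
corrected = {
    "wild_type_plus_T3": 500.0,
    "wild_type_no_T3": 62.5,
}
relative = relative_activities(corrected, "wild_type_plus_T3")
print(conv, relative)
```

Dividing by the co-transfected β-galactosidase activity removes plate-to-plate differences in transfection efficiency, so that the relative values compare promoter activity rather than DNA uptake.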
Low Resolution Mapping of S1 Nuclease Sensitivity-pBH147CAT or pBH147ΔPPYCAT (20 μg) was digested with 1 or 10 units of S1 nuclease at 37°C for 15 min in a 100-μl reaction containing 30 mM sodium acetate, pH 4.6, 30 mM NaCl, 1 mM ZnCl2. The reaction was stopped by phenol extraction, and the DNA was precipitated. DNA was then digested with BamHI. Some DNA was treated with BamHI before digesting with S1 nuclease. The resulting DNA fragments (about 1 μg) were separated by size on 1% agarose gels. Gels were stained with ethidium bromide, and the DNA fragments were visualized with UV light.
High Resolution Mapping of S1 Nuclease and P1 Nuclease Sensitivity-Supercoiled pME637 (−413 to +224 bp in the PstI site of plasmid pIBI31) (20 μg) was partially digested with 0, 0.05, or 0.2 units of S1 nuclease for 5 min at 37°C in the S1 nuclease buffer described above or with 1 unit of P1 nuclease for 0, 1, or 4 min at 37°C in a buffer containing 25 mM Tris-HCl, pH 7.0, 50 mM NaCl, 5 mM ZnCl2. Reactions were stopped by adding 0.5 M EDTA (1 μl) and extracting with phenol. DNA was precipitated with ethanol and used for the following manipulations. To detect nicks on the top strand, DNA (1 μg) was incubated with 10 pmol of biotinylated oligonucleotide 5′-CTACCTTGATGAGGTGCGGGTC-3′ (+136 to +115) at 95°C for 5 min and then at 45°C for 15 min in 19 μl of 10 mM KCl, 20 mM Tris-HCl, pH 8.8, 10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100 and 100 μM each of dATP, dTTP, dGTP, and dCTP. To detect nicks on the bottom strand, DNA was incubated with biotinylated oligonucleotide 5′-CACAAAAATAAGCGTGAGGAGGCAGG-3′ (−367 to −342) at 95°C for 5 min and then at 50°C for 15 min.
Primer extension reactions containing 1 μl (2 units) of Vent (exo−) DNA polymerase were incubated for 10 min at 76°C and stopped by chilling on ice. The reaction mixture was then added to 50 μl of 5 mM Tris-HCl, pH 8.5, 2.5 M NaCl, 0.5 mM EDTA containing 200 μg of streptavidin-coated paramagnetic beads (Dynabeads M-280 streptavidin) that had been pretreated as described (32). Binding was carried out for 30 min at room temperature. DNA bound to the beads was separated from unbound DNA on a magnetic stand, and the unbound DNA was discarded. The DNA template strand and the extended immobilized strand were denatured in 50 μl of 150 mM NaOH at 50°C for 5 min. The extended immobilized strand was then separated from the template strand by centrifugation or on a magnetic stand, and the extended strand was discarded. The supernatant solution containing the templates was neutralized with 50 μl of 150 mM HCl and 10 μl of 1 M Tris-HCl, pH 7.4. Yeast tRNA (5 μg) was added to the mixture, and the nucleic acids were precipitated. The DNA-RNA pellet was dissolved in 36 μl of 10 mM KCl, 20 mM Tris-HCl, pH 8.8, 10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100, and 100 μM each of dATP, dTTP, dGTP, and dCTP; 18 μl of this mixture were then added to 1 μl of 32P-labeled primer (1 pmol; top strand, 5′-CCTGCGGGAGCGGAGGCG-3′ (−11 to −28); bottom strand, 5′-CGGTGGGTGACTCAGCG-3′ (−194 to −178)). The mixture was incubated at 95°C for 1 min, then at 45°C (bottom strand) or 50°C (top strand) for 15 min. Vent (exo−) DNA polymerase (2 units) was added, and the reaction was incubated for 10 min at 76°C. The reaction was stopped by chilling on ice. After precipitating the DNA with ethanol, the extension products were analyzed on 6% polyacrylamide, 8 M urea sequencing gels.
DNase I Footprint of Single-stranded DNA-A PstI fragment (−413 to +224 bp) was subcloned into plasmid M13mp19 at the PstI site, and a clone containing the top strand of the malic enzyme sequence was isolated and named pM13ME−413/+224(+). The StuI-PstI fragment (−236 to +224 bp) was subcloned into M13mp19 at the PstI and SmaI sites. The resulting plasmid, pM13ME−236/+224(−), contains the bottom strand of the malic enzyme sequence. To prepare the top strand probe, 32P-labeled universal sequencing primer, 5′-GTAAAACGACGGCCAGT-3′, was extended on single-stranded pM13ME−236/+224(−) using the Klenow fragment of Escherichia coli DNA polymerase. The extension product was digested with AvaII. To prepare the bottom strand probe, 32P-labeled oligonucleotide, 5′-CCTGCGGGAGCGGAGGCG-3′ (−11 to −28), was extended on pM13ME−413/+224(+), and the extension product was digested with StuI. Single-stranded probes were isolated after electrophoresis through alkaline-agarose gels (27). About 100,000 cpm of probe was preincubated with various amounts of nuclear protein in a 50-μl binding reaction containing 10 mM HEPES, pH 7.4, 50 mM KCl, 0.5 mM EDTA, 0.5 mM DTT, 0.1 μg/μl poly(dI-dC), 0.01% Nonidet P-40, 4% glycerol, 0.04 μg/μl bovine serum albumin. After a 1-h preincubation on ice, 5 μl of 100 mM Tris-HCl, pH 7.4, 500 mM MgCl2, RQ1 RNase-free DNase I (1 unit) was added. After 15 min in ice water, the reaction was stopped by 0.5 M EDTA (5 μl), followed by phenol extraction and precipitation with ethanol. The resulting DNA fragments were separated by size on 6% polyacrylamide, 8 M urea sequencing gels.
Footprinting of Double-stranded DNA-DNA fragments containing the malic enzyme promoter were excised from plasmid pME−413/+224 by digestion with either KpnI and HindIII or AvaI and PstI. The top strand was labeled at the HindIII site of the KpnI/HindIII fragment; the bottom strand was labeled at the AvaI site of the AvaI/PstI fragment. About 0.5 ng of DNA probe was incubated with 70 μg of nuclear proteins for 40 min on ice in 100 μl of 10 mM Tris-HCl, pH 7.4, 50 mM NaCl, 30 mM KCl, 0.5 mM EDTA, 1 mM DTT, 5% glycerol, 0.03 μg/μl poly(dI-dC). DNase I and 5 mM MgCl2 were then added, and the incubation continued for another 10 min on ice. Digestion was stopped with 10 mM EDTA, followed by phenol-chloroform extraction. The resulting DNA fragments were precipitated with ethanol and separated by size on 5% polyacrylamide, 8 M urea sequencing gels.
Gel Electrophoretic Mobility Shift Assay-Single-stranded probes were labeled using T4 polynucleotide kinase. Double-stranded probes were labeled using the Klenow fragment of E. coli DNA polymerase. 32P-labeled probe (0.01 pmol) was mixed with competitor DNA and then incubated with about 0.1 μg (single-stranded DNA as probe) or 3 μg (double-stranded DNA as probe) of nuclear protein in 20 μl of 10 mM HEPES, pH 7.4, 50 mM KCl, 0.5 mM EDTA, 0.5 mM DTT, 0.1 μg/μl poly(dI-dC), 0.01% Nonidet P-40, 4% glycerol, 0.04 μg/μl bovine serum albumin. The binding reaction was performed on ice for 45 min. DNA and DNA-protein complexes were resolved on 6% nondenaturing polyacrylamide gels.
Oligonucleotides-Unmodified oligonucleotides were synthesized and purified by the DNA Core Facility of the University of Iowa. Oligonucleotides biotinylated at their 5′-ends were purchased from Genosys Biotechnologies, Inc. (The Woodlands, TX). For the gel electrophoretic mobility shift assay, double-stranded oligonucleotides were made by annealing two complementary oligonucleotides (purified on 8% polyacrylamide gels and collected using an Elutrap device (Schleicher & Schuell)).
Materials-Restriction enzymes were purchased from New England Biolabs (Beverly, MA), except for BamHI, which was obtained from Boehringer Mannheim, and were used in the buffers provided by the manufacturers. Other enzymes were purchased from the indicated sources: RQ1 RNase-free DNase (Promega, Madison, WI); S1 nuclease (Sigma); Vent (exo−) DNA polymerase, T4 DNA ligase, and calf intestinal alkaline phosphatase (New England Biolabs); Klenow fragment of E. coli DNA polymerase and Taq DNA polymerase (Boehringer Mannheim); T4 polynucleotide kinase (New England Biolabs). DNA sequencing kits were purchased from Bio-Rad. Streptavidin-coated paramagnetic beads, Dynabeads M-280, were purchased from Dynal (Lake Success, NY). Nucleotides and poly(dI-dC) were purchased from Pharmacia Biotech Inc. Radiolabeled nucleotides were purchased from Amersham Corp. D-Threo-[dichloroacetyl-1,2-14C]chloramphenicol was from DuPont NEN. LipofectACE and Waymouth medium MD 705/1 were obtained from Life Technologies, Inc. Corticosterone and 3,5,3′-L-triiodothyronine were purchased from Sigma. Competent E. coli DH5α cells used for subcloning were purchased from Life Technologies, Inc. Agarose was from Eastman Kodak Co. All other chemicals were of reagent grade or the best quality commercially available. pIBI31 was obtained from Kodak. Bluescript KS+ was from Stratagene (La Jolla, CA). Bruno Luckow and Gunter Schutz (German Cancer Research Center, Heidelberg, Germany) provided pBLCAT2 (29). pCMV-βGAL (33) was obtained from Richard Maurer (University of Iowa). Supercoiled plasmid DNA was extracted from E. coli cultures using alkaline lysis and purified by CsCl gradient centrifugation (26).
Statistical Analysis-Where appropriate, statistical significance of differences between pairs of means was determined by the Wilcoxon signed rank test (34). Standard errors of the mean are provided to indicate the degree of variability in the data.

Functional Role of PPY/PPU Tract-PPY/PPU tracts play important functional roles in several promoters (6,17,18). We tested the functional significance of the PPY/PPU sequence in the malic enzyme promoter by comparing transcription from a promoter containing the wild-type sequence (pBH147CAT) with that in a construct lacking all but 4 bp of the PPY/PPU tract (pBH147ΔPPYCAT). pBH147CAT contains a T3 response unit and a minimal promoter from the chicken malic enzyme gene linked to a CAT reporter gene (Fig. 1B) (Footnote 2). In hepatocytes in culture transfected with the wild-type and deletion constructs, T3 caused 8- and 20-fold stimulations of CAT activity, respectively (Fig. 1C). In cells transfected with pBH147ΔPPYCAT, however, T3-induced expression of CAT was 80% less than that in cells transfected with pBH147CAT; the decrease was about 90% in cells not treated with T3. These results indicate that the PPY/PPU tract is not required for T3 responsiveness but plays a major role in determining both basal and T3-induced rates of transcription from the malic enzyme promoter.
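The figures reported for the wild-type and deletion constructs (8- and 20-fold stimulation by T3, and an 80% decrease in T3-induced CAT activity on deleting the tract) together imply the decrease in basal activity. The short sketch below works through that arithmetic; it is only a consistency check on the reported numbers, and the function name is hypothetical.

```python
def implied_basal_decrease(fold_wt, fold_del, induced_decrease):
    """Given fold stimulation by T3 for the wild-type and deletion
    constructs and the fractional decrease in T3-induced activity caused
    by the deletion, return the implied fractional decrease in basal
    (no T3) activity."""
    wt_t3 = 100.0                               # wild type + T3, set to 100
    wt_basal = wt_t3 / fold_wt                  # wild type, no T3
    del_t3 = wt_t3 * (1.0 - induced_decrease)   # deletion + T3
    del_basal = del_t3 / fold_del               # deletion, no T3
    return 1.0 - del_basal / wt_basal

# Values taken from the text: 8-fold, 20-fold, 80% decrease with T3.
basal_decrease = implied_basal_decrease(fold_wt=8, fold_del=20, induced_decrease=0.80)
print(round(100 * basal_decrease))
```

With these inputs the implied basal decrease is 92%, which agrees with the "about 90%" stated for cells not treated with T3.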
Sensitivity of the PPY/PPU Tract to Single-strand-specific Nucleases-Analysis of the nucleotide sequence of the PPY/PPU tract suggested that it might form triplex structures. Formation of an intramolecular triplex structure leaves a loop of DNA in the single-stranded form and requires the DNA to be in a supercoiled state (8,10,11). We therefore examined S1 nuclease sensitivity in the DNA of pBH147CAT. When supercoiled plasmid was treated with S1 nuclease and then digested with BamHI, DNA fragments of 1.7 and 3.9 kb were generated (Fig. 2). The intensities of both bands increased as the amount of S1 nuclease in the reaction was increased. These results indicated the presence of a single-stranded region 1.7 kb from the BamHI site. When the PPY/PPU sequence at 1.7 kb from the BamHI site was deleted (pBH147ΔPPYCAT), neither the 1.7- nor the 3.9-kb band was detected, indicating that the PPY/PPU tract was necessary for S1 sensitivity. When supercoiled plasmid was linearized by BamHI before digestion with S1, the 1.7- and 3.9-kb fragments were not observed. Thus, formation of the single-stranded region was dependent on supercoiling. This PPY/PPU tract also conferred supercoiling-dependent sensitivity to S1 nuclease when subcloned into other plasmid vectors (results not shown).
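The logic of the low resolution mapping can be sketched as simple geometry on a circular plasmid: one cut at the unique BamHI site plus one cut at an S1-sensitive site releases two fragments whose lengths sum to the plasmid length. The sketch below assumes a total length of 5.6 kb, which is inferred here only from the sum of the two reported fragments (1.7 + 3.9 kb) and is not stated in the text; the function name is hypothetical.

```python
def double_digest_fragments(plasmid_len_bp, site_a_bp, site_b_bp):
    """Fragment lengths (sorted) after cutting a circular plasmid once
    at each of two distinct positions."""
    arc = (site_b_bp - site_a_bp) % plasmid_len_bp
    if arc == 0:
        raise ValueError("the two cut sites must be at different positions")
    return sorted([arc, plasmid_len_bp - arc])

# Assumed plasmid length of 5600 bp; BamHI site placed at position 0 and
# the S1-sensitive site 1700 bp away, as reported in the text.
fragments = double_digest_fragments(5600, site_a_bp=0, site_b_bp=1700)
print(fragments)
```

A single S1-sensitive site therefore predicts exactly two bands, and loss of both bands together (after deleting the tract or linearizing first) is what a single site would be expected to produce.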
In an effort to determine the nature of non-B-DNA involved in the S1 sensitivity described above, we next determined which strand was cleaved by the single-strand-specific nucleases S1 and P1 and which specific bases were cleaved. For this purpose we modified standard primer extension methodology. In this modification, supercoiled plasmid DNA is partially digested by a single-strand-specific endonuclease to generate a population of nicked but not linearized DNAs. The positions of the nicks are then detected by primer extension. Two steps of primer extension are employed. The first primer extension uses a sufficiently large amount of unlabeled biotinylated primer to compete effectively for the template. The resulting extension products are isolated from other DNA in solution using streptavidin-coated paramagnetic beads. The DNA template strand and the immobilized extension strand are then denatured under alkaline conditions and separated by centrifugation or on a magnetic stand. The isolated template strand is used for a second primer extension with a different 32P-labeled primer. The isolation and use of the single-stranded template DNA increased the efficiency of primer extension from a small amount of 32P-labeled primer. In addition, formation of triplex structures can block the passage of DNA polymerase (35). By using the procedure just described and performing the primer extension at a high temperature, we avoided artifacts associated with such blockages. Prior to the first primer extension, the plasmid is digested by restriction enzymes at both sides of the region of interest. This provides a full-length extension product that serves as an internal control in reactions without nuclease.
[Fig. 1 legend, beginning missing] ... At about 20 h of incubation, the medium was changed, and the cells were transfected with supercoiled reporter plasmids using LipofectAce as described under "Experimental Procedures." Two days after plating the cells, the transfection medium was removed, and hepatocytes were treated with or without T3 (1.6 μM) in the medium described above. After an additional 48-h incubation, hepatocytes were harvested, and CAT activity, β-galactosidase activity, and protein were measured. The results were initially expressed as percentage of [14C]chloramphenicol converted to acetylated chloramphenicol per microgram of soluble protein and then corrected for differences in transfection efficiency by dividing by β-galactosidase activity of the same extract (A420 units per microgram of protein). Relative CAT activities were then calculated by setting the corrected CAT activity for T3-treated hepatocytes transfected with pBH147CAT to 100 and adjusting all other activities proportionately. The results are the means ± S.E. of seven experiments, each one using an independently isolated batch of hepatocytes. CAT and β-galactosidase activities of extracts from T3-treated hepatocytes transfected with pBH147CAT were 0.54 ± 0.17 (mean ± S.E., n = 7) percent conversion/15 h/μg of protein and 32 ± 7 × 10^−4 (mean ± S.E., n = 7) A420 units/min/μg of protein, respectively. Three or more independent preparations of each plasmid were used.
When supercoiled plasmid containing the malic enzyme sequence between −413 and +224 bp was incubated with nuclease, two kinds of cleavages were observed (Fig. 3A). In one population, the intensities of the signals increased with increasing nuclease concentration, and the cleavages were specific to samples treated with nuclease. The other population of cleavages was nonspecific and occurred in both nuclease-treated and untreated samples. They probably represent premature termination by DNA polymerase. On the top DNA strand, S1 nuclease detected a major single-stranded region between −104 and −91 bp; this region of the PPY/PPU tract contains a (CT)7 repeat. S1 cleavages were not detected on the opposite strand of the same region. The asymmetrical distribution of the single-stranded region suggests the DNA formed a triplex structure in which one DNA strand was left in the single-stranded state while the other strand was folded into the major groove of the double-stranded DNA, where it was protected from enzymatic attack. DNA of this region has the potential to form two triplex isoforms that, together, can account for the observed digestion pattern (Fig. 3C, structures a and b). The unpaired dinucleotides on the bottom strands of these structures may not be accessible to nuclease.
In addition to the major single-stranded region in the region of the (CT)7 repeat, a minor S1 nuclease-sensitive region was detected in the upstream part of the PPY/PPU tract between −128 and −115 bp. Again, no cleavage was observed on the bottom strand. DNA of this region has the potential to form two triplex isoforms that, together, can account for the observed digestion pattern (Fig. 3C, structures d and e). Cleavage in this region was less extensive than that in the (CT)7 repeat. In this population of plasmid DNA molecules, isoforms d and e may not be as abundant as isoforms a and b.
[Fig. 3 legend, beginning missing] ... 1 and 7), 0.05 (lanes 2 and 8), or 0.2 (lanes 3 and 9) units of S1 nuclease for 5 min at 37°C or with 1 unit of P1 nuclease for 0 (lanes 4 and 10), 1 (lanes 5 and 11), or 4 (lanes 6 and 12) min at 37°C as described under "Experimental Procedures." Lanes marked A, G, C, and T are sequencing reactions using the same primer used for the nuclease-treated DNAs. Sequences of the complementary strands are indicated so that they correspond to the strands being mapped. Numbers indicate the nucleotide sequence with respect to the major start site of transcription. Panel B, nucleotide sequence of the PPY/PPU region, indicating major cleavages by S1 nuclease (*) and P1 nuclease (!) and minor cleavages by S1 nuclease (+) and P1 nuclease (·). Panel C, possible non-B-DNA structures consistent with the results shown in panels A and B.
We also used another single-strand-specific endonuclease, P1 nuclease, to probe DNA structure in the region of the PPY/PPU tract. P1 nuclease activity has a broader pH optimum, permitting assessment of structures present at neutral pH. When supercoiled plasmid DNA was incubated with P1 nuclease at pH 7.0 in the presence of 50 mM NaCl and 5 mM ZnCl2, the pattern of cleavages was different from that observed with S1 nuclease at pH 4.6 (Fig. 3A). On the bottom strand, we observed a set of discrete cleavages corresponding to the adenines in the (GA)7 repeat. P1 nuclease cleaves preferentially after adenines (36). This result thus suggests that the purine strand was in the single-stranded state and that only linkages after adenines were cleaved by P1 nuclease under these conditions. Cleavage of the top strand in this region was less extensive than that of the bottom strand. One interpretation of this result is the presence of triplex structures that utilize pyrimidines as the third strand, i.e. the H-DNA counterparts of structures a and b (Fig. 3C). Alternatively, the lack of cleavage may reflect the specificity of P1 nuclease, and a substantial fraction of DNA in this region may be in the melted state (Fig. 3C, structure c).
As observed with S1 nuclease, the PPY sequence upstream of the (CT)7 repeat was sensitive to P1 treatment while the opposite strand was not; even the bond after the adenine was not cut by P1 nuclease. This result suggests that only the PPY strand was in the single-stranded state. This may reflect the presence of triplex structures similar to those detected after treatment with S1 nuclease except that the base triad "CG*A+" at −113 and −125 bp in the Watson-Crick duplex did not form (Fig. 3C).
Some nuclease P1 sensitivity was observed further downstream, especially at −62 bp between two cytidines. A substantial cluster of S1 nuclease-sensitive bonds also was observed on the bottom strand between −78 and −58 bp. Both nucleases also caused cleavage in this region on the top strand. The results suggest that the DNA is melted in this C-rich region, but the mechanism is obscure.
Nuclear Proteins Bind to Double-stranded DNA in the PPY/PPU Tract-The finding that sequences in the PPY/PPU tract could adopt non-B-DNA structures and, in particular, single-stranded regions, led us to examine nuclear extracts for proteins that would bind specifically to double- or single-stranded regions of the PPY/PPU tract. A number of possible DNase I footprints were detected on double-stranded DNA between −413 and +224 bp (Fig. 4). The clearest of these was located in the PPY/PPU tract between about −130 and −100 bp on the top strand and between about −125 and −100 bp on the bottom strand (Fig. 4). We used the gel electrophoretic mobility shift assay to characterize this binding activity further (Fig. 5). One major DNA-protein complex was formed when nuclear protein was incubated with a double-stranded probe that spanned −134 to −103 bp. Unlabeled probe at 100-fold molar excess completely eliminated binding to the labeled probe, suggesting that the binding was sequence-specific. Neither the purine nor the pyrimidine strand between −135 and −86 bp was able to compete for the binding at a 1000-fold molar excess. Thus, this nuclear protein(s) binds specifically to double-stranded DNA and, unlike some PPY/PPU-binding proteins (37), has no affinity for single-stranded DNA.
Because this protein bound to the (dC)·(dG)-rich part of a nuclease S1-sensitive region of the PPY/PPU tract, we were concerned that it might be the erythrocyte-specific poly(dG·dC)-binding protein, BGP1, or a related protein. BGP1 binds to an S1 nuclease-sensitive poly(dG·dC) sequence in the promoter region of the chicken β-globin gene and has been implicated in the regulation of expression of the β-globin gene (38,39). Even though expression of BGP1 is restricted to erythrocytes, our nuclear extracts could contain this protein because preparations of both liver and isolated hepatocytes may contain small numbers of erythrocytes. The minimal binding site for BGP1 is (dG)7·(dC)7; we therefore used (dG)30·(dC)30 to compete for the binding of nuclear proteins to the double-stranded DNA fragment from −134 to −103 bp. No competition was detected, even when the double-stranded competitor oligonucleotide was used at 5000-fold molar excess. This result indicates that the protein is a sequence-specific PPY/PPU-binding protein different from BGP1. The other minor bands on this autoradiograph (Fig. 5) may be due to degradation of the protein in the major complex or to nonspecific binding proteins.
We also used the gel electrophoretic mobility shift assay to assess binding of nuclear proteins to other regions of the PPY/PPU tract. We did not detect specific binding to double-stranded oligonucleotides spanning −117 to −90 bp or −105 to −78 bp (data not shown). We have not yet analyzed binding to regions downstream of the PPY/PPU tract using this approach.
Nuclear Proteins Bind to Single-stranded DNA from the PPY/PPU Tract-The potential for the PPY/PPU tract to form non-B-DNA structures prompted us to search for proteins that would bind specifically to single-stranded regions in this structure. We first performed a DNase I-footprint assay using single-stranded DNA as a probe. On the top (PPY) strand, nuclear proteins protected a broad region between about −142 and −84 bp (Fig. 6). The protection increased when more nuclear extract was used. At the highest level of nuclear protein, additional protected sites were noted both flanking the PPY/PPU tract and further 5′ and 3′. One of these appears to map to a run of seven C nucleotides interrupted by one A just downstream from the PPY/PPU tract at −76 to −68 bp. This site also showed weak S1 nuclease sensitivity (Fig. 3A). The other sites may represent protein binding to short runs of pyrimidines, which are poorer binding sites and/or do not have the potential to form single-stranded DNA in vivo.
On the bottom (PPU) strand, the region between −124 and −90 bp was not susceptible to cleavage by DNase I in the absence of nuclear protein. Nevertheless, even the minor cleavages, especially at the 5′ end of this region, were not protected by nuclear extract. Indeed, the highest levels of extract caused increased cleavage, probably due to endogenous nuclease in the extract. Lack of binding of nuclear proteins to the PPU strand was confirmed by gel electrophoretic mobility shift assays (data not shown).
Nuclear proteins that bind to the PPY strand were characterized using the gel electrophoretic mobility shift assay. Three major binding activities, A1, A2, and B, were identified when a single-stranded oligonucleotide spanning nucleotides −119 to −90 was used as probe (Fig. 7A). The specificities of these binding activities were examined by using different synthetic oligonucleotides to compete for binding. Complexes A1 and A2 were not competed by d(A)30, d(G)30, or d(T)30, but were competed by d(C)30 almost as efficiently as by unlabeled probe. These results suggest that proteins in complexes A1 and A2 specifically recognized a C-rich sequence. The C-rich region in this probe corresponds to the single-stranded loop in triplex structure d (Fig. 3C). We therefore tested an oligonucleotide corresponding to nucleotides −118 to −107 as a competitor for binding of proteins in complexes A1 and A2 to the longer probe. Fragment −118/−107 competed for the binding but with substantially lower affinity than that of unlabeled probe. This result suggested that proteins in complexes A1 and A2 recognize and can bind to the single-stranded loop in the putative triplex structure d. The observation that fragment −118/−107 binds less well than fragment −119/−90 may reflect a nonspecific requirement for additional bases at the 3′-end of −118/−107.
The binding specificities of A1 and A2 were examined further with other oligonucleotides. d(CT)10 and fragment −99/−86 were poor competitors. The proteins in complexes A1 and A2 appeared to have some affinity for double-stranded oligonucleotides because d(CT)10·d(GA)10 was modestly effective as a competitor. This turned out to be an artifact caused by dissociation of the double-stranded DNA in the binding reaction (see below). In sum, these results suggest that the proteins in complexes A1 and A2 are sequence-specific single-stranded DNA-binding proteins that recognize poly(dC) but may prefer that the C-run be interrupted by a T (or other nucleotide).
The binding activity of protein in complex B also was sequence-specific; binding was not competed by d(A)30, d(G)30, d(T)30, or d(C)30 but was competed by d(CT)10. This result suggested that the protein in complex B may bind to the (CT)7 repeat of the probe. An oligonucleotide corresponding to nucleotides −99 to −86, which contains only four CT repeats, failed to compete for binding. Binding of nuclear protein also was determined using a single-stranded fragment spanning −101 to −81 as probe. Binding of the protein in complex B was detected (results not shown). This fragment contains five CT repeats and may represent the minimum binding site.
The ability of the protein in complex B to bind to double-stranded (CT)10·(GA)10 also was tested; binding of this protein was competed by (CT)10·(GA)10, although not as efficiently as by unlabeled probe. As described above for the A1 and A2 complexes, however, this result was probably due to an artifact caused by dissociation of double-stranded DNA. We labeled this double-stranded oligonucleotide with T4 polynucleotide kinase and incubated it at the same concentrations and under the same conditions used in the competition reactions. Based on the subsequent analysis of the DNA in a nondenaturing polyacrylamide gel, about 25% of the total mass of DNA had dissociated into single-stranded oligonucleotides. We also tested the binding activity of a double-stranded oligonucleotide spanning −117 to −90 bp; this is the same region spanned by the single-stranded probe, −119/−90. No binding activity was detected with the double-stranded probe (data not shown). We conclude that the protein in complex B is single-strand-specific and probably recognizes the CT repeat.
FIG. 6. DNase I footprint of single-stranded DNA of malic enzyme promoter. pM13ME−413/+224(+) and pM13ME−236/+224(−) were used to prepare top and bottom strand probes, respectively, as indicated under "Experimental Procedures." About 100,000 cpm of probe (0.2 pmol) was preincubated with increasing amounts of nuclear protein (lanes 2-5 and 8-11). After a 1-h preincubation on ice, 1 unit of RNase-free DNase I was added. After 15 min in ice water, the reaction was stopped, and the resulting DNA fragments were separated by size on 6% polyacrylamide, 8 M urea sequencing gels. Nuclear extracts were prepared from the livers of 2-week-old chicks as described under "Experimental Procedures." Products of reactions that did not contain DNase I or nuclear protein are in lanes 1 and 7. Products of reactions that contained 4 µg of nuclear protein but no DNase I are in lanes 6 and 12. The amount of reaction mixture added to each lane was adjusted so that the signal for the full-length probe was about the same intensity. Lanes marked A, G, C, and T contained the products of sequencing reactions using the same primer used for the treated DNAs. Numbers indicate the nucleotide sequence with respect to the major start site of transcription.
DNase I footprinting of single-stranded DNA exhibited a broad region of protection on the PPY strand that extended upstream to about nucleotide −144. We therefore tested binding of nuclear protein to the region upstream of −119 bp using the single-stranded PPY tract from nucleotide −135 to −115 as probe (Fig. 7B). Like that of the protein(s) in complexes A1 and A2, binding of the protein that formed a complex on the 5′ part of the polypyrimidine tract (complex C) was not competed by d(A)30, d(G)30, or d(T)30 but was competed efficiently by d(C)30. In fact, d(C)30 competed more efficiently than unlabeled probe. Binding to this fragment also was competed by the 3′ single-stranded PPY fragment, −119 to −90. These results suggest that the binding properties of the protein in complex C were similar to those of the proteins in complexes A1 and A2. As with the proteins in complexes A1 and A2, binding of the protein in complex C was competed substantially less well by −118/−107 than by unlabeled probe and practically not at all by d(CT)10. On the other hand, the protein in complex C had a lower affinity for its own probe than for fragment −119/−90. This may have been due to a slight difference in nucleotide sequence: the run of Cs in −135/−115 has one less C and a differently positioned T. The similarity of binding properties suggests that the protein in complex C is the same as that in complex A1 or A2. Only one complex formed on fragment −135/−115, suggesting that A1 and A2 contain different proteins, only one of which is the same as that in complex C.
The binding of protein in complex B to single-stranded probe was enhanced when binding of protein in complexes A1 and A2 was reduced by competition (Fig. 7A), suggesting that the proteins in A1 and A2 may compete with that in B in the following way: when complexes A1 and/or A2 are formed, the oligonucleotide is no longer available for formation of complex B. We tested this hypothesis by changing the ratio of nuclear protein to labeled probe in the binding reaction. When the ratio of nuclear protein to probe was highest, all probe was shifted into complexes A1 and A2. The proteins in complexes A1 and A2 likely have higher affinities for probe than that in complex B or are present at higher concentrations in the nuclear extract. As the amount of protein in the extract was reduced, complex B began to form. This result suggests that, although the proteins in complexes A1 and A2 and that in complex B recognize different sequences, the sites may overlap or, because of interactions between the sites, formation of complexes A1 and A2 may preclude formation of complex B. To determine if binding of protein in complex B to its site can exclude the binding of proteins in complexes A1 and A2, we started four binding reactions with a ratio of nuclear protein to probe that permitted binding of protein in complex B to the probe (Fig. 8). After a 30-min preincubation, increasing amounts of nuclear protein were added to each binding reaction, and the incubation continued for another 30 min. The protein in complex B was displaced by those in A1 and A2 when sufficient nuclear protein was added to the preincubated samples. This result suggests that when probe is limiting, only proteins in complexes A1 and A2 will be bound.
The broad region of protection of single-stranded polypyrimidine tract in the DNA footprint assay likely results from the binding of proteins in complexes A1 and/or A2 to the two C-rich sites. Protein in complex B also may bind to the (CT)7 region in the footprint assay, depending on the DNA/protein ratio. Whether or not the protein in complex B bound to the CT repeat in the footprint assay would not be discernible, however, because the region of the CT repeat was resistant to DNase I in the absence of added nuclear protein (Fig. 6).
FIG. 7. Gel electrophoretic mobility shift assay of binding of nuclear proteins to single-stranded PPY sequences. Single-stranded probes were labeled using T4 polynucleotide kinase. 32P-labeled probe (0.01 pmol) was mixed with competitor DNA at the indicated molar ratios and then incubated with about 0.1 µg of nuclear protein as indicated under "Experimental Procedures." The binding reaction was performed on ice for 45 min. DNA and DNA-protein complexes were resolved on 6% nondenaturing polyacrylamide gels. Nuclear extracts were prepared from the livers of 2-week-old chicks as described under "Experimental Procedures." Panel A, single-stranded oligonucleotide corresponding to PPY sequence between −119 and −90 bp was the probe. Panel B, single-stranded oligonucleotide corresponding to PPY sequence between −135 and −115 bp was the probe.
Promoter Activity of the PPY/PPU Tract-The detection of nuclear proteins that bind specifically to DNA sequences that may be single-stranded in triple-helical structures suggests that these unusual structures may play a role in the function of the PPY/PPU tract. What is the role of the PPY/PPU tract in regulating transcription of the malic enzyme gene, and how could the protein-DNA interactions that we have observed contribute to that function?
Transcription of the endogenous malic enzyme gene has its major start site at +1 bp, with several minor start sites further upstream, including ones at −74 and −89 bp.² Messenger RNA transcribed from p[ME−5800/+31]CAT in transfected hepatocytes also has a start site at +1. Additional upstream start sites also are detected, but they are similar in intensity to that at +1 rather than minor as in the endogenous gene. In contrast, mRNA transcribed from transfected pBH147CAT has major start sites of about equal intensity at −74 and −89 bp and little or no initiation from +1 bp. These results and the inhibition of T3-induced and basal transcription caused by deletion of the PPY tract from pBH147CAT suggested the possibility that the PPY tract might serve as a promoter.
To test this hypothesis, we constructed several reporter plasmids in which the TK promoter of pME[−3903/−3703]TKCAT was replaced by the PPY tract plus 8 bp of 3′-flanking DNA or variants thereof with different deletions (Fig. 9A). The malic enzyme fragment, −3903/−3703 bp, contains a T3 response unit composed of one major and several weak T3 response elements.² Promoter activities of the resulting reporter plasmids were tested in transient transfection assays in T3-treated chick embryo hepatocytes in culture. The −3903/−3703 fragment itself lacked promoter activity (Fig. 9B). When the PPY tract plus 8 bp of 3′-flanking DNA were inserted into the plasmid to form pME[(−3903/−3703)ME−135/−78]CAT, the level of promoter activity was comparable with that for hepatocytes transfected with pBH147CAT (data not shown) and almost half that of pME[(−3903/−3703)ME−147/+31]CAT. The latter contains the entire minimal malic enzyme promoter from −147 to +31 bp. These results indicate that the fragment from −135 to −78 was capable of acting as an independent promoter.
When the orientation of the PPY tract was reversed (pME[(−3903/−3703)ME−78/−135]CAT), promoter activity was lost (Fig. 9B). Thus, the potential for formation of an unusual DNA structure was not sufficient for promoter activity; correct orientation also was necessary. The nucleotide sequence between −88 and −80 bp is repeated exactly between −18 and −10 bp, suggesting a possible role in transcription initiation. When most of the upstream repeat was deleted from pME[(−3903/−3703)ME−135/−78]CAT, while keeping the PPY/PPU tract intact, promoter activity was lost, indicating that the PPY/PPU tract alone is not sufficient for promoter activity. Deletion of the PPY/PPU tract (Δ−134/−90) from pME[(−3903/−3703)ME−135/−78]CAT also caused essentially complete loss of promoter activity. Thus, one copy of the 9-bp repeat is necessary but not sufficient for promoter activity. Deletion of the (CT)7 repeat (pME[(−3903/−3703)ME−135/−78Δ−105/−92]CAT) also caused loss of promoter activity but not to the same extent as when the entire PPY/PPU tract or the 9-bp repeat was deleted. Finally, deletion of the upstream part of the PPY/PPU tract between −135 and −106 bp caused a 64% increase in promoter activity.
In sum, these results suggest that both the CT repeats and the 9-bp repeat are necessary for promoter activity and together are sufficient for promoter activity. We have not detected a protein-binding activity specific for the 9-bp repeat, but both the essential CT repeat and the upstream part of the PPY/PPU tract specifically bind single- or double-stranded DNA-binding proteins. Interestingly, the (CT)7·(GA)7 region remained S1 nuclease-sensitive in pME[(−3903/−3703)ME−105/−78]CAT (data not shown).

DISCUSSION
Several types of non-B-DNA structures play important roles in various cellular events (8). One non-B-DNA structure is the triplex structure. When the triple helix was first described, the requirement for supercoiling and acidic conditions raised questions about its physiological role in regulating the functions of DNA. However, accumulating evidence suggests that triplex structures can form at neutral pH, in a process favored by divalent metal ions and polyamines (11,40,41). Furthermore, chromatin structure and other local protein-DNA interactions result in a high degree of supercoiling in intact cells (42-44). Recent work has expanded the repertoire of sequences that can form the third strand of a triplex by demonstrating an increased tolerance for certain mismatches in Hoogsteen base pairing of H*-DNA (11,45), further increasing the likelihood that triplexes will form under physiological conditions.
The transition from B-DNA to triplex is proposed to start with melting of the central region of a PPY/PPU tract (40,46,47), followed by bending of the region and formation of Hoogsteen base pairs. The abundance and the stability of a particular form of triplex structure, therefore, are determined largely by the energy required to melt the central region and the length of the Hoogsteen base-paired region. In the PPY/PPU sequence of the malic enzyme gene, the sequence upstream of (CT)7 is C-rich and more difficult to melt than the downstream part. This may explain why the triplex structures formed by the upstream region are less abundant than those formed in the downstream region at acid pH (Fig. 3). The (CT)7 sequence, although easier to open up, is not as long as other CT repeat sequences that easily form stable triplex structures. The formation of triplex structures a and b (Fig. 3C) by the (CT)7-centered region may have been facilitated by protonation of adenines in the acidic S1 digestion buffer. The nonorthodox base triad CG*A+ has been observed as a major component in H*-DNA formed under acidic conditions (45). At neutral pH, lack of protonation of A nucleotides introduces four continuous mismatches in structure a and reduces the outer stretch of Hoogsteen base pairs in structure b to three. The outer four base pairs are separated from the remaining legitimate Hoogsteen base pairs by two mismatches. Thus, the loss of one base pair in structure b and the four mismatches in structure a should have a substantial destabilizing effect on these triplex structures. Consistent with this reasoning, our results did not suggest formation of triplex structures by the (CT)7-centered region at neutral pH. The non-B-DNA structure that we detected in this region is likely the melted form.
In the 5′-flanking DNA of the malic enzyme gene from +31 to −419 bp, the A(CT)7TT sequence in the PPY/PPU tract is the only 17-bp region that has an (A + T) content greater than 50%; the rest of this region of the malic enzyme gene is very GC-rich. From +132 to −249 bp, the region is 70-80% GC; from −249 to −449, it is 67% GC.² Under the stress of supercoiling, the CT region may open more readily than its neighboring sequences; the next step would be formation of triplex structures, although the short length of the CT repeat may preclude formation of a stable triplex structure. The formation of transient triplex structures might shift the equilibrium from B-DNA toward the melted structure, trapping the region in a dynamic non-B-DNA state.
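The base composition quoted for the A(CT)7TT element can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; the function name `at_fraction` and the variable `element` are hypothetical and introduced only for illustration, and only the 17-nucleotide element written out in the text is used because the flanking sequence is not reproduced here.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence (hypothetical helper)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


# Top-strand element named in the text: A, then seven CT repeats, then TT.
element = "A" + "CT" * 7 + "TT"

print(len(element))                            # length of the element in nucleotides
print(round(at_fraction(element) * 100, 1))    # percent (A + T)
```

The element has 10 A or T residues out of 17 (about 59%), which agrees with the statement that its (A + T) content exceeds 50%.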
The ability of the d(CT)7 repeat to adopt supercoiling-dependent non-B-DNA structures may explain why this region is required for promoter activity in transfected pME[(−3903/−3703)ME−135/−78]CAT. One of the crucial steps in transcription initiation is melting of double-stranded DNA in the promoter region. Conditions that promote melting not only stimulate transcription initiation but also reduce or eliminate the requirements for several general transcription factors (48-50). In another study, the initiator-binding protein, YY1, supported transcription initiation in the absence of TATA-binding protein on a supercoiled promoter but not on a linear promoter (51), indicating that sequence-specific changes in DNA structure caused by supercoiling may be involved in TATA-binding protein-independent transcription.
We propose the following model for promoter function of the region between −135 and −78 bp. Three sequence elements are important. The first element, 5′-CCCGCAGGA-3′ (−88 to −80 bp), is essential for the promoter activity and may identify the transcription start site. The second indispensable region is the (CT)7 repeat. The third important sequence is upstream of the CT repeat, contains two C-rich elements, and in transfections has a negative influence on promoter activity. The stress of supercoiling should cause the PPY/PPU tract to melt and adopt one of several possible non-B-DNA conformations. We suggest that a melted structure formed by the (CT)7 repeat represents the active promoter conformation. Binding of the (CT)n-specific protein to the top strand may be sufficient to maintain the open conformation. The open structure of the CT repeat may facilitate entry of RNA polymerase into the template. Alternatively, interaction of the (CT)n-specific protein with other factors may facilitate transcription initiation.
The sequence upstream of the CT repeat may serve as a regulatory unit. Formation of triplex structure by the upstream region should release supercoiling stress and may prevent formation of an open conformation within the CT repeat. Binding of the poly(dC)-specific protein to single-stranded regions of the triplex structures might stabilize the triplex structure and its negative activity. Binding of the poly(dC)-specific protein would not only lock the sequence in the inactive triplex conformation but also might prevent the binding of the (CT)n-specific protein (Fig. 8) even if the CT repeat did undergo some melting. The negative effect of the upstream PPY/PPU sequence, however, might be suppressed by the double-stranded DNA-binding protein because it would prevent formation of the triplex structure, permitting the energy of supercoiling to melt the CT repeat and facilitate transcription initiation.
In our model, we propose that two single-stranded DNA-binding proteins are important in regulating promoter activity. Single-stranded DNA-binding proteins with binding specificities similar to those we describe have been identified in several mammalian organisms (37,52-55). Numerous reports suggest that single-stranded DNA-binding proteins are involved in transcription regulation. For example, the levels of several single-stranded DNA-binding proteins correlate with the level of expression of certain genes (56,57). Two single-stranded DNA-binding proteins have been cloned and contain domains characteristic of some transcription factors (55,56). One of them, CNBP, stimulates expression of a reporter gene driven by a promoter and DNA fragment containing CNBP binding sites (58). In addition to proteins isolated by virtue of their ability to bind to single-stranded DNA, some transcription factors have comparable or higher affinity for one strand of their recognition sequence than for the double-stranded form (59,60). In the case of estrogen receptor, for example, the affinity for the noncoding strand is 60-fold higher than that for the double-stranded sequence. Furthermore, some well known DNA cis-elements are sensitive to S1 nuclease and may provide binding sites for specific single-stranded DNA-binding proteins (61,62). Finally, a single-stranded DNA-binding protein, SSB, activates a promoter by stabilizing a DNA hairpin structure, providing evidence that the interplay between the secondary structure of DNA and binding of a single-stranded DNA-binding protein is involved in promoter function (63). All of these results support the hypothesis that single-stranded DNA-binding proteins play important roles in transcription and its regulation.
What is the role of the PPY/PPU tract in transcription of the endogenous malic enzyme gene? The endogenous gene has its major start site at +1, with only minor transcription initiation immediately downstream of the PPY/PPU sequence. This result suggests that, normally, the endogenous gene may not use the PPY/PPU tract as a promoter. In the reporter plasmid, pME[−5800/+31]CAT, use of start sites just downstream of the PPY/PPU tract is essentially equivalent to that at +1. In the reporter plasmid, pBH147CAT, both major start sites are just downstream from the PPY/PPU tract. Furthermore, transcription is greatly curtailed in the absence of the PPY/PPU tract. The increased usage of start sites at about +1 (pME[−5800/+31]CAT versus pBH147CAT) suggests that sequence elements upstream of −147 bp may play a role in start site selection. In any event, our results suggest that a PPY/PPU tract can define a cryptic site for transcription initiation in certain promoters and suggest a mechanism by which transcription could be initiated and regulated in promoters lacking a TATA box.
One important difference between the endogenous malic enzyme gene and transiently transfected malic enzyme genes is likely to be the lack of assembly of the DNA into normal chromatin on the transiently transfected genes. Proper chromatin structure may add constraints that prevent this PPY/PPU tract from acting as a promoter in the endogenous gene. The converse also may be true; changes in DNA structure may regulate chromatin structure. For example, a PPU tract in the promoter region of the chicken adult β-globin gene is involved in regulation of nucleosome structure (38). Furthermore, it has been proposed that nucleosomes cannot assemble on parts of the DNA that are in a triplex structure (8). The transition from a low transcription rate in the starved state to a 40-fold higher transcription rate in the refed state is accompanied by a profound change in chromatin structure in the region from −320 to −72 bp; chromatin in this region is hypersensitive to DNase I in livers from fed chicks and resistant to DNase I in starved chicks. The transition from DNase I sensitivity to DNase I resistance occurs within 4 h and occurs with the same kinetics as the increase in transcription rate (23). The changes in chromatin structure do not appear to be caused by the altered transcription rate because this region is hypersensitive to DNase I in hepatocytes in culture whether the rate of transcription is high or low (64). We speculate that the PPY/PPU-binding proteins that we have described play a role in regulating chromatin structure. If so, they may play important roles in mediating the effects of feeding and starvation on transcription of the malic enzyme gene.