Transcription of the Human Folylpoly-γ-glutamate Synthetase Gene*

In mammals, folylpoly-γ-glutamate synthetase (FPGS) activity is found in any cell undergoing sustained proliferative phases, but this enzyme also displays a tissue-specific pattern of expression in differentiated tissues. It is now reported that the steady state levels of FPGS mRNA in normal and neoplastic cells reflect these patterns, supporting the concept that the control mechanisms underlying this distribution are transcriptional. To initiate an understanding of these interacting levels of control, we have determined the position and properties of the minimal FPGS promoter controlling transcription of the FPGS gene in human CEM leukemia cells, a line which expresses high levels of this enzyme and its mRNA. The TATA-less region immediately upstream of the major transcriptional start site previously mapped in human tumor cells, which includes several GC- and Y-boxes, functioned as a remarkably efficient promoter when used to drive expression of a luciferase reporter in transient expression studies in CEM cells. The minimal region of the FPGS promoter required for maximal transcriptional activation in CEM cells included the 80 base pairs over which the multiple transcriptional start sites were located, and the 43 base pairs immediately upstream. DNase I footprint analysis detected the binding of Sp1 at all seven of the consensus sites within the probe used, two of which are contained within the minimal promoter region. The several Sp1 sites immediately upstream of the first major transcriptional start activated transcription in Drosophila cells when cotransfected with an Sp1 construct, including those in the region which functioned as a minimal promoter in CEM cells. An additional region of the minimal promoter, situated between the two translational start codons of the FPGS gene, was bound by protein(s) from HeLa cell nuclear extracts.
We conclude that transcription of the FPGS gene in CEM cells involves transactivation events over a limited upstream DNA sequence and that the FPGS promoter used in proliferating human leukemic cells has strong similarity to other TATA-less promoters that utilize tandem, closely spaced Sp1 sites to initiate transcription.

Folylpoly-γ-glutamate synthetase (FPGS)¹ catalyzes the ATP-dependent formation of an amide bond between the γ-carboxyl group of the naturally occurring folates and the amino group of glutamic acid. The addition of glutamic acid moieties to folate compounds allows the intracellular retention and concentration of both the folate cofactors (1-3) and all of the "classical" folate antimetabolites studied to date. As a result of its role in the retention of folate cofactors in the cytosol and mitochondria, FPGS is essential for folate homeostasis and the survival of proliferating mammalian cells. The metabolism of antimetabolites by this enzyme plays a major role in the cytotoxicity, and perhaps the selective cytotoxicity, of several folate antagonists used or under development for the treatment of human cancers.
Because any cell that is attempting to proliferate during exposure to classical antifolates is susceptible to cytotoxicity if it expresses FPGS, the tissue distribution of this enzyme and the factors controlling its expression are of critical importance. From early studies on the distribution of enzyme activity, it was known that the levels of FPGS are moderately high in some tumors, in normal gut and bone marrow stem cells, and in the two specialized organs of folate metabolism, liver and kidney, but enzyme levels in other adult mouse tissues were barely detectable (4-6). Studies relating methotrexate polyglutamylation and FPGS activity in individual childhood leukemias to the therapeutic activity of this drug have suggested that the sensitivity of some of these tumors to antifolate chemotherapy is causally related to the level of expression of FPGS (6-9). In a recent study of FPGS in human leukemias (9), B-lineage leukemias, which are more responsive to methotrexate, had higher levels of FPGS activity than did T-lineage leukemias. Cells from acute nonlymphoblastic leukemias, which are generally resistant to antifolates, had lower enzyme levels (9). This lineage-specific effect was also seen in cells selected from normal bone marrow; lymphoid progenitor cells (CD10/CD19+) had high levels of FPGS activity similar to those seen in blast cells from patients with B-lineage leukemias, whereas normal nonlymphoid progenitors (CD34+) had lower levels (9). On the other hand, a decrease in cellular FPGS activity in tumor cells can contribute to or cause both acquired (10,11) and intrinsic (12) resistance to methotrexate.
A previous study in this laboratory (6) concluded that FPGS levels are controlled by at least two mechanisms, one of which is linked to proliferation, whereas the other results in a tissue-specific pattern of expression in differentiated tissues. Despite this complexity, all available evidence (13) appears to rule out the existence of multiple genes for FPGS function, a fact that has focused attention on control of the known FPGS locus. With the publication of the human FPGS cDNA (14), recent studies have defined the structure of the FPGS genomic locus (13) and the use of multiple transcriptional start sites (15), and they have begun to address how the control of FPGS gene expression is mediated. Thus, when HL60 promyelocytic leukemia cells were chemically induced to differentiate, levels of FPGS enzyme activity decreased (6), and its mRNA declined coordinately (16). FPGS enzyme activity was not detectable in human peripheral blood lymphocytes (6), but both enzyme activity and FPGS mRNA increased when the cells were stimulated to proliferate with phytohaemagglutinin (17). Conversely, in a cell line selected for resistance to the second generation dihydrofolate reductase inhibitor edatrexate, the decreased FPGS levels causative of resistance were demonstrated to be due to a decrease in the rate of transcription of the FPGS gene (18). Hence, the limited information available implicates the level of transcription of this gene as the major determinant of cellular FPGS activity.
On these bases, we initiated a study to define the processes involved in the transcriptional control of this gene. In this study, we report the promoter region of the FPGS gene in a human leukemic cell. We demonstrate that a well defined short sequence, which covers the major transcriptional start sites mapped previously in several human tumor cell lines (19), together with an additional 43 bp upstream, is necessary and sufficient for basal promoter activity and that discrete binding sites for Sp1 and for a second nuclear factor define this region. These studies constitute the first description of the promoter controlling transcription of this gene and set the stage for studies defining the proliferation-linked and tissue-specific controls on the expression of the FPGS gene.

MATERIALS AND METHODS
Northern Analysis of FPGS mRNA-RNA from murine normal tissues and human tumor cell lines was extracted using the Trizol reagent (Life Technologies, Inc.), denatured with glyoxal, and separated by size on a 1.2% denaturing agarose gel in 10 mM sodium phosphate buffer, pH 7.0. The RNA was transferred onto nylon membranes (Biotrans, ICN, Irvine, CA) and hybridized either with the first 690 bp of the published human cytosolic FPGS-coding region (14) or with a 1.7-kb probe representing the downstream sequences of the mouse L1210 cell FPGS cDNA (19). Probes were random-labeled to a specific activity of about 2 × 10⁹ cpm/µg and were used at a concentration of 1 × 10⁶ cpm/ml of hybridization solution (0.5 M sodium phosphate, pH 7.0, 7% SDS, 1% bovine serum albumin, and 1 mM EDTA). A Northern blot of poly(A)+ RNA (2 µg/lane) from normal human tissues was purchased from CLONTECH Laboratories (Palo Alto, CA). This latter blot was blocked and hybridized in 75 mM NaCl, 50 mM phosphate, 5 mM EDTA, 2% SDS, 100 µg/ml sheared salmon sperm DNA, and 10× Denhardt's solution, pH 7.4. All blots were hybridized at 65°C, and filters were washed to a stringency of 0.2× SSC and 0.1% SDS at 65°C. For both murine and human Northern blots, a human β-actin probe was used to normalize for RNA loading.
5′-Rapid Amplification of cDNA Ends (RACE)-5′-RACE was carried out starting with 2 µg of poly(A)+-selected RNA from CEM cells using a gene-specific antisense primer in exon 5 for reverse transcription and poly(C)-tailing; subsequent polymerase chain reaction (PCR) amplification used a nested primer within exon 4 and an anchor primer. A published method (20) was followed using the primers and reaction conditions previously described (15). The PCR products resulting from this 5′-RACE, representing the 5′ ends of the FPGS mRNA, were separated on a 1.5% agarose gel. The products ranged over approximately 100 bp, and three individual ligations into the pCRII vector (InVitrogen, San Diego, CA) were carried out using DNA purified from different regions of the gel to avoid the size bias inherent in such reactions. After transfection of the cloned PCR products into bacteria, 203 clones were individually streaked out onto duplicate grids. Nitrocellulose lifts of these duplicate plates were probed with either FPGS-specific primer F5 (5′-GGCATTGGTCTGGCAGGGTATTGAGCATGCGCACGGC-3′), located at the 5′ end of exon 2, or F6 (5′-CCGAGCATGGAGTACCAGG-3′), which was positioned at the extreme 3′ end of exon 1. These primers were labeled using polynucleotide kinase and [γ-³²P]dATP. Manual double-stranded sequencing was performed using Sequenase 2.0 (U. S. Biochemical Corp.).
Construction of Reporter Constructs-The FPGS promoter constructs used for transfection experiments were generated from a 2.4-kb HindIII-BamHI restriction fragment of genomic clone BL (15) cloned into pBlueScript (pBSHB). This 2.4-kb fragment had previously been shown to stretch from intron 1 upstream into the 5′-flanking sequences (15) and to contain 1.65 kb of sequence upstream of exon 1. The pHinc-Eae construct was made by complete digestion of pBSHB with HincII and subsequent partial digestion with EaeI. The 1.6-kb fragment of pBSHB from this double digest was ligated into the NotI and EcoRV sites of pBlueScript (pBS27); the inserted fragment was then removed using the polylinker sites Acc65I and EcoICRI, allowing it to be ligated into the Acc65I and SmaI sites of the pGL3-basic vector (Promega, Madison, WI). The upstream start methionine was mutated using PCR; the template used was pBS27, and the primers were Mut1 (5′-GGGCGCCGGGACTAGTTCGCG-3′; the bold letters show the position of the mutated methionine) and the M13 reverse primer. PCR conditions for all constructs were 1 min each at 95, 60, and 72°C for 30-35 cycles. The PCR product was gel-purified and ligated into the pCRII vector (InVitrogen). This construct was cut with NarI and SacI, and the 232-nucleotide fragment of interest was ligated to the 1.8-kb 5′-flanking region of the FPGS gene in pBlueScript, replacing the fragment that contained the unmodified start methionine to yield the plasmid pFMUT18. This construct was then used to generate a series of deletion constructs cloned in the pGL3-basic expression plasmid. Initial deletion constructs were chosen by the availability of restriction enzyme cleavage sites (see Fig. 3). The remaining constructs were PCR-derived. pF11-Eae1 was generated using the F11 primer (5′-GTGAGGCGACGCTGCCGTG-3′) and the M13 primer in the pFMUT18 construct.
pRC1-Mut3 and pRC2-Mut3 were derived from a pBlueScript construct in which the upstream methionine was mutated as shown above but which also extended further downstream into intron 1 (the primers used to generate these constructs were RC1, 5′-GGTACCGACGCTGCGCTGATTGGC-3′; RC2, 5′-GGTACCTTTGGGGCGGTGCTGATTGATG-3′; and Mut3, 5′-CACGTGCAGCTGATACCTGGTACTCCTTGC-3′). All PCR fragments were cloned into the pCRII vector and released with the enzymes Acc65I and EcoICRI for ligation into the Acc65I and SmaI sites of the pGL3-basic vector. The cloned segments that were PCR-generated were sequenced to confirm a lack of PCR-generated artifacts.
SP1 Expression Constructs-The SP1 expression construct pPacSP1, containing the 2.1-kb SP1 cDNA downstream of the Drosophila actin promoter, and the corresponding vector without insert, pPac+NdeI, were generously provided by Dr. Robert Tjian (University of California, Berkeley).
Transfection Assays-CEM cells were cultured in RPMI 1640 medium supplemented with 10% fetal calf serum. Exponentially growing cells were washed once with phosphate-buffered saline and resuspended in serum-free medium at a density of 1.2 × 10⁷ cells/ml. Aliquots (0.8 ml) of cells were mixed with 10 µg of the luciferase construct and 30 µg of the internal control vector CMV-β-galactosidase, and were transfected using a BTX transfector 300 electroporator (Biotechnologies and Experimental Research, Inc., San Diego, CA) at 250 V and 300 microfarads. As the size of the deletion constructs decreased, the amount of construct was decreased proportionally to maintain the construct molarity constant between transfections; pGL3-basic vector was used to keep the total DNA content per transfection sample at 10 µg. After electroporation, cells were cultured in 12 ml of complete medium for 48 h. The cells were then washed once in phosphate-buffered saline and lysed in 300 µl of reporter lysis buffer (Promega). The lysed cells were centrifuged at 14,000 rpm for 1 min in a Microfuge, and the supernatants were assayed for luciferase and β-galactosidase activity (Promega). The β-galactosidase activities of the extracts were determined and used to normalize luciferase levels for transfection efficiency. Each experiment was performed at least three times with triplicate determinations in each experiment. Results were first normalized for β-galactosidase activity, then expressed either relative to the luciferase expression driven by the SV40 promoter of the pGL3-control vector in the same experiment or as a percentage of the activity produced by the most active FPGS promoter construct in a series.
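The normalization and DNA-balancing arithmetic described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function names, the plasmid sizes, and all readings are hypothetical, and the molarity calculation assumes that the molar amount of a plasmid is proportional to its mass divided by its total length in base pairs.

```python
from statistics import mean


def normalized_luciferase(luciferase, beta_gal):
    """Divide each luciferase reading by the beta-galactosidase reading
    from the same extract (correction for transfection efficiency)."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def relative_activity(test, reference):
    """Mean normalized activity of a test construct relative to a reference.

    Each argument is a (luciferase_readings, beta_gal_readings) pair.
    """
    test_mean = mean(normalized_luciferase(*test))
    reference_mean = mean(normalized_luciferase(*reference))
    return test_mean / reference_mean


def equimolar_dna(construct_bp, longest_bp, total_ug=10.0):
    """Return (construct mass, filler vector mass) in micrograms.

    The construct mass matches the molarity of total_ug of the longest
    plasmid; the filler mass keeps the total DNA per sample constant.
    Sizes are total plasmid lengths in base pairs (assumed values).
    """
    construct_ug = total_ug * construct_bp / longest_bp
    return construct_ug, total_ug - construct_ug


# Hypothetical triplicate readings in arbitrary units, not data from this study.
fpgs = ([200.0, 220.0, 180.0], [10.0, 10.0, 10.0])
sv40 = ([40.0, 40.0, 40.0], [10.0, 10.0, 10.0])
print(relative_activity(fpgs, sv40))
print(equimolar_dna(construct_bp=5000, longest_bp=6500))
```

Dividing by the reference within the same experiment, as in `relative_activity`, removes day-to-day variation in absolute light units before experiments are averaged.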
Drosophila Schneider 2 cells (1 × 10⁶ cells) were plated onto 60-mm dishes in Schneider's Drosophila medium (Life Technologies, Inc.) plus 10% fetal calf serum and used the next day for transfection. Each luciferase construct (10 µg) and 1 µg of either pPacSP1 or pPac+NdeI were transfected into the cells as a calcium phosphate precipitate. After 48 h, the cells were scraped from the plates, washed twice with phosphate-buffered saline, and lysed in 300 µl of reporter lysis buffer, as above. Supernatants were assayed for luciferase activity, which was then normalized to protein content measured using an adaptation of the Bradford technique (Bio-Rad).
DNase I Footprinting-Probes were generated from the pStu-Eae construct (see Fig. 3) by cleaving with Acc65I to analyze the sense strand or with BglII to analyze the antisense strand. This probe stretched from position −180 bp to +124 bp, where +1 represents the most 5′ major transcriptional start site. 5′-End-labeling was performed using Superscript reverse transcriptase (Life Technologies, Inc.) and high specific activity ³²P-labeled dATP and dCTP (DuPont NEN). After labeling, the fragment was released using the opposite restriction enzyme site, and the labeled fragment was gel-purified. Protein binding and DNase I reactions were carried out using 3 footprinting units of recombinant Sp1 or 25 µg of HeLa nuclear extract (Promega) in the buffers supplied with the Hotfoot footprinting system (Stratagene). Nuclear extracts or purified proteins were incubated for 10 min on ice with 100 µg/ml poly(dI-dC) before the addition of 15,000 cpm of ³²P-labeled probe to each reaction mix. Mixtures were incubated for an additional 20 min on ice and then partially digested with DNase I. The amount of DNase I used was titrated to optimize the degree of cleavage; 0.025 unit was used for samples containing DNA alone and for the samples containing purified Sp1, and 0.5-1 unit was used for the reactions containing 25 µg of HeLa extract. Products were separated on a 6% denaturing polyacrylamide sequencing gel, and size was determined from the migration of a sequencing reaction run on the gel adjacent to the test samples.

RESULTS
FPGS mRNA Levels in Normal and Neoplastic Tissues-The size and abundance of transcripts from the FPGS gene were determined by Northern blot analysis in normal tissues and neoplastic cell lines. Previous studies (4, 6) of the levels of FPGS activity in normal mouse tissues had indicated high enzyme activity in murine liver and kidney, with lower levels in spleen, and very low to undetectable levels in several other tissues. All mouse tumors surveyed in those studies expressed substantial levels of FPGS activity (4, 6). The hybridization of a near full-length mouse cDNA probe to Northern blots mirrored this pattern (Fig. 1a), with FPGS mRNA levels highest in the liver and kidney and equivalent levels in the L1210 leukemia cell line. Mouse brain, heart, skeletal muscle, lung, and spleen had low to undetectable levels of message. A single 2.4-kb band was detected in all mouse cells expressing FPGS. Northern analysis of poly(A)+ RNA from normal human tissues, using a 0.7-kb probe corresponding to the 5′-coding region of the FPGS gene, showed some interesting differences from the pattern in mouse tissues (Fig. 1, a and b). As in the mouse tissues, human liver showed high levels of FPGS mRNA, and kidney also expressed relatively high levels. Human pancreas, a tissue which was not studied in the mouse, had levels intermediate between those of the liver and kidney. A low level of FPGS message was detected in the brain, as was seen with the mouse, and surprisingly little was seen in human placenta. However, in mRNA from heart and skeletal muscle there were very high levels of FPGS-specific mRNA, in direct contrast to the levels in these tissues in the mouse. Although the basis of this difference is not clear at this time, this result has been reproduced on a second human multiple tissue blot. The major species of FPGS message in human normal tissues was 2.4 kb, but a minor band at 2.2 kb was also detected which represented 5-10% of the total hybridization, as estimated by PhosphorImager scanning.
When RNA from eight human tumor cell lines was hybridized with the human FPGS probe, there was a high level of FPGS mRNA detected in all of the samples (Fig. 1c). The level of FPGS mRNA in these eight samples did not differ by more than a factor of four. Likewise, RNA extracted from the circulating lymphoblasts of a pediatric patient with acute lymphoblastic leukemia was found to have a similarly high level of FPGS mRNA. The lower molecular weight hybridizing band seen in the normal human tissues was not detected in any of these tumor specimens.
FIG. 1. Filters were consecutively hybridized with FPGS cDNA and a control probe, human β-actin. a, total RNA (10 µg) from murine normal tissues from female DBA/2 mice was probed with a 1.7-kb downstream murine FPGS cDNA (19); b, human normal tissue poly(A)+ mRNA (2 µg) was probed with a fragment of human FPGS cDNA which included the first 690 bp of coding region sequence; and c, total RNA (10 µg) from the indicated human tumor cell lines was probed with a 690-bp human FPGS cDNA fragment.
FIG. 2. Transient transfection assays in CEM human leukemic cells using constructs with and without a functional upstream translation initiation codon. CEM cells were transfected using electroporation with 10 µg of the FPGS promoter-luciferase reporter construct and 30 µg of pCMV-β-galactosidase. The next day, the cells were lysed, and luciferase activity was measured. Transfection efficiencies were standardized with respect to β-galactosidase activity. Luciferase activity from each construct is expressed relative to the activity of a pGL3 control construct driven by the SV40 promoter, which was assigned the arbitrary value of 1. Each experiment was performed three times in triplicate, and the error bars represent the S.D., reflecting variation among the experiments.
5′-RACE Analysis of FPGS mRNA from CEM Cells-We had previously used 5′-RACE and ribonuclease protection assays to define the transcriptional start sites for FPGS in a series of human leukemic and carcinoma cell lines (15). In that analysis, all of the 25 RACE products sequenced originated within what we defined as exon 1 (15). Recently, sequence heterogeneity has been reported at the 5′ end of mouse and human FPGS transcripts resulting from alternative use of upstream exons spliced to exon 2 (21,22). Hence, we extended our previous 5′-RACE analysis to define more thoroughly the initial exon used in CEM cells and, therefore, the position of the promoter used in this cell line. Of the 203 5′-RACE colonies picked, 140 hybridized with an exon 2 probe, and 126 of these (90%) also hybridized with the exon 1 probe. The 14 clones which hybridized to the exon 2 probe but not to the exon 1 probe were sequenced. None represented cDNAs with alternative 5′ sequence; the clones sequenced either truncated within exon 2, were false negatives that contained exon 1 sequence, or corresponded to an unspliced precursor mRNA (1 clone). We concluded from this extensive 5′-RACE analysis that all of the transcripts from the FPGS gene in CEM cells initiated transcription using the previously defined (15) start sites in exon 1, and that DNA sequences close to exon 1 were uniquely promoting transcription of this gene in this human tumor cell. Hence, we initiated an analysis of the exon 1 proximal promoter with the concept that it was either the only or, at least, the major promoter used for the FPGS gene in this model of a human tumor.
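The clone counts in the 5′-RACE screen can be checked with a short tally. The sketch below only restates the arithmetic on the three counts reported in the text; the function name and the dictionary keys are illustrative and not part of the original analysis.

```python
def tally_race_clones(picked, exon2_positive, exon1_and_exon2_positive):
    """Summarize a two-probe colony screen from three reported counts."""
    if not 0 <= exon1_and_exon2_positive <= exon2_positive <= picked:
        raise ValueError("counts must be nested: both-positive <= exon 2 <= picked")
    if exon2_positive == 0:
        raise ValueError("no exon 2-positive clones to summarize")
    return {
        # Clones that need sequencing: exon 2 signal without exon 1 signal.
        "exon2_only": exon2_positive - exon1_and_exon2_positive,
        # Picked colonies that gave no exon 2 signal at all.
        "no_exon2_signal": picked - exon2_positive,
        # Share of exon 2-positive clones that also carry exon 1.
        "percent_with_exon1": 100.0 * exon1_and_exon2_positive / exon2_positive,
    }


# Counts as reported in the text: 203 picked, 140 exon 2-positive, 126 double-positive.
print(tally_race_clones(203, 140, 126))
```

The 14 exon 2-only clones returned by this tally are the ones that were sequenced to look for alternative 5′ exons.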

Design of FPGS-Luciferase Reporter Constructs and Definition of the Minimal Promoter Region-In a previous study, we demonstrated how differential usage of two start methionines located within exon 1 of the human FPGS gene leads to targeting of FPGS activity to the mitochondrial and cytosolic compartments (15). Transcription was initiated at multiple sites spread over 80 bp, which divide the transcripts into two classes differing by the presence or absence of an additional upstream translation initiation site. In designing promoter reporter constructs to study transcription from this gene, segments of the sequence immediately upstream of the first exon were placed in front of a luciferase gene in the pGL3 plasmid, including the sequence at which transcription of the endogenous gene is initiated, but not the start methionine, which initiates translation of the coding region of the protein. Hence, in a series of such constructs, translation of luciferase would initiate from the same ATG downstream of any putative promoter sequences, allowing comparisons of luciferase activity as transcriptional efficiencies. Thus, problems associated with upstream translational initiation and early termination of transcripts in the pGL3 constructs would be avoided. However, in the FPGS gene, inclusion of all of the transcriptional initiation sites would also place the codon for the upstream start methionine (which initiates translation of the mitochondrial form of human FPGS) in some of the transcripts. Although any peptides initiated from this codon would be terminated by in-frame stop codons in the pGL3 vector prior to the luciferase gene, the presence of this methionine had an effect on translation initiated downstream at the luciferase start methionine (Fig. 2). Thus, when the luciferase production was compared using two constructs differing only by the presence of the intact upstream start codon (pH3-BssHII) or a mutated start codon (AGT) (pH3-BssHII-Mut), the presence of the intact ATG was found to decrease luciferase activity by over 5-fold. Two other constructs also tested the effect of the presence of this ATG, pHinc-Eae and pH3-Eae; these constructs extended 3′ almost to the downstream start methionine and stretched upstream for a length of 1.5 and 1.8 kb, respectively. The construct with the intact ATG (pHinc-Eae) produced 2.5-fold lower luciferase activity than did pH3-Eae, an effect which could be ascribed to the presence of the upstream start codon, because further deletion analysis demonstrated that the additional upstream sequence did not result in more efficient transcription (see below). Subsequently, all FPGS constructs that contained this region were designed to have an AGT codon in place of the upstream start codon to ensure that translation of the transient transcripts initiated only from the luciferase cassette start methionine. Overall, the sequence immediately upstream of exon 1 of the FPGS gene has very substantial promoter activity, as can be seen from the results in Fig. 2, in which luciferase production in transient transfections driven by these constructs in CEM cells is expressed relative to the level of luciferase produced from the SV40 minimal promoter. The pH3-Eae construct produced 5.6-fold more luciferase activity than the SV40 promoter, a surprising level given that the latter is such an active promoter.
FIG. 3. 5′- and 3′-deletion analysis of the human FPGS promoter in CEM cells. CEM cells were transfected using electroporation with 10 µg of the FPGS promoter-luciferase reporter construct and 30 µg of pCMV-β-galactosidase per sample. The next day, the cells were lysed, and luciferase activity was measured. Transfection efficiencies were standardized with respect to β-galactosidase activity. Luciferase activity is expressed relative to the longest promoter construct, pH3-Eae, which was assigned an arbitrary value of 100. Each bar represents the mean and S.D. of three experiments, each of which was performed in triplicate. a, schematic drawing of the human FPGS promoter. The relative positions of the consensus GC-boxes, Y-boxes, and transcriptional start sites (15) are indicated. b and c, 5′- and 3′-deletion analyses of the human FPGS promoter, respectively.
A series of 3′-deletion constructs of the FPGS promoter region were made to assess the importance of sequences in this region for transcription of the FPGS gene. As shown by the data of Fig. 3b, removal of 82 bp at the 3′ end of the very active pH3-Eae construct to form pH3-BssHII-Mut reduced the luciferase activity ≈3-fold. This construct does not include the 3′-most transcriptional start site (Fig. 3a). However, from ribonuclease protection analysis (15), this start site represents only about 5-10% of the total FPGS message in CEM cells, so that the 3-fold drop in activity appears to be due to the loss of sequences that enhance transcription rather than to the loss of a minor transcriptional start site. The pH3-SmaI construct has another 60 bp removed from the 3′ end and includes only the two upstream transcriptional start sites. Transfection of this construct produced approximately 18- and 5-fold less luciferase activity than the pH3-Eae and pH3-BssHII-Mut constructs, respectively. Again, the drop in activity does not correspond to the loss of transcriptional start sites; although only two major start sites remained in this construct, they represented over 50% of the transcripts as judged by ribonuclease protection analysis (15). We concluded from this analysis that the majority of exon 1 was essential for maximal transcriptional activity from the FPGS promoter.
5′-Deletions of the FPGS promoter region were constructed and analyzed to define the upstream boundary of the minimal promoter region (Fig. 3c). The 1.6 kb of sequence upstream of exon 1 contained in the pH3-Eae construct could be reduced to 43 bp (pF11-Eae and pRC1-Mut3) without a reduction in luciferase activity in transient assays in CEM cells. In fact, as the construct size was decreased, there was a gradual increase in luciferase activity. The construct pRC1-Mut3, in which both start methionines have been mutated, was designed to determine whether extension of what appeared to be the minimal promoter (pF11-Eae) further 3′ to the end of exon 1 enhanced transcription. Transcription levels driven by the pRC1-Mut3 and pF11-Eae constructs did not differ significantly, leading to the conclusion that sequences downstream of the Eae site did not contribute substantively to the function of this promoter. This conclusion was supported by a comparison of transcription from the pStuI-Eae and pRC2-Mut3 constructs (Fig. 3c). The minimal portion of the FPGS promoter required to drive transcription, then, consists of bp 1 to 150 of exon 1 and the 43 bp of the 5′ region flanking it. This region contains two forward and one reverse GC-boxes, the core binding sequence for the transcription factor Sp1, and a single inverted CCAAT motif, i.e. a potential Y-box.
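A scan for such motifs on a promoter sequence can be sketched as below. The sketch assumes that the GC-box core is GGGCGG, the commonly cited Sp1 core, and that an inverted CCAAT motif reads ATTGG on the strand being scanned; the text does not state the exact consensus used. The test sequence is invented for illustration and is not FPGS promoter sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


def scan_promoter(sequence, gc_box="GGGCGG", ccaat="CCAAT"):
    """Locate forward and reverse GC-boxes and inverted CCAAT motifs.

    The default motif strings are assumptions, not taken from the paper.
    """
    return {
        "GC-box (forward)": find_motif(sequence, gc_box),
        "GC-box (reverse)": find_motif(sequence, reverse_complement(gc_box)),
        "inverted CCAAT": find_motif(sequence, reverse_complement(ccaat)),
    }


# Invented 25-nt sequence carrying one copy of each motif.
example = "TTGGGCGGAACCGCCCTTATTGGCC"
print(scan_promoter(example))
```

Searching for the reverse complement on the same strand avoids having to build and index the opposite strand separately, so all reported positions share one coordinate system.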
Sp1 Transactivation of the FPGS Promoter-Because several potential binding sites for Sp1 mapped to the region immediately upstream of transcriptional initiation, we sought to determine whether binding of Sp1 to the FPGS promoter is functionally active in the control of transcription from this gene. To address this question, a subset of the FPGS-luciferase reporter constructs was transfected into Drosophila SL2 cells, which lack endogenous Sp1, and the effect of cotransfection with the Sp1 expression vector, pPacSP1, was tested. Transfection of the luciferase gene driven by the SV40 control promoter showed a 260-fold activation of expression by Sp1, compared with that seen in cells cotransfected with the control plasmid, pPac + NdeI (Fig. 4), in agreement with previous experience with this system (23). The longest FPGS promoter construct studied (pH3-Eae) was as effective as the SV40 control promoter in stimulating luciferase activity in this system (approximately 300-fold). As the size of the FPGS promoter region was decreased from the 5′ end, the degree of activation by Sp1 also decreased. However, even the construct found to define the minimal promoter in human cells, pF11-Eae, which contained only two Sp1 binding sites, showed a 22-fold induction compared with that observed with the vector-only control. The difference between Drosophila and human leukemic cells in the contribution of sequences upstream of pF11-Eae to luciferase expression was very clear, but our results do not allow us to distinguish whether there is a silencer element operative upstream in the human cells (as perhaps suggested by the data of Fig. 3c) or an upstream enhancer that became functional in Drosophila cells. Of course, it cannot be excluded that the effect is due to the high levels of Sp1 expression expected from the Drosophila actin promoter used in pPacSp1. Nevertheless, we could conclude that Sp1 binding to the GC-boxes immediately upstream of exon 1 in this gene would greatly stimulate transcription from the FPGS promoter and that these GC-boxes constitute functional Sp1 binding sites.

FIG. 4. Transactivation of the human FPGS promoter by Sp1 in Drosophila cells. Drosophila cells were cotransfected with 10 μg of the FPGS promoter-luciferase reporter construct and 1 μg of the Sp1 expression construct, pPacSP1, or of the corresponding vector control, pPac + NdeI, as a calcium phosphate precipitate. The next day, the cells were lysed, and luciferase activity was measured. Transfection efficiencies were standardized with respect to protein content. The results are expressed as the ratio of the luciferase activity produced by SL2 cells cotransfected with each FPGS construct and the Sp1 expression construct to the luciferase activity produced by cells cotransfected with the same FPGS construct and the vector-only control. Each bar represents the mean and S.D. of three experiments, each of which was performed in triplicate.
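The fold-activation ratio defined in the legend to Fig. 4 can be illustrated with a minimal Python sketch. The function names and all numerical values below are hypothetical and are not data from this study; the sketch also assumes that the replicate readings within each experiment are averaged before the ratio is taken, which the legend does not state.

```python
from statistics import mean, stdev


def fold_activation(with_sp1, vector_only):
    """Ratio of protein-normalized luciferase activity with the Sp1
    expression construct to that with the vector-only control."""
    if vector_only <= 0:
        raise ValueError("control activity must be positive")
    return with_sp1 / vector_only


def summarize(experiments):
    """Return (mean, S.D.) of fold activation across independent experiments.

    `experiments` is a list of (sp1_replicates, vector_replicates) pairs,
    one pair per experiment. Averaging replicates within an experiment
    before taking the ratio is an assumption of this sketch.
    """
    ratios = [fold_activation(mean(sp1), mean(vec)) for sp1, vec in experiments]
    return mean(ratios), stdev(ratios)


# Hypothetical readings: three experiments, each performed in triplicate.
example = [
    ([200, 220, 210], [10, 10, 10]),
    ([230, 230, 230], [10, 10, 10]),
    ([220, 220, 220], [10, 10, 10]),
]
avg, sd = summarize(example)
print(f"fold activation: {avg:.1f} +/- {sd:.1f}")
```

Because the ratio is taken against the matched vector-only control for each construct, differences in basal promoter strength between constructs cancel, and the reported value reflects only the response to Sp1.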
DNase I Footprint Analysis of the Minimal FPGS Promoter-DNase I footprint analysis of the FPGS promoter region showed four regions protected by purified Sp1 transcription factor that were also protected by HeLa cell total nuclear extract (Fig. 5a). Each of these sites corresponded to GC-boxes which were expected to bind Sp1 (Fig. 5b). The HeLa extract also protected another site not bound by Sp1, which is labeled H1 in Fig. 5b. This site lies just downstream of the first start methionine codon and is therefore in a transcribed and translated region of the FPGS gene. The H1 site spanned 29 bp and contained a GC-rich region at the 5′ end and a consensus E-box (CCACCTGC) at the 3′ end. The sequences of this region of the mouse (19) and hamster (15) FPGS genes were compared with the human sequence (Fig. 5c). The 5′ end of H1 was conserved across these species, and the core E-box (CACCTG) was present in the human and hamster sequences but was not found in the mouse gene.

DISCUSSION

FPGS serves the function of trapping folate cofactors in the cell and, as a result, is essential for normal cellular proliferation. Earlier studies (5,6) had shown that FPGS activity was high in cells undergoing rapid proliferation and dropped when cells either committed to differentiation or entered an extended G0 phase. In addition, enzyme levels have been shown to range from undetectable to rather high among differentiated tissues (4,6), leading to the concept that FPGS expression is subject to a tissue-specific regulatory mechanism. The Northern analysis presented here shows a similar pattern. FPGS mRNA levels are high in all of the in vitro cultured tumor cell lines, which are dividing rapidly, whereas, in both the murine and human normal tissues, FPGS message levels showed distinct tissue-specific patterns. It would appear that at least some of these controls operate at the transcriptional level. Having determined the sites of transcriptional initiation of the FPGS gene, we set out to learn the first level of information about the transcriptional control of this locus. Our previous analysis (15) of the sequence directly upstream of the first exon in CEM leukemic cells revealed a GC-rich region with no canonical TATA sequence, a structure usually characteristic of housekeeping genes and proto-oncogenes. In this study, we demonstrate that this region has substantial activity as a transcriptional promoter.

FIG. 5. DNase I footprint analysis of the human minimal promoter region. a, recombinant Sp1 and HeLa nuclear extract were incubated with 3 ng of an FPGS promoter fragment spanning positions −180 bp to +124 bp, where +1 is the most upstream transcriptional start site (15). The fragment was end-labeled at the −180-bp terminus for the sense strand and at +124 bp for the antisense strand. The DNA-protein mixture was partially digested with DNase I and extracted with phenol/chloroform prior to fractionation on an 8 M urea, 6% polyacrylamide denaturing gel. The markers are the G and A lanes of a sequencing reaction. Footprints specific to Sp1 or HeLa proteins are labeled as S and H, respectively. b, the positions of the S1-4 and the H1 protein binding sites are indicated relative to the positions of the transcriptional start sites. c, comparison of the region of the H1 protein binding site between human (14, 15), hamster (15), and mouse (18,19) sequences. A dot between Murfpgs and Humfpgs represents identity in all species.
The proximal 120 bp of the promoter studied contained seven forward GC-boxes and one reverse GC-box, the core binding site for Sp1 as well as for several other transcription factors. DNase I footprinting analysis with recombinant Sp1 protein and with HeLa nuclear extract indicated that each of the forward GC-boxes bound Sp1 and was also bound by protein in HeLa cell nuclear extracts. When the reporter constructs were transfected into Drosophila SL2 cells, which lack endogenous Sp1, expression was transactivated by the cotransfection of an Sp1-expressing plasmid. Hence, it would appear that Sp1 plays a major role in transcription from this gene, as with other TATA-less promoters that have multiple GC-boxes immediately upstream of the major site of transcriptional initiation.
The human FPGS promoter gives rise to multiple transcription initiation sites spread over 80 bp (15), a typical characteristic of a TATA-less GC-rich promoter lacking an initiator element (24). In the absence of an initiator element, precedent would predict that start site selection would be directed by a proximal Sp1 binding site (25) that generates multiple start sites between 30 and 55 bp downstream. The positioning of start sites is of significance for the FPGS gene because it determines the subcellular localization of enzymatic activity. The first exon of the FPGS gene in CEM cells contains two in-frame start methionine residues; translation initiated from the upstream methionine directs the enzyme to the mitochondria, and translation initiated from the downstream methionine directs it to the cytosol. Additional studies will be required to define the role of the factor-binding elements within the FPGS promoter in the regulation of start site distribution in this gene.
Within the minimal promoter region, only one footprint, designated herein H1, could not be attributed to GC-boxes. The 3′ region of this 29-bp sequence was homologous with an E-box (5′-CACCTGGC-3′), an element first recognized in a tissue-specific enhancer of the murine heavy chain immunoglobulin gene (26). The CANNTG portion of this motif has since been shown to be the core binding site for the basic helix-loop-helix transcription factors (27). The E-box-binding proteins represent several families of proteins, one group of which is the muscle regulatory factors involved in transcriptional activation of muscle-specific genes (27,28). We note that the position corresponding to the H1 element in the human gene did not contain an E-box in the mouse gene (Fig. 5c), an observation which suggests that this element is involved in the expression of FPGS mRNA in human, but not mouse, heart and skeletal muscle.
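How the CANNTG core can be located in a sequence is shown in the following minimal Python sketch. The function name and the flanking bases in the test sequence are hypothetical; only the CCACCTGC segment is taken from the H1 site described in this paper. Because the reverse complement of CANNTG is again CANNTG, scanning a single strand is sufficient to find the core on either strand.

```python
import re

# E-box core CANNTG, where N is any base. The lookahead lets the scan
# report overlapping occurrences as well.
EBOX_CORE = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_ebox_cores(seq):
    """Return (0-based start, matched hexamer) for each CANNTG core in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX_CORE.finditer(seq)]


# Hypothetical test sequence: the CCACCTGC segment reported for the H1 site,
# with arbitrary flanking bases added for illustration only.
hits = find_ebox_cores("GG" + "CCACCTGC" + "AA")
print(hits)
```

A scan of this kind identifies candidate sites only; whether a given CANNTG hexamer is bound by a basic helix-loop-helix factor depends on the flanking sequence and the proteins present, which is why the footprinting data remain the primary evidence here.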
There has been circumstantial evidence (29,30) for the existence of isoforms of FPGS, which differ in kinetic characteristics, yet there appears to be only one FPGS genomic locus (13). We had previously mapped the multiple transcriptional start sites for this locus in a number of human cancer cell lines and reported that transcripts for both the mitochondrial and cytosolic forms of FPGS arose using the same promoter (15). It is not yet clear whether these forms of enzyme explain the different catalytic features seen in some studies (29,30), but it would seem unlikely, given that the mitochondrial leader is expected to be cleaved after passage into this organelle. Our analysis of the 5′-termini of FPGS transcripts in human cancer cell lines (Ref. 15 and this study) has identified only a single form of exon 1. Others have subsequently reported that this same exon 1 initiated either the only or the most prevalent form of FPGS transcripts in HepG2, MCF-7, and human fetal liver (21). In more recent reports, alternate forms of exon 1 have been detected in cDNAs for human (21) and mouse FPGS (22), each of which was spliced to exon 2. To date, the alternative transcripts identified in human cells do not appear capable of translation to active FPGS, and the use of downstream translational start sites also did not lead to active enzyme (21). On the other hand, the divergence in sequence reported recently at the 5′-terminus of FPGS transcripts in mouse normal and malignant tissues could indeed lead to isoforms of enzyme (22), and the structure of the mouse gene suggested would indirectly support the existence of a second upstream promoter used in normal mouse tissues. It is now a central issue whether the use of alternative promoters or a level of regulation not yet defined on the promoter we herein describe is involved in the tissue-specific control of FPGS activity and/or the generation of functional isoforms of FPGS.