Additional organizational features of the murine folylpolyglutamate synthetase gene. Two remotely situated exons encoding an alternate 5' end and proximal open reading frame under the control of a second promoter.

Nucleotide sequence analysis of independently isolated clones from a mouse liver cDNA library identified two additional splice variants of folylpolyglutamate synthetase (FPGS) mRNA with novel sequence at the 5′ end. These variants incorporate two new alternatives (exons A1a and A1b) of exon 1 in the murine FPGS gene which are also spliced to exon 2. Exon A1a encodes most of the 5′-untranslated region. Exon A1b encodes a downstream segment of the 5′-untranslated region, two ATG start codons, and a unique mitochondrial leader peptide as well as 15 additional amino acids of cytosolic FPGS not encoded by any of the previously identified (Roy, K., Mitsugi, K., and Sirotnak, F. M. (1996) J. Biol. Chem., 271, 23820-23827) splice variants. It was also found by direct sequencing of genomic fragments that although exon A1b is spliced to exon 2, these new alternatives (i.e. exons A1a and A1b) to exon 1 are found approximately 9.5 kilobases upstream from exons B1a, B1b, and B1c. Exons A1a and A1b are separated from each other by a 124-nucleotide intron. Sequencing of the region 5′ to exon A1a revealed a nucleotide sequence that was promoter-like and different from the downstream promoter region in the content of putative cis-acting elements. Primer extension analysis identified a number of potential transcription start sites within the more 3′ end of this region. FPGS RNA transcripts incorporating exons A1a and A1b were detected in normal mouse tissues, particularly liver and kidney, and also to a varying extent in tumors; FPGS RNA transcripts incorporating exons B1a, B1b, and B1c were detected mainly in tumors. Thus, transcription of the FPGS gene in this tissue-specific manner appears to reflect the different usage of alternates to exon 1 under the control of different promoters.
An unusual splice variant identified infrequently in a mouse liver cDNA library was 2.67 kilobases in size and incorporated exons A1a and A1b and a segment of the downstream promoter region along with exons B1c and B1b and exons 2-15.

Earlier studies in several laboratories (10-12) suggest that the expression of FPGS activity in various tissues is under stringent regulation. The basis for this regulation and the details of its mechanism are important to our understanding of cellular folate homeostasis (1-6) and also to the metabolic disposition of folate analogs (2-8) during therapy of tumors in animal models and patients. The recent cloning by Shane and co-workers (13) of human FPGS cDNA now makes it possible to address these issues at the level of gene expression. Other recent studies by Moran and co-workers (14) have provided information pertaining to the most 5′ sequence of the human FPGS gene, which included the sequence of a putative promoter-like region. Preliminary functional studies reported recently (15) provided evidence of promoter activity for this region. The earlier studies (14) also suggested alternate transcription start sites as a basis for the regulation of synthesis of both mitochondrial and cytosolic forms of human FPGS, with the former requiring the inclusion of an NH2-terminal leader peptide encoded by nucleotide sequence at the 5′ end of exon 1. This same group (16) and Shane and co-workers (17) provided detailed information on the structural organization of the human FPGS gene in the form of 15 exons spanning approximately 11 kb of DNA sequence. Shane and co-workers (17) also provided evidence for the existence of alternate splice variants of exon 1. Recent studies of our own (18) characterized the murine FPGS gene, including a promoter-like region with somewhat different characteristics than the human promoter, and revealed an interesting complexity with regard to exon 1. This was expressed in the form of three splice variants that differed in their content of sequence homologous to human exon 1.
As only some variants incorporated a nucleotide sequence encoding both a mitochondrial leader peptide and cytosolic FPGS, the results of these studies raised the possibility that the regulation of synthesis of mitochondrial and cytosolic forms of FPGS could occur as a result of alternate splicing.
We now report other results that identify two additional splice variants found in the form of cDNA in a murine liver cDNA library. These variants incorporate distinct alternatives to the exons designated as exons 1a, 1b, and 1c of the murine FPGS gene in our previous publication (18) and encode a distinctly different 5′ end and proximal open reading frame within the mRNA transcript. Hybridization screening of DNA restriction fragments showed that these new alternatives to exon 1 exist in the genome >10 kb upstream from the previously designated exons 1a, 1b, and 1c, and their transcription, along with that of exons 2-15 also incorporated in these variants, appears to be under the control of a different promoter. We have putatively identified such a promoter in the sequence immediately 5′ of these new alternates to exon 1. We also report data showing that transcription of these various splice variants that incorporate alternatives to exon 1 occurs in a tissue-specific manner reflecting the alternate usage of different promoters.
* This work was supported in part by Grants CA08748 and CA56517 from the National Cancer Institute and the Elsa U. Pardee Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™

EXPERIMENTAL PROCEDURES
Selection of Murine FPGS cDNA Variants-Full-length cDNA clones were selected by hybridization (19) screening of a murine liver cDNA library in λgt11 using an L1210 cell FPGS cDNA (19) as a probe. The hybridization screening of the cDNA library was carried out under conditions described earlier (20). The radioactive FPGS cDNA probe was prepared by random priming (Boehringer Mannheim) with [α-32P]dCTP and the L1210 cell FPGS cDNA as a template. Poly(A)+ RNA was isolated as described earlier (21) and utilized in the synthesis of cDNA by a standard procedure (22).
Isolation of Genomic Clones by Hybridization Screening-A BALB/c mouse liver genomic library in EMBL3 SP6/T7 (Clontech; Palo Alto, CA) was screened with an exon-specific radioactive murine FPGS cDNA probe that was prepared by random priming with [α-32P]dCTP (Boehringer Mannheim). The nucleotide sequence of this FPGS cDNA was submitted to GenBank (GenBank/EMBL Accession U59517). After screening of a large number of plaques and purification, several positive clones were obtained. The DNA insert from these clones was purified and characterized by restriction mapping and Southern hybridization (23). One of these nonidentical clones, designated clone 10, was selected by the above procedure for further analysis because of the large size of its insert (~18 kb). Restriction fragments of these clones generated with SacI or EcoRI were selected for further study on the basis of hybridization with the region-specific probes. The fragments of interest after extraction were subcloned into Bluescript SK+ (Stratagene) for sequencing.

Sequencing of DNA Fragments and Intron Size Determination-Double-stranded DNA was sequenced in both directions according to the dideoxy method of Sanger et al. (24) using Sequenase version 2.0 (U. S. Biochemical Corp.). Oligonucleotide primers based on the mouse FPGS cDNA were used initially. When necessary for extending the sequencing, additional oligonucleotide primers were prepared on the basis of the sequence data generated. Exon/intron junctions were determined by direct sequencing across these junctions using primers based upon the mouse FPGS cDNA sequences. Intron sizes were determined by sequencing through the region in question.
Enzymatic Primer Extension-Ten pmol of an antisense oligonucleotide primer corresponding to nucleotide sequence in the 3′ end of exon A1b (5′-GGAGGCAGTCTTAGCTTCGTAAG-3′) of the murine FPGS gene was end-labeled (Promega, Madison, WI) at the 5′ terminus using T4 polynucleotide kinase and [γ-32P]ATP. The labeled primer was hybridized to 5 μg of poly(A)+ RNA isolated from mouse liver cells, which was first treated with methyl mercuric hydroxide, and the primer extension reaction was carried out with a standard kit as specified (Promega). An mRNA preparation derived from a 1.2-kb kanamycin-resistant plasmid cDNA with an antisense primer was used as a control. Electrophoresis on an 8% polyacrylamide gel was carried out by a standard protocol (25).
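Because the extension primer is antisense, the stretch of exon A1b it anneals to reads as its reverse complement on the mRNA (sense) strand. The short Python sketch below is an illustration only and is not part of the original methods; the helper name `reverse_complement` and the constant name are hypothetical, and the only datum taken from the text is the primer sequence itself.

```python
# Illustrative helper, not part of the original protocol.
# Maps each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Antisense primer for the 3' end of exon A1b, as printed in the text.
PRIMER_A1B_ANTISENSE = "GGAGGCAGTCTTAGCTTCGTAAG"

# Sense-strand sequence the primer is expected to anneal to (5'->3').
target_sense = reverse_complement(PRIMER_A1B_ANTISENSE)

if __name__ == "__main__":
    print(len(PRIMER_A1B_ANTISENSE), "nt primer; sense-strand target:", target_sense)
```

Searching a cDNA sequence for the value of `target_sense` would locate the primer's 3′ anchor, from which the lengths of the extension products can be related to start-site positions.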
Northern Hybridization Analysis-Using a standard procedure, sam-
Quantitative PCR-The relative amount of cell FPGS mRNA transcript representing different splice variants in L1210 cells and murine liver was determined by reverse transcriptase PCR. Prior to cDNA synthesis (see above), poly(A)+ RNA was digested with RNase-free DNase I, treated with phenol-chloroform, and precipitated with ethanol, followed by washing two times with 70% ethanol before dissolving in H2O. All poly(A)+ RNA was treated with methyl mercuric hydroxide to reduce secondary structure. Five μl of the prepared cDNA was utilized in a PCR using four different exon 1-specific sense primers and an exon 5 antisense primer. Ten pmol of an 18-mer oligonucleotide encompassing the ATG start site were used for each PCR. 5′-CCAAGTGATGATGAAAGC-3′ was used as the A1b-specific sense primer, and 5′-TAAGACTATGTCGCTGGGC-3′ was used as the B1a-specific sense primer. 5′-GAGCCGGGCATGGAGTAT-3′ was used as the B1b-specific sense primer, and 5′-ACAAAGATGGCCGTGATA-3′ was used as the B1c-specific sense primer. 5′-TTCAGGCCGTAATTCCGC-3′ was used as the exon 5-specific antisense primer. Initially, reactions were run with different amounts of cDNA and different numbers of PCR cycles to obtain conditions that ensured that amplification was in the linear range at the time the amount of product of each primer-specific reaction was compared. As a control, the cDNA reaction mixture without reverse transcriptase was used to initiate a PCR to confirm the absence of genomic DNA contamination (data not shown). As an internal standard, sense (5′-CCTGGTGTCCCTGCCTTC-3′) and antisense (5′-TCCTATGGATGAAACCTC-3′) primers of FPGS representing exons 9 and 14, respectively, were used in a parallel reaction. The experimental reaction was run for 25 cycles using a standard procedure (28), and the relative amount of product generated with each set of primers was determined by 1.5% agarose gel electrophoresis and staining with ethidium bromide (29).
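The primer set listed above can be tabulated for a quick sanity check of length, GC content, and the presence of an ATG in each exon 1-specific sense primer. The following Python sketch is an illustration under stated assumptions and not the authors' procedure: the dictionary keys and function names are hypothetical, and hyphens that appear inside the printed sequences are assumed to be line-break artifacts and have been removed. As transcribed this way, the B1a primer is 19 nucleotides long rather than 18.

```python
# Primer sequences transcribed from the Quantitative PCR paragraph.
# Assumption: hyphens inside the printed sequences are line-break artifacts.
PRIMERS = {
    "A1b sense": "CCAAGTGATGATGAAAGC",
    "B1a sense": "TAAGACTATGTCGCTGGGC",
    "B1b sense": "GAGCCGGGCATGGAGTAT",
    "B1c sense": "ACAAAGATGGCCGTGATA",
    "exon 5 antisense": "TTCAGGCCGTAATTCCGC",
    "exon 9 sense": "CCTGGTGTCCCTGCCTTC",
    "exon 14 antisense": "TCCTATGGATGAAACCTC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map primer name -> (length in nt, GC fraction, contains ATG?)."""
    return {
        name: (len(seq), gc_fraction(seq), "ATG" in seq.upper())
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc, has_atg) in summarize(PRIMERS).items():
        flag = "yes" if has_atg else "no"
        print(f"{name:18s} {length:3d} nt  GC={gc:.2f}  ATG={flag}")
```

A table of this kind only describes the oligonucleotides themselves; it says nothing about amplification efficiency, which the authors addressed experimentally by titrating cDNA input and cycle number.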
Materials-All radioactive isotopes used for the above studies were obtained from DuPont NEN. Specific activities for [α-32P]dCTP, [α-35S]dATP, and [γ-32P]ATP were 3,000, 1,000, and 3,000 Ci/mmol, respectively. DNA restriction enzymes were purchased from Boehringer Mannheim. AmpliTaq DNA polymerase was obtained from Perkin-Elmer. Nitrocellulose was purchased from Schleicher & Schuell, and oligonucleotide primers were synthesized by Genelink. Solutions for polyacrylamide sequencing gels were obtained from National Diagnostics. All other materials were reagent grade.

RESULTS AND DISCUSSION
Novel Splice Variants of the Murine FPGS Gene-Screening of an L1210 cell cDNA library with a murine FPGS cDNA probe (19) and DNA sequencing originally identified (18) 18) initially in a liver cDNA library and subsequently in an L1210 cell cDNA library and incorporates an alternate to exon B1a (exon B1c) which is spliced to exon B1b plus exons 2-15. The 5′ nucleotide sequences of these variants aligned with respect to the sequences in exon 2 are given in Fig. 2 where it can be seen that variants I and III incorporate different 5′ ends and a different nucleotide sequence within the proximal segment of the open reading frame.
Further screening of the mouse liver cDNA library was carried out using the same cDNA probe. In this library, cDNA representatives of variants I, II, and III were found to be relatively rare (Table I) compared with their relative frequency in the L1210 cell cDNA library. Instead, the most common variant found (variant IV) incorporates (Figs. 1A and 2) a novel sequence in the form of alternates (exons A1a and A1b) to exons B1a and B1b or B1c and B1b spliced to exons 2-15. These clones were derived from mature transcripts with an appropriate poly(A) tail and properly positioned polyadenylation signal sequence at the 3′ end. The length and sequence of the 5′-untranslated region of variant IV, the position of the two ATG start codons, and the encoded mitochondrial leader peptide (MMKSTRSLPMSWPVREKFW) were distinctly different from those encoded by either exons B1a and B1b or exons B1c and B1b. The nucleotide sequence and the encoded amino acid sequence of all of these splice variants are compared in Fig. 2. These data show that the exons included in these splice variants differ considerably in their composition. Variants I, III, and IV all have in-frame upstream and downstream ATG start codons with properly positioned Kozak (29) consensus sequences. Variant II incorporates only the downstream ATG. In addition to the untranslated region incorporated in exon B1a, exons B1a and B1b in variant I encode different segments of a mitochondrial leader peptide. Along with a shorter untranslated region incorporated in exon B1c, exons B1c and B1b in variant III incorporate different segments of another mitochondrial leader peptide larger in size than that encoded by variant I. By contrast, exon A1a in variant IV incorporates only an untranslated region much longer than that found in variants I and III, whereas exon A1b encodes a different mitochondrial leader peptide in its entirety as well as 15 additional amino acids of cytosolic protein not encoded in variant I, II, or III.
The results of quantitative reverse transcriptase PCR obtained earlier (18) and in the current studies (data not shown) using poly(A)+ RNA from L1210 cells and mouse liver were consistent with the relative frequency of these variants observed in Table I.
During further screening, another splice variant (variant V), which was also rare and extremely unusual, was eventually identified (Fig. 1) in the liver cDNA library. This cDNA was larger (2.67 kb in length) than the other variant cDNAs and incorporated not only exons A1a and A1b but also a portion (identified as exon B1d) of the putative promoter region described previously (18) along with exons B1c and B1b and exons 2-15.
Genomic Location of Exons A1a and A1b-Using a cDNA probe derived from the novel sequence in exons A1a and A1b, screening of genomic clones (clones 3 and 5) incorporating the nucleotide sequence in the previously published (18) murine FPGS gene showed that these new alternatives to exon 1 were not located within or near the confines of that region of the gene (18). However, a nucleotide sequence identical to the novel sequence in the 5′ end of variant IV was subsequently found by hybridization within another clone (Fig. 1B) bearing an 18-kb insert (clone 10). The 3′ end of this clone overlaps with the 5′ end of clone 3. By restriction mapping and sequencing, the location of these new exons was found (Fig. 1B) to be approximately 9.5 kb upstream from exon B1a. Also, these two exons are separated by an intron 124 nucleotides in length. The exon compositions of all of the splice variants found so far in this mouse liver cDNA library are compared in Fig. 2. The nucleotide sequences at the intron/exon junctions pertaining to all of the alternatives of exon 1 are given in Table II. These junctions match the published (25) consensus splice junctions.
Identification of a Second Promoter-like Region in the Murine FPGS Gene-Sequencing of DNA 5′ of exon A1a identified a 2-kb region with stretches of sequence that included a number of putative binding sites for various factors known to affect transcription. The characteristics of this region (Fig. 3) are quite different from those of the promoter-like region 5′ of exon B1a (18). In contrast to this downstream region, there are no putative SP-1 binding sites. However, in addition to a GATA/TATA box, there are putative sites for 15-20 different transcription factors not found in the downstream promoter-like region, notably MyoD, Myc, Myb, and p53.
Enzymatic primer extension analysis was carried out using a primer matching the 3′ end of exon A1b and methyl mercuric hydroxide-treated poly(A)+ RNA from murine liver. The data suggest (Fig. 4) multiple transcription start sites distributed within this region spanning approximately 200 nucleotides (Fig. 3) upstream from the location of the primer. Start sites were found at +52, +55, +62, and +64 nucleotides. Major start sites were found at +53 and especially at −184. The usage of only the latter would result in the formation of a transcript encoding the putative mitochondrial leader peptide. The results of primer extension analysis pertaining to the downstream promoter-like region using exon B1b- and B1c-specific primers have already been reported (18).
Tissue-specific Expression of FPGS in the Form of Alternate RNA Transcripts-Using antisense probes specific for the A1 and B1 alternates of exon 1, a series of Northern blots was performed with poly(A)+ RNA from a variety of normal and neoplastic murine tissues. Poly(A)+ RNA from these same tissues was also blotted with a nonspecific probe incorporating the nucleotide sequence from a downstream exon. All of these blots were also normalized by blotting with a 36B4 cDNA probe. The data in Fig. 5 show that FPGS mRNA transcripts in the range of 2.3 kb could be detected in all of the tissues examined using the exon-nonspecific probe. In contrast, the detection of FPGS mRNA transcripts with the A1- and B1-specific probes was highly tissue-specific. With the nonspecific probe there was considerable variability in the relative amount of 2.3-kb FPGS mRNA detected depending upon the tissue. Among normal tissues examined, FPGS mRNA content was highest in kidney and liver and lowest in spleen, lung, and small intestine. Using the same probe, FPGS mRNA was readily detectable in all of the tumors except the hepatoma. This also included the Ehrlich tumor, which had less mRNA in the sample employed than was assumed (see also the blot with the 36B4 probe). With the exon A1-specific probe, relative levels of FPGS mRNA detected among these different tissues were similar to those obtained with the nonspecific probe. However, with the B1-specific probe, the level of FPGS mRNA detected was highest in the tumors with the exception of the mouse hepatoma. The level of mRNA detected by this probe in this tumor was relatively low by comparison; it was extremely low in kidney and small intestine and virtually undetectable in spleen, lung, and liver. The low overall intensity of the blot obtained with this probe most likely reflects the relatively high GC content of this probe compared with the other probes used in this study. In addition to the tumors identified in Fig. 5, poly(A)+ RNA from several other murine tumors was also probed with these two cDNA probes with similar results (data not given). These included B16 melanoma, T241 fibrosarcoma, P388 lymphoma, EO771 mammary carcinoma, and Taper liver tumor.
These studies provide further evidence for substantial heterogeneity at the 5′ end among murine FPGS RNA transcripts. Our earlier (18)  natives of exon B1 (exons B1a and B1c). The current studies extend these findings and document additional heterogeneity in the form of a transcript (variant IV) which incorporates two previously undescribed exons (A1a and A1b) which are located approximately 9.5 kb upstream of the exon B1 alternates. Like variants I and III, variant IV encodes a different mitochondrial leader peptide and unique amino acid sequence at the NH2-terminal end of cytosolic FPGS.
Our results also appear to show that the generation of 5′ end heterogeneity during transcription of the murine FPGS gene in part reflects the activity of different promoters regulating transcription of this gene in the mouse. These promoters differ in their nucleotide sequence, the number and putative identity of various cis-acting elements, and the pattern of potential transcription start sites contained therein. Additional studies at a functional level will be necessary to characterize these promoters with regard to the extent that cis-elements and trans-acting factors may initiate, enhance, or suppress transcription in either case. Of relevance to this question were other findings from Northern blotting using exon A1- and B1-specific probes, which strongly suggest that the activation of these promoters occurs in a tissue-specific manner. Activation of promoter A appears to occur in both normal mouse tissues and tumors and results mainly in the formation of transcripts incorporating exon A1a plus A1b spliced to exons 2-15. In contrast, activation of promoter B appears to occur primarily in tumor tissue, resulting mainly in the formation of transcripts incorporating exons B1a plus B1b, exons B1c plus B1b, or only exon B1b spliced to exons 2-15. The relative transcriptional activity of promoter A appears to vary considerably among the various normal tissues examined, with activity apparently highest in liver and kidney. In contrast, the relative transcriptional activity associated with promoter B appeared to be less variable among the different tumors examined.
5′ end heterogeneity associated with differential promoter usage has been documented or suggested to occur in the case of a variety of mammalian genes. These include rodent genes for γ-glutamyl transferase (30) and acyl-CoA synthetase (31), and human genes for aminopeptidase N (32), carbonic anhydrase I (33), and phosphofructokinase (34). Despite this precedent, it is not clear why multiple promoters for the murine FPGS gene have evolved. However, in view of the widespread distribution of FPGS in different tissues (4) and evidence for the regulation of FPGS during development (10-12), the regulation of transcription of this gene may necessitate multifactorial control.
The functional significance, if any, of our finding pertaining to variant V, which incorporates a large portion (exon B1d) of promoter B, is not evident and may, in fact, reflect a splicing anomaly. However, the incorporation of a downstream promoter region within a mature RNA transcript, while highly unusual, is not without precedent. A similar finding has been reported in the case of a multidrug resistance gene (35).