Structural organization of the reduced folate carrier gene in Chinese hamster ovary cells.

The reduced folate carrier gene (rfc) encodes a putative protein that is involved in the intracellular accumulation of folates. In this report, we describe the organization of the rfc gene from Chinese hamster ovary cells. The hamster rfc gene contains 7 exons and 6 introns, which span 15.3 kilobases. It codes for two alternatively spliced messenger RNAs, one that contains all 7 exons and one that lacks exon 2 but contains the remaining 6 exons. The transcriptional start of the gene has been mapped to six sites approximately 200 base pairs upstream of the putative ATG initiation codon. The promoter region has no TATA box-like sequence but contains a consensus Sp1 binding site. This is the first report of the genomic structure of the reduced folate carrier gene from any species.

The intracellular accumulation of folic acid and reduced folates is essential for the survival of mammalian cells. Folates cannot be synthesized endogenously but are required as precursors for the biosynthesis of purines, pyrimidines, and amino acids.
The second system consists of a 36-78-kDa glycoprotein termed the reduced folate carrier, which has a higher affinity for reduced folates, including methotrexate (Mtx), and a lower affinity for folic acid (18-22). Several cDNAs that putatively encode the reduced folate carrier have recently been cloned from human (23-26), mouse (27), and hamster (28) cells.
The availability of cDNA clones from various sources provides a powerful tool for examining folate transport processes. However, at present very little is known about the molecular details of folate transport.
We have previously described the isolation of both a hamster genomic DNA cosmid clone (29) and a cDNA clone (pMtxT9) (28) that can, upon transfection, complement mutant Chinese hamster ovary (CHO) cell lines that are unable to take up folates. Mutant cells transfected with the cDNA are able to bind and take up Mtx and thus become sensitive to the cytotoxic action of the drug.
In our continuing studies to understand the molecular nature of the reduced folate carrier gene we have localized the gene for hamster rfc to chromosomes 1 and Z1 at position q2-q3 in CHO cells (30). The present study describes the organization of the hamster rfc gene, the identification of two alternatively spliced mRNA isoforms, and the mapping of the transcriptional start site.

EXPERIMENTAL PROCEDURES
Cell Lines, Cosmid Clones, and Plasmid Clones-The wild-type, Mtx-sensitive (Pro−3) and Mtx-resistant (Pro−3 MtxRII 5-3) cell lines and their maintenance have been previously described (31, 32). The cosmid clone, 100-2, containing CHO DNA sequences that complement a mutant cell line defective in Mtx transport has been previously described (29). The isolation of the cDNA clone, pMtxT9, and its properties have been described (28).
DNA Isolation-High molecular weight DNA was isolated from exponentially growing CHO cells by the procedure of Gross-Bellard et al. (33).
Cosmid and plasmid clones were propagated in Luria-Bertani medium supplemented with ampicillin or ampicillin plus tetracycline. DNA was isolated from overnight cultures using the Qiagen plasmid kit as described by the manufacturer.
Southern Blotting-High molecular weight DNA or cosmid DNA was digested with either BamHI or HindIII (Pharmacia Biotech Inc.) according to the conditions of the supplier. The digested DNA was separated on 0.8% agarose gels and transferred onto Biotrans nylon paper (ICN Biomedicals) by the method of Southern (34). Blots were prehybridized for 24 h at 65°C in 5× SSC (1× SSC is 0.15 M sodium chloride, 0.015 M sodium citrate), 5× Denhardt's solution, 200 µg/ml salmon sperm DNA, and 0.1% SDS. Labeled DNA from the plasmid pMtxT9 was added to the hybridization mixture to a final concentration of 2-4 ng/ml.
After 48 h at 65°C the blots were washed two times for 15 min with 2× SSC, 0.1% SDS at room temperature and three times for 15 min with 0.1× SSC, 0.1% SDS at 50°C. Blots were exposed to x-ray film at −70°C with intensifying screens.
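The SSC strengths used for prehybridization and washing follow directly from the 1× definition given above. The short Python sketch below is only an arithmetic illustration; the names `SSC_1X` and `ssc_molarity` are hypothetical and are not part of the published protocol.

```python
# Component molarities of an n-fold SSC buffer, using the definition in the text:
# 1x SSC = 0.15 M sodium chloride, 0.015 M sodium citrate.
SSC_1X = {"sodium chloride": 0.15, "sodium citrate": 0.015}  # mol/L


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at the given fold."""
    return {name: round(fold * molar, 6) for name, molar in SSC_1X.items()}


# The three strengths mentioned in the Southern blotting procedure.
for fold in (5, 2, 0.1):
    print(f"{fold}x SSC:", ssc_molarity(fold))
```

Under this definition, the 5× prehybridization buffer is 0.75 M sodium chloride and the 0.1× stringent wash is 0.015 M sodium chloride.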
Probe Labeling-The CHO rfc cDNA fragment from the plasmid pMtxT9 was used as probe and labeled with [α-32P]dCTP (ICN Biomedicals; 3000 Ci/mmol) by the random priming method described by Feinberg and Vogelstein (35). Labeled DNA was routinely obtained at a specific activity of 5-10 × 10^8 cpm/µg.
Restriction Endonuclease Mapping-The initial restriction endonuclease map of cosmid 100-2 was derived from partial endonuclease digestion and hybridization with either T3 or T7 end-labeled sequencing primers as described previously (29). Fine mapping of restriction endonuclease sites was done by hybridizing Southern blots with end-labeled oligodeoxynucleotide primers that span the cDNA sequence in pMtxT9.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EBI Data Bank with accession numbers U58838 and U58839.
DNA Sequencing-Double-stranded DNA sequencing was performed by the dideoxy chain termination method using the T7 sequencing kit supplied by Pharmacia. Sequences were obtained using either T7 or Sp6 sequencing primers or synthetic oligodeoxynucleotide primers spanning the cDNA sequence in pMtxT9.
Cosmid clone fragments chosen for sequence analysis were isolated and subcloned into the vectors pGEM3 or pGEM4 and propagated in Escherichia coli strains SURE®, XL1-Blue, or JM109.
RNA Isolation-Poly(A+) RNA was isolated from ~2 × 10^8 exponentially growing Pro−3 cells using the Fast Track mRNA isolation system (Invitrogen). Approximately 30 µg of poly(A+) RNA was obtained from this number of cells.
Northern Blotting-Five µg of poly(A+) RNA from Pro−3 cells were resolved on a 1.2% agarose gel containing 10 mM sodium phosphate (pH 7.0) as described previously (28). After electrophoresis, the RNA was transferred to Biotrans nylon paper (ICN Biomedicals) as described previously (28). Hybridization, washing, and autoradiography were performed as described previously for Southern blotting (29).
Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR)-First-strand cDNA was synthesized from 1 µg of poly(A+) RNA using Superscript II reverse transcriptase (Life Technologies, Inc.) according to the conditions recommended by the supplier. Sequences corresponding to pMtxT9 were amplified from the first-strand cDNA using a 20-mer forward primer (P1: 5′-GTGTTGTAGTGCGCGTGGTG-3′) that was designed at base pair −221 of the rfc mRNA1 sequence and a 20-mer reverse primer designed at either base pair 208 (P2: 5′-GCACCGGAATGATCTCATTG-3′) or base pair 2067 (P3: 5′-TGTGCCTTGGCTGGTGTCTG-3′) for 30 cycles of amplification using standard PCR conditions. The PCR products were separated by electrophoresis on 0.8% agarose gels or 5% acrylamide gels and visualized by ethidium bromide staining. The PCR products were subcloned using the TA cloning kit (Invitrogen) and were sequenced as described above.
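As a simple consistency check on the primers listed above, the following Python sketch confirms that P1, P2, and P3 are each 20 nucleotides long once the line-break hyphens are removed, and reports GC content and the reverse complement. It is an illustration only: the helper names `reverse_complement` and `gc_fraction` are hypothetical, and no such calculation is described in the original procedure.

```python
# RT-PCR primers as given in the text (5' to 3'), with line-break hyphens removed.
PRIMERS = {
    "P1": "GTGTTGTAGTGCGCGTGGTG",  # forward primer
    "P2": "GCACCGGAATGATCTCATTG",  # reverse primer
    "P3": "TGTGCCTTGGCTGGTGTCTG",  # reverse primer
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```

The reverse complement of a reverse primer is the sequence expected on the sense strand of the cDNA, which is the form to search for when locating the primer in a sequence file.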
Subcloning and Transfection of the Larger mRNA Isoform-The mRNA isoforms were amplified by RT-PCR from Pro−3 cells using the 20-mer forward primer P1 and a 20-mer reverse primer designed at base pair 1733 (P4: 5′-GGAGAGGTTGCTTAAGTCAG-3′). The resulting PCR product spans full-length coding sequences. The Expand polymerase mixture (Boehringer-Mannheim) was used to minimize polymerase errors. The PCR products were directionally cloned into the pCR3 vector using the unidirectional TA cloning kit (Invitrogen). The PCR product clones were screened by restriction endonuclease digestion and DNA sequencing to identify clones of the larger DNA isoform.
Transfection of the PCR product clones was carried out by the polybrene procedure as described previously (36). Ten µg of DNA per 1-2 × 10^5 cells were transfected into the mutant Pro−3 MtxRII 5-3 cells. After transfection, cells were selected by growth in low levels of folinic acid (2 nM) (37). Cells able to grow in the selective medium were isolated and tested for sensitivity to Mtx as described previously (31).
Rapid Amplification of cDNA Ends (5′-RACE)-Total RNA was isolated from 1 × 10^7 Pro−3 cells using a guanidinium isothiocyanate-phenol-chloroform method (38). 5′-RACE was performed using the 5′-RACE System according to the conditions of the supplier (Life Technologies, Inc.). Reverse transcription was primed with a 20-mer reverse primer designed at base pair 500 (P5: 5′-ACAGAGCTGGTGAAGACTCC-3′) relative to the pMtxT9 sequence. Tailed products were amplified by PCR using the 5′ anchor primer and a nested reverse primer (P2). These products were diluted 20- or 50-fold and reamplified using the nested universal amplification primer (Life Technologies, Inc.) and a specific reverse primer designed at base pair 149 (P6: 5′-GGTGTGATGAAGCTCTCC-3′). The RACE products were cloned using the TA cloning kit (Invitrogen) and sequenced as described above.
RNase Protection Assay-Antisense probes were prepared from a 400-bp subclone of the cosmid 100-2 in the vector pGEM4, which contained sequences extending 109 bp upstream of the first nucleotide of pMtxT9. Probes were synthesized using T3 polymerase (Pharmacia) incorporating [α-32P]CTP (DuPont NEN; 800 Ci/mmol). The probe was purified on a 5% denaturing acrylamide gel. Approximately 5 × 10^5 cpm were combined with 2.5 µg of poly(A+) mRNA or 3 µg of yeast RNA and hybridized for 18 h at 45°C. Ribonuclease digestion was performed using the RPA II kit according to the supplier's recommendations (Ambion Inc.). Protected fragments were separated on a 20% denaturing acrylamide gel. Sizes were approximated using 32P end-labeled oligodeoxynucleotides.

RESULTS
Genomic Organization of the Hamster Mtx Transport Gene-Previously we have isolated a genomic clone and the corresponding cDNA that are able to complement the phenotype of a mutant hamster cell line that is defective in Mtx uptake. The availability of both genomic and cDNA clones for this function allowed the determination of its genomic organization. To determine the intron/exon organization of the gene, synthetic oligodeoxynucleotides derived from regions spanning the cDNA were hybridized to DNA from the cosmid to identify complementary sequences. DNA sequencing was performed either directly on the cosmid DNA or on subcloned fragments to determine the intron/exon boundaries. These analyses revealed 7 exons and 6 introns that span approximately 15.3 kb (Fig. 1A). The donor and acceptor sequences (Table I) conform to consensus GT/AG splice site sequences with the exception of the donor splice site of intron 3. This site has a C replacing the consensus T at the +2 position of the intron. Although it is rare, this substitution has been observed in other donor splice sites (39).
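The splice-site rule applied above can be stated compactly: an intron conforms to the GT/AG consensus when its first two bases are GT and its last two are AG, and the intron 3 donor is the variant with C at the +2 position (GC). The Python sketch below illustrates that check. The function name `classify_intron` and the two short test sequences are invented for illustration; they are not the sequences of Table I.

```python
def classify_intron(intron):
    """Classify an intron by its terminal dinucleotides.

    The intron is given 5' to 3' on the sense strand, so positions +1 and +2
    are its first two bases and the acceptor dinucleotide is its last two.
    """
    donor = intron[:2].upper()
    acceptor = intron[-2:].upper()
    if acceptor != "AG":
        return "non-consensus acceptor"
    if donor == "GT":
        return "GT/AG"
    if donor == "GC":
        return "GC/AG"  # C replaces the consensus T at position +2
    return "non-consensus donor"


# Invented toy introns, for illustration only (not taken from Table I).
print(classify_intron("GTAAGTCTTTTCTTTCAG"))
print(classify_intron("GCAAGTCTTTTCTTTCAG"))
```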
The exon sequences were in agreement with the previously reported cDNA sequence (28) with one exception. The original cDNA was isolated from a CHO-K1 cDNA expression library. The cDNA contains the sequence AGA beginning at base pair 1459, which codes for arginine in the predicted amino acid sequence. The cosmid 100-2, which contains genomic DNA sequences from the Pro−3 cell line, has the sequence GGA at this position, which codes for a glycine residue. This difference appears to be a polymorphism, because other cDNA clones from the CHO-K1 library contain the AGA sequence at this position, while cDNA clones and RT-PCR sequence data from the Pro−3 cell line all contain the GGA sequence. Although the significance of this change is not clear, both cosmid DNA from Pro−3 and cDNA from CHO-K1 or Pro−3 (see below) are able to complement the Mtx transport defect in the mutant cells, which indicates that this change may not be functionally important.
The cosmid 100-2 was isolated from a Pro−3 genomic cosmid library by two rounds of genomic cloning (29). In order to ensure that the genomic organization of 100-2 reported above had not been altered during the selection and cloning process, these sequences were compared to genomic DNA isolated from the Pro−3 cell line by Southern hybridization using the cDNA sequence as a probe (Fig. 2). The restriction endonuclease pattern of Pro−3 genomic DNA as compared to cosmid 100-2 indicated a similar structure, with some exceptions. The largest fragments in both the BamHI and HindIII digests of cosmid 100-2 DNA are smaller than their counterparts in genomic DNA. This difference is due to truncation of the ends of the DNA in the cosmid 100-2 as a result of the cloning process (Fig. 1A). Fig. 1A also illustrates the change in the order of the restriction enzyme sites at the 3′ end of the cosmid 100-2. This difference may also be due to the truncation of this end during the cloning process, which involved two rounds of genomic cloning in order to isolate the complementing genomic clone (29).
Another difference in the structure of the cosmid 100-2 is shown by the change in size of the smallest fragment in the BamHI digest. This fragment is approximately 500 bp smaller in the cosmid DNA as compared to the genomic DNA. This difference is due to a deletion in intron 2 of the cosmid 100-2 (Fig. 1A).
The other bands in the Southern blot analysis, including the 4.3-kb BamHI doublet and the 6.4-kb HindIII band, were comparable between the cosmid 100-2 and genomic sequences.
Identification of Alternatively Spliced Messages-To determine the nature of the messenger RNA produced in Pro−3 cells, a Northern blot of poly(A+) selected mRNA was probed with the cDNA sequence (Fig. 3A). The Northern analysis indicated that there were two mRNA species approximately 2.5 kb and 2.6 kb in size. The two mRNAs were present in similar amounts in Pro−3 cells, since the band intensities are comparable (Fig. 3A). Amplification of the hamster rfc gene using first-strand cDNA synthesized from Pro−3 cells confirmed the presence of two mRNA species that differed in size by 121 bp (Fig. 3B). Using PCR primers specific for the 5′ end of pMtxT9, the 121-bp difference was localized to this portion of the pMtxT9 sequence (Fig. 3C). The PCR products shown in Fig. 3, B and C, were cloned and sequenced. The sequence of the lower molecular weight species was identical to the sequence of pMtxT9. The sequence of the higher molecular weight species contained an additional 121 bp inserted at position −48, which was known to be a splice junction (Fig. 4). An oligodeoxynucleotide specific for the 121-bp sequence was used to map the new exon relative to the genomic sequence in the cosmid 100-2. This exon was labeled exon 2 and the numbering of the 5′-untranslated region of the previously reported pMtxT9 sequence (28) was changed to include the new exon (Fig. 4). Therefore the hamster rfc gene codes for two mRNA species (Fig. 1B), rfc mRNA1, which contains exons 1 through 7, and rfc mRNA2, in which exon 2 is absent. The addition of this exon occurs 5′ to the putative translational start site. Thus it was anticipated that there should be no functional difference between the protein products encoded by the two messages. To confirm this, the two mRNA species were amplified by RT-PCR and cloned into the expression vector pCR3 (Invitrogen).
The resulting RT-PCR clones were sequenced, and three clones of the large mRNA isoform, which contain the 121-bp second exon, were transfected into the mutant Pro−3 MtxRII 5-3 cell line and selected for the ability to complement the mutant phenotype. All three clones were able to rescue the phenotype at a frequency comparable to that of the cDNA clone pMtxT9, which corresponds to the mRNA2 sequence. This indicates that there is no apparent functional difference between the mRNA isoforms with respect to their ability to complement the mutant phenotype.
To ensure that DNA fidelity was maintained during the amplification process, the sequence of the RT-PCR clones was compared to the cDNA sequence. Three sequence alterations that were presumably due to polymerase error were noted in the three RT-PCR clones. One of the three clones had no sequence alterations, while another clone had a silent mutation at base pair 780. The third clone had two point mutations, a silent mutation at base pair 1648 and a mutation at base pair 853 that resulted in the substitution of methionine for leucine at amino acid 285 in the predicted amino acid sequence. This change is relatively conservative and did not appear to affect the ability of the clone to rescue the mutant phenotype.
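The relationship between the base-pair positions and the amino acid number quoted above is simple integer arithmetic. The Python sketch below assumes that coding positions are numbered from the A of the putative ATG initiation codon as +1; the function name `codon_position` is hypothetical and is used only for illustration.

```python
def codon_position(bp):
    """Map a 1-based coding position to (codon number, position within codon).

    Assumes numbering starts at +1 on the A of the putative ATG initiation codon.
    """
    return (bp - 1) // 3 + 1, (bp - 1) % 3 + 1


# Positions mentioned in the text: the three RT-PCR alterations and the
# AGA/GGA polymorphism that begins at base pair 1459.
for bp in (780, 853, 1648, 1459):
    codon, offset = codon_position(bp)
    print(f"bp {bp}: codon {codon}, position {offset}")
```

Under this assumption, base pair 853 is the first position of codon 285, in agreement with the reported leucine-to-methionine change at amino acid 285, and base pair 780 falls on the third position of codon 260.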
The 5′-region of the Hamster Mtx Transport Gene-The transcriptional start site of the hamster rfc gene was mapped using 5′-RACE and RNase protection analysis. The results of the 5′-RACE analysis are indicated in Fig. 4. Thirty-three clones were obtained and sequenced. The majority of the clones mapped to six sites near the beginning of the pMtxT9 sequence with two clones mapping to a site at base pair 53 of the putative open reading frame. RNase protection was used to confirm the location of the transcription start sites. Six RNase protection products were detected (Fig. 5) whose positions corresponded well with the 5′-RACE products near the beginning of the pMtxT9 sequence (Fig. 4). The predominant RNase protection product appears to correspond to the first base pair of the pMtxT9 clone (Fig. 4). Two minor bands were seen in the RNase protection approximately 50 bp upstream of the major RNase protection product. These bands were not very intense but may indicate that there is a low level usage of an upstream transcription start site. No 5′-RACE products corresponding to an mRNA of this size were detected.
Two hundred sixty-one base pairs of sequence upstream of the major transcriptional start site from the cosmid 100-2 are also shown in Fig. 4. This sequence is consistent with a "housekeeping" gene promoter sequence in that it is fairly GC-rich, does not contain a TATA box element, and contains a consensus binding site for the Sp1 transcription factor at −346 (40).

DISCUSSION
This report describes the organization of the hamster rfc gene, the characterization of two alternatively spliced transcripts, and the identification of the transcriptional start sites. The gene is organized into seven exons and six introns. The sequences at the splice site junctions conform to the consensus GT/AG with the exception of the donor splice site of intron 3, which contains a C instead of a T at the +2 position of the intron. Although this splice site does not conform to the consensus, it appears to function efficiently in CHO cells. We have not been able to detect a splicing error at this intron/exon junction in mRNA from Pro−3 cells using RT-PCR analysis.
Splicing errors at other splice site junctions do seem to occur in the hamster rfc gene. In screening the mRNA from Pro−3 cells by RT-PCR, we were able to detect a potential splice error at the splice acceptor site of exon 6. Two RT-PCR clones have been isolated that contain either a 7- or 29-bp deletion at the beginning of exon 6. The two deletions both begin at the first base pair of exon 6 and end after the next two downstream AG sequences, indicating that they may be cryptic splice acceptor sites. The deletions in these RT-PCR products correspond to the 29-bp deletion previously reported in Williams et al. (28) for clones pMtxT5 and pMtxT7. Since the 29-bp deletion is found in both a cDNA library and by RT-PCR analysis, there must be a proportion of mRNAs in the cell that contain the deletion. Although our PCR conditions were not quantitative, the majority of PCR products isolated did not contain the 29- or the 7-bp deletion.
Another possible splicing error that involves the splice site junctions that flank exon 4 has been reported in the literature. The hamster rfc cDNA clone pMtxT5 (28) contains a 757-bp deletion which is the result of the loss of exon 4. The 3′ end of this deletion corresponds to the 3′ end of the 119-bp deletion reported by Brigle et al. (41) for the mouse rfc message. Although the intron/exon structure of the mouse rfc gene is not known, it is likely that this deletion also involves a splice site junction. In the present study, we have not been able to detect any sequence alterations that would correspond to either of the above deletions using RT-PCR analysis on mRNA from Pro−3 cells.
These deletions may represent aberrantly spliced messenger RNAs that are present at very low levels and are detected due to the sensitivity of the polymerase chain reaction. We have not been able to detect mRNAs containing these deletions by Northern blot analysis, although we can detect the alternatively spliced messages that only differ in size by 121 bp. All of these deletions occur at splice junctions, but it is not clear whether they are simply splice errors or if they represent another level of regulation of this gene. It should be noted that a construct containing the full-length cDNA sequence with the 29-bp deletion at the beginning of exon 6 is not able to complement our mutant cells upon transfection. This indicates that the protein produced from a transcript containing the 29-bp deletion is not functional, at least in our assay system.
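One point of arithmetic bears on the deletions discussed above: a deletion within coding sequence preserves the reading frame only if its length is a multiple of three. The Python sketch below applies that test to the deletion lengths quoted in the text. It is an illustration, not an analysis performed in this study: the labels and the function `preserves_frame` are hypothetical, and it is an assumption that each deletion lies entirely within the coding region.

```python
# Deletion lengths (bp) quoted in the text; the labels are shorthand for this sketch.
DELETIONS = {
    "exon 6, 7-bp deletion": 7,
    "exon 6, 29-bp deletion": 29,
    "hamster pMtxT5, loss of exon 4": 757,
    "mouse rfc message (Brigle et al.)": 119,
}


def preserves_frame(length_bp):
    """Return True if removing length_bp coding bases keeps the reading frame."""
    return length_bp % 3 == 0


for label, length in DELETIONS.items():
    status = "in frame" if preserves_frame(length) else "shifts the frame"
    print(f"{label}: {length} bp, remainder {length % 3}, {status}")
```

None of the four lengths is divisible by three, so under the stated assumption each deletion would alter the downstream reading frame. This would be consistent with, but does not by itself explain, the failure of the 29-bp deletion construct to complement the mutant cells.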
The hamster rfc gene encodes two alternatively spliced transcripts. The transcripts differ with respect to the presence or absence of a 121-bp exon in the putative 5′-untranslated region. Upon transfection, the larger mRNA isoform is able to complement the mutant phenotype at a frequency similar to the original cDNA clone pMtxT9, indicating that there is no significant functional difference between them in this assay system. Although this is the first report of an alternative splice in the rfc gene, we also see a doublet band on a Northern blot of mouse RNA. It will be interesting to see if the two mRNA isoforms show any differences in spatial/temporal expression or translational regulation in an intact organism.
We have mapped the transcriptional start sites of the rfc gene to several positions approximately 200 bp upstream of the translational start site. The RACE data and RNase protection data taken together indicate that there are approximately six transcription start sites close to the beginning of the pMtxT9 cDNA clone. The size of transcripts generated from these start sites agrees with the observed size of the rfc mRNA on a Northern blot. The use of imprecise transcriptional start sites is common in TATA-less "housekeeping" genes (40).
Two additional bands in the RNase protection assay indicate that there may be a low level usage of an upstream transcriptional start site. We have not been able to detect any 5′-RACE products that would correspond to these bands, nor do we see a larger mRNA on a Northern blot. The possible use of an upstream transcriptional start site is currently being investigated using RT-PCR.
Brigle et al. (41) have shown an alignment of the translated nucleotide sequence upstream of the predicted ATG initiation codon from the CHO and L1210 rfc cDNAs. The authors point out that the degree of similarity and lack of an upstream in-frame termination codon may indicate that these sequences encode additional protein information. However, an analysis of the putative 5′-untranslated region of the human cDNA sequence, which does not contain an in-frame stop codon, revealed very little homology with the hamster or mouse sequences. This is in contrast with the amino acid sequence downstream of the putative ATG initiation codon, which is highly conserved among hamster, mouse, and human (23). These observations combined with the transcriptional start site mapping data presented in this paper and the size of the messenger RNA on a Northern blot imply that the predicted ATG initiation codon is the major translational start site.
FIG. 4. DNA sequences at the 5′ end of the rfc gene. Numbering on the left indicates the nucleotide position relative to the translational start site designated +1 (*). Lower case indicates the sequence of the alternatively spliced exon 2. The consensus Sp1 binding sequence is underlined. The SacII site at the beginning of the probe used for RNase protection (Fig. 5) is double underlined. The arrows above the sequence indicate the positions of the 5′-RACE products. The arrows below the sequence indicate the positions of the RNase protection products, with the double arrow indicating the major RNase protection product (Fig. 5). q, the first base pair of the cDNA clone pMtxT9.
The information obtained in these studies should prove useful for determining the state of the rfc gene in both normal and transport-deficient cell systems.