Human Mitochondrial C1-Tetrahydrofolate Synthase

C1-tetrahydrofolate (THF) synthase is a trifunctional enzyme found in eukaryotes that contains the activities 10-formyl-THF synthetase, 5,10-methenyl-THF cyclohydrolase, and 5,10-methylene-THF dehydrogenase. The cytoplasmic isozyme of C1-THF synthase is well characterized in a number of mammals, including humans, but a mitochondrial isozyme has previously been identified only in the yeast Saccharomyces. Here, we report the identification and characterization of the human gene encoding a functional mitochondrial C1-THF synthase. The gene spans 236 kilobase pairs on chromosome 6 and consists of 28 exons plus one alternative exon. The gene encodes a protein of 978 amino acids, including an N-terminal mitochondrial targeting sequence. The mitochondrial isozyme is 61% identical to the human cytoplasmic isozyme. Expression of the gene was detected in most human tissues, but transcripts were highest in placenta, thymus, and brain. Two mRNAs were detected, a 3.6-kb transcript and a 1.1-kb transcript, and both transcripts were observed in varying ratios in each tissue. The shorter transcript results from an alternative splicing event in which exon 7 is spliced to exon 8a instead of exon 8. Exon 8a is derived from an exonized Alu sequence, shares no homology with exon 8 of the long transcript, and encodes just 15 amino acids followed by a stop codon and a polyadenylation signal. This short transcript potentially encodes a bifunctional enzyme lacking 10-formyl-THF synthetase activity. Both transcripts initiate at the same 5′-site, 107 nucleotides upstream of the ATG start codon. The full-length (2934 bp) cDNA fused to a C-terminal V5 epitope tag was expressed in Chinese hamster ovary cells. Immunoblots of subfractionated cells revealed a 107-kDa protein only in the mitochondrial fractions of these cells, confirming the mitochondrial localization of the protein. Yeast cells expressing the full-length human cDNA exhibited elevated 10-formyl-THF synthetase activity, confirming its identification as the human mitochondrial C1-THF synthase.

In eukaryotic cells, the mitochondrial and cytosolic compartments each contain a parallel set of one-carbon unit-interconverting enzymes (1). For example, in the yeast Saccharomyces cerevisiae, mitochondrial and cytoplasmic isozymes of C1-THF synthase (encoded by the nuclear genes MIS1 and ADE3, respectively) have been purified and characterized (2,3). Both isozymes exist as homodimers of 100-kDa subunits. Each subunit consists of a C-terminal 10-formyl-THF synthetase domain of ~70 kDa and an N-terminal bifunctional dehydrogenase/cyclohydrolase domain of ~30 kDa linked via a proteolytically sensitive connector region. This subunit size and domain structure are shared by cytoplasmic isozymes from mammalian and avian sources (4-9).
All three activities of C1-THF synthase are found in mammalian mitochondria as well (10,11). Our studies with intact rat liver mitochondria and mitochondrial extracts demonstrated the ability of these organelles to oxidize carbon 3 of serine to formate by a folate-dependent pathway (Fig. 1, reactions 1-4) (11). However, the existence, structure, and function of the folate-interconverting activities of C1-THF synthase in mammalian mitochondria have been controversial. MacKenzie and co-workers (12,13) characterized a bifunctional NAD-dependent 5,10-methylene-THF dehydrogenase/5,10-methenyl-THF cyclohydrolase, originally isolated from ascites tumor cells. This bifunctional enzyme lacks the large C-terminal domain catalyzing the 10-formyl-THF synthetase activity and thus is unable to produce formate. This enzyme was shown to be a nuclear-encoded mitochondrial protein (14,15), detectable only in transformed mammalian cells and embryonic or nondifferentiated tissues (12). Among adult differentiated tissues, NAD-dependent 5,10-methylene-THF dehydrogenase activity is detectable only in rat adrenal tissue (16), although the mRNA encoding this enzyme is present at low levels in all tissues examined (17). MacKenzie and co-workers (18,19) have argued that mammalian mitochondria lack a C1-THF synthase and that the bifunctional NAD-dependent dehydrogenase/cyclohydrolase is the mammalian homolog of the trifunctional mitochondrial enzyme.
Here, we report the identification and characterization of the human gene encoding a functional mitochondrial C1-THF synthase. We show that it is expressed widely in adult human tissues and that the full-length cDNA encodes a protein that localizes to mitochondria when expressed in Chinese hamster ovary (CHO) cells. These data confirm the existence of C1-THF synthase in mammalian mitochondria, completing the folate-interconverting pathway shown in Fig. 1.

EXPERIMENTAL PROCEDURES
Materials-All chemicals were of the highest available commercial quality. Difco media components were obtained from VWR (West Chester, PA). Restriction enzymes, shrimp alkaline phosphatase, calf intestinal alkaline phosphatase, and T4 DNA ligase were purchased from Invitrogen. Primers for PCR and sequencing were made by IDT (Coralville, IA). [α-³²P]dATP (3000 Ci/mmol) was purchased from PerkinElmer Life Sciences.
Construction of Full-length cDNA-A partial cDNA clone (DKFZp586G1517) constructed by the German Genome Project (RZPD German Research Center for Genome Research) (20) was identified in the GenBank™/EBI Data Bank (accession number AL117452) by a BLAST search using the cDNA sequence of the human cytoplasmic C1-THF synthase (21). This cDNA contains 390 nucleotides (nt) of 3′-noncoding sequence and a poly(A) tail, but lacks a start codon, indicating that it is truncated at the 5′-end. The truncated cDNA clone was obtained from RZPD, and its sequence was confirmed by the DNA Analysis Facility of the University of Texas (Austin, TX). The Human Genome Database contains the entire gene corresponding to this cDNA and predicts an additional 5′-exon that encodes 60 additional N-terminal amino acids. The missing 5′-exon (exon 1) (see Fig. 4) was PCR-amplified from a genomic P1 artificial chromosome clone (dJ44A20) obtained from the Sanger Centre (Cambridge, UK). The PCR-amplified product was gel-purified using a QIAGEN gel extraction kit and subcloned into the pGEM-T Easy vector (Promega, Madison, WI), and its sequence was verified. It was necessary to use MasterAmp Tfl DNA polymerase (Epicentre Technologies Corp., Madison, WI) in the PCR due to the high GC content of exon 1 (see "Results"). The partial cDNA clone and the exon 1 clone were then used as templates in a splice overlap extension (SOE)-PCR (22) to produce the full-length cDNA. The exon 1 fragment (230 bp) was amplified using Tfl polymerase and primers TOPO5′ (5′-CACCATGGGCACGCGTCTGCCGCTC-3′, with the ATG start codon underlined) and humitoSOE3′ (5′-CTTCTCTGACGATGGAGTCCCG-3′). The 2719-bp cDNA fragment was PCR-amplified using Pfu polymerase and primers GS5′SOE (5′-GGGACTCCATCGTCAGAGAAG-3′) and TOPO3′ (5′-GAACAAGCCTTTAACTTGTTCTGTTTC-3′). Primer TOPO3′ is complementary to the last nine codons of the open reading frame before the stop codon. Both products were gel-purified using the QIAGEN gel extraction kit.
The 230- and 2719-bp PCR products served as templates in the SOE-PCR using primers TOPO5′ and TOPO3′ and Tfl polymerase. The full-length cDNA product (2934 bp) was gel-purified and cloned into the mammalian expression vector pcDNA3.1D/V5-His-TOPO (Invitrogen) using directional TOPO cloning according to the manufacturer's instructions. The TOPO cloning reaction was transformed into One-Shot chemically competent Escherichia coli (Invitrogen), and positive colonies were selected on YT (0.5% yeast extract, 0.8% Tryptone, and 0.5% NaCl) plates containing 50 μg/ml ampicillin. The colonies were screened by PCR with a vector primer and a gene-specific primer, and positive plasmids were prepared using a QIAGEN miniplasmid preparation kit. Sequence analysis revealed a base substitution in the full-length clone compared with the original cDNA and genomic sequences, presumably incorporated during the PCRs. (Tfl polymerase, which was chosen due to the high GC content of exon 1, lacks a 3′→5′ proofreading activity.) This substitution was repaired using the QuikChange site-directed mutagenesis kit (Stratagene). The repaired full-length cDNA clone, pcDNA3.1-humito, was sequenced completely, and the correct sequence was confirmed (GenBank™/EBI accession number AY374130).
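SOE-PCR depends on the two first-round products sharing a complementary end, so that they can anneal and prime each other in the joining reaction. The sketch below is only an illustrative check and not part of the published procedure: it takes the humitoSOE3′ and GS5′SOE sequences as printed above (with line-break hyphens removed) and measures how far one primer matches the reverse complement of the other. The helper names are hypothetical.

```python
# Primer sequences copied from the text (line-break hyphens removed).
HUMITO_SOE_3 = "CTTCTCTGACGATGGAGTCCCG"  # humitoSOE3' (antisense primer, exon 1 fragment)
GS5_SOE = "GGGACTCCATCGTCAGAGAAG"        # GS5'SOE (sense primer, cDNA fragment)

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def common_prefix_length(a: str, b: str) -> int:
    """Count how many leading characters two strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# The antisense primer should read the same as the reverse complement
# of the sense primer over the region where the two fragments overlap.
overlap = common_prefix_length(HUMITO_SOE_3, reverse_complement(GS5_SOE))
print(f"complementary overlap: {overlap} nt")
```

With the sequences as printed, the two primers are complementary over the full 21 nt of GS5′SOE, which is consistent with their use as the overlapping junction primers.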
CHO Cell Transfection-CHO cells (1.5 × 10⁵) were plated on 35-mm diameter dishes and cultured in α-minimal Eagle's medium supplemented with 10% (v/v) fetal bovine serum. Duplicate plates were then transfected with 2 μg of pcDNA3.1-humito/plate using the LipofectAMINE 2000 reagent method (Invitrogen). After transfection, cells were cultured for an additional 48 h in regular medium before a G418-containing selective medium (0.8 mg/ml) was applied. The selective medium was applied for ~1 week until antibiotic-resistant colonies developed. Resistant colonies were picked, replated, cultured, and collected.
Preparation of Cell Homogenates and Subcellular Fractions-Transfected cells were cultured in two 150-cm² T-flasks to yield 1-2 × 10⁸ cells. The monolayer was rinsed with phosphate-buffered saline (4 × 5 ml) at 4°C and then incubated with phosphate-buffered saline containing 10 mM EDTA (10 ml) at room temperature until the cells detached (5-10 min). The flasks were tapped gently to dislodge the cells, and the cells were transferred to a 50-ml plastic conical tube. Cells were pelleted by centrifugation at 300 × g for 5 min at room temperature, and the cell pellet was washed with 15 ml of homogenization solution (HMS; 250 mM sucrose and 1 mM EDTA (pH 6.9)) at 4°C. The cell pellet was resuspended in HMS (2 ml) at 4°C, transferred to a Kontes nitrogen cavitation device, and exposed to a pressure of 36 p.s.i. for 30 min at 4°C. The suspension of disrupted cells was collected into a 3-ml conical ground-glass Duall tissue grinder and further disrupted with four strokes of the homogenizer (23).
Nuclei and unbroken cells were sedimented by centrifugation at 900 × g for 6 min. The supernatant was removed carefully, transferred to another centrifuge tube, and stored on ice. The pellet was resuspended in HMS (1 ml) and further dispersed by four strokes in the grinder. After centrifugation at 900 × g for 6 min, the supernatant was combined with the first supernatant and stored on ice. The pellet was washed with HMS (3 × 1 ml), and the final viscous pellet (nuclear fraction) was resuspended in HMS (1 ml). The combined supernatants were centrifuged at 900 × g for 5 min, and any pellet was discarded. The volume of the supernatant (total post-nuclear supernatant fraction) was increased to 5 ml by the addition of HMS.
The post-nuclear supernatant was centrifuged at 10,000 × g for 15 min, and the pellet was stored on ice. The supernatant was recentrifuged at 10,000 × g for 15 min to give a final supernatant (cytosolic fraction). The second pellet was combined with the first, washed with HMS (2 ml), and resuspended in HMS (1 ml) to give the mitochondrial fraction. Glutamate dehydrogenase activity (24) was used as a mitochondrial marker, and lactate dehydrogenase activity (25) was used as a cytoplasmic marker.
Immunoblotting-The protein concentrations of the cytosolic and mitochondrial fractions were determined using the Bradford assay (26) with bovine serum albumin as a standard. Eighty μg of cytosolic and mitochondrial protein from transfected and untransfected CHO cells were fractionated on a 7.5% SDS-polyacrylamide gel for 50 min at 180 V. One-half of the gel was stained, and the proteins on the other half were transferred onto a nitrocellulose membrane (Midwest Scientific, Valley Park, MO) by electroblotting for 90 min at 250 mA. The membrane was then washed with distilled water (3 × 5 min each) and blocked in 2% dry milk in Tris-buffered saline (TBS; 10 mM Tris base and 0.15 M NaCl (pH 8.0)) for 1 h at room temperature. The blocked membrane was incubated with mouse anti-V5 primary antibody (1:1000 dilution; Invitrogen) diluted in TBS and 1% dry milk for 1 h at room temperature. The membrane was then washed with TBS containing 0.0025% Tween 20 (TBST; 3 × 5 min each) and incubated with goat anti-mouse secondary antibody (1:2000 dilution; Zymed Laboratories Inc., San Francisco, CA) for 1 h at room temperature. The membrane was finally washed with TBST and TBS (2 × 5 min each) and rinsed with water before visualizing the bands. Reacting bands were visualized by enhanced chemiluminescence detection (ECL, Amersham Biosciences).
Expression in Yeast and Enzyme Assays-The full-length human cDNA was subcloned from pcDNA3.1-humito into the BamHI and XhoI sites of the yeast expression vector pVT103U (27). In the resulting construct, pVT-humito, the entire human mitochondrial C1-THF synthase open reading frame, including the mitochondrial presequence, is expressed from the ADH promoter of the vector. Yeast strain DAY3 (ser1 ura3-52 trp1 leu2 ade3-130) (28) was transformed with pVT-humito or empty pVT103U vector using a lithium acetate method (29) modified as described (footnote 2). Cells were grown in synthetic minimal medium, and extracts were prepared and assayed for NAD+- and NADP+-dependent methylene-THF dehydrogenase activity as described (30). 10-Formyl-THF synthetase activity was determined according to Kirksey and Appling (31).
Northern Analysis-A FirstChoice Northern human blot I kit was obtained from Ambion Inc. (Austin, TX), with poly(A)+ mRNA from the following adult human tissues: brain, placenta, skeletal muscle, heart, kidney, pancreas, liver, lung, spleen, and thymus. Probes were synthesized by asymmetric PCR using reagents supplied in the kit and [α-³²P]dATP according to the kit manufacturer's instructions. The two probes represented the 5′- and 3′-ends of the putative mitochondrial C1-THF synthase cDNA. The 5′-end probe was synthesized using primer GS5′SOE for the sense strand and primer GSI (5′-CCGCTCGAGCAAGGCATTGAGGACTTTGTTGCT-3′) for the antisense strand. This 304-bp probe covered nt +215 to +518. (The A of the ATG start codon is designated +1.) The 3′-end probe was synthesized using primer DRA3 (5′-GATGCAGTCCCCTGCTATCA-3′) for the sense strand and primer TOPO3′ for the antisense strand. This 465-bp probe covered nt +2469 to +2933, ending just before the stop codon.
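Because the probes are described both by length and by cDNA coordinates, the two can be cross-checked with simple inclusive arithmetic. The following is a minimal sketch, assuming the numbering stated above (the A of the ATG is +1); the function name and dictionary layout are illustrative only.

```python
def span_length(first: int, last: int) -> int:
    """Inclusive length of a span whose two ends lie on the same side of +1.

    cDNA numbering has no position 0, so spans that cross the start codon
    would need separate handling and are rejected here.
    """
    if (first > 0) != (last > 0):
        raise ValueError("span crosses the start codon; there is no position 0")
    return abs(last - first) + 1


# (first nt, last nt, stated probe length in bp), all taken from the text.
PROBES = {
    "5'-end probe": (215, 518, 304),
    "3'-end probe": (2469, 2933, 465),
}

for name, (first, last, stated) in PROBES.items():
    computed = span_length(first, last)
    status = "matches" if computed == stated else "differs from"
    print(f"{name}: computed {computed} bp {status} stated {stated} bp")
```

Both stated probe lengths agree with their coordinates under this convention.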
A probe was also synthesized for detection of the human cytoplasmic C1-THF synthase. The plasmid pUC13/HS230 (obtained from Dr. R. E. MacKenzie, McGill University), which contains a 230-bp fragment near the 3′-end of the human cytoplasmic C1-THF synthase cDNA (21), was linearized by digestion with SacI. A linear PCR amplification method (following the kit manufacturer's instructions) was used to synthesize the probe. The antisense primer used was 5′-GTAAAACGACGGCCAGT-3′, which is complementary to the vector sequences flanking the insert.
The membrane was subjected to a 1-h prehybridization at 42°C with Ultrahyb ultrasensitive hybridization buffer (Ambion Inc.). The probe was added at 10⁶ cpm/ml of hybridization buffer and allowed to hybridize at 42°C overnight in a roller bottle. The membrane was then washed twice with NorthernMax low stringency wash solution (equivalent to 2× SSC; Ambion Inc.) for 10 min at 42°C and twice with NorthernMax high stringency wash solution (equivalent to 0.1× SSC) for 30 min at 42°C. The membrane was exposed to a storage phosphor screen (Amersham Biosciences) for 48 h and imaged using an Amersham Biosciences 445 SI PhosphorImager. The same blot was stripped and reconstituted for hybridization with each probe according to the kit manufacturer's instructions.
Transcript Mapping-The 5′- and 3′-ends of the transcripts were mapped by RNA ligase-mediated rapid amplification of cDNA ends using the FirstChoice RLM-RACE kit from Ambion Inc. Human placental total RNA (Ambion Inc.) was used to map the 5′-end of the transcript. Nested antisense primers specific to the cDNA were designed for use with the two nested 5′-RACE primers provided in the kit (see Fig. 6). The cDNA-specific inner primer (GSI2, 5′-CGCCTCGAGACGGCTGGTTCTCAGGGGACAC-3′, with the XhoI site underlined) was complementary to nt −9 to −30 in the 5′-untranslated region. The cDNA-specific outer primer (GSO2, 5′-AGCGCGACAGGGCACACGGAG-3′) was complementary to nt +93 to +73. The 5′-RACE inner primer and the cDNA-specific inner primer had BamHI and XhoI sites, respectively, at their 5′-ends to facilitate cloning.
For mapping the 3′-end of the 1.1-kb transcript, first-strand cDNA was synthesized from human placental total RNA using the supplied 3′-RACE adapter. Nested sense primers specific to the cDNA were designed for use with the two nested 3′-RACE primers provided in the kit. The cDNA-specific inner primer (3′-RACE GSI, 5′-CGCCTCGAGGAACTTGTTTAGCAACAAAGTCCT-3′, with the XhoI site underlined) was equivalent to nt +485 to +508. The cDNA-specific outer primer (3′-RACE GSO, 5′-CGCCTCGAGCTCCCTCCAGATAGCAGTGAA-3′) was equivalent to nt +390 to +410. The 3′-RACE inner primer and the cDNA-specific inner primer had BamHI and XhoI sites, respectively, at their 5′-ends to facilitate cloning. PCR fragments generated in the "inner" PCRs of both 5′- and 3′-RACE were gel-purified, digested with BamHI and XhoI, and ligated separately into BamHI/XhoI-digested pBluescript II KS(+) vector (Stratagene, La Jolla, CA). The ligation reactions were transformed into chemically competent XL1-Blue cells (Stratagene), and positive colonies were selected on YT/ampicillin plates. Colonies were screened by PCR using T7 reverse (5′-GTAATACGACTCACTATAGGGC-3′) and T3 forward (5′-AATTAACCCTCACTAAAGGG-3′) vector primers, and plasmids were prepared for sequence analysis. This 1.1-kb cDNA has been submitted to the GenBank™/EBI Data Bank under accession number AY374131.
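The RACE primers above are given both as sequences and as cDNA coordinates, which allows a quick consistency check. The sketch below assumes (this is an inference from the printed sequences, not a statement in the text) that the shared 9-nt prefix CGCCTCGAG is a non-templated 5′ tag carrying the XhoI site, and compares the length of the remaining gene-specific portion with the stated coordinate span.

```python
# Assumption: the shared prefix is a 5' tag (CGC clamp + XhoI site CTCGAG)
# that does not anneal to the cDNA.
XHOI_TAG = "CGCCTCGAG"

# name: (sequence as printed, first nt, last nt)
PRIMERS = {
    "GSI2": ("CGCCTCGAGACGGCTGGTTCTCAGGGGACAC", -9, -30),
    "GSO2": ("AGCGCGACAGGGCACACGGAG", 93, 73),
    "3'-RACE GSI": ("CGCCTCGAGGAACTTGTTTAGCAACAAAGTCCT", 485, 508),
    "3'-RACE GSO": ("CGCCTCGAGCTCCCTCCAGATAGCAGTGAA", 390, 410),
}


def gene_specific_part(seq: str) -> str:
    """Strip the assumed 5' tag, if present, leaving the annealing region."""
    return seq[len(XHOI_TAG):] if seq.startswith(XHOI_TAG) else seq


def span_length(first: int, last: int) -> int:
    """Inclusive span length; every span here lies on one side of +1."""
    return abs(last - first) + 1


# Map each primer to (annealing length, coordinate span length).
results = {
    name: (len(gene_specific_part(seq)), span_length(first, last))
    for name, (seq, first, last) in PRIMERS.items()
}

for name, (annealing_nt, span_nt) in results.items():
    print(f"{name}: annealing region {annealing_nt} nt, coordinate span {span_nt} nt")
```

Under that assumption, all four primers have annealing regions whose lengths equal their stated coordinate spans.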

RESULTS
cDNA Identification and Cloning-A cDNA encoding an open reading frame with high similarity to human cytoplasmic C1-THF synthase was cloned from human uterine RNA by the German Genome Project (RZPD; GenBank™/EBI accession number AL117452). The homology extends the length of the proteins, suggesting that the cDNA encodes another trifunctional C1-THF synthase (Fig. 2). This cDNA encodes 917 amino acids plus 390 nt of 3′-noncoding sequence and a poly(A) tail, but lacks a start codon, suggesting that it is truncated at the 5′-end. A BLAST search of this sequence against the Human Genome Database (NCBI Protein Database) revealed the corresponding gene on chromosome 6 at 6q25.2. This gene spans 236 kilobase pairs and encodes the entire cDNA sequence in 27 exons plus an additional 5′-exon that encodes 60 additional N-terminal amino acids. The predicted initiator codon sits within a near-perfect expanded Kozak consensus sequence (32). The first half of this N-terminal extension has the characteristics of a mitochondrial leader sequence, including the potential to form a positively charged amphipathic α-helix. Truncation of the original cDNA clone was due to the presence of a NotI site near the 3′-end of the first exon; NotI was used in the cDNA cloning procedure (20). Subsequently, the RIKEN Mouse Gene Encyclopedia Project (33) identified a full-length mouse cDNA (ID22289) that predicts a protein with 88% identity to the human protein, including the N-terminal extension (Fig. 2). The mouse cDNA lacks the NotI site that caused truncation of the human cDNA. These data suggest that the gene on human chromosome 6 encodes a mitochondrial C1-THF synthase.
Attempts to construct a full-length cDNA by RACE using human uterine RNA were unsuccessful, probably due to the extremely high GC content (>80%) of the first exon. Instead, a genomic P1 artificial chromosome clone (dJ44A20, Sanger Centre) was used to PCR-amplify the 5′-exon. This was then spliced to the remaining cDNA by SOE-PCR to construct a full-length cDNA encoding the human protein (GenBank™/EBI accession number AY374130).
CHO Cell Expression and Subcellular Localization-To determine whether the protein encoded by this cDNA is, in fact, mitochondrial, we expressed the cDNA in CHO cells. The full-length cDNA was cloned into the mammalian expression vector pcDNA3.1D/V5-His-TOPO. This construct fused the 14-amino acid V5 epitope and a His6 tag to the C terminus of the 2934-bp coding region. Expression of the insert in mammalian cells is driven by the cytomegalovirus promoter. The resulting plasmid, pcDNA3.1-humito, was transfected into CHO cells, and G418-resistant colonies were selected and grown. The cytosolic and mitochondrial fractions from transfected and untransfected (control) CHO cells were isolated as described under "Experimental Procedures." Each fraction was assayed for the mitochondrial marker enzyme glutamate dehydrogenase and the cytoplasmic marker enzyme lactate dehydrogenase. Glutamate dehydrogenase activity ranged from 68 to 95 μmol/min/mg of protein in the mitochondrial fractions, compared with 2.4-4 μmol/min/mg of protein in the cytoplasmic fractions. The lactate dehydrogenase activity of the mitochondrial fraction was only one-seventh that of the cytoplasmic fraction. These subcellular fractions were then subjected to SDS-PAGE and immunoblotting using antibodies against the V5 epitope (Fig. 3). A clear signal at ~107 kDa was detected in the mitochondrial fraction of the transfected CHO cell line (lane 2), but not in the cytoplasmic fraction (lane 1). This mobility is consistent with the expected size of the epitope-tagged construct (~1000 amino acids). No signal was seen in either fraction of the untransfected CHO cell line (lanes 3 and 4). These results confirm that this cDNA encodes a protein that localizes exclusively to mitochondria in a mammalian cell line.
Expression in Yeast-The full-length human mitochondrial C1-THF synthase cDNA, including the 62-codon N-terminal extension, was subcloned into a yeast expression vector (pVT103U) and transformed into an ade3 deletion strain (DAY3). Disruption of the ADE3 gene, which encodes the cytoplasmic C1-THF synthase, results in yeast cells with very low 10-formyl-THF synthetase and 5,10-methylene-THF dehydrogenase activities; the residual activity is due to the mitochondrial isozyme (34). DAY3 cells transformed with pVT-humito overexpressed 10-formyl-THF synthetase activity ~9-fold compared with cells transformed with empty vector (64.6 versus 7.1 milliunits/mg of protein). However, we did not detect any increase in 5,10-methylene-THF dehydrogenase activity in cells carrying the pVT-humito plasmid using either NADP+ or NAD+ as cofactor. The third activity of C1-THF synthase, 5,10-methenyl-THF cyclohydrolase, was not assayed because it is difficult to accurately measure this activity in crude extracts. These results, together with the mitochondrial localization data above, confirm that this cDNA encodes a protein with 10-formyl-THF synthetase activity, further supporting its identification as the human mitochondrial C1-THF synthase.
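The fold overexpression quoted above follows directly from the two specific activities. As a small worked check (variable names are illustrative, values are those reported in the text):

```python
# 10-Formyl-THF synthetase specific activities, milliunits/mg of protein.
SYNTHETASE_ACTIVITY = {
    "pVT-humito": 64.6,
    "empty vector": 7.1,
}

fold_increase = SYNTHETASE_ACTIVITY["pVT-humito"] / SYNTHETASE_ACTIVITY["empty vector"]
print(f"fold increase over empty vector: {fold_increase:.1f}")
```

The ratio rounds to 9.1, in line with the roughly 9-fold increase stated.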
FIG. 2. Alignment of human and mouse mitochondrial C1-THF synthases with human cytoplasmic C1-THF synthase. Black boxes denote identity, and white boxes denote conservative substitutions or identities in two of three proteins. The alignment was produced by the INRA server at the Laboratoire de Génétique Cellulaire (available at prodes.toulouse.inra.fr/multalin/multalin.html) using the MultAlin algorithm (56), and the output was generated by the ESPript program at the same site. hmito, human mitochondrial C1-THF synthase; mmito, mouse mitochondrial C1-THF synthase; hcyto, human cytoplasmic C1-THF synthase.
Gene Structure-The human gene encoding C1-THF synthase spans 236 kilobase pairs on chromosome 6 (Fig. 4). The coding sequence consists of 28 exons and is interrupted by 27 introns ranging from 89 to 55,350 bp in length. The start codon is present in the first exon, and the 5′-end of exon 1 extends 107 bp upstream of the ATG start codon (see "Transcript Mapping" below). The stop codon is present in exon 27, and exon 28 encodes 360 nt of 3′-untranslated region, including a polyadenylation signal (AATAAA). Exon 1 is very GC-rich (>80% GC), containing a CpG island and a NotI restriction enzyme site (GCGGCCGC). The existence of this NotI site prevented the cloning of a full-length cDNA because NotI linkers were used in the cloning procedure (20). All of the intron/exon splice sites follow the GT/AG rule (35), except after the terminal exons, 8a and 28 (Table I). A scan of the 5′-flanking sequences by the TESS web server (footnote 3) using the TRANSFAC Version 4.0 Database predicts numerous potential transcription factor-binding sites, including Sp1, retinoic acid receptor-α1, and CCAAT/enhancer-binding protein-α. The 5′-flanking sequence contains a TATAAA sequence at position −985.
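The two sequence features of exon 1 that complicated cloning, high GC content and an internal NotI site, are both easy to screen for computationally. The sketch below is illustrative only: `EXAMPLE_SEQ` is a made-up GC-rich string, not the actual exon 1 sequence, and the function names are hypothetical.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence, as given in the text


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical GC-rich test sequence (NOT the real exon 1).
EXAMPLE_SEQ = "GGCGCGGCCGCTGCCGCGGGC"

print(f"GC content: {gc_fraction(EXAMPLE_SEQ):.0%}")
print(f"NotI sites at: {find_sites(EXAMPLE_SEQ, NOTI_SITE)}")
```

Running the same two functions on a real candidate insert before choosing cloning enzymes would flag the kind of internal site that truncated the original cDNA clone.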
Northern Analysis-A Northern blot membrane prebound with human poly(A) RNA from several tissues was obtained from Ambion Inc. A 304-bp 5′-end probe spanning nt 215-518 of the human mitochondrial C1-THF synthase cDNA revealed two bands: one at ~3.6 kb and the other at ~1.1 kb (Fig. 5A). The upper band corresponded to the expected size of the full-length transcript. To ensure that the 1.1-kb band was not an artifact, we washed the membrane for an additional 30 min with high stringency wash buffer at 50°C. The additional wash did not eliminate either band. The upper and lower band distributions were very similar, with the highest transcript levels being in placenta, thymus, and brain. Expression was low in liver and skeletal muscle and barely detectable in heart.
To determine the relationship of the 3.6- and 1.1-kb transcripts, a 465-bp probe was synthesized that ended just before the stop codon. This 3′-probe detected only the 3.6-kb transcript (Fig. 5B), suggesting that the 1.1-kb transcript represents just the 5′-end of the cDNA.
We also compared the tissue distribution of the mitochondrial C1-THF synthase transcript with that of the cytoplasmic isozyme. Using a 230-bp probe from the 3′-end of the cytoplasmic C1-THF synthase cDNA (21), a 3.3-kb transcript was observed (Fig. 5C). The tissue distribution of this transcript differed from that of the mitochondrial isozyme, being highest in liver, kidney, and skeletal muscle. Thus, the human mitochondrial and cytoplasmic C1-THF synthase isozymes are encoded by distinct transcripts that do not cross-hybridize under these probe and wash conditions.
Transcript Mapping-A 5′-RACE experiment was done to determine the transcriptional start site(s). 5′-RACE was performed as described under "Experimental Procedures" using 10 μg of human placental total RNA for first-strand cDNA synthesis by reverse transcription. This was followed by a first round of PCR (outer PCR), which gave no detectable specific product. Two μl of the outer PCR product were used in a second round of PCR with nested primers (inner PCR), yielding a specific product of less than 300 bp. The final PCR product was gel-purified and subcloned. Nine colonies were screened by PCR, and all of them gave a product of between 220 and 298 bp. Three of the nine clones were sequenced, and all of them exhibited the same 5′-end 107 bp upstream of the ATG start codon (Fig. 6). These results suggest that the majority of the transcripts from this gene initiate at or near position −107, and it appears that both the 3.6- and 1.1-kb transcripts initiate from this site.
Alternative Splicing-A 3′-RACE experiment was performed to determine the 3′-end of the short 1.1-kb transcript observed on Northern blots (Fig. 5A). One μg of human placental total RNA was used for first-strand cDNA synthesis. This was followed by a first round of PCR (outer PCR), which gave no detectable specific RACE product. One μl of the outer PCR product was used in a second round of PCR with nested primers (inner PCR). Four distinct PCR products of 500, 350, 200, and 100 bp were detectable on a 2% agarose gel. (A smear at the top of the gel was also observed, produced from the full-length transcript.) Based on the 1.1-kb length of the short transcript and the position of the inner primer, the 500- and 350-bp RACE products were gel-purified and cloned separately. Six of the clones were sequenced to determine the 3′-extent of the clones. All of these clones represented the short transcript, in which exon 7 is spliced to a previously unrecognized exon, termed exon 8a, which sits in the intron between exons 7 and 8 (Fig. 7A). Exon 8a appears to be 139 bp long, although in one clone, the 3′-end extended to 162 bp. It contains a stop codon after 45 nucleotides and a polyadenylation signal near its 3′-end. Thus, the 3.6- and 1.1-kb transcripts share the first seven exons and then diverge at exon 8/8a. The 1.1-kb transcript would be translated into a 275-amino acid protein in which the first 260 amino acids are identical to the full-length protein, followed by 15 unrelated amino acids (GenBank™/EBI accession number AY374131) (Fig. 7, B and C).
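The sizes quoted for the two predicted proteins and for exon 8a are linked by codon arithmetic, which can be laid out explicitly. This is a worked check using only numbers stated in the text; the constant names are illustrative.

```python
CODON_NT = 3

FULL_LENGTH_AA = 978          # full-length protein
SHARED_N_TERMINAL_AA = 260    # residues shared by both isoforms (exons 1-7)
EXON_8A_CODING_NT = 45        # nt of exon 8a before the stop codon
EXON_8A_LENGTH_BP = 139       # usual exon 8a length
EXON_8A_EXTRA_BP = 23         # extension seen in one 3'-RACE clone

# Coding region of the full-length cDNA, excluding the stop codon.
full_length_coding_bp = FULL_LENGTH_AA * CODON_NT

# Residues contributed by exon 8a, and the resulting short-isoform length.
exon_8a_aa = EXON_8A_CODING_NT // CODON_NT
short_isoform_aa = SHARED_N_TERMINAL_AA + exon_8a_aa

# Length of the extended exon 8a variant.
exon_8a_extended_bp = EXON_8A_LENGTH_BP + EXON_8A_EXTRA_BP

print(f"full-length coding region: {full_length_coding_bp} bp")
print(f"exon 8a contributes {exon_8a_aa} aa; short isoform is {short_isoform_aa} aa")
print(f"extended exon 8a: {exon_8a_extended_bp} bp")
```

The results (2934 bp, 15 and 275 amino acids, 162 bp) match the figures reported for the full-length cDNA, the short isoform, and the longer exon 8a clone.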
An additional variation was observed upon sequencing the 3′-RACE clones. Several of the clones contained an extra codon at position +643, at the junction between exons 6 and 7 (Fig. 8). This extra valine codon appears to arise from variation in the 3′-splice acceptor site during the splicing of exon 6 to exon 7. The 5′-splice site has the GT consensus sequence as the first 2 nt of the intron. The 3′-splice site has two AG consensus dinucleotides at the 3′-end of the intron. If the first AG dinucleotide is used, exon 7 contains 3 additional nt; if the second is used, these 3 nt are not present in exon 7.
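The effect of choosing between two tandem AG acceptors can be shown with a toy model. The sequences below are entirely hypothetical (they are not the real exon 6, intron 6, or exon 7 sequences) and were chosen only so that the upstream acceptor adds 3 nt that create an in-frame GTA (valine) codon, as a sketch of the mechanism described above.

```python
# Hypothetical sequences for illustration only (NOT the real gene).
EXON_UP = "ATGGCCG"            # upstream exon; ends with the first nt of a codon
INTRON = "GTAAGTTTTCAGTAG"     # starts with GT, ends with two tandem AGs
EXON_DOWN = "CTGAAA"           # downstream exon as defined by the second AG


def splice(exon_up: str, intron: str, exon_down: str, acceptor_a_index: int) -> str:
    """Join exons using the AG whose 'A' sits at acceptor_a_index in the intron.

    Any intron bases after the chosen AG are retained in the mRNA.
    """
    if not intron.startswith("GT"):
        raise ValueError("intron does not begin with the GT donor consensus")
    if intron[acceptor_a_index:acceptor_a_index + 2] != "AG":
        raise ValueError("no AG dinucleotide at the requested acceptor position")
    return exon_up + intron[acceptor_a_index + 2:] + exon_down


first_ag = len(INTRON) - 5    # upstream AG: leaves 3 extra nt in the mRNA
second_ag = len(INTRON) - 2   # downstream AG: removes the whole intron

long_mrna = splice(EXON_UP, INTRON, EXON_DOWN, first_ag)
short_mrna = splice(EXON_UP, INTRON, EXON_DOWN, second_ag)

print(long_mrna, len(long_mrna))
print(short_mrna, len(short_mrna))
```

In this toy case the two products differ by exactly 3 nt, so the reading frame downstream of the junction is preserved and the longer form simply carries one extra codon.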

DISCUSSION
The experiments described here confirm that humans express a mitochondrial C1-THF synthase, with properties very similar to those of the cytoplasmic homologs previously characterized. The full-length human cDNA encodes a protein of 978 amino acids, including an N-terminal mitochondrial targeting sequence. When the full-length cDNA was expressed in CHO cells, the targeting sequence directed the protein exclusively to mitochondria (Fig. 3). Alignment of the deduced amino acid sequence with the human cytoplasmic C1-THF synthase (935 residues) reveals a 62-residue N-terminal extension in the putative mitochondrial protein (Fig. 2). PSORT II analysis (footnote 4) predicts a mitochondrial targeting sequence with a cleavage site between residues 31 and 32. The next 31 residues, before alignment with the cytoplasmic protein begins, include an unusual run of 9 consecutive glycines and several basic residues. A very similar N-terminal extension is predicted for the mouse protein (Fig. 2). Excluding this N-terminal extension, homology to the human cytoplasmic C1-THF synthase is quite high (61% identity), and the putative mitochondrial protein appears to possess the same domain structure. In the cytoplasmic protein, the N-terminal dehydrogenase/cyclohydrolase domain is ~300 residues, and the C-terminal synthetase domain is ~700 residues (9). The two human proteins share 31% identity in the dehydrogenase/cyclohydrolase domains and 73% identity in the synthetase domains, including conserved active-site residues and the 10-formyl-THF-binding site in the synthetase domain (31). However, the putative mitochondrial proteins from human and mouse lack 12 amino acids near the junction between the two domains (position 318) (Fig. 2).
3 Available at www.cbil.upenn.edu/tess.
4 Available at psort.nibb.ac.jp/form2.html.
TABLE I (only a fragment of the final row survives): atttttcag ACTCCT-AAAACC aaccagcaa
a Introns are numbered starting with intron 1 between exons 1 and 2.
b The 5′-end of exon 1, and therefore its length, is based on the longest 5′-RACE clone isolated (see "Results").
c In one 3′-RACE clone, exon 8a extended an additional 23 bp, for a total length of 162 bp.
FIG. 5. Northern blot analysis of mitochondrial C1-THF synthase transcripts in adult human tissues. A human multiple-tissue RNA blot was hybridized with ³²P-labeled probes to the 5′-end (A) or 3′-end (B) of human mitochondrial C1-THF synthase cDNA. In C, the membrane was hybridized with a probe from the 3′-end of the human cytoplasmic C1-THF synthase cDNA. The lanes in each panel contain RNA from (left to right) brain (b), placenta (p), skeletal muscle (s), heart (h), kidney (k), pancreas (p), liver (li), lung (lu), spleen (s), and thymus (t). The schematic diagram below shows the relative locations of the probes used for A and B on the 3.6- and 1.1-kb transcripts.
Expression of the full-length cDNA in yeast revealed elevated 10-formyl-THF synthetase activity, further supporting its identification as the human mitochondrial C1-THF synthase. We were unable to detect increased 5,10-methylene-THF dehydrogenase activity in these cells using either NADP+ or NAD+ as cofactor. Is the human mitochondrial enzyme multifunctional like its yeast counterpart? Given the low identity between the human cytoplasmic and mitochondrial isozymes in the dehydrogenase/cyclohydrolase domain (31%), it is conceivable that the mitochondrial protein has lost these activities. However, other members of this family have diverged as much or more (e.g. yeast Mtd1p (30)) and still retain 5,10-methylene-THF dehydrogenase activity. Another possibility is that the dehydrogenase activity of the human enzyme is simply below detection in crude yeast extracts. Depending on the species, the dehydrogenase activity of these trifunctional enzymes is only one-half to one-tenth that of the synthetase activity (2-9). Finally, the construct we expressed in yeast contained the entire 62-amino acid N-terminal extension. The 10-formyl-THF synthetase activity was found in the soluble cytoplasmic fraction, but not in the mitochondrial fraction (data not shown), suggesting that the presequence was not processed. If it is retained, this extension could interfere with the dehydrogenase/cyclohydrolase activities contained in the N-terminal domain of the protein while leaving the C-terminal synthetase domain unaffected. Resolving these questions will have to await purification of the recombinant enzyme.
Expression of the gene was detected in most human tissues, but transcripts were highest in placenta, thymus, and brain. Expression was low in liver and skeletal muscle and barely detectable in heart. A mouse cDNA has also been identified that predicts a protein with 88% identity to the human protein, including the N-terminal extension, suggesting that this mitochondrial C1-THF synthase will be found in all mammals.
The human gene encoding this enzyme has several interesting features. The gene is large, spanning 236 kilobase pairs on chromosome 6 at 6q25.2. The gene contains 29 exons (Table I), including the alternative exon 8a found in the intron between exons 7 and 8 (Fig. 4). This same intron/exon structure is observed for the mouse homolog found on mouse chromosome 10, except that the alternative exon 8a is absent in the mouse gene. Moreover, the genes for the cytoplasmic C1-THF synthase from rat, mouse, and human have all been shown to contain 28 exons, with introns in nearly identical positions (36,37). This suggests that an ancestral C1-THF synthase gene arose before the divergence of the human and rodent lineages, >75 million years ago (38), and genes encoding the mitochondrial and cytoplasmic isozymes are probably related by a gene duplication event.
The full-length 3.6-kb transcript is encoded in 28 exons. A shorter, 1.1-kb transcript is produced by an alternative splicing event, in which exon 7 is spliced to exon 8a instead of exon 8 (Fig. 7). This transcript encodes a 275-amino acid protein in which the first 260 amino acids are identical to the full-length protein, followed by 15 amino acids not found in any other C1-THF synthase. The first 11 amino acids of these 15 terminal amino acids are also found, with one mismatch, near the C terminus of isoform 2 of the human α1A-adrenergic receptor, and the nucleotide sequence encoding these amino acids has high homology to an Alu repeat subfamily (39). The first 91 nt of exon 8a share 87% identity with the right half of the consensus Alu-Sc subfamily (GenBank™/EBI accession number U14571), suggesting that exon 8a was derived from an Alu element that inserted, in the antisense orientation, into the intron between exons 7 and 8. This insertion is not present in the mouse homolog because Alu elements are found only in primates (40). The exonization and alternative splicing of this Alu sequence in the human mitochondrial C1-THF synthase gene are apparently due to accumulated mutations in the Alu element that produce a functional 3′-splice site (41).
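As an illustration of how a percent-identity figure such as the 87% quoted above is obtained from a pairwise alignment, the following minimal Python sketch counts matching columns in two pre-aligned sequences. The function name `percent_identity` and both toy sequences are hypothetical and are not the actual exon 8a or Alu-Sc sequences; the convention of counting gapped columns as mismatches is likewise an assumption, since alignment programs differ on this point.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over the columns of a pre-aligned sequence pair.

    Columns where both sequences carry a gap ('-') are skipped; a column
    with a gap in only one sequence counts as a mismatch (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 20-nt toy alignment with a single mismatch (illustration only).
toy_exon = "GGCTGAGGCAGGAGAATCGC"
toy_alu = "GGCTGAGGCAGGAGAATTGC"
print(round(percent_identity(toy_exon, toy_alu), 1))  # 19 of 20 columns match
```

If the 91 nt are taken as the alignment length, 87% identity corresponds to roughly 79 identical positions; the exact count depends on how gaps were scored in the original comparison.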
Assuming the short transcript is translated in vivo, it is unlikely that the resulting protein would retain 5,10-methylene-THF dehydrogenase or 5,10-methenyl-THF cyclohydrolase activity. Modeling the human mitochondrial protein sequence onto the x-ray structure of the dehydrogenase/cyclohydrolase domain of the human cytoplasmic C1-THF synthase (42) reveals that exons 8 and 9, which are missing in the short transcript, encode the major portion of the Rossmann fold of the NADP-binding site and a critical α-helix that forms one wall of the folate-binding site. It is likely that a truncated protein lacking these structural elements would not fold into a stable structure and would be rapidly degraded. However, without knowing how the 15 novel amino acids affect the structure, it remains possible that a stable protein with altered function could be produced. Experiments are underway to determine whether a truncated form of the protein is expressed in vivo.
Using RNA from human placenta, a single 5′-transcriptional start site at position −107 was identified by 5′-RACE (Fig. 6). It appears that both the 3.6- and 1.1-kb transcripts initiate from this site because only a single 5′-end was identified. A BLAST search of the Human EST Database with the 5′-end of the human cDNA revealed >100 entries. Four ESTs extended beyond position −107 (Fig. 6). BG481636 (position −276) and BE735249 (position −268) were isolated from choriocarcinoma mRNA; BQ062382 (position −119) and BQ055629 (position −118) were isolated from a lymphoma cell line. Thus, it appears there may be some heterogeneity in the 5′-transcriptional start site, depending on the tissue or cell type.
One additional splicing variation was discovered. Some transcripts contained an extra codon at position +643, at the junction between exons 6 and 7 (Fig. 8). This valine codon appears to arise from alternative usage of AG splice acceptor sites separated by 1 nt. This type of variation in splice site selection has been seen in several other mammalian genes, including human prothymosin-α (43) and the rat transforming growth factor-β type I receptor (44). The extra codon was observed in most of the 3′-RACE clones we sequenced and can be found in numerous human ESTs that represent the full-length transcript. There is no evidence to suggest that this alternative splice site selection is a regulated process. It may simply be due to "sloppiness" in the splicing mechanism when two AG splice acceptor sites fall so closely together.
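The effect of choosing between two closely spaced AG acceptors can be sketched in a few lines of Python. Every sequence below is a hypothetical toy construct chosen only so that the arithmetic works out; none is taken from the actual exon 6/7 junction shown in Fig. 8, and the helper names `splice` and `translate` are illustrative. The sketch assumes the codon spanning the junction is split after its first nucleotide, which is one arrangement in which the 3 extra nucleotides retained by the upstream acceptor insert a single valine without altering the downstream residues.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate complete codons from the first nucleotide onward."""
    mrna = mrna.upper()
    usable = len(mrna) - len(mrna) % 3
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3))


def splice(genomic: str, donor_end: int, acceptor_end: int) -> str:
    """Remove genomic[donor_end:acceptor_end] and join the flanking exons."""
    return (genomic[:donor_end] + genomic[acceptor_end:]).upper()


# Hypothetical toy gene segment (NOT the real sequence).
EXON6_END = "GCTAAGG"             # ends with the first nucleotide of a split codon
INTRON = "gtaagtcttttcttttcag"    # ends with the first (upstream) AG
TANDEM = "tag"                    # 1 nt, then the second (downstream) AG
EXON7_START = "CTGAAACC"
GENOMIC = EXON6_END + INTRON + TANDEM + EXON7_START

donor_end = len(EXON6_END)
first_ag_end = donor_end + len(INTRON)      # exon 7 starts right after the first AG
second_ag_end = first_ag_end + len(TANDEM)  # exon 7 starts right after the second AG

long_form = splice(GENOMIC, donor_end, first_ag_end)
short_form = splice(GENOMIC, donor_end, second_ag_end)

print(translate(long_form))   # first AG used: extra GTA (valine) codon retained
print(translate(short_form))  # second AG used: no extra codon
```

In this toy case the two products differ by exactly 3 nt, and the reading frame downstream of the junction is unchanged, which is why such tandem acceptors add or remove one residue rather than causing a frameshift.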
Based on the x-ray structure of the dehydrogenase/cyclohydrolase domain of the human cytoplasmic C1-THF synthase (42), the extra valine is predicted to reside on the exposed loop between α-helix D2 and β-strand e. This loop is not part of the dehydrogenase/cyclohydrolase active site or the dimerization interface for this domain. It is thus possible that an extra valine at this position could be tolerated without affecting stability or activity of the protein. On the other hand, we do not know how the dehydrogenase/cyclohydrolase and synthetase domains interact, so it will be necessary to express the protein containing the extra valine to determine its effect.
The tissue distribution of the mitochondrial C1-THF synthase is quite different from that of the cytoplasmic isozyme (Fig. 5). Whereas the cytoplasmic transcript is most abundant in liver and kidney, the transcripts for the mitochondrial isozyme are relatively low in those tissues, but highest in placenta, followed by thymus, spleen, brain, and lung. The low expression of the mitochondrial isozyme in liver probably contributed to our earlier difficulties in purifying the protein from liver mitochondria. Although the ratio of the two transcripts varies somewhat from tissue to tissue, both are present in every tissue assayed, even heart (Fig. 5). The short transcript is significantly reduced in brain. Future work will be directed toward understanding the metabolic role of the mitochondrial isozyme and how that role relates to the observed tissue distribution.
The discovery of the human gene for this mitochondrial C1-THF synthase confirms our model for the compartmentation of folate-mediated one-carbon metabolism in mammalian cells (Fig. 1). Based on the well documented existence of a mitochondrial C1-THF synthase in yeast (3,45), we proposed that mammalian mitochondria also contain this trifunctional enzyme (10). All three activities of C1-THF synthase are found in mammalian mitochondria (10). More importantly, intact rat liver mitochondria and mitochondrial extracts were shown to oxidize carbon 3 of serine to formate by the folate-dependent pathway outlined in Fig. 1 (mitochondrial reactions 1-4) (11). However, all our attempts to purify these activities from rat liver mitochondria were unsuccessful.
During this same time period, MacKenzie and co-workers (12, 13) characterized a mammalian bifunctional NAD-dependent 5,10-methylene-THF dehydrogenase/5,10-methenyl-THF cyclohydrolase, originally isolated from ascites tumor cells. This bifunctional enzyme lacks the large C-terminal domain catalyzing the 10-formyl-THF synthetase activity and thus is unable to produce formate. When this enzyme was shown to be localized in mitochondria (14,15), MacKenzie and co-workers (18,19) proposed that mammalian mitochondria lack a trifunctional C1-THF synthase and that this bifunctional NAD-dependent dehydrogenase/cyclohydrolase is the mammalian homolog of the trifunctional mitochondrial enzyme. There are, however, several problems with this proposal. First, the bifunctional enzyme is detectable mainly in transformed mammalian cells and embryonic or non-differentiated tissues (12). Among adult differentiated tissues, NAD-dependent 5,10-methylene-THF dehydrogenase activity is detectable only in rat adrenal tissue, but not adult liver (16). Second, the 5,10-methylene-THF dehydrogenase activity we detected in rat liver mitochondria is dependent on NADP+, not NAD+ (11). Finally, adult rat liver mitochondria are capable of producing formate by the folate-dependent pathway (11), and formate production requires the 10-formyl-THF synthetase activity (Fig. 1, reaction 1) that is missing from the bifunctional enzyme. Clearly, only a trifunctional C1-THF synthase, with an NADP-dependent 5,10-methylene-THF dehydrogenase activity, is consistent with the biochemical data.

FIG. 7. Alternative splicing of the human mitochondrial C1-THF synthase transcript. A, gene structure and splicing pattern. The alternative exon 8a is in the intron between exons 7 and 8. In the 3.6-kb transcript, exon 7 is spliced to exon 8. In the 1.1-kb transcript, exon 7 is spliced to exon 8a. Exon 8a contains a stop codon (black dot) after 15 sense codons and contains a polyadenylation signal (AATAAA) near its 3′-end. There is no homology between exons 8a and 8. B, potential protein products. The long transcript is translated into a 978-amino acid protein, whereas the short transcript is translated into a 275-amino acid protein. The amino acid sequences of the two proteins are identical through the first seven exons, encoding 260 amino acids. The short protein has 15 amino acids from exon 8a in place of the amino acids encoded by exon 8. The junction between the dehydrogenase/cyclohydrolase (D/C) and synthetase (SYN) domains is predicted to lie within amino acids 330-350 (Fig. 2). The asterisks indicate variable 3′-splice site selection at the exon 6/7 junction (see Fig. 8). C, nucleotide and amino acid sequences of the coding sequence of exon 8a. Sequences 3′ to the stop codon are not shown.

FIG. 8. Upper, intron/exon junctions for the 3′-end of exon 6 and the 5′-end of exon 7. The first 2 nt of the intron and the alternative AG splice acceptor sites are underlined. Lower, alternative splicing products if the first (left) or second (right) AG acceptor site is used. The amino acid sequence encoded by each product is shown below the nucleotide sequence, and the extra codon and amino acid are in boldface.
Mitochondrial C1-THF synthase probably supports several metabolic processes in mammalian mitochondria. Folate-mediated one-carbon metabolism is involved in the synthesis of formyl-methionyl-tRNA for mitochondrial protein synthesis (Fig. 1, reaction 8) (46,47) and the oxidation of choline methyl groups via dimethylglycine dehydrogenase and sarcosine dehydrogenase (48). Mitochondrial C1-THF synthase may also play an important role in homocysteine metabolism. Recent studies of patients with nonketotic hyperglycinemia reveal a connection between the mitochondrially localized glycine cleavage system (GCS) (Fig. 1, reaction 5) and homocysteine metabolism. Nonketotic hyperglycinemia is an autosomal recessive brain disease caused by defects in subunits of the GCS, resulting in elevated glycine levels (49). Loss of GCS activity might be expected to cause, in addition to elevated glycine, a deficiency of mitochondrial one-carbon units. Consistent with this hypothesis, two recent studies report mild elevations of homocysteine in the plasma and cerebrospinal fluid of nonketotic hyperglycinemia patients (50, 51). Furthermore, Randak et al. (51) found that the mildly elevated plasma homocysteine levels could be reduced in their three patients by treatment with the one-carbon donor 5-formyl-THF (folinic acid, leucovorin). This observation provides strong evidence that the homocysteine elevations are due to a defect in homocysteine remethylation resulting from a deficiency of one-carbon units. Examination of Fig. 1 suggests two ways in which a loss of mitochondrial GCS activity could cause a deficiency of cytoplasmic one-carbon units. First, as suggested by Van Hove et al. (50), cells lacking a functional GCS might increase the transport of serine into mitochondria for metabolism by serine hydroxymethyltransferase to compensate for the deficiency of mitochondrial 5,10-methylene-THF. This could, in turn, cause a deficiency of serine, and thus one-carbon units, in the cytoplasm.
A second possibility is that formate production is defective in mitochondria from nonketotic hyperglycinemia patients. As we showed both in vitro with rat liver mitochondria (10,11) and in vivo with yeast (28,52), mitochondrial 5,10-methylene-THF is rapidly converted to formate and transported to the cytosol, where it is activated to 10-formyl-THF via cytoplasmic 10-formyl-THF synthetase (Fig. 1, mitochondrial reactions 3, 2, and 1 and cytoplasmic reaction 1). The 10-formyl-THF is then reduced by cytoplasmic reactions to the 5-methyl-THF required for homocysteine remethylation. Consistent with this explanation is the observation that GCS activity is stimulated by glucagon (53), and glucagon lowers plasma homocysteine in rats (54), presumably by increasing the mitochondrial production of formate. An elegant stable isotope study in humans (55) provides further support for the role of mitochondrial one-carbon units in the remethylation of homocysteine. Gregory et al. (55) showed that both cytoplasmic and mitochondrial one-carbon units end up in the methyl group of methionine following infusion of deuterated serine. This result strongly supports mitochondrial formate production as a significant contributor to cytoplasmic one-carbon units in vivo in mammals and places the mitochondrial C1-THF synthase in the center of this pathway.