The Structure of the Gene for Murine CTP:Phosphocholine Cytidylyltransferase, Ctpct

Phosphatidylcholine (PC) is the most abundant eukaryotic phospholipid and serves critical structural and cell-signaling functions. CTP:phosphocholine cytidylyltransferase (CT) is the rate-limiting enzyme in the CDP-choline pathway of PC biosynthesis, which is utilized by all tissues and is the sole or major PC biosynthetic pathway in all non-hepatic cells. Herein, we present the complete structure of the murine CT (Ctpct) gene. One P1 genomic clone and six subsequent plasmid subclones were isolated and analyzed for the exon-intron organization of the Ctpct gene. The gene spans approximately 26 kilobases and is composed of 9 exons and 8 introns. The exons match the distinct functional domains of the CT enzyme: exon 1 is untranslated; exon 2 codes for the nuclear localization signal domain; exons 4-7 encompass the catalytic domain; exon 8 codes for the α-helical membrane-binding domain; and exon 9 includes the C-terminal phosphorylation domain. Two transcriptional initiation sites, spaced 35 nucleotides apart, were identified using 5′-rapid amplification of cDNA ends polymerase chain reaction. The 5′ natural flanking region was found to lack TATA or CAAT boxes and to contain GC-rich regions, which are features typical of promoters of housekeeping genes. Several sites that have the potential to interact with transcription regulatory factors, such as Sp1, AP1, AP2, AP3, Y1, and TFIIIA, were identified in the 5′-region of the gene and found to be distributed in two distinct clusters. These data will provide the basis for future studies on the cis- and trans-acting factors involved in Ctpct gene transcription and for the creation of induced mutant mouse models of altered CT activity.

The biosynthesis of phosphatidylcholine (PC), which serves critical structural and cell-signaling functions, involves three major enzymatic steps: phosphorylation of choline, synthesis of CDP-choline from choline-phosphate and CTP, and transfer of choline-phosphate from CDP-choline to diacylglycerol to form PC (1,2). This pathway, called the CDP-choline, or Kennedy, pathway, is the major or sole one used by all extrahepatic tissues. PC biosynthesis in hepatocytes also occurs via this pathway, but this cell type also uses an alternative pathway in which phosphatidylethanolamine is converted to PC by phosphatidylethanolamine N-methyltransferase (PEMT) (1,2). The rate-limiting step in the Kennedy pathway is the synthesis of CDP-choline, which is catalyzed by the enzyme CTP:phosphocholine cytidylyltransferase (CT). CT cDNAs from several different species, including rat (3), mouse (4,5), hamster (6), and human (5), have been cloned and sequenced. All of these CT cDNAs encode a CT protein of 367 amino acids, and the sequences are highly homologous among the different species.
CT exists in both soluble and nonintegral membrane-bound forms and is subject to both pre-translational (7-9) and post-translational regulation (1,2). Post-translational regulation may involve binding of CT to membranes in cells as well as changes in C-terminal phosphorylation. In addition, CT mRNA levels increase with growth factor stimulation of certain cells (7), after partial hepatectomy in rat liver (9), and during differentiation, and CT mRNA decreases after overexpression of PEMT2 in hepatoma cells (8). Whether the changes in mRNA levels under these conditions are due to changes in CT gene transcription or to changes in CT mRNA stability (cf. Ref. 7) has not been fully investigated. Importantly, all of these regulatory studies have been conducted in vitro or in cultured cells, and, in a few cases, results obtained from different laboratories have been contradictory. Thus, the physiological regulation of CT activity, particularly in vivo, is an important area for further investigation.
Our laboratory has recently become interested in the physiology and regulation of CT during atherogenesis (10-12). We found that free cholesterol loading of macrophages, an important event in advanced atherosclerosis, leads to the induction of CT activity, PC biosynthesis, and PC mass (10,11). This response helps the macrophages adapt to potentially toxic levels of cellular free cholesterol, and failure of this response may be one cause of an important lesional event, namely macrophage necrosis (12). To test these ideas in vivo using induced mutant mouse models, it became necessary for us to determine the structure of the murine CT, or Ctpct, gene. This information should also be useful in addressing some of the uncertainties regarding pre- and post-translational CT regulation described above. Given the central importance of CT, it is surprising that only a small part of the structure of the Ctpct gene has been reported thus far (4). Herein, we present the complete structure of the murine Ctpct gene, which reveals a relationship between exon organization and functional domains of CT, the existence of two transcriptional initiation sites, and the presence of several potential 5′-upstream cis-elements that may be involved in gene transcription.

EXPERIMENTAL PROCEDURES
Materials-All chemical reagents were purchased from either Sigma or Fisher. All restriction endonucleases and other enzymes were from New England Biolabs (Beverly, MA). The [α-32P]dCTP was purchased from DuPont NEN. The random primer labeling kit, the 5′- and 3′-RACE PCR kits, and all synthesized primers were from Life Technologies, Inc. The DNA preparation kit was obtained from Qiagen (Chatsworth, CA), and Sequenase Version 2.0 DNA sequencing kit was from U. S. Biochemical Corp. The Taq DNA polymerase was from Perkin-Elmer. QuickHyb solution and Taq extender PCR additive were from Stratagene (La Jolla, CA). RNA Zol B was from Tel-Test "B", Inc. (Friendswood, TX), and the TA cloning kit was from Invitrogen (San Diego, CA).
Isolation and Characterization of a Complete Murine CT Genomic Clone-A PCR product amplified from rat CT cDNA (generously supplied by Dr. R. B. Cornell, Simon Fraser University, Canada) by a set of primers (Nos. 820 and 976 as shown in Table I) was used for screening a mouse 129/J stem cell genomic DNA library in a P1 vector (Genome Systems, Inc., St. Louis, MO). Four P1 clones (Nos. 4901, 4902, 4903, and 4904) containing the Ctpct gene were obtained. These four P1 clones were further confirmed by Southern blot analysis using three PCR products as probes in QuickHyb solution (Stratagene) following the protocol of the manufacturer. The three probes, which were located in the 5′-, internal, and 3′-regions of the CT cDNA, were produced by PCR using primers Nos. 137 and 305 (5′), 820 and 976 (internal), and 1003 and 1177-A (3′), respectively, as shown in Table I. The three probes were labeled with α-32P using a random primer labeling kit (Life Technologies, Inc.) following the protocol of the manufacturer. One of the P1 clones, No. 4904, contained the entire Ctpct gene (see Fig. 1) based on the results of the Southern blot. Clone No. 4904 was then digested with either EcoRI or PstI and subcloned into pBluescript KS+ vector, and six subclones containing Ctpct gene fragments were identified by Southern blot using CT cDNA as the probe.
Identification of the Structure of the Murine Ctpct Gene-The six subclones mentioned above were sequenced to determine exon sequences and exon-intron boundaries. In addition, the 5′-upstream region of clone 6 was sequenced to identify potential cis-elements involved in binding transcriptional factors. DNA was prepared using the Qiagen DNA preparation kit following the protocol of the manufacturer. DNA sequencing was performed either automatically using an automated sequencer (Applied Biosystems/Perkin-Elmer model 373A) in the DNA Core facility of Columbia University or manually by the dideoxynucleotide chain termination method using Sequenase Version 2.0 DNA sequencing reagents (U. S. Biochemical Corp.) following the protocol of the manufacturer. The primers used for DNA sequencing are listed in Table I. Sequences were analyzed using the Wisconsin GCG software package (13). Exon sequences and exon-intron boundaries were defined by comparison with both murine CT cDNA (4, 5) and rat CT cDNA (3). The intron sizes were defined by PCR using P1 clone 4904 as template and the primers listed in Table I. For introns less than 3 kb, the following PCR conditions were used: initial cycle of 94°C for 1 min, 50°C for 1 min, and 72°C for 2 min; followed by 30 cycles of 94°C for 30 s, 50°C for 30 s, and 72°C for 2 min; plus a final extension of 72°C for 6 min. For introns larger than 3 kb, Taq extender PCR additive (Stratagene) was added to the PCR reaction following the instructions of the manufacturer, and the PCR conditions were adjusted as follows: initial cycle of 94°C for 1 min, 50°C for 1 min, and 72°C for 10 min; followed by 30 cycles of 94°C for 30 s, 50°C for 30 s, and 72°C for 8 min; plus a final extension of 72°C for 10 min.
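As a rough aid to reading the two cycling programs above, the following Python sketch adds up their nominal hold times. The step durations are transcribed from the text; the function name `program_seconds` is illustrative only, and ramp times between temperatures are ignored, so actual run times on a thermal cycler would be longer.

```python
def program_seconds(initial, cycle, n_cycles, final_extension):
    """Sum the nominal hold times (in seconds) of a PCR program.

    initial and cycle are tuples of (denature, anneal, extend) hold times;
    ramp times between temperatures are not modeled.
    """
    return sum(initial) + n_cycles * sum(cycle) + final_extension


# Program for introns smaller than 3 kb (times taken from the text).
short_program = program_seconds(
    initial=(60, 60, 120), cycle=(30, 30, 120), n_cycles=30, final_extension=360
)

# Program for introns larger than 3 kb, run with Taq extender additive.
long_program = program_seconds(
    initial=(60, 60, 600), cycle=(30, 30, 480), n_cycles=30, final_extension=600
)

print(short_program / 60, "min")  # nominal hold time, short program
print(long_program / 60, "min")   # nominal hold time, long program
```

Under these assumptions the long-intron program holds for roughly three times as long as the short one, almost entirely because of the longer 72°C extension step.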
Determination of CT Transcriptional Initiation Sites-5′-RACE (or "anchored") PCR was conducted to determine the transcriptional initiation sites of the murine Ctpct gene, using a 5′-RACE PCR kit (Life Technologies) according to the protocol of the manufacturer. Briefly, total RNA was isolated from the livers of 10-week-old female 129/J mice using RNA Zol B following the instructions of the manufacturer. RT-PCR was performed using SuperScript reverse transcriptase, primer No. 382 (Table I), and the above RNA (as template), as shown in Fig. 2A. The resulting first strand of 5′ CT cDNA fragment was purified using a GlassMax spin cartridge. Then, oligo-dC was added to the 5′-end by terminal deoxynucleotidyl transferase. Finally, PCR was performed using the 5′-tailed CT cDNA fragment as template together with primer No. 305 (Table I) and an abridged anchor primer (5′-GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG-3′). The resulting PCR products were probed by Southern blot with the 32P-labeled 5′-CT cDNA PCR product (using primer Nos. 137 and 305) employed previously for screening the 5′-end of the genomic clones (see above). PCR products that hybridized to this 5′-probe were cloned into pCR2.1 vector using a TA cloning kit (Invitrogen) following the protocol of the manufacturer. The CT transcriptional initiation sites were identified by sequencing of the above clones using primer No. 305.
Determination of the 3′-UTR of CT cDNA-3′-RACE PCR was employed to determine the 3′-UTR of CT cDNA using a 3′-RACE PCR kit (Life Technologies) following the protocol of the manufacturer. Briefly, total RNA was isolated as described above. RT-PCR was performed using an adapter primer (5′-GGCCACGCGTCGACTAGTACTTTTTTTTTTTTTTTTT-3′). The resulting first strand of CT cDNA was used as template for further PCR using primer No. 1177-S (Table I) and the adapter primer. The PCR product was cloned into pCR2.1 vector and sequenced.

RESULTS
Sequences of the Exon-Intron Boundaries and Sizes of the Introns-As described in detail under "Experimental Procedures," four clones were isolated from a murine 129/J stem cell genomic DNA library in the P1 vector, and one of these genomic clones (No. 4904) was found to contain the entire Ctpct gene by Southern blot using various CT cDNA probes (Fig. 1). Clone No. 4904 was digested with either EcoRI or PstI and subcloned into pBluescript KS+ vector. Six subclones encompassing parts of the Ctpct gene were identified by Southern blot analysis, and these subclones were sequenced. As shown in Fig. 1, clone 1 (2 kb) contains the 3′ portion of intron I and the 5′ portion of exon 2; clone 2 (3 kb) contains the 3′ portion of intron I, all of exon 2, and the 5′ portion of intron II; clone 3 (4 kb) contains the 3′ portion of intron II, all of exon 3, and the 5′ portion of intron III; clone 4 (12 kb) covers the 3′ portion of intron III, all of exons 4-8 and introns IV-VII, and the 5′ portion of intron VIII; clone 5 (4 kb) contains the 3′ portion of intron VIII, all of exon 9, and the 3′-NFR; and clone 6 contains the 5′ portion of intron I, all of exon 1, and the 5′-NFR.
The Ctpct gene is approximately 26 kb in length, which is ~17 times the length of the CT cDNA. The gene is composed of 9 exons interrupted by 8 introns. Exon 1 contains the 5′-UTR, which is interrupted by intron I 10 base pairs upstream of the ATG start codon in exon 2; exon 9 contains the 3′-UTR (Figs. 1 and 4). The sizes of the exons range from 72 to 548 bp (Table II), and the sizes of the introns, which were estimated by PCR amplification using a pair of primers located on flanking exons (see Table I), range from 0.5 kb to 6 kb (Fig. 1 and Table III). All exon-intron boundaries were sequenced and are listed in Table III. The boundary sequences at the 5′- and 3′-ends of all of the introns are GT and AG, respectively (Table III), which are the consensus sequences for pre-mRNA splicing donor and acceptor sites (14). As described in detail under "Discussion" and depicted in Fig. 1, the organization of the exons of the Ctpct gene is related to the distinct functional domains of the CT enzyme.
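The GT/AG observation above can be expressed as a simple check on intron sequences. The following Python sketch is illustrative only: the function name `follows_gt_ag_rule` and the short example sequences are hypothetical and are not taken from Table III.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if an intron sequence starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical toy sequences (not the Ctpct intron boundaries).
example_introns = {
    "toy_intron_1": "GTAAGTCTTTCCAG",
    "toy_intron_2": "gtgagtttttacag",  # lower case is accepted
    "toy_noncanonical": "GCAAGTCTTTCCAG",  # starts with GC, not GT
}

results = {name: follows_gt_ag_rule(seq) for name, seq in example_introns.items()}
for name, ok in results.items():
    print(name, "GT-AG" if ok else "non-canonical")
```

Applied to real data, the input would be the intron sequence from the first base after the upstream exon to the last base before the downstream exon.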
Determination of Transcriptional Initiation Sites and Putative Promoter Region of the Murine Ctpct Gene-The transcriptional initiation sites of the Ctpct gene were determined by 5′-RACE PCR. PCR was performed using a 5′-fragment of 129/J murine liver CT cDNA tailed with oligo-dC as template; the primers used were an abridged anchor primer and primer No. 305 (Table I and Fig. 2A). The reaction generated products of two distinct sizes, 400 and 370 bp, as shown in Fig. 2B. Southern blots of these PCR products hybridized with a 5′-CT cDNA probe (Fig. 2B), indicating that both products were 5′-CT cDNA fragments with different lengths. 5′-RACE PCR was repeated using a different preparation of total RNA from 129/J mouse liver, and the result was the same. Therefore, there appear to be two transcriptional initiation sites utilized in the Ctpct gene. To identify these sites, the 400- and 370-bp PCR products were each cloned into pCR2.1 vector and sequenced using primer No. 305 (Table I). As shown in Fig. 3, one of the start sites is 82 nucleotides upstream of the ATG start codon and marked as position +1, and the other is 47 nucleotides upstream of the ATG and marked as position +35. Both start with A, a purine, the most common transcription-initiating nucleotide.

Clone 6 (Fig. 1), which contained a 4-kb PstI digestion fragment from the Ctpct gene, was identified by Southern blot using exon 1 as a probe. Clone 6 contains approximately 2.6 kb of the 5′-NFR of the Ctpct gene, all of exon 1 (72 bp), and approximately 1.4 kb of intron I. We sequenced 600 bp of DNA immediately upstream of the first transcriptional initiation site (position +1). In analyzing this putative promoter region of the Ctpct gene (Fig. 3), we found characteristics that are typical of housekeeping gene promoters, such as no TATA or CAAT boxes and a high G + C content (see "Discussion" and Refs. 15-17). Computer analysis of the 5′-NFR revealed a number of potential transcription factor-binding sites, as indicated in Fig. 3. A total of five potential Sp1-binding sites (GC boxes) (18) were located at positions −9, −58, −66, −70, and −144; the sites at −66 and −70 overlap. An AP1 site (ATGAGTCAA) was located at position −350, an AP2 site (GCCGGCGGG) at position −324, and an AP3 site (TGTGGTTT) at position −107. Two TFIIIA sites (CGGGCTCGAA and CAGGTCGGAA) were located at positions −319 and −381. Two reversed Y1 sites (AGAGGGCGGG and AGCCGGCGGG) were located at positions −73 and −325; the first overlaps with an Sp1 site, and the second overlaps with AP2 and TFIIIA sites.
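The kind of computer search described above can be sketched in a few lines of Python. This is not the program used in the study (the text cites the Wisconsin GCG package for sequence analysis). It is a minimal illustration under stated assumptions: the AP1 and AP3 motifs are the exact sequences quoted in the text, `GGGCGG` is assumed as the GC-box core for Sp1, the upstream sequence is a hypothetical toy string, and each hit is reported by the position of its leftmost base with the base immediately upstream of +1 numbered −1 (the numbering convention of Fig. 3 may differ).

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(upstream: str, motifs: dict) -> list:
    """Scan both strands of an upstream region for exact motif matches.

    Returns (name, position, strand) tuples sorted by position, where
    position is negative and counts back from the start site (+1).
    """
    upstream = upstream.upper()
    n = len(upstream)
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            # A lookahead match lets overlapping sites be reported separately.
            for m in re.finditer(f"(?={pattern})", upstream):
                hits.append((name, m.start() - n, strand))
    return sorted(hits, key=lambda h: h[1])


# AP1 and AP3 as quoted in the text; the Sp1 core is an assumed GC box.
MOTIFS = {"Sp1": "GGGCGG", "AP1": "ATGAGTCAA", "AP3": "TGTGGTTT"}

# Hypothetical 39-nt toy sequence, not the Ctpct promoter.
toy_upstream = "ATATGAGTCAACTTGTGGTTTACGGGCGGTACCGCCCAT"

for name, pos, strand in find_sites(toy_upstream, MOTIFS):
    print(f"{name}\t{pos}\t{strand}")
```

Exact matching is the simplest choice here; real consensus searches usually allow degenerate positions or mismatches, which this sketch deliberately omits.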
Determination of the 3′-UTR of CT cDNA and its Location on the Ctpct Gene-3′-RACE PCR (see "Experimental Procedures" and Fig. 4A) yielded a 370-bp product whose sequence is shown in Fig. 4B. The sequence from the stop codon (TAA) to the terminal adenine, which is the site where polyadenylation occurs, is 341 bp (3′-UTR). Although a 3′-terminal adenine is typical (19), an unusual polyadenylation signal sequence (AATATA) was located 14 bp upstream of the site where the poly(A) tail is added during the maturation of CT mRNA (see Ref. 19).² As depicted in Fig. 4C, the 3′-UTR was located on exon 9 of the Ctpct gene (i.e. without an intervening intron) since the sequence of the 3′-RACE PCR product (above) was identical to the sequence of the 3′-end of exon 9 of the Ctpct gene (from clone 5 in Fig. 1). The conserved sequence (TTTTTT), which is one of several possible downstream signals required for efficient eukaryotic mRNA polyadenylation (20), was located in the 3′-NFR of the Ctpct gene, 50 bp downstream of the 3′-terminus of exon 9 (Fig. 4C).

FIG. 2. ... Table I, and the sequence of the anchor primer is included under "Experimental Procedures." The ~400 and ~370 bp notations below the PCR scheme refer to the two PCR products identified by this procedure, indicating two transcriptional initiation sites. B, left, the PCR products on 1% agarose gel (PCR); and right, autoradiogram after hybridization with 32P-labeled CT cDNA (Southern blot).

FIG. 3. Transcriptional initiation sites and consensus sequences of transcription factor-binding elements in the putative 5′-promoter region of murine Ctpct gene. The sequence shows the 72 bp of exon 1 (positive numbers) and 600 bp of the putative 5′-promoter region (negative numbers). Two transcriptional initiation sites are indicated by the arrows (nucleotides +1 and +35). The consensus sequences of transcription factor-binding elements are boxed.

DISCUSSION
The structure of the murine Ctpct gene reveals several interesting points. First, the organization of the exons has a distinct relationship to the functional domains of the CT protein (Fig. 1). Exon 2 encodes the first 39 amino acid residues of CT, which include a signal sequence (residues 8-28) that targets CT to the nucleus (21), where the enzyme is localized in certain cell types (22). Exon 3 encodes residues 40 through 72, for which no specific function has been reported. Exons 4-7 code for the catalytic domain (residues 73-236) (3); exon 4 contains the codons for an HSGH motif (residues 89-92), which is thought to mediate binding of CTP by the enzyme (23). Exon 8 encodes amino acids 237-299, which contain a 58-residue α-helix containing three contiguous 11-residue repeats (residues 256-288); this α-helix is thought to play an important role in the membrane-binding properties and enzymatic activity of CT (5,24,25). This exon also encodes a densely positively charged region, a cluster of five lysine residues within a 7-amino acid stretch (residues 248-254), the function of which is unknown.
Exon 9 codes for the C-terminal part of the CT protein (residues 300-367), which includes a second α-helix (shown not to be necessary for membrane binding (26)) and multiple serine residues that become phosphorylated in vivo (27,28); phosphorylation of these serines may interfere with the binding of CT to membranes (29,30) and with the activation of CT by lipid activators, such as PC/oleic acid liposomes (31). Interestingly, the catalytic domain of mammalian CT is highly homologous to yeast CT while the nuclear localization signal, membrane-binding, and phosphorylation domains of mammalian CT are not (24,32). Thus, it appears that the exons encoding the basic catalytic unit of CT evolved first and were later embellished with additional exon cassettes resulting in more complex post-translational regulatory control.
Other interesting features of the murine Ctpct gene include the use of two transcriptional initiation sites, the presence of an untranslated exon 1 that is approximately 6 kb upstream from the initiation codon in exon 2, and the large size of the gene. Other genes involved in lipid biosynthesis, transfer, and metabolism also have an untranslated first exon, including the genes encoding phosphatidylethanolamine N-methyltransferase-2 (PEMT2) (33), apolipoproteins A-I, A-II, C-II, C-III, and E (34), and phospholipid transfer protein (35). The gene for PEMT2, a 199-amino acid integral membrane protein that catalyzes the synthesis of PC in hepatocytes, has three other features in common with the Ctpct gene, namely two transcriptional initiation sites, a very large size (~30-fold larger than its cDNA versus ~20-fold larger for the Ctpct gene), and the absence of a TATA or CAAT box in the 5′-upstream region of the gene (see below) (33). Whether these similarities denote common transcriptional regulatory features between the two PC biosynthetic genes must await further studies on both genes.
In this regard, the cloning of the Ctpct gene should lead to future studies directed at understanding how transcription of the CT gene is regulated. The 5′-upstream sequence revealed no TATA or CAAT box, but this region is rich in G + C (71% in the first 350 upstream nucleotides) and contains five GC boxes corresponding to consensus Sp1-binding sites (36) (Fig. 3). Sp1-binding sites have been shown to be present in promoters of numerous viral and cellular genes and are generally located 40-100 nucleotides upstream of the transcriptional initiation sites (36); in the putative Ctpct promoter region, three potential Sp1-binding sites (−58, −66, and −71) are present in this location. Lack of a TATA or CAAT box, multiple origins of transcription, and GC-rich Sp1-binding sites are often found together and are typical of housekeeping genes (37-41). Other consensus transcriptional factor-binding sites found in the 5′-upstream region of the Ctpct gene include those for AP1, AP2, AP3, TFIIIA, and Y1 (Fig. 3); these sites and those for Sp1 are concentrated in two areas of the putative promoter region, namely between nucleotides −14 and −140 and between −310 and −392 (Fig. 3). Future studies will determine whether Sp1 and these other factors are involved in basal transcription of the gene or in transcriptional regulation, such as might occur during growth factor stimulation of certain cells (7), after partial hepatectomy in rat liver (9), and after overexpression of PEMT2 in hepatoma cells (8).
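The G + C figure quoted above is a simple base-composition fraction over a fixed window. As a minimal Python sketch, assuming the upstream sequence is supplied as a plain string ending at the base immediately before position +1 (the function names and the toy sequence are hypothetical, not the Ctpct promoter):

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def upstream_gc(upstream: str, window: int = 350) -> float:
    """G + C fraction of the last `window` bases before the start site."""
    return gc_fraction(upstream[-window:])


# Hypothetical toy sequence; a real call would pass the sequenced 5'-NFR.
toy = "ATATATATAT" + "GCGCGGCCAT"
print(f"{upstream_gc(toy, window=10):.0%}")  # G + C of the last 10 bases only
```

With `window=350` and the 600 bp of sequenced upstream DNA as input, this is the calculation behind a statement such as "71% in the first 350 upstream nucleotides."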
The major impetus for our laboratory to clone the Ctpct gene was related to our interest in the role of the CT enzyme and PC biosynthesis during atherogenesis (10-12) and our plans to study this relationship in vivo using induced mutant mice. Cloning of the gene was necessary for future gene targeting to create induced mutant mouse models of altered arterial wall PC biosynthesis. Additional induced mutant mouse models using CT constructs mutated in regions thought to be important in post-translational regulation (e.g. phosphorylation or membrane-binding domains) and in the consensus cis-acting sequences of the putative promoter region of the Ctpct gene will also be useful in understanding the regulation of CT in vivo.

² The typical eukaryotic upstream polyadenylation signal is AATAAA; 3.4% of 269 surveyed vertebrate polyadenylated cDNA sequences, however, have been found to have the signal found in the Ctpct gene (AATATA) (19).

FIG. 4. 3′-UTR of CT cDNA and its location on the Ctpct gene. A, illustration of the 3′-RACE PCR strategy using 129/J murine liver RNA (see "Experimental Procedures" for details). The sequence of primer No. 1177-S is shown in Table I, and the sequence of the adapter primer is included under "Experimental Procedures." The ~370 bp notation below the PCR scheme refers to the size of the 3′-RACE PCR product. B, sequence of 3′-UTR of CT cDNA. The stop codon (TAA) is boxed, and the unusual polyadenylation signal sequence, AATATA (see text), is underlined; the total length of the 3′-UTR is 341 bp. C, the location of the 3′-UTR and the downstream polyadenylation signal (TTTTTT) on the Ctpct gene.