Structure of the m1 Muscarinic Acetylcholine Receptor Gene and Its Promoter*

The m1 receptor is one of five muscarinic receptors that mediate the metabotropic actions of acetylcholine in the nervous system, where it is expressed predominantly in the telencephalon and autonomic ganglia. RNase protection, primer extension, and 5′-rapid amplification of cDNA ends analysis of a rat cosmid clone containing the entire m1 gene demonstrated that the rat m1 gene consists of a single 657-base pair (bp) non-coding exon separated by a 13.5-kilobase (kb) intron from a 2.54-kb coding exon that contains the entire open reading frame. The splice acceptor for the coding exon starts at −71 bp relative to the adenine of the initiating methionine. This genomic structure is similar to that of the m4 gene (Wood, I. C., Roopra, A., Harrington, C. A., and Buckley, N. J. (1995) J. Biol. Chem. 270, 30933–30940 and Wood, I. C., Roopra, A., and Buckley, N. J. (1996) J. Biol. Chem. 271, 14221–14225). Like the m4 gene, the m1 promoter lacks TATA and CAAT consensus motifs, and the first exon and 5′-flanking region are not GC-rich. The 5′-flanking region also contains the consensus regulatory elements Sp-1, NZF-1, AP-1, AP-2, E-box, NFκB, and Oct-1. Unlike the m4 promoter, there is no evidence of a RE1/NRSE silencer element in the m1 promoter. Deletional analysis and transient transfection assays demonstrate that reporter constructs containing 0.9 kb of 5′-flanking sequence and the first exon are sufficient to drive cell-specific expression of a reporter gene in IMR32 neuroblastoma cells while remaining silent in 3T3 fibroblasts.

G-protein-coupled receptors are responsible for mediating a vast amount of intercellular communication throughout the body, especially in the nervous system. Current estimates put the size of this gene superfamily well in excess of 1000, making it one of the largest in the mammalian genome. In situ hybridization studies indicate that each individual member probably has a unique expression profile within the nervous system, yet the factors that determine and direct the expression patterns of the members of this gene family are largely unknown. Since the response of any neuron to a neurotransmitter is determined by the repertoire of receptors expressed at the cell surface, it is essential to understand the mechanisms that determine the types of receptor gene expressed by individual neurons. The five muscarinic receptors are encoded by a subfamily of this gene superfamily (1,2), and their gene products are responsible for mediating the metabotropic actions of acetylcholine in the nervous system and its effector tissues (3). Each of the five muscarinic receptor genes is differentially expressed throughout the central and autonomic nervous systems both in adulthood (4,5) and during embryonic development,¹ and each of the receptors exhibits a unique pharmacological profile (6,7). The m1 receptor is the most abundant subtype in both the central and autonomic nervous systems and is found predominantly in the telencephalon (4), autonomic ganglion cells (4,8,9), and exocrine tissue (10). Activation of the m1 receptor leads to numerous responses, including stimulation of phospholipases C (7) and A2 (11), inhibition of cAMP production (5), activation of K+ and Cl− channels (12), and inhibition of opening of K+ M channels in neural cell lines and in sympathetic and hippocampal neurones (13,14).
We have previously described the structure of the rat m4 gene and its promoter (5,15,16,17) and have shown that expression is silenced in non-neuronal cells (16) via a RE1/NRSE (18,19) type silencing element. There are many locations, such as the cerebral cortex, hippocampus, striatum, and autonomic ganglia, that co-express m1 and m4 receptor genes. Equally important in view of the silencing of the m4 promoter, there are many locations that express neither the m1 nor the m4 gene, such as most regions of the mesencephalon and rhombencephalon and most non-neuronal tissue. Yet even within areas that co-express m1 and m4, not all individual cells express both genes. There is, therefore, an intimate matrix of overlapping and nonoverlapping expression profiles between these homologous family members. We were thus interested in identifying the control regions of the m1 promoter to ascertain whether the m1 and m4 genes share a common gene structure and regulatory elements. As a first step toward addressing this issue, we present here the first description and analysis of the structure of the m1 muscarinic receptor gene and its promoter.
5′-RACE-Total RNA was extracted from rat cerebral cortex according to Chomczynski and Sacchi (20) and reverse transcribed using either AMV-RT (Seikagaku) or Tth polymerase (Promega) and a mixture of random hexamers and oligo(dT). Some reverse transcriptions were carried out in the presence of methylmercuric hydroxide to denature the RNA. Reverse transcriptions with Tth polymerase were carried out by incubating 2 μg of rat cerebral cortex RNA, 200 ng of random hexamers, and reaction buffer (Promega) for 5 min. 1.8 mM dNTPs, 1 mM MnCl2, and 6 units of Tth polymerase (Promega) were then added, and the reaction was allowed to proceed at 42°C for 5 min followed by 75°C for 10 min. 5′-RACE was then carried out essentially according to Frohman (21) using Taq polymerase and two nested gene-specific primers (Rm1-7a, ACTGACAGCAGGGGGCACTGAGGT, and Rm1-62a, ACAGGTTTTTCCTCAGAGAAA) to improve specificity in subsequent rounds of PCR, and the final PCR products were cloned directly into pCRII (Invitrogen). DNA from individual colonies was sized and sequenced.
DNA Sequencing-All sequencing was performed using Taq polymerase, fluoresceinated dye terminators, and an Applied Biosystems automated DNA sequencer.
RNase Protection Analysis-DNA fragments from the m1 gene spanning the regions +136 to +527, −805 to +189, and −805 to +418 were generated by PCR and cloned into pCR3 (Invitrogen). Plasmids containing the appropriate orientation of insert were digested with HindIII and gel purified. Antisense RNA probes were synthesized using Sp6 RNA polymerase (Promega) and labeled with [α-32P]UTP (3000 Ci/mmol, Amersham Life Sciences). The reaction was run on a 4% polyacrylamide gel and exposed to x-ray film; full-length probe was excised from the gel, and the gel slice was incubated overnight in 10 mM Tris, 1 mM EDTA (pH 7.4), 0.1% SDS to elute the probe. Approximately 1 × 10⁵ dpm of probe was precipitated with RNA and hybridized overnight in 30 μl of 80% formamide, 400 mM NaCl, 40 mM PIPES (pH 6.8), 1 mM EDTA. Nonhybridized RNA was digested with RNase A/RNase T1, purified, and analyzed on a 6% denaturing polyacrylamide gel. A DNA sequencing reaction performed using Sequenase 2 (U. S. Biochemical Corp.) was run as a size marker.
Reporter Constructs and Assays-Reporter plasmids were constructed using the luciferase reporter vector, pGL3 basic (Promega). A 9-kb BamHI fragment including the whole of the non-coding exon and flanking regions (Fig. 1) was cloned into pGEM3Zf+ (Promega), cut with PstI, and religated to generate B/P4.0pGEM3Zf+. This in turn was used to generate a 2-kb HindIII fragment that was cloned into the HindIII site of pGL3 (Promega) to yield H/P-1.4/+0.6pGL3. A 0.75-kb Acc65I/EcoRI fragment was excised from B/P4.0pGEM3Zf+ and cloned into Acc65I/EcoRI-cut H/P-1.4/+0.6pGL3 to yield K/P-1.6/+0.6pGL3. Exonuclease digestion (Erase-A-Base, Promega) of H/P-1.4/+0.6pGL3 was used to generate ΔH/P-1.3/+0.6pGL3. S/P-0.88/+0.6pGL3 was generated by excising a 0.5-kb SacI fragment from H/P-1.4/+0.6pGL3 followed by religation. Qiagen column-purified DNA (routinely, 250 ng of DNA/10-mm well) was transfected into cells using LipofectAMINE (Life Technologies, Inc.). Transfections were carried out in 10-mm wells using 0.25 μg of DNA and 1.25 μl of LipofectAMINE applied for 4-16 h. These conditions gave transfection efficiencies of 5-50% as judged by β-galactosidase histochemistry. One day after transfection, cells were harvested into Reporter Passive Lysis Buffer (Promega). Luciferase measurements were carried out using the Promega dual luciferase assay system according to the manufacturer's instructions in a Turner TD-20e luminometer. Cells were cotransfected with pRL-CMV (0.3 ng of DNA/10-mm well, representing a 1:250 ratio of pRL:test construct), which contains the Renilla cDNA driven by the CMV promoter, and Renilla luminescence was measured after quenching luciferase luminescence and adding Renilla substrate with Stop & Glo (Promega) according to the manufacturer's instructions. Luciferase luminescence was then normalized to Renilla luminescence, and results were expressed relative to normalized luminescence driven from the promoterless pGL3 basic.
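The normalization described at the end of this section reduces to two ratios, which can be sketched as follows. This is a minimal illustration and not the authors' analysis procedure: the function names and the luminometer readings are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def normalized_ratio(firefly, renilla):
    """Luciferase luminescence divided by the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla luminescence must be positive")
    return firefly / renilla


def fold_over_promoterless(test_well, control_well):
    """Normalized test signal relative to normalized promoterless pGL3 basic.

    Each argument is a (luciferase, renilla) pair of luminometer readings.
    """
    return normalized_ratio(*test_well) / normalized_ratio(*control_well)


# Hypothetical readings (arbitrary light units), for illustration only.
test = (9000.0, 3000.0)      # reporter construct: normalized ratio 3.0
control = (1500.0, 2000.0)   # promoterless pGL3 basic: normalized ratio 0.75
print(fold_over_promoterless(test, control))  # 4.0
```

Dividing by the Renilla signal first corrects each well for differences in transfection efficiency, so the final fold value compares promoter activity rather than DNA uptake.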

RESULTS
Organization of the Rat m1 Gene-Screening of 10⁶ recombinants in a rat cosmid library with coding region primers yielded a single colony, S1-139. The same colony hybridized to oligonucleotides derived from 5′-RACE cDNA sequence.
Comparison of the genomic sequence upstream of the open reading frame with cDNA sequence demonstrated sequence divergence 70 bp upstream of the adenine of the initiation codon. Inspection of the genomic sequence identified a consensus splice acceptor site at this position, CCTCTCTTTTCA(G/G).
Since repeated screening of cDNA libraries failed to generate any significant upstream cDNA sequence information, 5′-RACE was used to generate clones corresponding to the 5′-untranslated region of the rat m1 cDNA. Use of AMV-RT generated clones that routinely terminated less than 60 bp upstream of the splice acceptor site. Supplementation of the PCR with dimethyl sulfoxide, formamide, or glycerol failed to yield any longer clones, as did the use of MeHgOH to denature the RNA prior to reverse transcription. However, use of the thermostable polymerase Tth (Promega) for the reverse transcription yielded several longer clones, which were used to generate the underlined sequence seen in Fig. 1. A long polypyrimidine tract stretches for about 88 bp between positions +415 and +503 (see Fig. 1), and this may serve as a premature transcriptional stop during reverse transcription. Interestingly, around this point, the sequence of the rat 5′-untranslated region diverges sharply from its porcine homologue (22). Homology downstream from this point is very high (88% over 294 bases to the initiation codon), although in the porcine sequence the consensus splice site lies further upstream, at −361 with respect to the initiation codon (22). Both the position of the splice site and homology of the coding exon are highly conserved between rat and mouse, with the exception of a single base change, CCTTTCTTTTCA(G/G) (23).
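A tract of the kind described above can be located computationally by searching for the longest run of pyrimidines. The sketch below is illustrative only: it assumes a strict, uninterrupted run of C and T residues (the text does not state whether the m1 tract is uninterrupted), the function name is hypothetical, and the toy sequence is not taken from the m1 gene.

```python
import re


def longest_pyrimidine_run(seq):
    """Return (start, length) of the longest uninterrupted C/T run.

    The start index is 0-based; (-1, 0) is returned if no C or T is present.
    """
    best = (-1, 0)
    for match in re.finditer(r"[CT]+", seq.upper()):
        length = match.end() - match.start()
        if length > best[1]:
            best = (match.start(), length)
    return best


# Toy sequence, for illustration only (not m1 sequence).
toy = "AGGA" + "CTCTTTCCTC" + "GAAG" + "TTC"
print(longest_pyrimidine_run(toy))  # (4, 10)
```

Applied to the non-coding exon, a scan like this would be expected to flag the region between +415 and +503, subject to the strict-run assumption stated above.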
Before proceeding to a finer analysis of the promoter of the m1 gene, we first wished to establish whether the S1-239 cosmid contained sufficient information to direct expression of the m1 gene by transfecting the entire cosmid into the human m1-expressing neuroblastoma, NBOK1. Reverse transcription-PCR, using primers derived from noncoding and coding exons specific for the rat m1 sequence, revealed the presence of rat m1 transcripts in the transfectants (data not shown), thus verifying that the cosmid contained all information necessary to drive expression, at least in transient transfection assays. No rat m1 transcripts were detected in untransfected NBOK1 cells.
Identification of the Transcriptional Start Site-Two independent strategies were used to identify the position of the transcription start site of the m1 gene. Initially, primer extensions were performed on rat brain cortex poly(A)+ RNA using the two primers, 132a and 235a (see Fig. 1). Despite using several protocols and different reverse transcriptases, no product was obtained using the 132a primer. We did, however, obtain a 171-nucleotide product using the 235a primer in conjunction with AMV reverse transcriptase, the length of this extended product corresponding to position +287 (Fig. 1, indicated by +). To complement the primer extension, we performed RNase protection experiments using the antisense RNA probes shown in Fig. 2. Using probe A, the 392-nucleotide protected fragment represents protection of the entire m1 sequence within this probe (each of the probes contains plasmid vector sequence in addition to the m1 sequence shown in Fig. 2, probe A), indicating that transcription begins upstream of the 5′ end of this probe. Using probe B, two protected bands of 184 and 186 nucleotides were observed, corresponding to transcription initiation sites of +1 and +3 (Fig. 1, shown as asterisk and double underline). Probe C generates a protected band of 418 nucleotides, which corresponds to the same transcription initiation site as revealed using probe B. The remaining two major protected fragments from probes A and C are thought not to represent transcription start points but to be probe-specific artifacts, as their 5′ ends (Fig. 1, shown as asterisk) do not correlate with each other. No protected bands were seen when using yeast tRNA as template.
All of these sites were upstream of the site identified by the primer extension experiments, leading us to conclude that the latter was probably an artifactual transcriptional stop and that the upstream sites identified by RNase protection were more likely to represent true transcriptional start sites.
To further verify that transcription of the m1 gene initiates at +1 and that the first exon does not contain any introns, a comparative PCR between cDNA from brain cortex and genomic DNA was performed. cDNA was generated from rat brain cortex RNA that previously had been treated with DNase to remove genomic DNA. Two different sense primers were used: 696s, which contains sequence upstream of the proposed transcription start site, and 539s, which contains sequence downstream of the proposed transcription start site. Each of the sense primers was used in PCR in conjunction with the three antisense primers 461a, 235a, and 132a (the positions of the primers are shown in Fig. 1). The PCR products obtained from this analysis are shown in Fig. 3. Primer pairs 539s/461a, 539s/235a, and 539s/132a generated 71-, 303-, and 412-bp amplified products, respectively, from both H/P-1.4/+0.6pGL3 and cDNA, indicating that the intervening sequence is exonic and contains no introns. However, primer pairs 696s/461a, 696s/235a, and 696s/132a generated 231-, 461-, and 568-bp amplified products only when H/P-1.4/+0.6pGL3 was used as template. No signal was seen using cDNA as template, thereby indicating that primer 696s must lie upstream of the transcription initiation site. These data are consistent with the guanine residues at positions +1 and +3 being the dominant transcription initiation sites. Furthermore, the sizes of the PCR products obtained using 539s with each of the antisense primers are the same in cDNA and genomic DNA, indicating that there are no introns between 539s and 132a.
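The logic of the comparative PCR can be restated as a short calculation over the product sizes reported above. This is a sketch for checking the reported numbers against each other, not part of the original analysis; the helper names are hypothetical, and the antisense primer key "461a" follows the primer list given in the text.

```python
# Amplified product sizes (bp) as reported in the text, keyed by antisense primer.
PRODUCTS_539S = {"461a": 71, "235a": 303, "132a": 412}
PRODUCTS_696S = {"461a": 231, "235a": 461, "132a": 568}


def sense_primer_offsets(upstream, downstream):
    """Size difference per antisense primer between the two sense primers."""
    return {name: upstream[name] - downstream[name] for name in downstream}


def interpret_sense_primer(amplifies_genomic, amplifies_cdna):
    """Hypothetical helper encoding the inference drawn in the text."""
    if amplifies_genomic and amplifies_cdna:
        return "within the transcribed region"
    if amplifies_genomic:
        return "upstream of the transcription start site"
    return "uninformative (no genomic product)"


offsets = sense_primer_offsets(PRODUCTS_696S, PRODUCTS_539S)
print(offsets)                               # {'461a': 160, '235a': 158, '132a': 156}
print(interpret_sense_primer(True, True))    # pattern seen with 539s
print(interpret_sense_primer(True, False))   # pattern seen with 696s
```

Switching the sense primer from 539s to 696s adds a nearly constant 156 to 160 bp to each product from the plasmid template, as expected if the two sense primers lie a fixed distance apart on contiguous sequence.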
Sequence Analysis of the 5′-Flanking Region of the m1 Gene-Inspection of the sequence of the upstream exon and 5′-flanking sequence reveals consensus binding elements for AP-1, NZF-1 (24), AP-2, Oct-1, and NFκB. No TATA or CAAT consensus elements are present. The sequence flanking the transcription start site shows no homology with any known initiator sequence. No significant homology with the 5′-flanking region or the promoter of the m4 gene is found, nor is it found with any other sequence in the data base.
FIG. 1 (legend, fragment). ... dominant sites (*) are double underlined, and the first is assigned position +1. The extension product derived from primer extension studies is indicated by (+). 5′-flanking sequence defined from assigning the guanine at +1 is in uppercase, while the non-coding exon is shown in lowercase. Underlined sequence represents the extent of the longest 5′-RACE cDNA clone.
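An inspection for consensus elements such as the one described in this section can be approximated by a simple pattern scan. The sketch below is an illustration under stated assumptions: the consensus strings are common textbook forms (TATAAA, CCAAT, CANNTG, TGASTCA) and are not necessarily the ones the authors used, only the given strand is scanned, the function names are hypothetical, and the toy sequence is not m1 sequence.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Assumed textbook consensus strings (not taken from the paper).
CONSENSUS = {
    "TATA box": "TATAAA",
    "CAAT box": "CCAAT",
    "E-box": "CANNTG",
    "AP-1": "TGASTCA",
}


def find_sites(seq, consensus):
    """Return 0-based start positions of every match on the given strand."""
    pattern = "".join(IUPAC[base] for base in consensus)
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


def scan(seq, motifs=CONSENSUS):
    """Map each motif name to the list of positions where it occurs."""
    return {name: find_sites(seq, cons) for name, cons in motifs.items()}


# Toy sequence, for illustration only.
toy = "GGCAGCTGTTTGACTCAGG"
print(scan(toy))
# {'TATA box': [], 'CAAT box': [], 'E-box': [2], 'AP-1': [10]}
```

A fuller search would also scan the reverse complement and tolerate mismatches, since many functional sites deviate from the strict consensus.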
Expression of Promoter-Reporter Constructs in Cell Lines-Reverse transcription-PCR analysis demonstrated the presence of m1 transcripts in IMR32 cells and their absence in 3T3 fibroblasts (Fig. 4). The four reporter constructs (K/P-1.6/+0.6pGL3, H/P-1.4/+0.6pGL3, ΔH/P-1.3/+0.6pGL3, and S/P-0.88/+0.6pGL3) used to assay the promoter activity in these cells all terminated at the PstI site 65 bp upstream of the splice site and started 1.60, 1.39, 1.26, and 0.88 kb upstream of the transcriptional start site, respectively. All four constructs expressed 4-5-fold above background in m1-expressing IMR32 cells, and only the larger K/P-1.6/+0.6pGL3 construct drove expression significantly above background in the nonexpressing 3T3 fibroblasts.

DISCUSSION
G-protein-coupled receptors are a diverse and widely expressed family of receptors that mediate signaling throughout the body both in development and adulthood. Within the nervous system, they represent one of the most significant sources of phenotypic diversity, yet little is known of the factors and mechanisms that regulate this cell-specific expression. As such, understanding the mechanisms governing the transcriptional regulation of members of this gene family can offer insight into the establishment and maintenance of specific patterns of gene expression within the nervous system. In the present study, we have shown that, in common with several other members of the G-protein-coupled receptor gene family, including the m4 (15,16,17,24), V1a vasopressin (25), D1a dopamine (26,27), and C5a (28) receptor genes, the m1 muscarinic receptor gene consists of a single coding exon and a single noncoding exon. Another feature shared with most, but by no means all, other G-protein-coupled receptor genes whose promoters have been examined is the absence of TATA, CAAT, or initiator consensus elements. Examples include the promoters of the 5HT1a (29), 5HT2a (30), and 5HT1c (31) serotonin receptors, V1a vasopressin receptor (25), D2 and D1a receptors (26,27,32), SSTR1 somatostatin receptor (33), and NPY-1 receptor (34).
Inspection of 1.6 kb of 5′-flanking sequence revealed several consensus regulatory elements, including one AP-1 site, two AP-2 sites, two NFκB sites, an E-box, and an NZF-1 element. The latter is an element recognized by a zinc finger protein that is expressed in the developing nervous system (24). Since it has been shown that two neuronal proteins, one NFκB-like and one distinct from NFκB (BETA) (35), can interact with the NFκB recognition sequence and activate transcription from the proenkephalin and HIV promoters (36), it will be interesting to ablate these sites and monitor the effect on expression of the m1 gene. As with the m4 promoter, no CRE elements are found in the proximal promoter. The existence of an 88-bp polypyrimidine/polypurine tract in the noncoding exon between +415 and +503 is intriguing in light of studies on other promoters, such as those of the malic enzyme (37), EGF receptor (38), and mouse c-Ki-ras (39) genes, which have shown that deletion of such tracts decreases promoter activity. However, the role of the polypyrimidine/polypurine tract in transcription of the m1 muscarinic receptor gene remains to be examined.
In our earlier studies, we have shown that the core promoter of the m4 muscarinic receptor gene is constitutively active and that cell-specific expression is achieved by silencing expression in non-neuronal cells (15,16,40) via a RE1/NRSE-type silencing element (18,19). Interestingly, inspection of 2.5 kb of flanking sequence of the m1 promoter reveals no RE1/NRSE element. This observation is corroborated by the failure of a radiolabeled single-stranded RE1/NRSE oligodeoxynucleotide to hybridize to a digest of the m1 cosmid under conditions that generate a strong hybridization signal to digests of the R3-6 m4 cosmid (data not shown). Hence, unlike its m4 counterpart, it is unlikely that the m1 gene is under the control of the zinc finger silencer REST/NRSF (41,42). A recent report describes the promoter of the chick m2 receptor gene (43), and although there is evidence of a silencer region, there is no evidence of a RE1/NRSE element, so it may be that quite different factors are involved in driving transcription of each of the muscarinic receptor genes. Nevertheless, a finer analysis of the proximal m1 promoter will be necessary to examine the activity of the core promoter and the role of silencing and activating elements in directing cell-specific expression of the m1 gene.
Deletional analysis revealed that constructs containing between 0.6 and 1.4 kb of 5′-flanking sequence and the entire noncoding exon were sufficient to drive reporter gene expression in IMR32 cells, a neuroblastoma that expresses an endogenous m1 gene. This cell line expresses more m1 mRNA than any other cell line that we have thus far screened, but even so, its level of expression compared with rat cerebral cortex is low (see Fig. 4), at least as judged by reverse transcription-PCR. All reporter constructs drove 4-5-fold luciferase expression relative to the promoterless vector, pGL3 Basic (Fig. 5). This modest stimulation of reporter gene activity is presumably a reflection of the relatively low levels of endogenous m1 expression in IMR32 cells. Only the K/P-1.6/+0.6pGL3 reporter construct drove reporter gene expression in 3T3 fibroblasts, which express no endogenous m1 receptor, showing that constructs containing as little as 0.88 kb of 5′-flank and the noncoding exon are sufficient to drive cell-specific expression, at least in transient transfection assays. The low level of expression driven by the K/P-1.6/+0.6pGL3 construct may indicate the presence of a weak non-neuronal activator between the KpnI site (−1.6 kb) and the HindIII site (−1.4 kb). Since these deletions ablate the E-box, AP-1, and the distal NZF-1 and NFκB sites, it is clear that they are not necessary for cell-specific expression. The role, if any, of the NZF-1, NFκB, and AP-2 sites between the SacI site and the transcriptional initiation site awaits determination by a finer analysis of the proximal promoter. However, interpretation of all transient transfection assays is limited by the fact that reporter gene expression is driven by multiple episomal copies of the reporter vector.
Consequently, there are numerous examples of reporter constructs that are capable of driving apparent cell-specific expression in transient transfections that nevertheless fail to recapitulate appropriate cell- and/or stage-specific expression in transgenic mice, as in the case of the dopamine β-hydroxylase gene, where reporter constructs containing 0.6 kb of 5′-flanking sequence can drive cell-specific expression in transient transfection assays (44,45) but give no expression in transgenic mice (46). Reporter gene expression of other G-protein-coupled receptor promoters has revealed that less than 1 kb of 5′-flanking sequence is sufficient to drive cell-specific expression of the V1a vasopressin (25), D1 dopamine (27), type-1 angiotensin II (47), and m4 muscarinic (15,16,17) receptor promoter constructs in transient transfection assays. However, whether such discrete constructs are capable of driving tissue- and stage-specific expression in transgenic mice has not been reported for any members of the G-protein-coupled receptor gene family.
The characterization of two muscarinic receptor promoters now enables us to examine the differential transcriptional regulation of these members of the G-protein-coupled receptor gene family. Future studies are aimed at dissecting the m1 proximal promoter to determine whether the core promoter is constitutively active, as in the case of the m4 promoter, or whether progressive deletions lead to ablation of expression in m1-expressing cells.