Mechanism of Gonadotropin Gene Expression

Regulation of the glycoprotein hormone α-subunit (GPHα) gene has been studied extensively in pituitary and placental cell lines, but little is known of the transcriptional regulators important for its ectopic expression. To investigate the molecular basis for ectopic expression, it was critical to define cis-regulatory elements and their cognate trans-acting factors that modulate promoter activity in epithelial cell types that do not normally express GPH. DNA-mediated transient expression of promoter-reporter constructs was used to identify a novel negative regulatory element located at the GPHα gene transcription start site. Truncation or site-directed mutagenesis of this element produced up to a 10-fold increase in promoter activity. Electrophoretic mobility shift analysis detected a protein that binds specifically to a DNA motif encompassing the cap site. Based on competitive DNA binding studies with mutated oligonucleotides, it was determined that bases from −5 to −2 and +4 to +11 are critical for protein binding. The DNA sequence flanking the transcription start site from −9 to +11 is an imperfect palindrome; consequently, this motif is referred to as the cap site diad element (CSDE) and the cognate factor as the cap site-binding protein (CSBP). CSBP activity was present at different levels in nuclear extracts prepared from a variety of cell types. Significantly, the ratio of activities exhibited by the GPHα promoter with a mutated CSDE compared with the promoter with a wild-type CSDE was dependent on the transfected cell line and its content of CSBP. These results indicate that a negative regulatory element centered at the GPHα gene cap site and its cognate DNA-binding protein make a significant contribution to the production of α-subunit in a variety of tumor tissues. A detailed understanding of this cis/trans pair may further suggest a mechanism to explain, at least in part, how this gene becomes activated in nonendocrine tumors.

The glycoprotein hormone (GPH) family consists of four members: chorionic gonadotropin, luteinizing hormone (LH), follicle-stimulating hormone (FSH), and thyroid-stimulating hormone (TSH). These hormones are heterodimers, sharing a common α-subunit but having unique β-subunits that are thought to confer their biological specificity. LH, FSH, and TSH are produced in the anterior pituitary, whereas chorionic gonadotropin is synthesized in the developing placenta. Thus, it is significant that the isolated α- or β-subunits are also synthesized by a variety of tumors (1-5) and tumor-derived cell lines (6-17). The free subunits are secreted by cell lines established from both trophoblastic (e.g. JAr, JEG-3) and nontrophoblastic (e.g. HeLa, ChaGo, CBT) tumors (16, 18). In the latter instance, they are considered to be ectopic proteins, i.e. characteristic of a cell type other than that from which the tumor was derived. An active role for α-subunit at some stage of the tumorigenic process is supported by reports showing direct correlations between α-subunit production and tumor formation in nude mice (19, 20) and the anchorage-independent growth of tumor cell lines in vitro (8, 20).
The molecular mechanisms controlling human α-subunit gene expression in the placenta have been studied extensively. It has been shown that multiple elements in the first 300 bp of DNA upstream from the transcription start site (+1) are necessary to regulate the gene in a tissue-specific manner (21). These are illustrated in Fig. 1. Basal promoter elements include consensus TATA and CAAT boxes residing at −29 and −89, respectively. Cyclic AMP-responsive elements (CREs) occur within two tandemly repeated 18-bp sequences that extend from −146 to −111 (22-24). The core CRE sequence is an 8-bp palindrome (TGACGTCA) that is also found in several other cAMP-responsive genes (25-30). Adjacent to the CREs in the GPHα gene resides a tissue-specific enhancer located from −180 to −150 that stimulates basal levels of GPHα gene expression in placental choriocarcinoma cells (22, 31-33). The trophoblast-specific enhancer is a composite element, containing adjacent and overlapping DNA-binding domains for at least three proteins. The upstream enhancer requires the CRE to impart its effects on transcription (22, 32, 33). The distal and central domains are referred to as TSE (trophoblast-specific element) and upstream regulatory element, respectively (21, 31-33); and the proximal region between −161 and −142 contains a GATA binding motif, originally identified as α-activation element, which is able to respond to cAMP (34). Cellular proteins have been identified that interact specifically with these distinct regulatory elements (22, 23, 31, 32, 34, 35).
Control of cell-specific expression of the α-subunit gene in the pituitary differs from that in the placenta (21, 32-34, 36-38). Regulation of GPHα gene expression in the JEG-3 choriocarcinoma cell line and in the αT3-1 pituitary gonadotrope cell line is dependent on the CREs (38-40). However, the trophoblast-specific enhancer seems not to be involved, and no TSE binding activity could be detected in gonadotrope or thyrotrope cell lines (38). In contrast, several regions upstream of the TSE have been implicated for expression in pituitary cells (40-42). The best characterized motif, called the gonadotrope-specific element (GSE), is located in the human gene from −223 to −197 and is conserved in all mammalian α-subunit genes examined thus far (38). A protein that binds this element was detected in αT3-1 cells but not in thyrotrope cells and was identified as steroidogenic factor-1 (SF-1) (36). The regions (−480 to −417, −254 to −177, and −177 to −120) that confer thyrotrope-specific expression to the mouse α-subunit gene do not contain homology to the GSE (41, 42). The sequence between −344 and −300 is referred to as the pituitary glycoprotein basal element (PGBE), and an imperfect palindrome centered between −342 and −329 has been shown to bind a LIM homeodomain transcription factor (LH-2) (43). This suggests that a LIM homeodomain protein can stimulate expression of one of the earliest markers of pituitary differentiation. Regions comparable with the murine PGBE have yet to be identified in the human gene. In addition, two potential basic helix-loop-helix protein binding sites (E-boxes) are located in the GPHα promoter just downstream (−21 to −16) and just upstream (−50 to −45) of the TATA box. These are referred to as αEB2 and αEB1, respectively, and their mutation reduces basal activity of the promoter 60-80% in pituitary cells (44).
Despite the identification and characterization of these basal and enhancer elements that confer tissue-specific activation of the GPHα gene in trophoblasts, gonadotropes, and thyrotropes, the molecular basis for its expression in nonendocrine cell types remains poorly understood. In HeLa cervical carcinoma cells, the GPH α-subunit is ectopically expressed and secreted at levels comparable with those in the eutopic expressing JEG-3 choriocarcinoma cells (7). However, previous studies have suggested that the α-subunit gene proximal promoter is not as active in HeLa cells as it is in JEG-3 cells and that HeLa cells do not have the requisite binding protein to interact with the TSE (22, 38). To investigate the mechanisms controlling ectopic production of the gonadotropin subunits in nonendocrine tumors, the regulation of GPHα gene expression was studied in HeLa cells. In a detailed analysis of the GPHα gene promoter sequence, an imperfect inverted repeat centered at the transcription start site was noted. It is demonstrated in this report that the palindromic cap site sequence constitutes a negative cis-acting element that differentially contributes to promoter activity in a variety of cell types depending on the level of a nuclear factor that demonstrates specific binding to the cap site sequence. The element and its cognate trans-acting factor are referred to as the cap site diad element (CSDE) and cap site-binding protein (CSBP), respectively.

EXPERIMENTAL PROCEDURES
Cell Culture-HeLa cervical carcinoma cells, HT-29 colon carcinoma cells, Panc-1 pancreatic carcinoma cells, and GH4C1 murine pituitary cancer cells were maintained in minimum essential medium supplemented with 5% bovine calf serum. JEG-3 and JAr choriocarcinoma cells, U-2 OS osteosarcoma cells, and MCF-7 breast carcinoma cells were grown in RPMI 1640 medium supplemented with 10% fetal calf serum. The MCF-10A breast cell line was maintained in Dulbecco's modified Eagle's medium/F-12 medium with 5% equine serum, 0.1 µg/ml cholera toxin, 10 µg/ml insulin, 0.5 µg/ml amphotericin B, 0.5 µg/ml hydrocortisone, and 0.02 µg/ml epidermal growth factor. All media were also supplemented with glutamine (0.06%), penicillin (100 units/ml), and streptomycin (100 µg/ml). Cells were grown as monolayer cultures and maintained in T-flasks at 37°C in a humidified atmosphere consisting of 95% air and 5% CO2.
Plasmid Constructions-(a) A 0.89-kbp fragment extending from −846 to +48 was liberated from a GPHα genomic clone (45) by digestion with BglII and BamHI. The pBLCAT3 vector (46) was linearized with BamHI and ligated to the BglII/BamHI fragment with T4 DNA ligase, placing the chloramphenicol acetyltransferase (CAT) reporter gene under control of the wild-type α-subunit gene promoter, thereby generating pα(−846/+48)CAT. (b) To construct pα(−846/+3)CAT, the pα(−846/+48)CAT vector was digested with PstI, and the liberated fragment was isolated and subcloned into the polylinker PstI site of pBLCAT3. (c) To introduce point mutations into the cap site sequence, a DNA fragment extending from −1637 to +48 of the GPHα genomic clone (45) was subcloned into M13, and uracil-containing single-stranded DNA was isolated to serve as a template for mutagenesis (47, 48). Mutagenic primers changed bases at −5 to −2 from TAAC to ATTG (m-1) and bases at −4, −3, +4, and +6 from A, A, G, and T to T, G, A, and C, respectively (m-2). The fragments extending from −846 to +48 containing the point mutations were liberated from M13 and subcloned into pBLCAT3. (d) A synthetic oligonucleotide (CSDE, Table I) extending from −13 to +15 with BamHI overhangs was inserted in both forward and reverse orientation into the BamHI site of pα(−846/+3)CAT to generate pα(+3-CSDE)CAT and pα(+3-EDSC)CAT, respectively. (e) To fabricate HCAP and HPAC, pα(+3-CSDE)CAT and pα(+3-EDSC)CAT vectors were cut with PstI, and the larger fragments were gel-purified and recircularized. Appropriate recombinants were identified by PCR or restriction fragment analysis. Sequences across the cap site (underlined) were verified: HCAP (5′-GACTTCATTAACTGCAGTTACTGAGAAC-3′) restores the imperfect palindrome truncated at +15; HPAC (5′-GACTTCATTAACTGCAGTTAATGAAGTC-3′) creates a perfect palindrome. All mutations and the orientation and number of inserts in these reporter constructs were verified by dideoxy sequencing (49).
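The diad symmetry claimed for the HCAP and HPAC cap site sequences can be checked computationally. The following Python sketch is illustrative only and is not part of the original methods; the helper names are hypothetical, and the two sequences are those quoted above. A DNA sequence is a perfect palindrome when it equals its own reverse complement, and each symmetric pair of positions that fails this test marks a departure from perfect diad symmetry.

```python
# Illustrative check of diad symmetry; helper names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatched_pairs(seq: str) -> int:
    """Count symmetric position pairs that break diad symmetry.

    Position i is paired with its mirror position; the pair is symmetric
    when base i equals the complement of the mirror base. Only the first
    half is scanned so each pair is counted once (even-length input assumed).
    """
    rc = reverse_complement(seq)
    half = len(seq) // 2
    return sum(1 for a, b in zip(seq[:half], rc[:half]) if a != b)


# Cap site sequences as quoted in the text (5' to 3').
HCAP = "GACTTCATTAACTGCAGTTACTGAGAAC"
HPAC = "GACTTCATTAACTGCAGTTAATGAAGTC"

if __name__ == "__main__":
    for name, seq in (("HCAP", HCAP), ("HPAC", HPAC)):
        print(name, len(seq), "nt;",
              "perfect palindrome:", reverse_complement(seq) == seq,
              "; mismatched pairs:", mismatched_pairs(seq))
```

Under these definitions HPAC equals its reverse complement, whereas HCAP retains a small number of asymmetric pairs, in keeping with its description as an imperfect palindrome.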
DNA-mediated Transient Expression Assay and Construction of Stable Transfectants-Transfections were performed in duplicate using DNA-calcium phosphate co-precipitates (50) containing 10 µg of the appropriate pαCAT expression plasmid and 5 µg of the internal control plasmid pCMVlacZ, which places the Escherichia coli β-galactosidase gene under control of the cytomegalovirus (CMV) promoter. Cells were harvested 48 h after glycerol shock, and cell lysates were assayed for CAT activity as described by Gorman et al. (51, 52) and for β-galactosidase activity as described by Maniatis et al. (50). CAT activity was normalized to β-galactosidase activity for each sample. The aliquots assayed for CAT were heated at 65°C for 10 min; those to be assayed for β-galactosidase activity were left unheated. Protein concentration of cell extracts was determined relative to a bovine serum albumin standard by the method of Bradford (53). In separate experiments, relative CAT activities for a given vector generally varied by no more than ±20%. Stable transfectants were established by incubating HeLa SR3 cells with calcium phosphate-DNA precipitates containing 20 µg of either pα(−846/+48)CAT or pα(−846/+3)CAT plus 5 µg of pSV2neo. Twenty-four hours after the glycerol shock, cells were trypsinized and subcultured from a confluent T-25 flask into a T-75 flask in medium containing 600 µg/ml G418 sulfate. Colonies present after 14 days were pooled and subcultured in medium containing 200 µg/ml G418.
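The normalization described above (CAT activity divided by β-galactosidase activity for each sample, then expressed relative to a reference construct) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the duplicate readings are invented numbers chosen only to make the arithmetic easy to follow, and they are not data from this study.

```python
from statistics import mean


def normalized_cat(cat_activity: float, beta_gal_activity: float) -> float:
    """Correct a CAT reading for transfection efficiency."""
    return cat_activity / beta_gal_activity


def relative_activity(samples: dict, reference: str) -> dict:
    """Average normalized duplicates and scale to the reference construct.

    samples maps a construct name to a list of (CAT, beta-gal) readings.
    The reference construct is set to 1.0.
    """
    averaged = {
        name: mean(normalized_cat(cat, gal) for cat, gal in readings)
        for name, readings in samples.items()
    }
    ref_value = averaged[reference]
    return {name: value / ref_value for name, value in averaged.items()}


# Hypothetical duplicate readings (percent conversion, beta-gal units).
EXAMPLE = {
    "+48": [(2.0, 1.0), (4.0, 2.0)],
    "+3": [(13.0, 1.0), (26.0, 2.0)],
}

if __name__ == "__main__":
    for construct, fold in relative_activity(EXAMPLE, reference="+48").items():
        print(f"{construct}: {fold:.1f}-fold relative to +48")
```

Dividing by the co-transfected β-galactosidase signal before averaging means that a dish with twice the transfection efficiency does not appear to have twice the promoter activity.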
Preparation of Nuclear Extracts-Crude nuclear extracts from HeLa, JEG-3, HT-29, Panc-1, GH4C1, U-2 OS, JAr, MCF-7, and MCF-10A cells were prepared as follows. Confluent cells in a 150-cm² flask were washed and collected in a cold solution containing 50 mM potassium phosphate (pH 7.4), 150 mM NaCl, 0.5 mM EDTA, and 0.5 mM EGTA. The cell pellet was resuspended in 10 ml of cold nuclear wash buffer containing 10 mM HEPES (pH 8.0), 50 mM NaCl, 15% (w/v) sucrose, 0.1 mM EDTA, 0.5% (v/v) Triton X-100, 1 mM dithiothreitol (DTT), 5 mM MgCl2, and 1 mM phenylmethylsulfonyl fluoride (PMSF). After incubation for 10 min on ice, the cell lysates were underlaid with the above solution, omitting the Triton X-100 and increasing the sucrose concentration to 30%, and centrifuged at 2,000 rpm for 30 min at 4°C. Nuclear pellets were resuspended in 150 µl of cold 10 mM Tris-Cl (pH 7.4) plus 1 mM EDTA (TE buffer) and then incubated for 60 min on ice after adding 150 µl of a solution containing 20 mM HEPES (pH 8.0), 1 M NaCl, 20 mM MgCl2, 0.2 mM EDTA, 2 mM DTT, 10 mM spermidine, and 2 mM PMSF. After pelleting the nuclei, protein extracts were dialyzed against 100 volumes of a buffer containing 200 mM HEPES (pH 7.9), 20% (v/v) glycerol, 0.1 M KCl, 0.2 mM EDTA, 0.5 mM DTT, and 1 mM PMSF. After dialysis for 4 h, the buffer was replaced with an additional 100 volumes of fresh buffer, and dialysis was continued another 4 h. Protein concentration in nuclear extracts was determined as described by Bradford (53).
Electrophoretic Mobility Shift Analysis (EMSA)-Oligonucleotides used as binding probes and competitors are summarized in Table I. EMSA was performed as described by Carthew et al. (54) with modification. Binding reactions were carried out in a final volume of 25 µl containing 10 µg of crude nuclear extract protein, 10% (v/v) glycerol, 20 mM HEPES (pH 7.9), 100 mM KCl, 3 mM MgCl2, 4 mM spermidine, and 0.5 mM DTT. Poly(dI-dC)-poly(dI-dC) (2 µg) was added to eliminate nonspecific protein binding. For competition analysis, the reaction mixtures were supplemented with 50-200 ng of unlabeled oligonucleotide, which provided up to a 1000-fold excess of the competitor relative to the probe. After a 10-min preincubation at 22°C, 10,000 cpm of 32P-labeled oligonucleotide (~0.5 ng) was added, and incubation was continued for 30 min. DNA-protein complexes were resolved on 6.5% nondenaturing polyacrylamide gels in 1× TBE buffer (90 mM Tris base, 64.6 mM boric acid, and 2.5 mM EDTA) and visualized by autoradiography of the dried gel.
Methylation Interference Analysis-Sense and antisense oligonucleotides (−13/+22, Table I) were individually end-labeled with [γ-32P]ATP and annealed with the appropriate unlabeled complementary strand in excess. Double-stranded oligonucleotide probes were purified over a 20% polyacrylamide gel, eluted, precipitated with ethanol, and resuspended in TE buffer as described above. The probes were partially methylated by incubating 10^6 cpm of DNA in 5-10 µl of TE for 10 min at room temperature with 1-2 µl of dimethyl sulfate in 200 µl of buffer containing 50 mM sodium cacodylate (pH 8.0) and 1 mM EDTA (pH 8.0). The reaction was stopped by adding 40 µl of a solution containing 1.5 mM NaOAc (pH 7.0), 1 M 2-mercaptoethanol, and 10 µg of tRNA. The methylated probe was purified by ethanol precipitation.
The methylated probe was subjected to standard EMSA, and bands corresponding to DNA-protein complex and free probe were excised and electroeluted into 0.1× TBE. The eluates were supplemented with 10 µg of tRNA, extracted with phenol/chloroform (1:1), and cleared of nucleic acid by ethanol precipitation. Precipitates were rinsed with 70% ethanol, air-dried, and resuspended in 30 µl of 0.5 M piperidine. The DNA was hydrolyzed at 90°C for 30 min and then precipitated twice with ethanol. The pellet was rinsed twice with 70% ethanol, air-dried, resuspended in 10 µl of a solution containing 80% (v/v) deionized formamide, 50 mM Tris borate (pH 8.3), 1 mM EDTA, 0.1% (w/v) xylene cyanol, and 0.1% (w/v) bromphenol blue, and boiled for 2 min. Equal amounts of radioactivity derived from bound and free probe were subjected to electrophoresis on a 10% polyacrylamide sequencing gel containing 7 M urea.
Isolation of Total Cytoplasmic RNA-Cells that were stably transfected with pα(−846/+48)CAT or pα(−846/+3)CAT were harvested from confluent flasks by scraping into ice-cold Tris-buffered saline (50 mM Tris-Cl (pH 7.4) and 0.15 M NaCl) and washed twice in the same solution by centrifugation (1200 × g; 5 min; 4°C). The cells were resuspended in ice-cold TE buffer and lysed by addition of 0.4% (v/v) Nonidet P-40. After removing nuclei by centrifugation, total cytoplasmic RNA was prepared from the postnuclear supernatant by addition of 1% sodium dodecyl sulfate and phenol extraction as previously described (7).
Primer Extension-The oligonucleotide for specific priming of CAT reverse transcripts was CAT-REV2 (5′-GAGCTTGGCGAGATTTTCAGGAGCTAAGGAAGC-3′), which is located at −36 to −4 relative to the CAT gene translation start site. This primer (10 pmol) and dephosphorylated, HinfI-digested φX174 DNA (250 ng) were end-labeled using T4 polynucleotide kinase and [γ-32P]ATP. After labeling, reactions were heated to 90°C for 2 min to inactivate the T4 kinase. For primer extension, 1 µl of end-labeled primer was added to 15 µg of total RNA and 5 µl of 2× buffer, which contained 100 mM Tris-Cl (pH 8.3), 100 mM KCl, 20 mM MgCl2, 20 mM DTT, 2 mM each dNTP, and 1 mM spermidine. The primer and RNA were annealed by heating the tubes at 58°C for 20 min followed by cooling at room temperature for 10 min. Reaction mixtures for extension were constructed by adding 5 µl of 2× avian myeloblastosis virus primer extension buffer, 1.4 µl of 40 mM sodium pyrophosphate, 1 unit of avian myeloblastosis virus reverse transcriptase, and 1.6 µl of H2O to the annealing mixture. Incubation was at 42°C for 30 min. The products were supplemented with 20 µl of loading dye and heated at 90°C for 10 min. A sample aliquot of 10 µl and 1 µl of labeled φX174 DNA marker were loaded onto a 10% polyacrylamide, 7 M urea sequencing gel. The gel was run at 13 watts in 0.6× TBE buffer for about 4 h and subjected to autoradiography.

RESULTS
Identification of a Regulatory Element Located at the Transcription Start Site of the Glycoprotein Hormone α-Subunit Gene-The level of promoter activity for many genes can be defined by the array of cis-acting elements in the proximal promoter upstream of +1 and by the interactions of their corresponding DNA-binding proteins. In addition, a number of regulatory factors have also been identified that bind at, or slightly downstream from, the transcription start site and make significant contributions to promoter activity (55-59). As described in the Introduction, the GPHα gene promoter is complex (Fig. 1), containing multiple cis-acting elements in the proximal 5′-flanking DNA that interact with nuclear proteins isolated from HeLa cervical carcinoma cells (60), JEG-3 choriocarcinoma cells (21-23, 31, 32, 61), and αT3-1 pituitary gonadotropes (40-43). However, the activity of the GPHα gene promoter extending from −846 to +48 (i.e. pα(−846/+48)CAT) was previously reported to be extremely low in HeLa cells relative to the activity expressed in JEG-3 cells (22). To gain an understanding of the mechanisms leading to low level expression of the GPHα promoter in nonplacental and nonpituitary cell types such as HeLa, it was important to identify regulatory elements that account for this low promoter activity.
Because much of the earlier work had examined the first few hundred base pairs lying immediately upstream of the cap site, this study was undertaken to examine downstream sequences for potential regulatory elements. Taking advantage of a PstI restriction site centered at the +1 position of the GPHα gene, a vector was generated that terminated at +3 on the 3′ end (i.e. pα(−846/+3)CAT). Whether the first 45 bp of the 5′-untranslated region of the α-subunit gene affected promoter activity could be assessed by comparing reporter gene expression in HeLa cells transfected with vectors pα(−846/+48)CAT and pα(−846/+3)CAT. As seen in Fig. 2, acetyltransferase activity from pα(−846/+3)CAT was increased 6-7-fold relative to that from pα(−846/+48)CAT. These results suggest the presence of a negative regulatory element located in the first exon of the GPHα gene. The element is located at, or downstream of, the transcription start site, as the promoter shows increased activity by truncation from +48 to +3. In analyzing the GPHα gene in this region, an imperfect inverted repeat was identified (Fig. 1). It is located with the diad center at the transcription start site (i.e. +1) and will be referred to as the CSDE.

FIG. 1. Schematic diagram of the GPHα promoter regulatory elements and CSDE sequence.
The GPHα gene extending from −350 to +50 contains multiple cis-acting elements. In the human GPHα promoter, a murine PGBE sequence is conserved at −333 to −320; it may serve as a pituitary-specific element in thyrotropes for production of TSH. A GSE at −223 to −197 is important for basal activation of the GPHα gene for FSH and LH synthesis in pituitary gonadotropes. The region from −180 to −150 consists of three overlapping protein binding subdomains, the TSE, the upstream regulatory element (URE), and the α-activation element (α-ACT). The cis-elements play a critical role in placenta-specific expression. The CRE is composed of two 18-bp tandem repeats that are located between −146 and −111. An element defined as the junctional regulatory element (JRE) is located downstream of the CREs (−120 to −100) and overlaps a negative androgen response element (ARE). The GPHα gene also contains TATA and CCAAT basal promoter elements located at position −29 and −89, respectively. Two E-boxes (αEB1 and αEB2) flank the TATA motif, and a negative thyroid hormone response element (T3RE) is situated between the TATA box and the downstream E-box (αEB2). The CSDE sequence is indicated below the map. Stars above and below the bases mark diad symmetry of the GPHα cap site. The start site of transcription is indicated by +1. See the Introduction for appropriate references.
Negative Influence of CSDE on GPHα Transcription-To firmly establish a role for the CSDE in expression of the GPHα gene, clustered point mutations were introduced into the cap site to disrupt the diad symmetry. Two mutants were constructed using classical methodologies based on M13. Mutation m-1 extends from −5 to −2 and converts TAAC to ATTG, which disrupts the diad left arm; and mutation m-2 has substitutions at −4, −3, +4, and +6 to change A, A, G, and T to T, G, A, and C, respectively, which alters both diad arms (Fig. 3A). The DNA fragments extending from −846 to +48 containing mutated cap site sequence were released from M13 DNA and engineered into pBLCAT3. The reporter plasmids pα(−846/+48)CAT (wild type), m-1, m-2, and pα(−846/+3)CAT were transiently cotransfected with pCMVlacZ into HeLa cells. Acetyltransferase activity was normalized to β-galactosidase activity to account for differences in transfection efficiency. As shown in Fig. 3B, CAT activity from m-1, m-2, and pα(−846/+3)CAT (abbreviated as +3) was increased 3-, 4-, and 6.5-fold, respectively, relative to that from the pα(−846/+48)CAT (abbreviated as +48) vector set at 1.0. Because the mutant promoters were more active than the wild-type promoter, these results indicate that CSDE acts as a negative regulatory element. Furthermore, both halves of the diad contribute to the element's activity, as point mutations in m-1 inactivate the upstream arm, and truncation at +3 inactivates the downstream arm.
Site of Transcription Initiation in the 3′-Truncated and Wild-type Promoters-The possibility was considered that the high level of expression exhibited by pα(−846/+3)CAT might arise from a change in the transcription start site as a result of the extensive 3′ deletion. Consequently, primer extension analysis was used to map the 5′ ends of mRNA transcribed from the wild-type (pα(−846/+48)CAT) and deletion mutant (pα(−846/+3)CAT) vectors. A diagram of the relevant portion of these plasmids is presented in Fig. 4B. If the GPHα gene cap site is used, the predicted sizes of the primer extension products are 98 nucleotides for the wild-type transcript and 71 nucleotides for the 3′-deletion transcript. These were determined by summing the length of CAT-REV2 primer (33 nucleotides), plasmid backbone (either 17 or 35 nucleotides for wild-type and mutant, respectively), and exon I (either 48 or 3 nucleotides for wild-type and mutant, respectively). The results presented in Fig. 4A show that primer extension products generated from RNAs were the sizes predicted in Fig. 4B, indicating that both extension products terminated at the same nucleotide, which corresponds to +1 of the GPHα gene. It is also noted that the abundance of pα(−846/+48)CAT mRNA was significantly lower than that of pα(−846/+3)CAT mRNA, further indicating that the activity increase produced by the 3′-deletion mutant is at the level of transcription.
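The predicted product sizes follow from simple addition of the three segment lengths stated above. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative, and the primer sequence is the CAT-REV2 sequence given under "Experimental Procedures."

```python
# Illustrative arithmetic for the predicted primer extension products.
CAT_REV2 = "GAGCTTGGCGAGATTTTCAGGAGCTAAGGAAGC"  # primer sequence from the text


def predicted_product_length(primer_nt: int, backbone_nt: int, exon_nt: int) -> int:
    """Sum the primer, plasmid backbone, and exon I segments (nucleotides)."""
    return primer_nt + backbone_nt + exon_nt


primer_nt = len(CAT_REV2)  # 33 nucleotides

# Backbone and exon I lengths as stated in the text for each vector.
wild_type_nt = predicted_product_length(primer_nt, backbone_nt=17, exon_nt=48)
deletion_nt = predicted_product_length(primer_nt, backbone_nt=35, exon_nt=3)

if __name__ == "__main__":
    print("wild-type (+48) product:", wild_type_nt, "nt")
    print("3'-deletion (+3) product:", deletion_nt, "nt")
```

If initiation had shifted in the truncated construct, the observed product would differ from the sum for that vector, which is why agreement with both predicted sizes supports use of the same start site.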
CSDE Forms a Distinct Complex with HeLa Cell Nuclear Proteins-The effects of site-directed mutations (m-1 and m-2) in the CSDE on transcriptional activity of the α-subunit gene promoter suggested the possibility that this site interacts with distinct nuclear proteins to repress GPHα gene transcription. To examine this possibility, EMSA was performed. A 28-bp oligonucleotide that extends from −13 to +15 relative to the transcription start site (+1) of the α-subunit gene was radiolabeled and incubated with HeLa cell nuclear extract (Fig. 5). Analysis of the binding mixtures by electrophoresis through native polyacrylamide gels revealed a DNA-protein complex migrating slower than the free CSDE probe (Fig. 5, lane 1). This complex was eliminated by the addition of excess, unlabeled homologous (CSDE) oligonucleotide (lane 3), but it was not affected by the addition of excess, unlabeled heterologous (Het) oligonucleotide (lane 2). Additionally, 1.7-kbp GPHα DNA fragments (extending from −1637 to +48), which contain wild-type or mutated cap site sequence, were used as competitors of protein binding to the 32P-labeled CSDE oligonucleotide. Because the competitors were significantly different in length (28 versus 1685 bp), the levels of DNA added are indicated in Fig. 5 as picomoles. As seen, the complex was eliminated (lanes 4-6) with increasing amounts of wild-type competitor, whereas the addition of excess unlabeled fragment containing a clustered mutation at −5 to −2 (changing TAAC to ATTG) did not significantly interfere with the formation of complex even at the highest concentration tested (lanes 7-9). Thus, elimination of the DNA-protein complex was dependent on an intact CSDE motif. The formation of a distinct DNA-protein complex with a probe representing the GPHα cap site diad suggests that the CSBP represents a previously undefined binding activity for the GPHα promoter and may act as a repressor to affect promoter activity.

FIG. 2. Activation of the GPHα promoter by 3′-truncation.
The CAT reporter plasmids containing DNA from −846 to +48 or from −846 to +3 relative to the GPHα transcription start site (+1) were transiently transfected into HeLa cells in duplicate. After 48 h, cells were harvested and assayed for CAT activity as described under "Experimental Procedures." Values represent percent conversion of [14C]chloramphenicol to acetylated derivatives after normalization to β-galactosidase activity introduced via cotransfection of pCMVlacZ; they are the means obtained from at least three independent experiments.

FIG. 3. Transient transfection of HeLa cells with wild-type and mutant expression vectors.
A, two cap site mutants were constructed as described under "Experimental Procedures." The boxed bases indicate the clustered mutations. One mutant (m-1) had four base pair changes from −5 to −2 and the other (m-2) contained point mutations at −4, −3, +4, and +6. Presented are the confirmatory dideoxy sequencing gels for the two mutants and a wild-type cap site. B, the DNA fragments extending from −846 to +48 and containing either wild-type or mutated cap site sequence were subcloned from coliphage M13 into the pBLCAT3 expression vector. DNA sequence in the boxes indicates the mutated bases (lowercase). HeLa cells were transfected in duplicate with 10 µg of the CAT reporter gene constructs and 5 µg of pCMVlacZ. After 48 h, cells were harvested, and CAT and β-galactosidase activities were measured. Acetylated chloramphenicol was revealed by autoradiography, and CAT activity relative to β-galactosidase activity in the same extracts is indicated by the bar graph. The autoradiogram is representative of a single experiment, and the quantitative values represent the mean activity obtained from three independent experiments.
FIG. 4. Determination of transcription start sites by primer extension. HeLa cells were stably transfected with either pα(−846/+48)CAT or pα(−846/+3)CAT. Total RNA was isolated from G418-resistant cells cultured in minimum essential medium supplemented with 5 mM D-mannose, 3 mM sodium butyrate, and 1 mM theophylline (77,78). Total RNA (15 µg) was subjected to primer extension with a ³²P-labeled CAT gene-specific oligonucleotide primer as described under "Experimental Procedures" and depicted in panel A. Arrows indicate the major extended products, 98 nucleotides for pα(−846/+48)CAT and 71 nucleotides for pα(−846/+3)CAT. As a positive control, kanamycin RNA and primer (Clontech) were used to generate a product of the expected size (87 nucleotides). The schematics in panel B depict the pα(−846/+48)CAT and pα(−846/+3)CAT vectors, the CAT-REV2 primer (33 nucleotides), and the calculated distance from the primer 5′-terminus to the predicted transcription start sites (indicated by +1).

FIG. 5. Identification of CSDE binding activity in HeLa nuclear extracts by EMSA. Binding assays were performed using a HeLa cell nuclear extract and ³²P-labeled CSDE oligonucleotide (lane 1) and were analyzed by electrophoresis through 6.5% polyacrylamide gels in 1× TBE buffer. The free probe and the retarded complex are indicated. The complex was challenged with 0.561 pmol of unlabeled heterologous (CRE, 5′-CGGCAAATTGACGTCATGGTAAGCCC-3′) (lane 2) or homologous (lane 3) oligonucleotides. The complex was also competed with 0.045, 0.312, and 0.579 pmol of unlabeled 1.7-kbp DNA fragments that extend from −1637 to +48 of the GPHα gene and contain wild-type (lanes 4-6) or mutated (lanes 7-9) cap site sequence. The clustered point mutation corresponds to m-1 (Fig. 3A). Nuclear extracts (10 µg of protein) were preincubated with the indicated competitors for 10 min at room temperature before addition of the CSDE probe (10,000 cpm, ~0.022 pmol).

The Regions from −5 to −2 and +4 to +11 Are Critical for Binding of CSBP to CSDE. To better localize the binding site for this factor, a series of oligonucleotides containing clustered substitution mutations that collectively span the region of diad symmetry (Table I) was synthesized and used in competition EMSA. As a first approach, the wild-type CSDE(−13/+15) and mutant oligonucleotides (M-1, M-2, M-4, M-5, and +3) were used as competitors for protein binding to a ³²P-labeled CSDE probe. A single complex was generated (Fig. 6A). Formation of this complex was inhibited 97% by coincubation with increasing amounts of homologous, nonradioactive CSDE oligonucleotide, whereas unlabeled mutant oligonucleotides M-1, M-2, M-4, M-5, and +3 were inefficient competitors, reducing complex formation by only 30% at the highest concentration tested (Fig. 6, A and B). Thus, the regions from −5 to −2 and +5 to +10 in the CSDE are important for interactions with CSBP. Because pα(−846/+3)CAT had better promoter activity than m-1 and m-2 (Fig. 3B), it was considered that bases farther downstream may also be important for binding (see "Discussion"). To examine this possibility, another series of longer oligonucleotides containing substitution mutations on either side of, as well as within, the diad sequence was generated (Table I), and competition binding assays were performed. The wild-type oligonucleotide (−13/+22) and mutant oligonucleotides (M-8, M-1L, M-7, and M-6) were used as competitors for protein binding to a ³²P-labeled −13/+22 probe. Again, one complex was observed (Fig. 6C). This complex was eliminated by increasing amounts of homologous, nonradioactive −13/+22 oligonucleotide, suggesting that the complex represents specific protein-DNA interactions. Mutations in both M-1L and M-7 severely inhibited binding, increasing by 9- and 14-fold, respectively, the level of oligonucleotide required to inhibit complex formation by 50% (Fig. 6, C and D). In contrast, mutations in M-8 and M-6 had much less effect on binding, as they were equivalent to −13/+22 at concentrations only 2-3-fold higher. Taken together with the results in Fig. 6 (A and B), these data indicate that the regions from −5 to −2 and +4 to +11 are the most critical for binding of HeLa CSBP, and that bases farther upstream (from −11 to −8) and downstream (from +15 to +19) may also contribute to complex formation but at a much reduced level.
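The comparison of competitor levels required for 50% inhibition can be illustrated with a minimal sketch. The dose-response values below are hypothetical placeholders, not the densitometry data of Fig. 6; linear interpolation between neighboring points is an assumption made for simplicity, and the names `ic50` and `fold_shift` are illustrative only.

```python
def ic50(doses, percent_complex):
    """Interpolate the competitor dose at which complex formation falls to 50%.

    doses: increasing competitor amounts (e.g. pmol)
    percent_complex: complex remaining, with the no-competitor lane set at 100
    """
    points = list(zip(doses, percent_complex))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        # Find the pair of neighboring points that brackets the 50% level.
        if p0 >= 50.0 >= p1 and p0 != p1:
            return d0 + (p0 - 50.0) * (d1 - d0) / (p0 - p1)
    raise ValueError("competition curve does not cross 50%")


def fold_shift(mutant_ic50, wild_type_ic50):
    """Fold increase in competitor needed by the mutant relative to wild type."""
    return mutant_ic50 / wild_type_ic50


# Hypothetical curves (NOT data from the paper).
WT_DOSES, WT_COMPLEX = [0.0, 0.1, 0.2, 0.4], [100.0, 50.0, 25.0, 5.0]
MUT_DOSES, MUT_COMPLEX = [0.0, 0.5, 1.0, 2.0], [100.0, 80.0, 40.0, 20.0]

wt = ic50(WT_DOSES, WT_COMPLEX)
mut = ic50(MUT_DOSES, MUT_COMPLEX)
print(f"wild type IC50 = {wt:.3f}, mutant IC50 = {mut:.3f}, "
      f"fold shift = {fold_shift(mut, wt):.2f}")
```

A larger fold shift corresponds to a mutation that is more disruptive for binding, which is the sense in which M-1L and M-7 are distinguished from M-8 and M-6 above.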
Interactions between CSBP and CSDE as Determined by Methylation Interference. To identify specific base contacts that contribute to CSBP binding, HeLa nuclear extracts were incubated with dimethyl sulfate-treated oligonucleotide (−13/+22). In this assay, methylation of specific guanines (m⁷G) in contact with the protein, but not of guanines outside the binding site, will reduce DNA-protein complex formation. The results presented in Fig. 7 show that methylated guanines at +2 and +8 on the antisense (noncoding) strand were significantly diminished in the bound DNA, as were methylated guanines at +1 and +4 on the sense (coding) strand. These are indicated with large arrows in panel B. Methylation of guanines at +10 on the sense strand and at −2, −8, and −11 on the antisense strand showed less interference with CSBP binding, and this is indicated by small arrows in Fig. 7B. No interference was observed by methylation of G at +12 on the sense strand. These results strongly support the mutagenesis study and confirm the diad sequence as a protein binding motif.
Analysis of CSDE Activity When Reintroduced Downstream of the pα(−846/+3)CAT Promoter. Because mutation of the CSDE produced a GPHα promoter that was more robust than the wild-type promoter, it was of interest to determine whether the diad element could restore repression to the 3′-truncated mutant and, if it could, whether it was effective in an orientation-dependent or an orientation-independent manner. To investigate this question, one copy of the CSDE was inserted downstream of the pα(−846/+3)CAT promoter in both forward and reverse orientations to generate vectors pα(+3-CSDE)CAT and pα(+3-EDSC)CAT. These constructs were mixed with pCMV-lacZ and transfected into HeLa cells. The CAT activity relative to β-galactosidase activity is shown in Fig. 8A. Expression levels for pα(+3-CSDE)CAT and pα(+3-EDSC)CAT were approximately 38 and 65%, respectively, of those for pα(−846/+3)CAT. Thus, the downstream CSDE in either orientation could repress activity of the α-subunit gene promoter, but somewhat greater inhibition was provided by CSDE in the forward direction. Moreover, the promoter activity of both constructs remained higher than that of pα(−846/+48)CAT, suggesting either that full activity of the negative element may require additional sequences outside the cap site diad or that the downstream location may reduce interactions between CSBP and other factors in the transcription initiation complex.
The above constructs place the diad center of the CSDE insert 35 bp downstream of the original cap site. To test CSDE(−13/+15) function in its native position, without the downstream sequence provided by pα(−846/+48)CAT, the pα(+3-CSDE)CAT and pα(+3-EDSC)CAT vectors were digested with restriction endonuclease PstI, and the larger fragments were religated to generate two constructs, HCAP and HPAC, named for the inclusion of the right half (H) of the cap site from CSDE(+3/+15) and EDSC(+15/+3), respectively. The PstI sites lie at the transcription start of the GPHα promoter and at the diad center 35 bp downstream in the inserted CSDE (and EDSC). Religation of the large fragment eliminates 35 bp and restores an intact CSDE in both the wild-type and inverted orientation with elimination of sequences farther downstream (i.e. between +15 and +48). Derivative HPAC contains mutations at +8, +12, +13, and +14 and forms a perfect palindrome. When pα(−846/+48)CAT, pα(−846/+3)CAT, HCAP, and HPAC were transfected into HeLa cells, the CAT activity from HCAP and HPAC was 2-3-fold higher than that from pα(−846/+48)CAT but ~2-fold lower than that from pα(−846/+3)CAT (Fig. 8B). These results suggest that additional downstream sequence (i.e. from +15 to +48) of the GPHα gene first exon may also contribute to negative regulation of the chromosomal gene, and that converting the CSDE to a perfect palindrome does not increase its repression activity.

as probes or competitors in DNA binding assays. Uppercase indicates the wild-type sequences, and bold lowercase identifies mutated bases. For M-4, the inserted hexamer is shown in bold lowercase and the diad left and right arms are shown in uppercase, even though a direct comparison reveals that many of the downstream positions are mutated in the strictest sense. Oligonucleotide sequences are aligned at the transcription start site indicated by the underlined G at +1.

FIG. 6. Competition of gel shift complexes with oligonucleotides having clustered point mutations that collectively span the GPHα cap site. This analysis was carried out to define the boundaries of CSBP-CSDE interactions. A, protein (10 µg) from HeLa nuclear extracts was incubated at room temperature for 20 min with 10,000 cpm of radiolabeled CSDE oligonucleotide after a 20-min preincubation with 2 µg of poly(dI-dC)-poly(dI-dC) and unlabeled competitors as identified in the figure. The quantity of competitor DNA used in each reaction is shown at the top of the autoradiogram. B, the DNA-protein complex intensity was determined by densitometry and plotted as a function of competitor DNA. C, HeLa nuclear protein (10 µg) was incubated as described for panel A with 10,000 cpm of ³²P-radiolabeled −13/+22 oligonucleotide and 2 µg of poly(dI-dC)-poly(dI-dC) after preincubation of protein with the indicated competitors. The quantity of competitor DNA used in each reaction is shown at the top of the autoradiogram. D, competition curves were determined by quantitative densitometry. Values in arbitrary integrator units for samples receiving no competitor were set at 100%. Sequences of the various oligonucleotides are summarized in Table I.
Relationship between CSDE Action and Levels of CSBP. Because tumors originating in a variety of tissues produce the GPH α-subunit (4,9,12,13,15,16,62,63), it was of interest to examine the activity of wild-type and mutant promoters in some of these cell types. Consequently, the pα(−846/+48)CAT, m-1, m-2, and pα(−846/+3)CAT vectors were co-transfected with pCMVlacZ into several different cell lines, including HT-29 (colon carcinoma), Panc-1 (pancreatic carcinoma), JEG-3 and JAr (choriocarcinoma), U-2 OS (osteosarcoma), GH4C1 (pituitary carcinoma), MCF-7 (breast carcinoma), and MCF-10A (normal breast). CAT levels were normalized to those of β-galactosidase in the same extracts. The values reported in Table II show that the CAT activities derived from mutants m-1, m-2, and pα(−846/+3)CAT were relatively higher than those from the wild-type pα(−846/+48)CAT in most of the cell types. For example, HeLa and Panc-1 cells showed a strong preference for the mutated promoter, whereas the MCF-10A cell line showed little or no ability to discriminate among the wild-type and mutant expression vectors, and other lines were intermediate in their relative expression levels.

1 and 2) or sense (lanes 3 and 4) strand of the GPHα gene and spanning bases from −13 to +22 were 5′-end-labeled with T4 polynucleotide kinase. Duplex DNA was partially methylated with dimethyl sulfate as described under "Experimental Procedures." Modified DNA was incubated with HeLa nuclear extract, and the complexed (B) and free (F) DNAs were separated by EMSA as shown in the previous figures. Six replicate samples were made. The individual bands of protein-bound (lanes 1 and 3) and unbound (lanes 2 and 4) DNAs were isolated from the gel, combined, cleaved with piperidine (1:10), and resolved by electrophoresis at 12 watts for 3 h on a 10% polyacrylamide gel containing 7 M urea. G residues are identified to the left and right of the autoradiogram. B, sequence of the −13/+22 oligonucleotide. Arrowheads indicate bases whose methylation reduces complex formation; large and small arrows denote the degree of interference by the G nucleotides indicated.

FIG. 8. Reintroduction of the CSDE into the truncated GPHα promoter. A, the diad element was inserted downstream of the GPHα promoter (extending from −846 to +3) in both the forward (CSDE) and reverse (EDSC) orientations to generate pα(+3-CSDE)CAT and pα(+3-EDSC)CAT, respectively. The constructs were cotransfected into HeLa cells with pCMVlacZ so that β-galactosidase activity could be used to normalize for differences in transfection efficiency. Expression vectors are diagramed at the left; a representative autoradiogram showing acetylated derivatives of chloramphenicol (CM) is presented in the center; and the ratio of CAT to β-galactosidase activity obtained with the CSDE- and EDSC-containing vectors relative to the ratio obtained with the parental pα(−846/+3)CAT vector is shown at the right; values are the average of duplicate plates in two separate assays. B, the constructs HCAP and HPAC were generated from pα(+3-CSDE)CAT and pα(+3-EDSC)CAT as outlined under "Experimental Procedures." Along with pα(−846/+48)CAT and pα(−846/+3)CAT, they were cotransfected into HeLa cells with pCMVlacZ. The presentations are as described in panel A.
Because the CSDE provided a strong negative influence on promoter activity in transient expression assays and demonstrated the capacity to form a specific DNA-protein complex in gel shift assays, it was reasoned that the variability of CSDE action in the cell lines examined might result from differences in the levels of CSBP that are characteristic of each cell type. To test this possibility, DNA-protein interactions were examined in the collection of cell lines by EMSA. When the CSDE was used as a labeled probe, a single DNA-protein complex was generated from nuclear extracts prepared from each of the cell lines listed above (Fig. 9), but their autoradiographic intensities were notably different. Competition with heterologous and homologous oligonucleotides showed that the complex in each cell type represented specific DNA-protein interactions (data not presented). The amount of complex was quantified by densitometry, and the arbitrary integrator units for each cell line, relative to that of HeLa cells set at 100, are listed in Table II. Similar results were obtained with multiple nuclear extract preparations and with varied DNA:protein ratios (data not presented). It appears unlikely that an inhibitor of CSBP binding is present in nuclear extracts from cell lines with low binding activity, as the amount of complex generated by mixtures of HeLa and MCF-10A extracts (10 µg of protein each) was equal to the sum of the levels produced by each extract alone (data not presented). Additional control experiments suggested that the variable levels of CSBP were not a reflection of extract quality, as other DNA-binding proteins either did not show significant variation in binding activity, or the binding activity fluctuated in a manner distinct from that of CSBP (e.g. levels in MCF-10A were greater than those in HeLa and MCF-7).
FIG. 9. CSBP activity in nuclear extracts prepared from a variety of cell lines. Ten µg of nuclear protein from the indicated cell lines was incubated with 10,000 cpm of ³²P-radiolabeled CSDE oligonucleotide. Free DNA and DNA-protein complex (indicated by arrows) were resolved by nondenaturing polyacrylamide gel electrophoresis. The DNA-protein complex levels were quantified by laser densitometry, and average values from two experiments with different extract preparations are reported in Table II after normalizing to values for HeLa extracts.

TABLE II. Determination of wild-type and mutant promoter activity and the relative level of CSBP in a variety of tumor cell lines. Cells were transfected as described under "Experimental Procedures" with the expression vectors indicated (wild-type, m-1, m-2, +3) and assayed for CAT and β-galactosidase 48 h after transfection. The values represent CAT activity normalized to β-galactosidase activity in aliquots of the same extract; they are means ± standard deviation determined for duplicate flasks in three separate experiments. The last column represents the relative binding activity of CSBP in nuclear extracts prepared from the indicated cell lines. The amount of CSBP-CSDE complex was determined by laser densitometry of standard EMSA autoradiograms. The arbitrary integrator units were normalized to HeLa extracts set at 100. Values are the average of two experiments using different nuclear extract preparations.

The relative effectiveness of the most active (pα(−846/+3)CAT) and least active (pα(−846/+48)CAT) promoters was evaluated in each cell line by determining the -fold increase in CAT/β-galactosidase activity for the mutant compared with the wild-type promoter; this comparison is indicated as the +3/+48 ratio. By comparing the ratio of CAT levels produced from the two plasmids, rather than absolute levels, any intrinsic differences in the ability of specific cell lines to transcribe the GPHα promoter would be eliminated except for the contribution derived from the CSDE or from sequence downstream of +3. Normalization of CAT activity to that of β-galactosidase accounts for any inherent differences in their transfection efficiency. In Fig. 10, the ratio of promoter activities (i.e. deletion mutant/wild-type) in each cell line is plotted against their corresponding level of CSDE binding activity. As seen, there is a direct correlation (r = 0.96, p < 0.0001) between these parameters. That is, cells with the highest levels of CSBP activity (e.g. HeLa and Panc-1) showed a proportionately greater increase in transcription from pα(−846/+3)CAT relative to that from pα(−846/+48)CAT. Similarly, MCF-10A cells, which exhibited the lowest levels of CSBP activity, produced comparable levels of CAT activity from expression vectors driven by the wild-type (+48) and truncated promoters (+3). These results suggest that the increase in activity of the mutant promoter relative to the wild-type promoter is dependent, at least in part, on the levels of CSBP activity in these cell lines.
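The +3/+48 ratio and its correlation with CSBP binding activity can be sketched as follows. The numbers are hypothetical placeholders and do not reproduce Table II; the function names `fold_increase` and `pearson_r` are illustrative, and an ordinary Pearson coefficient is assumed because the text reports r without naming the method.

```python
from math import sqrt


def fold_increase(cat_plus3, bgal_plus3, cat_plus48, bgal_plus48):
    """+3/+48 ratio: each CAT value is first normalized to the
    beta-galactosidase activity measured in the same extract."""
    return (cat_plus3 / bgal_plus3) / (cat_plus48 / bgal_plus48)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical values (NOT data from Table II): relative CSBP binding
# activity (HeLa = 100) and the matching +3/+48 ratio for five cell lines.
csbp_level = [100.0, 80.0, 50.0, 20.0, 5.0]
ratio_3_to_48 = [8.0, 6.5, 4.0, 2.0, 1.1]

print(f"r = {pearson_r(csbp_level, ratio_3_to_48):.3f}")
```

Working with the ratio rather than absolute CAT levels removes cell-line differences in overall promoter strength from the comparison, which is the rationale given in the paragraph above.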

DISCUSSION
Expression of the glycoprotein hormone α-subunit gene is controlled by a complex promoter that contains multiple, distinct regulatory elements that interact via specific, nuclear DNA-binding proteins (21,23,31,32,37). Regulatory elements controlling transcription are generally located upstream of the transcription start site. For example, upstream regulatory elements that interact with placenta-specific factors involved in basal transcription of the GPHα gene in JEG-3 choriocarcinoma cells are located between nucleotides −180 and −150 relative to the transcription start site (+1). However, it has become increasingly evident that a large number of both cellular and viral genes utilize elements for transcriptional regulation that are located at, or downstream from, the cap site (55-59). Examples of such regulatory elements include those located in introns (64,65), in 3′-flanking DNA (66), and in both untranslated (67) and translated exons (68). They can function as either enhancers or silencers. Transient expression of reporter gene constructs carrying the GPHα gene promoter was previously reported to be significantly lower in HeLa cells than in JEG-3 cells (22), suggesting that the trans-acting factor(s) necessary for expression in JEG-3 cells is not present in HeLa cells (38). As shown in Fig. 2, however, levels of CAT activity produced from a GPHα 3′-deletion mutant, which contains 5′-flanking DNA extending from −846 to +3, were significantly greater than those produced from the plasmid containing GPHα promoter DNA extending farther downstream to +48. This result argues that a heretofore unrecognized motif, comprising the cap site and/or downstream sequence, constitutes a cis-regulatory element responsible, at least in part, for low level transcription of the GPHα promoter in transfected HeLa cells. Support for the above assertion comes from the analysis of two site-directed mutants, m-1 and m-2. Together with the 3′-deletion mutant, they demonstrate that proximal sequence both upstream and downstream of the transcription start site contributes to the repression activity of the diad element centered at +1 (Fig. 3B). The CSDE is classified as a negative regulatory element for the GPHα promoter because of the greater activity of the mutants relative to that of the wild-type. The possibility that the increase in CAT production is caused either by changes in the translational efficiency of the CAT transcript via a possible change in mRNA secondary structure (69) or by the use of a novel, more active transcription start site has been ruled out on the following basis. (a) Clustered point mutations in each arm of the diad element, which had no sequence deletion, also exhibited the increased activity (Fig. 3B). (b) Elevated expression from m-1 precludes any possibility of an effect on mRNA structure or translation efficiency, as this mutation is entirely upstream of the transcription start site. (c) Primer extension of CAT mRNA demonstrates that the 3′-deletion did not alter the transcription start site and that increased CAT activity could be accounted for by a concomitant increase in transcript levels (Fig. 4). Thus, repression activity of the CSDE is at the transcriptional level.
Most cis-acting transcriptional regulatory elements studied to date act as DNA recognition elements for trans-binding proteins. Electrophoretic mobility shift assays using oligonucleotide probes corresponding to the GPHα CSDE (extending from −13 to +15 or +22) identified a distinct complex with nuclear proteins from HeLa cells (Figs. 5 and 6). The clustered point mutations of CSDE allow delineation of some of the DNA sequence determinants for protein binding (Table I and Fig. 6). Mutation of bases from −5 to −2 (M-1L) and from +4 to +11 (M-7) in this element abolished binding to its cognate trans-acting factor, whereas mutated bases farther upstream (−11 to −8 in M-8) or downstream (+15 to +19 in M-6) had much less effect on CSDE-CSBP interactions. Protein binding to either half of the diad element is important for inhibiting promoter activity. In a direct comparison, the +3 truncation mutant elicited more CAT activity than m-1 (Fig. 3B). The difference in transcriptional activity of the mutants may be an inherent property of the binding factor, or the diad downstream arm may provide more stability to the DNA-protein interaction. This result is consistent with the fact that an oligonucleotide with an upstream mutation, M-1, affects DNA-protein complex formation to an extent less than oligonucleotides with a downstream mutation, M-2 and +3 (Fig. 6A). Methylation interference analysis also suggests that guanines exhibiting the strongest interactions with CSBP were positioned in the proximal downstream half (+1 to +8) of the CSDE (Fig. 7). Although the experiments described here do not address this point directly, it seems unlikely that the diad element functions in a cruciform conformation because the mutation represented by HPAC, which is a perfect palindrome and would produce a more stable cruciform, was less effective than the wild-type CSDE at minimizing promoter activity. Together, these data strongly suggest that protein binding to CSDE is required for its negative regulation of the GPHα promoter.
A comparison of the transient expression and gel shift results (Fig. 10) strongly suggests that CSDE activity in different cell lines is proportional to the levels of the corresponding trans-acting factor (CSBP). The electrophoretic mobility shift analyses demonstrate that the CSBP is present in a variety of cell lines, including HeLa, JEG-3, HT-29, Panc-1, U-2 OS, GH4C1, MCF-7, and MCF-10A, but at different levels (Fig. 9). These cell lines also showed a significant difference in their abilities to transcribe a CAT reporter gene when it was under control of a GPHα promoter terminating at +3 as compared with one terminating at +48 (Table II). Significantly, the relative effectiveness of the CSDE deletion mutant (+3) compared with the wild-type diad element (+48) in a given cell line was directly proportional to the level of CSBP activity in nuclear extracts prepared from that cell line (Fig. 10), i.e. cells with higher levels of CSDE binding activity showed a significant increase in their ability to transcribe the truncated promoter relative to the wild-type promoter, and cells with less CSDE binding activity showed a concomitant decrease in their ability to distinguish between the wild-type and mutated cap sites in transient expression assays. These results are interpreted to support the identification of CSBP as a repressor, the activity of which can account, at least in part, for the differential activity of the GPHα promoter in these cell types. It is suggested that functional activity of the CSDE displays a cell-specific preference because the protein factor(s) that binds to this element is either present at different levels or occurs with different binding affinities. Preliminary measurements by Rosenthal analysis (70) of CSDE binding activity suggest that both of these possibilities may be realized, as differences in Bmax and Kd have been observed for CSBP in nuclear extract preparations from different cell lines.²

Because the CSDE is located at the transcription start site of the GPHα gene, it provides a potentially powerful locus for regulating gene expression. Although a detailed understanding of its mechanism of action must await further investigation, the following conclusions are supported by the results described above. (a) The fact that clustered point mutations in the CSDE alleviate the negative regulation suggests that interaction of CSBP with DNA is required to suppress transcription of the GPHα promoter. (b) The results summarized in Fig. 8A, showing that CAT production from the +3 promoter was decreased 2-3-fold when the CSDE oligonucleotide was inserted in both orientations into a site 38 bp downstream of the cap site, demonstrate that the GPHα CSDE can function in an orientation-independent manner, as might be expected of a palindromic (although imperfect) element. The activity of the CSDE at more distal downstream sites or at a position upstream of GPHα +1 has not yet been examined, delaying any definitive conclusions about whether the motif has properties of a conventional silencer. (c) The variable levels of CSDE binding activity displayed in different cell lines (Table II and Fig. 9) suggest that modulating the steady-state levels of CSBP may be at least one mechanism by which CSDE-containing promoters can be regulated. In addition, it can be speculated that, like other transcription factors, the ability of CSBP to repress transcription may be further influenced by its own post-translational modification (e.g. phosphorylation, glycosylation, or acetylation). (d) Because the CSDE is an imperfect palindrome, CSBP may be a dimeric factor, and the activity of the CSDE may depend on the extent to which it is bound by homodimeric or heterodimeric forms. Several of these alternatives are under current investigation.

Table II. For each cell line, the -fold increase in CAT activity produced from the truncated promoter (+3) relative to the wild-type promoter (+48) was calculated and plotted against the CSDE binding activity in nuclear extracts from the same cell type (normalized to that of HeLa cells). The ratio of promoter activities is used to eliminate inherent differences in the ability of different cell lines to transcribe the transfected GPHα-CAT chimera. The line represents a least squares fit to the data points, with correlation coefficient of 0.96.
If CSBP functions at transcription initiation, several possible mechanisms can be proposed. (i) It may compete with basal transcription factors for binding to a common DNA target site, thereby sequestering the cap site diad and preventing the formation of a requisite pre-initiation complex. (ii) Protein-protein contacts made between CSBP and one or more basal transcription factors (e.g. TFIID, TFIIB), accessory proteins (e.g. TAFs), enhancer-binding proteins (e.g. CREB, GATA-2 and -3, TSEB), or co-activators (e.g. p300/CBP) may lead to destabilization of the transcription machinery at the point of initiation complex formation or promoter clearance. (iii) CSBP may recruit co-repressors, such as histone deacetylases, to the GPHα promoter and alter the gene's chromatin organization.
A search was made of sequence databases for the occurrence of the CSDE motif in promoters of other genes. Results of the search uncovered several elements with one or two base mismatches compared with the consensus CSDE (see below); however, of those identified, none appear to be located at or near the transcription start site. Nevertheless, it will be of interest to examine these for an effect on expression of the corresponding genes. Analysis of the GPHα cap site sequence suggested a resemblance to an inverted repeat (5′-AGTGCACT-3′) located between −57 and −50 of the murine junB promoter (71). This element activates the junB promoter by signaling mechanisms involving protein kinase C and protein kinase A. Of interest is the fact that this element is similar to a CRE core sequence (5′-TGACGTCA-3′) but with inverse polarity. Because the GPHα cap site diad also contains a perfect CRE with 3′ to 5′ polarity (5′-ACTGCAGT-3′), the effects of cAMP and phorbol 12-myristate 13-acetate were examined in transient expression assays by comparing pα(−150/+3)CAT and pα(−150/+48)CAT, which contain the upstream tandem CREs, with pα(−100/+3)CAT and pα(−100/+48)CAT, which had the CREs deleted. One of each pair also contained the intact CSDE (+48) or truncated CSDE (+3). However, no reproducible effect of cAMP or phorbol 12-myristate 13-acetate, either positive or negative, was observed for pα(−100/+48)CAT expression (data not presented), suggesting that the GPHα CSDE and the junB element are distinct.
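The sequence relationships mentioned in this paragraph can be checked with a short sketch. The three octamers are those quoted in the text; the helper names are illustrative, and "inverse polarity" is interpreted here, as an assumption, to mean the same strand read in the opposite direction.

```python
# Watson-Crick complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True when a duplex reads identically 5' to 3' on both strands."""
    return seq == reverse_complement(seq)


JUNB_INVERTED_REPEAT = "AGTGCACT"  # -57 to -50 of the murine junB promoter
CRE_CORE = "TGACGTCA"              # CRE core sequence
CAP_SITE_OCTAMER = "ACTGCAGT"      # octamer within the GPH-alpha cap site diad

for name, seq in [("junB repeat", JUNB_INVERTED_REPEAT),
                  ("CRE core", CRE_CORE),
                  ("cap site octamer", CAP_SITE_OCTAMER)]:
    print(f"{name}: {seq} palindromic={is_palindrome(seq)}")

# Reading the cap site octamer in the opposite direction gives the CRE core.
print("cap site octamer reversed:", CAP_SITE_OCTAMER[::-1])
```

Because all three octamers are perfect palindromes, reversing the cap site octamer without complementing it yields the CRE core exactly, whereas the reversed junB repeat only resembles it.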
The GPHα gene is expressed in the pituitary of all mammals and in the placenta of horses and primates. It was of interest, therefore, to compare the human CSDE to the cap site sequence of other mammalian α-subunit genes. The sequences from −12 to +14 for horse, rat, cow, pig, mouse, monkey, and human are listed in Table III along with a consensus sequence compiled from the bases that occur most frequently at each position. For each entry, the bases showing diad symmetry centered at +1 are marked with dots above and below the left and right arms, respectively. The human sequence can also be characterized as containing an imperfect direct repeat, in which the upstream TTAACTG and downstream TTACTG sequences differ by the presence or absence of a single A; these are marked with arrows below the entries. The number of bases that show diad character in each sequence and the number of mismatches compared with the consensus sequence are listed at the right. The cap site sequences are ordered in the table according to the number of bases showing diad character. As seen, this is roughly inversely proportional to the number of base mismatches. There does not seem to be an obvious relationship between the cap site sequence and placental expression. However, the ordered ranking of the species examined is interesting as it approximates the evolutionary relatedness of these mammals based on the number of nucleotide changes in seven different proteins (72), with the exception that rat is generally ranked closer to mouse.
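The tallies described for Table III (a majority consensus, mismatches against it, and bases showing diad character) can be sketched as below. The aligned sequences are short hypothetical examples, not the species sequences of Table III, and the diad is assumed to be centered on the middle of each string rather than at +1; all function names are illustrative.

```python
from collections import Counter

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def consensus(aligned):
    """Most frequent base at each aligned position.
    Ties are resolved in favor of the base seen first in that column."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*aligned))


def mismatches(seq, reference):
    """Number of positions at which seq differs from reference."""
    return sum(a != b for a, b in zip(seq, reference))


def diad_bases(seq):
    """Count bases that can pair with their mirror-image partner
    across the center of the sequence (two bases per matching pair)."""
    n = len(seq)
    return sum(2 for i in range(n // 2) if PAIR[seq[i]] == seq[n - 1 - i])


# Hypothetical aligned sequences (NOT taken from Table III).
ALIGNED = ["ACTGCAGT", "ACTGCAGA", "ACTGTAGT"]

ref = consensus(ALIGNED)
for seq in sorted(ALIGNED, key=diad_bases, reverse=True):
    print(seq, "diad bases:", diad_bases(seq),
          "mismatches:", mismatches(seq, ref))
```

Sorting by the diad count mirrors the ordering used in the table, and in this toy set the entries with fewer diad bases are also the ones that diverge from the consensus.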
It was noted throughout the course of this work that CAT expression vectors with the GPHα promoter truncated at +3 were generally more active than those carrying the clustered point mutations m-1 and m-2. This is illustrated in Fig. 11, where the transient expression activities of wild-type (+48) and mutated CSDE promoters (m-1, m-2, +3) are compared in cell lines displaying a variety of CSBP levels. As seen, the activities of m-1 and m-2 approached a level ~4-fold higher than the wild-type promoter as the levels of CSBP increased, whereas the +3 truncation mutant exhibited activity at least 8-fold greater than that of the native promoter in the same cells and without evidence of leveling off. The basis for this difference is unknown, but at least two possibilities can be suggested. First, an additional element(s) downstream of CSDE may also contribute to the negative effect of exon I sequences on GPHα transcription. Such an element would be lost in +3 derivatives but not in m-1 and m-2, which terminate at +48. Attempts using EMSA to detect another protein that interacts with DNA extending from +3 to +48 have been unsuccessful; hence, downstream contacts may contribute to stabilization of the CSDE-CSBP complex, yet alone are insufficient for complex generation. A second possibility is suggested by the results of Purnell and Gilmour (58), which show that a Drosophila TFIID complex makes important downstream contacts in the hsp70, hsp26, and histone H4 genes. Alignment of these promoters surrounding their transcription start sites suggested that a conserved sequence YARNTC (where Y, R, and N denote pyrimidine, purine, and any nucleotide, respectively) was important for this activity. Comparison of the GPHα wild-type and +3 promoters (Fig. 11) shows that YARNTA at +2 to +7 was converted to YARNTC, the A to C transversion provided by the plasmid backbone DNA. Thus, mutation to a sequence providing superior contacts for TFIID binding may also contribute a positive effect on CAT transcription once the negative CSDE is inactivated by deletion. The results in Fig. 6 show that CSBP will not bind to an oligonucleotide corresponding to the sequence of the +3 construct; and in experiments not presented, it has been determined that C at +7 is disruptive for CSBP binding. Together, these observations further suggest the intriguing possibility that an A to C mutation at +7 in tumor tissue may be sufficient to strongly activate GPHα gene expression, eliminating or significantly compromising the negative contribution that CSBP renders when bound at the CSDE and providing downstream sequence with more optimal contacts for the TFIID complex.
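The degenerate motif YARNTC can be expressed as a regular expression, which makes the effect of the single A to C change easy to see. This is a minimal sketch: the two test sequences are hypothetical and are not the GPHα or plasmid sequences shown in Fig. 11A, and the helper names are illustrative.

```python
import re

# IUPAC nucleotide codes used in the motif (only the ones needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",    # pyrimidine
    "R": "[AG]",    # purine
    "N": "[ACGT]",  # any nucleotide
}


def motif_regex(motif):
    """Compile a degenerate motif such as 'YARNTC' into a regex."""
    return re.compile("".join(IUPAC[code] for code in motif))


def find_motif(motif, seq):
    """Start indices (0-based) of non-overlapping motif matches in seq."""
    return [m.start() for m in motif_regex(motif).finditer(seq)]


# Hypothetical sequences differing by one A -> C change (NOT from Fig. 11A).
WITH_A = "GGCAGTTAAA"  # contains CAGTTA, which fits YARNTA only
WITH_C = "GGCAGTTCAA"  # contains CAGTTC, which fits YARNTC

print("YARNTC in WITH_A:", find_motif("YARNTC", WITH_A))
print("YARNTC in WITH_C:", find_motif("YARNTC", WITH_C))
```

Only the sequence carrying C at the final motif position matches YARNTC, which parallels the argument that a single transversion could create the preferred downstream contact.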
genes are compared. They were aligned at the transcription start site (+1), and a consensus sequence was derived. For each cap site sequence, the number of bases contributing diad character and the number of base mismatches compared to the consensus sequence were determined. Bases in the palindrome are indicated with a superscript, filled dot on the left arm and a subscript, filled dot on the right arm. Similarly, bases that diverge from the consensus sequence are marked with open boxes. The arrangement of entries in the table (top to bottom) was determined by the number of bases in the diad. The arrows above and below the sequence mark an imperfect direct repeat (TTA[A]CT).
FIG. 11. Direct comparison of mutant (+3, m-1, m-2) and wild-type (+48) promoters in cell lines exhibiting variable levels of CSBP. A, the nucleotide sequences surrounding the cap site of the pα(−846/+48)CAT (designated wt) and pα(−846/+3)CAT (designated +3) vectors are compared with each other and to a hexanucleotide consensus sequence that promotes TFIID contact and may function as an initiator (Inr) element. The boxed sequence is derived from plasmid DNA. B, transfections were carried out in such a manner that all cell lines received an aliquot of the same DNA-calcium phosphate precipitate containing 20 μg of α-CAT reporter plasmid and 5 μg of CMV-lacZ plasmid. Forty-eight hours after transfection, cell lysates were assayed for CAT and normalized to β-galactosidase activity in the same extract. The CAT/β-galactosidase values for +3, m-1, and m-2 were divided by those for +48, and the ratio of mutant:wild-type activity for each cell line is plotted against the CSDE levels in nuclear extracts prepared from the same cell line as determined by EMSA. Values plotted are the average of two independent experiments. Different plasmid preparations were analyzed in the transfection experiments, and different nuclear extract preparations were subjected to gel shift analysis. The curves depict regression analysis of the data points. For +3, y = 0.083x − 1.965, r = 0.962; for m-1, y = 4.998 log(x) − 6.852, r = 0.989; for m-2, y = 5.138 log(x) − 7.029, r = 0.986.
Irrespective of its mechanism of action, the physiologic role of the CSDE, at least in part, might be as a constitutive repressor, modulating transcription levels in cell types in which the GPHα gene is normally expressed (i.e. gonadotrope, thyrotrope, and trophoblast) and restricting its transcription in nontrophoblastic, nonendocrine cell types in which α-subunit production is undesirable. In addition to the action of CSBP, a number of other negative regulatory influences on GPHα gene transcription have been described.
These include interference by the glucocorticoid receptor at the proximal CRE (73), an inhibitory thyroid receptor binding site abutting the TATA box (−9 to −20) (74), and down-regulation by competition of a ligand-bound androgen receptor complex for the JRE (75). Further inhibition may be provided by a CpG methylation-sensitive factor that binds a motif embedded in upstream and intronic Alu repetitive DNA (76). The latter provides an explanation for high level expression of the GPHα gene in cell lines where it is heavily methylated and low level expression in cell lines where the gene is hypomethylated (76). Together, the occurrence of these negative regulatory elements, in addition to the variety of cell-specific enhancers (e.g. GSE and PGBE) and second messenger response elements (e.g. CRE and α-activation element), suggests the need by cells to maintain tight regulatory control of GPHα gene expression.
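For readers who wish to reproduce the comparison in the Fig. 11 legend, the normalization and the reported regression fits can be sketched as follows. This is an illustrative sketch only: the legend does not state the base of the logarithm, so base 10 is assumed here; x is taken to be the EMSA-derived level in arbitrary units; and the function names and sample x values are hypothetical.

```python
import math


def fold_over_wild_type(cat_mut, bgal_mut, cat_wt, bgal_wt):
    """Mutant:wild-type ratio of CAT activity, each value first
    normalized to beta-galactosidase activity in the same extract."""
    return (cat_mut / bgal_mut) / (cat_wt / bgal_wt)


# Regression fits quoted in the Fig. 11 legend.
# ASSUMPTION: log(x) is log base 10; the legend does not specify the base.
def fit_plus3(x):
    return 0.083 * x - 1.965


def fit_m1(x):
    return 4.998 * math.log10(x) - 6.852


def fit_m2(x):
    return 5.138 * math.log10(x) - 7.029


# Hypothetical x values (arbitrary EMSA units), chosen only for illustration.
for x in (50, 100, 150):
    print(x, round(fit_plus3(x), 2), round(fit_m1(x), 2), round(fit_m2(x), 2))
```

Under the base-10 assumption, the logarithmic fits for m-1 and m-2 flatten as x increases while the linear fit for +3 continues to rise, which is the qualitative difference described in the text.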